Question 21 · Section 15

What is a batch in the Kafka producer?

Junior Level

Definition

Batch — a mechanism where the producer accumulates messages and sends them to the broker in a single request instead of one at a time.

Without batching:
  msg1 → request
  msg2 → request
  msg3 → request
  3 requests to the broker

With batching:
  [msg1, msg2, msg3] → 1 request
  1 request to the broker

Why is batching needed?

✅ Higher throughput (fewer requests)
✅ Lower per-message overhead (one request serves many messages)
✅ Better network utilization

Key Settings

props.put("batch.size", 16384);       // 16KB — batch size
props.put("linger.ms", 5);            // wait 5ms to fill the batch
props.put("buffer.memory", 33554432); // 32MB — in-memory buffer

Middle Level

How Batching Works

1. Producer calls send(record)
2. Message is added to the batch for the partition
3. Batch is sent when:
   - batch.size is reached
   - linger.ms has elapsed
   - Producer flush() is called
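The trigger logic above can be sketched with a toy accumulator. This is an illustration only, not the real RecordAccumulator; the sizes, times, and class name are made up for the demo:

```java
import java.util.ArrayList;
import java.util.List;

// Toy accumulator illustrating the send triggers: size (batch.size),
// time (linger.ms), and explicit flush(). A sketch, not Kafka internals.
public class BatchSketch {
    static final int BATCH_SIZE_BYTES = 32;   // toy "batch.size"
    static final long LINGER_MS = 5;          // toy "linger.ms"

    private final List<String> batch = new ArrayList<>();
    private int batchBytes = 0;
    private long firstAppendAt = -1;
    int sends = 0;                            // how many "requests" went out

    void send(String msg, long nowMs) {
        if (firstAppendAt < 0) firstAppendAt = nowMs;
        batch.add(msg);
        batchBytes += msg.length();
        if (batchBytes >= BATCH_SIZE_BYTES) flush();          // trigger 1: size
        else if (nowMs - firstAppendAt >= LINGER_MS) flush(); // trigger 2: time
    }

    void flush() {                            // trigger 3: explicit flush()
        if (batch.isEmpty()) return;
        sends++;
        batch.clear();
        batchBytes = 0;
        firstAppendAt = -1;
    }

    public static void main(String[] args) {
        BatchSketch p = new BatchSketch();
        for (int i = 0; i < 10; i++) p.send("msg-" + i, i); // 1 ms apart
        p.flush();
        System.out.println("requests sent: " + p.sends);    // prints "requests sent: 2"
    }
}
```

Ten 5-byte messages collapse into 2 requests: one batch closes on the time trigger, the remainder on the final flush().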

Batch Size

// Small batch.size
props.put("batch.size", 1024);  // 1KB
// → Frequent sends, lower throughput

// Large batch.size
props.put("batch.size", 65536);  // 64KB
// → Infrequent sends, higher throughput, more memory
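The tradeoff can be quantified with simple arithmetic. Assuming 100-byte messages at 10K msg/s (illustrative numbers, not from the source):

```java
// Rough request rate for different batch.size values, assuming
// batches always fill completely before being sent.
public class RequestRate {
    public static void main(String[] args) {
        int msgPerSec = 10_000;
        int msgBytes = 100;                      // assumed average message size

        int[] batchSizes = {1024, 16_384, 65_536};
        for (int batchSize : batchSizes) {
            int msgsPerBatch = Math.max(1, batchSize / msgBytes);
            int requestsPerSec = (int) Math.ceil((double) msgPerSec / msgsPerBatch);
            System.out.println("batch.size=" + batchSize
                + " -> ~" + requestsPerSec + " requests/sec");
        }
    }
}
```

Going from 1KB to 64KB batches drops the request rate from ~1000/sec to ~16/sec for the same message volume.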

linger.ms

// linger.ms=0 — send immediately
props.put("linger.ms", "0");
// → Minimal latency, lower throughput

// linger.ms=5 — wait 5ms
props.put("linger.ms", "5");
// → Higher throughput, slightly more latency

Buffer Memory

// Buffer for storing pending batches
props.put("buffer.memory", 33554432);  // 32MB

// If the buffer is full:
// max.block.ms — how long to wait before erroring
props.put("max.block.ms", 60000);
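The buffer/blocking semantics can be modeled with a semaphore. A sketch under simplified assumptions (no batching, bytes never released); the class name and numbers are invented for the demo:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Sketch of buffer.memory / max.block.ms semantics: send() reserves bytes
// from a fixed pool, waits up to maxBlockMs when the pool is short, and
// fails on timeout (the real producer throws BufferExhaustedException /
// TimeoutException in this situation).
public class BufferPoolSketch {
    private final Semaphore pool;
    private final long maxBlockMs;

    BufferPoolSketch(int bufferMemory, long maxBlockMs) {
        this.pool = new Semaphore(bufferMemory);
        this.maxBlockMs = maxBlockMs;
    }

    void send(int recordBytes) throws InterruptedException {
        if (!pool.tryAcquire(recordBytes, maxBlockMs, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("buffer exhausted after " + maxBlockMs + "ms");
        }
        // In the real producer, bytes return to the pool when the broker acks
        // the batch; that release path is not modeled here.
    }

    public static void main(String[] args) throws InterruptedException {
        BufferPoolSketch producer = new BufferPoolSketch(1024, 50); // 1KB pool, 50ms max block
        producer.send(800);              // fits
        try {
            producer.send(800);          // only 224 bytes left -> blocks, then fails
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```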

Common Mistakes

  1. Too small batch.size:
    batch.size=1KB at 10K msg/s → up to 10K requests/sec instead of ~100
    (assuming messages near 1KB, so ~1 message per batch) — roughly 100x
    broker load.
    
  2. linger.ms=0:
    Sends immediately → batch doesn't have time to fill
    → Lower throughput
    
  3. Too small buffer.memory:
    buffer.memory=1MB → fills up quickly
    → Producer blocks or throws an error
    

Senior Level

Internal Implementation

RecordAccumulator:

The producer uses RecordAccumulator for batching:
- One batch per partition
- Batches are sent when full or on linger.ms
- Compression is applied to the entire batch

Batch Completion:

A batch is considered full when:
1. batch.size (bytes) is reached
2. linger.ms (time) has elapsed
3. flush() is called
4. Producer is closed

Compression and Batching

Compression is applied to the entire batch:
props.put("compression.type", "lz4");

Advantages:
- Larger batch → better compression ratio
- Less network I/O
- Less CPU on the broker
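The "larger batch → better ratio" effect is easy to demonstrate. A sketch using GZIP from the JDK as a stand-in for lz4 (the principle is the same; the sample message is invented):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Per-batch compression beats per-message compression because repetition
// across messages only helps the codec when they are compressed together.
public class BatchCompression {
    static int gzipSize(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.size();
    }

    public static void main(String[] args) throws IOException {
        String msg = "{\"user\":\"alice\",\"event\":\"click\",\"page\":\"/home\"}";
        int n = 100;

        // Compress each message separately vs the whole batch at once.
        int perMessage = n * gzipSize(msg.getBytes(StandardCharsets.UTF_8));
        StringBuilder batch = new StringBuilder();
        for (int i = 0; i < n; i++) batch.append(msg);
        int perBatch = gzipSize(batch.toString().getBytes(StandardCharsets.UTF_8));

        System.out.println("per-message total: " + perMessage + " bytes");
        System.out.println("per-batch total:   " + perBatch + " bytes");
        System.out.println("batch wins: " + (perBatch < perMessage));
    }
}
```

With 100 similar JSON messages, compressing the batch as a whole produces a far smaller payload than compressing each message individually.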

Performance Tuning

# High throughput configuration
batch.size: 65536          # 64KB
linger.ms: 10              # wait 10ms
buffer.memory: 67108864    # 64MB
compression.type: lz4
max.in.flight.requests.per.connection: 5

# Low latency configuration
batch.size: 16384          # 16KB
linger.ms: 0               # don't wait
buffer.memory: 33554432    # 32MB
compression.type: none     # in modern versions lz4 adds only ~1-2% CPU,
                           # so it is often recommended even for low latency
max.in.flight.requests.per.connection: 1

Memory Management

Total memory usage:
  buffer.memory + (batch.size * num_partitions * in_flight)

Example:
  32MB buffer + (64KB * 100 partitions * 5 in-flight)
  = 32MB + ~31MB ≈ 63MB
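The estimate follows directly from the formula; a quick check with the numbers above:

```java
// Worst-case producer memory, following the rough formula above
// (buffer.memory plus batches held for in-flight requests).
public class MemoryEstimate {
    public static void main(String[] args) {
        long bufferMemory = 32L * 1024 * 1024;   // 32MB
        long batchSize = 64L * 1024;             // 64KB
        int partitions = 100;
        int inFlight = 5;

        long inFlightBytes = batchSize * partitions * inFlight;
        long totalBytes = bufferMemory + inFlightBytes;
        System.out.printf("in-flight: %.2fMB%n", inFlightBytes / 1048576.0);
        System.out.printf("total:     %.2fMB%n", totalBytes / 1048576.0);
    }
}
```

This is an upper bound: in practice batches are allocated out of buffer.memory, so actual usage is usually lower.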

If the buffer is full:
  Producer blocks for max.block.ms
  → Then throws BufferExhaustedException

Monitoring

Key metrics:

kafka.producer:batch-size-avg
kafka.producer:batch-size-max
kafka.producer:compression-rate-avg
kafka.producer:buffer-available-bytes
kafka.producer:buffer-exhausted-rate
kafka.producer:record-send-rate

Alerts:

- Buffer exhausted rate > 0 → critical
- Average batch size < 1KB → investigate
- compression-rate-avg close to 1.0 → compression is barely helping; check config
  (the metric is compressed/uncompressed size, so lower is better)

Best Practices

✅ batch.size=32KB-64KB for throughput
✅ linger.ms=5-10ms for batching
✅ compression=lz4 for network savings
✅ Monitor batch size and buffer usage
✅ buffer.memory sufficient for workload

❌ batch.size=0 or too small
❌ linger.ms=0 when throughput is needed
❌ buffer.memory too small
❌ Without monitoring batch metrics
❌ Compression without batching

Architectural Decisions

  1. Batch size tuning — balance between throughput and latency
  2. Linger.ms — time to fill the batch
  3. Compression on batch — better ratio
  4. Buffer management — preventing exhaustion

Summary for Senior

  • Batching is a key mechanism for throughput
  • batch.size and linger.ms are the main tuning parameters
  • Compression is applied to the entire batch
  • Buffer memory management is critical for stability
  • Monitoring batch metrics is mandatory for production

🎯 Interview Cheat Sheet

Must know:

  • Batch — accumulating messages and sending as a package instead of one at a time
  • Batch is sent when: batch.size is reached, linger.ms elapses, or flush() is called
  • batch.size=16KB by default; for throughput: 32KB-64KB
  • linger.ms=0 by default (send immediately); for batching: 5-10ms
  • Compression is applied to the entire batch — larger batch = better compression ratio
  • Buffer memory (32MB by default) — when full, the producer blocks
  • RecordAccumulator — internal batching mechanism, one batch per partition

Common follow-up questions:

  • What happens with batch.size=0? — Batching is effectively disabled: each record goes out in its own request, multiplying broker load (~100x in the example above).
  • Why linger.ms if batch.size fills up? — For low-throughput, the batch doesn’t fill; linger.ms gives time to accumulate.
  • What happens on buffer exhaustion? — Producer blocks for max.block.ms → BufferExhaustedException.
  • How does compression affect batching? — Applied to the entire batch, larger batch = better ratio.

Red flags (DO NOT say):

  • “linger.ms=0 is the optimal choice for throughput” — the batch doesn’t have time to fill
  • “batch.size doesn’t matter” — it determines the number of requests to the broker
  • “Compression on each message” — on the entire batch
  • “Buffer memory doesn’t need tuning” — fills up quickly at high-throughput

Related topics:

  • [[22. How does message compression work]]
  • [[4. What is a message key and how does it affect partitioning]]
  • [[23. What is idempotent producer]]
  • [[3. How is data distributed across partitions]]