Question 21 · Section 15

What is a batch in the Kafka producer?

Junior Level

Definition

Batch — a mechanism where the producer accumulates messages and sends them to the broker in a single request instead of one at a time.

Without batching:
  msg1 → request
  msg2 → request
  msg3 → request
  3 requests to the broker

With batching:
  [msg1, msg2, msg3] → 1 request
  1 request to the broker

Why is batching needed?

✅ Higher throughput (fewer requests)
✅ Lower per-message overhead (one request serves many messages)
✅ Better network utilization

Key Settings

props.put("batch.size", 16384);       // 16KB — batch size
props.put("linger.ms", 5);            // wait 5ms to fill the batch
props.put("buffer.memory", 33554432); // 32MB — in-memory buffer

Middle Level

How Batching Works

1. Producer calls send(record)
2. Message is added to the batch for the partition
3. Batch is sent when:
   - batch.size is reached
   - linger.ms has elapsed
   - Producer flush() is called
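The trigger logic above can be sketched with a toy accumulator. This is an illustration only, not the real RecordAccumulator; the sizes, times, and class name are made up for the demo:

```java
import java.util.ArrayList;
import java.util.List;

// Toy accumulator illustrating the send triggers: size (batch.size),
// time (linger.ms), and explicit flush(). A sketch, not Kafka internals.
public class BatchSketch {
    static final int BATCH_SIZE_BYTES = 32;   // toy "batch.size"
    static final long LINGER_MS = 5;          // toy "linger.ms"

    private final List<String> batch = new ArrayList<>();
    private int batchBytes = 0;
    private long firstAppendAt = -1;
    int sends = 0;                            // how many "requests" went out

    void send(String msg, long nowMs) {
        if (firstAppendAt < 0) firstAppendAt = nowMs;
        batch.add(msg);
        batchBytes += msg.length();
        if (batchBytes >= BATCH_SIZE_BYTES) flush();          // trigger 1: size
        else if (nowMs - firstAppendAt >= LINGER_MS) flush(); // trigger 2: time
    }

    void flush() {                            // trigger 3: explicit flush()
        if (batch.isEmpty()) return;
        sends++;
        batch.clear();
        batchBytes = 0;
        firstAppendAt = -1;
    }

    public static void main(String[] args) {
        BatchSketch p = new BatchSketch();
        for (int i = 0; i < 10; i++) p.send("msg-" + i, i); // 1 ms apart
        p.flush();
        System.out.println("requests sent: " + p.sends);    // prints "requests sent: 2"
    }
}
```

Ten 5-byte messages collapse into 2 requests: one batch closes on the time trigger, the remainder on the final flush().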

Batch Size

// Small batch.size
props.put("batch.size", 1024);  // 1KB
// → Frequent sends, lower throughput

// Large batch.size
props.put("batch.size", 65536);  // 64KB
// → Infrequent sends, higher throughput, more memory
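The tradeoff can be quantified with simple arithmetic. Assuming 100-byte messages at 10K msg/s (illustrative numbers, not from the source):

```java
// Rough request rate for different batch.size values, assuming
// batches always fill completely before being sent.
public class RequestRate {
    public static void main(String[] args) {
        int msgPerSec = 10_000;
        int msgBytes = 100;                      // assumed average message size

        int[] batchSizes = {1024, 16_384, 65_536};
        for (int batchSize : batchSizes) {
            int msgsPerBatch = Math.max(1, batchSize / msgBytes);
            int requestsPerSec = (int) Math.ceil((double) msgPerSec / msgsPerBatch);
            System.out.println("batch.size=" + batchSize
                + " -> ~" + requestsPerSec + " requests/sec");
        }
    }
}
```

Going from 1KB to 64KB batches drops the request rate from ~1000/sec to ~16/sec for the same message volume.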

linger.ms

// linger.ms=0 — send immediately
props.put("linger.ms", "0");
// → Minimal latency, lower throughput

// linger.ms=5 — wait 5ms
props.put("linger.ms", "5");
// → Higher throughput, slightly more latency

Buffer Memory

// Buffer for storing pending batches
props.put("buffer.memory", 33554432);  // 32MB

// If the buffer is full:
// max.block.ms — how long to wait before erroring
props.put("max.block.ms", 60000);
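The buffer/blocking semantics can be modeled with a semaphore. A sketch under simplified assumptions (no batching, bytes never released); the class name and numbers are invented for the demo:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Sketch of buffer.memory / max.block.ms semantics: send() reserves bytes
// from a fixed pool, waits up to maxBlockMs when the pool is short, and
// fails on timeout (the real producer throws BufferExhaustedException /
// TimeoutException in this situation).
public class BufferPoolSketch {
    private final Semaphore pool;
    private final long maxBlockMs;

    BufferPoolSketch(int bufferMemory, long maxBlockMs) {
        this.pool = new Semaphore(bufferMemory);
        this.maxBlockMs = maxBlockMs;
    }

    void send(int recordBytes) throws InterruptedException {
        if (!pool.tryAcquire(recordBytes, maxBlockMs, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("buffer exhausted after " + maxBlockMs + "ms");
        }
        // In the real producer, bytes return to the pool when the broker acks
        // the batch; that release path is not modeled here.
    }

    public static void main(String[] args) throws InterruptedException {
        BufferPoolSketch producer = new BufferPoolSketch(1024, 50); // 1KB pool, 50ms max block
        producer.send(800);              // fits
        try {
            producer.send(800);          // only 224 bytes left -> blocks, then fails
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```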

Common Mistakes

  1. Too small batch.size:
    batch.size=1KB at 10K msg/s → up to 10K requests/sec instead of ~100
    (assuming messages near 1KB, so ~1 message per batch) — roughly 100x
    broker load.
    
  2. linger.ms=0:
    Sends immediately → batch doesn't have time to fill
    → Lower throughput
    
  3. Too small buffer.memory:
    buffer.memory=1MB → fills up quickly
    → Producer blocks or throws an error
    

Senior Level

Internal Implementation

RecordAccumulator:

The producer uses RecordAccumulator for batching:
- One batch per partition
- Batches are sent when full or on linger.ms
- Compression is applied to the entire batch

Batch Completion:

A batch is considered full when:
1. batch.size (bytes) is reached
2. linger.ms (time) has elapsed
3. flush() is called
4. Producer is closed

Compression and Batching

Compression is applied to the entire batch:
props.put("compression.type", "lz4");

Advantages:
- Larger batch → better compression ratio
- Less network I/O
- Less CPU on the broker
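The "larger batch → better ratio" effect is easy to demonstrate. A sketch using GZIP from the JDK as a stand-in for lz4 (the principle is the same; the sample message is invented):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Per-batch compression beats per-message compression because repetition
// across messages only helps the codec when they are compressed together.
public class BatchCompression {
    static int gzipSize(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.size();
    }

    public static void main(String[] args) throws IOException {
        String msg = "{\"user\":\"alice\",\"event\":\"click\",\"page\":\"/home\"}";
        int n = 100;

        // Compress each message separately vs the whole batch at once.
        int perMessage = n * gzipSize(msg.getBytes(StandardCharsets.UTF_8));
        StringBuilder batch = new StringBuilder();
        for (int i = 0; i < n; i++) batch.append(msg);
        int perBatch = gzipSize(batch.toString().getBytes(StandardCharsets.UTF_8));

        System.out.println("per-message total: " + perMessage + " bytes");
        System.out.println("per-batch total:   " + perBatch + " bytes");
        System.out.println("batch wins: " + (perBatch < perMessage));
    }
}
```

With 100 similar JSON messages, compressing the batch as a whole produces a far smaller payload than compressing each message individually.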

Performance Tuning

# High throughput configuration
batch.size: 65536          # 64KB
linger.ms: 10              # wait 10ms
buffer.memory: 67108864    # 64MB
compression.type: lz4
max.in.flight.requests.per.connection: 5

# Low latency configuration
batch.size: 16384          # 16KB
linger.ms: 0               # don't wait
buffer.memory: 33554432    # 32MB
compression.type: none     # in modern versions lz4 adds only ~1-2% CPU,
                           # so it is often recommended even for low latency
max.in.flight.requests.per.connection: 1

Memory Management

Total memory usage:
  buffer.memory + (batch.size * num_partitions * in_flight)

Example:
  32MB buffer + (64KB * 100 partitions * 5 in-flight)
  = 32MB + ~31MB ≈ 63MB
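The estimate follows directly from the formula; a quick check with the numbers above:

```java
// Worst-case producer memory, following the rough formula above
// (buffer.memory plus batches held for in-flight requests).
public class MemoryEstimate {
    public static void main(String[] args) {
        long bufferMemory = 32L * 1024 * 1024;   // 32MB
        long batchSize = 64L * 1024;             // 64KB
        int partitions = 100;
        int inFlight = 5;

        long inFlightBytes = batchSize * partitions * inFlight;
        long totalBytes = bufferMemory + inFlightBytes;
        System.out.printf("in-flight: %.2fMB%n", inFlightBytes / 1048576.0);
        System.out.printf("total:     %.2fMB%n", totalBytes / 1048576.0);
    }
}
```

This is an upper bound: in practice batches are allocated out of buffer.memory, so actual usage is usually lower.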

If the buffer is full:
  Producer blocks for max.block.ms
  → Then throws BufferExhaustedException

Monitoring

Key metrics:

kafka.producer:batch-size-avg
kafka.producer:batch-size-max
kafka.producer:compression-rate-avg
kafka.producer:buffer-available-bytes
kafka.producer:buffer-exhausted-rate
kafka.producer:record-send-rate

Alerts:

- Buffer exhausted rate > 0 → critical
- Average batch size < 1KB → investigate
- compression-rate-avg close to 1.0 → compression is barely helping; check config
  (the metric is compressed/uncompressed size, so lower is better)

Best Practices

✅ batch.size=32KB-64KB for throughput
✅ linger.ms=5-10ms for batching
✅ compression=lz4 for network savings
✅ Monitor batch size and buffer usage
✅ buffer.memory sufficient for workload

❌ batch.size=0 or too small
❌ linger.ms=0 when throughput is needed
❌ buffer.memory too small
❌ Without monitoring batch metrics
❌ Compression without batching

Architectural Decisions

  1. Batch size tuning — balance between throughput and latency
  2. Linger.ms — time to fill the batch
  3. Compression on batch — better ratio
  4. Buffer management — preventing exhaustion

Summary for Senior

  • Batching is a key mechanism for throughput
  • batch.size and linger.ms are the main tuning parameters
  • Compression is applied to the entire batch
  • Buffer memory management is critical for stability
  • Monitoring batch metrics is mandatory for production

🎯 Interview Cheat Sheet

Must know:

  • Batch — accumulating messages and sending as a package instead of one at a time
  • Batch is sent when: batch.size is reached, linger.ms elapses, or flush() is called
  • batch.size=16KB by default; for throughput: 32KB-64KB
  • linger.ms=0 by default (send immediately); for batching: 5-10ms
  • Compression is applied to the entire batch — larger batch = better compression ratio
  • Buffer memory (32MB by default) — when full, the producer blocks
  • RecordAccumulator — internal batching mechanism, one batch per partition

Common follow-up questions:

  • What happens with batch.size=0? — Batching is effectively disabled: each record goes out in its own request, multiplying broker load (~100x in the example above).
  • Why linger.ms if batch.size fills up? — For low-throughput, the batch doesn’t fill; linger.ms gives time to accumulate.
  • What happens on buffer exhaustion? — Producer blocks for max.block.ms → BufferExhaustedException.
  • How does compression affect batching? — Applied to the entire batch, larger batch = better ratio.

Red flags (DO NOT say):

  • “linger.ms=0 is the optimal choice for throughput” — the batch doesn’t have time to fill
  • “batch.size doesn’t matter” — it determines the number of requests to the broker
  • “Compression on each message” — on the entire batch
  • “Buffer memory doesn’t need tuning” — fills up quickly at high-throughput

Related topics:

  • [[22. How does message compression work]]
  • [[4. What is a message key and how does it affect partitioning]]
  • [[23. What is idempotent producer]]
  • [[3. How is data distributed across partitions]]