What is a batch in the Kafka producer
Junior Level
Definition
Batch — a mechanism where the producer accumulates messages and sends them in a single request instead of one at a time.
Without batching:
msg1 → request
msg2 → request
msg3 → request
3 requests to the broker
With batching:
[msg1, msg2, msg3] → 1 request
1 request to the broker
Why is batching needed?
✅ Higher throughput (fewer requests)
✅ Lower per-message overhead (fewer round trips)
✅ Better network utilization
Key Settings
props.put("batch.size", 16384); // 16KB — batch size
props.put("linger.ms", 5); // wait 5ms to fill the batch
props.put("buffer.memory", 33554432); // 32MB — in-memory buffer
Middle Level
How Batching Works
1. Producer calls send(record)
2. Message is added to the batch for the partition
3. Batch is sent when:
- batch.size is reached
- linger.ms has elapsed
- Producer flush() is called
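The trigger conditions above can be sketched as a toy accumulator. This is a deliberate simplification, not the real RecordAccumulator (which also tracks per-partition queues and broker readiness); the class and field names are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of the batch trigger logic: a batch is flushed when its byte
// size reaches batchSize, when lingerMs elapses, or on an explicit flush().
class ToyBatcher {
    private final int batchSize;          // analogous to batch.size (bytes)
    private final long lingerMs;          // analogous to linger.ms
    private final List<byte[]> batch = new ArrayList<>();
    private int batchBytes = 0;
    private long firstAppendAt = -1;
    int requestsSent = 0;                 // how many "requests" went to the broker

    ToyBatcher(int batchSize, long lingerMs) {
        this.batchSize = batchSize;
        this.lingerMs = lingerMs;
    }

    void send(byte[] record, long nowMs) {
        if (batch.isEmpty()) firstAppendAt = nowMs;
        batch.add(record);
        batchBytes += record.length;
        maybeFlush(nowMs);
    }

    void maybeFlush(long nowMs) {
        boolean full = batchBytes >= batchSize;
        boolean lingerExpired = !batch.isEmpty() && nowMs - firstAppendAt >= lingerMs;
        if (full || lingerExpired) flush();
    }

    void flush() {                        // analogous to Producer.flush()
        if (batch.isEmpty()) return;
        requestsSent++;                   // one request carries the whole batch
        batch.clear();
        batchBytes = 0;
    }
}
```

With a 300-byte batch size, ten 100-byte records produce only four requests (three size-triggered flushes plus one explicit flush) instead of ten.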
Batch Size
// Small batch.size
props.put("batch.size", 1024); // 1KB
// → Frequent sends, lower throughput
// Large batch.size
props.put("batch.size", 65536); // 64KB
// → Infrequent sends, higher throughput, more memory
Linger.ms
// linger.ms=0 — send immediately
props.put("linger.ms", "0");
// → Minimal latency, lower throughput
// linger.ms=5 — wait 5ms
props.put("linger.ms", "5");
// → Higher throughput, slightly more latency
Buffer Memory
// Buffer for storing pending batches
props.put("buffer.memory", 33554432); // 32MB
// If the buffer is full:
// max.block.ms — how long to wait before erroring
props.put("max.block.ms", 60000);
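How the buffer limit and max.block.ms interact can be sketched with a semaphore standing in for the real BufferPool; the class name and exception are illustrative stand-ins, not Kafka API:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Toy sketch of buffer.memory / max.block.ms behaviour: allocate() blocks
// until memory frees up, for at most maxBlockMs, then fails the way the
// producer fails with BufferExhaustedException.
class ToyBufferPool {
    private final Semaphore freeBytes;    // analogous to buffer.memory
    private final long maxBlockMs;        // analogous to max.block.ms

    ToyBufferPool(int totalBytes, long maxBlockMs) {
        this.freeBytes = new Semaphore(totalBytes);
        this.maxBlockMs = maxBlockMs;
    }

    void allocate(int bytes) throws InterruptedException {
        // Wait up to maxBlockMs for enough free bytes, then give up.
        if (!freeBytes.tryAcquire(bytes, maxBlockMs, TimeUnit.MILLISECONDS))
            throw new IllegalStateException("buffer exhausted");
    }

    void release(int bytes) {             // batch acknowledged, memory returned
        freeBytes.release(bytes);
    }
}
```

In the real producer, memory is returned to the pool when a batch is acknowledged, so a slow broker is what makes the pool drain.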
Common Mistakes
- Too small batch.size:
  batch.size=1KB at 10K msg/s → ~10K requests/sec instead of ~100, 100x broker load
- linger.ms=0:
  Sends immediately → batch doesn't have time to fill → lower throughput
- Too small buffer.memory:
  buffer.memory=1MB → fills up quickly → producer blocks or throws an error
Senior Level
Internal Implementation
RecordAccumulator:
The producer uses RecordAccumulator for batching:
- One batch per partition
- Batches are sent when full or on linger.ms
- Compression is applied to the entire batch
Batch Completion:
A batch is considered full when:
1. batch.size (bytes) is reached
2. linger.ms (time) has elapsed
3. flush() is called
4. Producer is closed
Compression and Batching
Compression is applied to the entire batch:
props.put("compression.type", "lz4");
Advantages:
- Larger batch → better compression ratio
- Less network I/O
- Less CPU on the broker
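The "larger batch → better compression ratio" claim is easy to demonstrate with the JDK's built-in GZIP (standing in for lz4, which is not in the standard library): repetitive payloads share a dictionary across the batch, so compressing the batch as a whole beats compressing each message separately.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPOutputStream;

// Compares per-message compression against whole-batch compression.
class BatchCompressionDemo {
    // Returns the GZIP-compressed size of the given payload.
    static int compressedSize(byte[] data) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.size();
    }
}
```

For 100 copies of a small JSON-like message, the summed per-message sizes are far larger than the single batched size, because each separate compression pays the header overhead and cannot exploit cross-message redundancy.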
Performance Tuning
# High throughput configuration
batch.size: 65536 # 64KB
linger.ms: 10 # wait 10ms
buffer.memory: 67108864 # 64MB
compression.type: lz4
max.in.flight.requests.per.connection: 5
# Low latency configuration
batch.size: 16384 # 16KB
linger.ms: 0 # don't wait
buffer.memory: 33554432 # 32MB
compression.type: none      # or lz4: in modern clients lz4 adds only ~1-2% CPU,
                            # so it is often recommended even for low latency
max.in.flight.requests.per.connection: 1
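The same two profiles expressed as Java Properties, matching the document's own snippet style. The keys are real producer config names; the values mirror the YAML above and are starting points, not prescriptions:

```java
import java.util.Properties;

// Two producer tuning profiles from the text, as ready-to-use Properties.
class ProducerProfiles {
    static Properties highThroughput() {
        Properties p = new Properties();
        p.put("batch.size", "65536");         // 64KB batches
        p.put("linger.ms", "10");             // give batches time to fill
        p.put("buffer.memory", "67108864");   // 64MB buffer
        p.put("compression.type", "lz4");
        p.put("max.in.flight.requests.per.connection", "5");
        return p;
    }

    static Properties lowLatency() {
        Properties p = new Properties();
        p.put("batch.size", "16384");         // default 16KB
        p.put("linger.ms", "0");              // send immediately
        p.put("buffer.memory", "33554432");   // default 32MB
        p.put("compression.type", "none");    // or lz4, see note above
        p.put("max.in.flight.requests.per.connection", "1");
        return p;
    }
}
```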
Memory Management
Total memory usage:
buffer.memory + (batch.size * num_partitions * in_flight)
Example:
32MB buffer + (64KB * 100 partitions * 5 in-flight)
= 32MB + 32MB = 64MB
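The same worst-case estimate as code; the partition count and in-flight count are just this example's values, not defaults:

```java
// Worst-case producer memory estimate from the text:
// buffer.memory + (batch.size * num_partitions * in_flight)
class ProducerMemoryEstimate {
    static long worstCaseBytes(long bufferMemory, long batchSize,
                               int partitions, int inFlight) {
        return bufferMemory + batchSize * partitions * inFlight;
    }
}
```

For 32MB buffer, 64KB batches, 100 partitions and 5 in-flight requests this gives 66,322,432 bytes, i.e. roughly 64MB as stated above.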
If the buffer is full:
Producer blocks for max.block.ms
→ Then throws BufferExhaustedException
Monitoring
Key metrics:
kafka.producer:batch-size-avg
kafka.producer:batch-size-max
kafka.producer:compression-rate-avg
kafka.producer:buffer-available-bytes
kafka.producer:buffer-exhausted-rate
kafka.producer:record-send-rate
Alerts:
- Buffer exhausted rate > 0 → critical
- Average batch size < 1KB → investigate
- Compression ratio (uncompressed/compressed) < 1.5 → check compression config
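These alert rules can be encoded directly. The thresholds are the ones stated above; how you obtain the metric values (e.g. via producer.metrics() or JMX) depends on your monitoring setup:

```java
import java.util.ArrayList;
import java.util.List;

// Evaluates the three batch-related alert rules from the text.
class BatchAlerts {
    static List<String> check(double bufferExhaustedRate,
                              double batchSizeAvgBytes,
                              double compressionRatio) {
        List<String> alerts = new ArrayList<>();
        if (bufferExhaustedRate > 0) alerts.add("CRITICAL: buffer exhausted");
        if (batchSizeAvgBytes < 1024) alerts.add("WARN: average batch below 1KB");
        if (compressionRatio < 1.5) alerts.add("WARN: weak compression ratio");
        return alerts;
    }
}
```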
Best Practices
✅ batch.size=32KB-64KB for throughput
✅ linger.ms=5-10ms for batching
✅ compression=lz4 for network savings
✅ Monitor batch size and buffer usage
✅ buffer.memory sufficient for workload
❌ batch.size=0 or too small
❌ linger.ms=0 when throughput is needed
❌ buffer.memory too small
❌ Without monitoring batch metrics
❌ Compression without batching
Architectural Decisions
- Batch size tuning — balance between throughput and latency
- Linger.ms — time to fill the batch
- Compression on batch — better ratio
- Buffer management — preventing exhaustion
Summary for Senior
- Batching is a key mechanism for throughput
- batch.size and linger.ms are the main tuning parameters
- Compression is applied to the entire batch
- Buffer memory management is critical for stability
- Monitoring batch metrics is mandatory for production
🎯 Interview Cheat Sheet
Must know:
- Batch — accumulating messages and sending as a package instead of one at a time
- Batch is sent when: batch.size is reached, linger.ms elapses, or flush() is called
- batch.size=16KB by default; for throughput: 32KB-64KB
- linger.ms=0 by default (send immediately); for batching: 5-10ms
- Compression is applied to the entire batch: larger batch = better compression ratio
- Buffer memory (32MB by default) — when full, the producer blocks
- RecordAccumulator — internal batching mechanism, one batch per partition
Common follow-up questions:
- What happens with batch.size=0? — Sending one message at a time, 100x broker load.
- Why linger.ms if batch.size fills up? — For low-throughput, the batch doesn’t fill; linger.ms gives time to accumulate.
- What happens on buffer exhaustion? — Producer blocks for max.block.ms → BufferExhaustedException.
- How does compression affect batching? — Applied to the entire batch, larger batch = better ratio.
Red flags (DO NOT say):
- “linger.ms=0 is the optimal choice for throughput” — the batch doesn’t have time to fill
- “batch.size doesn’t matter” — it determines the number of requests to the broker
- “Compression on each message” — on the entire batch
- “Buffer memory doesn’t need tuning” — fills up quickly at high-throughput
Related topics:
- [[22. How does message compression work]]
- [[4. What is a message key and how does it affect partitioning]]
- [[23. What is idempotent producer]]
- [[3. How is data distributed across partitions]]