What message delivery guarantees does Kafka provide
Kafka provides three levels of delivery guarantees:
Junior Level
Three levels of guarantees
1. At most once:
Message will be delivered at most once
May be lost on errors
How it's lost: the producer sends and forgets (acks=0), so if the network fails, the message is gone. On the consumer side, auto-commit can advance the offset before processing; if the consumer crashes after the commit but before processing, the message is lost.
2. At least once:
Message will be delivered at least once
May be duplicated on retry
3. Exactly once:
Message will be delivered exactly once
Works ONLY for Kafka-to-Kafka scenarios. For financial operations writing to a DB, you need the Outbox pattern or idempotent writes.
Visualization
At most once:
Producer → Kafka → (possibly lost) → Consumer
At least once:
Producer → Kafka → Consumer → (possibly duplicate)
Exactly once:
Producer → Kafka → Consumer → exactly once
Simple configuration
```java
// At most once
props.put("acks", "0");

// At least once (recommended)
props.put("acks", "all");
props.put("enable.idempotence", "true");

// Exactly once
props.put("enable.idempotence", "true");
props.put("transactional.id", "my-app-1"); // required for the Transaction API; any stable, unique id
props.put("isolation.level", "read_committed"); // consumer side
```
When NOT to use each semantics
- At-most-once — NOT for payments, orders, notifications
- At-least-once — NOT when duplicates are critical (without idempotent processing)
- Exactly-once — NOT for Kafka-to-Database, NOT for Kafka-to-HTTP
Middle Level
Configuring guarantees
At most once:
```java
props.put("acks", "0");
props.put("enable.idempotence", "false");
props.put("enable.auto.commit", "true");
// Fast, but possible data loss
```
At least once:
```java
props.put("acks", "all");
props.put("enable.idempotence", "true");
props.put("retries", Integer.MAX_VALUE);
props.put("enable.auto.commit", "false");
// Reliable, but possible duplicates
```
Exactly once:
```java
// Producer
props.put("enable.idempotence", "true");
props.put("acks", "all");
props.put("transactional.id", "my-app-1"); // any stable, unique id

// Consumer
props.put("isolation.level", "read_committed");
props.put("enable.auto.commit", "false");
```
Transaction API for Exactly Once
```java
producer.initTransactions(); // requires transactional.id to be set

try {
    producer.beginTransaction();

    // Read
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));

    // Process and write results
    for (ConsumerRecord<String, String> record : records) {
        String result = process(record);
        producer.send(new ProducerRecord<>("output", result));
    }

    // Include the consumed offsets in the same transaction
    // (offsetsFrom(records) is a helper that builds a Map<TopicPartition, OffsetAndMetadata>)
    producer.sendOffsetsToTransaction(offsetsFrom(records), consumer.groupMetadata());

    // Commit writes and offsets atomically
    producer.commitTransaction();
} catch (Exception e) {
    // Roll back on error
    producer.abortTransaction();
}
```
Common mistakes
- At least once without idempotency: retry → duplicates in the topic → processed twice
- Commit before processing: `consumer.commitSync()` first, `process(records)` after; if the consumer crashes in between, the data is lost
- Exactly once without a transaction: producer and consumer not in the same transaction → possible loss or duplication
Senior Level
Internal Implementation
Idempotent Producer:
Each producer gets a unique PID (Producer ID)
Each message gets a Sequence Number
Broker tracks the sequence
Duplicate with same PID + Sequence → rejected
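The PID + Sequence Number check can be sketched as a simplified in-memory model; the class and method names below are illustrative only, not real broker code:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of broker-side idempotent-producer dedup: the broker remembers
// the last sequence number seen per producer id (PID) and rejects a write
// whose sequence it has already accepted.
public class SequenceDedup {
    private final Map<Long, Integer> lastSeq = new HashMap<>();

    /** Returns true if the write is accepted, false if it is a duplicate retry. */
    public boolean append(long pid, int sequence) {
        Integer last = lastSeq.get(pid);
        if (last != null && sequence <= last) {
            return false; // same PID + sequence → duplicate → rejected
        }
        lastSeq.put(pid, sequence);
        return true;
    }

    public static void main(String[] args) {
        SequenceDedup broker = new SequenceDedup();
        System.out.println(broker.append(1L, 0)); // first write: true
        System.out.println(broker.append(1L, 1)); // next write: true
        System.out.println(broker.append(1L, 1)); // retried duplicate: false
    }
}
```

A producer retry resends the same PID and sequence, so the broker can drop it without the application ever seeing a duplicate.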
Transaction Coordinator:
Separate component coordinates transactions
Stores transaction state in __transaction_state topic
Ensures atomic commit/abort
Exactly Once — limitations
Exactly Once only works for:
Kafka → Kafka (read-process-write)
Does NOT work for:
Kafka → Database (Kafka doesn't control DB transaction)
Kafka → HTTP API (no guarantees on API side)
Solution for external systems:
1. Idempotent writes to DB (unique constraints)
2. Two-phase commit (XA — not recommended)
3. Outbox pattern
4. Debezium CDC
End-to-End Exactly Once
Scenario: Kafka → Processing → PostgreSQL
Solution: Outbox Pattern
1. Write to outbox table (in the same transaction)
2. Debezium CDC reads outbox
3. Sends to Kafka
4. Consumer processes and writes to target table
5. Idempotent upsert by business key
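Step 5 can be modeled with a plain map standing in for the target table; `UpsertModel` and its methods are illustrative names, not a real JDBC upsert:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the idempotent upsert: even if CDC delivers the same outbox
// event twice (at-least-once), an upsert keyed by the business key still
// leaves exactly one row per key in the target table.
public class UpsertModel {
    private final Map<String, String> table = new LinkedHashMap<>();

    /** Idempotent upsert: the last write for a key wins, duplicates are harmless. */
    public void upsert(String businessKey, String value) {
        table.put(businessKey, value);
    }

    public int rowCount() {
        return table.size();
    }

    public static void main(String[] args) {
        UpsertModel target = new UpsertModel();
        // "order-42" is delivered twice by the CDC pipeline
        for (String key : List.of("order-42", "order-43", "order-42")) {
            target.upsert(key, "PAID");
        }
        System.out.println(target.rowCount()); // 2 rows, not 3
    }
}
```

In a real database the same effect comes from a unique constraint on the business key plus an upsert statement.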
Failure Scenarios
1. Producer failure:
Sent → didn't receive ack → retry
Idempotent producer → duplicate rejected
2. Broker failure:
Written to Leader → didn't replicate → Leader crashed
acks=all + min.insync.replicas=2 → data saved
3. Consumer failure:
Read → didn't process → didn't commit
On restart → reads again (at least once)
Idempotent processing → duplicate handled correctly
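Scenario 3 can be sketched as a toy model in which an idempotent handler absorbs the redelivered record; `ReplayModel` is an illustrative name, not Kafka consumer code:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of consumer failure: the consumer crashes after reading but
// before committing, re-reads the record on restart (at-least-once), and an
// idempotent handler keyed by event id absorbs the duplicate.
public class ReplayModel {
    private final Set<String> processedIds = new HashSet<>();
    private final List<String> sideEffects = new ArrayList<>();

    /** Idempotent handler: a replayed event id produces no second side effect. */
    public void handle(String eventId) {
        if (!processedIds.add(eventId)) {
            return; // already processed → skip the duplicate
        }
        sideEffects.add("applied:" + eventId);
    }

    public int sideEffectCount() {
        return sideEffects.size();
    }

    public static void main(String[] args) {
        ReplayModel consumer = new ReplayModel();
        consumer.handle("evt-1");
        // crash before commit → broker redelivers evt-1 after restart
        consumer.handle("evt-1");
        consumer.handle("evt-2");
        System.out.println(consumer.sideEffectCount()); // 2, not 3
    }
}
```

This is exactly the "at least once + idempotent consumer" combination recommended below: redelivery is allowed, duplicated effects are not.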
Performance Trade-offs
| Guarantee | Latency | Throughput | Data Safety |
|---|---|---|---|
| At most once | Minimum | Maximum | Low |
| At least once | Medium | High | High |
| Exactly once | High | Medium | Maximum |
Best Practices
✅ At least once by default
✅ Exactly once when critical (finance, billing)
✅ Idempotent processing on consumer
✅ Idempotent producer (enable.idempotence=true)
✅ Transactional read-process-write for Kafka-to-Kafka
✅ Outbox pattern for external systems
❌ At most once for important data
❌ Without duplicate handling
❌ Exactly once without understanding limitations
❌ Commit before processing
Architectural decisions
- At least once + idempotent consumer — optimal balance
- Exactly once only for Kafka-to-Kafka: doesn't work with external systems
- Outbox pattern — for end-to-end guarantees
- Idempotent writes — universal solution for external systems
Summary for Senior
- Exactly Once works only in the Kafka-to-Kafka scenario
- For external systems use Outbox pattern or idempotent writes
- At least once + idempotent consumer — the optimal approach
- Transaction API provides atomic read-process-write
- Understanding limitations is critical for correct architecture
🎯 Interview Cheat Sheet
Must know:
- Three guarantees: at-most-once (0-1 times), at-least-once (1+ times), exactly-once (exactly 1)
- At-most-once: acks=0, fast but possible data loss
- At-least-once: acks=all + enable.idempotence=true, reliable but possible duplicates
- Exactly-once: Transaction API, works ONLY for Kafka-to-Kafka scenarios
- For external systems (DB, HTTP) — Outbox pattern or idempotent writes
- Idempotent producer: PID + Sequence Number, broker rejects duplicates on retry
- Commit BEFORE processing = data loss; commit AFTER = at-least-once
Common follow-up questions:
- Why doesn’t exactly-once work for Kafka → DB? — Kafka doesn’t control the DB transaction.
- How to ensure end-to-end guarantee? — Outbox pattern: write to outbox_table + CDC (Debezium).
- What does idempotent producer do? — Unique PID + sequence number, broker rejects duplicates on retry.
- Which guarantee to choose by default? — At-least-once + idempotent consumer processing.
Red flags (DO NOT say):
- “Exactly-once works for any system” — only Kafka-to-Kafka
- “At-most-once is fine for payments” — data loss is unacceptable
- “Commit before processing is fine” — on crash, data is lost
- “Idempotent producer isn’t needed” — without it, retry = duplicates
Related topics:
- [[10. What is the difference between at-most-once, at-least-once and exactly-once]]
- [[11. How to configure exactly-once semantics]]
- [[23. What is idempotent producer]]
- [[20. What is producer acknowledgment and what modes exist (acks=0,1,all)]]