Question 9 · Section 15

What message delivery guarantees does Kafka provide?

Kafka provides three levels of delivery guarantees: at most once, at least once, and exactly once.


Junior Level

Three levels of guarantees

Kafka provides three levels of delivery guarantees:

1. At most once:

The message is delivered at most once
May be lost on errors

How it's lost: the producer sends and forgets (acks=0); if the network fails, the message is gone. On the consumer side, auto-commit can advance the offset before processing; if the consumer crashes after the commit but before processing, the message is lost.

2. At least once:

The message is delivered at least once
May be duplicated on retry

How duplicates appear: the broker writes the message but the ack is lost; the producer retries and the broker writes a second copy (unless idempotence is enabled).

3. Exactly once:

The message is delivered exactly once
Works ONLY for Kafka-to-Kafka scenarios. For financial operations that write to a database, you need the Outbox pattern or idempotent writes.

Visualization

At most once:
  Producer → Kafka → (possibly lost) → Consumer

At least once:
  Producer → Kafka → Consumer → (possibly duplicate)

Exactly once:
  Producer → Kafka → Consumer → exactly once

Simple configuration

// At most once
props.put("acks", "0");

// At least once (recommended)
props.put("acks", "all");
props.put("enable.idempotence", "true");

// Exactly once
props.put("enable.idempotence", "true");
props.put("isolation.level", "read_committed");

When NOT to use each guarantee

  • At-most-once — NOT for payments, orders, notifications
  • At-least-once — NOT when duplicates are critical (without idempotent processing)
  • Exactly-once — NOT for Kafka-to-Database, NOT for Kafka-to-HTTP
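The "at-least-once without idempotent processing" caveat above can be sketched in plain Java, no broker required: the consumer remembers which business keys it has already processed and skips redeliveries. The `IdempotentHandler` class and its key scheme are illustrative assumptions, not Kafka API.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: dedup by a business key so a redelivered message is applied once.
public class IdempotentHandler {
    private final Set<String> processedKeys = new HashSet<>();
    private int appliedCount = 0;

    // Returns true if the message was applied, false if it was a duplicate.
    public boolean handle(String businessKey) {
        if (!processedKeys.add(businessKey)) {
            return false;          // already processed: skip the duplicate
        }
        appliedCount++;            // real work (DB write, etc.) would go here
        return true;
    }

    public int appliedCount() { return appliedCount; }
}
```

With this in place, at-least-once redelivery of "order-42" returns false and the side effect happens exactly once.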

Middle Level

Configuring guarantees

At most once:

props.put("acks", "0");
props.put("enable.idempotence", "false");
props.put("enable.auto.commit", "true");
// Fast, but possible data loss

At least once:

props.put("acks", "all");
props.put("enable.idempotence", "true");
props.put("retries", Integer.MAX_VALUE);
props.put("enable.auto.commit", "false");
// Reliable, but possible duplicates

Exactly once:

// Producer
props.put("transactional.id", "my-tx-id"); // required for transactions
props.put("enable.idempotence", "true");
props.put("acks", "all");

// Consumer
props.put("isolation.level", "read_committed");
props.put("enable.auto.commit", "false");

Transaction API for Exactly Once

producer.initTransactions();

try {
    producer.beginTransaction();

    // Read
    ConsumerRecords<String, String> records =
        consumer.poll(Duration.ofMillis(100));

    // Process and write results
    for (ConsumerRecord<String, String> record : records) {
        String result = process(record);
        producer.send(new ProducerRecord<>("output", result));
    }

    // Commit the consumed offsets as part of the same transaction;
    // offsetsFor(records) is a sketch helper that builds a
    // Map<TopicPartition, OffsetAndMetadata> with last offset + 1 per partition
    producer.sendOffsetsToTransaction(offsetsFor(records), consumer.groupMetadata());

    // Read and write commit atomically
    producer.commitTransaction();

} catch (ProducerFencedException e) {
    // Fatal: another instance with the same transactional.id took over
    producer.close();
} catch (KafkaException e) {
    // Rollback on error; the consumed offsets are not committed either
    producer.abortTransaction();
}

Common mistakes

  1. At least once without idempotency:
    Retry → duplicates in topic → processed twice
    
  2. Commit before processing:
    consumer.commitSync();  // commit first
    process(records);       // then process
    // If crashes → data lost
    
  3. Exactly once without transaction:
    Producer and consumer not in the same transaction
    → Possible loss or duplication
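Mistake 2 (commit before processing) can be demonstrated with a tiny offset simulation. `SafeConsumerLoop` is an illustrative sketch of the correct order: process first, commit after, so a crash causes re-reading (at-least-once) instead of loss.

```java
import java.util.List;

// Sketch: commit the offset only AFTER processing, so a crash mid-batch
// causes re-delivery rather than data loss.
public class SafeConsumerLoop {
    private long committedOffset = 0;   // what a real consumer would commitSync()

    // Processes messages from the committed offset onward.
    // crashAfter lets a test simulate a crash mid-batch.
    public int run(List<String> log, StringBuilder out, int crashAfter) {
        int processed = 0;
        for (long i = committedOffset; i < log.size(); i++) {
            if (processed == crashAfter) {
                return processed;           // "crash": offset NOT committed yet
            }
            out.append(log.get((int) i));   // process first...
            committedOffset = i + 1;        // ...commit after
            processed++;
        }
        return processed;
    }
}
```

After a simulated crash, a restart resumes from the last committed offset and nothing is skipped; committing first would instead skip the unprocessed messages.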
    

Senior Level

Internal Implementation

Idempotent Producer:

Each producer gets a unique PID (Producer ID)
Each message gets a sequence number per partition
The broker tracks the last sequence number written

A duplicate with the same PID + sequence number → rejected
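The PID + sequence-number check can be simulated in a few lines. `BrokerDedup` is an illustrative sketch of the idea, not actual broker internals (which track sequences per partition and keep a bounded window).

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: how a broker partition can reject idempotent-producer retries.
// Tracks the last sequence number seen per producer ID (PID).
public class BrokerDedup {
    private final Map<Long, Integer> lastSeqByPid = new HashMap<>();

    // Returns true if the write is accepted, false if it is a duplicate retry.
    public boolean append(long pid, int sequence) {
        Integer last = lastSeqByPid.get(pid);
        if (last != null && sequence <= last) {
            return false;               // already written: reject duplicate
        }
        lastSeqByPid.put(pid, sequence);
        return true;                    // append to the log
    }
}
```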

Transaction Coordinator:

Separate component coordinates transactions
Stores transaction state in __transaction_state topic
Ensures atomic commit/abort

Exactly Once — limitations

Exactly Once only works for:
  Kafka → Kafka (read-process-write)

Does NOT work for:
  Kafka → Database (Kafka doesn't control DB transaction)
  Kafka → HTTP API (no guarantees on API side)

Solution for external systems:

1. Idempotent writes to DB (unique constraints)
2. Two-phase commit (XA — not recommended)
3. Outbox pattern
4. Debezium CDC
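Option 3 (Outbox pattern) hinges on writing the business row and the outgoing event in one local DB transaction. A minimal in-memory sketch of that atomicity, with `OutboxStore` as an assumed name rather than a real library:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: business data and the outbox event commit atomically; a CDC tool
// (e.g. Debezium) would later ship outbox rows to Kafka.
public class OutboxStore {
    public final Map<String, String> orders = new HashMap<>();
    public final List<String> outbox = new ArrayList<>();

    // Simulates one local DB transaction: either both writes land or neither.
    public void saveOrder(String orderId, String data, boolean failMidway) {
        Map<String, String> ordersSnapshot = new HashMap<>(orders);
        List<String> outboxSnapshot = new ArrayList<>(outbox);
        try {
            orders.put(orderId, data);
            if (failMidway) {
                throw new RuntimeException("crash between the two writes");
            }
            outbox.add("OrderCreated:" + orderId);
        } catch (RuntimeException e) {
            orders.clear(); orders.putAll(ordersSnapshot);   // rollback
            outbox.clear(); outbox.addAll(outboxSnapshot);
        }
    }
}
```

Because the order and its event can never diverge, the downstream pipeline only has to deal with at-least-once delivery, not with missing events.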

End-to-End Exactly Once

Scenario: Kafka → Processing → PostgreSQL

Solution: Outbox Pattern
1. Write to outbox table (in the same transaction)
2. Debezium CDC reads outbox
3. Sends to Kafka
4. Consumer processes and writes to target table
5. Idempotent upsert by business key
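Step 5 (idempotent upsert by business key) means a redelivered event overwrites the same row instead of creating a second one; in SQL this would be an `INSERT ... ON CONFLICT` style statement. A minimal in-memory sketch, with `UpsertTarget` as an illustrative name:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: applying the same event twice yields one row (upsert by key),
// so at-least-once delivery becomes effectively exactly-once in the target.
public class UpsertTarget {
    private final Map<String, String> rows = new HashMap<>();

    public void apply(String businessKey, String value) {
        rows.put(businessKey, value);   // insert or overwrite, never duplicate
    }

    public int rowCount() { return rows.size(); }
}
```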

Failure Scenarios

1. Producer failure:

Sent → didn't receive ack → retry
Idempotent producer → duplicate rejected

2. Broker failure:

Written to Leader → didn't replicate → Leader crashed
acks=all + min.insync.replicas=2 → data saved

3. Consumer failure:

Read → didn't process → didn't commit
On restart → reads again (at least once)
Idempotent processing → duplicate handled correctly

Performance Trade-offs

Guarantee        Latency    Throughput    Data safety
At most once     Minimum    Maximum       Low
At least once    Medium     High          High
Exactly once     High       Medium        Maximum

Best Practices

✅ At least once by default
✅ Exactly once when critical (finance, billing)
✅ Idempotent processing on consumer
✅ Idempotent producer (enable.idempotence=true)
✅ Transactional read-process-write for Kafka-to-Kafka
✅ Outbox pattern for external systems

❌ At most once for important data
❌ Without duplicate handling
❌ Exactly once without understanding limitations
❌ Commit before processing

Architectural decisions

  1. At least once + idempotent consumer — optimal balance
  2. Exactly once only for Kafka-to-Kafka — doesn’t work with external systems
  3. Outbox pattern — for end-to-end guarantees
  4. Idempotent writes — universal solution for external systems

Summary for Senior

  • Exactly Once works only in the Kafka-to-Kafka scenario
  • For external systems use Outbox pattern or idempotent writes
  • At least once + idempotent consumer — the optimal approach
  • Transaction API provides atomic read-process-write
  • Understanding limitations is critical for correct architecture

🎯 Interview Cheat Sheet

Must know:

  • Three guarantees: at-most-once (0-1 times), at-least-once (1+ times), exactly-once (exactly 1)
  • At-most-once: acks=0, fast but possible data loss
  • At-least-once: acks=all + enable.idempotence=true, reliable but possible duplicates
  • Exactly-once: Transaction API, works ONLY for Kafka-to-Kafka scenarios
  • For external systems (DB, HTTP) — Outbox pattern or idempotent writes
  • Idempotent producer: PID + Sequence Number, broker rejects duplicates on retry
  • Commit BEFORE processing = data loss; commit AFTER = at-least-once

Common follow-up questions:

  • Why doesn’t exactly-once work for Kafka → DB? — Kafka doesn’t control the DB transaction.
  • How to ensure end-to-end guarantee? — Outbox pattern: write to outbox_table + CDC (Debezium).
  • What does idempotent producer do? — Unique PID + sequence number, broker rejects duplicates on retry.
  • Which guarantee to choose by default? — At-least-once + idempotent consumer processing.

Red flags (DO NOT say):

  • “Exactly-once works for any system” — only Kafka-to-Kafka
  • “At-most-once is fine for payments” — data loss is unacceptable
  • “Commit before processing is fine” — on crash, data is lost
  • “Idempotent producer isn’t needed” — without it, retry = duplicates

Related topics:

  • [[10. What is the difference between at-most-once, at-least-once and exactly-once]]
  • [[11. How to configure exactly-once semantics]]
  • [[23. What is idempotent producer]]
  • [[20. What is producer acknowledgment and what modes exist (acks=0,1,all)]]