Question 10 · Section 15

What is the difference between at-most-once, at-least-once and exactly-once delivery?

Junior Level

Definitions

At-most-once:

  • Message will be delivered 0 or 1 times
  • May be lost on errors
  • Lowest latency (the ~1x baseline)

At-least-once:

  • Message will be delivered 1 or more times
  • May be duplicated on retry
  • Data won’t be lost

Exactly-once:

  • Message will be delivered exactly 1 time
  • No losses, no duplicates
  • Roughly 2-3x the baseline latency

Real-life example

At-most-once:
  Sent SMS → didn't check if delivered
  → May not arrive

At-least-once:
  Sent SMS → didn't receive confirmation → sent again
  → May arrive twice

Exactly-once:
  Sent SMS with a unique ID → resent until confirmed → recipient ignores repeats of the same ID
  → Shown exactly once
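
The three SMS scenarios above can be reproduced with a toy model. This is a minimal sketch, not Kafka code: `Receiver`, `sendOnce` and `sendUntilAcked` are hypothetical names, and the "network" is reduced to two knobs (drop the message, or drop some acknowledgements). It treats exactly-once as at-least-once delivery plus deduplication by ID on the receiving side.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model: a receiver plus a network that can drop a message or its acknowledgement. */
final class DeliverySemanticsDemo {

    /** Receiver that records every message ID it actually handles. */
    static final class Receiver {
        final List<String> handled = new ArrayList<>();
        private final Set<String> seen = new HashSet<>();
        private final boolean deduplicate;

        Receiver(boolean deduplicate) {
            this.deduplicate = deduplicate;
        }

        void receive(String id) {
            // With deduplication on, a repeated ID is accepted but not handled again.
            if (deduplicate && !seen.add(id)) {
                return;
            }
            handled.add(id);
        }
    }

    /** At-most-once: one attempt, no retry. A dropped message is simply gone. */
    static void sendOnce(Receiver receiver, String id, boolean messageDropped) {
        if (!messageDropped) {
            receiver.receive(id);
        }
    }

    /**
     * At-least-once: resend until an acknowledgement gets through.
     * lostAcks is the number of acknowledgements the network drops first.
     */
    static void sendUntilAcked(Receiver receiver, String id, int lostAcks) {
        for (int attempt = 0; attempt <= lostAcks; attempt++) {
            receiver.receive(id); // the message arrives each time; only the ack is lost
        }
    }

    public static void main(String[] args) {
        Receiver atMostOnce = new Receiver(false);
        sendOnce(atMostOnce, "sms-1", true);
        System.out.println("at-most-once:  " + atMostOnce.handled);

        Receiver atLeastOnce = new Receiver(false);
        sendUntilAcked(atLeastOnce, "sms-1", 1);
        System.out.println("at-least-once: " + atLeastOnce.handled);

        Receiver exactlyOnce = new Receiver(true);
        sendUntilAcked(exactlyOnce, "sms-1", 1);
        System.out.println("exactly-once:  " + exactlyOnce.handled);
    }
}
```

With one dropped message and one lost acknowledgement, the three receivers end up with zero, two and one handled message respectively.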

When to use what

| Semantics     | When to use                          |
|---------------|--------------------------------------|
| At-most-once  | Metrics, logs, monitoring            |
| At-least-once | Most business systems                |
| Exactly-once  | Finance, billing, banking operations |

Middle Level

How it works in Kafka

At-most-once:

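// Producer: fire-and-forget, does not wait for a broker acknowledgement
props.put("acks", "0");

// Consumer: offsets are auto-committed on a timer during poll(), whether or not
// processing succeeded (strict at-most-once means committing before processing)
props.put("enable.auto.commit", "true");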

At-least-once:

// Producer: waits for acknowledgement from all in-sync replicas and retries on failure
props.put("acks", "all");
props.put("enable.idempotence", "true");  // producer retries do not duplicate records in the log
props.put("retries", Integer.MAX_VALUE);

// Consumer — commits after processing
props.put("enable.auto.commit", "false");
consumer.commitSync();  // after process()

Exactly-once:

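// Producer: transactional writes (transactional.id requires idempotence)
props.put("enable.idempotence", "true");
props.put("transactional.id", "my-tx-id");  // unique per producer instance, stable across restarts

// Consumer: skips records from aborted or still-open transactions
props.put("isolation.level", "read_committed");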

Comparison table

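| Parameter   | At-most-once | At-least-once   | Exactly-once |
|-------------|--------------|-----------------|--------------|
| Data loss   | Yes          | No              | No           |
| Duplicates  | No           | Yes             | No           |
| Performance | Maximum      | High            | Medium       |
| Complexity  | Minimum      | Medium          | High         |
| Use case    | Metrics      | Business events | Finance      |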

Idempotency — key to at-least-once

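// Idempotent processing: skip messages whose ID has already been handled
public void process(Message msg) {
    if (!alreadyProcessed(msg.getId())) {
        doProcess(msg);
        // Store the ID in the same transaction as doProcess's writes;
        // otherwise a crash between the two steps still produces a duplicate.
        markAsProcessed(msg.getId());
    }
}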
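
A runnable variant of the same idea, as a minimal sketch: the class name `IdempotentHandler` and the in-memory set are assumptions made for illustration, and a production consumer would keep the processed IDs in a durable store (ideally the same database as the business data). The point it shows is that the "already processed?" check and the "mark as processed" step should be a single atomic operation.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory sketch of an idempotent handler; a real one would use a durable store. */
final class IdempotentHandler {
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final AtomicInteger sideEffects = new AtomicInteger();

    /** Returns true if the message was handled, false if it was a duplicate. */
    boolean handle(String messageId) {
        // add() is atomic: for a given ID exactly one caller gets true.
        if (!processedIds.add(messageId)) {
            return false;
        }
        try {
            sideEffects.incrementAndGet(); // stands in for the real business action
            return true;
        } catch (RuntimeException e) {
            processedIds.remove(messageId); // allow a retry if the action failed
            throw e;
        }
    }

    int sideEffectCount() {
        return sideEffects.get();
    }

    public static void main(String[] args) {
        IdempotentHandler handler = new IdempotentHandler();
        System.out.println(handler.handle("msg-1")); // first delivery
        System.out.println(handler.handle("msg-1")); // redelivery of the same message
        System.out.println(handler.sideEffectCount());
    }
}
```

The second call returns false and the side-effect counter stays at 1, which is the behaviour an at-least-once consumer needs.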

Common mistakes

  1. At-least-once without duplicate handling:
    Duplicates → double charges → bugs in business logic
    
  2. Exactly-once without understanding limitations:
    Kafka → PostgreSQL: the exactly-once guarantee stops at the Kafka boundary
    A Kafka transaction cannot include the PostgreSQL write
    
  3. Commit before processing:
    consumer.commitSync();  // ❌ offset is committed first
    process(records);       // if this crashes, the records are never reprocessed
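
The effect of commit order in mistake 3 can be seen in a small simulation. This is a sketch under simplifying assumptions: a list stands in for a partition, an `int` for the committed offset, and `CommitOrderDemo` with its `crashAt` parameter is a hypothetical name, not a Kafka API.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy consumer: a list stands in for a partition, an int for the committed offset. */
final class CommitOrderDemo {
    final List<String> log;
    final List<String> processed = new ArrayList<>();
    int committedOffset = 0;

    CommitOrderDemo(List<String> log) {
        this.log = log;
    }

    /**
     * Handles records from the committed offset onward.
     * If crashAt >= 0, the consumer "crashes" between the two steps at that offset.
     */
    void run(boolean commitFirst, int crashAt) {
        for (int offset = committedOffset; offset < log.size(); offset++) {
            if (commitFirst) {
                committedOffset = offset + 1;      // commit, then process
                if (offset == crashAt) return;     // crash: record is never processed
                processed.add(log.get(offset));
            } else {
                processed.add(log.get(offset));    // process, then commit
                if (offset == crashAt) return;     // crash: offset is never committed
                committedOffset = offset + 1;
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = List.of("a", "b", "c");

        CommitOrderDemo early = new CommitOrderDemo(log);
        early.run(true, 1);   // crash while handling "b"
        early.run(true, -1);  // restart from the committed offset
        System.out.println("commit first: " + early.processed);

        CommitOrderDemo late = new CommitOrderDemo(log);
        late.run(false, 1);
        late.run(false, -1);
        System.out.println("commit last:  " + late.processed);
    }
}
```

Committing first loses "b" (at-most-once); committing last processes "b" twice (at-least-once), which is why it has to be paired with duplicate handling.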
    

Senior Level

“End-to-end” problem

Exactly-once in Kafka works “out of the box” only for Kafka-to-Kafka:

Kafka → Processing → Kafka  ✅ Works
Kafka → Processing → PostgreSQL  ❌ Does not work

Why?

Kafka can guarantee atomic commit only for:
- Writing to Kafka topic
- Committing offsets in __consumer_offsets

Kafka cannot manage a PostgreSQL transaction!

The usual workaround is the Outbox pattern: (1) INSERT into outbox_table and business_data in one DB transaction. (2) A separate process reads outbox_table and sends to Kafka. (3) After successful delivery, DELETE the row from outbox_table.

Solutions for external systems

1. Idempotent writes:

-- PostgreSQL: idempotent insert by unique key (a repeated id is silently skipped)
INSERT INTO orders (id, data)
VALUES (?, ?)
ON CONFLICT (id) DO NOTHING;

2. Outbox Pattern:

1. Business operation → write to outbox (same transaction)
2. CDC (Debezium) → reads outbox
3. Sends to Kafka
4. Consumer → processes → writes to target
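
The flow above can be sketched in memory to show what the pattern does and does not guarantee. Everything here is an assumption for illustration: `OutboxDemo`, `placeOrder` and `relayOnce` are hypothetical names, plain collections stand in for the database tables and the Kafka topic, and a `synchronized` method stands in for a database transaction. The relay is a simple poller rather than Debezium.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** In-memory outbox sketch: collections stand in for DB tables and a Kafka topic. */
final class OutboxDemo {
    final List<String> orders = new ArrayList<>();    // business table
    final Deque<String> outbox = new ArrayDeque<>();  // outbox table
    final List<String> topic = new ArrayList<>();     // stand-in for Kafka

    /** Step 1: the business row and the outbox row are written together. */
    synchronized void placeOrder(String orderId) {
        orders.add(orderId);
        outbox.addLast("OrderPlaced:" + orderId);
    }

    /**
     * Steps 2-3: publish the oldest outbox row, then delete it.
     * crashBeforeDelete simulates the relay dying right after a successful send.
     */
    synchronized void relayOnce(boolean crashBeforeDelete) {
        String event = outbox.peekFirst();
        if (event == null) {
            return; // nothing to publish
        }
        topic.add(event);
        if (crashBeforeDelete) {
            return; // the row stays in the outbox, so it will be sent again
        }
        outbox.removeFirst();
    }

    public static void main(String[] args) {
        OutboxDemo demo = new OutboxDemo();
        demo.placeOrder("42");
        demo.relayOnce(true);   // sent, but the relay crashes before deleting the row
        demo.relayOnce(false);  // after restart the same row is sent again
        System.out.println(demo.topic);
        System.out.println(demo.outbox.isEmpty());
    }
}
```

The event is never lost, but it reaches the topic twice in the crash scenario. The outbox therefore gives at-least-once publication, and the downstream consumer still needs to be idempotent.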

3. Two-Phase Commit (XA — not recommended):

Prepare phase → all participants ready
Commit phase → all commit
Problems: locks, complexity, performance
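
The two phases can be sketched with a tiny coordinator. This is a hedged illustration of the protocol shape only, not the XA API: `Participant`, `Recording` and `run` are hypothetical names, and the sketch ignores timeouts, coordinator crashes and recovery, which are where most of the real-world problems come from.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal two-phase commit coordinator over in-memory participants. */
final class TwoPhaseCommitDemo {

    interface Participant {
        boolean prepare();  // vote: true means "ready to commit"
        void commit();
        void rollback();
    }

    /** Participant that records what happened to it. */
    static final class Recording implements Participant {
        final boolean vote;
        String state = "idle";

        Recording(boolean vote) {
            this.vote = vote;
        }

        public boolean prepare() {
            state = vote ? "prepared" : "refused";
            return vote;
        }

        public void commit() {
            state = "committed";
        }

        public void rollback() {
            state = "rolled back";
        }
    }

    /** Returns true if every participant committed, false if the round was aborted. */
    static boolean run(List<? extends Participant> participants) {
        List<Participant> prepared = new ArrayList<>();
        // Phase 1: ask everyone to prepare.
        for (Participant p : participants) {
            if (!p.prepare()) {
                // A single "no" vote aborts everyone who already prepared.
                for (Participant q : prepared) {
                    q.rollback();
                }
                return false;
            }
            prepared.add(p);
        }
        // Phase 2: all voted yes, so everyone commits.
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }

    public static void main(String[] args) {
        Recording db = new Recording(true);
        Recording broker = new Recording(false); // votes "no" in the prepare phase
        boolean committed = run(List.of(db, broker));
        System.out.println(committed + ": " + db.state + " / " + broker.state);
    }
}
```

Between `prepare()` and the final decision each participant has to hold its locks, which is the source of the lock and performance problems listed above.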

Internal Implementation Details

Idempotent Producer:

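PID (Producer ID) + per-partition sequence number on every batch
Broker rejects a batch whose sequence number it has already written
Deduplication works per partition and per producer session (a restarted non-transactional producer gets a new PID)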
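
A toy model of the sequence check for a single partition is sketched below. It is deliberately simplified and the names (`SequenceDedupDemo`, `Outcome`, `append`) are hypothetical: a real broker numbers batches rather than single records and also tracks the producer epoch, neither of which is modelled here.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of broker-side deduplication for one partition. */
final class SequenceDedupDemo {
    enum Outcome { APPENDED, DUPLICATE, OUT_OF_ORDER }

    // Last accepted sequence number per producer ID (for this partition only).
    private final Map<Long, Integer> lastSequence = new HashMap<>();
    private int appended = 0;

    Outcome append(long producerId, int sequence) {
        int last = lastSequence.getOrDefault(producerId, -1);
        if (sequence <= last) {
            return Outcome.DUPLICATE;     // retry of something already written
        }
        if (sequence != last + 1) {
            return Outcome.OUT_OF_ORDER;  // a gap: an earlier write is missing
        }
        lastSequence.put(producerId, sequence);
        appended++;
        return Outcome.APPENDED;
    }

    int appendedCount() {
        return appended;
    }

    public static void main(String[] args) {
        SequenceDedupDemo partition = new SequenceDedupDemo();
        System.out.println(partition.append(7L, 0)); // first write
        System.out.println(partition.append(7L, 0)); // producer retry after a lost ack
        System.out.println(partition.append(7L, 2)); // sequence 1 never arrived
        System.out.println(partition.append(8L, 0)); // a different producer starts at 0
    }
}
```

Because the counter is keyed by producer ID, a producer that restarts with a fresh ID is not deduplicated against its previous session, which matches the "per session" limitation.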

Transactions:

Transaction Coordinator → manages state
__transaction_state topic → stores state
Commit/Abort → atomic operations

Performance Analysis

Latency comparison (relative, rough estimates that vary with workload):
At-most-once:     1x
At-least-once:    1.2x
Exactly-once:     2-3x

Throughput comparison (relative, rough estimates that vary with workload):
At-most-once:     100%
At-least-once:    85-95%
Exactly-once:     50-70%

When to Use What

At-most-once:

  • Monitoring metrics
  • Application logs
  • Real-time analytics (acceptable loss)

At-least-once:

  • Most business systems
  • Order processing
  • Notifications
  • Cache updates

Exactly-once:

  • Financial transactions
  • Billing
  • Bank transfers
  • Kafka-to-Kafka pipelines (the only case covered out of the box)

Best Practices

✅ Choose at-least-once by default
✅ Strive for idempotent consumers
✅ Exactly-once only for Kafka-to-Kafka
✅ Outbox pattern for external systems
✅ Idempotent writes to DB

❌ At-most-once for important data
❌ Relying on Kafka exactly-once for Kafka → Database
❌ At-least-once without duplicate handling
❌ Ignoring performance trade-offs

Architectural decisions

  1. At-least-once + idempotent consumer — optimal balance
  2. Exactly-once depends on the broker's transaction-log settings (transaction.state.log.replication.factor, default 3, and transaction.state.log.min.isr, default 2). Transactions have been available since Kafka 0.11 and need no broker-side switch to enable them.
  3. Outbox pattern — universal solution for end-to-end
  4. Idempotent writes — cheaper and more reliable than transactions

Summary for Senior

  • Choose at-least-once by default
  • Strive to make consumers idempotent
  • Exactly-once works out of the box only for Kafka-to-Kafka
  • For external systems use Outbox or idempotent writes
  • Understand the performance trade-offs of each guarantee

🎯 Interview Cheat Sheet

Must know:

  • At-most-once: 1x latency, possible loss, for metrics/logs
  • At-least-once: 1.2x latency, possible duplicates, for business systems
  • Exactly-once: 2-3x latency, 50-70% throughput, for finance and billing
  • Exactly-once works out of the box only for Kafka-to-Kafka
  • For Kafka → DB: Outbox pattern or idempotent writes (UPSERT)
  • At-least-once + idempotent consumer — optimal balance for most systems
  • Commit BEFORE processing = data loss, commit AFTER = at-least-once

Common follow-up questions:

  • Why doesn’t exactly-once work with external systems? Kafka transactions cover only topic writes and offset commits; Kafka cannot manage a PostgreSQL transaction.
  • What is the Outbox pattern? INSERT into outbox_table in the same transaction as the business data → Debezium CDC → Kafka.
  • Which semantics should you choose by default? At-least-once, combined with idempotent consumers.
  • What’s the overhead of exactly-once? Roughly 2-3x latency and 50-70% of baseline throughput.

Red flags (DO NOT say):

  • “Exactly-once works for Kafka → HTTP API”: an HTTP call cannot take part in a Kafka transaction
  • “At-most-once for orders”: lost orders are unacceptable
  • “Duplicates aren’t a problem”: double charges are a business-logic bug
  • “Commit before processing is standard practice”: that causes data loss

Related topics:

  • [[9. What message delivery guarantees does Kafka provide]]
  • [[11. How to configure exactly-once semantics]]
  • [[23. What is idempotent producer]]
  • [[25. What is DLQ (Dead Letter Queue)]]