What is a topic in Kafka
🟢 Junior Level
What is a topic?
A topic is a named channel (a logical category) into which producers write messages and from which consumers read them. It is the primary abstraction for organizing data in Kafka.
How a topic differs from a queue: in a queue, reading a message deletes it. In a topic, a message is stored for a configured time (retention) and can be read by any number of independent consumer groups. This is the key difference from traditional message brokers such as RabbitMQ.
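This difference can be illustrated with a toy model in plain Java (no Kafka dependency, all names here are illustrative): the log retains every record, and each consumer group only advances its own read position.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a topic: an append-only list of records plus an
// independent read position (offset) per consumer group.
public class TopicVsQueue {
    static final List<String> log = new ArrayList<>();          // retained records
    static final Map<String, Integer> groupOffsets = new HashMap<>();

    static String poll(String group) {
        int offset = groupOffsets.getOrDefault(group, 0);
        if (offset >= log.size()) return null;                  // nothing new for this group
        groupOffsets.put(group, offset + 1);                    // advance only this group
        return log.get(offset);                                 // the record is NOT deleted
    }

    public static void main(String[] args) {
        log.add("order-1");
        log.add("order-2");

        // Two independent groups each read the full history at their own pace.
        System.out.println(poll("billing"));    // order-1
        System.out.println(poll("billing"));    // order-2
        System.out.println(poll("analytics"));  // order-1 -- still available to group 2
    }
}
```

In a queue model, the `analytics` call would find the log empty; here both groups see every record.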
Analogy
Think of TV channels:
- Topic “orders” — a channel about orders
- Topic “users” — a channel about users
- Topic “notifications” — a channel about notifications
Each producer “broadcasts” into its own channel, each consumer “tunes the receiver” to the desired channel.
Important difference from TV channels: in Kafka, the same message can be read by multiple independent groups, each at its own speed and from its own position.
Simple example
Producer → Topic "orders" → Consumer
Producer → Topic "users" → Consumer
// Producer sends to topic
producer.send(new ProducerRecord<>("orders", "order-data"));
// Consumer reads from topic
consumer.subscribe(List.of("orders"));
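The snippet above assumes already-constructed clients. A minimal configuration sketch (the property names are standard Kafka client settings; the values are illustrative):

```properties
# Producer
bootstrap.servers=localhost:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer

# Consumer
bootstrap.servers=localhost:9092
group.id=order-service
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=org.apache.kafka.common.serialization.StringDeserializer
auto.offset.reset=earliest
```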
Key properties
- A topic is created with a specified number of partitions (sub-channels for parallelism) and replication factor (how many copies to store on different servers for fault tolerance).
- Messages are stored for a defined time (retention period).
- A topic can be created, modified, and deleted.
kafka-topics.sh --create \
--topic orders \
--partitions 3 \
--replication-factor 3 \
--bootstrap-server localhost:9092
When NOT to use Kafka topics
- Request-response patterns — HTTP/gRPC is better
- Guaranteed delivery to a single recipient — RabbitMQ is better
- Data needs frequent updates/modifications — a database is better
🟡 Middle Level
Logical and physical structure
Logically: a topic is a category for messages.
Physically: a topic consists of partitions distributed across brokers. Each partition is a set of files on disk.
Topic "orders"
├── Partition 0 → Broker A
├── Partition 1 → Broker B
└── Partition 2 → Broker C
Cleanup Policies
Two main cleanup policies:
| Policy | Description | Use Case |
|---|---|---|
| delete (default) | Messages are deleted after retention expires | Events, logs, orders |
| compact | Keeps only the latest value for each key | User profiles, current prices |
Partition file structure
/partition-0/
├── 00000000000000000000.log # data
├── 00000000000000000000.index # offset index
└── 00000000000000000000.timeindex # time index
Ordering guarantees
- Within a partition: strict ordering (FIFO)
- Within a topic: no ordering guarantees (if partitions > 1)
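The practical consequence: messages that must stay in order should share a key, since keyed messages go to the same partition. A toy sketch of key-based routing (Kafka's default partitioner actually uses murmur2 on the key bytes; `hashCode()` stands in for it here):

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of key-based partitioning: same key -> same partition,
// so per-key order is preserved even with many partitions.
public class KeyOrdering {
    static final int NUM_PARTITIONS = 3;
    static final List<List<String>> partitions = new ArrayList<>();

    static int partitionFor(String key) {
        // floorMod avoids a negative index for negative hash codes
        return Math.floorMod(key.hashCode(), NUM_PARTITIONS);
    }

    static void send(String key, String value) {
        partitions.get(partitionFor(key)).add(value);
    }

    public static void main(String[] args) {
        for (int i = 0; i < NUM_PARTITIONS; i++) partitions.add(new ArrayList<>());

        // All events of order 42 share a key and land in one partition, in order;
        // order 7 may land in a different partition.
        send("order-42", "42:created");
        send("order-7",  "7:created");
        send("order-42", "42:paid");
        send("order-42", "42:shipped");

        System.out.println(partitions.get(partitionFor("order-42")));
    }
}
```

Whatever partition `order-42` maps to, its three events appear there in send order.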
Topic configurations
| Parameter | Description | Example |
|---|---|---|
| `cleanup.policy` | `delete` or `compact` | `delete` |
| `retention.ms` | storage duration | 604800000 (7 days) |
| `min.insync.replicas` | minimum in-sync replicas for a write | 2 |
| `compression.type` | compression codec | `lz4`, `snappy`, `zstd` |
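These parameters can be set per topic after creation. A hypothetical example with the standard `kafka-configs.sh` tool (topic name and values are illustrative):

```bash
kafka-configs.sh --alter \
  --entity-type topics --entity-name orders \
  --add-config retention.ms=604800000,compression.type=lz4 \
  --bootstrap-server localhost:9092
```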
Common mistakes
| Mistake | Consequence | Solution |
|---|---|---|
| Too few partitions | Limits consumer parallelism | Plan ahead |
| No retention configured | Disk will fill up | Set retention.ms |
| One topic for all data | Hard to scale | Split by domain |
🔴 Senior Level
Topic as a distributed append-only log
A topic is an immutable distributed append-only log, split into partitions for scalability and fault tolerance. At the broker level, each topic is managed via LogManager, which stores a collection of Log objects (one per partition).
Log Compaction — deep dive
How Log Cleaner works:
- A background `LogCleaner` thread scans segments (threshold: `min.cleanable.dirty.ratio=0.5`)
- Builds a hash table `key → last offset` for all records
- Rewrites segments, keeping only the latest value for each key
- Tombstone records (`value = null`) mark keys for complete deletion

Performance:
- `log.cleaner.threads=1` (default) — sufficient for most cases
- `log.cleaner.io.buffer.size=512KB` — read/write buffer
- `log.cleaner.backoff.ms=15000` — pause between cycles
Use Cases:
- Storing user profile (Changelog pattern)
- Current aggregate state (Event Sourcing)
- State recovery in Kafka Streams (changelog topics)
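The compaction semantics above can be sketched in plain Java (a toy model, not the actual LogCleaner): the latest value per key wins, and a null value acts as a tombstone.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy sketch of log compaction semantics: keep only the latest value
// per key; a null value (tombstone) removes the key entirely.
public class CompactionSketch {
    public static void main(String[] args) {
        // (key, value) records in append order
        String[][] records = {
            {"user-1", "alice"},
            {"user-2", "bob"},
            {"user-1", "alice-v2"},  // supersedes the first user-1 record
            {"user-2", null},        // tombstone: delete user-2
        };

        Map<String, String> compacted = new LinkedHashMap<>();
        for (String[] r : records) {
            if (r[1] == null) compacted.remove(r[0]);  // tombstone
            else compacted.put(r[0], r[1]);            // latest value wins
        }

        System.out.println(compacted);  // {user-1=alice-v2}
    }
}
```

This is exactly why compacted topics suit the changelog pattern: replaying the compacted log rebuilds the current state per key.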
Segment architecture
Active segment → writes (sealed when segment.bytes or segment.ms reached)
Closed segments → read-only, candidates for deletion/compaction
segment.bytes=1GB
segment.ms=7 days
segment.index.bytes=10MB
Benefits of segmentation:
- O(1) deletion — entire segments are deleted
- Efficient OS Page Cache utilization
- Parallel cleanup (different threads clean different segments)
- Index-based navigation — binary search on `.index` files
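The index lookup can be sketched as follows (a toy model: Kafka's `.index` files are sparse, mapping only every Nth offset to a byte position; a lookup finds the nearest indexed offset at or below the target, then scans the `.log` file forward from there). The offsets and byte positions below are made up.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy sketch of a sparse offset index: offset -> byte position in the .log
// file. TreeMap.floorEntry performs the "nearest entry at or below" search
// that Kafka does with binary search over the index file.
public class SparseIndexSketch {
    public static void main(String[] args) {
        TreeMap<Long, Long> index = new TreeMap<>();
        index.put(0L, 0L);
        index.put(100L, 41_234L);
        index.put(200L, 83_991L);

        long targetOffset = 150L;
        Map.Entry<Long, Long> entry = index.floorEntry(targetOffset);
        System.out.println("start scan at offset " + entry.getKey()
                + ", byte position " + entry.getValue());
    }
}
```

The sparse design keeps the index small enough to memory-map while still bounding the forward scan to one index interval.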
Senior-level configurations
| Parameter | Impact | Recommendation |
|---|---|---|
| `min.insync.replicas` | Critical for durability (paired with `acks=all`) | 2 with RF=3 |
| `unclean.leader.election.enable` | Availability (`true`) vs durability (`false`) | `false` for critical data |
| `message.format.version` | Mismatch → conversion overhead | Single version across cluster |
| `max.message.bytes` | Message size limit | 1048588 (1MB default) |
Edge Cases
- Topic with 0 partitions — impossible, Kafka requires at least 1 partition at creation
- Reducing partitions — impossible without deleting the topic (`--alter --partitions` only increases)
- Topic with RF > number of brokers — creation error (`Replication factor: 3 larger than available brokers: 2`)
- Topic name with a dot (`.`) or underscore (`_`) — in older versions this caused conflicts with internal topics (`__consumer_offsets`)
- Concurrent writes during ISR rebalance — possible message loss if `acks=1` and the leader fails before replication
Performance Numbers
| Metric | Value | Conditions |
|---|---|---|
| Throughput per topic | ~500 MB/s | 12 partitions, lz4, batch.size=256KB |
| Latency (p99) | 5-15 ms | acks=all, RF=3, local data center |
| Max partitions/broker | ~200,000 | Depends on OS file descriptors |
| Optimal partitions/broker | 2,000-4,000 | For stable controller operation |
Production War Story
Situation: A fintech company used a single topic `events` for all event types (orders, payments, logins). The topic had 6 partitions. When load grew to 50K msg/s, consumers couldn't keep up — lag grew exponentially.

Problem: When trying to increase partitions to 60, order keys no longer landed in the same partitions — payment processing order was broken. Clients saw duplicate charges.

Solution: Split into 4 domain-specific topics (`orders`, `payments`, `auth`, `audit`), each with 20 partitions. Migrated via dual-write with a 48-hour "shadow" period. Added `min.insync.replicas=2` and `acks=all` for the payments topic.

Lesson: Split topics by business domain from the very beginning. Plan partitions with a 10x margin.
Monitoring (JMX metrics)
kafka.server:name=BytesInPerSec,type=BrokerTopicMetrics
kafka.server:name=BytesOutPerSec,type=BrokerTopicMetrics
kafka.server:name=MessagesInPerSec,type=BrokerTopicMetrics
kafka.cluster:type=Partition,name=UnderReplicated
kafka.log:type=Log,name=LogEndOffset
kafka.log:type=Log,name=LogStartOffset
Burrow (LinkedIn):
- Evaluates consumer lag and status (OK, WARN, ERR)
- HTTP API for Alertmanager/PagerDuty integration
- Requires no consumer modifications
Highload Best Practices
- Plan partitions: `partitions = max(producer_throughput, consumer_throughput) * 1.5`
- Use `min.insync.replicas=2` with RF=3 for critical data
- Split topics by business domain — simplifies scaling and monitoring
- Compact for state, delete for events — the right policy for the task
- Compression: `lz4` for throughput, `zstd` for storage savings
- Batch tuning: `batch.size=256KB`, `linger.ms=10` for high-throughput producers
- Monitor under-replicated partitions — the first indicator of cluster problems
- Use KRaft (Kafka Raft, production-ready since Kafka 3.3+) instead of ZooKeeper for new clusters. In Kafka 2.x KRaft is experimental.
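The sizing rule of thumb above can be sketched as a small calculator (an illustration only; the inputs are assumed to be throughput expressed in per-partition capacity units, e.g. required MB/s divided by what one partition sustains):

```java
// Toy sketch of the partition-planning rule of thumb:
// take the larger of producer-side and consumer-side demand
// (in per-partition units) and add a 1.5x headroom margin.
public class PartitionPlanner {
    static int plan(double producerUnits, double consumerUnits) {
        return (int) Math.ceil(Math.max(producerUnits, consumerUnits) * 1.5);
    }

    public static void main(String[] args) {
        // e.g. producing needs 8 partition-equivalents, consuming needs 12
        System.out.println(plan(8, 12));  // 18
    }
}
```

Since partitions can only be added, not removed, erring on the high side here is much cheaper than recreating the topic later.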
🎯 Interview Cheat Sheet
Must know:
- Topic — logical category for messages, physically consists of partitions
- Unlike a queue, messages are not deleted after reading — stored for a defined time (retention)
- Cleanup policies: `delete` (by time) and `compact` (latest value per key)
- Ordering is guaranteed only within a partition, not within a topic
- Topic is an immutable append-only log, split into partitions
- `min.insync.replicas=2` with RF=3 is critical for durability
- Reducing partitions is impossible — only topic recreation
- Split topics by business domain from the very beginning
- Split topics by business domain from the very beginning
Common follow-up questions:
- Can you reduce the number of partitions? — No, only increase. Reduction requires recreating the topic.
- What is log compaction? — Keeps only the latest value for each key, deletes old ones.
- What happens if a topic has 0 partitions? — Impossible, Kafka requires at least 1 partition.
- How to ensure global ordering? — Only 1 partition, but that kills parallelism.
Red flags (DO NOT say):
- “Topic is a queue, messages are deleted after reading” — it’s not a queue
- “Ordering is guaranteed across the whole topic” — only within a partition
- “You can reduce partitions via alter” — you can’t
- “One topic for all events is fine” — anti-pattern
Related topics:
- [[2. What is partition and why is it needed]]
- [[3. How is data distributed across partitions]]
- [[27. What is retention policy]]
- [[28. How are old messages deleted from a topic]]
- [[16. What is replication in Kafka]]