Question 4 · Section 15

What is a message key and how does it affect partitioning?

🟢 Junior Level

What is a message key?

A message key is the value that determines which partition a message lands in. The key is the "address" for routing a message.

Why: a way to tell Kafka: "all events related to this user/order should be processed in one place." Without a key, events for the same user would scatter across different partitions and might arrive out of order.

Analogy

Think of an airport:

  • Key = flight destination code (e.g., "KYIV-BA101")
  • Partition = specific gate
  • All passengers with the same code go to the same gate
  • Passengers with different codes — to different gates

key="KYIV-BA101" → gate 5 (partition 1)
key="NYC-AA202"  → gate 3 (partition 0)
key="PARIS-AF303" → gate 7 (partition 2)

A better analogy: a key is a case number in court. All documents for the same case go into one folder and are processed in strict order. Documents for different cases go into different folders, and the order between them doesn't matter.

Why is a key needed?

  1. Ordering guarantee — messages with the same key go to one partition → order is preserved
  2. Grouping β€” all events for one user/object are processed together

Example

// With key β€” ordering guaranteed
ProducerRecord<String, String> record =
    new ProducerRecord<>("orders", "user-123", "Order created");
//                    topic         key          value

// user-123 always lands in the same partition
With key:
  user-123 → partition 1 → order: login → browse → purchase

Without key:
  msg-1 → partition 2
  msg-2 → partition 0  // ordering not guaranteed
  msg-3 → partition 1

When NOT to use a message key

  1. Message order doesn't matter and maximum throughput is needed — key=null
  2. Keys have strong skew (uneven distribution) — hot partition
  3. Key size is too large (>1KB) — CPU overhead on hashing

🟡 Middle Level

Ordering guarantees

Partition 0: user-456 events (strictly in order)
Partition 1: user-123 events (strictly in order)
Partition 2: user-789 events (strictly in order)

Important: Ordering is guaranteed only within a single partition! Between partitions — no guarantees.

Hashing algorithm

partition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions
// where toPositive(h) = h & 0x7fffffff

Kafka uses MurmurHash 2 for uniform distribution of keys across partitions.
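For intuition, the default partitioner can be reproduced offline to predict where a key will land. The sketch below is a standalone port of the algorithm; the constants follow `org.apache.kafka.common.utils.Utils.murmur2`, but verify against your client version before relying on exact partition numbers.

```java
import java.nio.charset.StandardCharsets;

// Standalone port of Kafka's MurmurHash2 for predicting partition assignment.
public class KeyPartitioner {

    public static int murmur2(byte[] data) {
        int length = data.length;
        int seed = 0x9747b28c;        // Kafka's fixed seed
        final int m = 0x5bd1e995;     // mixing constants
        final int r = 24;
        int h = seed ^ length;
        int length4 = length / 4;
        for (int i = 0; i < length4; i++) {
            int i4 = i * 4;
            int k = (data[i4] & 0xff)
                  + ((data[i4 + 1] & 0xff) << 8)
                  + ((data[i4 + 2] & 0xff) << 16)
                  + ((data[i4 + 3] & 0xff) << 24);
            k *= m;
            k ^= k >>> r;
            k *= m;
            h *= m;
            h ^= k;
        }
        // Handle the trailing 1-3 bytes (intentional switch fall-through)
        switch (length % 4) {
            case 3: h ^= (data[(length & ~3) + 2] & 0xff) << 16;
            case 2: h ^= (data[(length & ~3) + 1] & 0xff) << 8;
            case 1: h ^= data[length & ~3] & 0xff;
                    h *= m;
        }
        h ^= h >>> 13;
        h *= m;
        h ^= h >>> 15;
        return h;
    }

    // Kafka maps the hash to non-negative with a bitmask, not Math.abs
    public static int partitionFor(String key, int numPartitions) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        return (murmur2(bytes) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // The same key always maps to the same partition (for a fixed count)
        System.out.println("user-123 -> partition " + partitionFor("user-123", 12));
        System.out.println("user-456 -> partition " + partitionFor("user-456", 12));
    }
}
```

Running this against a live topic is a quick way to confirm that a producer's keys land where you expect.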

Choosing a good key

| Key | Distribution | When to use |
|-----|--------------|-------------|
| userId | ✅ Uniform (many users) | User events |
| orderId | ✅ Uniform | Order events |
| "all" | ❌ Hot partition | Avoid for event-driven systems; acceptable for broadcast (global config), but beware the hot-partition risk |
| null | ✅ Sticky partitioning | When ordering is not important |
| aggregateId | ✅ Uniform | Event sourcing |

Null Key

// key = null → sticky partitioning
// Ordering is NOT guaranteed
ProducerRecord<String, String> record =
    new ProducerRecord<>("orders", "Order created");

Key strategy comparison

| Strategy | Ordering | Throughput | Distribution |
|----------|----------|------------|--------------|
| Business Key (userId) | ✅ Within partition | Medium | Depends on data |
| Composite Key (userId_orderId) | ✅ | Medium | Good |
| Salted Key (userId + random) | ⚠️ Partial | Good | Excellent |
| Null Key | ❌ | Maximum | Uniform |

Common mistakes

| Mistake | Consequence | Solution |
|---------|-------------|----------|
| Null key when ordering is needed | User events are mixed up | Use userId as key |
| Same key for everything | Hot partition | Different keys for different entities |
| Large key (JSON) | CPU overhead on hashing | Compact keys (UUID, Long) |
| Changing partition count | Key distribution changes | Create a new topic |

πŸ”΄ Senior Level

Internal Implementation — MurmurHash 2

// org.apache.kafka.common.utils.Utils (abridged)
public static int murmur2(byte[] data) {
    int length = data.length;
    int seed = 0x9747b28c;  // Kafka's fixed seed
    int h = seed ^ length;
    // ... MurmurHash2 mixing rounds over 4-byte chunks ...
    return h;
}

// Naive partition calculation (broken for Integer.MIN_VALUE — see below)
int hash = Utils.murmur2(keyBytes);
int partition = Math.abs(hash) % numPartitions;

Why MurmurHash 2:

  • Non-cryptographic, fast (bitwise operations)
  • Avalanche effect — changing 1 bit → ~50% of result bits change
  • Uniform distribution — minimum collisions for real-world data

Integer.MIN_VALUE edge case:

// Math.abs(Integer.MIN_VALUE) == Integer.MIN_VALUE (negative!)
// Kafka uses a bitmask:
int partition = (hash & 0x7fffffff) % numPartitions;
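This edge case is easy to verify in plain Java:

```java
// Demonstrates why Math.abs is unsafe for partition selection:
// Math.abs(Integer.MIN_VALUE) overflows and stays negative,
// while the bitmask simply clears the sign bit.
public class AbsEdgeCase {
    public static void main(String[] args) {
        int hash = Integer.MIN_VALUE;                 // a possible murmur2 output
        System.out.println(Math.abs(hash));           // -2147483648 (still negative!)
        System.out.println(Math.abs(hash) % 6);       // -2 -> invalid partition index
        System.out.println((hash & 0x7fffffff) % 6);  // 0 -> always in [0, numPartitions)
    }
}
```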

Key Ordering Guarantees — formal model

Within a partition: strict FIFO (total order)
Between partitions: partial order (causal ordering only with explicit design)

Formally:
  ∀ messages m1, m2:
    key(m1) == key(m2) → partition(m1) == partition(m2) → order(m1, m2) preserved
    key(m1) != key(m2) → order(m1, m2) NOT guaranteed

Advanced Key Strategies

1. Composite Key for fighting hot partitions:

// Composite key: entity + sub-entity
String key = userId + "_" + orderId;
// Provides more even distribution
// But: ordering only within userId_orderId, not within userId

2. Key Salting (randomized partitioning within entity):

// N-ary salting: distribute events of one user across N partitions
int salt = ThreadLocalRandom.current().nextInt(10);
String key = userId + "_" + salt;
// Ordering is lost, but throughput grows N times
// Use when ordering is NOT critical

3. Time-based Key Rotation:

// Key includes a time bucket
String key = userId + "_" + (System.currentTimeMillis() / 3600000);
// Key changes every hour → messages land in a new partition
// Useful for time-windowed aggregations

4. Consistent Hashing for dynamic partitions:

Ring-based approach: minimizes key movement when N changes
Libraries: Ketama (libketama), Guava's Hashing.consistentHash
Application: multi-tenant systems, sharded databases
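To make the ring idea concrete, here is a minimal sketch with virtual nodes; all class and method names are invented for this example (production systems typically reach for a library such as Guava's Hashing.consistentHash rather than hand-rolling this):

```java
import java.util.Map;
import java.util.TreeMap;

// Minimal consistent-hash ring (illustrative; not a Kafka API).
public class ConsistentHashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int vnodes;

    public ConsistentHashRing(int vnodes) { this.vnodes = vnodes; }

    // Any well-mixed hash works; here String.hashCode is scrambled for spread
    private static int hash(String s) {
        int h = s.hashCode();
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        return h;
    }

    // Each physical node owns `vnodes` points on the ring for smoother balance
    public void addNode(String node) {
        for (int i = 0; i < vnodes; i++) ring.put(hash(node + "#" + i), node);
    }

    public void removeNode(String node) {
        for (int i = 0; i < vnodes; i++) ring.remove(hash(node + "#" + i));
    }

    // A key maps to the first ring point clockwise from its hash
    public String nodeFor(String key) {
        Map.Entry<Integer, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing(100);
        ring.addNode("node-0");
        ring.addNode("node-1");
        ring.addNode("node-2");
        System.out.println("user-42 -> " + ring.nodeFor("user-42"));
        // Adding a node remaps only keys falling on its ring segments
        ring.addNode("node-3");
        System.out.println("user-42 -> " + ring.nodeFor("user-42"));
    }
}
```

The key property: adding or removing one node remaps only the keys on that node's ring segments, instead of reshuffling ~(1 - 1/N) of all keys as plain modulo does.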

Key Evolution Problem — formal analysis

Before: partition = hash(key) % 3
  key="user-1" → hash("user-1") % 3 = 1 → Partition 1

After: partition = hash(key) % 6
  key="user-1" → hash("user-1") % 6 = 4 → Partition 4

Same key → different partition → ordering broken!

Mathematically (exact when new_N is a multiple of old_N; otherwise even more keys move):
  P(new_partition != old_partition) = 1 - (old_N / new_N)
  For 3 → 6: P = 1 - 3/6 = 50% of keys move
  For 10 → 100: P = 1 - 10/100 = 90% of keys move
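The estimate is easy to confirm empirically with a small Monte-Carlo simulation (standalone sketch; class and method names are made up for this example):

```java
import java.util.concurrent.ThreadLocalRandom;

// Monte-Carlo check of how many keys change partition when the
// partition count grows from oldN to newN under hash % N assignment.
public class KeyMovementSim {

    public static double movedFraction(int oldN, int newN, int samples) {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        int moved = 0;
        for (int i = 0; i < samples; i++) {
            int hash = rnd.nextInt() & 0x7fffffff;  // simulated non-negative key hash
            if (hash % oldN != hash % newN) moved++;
        }
        return (double) moved / samples;
    }

    public static void main(String[] args) {
        System.out.printf("3 -> 6:    ~%.2f of keys move%n", movedFraction(3, 6, 1_000_000));
        System.out.printf("10 -> 100: ~%.2f of keys move%n", movedFraction(10, 100, 1_000_000));
    }
}
```

With a million samples the results converge to ~0.50 and ~0.90, matching the formula above.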

Solutions:

  1. Plan partition count with margin (10x current need)
  2. Use consistent hashing (custom partitioner)
  3. Create a new topic with the desired partition count and migrate via dual-write

Key Size Impact on Performance

| Key size | Memory overhead | CPU (hash) | Network | Recommendation |
|----------|-----------------|------------|---------|----------------|
| Long (8 bytes) | Minimal | ~5 ns | Minimal | ✅ Optimal |
| UUID (36 bytes) | Medium | ~20 ns | Medium | ✅ Acceptable |
| String (100 bytes) | Noticeable | ~50 ns | Noticeable | ⚠️ Acceptable |
| JSON (10 KB) | Significant | ~5 µs | Significant | ❌ Avoid |

Memory footprint:

Key is stored in:
- Producer RecordBatch → MemoryRecords (heap)
- Broker Log → MessageSet (page cache + heap)
- Consumer ConsumerRecord (heap)

Total: 3 × key_size × throughput (× retention for the on-disk footprint)
For 10 KB keys at 10K msg/s: 3 × 10 KB × 10,000 ≈ 300 MB/s of memory traffic from keys alone
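A quick back-of-envelope check of that arithmetic (note it comes out to roughly 300 MB/s, not gigabytes):

```java
// Back-of-envelope: memory traffic generated by keys alone across the
// three copies (producer batch, broker log, consumer record).
public class KeyTraffic {
    public static void main(String[] args) {
        long copies = 3;
        long keyBytes = 10 * 1024;   // 10 KB key
        long msgPerSec = 10_000;
        long bytesPerSec = copies * keyBytes * msgPerSec;
        System.out.printf("%.1f MB/s%n", bytesPerSec / 1e6);  // 307.2 MB/s
    }
}
```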

Edge Cases (3+)

  1. Key Null + Idempotent Producer: With enable.idempotence=true and key=null, Kafka generates sequence numbers at the partition level. Since sticky partitioning switches partitions, sequence numbers don't conflict. But if partitioner.class is custom and non-deterministic — OutOfOrderSequenceException may occur.

  2. Key Serialization Schema Mismatch: Producer serializes the key as String (StringSerializer), consumer deserializes it as ByteArray (ByteArrayDeserializer). The key "works" for partitioning, but consumer business logic receives byte[] instead of the expected String. Test serialization on both ends.

  3. Unicode Keys and MurmurHash: MurmurHash operates on bytes. The key "café" in UTF-8 = [99, 97, 102, 195, 169], in Latin-1 = [99, 97, 102, 233]. Different encodings → different bytes → different partitions. Ensure all producers use the same encoding (UTF-8 by default in Kafka).

  4. Key Mutation in Processor: In Kafka Streams or ksqlDB, if a processor changes the message key (map((key, value) -> new KeyValue<>(newKey, value))), the message is redistributed to a new partition. This causes a shuffle — an expensive operation. Use repartition() (the replacement for the deprecated through()) for explicit repartitioning control.

  5. Empty Key (not null, but empty string): key="" → MurmurHash("") = constant → all messages with an empty key go to one partition. This behaves like a hot partition but is harder to detect (the key "exists" but is useless).

Performance Numbers

| Metric | Value | Conditions |
|--------|-------|------------|
| MurmurHash latency | ~5-50 ns | Depends on key size |
| Partition compute overhead | < 100 ns | Including serialization |
| Key serialization (String, 36 bytes) | ~100 ns | UTF-8 encoding |
| Hot-key impact on p99 | 3-10x latency increase | 1 key = 50% of traffic |

Production War Story

Situation: Ride-sharing company used driverId as key for topic rides. With 10,000 active drivers, distribution was even (20 partitions). But during peak hours, the top 50 drivers (most active) generated 40% of all events.

Problem: P99 latency for orders grew from 15ms to 500ms. Lag on "hot" partitions reached 30K messages. Clients complained about "stuck" orders.

Analysis:

Top-10 keys by volume:
  driver-42: 15K msg/min → Partition 7
  driver-17: 12K msg/min → Partition 3
  driver-88: 11K msg/min → Partition 14
  Average: 50 msg/min per driver

Solution:

  1. Introduced composite key: driverId + "_" + (timestamp / 60000) — key changes every minute
  2. Increased partitions from 20 to 60
  3. For aggregations (count rides per driver) used Kafka Streams with .groupByKey() — automatic repartition
  4. Set up an alert on per-partition throughput imbalance > 3x

Lesson: Uniform key distribution ≠ uniform load. Analyze the frequency distribution of keys, not just cardinality.

Monitoring (JMX + Burrow)

JMX metrics:

kafka.producer:type=producer-metrics,client-id=producer-1  (attribute: record-send-rate)
kafka.producer:type=producer-metrics,client-id=producer-1  (attribute: batch-size-avg)
kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec

Custom monitoring:

// Interceptor for key distribution analysis
public class KeyDistributionInterceptor implements ProducerInterceptor<String, String> {
    private final Map<String, Long> keyCounts = new ConcurrentHashMap<>();

    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> record) {
        keyCounts.merge(String.valueOf(record.key()), 1L, Long::sum);
        return record;  // pass the record through unchanged
    }

    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) { }

    @Override
    public void close() { /* export keyCounts to logs/metrics here */ }

    @Override
    public void configure(Map<String, ?> configs) { }
}
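To activate the interceptor, register it through the standard interceptor.classes producer setting (the com.example package below is a placeholder; use the interceptor's actual fully-qualified name):

```properties
# producer.properties
bootstrap.servers=localhost:9092
interceptor.classes=com.example.KeyDistributionInterceptor
```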

Highload Best Practices

  1. Key = compact business entity ID (Long > UUID > String)
  2. Analyze frequency distribution of keys before production
  3. Composite/salted keys for fighting hot partitions (trade-off: ordering)
  4. Avoid changing partition count — create a new topic
  5. Test on production-like data — real keys have skew
  6. Monitor per-partition throughput — alert on imbalance > 3x
  7. Empty string != null — empty key means hot partition
  8. Unicode consistency — all producers must use UTF-8

🎯 Interview Cheat Sheet

Must know:

  • Key determines partition: MurmurHash2(key) % numPartitions
  • Messages with the same key always land in one partition → ordering guaranteed
  • Without key (null) — Sticky Partitioning, no ordering
  • Composite key (userId + "_" + orderId) — for even distribution
  • Large key (>1KB) — CPU overhead on hashing every message
  • When changing partition count, same key → different partition
  • Empty string ≠ null — empty key = hot partition (all to one partition)
  • Key = compact business entity ID (Long > UUID > String)

Common follow-up questions:

  • What happens if key=null? — Sticky Partitioning, ordering not guaranteed.
  • Why not use JSON as a key? — 10 KB keys at 10K msg/s generate ~300 MB/s of memory traffic across producer, broker, and consumer.
  • How to ensure ordering with a hot key? — Composite/salted key, trade-off: partial loss of ordering.
  • What is the Key Evolution Problem? — When changing N in hash(key) % N, 50-90% of keys move.

Red flags (DO NOT say):

  • "Key guarantees global ordering" — only within one partition
  • "Empty string is the same as null" — empty = hot partition, null = sticky
  • "You can change partition count without consequences" — keys move
  • "Key is stored only in the producer" — stored in RecordBatch, Log, ConsumerRecord (3×)

Related topics:

  • [[3. How is data distributed across partitions]]
  • [[2. What is partition and why is it needed]]
  • [[21. What is batch in Kafka producer]]
  • [[1. What is topic in Kafka]]