What is a Bucket in HashMap?
When HashMap wants to add an element, it computes the bucket number using the formula:
🟢 Junior Level
Bucket is an element of the table[] array inside HashMap.
Analogy: Imagine a mail cabinet: the array is the whole cabinet, a bucket is a specific slot. But unlike a real slot, you can put as many elements in a bucket as needed — they line up in a chain (linked list).
When HashMap wants to add an element, it computes the bucket number using the formula:
index = (table_size - 1) & hash(key)
Example:
HashMap<String, Integer> map = new HashMap<>();
map.put("key1", 100); // Went to bucket X (depends on hashCode("key1"))
map.put("key2", 200); // Went to bucket Y (depends on hashCode("key2"))
Simple analogy: A ZIP code determines which post office (bucket) to send mail to. Inside the office, the specific recipient is looked up.
🟡 Middle Level
Three Bucket States
- Empty — cell contains
null. Takes only pointer memory (4-8 bytes). - Linked List — on collisions, nodes are linked via the
nextfield. Search: O(k), where k is the number of elements in the bucket. - Tree — at 8+ elements and capacity >= 64, the list converts to a red-black tree (
TreeNode, Java 8+). Search: O(log k).
Why Buckets Are Always 2^n?
This is a fundamental architectural decision:
- The formula
index = (n-1) & hashonly distributes correctly when n is a power of two - On resize (doubling), elements either stay in place or move to
index + oldCap— without full hash recalculation
Parameters Affecting Buckets
| Parameter | Default Value | Effect |
|---|---|---|
| Initial Capacity | 16 | Starting number of buckets |
| Load Factor | 0.75 | At what fill point buckets are doubled |
Common mistake: Not accounting for new HashMap(10) creating 16 buckets (nearest power of two upward).
When NOT to Use a Large initialCapacity
Don’t set excessive initialCapacity: a HashMap with 1,000,000 buckets and 10 elements wastes ~8 MB of memory (array of null references).
🔴 Senior Level
Internal Implementation
A bucket is an element of the Node<K,V>[] table array. Each Node contains:
class Node<K,V> {
final int hash;
final K key;
V value;
Node<K,V> next; // Next element in the collision chain
}
On treeification, the bucket stores TreeNode:
class TreeNode<K,V> {
TreeNode<K,V> parent, left, right, prev;
boolean red;
// + inherits hash, key, value, next from Node
}
TreeNode takes ~2x more memory due to additional references and color flag.
Poisson Distribution
Poisson distribution — a statistical model describing the probability of rare events. The probability of 8+ elements landing in one bucket with a good hashCode ≈ 0.00000006 — practically zero.
At loadFactor 0.75, the probability of element count per bucket follows Poisson distribution:
- 0 elements: ~22%
- 1 element: ~33%
- 2 elements: ~25%
- 8 elements: ~0.00000006 (nearly impossible with a good hashCode)
The threshold of 8 for treeification was chosen based on this exact statistics.
Hash Flooding Attack
If an attacker crafts keys with the same hash, all elements land in one bucket. Before Java 8, this led to O(n) and DoS attacks. Treeification (O(log n)) mitigates this threat.
Cache Locality
Buckets are scattered across different Heap addresses. Long chains lead to frequent Cache Misses, since each Node is a separate object. This affects real-world performance more than asymptotic complexity.
Memory Implications
- Empty bucket: 4-8 bytes (depends on UseCompressedOops)
- Node in list: ~32-48 bytes
- TreeNode: ~64-80 bytes
- At capacity=16: ~128 bytes for the array itself (16 references)
Production Monitoring
For diagnosing bucket issues, reflection or instrumentation can be used to analyze element distribution across buckets. In production, an abnormally high number of elements in one bucket indicates a poor key hashCode().
🎯 Interview Cheat Sheet
Must know:
- Bucket = element of table[] array, stores a list or tree of elements
- Three states: empty, linked list, tree (TreeNode at 8+ elements and capacity >= 64)
- Capacity is always a power of two — for the formula
index = (n-1) & hash - On resize, elements either stay in place or shift by
index + oldCap - Don’t set excessive initialCapacity — empty buckets waste memory
- Poisson distribution: probability of 8+ elements in a bucket ≈ 0.00000006
Common follow-up questions:
- What happens with new HashMap(10)? — capacity becomes 16 (nearest power of two)
- Why treeification at 8, not 4? — for k ≤ 7, list is faster than tree (less overhead)
- How much memory does an empty bucket take? — 4-8 bytes (reference to null)
- What is treeification? — conversion of a list to a red-black tree at 8+ collisions
Red flags (DO NOT say):
- “A bucket stores only one element” — no, on collisions it stores a list/tree
- “HashMap creates the array in the constructor” — no, lazy initialization (Java 8+)
- “At 8 elements it’s always a tree” — only if capacity >= 64
Related topics:
- [[01. How HashMap Works Internally]]
- [[03. How HashMap Determines Which Bucket to Put an Element In]]
- [[06. What Happens When 8 Elements Are Reached in One Bucket]]
- [[17. What is Capacity in HashMap]]