Question 5 · Section 10

How HashMap Handles Collisions?

When two keys land in the same bucket, HashMap doesn't panic — it stores both as a chain (linked list).

Language versions: English Russian Ukrainian

🟢 Junior Level

When two keys land in the same bucket, HashMap doesn’t panic — it stores both as a chain (linked list).

How it works: Each element in a bucket contains a next reference to the next one. The bucket points to the first element, the first to the second, and so on. Like train cars: to find the right one, you must go through all of them from the start.

Simple analogy: Imagine a mailbox with two letters for different people. Inside the mailbox is a list: “Letter 1 — for Alice, Letter 2 — for Bob”. When the mail carrier arrives, they check both letters.

Example:

HashMap<String, String> map = new HashMap<>();
map.put("Aa", "Hello");    // hashCode = 2112
map.put("BB", "World");    // hashCode = 2112 — collision!
// Both elements stored in the same bucket
System.out.println(map.get("Aa")); // "Hello" — finds the right one
System.out.println(map.get("BB")); // "World" — also finds it

🟡 Middle Level

Java 7 vs Java 8+

Characteristic Java 7 Java 8+
Insertion point Head of list Tail of list
Structure Linked list only List → Red-black tree
Complexity O(n) O(log n)
Cycling Possible on multi-threaded resize Fixed

Why threshold = 8? For k ≤ 7, a linked list is faster than a tree: the overhead of tree traversal (parent/left/right references, color checks) outweighs the O(log k) benefit. That’s why treeification starts at 8, not 2 or 3.

Treeification Mechanism

Transition parameters:

  • TREEIFY_THRESHOLD = 8 — on the 9th element, treeifyBin() is called
  • MIN_TREEIFY_CAPACITY = 64 — if the table is less than 64, resize is done instead of treeification
  • UNTREEIFY_THRESHOLD = 6 — reverse transition on shrink

put Algorithm on Collision

  1. Compute bucket index
  2. Iterate existing structure (list or tree)
  3. Fast check: p.hash == hash
  4. Precise check: p.key == key || p.key.equals(key)
  5. If match — value is updated
  6. If no match — new node added to the tail
  7. At list length >= 8 — treeification (Java 8+)

Common Mistakes

  1. Poor hashCode() — causes mass collisions and degradation
  2. Using HashMap in multi-threading — collisions amplify data races
  3. Mutable keys — key modification makes the element unreachable

🔴 Senior Level

Internal Implementation

On collision, HashMap traverses the chain:

for (int binCount = 0; ; ++binCount) {
    if ((e = p.next) == null) {
        p.next = newNode(hash, key, value, null);
        if (binCount >= TREEIFY_THRESHOLD - 1)
            treeifyBin(tab, hash);
        break;
    }
    if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
        break;
    p = e;
}

Optimization: hash is compared first (fast int comparison), then equals() is called only if needed.

TreeNode Structure

static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent, left, right, prev;
    boolean red;
    // + inherits hash, key, value, next
}

TreeNode takes ~64-80 bytes vs ~32-48 bytes for Node. So treeification is a trade-off between speed and memory.

Architectural Trade-offs

List vs Tree:

  • List: minimal memory, but O(k) search
  • Tree: O(log k) search, but ~2x memory and balancing overhead
  • Threshold 8 was chosen statistically: with a good hashCode, the probability of 8 collisions is negligible

Multi-threaded Resize

In Java 7, head insertion during resize reversed element order. In a multi-threaded environment, this created cyclic references (infinite loop on get()). Java 8 switched to tail insertion, preserving order. In ConcurrentHashMap, CAS + synchronized on the bucket head is used.

CAS (Compare-And-Swap) — an atomic CPU instruction: “compare a value with expected and, if it matches, replace with new”. Used for thread safety without locks.

GC Impact

Heavy collisions create GC pressure:

  • Many small Node objects
  • TreeNode is even heavier
  • Frequent put/remove — churn in Young Gen

Best Practices

  1. Quality hashCode() — best defense against collisions
  2. For object keys, implement Comparable — this speeds up treeification
  3. In production, monitor: growing get/put latency = sign of collisions

🎯 Interview Cheat Sheet

Must know:

  • Separate Chaining — used in HashMap: collisions = list/tree in bucket
  • Java 7: head insertion, Java 8+: tail insertion (prevents cycling)
  • Tree at 8 elements, reverse transition at 6 (hysteresis for stability)
  • put algorithm on collision: hash check first (fast), then equals (precise)
  • TreeNode takes ~2x more memory than Node (~64-80 bytes vs ~32-48 bytes)
  • CAS + synchronized on bucket head — how ConcurrentHashMap solves multi-threaded collisions

Common follow-up questions:

  • Why threshold = 8? — for k ≤ 7, list is faster than tree (less overhead on references and balancing)
  • Why reverse transition at 6, not 7? — hysteresis: prevents bouncing on fluctuations
  • What happens if all hashCode = 1? — Java 7: O(n²) DoS, Java 8+: tree, but with overhead
  • Why tail insertion in Java 8? — head insertion reversed the list on resize → infinite loop in multi-threading

Red flags (DO NOT say):

  • “Collisions are impossible with a good hashCode” — they’re possible, just rare
  • “Tree is always better than list” — no, for small k, list is faster
  • “HashMap is safe in multi-threading” — no, even with treeification

Related topics:

  • [[04. What is a Collision in HashMap]]
  • [[06. What Happens When 8 Elements Are Reached in One Bucket]]
  • [[07. What is the equals() and hashCode() Contract]]
  • [[22. How HashMap Works in a Multi-threaded Environment]]