Question 16 · Section 10

What is Load Factor in HashMap?


🟢 Junior Level

Load Factor is a number that determines when HashMap should increase its size.

By default, load factor = 0.75 (75%).

How it works:

HashMap<String, Integer> map = new HashMap<>();
// capacity = 16, load factor = 0.75
// threshold = 16 * 0.75 = 12
// On adding the 13th element, HashMap grows to 32 buckets

Simple analogy: Imagine a glass filled with water to 75%. If you pour more — it overflows, and you need a bigger glass. Load factor is the “75%” mark on the glass.

Why 0.75? It’s a balance: not too many empty slots (memory) and not too many collisions (speed).

🟡 Middle Level

Resize Threshold Formula

threshold = capacity * loadFactor

When size > threshold, resize (array doubling) occurs.

Impact on Performance and Memory

  • 0.5 (low): more memory (~50% empty buckets); fastest lookups (buckets hold 0-1 elements, search is almost always O(1) without list/tree traversal); minimum collisions
  • 0.75 (default): balanced memory and speed; few collisions
  • 0.9 (high): less memory; slower lookups; more collisions
  • 1.0+: minimum memory; slowest lookups; many collisions

When to Change loadFactor?

Low (0.5): When access speed is critical (low latency systems)

new HashMap<>(1000, 0.5f); // Fast access, but 2x memory

High (>0.75): When memory saving is critical (Embedded, IoT)

new HashMap<>(1000, 0.9f); // Saves memory, but more collisions

Interaction with Initial Capacity

// Correct calculation for 1000 elements:
int capacity = (int)(1000 / 0.75f) + 1; // = 1334
Map<K, V> map = new HashMap<>(capacity, 0.75f);
// HashMap rounds up to 2048, threshold = 1536
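The power-of-two rounding above can be sketched in plain Java. The `tableSizeFor` helper below mirrors the private method of the same name in the JDK's HashMap source (reproduced here as a sketch of the internals, not a public API):

```java
public class ThresholdDemo {
    // Mirrors java.util.HashMap's private tableSizeFor helper:
    // returns the smallest power of two >= cap.
    static int tableSizeFor(int cap) {
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : n + 1;
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f;
        int requested = (int) (1000 / loadFactor) + 1; // 1334
        int capacity = tableSizeFor(requested);        // rounded up to 2048
        int threshold = (int) (capacity * loadFactor); // 1536
        System.out.println(capacity + " buckets, resize after " + threshold + " elements");
    }
}
```

Requesting capacity 1334 lands you at 2048 buckets with a resize threshold of 1536 — comfortably above the 1000 elements you planned for.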

Common Mistakes

  1. Load factor > 1.0 — allowed, but the average chain length grows with the load factor, so lookups slow down (expected cost is O(1 + loadFactor))
  2. Too low — wasted memory and frequent resizes at low fill
  3. Not accounting for rounding — HashMap always rounds capacity up to a power of two

🔴 Senior Level

Poisson Distribution

The choice of 0.75 is statistically justified. The comment in the JDK's HashMap source notes that, under random hash codes, bucket occupancy at the default resizing threshold follows a Poisson distribution with parameter λ ≈ 0.5:

P(x) = (0.5^x * e^-0.5) / x!

x=0: 60.7%
x=1: 30.3%
x=2: 7.6%
x=8: ≈ 0.00000006 — practically impossible

A chain of 8 elements is therefore a statistical anomaly. Treeification exists as protection against degenerate or adversarial hash codes, not as a normal operating mode.
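The probabilities above can be reproduced in a few lines. λ = 0.5 is the average bucket fill the JDK comment cites for the default resizing threshold:

```java
public class PoissonDemo {
    // Poisson probability P(k) = lambda^k * e^-lambda / k!
    static double poisson(double lambda, int k) {
        double fact = 1.0;
        for (int i = 2; i <= k; i++) fact *= i;
        return Math.pow(lambda, k) * Math.exp(-lambda) / fact;
    }

    public static void main(String[] args) {
        double lambda = 0.5; // average bucket fill at the default threshold, per the JDK comment
        for (int k = 0; k <= 8; k++) {
            System.out.printf("k=%d: %.8f%n", k, poisson(lambda, k));
        }
    }
}
```

Running it prints the k=8 probability as 0.00000006 — the same figure quoted in the HashMap source to justify the treeification threshold of 8.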

Cache Locality Impact

High load factor affects real-world performance more than asymptotics suggest:

  • Long chains → nodes scattered across the heap → cache misses
  • The CPU waits for data from RAM (~100+ ns) instead of L1/L2 cache (~1-5 ns)
  • O(log n) with poor cache locality (tree nodes scattered across the heap) can in practice be slower than O(1) with a well-chosen load factor

Memory Footprint

At capacity = 1024 (assuming at most one element per bucket and 8-byte references):

  • Load factor 0.5: 512 elements, ~512 empty buckets → ~4 KB “waste”
  • Load factor 0.75: 768 elements, ~256 empty → ~2 KB “waste”
  • Load factor 0.9: ~922 elements, ~102 empty → ~1 KB “waste”
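The arithmetic behind these numbers is a back-of-the-envelope sketch: one 8-byte reference per empty table slot (a 64-bit JVM without compressed oops — an assumption chosen for round numbers) and at most one element per bucket:

```java
public class FootprintDemo {
    // Rough "waste" from empty table slots: one 8-byte reference per slot
    // (64-bit JVM without compressed oops -- an assumption for round numbers).
    static long wastedBytes(int capacity, float loadFactor) {
        int elements = (int) (capacity * loadFactor);
        int empty = capacity - elements; // simplification: no collisions
        return (long) empty * 8;
    }

    public static void main(String[] args) {
        for (float lf : new float[] {0.5f, 0.75f, 0.9f}) {
            System.out.println("LF " + lf + ": ~" + wastedBytes(1024, lf) + " bytes wasted");
        }
    }
}
```

In reality some buckets hold chains while others stay empty, so the true number of empty slots is higher than this simplification suggests — but the relative comparison between load factors holds.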

Edge Cases

  1. Load factor = 1.0: Minimum empty buckets, but collisions (and therefore chains) are practically guaranteed
  2. Load factor > 1.0: HashMap works as an array of lists/trees; the average chain length exceeds 1 and keeps growing between resizes
  3. Load factor < 0.1: Extremely fast, but 10x+ memory overhead

Production Tuning

In high-load systems:

  • Read-heavy: load factor 0.5-0.6 — minimum collisions, maximum speed
  • Write-heavy: load factor 0.75 — balance, fewer resizes
  • Memory-constrained: load factor 0.9+ — memory saving, acceptable slowdown

Monitoring

To analyze effectiveness:

  • Measure average element count per bucket
  • Compare get/put latency at different load factors
  • Use JFR to analyze distribution
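HashMap exposes no bucket statistics, so one way to estimate the distribution without reflection is to replay hashes through the same index math the map uses (spread the hash, then `hash & (capacity - 1)`). This is a simulation of bucket assignment, not a measurement of a live map:

```java
import java.util.Random;

public class BucketStats {
    // Simulates HashMap's bucket assignment: spread the hash the way
    // HashMap.hash() does, then index with hash & (capacity - 1).
    static int[] simulate(int capacity, int size, long seed) {
        int[] buckets = new int[capacity];
        Random rnd = new Random(seed);
        for (int i = 0; i < size; i++) {
            int h = rnd.nextInt();
            h ^= (h >>> 16);                // same spreading step as java.util.HashMap
            buckets[h & (capacity - 1)]++;
        }
        int empty = 0, max = 0;
        for (int c : buckets) {
            if (c == 0) empty++;
            if (c > max) max = c;
        }
        return new int[] {empty, max};      // {empty buckets, longest chain}
    }

    public static void main(String[] args) {
        int[] r = simulate(1024, 768, 42L); // capacity 1024 filled to load factor 0.75
        System.out.println("empty buckets: " + r[0] + ", longest chain: " + r[1]);
    }
}
```

With random hashes at load factor 0.75 the simulation lands near the Poisson prediction: roughly 1024 · e^-0.75 ≈ 483 empty buckets and a longest chain of only a handful of elements.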

🎯 Interview Cheat Sheet

Must know:

  • Load factor = 0.75 by default — balance between memory and collisions
  • threshold = capacity * loadFactor; on exceeding — resize
  • Poisson distribution: average bucket fill λ ≈ 0.5 at the default threshold; probability of 8 elements in a bucket ≈ 0.00000006
  • Low LF (0.5): faster, but 2x memory; high (0.9): saves memory, but more collisions
  • High load factor = long chains = Cache Miss = real performance worse than asymptotics
  • LF > 1.0 is allowed, but average chain length grows with LF and lookups slow down

Common follow-up questions:

  • When to use LF = 0.5? — read-heavy, low latency systems
  • When LF = 0.9? — memory-constrained (Embedded, IoT)
  • Why 0.75, not 0.5 or 1.0? — statistically justified compromise (Poisson)
  • Does LF affect iteration? — indirectly: larger capacity at low LF = slower iteration

Red flags (DO NOT say):

  • “Load factor = 1.0 is optimal” — no, chains and collisions are guaranteed
  • “You can change load factor after creation” — no, it’s set in the constructor
  • “0.75 means 75% of buckets are filled” — no, 75% of capacity before resize

Related topics:

  • [[17. What is Capacity in HashMap]]
  • [[18. When Does Rehashing Occur in HashMap]]
  • [[28. How to Choose the Initial Capacity for HashMap]]