How to Choose the Initial Capacity for HashMap?
🟢 Junior Level
Initial capacity is the size of HashMap’s internal array, set at creation.
If not specified — it defaults to 16. But if you know how many elements will be in the map, it’s better to set the capacity upfront.
Why? So HashMap doesn’t waste time on resizing, which is slow.
Simple formula:
int capacity = (int) (expectedCount / 0.75) + 1;
Map<String, Integer> map = new HashMap<>(capacity);
Example:
// Expecting 100 elements:
int capacity = (int) (100 / 0.75) + 1; // = 134
// HashMap rounds up to the next power of two: 256
Map<String, Integer> map = new HashMap<>(capacity);
// threshold = 256 * 0.75 = 192 — 100 elements fit without resize
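The arithmetic above can be checked with a tiny helper. A minimal sketch; the method name `capacityFor` is ours for illustration, not a JDK API:

```java
// Sketch: compute an initial capacity for an expected element count,
// using HashMap's default load factor of 0.75.
public class CapacityDemo {
    static final float LOAD_FACTOR = 0.75f;

    // capacityFor is a hypothetical helper name, not a JDK method
    static int capacityFor(int expectedCount) {
        return (int) (expectedCount / LOAD_FACTOR) + 1;
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(100));  // 134
        System.out.println(capacityFor(1000)); // 1334
    }
}
```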
🟡 Middle Level
The Resize Problem
Each resize is O(n): creating a new array + copying all elements.
// Without specifying capacity:
new HashMap<>(); // capacity = 16, threshold = 12
// At 13 elements: resize to 32
// At 25 elements: resize to 64
// At 49 elements: resize to 128
// ... and so on
Each resize costs a latency hit and extra GC load.
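The doubling sequence above can be replayed without touching HashMap internals. A sketch that just simulates the rule (capacity starts at 16, doubles whenever size exceeds 0.75 × capacity); `resizesFor` is an illustrative name:

```java
// Sketch: count how many resizes a default-constructed HashMap performs
// while inserting n elements (capacity 16, load factor 0.75, doubling on overflow).
public class ResizeCount {
    static int resizesFor(int n) {
        int capacity = 16;
        int resizes = 0;
        while (n > capacity * 0.75) { // size would exceed threshold -> resize
            capacity *= 2;
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println(resizesFor(12));  // 0 — fits under threshold 12
        System.out.println(resizesFor(13));  // 1 — first resize, to 32
        System.out.println(resizesFor(100)); // 4 — resizes up to capacity 256
    }
}
```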
The Ideal Capacity Formula
initialCapacity = (expectedSize / loadFactor) + 1
Why +1? The threshold is checked as ++size > threshold. Without +1, the last element triggers a resize.
Rounding to Power of Two
HashMap always rounds capacity up to a power of two:
| Requested capacity | Actual capacity |
|---|---|
| 10 | 16 |
| 100 | 128 |
| 1000 | 1024 |
| 1335 | 2048 |
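The rounding in the table can be reproduced with the bit trick modern JDKs use internally. A sketch, with an illustrative method name `roundUp` (the JDK's MAXIMUM_CAPACITY clamp is omitted here):

```java
// Sketch: round a requested capacity up to the next power of two,
// equivalent to HashMap.tableSizeFor for practical values.
public class PowerOfTwo {
    static int roundUp(int cap) {
        // -1 >>> k produces a mask of (32 - k) one-bits; adding 1 yields a power of two
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(roundUp(10));   // 16
        System.out.println(roundUp(100));  // 128
        System.out.println(roundUp(1000)); // 1024
        System.out.println(roundUp(1335)); // 2048
    }
}
```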
Practical Example for 1000 Elements
// Wrong:
new HashMap<>(1000);
// → capacity = 1024, threshold = 768
// → At 769 elements, resize to 2048!
// Right:
new HashMap<>((int)(1000 / 0.75f) + 1); // = 1334
// → capacity = 2048, threshold = 1536
// → 1000 elements fit without resize
When You Don’t Need to Specify Capacity
- Small collections (2-5 elements) — calculation overhead exceeds benefit
- Unknown element count — default constructor
Guava Helpers
// Google Guava — already contains the correct formula:
Map<K, V> map = Maps.newHashMapWithExpectedSize(1000);
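Since Java 19 the JDK ships an equivalent factory, so Guava is no longer required for this. A sketch that assumes JDK 19 or newer:

```java
import java.util.HashMap;
import java.util.Map;

public class JdkFactoryDemo {
    public static void main(String[] args) {
        // HashMap.newHashMap(numMappings) sizes the table so that
        // numMappings entries fit without a resize (available since JDK 19)
        Map<String, Integer> map = HashMap.newHashMap(1000);
        map.put("a", 1);
        System.out.println(map.get("a")); // 1
    }
}
```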
Common Mistakes
- Passing the expected count directly — new HashMap<>(1000) gives threshold 768
- Forgetting +1 — the last element triggers a resize
- Excessive capacity — slow iteration over empty buckets
🔴 Senior Level
tableSizeFor Internal
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
This bit algorithm finds the smallest power of two greater than or equal to cap in O(1).
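Copying the method into a scratch class makes it easy to trace. A sketch; MAXIMUM_CAPACITY is 1 << 30 in the JDK source, and the comments trace the bit-smearing for cap = 1000:

```java
// Standalone copy of HashMap.tableSizeFor (Java 8 variant) for experimentation.
public class TableSizeFor {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    static int tableSizeFor(int cap) {
        int n = cap - 1;  // 1000 -> 999  = 0b11_1110_0111
        n |= n >>> 1;     //        1015  = 0b11_1111_0111
        n |= n >>> 2;     //        1023  = 0b11_1111_1111
        n |= n >>> 4;     // all bits below the top bit are already set
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1; // 1024
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(1000)); // 1024
        System.out.println(tableSizeFor(16));   // 16 — exact powers stay as-is
    }
}
```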
Resize Cost Analysis
On resize capacity=N:
- Allocation: N references × 4-8 bytes each (depending on compressed oops)
- Copying: size elements
- GC: old array → garbage
- Latency spike: 1-100ms depending on size
In latency-sensitive systems (HFT, real-time), resize is unacceptable.
Iteration Performance
// capacity = 1,000,000, size = 10
for (var entry : map.entrySet()) { ... }
// Iterator traverses 1,000,000 buckets, of which 10 are filled
// 999,990 empty checks — huge overhead
Don’t overestimate capacity without need.
Memory Trade-off
| Capacity | Memory (array) | Resizes | Iteration |
|---|---|---|---|
| Exact size | Minimum | Likely one | Fast |
| With margin (formula) | Slightly more | None | Fast |
| Greatly overestimated | Much more | None | Slow |
Production: Dynamic Sizing
In some frameworks (Netty, Kafka), custom maps are used with:
- Predictable latency (no resize)
- Ring buffer for eviction
- Open addressing for memory savings
Benchmark: Impact of Wrong Capacity
| Scenario | Time to insert 1M elements |
|---|---|
| No capacity (resize) | 120ms |
| Correct capacity | 45ms |
| 10x overestimated | 50ms (insert) / 500ms (iteration) |
Best Practice Formula
/**
 * Creates a HashMap that holds expectedSize elements without resizing.
 */
public static <K, V> HashMap<K, V> newHashMap(int expectedSize) {
    return new HashMap<>((int) Math.ceil(expectedSize / 0.75) + 1);
}
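A usage sketch of the helper above, wrapped in a self-contained class; the size check simply confirms all 1000 entries land in the map (the no-resize guarantee follows from tableSizeFor rounding 1334 up to 2048, threshold 1536):

```java
import java.util.HashMap;
import java.util.Map;

public class BestPracticeDemo {
    public static <K, V> HashMap<K, V> newHashMap(int expectedSize) {
        return new HashMap<>((int) Math.ceil(expectedSize / 0.75) + 1);
    }

    public static void main(String[] args) {
        Map<Integer, String> map = newHashMap(1000);
        for (int i = 0; i < 1000; i++) {
            map.put(i, "v" + i); // threshold is 1536, so no resize occurs
        }
        System.out.println(map.size()); // 1000
    }
}
```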
🎯 Interview Cheat Sheet
Must know:
- Formula: initialCapacity = (expectedSize / loadFactor) + 1 — to avoid resize
- HashMap rounds capacity up to a power of two via tableSizeFor()
- Without formula: new HashMap(1000) → capacity=1024, threshold=768 → resize at 769 elements
- With formula: (1000/0.75)+1=1334 → capacity=2048, threshold=1536 → no resize
- Each resize = O(n): allocation + copying + GC pressure + latency spike
- Overestimated capacity = slow iteration over empty buckets (O(capacity + size))
- Don’t specify capacity for small collections (2-5 elements)
Common follow-up questions:
- Why +1 in the formula? — threshold is checked as ++size > threshold; without +1, the last element triggers a resize
- What is tableSizeFor? — bit algorithm that finds the smallest power of two ≥ cap in O(1)
- When to use Guava Maps.newHashMapWithExpectedSize? — already contains the correct formula
- What does resize cost? — 1-100ms per resize depending on size; amortized per insert for 1M elements ~150ns (P99 ~5000ns)
Red flags (DO NOT say):
- “new HashMap(1000) creates 1000 buckets” — no, 1024, and threshold = 768
- “The bigger the capacity the better” — no, iteration is slower, memory is wasted
- “Capacity can be changed after creation” — no, set in constructor
Related topics:
- [[17. What is Capacity in HashMap]]
- [[18. When Does Rehashing Occur in HashMap]]
- [[19. What Happens During Rehashing]]
- [[16. What is Load Factor in HashMap]]