Question 17 · Section 10

What is Capacity in HashMap?

This is needed for the fast formula:

Language versions: English Russian Ukrainian

🟢 Junior Level

Capacity is the number of cells (buckets) in HashMap’s internal array.

  • Default value: 16
  • Always a power of two: 16, 32, 64, 128, 256…

Example:

HashMap<String, Integer> map = new HashMap<>();
// capacity = 16 (default)
// At 12 elements (16 * 0.75) — grows to 32

HashMap<String, Integer> map2 = new HashMap<>(64);
// capacity = 64 from the start

Simple analogy: Capacity is the number of mailboxes in an apartment building. More mailboxes = less chance that two letters land in the same one.

🟡 Middle Level

Capacity Properties

Parameter Value Description
Default 16 Starting size
Minimum 1 Smallest possible
Maximum 2^30 Max (1,073,741,824)
Step x2 Doubles each resize

Why Always a Power of Two?

This is needed for the fast formula:

index = (n - 1) & hash;  // Works only when n = 2^k

If n were 15 (not a power of two):

n-1 = 14 => 0000 1110

The lowest bit is always 0 — all odd buckets are empty. This is catastrophic for collisions.

Initial Capacity Rounding

new HashMap(10)  // capacity becomes 16
new HashMap(33)  // capacity becomes 64
new HashMap(100) // capacity becomes 128

The tableSizeFor() method finds the nearest power of two upward.

Relationship with Threshold

threshold = capacity * loadFactor

At capacity=16, loadFactor=0.75: threshold = 12. On the 13th element — resize.

Performance Impact

Too small: Frequent resize (O(n)), latency spikes Too large: Iteration is slow (traversing empty buckets). Large capacity is justified when: (1) you know the expected element count and want to avoid resize; (2) the Map lives long and is frequently read. NOT justified: for temporary/small Maps.

🔴 Senior Level

Internal Implementation

transient Node<K,V>[] table; // The array itself

On new HashMap() creation, the array is null. Initialization is lazy, on the first put().

tableSizeFor() — Finding the Power of Two Algorithm

static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

This bit trick fills all bits to the right of the highest set bit, then adds 1.

Resize Optimization

On doubling capacity (Java 8+):

// Element either stays in place or shifts to oldIndex + oldCap
if ((e.hash & oldCap) == 0) {
    // Same index
} else {
    // Index + oldCapacity
}

No need to recompute hashes — one bit is checked.

Memory Implications

  • capacity=16: array ~128 bytes (16 references × 8 bytes)
  • capacity=1,000,000: array ~4-8MB (even with 0 elements!). References on a 64-bit JVM without compressed references: 1M × 8 bytes ≈ 8MB. With UseCompressedOops (default, compresses references to 4 bytes): 1M × 4 bytes ≈ 4MB.
  • On iteration: O(capacity + size) — empty buckets are also traversed

Edge Cases

  1. capacity = MAXIMUM_CAPACITY (2^30): No further increase
  2. capacity > MAXIMUM_CAPACITY: Error (threshold = Integer.MAX_VALUE)
  3. Initial capacity = 0: Allowed, first put creates capacity = 1

Production Best Practices

// For 1000 elements:
int expected = 1000;
int capacity = (int)(expected / 0.75f) + 1; // 1334
// HashMap rounds to 2048
new HashMap<>(capacity);

This prevents resize and ensures stable latency.


🎯 Interview Cheat Sheet

Must know:

  • Capacity = number of buckets, always a power of two (16, 32, 64…)
  • Default = 16, Maximum = 2^30, doubles on resize
  • new HashMap(10) → capacity = 16 (rounding up to power of two via tableSizeFor)
  • Index formula: (n-1) & hash works ONLY when n = 2^k
  • Lazy initialization: array is created on first put (Java 8+)
  • tableSizeFor() — bit algorithm: fills all bits to the right of the highest + 1
  • On iteration: O(capacity + size) — empty buckets are also traversed

Common follow-up questions:

  • What happens with capacity = 0? — allowed, first put creates capacity = 1
  • Why not prime numbers? — for (n-1) & hash, a power of two is needed; Hashtable uses primes with %
  • How does tableSizeFor find the power of two? — via series of n |= n >>> k (1,2,4,8,16 bits)
  • How much memory at capacity = 1M? — ~4-8MB (array of references, depends on UseCompressedOops)

Red flags (DO NOT say):

  • “Capacity can be any number” — no, always a power of two
  • “HashMap creates the array in the constructor” — no, lazily on first put
  • “capacity = 10 creates 10 buckets” — no, rounds to 16

Related topics:

  • [[16. What is Load Factor in HashMap]]
  • [[18. When Does Rehashing Occur in HashMap]]
  • [[19. What Happens During Rehashing]]
  • [[28. How to Choose the Initial Capacity for HashMap]]