What to do about key collisions when collecting into Map?
🟢 Junior Level
Key collision — when two stream elements produce the same key for a Map.
By default, toMap throws an error:
// If two users have the same id → IllegalStateException
users.stream().collect(toMap(User::getId, u -> u));
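The default behavior can be reproduced with a minimal sketch. The `User` record, its fields, and the sample data are hypothetical and exist only for illustration (records assume Java 16 or newer).

```java
import java.util.List;
import java.util.stream.Collectors;

class DuplicateKeyDemo {
    // Hypothetical User type, invented for this example.
    record User(int id, String name) {
        int getId() { return id; }
    }

    // Returns true if collecting without a merge function fails on a duplicate id.
    static boolean throwsOnDuplicate(List<User> users) {
        try {
            users.stream().collect(Collectors.toMap(User::getId, u -> u));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<User> users = List.of(new User(1, "Ann"), new User(1, "Bob"));
        System.out.println(throwsOnDuplicate(users)); // prints true
    }
}
```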
Solution: Add a third parameter — the merge function:
users.stream().collect(toMap(
    User::getId,
    u -> u,
    (existing, replacement) -> existing // keep first
));
Merge function options:
- (existing, replacement) -> existing (keep the FIRST value)
- (existing, replacement) -> replacement (keep the LAST value, replacing the old one)
- (existing, replacement) -> existing + replacement (combine)
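The three options can be compared side by side on the same input. This is a small sketch with made-up sample words, where the key is the word length so that "cat", "dog", and "cow" collide.

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MergeOptionsDemo {
    // Sample data: three words of length 3 collide on the same key.
    static Stream<String> words() {
        return Stream.of("cat", "dog", "bird", "cow");
    }

    static Map<Integer, String> keepFirst() {
        return words().collect(Collectors.toMap(String::length, w -> w,
                (existing, replacement) -> existing));
    }

    static Map<Integer, String> keepLast() {
        return words().collect(Collectors.toMap(String::length, w -> w,
                (existing, replacement) -> replacement));
    }

    static Map<Integer, String> combine() {
        return words().collect(Collectors.toMap(String::length, w -> w,
                (existing, replacement) -> existing + replacement));
    }

    public static void main(String[] args) {
        System.out.println(keepFirst().get(3)); // cat
        System.out.println(keepLast().get(3));  // cow
        System.out.println(combine().get(3));   // catdogcow
    }
}
```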
🟡 Middle Level
Conflict resolution strategies
1. Overwriting:
// Keep First — for deduplication
(oldValue, newValue) -> oldValue
// Keep Last — current state
(oldValue, newValue) -> newValue
2. Aggregation (Collating):
// Concatenation
.collect(toMap(User::getRole, User::getName, (n1, n2) -> n1 + ", " + n2))
3. Complex choice (Business Logic):
(existing, replacement) ->
    existing.getSalary() > replacement.getSalary() ? existing : replacement
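Put into a complete program, the salary comparison might look like the sketch below. The `User` record, its fields, and the sample data are assumptions made for illustration only.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TopEarnerDemo {
    // Hypothetical User type, invented for this example.
    record User(String role, String name, int salary) {
        String getRole() { return role; }
        int getSalary() { return salary; }
    }

    // For each role, keep the user with the higher salary.
    // On a tie the later element wins, because the comparison is strict.
    static Map<String, User> topEarnerByRole(List<User> users) {
        return users.stream().collect(Collectors.toMap(
                User::getRole,
                u -> u,
                (existing, replacement) ->
                        existing.getSalary() > replacement.getSalary() ? existing : replacement));
    }

    public static void main(String[] args) {
        List<User> users = List.of(
                new User("dev", "Ann", 100),
                new User("dev", "Bob", 120),
                new User("qa", "Cid", 90));
        topEarnerByRole(users)
                .forEach((role, u) -> System.out.println(role + " -> " + u.name()));
    }
}
```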
When is toMap() a bad choice?
If one key should correspond to multiple values — use Collectors.groupingBy():
// Correct — creates Map<Role, List<User>>
users.stream().collect(groupingBy(User::getRole));
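As a sketch of the one-key-to-many-values case, the example below groups names by role instead of concatenating them in a merge function. The `User` record and the sample data are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingDemo {
    // Hypothetical User type, invented for this example.
    record User(String role, String name) {
        String getRole() { return role; }
        String getName() { return name; }
    }

    // groupingBy never collides: every name is appended to the list of its role.
    static Map<String, List<String>> namesByRole(List<User> users) {
        return users.stream().collect(Collectors.groupingBy(
                User::getRole,
                Collectors.mapping(User::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<User> users = List.of(
                new User("dev", "Ann"),
                new User("dev", "Bob"),
                new User("qa", "Cid"));
        System.out.println(namesByRole(users).get("dev")); // [Ann, Bob]
    }
}
```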
🔴 Senior Level
Merge Function Cost
The merge function runs inline during accumulation, once for every element whose key is already present in the map. Heavy computation inside it slows down the entire collect step.
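To get a feel for how often the merge function runs, a counter can be placed inside it. The sketch below uses synthetic integer data; in a sequential stream the count equals the number of elements minus the number of distinct keys.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class MergeCallCountDemo {
    // Counts merge-function invocations for `elements` items spread over `distinctKeys` keys.
    static int countMergeCalls(int elements, int distinctKeys) {
        AtomicInteger calls = new AtomicInteger();
        IntStream.range(0, elements).boxed()
                .collect(Collectors.toMap(
                        i -> i % distinctKeys,   // key: remainder, so keys repeat
                        i -> 1,                  // value: one occurrence
                        (a, b) -> {
                            calls.incrementAndGet(); // runs only when the key already exists
                            return a + b;
                        }));
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(countMergeCalls(1000, 10)); // prints 990
    }
}
```

With no collisions the counter stays at zero, so the cost of the merge function matters only in proportion to the number of duplicates.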
Parallel Streams
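In parallel streams, each worker resolves collisions inside its own partial map, and the combiner resolves them again when merging the partial maps:
- Few collisions: the overhead is negligible
- Many collisions: merging maps key by key gets expensive, so it is better to use toConcurrentMap, or groupingByConcurrent when one key needs multiple values
// In parallelStream, the combiner is called to merge the partial maps from different workers.
// toMap builds its combiner from the same mergeFunction, so that function must be associative, otherwise the result can depend on how the work was split.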
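The following sketch runs the same aggregation in parallel with `toMap` and with `toConcurrentMap`, which the `toMap` Javadoc suggests for parallel pipelines when encounter order does not matter. The data is synthetic, and `Integer::sum` is chosen because it is associative and commutative, so both variants give the same result regardless of how the work is split.

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelMergeDemo {
    // Sums 0..n-1 per remainder class; all workers write into one shared concurrent map.
    static Map<Integer, Integer> sumConcurrent(int n, int buckets) {
        return IntStream.range(0, n).boxed().parallel()
                .collect(Collectors.toConcurrentMap(
                        i -> i % buckets,
                        i -> i,
                        Integer::sum));
    }

    // Same computation with toMap: each worker fills its own map,
    // then the combiner merges the partial maps using Integer::sum.
    static Map<Integer, Integer> sumWithToMap(int n, int buckets) {
        return IntStream.range(0, n).boxed().parallel()
                .collect(Collectors.toMap(
                        i -> i % buckets,
                        i -> i,
                        Integer::sum));
    }

    public static void main(String[] args) {
        System.out.println(sumConcurrent(100, 4).equals(sumWithToMap(100, 4))); // prints true
    }
}
```

Whether the concurrent variant is actually faster depends on the data and the machine, so it is worth measuring before switching.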
Null Values
toMap does not tolerate null values, even if the merge function handles them → NPE inside Map.merge.
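The sketch below demonstrates the NPE and one possible workaround: accumulating into a `HashMap` directly through the three-argument `Stream.collect`. The `User` record with a nullable `nickname` is hypothetical. Note that the workaround uses `put`, so on duplicate keys the last value silently wins.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NullValueDemo {
    // Hypothetical User type; nickname may be null.
    record User(int id, String nickname) {}

    // Returns true if toMap rejects a null value even though a merge function is supplied.
    static boolean toMapThrowsOnNull(List<User> users) {
        try {
            users.stream().collect(Collectors.toMap(User::id, User::nickname, (a, b) -> a));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Workaround sketch: HashMap.put accepts null values, unlike Map.merge.
    static Map<Integer, String> nicknamesAllowingNulls(List<User> users) {
        return users.stream().collect(
                () -> new HashMap<Integer, String>(),
                (map, u) -> map.put(u.id(), u.nickname()),
                (left, right) -> left.putAll(right));
    }

    public static void main(String[] args) {
        List<User> users = List.of(new User(1, "ann"), new User(2, null));
        System.out.println(toMapThrowsOnNull(users));              // prints true
        System.out.println(nicknamesAllowingNulls(users).size());  // prints 2
    }
}
```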
Static Analysis
Error Prone (Google) and Sonar flag toMap without a 3rd argument as a “potential bug”. Always passing an explicit merge function is a sensible defensive coding standard.
Diagnostics
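For critical code, give the merge function a block body that logs the conflicting objects. The merge function receives only the two values, not the key, so the key has to be derived from one of them:
.collect(toMap(
    User::getId, u -> u,
    (old, newVal) -> {
        // both values share the same id, which is the colliding key
        log.warn("Duplicate key: {}, values: {} and {}", old.getId(), old, newVal);
        return old;
    }
))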
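A self-contained variant of this idea is sketched below. It swaps in `java.util.logging` so that it runs without extra dependencies (an assumption, not the logging framework the snippet above implies), and the `User` record is hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;
import java.util.stream.Collectors;

class DuplicateLoggingDemo {
    private static final Logger LOG = Logger.getLogger(DuplicateLoggingDemo.class.getName());

    // Hypothetical User type, invented for this example.
    record User(int id, String name) {}

    // Indexes users by id, logging every collision and keeping the first value.
    static Map<Integer, User> indexById(List<User> users) {
        return users.stream().collect(Collectors.toMap(
                User::id,
                u -> u,
                (old, newVal) -> {
                    // The merge function does not receive the key, so read it from a value.
                    LOG.warning("Duplicate key: " + old.id()
                            + ", values: " + old + " and " + newVal);
                    return old;
                }));
    }

    public static void main(String[] args) {
        List<User> users = List.of(new User(1, "Ann"), new User(1, "Bob"));
        System.out.println(indexById(users).get(1).name()); // prints Ann
    }
}
```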
🎯 Interview Cheat Sheet
Must know:
- Key collision = two elements produce the same key → without merge function: IllegalStateException
- Strategies: (old, newVal) -> old (keep first), (old, newVal) -> newVal (replace), combine
- If one key needs multiple values, use groupingBy(), not toMap
- In parallel streams, collisions are handled when merging sub-results (combiner)
- Many collisions in parallelStream: better to use groupingByConcurrent with ConcurrentHashMap
- toMap does not tolerate null values → NPE inside Map.merge, even with a merge function
- Error Prone and Sonar flag toMap without a 3rd argument as a “potential bug”
Frequent follow-up questions:
- How to keep the first value on collision? Use (existing, replacement) -> existing, which keeps the first element found.
- When to use groupingBy instead of toMap? When one key corresponds to multiple values: groupingBy creates Map<K, List<V>>.
- Why must the merge function in parallelStream be associative? Because the combiner may combine results in a different order, and a non-associative function gives a nondeterministic result.
- How to log duplicates on collision? Give the merge function a block body that logs the conflicting objects.
Red flags (DO NOT say):
- “toMap without a merge function is safe” — incorrect, on duplicates it throws IllegalStateException
- “merge function is called for every element” — incorrect, only on key match
- “null values are acceptable in toMap” — incorrect, Map.merge will throw NPE
- “groupingByConcurrent is always faster than toMap” — incorrect, it is only effective with many collisions
Related topics:
- [[27. How to collect Stream into Map]]
- [[21. What is lazy evaluation in Stream]]
- [[22. When does Stream operation execution begin]]
- [[29. How to work with Optional in Stream]]