How to collect Stream into Map?
🟢 Junior Level
To collect into a Map, use Collectors.toMap():
toMap() = one element per key (if keys duplicate → IllegalStateException). groupingBy() = list of elements per key (duplicates are grouped).
```java
Map<Integer, String> idToName = users.stream()
    .collect(Collectors.toMap(
        User::getId,   // key
        User::getName  // value
    ));
```
Simple example:
```java
List<User> users = ...;
Map<String, User> nameToUser = users.stream()
    .collect(Collectors.toMap(User::getName, u -> u));
```
If two elements have the same key — you get an IllegalStateException.
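A minimal runnable sketch of both cases, using a hypothetical `User` record (with `id()`/`name()` accessors standing in for the getters above):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ToMapBasics {
    // Hypothetical User record standing in for the User class above
    record User(int id, String name) {}

    static Map<Integer, String> idToName(List<User> users) {
        return users.stream()
                .collect(Collectors.toMap(User::id, User::name));
    }

    public static void main(String[] args) {
        // Unique keys -> plain toMap works
        System.out.println(idToName(List.of(new User(1, "Ann"), new User(2, "Bob"))));

        // Duplicate key 1 -> IllegalStateException from the collector
        try {
            idToName(List.of(new User(1, "Ann"), new User(1, "Bob")));
        } catch (IllegalStateException e) {
            System.out.println("Collision: " + e.getMessage());
        }
    }
}
```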
🟡 Middle Level
toMap signatures
Basic (2 arguments):
```java
.toMap(keyMapper, valueMapper)
// On collision → IllegalStateException
```
With merge function (3 arguments):
```java
.toMap(keyMapper, valueMapper, (old, replacement) -> old)
// On collision — keeps the old value.
// mergeFunction is called ONLY on a key duplicate.
// It decides which value to keep: old, new, or a combination.
```
With Map factory (4 arguments):
```java
.toMap(keyMapper, valueMapper, mergeFn, LinkedHashMap::new)
// Preserves insertion order
```
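The 3- and 4-argument forms can be exercised in a small runnable sketch (the word list and the `firstWordPerLetter` helpers are invented for illustration):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.toMap;

public class ToMapVariants {
    // 3-arg form: on a key collision, keep the old (first-seen) value
    static Map<Character, String> firstWordPerLetter(List<String> words) {
        return words.stream()
                .collect(toMap(w -> w.charAt(0), w -> w, (old, replacement) -> old));
    }

    // 4-arg form: LinkedHashMap preserves encounter order of the keys
    static Map<Character, String> firstWordPerLetterOrdered(List<String> words) {
        return words.stream()
                .collect(toMap(w -> w.charAt(0), w -> w,
                        (old, replacement) -> old, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "avocado", "banana", "cherry", "cranberry");
        System.out.println(firstWordPerLetter(words));        // avocado, cranberry dropped by merge fn
        System.out.println(firstWordPerLetterOrdered(words)); // keys in encounter order: a, b, c
    }
}
```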
GroupingBy
For grouping elements (one key — many values):
```java
// Group by city
Map<City, List<Person>> byCity = persons.stream()
    .collect(groupingBy(Person::getCity));

// Grouping + counting
Map<City, Long> countByCity = persons.stream()
    .collect(groupingBy(Person::getCity, counting()));
```
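Put together as a self-contained sketch (a hypothetical `Person` record with a plain `String` city stands in for the `Person`/`City` types above):

```java
import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;

public class GroupingDemo {
    // Hypothetical Person record; city is a plain String for simplicity
    record Person(String name, String city) {}

    static Map<String, List<Person>> byCity(List<Person> persons) {
        return persons.stream().collect(groupingBy(Person::city));
    }

    static Map<String, Long> countByCity(List<Person> persons) {
        return persons.stream().collect(groupingBy(Person::city, counting()));
    }

    public static void main(String[] args) {
        List<Person> persons = List.of(
                new Person("Ann", "Berlin"),
                new Person("Bob", "Berlin"),
                new Person("Cid", "Paris"));
        System.out.println(byCity(persons).get("Berlin").size()); // 2
        System.out.println(countByCity(persons));
    }
}
```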
When NOT to use toMap
- Keys may repeat — use groupingBy(), otherwise you get an IllegalStateException
- You only need a count — use groupingBy(counting()) instead of toMap + size()
- One key maps to many values — use groupingBy() (or a Multimap from a library such as Guava; the JDK itself has no toMultimap())
🔴 Senior Level
EnumMap Optimization
If keys are an Enum, standard HashMap is inefficient:
```java
.collect(groupingBy(User::getRole, () -> new EnumMap<>(Role.class), toList()))
```
EnumMap uses an array → O(1) access and minimal memory.
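A runnable version of the idea, with a hypothetical `Role` enum and `User` record (accessor `role()` in place of `getRole()`):

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toList;

public class EnumMapGrouping {
    enum Role { ADMIN, USER }                 // hypothetical
    record User(String name, Role role) {}    // hypothetical

    static Map<Role, List<User>> byRole(List<User> users) {
        // EnumMap factory: backed by an array indexed by ordinal, not hash buckets
        return users.stream()
                .collect(groupingBy(User::role, () -> new EnumMap<>(Role.class), toList()));
    }

    public static void main(String[] args) {
        List<User> users = List.of(
                new User("Ann", Role.ADMIN),
                new User("Bob", Role.USER),
                new User("Cid", Role.USER));
        Map<Role, List<User>> byRole = byRole(users);
        System.out.println(byRole.getClass().getSimpleName()); // EnumMap
        System.out.println(byRole.get(Role.USER).size());      // 2
    }
}
```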
Preventing rehashing
If you know the data size:
```java
.collect(toMap(k, v, merge, () -> new HashMap<>(expectedSize)))
```
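One caveat worth knowing: `HashMap(int)` takes an initial *capacity*, not an element count, so to truly avoid rehashing you should divide the expected size by the default load factor of 0.75 (JDK 19+ also offers `HashMap.newHashMap(expectedSize)` for exactly this). A sketch with an invented `wordLengths` helper:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.toMap;

public class PresizedToMap {
    static Map<String, Integer> wordLengths(List<String> words) {
        // HashMap(int) takes a *capacity*; divide by the default load factor
        // (0.75) so the map never rehashes while filling up.
        int capacity = (int) Math.ceil(words.size() / 0.75);
        return words.stream()
                .collect(toMap(w -> w, String::length, (a, b) -> a,
                        () -> new HashMap<>(capacity)));
    }

    public static void main(String[] args) {
        System.out.println(wordLengths(List.of("stream", "map", "collector")));
    }
}
```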
Null Values
Collectors.toMap does not allow null in values, even if the target HashMap supports it. This is an implementation limitation.
Workaround: use forEach into a plain map, or wrap the values in Optional.
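The forEach workaround, sketched with a hypothetical `User` record whose `email` may be null:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullValueWorkaround {
    record User(String name, String email) {} // email may be null (hypothetical)

    static Map<String, String> nameToEmail(List<User> users) {
        // Collectors.toMap would throw a NullPointerException on the null email;
        // a plain forEach into a HashMap accepts it.
        Map<String, String> result = new HashMap<>();
        users.forEach(u -> result.put(u.name(), u.email()));
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> m = nameToEmail(List.of(
                new User("Ann", "ann@example.com"),
                new User("Bob", null)));
        System.out.println(m.containsKey("Bob") + " / " + m.get("Bob")); // true / null
    }
}
```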
Identity Function
Prefer Function.identity() over x -> x: identity() returns a shared, stateless function instance, while every x -> x written in source becomes its own lambda at each call site. The performance difference is usually negligible, but identity() is marginally lighter and reads more clearly.
Parallel Streams
groupingByConcurrent is designed for parallel streams — all worker threads write into a single ConcurrentHashMap, so no combiner step is needed. Plain groupingBy on a parallel stream builds a separate map per thread and then merges them in the combiner, which can end up slower than a sequential stream.
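A small runnable sketch (the modulo classifier is invented for illustration) — note the collector returns a ConcurrentMap:

```java
import java.util.concurrent.ConcurrentMap;
import java.util.stream.IntStream;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingByConcurrent;

public class ConcurrentGroupingDemo {
    static ConcurrentMap<Integer, Long> countByRemainder(int n) {
        // All worker threads write into one ConcurrentHashMap; no combiner step
        return IntStream.range(0, n)
                .boxed()
                .parallel()
                .collect(groupingByConcurrent(i -> i % 3, counting()));
    }

    public static void main(String[] args) {
        // 0..8 grouped by i % 3: three numbers per remainder
        System.out.println(countByRemainder(9));
    }
}
```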
Diagnostics
On IllegalStateException in toMap, log the duplicate keys. The standard JDK exception is not always informative.
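One way to get better diagnostics is a merge function that throws with domain context; note it only receives the two colliding *values*, not the key (the `User` record here is hypothetical):

```java
import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.toMap;

public class DuplicateKeyDiagnostics {
    record User(int id, String name) {} // hypothetical

    static Map<Integer, String> idToName(List<User> users) {
        return users.stream().collect(toMap(
                User::id,
                User::name,
                (a, b) -> {
                    // The merge function sees only the two colliding values,
                    // so include both in the message to aid debugging.
                    throw new IllegalStateException(
                            "Duplicate key: values '" + a + "' and '" + b + "' collided");
                }));
    }

    public static void main(String[] args) {
        try {
            idToName(List.of(new User(1, "Ann"), new User(1, "Bob")));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```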
🎯 Interview Cheat Sheet
Must know:
- Collectors.toMap(keyMapper, valueMapper) — basic collection; throws IllegalStateException on collision
- 3rd argument of toMap — merge function: (old, replacement) -> old (keep first)
- 4th argument — Map factory: LinkedHashMap::new preserves insertion order
- groupingBy() — when one key corresponds to many values (Map<K, List<V>>)
- groupingBy + counting() — for counting elements by group
- toMap does not allow null values — implementation limitation
- groupingByConcurrent for parallel streams — writes to ConcurrentHashMap without a combiner
- For Enum keys, use EnumMap — O(1) access, minimal memory
Frequent follow-up questions:
- What happens on a duplicate key in toMap? — IllegalStateException. Solution: add a merge function as the third argument.
- How does toMap differ from groupingBy? — toMap: one element per key; groupingBy: list of elements per key (Map<K, List<V>>).
- How to preserve insertion order? — Fourth argument: .toMap(key, value, merge, LinkedHashMap::new).
- Why does toMap not accept null values? — Limitation of the internal JDK implementation via Map.merge, which throws an NPE on null.
Red flags (DO NOT say):
- “toMap automatically resolves collisions” — incorrect, without a merge function it throws IllegalStateException
- “groupingBy returns Map<K, V>” — incorrect, it returns Map<K, List<V>>
- “toMap supports null values” — incorrect, it throws NPE
- “parallelStream with toMap is always faster” — incorrect, without groupingByConcurrent it can be slower due to the combiner
Related topics:
- [[28. What to do about key collisions when collecting into Map]]
- [[23. What do distinct(), sorted(), limit(), skip() operations do]]
- [[29. How to work with Optional in Stream]]
- [[21. What is lazy evaluation in Stream]]