Question 27 · Section 8

How to collect a Stream into a Map?

🟢 Junior Level

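To collect into a Map, use Collectors.toMap(keyMapper, valueMapper). It keeps exactly one element per key: if two elements produce the same key, it throws IllegalStateException. groupingBy(), by contrast, collects a list of elements per key, so duplicates are grouped instead of rejected.

Example with a key mapper and a value mapper: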

Map<Integer, String> idToName = users.stream()
    .collect(Collectors.toMap(
        User::getId,       // key
        User::getName      // value
    ));

Simple example:

List<User> users = ...;
Map<String, User> nameToUser = users.stream()
    .collect(Collectors.toMap(User::getName, u -> u));

If two elements have the same key — you get an IllegalStateException.
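
A minimal runnable sketch of both behaviours, assuming Java 16+ and a hypothetical User record standing in for the User class from the snippets above (so the accessors are id() and name() rather than getId() and getName()):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapBasics {
    // Hypothetical stand-in for the User type used in the snippets above.
    record User(int id, String name) {}

    // Ids are unique here, so every key maps to exactly one value.
    static Map<Integer, String> idToName(List<User> users) {
        return users.stream()
                .collect(Collectors.toMap(User::id, User::name));
    }

    // Returns true if collecting by name fails because of a duplicate key.
    static boolean failsOnDuplicateName(List<User> users) {
        try {
            users.stream().collect(Collectors.toMap(User::name, u -> u));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<User> users = List.of(
                new User(1, "Ann"), new User(2, "Bob"), new User(3, "Ann"));
        System.out.println(idToName(users).get(2));      // Bob
        System.out.println(failsOnDuplicateName(users)); // true
    }
}
```

The ids are unique, so the first call succeeds; the name "Ann" appears twice, so the second call ends in IllegalStateException.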

🟡 Middle Level

toMap signatures

Basic (2 arguments):

.toMap(keyMapper, valueMapper)
// On collision → IllegalStateException

With merge function (3 arguments):

.toMap(keyMapper, valueMapper, (old, replacement) -> old)
// On collision — keeps the old value

// mergeFunction is called ONLY on key duplicate.
// It decides which value to keep: old, new, or combine.

With Map factory (4 arguments):

.toMap(keyMapper, valueMapper, mergeFn, LinkedHashMap::new)
// Preserves insertion order
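
The three- and four-argument overloads can be sketched as below. This is an illustrative example, assuming Java 16+ and a hypothetical User record with city and name fields; none of these names come from a real code base.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapOverloads {
    // Hypothetical type for illustration only.
    record User(String city, String name) {}

    // 3 arguments: on a duplicate city keep the value that was seen first.
    static Map<String, String> firstNamePerCity(List<User> users) {
        return users.stream()
                .collect(Collectors.toMap(User::city, User::name,
                        (old, replacement) -> old));
    }

    // 3 arguments: combine both values instead of picking one.
    static Map<String, String> joinedNamesPerCity(List<User> users) {
        return users.stream()
                .collect(Collectors.toMap(User::city, User::name,
                        (a, b) -> a + ", " + b));
    }

    // 4 arguments: same merge rule, but the result is a LinkedHashMap,
    // so iteration follows the order in which keys were first encountered.
    static Map<String, String> orderedFirstNamePerCity(List<User> users) {
        return users.stream()
                .collect(Collectors.toMap(User::city, User::name,
                        (old, replacement) -> old, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        List<User> users = List.of(
                new User("Paris", "Ann"), new User("Oslo", "Bob"), new User("Paris", "Cid"));
        System.out.println(firstNamePerCity(users).get("Paris"));   // Ann
        System.out.println(joinedNamesPerCity(users).get("Paris")); // Ann, Cid
        System.out.println(orderedFirstNamePerCity(users));         // {Paris=Ann, Oslo=Bob}
    }
}
```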

GroupingBy

For grouping elements (one key — many values):

// Group by city
Map<City, List<Person>> byCity = persons.stream()
    .collect(groupingBy(Person::getCity));

// Grouping + counting
Map<City, Long> countByCity = persons.stream()
    .collect(groupingBy(Person::getCity, counting()));

When NOT to use toMap

  1. Keys may repeat — use groupingBy(), otherwise IllegalStateException
  2. You only need a count: use groupingBy(classifier, counting()) instead of toMap + size()
  3. One key → many values: use groupingBy() or a multimap collector (for example Guava's Multimaps.toMultimap())
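
The groupingBy alternatives from this list might look like the sketch below. It assumes Java 16+ and a hypothetical Person record with plain String cities, which is a simplification of the City type used earlier.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingInsteadOfToMap {
    // Hypothetical type for illustration only.
    record Person(String city, String name) {}

    // One key, many values: collect the names of everyone in a city.
    static Map<String, List<String>> namesByCity(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.groupingBy(Person::city,
                        Collectors.mapping(Person::name, Collectors.toList())));
    }

    // Only a count is needed: no intermediate lists are built.
    static Map<String, Long> countByCity(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.groupingBy(Person::city, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Person> persons = List.of(
                new Person("Paris", "Ann"), new Person("Oslo", "Bob"), new Person("Paris", "Cid"));
        System.out.println(namesByCity(persons).get("Paris")); // [Ann, Cid]
        System.out.println(countByCity(persons).get("Paris")); // 2
    }
}
```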

🔴 Senior Level

EnumMap Optimization

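If the keys are enum constants, a general-purpose HashMap does more work than necessary. Pass an EnumMap factory to groupingBy instead:

.collect(groupingBy(User::getRole, () -> new EnumMap<>(Role.class), toList()))

EnumMap stores its values in an array indexed by the enum ordinal, so lookups need no hashing and the map has a minimal memory footprint.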
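
A self-contained sketch of grouping into an EnumMap, assuming Java 16+ and a hypothetical Role enum and User record. A side effect worth knowing is that EnumMap iterates its keys in enum declaration order.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class EnumMapGrouping {
    // Hypothetical types for illustration only.
    enum Role { ADMIN, EDITOR, VIEWER }
    record User(String name, Role role) {}

    // The second argument is the map factory; the third is the downstream collector.
    static Map<Role, List<String>> namesByRole(List<User> users) {
        return users.stream()
                .collect(Collectors.groupingBy(
                        User::role,
                        () -> new EnumMap<>(Role.class),
                        Collectors.mapping(User::name, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<User> users = List.of(
                new User("Bob", Role.VIEWER), new User("Ann", Role.ADMIN), new User("Cid", Role.VIEWER));
        // Keys come out in declaration order, not in encounter order.
        System.out.println(namesByRole(users)); // {ADMIN=[Ann], VIEWER=[Bob, Cid]}
    }
}
```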

Preventing rehashing

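If you know the approximate number of keys, presize the map. The HashMap constructor takes an initial capacity, not an element count, and the table is resized once the size exceeds capacity × 0.75 (the default load factor), so divide the expected size by the load factor:

.collect(toMap(k, v, merge, () -> new HashMap<>((int) (expectedSize / 0.75f) + 1)))
// Java 19+: () -> HashMap.newHashMap(expectedSize)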

Null Values

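Collectors.toMap does not allow null values, even if the target map (for example HashMap) supports them: a null returned by the value mapper causes a NullPointerException. This is a limitation of the collector's implementation.

Solution: fill the map manually with forEach, use the three-argument Stream.collect() with HashMap::new and put, or wrap the values in Optional.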
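
One possible workaround, sketched under the assumption of Java 16+ and a hypothetical User record whose nickname may be null, is the three-argument Stream.collect(), which calls Map.put directly:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NullValuesInMap {
    // Hypothetical type for illustration only; nickname may be null.
    record User(int id, String nickname) {}

    // Returns true if toMap rejects a null nickname with NullPointerException.
    static boolean toMapThrowsNpe(List<User> users) {
        try {
            users.stream().collect(Collectors.toMap(User::id, User::nickname));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Workaround: put() accepts null values in a HashMap.
    // Caveat: duplicate keys are silently overwritten here
    // instead of raising IllegalStateException.
    static Map<Integer, String> idToNickname(List<User> users) {
        return users.stream()
                .collect(HashMap::new,
                        (map, u) -> map.put(u.id(), u.nickname()),
                        Map::putAll);
    }

    public static void main(String[] args) {
        List<User> users = List.of(new User(1, "neo"), new User(2, null));
        System.out.println(toMapThrowsNpe(users));              // true
        System.out.println(idToNickname(users).containsKey(2)); // true
        System.out.println(idToNickname(users).get(2));         // null
    }
}
```

The trade-off is that the duplicate-key check is lost, so add one explicitly if keys may repeat.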

Identity Function

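Function.identity() can be used instead of x -> x. It returns one shared function instance, whereas every x -> x written in the code is a separate lambda. The performance difference is negligible in practice, so the choice is mostly a matter of readability.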

Parallel Streams

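For parallel streams, consider groupingByConcurrent (or toConcurrentMap for the toMap case): it uses a ConcurrentHashMap, and all threads write into one shared map, so no combiner step is needed. With plain groupingBy or toMap, each thread fills its own map and the combiner then merges these maps key by key, which can make the parallel version slower than the sequential one. Note that the concurrent collectors do not preserve encounter order.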
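
A small sketch of groupingByConcurrent on a parallel stream. The word list and the grouping by length are made up for illustration; counting is used as the downstream collector because its result does not depend on the order in which threads process elements.

```java
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

class ConcurrentGrouping {
    // Counts words by length. All worker threads update one shared
    // ConcurrentMap, so no per-thread maps have to be merged afterwards.
    static ConcurrentMap<Integer, Long> countByLength(List<String> words) {
        return words.parallelStream()
                .collect(Collectors.groupingByConcurrent(
                        String::length, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "bb", "cc", "ddd", "e");
        System.out.println(countByLength(words).get(2)); // 2
    }
}
```

Whether this beats the sequential version depends on data size and contention, so measure before relying on it.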

Diagnostics

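When toMap throws IllegalStateException, log the duplicate keys. The JDK message is not always informative: on Java 8 it reports the conflicting value rather than the key, and only since Java 9 does it include the key together with both values.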
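
One way to find the offending keys before (or after) a failed toMap call is to count elements per key and keep the keys that occur more than once. The helper below is a hypothetical utility written for this note, not part of the JDK:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class DuplicateKeyDiagnostics {
    // Hypothetical helper: returns every key that the key mapper
    // produces for more than one element.
    static <T, K> Set<K> duplicateKeys(List<T> items, Function<T, K> keyMapper) {
        Map<K, Long> counts = items.stream()
                .collect(Collectors.groupingBy(keyMapper, Collectors.counting()));
        return counts.entrySet().stream()
                .filter(e -> e.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Set<String> duplicates = duplicateKeys(List.of("Ann", "Bob", "Ann"), s -> s);
        System.out.println(duplicates); // [Ann]
    }
}
```

The result can be logged next to the exception, or checked up front to fail with a clearer message.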


🎯 Interview Cheat Sheet

Must know:

  • Collectors.toMap(keyMapper, valueMapper) — basic collection, throws IllegalStateException on collision
  • 3rd argument of toMap — merge function: (old, replacement) -> old (keep first)
  • 4th argument — Map factory: LinkedHashMap::new preserves insertion order
  • groupingBy() — when one key corresponds to many values (Map<K, List<V>>)
  • groupingBy + counting() — for counting elements by group
  • toMap does not allow null in values — implementation limitation
  • groupingByConcurrent for parallel streams — writes to ConcurrentHashMap without combiner
  • For Enum keys, use EnumMap — O(1) access, minimal memory

Frequent follow-up questions:

  • What happens on a duplicate key in toMap? — IllegalStateException. Solution: add a merge function as the third argument.
  • How does toMap differ from groupingBy? — toMap: one element per key; groupingBy: list of elements per key (Map<K, List<V>>).
  • How to preserve insertion order? — Fourth argument: .toMap(key, value, merge, LinkedHashMap::new).
  • Why does toMap not accept null values? — Limitation of the internal JDK implementation: it relies on Map.merge (plus an explicit null check in newer JDKs), which throws NPE on a null value.

Red flags (DO NOT say):

  • “toMap automatically resolves collisions” — incorrect, without a merge function it throws IllegalStateException
  • “groupingBy returns Map<K, V>” — incorrect, by default it returns Map<K, List<V>>
  • “toMap supports null values” — incorrect, it throws NPE
  • “parallelStream with toMap is always faster” — incorrect, without toConcurrentMap or groupingByConcurrent it can be slower because the combiner has to merge per-thread maps

Related topics:

  • [[28. What to do about key collisions when collecting into Map]]
  • [[23. What do distinct(), sorted(), limit(), skip() operations do]]
  • [[29. How to work with Optional in Stream]]
  • [[21. What is lazy evaluation in Stream]]