Question 14 · Section 8

Can you modify state of external variables in Stream operations?

Technically — yes, but it is strongly discouraged.

Junior Level

Java requires variables in lambdas to be final or effectively final:

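// WILL NOT COMPILE
int sum = 0;
list.stream().forEach(n -> sum += n); // error: local variables referenced from a lambda expression must be final or effectively final

// COMPILES with AtomicInteger (java.util.concurrent.atomic.AtomicInteger)
AtomicInteger sum = new AtomicInteger(0);
list.stream().forEach(n -> sum.addAndGet(n)); // compiles, but it is still a side effect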

Correct approach: let the stream do the accumulation with sum(), reduce(), or collect():

int sum = list.stream().mapToInt(Integer::intValue).sum();
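
The three reduction styles can be compared side by side. This is a minimal sketch: the class name SumExamples and the sample list are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class: three ways to sum without touching outside state.
final class SumExamples {

    // Primitive specialization: unboxes each Integer, then sums.
    static int sumWithIntStream(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).sum();
    }

    // General reduction: identity 0 plus an associative accumulator.
    static int sumWithReduce(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // Collector-based reduction.
    static int sumWithCollect(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.summingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4);
        System.out.println(sumWithIntStream(numbers)); // prints 10
        System.out.println(sumWithReduce(numbers));    // prints 10
        System.out.println(sumWithCollect(numbers));   // prints 10
    }
}
```

All three return the same value and none of them needs a variable declared outside the pipeline.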

Middle Level

“Effectively Final” restriction

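A lambda captures the value of a local variable (the value is copied into the lambda object when the lambda is created), not the variable itself. If reassignment were allowed, the variable and the copy inside the lambda could silently diverge, so the compiler forbids it. For an object variable the copied value is a reference, which is why the object it points to can still be mutated.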

Why this restriction: the lambda may execute later, when the method has already finished and its stack frame is destroyed. Copying guarantees that the data will be alive.
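
The point about the lambda outliving its method can be seen in a small sketch. The class CaptureExample and the method makeAdder are hypothetical names used only for this illustration.

```java
import java.util.function.IntSupplier;

// Hypothetical demo class: a lambda that outlives the method that created it.
final class CaptureExample {

    // Returns a lambda that is still usable after this method has returned.
    static IntSupplier makeAdder(int base) {
        int offset = 10;             // effectively final: assigned exactly once
        // offset++;                 // uncommenting this makes the lambda below fail to compile
        return () -> base + offset;  // the values of base and offset are copied here
    }

    public static void main(String[] args) {
        IntSupplier adder = makeAdder(5);     // makeAdder has already returned
        System.out.println(adder.getAsInt()); // prints 15
    }
}
```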

Why this is an antipattern

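1. Violation of functional purity. Streams follow the functional programming paradigm: the functions passed to stream operations should be pure, that is, free of side effects.

2. Parallelism collapse. In parallel streams, modifying a shared variable leads to:

  • Race Conditions: Multiple threads read and write the same variable at once, so updates get lost
  • Visibility Issues: Without a happens-before relationship (volatile, a lock, an atomic), the Java Memory Model does not guarantee that one thread's write is visible to another
  • Synchronization: Making the variable atomic or synchronized serializes the threads on it, which defeats the purpose of parallelism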
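
The lost-update problem can be reproduced with a one-element array used as a mutable wrapper. This is a sketch under assumptions: RaceExample and its methods are illustrative names, and the unsafe result varies from run to run and from machine to machine.

```java
import java.util.stream.IntStream;

// Hypothetical demo class: counting even numbers with and without shared state.
final class RaceExample {

    // Unsafe: counter[0]++ is an unsynchronized read-modify-write,
    // so concurrent increments can overwrite each other.
    static int unsafeCountEvens(int n) {
        int[] counter = {0};
        IntStream.range(0, n)
                .parallel()
                .filter(i -> i % 2 == 0)
                .forEach(i -> counter[0]++);
        return counter[0];
    }

    // Safe: the stream does the counting, there is no shared mutable state.
    static long safeCountEvens(int n) {
        return IntStream.range(0, n)
                .parallel()
                .filter(i -> i % 2 == 0)
                .count();
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("unsafe: " + unsafeCountEvens(n)); // may be less than 500000
        System.out.println("safe:   " + safeCountEvens(n));   // prints safe:   500000
    }
}
```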

Examples

// BAD — Side Effect
List<Integer> result = new ArrayList<>();
stream.filter(x -> x > 10).forEach(result::add);

// GOOD — Reduction
List<Integer> result = stream.filter(x -> x > 10).collect(Collectors.toList());

Senior Level

Contention on Atomic

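If many worker threads of a parallel stream increment one AtomicInteger, every update needs exclusive ownership of the same cache line, so the cores serialize on it (through a CAS retry loop or an atomic add instruction, depending on the JVM and CPU). The stream can become ~10 times slower than a regular for loop (figure quoted for a JMH run on an 8-core CPU with 1M elements and AtomicInteger contention). The exact number depends on the JVM, hardware, and load. Note that "100 threads" is illustrative: by default a parallel stream runs on the common ForkJoinPool, which is sized to the number of available cores minus one.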
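
The two variants being compared look roughly like this. It is a sketch, not a benchmark: ContentionExample is a made-up name, AtomicLong stands in for AtomicInteger to avoid overflow, and any timing claim should be checked with JMH on your own hardware.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.LongStream;

// Hypothetical demo class: same result, very different amount of sharing.
final class ContentionExample {

    // Thread-safe but contended: every element updates the same AtomicLong.
    static long sumWithSharedAtomic(long n) {
        AtomicLong total = new AtomicLong();
        LongStream.rangeClosed(1, n).parallel().forEach(total::addAndGet);
        return total.get();
    }

    // No shared state: each worker sums its own chunk, partial sums are merged.
    static long sumWithReduction(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println(sumWithSharedAtomic(100_000)); // prints 5000050000
        System.out.println(sumWithReduction(100_000));    // prints 5000050000
    }
}
```

Both methods are correct. The difference is that the first one touches a single shared memory location once per element, while the second touches shared state only when partial results are combined.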

GC Pressure

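Each mutable “wrapper” (a one-element array, an AtomicInteger, a holder object) created only to get around the final restriction is an extra heap allocation. One wrapper is cheap, but creating them per call on a hot path adds work for the garbage collector.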

Edge Cases

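  • AtomicInteger as counter: The final total is correct, but in a parallel stream the order in which elements receive values is unpredictable, so using the counter as an element index gives scrambled indices
  • Collecting into Map via forEach: In a parallel stream, concurrent put() calls on a plain HashMap can lose entries or corrupt the map, and may surface as ConcurrentModificationException. Use a collector instead: toMap() if keys are unique, groupingBy() or toMap() with a merge function if keys can repeat (the two-argument toMap throws IllegalStateException: Duplicate key).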
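
The Map collectors mentioned above can be sketched as follows. MapCollectExample and the choice of word length as the key are assumptions made for this illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class: building a Map without forEach + put.
final class MapCollectExample {

    // Unique keys expected: throws IllegalStateException on a duplicate key.
    static Map<Integer, String> byLengthUnique(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(String::length, w -> w));
    }

    // Repeating keys: all values with the same key end up in one list.
    static Map<Integer, List<String>> byLengthGrouped(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(String::length));
    }

    // Repeating keys, one value per key: the merge function picks the winner.
    static Map<Integer, String> byLengthFirstWins(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(String::length, w -> w, (first, second) -> first));
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "bb", "cc");
        System.out.println(byLengthGrouped(words).get(2));   // prints [bb, cc]
        System.out.println(byLengthFirstWins(words).get(2)); // prints bb
        try {
            byLengthUnique(words);
        } catch (IllegalStateException e) {
            System.out.println("duplicate key rejected");    // prints duplicate key rejected
        }
    }
}
```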

Diagnostics

Static Analysis: SonarQube rule “Stream operations should be side-effect free” finds such places.

Rule of Thumb: If you need to modify an external variable inside a stream — you chose the wrong terminal operation. Look at reduce() or collect().
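
A common reason for reaching for an outside counter is numbering elements. One way to apply the rule of thumb there is to stream over the indices instead. A minimal sketch, assuming a random-access list; IndexExample and numbered are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class: indexing without a shared counter.
final class IndexExample {

    // Each element gets its index from the stream itself, so nothing is mutated
    // and the result keeps encounter order even when the stream is parallel.
    static List<String> numbered(List<String> items) {
        return IntStream.range(0, items.size())
                .parallel()
                .mapToObj(i -> i + ": " + items.get(i))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(numbered(List.of("a", "b", "c"))); // prints [0: a, 1: b, 2: c]
    }
}
```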


Interview Cheat Sheet

Must know:

  • Variables in lambdas must be final or effectively final: the lambda captures a copy of the variable's value, not the variable itself
  • AtomicInteger compiles, but in a parallel stream it creates contention (all threads compete for one cache line), and the stream can be ~10 times slower
  • Modifying external variables violates the functional programming paradigm on which streams are built
  • In parallel streams, mutation of shared variables leads to race conditions, visibility issues, data loss
  • Correct approach: reduce(), collect(), sum() — they encapsulate accumulation
  • For Map collection with repeating keys use groupingBy(), not toMap() (otherwise IllegalStateException: Duplicate key)
  • SonarQube finds such places with the rule “Stream operations should be side-effect free”

Common follow-up questions:

  • Why do lambdas require effectively final? The lambda may execute later, after the method's stack frame is gone; copying the value guarantees the data is still alive
  • How does forEach(result::add) differ from collect(toList())? forEach mutates an external ArrayList (a side effect); collect is a reduction in which the stream creates and fills the container itself, which also keeps it safe in parallel
  • Why are 100 threads with AtomicInteger slower than a for loop? Every update needs exclusive ownership of the one cache line holding the counter, so the line bounces between cores and the threads effectively run one at a time
  • Which terminal operation replaces list mutation? filter(...).collect(Collectors.toList())

Red flags (DO NOT say):

  • “AtomicInteger is a safe way to mutate in a parallel stream” — it is thread-safe, but contention kills performance
  • “You can modify a variable if you wrap it in a class” — this bypasses the compiler, but the thread-safety problem remains
  • “Mutation is safe in a regular stream” — it compiles, but violates the functional paradigm and blocks parallelism
  • “forEach is a normal way to collect results” — this is an antipattern, use collect()

Related topics:

  • [[15. What are side effects in Stream]]
  • [[16. Why you should avoid side effects in Stream]]
  • [[17. What does reduce() operation do]]
  • [[18. What is the difference between reduce() and collect()]]
  • [[12. What potential problems can occur with parallel streams]]