Can you modify the state of external variables in Stream operations?
Technically — yes, but it is strongly discouraged.
Junior Level
Java requires local variables captured by a lambda to be final or effectively final (assigned exactly once):
// WILL NOT COMPILE
int sum = 0;
list.stream().forEach(n -> sum += n); // error: local variables referenced from a lambda expression must be final or effectively final
// COMPILES with AtomicInteger
AtomicInteger sum = new AtomicInteger(0);
list.stream().forEach(n -> sum.addAndGet(n)); // compiles, but relies on a side effect
Correct approach: use a built-in reduction such as sum(), reduce() or collect():
int sum = list.stream().mapToInt(Integer::intValue).sum();
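For comparison, the same total can be computed with each of the reductions mentioned above. This is a minimal sketch; the class and method names (SumWithoutMutation, sumWithReduce and so on) are made up for illustration and are not part of any library.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class: three side-effect-free ways to sum a list.
final class SumWithoutMutation {

    // reduce(): folds the elements using an identity value and an associative operator.
    static int sumWithReduce(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // collect(): hands accumulation over to a ready-made collector.
    static int sumWithCollect(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.summingInt(Integer::intValue));
    }

    // mapToInt().sum(): primitive stream, so no boxing while accumulating.
    static int sumWithIntStream(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).sum();
    }
}
```

None of the three variants declares a variable outside the pipeline, so the effectively final rule never comes into play.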
Middle Level
“Effectively Final” restriction
A lambda captures the value of a local variable (the value is copied into a hidden field), not a reference to the stack. If the variable could be reassigned in the enclosing method afterwards, the copy inside the lambda would become stale, so the compiler forbids reassignment.
Why this restriction exists: the lambda may execute later, after the method has already returned and its stack frame has been destroyed. Copying guarantees that the data is still alive.
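The "lambda outlives the method" case can be sketched as follows. CaptureDemo and makeGreeter are hypothetical names used only for this illustration.

```java
import java.util.function.Supplier;

// Hypothetical example: the returned lambda runs after makeGreeter() has finished.
final class CaptureDemo {

    static Supplier<String> makeGreeter(String name) {
        String greeting = "Hello, " + name; // effectively final: assigned exactly once
        // The lambda keeps its own copy of the value of `greeting`
        // (for an object type, a copy of the reference), so it does not
        // depend on the stack frame of makeGreeter() still existing.
        return () -> greeting + "!";
    }
}
```

Calling CaptureDemo.makeGreeter("Ann").get() works even though the method that declared greeting returned long before get() is invoked. Adding a second assignment to greeting anywhere in makeGreeter would turn the lambda into a compile error.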
Why this is an antipattern
1. Violation of functional purity. Streams are designed around the functional programming paradigm, so the functions passed to stream operations should be pure (free of side effects).
2. Parallelism collapse. In parallel streams, modifying a shared variable leads to:
- Race conditions: multiple threads read and write the variable simultaneously, so updates get lost
- Visibility issues: without synchronization, a write made by one thread is not guaranteed to be visible to another
- Synchronization cost: Atomic classes or synchronized blocks restore correctness but serialize the threads, which kills the purpose of parallelism
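The race condition described above can be reproduced with a one-element array used as a mutable wrapper. This is only a sketch with hypothetical names (ParallelCounting and its methods); how many updates are actually lost varies from run to run and from machine to machine.

```java
import java.util.stream.IntStream;

// Hypothetical example: counting even numbers in [0, n) in a parallel stream.
final class ParallelCounting {

    // UNSAFE: the array bypasses the effectively final rule, but
    // counter[0]++ is a non-atomic read-modify-write on shared state.
    static int countEvensWithSharedArray(int n) {
        int[] counter = {0};
        IntStream.range(0, n).parallel()
                .filter(i -> i % 2 == 0)
                .forEach(i -> counter[0]++);
        return counter[0]; // can be lower than the real count when updates are lost
    }

    // SAFE: count() is a built-in reduction; each worker counts its own chunk.
    static long countEvensWithReduction(int n) {
        return IntStream.range(0, n).parallel()
                .filter(i -> i % 2 == 0)
                .count();
    }
}
```

The unsafe variant may return the correct number on a lightly loaded machine, which is exactly what makes this kind of bug hard to catch in tests.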
Examples
// BAD — Side Effect
List<Integer> result = new ArrayList<>();
stream.filter(x -> x > 10).forEach(result::add);
// GOOD — Reduction
List<Integer> result = stream.filter(x -> x > 10).collect(Collectors.toList());
Senior Level
Contention on Atomic
If 100 threads of a parallel stream increment one AtomicInteger, they contend for the same cache line, and every failed CAS is retried in a loop. The stream can become roughly 10 times slower than a regular for loop (a figure from a JMH run on an 8-core CPU with 1M elements and a contended AtomicInteger). Your numbers will depend on the JVM, the hardware and the load.
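The two accumulation strategies can be put side by side as below. This is a sketch with hypothetical names (ContentionSketch and its methods), and it is not a benchmark: it only shows where the shared write happens. Measuring the actual slowdown would need a harness such as JMH.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

// Hypothetical example: summing 1..n in a parallel stream in two ways.
final class ContentionSketch {

    // Correct result, but every worker thread performs its CAS
    // on the same AtomicInteger, so the threads contend with each other.
    static int sumWithAtomic(int n) {
        AtomicInteger total = new AtomicInteger();
        IntStream.rangeClosed(1, n).parallel().forEach(total::addAndGet);
        return total.get();
    }

    // Each worker sums its own chunk; partial sums are combined at the end,
    // so there is no shared variable to fight over.
    static int sumWithReduction(int n) {
        return IntStream.rangeClosed(1, n).parallel().sum();
    }
}
```

Both methods return the same value for the same n; the difference lies only in how much the threads have to coordinate.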
GC Pressure
Creating mutable “wrappers” (AtomicInteger, one-element arrays) to bypass the final restriction allocates extra short-lived objects on the heap.
Edge Cases
- AtomicInteger as an index counter: in a parallel stream the order in which elements receive values is unpredictable (indices will be scrambled)
- Collecting into a Map via forEach: risk of ConcurrentModificationException or lost entries when the map is not thread-safe. If keys are unique, use toMap(). If keys can repeat, use groupingBy() (otherwise toMap() throws IllegalStateException: Duplicate key).
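The Map case above can be sketched with the two collectors. MapCollection and its method names are hypothetical; Collectors.toMap and Collectors.groupingBy are the standard JDK collectors.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical example: building maps without mutating an external Map in forEach.
final class MapCollection {

    // Unique keys: the two-argument toMap() is enough.
    // It throws IllegalStateException if the same key appears twice.
    static Map<String, Integer> lengthByWord(List<String> words) {
        return words.stream().collect(Collectors.toMap(w -> w, String::length));
    }

    // Repeating keys: groupingBy() gathers all values for a key into a list.
    static Map<Integer, List<String>> wordsByLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }
}
```

When repeated keys should be merged into a single value rather than collected into a list, the three-argument overload of toMap() with a merge function is another option.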
Diagnostics
Static Analysis: SonarQube rule “Stream operations should be side-effect free” finds such places.
Rule of Thumb: If you need to modify an external variable inside a stream — you chose the wrong terminal operation. Look at reduce() or collect().
Interview Cheat Sheet
Must know:
- Variables in lambdas must be final or effectively final: the lambda captures a copy of the value, not a reference
- AtomicInteger compiles, but in a parallel stream creates contention (CAS loop on the memory bus), and the stream can be ~10 times slower
- Modifying external variables violates the functional programming paradigm on which streams are built
- In parallel streams, mutation of shared variables leads to race conditions, visibility issues, data loss
- Correct approach: reduce(), collect(), sum(), because they encapsulate accumulation
- For Map collection with repeating keys use groupingBy(), not toMap() (otherwise IllegalStateException: Duplicate key)
- SonarQube finds such places with the rule “Stream operations should be side-effect free”
Common follow-up questions:
- Why do lambdas require effectively final? The lambda may execute later, after the method’s stack frame is destroyed; copying guarantees the data is still alive
- How does forEach(result::add) differ from collect(toList())? forEach mutates an external ArrayList (side effect), while collect is a mutable reduction whose result container is managed by the stream itself
- Why are 100 threads with AtomicInteger slower than a for loop? The CAS loop on one cache line causes cache thrashing: threads keep rereading the same memory line
- Which terminal operation replaces list mutation? filter(...).collect(Collectors.toList())
Red flags (DO NOT say):
- “AtomicInteger is a safe way to mutate in a parallel stream” — it is thread-safe, but contention kills performance
- “You can modify a variable if you wrap it in a class” — this bypasses the compiler, but the thread-safety problem remains
- “Mutation is safe in a regular stream” — it compiles, but violates the functional paradigm and blocks parallelism
- “forEach is a normal way to collect results”: this is an antipattern, use collect()
Related topics:
- [[15. What are side effects in Stream]]
- [[16. Why you should avoid side effects in Stream]]
- [[17. What does reduce() operation do]]
- [[18. What is the difference between reduce() and collect()]]
- [[12. What potential problems can occur with parallel streams]]