What are side effects in Stream?
Side effects make code unpredictable, especially in parallel streams.
🟢 Junior Level
Side effect — any action inside a stream that modifies something outside of the stream itself.
Examples of side effects:
- Modifying external variables
- Printing to the console (System.out.println)
- Writing to a file or database
- Modifying objects inside the stream
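// Side effect: modifying an external variable
// (names is an assumed List<String>; each stream can be consumed only once)
List<String> result = new ArrayList<>();
names.stream().forEach(result::add);
// Side effect: printing to the console
names.stream().peek(System.out::println).collect(Collectors.toList());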
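For comparison, here is a minimal sketch of the same kind of task written with and without a side effect. The class and method names are made up for illustration; the point is only that collect() lets the stream build the result itself instead of mutating a list declared outside it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class CollectInsteadOfMutate {

    // Side-effecting version: the lambda mutates a list declared outside the stream.
    static List<String> upperWithSideEffect(List<String> names) {
        List<String> result = new ArrayList<>();
        names.stream().map(String::toUpperCase).forEach(result::add);
        return result;
    }

    // Side-effect-free version: the stream builds and returns the list itself.
    static List<String> upperWithCollect(List<String> names) {
        return names.stream().map(String::toUpperCase).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("ann", "bob");
        System.out.println(upperWithSideEffect(names));
        System.out.println(upperWithCollect(names));
    }
}
```

Both methods return the same list on a sequential stream. Only the second stays correct if the stream is later switched to parallel.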
🟡 Middle Level
What counts as a side effect
- Modifying values of external variables (mutating objects on the heap)
- Performing I/O operations (printing, writing to a file, network request)
- Modifying element state inside the stream
Three principles of a safe stream
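For correct operation (especially in parallel), the functions passed to a stream must satisfy:
- Identity: The initial value passed to reduce() must be neutral, so combining it with any element returns that element (0 for addition, 1 for multiplication)
- Associativity: (a op b) op c must equal a op (b op c), so the way elements are grouped does not affect the result
- Non-interference: The data source must not be modified while the stream is executing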
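The identity and associativity rules are easiest to see with reduce(). The sketch below uses made-up class and method names and assumes nothing beyond the standard IntStream API. It contrasts a well-formed reduction with two broken ones.

```java
import java.util.stream.IntStream;

class ReducePrinciples {

    private static IntStream numbers(boolean parallel) {
        IntStream s = IntStream.rangeClosed(1, 100);
        return parallel ? s.parallel() : s;
    }

    // Neutral identity (0) and associative operator (+):
    // the result is the same sequentially and in parallel.
    static int sum(boolean parallel) {
        return numbers(parallel).reduce(0, Integer::sum);
    }

    // Broken identity: 10 is not neutral for addition.
    // Sequentially it is added once; in parallel it is added once per chunk,
    // so the result depends on how the range happens to be split.
    static int sumWithBadIdentity(boolean parallel) {
        return numbers(parallel).reduce(10, Integer::sum);
    }

    // Broken associativity: (a - b) - c differs from a - (b - c),
    // so a parallel run may combine partial results into a different number.
    static int subtract(boolean parallel) {
        return numbers(parallel).reduce(0, (a, b) -> a - b);
    }

    public static void main(String[] args) {
        System.out.println(sum(false) + " vs " + sum(true));
        System.out.println(sumWithBadIdentity(false) + " vs " + sumWithBadIdentity(true));
        System.out.println(subtract(false) + " vs " + subtract(true));
    }
}
```

Only the first pair is guaranteed to match. The parallel values of the other two depend on the number of chunks and can change between machines.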
Dangers
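Unpredictability: In a parallel stream, side effects execute in a nondeterministic order, and unsynchronized writes to shared state can be lost.
Performance degradation: Making side effects thread-safe requires synchronization, so threads queue up and the benefit of parallelism disappears.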
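The ordering problem can be sketched as follows. The names are illustrative. The synchronized list is used only so that no element is lost; it does not restore order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

class ParallelOrderDemo {

    // forEach on a parallel stream: the synchronized list keeps every element,
    // but the order in which they arrive depends on thread scheduling.
    static List<Integer> unordered(int n) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).boxed().parallel().forEach(out::add);
        return out;
    }

    // forEachOrdered processes elements one at a time in encounter order,
    // which restores determinism at the cost of most of the parallelism.
    static List<Integer> ordered(int n) {
        List<Integer> out = new ArrayList<>();
        IntStream.range(0, n).boxed().parallel().forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(unordered(10)); // order may differ between runs
        System.out.println(ordered(10));
    }
}
```

Both lists contain the same elements. Only the second is guaranteed to be in encounter order.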
Legitimate side effects
Only in two places:
- forEach(): Terminal operation for actions “to the outside world”
- peek(): Debugging only (logging intermediate states)
🔴 Senior Level
The Atomic Reference Trap
Using an AtomicInteger to count inside forEach of a parallel stream:
- Every worker thread writes to the same variable, and therefore to the same cache line (a cache line is a block of memory, typically 64 bytes, that the CPU loads and invalidates as a unit). Each write invalidates that line in the other cores' caches, so the line bounces between cores and every increment has to wait for exclusive ownership of it.
- The stream can end up slower than a regular for loop
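A minimal sketch of the two approaches, with illustrative names: the first funnels every increment through one shared counter, the second lets the stream count per chunk and combine the partial results.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CountingDemo {

    // Shared counter: correct, but every thread contends on the same variable.
    static int countEvenWithAtomic(List<Integer> data) {
        AtomicInteger counter = new AtomicInteger();
        data.parallelStream().forEach(x -> {
            if (x % 2 == 0) {
                counter.incrementAndGet();
            }
        });
        return counter.get();
    }

    // Reduction: each chunk is counted independently, then the counts are combined.
    static long countEvenWithReduction(List<Integer> data) {
        return data.parallelStream().filter(x -> x % 2 == 0).count();
    }

    public static void main(String[] args) {
        List<Integer> data = IntStream.range(0, 10_000).boxed().collect(Collectors.toList());
        System.out.println(countEvenWithAtomic(data));
        System.out.println(countEvenWithReduction(data));
    }
}
```

Both methods return the same number. How much faster the reduction is depends on the hardware and the data size, so it is worth measuring rather than assuming.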
Transactional Side Effects
In most cases, do not call repo.save() inside map or forEach of a parallel stream. Exception: batch processing where each DB write is independent and you control transactions manually.
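One possible way to keep the write out of the pipeline is to do the pure transformation in the stream and persist the result afterwards in a single call. The sketch below is an assumption-laden illustration: OrderRepo, saveAll, and InMemoryRepo are hypothetical stand-ins, not the API of any particular framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class BatchSaveSketch {

    // Hypothetical repository interface; a real one would come from the application.
    interface OrderRepo {
        void saveAll(List<String> rows);
    }

    // In-memory stand-in used only to make the sketch runnable.
    static class InMemoryRepo implements OrderRepo {
        final List<String> saved = new ArrayList<>();

        @Override
        public void saveAll(List<String> rows) {
            saved.addAll(rows);
        }
    }

    static void importOrders(List<Integer> ids, OrderRepo repo) {
        // Pure, parallel-safe transformation: no I/O inside the pipeline.
        List<String> rows = ids.parallelStream()
                .map(id -> "order-" + id)
                .collect(Collectors.toList());
        // One write on the calling thread, where a transaction can wrap it.
        repo.saveAll(rows);
    }

    public static void main(String[] args) {
        InMemoryRepo repo = new InMemoryRepo();
        importOrders(List.of(1, 2, 3), repo);
        System.out.println(repo.saved);
    }
}
```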
Logging Overhead
stream.peek(log::info) under high load can slow processing dramatically (by as much as 100x) because of locks inside the logger.
Edge Cases
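Missing Side Effects: If you use peek() for business logic, it may never run. In Java 9+, a terminal count() on a source with the SIZED characteristic (ArrayList, LinkedList, array) can take the size directly from the source and skip the entire pipeline, provided no intermediate operation (such as filter() or flatMap()) can change the number of elements. This is an optimization in the Stream implementation, not in the JVM. When the size is not known up front, for example after filter() or on a source such as BufferedReader.lines(), peek() will execute.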
Deadlock Potential: A side effect that acquires a lock can deadlock with another ForkJoinPool thread.
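The count() case above can be observed with a small experiment. This is a sketch with illustrative names. The result of the first method is implementation-dependent, because the specification only says that count() may skip the pipeline.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class PeekCountDemo {

    // count() directly on a sized source: the implementation is allowed to
    // return the size without running peek() at all.
    static int peekCallsWithPlainCount(List<String> items) {
        AtomicInteger calls = new AtomicInteger();
        items.stream().peek(x -> calls.incrementAndGet()).count();
        return calls.get();
    }

    // filter() makes the final size unknown, so the pipeline has to run
    // and peek() sees every element.
    static int peekCallsWithFilterThenCount(List<String> items) {
        AtomicInteger calls = new AtomicInteger();
        items.stream()
                .peek(x -> calls.incrementAndGet())
                .filter(x -> !x.isEmpty())
                .count();
        return calls.get();
    }

    public static void main(String[] args) {
        List<String> items = List.of("a", "b", "c");
        System.out.println("plain count: peek ran " + peekCallsWithPlainCount(items) + " times");
        System.out.println("with filter: peek ran " + peekCallsWithFilterThenCount(items) + " times");
    }
}
```

On a runtime that applies the optimization, the first line reports zero calls while the second reports one call per element.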
Diagnostics
Always test streams with side effects on parallelStream(). If the results “float” (vary from run to run), it is a design bug.
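Such a check can be sketched as a small helper that runs the same pipeline sequentially and then several times in parallel, comparing the results. The helper name and signature are assumptions made for illustration.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ParallelCheck {

    // Returns true if every parallel run produced the same result as the sequential run.
    static <T, R> boolean isStable(List<T> data, Function<Stream<T>, R> pipeline, int runs) {
        R expected = pipeline.apply(data.stream());
        for (int i = 0; i < runs; i++) {
            R actual = pipeline.apply(data.parallelStream());
            if (!expected.equals(actual)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> data = IntStream.range(0, 10_000).boxed().collect(Collectors.toList());
        // A pure pipeline: expected to be stable.
        boolean stable = isStable(data, s -> s.map(x -> x * 2).collect(Collectors.toList()), 20);
        System.out.println("pure pipeline stable: " + stable);
    }
}
```

A failed check proves there is a problem. A passed check does not prove its absence, because a race may simply not have shown up in those runs.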
🎯 Interview Cheat Sheet
Must know:
- Side effect — any action inside a stream that modifies something outside it (variable mutation, I/O, printing)
- Side effects make parallel streams unpredictable: execution order is random
- Three principles of a safe stream: Identity (neutral element), Associativity (grouping order doesn’t matter), Non-interference (source doesn’t change)
- Legitimate side effects only in two places: forEach() (terminal) and peek() (debugging)
- AtomicInteger in forEach of a parallel stream: cache line contention, so the stream can be slower than a regular for loop
- In Java 9+ peek() may not execute on SIZED sources with count(), because the Stream implementation can skip the pipeline
- stream.peek(log::info) under high load slows processing by up to 100x due to logger locks
Frequent follow-up questions:
- What counts as a side effect? — Mutation of external variables, I/O, changing object state inside the stream
- Can I use peek() for business logic? — Absolutely not; in Java 9+ it may not execute on SIZED sources with count()
- What is Non-interference? — The stream data source must not be modified during pipeline execution, otherwise a ConcurrentModificationException may be thrown
- When is a side effect acceptable in forEach? — When the action is irreversible and one-off: sending an email, writing to a log
Red flags (DO NOT say):
- “peek() is a fine place for business logic” — it may not execute; use map() for production
- “AtomicInteger solves all concurrency problems” — contention on CAS kills performance
- “Side effects are safe in sequential streams” — they block future parallelism and violate the functional paradigm
- “repo.save() inside map of a parallel stream is fine” — unmanaged transactions, no rollback, connection pool exhaustion
Related topics:
- [[Why you should avoid side effects in Stream]]
- [[Can you modify external variable state in Stream operations]]
- [[What is peek() operation and when to use it]]
- [[What potential problems can occur with parallel streams]]