Question 15 · Section 8

What are side effects in Stream?

Side effects make code unpredictable, especially in parallel streams.

🟢 Junior Level

Side effect — any action inside a stream that modifies something outside of the stream itself.

Examples of side effects:

  • Modifying external variables
  • Printing to console (System.out.println)
  • Writing to a file or database
  • Modifying the elements that pass through the stream (for example, calling a setter on them)
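
// `stream` stands for any existing Stream<String>;
// toList() is Collectors.toList(), statically imported.

// Side effect: forEach mutates a list declared outside the stream
List<String> result = new ArrayList<>();
stream.forEach(result::add);

// Side effect: peek prints every element to the console
stream.peek(System.out::println).collect(toList());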
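
For comparison, here is a minimal sketch of the same idea with and without the side effect. The class name `CollectInsteadOfForEach`, the method names and the sample data are illustrative assumptions, not part of the original question. The first method fills an external list from forEach; the second lets the stream build the list itself through collect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class CollectInsteadOfForEach {

    // Side-effecting version: the lambda mutates a list that lives outside the stream.
    static List<String> withSideEffect(List<String> source) {
        List<String> result = new ArrayList<>();
        source.stream()
              .map(String::toUpperCase)
              .forEach(result::add);
        return result;
    }

    // Side-effect-free version: the list is the result of the pipeline itself.
    static List<String> withoutSideEffect(List<String> source) {
        return source.stream()
                     .map(String::toUpperCase)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("ann", "bob");
        System.out.println(withSideEffect(names));    // [ANN, BOB]
        System.out.println(withoutSideEffect(names)); // [ANN, BOB]
    }
}
```

Both versions return the same list on a sequential stream. Only the second stays correct if the stream is later switched to parallel, because ArrayList is not thread-safe.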

Side effects make code unpredictable, especially in parallel streams.

🟡 Middle Level

What counts as a side effect

  1. Modifying values of external variables (mutating objects on the heap)
  2. Performing I/O operations (printing, writing to a file, network request)
  3. Modifying element state inside the stream

Three principles of a safe stream

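For correct operation (especially in parallel), the following must hold:

  1. Identity: The initial value passed to reduce() must be neutral, so that combining it with any element returns that element unchanged (0 for addition, 1 for multiplication)
  2. Associativity: The combining function must give the same result regardless of grouping: (a op b) op c equals a op (b op c)
  3. Non-interference: The data source must not be modified during stream execution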
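
The three principles can be sketched in code as follows. The class and method names are hypothetical, chosen only for illustration. The example assumes addition as the reduction and an ArrayList as the source; the exact parallel result of the bad-identity case depends on how the runtime splits the work.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.stream.IntStream;

class SafeStreamPrinciples {

    // Identity and associativity hold: 0 is neutral for +, and + is associative,
    // so sequential and parallel runs agree (5050 for 1..100).
    static int sum(boolean parallel) {
        IntStream s = IntStream.rangeClosed(1, 100);
        return (parallel ? s.parallel() : s).reduce(0, Integer::sum);
    }

    // Identity violated: 10 is not neutral for +. A sequential run adds it once (5060);
    // a parallel run adds it once per chunk, so the result can be larger.
    static int sumWithBadIdentity(boolean parallel) {
        IntStream s = IntStream.rangeClosed(1, 100);
        return (parallel ? s.parallel() : s).reduce(10, Integer::sum);
    }

    // Non-interference violated: the lambda adds to the list the stream is reading.
    // ArrayList's fail-fast check reports this as a ConcurrentModificationException.
    static boolean interferenceDetected() {
        List<Integer> source = new ArrayList<>(List.of(1, 2, 3));
        try {
            source.stream().forEach(x -> source.add(x * 10));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(false) + " " + sum(true));
        System.out.println(sumWithBadIdentity(false) + " vs " + sumWithBadIdentity(true));
        System.out.println("interference detected: " + interferenceDetected());
    }
}
```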

Dangers

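Unpredictability: In a parallel stream, side effects run in a nondeterministic order and possibly at the same time on different threads.

Performance degradation: Side effects on shared state require synchronization, so threads queue up on the same lock and the parallel speedup is lost.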
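
Both dangers can be seen in a small sketch. The class name `ParallelOrderDemo` and its methods are illustrative assumptions. The first method pushes elements into a synchronized shared list from a parallel forEach; the second avoids shared state by using collect.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelOrderDemo {

    // Side effect into a shared list: synchronization keeps every element,
    // but all threads contend for one lock and the arrival order is not defined.
    static List<Integer> viaSharedList(int n) {
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).parallel().boxed().forEach(shared::add);
        return shared;
    }

    // No shared state: each thread fills its own container and the collector
    // merges the partial results in encounter order.
    static List<Integer> viaCollect(int n) {
        return IntStream.range(0, n).parallel().boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(viaSharedList(10)); // same elements, order may change between runs
        System.out.println(viaCollect(10));    // always 0..9 in order
    }
}
```

Without the synchronized wrapper the first method would be broken outright: a plain ArrayList can lose elements or throw when several threads add to it at once.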

Legitimate side effects

Only in two places:

  • forEach(): Terminal operation for actions “to the outside world”
  • peek(): Debugging only (logging intermediate states)
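
A minimal sketch of both legitimate uses, with hypothetical names (`LegitimateSideEffects`, `normalize`, `deliver`): peek() only traces elements and the result does not depend on it, while the one deliberate side effect sits in the terminal forEach().

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Collectors;

class LegitimateSideEffects {

    // peek() is used for tracing only; removing it would not change the result.
    static List<String> normalize(List<String> input) {
        return input.stream()
                    .peek(s -> System.err.println("raw: '" + s + "'"))
                    .map(String::trim)
                    .filter(s -> !s.isEmpty())
                    .collect(Collectors.toList());
    }

    // The action aimed at the outside world is the terminal operation.
    static void deliver(List<String> lines, Consumer<String> sink) {
        lines.stream().forEach(sink);
    }

    public static void main(String[] args) {
        List<String> clean = normalize(List.of(" a ", "", "b"));
        deliver(clean, System.out::println);
    }
}
```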

🔴 Senior Level

The AtomicInteger Trap

Using an AtomicInteger as a counter inside forEach of a parallel stream:

  • Every worker thread writes to the same cache line (a cache line is the block of memory, typically 64 bytes, that the CPU loads as a whole). Each write invalidates that line in the other cores' caches, so the line bounces between cores and failed CAS attempts are retried (cache-line contention).
  • The stream can end up slower than a regular for loop
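
A sketch of the trap next to a contention-free alternative. The class name `CountingDemo` and the even-number predicate are assumptions made for illustration, and no timings are claimed here; measure on your own hardware.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class CountingDemo {

    // Contended: every worker thread performs a CAS on the same AtomicInteger.
    static int countEvensWithAtomic(int n) {
        AtomicInteger counter = new AtomicInteger();
        IntStream.range(0, n).parallel().forEach(i -> {
            if (i % 2 == 0) {
                counter.incrementAndGet();
            }
        });
        return counter.get();
    }

    // No shared state: each chunk is counted locally and the partial counts are combined.
    static long countEvensWithCount(int n) {
        return IntStream.range(0, n).parallel().filter(i -> i % 2 == 0).count();
    }

    public static void main(String[] args) {
        System.out.println(countEvensWithAtomic(1_000_000)); // 500000
        System.out.println(countEvensWithCount(1_000_000));  // 500000
    }
}
```

Both methods return the same number. The difference is that the second one never makes threads compete for a single memory location.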

Transactional Side Effects

In most cases, do not call repo.save() inside map or forEach of a parallel stream. Exception: batch processing where each DB write is independent and you control transactions manually.
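
One common way to follow this advice is to keep the stream pure and do a single write afterwards. The sketch below assumes a hypothetical `OrderRepository` interface with a `saveAll` method; it stands in for whatever persistence API the project actually uses and is not a real library type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class BatchSaveSketch {

    // Hypothetical repository abstraction, defined here only for the example.
    interface OrderRepository {
        void saveAll(List<String> orders);
    }

    // The parallel part is pure (trim + filter); the write happens once,
    // on the caller's thread, inside whatever transaction the caller controls.
    static void normalizeAndSave(List<String> rawOrders, OrderRepository repo) {
        List<String> prepared = rawOrders.parallelStream()
                                         .map(String::trim)
                                         .filter(s -> !s.isEmpty())
                                         .collect(Collectors.toList());
        repo.saveAll(prepared);
    }

    public static void main(String[] args) {
        List<String> saved = new ArrayList<>();
        normalizeAndSave(List.of(" a ", " ", "b"), saved::addAll); // in-memory stand-in for a database
        System.out.println(saved); // [a, b]
    }
}
```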

Logging Overhead

Under high load, stream.peek(log::info) can slow processing by up to 100x because of locks inside the logger.

Edge Cases

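Missing Side Effects: If you use peek() for business logic, it may silently not run. Since Java 9, a terminal count() may skip the whole pipeline when the number of elements can be computed directly from the source, which is the case for SIZED sources (ArrayList, arrays) with no size-changing operations in between. This is an optimization in the Stream implementation, not in the JVM. When the size is not known up front, for example after filter() or flatMap(), the pipeline runs and peek() executes.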
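
The effect can be observed with a small experiment. The class name `PeekCountDemo` is an assumption for illustration. On Java 9+ with OpenJDK the first pipeline typically reports zero peek() calls, but the specification only says the implementation may skip the pipeline, so treat that number as implementation-dependent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class PeekCountDemo {

    // SIZED source, no size-changing operation: count() may be answered from the
    // source size, in which case peek() never runs. Returns {count, peekCalls}.
    static long[] sizedPipeline() {
        List<String> source = new ArrayList<>(List.of("a", "b", "c"));
        AtomicInteger peekCalls = new AtomicInteger();
        long count = source.stream()
                           .peek(s -> peekCalls.incrementAndGet())
                           .count();
        return new long[] {count, peekCalls.get()};
    }

    // filter() makes the final size unknown, so the pipeline has to run
    // and peek() sees every element. Returns {count, peekCalls}.
    static long[] filteredPipeline() {
        List<String> source = new ArrayList<>(List.of("a", "b", "c"));
        AtomicInteger peekCalls = new AtomicInteger();
        long count = source.stream()
                           .peek(s -> peekCalls.incrementAndGet())
                           .filter(s -> true)
                           .count();
        return new long[] {count, peekCalls.get()};
    }

    public static void main(String[] args) {
        long[] sized = sizedPipeline();
        long[] filtered = filteredPipeline();
        System.out.println("sized:    count=" + sized[0] + ", peek calls=" + sized[1]);
        System.out.println("filtered: count=" + filtered[0] + ", peek calls=" + filtered[1]);
    }
}
```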

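Deadlock Potential: A side effect that acquires a lock can deadlock when the thread holding that lock is itself waiting for the parallel stream to finish, because the stream's tasks run on shared ForkJoinPool worker threads that then block on the same lock.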

Diagnostics

Always test streams that have side effects with parallelStream() as well. If the results vary from run to run, it is a design bug.


🎯 Interview Cheat Sheet

Must know:

  • Side effect — any action inside a stream that modifies something outside it (variable mutation, I/O, printing)
  • Side effects make parallel streams unpredictable: execution order is nondeterministic
  • Three principles of a safe stream: Identity (neutral element), Associativity (grouping order doesn’t matter), Non-interference (source doesn’t change)
  • Legitimate side effects only in two places: forEach() (terminal) and peek() (debugging)
  • AtomicInteger in forEach of a parallel stream: cache line contention — stream is slower than a regular for loop
  • In Java 9+ peek() may not execute on SIZED sources with count(): the Stream implementation can compute the count from the source and skip the pipeline
  • stream.peek(log::info) under high load can slow processing by up to 100x due to logger locks

Frequent follow-up questions:

  • What counts as a side effect? — Mutation of external variables, I/O, changing object state inside the stream
  • Can I use peek() for business logic? — Absolutely not; in Java 9+ it may not execute on SIZED sources with count()
  • What is Non-interference? The stream's data source must not be modified while the pipeline runs; otherwise the behavior is undefined and often ends in a ConcurrentModificationException
  • When is a side effect acceptable in forEach? — When the action is irreversible and one-off: sending an email, writing to a log

Red flags (DO NOT say):

  • “peek() is a fine place for business logic” — it may not execute; use map() for production
  • “AtomicInteger solves all concurrency problems” — contention on CAS kills performance
  • “Side effects are safe in sequential streams” — they block future parallelism and violate the functional paradigm
  • “repo.save() inside map of a parallel stream is fine” — unmanaged transactions, no rollback, connection pool exhaustion

Related topics:

  • [[Why you should avoid side effects in Stream]]
  • [[Can you modify external variable state in Stream operations]]
  • [[What is peek() operation and when to use it]]
  • [[What potential problems can occur with parallel streams]]