Question 21 · Section 8

What is lazy evaluation in Stream?

🟢 Junior Level

Lazy evaluation is a strategy in which intermediate operations are not executed immediately but only recorded in a “plan.” Actual work begins only when a terminal operation is called, that is, the operation that triggers the pipeline (e.g., collect, forEach, count).

Laziness is like a promise: the stream “promises” to execute the operations, but only starts when you request a result. Without a terminal operation, a stream is just a description of a plan.

// These operations do NOT execute immediately
Stream<String> stream = list.stream()
    .filter(s -> s.startsWith("A"))  // only recorded in plan
    .map(String::toUpperCase);        // only recorded in plan

// Only here does actual work begin
List<String> result = stream.collect(Collectors.toList());

Example breakdown: if list = ["Apple", "Banana", "Apricot"], then when collect is called:

  1. Element "Apple" reaches filter (starts with “A”: yes) → map produces "APPLE" → added to the result
  2. Element "Banana" reaches filter (does not start with “A”) → discarded; map is NOT called
  3. Element "Apricot" reaches filter (starts with “A”: yes) → map produces "APRICOT" → added to the result
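
The breakdown above can be checked with a small sketch. The class name LazyPlanDemo and the shared LOG list are illustrative assumptions; the lambdas only record when they are called, so the log shows that nothing runs until collect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LazyPlanDemo {
    // Records every lambda invocation so the execution order can be inspected.
    static final List<String> LOG = new ArrayList<>();

    // Builds the pipeline only; no terminal operation is called here.
    static Stream<String> buildPipeline(List<String> list) {
        return list.stream()
                .filter(s -> { LOG.add("filter " + s); return s.startsWith("A"); })
                .map(s -> { LOG.add("map " + s); return s.toUpperCase(); });
    }

    public static void main(String[] args) {
        Stream<String> stream = buildPipeline(Arrays.asList("Apple", "Banana", "Apricot"));
        System.out.println("after building: " + LOG); // prints "after building: []"

        List<String> result = stream.collect(Collectors.toList());
        System.out.println("result: " + result);      // prints "result: [APPLE, APRICOT]"
        System.out.println("after collect: " + LOG);
    }
}
```

After collect, the log reads filter Apple, map Apple, filter Banana, filter Apricot, map Apricot: map is never called for "Banana".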

Why this is needed:

  • Resource saving — do not process data that is not needed
  • Ability to work with infinite streams

🟡 Middle Level

Internal mechanism: Pipeline and Sink

Each intermediate operation creates a new pipeline stage: an object of an internal JDK class derived from AbstractPipeline. The stages form a doubly linked list in which each node holds references to the previous and next stage. When a terminal operation is called, the stages are wrapped into a chain of Sink objects: handlers that each accept an element, process it, and pass it to the next Sink.

What is a Spliterator: an iterator-like source object that can also split itself for parallel processing. In a sequential stream it “pushes” elements into the Sink chain one at a time.
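
The idea of a Sink chain can be illustrated with a deliberately simplified model. This is not the JDK's implementation: the Sink interface, filterSink, and mapSink below are hypothetical stand-ins, and a plain for loop plays the role of the Spliterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

final class SinkChainSketch {
    // Hypothetical stand-in for the JDK's internal Sink: accepts one element at a time.
    interface Sink<T> {
        void accept(T t);
    }

    // Forwards an element downstream only if it matches the predicate.
    static <T> Sink<T> filterSink(Predicate<T> predicate, Sink<T> downstream) {
        return t -> { if (predicate.test(t)) downstream.accept(t); };
    }

    // Transforms an element and forwards the result downstream.
    static <T, R> Sink<T> mapSink(Function<T, R> mapper, Sink<R> downstream) {
        return t -> downstream.accept(mapper.apply(t));
    }

    static List<String> run(List<String> source) {
        List<String> out = new ArrayList<>();
        // The chain is wired from the last stage back to the first.
        Sink<String> toResult = out::add;
        Sink<String> mapStage = mapSink(String::toUpperCase, toResult);
        Sink<String> filterStage = filterSink(s -> s.startsWith("A"), mapStage);
        // The loop plays the role of the Spliterator pushing elements into the chain.
        for (String s : source) {
            filterStage.accept(s);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("Apple", "Banana", "Apricot"))); // prints [APPLE, APRICOT]
    }
}
```

Each element travels through the whole chain before the next one is pushed, and no intermediate collection is created between the stages.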

Horizontal vs Vertical Execution

Imperative approach (Horizontal):

filter for all → intermediate list → map for all → result

Stream API (Vertical):

Element 1: filter → map → collect
Element 2: filter → map → collect
...
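
The two traversal orders can be compared side by side. In this sketch (class and method names are illustrative assumptions) both variants keep odd numbers and then map them, and each returns a trace of the order in which the steps ran.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TraversalOrderDemo {
    // Imperative, "horizontal": filter everything first, then map everything.
    static List<String> horizontal(List<Integer> input) {
        List<String> trace = new ArrayList<>();
        List<Integer> kept = new ArrayList<>(); // intermediate list
        for (Integer n : input) {
            trace.add("filter " + n);
            if (n % 2 == 1) kept.add(n);
        }
        List<Integer> mapped = new ArrayList<>();
        for (Integer n : kept) {
            trace.add("map " + n);
            mapped.add(n * 10);
        }
        return trace;
    }

    // Stream API, "vertical": each element passes through all stages in turn.
    static List<String> vertical(List<Integer> input) {
        List<String> trace = new ArrayList<>();
        input.stream()
                .filter(n -> { trace.add("filter " + n); return n % 2 == 1; })
                .map(n -> { trace.add("map " + n); return n * 10; })
                .forEach(n -> { });
        return trace;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3);
        System.out.println(horizontal(input)); // prints [filter 1, filter 2, filter 3, map 1, map 3]
        System.out.println(vertical(input));   // prints [filter 1, map 1, filter 2, filter 3, map 3]
    }
}
```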

Advantages

  • Memory saving: No intermediate collections needed
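  • Short-circuiting: .limit(1) stops processing as soon as one element has passed through it; the pipeline requests no further data from the source. Analogous to break in a loop.
  • Infinite Streams: The pipeline can consume an unbounded source such as Stream.generate() or Stream.iterate(), provided a short-circuiting operation bounds it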
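
Short-circuiting can be observed by counting how many elements the source actually hands over. The sketch below is an illustration under assumed names (ShortCircuitDemo, elementsPulled); it runs a sequential stream over an infinite source and stops once limit is satisfied.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

final class ShortCircuitDemo {
    // Returns how many source elements were pulled before limit stopped the pipeline.
    static int elementsPulled(int limit) {
        AtomicInteger pulled = new AtomicInteger();
        Stream.iterate(1, i -> i + 1)               // 1, 2, 3, ... without end
                .peek(i -> pulled.incrementAndGet()) // counts every element that leaves the source
                .filter(i -> i % 3 == 0)
                .limit(limit)
                .forEach(i -> { });
        return pulled.get();
    }

    public static void main(String[] args) {
        // The second multiple of 3 is 6, so six elements are pulled and then the pipeline stops.
        System.out.println(elementsPulled(2)); // prints 6
    }
}
```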

🔴 Senior Level

When stream laziness is a problem

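  1. You need to perform a side effect (logging, metrics) for every element: a short-circuiting operation can stop the pipeline before the side effect has run for all of them
  2. I/O inside the pipeline: the connection will not open until the terminal operation, which may be surprising
  3. Debugging: when a lambda throws, the top stack frame is the lambda itself, but the frames below it belong to the stream internals and the caller of the terminal operation, not to the code that assembled the stream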

Loop Fusion

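Because every element passes through all stages in a single traversal, the pipeline already behaves like one fused loop rather than several separate passes. On top of that, the JIT compiler may inline the small lambdas and Sink calls into a single compiled loop, which tends to be friendlier to CPU caches. This is an optimization the JIT can apply, not a guarantee.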
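
Conceptually, the fused form of a filter-then-map pipeline is a single loop that applies both steps to each element. The sketch below (names are illustrative assumptions) shows the hand-written equivalent; it says nothing about the machine code the JIT actually emits, which depends on the JVM and the workload.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class FusionSketch {
    // Two stages declared separately, executed in one traversal by the stream.
    static List<Integer> viaStream(List<Integer> input) {
        return input.stream()
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .collect(Collectors.toList());
    }

    // Hand-fused equivalent: one loop, both stages applied per element.
    static List<Integer> fusedByHand(List<Integer> input) {
        List<Integer> out = new ArrayList<>();
        for (int n : input) {
            if (n % 2 == 0) {
                out.add(n * n);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4);
        System.out.println(viaStream(input));   // prints [4, 16]
        System.out.println(fusedByHand(input)); // prints [4, 16]
    }
}
```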

Dangers of laziness

Hidden exceptions:

Stream<Integer> stream = list.stream()
    .map(n -> 100 / n); // Division by zero is possible here, but nothing is thrown yet...

stream.collect(Collectors.toList()); // ...the ArithmeticException surfaces here, during the terminal operation
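
A runnable version of this scenario, as a sketch with assumed names (DeferredExceptionDemo, whereItFails), reports in which phase the ArithmeticException is actually thrown.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class DeferredExceptionDemo {
    // Reports whether the division fails while building the pipeline or while running it.
    static String whereItFails(List<Integer> list) {
        Stream<Integer> stream;
        try {
            stream = list.stream().map(n -> 100 / n); // only records the operation
        } catch (ArithmeticException e) {
            return "while building";
        }
        try {
            stream.collect(Collectors.toList());      // runs the lambda for each element
        } catch (ArithmeticException e) {
            return "at terminal operation";
        }
        return "no failure";
    }

    public static void main(String[] args) {
        System.out.println(whereItFails(Arrays.asList(1, 0, 2))); // prints "at terminal operation"
        System.out.println(whereItFails(Arrays.asList(1, 2)));    // prints "no failure"
    }
}
```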

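Side Effects Trap: If peek() writes to a DB and the terminal operation is never called, the write will not happen. Even with a terminal operation the write is not guaranteed: since Java 9, count() may compute the size directly from the source without running the intermediate operations at all.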
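
The trap is easy to reproduce. In the sketch below a plain list stands in for the database (an assumption for illustration, as are the class and method names), and the method reports how many "writes" happened with and without a terminal operation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

final class PeekWithoutTerminalDemo {
    // Returns the number of writes performed by peek().
    static int writes(boolean callTerminal) {
        List<String> fakeDb = new ArrayList<>(); // stands in for a real database
        Stream<String> stream = Stream.of("a", "b", "c").peek(fakeDb::add);
        if (callTerminal) {
            stream.forEach(x -> { });            // triggers the pipeline, so peek runs
        }
        return fakeDb.size();
    }

    public static void main(String[] args) {
        System.out.println(writes(false)); // prints 0
        System.out.println(writes(true));  // prints 3
    }
}
```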

Debugging

  • A breakpoint inside a lazy operation lambda will not trigger until the terminal operation is called
  • If logs from peek() appear in interleaved order (element 1 goes through every stage, then element 2), that is visible proof of vertical execution

Diagnostics

The Stream Debugger in IntelliJ IDEA (the Trace Current Stream Chain action) visualizes how each element moves through the stages of the pipeline.

When NOT to use lazy evaluation

  • When you need early validation. If map contains an operation that can fail (division by zero, parsing), the error will only manifest during the terminal operation, which may run far from where the pipeline was declared. For early error detection, use an imperative loop.
  • When side effects are expected at each stage. Lazy evaluation makes execution order non-obvious for a developer accustomed to step-by-step loops. If it is important that each operation completes its action for all elements before the next one starts, use a regular for loop.
  • For debugging in production. When a lazy lambda throws, the top stack frame is the lambda, but the rest of the trace consists of stream internals and the caller of the terminal operation. The code that assembled the pipeline does not appear unless it is also the code that ran it, which makes diagnostics harder.
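
To see what such a stack trace contains, the sketch below captures one and exposes its frames. The class and method names are illustrative assumptions; exact frame names depend on the compiler and JDK version, so the code inspects them rather than hard-coding them.

```java
import java.util.Arrays;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class StackTraceDemo {
    // Runs a failing pipeline and returns the stack trace of the exception.
    static StackTraceElement[] capture() {
        Stream<Integer> stream = Arrays.asList(1, 0).stream().map(n -> 100 / n);
        try {
            stream.collect(Collectors.toList());
            return new StackTraceElement[0]; // not reached for this input
        } catch (ArithmeticException e) {
            return e.getStackTrace();
        }
    }

    public static void main(String[] args) {
        // Top frame: the synthetic lambda method. Below it: java.util.stream internals
        // and, further down, the caller of the terminal operation.
        for (StackTraceElement frame : capture()) {
            System.out.println(frame);
        }
    }
}
```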

🎯 Interview Cheat Sheet

Must know:

  • Intermediate operations do not execute until a terminal operation is called — this is lazy evaluation
  • Without a terminal operation, a stream is just a description of a data processing plan
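  • Vertical execution: each element goes through the entire pipeline before the next one starts, instead of each stage processing all elements in turn
  • Loop Fusion: all stages run in a single pass, and the JIT may additionally inline them into one compiled loop (not guaranteed)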
  • Laziness saves resources — unneeded data is not processed
  • Short-circuit operations (limit) stop the pipeline early
  • Exceptions in lazy operations only manifest during the terminal operation
  • Side effects in peek() never run without a terminal operation

Frequent follow-up questions:

  • Why doesn’t filter execute immediately? Intermediate operations are only recorded in the plan (Pipeline); actual work starts with the terminal operation (collect, forEach).
  • What is Loop Fusion? Running several pipeline stages in a single pass over the data. The JIT may further inline the stages into one compiled loop, which helps CPU cache utilization, but this is not guaranteed.
  • Can you execute a stream without a terminal operation? You can create it, but no element will be processed.
  • Why does the stack trace run through the terminal operation? Because the terminal operation is what triggers the pipeline. The failing intermediate lambda is the top frame; the frames below it are stream internals and the caller of the terminal operation.

Red flags (DO NOT say):

  • “Stream executes immediately upon creation”: incorrect, the stream is lazy
  • “Intermediate operations create intermediate collections”: incorrect, vertical execution avoids them
  • “You can execute one stream twice”: incorrect, streams are one-time-use (linkedOrConsumed flag); a second terminal operation throws IllegalStateException
  • “peek() always performs side effects”: incorrect, it runs only when a terminal operation is called, and only for the elements that actually reach it
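
The one-time-use rule can be verified directly. This minimal sketch (class and method names are assumptions) runs a terminal operation twice on the same stream and reports whether the second attempt is rejected.

```java
import java.util.stream.Stream;

final class SingleUseDemo {
    // Returns true if reusing an already consumed stream throws IllegalStateException.
    static boolean secondUseFails() {
        Stream<String> stream = Stream.of("a", "b");
        stream.forEach(x -> { });     // first terminal operation consumes the stream
        try {
            stream.forEach(x -> { }); // second attempt on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails()); // prints true
    }
}
```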

Related topics:

  • [[22. When does Stream operation execution begin]]
  • [[24. How does short-circuiting work in Stream]]
  • [[23. What do distinct(), sorted(), limit(), skip() operations do]]
  • [[26. What do findFirst() and findAny() operations do]]