What is the difference between intermediate and terminal operations?
Junior Level
In Stream API, all operations are divided into two types:
Intermediate operations:
- Return a new Stream
- Can be chained together
- Are not executed immediately — they are lazy.
- Examples: filter(), map(), sorted(), limit()
Analogy: Laziness is like a recipe in a cookbook. You write down the steps (filter, map), but you only start cooking when someone asks for the result (terminal operation). Without a terminal operation, a stream is just a description of the plan: nothing is computed.
Terminal operations:
- Trigger execution of the entire chain
- Return a result (not a Stream)
- After them, the stream cannot be reused
- Examples: collect(), forEach(), count(), findFirst()
// assumes: import static java.util.stream.Collectors.toList;
list.stream()                        // creating a stream
    .filter(s -> s.length() > 3)     // intermediate
    .map(String::toUpperCase)        // intermediate
    .collect(toList());              // terminal: triggers the whole chain
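The two rules above (intermediate operations are lazy, and a stream cannot be reused after a terminal operation) can be observed directly. The following is a minimal sketch; the class name LazyStreamDemo and the filterCalls counter are illustrative assumptions, not part of the Stream API.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {
    // Illustrative counter: how many times the filter predicate actually ran.
    static final AtomicInteger filterCalls = new AtomicInteger();

    // Builds the pipeline only; no terminal operation is called here.
    static Stream<String> buildPipeline(List<String> words) {
        return words.stream()
                .filter(s -> { filterCalls.incrementAndGet(); return s.length() > 3; })
                .map(String::toUpperCase);
    }

    public static void main(String[] args) {
        List<String> words = List.of("java", "api", "stream", "map");

        Stream<String> pipeline = buildPipeline(words);
        System.out.println("filter calls before terminal op: " + filterCalls.get()); // 0

        List<String> result = pipeline.collect(Collectors.toList());
        System.out.println("result: " + result);                                     // [JAVA, STREAM]
        System.out.println("filter calls after collect: " + filterCalls.get());      // 4

        try {
            pipeline.count(); // second terminal operation on the same stream
        } catch (IllegalStateException e) {
            System.out.println("reuse rejected: " + e.getMessage());
        }
    }
}
```

The counter stays at zero until collect() runs, and the second terminal call fails instead of silently returning an empty result.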
Middle Level
How it works internally
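In the OpenJDK implementation, a stream pipeline is a linked chain of stages. Each intermediate operation only appends a new stage to the pipeline; no data moves yet. When a terminal operation is called, the stages are turned into a chain of Sinks:
- Each stage creates its own Sink, which wraps the Sink of the next (downstream) stage
- Data is “pushed” from the source through the chain of Sinks
Sink is an internal Stream API interface (java.util.stream.Sink, not part of the public API) that passes elements from one operation to the next along the chain. Because each Sink wraps the one after it, the chain is assembled from the terminal operation back toward the source.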
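The wrapping idea can be illustrated without touching JDK internals. The sketch below uses a hypothetical MiniSink interface and hypothetical filtering/mapping helpers; it is a deliberately simplified model of the mechanism, not the actual java.util.stream.Sink code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

class MiniSinkDemo {
    // Hypothetical, simplified stand-in for the JDK-internal Sink interface.
    interface MiniSink<T> {
        void accept(T item);
    }

    // A "filter" stage: forwards an element downstream only if it passes the test.
    static <T> MiniSink<T> filtering(Predicate<T> test, MiniSink<T> downstream) {
        return item -> {
            if (test.test(item)) {
                downstream.accept(item);
            }
        };
    }

    // A "map" stage: transforms the element, then forwards it downstream.
    static <T, R> MiniSink<T> mapping(Function<T, R> fn, MiniSink<R> downstream) {
        return item -> downstream.accept(fn.apply(item));
    }

    static List<String> run(List<String> source) {
        List<String> out = new ArrayList<>();
        MiniSink<String> terminal = out::add;                                // plays the role of collect()
        // Wrapping goes from the last stage back to the first.
        MiniSink<String> mapStage = mapping(String::toUpperCase, terminal);
        MiniSink<String> chain = filtering(s -> s.length() > 3, mapStage);
        source.forEach(chain::accept);                                       // the source pushes elements in
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("java", "api", "stream"))); // [JAVA, STREAM]
    }
}
```

Each element travels through the whole chain before the next one is pushed, which is why no intermediate collection is needed between stages.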
Types of intermediate operations
Stateless: filter, map, flatMap, peek
- Each element is processed independently
- Ideal for parallelism
Stateful: distinct, sorted, limit, skip
- Require knowledge about other elements
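- Differ in cost: sorted is a full “barrier” that must buffer the entire input before emitting anything, distinct keeps a set of the elements seen so far, and limit and skip only keep a counter in sequential streams (they can become expensive in ordered parallel streams)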
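The difference between the two groups shows up in the order in which elements move through a sequential pipeline. A minimal sketch, assuming a small in-memory list; the class BarrierDemo and the "in:"/"out:" log labels are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class BarrierDemo {
    // Logs each element before and after a stateless stage (map).
    static List<String> traceStateless(List<Integer> input) {
        List<String> log = new ArrayList<>();
        input.stream()
             .peek(x -> log.add("in:" + x))
             .map(x -> x * 10)                  // stateless: handles one element at a time
             .peek(x -> log.add("out:" + x))
             .collect(Collectors.toList());
        return log;
    }

    // Logs each element before and after a stateful stage (sorted).
    static List<String> traceSorted(List<Integer> input) {
        List<String> log = new ArrayList<>();
        input.stream()
             .peek(x -> log.add("in:" + x))
             .sorted()                          // stateful: needs all elements first
             .peek(x -> log.add("out:" + x))
             .collect(Collectors.toList());
        return log;
    }

    public static void main(String[] args) {
        List<Integer> input = List.of(3, 1, 2);
        System.out.println(traceStateless(input)); // [in:3, out:30, in:1, out:10, in:2, out:20]
        System.out.println(traceSorted(input));    // [in:3, in:1, in:2, out:1, out:2, out:3]
    }
}
```

With map, every element leaves the stage right after entering it; with sorted, nothing leaves until the whole input has arrived.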
Order matters
// BAD: sort 1 million, then take 5
list.stream().sorted().limit(5).collect(toList());
// GOOD: filter first, reducing the volume
list.stream().filter(relevantOnly).sorted().limit(5).collect(toList());
Rule: Put volume-reducing stateless operations (above all filter) before stateful ones (sorted, distinct), as long as the result stays the same.
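One way to see the effect is to count how many elements actually reach sorted(). The sketch below is illustrative: the RELEVANT predicate (keep multiples of 100) and the class name are assumptions made for the demo, and it measures element counts rather than wall-clock time.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class FilterBeforeSortDemo {
    // Hypothetical relevance rule for the demo: keep multiples of 100.
    static final Predicate<Integer> RELEVANT = n -> n % 100 == 0;

    // Returns how many elements sorted() has to buffer, with or without a filter in front of it.
    static int elementsReachingSort(List<Integer> data, boolean filterFirst) {
        Predicate<Integer> gate = filterFirst ? RELEVANT : n -> true;
        AtomicInteger seen = new AtomicInteger();
        data.stream()
            .filter(gate)
            .peek(n -> seen.incrementAndGet())   // counts what arrives at sorted()
            .sorted()
            .limit(5)
            .collect(Collectors.toList());
        return seen.get();
    }

    public static void main(String[] args) {
        // 10,000 integers in descending order, so sorting has real work to do.
        List<Integer> data = IntStream.rangeClosed(1, 10_000)
                .map(i -> 10_001 - i)
                .boxed()
                .collect(Collectors.toList());

        System.out.println(elementsReachingSort(data, false)); // 10000
        System.out.println(elementsReachingSort(data, true));  // 100
    }
}
```

Note that limit(5) after sorted() does not help here: sorted() still has to receive every upstream element before it can emit the first one.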
Senior Level
Pipeline Overhead
Creating the Stream and AbstractPipeline objects and the chain of Sinks has a cost. For small collections (roughly up to ~100 elements, though the exact threshold depends on the workload and the JIT) a regular for loop is often faster, since the overhead of building the pipeline and Sink chain outweighs the benefit. Streams pay off on large volumes or complex logic.
Fusion (Operation merging)
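The Stream implementation fuses consecutive operations: .map(f1).map(f2) runs as a single pass in which each element goes through f1 and then f2, equivalent to f2(f1(x)), with no intermediate collection between the two steps. This reduces the number of iterations.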
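The single pass is visible if both functions log their calls. A minimal sketch for a sequential stream; f1 (add one) and f2 (double) are arbitrary illustrative functions, and FusionDemo is a made-up class name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class FusionDemo {
    // Records the order in which the two mapping functions are invoked.
    static List<String> trace(List<Integer> input) {
        List<String> log = new ArrayList<>();
        input.stream()
             .map(x -> { log.add("f1(" + x + ")"); return x + 1; })   // f1
             .map(x -> { log.add("f2(" + x + ")"); return x * 2; })   // f2
             .collect(Collectors.toList());
        return log;
    }

    public static void main(String[] args) {
        // Calls interleave per element instead of running f1 over the whole list first.
        System.out.println(trace(List.of(1, 2, 3)));
        // [f1(1), f2(2), f1(2), f2(3), f1(3), f2(4)]
    }
}
```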
Lazy Evaluation Catch
Heavy operations in intermediate lambdas are not executed until a terminal call:
Stream<Integer> stream = list.stream().map(this::expensiveDbCall);
// Nothing happened! The database was not called.
var result = stream.collect(toList());
// Only now did map() execute
This can lead to unexpected delays at the end of the method.
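The same deferral applies to failures: an exception thrown inside an intermediate lambda surfaces at the terminal call, which may be far from where the pipeline was built. The sketch below assumes a hypothetical expensiveLookup method standing in for the database call; the names lazyLookup and eagerLookup are illustrative as well.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DeferredCostDemo {
    // Hypothetical stand-in for an expensive call; fails for negative ids.
    static String expensiveLookup(int id) {
        if (id < 0) {
            throw new IllegalArgumentException("bad id " + id);
        }
        return "user-" + id;
    }

    // Only builds the pipeline: expensiveLookup is not called here, so this cannot fail on bad ids.
    static Stream<String> lazyLookup(List<Integer> ids) {
        return ids.stream().map(DeferredCostDemo::expensiveLookup);
    }

    // Materializes inside the method, so the cost and any error occur here.
    static List<String> eagerLookup(List<Integer> ids) {
        return lazyLookup(ids).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<String> pending = lazyLookup(List.of(1, -2)); // returns normally
        try {
            pending.collect(Collectors.toList());            // the lookup runs only now
        } catch (IllegalArgumentException e) {
            System.out.println("failed at the terminal call: " + e.getMessage());
        }
        System.out.println(eagerLookup(List.of(1, 2)));      // [user-1, user-2]
    }
}
```

Returning a List instead of a Stream is one way to make the point at which the work happens explicit; whether that is preferable depends on how callers consume the data.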
Edge Cases
- peek() without terminal operation: Code inside peek will never execute
- Stream reuse: Will throw IllegalStateException
- Breakpoint inside a lambda: Will only trigger when a terminal operation starts pulling data
Diagnostics
Use Java Flight Recorder (JFR) or profilers to analyze data “push” time through the Pipeline.
Interview Cheat Sheet
Must know:
- Intermediate operations return a Stream, are lazy, can be chained
- Terminal operations trigger execution, return a result (not a Stream)
- After a terminal operation the stream is exhausted — reuse will throw IllegalStateException
- Intermediate operations can be stateless (filter, map) or stateful (sorted, distinct, limit)
- Stateless operations are ideal for parallelism, stateful create barriers
- Sink — internal mechanism passing elements along the chain
Common follow-up questions:
- Why should filter be placed before sorted? — To reduce data volume before the expensive stateful operation
- What is Fusion? — The Stream implementation (not the JVM) combines .map(f1).map(f2) into a single pass with f2(f1(x))
- When will peek() not execute? — Without a terminal operation, because nothing is computed due to laziness
- Why are stateful operations slower? — They need state about other elements; sorted in particular must buffer the whole input before emitting anything
Red flags (DO NOT say):
- “Intermediate operations process data immediately” — no, they are lazy
- “You can call a terminal operation twice on the same stream” — no, IllegalStateException
- “sorted and filter have the same performance” — sorted is stateful and requires buffering
- “Operation order doesn’t matter” — no, placing filter before sorted (or before an expensive map) reduces the work done downstream
Related topics:
- [[1. What advantages does Stream API provide]]
- [[3. What does filter() operation do]]
- [[5. What does collect() operation do]]
- [[7. What does flatMap() operation do]]