Question 20 · Section 8

Can you reuse a Stream?



🟢 Junior Level

No, you cannot. A stream is a one-time-use object. After a terminal operation is called, it cannot be used again.

Stream<String> stream = list.stream();
stream.forEach(System.out::println); // OK
stream.forEach(System.out::println); // IllegalStateException!

You will get the error: java.lang.IllegalStateException: stream has already been operated upon or closed

Solution: Create a new stream:

list.stream().forEach(System.out::println);
list.stream().forEach(System.out::println); // OK — new stream each time
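The two snippets above can be combined into one runnable sketch. The class and method names here are hypothetical, chosen only for illustration; the behavior shown is what OpenJDK does when it detects reuse.

```java
import java.util.List;
import java.util.stream.Stream;

class StreamReuseDemo {

    // Returns true if a second terminal operation on the same stream object is rejected.
    static boolean secondUseFails(List<String> list) {
        Stream<String> stream = list.stream();
        stream.forEach(s -> { });        // first terminal operation: fine
        try {
            stream.forEach(s -> { });    // same stream object, second use
            return false;
        } catch (IllegalStateException e) {
            return true;                 // reuse was detected
        }
    }

    // Traverses the list twice by asking it for a fresh stream each time.
    static long countTwice(List<String> list) {
        return list.stream().count() + list.stream().count();
    }

    public static void main(String[] args) {
        List<String> list = List.of("a", "b", "c");
        System.out.println(secondUseFails(list)); // true
        System.out.println(countTwice(list));     // 6
    }
}
```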

🟡 Middle Level

Why is it designed this way?

Streams are lazy data pipelines, not collections:

  • The source may not support repeated traversal (Iterator, Socket)
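  • A stream is a chain of pipeline stages. In OpenJDK each stage carries a linkedOrConsumed flag that is set as soon as another operation is chained onto it or a terminal operation runs; any further use of that stage throws IllegalStateException. This rules out forked pipelines, where two consumers would compete for elements of the same source.
  • Operations are fused into a single pass: each element flows through all stages in turn and no intermediate results are stored, so there is nothing to replay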
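Both points can be observed directly. The sketch below uses hypothetical class and method names: the first method records the order in which the stages see each element, and the second shows that chaining a second operation onto an already-linked stage is rejected on OpenJDK even though no terminal operation has run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class PipelineStageDemo {

    // Builds a pipeline, then runs it, recording what happens and when.
    static List<String> traceLazyPipeline() {
        List<String> trace = new ArrayList<>();
        Stream<String> pipeline = Stream.of("a", "bb")
                .peek(s -> trace.add("peek:" + s))
                .map(s -> { trace.add("map:" + s); return s.toUpperCase(); });
        trace.add("built");                       // no element has been processed yet
        pipeline.forEach(s -> trace.add("sink:" + s));
        return trace;
    }

    // Returns true if branching twice from the same stage is rejected.
    static boolean forkingFails() {
        Stream<String> source = Stream.of("a", "bb");
        source.filter(s -> s.length() > 1);       // links 'source' to a downstream stage
        try {
            source.map(String::toUpperCase);      // second branch from the same stage
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // [built, peek:a, map:a, sink:A, peek:bb, map:bb, sink:BB]
        System.out.println(traceLazyPipeline());
        System.out.println(forkingFails());       // true
    }
}
```

The trace shows each element passing through every stage before the next element starts, which is why a finished pipeline has no stored results to traverse again.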

How to work around the limitation

1. Supplier pattern:

Supplier<Stream<String>> supplier = () -> list.stream()
    .filter(s -> s.length() > 5);

supplier.get().forEach(System.out::println);
long count = supplier.get().count(); // Works — new stream each time

Do NOT use Supplier if the source is expensive (DB query, HTTP call): every get() runs the source again. In that case, collect the results into a collection once. Supplier is suitable for cheap sources such as in-memory collections.

2. Collect into a collection:

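List<String> cached = stream.collect(Collectors.toList()); // the stream is consumed here, once
cached.stream().forEach(System.out::println);               // each call opens a new stream over the list
long count = cached.stream().count();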
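To make the trade-off between the two workarounds concrete, here is a minimal sketch that counts how often the source is hit. loadNames() is a hypothetical stand-in for an expensive call (DB query, HTTP request), and the counter exists only for the demonstration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SourceCostDemo {

    static final AtomicInteger loads = new AtomicInteger();

    // Hypothetical expensive source; here it only bumps a counter.
    static List<String> loadNames() {
        loads.incrementAndGet();
        return List.of("alexander", "bob", "christina");
    }

    // Supplier approach: every get() runs the source again.
    static int loadsWithSupplier() {
        loads.set(0);
        Supplier<Stream<String>> supplier =
                () -> loadNames().stream().filter(s -> s.length() > 5);
        supplier.get().forEach(s -> { });
        supplier.get().count();
        return loads.get();
    }

    // Collect-once approach: the source runs once, later passes read the list.
    static int loadsWithCache() {
        loads.set(0);
        List<String> cached = loadNames().stream()
                .filter(s -> s.length() > 5)
                .collect(Collectors.toList());
        cached.stream().forEach(s -> { });
        cached.stream().count();
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println(loadsWithSupplier()); // 2
        System.out.println(loadsWithCache());    // 1
    }
}
```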

🔴 Senior Level

Architectural implications

Stream as a method parameter: Never accept a Stream in a public API if you plan multiple passes. Accept Iterable, Collection, or Supplier<Stream>.
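As an illustration of that guideline, the sketch below shows a two-pass computation with the suggested parameter types. The class and method names are hypothetical; the point is only that the method itself can open as many streams as it needs.

```java
import java.util.Collection;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

class TwoPassApi {

    // Accepts a Collection: each pass opens its own stream.
    static double averageLength(Collection<String> words) {
        long total = words.stream().mapToInt(String::length).sum();  // pass 1
        long count = words.stream().count();                         // pass 2
        return count == 0 ? 0.0 : (double) total / count;
    }

    // Accepts a Supplier of streams: each get() must return a fresh stream.
    static double averageLengthFrom(Supplier<Stream<String>> words) {
        long total = words.get().mapToInt(String::length).sum();     // pass 1
        long count = words.get().count();                            // pass 2
        return count == 0 ? 0.0 : (double) total / count;
    }

    public static void main(String[] args) {
        List<String> words = List.of("ab", "abcd");
        System.out.println(averageLength(words));                    // 3.0
        System.out.println(averageLengthFrom(words::stream));        // 3.0
    }
}
```

Had either method taken a Stream<String> directly, the second pass would throw IllegalStateException no matter what the caller passed in.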

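Resource leaks: every Stream is AutoCloseable, but only streams backed by an external resource (for example Files.lines()) actually need closing. If such a stream is stored and reused, the second use throws IllegalStateException, and unless the stream is wrapped in try-with-resources that exception can skip close() and leave the file handle open.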
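A minimal sketch of the safe pattern for file-backed streams, assuming a temporary file created just for the demo. The onClose handler and the class name are illustrative additions used to make the close visible; they are not required in real code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

class FileLinesDemo {

    // Counts non-empty lines; try-with-resources closes the stream even on failure.
    static long countNonEmpty(Path file, AtomicBoolean closed) throws IOException {
        try (Stream<String> lines = Files.lines(file).onClose(() -> closed.set(true))) {
            return lines.filter(l -> !l.isEmpty()).count();
        }
    }

    // Runs the demo on a temp file; returns the count, or -1 if the stream was not closed.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("lines-demo", ".txt");
            try {
                Files.write(tmp, List.of("one", "", "three"));
                AtomicBoolean closed = new AtomicBoolean();
                long count = countNonEmpty(tmp, closed);
                return closed.get() ? count : -1;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 2
    }
}
```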

Object Allocation

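A stream pipeline allocates roughly 5-10 objects, so one million pipelines mean roughly 5-10 million short-lived allocations, which puts noticeable pressure on the young generation. As a rule of thumb, if a hot path creates more than about 100K streams per second, consider a plain loop.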
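For reference, this is what the swap looks like on a small example. The methods are hypothetical and compute the same result; the sketch makes no performance claim of its own, so measure with a benchmark harness such as JMH before rewriting real code.

```java
import java.util.Arrays;

class HotPathDemo {

    // Stream version: builds a new pipeline on every call.
    static int sumOfEvenSquaresStream(int[] values) {
        return Arrays.stream(values)
                .filter(v -> v % 2 == 0)
                .map(v -> v * v)
                .sum();
    }

    // Loop version: same result, no pipeline objects.
    static int sumOfEvenSquaresLoop(int[] values) {
        int sum = 0;
        for (int v : values) {
            if (v % 2 == 0) {
                sum += v * v;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 4};
        System.out.println(sumOfEvenSquaresStream(values)); // 20
        System.out.println(sumOfEvenSquaresLoop(values));   // 20
    }
}
```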

Edge Cases

  • parallelStream() runs on the shared ForkJoinPool.commonPool(). Re-creating parallel streams at a high rate, especially with blocking tasks, can starve other users of that pool
  • Exhaustion: a stream becomes unusable not only after a terminal operation but also after an explicit close()
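The close() case is easy to verify. The sketch below (hypothetical class name) closes a stream before any terminal operation and then tries to use it; on OpenJDK the attempt fails with the same IllegalStateException as a plain reuse.

```java
import java.util.stream.Stream;

class ClosedStreamDemo {

    // Returns true if a stream that was closed without being consumed rejects further use.
    static boolean useAfterCloseFails() {
        Stream<String> stream = Stream.of("a", "b");
        stream.close();                  // no terminal operation has run yet
        try {
            stream.count();
            return false;
        } catch (IllegalStateException e) {
            return true;                 // "stream has already been operated upon or closed"
        }
    }

    public static void main(String[] args) {
        System.out.println(useAfterCloseFails()); // true
    }
}
```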

Diagnostics

When you see "stream has already been operated upon or closed", look for a stream reference that is stored in a variable or field and used twice. In a debugger on OpenJDK, the stream object's internal linkedOrConsumed field turns true as soon as the stream has been chained or consumed.


🎯 Interview Cheat Sheet

Must know:

  • Stream is a one-time-use object; after a terminal operation, reuse throws IllegalStateException
  • Stream is a lazy data pipeline, not a collection; the source may not support repeated traversal (Iterator, Socket)
  • In OpenJDK each pipeline stage has a linkedOrConsumed flag; once a stage is chained or consumed, any further use throws IllegalStateException, which rules out forked pipelines
  • Supplier<Stream> pattern — each get() call creates a new stream; suitable for cheap sources (in-memory collections)
  • For expensive sources (DB query, HTTP call) it is better to collect into a collection once rather than creating a stream each time
  • Never accept a Stream in a public API if you plan multiple passes — accept Iterable, Collection, or Supplier<Stream>
  • A stream on an external resource (Files.lines()) must be closed, ideally with try-with-resources; a failed reuse can otherwise skip close() and leak the file handle

Frequent follow-up questions:

  • Why can’t a stream be reused? A stream is a chain of pipeline stages, each with a consumed flag; single use lets the operations be fused into one pass and rules out forked pipelines over a source that may not be replayable.
  • How to work around the one-time-use limitation? Use the Supplier<Stream> pattern (each get() returns a new stream) or collect into a collection once.
  • When is Supplier a bad idea? When the source is expensive (DB query, HTTP call); collect into a collection once instead.
  • How many objects does one Stream pipeline create? Roughly 5-10; one million streams mean roughly 5-10 million allocations, which is significant for the young generation.

Red flags (DO NOT say):

  • “You can call reset() on a stream and use it again” — such a method does not exist
  • “Stream is the same thing as a collection” — Stream is lazy, one-time-use, may not support repeated traversal
  • “You can safely pass a Stream into a public API” — if the method plans multiple passes, accept Collection or Supplier
  • “parallelStream() solves the reuse problem” — on the contrary, it worsens it: thread starvation in commonPool

Related topics:

  • [[What is lazy evaluation in Stream]]
  • [[When does Stream operation execution begin]]
  • [[What potential problems can occur with parallel streams]]
  • [[What does collect() operation do]]