What are parallel streams?
Junior Level
Parallel streams are a way to process data on multiple threads simultaneously, using ForkJoinPool.commonPool(), whose default parallelism is cores - 1 (the thread that submits the work also participates, so in total roughly one thread runs per core).
Created in two ways:
// From a collection
list.parallelStream().forEach(System.out::println);
// From a regular stream
list.stream().parallel().forEach(System.out::println);
Processing uses all available CPU cores: the commonPool workers plus the calling thread. For a list of 1000 elements, the speedup could in theory approach the number of workers, but in practice fork/join and merge overhead usually limits it to 2-4x.
Important: Element order is not guaranteed in forEach.
When NOT to use parallel streams
- I/O operations — block ForkJoinPool workers, other tasks wait
- A few thousand elements — overhead > benefit
- Stateful operations with ThreadLocal — workers are reused, data “leaks”
- When order matters — parallelStream does not guarantee order (except for ordered sources)
Middle Level
Mechanism: ForkJoin and Spliterator
Parallel streams use the ForkJoin framework:
- Data is split into parts via Spliterator.trySplit()
- Each part is processed by a separate worker thread
- Partial results are combined (the combiner)
Efficiency depends on the source:
- ArrayList, arrays — split ideally by index
- HashSet, TreeSet — split decently
- LinkedList — terrible (half the list must be traversed to find the split point)
- Stream.iterate — impossible to parallelize
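The index-based splitting of ArrayList can be observed directly. A minimal sketch (class name is illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;

public class SplitDemo {
    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<>(List.of(1, 2, 3, 4, 5, 6, 7, 8));

        Spliterator<Integer> right = list.spliterator();
        // trySplit() hands off the first half [0, mid) to a new Spliterator
        // in O(1) — ArrayList splits by index, no traversal needed
        Spliterator<Integer> left = right.trySplit();

        System.out.println(left.estimateSize());  // 4
        System.out.println(right.estimateSize()); // 4
    }
}
```

A LinkedList spliterator has no such index, which is exactly why it splits so poorly.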
ForkJoinPool.commonPool()
By default all parallel streams use one shared pool:
- Size = number_of_cores - 1
- Risk: if one stream performs blocking I/O, it occupies threads of the shared pool — all other parallel streams in the JVM wait
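A quick way to see the shared pool in action (a minimal sketch; the exact thread names and parallelism value vary by machine):

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;

public class CommonPoolDemo {
    public static void main(String[] args) {
        // Default parallelism is availableProcessors() - 1: the thread that
        // submits the work participates too, so ~one active thread per core
        System.out.println("parallelism: " + ForkJoinPool.commonPool().getParallelism());

        List.of("a", "b", "c", "d").parallelStream()
            .forEach(s -> System.out.println(
                s + " on " + Thread.currentThread().getName()));
        // Typically a mix of "main" and "ForkJoinPool.commonPool-worker-N"
    }
}
```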
Execution order
// Random order
parallelStream().forEach(System.out::println);
// Guaranteed order (slower)
parallelStream().forEachOrdered(System.out::println);
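The guarantee is strong: forEachOrdered invokes the action for one element at a time, in encounter order, so even a non-thread-safe StringBuilder is safe in this sketch:

```java
import java.util.List;

public class OrderDemo {
    public static void main(String[] args) {
        List<Integer> nums = List.of(1, 2, 3, 4, 5, 6, 7, 8);

        StringBuilder sb = new StringBuilder();
        // forEachOrdered processes elements one at a time in encounter
        // order — no data race on sb, and the order is deterministic
        nums.parallelStream().forEachOrdered(sb::append);

        System.out.println(sb); // always "12345678"
    }
}
```

With plain forEach the same code would be a data race and the order would vary from run to run.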
Senior Level
N*Q Model
Empirical rule: parallelism pays off when N * Q > 10,000:
- N — number of elements
- Q — cost of computation per element
- 10,000 — approximate threshold where the fork/join overhead pays off
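The heuristic is trivial to encode. A hypothetical helper (the name and the unit of Q — a rough relative per-element cost — are illustrative choices, not part of any API):

```java
public class NqRule {
    // N * Q heuristic: parallelize only when total work exceeds the
    // approximate fork/join overhead threshold of 10,000 cost units
    static boolean worthParallelizing(long n, long q) {
        return n * q > 10_000;
    }

    public static void main(String[] args) {
        System.out.println(worthParallelizing(1_000_000, 1)); // true: many cheap elements
        System.out.println(worthParallelizing(100, 1));       // false: overhead dominates
        System.out.println(worthParallelizing(100, 500));     // true: few but expensive elements
    }
}
```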
When parallelism HURTS:
- Small collections (overhead on split/merge)
- Cheap operations (faster than context switching)
- Locks (synchronization kills parallelism)
- I/O operations (block commonPool)
Stateful Operations in parallelism
sorted(), distinct(), limit() in a parallel stream require full synchronization (“barrier”), which often makes them slower than sequential mode.
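Correctness is preserved; it is the barrier that costs. A small sketch (uses Stream.toList(), Java 16+):

```java
import java.util.List;

public class StatefulDemo {
    public static void main(String[] args) {
        List<Integer> result = List.of(5, 3, 5, 1, 3).parallelStream()
            .distinct() // workers must coordinate on a shared "seen" set
            .sorted()   // full barrier: every element must arrive before sorting completes
            .toList();

        System.out.println(result); // [1, 3, 5] — correct, but often slower than sequential
    }
}
```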
ThreadLocal Danger
In the general case, do not rely on ThreadLocal — ForkJoinPool workers are reused between tasks. If you control a custom ForkJoinPool and clean ThreadLocal in finally — acceptable.
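A sketch of the cleanup pattern (the scratch-buffer use case and names are illustrative):

```java
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

public class ThreadLocalCleanup {
    // Per-thread scratch buffer; workers are reused between tasks,
    // so it MUST be removed after each task
    static final ThreadLocal<StringBuilder> BUF =
        ThreadLocal.withInitial(StringBuilder::new);

    public static void main(String[] args) {
        LongAdder processed = new LongAdder();

        List.of("a", "b", "c", "d").parallelStream().forEach(item -> {
            try {
                BUF.get().append(item); // per-task scratch work
                processed.increment();
            } finally {
                BUF.remove(); // otherwise state leaks into the worker's next task
            }
        });

        System.out.println(processed.sum()); // 4
    }
}
```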
Custom ForkJoinPool
For load isolation, use a custom pool:
ForkJoinPool customPool = new ForkJoinPool(4);
try {
    long result = customPool.submit(() ->
        list.parallelStream().mapToInt(this::doWork).sum()
    ).get(); // get() can throw InterruptedException / ExecutionException
} finally {
    customPool.shutdown();
}
The stream runs in the pool it was submitted from, not commonPool — a long-standing but undocumented ForkJoinTask behavior.
Diagnostics
- Thread names: inside a lambda, print Thread.currentThread().getName() — you will see ForkJoinPool.commonPool-worker-N
- JFR (Java Flight Recorder): shows thread activity in the ForkJoinPool
- -Djava.util.concurrent.ForkJoinPool.common.parallelism=N: JVM system property to adjust the common pool size
Interview Cheat Sheet
Must know:
- Parallel streams use ForkJoinPool.commonPool() (size = cores - 1)
- Two creation methods: collection.parallelStream() and stream().parallel()
- Mechanism: Spliterator.trySplit() splits the data, each worker processes its part
- Splitting efficiency: ArrayList/arrays > HashSet/TreeSet > LinkedList > Stream.iterate
- Empirical rule: N * Q > 10,000 — parallelism pays off
- Order is not guaranteed in forEach, but guaranteed in forEachOrdered
Common follow-up questions:
- Why are I/O operations dangerous in parallelStream? — They block commonPool workers, all streams in the application stall
- When does parallelism HURT? — Small collections, cheap operations, locks, I/O, stateful operations
- How to isolate load? — Use a custom ForkJoinPool: customPool.submit(() -> list.parallelStream()...)
- Why is LinkedList terrible for parallelism? — Half the list must be traversed to find the split point
Red flags (DO NOT say):
- “parallelStream is always faster” — no, fork/join/merge overhead can slow it down
- “parallelStream creates new threads” — no, it uses the shared ForkJoinPool.commonPool()
- “forEach in parallelStream guarantees order” — no, only forEachOrdered
- “ThreadLocal is safe in ForkJoinPool” — no, workers are reused between tasks
Related topics:
- [[10. When to use parallel streams]]
- [[1. What advantages does Stream API provide]]
- [[2. What is the difference between intermediate and terminal operations]]
- [[6. What is Collector and what built-in Collectors exist]]