How to create a parallel stream?

Junior Level

There are two main ways to create a parallel stream:

// Method 1: From a collection
List<String> list = List.of("a", "b", "c");
list.parallelStream().forEach(System.out::println);

// Method 2: Via parallel() method
list.stream().parallel().forEach(System.out::println);

Both produce the same result for a standard collection: parallelStream() is shorthand for stream().parallel().

Verification:

list.parallelStream().isParallel(); // true; list.stream().isParallel() would be false
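A minimal sketch of the check above, assuming Java 9+ for List.of. The class and method names are made up for illustration.

```java
import java.util.List;

class ParallelCreationDemo {
    // Method 1: ask the collection for a parallel stream directly.
    static boolean viaParallelStream(List<String> list) {
        return list.parallelStream().isParallel();
    }

    // Method 2: switch an ordinary stream to parallel mode.
    static boolean viaParallel(List<String> list) {
        return list.stream().parallel().isParallel();
    }

    public static void main(String[] args) {
        List<String> list = List.of("a", "b", "c");
        System.out.println(list.stream().isParallel()); // plain stream: not parallel
        System.out.println(viaParallelStream(list));
        System.out.println(viaParallel(list));
    }
}
```

isParallel() only reads the flag; it does not start or consume the stream.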

Middle Level

Creation methods

1. Collection.parallelStream():

Stream<String> pStream = list.parallelStream();

The default implementation calls StreamSupport.stream(spliterator(), true). The second argument, true, requests a parallel stream.

2. Stream.parallel():

Stream<Integer> stream = Stream.of(1, 2, 3).parallel();

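Switches any existing stream to parallel mode “on the fly”. It is an intermediate operation that only sets a flag; nothing executes until the terminal operation.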

3. StreamSupport:

Stream<T> stream = StreamSupport.stream(mySpliterator, true);

For custom data structures that supply their own Spliterator, or for low-level tuning of how a source is split.
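A sketch of the StreamSupport route for a container that is not a Collection. Ring is a hypothetical class invented for this example; it relies on the default Iterable.spliterator(), which does not know the size and therefore splits poorly. A real custom structure would usually override spliterator().

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class StreamSupportDemo {
    // Hypothetical custom container that only implements Iterable.
    static final class Ring implements Iterable<Integer> {
        private final Integer[] items;

        Ring(Integer... items) {
            this.items = items;
        }

        @Override
        public Iterator<Integer> iterator() {
            return Arrays.asList(items).iterator();
        }
    }

    static Stream<Integer> parallelStreamOf(Iterable<Integer> source) {
        // 'true' requests a parallel stream over the source's spliterator.
        return StreamSupport.stream(source.spliterator(), true);
    }

    public static void main(String[] args) {
        Stream<Integer> s = parallelStreamOf(new Ring(1, 2, 3, 4));
        System.out.println(s.isParallel());
        System.out.println(s.mapToInt(Integer::intValue).sum());
    }
}
```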

Important nuance: the last call wins

stream.parallel().sequential().parallel(); // will be parallel
stream.parallel().sequential();            // will be sequential

Partially parallel streams do not exist: parallel() and sequential() only set a flag, and the value in effect when the terminal operation starts applies to the entire pipeline.
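The rule can be observed directly. In this sketch (names are illustrative) parallel() appears before map(), yet the trailing sequential() governs the whole pipeline, so map() only ever runs on the calling thread.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class LastCallWinsDemo {
    static boolean endsParallel() {
        return IntStream.range(0, 10).parallel().sequential().parallel().isParallel();
    }

    static boolean endsSequential() {
        return IntStream.range(0, 10).parallel().sequential().isParallel();
    }

    // Records which threads execute map() when the last mode call is sequential().
    static Set<String> threadsUsedByMap() {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.range(0, 10_000)
                .parallel()
                .map(i -> {
                    names.add(Thread.currentThread().getName());
                    return i;
                })
                .sequential() // last call: the entire pipeline runs sequentially
                .sum();
        return names;
    }

    public static void main(String[] args) {
        System.out.println(endsParallel());
        System.out.println(endsSequential());
        System.out.println(threadsUsedByMap());
    }
}
```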

Senior Level

Thread management via custom ForkJoinPool

By default ForkJoinPool.commonPool() is used. In production it is often necessary to limit parallelism for a specific task:

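ForkJoinPool myPool = new ForkJoinPool(4); // parallelism of 4
try {
    long result = myPool.submit(() ->
        list.parallelStream().mapToInt(this::doWork).sum()
    ).get(); // blocks; throws InterruptedException, ExecutionException
} finally {
    myPool.shutdown(); // release the worker threads
}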

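How it works: the terminal operation forks its subtasks as ForkJoinTasks. A task forked from a ForkJoinWorkerThread is queued in that worker's own pool, so a stream started inside myPool runs there instead of in commonPool, which isolates the load. This is behavior of the current implementation rather than a documented guarantee of the Stream API.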
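One way to see the isolation is to record the names of the threads that execute the stream. This is a sketch with illustrative names, and it relies on the implementation behavior described above, not on a documented contract.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

class CustomPoolDemo {
    // Runs a parallel stream inside a dedicated pool and returns the thread names it used.
    static Set<String> workerNames(int parallelism)
            throws InterruptedException, ExecutionException {
        Set<String> names = ConcurrentHashMap.newKeySet();
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            pool.submit(() ->
                IntStream.range(0, 100_000).parallel()
                        .forEach(i -> names.add(Thread.currentThread().getName()))
            ).get();
        } finally {
            pool.shutdown(); // always release the workers
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Expect names of the custom pool's workers, not commonPool workers.
        System.out.println(workerNames(4));
    }
}
```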

Optimizations

IntStream.range().parallel(): one of the most efficient sources. The range Spliterator knows its exact size, splits into equal halves in O(1), and works on primitives without boxing.
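A small sketch of a range-based parallel computation. The method name is illustrative; the cast to long is there only to avoid int overflow.

```java
import java.util.stream.IntStream;

class RangeDemo {
    // Sum of i^2 for i in [1, n], computed in parallel over a primitive range.
    static long sumOfSquares(int n) {
        return IntStream.range(1, n + 1)
                .parallel()
                .mapToLong(i -> (long) i * i)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10)); // 385
    }
}
```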

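Array vs List: Arrays.stream(arr).parallel() splits very cheaply because an array has an exact size (SIZED, SUBSIZED) and O(1) index access, so its Spliterator divides it exactly in half with no Iterator.next() overhead. An ArrayList splits comparably well; the real gap is against LinkedList or iterator-based sources, and against List<Integer> when a primitive array avoids boxing.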
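The halving can be inspected on the Spliterator itself. This is a sketch with illustrative names, and it assumes the array has at least two elements (otherwise trySplit() returns null).

```java
import java.util.Arrays;
import java.util.Spliterator;

class ArraySplitDemo {
    // Returns {size of the split-off prefix, size of what remains}.
    static long[] splitSizes(int[] data) {
        Spliterator.OfInt rest = Arrays.spliterator(data);
        Spliterator.OfInt prefix = rest.trySplit(); // null for arrays shorter than 2
        return new long[] { prefix.estimateSize(), rest.estimateSize() };
    }

    static boolean sizedAndSubsized(int[] data) {
        return Arrays.spliterator(data)
                .hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED);
    }

    public static void main(String[] args) {
        int[] data = new int[8];
        System.out.println(Arrays.toString(splitSizes(data)));
        System.out.println(sizedAndSubsized(data));
    }
}
```

For an 8-element array the two parts hold 4 elements each, and both report an exact size.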

Unordered optimization

If the source has an encounter order (List, LinkedHashSet), a parallel stream must preserve it, which makes some operations more expensive. If order does not matter:

linkedHashSet.stream().unordered().parallel()...

// unordered() lifts the obligation to preserve encounter order.
// Order-sensitive operations such as distinct(), limit(), skip() and findFirst() then need less buffering and coordination between threads.
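A sketch of a case where dropping the order helps: limit() on an unordered parallel stream may return any n matching elements instead of the first n. The method name is illustrative.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

class UnorderedDemo {
    // Returns some n even elements; which ones is deliberately unspecified.
    static Set<Integer> anyEvens(LinkedHashSet<Integer> source, int n) {
        return source.stream()
                .unordered() // give up encounter order
                .parallel()
                .filter(x -> x % 2 == 0)
                .limit(n)    // free to take any n elements, not the first n
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        LinkedHashSet<Integer> source = new LinkedHashSet<>();
        for (int i = 1; i <= 100; i++) {
            source.add(i);
        }
        System.out.println(anyEvens(source, 3)); // three even numbers, not necessarily 2, 4, 6
    }
}
```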

Edge Cases

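FlatMap Constraints: in a parallel stream, each inner stream returned by the flatMap mapper is consumed sequentially by the ForkJoin task that handles its outer element, even if the inner stream is marked parallel. A few outer elements with large inner streams therefore parallelize poorly.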
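A sketch that makes this visible by recording, per outer element, which threads handled its inner elements. Names are illustrative, and the result reflects the current OpenJDK implementation of flatMap rather than a documented guarantee.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class FlatMapDemo {
    // Maps each outer element to the set of threads that processed its inner elements.
    static Map<Integer, Set<String>> threadsPerOuter(int outerCount, int innerCount) {
        Map<Integer, Set<String>> seen = new ConcurrentHashMap<>();
        IntStream.range(0, outerCount).parallel().boxed()
                .flatMap(outer -> IntStream.range(0, innerCount)
                        .parallel() // no effect here: the inner stream is consumed sequentially
                        .mapToObj(inner -> outer)) // tag every inner element with its outer id
                .forEach(outer -> seen
                        .computeIfAbsent(outer, k -> ConcurrentHashMap.newKeySet())
                        .add(Thread.currentThread().getName()));
        return seen;
    }

    public static void main(String[] args) {
        // Each outer element should list exactly one thread.
        threadsPerOuter(8, 1_000).forEach((outer, threads) ->
                System.out.println(outer + " -> " + threads));
    }
}
```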

When NOT to create a custom ForkJoinPool

For simple tasks — commonPool covers 95% of cases
For I/O-bound tasks — use virtual threads (Java 21+) or CompletableFuture
If you don’t control shutdown: a pool that is never shut down can leak threads, especially when a new pool is created per call
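For the I/O-bound case, a sketch of the CompletableFuture alternative with a dedicated executor. fetch() is a hypothetical stand-in for a blocking call, and the pool size is an arbitrary assumption.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class IoBoundDemo {
    // Hypothetical blocking call (HTTP request, JDBC query, ...).
    static String fetch(String id) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "payload-" + id;
    }

    static List<String> fetchAll(List<String> ids, int threads) {
        ExecutorService io = Executors.newFixedThreadPool(threads);
        try {
            // Start every request first; joining inside this map() would serialize them.
            List<CompletableFuture<String>> futures = ids.stream()
                    .map(id -> CompletableFuture.supplyAsync(() -> fetch(id), io))
                    .collect(Collectors.toList());
            // Then wait for the results, in the original order.
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            io.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(List.of("a", "b", "c"), 3));
    }
}
```

Blocking calls occupy the executor's own threads here, so commonPool stays free for CPU-bound streams.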

Diagnostics

Thread Names: in a lambda print Thread.currentThread().getName(). You will see ForkJoinPool.commonPool-worker-N and also the calling thread (for example main), which takes part in the work
VisualVM: “Threads” tab will show load on all commonPool workers
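A sketch of the thread-name check, collecting the names instead of printing them from every element. The class name is illustrative; the exact set of names depends on the machine and the JDK.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class ThreadNamesDemo {
    // Returns the names of all threads that executed part of the parallel stream.
    static Set<String> namesSeen() {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.range(0, 100_000).parallel()
                .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        // Typically the caller thread plus several ForkJoinPool.commonPool-worker-N threads.
        namesSeen().forEach(System.out::println);
    }
}
```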

Interview Cheat Sheet

Must know:

Two methods: collection.parallelStream() and stream.parallel(), which do the same thing
By default uses ForkJoinPool.commonPool() with parallelism = number of cores - 1 (the calling thread also does work)
parallel() and sequential() can be chained; the last call wins
Partially parallel streams do not exist: the entire pipeline is either parallel or sequential
For load isolation in production, a custom ForkJoinPool is used
Arrays.stream(arr).parallel() splits cheaply (SIZED, O(1) access); ArrayList is comparable, LinkedList and iterator-based sources are much slower
.unordered() removes the obligation to preserve encounter order, which speeds up operations such as distinct() and limit() in parallel
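The default size can be checked at runtime. In this sketch expectedDefaultParallelism is a helper invented for illustration; it encodes the "cores - 1, but at least 1" rule as implemented in OpenJDK and assumes the java.util.concurrent.ForkJoinPool.common.parallelism system property is not set.

```java
import java.util.concurrent.ForkJoinPool;

class CommonPoolSizeDemo {
    // Assumed default: one less than the number of processors, but never below 1.
    static int expectedDefaultParallelism(int cores) {
        return Math.max(1, cores - 1);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores            = " + cores);
        System.out.println("expected default = " + expectedDefaultParallelism(cores));
        System.out.println("commonPool       = " + ForkJoinPool.getCommonPoolParallelism());
    }
}
```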

Common follow-up questions:

How to check if a stream is parallel? Call stream.isParallel(), which returns true/false
Can you make part of a stream parallel? No, the parallelism flag applies to the entire pipeline
Why is Arrays.stream().parallel() fast? An array is SIZED with O(1) access, so the Spliterator splits it exactly in half; an ArrayList behaves similarly, a LinkedList does not
When to use a custom ForkJoinPool? When a heavy parallel task must not compete with everything else that shares commonPool, for example to isolate components in a Spring Boot application

Red flags (DO NOT say):

“Parallel stream is always faster than regular”: on small data the ForkJoin overhead makes it slower
“You can control the number of threads via parallelStream”: there is no per-stream setting; the options are a custom ForkJoinPool or the global java.util.concurrent.ForkJoinPool.common.parallelism system property
“parallelStream and CompletableFuture are the same”: these are different abstractions with different use cases
“You can make one stream half parallel”: the flag is binary, and chaining switches the entire pipeline

Related topics:

[[12. What potential problems can occur with parallel streams]]
[[13. What is ForkJoinPool and how is it related to parallel streams]]
[[10. When to use parallel streams]]
[[9. What are parallel streams]]