Question 12 · Section 8

What potential problems can occur with parallel streams?

Parallel streams do not always make code faster. Here are the main problems:


Junior Level

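1. Shared thread pool blocking. By default, all parallel streams in a JVM run on the single ForkJoinPool.commonPool(). If one stream blocks its workers on a long HTTP request, every other parallel stream has to wait for a free worker.

2. Slower than a regular loop. For small lists, a parallel stream will be slower due to the cost of task splitting and result merging. As an illustrative figure, the ForkJoinTask overhead for a list of 100 elements can be ~50 µs, while a simple loop takes ~1 µs. A common rule of thumb is N x Q > 10,000, where N is the number of elements and Q is the cost of processing one element: with trivial per-element work, parallelism pays off from ~10,000 elements, while expensive per-element work can justify it much sooner.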

3. Thread-safety issues

// DISASTER: ArrayList is not thread-safe (numbers is any List<Integer>)
List<Integer> list = new ArrayList<>();
numbers.parallelStream().forEach(list::add);

You may get an ArrayIndexOutOfBoundsException, null entries, or silently lost elements.
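
A thread-safe alternative is to let the stream build the result itself. The sketch below is a minimal illustration, not a prescribed fix; the class name SafeCollect and the method doubled are hypothetical names chosen for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class SafeCollect {
    // Doubles each number in parallel without touching any shared mutable list.
    static List<Integer> doubled(List<Integer> numbers) {
        return numbers.parallelStream()
                .map(n -> n * 2)
                // Each worker fills its own partial list; the framework merges them in encounter order.
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.rangeClosed(1, 1_000).boxed().collect(Collectors.toList());
        System.out.println(doubled(numbers).size());
    }
}
```

Because collect() performs the merge, no element is lost and the result keeps the order of the source list.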

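4. Unpredictable order. parallelStream().forEach() processes elements in a non-deterministic order that can differ from run to run. Use forEachOrdered() when the encounter order matters.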
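
To illustrate the ordering point, here is a small sketch that computes in parallel but consumes results in encounter order with forEachOrdered(). OrderedOutput and squaresInOrder are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

final class OrderedOutput {
    // Squares 1..n in parallel, but hands the results to the consumer in encounter order.
    static List<Integer> squaresInOrder(int n) {
        List<Integer> out = new ArrayList<>();
        IntStream.rangeClosed(1, n)
                .parallel()
                .map(i -> i * i)
                .boxed()
                // forEachOrdered runs the action one element at a time, in encounter order,
                // so a plain ArrayList is acceptable here (it would not be with forEach).
                .forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(squaresInOrder(5));
    }
}
```

The trade-off is that the ordered terminal step limits how much of the pipeline actually benefits from parallelism.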

Middle Level

Race Conditions

Lambdas in streams should be stateless:

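// BAD: every element updates shared external state
AtomicInteger counter = new AtomicInteger();
numbers.parallelStream().forEach(n -> counter.incrementAndGet());

The result is correct because AtomicInteger is thread-safe, but every worker thread competes for the same counter, and that contention can make the stream slower than a regular loop. With a non-thread-safe holder in its place (an int[] cell, a HashMap), the same pattern is a genuine race condition that silently loses updates. Prefer a built-in reduction such as count() or sum().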
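
A stateless version of the same idea can be sketched with a built-in reduction. This is a minimal example under the assumption that the goal is simply to count matching elements; StatelessCount and countEven are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class StatelessCount {
    // Counts even numbers with a reduction instead of a shared counter.
    static long countEven(List<Integer> numbers) {
        return numbers.parallelStream()
                .filter(n -> n % 2 == 0)
                .count(); // each worker counts its own chunk; partial counts are combined at the end
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.rangeClosed(1, 1_000).boxed().collect(Collectors.toList());
        System.out.println(countEven(numbers));
    }
}
```

No lambda in this pipeline touches state outside the stream, so there is nothing for the threads to contend on.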

Inefficient operations in parallelism

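  • limit() and skip(): On an ordered parallel stream they must respect encounter order, which forces buffering and coordination between threads
  • sorted(): Acts as a full barrier, since nothing downstream can start until every element has arrived; parallel sorting is effective only on very large volumes
  • findFirst(): Must return the first element in encounter order, so a match found early in a later chunk still has to wait for the earlier chunks (use findAny() when any match will do)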
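
The difference between the ordered and unordered variants of these operations can be sketched as follows. The class ShortCircuit and its methods are hypothetical helpers written for this illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

final class ShortCircuit {
    // Must return the first multiple of k in encounter order.
    static Optional<Integer> firstMultipleOf(List<Integer> numbers, int k) {
        return numbers.parallelStream().filter(n -> n % k == 0).findFirst();
    }

    // May return whichever multiple of k some worker finds first.
    static Optional<Integer> anyMultipleOf(List<Integer> numbers, int k) {
        return numbers.parallelStream().filter(n -> n % k == 0).findAny();
    }

    // unordered() tells the pipeline that any 'max' matches are acceptable,
    // which relaxes the ordering constraint on limit().
    static List<Integer> someMultiplesOf(List<Integer> numbers, int k, int max) {
        return numbers.parallelStream()
                .unordered()
                .filter(n -> n % k == 0)
                .limit(max)
                .collect(Collectors.toList());
    }
}
```

With findAny() and unordered().limit(), the exact elements returned can vary between runs; only use them when the caller does not depend on which matches come back.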

Loss of data locality

CPUs favor sequential reads: the hardware prefetcher loads neighboring bytes into the fast L1/L2 cache ahead of use. When threads chase references scattered across the heap (for example, LinkedList nodes or boxed Integer objects), the prefetcher cannot predict the next address, so each access risks a cache miss and the core stalls for 100+ cycles waiting on RAM.
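
As a rough illustration of the locality point, the sketch below sums the same values from a contiguous int[] and from a LinkedList<Integer>. Both return the same result; the assumption (not measured here) is that the array version is typically friendlier to the cache and splits into even chunks, while the linked list must be walked node by node. LocalityDemo and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedList;

final class LocalityDemo {
    // Contiguous primitives: each worker scans its own slice of the array sequentially.
    static long sumArray(int[] values) {
        return Arrays.stream(values).parallel().asLongStream().sum();
    }

    // Pointer-based source: nodes and boxed Integers may be scattered across the heap,
    // and a linked list cannot be split without traversing it.
    static long sumLinked(LinkedList<Integer> values) {
        return values.parallelStream().mapToLong(Integer::longValue).sum();
    }
}
```

Any actual speed difference should be confirmed with a benchmark on the target hardware rather than assumed.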

ThreadLocal Dangers

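As a general rule, do not use ThreadLocal inside a parallel stream: ForkJoinPool workers are reused between tasks, so a value set by one task is still visible to the next task that runs on the same thread. It is acceptable if you control a custom ForkJoinPool and clear the ThreadLocal in a finally block.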
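
The cleanup-in-finally pattern mentioned above might look like the following sketch. The BUFFER field, the label method, and the class name are hypothetical; the example runs on whatever pool executes the stream and only demonstrates the cleanup, not pool isolation.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class ThreadLocalCleanup {
    // Hypothetical per-thread scratch buffer.
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    static String label(int n) {
        try {
            StringBuilder sb = BUFFER.get();
            sb.append("item-").append(n);
            return sb.toString();
        } finally {
            // Drop the value so the next task on this reused worker starts clean.
            BUFFER.remove();
        }
    }

    static List<String> labels(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .mapToObj(ThreadLocalCleanup::label)
                .collect(Collectors.toList());
    }
}
```

Without the remove() call, text appended by one task would still be in the buffer when the same worker picks up the next element.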

Senior Level

Common Pool Starvation

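If component A launches a heavy parallel stream, component B's parallel streams slow down, because both draw on the same common pool. In enterprise applications (Spring Boot) this is critical.

Solution: isolate the work in a custom ForkJoinPool. This relies on parallel stream tasks running in the pool of the thread that starts the terminal operation, which is long-standing behavior but not a documented guarantee: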

ForkJoinPool isolatedPool = new ForkJoinPool(8);
isolatedPool.submit(() -> stream.parallel()...).get();
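
A fuller sketch of the isolation idea, with the pool shut down afterwards, could look like this. IsolatedPool and sumOfSquares are hypothetical names; join() is used instead of get() only to avoid checked exceptions in a short example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

final class IsolatedPool {
    // Runs a parallel stream inside a dedicated pool instead of the common pool.
    static long sumOfSquares(long n, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // The terminal operation starts on a worker of 'pool', so the stream's
            // subtasks are forked into 'pool' as well (implementation behavior).
            return pool.submit(() ->
                    LongStream.rangeClosed(1, n).parallel().map(i -> i * i).sum()
            ).join();
        } finally {
            pool.shutdown(); // release the worker threads; a long-lived service might reuse one pool instead
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10, 2));
    }
}
```

Creating a pool per call is only reasonable for a demonstration; in a real service a shared, bounded pool per component is the more likely design.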

False Sharing

When threads update adjacent elements of a primitive array, the writes land on the same CPU cache line and each thread keeps invalidating the other cores' copy of it. Throughput can degrade to the level of a single core or below.
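
One common mitigation is to space the hot elements apart so that no two of them share a cache line. The sketch below assumes 64-byte cache lines (8 long values per line), which is typical but not universal; PaddedCounters, STRIDE, and countInParallel are hypothetical names.

```java
import java.util.stream.IntStream;

final class PaddedCounters {
    // Assumption: 64-byte cache lines, i.e. 8 longs per line.
    private static final int STRIDE = 8;

    // Each logical counter gets its own slot, STRIDE elements away from its neighbors.
    static long countInParallel(int counters, int incrementsPerCounter) {
        long[] slots = new long[counters * STRIDE];
        IntStream.range(0, counters).parallel().forEach(c -> {
            int index = c * STRIDE; // with index = c, neighboring counters would share a line
            for (int i = 0; i < incrementsPerCounter; i++) {
                slots[index]++; // only the task handling counter c ever writes this slot
            }
        });
        long total = 0;
        for (int c = 0; c < counters; c++) {
            total += slots[c * STRIDE];
        }
        return total;
    }
}
```

Each stream element is processed by exactly one task, so the slots need no synchronization; the padding only affects performance, not correctness.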

Overhead breakdown

  1. Data splitting (Splitting)
  2. Task object creation (Task allocation)
  3. Thread scheduling and context switching across cores
  4. Result merging (Merging/Combining)

Diagnostics

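  • -Djava.util.concurrent.ForkJoinPool.common.parallelism=0: Temporarily disables the common pool's worker threads for tests, so parallel streams run on the calling thread. Set it at JVM startup, before the common pool is first used
  • Thread Dumps: If the application “hangs”, check the ForkJoinPool.commonPool-worker threads. If all of them are stuck in network calls, the problem is found
  • JMH Benchmarking: The most reliable way to prove the benefit of parallelism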
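
As a quick complement to the checks above, a small probe can print the common pool's configured parallelism and the threads a parallel stream actually ran on. This is a diagnostic sketch only; PoolDiagnostics and threadsUsed are hypothetical names.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

final class PoolDiagnostics {
    // Records the name of every thread that processed at least one element.
    static Set<String> threadsUsed(int n) {
        Set<String> names = ConcurrentHashMap.newKeySet(); // thread-safe set, safe to fill from forEach
        IntStream.range(0, n).parallel()
                .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("cores: " + Runtime.getRuntime().availableProcessors());
        System.out.println("common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        System.out.println("threads used: " + threadsUsed(10_000));
    }
}
```

Running it with and without the parallelism system property shows whether the setting took effect in the current JVM.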

Interview Cheat Sheet

Must know:

  • Parallel streams share one ForkJoinPool.commonPool(), so one long blocking request can stall all the others
  • Fork/Join overhead pays off from ~10,000 elements of trivial work (N x Q > 10,000 rule, where Q is the per-element cost)
  • parallelStream().forEach() processes elements in a non-deterministic order; for ordered results use forEachOrdered()
  • Modifying external non-thread-safe state in a parallel stream leads to race conditions: lost updates, corrupted collections, or exceptions such as ArrayIndexOutOfBoundsException
  • limit(), skip(), findFirst() must respect encounter order, which requires coordination between threads and makes them inefficient
  • CPU cache locality can be lost in parallelism: threads reading from scattered memory areas hit cache misses that cost 100+ cycles
  • In Spring Boot the shared commonPool is a critical problem: component A slows down component B

Common follow-up questions:

  • When is a parallel stream slower than a regular loop? On small data (< 10K elements of cheap work) and when threads contend on shared external state
  • How to isolate parallel stream load? Run via a custom ForkJoinPool: pool.submit(() -> stream.parallel()...).get()
  • What to replace findFirst() with in parallel mode? Use findAny(): it does not have to respect encounter order
  • How to diagnose commonPool problems? Take thread dumps, and disable parallelism via the system property -Djava.util.concurrent.ForkJoinPool.common.parallelism=0

Red flags (DO NOT say):

  • “Parallel stream is always faster”: overhead and contention can make it much slower
  • “You can use ArrayList in parallel forEach”: ArrayList is not thread-safe, so lost elements or exceptions are likely
  • “ThreadLocal is safe in parallel streams”: workers are reused between tasks, so ThreadLocal values will “leak”
  • “Parallel stream decides how many threads to use on its own”: it uses the shared commonPool, whose parallelism defaults to the number of available processors minus one

Related topics:

  • [[11. How to create a parallel stream]]
  • [[13. What is ForkJoinPool and how is it related to parallel streams]]
  • [[14. Can you modify state of external variables in Stream operations]]
  • [[16. Why you should avoid side effects in Stream]]
  • [[10. When to use parallel streams]]