Question 17 · Section 8

What does the reduce() operation do?


🟢 Junior Level

reduce() is a terminal operation that collapses a stream into a single value by repeatedly applying a binary function to the accumulated result and the next element.

Three variants:

// 1. Without an initial value: returns Optional (empty for an empty stream)
Optional<Integer> maybeSum = numbers.stream()
    .reduce((a, b) -> a + b);

// 2. With an initial value (identity): returns the value directly
int sum = numbers.stream()
    .reduce(0, (a, b) -> a + b);

// 3. With a type change (String elements, Integer result): needs a combiner
int lengthSum = strings.stream()
    .reduce(0, (total, str) -> total + str.length(), Integer::sum);

Simple example:

List<Integer> numbers = List.of(1, 2, 3, 4, 5);
int sum = numbers.stream().reduce(0, (a, b) -> a + b);
// Result: 15

Step-by-step breakdown for [1, 2, 3, 4, 5]:

  • Step 1: a=0 (identity), b=1 → result 1
  • Step 2: a=1 (accumulated), b=2 → result 3
  • Step 3: a=3, b=3 → result 6
  • Step 4: a=6, b=4 → result 10
  • Step 5: a=10, b=5 → result 15
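
The steps above can be reproduced by printing from inside the accumulator. This is only a minimal sketch for a sequential stream: the class name ReduceTrace and the method sumWithTrace are hypothetical, and printing from an accumulator is a side effect that is acceptable for a demo but not for production code.

```java
import java.util.List;

final class ReduceTrace {
    // Sums the list with reduce(identity, accumulator) and logs every accumulator call.
    static int sumWithTrace(List<Integer> numbers) {
        return numbers.stream().reduce(0, (a, b) -> {
            int result = a + b;
            System.out.println("a=" + a + ", b=" + b + " -> " + result);
            return result;
        });
    }

    public static void main(String[] args) {
        int sum = sumWithTrace(List.of(1, 2, 3, 4, 5));
        System.out.println("sum = " + sum); // prints "sum = 15"
    }
}
```

On a sequential stream the five printed lines match Steps 1 to 5. On a parallel stream the calls would be grouped differently, so the trace would not follow this order.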

🟡 Middle Level

Mathematical Requirements

For reduce to work correctly (especially in parallel mode):

  • Identity: the identity is the neutral element, so accumulator.apply(identity, x) = x. For addition, identity = 0 (0 + x = x). For multiplication, identity = 1 (1 × x = x). With identity = 1 for addition, a sequential sum comes out 1 too high, and a parallel sum is inflated further because the identity is applied once per chunk.
  • Associativity: (a op b) op c == a op (b op c)
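  • Commutativity: not required. On an ordered stream, partial results are combined in encounter order, so an associative but non-commutative operation such as string concatenation still gives the correct result in parallel. It only matters for unordered streams.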
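
A small sketch can make these requirements concrete. The class ReduceLaws and its methods are hypothetical names chosen for illustration. The parallel result with a wrong identity depends on how the stream is split, so the code deliberately does not state an exact value for it.

```java
import java.util.List;

final class ReduceLaws {
    // Sequential sum with a caller-supplied identity value.
    static int sumWithIdentity(List<Integer> numbers, int identity) {
        return numbers.stream().reduce(identity, Integer::sum);
    }

    // Same reduction on a parallel stream: the identity is used once per chunk.
    static int parallelSumWithIdentity(List<Integer> numbers, int identity) {
        return numbers.parallelStream().reduce(identity, Integer::sum);
    }

    // Concatenation is associative but not commutative.
    static String parallelConcat(List<String> parts) {
        return parts.parallelStream().reduce("", String::concat);
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5);
        System.out.println(sumWithIdentity(numbers, 0));         // 15, correct identity
        System.out.println(sumWithIdentity(numbers, 1));         // 16, off by one
        System.out.println(parallelSumWithIdentity(numbers, 1)); // at least 16, depends on splitting
        System.out.println(parallelConcat(List.of("a", "b", "c", "d"))); // abcd
    }
}
```

The last call shows that an ordered parallel stream keeps encounter order even though concatenation is not commutative.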

Reduce vs Collect

This is the “golden interview question”:

  • reduce: Creates a new object at each step. Good for numbers and other small immutable values
  • collect: Modifies an existing container. More efficient for collections and for building long strings

// BAD: every step copies the whole accumulated string, O(n²) overall
stream.reduce("", String::concat)

// GOOD: appends to a single mutable buffer, O(n) overall
stream.collect(Collectors.joining())
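
The two snippets above can be put side by side in a runnable form. This is a sketch with hypothetical names (JoinStrings, joinWithReduce, joinWithCollect). Both methods return the same text and differ only in how much copying they do.

```java
import java.util.List;
import java.util.stream.Collectors;

final class JoinStrings {
    // Immutable reduction: a new String is allocated for every element.
    static String joinWithReduce(List<String> words) {
        return words.stream().reduce("", String::concat);
    }

    // Mutable reduction: the collector appends to one internal buffer.
    static String joinWithCollect(List<String> words) {
        return words.stream().collect(Collectors.joining());
    }

    public static void main(String[] args) {
        List<String> words = List.of("re", "du", "ce");
        System.out.println(joinWithReduce(words));  // reduce
        System.out.println(joinWithCollect(words)); // reduce
    }
}
```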

Boxing

reduce on a Stream<Integer> or Stream<Long> boxes and unboxes a value at every step. In hot code paths, prefer primitive streams and their built-in reductions: IntStream.sum(), LongStream.max().
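
As a sketch of the difference, the hypothetical class PrimitiveReduce below computes the same sum with a boxed reduce and with a primitive stream, and shows that LongStream.max() returns an OptionalLong rather than a bare value.

```java
import java.util.List;
import java.util.OptionalLong;
import java.util.stream.LongStream;

final class PrimitiveReduce {
    // Boxed: every step unboxes two Integers and boxes the result.
    static int boxedSum(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // Primitive: mapToInt switches to IntStream, and sum() works on plain ints.
    static int primitiveSum(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).sum();
    }

    // max() on a primitive stream returns OptionalLong, empty for no input.
    static OptionalLong maxOf(long... values) {
        return LongStream.of(values).max();
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5);
        System.out.println(boxedSum(numbers));     // 15
        System.out.println(primitiveSum(numbers)); // 15
        System.out.println(maxOf(3L, 9L, 4L).getAsLong()); // 9
    }
}
```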

🔴 Senior Level

Third reduce signature

<U> U reduce(U identity,
             BiFunction<U, ? super T, U> accumulator,
             BinaryOperator<U> combiner)

The combiner is what makes this variant work on parallelStream(): it tells the stream how to merge partial results of type U computed by different threads. On a sequential stream it is typically never called.
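
A minimal sketch of the three-argument form on a parallel stream, assuming a hypothetical class CombinerDemo. The accumulator folds one String into an Integer subtotal, and the combiner adds two subtotals together.

```java
import java.util.List;

final class CombinerDemo {
    // Total number of characters across all strings, computed in parallel.
    static int totalLength(List<String> strings) {
        return strings.parallelStream()
                .reduce(0,
                        (total, str) -> total + str.length(), // accumulator: (Integer, String) -> Integer
                        Integer::sum);                        // combiner: (Integer, Integer) -> Integer
    }

    public static void main(String[] args) {
        System.out.println(totalLength(List.of("a", "bb", "ccc"))); // 6
    }
}
```

Because addition is associative and 0 is its identity, the result is the same however the stream is split.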

Identity Mutation

Never mutate the identity object. The same instance is passed to every parallel branch, so mutating it in one branch corrupts the results of all the others.
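
The sketch below illustrates the problem with hypothetical names (IdentityMutation, reduceIntoSharedList, collectIntoList). The first method is the anti-pattern: even on a sequential stream, the returned list is the identity instance itself. The second method shows the usual alternative, where collect creates a fresh container for each chunk.

```java
import java.util.ArrayList;
import java.util.List;

final class IdentityMutation {
    // Anti-pattern: the accumulator mutates and returns the identity list itself.
    // Run sequentially here; on a parallel stream all branches would share this list.
    static List<Integer> reduceIntoSharedList(List<Integer> source, List<Integer> identity) {
        return source.stream().reduce(
                identity,
                (list, x) -> { list.add(x); return list; },
                (left, right) -> { left.addAll(right); return left; });
    }

    // Mutable reduction done with collect: the supplier creates a new list per chunk.
    static List<Integer> collectIntoList(List<Integer> source) {
        return source.parallelStream()
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    public static void main(String[] args) {
        List<Integer> identity = new ArrayList<>();
        List<Integer> result = reduceIntoSharedList(List.of(1, 2, 3), identity);
        System.out.println(result == identity);               // true: the identity was mutated
        System.out.println(collectIntoList(List.of(1, 2, 3))); // [1, 2, 3]
    }
}
```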

Edge Cases

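  • Parallel Combiner: a combiner that is inconsistent with the accumulator gives a wrong result without throwing any exception
  • Null Handling: if the reduction result is null, reduce(BinaryOperator) throws NullPointerException, because Optional cannot hold null
  • Empty Streams: the single-argument version returns Optional; do not call .get() without checking, use orElse() or ifPresent() instead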
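
The empty-stream case can be sketched as follows. ReduceEdgeCases and its methods are hypothetical names. The point is that the Optional is unwrapped with orElse rather than get().

```java
import java.util.List;
import java.util.Optional;

final class ReduceEdgeCases {
    // Single-argument reduce: the result is Optional because the stream may be empty.
    static Optional<Integer> maxOf(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::max);
    }

    // Safe unwrapping with a fallback instead of calling get().
    static int maxOrDefault(List<Integer> numbers, int fallback) {
        return maxOf(numbers).orElse(fallback);
    }

    // Two-argument reduce: an empty stream simply yields the identity.
    static int sumOf(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(maxOf(List.of()).isPresent());     // false
        System.out.println(maxOrDefault(List.of(), -1));      // -1
        System.out.println(maxOrDefault(List.of(3, 7, 5), -1)); // 7
        System.out.println(sumOf(List.of()));                 // 0
    }
}
```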

Diagnostics

Use the IntelliJ IDEA Stream Debugger (Trace Current Stream Chain): it visualizes how the elements flow through each operation of the chain.


🎯 Interview Cheat Sheet

Must know:

  • reduce() is a terminal operation that collapses a stream into a single value
  • Three signatures: (1) with BinaryOperator, returns Optional, (2) with identity + BinaryOperator, (3) with identity + accumulator + combiner (for parallelism and type change)
  • Mathematical requirements: Identity (identity op x = x), Associativity ((a op b) op c = a op (b op c)); Commutativity is not required for ordered streams
  • reduce() creates a new object at each step (immutable reduction); for collections use collect()
  • For string concatenation: reduce("", String::concat) is O(n²), collect(Collectors.joining()) is O(n)
  • Identity mutation is a serious error: the identity object is reused across parallel branches
  • On parallelStream(), a correct combiner is mandatory; otherwise the result is silently wrong, with no exception

Frequent follow-up questions:

  • Why does reduce with identity = 1 for addition give a wrong result? The identity must be the neutral element: 0 for addition, 1 for multiplication. Otherwise a sequential sum is off by 1, and a parallel sum gains an extra 1 for every chunk.
  • How does reduce differ from collect? reduce creates a new object at each step (immutable), collect modifies a single container (mutable). Use reduce for numbers and small immutable values, collect for collections.
  • Why is the third parameter, the combiner, needed in reduce? When the result type U differs from the element type T, the accumulator (U, T) -> U cannot merge two partial results. The combiner (U, U) -> U does that in parallelStream().
  • What will reduce return on an empty stream? The version without identity returns Optional.empty(). The version with identity returns the identity value.

Red flags (DO NOT say):

  • “reduce can be used to build a List”: technically possible, but it breaks in parallel mode (the mutable identity is shared across branches)
  • “Order in reduce is always guaranteed”: in parallel mode the grouping of operations is not fixed, so the result is only stable when the operation is associative
  • “Identity can be mutated inside the accumulator”: this will break all parallel computations
  • “reduce and collect are interchangeable”: for building collections, collect is O(n), while reduce with a copy at each step is O(n²)

Related topics:

  • [[What is the difference between reduce() and collect()]]
  • [[What does collect() operation do]]
  • [[What potential problems can occur with parallel streams]]
  • [[Can you modify external variable state in Stream operations]]