What is Bulkhead pattern

🟢 Junior Level

Bulkhead is a pattern that isolates resources for different operations so that a failure in one does not affect others.

Real-life analogy: a ship's hull is divided into watertight compartments by bulkheads. If one compartment floods, the others stay dry and the ship stays afloat.

Without Bulkhead:
Thread Pool [====================]
All calls share one pool → one slow service takes all threads

With Bulkhead:
Thread Pool A [========] → Service A
Thread Pool B [========] → Service B
Thread Pool C [========] → Service C
Service A is slow → only affects Pool A
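
The two diagrams above can be reproduced with plain JDK executors. This is a minimal sketch, not Resilience4j code: the class name, method name, and pool sizes are made up for illustration. It blocks every thread that serves Service A and then checks whether a call to Service B still completes, once with a shared pool and once with separate pools.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ThreadPoolIsolationSketch {

    /**
     * Blocks all threads used for Service A, then reports whether a call to
     * Service B completes within timeoutMillis.
     * isolated = true  -> Service B has its own pool (with bulkhead)
     * isolated = false -> Service B shares the pool with Service A (without bulkhead)
     */
    static boolean serviceBResponds(boolean isolated, int threads, long timeoutMillis) {
        ExecutorService poolA = Executors.newFixedThreadPool(threads);
        ExecutorService poolB = isolated ? Executors.newFixedThreadPool(threads) : poolA;
        CountDownLatch releaseA = new CountDownLatch(1);
        try {
            // Simulate a slow Service A: each task holds a thread until released.
            for (int i = 0; i < threads; i++) {
                poolA.submit(() -> {
                    releaseA.await();
                    return "A";
                });
            }
            // With a shared pool this task sits in the queue behind A's stuck calls.
            Future<String> b = poolB.submit(() -> "B");
            return "B".equals(b.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            return false; // Service B was starved of threads
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            releaseA.countDown(); // let the blocked tasks finish
            poolA.shutdown();
            poolB.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared pool, B responds:    " + serviceBResponds(false, 4, 200));
        System.out.println("separate pools, B responds: " + serviceBResponds(true, 4, 200));
    }
}
```

With the shared pool the call to Service B times out because every thread is held by Service A. With separate pools it returns immediately.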

When NOT to use Bulkhead

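A single call path for the whole service: if the service does one atomic thing, there is nothing to isolate from anything else, so splitting thread pools is pointless
Few threads: with only a handful of threads, the overhead of managing separate pools outweighs the benefit
Fully asynchronous system: non-blocking calls do not hold threads while waiting, and backpressure limits load in a different way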

🟡 Middle Level

Implementation with Resilience4j

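import io.github.resilience4j.bulkhead.Bulkhead;
import io.github.resilience4j.bulkhead.BulkheadConfig;
import java.time.Duration;
import java.util.function.Supplier;

BulkheadConfig config = BulkheadConfig.custom()
    .maxConcurrentCalls(10)                  // at most 10 calls in flight at once
    .maxWaitDuration(Duration.ofSeconds(1))  // a caller waits up to 1 s for a free permit
    .build();

// maxWaitDuration: how long the calling thread blocks when the bulkhead is full.
// If no permit frees up in that time, the call fails with BulkheadFullException
// (a rejection by the bulkhead, not a timeout of the backend call itself).
// This is the semaphore bulkhead. The thread pool variant (ThreadPoolBulkhead)
// queues work instead and rejects once its queue is full.

Bulkhead bulkhead = Bulkhead.of("backend", config);

// backendService is assumed to be defined elsewhere.
Supplier<String> decorated = Bulkhead.decorateSupplier(bulkhead,
    () -> backendService.call());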
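
To make the semantics of maxConcurrentCalls and maxWaitDuration concrete, here is a rough JDK-only sketch of how a semaphore bulkhead behaves. It is an illustration, not Resilience4j's implementation: SemaphoreBulkheadSketch and BulkheadFullSketchException are hypothetical names standing in for Bulkhead and BulkheadFullException.

```java
import java.time.Duration;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class SemaphoreBulkheadSketch {

    /** Hypothetical stand-in for Resilience4j's BulkheadFullException. */
    static final class BulkheadFullSketchException extends RuntimeException {
        BulkheadFullSketchException(String message) {
            super(message);
        }
    }

    private final Semaphore permits;
    private final Duration maxWaitDuration;

    SemaphoreBulkheadSketch(int maxConcurrentCalls, Duration maxWaitDuration) {
        this.permits = new Semaphore(maxConcurrentCalls, true); // fair: longest waiter first
        this.maxWaitDuration = maxWaitDuration;
    }

    /** Runs the call on the caller's own thread if a permit becomes free in time. */
    <T> T execute(Supplier<T> call) {
        boolean acquired;
        try {
            // Blocks the calling thread for at most maxWaitDuration.
            acquired = permits.tryAcquire(maxWaitDuration.toMillis(), TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new BulkheadFullSketchException("interrupted while waiting for a permit");
        }
        if (!acquired) {
            throw new BulkheadFullSketchException("bulkhead is full");
        }
        try {
            return call.get();
        } finally {
            permits.release(); // always give the permit back
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        SemaphoreBulkheadSketch bulkhead = new SemaphoreBulkheadSketch(10, Duration.ofSeconds(1));
        System.out.println(bulkhead.execute(() -> "response from backend"));
    }
}
```

Note that the protected call still runs on the caller's thread. The semaphore only caps how many callers may be inside at once.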

Types of Bulkhead

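1. Semaphore:

Limits the number of concurrent calls; each call runs on the caller's thread
Lightweight, no extra threads to manage

2. Thread Pool:

Separate thread pool (with its own queue) for each service
Thread-level isolation: a slow service can only block its own pool's threads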

Common mistakes

Limit set too low:

maxConcurrentCalls = 2 → under normal load most requests are rejected
Tune the limit from observed metrics (throughput, latency, rejection rate)
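
One way to get a starting value from metrics is Little's law: average calls in flight = request rate × average latency. The sketch below applies it with a headroom multiplier. The class and method names and the 1.5 headroom factor are assumptions for illustration, not a Resilience4j feature, and the result should be checked against tail latency and the observed rejection rate.

```java
final class BulkheadSizingSketch {

    /**
     * Suggests a starting maxConcurrentCalls from observed traffic.
     *
     * @param requestsPerSecond observed request rate to the downstream service
     * @param avgLatencyMillis  observed average latency of one call
     * @param headroomFactor    assumed safety margin, e.g. 1.5 for 50% headroom
     */
    static int suggestMaxConcurrentCalls(double requestsPerSecond,
                                         double avgLatencyMillis,
                                         double headroomFactor) {
        // Little's law: average number of calls in flight at any moment.
        double averageInFlight = requestsPerSecond * avgLatencyMillis / 1000.0;
        return (int) Math.ceil(averageInFlight * headroomFactor);
    }

    public static void main(String[] args) {
        // 100 req/s at 200 ms each means about 20 calls in flight on average,
        // so a limit of 2 would reject most of them.
        System.out.println(suggestMaxConcurrentCalls(100, 200, 1.5));
    }
}
```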

🔴 Senior Level

Architectural Trade-offs

Thread Pool	Semaphore
Full isolation: calls run on the bulkhead's own threads, at the cost of per-thread overhead	Lighter (less memory), but no thread isolation: calls run on the caller's threads from the shared pool
Thread isolation	No thread isolation
More resources	Less overhead

Production Experience

resilience4j:
  bulkhead:
    instances:
      userService:
        maxConcurrentCalls: 20
      paymentService:
        maxConcurrentCalls: 10
      searchService:
        maxConcurrentCalls: 30

Best Practices

✅ Separate bulkhead for each external service
✅ Limits based on metrics
✅ Monitor rejection rate
✅ Combine with Circuit Breaker

❌ Single bulkhead for everything
❌ Limits set too low
❌ No monitoring of rejections
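
A JDK-only sketch of two of the practices above: one bulkhead per external service, and a rejection rate that can be monitored. All names here are hypothetical. Resilience4j provides its own metrics and events, which this sketch does not use.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

final class MonitoredBulkheadsSketch {

    /** One semaphore plus counters per downstream service. */
    private static final class Entry {
        final Semaphore permits;
        final LongAdder permitted = new LongAdder();
        final LongAdder rejected = new LongAdder();

        Entry(int maxConcurrentCalls) {
            this.permits = new Semaphore(maxConcurrentCalls);
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    void register(String service, int maxConcurrentCalls) {
        entries.put(service, new Entry(maxConcurrentCalls));
    }

    /** Runs the call if a permit is free; otherwise records a rejection and returns the fallback. */
    <T> T call(String service, Supplier<T> call, T fallback) {
        Entry entry = lookup(service);
        if (!entry.permits.tryAcquire()) {
            entry.rejected.increment();
            return fallback;
        }
        entry.permitted.increment();
        try {
            return call.get();
        } finally {
            entry.permits.release();
        }
    }

    /** Share of calls rejected so far for this service, between 0.0 and 1.0. */
    double rejectionRate(String service) {
        Entry entry = lookup(service);
        long rejected = entry.rejected.sum();
        long total = rejected + entry.permitted.sum();
        return total == 0 ? 0.0 : (double) rejected / total;
    }

    private Entry lookup(String service) {
        Entry entry = entries.get(service);
        if (entry == null) {
            throw new IllegalArgumentException("unknown service: " + service);
        }
        return entry;
    }

    public static void main(String[] args) {
        MonitoredBulkheadsSketch bulkheads = new MonitoredBulkheadsSketch();
        bulkheads.register("userService", 20);
        bulkheads.register("paymentService", 10);
        System.out.println(bulkheads.call("paymentService", () -> "paid", "payment unavailable"));
        System.out.println(bulkheads.rejectionRate("paymentService"));
    }
}
```

A rejection rate that stays well above zero under normal load suggests the limit is too low. A sudden jump suggests the downstream service has slowed down.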

🎯 Interview Cheat Sheet

Must know:

Bulkhead isolates resources (thread pools or concurrency permits) for different operations
Analogy: ship compartments — one floods, the ship does not sink
Two types: Thread Pool (full isolation, overhead) and Semaphore (lighter, no thread isolation)
Without Bulkhead, one slow service takes all threads from a shared pool
maxConcurrentCalls is tuned per service based on metrics
Combine with Circuit Breaker: the bulkhead contains the damage, the Circuit Breaker stops calls to a failing dependency
NOT needed for a single atomic call, very few threads, or fully async systems

Frequent follow-up questions:

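Thread Pool vs Semaphore? Thread Pool = own threads and queue, full isolation, more overhead. Semaphore = caps concurrent calls on the caller's thread (the caller waits up to maxWaitDuration, then gets BulkheadFullException), lighter.
How to set maxConcurrentCalls? Based on metrics: observe concurrent calls and latency under normal load, then set the limit with some headroom.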
What happens with too small a limit? Most requests rejected — BulkheadFullException.
Why a separate bulkhead for each service? So a slow Payment Service does not affect User Service.

Red flags (NOT to say):

“Bulkhead = Circuit Breaker”: no, Bulkhead isolates resources by limiting concurrent calls, while Circuit Breaker stops calls to a dependency that is already failing
“One bulkhead for everything is enough” — no, then there’s no isolation
“maxConcurrentCalls = 2 is good protection” — no, most requests will be rejected
“Semaphore is not needed, only Thread Pool” — Semaphore is lighter, suitable for many cases

Related topics:

[[17. How to ensure fault tolerance of microservices]]
[[5. What is Circuit Breaker pattern]]
[[6. How does Circuit Breaker work and what states does it have]]
[[19. What is Retry pattern and how to use it correctly]]
[[15. How to organize communication between microservices]]