How to properly execute multiple parallel requests to microservices
When you need to get data from multiple microservices, don't make sequential requests — execute them in parallel.
🟢 Junior Level
Sequential (slow)
// ❌ Each request waits for the previous one to complete
User user = userService.getUser(userId); // 200ms
Order orders = orderService.getOrders(userId); // 300ms
Notification notifs = notificationService.get(userId); // 150ms
// Total time: 200 + 300 + 150 = 650ms
Parallel (fast)
// ✅ All requests start simultaneously
CompletableFuture<User> user = userService.getUserAsync(userId);
CompletableFuture<List<Order>> orders = orderService.getOrdersAsync(userId);
CompletableFuture<List<Notification>> notifs = notificationService.getAsync(userId);
// Wait for all at once
CompletableFuture.allOf(user, orders, notifs)
    .thenAccept(v -> {
        Dashboard dashboard = new Dashboard(
            user.join(),
            orders.join(),
            notifs.join()
        );
        // Total time: max(200, 300, 150) = 300ms — more than 2x faster!
    });
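To make the numbers concrete, here is a self-contained, runnable sketch of the same idea with simulated services — all names and delays are illustrative, not a real service API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Simulated microservices: each "remote call" just sleeps and returns a value.
public class ParallelFetchDemo {

    static String slowCall(String value, long millis) {
        try { TimeUnit.MILLISECONDS.sleep(millis); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return value;
    }

    public static String fetchDashboard() {
        // All three calls start immediately, in parallel
        CompletableFuture<String> user   = CompletableFuture.supplyAsync(() -> slowCall("user", 150));
        CompletableFuture<String> orders = CompletableFuture.supplyAsync(() -> slowCall("orders", 150));
        CompletableFuture<String> notifs = CompletableFuture.supplyAsync(() -> slowCall("notifs", 150));

        // Wait for all, then combine — total wait ≈ max of the latencies, not their sum
        return CompletableFuture.allOf(user, orders, notifs)
            .thenApply(v -> user.join() + "|" + orders.join() + "|" + notifs.join())
            .join();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        String dashboard = fetchDashboard();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(dashboard + " in ~" + elapsedMs + "ms");
    }
}
```
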
🟡 Middle Level
Gather and Combine pattern
@Service
public class UserProfileService {

    private final RestTemplate restTemplate;
    private final Executor executor;

    public UserProfileDto getProfile(Long userId) {
        // 1. Launch all requests in parallel
        CompletableFuture<User> userFuture = CompletableFuture.supplyAsync(
            () -> restTemplate.getForObject("http://user-service/users/" + userId, User.class),
            executor
        );
        CompletableFuture<List<Order>> ordersFuture = CompletableFuture.supplyAsync(
            () -> restTemplate.getForObject("http://order-service/orders?userId=" + userId, OrderList.class).orders(),
            executor
        );
        CompletableFuture<List<Review>> reviewsFuture = CompletableFuture.supplyAsync(
            () -> restTemplate.getForObject("http://review-service/reviews?authorId=" + userId, ReviewList.class).reviews(),
            executor
        );

        // 2. Wait for all
        // allOf().join() throws CompletionException if any CF completed exceptionally.
        // After a successful allOf().join(), calling .join() on each CF again is
        // safe but redundant — they are already completed.
        CompletableFuture.allOf(userFuture, ordersFuture, reviewsFuture).join();

        // 3. Combine results
        return new UserProfileDto(
            userFuture.join(),
            ordersFuture.join(),
            reviewsFuture.join()
        );
    }
}
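The same gather-and-combine shape works without Spring. A minimal JDK-only sketch, with stub methods standing in for the RestTemplate calls (all names here are hypothetical):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class GatherCombineDemo {
    record UserProfile(String user, List<String> orders, List<String> reviews) {}

    // Stubs standing in for the remote calls
    static String fetchUser(long id)          { return "user-" + id; }
    static List<String> fetchOrders(long id)  { return List.of("order-1", "order-2"); }
    static List<String> fetchReviews(long id) { return List.of("review-1"); }

    public static UserProfile getProfile(long userId) {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        try {
            var userF    = CompletableFuture.supplyAsync(() -> fetchUser(userId), executor);
            var ordersF  = CompletableFuture.supplyAsync(() -> fetchOrders(userId), executor);
            var reviewsF = CompletableFuture.supplyAsync(() -> fetchReviews(userId), executor);

            CompletableFuture.allOf(userF, ordersF, reviewsF).join(); // throws if any failed
            return new UserProfile(userF.join(), ordersF.join(), reviewsF.join());
        } finally {
            executor.shutdown();
        }
    }
}
```
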
Error handling
// If one service goes down — the whole thing shouldn't fail
CompletableFuture<User> user = userService.getUserAsync(userId)
    .exceptionally(ex -> {
        log.error("User service unavailable", ex);
        return User.unknown(); // fallback
    });
CompletableFuture<List<Order>> orders = orderService.getOrdersAsync(userId)
    .exceptionally(ex -> {
        log.error("Order service unavailable", ex);
        return List.of(); // empty list
    });
CompletableFuture.allOf(user, orders).join();
// Works even with partial errors
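A runnable illustration of the fallback idea, using failedFuture to stand in for an unavailable service (the names are illustrative):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class FallbackDemo {
    // A call that always fails, standing in for a downed order-service
    static CompletableFuture<List<String>> failingOrdersCall() {
        return CompletableFuture.failedFuture(new RuntimeException("order-service down"));
    }

    public static List<String> ordersWithFallback() {
        return failingOrdersCall()
            .exceptionally(ex -> List.of()) // degrade to an empty list
            .join();                        // does NOT throw: fallback already applied
    }
}
```
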
Timeout for each request
CompletableFuture<User> user = userService.getUserAsync(userId)
    .orTimeout(2, TimeUnit.SECONDS);
CompletableFuture<List<Order>> orders = orderService.getOrdersAsync(userId)
    .orTimeout(3, TimeUnit.SECONDS);
// If user-service doesn't respond within 2 seconds — timeout
// But order-service can keep working (it has 3 seconds)
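A minimal runnable sketch of the timeout-plus-fallback combination. Since orTimeout completes the future exceptionally, it is paired with exceptionally here; completeOnTimeout is the built-in alternative that supplies a default value directly:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class TimeoutDemo {
    public static String userWithTimeout() {
        CompletableFuture<String> slow = new CompletableFuture<>(); // never completes
        return slow
            .orTimeout(100, TimeUnit.MILLISECONDS) // fails with TimeoutException after 100ms
            .exceptionally(ex -> "unknown-user")   // fallback once the timeout fires
            .join();
    }
}
```
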
Executor configuration
@Configuration
public class AsyncConfig {

    @Bean("microserviceExecutor")
    public Executor microserviceExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(20);
        executor.setQueueCapacity(50);
        executor.setThreadNamePrefix("ms-call-");
        executor.setRejectedExecutionHandler(new CallerRunsPolicy());
        executor.initialize();
        return executor;
    }
}
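For reference, a plain-JDK equivalent of the Spring configuration above, assuming the same pool sizing (the counter-based thread naming is an assumption, not what ThreadPoolTaskExecutor does internally):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ExecutorConfigDemo {
    private static final AtomicInteger seq = new AtomicInteger();

    // 10 core threads, 20 max, queue of 50, CallerRunsPolicy as backpressure:
    // when pool and queue are full, the submitting thread runs the task itself.
    public static ThreadPoolExecutor microserviceExecutor() {
        return new ThreadPoolExecutor(
            10, 20,
            60, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(50),
            r -> new Thread(r, "ms-call-" + seq.incrementAndGet()),
            new ThreadPoolExecutor.CallerRunsPolicy());
    }
}
```
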
Approach comparison
| Approach | Speed | Complexity | Fault tolerance |
|---|---|---|---|
| Sequential | N × latency | Low | Low (one down — everything down) |
| CompletableFuture.allOf | max(latency) | Medium | Medium (needs error handling) |
| Reactive (WebClient) | max(latency) | High | High (retry, timeout, circuit breaker) |
🔴 Senior Level
Resilience4j: Circuit Breaker + Retry
@Service
public class ResilientMicroserviceClient {

    private final CircuitBreakerRegistry cbRegistry;
    private final RetryRegistry retryRegistry;
    private final WebClient webClient;
    private final Executor executor;

    public CompletableFuture<User> getUserWithResilience(Long userId) {
        CircuitBreaker cb = cbRegistry.circuitBreaker("user-service");
        Retry retry = retryRegistry.retry("user-service");

        // decorateSupplier (not decorateCheckedSupplier) keeps the lambda a plain
        // Supplier, so it can be passed straight to supplyAsync without wrapping
        // checked exceptions.
        return CompletableFuture.supplyAsync(
            Retry.decorateSupplier(retry,
                CircuitBreaker.decorateSupplier(cb,
                    () -> webClient.get()
                        .uri("/users/{id}", userId)
                        .retrieve()
                        .bodyToMono(User.class)
                        // ⚠️ If you're already using WebClient, it's better to work with Mono
                        // directly and convert to CompletableFuture via .toFuture(),
                        // rather than blocking inside supplyAsync.
                        .block())),
            executor
        );
    }
}
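Resilience4j is the production answer, but the underlying state machine is easy to see in a toy sketch. The following is illustrative only — a real circuit breaker adds a HALF_OPEN state, sliding windows, configurable wait durations, and metrics:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Toy OPEN/CLOSED circuit breaker: after N consecutive failures the circuit
// opens and calls fail fast with the fallback, without hitting the service.
public class ToyCircuitBreaker {
    private final int failureThreshold;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    public ToyCircuitBreaker(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    public boolean isOpen() {
        return consecutiveFailures.get() >= failureThreshold;
    }

    public <T> T call(Supplier<T> supplier, T fallback) {
        if (isOpen()) return fallback;      // fail fast
        try {
            T result = supplier.get();
            consecutiveFailures.set(0);     // success closes the circuit
            return result;
        } catch (RuntimeException ex) {
            consecutiveFailures.incrementAndGet();
            return fallback;
        }
    }
}
```
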
Bulkhead — limiting parallelism
// Problem: 1000 simultaneous requests → microservice overload
// Solution: Bulkhead — limit the number of concurrent calls
Bulkhead bulkhead = Bulkhead.of("microservice-calls",
    BulkheadConfig.custom()
        .maxConcurrentCalls(50)
        .maxWaitDuration(Duration.ofMillis(500))
        .build());

public CompletableFuture<List<User>> getUsers(List<Long> ids) {
    List<CompletableFuture<User>> futures = ids.stream()
        .map(id -> CompletableFuture.supplyAsync(
            Bulkhead.decorateSupplier(bulkhead,
                () -> callUserService(id)),
            executor
        ))
        .toList();

    return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
        .thenApply(v -> futures.stream()
            .map(CompletableFuture::join)
            .toList());
}
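Resilience4j's semaphore-based Bulkhead boils down to the same idea as a plain java.util.concurrent.Semaphore. A self-contained sketch demonstrating the concurrency cap — the 3-permit limit, sleep time, and all names are arbitrary choices for the demo:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class SemaphoreBulkheadDemo {
    private static final int LIMIT = 3;
    private static final Semaphore bulkhead = new Semaphore(LIMIT);
    private static final AtomicInteger inFlight = new AtomicInteger();
    private static final AtomicInteger maxObserved = new AtomicInteger();

    static String callService(long id) {
        try {
            bulkhead.acquire();                        // wait for a free slot
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new CompletionException(e);
        }
        try {
            int now = inFlight.incrementAndGet();
            maxObserved.accumulateAndGet(now, Math::max);
            try { TimeUnit.MILLISECONDS.sleep(30); }   // simulated remote call
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            return "user-" + id;
        } finally {
            inFlight.decrementAndGet();
            bulkhead.release();
        }
    }

    // Fires `requests` calls at once; returns the peak concurrency actually observed
    public static int maxConcurrency(int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(requests);
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (long i = 0; i < requests; i++) {
            long id = i;
            futures.add(CompletableFuture.supplyAsync(() -> callService(id), pool));
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        pool.shutdown();
        return maxObserved.get();
    }
}
```
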
Partial Results — don’t wait for the slowest service
// Scenario: 5 microservices, one responds slowly
// We don't want to wait 10 seconds for one service
public CompletableFuture<Dashboard> getDashboardPartial(Long userId) {
    CompletableFuture<User> user = callUserAsync(userId).orTimeout(2, SECONDS);
    CompletableFuture<List<Order>> orders = callOrdersAsync(userId).orTimeout(2, SECONDS);
    CompletableFuture<List<Review>> reviews = callReviewsAsync(userId).orTimeout(2, SECONDS);

    // Wait 2 seconds, then return whatever we have.
    // orTimeout makes a timed-out CF complete exceptionally, which would make
    // allOf complete exceptionally too — swallow that, so the race below
    // still yields a value instead of propagating the TimeoutException.
    CompletableFuture<Void> all = CompletableFuture.allOf(user, orders, reviews)
        .exceptionally(ex -> null);

    // delayedExecutor returns an Executor, not a CompletionStage.
    // Need to wrap it in a CF:
    CompletableFuture<Void> timeout = CompletableFuture
        .runAsync(() -> {}, CompletableFuture.delayedExecutor(2, SECONDS));

    return all.applyToEither(timeout, v -> buildDashboard(
        getOrDefault(user, null),
        getOrDefault(orders, List.of()),
        getOrDefault(reviews, List.of())
    ));
}

private <T> T getOrDefault(CompletableFuture<T> future, T defaultValue) {
    try {
        return future.getNow(defaultValue);
    } catch (CompletionException e) {
        return defaultValue;
    }
}
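A runnable, JDK-only version of the pattern, with stub futures instead of real services. The exceptionally on allOf matters: with orTimeout, a timed-out CF completes exceptionally and would otherwise fail the whole composition. Conveniently, the result is the same whichever side of the race wins:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;

public class PartialResultsDemo {
    static <T> T getOrDefault(CompletableFuture<T> f, T def) {
        try { return f.getNow(def); } catch (CompletionException e) { return def; }
    }

    public static String dashboard() {
        CompletableFuture<String> fast = CompletableFuture.completedFuture("user");
        CompletableFuture<String> slow = new CompletableFuture<>(); // never answers
        CompletableFuture<String> slowT = slow.orTimeout(100, TimeUnit.MILLISECONDS);

        // Swallow the TimeoutException so the race below still produces a value
        CompletableFuture<Void> all = CompletableFuture.allOf(fast, slowT)
            .exceptionally(ex -> null);
        CompletableFuture<Void> deadline = CompletableFuture
            .runAsync(() -> {}, CompletableFuture.delayedExecutor(150, TimeUnit.MILLISECONDS));

        return all.applyToEither(deadline, v ->
            getOrDefault(fast, "unknown") + "|" + getOrDefault(slowT, "none")).join();
    }
}
```
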
Service Degradation Levels
public CompletableFuture<Dashboard> getDashboardWithDegradation(Long userId) {
    CompletableFuture<User> user = callUserAsync(userId)
        .exceptionally(ex -> {
            metrics.counter("degraded.user-service").increment();
            return User.unknown(); // Level 1 degradation
        });

    CompletableFuture<List<Order>> orders = callOrdersAsync(userId)
        .thenCompose(orderList -> {
            if (orderList.isEmpty()) {
                // Level 2 degradation: no orders — try from cache
                return cacheService.getCachedOrders(userId)
                    .exceptionally(ex -> List.of());
            }
            return CompletableFuture.completedFuture(orderList);
        });

    return CompletableFuture.allOf(user, orders)
        .thenApply(v -> new Dashboard(user.join(), orders.join()));
}
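The cache-fallback branch can be exercised with stubs. A minimal sketch where the "live" call returns an empty list and a stub cache supplies the fallback (cachedOrders is a hypothetical stand-in for cacheService.getCachedOrders):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class DegradationDemo {
    // Stub cache standing in for cacheService.getCachedOrders
    static CompletableFuture<List<String>> cachedOrders(long userId) {
        return CompletableFuture.completedFuture(List.of("cached-order"));
    }

    public static List<String> ordersWithCacheFallback(long userId) {
        return CompletableFuture.completedFuture(List.<String>of()) // live call: empty
            .thenCompose(live -> live.isEmpty()
                ? cachedOrders(userId).exceptionally(ex -> List.of()) // level 2: cache, then empty
                : CompletableFuture.completedFuture(live))
            .join();
    }
}
```
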
Monitoring and metrics
public CompletableFuture<Dashboard> getDashboardInstrumented(Long userId) {
    Timer.Sample sample = Timer.start(metricsRegistry);

    CompletableFuture<User> user = callUserAsync(userId);
    CompletableFuture<List<Order>> orders = callOrdersAsync(userId);

    return CompletableFuture.allOf(user, orders)
        .thenApply(v -> new Dashboard(user.join(), orders.join()))
        .whenComplete((result, ex) -> {
            sample.stop(Timer.builder("dashboard.latency")
                // ⚠️ Tagging by userId creates unbounded tag cardinality —
                // fine for a demo, dangerous for a real metrics backend
                .tag("userId", String.valueOf(userId))
                .tag("status", ex != null ? "error" : "success")
                .register(metricsRegistry));
            // Metrics for each call
            metrics.timer("call.user-service.latency", ...);
            metrics.timer("call.order-service.latency", ...);
        });
}
Production Architecture
Client Request
      │
      ├─→ API Gateway
      │
      ├─→ CompletableFuture.allOf(
      │       ├─→ User Service (with Circuit Breaker)
      │       ├─→ Order Service (with Circuit Breaker)
      │       ├─→ Review Service (with Circuit Breaker)
      │       └─→ Notification Service (with Circuit Breaker)
      │     )
      │
      ├─→ Timeout: 3s
      ├─→ Partial results after 3s
      └─→ Response with degradation flags
Best Practices
// ✅ Parallel calls via CompletableFuture.allOf
// ✅ Timeout on each call (orTimeout)
// ✅ Circuit Breaker for each microservice
// ✅ Fallback for graceful degradation
// ✅ Custom Executor (not ForkJoinPool.commonPool)
// ✅ Metrics and tracing for each call
// ✅ Partial results on timeout
// ✅ Bulkhead for limiting load
// ❌ Sequential calls (blocking)
// ❌ No timeout (infinite waiting)
// ❌ No circuit breaker (cascading failure)
// ❌ No fallback (complete failure on error)
// ❌ Shared ForkJoinPool (contention with other tasks)
🎯 Interview Cheat Sheet
Must know:
- Parallel: allOf(cf1, cf2, cf3), latency = max(all CFs). Sequential: latency = sum(all CFs)
- Timeout on each call: orTimeout(2, SECONDS)
- Circuit Breaker (Resilience4j) for each microservice
- Fallback for graceful degradation
- Bulkhead for limiting parallelism (maxConcurrentCalls)
- Custom Executor, not commonPool
Common follow-up questions:
- Will allOf().join() throw if one CF fails? — Yes, it throws CompletionException. Handle errors on each CF with exceptionally/handle
- Partial results — how to return what you have? — orTimeout on each CF + applyToEither with delayedExecutor
- Why Bulkhead? — Limits the number of concurrent calls, prevents microservice overload
- Why a custom Executor? — Isolation: a saturated pool in one module doesn’t starve other tasks (unlike the shared commonPool)
Red flags (DO NOT say):
- “Sequential calls are OK for 2-3 services” — latency adds up, user waits longer
- “No circuit breaker is fine” — cascading failure when a service is unavailable
- “commonPool is fine for HTTP requests” — thread pool starvation
Related topics:
- [[9. What does allOf() method do and when to use it]]
- [[8. How to combine results of multiple CompletableFuture]]
- [[21. How to implement timeout for CompletableFuture]]
- [[12. What thread pool is used by default for async methods]]