Question 16 · Section 19

How to properly execute multiple parallel requests to microservices

When you need to get data from multiple microservices, don't make sequential requests — execute them in parallel.

🟢 Junior Level

Sequential (slow)

// ❌ Each request waits for the previous one to complete
User user = userService.getUser(userId);                      // 200ms
List<Order> orders = orderService.getOrders(userId);          // 300ms
List<Notification> notifs = notificationService.get(userId);  // 150ms
// Total time: 200 + 300 + 150 = 650ms

Parallel (fast)

// ✅ All requests start simultaneously
CompletableFuture<User> user = userService.getUserAsync(userId);
CompletableFuture<List<Order>> orders = orderService.getOrdersAsync(userId);
CompletableFuture<List<Notification>> notifs = notificationService.getAsync(userId);

// Wait for all at once
CompletableFuture.allOf(user, orders, notifs)
    .thenAccept(v -> {
        Dashboard dashboard = new Dashboard(
            user.join(),
            orders.join(),
            notifs.join()
        );
        // Total time: max(200, 300, 150) = 300ms — more than 2x faster!
    });
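The latency numbers above can be reproduced with a self-contained simulation (the hypothetical services are replaced by Thread.sleep calls of 200/300/150 ms):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelVsSequential {

    // Stand-in for a remote call: just sleeps for the given latency
    static String slowCall(String name, long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return name;
    }

    // Sequential: latencies add up (~650 ms)
    static long runSequential() {
        long t0 = System.nanoTime();
        slowCall("user", 200);
        slowCall("orders", 300);
        slowCall("notifs", 150);
        return (System.nanoTime() - t0) / 1_000_000;
    }

    // Parallel: total latency is roughly the slowest call (~300 ms)
    static long runParallel() {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        long t0 = System.nanoTime();
        CompletableFuture<String> user   = CompletableFuture.supplyAsync(() -> slowCall("user", 200), pool);
        CompletableFuture<String> orders = CompletableFuture.supplyAsync(() -> slowCall("orders", 300), pool);
        CompletableFuture<String> notifs = CompletableFuture.supplyAsync(() -> slowCall("notifs", 150), pool);
        CompletableFuture.allOf(user, orders, notifs).join();
        pool.shutdown();
        return (System.nanoTime() - t0) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("sequential=" + runSequential() + "ms, parallel=" + runParallel() + "ms");
    }
}
```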

🟡 Middle Level

Gather and Combine pattern

@Service
public class UserProfileService {

    private final RestTemplate restTemplate;
    private final Executor executor;

    public UserProfileDto getProfile(Long userId) {
        // 1. Launch all requests in parallel
        CompletableFuture<User> userFuture = CompletableFuture.supplyAsync(
            () -> restTemplate.getForObject("http://user-service/users/" + userId, User.class),
            executor
        );

        CompletableFuture<List<Order>> ordersFuture = CompletableFuture.supplyAsync(
            () -> restTemplate.getForObject("http://order-service/orders?userId=" + userId, OrderList.class).orders(),
            executor
        );

        CompletableFuture<List<Review>> reviewsFuture = CompletableFuture.supplyAsync(
            () -> restTemplate.getForObject("http://review-service/reviews?authorId=" + userId, ReviewList.class).reviews(),
            executor
        );

        // 2. Wait for all
        // allOf().join() throws CompletionException if any CF completed exceptionally.
        // After a successful allOf().join(), calling .join() on each CF again is
        // safe but redundant — they are already completed.
        CompletableFuture.allOf(userFuture, ordersFuture, reviewsFuture).join();

        // 3. Combine results
        return new UserProfileDto(
            userFuture.join(),
            ordersFuture.join(),
            reviewsFuture.join()
        );
    }
}

Error handling

// If one service goes down — the whole thing shouldn't fail
CompletableFuture<User> user = userService.getUserAsync(userId)
    .exceptionally(ex -> {
        log.error("User service unavailable", ex);
        return User.unknown(); // fallback
    });

CompletableFuture<List<Order>> orders = orderService.getOrdersAsync(userId)
    .exceptionally(ex -> {
        log.error("Order service unavailable", ex);
        return List.of(); // empty list
    });

CompletableFuture.allOf(user, orders).join();
// Works even with partial errors

Timeout for each request

CompletableFuture<User> user = userService.getUserAsync(userId)
    .orTimeout(2, TimeUnit.SECONDS);

CompletableFuture<List<Order>> orders = orderService.getOrdersAsync(userId)
    .orTimeout(3, TimeUnit.SECONDS);

// If user-service doesn't respond within 2 seconds — timeout
// But order-service can keep working (it has 3 seconds)
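Note that orTimeout fails the future with a TimeoutException; when a default value is preferable to an exception, completeOnTimeout (also Java 9+) resolves the future normally instead. A minimal self-contained demo:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class TimeoutDemo {

    // completeOnTimeout: resolve with a fallback value instead of failing
    static String withCompleteOnTimeout() {
        CompletableFuture<String> hung = new CompletableFuture<>(); // never completes — a "hung" service
        return hung.completeOnTimeout("fallback-user", 100, TimeUnit.MILLISECONDS).join();
    }

    // orTimeout: completes exceptionally; recover downstream with exceptionally()
    static String withOrTimeout() {
        CompletableFuture<String> hung = new CompletableFuture<>();
        return hung.orTimeout(100, TimeUnit.MILLISECONDS)
            .exceptionally(ex -> "fallback-orders")
            .join();
    }

    public static void main(String[] args) {
        System.out.println(withCompleteOnTimeout()); // prints "fallback-user"
        System.out.println(withOrTimeout());         // prints "fallback-orders"
    }
}
```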

Executor configuration

@Configuration
public class AsyncConfig {
    @Bean("microserviceExecutor")
    public Executor microserviceExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(20);
        executor.setQueueCapacity(50);
        executor.setThreadNamePrefix("ms-call-");
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        executor.initialize();
        return executor;
    }
}
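Outside Spring, an equivalent pool can be assembled from plain java.util.concurrent (a sketch mirroring the settings above; note that a ThreadPoolExecutor only grows past the core size once the queue is full):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class MicroservicePool {

    public static ThreadPoolExecutor create() {
        AtomicInteger seq = new AtomicInteger();
        ThreadFactory namedThreads = r -> new Thread(r, "ms-call-" + seq.incrementAndGet());
        return new ThreadPoolExecutor(
            10, 20,                       // corePoolSize / maxPoolSize
            60, TimeUnit.SECONDS,         // keep-alive for threads above core
            new ArrayBlockingQueue<>(50), // bounded queue = queueCapacity
            namedThreads,
            new ThreadPoolExecutor.CallerRunsPolicy()); // backpressure: the caller runs rejected tasks
    }
}
```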

Approach comparison

Approach                  Speed         Complexity  Fault tolerance
Sequential                N × latency   Low         Low (one down — everything down)
CompletableFuture.allOf   max(latency)  Medium      Medium (needs error handling)
Reactive (WebClient)      max(latency)  High        High (retry, timeout, circuit breaker)

🔴 Senior Level

Resilience4j: Circuit Breaker + Retry

@Service
public class ResilientMicroserviceClient {

    private final CircuitBreakerRegistry cbRegistry;
    private final RetryRegistry retryRegistry;
    private final WebClient webClient;
    private final Executor executor;

    public CompletableFuture<User> getUserWithResilience(Long userId) {
        CircuitBreaker cb = cbRegistry.circuitBreaker("user-service");
        Retry retry = retryRegistry.retry("user-service");

        // Retry wraps the Circuit Breaker: each retry attempt goes through the breaker.
        // The unchecked decorateSupplier variants compose as plain Supplier<User>
        // (the decorateCheckedSupplier variants throw Throwable and won't compile here).
        return CompletableFuture.supplyAsync(
            Retry.decorateSupplier(retry,
                CircuitBreaker.decorateSupplier(cb,
                    () -> webClient.get()
                        .uri("/users/{id}", userId)
                        .retrieve()
                        .bodyToMono(User.class)
                        // ⚠️ If you're already using WebClient, it's better to work with Mono directly
                        // and convert to CompletableFuture via .toFuture(),
                        // rather than blocking inside supplyAsync.
                        .block()
                )
            ),
            executor
        );
    }
}
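The non-blocking variant that the comment hints at could look like this (a sketch, not the only way to wire it; assumes the resilience4j-reactor module, whose CircuitBreakerOperator and RetryOperator plug into the Mono pipeline via transformDeferred):

```java
// Sketch — requires the resilience4j-reactor dependency
public CompletableFuture<User> getUserNonBlocking(Long userId) {
    CircuitBreaker cb = cbRegistry.circuitBreaker("user-service");
    Retry retry = retryRegistry.retry("user-service");

    return webClient.get()
        .uri("/users/{id}", userId)
        .retrieve()
        .bodyToMono(User.class)
        .transformDeferred(CircuitBreakerOperator.of(cb)) // breaker applied per attempt
        .transformDeferred(RetryOperator.of(retry))       // retry wraps the breaker
        .toFuture();                                      // no thread blocked while waiting
}
```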

Bulkhead — limiting parallelism

// Problem: 1000 simultaneous requests → microservice overload
// Solution: Bulkhead — limit the number of concurrent calls

Bulkhead bulkhead = Bulkhead.of("microservice-calls",
    BulkheadConfig.custom()
        .maxConcurrentCalls(50)
        .maxWaitDuration(Duration.ofMillis(500))
        .build());

public CompletableFuture<List<User>> getUsers(List<Long> ids) {
    List<CompletableFuture<User>> futures = ids.stream()
        .map(id -> CompletableFuture.supplyAsync(
            Bulkhead.decorateSupplier(bulkhead,
                () -> callUserService(id)),
            executor
        ))
        .toList();

    return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
        .thenApply(v -> futures.stream()
            .map(CompletableFuture::join)
            .toList());
}
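To see what the Bulkhead actually enforces, the same idea can be sketched without Resilience4j using a plain Semaphore (a hypothetical class mirroring maxConcurrentCalls and maxWaitDuration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SemaphoreBulkhead {

    private final Semaphore permits;
    private final long maxWaitMs;

    public SemaphoreBulkhead(int maxConcurrentCalls, long maxWaitMs) {
        this.permits = new Semaphore(maxConcurrentCalls);
        this.maxWaitMs = maxWaitMs;
    }

    // Acquire a permit (waiting up to maxWaitMs), run the task, release
    public <T> T call(Callable<T> task) throws Exception {
        if (!permits.tryAcquire(maxWaitMs, TimeUnit.MILLISECONDS)) {
            throw new RejectedExecutionException("bulkhead full");
        }
        try {
            return task.call();
        } finally {
            permits.release();
        }
    }

    // Demo: 20 tasks on 20 threads, but never more than 5 inside the bulkhead
    static int demoMaxConcurrency() {
        SemaphoreBulkhead bulkhead = new SemaphoreBulkhead(5, 2000);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(20);

        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            futures.add(CompletableFuture.runAsync(() -> {
                try {
                    bulkhead.call(() -> {
                        maxSeen.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                        Thread.sleep(50);
                        inFlight.decrementAndGet();
                        return null;
                    });
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            }, pool));
        }
        futures.forEach(CompletableFuture::join);
        pool.shutdown();
        return maxSeen.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent calls = " + demoMaxConcurrency()); // never above 5
    }
}
```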

Partial Results — don’t wait for everyone

// Scenario: 5 microservices, one responds slowly
// We don't want to wait 10 seconds for one service

public CompletableFuture<Dashboard> getDashboardPartial(Long userId) {
    CompletableFuture<User> user = callUserAsync(userId).orTimeout(2, SECONDS);
    CompletableFuture<List<Order>> orders = callOrdersAsync(userId).orTimeout(2, SECONDS);
    CompletableFuture<List<Review>> reviews = callReviewsAsync(userId).orTimeout(2, SECONDS);

    // Wait 2 seconds, then return whatever we have
    // orTimeout makes slow CFs fail at 2s, which would fail allOf as well;
    // swallow that so applyToEither can still deliver partial results
    CompletableFuture<Void> all = CompletableFuture.allOf(user, orders, reviews)
        .exceptionally(ex -> null);

    // delayedExecutor returns an Executor, not a CompletionStage.
    // Need to wrap it in a CF:
    CompletableFuture<Void> timeout = CompletableFuture
        .runAsync(() -> {}, CompletableFuture.delayedExecutor(2, SECONDS));
    return all.applyToEither(timeout, v -> buildDashboard(
            getOrDefault(user, null),
            getOrDefault(orders, List.of()),
            getOrDefault(reviews, List.of())
        )
    );
}

private <T> T getOrDefault(CompletableFuture<T> future, T defaultValue) {
    try {
        return future.getNow(defaultValue);
    } catch (CompletionException e) {
        return defaultValue;
    }
}

Service Degradation Levels

public CompletableFuture<Dashboard> getDashboardWithDegradation(Long userId) {
    CompletableFuture<User> user = callUserAsync(userId)
        .exceptionally(ex -> {
            metrics.counter("degraded.user-service").increment();
            return User.unknown(); // Level 1 degradation
        });

    CompletableFuture<List<Order>> orders = callOrdersAsync(userId)
        .thenCompose(orderList -> {
            if (orderList.isEmpty()) {
                // Level 2 degradation: no orders — try from cache
                return cacheService.getCachedOrders(userId)
                    .exceptionally(ex -> List.of());
            }
            return CompletableFuture.completedFuture(orderList);
        });

    return CompletableFuture.allOf(user, orders)
        .thenApply(v -> new Dashboard(user.join(), orders.join()));
}

Monitoring and metrics

public CompletableFuture<Dashboard> getDashboardInstrumented(Long userId) {
    Timer.Sample sample = Timer.start(metricsRegistry);

    CompletableFuture<User> user = callUserAsync(userId);
    CompletableFuture<List<Order>> orders = callOrdersAsync(userId);

    return CompletableFuture.allOf(user, orders)
        .thenApply(v -> new Dashboard(user.join(), orders.join()))
        .whenComplete((result, ex) -> {
            sample.stop(Timer.builder("dashboard.latency")
                .tag("userId", String.valueOf(userId))
                .tag("status", ex != null ? "error" : "success")
                .register(metricsRegistry));

            // Metrics for each call
            metrics.timer("call.user-service.latency", ...);
            metrics.timer("call.order-service.latency", ...);
        });
}

Production Architecture

Client Request
     │
     ├─→ API Gateway
           │
           ├─→ CompletableFuture.allOf(
           │     ├─→ User Service (with Circuit Breaker)
           │     ├─→ Order Service (with Circuit Breaker)
           │     ├─→ Review Service (with Circuit Breaker)
           │     └─→ Notification Service (with Circuit Breaker)
           │   )
           │
           ├─→ Timeout: 3s
           ├─→ Partial results after 3s
           └─→ Response with degradation flags

Best Practices

// ✅ Parallel calls via CompletableFuture.allOf
// ✅ Timeout on each call (orTimeout)
// ✅ Circuit Breaker for each microservice
// ✅ Fallback for graceful degradation
// ✅ Custom Executor (not ForkJoinPool.commonPool)
// ✅ Metrics and tracing for each call
// ✅ Partial results on timeout
// ✅ Bulkhead for limiting load

// ❌ Sequential calls (blocking)
// ❌ No timeout (infinite waiting)
// ❌ No circuit breaker (cascading failure)
// ❌ No fallback (complete failure on error)
// ❌ Shared ForkJoinPool (contention with other tasks)

🎯 Interview Cheat Sheet

Must know:

  • Parallel: allOf(cf1, cf2, cf3), latency = max(all CFs). Sequential: latency = sum(all CFs)
  • Timeout on each call: orTimeout(2, SECONDS)
  • Circuit Breaker (Resilience4j) for each microservice
  • Fallback for graceful degradation
  • Bulkhead for limiting parallelism (maxConcurrentCalls)
  • Custom Executor, not commonPool

Common follow-up questions:

  • Will allOf().join() throw if one CF fails? — Yes, a CompletionException. Handle errors with exceptionally() on each CF
  • Partial results — how to return what you have? — orTimeout on each CF + applyToEither with delayedExecutor
  • Why Bulkhead? — Limits the number of concurrent calls, prevents microservice overload
  • Why a custom Executor? — Isolation: failure in one module doesn’t affect others

Red flags (DO NOT say):

  • “Sequential calls are OK for 2-3 services” — latency adds up, user waits longer
  • “No circuit breaker is fine” — cascading failure when a service is unavailable
  • “commonPool is fine for HTTP requests” — thread pool starvation

Related topics:

  • [[9. What does allOf() method do and when to use it]]
  • [[8. How to combine results of multiple CompletableFuture]]
  • [[21. How to implement timeout for CompletableFuture]]
  • [[12. What thread pool is used by default for async methods]]