Question 20 · Section 17

What is exponential backoff

Structured Java interview answer with junior, middle, and senior-level explanation.


🟢 Junior Level

Exponential backoff is a retry strategy where the wait time doubles after each failed attempt.

Attempt 1: error → wait 1 sec
Attempt 2: error → wait 2 sec
Attempt 3: error → wait 4 sec
Attempt 4: error → wait 8 sec
Attempt 5: success! ✅

Why: If a service is overloaded, give it more time to recover.
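The schedule above follows the rule delay = base * 2^(number of failures so far). A minimal sketch of that arithmetic, where the class and method names are illustrative and the shift is only safe for small attempt counts:

```java
final class BackoffSchedule {
    // Delay before the next retry, in milliseconds: base * 2^failedAttempts.
    // Intended for small attempt counts; a large shift would overflow a long.
    static long delayMillis(long baseMillis, int failedAttempts) {
        return baseMillis * (1L << failedAttempts);
    }

    public static void main(String[] args) {
        // Prints the waits after failures 1..4: 1000, 2000, 4000, 8000 ms.
        for (int failed = 0; failed < 4; failed++) {
            System.out.println("After failure " + (failed + 1)
                    + ": wait " + delayMillis(1000, failed) + " ms");
        }
    }
}
```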


🟡 Middle Level

Implementation

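// ⚠️ BAD example for production: Thread.sleep blocks the thread,
// no jitter, no max cap. Use Resilience4j or Spring Retry.
// Assumes the enclosing method declares `throws Exception`.
int maxAttempts = 5;   // example value
long baseDelay = 1000; // 1 second
double multiplier = 2.0;
Exception lastError = null;

for (int attempt = 0; attempt < maxAttempts; attempt++) {
    try {
        return callService();
    } catch (Exception e) {
        lastError = e;
        if (attempt == maxAttempts - 1) break; // no point waiting after the final attempt
        long delay = (long) (baseDelay * Math.pow(multiplier, attempt));
        Thread.sleep(delay); // waits 1s, 2s, 4s, ...
    }
}
throw lastError; // every attempt failed: surface the last error to the caller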
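The comment in the loop above warns that Thread.sleep blocks the thread. One possible way to avoid that with only the JDK is to schedule each retry on a ScheduledExecutorService and report the outcome through a CompletableFuture. This is a rough sketch, not a replacement for Resilience4j or Spring Retry: the names AsyncRetry, retry, and tryOnce are hypothetical, only RuntimeException is treated as a failure, and jitter and a cap are left out to keep it short.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class AsyncRetry {
    // The first attempt runs on the caller's thread; each retry is scheduled
    // after baseDelayMs * 2^attemptIndex, so no thread sleeps while waiting.
    static <T> CompletableFuture<T> retry(Supplier<T> call, int maxAttempts,
                                          long baseDelayMs,
                                          ScheduledExecutorService scheduler) {
        CompletableFuture<T> result = new CompletableFuture<>();
        tryOnce(call, 0, maxAttempts, baseDelayMs, scheduler, result);
        return result;
    }

    private static <T> void tryOnce(Supplier<T> call, int attemptIndex, int maxAttempts,
                                    long baseDelayMs, ScheduledExecutorService scheduler,
                                    CompletableFuture<T> result) {
        try {
            result.complete(call.get());
        } catch (RuntimeException e) {
            if (attemptIndex + 1 >= maxAttempts) {
                result.completeExceptionally(e); // attempts exhausted
                return;
            }
            long delay = baseDelayMs * (1L << attemptIndex);
            scheduler.schedule(
                    () -> tryOnce(call, attemptIndex + 1, maxAttempts,
                                  baseDelayMs, scheduler, result),
                    delay, TimeUnit.MILLISECONDS);
        }
    }
}
```

A caller would pass a shared scheduler, for example one created with Executors.newSingleThreadScheduledExecutor(), and attach follow-up work to the returned future instead of waiting on it.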

With jitter

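// Jitter adds randomness to prevent thundering herd.
// This variant adds up to 10% on top of the exponential delay.
// AWS recommends "full jitter":
// sleep = random(0, min(cap, base * 2^attempt))
// Requires: import java.util.concurrent.ThreadLocalRandom;
long delay = (long) (baseDelay * Math.pow(multiplier, attempt));
// The +1 keeps the bound positive even when the delay is tiny.
long jitter = ThreadLocalRandom.current().nextLong((long) (delay * 0.1) + 1);
delay += jitter;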

Common mistakes

  1. Without jitter:
    All clients retry simultaneously → thundering herd → service crashes again
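To make the mistake concrete, the sketch below models a group of clients that all failed at the same moment and are about to make the same retry. Without jitter they all pick the identical delay. With full jitter their retries spread across the interval. The class name, the client count, and the fixed seed are illustrative assumptions.

```java
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

final class HerdDemo {
    // Distinct retry delays chosen by `clients` clients that failed together.
    static Set<Long> distinctDelays(int clients, long delayMillis,
                                    boolean fullJitter, long seed) {
        Random random = new Random(seed); // seeded only to make the demo repeatable
        Set<Long> delays = new TreeSet<>();
        for (int i = 0; i < clients; i++) {
            long d = fullJitter
                    ? (long) (random.nextDouble() * (delayMillis + 1)) // uniform in [0, delayMillis]
                    : delayMillis;                                     // everyone waits exactly the same time
            delays.add(d);
        }
        return delays;
    }

    public static void main(String[] args) {
        System.out.println("No jitter:   "
                + distinctDelays(100, 4000, false, 42).size() + " distinct retry time(s)");
        System.out.println("Full jitter: "
                + distinctDelays(100, 4000, true, 42).size() + " distinct retry times");
    }
}
```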
    

When NOT to use exponential backoff

  • Idempotent read operations — simple retry is ok
  • Time-critical operations (real-time) — backoff adds unpredictability
  • Client requests from a user — user won’t wait 30 seconds for backoff

🔴 Senior Level

Full jitter formula

// AWS recommends: full jitter
long delay = random(0, min(cap, base * pow(multiplier, attempt)))
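As runnable Java, full jitter in the sense of random(0, min(cap, base * 2^attempt)) could look like the sketch below. The class name is hypothetical, the multiplier is fixed at 2, and the exponent clamp assumes the base delay is a realistic millisecond value (well below 2^33).

```java
import java.util.concurrent.ThreadLocalRandom;

final class FullJitter {
    // Returns a delay drawn uniformly from [0, min(capMillis, baseMillis * 2^attempt)].
    static long delayMillis(long baseMillis, long capMillis, int attempt) {
        // Clamp the exponent so the shift and the multiplication cannot overflow.
        int exponent = Math.min(attempt, 30);
        long exponential = baseMillis * (1L << exponent);
        long ceiling = Math.min(capMillis, exponential);
        // nextLong(bound) excludes the bound, so add 1 to include the ceiling.
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }
}
```

With a 1-second base and a 30-second cap, FullJitter.delayMillis(1000, 30_000, attempt) never exceeds 30 seconds, however many attempts have failed.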

Production Experience

Resilience4j:

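RetryConfig config = RetryConfig.custom()
    .maxAttempts(5) // total attempts, including the initial call
    // initial interval 1000 ms, multiplied by 2.0 after each failure
    .intervalFunction(IntervalFunction.ofExponentialBackoff(1000, 2.0))
    .build();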

Best Practices

✅ Exponential backoff + jitter
✅ Cap on maximum delay
✅ Attempt limit
✅ Only for retryable errors

❌ Without cap (can wait minutes)
❌ Without jitter
❌ For all error types
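The "only for retryable errors" rule usually turns into a small classification step before any backoff is applied. The sketch below shows one possible policy. The status codes and exception types are assumptions chosen for illustration, not a standard. In particular, it treats 408 and 429 as retryable even though they are 4xx codes, and it refuses to retry all other client errors and programming errors.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.TimeoutException;

final class RetryPolicy {
    // Assumed policy: transient server-side or throttling responses only.
    private static final Set<Integer> RETRYABLE_STATUSES =
            Set.of(408, 429, 500, 502, 503, 504);

    static boolean isRetryable(int httpStatus) {
        return RETRYABLE_STATUSES.contains(httpStatus);
    }

    // Network failures and timeouts may succeed later; validation or
    // business errors will fail the same way on every attempt.
    static boolean isRetryable(Throwable error) {
        return error instanceof IOException || error instanceof TimeoutException;
    }
}
```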

🎯 Interview Cheat Sheet

Must know:

  • Exponential backoff — delay doubles: 1s → 2s → 4s → 8s
  • Jitter adds randomness: AWS recommends full jitter = random(0, min(cap, base * 2^attempt))
  • Cap (maximum delay) is mandatory — without cap, you can wait minutes
  • Only for retryable errors: not for most 4xx (408 Request Timeout and 429 Too Many Requests are the usual exceptions), not for business exceptions
  • Do NOT use for idempotent reads (simple retry ok), real-time operations, user client requests
  • Resilience4j: IntervalFunction.ofExponentialBackoff(base, multiplier)
  • Thread.sleep in production — bad, use Resilience4j or Spring Retry

Frequent follow-up questions:

  • Full jitter vs equal jitter? Full jitter = random(0, min(cap, base * 2^attempt)). Equal jitter = d/2 + random(0, d/2), where d = min(cap, base * 2^attempt). Full jitter spreads retries over the whole interval, so it better prevents thundering herd.
  • What cap to choose? Depends on SLA — usually 30-60 seconds max.
  • Why is Thread.sleep bad? It blocks the calling thread for the whole delay, and the hand-written loop around it has no jitter and no max cap. Use Resilience4j or Spring Retry.
  • When NOT to use backoff? Read operations (simple retry ok), real-time, user client requests.

Red flags (NOT to say):

  • “Backoff without cap is more reliable” — no, you can wait indefinitely
  • “Jitter is not needed for low traffic” — still needed, clients can synchronize
  • “Backoff for 404 errors” — no, the server won’t fix itself
  • “Thread.sleep is production-ready” — no, blocks the thread, no jitter

Related topics:

  • [[19. What is Retry pattern and how to use it correctly]]
  • [[17. How to ensure fault tolerance of microservices]]
  • [[15. How to organize communication between microservices]]
  • [[5. What is Circuit Breaker pattern]]
  • [[21. How to monitor a distributed microservices system]]