What is exponential backoff
Structured Java interview answer with junior, middle, and senior-level explanation.
🟢 Junior Level
Exponential backoff is a retry strategy where the wait time grows by a constant factor, typically doubling, after each failed attempt.
Attempt 1: error → wait 1 sec
Attempt 2: error → wait 2 sec
Attempt 3: error → wait 4 sec
Attempt 4: error → wait 8 sec
Attempt 5: success! ✅
Why: if a service is overloaded, each successive retry gives it more time to recover instead of adding load.
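The schedule above follows delay = base * multiplier^attempt. A minimal sketch of that calculation, where the class and method names (`BackoffSchedule`, `delayMillis`, `schedule`) are illustrative and not from any library:

```java
import java.util.ArrayList;
import java.util.List;

class BackoffSchedule {
    // Delay to wait after failed attempt number `attempt` (0-based).
    static long delayMillis(long baseMillis, double multiplier, int attempt) {
        return (long) (baseMillis * Math.pow(multiplier, attempt));
    }

    // Delays for the first `failures` failed attempts, in order.
    static List<Long> schedule(long baseMillis, double multiplier, int failures) {
        List<Long> delays = new ArrayList<>();
        for (int attempt = 0; attempt < failures; attempt++) {
            delays.add(delayMillis(baseMillis, multiplier, attempt));
        }
        return delays;
    }
}
```

With a base of 1000 ms and a multiplier of 2.0, `schedule(1000, 2.0, 4)` yields 1000, 2000, 4000, 8000 ms, matching the four waits listed above.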
🟡 Middle Level
Implementation
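// ⚠️ BAD example for production: Thread.sleep blocks the calling thread,
// and there is no jitter and no max cap. Prefer Resilience4j or Spring Retry.
// Fragment: assumes an enclosing method that declares `throws Exception`.
int maxAttempts = 5;
long baseDelay = 1000; // 1 second
double multiplier = 2.0;
for (int attempt = 0; attempt < maxAttempts; attempt++) {
    try {
        return callService();
    } catch (Exception e) {
        if (attempt == maxAttempts - 1) {
            throw e; // out of attempts: surface the last failure
        }
        long delay = (long) (baseDelay * Math.pow(multiplier, attempt)); // 1s, 2s, 4s, ...
        Thread.sleep(delay); // may throw InterruptedException; let it propagate
    }
}
throw new IllegalStateException("maxAttempts must be at least 1");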
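The loop above has no ceiling on the delay: with a 1 second base and a multiplier of 2, the wait after attempt 10 would already be 1,024,000 ms, about 17 minutes. A small sketch of a capped delay follows; `CappedBackoff` and `capMillis` are names chosen here for illustration.

```java
class CappedBackoff {
    // Exponential delay limited to capMillis.
    static long delayMillis(long baseMillis, double multiplier, int attempt, long capMillis) {
        double raw = baseMillis * Math.pow(multiplier, attempt);
        // Compare as doubles so a huge exponent (even Infinity) still resolves to the cap.
        return (long) Math.min((double) capMillis, raw);
    }
}
```

With a 30 second cap, attempt 3 still waits 8000 ms, while attempt 10 and beyond wait 30000 ms.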
With jitter
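// Jitter adds randomness so clients do not retry in lockstep (thundering herd).
// This variant only adds up to 10% on top of the exponential delay.
// AWS recommends "full jitter" instead:
// sleep = random(0, min(cap, base * 2^attempt))
// Requires: import java.util.concurrent.ThreadLocalRandom;
long delay = (long) (baseDelay * Math.pow(multiplier, attempt));
long jitter = ThreadLocalRandom.current().nextLong(Math.max(1, delay / 10)); // bound must be > 0
delay += jitter;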
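For comparison, equal jitter as described in the AWS Architecture Blog keeps half of the capped delay fixed and randomizes the other half: sleep = d/2 + random(0, d/2), where d = min(cap, base * 2^attempt). A minimal sketch, with `EqualJitter` and its parameter names as illustrative assumptions:

```java
import java.util.concurrent.ThreadLocalRandom;

class EqualJitter {
    // Half of the capped exponential delay is fixed, the other half is random.
    static long delayMillis(long baseMillis, int attempt, long capMillis) {
        double raw = baseMillis * Math.pow(2.0, attempt);
        long capped = (long) Math.min((double) capMillis, raw);
        long half = capped / 2;
        // nextLong(bound) needs bound > 0; half + 1 makes the upper end inclusive.
        return half + ThreadLocalRandom.current().nextLong(half + 1);
    }
}
```

The result always lies between half of the capped delay and the full capped delay, so a minimum wait is guaranteed while retries are still spread out.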
Common mistakes
- Without jitter:
All clients retry simultaneously → thundering herd → service crashes again
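The effect can be illustrated with a toy simulation: clients that failed at the same moment and use the same formula all pick the same delay, while jittered clients spread out. This is a sketch under simplifying assumptions (a seeded `java.util.Random`, no cap, no real network calls); `HerdDemo` is a hypothetical name.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

class HerdDemo {
    // Number of distinct retry delays chosen by clients that all failed together.
    static int distinctDelays(int clients, long baseMillis, int attempt, boolean jitter, long seed) {
        Random random = new Random(seed);
        long ceiling = (long) (baseMillis * Math.pow(2.0, attempt));
        Set<Long> delays = new HashSet<>();
        for (int i = 0; i < clients; i++) {
            // Without jitter every client waits exactly `ceiling`;
            // with jitter each picks a value in [0, ceiling).
            long delay = jitter ? (long) (random.nextDouble() * ceiling) : ceiling;
            delays.add(delay);
        }
        return delays.size();
    }
}
```

Without jitter, 100 clients produce a single distinct delay, which means 100 simultaneous retries.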
When NOT to use exponential backoff
- Idempotent read operations — simple retry is ok
- Time-critical operations (real-time) — backoff adds unpredictability
- User-facing requests: a user won't wait 30 seconds for the backoff to finish
🔴 Senior Level
Full jitter formula
// AWS recommends: full jitter (pseudocode)
long delay = random(0, min(cap, base * pow(multiplier, attempt)));
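Full jitter, in the form this note quotes from AWS (random(0, min(cap, base * 2^attempt))), draws the whole delay at random below the capped exponential ceiling. A runnable sketch; `FullJitter` is an illustrative name, and the cap is assumed to be well below `Long.MAX_VALUE`:

```java
import java.util.concurrent.ThreadLocalRandom;

class FullJitter {
    // Uniform random delay in [0, min(cap, base * multiplier^attempt)].
    static long delayMillis(long baseMillis, double multiplier, int attempt, long capMillis) {
        double raw = baseMillis * Math.pow(multiplier, attempt);
        long ceiling = (long) Math.min((double) capMillis, raw);
        // ceiling + 1 makes the upper end inclusive and keeps the bound positive.
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }
}
```

The delay can be close to zero, which is intended: the goal is to spread retries over the whole window rather than to guarantee a minimum wait.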
Production Experience
Resilience4j:
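RetryConfig config = RetryConfig.custom()
    .maxAttempts(5)
    // 1000 ms initial interval, doubled after each failure; no jitter.
    // IntervalFunction.ofExponentialRandomBackoff(1000, 2.0) is the randomized variant.
    .intervalFunction(IntervalFunction.ofExponentialBackoff(1000, 2.0))
    .build();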
Best Practices
✅ Exponential backoff + jitter
✅ Cap on maximum delay
✅ Attempt limit
✅ Only for retryable errors
❌ Without cap (can wait minutes)
❌ Without jitter
❌ For all error types
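The checklist above can be combined into one small helper: full jitter, a cap, an attempt limit, and a predicate that decides which errors are retryable. This is a hand-rolled sketch for illustration, not the Resilience4j or Spring Retry implementation; `RetryWithBackoff`, `Sleeper`, and all parameter names are assumptions. The `Sleeper` is injected so the waiting strategy can be swapped or stubbed in tests.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Predicate;

class RetryWithBackoff {
    // Abstraction over "wait this long"; Thread::sleep is one possible implementation.
    interface Sleeper {
        void sleep(long millis) throws InterruptedException;
    }

    static <T> T call(Callable<T> task, int maxAttempts, long baseMillis, long capMillis,
                      Predicate<Exception> retryable, Sleeper sleeper) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 0; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                boolean lastAttempt = attempt == maxAttempts - 1;
                // Give up on interruption, on the last attempt, or on non-retryable errors.
                if (e instanceof InterruptedException || lastAttempt || !retryable.test(e)) {
                    throw e;
                }
                // Full jitter below the capped exponential ceiling.
                long ceiling = (long) Math.min((double) capMillis, baseMillis * Math.pow(2.0, attempt));
                sleeper.sleep(ThreadLocalRandom.current().nextLong(ceiling + 1));
            }
        }
    }
}
```

A blocking caller could pass `Thread::sleep` as the sleeper, for example `RetryWithBackoff.call(task, 5, 1000, 30_000, e -> e instanceof java.io.IOException, Thread::sleep)`, accepting that the calling thread is parked during each wait.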
🎯 Interview Cheat Sheet
Must know:
- Exponential backoff — delay doubles: 1s → 2s → 4s → 8s
- Jitter adds randomness: AWS recommends full jitter = random(0, min(cap, base * 2^attempt))
- Cap (maximum delay) is mandatory — without cap, you can wait minutes
- Only for retryable errors: not for most 4xx responses (408 and 429 are the usual exceptions), not for business exceptions
- Do NOT use for idempotent reads (simple retry ok), real-time operations, user client requests
- Resilience4j: IntervalFunction.ofExponentialBackoff(base, multiplier)
- Thread.sleep in production — bad, use Resilience4j or Spring Retry
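The "only for retryable errors" rule can be made explicit in code. The sketch below assumes an HTTP client and one possible policy: 408 and 429 are treated as retryable even though they are 4xx, because they signal a timeout or rate limiting rather than a bad request, and 500 is left out as a judgment call. `RetryableStatus` is an illustrative name.

```java
class RetryableStatus {
    // One possible policy; adjust per API contract.
    static boolean isRetryable(int httpStatus) {
        switch (httpStatus) {
            case 408: // Request Timeout
            case 429: // Too Many Requests
            case 502: // Bad Gateway
            case 503: // Service Unavailable
            case 504: // Gateway Timeout
                return true;
            default:  // 400, 401, 403, 404, ...: retrying will not change the answer
                return false;
        }
    }
}
```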
Frequent follow-up questions:
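- Full jitter vs equal jitter? Full jitter = random(0, min(cap, base * 2^attempt)); equal jitter = d/2 + random(0, d/2) with d = min(cap, base * 2^attempt). Adding only a small fraction, such as delay + random(0, delay * 0.1), is weaker than either because retries stay clustered. Full jitter spreads retries across the whole window, which better prevents thundering herd.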
- What cap to choose? Depends on SLA — usually 30-60 seconds max.
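- Why is Thread.sleep bad? It blocks the calling thread for the whole delay, which wastes threads under load; the naive loop also has no jitter and no max cap. Use Resilience4j or Spring Retry.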
- When NOT to use backoff? Read operations (simple retry ok), real-time, user client requests.
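One way to avoid blocking a thread during the wait is to schedule the next attempt on a `ScheduledExecutorService` and report the outcome through a `CompletableFuture`. The following is a minimal sketch using only the JDK, not how Resilience4j or Spring Retry do it internally; `AsyncRetry`, `retry`, and `runAttempt` are hypothetical names, and every failure is treated as retryable for brevity.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class AsyncRetry {
    static <T> CompletableFuture<T> retry(Supplier<CompletableFuture<T>> call, int maxAttempts,
                                          long baseMillis, long capMillis,
                                          ScheduledExecutorService scheduler) {
        CompletableFuture<T> result = new CompletableFuture<>();
        runAttempt(call, 0, maxAttempts, baseMillis, capMillis, scheduler, result);
        return result;
    }

    private static <T> void runAttempt(Supplier<CompletableFuture<T>> call, int attempt,
                                       int maxAttempts, long baseMillis, long capMillis,
                                       ScheduledExecutorService scheduler,
                                       CompletableFuture<T> result) {
        CompletableFuture<T> future;
        try {
            future = call.get();
        } catch (RuntimeException e) {
            // Treat a synchronous failure like a failed future.
            future = new CompletableFuture<>();
            future.completeExceptionally(e);
        }
        future.whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
            } else if (attempt + 1 >= maxAttempts) {
                result.completeExceptionally(error);
            } else {
                // Full jitter below the capped ceiling; no thread sleeps while waiting.
                long ceiling = (long) Math.min((double) capMillis, baseMillis * Math.pow(2.0, attempt));
                long delay = ThreadLocalRandom.current().nextLong(ceiling + 1);
                scheduler.schedule(
                        () -> runAttempt(call, attempt + 1, maxAttempts, baseMillis, capMillis, scheduler, result),
                        delay, TimeUnit.MILLISECONDS);
            }
        });
    }
}
```

The scheduler thread is only occupied while an attempt is being started, so a single scheduler can serve many in-flight retries.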
Red flags (NOT to say):
- “Backoff without cap is more reliable” — no, you can wait indefinitely
- “Jitter is not needed for low traffic” — still needed, clients can synchronize
- “Backoff for 404 errors” — no, the server won’t fix itself
- “Thread.sleep is production-ready”: no, it blocks the calling thread, and a bare sleep loop has no jitter or cap
Related topics:
- [[19. What is Retry pattern and how to use it correctly]]
- [[17. How to ensure fault tolerance of microservices]]
- [[15. How to organize communication between microservices]]
- [[5. What is Circuit Breaker pattern]]
- [[21. How to monitor a distributed microservices system]]