How to properly log exceptions?
Use a logger instead of printStackTrace():
Junior Level
Basic rule
Use a logger instead of printStackTrace():
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class OrderService {
// Logger - static final, one per class
private static final Logger log = LoggerFactory.getLogger(OrderService.class);
public void processOrder(Long orderId) {
try {
// business logic
} catch (Exception e) {
// Exception - last argument, has context
log.error("Failed to process order id={}", orderId, e);
}
}
}
Log levels
log.error()- critical errors requiring attentionlog.warn()- warnings, system is working but there’s risklog.info()- important events (start, completion)log.debug()- detailed debugging
With parameters
log.error("User id={} not found", userId, exception);
// exception will NOT go into {}, it goes to stack trace
How SLF4J works: SLF4J replaces {} with argument values left to right. But there’s a special case: if the last argument is a Throwable, SLF4J does not substitute it into {}, but extracts the full stack trace and adds it to the message.
// {} is replaced with userId, exception - last argument -> stack will be in log
log.error("User id={} not found", userId, exception);
// {} is replaced with e.getMessage(), exception NOT last -> stack will NOT get to log!
log.error("Error: {}", exception, userId);
// No {}, but exception last -> stack still appears
log.error("Critical failure", exception);
Common mistake: log.error("Error: {}", e) - here e is substituted into {} as a string (e.toString()), but stack trace is NOT output. Always pass the exception as the last argument, separate from {}.
What not to do
// BAD - no context, impossible to understand where the error is
log.error(e.getMessage());
// BAD - e as parameter {}, not as exception (stack won't get to log)
log.error("Error: {}", e);
// BAD - logging and printStackTrace duplicate output
log.error("Error", e);
e.printStackTrace();
// GOOD - exception as last argument, has context
log.error("Failed to process user id={}", userId, e);
When NOT to log exceptions
- Expected business scenarios -
UserNotFoundExceptionon login: this is not a system error, but a normal scenario. Uselog.warn()orlog.info()instead oflog.error(). If every failed login writes ERROR - your alerts will be devalued (alert fatigue) - Retry mechanisms - don’t log every retry attempt at ERROR level, use
log.debug(). Otherwise 5 retry attempts = 5 ERROR logs, although the system is still fighting - Batch processing - with 1000 elements and 500 errors, use log sampling (no more than 10 identical stacks per minute)
- Health check endpoint - health check errors log at DEBUG, don’t clutter ERROR logs
- Circuit breaker open - when CB is open, log once on opening, not every rejected request
- Duplicate logging - if exception will be logged at the top level (GlobalExceptionHandler), don’t log it again on each layer. One exception - one ERROR log
Caveat: PII (Personally Identifiable Information) in logs
Never log sensitive data with exceptions. This includes:
- Passwords, tokens, CVV codes
- Email, phone, passport data (GDPR!)
- IP addresses (in some jurisdictions)
// BAD - password in log
log.error("Auth failed for user={} with password={}", username, password);
// GOOD - only context without sensitive data
log.error("Auth failed for user={}", username);
Even if the logger supports masking, it’s better not to pass such data to the logger at all. Configure the log framework for redaction (scrubbing) at the appender level, not in each call separately.
Middle Level
SLF4J magic of the last argument
SLF4J looks for the last argument of the method. If it’s a Throwable, it’s handled specially - stack trace is extracted, even if there are no {} in the string:
log.error("Critical failure", exception); // Stack will be in log
MDC (Mapped Diagnostic Context)
In distributed systems, stack trace is useless without request context:
MDC.put("traceId", traceId);
MDC.put("userId", userId);
try {
orderService.process(orderId);
// Log will have traceId and userId
} catch (Exception e) {
log.error("Failed to process order id={}", orderId, e);
} finally {
MDC.clear(); // Required - otherwise traceId leaks to next request
}
JSON Layout
In ELK/Splunk, logs should be JSON:
{
"timestamp": "2024-01-15T10:30:00Z",
"level": "ERROR",
"message": "Failed to process order",
"exception_class": "java.sql.SQLException",
"stack_trace": "...",
"traceId": "abc-123"
}
Advantages:
- Indexing by
exception_class - Search by
stack_trace - No problems with multi-line logs
Cause Analysis
Make sure the logger outputs all levels of cause. The root of evil may be at the 5th nesting level.
Senior Level
Log Sampling
Under enormous load and mass errors, you can “drown” in logs (Log Flooding). Advanced loggers configure limits:
<!-- Logback - no more than 10 identical stacks per minute -->
<appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<filter class="ch.qos.logback.classic.filter.DuplicateMessageFilter">
<AllowedRepetitions>10</AllowedRepetitions>
</filter>
</appender>
Async Appenders
Always use async logging - separates disk/network writing from business thread:
<appender name="ASYNC" class="ch.qos.logback.classic.AsyncAppender">
<appender-ref ref="FILE" />
<includeCallerData>false</includeCallerData> <!-- for speed -->
</appender>
Risk: loss of last logs on JVM crash.
Log Scrubbing (PII)
Never log sensitive data (passwords, CVV, tokens). Modern loggers support masking:
// Logger automatically masks
log.error("Auth failed for user with password={}", password);
// In log: password=****
ShortenedThrowableConverter
Don’t print 200 lines of Hibernate stack. Usually first 20 frames are enough:
<conversionRule conversionWord="ex"
converterClass="com.example.ShortenedThrowableConverter" />
Generic Catch-all
Logging catch (Exception e) at the very top level is mandatory, but should be accompanied by alerting (Prometheus/Grafana):
meterRegistry.counter("errors.unhandled").increment();
alertService.send("Unhandled exception: " + e.getMessage());
Diagnostics
- Logback Configuration - configure
<shortenedThrowableConverter> - Cause Analysis - logger should output all
causelevels - Structured Logging - JSON format for ELK/Splunk
- Metric-driven - each exception through Micrometer to Prometheus
Interview Cheat Sheet
Must know:
- Exception - always the last argument in
log.error("msg {}", param, e); SLF4J extracts stack trace from the last argument - DON’T use
log.error("Error: {}", e)- hereeis substituted into{}as a string, stack is NOT output - Use MDC (
traceId,userId) for context in distributed systems; alwaysMDC.clear()infinally - For highload: async appender, log sampling (no more than 10 identical stacks/min), shortenedThrowableConverter
- Log expected business scenarios at
warn/info, noterror- otherwise alerts lose value (alert fatigue) - Never log PII: passwords, tokens, CVV, email (GDPR!), IP addresses
Frequent follow-up questions:
- How does SLF4J handle exception? - If last argument is
Throwable, SLF4J doesn’t substitute it into{}, but extracts full stack trace - Why not duplicate logging? - If
GlobalExceptionHandleralready logs, don’t log on each layer - one exception = one ERROR log - What is Log Sampling? - Limiting identical messages (e.g., 10/min) to not “drown” in logs on mass errors
- When NOT to log exception? - Expected business scenarios (UserNotFoundException on login), retry attempts, health check, circuit breaker open
Red flags (NOT to say):
- “I log
log.error(e.getMessage())without context” - Impossible to understand where error is and what caused it - “I log password with exception” - Security violation (GDPR, PCI DSS)
- “I log and printStackTrace simultaneously” - Duplication, clutters logs
- “Every retry at ERROR level” - 5 attempts = 5 ERROR logs, alerts become useless
Related topics:
- [[16. What does the printStackTrace() method do]]
- [[15. What is a stack trace]]
- [[13. Can you create custom exceptions]]
- [[19. Why you should not swallow exceptions (catch empty)]]
- [[18. What is exception wrapping (wrapping)]]