What is a heap dump?
Junior Level
Heap Dump: a “photograph” of the entire Java heap at a single point in time.
Simple analogy: A photograph of your desk to see what’s on it later.
What it contains:
- All objects in memory
- All references between them
- Static fields
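Not a substitute for a thread dump: HotSpot writes the thread stacks and local references that act as GC roots into the HPROF file, but analyzing what threads are doing (states, locks, deadlocks) is the job of a Thread Dump, which is a separate mechanism.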
Why it’s needed:
- Find memory leaks
- Understand why the application is slow (for example, constant GC caused by a nearly full heap)
- Analyze the heap after an OutOfMemoryError
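To make the leak use case concrete, here is a minimal sketch of the kind of bug a heap dump typically reveals. The class `LeakyCache` and its methods are hypothetical names chosen for illustration; the pattern (a static collection that only grows) is the point.

```java
import java.util.ArrayList;
import java.util.List;

public class LeakyCache {
    // A static field is reachable for as long as the class is loaded,
    // so everything this list references survives every GC cycle.
    private static final List<byte[]> CACHE = new ArrayList<>();

    // Each call retains one more buffer; nothing ever removes it.
    public static void handleRequest(int payloadBytes) {
        CACHE.add(new byte[payloadBytes]);
    }

    public static int retainedBuffers() {
        return CACHE.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1_000; i++) {
            handleRequest(1024); // 1 KiB per simulated request
        }
        System.out.println("Buffers still reachable: " + retainedBuffers());
    }
}
```

In a dump of such a process, an analyzer would show the `byte[]` instances retained through the static `CACHE` field, which is how the leak gets traced back to its owner.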
Middle Level
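Dump Risks
Stop-The-World! Application threads are paused for the whole time the dump is being written.
→ 32 GB on HDD → roughly 30-60 seconds
→ 32 GB on SSD → roughly 5-15 seconds
→ These are rough, optimistic estimates; the real duration is bounded by disk write throughput
Disk:
→ Dump size ≈ heap size (32 GB heap → file of up to about 32 GB)
→ Disk space can run out!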
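Both risks can be checked before a dump is triggered. The sketch below is only an illustration: `DumpPreflight` is a hypothetical helper, the 500 MiB/s throughput is an assumed figure that should be replaced with a measured one, and the maximum heap size is used as an upper bound on the dump size.

```java
import java.io.File;

public class DumpPreflight {
    // Rough pause estimate: the dump cannot be written faster than the disk accepts it.
    public static double estimateSeconds(long heapBytes, long writeBytesPerSecond) {
        return (double) heapBytes / writeBytesPerSecond;
    }

    // True if the target can hold a dump as large as the whole heap.
    public static boolean hasRoom(long heapBytes, long usableBytes) {
        return usableBytes > heapBytes;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();        // upper bound on dump size
        File target = new File(args.length > 0 ? args[0] : ".");
        long usable = target.getUsableSpace();                  // free bytes on that volume
        long assumedThroughput = 500L * 1024 * 1024;            // ASSUMPTION: 500 MiB/s

        System.out.printf("Max heap: %d MiB, usable disk: %d MiB%n",
                maxHeap >> 20, usable >> 20);
        System.out.printf("Estimated write time: %.1f s%n",
                estimateSeconds(maxHeap, assumedThroughput));
        System.out.println(hasRoom(maxHeap, usable)
                ? "Enough space for a full dump"
                : "NOT enough space for a full dump");
    }
}
```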
Capture Methods
| Method | When | Flag |
|---|---|---|
| Automatic | On OOME | -XX:+HeapDumpOnOutOfMemoryError |
| Manual, live objects only (jcmd default) | Leak analysis | jcmd <pid> GC.heap_dump dump.hprof |
| Manual, all objects | Diagnostics that also need unreachable objects | jcmd <pid> GC.heap_dump -all dump.hprof |
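A dump can also be triggered from inside the application through `com.sun.management.HotSpotDiagnosticMXBean`, which is useful when `jcmd` is not available in the runtime image. This is a minimal sketch for HotSpot-based JVMs; the class name `HeapDumper` and the temp-directory location are illustrative choices, not part of any standard setup.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class HeapDumper {
    // Writes a heap dump of the current JVM.
    // live = true keeps only reachable objects; live = false includes unreachable ones.
    public static Path dump(Path file, boolean live) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap refuses to overwrite an existing file; the name should end in ".hprof".
        bean.dumpHeap(file.toAbsolutePath().toString(), live);
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("dumps");
        Path out = dump(dir.resolve("manual.hprof"), true);
        System.out.println("Wrote " + out + " (" + Files.size(out) + " bytes)");
    }
}
```

The same pause and disk-space considerations apply as for a `jcmd` dump, so a call like this belongs behind an admin-only endpoint or signal handler rather than in a request path.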
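Compression (JDK 15+)
jcmd <pid> GC.heap_dump -gz=1 dump.hprof.gz # On-the-fly gzip; level 1 (fastest) to 9 (smallest)
-XX:HeapDumpGzipLevel=1 # JDK 17+: the same for dumps written on OOME
# → 32 GB → roughly 6-10 GB file (depends on heap contents)
# → Savings in space and transfer time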
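When the JVM in use cannot compress the dump while writing it, an existing `.hprof` file can still be gzip-compressed before it is moved off the host. The following is a minimal sketch using only `java.util.zip`; `DumpCompressor` is a hypothetical helper name, and the achievable ratio depends entirely on the heap contents.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPOutputStream;

public class DumpCompressor {
    // Streams the source into "<source>.gz" without loading the file into memory.
    public static Path gzip(Path source) throws IOException {
        Path target = source.resolveSibling(source.getFileName() + ".gz");
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            in.transferTo(out); // requires Java 9+
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java DumpCompressor <dump.hprof>");
            return;
        }
        Path source = Paths.get(args[0]);
        Path target = gzip(source);
        System.out.println(Files.size(source) + " bytes -> " + Files.size(target) + " bytes");
    }
}
```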
Confidentiality
Dump contains EVERYTHING:
→ Passwords in char[]
→ Card numbers
→ Personal data
→ Don't transmit over open channels!
→ Use RedactHprof for masking
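Exposure can also be reduced at the source: a secret held in a `char[]` can be overwritten as soon as it is no longer needed, so a later dump no longer contains it. The sketch below is illustrative only; `SecretHandling` and its length check are hypothetical stand-ins for a real credential check.

```java
import java.util.Arrays;

public class SecretHandling {
    // Hypothetical check that stands in for a real credential comparison.
    public static boolean authenticate(char[] password) {
        try {
            return password.length >= 8;
        } finally {
            // Overwrite the secret so it does not linger on the heap.
            Arrays.fill(password, '\0');
        }
    }

    public static void main(String[] args) {
        char[] password = "correct-horse".toCharArray();
        boolean ok = authenticate(password);
        boolean wiped = true;
        for (char c : password) {
            wiped &= (c == '\0');
        }
        System.out.println("Authenticated: " + ok + ", array wiped: " + wiped);
    }
}
```

This only helps for secrets the application controls as arrays; values that were ever held in a `String` stay on the heap until they are garbage collected, so masking the dump is still needed.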
Senior Level
HeapDumpPath in Containers
# In Kubernetes:
-XX:HeapDumpPath=/var/log/dumps/
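# /var/log/dumps must be backed by a mounted external volume, not the container's writable layer
# → Dump won't fill ephemeral storage
# → Dump survives the pod restart and can be downloaded after the crash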
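Whether these flags actually reached the JVM, and whether the dump directory is writable, can be verified at startup. This is a minimal sketch for HotSpot-based JVMs; `DumpConfigCheck` is a hypothetical class name, and logging to standard output is just a placeholder for whatever the service uses.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DumpConfigCheck {
    // Reads the current value of a HotSpot VM option by name.
    public static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        boolean onOom = Boolean.parseBoolean(vmOption("HeapDumpOnOutOfMemoryError"));
        String path = vmOption("HeapDumpPath");
        System.out.println("HeapDumpOnOutOfMemoryError=" + onOom
                + ", HeapDumpPath='" + path + "'");

        if (onOom && path != null && !path.isEmpty()) {
            Path configured = Paths.get(path);
            // HeapDumpPath may name a directory or a file; check the directory part.
            Path dir = Files.isDirectory(configured)
                    ? configured
                    : configured.toAbsolutePath().getParent();
            System.out.println("Dump directory writable: "
                    + (dir != null && Files.isWritable(dir)));
        }
    }
}
```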
Live vs All
-all (must be requested explicitly):
→ All objects (live + unreachable ones not yet collected)
→ Large file
→ Faster capture (no Full GC first)
-all=false (live, the jcmd default):
→ Live objects only. Before capture, the JVM runs a
Full GC so that only reachable objects remain. Without a Full GC,
the JVM doesn't know which objects are "alive".
This adds an STW pause to dump creation time!
→ Smaller file
JFR as Alternative
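Heap > 64 GB → a full dump becomes impractical
→ Too large
→ Takes long to capture
→ Takes long to analyze
Solution: JFR OldObjectSample (event jdk.OldObjectSample)
→ Often enough to locate a leak (rule of thumb: about 90% of leak information)
→ Overhead typically around 1% with default settings
→ No heap-sized STW pause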
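A recording with this event can be started from code through the `jdk.jfr` API (JDK 11+). The sketch below is one possible setup, not a recommended configuration: `LeakRecording` is a hypothetical class name, and the recording duration and output file name are arbitrary choices.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.Recording;

public class LeakRecording {
    // Records old-object samples for the given duration and writes them to a .jfr file.
    public static Path record(Path output, Duration duration)
            throws IOException, InterruptedException {
        try (Recording recording = new Recording()) {
            // Sample long-lived objects together with their allocation stack traces.
            recording.enable("jdk.OldObjectSample").withStackTrace();
            recording.start();
            Thread.sleep(duration.toMillis()); // the application keeps running meanwhile
            recording.stop();
            recording.dump(output);
        }
        return output;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("jfr");
        Path out = record(dir.resolve("leak.jfr"), Duration.ofSeconds(5));
        System.out.println("Wrote " + out + " (" + Files.size(out) + " bytes)");
    }
}
```

The resulting file can be opened in JDK Mission Control or printed with `jfr print --events jdk.OldObjectSample <file>`.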
OQL Queries
-- Syntax below follows the Eclipse MAT OQL dialect
-- Find large strings
SELECT toString(s), s.value.@length
FROM java.lang.String s
WHERE s.value.@length > 1000
-- Find all HashMap instances
SELECT * FROM java.util.HashMap
-- Find objects of a specific class
SELECT * FROM com.example.MyClass
Best Practices
- HeapDumpOnOutOfMemoryError — mandatory
- HeapDumpPath → external volume
- Gzip compression (jcmd -gz on JDK 15+, -XX:HeapDumpGzipLevel on JDK 17+)
- Be careful with dumps > 64 GB
- JFR for large Heaps
- PII — mask before analysis
- jcmd <pid> GC.class_histogram → fast alternative when per-class instance counts and sizes are enough
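The class histogram mentioned in the last item can also be requested from inside the JVM through the HotSpot `DiagnosticCommand` MBean. This is a minimal sketch assuming a HotSpot-based JDK where that MBean is registered; `ClassHistogram` is a hypothetical class name.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class ClassHistogram {
    // Equivalent of `jcmd <pid> GC.class_histogram`, executed in-process.
    public static String histogram() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("com.sun.management:type=DiagnosticCommand");
        Object result = server.invoke(
                name,
                "gcClassHistogram",                          // MBean name of GC.class_histogram
                new Object[] { new String[0] },              // no extra command arguments
                new String[] { String[].class.getName() });
        return (String) result;
    }

    public static void main(String[] args) throws Exception {
        // Print the header and the largest classes only.
        histogram().lines().limit(12).forEach(System.out::println);
    }
}
```

The histogram shows only instance counts and bytes per class, with no references between objects, so it points at what is growing but not at what is holding it.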
Senior Summary
- Heap Dump = binary Heap snapshot
- STW pause = main risk
- Gzip (jcmd -gz on JDK 15+, HeapDumpGzipLevel on JDK 17+) = space savings
- PII = dump contains secrets
- JFR is preferable to a heap dump for heaps > 64 GB
- OQL = queries to dump
- External volume for dumps in containers
Interview Cheat Sheet
Must know:
- Heap Dump: a “photograph” of the entire heap: all objects, references, static fields; it is not a substitute for a Thread Dump, which is what you use to analyze thread states and locks
- Dump = STW: 32 GB on HDD → roughly 30-60 seconds, on SSD → roughly 5-15 seconds; in Kubernetes the liveness probe can fail during the pause → pod killed
- All vs Live: -all (all objects, faster); -all=false (live only, the jcmd default; requires Full GC before dump → additional STW)
- Gzip: jcmd <pid> GC.heap_dump -gz=<level> (JDK 15+) or -XX:HeapDumpGzipLevel=<level> (JDK 17+) → 32 GB → roughly 6-10 GB file; saves space and transfer time
- PII: dump contains passwords, tokens, personal data; don’t transmit over open channels; use RedactHprof
- JFR is preferable to a heap dump for heaps > 64 GB: often enough to locate a leak (about 90% of leak info), around 1% overhead, no heap-sized STW pause
- In Kubernetes: -XX:HeapDumpPath=/var/log/dumps/ → external volume; dump won’t fill ephemeral storage
Common follow-up questions:
- Why does Live dump require Full GC? — To determine object reachability, JVM must run Full GC; without Full GC, JVM doesn’t know which objects are “alive”
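- What does Heap Dump contain vs not contain? Contains: objects, references, static fields, plus the thread stacks and locals that HotSpot records as GC roots. It does not replace a Thread Dump for analyzing thread states, locks, and deadlocks
- Why is JFR better for large Heaps? Heap > 64 GB → a full dump becomes impractical (too large, long to capture/analyze); JFR OldObjectSample is often enough to locate the leak without a heap-sized STW pause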
- What are the risks of transmitting a dump? — Contains passwords in char[], card numbers, sessions; transmit only over secure channels
Red flags (DO NOT say):
- “Heap Dump replaces Thread Dump”: the HPROF file carries thread stacks only as GC-root context; thread states, locks, and deadlocks are analyzed with a Thread Dump
- “Live dump is faster” — Live dump requires Full GC BEFORE dump → additional STW pause
- “Dump can be safely taken in production” — STW pause; increase probe timeouts or use JFR
Related topics:
- [[24. How to get a heap dump]]
- [[21. What is a memory leak and how to detect it]]
- [[22. What tools help analyze memory]]
- [[19. What happens on OutOfMemoryError]]
- [[16. What is stop-the-world]]