Question 19 · Section 3

What happens on OutOfMemoryError?

Short answer: the thread that requested memory dies; the JVM itself keeps running.


Junior Level

OutOfMemoryError (OOME) — an Error thrown when the JVM cannot allocate memory for an object and the garbage collector cannot free enough space.

What happens:

  1. Memory allocation attempt → no space
  2. GC tries to clean → still no space
  3. OutOfMemoryError is thrown
  4. Thread that requested memory → dies

Types of OOME:

  • Java heap space — the heap is full
  • Metaspace — the class-metadata area is full
  • GC overhead limit exceeded — GC runs almost constantly but reclaims almost nothing
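The most common case, Java heap space, can be seen in a few lines. This is a demonstration only (the class name is illustrative): a single allocation no realistic heap can satisfy fails fast with an OOME. The error is caught here purely to show its type — as the Best Practices section stresses, business code should never catch OOME. The exact message ("Java heap space" vs "Requested array size exceeds VM limit") depends on the JVM and heap size.

```java
public class OomeDemo {
    // Tries an allocation far beyond any realistic heap and reports the outcome.
    static String allocateHuge() {
        try {
            // ~17 GB in one array; HotSpot rejects this and throws OOME
            long[] huge = new long[Integer.MAX_VALUE];
            return "allocated " + huge.length; // not reached in practice
        } catch (OutOfMemoryError e) {
            // DEMO ONLY: never catch OOME in real code
            return "OOME";
        }
    }

    public static void main(String[] args) {
        System.out.println(allocateHuge());
    }
}
```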

Middle Level

OOME Mechanics

1. Allocation failure → no free memory
2. Full GC → tries to reclaim space
3. SoftReferences are cleared → still no space
4. GC overhead limit check — THIS IS A SPECIFIC type of OOME (GC overhead limit exceeded).
   It triggers when the JVM spends >98% of its time in GC and frees <2% of the heap.
   NOT all OOMEs go through this check — it's a separate heuristic.
5. → OutOfMemoryError is thrown!

Zombie State

OOME can occur ANYWHERE, including in the middle of resource handling.
If OOME happens while opening a file → the file may remain open.
If OOME happens while a thread holds a lock (and unlock is not in finally) → the lock is never released.
⚠️ finally blocks normally DO run when an OOME propagates, but they can themselves fail if executing them requires another allocation that also fails.

OOME does NOT kill JVM instantly!
  → Only the requesting thread dies
  → Other threads keep working
  → BUT: state may be corrupted!
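The "only the requesting thread dies" behavior is easy to observe (class name illustrative): a worker thread hits an OOME and terminates, while the thread that started it keeps running.

```java
public class ThreadDeathDemo {
    // Runs an oversized allocation in a separate thread and reports what killed it.
    static String runWorker() throws InterruptedException {
        final String[] cause = {"nothing"};
        Thread worker = new Thread(() -> {
            // OOME is thrown in THIS thread only
            long[] huge = new long[Integer.MAX_VALUE];
            System.out.println(huge.length); // not reached in practice
        });
        worker.setUncaughtExceptionHandler(
                (t, e) -> cause[0] = e.getClass().getSimpleName());
        worker.start();
        worker.join();   // worker is dead here...
        return cause[0]; // ...but the calling thread is still running
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Worker died with: " + runWorker());
        System.out.println("Main thread is still alive");
    }
}
```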

JVM OOME vs OS OOM Killer

JVM OOME:
  → You see stack trace in logs
  → There's a heap dump

OS OOM Killer (Docker/K8s):
  → Process simply disappears
  → In dmesg: "Out of memory: Kill process"
  → No Java logs!
  → Cause: RSS > container limit

Production: Fail-Fast

# Mandatory in production:
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/log/dumps/
-XX:+ExitOnOutOfMemoryError

# → On OOME: dump + exit
# → Orchestrator will restart
# → No zombie state

Senior Level

Thread Death and Corruption

OOME in thread holding a Lock:
  → Lock not released
  → Other threads wait forever
  → Deadlock!
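A minimal sketch of the zombie-lock scenario. The class name is illustrative, and the OOME is simulated with `throw new OutOfMemoryError(...)` to keep the demo deterministic — a real allocation failure leaves the lock in exactly the same state.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class ZombieLockDemo {
    static final ReentrantLock lock = new ReentrantLock();

    // Returns whether the lock can be re-acquired after its holder died.
    static boolean lockAvailableAfterHolderDies() throws InterruptedException {
        Thread holder = new Thread(() -> {
            lock.lock(); // no try/finally -> unlock is never reached
            // Simulated OOME; a real one behaves identically for lock state
            throw new OutOfMemoryError("simulated");
        });
        holder.setUncaughtExceptionHandler((t, e) -> { /* silence the demo */ });
        holder.start();
        holder.join(); // holder thread is dead now
        // ReentrantLock is NOT released when its owner thread dies:
        return lock.tryLock(100, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Lock available after holder died? "
                + lockAvailableAfterHolderDies()); // false: zombie lock
    }
}
```

Any thread that calls a plain `lock.lock()` here would block forever — the deadlock described above.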

OOME in ConcurrentHashMap:
  → Map in intermediate state
  → Subsequent operations → undefined behavior

Native Memory OOME

DirectByteBuffer leak:
  → Heap empty
  → Native Memory = 100%
  → OOM Killer kills process
  → No Java logs!

Diagnostics:
  -XX:NativeMemoryTracking=detail
  dmesg | grep -i oom
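A sketch of why off-heap memory is invisible to heap monitoring (class name illustrative): `ByteBuffer.allocateDirect` takes native memory that -Xmx does not govern, so the heap barely moves while process RSS grows — exactly the memory the OS OOM Killer counts.

```java
import java.nio.ByteBuffer;

public class DirectBufferDemo {
    // Allocates native (off-heap) memory through NIO and returns its size.
    static int allocateOffHeap(int bytes) {
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes);
        return direct.capacity();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long heapBefore = rt.totalMemory() - rt.freeMemory();

        // 64 MB of NATIVE memory: the heap only holds the tiny
        // DirectByteBuffer wrapper object, while process RSS grows by 64 MB
        int capacity = allocateOffHeap(64 * 1024 * 1024);

        long heapAfter = rt.totalMemory() - rt.freeMemory();
        System.out.println("Off-heap bytes allocated: " + capacity);
        System.out.println("Heap growth (bytes): " + (heapAfter - heapBefore));
    }
}
```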

Diagnostics by Type

| Error type | Cause | Solution |
|---|---|---|
| Java heap space | leak or too small -Xmx | heap dump analysis in MAT |
| GC overhead limit exceeded | heap nearly full, GC thrashing | GC logs, tune heap |
| Metaspace | ClassLoader leak | jcmd VM.metaspace |
| Direct buffer memory | Netty/NIO leak | NMT |
| Unable to create native thread | OS thread limit | ulimit -u |

Heap Dump Analysis

# Automatic dump on OOME
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/dumps/

# Analysis in Eclipse MAT:
→ Dominator Tree
→ Path to GC Roots
→ Find the leak

Best Practices

  1. Never catch OOME in business code
  2. ExitOnOutOfMemoryError in production
  3. HeapDumpOnOutOfMemoryError — mandatory
  4. HeapDumpPath → external volume
  5. Monitor RSS and Native Memory
  6. NMT for off-heap leaks
  7. Fail-fast > zombie state

Senior Summary

  • OOME ≠ instant JVM death
  • Zombie state = corrupted locks and structures
  • OS OOM Killer = no logs, process disappears
  • Fail-fast = ExitOnOutOfMemoryError
  • Heap Dump = main diagnostic tool
  • Native Memory ≠ Heap (need NMT)
  • Never catch OOME

Interview Cheat Sheet

Must know:

  • OOME: JVM cannot allocate memory → GC tries to clean → SoftReference cleaned → still no → OutOfMemoryError
  • OOME kills only the requesting thread, NOT the entire JVM; other threads work in corrupted state (zombie state)
  • Zombie state: OOME during lock acquisition → lock not released → deadlock; OOME in ConcurrentHashMap → undefined behavior
  • JVM OOME vs OS OOM Killer: JVM OOME gives stack trace and dump; OS OOM Killer (Docker/K8s) — process disappears, no logs
  • GC Overhead Limit: specific type of OOME; triggers when JVM spends >98% time on GC and frees <2% of Heap
  • ExitOnOutOfMemoryError + HeapDumpOnOutOfMemoryError — mandatory in production: dump + exit → orchestrator restarts
  • finally blocks may NOT execute if OOME occurs while allocating memory for the stack frame itself

Common follow-up questions:

  • Why shouldn’t OOME be caught in business code? — State corrupted: lock not released, Map in intermediate state; application unreliable
  • How does JVM OOME differ from OS OOM Killer? — JVM OOME: stack trace in logs, dump available; OS OOM Killer: dmesg | grep oom, process killed via SIGKILL, no Java logs
  • What is zombie state? — JVM continues working after OOME of one thread, but with corrupted locks/structures → undefined behavior
  • Why should HeapDumpPath be on external volume? — In Kubernetes, a dump written to ephemeral storage is lost when the pod restarts, and a large dump can exhaust ephemeral storage and get the pod evicted

Red flags (DO NOT say):

  • “I catch OOME and continue working” — zombie state, corrupted data; better fail-fast
  • “OOME instantly kills JVM” — only thread dies; JVM continues in corrupted state
  • “OS OOM Killer is a Java problem” — it’s an OS decision; cause: RSS > container limit, not -Xmx

Related topics:

  • [[20. What types of OutOfMemoryError exist]]
  • [[21. What is a memory leak and how to detect it]]
  • [[23. What is a heap dump]]
  • [[18. What are -Xms and -Xmx parameters]]
  • [[6. What is a memory leak in Java]]