Question 19 · Section 3

What happens on OutOfMemoryError?

Short answer: the thread that requested memory dies; the JVM itself keeps running.


Junior Level

OutOfMemoryError (OOME) — an Error thrown when the JVM cannot allocate memory for an object and the garbage collector cannot free enough space.

What happens:

  1. Memory allocation attempt → no space
  2. GC tries to clean → still no space
  3. OutOfMemoryError is thrown
  4. Thread that requested memory → dies

Types of OOME:

  • Java heap space — the heap is full
  • Metaspace — the class-metadata area is full
  • GC overhead limit exceeded — GC runs almost constantly but reclaims almost nothing
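The most common case, Java heap space, can be seen in a few lines. This is a demonstration only (the class name is illustrative): a single allocation no realistic heap can satisfy fails fast with an OOME. The error is caught here purely to show its type — as the Best Practices section stresses, business code should never catch OOME. The exact message ("Java heap space" vs "Requested array size exceeds VM limit") depends on the JVM and heap size.

```java
public class OomeDemo {
    // Tries an allocation far beyond any realistic heap and reports the outcome.
    static String allocateHuge() {
        try {
            // ~17 GB in one array; HotSpot rejects this and throws OOME
            long[] huge = new long[Integer.MAX_VALUE];
            return "allocated " + huge.length; // not reached in practice
        } catch (OutOfMemoryError e) {
            // DEMO ONLY: never catch OOME in real code
            return "OOME";
        }
    }

    public static void main(String[] args) {
        System.out.println(allocateHuge());
    }
}
```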

Middle Level

OOME Mechanics

1. Allocation failure → no free memory
2. Full GC → tries to reclaim space
3. SoftReferences are cleared → still no space
4. GC overhead limit check — THIS IS A SPECIFIC type of OOME (GC overhead limit exceeded).
   It triggers when the JVM spends >98% of its time in GC and frees <2% of the heap.
   NOT all OOMEs go through this check — it's a separate heuristic.
5. → OutOfMemoryError is thrown!

Zombie State

OOME can occur ANYWHERE, including in the middle of resource handling.
If OOME happens while opening a file → the file may remain open.
If OOME happens while a thread holds a lock (and unlock is not in finally) → the lock is never released.
⚠️ finally blocks normally DO run when an OOME propagates, but they can themselves fail if executing them requires another allocation that also fails.

OOME does NOT kill JVM instantly!
  → Only the requesting thread dies
  → Other threads keep working
  → BUT: state may be corrupted!
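The "only the requesting thread dies" behavior is easy to observe (class name illustrative): a worker thread hits an OOME and terminates, while the thread that started it keeps running.

```java
public class ThreadDeathDemo {
    // Runs an oversized allocation in a separate thread and reports what killed it.
    static String runWorker() throws InterruptedException {
        final String[] cause = {"nothing"};
        Thread worker = new Thread(() -> {
            // OOME is thrown in THIS thread only
            long[] huge = new long[Integer.MAX_VALUE];
            System.out.println(huge.length); // not reached in practice
        });
        worker.setUncaughtExceptionHandler(
                (t, e) -> cause[0] = e.getClass().getSimpleName());
        worker.start();
        worker.join();   // worker is dead here...
        return cause[0]; // ...but the calling thread is still running
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Worker died with: " + runWorker());
        System.out.println("Main thread is still alive");
    }
}
```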

JVM OOME vs OS OOM Killer

JVM OOME:
  → You see stack trace in logs
  → There's a heap dump

OS OOM Killer (Docker/K8s):
  → Process simply disappears
  → In dmesg: "Out of memory: Kill process"
  → No Java logs!
  → Cause: RSS > container limit

Production: Fail-Fast

# Mandatory in production:
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/log/dumps/
-XX:+ExitOnOutOfMemoryError

# → On OOME: dump + exit
# → Orchestrator will restart
# → No zombie state

Senior Level

Thread Death and Corruption

OOME in thread holding a Lock:
  → Lock not released
  → Other threads wait forever
  → Deadlock!
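A minimal sketch of the zombie-lock scenario. The class name is illustrative, and the OOME is simulated with `throw new OutOfMemoryError(...)` to keep the demo deterministic — a real allocation failure leaves the lock in exactly the same state.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class ZombieLockDemo {
    static final ReentrantLock lock = new ReentrantLock();

    // Returns whether the lock can be re-acquired after its holder died.
    static boolean lockAvailableAfterHolderDies() throws InterruptedException {
        Thread holder = new Thread(() -> {
            lock.lock(); // no try/finally -> unlock is never reached
            // Simulated OOME; a real one behaves identically for lock state
            throw new OutOfMemoryError("simulated");
        });
        holder.setUncaughtExceptionHandler((t, e) -> { /* silence the demo */ });
        holder.start();
        holder.join(); // holder thread is dead now
        // ReentrantLock is NOT released when its owner thread dies:
        return lock.tryLock(100, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Lock available after holder died? "
                + lockAvailableAfterHolderDies()); // false: zombie lock
    }
}
```

Any thread that calls a plain `lock.lock()` here would block forever — the deadlock described above.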

OOME in ConcurrentHashMap:
  → Map in intermediate state
  → Subsequent operations → undefined behavior

Native Memory OOME

DirectByteBuffer leak:
  → Heap empty
  → Native Memory = 100%
  → OOM Killer kills process
  → No Java logs!

Diagnostics:
  -XX:NativeMemoryTracking=detail
  dmesg | grep -i oom
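A sketch of why off-heap memory is invisible to heap monitoring (class name illustrative): `ByteBuffer.allocateDirect` takes native memory that -Xmx does not govern, so the heap barely moves while process RSS grows — exactly the memory the OS OOM Killer counts.

```java
import java.nio.ByteBuffer;

public class DirectBufferDemo {
    // Allocates native (off-heap) memory through NIO and returns its size.
    static int allocateOffHeap(int bytes) {
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes);
        return direct.capacity();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long heapBefore = rt.totalMemory() - rt.freeMemory();

        // 64 MB of NATIVE memory: the heap only holds the tiny
        // DirectByteBuffer wrapper object, while process RSS grows by 64 MB
        int capacity = allocateOffHeap(64 * 1024 * 1024);

        long heapAfter = rt.totalMemory() - rt.freeMemory();
        System.out.println("Off-heap bytes allocated: " + capacity);
        System.out.println("Heap growth (bytes): " + (heapAfter - heapBefore));
    }
}
```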

Diagnostics by Type

| Error type | Cause | Solution |
|---|---|---|
| Java heap space | leak or too small -Xmx | heap dump analysis in MAT |
| GC overhead limit exceeded | heap nearly full, GC thrashing | GC logs, tune heap |
| Metaspace | ClassLoader leak | jcmd VM.metaspace |
| Direct buffer memory | Netty/NIO leak | NMT |
| Unable to create native thread | OS thread limit | ulimit -u |

Heap Dump Analysis

# Automatic dump on OOME
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/dumps/

# Analysis in Eclipse MAT:
→ Dominator Tree
→ Path to GC Roots
→ Find the leak

Best Practices

  1. Never catch OOME in business code
  2. ExitOnOutOfMemoryError in production
  3. HeapDumpOnOutOfMemoryError — mandatory
  4. HeapDumpPath → external volume
  5. Monitor RSS and Native Memory
  6. NMT for off-heap leaks
  7. Fail-fast > zombie state

Senior Summary

  • OOME ≠ instant JVM death
  • Zombie state = corrupted locks and structures
  • OS OOM Killer = no logs, process disappears
  • Fail-fast = ExitOnOutOfMemoryError
  • Heap Dump = main diagnostic tool
  • Native Memory ≠ Heap (need NMT)
  • Never catch OOME

Interview Cheat Sheet

Must know:

  • OOME: JVM cannot allocate memory → GC tries to clean → SoftReference cleaned → still no → OutOfMemoryError
  • OOME kills only the requesting thread, NOT the entire JVM; other threads work in corrupted state (zombie state)
  • Zombie state: OOME during lock acquisition → lock not released → deadlock; OOME in ConcurrentHashMap → undefined behavior
  • JVM OOME vs OS OOM Killer: JVM OOME gives stack trace and dump; OS OOM Killer (Docker/K8s) — process disappears, no logs
  • GC Overhead Limit: specific type of OOME; triggers when JVM spends >98% time on GC and frees <2% of Heap
  • ExitOnOutOfMemoryError + HeapDumpOnOutOfMemoryError — mandatory in production: dump + exit → orchestrator restarts
  • finally blocks may NOT execute if OOME occurs while allocating memory for the stack frame itself

Common follow-up questions:

  • Why shouldn’t OOME be caught in business code? — State corrupted: lock not released, Map in intermediate state; application unreliable
  • How does JVM OOME differ from OS OOM Killer? — JVM OOME: stack trace in logs, dump available; OS OOM Killer: dmesg | grep oom, process killed via SIGKILL, no Java logs
  • What is zombie state? — JVM continues working after OOME of one thread, but with corrupted locks/structures → undefined behavior
  • Why should HeapDumpPath be on external volume? — In Kubernetes, a dump written to ephemeral storage is lost when the pod restarts, and a large dump can exhaust ephemeral storage and get the pod evicted

Red flags (DO NOT say):

  • “I catch OOME and continue working” — zombie state, corrupted data; better fail-fast
  • “OOME instantly kills JVM” — only thread dies; JVM continues in corrupted state
  • “OS OOM Killer is a Java problem” — it’s an OS decision; cause: RSS > container limit, not -Xmx

Related topics:

  • [[20. What types of OutOfMemoryError exist]]
  • [[21. What is a memory leak and how to detect it]]
  • [[23. What is a heap dump]]
  • [[18. What are -Xms and -Xmx parameters]]
  • [[6. What is a memory leak in Java]]