What is the difference between Heap and Stack?
Junior Level
Heap and Stack — two separate areas of virtual memory that the JVM requests from the OS.
Why two areas: they have different access patterns. Stack is sequential (LIFO): data is added and removed in strict method call order. Heap is arbitrary: objects live long and are accessible from anywhere.
Simple analogy:
- Stack — like your desk: fast access, but limited space. Used for current tasks.
- Heap — like a warehouse: huge, but you need to walk to get things. Used for storing all your stuff.
Main differences:
| Stack | Heap |
|---|---|
| Stores local variables and references | Stores objects (new Object()) |
| Fast access | Slower |
| Small (1 MB per thread) | Huge (gigabytes) |
| Cleaned up automatically | Cleaned up by Garbage Collector |
| Private to each thread | Shared across all threads |
Example:
```java
public void example() {
    int x = 10;             // Stack (primitive)
    String name = "Ivan";   // Stack (reference) → "Ivan" in Heap
    User user = new User(); // Stack (reference) → User object in Heap
}
```
Errors:
- Stack overflow → `StackOverflowError` (infinite recursion)
- Heap overflow → `OutOfMemoryError: Java heap space`
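Both failure modes are easy to reproduce. A minimal sketch (class and method names are illustrative) that exhausts the stack with unbounded recursion and catches the resulting error purely for demonstration:

```java
// Demonstration: unbounded recursion exhausts the thread's stack.
// Catching StackOverflowError is for illustration only — in real
// code an Error usually should not be caught.
public class OverflowDemo {
    static int depth = 0;

    static void recurse() {
        depth++;   // count frames before the stack fills up
        recurse(); // no base case → one new stack frame per call
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError after " + depth + " frames");
        }
    }
}
```

Similarly, allocating past `-Xmx` (e.g. ever-growing arrays in a loop) raises `OutOfMemoryError: Java heap space`.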
Middle Level
Stack: Execution Memory
Stack frame structure:
Stack Frame (created on method call):
├── Local Variable Table ← local variables
├── Operand Stack ← operand stack for computations
├── Dynamic Linking ← reference to Constant Pool
└── Return Address ← where to return after method
Lifecycle:
- Created on method call → removed on return
- Deterministic — memory is freed instantly
- Does not require Garbage Collector
Parameter: -Xss (default 1 MB)
Heap: State Memory
What’s stored:
- All objects (`new ...`)
- Arrays (even arrays of primitives, e.g. `int[]`)
- Static fields of classes
- String Pool
Allocation:
- TLAB (Thread Local Allocation Buffer) — a thread-private buffer in Eden. Each thread allocates in its own TLAB without synchronization → very fast.
- Pointer Bumping — allocation = simply advancing a pointer (fast!)
Parameters: -Xms (initial), -Xmx (maximum)
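The limits behind `-Xms`/`-Xmx` can be inspected from inside the JVM. A small sketch using the standard `Runtime` API:

```java
// Reading the heap limits configured by -Xms/-Xmx at runtime.
public class HeapLimits {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        System.out.println("max heap (-Xmx):   " + rt.maxMemory() / mb + " MB");
        System.out.println("committed now:     " + rt.totalMemory() / mb + " MB");
        System.out.println("free of committed: " + rt.freeMemory() / mb + " MB");
    }
}
```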
Escape Analysis: JIT Optimization
Terms:
- Compressed OOPs (Compressed Ordinary Object Pointers) — compresses references from 8 to 4 bytes when Heap < 32 GB. Saves ~30% memory.
- Escape Analysis — JIT analysis: does an object “escape” from a method? If not — JIT may allocate it on the Stack or eliminate it entirely (Scalar Replacement).
```java
// JIT may decide an object doesn't "escape" the method
// and allocate it on the stack instead of the heap.
public int sumPoint() {
    Point p = new Point(10, 20); // never leaves the method →
    return p.x + p.y;            // Stack Allocation or Scalar Replacement
}

public Point createPoint() {
    Point p = new Point(10, 20);
    return p; // reference is returned → object "escapes",
              // JIT cannot eliminate this allocation
}
// JIT analyzes all exit paths: if a reference is returned or stored in a
// field, the object "escapes". Check: -XX:+PrintEscapeAnalysis (debug JVM)
// Scalar Replacement: the object is "broken down" into its fields
// Lock Elision: synchronized is removed if the object is used by one thread
```
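Lock Elision can be illustrated with `StringBuffer`, whose methods are `synchronized`. A hedged sketch — whether the JIT actually elides the locks depends on the JVM and warm-up; the observable behavior is the same either way:

```java
// StringBuffer methods are synchronized, but Escape Analysis can
// prove `sb` never leaves this method, so the JIT may remove the
// monitor enter/exit entirely (Lock Elision).
public class LockElisionDemo {
    static String concat(String a, String b) {
        StringBuffer sb = new StringBuffer(); // synchronized class
        sb.append(a).append(b);               // locks eligible for elision
        return sb.toString();                 // only the String escapes
    }

    public static void main(String[] args) {
        System.out.println(concat("Stack", "Heap")); // prints StackHeap
    }
}
```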
Comparison Table
| Criterion | Stack | Heap |
|---|---|---|
| Visibility | Thread-local | Shared |
| Management | Hardware (Stack Pointer) | GC |
| Cleanup | Instant (Stack Unwinding) | Background (STW pauses) |
| Optimizations | Register Allocation | TLAB, Compressed OOPs |
| Errors | StackOverflowError | OutOfMemoryError |
Project Loom (Virtual Threads, Java 21+)
Platform threads: stack in native memory (1 MB fixed)
Virtual threads: stack in Heap as an object!
→ Mount: copied into platform thread stack
→ Unmount: copied back into Heap
→ Result: millions of threads!
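A sketch of the above (requires Java 21+; the thread count is arbitrary). Each virtual thread's stack lives in the Heap, so launching tens of thousands is cheap:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Java 21+: virtual threads keep their stacks in the Heap as objects,
// mounting onto a platform (carrier) thread only while running.
public class VirtualThreadsDemo {
    public static void main(String[] args) throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        Thread[] threads = new Thread[10_000];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = Thread.ofVirtual().start(done::incrementAndGet);
        }
        for (Thread t : threads) {
            t.join(); // wait for every virtual thread to finish
        }
        System.out.println("completed: " + done.get()); // completed: 10000
    }
}
```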
Senior Level
Stack Frame Internal
Stack Frame Layout:
┌─────────────────────────────────────┐
│ Local Variable Table │
│ [0] = this (for non-static) │
│ [1] = param1 │
│ [2] = param2 │
│ ... │
├─────────────────────────────────────┤
│ Operand Stack (for byte-code ops) │
│ push a, push b, iadd, store c │
├─────────────────────────────────────┤
│ Dynamic Linking → Constant Pool │
│ Return Address → next instruction │
│ Exception Table Reference │
└─────────────────────────────────────┘
Frame size is computed at compile time!
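The operand-stack line in the diagram ("push a, push b, iadd, store c") maps directly to the bytecode `javap -c` shows for a simple addition. A minimal example (the bytecode in the comments is what javac typically emits for a static method, where locals 0 and 1 are the parameters):

```java
// Each line's bytecode works on the frame's Operand Stack:
public class OperandStackDemo {
    static int add(int a, int b) {
        int c = a + b; // iload_0, iload_1, iadd, istore_2
        return c;      // iload_2, ireturn
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```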
TLAB (Thread Local Allocation Buffer)
Eden Space:
┌────────────────────────────────────┐
│ Thread 1: [TLAB_1] 64KB │
│ Thread 2: [TLAB_2] 64KB │
│ Thread 3: [TLAB_3] 64KB │
│ ... │
└────────────────────────────────────┘
Allocation in TLAB:
pointer += object_size // Bump-the-pointer
→ 0 synchronization!
→ Often faster than malloc() in multithreaded environments, because malloc requires synchronization, while TLAB does not.
If object > TLAB → allocation in Eden with synchronization
NUMA Awareness
Multi-processor server:
CPU 1 ── Memory A (fast)
CPU 2 ── Memory B (fast for CPU 2, slow for CPU 1)
-XX:+UseNUMA → JVM distributes Heap across NUMA nodes
→ Threads on CPU 1 allocate in Memory A
→ +10-20% performance under high load!
Object Layout in Memory
Object Header (12-16 bytes):
├── Mark Word (8 bytes)
│ ├── Hash Code (31 bits)
│ ├── Age (4 bits) → for GC
│ ├── Lock State (2 bits)
│ └── GC bits
├── Klass Pointer (4 bytes with Compressed OOPs)
│ → Pointer to class metadata in Metaspace
├── Instance Data (object fields)
└── Padding (aligned to 8 bytes)
Compressed OOPs: 32 GB Threshold
64-bit pointer = 8 bytes
Compressed OOP = 4 bytes (32-bit offset, shifted ×8 for 8-byte alignment)
Maximum addressable: 2^32 × 8 bytes = 32 GB
→ 31 GB Heap is faster than 33 GB!
→ Crossing 32 GB = -10-15% performance
→ CPU L1/L2 cache savings
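The 32 GB figure is just arithmetic: a 32-bit compressed reference is scaled by the 8-byte object alignment. A one-line check:

```java
// 2^32 possible compressed references × 8-byte alignment = 32 GB:
public class OopThreshold {
    public static void main(String[] args) {
        long maxAddressable = (1L << 32) * 8;                    // bytes
        System.out.println(maxAddressable / (1L << 30) + " GB"); // 32 GB
    }
}
```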
Future: Project Valhalla (Value Types)
```java
// Future (Project Valhalla): store by value, not by reference
public final class Point {
    public final int x;
    public final int y;
}

// Today:
Point[] now = new Point[1000];    // 1000 references + 1000 objects in Heap
// With Value Types:
Point[] future = new Point[1000]; // Points stored directly in the array!
```
→ No object headers
→ No indirection
→ Dramatically better cache locality
Production Experience
Real scenario: 32 GB threshold killed performance
- Server: 64 GB RAM, `-Xmx48g`
- Crossed 32 GB → Compressed OOPs disabled
- Result: -15% throughput, +20% latency
- Solution: `-Xmx30g` → Compressed OOPs enabled → fixed
Best Practices
- `-Xms` = `-Xmx` in production for long-running servers (prevents resizing overhead). For CLI utilities, keep a small `-Xms` for fast startup.
- Avoid > 32 GB heap without need (keeps Compressed OOPs enabled)
- Escape Analysis — write code so objects don’t “escape”
- TLAB — Heap allocation can be faster than Stack!
- Virtual Threads (Java 21+) for high-concurrency workloads
- StackWalker (Java 9+) for stack analysis
- Monitor `-Xss` — deep recursion = StackOverflowError
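The StackWalker mentioned above inspects the current thread's stack lazily, without materializing a full `Throwable` stack trace. A minimal sketch (Java 9+ API; `Stream.toList()` needs Java 16+):

```java
import java.util.List;

// Walking the current stack: the stream starts at the caller of walk(),
// i.e. topFrames() itself, followed by its own callers.
public class StackWalkDemo {
    static List<String> topFrames() {
        return StackWalker.getInstance().walk(frames ->
                frames.limit(3)
                      .map(StackWalker.StackFrame::getMethodName)
                      .toList());
    }

    public static void main(String[] args) {
        System.out.println(topFrames()); // first entry is "topFrames"
    }
}
```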
Senior Summary
- Stack = execution (fast, deterministic, thread-local)
- Heap = data (huge, shared, requires GC)
- TLAB = pointer bumping → allocation faster than malloc
- Escape Analysis = Stack Allocation + Lock Elision
- Compressed OOPs = 32 GB threshold → critical for performance
- NUMA = distribute Heap across nodes for multi-processor servers
- Object Layout = Mark Word + Klass Pointer + Data + Padding
- Virtual Threads = stack in Heap → millions of threads
Interview Cheat Sheet
Must know:
- Stack — LIFO, thread-local, stores local variables and references; Heap — shared, stores objects and arrays
- Stack is cleaned up automatically on method return, Heap — by Garbage Collector
- Stack Overflow → `StackOverflowError`, Heap Overflow → `OutOfMemoryError`
- TLAB allows object allocation in Heap without synchronization (pointer bumping)
- Escape Analysis (JIT) can allocate an object on Stack if it doesn’t “escape” the method
- Compressed OOPs: when Heap < 32 GB, references are compressed from 8 to 4 bytes → 31 GB is often faster than 33 GB
- Object Layout: Mark Word (8 bytes) + Klass Pointer (4 bytes) + Data + Padding
- Virtual Threads (Java 21+): stack in Heap, ~2 KB per thread instead of 1 MB
Common follow-up questions:
- Why is 31 GB Heap faster than 33 GB? — Compressed OOPs are disabled above 32 GB, pointers double in size → more cache misses
- Can an object be created on Stack? — Yes, through Escape Analysis JIT can do Stack Allocation or Scalar Replacement
- What is TLAB? — Thread Local Allocation Buffer, a thread-private buffer in Eden for lock-free allocation
- Which parameter sets Stack size? — `-Xss` (default 1 MB)
Red flags (DO NOT say):
- “Heap and Stack are the same thing, just different names” — they are two different memory areas
- “GC cleans Stack” — Stack is cleaned automatically on method return
- “Objects are always created in Heap” — JIT may allocate on Stack via Escape Analysis
Related topics:
- [[2. What is stored in Heap]]
- [[3. What is stored in Stack]]
- [[4. What is Garbage Collection]]
- [[8. What are generations in GC]]
- [[18. What are -Xms and -Xmx parameters]]