Question 1 · Section 3

What is the difference between Heap and Stack?



Junior Level

Heap and Stack — two separate areas of virtual memory that the JVM requests from the OS.

Why two areas: they have different access patterns. Stack is sequential (LIFO): data is added and removed in strict method call order. Heap is arbitrary: objects live long and are accessible from anywhere.

Simple analogy:

  • Stack — like your desk: fast access, but limited space. Used for current tasks.
  • Heap — like a warehouse: huge, but you need to walk to get things. Used for storing all your stuff.

Main differences:

| Stack | Heap |
|---|---|
| Stores local variables and references | Stores objects (`new Object()`) |
| Fast access | Slower access |
| Small (typically 1 MB per thread) | Huge (gigabytes) |
| Cleaned up automatically | Cleaned up by Garbage Collector |
| Private to each thread | Shared across all threads |

Example:

public void example() {
    int x = 10;           // Stack (primitive)
    String name = "Ivan"; // Stack (reference) → "Ivan" in Heap
    User user = new User(); // Stack (reference) → User object in Heap
}

Errors:

  • Stack overflow → StackOverflowError (infinite recursion)
  • Heap overflow → OutOfMemoryError: Java heap space
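
A minimal sketch of the first error (class name is illustrative): unbounded recursion keeps pushing stack frames until the thread's stack (`-Xss`) is exhausted.

```java
public class Overflow {
    static int depth = 0;

    static void recurse() {
        depth++;       // each call pushes a new stack frame
        recurse();     // no base case → frames pile up until -Xss is exhausted
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // depth shows roughly how many frames fit in the default stack
            System.out.println("StackOverflowError after " + depth + " frames");
        }
    }
}
```

The exact depth depends on the `-Xss` value and frame size, so it varies between runs and JVMs.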

Middle Level

Stack: Execution Memory

Stack frame structure:

Stack Frame (created on method call):
├── Local Variable Table  ← local variables
├── Operand Stack         ← operand stack for computations
├── Dynamic Linking       ← reference to Constant Pool
└── Return Address        ← where to return after method
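
The Operand Stack can be illustrated with a simple addition. The bytecode in the comments is the conceptual `javac` output for this method; the exact instructions may differ slightly between compilers.

```java
public class OperandDemo {
    // For `int c = a + b;` in this static method, javac emits roughly:
    //   iload_0   // push a (Local Variable Table slot 0)
    //   iload_1   // push b (slot 1)
    //   iadd      // pop both operands, push the sum onto the Operand Stack
    //   istore_2  // pop the sum into slot 2 (c)
    static int add(int a, int b) {
        int c = a + b;
        return c;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
    }
}
```

Run `javap -c OperandDemo` to see the actual instructions emitted for your compiler.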

Lifecycle:

  • Created on method call → removed on return
  • Deterministic — memory is freed instantly
  • Does not require Garbage Collector

Parameter: -Xss (default 1 MB)

Heap: State Memory

What’s stored:

  • All objects (new ...)
  • Arrays (even of primitives int[])
  • Static fields of classes
  • String Pool

Allocation:

  • TLAB (Thread Local Allocation Buffer) — a thread-private buffer in Eden. Each thread allocates in its own TLAB without synchronization → very fast.
  • Pointer Bumping — allocation = simply advancing a pointer (fast!)

Parameters: -Xms (initial), -Xmx (maximum)
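
These limits can be inspected at runtime via `Runtime`; the printed values depend on your JVM flags, so the numbers are environment-specific.

```java
public class HeapSettings {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        System.out.println("max   (-Xmx): " + rt.maxMemory() / mb + " MB");
        System.out.println("total (currently committed): " + rt.totalMemory() / mb + " MB");
        System.out.println("free  (within committed): " + rt.freeMemory() / mb + " MB");
    }
}
```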

Escape Analysis: JIT Optimization

Terms:

  • Compressed OOPs (Compressed Ordinary Object Pointers) — compresses references from 8 to 4 bytes when Heap < 32 GB. Saves ~30% memory.
  • Escape Analysis — JIT analysis: does an object “escape” from a method? If not — JIT may allocate it on the Stack or eliminate it entirely (Scalar Replacement).
// Non-escaping object: JIT may apply Scalar Replacement —
// the allocation is eliminated and p.x / p.y become local variables.
public int lengthSquared() {
    Point p = new Point(10, 20);  // may never reach the Heap
    return p.x * p.x + p.y * p.y;
}

// Escaping object: the reference is returned, so the object must
// live in the Heap — JIT cannot eliminate this allocation.
public Point createPoint() {
    return new Point(10, 20);
}
// JIT analyzes all exit paths: if a reference is returned or stored
// in a field, the object "escapes". Diagnostics (debug JVM builds):
// -XX:+UnlockDiagnosticVMOptions -XX:+PrintEscapeAnalysis

// Scalar Replacement: object is "broken down" into variables
// Lock Elision: synchronized is removed if object is used by one thread

Comparison Table

| Criterion | Stack | Heap |
|---|---|---|
| Visibility | Thread-local | Shared |
| Management | Stack Pointer (hardware) | GC |
| Cleanup | Instant (Stack Unwinding) | Background (STW pauses) |
| Optimizations | Register Allocation | TLAB, Compressed OOPs |
| Errors | StackOverflowError | OutOfMemoryError |

Project Loom (Virtual Threads, Java 21+)

Platform threads: stack in native memory (1 MB fixed)
Virtual threads: stack in Heap as an object!
  → Mount: copied into platform thread stack
  → Unmount: copied back into Heap
  → Result: millions of threads!
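
A minimal sketch of the above (requires Java 21+; `Thread.ofVirtual()` is the standard API):

```java
public class VirtualDemo {
    public static void main(String[] args) throws InterruptedException {
        // The virtual thread's stack lives in the Heap until it is
        // mounted on a carrier (platform) thread to run.
        Thread vt = Thread.ofVirtual().start(() ->
                System.out.println("virtual: " + Thread.currentThread().isVirtual()));
        vt.join();
    }
}
```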

Senior Level

Stack Frame Internal

Stack Frame Layout:
┌─────────────────────────────────────┐
│ Local Variable Table                │
│   [0] = this (for non-static)       │
│   [1] = param1                      │
│   [2] = param2                      │
│   ...                               │
├─────────────────────────────────────┤
│ Operand Stack (for byte-code ops)   │
│   push a, push b, iadd, store c     │
├─────────────────────────────────────┤
│ Dynamic Linking → Constant Pool     │
│ Return Address → next instruction   │
│ Exception Table Reference           │
└─────────────────────────────────────┘

Frame size is computed at compile time!

TLAB (Thread Local Allocation Buffer)

Eden Space:
┌────────────────────────────────────┐
│ Thread 1: [TLAB_1] 64KB            │
│ Thread 2: [TLAB_2] 64KB            │
│ Thread 3: [TLAB_3] 64KB            │
│ ...                                │
└────────────────────────────────────┘

Allocation in TLAB:
  pointer += object_size  // Bump-the-pointer
  → 0 synchronization!
  → Often faster than malloc() in multithreaded environments, because malloc requires synchronization, while TLAB does not.

If object > TLAB → allocation in Eden with synchronization

NUMA Awareness

Multi-processor server:
CPU 1 ── Memory A (fast)
CPU 2 ── Memory B (fast for CPU 2, slow for CPU 1)

-XX:+UseNUMA → JVM distributes Heap across NUMA nodes
→ Threads on CPU 1 allocate in Memory A
→ +10-20% performance on Highload!

Object Layout in Memory

Object Header (12-16 bytes):
├── Mark Word (8 bytes)
│     ├── Hash Code (31 bits)
│     ├── Age (4 bits) → for GC
│     ├── Lock State (2 bits)
│     └── GC bits
├── Klass Pointer (4 bytes with Compressed OOPs)
│     → Pointer to class metadata in Metaspace
├── Instance Data (object fields)
└── Padding (aligned to 8 bytes)
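
The layout above lets us estimate object size by hand. A back-of-envelope calculation for a hypothetical `Point(int x, int y)` with Compressed OOPs enabled:

```java
public class ObjectSizeEstimate {
    public static void main(String[] args) {
        int header = 8 + 4;               // Mark Word + compressed Klass Pointer
        int fields = 2 * 4;               // two int fields (x, y)
        int unpadded = header + fields;   // 20 bytes
        int padded = (unpadded + 7) & ~7; // align up to 8 bytes
        System.out.println("estimated: " + padded + " bytes");
    }
}
```

Tools like the JOL (Java Object Layout) library report the real layout, including field reordering and gaps.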

Compressed OOPs: 32 GB Threshold

64-bit pointer = 8 bytes
Compressed OOPs = 4 bytes (32-bit offset, scaled by the 8-byte object alignment)

Maximum address: 2^32 × 8 = 32 GB

→ 31 GB Heap is faster than 33 GB!
→ Crossing 32 GB = -10-15% performance
→ CPU L1/L2 cache savings
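
The 32 GB threshold follows directly from the arithmetic: a 32-bit compressed reference, scaled by the default 8-byte object alignment.

```java
public class OopLimit {
    public static void main(String[] args) {
        long references = 1L << 32;  // distinct 32-bit compressed pointers
        long alignment = 8;          // default object alignment in bytes
        long maxBytes = references * alignment;
        System.out.println("max Heap with Compressed OOPs: "
                + maxBytes / (1L << 30) + " GB");
    }
}
```

Raising `-XX:ObjectAlignmentInBytes` shifts this threshold upward at the cost of more padding per object.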

Future: Project Valhalla (Value Types)

// Future: store by value, not by reference
public final class Point {
    public final int x;
    public final int y;
}

// Now:
Point[] arr = new Point[1000];  // 1000 objects in Heap + 1000 references

// Future (Value Types):
Point[] arr = new Point[1000];  // Stored directly in the array!
  • No object headers
  • No indirection (pointer chasing)
  • Significantly better cache locality

Production Experience

Real scenario: 32 GB threshold killed performance

  • Server: 64 GB RAM, -Xmx48g
  • Crossed 32 GB → Compressed OOPs disabled
  • Result: -15% throughput, +20% latency
  • Solution: -Xmx30g → Compressed OOPs enabled → fixed

Best Practices

  1. -Xms = -Xmx in production for long-running servers (prevents resizing overhead). For CLI utilities, keep a small -Xms for fast startup.
  2. Avoid > 32 GB without need (Compressed OOPs)
  3. Escape Analysis — write so objects don’t “escape”
  4. TLAB — Heap allocation is just a pointer bump, often nearly as cheap as Stack allocation
  5. Virtual Threads (Java 21+) for high-concurrency
  6. StackWalker (Java 9+) for stack analysis
  7. Monitor -Xss — deep recursion = StackOverflowError
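
Item 6's StackWalker (Java 9+) can be sketched like this — a minimal walk that prints the method on top of the caller's stack:

```java
public class WalkDemo {
    public static void main(String[] args) {
        StackWalker walker = StackWalker.getInstance();
        // walk() hands the frames to a function as a lazy Stream;
        // frames are only materialized as the Stream consumes them.
        walker.walk(frames -> frames
                        .map(StackWalker.StackFrame::getMethodName)
                        .findFirst())
              .ifPresent(m -> System.out.println("top frame: " + m));
    }
}
```

Unlike `Thread.getStackTrace()`, StackWalker avoids capturing the whole stack up front, which matters on deep stacks.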

Senior Summary

  • Stack = execution (fast, deterministic, thread-local)
  • Heap = data (huge, shared, requires GC)
  • TLAB = pointer bumping → allocation faster than malloc
  • Escape Analysis = Stack Allocation + Lock Elision
  • Compressed OOPs = 32 GB threshold → critical for performance
  • NUMA = distribute Heap across nodes for multi-processor servers
  • Object Layout = Mark Word + Klass Pointer + Data + Padding
  • Virtual Threads = stack in Heap → millions of threads

Interview Cheat Sheet

Must know:

  • Stack — LIFO, thread-local, stores local variables and references; Heap — shared, stores objects and arrays
  • Stack is cleaned up automatically on method return, Heap — by Garbage Collector
  • Stack Overflow → StackOverflowError, Heap Overflow → OutOfMemoryError
  • TLAB allows object allocation in Heap without synchronization (pointer bumping)
  • Escape Analysis (JIT) can allocate an object on Stack if it doesn’t “escape” the method
  • Compressed OOPs: when Heap < 32 GB, references are compressed from 8 to 4 bytes → 31 GB is often faster than 33 GB
  • Object Layout: Mark Word (8 bytes) + Klass Pointer (4 bytes) + Data + Padding
  • Virtual Threads (Java 21+): stack in Heap, ~2 KB per thread instead of 1 MB

Common follow-up questions:

  • Why is 31 GB Heap faster than 33 GB? — Compressed OOPs are disabled above 32 GB, pointers double in size → more cache misses
  • Can an object be created on Stack? — Yes, through Escape Analysis JIT can do Stack Allocation or Scalar Replacement
  • What is TLAB? — Thread Local Allocation Buffer, a thread-private buffer in Eden for lock-free allocation
  • Which parameter sets Stack size? — -Xss (default 1 MB)

Red flags (DO NOT say):

  • “Heap and Stack are the same thing, just different names” — they are two different memory areas
  • “GC cleans Stack” — Stack is cleaned automatically on method return
  • “Objects are always created in Heap” — JIT may allocate on Stack via Escape Analysis

Related topics:

  • [[2. What is stored in Heap]]
  • [[3. What is stored in Stack]]
  • [[4. What is Garbage Collection]]
  • [[8. What are generations in GC]]
  • [[18. What are -Xms and -Xmx parameters]]