Question 4 · Section 3

What is Garbage Collection?

GC is based on two observations:

Language versions: English Russian Ukrainian

Junior Level

Garbage Collection (GC) — automatic “garbage cleanup” in Java memory.

“Garbage” — objects that are no longer used by the program (unreachable from GC Roots). GC finds them and frees memory. Without GC, developers would have to manually free memory (like in C/C++), which is prone to leaks and dangling pointers.

Simple analogy: Imagine you’re working at a desk. Over time, the desk gets covered with papers. GC is the cleaner who comes and throws away unnecessary papers, freeing up space.

Why it’s needed:

  • In C/C++, the programmer deletes memory themselves → often forgets → leaks
  • In Java, GC finds and removes unnecessary objects itself → almost no leaks

How GC knows an object is unnecessary:

  • If there are no references to an object → it’s garbage
  • If an object cannot be reached from code → it’s garbage

Example:

public void example() {
    String s1 = "hello";   // Object "hello" is used
    s1 = "world";          // "hello" is no longer needed → GC will remove it!
}

Main GCs in Java:

  • G1 GC — default (Java 9+)
  • ZGC — minimal pauses (< 1 ms), but throughput is 5-10% lower than G1.
  • Parallel GC — for batch processing

Middle Level

Weak Generational Hypothesis

Terms:

  • STW (Stop-The-World) — complete pause of all application threads during garbage collection.
  • SATB (Snapshot-At-The-Beginning) — concurrent marking algorithm: remembers objects that were reachable at the start of GC and only tracks their removal from the reachability graph.
  • Card Table — a bit map where each bit shows whether there are references from Old Gen to Young Gen. Allows GC to avoid scanning the entire Old Gen.

GC is based on two observations:

  1. For most applications: most objects die young — temporary variables, buffers. But for workloads with large long-lived caches, this observation does not hold.
  2. Objects that survive several collections live long — singletons, caches

Consequence: Divide Heap into generations:

  • Young Generation — frequent, fast collections
  • Old Generation — rare, slow collections

Main Algorithms

Algorithm How it works Where applied
Copying Copies live objects to a new area, clears the old one Young Gen
Mark-Sweep Marks live objects, removes dead ones Old Gen
Mark-Compact Marks + shifts live objects (eliminates fragmentation) Full GC

GC Phases

1. Mark
   → Graph traversal from GC Roots
   → All reachable objects are marked

2. Sweep
   → Removal of unmarked objects

3. Compact — not always
   → Shifting live objects to eliminate fragmentation

Stop-The-World (STW)

During GC, the application stops!
  → All threads wait
  → Application "freezes"
  → Duration: from 1 ms (ZGC) to several seconds (Full GC)

GC Types

G1 GC (default):

  • Divides Heap into regions
  • Target pause: -XX:MaxGCPauseMillis=200
  • Good latency/throughput balance

ZGC (Java 21+):

  • Pauses < 1 ms (independent of Heap size)
  • Up to 16 TB of memory
  • -5-10% throughput vs G1

Parallel GC:

  • Maximum throughput
  • Long pauses (500 ms+)
  • For Big Data, batch processing

Common mistakes

  1. System.gc() in code
    // ❌ Manual GC call — anti-pattern
    System.gc();  // Stops the entire application!
    
  2. Too small Heap
    -Xmx256m for Spring Boot → constant GC
    → Application slows down
    → Solution: -Xmx1g minimum
    
  3. Memory leaks
    // ❌ Static collection grows infinitely
    static List<Data> cache = new ArrayList<>();
    cache.add(data);  // GC cannot remove!
    

When NOT to use each GC

  • Serial GC — NOT for servers (single thread, long pauses)
  • Parallel GC — NOT when pauses matter (SLA < 100 ms)
  • G1 GC — NOT for SLA < 10 ms (use ZGC), NOT for batch without latency requirements (use Parallel)
  • ZGC — NOT on Java < 15 (no stable version), NOT when maximum throughput matters

Senior Level

GC Trade-offs (GC Triangle)

         Latency
          / \
         /   \
  Throughput ── Footprint

Cannot optimize all three simultaneously!

G1: latency + throughput balance
ZGC: minimum latency (at the cost of throughput: ZGC sacrifices 5-15% throughput for pauses < 1 ms)
Parallel: maximum throughput (at the cost of latency)

Safepoints and STW Mechanics

Threads cannot stop instantly!
→ Must reach a Safepoint (stop point)

Where Safepoints are placed:
  - Before method exit
  - At loop ends
  - Before method calls

Mechanism: Safepoint Poll
  → JVM marks the page as inaccessible
  → Thread reads → SIGSEGV → goes to waiting

TTSP (Time To Safepoint):
  → Time from "stop" command to the last thread
  → Counted Loops can delay for seconds!
  → -XX:+UseCountedLoopSafepoints

Write Barriers and Card Tables

Problem: Minor GC doesn't want to scan entire Old Gen

Solution: Card Table
  → Old Gen split into 512-byte cards
  → When writing oldObj.field = youngObj
    → Write Barrier marks the card as Dirty
  → Minor GC scans only Dirty cards

Write Barrier (inserted by JIT):
  card_table[address >> 9] = DIRTY
  → ~1-2 ns overhead per reference write

Modern GCs (Java 17-21+)

G1 GC:

  • Regions 1-32 MB (dynamic)
  • Remembered Sets (RSet) for inter-region references
  • SATB (Snapshot-At-The-Beginning) for concurrent marking
  • Mixed GC: Young + “dirtiest” Old regions

ZGC:

  • Colored Pointers (metadata in pointers)
  • Load Barriers (check on read)
  • Multi-mapping of virtual memory
  • Generational ZGC (Java 21+) — generational separation

Shenandoah:

  • Brooks Pointers → Load Reference Barriers
  • Concurrent compaction (Evacuation)
  • Alternative to ZGC for OpenJDK

Epsilon GC:

  • “No-op GC” — allocation only
  • For performance tests
  • For short-lived microservices

Allocation Stall

Most dangerous scenario for concurrent GCs:

Application creates garbage faster than GC collects
→ Memory runs out
→ Application thread FREEZES and waits for GC
→ Pause: hundreds of milliseconds or seconds

Solution:
  → Increase Heap
  → Increase ConcGCThreads
  → Decrease Allocation Rate

Production Experience

Real scenario #1: Full GC every minute

  • Spring Boot, -Xmx512m → too small
  • Full GC every 60 seconds, 2-second pause
  • SLA violated
  • Solution: -Xmx2g → Minor GC every 5 minutes, 50 ms pause

Real scenario #2: ZGC saved HFT system

  • High-frequency trading: SLA < 10 ms
  • G1 GC: 50-200 ms pauses → missed trades
  • ZGC: pauses < 1 ms → SLA met
  • Cost: -8% throughput (accepted)

Best Practices

  1. Do not call System.gc() manually
  2. G1 GC — good default choice
  3. ZGC (Java 21+) — if SLA < 10 ms
  4. Parallel GC — for batch processing
  5. Monitor GC logs via -Xlog:gc*
  6. Safepoint logs for “hang” diagnostics
  7. Avoid Allocation Stall → sufficient Heap
  8. The best garbage is the one not created → optimize allocations before tuning GC

Senior Summary

  • GC = resource management system, not just “cleanup”
  • Triangle: Latency ↔ Throughput ↔ Footprint
  • Safepoints = STW mechanism, TTSP can be a problem
  • Write Barriers = Card Tables for Minor GC optimization
  • ZGC = Colored Pointers + Load Barriers → < 1 ms pauses
  • Allocation Stall = catastrophe for latency
  • GC choice = architectural decision based on SLA
  • Best garbage = not created → optimize code before GC tuning

Interview Cheat Sheet

Must know:

  • GC automatically removes objects unreachable from GC Roots — developers don’t need to manage memory manually
  • Weak Generational Hypothesis: most objects die young → Heap split into Young/Old Generation
  • Main algorithms: Copying (Young Gen), Mark-Sweep/Mark-Compact (Old Gen)
  • G1 GC — default since Java 9, latency/throughput balance; ZGC — pauses < 1 ms, but throughput 5-15% lower
  • Stop-The-World: all threads stop during GC; ZGC < 1 ms, G1 20-200 ms, Full GC — seconds
  • Safepoints — points in code where threads can stop; TTSP can be longer than the GC itself
  • Card Table + Write Barriers optimize Minor GC, avoiding scanning entire Old Gen
  • The best garbage is the one not created; optimize allocations before tuning GC

Common follow-up questions:

  • Why can’t we optimize latency, throughput, and footprint simultaneously? — GC Triangle: improving one worsens another (e.g., ZGC sacrifices throughput for pauses < 1 ms)
  • What is Allocation Stall? — Application creates garbage faster than concurrent GC can keep up → thread freezes and waits
  • Why is ZGC slower than G1 in throughput? — Load Barriers on every reference read add 5-15% CPU overhead
  • When to call System.gc()? — Never in production. Only in tests/benchmarks.

Red flags (DO NOT say):

  • “I call System.gc() after big operations for optimization” — anti-pattern, GC is smarter
  • “GC uses Reference Counting” — Java uses Reachability Analysis, not reference counting
  • “ZGC is always better than G1” — ZGC sacrifices throughput; for batch processing G1/Parallel is better
  • “GC can be completely disabled” — Epsilon GC doesn’t collect garbage at all, but it’s not a production solution for long-lived apps

Related topics:

  • [[5. When does an object become eligible for GC]]
  • [[8. What are generations in GC]]
  • [[12. What GC algorithms exist]]
  • [[13. What is G1 GC]]
  • [[16. What is stop-the-world]]