What is String Deduplication in G1 GC?
🟢 Junior Level
String Deduplication is a feature of the G1 garbage collector that automatically finds identical strings in memory and merges their internal arrays to save memory.
How it works:
- The JVM notices that two different `String` objects contain the same text
- Instead of storing two identical byte arrays, it makes both strings reference one array
- This happens automatically: you don't need to change code
How to enable:
java -XX:+UseG1GC -XX:+UseStringDeduplication -jar app.jar
Simple analogy: Imagine you have 100 copies of the same book in a library. Deduplication is when the librarian leaves one book on the shelf and gives all other readers a reference to it. Same text, but space is saved.
Difference from String Pool:
| Aspect | String Pool (intern()) | String Deduplication |
|---|---|---|
| What combines | String objects | Internal byte[] arrays |
| Need code change? | Yes (str.intern()) | No (only JVM flag) |
| When it works | On intern() call | During GC (in background) |
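The identity column of this comparison can be checked with a short program. This is a minimal sketch: the class name `IdentityVsContent` and the helper `fresh` are made up for illustration. Deduplication itself is not observable through public APIs, so the sketch only shows what `intern()` changes and what stays true for two distinct `String` objects regardless of deduplication.

```java
final class IdentityVsContent {
    /** Builds a new String object at runtime so it is not a pooled compile-time constant. */
    static String fresh(String text) {
        return new String(text.toCharArray());
    }

    public static void main(String[] args) {
        String a = fresh("status=OK");
        String b = fresh("status=OK");

        // Two distinct objects with equal content: deduplication never changes this.
        System.out.println(a == b);       // false
        System.out.println(a.equals(b));  // true

        // intern() returns one canonical object per distinct content.
        System.out.println(a.intern() == b.intern()); // true
    }
}
```

Code that compares strings with `equals` behaves the same with or without the flag; only code that relies on `==` needs `intern()`.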
🟡 Middle Level
How it works internally
1. Scanning: During GC (evacuation phase), G1 marks `String` objects in the collected regions
2. Queue: References to candidate strings are placed in the deduplication queue
3. Background thread: A separate thread computes a hash of the `byte[]` and searches for matches in the deduplication table
4. Merging: If an identical array is found, the `value` field is redirected to the existing array via an atomic operation
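The lookup-and-merge steps can be modeled in plain Java. This is a toy sketch, not HotSpot's implementation: the class `DedupTableSketch` and its method `canonicalize` are hypothetical names, the table lives on the Java heap instead of native memory, and weak references and the GC queue are left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a deduplication table: content hash -> arrays seen with that hash. */
final class DedupTableSketch {
    private final Map<Integer, List<byte[]>> buckets = new HashMap<>();

    /** Returns the canonical array for this content, registering {@code value} if it is new. */
    byte[] canonicalize(byte[] value) {
        int hash = Arrays.hashCode(value);            // hash the bytes, not the String
        List<byte[]> bucket = buckets.computeIfAbsent(hash, h -> new ArrayList<>());
        for (byte[] existing : bucket) {
            if (Arrays.equals(existing, value)) {     // byte-by-byte check guards against collisions
                return existing;                      // caller would redirect its reference here
            }
        }
        bucket.add(value);                            // first occurrence becomes the canonical copy
        return value;
    }

    public static void main(String[] args) {
        DedupTableSketch table = new DedupTableSketch();
        byte[] first = "INFO".getBytes(StandardCharsets.ISO_8859_1);
        byte[] second = "INFO".getBytes(StandardCharsets.ISO_8859_1); // equal content, distinct array
        byte[] other = "WARN".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(table.canonicalize(first) == first);   // true: first occurrence is kept
        System.out.println(table.canonicalize(second) == first);  // true: duplicate maps to the first array
        System.out.println(table.canonicalize(other) == other);   // true: new content is registered
    }
}
```

Once every holder of `second` points at `first` instead, the duplicate array becomes unreachable and the next GC reclaims it.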
Difference from String Pool β detailed comparison
| Characteristic | String Pool (intern()) | String Deduplication |
|---|---|---|
| What combines | String objects | Internal byte[] arrays |
| When | On intern() call (synchronous) | During GC (asynchronous) |
| Management | Manual (need to call intern()) | Automatic (JVM flag) |
| Works with any GC | Yes | G1 and Shenandoah; Serial, Parallel, and ZGC only since JDK 18 |
| Effect on == | Makes == true | == remains false (objects different) |
| CPU overhead | On each intern() | Background thread, ~2-5% CPU |
| Table memory | StringTable in Heap (~32 bytes/entry) | Native memory (~10-50MB) |
When to enable
- Profiler shows many duplicate strings in Heap
- You can't use `intern()` (legacy code, complex logic, no code access)
- Application runs on G1 GC (default in Java 9+)
- Heap > 4GB and strings occupy significant portion
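A profiler or heap dump is the real tool for the first criterion, but the underlying measurement is simple to sketch. The helper below is a hypothetical illustration (`DuplicateRatio` and `duplicateFraction` are made-up names). It estimates what fraction of character data in a sample of strings repeats content already seen, using `length()` as a rough stand-in for memory size.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DuplicateRatio {
    /** Fraction (0.0 to 1.0) of character data held by strings whose content was already seen. */
    static double duplicateFraction(List<String> sample) {
        Set<String> seen = new HashSet<>();
        long total = 0;
        long duplicate = 0;
        for (String s : sample) {
            long size = s.length();       // rough proxy for the backing array size
            total += size;
            if (!seen.add(s)) {           // add() returns false when equal content is already present
                duplicate += size;
            }
        }
        return total == 0 ? 0.0 : (double) duplicate / total;
    }

    public static void main(String[] args) {
        // 16 characters in total, 8 of them in repeated "INFO" values
        System.out.println(duplicateFraction(List.of("INFO", "INFO", "WARN", "INFO"))); // 0.5
    }
}
```

A result near zero on a representative sample suggests the flag would add overhead without much benefit.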
Table of typical mistakes
| Mistake | Consequences | Solution |
|---|---|---|
| Expecting instant results | "Enabled, but memory didn't free" | Deduplication happens during GC, not instantly; needs several GC cycles |
| Enabling without monitoring | Don't know if it works at all | Check via -XX:+PrintStringDeduplicationStatistics on JDK 8; on JDK 9+ use unified logging, `-Xlog:gc+stringdedup*=debug` (JDK 9 to 16) or `-Xlog:stringdedup*=debug` (JDK 17+) |
| Expecting == to become true | Comparison logic broken | Deduplication doesn't change object references, only internal arrays |
| Enabling on ZGC before JDK 18 | Doesn't work | ZGC supports String Deduplication only from JDK 18; on older JDKs use G1 or Shenandoah |
When NOT to use
- Few duplicates: if strings are mostly unique, you pay the overhead without benefit
- Short-lived strings: die before reaching the deduplication queue
- ZGC on JDK 17 or older: not supported (use `-XX:+UseStringDeduplication` with G1 or Shenandoah, or move to JDK 18+)
- Ultra-low-latency: 2-5% CPU overhead may be critical
🔴 Senior Level
Internal Implementation: G1 GC Deduplication Pipeline
┌──────────────────────────────────────────────────────────────────────────┐
│ GC (Evacuation Phase)                                                    │
│ ├── Identify String objects in collection set                            │
│ ├── Filter: age >= StringDeduplicationAgeThreshold (default 3)           │
│ ├── Enqueue candidates to dedup queue                                    │
│ └── Continue evacuation                                                  │
├──────────────────────────────────────────────────────────────────────────┤
│ Deduplication Thread (concurrent, low-priority)                          │
│ ├── Dequeue String references                                            │
│ ├── Compute hash of byte[] value (a separate hash, not String.hashCode)  │
│ ├── Lookup in deduplication table (native memory hashtable)              │
│ ├── If found: byte-by-byte comparison to confirm                         │
│ ├── If match: CAS redirect value reference → shared array                │
│ └── If not found: add to table                                           │
└──────────────────────────────────────────────────────────────────────────┘
Deduplication Table:
- Native hash table (outside Java Heap, in C-heap)
- Stores hashes of `byte[]` arrays + weak references
- On hash match → byte-by-byte comparison to confirm (collision protection)
- Reference update via CAS (Compare-And-Swap): thread-safe without locks
Marking Algorithm:
- G1 uses concurrent marking
- Strings are marked during marking phase
- Age threshold (default 3 GC cycles): only strings that have "survived" long enough are deduplicated
- This filters out short-lived strings that would die before the queue is processed
Trade-offs
Pros:
- Transparency: no code change needed, only a JVM flag
- Savings: 10-20% Heap for text-heavy applications
- Safety: no risk of String Pool corruption (data doesn't change)
- Works with any strings, not just interned ones
Cons:
- CPU overhead: hashing + lookup + byte comparison (~2-5% CPU)
- Memory: deduplication table (~10-50MB native memory)
- Collector support: G1 and Shenandoah; Serial, Parallel, and ZGC only from JDK 18
- Delay: deduplication happens after several GC cycles (age threshold)
- Only byte-identical arrays are merged: strings with a different `coder` (Latin-1 vs UTF-16) never share an array, because their bytes and array lengths differ
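Edge Cases
1. Strings with a different coder are never merged:
String s1 = "Hello"; // Latin-1 coder, byte[5]
String s2 = new String("Hello".getBytes(StandardCharsets.UTF_16), StandardCharsets.UTF_16); // decodes back to "Hello": Latin-1 again, byte[5]
String s3 = "Hell\u043E"; // Cyrillic 'о' forces the UTF-16 coder, byte[10]
// With compact strings (JDK 9+, on by default) the coder follows the content, so s1 and s2 hold identical bytes and can be merged
// s3 only looks similar: different coder → different bytes and length → deduplication won't apply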
2. Very short-lived strings:
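void process(char[] chars) {
    String temp = new String(chars);  // fresh byte[] in Eden
    String temp2 = new String(chars); // second, identical byte[] in Eden
    // Both strings die in Young GC before reaching the age threshold (3 GCs),
    // so deduplication won't have time to work.
    // (Two identical literals would not show this: literals are interned and are the same object.)
}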
3. Race condition on redirect:
// The dedup thread performs a CAS on the value field:
// if (CAS(oldValue, sharedValue)) → success
// If two threads try to redirect simultaneously, only one succeeds
// The other thread sees that value has already been redirected and skips it
// A thread reading s.value always sees a consistent value (@Stable + CAS)
// @Stable: internal JDK annotation telling the JIT that a field changes at most once from its default value.
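The compare-and-swap semantics described in these comments can be illustrated with `AtomicReference`. This is only an analogy under stated assumptions: `CasRedirectSketch` and `redirect` are hypothetical names, and the `AtomicReference` stands in for the `value` field, which real `String` objects do not expose this way.

```java
import java.util.concurrent.atomic.AtomicReference;

final class CasRedirectSketch {
    /** Swaps the field to {@code shared} only if it still holds {@code expectedOld}. */
    static boolean redirect(AtomicReference<byte[]> valueField, byte[] expectedOld, byte[] shared) {
        return valueField.compareAndSet(expectedOld, shared); // compares references, not contents
    }

    public static void main(String[] args) {
        byte[] original = {72, 105};
        byte[] shared = {72, 105};   // equal content, canonical copy
        AtomicReference<byte[]> field = new AtomicReference<>(original);

        System.out.println(redirect(field, original, shared)); // true: field pointed at original
        System.out.println(redirect(field, original, shared)); // false: already redirected, so skipped
        System.out.println(field.get() == shared);             // true
    }
}
```

A reader calling `field.get()` at any moment sees either the original or the shared array, and both hold the same bytes.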
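4. String Pool vs Deduplication (interaction):
char[] chars = {'H', 'e', 'l', 'l', 'o'};
String s1 = new String(chars);
String s2 = new String(chars);
// s1.value and s2.value: different byte[] arrays (both Latin-1, byte[5])
// (new String("Hello") would not show this: that constructor reuses the literal's array)
// After deduplication: s1.value and s2.value → SAME byte[]
// But s1 != s2 (String objects are different!)
// Savings: one byte[5] instead of two; with array header and padding that is about 24 bytes on a typical 64-bit JVM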
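The saving per merged pair is the full footprint of one array, not just its payload. The arithmetic below is a sketch under explicit assumptions: a 16-byte array header (64-bit HotSpot with compressed class pointers) and 8-byte object alignment. Other JVM configurations give different numbers, and `ArrayFootprint` is a made-up helper name.

```java
final class ArrayFootprint {
    static final int ARRAY_HEADER_BYTES = 16; // assumption: 64-bit HotSpot, compressed class pointers
    static final int ALIGNMENT = 8;           // assumption: default object alignment

    /** Estimated heap footprint of a byte[] of the given length, rounded up to the alignment. */
    static long byteArrayFootprint(int length) {
        long raw = ARRAY_HEADER_BYTES + (long) length;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        System.out.println(byteArrayFootprint(5));          // 24: what merging two "Hello" arrays frees
        System.out.println(byteArrayFootprint(10_000_000)); // 10000016: header is negligible for large arrays
    }
}
```

For short strings the header dominates, which is why many small duplicates can add up to a noticeable share of the heap.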
5. Abnormally long strings:
String huge = "A".repeat(10_000_000); // 10MB
String huge2 = "A".repeat(10_000_000); // 10MB
// Byte-by-byte comparison of 10MB is expensive (~10ms)
// Deduplication thread may slow down GC
// In practice: long strings may be identical in log aggregators or data pipelines
// (same JSON payloads), but this is rare for typical web applications.
Performance
| Metric | Without dedup | With dedup | Delta |
|---|---|---|---|
| Heap usage | 4.0 GB | 3.2 GB | -20% |
| GC pause (avg) | 50ms | 55ms | +10% |
| CPU overhead | Baseline | +2-5% | Small |
| Young GC | 10ms | 10ms | No change |
| Mixed GC | 50ms | 55ms | +5ms |
| Native memory (table) | 0MB | 10-50MB | Extra |
Memory savings (real scenarios):
- JSON API service: 15-25% strings are duplicates (keys, status values)
- Log aggregator: 30-40% duplicates (levels, service names, host names)
- ETL pipeline: 5-10% duplicates (category names, country codes)
Thread Safety
Deduplication is thread-safe:
- `value` reference update via CAS (atomic operation)
- `value` field is `@Stable`, but the JVM allows the redirect within GC
- Reading threads always see a consistent value (memory barriers during GC)
- Deduplication thread: single (single-threaded), no contention between dedup threads
Production War Story
Scenario 1: JSON API service (G1 GC, 8GB Heap, Spring Boot, 50K RPS):
- Without deduplication: Heap usage 6.5GB, Full GC every 30 minutes, p99 latency = 25ms
- With deduplication: Heap usage 5.2GB, Full GC every 50 minutes, p99 latency = 20ms
- CPU overhead: +3% (acceptable)
- Stats: deduplicated 2.3GB of strings, 850K unique byte[] arrays merged
- Result: reduced instances from 10 to 8 (saving $15K/month)
Scenario 2: Log aggregator (1M log lines/min, G1 GC, 12GB Heap):
- Fields `level`, `service`, `host`: many duplicates (`"INFO"`, `"UserService"`, `"host-1"`)
- Without deduplication: 12GB Heap
- With deduplication: 9GB Heap
- Savings: 3GB → fewer instances in cluster
- Problem: CPU overhead grew to 7% due to the huge number of strings. Fix: `-XX:StringDeduplicationAgeThreshold=5` (increased age threshold, fewer candidates → less CPU).
Scenario 3 (anti-pattern): Team enabled deduplication "just in case" for an app with unique strings (UUIDs, hashes, timestamps). CPU overhead +4%, memory savings 0.5%. Disabled it.
Monitoring
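# Enable deduplication statistics (JDK 8)
-XX:+PrintStringDeduplicationStatistics
-XX:+PrintGC
# JDK 9+ removed these Print* flags in favor of unified logging
-Xlog:gc+stringdedup*=debug   # JDK 9 to 16
-Xlog:stringdedup*=debug      # JDK 17+
# Illustrative output in GC logs (exact format depends on the JDK version):
# [GC concurrent string deduplication]
# String Deduplication: 1.2GB deduplicated (500K strings)
# [DEDUP: 500K strings, 1.2GB, 2.3ms]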
# jcmd: confirm at runtime that the flag is active
jcmd <pid> VM.flags | grep StringDeduplication
# Note: the standard OpenJDK jcmd command list has no dedicated string deduplication
# statistics command (run `jcmd <pid> help` to see what your JDK offers);
# take the statistics from the log output above
# JFR (Java Flight Recorder)
java -XX:StartFlightRecording=filename=recording.jfr ...
# Events: StringDeduplicationStatistics (event availability and naming vary by JDK version)
# In JDK Mission Control: Memory → String Deduplication
# Configure age threshold
-XX:StringDeduplicationAgeThreshold=3 # Default 3 GC cycles
# Increase to 5-10 if CPU overhead is too high
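The effective flag values can also be read from inside the running application. The sketch below assumes a HotSpot-based JVM, because it uses the HotSpot-specific `com.sun.management.HotSpotDiagnosticMXBean`. The class name `DedupFlagCheck` is hypothetical, and the sketch reports configuration only, not deduplication statistics.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class DedupFlagCheck {
    /** Returns the current value of a HotSpot VM option as a string, e.g. "true" or "3". */
    static String flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue(); // throws IllegalArgumentException for unknown options
    }

    public static void main(String[] args) {
        System.out.println("UseStringDeduplication = " + flagValue("UseStringDeduplication"));
        System.out.println("StringDeduplicationAgeThreshold = "
                + flagValue("StringDeduplicationAgeThreshold"));
    }
}
```

Logging these two values at startup makes it easy to confirm that a deployment actually picked up the intended JVM options.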
Best Practices for Highload
- Enable when profiler shows > 10% duplicate strings in Heap
- Don't use as a replacement for `intern()` for highly duplicated long-lived strings (dictionaries, configs): `intern()` is more efficient there
- Monitor CPU overhead: if > 5%, raise the age threshold, e.g. `-XX:StringDeduplicationAgeThreshold=5`
- Combine: `intern()` for dictionary data (enum values, status codes) + deduplication for everything else
- For ZGC: String Deduplication is supported from JDK 18; on older JDKs consider Shenandoah (`-XX:+UseShenandoahGC -XX:+UseStringDeduplication`) or G1
- For max savings: experiment with `-XX:G1HeapRegionSize`; smaller regions can mean more frequent evacuation and so more dedup candidates, but measure the effect
- Don't enable for short-lived apps (CLI, batch jobs < 1min): it won't have time to work
- For ultra-low-latency: benchmark with and without deduplication; sometimes a 5ms GC pause increase is critical
🎯 Interview Cheat Sheet
Must know:
- String Deduplication: a G1 GC feature that automatically merges identical string `byte[]` arrays
- Enabled with flag: `-XX:+UseG1GC -XX:+UseStringDeduplication` (no code change needed)
- Difference from String Pool: deduplication merges `byte[]`, not String objects; `==` remains false
- Works asynchronously during GC; the age threshold (default 3 GC cycles) filters out short-lived strings
- CPU overhead: ~2-5%, native memory for dedup table: ~10-50MB
- Only byte-identical arrays are merged: a Latin-1 string and a UTF-16 string never share an array
Frequent follow-up questions:
- How is deduplication different from `intern()`? → `intern()` merges String objects (`==` becomes true) and requires code. Deduplication merges only `byte[]` arrays, `==` remains false, no code change.
- What memory savings? → 10-20% Heap for apps with duplicate strings. JSON API: 15-25%, log aggregator: 30-40%.
- Why doesn't it work with ZGC? → It does from JDK 18 onward; earlier ZGC versions had no String Deduplication support, so the alternatives were G1 or Shenandoah.
- How to reduce CPU overhead? → Increase `-XX:StringDeduplicationAgeThreshold` (e.g. to 5): fewer candidates, less CPU.
Red flags (DON'T say):
- ❌ "Deduplication makes `==` true": String objects remain different, only `byte[]` is merged
- ❌ "This replaces `intern()`": no, `intern()` is more efficient for dictionary data
- ❌ "Works instantly": happens during GC, needs several cycles
- ❌ "Works with any GC": only G1 and Shenandoah before JDK 18; Serial, Parallel, and ZGC were added in JDK 18
Related topics:
- [[1. How String Pool Works]]
- [[3. When to Use intern()]]
- [[19. What are Compact Strings in Java 9+]]
- [[20. How to Find Out How Much Memory a String Occupies]]