When to Use intern()?

🟢 Junior Level

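The intern() method returns the canonical copy of a string from the String Pool: if an equal string is already in the pool, you get a reference to that one; otherwise the string is added to the pool and its reference is returned.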

Imagine: you’re loading 10,000 records from a DB, and each record has the same word ‘Ukraine’. Without intern() — 10,000 separate objects. With intern() — one object and 10,000 references to it.

Simple example:

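String s1 = new String("Hello"); // A new object on the regular heap
String s2 = s1.intern();          // The pooled "Hello" (the literal already placed it in the pool)

String s3 = "Hello";              // Literal: taken from the pool
System.out.println(s1 == s3);     // false: s1 is a separate heap object
System.out.println(s2 == s3);     // true: same string from the pool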

When to use: When you have many identical, long-lived strings and want to save memory. In the example above, the 10,000 records with country = "Ukraine" share one pooled object instead of holding 10,000 separate ones.

When NOT to use: For short-lived strings that quickly become unreachable. The regular Garbage Collector reclaims them on its own.

🟡 Middle Level

How it works

The intern() method checks the String Pool:

If such a string already exists — returns a reference from the pool
If not — adds the current string to the pool and returns the reference
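The two branches above can be observed with plain reference comparisons. The snippet below is a minimal sketch; the class name InternBasics and the helper fresh() are made up for illustration and are not part of any API.

```java
class InternBasics {
    /** Builds a new heap String at run time, so every call returns a distinct object. */
    static String fresh(String text) {
        return new String(text.toCharArray());
    }

    public static void main(String[] args) {
        String a = fresh("Kyiv");
        String b = fresh("Kyiv");

        System.out.println(a == b);                   // false: two separate heap objects
        System.out.println(a.equals(b));              // true: same characters
        System.out.println(a.intern() == b.intern()); // true: both resolve to one pooled instance
        System.out.println(a.intern() == "Kyiv");     // true: literals are pooled automatically
    }
}
```

Note that intern() does not change a or b themselves; only the returned reference points into the pool, so the result has to be stored.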

Practical application

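// Loading data from DB: many duplicate values
while (rs.next()) {
    // Assumes both columns are NOT NULL: getString() returns null for SQL NULL,
    // and calling intern() on null would throw a NullPointerException
    String city = rs.getString("city").intern();
    String country = rs.getString("country").intern();
    users.add(new User(city, country));
}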

If the database has 1,000,000 records, but only 100 unique cities:

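Without intern(): 1,000,000 long-lived String objects for the city column alone
With intern(): 100 String objects in the pool + 1,000,000 references to them (the per-row temporary strings become garbage right after the call)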
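The object counts above can be checked by counting strings by identity rather than by equals(). This is a small simulation, not real JDBC code: the class InternObjectCount, its methods, and the "city-N" values are assumptions made for the demo, and the row count is scaled down to 100,000 to keep it light.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class InternObjectCount {
    /** Simulates reading `rows` values that cycle through `uniqueCities` distinct names. */
    static List<String> load(int rows, int uniqueCities, boolean intern) {
        List<String> column = new ArrayList<>(rows);
        for (int i = 0; i < rows; i++) {
            // Run-time concatenation creates a new String object on every iteration,
            // much like a driver does for every row it reads.
            String city = "city-" + (i % uniqueCities);
            column.add(intern ? city.intern() : city);
        }
        return column;
    }

    /** Counts distinct String objects by identity (==), not by equals(). */
    static int distinctObjects(List<String> values) {
        Set<String> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        identities.addAll(values);
        return identities.size();
    }

    public static void main(String[] args) {
        int rows = 100_000;
        int unique = 100;
        System.out.println(distinctObjects(load(rows, unique, false))); // 100000
        System.out.println(distinctObjects(load(rows, unique, true)));  // 100
    }
}
```

The temporary strings are still allocated in both runs; interning only changes which objects stay referenced afterwards.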

Typical mistakes

Mistake: Calling intern() for every string without analysis
Solution: Use it only for long-lived data with many duplicates

Mistake: Assuming intern() is free
Solution: Remember that it is a native call with overhead; measure before putting it on a hot path

Comparison: intern() vs String Deduplication

🔴 Senior Level

Internal Implementation

String.intern() is a native method:

JVM_ENTRY(jstring, JVM_InternString(JNIEnv *env, jstring str))
  if (str == NULL) return NULL;
  oop string = JNIHandles::resolve_non_null(str);
  oop result = StringTable::intern(string, CHECK_NULL);
  return (jstring) JNIHandles::make_local(env, result);
JVM_END

StringTable::intern() performs:

String hash computation
Lookup in StringTable hash table
If found — return reference
If not found — insert into table (with possible resize)
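The four steps above can be modeled in a few lines of Java. This is a deliberately simplified toy, not HotSpot's implementation: ToyStringTable is a hypothetical class, it uses one coarse lock, and it holds strong references, whereas the real table lets the GC remove entries that are no longer referenced.

```java
class ToyStringTable {
    private static final class Entry {
        final String value;
        final Entry next;
        Entry(String value, Entry next) { this.value = value; this.next = next; }
    }

    private Entry[] buckets;
    private int size;

    ToyStringTable(int bucketCount) {
        buckets = new Entry[bucketCount];
    }

    /** Returns the canonical instance for s, inserting s itself if none exists yet. */
    synchronized String intern(String s) {
        int hash = s.hashCode();                          // 1. hash computation
        int index = Math.floorMod(hash, buckets.length);  // 2. lookup: pick the bucket
        for (Entry e = buckets[index]; e != null; e = e.next) {
            if (e.value.equals(s)) {
                return e.value;                           // 3. found: return existing reference
            }
        }
        if (size >= buckets.length * 2) {                 // 4a. grow when chains get long
            resize(buckets.length * 2);
            index = Math.floorMod(hash, buckets.length);
        }
        buckets[index] = new Entry(s, buckets[index]);    // 4b. not found: insert
        size++;
        return s;
    }

    /** Rehashes every entry into a larger bucket array. */
    private void resize(int newCount) {
        Entry[] old = buckets;
        buckets = new Entry[newCount];
        for (Entry head : old) {
            for (Entry e = head; e != null; e = e.next) {
                int index = Math.floorMod(e.value.hashCode(), newCount);
                buckets[index] = new Entry(e.value, buckets[index]);
            }
        }
    }

    synchronized int size() {
        return size;
    }
}
```

With too few buckets and no resize, the loop in step 3 walks long chains, which is the degradation a small StringTableSize causes.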

Architectural Trade-offs

Pros of intern():

RAM savings: with 1000:1 duplicate ratio, savings >99%
Fewer objects → less frequent Full GC
Fast comparison via == (after interning)

Cons of intern():

CPU overhead: each call — hashing + lookup in global table
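Contention: StringTable is one global structure shared by all threads (inserts take a lock in older JDKs; since JDK 11 it is a concurrent hash table, which reduces but does not remove contention)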
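OOM Risk: if millions of unique strings are interned and stay referenced, the pool (which lives on the heap since Java 7) can fill the Heap
StringTableSize: if the table has too few buckets, collisions create long chains and lookups degrade toward O(n)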

Edge Cases

Multithreaded contention: When intern() is called in parallel from hundreds of threads, contention on StringTable lock occurs.
String Table Size: Default is 60013 buckets in Java 8 (65536 since Java 11, where the value is rounded up to a power of two). If planning 1M+ unique strings:
```
-XX:StringTableSize=1000003
```
Young Gen strings: intern() for short-lived strings is counterproductive — they’ll die at the next Minor GC anyway.

Performance

Rough orders of magnitude (hardware and JDK dependent; measure on your own setup):

intern() without collisions: ~50-100ns
intern() with 1M entries and proper StringTableSize: ~200-500ns
intern() with 1M entries and small StringTableSize: 10-50μs (collisions!)

Production Experience

Scenario: Parsing 10GB of logs, where only about 500 unique key strings appear (log levels such as INFO, WARN, ERROR, DEBUG, TRACE, among others):

Without intern(): ~50M String objects for keys → 2.4GB
With intern(): 500 objects in pool → ~50KB
Result: Full GC went from every 30 seconds to every 15 minutes, and p99 latency dropped from 200ms to 15ms. Fewer long-lived duplicates survive into the old generation, so it fills more slowly and expensive collections become rarer.

Reverse scenario: User UUIDs — every string is unique. intern() here only wastes CPU and fills the pool with garbage.
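One way to tell the two scenarios apart before adding intern() is to measure the duplication ratio on a sample of the data. The helper below is a hypothetical sketch (DuplicationRatio and ratio() are names invented here): a value near 1.0 means almost every string is unique, while a high value suggests interning may pay off.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.UUID;

class DuplicationRatio {
    /** Average number of occurrences per distinct value; 1.0 means every string is unique. */
    static double ratio(List<String> sample) {
        if (sample.isEmpty()) {
            return 1.0;
        }
        return (double) sample.size() / new HashSet<>(sample).size();
    }

    public static void main(String[] args) {
        List<String> levels = Arrays.asList("INFO", "WARN", "INFO", "ERROR", "INFO", "INFO");
        List<String> ids = Arrays.asList(
                UUID.randomUUID().toString(),
                UUID.randomUUID().toString(),
                UUID.randomUUID().toString());

        System.out.println(ratio(levels)); // 2.0: six values, three distinct
        System.out.println(ratio(ids));    // 1.0: every value is unique
    }
}
```

The threshold at which interning becomes worthwhile is workload specific; the ratio is only an input to that decision, not a rule.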

Monitoring

# StringTable statistics (adding -verbose also dumps every interned string, which is huge for large pools)
jcmd <pid> VM.stringtable

# Abridged output (exact fields vary by JDK version):
# StringTable statistics:
# Number of buckets       : 60013
# Number of entries       : 1234567
# Maximum bucket size     : 42         ← if > 10, increase StringTableSize

Best Practices for Highload

Use intern() only for long-lived strings with high duplication ratio
Don’t intern UUIDs, hashes, tokens — they are unique
Profile: sometimes CPU overhead from intern() costs more than extra MBs in Heap
Alternative: your own ConcurrentHashMap<String, String> cache — control over eviction and size
For automatic deduplication without code changes: -XX:+UseStringDeduplication (G1 GC, since Java 8u20)
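The ConcurrentHashMap alternative from the list above could look roughly like this. It is a minimal sketch under stated assumptions: BoundedInterner is a hypothetical class, the size limit is approximate under concurrent inserts, and "eviction" is reduced to a manual clear(); a production cache would likely need a real eviction policy.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class BoundedInterner {
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
    private final int maxEntries;

    BoundedInterner(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    /** Returns the cached instance equal to s, or s itself (caching it if there is room). */
    String intern(String s) {
        String existing = cache.get(s);       // throws NullPointerException for null input
        if (existing != null) {
            return existing;
        }
        if (cache.size() >= maxEntries) {
            return s;                         // cache full: pass the string through unchanged
        }
        String raced = cache.putIfAbsent(s, s);
        return raced != null ? raced : s;     // another thread may have inserted first
    }

    int size() {
        return cache.size();
    }

    /** Drops all cached strings, for example between batch jobs. */
    void clear() {
        cache.clear();
    }
}
```

Unlike the JVM pool, this cache holds strong references, so entries live until clear() is called or the interner itself becomes unreachable; that is the price of controlling size explicitly.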

🎯 Interview Cheat Sheet

Must know:

intern() adds string to String Pool and returns reference from pool
Saves memory with many duplicate strings (1M records, 100 cities → 100 objects instead of 1M)
intern() is a native call with CPU overhead (~50-100ns without collisions)
Contention: StringTable is a global structure with locking, bottleneck at hundreds of threads
-XX:StringTableSize=1000003 — increase for 1M+ unique strings
Don’t use intern() for UUIDs, hashes, tokens — they’re all unique

Frequent follow-up questions:

When is intern() useful? — When loading data with high duplication: dictionaries, categories, cities, statuses.
When is intern() harmful? — For unique strings: UUIDs, IDs, emails, hashes. Fills pool, wastes CPU, saves no memory.
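What’s faster: intern() or custom ConcurrentHashMap cache? It depends on the workload and JDK version, so benchmark both. ConcurrentHashMap gives control over eviction and size; intern() is built into the JVM, needs no manual management, and its unreferenced entries can be garbage collected.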
What’s the overhead of intern()? — ~50-100ns without collisions. With 1M entries and small StringTableSize: 10-50μs (collisions!).

Red flags (DON’T say):

❌ “intern() for every string — good practice” — only for strings with duplicates
❌ “intern() is free” — native call, CPU overhead, contention on StringTable
❌ “intern() speeds up everything”: it saves memory, but every call costs CPU time
❌ “intern() — the only string optimization” — there’s -XX:+UseStringDeduplication (automatic, no code)

Related topics:

[[1. How String Pool Works]]
[[12. Can String Pool Cause OutOfMemoryError]]
[[22. What is String Deduplication in G1 GC]]
[[11. Where is String Pool Stored (Which Memory Area)]]