How to Find Out How Much Memory a String Occupies
The size of a string in memory depends on the Java version and the string's content.
🟢 Junior Level
Simple calculation for Java 9+:
Total size = String object (~24 bytes) + byte[] array (~16 bytes header) + characters
Example:
String s = "Hello"; // 5 Latin-1 characters
// String object: ~24 bytes
// byte[5] array: 24 bytes (16 header + 5 data = 21, rounded up to 24)
// Total: ~48 bytes
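The arithmetic above can be sketched as a small helper. This is only an estimate under stated assumptions: 64-bit HotSpot, CompressedOops enabled, 8-byte object alignment. The class and method names (`StringSizeEstimate`, `alignTo8`, `estimate`) are hypothetical and exist only for this illustration.

```java
// Hypothetical helper: estimates the footprint of a Java 9+ String.
// Assumes 64-bit HotSpot, CompressedOops enabled, 8-byte object alignment.
class StringSizeEstimate {
    static final int STRING_OBJECT_BYTES = 24; // header + fields + padding
    static final int ARRAY_HEADER_BYTES = 16;  // mark word + class pointer + length

    // Rounds a size up to the next multiple of 8 (object alignment).
    static long alignTo8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // charCount: String.length(); latin1: true if every char fits in one byte.
    static long estimate(int charCount, boolean latin1) {
        long dataBytes = latin1 ? charCount : 2L * charCount;
        return STRING_OBJECT_BYTES + alignTo8(ARRAY_HEADER_BYTES + dataBytes);
    }

    public static void main(String[] args) {
        System.out.println(estimate(5, true)); // "Hello": 24 + 24 = 48
        System.out.println(estimate(0, true)); // "":      24 + 16 = 40
    }
}
```

An estimate like this is handy for quick capacity math; for real numbers on a specific JVM, measure with JOL.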
How to measure precisely: Use the JOL library (Java Object Layout):
import org.openjdk.jol.info.GraphLayout;
String s = "Hello";
System.out.println(GraphLayout.parseInstance(s).totalSize());
// Prints exact size in bytes, including the object itself and all related data
Simple analogy: String is like a box (String object) containing another box (byte array). To find the total size, you need to add both boxes together.
🟡 Middle Level
String object structure (Java 9+, 64-bit JVM with CompressedOops)
String Object (24 bytes):
├── Mark Word (object header): 8 bytes
├── Class Pointer (compressed): 4 bytes
├── byte[] value (reference): 4 bytes
├── byte coder: 1 byte
├── int hash: 4 bytes
├── Padding: 3 bytes (to 8-byte alignment)
└── TOTAL: 24 bytes (rounded to multiple of 8)
byte[] Array:
├── Array Header (Mark + Class): 12 bytes
├── Array length: 4 bytes
├── Data: N bytes (1 byte/char for Latin-1, 2 for UTF-16)
├── Padding: to 8-byte boundary
└── TOTAL: 16 + N (rounded to 8) bytes
Calculation examples
| String | Java 8 (char[]) | Java 9+ (Latin-1) | Java 9+ (UTF-16) |
|---|---|---|---|
| "" | 40 bytes | 40 bytes | N/A |
| "Hello" | 56 bytes | 48 bytes | N/A |
| "Привет" | 56 bytes | N/A | 56 bytes |
| 100 chars (Latin-1) | 240 bytes | 144 bytes | N/A |
| 100 chars (mixed) | 240 bytes | N/A | 240 bytes |
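A sketch of how such per-version figures can be derived from the layout rules: Java 8 always stores 2 bytes per char, while Java 9+ stores 1 byte per char only when every char fits into Latin-1. The class and method names are hypothetical, and the constants assume 64-bit HotSpot with CompressedOops and 8-byte alignment.

```java
// Hypothetical comparison of estimated String footprints, Java 8 vs Java 9+.
// Assumes 64-bit HotSpot, CompressedOops enabled, 8-byte object alignment.
class StringFootprintComparison {
    static long alignTo8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // True if every char is <= 0xFF, i.e. the LATIN1 coder can be used.
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Java 8: 24-byte String object + char[] with 2 bytes per char.
    static long java8Bytes(String s) {
        return 24 + alignTo8(16 + 2L * s.length());
    }

    // Java 9+: 24-byte String object + byte[] with 1 or 2 bytes per char.
    static long java9Bytes(String s) {
        long data = fitsLatin1(s) ? s.length() : 2L * s.length();
        return 24 + alignTo8(16 + data);
    }

    public static void main(String[] args) {
        // Unicode escapes spell the Cyrillic word from the table, independent of source encoding.
        String cyrillic = "\u041f\u0440\u0438\u0432\u0435\u0442";
        System.out.println(java8Bytes("Hello") + " vs " + java9Bytes("Hello"));   // 56 vs 48
        System.out.println(java8Bytes(cyrillic) + " vs " + java9Bytes(cyrillic)); // 56 vs 56
    }
}
```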
How to measure in practice
JOL (Java Object Layout):
<dependency>
<groupId>org.openjdk.jol</groupId>
<artifactId>jol-core</artifactId>
<version>0.17</version>
</dependency>
// Full size (String + array)
long total = GraphLayout.parseInstance(s).totalSize();
// Detailed layout
System.out.println(GraphLayout.parseInstance(s).toPrintable());
Table of typical mistakes
| Mistake | Consequences | Solution |
|---|---|---|
| Using sizeof like in C++ | Java has no sizeof | Use JOL or Instrumentation.getObjectSize() |
| Counting only characters, forgetting headers | Underestimating by ~40–47 bytes per string | Always account for overhead: String object + array header + padding |
| Not accounting for CompressedOops | Wrong calculations for Heap > 32GB | Without CompressedOops each pointer = 8 bytes instead of 4 |
Comparison: Java 8 vs Java 9+
| Aspect | Java 8 | Java 9+ (Latin-1) | Java 9+ (UTF-16) |
|---|---|---|---|
| Internal array | char[] (2 bytes/char) | byte[] (1 byte/char) | byte[] (2 bytes/char) |
| coder field | No | 1 byte | 1 byte |
| "Hello" size | 56 bytes | 48 bytes | N/A |
| "Привет" size | 56 bytes | N/A | 56 bytes |
When you don’t need exact String size measurement
- Short-lived strings: Young GC reclaims them at almost no cost
- Small number of strings: the overhead is negligible relative to the Heap
- Prototypes and PoC: optimize only once a problem is proven
🔴 Senior Level
Internal Implementation — exact calculation
64-bit JVM with UseCompressedOops (default for Heap < 32GB):
// String object (Java 9+)
// Header: 12 bytes (8 mark word + 4 compressed klass pointer)
// value ref: 4 bytes
// coder: 1 byte
// hash: 4 bytes
// hashIsZero: 1 byte (newer JDK versions only)
// Padding: 2-3 bytes, up to 24 (multiple of 8)
// = 24 bytes total
// byte[] array
// Header: 12 bytes (8 mark word + 4 compressed klass pointer)
// length: 4 bytes
// data: N bytes
// Padding: to 8-byte boundary
// = 16 + N (rounded up to 8)
Without CompressedOops (-XX:-UseCompressedOops, Heap > 32GB):
- Each reference = 8 bytes instead of 4
- String object: 32 bytes (vs 24 with compressed)
- byte[] array: up to ~24 + N bytes (when class pointers are uncompressed as well)
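Whether the running JVM actually uses CompressedOops can be checked from code. The sketch below relies on `HotSpotDiagnosticMXBean` from `com.sun.management`, so it is HotSpot-specific; the class name `CompressedOopsCheck` and the `"unknown"` fallback are assumptions made for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reports the UseCompressedOops flag of the current JVM.
class CompressedOopsCheck {
    // Returns "true" or "false" as reported by HotSpot,
    // or "unknown" when the bean or the option is not available.
    static String compressedOopsSetting() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption("UseCompressedOops").getValue();
        } catch (RuntimeException e) {
            // e.g. a non-HotSpot JVM, or a JVM without this option
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = " + compressedOopsSetting());
    }
}
```

Logging this value at startup makes it easier to spot a heap-size change that silently switched the JVM to 8-byte references.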
Edge Cases
1. String Pool — one object, many references:
String s1 = "Hello";
String s2 = "Hello";
String s3 = "Hello";
// All three references point to ONE object in String Pool
// Total memory: 48 bytes (not 3 × 48 = 144)
2. Substring (Java 7+) — copies array:
String huge = "A".repeat(1_000_000); // ~1MB
String sub = huge.substring(0, 5); // "AAAAA"
// sub has its own byte[5]; it does not reference part of huge
// Before Java 7u6: sub shared huge's array, so a tiny substring kept the whole ~1MB array alive
// Java 7u6+: copies the range, which is safe; sub = ~48 bytes
3. Interned strings — additional overhead:
String s = new String("Hello").intern();
// String object + byte[5]: ~48 bytes
// + entry in StringTable: ~24-40 bytes (depends on JVM version, native hashtable entry)
// Total: roughly 72-88 bytes per unique interned string
4. CompressedOops disabled at Heap > 32GB:
# At -Xmx64g: CompressedOops is disabled (with the default 8-byte object alignment)
# String object: 32 bytes instead of 24; byte[] header: up to 24 bytes instead of 16
# With 10M strings: +80MB to +160MB overhead
5. Substring from a UTF-16 string is re-compressed to Latin-1 when possible:
String mixed = "Hello Мир"; // UTF-16 (due to Cyrillic)
String sub = mixed.substring(0, 5); // "Hello": fits Latin-1
// substring() copies the range and tries to compress it, so sub is stored as Latin-1:
// ~48 bytes (24 String + 24 byte[5]) rather than ~56 bytes (24 String + 32 byte[10]).
// mixed itself stays UTF-16: a single non-Latin-1 char doubles the storage of the whole string
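The String Pool behaviour from edge case 1 can be observed directly by comparing references with `==`. A minimal sketch; the class and method names are hypothetical.

```java
// Hypothetical demo: identical literals share one pooled object,
// while new String(...) always allocates a separate one.
class StringPoolIdentity {
    // Reference comparison on purpose: checks identity, not content.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String s1 = "Hello";
        String s2 = "Hello";
        String copy = new String("Hello");

        System.out.println(sameObject(s1, s2));            // true: one pooled object
        System.out.println(sameObject(s1, copy));          // false: separate heap object
        System.out.println(sameObject(s1, copy.intern())); // true: intern() returns the pooled one
    }
}
```

Only the pooled instance counts once toward the footprint; every `new String(...)` copy adds its own String object on top.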
Performance — real measurements
// JOL measurements (64-bit HotSpot, CompressedOops enabled)
String empty = "";
String latin5 = "Hello";
String cyrillic6 = "Привет";
String latin100 = "A".repeat(100);
GraphLayout.parseInstance(empty).totalSize(); // 40 bytes
GraphLayout.parseInstance(latin5).totalSize(); // 48 bytes
GraphLayout.parseInstance(cyrillic6).totalSize(); // 56 bytes
GraphLayout.parseInstance(latin100).totalSize(); // 144 bytes
| Scenario | Size per string | 1M strings | 10M strings |
|---|---|---|---|
| Empty string | 40 bytes | 40MB | 400MB |
| Latin-1, 5 chars | 48 bytes | 48MB | 480MB |
| UTF-16, 5 chars | 56 bytes | 56MB | 560MB |
| Latin-1, 100 chars | 144 bytes | 144MB | 1.44GB |
| UTF-16, 100 chars | 240 bytes | 240MB | 2.4GB |
Memory and GC implications
Heap savings:
- Compact Strings (Java 9+): ~40–50% savings for ASCII/Latin-1 strings
- With 70% Latin-1 strings in app: overall Heap reduction 20–30%
- Less Heap → less frequent Full GC → lower latency
GC cycles:
- Young GC: scans Eden/Survivor — fewer string allocations → faster scan
- Old Gen: fewer objects → less work for marking/compaction
- G1 GC: less live data per region → cheaper evacuation
Thread Safety
String is immutable and thread-safe. Its size doesn’t change after creation: coder and value are final (value is also annotated @Stable), so there are no race conditions when reading it from multiple threads.
Production War Story
Scenario: Cache of 1M strings in memory (user profiles, JSON API service):
- Java 8: 1M × ~50 bytes (avg) = ~50MB
- Java 9+ Compact: 1M × ~35 bytes (avg, 70% Latin-1) = ~35MB
- Savings: 15MB → less GC pressure, Full GC 25% less frequent
Scenario 2: Highload service with -Xmx2g, CompressedOops switched off vs on:
- Without CompressedOops: String object = 32 bytes, byte[] header up to 24 bytes
- With CompressedOops: String object = 24 bytes, byte[] header = 16 bytes
- With 10M strings: 80–160MB savings on headers and references alone
- This can be the difference between stable operation and OOM at peak load
Scenario 3: Log aggregator — storing 10M log lines in memory:
- Each string: ~200 bytes (avg, mixed Latin-1/UTF-16)
- Total: ~2GB for strings alone
- Enabling -XX:+UseStringDeduplication: saves 400MB (duplicate log levels, host names)
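Before enabling deduplication it can help to estimate how much there is to gain. G1 String Deduplication lets equal strings share one backing array while the String objects themselves remain, so the sketch below counts only the backing arrays of repeated values. It is a rough upper bound under the usual assumptions (Java 9+, 64-bit HotSpot, CompressedOops); the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical estimate of the bytes held by duplicate backing arrays.
class DedupSavingsEstimate {
    static long alignTo8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    static boolean fitsLatin1(String s) {
        return s.chars().allMatch(c -> c <= 0xFF);
    }

    // Estimated size of the backing byte[]: 16-byte header + data, aligned to 8.
    static long backingArrayBytes(String s) {
        long data = fitsLatin1(s) ? s.length() : 2L * s.length();
        return alignTo8(16 + data);
    }

    // Sums the backing arrays of every occurrence beyond the first of each value.
    static long redundantArrayBytes(List<String> strings) {
        Set<String> seen = new HashSet<>();
        long redundant = 0;
        for (String s : strings) {
            if (!seen.add(s)) {
                redundant += backingArrayBytes(s);
            }
        }
        return redundant;
    }

    public static void main(String[] args) {
        List<String> logLevels = List.of("INFO", "INFO", "WARN", "INFO");
        System.out.println(redundantArrayBytes(logLevels)); // 48: two extra "INFO" arrays of 24 bytes
    }
}
```

Running such a count over a sample of the real data gives a first indication of whether deduplication is worth enabling.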
Monitoring
# Check CompressedOops
java -XX:+PrintFlagsFinal -version 2>&1 | grep UseCompressedOops
# bool UseCompressedOops = true {lp64_product}
# Heap histogram — how many strings in memory
jmap -histo:live <pid> | head -30
# num #instances #bytes class name
# 1: 1234567 29629608 java.lang.String
# JOL CLI: field layout of java.lang.String in the current JVM
java -jar jol-cli.jar internals java.lang.String
# MAT (Memory Analyzer Tool)
# Heap dump → Dominator Tree → java.lang.String → Shallow/Retained Heap
# JFR — allocations
java -XX:StartFlightRecording=settings=profile,filename=recording.jfr ...
# In JFR: Memory → Object Allocation — filter by java.lang.String
// Runtime measurement via Instrumentation
// (requires -javaagent or Attach API; returns the shallow size of the String object only, without its byte[])
long size = instrumentation.getObjectSize(stringInstance);
// JOL — full footprint
System.out.println(GraphLayout.parseInstance(s).toFootprint());
Best Practices for Highload
- Use JOL for exact measurement, don’t count manually
- CompressedOops is enabled by default — don’t disable without good reason
- Compact Strings (Java 9+) give free 40–50% savings for Latin-1
- String Pool: duplicate literals and interned strings share one object (the more duplicates, the bigger the savings)
- For ultra-low-latency: avoid String, use byte[] or ByteBuf (Netty)
- At Heap > 32GB: CompressedOops is disabled → +30–50% overhead on objects → plan capacity
- For string-heavy apps: consider -XX:+UseStringDeduplication (G1 GC)
- Profile before optimizing: sometimes String is only 5% of Heap, and optimization won’t help
🎯 Interview Cheat Sheet
Must know:
- String size = object (~24 bytes) + byte[] array (~16 + N bytes), where N depends on coder
- Latin-1: 1 byte/char, UTF-16: 2 bytes/char (Java 9+)
- "Hello" ≈ 48 bytes (24 String + 24 byte[5]), "Привет" ≈ 56 bytes (UTF-16: 24 String + 32 byte[12])
- JOL (Java Object Layout), a library for exact measurement: GraphLayout.parseInstance(s).totalSize()
- CompressedOops (default for Heap < 32GB) reduces pointers from 8 to 4 bytes
- Without CompressedOops (Heap > 32GB): String object ≈ 32 bytes instead of 24
Frequent follow-up questions:
- How to find exact String size? JOL: GraphLayout.parseInstance(s).totalSize() for the full footprint. Instrumentation.getObjectSize() returns only the shallow size of the String object.
- Why does "" (empty string) take 40 bytes? 24 bytes (String object) + 16 bytes (empty byte[] array: header only).
- Does String Pool affect size? Yes: duplicate literals = one object. 100 references to "Hello" = 48 bytes total, not 4800.
- What happens at Heap > 32GB? CompressedOops is disabled, each reference = 8 bytes. With 10M strings: roughly +80–160MB overhead.
Red flags (DON’T say):
- ❌ “Java has sizeof like C++”: no, use JOL or Instrumentation
- ❌ “String size = only character length”: this ignores object headers (~40 bytes overhead)
- ❌ “CompressedOops is always enabled”: disabled at Heap > 32GB
- ❌ “substring() still shares array”: since Java 7u6 it copies, so that memory leak is fixed
Related topics:
- [[19. What are Compact Strings in Java 9+]]
- [[22. What is String Deduplication in G1 GC]]
- [[1. How String Pool Works]]
- [[13. What substring() Does and How It Worked Before Java 7]]