Question 13 · Section 4

What is TreeSet and how does it work?

4. View = live, not a copy 5. Java 21 → SequencedSet API 6. Cache miss during iteration — keep in mind 7. Memory 3-4x more than HashSet

Language versions: English Russian Ukrainian

🟢 Junior Level

TreeSet — a sorted collection of unique elements.

Simple analogy: A dictionary — words are always in alphabetical order, regardless of the order you added them.

TreeSet<Integer> set = new TreeSet<>();
set.add(5);
set.add(2);
set.add(8);
set.add(1);

System.out.println(set);  // → [1, 2, 5, 8] (always sorted!)

Key properties:

  • Elements are always sorted
  • null is not allowed
  • Slower than HashSet: O(log n) instead of O(1) + each node is a separate heap object (cache miss during traversal). But data is sorted.

🟡 Middle Level

Internally: TreeMap

// TreeSet = adapter over TreeMap
private transient TreeMap<E, Object> m;
private static final Object PRESENT = new Object();

// All operations delegate to TreeMap
// Complexity: O(log n) for add/remove/contains

Red-black tree

Self-balancing binary tree:
  → Height is always O(log n)
  → On insert/delete: rebalancing (rotations)
    Rotations = restructuring links between tree nodes to keep height O(log n). Without rotations, the tree would degenerate into a linked list with O(n).
  → Node colors (red/black) for balance

    5(B)
   /   \
  2(R)  8(R)
 /  \
1(B) 3(B)
TreeSet<Integer> set = new TreeSet<>(Set.of(1, 3, 5, 7, 9));

set.floor(6);    // → 5 (nearest ≤ 6)
set.ceiling(6);  // → 7 (nearest ≥ 6)
set.lower(5);    // → 3 (strictly < 5)
set.higher(5);   // → 7 (strictly > 5)

set.subSet(3, 8);    // → [3, 5, 7] (range)
set.headSet(5);      // → [1, 3] (less than 5)
set.tailSet(5);      // → [5, 7, 9] (greater or equal)

compareTo vs equals

// TreeSet uses compareTo, NOT equals!
class Person implements Comparable<Person> {
    String name;
    public int compareTo(Person o) {
        return name.compareTo(o.name);  // By name
    }
}

// If compareTo = 0 → element will NOT be added
// Even if equals returns false!

// → equals and compareTo must be consistent!

When NOT to use TreeSet

  1. Only need uniqueness — HashSet is faster (O(1) vs O(log n))
  2. Elements don’t implement Comparable and no Comparator — TreeSet cannot sort
  3. Memory is critical — TreeSet is 3-4x larger than HashSet (tree nodes with prev/next/left/right)

🔴 Senior Level

Memory Footprint

Each Entry in TreeMap:
  Object Header: 12 bytes
  K key: 4-8 bytes
  V value: 4 bytes (PRESENT)
  Entry left: 4-8 bytes
  Entry right: 4-8 bytes
  Entry parent: 4-8 bytes
  boolean color: 1 byte
  Padding: up to 8 bytes

Total: ~40-48 bytes per element!

→ TreeSet is 3-4x larger than HashSet

Cache Locality problem

Tree nodes are scattered across the Heap:
  [Entry1]...[Entry2]...[Entry3]

Each transition to a child = cache miss!
  → Wait from RAM: 100+ cycles

→ TreeSet iteration is slower than HashSet

Java 21: SequencedSet

// TreeSet → SequencedSet (SequencedSet requires Java 21+, not available on Java 11/17.)
set.getFirst();   // Smallest element
set.getLast();    // Largest element
set.reversed();   // Reverse order (view)

// addFirst/addLast → UnsupportedOperationException
// → Tree order is fixed!

View semantics

NavigableSet<Integer> view = set.subSet(3, true, 7, true);
// → "Live" view, not a copy!

view.add(5);       // Adds to the ORIGINAL set
set.remove(5);     // Removes from view too!

view.add(10);      // → IllegalArgumentException!
// → 10 is outside the range [3, 7]

Custom Comparator with null

// TreeSet doesn't accept null by default
// But possible via Comparator:
TreeSet<String> set = new TreeSet<>(
    Comparator.nullsFirst(Comparator.naturalOrder())
);
set.add(null);  // OK! null goes first

Spliterator optimizations

Spliterator<Integer> spliterator = set.spliterator();
// Characteristics:
//   SORTED   → Stream API optimizes
//   DISTINCT → distinct() is free
//   SIZED    → size is known

set.stream()
   .distinct()  // → Does nothing! Already unique
   .count();    // → O(1), knows the size

Production Experience

Real scenario: Tax brackets

// Tax rates by income threshold
NavigableMap<Integer, Double> taxBrackets = new TreeMap<>();
taxBrackets.put(0, 0.0);      // 0% up to 0
taxBrackets.put(10000, 0.1);  // 10% up to 10k
taxBrackets.put(50000, 0.2);  // 20% up to 50k

// Find rate for income 35,000
double rate = taxBrackets.floorEntry(35000).getValue();
// → 0.1 (10%)

// HashMap cannot do this! Sorting is needed

Best Practices

  1. TreeMap internally → O(log n) operations
  2. compareTo = equals contract!
  3. null is prohibited (without custom Comparator)
  4. View = live, not a copy
  5. Java 21 → SequencedSet API
  6. Cache miss during iteration — keep in mind
  7. Memory 3-4x more than HashSet

Summary for Senior

  • TreeSet = TreeMap + PRESENT placeholder
  • Red-black tree → O(log n), balancing
  • compareTo determines uniqueness, not equals
  • NavigableSet → floor/ceiling/subSet
  • View = live view, O(1) creation
  • Cache Locality is bad → nodes are scattered
  • Memory ~40-48 bytes per element
  • SequencedSet (Java 21) → getFirst/reversed

🎯 Interview Cheat Sheet

Must know:

  • TreeSet = adapter over TreeMap, elements stored as keys, value = static final Object PRESENT
  • Internally: red-black tree (self-balancing binary tree), height always O(log n)
  • Complexity: O(log n) for add/remove/contains — slower than HashSet (O(1))
  • NavigableSet API: floor, ceiling, lower, higher, subSet, headSet, tailSet
  • Does not accept null (without custom Comparator) — NullPointerException
  • compareTo determines uniqueness, not equals — if compareTo = 0, element won’t be added
  • Memory: ~40-48 bytes per element (left, right, parent, color) — 3-4x more than HashSet
  • Bad cache locality — nodes scattered across heap, each transition = cache miss
  • View (subSet, etc.) = live view, not a copy, O(1) creation

Frequent follow-up questions:

  • Why is TreeSet slower than HashSet? — O(log n) due to tree + cache miss on every transition between nodes.
  • What happens if compareTo and equals are inconsistent? — TreeSet may treat different elements (by equals) as identical (by compareTo = 0) — data loss.
  • How to add null to TreeSet? — Via Comparator.nullsFirst(Comparator.naturalOrder()).
  • What are Views (subSet) and what are their limitations? — Live range representation. You can only add elements within the range — otherwise IllegalArgumentException.

Red flags (DO NOT say):

  • ❌ “TreeSet uses equals to compare elements” — it uses compareTo!
  • ❌ “TreeSet is faster than HashSet” — 50x slower due to O(log n) and cache miss
  • ❌ “TreeSet works with null by default” — no, only with custom Comparator
  • ❌ “subSet creates a copy of data” — it’s a live view, changes reflect in the original set

Related topics:

  • [[11. What is the difference between HashSet, LinkedHashSet and TreeSet]]
  • [[15. What is the difference between HashMap, LinkedHashMap and TreeMap]]
  • [[16. When to use TreeMap]]
  • [[14. What is Map and what implementations exist]]