What is the difference between List, Set and Queue?
Junior Level
List, Set and Queue — three main collection types in Java.
Main difference:
| Type | Order | Duplicates | Access | Analogy |
|---|---|---|---|---|
| List | Preserves insertion order | Allowed | By index: `list.get(0)` | Shopping list |
| Set | Not guaranteed (except LinkedHashSet) | Not allowed | Iteration only | Set of keys |
| Queue | FIFO (First-In-First-Out) | Allowed | Head only | Store checkout line |
Example:
```java
// List: order and duplicates
List<String> list = new ArrayList<>();
list.add("B");
list.add("A");
list.add("B"); // OK
// → [B, A, B]

// Set: only unique
Set<String> set = new HashSet<>();
set.add("B");
set.add("A");
set.add("B"); // Won't be added
// → [A, B] (order not guaranteed)

// Queue: FIFO
Queue<String> queue = new ArrayDeque<>();
queue.offer("B");
queue.offer("A");
queue.poll(); // → "B" (first in, first out)
```
Middle Level
List (Sequence)
```java
// Implementations:
ArrayList  // Dynamic array → O(1) read
LinkedList // Doubly-linked list → O(n) read

// Operations:
list.get(0)      // O(1) for ArrayList, O(n) for LinkedList
list.add(e)      // O(1) amortized
list.contains(e) // O(n) linear search
list.remove(0)   // O(n) array shift
```
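A short sketch showing these costs in action on an ArrayList (assumes Java 9+ for `List.of`):

```java
import java.util.ArrayList;
import java.util.List;

class ListOpsDemo {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>(List.of("A", "B", "C", "D"));

        list.remove(0); // shifts B, C, D one slot left → O(n)
        // → [B, C, D]

        list.add("E");  // appends at the end → amortized O(1)
        // → [B, C, D, E]

        System.out.println(list.get(0)); // index access → O(1) for ArrayList, prints "B"
    }
}
```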
Set (Collection of unique elements)
```java
// Implementations:
HashSet       // Based on HashMap → O(1) lookup
LinkedHashSet // + insertion order
TreeSet       // Red-black tree → O(log n), sorted

// Inside HashSet:
// HashMap<E, Object> where every value is the shared static final Object PRESENT
// → Despite sharing one PRESENT object, HashSet consumes more memory than
//   ArrayList due to the HashMap.Node wrapper (key + value + hash + next pointer).

// Operations:
set.add(e)      // O(1) for HashSet, O(log n) for TreeSet
set.contains(e) // O(1) for HashSet, O(log n) for TreeSet
set.remove(e)   // O(1) for HashSet, O(log n) for TreeSet
```
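The delegation described above can be made concrete with a minimal sketch. The `PRESENT` trick mirrors what `java.util.HashSet` does internally, but this class is illustrative, not the JDK source:

```java
import java.util.HashMap;

// Illustrative sketch of HashSet's internal design: every element becomes a key
// in a HashMap, and all keys share one dummy value object.
class MiniHashSet<E> {
    private static final Object PRESENT = new Object(); // shared dummy value
    private final HashMap<E, Object> map = new HashMap<>();

    // put() returns the previous value, so null means the key was new
    boolean add(E e)      { return map.put(e, PRESENT) == null; }
    boolean contains(E e) { return map.containsKey(e); }
    boolean remove(E e)   { return map.remove(e) == PRESENT; }
    int size()            { return map.size(); }
}
```

Because every element is wrapped in a `HashMap.Node`, this design is also the source of the extra memory overhead discussed at the Senior level.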
Queue
```java
// Implementations:
ArrayDeque    // Circular array → O(1) insert/remove
PriorityQueue // Binary heap → O(log n), priority extraction
LinkedList    // Can be used as a Queue, but slower than ArrayDeque

// Methods (safe vs throwing pairs):
queue.offer(e)  // returns false on failure (no exception)
queue.add(e)    // throws IllegalStateException if capacity is exhausted
queue.poll()    // returns null if empty
queue.remove()  // throws NoSuchElementException if empty
queue.peek()    // returns null if empty
queue.element() // throws NoSuchElementException if empty
```
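The contrast between the two method families is easiest to see on an empty queue (a small sketch using ArrayDeque):

```java
import java.util.ArrayDeque;
import java.util.NoSuchElementException;
import java.util.Queue;

class QueueFamiliesDemo {
    public static void main(String[] args) {
        Queue<String> q = new ArrayDeque<>();

        // "Safe" family: special value instead of exception
        System.out.println(q.poll()); // prints "null"
        System.out.println(q.peek()); // prints "null"

        // "Throwing" family: exception on an empty queue
        try {
            q.remove();
        } catch (NoSuchElementException e) {
            System.out.println("remove() threw on empty queue");
        }
    }
}
```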
Performance Comparison
| Operation | ArrayList | HashSet | TreeSet | ArrayDeque |
|---|---|---|---|---|
| Insertion | O(1) amortized | O(1) | O(log n) | O(1) |
| Lookup | O(n) | O(1) | O(log n) | O(n) |
| Removal | O(n) | O(1) | O(log n) | O(1) |
| Index access | O(1) | N/A | N/A | N/A |
When to use which
| Scenario | Choose | Why |
|---|---|---|
| Store with order | List | Preserves order |
| Uniqueness check | Set | Fast contains |
| Process in order | Queue | FIFO |
| Sorting | TreeSet | Automatic sorting |
| Frequent reads | ArrayList | Cache locality |
Cache locality = elements are located adjacently in memory. CPU loads a 64-byte cache line at once, getting 8-16 references immediately. This makes iteration 10-100x faster compared to scattered elements in LinkedList.
When NOT to use
- List (ArrayList): Don’t use for uniqueness checks — `contains()` = O(n)
- HashSet: Don’t use if insertion order matters — use `LinkedHashSet`
- TreeSet: Don’t use if sorting is unnecessary — O(log n) is slower than O(1)
- Queue: Don’t use for random access — use `List`
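The `LinkedHashSet` recommendation above in action — uniqueness plus predictable iteration order (a short sketch):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.Set;

class OrderedSetDemo {
    public static void main(String[] args) {
        Set<String> ordered = new LinkedHashSet<>();
        ordered.add("B");
        ordered.add("A");
        ordered.add("B"); // duplicate ignored, original position kept

        // Iteration follows insertion order, unlike plain HashSet
        System.out.println(new ArrayList<>(ordered)); // prints [B, A]
    }
}
```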
Senior Level
Memory Footprint
```
ArrayList:
→ Object[] array (4-8 bytes per reference)
→ Minimal overhead

HashSet:
→ HashMap inside
→ Node<K,V> per element
→ + static final Object PRESENT (shared)
→ ~2x more memory than ArrayList!

TreeSet:
→ TreeMap (red-black tree)
→ Entry<K,V> + left + right + parent + color
→ ~4x more memory than ArrayList
```
Concurrency
```java
// List
CopyOnWriteArrayList          // Many reads, few writes
Collections.synchronizedList  // Universal, but slow

// Set
ConcurrentHashMap.newKeySet() // Best choice!
Collections.synchronizedSet()

// Queue
ConcurrentLinkedQueue // Lock-free (non-blocking CAS)
LinkedBlockingQueue   // Producer-Consumer
ArrayBlockingQueue    // Bounded, single lock
SynchronousQueue      // Zero capacity
```
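A minimal sketch of the recommended `ConcurrentHashMap.newKeySet()`: many threads add concurrently with no external locking:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class ConcurrentSetDemo {
    public static void main(String[] args) {
        // A thread-safe Set view backed by ConcurrentHashMap
        Set<Integer> seen = ConcurrentHashMap.newKeySet();

        // Parallel writers: safe without synchronized blocks
        IntStream.range(0, 1_000).parallel().forEach(seen::add);

        System.out.println(seen.size()); // prints 1000
    }
}
```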
Cache Locality
```
ArrayList: elements adjacent in memory
→ CPU prefetching → L1/L2 cache hits
→ Iteration: very fast

HashSet: elements scattered
→ Cache misses during iteration
→ But O(1) contains compensates

LinkedList: nodes at different heap locations
→ Cache miss on EVERY element
→ Anti-pattern for modern CPUs!
```
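A naive timing sketch of the iteration gap. For trustworthy numbers use JMH; `System.nanoTime()` measurements like this are noisy and JIT-sensitive, so no concrete speedup figure is asserted here:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class IterationSketch {
    // Enhanced-for uses an Iterator for both lists, so the comparison is fair
    static long sumAll(List<Integer> list) {
        long sum = 0;
        for (int x : list) sum += x;
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> array = new ArrayList<>();
        List<Integer> linked = new LinkedList<>();
        for (int i = 0; i < 1_000_000; i++) { array.add(i); linked.add(i); }

        long t0 = System.nanoTime();
        sumAll(array);
        long arrayNs = System.nanoTime() - t0;

        t0 = System.nanoTime();
        sumAll(linked);
        long linkedNs = System.nanoTime() - t0;

        // On typical hardware ArrayList wins, largely due to cache locality
        System.out.printf("ArrayList: %d ns, LinkedList: %d ns%n", arrayNs, linkedNs);
    }
}
```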
Java 21: SequencedCollection
```java
// In Java 21, List and Deque extend SequencedCollection
SequencedCollection<String> seq = new ArrayList<>();
seq.addFirst("A"); // New method!
seq.addLast("B");  // New method!
seq.getFirst();    // New method!
seq.reversed();    // Reverse-order view

// For Set:
SequencedSet<String> seqSet = new LinkedHashSet<>();
// TreeSet is also a SequencedSet (via SortedSet), though addFirst/addLast
// throw UnsupportedOperationException on sorted sets
```
Bloom Filter for massive Sets
```java
// When elements number in the billions, a HashSet won't fit in RAM.
// Solution: Bloom Filter (third-party structure, e.g. Guava; not in the JCF).
// "False positive" = the filter may say "element possibly exists" when it doesn't.
// A "false negative" is impossible: if the filter says "absent", it's definitely absent.
BloomFilter<String> filter = BloomFilter.create(   // Guava: com.google.common.hash
    Funnels.stringFunnel(StandardCharsets.UTF_8),  // charset-taking overload (the no-arg one is deprecated)
    expectedInsertions,
    falsePositiveRate
);
// O(1) lookup, minimal memory
// But: false positives possible (no false negatives)
```
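To make the mechanics concrete without a Guava dependency, here is a hand-rolled sketch over a `BitSet` (illustrative only; bit-array sizing and hash quality matter in production, so prefer a library):

```java
import java.util.BitSet;

// Minimal Bloom filter sketch: k bit positions per element, derived from
// two base hashes (Kirsch-Mitzenmacher double hashing).
class TinyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    TinyBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // i-th index = (h1 + i * h2) mod size
    private int index(String s, int i) {
        int h1 = s.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String s) {
        for (int i = 0; i < hashes; i++) bits.set(index(s, i));
    }

    // false → definitely absent; true → possibly present (false positive possible)
    boolean mightContain(String s) {
        for (int i = 0; i < hashes; i++)
            if (!bits.get(index(s, i))) return false;
        return true;
    }
}
```

Only the "no false negatives" guarantee can be asserted: an added element always reports present, while an absent one merely *usually* reports absent.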
Production Experience
Real scenario: HashSet vs ArrayList for contains
```java
// ❌ O(n) lookup in a list of 1M elements
List<String> blacklist = List.of("bad1", "bad2", ...);
if (blacklist.contains(input)) { ... } // Slow!

// ✅ O(1) lookup
Set<String> blacklist = Set.of("bad1", "bad2", ...);
if (blacklist.contains(input)) { ... } // Fast!
```
Best Practices
- List → ArrayList by default
- Set → HashSet for uniqueness, TreeSet for sorting
- Queue → ArrayDeque for stack/queue
- Concurrency → ConcurrentHashMap.newKeySet() for Set
- Memory → ArrayList more compact than Set
- Cache Locality → ArrayList > LinkedList
- Java 21 → SequencedCollection for unified API
Senior Summary
- List = indexing, order, cache locality
- Set = uniqueness, O(1) contains (HashSet)
- Queue = FIFO/LIFO, backpressure (bounded BlockingQueue)
- Memory: ArrayList < HashSet < TreeSet
- Cache Locality: ArrayList » LinkedList
- Concurrency: CopyOnWriteArrayList, ConcurrentHashMap.newKeySet()
- Bloom Filter for massive sets
- Java 21 → SequencedCollection unifies API
Interview Cheat Sheet
Must know:
- List = order + duplicates + index access (ArrayList O(1) read, LinkedList O(n))
- Set = uniqueness via equals()/hashCode(), HashSet = O(1) lookup, TreeSet = O(log n) sorting
- Queue = FIFO, offer/poll/peek safer than add/remove/element
- HashSet internally = HashMap<E, Object> with shared PRESENT, ~2x more memory than ArrayList
- Cache locality: ArrayList » LinkedList — elements adjacent in memory, CPU prefetching
- For concurrency: CopyOnWriteArrayList (many reads), ConcurrentHashMap.newKeySet() for Set
- Bloom Filter — for massive sets (billions of elements), O(1) lookup, false positives possible
Common follow-up questions:
- Why does HashSet consume more memory than ArrayList? — Internally HashMap.Node (key + value + hash + next pointer)
- When to use TreeSet over HashSet? — When sorting is needed, but O(log n) is slower than O(1)
- How do offer/poll differ from add/remove? — offer/poll return false/null instead of exception
- How to ensure thread-safety for Set? — ConcurrentHashMap.newKeySet() — best choice
Red flags (DO NOT say):
- “LinkedList is faster than ArrayList for iteration” — opposite is true, ArrayList cache locality gives 10-100x advantage
- “HashSet guarantees insertion order” — that’s LinkedHashSet, regular HashSet doesn’t guarantee order
- “For thread-safe Set you need Collections.synchronizedSet” — ConcurrentHashMap.newKeySet() is more efficient
- “Queue and List are interchangeable” — Queue doesn’t support random index access
Related topics:
- [[1. What are the main Collection Framework interfaces]]
- [[3. What is the difference between ArrayList and LinkedList]]
- [[5. What is the time complexity of operations in ArrayList]]
- [[9. What is Queue and what implementations exist]]