Question 1 · Section 1

What are indexes and why are they needed?

An index stores column values in sorted order and pointers (TID — Tuple ID, a pair of block_number:offset_in_block) to physical rows in the table. The most common index type is...

Language versions: English Russian Ukrainian

🟢 Junior Level

Indexes are auxiliary data structures in a database that speed up data retrieval.

Simple analogy: Imagine a thick book without a table of contents. To find a specific topic, you’d have to flip through all the pages. An index is like a table of contents at the beginning of the book: you immediately see which page the needed section is on.

Core idea:

  • Without an index, the database scans the entire table sequentially (Seq Scan — sequential scanning, where each row is read one after another, as if you were flipping through a book page by page)
  • With an index, the database quickly finds the required rows

Example:

-- Without index: slow search across entire table
SELECT * FROM users WHERE email = 'test@example.com';

-- Create index
CREATE INDEX idx_users_email ON users(email);

-- Now search is instantaneous
SELECT * FROM users WHERE email = 'test@example.com'; -- Uses index

What changed internally: The SQL query is the same, but PostgreSQL no longer reads all 1,000,000 rows. Instead, it traverses 3-4 pages of the B-tree index (~0.01ms) and immediately jumps to the required row in the table via TID.

When to use:

  • When you frequently search data by a specific field
  • When the table is large (thousands of rows and more)
  • For fields used in WHERE, JOIN, ORDER BY

🟡 Middle Level

How it works

An index stores column values in sorted order and pointers (TID — Tuple ID, a pair of block_number:offset_in_block) to physical rows in the table. The most common index type is B-Tree (created by default).

Index types in PostgreSQL

Type When to use Supported operations
B-Tree Universal (default) <, <=, =, >=, >, ranges
Hash Exact equality only =
GIN Arrays, JSONB, full-text search @>, ?, @@
GiST Geospatial data, intervals Depends on operator
BRIN Huge tables (>100GB) with ordered data Ranges

Why not always B-Tree? A Hash index takes less disk space than B-Tree but only supports =. GIN can search within arrays and JSONB — which B-Tree can’t do at all. BRIN stores only value ranges per page — takes 100-1000x less space than B-Tree, but only works for ordered data (time series).

Scan types

Type Mechanism When it occurs
Index Scan Index → table read When columns not in index are needed
Index Only Scan Index only All data available in index
Bitmap Index Scan Index → bitmap → table When data is abundant but scattered
Seq Scan Sequential read Small tables or no index

Bitmap Index Scan: collects ALL matching TIDs from the index into a bitmap (1 = match, 0 = no match), then reads rows from heap. More efficient than Index Scan when reading 100-10,000 rows from different locations — fewer random I/O operations.

Common mistakes

  1. Creating indexes on all fields
    • ❌ Problem: Slows down INSERT/UPDATE/DELETE
    • ✅ Solution: Index only fields actually used in WHERE/JOIN
  2. Using functions in WHERE
    -- ❌ Index will NOT be used
    SELECT * FROM users WHERE LOWER(email) = 'test@example.com';
    
    -- ✅ Create a functional index
    CREATE INDEX idx_users_email_lower ON users(LOWER(email));
    
  3. Searching by second column of a composite index
    -- Index (city, age)
    -- ❌ Will not be used when searching only by age
    SELECT * FROM users WHERE age = 25;
    

When NOT to create an index

  • Small tables (< 1000 rows) — Seq Scan is faster because the cost of one sequential read of all pages is less than the cost of navigating the B-tree index + random heap page reads
  • Columns with low cardinality (e.g., gender: M/F)
  • Fields rarely used in queries
  • Tables with intensive writes (each index slows down INSERT)

Practical example

-- Composite index for a typical query
CREATE INDEX idx_orders_customer_date
ON orders(customer_id, created_at DESC);

-- This query fully uses the index
SELECT * FROM orders
WHERE customer_id = 123
ORDER BY created_at DESC
LIMIT 10;

🔴 Senior Level

Internal Implementation

An index in PostgreSQL is a disk structure optimized for concurrent access. B-tree is based on the Lehman and Yao algorithm, which allows searches and insertions without locking the entire path from root to leaf.

B-tree structure:

Meta-page (page 0)
  └── Root page
       ├── Internal pages (navigation)
       └── Leaf pages (data + TID → Heap)
            └── P_NEXT / P_PREV (scan links)

High Key: Each page contains the maximum value. If during a search we see that the search value > High Key — it means a split occurred, and we need to follow P_NEXT.

Architectural Trade-offs

Covering indexes (INCLUDE):

-- INCLUDE columns are stored only in leaf nodes, without bloating the tree
CREATE INDEX idx_orders_customer_total
ON orders(customer_id) INCLUDE (total_amount);

-- Index Only Scan without accessing Heap
SELECT customer_id, total_amount
FROM orders
WHERE customer_id = 123;

Partial indexes:

-- Index only active sessions (10-100x space savings)
CREATE INDEX idx_active_sessions
ON sessions(user_id)
WHERE status = 'ACTIVE';

Creation in Production:

-- CONCURRENTLY does not block writes, but takes longer
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
-- ⚠️ Cannot be used inside a transaction!
-- ⚠️ If interrupted, an INVALID index remains

MVCC and HOT Updates

In PostgreSQL, UPDATE = DELETE + INSERT. If a column is indexed, all indexes must be updated.

HOT (Heap Only Tuple) — an optimization that allows updating a row without updating indexes:

  • Works if the modified columns are not indexed
  • There must be free space on the Heap page

Conclusion: Each extra index kills HOT and increases Write Amplification.

Edge Cases

  1. Index Bloat
    • Cause: Page Splits when inserting into the middle
    • Solution: fillfactor = 70-80 for frequently updated indexes
    • Maintenance: REINDEX CONCURRENTLY (VACUUM does not return space to OS)
  2. Selectivity and planner
    • If a query returns >10-15% of the table → planner will choose Seq Scan
    • Random I/O (Index Scan) is more expensive than Sequential Scan for large selections
  3. Invalid indexes
    -- Check for INVALID indexes (after interrupted CREATE INDEX CONCURRENTLY)
    SELECT indexrelid::regclass, indisvalid
    FROM pg_index
    WHERE NOT indisvalid;
    

Performance

Metric 1-2 indexes 10-15 indexes
Write IOPS Minimal Exponential
Replica Lag Almost absent High risk
Update Speed High (HOT possible) Low
Maintenance Fast Long REINDEX

Production Experience

Real scenario:

  • Orders table: 50 million rows, 15 indexes
  • Problem: INSERT slowed 10x, Replica Lag = 30 seconds
  • Solution:
    • Removed 8 unused indexes (idx_scan = 0 in pg_stat_user_indexes)
    • Replaced 3 separate indexes with 1 composite
    • Result: INSERT sped up 5x, Replica Lag < 1 second

Monitoring

-- Index sizes
SELECT
    indexrelname,
    pg_size_pretty(pg_relation_size(indexrelid)) as size,
    idx_scan as scans,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
WHERE relname = 'orders'
ORDER BY pg_relation_size(indexrelid) DESC;

-- Unused indexes (candidates for deletion)
SELECT
    schemaname,
    relname as table_name,
    indexrelname as index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) as size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND NOT indisunique  -- don't delete UNIQUE
ORDER BY pg_relation_size(indexrelid) DESC;

-- Check Bloat via pgstattuple
SELECT * FROM pgstattuple('idx_orders_customer');
-- avg_leaf_density < 50% → time for REINDEX

Best Practices

  1. Always use CONCURRENTLY in production
  2. Monitor pg_stat_user_indexes for unused indexes
  3. Check plans via EXPLAIN (ANALYZE, BUFFERS)
  4. Consolidate indexes: (a), (b), (a,b,c) → one (a,b,c)
  5. Use HypoPG for testing hypothetical indexes
  6. Foreign Keys — almost always index them

Summary for Senior

  • Indexes are a write tax in exchange for read speedup
  • Balance between SELECT speed and INSERT/UPDATE cost
  • Monitor Bloat, Replica Lag, HOT-update ratio
  • EXPLAIN (ANALYZE, BUFFERS) is your best friend
  • Regularly remove unused indexes

🎯 Interview Cheat Sheet

Must know:

  • Indexes speed up SELECT, slow down INSERT/UPDATE/DELETE
  • B-Tree is the default type, supports =, <, >, ranges
  • 5 types: B-Tree, Hash, GIN, GiST, BRIN
  • 4 scan types: Index Scan, Index Only Scan, Bitmap, Seq Scan
  • CONCURRENTLY for creation without blocking in production
  • INCLUDE for Index Only Scan without bloating the tree

Frequent follow-up questions:

  • “What happens if you create indexes on all fields?” → Write Amplification, Replica Lag, kills HOT
  • “How to check if an index is used?” → EXPLAIN (ANALYZE, BUFFERS), pg_stat_user_indexes
  • “Why doesn’t WHERE LOWER(email) = ... use an index?” → Need a functional index
  • “How to find unused indexes?” → idx_scan = 0 in pg_stat_user_indexes
  • “What’s the difference between VACUUM and REINDEX?” → VACUUM cleans dead rows, REINDEX rebuilds the index

Red flags (DON’T say):

  • ❌ “Indexes are always beneficial” (no — it’s a write tax)
  • ❌ “VACUUM shrinks index files” (no, need REINDEX)
  • ❌ “Hash index is better than B-Tree” (Hash only for =)
  • ❌ “I’ll create indexes just in case” (analyze the workload)

Related topics:

  • [[How does B-tree index work]] → internal structure
  • [[What is a composite index]] → leftmost prefix rule
  • [[How does MVCC work in PostgreSQL]] → MVCC and HOT updates
  • [[What is VACUUM in PostgreSQL]] → index maintenance
  • [[How to optimize slow queries]] → practical application