Section 1 Β· 21 questions

Databases SQL PostgreSQL

21 interview questions and answers in the Databases SQL PostgreSQL section.

English Databases SQL PostgreSQL Source Markdown
Language versions: English Russian Ukrainian

Questions in this section

  1. What are indexes and why are they needed?
  2. How does B-tree index work?
  3. What is a composite index?
  4. When should you create an index?
  5. What are the disadvantages of indexes?
  6. What is index cardinality?
  7. What JOIN types exist?
  8. Difference between INNER JOIN and LEFT JOIN?
  9. What is Better: JOIN or Subquery?
  10. What is a Correlated Subquery?
  11. What is the Difference Between WHERE and HAVING?
  12. What Does GROUP BY Do?
  13. When to Use HAVING?
  14. What Are Window Functions?
  15. What does ROW_NUMBER() do?
  16. What does RANK() and DENSE_RANK() do?
  17. How does MVCC work in PostgreSQL?
  18. What is VACUUM in PostgreSQL?
  19. Why is ANALYZE needed?
  20. What is Explain plan?
  21. How to optimize slow queries?

Study navigator

21 questions for preparing for a Middle Java Developer interview.


πŸ“‹ All Questions

# Question Difficulty
1 What are indexes and why are they needed ⭐⭐
2 How does B-tree index work ⭐⭐⭐
3 What is a composite index ⭐⭐
4 When should you create an index ⭐⭐
5 What are the disadvantages of indexes ⭐⭐
6 What is index cardinality ⭐⭐⭐
7 What JOIN types exist ⭐⭐⭐
8 Difference between INNER JOIN and LEFT JOIN ⭐⭐
9 What is better JOIN or subquery ⭐⭐⭐
10 What is a correlated subquery ⭐⭐⭐
11 Difference between WHERE and HAVING ⭐
12 What does GROUP BY do ⭐⭐
13 When to use HAVING ⭐
14 What are Window Functions ⭐⭐⭐
15 What does ROW_NUMBER() do ⭐⭐
16 What does RANK() and DENSE_RANK() do ⭐⭐
17 How does MVCC work in PostgreSQL ⭐⭐⭐
18 What is VACUUM in PostgreSQL ⭐⭐⭐
19 Why is ANALYZE needed ⭐⭐⭐
20 What is Explain plan ⭐⭐
21 How to optimize slow queries ⭐⭐⭐

πŸ—ΊοΈ Topic Dependency Map

                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚   21. Query Optimization (Central)       β”‚
                    β”‚   pg_stat_statements β†’ EXPLAIN β†’ Fix    β”‚
                    β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                           β”‚          β”‚          β”‚
              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜          β”‚          └────────────┐
              β–Ό                       β–Ό                       β–Ό
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚ INDEXES (1-6)   β”‚    β”‚ JOIN (7-10)         β”‚    β”‚ PLANNER          β”‚
    β”‚ 1. Why needed   β”‚    β”‚ 7. JOIN types       β”‚    β”‚ 19. ANALYZE      β”‚
    β”‚ 2. B-tree       β”‚    β”‚ 8. INNER vs LEFT    β”‚    β”‚ 20. EXPLAIN      β”‚
    β”‚ 3. Composite    β”‚    β”‚ 9. JOIN vs subquery β”‚    β”‚                  β”‚
    β”‚ 4. When create  β”‚    β”‚ 10. Correlated      β”‚    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    β”‚ 5. Disadvantagesβ”‚    β”‚     subquery        β”‚             β”‚
    β”‚ 6. Cardinality  β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜             β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜                                       β”‚
             β”‚                                                β”‚
             β–Ό                                                β–Ό
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚                MVCC and MAINTENANCE (17-18)                      β”‚
    β”‚  17. MVCC (xmin/xmax, HOT, Hint Bits, XID Wraparound)           β”‚
    β”‚  18. VACUUM (dead tuples, Bloat, pg_repack, autovacuum)         β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚                GROUPING and WINDOWS (11-16)                      β”‚
    β”‚  11. WHERE vs HAVING                                             β”‚
    β”‚  12. GROUP BY (HashAggregate, GroupAggregate)                    β”‚
    β”‚  13. When to use HAVING                                          β”‚
    β”‚  14. Window Functions (ROWS, RANGE, FILTER)                      β”‚
    β”‚  15. ROW_NUMBER()                                                β”‚
    β”‚  16. RANK() / DENSE_RANK()                                       β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

🟒 Junior Level (weeks 1-2)

Step Topic Files Goal
1 Index basics Q1, Q4, Q5 Why needed, when to create, disadvantages
2 JOIN basics Q7, Q8 JOIN types, INNER vs LEFT
3 WHERE / HAVING / GROUP BY Q11, Q12, Q13 Execution order, aggregation
4 MVCC basics Q17 (Junior) Why MVCC, UPDATE = DELETE + INSERT
5 VACUUM basics Q18 (Junior) Why VACUUM, regular vs FULL
6 EXPLAIN basics Q20 (Junior) How to read a plan
7 ROW_NUMBER, RANK Q15, Q16 (Junior) Numbering and ranking

🟑 Middle Level (weeks 3-4)

Step Topic Files Goal
1 B-tree internals Q2, Q3, Q6 Structure, composite, cardinality
2 Physical JOIN Q7 (Middle/Senior) Nested Loop, Hash, Merge
3 Subqueries Q9, Q10 Semi-Join, SubPlan, decorrelation
4 Window functions Q14, Q15, Q16 ROWS vs RANGE, FILTER, indexes
5 VACUUM details Q18 (Middle) Autovacuum, Hot Standby, Cost-based
6 ANALYZE Q19 Statistics, MCV, correlation
7 Optimization Q21 Algorithm, common problems

πŸ”΄ Senior Level (weeks 5-6)

Step Topic Files Goal
1 B-tree internals Q2 (Senior) Lehman-Yao, High Key, deduplication
2 JOIN optimization Q7 (Senior) GEQO, Join Reordering, Memoize
3 LATERAL + Memoize Q9, Q10 (Senior) PG 14+, Hits/Misses, CTE
4 MVCC deep dive Q17 (Senior) Hint Bits, HOT, SSI, CLOG, Wraparound
5 VACUUM deep dive Q18 (Senior) Visibility Map, Parallel, pg_repack
6 ANALYZE deep dive Q19 (Senior) Extended Statistics, Reservoir Sampling
7 EXPLAIN deep dive Q20 (Senior) Generic vs Custom, JIT, WAL
8 System optimization Q21 (Senior) Plan Cache, denormalization, pgBouncer

πŸ”— Key Connections Between Topics

Topic: Indexes

Q1 (Why needed) β†’ Q2 (B-tree) β†’ Q3 (Composite) β†’ Q4 (When) β†’ Q5 (Disadvantages) β†’ Q6 (Cardinality)
                                              ↓
                                    Q21 (Query Optimization)

Key connections:

  • Q1 ↔ Q17 (MVCC): HOT updates are not possible for indexed columns
  • Q2 ↔ Q18 (VACUUM): Bloat β†’ REINDEX
  • Q3 ↔ Q6 (Cardinality): column order by selectivity
  • Q4 ↔ Q19 (ANALYZE): statistics affect the planner’s index choice
  • Q5 ↔ Q21 (Optimization): Write Amplification, index balance

Topic: JOIN and Subqueries

Q7 (JOIN types) β†’ Q8 (INNER vs LEFT) β†’ Q9 (JOIN vs subquery) β†’ Q10 (Correlated)
                                                    ↓
                                          Q20 (EXPLAIN: look for SubPlan)

Key connections:

  • Q7 ↔ Q20 (EXPLAIN): physical strategies in the plan (Nested Loop, Hash, Merge)
  • Q8 ↔ Q11 (WHERE vs HAVING): filter on right table β†’ in ON, not WHERE
  • Q9 ↔ Q10: Semi-Join, decorrelation, SubPlan vs InitPlan
  • Q10 ↔ Q14 (Window Functions): window functions as an alternative to correlated subqueries

Topic: GROUP BY and Window Functions

Q11 (WHERE vs HAVING) β†’ Q12 (GROUP BY) β†’ Q13 (When HAVING)
                                                ↓
                                    Q14 (Window Functions) β†’ Q15 (ROW_NUMBER) β†’ Q16 (RANK)

Key connections:

  • Q11 ↔ Q12: WHERE before GROUP BY, HAVING after
  • Q12 ↔ Q14: window functions do NOT collapse rows (unlike GROUP BY)
  • Q13 ↔ Q14: window functions are evaluated AFTER HAVING β†’ subquery needed
  • Q14 ↔ Q15, Q16: ROW_NUMBER, RANK, DENSE_RANK, NTILE, PERCENT_RANK

Topic: MVCC and Maintenance

Q17 (MVCC) β†’ Q18 (VACUUM) β†’ Q19 (ANALYZE) β†’ Q20 (EXPLAIN) β†’ Q21 (Optimization)

Key connections:

  • Q17 ↔ Q18: MVCC creates dead tuples β†’ VACUUM removes them
  • Q18 ↔ Q19: Autovacuum triggers ANALYZE
  • Q19 ↔ Q20: Stale statistics β†’ wrong plan β†’ ANALYZE
  • Q20 ↔ Q21: EXPLAIN (ANALYZE, BUFFERS) β€” the primary optimization tool

πŸŽ“ Cheat Sheet: What to Know for Each Level

🟒 Junior

  • What an index is and why it’s needed
  • JOIN types (INNER, LEFT, RIGHT, FULL, CROSS)
  • WHERE vs HAVING: when to use which
  • GROUP BY + aggregate functions (COUNT, SUM, AVG)
  • MVCC: readers don’t block writers
  • VACUUM: why it’s needed, regular vs FULL

🟑 Middle

  • B-tree structure, composite indexes, left-prefix rule
  • Physical JOIN strategies (Nested Loop, Hash, Merge)
  • EXISTS vs JOIN, SubPlan vs InitPlan
  • Window functions: ROW_NUMBER, RANK, LAG/LEAD
  • Autovacuum: threshold, scale_factor, tuning
  • ANALYZE: statistics, n_distinct, MCV
  • EXPLAIN (ANALYZE, BUFFERS): reading the plan

πŸ”΄ Senior

  • Lehman-Yao, High Key, deduplication (PG 13+), Skip Scan emulation
  • GEQO, Join Reordering, Outer Join Removal, Memoize (PG 14+)
  • MVCC: Hint Bits, HOT, SSI, CLOG, XID Wraparound
  • VACUUM: Visibility Map, Parallel, pg_repack, Cost-based
  • Extended Statistics, Reservoir Sampling
  • Generic vs Custom Plans, plan_cache_mode, JIT, WAL
  • System optimization: denormalization, pgBouncer, Partitioning

πŸ“ File Format

Each file contains:

  • 🟒 Junior Level β€” basic understanding, simple analogies, examples
  • 🟑 Middle Level β€” internals, common mistakes, practical examples
  • πŸ”΄ Senior Level β€” deep dive, edge cases, production experience, monitoring
  • 🎯 Interview Cheat Sheet β€” key takeaways, common questions, red flags, related topics