Databases SQL PostgreSQL

Questions in this section

Study navigator

21 questions for preparing for a Middle Java Developer interview.

📋 All Questions

#	Question	Difficulty
1	What are indexes and why are they needed	⭐⭐
2	How does B-tree index work	⭐⭐⭐
3	What is a composite index	⭐⭐
4	When should you create an index	⭐⭐
5	What are the disadvantages of indexes	⭐⭐
6	What is index cardinality	⭐⭐⭐
7	What JOIN types exist	⭐⭐⭐
8	Difference between INNER JOIN and LEFT JOIN	⭐⭐
9	What is better JOIN or subquery	⭐⭐⭐
10	What is a correlated subquery	⭐⭐⭐
11	Difference between WHERE and HAVING	⭐
12	What does GROUP BY do	⭐⭐
13	When to use HAVING	⭐
14	What are Window Functions	⭐⭐⭐
15	What does ROW_NUMBER() do	⭐⭐
16	What does RANK() and DENSE_RANK() do	⭐⭐
17	How does MVCC work in PostgreSQL	⭐⭐⭐
18	What is VACUUM in PostgreSQL	⭐⭐⭐
19	Why is ANALYZE needed	⭐⭐⭐
20	What is Explain plan	⭐⭐
21	How to optimize slow queries	⭐⭐⭐

🗺️ Topic Dependency Map

                    ┌─────────────────────────────────────────┐
                    │   21. Query Optimization (Central)       │
                    │   pg_stat_statements → EXPLAIN → Fix    │
                    └──────┬──────────┬──────────┬────────────┘
                           │          │          │
              ┌────────────┘          │          └────────────┐
              ▼                       ▼                       ▼
    ┌─────────────────┐    ┌─────────────────────┐    ┌──────────────────┐
    │ INDEXES (1-6)   │    │ JOIN (7-10)         │    │ PLANNER          │
    │ 1. Why needed   │    │ 7. JOIN types       │    │ 19. ANALYZE      │
    │ 2. B-tree       │    │ 8. INNER vs LEFT    │    │ 20. EXPLAIN      │
    │ 3. Composite    │    │ 9. JOIN vs subquery │    │                  │
    │ 4. When create  │    │ 10. Correlated      │    └────────┬─────────┘
    │ 5. Disadvantages│    │     subquery        │             │
    │ 6. Cardinality  │   └─────────────────────┘             │
    └────────┬────────┘                                       │
             │                                                │
             ▼                                                ▼
    ┌──────────────────────────────────────────────────────────────────┐
    │                MVCC and MAINTENANCE (17-18)                      │
    │  17. MVCC (xmin/xmax, HOT, Hint Bits, XID Wraparound)           │
    │  18. VACUUM (dead tuples, Bloat, pg_repack, autovacuum)         │
    └──────────────────────────────────────────────────────────────────┘

    ┌──────────────────────────────────────────────────────────────────┐
    │                GROUPING and WINDOWS (11-16)                      │
    │  11. WHERE vs HAVING                                             │
    │  12. GROUP BY (HashAggregate, GroupAggregate)                    │
    │  13. When to use HAVING                                          │
    │  14. Window Functions (ROWS, RANGE, FILTER)                      │
    │  15. ROW_NUMBER()                                                │
    │  16. RANK() / DENSE_RANK()                                       │
    └──────────────────────────────────────────────────────────────────┘

🎯 Recommended Study Order

🟢 Junior Level (weeks 1-2)

Step	Topic	Files	Goal
1	Index basics	Q1, Q4, Q5	Why needed, when to create, disadvantages
2	JOIN basics	Q7, Q8	JOIN types, INNER vs LEFT
3	WHERE / HAVING / GROUP BY	Q11, Q12, Q13	Execution order, aggregation
4	MVCC basics	Q17 (Junior)	Why MVCC, UPDATE = DELETE + INSERT
5	VACUUM basics	Q18 (Junior)	Why VACUUM, regular vs FULL
6	EXPLAIN basics	Q20 (Junior)	How to read a plan
7	ROW_NUMBER, RANK	Q15, Q16 (Junior)	Numbering and ranking

🟡 Middle Level (weeks 3-4)

Step	Topic	Files	Goal
1	B-tree internals	Q2, Q3, Q6	Structure, composite, cardinality
2	Physical JOIN	Q7 (Middle/Senior)	Nested Loop, Hash, Merge
3	Subqueries	Q9, Q10	Semi-Join, SubPlan, decorrelation
4	Window functions	Q14, Q15, Q16	ROWS vs RANGE, FILTER, indexes
5	VACUUM details	Q18 (Middle)	Autovacuum, Hot Standby, Cost-based
6	ANALYZE	Q19	Statistics, MCV, correlation
7	Optimization	Q21	Algorithm, common problems

🔴 Senior Level (weeks 5-6)

Step	Topic	Files	Goal
1	B-tree internals	Q2 (Senior)	Lehman-Yao, High Key, deduplication
2	JOIN optimization	Q7 (Senior)	GEQO, Join Reordering, Memoize
3	LATERAL + Memoize	Q9, Q10 (Senior)	PG 14+, Hits/Misses, CTE
4	MVCC deep dive	Q17 (Senior)	Hint Bits, HOT, SSI, CLOG, Wraparound
5	VACUUM deep dive	Q18 (Senior)	Visibility Map, Parallel, pg_repack
6	ANALYZE deep dive	Q19 (Senior)	Extended Statistics, Reservoir Sampling
7	EXPLAIN deep dive	Q20 (Senior)	Generic vs Custom, JIT, WAL
8	System optimization	Q21 (Senior)	Plan Cache, denormalization, pgBouncer

🔗 Key Connections Between Topics

Topic: Indexes

Q1 (Why needed) → Q2 (B-tree) → Q3 (Composite) → Q4 (When) → Q5 (Disadvantages) → Q6 (Cardinality)
                                              ↓
                                    Q21 (Query Optimization)

Key connections:

Q1 ↔ Q17 (MVCC): HOT updates are not possible for indexed columns
Q2 ↔ Q18 (VACUUM): Bloat → REINDEX
Q3 ↔ Q6 (Cardinality): column order by selectivity
Q4 ↔ Q19 (ANALYZE): statistics affect the planner’s index choice
Q5 ↔ Q21 (Optimization): Write Amplification, index balance

Topic: JOIN and Subqueries

Q7 (JOIN types) → Q8 (INNER vs LEFT) → Q9 (JOIN vs subquery) → Q10 (Correlated)
                                                    ↓
                                          Q20 (EXPLAIN: look for SubPlan)

Key connections:

Q7 ↔ Q20 (EXPLAIN): physical strategies in the plan (Nested Loop, Hash, Merge)
Q8 ↔ Q11 (WHERE vs HAVING): filter on right table → in ON, not WHERE
Q9 ↔ Q10: Semi-Join, decorrelation, SubPlan vs InitPlan
Q10 ↔ Q14 (Window Functions): window functions as an alternative to correlated subqueries

Topic: GROUP BY and Window Functions

Q11 (WHERE vs HAVING) → Q12 (GROUP BY) → Q13 (When HAVING)
                                                ↓
                                    Q14 (Window Functions) → Q15 (ROW_NUMBER) → Q16 (RANK)

Key connections:

Q11 ↔ Q12: WHERE before GROUP BY, HAVING after
Q12 ↔ Q14: window functions do NOT collapse rows (unlike GROUP BY)
Q13 ↔ Q14: window functions are evaluated AFTER HAVING → subquery needed
Q14 ↔ Q15, Q16: ROW_NUMBER, RANK, DENSE_RANK, NTILE, PERCENT_RANK

Topic: MVCC and Maintenance

Q17 (MVCC) → Q18 (VACUUM) → Q19 (ANALYZE) → Q20 (EXPLAIN) → Q21 (Optimization)

Key connections:

Q17 ↔ Q18: MVCC creates dead tuples → VACUUM removes them
Q18 ↔ Q19: Autovacuum triggers ANALYZE
Q19 ↔ Q20: Stale statistics → wrong plan → ANALYZE
Q20 ↔ Q21: EXPLAIN (ANALYZE, BUFFERS) — the primary optimization tool

🎓 Cheat Sheet: What to Know for Each Level

🟢 Junior

What an index is and why it’s needed
JOIN types (INNER, LEFT, RIGHT, FULL, CROSS)
WHERE vs HAVING: when to use which
GROUP BY + aggregate functions (COUNT, SUM, AVG)
MVCC: readers don’t block writers
VACUUM: why it’s needed, regular vs FULL

🟡 Middle

B-tree structure, composite indexes, left-prefix rule
Physical JOIN strategies (Nested Loop, Hash, Merge)
EXISTS vs JOIN, SubPlan vs InitPlan
Window functions: ROW_NUMBER, RANK, LAG/LEAD
Autovacuum: threshold, scale_factor, tuning
ANALYZE: statistics, n_distinct, MCV
EXPLAIN (ANALYZE, BUFFERS): reading the plan

🔴 Senior

Lehman-Yao, High Key, deduplication (PG 13+), Skip Scan emulation
GEQO, Join Reordering, Outer Join Removal, Memoize (PG 14+)
MVCC: Hint Bits, HOT, SSI, CLOG, XID Wraparound
VACUUM: Visibility Map, Parallel, pg_repack, Cost-based
Extended Statistics, Reservoir Sampling
Generic vs Custom Plans, plan_cache_mode, JIT, WAL
System optimization: denormalization, pgBouncer, Partitioning

📝 File Format

Each file contains:

🟢 Junior Level — basic understanding, simple analogies, examples
🟡 Middle Level — internals, common mistakes, practical examples
🔴 Senior Level — deep dive, edge cases, production experience, monitoring
🎯 Interview Cheat Sheet — key takeaways, common questions, red flags, related topics