How to optimize slow queries?
🟢 Junior Level
Query optimization is the process of speeding up slow SQL queries.
Simple algorithm:
- Find the slow query
- Look at its plan via EXPLAIN
- Understand what’s slowing it down
- Fix it (most often — add an index)
Example:
-- 1. Find the slow query
-- (application logs, pg_stat_statements)
-- 2. Check the plan
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';
-- 3. We see the problem
-- Seq Scan on users ← reads the entire table!
-- Rows Removed by Filter: 999999
-- 4. Fix it
CREATE INDEX idx_users_email ON users(email);
-- Now:
-- Index Scan using idx_users_email ← fast!
-- Execution Time: 0.05 ms (was 500 ms)
Most common causes of slow queries:
- ❌ No index on the field in WHERE
- ❌ Index exists but isn’t used (function in WHERE)
- ❌ Result set is too large (no LIMIT)
- ❌ JOIN without indexes on join fields
🟡 Middle Level
Optimization algorithm
Step 1: Find problematic queries
-- pg_stat_statements — the main tool
CREATE EXTENSION pg_stat_statements;
-- Top 10 slowest queries
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
-- Top 10 queries with the most disk reads
SELECT query, shared_blks_read, mean_exec_time
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 10;
Step 2: Analyze the plan
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;
-- Look for:
-- Seq Scan on large tables → need an index
-- Sort → Disk Spill → increase work_mem
-- Nested Loop on large tables → bad plan
-- rows ≠ actual rows → stale statistics
Step 3: Fix
Common problems and solutions
1. No index
-- ❌ Seq Scan on 1M rows
SELECT * FROM users WHERE email = 'test@example.com';
-- ✅ Create an index
CREATE INDEX idx_users_email ON users(email);
2. Function in WHERE kills the index
-- ❌ Index is not used
SELECT * FROM users WHERE LOWER(email) = 'test@example.com';
-- ✅ Functional index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
3. Wrong data type
-- ❌ Implicit cast blocks the index: the numeric literal forces
-- the bigint column to be cast to numeric
SELECT * FROM users WHERE phone = 89001234567.0; -- phone = bigint
-- ✅ Matching type
SELECT * FROM users WHERE phone = 89001234567;
4. Not enough work_mem
-- ❌ Sort on disk
EXPLAIN ANALYZE SELECT * FROM orders ORDER BY created_at DESC;
-- Sort Method: external merge Disk: 5000kB
-- ✅ Increase work_mem
SET work_mem = '256MB';
-- Now: Sort Method: quicksort Memory: 50000kB
5. JOIN without indexes
-- ❌ Hash Join without indexes
SELECT * FROM users u JOIN orders o ON u.id = o.user_id;
-- ✅ Index on the join field
CREATE INDEX idx_orders_user_id ON orders(user_id);
-- Now: Nested Loop + Index Scan → faster
Common mistakes
- Too many indexes
- Every index slows down INSERT/UPDATE
- Remove unused ones (idx_scan = 0 in pg_stat_user_indexes)
- SELECT *
- Select only the columns you need
- Less data → faster transfer
- N+1 problem
-- ❌ 1000 queries in a loop
for user in users:
    SELECT * FROM orders WHERE user_id = user.id
-- ✅ 1 query with JOIN
SELECT u.*, o.* FROM users u LEFT JOIN orders o ON u.id = o.user_id
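The same anti-pattern is easy to reproduce from application code. A minimal sketch using Python's built-in sqlite3 (schema and data are illustrative — the point is the number of round trips, not the database engine):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 20.0), (3, 2, 5.0);
""")

# ❌ N+1: one query per user → 1 + N round trips to the database
users = conn.execute("SELECT id, name FROM users").fetchall()
n_plus_1 = {
    name: conn.execute(
        "SELECT amount FROM orders WHERE user_id = ?", (uid,)
    ).fetchall()
    for uid, name in users
}

# ✅ Single JOIN: one round trip, the database does the matching
rows = conn.execute("""
    SELECT u.name, o.amount
    FROM users u LEFT JOIN orders o ON u.id = o.user_id
""").fetchall()
```

Both produce the same data; the JOIN version issues one query instead of N+1, which is where the latency difference comes from.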
🔴 Senior Level
Systematic approach to optimization
Optimization levels:
1. SQL Level (query)
├── Indexes
├── Query refactoring
└── Statistics
2. Configuration
├── work_mem
├── shared_buffers
├── effective_cache_size
└── random_page_cost
3. Architecture
├── Denormalization
├── Partitioning
├── Materialized Views
└── Read Replicas
4. Infrastructure
├── SSD vs HDD
├── RAM
├── Connection Pooling
└── Huge Pages
Plan Cache problems
Generic vs Custom Plans:
-- Problem: query is fast in psql, slow in the application
-- Cause: JDBC uses Prepared Statements → Generic Plan
-- Diagnostics: reproduce the prepared-statement path
-- (a bare EXPLAIN with $1 fails — the parameter must come from PREPARE)
PREPARE q(text) AS SELECT * FROM users WHERE email = $1;
EXPLAIN ANALYZE EXECUTE q('test@example.com');
-- After the first five executions the planner may switch to a
-- Generic Plan: Seq Scan
-- PostgreSQL 16+: EXPLAIN (GENERIC_PLAN) SELECT ... WHERE email = $1;
-- Solution 1: plan_cache_mode
SET plan_cache_mode = 'force_custom_plan';
-- Solution 2: disable Prepared Statements
-- JDBC URL: jdbc:postgresql://...?prepareThreshold=0
Implicit Casts
-- Example: phone is bigint, but the application sends a numeric value
EXPLAIN (ANALYZE) SELECT * FROM users WHERE phone = 89001234567.0;
-- Filter: ((phone)::numeric = 89001234567.0) ← the COLUMN is cast!
-- → Index on phone is not used
-- Solution: matching types
-- 1. Pass the column's own type
SELECT * FROM users WHERE phone = 89001234567; -- bigint
-- 2. Or change the column type to match what the application sends
ALTER TABLE users ALTER COLUMN phone TYPE numeric;
Function volatility
-- Functions are marked as:
-- IMMUTABLE → always the same result for the same arguments
-- STABLE → same result within a query
-- VOLATILE → can change (default)
-- ❌ VOLATILE function prevents optimization
CREATE OR REPLACE FUNCTION get_status(id int) RETURNS text AS $$
SELECT status FROM users WHERE id = $1;
$$ LANGUAGE sql; -- VOLATILE by default
-- ✅ STABLE allows caching
CREATE OR REPLACE FUNCTION get_status(id int) RETURNS text AS $$
SELECT status FROM users WHERE id = $1;
$$ LANGUAGE sql STABLE;
-- STABLE: within a SINGLE statement the function is assumed to return
-- the same result for the same arguments, so the planner can evaluate it
-- once and use the value in an index condition instead of calling it per row.
Architectural decisions
Denormalization:
-- ❌ JOIN 10 tables for every query
SELECT u.name, o.amount, p.title, c.name as category
FROM users u
JOIN orders o ON u.id = o.user_id
JOIN products p ON o.product_id = p.id
JOIN categories c ON p.category_id = c.id
WHERE ...
-- ✅ Denormalized columns
ALTER TABLE orders ADD COLUMN product_title text;
ALTER TABLE orders ADD COLUMN category_name text;
-- Update via trigger
-- Query: without JOIN → 10x faster
Materialized Views:
-- For complex analytical queries
CREATE MATERIALIZED VIEW mv_sales_summary AS
SELECT
department,
COUNT(*) as orders_count,
SUM(amount) as total_sales,
AVG(amount) as avg_order
FROM orders
GROUP BY department;
-- Refresh periodically
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_sales_summary;
-- Query: instant
SELECT * FROM mv_sales_summary;
Partitioning:
-- For tables > 100M rows
CREATE TABLE orders (
id serial,
created_at date,
amount numeric
) PARTITION BY RANGE (created_at);
CREATE TABLE orders_y2023 PARTITION OF orders
FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
CREATE TABLE orders_y2024 PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
-- Partition Pruning: the planner excludes unnecessary
-- partitions from the plan. Data from them is not even read from disk.
-- WHERE created_at > '2024-01-01' → scans only orders_y2024
System tuning
Parameters for SSD:
-- The planner should trust the cache
-- postgresql.conf — shared_buffers cannot be changed with SET
-- and requires a server restart
random_page_cost = 1.1          -- default 4.0 assumes HDD seek costs
effective_cache_size = '24GB'   -- ~75% of RAM; a planner hint, not an allocation
shared_buffers = '8GB'          -- ~25% of RAM
Connection Pooling:
pgBouncer in transaction mode:
- 1000 app connections → 50 connections to DB
- 30% CPU saved on context switching
- Less memory for sessions
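The setup above can be sketched as a minimal pgbouncer.ini (host, database name, and pool sizes are illustrative — tune them to your CPU count and workload):

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_port = 6432
pool_mode = transaction   ; server connection returned to pool after each transaction
max_client_conn = 1000    ; application-side connections accepted
default_pool_size = 50    ; actual connections held to PostgreSQL
```

Transaction pooling is incompatible with session state such as server-side prepared statements — another reason for the prepareThreshold=0 JDBC setting mentioned earlier.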
Huge Pages:
OS + PostgreSQL:
- Huge Pages = 2MB pages instead of 4KB
- Fewer TLB misses
(TLB — Translation Lookaside Buffer, CPU cache for addresses.
At 4KB for 8GB RAM → 2M TLB entries (doesn't fit).
At 2MB → 4000 entries (fits in TLB))
- +10-20% performance at high loads
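The TLB arithmetic above is easy to check; a quick sketch for 8 GB of mapped memory and the two page sizes:

```python
RAM = 8 * 1024**3                 # 8 GB of mapped memory

pages_4k = RAM // (4 * 1024)      # standard 4 KB pages
pages_2m = RAM // (2 * 1024**2)   # huge 2 MB pages

print(pages_4k)  # 2097152 mappings — far more than a TLB can hold
print(pages_2m)  # 4096 mappings — fits in a typical TLB
```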
Senior developer algorithm
1. pg_stat_statements → find queries with high shared_blks_read
2. EXPLAIN (ANALYZE, BUFFERS) → find the bottleneck node
3. Check locks: pg_blocking_pids()
4. Check statistics: rows ≠ actual → ANALYZE
5. Check work_mem: Disk Spill → increase
6. Check plan_cache_mode: Generic Plan problems
7. Refactoring: denormalization, partitioning, caching
8. If SQL is optimal → sharding (Citus)
When NOT to use architectural decisions
- Denormalization: Do NOT use with frequent UPDATEs of denormalized fields (trigger conflicts, desynchronization)
- Materialized Views: Do NOT use for data that must be up-to-date in real time (lag between REFRESHes)
- Partitioning: Do NOT use for tables < 10M rows — partition management overhead outweighs the benefit
Production Experience
Real scenario #1: Comprehensive optimization
- SaaS: API > 2 seconds (p95)
- Analysis:
- N+1 queries (100 queries per page)
- Seq Scan on a 50M row table
- work_mem = 4MB → spill to disk
- Generic Plan in Hibernate
- Solution:
- DataLoader → 5 queries instead of 100
- Index on hot fields
- work_mem = 256MB
- plan_cache_mode = force_custom_plan
- Result: p95 < 200ms (10x faster)
Real scenario #2: Architectural changes
- Analytics: report 5 minutes (JOIN 12 tables)
- Solution:
- Materialized View with REFRESH every 5 minutes
- Denormalized columns for hot fields
- Partitioning by date
- Result: report < 1 second
Monitoring Dashboard
-- 1. Top slow queries
SELECT
query,
calls,
mean_exec_time,
total_exec_time,
shared_blks_read,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
-- 2. Locks
SELECT
blocked_locks.pid AS blocked_pid,
blocked_activity.query AS blocked_query,
blocking_locks.pid AS blocking_pid,
blocking_activity.query AS blocking_query
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks
  ON blocking_locks.locktype = blocked_locks.locktype
  AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
  AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
  AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted AND blocking_locks.granted;
-- Simplified: the full version on the PostgreSQL wiki also matches
-- page, tuple, transactionid and the other lock identity columns
-- 3. Bloat
SELECT
relname,
n_dead_tup,
n_live_tup,
ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) as dead_pct
FROM pg_stat_user_tables
WHERE n_live_tup > 10000
ORDER BY dead_pct DESC;
-- 4. Cache Hit Rate
SELECT
sum(heap_blks_read) as heap_read,
sum(heap_blks_hit) as heap_hit,
ROUND(100.0 * sum(heap_blks_hit) / NULLIF(sum(heap_blks_read) + sum(heap_blks_hit), 0), 2) as cache_hit_pct
FROM pg_statio_user_tables;
-- > 99% → great
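The same computation the Cache Hit Rate query performs, sketched in Python with illustrative counter values (the real numbers come from pg_statio_user_tables):

```python
def cache_hit_pct(heap_read: int, heap_hit: int) -> float:
    """Share of block requests served from shared_buffers rather than disk."""
    total = heap_read + heap_hit
    if total == 0:
        return 0.0
    return round(100.0 * heap_hit / total, 2)

# Illustrative counters
print(cache_hit_pct(1_000, 999_000))   # 99.9 → healthy
print(cache_hit_pct(80_000, 920_000))  # 92.0 → investigate
```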
Best Practices
- Monitor via pg_stat_statements constantly
- EXPLAIN (ANALYZE, BUFFERS) for every problematic query
- Indexes — the first solution, but not the only one
- Control work_mem during Sort/Hash
- Check plan_cache_mode when using ORM
- Denormalization for hot-path queries
- Materialized Views for analytics
- pgBouncer recommended at > 50 connections. For small projects (< 10 users) you can do without it.
- Partitioning for tables > 100M rows
- Cache Hit Ratio > 99% — the goal
Summary for Senior
- Optimization is not just indexes, it’s a systematic process
- Plan Cache (Generic vs Custom) — a common cause of prod/dev discrepancies
- Implicit Casts block indexes → use matching types
- Denormalization for hot-path, MV for analytics
- Control work_mem → Disk Spill = disaster
- pgBouncer in transaction mode → 30% CPU saved
- Cache Hit Ratio > 99% → the main DB health metric
- Partitioning + Partition Pruning for large tables
- Write Amplification: every INSERT/UPDATE writes not only to the table but also to ALL indexes. 10 indexes = 10x writes. Physically 10x more data is written than the user changed.
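The write-amplification point can be made concrete with a rough model; the entry sizes below are illustrative and WAL/page overhead is ignored, so the exact multiplier depends on row and index-entry sizes:

```python
def bytes_written(row_bytes: int, n_indexes: int, index_entry_bytes: int = 40) -> int:
    """Rough physical bytes per logical row write: the heap tuple
    plus one entry per index (illustrative sizes, no WAL/page overhead)."""
    return row_bytes + n_indexes * index_entry_bytes

print(bytes_written(100, 0))   # heap only
print(bytes_written(100, 10))  # 10 indexes multiply the physical write volume
```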
🎯 Interview Cheat Sheet
Must know:
- Algorithm: pg_stat_statements → EXPLAIN (ANALYZE, BUFFERS) → fix → verify
- Common problems: no index, function in WHERE, implicit cast, disk spill, N+1
- Generic vs Custom Plans: Prepared Statements → plan_cache_mode = force_custom_plan
- Implicit Casts: e.g. a numeric parameter against a bigint column → the column is cast → index not used
- Function volatility: IMMUTABLE, STABLE, VOLATILE → affects optimization
- Denormalization: hot columns → trigger for updates
- Materialized Views: complex analytical queries → REFRESH CONCURRENTLY
- Partitioning: tables > 100M rows → Partition Pruning
- Connection Pooling: pgBouncer in transaction mode → 1000 connections → 50 to DB
- System tuning: random_page_cost = 1.1 (SSD), shared_buffers = 25% RAM
- Cache Hit Ratio > 99% — the goal; < 95% → problem
Frequent follow-up questions:
- “How to find slow queries?” → pg_stat_statements ORDER BY total_exec_time DESC
- “Why is a query fast in console but slow in the application?” → Generic Plan
- “What to do on Disk Spill?” → SET work_mem = '256MB' for the session
- “When is denormalization justified?” → Hot-path queries, JOIN 10+ tables
Red flags (do NOT say):
- ❌ “Optimization = only indexes” (systematic process: SQL, config, architecture, infrastructure)
- ❌ “plan_cache_mode doesn’t matter” (Generic Plan — a common cause of discrepancies)
- ❌ “Materialized Views for OLTP” (only for analytics!)
- ❌ “pgBouncer isn’t needed” (1000 connections = 30% CPU on context switching)
Related topics:
- [[Why are indexes needed]] → the first solution for optimization
- [[What is explain plan]] → EXPLAIN (ANALYZE, BUFFERS)
- [[Why is ANALYZE needed]] → statistics for the planner
- [[How does MVCC work in PostgreSQL]] → Bloat and long transactions