Difference between INNER JOIN and LEFT JOIN?
This gives maximum freedom for choosing the optimal plan.
🟢 Junior Level
The main difference: INNER JOIN returns only rows with matches in both tables, while LEFT JOIN returns all rows from the left table, even if there are no matches.
Simple example:
-- Table users: Ivanov, Petrov, Sidorov
-- Table orders: Ivanov (order), Petrov (order)
-- INNER JOIN: only users with orders
SELECT u.name, o.amount
FROM users u
INNER JOIN orders o ON u.id = o.user_id;
-- Result: Ivanov, Petrov
-- LEFT JOIN: all users, even without orders
SELECT u.name, o.amount
FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
-- Result: Ivanov (100), Petrov (200), Sidorov (NULL)
When to use:
- INNER JOIN — when you only need related data
- LEFT JOIN — when you need all records from the main table + possible related ones
🟡 Middle Level
Join Reordering freedom
INNER JOIN — associative and commutative:
-- The planner can join tables in any order:
(A JOIN B) JOIN C = A JOIN (B JOIN C) = (A JOIN C) JOIN B
This gives maximum freedom for choosing the optimal plan.
LEFT JOIN — NOT associative:
-- Order is strictly constrained:
(A LEFT JOIN B) LEFT JOIN C ≠ A LEFT JOIN (B LEFT JOIN C)
The planner is restricted in reordering → may choose a suboptimal plan.
Physical strategies (how PostgreSQL performs JOINs)
PostgreSQL uses three algorithms for physical JOIN execution:
- Nested Loop — for each row of the outer table, looks for matches in the inner table (fast if there’s an index)
- Hash Join — builds a hash table on the smaller table and looks for matches (fast for large tables without an index)
- Merge Join — both tables are sorted and merged like two sorted arrays (fast if data is already sorted)
Physical strategies comparison
| Metric | INNER JOIN | LEFT JOIN |
|---|---|---|
| Hash Join | Hash built on smaller table | Hash, typically, on the right table |
| Nested Loop | Either table can be outer | Left table is typically outer |
| Merge Join | Both tables can be sorted | Left table determines the order |
The WHERE trap (Outer-to-Inner Transformation)
PostgreSQL can convert LEFT JOIN to INNER JOIN “on the fly”:
-- ❌ LEFT JOIN turned into INNER JOIN!
SELECT u.name, o.amount
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100;
-- NULL > 100 = UNKNOWN → all users without orders are filtered out!
-- ✅ Correct: filter in ON
SELECT u.name, o.amount
FROM users u
LEFT JOIN orders o ON u.id = o.user_id AND o.amount > 100;
-- All users are preserved, orders are filtered
-- ✅ Or filter on IS NULL
SELECT u.name
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.id IS NULL; -- Users WITHOUT orders (Anti-Join)
Common mistakes
- Confusion with NULL
-- LEFT JOIN returns NULL for missing data SELECT u.name, SUM(o.amount) FROM users u LEFT JOIN orders o ON u.id = o.user_id GROUP BY u.name; -- NULL + number = NULL! -- Solution: COALESCE SELECT u.name, COALESCE(SUM(o.amount), 0) as total FROM users u LEFT JOIN orders o ON u.id = o.user_id GROUP BY u.name; - Row duplication
-- ❌ If a user has 5 orders, they appear 5 times! SELECT u.name, u.email FROM users u LEFT JOIN orders o ON u.id = o.user_id; -- ✅ Group or use EXISTS SELECT DISTINCT u.name, u.email -- or GROUP BY FROM users u LEFT JOIN orders o ON u.id = o.user_id; - LEFT JOIN where INNER is needed
-- ❌ LEFT JOIN is slower (less freedom for the planner) SELECT u.name, o.amount FROM users u LEFT JOIN orders o ON u.id = o.user_id WHERE o.amount IS NOT NULL; -- Essentially an INNER JOIN -- ✅ Use INNER JOIN SELECT u.name, o.amount FROM users u INNER JOIN orders o ON u.id = o.user_id;
Practical comparison
-- INNER JOIN: only orders with customers
SELECT u.name, COUNT(o.id) as orders_count
FROM users u
INNER JOIN orders o ON u.id = o.user_id
GROUP BY u.name;
-- Only customers with orders
-- LEFT JOIN: all customers + statistics
SELECT u.name, COUNT(o.id) as orders_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.name;
-- All customers, some with orders_count = 0
🔴 Senior Level
When NOT to use LEFT JOIN
- When data is guaranteed to exist in both tables — use INNER JOIN, it gives the optimizer more freedom
- When ORM generates LEFT JOINs “just in case” — check if you actually need data from the right table; if not, Outer Join Removal will remove the JOIN, but it’s better not to generate it in the first place
- Cascades of 5+ LEFT JOINs — the planner is restricted in reordering, may choose a catastrophically bad plan
Join Reordering: Deep analysis
INNER JOIN gives the optimizer full freedom:
Query: A JOIN B JOIN C JOIN D (4 tables)
Possible orders: 4! = 24 variants
The planner picks the cheapest one.
LEFT JOIN restricts:
Query: A LEFT JOIN B LEFT JOIN C LEFT JOIN D
Possible orders: only 1 (A → B → C → D)
The planner can only choose physical strategies.
Performance impact:
- On complex schemas (10+ tables), chains of LEFT JOINs can lead to a catastrophically bad plan
- Solution: Use INNER JOIN where possible, LEFT JOIN only where semantically necessary
Outer Join Removal (PG 10+)
Powerful optimization — PostgreSQL removes unnecessary LEFT JOINs:
Removal conditions:
- You don’t select any columns from the right table
- The key in the right table is UNIQUE or PRIMARY KEY
-- LEFT JOIN will be removed from the plan!
SELECT u.name FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
-- orders is not used in SELECT + user_id is UNIQUE → removal
-- Check via EXPLAIN
EXPLAIN SELECT u.name FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
-- Seq Scan on users ← orders is NOT in the plan!
Why this happens:
- Often ORMs or generic Views generate LEFT JOINs “just in case”
- The optimizer removes unnecessary work
Outer-to-Inner Transformation
The optimizer converts LEFT JOIN to INNER JOIN if WHERE makes NULLs impossible:
-- Original query
SELECT u.name, o.amount
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100;
-- After transformation (internally)
SELECT u.name, o.amount
FROM users u
INNER JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100;
-- Why? NULL > 100 = UNKNOWN → rows with NULL would be filtered anyway
-- So LEFT JOIN is semantically equivalent to INNER JOIN
Check:
-- See transformations
EXPLAIN (VERBOSE)
SELECT u.name, o.amount
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100;
-- In the output you'll see: "Join Type: Inner"
Cardinality Estimation
INNER JOIN:
rows = size(A) × size(B) × selectivity(join_condition)
Example:
- users: 1,000,000 rows
- orders: 10,000,000 rows
- selectivity: 0.001 (0.1%)
rows = 1,000,000 × 10,000,000 × 0.001 = 10,000,000,000 × 0.001 = 10,000,000
⚠️ A statistics error can lead to estimates 10x higher/lower!
LEFT JOIN:
rows >= size(left table)
The planner knows the result will be at least as large as the left table.
This makes plans more predictable but less flexible.
LEFT JOIN LATERAL
A powerful tool for complex sampling:
-- "Find top 3 orders for each user"
SELECT u.name, top_orders.amount, top_orders.created_at
FROM users u
LEFT JOIN LATERAL (
SELECT amount, created_at
FROM orders o
WHERE o.user_id = u.id
ORDER BY created_at DESC
LIMIT 3
) top_orders ON true;
-- Physical plan: Nested Loop
-- For each user → 3 orders from the index
-- Great for users < 10,000
-- Slow for users > 1,000,000
Alternative for large data:
-- Window Function instead of LATERAL
SELECT name, amount, created_at
FROM (
SELECT u.name, o.amount, o.created_at,
ROW_NUMBER() OVER(PARTITION BY u.id ORDER BY o.created_at DESC) as rn
FROM users u
INNER JOIN orders o ON u.id = o.user_id
) sub
WHERE rn <= 3;
-- Plan: Hash Join + WindowAgg
-- Much faster for large tables!
Edge Cases
- Mixing INNER and LEFT JOIN
-- Order matters! SELECT * FROM users u LEFT JOIN orders o ON u.id = o.user_id INNER JOIN products p ON o.product_id = p.id; -- This is NOT the same as: SELECT * FROM users u INNER JOIN orders o ON u.id = o.user_id LEFT JOIN products p ON o.product_id = p.id; -- First: all users, orders (if any), products (if there's an order) -- Second: only users with orders, products (if any) - Multiple LEFT JOINs
-- A cascade of LEFT JOINs can give unexpected results SELECT u.name, o.id as order_id, p.id as product_id FROM users u LEFT JOIN orders o ON u.id = o.user_id LEFT JOIN products p ON o.product_id = p.id; -- If user has no orders → order_id = NULL, product_id = NULL -- If there's an order but no product → order_id = 123, product_id = NULL - Coalescing with NULL
-- Aggregations with LEFT JOIN SELECT u.name, COUNT(o.id) as orders_count, -- 0 if no orders SUM(o.amount) as total_amount, -- NULL if no orders! AVG(o.amount) as avg_amount -- NULL if no orders! FROM users u LEFT JOIN orders o ON u.id = o.user_id GROUP BY u.name; -- Solution: COALESCE SELECT u.name, COUNT(o.id) as orders_count, COALESCE(SUM(o.amount), 0) as total_amount, COALESCE(AVG(o.amount), 0) as avg_amount FROM users u LEFT JOIN orders o ON u.id = o.user_id GROUP BY u.name;
Performance Comparison
| Scenario | INNER JOIN | LEFT JOIN |
|---|---|---|
| Planning | Faster (more options) | Slower (restrictions) |
| Nested Loop | Either table can be driving | Left table is always driving |
| Hash Join | Flexible building | Right table is always built |
| Merge Join | Both tables sorted | Left table determines order |
| Index Usage | Flexible | Depends on direction |
Production Experience
Real scenario #1: LEFT JOIN kills performance
- Report: 15 tables, 12 of them via LEFT JOIN “just in case”
- Problem: Query 45 seconds instead of expected 2
- EXPLAIN: Nested Loop with left table always driving → catastrophe
- Solution:
- Replaced 8 LEFT JOINs with INNER JOINs (data is always there)
- Split into 2 CTEs
- Result: Query < 3 seconds
Real scenario #2: Outer Join Removal didn’t work
- API: LEFT JOIN on a View with a UNIQUE constraint
- Expected: JOIN removal by the optimizer
- Reality: constraint was DEFERRABLE → optimizer not sure about UNIQUE
- Solution: Replaced DEFERRABLE with NOT DEFERRABLE → removal worked
- Result: Query sped up 5x
Monitoring
-- 1. Check plans with LEFT JOIN
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
-- 2. Check for Outer Join Removal
EXPLAIN (VERBOSE)
SELECT u.name FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
-- Look for "Join Type: Inner" or absence of orders in the plan
-- 3. Check Outer-to-Inner transformation
EXPLAIN (VERBOSE)
SELECT u.name, o.amount
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100;
-- "Join Type: Inner" → transformation occurred
-- 4. JOIN statistics in pg_stat_statements
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
WHERE query LIKE '%LEFT JOIN%'
ORDER BY mean_exec_time DESC
LIMIT 10;
Best Practices
- Use INNER JOIN where semantically possible
- LEFT JOIN only where needed to preserve all records from the left table
- Filters on the right table go in
ON, not inWHERE - Check the plan via
EXPLAIN (VERBOSE)for JOIN removal - Avoid LEFT JOIN cascades > 5 — the planner is restricted
- LATERAL JOIN only for small outer sets
- COALESCE for aggregations with LEFT JOIN
- DISTINCT or GROUP BY with LEFT JOIN to avoid duplication
Summary for Senior
- INNER JOIN = maximum freedom for the optimizer
- LEFT JOIN = restrictions on Join Reordering
- Outer Join Removal (PG 10+) can remove unnecessary LEFT JOINs
- Outer-to-Inner Transformation happens when filtering on NULL
- LEFT JOIN LATERAL ≈ Nested Loop → only for small outer sets
- Filters on the right table → in ON, not in WHERE
- COALESCE for SUM/AVG with LEFT JOIN (NULL → 0)
- Always check the plan via
EXPLAIN (VERBOSE, ANALYZE)
🎯 Interview Cheat Sheet
Must know:
- INNER JOIN: associative and commutative → optimizer can reorder tables
- LEFT JOIN: NOT associative → order is strictly constrained
- Outer Join Removal (PG 10+): removes LEFT JOIN if the right table is not used in SELECT
- Outer-to-Inner Transformation: WHERE on the right table turns LEFT JOIN into INNER
- Hash Join: for LEFT JOIN the hash is ALWAYS on the right table (for INNER — flexible)
- Nested Loop: for LEFT JOIN the left table is ALWAYS driving
- Filters on the right table → in ON (preserves NULLs), not in WHERE (filters NULLs)
- COALESCE(SUM(…), 0) — SUM with LEFT JOIN returns NULL when no data
Common follow-up questions:
- “Why is LEFT JOIN slower than INNER JOIN?” → Less freedom for Join Reordering
- “When won’t Outer Join Removal work?” → DEFERRABLE constraint, VIEW
- “What happens if a filter is in WHERE instead of ON for LEFT JOIN?” → Filters out NULL rows
- “When is LATERAL JOIN a catastrophe?” → Outer table > 10,000 rows
Red flags (DO NOT say):
- ❌ “LEFT JOIN is always safer” (it’s slower and restricted for the optimizer)
- ❌ “A filter on the right table can go in WHERE” (turns LEFT JOIN into INNER!)
- ❌ “SUM with LEFT JOIN returns 0” (returns NULL! Need COALESCE)
Related topics:
- [[What JOIN types exist]] → physical execution strategies
- [[What is better JOIN or subquery]] → when EXISTS is better than JOIN
- [[What is a correlated subquery]] → EXISTS as an alternative