What does ROW_NUMBER() do?

🟢 Junior Level

ROW_NUMBER() — a window function that assigns a unique sequential number to each row.

Simple analogy: Like page numbering in a book or a queue at a store — everyone gets their own unique number in order.

Example:

-- Number all orders by date
SELECT
    id,
    amount,
    created_at,
    ROW_NUMBER() OVER(ORDER BY created_at DESC) as row_num
FROM orders;

-- Result:
-- id | amount | created_at | row_num
-- 5  | 100    | 2024-04-01 | 1
-- 4  | 200    | 2024-03-15 | 2
-- 3  | 150    | 2024-03-01 | 3

Common uses:

Row numbering in reports
Pagination
Finding the first/last record in a group
Removing duplicates

🟡 Middle Level

Syntax

ROW_NUMBER() OVER(
    PARTITION BY column1  -- Dividing into groups (optional)
    ORDER BY column2      -- Sorting within the group (required!)
)

Common patterns

1. Top-N for each group:

-- Latest 3 orders for each user
SELECT * FROM (
    SELECT
        user_id,
        amount,
        created_at,
        ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY created_at DESC) as rn
    FROM orders
) sub
WHERE rn <= 3;

2. Deduplication:

-- Keep only the latest version of each record
WITH ranked AS (
    SELECT id, email, created_at,
           ROW_NUMBER() OVER(PARTITION BY email ORDER BY created_at DESC) as rn
    FROM users
)
DELETE FROM users WHERE id IN (SELECT id FROM ranked WHERE rn > 1);

3. Pagination:

-- Page 3, 20 records per page
SELECT * FROM (
    SELECT id, name,
           ROW_NUMBER() OVER(ORDER BY id) as rn
    FROM users
) sub
WHERE rn BETWEEN 41 AND 60;

Nondeterminism — an important trap!

-- ❌ PROBLEM: if there are users with the same created_at
SELECT id, name, created_at,
       ROW_NUMBER() OVER(ORDER BY created_at DESC) as rn
FROM users;
-- The order of rows with the same created_at is NOT guaranteed!

-- ✅ SOLUTION: add a unique column
SELECT id, name, created_at,
       ROW_NUMBER() OVER(ORDER BY created_at DESC, id DESC) as rn
FROM users;
-- id guarantees a unique order

ROW_NUMBER vs COUNT(*) OVER()

-- ROW_NUMBER() — cheap (just increments a counter)
SELECT id, name, ROW_NUMBER() OVER() as rn FROM users LIMIT 10;

-- COUNT(*) OVER() — expensive (reads ALL rows in the partition)
SELECT id, name, COUNT(*) OVER() as total FROM users LIMIT 10;
-- ⚠️ Despite LIMIT 10, the database will read ALL users!

Common mistakes

Filtering in the same query

-- ❌ ERROR: WHERE can't see rn
SELECT id, ROW_NUMBER() OVER(ORDER BY id) as rn
FROM users
WHERE rn <= 10;  -- Error: column "rn" does not exist

-- ✅ Subquery or CTE
SELECT * FROM (
    SELECT id, ROW_NUMBER() OVER(ORDER BY id) as rn FROM users
) sub WHERE rn <= 10;

Forgot ORDER BY

-- ❌ Syntax error
ROW_NUMBER() OVER(PARTITION BY user_id)

-- ✅ ORDER BY is required
ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY created_at)

Comparison with alternatives

Function	Uniqueness	Handling equal values
ROW_NUMBER()	✅ Always	Arbitrarily numbers them
RANK()	❌ No	Equal values = same rank, then skip
DENSE_RANK()	❌ No	Equal values = same rank, no skips

When NOT to use ROW_NUMBER

For aggregations (COUNT, SUM, AVG) — use window aggregate functions instead
For ranking with ties — use RANK or DENSE_RANK instead
For API pagination on large tables (> 1M rows) — use Keyset Pagination (WHERE id > ?)

🔴 Senior Level

Physical optimization: eliminating Sort

Key insight: If the data is already sorted, PostgreSQL skips the Sort phase!

-- Index on (user_id, created_at DESC)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at DESC);

-- Query:
SELECT user_id, amount, created_at,
       ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY created_at DESC) as rn
FROM orders;

-- Plan: WindowAgg → Index Scan (NO Sort!)
-- WindowAgg — a PostgreSQL execution plan node that computes the window
-- function. When you see `WindowAgg → Index Scan` — data is read
-- already in the correct order and the Sort phase is skipped.
-- → Data is read from the index already in the correct order
-- → 10-50x faster on large tables!

Monitoring:

EXPLAIN SELECT ... ROW_NUMBER() OVER(...);
-- Look for: WindowAgg → Index Scan → great!
-- Look for: WindowAgg → Sort → can be optimized with an index

Bounded Heap Sort (Top-N optimization)

When ROW_NUMBER() is used with LIMIT:

SELECT * FROM (
    SELECT id, amount,
           ROW_NUMBER() OVER(ORDER BY amount DESC) as rn
    FROM orders
) sub WHERE rn <= 10;

-- PostgreSQL applies Bounded Heap Sort:
-- → Does NOT sort the entire table
-- → Keeps only the top-10 elements in memory (as if you picked
--   top-10 candidates without sorting all 1000 resumes)
-- → Evicts extras as it processes
-- → Memory: O(k) instead of O(n), where k = LIMIT

Incremental Sort (PG 13+)

-- Index on user_id
CREATE INDEX idx_orders_user ON orders(user_id);

-- Query with PARTITION BY
SELECT user_id, created_at,
       ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY created_at DESC) as rn
FROM orders;

-- PG 13+: Incremental Sort
-- Data is already grouped by user_id (from the index)
-- → Only sorts by created_at within each group
-- → Much cheaper than a full sort!
-- On PG 12 and below — full sort of all data, even if part is already sorted.

Keyset Pagination (better than ROW_NUMBER)

-- ❌ ROW_NUMBER pagination (slow on large OFFSET)
SELECT * FROM (
    SELECT id, name, ROW_NUMBER() OVER(ORDER BY id) as rn
    FROM users
) sub WHERE rn BETWEEN 100001 AND 100020;

-- ✅ Keyset Pagination (fast, uses index)
SELECT id, name FROM users
WHERE id > 100000  -- Last id from the previous page
ORDER BY id
LIMIT 20;
-- → Index Scan, O(log n + k) instead of O(n)
-- ROW_NUMBER requires numbering ALL rows up to the desired OFFSET,
-- while Keyset Pagination jumps directly to the right spot via B-tree index.
-- For page 50,000: ROW_NUMBER processes 100,000 rows, Keyset — only 20.

When to use Keyset:

Infinite scroll
API pagination
Large tables (> 1M rows)

When ROW_NUMBER is OK:

Arbitrary pages (jump to page 57)
Small tables (< 100,000 rows)
Reports with fixed numbering

Determinism and stability

-- Always make ORDER BY deterministic!
-- Bad:
ROW_NUMBER() OVER(ORDER BY created_at)

-- Good (add PK):
ROW_NUMBER() OVER(ORDER BY created_at DESC, id DESC)

-- Why?
-- If 2 records have the same created_at
-- → The order between them is undefined
-- → Result may change from run to run
-- → Bugs in production!

Edge Cases

PARTITION BY with a single element

-- If a group has 1 row → rn = 1 always
SELECT user_id, ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY id) as rn
FROM orders;
-- For users with 1 order: rn = 1
-- For users with N orders: rn = 1, 2, ..., N

NULL in ORDER BY

-- NULL defaults to LAST in DESC
SELECT id, priority,
       ROW_NUMBER() OVER(ORDER BY priority DESC) as rn
FROM tasks;

-- Explicit NULL management
SELECT id, priority,
       ROW_NUMBER() OVER(ORDER BY priority DESC NULLS LAST) as rn
FROM tasks;

ROW_NUMBER() without PARTITION BY

-- Numbering ALL rows
SELECT id, name, ROW_NUMBER() OVER(ORDER BY id) as rn
FROM users;
-- rn = 1, 2, 3, ..., N

Performance Tuning

Memory for Sort:

-- If Sort doesn't fit in work_mem → spill to disk
EXPLAIN (ANALYZE) SELECT ... ROW_NUMBER() OVER(ORDER BY ...);
-- "external merge Disk: X kB" → increase work_mem

-- Temporarily increase for the query
SET work_mem = '256MB';
-- ... query ...
RESET work_mem;

Parallelism:

-- WindowAgg does NOT parallelize in PG < 14
-- PG 14+: may use Parallel Gather before WindowAgg
-- But WindowAgg itself runs in a single process

-- Workaround for large data:
-- Table partitioning + UNION ALL

Production Experience

Real scenario #1: Deduplicating 10M records

CRM: duplicate contacts (same email)
Query with ROW_NUMBER + DELETE: 3 minutes
Solution:
- Index on (email, created_at DESC)
- Batch DELETE in chunks of 10,000 records
Result: 15 seconds

Real scenario #2: API Pagination

E-commerce API: ROW_NUMBER pagination on 10M products
Page 50,000: 8 seconds
Solution: Keyset Pagination (WHERE id > ?)
Result: 50ms (160x faster)

Monitoring

-- 1. Check the execution plan
EXPLAIN (ANALYZE, BUFFERS)
SELECT ... ROW_NUMBER() OVER(...);
-- WindowAgg → Sort → can be optimized with an index
-- WindowAgg → Index Scan → great!

-- 2. Check Sort memory
EXPLAIN (ANALYZE)
SELECT ... ROW_NUMBER() OVER(ORDER BY ...);
-- "Disk: X kB" → spill! Increase work_mem

-- 3. Find queries with ROW_NUMBER
SELECT query, mean_exec_time
FROM pg_stat_statements
WHERE query LIKE '%ROW_NUMBER()%'
ORDER BY mean_exec_time DESC;

Best Practices

Always add PK to ORDER BY for determinism
Index on (PARTITION BY, ORDER BY) → no Sort needed
Keyset Pagination for large tables
ROW_NUMBER only for reports/page-by-number pagination
Bounded Heap Sort automatic with LIMIT
Incremental Sort (PG 13+) with indexes
Subquery/CTE for filtering by rn

Summary for Senior

ROW_NUMBER() — unique numbering, always deterministic with PK
Index on (PARTITION BY, ORDER BY) → WindowAgg without Sort
Bounded Heap Sort with LIMIT → O(k) memory
Keyset Pagination > ROW_NUMBER for API and infinite scroll
Incremental Sort (PG 13+) saves time with partial sorting
Determinism: always add PK to ORDER BY
COUNT(*) OVER() is expensive — reads ALL partition rows, don’t use with LIMIT
Filtering only in subquery/CTE

🎯 Interview Cheat Sheet

Must know:

ROW_NUMBER() — unique numbering of every row (1, 2, 3, …)
PARTITION BY — divides into groups, ORDER BY — sorting within (required!)
Determinism: always add PK to ORDER BY (ORDER BY created_at DESC, id DESC)
Filtering by rn → ONLY in subquery/CTE (window functions are computed AFTER WHERE)
Bounded Heap Sort: with LIMIT — O(k) memory instead of O(n)
Keyset Pagination (WHERE id > last_id) > ROW_NUMBER for API/infinite scroll
Index on (PARTITION BY, ORDER BY) → WindowAgg without Sort
Incremental Sort (PG 13+): data partially sorted → faster
COUNT(*) OVER() is expensive — reads ALL partition rows, don’t use with LIMIT
Deduplication: ROW_NUMBER + DELETE WHERE rn > 1

Frequent follow-up questions:

“Why is ROW_NUMBER pagination slow on large OFFSET?” → Sorts all rows up to OFFSET
“What is Keyset Pagination?” → WHERE id > last_id ORDER BY id LIMIT 20 → Index Scan
“Why is PK needed in ORDER BY?” → Without PK, order among equal values is undefined
“When is COUNT(*) OVER() a problem?” → Despite LIMIT, it reads ALL rows

Red flags (do NOT say):

❌ “ROW_NUMBER can be filtered in WHERE” (no, computed after WHERE)
❌ “Keyset Pagination is the same as ROW_NUMBER” (Keyset = Index Scan, ROW_NUMBER = Sort)
❌ “COUNT(*) OVER() OK with LIMIT” (reads ALL rows!)

Related topics:

[[What does RANK() and DENSE_RANK() do]] → differences from ROW_NUMBER
[[What are window functions]] → general overview
[[What is explain plan]] → checking WindowAgg → Sort/Index Scan plan