What Are Window Functions?

🟢 Junior Level

Window functions are functions that perform computations over a set of rows related to the current row, without collapsing the result into a single row (like GROUP BY does).

Main advantage: you get both the details of each row and aggregates — in a single query. With GROUP BY you would need a subquery or JOIN. With window functions: one query, one pass over the data.

Simple analogy: GROUP BY is when you put all apples into one pile and count them. A window function is when for each apple you write how many total apples are in its variety, but the apple itself stays in place.

Example:

-- For each order, show its amount and the user's average
SELECT
    user_id,
    amount,
    AVG(amount) OVER(PARTITION BY user_id) as user_avg
FROM orders;

-- Result:
-- user_id | amount | user_avg
-- 1       | 100    | 150      ← average for user 1
-- 1       | 200    | 150      ← same average, row didn't collapse
-- 2       | 300    | 250      ← average for user 2
-- 2       | 200    | 250

Main window functions:

ROW_NUMBER() — row number
RANK(), DENSE_RANK() — rank
SUM() OVER(), AVG() OVER() — aggregates without GROUP BY
LAG(), LEAD() — access to previous/next row

🟡 Middle Level

Anatomy of a Window Function

FUNCTION() OVER (
    PARTITION BY column1, column2  -- Division into groups (like GROUP BY)
    ORDER BY column3               -- Sorting within group
    ROWS BETWEEN ... AND ...       -- Frame boundaries
)

Key differences from GROUP BY:

GROUP BY	Window Functions
Collapses rows	Preserves rows
One row per group	Each row + computation
Cannot mix details and aggregate	You can!

Types of Window Functions

1. Aggregate:

SELECT
    name,
    salary,
    department,
    AVG(salary) OVER(PARTITION BY department) as dept_avg,
    SUM(salary) OVER(PARTITION BY department) as dept_total,
    COUNT(*) OVER(PARTITION BY department) as dept_count
FROM employees;

2. Ranking:

SELECT
    name,
    salary,
    ROW_NUMBER() OVER(ORDER BY salary DESC) as row_num,
    RANK() OVER(ORDER BY salary DESC) as rank,
    DENSE_RANK() OVER(ORDER BY salary DESC) as dense_rank
FROM employees;

3. Offset:

-- Comparison with previous row
SELECT
    month,
    revenue,
    LAG(revenue, 1) OVER(ORDER BY month) as prev_month,
    LEAD(revenue, 1) OVER(ORDER BY month) as next_month,
    revenue - LAG(revenue, 1) OVER(ORDER BY month) as growth
FROM monthly_sales;

ROWS vs RANGE vs GROUPS

-- ROWS: physical rows (fastest)
SELECT
    date,
    amount,
    AVG(amount) OVER(ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as moving_avg_7d
FROM sales;

-- RANGE: logical values (slower, but more accurate)
SELECT
    date,
    amount,
    SUM(amount) OVER(ORDER BY date RANGE BETWEEN INTERVAL '7 days' PRECEDING AND CURRENT ROW) as weekly_sum
FROM sales;

Key difference with an example: If the date column has duplicates (two orders on the same day), then ROWS BETWEEN 6 PRECEDING AND CURRENT ROW takes exactly 6 previous ROWS (ignoring dates), while RANGE BETWEEN ... 7 days PRECEDING takes all rows from the last 7 days (even if there are 20 of them). ROWS = 6 rows, RANGE = 7 days.

Common Mistakes

Filtering by window result in the same query

-- ❌ ERROR: windows are computed AFTER WHERE
-- SQL execution order:
-- FROM → WHERE → GROUP BY → HAVING → SELECT (including window functions) → ORDER BY → LIMIT
-- Since WHERE executes BEFORE SELECT, it can't see the window function result.
-- Therefore a wrapper is needed: inner query computes the window, outer — filters.
SELECT * FROM (
    SELECT user_id, amount,
           ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY created_at DESC) as rn
    FROM orders
) sub
WHERE rn = 1;  -- ✅ Correct: in subquery

Mixing different PARTITION BY

-- ❌ Expensive: two different sorts
SELECT
    AVG(amount) OVER(PARTITION BY user_id) as user_avg,
    SUM(amount) OVER(PARTITION BY department) as dept_total
FROM orders;
-- → Two sort phases!

COUNT(*) OVER() on large data

-- ❌ Forces reading ALL rows of the partition
SELECT id, name, COUNT(*) OVER() as total FROM users LIMIT 10;

-- ✅ Better as separate query for total
SELECT COUNT(*) FROM users;  -- Once
SELECT id, name FROM users LIMIT 10;  -- Separately

Practical Examples

Running Total:

SELECT
    date,
    amount,
    SUM(amount) OVER(ORDER BY date ROWS UNBOUNDED PRECEDING) as running_total
FROM transactions;

Percentage of Group:

SELECT
    name,
    salary,
    department,
    ROUND(100.0 * salary / SUM(salary) OVER(PARTITION BY department), 2) as pct_of_dept
FROM employees;

🔴 Senior Level

When NOT to Use Window Functions

When you need a single aggregate (total sum, average) — use GROUP BY, it’s simpler and faster
When you do COUNT(*) OVER() only for pagination — a separate SELECT COUNT(*) is faster, as it doesn’t read all rows
When there are many different PARTITION BY in one query — each unique combination = a separate sort phase, which kills performance

Execution Order and Optimizations

SQL computation order:

FROM / JOIN
WHERE
GROUP BY
HAVING
SELECT / Window Functions  ← Windows are computed HERE
DISTINCT
ORDER BY
LIMIT

Important: Windows are computed AFTER WHERE, GROUP BY, and HAVING. Therefore, you cannot filter by a window result in the same query.

ROWS vs RANGE vs GROUPS: Deep Dive

ROWS:

-- Physical rows: 6 PRECEDING + CURRENT
AVG(amount) OVER(ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)

-- Fast: simply takes N rows
-- Problem: if there are duplicates in ORDER BY → non-deterministic

RANGE:

-- Logical values: all rows with the same value
SUM(amount) OVER(ORDER BY amount RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)

-- Slower: checks values, not just positions
-- Needed for correct handling of duplicates

GROUPS (PG 11+):

-- Groups of rows with the same value
AVG(amount) OVER(ORDER BY date GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING)

-- Groups "peer" rows together

On PG 10 and below, GROUPS is not available. Alternative: subquery with RANGE or manual filtering.

EXCLUDE (PG 11+)

-- Excluding the current row from calculation
SELECT
    name,
    salary,
    AVG(salary) OVER(
        PARTITION BY department
        ORDER BY salary
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        EXCLUDE CURRENT ROW
    ) as others_avg
FROM employees;

EXCLUDE options:

EXCLUDE CURRENT ROW — without the current row
EXCLUDE GROUP — without current + peer rows
EXCLUDE TIES — without peers, but with current
EXCLUDE NO OTHERS — default

On PG 10 and below, EXCLUDE is not available. Alternative: CASE WHEN to exclude the current row or a subquery.

Incremental Sort (PG 13+)

-- Index on (department, created_at)
CREATE INDEX idx_orders_dept_date ON orders(department, created_at);

-- Query with PARTITION BY + ORDER BY
SELECT
    department,
    created_at,
    amount,
    SUM(amount) OVER(PARTITION BY department ORDER BY created_at) as running_total
FROM orders;

-- PG 13+: Incremental Sort
-- Data is already sorted by department (from index)
-- → Sort only by created_at within each department
-- → 5-10x faster than full sort!
-- Incremental Sort works because data is already sorted by the first
-- part of the key (from the index). Instead of full sort by (department, created_at)
-- the database sorts only by created_at within each department group.
-- If there are 10 groups, each sort is 10x smaller.
-- On PG 12 and below — full sort of all data, even if part is already sorted.

Cost of Window Frame

-- Cheap frame: from start to current row (default)
SUM(amount) OVER(ORDER BY date ROWS UNBOUNDED PRECEDING)
-- → Incremental computation, the database doesn't "subtract" expired rows

-- Expensive frame: sliding window
AVG(amount) OVER(ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
-- → For each row, a window of 7 elements is recomputed
-- → Requires state management (adding new, removing old)

Named Windows (WINDOW clause)

-- DRY for window functions
SELECT
    department,
    AVG(salary) OVER w as avg_sal,
    SUM(salary) OVER w as total_sal,
    COUNT(*) OVER w as emp_count
FROM employees
WINDOW w AS (PARTITION BY department ORDER BY hire_date);

-- → One window definition for all functions
-- → Fewer sorts, faster execution

FILTER with Window Functions

SELECT
    department,
    salary,
    SUM(salary) FILTER (WHERE status = 'ACTIVE')
        OVER(PARTITION BY department) as active_payroll
FROM employees;

Edge Cases

Empty window

-- Without PARTITION BY and ORDER BY
SELECT name, salary,
       SUM(salary) OVER() as total_payroll  -- Sum over the entire table
FROM employees;

NULL in ORDER BY

-- NULL defaults to LAST in DESC, FIRST in ASC
SELECT name, salary,
       RANK() OVER(ORDER BY salary DESC NULLS LAST) as rank
FROM employees;

Nested windows — NOT ALLOWED

-- ❌ ERROR: window functions cannot be nested
SELECT AVG(SUM(amount) OVER(PARTITION BY user_id)) OVER() FROM orders;

-- ✅ Solution: CTE
WITH user_totals AS (
    SELECT user_id, SUM(amount) as total
    FROM orders GROUP BY user_id
)
SELECT AVG(total) FROM user_totals;

Performance Tuning

Minimizing unique windows:

-- ❌ Expensive: 3 different PARTITION BY → 3 sorts
SELECT
    AVG(amount) OVER(PARTITION BY user_id) as u_avg,
    AVG(amount) OVER(PARTITION BY department) as d_avg,
    AVG(amount) OVER(PARTITION BY region) as r_avg
FROM orders;

-- ✅ Better: same PARTITION BY → 1 sort
SELECT
    AVG(amount) OVER(PARTITION BY user_id) as u_avg,
    SUM(amount) OVER(PARTITION BY user_id) as u_total,
    COUNT(*) OVER(PARTITION BY user_id) as u_count
FROM orders;

Index for WindowAgg:

-- Index on (PARTITION BY columns, ORDER BY columns)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

-- Query without Sort
SELECT user_id, created_at, amount,
       SUM(amount) OVER(PARTITION BY user_id ORDER BY created_at) as running_total
FROM orders;
-- Plan: WindowAgg → Index Scan (no Sort!)

Production Experience

Real scenario #1: Running Total for Report

Fintech: daily balance for 1M transactions
Query: 30 seconds (Sort on disk)
Solution:
- Index on (account_id, date)
- ROWS UNBOUNDED PRECEDING (cheap frame)
- work_mem = 256MB
Result: 2 seconds

Real scenario #2: Mixed windows kill performance

Analytics: 5 window functions with different PARTITION BY
Time: 2 minutes (5 sorts!)
Solution: split into 2 CTEs
Result: 15 seconds

Monitoring

-- 1. Check plan with window functions
EXPLAIN (ANALYZE, BUFFERS)
SELECT ... OVER(PARTITION BY ... ORDER BY ...);
-- Look for: WindowAgg → Sort (or Index Scan if index exists)

-- 2. Check memory for Sort
EXPLAIN (ANALYZE)
SELECT ... OVER(...);
-- "external merge Disk: X kB" → spill! Increase work_mem

-- 3. Find slow queries with windows
SELECT query, mean_exec_time
FROM pg_stat_statements
WHERE query LIKE '%OVER(%'
ORDER BY mean_exec_time DESC
LIMIT 10;

Best Practices

Index on (PARTITION BY, ORDER BY) → no Sort
ROWS instead of RANGE where possible (faster)
WINDOW clause for reusing window definitions
FILTER for conditional aggregation
EXCLUDE for complex statistics (PG 11+)
Minimize the number of unique PARTITION BY
Incremental Sort (PG 13+) use with indexes
Subquery/CTE for filtering by window result

Summary for Senior

Window functions are computed AFTER WHERE, GROUP BY, HAVING
ROWS is faster than RANGE, but RANGE is more correct with duplicates
Incremental Sort (PG 13+) saves on partial sorting
EXCLUDE (PG 11+) for complex statistics
FILTER is supported in window functions
Frame cost: UNBOUNDED PRECEDING is cheaper than sliding window
Minimize unique PARTITION BY → fewer Sort phases
Index on (PARTITION BY, ORDER BY) → WindowAgg without Sort

🎯 Interview Cheat Sheet

Must know:

Window functions: compute over a group of rows, do NOT collapse (unlike GROUP BY)
Syntax: FUNCTION() OVER(PARTITION BY ... ORDER BY ... ROWS/RANGE BETWEEN ...)
3 types: aggregate (SUM, AVG), ranking (ROW_NUMBER, RANK), offset (LAG, LEAD)
ROWS = physical rows (faster), RANGE = logical values (more correct with duplicates)
GROUPS (PG 11+): groups “peer” rows with the same value
EXCLUDE (PG 11+): EXCLUDE CURRENT ROW, EXCLUDE GROUP, EXCLUDE TIES
Filtering by window result → ONLY in subquery/CTE (windows are computed AFTER WHERE)
Index on (PARTITION BY, ORDER BY) → WindowAgg without Sort
Incremental Sort (PG 13+): data is partially sorted → sort the rest
Named windows (WINDOW clause): DRY for window functions
Window functions cannot be nested → CTE

Frequent follow-up questions:

“Why can’t you filter by window function in WHERE?” → Windows are computed AFTER WHERE
“ROWS vs RANGE — which to choose?” → ROWS is faster, RANGE is more correct with duplicates
“How to speed up a window function?” → Index on (PARTITION BY, ORDER BY)
“What is Incremental Sort?” → PG 13+: sorts only the missing part

Red flags (DO NOT say):

❌ “Window functions collapse rows” (no, that’s GROUP BY)
❌ “You can filter by window function in WHERE” (no, computed after WHERE)
❌ “RANGE is always better than ROWS” (ROWS is faster, RANGE only with duplicates)

Related topics:

[[What does ROW_NUMBER() do]] → unique numbering
[[What do RANK() and DENSE_RANK() do]] → ranking
[[What does GROUP BY do]] → difference from window functions
[[What is the difference between WHERE and HAVING]] → execution order