SQL window functions cheat sheet
Window functions compute a value across a set of rows related to the current row without collapsing rows like GROUP BY does. The whole family on one page.
The OVER clause #
function(args) OVER (
[PARTITION BY col_a, col_b] -- group rows; analogous to GROUP BY
[ORDER BY col_t [ASC|DESC]] -- sort within each partition
[ROWS|RANGE BETWEEN ... AND ...] -- frame: which rows in the partition to operate on
)Ranking: ROW_NUMBER, RANK, DENSE_RANK #
SELECT
user_id,
total,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY total DESC) AS rn, -- 1, 2, 3, 4
RANK() OVER (PARTITION BY user_id ORDER BY total DESC) AS rk, -- 1, 2, 2, 4 (ties skip)
DENSE_RANK() OVER (PARTITION BY user_id ORDER BY total DESC) AS dr -- 1, 2, 2, 3 (ties don't skip)
FROM orders;Pick ROW_NUMBER for "exactly one row per rank position" (Top-N). RANK for "ties share rank, gaps OK." DENSE_RANK for "ties share rank, no gaps."
Top-N per group (the most useful pattern) #
-- 3 most recent orders per user.
SELECT *
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date DESC) AS rn
FROM orders
) ranked
WHERE rn <= 3;
-- DuckDB / Snowflake / BigQuery / Databricks: use QUALIFY to skip the wrapper.
SELECT *
FROM orders
QUALIFY ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date DESC) <= 3;LAG / LEAD: previous and next row #
SELECT
day,
revenue,
LAG(revenue) OVER (ORDER BY day) AS prev_day,
LEAD(revenue) OVER (ORDER BY day) AS next_day,
revenue - LAG(revenue) OVER (ORDER BY day) AS day_over_day_delta,
LAG(revenue, 7) OVER (ORDER BY day) AS revenue_week_ago,
LAG(revenue, 1, 0) OVER (ORDER BY day) AS prev_or_zero -- default for first row
FROM daily_revenue;Aggregates as window functions #
SELECT
region,
user_id,
revenue,
-- Group total without collapsing rows.
SUM(revenue) OVER (PARTITION BY region) AS region_total,
-- Per-row percentage of region total.
100.0 * revenue / SUM(revenue) OVER (PARTITION BY region) AS pct_of_region,
-- Running total within region, ordered by date.
SUM(revenue) OVER (PARTITION BY region ORDER BY day) AS running_total,
-- Average so far for that region.
AVG(revenue) OVER (PARTITION BY region ORDER BY day) AS running_avg,
-- Min / Max so far.
MIN(revenue) OVER (PARTITION BY region ORDER BY day) AS region_low_so_far,
MAX(revenue) OVER (PARTITION BY region ORDER BY day) AS region_high_so_far
FROM region_daily;Frames (ROWS / RANGE) #
-- Last 7 rows (rolling 7-day if the data is one row per day).
SUM(revenue) OVER (
ORDER BY day
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
)
-- 30-day moving average.
AVG(revenue) OVER (
ORDER BY day
ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
)
-- Forward-looking: this row plus next 4.
SUM(revenue) OVER (
ORDER BY day
ROWS BETWEEN CURRENT ROW AND 4 FOLLOWING
)
-- Whole partition (e.g., for LAST_VALUE to actually return the last).
LAST_VALUE(revenue) OVER (
PARTITION BY region ORDER BY day
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)Default frame with ORDER BY: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Includes ties on the order column. ROWS walks physical rows, usually what you want when ties exist.
FIRST_VALUE / LAST_VALUE / NTH_VALUE #
-- First and last revenue in each region (sorted by date).
SELECT
region,
day,
revenue,
FIRST_VALUE(revenue) OVER (PARTITION BY region ORDER BY day) AS first_day_rev,
LAST_VALUE(revenue) OVER (PARTITION BY region ORDER BY day
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) AS last_day_rev,
NTH_VALUE(revenue, 2) OVER (PARTITION BY region ORDER BY day) AS second_day_rev
FROM region_daily;Trap: LAST_VALUE with the default frame returns the current row's value, not the partition's last. Always specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING for "the actual last."
NTILE: quartiles, deciles, percentiles #
SELECT
user_id,
revenue,
NTILE(4) OVER (ORDER BY revenue) AS quartile,
NTILE(10) OVER (ORDER BY revenue) AS decile,
NTILE(100) OVER (ORDER BY revenue) AS percentile
FROM user_revenue;PERCENT_RANK / CUME_DIST #
SELECT
value,
PERCENT_RANK() OVER (ORDER BY value) AS pct_rank, -- 0 to 1
CUME_DIST() OVER (ORDER BY value) AS cumulative_dist -- 0 to 1
FROM measurements;PERCENTILE_CONT / PERCENTILE_DISC #
-- As an aggregate (one row per group): WITHIN GROUP form.
SELECT
region,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY revenue) AS median,
PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY revenue) AS p95
FROM user_revenue
GROUP BY region;
-- As a window function: per-row.
SELECT
region, user_id, revenue,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY revenue)
OVER (PARTITION BY region) AS region_median
FROM user_revenue;Reusable WINDOW clause (avoid copy-paste) #
SELECT
user_id,
total,
ROW_NUMBER() OVER w AS rn,
RANK() OVER w AS rk,
LAG(total) OVER w AS prev_total
FROM orders
WINDOW w AS (PARTITION BY user_id ORDER BY order_date); -- Postgres / MySQL 8 / DuckDB / SQLite / BigQuery / SQL Server 2022+ / Oracle 21c+ (not Snowflake)Common mistakes #
- Mixing
GROUP BYand window functions naively: window functions run afterGROUP BY.SUM(total) OVER (...)over a grouped query operates on the grouped rows, not the raw data. LAST_VALUEreturning the wrong value: default frame isUNBOUNDED PRECEDING ... CURRENT ROW, soLAST_VALUEis the same as the current row's value. Always specifyUNBOUNDED FOLLOWINGexplicitly.ORDER BYwith no tie-breaker:ROW_NUMBERover tied values is non-deterministic. Add a tie-breaker column (ORDER BY t, id) for reproducible results.- Using window functions where
GROUP BYwould do: if you don't need per-row output, plainGROUP BYis faster.
FAQ #
What's the difference between OVER and GROUP BY?
GROUP BY collapses rows to one output row per group. OVER (window function) computes a per-group value but keeps every input row. Use GROUP BY when you only want the aggregate; use OVER when you need each row alongside the aggregate.
When should I use ROW_NUMBER, RANK, or DENSE_RANK?
ROW_NUMBER always assigns unique sequential numbers, best for "exactly N rows per group" (Top-N pattern). RANK gives ties the same rank but skips numbers (1, 2, 2, 4), like a sports leaderboard. DENSE_RANK gives ties the same rank without gaps (1, 2, 2, 3), like pricing tiers.
Can I filter on a window function in WHERE?
No. Window functions are evaluated after WHERE. Wrap the query in a subquery / CTE and filter on the outer level: SELECT FROM (SELECT , ROW_NUMBER() OVER (...) AS rn FROM t) WHERE rn = 1. On DuckDB, Snowflake, BigQuery, and Databricks, QUALIFY lets you filter window results in one query.
Why does LAST_VALUE return the wrong row?
The default frame with ORDER BY is UNBOUNDED PRECEDING TO CURRENT ROW, so LAST_VALUE returns the current row's value. To get the actual partition's last value, specify the full frame: LAST_VALUE(col) OVER (PARTITION BY g ORDER BY t ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING).
What's the difference between ROWS and RANGE in a window frame?
ROWS walks physical rows. RANGE walks by the ORDER BY value, meaning ties on the order column are all included together. Use ROWS for "last 7 rows"; use RANGE for "last 7 days." RANGE with intervals (RANGE BETWEEN INTERVAL '7' DAY PRECEDING ...) is supported on Postgres 11+, Oracle, and DuckDB.
How do I compute a 7-day rolling average?
AVG(value) OVER (ORDER BY day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW). The ROWS BETWEEN frame is required: the default frame would give a cumulative average, not a rolling one. For per-group rolling averages, add PARTITION BY group_col inside the OVER.
Two patterns cover 80% of analytics window-function work. (1) Top-N per group via ROW_NUMBER. (2) Running total / rolling average via SUM / AVG with ORDER BY and a frame. Master those two and the rest are variants. Detail pages: Running total, LAG / LEAD, First/last per group, Ranking.