QUALIFY
Every window-function lesson so far has hit the same wall: you compute rn, then you can't filter on it.
-- Fails: rn doesn't exist when WHERE runs
SELECT
id,
user_id,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC) AS rn
FROM orders
WHERE rn = 1;WHERE is evaluated before window functions are computed, and HAVING filters aggregates, not windows. So the standard workaround is the two-step CTE you've been writing: compute rn in a CTE, filter WHERE rn = 1 outside.
QUALIFY removes the workaround. It's a dedicated filter clause that runs after window functions are computed: WHERE for rows, HAVING for groups, QUALIFY for window results.
Syntax #
SELECT column_list
FROM t
WHERE row_filter -- runs first, before windows
GROUP BY ... -- optional
HAVING group_filter -- optional
QUALIFY window_predicate -- runs after windows are computed
ORDER BY ...
LIMIT n;The logical evaluation order:
FROM → WHERE → GROUP BY → HAVING → window functions → QUALIFY → DISTINCT → ORDER BY → LIMIT
Two consequences worth internalizing:
- Rows removed by
WHEREnever receive window numbers. The window only sees survivors. QUALIFYsees fully-computed window values, so it can compare them, combine them withAND/OR, or test them againstLAG/LEADresults.
DuckDB and Snowflake also let QUALIFY reference a window alias from SELECT: SELECT ..., ROW_NUMBER() OVER (...) AS rn FROM orders QUALIFY rn = 1. Handy when you want rn in the output anyway.
CTE vs QUALIFY: Same Query, Two Shapes #
The highest-total order within each status. First, the portable CTE form you already know:
-- Portable everywhere
WITH ranked AS (
SELECT
id AS order_id,
user_id,
status,
total,
ROW_NUMBER() OVER (
PARTITION BY status
ORDER BY total DESC, id
) AS rn
FROM orders
)
SELECT order_id, user_id, status, total
FROM ranked
WHERE rn = 1;Now the QUALIFY form, identical result, no wrapper:
-- DuckDB / Snowflake / BigQuery / Databricks / Teradata
SELECT id AS order_id, user_id, status, total
FROM orders
QUALIFY ROW_NUMBER() OVER (PARTITION BY status ORDER BY total DESC, id) = 1;Nothing changed semantically. QUALIFY is sugar over the CTE pattern, and the optimizer treats both the same. What changed is readability: the filter sits next to the query it filters, and there's one less name (ranked, rn) to invent.
The Three Canonical Uses #
Top-N per group #
-- Top 2 paid orders per user by total
SELECT id AS order_id, user_id, total
FROM orders
WHERE status = 'paid'
QUALIFY ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY total DESC, id) <= 2;Dedupe: keep the latest row per key #
-- Most recent order per user, full row detail
SELECT id, user_id, total, status, created_at
FROM orders
QUALIFY ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC, id DESC) = 1;Change detection with LAG #
QUALIFY isn't limited to ranking functions; any window result works. Here's "rows where the value differs from the previous row," the building block of change-history tables:
-- Orders where the total differs from the user's previous order
SELECT user_id, id AS order_id, total, created_at
FROM orders
QUALIFY total IS DISTINCT FROM
LAG(total) OVER (PARTITION BY user_id ORDER BY created_at, id);IS DISTINCT FROM instead of <> is deliberate: each user's first order has LAG(total) = NULL, and total <> NULL is NULL (filtered out). IS DISTINCT FROM treats NULL as a comparable value, so first rows survive, which is usually what change detection wants.
Portability #
QUALIFY is not standard SQL. It originated in Teradata and spread through the analytical engines:
| Engine | QUALIFY | What to write instead |
|---|---|---|
| DuckDB | Yes | — |
| Snowflake | Yes | — |
| BigQuery | Yes | — |
| Databricks / Teradata | Yes | — |
| PostgreSQL | No | CTE / subquery + WHERE rn = 1 |
| MySQL | No | CTE / subquery + WHERE rn = 1 |
| SQL Server | No | CTE / subquery + WHERE rn = 1 |
| SQLite | No | CTE / subquery + WHERE rn = 1 |
The editors on this site run DuckDB, so QUALIFY works everywhere here. But know both forms: if your job runs on Postgres or SQL Server, or an interviewer asks for portable SQL, the CTE pattern is the answer. QUALIFY saves typing; the CTE version is the one that runs everywhere.
Try It #
Each user's highest-total order, with their name, in one query and no CTE. The QUALIFY line does what an entire wrapper query used to.
Practice #
Return each user's latest paid order: latest by created_at, tie-broken by highest id. Output: order_id, user_id, total, created_at. Use QUALIFY, with no CTE and no subquery. No ordering requirement on the final result.
Mistakes to Watch For #
- Shipping
QUALIFYto Postgres, MySQL, SQL Server, or SQLite, where it's a syntax error. Translate to the CTE +WHERE rn = 1form before porting. - Still trying
WHERE rn = 1orHAVING rn = 1on a window alias in the sameSELECT. Both run before windows exist;QUALIFYis the clause that runs after. - Dropping the row filter and hoping
QUALIFYhandles it. WithoutWHERE status = 'paid', a cancelled order can takern = 1and knock out the paid order you wanted. Row filters belong inWHERE; window filters belong inQUALIFY. - Missing tie-breaker in the window's
ORDER BY— same non-determinism as always, andQUALIFYdoes nothing to fix it. - Treating
QUALIFYas aHAVINGreplacement.HAVINGfilters aggregate results afterGROUP BY;QUALIFYfilters window results. A query can legitimately use both.
Knowledge check #
4 questions
SELECT *, ROW_NUMBER() OVER (...) AS rn FROM orders WHERE rn = 1fails because:QUALIFYis natively supported in:To port
QUALIFY ROW_NUMBER() OVER (...) = 1to Postgres, you:A query has both
WHERE status = 'paid'andQUALIFY rn = 1. Which rows receive window numbers?
Next Step #
That wraps up the Window Functions section. You now have the full toolkit, including the clause that makes it pleasant to filter. Continue to String Cleanup, the TRIM / LOWER / REPLACE techniques that turn messy text into values you can join, group, and dedupe.