First or last row per group
Short answer: ROW_NUMBER() OVER (PARTITION BY group ORDER BY time_col) ranks rows within each group; filter to = 1 for the first row, or use ORDER BY time_col DESC to get the last. The portable solution. Postgres also has DISTINCT ON (group) as a shortcut. The MAX(time) / MIN(time) group-and-rejoin pattern is older and clunkier, so prefer the window-function form.
Canonical SQL #
-- The portable form: ROW_NUMBER + filter.
SELECT *
FROM (
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY user_id
ORDER BY order_date DESC -- DESC = last; ASC = first
) AS rn
FROM orders
) ranked
WHERE rn = 1;
-- DuckDB / Snowflake / BigQuery / Databricks: cleaner with QUALIFY.
SELECT *
FROM orders
QUALIFY ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date DESC) = 1;
-- Postgres: DISTINCT ON shortcut.
SELECT DISTINCT ON (user_id)
user_id, order_date, total
FROM orders
ORDER BY user_id, order_date DESC;Example #
Common variants #
-- Top 3 most recent orders per user (not just the first).
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date DESC) AS rn
FROM orders
) ranked
WHERE rn <= 3;
-- First order per user, but break ties on ID (deterministic).
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date, id) AS rn
FROM orders
) ranked
WHERE rn = 1;
-- Latest non-cancelled order per user.
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date DESC) AS rn
FROM orders
WHERE status <> 'cancelled' -- filter BEFORE ranking
) ranked
WHERE rn = 1;
-- "Latest snapshot" pattern (idempotent for SCD-style tables).
WITH user_dim_snapshots(user_id, snapshot_date, plan) AS (
VALUES (10, DATE '2026-01-01', 'free'),
(10, DATE '2026-04-01', 'pro'),
(11, DATE '2026-03-01', 'free')
)
SELECT *
FROM user_dim_snapshots
QUALIFY ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY snapshot_date DESC) = 1;Dialect comparison #
| Engine | Syntax | Gotcha |
|---|---|---|
| PostgreSQL | DISTINCT ON (user_id) | ORDER BY must start with the DISTINCT ON columns |
| MySQL | ROW_NUMBER() in a subquery, filter rn = 1 | needs MySQL 8+ |
| SQL Server | ROW_NUMBER() in a subquery, filter rn = 1 | — |
| BigQuery | QUALIFY ROW_NUMBER() OVER (…) = 1 | — |
| Snowflake | QUALIFY ROW_NUMBER() OVER (…) = 1 | — |
| Oracle | ROW_NUMBER() in a subquery, filter rn = 1 | — |
| SQLite | ROW_NUMBER() in a subquery, filter rn = 1 | needs SQLite 3.25+ |
| DuckDB | QUALIFY ROW_NUMBER() OVER (…) = 1 | — |
Dialect notes #
ROW_NUMBERis universal: Postgres / MySQL (8+) / SQL Server / Oracle / Snowflake / BigQuery / DuckDB / Redshift / SQLite (3.25+). The portable form.QUALIFYfor filtering on window functions in one go: DuckDB, Snowflake, BigQuery, Databricks, Teradata. See QUALIFY.DISTINCT ON (cols): Postgres-specific shortcut. Returns one row per distinct combination of theDISTINCT ONcolumns, picking the first row by theORDER BYclause. ForcesORDER BYto start with the same columns asDISTINCT ON.MAX(time)group-and-rejoin pattern:SELECT * FROM orders o JOIN (SELECT user_id, MAX(order_date) AS m FROM orders GROUP BY user_id) latest ON latest.user_id = o.user_id AND latest.m = o.order_date. This works without window functions but is fragile when ties exist (it returns multiple rows per user). UseROW_NUMBERinstead unless you're stuck on a pre-window-function engine.
Without a tie-breaker on ORDER BY, "first row per group" is non-deterministic. Two orders on the same date for the same user: which one wins? The engine picks one, but a re-run might pick the other. Always include a deterministic tie-breaker: ORDER BY order_date DESC, id DESC. The id (or created_at with second/microsecond precision) makes the result reproducible. Re-run-stable analytics is a downstream-stakeholder kind of stable. Anyone who's diffed two snapshots of "latest order per user" expecting them to match knows the cost of skipping this.