Recipe

First or last row per group

Short answer: ROW_NUMBER() OVER (PARTITION BY group ORDER BY time_col) ranks rows within each group; filter to = 1 for the first row, or use ORDER BY time_col DESC to get the last. The portable solution. Postgres also has DISTINCT ON (group) as a shortcut. The MAX(time) / MIN(time) group-and-rejoin pattern is older and clunkier, so prefer the window-function form.

Canonical SQL #

sql
-- The portable form: ROW_NUMBER + filter.
SELECT *
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (
      PARTITION BY user_id
      ORDER BY order_date DESC      -- DESC = last; ASC = first
    ) AS rn
  FROM orders
) ranked
WHERE rn = 1;

-- DuckDB / Snowflake / BigQuery / Databricks: cleaner with QUALIFY.
SELECT *
FROM   orders
QUALIFY ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date DESC) = 1;

-- Postgres: DISTINCT ON shortcut.
SELECT DISTINCT ON (user_id)
  user_id, order_date, total
FROM   orders
ORDER  BY user_id, order_date DESC;

Example #

Loading SQL editor...

Common variants #

sql
-- Top 3 most recent orders per user (not just the first).
SELECT *
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date DESC) AS rn
  FROM orders
) ranked
WHERE rn <= 3;

-- First order per user, but break ties on ID (deterministic).
SELECT *
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date, id) AS rn
  FROM orders
) ranked
WHERE rn = 1;

-- Latest non-cancelled order per user.
SELECT *
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date DESC) AS rn
  FROM orders
  WHERE status <> 'cancelled'                      -- filter BEFORE ranking
) ranked
WHERE rn = 1;

-- "Latest snapshot" pattern (idempotent for SCD-style tables).
WITH user_dim_snapshots(user_id, snapshot_date, plan) AS (
  VALUES (10, DATE '2026-01-01', 'free'),
         (10, DATE '2026-04-01', 'pro'),
         (11, DATE '2026-03-01', 'free')
)
SELECT *
FROM   user_dim_snapshots
QUALIFY ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY snapshot_date DESC) = 1;

Dialect comparison #

EngineSyntaxGotcha
PostgreSQLDISTINCT ON (user_id)ORDER BY must start with the DISTINCT ON columns
MySQLROW_NUMBER() in a subquery, filter rn = 1needs MySQL 8+
SQL ServerROW_NUMBER() in a subquery, filter rn = 1
BigQueryQUALIFY ROW_NUMBER() OVER (…) = 1
SnowflakeQUALIFY ROW_NUMBER() OVER (…) = 1
OracleROW_NUMBER() in a subquery, filter rn = 1
SQLiteROW_NUMBER() in a subquery, filter rn = 1needs SQLite 3.25+
DuckDBQUALIFY ROW_NUMBER() OVER (…) = 1

Dialect notes #

  • ROW_NUMBER is universal: Postgres / MySQL (8+) / SQL Server / Oracle / Snowflake / BigQuery / DuckDB / Redshift / SQLite (3.25+). The portable form.
  • QUALIFY for filtering on window functions in one go: DuckDB, Snowflake, BigQuery, Databricks, Teradata. See QUALIFY.
  • DISTINCT ON (cols): Postgres-specific shortcut. Returns one row per distinct combination of the DISTINCT ON columns, picking the first row by the ORDER BY clause. Forces ORDER BY to start with the same columns as DISTINCT ON.
  • MAX(time) group-and-rejoin pattern: SELECT * FROM orders o JOIN (SELECT user_id, MAX(order_date) AS m FROM orders GROUP BY user_id) latest ON latest.user_id = o.user_id AND latest.m = o.order_date. This works without window functions but is fragile when ties exist (it returns multiple rows per user). Use ROW_NUMBER instead unless you're stuck on a pre-window-function engine.
Warning

Without a tie-breaker on ORDER BY, "first row per group" is non-deterministic. Two orders on the same date for the same user: which one wins? The engine picks one, but a re-run might pick the other. Always include a deterministic tie-breaker: ORDER BY order_date DESC, id DESC. The id (or created_at with second/microsecond precision) makes the result reproducible. Re-run-stable analytics is a downstream-stakeholder kind of stable. Anyone who's diffed two snapshots of "latest order per user" expecting them to match knows the cost of skipping this.