Recipe

Rank rows within a group

Short answer: three ranking functions, three different tie-handling behaviors. ROW_NUMBER gives every row a unique rank (ties broken arbitrarily). RANK gives tied rows the same rank but skips the next number (1, 2, 2, 4). DENSE_RANK gives tied rows the same rank without skipping (1, 2, 2, 3). Pick by what should happen on ties in your data.

Quick comparison #

Score: 95, 90, 90, 85ROW_NUMBERRANKDENSE_RANK
95111
90 (tied)222
90 (tied)322
85443
Use for"exactly one row per rank" (Top-N)"tied scores share rank, gaps OK" (sports)"tied scores share rank, no gaps" (pricing tiers)

Canonical SQL #

sql
SELECT
  user_id,
  region,
  revenue,
  ROW_NUMBER() OVER (PARTITION BY region ORDER BY revenue DESC) AS row_num,
  RANK()       OVER (PARTITION BY region ORDER BY revenue DESC) AS rnk,
  DENSE_RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS dense_rnk
FROM   user_revenue;

Example #

Loading SQL editor...

Common patterns #

sql
-- Top-N per group: most common analytics ask.
SELECT *
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY region ORDER BY revenue DESC) AS rn
  FROM user_revenue
) ranked
WHERE rn <= 5;                      -- Top 5 per region

-- Percentile rank (0 to 1).
SELECT
  user_id,
  revenue,
  PERCENT_RANK() OVER (ORDER BY revenue) AS pct_rank,
  CUME_DIST()    OVER (ORDER BY revenue) AS cumulative_dist
FROM user_revenue;

-- "Highest in group" tie-broken by a secondary column.
SELECT *
FROM (
  SELECT *, ROW_NUMBER() OVER (
    PARTITION BY region
    ORDER BY revenue DESC, signup_date ASC, user_id ASC      -- multi-key tie break
  ) AS rn
  FROM user_revenue
) ranked
WHERE rn = 1;

-- NTILE: split rows into N equal-sized buckets (quartiles, deciles).
SELECT
  user_id,
  revenue,
  NTILE(4) OVER (ORDER BY revenue) AS quartile
FROM user_revenue;

When to use which #

ROW_NUMBER: when you need exactly one row per rank position. For "Top 3 customers per region," you want exactly 3 rows per region, even if two customers are tied at #3. The engine picks one; add a deterministic tie-breaker column to make the choice reproducible.

RANK: when ties should share a rank but the next position should reflect how many rows came before it. Sports leaderboards work this way: two athletes tied for second place, the next is fourth (not third). RANK produces 1, 2, 2, 4.

DENSE_RANK: when ties should share a rank but the next position is just the next integer. Pricing tiers and segmentation buckets work this way, with Tier 1 / Tier 2 / Tier 3 regardless of how many products are in each tier. DENSE_RANK produces 1, 2, 2, 3.

NTILE(n): divides rows into n evenly-sized buckets. Quartiles (NTILE(4)), deciles (NTILE(10)), percentiles (NTILE(100)). Different problem from ranking, same family of functions.

Dialect notes #

  • ROW_NUMBER, RANK, DENSE_RANK, NTILE, PERCENT_RANK, CUME_DIST: ANSI standard, supported on Postgres, MySQL (8+), SQL Server, Oracle, Snowflake, BigQuery, DuckDB, Redshift, SQLite (3.25+).
  • QUALIFY clause: DuckDB / Snowflake / BigQuery / Databricks / Teradata let you filter on a window function without the subquery wrapper: SELECT ... QUALIFY ROW_NUMBER() OVER (...) <= 5. Cleaner.
  • MySQL pre-8.0: no window functions. Emulate ROW_NUMBER per group with user-defined variables (@row := IF(@grp = group, @row + 1, 1)); it works but is fragile.
  • RANK vs DENSE_RANK performance: identical on every modern engine. Choose by semantics, not by speed.
Tip

The single most-Googled SQL question is "Top N per group," and ROW_NUMBER() OVER (PARTITION BY g ORDER BY metric DESC) <= N (filtered in a subquery or with QUALIFY) is the canonical answer. Memorize the pattern, swap ROW_NUMBER for RANK / DENSE_RANK based on tie semantics, and you've got 80% of the analytics ranking work covered. References: Top N per group pattern, first or last row per group.