Lesson Beginner

CASE & Conditional Aggregation

CASE is SQL's if/else. It's an expression, not a statement. It produces one value per row, so it can go anywhere a column can: in SELECT, in ORDER BY, even inside an aggregate.
That last trick is the big one. Putting CASE inside SUM or COUNT lets one query answer "how many paid? how many cancelled? how much paid revenue?" in a single row per group, with no separate queries and no joins.

Syntax #

The searched form tests conditions top to bottom and returns the value of the first WHEN that matches.

sql
CASE
  WHEN condition_1 THEN result_1
  WHEN condition_2 THEN result_2
  ELSE fallback
END

There's also a simple form that compares one value against literals: CASE status WHEN 'paid' THEN 'Paid' WHEN 'pending' THEN 'Open' END. It's shorter but only does equality checks, so the searched form is what you'll write most of the time.

Warning

If no WHEN matches and there's no ELSE, the result is NULL.
That sounds like a footnote, but it's the engine behind conditional counting. COUNT ignores NULLs, so a CASE without ELSE quietly excludes non-matching rows from the count.

Basic Example #

Label each order with a price bucket. Conditions are checked top to bottom, so an order with total = 1299.99 matches the first branch and stops there.

sql
SELECT
  id,
  total,
  CASE
    WHEN total >= 500 THEN 'big ticket'
    WHEN total >= 100 THEN 'mid'
    ELSE 'small'
  END AS price_bucket
FROM orders
ORDER BY total DESC;

One label per row. Because CASE is just an expression, price_bucket behaves like any other column: you could group by it to count orders per bucket.

Conditional Aggregation #

Now put CASE inside the aggregate. Each row contributes its branch result to the aggregate, and rows that produce NULL are skipped by COUNT, SUM, and friends.

Two workhorse patterns:

  • COUNT(CASE WHEN status = 'paid' THEN 1 END) counts only paid rows. Non-paid rows yield NULL (no ELSE), and COUNT ignores them.
  • SUM(CASE WHEN status = 'paid' THEN total ELSE 0 END) adds total for paid rows, 0 for everything else.

Combined with GROUP BY, this pivots a tall status column into wide per-status columns:

sql
SELECT
  order_date,
  COUNT(*) AS all_orders,
  COUNT(CASE WHEN status = 'paid' THEN 1 END) AS paid,
  COUNT(CASE WHEN status = 'pending' THEN 1 END) AS pending,
  COUNT(CASE WHEN status = 'cancelled' THEN 1 END) AS cancelled,
  ROUND(SUM(CASE WHEN status = 'paid' THEN total ELSE 0 END), 2) AS paid_revenue
FROM orders
GROUP BY order_date
ORDER BY order_date;

One row per day, with each status getting its own column. Compare that to WHERE status = 'paid': the filter would remove non-paid rows before grouping, so you'd lose the pending and cancelled counts entirely. Conditional aggregation keeps every row in the group and decides per metric which rows count.

The FILTER Clause #

PostgreSQL and DuckDB offer a cleaner spelling for the same idea: FILTER (WHERE ...) attached to the aggregate:

sql
SELECT
  COUNT(*) FILTER (WHERE status = 'paid') AS paid_orders,
  ROUND(SUM(total) FILTER (WHERE status = 'paid'), 2) AS paid_revenue
FROM orders;
Tip

FILTER reads better, but it's not universal. MySQL, SQL Server, and Oracle don't support it. The CASE form is the portable version that runs everywhere, which is why it's worth learning first. On Postgres or DuckDB, reach for FILTER; in cross-database code, stick with CASE.

Try It #

Build a per-user status breakdown, every metric in one pass over orders.

Loading SQL editor...

Practice #

For each user_id in orders, return user_id, paid_orders (count of paid orders), cancelled_orders (count of cancelled orders), and paid_revenue (sum of total for paid orders, 0 otherwise, rounded to 2 decimals). One row per distinct user_id; users with no paid orders must still appear with zeros. No ordering requirement.

Loading SQL editor...

Mistakes to Watch For #

  • COUNT(CASE WHEN ... THEN 1 ELSE 0 END). The ELSE 0 makes COUNT count every row, because 0 is a value, not NULL. Either drop the ELSE or switch to SUM.
  • Forgetting END. Every CASE must close with it, and the error message rarely says so plainly.
  • Mixing types across branches: THEN 'high' ... ELSE 0 forces the engine to reconcile text with a number. Keep all branches the same type.
  • Relying on branch order without thinking. WHEN total >= 100 before WHEN total >= 500 means the 500 branch never fires, because the first match wins.
  • Using WHERE status = 'paid' when you need paid and cancelled metrics side by side: the filter removes rows before grouping, so the second metric is silently wrong (and zero-paid users vanish).

Knowledge check #

4 questions

0 / 4 answered
  1. A CASE expression has no ELSE, and none of the WHEN conditions match a row. What does it return for that row?

  2. A group has 10 orders, 4 of them paid. What does COUNT(CASE WHEN status = 'paid' THEN 1 ELSE 0 END) return?

  3. You need paid and cancelled order counts per user in one result row. Why is conditional aggregation better than WHERE status = 'paid'?

  4. Which is true about COUNT(*) FILTER (WHERE status = 'paid')?

Next Step #

Continue to Joins intro. So far every query has used one table; joins combine them.