Ignore NULLs in aggregates and window functions
Short answer: standard SQL aggregates (SUM, AVG, MIN, MAX, COUNT(col)) already ignore NULLs by default. The function applies only to non-NULL values; nulls don't pull averages down or zero out sums. The exceptions: COUNT(*) counts every row including nulls, and the LAG / LEAD / FIRST_VALUE / LAST_VALUE window functions take an optional IGNORE NULLS clause that's not on by default.
Default behavior: no flag needed #
-- Aggregates skip NULLs automatically.
WITH transactions(id, amount) AS (
VALUES (1, 100), (2, NULL), (3, 50), (4, NULL), (5, 200)
)
SELECT
COUNT(*) AS total_rows, -- counts NULLs
COUNT(amount) AS rows_with_amount, -- counts non-NULL only
SUM(amount) AS total, -- ignores NULL
AVG(amount) AS mean, -- ignores NULL (denominator is non-NULL count)
MIN(amount), MAX(amount)
FROM transactions;Example #
Window functions: IGNORE NULLS clause #
-- LAG / LEAD / FIRST_VALUE / LAST_VALUE — default behavior includes NULLs.
SELECT
day,
price,
LAG(price) OVER (ORDER BY day) AS prev_price,
LAG(price) IGNORE NULLS OVER (ORDER BY day) AS prev_non_null_price
FROM stock_prices;
-- IGNORE NULLS is supported on Snowflake, Oracle, BigQuery, Redshift, DuckDB, SQL Server (2022+).
-- Placement varies: Snowflake / Oracle / SQL Server write it after the parens (as above);
-- DuckDB / BigQuery write it inside: LAG(price IGNORE NULLS) OVER (ORDER BY day).
-- Postgres / MySQL: not natively supported. Workaround:
-- Postgres workaround: nested aggregate.
WITH numbered AS (
SELECT *,
CASE WHEN price IS NOT NULL THEN COUNT(price) OVER (ORDER BY day) END AS grp
FROM stock_prices
)
SELECT day, price, MAX(price) OVER (PARTITION BY grp ORDER BY day) AS last_non_null_price
FROM numbered;Common patterns #
-- "Average rating, treating missing as the dataset's mean."
WITH stats AS (SELECT AVG(rating) AS mean FROM reviews)
SELECT AVG(COALESCE(r.rating, s.mean)) FROM reviews r CROSS JOIN stats s;
-- Forward-fill missing values (carry-forward).
-- (DuckDB / BigQuery put IGNORE NULLS inside the parens; Snowflake / Oracle after them.)
WITH prices(day, price) AS (
VALUES (DATE '2026-04-01', 10.0),
(DATE '2026-04-02', NULL),
(DATE '2026-04-03', 12.5)
)
SELECT day,
COALESCE(price, LAG(price IGNORE NULLS) OVER (ORDER BY day)) AS price_filled
FROM prices;
-- "Conditional aggregate" — count only certain rows.
SELECT
COUNT(*) FILTER (WHERE status = 'paid') AS paid_count, -- Postgres / DuckDB / SQLite
SUM(CASE WHEN status = 'paid' THEN total ELSE 0 END) AS paid_revenue -- portable
FROM orders;
-- Distinct count, ignoring NULL.
SELECT COUNT(DISTINCT email) FROM users; -- nulls always excluded from DISTINCT countCommon mistakes #
- Believing
AVGincludes nulls as zeros: it doesn't. Nulls are excluded from both numerator and denominator. If "missing = 0" semantically, wrap withCOALESCE(col, 0)inside the aggregate. SUM(col)returningNULLinstead of0when there are no non-null rows. UseCOALESCE(SUM(col), 0)if your downstream code expects a number.- **
COUNT(col)instead ofCOUNT()after a LEFT JOIN*: surprises people. AfterLEFT JOIN orders,COUNT(orders.id)excludes users with no orders (NULL on the right side). UseCOUNT(*)if you want every row. STRING_AGG/LISTAGG/GROUP_CONCAT: these also ignore nulls.STRING_AGG(name, ',')over ('a',NULL,'c') returns'a,c'.
Dialect comparison #
| Engine | Syntax | Gotcha |
|---|---|---|
| PostgreSQL | LAG(col) OVER (…) | no IGNORE NULLS; use the nested-aggregate workaround |
| MySQL | LAG(col) OVER (…) | no IGNORE NULLS |
| SQL Server | LAG(col) IGNORE NULLS OVER (…) | 2022+ only |
| BigQuery | FIRST_VALUE(col IGNORE NULLS) OVER (…) | IGNORE NULLS goes inside the parentheses |
| Snowflake | LAG(col) IGNORE NULLS OVER (…) | — |
| Oracle | LAG(col) IGNORE NULLS OVER (…) | — |
| DuckDB | LAG(col IGNORE NULLS) OVER (…) | IGNORE NULLS goes inside the parentheses, like BigQuery |
The table covers the window-function clause only. Plain aggregates (SUM, AVG, MIN, MAX, COUNT(col)) skip NULLs by default on every engine.
Aggregates ignore NULLs is a feature, not a bug, but the semantic decision belongs to you. If a missing rating means "no opinion" (don't count it), the default behavior is right. If a missing rating means "neutral, treat as 3," wrap with COALESCE(rating, 3) to materialize that intent. Either choice is defensible; an unstated choice is the source of "the dashboard average doesn't match" arguments. Detail: COALESCE, IS NULL, COUNT comparison.