Lesson Advanced

Reading Explain Plans

EXPLAIN shows the plan the engine will use to run a query: which scans, which joins, in what order. EXPLAIN ANALYZE actually runs the query and reports what happened, with real row counts and per-operator time.
The golden rule: measure, don't guess. Execution plans catch surprises that intuition can't: a predicate pushed to the wrong side of a join, a missing filter, a 10M-row intermediate result you didn't know you were building.

The Two Commands #

sql
-- What the engine PLANS to do (cheap — query doesn't actually run)
EXPLAIN
SELECT user_id, SUM(total) AS revenue
FROM orders
WHERE status = 'paid'
GROUP BY user_id;

-- What the engine ACTUALLY did (runs the query, adds timing + real row counts)
EXPLAIN ANALYZE
SELECT user_id, SUM(total) AS revenue
FROM orders
WHERE status = 'paid'
GROUP BY user_id;

Start with EXPLAIN for quick structural sanity checks. Reach for EXPLAIN ANALYZE when you need real numbers. Estimated cardinalities lie; actual ones don't.

Operators to Recognize #

Every plan is a tree of operators. You don't need to memorize all of them; learn the handful that carry the cost:

  • SEQ_SCAN / TABLE_SCAN: reads rows from a base table. In DuckDB this is vectorized and cheap per row, but scans on huge tables with no filter pushdown are a red flag.
  • FILTER: applies a WHERE predicate. Watch where this sits: a filter above a join had to wait for the join; a filter pushed below reads less data.
  • HASH_JOIN / MERGE_JOIN: two input streams combined on a key. Hash joins build a hash of the smaller side; trouble starts when the "small side" isn't actually small.
  • HASH_GROUP_BY / GROUP_BY: aggregation. Cost scales with the number of groups, not input size.
  • PROJECTION: column selection and expression evaluation. Rarely the bottleneck.
  • ORDER_BY / TOP_N: sorting. TOP_N stops after N rows; much cheaper than a full sort plus LIMIT.

Reading a Plan #

Plans are printed top-down but execute leaves first (bottom-up):

  1. Scan base tables at the leaves.
  2. Filter and join as rows flow upward.
  3. Aggregate and sort near the root.
  4. Final projection and LIMIT at the top.

What to watch for:

  • Cardinality flow: does the row count shrink early (good) or stay huge until the last step (bad)?
  • Filter placement: a WHERE predicate should ideally push down to the scan, not apply after an expensive join.
  • Join order: the optimizer picks this, but bad statistics can cause it to join the two biggest tables first. On real warehouses, stale stats are a frequent culprit.
Warning

Equivalent SQL can produce different plans. WHERE u.country = 'US' on a left join might push down to the users scan, or not, depending on join type and optimizer version. If a rewrite feels like it should be faster, don't argue with yourself. Run EXPLAIN ANALYZE on both and compare actual times. Intuition about what "should" happen is frequently wrong once the optimizer gets involved.

Try It #

Run EXPLAIN on a paid-revenue-per-user query. At this data size the plan is tiny, but the structure (scan → filter → group-by → order-by) is what matters.

Loading SQL editor...

Practice #

Return each user's paid revenue for orders with total >= 50. Output: user_id, revenue (SUM, rounded to 2 decimals). Order by revenue DESC, then user_id ASC as tie-breaker (ordering is not graded, but keep it — you'll want to see the sort operator in the plan).
After submitting, try running EXPLAIN ANALYZE on the same query (manually, in the editor) to inspect the plan.

Loading SQL editor...

Mistakes to Watch For #

  • Trusting estimated cardinalities. EXPLAIN shows the optimizer's guess; EXPLAIN ANALYZE shows what really happened. Large divergence between estimated and actual rows usually means stale stats.
  • Optimizing without measuring. "I'll rewrite this as a CTE to make it faster," except the optimizer often flattens CTEs inline and the plan is identical. Benchmark before and after.
  • Reading plans top-down as execution order. Plans are printed top-down, executed bottom-up. Leaf scans happen first.
  • Ignoring the plan at small data sizes. At 10 rows everything is fast. But the shape (scans vs. index seeks, filter placement, join order) tells you what happens when the table grows to 10M rows.
  • Adding indexes to "fix" a slow query before running EXPLAIN. Half the time the bottleneck is a missing filter, a bad join, or an accidental cross product, and no index will save you.

Knowledge check #

4 questions

0 / 4 answered
  1. In Postgres, EXPLAIN ANALYZE differs from EXPLAIN because:

  2. A Seq Scan on a 100M-row table for a WHERE id = 42 query usually means:

  3. A nested loop join in the plan is most efficient when:

  4. You run EXPLAIN and see one step with very high estimated cost. Before adding an index, you should:

Next Step #

That wraps the Performance & Modeling section. Continue to the Capstone, starting with KPI Baseline, to put joins, window functions, and modeling choices to work on an end-to-end analytics mini-project.