Lesson Advanced

Capstone Part 1: KPI Baseline

You've just joined a new analytics team. They hand you three raw tables (users, orders, products) and ask: "How's the business doing?" No dashboard exists yet. Your first job is to produce a defensible KPI baseline: order volume, revenue, and average order value (AOV) over time.
This is where everything from the course so far pays off. Over the next three lessons you'll build a real end-to-end analysis: baseline (this lesson) → retention → segmentation + recommendation.

Step 1: Validate the Grain #

Before aggregating anything, check what one row in orders represents. If the grain is wrong, every downstream KPI is wrong.

sql
-- Is orders.id unique? (It should be — one row per order.)
SELECT
  COUNT(*) AS total_rows,
  COUNT(DISTINCT id) AS distinct_ids,
  COUNT(*) - COUNT(DISTINCT id) AS duplicates
FROM orders;

A duplicate here would double-count revenue. On real data, run this check on every new source before you trust it.

Step 2: Define "Revenue" #

"Revenue" isn't obvious; it's a product decision. Look at the distinct statuses first:

sql
SELECT
  status,
  COUNT(*) AS orders,
  ROUND(SUM(total), 2) AS gross_total
FROM orders
GROUP BY status
ORDER BY orders DESC;

paid is revenue you've earned. pending hasn't closed yet. cancelled / refunded are negatives. For this capstone we'll define revenue = SUM(total) WHERE status = 'paid'. State the definition clearly, because every stakeholder will ask.

Warning

Metric definitions are contracts. "Revenue" means different things to finance (recognized revenue, excluding refunds) and growth (gross bookings). A KPI with no documented definition drifts: two teams will report "revenue was $1.2M" and mean different numbers. Put the definition in writing and stick to it.

Step 3: Daily KPI Baseline #

sql
-- Daily: order count, revenue, AOV (paid only)
SELECT
  order_date,
  COUNT(*) AS paid_orders,
  ROUND(SUM(total), 2) AS revenue,
  ROUND(AVG(total), 2) AS aov
FROM orders
WHERE status = 'paid'
GROUP BY order_date
ORDER BY order_date;

Read it: are order counts growing? Is AOV stable, rising, or falling? A steady order count with rising AOV = pricing/mix win. Rising order count with falling AOV = discount-driven growth (watch margins). This is where the analysis, not just the SQL, starts.

Try It #

Run the daily KPI baseline. With ~10 orders across a few months, the per-day view is noisy. Modify it below to roll up by month using DATE_TRUNC.

Loading SQL editor...

Practice #

Build the monthly paid KPI baseline. Output: month (first day of the month, from DATE_TRUNC), paid_orders, revenue (SUM of total, rounded to 2 decimals), aov (AVG of total, rounded to 2 decimals). Include only orders with status = 'paid'. No ordering requirement.

Loading SQL editor...

Mistakes to Watch For #

  • Reporting gross revenue without filtering status. Including cancelled and refunded inflates the number. Always state the filter.
  • Skipping grain validation. A join that duplicates orders (joining to order_items without collapsing) will double revenue silently.
  • Presenting noisy daily numbers to execs. At small data volume, daily KPIs swing wildly. Roll up to weekly/monthly for narrative; keep daily for operational dashboards.
  • Confusing AOV with ARPU. AOV = revenue / paid-order-count. ARPU = revenue / distinct-buyer-count. Different denominators, different stories.
  • Hardcoding date ranges in the SQL. If the report needs "last 30 days," use WHERE order_date >= CURRENT_DATE - INTERVAL 30 DAY — not a baked-in literal that stops working next month.

Next Step #

Continue to Capstone Part 2: Cohorts & Retention. Is the growth you just measured coming from repeat customers, or from a treadmill of one-time buyers?