Capstone Part 1: KPI Baseline
You've just joined a new analytics team. They hand you three raw tables (users, orders, products) and ask: "How's the business doing?" No dashboard exists yet. Your first job is to produce a defensible KPI baseline: order volume, revenue, and average order value (AOV) over time.
This is where everything from the course so far pays off. Over the next three lessons you'll build a real end-to-end analysis: baseline (this lesson) → retention → segmentation + recommendation.
Step 1: Validate the Grain #
Before aggregating anything, check what one row in orders represents. If the grain is wrong, every downstream KPI is wrong.
-- Is orders.id unique? (It should be — one row per order.)
SELECT
COUNT(*) AS total_rows,
COUNT(DISTINCT id) AS distinct_ids,
COUNT(*) - COUNT(DISTINCT id) AS duplicates
FROM orders;A duplicate here would double-count revenue. On real data, run this check on every new source before you trust it.
Step 2: Define "Revenue" #
"Revenue" isn't obvious; it's a product decision. Look at the distinct statuses first:
SELECT
status,
COUNT(*) AS orders,
ROUND(SUM(total), 2) AS gross_total
FROM orders
GROUP BY status
ORDER BY orders DESC;paid is revenue you've earned. pending hasn't closed yet. cancelled / refunded are negatives. For this capstone we'll define revenue = SUM(total) WHERE status = 'paid'. State the definition clearly, because every stakeholder will ask.
Metric definitions are contracts. "Revenue" means different things to finance (recognized revenue, excluding refunds) and growth (gross bookings). A KPI with no documented definition drifts: two teams will report "revenue was $1.2M" and mean different numbers. Put the definition in writing and stick to it.
Step 3: Daily KPI Baseline #
-- Daily: order count, revenue, AOV (paid only)
SELECT
order_date,
COUNT(*) AS paid_orders,
ROUND(SUM(total), 2) AS revenue,
ROUND(AVG(total), 2) AS aov
FROM orders
WHERE status = 'paid'
GROUP BY order_date
ORDER BY order_date;Read it: are order counts growing? Is AOV stable, rising, or falling? A steady order count with rising AOV = pricing/mix win. Rising order count with falling AOV = discount-driven growth (watch margins). This is where the analysis, not just the SQL, starts.
Try It #
Run the daily KPI baseline. With ~10 orders across a few months, the per-day view is noisy. Modify it below to roll up by month using DATE_TRUNC.
Practice #
Build the monthly paid KPI baseline. Output: month (first day of the month, from DATE_TRUNC), paid_orders, revenue (SUM of total, rounded to 2 decimals), aov (AVG of total, rounded to 2 decimals). Include only orders with status = 'paid'. No ordering requirement.
Mistakes to Watch For #
- Reporting gross revenue without filtering status. Including
cancelledandrefundedinflates the number. Always state the filter. - Skipping grain validation. A join that duplicates orders (joining to
order_itemswithout collapsing) will double revenue silently. - Presenting noisy daily numbers to execs. At small data volume, daily KPIs swing wildly. Roll up to weekly/monthly for narrative; keep daily for operational dashboards.
- Confusing AOV with ARPU. AOV = revenue / paid-order-count. ARPU = revenue / distinct-buyer-count. Different denominators, different stories.
- Hardcoding date ranges in the SQL. If the report needs "last 30 days," use
WHERE order_date >= CURRENT_DATE - INTERVAL 30 DAY— not a baked-in literal that stops working next month.
Next Step #
Continue to Capstone Part 2: Cohorts & Retention. Is the growth you just measured coming from repeat customers, or from a treadmill of one-time buyers?