Reference

Case Study: Ecommerce KPI Regression

Case: A stakeholder claims average order value (AOV) regressed in April: "the dashboard shows it down versus March." Your task: verify or refute the claim with evidence.

This is the real shape of most "metric regressed" tickets. Sometimes the metric did move; often the definition moved. The skill being interviewed is the verification workflow, and it works regardless of which way the conclusion lands.

Interview Flow #

Pin down the claimed definition (gross or net of returns? completed orders only or all statuses?).
Recompute the metric yourself, under each plausible definition, at month grain.
Validate the inputs (join fanout, return coverage) so your numbers are trustworthy.
Conclude: confirm the regression, refute it, or show it is a definitional artifact.
Define preventive controls (one published metric definition, change review on dashboard logic).

Runnable Drill #

Compute monthly AOV both ways (gross and net of returns) over completed orders, with returns attributed to the order's month.

Starter query is executable and can be extended with additional checks.

Loading SQL editor...

Model Conclusion #

On this dataset the claim is refuted under every consistent definition:

Gross AOV: March 100.00 (5 orders, 500.00) → April 124.00 (5 orders, 620.00), up 24%.
Net AOV (returns against order month): March 100.00 (no returns on March orders) → April 105.00 (95.00 returned), still up 5%.

The only way to make April look down is to mix definitions across months: March computed over all statuses (including the cancelled 150.00 order: 650.00 / 6 = 108.33) compared against April net of returns (105.00) shows an apparent ~3% drop. That is the likely source of the stakeholder's number: a definition switch, not a regression.

Note how the conclusion is the cheap part: the credibility comes from showing both definitions, the validation queries below, and naming the artifact. The same workflow would have confirmed a real regression just as cleanly.

Validation Queries #

sql

-- 1) Detect order-level duplication risk after item join
SELECT
  o.order_id,
  ROUND(o.order_total, 2) AS order_total,
  ROUND(SUM(oi.line_total), 2) AS item_total
FROM ecom_orders o
JOIN ecom_order_items oi
  ON oi.order_id = o.order_id
WHERE o.order_status = 'completed'
GROUP BY o.order_id, o.order_total
HAVING ROUND(o.order_total, 2) <> ROUND(SUM(oi.line_total), 2);

-- 2) Check return coverage by order item
SELECT
  oi.order_item_id,
  ROUND(oi.line_total, 2) AS line_total,
  ROUND(COALESCE(r.return_amount, 0), 2) AS return_amount
FROM ecom_order_items oi
LEFT JOIN ecom_returns r
  ON r.order_item_id = oi.order_item_id
WHERE COALESCE(r.return_amount, 0) > oi.line_total;

Scoring Rubric #

triage structure and prioritization (30%), strong: pins down the claimed definition before touching SQL
correctness of SQL and grain control (35%), strong: distinct order counts survive the item join, and returns attribution is stated, not assumed
validation and reconciliation evidence (20%), strong: shows header-vs-line and return-coverage checks alongside the metric
conclusion and prevention plan quality (15%), strong: states verify/refute with numbers and proposes one published metric definition to prevent recurrence

Tip

Interview strength comes from structured triage and validation evidence, not fast guesses.