Practice

Event funnels: four drills

Four funnel drills, ordered easy to hard, all on one clickstream table. The funnel is view → add_to_cart → checkout → purchase. The default editor shows a working solution - try clearing it and writing your own first; the hidden checks will grade what's there when you press Run.

Table: ecom_events with event_id, customer_id, event_type, product_id, event_timestamp. Customers can repeat events, skip steps, and (at least one does) fire steps out of order.

1. Distinct users per step (Easy) #

Task: count how many distinct customers ever fired each event type. Output event_type, users. One row per event type. Order by users DESC - the funnel reads top-down for free, since every step loses people. Counts are whole numbers; no rounding involved.

The skill: funnel grain is users, not events. A customer who viewed five times is still one user at the view step - COUNT(*) would silently count events.

Loading SQL editor...

2. Step-to-step conversion rates (Medium) #

Task: collapse the funnel to a single row of user counts plus the conversion rate between each adjacent pair of steps. Output users_view, users_cart, users_checkout, users_purchase, view_to_cart_pct, cart_to_checkout_pct, checkout_to_purchase_pct. Exactly one row. No ordering requirement. Each percentage is 100.0 * step / prior step, rounded to 1 decimal place, with the denominator wrapped in NULLIF(..., 0).

This is the any-touch flavor: a customer counts at a step if they ever fired it, regardless of order. The skill: collapse to one row per customer with MAX(CASE WHEN ...) flags first, then aggregate the flags - and never divide by a raw step count that could be zero.

Loading SQL editor...

3. Time-ordered funnel (Medium) #

Task: rebuild the funnel counts, but a customer only counts at step N if their first step-N event is at or after their first step-N-1 event - all the way up the chain. Output users_view, users_cart, users_checkout, users_purchase. Exactly one row, whole-number counts, no rounding. No ordering requirement.

Drill 2 was the any-touch version: it credits a customer who purchased first and only added to cart days later as a full conversion. The time-ordered version refuses that credit - at least one customer in this table fires steps out of order, so your purchase count here must come out lower than drill 2's. Comparing a step against a NULL first-timestamp yields NULL, which the filter treats as false - exactly the drop-off behavior you want.

Loading SQL editor...

4. Time-to-convert (Hard) #

Task: for customers who completed the time-ordered funnel from drill 3 (first view ≤ first add_to_cart ≤ first checkout ≤ first purchase), measure how long conversion takes: the elapsed time from their first view to their first purchase. Output converting_users and avg_hours_to_convert. Exactly one row. No ordering requirement.

Aggregate and rounding, exactly: the average (mean, not median) of per-customer hours, where hours = elapsed seconds / 3600.0, rounded to 1 decimal place with ROUND(..., 1).

The skill: subtracting two timestamps gives an interval, not a number - EXTRACT(EPOCH FROM ...) converts it to seconds so you can do arithmetic. And the cohort must be time-ordered converters only: averaging in the out-of-order customer would produce a negative or meaningless duration.

Loading SQL editor...
Tip

The funnel patterns to remember:

  • Grain first - funnels count distinct users per step. COUNT(DISTINCT customer_id) or per-user MAX(CASE ...) flags; never raw event counts.
  • Any-touch = MAX(CASE WHEN ...) flags per user. Fast to write, but it credits out-of-order users and inflates conversion.
  • Time-ordered = MIN(CASE WHEN ... THEN event_timestamp END) per step, then chain t_step_n >= t_step_n_minus_1 all the way up. Honest, at the cost of undercounting users whose first occurrences are out of order but who later completed the sequence properly.
  • Rates = ROUND(100.0 * step / NULLIF(prior_step, 0), 1) - the NULLIF guard keeps an empty top of funnel from crashing the query.
  • Durations = EXTRACT(EPOCH FROM (t_end - t_start)) / 3600.0 for hours; only measure them on the cohort that actually converted in order.