Loss Ratio Trends & Claims Aging
A single portfolio-level loss ratio hides everything interesting. The questions underwriters actually ask are trend questions: is the ratio improving month over month? Which product line drives it? And how long have the open claims been sitting?
Three graded tasks build that reporting pack:
- a monthly loss-ratio time series for the whole portfolio,
- loss ratio by product line, with a credibility threshold so thin segments don't pollute the view,
- an open-claims aging report bucketed by days outstanding.
Reporting date convention: every "how old is this?" calculation on this page is measured against a fixed as-of date of DATE '2024-06-01', the first day after the dataset's last accrual month. Never use CURRENT_DATE in a graded query: results must be reproducible on any day, and so must real reports.
Task 1: Monthly Portfolio Loss Ratio #
Build the time series at month grain: earned premium per month from ins_premium_facts, incurred claims per month from ins_claims (truncate claim_date to its month), and the ratio between them.
Two traps to design around:
- The month spine comes from premium, not claims. January earns premium but has no claims. Start from
ins_premium_factsmonths and LEFT JOIN the claim totals, or January silently disappears and the trend starts a month late. - Aggregate each side to month grain before joining. Joining raw claims to raw fact rows multiplies premium by claims-in-month.
Output contract: one row per month with month_date, earned_premium (rounded to 2), incurred_claims (rounded to 2, 0.00 for claim-free months), and loss_ratio = incurred / earned with ROUND(..., 4) and a NULLIF guard on the denominator. Expect five rows, January through May 2024. February spikes above 2.0 (a big early Commercial claim against a young book), then the ratio cools every month after.
Task 2: Loss Ratio by Product Line (Credibility Threshold) #
Slice the same metric by product_line, but with a rule real reporting packs always carry: segments with too little earned premium are excluded, because one claim against a tiny base produces a wild, meaningless ratio. Actuaries call this a credibility threshold.
The threshold for this task: keep only product lines with total earned premium of at least 800. Apply it with HAVING on the unrounded earned sum.
Stay fanout-safe exactly as in the graded drills: aggregate premium and claims to policy_id in CTEs, join both to ins_policies, then roll up to product line.
Output contract: product_line, earned_premium (rounded 2), incurred_claims (rounded 2), loss_ratio (ROUND(..., 4), NULLIF guard). Expect three rows: Health falls just under the threshold and drops out, and Home is the only line running above 1.0.
Task 3: Open-Claims Aging Report #
Claims teams live by the aging report: how many unresolved claims are out there, how much money do they represent, and how stale are they? A claim aging past 60 days usually triggers an escalation.
Build it for unresolved claims — claim_status IN ('Open', 'Reported') — measuring age as days between claim_date and the fixed as-of date DATE '2024-06-01' (in DuckDB, subtracting two dates yields the day count directly). Bucket the ages with a CASE:
'0-30 days': age ≤ 30'31-60 days': age 31–60'60+ days': older than 60
Output contract: one row per non-empty bucket with age_bucket, claim_count, and open_incurred (summed incurred, rounded to 2). All three buckets are populated here, and the picture is uncomfortable: the oldest bucket holds both the most claims and by far the most money, led by the February claims that are still open.
Notice how the three views tell one story: the portfolio ratio looks like it is healing month over month, but the aging report shows the February losses have not gone away. They are just sitting unresolved in the 60+ bucket. Trend metrics and aging metrics are always read together for exactly this reason.
Next Step #
You now have every piece the final exercise asks for: