Fact Table Patterns
Use this page as a design guide for selecting the correct fact table pattern.
Goal:
- choose the right fact type for your business process
- define grain and metrics correctly
- avoid mixed-grain anti-patterns
- model for both correctness and query performance
Three Core Fact Patterns #
- Transaction fact: one row per event (order item, payment, claim)
- Periodic snapshot fact: one row per entity per period (daily account balance)
- Accumulating snapshot fact: one row per process instance with milestone timestamps (application lifecycle)
Decision Framework #
Use transaction facts when event-level analysis and drill-through are required.
Use periodic snapshots when state-over-time analysis is primary.
Use accumulating snapshots when process throughput and stage durations are central KPIs.
The accumulating snapshot is the odd one out operationally: its rows are UPDATEd in place as milestones complete, while the other two patterns are append-only. Each process instance (application, order, claim) gets exactly one row, whose milestone date columns start NULL and are filled as each stage finishes. Plan the load pipeline for updates, not just inserts.
Transaction Fact Template #
CREATE TABLE fact_order_item (
order_item_sk BIGINT,
order_id VARCHAR,
order_item_id VARCHAR,
order_date_key INTEGER,
customer_key BIGINT,
product_key BIGINT,
channel_key BIGINT,
quantity INTEGER,
gross_amount DECIMAL(18,2),
discount_amount DECIMAL(18,2),
net_amount DECIMAL(18,2),
cost_amount DECIMAL(18,2)
);Periodic Snapshot Template #
CREATE TABLE fact_account_daily_snapshot (
snapshot_date_key INTEGER,
account_key BIGINT,
customer_key BIGINT,
status VARCHAR,
ending_balance DECIMAL(18,2),
overdue_amount DECIMAL(18,2),
is_delinquent BOOLEAN
);Accumulating Snapshot Template #
CREATE TABLE fact_loan_application_lifecycle (
application_key BIGINT,
customer_key BIGINT,
product_key BIGINT,
application_created_date_key INTEGER,
underwriting_completed_date_key INTEGER,
approved_date_key INTEGER,
funded_date_key INTEGER,
current_stage VARCHAR,
requested_amount DECIMAL(18,2),
funded_amount DECIMAL(18,2)
);Additivity Rules #
- Additive: can sum across all dimensions (net_amount).
- Semi-additive: can sum across some dimensions, not time (ending_balance).
- Non-additive: ratios/percentages (loss_ratio). Compute from base measures.
Common Anti-Pattern: Mixed Grain #
Do not place order-level and order-item-level metrics in one fact table.
If one row means "order item", then order-level fields must be replicated deliberately or modeled separately in an order-level fact.
Validation SQL #
-- 1) transaction fact uniqueness
SELECT order_id, order_item_id, COUNT(*) AS row_count
FROM fact_order_item
GROUP BY order_id, order_item_id
HAVING COUNT(*) > 1;
-- 2) periodic snapshot uniqueness
SELECT snapshot_date_key, account_key, COUNT(*) AS row_count
FROM fact_account_daily_snapshot
GROUP BY snapshot_date_key, account_key
HAVING COUNT(*) > 1;
-- 3) accumulating snapshot uniqueness
SELECT application_key, COUNT(*) AS row_count
FROM fact_loan_application_lifecycle
GROUP BY application_key
HAVING COUNT(*) > 1;Performance Notes #
- Keep transaction facts at atomic grain; pre-aggregate for dashboard speed.
- Partition large facts by date where supported.
- Use conformed dimension keys consistently across facts.
- Keep expensive DISTINCT logic out of BI tools when possible.
Senior Engineer Playbook #
- Accumulating snapshot ops: make the milestone UPDATE pass idempotent and re-run the one-row-per-process-instance check after it: a replay that inserts instead of updating is the classic duplication source.
- Failure postmortem pattern: isolate incidents where mixed-grain inserts created silent double counting.
- Migration playbook: cut from legacy mixed fact to split facts with temporary compatibility views and KPI regression checks.
Interactive SQL Check #
Run this query to validate an order-level aggregate built from an item-level transaction fact.
Related Lessons #
Fact table type is a business semantics choice first, a storage choice second.