Date and Time Modeling
Use this page to model time correctly across facts, dimensions, and serving outputs.
Goal:
- standardize date/time semantics
- avoid timezone and calendar drift defects
- support event-based, reporting-based, and fiscal analysis
Time Semantics You Must Define #
For each metric, define which time field it uses:
- event time: when activity happened
- posting/accounting time: when it entered financial books
- processing/load time: when warehouse received it
- snapshot time: state capture timestamp
Date Dimension Template #
sql
CREATE TABLE dim_date (
date_key INTEGER PRIMARY KEY, -- yyyymmdd
calendar_date DATE,
day_of_week INTEGER,
day_name VARCHAR,
week_of_year INTEGER,
month INTEGER,
month_name VARCHAR,
quarter INTEGER,
year INTEGER,
fiscal_month INTEGER,
fiscal_quarter INTEGER,
fiscal_year INTEGER,
is_month_end BOOLEAN,
is_quarter_end BOOLEAN,
is_year_end BOOLEAN
);Role-Playing Date Keys #
A single fact often needs multiple date roles:
- order_date_key
- ship_date_key
- invoice_date_key
- due_date_key
All can reference the same dim_date table using different aliases in SQL.
sql
SELECT
o.order_id,
d_order.calendar_date AS order_date,
d_ship.calendar_date AS ship_date,
d_invoice.calendar_date AS invoice_date
FROM fact_order o
JOIN dim_date d_order ON d_order.date_key = o.order_date_key
LEFT JOIN dim_date d_ship ON d_ship.date_key = o.ship_date_key
LEFT JOIN dim_date d_invoice ON d_invoice.date_key = o.invoice_date_key;Timezone Strategy #
Recommended approach:
- store base timestamps in UTC
- where local-time analysis matters, store the source IANA timezone name (for example America/New_York), because a fixed offset breaks across DST transitions
- convert to reporting timezone at serving layer
- document which timezone each KPI uses
DuckDB-Compatible Time Handling Example #
sql
SELECT
order_id,
order_ts_utc,
order_ts_utc::DATE AS order_date_utc,
DATE_TRUNC('month', order_ts_utc) AS order_month_utc
FROM fact_order_item;Common Failure Modes #
- Mixing event date and posting date in one metric trend.
- Using local timestamp fields without standardized timezone conversion.
- Aggregating snapshots as if they were transactions.
- Missing fiscal calendar mapping for finance reporting.
Validation SQL #
sql
-- 1) fact date keys must exist in dim_date
SELECT COUNT(*) AS missing_date_keys
FROM fact_order_item f
LEFT JOIN dim_date d ON d.date_key = f.order_date_key
WHERE d.date_key IS NULL;
-- 2) future-dated transaction sanity check
-- order_ts_utc is a naive UTC timestamp; compare against UTC "now",
-- not session-local CURRENT_TIMESTAMP, or the check shifts by the session offset
SELECT COUNT(*) AS future_rows
FROM fact_order_item
WHERE order_ts_utc > CURRENT_TIMESTAMP AT TIME ZONE 'UTC';
-- 3) fiscal mapping completeness
SELECT COUNT(*) AS missing_fiscal_map
FROM dim_date
WHERE fiscal_year IS NULL OR fiscal_month IS NULL;Senior Engineer Playbook #
- DST regression test: keep a fixture date on each DST boundary and assert UTC-to-reporting-timezone conversion round-trips for it. Offset-based logic fails exactly there.
- Failure postmortem pattern: trace timezone conversion defects by comparing raw UTC and reporting-time aggregates.
- Migration playbook: when changing fiscal calendar logic, publish dual-calendar metrics before hard switch.
Interactive SQL Check #
Run this check to validate monthly aggregation behavior using a single event-time column.
Loading SQL editor...
Related Lessons #
Tip
If you do not define time semantics upfront, teams will create incompatible versions of the same KPI.