Reference

Date and Time Modeling

Use this page to model time correctly across facts, dimensions, and serving outputs.

Goal:

  • standardize date/time semantics
  • avoid timezone and calendar drift defects
  • support event-based, reporting-based, and fiscal analysis

Time Semantics You Must Define #

For each metric, define which time field it uses:

  • event time: when activity happened
  • posting/accounting time: when it entered financial books
  • processing/load time: when warehouse received it
  • snapshot time: state capture timestamp

Date Dimension Template #

sql
CREATE TABLE dim_date (
  date_key INTEGER PRIMARY KEY,      -- yyyymmdd
  calendar_date DATE,
  day_of_week INTEGER,
  day_name VARCHAR,
  week_of_year INTEGER,
  month INTEGER,
  month_name VARCHAR,
  quarter INTEGER,
  year INTEGER,
  fiscal_month INTEGER,
  fiscal_quarter INTEGER,
  fiscal_year INTEGER,
  is_month_end BOOLEAN,
  is_quarter_end BOOLEAN,
  is_year_end BOOLEAN
);

Role-Playing Date Keys #

A single fact often needs multiple date roles:

  • order_date_key
  • ship_date_key
  • invoice_date_key
  • due_date_key

All can reference the same dim_date table using different aliases in SQL.

sql
SELECT
  o.order_id,
  d_order.calendar_date AS order_date,
  d_ship.calendar_date AS ship_date,
  d_invoice.calendar_date AS invoice_date
FROM fact_order o
JOIN dim_date d_order ON d_order.date_key = o.order_date_key
LEFT JOIN dim_date d_ship ON d_ship.date_key = o.ship_date_key
LEFT JOIN dim_date d_invoice ON d_invoice.date_key = o.invoice_date_key;

Timezone Strategy #

Recommended approach:

  • store base timestamps in UTC
  • where local-time analysis matters, store the source IANA timezone name (for example America/New_York), because a fixed offset breaks across DST transitions
  • convert to reporting timezone at serving layer
  • document which timezone each KPI uses

DuckDB-Compatible Time Handling Example #

sql
SELECT
  order_id,
  order_ts_utc,
  order_ts_utc::DATE AS order_date_utc,
  DATE_TRUNC('month', order_ts_utc) AS order_month_utc
FROM fact_order_item;

Common Failure Modes #

  • Mixing event date and posting date in one metric trend.
  • Using local timestamp fields without standardized timezone conversion.
  • Aggregating snapshots as if they were transactions.
  • Missing fiscal calendar mapping for finance reporting.

Validation SQL #

sql
-- 1) fact date keys must exist in dim_date
SELECT COUNT(*) AS missing_date_keys
FROM fact_order_item f
LEFT JOIN dim_date d ON d.date_key = f.order_date_key
WHERE d.date_key IS NULL;

-- 2) future-dated transaction sanity check
-- order_ts_utc is a naive UTC timestamp; compare against UTC "now",
-- not session-local CURRENT_TIMESTAMP, or the check shifts by the session offset
SELECT COUNT(*) AS future_rows
FROM fact_order_item
WHERE order_ts_utc > CURRENT_TIMESTAMP AT TIME ZONE 'UTC';

-- 3) fiscal mapping completeness
SELECT COUNT(*) AS missing_fiscal_map
FROM dim_date
WHERE fiscal_year IS NULL OR fiscal_month IS NULL;

Senior Engineer Playbook #

  • DST regression test: keep a fixture date on each DST boundary and assert UTC-to-reporting-timezone conversion round-trips for it. Offset-based logic fails exactly there.
  • Failure postmortem pattern: trace timezone conversion defects by comparing raw UTC and reporting-time aggregates.
  • Migration playbook: when changing fiscal calendar logic, publish dual-calendar metrics before hard switch.

Interactive SQL Check #

Run this check to validate monthly aggregation behavior using a single event-time column.

Loading SQL editor...
Tip

If you do not define time semantics upfront, teams will create incompatible versions of the same KPI.