Bridge Tables and Many-to-Many Modeling
Use this page to model many-to-many relationships safely in dimensional systems.
Goal:
- preserve business meaning for multivalued relationships
- prevent metric duplication in joins
- expose analyst-friendly patterns for allocation and grouping
Where Many-to-Many Appears #
Common examples:
- policy shared by multiple insured parties
- sale attributed to multiple channels or campaigns
- customer belonging to multiple segments
- product mapped to multiple taxonomies
Bridge Table Pattern #
A bridge table resolves many-to-many by storing relationship rows and optional allocation weights.
CREATE TABLE bridge_policy_insured (
policy_key BIGINT,
insured_party_key BIGINT,
relationship_type VARCHAR,
allocation_weight DECIMAL(18,8),
effective_start_ts TIMESTAMP,
effective_end_ts TIMESTAMP,
is_current BOOLEAN
);Allocation Strategy #
If one fact row maps to multiple bridge rows, decide one of these explicitly:
- full attribution (duplicate total intentionally)
- weighted attribution (split total by weights)
- primary attribution (single chosen relationship)
Document strategy in metric definitions.
Weighted Attribution Query #
SELECT
b.insured_party_key,
SUM(f.written_premium * b.allocation_weight) AS allocated_written_premium,
SUM(f.earned_premium * b.allocation_weight) AS allocated_earned_premium
FROM fact_policy_financials f
JOIN bridge_policy_insured b
ON b.policy_key = f.policy_key
AND f.snapshot_ts >= b.effective_start_ts
AND f.snapshot_ts < b.effective_end_ts
GROUP BY b.insured_party_key;Validation SQL #
-- 1) for weighted-allocation bridges, weights should sum to 1 by policy/version
-- (full-attribution bridges carry weight = 1 on every row and intentionally
-- fail this check -- skip it there and validate the duplication factor instead)
SELECT policy_key, effective_start_ts, SUM(allocation_weight) AS weight_sum
FROM bridge_policy_insured
GROUP BY policy_key, effective_start_ts
HAVING ROUND(SUM(allocation_weight), 6) <> 1;
-- 2) duplicate relationship rows
SELECT policy_key, insured_party_key, effective_start_ts, COUNT(*) AS row_count
FROM bridge_policy_insured
GROUP BY policy_key, insured_party_key, effective_start_ts
HAVING COUNT(*) > 1;
-- 3) overlap checks on validity windows
SELECT a.policy_key
FROM bridge_policy_insured a
JOIN bridge_policy_insured b
ON a.policy_key = b.policy_key
AND a.insured_party_key = b.insured_party_key
AND a.effective_start_ts < b.effective_end_ts
AND b.effective_start_ts < a.effective_end_ts
AND a.effective_start_ts <> b.effective_start_ts
GROUP BY a.policy_key;Common Mistakes #
- Joining a many-to-many relationship directly without weight logic.
- Not versioning bridge rows when relationships change over time.
- Mixing primary and weighted attribution in different dashboards.
- Leaving allocation semantics undocumented.
Senior Engineer Playbook #
- Allocation reconciliation: assert SUM(metric) before the bridge join equals SUM(metric * weight) after it, one check that catches both bad weights and unintended fanout.
- Failure postmortem pattern: compare allocated vs unallocated totals by period to detect attribution drift.
- Migration playbook: introduce bridge with parallel output tables, then switch consumers after weighted reconciliation sign-off.
Interactive SQL Check #
Allocate earned premium to insured parties with a weighted bridge. The seed data has no bridge table, so the starter seeds one inline via a VALUES CTE (weights per policy sum to 1).
Join the bridge to ins_premium_facts on policy_id, multiply earned_premium by allocation_weight, and return one row per insured_party with the allocated total rounded to 2 decimals.
Related Lessons #
Bridge modeling is mostly about semantic control. If allocation rules are unclear, every KPI downstream can diverge.