Self Join
A self join is any join where both sides are the same table.
You use it to relate rows within a single entity: employees to their managers, orders to other orders by the same user, consecutive events for the same customer.
The Alias Rule #
You must give both copies different aliases.
Without aliases, every column reference is ambiguous; SQL has no way to know which copy of the table you mean. Pick aliases that describe the role each copy plays in the relationship: child/parent, earlier/later, a/b. Never use the bare table name on both sides.
Parent-Child Hierarchy #
The classic example: an employees table where each row has a manager_id pointing to another row in the same table. To show each employee alongside their manager's name, join the table to itself.
(The example builds a tiny employees table with WITH. WITH names a subquery, covered properly in the CTE lesson later. For now, read it as a named intermediate result.)
WITH employees AS (
SELECT * FROM (VALUES
(1, 'CEO', NULL),
(2, 'CTO', 1),
(3, 'Data Lead', 2),
(4, 'Analyst', 3)
) AS t(emp_id, emp_name, manager_id)
)
SELECT
child.emp_name AS employee,
parent.emp_name AS manager
FROM employees AS child
LEFT JOIN employees AS parent
ON child.manager_id = parent.emp_id;LEFT JOIN here is deliberate: the CEO has manager_id = NULL and should still appear (with NULL in the manager column). An INNER JOIN would silently drop the CEO.
Pair Matching #
Self joins also generate pairs of rows from the same table: "all pairs of orders by the same user", "all events within 5 minutes of each other". The trick is to add an inequality like a.id < b.id so each pair appears once, not twice (and no row pairs with itself).
-- All pairs of orders from the same user (each pair once)
SELECT
a.id AS order_a,
b.id AS order_b,
a.user_id
FROM orders AS a
INNER JOIN orders AS b
ON a.user_id = b.user_id
AND a.id < b.id;Try It #
Match employees to their managers. Then switch to INNER JOIN and note what disappears.
Practice #
Using the real orders table, find every pair of paid orders placed by the same user. Return user_id, order_a (the smaller order id), and order_b (the larger). Each pair should appear exactly once. No ordering requirement.
Mistakes to Watch For #
- Forgetting to alias both copies. "Ambiguous column" errors are the most common self-join failure.
- Using
INNER JOINfor hierarchies and losing root rows (like the CEO), whenLEFT JOINpreserves them. - Omitting
a.id < b.idin pair-matching queries, which gives you every pair twice and each row paired with itself. - Chaining many self joins to walk a variable-depth tree — recursive CTEs handle that case properly (different lesson).
Knowledge check #
3 questions
Why are aliases required when self-joining?
For "find each row alongside the next row by date," modern SQL prefers:
A self-join without an explicit "exclude this row" condition:
Next Step #
Continue to Join debugging: diagnosing duplicate rows and missing rows after a join.