SQL cheat sheet
One-page reference for the SQL you'll write 95% of the time. PostgreSQL-style syntax; dialect notes inline where engines disagree. Click any heading topic for the detail page.
Query basics #
SELECT col_a, col_b AS alias
FROM table_name
WHERE status = 'active'
AND created_at >= DATE '2026-01-01'
ORDER BY created_at DESC
LIMIT 100;
-- Distinct values.
SELECT DISTINCT region FROM orders;
-- Counting rows.
SELECT COUNT(*) FROM orders;Filtering #
WHERE col = 'x' -- exact match
WHERE col <> 'x' -- not equal (also: !=)
WHERE col IN ('a', 'b', 'c') -- membership
WHERE col NOT IN ('a', 'b') -- exclusion
WHERE col BETWEEN 10 AND 20 -- range, inclusive
WHERE col LIKE 'abc%' -- prefix match
WHERE col ILIKE 'abc%' -- case-insensitive (Postgres / DuckDB)
WHERE col IS NULL -- NULL test (= NULL never works)
WHERE col IS NOT NULL
WHERE col_a = 1 AND col_b = 2 -- combine with AND / OR / NOTJoins #
-- INNER: keep only matched rows.
SELECT * FROM a INNER JOIN b ON a.id = b.a_id;
-- LEFT: keep all left rows; NULL right when no match.
SELECT * FROM a LEFT JOIN b ON a.id = b.a_id;
-- RIGHT, FULL OUTER, CROSS — same shape, different match rules.
SELECT * FROM a CROSS JOIN b; -- every left row × every right row
-- Multi-table.
SELECT *
FROM orders o
JOIN users u ON u.id = o.user_id
LEFT JOIN coupons c ON c.id = o.coupon_id;Detail: INNER vs LEFT JOIN, Joins.
Group by, aggregates, and HAVING #
SELECT
region,
COUNT(*) AS row_count,
COUNT(DISTINCT user_id) AS unique_users,
SUM(total) AS revenue,
AVG(total) AS mean_order_value,
MIN(created_at) AS first_seen,
MAX(created_at) AS last_seen
FROM orders
WHERE status = 'paid' -- filter rows BEFORE grouping
GROUP BY region
HAVING SUM(total) > 10000 -- filter groups AFTER grouping
ORDER BY revenue DESC;CTEs (WITH) #
WITH paid_orders AS (
SELECT * FROM orders WHERE status = 'paid'
),
totals AS (
SELECT user_id, SUM(total) AS lifetime
FROM paid_orders
GROUP BY user_id
)
SELECT u.email, t.lifetime
FROM totals t
JOIN users u ON u.id = t.user_id
ORDER BY t.lifetime DESC;
-- Recursive CTE: find descendants in a tree.
WITH RECURSIVE descendants AS (
SELECT id, parent_id, 0 AS depth FROM categories WHERE id = 1
UNION ALL
SELECT c.id, c.parent_id, d.depth + 1
FROM categories c
JOIN descendants d ON d.id = c.parent_id
)
SELECT * FROM descendants;Window functions #
SELECT
user_id,
order_date,
total,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date) AS order_seq,
RANK() OVER (PARTITION BY user_id ORDER BY total DESC) AS spend_rank,
LAG(total) OVER (PARTITION BY user_id ORDER BY order_date) AS prev_total,
SUM(total) OVER (PARTITION BY user_id ORDER BY order_date) AS running_total,
AVG(total) OVER (PARTITION BY user_id ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling_7
FROM orders;Detail: Window functions cheat sheet, Running total.
NULL handling #
-- Always use IS NULL / IS NOT NULL — never = NULL.
WHERE col IS NULL;
WHERE col IS NOT NULL;
-- Substitute a default for NULL.
COALESCE(col, 0) -- first non-NULL of the args
COALESCE(col_a, col_b, 'fallback')
-- Conditional swap.
NULLIF(a, b) -- returns NULL if a = b, else a (avoid div by 0)
-- Aggregates ignore NULLs by default.
COUNT(col) -- counts non-NULL values
COUNT(*) -- counts every row, NULLs included
SUM(col) -- ignores NULLs (returns NULL on empty input)CASE expressions #
SELECT
total,
CASE
WHEN total < 50 THEN 'small'
WHEN total < 500 THEN 'medium'
ELSE 'large'
END AS bucket
FROM orders;
-- Inside aggregates: pivot, count-where.
SELECT
region,
SUM(CASE WHEN status = 'paid' THEN total ELSE 0 END) AS paid_revenue,
COUNT(*) FILTER (WHERE status = 'paid') AS paid_count -- Postgres / DuckDB
FROM orders
GROUP BY region;Set operations #
SELECT id FROM a UNION ALL SELECT id FROM b; -- concat, keeps dupes (faster)
SELECT id FROM a UNION SELECT id FROM b; -- concat, removes dupes (slower)
SELECT id FROM a INTERSECT SELECT id FROM b; -- in both
SELECT id FROM a EXCEPT SELECT id FROM b; -- in a but not b (Oracle: MINUS)Detail: UNION vs UNION ALL.
Schema (DDL) #
-- Create.
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email VARCHAR NOT NULL,
signup_date DATE DEFAULT CURRENT_DATE,
deleted_at TIMESTAMP,
CONSTRAINT users_email_uk UNIQUE (email)
);
CREATE INDEX users_signup_date_idx ON users(signup_date);
CREATE VIEW active_users AS SELECT * FROM users WHERE deleted_at IS NULL;
-- Modify.
ALTER TABLE users ADD COLUMN phone VARCHAR;
ALTER TABLE users RENAME COLUMN phone TO mobile;
ALTER TABLE users DROP COLUMN mobile;
ALTER TABLE users RENAME TO members;
-- Remove.
DROP TABLE IF EXISTS legacy_table;
DROP VIEW IF EXISTS active_users;
DROP INDEX IF EXISTS users_signup_date_idx;Data changes (DML) #
-- Insert.
INSERT INTO users (id, email) VALUES (1, 'a@x.com');
INSERT INTO users (id, email) VALUES (2, 'b@x.com'), (3, 'c@x.com');
INSERT INTO target (id, name) SELECT id, name FROM source WHERE flag = TRUE;
-- Insert-or-update (UPSERT) — re-uses id=1 on purpose to demonstrate the conflict path.
INSERT INTO users (id, email) VALUES (1, 'new@x.com')
ON CONFLICT (id) DO UPDATE SET email = EXCLUDED.email; -- Postgres / SQLite / DuckDB
-- Update.
UPDATE users SET plan = 'pro' WHERE id = 1;
UPDATE orders
SET user_email = u.email
FROM users AS u WHERE orders.user_id = u.id; -- Postgres / DuckDB
-- Delete.
DELETE FROM users WHERE last_seen_at < DATE '2024-01-01';
TRUNCATE TABLE staging; -- empties fast
-- Wrap risky DML in a transaction.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
-- ROLLBACK; to undo if anything looks wrong before COMMIT.Detail: INSERT, UPDATE, DELETE, BEGIN/COMMIT/ROLLBACK.
Common functions #
-- Strings.
LENGTH(s) UPPER(s) LOWER(s)
TRIM(s) LTRIM(s) RTRIM(s)
SUBSTRING(s, start, length) REPLACE(s, old, new)
CONCAT(a, b, c) a || b || c -- ANSI concat
SPLIT_PART(s, delim, n) -- Postgres / DuckDB / Snowflake / Redshift
-- Numbers.
ROUND(x, 2) CEIL(x) FLOOR(x) ABS(x)
MOD(a, b) a % b
POWER(x, 2) SQRT(x) EXP(x) LN(x)
-- Dates.
CURRENT_DATE CURRENT_TIMESTAMP
DATE_TRUNC('month', d) EXTRACT(YEAR FROM d)
d + INTERVAL '7 days' d - INTERVAL '1 month'
end_date - start_date -- days (Postgres / DuckDB)
-- Type casting.
CAST('2026-04-26' AS DATE) '2026-04-26'::DATE -- Postgres / DuckDB shorthandQuick wins (the patterns to memorize) #
-- Top N per group (most common analytics ask).
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY g ORDER BY metric DESC) AS rn
FROM t
) ranked
WHERE rn <= 5;
-- Find duplicates by a key.
SELECT key_col, COUNT(*) FROM t GROUP BY key_col HAVING COUNT(*) > 1;
-- Pivot: rows → columns.
SELECT
user_id,
SUM(CASE WHEN event = 'login' THEN 1 ELSE 0 END) AS logins,
SUM(CASE WHEN event = 'click' THEN 1 ELSE 0 END) AS clicks
FROM events GROUP BY user_id;
-- Safe percentage with divide-by-zero guard.
ROUND(100.0 * part / NULLIF(whole, 0), 2)FAQ #
Are SQL keywords case-sensitive?
No. SELECT, select, and Select are equivalent on every major engine. The convention of uppercase keywords is a style choice that improves readability. Most teams follow it, but the engine doesn't require it. Identifiers (table and column names) follow different rules: Postgres folds unquoted names to lowercase; MySQL on Linux makes table and database names filesystem-dependent (controlled by lower_case_table_names) but column names are case-insensitive everywhere; SQL Server treats identifiers as case-insensitive unless the database collation says otherwise.
What's the difference between SQL dialects like PostgreSQL, MySQL, and BigQuery?
The core SQL most analysts write (SELECT, JOIN, GROUP BY, WHERE) is broadly portable across every engine. Differences appear in date functions, string handling, JSON support, and engine-specific extensions. For example, DATE_TRUNC('month', d) works on Postgres but BigQuery wants DATE_TRUNC(d, MONTH) (argument order swapped, no quotes). See Dialect notes for the full list.
Why is SELECT * considered bad practice?
SELECT * considered bad practice?A new column added to the table changes what SELECT returns, and every dependent dashboard, application, or pipeline downstream is affected silently. Naming columns explicitly makes the contract stable. SELECT is fine for ad-hoc exploration; avoid it in production code, views, and stored procedures.
In what order do SQL clauses execute?
Logical evaluation order is FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT. You write SELECT first but the engine evaluates it near the end. This is why WHERE cannot reference column aliases or aggregates: at WHERE time, neither exists yet.
Why does my query return zero rows when I filter on = NULL?
= NULL?NULL is the absence of a value, not a value. Comparisons against NULL (= NULL, <> NULL, != NULL) all return NULL (UNKNOWN), which fails the WHERE filter, so no rows match. Use IS NULL and IS NOT NULL to test for nullness. See NULL comparison.
Is there a performance difference between UNION and UNION ALL?
UNION and UNION ALL?Yes. UNION removes duplicate rows from the combined result, which costs a sort or hash. UNION ALL skips that step. Default to UNION ALL and only reach for UNION when you genuinely need duplicate elimination. See UNION vs UNION ALL.
How do I learn SQL fast?
Start with the Course: sequenced lessons that build on each other. Use this cheat sheet as the syntax reference once you know what you're looking for. Practice with the SQL exercises. Most professionals reach productive in 2-4 weeks of daily practice; mastery takes longer and comes from real work.
The minute you've memorized this sheet, you're 80% productive in any SQL dialect. The remaining 20% is dialect quirks: how the engine spells DATE_DIFF, where STRING_SPLIT lives, when || does concatenation vs logical OR. The site's dialect notes cover those. Next steps: pick the cheat sheet you'll use most (joins, window functions, date functions, string functions) and bookmark it.