Cheat sheet

SQL cheat sheet

Q: Are SQL keywords case-sensitive?

No. SELECT, select, and Select are equivalent on every major engine. The convention of uppercase keywords is a style choice that improves readability. Most teams follow it, but the engine doesn't require it. Identifiers (table and column names) follow different rules: Postgres folds unquoted names to lowercase; MySQL on Linux makes table and database names filesystem-dependent (controlled by lower_case_table_names) but column names are case-insensitive everywhere; SQL Server treats identifiers as case-insensitive unless the database collation says otherwise.

Q: What's the difference between SQL dialects like PostgreSQL, MySQL, and BigQuery?

The core SQL most analysts write (SELECT, JOIN, GROUP BY, WHERE) is broadly portable across every engine. Differences appear in date functions, string handling, JSON support, and engine-specific extensions. For example, DATE_TRUNC('month', d) works on Postgres but BigQuery wants DATE_TRUNC(d, MONTH) (argument order swapped, no quotes). See Dialect notes for the full list.

Q: Why is SELECT * considered bad practice?

A new column added to the table changes what SELECT returns, and every dependent dashboard, application, or pipeline downstream is affected silently. Naming columns explicitly makes the contract stable. SELECT is fine for ad-hoc exploration; avoid it in production code, views, and stored procedures.

Q: In what order do SQL clauses execute?

Logical evaluation order is FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT. You write SELECT first but the engine evaluates it near the end. This is why WHERE cannot reference column aliases or aggregates: at WHERE time, neither exists yet.

Q: Why does my query return zero rows when I filter on = NULL?

NULL is the absence of a value, not a value. Comparisons against NULL (= NULL, <> NULL, != NULL) all return NULL (UNKNOWN), which fails the WHERE filter, so no rows match. Use IS NULL and IS NOT NULL to test for nullness. See NULL comparison.

Q: Is there a performance difference between UNION and UNION ALL?

Yes. UNION removes duplicate rows from the combined result, which costs a sort or hash. UNION ALL skips that step. Default to UNION ALL and only reach for UNION when you genuinely need duplicate elimination. See UNION vs UNION ALL.

One-page reference for the SQL you'll write 95% of the time. PostgreSQL-style syntax; dialect notes inline where engines disagree. Click any heading topic for the detail page.

Query basics #

sql

SELECT col_a, col_b AS alias
FROM   table_name
WHERE  status = 'active'
  AND  created_at >= DATE '2026-01-01'
ORDER  BY created_at DESC
LIMIT  100;

-- Distinct values.
SELECT DISTINCT region FROM orders;

-- Counting rows.
SELECT COUNT(*) FROM orders;

Filtering #

sql

WHERE col = 'x'                      -- exact match
WHERE col <> 'x'                     -- not equal (also: !=)
WHERE col IN ('a', 'b', 'c')         -- membership
WHERE col NOT IN ('a', 'b')          -- exclusion
WHERE col BETWEEN 10 AND 20          -- range, inclusive
WHERE col LIKE 'abc%'                -- prefix match
WHERE col ILIKE 'abc%'               -- case-insensitive (Postgres / DuckDB)
WHERE col IS NULL                    -- NULL test (= NULL never works)
WHERE col IS NOT NULL
WHERE col_a = 1 AND col_b = 2        -- combine with AND / OR / NOT

Joins #

sql

-- INNER: keep only matched rows.
SELECT * FROM a INNER JOIN b ON a.id = b.a_id;

-- LEFT: keep all left rows; NULL right when no match.
SELECT * FROM a LEFT JOIN b ON a.id = b.a_id;

-- RIGHT, FULL OUTER, CROSS — same shape, different match rules.
SELECT * FROM a CROSS JOIN b;        -- every left row × every right row

-- Multi-table.
SELECT *
FROM   orders   o
JOIN   users    u ON u.id = o.user_id
LEFT JOIN coupons c ON c.id = o.coupon_id;

Detail: INNER vs LEFT JOIN, Joins.

Group by, aggregates, and HAVING #

sql

SELECT
  region,
  COUNT(*)         AS row_count,
  COUNT(DISTINCT user_id) AS unique_users,
  SUM(total)       AS revenue,
  AVG(total)       AS mean_order_value,
  MIN(created_at)  AS first_seen,
  MAX(created_at)  AS last_seen
FROM   orders
WHERE  status = 'paid'                -- filter rows BEFORE grouping
GROUP  BY region
HAVING SUM(total) > 10000              -- filter groups AFTER grouping
ORDER  BY revenue DESC;

Detail: WHERE vs HAVING, COUNT(*) vs COUNT(1) vs COUNT(col).

CTEs (WITH) #

sql

WITH paid_orders AS (
  SELECT * FROM orders WHERE status = 'paid'
),
totals AS (
  SELECT user_id, SUM(total) AS lifetime
  FROM   paid_orders
  GROUP  BY user_id
)
SELECT u.email, t.lifetime
FROM   totals t
JOIN   users  u ON u.id = t.user_id
ORDER  BY t.lifetime DESC;

-- Recursive CTE: find descendants in a tree.
WITH RECURSIVE descendants AS (
  SELECT id, parent_id, 0 AS depth FROM categories WHERE id = 1
  UNION ALL
  SELECT c.id, c.parent_id, d.depth + 1
  FROM   categories c
  JOIN   descendants d ON d.id = c.parent_id
)
SELECT * FROM descendants;

Window functions #

sql

SELECT
  user_id,
  order_date,
  total,
  ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date)        AS order_seq,
  RANK()       OVER (PARTITION BY user_id ORDER BY total DESC)        AS spend_rank,
  LAG(total)   OVER (PARTITION BY user_id ORDER BY order_date)        AS prev_total,
  SUM(total)   OVER (PARTITION BY user_id ORDER BY order_date)        AS running_total,
  AVG(total)   OVER (PARTITION BY user_id ORDER BY order_date
                     ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)        AS rolling_7
FROM orders;

Detail: Window functions cheat sheet, Running total.

NULL handling #

sql

-- Always use IS NULL / IS NOT NULL — never = NULL.
WHERE col IS NULL;
WHERE col IS NOT NULL;

-- Substitute a default for NULL.
COALESCE(col, 0)                       -- first non-NULL of the args
COALESCE(col_a, col_b, 'fallback')

-- Conditional swap.
NULLIF(a, b)                           -- returns NULL if a = b, else a (avoid div by 0)

-- Aggregates ignore NULLs by default.
COUNT(col)                             -- counts non-NULL values
COUNT(*)                               -- counts every row, NULLs included
SUM(col)                               -- ignores NULLs (returns NULL on empty input)

CASE expressions #

sql

SELECT
  total,
  CASE
    WHEN total <  50  THEN 'small'
    WHEN total < 500  THEN 'medium'
    ELSE                  'large'
  END AS bucket
FROM orders;

-- Inside aggregates: pivot, count-where.
SELECT
  region,
  SUM(CASE WHEN status = 'paid' THEN total ELSE 0 END) AS paid_revenue,
  COUNT(*) FILTER (WHERE status = 'paid')              AS paid_count   -- Postgres / DuckDB
FROM orders
GROUP BY region;

Set operations #

sql

SELECT id FROM a UNION ALL SELECT id FROM b;     -- concat, keeps dupes (faster)
SELECT id FROM a UNION     SELECT id FROM b;     -- concat, removes dupes (slower)
SELECT id FROM a INTERSECT SELECT id FROM b;     -- in both
SELECT id FROM a EXCEPT    SELECT id FROM b;     -- in a but not b (Oracle: MINUS)

Detail: UNION vs UNION ALL.

Schema (DDL) #

sql

-- Create.
CREATE TABLE users (
  id          INTEGER PRIMARY KEY,
  email       VARCHAR NOT NULL,
  signup_date DATE    DEFAULT CURRENT_DATE,
  deleted_at  TIMESTAMP,
  CONSTRAINT users_email_uk UNIQUE (email)
);

CREATE INDEX users_signup_date_idx ON users(signup_date);
CREATE VIEW active_users AS SELECT * FROM users WHERE deleted_at IS NULL;

-- Modify.
ALTER TABLE users ADD COLUMN phone VARCHAR;
ALTER TABLE users RENAME COLUMN phone TO mobile;
ALTER TABLE users DROP COLUMN mobile;
ALTER TABLE users RENAME TO members;

-- Remove.
DROP TABLE IF EXISTS legacy_table;
DROP VIEW  IF EXISTS active_users;
DROP INDEX IF EXISTS users_signup_date_idx;

Data changes (DML) #

sql

-- Insert.
INSERT INTO users (id, email)        VALUES (1, 'a@x.com');
INSERT INTO users (id, email)        VALUES (2, 'b@x.com'), (3, 'c@x.com');
INSERT INTO target (id, name) SELECT id, name FROM source WHERE flag = TRUE;

-- Insert-or-update (UPSERT) — re-uses id=1 on purpose to demonstrate the conflict path.
INSERT INTO users (id, email) VALUES (1, 'new@x.com')
ON CONFLICT (id) DO UPDATE SET email = EXCLUDED.email;       -- Postgres / SQLite / DuckDB

-- Update.
UPDATE users SET plan = 'pro' WHERE id = 1;
UPDATE orders
SET    user_email = u.email
FROM   users AS u WHERE orders.user_id = u.id;               -- Postgres / DuckDB

-- Delete.
DELETE FROM users WHERE last_seen_at < DATE '2024-01-01';
TRUNCATE TABLE staging;                                       -- empties fast

-- Wrap risky DML in a transaction.
BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
-- ROLLBACK; to undo if anything looks wrong before COMMIT.

Detail: INSERT, UPDATE, DELETE, BEGIN/COMMIT/ROLLBACK.

Common functions #

sql

-- Strings.
LENGTH(s)                          UPPER(s)         LOWER(s)
TRIM(s)                            LTRIM(s)         RTRIM(s)
SUBSTRING(s, start, length)        REPLACE(s, old, new)
CONCAT(a, b, c)                    a || b || c     -- ANSI concat
SPLIT_PART(s, delim, n)            -- Postgres / DuckDB / Snowflake / Redshift

-- Numbers.
ROUND(x, 2)        CEIL(x)        FLOOR(x)        ABS(x)
MOD(a, b)          a % b
POWER(x, 2)        SQRT(x)        EXP(x)          LN(x)

-- Dates.
CURRENT_DATE                       CURRENT_TIMESTAMP
DATE_TRUNC('month', d)             EXTRACT(YEAR FROM d)
d + INTERVAL '7 days'              d - INTERVAL '1 month'
end_date - start_date              -- days (Postgres / DuckDB)

-- Type casting.
CAST('2026-04-26' AS DATE)         '2026-04-26'::DATE        -- Postgres / DuckDB shorthand

Quick wins (the patterns to memorize) #

sql

-- Top N per group (most common analytics ask).
SELECT *
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY g ORDER BY metric DESC) AS rn
  FROM   t
) ranked
WHERE rn <= 5;

-- Find duplicates by a key.
SELECT key_col, COUNT(*) FROM t GROUP BY key_col HAVING COUNT(*) > 1;

-- Pivot: rows → columns.
SELECT
  user_id,
  SUM(CASE WHEN event = 'login'  THEN 1 ELSE 0 END) AS logins,
  SUM(CASE WHEN event = 'click'  THEN 1 ELSE 0 END) AS clicks
FROM events GROUP BY user_id;

-- Safe percentage with divide-by-zero guard.
ROUND(100.0 * part / NULLIF(whole, 0), 2)

FAQ #

Are SQL keywords case-sensitive?

No. SELECT, select, and Select are equivalent on every major engine. The convention of uppercase keywords is a style choice that improves readability. Most teams follow it, but the engine doesn't require it. Identifiers (table and column names) follow different rules: Postgres folds unquoted names to lowercase; MySQL on Linux makes table and database names filesystem-dependent (controlled by lower_case_table_names) but column names are case-insensitive everywhere; SQL Server treats identifiers as case-insensitive unless the database collation says otherwise.

What's the difference between SQL dialects like PostgreSQL, MySQL, and BigQuery?

The core SQL most analysts write (SELECT, JOIN, GROUP BY, WHERE) is broadly portable across every engine. Differences appear in date functions, string handling, JSON support, and engine-specific extensions. For example, DATE_TRUNC('month', d) works on Postgres but BigQuery wants DATE_TRUNC(d, MONTH) (argument order swapped, no quotes). See Dialect notes for the full list.

Why is SELECT * considered bad practice?

A new column added to the table changes what SELECT returns, and every dependent dashboard, application, or pipeline downstream is affected silently. Naming columns explicitly makes the contract stable. SELECT is fine for ad-hoc exploration; avoid it in production code, views, and stored procedures.

In what order do SQL clauses execute?

Logical evaluation order is FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT. You write SELECT first but the engine evaluates it near the end. This is why WHERE cannot reference column aliases or aggregates: at WHERE time, neither exists yet.

Why does my query return zero rows when I filter on = NULL?

NULL is the absence of a value, not a value. Comparisons against NULL (= NULL, <> NULL, != NULL) all return NULL (UNKNOWN), which fails the WHERE filter, so no rows match. Use IS NULL and IS NOT NULL to test for nullness. See NULL comparison.

Is there a performance difference between UNION and UNION ALL?

Yes. UNION removes duplicate rows from the combined result, which costs a sort or hash. UNION ALL skips that step. Default to UNION ALL and only reach for UNION when you genuinely need duplicate elimination. See UNION vs UNION ALL.

How do I learn SQL fast?

Start with the Course: sequenced lessons that build on each other. Use this cheat sheet as the syntax reference once you know what you're looking for. Practice with the SQL exercises. Most professionals reach productive in 2-4 weeks of daily practice; mastery takes longer and comes from real work.

Tip

The minute you've memorized this sheet, you're 80% productive in any SQL dialect. The remaining 20% is dialect quirks: how the engine spells DATE_DIFF, where STRING_SPLIT lives, when || does concatenation vs logical OR. The site's dialect notes cover those. Next steps: pick the cheat sheet you'll use most (joins, window functions, date functions, string functions) and bookmark it.