PostgreSQL cheat sheet
Postgres-specific syntax and features that diverge from generic ANSI SQL: jsonb, arrays, RETURNING, ON CONFLICT, generate_series, lateral joins, DISTINCT ON, regex operators, and the psql commands you'll use daily.
Type system highlights #
-- Auto-incrementing IDs.
id BIGSERIAL PRIMARY KEY -- legacy
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY -- modern
-- Native types worth knowing.
UUID -- uuid_generate_v4() with extension
JSONB -- binary, indexable, queryable JSON
TEXT -- unbounded varchar; same perf as VARCHAR
TIMESTAMPTZ -- timezone-aware timestamp (the default for events)
INTERVAL -- '7 days', '1 month', '90 minutes'
ARRAY (e.g., INTEGER[], TEXT[]) -- first-class arraysJSONB operators #
payload -> 'field' -- get JSON value (returns jsonb)
payload ->> 'field' -- get text value
payload #> '{a,b,c}' -- nested by path -> jsonb
payload #>> '{a,b,c}' -- nested by path -> text
payload @> '{"key":"val"}' -- contains (great with GIN index)
payload <@ '{"key":"val"}' -- contained by
payload ? 'field' -- key exists
payload ?| array['a','b'] -- any of these keys
payload ?& array['a','b'] -- all of these keys
payload || '{"new":1}' -- merge (shallow)
payload - 'field' -- remove key
payload #- '{a,b}' -- remove nested
-- Update inside JSON.
UPDATE t SET payload = jsonb_set(payload, '{user,plan}', '"pro"'::jsonb);
-- Index for fast containment queries.
CREATE INDEX events_payload_idx ON events USING GIN (payload jsonb_path_ops);Detail: Postgres jsonb operators.
Arrays #
-- Declare and insert.
tags TEXT[] -- column type
ARRAY['a','b','c'] -- literal
'{a,b,c}'::TEXT[] -- string-literal cast
-- Operators.
ARRAY[1,2,3] || ARRAY[4,5] -- concat -> {1,2,3,4,5}
arr[1] -- 1-based index
'a' = ANY(arr) -- contains
arr @> ARRAY['a'] -- left contains right
array_length(arr, 1) -- length
-- Unnest into rows.
SELECT id, tag FROM posts, UNNEST(tags) AS tag;INSERT ... RETURNING and ON CONFLICT #
-- Get back the auto-generated row.
INSERT INTO users (email) VALUES ('a@x.com')
RETURNING id, created_at;
-- Skip duplicates (UPSERT — Postgres-style).
INSERT INTO users (id, email) VALUES (1, 'a@x.com')
ON CONFLICT (id) DO NOTHING;
INSERT INTO users (id, email) VALUES (1, 'a@x.com')
ON CONFLICT (id) DO UPDATE SET email = EXCLUDED.email,
updated_at = CURRENT_TIMESTAMP;
-- Conditional update.
INSERT INTO target (id, value, updated_at)
VALUES (1, 'new', CURRENT_TIMESTAMP)
ON CONFLICT (id) DO UPDATE SET value = EXCLUDED.value
WHERE target.updated_at < EXCLUDED.updated_at;DISTINCT ON (one row per group) #
-- Most recent order per user — Postgres shortcut for the ROW_NUMBER pattern.
SELECT DISTINCT ON (user_id)
user_id, order_date, total
FROM orders
ORDER BY user_id, order_date DESC;generate_series: number/date spines #
-- Numbers 1..100.
SELECT generate_series(1, 100) AS n;
-- Date spine: every day in 2026.
SELECT generate_series(
DATE '2026-01-01',
DATE '2026-12-31',
INTERVAL '1 day'
)::DATE AS day;
-- Combine to find missing dates in a series (calendar-table pattern).
SELECT d.day
FROM generate_series(DATE '2026-01-01', DATE '2026-12-31', INTERVAL '1 day') AS d(day)
LEFT JOIN events e ON e.day = d.day
WHERE e.day IS NULL;Regex #
s ~ 'pattern' -- match (case-sensitive)
s ~* 'pattern' -- match (case-insensitive)
s !~ 'pattern' -- not match
s !~* 'pattern' -- not match (case-insensitive)
REGEXP_REPLACE(s, '[0-9]+', '?', 'g') -- replace all
REGEXP_MATCHES(s, '([a-z]+)@([a-z.]+)') -- capture groups
REGEXP_SPLIT_TO_TABLE(s, ',') -- split into rowsLateral joins #
-- For each user, get their 3 most recent orders.
SELECT u.id, u.email, o.id AS order_id, o.total
FROM users u
JOIN LATERAL (
SELECT id, total
FROM orders
WHERE orders.user_id = u.id
ORDER BY created_at DESC
LIMIT 3
) o ON TRUE;psql meta-commands (the keyboard shortcuts of Postgres) #
\l -- list databases
\dt -- list tables in current schema
\dt *.* -- all tables across all schemas
\d table_name -- describe table (columns, indexes, FKs)
\d+ table_name -- describe with size and stats
\dn -- list schemas
\df -- list functions
\dv -- list views
\du -- list users / roles
\dx -- list installed extensions
\timing -- toggle query timing
\x -- toggle expanded output
\copy ... FROM 'file.csv' CSV HEADER -- bulk load
\? -- help on psql commands
\q -- quitUseful pg_catalog queries #
-- Table sizes.
SELECT
schemaname || '.' || tablename AS full_name,
pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) AS size
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(schemaname || '.' || tablename) DESC;
-- Active queries.
SELECT pid, query_start, state, query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY query_start;
-- Slowest queries (with pg_stat_statements extension).
SELECT calls, mean_exec_time, query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 20;
-- Cancel a runaway query.
SELECT pg_cancel_backend(pid);
SELECT pg_terminate_backend(pid); -- harder killFAQ #
How do I do an upsert in PostgreSQL?
INSERT INTO table (id, value) VALUES (1, 'x') ON CONFLICT (id) DO UPDATE SET value = EXCLUDED.value is Postgres's native upsert. Use ON CONFLICT (id) DO NOTHING to skip duplicates instead. Postgres 15+ also supports the ANSI MERGE statement, but ON CONFLICT remains more idiomatic.
What's the difference between SERIAL and IDENTITY?
SERIAL is the legacy auto-increment shorthand (creates a sequence and a default). GENERATED ALWAYS AS IDENTITY (Postgres 10+) is the ANSI standard form, slightly stricter (you can't override the value with a regular INSERT unless you use OVERRIDING SYSTEM VALUE). Prefer IDENTITY for new code.
TIMESTAMP or TIMESTAMPTZ?
Almost always TIMESTAMPTZ (TIMESTAMP WITH TIME ZONE). Stores in UTC internally, displays in session timezone. The exception: when the value represents wall-clock time at an unspecified location (e.g., "store opens at 9 AM local"). For event timestamps and audit columns, TIMESTAMPTZ.
Are jsonb operators fast on large tables?
Only with the right index. Without one, every query scans and parses every JSON value. Add a GIN index for containment (@>) queries: CREATE INDEX ON events USING GIN (payload jsonb_path_ops). For known-field queries, add an expression index: CREATE INDEX ON events ((payload->>'user_id')).
How do I run psql commands from a script?
psql meta-commands (\d, \dt, \l) only work in the interactive shell. To run them programmatically, use the equivalent SQL: SELECT * FROM information_schema.tables instead of \dt, pg_database instead of \l. Or pipe psql commands: psql -c '\dt' runs a single meta-command and exits.
How do I cancel a runaway query?
Find the pid in pg_stat_activity, then SELECT pg_cancel_backend(pid) for a graceful interrupt. If the query doesn't respond, SELECT pg_terminate_backend(pid) is the harder kill: it terminates the connection.
Postgres-specific features that pay off the most: jsonb with a GIN index for semi-structured data, generate_series for date spines and missing-data detection, DISTINCT ON for one-row-per-group, ON CONFLICT for upserts, RETURNING to avoid second roundtrips, and lateral joins for "for each parent, find related children." These are where Postgres clearly beats portable SQL.