CREATE INDEX
Build a secondary lookup structure that lets the engine find rows by a column (or set of columns) without scanning the whole table. Indexes make targeted reads dramatically faster and writes slightly slower: every INSERT, UPDATE, and DELETE has to maintain every index on the table.
Syntax #
CREATE [UNIQUE] INDEX [IF NOT EXISTS] index_name
ON table_name (column_1 [, column_2, ...]);
-- Drop an index.
DROP INDEX [IF EXISTS] index_name;Example #
When an index helps (and when it doesn't) #
Helps:
- Selective
WHEREfilters (WHERE user_id = 42) on a column with high cardinality. - Join keys: every FK column on the child table should usually be indexed.
ORDER BYcolumns when the query also returns few rows (the index provides pre-sorted access).UNIQUEenforcement (CREATE UNIQUE INDEXon email, slug, etc.).
Doesn't help (or hurts):
- Low-cardinality columns (
is_active,status); a full table scan is often cheaper. - Tables small enough that the engine prefers a sequential scan anyway.
- Columns wrapped in functions (
WHERE LOWER(email) = ...): the index onemailwon't be used. Use a functional index (CREATE INDEX ON users (LOWER(email))) or normalize the data. - Heavy write workloads where the index maintenance cost exceeds the read savings.
Dialect notes #
- Postgres: rich index types, namely
btree(default),hash,gin,gist,brin. Partial indexes viaWHEREclause (CREATE INDEX ... WHERE is_active).CREATE INDEX CONCURRENTLYto avoid locking writers in production. - MySQL:
btreeandhash(hashonly on Memory engine). No partial or functional indexes until 8.0.13+. Add the FK index yourself — InnoDB does not auto-create it on the child side. - SQL Server: clustered (one per table, defines physical order) vs non-clustered.
INCLUDE (...)lets you carry extra columns in the leaf for covering indexes. - SQLite:
btreeonly. Auto-indexes the rowid; you index everything else. - BigQuery / Snowflake / Redshift: no traditional indexes. Performance comes from clustering / partitioning / sort keys / micro-partitions. Don't try to map Postgres index intuition onto warehouses.
FAQ #
When should I add an index?
Look at queries that are slow and run often. Selective filters on high-cardinality columns benefit most. Foreign-key columns on child tables (orders.user_id, comments.post_id). UNIQUE enforcement on emails and slugs. Sort columns paired with selective filters. Avoid: low-cardinality columns (status, gender), tiny tables, and columns wrapped in functions.
How many indexes is too many?
Each index slows down INSERT / UPDATE / DELETE proportionally. Six unused indexes per table is common in real codebases, and every one of them slows writes for no read benefit. Rule of thumb: keep an index only if you can name a query it speeds up.
Why doesn't my index get used?
Common reasons: (1) the column is wrapped in a function (LOWER(email) = ... defeats a B-tree on email); (2) the predicate is too unselective (engine prefers a full scan when it estimates scan + sort cheaper); (3) statistics are stale — run ANALYZE (Postgres) / UPDATE STATISTICS (SQL Server). EXPLAIN is the source of truth.
What's a composite index?
An index on multiple columns, e.g., CREATE INDEX ON orders (user_id, created_at). Speeds up queries that filter on the leading columns. WHERE user_id = X uses it; WHERE user_id = X AND created_at = Y uses it; WHERE created_at = Y (no user_id filter) generally does not. Order the columns by selectivity and query frequency.
What's the difference between a unique index and a unique constraint?
Functionally equivalent: both enforce uniqueness and create an index. Syntax differs: CREATE UNIQUE INDEX ... vs ALTER TABLE ... ADD CONSTRAINT ... UNIQUE (...). Constraints can be referenced as foreign-key targets; bare unique indexes typically can't. Use the constraint form for production schemas.
Do columnar warehouses (BigQuery, Snowflake, Redshift) use indexes?
Not in the traditional sense. Performance comes from partitioning (BigQuery, Snowflake), clustering (Snowflake), sort keys (Redshift), and column-pruning in query plans. Don't map Postgres index intuition onto warehouses — use the engine's native optimization mechanisms.
Index based on actual queries, not on assumptions. Look at the slow-query log (Postgres' pg_stat_statements, MySQL's slow log) and find the queries that are slow and run often. Add indexes one at a time, measure, and stop. Six unused indexes per table is a real situation in production codebases — every one of them slows writes for no read benefit. The two related references on this site: Index intuition for the conceptual model, and EXPLAIN plans for confirming an index is actually being used.