DML

INSERT

Q: How do I insert multiple rows at once?

Comma-separated VALUES rows: INSERT INTO users (id, email) VALUES (1, 'a@x.com'), (2, 'b@x.com'), (3, 'c@x.com'). Standard everywhere except very old Oracle (which uses INSERT ALL instead). Multi-row inserts are much faster than separate INSERT statements.

Q: How do I do an upsert (insert-or-update)?

Postgres / SQLite / DuckDB: ON CONFLICT (key) DO UPDATE SET col = EXCLUDED.col. MySQL: ON DUPLICATE KEY UPDATE col = VALUES(col). SQL Server / Oracle / Snowflake / BigQuery: MERGE INTO target USING source ON ... WHEN MATCHED THEN UPDATE ... WHEN NOT MATCHED THEN INSERT .... See MERGE / UPSERT.

Q: How do I get the auto-generated ID after an INSERT?

Postgres / SQLite (3.35+) / DuckDB / Oracle: INSERT ... RETURNING id. SQL Server: INSERT ... OUTPUT INSERTED.id .... MySQL: a separate SELECT LAST_INSERT_ID(). Drivers in most languages also expose this on the result of the insert.

Q: How do I insert without overwriting existing rows?

Postgres / SQLite / DuckDB: ON CONFLICT (id) DO NOTHING. MySQL: INSERT IGNORE. SQL Server: MERGE with no WHEN MATCHED branch. The pattern: skip the duplicate, keep going. Useful for idempotent re-runs of pipelines.

Q: What's the fastest way to bulk-load millions of rows?

Engine-specific bulk loaders. COPY (Postgres, Redshift), LOAD DATA INFILE (MySQL), COPY INTO (Snowflake), bq load (BigQuery), BULK INSERT (SQL Server). For more than ~100K rows, the bulk loader is one or two orders of magnitude faster than INSERT.

Add new rows to a table: from literal values, from another query, or from a one-off file load. Two flavors cover most real work: INSERT ... VALUES for hand-written rows, and INSERT ... SELECT for moving rows between tables (the workhorse of every ETL pipeline).

Syntax #

sql

-- Literal rows.
INSERT INTO table_name (col_1, col_2, ...)
VALUES (val_1, val_2, ...),
       (val_1, val_2, ...);

-- From a query (the ETL pattern).
INSERT INTO target_table (col_1, col_2)
SELECT col_a, col_b FROM source_table WHERE ...;

Example #

Loading SQL editor...

Common variants #

sql

-- Use defaults / NULLs explicitly.
INSERT INTO signups (id, email) VALUES (4, 'dani@example.com');
-- source = 'organic' (default), signed_up_at = CURRENT_TIMESTAMP

-- Insert and return the generated row (Postgres / SQLite / DuckDB).
INSERT INTO signups (id, email) VALUES (5, 'eve@example.com')
RETURNING id, signed_up_at;

-- Skip duplicates instead of failing.
INSERT INTO signups (id, email) VALUES (1, 'duplicate@example.com')
ON CONFLICT (id) DO NOTHING;        -- Postgres / SQLite / DuckDB

INSERT IGNORE INTO signups (id, email) VALUES (1, 'duplicate@example.com');
-- MySQL

Dialect notes #

Multi-row VALUES: standard everywhere except very old Oracle (use INSERT ALL instead).
RETURNING: Postgres, SQLite (3.35+), DuckDB, Oracle. SQL Server uses OUTPUT. MySQL has no equivalent; fetch LAST_INSERT_ID() after the statement.
Conflict handling: Postgres / SQLite / DuckDB use ON CONFLICT (...) DO NOTHING or DO UPDATE SET .... MySQL uses INSERT ... ON DUPLICATE KEY UPDATE ... or INSERT IGNORE. SQL Server uses MERGE. See MERGE / UPSERT.
Bulk loads: COPY (Postgres, Redshift), LOAD DATA INFILE (MySQL), COPY INTO (Snowflake), bq load (BigQuery), BULK INSERT (SQL Server). For more than ~100k rows, the bulk loader is one or two orders of magnitude faster than INSERT.
Auto-generated IDs: omit the column from the insert list (INSERT INTO signups (email) VALUES (...)) and the engine fills it. Don't pass NULL explicitly — some engines reject it.

FAQ #

How do I insert multiple rows at once?

Comma-separated VALUES rows: INSERT INTO users (id, email) VALUES (1, 'a@x.com'), (2, 'b@x.com'), (3, 'c@x.com'). Standard everywhere except very old Oracle (which uses INSERT ALL instead). Multi-row inserts are much faster than separate INSERT statements.

How do I insert from a SELECT?

INSERT INTO target (col1, col2) SELECT col1, col2 FROM source WHERE .... The workhorse of every ETL pipeline. Column lists must match — same number, compatible types. Source can be any query: filter, join, aggregate, all valid.

How do I do an upsert (insert-or-update)?

Postgres / SQLite / DuckDB: ON CONFLICT (key) DO UPDATE SET col = EXCLUDED.col. MySQL: ON DUPLICATE KEY UPDATE col = VALUES(col). SQL Server / Oracle / Snowflake / BigQuery: MERGE INTO target USING source ON ... WHEN MATCHED THEN UPDATE ... WHEN NOT MATCHED THEN INSERT .... See MERGE / UPSERT.

How do I get the auto-generated ID after an INSERT?

Postgres / SQLite (3.35+) / DuckDB / Oracle: INSERT ... RETURNING id. SQL Server: INSERT ... OUTPUT INSERTED.id .... MySQL: a separate SELECT LAST_INSERT_ID(). Drivers in most languages also expose this on the result of the insert.

How do I insert without overwriting existing rows?

Postgres / SQLite / DuckDB: ON CONFLICT (id) DO NOTHING. MySQL: INSERT IGNORE. SQL Server: MERGE with no WHEN MATCHED branch. The pattern: skip the duplicate, keep going. Useful for idempotent re-runs of pipelines.

What's the fastest way to bulk-load millions of rows?

Engine-specific bulk loaders. COPY (Postgres, Redshift), LOAD DATA INFILE (MySQL), COPY INTO (Snowflake), bq load (BigQuery), BULK INSERT (SQL Server). For more than ~100K rows, the bulk loader is one or two orders of magnitude faster than INSERT.

Warning

Always name your columns. INSERT INTO orders VALUES (1, 100, 'paid', NOW()) works today. Tomorrow someone adds a currency column to orders and every script that uses positional INSERT either fails or (worse) silently writes 'paid' into currency. Write INSERT INTO orders (id, total, status, created_at) VALUES (...) from day one. The extra typing is the cheapest insurance in this entire reference.