Extract a substring from a string
Short answer: SUBSTRING(str FROM start FOR length) is the ANSI standard. Most engines also accept the function-call form SUBSTRING(str, start, length). Position is 1-based on every major engine, so SUBSTRING('hello', 1, 3) returns 'hel'. For pulling out parts based on a delimiter rather than a fixed offset, use SPLIT_PART (see split string by delimiter).
Canonical SQL #
-- ANSI form (works on Postgres / DuckDB / SQL Server / Oracle).
SELECT SUBSTRING('dbsyntax.com' FROM 1 FOR 8); -- 'dbsyntax'
-- Function-call form (works on most engines).
SELECT SUBSTRING('dbsyntax.com', 1, 8); -- 'dbsyntax'
SELECT SUBSTR('dbsyntax.com', 1, 8); -- MySQL / SQLite / Oracle / Snowflake / DuckDB
-- From a position to the end (omit length).
SELECT SUBSTRING('dbsyntax.com', 10); -- 'com'
-- Last N characters: combine with LENGTH or use RIGHT.
SELECT RIGHT('dbsyntax.com', 3); -- 'com' (most engines)
SELECT SUBSTRING('dbsyntax.com', LENGTH('dbsyntax.com') - 2, 3);Example #
Common variants #
-- Find a position, then extract from it (POSITION + SUBSTRING).
SELECT SUBSTRING(email, POSITION('@' IN email) + 1) AS domain
FROM users;
-- 'ana@example.com' -> 'example.com'
-- Engine-specific synonyms for POSITION.
-- Postgres / DuckDB / Oracle: POSITION('@' IN email) or STRPOS(email, '@')
-- MySQL / SQL Server: LOCATE('@', email) / CHARINDEX('@', email)
-- Drop the last N chars (e.g., trim a trailing extension).
SELECT SUBSTRING(filename, 1, LENGTH(filename) - 4) AS basename
FROM (VALUES ('report.csv'), ('summary.pdf')) AS files(filename);
-- 'report.csv' -> 'report'
-- Pad-then-extract for fixed-width parsing.
SELECT SUBSTRING(LPAD(code, 10, '0'), 1, 6) AS region_code
FROM (VALUES ('42'), ('987654')) AS lookups(code);Dialect comparison #
| Engine | Syntax | Gotcha |
|---|---|---|
| PostgreSQL | SUBSTRING(str FROM 1 FOR 3) | negative start does not mean "from the end"; find with POSITION('@' IN str) |
| MySQL | SUBSTRING(str, 1, 3) | SUBSTR(str, -3) means last 3 chars; find with LOCATE(sub, str) |
| SQL Server | SUBSTRING(str, 1, 3) | length argument is mandatory; find with CHARINDEX(sub, str) |
| BigQuery | SUBSTR(str, 1, 3) | negative start counts from the end; find with STRPOS(str, sub) |
| Snowflake | SUBSTR(str, 1, 3) | negative start counts from the end; find with POSITION(sub, str) |
| Oracle | SUBSTR(str, 1, 3) | no LEFT/RIGHT; find with INSTR(str, sub) |
| SQLite | SUBSTR(str, 1, 3) | negative start counts from the end |
| DuckDB | SUBSTRING(str, 1, 3) | find with POSITION('@' IN str) or STRPOS |
Dialect notes #
- All major engines are 1-based (not 0-based like most programming languages).
SUBSTRING(s, 1, 3)returns the first three characters. SUBSTRINGvsSUBSTR: synonyms on most engines. Postgres prefersSUBSTRING; MySQL, SQLite, Oracle, Snowflake, DuckDB accept both. UseSUBSTRINGfor portability.- Negative start positions: MySQL and SQLite treat
SUBSTR(s, -3)as "last 3 characters." Postgres and SQL Server don't; they treat negative as before-string and may return empty. Don't rely on negatives; useRIGHT(s, n)instead. LEFTandRIGHT: shortcuts for "first N" and "last N." Available on Postgres, MySQL, SQL Server, DuckDB, Snowflake, Redshift. Not on Oracle (useSUBSTR) or older SQLite.- Beyond-string positions:
SUBSTRING('abc', 10, 5)returns empty string (not error) on every major engine. - Multibyte characters:
SUBSTRINGis character-based on most engines (correct for emoji / accented text). Some engines haveSUBSTRBfor byte-based extraction.
SUBSTRING is the right tool for fixed-width or known-position extraction. For delimited parsing ("everything after the @", "third dash-separated piece"), SPLIT_PART is shorter and clearer. For patterned extraction — "the four digits in this string" — REGEXP_SUBSTR / REGEXP_MATCHES is the right tool. Reaching for SUBSTRING with a hand-computed POSITION offset usually means you wanted one of the other two. Reference: split string by delimiter, regex in SQL.