Recipe

Extract a substring from a string

Short answer: SUBSTRING(str FROM start FOR length) is the ANSI standard. Most engines also accept the function-call form SUBSTRING(str, start, length). Position is 1-based on every major engine, so SUBSTRING('hello', 1, 3) returns 'hel'. For pulling out parts based on a delimiter rather than a fixed offset, use SPLIT_PART (see split string by delimiter).

Canonical SQL #

sql
-- ANSI form (works on Postgres / DuckDB / SQL Server / Oracle).
SELECT SUBSTRING('dbsyntax.com' FROM 1 FOR 8);   -- 'dbsyntax'

-- Function-call form (works on most engines).
SELECT SUBSTRING('dbsyntax.com', 1, 8);          -- 'dbsyntax'
SELECT SUBSTR('dbsyntax.com', 1, 8);             -- MySQL / SQLite / Oracle / Snowflake / DuckDB

-- From a position to the end (omit length).
SELECT SUBSTRING('dbsyntax.com', 10);             -- 'com'

-- Last N characters: combine with LENGTH or use RIGHT.
SELECT RIGHT('dbsyntax.com', 3);                  -- 'com' (most engines)
SELECT SUBSTRING('dbsyntax.com', LENGTH('dbsyntax.com') - 2, 3);

Example #

Loading SQL editor...

Common variants #

sql
-- Find a position, then extract from it (POSITION + SUBSTRING).
SELECT SUBSTRING(email, POSITION('@' IN email) + 1) AS domain
FROM users;
-- 'ana@example.com' -> 'example.com'

-- Engine-specific synonyms for POSITION.
-- Postgres / DuckDB / Oracle: POSITION('@' IN email) or STRPOS(email, '@')
-- MySQL / SQL Server: LOCATE('@', email) / CHARINDEX('@', email)

-- Drop the last N chars (e.g., trim a trailing extension).
SELECT SUBSTRING(filename, 1, LENGTH(filename) - 4) AS basename
FROM (VALUES ('report.csv'), ('summary.pdf')) AS files(filename);
-- 'report.csv' -> 'report'

-- Pad-then-extract for fixed-width parsing.
SELECT SUBSTRING(LPAD(code, 10, '0'), 1, 6) AS region_code
FROM (VALUES ('42'), ('987654')) AS lookups(code);

Dialect comparison #

EngineSyntaxGotcha
PostgreSQLSUBSTRING(str FROM 1 FOR 3)negative start does not mean "from the end"; find with POSITION('@' IN str)
MySQLSUBSTRING(str, 1, 3)SUBSTR(str, -3) means last 3 chars; find with LOCATE(sub, str)
SQL ServerSUBSTRING(str, 1, 3)length argument is mandatory; find with CHARINDEX(sub, str)
BigQuerySUBSTR(str, 1, 3)negative start counts from the end; find with STRPOS(str, sub)
SnowflakeSUBSTR(str, 1, 3)negative start counts from the end; find with POSITION(sub, str)
OracleSUBSTR(str, 1, 3)no LEFT/RIGHT; find with INSTR(str, sub)
SQLiteSUBSTR(str, 1, 3)negative start counts from the end
DuckDBSUBSTRING(str, 1, 3)find with POSITION('@' IN str) or STRPOS

Dialect notes #

  • All major engines are 1-based (not 0-based like most programming languages). SUBSTRING(s, 1, 3) returns the first three characters.
  • SUBSTRING vs SUBSTR: synonyms on most engines. Postgres prefers SUBSTRING; MySQL, SQLite, Oracle, Snowflake, DuckDB accept both. Use SUBSTRING for portability.
  • Negative start positions: MySQL and SQLite treat SUBSTR(s, -3) as "last 3 characters." Postgres and SQL Server don't; they treat negative as before-string and may return empty. Don't rely on negatives; use RIGHT(s, n) instead.
  • LEFT and RIGHT: shortcuts for "first N" and "last N." Available on Postgres, MySQL, SQL Server, DuckDB, Snowflake, Redshift. Not on Oracle (use SUBSTR) or older SQLite.
  • Beyond-string positions: SUBSTRING('abc', 10, 5) returns empty string (not error) on every major engine.
  • Multibyte characters: SUBSTRING is character-based on most engines (correct for emoji / accented text). Some engines have SUBSTRB for byte-based extraction.
Tip

SUBSTRING is the right tool for fixed-width or known-position extraction. For delimited parsing ("everything after the @", "third dash-separated piece"), SPLIT_PART is shorter and clearer. For patterned extraction — "the four digits in this string" — REGEXP_SUBSTR / REGEXP_MATCHES is the right tool. Reaching for SUBSTRING with a hand-computed POSITION offset usually means you wanted one of the other two. Reference: split string by delimiter, regex in SQL.