Recipe

Split a string by a delimiter

Short answer: to grab one piece, use SPLIT_PART(str, delim, n) (Postgres / Snowflake / Redshift / DuckDB) or SUBSTRING_INDEX(str, delim, n) (MySQL). To split a string into multiple rows, use STRING_TO_ARRAY + UNNEST (Postgres / DuckDB), STRING_SPLIT (SQL Server), or SPLIT + UNNEST (BigQuery / Snowflake). The choice depends on whether you want a single piece or every piece as a row.

Canonical SQL #

sql
-- One piece by index (1-based).
SELECT SPLIT_PART('a,b,c,d', ',', 2);              -- 'b'  (Postgres / Snowflake / Redshift / DuckDB)
SELECT SUBSTRING_INDEX('a,b,c,d', ',', 2);         -- 'a,b' (MySQL — note: returns up-to-N, not Nth)
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('a,b,c,d', ',', 2), ',', -1); -- 'b' (MySQL idiom for nth)

-- Every piece as a row.
SELECT UNNEST(STRING_TO_ARRAY('a,b,c,d', ',')) AS part;   -- Postgres / DuckDB
SELECT value FROM STRING_SPLIT('a,b,c,d', ',');           -- SQL Server (2016+)
SELECT part FROM UNNEST(SPLIT('a,b,c,d', ',')) AS part;   -- BigQuery
SELECT VALUE FROM TABLE(SPLIT_TO_TABLE('a,b,c,d', ',')); -- Snowflake

Example #

Loading SQL editor...

Common variants #

sql
-- Pull the email domain.
SELECT SPLIT_PART(email, '@', 2) AS domain FROM users;

-- "Last piece" — index by the count of delimiters + 1.
SELECT SPLIT_PART(path, '/', LENGTH(path) - LENGTH(REPLACE(path, '/', '')) + 1) AS filename
FROM files;
-- '/var/log/app.log' -> 'app.log'

-- Explode a tag column ("python,sql,viz") into one row per tag.
SELECT
  post_id,
  tag
FROM posts,
     UNNEST(STRING_TO_ARRAY(tags, ',')) AS tag;

-- Trim each piece (split + array_map / a CTE per row).
SELECT TRIM(part) AS clean_part
FROM UNNEST(STRING_TO_ARRAY('  alpha , beta ,gamma ', ',')) AS part;

Dialect comparison #

EngineSyntaxGotcha
PostgreSQL / DuckDBSPLIT_PART(str, ',', 2)1-based; out-of-range index returns '', not NULL
MySQLSUBSTRING_INDEX(str, ',', 2)returns everything up to the 2nd delimiter, not the 2nd piece
SQL ServerSTRING_SPLIT(str, ',')rows only, no nth-piece function; delimiter must be a single character
BigQuerySPLIT(str, ',')[OFFSET(1)]OFFSET is 0-based; ORDINAL is 1-based
SnowflakeSPLIT_PART(str, ',', 2)index 0 also returns the first piece
OracleREGEXP_SUBSTR(str, '[^,]+', 1, 2)delimiter is a regex; skips empty pieces
SQLiteno native splitter; use a recursive CTE

Dialect notes #

  • Postgres / DuckDB: SPLIT_PART(str, delim, n) for one piece (1-based). STRING_TO_ARRAY(str, delim) returns an ARRAY; UNNEST(STRING_TO_ARRAY(...)) to explode into rows. REGEXP_SPLIT_TO_TABLE for regex-based splits.
  • MySQL: no SPLIT_PART or single-call splitter. SUBSTRING_INDEX(str, delim, n) returns everything up to the nth occurrence, not the nth piece, so wrap twice (SUBSTRING_INDEX(SUBSTRING_INDEX(...), delim, -1)) to get the nth piece. To split into rows, MySQL 8 has no native function; use a recursive CTE or a numbers table.
  • SQL Server: STRING_SPLIT(str, delim) (2016+) returns rows; ordering is preserved with enable_ordinal (2022+). For "nth piece" before 2016, fall back to SUBSTRING + CHARINDEX chains.
  • BigQuery: SPLIT(str, delim) returns an ARRAY<STRING>; access pieces with [OFFSET(n)] (0-based) or [ORDINAL(n)] (1-based). UNNEST(SPLIT(...)) to explode.
  • Snowflake: SPLIT(str, delim) returns ARRAY; SPLIT_PART(str, delim, n) for the nth piece (1-based; 0 also returns the first piece, surprisingly). SPLIT_TO_TABLE for row-explosion.
  • Oracle: no native split. REGEXP_SUBSTR(str, '[^,]+', 1, n) returns the nth piece; for row-explosion, use a hierarchical query (CONNECT BY) or XMLTABLE.
  • SQLite: no native split. Workaround with recursive CTE or pre-process in application code.
Warning

Splitting strings inside SQL is a sign you're working with badly modeled data. A column that contains 'python,sql,viz' should usually be a separate post_tags table with one row per tag. SPLIT_PART and friends are the right temporary fix when you're consuming someone else's denormalized export, but for data you own, normalize: it's faster, indexable, and joinable. The recipe is here so you can ship today; the long-term fix is upstream.