SQL string functions cheat sheet
Common string operations across the major engines: length, case, trim, search, replace, split, concat, regex. Use this when the function name you remember from one engine isn't in the other.
Length, case, trim #
LENGTH(s) -- character count (most engines)
LEN(s) -- SQL Server (LEN ignores trailing spaces)
CHAR_LENGTH(s) -- ANSI alias
OCTET_LENGTH(s) -- byte count
UPPER(s) LOWER(s)
INITCAP(s) -- Postgres / Oracle / Snowflake — Title Case Each Word
TRIM(s) -- both sides
LTRIM(s) RTRIM(s)
TRIM(BOTH 'x' FROM s) -- trim a specific character (ANSI)Substring extraction #
SUBSTRING(s FROM 1 FOR 5) -- ANSI form
SUBSTRING(s, 1, 5) -- function form (most engines)
SUBSTR(s, 1, 5) -- MySQL / SQLite / Oracle / DuckDB / Snowflake alias
LEFT(s, 5) -- first 5 chars
RIGHT(s, 5) -- last 5 chars (most engines, not Oracle)
-- Find a position.
POSITION('@' IN email) -- ANSI
STRPOS(s, 'sub') -- Postgres / DuckDB
LOCATE('sub', s) -- MySQL
CHARINDEX('sub', s) -- SQL Server
INSTR(s, 'sub') -- Oracle / SQLite / MySQL / DuckDBConcatenate #
a || b -- ANSI (Postgres / DuckDB / SQLite / Oracle / Snowflake / Redshift)
CONCAT(a, b, c) -- portable function form
CONCAT_WS(', ', city, state, zip) -- with separator, skips NULLs (most engines)
a + b -- SQL Server (only)
-- NULL trap on most engines:
a || NULL -- returns NULL — every concat with NULL becomes NULL
-- use CONCAT_WS or wrap each side in COALESCE(col, '')Replace and pad #
REPLACE(s, 'old', 'new') -- replace ALL occurrences
TRANSLATE(s, 'abc', 'xyz') -- char-for-char remap (Postgres / Oracle / Snowflake / DuckDB)
LPAD(s, 10, '0') -- pad left to length 10 with '0'
RPAD(s, 10, ' ') -- pad right
REPEAT(s, 3) -- 'abcabcabc'
REVERSE(s) -- 'abc' -> 'cba' (Postgres / MySQL / SQL Server / Snowflake / DuckDB)Split by delimiter #
-- Get one piece by index.
SPLIT_PART('a,b,c', ',', 2) -- 'b' (Postgres / DuckDB / Snowflake / Redshift)
SUBSTRING_INDEX('a,b,c', ',', 2) -- 'a,b' (MySQL — note: returns up-to-N)
REGEXP_SUBSTR('a,b,c', '[^,]+', 1, 2) -- 'b' (Oracle, regex)
-- Explode into rows.
UNNEST(STRING_TO_ARRAY('a,b,c', ',')) -- Postgres / DuckDB
STRING_SPLIT('a,b,c', ',') -- SQL Server 2016+
UNNEST(SPLIT('a,b,c', ',')) -- BigQuery
LATERAL FLATTEN(input => SPLIT('a,b,c', ',')) -- SnowflakeAggregate strings (rows → list) #
STRING_AGG(col, ', ' ORDER BY col) -- Postgres / SQL Server 2017+ / DuckDB / Snowflake / BigQuery
LISTAGG(col, ', ') WITHIN GROUP (ORDER BY col) -- Oracle / Snowflake / Redshift
GROUP_CONCAT(col ORDER BY col SEPARATOR ', ') -- MySQL / SQLite
-- Watch the truncation defaults: MySQL 1024 bytes, Oracle 4000 bytes.
SET SESSION group_concat_max_len = 1000000; -- raise on MySQLLIKE patterns #
WHERE s LIKE 'abc%' -- starts with 'abc'
WHERE s LIKE '%abc' -- ends with 'abc'
WHERE s LIKE '%abc%' -- contains 'abc'
WHERE s LIKE 'a_c' -- 'a' + any single char + 'c'
WHERE s LIKE 'abc\%' ESCAPE '\' -- literal '%' (escape required)
-- Case-insensitive.
WHERE LOWER(s) LIKE 'abc%' -- portable (defeats index)
WHERE s ILIKE 'abc%' -- Postgres / DuckDB
-- Anchored: prefix LIKE 'abc%' uses index; '%abc' or '%abc%' usually doesn't.Regex #
-- Match.
REGEXP_LIKE(s, '^abc[0-9]+$') -- most engines
s ~ '^abc' -- Postgres / DuckDB
s ~* '^abc' -- Postgres / DuckDB case-insensitive
-- Replace.
REGEXP_REPLACE(s, '[0-9]+', '?') -- replace first match (or all, depending on engine)
REGEXP_REPLACE(s, '[0-9]+', '?', 'g') -- explicitly all (Postgres / DuckDB)
-- Extract.
REGEXP_SUBSTR(s, '[a-z]+@[a-z.]+') -- first email-like substring
(REGEXP_MATCHES(s, '([a-z]+)@([a-z.]+)'))[1] -- Postgres: capture group 1
REGEXP_EXTRACT(s, '[0-9]{4}') -- BigQuery / DuckDBType casts to/from string #
CAST(value AS VARCHAR) -- portable
value::VARCHAR -- Postgres / DuckDB shorthand
CAST(s AS INTEGER) -- string → number
CAST(s AS DATE) -- string → date
-- Safe casts (return NULL on failure instead of erroring).
TRY_CAST(s AS INTEGER) -- DuckDB / SQL Server / Snowflake
SAFE_CAST(s AS INT64) -- BigQuery
TO_NUMBER(s, '999.99') -- Oracle / Postgres with formatCommon patterns #
-- Pull domain out of email.
SELECT SPLIT_PART(email, '@', 2) AS domain FROM users;
-- Strip non-digits from a phone number.
SELECT REGEXP_REPLACE(phone, '[^0-9]', '', 'g') FROM users;
-- Build a full name (NULL-safe).
SELECT CONCAT_WS(' ', first_name, last_name) FROM users;
-- Case-insensitive search with index-friendliness on Postgres.
WHERE email ILIKE 'ana@%';
-- Or: lower-cased index: CREATE INDEX users_email_lower_idx ON users (LOWER(email));
-- WHERE LOWER(email) = LOWER('Ana@example.com');
-- Truncate a long string for display.
SELECT
CASE WHEN LENGTH(description) > 100
THEN SUBSTRING(description, 1, 100) || '...'
ELSE description END AS short_desc
FROM products;Common mistakes #
- NULL poisoning with
||: anyNULLin the chain returnsNULLfor the whole expression. UseCONCAT_WSorCOALESCE(col, ''). - Wrapping an indexed column in a function:
WHERE LOWER(email) = ...defeats a plain index. UseILIKE(Postgres / DuckDB) or a functional index onLOWER(email). - MySQL's
||: defaults to logical OR, not concat, so'a' || 'b'is1. UseCONCATfor portability. POSITIONis 1-based, not 0-based.POSITION('@' IN 'a@b')returns2.- String-typed JSON extraction:
(payload ->> 'amount') > 100does string comparison ("9" > "100"). Always cast:(payload ->> 'amount')::DECIMAL.
FAQ #
How do I extract a substring from a string in SQL?
SUBSTRING(s FROM start FOR length) is the ANSI form; SUBSTRING(s, start, length) is the function form supported on most engines. SUBSTR is an alias on MySQL, SQLite, Oracle. LEFT(s, n) and RIGHT(s, n) are convenient shortcuts on Postgres, MySQL, SQL Server, DuckDB. Position is 1-based on every major engine.
Why does my CONCAT return NULL?
On Postgres, DuckDB, SQLite, Oracle, Snowflake, Redshift: || (and CONCAT on most) returns NULL if any operand is NULL. Use CONCAT_WS(' ', a, b, c) which skips NULLs, or wrap each side in COALESCE(col, ''). MySQL's CONCAT also propagates NULL; SQL Server's CONCAT (2012+) treats NULL as empty string.
How do I split a comma-separated string into rows?
Postgres / DuckDB: UNNEST(STRING_TO_ARRAY(s, ',')). SQL Server: STRING_SPLIT(s, ','). BigQuery: UNNEST(SPLIT(s, ',')). Snowflake: LATERAL FLATTEN(input => SPLIT(s, ',')). MySQL has no native function; use a recursive CTE or numbers table.
How do I do case-insensitive matching?
Three options. (1) LOWER(col) = LOWER('value'): portable but defeats indexes (use a functional index on LOWER(col)). (2) ILIKE, the Postgres / DuckDB shorthand for case-insensitive LIKE. (3) A COLLATE clause, which pushes the case-fold into the type system and lets the engine optimize.
What's the difference between LIKE and REGEXP?
LIKE supports two wildcards: % (zero-or-more chars) and _ (single char). REGEXP (or ~ on Postgres) supports full regex: character classes, quantifiers, alternation, captures. Use LIKE for simple prefix/contains; reach for regex when the pattern is more complex. Both can defeat indexes when wrapping the column.
How do I aggregate multiple rows into a single delimited string?
Postgres / SQL Server (2017+) / DuckDB / Snowflake / BigQuery: STRING_AGG(col, ', ' ORDER BY col). Oracle / Snowflake / Redshift: LISTAGG(col, ', ') WITHIN GROUP (ORDER BY col). MySQL / SQLite: GROUP_CONCAT(col ORDER BY col SEPARATOR ', '). Watch the truncation defaults: MySQL caps at 1024 bytes by default.
**For string manipulation that doesn't need to happen in SQL, clean it up at ingest, not at every query.** Splitting strings, parsing dates from text, normalizing case: each one repeated across a hundred queries is real cost. Doing it once at the ingest layer (or on materialized columns) makes downstream SQL simpler and faster. Detail pages: Substring, Split string, Regex, Concatenate, Concatenate rows.