JSON

JSON in SQL

Read, write, query, and aggregate JSON data inside SQL. Used everywhere data lands as semi-structured payloads: event logs, API responses, user preference blobs. Every modern engine has JSON support, and the syntax differs more between engines than almost any other feature in SQL. Use these pages when you know which JSON operation you want and need the syntax for your engine.

At a glance #

sql
-- Get a field's value out of a JSON column.
SELECT payload ->> 'user_id' FROM events;                      -- Postgres / DuckDB
SELECT payload:user_id::STRING FROM events;                    -- Snowflake
SELECT JSON_VALUE(payload, '$.user_id') FROM events;           -- BigQuery / SQL Server / Oracle / MySQL

-- Turn a JSON array column into one row per element.
SELECT id, item FROM events, UNNEST(JSON_EXTRACT_ARRAY(payload, '$.items')) AS item;

-- Build JSON from rows.
SELECT JSON_OBJECT('id', id, 'email', email) FROM users;       -- standard / MySQL / Postgres / DuckDB

Dialect comparison #

EngineSyntaxGotcha
PostgreSQLpayload ->> 'key'returns text; -> returns JSON — use jsonb for storage
MySQLpayload ->> '$.key'path needs the $. prefix; shorthand for JSON_UNQUOTE(JSON_EXTRACT(...))
SQL ServerJSON_VALUE(payload, '$.key')scalars only; JSON_QUERY for objects/arrays
BigQueryJSON_VALUE(payload, '$.key')always returns STRING — cast before comparing
Snowflakepayload:key::STRINGwithout the cast, comparisons happen in VARIANT space
OracleJSON_VALUE(payload, '$.key')JSON_QUERY for non-scalar values
SQLitepayload ->> '$.key'-> / ->> need 3.38+; json_extract on older versions
DuckDBpayload ->> 'key'Postgres-style operators; json_extract(payload, '$.key') for paths

Pages #

Warning

JSON-in-SQL is fast for ad-hoc work and slow as a primary data model. A query that extracts a field from a small JSON column on a small table is fine; the same query against millions of rows reads and parses every JSON value on every scan. If you find yourself writing the same ->>'user_id' extraction in dozens of dashboards, that field belongs as its own column. Index it, document it, and stop paying the parse cost forever. JSON columns are the right place for truly variable payloads, not for "we didn't model it yet."