JSON

Unnest a JSON array into rows

Q: What's LATERAL FLATTEN in Snowflake?

LATERAL FLATTEN(input => col) explodes an array (or object) into rows. Common use: SELECT e.id, f.value AS item FROM events e, LATERAL FLATTEN(input => e.payload:items) f. Returns columns SEQ, KEY, PATH, INDEX, VALUE, THIS; usually you just want f.value.

Q: Should I store data as JSON arrays or as separate rows?

For truly variable per-row data: JSON arrays are fine. For data that's queried often: flatten on ingest into a separate table with one row per array element. The flattened form indexes well, joins naturally, and avoids per-query JSON parsing overhead. Use JSON for ad-hoc; flatten for production paths.

Q: How do I get the array index in the result?

BigQuery: UNNEST(...) WITH OFFSET AS idx. Postgres: jsonb_array_elements(arr) WITH ORDINALITY AS t(item, idx). Useful when the position matters (e.g., "the 3rd item in the cart").

Short answer: explode a JSON-array column into one row per element. Postgres uses jsonb_array_elements (or json_array_elements); BigQuery uses UNNEST(JSON_EXTRACT_ARRAY(...)); Snowflake uses LATERAL FLATTEN(input => col); SQL Server uses OPENJSON. The output is one row per array element, joined with the parent row via a lateral join.

Canonical SQL #

sql

-- Postgres.
SELECT e.id, item
FROM   events e,
       LATERAL jsonb_array_elements(e.payload -> 'items') AS item;

-- Or text-typed elements.
SELECT e.id, item
FROM   events e,
       LATERAL jsonb_array_elements_text(e.payload -> 'items') AS item;

-- DuckDB (no jsonb_array_elements — cast the JSON array to a list).
SELECT e.id,
       UNNEST(CAST(e.payload -> 'items' AS VARCHAR[])) AS item
FROM   events e;

-- BigQuery.
SELECT e.id, item
FROM   events e,
       UNNEST(JSON_EXTRACT_ARRAY(payload, '$.items')) AS item;

-- Snowflake.
SELECT e.id, f.value AS item
FROM   events e,
       LATERAL FLATTEN(input => e.payload:items) f;

-- SQL Server.
SELECT e.id, item.value AS item
FROM   events e
CROSS  APPLY OPENJSON(payload, '$.items') AS item;

-- MySQL.
SELECT e.id, jt.item
FROM   events e,
       JSON_TABLE(e.payload, '$.items[*]' COLUMNS (item VARCHAR(100) PATH '$')) AS jt;

Example #

Loading SQL editor...

Common patterns #

sql

-- Postgres jsonb syntax — see the dialect table for other engines.

-- Count items per event, including events with empty/missing arrays.
SELECT e.id,
       COUNT(item.*) AS item_count
FROM   events e
LEFT JOIN LATERAL jsonb_array_elements(e.payload -> 'items') AS item ON TRUE
GROUP  BY e.id;

-- Unnest objects, then extract per-object fields.
SELECT e.id,
       (line ->> 'sku')             AS sku,
       (line ->> 'qty')::INTEGER    AS qty
FROM   orders e,
       LATERAL jsonb_array_elements(e.payload -> 'lines') AS line;

-- Find events that contain a specific value in their array.
SELECT * FROM events
WHERE payload -> 'items' @> '"banana"'::JSONB;

Dialect comparison #

Engine	Syntax	Gotcha
PostgreSQL	`jsonb_array_elements(payload -> 'items')`	empty arrays drop the parent row; use `LEFT JOIN LATERAL ... ON TRUE` to keep it
MySQL	`JSON_TABLE(payload, '$.items[*]' COLUMNS (...))`	result columns declared inline
SQL Server	`CROSS APPLY OPENJSON(payload, '$.items')`	returns `key` / `value` / `type`; add `WITH (...)` for typed columns
BigQuery	`UNNEST(JSON_EXTRACT_ARRAY(payload, '$.items'))`	native `JSON` columns need `JSON_QUERY_ARRAY`
Snowflake	`LATERAL FLATTEN(input => payload:items)`	the element is `f.value`, not `f`
Oracle	`JSON_TABLE(payload, '$.items[*]' COLUMNS (...))`	—
SQLite	`json_each(payload, '$.items')`	element lives in the `value` column
DuckDB	`UNNEST(CAST(payload -> 'items' AS VARCHAR[]))`	no `jsonb_array_elements`; cast the JSON array to a list first

Dialect notes #

Postgres: jsonb_array_elements(arr) returns a row per element, each typed jsonb. Use jsonb_array_elements_text(arr) for text-typed elements. The LATERAL keyword is optional; comma-join works the same way.
DuckDB: no jsonb_array_elements, so cast the JSON array to a list and UNNEST it: UNNEST(CAST(payload -> 'items' AS VARCHAR[])). The -> / ->> extraction operators match Postgres.
Snowflake: LATERAL FLATTEN(input => arr) returns columns SEQ, KEY, PATH, INDEX, VALUE, THIS. Most workflows just use f.value.
BigQuery: UNNEST(JSON_EXTRACT_ARRAY(payload, '$.items')) is canonical. If the column is the new native JSON type, UNNEST(JSON_QUERY_ARRAY(payload, '$.items')) instead.
SQL Server: OPENJSON(payload, '$.items') returns rows with key, value, type columns; access via CROSS APPLY. For typed columns, add a WITH (...) clause.
MySQL: JSON_TABLE(json_doc, path COLUMNS (...)) is the standard. Cleaner output than the raw extract because you specify the result columns inline.
Oracle: JSON_TABLE similar to MySQL. Mature, can extract many fields from each element in one expression.
Redshift: no equivalent, since Redshift's JSON support is text-based. Workaround: pre-process in source system or load arrays as separate rows during ingestion.

FAQ #

How do I unnest a JSON array in PostgreSQL?

SELECT e.id, item FROM events e, LATERAL jsonb_array_elements(e.payload -> 'items') AS item. For text-typed elements, use jsonb_array_elements_text instead. The LATERAL keyword is optional; comma-join works the same way.

What's LATERAL FLATTEN in Snowflake?

LATERAL FLATTEN(input => col) explodes an array (or object) into rows. Common use: SELECT e.id, f.value AS item FROM events e, LATERAL FLATTEN(input => e.payload:items) f. Returns columns SEQ, KEY, PATH, INDEX, VALUE, THIS; usually you just want f.value.

How do I unnest in BigQuery?

SELECT e.id, item FROM events e, UNNEST(JSON_EXTRACT_ARRAY(payload, '$.items')) AS item. For arrays of structs, the STRUCT fields are accessible directly: SELECT e.id, o.order_id FROM user_orders e, UNNEST(e.orders) AS o.

What happens when the array is empty?

Empty arrays produce zero rows for that parent. The parent row vanishes from the result, similar to INNER JOIN. To keep parents with empty/missing arrays, use LEFT JOIN LATERAL ... ON TRUE (Postgres / DuckDB) or the equivalent LEFT JOIN UNNEST(...) form.

Should I store data as JSON arrays or as separate rows?

For truly variable per-row data: JSON arrays are fine. For data that's queried often: flatten on ingest into a separate table with one row per array element. The flattened form indexes well, joins naturally, and avoids per-query JSON parsing overhead. Use JSON for ad-hoc; flatten for production paths.

How do I get the array index in the result?

BigQuery: UNNEST(...) WITH OFFSET AS idx. Postgres: jsonb_array_elements(arr) WITH ORDINALITY AS t(item, idx). Useful when the position matters (e.g., "the 3rd item in the cart").

Warning

Unnesting a JSON array against millions of rows is expensive. The engine reads the JSON, parses it, and emits rows during the scan; no index can help past finding the parent row. If a JSON array is queried often (e.g., events.items), the right architecture is to flatten on ingest: write the array to a separate event_items table at load time, with one row per item and a foreign key back. Joins replace nested queries; indexes work; query plans get readable. Use UNNEST for ad-hoc analysis; flatten for production paths.