Build JSON from rows
Short answer: JSON_AGG(col) aggregates rows into a JSON array; JSON_OBJECT('key', val, ...) builds a single object from columns; JSON_OBJECTAGG(key, val) builds an object from key/value rows. Snowflake uses OBJECT_CONSTRUCT and ARRAY_AGG with implicit JSON conversion. The right tool when an API contract or downstream consumer wants the data shaped, not flat.
Canonical SQL #
-- Build a JSON object from columns of one row.
SELECT JSON_OBJECT('id', id, 'email', email) FROM users;
-- Aggregate rows into a JSON array.
SELECT user_id, JSON_AGG(product ORDER BY product) AS products
FROM orders
GROUP BY user_id;
-- Aggregate rows into a JSON array OF OBJECTS.
SELECT user_id,
JSON_AGG(JSON_BUILD_OBJECT('id', id, 'product', product) ORDER BY id) AS lines
FROM orders
GROUP BY user_id;
-- Snowflake.
SELECT user_id,
ARRAY_AGG(OBJECT_CONSTRUCT('id', id, 'product', product)) AS lines
FROM orders
GROUP BY user_id;Example #
Common patterns #
-- One row per user with a "products" array — the JSON shape an API would return.
SELECT
u.id,
u.email,
JSON_AGG(
JSON_BUILD_OBJECT('order_id', o.id, 'total', o.total)
ORDER BY o.created_at
) FILTER (WHERE o.id IS NOT NULL) AS orders -- Postgres: skip nulls (LEFT JOIN no-match)
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.email;
-- Build a key-value map.
SELECT JSON_OBJECTAGG(setting_key, setting_value) AS settings
FROM user_settings
WHERE user_id = 42;
-- Pretty-print for debugging (Postgres / DuckDB).
SELECT JSONB_PRETTY(JSON_AGG(...));Dialect comparison #
| Engine | Syntax | Gotcha |
|---|---|---|
| PostgreSQL | json_agg(expr ORDER BY ...) | jsonb_agg for jsonb; FILTER (WHERE ...) skips rows |
| MySQL | JSON_ARRAYAGG(expr) | element order not guaranteed before 8.0 |
| SQL Server | FOR JSON PATH | JSON_ARRAYAGG / JSON_OBJECTAGG only in SQL Server 2025 |
| BigQuery | TO_JSON(ARRAY_AGG(STRUCT(...))) | much cleaner with the native JSON type than legacy string JSON |
| Snowflake | ARRAY_AGG(OBJECT_CONSTRUCT('k', v)) | returns VARIANT; sort with WITHIN GROUP (ORDER BY ...) |
| Oracle | JSON_ARRAYAGG(expr) | 19c+ |
| SQLite | json_group_array(expr) | json_group_object for key/value maps |
| DuckDB | json_group_array(expr) | SQLite-compat name; Postgres-style json_agg also works |
Dialect notes #
- Postgres:
JSON_AGG(andJSONB_AGG),JSON_OBJECT,JSON_BUILD_OBJECT,JSON_OBJECT_AGG,ROW_TO_JSON(a whole row),TO_JSONB(any value).FILTER (WHERE ...)clause to skip rows. ANSIJSON_OBJECTsyntax in Postgres 16+. - DuckDB:
JSON_GROUP_ARRAY,JSON_GROUP_OBJECT(SQLite-compat). Also Postgres-styleJSON_AGGandTO_JSON. - MySQL (5.7+):
JSON_ARRAYAGG,JSON_OBJECTAGG,JSON_OBJECT,JSON_ARRAY. NoORDER BYinside aggregates until 8.0. - SQL Server (2017+):
STRING_AGGfor text or build manually withFOR JSON PATH/FOR JSON AUTO. SQL Server 2022 addsJSON_OBJECTandJSON_ARRAY. SQL Server 2025 addsJSON_OBJECTAGGandJSON_ARRAYAGG. - Oracle:
JSON_OBJECT,JSON_ARRAYAGG,JSON_OBJECTAGG,JSON_TRANSFORM(modify). 19c+. - Snowflake:
OBJECT_CONSTRUCTfor objects,ARRAY_AGGfor arrays. ImplicitVARIANTtyping means output is naturally JSON-shaped.OBJECT_AGGto build a map from key/value rows. - BigQuery:
TO_JSON,JSON_OBJECT,ARRAY_AGGofSTRUCTthenTO_JSON. The nativeJSONtype makes this much cleaner than the legacySTRING-based JSON. - Redshift: limited.
LISTAGGto manually build JSON-looking strings, but no realJSON_AGG. Often easier to emit JSON in the application layer.
FAQ #
How do I aggregate rows into a JSON array?
Postgres / DuckDB: JSON_AGG(col ORDER BY col). SQL Server (2017+): JSON_ARRAYAGG(col) or STRING_AGG (then parse). Snowflake: ARRAY_AGG(col) (returns a VARIANT array, behaves like JSON). MySQL: JSON_ARRAYAGG. BigQuery: ARRAY_AGG of structs, then TO_JSON.
How do I build a JSON object from columns?
Postgres: JSON_BUILD_OBJECT('id', id, 'email', email) or JSON_OBJECT (16+). MySQL / Postgres 16+: JSON_OBJECT('id', id, 'email', email). Snowflake: OBJECT_CONSTRUCT('id', id, 'email', email). SQL Server (2022+): JSON_OBJECT('id': id, 'email': email).
Can I sort the elements of an aggregated JSON array?
Yes — pass ORDER BY inside the aggregate: JSON_AGG(col ORDER BY col) (Postgres). Snowflake: ARRAY_AGG(col) WITHIN GROUP (ORDER BY col). SQL Server: STRING_AGG ... WITHIN GROUP (ORDER BY ...). The order is preserved in the resulting JSON.
Should I store JSON or flat columns?
Flat columns are usually right. JSON aggregation is a contract layer between the warehouse and consumers (APIs, BI tools that want shape), not a storage layer. Storing the JSON-aggregated result back is usually a smell — you've collapsed queryable structure into a blob.
How do I skip NULL when aggregating?
Postgres: JSON_AGG(col) FILTER (WHERE col IS NOT NULL). Or skip the join: JSON_AGG(...) FILTER (WHERE other.id IS NOT NULL) after a LEFT JOIN. SQL Server's STRING_AGG skips NULLs by default; check your engine.
Why is my aggregated JSON huge?
Aggregating a column with thousands of rows into one JSON document creates massive single values. Network and parsing become the bottleneck. If the result regularly exceeds 1MB, reconsider: paginate, deliver as separate rows, or aggregate at finer granularity (group by + JSON_AGG inside each group).
**Use JSON aggregation as a contract layer between the warehouse and consumers, not as a storage layer.** Building JSON in a query that feeds an API is a clean pattern: the database produces exactly the shape the consumer wants. Storing the JSON-aggregated result back into a table is usually a smell, because you've collapsed queryable structure into a blob, and the next analyst will need to unnest it again. The right shape for downstream storage is almost always normalized rows; let the API or BI layer reshape into JSON when serving.