英文:
How to unnest a json arrays columns
问题
我有一个名为event_params
的列,其中包含以下值:
[{"key": "label", "value": {"int_value": null, "float_value": null, "double_value": null, "string_value": "example"}}, {"key": "label2", "value": {"int_value": null, "float_value": null, "double_value": null, "string_value": "another_example"}}, {"key": "label3", "value": {"int_value": 9, "float_value": null, "double_value": null, "string_value": null}}, {"key": "label3", "value": {"int_value": 1234, "float_value": null, "double_value": null, "string_value": null}}]
或者
[{"key": "label",
"value": {"int_value": null,
"float_value": null,
"double_value": null,
"string_value": "example"}},
{"key": "label2",
"value": {"int_value": null,
"float_value": null,
"double_value": null,
"string_value": "another_example"}},
{"key": "label4",
"value": {"int_value": 9,
"float_value": null,
"double_value": null,
"string_value": null}},
{"key": "label4",
"value": {"int_value": 1234,
"float_value": null,
"double_value": null,
"string_value": null}}]
我想将此列展开以提取每个user_id
中key
为"label"的string_value
。为此,我尝试使用jsonb_to_recordset
函数,但我得到的输出基本上是相同的user_id
,只是每个"value"键的值不同:
user_id | value
123 | {"int_value": null, "float_value": null, "double_value": null, "string_value": "example"}
123 | {"int_value": null, "float_value": null, "double_value": null, "string_value": "another_example"}
123 | {"int_value": 9, "float_value": null, "double_value": null, "string_value": null}
...
尽管有更多的用户(我希望每个用户只有一行)。这是我正在使用的代码片段:
select
user_id,
event_params,
evs.key,
evs.value::json
from google_analytics.firebase_events_analytics,
jsonb_to_recordset(event_params) as evs(key text, value text)
英文:
I have a column called event_params
with values like this:
[{"key": "label", "value": {"int_value": null, "float_value": null, "double_value": null, "string_value": "example"}}, {"key": "label2", "value": {"int_value": null, "float_value": null, "double_value": null, "string_value": "another_example"}}, {"key": "label3", "value": {"int_value": 9, "float_value": null, "double_value": null, "string_value": null}}, {"key": "label3", "value": {"int_value": 1234, "float_value": null, "double_value": null, "string_value": null}}]
OR
[{"key": "label",
"value": {"int_value": null,
"float_value": null,
"double_value": null,
"string_value": "example"}},
{"key": "label2",
"value": {"int_value": null,
"float_value": null,
"double_value": null,
"string_value": "another_example"}},
{"key": "label4",
"value": {"int_value": 9,
"float_value": null,
"double_value": null,
"string_value": null}},
{"key": "label4",
"value": {"int_value": 1234,
"float_value": null,
"double_value": null,
"string_value": null}}]
I would like to unnest this column to extract the string_value
from key
= "label" for every user_id
. To do that I'm trying to use jsonb_to_recordset
function but the output I'm getting is basically the same user_id but multiple times and the only thing it changes it's the values from each "value" key:
user_id | value
123 | {"int_value": null, "float_value": null, "double_value": null, "string_value": "example"}
123 | {"int_value": null, "float_value": null, "double_value": null, "string_value": "another_example"}
123 | {"int_value": 9, "float_value": null, "double_value": null, "string_value": null}
...
All of that for the same user even though there are more users (I would expect to have just one row per user). Here is the piece of code that I'm using:
select
user_id,
event_params,
evs.key,
evs.value::json
from google_analytics.firebase_events_analytics,
jsonb_to_recordset(event_params) as evs(key text, value text)
答案1
得分: 2
查询由于隐式的CROSS JOIN
而返回了多行每个user_id
,当显式表达CROSS JOIN
时,等效的查询如下(还包括与JSONB_TO_RECORDSET
关联的隐式LATERAL
):
SELECT user_id,
event_params,
evs.key,
evs.value
FROM google_analytics.firebase_events_analytics
CROSS JOIN LATERAL JSONB_TO_RECORDSET(event_params) AS evs(key text, value jsonb)
可以通过过滤连接的行来实现所需的行为。以下查询使用WHERE
子句中的条件来排除不需要的行:
SELECT user_id,
event_params,
evs.key,
evs.value
FROM google_analytics.firebase_events_analytics
CROSS JOIN JSONB_TO_RECORDSET(event_params) AS evs(key text, value jsonb)
WHERE evs.key = 'label';
另一种方法是使用JOIN
条件:
SELECT user_id,
event_params,
evs.key,
evs.value
FROM google_analytics.firebase_events_analytics
JOIN JSONB_TO_RECORDSET(event_params) AS evs(key text, value jsonb)
ON evs.key = 'label';
这两种方法是等效的,并且在我测试查询时产生了相同的执行计划。
英文:
The query is returning multiple rows per user_id because of an implicit CROSS JOIN
. The following is the equivalent when the CROSS JOIN
is explicitly expressed (the implicit LATERAL
associated with JSONB_TO_RECORDSET
is also included):
SELECT user_id,
event_params,
evs.key,
evs.value
FROM google_analytics.firebase_events_analytics
CROSS JOIN LATERAL JSONB_TO_RECORDSET(event_params) AS evs(key text, value jsonb)
The desired behavior can be achieved by filtering the joined rows. The following uses criteria in the WHERE
clause to exclude undestired rows:
SELECT user_id,
event_params,
evs.key,
evs.value
FROM google_analytics.firebase_events_analytics
CROSS JOIN JSONB_TO_RECORDSET(event_params) AS evs(key text, value jsonb)
WHERE evs.key = 'label';
An alternative is to use JOIN
criteria:
SELECT user_id,
event_params,
evs.key,
evs.value
FROM google_analytics.firebase_events_analytics
JOIN JSONB_TO_RECORDSET(event_params) AS evs(key text, value jsonb)
ON evs.key = 'label';
The two approaches are equivalent and produced identical explain plans when I tested the queries.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论