如何展开 JSON 数组列

huangapple go评论76阅读模式
英文:

How to unnest a json arrays columns

问题

我有一个名为event_params的列,其中包含以下值:

[{"key": "label", "value": {"int_value": null, "float_value": null, "double_value": null, "string_value": "example"}}, {"key": "label2", "value": {"int_value": null, "float_value": null, "double_value": null, "string_value": "another_example"}}, {"key": "label3", "value": {"int_value": 9, "float_value": null, "double_value": null, "string_value": null}}, {"key": "label3", "value": {"int_value": 1234, "float_value": null, "double_value": null, "string_value": null}}]

或者

[{"key": "label",
  "value": {"int_value": null,
            "float_value": null,
            "double_value": null,
            "string_value": "example"}},
  {"key": "label2",
   "value": {"int_value": null,
            "float_value": null,
            "double_value": null,
            "string_value": "another_example"}},
  {"key": "label4",
   "value": {"int_value": 9,
            "float_value": null,
            "double_value": null,
            "string_value": null}},
  {"key": "label4",
   "value": {"int_value": 1234,
             "float_value": null,
             "double_value": null,
             "string_value": null}}]

我想将此列展开以提取每个user_idkey为"label"的string_value。为此,我尝试使用jsonb_to_recordset函数,但我得到的输出基本上是相同的user_id,只是每个"value"键的值不同:

user_id | value
123 | {"int_value": null, "float_value": null, "double_value": null, "string_value": "example"}
123 | {"int_value": null, "float_value": null, "double_value": null, "string_value": "another_example"}
123 | {"int_value": 9, "float_value": null, "double_value": null, "string_value": null}
...

尽管有更多的用户(我希望每个用户只有一行)。这是我正在使用的代码片段:

select
	user_id,
	event_params,
	evs.key,
	evs.value::json
from google_analytics.firebase_events_analytics,
	jsonb_to_recordset(event_params) as evs(key text, value text)
英文:

I have a column called event_params with values like this:

[{"key": "label", "value": {"int_value": null, "float_value": null, "double_value": null, "string_value": "example"}}, {"key": "label2", "value": {"int_value": null, "float_value": null, "double_value": null, "string_value": "another_example"}}, {"key": "label3", "value": {"int_value": 9, "float_value": null, "double_value": null, "string_value": null}}, {"key": "label3", "value": {"int_value": 1234, "float_value": null, "double_value": null, "string_value": null}}]

OR

[{"key": "label",
  "value": {"int_value": null,
            "float_value": null,
            "double_value": null,
            "string_value": "example"}},
  {"key": "label2",
   "value": {"int_value": null,
            "float_value": null,
            "double_value": null,
            "string_value": "another_example"}},
  {"key": "label4",
   "value": {"int_value": 9,
            "float_value": null,
            "double_value": null,
            "string_value": null}},
  {"key": "label4",
   "value": {"int_value": 1234,
             "float_value": null,
             "double_value": null,
             "string_value": null}}]

I would like to unnest this column to extract the string_value from key = "label" for every user_id. To do that I'm trying to use jsonb_to_recordset function but the output I'm getting is basically the same user_id but multiple times and the only thing it changes it's the values from each "value" key:

user_id | value
123 | {"int_value": null, "float_value": null, "double_value": null, "string_value": "example"}
123 | {"int_value": null, "float_value": null, "double_value": null, "string_value": "another_example"}
123 | {"int_value": 9, "float_value": null, "double_value": null, "string_value": null}
...

All of that for the same user even though there are more users (I would expect to have just one row per user). Here is the piece of code that I'm using:

select
	user_id,
	event_params,
	evs.key,
	evs.value::json
from google_analytics.firebase_events_analytics,
	jsonb_to_recordset(event_params) as evs(key text, value text)

答案1

得分: 2

查询由于隐式的CROSS JOIN而返回了多行每个user_id,当显式表达CROSS JOIN时,等效的查询如下(还包括与JSONB_TO_RECORDSET关联的隐式LATERAL):

SELECT user_id,
       event_params,
       evs.key,
       evs.value
  FROM google_analytics.firebase_events_analytics
  CROSS JOIN LATERAL JSONB_TO_RECORDSET(event_params) AS evs(key text, value jsonb)

可以通过过滤连接的行来实现所需的行为。以下查询使用WHERE子句中的条件来排除不需要的行:

SELECT user_id,
       event_params,
       evs.key,
       evs.value
  FROM google_analytics.firebase_events_analytics
  CROSS JOIN JSONB_TO_RECORDSET(event_params) AS evs(key text, value jsonb)
  WHERE evs.key = 'label';

另一种方法是使用JOIN条件:

SELECT user_id,
       event_params,
       evs.key,
       evs.value
  FROM google_analytics.firebase_events_analytics
  JOIN JSONB_TO_RECORDSET(event_params) AS evs(key text, value jsonb)
    ON evs.key = 'label';

这两种方法是等效的,并且在我测试查询时产生了相同的执行计划。

英文:

The query is returning multiple rows per user_id because of an implicit CROSS JOIN. The following is the equivalent when the CROSS JOIN is explicitly expressed (the implicit LATERAL associated with JSONB_TO_RECORDSET is also included):

SELECT user_id,
       event_params,
       evs.key,
       evs.value
  FROM google_analytics.firebase_events_analytics
  CROSS JOIN LATERAL JSONB_TO_RECORDSET(event_params) AS evs(key text, value jsonb)

The desired behavior can be achieved by filtering the joined rows. The following uses criteria in the WHERE clause to exclude undestired rows:

SELECT user_id,
       event_params,
       evs.key,
       evs.value
  FROM google_analytics.firebase_events_analytics
  CROSS JOIN JSONB_TO_RECORDSET(event_params) AS evs(key text, value jsonb)
  WHERE evs.key = 'label';

An alternative is to use JOIN criteria:

SELECT user_id,
       event_params,
       evs.key,
       evs.value
  FROM google_analytics.firebase_events_analytics
  JOIN JSONB_TO_RECORDSET(event_params) AS evs(key text, value jsonb)
    ON evs.key = 'label';

The two approaches are equivalent and produced identical explain plans when I tested the queries.

huangapple
  • 本文由 发表于 2023年8月9日 06:06:20
  • 转载请务必保留本文链接:https://go.coder-hub.com/76863470.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定