从JSON提取数据到PostgreSQL中的表格(动态)

huangapple go评论120阅读模式
英文:

How extract data from json to table (dynamic) in postgresql

问题

如何将JSON数据提取到表格格式中,除了_id之外不需要任何参考。

我有下面的表格和JSON数据。

输出应该如下表格格式。

_id 指标
128 start 2023-06-13T19:01:20.952+05:30
128 end 2023-06-13T19:01:43.804+05:30
128 username nayan
128 Output_1/surveyname A1
128 Output_1/thematic A2
128 Output_1/State DC
128 Output_1/Monthe 06
128 Output_1/Quarter 02
128 Output_1/QuarterText Q2
128 Output_1/Suboutput_1_3/RRF_3/Question_3/Question_3__1/Q2Target_3 29
128 _submitted_by nayan
英文:

How to extract data from json to table format without any reference except _id.

I have a below table and json data.

create table jsobdata(
id serial,
j_data jsonb)

insert into jsobdata (j_data) values('{"count":1,"next":null,"previous":null,"results":[{"_id":128,"start":"2023-06-13T19:01:20.952+05:30","end":"2023-06-13T19:01:43.804+05:30","username":"nayan","Output_1/surveyname":"A1","Output_1/thematic":"A2","Output_1/State":"DC","Output_1/Month":"06","Output_1/Quarter":"2","Output_1/QuarterText":"Q2","Output_1/Suboutput_1_3/RRF_3/Question_3/Question_3__1/Q2Target_3":"29","_submitted_by":"nayan"}]}');

output should be like below table format.

_id indicator value
128 start 2023-06-13T19:01:20.952+05:30
128 end 2023-06-13T19:01:43.804+05:30
128 username nayan
128 Output_1/surveyname A1
128 Output_1/thematic A2
128 Output_1/State DC
128 Output_1/Monthe 06
128 Output_1/Quarter 02
128 Output_1/QuarterText Q2
128 Output_1/Suboutput_1_3/RRF_3/Question_3/Question_3__1/Q2Target_3 29
128 _submitted_by nayan

答案1

得分: 1

在PostgreSQL中,您可以使用jsonb_each_text函数从JSON列中提取数据,该函数将最外层的JSON对象扩展为一组键值对。结合使用jsonb_array_elements和jsonb_each_text将允许您提取数组中的嵌套JSON对象。

以下是如何实现这一点的示例:

SELECT
  results->'_id' AS _id,
  (kv).key AS indicator,
  (kv).value AS value
FROM 
  (SELECT j_data->'results' AS results_array FROM jsobdata) AS dt,
  jsonb_array_elements(dt.results_array) AS results,
  jsonb_each_text(results) AS kv;

此查询首先从j_data列中提取'results'数组。然后,它会扩展该数组中的每个JSON对象,并针对每个这些JSON对象,进一步将其扩展为键值对。这为您提供了每个键值对的一行,正如您所期望的那样。

但是,请记住,此查询无法正确处理更深层次的嵌套JSON对象。如果您的JSON对象可以具有任意级别的嵌套,您需要使用递归查询,这可能会相当复杂。

此外,请记住,从性能的角度来看,此类操作可能会昂贵。因此,如果您经常需要以结构化、类似表格的格式操作数据,考虑将其存储在规范化的关系表中,而不是使用JSONB存储。

根据更新后的问题,答案应该基本相同。然而,提供的JSON对象似乎具有一些嵌套属性。提供的查询仅适用于JSON对象的顶层属性,而不适用于像"Output_1/surveyname"这样的嵌套属性。

不幸的是,Postgres没有提供将嵌套的JSON对象展平为一组键值对的开箱即用方法。如果您可以控制传入JSON的结构,最简单的解决方案是避免嵌套属性。

如果无法更改JSON的结构,您将需要编写递归函数或一系列命令来处理嵌套属性。这可能会变得相当复杂,并且如果数据量大,性能可能不佳。

考虑以下解决方案:

WITH RECURSIVE extract_nested AS (
  SELECT
    results->'_id' AS _id,
    (kv).key AS indicator,
    CASE WHEN jsonb_typeof((kv).value) = 'object' THEN
      (kv).value::text -- 对于嵌套的JSON对象,只需将其转换为文本
    ELSE
      (kv).value -- 对于基本值,直接使用值
    END AS value
  FROM 
    (SELECT j_data->'results' AS results_array FROM jsobdata) AS dt,
    jsonb_array_elements(dt.results_array) AS results,
    jsonb_each_text(results) AS kv
)
SELECT * FROM extract_nested;

这是一个相当高级的查询。它使用了一个递归公共表达式(CTE)来处理嵌套的JSON对象。它首先提取顶层属性,就像前面的查询一样,但如果遇到嵌套的JSON对象,它只是将其转换为字符串(使用::text转换),而不是尝试提取其属性。

请注意,这是一个简化的方法,可能不完全适合您的需求。例如,它不处理JSON中的数组,而是将嵌套对象转换为字符串而不是展平它们。如果需要处理这些更复杂的情况,您可能需要编写一个更复杂的递归函数,或考虑在将数据插入数据库之前使用具有更好JSON处理功能的语言(如Python或JavaScript)处理数据。

另外,请注意,这种方法在处理大量数据时性能不佳。如果您经常需要以结构化的方式操作JSON数据,最好将数据存储在更结构化的格式中。

英文:

In PostgreSQL, you can extract data from JSON columns using the jsonb_each_text function, which expands the outermost JSON object into a set of key-value pairs. The combined use of jsonb_array_elements and jsonb_each_text will allow you to extract the nested JSON objects within your array.

Here is how you can achieve this:

SELECT
  results->_id AS _id,
  (kv).key AS indicator,
  (kv).value AS value
FROM 
  (SELECT j_data->'results' AS results_array FROM jsobdata) AS dt,
  jsonb_array_elements(dt.results_array) AS results,
  jsonb_each_text(results) AS kv;

This query first extracts the 'results' array from the j_data column. Then, it expands each JSON object within that array, and for each of these JSON objects, it further expands them into key-value pairs. This gives you a row for each key-value pair, just as you wanted.

However, keep in mind that this query won't handle deeper nested JSON objects properly. If your JSON objects can have arbitrary levels of nesting, you would need to use a recursive query, which can be quite complex.

Also, remember that such operations might be expensive in terms of performance. So, if you frequently need to operate on your data in a structured, table-like format, it might be worth considering to store it in normalized relational tables instead of JSONB.

Edit

Given the updated question, the answer should stay pretty much the same. However, it seems that the provided JSON object has some nested properties. The provided query would only work for properties at the top level of the JSON objects, but not for the nested ones like "Output_1/surveyname".

Unfortunately, Postgres does not provide an out-of-the-box way to flatten nested JSON objects into a set of key-value pairs. If you have control over the structure of the incoming JSON, the easiest solution would be to avoid nested properties.

If you can't change the structure of the JSON, you would need to write a recursive function or a series of commands to handle the nested properties. This can become quite complex and might not perform well if you have a lot of data.

Consider following solution:

WITH RECURSIVE extract_nested AS (
  SELECT
    results->_id AS _id,
    (kv).key AS indicator,
    CASE WHEN jsonb_typeof((kv).value) = 'object' THEN
      (kv).value::text -- for nested JSON objects, just convert them to text
    ELSE
      (kv).value -- for primitive values, use the value directly
    END AS value
  FROM 
    (SELECT j_data->'results' AS results_array FROM jsobdata) AS dt,
    jsonb_array_elements(dt.results_array) AS results,
    jsonb_each_text(results) AS kv
)
SELECT * FROM extract_nested;

This is a rather advanced query. It uses a recursive common table expression (CTE) to handle the nested JSON objects. It first extracts the top-level properties just like in the previous query, but if it encounters a nested JSON object, it just converts it to a string (using the ::text cast) instead of trying to extract its properties.

Please note that this is a simplistic approach and may not suit your needs perfectly. For example, it does not handle arrays within the JSON, and it turns nested objects into strings instead of flattening them. If you need to handle these more complex cases, you may need to write a more complex recursive function, or consider using a language with better JSON handling capabilities (like Python or JavaScript) to process the data before inserting it into the database.

Also, keep in mind that this approach is not performant for large amounts of data. If you often need to operate on the JSON data in a structured way, it might be better to store the data in a more structured format in the first place.

huangapple
  • 本文由 发表于 2023年6月19日 20:21:44
  • 转载请务必保留本文链接:https://go.coder-hub.com/76506582.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定