JSON to CSV in Python, CSV has more rows than JSON

Question

<details>
<summary>English:</summary>
I am attempting to convert JSON into CSV using Python, but I'm encountering an issue where 2 lines of JSON are producing 6 lines of CSV (a header plus 5 data rows). Here are samples of my original JSON and the resulting CSV output:

JSON:

{ "_id" : "a1" ,"pl" : [ { "age" : 45, "n" : "ar"}, { "age" : 52, "n" : "Pi" }, { "age" : 18, "n" : "al"} ] , "ld" : 13}
{ "_id" : "a2" ,"pl" : [ { "age" : 85, "n" : "ta"}, { "age" : 46, "n" : "lee" }] , "ld" : 14}

CSV:

_id, ld, age, n, age, n
a1, 13, 45, ar, 45, ar
a1, 13, 52, Pi, 52, Pi
a1, 13, 18, al, 18, al
a2, 14, 85, ta, 85, ta
a2, 14, 46, lee, 46, lee

I am processing 2 lines of JSON but unexpectedly get 5 data rows of CSV output.

Below is the Python code I'm currently using:

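    import glob

    import pandas as pd

    # Raw strings keep "\1" from being read as an octal escape.
    for i in glob.glob(r'D:\1.json'):
        data = []
        # Read the JSON Lines file in chunks of 100 records.
        df_it = pd.read_json(i, encoding='utf-8', lines=True, chunksize=100, dtype=object)
        for sub in df_it:
            data.append(sub)
        df = pd.concat(data)
        # One row per element of the "pl" list.
        df = df.explode('pl')
        # Expand each "pl" dict into its own columns, next to the original ones.
        df = pd.concat([
            df.reset_index(drop=True),
            pd.json_normalize(df.pl),
        ], axis=1)
        df = df.drop(['pl'], axis=1)
        df.to_csv(r'D:\1.csv', index=None, encoding='utf-8')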

I am hoping to get a CSV output formatted like this:

_id, pl.age1, pl.n1, pl.age2, pl.n2, pl.age3, pl.n3, ld,
a1, 45, ar, 52, Pi, 18, al, 13
a2, 85, ta, 46, lee, , , 14

What changes to my code or implementation do I need to make to achieve my desired CSV output?

</details>
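
The row count follows from how `DataFrame.explode` works: it emits one row per list element, so the three `pl` entries of `a1` and the two of `a2` give five data rows. A minimal sketch that reproduces this with the two sample records (the variable names here are illustrative and not taken from the original code):

```python
import pandas as pd

# The two sample records from the question.
records = [
    {"_id": "a1", "pl": [{"age": 45, "n": "ar"}, {"age": 52, "n": "Pi"}, {"age": 18, "n": "al"}], "ld": 13},
    {"_id": "a2", "pl": [{"age": 85, "n": "ta"}, {"age": 46, "n": "lee"}], "ld": 14},
]
df = pd.DataFrame(records)

# explode() repeats _id and ld once for every element of the "pl" list.
exploded = df.explode("pl")

# Count how many rows each original record turned into.
rows_per_id = exploded.groupby("_id").size().to_dict()
print(len(df), "records ->", len(exploded), "rows")
print(rows_per_id)
```

Getting one row per record therefore means skipping `explode` and spreading the list elements across columns instead.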
# Answer 1

**Score**: 0
I am assuming that each JSON document is stored in its own file.

Implementation:

    import glob
    import json

    import pandas as pd
    from cherrypicker import CherryPicker

    rows = []
    for file_name in glob.glob("json/*.json"):
        with open(file_name, encoding="utf-8") as file:
            data = json.load(file)
        # Flatten nested keys into names such as pl_0_age, pl_0_n, ...
        rows.append(CherryPicker(data).flatten().get())

    # DataFrame.append was removed in pandas 2.0, so build the frame in one step.
    df = pd.DataFrame(rows)
    df.to_csv("output.csv", index=False)

Output:

_id,pl_0_age,pl_0_n,pl_1_age,pl_1_n,pl_2_age,pl_2_n,ld
a1,45,ar,52,Pi,18.0,al,13
a2,85,ta,46,lee,,,14
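
If the input is a single JSON Lines file, as the `lines=True` call in the question suggests, the same wide layout can probably be built with pandas and the standard library alone. The sketch below is one possible approach, not the method used above: `flatten_record` is a hypothetical helper, and the in-memory `sample` stands in for the real file.

```python
import io
import json

import pandas as pd


def flatten_record(record, list_key="pl"):
    """Hypothetical helper: turn one nested record into a single flat dict."""
    flat = {k: v for k, v in record.items() if k != list_key}
    # Number the list elements from 1, giving names such as pl.age1, pl.n1.
    for pos, item in enumerate(record.get(list_key, []), start=1):
        for field, value in item.items():
            flat[f"{list_key}.{field}{pos}"] = value
    return flat


# Stand-in for the JSON Lines file from the question (assumption: one record per line).
sample = io.StringIO(
    '{"_id": "a1", "pl": [{"age": 45, "n": "ar"}, {"age": 52, "n": "Pi"}, {"age": 18, "n": "al"}], "ld": 13}\n'
    '{"_id": "a2", "pl": [{"age": 85, "n": "ta"}, {"age": 46, "n": "lee"}], "ld": 14}\n'
)

# One flat dict per input line, so one CSV row per JSON record.
rows = [flatten_record(json.loads(line)) for line in sample if line.strip()]
wide = pd.DataFrame(rows)

# Move "ld" to the end to match the column order requested in the question.
wide = wide[[c for c in wide.columns if c != "ld"] + ["ld"]]

csv_text = wide.to_csv(index=False)
print(csv_text)
```

Because `a2` has no third `pl` entry, `pl.age3` contains a missing value and pandas stores that column as floats, which is also why `18.0` appears in the output above.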

Posted by huangapple on June 16, 2023 at 10:54:49
When reposting, please keep this article's link: https://go.coder-hub.com/76486682.html