JSON to CSV in Python, CSV has more rows than JSON
# Question
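I am trying to convert JSON to CSV with Python, but 2 lines of JSON produce 6 lines of CSV (a header plus 5 data rows). Here are samples of my original JSON and the resulting CSV output: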
JSON:
{ "_id": "a1", "pl": [ { "age": 45, "n": "ar" }, { "age": 52, "n": "Pi" }, { "age": 18, "n": "al" } ], "ld": 13 }
{ "_id": "a2", "pl": [ { "age": 85, "n": "ta" }, { "age": 46, "n": "lee" } ], "ld": 14 }
CSV:
_id, ld, age, n, age, n
a1, 13, 45, ar, 45, ar
a1, 13, 52, Pi, 52, Pi
a1, 13, 18, al, 18, al
a2, 14, 85, ta, 85, ta
a2, 14, 46, lee, 46, lee
I am processing 2 lines of JSON but unexpectedly getting 5 data rows in the CSV output.
Below is the Python code I am currently using:
```python
import glob

import pandas as pd

# Raw strings keep the backslash in the Windows path from being read as an escape
for i in glob.glob(r'D:\1.json'):
    data = []
    df_it = pd.read_json(i, encoding='utf-8', lines=True, chunksize=100, dtype=object)
    for sub in df_it:
        data.append(sub)
    df = pd.concat(data)
    # explode() emits one row per element of the 'pl' list
    df = df.explode('pl')
    df = pd.concat([
        df.reset_index(drop=True),
        pd.json_normalize(df.pl),
    ], axis=1)
    df = df.drop(['pl'], axis=1)
    df.to_csv(r'D:\1.csv', index=None, encoding='utf-8')
```
I would like the CSV output to look like this:
_id, pl.age1, pl.n1, pl.age2, pl.n2, pl.age3, pl.n3, ld
a1, 45, ar, 52, Pi, 18, al, 13
a2, 85, ta, 46, lee, , , 14
What changes do I need to make to my code to get this CSV output?
# Answer 1
**Score**: 0
I am assuming that the JSON records are stored in separate files.
**Implementation:**
```python3
import glob
import json

import pandas as pd
from cherrypicker import CherryPicker

rows = []
for file_name in glob.glob("json/*.json"):
    with open(file_name, encoding="utf-8") as file:
        data = json.load(file)
    # Flatten nested keys into single-level names such as pl_0_age
    flat = CherryPicker(data).flatten().get()
    rows.append(flat)

# DataFrame.append was removed in pandas 2.0, so build the frame from the list
df = pd.DataFrame(rows)
df.to_csv("output.csv", index=False)
```
Output:
_id,pl_0_age,pl_0_n,pl_1_age,pl_1_n,pl_2_age,pl_2_n,ld
a1,45,ar,52,Pi,18.0,al,13
a2,85,ta,46,lee,,,14
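The question's input appears to be a single JSON Lines file (it is read with `lines=True`), whereas the code above assumes one JSON document per file. As a minimal sketch for the single-file case, the following uses only the standard library rather than pandas or cherrypicker. `flatten_record`, `jsonl_to_wide_csv`, and `SAMPLE` are hypothetical names introduced here for illustration, and the `pl.age1`-style column names follow the layout requested in the question.

```python
import csv
import io
import json

# Sample input copied from the question: one JSON object per line.
SAMPLE = (
    '{"_id": "a1", "pl": [{"age": 45, "n": "ar"}, {"age": 52, "n": "Pi"}, '
    '{"age": 18, "n": "al"}], "ld": 13}\n'
    '{"_id": "a2", "pl": [{"age": 85, "n": "ta"}, {"age": 46, "n": "lee"}], "ld": 14}\n'
)


def flatten_record(record, list_key="pl"):
    """Return a flat dict, expanding the list under list_key into numbered columns."""
    flat = {}
    for key, value in record.items():
        if key == list_key:
            # Number the list items from 1: pl.age1, pl.n1, pl.age2, ...
            for i, item in enumerate(value, start=1):
                for name, field in item.items():
                    flat[f"{key}.{name}{i}"] = field
        else:
            flat[key] = value
    return flat


def jsonl_to_wide_csv(lines, out, list_key="pl"):
    """Write one CSV row per JSON line to the file-like object out."""
    rows = [flatten_record(json.loads(line), list_key) for line in lines if line.strip()]
    # Column names in order of first appearance across all rows.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    # restval fills cells for records that have fewer list items.
    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns


buffer = io.StringIO()
jsonl_to_wide_csv(SAMPLE.splitlines(), buffer)
print(buffer.getvalue())
```

To process a real file, pass an open file handle as `lines` and another as `out` (opened with `newline=""`, as the `csv` module recommends). Columns are ordered by first appearance, so if the first record has fewer `pl` entries than a later one, the extra columns end up after `ld`.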