Code to format JSON data and append hardcoded data to create a flat .txt file

Question

Source data:

json_data = [{"studentid": 1, "name": "ABC", "subjects": ["Python", "Data Structures"]},
             {"studentid": 2, "name": "PQR", "subjects": ["Java", "Operating System"]}]

Hardcoded_Val1 = 10
Hardcoded_Val2 = 20
Hardcoded_Val3 = str(datetime.datetime.now())

I need to create a flat .txt file with the following contents:

ID,DEPT,"studentid|name|subjects",execution_dt
10,20,"1|ABC|Python,Data Structures",2023-06-01
10,20,"2|PQR|Java,Operating System",2023-06-01

I am very new to Python. I have tried to work this out on my own but could not get it to work; my attempt is below. Any help would be much appreciated.

import datetime
import pandas as pd
import json


json_data = [{"studentid": 1, "name": "ABC", "subjects": ["Python", "Data Structures"]},
             {"studentid": 2, "name": "PQR", "subjects": ["Java", "Operating System"]}]

Hardcoded_Val1 = 10
Hardcoded_Val2 = 20
Hardcoded_Val3 = str(datetime.datetime.now())

profile = str(Hardcoded_Val1) + ',' + str(Hardcoded_Val2) + ',"' + str(json_data) + '",' + Hardcoded_Val3
        
print(profile)
#data = json.dumps(profile, indent=True)
#print(data)
data_list = []
for data_info in profile:
   data_list.append(data_info.replace(", '", '|'))
data_df = pd.DataFrame(data=data_list)
data_df.to_csv(r'E:\DataLake\api_fetched_sample_output.txt', sep='|', index=False, encoding='utf-8')

Answer 1

Score: 1

I would skip pandas for this and build each row's combined string by hand, mainly with a list comprehension and join(), then let the csv module write the file.

import datetime
import csv

Hardcoded_Val1 = 10
Hardcoded_Val2 = 20
Hardcoded_Val3 = str(datetime.date.today())
json_data = [
    {"studentid": 1, "name": "ABC", "subjects": ["Python", "Data Structures"]},
    {"studentid": 2, "name": "PQR", "subjects": ["Java", "Operating System"]}
]

csv_data = []
for row in json_data:
    # Combined column name, e.g. "studentid|name|subjects"
    keys = "|".join(row.keys())
    # Combined value: lists are comma-joined, everything else goes through str()
    values = "|".join([
        ",".join(value) if isinstance(value, list) else str(value)
        for value in row.values()
    ])
    csv_data.append({
        "ID": Hardcoded_Val1,
        "DEPT": Hardcoded_Val2,
        keys: values,
        "execution_dt": Hardcoded_Val3,
    })

# newline="" lets the csv module control the line endings itself
with open("out.csv", "w", encoding="utf-8", newline="") as file_out:
    writer = csv.DictWriter(file_out, fieldnames=list(csv_data[0].keys()))
    writer.writeheader()
    writer.writerows(csv_data)

This will produce a file with the following contents:

ID,DEPT,studentid|name|subjects,execution_dt
10,20,"1|ABC|Python,Data Structures",2023-06-02
10,20,"2|PQR|Java,Operating System",2023-06-02
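
If the file has to match the layout in the question exactly, with the combined field wrapped in double quotes in the header as well, the lines can also be assembled with plain string formatting. The following is only a minimal sketch of that idea: the build_lines helper and its parameter names are hypothetical, the date is fixed so the output is reproducible, and it assumes the values never contain double quotes, because it does no CSV escaping.

```python
import datetime


def build_lines(records, id_value, dept_value, execution_dt):
    """Return the header line plus one line per record (hypothetical helper)."""
    # Header: join the keys of the first record with "|" and quote the result.
    header_keys = "|".join(records[0].keys())
    lines = [f'ID,DEPT,"{header_keys}",execution_dt']
    for record in records:
        # Lists become comma-separated text; everything else goes through str().
        values = "|".join(
            ",".join(map(str, value)) if isinstance(value, list) else str(value)
            for value in record.values()
        )
        # No escaping is done here: values are assumed to contain no double quotes.
        lines.append(f'{id_value},{dept_value},"{values}",{execution_dt}')
    return lines


json_data = [
    {"studentid": 1, "name": "ABC", "subjects": ["Python", "Data Structures"]},
    {"studentid": 2, "name": "PQR", "subjects": ["Java", "Operating System"]},
]

# A fixed date is used for illustration; datetime.date.today() would be the live value.
lines = build_lines(json_data, 10, 20, datetime.date(2023, 6, 1))
print("\n".join(lines))
```

Writing the result to a .txt file is then a matter of opening the file in text mode and writing "\n".join(lines) followed by a final newline. The csv module remains the safer choice whenever the values might contain quotes or line breaks.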

Answer 2

Score: 0

I think this will help.

import pandas as pd

json_data = [
    {"studentid": 1, "name": "ABC", "subjects": ["Python", "Data Structures"]},
    {"studentid": 2, "name": "PQR", "subjects": ["Java", "Operating System"]}
]

# Convert JSON to DataFrame
df = pd.json_normalize(json_data, "subjects", ["studentid", "name"])

# Rename columns
df.columns = ["subject", "studentid", "name"]

# Save DataFrame to CSV
df.to_csv("output.csv", index=False)
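
Note that this approach yields one row per subject rather than one row per student, so the result differs from the layout requested in the question. To illustrate that shape without pandas, here is a rough stdlib-only sketch. It is an assumed equivalent of the resulting table, not a description of how pandas works internally, and it writes to an in-memory buffer instead of a real file.

```python
import csv
import io

json_data = [
    {"studentid": 1, "name": "ABC", "subjects": ["Python", "Data Structures"]},
    {"studentid": 2, "name": "PQR", "subjects": ["Java", "Operating System"]},
]

# One output row per (student, subject) pair.
rows = [
    (subject, record["studentid"], record["name"])
    for record in json_data
    for subject in record["subjects"]
]

buffer = io.StringIO()  # stands in for a real output file
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["subject", "studentid", "name"])
writer.writerows(rows)
print(buffer.getvalue())
```

Each student therefore appears on as many lines as they have subjects, which is useful for analysis but is not the single combined field the question asks for.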

Posted by huangapple on 2023-06-02 01:10:28
When reposting, please keep the link to this article: https://go.coder-hub.com/76384220.html