How to export multiple or all JSON objects to CSV with Python?
Question
I have the following Python script to view JSON data in normalized format for 10 rows:
import pandas as pd
from openpyxl.workbook import Workbook
import csv
from pathlib import Path
from pandas.io.json import json_normalize
import json
from datetime import datetime
from datetime import date
from datetime import timedelta
import psycopg2
from psycopg2 import OperationalError

# Import our files
import pg               # Various functions to interface with the Postgres servers
from db_creds import *  # The DB server and user creds

#try:
# Connect to an existing database
connection = pg.create_connection(sourceDB_setting[3], sourceDB_setting[5], sourceDB_setting[6], sourceDB_setting[1], sourceDB_setting[2])

# Create a cursor to perform database operations
cursor = connection.cursor()
cursor.execute("SELECT application_data, id FROM ssap_applications LIMIT 10;")
results = cursor.fetchall()
for row in results:
    jrec, app_id = row
    # Process each row here
    #print(jrec)
    jrec = json.loads(jrec)
    normal_json = pd.json_normalize(jrec)
    print(normal_json)
    # save to csv
    normal_json.to_csv('App_data2.csv', index=False, encoding='utf-8')
cursor.close()
I want to export those 10 records to a CSV file, but so far I can only export one record with this code: normal_json.to_csv('App_data2.csv', index=False, encoding='utf-8')
How should I fix my script to export 10 records, or all records?
Answer 1
Score: 1
You ARE exporting all 10 records, but you are rewriting the file every time, so each one overwrites the previous one. The to_csv method accepts a mode parameter, just like open, which lets you specify append mode:
normal_json.to_csv('App_data2.csv', index=False, mode='a', encoding='utf-8')
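One caveat worth noting (an addition, not part of the original answer): in append mode, every to_csv call writes a header row by default, so the output ends up with repeated headers. A minimal sketch, using throwaway DataFrames, that writes the header only on the first append:

```python
import os
import pandas as pd

path = 'App_data2.csv'
if os.path.exists(path):
    os.remove(path)  # start fresh for this sketch

for chunk in (pd.DataFrame({'a': [1]}), pd.DataFrame({'a': [2]})):
    # Write the column names only when the file does not exist yet,
    # so repeated appends don't duplicate the header row.
    chunk.to_csv(path, index=False, mode='a', encoding='utf-8',
                 header=not os.path.exists(path))
```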
I can already predict the next question: "But if I run this several times in a row, the file just keeps getting longer and longer." There are two solutions to that. One is to erase the file before you begin:
import os

if os.path.exists('App_data2.csv'):
    os.remove('App_data2.csv')
The other is to open the file yourself, and pass the open file handle to pandas:
csvfile = open('App_data2.csv', 'w', encoding='utf-8', newline='')
...
normal_json.to_csv(csvfile, index=False)
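An alternative to appending (a sketch of my own, not from the answer): collect each normalized row into a list and write the file once with pd.concat. The sample JSON strings below stand in for the application_data rows fetched from Postgres:

```python
import json
import pandas as pd

# Stand-ins for the (application_data, id) rows from ssap_applications
rows = [
    ('{"name": "Alice", "address": {"city": "NYC"}}', 1),
    ('{"name": "Bob", "address": {"city": "LA"}}', 2),
]

frames = []
for jrec, app_id in rows:
    normal = pd.json_normalize(json.loads(jrec))  # flatten nested JSON
    normal['id'] = app_id                         # keep the row's id
    frames.append(normal)

# One concat, one write: a single CSV with a single header row
pd.concat(frames, ignore_index=True).to_csv('App_data2.csv', index=False, encoding='utf-8')
```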
Comments