Python flatten a dictionary column
Question
It should be a simple line of code using the pd.json_normalize function, but it only works on a single element and does not batch process my whole column.
Original dataframe:
df['addresses'][0]
[{'addressLine1': '124 Main Street',
'addressLine2': '',
'addressLine3': '',
'city': 'Portland',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}]
test = pd.json_normalize(result['addresses'][0])
test
Everything up to this point works, but when I apply the function to the whole column, the resulting dataframe does not look like the one above.
test = pd.json_normalize(result['addresses'])
test
Here is some of the column data:
[[{'addressLine1': '124 Main Street',
'addressLine2': '',
'addressLine3': '',
'city': 'Portland',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}],
[{'addressLine1': '1234 Main Street',
'addressLine2': '',
'addressLine3': '',
'city': 'Chattanooga',
'region': 'TN',
'postalCode': '37402',
'country': 'USA'}],
[{'addressLine1': '1684151 Chair Street',
'addressLine2': '',
'addressLine3': '',
'city': 'Notaplace',
'region': 'AL',
'postalCode': '48835',
'country': 'USA'}],
[{'addressLine1': '136 Main Street',
'addressLine2': '',
'addressLine3': '',
'city': 'Portland',
'region': 'ME',
'postalCode': '22118',
'country': 'USA'}],
[{'addressLine1': '123452 HoneyDo LN',
'addressLine2': '',
'addressLine3': '',
'city': 'Portland',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}],
[{'addressLine1': '123 Main Street',
'addressLine2': 'Apt 2B',
'addressLine3': 'Building B',
'city': 'Portland',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}],
[{'addressLine1': '123 Main Street',
'addressLine2': 'Apt 2B',
'addressLine3': 'Building B',
'city': 'New York City',
'region': 'NY',
'postalCode': '10001',
'country': 'USA'}],
[{'addressLine1': '123 Main Street',
'addressLine2': 'Apt 2B',
'addressLine3': 'Building B',
'city': 'Portland',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}],
[{'addressLine1': '4578 Shiver Me Timbers Road',
'addressLine2': '',
'addressLine3': '',
'city': 'Portland',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}],
[{'addressLine1': '124 Main ST',
'addressLine2': '',
'addressLine3': '',
'city': 'PORTLAND',
'region': 'ME',
'postalCode': '04019',
'country': 'USA'}]]
Answer 1
Score: 1
If I understand you correctly, you can expand the dict data in your dataframe df into columns with the following example:
df = pd.concat([df, df.pop('addresses').str[0].apply(pd.Series)], axis=1)
print(df)
Prints:
                  addressLine1  addressLine2  addressLine3           city  region  postalCode  country
0              124 Main Street                                   Portland      ME       04019      USA
1             1234 Main Street                                Chattanooga      TN       37402      USA
2         1684151 Chair Street                                  Notaplace      AL       48835      USA
3              136 Main Street                                   Portland      ME       22118      USA
4            123452 HoneyDo LN                                   Portland      ME       04019      USA
5              123 Main Street        Apt 2B    Building B       Portland      ME       04019      USA
6              123 Main Street        Apt 2B    Building B  New York City      NY       10001      USA
7              123 Main Street        Apt 2B    Building B       Portland      ME       04019      USA
8  4578 Shiver Me Timbers Road                                   Portland      ME       04019      USA
9                  124 Main ST                                   PORTLAND      ME       04019      USA
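To make the one-liner easier to follow, here is a minimal self-contained sketch that splits it into steps. The two-row frame and the `id` column are made up for illustration and are not from the question; it assumes every cell in `addresses` is a list holding exactly one dict.

```python
import pandas as pd

# Hypothetical sample frame; only the shape of "addresses" mirrors the question.
df = pd.DataFrame({
    "id": [1, 2],  # illustrative extra column, not in the original data
    "addresses": [
        [{"addressLine1": "124 Main Street", "city": "Portland", "region": "ME"}],
        [{"addressLine1": "1234 Main Street", "city": "Chattanooga", "region": "TN"}],
    ],
})

# .str[0] picks the first item of each one-element list, giving a Series of dicts.
first = df.pop("addresses").str[0]

# apply(pd.Series) turns each dict into a row, so every key becomes a column.
expanded = first.apply(pd.Series)

# Both pieces still share the original index, so concat lines the rows up.
out = pd.concat([df, expanded], axis=1)
print(out)
```

Because `pop` removes the column from `df` in place, the original list column does not appear in the result.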
Answer 2
Score: 1
It seems your list has one-element lists as elements.
Let's say your list is address_list. Then take the first element of each inner list and pass the result to json_normalize:
pd.json_normalize([e[0] for e in address_list])
If the test data that you posted is actually a column then just use:
pd.json_normalize(result["addresses"].str[0])
Or, if your result dataframe has other columns in addition to addresses:
pd.concat(
    [result.drop(columns="addresses"), pd.json_normalize(result["addresses"].str[0])],
    axis=1
)
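One thing to watch with the concat variant: given a plain list of dicts, json_normalize returns a frame with a fresh 0..n-1 index, so the rows only line up if result also has the default index. Below is a minimal sketch of one way to guard against that. The sample frame, the `name` column, and the index values 10 and 20 are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical frame with a deliberately non-default index.
result = pd.DataFrame(
    {
        "name": ["a", "b"],  # illustrative extra column, not from the question
        "addresses": [
            [{"addressLine1": "124 Main Street", "city": "Portland"}],
            [{"addressLine1": "1234 Main Street", "city": "Chattanooga"}],
        ],
    },
    index=[10, 20],
)

# Flatten the first (and only) dict of each row; tolist() hands json_normalize a plain list.
flat = pd.json_normalize(result["addresses"].str[0].tolist())

# Reuse the original index so concat aligns row by row instead of by position 0..n-1.
flat.index = result.index

combined = pd.concat([result.drop(columns="addresses"), flat], axis=1)
print(combined)
```

If result already has the default RangeIndex, the index reassignment changes nothing and can be dropped.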