Pandas MultiIndex from a list of nested records
Question
I have some data in a structure like this:
data = [
    {"name": "Jack", "last_name": "Black", "sizes": {"shoes": 43, "waist": 48, "chest": 52}},
    {"name": "Mario", "last_name": "Green", "sizes": {"shoes": 42, "waist": 53, "chest": 63}},
]
How can I easily get a DataFrame that looks like this?
name   last_name  sizes
name   last_name  shoes  waist  chest
Jack   Black      43     48     52
Mario  Green      42     53     63
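I know that I can use
pd.json_normalize(data)
but the result is not exactly the same: it returns flat, dot-separated column names such as sizes.shoes rather than a two-level header.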
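To make the difference concrete, here is a minimal sketch of what that call produces for the list above. The variable name flat is only illustrative.

```python
import pandas as pd

data = [
    {"name": "Jack", "last_name": "Black", "sizes": {"shoes": 43, "waist": 48, "chest": 52}},
    {"name": "Mario", "last_name": "Green", "sizes": {"shoes": 42, "waist": 53, "chest": 63}},
]

# json_normalize flattens the nested "sizes" dict into dotted column names
flat = pd.json_normalize(data)
print(list(flat.columns))
# ['name', 'last_name', 'sizes.shoes', 'sizes.waist', 'sizes.chest']
```

The header has a single level, so there is no "sizes" group sitting above shoes, waist and chest.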
And how could I do it if the data were a dict of records like this:
data = {
    12345: {"name": "Jack", "last_name": "Black", "sizes": {"shoes": 43, "waist": 48, "chest": 52}},
    78910: {"name": "Mario", "last_name": "Green", "sizes": {"shoes": 42, "waist": 53, "chest": 63}},
}
and I wanted to get:
       name   last_name  sizes
       name   last_name  shoes  waist  chest
12345  Jack   Black      43     48     52
78910  Mario  Green      42     53     63
Many thanks
Answer 1
Score: 1
Create a DataFrame with pd.json_normalize, set the index to name and last_name, then split the remaining column names on '.' to turn them into a MultiIndex:
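import pandas as pd
# Flatten the nested records; nested keys become dotted names such as "sizes.shoes"
df = pd.json_normalize(data)
df = df.set_index(['name', 'last_name'])
# Split the dotted names into a two-level column MultiIndex
df.columns = df.columns.str.split('.', expand=True)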
                sizes
                shoes waist chest
name  last_name
Jack  Black        43    48    52
Mario Green        42    53    63
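Note that the result above moves name and last_name into the row index, and the dict-of-records case from the question is not covered. The following is a minimal sketch of one possible way to handle both, not the answerer's method. The helper name to_two_level_frame and its index parameter are hypothetical, not part of pandas. It assumes only one level of nesting and no '.' inside the original keys.

```python
import pandas as pd


def to_two_level_frame(records, index=None):
    """Flatten nested records and give every column a two-level header.

    Hypothetical helper: top-level keys such as "name" are repeated on both
    levels, while "sizes.shoes" becomes ("sizes", "shoes").
    """
    df = pd.json_normalize(records)
    # Split each dotted name once; un-nested names are duplicated on both levels
    pairs = [tuple(col.split(".", 1)) if "." in col else (col, col)
             for col in df.columns]
    df.columns = pd.MultiIndex.from_tuples(pairs)
    if index is not None:
        # Use the caller-supplied labels (for example the dict keys) as row index
        df.index = list(index)
    return df


data_list = [
    {"name": "Jack", "last_name": "Black", "sizes": {"shoes": 43, "waist": 48, "chest": 52}},
    {"name": "Mario", "last_name": "Green", "sizes": {"shoes": 42, "waist": 53, "chest": 63}},
]
data_dict = {
    12345: data_list[0],
    78910: data_list[1],
}

# List of records: default integer row index
df_list = to_two_level_frame(data_list)

# Dict of records: normalize the values, then reuse the keys as the row index
df_dict = to_two_level_frame(list(data_dict.values()), index=data_dict.keys())

print(df_list)
print(df_dict)
```

Passing the dict's values rather than the dict itself matters here, because pd.json_normalize would otherwise treat the whole dict as a single record. A cell can then be looked up with a column tuple, for example df_dict.loc[12345, ("sizes", "waist")].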