How to extract dict values of a pandas DataFrame into new columns?

Question

I would like to extract the values of a dictionary stored in a column of a pandas DataFrame df into new columns of that DataFrame. The dicts have the same keys in every row.

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [{'x':[101], 'y': [102], 'z': [103]}, {'x':[201], 'y': [202], 'z': [203]}, {'x':[301], 'y': [302], 'z': [303]}]})

Desired result:

dfResult = pd.DataFrame({'a': [1, 2, 3],  'x':[101, 201, 301], 'y': [102, 202, 302], 'z': [103, 203, 303]})


I have gotten as far as extracting the keys and values from the dict, but I do not know how to turn them into new columns:

df.b.apply(lambda x: [x[y] for y in x.keys()])

0    [[101], [102], [103]]
1    [[201], [202], [203]]
2    [[301], [302], [303]]


df.b.apply(lambda x: [y for y in x.keys()])

0    [x, y, z]
1    [x, y, z]
2    [x, y, z]

Answer 1

Score: 3

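If every list always contains exactly one element, you can use a list comprehension with a nested dict comprehension and pass the result to the DataFrame constructor: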

df = df.join(pd.DataFrame([{k: v[0] for k, v in x.items()} for x in df.pop('b')],
                           index=df.index))
print(df)
   a    x    y    z
0  1  101  102  103
1  2  201  202  203
2  3  301  302  303
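
Note that df.pop('b') removes column b from df in place. If you would rather keep the original DataFrame untouched, the same idea can be written with drop, which returns a copy. This is only a sketch under the same one-element-list assumption; the names flat and result are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [{'x': [101], 'y': [102], 'z': [103]},
          {'x': [201], 'y': [202], 'z': [203]},
          {'x': [301], 'y': [302], 'z': [303]}],
})

# Unwrap each one-element list, giving one flat dict per row.
# Assumption: every list has exactly one element, so v[0] is the value.
flat = pd.DataFrame(
    [{k: v[0] for k, v in d.items()} for d in df['b']],
    index=df.index,  # keep the original labels so join aligns rows correctly
)

# drop() returns a new DataFrame, so df itself still has column 'b'.
result = df.drop(columns='b').join(flat)
print(result)
```

Passing index=df.index matters when df does not have the default 0..n-1 index; without it the join would align on the wrong labels.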

Another idea is to create a DataFrame for each row in a dict comprehension and combine them with concat:

df = df.join(pd.concat({k: pd.DataFrame(v) for k, v in df.pop('b').items()}).droplevel(1))
print(df)
   a    x    y    z
0  1  101  102  103
1  2  201  202  203
2  3  301  302  303
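
The concat one-liner is dense, so here it is split into steps on the question's data. This is a sketch for illustration; pieces, stacked, expanded and result are hypothetical names, and drop is used instead of pop so df is left unchanged:

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [{'x': [101], 'y': [102], 'z': [103]},
          {'x': [201], 'y': [202], 'z': [203]},
          {'x': [301], 'y': [302], 'z': [303]}],
})

# One small DataFrame per original row, keyed by that row's index label.
# Each dict maps a column name to a list, which is what DataFrame() expects.
pieces = {idx: pd.DataFrame(d) for idx, d in df['b'].items()}

# concat on a dict uses the keys as an outer index level, so the result
# has a two-level index: (original label, position inside the list).
stacked = pd.concat(pieces)

# Dropping the inner level leaves only the original labels for join to use.
expanded = stacked.droplevel(1)

result = df.drop(columns='b').join(expanded)
print(result)
```

Unlike the v[0] version, this one does not rely on indexing the first element. If a list held more than one element, each piece would have several rows and the join should repeat the matching row of df, but that case is outside what the question asks.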

huangapple
  • Published on May 17, 2023 at 12:51:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76268656.html