May 17, 2023, 12:51:35

How to extract dict values of a pandas DataFrame into new columns?

Question

I would like to extract the values of the dictionaries stored in a column of a pandas DataFrame df into new columns of that DataFrame. The dictionaries have the same keys in every row.

import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': [{'x': [101], 'y': [102], 'z': [103]}, {'x': [201], 'y': [202], 'z': [203]}, {'x': [301], 'y': [302], 'z': [303]}]})

# Desired result: one new column per dictionary key
dfResult = pd.DataFrame({'a': [1, 2, 3], 'x': [101, 201, 301], 'y': [102, 202, 302], 'z': [103, 203, 303]})

I have gotten as far as extracting the keys and values from each dict, but I do not know how to turn them into new columns:

df.b.apply(lambda x: [x[y] for y in x.keys()])
0    [[101], [102], [103]]
1    [[201], [202], [203]]
2    [[301], [302], [303]]
df.b.apply(lambda x: [y for y in x.keys()])
0    [x, y, z]
1    [x, y, z]
2    [x, y, z]

Answer 1

Score: 3

If every list always contains exactly one element, you can build one flat dictionary per row with a dictionary comprehension nested inside a list comprehension, and pass the result to the DataFrame constructor:

# df.pop('b') removes column 'b' from df and returns it; v[0] unwraps each one-element list
df = df.join(pd.DataFrame([{k: v[0] for k, v in x.items()} for x in df.pop('b')],
                           index=df.index))
print(df)
   a    x    y    z
0  1  101  102  103
1  2  201  202  203
2  3  301  302  303
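
For reference, the same idea can be wrapped in a small helper that leaves the original DataFrame untouched. This is only a sketch: the function name expand_dict_column is made up for illustration, and it assumes every dict value is a list with exactly one element, as in the question.

```python
import pandas as pd


def expand_dict_column(frame: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return a new DataFrame with the dicts in `column` spread into columns.

    Hypothetical helper; assumes each dict value is a one-element list.
    """
    # Keep every column except the one holding the dicts.
    rest = frame.drop(columns=column)
    # Build one flat dict per row, unwrapping each single-element list.
    flat_rows = [
        {key: values[0] for key, values in row.items()}
        for row in frame[column]
    ]
    # Reuse the original index so the join lines rows up correctly.
    expanded = pd.DataFrame(flat_rows, index=frame.index)
    return rest.join(expanded)


df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [{'x': [101], 'y': [102], 'z': [103]},
          {'x': [201], 'y': [202], 'z': [203]},
          {'x': [301], 'y': [302], 'z': [303]}],
})
result = expand_dict_column(df, 'b')
print(result)
```

Unlike the pop-based one-liner, this version does not modify df in place, which can be convenient if the dict column is still needed afterwards.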

Another option is to create a small DataFrame for each row inside a dictionary comprehension and combine them with concat:

# concat keys the pieces by row label; droplevel(1) drops the inner level so the index matches df
df = df.join(pd.concat({k: pd.DataFrame(v) for k, v in df.pop('b').items()}).droplevel(1))
print(df)
   a    x    y    z
0  1  101  102  103
1  2  201  202  203
2  3  301  302  303
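
As far as I can tell, the concat variant should also cope with lists that hold more than one element, as long as all lists within a single row have the same length. The sketch below uses made-up data (df2 is not from the question) and splits the one-liner into steps to show what each part does.

```python
import pandas as pd

# Hypothetical data: row 0 holds two-element lists, row 1 one-element lists.
df2 = pd.DataFrame({
    'a': [1, 2],
    'b': [{'x': [1, 2], 'y': [3, 4]},
          {'x': [5], 'y': [6]}],
})

# One small DataFrame per row, keyed by the original row label.
per_row = {label: pd.DataFrame(d) for label, d in df2['b'].items()}
# Stack the pieces, then drop the inner positional level so that only
# the original row label remains in the index.
expanded = pd.concat(per_row).droplevel(1)
# Joining on the index repeats 'a' once per list element of that row.
out = df2.drop(columns='b').join(expanded)
print(out)
```

Here the row with a == 1 appears twice in the result, once per list element, while the row with a == 2 appears once.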
