June 22, 2023 00:26:12

Split a Pandas column of lists

Question

I have a Pandas DataFrame with this column. It is an extract from a Mongo database, but I don't know how to handle a column that contains both [] and {}:

[image: the DataFrame column]

How can I split this column into two columns?

Desired result:

[image: desired result]

Thanks for your help!

Answer 1

Score: 1

You can build a list of dictionaries (instead of a list of lists of dictionaries), turn it into a DataFrame, and join that to the original df.

import pandas as pd

data = {"coeff":[[{"value": 0.0641, "year":2000}],
                 [{"value": 0.0641, "year":2000}],
                 [{"value": 0.0641, "year":2000}],
                 [{"value": 0.0652, "year":2005}]]}

df = pd.DataFrame(data)
#                                coeff
# 0  [{'value': 0.0641, 'year': 2000}]
# 1  [{'value': 0.0641, 'year': 2000}]
# 2  [{'value': 0.0641, 'year': 2000}]
# 3  [{'value': 0.0652, 'year': 2005}]

# Take the first (only) dictionary from each list and join the result by index
df = df.join(pd.DataFrame([x[0] for x in df.coeff]))
#                                coeff   value  year
# 0  [{'value': 0.0641, 'year': 2000}]  0.0641  2000
# 1  [{'value': 0.0641, 'year': 2000}]  0.0641  2000
# 2  [{'value': 0.0641, 'year': 2000}]  0.0641  2000
# 3  [{'value': 0.0652, 'year': 2005}]  0.0652  2005
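
If the original coeff column is no longer needed, or some rows might hold an empty list, the join can be adapted. The following is only a sketch: the empty list in the last row and the names first, expanded, and result are assumptions for illustration, not part of the question's data.

```python
import pandas as pd

# Sample data shaped like the question's column; the empty list in the
# last row is an assumption added to show the edge case.
df = pd.DataFrame({"coeff": [[{"value": 0.0641, "year": 2000}],
                             [{"value": 0.0652, "year": 2005}],
                             []]})

# Take the first dict of each list, or an empty dict when the list is empty.
first = [x[0] if x else {} for x in df["coeff"]]

# Build the new columns on the same index, then swap them in for `coeff`.
expanded = pd.DataFrame(first, index=df.index)
result = df.drop(columns="coeff").join(expanded)
print(result)
```

Rows that had an empty list end up with missing values in both new columns, which also turns the year column into floats.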

Answer 2

Score: 1

pandas has a function for constructing a DataFrame from records such as dictionaries:

import pandas as pd

my_data = {"coeff":[[{"value": 0.0641, "year":2000}],
                 [{"value": 0.0641, "year":2000}],
                 [{"value": 0.0641, "year":2000}],
                 [{"value": 0.0652, "year":2005}]]
           }

df = pd.DataFrame(my_data)

# Each record is the first (only) dictionary in the row's list
df2 = pd.DataFrame.from_records(d[0] for d in df['coeff'])

print(df2)

gives:

    value  year
0  0.0641  2000
1  0.0641  2000
2  0.0641  2000
3  0.0652  2005
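
Note that df2 holds only the new columns. To put them next to the original column, one option is pd.concat along the columns. This is a sketch under an assumption: the string index ["a", "b"] is made up here to show why the index of df2 should match that of df before concatenating.

```python
import pandas as pd

# Data shaped like the question's column, with a non-default index
# (an assumption, to show the alignment step).
df = pd.DataFrame(
    {"coeff": [[{"value": 0.0641, "year": 2000}],
               [{"value": 0.0652, "year": 2005}]]},
    index=["a", "b"],
)

# from_records produces a fresh 0..n-1 index, so copy the original labels over.
df2 = pd.DataFrame.from_records([d[0] for d in df["coeff"]])
df2.index = df.index

# Concatenate column-wise; rows are matched by index label.
combined = pd.concat([df, df2], axis=1)
print(combined)
```

Without the index assignment, concat would align "a"/"b" against 0/1 and produce four rows padded with missing values.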

Answer 3

Score: 0

Combine explode and json_normalize:

out = pd.json_normalize(df['coeff'].explode())

Or, if you have only one dictionary per list:

out = pd.json_normalize(df['coeff'].str[0])

Or using from_records:

out = pd.DataFrame.from_records(df['coeff'].str[0])

Output:

    value  year
0  0.0641  2000
1  0.0641  2000
2  0.0641  2000
3  0.0652  2005
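
The explode variant matters when a list holds more than one dictionary. The sketch below uses made-up data with two dictionaries in the first row (an assumption; the question's lists each hold one) and restores the row labels explicitly, so the result does not depend on how json_normalize treats the index of its input.

```python
import pandas as pd

# Hypothetical data: the first row holds two dictionaries (an assumption;
# the question's lists each hold one).
df = pd.DataFrame({"coeff": [[{"value": 0.0641, "year": 2000},
                              {"value": 0.0652, "year": 2005}],
                             [{"value": 0.0660, "year": 2010}]]})

# explode() emits one row per dictionary and repeats the source row label.
exploded = df["coeff"].explode()

# Normalize a plain list of dicts, then restore the repeated labels so each
# output row can be traced back to its source row.
out = pd.json_normalize(exploded.tolist())
out.index = exploded.index
print(out)
```

With the labels restored, out can be joined back to other columns of df by index.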

Answer 4

Score: 0

Create a df out of your base data:

import pandas as pd

data = {"coeff":[[{"value": 0.0641, "year":2000}],
             [{"value": 0.0641, "year":2000}],
             [{"value": 0.0641, "year":2000}],
             [{"value": 0.0652, "year":2005}]]}

df = pd.DataFrame(data)

Each element of the column is a dictionary inside a list. Use the apply method with a lambda function to access the first element of each list (the dictionary).

Use .values() to retrieve the dictionary values (value and year); the result is an object of type dict_values.

A dict_values object is fairly limited, so wrap it in list() to convert it to a list that supports slicing and indexing:

df2 = df.coeff.apply(lambda x: list(x[0].values()))

Use apply with a lambda function and index positions to retrieve the years and values, assign them to their column names in a dictionary, and pass that dictionary to pd.DataFrame to create a new DataFrame:

pd.DataFrame(data = {'year': df2.apply(lambda y: y[1]),
                 'value':df2.apply(lambda y: y[0])})
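
The positional indexing above relies on every dictionary listing value before year. As a hedged alternative, the fields can be looked up by key instead. In this sketch the second dictionary has its keys in the opposite order, which is an assumption made only to show the difference; the names first and result are illustrative.

```python
import pandas as pd

data = {"coeff": [[{"value": 0.0641, "year": 2000}],
                  # Key order flipped on purpose (an assumption) to show the pitfall.
                  [{"year": 2005, "value": 0.0652}]]}
df = pd.DataFrame(data)

# Pull out the single dictionary from each list.
first = df["coeff"].apply(lambda x: x[0])

# Look up each field by key, so key order inside the dicts does not matter.
result = pd.DataFrame({"year": first.apply(lambda d: d["year"]),
                       "value": first.apply(lambda d: d["value"])})
print(result)
```

With positional indexing, the second row would have its year and value swapped.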
