2023年2月16日 04:28:38go评论62阅读模式

英文:

How can I deep copy a pandas DataFrame where some values are lists, make changes in the copied dataframe, and not change the list in the original?

问题

如果一个Pandas DataFrame具有包含列表的列，并且该DataFrame被（深度）复制，新DataFrame上该列的更改会影响原始DataFrame。这不受如何初始化该列的方式的影响。如何避免这种情况？这是否是在DataFrame中使用列表的固有缺陷？

import pandas as pd
import numpy as np

df = pd.DataFrame({'Person': ['Jack', 'Bob', 'Alice'], 'Age': [20, 30, 40]})
df['Flavors'] = np.empty((len(df), 0)).tolist()
df['Colors'] = [[] for _ in range(len(df))]
df['Friends'] = [[],[],[]]
print(df)
df2 = df.copy(deep=True)
df2.loc[0, 'Flavors'].append('Apple')
df2.loc[0, 'Colors'].append('Red')
df2.loc[0, 'Friends'].append('Rick')
print(df)
print(df2)

给出的输出是：

  Person  Age Flavors Colors Friends
0   Jack   20      []     []      []
1    Bob   30      []     []      []
2  Alice   40      []     []      []
  Person  Age  Flavors Colors Friends
0   Jack   20  [Apple]  [Red]  [Rick]
1    Bob   30       []     []      []
2  Alice   40       []     []      []
  Person  Age  Flavors Colors Friends
0   Jack   20  [Apple]  [Red]  [Rick]
1    Bob   30       []     []      []
2  Alice   40       []     []      []

我期望的输出是这样的：

  Person  Age Flavors Colors Friends
0   Jack   20      []     []      []
1    Bob   30      []     []      []
2  Alice   40      []     []      []
  Person  Age  Flavors Colors Friends
0   Jack   20  [Apple]     []      []
1    Bob   30       []     []      []
2  Alice   40       []     []      []
  Person  Age  Flavors Colors Friends
0   Jack   20  [Apple]  [Red]  [Rick]
1    Bob   30       []     []      []
2  Alice   40       []     []      []

英文:

If a Pandas DataFrame has a column where each row contains a list, and that DataFrame is (deep) copied, the changes in the new DataFrame on that column effect the original DataFrame. This happens regardless of how I instantiate the column. How can this be avoided? Is this an inherent flaw to using lists inside a DataFrame?

import pandas as pd
import numpy as np

df = pd.DataFrame({&#39;Person&#39;: [&#39;Jack&#39;, &#39;Bob&#39;, &#39;Alice&#39;], &#39;Age&#39;: [20, 30, 40]})
df[&#39;Flavors&#39;] = np.empty((len(df), 0)).tolist()
df[&#39;Colors&#39;] = [[] for _ in range(len(df))]
df[&#39;Friends&#39;] = [[],[],[]]
print(df)
df2 = df.copy(deep=True)
df2.loc[0, &#39;Flavors&#39;].append(&#39;Apple&#39;)
df2.loc[0, &#39;Colors&#39;].append(&#39;Red&#39;)
df2.loc[0, &#39;Friends&#39;].append(&#39;Rick&#39;)
print(df)
print(df2)

Gives output:

  Person  Age Flavors Colors Friends
0   Jack   20      []     []      []
1    Bob   30      []     []      []
2  Alice   40      []     []      []
  Person  Age  Flavors Colors Friends
0   Jack   20  [Apple]  [Red]  [Rick]
1    Bob   30       []     []      []
2  Alice   40       []     []      []
  Person  Age  Flavors Colors Friends
0   Jack   20  [Apple]  [Red]  [Rick]
1    Bob   30       []     []      []
2  Alice   40       []     []      []

I would expect the output to be this:

  Person  Age Flavors Colors Friends
0   Jack   20      []     []      []
1    Bob   30      []     []      []
2  Alice   40      []     []      []
  Person  Age  Flavors Colors Friends
0   Jack   20  [Apple]     []      []
1    Bob   30       []     []      []
2  Alice   40       []     []      []
  Person  Age  Flavors Colors Friends
0   Jack   20  [Apple]  [Red]  [Rick]
1    Bob   30       []     []      []
2  Alice   40       []     []      []

答案1

得分: 3

以下是翻译好的内容：

df = pd.DataFrame({'Person': ['Jack', 'Bob', 'Alice'], 'Age': [20, 30, 40]})
df['Flavors'] = np.empty((len(df), 0)).tolist()
df['Colors'] = [[] for _ in range(len df)) 
df['Friends'] = [[], [], []]
print(df)

with io.BytesIO() as buf:
    df.to_pickle(buf)
    buf.seek(0)
    df2 = pd.read_pickle(buf)

df2.loc[0, 'Flavors'].append('Apple')
df2.loc[0, 'Colors'].append('Red')
df2.loc[0, 'Friends'].append('Rick')
print(df)
print(df2)

输出：

  Person  Age Flavors Colors Friends
0   Jack   20      []     []      []
1    Bob   30      []     []      []
2  Alice   40      []     []      []
  Person  Age Flavors Colors Friends
0   Jack   20      []     []      []
1    Bob   30      []     []      []
2  Alice   40      []     []      []
  Person  Age  Flavors Colors Friends
0   Jack   20  [Apple]  [Red]  [Rick]
1    Bob   30       []     []      []
2  Alice   40       []     []      []

英文:

It's not the best solution but an idea is to store your dataframe on disk (or memory) then reload it from buffer:

df = pd.DataFrame({&#39;Person&#39;: [&#39;Jack&#39;, &#39;Bob&#39;, &#39;Alice&#39;], &#39;Age&#39;: [20, 30, 40]})
df[&#39;Flavors&#39;] = np.empty((len(df), 0)).tolist()
df[&#39;Colors&#39;] = [[] for _ in range(len(df))]
df[&#39;Friends&#39;] = [[],[],[]]
print(df)

with io.BytesIO() as buf:
    df.to_pickle(buf)
    buf.seek(0)
    df2 = pd.read_pickle(buf)

df2.loc[0, &#39;Flavors&#39;].append(&#39;Apple&#39;)
df2.loc[0, &#39;Colors&#39;].append(&#39;Red&#39;)
df2.loc[0, &#39;Friends&#39;].append(&#39;Rick&#39;)
print(df)
print(df2)

Output:

  Person  Age Flavors Colors Friends
0   Jack   20      []     []      []
1    Bob   30      []     []      []
2  Alice   40      []     []      []
  Person  Age Flavors Colors Friends
0   Jack   20      []     []      []
1    Bob   30      []     []      []
2  Alice   40      []     []      []
  Person  Age  Flavors Colors Friends
0   Jack   20  [Apple]  [Red]  [Rick]
1    Bob   30       []     []      []
2  Alice   40       []     []      []

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

How can I deep copy a pandas DataFrame where some values are lists, make changes in the copied dataframe, and not change the list in the original?

问题

答案1

使用C语言在Python代码中进行更快的傅立叶变换？

创建一个字典，其中值是表达式，而不进行评估或将其视为字符串。

如何在同一个项目中组织Python模块时进行分开打包？

如何在静态类型语言中复制？

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论