How can I deep copy a pandas DataFrame where some values are lists, make changes in the copied dataframe, and not change the list in the original?
Question
If a pandas DataFrame has a column in which each row holds a list, and that DataFrame is (deep) copied, changes made to that column in the new DataFrame affect the original DataFrame. This happens regardless of how I instantiate the column. How can this be avoided? Is this an inherent flaw of using lists inside a DataFrame?
import pandas as pd
import numpy as np
df = pd.DataFrame({'Person': ['Jack', 'Bob', 'Alice'], 'Age': [20, 30, 40]})
df['Flavors'] = np.empty((len(df), 0)).tolist()
df['Colors'] = [[] for _ in range(len(df))]
df['Friends'] = [[],[],[]]
print(df)
df2 = df.copy(deep=True)
df2.loc[0, 'Flavors'].append('Apple')
df2.loc[0, 'Colors'].append('Red')
df2.loc[0, 'Friends'].append('Rick')
print(df)
print(df2)
Gives output:
   Person  Age  Flavors  Colors  Friends
0    Jack   20       []      []       []
1     Bob   30       []      []       []
2   Alice   40       []      []       []
   Person  Age  Flavors  Colors  Friends
0    Jack   20  [Apple]   [Red]   [Rick]
1     Bob   30       []      []       []
2   Alice   40       []      []       []
   Person  Age  Flavors  Colors  Friends
0    Jack   20  [Apple]   [Red]   [Rick]
1     Bob   30       []      []       []
2   Alice   40       []      []       []
I would expect the output to be this:
   Person  Age  Flavors  Colors  Friends
0    Jack   20       []      []       []
1     Bob   30       []      []       []
2   Alice   40       []      []       []
   Person  Age  Flavors  Colors  Friends
0    Jack   20       []      []       []
1     Bob   30       []      []       []
2   Alice   40       []      []       []
   Person  Age  Flavors  Colors  Friends
0    Jack   20  [Apple]   [Red]   [Rick]
1     Bob   30       []      []       []
2   Alice   40       []      []       []
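As a note on why this happens: the pandas documentation for `DataFrame.copy` states that with `deep=True` the data is copied, but Python objects stored inside it are not copied recursively; only the references are. A minimal check of that, using a cut-down frame of my own rather than the exact one above, might look like this:

```python
import pandas as pd

# Cut-down illustration frame (not the one from the question)
df = pd.DataFrame({'Person': ['Jack', 'Bob'], 'Flavors': [[], []]})
df2 = df.copy(deep=True)

# The two frames are distinct objects...
frames_are_distinct = df2 is not df
# ...but the same cell in each frame refers to the very same list object.
shares_list = df2.loc[0, 'Flavors'] is df.loc[0, 'Flavors']

print(frames_are_distinct, shares_list)
```

If both flags come out true, appending to a list through `df2` is also visible through `df`, which matches the output shown in the question.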
Answer 1
Score: 3
It's not the best solution, but one idea is to serialize your DataFrame to disk (or to an in-memory buffer) and then reload it, which rebuilds every list as a new object:
import io

import numpy as np
import pandas as pd

df = pd.DataFrame({'Person': ['Jack', 'Bob', 'Alice'], 'Age': [20, 30, 40]})
df['Flavors'] = np.empty((len(df), 0)).tolist()
df['Colors'] = [[] for _ in range(len(df))]
df['Friends'] = [[], [], []]
print(df)

# Round-trip through an in-memory pickle so every list is rebuilt as a new object
with io.BytesIO() as buf:
    df.to_pickle(buf)
    buf.seek(0)
    df2 = pd.read_pickle(buf)

# These appends now touch only df2's lists
df2.loc[0, 'Flavors'].append('Apple')
df2.loc[0, 'Colors'].append('Red')
df2.loc[0, 'Friends'].append('Rick')
print(df)
print(df2)
Output:
   Person  Age  Flavors  Colors  Friends
0    Jack   20       []      []       []
1     Bob   30       []      []       []
2   Alice   40       []      []       []
   Person  Age  Flavors  Colors  Friends
0    Jack   20       []      []       []
1     Bob   30       []      []       []
2   Alice   40       []      []       []
   Person  Age  Flavors  Colors  Friends
0    Jack   20  [Apple]   [Red]   [Rick]
1     Bob   30       []      []       []
2   Alice   40       []      []       []
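If the round trip through pickle feels heavy, another option is to deep-copy only the cells that hold lists. The sketch below is an assumption on my part, not part of the answer above: `copy_with_lists` and its `list_columns` parameter are hypothetical names, and it relies only on the standard-library `copy.deepcopy` plus ordinary column assignment.

```python
import copy

import pandas as pd


def copy_with_lists(df, list_columns):
    """Hypothetical helper: copy df, then give the named columns their own lists."""
    out = df.copy(deep=True)
    for col in list_columns:
        # Rebuild the column from independent deep copies of each cell
        out[col] = [copy.deepcopy(value) for value in df[col]]
    return out


df = pd.DataFrame({'Person': ['Jack', 'Bob', 'Alice'], 'Age': [20, 30, 40]})
df['Flavors'] = [[] for _ in range(len(df))]

df2 = copy_with_lists(df, ['Flavors'])
df2.loc[0, 'Flavors'].append('Apple')

print(df.loc[0, 'Flavors'])   # list in the original frame
print(df2.loc[0, 'Flavors'])  # list in the copy
```

This keeps everything in memory and avoids serialization, at the cost of having to name the list-holding columns yourself.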