The copy() method in Python does not work properly

Question

I have a pandas dataframe that I would like to duplicate so that I can operate on the copy without affecting the original. I use the ".copy()" method, but for some reason it doesn't work! Here is my code:

import pandas as pd
import numpy as np

x = np.array([1,2])
df = pd.DataFrame({'A': [x, x, x], 'B': [4, 5, 6]})

duplicate = df.copy()
duplicate['A'].values[0][[0,1]] = 0

print(duplicate)
print(df)

        A  B
0  [0, 0]  4
1  [0, 0]  5
2  [0, 0]  6
        A  B
0  [0, 0]  4
1  [0, 0]  5
2  [0, 0]  6

As you can see, "df" (the original dataset) is affected as well. Does anyone know why, and how to do this correctly?

Answer 1

Score: 2

The problem is in the values stored in column A rather than in the dataframe itself. df.copy() makes a deep copy by default, but it does not recursively copy Python objects held in a column: when a cell holds an object such as a list or a NumPy array, only the reference to it is copied. Both dataframes therefore point at the same array x. You can see the shared reference in your own output: you modified only the first row, yet every value of A changed, because all three rows hold that same array.
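To make the shared reference visible, here is a small check based on the question's setup. It is only a sketch: the variable names a_is_object, rows_share_x, copy_shares_x and b_unchanged are made up for illustration, and np.shares_memory is used to test whether two arrays are backed by the same data.

```python
import numpy as np
import pandas as pd

x = np.array([1, 2])
df = pd.DataFrame({'A': [x, x, x], 'B': [4, 5, 6]})
duplicate = df.copy()

# Column A has dtype object: each cell refers to a NumPy array.
a_is_object = df['A'].dtype == object

# Every cell of the original shares its data with x.
rows_share_x = all(np.shares_memory(cell, x) for cell in df['A'])

# The cells of the copy share that same data as well.
copy_shares_x = all(np.shares_memory(cell, x) for cell in duplicate['A'])

# The integer column B holds plain values, so it was copied for real:
# changing it in the copy leaves the original untouched.
duplicate.loc[0, 'B'] = 99
b_unchanged = df.loc[0, 'B'] == 4

print(a_is_object, rows_share_x, copy_shares_x, b_unchanged)
```

If all four flags come out true, the copy is independent for ordinary columns such as B, while the arrays in A are still shared between df and duplicate.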

The proper way is probably:

import pandas as pd
import numpy as np
from copy import deepcopy # <- **

x = np.array([1,2])
df = pd.DataFrame({'A': [x, x, x], 'B': [4, 5, 6]})

duplicate = df.copy()
# Give every cell in A its own array instead of a shared reference to x
duplicate['A'] = duplicate["A"].apply(deepcopy)  # <- **

duplicate['A'].values[0][[0,1]] = 0

print(duplicate)
print(df)

        A  B
0  [0, 0]  4
1  [1, 2]  5
2  [1, 2]  6

        A  B
0  [1, 2]  4
1  [1, 2]  5
2  [1, 2]  6
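
If several columns hold objects, the same idea can be wrapped in a helper. The sketch below is not part of pandas: copy_with_objects is a hypothetical name, and it assumes unique column names and cells that copy.deepcopy can handle.

```python
from copy import deepcopy

import numpy as np
import pandas as pd


def copy_with_objects(frame):
    """Hypothetical helper: copy a DataFrame and deep-copy each object-dtype cell."""
    result = frame.copy()
    for name in result.columns:
        # Only object columns can hold references to mutable Python objects.
        if result[name].dtype == object:
            result[name] = result[name].apply(deepcopy)
    return result


x = np.array([1, 2])
df = pd.DataFrame({'A': [x, x, x], 'B': [4, 5, 6]})

duplicate = copy_with_objects(df)
# Modify the array in the first row of the copy in place.
duplicate['A'].iloc[0][:] = 0

print(duplicate)
print(df)
```

Because deepcopy is applied cell by cell, each row of the copy gets its own array, so the in-place change should touch only the first row of duplicate and leave df and x alone.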

huangapple
  • Posted on 2023-02-07 02:16:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75365126.html