Conditionally append a value to a list of lists in pandas
Question
I'm trying to conditionally append a value to a list of lists in pandas:
import pandas as pd
df = pd.DataFrame(data={'A': [1, 2, 3]})
df['B'] = [[[1],[1],[1]]] * df.shape[0]
df
A B
0 1 [[1], [1], [1]]
1 2 [[1], [1], [1]]
2 3 [[1], [1], [1]]
# attempt to append 2 to the first inner list in column B, only where A == 2
df['B'] = df['B'].mask(df.A == 2, df['B'].apply(lambda x: x[0].append(2)))
df
A B
0 1 [[1, 2, 2, 2], [1], [1]]
1 2 None
2 3 [[1, 2, 2, 2], [1], [1]]
# the result I'm hoping for is:
df['B'] = [[[1],[1],[1]],[[1,2],[1],[1]],[[1],[1],[1]]]
df
A B
0 1 [[1], [1], [1]]
1 2 [[1, 2], [1], [1]]
2 3 [[1], [1], [1]]
Answer 1
Score: 2
Try using a lambda function to conditionally modify the list of lists, like this:
import pandas as pd
df = pd.DataFrame(data={'A': [1, 2, 3]})
df['B'] = [[[1],[1],[1]]] * df.shape[0]
df['B'] = df.apply(lambda row: row['B'] if row['A'] != 2 else [row['B'][0] + [2]] + row['B'][1:], axis=1)
print(df)
Here we build a new list of lists that contains the modified first list and return it, so the original lists are not mutated. apply is called on the entire DataFrame with axis=1 so that the lambda function runs row by row.
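To see why building a new list matters here, a minimal pure-Python sketch may help. It assumes a plain list stands in for column B, and the helper name append_to_first is made up for illustration:

```python
def append_to_first(nested, value):
    """Return a new outer list whose first inner list has `value` added.

    Neither `nested` nor its inner lists are mutated.
    """
    return [nested[0] + [value]] + nested[1:]


shared = [[1], [1], [1]]
rows = [shared] * 3  # all three "rows" alias the same object, as in the question

# Rebind only the second row to a freshly built list
rows[1] = append_to_first(rows[1], 2)

print(rows)
# [[[1], [1], [1]], [[1, 2], [1], [1]], [[1], [1], [1]]]
print(shared)
# [[1], [1], [1]]
```

Note that the untouched inner lists are still shared between rows in this sketch; that is harmless only as long as nothing mutates them in place later.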
Answer 2
Score: 1
list.append works in place and returns None rather than the list. This is why your new df has None in the second row.
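As a quick illustration of that behavior in plain Python (the variable names are only for this sketch):

```python
values = [1]
result = values.append(2)  # mutates `values` in place

print(result)  # None
print(values)  # [1, 2]

# A non-mutating alternative: concatenation builds a new list
extended = values + [3]
print(extended)  # [1, 2, 3]
print(values)    # [1, 2]
```

So a lambda that ends in x[0].append(2) hands None back to pandas, while x[0] + [2] hands back a usable list.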
Below is one way to append 2 to the list. For each row we take the first inner list, add [2] to it, and unpack the remaining lists; mask then keeps that new value only where A equals 2, which gives the expected output:
df['B'].mask(df['A'].eq(2), lambda x: x.map(lambda x: [x[0] + [2], *x[1:]]))
Output:
0 [[1], [1], [1]]
1 [[1, 2], [1], [1]]
2 [[1], [1], [1]]
Answer 3
Score: 0
The problem is the way you generate your data frame:
import pandas as pd
df = pd.DataFrame(data={'A': [1, 2, 3]})
df['B'] = [[[1],[1],[1]]] * df.shape[0]
df
The expression [[[1],[1],[1]]] * df.shape[0] is a tricky thing in Python, because every [[1],[1],[1]] in the different rows of your df points to the same object, a single list [[1], [1], [1]].
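The aliasing can be checked without pandas. This is a small sketch in which plain lists play the role of column B and n_rows is an illustrative name:

```python
n_rows = 3

# List multiplication repeats the *reference*, not the nested list itself
aliased = [[[1], [1], [1]]] * n_rows
print(aliased[0] is aliased[1])  # True

aliased[1][0].append(2)
print(aliased)
# [[[1, 2], [1], [1]], [[1, 2], [1], [1]], [[1, 2], [1], [1]]]

# A list comprehension evaluates the literal once per row
independent = [[[1], [1], [1]] for _ in range(n_rows)]
print(independent[0] is independent[1])  # False

independent[1][0].append(2)
print(independent)
# [[[1], [1], [1]], [[1, 2], [1], [1]], [[1], [1], [1]]]
```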
Try this:
df.at[1, 'B'][0] = [1, 2]
.at[] lets you access a single cell by row label and column name at the same time; the [0] selects the first element of the list stored in that cell. So no lambda is needed here.
You might expect that only the list in the second row's B column changes, from [[1], [1], [1]] to [[1, 2], [1], [1]]. But if you look at the entire data frame:
df
it returns:
A B
0 1 [[1, 2], [1], [1]]
1 2 [[1, 2], [1], [1]]
2 3 [[1, 2], [1], [1]]
Because the elements in B all point to the identical object, they all get mutated at once. That is why you should avoid [ ] * <number> when generating such constructs. Instead, use e.g. a list comprehension:
import pandas as pd
df = pd.DataFrame(data={'A': [1, 2, 3]})
df['B'] = [[[1],[1],[1]] for _ in range(df.shape[0])]
df
# and mutate by:
df.at[1, 'B'][0] = [1, 2]
df
A B
0 1 [[1], [1], [1]]
1 2 [[1, 2], [1], [1]]
2 3 [[1], [1], [1]]
Avoid append in your lambda
How about:
import pandas as pd
df = pd.DataFrame(data={'A': [1, 2, 3]})
df['B'] = [[[1],[1],[1]] for _ in range(df.shape[0])]
df
import copy
def myfun(x):
    l = copy.deepcopy(x)  # work on a copy so the stored list is not mutated
    l[0] = l[0] + [2]
    return l
df.at[1, 'B'] = myfun(df.at[1, 'B'])
df
A B
0 1 [[1], [1], [1]]
1 2 [[1, 2], [1], [1]]
2 3 [[1], [1], [1]]
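One could use np.where() to get the row position of the first match (this needs numpy, and it works as a label here because the index is the default 0..n-1):
import numpy as np
idx = np.where(df['A'] == 2)[0][0]
df.at[idx, 'B'] = myfun(df.at[idx, 'B'])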
Answer 4
Score: 0
import pandas as pd
import copy
df = pd.DataFrame(data={'A': [1, 2, 3]})
df['B'] = [[[1],[1],[1]]] * df.shape[0]
i, j = 1, 1  # row and column positions of the cell to change
l = copy.deepcopy(df.iat[i, j])  # deep copy so the list shared by all rows is not mutated
l[0] = [1, 2]
df.iat[i, j] = l
print(df)
A B
0 1 [[1], [1], [1]]
1 2 [[1, 2], [1], [1]]
2 3 [[1], [1], [1]]