Function for multi-column filtering in pandas with a loop
Question
Suppose I have the following pandas DataFrame:

    import pandas as pd
    df = pd.DataFrame({'A': [True, True, False], 'B': [1, 1, 2], 'C': [3, 4, 5]})

| A | B | C |
| -------- | -------- | -------- |
| True | 1 | 3 |
| True | 1 | 4 |
| False | 2 | 5 |
I want to write a function that takes a list of column names and their corresponding values as inputs and returns the filtered rows. For example:

    def pandas_filter(df, columns_list, values_list):
        return df.loc[df[columns_list] == values_list]

Continuing the example, when I write the following code:

    result = pandas_filter(df=df, columns_list=[A, B], values_list=[True, 1])

I want to get the following result:

| A | B | C |
| -------- | -------- | -------- |
| True | 1 | 3 |
| True | 1 | 4 |
However, the function above raises ValueError("Cannot index with multidimensional key").
Answer 1
Score: 2
You just need to chain your equality comparison (==) with all(axis=1) to form a row mask:

    def pandas_filter(df, columns_list, values_list):
        return df.loc[
            (df[columns_list] == values_list).all(axis=1)  # <-- add it here
        ]

    result = pandas_filter(df=df, columns_list=["A", "B"], values_list=[True, 1])

Output:

    print(result)
          A  B  C
    0  True  1  3
    1  True  1  4

Intermediates:

    >>> df[columns_list] == values_list
           A      B
    0   True   True
    1   True   True
    2  False  False
    >>> (df[columns_list] == values_list).all(axis=1)
    0     True
    1     True
    2    False
    dtype: bool
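
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes pandas is imported as pd; the length check is my own hypothetical addition and is not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({"A": [True, True, False], "B": [1, 1, 2], "C": [3, 4, 5]})


def pandas_filter(df, columns_list, values_list):
    # Hypothetical guard (not in the original answer): fail early with a
    # clear message if the two lists do not line up.
    if len(columns_list) != len(values_list):
        raise ValueError("columns_list and values_list must have the same length")
    # Element-wise comparison gives a 2-D boolean DataFrame,
    # one column per entry in columns_list.
    matches = df[columns_list] == values_list
    # all(axis=1) reduces it to a single boolean per row, which .loc accepts.
    return df.loc[matches.all(axis=1)]


result = pandas_filter(df, ["A", "B"], [True, 1])
print(result)
```

The key point is that .loc needs a one-dimensional boolean mask for row selection, and all(axis=1) is what collapses the per-column comparisons into one.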
Answer 2
Score: 0
There is a small error in your code: the column names in columns_list must be given as strings ('A', 'B'). Also, df[columns_list] == values_list produces a two-dimensional boolean DataFrame, which .loc cannot use as a row mask. As an alternative, you can build the mask one column at a time with isin(), which checks whether each value in a column is one of the values in a list.

    import pandas as pd

    def pandas_filter(df, columns_list, values_list):
        # Start from an all-True mask and narrow it one column at a time.
        conditions = pd.Series(True, index=df.index)
        for col, val in zip(columns_list, values_list):
            conditions = conditions & df[col].isin([val])
        return df.loc[conditions]

The isin() method checks whether each value in a column is contained in the given list; here that list holds the single value val, and the per-column results are combined with &.
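
Since isin() takes a list, the same loop can be extended to allow several acceptable values per column. The sketch below is only an illustration of that idea: the function name pandas_filter_any and the dict-shaped criteria argument are hypothetical and do not come from the answer.

```python
import pandas as pd

df = pd.DataFrame({"A": [True, True, False], "B": [1, 1, 2], "C": [3, 4, 5]})


def pandas_filter_any(df, criteria):
    """Keep rows where every column in `criteria` holds one of its allowed values.

    `criteria` maps a column name to a list of acceptable values
    (a hypothetical interface chosen for this example).
    """
    mask = pd.Series(True, index=df.index)
    for col, allowed in criteria.items():
        # Narrow the mask: the row must also match on this column.
        mask &= df[col].isin(allowed)
    return df.loc[mask]


# Rows where B is 1 or 2 and C is 4 or 5.
result = pandas_filter_any(df, {"B": [1, 2], "C": [4, 5]})
print(result)
```

A dict keeps each column next to its allowed values, which avoids the two parallel lists drifting out of sync.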