pandas: drop columns whose names contain a partial name listed in another DataFrame's column
Question
thisFilter = df.filter(like=dropthese['partname'].iloc[0], axis=1)
df.drop(thisFilter.columns, axis=1, inplace=True)
I have the following dataframe called dropthese.
| partname | x1 | x2 | x3....
0 text1_mid1
1 another1_mid2
2 yet_another
And another dataframe called df that looks like this.
text1_mid1_suffix1 | text1_mid1_suffix2 | ... | something_else | another1_mid2_suffix1 | ....
0 .....
1 .....
2 .....
3 .....
I want to drop all the columns from df if a part of the name is in dropthese['partname']. For example, since text1_mid1 is in partname, all columns that contain that partial string should be dropped, like text1_mid1_suffix1 and text1_mid1_suffix2.
I have tried:
thisFilter = df.filter(dropthese.partname, regex=True)
df.drop(thisFilter, axis=1)
But I get this error: TypeError: Keyword arguments `items`, `like`, or `regex` are mutually exclusive. What is the proper way to do this filter?
Answer 1
Score: 3
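I would use a regex with str.contains (or str.match if you want to restrict the match to the start of the string):
import re
# Escape each part name so it is matched literally, then join with '|' (regex OR).
pattern = '|'.join(dropthese['partname'].map(re.escape))
# No capturing group around the pattern: str.contains warns when the regex has match groups.
out = df.loc[:, ~df.columns.str.contains(pattern)]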
Output:
something_else
0 ...
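To make the difference between str.contains and str.match concrete, here is a small self-contained sketch. The sample frames are made up to mirror the layout in the question, and the column prefix_yet_another is a hypothetical addition so that the two methods give different results.

```python
import re
import pandas as pd

# Hypothetical sample data mirroring the question's layout.
dropthese = pd.DataFrame(
    {"partname": ["text1_mid1", "another1_mid2", "yet_another"]}
)
df = pd.DataFrame(
    0,
    index=range(3),
    columns=[
        "text1_mid1_suffix1",
        "text1_mid1_suffix2",
        "something_else",
        "another1_mid2_suffix1",
        "prefix_yet_another",  # assumed column: part name is not at the start
    ],
)

# Escape each part name and join with '|' so any one of them can match.
pattern = "|".join(dropthese["partname"].map(re.escape))

# contains: a column is dropped if a part name appears anywhere in its name.
kept_contains = df.loc[:, ~df.columns.str.contains(pattern)]

# match: a column is dropped only if its name starts with a part name.
kept_match = df.loc[:, ~df.columns.str.match(pattern)]

print(list(kept_contains.columns))  # ['something_else']
print(list(kept_match.columns))     # ['something_else', 'prefix_yet_another']
```

Which of the two is appropriate depends on whether the part names are meant as prefixes or as substrings.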
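Why your command failed
filter takes its first positional argument as items, so df.filter(dropthese.partname, regex=True) sets both items and regex, which are mutually exclusive. Pass the pattern to the regex parameter of filter instead, and use the column names in drop:
pattern = '|'.join(dropthese['partname'].map(re.escape))
thisFilter = df.filter(regex=pattern)
# drop returns a new DataFrame unless inplace=True, so assign the result
out = df.drop(thisFilter.columns, axis=1)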
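The filter and drop combination can also be wrapped in a small helper. This is only a sketch: the function name drop_columns_containing is hypothetical, and the early return is an assumption about the desired behaviour when the list of part names is empty (an empty regex matches every column name, so without the guard every column would be dropped).

```python
import re
import pandas as pd


def drop_columns_containing(df, parts):
    """Return a new DataFrame without columns whose name contains any of parts."""
    # Ignore missing values and empty strings in the list of part names.
    parts = [str(p) for p in parts if pd.notna(p) and str(p) != ""]
    if not parts:
        # An empty pattern would match every column, so leave the data as is.
        return df.copy()
    pattern = "|".join(re.escape(p) for p in parts)
    # filter(regex=...) selects the columns whose names match the pattern.
    to_drop = df.filter(regex=pattern).columns
    return df.drop(columns=to_drop)


# Hypothetical usage with column names taken from the question.
df = pd.DataFrame(
    columns=["text1_mid1_suffix1", "something_else", "another1_mid2_suffix1"]
)
out = drop_columns_containing(df, ["text1_mid1", "another1_mid2"])
print(list(out.columns))  # ['something_else']
```

With the frames from the question, the call would be drop_columns_containing(df, dropthese['partname']). The original df is left unchanged because drop returns a new object.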