How to drop a row in Pandas if the third letter in a column is W?

Question

I have a dataframe of this kind:

ID      Jan  Feb  Mar
20WAST    2    2    5
20S22     0    0    1
20W1ST    2    2    5
200122    0    0    1

I want to drop all rows where the third character in the first column is 'W', giving this output:

ID      Jan  Feb  Mar
20S22     0    0    1
200122    0    0    1

It is a very large dataframe and I tried doing something like this:

df[df.ID[2] != 'W']

But this only selects the item in the second row. I could iterate over the dataframe, but I wanted to see if there is a better option.

Answer 1

Score: 4

You are almost there. Use:

df = df[df['ID'].str[2].ne('W')]

You might want to reset the index after this selection, for example with df.reset_index(drop=True).
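
As a minimal sketch of the approach above, the snippet below rebuilds the question's sample data by hand (the DataFrame construction and the names mask and filtered are assumptions for illustration). It relies on the fact that df.ID[2] looks up a single element by index label, whereas .str[2] takes the character at position 2 of every string in the column.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed layout).
df = pd.DataFrame({
    "ID": ["20WAST", "20S22", "20W1ST", "200122"],
    "Jan": [2, 0, 2, 0],
    "Feb": [2, 0, 2, 0],
    "Mar": [5, 1, 5, 1],
})

# Boolean mask: True where the third character (position 2) is not 'W'.
mask = df["ID"].str[2].ne("W")

# Keep only the rows where the mask is True, then renumber the index from 0.
filtered = df[mask].reset_index(drop=True)

print(filtered)
```

The mask is computed for the whole column at once, so no explicit loop over rows is needed, which matters for a very large dataframe.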

Answer 2

Score: 0

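You can use a regex to test the third character:

out = df[df['ID'].str.contains(r'^.{2}(?!W)')]
# or
out = df[df['ID'].str.match(r'.{2}(?!W)')]
# or
out = df[df['ID'].str.match(r'.{2}[^W]')]

NOTE: str.match anchors the pattern at the beginning of each string, whereas str.contains searches anywhere in it, which is why the str.contains version needs the leading ^. The [^W] variant also requires a third character to exist, so it drops IDs shorter than three characters; the lookahead variants keep them.

print(out)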

       ID  Jan  Feb  Mar
1   20S22    0    0    1
3  200122    0    0    1
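
To make the difference between the three patterns concrete, here is a small sketch that runs them on the question's IDs plus one extra two-character ID, "20". That extra value and the variable names are assumptions added for illustration, not part of the original data.

```python
import pandas as pd

# The four IDs from the question, plus a hypothetical short ID "20".
ids = pd.Series(["20WAST", "20S22", "20W1ST", "200122", "20"])

# Negative lookahead, anchored implicitly by str.match.
lookahead = ids.str.match(r".{2}(?!W)")

# Negated character class: needs an actual third character that is not 'W'.
negated = ids.str.match(r".{2}[^W]")

# str.contains searches anywhere, so the pattern is anchored with ^.
contains = ids.str.contains(r"^.{2}(?!W)")

print(pd.DataFrame({
    "ID": ids,
    "lookahead": lookahead,
    "negated": negated,
    "contains": contains,
}))
```

On the four original IDs all three masks agree. They differ only for the short ID: the lookahead patterns keep it, because there is no 'W' in third position, while the [^W] pattern rejects it, because there is no third character to match.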

huangapple
  • Posted on March 4, 2023, 05:32:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75632057.html