How can I remove rows of a dataframe that contain two specific characters?
Question
I have a character column in a data frame where each value is a 7-letter string made up of the letters A, B, C, and D.
For example:
1 - AAAABBB
2 - ACBCDAB
3 - AACCADD
4 - ACDCACC
5 - ABAABBC
6 - BCBBDCB
I would like to remove the rows of the data frame that contain both an A and a B,
but keep any rows that contain only A's, C's, and D's, or only B's, C's, and D's.
So the end result should be:
3 - AACCADD
4 - ACDCACC
6 - BCBBDCB
The number of A's and B's in a cell does not matter: as long as the cell contains at least one A and at least one B, I would like to remove that row.
I've tried using str_split_fixed and then subsetting the various columns, but I feel like there should be a more efficient way.
Answer 1
Score: 0
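With base R:
# keep rows unless the column contains both an "A" and a "B"
subset(your_data, !(grepl("A", your_column) & grepl("B", your_column)))
Or with the tidyverse:
library(stringr)
library(dplyr)
# same condition, using str_detect() inside filter()
your_data |>
  filter(!(str_detect(your_column, "A") & str_detect(your_column, "B")))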
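To check the approach against the example in the question, here is a minimal, self-contained base R sketch. The names `df`, `id`, and `code` are made up for illustration and are not from the original post; swap in your own data frame and column names.

```r
# Hypothetical data frame built from the six example strings in the question;
# the names `df`, `id`, and `code` are assumptions for illustration only.
df <- data.frame(
  id   = 1:6,
  code = c("AAAABBB", "ACBCDAB", "AACCADD", "ACDCACC", "ABAABBC", "BCBBDCB"),
  stringsAsFactors = FALSE
)

# TRUE for rows whose string contains at least one "A" and at least one "B"
has_a_and_b <- grepl("A", df$code, fixed = TRUE) &
  grepl("B", df$code, fixed = TRUE)

# Keep only the rows that do not contain both letters
result <- df[!has_a_and_b, ]
print(result)
```

With the example data this should leave rows 3, 4, and 6. One thing to watch if your real column has missing values: `grepl()` returns FALSE for NA, so this sketch keeps NA rows, whereas `str_detect()` returns NA and `filter()` drops those rows.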