Filter row with one specific string value in R
Question
I have a dataframe in R as below:
Fruits
Apple:1
Apple:4
Bananna
Papaya
Orange, Apple:2
I want to filter rows with the string Apple as:
Apple:1
Apple:4
I tried using the dplyr package.
df <- dplyr::filter(df, grepl('Apple', Fruits))
But it filters rows with the string Apple as:
Apple:1
Apple:4
Orange, Apple:2
How can I remove rows that contain multiple strings and keep only the rows with one specific string (in this case Apple)?
Answer 1
Score: 2
To keep only the rows that contain nothing but Apple entries, you can use the regex anchor ^ to mark the start of the string, followed by "Apple:" and one or more digits. The group may repeat, optionally separated by ", ", and the pattern is closed with $, which marks the end of the string. Because the whole string has to match, grepl() returns FALSE as soon as any other text appears in it.
library(dplyr)
df %>% filter(grepl("^(Apple:\\d+(, )?){1,}$", Fruits))
Fruits
1 Apple:1
2 Apple:4
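As a quick check of the pattern above, here is a minimal base R sketch. The data frame is reconstructed from the question's listing, so its exact form (a single character column named Fruits) is an assumption, and base subsetting is swapped in for dplyr::filter() only to keep the example dependency-free.

```r
# Sample data reconstructed from the question (assumed to be one character column).
df <- data.frame(
  Fruits = c("Apple:1", "Apple:4", "Bananna", "Papaya", "Orange, Apple:2"),
  stringsAsFactors = FALSE
)

# One or more "Apple:<digits>" items, optionally followed by ", ",
# anchored at both ends so nothing else may appear in the string.
pattern <- "^(Apple:\\d+(, )?){1,}$"

# Logical vector marking the rows whose whole string matches the pattern.
keep <- grepl(pattern, df$Fruits)

# Base R equivalent of dplyr::filter(df, keep).
result <- df[keep, , drop = FALSE]
print(result)
```

Only the two pure Apple rows should survive here; "Orange, Apple:2" is dropped because the text before the first Apple item prevents the anchored match.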
Answer 2
Score: 1
EDIT:
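Assuming, based on the OP's comments, that the goal is to keep strings in which the only fruit mentioned is Apple, and assuming further that the list of non-Apple fruits is manageable, you could do this:
library(dplyr)
library(stringr)
df %>%
  filter(str_detect(Fruits, '^(?!.*(Banana|Orange)).*Apple'))
                   Fruits
1 Apple, Apple:2, Apple:7
Here, the negative look-ahead (?!.*(Banana|Orange)) asserts that neither Banana nor Orange appears anywhere in the string together with Apple. The inner parentheses matter: without them the alternation splits the look-ahead into .*Banana and a bare Orange, so Orange would only be rejected when it sits at the very start of the string.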
Data:
df <- data.frame(
Fruits = c("Orange, Apple:2",
"Apple, Apple:2, Apple:7",
"Apple:2, Banana:10"))
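If the list of non-Apple fruits is not known in advance, one possible alternative is to split each string on commas and require every item to start with Apple. The sketch below is only an illustration under that assumption: the helper name only_apple is hypothetical, and it uses base R rather than the stringr call from this answer.

```r
# Same sample data as in the answer above.
df <- data.frame(
  Fruits = c("Orange, Apple:2",
             "Apple, Apple:2, Apple:7",
             "Apple:2, Banana:10"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when every comma-separated item starts with "Apple".
only_apple <- function(x) {
  # Split on a comma followed by optional whitespace; returns one vector per string.
  items <- strsplit(x, ",\\s*")
  vapply(
    items,
    function(parts) length(parts) > 0 && all(grepl("^Apple", parts)),
    logical(1)
  )
}

# Keep only the rows where all items are Apple entries.
result <- df[only_apple(df$Fruits), , drop = FALSE]
print(result)
```

This avoids enumerating the fruits to exclude, at the cost of assuming that items are always comma-separated and that the fruit name comes first in each item.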