April 13, 2023 20:02:04

Filter row with one specific string value in R

Question

I have a dataframe in R as below:

Fruits
Apple:1
Apple:4
Bananna
Papaya
Orange, Apple:2

I want to keep only the rows that contain Apple and nothing else:

Apple:1
Apple:4

I tried using the dplyr package.

df <- dplyr::filter(df, grepl('Apple', Fruits))

But this keeps every row that contains the string Apple:

Apple:1
Apple:4
Orange, Apple:2

How can I drop rows that mention several different strings and keep only rows with one specific string (in this case, Apple)?
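
For reference, the setup described in the question can be reproduced with a few lines of base R. This is only a sketch: the data frame is rebuilt from the values listed above, and the name matched is introduced here for illustration.

# Sample data from the question, rebuilt as a data frame
df <- data.frame(
  Fruits = c("Apple:1", "Apple:4", "Bananna", "Papaya", "Orange, Apple:2"),
  stringsAsFactors = FALSE
)

# grepl() looks for the pattern anywhere in the string,
# so the mixed row "Orange, Apple:2" is kept as well
matched <- df$Fruits[grepl("Apple", df$Fruits)]
print(matched)

Because the unanchored pattern matches a substring, any fix has to constrain what may appear before and after Apple.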

Answer 1

Score: 2

To keep only the rows where Apple is the sole fruit, anchor the pattern with ^ (start of the string), match "Apple:" followed by one or more digits, and close the pattern with $ (end of the string). The group may repeat, so a row listing several Apple entries still matches. grepl returns FALSE if the string contains any other characters.

library(dplyr)
df %>% filter(grepl("^(Apple:\\d+(, )?){1,}$", Fruits))
   Fruits
1 Apple:1
2 Apple:4
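
The behaviour of the anchored pattern can be checked with base R alone. The sketch below uses the question's values plus two hypothetical strings ("Apple:1, Apple:3" and "Apple:1, ") that are not in the original data. The stricter pattern is an assumption added here for comparison, not part of the answer.

# Question data plus two hypothetical edge cases at the end
fruits <- c("Apple:1", "Apple:4", "Bananna", "Papaya",
            "Orange, Apple:2", "Apple:1, Apple:3", "Apple:1, ")

# Pattern from the answer: one or more "Apple:<digits>" items,
# each optionally followed by ", "
loose <- grepl("^(Apple:\\d+(, )?){1,}$", fruits)

# Hypothetical stricter variant: a separator must be followed by another Apple item
strict <- grepl("^Apple:\\d+(, Apple:\\d+)*$", fruits)

print(data.frame(fruits, loose, strict))

Both patterns reject the mixed row "Orange, Apple:2". They differ only on the dangling separator in "Apple:1, ", which the answer's pattern accepts because the ", " part is optional after every item.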

Answer 2

Score: 1

EDIT:

Assuming, based on comments made by OP, that strings should be filtered where the only fruit mentioned is Apple and assuming further that the list of non-Apple fruit is manageable, you could do this:

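library(dplyr)
library(stringr)
# The inner group makes the look-ahead reject Banana or Orange anywhere in the string
df %>% 
  filter(str_detect(Fruits, '^(?!.*(Banana|Orange)).*Apple'))
                   Fruits
1 Apple, Apple:2, Apple:7

Here, we use the negative look-ahead (?!.*(Banana|Orange)) to assert that neither Banana nor Orange is present anywhere in the string together with Apple. The inner parentheses matter: without them, (?!.*Banana|Orange) would reject Orange only at the very start of the string.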
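
How alternation is grouped inside a negative look-ahead changes which rows are rejected. The following base R sketch compares an ungrouped and a grouped look-ahead. The vector x and the string "Apple:2, Orange:1" are hypothetical additions for illustration. Base R needs perl = TRUE for look-ahead syntax.

x <- c("Orange, Apple:2", "Apple:2, Orange:1",
       "Apple, Apple:2, Apple:7", "Apple:2, Banana:10")

# Ungrouped: reads as ".*Banana" OR "Orange", so Orange is only checked
# at the position of the look-ahead, i.e. the start of the string
ungrouped <- grepl("^(?!.*Banana|Orange).*Apple", x, perl = TRUE)

# Grouped: ".*" applies to both alternatives, so either fruit is
# rejected wherever it appears
grouped <- grepl("^(?!.*(Banana|Orange)).*Apple", x, perl = TRUE)

print(data.frame(x, ungrouped, grouped))

The two versions agree on the answer's sample data and differ only when Orange appears after the start of the string, as in "Apple:2, Orange:1".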

Data:

df <- data.frame(
  Fruits = c("Orange, Apple:2", 
             "Apple, Apple:2, Apple:7", 
             "Apple:2, Banana:10"))
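
If the list of non-Apple fruit is not manageable, one possible alternative (a sketch, not part of the answer) is to split each row into items and require every item to be an Apple entry. It assumes items are separated by ", " and look like "Apple" or "Apple:<digits>". The names only_apple and result are introduced here for illustration.

df <- data.frame(
  Fruits = c("Orange, Apple:2",
             "Apple, Apple:2, Apple:7",
             "Apple:2, Banana:10"),
  stringsAsFactors = FALSE
)

# Split each row on ", " and keep it only if every item is
# "Apple" with an optional ":<digits>" suffix
only_apple <- vapply(
  strsplit(df$Fruits, ", ", fixed = TRUE),
  function(items) all(grepl("^Apple(:\\d+)?$", items)),
  logical(1)
)

result <- df[only_apple, , drop = FALSE]
print(result)

This avoids naming the fruit to exclude, at the cost of depending on a consistent separator.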
