Remove rows from a data frame that match on multiple criteria
Question
I want to remove rows from my data frame that match a specific pattern, using tidyverse syntax if possible.
I want to remove rows where col1 is "cat" and any of col2:col4 contains one of the following words: dog, fox, or cow. In this example, that removes rows 1 and 4 of the original data.
Here's a sample dataset:
df <- data.frame(col1 = c("cat", "fox", "dog", "cat", "pig"),
                 col2 = c("lion", "tiger", "elephant", "dog", "cow"),
                 col3 = c("bird", "cow", "sheep", "fox", "dog"),
                 col4 = c("dog", "cat", "cat", "cow", "fox"))
I've tried a number of across variants but constantly run into issues. Here is my latest attempt:
filtered_df <- df %>%
  filter(!(col1 == "cat" & !any(cowfoxdog <- across(col2:col4, ~ . %in% c("cow", "fox", "dog")))))
This returns the following error:
Error in `filter()`:
! Problem while computing `..1 = !...`.
Caused by error in `FUN()`:
! only defined on a data frame with all numeric variables
Answer 1
Score: 5
You can use if_any(). For a more robust test, I first added a row where col1 == "cat" but "dog", "fox", or "cow" don't appear in columns 2-4.
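library(dplyr)

# Extra test row: col1 is "cat" but no target word appears in col2:col4,
# so this row should be kept
df <- df %>%
  add_row(col1 = "cat", col2 = "sheep", col3 = "lion", col4 = "tiger")

# Drop rows where col1 is "cat" AND at least one of col2:col4 is a target word
df %>%
  filter(!(col1 == "cat" & if_any(col2:col4, \(x) x %in% c("dog", "fox", "cow"))))
  col1     col2  col3  col4
1  fox    tiger   cow   cat
2  dog elephant sheep   cat
3  pig      cow   dog   fox
4  cat    sheep  lion tiger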
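To see which rows the condition flags before anything is dropped, one option is to compute the flag as a column first. This is a minimal sketch on the original five-row data, assuming dplyr 1.0.4 or later (for if_any()) and R 4.1 or later (for the \(x) lambda). The names targets, flagged, and drop_row are made up for illustration and are not part of the answer above.

```r
library(dplyr)

df <- data.frame(col1 = c("cat", "fox", "dog", "cat", "pig"),
                 col2 = c("lion", "tiger", "elephant", "dog", "cow"),
                 col3 = c("bird", "cow", "sheep", "fox", "dog"),
                 col4 = c("dog", "cat", "cat", "cow", "fox"))

# Hypothetical helper name: the words to look for in col2:col4
targets <- c("dog", "fox", "cow")

# drop_row is TRUE where col1 is "cat" and any of col2:col4 is a target word
flagged <- df %>%
  mutate(drop_row = col1 == "cat" & if_any(col2:col4, \(x) x %in% targets))

# Positions of the rows that would be removed
which(flagged$drop_row)
# [1] 1 4

# Keep the unflagged rows and discard the helper column
filtered_df <- flagged %>%
  filter(!drop_row) %>%
  select(-drop_row)
```

The flagged positions match the rows the question expects to lose (1 and 4), which is a quick way to sanity-check the condition before filtering.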
Answer 2
Score: 1
One way is to use filter() and combine your criteria with logical operators:
library(tidyverse)

pattern1 <- c("cat")
pattern2 <- c("dog", "fox", "cow")

df %>%
  filter(!(col1 == pattern1 &
             (col2 %in% pattern2 |
                col3 %in% pattern2 |
                col4 %in% pattern2)))
  col1     col2  col3 col4
1  fox    tiger   cow  cat
2  dog elephant sheep  cat
3  pig      cow   dog  fox
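If the list of columns to check grows, the hand-written chain of | conditions can be built programmatically instead. The following is only a sketch in base R (not tidyverse, and not taken from either answer); the names targets, check_cols, hit, and result are hypothetical.

```r
df <- data.frame(col1 = c("cat", "fox", "dog", "cat", "pig"),
                 col2 = c("lion", "tiger", "elephant", "dog", "cow"),
                 col3 = c("bird", "cow", "sheep", "fox", "dog"),
                 col4 = c("dog", "cat", "cat", "cow", "fox"))

targets    <- c("dog", "fox", "cow")      # words to look for (illustrative name)
check_cols <- c("col2", "col3", "col4")   # columns to search (illustrative name)

# One logical vector per checked column, OR-ed together element-wise,
# so hit[i] is TRUE when any checked column in row i holds a target word
hit <- Reduce(`|`, lapply(df[check_cols], function(x) x %in% targets))

# Keep every row except those where col1 is "cat" and hit is TRUE
result <- df[!(df$col1 == "cat" & hit), ]
```

Base R subsetting keeps the original row names, so here the remaining rows are labelled 2, 3, and 5 rather than renumbered from 1.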