March 3, 2023, 22:51:16

Removing rows based on the values in multiple columns in R

Question

In my data, I need to remove the rows if the values in three columns (V1, V2, V3) are either a combination of 12 and NAs (like row 2) or all three of them equal 12 (like row 5). Please note that if all values equal NA (like row 3) it should remain in the data.

df <- data.frame(
  "V1" = c(NA, NA, NA, 12, 12),
  "V2" = c(55, NA, NA, 14, 12),
  "V3" = c(21, 12, NA, NA, 12),
  "V4" = c(NA, 32, NA, NA, NA),
  "V5" = c(NA, NA, 18, NA, NA)
)

     V1 V2 V3 V4 V5
1    NA 55 21 NA NA
2    NA NA 12 32 NA
3    NA NA NA NA 18
4    12 14 NA NA NA
5    12 12 12 NA NA

I would like the following result:

     V1 V2 V3 V4 V5 
1    NA 55 21 NA NA
3    NA NA NA NA 18
4    12 14 NA NA NA

Thanks in advance for your help.

Answer 1

Score: 3

You can use a dual condition in filter():

library(dplyr)
df %>%
  filter(!if_all(V1:V3, ~ .x %in% c(12, NA)) | if_all(V1:V3, ~ is.na(.x)))
#   V1 V2 V3 V4 V5
# 1 NA 55 21 NA NA
# 2 NA NA NA NA 18
# 3 12 14 NA NA NA
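
To see why the dual condition works, the two tests can be computed per row and inspected side by side. The following is a minimal base R sketch of the same logic, not the answer's own code; the names cols, m, only_12_or_na, all_na and keep are illustrative and do not appear in the original.

```r
# Sample data from the question
df <- data.frame(
  V1 = c(NA, NA, NA, 12, 12),
  V2 = c(55, NA, NA, 14, 12),
  V3 = c(21, 12, NA, NA, 12),
  V4 = c(NA, 32, NA, NA, NA),
  V5 = c(NA, NA, 18, NA, NA)
)

cols <- c("V1", "V2", "V3")    # illustrative name for the target columns
m <- as.matrix(df[, cols])

# TRUE where every value in V1:V3 is either 12 or NA
# (%in% treats NA as an ordinary value, so NA %in% c(12, NA) is TRUE)
only_12_or_na <- apply(m, 1, function(x) all(x %in% c(12, NA)))

# TRUE where every value in V1:V3 is NA
all_na <- apply(m, 1, function(x) all(is.na(x)))

# Keep a row unless it is made up only of 12s and NAs,
# but always keep rows that are entirely NA
keep <- !only_12_or_na | all_na

# Show the intermediate tests next to the final decision
data.frame(only_12_or_na, all_na, keep)

df[keep, ]
```

Unlike filter(), base R subsetting keeps the original row names, so the surviving rows are labelled 1, 3 and 4.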

Answer 2

Score: 1

First store the target column names in a col variable. A row is dropped when the number of values in those columns that are NA or 12 equals length(col), that is, when every one of them is NA or 12. Note that this version also drops rows in which all three values are NA (row 3).

col <- c("V1", "V2", "V3")
df[apply(df[, col], 1, \(x) sum((is.na(x) | x == 12), na.rm = TRUE) != length(col)), ]

Or, vectorized with rowSums():

df[rowSums(is.na(df[, col]) | df[, col] == 12, na.rm = TRUE) < length(col), ]

Update: To remove rows that either include both 12 and NA or all of the values equal 12, use the following code:

df[apply(df[, col], 1, \(x) !((sum((is.na(x) | x == 12), na.rm = TRUE) == length(col)) &
                               (sum(is.na(x)) >= 1 & sum(x == 12, na.rm = TRUE) >= 1) |
                                sum(x == 12, na.rm = TRUE) == length(col))), ]

Output

  V1 V2 V3 V4 V5
1 NA 55 21 NA NA
3 NA NA NA NA 18
4 12 14 NA NA NA
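
The updated condition can arguably be stated more compactly: a row is dropped when every value in the target columns is NA or 12 and at least one of them is 12. The sketch below is an assumed simplification, not part of the original answer; it should select the same rows on the sample data, and the names m, in_set and has_12 are illustrative.

```r
# Sample data from the question
df <- data.frame(
  V1 = c(NA, NA, NA, 12, 12),
  V2 = c(55, NA, NA, 14, 12),
  V3 = c(21, 12, NA, NA, 12),
  V4 = c(NA, 32, NA, NA, NA),
  V5 = c(NA, NA, 18, NA, NA)
)

col <- c("V1", "V2", "V3")
m <- as.matrix(df[, col])

# TRUE where all target values are NA or 12
in_set <- rowSums(is.na(m) | m == 12, na.rm = TRUE) == length(col)

# TRUE where at least one target value equals 12
has_12 <- rowSums(m == 12, na.rm = TRUE) >= 1

# Drop rows that are all 12, or a mix of 12 and NA; all-NA rows survive
# because has_12 is FALSE for them
df[!(in_set & has_12), ]
#   V1 V2 V3 V4 V5
# 1 NA 55 21 NA NA
# 3 NA NA NA NA 18
# 4 12 14 NA NA NA
```

Because rowSums() works on the whole matrix at once, this avoids the per-row apply() call.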
