Filter out dataframe rows where specific columns have zeroes (R)

Question

Suppose I've got a dataframe like this:

df <- data.frame(
  Cat = c("A", "B", "C", "D", "E", "F"),
  S1 = c(0, 0, 0, 0, 0, 0),
  S2 = c(3, 0, 0, 0, 0, 2),
  S3 = c(-3, 3, 2, 0, 0, 5),
  S4 = c(0, -5, 5, 0, 0, 0)
)

I want to remove any rows where S1, S2, S3, and S4 are all zeroes, so that I'd only be left with Cat A, B, C, and F. I can't simply use a row sum because that would also wrongly remove Cat A (since 3 + -3 = 0). Note that in the real data, there are 36 "Sx" columns with names that will change every month. Any thoughts? Thank you!

Answer 1

Score: 1

You can do:

library(tidyverse)

df %>%
  # keep is TRUE when at least one column other than Cat is nonzero
  mutate(keep = apply(across(-Cat), 1, function(x) !all(x == 0))) %>%
  filter(keep) %>%
  select(-keep)

Here df is your data frame with the columns Cat, S1, S2, S3, and S4. mutate() creates a helper column named keep by checking, row by row, whether every column other than Cat is zero: the value is FALSE if they are all zero and TRUE otherwise. filter() then keeps only the rows where keep is TRUE, and select() drops the helper column. The result is a new data frame containing only the rows that meet the condition:

  Cat S1 S2 S3 S4
1   A  0  3 -3  0
2   B  0  0  3 -5
3   C  0  0  2  5
4   F  0  2  5  0
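
As a possible alternative, assuming dplyr 1.0.4 or later (where if_any() is available), the helper column can be skipped. This is only a sketch of the same idea and not part of the original answer; it keeps a row when any column other than Cat is nonzero.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  Cat = c("A", "B", "C", "D", "E", "F"),
  S1 = c(0, 0, 0, 0, 0, 0),
  S2 = c(3, 0, 0, 0, 0, 2),
  S3 = c(-3, 3, 2, 0, 0, 5),
  S4 = c(0, -5, 5, 0, 0, 0)
)

# Keep rows where at least one non-Cat column differs from zero
result <- df %>%
  filter(if_any(-Cat, ~ .x != 0))

result
```

Because the columns are selected with -Cat rather than by name, this should keep working when the "Sx" column names change, as long as Cat stays the only identifier column.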

Answer 2

Score: 1

You can use rowSums(df[-1] == 0) to count the number of 0s in each row, excluding the first column. A row is kept when that count is smaller than the number of value columns, ncol(df) - 1.

df[rowSums(df[-1] == 0) < (ncol(df) - 1), ]
#   Cat S1 S2 S3 S4
# 1   A  0  3 -3  0
# 2   B  0  0  3 -5
# 3   C  0  0  2  5
# 6   F  0  2  5  0
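
To see why counting zeroes works, it may help to look at the intermediate values. The sketch below rebuilds the example data; the names zero_count, n_value_cols, keep, and keep_alt are hypothetical and do not appear in the original answer.

```r
# Example data from the question
df <- data.frame(
  Cat = c("A", "B", "C", "D", "E", "F"),
  S1 = c(0, 0, 0, 0, 0, 0),
  S2 = c(3, 0, 0, 0, 0, 2),
  S3 = c(-3, 3, 2, 0, 0, 5),
  S4 = c(0, -5, 5, 0, 0, 0)
)

# Number of zeroes per row, ignoring the first (Cat) column:
# 2, 2, 2, 4, 4, 2 for rows A to F
zero_count <- rowSums(df[-1] == 0)

# Number of value columns (everything except Cat)
n_value_cols <- ncol(df) - 1

# A row is all-zero only when every value column is zero
keep <- zero_count < n_value_cols

# Equivalent test: at least one nonzero value in the row
keep_alt <- rowSums(df[-1] != 0) > 0

result <- df[keep, ]
result
```

Comparing against the column count rather than a fixed number means the same line should still work with 36 value columns.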

Answer 3

Score: 0

You could try:

# Columns to check for zeroes
zero_cols <- c("S1", "S2", "S3", "S4")

# Drop the rows where every one of those columns is zero
df[!apply(df[,zero_cols], 1, function(x) all(x == 0)),]

Output:

#   Cat S1 S2 S3 S4
# 1   A  0  3 -3  0
# 2   B  0  0  3 -5
# 3   C  0  0  2  5
# 6   F  0  2  5  0
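
Since the question mentions that the real column names change every month, hard-coding zero_cols may be brittle. One option, assuming Cat is the only non-numeric column (an assumption, not something the thread confirms), is to derive the names from the data. A minimal sketch:

```r
# Example data from the question
df <- data.frame(
  Cat = c("A", "B", "C", "D", "E", "F"),
  S1 = c(0, 0, 0, 0, 0, 0),
  S2 = c(3, 0, 0, 0, 0, 2),
  S3 = c(-3, 3, 2, 0, 0, 5),
  S4 = c(0, -5, 5, 0, 0, 0)
)

# Assumption: every numeric column is one of the "Sx" value columns
zero_cols <- names(df)[vapply(df, is.numeric, logical(1))]

# TRUE for rows where all value columns are zero
all_zero <- apply(df[, zero_cols, drop = FALSE], 1, function(x) all(x == 0))

# Keep the remaining rows; original row names are preserved
result <- df[!all_zero, , drop = FALSE]
result
```

If the data has other numeric columns that should not be checked, setdiff(names(df), "Cat") or an explicit exclusion list would be a safer way to build zero_cols.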

huangapple
  • Posted on August 9, 2023 at 03:21:51
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76862638.html