Replace multiple columns in a dataframe with a new column that indicates if the original columns contained any non-missing data

Question

I have a data frame that resembles the simplified one below. I want to replace columns A:C with a single new column (new_column) that contains 1 for rows that have any non-missing data and NA for rows where all three columns are missing.

A  B  C
NA NA NA
1  0  0
0  1  0
0  0  1

The outcome would look something like this:

new_column
NA
1
1
1

I tried using mutate() from dplyr, but this returns TRUE for the all-missing row and FALSE otherwise, rather than NA and 1:

library(dplyr)

df %>%
  mutate(new_column = apply(is.na(df[, c("A","B","C")]), 1, all),
         .keep = "unused",
         .before = "D" ) # where D is the next column in the data frame

Answer 1

Score: 1

Please try the code below:

library(tidyverse)

# Count the non-missing cells per row, so only rows that are entirely NA get NA
data %>% mutate(new = ifelse(rowSums(!is.na(across(A:C))) > 0, 1, NA))


   A  B  C new
1 NA NA NA  NA
2  1  0  0   1
3  0  1  0   1
4  0  0  1   1
5  0  0  0   1

Base R approach (restricted to columns A:C so that other columns do not affect the result):

data$new_column <- ifelse(rowSums(!is.na(data[c("A", "B", "C")])) > 0, 1, NA)
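
One detail worth checking with either approach is how partially missing rows are treated. The following is a minimal base R sketch, assuming hypothetical data in which the second row has only one missing value (the names `cols`, `sum_based`, and `count_based` are illustrative and not from the original post). Summing the raw values propagates NA, whereas counting non-missing cells yields NA only when every cell in the row is missing.

```r
# Hypothetical example data: row 1 is entirely missing, row 2 only partially.
data <- data.frame(A = c(NA, NA, 0), B = c(NA, 1, 1), C = c(NA, 0, 0))
cols <- c("A", "B", "C")

# rowSums() without na.rm = TRUE returns NA as soon as one value is missing,
# so the partially missing row is flagged as NA as well.
sum_based <- ifelse(!is.na(rowSums(data[cols])), 1, NA)

# Counting non-missing cells flags a row as NA only if all cells are missing.
count_based <- ifelse(rowSums(!is.na(data[cols])) > 0, 1, NA)

print(sum_based)
print(count_based)
```

Here `sum_based` is NA for the first two rows and 1 for the third, while `count_based` is NA only for the first row. Which behavior is wanted depends on whether a partially filled row should count as "having data"; the title of the question suggests the second.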

Answer 2

Score: 0

You can use if_all() with is.na, which is TRUE only for rows where every selected column is missing:

library(dplyr)

df %>%
  mutate(new_column = ifelse(if_all(A:C, is.na), NA, 1),
         .keep = "unused")

#   new_column
# 1         NA
# 2          1
# 3          1
# 4          1

Data:

df <- structure(list(A = c(NA, 1, 0, 0), B = c(NA, 0, 1, 0), C = c(NA, 
0, 0, 1)), class = "data.frame", row.names = c(NA, -4L))
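
The question also asked for the new column to sit where A:C used to be, before a following column D. As a rough sketch, assuming a hypothetical character column `D` added to the sample data (its values are made up for illustration) and a dplyr version that provides `if_all()`, the same call can be combined with the `.before` argument from the original attempt:

```r
library(dplyr)

# Hypothetical data: D stands in for the "next column" mentioned in the question.
df <- data.frame(
  A = c(NA, 1, 0, 0),
  B = c(NA, 0, 1, 0),
  C = c(NA, 0, 0, 1),
  D = c("w", "x", "y", "z")
)

result <- df %>%
  mutate(new_column = ifelse(if_all(A:C, is.na), NA, 1),
         .keep = "unused",  # drop A:C, which were used to build new_column
         .before = "D")     # place new_column in front of D

print(result)
```

The result should contain only `new_column` and `D`, in that order, with NA in the first row and 1 in the remaining rows.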

huangapple
  • Posted on August 11, 2023, 02:04:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76878286.html