Replace multiple columns in a dataframe with a new column that indicates if the original columns contained any non-missing data

Question

I have a dataframe that resembles the simplified one below. I am hoping to replace columns A:C with a new column (new_column) that provides a 1 for a row with data and an NA for a row without data.

    A  B  C
    NA NA NA
    1  0  0
    0  1  0
    0  0  1

The outcome would look something like this:

    new_column
    NA
    1
    1
    1

I tried using the mutate command in dplyr:

    library(dplyr)

    df %>%
      mutate(new_column = apply(is.na(df[, c("A", "B", "C")]), 1, all),
             .keep = "unused",
             .before = "D")  # where D is the next column in the data frame

Answer 1

Score: 1

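Please try the code below:

    library(tidyverse)

    # Note: rowSums() returns NA for a row if any of A:C is NA,
    # so rows that are only partly missing are also set to NA.
    data %>% mutate(new = ifelse(!is.na(rowSums(across(c(A:C)))), 1, NA))

Output:

       A  B  C new
    1 NA NA NA  NA
    2  1  0  0   1
    3  0  1  0   1
    4  0  0  1   1
    5  0  0  0   1

Base R approach:

    # Note: this checks every column of data, so it assumes data holds only A:C.
    data$new_column <- ifelse(rowSums(is.na(data)) == ncol(data), NA, 1)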
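Two things are worth checking with the snippets above: how rows that are only partly missing are treated (rowSums() without na.rm = TRUE returns NA as soon as one value in the row is NA), and whether the data frame holds columns other than A:C. The following is a minimal base R sketch of one way to handle both. The extra column D, the partly missing fifth row, and the helper name any_data() are assumptions made for illustration and are not part of the original post.

```r
# Hypothetical example data: row 5 is only partly missing, and D is an
# extra column that should not influence the result.
data <- data.frame(
  A = c(NA, 1, 0, 0, NA),
  B = c(NA, 0, 1, 0, 1),
  C = c(NA, 0, 0, 1, NA),
  D = c("v", "w", "x", "y", "z")
)

# Illustrative helper (not from any package):
# 1 if any of the chosen columns is non-missing in a row, otherwise NA.
any_data <- function(df, cols) {
  has_value <- rowSums(!is.na(df[cols])) > 0
  ifelse(has_value, 1, NA)
}

cols <- c("A", "B", "C")
data$new_column <- any_data(data, cols)

# Drop the original columns and put new_column in their place, before D.
result <- data[c("new_column", setdiff(names(data), c(cols, "new_column")))]
print(result)
```

Counting non-missing values per row, rather than summing the values themselves, means that only a row where A, B and C are all NA ends up as NA.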

Answer 2

Score: 0

You can use if_all() + is.na:

    library(dplyr)

    df %>%
      mutate(new_column = ifelse(if_all(A:C, is.na), NA, 1),
             .keep = "unused")

    #   new_column
    # 1         NA
    # 2          1
    # 3          1
    # 4          1

Data

    df <- structure(list(A = c(NA, 1, 0, 0), B = c(NA, 0, 1, 0), C = c(NA,
    0, 0, 1)), class = "data.frame", row.names = c(NA, -4L))
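The question also wanted the new column to sit where A:C were, just before the next column. Here is a possible sketch of that with the same if_all() idea, assuming a trailing column named D as in the question (its values here are made up). It places the indicator with .before and drops A:C with an explicit select(), which is one option among several.

```r
library(dplyr)

# Hypothetical data: the A:C values from the Data section plus an assumed column D.
df <- data.frame(
  A = c(NA, 1, 0, 0),
  B = c(NA, 0, 1, 0),
  C = c(NA, 0, 0, 1),
  D = c("w", "x", "y", "z")
)

result <- df %>%
  # Place the indicator where A:C start, so it ends up just before D.
  mutate(new_column = ifelse(if_all(A:C, is.na), NA, 1), .before = A) %>%
  # Drop the original columns explicitly.
  select(-(A:C))

print(result)
```

Because if_all(A:C, is.na) is TRUE only when every one of the three columns is missing, a row with at least one value still gets a 1.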

huangapple
  • Published on August 11, 2023, 02:04:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76878286.html