2023-08-11 02:04:45

Replace multiple columns in a dataframe with a new column that indicates if the original columns contained any non-missing data

Question

I have a dataframe that resembles the simplified one below. I am hoping to replace columns A:C with a new column (new_column) that provides a 1 for a row with data and an NA for a row without data.

A  B  C
NA NA NA
1  0  0
0  1  0
0  0  1

The outcome would look something like this:

new_column
NA
1
1
1

I tried using the mutate command in dplyr:

library(dplyr)
df %>%
  mutate(new_column = apply(is.na(df[, c("A","B","C")]), 1, all),
         .keep = "unused",
         .before = "D" ) # where D is the next column in the data frame

Answer 1

Score: 1

Try the code below:

library(tidyverse)
data %>% mutate(new = ifelse(!is.na(rowSums(across(c(A:C)))), 1, NA))
   A  B  C new
1 NA NA NA  NA
2  1  0  0   1
3  0  1  0   1
4  0  0  1   1
5  0  0  0   1

Base R approach:

data$new_column <- ifelse(rowSums(is.na(data)) == ncol(data), NA, 1)
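
One caveat worth checking, offered as a rough sketch rather than a correction: rowSums() returns NA as soon as any value in a row is missing, so the sum-based test also marks partially missing rows as NA, and the base R line counts NAs across every column of data, not only A:C. The example below uses a hypothetical data frame (row 5 and column D are made up for illustration) and restricts the check to A:C.

```r
# Hypothetical data: row 5 is only partially missing, D is an unrelated column.
data <- data.frame(
  A = c(NA, 1, 0, 0, NA),
  B = c(NA, 0, 1, 0, 1),
  C = c(NA, 0, 0, 1, 0),
  D = c("v", "w", "x", "y", "z")
)

cols <- c("A", "B", "C")

# rowSums() without na.rm = TRUE is NA when at least one value is missing,
# so the partially missing row 5 is flagged NA here as well.
sum_based <- ifelse(!is.na(rowSums(data[cols])), 1, NA)

# Counting missing values over A:C only flags rows where all three are missing.
count_based <- ifelse(rowSums(is.na(data[cols])) == length(cols), NA, 1)

# Replace A:C with the indicator, keeping it in front of the remaining columns.
result <- data.frame(new_column = count_based,
                     data[setdiff(names(data), cols)])
```

With this data, sum_based is NA for row 5 while count_based is 1, which matches the "any non-missing data" wording in the question title.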

Answer 2

Score: 0

You can use if_all() + is.na:

library(dplyr)
df %>%
  mutate(new_column = ifelse(if_all(A:C, is.na), NA, 1),
         .keep = "unused")
#   new_column
# 1         NA
# 2          1
# 3          1
# 4          1

Data

df <- structure(list(A = c(NA, 1, 0, 0), B = c(NA, 0, 1, 0), C = c(NA, 
0, 0, 1)), class = "data.frame", row.names = c(NA, -4L))
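
The question also wanted the new column placed before the next column, D, which the sample data here does not include. A minimal sketch of how that might look, assuming dplyr 1.0.4 or later (where if_all() is available) and a hypothetical data frame df2 with a made-up D column:

```r
library(dplyr)

# Hypothetical data: the same A:C values as in the answer, plus an unrelated D.
df2 <- data.frame(
  A = c(NA, 1, 0, 0),
  B = c(NA, 0, 1, 0),
  C = c(NA, 0, 0, 1),
  D = c("w", "x", "y", "z")
)

out <- df2 %>%
  mutate(new_column = ifelse(if_all(A:C, is.na), NA, 1),
         .keep = "unused",  # drop A:C, which were used to build new_column
         .before = D)       # place new_column in front of D
```

Because if_all() is TRUE only when every selected column is NA, a row with at least one non-missing value in A:C still gets a 1.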
