April 20, 2023 08:44:31

Mutate a new column according to the values of each row

Question

I have the following toy data frame.

toy.df <- data.frame(Name = c("group1", "group2", "group3", "group4", "group5", "group6", "group7"), 
                 col1 = c("pos", "neg", "NA", "pos","neg", "NA", "pos"),
                 col2 = c("pos", "pos", "NA", "pos","neg","NA", "neg"),
                 col3 = c("pos", "NA", "pos", "NA", "neg", "neg", "neg"))

I would like to mutate a new column that checks the values of all columns in each row. If they are all "pos" or "NA", mutate "pos"; if they are all "neg" or "NA", mutate "neg"; and if the row contains both "pos" and "neg" (with or without "NA"), mutate "both".

The new column looks as follows:

col4 <- c("pos", "both", "pos", "pos","neg", "neg","both")

Here is the final data frame:

 Name  col1 col2 col3 col4
group1  pos  pos  pos  pos
group2  neg  pos  NA  both
group3  NA   NA   pos  pos
group4  pos  pos   NA  pos
group5  neg  neg  neg  neg
group6  NA   NA   neg  neg
group7  pos  neg  neg both

Answer 1

Score: 3

Since the "NA" in your data frame is the literal string "NA", we first need to turn it into a real missing value (NA) with na_if. Then we use case_when to supply the conditions for the new column. We need rowwise so that the conditions are evaluated separately for every row. The final TRUE ~ "unknown" in case_when catches any string other than "pos" and "neg" in col1 to col3.
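To illustrate the first step in isolation, here is a minimal sketch of what na_if does to a character vector. The vectors x and y are made up for this illustration and are not part of the original answer.

```r
library(dplyr)

# A character vector containing the literal string "NA" (not a missing value)
x <- c("pos", "NA", "neg")

is.na(x)
# [1] FALSE FALSE FALSE   (the string "NA" is not treated as missing)

# na_if() replaces every element equal to "NA" with a real missing value
y <- na_if(x, "NA")

is.na(y)
# [1] FALSE  TRUE FALSE
```

Once the values are real NAs, functions such as is.na() and the na.rm argument of all() behave as expected.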

I added two entries to show the behaviour when every column in a row is NA, or when there is a typo in one of the columns.

library(dplyr)
toy.df %>% 
  rowwise() %>%  
  # Convert the literal string "NA" into a real missing value
  mutate(across(everything(), ~na_if(.x, "NA")),
         # Conditions are checked in order; the first one that is TRUE wins
         col4 = case_when(all(is.na(c_across(col1:col3))) ~ NA_character_,
                          all(c_across(col1:col3) == "pos", na.rm = TRUE) ~ "pos",
                          all(c_across(col1:col3) == "neg", na.rm = TRUE) ~ "neg",
                          all(c_across(col1:col3) %in% c("pos", "neg", NA)) ~ "both",
                          TRUE ~ "unknown")) %>% 
  ungroup()
# A tibble: 9 × 5
  Name   col1  col2  col3  col4   
  <chr>  <chr> <chr> <chr> <chr>  
1 group1 pos   pos   pos   pos    
2 group2 neg   pos   NA    both   
3 group3 NA    NA    pos   pos    
4 group4 pos   pos   NA    pos    
5 group5 neg   neg   neg   neg    
6 group6 NA    NA    neg   neg    
7 group7 pos   neg   neg   both   
8 group8 NA    NA    NA    NA     
9 group9 pos   pos   typo  unknown

Data

toy.df <- structure(list(Name = c("group1", "group2", "group3", "group4", 
"group5", "group6", "group7", "group8", "group9"), col1 = c("pos", 
"neg", "NA", "pos", "neg", "NA", "pos", NA, "pos"), col2 = c("pos", 
"pos", "NA", "pos", "neg", "NA", "neg", NA, "pos"), col3 = c("pos", 
"NA", "pos", "NA", "neg", "neg", "neg", NA, "typo")), class = "data.frame", row.names = c(NA, 
-9L))
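The order of the case_when conditions matters here, in particular for the all-NA check that comes first. The base R sketch below (the vectors x and z are illustrative, not from the answer) shows why: with na.rm = TRUE, a row consisting only of NAs would otherwise satisfy the "pos" condition.

```r
# A row in which every value is missing
x <- c(NA_character_, NA_character_, NA_character_)

# With na.rm = TRUE the NAs are dropped, and all() of nothing is TRUE,
# so both comparisons succeed for an all-NA row
all_pos <- all(x == "pos", na.rm = TRUE)
all_neg <- all(x == "neg", na.rm = TRUE)
c(all_pos, all_neg)
# [1] TRUE TRUE

# A mixed row: NAs are ignored, so only the non-missing values decide
z <- c("pos", NA, "pos")
z_pos <- all(z == "pos", na.rm = TRUE)
z_neg <- all(z == "neg", na.rm = TRUE)
c(z_pos, z_neg)
# [1]  TRUE FALSE
```

Because case_when returns the value of the first condition that is TRUE, checking all(is.na(...)) before the "pos" and "neg" conditions keeps an all-NA row from being labelled "pos".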

Answer 2

Score: 1

Another way:

toy.df$group6 <- apply(toy.df, 1, \(x) {
  # Distinct values in col1:col3, sorted so the string "NA" (if present) comes first
  val <- sort(unique(x[2:4]))
  # Drop the literal "NA" so that only "pos" and/or "neg" remain
  if (val[1] == "NA") val <- val[2:length(val)]
  if (length(val) == 2) {
    "both"
  } else if (val == "pos") {
    "pos"
  } else {
    "neg"
  }
})
toy.df

out:

    Name col1 col2 col3 group6
1 group1  pos  pos  pos    pos
2 group2  neg  pos   NA   both
3 group3   NA   NA  pos    pos
4 group4  pos  pos   NA    pos
5 group5  neg  neg  neg    neg
6 group6   NA   NA  neg    neg
7 group7  pos  neg  neg   both
