March 12, 2023, 19:09:56

How to create new column based on true/false in other columns?

Question

I have multiple columns that contain TRUE and FALSE values, and I want to create a new column containing the names of the columns that are TRUE. It should look like the example below.

The color column is the new column I need.

           color   red yellow orange  blue
1           blue FALSE  FALSE  FALSE  TRUE
2      red, blue  TRUE  FALSE  FALSE  TRUE
3    blue, green FALSE  FALSE  FALSE  TRUE
4         purple FALSE  FALSE  FALSE FALSE
5 yellow, orange FALSE   TRUE   TRUE FALSE

I tried to use the case_when function, but there are too many permutations to cover.

Answer 1

Score: 2

You could subset the column names inside apply and attach the result with cbind. Rows with no TRUE value get NA.

cbind(dat, clr=apply(dat[-1], 1, \(x) if (any(x)) toString(names(dat)[-1][x]) else NA))
#            color   red yellow orange  blue            clr
# 1           blue FALSE  FALSE  FALSE  TRUE           blue
# 2      red, blue  TRUE  FALSE  FALSE  TRUE      red, blue
# 3    blue, green FALSE  FALSE  FALSE  TRUE           blue
# 4         purple FALSE  FALSE  FALSE FALSE           <NA>
# 5 yellow, orange FALSE   TRUE   TRUE FALSE yellow, orange

Data:

dat <- structure(list(color = c("blue", "red, blue", "blue, green", 
"purple", "yellow, orange"), red = c(FALSE, TRUE, FALSE, FALSE, 
FALSE), yellow = c(FALSE, FALSE, FALSE, FALSE, TRUE), orange = c(FALSE, 
FALSE, FALSE, FALSE, TRUE), blue = c(TRUE, TRUE, TRUE, FALSE, 
FALSE)), class = "data.frame", row.names = c(NA, -5L))
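To see why this works, it may help to look at a single row first. The sketch below uses a hypothetical data frame named flags that holds only the four logical columns from the question (the name and the row2 helper are assumptions for illustration). It shows that indexing the column names with a logical vector keeps only the TRUE positions, and then applies the same idea to every row.

```r
# Hypothetical toy data: only the logical indicator columns from the question
flags <- data.frame(
  red    = c(FALSE, TRUE, FALSE, FALSE, FALSE),
  yellow = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  orange = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  blue   = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)

# One row at a time: a logical index keeps the names at the TRUE positions
row2 <- unlist(flags[2, ])
names(flags)[row2]            # the names of the TRUE columns in row 2
toString(names(flags)[row2])  # "red, blue"

# The same idea for every row; rows with no TRUE become NA
clr <- apply(flags, 1, function(x) {
  if (any(x)) toString(names(flags)[x]) else NA_character_
})
clr
```

Returning NA_character_ rather than a bare NA keeps the result a character vector even if every row happens to be all FALSE.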

Answer 2

Score: 0

I would use the tidyverse and build the new column separately first (there are probably several ways to do this):

library(dplyr)
library(tidyr)

# Prepare the data to add the id column
df <- df %>%
  mutate(id = row_number())
# Compute the new column with the colors
df_new_col <- df %>%
  pivot_longer(!id, names_to = "color", values_to = "presence") %>%
  filter(presence) %>%
  group_by(id) %>%
  summarise(
    Color = paste0(color, collapse = ", ")
  )
# Add the new column, and remove the temporary id
df <- df %>%
  left_join(df_new_col, by = "id") %>%
  select(-id)

I do it this way in case some rows are all FALSE: those rows drop out at the filter step, and the left join brings them back with NA in the new column.
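For readers without the tidyverse, the same long-format idea can be sketched in base R. This is only an analogue under stated assumptions, not the answer's method: flags, hits, by_row and Color are hypothetical names, and the data is the four logical columns from the question. which(arr.ind = TRUE) plays the role of pivoting and filtering, and splitting on a factor that lists every row number keeps the all-FALSE rows.

```r
# Hypothetical toy data: only the logical indicator columns from the question
flags <- data.frame(
  red    = c(FALSE, TRUE, FALSE, FALSE, FALSE),
  yellow = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  orange = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  blue   = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)
m <- as.matrix(flags)

# "Long" form: one (row, col) pair per TRUE cell
hits <- which(m, arr.ind = TRUE)

# Group the matching column names by row; listing every row number as a
# factor level keeps rows that have no TRUE cell (they get an empty group)
by_row <- split(
  colnames(m)[hits[, "col"]],
  factor(hits[, "row"], levels = seq_len(nrow(m)))
)

# Collapse each group; empty groups become NA, like the left join does
Color <- vapply(
  by_row,
  function(v) if (length(v)) toString(v) else NA_character_,
  character(1)
)
flags$Color <- unname(Color)
flags
```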

Answer 3

Score: 0

Another dplyr way:

library(dplyr)
df %>%
  rowwise() %>%
  mutate(color = toString(names(.)[c_across(everything())])) %>%
  ungroup()

Output:

# A tibble: 5 × 5
  red   yellow orange blue  color           
  <lgl> <lgl>  <lgl>  <lgl> <chr>           
1 FALSE FALSE  FALSE  TRUE  "blue"          
2 TRUE  FALSE  FALSE  TRUE  "red, blue"     
3 FALSE FALSE  FALSE  TRUE  "blue"          
4 FALSE FALSE  FALSE  FALSE ""              
5 FALSE TRUE   TRUE   FALSE "yellow, orange"
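Row 4 in the output above is an empty string rather than NA. A minimal base R sketch of why, and of one possible way to convert it afterwards, is shown below. The vectors cols, none and color are hypothetical stand-ins for illustration, not objects from the answer.

```r
# Indexing with all FALSE leaves a zero-length vector,
# and toString() of a zero-length vector is the empty string
cols <- c("red", "yellow", "orange", "blue")
none <- cols[c(FALSE, FALSE, FALSE, FALSE)]
length(none)    # 0
toString(none)  # ""

# Hypothetical post-processing: turn empty strings into NA
color <- c("blue", "red, blue", "blue", "", "yellow, orange")
color[!nzchar(color)] <- NA_character_
color
```

Whether "" or NA is preferable depends on how the column is used later; NA makes the all-FALSE rows easy to find with is.na().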

Answer 4

Score: 0

We could use the tidyverse as follows:

library(dplyr)
library(tidyr)
df1 %>%
  mutate(across(red:blue, ~ case_when(.x ~ cur_column()))) %>%
  unite(color, red:blue, na.rm = TRUE, sep = ", ", remove = FALSE)

Output:

           color  red yellow orange blue
1           blue <NA>   <NA>   <NA> blue
2      red, blue  red   <NA>   <NA> blue
3           blue <NA>   <NA>   <NA> blue
4                <NA>   <NA>   <NA> <NA>
5 yellow, orange <NA> yellow orange <NA>