How to create a new column based on TRUE/FALSE values in other columns?

Question

I have multiple columns that contain TRUE and FALSE values, and I want to create a new column that contains the names of the columns that are TRUE. It should look like the example below, where color is the new column.

                color   red yellow orange  blue
    1            blue FALSE  FALSE  FALSE  TRUE
    2       red, blue  TRUE  FALSE  FALSE  TRUE
    3     blue, green FALSE  FALSE  FALSE  TRUE
    4          purple FALSE  FALSE  FALSE FALSE
    5  yellow, orange FALSE   TRUE   TRUE FALSE

I tried to use the case_when function, but there are too many permutations to cover.

Answer 1

Score: 2

You could subset the column names inside apply() and attach the result with cbind(). The \(x) lambda shorthand requires R 4.1 or later.

    # For each row, collapse the names of the TRUE columns; NA if none are TRUE
    cbind(dat, clr = apply(dat[-1], 1, \(x) if (any(x)) toString(names(dat)[-1][x]) else NA))
    #            color   red yellow orange  blue            clr
    # 1           blue FALSE  FALSE  FALSE  TRUE           blue
    # 2      red, blue  TRUE  FALSE  FALSE  TRUE      red, blue
    # 3    blue, green FALSE  FALSE  FALSE  TRUE           blue
    # 4         purple FALSE  FALSE  FALSE FALSE           <NA>
    # 5 yellow, orange FALSE   TRUE   TRUE FALSE yellow, orange

Data:

    dat <- structure(list(color = c("blue", "red, blue", "blue, green",
    "purple", "yellow, orange"), red = c(FALSE, TRUE, FALSE, FALSE,
    FALSE), yellow = c(FALSE, FALSE, FALSE, FALSE, TRUE), orange = c(FALSE,
    FALSE, FALSE, FALSE, TRUE), blue = c(TRUE, TRUE, TRUE, FALSE,
    FALSE)), class = "data.frame", row.names = c(NA, -5L))
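The apply() call above can also be wrapped in a small reusable function. The following is only a sketch under assumptions: the helper name true_col_names and its empty argument are hypothetical (they do not appear in the answer), and the input is assumed to hold nothing but logical columns.

```r
# Hypothetical helper (not part of the original answer): for each row, collapse
# the names of the TRUE columns into one comma-separated string.
true_col_names <- function(flags, empty = NA_character_) {
  m <- as.matrix(flags)                  # logical matrix, one row per observation
  stopifnot(is.logical(m))               # assumption: only logical columns are passed in
  vapply(seq_len(nrow(m)), function(i) {
    hit <- colnames(m)[which(m[i, ])]    # which() ignores NA flags
    if (length(hit)) toString(hit) else empty
  }, character(1))
}

# Same logical columns as in the answer's data
dat <- data.frame(
  red    = c(FALSE, TRUE, FALSE, FALSE, FALSE),
  yellow = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  orange = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  blue   = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)

dat$clr <- true_col_names(dat)
print(dat)
```

Using vapply() with character(1) guarantees a character vector even when every row is all FALSE, and which() keeps an NA flag from turning into an NA column name.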

Answer 2

Score: 0

I would use the tidyverse and build the new column separately before joining it back (there are probably several ways to do this). This assumes df contains only the logical columns.

    library(dplyr)
    library(tidyr)

    # Add a temporary id column so the result can be joined back
    df <- df %>%
      mutate(id = row_number())

    # Reshape to long format, keep the TRUE cells, and collapse the names per row
    df_new_col <- df %>%
      pivot_longer(!id, names_to = "color", values_to = "presence") %>%
      filter(presence) %>%
      group_by(id) %>%
      summarise(
        Color = paste0(color, collapse = ", ")
      )

    # Join the new column back and remove the temporary id
    df <- df %>%
      left_join(df_new_col, by = "id") %>%
      select(-id)

I do it this way in case some rows are all FALSE: those rows drop out of df_new_col after filter(), and left_join() keeps them in df with NA in Color.
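The same long-format idea (one record per TRUE cell, collapsed per row, then matched back by row id) can be sketched in base R without any packages. This is an analogue under assumptions, not the answerer's code: the object names flags, hits and collapsed are hypothetical, and the data frame is assumed to contain only logical columns.

```r
# Logical columns only (same values as the example data)
flags <- data.frame(
  red    = c(FALSE, TRUE, FALSE, FALSE, FALSE),
  yellow = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  orange = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  blue   = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)

m <- as.matrix(flags)

# "Long" view: one (row, col) pair per TRUE cell, like pivot_longer() + filter()
hits <- which(m, arr.ind = TRUE)

# Collapse the column names per row id, like group_by() + summarise()
collapsed <- tapply(colnames(m)[hits[, "col"]], hits[, "row"], toString)

# "Left join" back by row id: rows without any TRUE cell stay NA
flags$Color <- NA_character_
flags$Color[as.integer(names(collapsed))] <- as.character(collapsed)

print(flags)
```

Because which() walks the matrix column by column, the names within each row come out in column order, and row 4 (all FALSE) never appears in hits, so it keeps its NA.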

Answer 3

Score: 0

Another dplyr approach:

    library(dplyr)

    # Row by row, keep the names of the columns whose value is TRUE
    df %>%
      rowwise() %>%
      mutate(color = toString(names(.)[c_across(everything())])) %>%
      ungroup()

Output:

    # A tibble: 5 × 5
      red   yellow orange blue  color
      <lgl> <lgl>  <lgl>  <lgl> <chr>
    1 FALSE FALSE  FALSE  TRUE  "blue"
    2 TRUE  FALSE  FALSE  TRUE  "red, blue"
    3 FALSE FALSE  FALSE  TRUE  "blue"
    4 FALSE FALSE  FALSE  FALSE ""
    5 FALSE TRUE   TRUE   FALSE "yellow, orange"
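The output above represents the all-FALSE row as an empty string, whereas the apply() approach in Answer 1 returns NA. If NA is preferred, one possible post-processing step is sketched below in base R. The vector color is typed in by hand here purely for illustration; in practice it would be the column produced by mutate().

```r
# Illustrative copy of the color column from the output above
color <- c("blue", "red, blue", "blue", "", "yellow, orange")

# Turn empty strings into missing values; nzchar() is FALSE only for ""
color[!nzchar(color)] <- NA_character_

print(color)
```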

Answer 4

Score: 0

We could use the tidyverse as follows. Note that across() overwrites each logical column with its own name (or NA), as the output shows.

    library(dplyr)
    library(tidyr)

    # Replace each TRUE with its column name (FALSE becomes NA),
    # then paste the non-NA names of each row together
    df1 %>%
      mutate(across(red:blue, ~ case_when(.x ~ cur_column()))) %>%
      unite(color, red:blue, na.rm = TRUE, sep = ", ", remove = FALSE)

Output:

               color  red yellow orange blue
    1           blue <NA>   <NA>   <NA> blue
    2      red, blue  red   <NA>   <NA> blue
    3           blue <NA>   <NA>   <NA> blue
    4                <NA>   <NA>   <NA> <NA>
    5 yellow, orange <NA> yellow orange <NA>

huangapple
  • Posted on March 12, 2023 at 19:09:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75712703.html