How to create new column based on true/false in other columns?
Question
I have multiple columns that contain TRUE and FALSE values, and I want to create a new column that contains the names of the columns that are TRUE in each row. It should look like the example below, where color is the new column.
           color   red yellow orange  blue
1           blue FALSE  FALSE  FALSE  TRUE
2      red, blue  TRUE  FALSE  FALSE  TRUE
3    blue, green FALSE  FALSE  FALSE  TRUE
4         purple FALSE  FALSE  FALSE FALSE
5 yellow, orange FALSE   TRUE   TRUE FALSE
I have tried to use the case_when function, but there are too many permutations to cover.
Answer 1
Score: 2
You could subset names() inside apply() and attach the result with cbind().
cbind(dat, clr = apply(dat[-1], 1, \(x) if (any(x)) toString(names(dat)[-1][x]) else NA))
#            color   red yellow orange  blue            clr
# 1           blue FALSE  FALSE  FALSE  TRUE           blue
# 2      red, blue  TRUE  FALSE  FALSE  TRUE      red, blue
# 3    blue, green FALSE  FALSE  FALSE  TRUE           blue
# 4         purple FALSE  FALSE  FALSE FALSE           <NA>
# 5 yellow, orange FALSE   TRUE   TRUE FALSE yellow, orange
Data:
dat <- structure(list(color = c("blue", "red, blue", "blue, green",
"purple", "yellow, orange"), red = c(FALSE, TRUE, FALSE, FALSE,
FALSE), yellow = c(FALSE, FALSE, FALSE, FALSE, TRUE), orange = c(FALSE,
FALSE, FALSE, FALSE, TRUE), blue = c(TRUE, TRUE, TRUE, FALSE,
FALSE)), class = "data.frame", row.names = c(NA, -5L))
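The same idea can be wrapped in a small reusable function. This is only a sketch in base R: the helper name true_col_names and its cols argument are hypothetical, not part of the answer, and it assumes the selected columns are all logical with no missing values.

```r
# Hypothetical helper (not from the answer): for each row, collapse the names
# of the columns that are TRUE into one comma-separated string.
true_col_names <- function(data, cols) {
  flags <- as.matrix(data[cols])  # logical matrix, one row per record
  apply(flags, 1, function(x) {
    # Rows with no TRUE value get NA instead of an empty string
    if (any(x)) toString(cols[x]) else NA_character_
  })
}

# Same logical columns as in the data above
dat <- data.frame(
  red    = c(FALSE, TRUE, FALSE, FALSE, FALSE),
  yellow = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  orange = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  blue   = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)

dat$clr <- true_col_names(dat, c("red", "yellow", "orange", "blue"))
print(dat)
```

Passing the column names explicitly avoids the positional dat[-1] indexing, so the helper keeps working if the flag columns are not the last ones in the data frame.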
Answer 2
Score: 0
I would use the tidyverse and build the new column in a separate step first (there are probably several ways to do this):
library(dplyr)
library(tidyr)

# Prepare the data by adding an id column (assumes df holds only the logical columns)
df <- df %>%
  mutate(id = row_number())

# Compute the new column with the colors
df_new_col <- df %>%
  pivot_longer(!id, names_to = "color", values_to = "presence") %>%
  filter(presence) %>%
  group_by(id) %>%
  summarise(
    Color = paste0(color, collapse = ", ")
  )

# Add the new column, and remove the temporary id
df <- df %>%
  left_join(df_new_col, by = "id") %>%
  select(-id)
I do it this way in case some rows are all FALSE: those rows drop out at the filter step, and the left join brings them back with NA in the new column.
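To see the all-FALSE case in action, here is a minimal self-contained sketch of the same pipeline. The sample data frame is an assumption that copies the logical columns from the question's example, and the intermediate names long_true and collapsed are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Assumed sample data: only the logical columns from the question
df <- data.frame(
  red    = c(FALSE, TRUE, FALSE, FALSE, FALSE),
  yellow = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  orange = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  blue   = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)

with_id <- df %>% mutate(id = row_number())

# Long format, keeping only the TRUE cells; row 4 has none, so it disappears here
long_true <- with_id %>%
  pivot_longer(!id, names_to = "color", values_to = "presence") %>%
  filter(presence)

# One string of column names per remaining id
collapsed <- long_true %>%
  group_by(id) %>%
  summarise(Color = toString(color))

# The left join restores row 4, with NA in the new column
result <- with_id %>%
  left_join(collapsed, by = "id") %>%
  select(-id)

print(result)
```

Joining back onto the original rows, rather than using the summarised table directly, is what keeps the row count at five.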
Answer 3
Score: 0
Another dplyr way:
library(dplyr)

df %>%
  rowwise() %>%
  mutate(color = toString(names(.)[c_across(everything())])) %>%
  ungroup()
Output:
# A tibble: 5 × 5
  red   yellow orange blue  color
  <lgl> <lgl>  <lgl>  <lgl> <chr>
1 FALSE FALSE  FALSE  TRUE  "blue"
2 TRUE  FALSE  FALSE  TRUE  "red, blue"
3 FALSE FALSE  FALSE  FALSE ""
5 FALSE TRUE   TRUE   FALSE "yellow, orange"
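Note that the all-FALSE row comes out as an empty string here rather than NA, because toString() on a zero-length vector returns "". If NA is preferred, one possible follow-up is dplyr's na_if(). The sketch below is an assumption layered on top of the answer, using the question's logical columns as sample data and names(df) in place of names(.).

```r
library(dplyr)

# Assumed sample data: only the logical columns from the question
df <- data.frame(
  red    = c(FALSE, TRUE, FALSE, FALSE, FALSE),
  yellow = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  orange = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  blue   = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)

out <- df %>%
  rowwise() %>%
  # Row by row, keep the names of the columns whose value is TRUE
  mutate(color = toString(names(df)[c_across(everything())])) %>%
  ungroup() %>%
  # Turn the empty string produced for all-FALSE rows into NA
  mutate(color = na_if(color, ""))

print(out)
```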
Answer 4
Score: 0
We could use the tidyverse as follows:
library(dplyr)
library(tidyr)

df1 %>%
  mutate(across(red:blue, ~ case_when(.x ~ cur_column()))) %>%
  unite(color, red:blue, na.rm = TRUE, sep = ", ", remove = FALSE)
Output:
           color  red yellow orange blue
1           blue <NA>   <NA>   <NA> blue
2      red, blue  red   <NA>   <NA> blue
3           blue <NA>   <NA>   <NA> blue
4                <NA>   <NA>   <NA> <NA>
5 yellow, orange <NA> yellow orange <NA>