Conditionally mutate multiple columns in R
Question
I have a dataframe with a factor column with j levels, as well as j vectors of length k. I would like to populate k columns in the former dataframe with values from the latter vectors, conditional on the factor.
Simplified example (three levels, three vectors, two values):
df1 <- data.frame("Factor" = rep(c("A", "B", "C"), times = 5))
vecA <- c(1, 2)
vecB <- c(2, 1)
vecC <- c(3, 3)
Here is a solution using nested ifelse statements:
library(tidyverse)
df1 %>%
  mutate(V1 = ifelse(Factor == "A", vecA[1],
                     ifelse(Factor == "B", vecB[1], vecC[1])),
         V2 = ifelse(Factor == "A", vecA[2],
                     ifelse(Factor == "B", vecB[2], vecC[2])))
I would like to avoid the nested ifelse statements. Ideally, I would also like to avoid mutating each column separately.
Answer 1
Score: 1
Here is one idea. Use mget() to collect every object in the global environment whose name begins with "vec"; this gives a list. Collapse each element of the list into a single string, with "_" between the numbers. Then strip "vec" from the names so that they match the factor levels for the join. After the join, split the values column with cSplit(). I hope this approach is applicable to your real situation.
library(tidyverse)
library(splitstackshape)
# Create a character vector.
mychr <- map_chr(.x = mget(ls(pattern = "^vec")),
                 .f = function(x) {paste0(x, collapse = "_")})
# Remove "vec" in names.
names(mychr) <- sub(x = names(mychr), pattern = "vec", replacement = "")
# A B C
#"1_2" "2_1" "3_3"
# stack() turns the named vector into a data frame (columns: values, ind).
# Join it to df1, then split the values column into separate columns.
# cSplit() handles any number of values, so it still works with more than two.
left_join(df1, stack(mychr), by = c("Factor" = "ind")) %>%
  cSplit(splitCols = "values", sep = "_", direction = "wide", type.convert = FALSE)
# Factor values_1 values_2
# 1: A 1 2
# 2: B 2 1
# 3: C 3 3
# 4: A 1 2
# 5: B 2 1
# 6: C 3 3
# 7: A 1 2
# 8: B 2 1
# 9: C 3 3
#10: A 1 2
#11: B 2 1
#12: C 3 3
#13: A 1 2
#14: B 2 1
#15: C 3 3
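The same join idea can be sketched without pasting the values into strings and splitting them again, which also keeps the new columns numeric. This is only a sketch under assumptions: the vectors are collected by hand into a named list (vecs), and lookup and the generated column names V1, V2, ... are illustrative names, not part of the answer above.

```r
library(dplyr)

df1 <- data.frame(Factor = rep(c("A", "B", "C"), times = 5))

# Assumption: the vectors are gathered into a list named by factor level.
vecs <- list(A = c(1, 2), B = c(2, 1), C = c(3, 3))

# One row per level, one column per value.
vals <- do.call(rbind, vecs)
colnames(vals) <- paste0("V", seq_len(ncol(vals)))

# Lookup table with the level as a key column.
lookup <- data.frame(Factor = rownames(vals), vals, row.names = NULL)

# A single join fills all k columns at once.
result <- left_join(df1, lookup, by = "Factor")
head(result, 3)
```

Because the column names are generated from ncol(vals), nothing needs to change when the vectors get longer.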
Answer 2
Score: 1
Here is a base R option:
df1[c('V1', 'V2')] <- do.call(Map, c(f = c, mget(ls(pattern="^vec[A-C]$"))))
df1
# Factor V1 V2
#1 A 1 2
#2 B 2 1
#3 C 3 3
#4 A 1 2
#5 B 2 1
#6 C 3 3
#7 A 1 2
#8 B 2 1
#9 C 3 3
#10 A 1 2
#11 B 2 1
#12 C 3 3
#13 A 1 2
#14 B 2 1
#15 C 3 3
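Note that the assignment above never looks at Factor: it recycles the transposed vectors down the rows, so it gives the right answer only when the rows cycle through the levels in the same order as the vectors (A, B, C, A, B, C, ...). If the rows can come in any order, a lookup by factor value is safer. A minimal sketch, assuming the vectors are collected into a named list (vecs, lookup, idx and df_shuffled are illustrative names):

```r
# Assumption: the vectors are gathered into a list named by factor level.
vecs <- list(A = c(1, 2), B = c(2, 1), C = c(3, 3))

# 3 x 2 matrix; row i holds the values for names(vecs)[i].
lookup <- unname(do.call(rbind, vecs))

# Rows deliberately not in A, B, C order.
df_shuffled <- data.frame(Factor = c("C", "A", "B", "A", "C"))

# Row of the lookup matrix that belongs to each row of the data frame.
idx <- match(as.character(df_shuffled$Factor), names(vecs))

# Fill all k columns in one assignment.
new_cols <- paste0("V", seq_len(ncol(lookup)))
df_shuffled[new_cols] <- lapply(seq_len(ncol(lookup)),
                                function(j) lookup[idx, j])
df_shuffled
```

match() returns NA for a level that has no vector, so such rows end up with NA in the new columns rather than with values borrowed from another level.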
Or with transpose() from purrr:
library(dplyr)
library(purrr)
mget(ls(pattern="^vec[A-C]$")) %>%
  transpose %>%
  setNames(c('V1', 'V2')) %>%
  cbind(df1, .)
Answer 3
Score: 0
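Here's one way to do it:
# collect the vectors in a list named by factor level
l <- list('A' = vecA, 'B' = vecB, 'C' = vecC)
# look up the vector for each row; as.character() makes the lookup go by
# level name even when Factor is stored as a factor (the default before R 4.0)
df2 <- data.frame(t(sapply(as.character(df1$Factor), function(x) l[[x]],
                           USE.NAMES = FALSE)))
colnames(df2) <- c('V1', 'V2')
new_df <- cbind(df1, df2)
new_df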
Factor V1 V2
1 A 1 2
2 B 2 1
3 C 3 3
4 A 1 2
5 B 2 1
6 C 3 3
7 A 1 2
8 B 2 1
9 C 3 3
10 A 1 2
11 B 2 1
12 C 3 3
13 A 1 2
14 B 2 1
15 C 3 3