January 6, 2020 21:01:57

Conditionally mutate multiple columns in R

Question

I have a dataframe with a factor column with j levels, as well as j vectors of length k. I would like to populate k columns in the former dataframe with values from the latter vectors, conditional on the factor.

Simplified example (three levels, three vectors, two values):

df1 <- data.frame("Factor" = rep(c("A", "B", "C"), times = 5))
vecA <- c(1, 2)
vecB <- c(2, 1)
vecC <- c(3, 3)

Here is a solution using nested ifelse statements:

library(tidyverse)
df1 %>%
  mutate(V1 = ifelse(Factor == "A", vecA[1],
                     ifelse(Factor == "B", vecB[1], vecC[1])),
         V2 = ifelse(Factor == "A", vecA[2],
                     ifelse(Factor == "B", vecB[2], vecC[2])))

I would like to avoid the nested ifelse statements. Ideally, I would also like to avoid mutating each column separately.

Answer 1

Score: 1

Here is one idea. Use mget() to collect every object in the global environment whose name matches the pattern "vec" into a list. For each element of the list, paste the numbers together with "_" as the separator. Then strip "vec" from the names so that they match the factor levels in the join that follows. After the join, split the values column into separate columns with cSplit(). I hope this approach will be applicable to your real situation.

library(tidyverse)
library(splitstackshape)
# Create a character vector.
mychr <- map_chr(.x = mget(ls(pattern = "vec")),
                 .f = function(x) {paste0(x, collapse = "_")})
# Remove "vec" in names.
names(mychr) <- sub(x = names(mychr), pattern = "vec", replacement = "")
#     A     B     C 
# "1_2" "2_1" "3_3"
# stack() creates a data frame. Use it in left_join().
# Then, split the column, values into two columns. You probably have more than
# two. So I decided to use cSplit() here.
left_join(df1, stack(mychr), by = c("Factor" = "ind")) %>%
  cSplit(splitCols = "values", sep = "_", direction = "wide", type.convert = FALSE)
#    Factor values_1 values_2
# 1:      A        1        2
# 2:      B        2        1
# 3:      C        3        3
# 4:      A        1        2
# 5:      B        2        1
# 6:      C        3        3
# 7:      A        1        2
# 8:      B        2        1
# 9:      C        3        3
#10:      A        1        2
#11:      B        2        1
#12:      C        3        3
#13:      A        1        2
#14:      B        2        1
#15:      C        3        3
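
The paste-then-split round trip above can be avoided by binding the vectors into a small lookup table first and joining on it. The following is only a minimal sketch under stated assumptions: the vectors are collected by hand into a named list, and the names `vecs`, `lookup`, and `result` are illustrative and do not come from the original answer.

```r
library(dplyr)

# Example data from the question (stringsAsFactors = FALSE keeps Factor as
# character on every R version, so the join keys have the same type).
df1 <- data.frame(Factor = rep(c("A", "B", "C"), times = 5),
                  stringsAsFactors = FALSE)
vecA <- c(1, 2)
vecB <- c(2, 1)
vecC <- c(3, 3)

# Hypothetical helper: one list entry per factor level.
vecs <- list(A = vecA, B = vecB, C = vecC)

# rbind() stacks the vectors into a matrix with one row per level;
# as.data.frame() names the unnamed columns V1, V2, ...
lookup <- as.data.frame(do.call(rbind, vecs))
lookup$Factor <- rownames(lookup)

# left_join() keeps the row order of df1 and adds all value columns at once.
result <- left_join(df1, lookup, by = "Factor")
head(result, 3)
```

Because nothing is converted to text, the added columns stay numeric, and the same code should carry over to more levels or longer vectors as long as every vector has the same length.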

Answer 2

Score: 1

Here is a base R option:

df1[c('V1', 'V2')] <- do.call(Map, c(f = c, mget(ls(pattern = "^vec[A-C]$"))))
df1
#    Factor V1 V2
#1       A  1  2
#2       B  2  1
#3       C  3  3
#4       A  1  2
#5       B  2  1
#6       C  3  3
#7       A  1  2
#8       B  2  1
#9       C  3  3
#10      A  1  2
#11      B  2  1
#12      C  3  3
#13      A  1  2
#14      B  2  1
#15      C  3  3

Or with transpose from purrr:

library(dplyr)
library(purrr)
mget(ls(pattern = "^vec[A-C]$")) %>%
     transpose %>%
     setNames(c('V1', 'V2')) %>%
     cbind(df1, .)
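
One caveat about the base R option above: it builds one length-3 vector per new column and lets R recycle it down the rows, so it matches the expected output only because Factor repeats A, B, C in that order. A minimal base R sketch that looks each row up by its factor value instead might look like the following; the shuffled data frame `df_shuffled` and the names `lookup`, `idx`, and `vals` are assumptions made for illustration.

```r
vecA <- c(1, 2)
vecB <- c(2, 1)
vecC <- c(3, 3)

# Deliberately unordered factor column (illustrative data, not from the question).
df_shuffled <- data.frame(Factor = c("C", "A", "A", "B", "C"),
                          stringsAsFactors = FALSE)

# One row per level, one column per value.
lookup <- rbind(A = vecA, B = vecB, C = vecC)
colnames(lookup) <- paste0("V", seq_len(ncol(lookup)))

# Row of the lookup matrix that belongs to each row of the data frame.
idx <- match(as.character(df_shuffled$Factor), rownames(lookup))

# Pull all value columns in one step and assign them together.
vals <- lookup[idx, , drop = FALSE]
rownames(vals) <- NULL
df_shuffled[colnames(vals)] <- vals
df_shuffled
```

Using as.character() before match() means the lookup goes by level label rather than by the factor's integer code, so it does not depend on how the levels happen to be ordered.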

Answer 3

Score: 0

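Here's one way to do it:

# collect the vectors in a list named by factor level
l <- list('A' = vecA, 'B' = vecB, 'C' = vecC)
# create df with mapping; as.character() makes the lookup go by level name
# even when Factor is stored as a factor, and USE.NAMES = FALSE avoids
# duplicated row names when it is stored as character
df2 = data.frame(t(sapply(df1$Factor, function(x) l[[as.character(x)]],
                          USE.NAMES = FALSE)))
colnames(df2) <- c('V1', 'V2')
new_df = cbind(df1, df2)
new_df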
   Factor V1 V2
1       A  1  2
2       B  2  1
3       C  3  3
4       A  1  2
5       B  2  1
6       C  3  3
7       A  1  2
8       B  2  1
9       C  3  3
10      A  1  2
11      B  2  1
12      C  3  3
13      A  1  2
14      B  2  1
15      C  3  3
