Conditionally mutate multiple columns in R

Question


I have a dataframe with a factor column with j levels, as well as j vectors of length k. I would like to populate k columns in the former dataframe with values from the latter vectors, conditional on the factor.

Simplified example (three levels, three vectors, two values):

    df1 <- data.frame("Factor" = rep(c("A", "B", "C"), times = 5))
    vecA <- c(1, 2)
    vecB <- c(2, 1)
    vecC <- c(3, 3)

Here is a solution using nested ifelse statements:

    library(tidyverse)

    df1 %>%
      mutate(V1 = ifelse(Factor == "A", vecA[1],
                         ifelse(Factor == "B", vecB[1], vecC[1])),
             V2 = ifelse(Factor == "A", vecA[2],
                         ifelse(Factor == "B", vecB[2], vecC[2])))

I would like to avoid the nested ifelse statements. Ideally, I would also like to avoid mutating each column separately.

Answer 1

Score: 1

Here is one idea. ls(pattern = "vec") lists every object in the global environment whose name contains "vec", and mget() collects those objects into a named list. For each element of the list, paste the numbers together with "_" in between. Then strip "vec" from the names so that they match the factor levels for the join. After the join, split the values column with cSplit(). I hope this approach applies to your real situation.

    library(tidyverse)
    library(splitstackshape)

    # Create a character vector.
    mychr <- map_chr(.x = mget(ls(pattern = "vec")),
                     .f = function(x) {paste0(x, collapse = "_")})

    # Remove "vec" in names.
    names(mychr) <- sub(x = names(mychr), pattern = "vec", replacement = "")

    #     A     B     C
    # "1_2" "2_1" "3_3"

    # stack() creates a data frame. Use it in left_join().
    # Then, split the values column into two columns. You probably have more
    # than two, so I decided to use cSplit() here.
    left_join(df1, stack(mychr), by = c("Factor" = "ind")) %>%
      cSplit(splitCols = "values", sep = "_", direction = "wide", type.convert = FALSE)

    #     Factor values_1 values_2
    #  1:      A        1        2
    #  2:      B        2        1
    #  3:      C        3        3
    #  4:      A        1        2
    #  5:      B        2        1
    #  6:      C        3        3
    #  7:      A        1        2
    #  8:      B        2        1
    #  9:      C        3        3
    # 10:      A        1        2
    # 11:      B        2        1
    # 12:      C        3        3
    # 13:      A        1        2
    # 14:      B        2        1
    # 15:      C        3        3
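
The join idea above can also be sketched without pasting and splitting strings, by binding the vectors into a small lookup table first. This is only a minimal sketch and not part of the original answer: the names vecs, lookup and result are hypothetical, and the vectors are listed explicitly instead of being collected with mget().

```r
library(dplyr)

# Example data from the question
df1 <- data.frame(Factor = rep(c("A", "B", "C"), times = 5))
vecA <- c(1, 2)
vecB <- c(2, 1)
vecC <- c(3, 3)

# Hypothetical helper list: one entry per factor level
vecs <- list(A = vecA, B = vecB, C = vecC)

# Lookup table: one row per level, one column per vector position
lookup <- data.frame(Factor = names(vecs), do.call(rbind, vecs))
names(lookup)[-1] <- paste0("V", seq_along(vecA))

# The join fills every V column at once and keeps the row order of df1
result <- left_join(df1, lookup, by = "Factor")
head(result, 3)
```

Because the lookup table has one column per vector position, the same code should carry over to longer vectors without further changes, as long as all vectors have the same length.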

Answer 2

Score: 1

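Here is a base R option. Note that the assignment recycles the three values of each new column down the rows, so it gives the right result only because Factor cycles through A, B, C in order, as it does in the example: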

    df1[c('V1', 'V2')] <- do.call(Map, c(f = c, mget(ls(pattern = "^vec[A-C]$"))))
    df1

    #    Factor V1 V2
    # 1       A  1  2
    # 2       B  2  1
    # 3       C  3  3
    # 4       A  1  2
    # 5       B  2  1
    # 6       C  3  3
    # 7       A  1  2
    # 8       B  2  1
    # 9       C  3  3
    # 10      A  1  2
    # 11      B  2  1
    # 12      C  3  3
    # 13      A  1  2
    # 14      B  2  1
    # 15      C  3  3

Or with transpose from purrr:

    library(dplyr)
    library(purrr)

    mget(ls(pattern = "^vec[A-C]$")) %>%
      transpose %>%
      setNames(c('V1', 'V2')) %>%
      cbind(df1, .)
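
If the rows are not guaranteed to follow the A, B, C pattern, a base R variant can look up each row by its factor level instead of relying on row order. The following is a minimal sketch under that assumption, not the answerer's code: the name lookup is hypothetical, and df1 is deliberately shuffled here to show that the order of the rows does not matter.

```r
# Deliberately unordered factor column (an assumption for illustration)
df1 <- data.frame(Factor = c("B", "A", "C", "C", "A"))
vecA <- c(1, 2)
vecB <- c(2, 1)
vecC <- c(3, 3)

# Matrix with one row per level; row names act as lookup keys
lookup <- rbind(A = vecA, B = vecB, C = vecC)

# Index the matrix rows by level name; as.character() also covers
# the case where Factor is stored as a factor
df1[c("V1", "V2")] <- lookup[as.character(df1$Factor), ]
df1
```

Indexing by row name does the matching in a single step, so there is no need for a join or for a separate mutate per column.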

Answer 3

Score: 0

Here's one way to do it:

    # Collect the vectors in a list named by factor level
    l <- list('A' = vecA, 'B' = vecB, 'C' = vecC)

    # Look up the vector for each row; as.character() and USE.NAMES = FALSE
    # keep this working whether Factor is a factor or a character column
    df2 <- data.frame(t(sapply(as.character(df1$Factor), function(x) l[[x]],
                               USE.NAMES = FALSE)))
    colnames(df2) <- c('V1', 'V2')
    new_df <- cbind(df1, df2)
    new_df

    #    Factor V1 V2
    # 1       A  1  2
    # 2       B  2  1
    # 3       C  3  3
    # 4       A  1  2
    # 5       B  2  1
    # 6       C  3  3
    # 7       A  1  2
    # 8       B  2  1
    # 9       C  3  3
    # 10      A  1  2
    # 11      B  2  1
    # 12      C  3  3
    # 13      A  1  2
    # 14      B  2  1
    # 15      C  3  3

huangapple
  • Posted on January 6, 2020, 21:01:57
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59612602.html