Looping over groups in R

Question

I have a data frame, df, built by row-binding three data frames, df1, df2, and df3, each of which follows this structure:

    df1 <- data.frame(year = c("2013", "2013", "2013", "2013", "2013", "2013"),
                      site = c("a", "a", "a", "a", "a", "a"),
                      trt = c("x", "y", "x", "y", "x", "y"),
                      cover = c(2, 5, 1, 20, 50, 12))
    df2 <- data.frame(year = c("2014", "2014", "2014", "2014", "2014", "2014"),
                      site = c("a", "a", "a", "a", "a", "a"),
                      trt = c("x", "y", "x", "y", "x", "y"),
                      cover = c(1, 3, 1, 24, 32, 12))
    df3 <- data.frame(year = c("2015", "2015", "2015", "2015", "2015", "2015"),
                      site = c("a", "a", "a", "a", "a", "a"),
                      trt = c("x", "y", "z", "z", "x", "y"),
                      cover = c(2, 5, 1, 2, 11, 32))
    df <- rbind(df1, df2, df3)
    df
       year site trt cover
    1  2013    a   x     2
    2  2013    a   y     5
    3  2013    a   x     1
    4  2013    a   y    20
    5  2013    a   x    50
    6  2013    a   y    12
    7  2014    a   x     1
    8  2014    a   y     3
    9  2014    a   x     1
    10 2014    a   y    24
    11 2014    a   x    32
    12 2014    a   y    12
    13 2015    a   x     2
    14 2015    a   y     5
    15 2015    a   z     1
    16 2015    a   z     2
    17 2015    a   x    11
    18 2015    a   y    32

Previously, I ranked the values in the cover column within each year using a for loop:

    library(dplyr)  # provides filter(), mutate(), dense_rank(), desc()

    v1 <- unique(df$year)
    lst <- list()
    for (i in seq_along(v1)) {
      # keep one year, then rank cover from largest (1) to smallest
      lst[[i]] <- df |>
        filter(year == v1[i]) |>
        mutate(rank = dense_rank(desc(cover)))
    }

Now I am trying to rank the values within each group (as defined by the trt column) for each year, but I am having trouble figuring out how. How can I do this with a for loop? I am also open to an answer that uses lapply, as I would like to learn about it.

Answer 1

Score: 1

Using dplyr, we can avoid both the loop and the filtering by calling group_by before mutate, and then build the list with group_split.

    library(dplyr)
    df |>
      group_by(year) |>
      mutate(rank = dense_rank(desc(cover))) |>
      group_split()

Output:

    [[1]]
    # A tibble: 6 × 5
      year  site  trt   cover  rank
      <chr> <chr> <chr> <dbl> <int>
    1 2013  a     x         2     5
    2 2013  a     y         5     4
    3 2013  a     x         1     6
    4 2013  a     y        20     2
    5 2013  a     x        50     1
    6 2013  a     y        12     3

    [[2]]
    # A tibble: 6 × 5
      year  site  trt   cover  rank
      <chr> <chr> <chr> <dbl> <int>
    1 2014  a     x         1     5
    2 2014  a     y         3     4
    3 2014  a     x         1     5
    4 2014  a     y        24     2
    5 2014  a     x        32     1
    6 2014  a     y        12     3

    [[3]]
    # A tibble: 6 × 5
      year  site  trt   cover  rank
      <chr> <chr> <chr> <dbl> <int>
    1 2015  a     x         2     4
    2 2015  a     y         5     3
    3 2015  a     z         1     5
    4 2015  a     z         2     4
    5 2015  a     x        11     2
    6 2015  a     y        32     1
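
The output above ranks cover within each year only. The question also asks for ranks within each trt group per year, and mentions lapply. The following is a minimal sketch of how that might look, not part of the original answer. It assumes the same df as in the question (rebuilt here in compact form so the block runs on its own). The names ranked, by_year, pieces, and ranked_lapply are illustrative only.

```r
library(dplyr)

# Same data as in the question, rebuilt in compact form.
df <- data.frame(
  year  = rep(c("2013", "2014", "2015"), each = 6),
  site  = "a",
  trt   = c(rep(c("x", "y"), 3), rep(c("x", "y"), 3),
            "x", "y", "z", "z", "x", "y"),
  cover = c(2, 5, 1, 20, 50, 12,
            1, 3, 1, 24, 32, 12,
            2, 5, 1, 2, 11, 32)
)

# dplyr: group by both columns so ranks restart within each year/trt pair.
ranked <- df |>
  group_by(year, trt) |>
  mutate(rank = dense_rank(desc(cover))) |>
  ungroup()

# One list element per year, as in the original loop.
by_year <- ranked |> group_split(year)

# lapply: split into one piece per year/trt pair (dropping empty
# combinations), rank each piece, then bind the pieces back together.
pieces <- split(df, list(df$year, df$trt), drop = TRUE)
ranked_lapply <- lapply(pieces, function(d) {
  d$rank <- dense_rank(desc(d$cover))
  d
})
ranked_lapply <- do.call(rbind, ranked_lapply)
```

With this grouping, the 2013 values for trt x (2, 1, 50) receive ranks 2, 3, 1. The lapply version yields the same ranks as the dplyr version, but its rows come back ordered by group rather than in the original row order.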

huangapple
  • Posted on March 31, 2023 at 18:42:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75897615.html