Looping over groups in R

Question

I have a data frame, df, built by row-binding three data frames, df1, df2, and df3, each of which follows this structure:

df1 <- data.frame(year = c("2013", "2013", "2013", "2013", "2013", "2013"),
                  site = c("a", "a", "a", "a", "a", "a"),
                  trt = c("x", "y", "x", "y", "x", "y"),
                  cover = c(2, 5, 1, 20, 50, 12))

df2 <- data.frame(year = c("2014", "2014", "2014", "2014", "2014", "2014"),
                  site = c("a", "a", "a", "a", "a", "a"),
                  trt = c("x", "y", "x", "y", "x", "y"),
                  cover = c(1, 3, 1, 24, 32, 12))

df3 <- data.frame(year = c("2015", "2015", "2015", "2015", "2015", "2015"),
                  site = c("a", "a", "a", "a", "a", "a"),
                  trt = c("x", "y", "z", "z", "x", "y"),
                  cover = c(2, 5, 1, 2, 11, 32))

df <- rbind(df1, df2, df3)
df

   year site trt cover
1  2013    a   x     2
2  2013    a   y     5
3  2013    a   x     1
4  2013    a   y    20
5  2013    a   x    50
6  2013    a   y    12
7  2014    a   x     1
8  2014    a   y     3
9  2014    a   x     1
10 2014    a   y    24
11 2014    a   x    32
12 2014    a   y    12
13 2015    a   x     2
14 2015    a   y     5
15 2015    a   z     1
16 2015    a   z     2
17 2015    a   x    11
18 2015    a   y    32

I used to rank the values in the cover column for each year, using a for loop.

library(dplyr)

v1 <- unique(df$year)   # distinct years
lst <- list()           # one ranked data frame per year

for (i in seq_along(v1)) {
  lst[[i]] <- df |>
    filter(year == v1[i]) |>
    mutate(rank = dense_rank(desc(cover)))
}

Now I am trying to rank the values within each group (as defined in the trt column) for each year, but I am having trouble figuring out how to do so. How can I do this with a for loop? I am also open to an answer that uses lapply, as I would like to learn about it.

Answer 1

Score: 1

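Using dplyr, we can avoid the loop and the filtering by calling group_by() before mutate(), and then build the list with group_split(). The example below groups by year only, which reproduces the result of the loop in the question; adding trt to the group_by() call ranks the values within each treatment per year.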

library(dplyr)

df |>
  group_by(year) |>
  mutate(rank = dense_rank(desc(cover))) |>
  group_split()

Output:

[[1]]
# A tibble: 6 × 5
  year  site  trt   cover  rank
  <chr> <chr> <chr> <dbl> <int>
1 2013  a     x         2     5
2 2013  a     y         5     4
3 2013  a     x         1     6
4 2013  a     y        20     2
5 2013  a     x        50     1
6 2013  a     y        12     3

[[2]]
# A tibble: 6 × 5
  year  site  trt   cover  rank
  <chr> <chr> <chr> <dbl> <int>
1 2014  a     x         1     5
2 2014  a     y         3     4
3 2014  a     x         1     5
4 2014  a     y        24     2
5 2014  a     x        32     1
6 2014  a     y        12     3

[[3]]
# A tibble: 6 × 5
  year  site  trt   cover  rank
  <chr> <chr> <chr> <dbl> <int>
1 2015  a     x         2     4
2 2015  a     y         5     3
3 2015  a     z         1     5
4 2015  a     z         2     4
5 2015  a     x        11     2
6 2015  a     y        32     1
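
Since the question asks for ranks within each trt group per year, and mentions both a for loop and lapply, here is a minimal sketch of three ways to get that result. It assumes the same df as in the question (rebuilt compactly here so the block runs on its own) and R 4.1 or later for the native pipe. The names by_year, lst_loop, lst_apply, and rank_desc are made up for illustration; rank_desc is a small base R stand-in for dense_rank(desc(x)), not a dplyr function.

```r
library(dplyr)

# Same data as in the question, built in one call.
df <- data.frame(
  year  = rep(c("2013", "2014", "2015"), each = 6),
  site  = "a",
  trt   = c("x", "y", "x", "y", "x", "y",
            "x", "y", "x", "y", "x", "y",
            "x", "y", "z", "z", "x", "y"),
  cover = c(2, 5, 1, 20, 50, 12,
            1, 3, 1, 24, 32, 12,
            2, 5, 1, 2, 11, 32)
)

# 1) dplyr only: rank within every year/trt combination,
#    then split into one tibble per year.
by_year <- df |>
  group_by(year, trt) |>
  mutate(rank = dense_rank(desc(cover))) |>
  ungroup() |>
  group_split(year)

# 2) for loop: same structure as the loop in the question,
#    with a group_by(trt) added inside each year.
years <- unique(df$year)
lst_loop <- vector("list", length(years))
for (i in seq_along(years)) {
  lst_loop[[i]] <- df |>
    filter(year == years[i]) |>
    group_by(trt) |>
    mutate(rank = dense_rank(desc(cover))) |>
    ungroup()
}

# 3) lapply with base R: split() yields one data frame per year,
#    and ave() applies the ranking within each trt level.
rank_desc <- function(x) match(x, sort(unique(x), decreasing = TRUE))
lst_apply <- lapply(split(df, df$year), function(d) {
  d$rank <- ave(d$cover, d$trt, FUN = rank_desc)
  d
})

# Ranks for 2013, in row order: 2 3 3 1 1 2
lst_apply[["2013"]]$rank
```

All three give the same ranks. The lapply version returns a list named by year, so a single year can be pulled out with lst_apply[["2014"]], and do.call(rbind, lst_apply) stacks the pieces back into one data frame if a list is not needed.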

huangapple
  • Posted on March 31, 2023, 18:42:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75897615.html