2023-03-31 18:42:53

Looping over groups in R

Question

I have a data frame, df, built by row-binding three data frames, df1, df2, and df3, each of which follows this structure:

df1 <- data.frame(year = c("2013", "2013", "2013", "2013", "2013","2013"), 
                  site = c("a", "a", "a", "a", "a", "a"),
                  trt = c("x", "y", "x", "y", "x", "y"),
                  cover = c(2, 5, 1,20,50,12))
df2 <- data.frame(year = c("2014", "2014", "2014", "2014", "2014","2014"),
                  site = c("a", "a", "a", "a", "a", "a"),
                  trt = c("x", "y", "x", "y", "x", "y"),
                  cover = c(1, 3, 1,24,32,12))
df3 <- data.frame(year = c("2015", "2015", "2015", "2015", "2015","2015"),
                  site = c("a", "a", "a", "a", "a", "a"),
                  trt = c("x", "y", "z", "z", "x", "y"),
                  cover = c(2, 5, 1,2,11,32))
df <- rbind(df1, df2, df3)
df
   year site trt cover
1  2013    a   x     2
2  2013    a   y     5
3  2013    a   x     1
4  2013    a   y    20
5  2013    a   x    50
6  2013    a   y    12
7  2014    a   x     1
8  2014    a   y     3
9  2014    a   x     1
10 2014    a   y    24
11 2014    a   x    32
12 2014    a   y    12
13 2015    a   x     2
14 2015    a   y     5
15 2015    a   z     1
16 2015    a   z     2
17 2015    a   x    11
18 2015    a   y    32

I used to rank the values in the cover column within each year using a for loop:

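library(dplyr)  # provides filter(), mutate(), dense_rank(), desc()

v1 <- unique(df$year)   # one entry per year
lst <- list()
for (i in seq_along(v1)) {
  # keep the rows for year i, then rank cover from highest to lowest
  lst[[i]] <- df |>
    filter(year == v1[i]) |>
    mutate(rank = dense_rank(desc(cover)))
}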

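Now I want to rank the values within each group (as defined by the trt column) for each year, but I can't work out how. How can I do this with a for loop? I am also open to an answer that uses lapply, as I would like to learn about it.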

Answer 1

Score: 1

With dplyr, we can avoid both the loop and the filtering by calling group_by before mutate, and then build the list with group_split.

library(dplyr)
df |>
  group_by(year) |>
  mutate(rank = dense_rank(desc(cover))) |>
  group_split()

Output:

[[1]]
# A tibble: 6 × 5
  year  site  trt   cover  rank
  <chr> <chr> <chr> <dbl> <int>
1 2013  a     x         2     5
2 2013  a     y         5     4
3 2013  a     x         1     6
4 2013  a     y        20     2
5 2013  a     x        50     1
6 2013  a     y        12     3
[[2]]
# A tibble: 6 × 5
  year  site  trt   cover  rank
  <chr> <chr> <chr> <dbl> <int>
1 2014  a     x         1     5
2 2014  a     y         3     4
3 2014  a     x         1     5
4 2014  a     y        24     2
5 2014  a     x        32     1
6 2014  a     y        12     3
[[3]]
# A tibble: 6 × 5
  year  site  trt   cover  rank
  <chr> <chr> <chr> <dbl> <int>
1 2015  a     x         2     4
2 2015  a     y         5     3
3 2015  a     z         1     5
4 2015  a     z         2     4
5 2015  a     x        11     2
6 2015  a     y        32     1
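
The output above ranks cover within each year only, whereas the question asks for ranks within each trt group per year. The following is a minimal sketch of one way to do that, not the answerer's code. It rebuilds the question's df so it runs on its own, and the names by_year, by_year_lapply, by_year_loop, and years are made up for illustration. It shows three equivalent routes: grouping by both columns, lapply over split(), and the question's own for loop with one extra group_by(trt).

```r
library(dplyr)

# Rebuild the question's data so the sketch is self-contained.
df <- data.frame(
  year  = rep(c("2013", "2014", "2015"), each = 6),
  site  = "a",
  trt   = c(rep(c("x", "y"), 6), "x", "y", "z", "z", "x", "y"),
  cover = c(2, 5, 1, 20, 50, 12, 1, 3, 1, 24, 32, 12, 2, 5, 1, 2, 11, 32)
)

# 1. Grouped version: rank cover within each year/trt combination,
#    then regroup by year so group_split() returns one tibble per year.
by_year <- df |>
  group_by(year, trt) |>
  mutate(rank = dense_rank(desc(cover))) |>
  group_by(year) |>
  group_split()

# 2. lapply version: split() gives one data frame per year, and the
#    function ranks within trt for that year only.
by_year_lapply <- lapply(split(df, df$year), function(d) {
  d |>
    group_by(trt) |>
    mutate(rank = dense_rank(desc(cover))) |>
    ungroup()
})

# 3. for-loop version: the question's loop plus group_by(trt).
years <- unique(df$year)
by_year_loop <- vector("list", length(years))
for (i in seq_along(years)) {
  by_year_loop[[i]] <- df |>
    filter(year == years[i]) |>
    group_by(trt) |>
    mutate(rank = dense_rank(desc(cover))) |>
    ungroup()
}
```

In all three results the 2013 rows with trt x (cover 2, 1, 50) receive ranks 2, 3, 1, and the rows with trt y (cover 5, 20, 12) receive ranks 3, 1, 2. The lapply result is a list named by year, so a single year can be pulled out with by_year_lapply[["2015"]].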
