May 20, 2023, 22:18:16


How to select row with max value from a group of groups in R?

Question


I have a data frame with several grouping columns. I want to build a new data frame that contains, for each combination of groups, the row with the maximum value. Given the data frame:

group unit treatment value         etc
     1    A         w     8       apple
     1    A         x     9        pear
     1    A         y     7      orange
     1    A         z     2        pear
     1    B         w     4  strawberry
     1    B         x     3 dragonfruit
     1    B         y     6   raspberry
     1    B         z     5       apple
     1    C         w    32      banana
     1    C         x    27       peach
     1    C         y    15        plum
     1    C         z    28      orange
     2    A         w    12     apricot
     2    A         x    11  blackberry
     2    A         y    10      banana
     2    A         z     9  raspeberry
     2    B         w     1        plum
     2    B         x     2       lemon
     2    B         y     3  grapefruit
     2    B         z     4       apple
     2    C         w    51         fig
     2    C         x    47     avocado
     2    C         y    68  blackberry
     2    C         z    53 dragonfruit

For each group and each unit, I would like to select the row with the highest value, so that I end up with:

group unit treatment value        etc
     1    A         x     9       pear
     1    B         y     6  raspberry
     1    C         w    32     banana
     2    A         w    12    apricot
     2    B         z     4      apple
     2    C         y    68 blackberry

The etc column is just there to highlight that I'd like to select the whole row.

I could write a series of nested loops, but it feels like there has to be something more elegant. I'm happy with base R or tidyverse suggestions.

Answer 1

Score: 0

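You can do as follows:

```R
library(dplyr)
# .by needs dplyr 1.1.0 or later; group:unit selects the adjacent columns group and unit
filter(dt, value == max(value), .by = group:unit)
```

or (as @Limey suggests)

```R
library(dplyr)
slice_max(dt, order_by = value, by = group:unit)
```

or

```R
library(data.table)
# setDT() converts dt to a data.table by reference
setDT(dt)[, .SD[value == max(value)], .(group, unit)]
```

Output (as printed by data.table):

```
   group   unit treatment value        etc
   <int> <char>    <char> <int>     <char>
1:     1      A         x     9       pear
2:     1      B         y     6  raspberry
3:     1      C         w    32     banana
4:     2      A         w    12    apricot
5:     2      B         z     4      apple
6:     2      C         y    68 blackberry
```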

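The snippets in this answer assume that a data frame named `dt` already exists. As a minimal sketch, the following rebuilds the table from the question and, since the question also welcomes base R, selects the rows with `ave()`. This base R variant is not part of the original answers, and the names `group_max` and `result` are only illustrative.

```R
# Rebuild the example data from the question (values copied from its table,
# including the original spelling "raspeberry").
dt <- data.frame(
  group     = rep(1:2, each = 12),
  unit      = rep(rep(c("A", "B", "C"), each = 4), times = 2),
  treatment = rep(c("w", "x", "y", "z"), times = 6),
  value     = c(8, 9, 7, 2, 4, 3, 6, 5, 32, 27, 15, 28,
                12, 11, 10, 9, 1, 2, 3, 4, 51, 47, 68, 53),
  etc       = c("apple", "pear", "orange", "pear",
                "strawberry", "dragonfruit", "raspberry", "apple",
                "banana", "peach", "plum", "orange",
                "apricot", "blackberry", "banana", "raspeberry",
                "plum", "lemon", "grapefruit", "apple",
                "fig", "avocado", "blackberry", "dragonfruit")
)

# ave() returns, for every row, the maximum of `value` within that row's
# (group, unit) combination; keep the rows whose value equals that maximum.
group_max <- ave(dt$value, dt$group, dt$unit, FUN = max)
result <- dt[dt$value == group_max, ]
rownames(result) <- NULL
result
```

Like the `filter()` and data.table versions, this keeps every tied row when a (group, unit) pair has more than one maximum.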
Answer 2

Score: 0

You can use `dplyr::slice_max()`. In dplyr 1.1.0 or later:

```R
library(dplyr)
df %>%
  slice_max(value, by = c(group, unit))
```

In older versions:

```R
df %>%
  group_by(group, unit) %>%
  slice_max(value) %>%
  ungroup()
```

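One detail worth knowing when using `slice_max()`: by default it keeps all tied rows (`with_ties = TRUE`), so a (group, unit) pair with two equal maxima returns two rows. The sketch below illustrates this with a made-up data frame `ties_df` (hypothetical, not from the question) and assumes dplyr 1.1.0 or later for the `by` argument.

```R
library(dplyr)

# Hypothetical toy data: the two rows of unit "A" share the maximum value.
ties_df <- data.frame(
  group     = c(1, 1, 1),
  unit      = c("A", "A", "B"),
  treatment = c("w", "x", "w"),
  value     = c(9, 9, 4)
)

# Default behaviour: both tied rows of unit "A" are returned.
with_ties <- slice_max(ties_df, value, by = c(group, unit))

# with_ties = FALSE keeps only the first of the tied rows per group.
no_ties <- slice_max(ties_df, value, by = c(group, unit), with_ties = FALSE)

nrow(with_ties)  # 3
nrow(no_ties)    # 2
```

If exactly one row per (group, unit) is required, `with_ties = FALSE` is the simpler choice; which tied row is kept then depends on the row order of the input.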
