How to select row with max value from a group of groups in R?

Question


I have a data frame with several grouping columns. I want to build a data frame that contains, for each combination of groups, the row with the maximum value. Given the data frame:

group unit treatment value         etc
     1    A         w     8       apple
     1    A         x     9        pear
     1    A         y     7      orange
     1    A         z     2        pear
     1    B         w     4  strawberry
     1    B         x     3 dragonfruit
     1    B         y     6   raspberry
     1    B         z     5       apple
     1    C         w    32      banana
     1    C         x    27       peach
     1    C         y    15        plum
     1    C         z    28      orange
     2    A         w    12     apricot
     2    A         x    11  blackberry
     2    A         y    10      banana
     2    A         z     9  raspeberry
     2    B         w     1        plum
     2    B         x     2       lemon
     2    B         y     3  grapefruit
     2    B         z     4       apple
     2    C         w    51         fig
     2    C         x    47     avocado
     2    C         y    68  blackberry
     2    C         z    53 dragonfruit

For each group, for each unit, I would like to select the row with the highest value, so that I end up with:

group unit treatment value        etc
     1    A         x     9       pear
     1    B         y     6  raspberry
     1    C         w    32     banana
     2    A         w    12    apricot
     2    B         z     4      apple
     2    C         y    68 blackberry

The etc column is just to highlight that I'd like to select the whole row.

I could write a series of nested loops, but it feels like there has to be something more elegant. Happy for base or tidyverse suggestions.

Answer 1

Score: 0

You can do as follows:

```R
library(dplyr)
# Keep the rows whose value equals the maximum within each group/unit pair
filter(dt, value == max(value), .by = group:unit)
```

or (as @Limey suggests)

```R
library(dplyr)
# slice_max() returns the row(s) with the largest value per group/unit pair
slice_max(dt, order_by = value, by = group:unit)
```

or

```R
library(data.table)
# setDT() converts dt to a data.table by reference; .SD is the per-group subset
setDT(dt)[, .SD[value == max(value)], .(group, unit)]
```

Output:

```
   group   unit treatment value        etc
   <int> <char>    <char> <int>     <char>
1:     1      A         x     9       pear
2:     1      B         y     6  raspberry
3:     1      C         w    32     banana
4:     2      A         w    12    apricot
5:     2      B         z     4      apple
6:     2      C         y    68 blackberry
```

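Since the question also welcomes base R, here is a minimal sketch of the same `value == max(value)` idea without any packages. It is not taken from the answers above: it assumes the data frame is called `dt` (rebuilt here from the values in the question) and uses `ave()` to compute the per-group maximum. The names `group_max` and `result` are illustrative only.

```r
# Rebuild the question's data frame (values copied from the question).
dt <- data.frame(
  group     = rep(1:2, each = 12),
  unit      = rep(rep(c("A", "B", "C"), each = 4), times = 2),
  treatment = rep(c("w", "x", "y", "z"), times = 6),
  value     = c(8, 9, 7, 2, 4, 3, 6, 5, 32, 27, 15, 28,
                12, 11, 10, 9, 1, 2, 3, 4, 51, 47, 68, 53),
  etc       = c("apple", "pear", "orange", "pear",
                "strawberry", "dragonfruit", "raspberry", "apple",
                "banana", "peach", "plum", "orange",
                "apricot", "blackberry", "banana", "raspeberry",
                "plum", "lemon", "grapefruit", "apple",
                "fig", "avocado", "blackberry", "dragonfruit")
)

# ave() returns, for every row, the maximum value within its group/unit pair.
group_max <- ave(dt$value, dt$group, dt$unit, FUN = max)

# Keep the rows whose value equals that maximum (tied rows are all kept).
result <- dt[dt$value == group_max, ]
rownames(result) <- NULL
print(result)
```

Because `ave()` preserves the original row order, the selected rows come back in the same order as in `dt`, and every other column (such as `etc`) is carried along unchanged.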


# Answer 2
**Score**: 0

You can use `dplyr::slice_max()`. In dplyr 1.1.0 or later:

```R
library(dplyr)

# Per-operation grouping with the `by` argument
df %>%
  slice_max(value, by = c(group, unit))
```

In older versions:

```R
# Explicit grouping, then drop the grouping afterwards
df %>%
  group_by(group, unit) %>%
  slice_max(value) %>%
  ungroup()
```

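One detail worth knowing: by default `slice_max()` keeps every row tied for the maximum, so a group can return more than one row. The sketch below illustrates this on a small hypothetical data frame (not the question's data) in which unit A has two rows tied at 9. The names `all_ties` and `first_only` are illustrative.

```r
library(dplyr)

# Hypothetical toy data: unit A has two rows tied at the maximum value 9.
df <- data.frame(
  group     = c(1, 1, 1, 1, 1),
  unit      = c("A", "A", "A", "B", "B"),
  treatment = c("w", "x", "y", "w", "x"),
  value     = c(9, 9, 7, 4, 3)
)

# Default (with_ties = TRUE): both tied rows of unit A are returned.
all_ties <- df %>%
  group_by(group, unit) %>%
  slice_max(value) %>%
  ungroup()

# with_ties = FALSE: at most one row per group/unit pair.
first_only <- df %>%
  group_by(group, unit) %>%
  slice_max(value, with_ties = FALSE) %>%
  ungroup()

print(nrow(all_ties))
print(nrow(first_only))
```

If exactly one row per group is required, `with_ties = FALSE` is the simpler choice. If ties are meaningful in your data, the default behaviour is probably what you want.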
huangapple
  • Published on May 20, 2023, 22:18:16
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76295680.html