How to find the column name with max value from a list in R
Question
I have data like this:
df <- data.frame(Product = paste0("AE", c(1:11)),
                 Condition = c("-","A","A","B","B","B","C","C","C","D","D"),
                 Score1 = c(0.231,0.831,0.894,1e-05,1e-05,1e-05,0.874,0.785,0.879,1e-08,1e-08),
                 Score2 = c(0.968,0.168,0.105,0.239,0.149,0.125,1e-05,1e-05,1e-05,0.159,0.105),
                 Score3 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,1e-05,1e-05,1e-05,0.855,0.827),
                 Score4 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,0.492,0.500,0.656,1e-08,1e-08))
After split(df, df$Condition), it looks like this:
> split(df, df$Condition)
$`-`
Product Condition Score1 Score2 Score3 Score4
1 AE1 - 0.231 0.968 1e-07 1e-07
$A
Product Condition Score1 Score2 Score3 Score4
2 AE2 A 0.831 0.168 1e-07 1e-07
3 AE3 A 0.894 0.105 1e-07 1e-07
$B
Product Condition Score1 Score2 Score3 Score4
4 AE4 B 1e-05 0.239 1e-07 1e-07
5 AE5 B 1e-05 0.149 1e-07 1e-07
6 AE6 B 1e-05 0.125 1e-07 1e-07
$C
Product Condition Score1 Score2 Score3 Score4
7 AE7 C 0.874 1e-05 1e-05 0.492
8 AE8 C 0.785 1e-05 1e-05 0.500
9 AE9 C 0.879 1e-05 1e-05 0.656
$D
Product Condition Score1 Score2 Score3 Score4
10 AE10 D 1e-08 0.159 0.855 1e-08
11 AE11 D 1e-08 0.105 0.827 1e-08
I want to get the column that holds the maximum value in each list element:
in $`-` and $B it is Score2, in $A and $C it is Score1, and in $D it is Score3.
I want to get a table like this:
Condition Column
- Score2
A Score1
B Score2
C Score1
D Score3
Any suggestions would be helpful. Thank you.
Answer 1
Score: 1
In base R, using your split:
stack(lapply(split(df[-(1:2)], df[2]), \(x)names(x)[col(x)[which.max(unlist(x))]]))
values ind
1 Score2 -
2 Score1 A
3 Score2 B
4 Score1 C
5 Score3 D
Note that there are better alternatives.
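As one possible alternative, still in base R, the sketch below turns each group into a matrix and uses which(..., arr.ind = TRUE) to locate the largest cell. The names score_cols, best and result are illustrative and not part of the original answer, and the code assumes every score column name starts with "Score".

```r
df <- data.frame(Product = paste0("AE", 1:11),
                 Condition = c("-","A","A","B","B","B","C","C","C","D","D"),
                 Score1 = c(0.231,0.831,0.894,1e-05,1e-05,1e-05,0.874,0.785,0.879,1e-08,1e-08),
                 Score2 = c(0.968,0.168,0.105,0.239,0.149,0.125,1e-05,1e-05,1e-05,0.159,0.105),
                 Score3 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,1e-05,1e-05,1e-05,0.855,0.827),
                 Score4 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,0.492,0.500,0.656,1e-08,1e-08))

# Assumption: the columns to compare are exactly those named "Score...".
score_cols <- grep("^Score", names(df), value = TRUE)

# For each Condition, find the column that contains the group's largest value.
best <- sapply(split(df[score_cols], df$Condition), function(g) {
  m <- as.matrix(g)
  # arr.ind = TRUE returns (row, col) positions; keep the column of the first hit.
  pos <- which(m == max(m), arr.ind = TRUE)[1, "col"]
  colnames(m)[pos]
})

# Assemble the two-column table asked for in the question.
result <- data.frame(Condition = names(best), Column = unname(best))
print(result)
```

If several cells in a group tie for the maximum, this sketch keeps the first one found in column-major order.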
Answer 2
Score: 0
A convenient way to get this is to reshape the data with tidyr::pivot_longer() so that all the score columns are stacked into a single value column, and then take the top value within each Condition.
library(tidyverse)
df <- data.frame(Product = paste0("AE", c(1:11)),
                 Condition = c("-","A","A","B","B","B","C","C","C","D","D"),
                 Score1 = c(0.231,0.831,0.894,1e-05,1e-05,1e-05,0.874,0.785,0.879,1e-08,1e-08),
                 Score2 = c(0.968,0.168,0.105,0.239,0.149,0.125,1e-05,1e-05,1e-05,0.159,0.105),
                 Score3 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,1e-05,1e-05,1e-05,0.855,0.827),
                 Score4 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,0.492,0.500,0.656,1e-08,1e-08))
df |>
  pivot_longer(starts_with("Score"), names_to = "Column") |>
  group_by(Condition) |>
  slice_max(order_by = value, n = 1) |>
  select(Condition, Column)
#> # A tibble: 5 × 2
#> # Groups: Condition [5]
#> Condition Column
#> <chr> <chr>
#> 1 - Score2
#> 2 A Score1
#> 3 B Score2
#> 4 C Score1
#> 5 D Score3
Created on 2023-07-05 with reprex v2.0.2
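For readers without the tidyverse installed, the same long-format idea can be sketched in base R: stack the score columns into one value column, sort each Condition by descending value, and keep the first row per Condition. The names score_cols, long and top are illustrative, and the code assumes the score columns all start with "Score".

```r
df <- data.frame(Product = paste0("AE", 1:11),
                 Condition = c("-","A","A","B","B","B","C","C","C","D","D"),
                 Score1 = c(0.231,0.831,0.894,1e-05,1e-05,1e-05,0.874,0.785,0.879,1e-08,1e-08),
                 Score2 = c(0.968,0.168,0.105,0.239,0.149,0.125,1e-05,1e-05,1e-05,0.159,0.105),
                 Score3 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,1e-05,1e-05,1e-05,0.855,0.827),
                 Score4 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,0.492,0.500,0.656,1e-08,1e-08))

# Assumption: the columns to compare are exactly those named "Score...".
score_cols <- grep("^Score", names(df), value = TRUE)

# Long format: one row per (original row, score column) pair.
long <- data.frame(
  Condition = rep(df$Condition, times = length(score_cols)),
  Column    = rep(score_cols, each = nrow(df)),
  value     = unlist(df[score_cols], use.names = FALSE)
)

# Sort so the largest value comes first within each Condition,
# then keep only the first row of every Condition.
long <- long[order(long$Condition, -long$value), ]
top <- long[!duplicated(long$Condition), c("Condition", "Column")]
rownames(top) <- NULL
print(top)
```

One difference in behavior: !duplicated() always keeps exactly one row per Condition, whereas slice_max() keeps tied rows by default (with_ties = TRUE), so the two approaches can return different row counts when a group's maximum appears more than once.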