How to find the column name with max value from a list in R

Question

I have data like this:

    df <- data.frame(Product = paste0("AE", c(1:11)),
                     Condition = c("-","A","A","B","B","B","C","C","C","D","D"),
                     Score1 = c(0.231,0.831,0.894,1e-05,1e-05,1e-05,0.874,0.785,0.879,1e-08,1e-08),
                     Score2 = c(0.968,0.168,0.105,0.239,0.149,0.125,1e-05,1e-05,1e-05,0.159,0.105),
                     Score3 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,1e-05,1e-05,1e-05,0.855,0.827),
                     Score4 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,0.492,0.500,0.656,1e-08,1e-08))

After split(df, df$Condition), it looks like this:

    > split(df, df$Condition)
    $`-`
      Product Condition Score1 Score2 Score3 Score4
    1     AE1         -  0.231  0.968  1e-07  1e-07

    $A
      Product Condition Score1 Score2 Score3 Score4
    2     AE2         A  0.831  0.168  1e-07  1e-07
    3     AE3         A  0.894  0.105  1e-07  1e-07

    $B
      Product Condition Score1 Score2 Score3 Score4
    4     AE4         B  1e-05  0.239  1e-07  1e-07
    5     AE5         B  1e-05  0.149  1e-07  1e-07
    6     AE6         B  1e-05  0.125  1e-07  1e-07

    $C
      Product Condition Score1 Score2 Score3 Score4
    7     AE7         C  0.874  1e-05  1e-05  0.492
    8     AE8         C  0.785  1e-05  1e-05  0.500
    9     AE9         C  0.879  1e-05  1e-05  0.656

    $D
       Product Condition Score1 Score2 Score3 Score4
    10    AE10         D  1e-08  0.159  0.855  1e-08
    11    AE11         D  1e-08  0.105  0.827  1e-08

I want to get the column with the max value in each list element, like this:
in $`-` and $B it is Score2, in $A and $C it is Score1, and in $D it is Score3.

I want to get a table like this:

    Condition Column
    -         Score2
    A         Score1
    B         Score2
    C         Score1
    D         Score3

Any suggestion is helpful. Thank you.

Answer 1

Score: 1

In base R, using your split:

    stack(lapply(split(df[-(1:2)], df[2]), \(x) names(x)[col(x)[which.max(unlist(x))]]))
      values ind
    1 Score2   -
    2 Score1   A
    3 Score2   B
    4 Score1   C
    5 Score3   D
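
To make the one-liner easier to follow, here is a sketch that unpacks it into named steps. The helper name `max_col_name` and the final renaming to `Condition`/`Column` are illustrative assumptions, not part of the original answer.

```r
# Data from the question.
df <- data.frame(Product = paste0("AE", 1:11),
                 Condition = c("-","A","A","B","B","B","C","C","C","D","D"),
                 Score1 = c(0.231,0.831,0.894,1e-05,1e-05,1e-05,0.874,0.785,0.879,1e-08,1e-08),
                 Score2 = c(0.968,0.168,0.105,0.239,0.149,0.125,1e-05,1e-05,1e-05,0.159,0.105),
                 Score3 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,1e-05,1e-05,1e-05,0.855,0.827),
                 Score4 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,0.492,0.500,0.656,1e-08,1e-08))

# Drop Product and Condition, then split the score columns by Condition.
groups <- split(df[-(1:2)], df$Condition)

# Hypothetical helper: name of the column holding the largest cell of one group.
max_col_name <- function(x) {
  pos <- which.max(unlist(x))  # position of the largest cell, column-major order
  names(x)[col(x)[pos]]        # col(x) maps each cell to its column index
}

# One column name per group, stacked into a two-column data frame.
res <- stack(lapply(groups, max_col_name))

# Reorder and rename to match the table requested in the question.
res <- setNames(res[c("ind", "values")], c("Condition", "Column"))
print(res)
```

If several cells in a group share the maximum, `which.max()` returns the first one, so each group still yields exactly one column name.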

Note that there are better alternatives.
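
The answer does not say which alternatives it has in mind. One possible base R option, offered only as a sketch, skips `split()` entirely: take the per-Condition maximum of each score column with `aggregate()`, then pick the column with the largest maximum in each row. The variable names `score_cols`, `col_max`, and `best` are assumptions made for this example.

```r
# Data from the question.
df <- data.frame(Product = paste0("AE", 1:11),
                 Condition = c("-","A","A","B","B","B","C","C","C","D","D"),
                 Score1 = c(0.231,0.831,0.894,1e-05,1e-05,1e-05,0.874,0.785,0.879,1e-08,1e-08),
                 Score2 = c(0.968,0.168,0.105,0.239,0.149,0.125,1e-05,1e-05,1e-05,0.159,0.105),
                 Score3 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,1e-05,1e-05,1e-05,0.855,0.827),
                 Score4 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,0.492,0.500,0.656,1e-08,1e-08))

# Names of the score columns.
score_cols <- grep("^Score", names(df), value = TRUE)

# Maximum of every score column within each Condition.
col_max <- aggregate(df[score_cols], by = list(Condition = df$Condition), FUN = max)

# For each Condition (row), the index of the column with the largest maximum.
best <- unname(apply(as.matrix(col_max[score_cols]), 1, which.max))

res <- data.frame(Condition = col_max$Condition, Column = score_cols[best])
print(res)
```

The column holding a group's largest cell is the same as the column whose per-group maximum is largest, so this should agree with the `split()` approach.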

Answer 2

Score: 0

An efficient way to get this is to use tidyr::pivot_longer() to gather all the score columns into a single column, so you can easily take the top value within each Condition.

    library(tidyverse)

    df <- data.frame(Product = paste0("AE", c(1:11)),
                     Condition = c("-","A","A","B","B","B","C","C","C","D","D"),
                     Score1 = c(0.231,0.831,0.894,1e-05,1e-05,1e-05,0.874,0.785,0.879,1e-08,1e-08),
                     Score2 = c(0.968,0.168,0.105,0.239,0.149,0.125,1e-05,1e-05,1e-05,0.159,0.105),
                     Score3 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,1e-05,1e-05,1e-05,0.855,0.827),
                     Score4 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,0.492,0.500,0.656,1e-08,1e-08))

    df |>
      pivot_longer(starts_with("Score"), names_to = "Column") |>
      group_by(Condition) |>
      slice_max(order_by = value, n = 1) |>
      select(Condition, Column)
    #> # A tibble: 5 × 2
    #> # Groups:   Condition [5]
    #>   Condition Column
    #>   <chr>     <chr>
    #> 1 -         Score2
    #> 2 A         Score1
    #> 3 B         Score2
    #> 4 C         Score1
    #> 5 D         Score3

Created on 2023-07-05 with reprex v2.0.2
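
One detail worth knowing about the pipeline above: `slice_max()` keeps tied rows by default, so a Condition whose maximum occurs in more than one cell would return more than one row, and the result stays grouped. The variant below is a sketch under the assumption that exactly one row per Condition is wanted; it loads dplyr and tidyr directly instead of the whole tidyverse, and the name `res` is illustrative.

```r
library(dplyr)
library(tidyr)

# Data from the question.
df <- data.frame(Product = paste0("AE", 1:11),
                 Condition = c("-","A","A","B","B","B","C","C","C","D","D"),
                 Score1 = c(0.231,0.831,0.894,1e-05,1e-05,1e-05,0.874,0.785,0.879,1e-08,1e-08),
                 Score2 = c(0.968,0.168,0.105,0.239,0.149,0.125,1e-05,1e-05,1e-05,0.159,0.105),
                 Score3 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,1e-05,1e-05,1e-05,0.855,0.827),
                 Score4 = c(1e-07,1e-07,1e-07,1e-07,1e-07,1e-07,0.492,0.500,0.656,1e-08,1e-08))

res <- df |>
  pivot_longer(starts_with("Score"), names_to = "Column") |>
  group_by(Condition) |>
  # with_ties = FALSE keeps a single row per group even if the maximum repeats.
  slice_max(order_by = value, n = 1, with_ties = FALSE) |>
  # Drop the grouping so later steps operate on a plain tibble.
  ungroup() |>
  select(Condition, Column)

print(res)
```

With the question's data there are no ties at the maximum, so this should return the same five rows as the original pipeline.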

huangapple
  • Posted on 2023-07-06 10:49:14
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76625195.html