Per group wilcox.test against everything else using two data frame columns

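Question

Input data frame:

    df <- data.frame(x=abs(rnorm(50)),col1=rep(1:5,10), col2=rep(1:4,25))

I want to do the following:

    df %>%
      group_by(col1) %>%
      # for each group g in col2, run wilcox.test and compute the p_value
      < for g in col2 do wilcox.test(.data[.data$col2 == g]$x,.data[.data$col2 != g]$x)$p.value >

I am not sure how to implement the part in the angle brackets. The end result should have three columns: col1, col2, p_value, where p_value comes from the wilcox.test of each group in col2 against all other values of col2 (within each col1 value).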

Answer 1

Score: 0

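You can write a helper function that takes the `x` and `col2` columns of one group and returns a data frame of p-values, one row per `col2` value. Then call that function inside `reframe()` with `.by = col1`:

```R
library(dplyr)  # reframe() and .by require dplyr >= 1.1.0

# For each value v of c2, test x in that group against x outside it
f <- function(x, c2) {
  vs <- unique(c2)
  data.frame(col2 = vs, p_value = sapply(vs, function(v) wilcox.test(x[c2 == v], x[c2 != v])$p.value))
}
reframe(df, f(x, col2), .by = col1)
```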

Output:

       col1 col2    p_value
    1     1    1 0.08062436
    2     1    2 0.44453044
    3     1    3 0.16795666
    4     1    4 0.67247162
    5     2    2 0.02541280
    6     2    3 0.14176987
    7     2    4 0.80005160
    8     2    1 0.73542312
    9     3    3 0.73542312
    10    3    4 0.86597007
    11    3    1 0.19736842
    12    3    2 0.49729102
    13    4    4 0.30559856
    14    4    1 0.14176987
    15    4    2 0.11855005
    16    4    3 0.34855521
    17    5    1 0.26612487
    18    5    2 0.14176987
    19    5    3 0.05263158
    20    5    4 0.49729102
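
The same helper can also be dropped into the `group_by()` pipeline from the question. The following is a minimal sketch, assuming dplyr is installed; it uses `group_modify()` instead of `reframe()`, the name `res_pipe` is introduced here only for illustration, and `col1` is written out at its full length of 100 rather than relying on recycling.

```r
library(dplyr)

set.seed(123)
# Assumption: col1 spelled out with 100 values so no column is recycled
df <- data.frame(x = abs(rnorm(100)), col1 = rep(1:5, 20), col2 = rep(1:4, 25))

# Same helper as in the answer: one p-value per distinct value of c2
f <- function(x, c2) {
  vs <- unique(c2)
  data.frame(col2 = vs, p_value = sapply(vs, function(v) wilcox.test(x[c2 == v], x[c2 != v])$p.value))
}

# group_modify() calls the function once per col1 group;
# .x holds that group's rows without the grouping column
res_pipe <- df %>%
  group_by(col1) %>%
  group_modify(~ f(.x$x, .x$col2)) %>%
  ungroup()

res_pipe
```

The result is a tibble with the columns col1, col2 and p_value, one row per combination, so it should line up with what the `reframe()` call returns for the same data.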

Input (note that I use `rnorm(100)` to avoid recycling):

    set.seed(123)
    df <- data.frame(x = abs(rnorm(100)), col1 = rep(1:5, 10), col2 = rep(1:4, 25))
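
The recycling comes from `data.frame()` itself: a shorter column is repeated when the longest column's length is a whole multiple of its own. A small sketch with the lengths used in the question shows the effect (the name `df50` is only illustrative):

```r
# x and col1 have 50 values, col2 has 100, so data.frame() recycles x and col1
df50 <- data.frame(x = abs(rnorm(50)), col1 = rep(1:5, 10), col2 = rep(1:4, 25))

nrow(df50)                               # 100 rows, not 50
identical(df50$x[1:50], df50$x[51:100])  # TRUE: every x value appears twice
```

Each duplicated value is a tie, and with ties `wilcox.test()` cannot compute an exact p-value, so drawing 100 values up front is likely the cleaner setup.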

huangapple
  • Posted on June 13, 2023, 01:33:30
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76459029.html