2023-06-13 01:33:30

Per group wilcox.test against everything else using two data frame columns

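Question

Input data frame:

df <- data.frame(x=abs(rnorm(50)),col1=rep(1:5,10), col2=rep(1:4,25))

I want to do:

df %>%
  group_by(col1) %>%
  # pseudocode: run wilcox.test for each group in col2 and collect the p-value
  < for g in col2 do wilcox.test(.data[.data$col2 == g]$x,.data[.data$col2 != g]$x)$p.value >

What I am not sure about is how to implement the part in the angle brackets. The end result should have three columns: col1, col2, p_value, where p_value comes from the wilcox.test of each group in col2 against all other values outside that group in col2 (within each col1 value).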

Answer 1

Score: 0

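You can write a helper function that takes the `x` and `col2` columns of one group and returns a data frame with the p-values. Then call that function inside `reframe`, with `.by = col1`.
```R
library(dplyr)  # reframe() and .by need dplyr >= 1.1.0

f <- function(x, c2) {
  vs <- unique(c2)  # distinct col2 values within the current col1 group
  # one test per col2 value: that group's x against all other x in the same col1 group
  data.frame(col2 = vs, p_value = sapply(vs, function(v) wilcox.test(x[c2 == v], x[c2 != v])$p.value))
}
reframe(df, f(x, col2), .by = col1)
```

Output: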

   col1 col2    p_value
1     1    1 0.08062436
2     1    2 0.44453044
3     1    3 0.16795666
4     1    4 0.67247162
5     2    2 0.02541280
6     2    3 0.14176987
7     2    4 0.80005160
8     2    1 0.73542312
9     3    3 0.73542312
10    3    4 0.86597007
11    3    1 0.19736842
12    3    2 0.49729102
13    4    4 0.30559856
14    4    1 0.14176987
15    4    2 0.11855005
16    4    3 0.34855521
17    5    1 0.26612487
18    5    2 0.14176987
19    5    3 0.05263158
20    5    4 0.49729102

Input (note that I use `rnorm(100)` to avoid recycling):

set.seed(123)
df <- data.frame(x = abs(rnorm(100)), col1 = rep(1:5, 10), col2 = rep(1:4, 25))
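
For readers without dplyr 1.1.0 or later (the release that introduced `reframe()` and `.by`), the same one-against-the-rest idea can probably be written in base R. The following is a minimal sketch and not part of the original answer: the helper name `one_vs_rest` is made up for illustration, and `col1` is written as `rep(1:5, 20)`, which yields the same 100 values as letting `rep(1:5, 10)` recycle.

```r
# Base R sketch: within each col1 group, test every col2 value against the rest.
set.seed(123)
df <- data.frame(
  x    = abs(rnorm(100)),
  col1 = rep(1:5, 20),  # same values as the recycled rep(1:5, 10)
  col2 = rep(1:4, 25)
)

# Hypothetical helper: takes the rows of one col1 group and returns
# one row per distinct col2 value with its one-vs-rest p-value.
one_vs_rest <- function(d) {
  groups <- unique(d$col2)
  p <- vapply(
    groups,
    function(g) wilcox.test(d$x[d$col2 == g], d$x[d$col2 != g])$p.value,
    numeric(1)
  )
  data.frame(col1 = d$col1[1], col2 = groups, p_value = p)
}

# Split by col1, apply the helper to each piece, then stack the results.
res <- do.call(rbind, lapply(split(df, df$col1), one_vs_rest))
rownames(res) <- NULL
res
```

The result has the same three columns as the `reframe` version (col1, col2, p_value), one row per col1/col2 combination.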


