2023-05-25 10:46:01


How can I iterate dataframe column names for ANOVA?

Question


I have a data frame:

bbb <- as.data.frame(list(X1 = c(19, 12, 6, 17, 8, 14, 19, 22, 20, 21, 23, 19), 
                          X2 = c(12, 6, 11, 9, 9, 9, 19, 18, 21, 22, 21, 23), 
                          X3 = c(19, 12, 13, 13, 12, 5, 23, 19, 14, 19, 20, 20), 
                          X4 = c(12, 12, 12, 16, 9, 10, 21, 19, 19, 21, 16, 21), 
                          X5 = c(12, 10, 7, 6, 11, 10, 15, 20, 24, 19, 19, 24),
                          cluster = c(1,1,1,1,1,1,2,2,2,2,2,2)))

and I would like to run an ANOVA on each column and get its p-value. I can do it one column at a time, but how can I do it in a loop? aov() does not understand the column names returned by colnames(bbb).

summary(aov(X1 ~ cluster, data = bbb))[[1]]$'Pr(>F)'[1]

I need to iterate over the columns of my data frame and collect the p-values in a vector.

Answer 1

Score: 1


# Create the data frame
bbb <- as.data.frame(list(X1 = c(19, 12, 6, 17, 8, 14, 19, 22, 20, 21, 23, 19), 
                          X2 = c(12, 6, 11, 9, 9, 9, 19, 18, 21, 22, 21, 23), 
                          X3 = c(19, 12, 13, 13, 12, 5, 23, 19, 14, 19, 20, 20), 
                          X4 = c(12, 12, 12, 16, 9, 10, 21, 19, 19, 21, 16, 21), 
                          X5 = c(12, 10, 7, 6, 11, 10, 15, 20, 24, 19, 19, 24),
                          cluster = c(1,1,1,1,1,1,2,2,2,2,2,2)))

# Response columns: every column except the grouping variable
response_cols <- setdiff(names(bbb), "cluster")

# Run a one-way ANOVA per column and collect the p-values.
# vapply() returns a named numeric vector, where lapply() would return a list.
p_values <- vapply(response_cols, function(col) {
  # Build the formula from the column name, e.g. X1 ~ cluster
  aov_result <- summary(aov(as.formula(paste(col, "~ cluster")), data = bbb))
  aov_result[[1]]$`Pr(>F)`[1]
}, numeric(1))

# Print the p-values
print(p_values)
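
Since the question asks for a loop, the same idea can also be sketched as an explicit for loop. This is a minimal sketch rather than part of the original answer: it assumes `cluster` is the only non-response column, and it swaps in base R's `reformulate()` to build each formula instead of `paste()` with `as.formula()`. The name `response_cols` is introduced here for illustration only.

```r
# Same data as in the question
bbb <- data.frame(
  X1 = c(19, 12, 6, 17, 8, 14, 19, 22, 20, 21, 23, 19),
  X2 = c(12, 6, 11, 9, 9, 9, 19, 18, 21, 22, 21, 23),
  X3 = c(19, 12, 13, 13, 12, 5, 23, 19, 14, 19, 20, 20),
  X4 = c(12, 12, 12, 16, 9, 10, 21, 19, 19, 21, 16, 21),
  X5 = c(12, 10, 7, 6, 11, 10, 15, 20, 24, 19, 19, 24),
  cluster = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
)

# Assumption: every column except `cluster` is a response variable
response_cols <- setdiff(names(bbb), "cluster")

# Pre-allocate a named numeric vector for the p-values
p_values <- setNames(numeric(length(response_cols)), response_cols)

for (col in response_cols) {
  # reformulate("cluster", response = "X1") builds the formula X1 ~ cluster
  fit <- aov(reformulate("cluster", response = col), data = bbb)
  # First row of the ANOVA table is the `cluster` term
  p_values[col] <- summary(fit)[[1]][["Pr(>F)"]][1]
}

print(p_values)
```

Pre-allocating the vector and filling it by name keeps each p-value attached to its column, so the result can be indexed as `p_values["X3"]`.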

Answer 2

Score: 0


One option is to use the map() function from the purrr package (part of the tidyverse), e.g.

library(purrr)

bbb <- as.data.frame(list(X1 = c(19, 12, 6, 17, 8, 14, 19, 22, 20, 21, 23, 19), 
                          X2 = c(12, 6, 11, 9, 9, 9, 19, 18, 21, 22, 21, 23), 
                          X3 = c(19, 12, 13, 13, 12, 5, 23, 19, 14, 19, 20, 20), 
                          X4 = c(12, 12, 12, 16, 9, 10, 21, 19, 19, 21, 16, 21), 
                          X5 = c(12, 10, 7, 6, 11, 10, 15, 20, 24, 19, 19, 24),
                          cluster = c(1,1,1,1,1,1,2,2,2,2,2,2)))

summary(aov(X1 ~ cluster, data = bbb))[[1]]$'Pr(>F)'[1]
#> [1] 0.004145981

map(bbb, ~summary(aov(.x ~ cluster, data = bbb))[[1]]$'Pr(>F)'[1])
#> $X1
#> [1] 0.004145981
#> 
#> $X2
#> [1] 1.614913e-06
#> 
#> $X3
#> [1] 0.01052767
#> 
#> $X4
#> [1] 0.0001252443
#> 
#> $X5
#> [1] 7.91075e-05
#> 
#> $cluster
#> [1] 4.842692e-159

Created on 2023-05-25 with reprex v2.0.2

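This returns the results as a list, but you can unlist() them or coerce them to a data frame if required. Note that map() also iterates over the cluster column itself, so the last entry (cluster modelled against itself) is not a meaningful p-value and can be dropped.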
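
A minimal sketch of that last step, written in base R so that it runs without extra packages: `lapply()` stands in for `purrr::map()` here, which is a substitution and not the answer's own code. It assumes `cluster` is the grouping column and skips it. The names `response_cols`, `p_list`, `p_vector` and `p_table` are illustrative.

```r
# Same data as in the question
bbb <- data.frame(
  X1 = c(19, 12, 6, 17, 8, 14, 19, 22, 20, 21, 23, 19),
  X2 = c(12, 6, 11, 9, 9, 9, 19, 18, 21, 22, 21, 23),
  X3 = c(19, 12, 13, 13, 12, 5, 23, 19, 14, 19, 20, 20),
  X4 = c(12, 12, 12, 16, 9, 10, 21, 19, 19, 21, 16, 21),
  X5 = c(12, 10, 7, 6, 11, 10, 15, 20, 24, 19, 19, 24),
  cluster = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
)

# Assumption: every column except `cluster` is a response variable
response_cols <- setdiff(names(bbb), "cluster")

# One p-value per response column, returned as a named list
p_list <- lapply(bbb[response_cols], function(y) {
  summary(aov(y ~ cluster, data = bbb))[[1]][["Pr(>F)"]][1]
})

# Flatten the list into a named numeric vector
p_vector <- unlist(p_list)

# Or coerce it to a data frame with one row per column
p_table <- data.frame(variable = names(p_vector),
                      p_value = unname(p_vector))
print(p_table)
```

With purrr loaded, `map_dbl()` in place of `map()` should give the named numeric vector directly, without the `unlist()` step.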
