How can I iterate over data frame column names for ANOVA?
Question
I have a data frame:
bbb <- as.data.frame(list(X1 = c(19, 12, 6, 17, 8, 14, 19, 22, 20, 21, 23, 19),
X2 = c(12, 6, 11, 9, 9, 9, 19, 18, 21, 22, 21, 23),
X3 = c(19, 12, 13, 13, 12, 5, 23, 19, 14, 19, 20, 20),
X4 = c(12, 12, 12, 16, 9, 10, 21, 19, 19, 21, 16, 21),
X5 = c(12, 10, 7, 6, 11, 10, 15, 20, 24, 19, 19, 24),
cluster = c(1,1,1,1,1,1,2,2,2,2,2,2)))
and I would like to use ANOVA to get a p-value for each column. I can do it one column at a time, as shown here for X1, but how can I do it in a loop? aov() does not accept the column names from colnames(bbb) in its formula.
summary(aov(X1 ~ cluster, data = bbb))[[1]]$'Pr(>F)'[1]
I need to iterate over the columns of my data frame and collect the p-values in a vector.
Answer 1
Score: 1
# Create the data frame
bbb <- as.data.frame(list(X1 = c(19, 12, 6, 17, 8, 14, 19, 22, 20, 21, 23, 19),
X2 = c(12, 6, 11, 9, 9, 9, 19, 18, 21, 22, 21, 23),
X3 = c(19, 12, 13, 13, 12, 5, 23, 19, 14, 19, 20, 20),
X4 = c(12, 12, 12, 16, 9, 10, 21, 19, 19, 21, 16, 21),
X5 = c(12, 10, 7, 6, 11, 10, 15, 20, 24, 19, 19, 24),
cluster = c(1,1,1,1,1,1,2,2,2,2,2,2)))
# Run a one-way ANOVA for each of the five response columns and collect the
# p-values in a named numeric vector (vapply returns a vector, lapply a list)
p_values <- vapply(names(bbb)[1:5], function(col) {
  # Build the formula from the column name, e.g. X1 ~ cluster
  aov_result <- summary(aov(as.formula(paste(col, "~ cluster")), data = bbb))
  aov_result[[1]]$`Pr(>F)`[1]
}, numeric(1))
# Print the p-values
print(p_values)
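Since the question asks for a loop, the same idea can also be written as an explicit for loop. The following is only a sketch, assuming the grouping column is always named cluster and every other column is a response; response_cols is a name introduced here for illustration. It uses base R's reformulate() to build each formula from a column name instead of pasting strings.

```r
# Same data as in the question
bbb <- data.frame(
  X1 = c(19, 12, 6, 17, 8, 14, 19, 22, 20, 21, 23, 19),
  X2 = c(12, 6, 11, 9, 9, 9, 19, 18, 21, 22, 21, 23),
  X3 = c(19, 12, 13, 13, 12, 5, 23, 19, 14, 19, 20, 20),
  X4 = c(12, 12, 12, 16, 9, 10, 21, 19, 19, 21, 16, 21),
  X5 = c(12, 10, 7, 6, 11, 10, 15, 20, 24, 19, 19, 24),
  cluster = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
)

# Every column except the grouping column is treated as a response (assumption)
response_cols <- setdiff(names(bbb), "cluster")

# Pre-allocate a named numeric vector, one slot per response column
p_values <- setNames(numeric(length(response_cols)), response_cols)

for (col in response_cols) {
  # reformulate("cluster", response = "X1") yields the formula X1 ~ cluster
  f <- reformulate("cluster", response = col)
  fit <- aov(f, data = bbb)
  # First row of the ANOVA table holds the p-value for the cluster term
  p_values[col] <- summary(fit)[[1]][["Pr(>F)"]][1]
}

print(p_values)
```

Selecting the response columns by name rather than by position (1:5) keeps the loop working if columns are added or reordered.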
Answer 2
Score: 0
One option is to use the map() function from the purrr package (part of the tidyverse), e.g.
library(purrr)
bbb <- as.data.frame(list(X1 = c(19, 12, 6, 17, 8, 14, 19, 22, 20, 21, 23, 19),
X2 = c(12, 6, 11, 9, 9, 9, 19, 18, 21, 22, 21, 23),
X3 = c(19, 12, 13, 13, 12, 5, 23, 19, 14, 19, 20, 20),
X4 = c(12, 12, 12, 16, 9, 10, 21, 19, 19, 21, 16, 21),
X5 = c(12, 10, 7, 6, 11, 10, 15, 20, 24, 19, 19, 24),
cluster = c(1,1,1,1,1,1,2,2,2,2,2,2)))
summary(aov(X1 ~ cluster, data = bbb))[[1]]$'Pr(>F)'[1]
#> [1] 0.004145981
map(bbb, ~summary(aov(.x ~ cluster, data = bbb))[[1]]$'Pr(>F)'[1])
#> $X1
#> [1] 0.004145981
#>
#> $X2
#> [1] 1.614913e-06
#>
#> $X3
#> [1] 0.01052767
#>
#> $X4
#> [1] 0.0001252443
#>
#> $X5
#> [1] 7.91075e-05
#>
#> $cluster
#> [1] 4.842692e-159
Created on 2023-05-25 with reprex v2.0.2
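This outputs the results in a list, but you can unlist() the results or coerce them to a data frame if required. Note that mapping over all of bbb also fits cluster against itself, so the $cluster entry is not meaningful; mapping over bbb[1:5] instead leaves it out.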
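As a rough illustration of that last step, the sketch below flattens the per-column results into a named vector and then into a data frame. To stay free of package dependencies it uses base R's lapply() as a stand-in for purrr's map(); the names response_cols, p_list, p_vec and p_df are illustrative only, and the cluster column is excluded on the assumption that it is purely the grouping variable.

```r
# Same data as in the question
bbb <- data.frame(
  X1 = c(19, 12, 6, 17, 8, 14, 19, 22, 20, 21, 23, 19),
  X2 = c(12, 6, 11, 9, 9, 9, 19, 18, 21, 22, 21, 23),
  X3 = c(19, 12, 13, 13, 12, 5, 23, 19, 14, 19, 20, 20),
  X4 = c(12, 12, 12, 16, 9, 10, 21, 19, 19, 21, 16, 21),
  X5 = c(12, 10, 7, 6, 11, 10, 15, 20, 24, 19, 19, 24),
  cluster = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
)

# Leave out the grouping column so it is not tested against itself
response_cols <- setdiff(names(bbb), "cluster")

# One p-value per response column, returned as a named list (like map())
p_list <- lapply(bbb[response_cols], function(x) {
  summary(aov(x ~ cluster, data = bbb))[[1]]$`Pr(>F)`[1]
})

# Flatten the list into a named numeric vector
p_vec <- unlist(p_list)

# Or keep the results as a two-column data frame
p_df <- data.frame(
  variable = names(p_vec),
  p_value = unname(p_vec),
  stringsAsFactors = FALSE
)

print(p_vec)
print(p_df)
```

With purrr loaded, map_dbl() is the usual way to get a numeric vector directly instead of calling unlist() on the output of map().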