How to loop multiple Normal distributions with Bartlett tests and ANOVAs?
Question
I am trying to simulate data for a single-factor ANOVA by drawing from Normal distributions, assuming 3 groups with n = 10 observations in each group. The three groups have means {5, 5, 5} and standard deviations {5, 10, 15}. I then want to test the assumption of homoscedasticity using Bartlett's test, followed by an F-test.
I want to repeat these steps 100,000 times in a for loop while saving the results of the Bartlett's test and the F-test in vectors.
I've run the code below in R, but the normal draws only save 1 observation per group instead of 10 observations per group. This stops the rest of the code from running properly.
v <- 100000 #create a vector for 100000 to be used in the for loop
#create empty vectors to store the repeated values from the for loop in
Storage <- c()
Group1 <- c()
Group2 <- c()
Group3 <- c()
b.test2 <- c()
f.test2 <- c()
fit2 <- c()
# Loop through a normal distribution for 3 groups 100,000 times
for (i in 1:v) {
  # Generate random data from a normal distribution for each group
  Group1[i] <- rnorm(n = 10, mean = 5, sd = 5)
  Group2[i] <- rnorm(n = 10, mean = 5, sd = 10)
  Group3[i] <- rnorm(n = 10, mean = 5, sd = 15)
  # Combine the three groups into a single data frame
  data2 <- data.frame(Group1, Group2, Group3)
  # Pivot the simulated data into a long data set prior to running the ANOVA and Bartlett test
  data.long2 <- data2 %>%
    pivot_longer(cols = c("Group1":"Group3"), names_to = "group", values_to = "num" )
  # Check for homoscedasticity of the residuals using a Bartlett test and save the results into a vector
  b.test2[i] <- bartlett.test(data2)
  # Create the first model fit for the simulated data
  fit2[i] <- lm(num~group, data = data2)
  # Save the p-value from the F-test in a vector
  f.test2[i] <- Anova(fit2, type=3)[1,4]
}
Answer 1
Score: 1
Since you are trying to run these analyses on individual groups, you need to store your data in lists (or subgroups).
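For reference, the immediate cause of the symptom in the question is that Group1[i] <- rnorm(n = 10, ...) assigns ten values to a single vector element, so R keeps only the first one and warns that the number of items to replace is not a multiple of the replacement length. The same applies to fit2[i] and b.test2[i], since lm() and bartlett.test() both return list-like objects rather than single numbers. Below is a minimal base-R sketch of a loop that avoids this by keeping each simulated data set local to the iteration and storing only the two p-values. The names b.p, f.p and dat are illustrative, v is cut to 200 for brevity, and base anova() is swapped in for car::Anova(type = 3) on the assumption that only the one-way F-test for group is needed.

```r
# Sketch only: names (b.p, f.p, dat) and the reduced iteration count are assumptions.
set.seed(1)   # fixed seed so the run is reproducible
v <- 200      # the question uses 100,000; kept small here
n <- 10       # observations per group

# Preallocate result vectors; each slot holds exactly one p-value
b.p <- numeric(v)
f.p <- numeric(v)

for (i in seq_len(v)) {
  # Build the data directly in long format: one row per observation
  dat <- data.frame(
    group = factor(rep(c("Group1", "Group2", "Group3"), each = n)),
    num   = c(rnorm(n, 5, 5), rnorm(n, 5, 10), rnorm(n, 5, 15))
  )
  # bartlett.test() returns an "htest" list; keep only its p-value
  b.p[i] <- bartlett.test(num ~ group, data = dat)$p.value
  # anova() on the lm fit gives the one-way F-test; row 1 is the group term
  f.p[i] <- anova(lm(num ~ group, data = dat))[1, "Pr(>F)"]
}

mean(b.p < 0.05)  # share of runs in which Bartlett's test rejects equal variances
mean(f.p < 0.05)  # share of runs in which the F-test rejects equal means
```

Because the data frame is rebuilt on every pass, nothing grows inside the loop except the two preallocated numeric vectors.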
Here is a different approach that depends on purrr and dplyr (plus tidyr for unnest); it avoids the explicit loop.
First, create a function for generating your data:
group_fun <- function(n) {
  data.frame(group = rep(c("group1", "group2", "group3"), each = n),
             num = c(rnorm(n, 5, 5), rnorm(n, 5, 10), rnorm(n, 5, 15)))
}
Then run the data function the desired number of times (I'm using 20 for brevity):
library(dplyr)   # mutate() and %>%
library(purrr)   # map()
library(tidyr)   # unnest()

n <- 10
1:20 %>%                                  # number of iterations
  map(., ~ group_fun(n)) %>%              # run the data function once for each iteration
  tibble::enframe(.) %>%                  # one row per iteration, data sets in list-column `value`
  mutate(fit  = map(value, ~ lm(num ~ group, data = .)),   # run models
         anov = map(fit, anova),
         bart = map(value, ~ bartlett.test(num ~ group, data = .))) %>%
  mutate(ftest   = map(anov, broom::tidy),                 # process results
         bart    = map(bart, broom::tidy),
         summary = map(fit, broom::tidy)) %>%
  unnest(ftest)                           # pull out results
Since some of the model outputs share column names, you can use names_repair if you want to unnest more than one output: unnest(c(ftest, bart), names_repair = "universal")
There are slightly more efficient ways to do this, but this should get you started.
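If the goal is just the two vectors of p-values that the question asks for, one possibly leaner variant is to skip the nested tibble and pull the numbers out with purrr::map_dbl(). This is only a sketch under that assumption: sims, bart_p and f_p are illustrative names, group_fun is repeated from above so the block runs on its own, and 20 iterations are used for brevity.

```r
library(purrr)

# Data generator from the answer, repeated so this block is self-contained
group_fun <- function(n) {
  data.frame(group = rep(c("group1", "group2", "group3"), each = n),
             num = c(rnorm(n, 5, 5), rnorm(n, 5, 10), rnorm(n, 5, 15)))
}

set.seed(42)                          # assumption: fixed seed for reproducibility
sims <- map(1:20, ~ group_fun(10))    # list of 20 simulated data sets

# map_dbl() returns a plain numeric vector, one p-value per data set
bart_p <- map_dbl(sims, ~ bartlett.test(num ~ group, data = .x)$p.value)
f_p    <- map_dbl(sims, ~ anova(lm(num ~ group, data = .x))[1, "Pr(>F)"])

mean(bart_p < 0.05)  # share of simulations where Bartlett's test rejects
mean(f_p < 0.05)     # share of simulations where the F-test rejects
```

Keeping the simulated data in the list sims means both tests are guaranteed to run on the same data sets, which matters if the two sets of p-values are later compared.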