How to loop multiple Normal distributions with Bartlett tests and ANOVAs?

Question

I am trying to simulate data for a single-factor ANOVA by drawing from Normal distributions for 3 groups, with n = 10 observations in each group. The three groups have means {5, 5, 5} and standard deviations {5, 10, 15}. I then want to test the assumption of homoscedasticity using Bartlett's test, followed by an F-test.

I want to repeat these steps 100,000 times in a for loop, saving the results of Bartlett's test and the F-test in vectors.

I've run the code below in R, but the normal draws save only 1 observation per group instead of 10 observations per group. This stops the rest of the code from running properly.

    v <- 100000 # number of iterations to be used in the for loop

    # create empty vectors to store the repeated values from the for loop in
    Storage <- c()
    Group1 <- c()
    Group2 <- c()
    Group3 <- c()
    b.test2 <- c()
    f.test2 <- c()
    fit2 <- c()

    # Loop through a normal distribution for 3 groups 100,000 times
    for (i in 1:v) {
      # Generate random data from a normal distribution for each group
      Group1[i] <- rnorm(n = 10, mean = 5, sd = 5)
      Group2[i] <- rnorm(n = 10, mean = 5, sd = 10)
      Group3[i] <- rnorm(n = 10, mean = 5, sd = 15)

      # Combine the three groups into a single data frame
      data2 <- data.frame(Group1, Group2, Group3)

      # Pivot the simulated data into a long data set prior to running the ANOVA and Bartlett test
      data.long2 <- data2 %>%
        pivot_longer(cols = c("Group1":"Group3"), names_to = "group", values_to = "num")

      # Check for homoscedasticity of the residuals using a Bartlett test and save the results into a vector
      b.test2[i] <- bartlett.test(data2)

      # Create the first model fit for the simulated data
      fit2[i] <- lm(num~group, data = data2)

      # Save the p-value from the F-test in a vector
      f.test2[i] <- Anova(fit2, type=3)[1,4]
    }

Answer 1

Score: 1

Since you are trying to run these analyses on individual groups, you need to store your data in lists (or subgroups).
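To make the storage point concrete, here is a minimal base-R sketch of what happens in the question's loop. It shows that assigning 10 draws into a single vector slot keeps only the first draw, while a list element holds all 10. The names `kept` and `draws` and the seed are assumptions made for illustration; they are not in the original code.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

# Assigning 10 draws into one vector slot keeps only the first draw.
# R normally warns "number of items to replace is not a multiple of
# replacement length"; the warning is suppressed here to keep output clean.
kept <- c()
suppressWarnings(kept[1] <- rnorm(n = 10, mean = 5, sd = 5))
length(kept)  # 1

# A list element can hold the whole vector of 10 draws.
draws <- vector("list", 3)
for (i in seq_along(draws)) {
  draws[[i]] <- rnorm(n = 10, mean = 5, sd = 5)
}
lengths(draws)  # 10 10 10
```

This is why `Group1[i] <- rnorm(n = 10, ...)` in the question ends up with one observation per iteration.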

Here is a different approach that depends on purrr and dplyr; it avoids the explicit loop.

First, create a function for generating your data:

    # Simulate one long-format data set: 3 groups, n observations each
    group_fun <- function(n) {
      data.frame(group = rep(c("group1", "group2", "group3"), each = n),
                 num = c(rnorm(n, 5, 5), rnorm(n, 5, 10), rnorm(n, 5, 15)))
    }

Then call the data function the desired number of times (I'm using 20 for brevity):

    library(dplyr)  # %>% and mutate()
    library(purrr)  # map()
    library(tidyr)  # unnest()

    n <- 10
    1:20 %>%                      # number of iterations
      map(~ group_fun(n)) %>%     # run the data function once for each iteration
      tibble::enframe() %>%       # one row per iteration, data sets in list column `value`
      mutate(fit  = map(value, ~ lm(num ~ group, data = .)),  # run models
             anov = map(fit, anova),
             bart = map(value, ~ bartlett.test(num ~ group, data = .))) %>%
      mutate(ftest   = map(anov, broom::tidy),                # process results
             bart    = map(bart, broom::tidy),
             summary = map(fit, broom::tidy)) %>%
      unnest(ftest)               # pull out results

Because some of the model outputs share column names, you can use names_repair if you want to unnest more than one output: unnest(c(ftest, bart), names_repair = "universal")

There are slightly more efficient ways to do this, but this should get you started.
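If you would rather keep the for loop from the question, the same simulation can be sketched in base R alone. This is one possible version under stated assumptions, not the answer's method: it builds the long-format data directly, preallocates the result vectors, and stores only the p-values. The names `n_sims`, `bartlett_p`, and `f_p`, the seed, and the reduced iteration count are assumptions for illustration. It uses `anova()` as in the answer rather than `car::Anova()` from the question.

```r
set.seed(123)    # assumed seed, for reproducibility only
n_sims <- 1000   # assumed smaller run; the question uses 100000
n <- 10

# Group labels are the same in every iteration, so build them once
group <- factor(rep(c("group1", "group2", "group3"), each = n))

# Preallocate one slot per iteration for each p-value
bartlett_p <- numeric(n_sims)
f_p <- numeric(n_sims)

for (i in seq_len(n_sims)) {
  # Fresh draws each iteration: means 5, 5, 5 and sds 5, 10, 15
  num <- c(rnorm(n, 5, 5), rnorm(n, 5, 10), rnorm(n, 5, 15))
  sim <- data.frame(group = group, num = num)

  # Bartlett's test of equal variances across groups
  bartlett_p[i] <- bartlett.test(num ~ group, data = sim)$p.value

  # One-way ANOVA F-test; the first row of the table is the group effect
  f_p[i] <- anova(lm(num ~ group, data = sim))[["Pr(>F)"]][1]
}

# Proportion of iterations rejecting at the 5% level
mean(bartlett_p < 0.05)
mean(f_p < 0.05)
```

Storing a single number per iteration is what makes plain numeric vectors work here; whole test objects or model fits would need lists instead.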

huangapple
  • Published on March 21, 2023, 01:03:04
  • Please keep this link when reposting: https://go.coder-hub.com/75793230.html