March 21, 2023, 01:03:04

How to loop multiple Normal distributions with Bartlett tests and ANOVAs?

Question

I am trying to simulate data for a single-factor ANOVA by drawing from a Normal distribution, assuming 3 groups with n=10 observations in each group. The means {5,5,5} and standard deviations {5,10,15} are used for the three groups. I then want to test the assumption of homoscedasticity using Bartlett's test, followed by an F-test.

I want to repeat these steps 100,000 times in a for loop while also saving the results of the Bartlett's test and the F-test in vectors.

I've run the code below in R, but the normal distributions only save 1 observation per group instead of 10 observations per group. This stops the rest of the code from running properly.

v <- 100000 #create a vector for 100000 to be used in the for loop
#create empty vectors to store the repeated values from the for loop in
Storage <- c()
Group1 <- c()
Group2 <- c()
Group3 <- c()
b.test2 <- c()
f.test2 <- c()
fit2 <- c()
# Loop through a normal distribution for 3 groups 100,000 times
for (i in 1:v) {
  # Generate random data from a normal distribution for each group
  Group1[i] <- rnorm(n = 10, mean = 5, sd = 5)
  Group2[i] <- rnorm(n = 10, mean = 5, sd = 10)
  Group3[i] <- rnorm(n = 10, mean = 5, sd = 15)
  # Combine the three groups into a single data frame
  data2 <- data.frame(Group1, Group2, Group3)
  # Pivot the simulated data into a long data set prior to running the ANOVA and Bartlett test
  data.long2 <- data2 %>%
    pivot_longer(cols = c("Group1":"Group3"), names_to = "group", values_to = "num" )
  # Check for homoscedasticity of the residuals using a Bartlett test and save the results into a vector
  b.test2[i] <- bartlett.test(data2)
  # Create the first model fit for the simulated data
  fit2[i] <- lm(num~group, data = data2)
  # Save the p-value from the F-test in a vector
  f.test2[i] <- Anova(fit2, type=3)[1,4]
}

Answer 1

Score: 1

Since you are trying to run these analyses on individual groups, you need to store your data in lists (or subgroups).
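To illustrate why the loop in the question keeps only one observation per group, here is a minimal base R sketch. The names bad and good and the seed are illustrative assumptions, not part of the original post. Assigning a length-10 vector to a single element of a vector stores only the first value (R warns that the "number of items to replace is not a multiple of replacement length"), whereas a list element can hold the whole vector.

```r
set.seed(1)  # illustrative seed, not from the original post

# Assigning 10 values into one element of a vector keeps only the first one.
# The warning comes from the assignment itself, so it is suppressed around it.
bad <- c()
suppressWarnings(bad[1] <- rnorm(n = 10, mean = 5, sd = 5))
length(bad)        # number of values actually stored in the vector

# A list element can hold a whole vector, so all 10 draws survive.
good <- vector("list", 3)
good[[1]] <- rnorm(n = 10, mean = 5, sd = 5)
length(good[[1]])  # number of values stored in the first list element
```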

Here is a different approach that relies on purrr and dplyr and avoids the explicit loop.

First, create a function that generates your data:

group_fun <- function(n){
  data.frame(group = rep(c("group1", "group2", "group3"), each=n),
             num = c(rnorm(n,5,5), rnorm(n,5,10), rnorm(n,5,15)))
}
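As a quick sanity check of the generator, the sketch below calls it once and inspects the result. The function is repeated so the snippet runs on its own; the seed and the name sim are illustrative assumptions.

```r
# Same generator as in the answer, repeated so this snippet is self-contained
group_fun <- function(n) {
  data.frame(group = rep(c("group1", "group2", "group3"), each = n),
             num = c(rnorm(n, 5, 5), rnorm(n, 5, 10), rnorm(n, 5, 15)))
}

set.seed(42)                     # illustrative seed
sim <- group_fun(10)

dim(sim)                         # rows and columns of the long-format data
table(sim$group)                 # observations per group
tapply(sim$num, sim$group, sd)   # sample SDs: noisy estimates of 5, 10, 15
```

The data come out already in long format (one group column, one num column), so no pivot_longer step is needed before fitting the model.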

Then run the data function the desired number of times (I'm using 20 for brevity):

library(dplyr)   # mutate, %>%
library(purrr)   # map
library(tidyr)   # unnest

n <- 10
1:20 %>%                                     # number of iterations
  map(., ~ group_fun(n)) %>%                 # run the data function once for each iteration
  tibble::enframe(.) %>%                     # one row per iteration; data in list-column `value`
  mutate(fit  = map(value, ~lm(num~group, data = .)),
         anov = map(fit, anova),             # run models
         bart = map(value, ~bartlett.test(num~group, data = .))) %>%
  mutate(ftest = map(anov, broom::tidy),
         bart = map(bart, broom::tidy),      # process results
         summary = map(fit, broom::tidy)) %>%
  unnest(ftest)                              # pull out results

Since some of the model outputs share column names, you can use names_repair if you want to unnest more than one output, for example unnest(c(ftest, bart), names_repair = "universal").

There are slightly more efficient ways to do this, but this should get you started.
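For readers who prefer to keep the for loop from the question, one possible base R sketch follows. It is not the answer's method: it swaps in stats::anova() for car::Anova(type = 3), which gives the same F-test for the single group factor in a one-way model. It stores only the two p-values per iteration in preallocated vectors. The names b.p, f.p and dat, the seed, and the reduced iteration count are assumptions made for illustration.

```r
set.seed(123)   # illustrative seed
v <- 1000       # the question uses 100000; smaller here so the sketch runs quickly
n <- 10

# Preallocate one slot per iteration for each p-value
b.p <- numeric(v)
f.p <- numeric(v)

for (i in seq_len(v)) {
  # Fresh long-format data for this iteration: 3 groups of n observations
  dat <- data.frame(
    group = factor(rep(c("Group1", "Group2", "Group3"), each = n)),
    num   = c(rnorm(n, 5, 5), rnorm(n, 5, 10), rnorm(n, 5, 15))
  )
  # Bartlett test of equal variances across groups
  b.p[i] <- bartlett.test(num ~ group, data = dat)$p.value
  # One-way ANOVA F-test; row 1 of the anova table is the group effect
  f.p[i] <- anova(lm(num ~ group, data = dat))[1, "Pr(>F)"]
}

mean(b.p < 0.05)  # share of iterations where Bartlett rejects equal variances
mean(f.p < 0.05)  # share of iterations where the F-test rejects equal means
```

Because each iteration builds its own data frame and extracts a single number from each test, the result vectors have exactly one entry per iteration, which avoids the one-observation-per-group problem described in the question.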
