For loop in R: error in model.frame.default()

Question

I'm quite new to loops, so please bear with me.

I have calculated alpha-diversity indices (Observed, Shannon, InvSimpson, Evenness), and I want to run a Kruskal-Wallis test on each of them against the Month variable of my table.

My table (df) looks something like this:

Observed Shannon InvSimpson Evenness Month
688 4.5538 23.365814 0.696963 February
749 4.3815 15.162467 0.661992 February
610 3.8291 11.178981 0.597054 February
665 4.2011 16.284009 0.646343 March
839 5.1855 43.198709 0.770260 March
516 3.2393 4.765211 0.518611 April
470 3.9677 11.614851 0.644873 April
539 4.2995 15.593572 0.683583 April
... ... ... ... ...

Before trying a loop, I ran the test one index at a time, like so:

    obs <- df %>% kruskal_test(Observed ~ Month)
    sha <- df %>% kruskal_test(Shannon ~ Month)
    inv <- df %>% kruskal_test(InvSimpson ~ Month)
    eve <- df %>% kruskal_test(Evenness ~ Month)
    res.kruskal <- rbind(obs, sha, inv, eve)
    res.kruskal

That worked, and it is the result I want to get from the for loop:

    # A tibble: 4 × 6
      .y.            n statistic    df       p method
      <chr>      <int>     <dbl> <int>   <dbl> <chr>
    1 Observed      45      20.6     9 0.0144  Kruskal-Wallis
    2 Shannon       45      24.0     9 0.00434 Kruskal-Wallis
    3 InvSimpson    45      20.3     9 0.0159  Kruskal-Wallis
    4 Evenness      45      22.0     9 0.00899 Kruskal-Wallis

However, when I try it with a for loop like so:

    Indices <- c("Observed", "Shannon", "InvSimpson", "Evenness")
    result.kruskal <- data_frame()
    for (i in Indices) {
      kruskal <- df %>% kruskal_test(i ~ Month)
      result.kruskal <- rbind(result.kruskal, kruskal)
    }

I get the following error:

    Error in model.frame.default(formula = formula, data = data) :
      variable lengths differ (found for 'Month')

Judging from similar errors on the forum, I don't think the problem comes from the Month variable, despite what the error message says. There are no NA values in my table df either. Am I writing the for loop wrong?

I would be thankful for any insight you might have.

Sophie

Answer 1

Score: 0

Using the first rows of your dataset as an example, both lapply() and apply() can iterate over the columns. bind_rows() then combines the results of the individual tests into a single data frame:

    library(tidyverse)
    library(rstatix)

    Indices <- c("Observed", "Shannon", "InvSimpson", "Evenness")

    # Using lapply
    result.kruskal <- bind_rows(
      lapply(df[Indices], FUN = function(x) kruskal_test(df, x ~ Month)),
      .id = "variable"
    ) %>%
      select(-2) %>%  # drop the .y. column; the index name is already in `variable`
      as.data.frame()
    result.kruskal
    #>     variable n statistic df      p         method
    #> 1   Observed 8      5.14  2 0.0766 Kruskal-Wallis
    #> 2    Shannon 8         2  2  0.368 Kruskal-Wallis
    #> 3 InvSimpson 8      3.22  2    0.2 Kruskal-Wallis
    #> 4   Evenness 8      1.44  2  0.486 Kruskal-Wallis

    # Or with apply
    result.kruskal <- bind_rows(
      apply(df[Indices], 2, FUN = function(x) kruskal_test(df, x ~ Month)),
      .id = "variable"
    ) %>%
      select(-2) %>%
      as.data.frame()
    result.kruskal
    #>     variable n statistic df      p         method
    #> 1   Observed 8      5.14  2 0.0766 Kruskal-Wallis
    #> 2    Shannon 8         2  2  0.368 Kruskal-Wallis
    #> 3 InvSimpson 8      3.22  2    0.2 Kruskal-Wallis
    #> 4   Evenness 8      1.44  2  0.486 Kruskal-Wallis

    # Example data
    df <- read.table(text = "Observed Shannon InvSimpson Evenness Month
    688 4.5538 23.365814 0.696963 February
    749 4.3815 15.162467 0.661992 February
    610 3.8291 11.178981 0.597054 February
    665 4.2011 16.284009 0.646343 March
    839 5.1855 43.198709 0.770260 March
    516 3.2393 4.765211 0.518611 April
    470 3.9677 11.614851 0.644873 April
    539 4.2995 15.593572 0.683583 April", header = TRUE)
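
As for why the original loop fails: in `i ~ Month`, R does not substitute the value of `i` into the formula. It looks up a variable literally named `i`, finds the length-1 character string from the loop, and compares it with the full-length Month column, hence "variable lengths differ". If you would rather keep the for loop, one option is to build the formula from the string. The following is a minimal sketch, assuming base R's kruskal.test() rather than rstatix, so the result columns are assembled by hand to resemble the rstatix output. The names `rows`, `f` and `kt` are only illustrative.

```r
# Example data: the first rows of the table from the question
df <- read.table(text = "Observed Shannon InvSimpson Evenness Month
688 4.5538 23.365814 0.696963 February
749 4.3815 15.162467 0.661992 February
610 3.8291 11.178981 0.597054 February
665 4.2011 16.284009 0.646343 March
839 5.1855 43.198709 0.770260 March
516 3.2393 4.765211 0.518611 April
470 3.9677 11.614851 0.644873 April
539 4.2995 15.593572 0.683583 April", header = TRUE)

Indices <- c("Observed", "Shannon", "InvSimpson", "Evenness")

# Collect one result row per index, then bind them once after the loop
rows <- vector("list", length(Indices))
names(rows) <- Indices

for (i in Indices) {
  # reformulate() turns the string in `i` into a real formula,
  # e.g. Observed ~ Month, instead of the literal `i ~ Month`
  f <- reformulate("Month", response = i)
  kt <- kruskal.test(f, data = df)

  rows[[i]] <- data.frame(
    variable  = i,
    n         = nrow(df),
    statistic = unname(kt$statistic),
    df        = unname(kt$parameter),
    p         = kt$p.value,
    method    = "Kruskal-Wallis",
    stringsAsFactors = FALSE
  )
}

result.kruskal <- do.call(rbind, rows)
rownames(result.kruskal) <- NULL
result.kruskal
```

`as.formula(paste(i, "~ Month"))` is an equivalent way to build `f`. The same formula object should also be accepted by rstatix's kruskal_test(df, f), but that is an assumption worth checking against your installed version.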

Posted by huangapple on April 7, 2023, 03:05:29
Source: https://go.coder-hub.com/75952936.html