For loop in R: error in model.frame.default()

Question

I'm quite new to loops, so please be patient with me.

I have calculated alpha indices (Observed, Shannon, InvSimpson, Evenness), and I want to run a Kruskal-Wallis test on each of them against the Month variable of my table.

My table (df) looks something like this:

Observed Shannon InvSimpson Evenness Month
688 4.5538 23.365814 0.696963 February
749 4.3815 15.162467 0.661992 February
610 3.8291 11.178981 0.597054 February
665 4.2011 16.284009 0.646343 March
839 5.1855 43.198709 0.770260 March
516 3.2393 4.765211 0.518611 April
470 3.9677 11.614851 0.644873 April
539 4.2995 15.593572 0.683583 April
... ... ... ... ...

Before trying a loop, I ran the test one index at a time, like so:

obs <- df %>% kruskal_test(Observed ~ Month)
sha <- df %>% kruskal_test(Shannon ~ Month)
inv <- df %>% kruskal_test(InvSimpson ~ Month)
eve <- df %>% kruskal_test(Evenness ~ Month)
res.kruskal <- rbind(obs, sha, inv, eve)
res.kruskal

It worked, and this is the same result I want to get with the for loop:

# A tibble: 4 × 6
  .y.            n statistic    df       p method        
  <chr>      <int>     <dbl> <int>   <dbl> <chr>         
1 Observed      45      20.6     9 0.0144  Kruskal-Wallis
2 Shannon       45      24.0     9 0.00434 Kruskal-Wallis
3 InvSimpson    45      20.3     9 0.0159  Kruskal-Wallis
4 Evenness      45      22.0     9 0.00899 Kruskal-Wallis

However, when I try it with a for loop like so :

Indices <- c("Observed", "Shannon", "InvSimpson", "Evenness")
result.kruskal <- data_frame()

for (i in Indices) {
  kruskal <- df %>% kruskal_test(i ~ Month)
  result.kruskal <- rbind(result.kruskal, kruskal)
}

I get the following error :

Error in model.frame.default(formula = formula, data = data) : 
  variable lengths differ (found for 'Month')

Judging from similar errors on the forum, I don't think the problem comes from the Month variable, despite what the error message says. There are no NA values in my table df either. Am I writing the for loop wrong?

I would be thankful for any insight you might have.

Sophie

Answer 1

Score: 0

Using the first rows of your dataset as an example, both lapply() and apply() can be used to iterate over the columns. The results of the individual tests can then be combined into a single data frame with bind_rows():

library(tidyverse)
library(rstatix)
Indices <- c("Observed", "Shannon", "InvSimpson", "Evenness")

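Using lapply:

# x is one column of df at a time; Month is looked up in df
result.kruskal <- bind_rows(
  lapply(df[Indices], FUN = function(x) kruskal_test(df, x ~ Month)),
  .id = "variable") %>%
  select(-2) %>%   # drop the second column (.y.); the index name is in "variable"
  as.data.frame()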

result.kruskal

 variable        n statistic    df   p    method        
1 Observed       8      5.14     2 0.0766 Kruskal-Wallis
2 Shannon        8      2        2 0.368  Kruskal-Wallis
3 InvSimpson     8      3.22     2 0.2    Kruskal-Wallis
4 Evenness       8      1.44     2 0.486  Kruskal-Wallis

Or with apply:

result.kruskal <- bind_rows(
  apply(df[Indices], 2, FUN = function(x) kruskal_test(df, x ~ Month)),
  .id = "variable") %>%
  select(-2) %>%
  as.data.frame()

result.kruskal

 variable        n statistic    df   p    method        
1 Observed       8      5.14     2 0.0766 Kruskal-Wallis
2 Shannon        8      2        2 0.368  Kruskal-Wallis
3 InvSimpson     8      3.22     2 0.2    Kruskal-Wallis
4 Evenness       8      1.44     2 0.486  Kruskal-Wallis

Example data

df <- read.table(text = "Observed	Shannon	InvSimpson	Evenness	Month
688	4.5538	23.365814	0.696963	February
749	4.3815	15.162467	0.661992	February
610	3.8291	11.178981	0.597054	February
665	4.2011	16.284009	0.646343	March
839	5.1855	43.198709	0.770260	March
516	3.2393	4.765211	0.518611	April
470	3.9677	11.614851	0.644873	April
539	4.2995	15.593572	0.683583	April", header = TRUE)
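
As for why the original for loop fails: in `i ~ Month`, the value stored in `i` is never substituted into the formula. R looks for a variable literally named `i`, does not find such a column in df, and falls back to the loop variable itself, a character vector of length 1. Its length differs from that of Month, hence the error. If you prefer to keep the loop, one option is to build the formula from the string. The sketch below is only an illustration, not part of the code above: it uses base R's kruskal.test() and reformulate() so that it runs without extra packages, and it assumes df has no missing values (n is simply taken as nrow(df)).

```r
# Example data: the first rows of the table from the question
df <- read.table(text = "Observed Shannon InvSimpson Evenness Month
688 4.5538 23.365814 0.696963 February
749 4.3815 15.162467 0.661992 February
610 3.8291 11.178981 0.597054 February
665 4.2011 16.284009 0.646343 March
839 5.1855 43.198709 0.770260 March
516 3.2393 4.765211 0.518611 April
470 3.9677 11.614851 0.644873 April
539 4.2995 15.593572 0.683583 April", header = TRUE)

Indices <- c("Observed", "Shannon", "InvSimpson", "Evenness")

# Collect one result row per index, then bind them once after the loop
rows <- vector("list", length(Indices))
names(rows) <- Indices

for (i in Indices) {
  # reformulate() turns the string in i into the formula <index> ~ Month
  f <- reformulate("Month", response = i)
  test <- kruskal.test(f, data = df)
  rows[[i]] <- data.frame(
    variable  = i,
    n         = nrow(df),                 # assumes no rows are dropped for NA
    statistic = unname(test$statistic),
    df        = unname(test$parameter),
    p         = test$p.value,
    method    = "Kruskal-Wallis"
  )
}

result.kruskal <- do.call(rbind, rows)
rownames(result.kruskal) <- NULL
result.kruskal
```

as.formula(paste(i, "~ Month")) builds the same formula. Passing such a formula object to rstatix's kruskal_test(df, f) should behave the same way inside the loop, although that variant is not tested here.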

huangapple
  • Published on April 7, 2023, 03:05:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75952936.html