Error in anova(): "Error in getResponseFormula(el) : 'form' must be a two-sided formula"

huangapple go评论104阅读模式
英文:

Error in anova(): "Error in getResponseFormula(el) : 'form' must be a two-sided formula"

问题

我有一组植物生长的纵向数据,记录在不同的季节。我使用nlme包的nlme函数将数据拟合到具有/不具有季节效应的生长模型中。
**为了查看估计参数(k)的季节效应,我使用anova函数比较了这两个模型,然而,我得到了"Error in getResponseFormula(el) : 'form' must be a two-sided formula"的错误,陷入了这个问题中。
有人可以帮我解决这个问题吗?

以下是可重现的示例。

数据

  1. library(tidyverse)
  2. # 生成虚拟数据
  3. n = 100
  4. id = as.factor(1:n)
  5. season = c("s1","s2","s3","s4")
  6. L0 = rnorm(n,40,5)
  7. td = rnorm(n, 180, 10)
  8. growth <- function(td, Lt_1){
  9. return(function(k,p){
  10. 150*((1 + ((150/Lt_1)^(1/p)-1)*exp(-k*td/365))^(-p))
  11. })
  12. }
  13. L1 = growth(Lt_1 = L0, td = td)(p = 1.2, k = 0.55)
  14. L2 = growth(Lt_1 = L1, td = td)(p = 1.2, k = 1.09)
  15. L3 = growth(Lt_1 = L2, td = td)(p = 1.2, k = 0.62)
  16. L4 = growth(Lt_1 = L3, td = td)(p = 1.2, k = 0.97)
  17. df <- data.frame(id = id, L0 = L0, L1 = L1, L2 = L2, L3 = L3, L4 = L4) %>%
  18. pivot_longer(cols = L0:L4, names_to = "Ltype", values_to = "Lobs") %>%
  19. mutate(Lobs = round(Lobs)) %>%
  20. group_by(id) %>%
  21. mutate(Lt_1 = lag(Lobs)) %>%
  22. filter(!is.na(Lt_1)) %>%
  23. mutate(td = round(rnorm(4, 180, 10))) %>%
  24. mutate(season = factor(season)) %>%
  25. ungroup() %>%
  26. mutate(season2 = case_when(
  27. season == "s3" ~ "s1",
  28. season == "s4" ~ "s2",
  29. TRUE ~ season
  30. )) %>%
  31. mutate(season2 = factor(season2)) %>%
  32. mutate(log_Lobs = log(Lobs))
  33. > head(df)
  34. # A tibble: 6 x 8
  35. id Ltype Lobs Lt_1 td season season2 log_Lobs
  36. <fct> <chr> <dbl> <dbl> <dbl> <fct> <fct> <dbl>
  37. 1 1 L1 57 48 178 s1 s1 4.04
  38. 2 1 L2 76 57 184 s2 s2 4.33
  39. 3 1 L3 87 76 188 s3 s1 4.47
  40. 4 1 L4 103 87 194 s4 s2 4.63
  41. 5 2 L1 43 35 175 s1 s1 3.76
  42. 6 2 L2 63 43 176 s2 s2 4.14

拟合模型

  1. library(nlme)
  2. ## 没有季节效应的模型
  3. formula = Lobs ~ 150*((1 + ((150/Lt_1)^(1/p)-1)*exp(-k*td/365))^(-p))
  4. p = 1.2
  5. k = 0.4
  6. fit_null <- nlme(formula,
  7. fixed = c(p ~ 1, k ~ 1),
  8. random = p ~ 1 | id,
  9. data = df,
  10. start = list(fixed = c(p, k)),
  11. na.action = na.exclude,
  12. control=list(maxIter=1e6, msMaxIter = 1e6, msVerbose = TRUE))
  13. ## 带季节效应的模型
  14. p = 1.2
  15. k = c(0.4, 1.09)
  16. fit_withSeason <- nlme(formula,
  17. fixed = c(p ~ 1, k ~ 1 + season2),
  18. random = k ~ 1 | id,
  19. data = df,
  20. start = list(fixed = c(p, k)),
  21. na.action = na.exclude,
  22. control=list(maxIter=1e6, msMaxIter = 1e6, msVerbose = TRUE))

两个模型的摘要

  1. > summary(fit_null)
  2. Nonlinear mixed-effects model fit by maximum likelihood
  3. Model: formula
  4. Data: df
  5. AIC BIC logLik
  6. 2312.975 2328.941 -1152.488
  7. Random effects:
  8. Formula: p ~ 1 | id
  9. p Residual
  10. StdDev: 8.300513e-05 4.315789
  11. Fixed effects: c(p ~ 1, k ~ 1)
  12. Value Std.Error DF t-value p-value
  13. p 0.7335954 0.09075133 299 8.083577 0
  14. k 0.9823510 0.05594228 299 17.560080 0
  15. Correlation:
  16. p
  17. k -0.969
  18. Standardized Within-Group Residuals:
  19. Min Q1 Med Q3 Max
  20. -1.4898414 -0.8966244 -0.2154542 0.8401292 2.2148509
  21. Number of Observations: 400
  22. Number of Groups: 100
  23. > summary(fit_withSeason)
  24. Nonlinear mixed-effects model fit by maximum likelihood
  25. Model: formula
  26. Data: df
  27. AIC BIC logLik
  28. 1466.922 1486.879 -728.4608
  29. Random effects:
  30. Formula: k ~ 1 | id
  31. k.(Intercept) Residual
  32. StdDev: 0.03294767 1.37547
  33. Fixed effects: c(p ~ 1, k ~ 1 + season2)
  34. Value Std.Error DF t-value p-value
  35. p 1.9509211 0.16984927 298 11.48619 0
  36. k.(Intercept) 0.5212180 0.
  37. <details>
  38. <summary>英文:</summary>
  39. I have a longitudinal dataset for plant growth recorded in different seasons. I fitted the data to growth models with/without seasonal effect using `nlme` function from `nlme` package.
  40. **To see the seasonal effect on the estimated parameter (k), I compared the two models using `anova` function, however, I got &quot;Error in getResponseFormula(el) : &#39;form&#39; must be a two-sided formula&quot;** and stuck in this problem.
  41. Can anyone help me solve this problem?
  42. Here is the reproducible example.
  43. ### data
  44. ```r
  45. library(tidyverse)
  46. # generate dummy data
  47. n = 100
  48. id = as.factor(1:n)
  49. season = c(&quot;s1&quot;,&quot;s2&quot;,&quot;s3&quot;,&quot;s4&quot;)
  50. L0 = rnorm(n,40,5)
  51. td = rnorm(n, 180, 10)
  52. growth &lt;- function(td, Lt_1){
  53. return(function(k,p){
  54. 150*((1 + ((150/Lt_1)^(1/p)-1)*exp(-k*td/365))^(-p))
  55. })
  56. }
  57. L1 = growth(Lt_1 = L0, td = td)(p = 1.2, k = 0.55)
  58. L2 = growth(Lt_1 = L1, td = td)(p = 1.2, k = 1.09)
  59. L3 = growth(Lt_1 = L2, td = td)(p = 1.2, k = 0.62)
  60. L4 = growth(Lt_1 = L3, td = td)(p = 1.2, k = 0.97)
  61. df &lt;- data.frame(id = id, L0 = L0, L1 = L1, L2 = L2, L3 = L3, L4 = L4) %&gt;%
  62. pivot_longer(cols = L0:L4, names_to = &quot;Ltype&quot;, values_to = &quot;Lobs&quot;) %&gt;%
  63. mutate(Lobs = round(Lobs)) %&gt;%
  64. group_by(id) %&gt;%
  65. mutate(Lt_1 = lag(Lobs)) %&gt;%
  66. filter(!is.na(Lt_1)) %&gt;%
  67. mutate(td = round(rnorm(4, 180, 10))) %&gt;%
  68. mutate(season = factor(season)) %&gt;%
  69. ungroup() %&gt;%
  70. mutate(season2 = case_when(
  71. season == &quot;s3&quot; ~ &quot;s1&quot;,
  72. season == &quot;s4&quot; ~ &quot;s2&quot;,
  73. TRUE ~ season
  74. )) %&gt;%
  75. mutate(season2 = factor(season2)) %&gt;%
  76. mutate(log_Lobs = log(Lobs))
  77. &gt; head(df)
  78. # A tibble: 6 x 8
  79. id Ltype Lobs Lt_1 td season season2 log_Lobs
  80. &lt;fct&gt; &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt;
  81. 1 1 L1 57 48 178 s1 s1 4.04
  82. 2 1 L2 76 57 184 s2 s2 4.33
  83. 3 1 L3 87 76 188 s3 s1 4.47
  84. 4 1 L4 103 87 194 s4 s2 4.63
  85. 5 2 L1 43 35 175 s1 s1 3.76
  86. 6 2 L2 63 43 176 s2 s2 4.14

fitting

  1. library(nlme)
  2. ## model without seasonal effect
  3. formula = Lobs ~ 150*((1 + ((150/Lt_1)^(1/p)-1)*exp(-k*td/365))^(-p))
  4. p = 1.2
  5. k = 0.4
  6. fit_null &lt;- nlme(formula,
  7. fixed = c(p ~ 1, k ~ 1),
  8. random = p ~ 1 | id,
  9. data = df,
  10. start = list(fixed = c(p, k)),
  11. na.action = na.exclude,
  12. control=list(maxIter=1e6, msMaxIter = 1e6, msVerbose = TRUE))
  13. ## model with seasonal effect
  14. p = 1.2
  15. k = c(0.4, 1.09)
  16. fit_withSeason &lt;- nlme(formula,
  17. fixed = c(p ~ 1, k ~ 1 + season2),
  18. random = k ~ 1 | id,
  19. data = df,
  20. start = list(fixed = c(p, k)),
  21. na.action = na.exclude,
  22. control=list(maxIter=1e6, msMaxIter = 1e6, msVerbose = TRUE))

summary of the two models

  1. &gt; summary(fit_null)
  2. Nonlinear mixed-effects model fit by maximum likelihood
  3. Model: formula
  4. Data: df
  5. AIC BIC logLik
  6. 2312.975 2328.941 -1152.488
  7. Random effects:
  8. Formula: p ~ 1 | id
  9. p Residual
  10. StdDev: 8.300513e-05 4.315789
  11. Fixed effects: c(p ~ 1, k ~ 1)
  12. Value Std.Error DF t-value p-value
  13. p 0.7335954 0.09075133 299 8.083577 0
  14. k 0.9823510 0.05594228 299 17.560080 0
  15. Correlation:
  16. p
  17. k -0.969
  18. Standardized Within-Group Residuals:
  19. Min Q1 Med Q3 Max
  20. -1.4898414 -0.8966244 -0.2154542 0.8401292 2.2148509
  21. Number of Observations: 400
  22. Number of Groups: 100
  23. &gt; summary(fit_withSeason)
  24. Nonlinear mixed-effects model fit by maximum likelihood
  25. Model: formula
  26. Data: df
  27. AIC BIC logLik
  28. 1466.922 1486.879 -728.4608
  29. Random effects:
  30. Formula: k ~ 1 | id
  31. k.(Intercept) Residual
  32. StdDev: 0.03294767 1.37547
  33. Fixed effects: c(p ~ 1, k ~ 1 + season2)
  34. Value Std.Error DF t-value p-value
  35. p 1.9509211 0.16984927 298 11.48619 0
  36. k.(Intercept) 0.5212180 0.01156849 298 45.05498 0
  37. k.season2s2 0.4151107 0.00828721 298 50.09055 0
  38. Correlation:
  39. p k.(In)
  40. k.(Intercept) -0.868
  41. k.season2s2 -0.572 0.265
  42. Standardized Within-Group Residuals:
  43. Min Q1 Med Q3 Max
  44. -2.341255091 -0.727512901 0.003841993 0.622830078 3.190926108
  45. Number of Observations: 400
  46. Number of Groups: 100
  1. # model comparison to test the seasonal effect
  2. &gt; anova(fit_null, fit_withSeason)
  3. Error in getResponseFormula(el) : &#39;form&#39; must be a two-sided formula

答案1

得分: 1

这可能是nlme中的一个错误,但是一个棘手的问题。当你将公式传递为一个符号 称为"formula"(即"formula",而不是"Lobs ~ ")时,它无法正确提取模型调用中的公式。

如果你将公式的变量名从formula更改为(例如)form1,我认为一切都应该正常工作(另一个使用内置对象的名称作为变量名的原因!)

更一般地说(对于涉及存储为符号的对象的不正确评估的其他问题),你可以通过使用do.call()来解决这个问题,这会欺骗R在存储调用之前评估公式:

  1. fit_null <- do.call(nlme,
  2. list(formula,
  3. fixed = c(p ~ 1, k ~ 1),
  4. random = p ~ 1 | id,
  5. data = df,
  6. start = list(fixed = c(p, k)),
  7. na.action = na.exclude,
  8. control=list(maxIter=1e6, msMaxIter = 1e6, msVerbose = TRUE)
  9. ))
  10. ## 具有季节效应的模型
  11. p = 1.2
  12. k = c(0.4, 1.09)
  13. fit_withSeason <- do.call(nlme,
  14. list(formula,
  15. fixed = c(p ~ 1, k ~ 1 + season2),
  16. random = k ~ 1 | id,
  17. data = df,
  18. start = list(fixed = c(p, k)),
  19. na.action = na.exclude,
  20. control=list(maxIter=1e6, msMaxIter = 1e6, msVerbose = TRUE)))
  21. anova(fit_null, fit_withSeason)
  1. Model df AIC BIC logLik Test L.Ratio p-value
  2. fit_null 1 4 2317.913 2333.879 -1154.9565
  3. fit_withSeason 2 5 1540.088 1560.045 -765.0438 1 vs 2 779.8253 &lt;.0001

我已将此提交为R bug

英文:

This is arguably a bug in nlme, but a tricky one. When you pass the formula as a symbol called "formula" (i.e. "formula", rather than "Lobs ~ <stuff>"), it fails to extract the formula from the model call properly.

If you change the variable name of your formula from formula to (e.g.) form1, I think everything should work fine (another reason not to use names of built-in objects as variable names!)

More generally (for other issues involving improper evaluation of objects stored as symbols) you can work around this by using do.call(), which tricks R into evaluating the formula before storing the call:

  1. fit_null &lt;- do.call(nlme,
  2. list(formula,
  3. fixed = c(p ~ 1, k ~ 1),
  4. random = p ~ 1 | id,
  5. data = df,
  6. start = list(fixed = c(p, k)),
  7. na.action = na.exclude,
  8. control=list(maxIter=1e6, msMaxIter = 1e6, msVerbose = TRUE)
  9. ))
  10. ## model with seasonal effect
  11. p = 1.2
  12. k = c(0.4, 1.09)
  13. fit_withSeason &lt;- do.call(nlme,
  14. list(formula,
  15. fixed = c(p ~ 1, k ~ 1 + season2),
  16. random = k ~ 1 | id,
  17. data = df,
  18. start = list(fixed = c(p, k)),
  19. na.action = na.exclude,
  20. control=list(maxIter=1e6, msMaxIter = 1e6, msVerbose = TRUE)))
  21. anova(fit_null, fit_withSeason)
  1. Model df AIC BIC logLik Test L.Ratio p-value
  2. fit_null 1 4 2317.913 2333.879 -1154.9565
  3. fit_withSeason 2 5 1540.088 1560.045 -765.0438 1 vs 2 779.8253 &lt;.0001

I've submitted this as an R bug

huangapple
  • 本文由 发表于 2023年7月7日 07:02:49
  • 转载请务必保留本文链接:https://go.coder-hub.com/76632999.html
匿名

发表评论

匿名网友
#

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定