计算线性模型和其他广义线性模型的GAIC时出错。

huangapple go评论88阅读模式
英文:

Error in computing GAIC of linear model and other GLMs

问题

我的数据集可以在这里找到:https://raw.githubusercontent.com/yuliaUU/test/main/test.csv

  1. library(gamlss)
  2. library(tidyverse)
  3. data_final <- read_csv("https://raw.githubusercontent.com/yuliaUU/test/main/test.csv")
  4. # 使用对数转换的正态模型
  5. model_1 <- gamlss(log(Abundance) ~ salinity * avrg_dep, data = data_final, family = NO())
  6. # 对数正态模型
  7. model_2 <- gamlss(Abundance ~ salinity * avrg_dep, data = data_final, family = LOGNO())
  8. # 带有逆高斯分布的模型
  9. model_3 <- gamlss(Abundance ~ salinity * avrg_dep, data = data_final, family = IG())
  10. # Gamma模型
  11. model_4 <- gamlss(Abundance ~ salinity * avrg_dep, data = data_final, family = GA())
  12. 我想要使用GAIC来比较这些模型,但第一个模型的GAIC值与其余模型相差较大。
  13. 我读到:
  14. > 为了确保具有转换响应的线性模型的GAIC可比性,使用了转换后的对数似然乘以Jacobian,并手动重新计算了GAIC
  15. 我尝试按以下方式进行:
  16. Jacobian <- 1/abs(data_final$Abundance)
  17. # 手动计算拟合值(对数尺度)
  18. fitted_values_log <- predict(model_1)
  19. # 手动计算残差(对数尺度)
  20. residuals_transformed <- log(data_final$Abundance) - fitted_values_log
  21. # 计算残差的标准差
  22. sd_residuals_transformed <- sd(residuals_transformed)
  23. # 计算转换后的对数似然
  24. log_likelihood_transformed <- sum(dnorm(log(data_final$Abundance), mean = fitted_values_log, sd = sd_residuals_transformed, log = TRUE) * Jacobian)
  25. # 计算自由度:模型中的参数数量
  26. df <- length(coef(model_1))
  27. # 手动计算GAIC
  28. GAIC_transformed <- -2 * log_likelihood_transformed + 2 * df
  29. GAIC_transformed

但所产生的值差距很大,所以我认为我在某个地方犯了错误。

英文:

my dataset can be found here: https://raw.githubusercontent.com/yuliaUU/test/main/test.csv

  1. library(gamlss)
  2. library(tidyverse)
  3. data_final&lt;- read_csv(&quot;https://raw.githubusercontent.com/yuliaUU/test/main/test.csv&quot;)
  4. # Normal model with log transformation
  5. model_1 &lt;- gamlss(log(Abundance) ~ salinity*avrg_dep, data = data_final, family = NO())
  6. # log normal model
  7. model_2 &lt;- gamlss(Abundance ~ salinity*avrg_dep, data = data_final, family = LOGNO())
  8. # Model with inverse gaussian distribution
  9. model_3 &lt;- gamlss(Abundance ~ salinity*avrg_dep, data = data_final, family = IG())
  10. # Gamma model
  11. model_4 &lt;- gamlss(Abundance ~ salinity*avrg_dep, data = data_final, family = GA())

I want to use GAIC to compare between the models, but GAIC value for 1st model is far off from the rest

I read that:
>To ensure that the GAIC of the linear model with the transformed response was comparable, the transformed log-likelihood multiplied by the Jacobian was used, and the GAIC was re-calculated manually.

I tried to do it the following way:

  1. Jacobian &lt;- 1/abs(data_final$Abundance)
  2. # Calculate fitted values (on the log scale)
  3. fitted_values_log &lt;- predict(model_1)
  4. # Calculate residuals manually (on the log scale)
  5. residuals_transformed &lt;- log(data_final$Abundance) - fitted_values_log
  6. # Calculate standard deviation of the residuals
  7. sd_residuals_transformed &lt;- sd(residuals_transformed)
  8. # Transformed log-likelihood calculation
  9. log_likelihood_transformed &lt;- sum(dnorm(log(data_final$Abundance), mean=fitted_values_log, sd=sd_residuals_transformed, log=TRUE) * Jacobian)
  10. # Calculate degrees of freedom: number of parameters in the model
  11. df &lt;- length(coef(model_1))
  12. # Manually calculate GAIC
  13. GAIC_transformed &lt;- -2 * log_likelihood_transformed + 2 * df
  14. GAIC_transformed

but the value produced is sooo off, so I think I made a mistake somewhere

答案1

得分: 0

最简单的答案是在gamlss中明确拟合对数正态分布,即family=LOGNO。

一个更一般的答案,适用于实数线上的正态分布以外的分布,例如TF,是创建相应的logTF分布:

gen.Family ("TF", type="log")

然后在gamlss拟合中使用

family=logTF

英文:

The easiest answer is to just fit the lognormal explicitly in gamlss,
i.e. family=LOGNO

A more general answer, that applies to distributions other than the normal distribution on the real line, e.g. TF, is to create the corresponding logTF distribution:

gen.Family ("TF", type="log")

and then in the gamlss fit use

family=logTF

答案2

得分: 0

Model 1: 使用lm()对log-transformed响应进行建模

model_1 <- lm(log(Abundance) ~ salinity * avrg_dep, data = data_final)

计算模型的log-likelihood

logL <- logLik(model_1)

使用log转换的Jacobian来调整log-likelihood

adjusted_logL <- logL + sum(log(1/data_final$Abundance))

统计模型中的参数数量(包括截距)

k <- length(coef(model_1))

获取样本大小

n <- length(model_1$residuals)

计算使用调整后log-likelihood的GAIC

GAIC_adjusted <- -2adjusted_logL + 2k + 2k(k+1)/(n-k-1)

print(GAIC_adjusted)

英文:
  1. # Model 1: Log-transformed response with lm()
  2. model_1 &lt;- lm(log(Abundance) ~ salinity * avrg_dep, data = data_final)
  3. # Calculate log-likelihood of the model
  4. logL &lt;- logLik(model_1)
  5. # Adjust the log-likelihood using the Jacobian for a log transformation
  6. adjusted_logL &lt;- logL + sum(log(1/data_final$Abundance))
  7. # Count the number of parameters in the model (including intercept)
  8. k &lt;- length(coef(model_1))
  9. # Get sample size
  10. n &lt;- length(model_1$residuals)
  11. # Compute GAIC with adjusted log-likelihood
  12. GAIC_adjusted &lt;- -2*adjusted_logL + 2*k + 2*k*(k+1)/(n-k-1)
  13. print(GAIC_adjusted)

huangapple
  • 本文由 发表于 2023年6月8日 22:51:04
  • 转载请务必保留本文链接:https://go.coder-hub.com/76433110.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定