英文:
R packages lme4 and glmmTMB produce different AIC for the same model and data
问题
我使用 lmer(){lme4}
和 glmmTMB(){glmmTMB}
分别对相同的数据拟合了相同的模型。模型的AIC因所使用的函数而异,我想知道为什么。
我生成了以下具有随机截距数据:
rm(list=ls())
library(lme4)
library(glmmTMB)
library(ggplot2)
# 具有8级随机效应和变量斜率的“回归”:
set.seed(1)
ii=8 # 参数设置大小; 组数
reps <- sample(x=8:12, size=ii, replace=T)
slope <- 3
intercept <- runif(ii, min = 70, max = 140)
group.sd <- runif(ii, min = 5, max = 15)
group.ID <- LETTERS[1:ii]
xx <- 1:15
out.list <- list() # 空列表用于存储模拟数据
for (i in 1:ii){
df00 <- data.frame(Group=rep(group.ID[i], reps[i]*length(xx)),
x=rep(xx, reps[i]))
df00$y = intercept[i] + df00$x * slope + rnorm(length(df00[,1]), 0, group.sd[i])
out.list[[i]] = df00
}
# 将out.list转换为数据框:
df <- do.call(rbind.data.frame, out.list)
# 数据可视化
gg20 <- ggplot(df, aes(x = x, y = y, colour = Group))
gg20 + geom_point() + geom_smooth(method="lm") + theme_classic()
如果我使用 lme4
和 glmmTMB
拟合相同的随机截距模型,相关的AIC不同:
glm4_ri <- lmer(y~x + (1|Group), data=df)
glmmTMB_ri <- glmmTMB(y~x + (1|Group), data=df)
AIC(glm4_ri, glmmTMB_ri)
df AIC
glm4_ri 4 8588.399
glmmTMB_ri 4 8590.300
我意识到差异很小,但为什么AIC不完全相同呢?
英文:
I fitted the same model to the same data using lmer(){lme4}
and glmmTMB(){glmmTMB}
. The model's AIC differed depending on the function used, and I would like to know why.
I generated the following random-intercept data:
rm(list=ls())
library(lme4)
library(glmmTMB)
library(ggplot2)
# "regression" with an 8-level RE with variable slopes:
set.seed(1)
ii=8 # parameter set size; no. groups
reps <- sample(x=8:12, size=ii, replace=T)
slope <- 3
intercept <- runif(ii, min = 70, max = 140)
group.sd <- runif(ii, min = 5, max = 15)
group.ID <- LETTERS[1:ii]
xx <- 1:15
out.list <- list() # empty list to store simulated data
for (i in 1:ii){
df00 <- data.frame(Group=rep(group.ID[i], reps[i]*length(xx)),
x=rep(xx, reps[i]))
df00$y = intercept[i] + df00$x * slope + rnorm(length(df00[,1]), 0, group.sd[i])
out.list[[i]] = df00
}
# to turn out.list into a data.frame:
df <- do.call(rbind.data.frame, out.list)
# data visualisation
gg20 <- ggplot(df, aes(x = x, y = y, colour = Group))
gg20 + geom_point() + geom_smooth(method="lm") + theme_classic()
If I fit the same random-intercept model with lme4
and glmmTMB
the associated AIC differs:
glm4_ri <- lmer(y~x + (1|Group), data=df)
glmmTMB_ri <- glmmTMB(y~x + (1|Group), data=df)
AIC(glm4_ri, glmmTMB_ri)
df AIC
glm4_ri 4 8588.399
glmmTMB_ri 4 8590.300
I realise that the discrepancy is small, but why aren't the AIC exactly the same?
答案1
得分: 1
tl;dr lmer
默认使用REML,glmmTMB
默认使用ML。这是一个令人惊讶和不幸的微妙差别 - lmer
在某些情况下会尝试禁用REML(例如在anova()
方法中),但有时候它会“不注意”。
如果你要比较具有不同固定效应成分的模型,你必须使用ML而不是REML。
英文:
tl;dr lmer
is using REML by default, glmmTMB
is using ML by default. This is surprisingly, and unfortunately, subtle - lmer
does make some efforts to disable REML when it makes sense (e.g. in the anova()
method), but sometimes it doesn't 'notice'.
> logLik(glm4_ri)
'log Lik.' -4290.2 (df=4)
> logLik(update(glm4_ri, REML = FALSE))
'log Lik.' -4291.15 (df=4)
> logLik(glmmTMB_ri)
'log Lik.' -4291.15 (df=4)
> logLik(update(glmmTMB_ri, REML = TRUE))
'log Lik.' -4290.2 (df=4)
If you're going to be comparing models with different fixed-effect components, you must use ML rather than REML.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论