英文:
How to add confidence intervals around predicted prior probabilities of latent class membership in a LCA model fitted with {poLCA}?
问题
I understand your request. Please find the translated code snippet below:
我想要将潜在类别分析(LCA)模型中带有协变量的预测先验概率添加置信区间。
我已经准备了一个示例,基于包{poLCA}的作者提供的一个示例([链接](https://www.jstatsoft.org/article/view/v042i10))。
用于计算置信区间的相关部分应该是在拟合模型对象中的协方差矩阵(在此示例中为`nes.3cov$coeff.V`)。
解决方案应该将置信区间的下限和上限添加到对象`pdata`中,以便它们可以作为可视化中的带状区域包含在内。
```r
library(poLCA)
library(tidyr)
library(dplyr)
library(ggplot2)
data(election)
## 使用3个类别拟合带有协变量的LCA模型
f.3cov <- cbind(MORALG,CARESG,KNOWG,LEADG,DISHONG,INTELG,
MORALB,CARESB,KNOWB,LEADB,DISHONB,INTELB)~PARTY*AGE
nes.3cov <- poLCA(f.3cov,election,nclass=3) # 对数似然:-16135.39
probs.start <- poLCA.reorder(nes.3cov$probs.start,
order(nes.3cov$P,decreasing=TRUE))
nes.3cov <- poLCA(f.3cov,election,nclass=3,probs.start=probs.start)
# 为预测的先验概率创建配置文件
## 不同年龄的强烈民主党人和强烈共和党人
pred.profs <- rbind(cbind(1,1,c(18:80),(c(18:80)*1)),
cbind(1,7,c(18:80),(c(18:80)*7)))
exb.pred.profs <- exp(pred.profs %*% nes.3cov$coeff)
# 计算预测的先验概率
pred.profs <- cbind(1, exb.pred.profs)/(1 + rowSums(exb.pred.profs))
pdata <- cbind(data.frame(pred.profs),
age = rep(18:80, 2),
party = rep(c("强烈民主党人", "强烈共和党人"),
each = 63))
# 为绘图准备数据
pdata <- pdata %>%
pivot_longer(cols = X1:X3, names_to = "亲和力", values_to = "pp") %>%
mutate(亲和力 = case_when(
亲和力 == "X1" ~ "其他",
亲和力 == "X2" ~ "布什亲和力",
亲和力 == "X3" ~ "戈尔亲和力"))
# 绘制预测的先验概率
pdata %>%
ggplot(aes(x = age, y = pp, fill = 亲和力, linetype = 亲和力)) +
geom_line() +
scale_linetype_manual(values = c("长虚线", "点线", "实线")) +
facet_wrap(~ party) +
coord_cartesian(ylim = c(0, 1)) +
labs(x = "年龄", y = "潜在类别成员概率") +
theme_classic()
创建于2023年5月17日,使用reprex v2.0.2
<details>
<summary>英文:</summary>
I want to add confidence intervals to the predicted prior probabilities from a latent class analysis (LCA) model with covariates.
I have prepared an example built upon one provided by the authors of the package {poLCA} ((https://www.jstatsoft.org/article/view/v042i10)).
The relevant piece for calculating the confidence intervals should be the covariance matrix in the object with the fitted model (`nes.3cov$coeff.V` in this example).
The solution should add the lower and upper bounds from the confidence intervals to the object `pdata`, so they can be included as ribbons in the visualization.
``` r
library(poLCA)
library(tidyr)
library(dplyr)
library(ggplot2)
data(election)
## Fit LCA model with covariates using 3 classes
f.3cov <- cbind(MORALG,CARESG,KNOWG,LEADG,DISHONG,INTELG,
MORALB,CARESB,KNOWB,LEADB,DISHONB,INTELB)~PARTY*AGE
nes.3cov <- poLCA(f.3cov,election,nclass=3) # log-likelihood: -16135.39
probs.start <- poLCA.reorder(nes.3cov$probs.start,
order(nes.3cov$P,decreasing=TRUE))
nes.3cov <- poLCA(f.3cov,election,nclass=3,probs.start=probs.start)
# Create profiles for predicted prior probabilities
## Strong dems and strong reps of different age
pred.profs <- rbind(cbind(1,1,c(18:80),(c(18:80)*1)),
cbind(1,7,c(18:80),(c(18:80)*7)))
exb.pred.profs <- exp(pred.profs %*% nes.3cov$coeff)
# Calculate predicted prior probabilities
pred.profs <- cbind(1, exb.pred.profs)/(1 + rowSums(exb.pred.profs))
pdata <- cbind(data.frame(pred.profs),
age = rep(18:80, 2),
party = rep(c("Strong Democrat", "Strong Republican"),
each = 63))
# Prepare data for plot
pdata <- pdata |>
pivot_longer(cols = X1:X3, names_to = "affinity", values_to = "pp") |>
mutate(affinity = case_when(
affinity == "X1" ~ "Other",
affinity == "X2" ~ "Bush affinity",
affinity == "X3" ~ "Gore affinity"))
# Plot predicted prior probabilities
pdata |>
ggplot(aes(x = age, y = pp, fill = affinity, linetype = affinity)) +
geom_line() +
scale_linetype_manual(values = c("longdash", "dotted", "solid")) +
facet_wrap(~ party) +
coord_cartesian(ylim = c(0, 1)) +
labs(x = "Age", y = "Probability of latent class membership") +
theme_classic()
<!-- -->
<sup>Created on 2023-05-17 with reprex v2.0.2</sup>
答案1
得分: 1
你可以使用一种伪贝叶斯方法来完成这个任务,将回归系数视为后验均值,方差协方差矩阵视为回归系数的后验方差协方差矩阵。然后,从具有这些回归系数和方差协方差矩阵作为参数的多元正态分布中进行随机抽样(例如,我进行了2500次抽样)。对于每次抽样,你可以生成概率矩阵,然后从这2500次模拟中找到下限和上限。
估计模型:
library(poLCA)
library(tidyr)
library(dplyr)
library(ggplot2)
data(election)
## 使用3个类别拟合LCA模型并考虑协变量
f.3cov <- cbind(MORALG,CARESG,KNOWG,LEADG,DISHONG,INTELG,
MORALB,CARESB,KNOWB,LEADB,DISHONB,INTELB)~PARTY*AGE
nes.3cov <- poLCA(f.3cov,election,nclass=3)
probs.start <- poLCA.reorder(nes.3cov$probs.start,
order(nes.3cov$P,decreasing=TRUE))
nes.3cov <- poLCA(f.3cov,election,nclass=3,probs.start=probs.start)
计算概率:
pred.profs <- rbind(cbind(1,1,c(18:80),(c(18:80)*1)),
cbind(1,7,c(18:80),(c(18:80)*7)))
q <- cbind(1, exp(pred.profs %*% nes.3cov$coeff))
p <- prop.table(q, 1)
进行模拟:
B <- MASS::mvrnorm(2500, c(nes.3cov$coeff), nes.3cov$coeff.V)
B <- lapply(1:nrow(B), \(i)matrix(B[i, ], ncol=2))
probs <- lapply(B, function(b){
q <- cbind(1, exp(pred.profs %*% b))
prop.table(q, 1)
})
probs <- abind::abind(probs, along=3)
lwr <- apply(probs, c(1,2), quantile, .025)
upr <- apply(probs, c(1,2), quantile, .975)
colnames(p) <- paste0("pp_", 1:3)
colnames(lwr) <- paste0("lwr_", 1:3)
colnames(upr) <- paste0("upr_", 1:3)
组合概率、下限、上限和数据:
pdata <- cbind(data.frame(p),
data.frame(lwr),
data.frame(upr),
age = rep(18:80, 2),
party = rep(c("Strong Democrat", "Strong Republican"),
each = 63))
# 为绘图准备数据
pdata <- pdata |>
pivot_longer(cols = 1:9, names_sep="_", names_to = c(".value", "affinity")) |>
mutate(affinity = case_when(
affinity == "1" ~ "Other",
affinity == "2" ~ "Bush affinity",
affinity == "3" ~ "Gore affinity"))
制作绘图:
pdata |>
ggplot(aes(x = age, y = pp, ymin =lwr, ymax=upr, linetype = affinity)) +
geom_ribbon(alpha=.2, fill="gray75") +
geom_line() +
scale_linetype_manual(values = c("longdash", "dotted", "solid")) +
facet_wrap(~ party) +
coord_cartesian(ylim = c(0, 1)) +
labs(x = "Age", y = "Probability of latent class membership") +
theme_classic()
<!-- -->
英文:
You could do this with a pseudo-Bayesian approach, treating the coefficients as posterior means and the variance-covariance matrix as the posterior variance-covariance matrix of the coefficients. Then, you take random draws from the multivariate normal distribution with the coefficients and variance-covariance matrix as parameters (I did 2500). For each draw you make the matrix of probabilities. Then, you can find the lower and upper bounds from those 2500 simulations.
Estimate models
library(poLCA)
library(tidyr)
library(dplyr)
library(ggplot2)
data(election)
## Fit LCA model with covariates using 3 classes
f.3cov <- cbind(MORALG,CARESG,KNOWG,LEADG,DISHONG,INTELG,
MORALB,CARESB,KNOWB,LEADB,DISHONB,INTELB)~PARTY*AGE
nes.3cov <- poLCA(f.3cov,election,nclass=3)
probs.start <- poLCA.reorder(nes.3cov$probs.start,
order(nes.3cov$P,decreasing=TRUE))
nes.3cov <- poLCA(f.3cov,election,nclass=3,probs.start=probs.start)
Calculate probabilties
pred.profs <- rbind(cbind(1,1,c(18:80),(c(18:80)*1)),
cbind(1,7,c(18:80),(c(18:80)*7)))
q <- cbind(1, exp(pred.profs %*% nes.3cov$coeff))
p <- prop.table(q, 1)
Do simulation
B <- MASS::mvrnorm(2500, c(nes.3cov$coeff), nes.3cov$coeff.V)
B <- lapply(1:nrow(B), \(i)matrix(B[i, ], ncol=2))
probs <- lapply(B, function(b){
q <- cbind(1, exp(pred.profs %*% b))
prop.table(q, 1)
})
probs <- abind::abind(probs, along=3)
lwr <- apply(probs, c(1,2), quantile, .025)
upr <- apply(probs, c(1,2), quantile, .975)
colnames(p) <- paste0("pp_", 1:3)
colnames(lwr) <- paste0("lwr_", 1:3)
colnames(upr) <- paste0("upr_", 1:3)
Combine probabilities, lower and upper bounds and data
pdata <- cbind(data.frame(p),
data.frame(lwr),
data.frame(upr),
age = rep(18:80, 2),
party = rep(c("Strong Democrat", "Strong Republican"),
each = 63))
# Prepare data for plot
pdata <- pdata |>
pivot_longer(cols = 1:9, names_sep="_", names_to = c(".value", "affinity")) |>
mutate(affinity = case_when(
affinity == "1" ~ "Other",
affinity == "2" ~ "Bush affinity",
affinity == "3" ~ "Gore affinity"))
Make plot
pdata |>
ggplot(aes(x = age, y = pp, ymin =lwr, ymax=upr, linetype = affinity)) +
geom_ribbon(alpha=.2, fill="gray75") +
geom_line() +
scale_linetype_manual(values = c("longdash", "dotted", "solid")) +
facet_wrap(~ party) +
coord_cartesian(ylim = c(0, 1)) +
labs(x = "Age", y = "Probability of latent class membership") +
theme_classic()
<!-- -->
<sup>Created on 2023-05-17 with reprex v2.0.2</sup>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论