February 23, 2023, 21:35:49

Fitting several zero-inflated negbin models and getting pooled results

Question

I have a dataset with patient data. Some of the patients have multiple stays in the hospital, but for the observations to be independent (I tried a multi-level model, but it's not possible with this data) I created a dataset, where for each patient that has multiple stays only one stay is selected randomly. Like this, I created 100 datasets with random stays for these patients. My dependent variable is a count variable and a zero-inflated negative binomial model fits best.
I already managed to run the regression model on each of the datasets (the datasets are identified by the variable "sample"), but I don't know how to get a pooled result for all of these 100 regressions. I would like to get the pooled results of the count model and of the zero-inflated model for every predictor.
I'm using:
library(dplyr); library(tidyverse); library(pscl); library(broom); library(jtools); library(mice)

The pool function is from mice.

I created the combined dataset like this:

set.seed(12345)
Combined_randcase <- bind_rows(replicate(100, cohort_1 %>% group_by(patient) %>%
                                  slice_sample(n = 1, replace = TRUE), simplify = FALSE), .id = "sample")
Combined_randcase <- data.frame(as.list(Combined_randcase))

I ran the ZINB regression model on each dataset, grouped by "sample", like this (using broom package):

regr_comb_randcase.zeroinfl = Combined_randcase %>%
  nest_by(sample) %>%
  mutate(model = list(zeroinfl(formula = cm_number ~ after_wm + age + gender_male + ref_mode_police + ref_lg_invol + ref_reas_selfharm + ref_reas_aggrpers + comm_limited + duration_days + diagnosis_personality + diagnosis_psychosis + diagnosis_mania + diagnosis_substance + intoxication | age + gender_male + ref_mode_police + ref_lg_invol + ref_reas_selfharm + ref_reas_aggrpers + comm_limited + duration_days + diagnosis_personality + diagnosis_psychosis + diagnosis_mania + diagnosis_substance + intoxication, data = cohort_1, na.action = na.exclude, dist = "negbin"))) %>%
  summarise(tidy(model))

That's how I tried to get pooled results:

models.zeroinfl <- regr_comb_randcase.zeroinfl$model
pool_results.zeroinfl <- pool(regr_comb_randcase.zeroinfl)

When running the second line, I get this error:

Error: No tidy method for objects of class character

For another logistic regression model, I did this successfully:

regr_comb_randcase.log = Combined_randcase %>%
  group_by(sample) %>%
  do(model = glm(cm ~ after_wm + age + gender_male + ref_mode_police + ref_lg_invol + ref_reas_selfharm + ref_reas_aggrpers + comm_limited + duration_days + diagnosis_personality + diagnosis_psychosis + diagnosis_mania + diagnosis_substance + intoxication, data = ., family = binomial()))
models <- regr_comb_randcase.log$model
pool_results <- pool(models)
summary(pool_results)

Output of dput(cohort_1_example) (a shortened version of my dataset) for reproducibility:

structure(list(case = c("20001879", "20009253", "20003748", "20002321",
"20001662", "1910967", "20008058", "20010686", "20010938", "20009508",
"20002307", "20010105", "210098181", "21009818", "210100261",
"21010026", "21000865", "21002199", "1906803", "1907642", "20008274",
"21000858", "21004557", "1910669", "21004451", "21000202", "21000812",
"21001006", "21001143", "21001423", "1906820", "21000448", "21002128",
"21002666", "21003560", "1907070", "20011121", "1907614", "20002748",
"20010645", "21001363", "1908906", "1910981", "1905926", "21002429",
"21004264", "20011209", "20010442", "20009977", "1906382", "1909409",
"1908904", "1910516", "20001534", "20011201", "1907432", "1908332",
"1906356", "20011026", "20008206", "20000809", "1910664", "1908673",
"1907610", "1911046", "20008505", "20009385", "21000530", "1909620",
"1909730", "1910988", "20009899", "1907282", "1906671", "20007870",
"1910749", "20010782", "20009808", "20003311", "1910722", "1910529",
"1906638", "1906861", "1906956", "1910743", "20002057", "21000891",
"20010349", "20008503", "1906093", "1910662", "20008093", "20010683",
"20008787", "20003631", "20007796", "20008089", "21004141", "20010177",
"20001316", "1909809", "20001875", "20009552", "20001443", "21000419",
"20003106", "1909773", "21004600", "20008105", "21002070", "1908245",
"1909860", "21004209", "21003022", "20003151", "20011037", "21001966",
"20009902", "1906202", "1910009", "1910777", "20010294", "1910072"
), patient = c("10", "11", "100", "100", "101", "102", "103",
"105", "106", "107", "108", "11", "11", "11", "11", "11", "1000",
"1001", "1002", "1002", "1003", "1003", "1004", "1005", "1005",
"1006", "1008", "1009", "1009", "1009", "1011", "1012", "1013",
"1013", "1013", "1014", "1016", "1017", "1018", "1020", "1020",
"1021", "1022", "1023", "1026", "1026", "1029", "1030", "1033",
"1035", "1036", "1037", "1037", "1037", "1039", "1041", "1041",
"1042", "1042", "1043", "1044", "1045", "1046", "1047", "1048",
"1049", "1050", "1053", "1054", "1056", "1056", "1057", "1058",
"1060", "1061", "1064", "1064", "1064", "1065", "1066", "1067",
"1067", "1067", "1067", "1069", "1071", "1072", "1073", "1074",
"1075", "1075", "1076", "1077", "1078", "1079", "1080", "1081",
"1082", "1083", "1086", "1087", "1087", "1088", "1089", "1089",
"1090", "1091", "1091", "1092", "1093", "1094", "1094", "1095",
"1096", "1098", "1098", "1098", "1099", "1048", "1048", "1021",
"1018", "1011"), cm = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 
0, 0, 0, 0, 0), cm_number = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 3, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 5, 0, 0, 0, 0, 0, 0, 1, 
0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 2, 2, 0, 0, 1, 0, 0, 0, 0, 1, 
0, 0, 0, 0, 0, 0, 0), total_cm_duration = c(0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 165.000000003492, 0, 0, 0, 0, 0, 0, 
0, 0, 174.999999994179, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 259.999999998836, 
720, 0, 0, 0, 0, 0, 60.0000000069849, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 815.000000005821, 0, 0, 10865.0000000023, 
0, 0, 0, 0, 0, 0, 420.000000006985, 0, 0, 0, 0, 0, 200.000000002328, 
0, 0, 0, 0, 0, 0, 239.999999996508, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 1084.99999999534, 0, 0, 145.000000001164, 0, 0, 789.999999997672, 
435.000000003492, 0, 0, 60.0000000069849, 0, 0, 0, 0, 775.000000001164, 
0, 0, 0, 0, 0, 0, 0), after_wm = c(0, 1, 0, 0, 0, 0, 1, 1, 1, 
1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 
0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 
0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 
0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 
1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 
0, 1, 1, 1, 0, 0, 0, 1, 0), age = c(26, 27, 53, 53, 26, 28, 30, 
57, 39, 50, 49, 27, 28, 28, 27, 27, 89, 18, 22, 22, 21, 21, 58, 
35, 36, 63, 44, 35, 35, 35, 25, 24, 36, 36, 36, 62, 50, 21, 55, 
23, 23, 44, 53, 71, 39, 39, 79, 47, 81, 43, 39, 21, 22, 22, 79, 
22, 22, 33, 35, 86, 27, 42, 20, 30, 25, 22, 26, 62, 54, 46, 46, 
46, 79, 39, 21, 63, 64, 64, 31, 59, 70, 70, 70, 70, 49, 37, 49, 
63, 74, 38, 39, 74, 50, 72, 61, 80, 51, 45, 67, 45, 76, 76, 61, 
30, 31, 35, 48, 49, 45, 30, 76, 76, 20, 18, 20, 20, 21, 51, 24, 
24, 45, 55, 25), gender_male = c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 
1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 
1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 
0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 
0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 
1, 1, 1, 0, 0, 0, 0, 0), ref_mode_police = c(0, 0, 0, 0, 0, 0, 
1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 
1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 
0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 
0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0), ref_lg_invol = c(0, 0, 0, 
0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 
1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 
1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0), ref_reas_selfharm = c(1, 
1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 
1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 
1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 
0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 
1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 
0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0), ref_reas_aggrpers = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 
0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0), comm_limited = c(0, 
0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 
1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), duration_days = c(41, 
1, 2, 42, 3, 46, 3, 8, 1, 1, 21, 64, 25, 2, 25, 2, 22, 26, 7, 
29, 101, 119, 153, 2, 2, 51, 10, 2, 6, 49, 83, 5, 1, 8, 1, 36, 
71, 1, 7, 9, 166, 41, 2, 76, 12, 1, 25, 40, 4, 0, 2, 28, 1, 3, 
49, 29, 54, 95, 119, 29, 28, 26, 43, 1, 15, 121, 22, 28, 73, 
13, 39, 1, 119, 14, 73, 18, 124, 32, 2, 120, 67, 2, 2, 8, 29, 
27, 34, 32, 112, 6, 8, 38, 118, 24, 38, 20, 2, 1, 2, 9, 1, 21, 
42, 57, 49, 1, 1, 1, 35, 2, 45, 23, 64, 29, 2, 6, 56, 0, 5, 3, 
58, 51, 2), diagnosis_personality = c(0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 
1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0), diagnosis_psychosis = c(1, 1, 
1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 
0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 
1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 
1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 
1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 
1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1), diagnosis_mania = c(0, 
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 
0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), diagnosis_substance = c(0, 
0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 
1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 
0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 
0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1), intoxication = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 
0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 
1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1)), row.names = c(NA, 
-123L), class = c("tbl_df", "tbl", "data.frame"))

I could find information on doing this with a linear and a logistic model, but not on a zero-inflated negbin model. Maybe that's why tidy is not working? Any help is much appreciated.

Answer 1

Score: 1

We can fit mixed/multilevel models to data with singleton groups; the main constraint is whether there is enough information overall to make estimating the among-group variances feasible.
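A quick way to see how many groups are singletons is to tabulate the grouping variable. The sketch below uses a made-up vector of patient IDs (hypothetical values, not the question's data); with real data one would pass the patient column instead.

```r
## Hypothetical patient IDs, one entry per hospital stay
patient <- c("10", "11", "11", "100", "100", "100", "101")

## Number of stays per patient
stays <- table(patient)

n_patients   <- length(stays)      # distinct patients
n_singletons <- sum(stays == 1)    # patients with exactly one stay
n_obs        <- length(patient)    # total stays

c(patients = n_patients, singletons = n_singletons, observations = n_obs)
```

The more patients with repeated stays there are, the more information is available for estimating the among-patient variance.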

Among mainstream packages, lme4 can fit LMMs, logistic GLMMs, and negative binomial mixed models (albeit a bit slowly, and not zero-inflated mixed models without a lot of work: see e.g. here). glmmTMB can handle all of the above. brms can do anything glmmTMB can do, plus the kitchen sink, but is slower (because Bayesian MCMC) and may need you to get into the weeds of Bayesian/MCMC sampling.

The examples below use your data subset, with considerably simplified models. Your sample data has 86 patients, 61 singletons, 123 total observations. This seems to be nearly enough to fit the simplified models (below); we do run into some of the typical not-quite-enough-data problems with the ZINB fit (zero-inflation term converges to zero probability, NB dispersion parameter converges to a Poisson distribution) and the logistic fit (singular fit, i.e. the among-patient variance converges to zero). These problems are much less likely to occur if your full data set is large ...
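To make the two boundary symptoms of the ZINB fit concrete, here is a small numeric sketch. It assumes the zero-inflation intercept is on the logit scale (as in glmmTMB's zero-inflation model) and that the NB2 variance is mu + mu^2/theta. The numbers are invented for illustration and do not come from an actual fit.

```r
## Hypothetical zero-inflation intercept on the logit scale
zi_intercept <- -20
## Implied probability of a structural zero: effectively zero
zi_prob <- plogis(zi_intercept)

## NB2 variance function: mu + mu^2 / theta
nb2_var <- function(mu, theta) mu + mu^2 / theta

## As theta grows, the variance approaches the Poisson variance (= mu)
mu <- 2
nb2_var(mu, theta = c(1, 100, 1e6))
```

A fitted zero-inflation intercept this far below zero, or a very large dispersion estimate, is a hint that the data do not support the extra parameter.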

The first bit of machinery (besides loading packages) is cosmetic, to avoid too much repetitive code and make it easier to see what common set of predictors is included across all models.

library(lme4)
library(glmmTMB)
## 'dd' is assumed to hold the example data from the question (the dput() output)
## list of all of the fixed effect predictors
fix_vars <- c("after_wm", "age", "gender_male", "ref_mode_police")
## 'resp' is the name of the response variable (character)
ff <- function(resp) {
   reformulate(c(fix_vars, "(1|patient)"), response = resp)
}

## Gaussian/LMM
lme4::lmer(ff("total_cm_duration"), data = dd)
## NB/no zero-inflation
glmmTMB::glmmTMB(ff("cm_number"),
                 family = nbinom2,
                 data = dd)
## NB with (simple) zero-inflation
glmmTMB::glmmTMB(ff("cm_number"),
                 family = nbinom2,
                 ## could use zi = ff(NULL) to include all FE predictors
                 ##  as ZI predictors as well ...
                 zi = ~1,
                 data = dd)
## logistic
lme4::glmer(ff("cm"),
            family = binomial,
            data = dd)
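As a small check of what the formula helper produces, the sketch below rebuilds ff() on its own and inspects the resulting formulas. Nothing is fitted, so only base R is needed.

```r
fix_vars <- c("after_wm", "age", "gender_male", "ref_mode_police")
ff <- function(resp) {
  ## 'response' is the full name of the reformulate() argument
  reformulate(c(fix_vars, "(1|patient)"), response = resp)
}

f_count <- ff("cm_number")   # two-sided formula with cm_number as the response
f_zi    <- ff(NULL)          # one-sided formula: same right-hand side, no response

all.vars(f_count)            # response, fixed-effect predictors, grouping variable
length(f_count)              # two-sided formulas have length 3
length(f_zi)                 # one-sided formulas have length 2
```

Note that ff(NULL) carries the (1|patient) term along as well, so using it for zi would also add a patient-level random intercept to the zero-inflation part.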
