How to plot logistic probability for custom generalized additive model (GAM) plot?

Question

I realize there is already a question about this here. However, I don't want to use the draw function and would prefer to build the plot from scratch, as shown here. The plot in that link doesn't show how to do this on the probability scale. My example below gets close, but it only plots on the scale of the link function.

#### Libraries and Data ####
library(mgcv)
library(gamair)
library(gratia)  # provides smooth_estimates(), add_confint(), add_partial_residuals()
library(tidyverse)
data("wesdr")

#### Fit Model ####
fit <- gam(
  ret ~ s(dur),
  method = "REML",
  family = binomial,
  data = wesdr
)

#### Evaluate the Smooths ####
sm <- smooth_estimates(fit) %>%
  add_confint()
sm

#### Add Partial Residuals ####
wesdr <- wesdr %>% 
  add_partial_residuals(fit)

#### Plot ####
p <- sm %>%
  filter(smooth == "s(dur)") %>%
  ggplot() +
  geom_rug(aes(x = dur),
           data = wesdr,
           sides = "b", 
           length = grid::unit(0.02, "npc")) +
  geom_ribbon(aes(ymin = lower_ci, 
                  ymax = upper_ci,
                  x = dur),
              alpha = 0.2) +
  geom_line(aes(x = dur,
                y = est),
            lwd = 1.2) +
  labs(y = "Partial effect",
       title = "s(dur)")
p

Specifically, I'm looking for something functionally equivalent to this:

plot(fit,
     trans = plogis,
     shift = coef(fit)[1])

Any advice on how to get the probability plot?

Edit

I realize I wasn't precise about what I needed. The answer given is good, but I'm considering the most general case, where a logistic GAM has multiple predictors. So I need a plot based on a model like this:

#### Fit Model ####
fit <- gam(
  ret 
  ~ s(dur)
  + s(bmi),
  method = "REML",
  family = binomial,
  data = wesdr
)

To account for the average value of the other predictors, I need to include the intercept in the plot, which is why I originally used shift in the plot.gam function (base R graphics).

Answer 1

Score: 2

The link function returns log odds. Since odds are just `p / (1 - p)`, the log odds are `log(p / (1 - p))`. The inverse is `exp(log_odds) / (1 + exp(log_odds))`. So we can just do:
```r
p$data <- mutate(p$data, across(c(est, lower_ci, upper_ci), ~exp(.x)/(1 + exp(.x))))

p + ylim(c(0, 1))
```
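
For comparison with `plot(fit, trans = plogis, shift = coef(fit)[1])` from the question, where `plot.gam` adds `shift` to the smooth before applying `trans`, here is a minimal sketch that uses only mgcv's `predict()` with `type = "terms"`. It assumes the two-smooth model from the question's edit. The names `fit2`, `grid`, and `shifted` are made up for illustration, and the 1.96 multiplier assumes an approximate 95% interval.

```r
library(mgcv)
data("wesdr", package = "gamair")

# Two-smooth model from the question's edit (fit2 is an illustrative name)
fit2 <- gam(ret ~ s(dur) + s(bmi),
            method = "REML", family = binomial, data = wesdr)

# Grid over dur; bmi must be supplied, but it does not affect the s(dur) term
grid <- data.frame(
  dur = seq(min(wesdr$dur), max(wesdr$dur), length.out = 100),
  bmi = mean(wesdr$bmi)
)

# Per-term contributions and standard errors on the link (log-odds) scale
tm  <- predict(fit2, newdata = grid, type = "terms", se.fit = TRUE)
eff <- tm$fit[, "s(dur)"]
se  <- tm$se.fit[, "s(dur)"]

# Shift by the intercept, then map to the probability scale with plogis()
b0 <- coef(fit2)[["(Intercept)"]]
shifted <- data.frame(
  dur   = grid$dur,
  prob  = plogis(b0 + eff),
  lower = plogis(b0 + eff - 1.96 * se),
  upper = plogis(b0 + eff + 1.96 * se)
)
head(shifted)
```

`shifted` can then be passed to `ggplot()` with `geom_ribbon()` and `geom_line()` in the same way as the other plots here. The band in this sketch reflects uncertainty in the smooth only, not in the intercept.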

To get the partial effect by hand when you have multiple independent variables, you can predict the outcome at the mean of the other variables while only changing the variable of interest. Here, we'll dress up the ggplot to make it look more like the base R plot:

fit <- gam(
  ret 
  ~ s(dur)
  + s(bmi),
  method = "REML",
  family = binomial,
  data = wesdr
)

newdata <- data.frame(dur = 0:55, bmi = mean(wesdr$bmi))

pred <- predict(fit, newdata, se.fit = TRUE)

newdata$ret <- exp(pred$fit) / (1 + exp(pred$fit))

newdata$upper <- exp(pred$fit + 1.96 * pred$se.fit) / 
                 (1 + exp(pred$fit + 1.96 * pred$se.fit))

newdata$lower <- exp(pred$fit - 1.96 * pred$se.fit) / 
                 (1 + exp(pred$fit - 1.96 * pred$se.fit))

ggplot(newdata, aes(x = dur)) +
  geom_rug(data = wesdr, sides = "b", length = grid::unit(0.02, "npc"), 
           lwd = 0.2, col = "gray60") +
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0, linetype = 2,
              color = "black") +
  geom_line(aes(y = ret), lwd = 1.2) +
  labs(y = "Partial effect", title = "s(dur)") +
  scale_y_continuous(breaks = 0.2 * 1:5) +
  scale_x_continuous(breaks = 10 * 0:5) +
  theme_classic(base_size = 20)
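
As a side note on the `exp(x) / (1 + exp(x))` expression used above, base R's `plogis()` computes the same inverse logit and `qlogis()` computes the logit. The sketch below uses made-up log-odds values (not model output) to show that the two agree for moderate inputs, while the hand-written form breaks down for very large ones.

```r
# Illustrative log-odds values, including two deliberately extreme ones
log_odds <- c(-800, -2, 0, 2, 800)

# The hand-written inverse logit used above
manual <- exp(log_odds) / (1 + exp(log_odds))

# Base R's built-in inverse logit
builtin <- plogis(log_odds)

manual   # NaN at 800, because exp(800) overflows to Inf and Inf / Inf is NaN
builtin  # stays within [0, 1] at both extremes

# qlogis() is the logit itself, so it undoes plogis() for moderate values
all.equal(qlogis(plogis(c(-2, 0, 2))), c(-2, 0, 2))
```

Predictions from a model like this one are unlikely to be that extreme, so the difference is mostly one of readability.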

Answer 2

Score: 2

You can simply use predict.gam with type = 'link' and se = TRUE:

mod <- gam(
  ret ~ s(dur),
  method = "REML",
  family = binomial,
  data = wesdr
)

pred <- predict.gam(mod, wesdr, type = "link", se = TRUE)

Then use the inverse link to scale appropriately:

wesdr %>% 
  mutate(fit = mod$family$linkinv(pred$fit),
         lci = mod$family$linkinv(pred$fit - 1.96 * pred$se.fit),
         uci = mod$family$linkinv(pred$fit + 1.96 * pred$se.fit)) %>% 
  ggplot(aes(dur, fit)) + 
  geom_line() + 
  geom_ribbon(aes(ymin = lci, ymax = uci), alpha = 0.3) +
  geom_rug(aes(x = dur), sides = "b", 
           length = grid::unit(0.02, "npc"))
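
One reason to build the interval on the link scale and only then apply the inverse link, as above, is that the limits cannot leave the 0 to 1 range. The following sketch illustrates this with made-up numbers (`eta` and `se_eta` are not taken from the fitted model) and compares it with a delta-method interval computed directly on the probability scale.

```r
# Illustrative numbers only: a link-scale prediction and its standard error
eta <- 4
se_eta <- 1
z <- qnorm(0.975)  # about 1.96

# Interval built on the link scale, then back-transformed
link_ci <- plogis(eta + c(-1, 1) * z * se_eta)

# Interval built directly on the response scale (delta-method standard error)
p <- plogis(eta)
se_p <- p * (1 - p) * se_eta
resp_ci <- p + c(-1, 1) * z * se_p

link_ci  # both limits lie inside (0, 1)
resp_ci  # the upper limit exceeds 1
```

The link-scale interval is asymmetric around the fitted probability, which is expected near 0 or 1.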

Answer 3

Score: 2

The general way to do this is to predict from the model at the values you want. Using your extended example,

library("gratia")
library("mgcv")
data("wesdr", package = "gamair")
fit <- gam(ret ~ s(dur) + s(bmi),
           data = wesdr, method = "REML", family = binomial)

Create a data slice at the values of the covariates you want. If you specify only dur in the data slice, the other covariate, bmi, is set to the value of the observation closest to the median of bmi in the training data:

ds1 <- data_slice(fit, dur = evenly(dur, n = 100))

But it is easy to specify other values; you mentioned setting the other covariates to their mean:

ds2 <- data_slice(fit, dur = evenly(dur, n = 100), bmi = mean(bmi))

Then predict on the response scale, here with fitted_values():

fv2 <- fitted_values(fit, data = ds2, scale = "response")

Then plot:

library("ggplot2")

fv2 |>
  ggplot(aes(x = dur, y = fitted)) +
  geom_ribbon(aes(x = dur, ymin = lower, ymax = upper),
              inherit.aes = FALSE, alpha = 0.2) +
  geom_line() +
  geom_rug(data = wesdr, aes(x = dur), sides = "b", inherit.aes = FALSE, 
           length = grid::unit(0.01, "npc"), alpha = 0.5)

which produces a plot of the fitted probability against dur, with a confidence ribbon and a rug of the observed dur values.

huangapple
  • Published on March 20, 2023, 23:15:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75792081.html