2023-03-20 23:15:45

How to plot logistic probability for custom generalized additive model (GAM) plot?

Question


I realize that there is already a question pertaining to this here. However, I'm not looking to use the draw function and would prefer to build up the plot from scratch, as shown here. The plot in that link doesn't show how to do this on the probability scale, though. Below is an example that gets close, but it only plots on the scale of the link function.

#### Libraries and Data ####
library(mgcv)
library(gamair)
library(gratia)  # provides smooth_estimates(), add_confint(), add_partial_residuals()
library(tidyverse)
data("wesdr")
#### Fit Model ####
fit <- gam(
  ret ~ s(dur),
  method = "REML",
  family = binomial,
  data = wesdr
)
#### Evaluate the Smooths ####
# Column names below (smooth, est, lower_ci, upper_ci) are those of the gratia
# version used when this was written; newer releases may prefix them with a dot.
sm <- smooth_estimates(fit) %>%
  add_confint()
sm
#### Add Partial Residuals ####
wesdr <- wesdr %>% 
  add_partial_residuals(fit)
#### Plot ####
p <- sm %>%
  filter(smooth == "s(dur)") %>%
  ggplot() +
  geom_rug(aes(x = dur),
           data = wesdr,
           sides = "b", 
           length = grid::unit(0.02, "npc")) +
  geom_ribbon(aes(ymin = lower_ci, 
                  ymax = upper_ci,
                  x = dur),
              alpha = 0.2) +
  geom_line(aes(x = dur,
                y = est),
            lwd = 1.2) +
  labs(y = "Partial effect",
       title = "s(dur)")
p

Specifically, I'm looking for something functionally equivalent to this:

plot(fit,
     trans = plogis,
     shift = coef(fit)[1])

Any advice on how to get the probability plot?

Edit

I realize I wasn't precise about what I needed. The answer given is good, but I'm considering the most general-case use where a logistic GAM has multiple predictors. So I need a plot based off a model like this:

#### Fit Model ####
fit <- gam(
  ret 
  ~ s(dur)
  + s(bmi),
  method = "REML",
  family = binomial,
  data = wesdr
)

To account for the average value of the other predictors, I need to include the intercept in the plot, which is why I originally used the shift argument of mgcv's plot.gam function, which draws with base R graphics.
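
As a rough sketch of what that plot.gam call does to each point of the curve: `shift` is added to the centred smooth on the link scale, and `trans` is then applied to the result. The numbers below are made up for illustration (they are not estimates from the wesdr model), and `to_probability` is a hypothetical helper name, not part of mgcv.

```r
# Hypothetical values: a centred smooth evaluated at five points (link scale)
# and an intercept. Neither comes from the fitted model.
est <- c(-1.2, -0.4, 0, 0.5, 1.1)
intercept <- -0.6

# Hypothetical helper mirroring trans = plogis, shift = intercept:
# add the shift on the link scale first, then apply the inverse logit.
to_probability <- function(eta, shift = 0) plogis(eta + shift)

p_unshifted <- to_probability(est)             # centred smooth only
p_shifted   <- to_probability(est, intercept)  # smooth plus intercept

# Where the centred smooth is 0, the unshifted curve sits at 0.5, whereas the
# shifted curve sits at plogis(intercept).
data.frame(est, p_unshifted, p_shifted)
```

Without the shift, the curve is anchored at 0.5 wherever the smooth crosses zero, which is rarely the probability of interest; adding the intercept first moves it to the model's baseline.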

Answer 1

Score: 2

With the logit link, the estimates are on the log-odds scale. Since odds are just `p / (1 - p)`, log odds are `log(p / (1 - p))`. The inverse is `exp(log_odds) / (1 + exp(log_odds))`. So we can just do:
```r
p$data <- mutate(p$data, across(c(est, lower_ci, upper_ci), ~exp(.x)/(1 + exp(.x))))
p + ylim(c(0, 1))
```

To get the partial effect by hand when you have multiple independent variables, you can predict the outcome at the mean of the other variables while only changing the variable of interest. Here, we'll dress up the ggplot to make it look more like the base R plot:

fit <- gam(
  ret 
  ~ s(dur)
  + s(bmi),
  method = "REML",
  family = binomial,
  data = wesdr
)
# Vary dur over its range and hold bmi at its mean
newdata <- data.frame(dur = 0:55, bmi = mean(wesdr$bmi))
# Predictions and standard errors on the link (log-odds) scale
pred <- predict(fit, newdata, se.fit = TRUE)
# Back-transform the fit and the interval limits to probabilities
newdata$ret <- exp(pred$fit) / (1 + exp(pred$fit))
newdata$upper <- exp(pred$fit + 1.96 * pred$se.fit) / 
                 (1 + exp(pred$fit + 1.96 * pred$se.fit))
newdata$lower <- exp(pred$fit - 1.96 * pred$se.fit) / 
                 (1 + exp(pred$fit - 1.96 * pred$se.fit))
ggplot(newdata, aes(x = dur)) +
  geom_rug(data = wesdr, sides = "b", length = grid::unit(0.02, "npc"), 
           lwd = 0.2, col = "gray60") +
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0, linetype = 2,
              color = "black") +
  geom_line(aes(y = ret), lwd = 1.2) +
  labs(y = "Partial effect", title = "s(dur)") +
  scale_y_continuous(breaks = 0.2 * 1:5) +
  scale_x_continuous(breaks = 10 * 0:5) +
  theme_classic(base_size = 20)
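
The hand-written `exp(x) / (1 + exp(x))` used in this answer is the inverse logit, which base R provides as `plogis()`. A small sketch with made-up link-scale values (not model output) suggests why `plogis()` may be the safer choice: the two agree for moderate values, but the hand-written form overflows for large ones. The standard errors are likewise hypothetical.

```r
# Made-up link-scale values, including one large enough to overflow exp().
eta <- c(-3, 0, 2.5, 800)

by_hand <- exp(eta) / (1 + exp(eta))  # Inf / Inf gives NaN for eta = 800
stable  <- plogis(eta)                # stays finite, returns 1 for eta = 800

# Intervals are built on the link scale and then back-transformed; because
# plogis() is monotone increasing, the ordering lower <= fit <= upper survives.
se    <- c(0.4, 0.2, 0.3, 5)          # hypothetical standard errors
lower <- plogis(eta - 1.96 * se)
upper <- plogis(eta + 1.96 * se)

data.frame(eta, by_hand, stable, lower, upper)
```

Building the interval on the link scale and transforming its endpoints also keeps both limits inside [0, 1], which adding and subtracting standard errors on the probability scale would not guarantee.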

Answer 2

Score: 2


You can simply use predict.gam with type = "link" and se.fit = TRUE:

mod <- gam(
  ret ~ s(dur),
  method = "REML",
  family = binomial,
  data = wesdr
)

# Predictions and standard errors on the link (log-odds) scale
pred <- predict(mod, wesdr, type = "link", se.fit = TRUE)

Then use the inverse link to put the fit and the interval limits on the probability scale:

wesdr %>% 
  mutate(fit = mod$family$linkinv(pred$fit),
         lci = mod$family$linkinv(pred$fit - 1.96 * pred$se.fit),
         uci = mod$family$linkinv(pred$fit + 1.96 * pred$se.fit)) %>% 
  ggplot(aes(dur, fit)) + 
  geom_line() + 
  geom_ribbon(aes(ymin = lci, ymax = uci), alpha = 0.3) +
  geom_rug(aes(x = dur), sides = "b", 
           length = grid::unit(0.02, "npc"))
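
This answer predicts at the observed rows of a one-smooth model. For a model with several smooths, the same link-scale prediction could be made over a grid instead, varying one covariate and holding the others fixed. The following is a minimal sketch under stated assumptions: the data are simulated rather than taken from wesdr, and the names `dat`, `x1`, `x2` and `grid` are hypothetical.

```r
library(mgcv)

# Simulated stand-in data (hypothetical; not the wesdr data set).
set.seed(1)
n   <- 300
dat <- data.frame(x1 = runif(n, 0, 50), x2 = rnorm(n, 25, 4))
dat$y <- rbinom(n, 1, plogis(-1 + 0.05 * dat$x1 - 0.02 * (dat$x2 - 25)))

mod <- gam(y ~ s(x1) + s(x2), family = binomial, method = "REML", data = dat)

# Vary x1 over an even grid and hold x2 at its mean.
grid <- data.frame(x1 = seq(min(dat$x1), max(dat$x1), length.out = 100),
                   x2 = mean(dat$x2))

# Predict on the link scale, then back-transform with the family's inverse link.
pred    <- predict(mod, newdata = grid, type = "link", se.fit = TRUE)
linkinv <- mod$family$linkinv
grid$fit <- as.numeric(linkinv(pred$fit))
grid$lci <- as.numeric(linkinv(pred$fit - 1.96 * pred$se.fit))
grid$uci <- as.numeric(linkinv(pred$fit + 1.96 * pred$se.fit))

head(grid)
```

Because the prediction includes the intercept and the held-fixed smooth, the resulting curve is already on the probability scale for a typical value of the other covariate, so no separate shift is needed. The `grid` data frame can be passed to ggplot with geom_line() and geom_ribbon() exactly as in the answer.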

Answer 3

Score: 2


The general way to do this is to predict from the model at the values you want. Using your extended example,

library("gratia")
library("mgcv")
data("wesdr", package = "gamair")
fit <- gam(ret ~ s(dur) + s(bmi),
           data = wesdr, method = "REML", family = binomial)

Create a data slice at the values of the covariates you want. If you just specify dur in the data slice, then you will get the other covariate, bmi, set to the value of the observation closest to the median of bmi in the training data:

ds1 <- data_slice(fit, dur = evenly(dur, n = 100))

But it is easy to specify other values; you mentioned setting the other covariates to their mean:

ds2 <- data_slice(fit, dur = evenly(dur, n = 100), bmi = mean(bmi))

Then predict at those values on the response scale, here with fitted_values():

fv2 <- fitted_values(fit, data = ds2, scale = "response")

Then plot:

library("ggplot2")
fv2 |>
  ggplot(aes(x = dur, y = fitted)) +
  geom_ribbon(aes(x = dur, ymin = lower, ymax = upper),
              inherit.aes = FALSE, alpha = 0.2) +
  geom_line() +
  geom_rug(data = wesdr, aes(x = dur), sides = "b", inherit.aes = FALSE, 
           length = grid::unit(0.01, "npc"), alpha = 0.5)

which produces a plot of the fitted probability against dur, with a shaded uncertainty ribbon and a rug of the observed dur values along the bottom axis.
