How to make beautiful ROC curves for two models in the same plot?

Question

I've trained two xgboost models, say model1 and model2. I have the AUC scores for each model and I want them to appear in the plot. I want to make beautiful ROC curves for both models in the same plot. Something like this:

(image: two ROC curves in the same plot)

How can I do that?

I usually use the pROC library, and I know I need to extract the scores and the truth from each model, right?

So something like this, maybe:

roc1 = roc(model1$truth, model1$scores)
roc2 = roc(model2$truth, model2$scores)

I also need the fpr and tpr for each model:

D1 = data.frame(fpr = 1 - roc1$specificities, tpr = roc1$sensitivities)
D2 = data.frame(fpr = 1 - roc2$specificities, tpr = roc2$sensitivities)

Then I can maybe add arrows to point out which curve is which:

arrows = tibble(x1 = c(0.5, 0.13) , x2 = c(0.32, 0.2), y1 = c(0.52, 0.83), y2 = c(0.7, 0.7))

And finally ggplot: (this part is missing)

ggplot(data = D1, aes(x = fpr, y = tpr)) +
  geom_smooth(se = FALSE) +
  geom_smooth(data = D2, color = 'red', se = FALSE) +
  annotate("text", x = 0.5, y = 0.475, label = "score of model 1") +
  annotate("text", x = 0.13, y = 0.9, label = "scores of model 2")

So I need help with two things:

  1. How do I get the right information out of the models to make ROC curves? How do I get the truth and the prediction scores? Is the truth just the labels of the target feature in the training set?

  2. How do I continue the code, and is my code right so far?


Answer 1

Score: 3

You can get the sensitivity and specificity in a data frame using coords from pROC. Just rbind the results for the two models after first attaching a column labeling each set as model 1 or model 2. To get the smooth-looking ROC with automatic labels you can use geom_textsmooth from the geomtextpath package:

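library(pROC)
library(ggplot2)       # loaded explicitly for ggplot() and the scale/theme functions
library(geomtextpath)  # provides geom_textsmooth()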

roc1 <- roc(model1$truth, model1$scores)
roc2 <- roc(model2$truth, model2$scores)

df <- rbind(cbind(model = "Model 1", coords(roc1)), 
            cbind(model = "Model 2", coords(roc2)))

ggplot(df, aes(1 - specificity, sensitivity, color = model)) +
  geom_textsmooth(aes(label = model), size = 7, se = FALSE, span = 0.2,
                  textcolour = "black", vjust = 1.5, linewidth = 1,
                  text_smoothing = 50) +
  geom_abline() +
  scale_color_brewer(palette = "Set1", guide = "none", direction = -1) +
  scale_x_continuous("False Positive Rate", labels = scales::percent) +
  scale_y_continuous("True Positive Rate", labels = scales::percent) +
  coord_equal(expand = FALSE) +
  theme_classic(base_size = 20) +
  theme(plot.margin = margin(10, 30, 10, 10)) 
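
The question also asks for the AUC of each model to appear in the plot. One way to do that, shown here as a minimal sketch rather than the only approach, is to put the AUC into the `model` label before binding the two coordinate tables. The simulated `model1` and `model2` data frames below are stand-ins I made up so the block runs on its own, and it draws the raw (unsmoothed) curves with `geom_path` and a legend instead of `geom_textsmooth`:

```r
library(pROC)
library(ggplot2)

# Simulated stand-ins for each model's labels and scores (assumption, not real data)
set.seed(1)
truth  <- rbinom(500, 1, 0.5)
model1 <- data.frame(truth = truth, scores = truth + rnorm(500, 0, 1))
model2 <- data.frame(truth = truth, scores = truth + rnorm(500, 0, 2))

# quiet = TRUE suppresses pROC's messages about levels and direction
roc1 <- roc(model1$truth, model1$scores, quiet = TRUE)
roc2 <- roc(model2$truth, model2$scores, quiet = TRUE)

# One label per model, carrying its AUC rounded to three decimals
lab1 <- sprintf("Model 1 (AUC = %.3f)", as.numeric(auc(roc1)))
lab2 <- sprintf("Model 2 (AUC = %.3f)", as.numeric(auc(roc2)))

df <- rbind(cbind(model = lab1, coords(roc1)),
            cbind(model = lab2, coords(roc2)))

# geom_path connects the points in the order coords() returns them
p <- ggplot(df, aes(1 - specificity, sensitivity, color = model)) +
  geom_path(linewidth = 1) +
  geom_abline(linetype = "dashed") +
  labs(x = "False Positive Rate", y = "True Positive Rate", color = NULL) +
  coord_equal() +
  theme_classic()
print(p)
```

The same `lab1` and `lab2` strings could be passed as the `label` aesthetic of `geom_textsmooth` if you prefer the labels to sit on the curves.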

Data used

set.seed(2023)

model1 <- model2 <- data.frame(scores = rep(1:100, 50))
p1 <- model2$scores + rnorm(5000, 0, 20)
p2 <- model1$scores/100

model1$truth <- rbinom(5000, 1, (p1 - min(p1))/diff(range(p1)))
model2$truth <- rbinom(5000, 1, p2)
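
On the first part of the question (where `truth` and `scores` come from): `truth` is the vector of observed labels and `scores` is the model's predicted probability for the same rows. A ROC curve built on the training rows tends to look too optimistic, so a held-out set is usually the better choice. Here is a rough sketch assuming a binary target and an xgboost model fitted with the `binary:logistic` objective; the data, the 70/30 split and the names `X`, `y`, `fit` and `test_idx` are all made up for illustration:

```r
library(xgboost)
library(pROC)

# Simulated feature matrix and binary target (assumption, stands in for real data)
set.seed(42)
n <- 1000
X <- matrix(rnorm(n * 5), ncol = 5, dimnames = list(NULL, paste0("x", 1:5)))
y <- rbinom(n, 1, plogis(X[, 1] - 0.5 * X[, 2]))

# Hold out 300 rows that the model never sees during training
test_idx <- sample(n, size = 300)
dtrain <- xgb.DMatrix(X[-test_idx, ], label = y[-test_idx])
dtest  <- xgb.DMatrix(X[test_idx, ])

fit <- xgb.train(params = list(objective = "binary:logistic", nthread = 1),
                 data = dtrain, nrounds = 20)

# truth = observed labels of the held-out rows; scores = predicted probabilities
model1 <- data.frame(truth  = y[test_idx],
                     scores = predict(fit, dtest))

roc1 <- roc(model1$truth, model1$scores, quiet = TRUE)
auc(roc1)
```

Repeating the last two steps with the second fitted model gives a `model2` data frame of the same shape, which then plugs into the plotting code above.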


huangapple
  • Posted on February 19, 2023, 05:58:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75496646.html