How to make beautiful ROC curves for two models in the same plot?

Question

I've trained two xgboost models, say model1 and model2. I have the AUC scores for each model and I want them to appear in the plot. I want to make beautiful ROC curves for both models in the same plot. Something like this:

[Image: example plot with ROC curves for two models]

How can I do that?

I usually use the pROC library, and I know I need to extract the scores and the truth from each model, right?

So something like this, maybe:

    library(pROC)

    roc1 = roc(model1$truth, model1$scores)
    roc2 = roc(model2$truth, model2$scores)

I also need the fpr and tpr for each model:

    D1 = data.frame(fpr = 1 - roc1$specificities, tpr = roc1$sensitivities)
    D2 = data.frame(fpr = 1 - roc2$specificities, tpr = roc2$sensitivities)

Then I can maybe add arrows to point out which curve is which:

    arrows = tibble(x1 = c(0.5, 0.13), x2 = c(0.32, 0.2), y1 = c(0.52, 0.83), y2 = c(0.7, 0.7))

And finally ggplot: (this part is missing)

    library(ggplot2)

    ggplot(data = D1, aes(x = fpr, y = tpr)) +
      geom_smooth(se = FALSE) +
      geom_smooth(data = D2, color = 'red', se = FALSE) +
      annotate("text", x = 0.5, y = 0.475, label = "score of model 1") +
      annotate("text", x = 0.13, y = 0.9, label = "scores of model 2")

So I need help with two things:

  1. How do I get the right information out of the models to make ROC curves? How do I get the truth and the prediction scores? Is the truth just the labels of the target feature in the training set?

  2. How do I continue the code, and is my code right so far?

Answer 1

Score: 3

You can get the sensitivity and specificity in a data frame using coords from pROC. Just rbind the results for the two models after first attaching a column labeling each set as model 1 or model 2. To get the smooth-looking ROC with automatic labels you can use geom_textsmooth from the geomtextpath package:

    library(pROC)
    library(geomtextpath)  # also attaches ggplot2

    # ROC objects built from the true labels and the predicted scores
    roc1 <- roc(model1$truth, model1$scores)
    roc2 <- roc(model2$truth, model2$scores)

    # One data frame of ROC coordinates, with a column identifying the model
    df <- rbind(cbind(model = "Model 1", coords(roc1)),
                cbind(model = "Model 2", coords(roc2)))

    ggplot(df, aes(1 - specificity, sensitivity, color = model)) +
      geom_textsmooth(aes(label = model), size = 7, se = FALSE, span = 0.2,
                      textcolour = "black", vjust = 1.5, linewidth = 1,
                      text_smoothing = 50) +
      geom_abline() +
      scale_color_brewer(palette = "Set1", guide = "none", direction = -1) +
      scale_x_continuous("False Positive Rate", labels = scales::percent) +
      scale_y_continuous("True Positive Rate", labels = scales::percent) +
      coord_equal(expand = FALSE) +
      theme_classic(base_size = 20) +
      theme(plot.margin = margin(10, 30, 10, 10))
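The question also asks for the AUC scores to appear in the plot. One way to do that, as a minimal sketch, is to compute each AUC with pROC's `auc()` and put it into the text of the `model` column. The toy data, the `auc_label()` helper and the use of a plain `geom_path()` with a legend (instead of `geom_textsmooth()`) are assumptions made for illustration, not part of the answer above.

```r
library(pROC)
library(ggplot2)

# Hypothetical toy data standing in for real labels and model scores
set.seed(2023)
truth   <- rbinom(2000, 1, 0.5)
scores1 <- truth + rnorm(2000, sd = 1)  # stronger model
scores2 <- truth + rnorm(2000, sd = 2)  # weaker model

# Fix levels and direction explicitly so both curves are built the same way
roc1 <- roc(truth, scores1, levels = c(0, 1), direction = "<", quiet = TRUE)
roc2 <- roc(truth, scores2, levels = c(0, 1), direction = "<", quiet = TRUE)

# Hypothetical helper: a label that carries the model name and its AUC
auc_label <- function(name, r) {
  sprintf("%s (AUC = %.3f)", name, as.numeric(auc(r)))
}

# Same rbind/cbind pattern as above, but the model column now holds the AUC
df <- rbind(
  cbind(model = auc_label("Model 1", roc1), coords(roc1)),
  cbind(model = auc_label("Model 2", roc2), coords(roc2))
)

# Unsmoothed ROC curves; the legend shows the AUC of each model
p <- ggplot(df, aes(1 - specificity, sensitivity, color = model)) +
  geom_path(linewidth = 1) +
  geom_abline(linetype = "dashed") +
  labs(x = "False Positive Rate", y = "True Positive Rate", color = NULL) +
  coord_equal() +
  theme_classic()
print(p)
```

Because the AUC is stored in the `model` column, the same data frame should also work with `geom_textsmooth(aes(label = model))`, which would write the AUC along each curve rather than in a legend.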

Data used

    set.seed(2023)
    model1 <- model2 <- data.frame(scores = rep(1:100, 50))
    p1 <- model2$scores + rnorm(5000, 0, 20)
    p2 <- model1$scores/100
    model1$truth <- rbinom(5000, 1, (p1 - min(p1))/diff(range(p1)))
    model2$truth <- rbinom(5000, 1, p2)
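The simulated data above already contains `truth` and `scores`, so it does not show how to get them from a fitted xgboost model. As a rough sketch, assuming a binary classifier trained with `xgb.train()`: the scores are the predicted probabilities from `predict()`, and the truth is the vector of observed 0/1 labels for the same rows. It is usually better to use a held-out set than the training set, since ROC curves computed on training data tend to look optimistic. The feature matrix `X`, the labels `y` and the split are hypothetical.

```r
library(xgboost)
library(pROC)

# Hypothetical toy data; replace with your own feature matrix and 0/1 labels
set.seed(2023)
n <- 1000
X <- matrix(rnorm(n * 5), ncol = 5)
y <- rbinom(n, 1, plogis(X[, 1] - 0.5 * X[, 2]))

# Hold out a test set so the ROC curve is not computed on the training rows
test_idx <- sample(n, 300)
dtrain <- xgb.DMatrix(X[-test_idx, ], label = y[-test_idx])
dtest  <- xgb.DMatrix(X[test_idx, ])

fit <- xgb.train(params = list(objective = "binary:logistic"),
                 data = dtrain, nrounds = 20)

# truth: observed labels of the held-out rows
# scores: predicted probabilities for those same rows
truth  <- y[test_idx]
scores <- predict(fit, dtest)

roc_obj <- roc(truth, scores, levels = c(0, 1), direction = "<", quiet = TRUE)
print(auc(roc_obj))
```

Repeating this for each model gives the `truth`/`scores` pairs that `roc()` expects in the code above.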

[Image: the resulting ROC plot for Model 1 and Model 2]

huangapple
  • Posted on February 19, 2023, 05:58:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75496646.html