Why does a non-monotonic ROC curve appear when passing a list of roc objects to ggroc?
Question
I am training a 10-fold cross-validated random forest in R. The list of ROC objects is available for download here: https://drive.google.com/drive/folders/1ZYs1gMzPr64lV7WQmVOu1zt0YSVak9cr (it is the 'my_roclist.rds' file). The code that produces the plot is:
library(pROC)
library(ggplot2)
# load the my_roclist.rds file I've provided in the google drive
hochgerner.roc.list <- readRDS("my_roclist.rds")  # base R reads .rds files with readRDS(); loadRDS() does not exist
# plotting
jpeg('hochgerner2018_earlysig_roc.jpeg', width = 600, height=600)
ggroc(hochgerner.roc.list, alpha = 1,
      colour = "red", linetype = 'solid',
      size = 4, legacy.axes = TRUE) +
  theme_classic() +
  ggtitle("Hochgerner et al., (2018) ROC") +
  xlab("FPR") + ylab("TPR") +
  geom_segment(aes(x = 0, xend = 1, y = 0, yend = 1), linewidth = 2,
               color = "darkgrey", linetype = "dashed") +
  theme(plot.title = element_text(size = 20, face = "bold",
                                  hjust = 0.5),
        axis.text = element_text(size = 20, face = "bold", colour = "black"),
        axis.title = element_text(size = 20, face = "bold",
                                  colour = "black"),
        panel.border = element_rect(color = "black",
                                    fill = NA,
                                    linewidth = 2),
        plot.margin = margin(t = 10, r = 10, b = 10, l = 10)) +
  scale_x_continuous(expand = expansion(mult = c(0, 0)),
                     breaks = c(0, 0.25, 0.5, 0.75, 1)) +
  scale_y_continuous(expand = expansion(mult = c(0, 0)),
                     breaks = c(0.25, 0.5, 0.75, 1))
dev.off()
I am using the pROC package (https://cran.r-project.org/web/packages/pROC/pROC.pdf) to generate plots from a random forest model with cross-validation. Each fold is run inside a for loop; an ROC curve is computed for that fold and added to a list of roc objects. At the end of training the forests are combined. I plot the fold-specific ROC curves with ggroc (https://www.rdocumentation.org/packages/pROC/versions/1.18.4/topics/ggroc.roc) by passing it the list of roc objects. ggroc() should then draw one curve per fold (tutorial here). The issue appears to be specific to the data I am feeding it, so I cannot reproduce the plot with other data. Why is the ROC curve non-monotonic? That should be impossible, and the code I use to produce the model is very straightforward. What could I be doing wrong?
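The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the asker's training code: the labels, the `scores` vector (a stand-in for out-of-fold random forest probabilities), the fold assignment, and the `fold1` ... `fold10` names are all assumptions made up for the example.

```r
library(pROC)
library(ggplot2)

set.seed(42)
n <- 200
k <- 10
# Hypothetical data: balanced binary labels and a noisy score standing in
# for out-of-fold random forest class probabilities.
labels <- rep(c(0, 1), each = n / 2)
scores <- labels + rnorm(n)
# Randomly assign each observation to one of k folds.
folds <- sample(rep(seq_len(k), length.out = n))

roc.list <- list()
for (i in seq_len(k)) {
  held.out <- folds == i
  # In a real workflow the predictor would come from predict() on the model
  # trained on the other k - 1 folds.
  roc.list[[paste0("fold", i)]] <- roc(response = labels[held.out],
                                       predictor = scores[held.out],
                                       levels = c(0, 1), direction = "<",
                                       quiet = TRUE)
}

# With no fixed colour, ggroc() draws one curve per list element.
p <- ggroc(roc.list, legacy.axes = TRUE)
```

Each element of `roc.list` is an ordinary roc object, so its sensitivities are monotonic on their own; any zigzag would have to come from how the curves are drawn together rather than from a single fold.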
Answer 1
Score: 0
The authors of the pROC package answered this question on the package's GitHub page and provided a fix. The issue is here: https://github.com/xrobin/pROC/issues/121
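The answer does not restate the explanation given in that issue. One general ggplot2 mechanism that can produce a picture like this, offered here as an assumption rather than as the content of the issue, is lost grouping: when a constant `colour` is set on a line layer, it replaces the mapped colour, and if that mapping was the only thing separating the curves, all points are joined into a single path ordered by x. The sketch below uses plain ggplot2 and two made-up curves (`fold1`, `fold2`); the helper `n.paths` is hypothetical.

```r
library(ggplot2)

# Two hypothetical monotonic curves, stacked in long format.
curves <- data.frame(
  fpr  = c(0, 0.1, 0.5, 1,   0, 0.3, 0.6, 1),
  tpr  = c(0, 0.6, 0.9, 1,   0, 0.2, 0.7, 1),
  name = rep(c("fold1", "fold2"), each = 4)
)

# The constant colour overrides the mapped colour, so nothing separates the
# folds and geom_line() joins all eight points, ordered by x, into one path.
p.merged <- ggplot(curves) +
  geom_line(aes(x = fpr, y = tpr, colour = name), colour = "red")

# Mapping group explicitly keeps one path per fold with a single colour.
p.grouped <- ggplot(curves) +
  geom_line(aes(x = fpr, y = tpr, group = name), colour = "red")

# Hypothetical helper: count the distinct paths ggplot2 will draw.
n.paths <- function(p) length(unique(layer_data(p, 1)$group))
```

Comparing `n.paths(p.merged)` with `n.paths(p.grouped)` shows whether the curves were merged. Whether this is what happened with `ggroc(..., colour = "red")` in the question should be checked against the linked issue.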