Resamples (folds) for cross-validation in R

Question

I'm using the tidymodels framework to create stratified resampling folds for cross-validation of a random forest model. Is it possible to access and view or plot the data within each of these folds? Reproducible code below:

library(tidyverse)
library(tidymodels)

df_cv <- vfold_cv(iris, v = 10, strata = Species)

Answer 1

Score: 2

The output of vfold_cv is an rset object: a tibble whose splits column holds one rsplit object per fold. You can run split1 <- get_rsplit(df_cv, index = 1) to get the first split. analysis(split1) will give you the analysis data frame and assessment(split1) will give you the assessment data frame.

You can also run tidy(split1) to see which rows went to the analysis set and which went to the assessment set.

This reference gives a little more information about what you can do with an rsplit object.

For a more in-depth understanding of the rsplit class, you can check out the code here.
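
To make the steps above concrete, here is a minimal sketch of pulling the data out of every fold and plotting it. It is one possible approach, not the only one. The seed value, the object names (train1, test1, held_out) and the choice of a faceted scatter plot are my own assumptions for illustration. get_rsplit() needs a reasonably recent rsample; df_cv$splits[[1]] should give the same split on older versions.

```r
library(tidymodels)  # attaches rsample, dplyr, purrr and ggplot2

set.seed(123)  # arbitrary seed so the random folds are reproducible
df_cv <- vfold_cv(iris, v = 10, strata = Species)

# Pull out the first fold and its two data frames
split1 <- get_rsplit(df_cv, index = 1)
train1 <- analysis(split1)    # rows used for fitting in fold 1
test1  <- assessment(split1)  # rows held out in fold 1

# Stack the held-out rows of every fold, labelled with the fold id
held_out <- bind_rows(
  map2(df_cv$splits, df_cv$id,
       function(s, id) mutate(assessment(s), fold = id))
)

# Check the stratification: species counts within each fold
fold_counts <- count(held_out, fold, Species)

# One panel per fold, showing the rows that fold holds out
p <- ggplot(held_out, aes(Sepal.Length, Sepal.Width, colour = Species)) +
  geom_point() +
  facet_wrap(~ fold)
```

Print fold_counts to inspect the per-fold class balance and print p to draw the plot. Swapping assessment(s) for analysis(s) in the map2() call would show the training rows of each fold instead.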

huangapple
  • Posted on February 24, 2023, 01:11:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75548116.html