January 4, 2020, 01:58:20

getResamplingIndices from resampling used in benchmark experiment - mlr

Question

I am using nested cross-validation in a benchmark experiment and would like to retrieve the indices of the instances used in each outer loop. I am aware that getResamplingIndices() is meant for this task, but it does not accept a 'BenchmarkResult' object. Is there a way around this? Here is an example:

res = benchmark(lrns, task, outer, measures = list(auc, ppv, npv), show.info = FALSE, models = TRUE)
getResamplingIndices(res)
Error in getResamplingIndices(res) : 
  Assertion on 'object' failed: Must inherit from class 'ResampleResult', but has class 'BenchmarkResult'.

Answer 1

Score: 2

Resampling indices are the same across learners within a task, so it is enough to call getResamplingIndices() on any one of the ResampleResult objects nested inside your BenchmarkResult object.

To get the indices for every task in your BenchmarkResult object, use:

inds_by_task = lapply(bmr$results, function(x) getResamplingIndices(x[[1]]))
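The one-liner above takes only the first learner's result for each task. A minimal sketch of how one might check that this is safe, assuming the same two learners and two tasks as in the reprex (the names `iris_res` and `same_splits` are introduced here purely for illustration):

```r
library(mlr)

lrns = list(makeLearner("classif.lda"), makeLearner("classif.rpart"))
tasks = list(iris.task, sonar.task)
rdesc = makeResampleDesc("CV", iters = 2L)
bmr = benchmark(lrns, tasks, rdesc, measures = list(acc), show.info = FALSE)

# One list entry per task, each holding $train.inds and $test.inds
inds_by_task = lapply(bmr$results, function(x) getResamplingIndices(x[[1]]))

# Compare the splits seen by the two learners on the iris task
iris_res = bmr$results[["iris-example"]]
same_splits = identical(
  getResamplingIndices(iris_res[["classif.lda"]]),
  getResamplingIndices(iris_res[["classif.rpart"]])
)
print(same_splits)
```

If `same_splits` is TRUE for your own benchmark, picking `x[[1]]` loses no information.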

See below for a full reprex.

library(mlr)
#> Loading required package: ParamHelpers
lrns = list(makeLearner("classif.lda"), makeLearner("classif.rpart"))
tasks = list(iris.task, sonar.task)
rdesc = makeResampleDesc("CV", iters = 2L)
meas = list(acc, ber)
bmr = benchmark(lrns, tasks, rdesc, measures = meas, show.info = FALSE)
getResamplingIndices(bmr$results$`iris-example`$classif.lda)
#> $train.inds
#> $train.inds[[1]]
#>  [1]  25 136  62  77  22 114 101 145  87  34  93 120 133 126  76   4 105  97   7
#> [20] 128  49  19 106   9  54   5  72 102  73  51 109 115  23  86  89 112 130  69
#> [39]  57 122  53  12  60  40  36  70  83  90 108  81  38  50 129  75  71  59  47
#> [58]  95  31 147  37  30 127  43 148 103  27  66 137  29 124  35 143 132  45
#> 
#> $train.inds[[2]]
#>  [1]  79 123  28  17  82  55 110   8 107  26 125 121  32  33  48 119  98  58 116
#> [20] 144 139  67  13 142 111  16  65  21  74  42 113 149 117   2  99  68 140 104
#> [39]  96 150   1  94  46 131 135  11  10 146  85  15  52  78  63 100   3 118 134
#> [58]   6  88 138  41  39  92  56  84  61  20  24  91  80  18  64  14 141  44
#> 
#> 
#> $test.inds
#> $test.inds[[1]]
#>  [1]   1   2   3   6   8  10  11  13  14  15  16  17  18  20  21  24  26  28  32
#> [20]  33  39  41  42  44  46  48  52  55  56  58  61  63  64  65  67  68  74  78
#> [39]  79  80  82  84  85  88  91  92  94  96  98  99 100 104 107 110 111 113 116
#> [58] 117 118 119 121 123 125 131 134 135 138 139 140 141 142 144 146 149 150
#> 
#> $test.inds[[2]]
#>  [1]   4   5   7   9  12  19  22  23  25  27  29  30  31  34  35  36  37  38  40
#> [20]  43  45  47  49  50  51  53  54  57  59  60  62  66  69  70  71  72  73  75
#> [39]  76  77  81  83  86  87  89  90  93  95  97 101 102 103 105 106 108 109 112
#> [58] 114 115 120 122 124 126 127 128 129 130 132 133 136 137 143 145 147 148

Created on 2020-01-04 by the reprex package (v0.3.0)
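The question concerns nested cross-validation, which the reprex does not cover. The following is a rough sketch of how the same approach might look in that setting. The asker's `lrns`, `task` and `outer` are not shown in the question, so the tuned rpart learner, the `cp` grid, the fold counts and the choice of `sonar.task` below are all assumptions made for illustration.

```r
library(mlr)
set.seed(1)

# Inner loop (assumed setup): tune rpart's cp with 2-fold CV
inner = makeResampleDesc("CV", iters = 2L)
ps = makeParamSet(makeDiscreteParam("cp", values = c(0.01, 0.1)))
ctrl = makeTuneControlGrid()
lrn = makeTuneWrapper("classif.rpart", resampling = inner, par.set = ps,
  control = ctrl, show.info = FALSE)

# Outer loop (assumed setup): 3-fold CV passed to benchmark()
outer = makeResampleDesc("CV", iters = 3L)
res = benchmark(list(lrn), list(sonar.task), outer, measures = list(acc),
  show.info = FALSE)

# First task, first learner: a ResampleResult holding the outer splits
outer_inds = getResamplingIndices(res$results[[1]][[1]])
str(outer_inds, max.level = 2)
```

Indexing with `[[1]][[1]]` avoids depending on the task and learner ids; with several tasks, the `lapply()` call shown earlier in the answer applies unchanged.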
