getResamplingIndices from resampling used in benchmark experiment - mlr

Question

I am using nested cross-validation in a benchmark experiment and would like to retrieve the indices of the instances used in each outer loop. I am aware of the function getResamplingIndices(), which is suited to this task, but it won't accept a 'BenchmarkResult' object. Is there a way around this? Here is an example:

    res = benchmark(lrns, task, outer, measures = list(auc, ppv, npv), show.info = FALSE, models = TRUE)
    getResamplingIndices(res)
    #> Error in getResamplingIndices(res) :
    #>   Assertion on 'object' failed: Must inherit from class 'ResampleResult', but has class 'BenchmarkResult'.

Answer 1

Score: 2

Within a task, the resampling indices are the same across learners, so you can apply getResamplingIndices() to any one of the ResampleResult objects nested within your BenchmarkResult object.

To get the indices for every task in your BenchmarkResult object, use:

    inds_by_task = lapply(bmr$results, function(x) getResamplingIndices(x[[1]]))
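Taking only the first learner per task (x[[1]]) relies on benchmark() reusing one set of splits for all learners of a task. The following is a minimal sketch of how that could be checked, assuming the same learners and tasks as in the reprex; the name same_within_task is made up for illustration and is not part of mlr.

```r
library(mlr)

lrns = list(makeLearner("classif.lda"), makeLearner("classif.rpart"))
tasks = list(iris.task, sonar.task)
rdesc = makeResampleDesc("CV", iters = 2L)
bmr = benchmark(lrns, tasks, rdesc, measures = list(acc), show.info = FALSE)

# One set of indices per task, taken from the first learner
inds_by_task = lapply(bmr$results, function(x) getResamplingIndices(x[[1]]))

# Hypothetical helper result: for each task, compare the indices of every
# learner against those of the first learner
same_within_task = vapply(bmr$results, function(task_res) {
  all_inds = lapply(task_res, getResamplingIndices)
  all(vapply(all_inds, identical, logical(1), all_inds[[1]]))
}, logical(1))

same_within_task
```

If any entry came back FALSE, the x[[1]] shortcut would not be safe for that task and the indices would have to be extracted per learner instead.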

See below for a full reprex.

    library(mlr)
    #> Loading required package: ParamHelpers
    lrns = list(makeLearner("classif.lda"), makeLearner("classif.rpart"))
    tasks = list(iris.task, sonar.task)
    rdesc = makeResampleDesc("CV", iters = 2L)
    meas = list(acc, ber)
    bmr = benchmark(lrns, tasks, rdesc, measures = meas, show.info = FALSE)
    getResamplingIndices(bmr$results$`iris-example`$classif.lda)
    #> $train.inds
    #> $train.inds[[1]]
    #> [1] 25 136 62 77 22 114 101 145 87 34 93 120 133 126 76 4 105 97 7
    #> [20] 128 49 19 106 9 54 5 72 102 73 51 109 115 23 86 89 112 130 69
    #> [39] 57 122 53 12 60 40 36 70 83 90 108 81 38 50 129 75 71 59 47
    #> [58] 95 31 147 37 30 127 43 148 103 27 66 137 29 124 35 143 132 45
    #>
    #> $train.inds[[2]]
    #> [1] 79 123 28 17 82 55 110 8 107 26 125 121 32 33 48 119 98 58 116
    #> [20] 144 139 67 13 142 111 16 65 21 74 42 113 149 117 2 99 68 140 104
    #> [39] 96 150 1 94 46 131 135 11 10 146 85 15 52 78 63 100 3 118 134
    #> [58] 6 88 138 41 39 92 56 84 61 20 24 91 80 18 64 14 141 44
    #>
    #>
    #> $test.inds
    #> $test.inds[[1]]
    #> [1] 1 2 3 6 8 10 11 13 14 15 16 17 18 20 21 24 26 28 32
    #> [20] 33 39 41 42 44 46 48 52 55 56 58 61 63 64 65 67 68 74 78
    #> [39] 79 80 82 84 85 88 91 92 94 96 98 99 100 104 107 110 111 113 116
    #> [58] 117 118 119 121 123 125 131 134 135 138 139 140 141 142 144 146 149 150
    #>
    #> $test.inds[[2]]
    #> [1] 4 5 7 9 12 19 22 23 25 27 29 30 31 34 35 36 37 38 40
    #> [20] 43 45 47 49 50 51 53 54 57 59 60 62 66 69 70 71 72 73 75
    #> [39] 76 77 81 83 86 87 89 90 93 95 97 101 102 103 105 106 108 109 112
    #> [58] 114 115 120 122 124 126 127 128 129 130 132 133 136 137 143 145 147 148

Created on 2020-01-04 by the reprex package (v0.3.0)
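Once the indices are extracted, they can be used to recover the actual rows of each outer fold. The sketch below is one way this might look, assuming a single learner on iris.task with the same 2-fold CV as in the reprex; the names iris_data, fold1_train and fold1_test are illustrative only.

```r
library(mlr)

rdesc = makeResampleDesc("CV", iters = 2L)
bmr = benchmark(makeLearner("classif.rpart"), iris.task, rdesc,
  measures = list(acc), show.info = FALSE)

# Results are nested as task id -> learner id
inds = getResamplingIndices(bmr$results[["iris-example"]][["classif.rpart"]])

# Pull the underlying data.frame out of the task and subset it by fold
iris_data = getTaskData(iris.task)
fold1_train = iris_data[inds$train.inds[[1]], ]
fold1_test = iris_data[inds$test.inds[[1]], ]

# Train and test rows of one fold should not overlap
length(intersect(inds$train.inds[[1]], inds$test.inds[[1]]))
```

The same subsetting should work for any task in the benchmark, as long as the indices and the data come from the same task.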

Posted by huangapple on 2020-01-04 01:58:20
Source: https://go.coder-hub.com/59583188.html