How can I create a data frame that collects the results of a for loop with multiple indexes in R?

Question

I want to collect the results of a for loop in a data frame so that I can inspect them more easily. I am working with random survival forests, so here is an example to illustrate.

library("randomForestSRC")
data(veteran, package = "randomForestSRC")
for (j in 1:3) {
  for (i in 1:2) {
    obj <- rfsrc(Surv(time, status) ~ ., data = veteran,
                 ntree = 100, block.size = j, mtry = i, nodesize = 3, samptype = "swr")
    car <- 1 - get.cindex(obj$yvar[, 1], obj$yvar[, 2], obj$predicted.oob)
    print(c(car, round(i, 0), j))
  }
}

The result is the following:

[1] 0.6748275 1.0000000 1.0000000
[1] 0.6877191 2.0000000 1.0000000
[1] 0.6577519 1.0000000 2.0000000
[1] 0.6826303 2.0000000 2.0000000
[1] 0.6614837 1.0000000 3.0000000
[1] 0.6847789 2.0000000 3.0000000

I want to collect these results in a data frame; in particular, I am interested in the values of "car".

I first tried with only the index j and got what I wanted. I built a data frame and initialized car:

b <- 1:3
df <- data.frame(b)
car <- 0

Then I ran the same loop over the index j only and obtained the data frame I wanted (one column for j and one column for the corresponding values of car) as follows:

for (j in 1:3) {
  obj <- rfsrc(Surv(time, status) ~ ., data = veteran,
               ntree = 100, block.size = 1, mtry = j, nodesize = 3, samptype = "swr")
  df$car[j] <- 1 - get.cindex(obj$yvar[, 1], obj$yvar[, 2], obj$predicted.oob)
}

The same method does not work with the i and j indexes together. I hope someone can give me a hint on how to solve the problem.

Answer 1

Score: 4

Having to deal with two indices in loops can be a bit inconvenient, especially since you're not after array output. It might be easier to use Map() with the combinations created by expand.grid(). At the same time, you can switch to more informative names:

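library("randomForestSRC")
data(veteran, package = "randomForestSRC")

# Fit one forest for a given (mtry, block.size) pair and return its "car" value
fit_car <- function(mtry, block.size) {
  obj <- rfsrc(
    Surv(time, status) ~ .,
    data = veteran,
    ntree = 100,
    block.size = block.size,
    mtry = mtry,
    nodesize = 3,
    samptype = "swr"
  )
  # Same quantity the question computes inside its loop
  car <- 1 - get.cindex(obj$yvar[, 1], obj$yvar[, 2], obj$predicted.oob)
  car
}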

df <- expand.grid(mtry = 1:2, block.size = 1:3)
df$car <- unlist(Map(fit_car, df$mtry, df$block.size))
df
#>   mtry block.size       car
#> 1    1          1 0.6857967
#> 2    2          1 0.7095443
#> 3    1          2 0.6437295
#> 4    2          2 0.6921294
#> 5    1          3 0.6640846
#> 6    2          3 0.6917901
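
If you would rather keep the nested for loops, one likely reason the single-index approach breaks down is that df$car[j] has only one slot per value of j, so each pass over i would overwrite the previous value. A minimal sketch of one way around this is to preallocate one row per (i, j) combination and fill the rows with a running counter. Note that toy_car() is a hypothetical stand-in for the forest fit so that the sketch runs without randomForestSRC, and the names results, mtry_values, block_values and k are illustrative assumptions rather than anything from the posts above.

```r
# Hypothetical stand-in for the forest fit; in real use, call the
# rfsrc()/get.cindex() code from the question or answer here instead.
toy_car <- function(mtry, block.size) {
  mtry / 10 + block.size / 100
}

mtry_values <- 1:2
block_values <- 1:3

# Preallocate one row per (mtry, block.size) combination.
n_rows <- length(mtry_values) * length(block_values)
results <- data.frame(
  mtry = integer(n_rows),
  block.size = integer(n_rows),
  car = numeric(n_rows)
)

k <- 0  # running row index shared by both loops
for (j in block_values) {
  for (i in mtry_values) {
    k <- k + 1  # advance once per combination so no row is overwritten
    results$mtry[k] <- i
    results$block.size[k] <- j
    results$car[k] <- toy_car(i, j)
  }
}

results
```

The rows come out in the same order as the printed output in the question (i varies fastest within each j), which is also the order expand.grid(mtry = 1:2, block.size = 1:3) produces.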

huangapple
  • Published on May 28, 2023, 05:37:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76349152.html