May 28, 2023 05:37:04

How can I create a data frame that collects the results of a for loop with multiple indexes in R?

Question

I want to collect the results of a for loop in a data frame so that I can inspect them more easily. I'm working with random survival forests, so here is an example to explain what I mean.

library("randomForestSRC")
data(veteran, package = "randomForestSRC")
for (j in 1:3) {
  for (i in 1:2) {
    obj <- rfsrc(Surv(time, status) ~ ., data = veteran,
                 ntree = 100, block.size = j, mtry = i, nodesize = 3, samptype = "swr")
    car <- 1 - get.cindex(obj$yvar[,1], obj$yvar[,2], obj$predicted.oob)
    print(c(car, round(i, 0), j))
  }
}

The result is the following (columns are car, i, j):

[1] 0.6748275 1.0000000 1.0000000
[1] 0.6877191 2.0000000 1.0000000
[1] 0.6577519 1.0000000 2.0000000
[1] 0.6826303 2.0000000 2.0000000
[1] 0.6614837 1.0000000 3.0000000
[1] 0.6847789 2.0000000 3.0000000

I want to collect these results in a data frame; in particular, I'm interested in the values of "car".

I first tried with only the index j, and that gave me what I wanted. I built a data frame and initialized car:

b <- 1:3
df <- data.frame(b)
car <- 0

Then I ran the same loop over the index j only and obtained the data frame I wanted (one column for j and one column for the corresponding values of car), as follows:

for (j in 1:3) {
  obj <- rfsrc(Surv(time, status) ~ ., data = veteran,
               ntree = 100, block.size = 1, mtry = j, nodesize = 3, samptype = "swr")
  df$car[j] <- 1 - get.cindex(obj$yvar[,1], obj$yvar[,2], obj$predicted.oob)
}

The same method does not work with the i and j indexes together. I hope someone can give me a hint on how to solve the problem.
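
One plausible reason the single-index approach breaks down: df has only three rows (one per j), while the double loop produces six values, so indexing df$car by j alone would overwrite results. A minimal sketch of a loop-based fix, assuming a hypothetical stand-in toy_car() in place of the rfsrc() fit (it is not part of randomForestSRC), preallocates one row per (i, j) pair and advances a row counter:

```r
# Hypothetical stand-in for the rfsrc() fit and get.cindex() call, so the
# sketch runs without randomForestSRC installed.
toy_car <- function(i, j) {
  0.6 + 0.01 * i + 0.001 * j
}

n_i <- 2  # number of mtry values (inner loop)
n_j <- 3  # number of block.size values (outer loop)

# Preallocate one row per (i, j) pair instead of one row per j.
results <- data.frame(
  j   = integer(n_i * n_j),
  i   = integer(n_i * n_j),
  car = numeric(n_i * n_j)
)

row <- 0
for (j in seq_len(n_j)) {
  for (i in seq_len(n_i)) {
    row <- row + 1  # advance to the next free row
    results$j[row]   <- j
    results$i[row]   <- i
    results$car[row] <- toy_car(i, j)
  }
}
results
```

With the real model, the toy_car(i, j) call would be replaced by the rfsrc() fit and the get.cindex() computation from the question.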

Answer 1

Score: 4

Having to deal with two indices in loops can be a bit inconvenient,
especially since you’re not after array output. It might be easier to use
Map() with the combinations created with expand.grid(). At the same time
you can swap to using more informative names:

fit_car <- function(mtry, block.size) {
  obj <- rfsrc(
    Surv(time, status) ~ .,
    data = veteran,
    ntree = 100,
    block.size = block.size,
    mtry = mtry,
    nodesize = 3,
    samptype = "swr"
  )
  car <- 1 - get.cindex(obj$yvar[, 1], obj$yvar[, 2], obj$predicted.oob)
  car
}
df <- expand.grid(mtry = 1:2, block.size = 1:3)
df$car <- unlist(Map(fit_car, df$mtry, df$block.size))
df
#>   mtry block.size       car
#> 1    1          1 0.6857967
#> 2    2          1 0.7095443
#> 3    1          2 0.6437295
#> 4    2          2 0.6921294
#> 5    1          3 0.6640846
#> 6    2          3 0.6917901
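
Two details of this pattern may be worth seeing in isolation: expand.grid() varies its first column fastest, so the rows come out in the same order as the original nested loop (i inside j), and mapply() can replace unlist(Map()) because it simplifies to a vector by default. A small sketch, assuming a hypothetical toy_fit() as a stand-in for fit_car() so that it runs without fitting any forests:

```r
# Hypothetical stand-in for fit_car(): encodes both settings in one number
# so the row order is easy to check by eye.
toy_fit <- function(mtry, block.size) {
  10 * block.size + mtry
}

# mtry (first argument) varies fastest, block.size slowest.
grid <- expand.grid(mtry = 1:2, block.size = 1:3)

# Map() returns a list, hence the unlist(); mapply() simplifies to a vector.
via_map    <- unlist(Map(toy_fit, grid$mtry, grid$block.size))
via_mapply <- mapply(toy_fit, grid$mtry, grid$block.size)

grid$car <- via_mapply
grid
```

Both calls give the same numeric vector here, so the choice between them is mostly a matter of taste.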
