Shared library is not recognized when using parallel foreach in R
Question
I am using a shared library written in C from my R code. I load the compiled library with dyn.load, and I want to call one of its functions inside a parallel foreach loop. Here is my code:
library(foreach)
library(doParallel)
totalCores <- detectCores()
cluster <- makeCluster(totalCores[1] - 1)
registerDoParallel(cluster)
dyn.load("package.so")
run <- function(i) {
  row <- data[i, ]
  res <- .Call("c_function", as.double(row))
  return(res)
}
result <- foreach(i = 1:nrow(data), .combine = rbind) %dopar% {
  run(i)
}
I get the following error:
Error in { :
task 1 failed - "C symbol name "c_function" not in load table"
Although I have loaded the shared library, c_function does not seem to be recognized inside the parallel tasks. Of course, when I put the dyn.load call inside the foreach loop, the problem goes away:
result <- foreach(i = 1:nrow(data), .combine = rbind) %dopar% {
  dyn.load("package.so")
  run(i)
}
But I am not sure this is best practice, since the shared library (package.so) is then loaded on every iteration, which may be inefficient. Any ideas?
Edit:
Regarding r2even's answer, I tested the following code:
foreach(i = 1:50, .packages = 'rootSolve') %dopar% {
  print(is.loaded("c_function"))
}
My PC has 10 CPU cores (20 threads), so the value of the totalCores variable was 20 when I ran this code. Here is the result:
[[1]]
[1] TRUE
[[2]]
[1] TRUE
[[3]]
[1] TRUE
[[4]]
[1] TRUE
[[5]]
[1] TRUE
[[6]]
[1] TRUE
[[7]]
[1] TRUE
[[8]]
[1] TRUE
[[9]]
[1] TRUE
[[10]]
[1] TRUE
[[11]]
[1] FALSE
[[12]]
[1] FALSE
[[13]]
[1] FALSE
[[14]]
[1] FALSE
[[15]]
[1] FALSE
[[16]]
[1] FALSE
[[17]]
[1] FALSE
[[18]]
[1] FALSE
[[19]]
[1] FALSE
[[20]]
[1] TRUE
[[21]]
[1] TRUE
[[22]]
[1] TRUE
[[23]]
[1] TRUE
[[24]]
[1] TRUE
[[25]]
[1] TRUE
[[26]]
[1] TRUE
[[27]]
[1] TRUE
[[28]]
[1] TRUE
[[29]]
[1] TRUE
[[30]]
[1] TRUE
[[31]]
[1] TRUE
[[32]]
[1] TRUE
[[33]]
[1] TRUE
[[34]]
[1] TRUE
[[35]]
[1] TRUE
[[36]]
[1] TRUE
[[37]]
[1] TRUE
[[38]]
[1] TRUE
[[39]]
[1] TRUE
[[40]]
[1] TRUE
[[41]]
[1] TRUE
[[42]]
[1] TRUE
[[43]]
[1] TRUE
[[44]]
[1] TRUE
[[45]]
[1] TRUE
[[46]]
[1] TRUE
[[47]]
[1] TRUE
[[48]]
[1] TRUE
[[49]]
[1] TRUE
[[50]]
[1] TRUE
This raises several questions. Is is.loaded("c_function") always FALSE only in iterations 11 to 20? Is it guaranteed to be TRUE from iteration 21 onward?
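One way to probe this, offered only as a sketch: each worker started by makeCluster is a separate R process with its own load table, so is.loaded can be asked once per worker instead of once per iteration. The two-worker cluster below is an assumption made to keep the example small, and on freshly started workers the symbol is expected to be absent. Which iteration lands on which worker is a scheduling detail, so a per-worker check is likely a more reliable view than the iteration index.

```r
library(parallel)

# Two workers keep the sketch small; the question used detectCores() - 1.
cluster <- makeCluster(2)

# Ask every worker process whether the symbol is in its own load table.
# "c_function" is the symbol name used in the question.
loaded <- clusterEvalQ(cluster, is.loaded("c_function"))
pids <- clusterEvalQ(cluster, Sys.getpid())

# One row per worker process, not per foreach iteration.
status <- data.frame(pid = unlist(pids), loaded = unlist(loaded))
print(status)

stopCluster(cluster)
```

Running the same two clusterEvalQ calls against the cluster registered with registerDoParallel should show which workers have already executed dyn.load.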
Answer 1
Score: 1
Try this *hack*, which is almost certainly less inefficient than calling dyn.load on every iteration:
result <- foreach(i = 1:nrow(data), .combine = rbind) %dopar% {
  if (!is.loaded("c_function")) dyn.load("package.so")
  run(i)
}
(Untested.)
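An alternative worth considering, again only as a sketch: a dyn.load call in the master session affects the master process only, so the library can instead be loaded once in each worker right after the cluster is created. The helper name load_on_workers is hypothetical, and the code assumes the workers run on the same machine and can read the same file path. The file.exists guard is there only so the sketch still runs when package.so is not present.

```r
library(parallel)

# Hypothetical helper: load a shared library once in every worker process.
load_on_workers <- function(cluster, lib_path) {
  # Resolve the path on the master so workers do not depend on their own
  # working directory (assumes workers share the master's file system).
  lib_path <- normalizePath(lib_path, mustWork = TRUE)
  invisible(clusterCall(cluster, dyn.load, lib_path))
}

cluster <- makeCluster(2)  # two workers, chosen only for the sketch

lib <- "package.so"  # file name taken from the question
if (file.exists(lib)) {
  load_on_workers(cluster, lib)
  # Each worker should now report the symbol as loaded.
  print(unlist(clusterEvalQ(cluster, is.loaded("c_function"))))
}

stopCluster(cluster)
```

With doParallel, the same call would go between registerDoParallel(cluster) and the foreach loop, so the loop body can call run(i) without any loading logic.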