September 15, 2015, 22:51:06 · go

How to compile CUDA source with Go's cgo?

Question

I wrote a simple CUDA C program, and it runs correctly in Eclipse Nsight. This is the source code:

#include <iostream>
#include <stdio.h>
__global__ void add(int a, int b, int *c){
    *c = a + b;
}
int main(void){
    int c;
    int *dev_c;
    cudaMalloc((void**)&dev_c, sizeof(int));
    add<<<1,1>>>(2, 7, dev_c);
    cudaMemcpy(&c, dev_c, sizeof(int), cudaMemcpyDeviceToHost);
    printf("\n2+7= %d\n", c);
    cudaFree(dev_c);
    return 0;
}

Now I am trying to use this code from Go through cgo, so I wrote this new code:

package main
//#include "/usr/local/cuda-7.0/include/cuda.h"
//#include "/usr/local/cuda-7.0/include/cuda_runtime.h"
//#cgo LDFLAGS: -lcuda
//#cgo LDFLAGS: -lcurand
////default location:
//#cgo LDFLAGS: -L/usr/local/cuda-7.0/lib64 -L/usr/local/cuda-7.0/lib
//#cgo CFLAGS: -I/usr/local/cuda-7.0/include/
//
//
//
//
//
//
//
//
//
//
//
/*
#include <stdio.h>
__global__ void add(int a, int b, int *c){
    *c = a + b;
}
int esegui_somma(void){
    int c;
    int *dev_c;
    cudaMalloc((void**)&dev_c, sizeof(int));
    add<<<1,1>>>(2, 7, dev_c);
    cudaMemcpy(&c, dev_c, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev_c);
    return c;
}
*/
import "C"
import "fmt"
func main(){
    fmt.Printf("il risultato è %d", C.esegui_somma)
}

But it does not work. I get this error message:

cgo_cudabyexample_1/main.go:34:8: error: expected expression before '<' token
add <<<1,1>>> (2,7,dev_c);
      ^

I think I need to make cgo use the nvcc CUDA compiler instead of gcc. How can I do that? Can I change the CC environment variable? Best regards.

Answer 1

Score: 2

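I finally figured out how to do this. The biggest problem is that nvcc does not follow gcc's standard flags and, unlike clang, it will not silently ignore them. cgo triggers the problem by adding a number of flags that the user never explicitly specified.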

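To make it all work, separate your device code and the functions that directly call it into their own files, and compile and package them directly with nvcc into a shared library (.so). Then use cgo to link against this shared library with whatever default linker you have on your system. The only thing you have to add is -lcudart to your LDFLAGS (linker flags) to link the CUDA runtime.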
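
The steps above might look roughly like the following on the Go side. This is only a sketch under assumptions: the file name add.cu, the library name libadd.so, and the nvcc command shown in the comment are illustrative and do not come from the answer; the wrapper name esegui_somma is taken from the question.

```go
// main.go: a sketch of the cgo side. It assumes the kernel and its host
// wrapper were moved to a hypothetical file add.cu and built separately:
//
//   nvcc --shared -Xcompiler -fPIC -o libadd.so add.cu
//
// In add.cu the wrapper would be declared as
//
//   extern "C" int esegui_somma(void);
//
// because nvcc compiles .cu files as C++, and without extern "C" the
// symbol name would be mangled and cgo could not resolve it.
package main

/*
#cgo LDFLAGS: -L${SRCDIR} -ladd -lcudart
// Prototype only: the definition lives in libadd.so (assumed name).
int esegui_somma(void);
*/
import "C"

import "fmt"

func main() {
	// Note the call parentheses; C.esegui_somma alone is not a call.
	fmt.Printf("il risultato è %d\n", int(C.esegui_somma()))
}
```

With this layout cgo only ever sees a plain C prototype, so gcc never has to parse the kernel launch syntax. At run time the dynamic loader also has to find libadd.so, for example through LD_LIBRARY_PATH. If the CUDA libraries are not on the default linker path, an -L flag such as the -L/usr/local/cuda-7.0/lib64 from the question would still be needed.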
