How can I set -arch=sm_NN of nvcc in my CMake configuration?
Question
I have a `.cu` file that calls `atomicCAS` with arguments of type `(unsigned short *, unsigned short, unsigned short)`. My environment uses `--gpu-code=sm_86,sm_86,sm_61`. Compilation fails under the default CUDA architecture.

When I compile with `nvcc -arch=sm_86 my.cu`, the function compiles successfully. So how can I set nvcc's `-arch=sm_86` option in my CMakeLists.txt?
Answer 1
Score: 2
In CMake 3.18 and above, you do this by setting the `CUDA_ARCHITECTURES` target property to a semicolon-separated list of architecture numbers (CMake uses semicolons as its list entry separator). The property is initialized from the `CMAKE_CUDA_ARCHITECTURES` variable when the target is created.
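For the case in the question, a minimal `CMakeLists.txt` might look like the sketch below. The project name `my_project` and target name `my_kernels` are placeholders, not names from the question. Only architecture 86 is listed, on the assumption that the `unsigned short` overload of `atomicCAS` requires compute capability 7.0 or higher, in which case adding 61 would likely reproduce the compile error.

```cmake
cmake_minimum_required(VERSION 3.18)

# Placeholder project name; listing CUDA as a language makes CMake locate nvcc.
project(my_project LANGUAGES CXX CUDA)

# Placeholder target built from the question's my.cu.
add_library(my_kernels STATIC my.cu)

# Build for compute capability 8.6 only.
# A bare number requests both the real (sm_86) and virtual (compute_86) architecture.
set_target_properties(my_kernels PROPERTIES CUDA_ARCHITECTURES "86")
```

To avoid hard-coding the value, the same effect should be achievable by passing `-DCMAKE_CUDA_ARCHITECTURES=86` when configuring, since the target property is initialized from that variable.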
From the docs' Examples section:
>     set_target_properties(tgt PROPERTIES CUDA_ARCHITECTURES "35;50;72")
>
> Generates code for real and virtual architectures 35, 50 and 72.
>
>     set_property(TARGET tgt PROPERTY CUDA_ARCHITECTURES 70-real 72-virtual)
>
> Generates code for real architecture 70 and virtual architecture 72.
>
>     set_property(TARGET tgt PROPERTY CUDA_ARCHITECTURES OFF)
>
> CMake will not pass any architecture flags to the compiler.
The source code that sets the build command with the `sm_` argument for the Makefile generator can be found in `Source/cmMakefileTargetGenerator.cxx`:

    void cmMakefileTargetGenerator::WriteDeviceLinkRule(
      std::vector<std::string>& commands, const std::string& output)
    {
      std::string architecturesStr =
        this->GeneratorTarget->GetSafeProperty("CUDA_ARCHITECTURES");
      ...
      std::vector<std::string> architectures = cmExpandedList(architecturesStr);
      ...
      for (const std::string& architectureKind : architectures) {
        ...
        const std::string architecture =
          architectureKind.substr(0, architectureKind.find('-'));
        ...
        std::string command = cmStrCat(
          this->Makefile->GetRequiredDefinition("CMAKE_CUDA_DEVICE_LINKER"),
          " -arch=sm_", architecture, registerFileCmd, " -o=$@ ",
          cmJoin(linkDeps, " "));
        localGen->WriteMakeRule(*this->BuildFileStream, nullptr, cubin, linkDeps,
                                { command }, false);
      }
      ...
    }

Otherwise, you can pass the flag to nvcc directly with `target_compile_options` or `add_compile_options`.
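A minimal sketch of the manual-flag route, assuming a placeholder target named `my_kernels` and a project that has already enabled the CUDA language. Setting `CUDA_ARCHITECTURES` to `OFF` is included on the assumption that you do not want CMake's own architecture flags to be emitted alongside the hand-written one.

```cmake
# Placeholder target; assumes project(... LANGUAGES CUDA) was called earlier.
add_library(my_kernels STATIC my.cu)

# Stop CMake from generating its own architecture flags for this target.
set_property(TARGET my_kernels PROPERTY CUDA_ARCHITECTURES OFF)

# Pass the flag only when compiling CUDA sources, so C/C++ files in the
# same target are not given an option their compiler does not understand.
target_compile_options(my_kernels PRIVATE
  $<$<COMPILE_LANGUAGE:CUDA>:-arch=sm_86>)
```

Compile options do not necessarily reach a device link step, so for targets that use separable compilation the `CUDA_ARCHITECTURES` property is probably the safer choice.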
Also slightly related: the `CUDA_SELECT_NVCC_ARCH_FLAGS` function in the `FindCUDA` module. Note that `FindCUDA` has been deprecated since CMake 3.10, superseded by the first-class CUDA language support that CMake has offered since version 3.8.
Related CUDA docs: https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#options-for-steering-gpu-code-generation
Note also that since CMake v3.24, `CUDA_ARCHITECTURES` accepts the value `native`: "Compile for the architecture(s) of the host's GPU(s)."
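One way to use `native` while still supporting older CMake versions is sketched below. The fallback value 86 is taken from the question, and the placement before `project()` is an assumption based on CMake otherwise filling in a compiler default for `CMAKE_CUDA_ARCHITECTURES` once the CUDA language is enabled.

```cmake
cmake_minimum_required(VERSION 3.18)

# Respect a value given on the command line (-DCMAKE_CUDA_ARCHITECTURES=...).
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.24)
    # Detect the GPU(s) present on the build machine.
    set(CMAKE_CUDA_ARCHITECTURES native)
  else()
    # Fallback for older CMake: the architecture from the question.
    set(CMAKE_CUDA_ARCHITECTURES 86)
  endif()
endif()

# Placeholder project name; targets created afterwards inherit the value above.
project(my_project LANGUAGES CXX CUDA)
```

Keep in mind that `native` ties the build to the machine doing the compiling, so it is usually a poor fit for binaries that will be distributed to other hosts.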