How can I set -arch=sm_NN of nvcc in my CMake configuration?

Question

I have a .cu file that calls atomicCAS with arguments (unsigned short *, unsigned short, unsigned short). My environment uses --gpu-code=sm_86,sm_86,sm_61. Compilation fails under the default CUDA architecture.

I tested with nvcc -arch=sm_86 my.cu, and the function compiled successfully. So how can I set nvcc's -arch=sm_86 in my CMakeLists.txt?

Answer 1

Score: 2

In CMake 3.18 and above, you do this by setting the CUDA_ARCHITECTURES target property to a semicolon-separated list of architecture numbers (semicolons are CMake's list separator). The property is initialized from the CMAKE_CUDA_ARCHITECTURES variable if that variable is set when the target is created.

From the docs' Examples section:

> set_target_properties(tgt PROPERTIES CUDA_ARCHITECTURES "35;50;72")
>
> Generates code for real and virtual architectures 35, 50 and 72.
>
> set_property(TARGET tgt PROPERTY CUDA_ARCHITECTURES 70-real 72-virtual)
>
> Generates code for real architecture 70 and virtual architecture 72.
>
> set_property(TARGET tgt PROPERTY CUDA_ARCHITECTURES OFF)
>
> CMake will not pass any architecture flags to the compiler.
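
Applied to the question's case, a minimal CMakeLists.txt might look like the sketch below. The project name `atomic_cas_example` and the target name `my_app` are hypothetical placeholders; `my.cu` is the file from the question. Only architecture 86 is listed, on the assumption that the 16-bit atomicCAS call is what fails to compile for older architectures.

```cmake
cmake_minimum_required(VERSION 3.18)
project(atomic_cas_example LANGUAGES CXX CUDA)  # hypothetical project name

# Option A: set the default for every target created after this line.
# The same value can be passed at configure time instead:
#   cmake -DCMAKE_CUDA_ARCHITECTURES=86 ..
set(CMAKE_CUDA_ARCHITECTURES 86)

add_executable(my_app my.cu)  # hypothetical target name

# Option B: set (or override) the architecture for this one target.
# A bare "86" requests both real (sm_86) and virtual (compute_86) code.
set_property(TARGET my_app PROPERTY CUDA_ARCHITECTURES 86)
```

Either option alone should be enough; the target property wins when both are present, since the variable only supplies the property's initial value.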

The source code that sets the build command with the sm_ argument for the Makefile generator can be found in Source/cmMakefileTargetGenerator.cxx:

void cmMakefileTargetGenerator::WriteDeviceLinkRule(
  std::vector<std::string>& commands, const std::string& output)
{
  std::string architecturesStr =
    this->GeneratorTarget->GetSafeProperty("CUDA_ARCHITECTURES");
  ...
  std::vector<std::string> architectures = cmExpandedList(architecturesStr);
  ...
  for (const std::string& architectureKind : architectures) {
    ...
    const std::string architecture =
      architectureKind.substr(0, architectureKind.find('-'));
    ...
    std::string command = cmStrCat(
      this->Makefile->GetRequiredDefinition("CMAKE_CUDA_DEVICE_LINKER"),
      " -arch=sm_", architecture, registerFileCmd, " -o=$@ ",
      cmJoin(linkDeps, " "));
    localGen->WriteMakeRule(*this->BuildFileStream, nullptr, cubin, linkDeps,
                            { command }, false);
  }
  ...
}

Alternatively, you can pass the flag to nvcc directly with target_compile_options or add_compile_options.
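
A sketch of the direct-flag approach, assuming a hypothetical target `my_app` built from the question's `my.cu`. CUDA_ARCHITECTURES is set to OFF here so that CMake adds no architecture flags of its own, and a generator expression restricts the option to CUDA sources:

```cmake
cmake_minimum_required(VERSION 3.18)
project(atomic_cas_example LANGUAGES CXX CUDA)  # hypothetical project name

add_executable(my_app my.cu)  # hypothetical target name

# Tell CMake not to emit any architecture flags itself.
set_property(TARGET my_app PROPERTY CUDA_ARCHITECTURES OFF)

# Hand -arch=sm_86 to nvcc, but only when compiling CUDA sources,
# so a C++ host compiler never sees the flag.
target_compile_options(my_app PRIVATE
  $<$<COMPILE_LANGUAGE:CUDA>:-arch=sm_86>)
```

This bypasses CMake's own architecture handling, so the CUDA_ARCHITECTURES property is usually the more portable choice when it is available.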

Also slightly related: the CUDA_SELECT_NVCC_ARCH_FLAGS function in the FindCUDA module (the module is deprecated since CMake 3.10 in favor of CMake's first-class CUDA language support).

Related CUDA docs: https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#options-for-steering-gpu-code-generation

Note also that since CMake v3.24, there is the native value for CUDA_ARCHITECTURES: "Compile for the architecture(s) of the host's GPU(s)."
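
A sketch of using the native value while staying compatible with older CMake releases. The target name `my_app` and the fallback value 86 are assumptions taken from the question's setup, not something the CMake docs prescribe:

```cmake
cmake_minimum_required(VERSION 3.18)
project(atomic_cas_example LANGUAGES CXX CUDA)  # hypothetical project name

add_executable(my_app my.cu)  # hypothetical target name

if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.24)
  # Build for whatever GPU(s) are installed in the build machine.
  set_property(TARGET my_app PROPERTY CUDA_ARCHITECTURES native)
else()
  # Older CMake: fall back to an explicit architecture.
  set_property(TARGET my_app PROPERTY CUDA_ARCHITECTURES 86)
endif()
```

Note that native ties the binary to the build machine's hardware, so an explicit list is likely the safer choice when the build host and the deployment host differ.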
