2023-02-27 11:50:59


How can I set -arch=sm_NN of nvcc in my CMake configuration?

Question


I have a .cu file that uses atomicCAS with arguments of type (unsigned short *, unsigned short, unsigned short). My environment uses --gpu-code=sm_86,sm_86,sm_61. Compilation fails under the default CUDA architecture.

I tested with nvcc -arch=sm_86 my.cu, and the function compiled successfully. So how can I set nvcc's -arch=sm_86 in my CMakeLists.txt?

Answer 1

Score: 2


In CMake 3.18 and above, set the CUDA_ARCHITECTURES target property to a semicolon-separated list of architecture numbers (CMake uses semicolons as its list separator). The property is initialized from the CMAKE_CUDA_ARCHITECTURES variable when the target is created.

From the docs' Examples section:

> set_target_properties(tgt PROPERTIES CUDA_ARCHITECTURES "35;50;72")
>
> Generates code for real and virtual architectures 35, 50 and 72.
>
> set_property(TARGET tgt PROPERTY CUDA_ARCHITECTURES 70-real 72-virtual)
>
> Generates code for real architecture 70 and virtual architecture 72.
>
> set_property(TARGET tgt PROPERTY CUDA_ARCHITECTURES OFF)
>
> CMake will not pass any architecture flags to the compiler.
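
Applied to the question's case, a minimal CMakeLists.txt sketch might look like the following. The project name `atomic_cas_demo` and the target name `my_app` are hypothetical; only `my.cu` and the `sm_86` architecture come from the question.

```cmake
# Requires CMake 3.18 or newer for the CUDA_ARCHITECTURES property.
cmake_minimum_required(VERSION 3.18)

# Hypothetical project name; CUDA must be enabled as a language.
project(atomic_cas_demo LANGUAGES CXX CUDA)

# Hypothetical target built from the question's source file.
add_executable(my_app my.cu)

# "86" requests code for both the real (sm_86) and virtual (compute_86)
# architecture. To also cover the sm_61 device, one could write "61;86",
# but the 16-bit atomicCAS overload would then need to compile for 61 as well.
set_target_properties(my_app PROPERTIES CUDA_ARCHITECTURES "86")
```

To set the default for every target instead, the variable can be passed at configure time, for example `cmake -S . -B build -DCMAKE_CUDA_ARCHITECTURES=86`.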

The source code that sets the build command with the sm_ argument for the Makefile generator can be found in Source/cmMakefileTargetGenerator.cxx:

void cmMakefileTargetGenerator::WriteDeviceLinkRule(
  std::vector<std::string>& commands, const std::string& output)
{
  std::string architecturesStr =
    this->GeneratorTarget->GetSafeProperty("CUDA_ARCHITECTURES");
  ...
  std::vector<std::string> architectures = cmExpandedList(architecturesStr);
  ...
  for (const std::string& architectureKind : architectures) {
    ...
    const std::string architecture =
      architectureKind.substr(0, architectureKind.find('-'));
    ...
    std::string command = cmStrCat(
      this->Makefile->GetRequiredDefinition("CMAKE_CUDA_DEVICE_LINKER"),
      " -arch=sm_", architecture, registerFileCmd, " -o=$@ ",
      cmJoin(linkDeps, " "));
    localGen->WriteMakeRule(*this->BuildFileStream, nullptr, cubin, linkDeps,
                            { command }, false);
  }
  ...
}

Alternatively, you can pass the flag to nvcc yourself with target_compile_options or add_compile_options.
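
A rough sketch of that manual route, assuming a hypothetical target named `my_app`: CUDA_ARCHITECTURES is set to OFF so that CMake adds no architecture flags of its own, which might otherwise conflict with the hand-written `-arch` option. This is an illustration under those assumptions, not a recommended default, and it does not cover the device-link step used with separable compilation.

```cmake
cmake_minimum_required(VERSION 3.18)
project(manual_arch_demo LANGUAGES CXX CUDA)  # hypothetical project name

add_executable(my_app my.cu)                  # hypothetical target name

# Ask CMake not to pass any architecture flags for this target.
set_property(TARGET my_app PROPERTY CUDA_ARCHITECTURES OFF)

# Pass -arch=sm_86 only when compiling CUDA sources, so that C++ sources
# in the same target are not handed an nvcc-specific flag.
target_compile_options(my_app PRIVATE
  $<$<COMPILE_LANGUAGE:CUDA>:-arch=sm_86>)
```

The CUDA_ARCHITECTURES property is usually the simpler choice, since CMake then keeps compile and device-link flags consistent.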

Also slightly related: the CUDA_SELECT_NVCC_ARCH_FLAGS function in the FindCUDA module (the module has been deprecated since CMake 3.10 in favor of CMake's first-class CUDA language support).

Note also that since CMake v3.24, there is the native value for CUDA_ARCHITECTURES: "Compile for the architecture(s) of the host's GPU(s)."
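
For instance, a sketch assuming CMake 3.24 or newer, a GPU present on the build machine, and hypothetical project and target names:

```cmake
# The "native" value requires CMake 3.24 or newer.
cmake_minimum_required(VERSION 3.24)
project(native_arch_demo LANGUAGES CXX CUDA)  # hypothetical project name

add_executable(my_app my.cu)                  # hypothetical target name

# Compile for whatever GPU(s) are installed in the machine running the build.
set_property(TARGET my_app PROPERTY CUDA_ARCHITECTURES native)
```

Binaries built this way target the build machine's GPUs, so an explicit list such as "86" is likely the better fit when the build host and the deployment host differ.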
