tbb parallel_for: Object with intrusive list node can be part of only one intrusive list simultaneously


Question

I'm currently porting a program that was originally written on Windows 10 to a Red Hat system (VERSION_ID="8.8") with g++ 8.5.0 20210514 (Red Hat 8.5.0-18). I installed TBB with vcpkg. I've reduced my problem to the following minimal working example. (Note: the original project has over a hundred files. I've kept some settings from its CMakeLists.txt that may be related, but I'm not sure they are.)

CMakeLists.txt:

cmake_minimum_required(VERSION 3.9)

if (CMAKE_BUILD_TYPE STREQUAL "Debug")
    set(CMAKE_CXX_FLAGS_DEBUG "-g -O0")
    set(CMAKE_BUILD_TYPE Debug)
else()
    set(CMAKE_CXX_FLAGS "-Wall -mtune=native -march=native -g")
    set(CMAKE_CXX_FLAGS_RELEASE "-O3 -fno-math-errno -fno-signed-zeros -fno-trapping-math -freciprocal-math -fno-rounding-math -fno-signaling-nans -fexcess-precision=fast")
endif()

project(Example LANGUAGES CXX VERSION 0.9)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)
set(CMAKE_CXX_VISIBILITY_PRESET hidden)
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR})

add_executable(example
    example.cpp
)

find_package(TBB CONFIG REQUIRED)

target_link_libraries(example
    TBB::tbb
    TBB::tbbmalloc
)

vcpkg.json:

{
    "name": "vt",
    "version-string": "",
    "dependencies": [
        "tbb"
    ],
    "builtin-baseline": "36fb23307e10cc6ffcec566c46c4bb3f567c82c6"
}

example.cpp:

#include <cstddef>  // size_t
#include <cstdint>  // uint32_t
#include <vector>

#include "tbb/blocked_range.h"
#include "tbb/parallel_for.h"

int main()
{
    std::vector<uint32_t> vec(100);
    for (size_t i = 0; i < vec.size(); ++i){
        vec[i] = i;
    }

    // Increment every element in parallel; each task receives a sub-range.
    tbb::parallel_for(tbb::blocked_range<size_t>(0, vec.size()),
        [&](const tbb::blocked_range<size_t>& r)
        {
            for (size_t i = r.begin(); i < r.end(); ++i)
            {
                vec[i] = vec[i]+1;
            }
        });

    return 0;
}

This compiles and runs fine on my Windows machine (using Visual Studio). It also compiles on the Red Hat machine, but it only runs correctly in Debug mode. When compiled in Release mode, I get the following error:

Assertion node(val).my_prev_node == &node(val) && node(val).my_next_node == &node(val) failed (located in the push_front function, line in file: 135)
Detailed description: Object with intrusive list node can be part of only one intrusive list simultaneously
Aborted (core dumped)

To build in release mode I run the following:

mkdir build
cd build
cmake -DCMAKE_TOOLCHAIN_FILE=~/vcpkg/scripts/buildsystems/vcpkg.cmake ..
cmake --build .

(For Debug I simply add -DCMAKE_BUILD_TYPE=Debug to the cmake configuration call)


Some things I've tried (that did not work):

  • remove the builtin-baseline from the vcpkg.json
  • remove all of the options set for CMAKE_CXX_FLAGS_RELEASE
  • remove set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)
  • remove set(CMAKE_CXX_VISIBILITY_PRESET hidden)

I also tried this exact minimal example on my personal Ubuntu 22.04.1 LTS machine (with g++ version 11.3.0) and got the same results: it compiles, but it only works when built as Debug, not Release, and fails with the same error as above.

Answer 1

Score: 1

A colleague of mine discovered that adding the following line to the CMakeLists.txt works as a workaround for now, and he believes he has identified the underlying problem:

add_compile_definitions(TBB_USE_ASSERT)

Here is a brief summary of what he said. I've added it to an issue I submitted on the vcpkg repository. Hopefully it helps resolve the issue, but if not, hopefully this workaround helps anyone else who stumbles across the same problem. (NOTE: It likely has a performance penalty compared with a true fix on the vcpkg side.)

EDIT: Description from my colleague:

> It's fundamentally a build system issue. I had to spend some time single-stepping through the allocator. Basically, it was asserting that uninitialized memory happened to contain its own address. I had to set up a working and a broken copy side by side to track where that value got set in the working copy.
>
> The code that sets it is in the class constructor for the intrusive_list_node helper class and was conditional on a preprocessor macro. That preprocessor macro checks if runtime debugging assertions are enabled. If assertions are enabled, it does some extra initialization. If not, it skips that because it would be replaced as soon as the node is used. The only time tbb would ever traverse a newly created list is when sanity checking it on the first insertion.
>
> The actual issue is that vcpkg is building the tbb binary with assertions enabled. When you do a debug build, you also build things with tbb assertions enabled. That tiny one line of initialization code actually ends up getting compiled when building your application, not tbb. This is because it has to do with initializing data you set aside in your program.
>
> So when you built it in release mode, debug assertions were disabled in your code and it skipped the extra initialization. But you then get combined with a version of tbb that was built with runtime assertions enabled. It tries to assert that the internal data structures are intact and that fails because it was left in an uninitialized state that would actually have never been used by anything other than the sanity check.
>
> TLDR: The issue is that vcpkg is combining your release-mode program with a version of tbb that still has debugging assertions enabled. That CMake directive says to compile your code assuming debugging assertions in tbb are enabled.

huangapple
  • Posted on May 25, 2023, 04:53:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76327326.html