2023-07-18 15:17:28

omp for loop for constexpr indexes

Question

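Suppose I have a function that depends on a non-type template parameter of type `std::size_t`, which can take the values `0, ..., N-1`, where `N` is known at compile time.
I can iterate over all values with a `std::index_sequence` or with template recursion. For example:

```cpp
#include <cstddef>
#include <utility>

template <std::size_t I>
void f() {
    // ...
}

template <std::size_t... I>
void loop_f_impl(std::index_sequence<I...>) {
    (f<I>(), ...);  // C++17 fold expression over the comma operator
}

template <std::size_t N>
void loop_f() {
    loop_f_impl(std::make_index_sequence<N>{});
}

int main() {
    constexpr std::size_t N = 4;
    loop_f<N>();
}
```

How can I convert this "unrolled loop" into a standard `for` loop that I can parallelize with OpenMP? Something like the following (which obviously does not compile):

```cpp
#pragma omp for
for (std::size_t i = 0; i < N; ++i)
    f<i>();  // error: i is not a constant expression
```

Clearly, if, say, N = 3, I could implement it like this:

```cpp
#pragma omp for
for (std::size_t i = 0; i < N; ++i)
    switch (i) {
        case 0:
            f<0>();
            break;
        case 1:
            f<1>();
            break;
        case 2:
            f<2>();
            break;
    }
```

However, I am interested in generic code that works for every N.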

Answer 1

Score: 4

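> omp for loop for constexpr indexes

You could change `f` to take `I` as a function argument instead, since `i` in your `for` loop is not a constant expression and cannot be used where one is required, such as a template argument.

```cpp
void f(std::size_t I) {
    // ...
}
```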
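To make this first suggestion concrete, here is a minimal sketch of the runtime-index version driven by an OpenMP loop. The body of `f`, the `out` vector, and the helper name `run_runtime_loop` are assumptions made for illustration; none of them appear in the original code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real work: the index is now a runtime value.
void f(std::size_t i, std::vector<int>& out) {
    out[i] = static_cast<int>(i * i);  // each iteration writes its own slot
}

// Hypothetical driver: runs f(0), ..., f(n-1), in parallel when OpenMP is enabled.
std::vector<int> run_runtime_loop(std::size_t n) {
    std::vector<int> out(n);
    // Unsigned loop variables need OpenMP 3.0 or later; without an OpenMP
    // flag such as -fopenmp the pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (std::size_t i = 0; i < n; ++i)
        f(i, out);
    return out;
}
```

This only applies when `f` does not need `I` as a constant expression internally, for example as an array bound or as a template argument of something it calls.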

Another option, without using OpenMP, is to launch all the `f<I>()` calls asynchronously:

```cpp
#include <cstddef>
#include <future>
#include <tuple>
#include <utility>

template <std::size_t... I>
void loop_f_impl(std::index_sequence<I...>) {
    // One task per index; each std::async call returns a std::future<void>.
    std::tuple all{ std::async(std::launch::async, f<I>)... };
} // here the destructors of the futures stored in the `tuple` wait until all tasks are done
```

An alternative is to use one of the standard (since C++17) execution policies with `std::for_each`, directly from `loop_f`. Example:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <execution>
#include <utility>

template <std::size_t N>
void loop_f() {
    // C++20 lambda template: builds an array of pointers to f<0>, ..., f<N-1>.
    constexpr auto funcs = []<std::size_t... Is>(std::index_sequence<Is...>) {
        return std::array{f<Is>...};
    }(std::make_index_sequence<N>{});
    // par_unseq also allows vectorization; use std::execution::par if f takes locks.
    std::for_each(std::execution::par_unseq, funcs.begin(), funcs.end(),
                  [](auto func) { func(); });
}
```

This will make use of Intel® oneAPI Threading Building Blocks (as libstdc++ does, for example) or whatever your implementation uses as a backend.
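If `f` has to remain a template, the same idea of an array of function pointers can also be combined with an ordinary OpenMP loop, which is closest to what the question asks for. The following is only a sketch under stated assumptions: the names `visited`, `make_table`, and `omp_loop_f` are hypothetical, and the body of `f` is a placeholder that records which index ran.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical bookkeeping so the effect of each f<I> is observable.
std::array<int, 16> visited{};

// Placeholder for the question's f<I>; assumes I < 16.
template <std::size_t I>
void f() {
    visited[I] = static_cast<int>(I) + 1;  // distinct element per index, no data race
}

// Builds the compile-time table {&f<0>, &f<1>, ..., &f<N-1>}.
template <std::size_t... Is>
constexpr std::array<void (*)(), sizeof...(Is)> make_table(std::index_sequence<Is...>) {
    return {{&f<Is>...}};
}

// Works for any N > 0 without writing the switch by hand.
template <std::size_t N>
void omp_loop_f() {
    static constexpr auto table = make_table(std::make_index_sequence<N>{});
    #pragma omp parallel for
    for (std::size_t i = 0; i < N; ++i)
        table[i]();  // the runtime index selects a compile-time instantiation
}
```

Calling `omp_loop_f<4>()` runs `f<0>` through `f<3>`, each exactly once. The trade-off is an indirect call per iteration, which the compiler usually cannot inline; that matters only when `f` is very cheap.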
