Regarding OpenMP Parallel SIMD Reductions

Question

I have a rather simple for loop summing a very large array of double values `x` (100 million data points) in C. I want to do this in parallel with SIMD reductions, using a specified number of threads. As I read it, the OpenMP directive should be:

```c
int nthreads = 4, l = 1e8;
double sum = 0.0;

#pragma omp parallel for simd num_threads(nthreads) reduction(+:sum)
for (int i = 0; i < l; ++i) sum += x[i];
```

This, however, gives a compiler warning:

loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]

and running it with multiple threads is slower than running it single-threaded. I'm using an Apple M1 Mac with the clang (Xclang) v13.0.0 compiler. What I would like to know is: is this an issue with my system, or is there actually something wrong or infeasible with this OpenMP directive?

Answer 1

Score: 0

This compiles without warning on clang >= 15, but performance depends on the system. On the Apple M1, multithreading seems to add little on top of SIMD vectorization, and single-threaded execution with a `#pragma omp simd reduction(+:sum)` directive is about as good as it gets.
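
For comparison, here is a minimal sketch of the two variants discussed above, written as standalone functions. The function names `sum_simd` and `sum_parallel_simd` are hypothetical and only serve the illustration; the pragmas are the ones from the question and from this answer. Which variant is faster is something to measure on your own machine, since it depends on the system.

```c
#include <stddef.h>

/* Single-threaded variant suggested in the answer: the simd directive with a
   reduction clause allows the compiler to reorder the floating-point
   additions, so the loop can be vectorized. (Hypothetical helper name.) */
double sum_simd(const double *x, size_t n) {
    double sum = 0.0;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; ++i) sum += x[i];
    return sum;
}

/* Multi-threaded variant from the question: iterations are split across
   nthreads threads, each thread accumulates a private partial sum, and the
   partial sums are combined at the end. (Hypothetical helper name.) */
double sum_parallel_simd(const double *x, size_t n, int nthreads) {
    double sum = 0.0;
    #pragma omp parallel for simd num_threads(nthreads) reduction(+:sum)
    for (size_t i = 0; i < n; ++i) sum += x[i];
    return sum;
}
```

Both functions return the same result up to floating-point rounding, because a reduction may add the elements in a different order. If the code is compiled without OpenMP enabled, the pragmas are ignored and both functions fall back to a plain serial loop.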

huangapple
  • Posted on May 25, 2023, 01:10:43
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76325938.html