malloc in openmp for parallel loop

Question

I am a bit confused about the best way to use malloc()/free() in an OpenMP parallel for loop. Here are two approaches I thought of, but I don't know which one is better. I learned from previous answers that calling malloc/free inside a loop can fragment memory.

Suppose I have a loop that runs over a million times:

for (size_t i = 0; i < 1000000; ++i) {
    double * p = malloc(sizeof(double) * FIXED_SIZE);

    /* FIXED_SIZE is some size constant
    for the entire loop but is only determined dynamically */

    ....... /* Do some stuff using p array */

    free(p);
}

Now I want to parallelize the above loop with OpenMP.

Method 1: simply add a pragma on top of the for loop

#pragma omp parallel for
for (size_t i = 0; i < 1000000; ++i) {

    #pragma omp atomic
    double * p = malloc(sizeof(double) * FIXED_SIZE);

    ....... /* Do some stuff using p array */

    #pragma omp atomic
    free(p);
}

Method 2: allocate one common array outside the loop, with a separate slice for each thread

int num_threads = omp_get_num_threads();
double * p = malloc(sizeof(double) * FIXED_SIZE * num_threads);

#pragma omp parallel for
for (size_t i = 0; i < 1000000; ++i) {

    int thread_num = omp_get_thread_num();

    double * p1 = p + FIXED_SIZE * thread_num;

    ....... /* Do some stuff using p1 array */
}
free(p);
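
For reference, Method 2 could be written out as a complete function along the following lines. This is only a sketch under stated assumptions: `run_with_shared_pool`, the `out` array, and the loop body are hypothetical placeholders for the elided work. It also swaps in `omp_get_max_threads()` for sizing the pool, because `omp_get_num_threads()` returns 1 when called outside a parallel region, which would size the pool for a single thread. The `_OPENMP` guards only let the same file build without OpenMP.

```c
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Upper bound on the team size of the next parallel region.
   Falls back to 1 when compiled without OpenMP. */
static int max_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Index of the calling thread within its team (0 without OpenMP). */
static int thread_index(void)
{
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

/* Hypothetical driver: each iteration uses its thread's slice of one shared
   pool and stores a result in out[i].
   Returns 0 on success, -1 if the pool could not be allocated. */
int run_with_shared_pool(size_t fixed_size, size_t n_iter, double *out)
{
    size_t slots = (size_t)max_threads();
    double *pool = malloc(sizeof(double) * fixed_size * slots);
    if (pool == NULL)
        return -1;

    #pragma omp parallel for
    for (size_t i = 0; i < n_iter; ++i) {
        /* Each thread touches only its own slice of the pool. */
        double *p1 = pool + fixed_size * (size_t)thread_index();

        double sum = 0.0;
        for (size_t k = 0; k < fixed_size; ++k) {
            p1[k] = (double)i + (double)k;   /* placeholder work */
            sum += p1[k];
        }
        out[i] = sum;
    }

    free(pool);
    return 0;
}
```

Because neighbouring slices sit back to back in memory, threads may share a cache line at slice boundaries; whether that matters depends on `FIXED_SIZE` and the access pattern.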



# Answer 1
**Score**: 2

First create a parallel region and allocate the buffer once per thread, then split the loop iterations among those threads with a worksharing `omp for`.

```c
#pragma omp parallel
{
  double * p = malloc(sizeof(double) * FIXED_SIZE);

  #pragma omp for
  for (size_t i = 0; i < 1000000; ++i) { ... }

  free(p);
}
```
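
The per-thread allocation pattern above can be filled out into a complete function, roughly as follows. This is a minimal sketch, not the answerer's code: `process_item`, `run_with_thread_buffers`, and the reduction into `total` are hypothetical stand-ins for the elided loop body. The sketch puts no `#pragma omp atomic` around `malloc`/`free`. That directive applies only to simple scalar updates such as `x += expr`, and C11-conforming allocators are already safe to call from several threads.

```c
#include <stdlib.h>

/* Hypothetical workload: fill the scratch buffer and return its sum.
   Stands in for the elided "do some stuff using p array". */
static double process_item(double *scratch, size_t n, size_t i)
{
    double sum = 0.0;
    for (size_t k = 0; k < n; ++k) {
        scratch[k] = (double)(i % 7) + (double)k;
        sum += scratch[k];
    }
    return sum;
}

/* Runs n_iter iterations with one scratch buffer per thread.
   Returns 0 on success, -1 if any thread failed to allocate
   (in that case *total_out is incomplete). */
int run_with_thread_buffers(size_t fixed_size, size_t n_iter, double *total_out)
{
    double total = 0.0;
    int failed = 0;

    #pragma omp parallel reduction(+:total) reduction(|:failed)
    {
        /* One allocation per thread instead of one per iteration. */
        double *p = malloc(sizeof(double) * fixed_size);
        if (p == NULL)
            failed = 1;

        /* Every thread in the team must reach the worksharing loop, so a
           failed allocation skips the work rather than leaving the region. */
        #pragma omp for
        for (size_t i = 0; i < n_iter; ++i) {
            if (p != NULL)
                total += process_item(p, fixed_size, i);
        }

        free(p); /* free(NULL) is a no-op */
    }

    *total_out = total;
    return failed ? -1 : 0;
}
```

With this layout the number of `malloc`/`free` calls equals the number of threads rather than the number of iterations, which addresses the fragmentation concern raised in the question. Compiled without OpenMP support, the pragmas are ignored and the function runs serially with a single buffer.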

huangapple
  • Published on May 21, 2023, 00:05:50
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76296136.html