在2D数组上并行执行 – 锁定数组的单个字段/单元格。

huangapple go评论76阅读模式
英文:

Parallelize execution on 2D Array - lock single field / cells of array

问题

Iam working on the parallelization of the calculation of a 2D-Array. To be more precise, iam trying to parallize the simulation of a predator-prey szenario. Therefore, i am using a 2D-Array, in which each field of the array displays a pixel. For the parallelization, iam deviding the field to the given number of threads in subfields - one for each thread. The threads should perform the calculation for each field in each sub-array simultaniously.

More information about my algorithm:

  1. Step: it chooses a random cell inside its subfield (xi, yi)
  2. Step: it chooses a random neighbor field --> left, right, above, below of (xi,yi), called xin, yin
  3. Step: it chooses randomly, if it performe a move, selection or reproduction action

Here is an example of the move action:

   String temp = this.simulationBoard[xi][yi];
   this.simulationBoard[xi][yi] = this.simulationBoard[xin][yin];
   this.simulationBoard[xin][yin] = temp;
}

Now regarding the parallelization:

  • Each thread performs the algorithm on its subfield (this is ensured by only randomly choosing cells in the subfield (by specifying borders))
  • The only difference is, that in Step 2 of the algorithm it is checked, wether (xi,yi) or (xin,yin) are located in a critical region
  • critical region means, that 2 Threads could both perform an action on the field of the array ((xi,yi) or (xin,yin))
  • critical regions are simplified the boarders of each sub-field

So my goal is to ensure, that if a threads performs an action on the element of the 2d array that is located in a critical region, only one thread can perform the action at a time. During this, the other threads should nether read nor write the element.

My current approach is:

All threads are performing methods on the instance of the SimulationBoard class see below:


   private String[][] simulationBoard;

   ...

   public void move(int xi, int yi, int xin, int yin, boolean lockXiYi, boolean lockXinYin) {
        if(lockXiYi && lockXinYin){
            this.moveXiYiXinYinLocked(xi, yi, xin, yin);
        }
        else if(lockXiYi){
            this.moveXiYiLocked(xi, yi, xin, yin);
        }
        else if(lockXinYin){
            this.moveXinYinLocked(xi, yi, xin, yin);
        }
        else {
            this.moveUnlocked(xi, yi, xin, yin);
        }
   }
    ...

    // here, only one thread at a time should acces and perform actions on the element simulationBoard[xi][yi]
    public void moveXiYiLocked(int xi, int yi, int xin, int yin){
        synchronized(simulationBoard[xi][yi]){
            String temp = this.simulationBoard[xi][yi];
            this.simulationBoard[xi][yi] = this.simulationBoard[xin][yin];
            this.simulationBoard[xin][yin] = temp;
        }
    }
} ```

So this is my current status..

now when i am performing my simulation on a 600 x 400 field for example with 4 threads, the whole simulation is not getting any faster. for other configuration the result is more or less the same..

Is my logic wrong or something like that? Is synchronized(array[xi][yi]) blocking the whole array maybe???

Thanks for your help!!

<details>
<summary>英文:</summary>

Iam working on the parallelization of the calculation of a 2D-Array. To be more precise, iam trying to parallize the simulation of a predator-prey szenario. Therefore, i am using a 2D-Array, in which each field of the array displays a pixel. For the parallelization, iam deviding the field to the given number of threads in subfields - one for each thread. The threads should perform the calculation for each field in each sub-array simultaniously.

More information about my algorithm: 

1. Step: it chooses a random cell inside its subfield (xi, yi)
2. Step: it chooses a random neighbor field --&gt; left, right, above, below of (xi,yi), called xin, yin
3. Step: it chooses randomly, if it performe a move, selection or reproduction action

Here is an example of the move action: 

public void move(int xi, int yi, int xin, int yin){
   String temp = this.simulationBoard[xi][yi];
   this.simulationBoard[xi][yi] = this.simulationBoard[xin][yin];
   this.simulationBoard[xin][yin] = temp;
}

Now regarding the parallelization: 

- Each thread performs the algorithm on its subfield (this is ensured by only randomly choosing cells in the subfield (by specifying borders))
- The only difference is, that in Step 2 of the algorithm it is checked, wether (xi,yi) or (xin,yin) are located in a critical region
- critical region means, that 2 Threads could both perform an action on the field of the array ((xi,yi) or (xin,yin))
- critical regions are simplified the boarders of each sub-field

So my goal is to ensure, that if a threads performs an action on the element of the 2d array that is located in a critical region, 
only one thread can perform the action at a time. During this, the other threads should nether read nor write the element. 

My current approach is: 


All threads are performing methods on the instance of the SimulationBoard class see below:

public class SimulationBoard{

   private String[][] simulationBoard;

   ...

   public void move(int xi, int yi, int xin, int yin, boolean lockXiYi, boolean lockXinYin) {
        if(lockXiYi &amp;&amp; lockXinYin){
            this.moveXiYiXinYinLocked(xi, yi, xin, yin);
        }
        else if(lockXiYi){
            this.moveXiYiLocked(xi, yi, xin, yin);
        }
        else if(lockXinYin){
            this.moveXinYinLocked(xi, yi, xin, yin);
        }
        else {
            this.moveUnlocked(xi, yi, xin, yin);
        }
   }
    ...

    // here, only one thread at a time should acces and perform actions on the element simulationBoard[xi][yi]
    public void moveXiYiLocked(int xi, int yi, int xin, int yin){
        synchronized(simulationBoard[xi][yi]){
            String temp = this.simulationBoard[xi][yi];
            this.simulationBoard[xi][yi] = this.simulationBoard[xin][yin];
            this.simulationBoard[xin][yin] = temp;
        }
    }
}

So this is my current status..
now when i am performing my simulation on a 600 x 400 field for example with 4 threads, the whole simulation is not getting any faster. for other configuration the result is more or less the same..
Is my logic wrong or something like that? 
Is synchronized(array[xi][yi]) blocking the whole array maybe???
Thanks for your help!!
</details>
# 答案1
**得分**: 1
First of all, developing multi-threaded applications is a complex topic. The complexity may exceed everything you’ve learned so far, so it’s weird to get an assignment including parallel processing from your university without any guidance.
首先,开发多线程应用程序是一个复杂的主题。这种复杂性可能超出你迄今为止学到的一切,因此在没有任何指导的情况下从你的大学得到一个包括并行处理的任务是奇怪的。
Parallel read access does not impose any problems and hence, does not require locking. On the other hand, problems can occur with concurrent access to the same data while at least one thread modifies it. This is what requires appropriate measures.
并行读取不会导致任何问题,因此不需要锁定。另一方面,当至少一个线程修改数据时,可能会出现并发访问相同数据的问题。这就需要采取适当的措施。
The next fundamental thing you need to understand is: *`synchronized` does not lock anything*.
你需要理解的下一个基本概念是:*`synchronized` 不会锁定任何东西*。
When two or more threads try to execute a synchronized block or method, synchronizing *on the same object*, they have to execute the block or method one after the other. The thread entering the synchronized block or method after the other is guaranteed to see the modifications made by the other thread before leaving the synchronized block or method, consistently. But there is no restriction on which data the thread(s) may modify and nothing prevents other threads from modifying the same data without `synchronized`.
当两个或更多线程尝试执行一个同步块或方法时,*在相同对象上同步*,它们必须依次执行该块或方法。进入同步块或方法的线程在离开同步块或方法之前可以保证看到其他线程所做的修改,但线程(们)可以修改的数据没有限制,没有任何东西阻止其他线程在没有 `synchronized` 的情况下修改相同的数据。
It’s up to you to arrange the code in a way that all access to the same mutable data is done within synchronized blocks or methods, synchronizing on the same object. A natural valid protocol would be to use the same object whose fields you are accessing for the synchronization, but nothing enforces this. It’s also valid to use a different object for synchronization than the one you’re modifying, you might even modify more than one object, *as long as all threads adhere to the same protocol*.
如何安排代码是由你决定的,以确保所有对相同可变数据的访问都在同步块或方法中完成,同步在相同的对象上。一个自然有效的协议是使用与你正在访问的字段相同的对象进行同步,但没有强制规定这一点。使用不同于正在修改的对象进行同步也是有效的,甚至可以修改多个对象,*只要所有线程都遵守相同的协议*。
So when you use `synchronized(array[xi][yi]){ … }` you are *not* locking the array element. You are reading the array element, without synchronization as the synchronized block has not entered yet, and synchronizing on the object whose reference you read. Since you said, the objects are (immutable) `String`s, it’s clear that you are not modifying the object but assigning a new string, as that’s the only way to update a string array.
因此,当你使用 `synchronized(array[xi][yi]){ … }` 时,*你没有* 锁定数组元素。你正在读取数组元素,没有同步,因为同步块还没有进入,并且正在同步引用读取的对象。由于你说对象是(不可变的)`String`,很明显你没有修改对象,而是分配了一个新的字符串,因为这是更新字符串数组的唯一方法。
This is completely broken. Two threads performing `synchronized(array[xi][yi]){ … }` for the same `xi` and `yi` may see different objects at the array location and synchronizing on different objects is as good as having no synchronization at all. For completeness, it’s also possible to get over-synchronization when the same string is stored at different array locations, but performance problems are irrelevant as long as the program isn’t correct.
这是完全错误的。两个线程同时执行 `synchronized(array[xi][yi]){ … }` 对于相同的 `xi` 和 `yi` 可能会在数组位置看到不同的对象,并且在不同的对象上同步与根本没有同步一样糟糕。为了完整起见,当相同的字符串存储在不同的数组位置时,也可能发生过度同步,但只要程序不正确,性能问题就无关紧要。
It might be overkill to create a distinct object for each (600 x 400) array location to lock on those. But how a useful solution might look like depends on how much overlapping your sub-fields actually have and how many writes are necessary in your algorithm, compared to the number of read accesses. That’s a new question, requiring more information about your actual algorithm.
为每个(600 x 400)数组位置创建一个独立的对象以进行锁定可能有点过度。但一个有用的解决方案可能会如何取决于你的子字段实际上有多少重叠,以及在你的算法中需要多少写操作,与读取操作的数量相比。这是一个新问题,需要更多关于你实际算法的信息。
<details>
<summary>英文:</summary>
First of all, developing multi-threaded applications is a complex topic. The complexity may exceed everything you’ve learned so far, so it’s weird to get an assignment including parallel processing from your university without any guidance.
Parallel read access does not impose any problems and hence, does not require locking. On the other hand, problems can occur with concurrent access to the same data while at least one thread modifies it. This is what requires appropriate measures.
The next fundamental thing you need to understand is: *`synchronized` does not lock anything*.
When two or more threads try to execute a synchronized block or method, synchronizing *on the same object*, they have to execute the block or method one after the other. The thread entering the synchronized block or method after the other is guaranteed to see the modifications made by the other thread before leaving the synchronized block or method, consistently. But there is no restriction on which data the thread(s) may modify and nothing prevents other threads from modifying the same data without `synchronized`.
It’s up to you to arrange the code in a way that all access to the same mutable data is done within synchronized blocks or methods, synchronizing on the same object. A natural valid protocol would be to use the same object whose fields your are accessing for the synchronization, but nothing enforces this. It’s also valid to use a different object for synchronization than the one you’re modifying, you might even modify more than one object, *as long as all threads adhere to the same protocol*.
So when you use `synchronized(array[xi][yi]){ … }` you are *not* locking the array element. You are reading the array element, without synchronization as the synchronized block has not entered yet, and synchronizing on the object whose reference you read. Since you said, the objects are (immutable) `String`s, it’s clear that you are not modifying the object but assigning a new string, as that’s the only way to update a string array.
This is completely broken. Two threads performing `synchronized(array[xi][yi]){ … }` for the same `xi` and `yi` may see different objects at the array location and synchronizing on different objects is as good as having no synchronization at all. For completeness, it’s also possible to get over-synchronization when the same string is stored at different array locations, but performance problems are irrelevant as long as the program isn’t correct.
It might be overkill to create a distinct object for each (600 x 400) array location to lock on those. But how a useful solution might look like depends on how much overlapping your sub-fields actually have and how many writes are necessary in your algorithm, compared to the number of read accesses. That’s a new question, requiring more information about your actual algorithm.
</details>

huangapple
  • 本文由 发表于 2023年5月8日 01:20:43
  • 转载请务必保留本文链接:https://go.coder-hub.com/76195313.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定