Strange anomalous behavior using concurrency with image package
Question
The program I'm trying to get working is a generator for images of 1D cellular automata. It needs to be robust enough to handle extremely large simulations on the order of several million individual cells, so multi-threading the image generation process is necessary. I chose Go for this reason, because goroutines make dividing work across the CPU much easier and more efficient. Because writing each cell with an individual goroutine would not be very performant, I decided to create a function that takes the image object and is responsible for generating an entire row of cells instead. This function references a 2D array containing a bit-sliced array of all the cells to be drawn, hence the many loops; however, this is not important to the issue at hand. What the program is supposed to do is simply read all the individual bits and write a square to the image rectangle in the correct position, denoting the presence of a cell (the variable pSize is the side length of the square). Here is that function:
func renderRow(wg *sync.WaitGroup, img *image.RGBA, i int, pSize int) {
	defer wg.Done()
	var lpc = 0
	for j := 0; j < 64; j++ {
		for k := range sim[i] {
			for l := lpc * pSize; l <= (lpc*pSize)+pSize; l++ {
				for m := i * pSize; m <= (i*pSize)+pSize; m++ {
					if getBit(sim[i][k], j) == 1 {
						img.Set(l, m, black)
					} else {
						img.Set(l, m, white)
					}
				}
			}
			lpc++
		}
	}
}
I'm happy to say that this function performs just as expected when run sequentially on one thread. Here is the non-parallel function call (ignoring the WaitGroup):
img = image.NewRGBA(image.Rectangle{Min: upLeft, Max: lowRight})
for i := range sim {
	renderRow(&wg, img, i, pSize)
}
f, _ := os.Create("export/image.png")
_ = png.Encode(f, img)
On the other hand, when we make the simple change to a concurrent implementation, the output has several individual pixel errors and seems to shrink and extend certain rows randomly; the number of errors changes with each run. Here's the concurrent function call:
img = image.NewRGBA(image.Rectangle{Min: upLeft, Max: lowRight})
for i := range sim {
	go renderRow(&wg, img, i, pSize) // TODO make multithreaded again
}
wg.Wait()
f, _ := os.Create("export/image.png")
_ = png.Encode(f, img)
Now what does the output look like for these two implementations? Using the starting conditions {0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1} and an evolution space of 11 (pSize 2), we get this output from the single-threaded implementation:
If you zoom in on that image, you'll find all the squares evenly spaced vertically and horizontally, with no anomalies. Now let's take a look at the concurrent output.
This version has several anomalies: many rows have been shrunk, there are individual pixel errors in many places, and although it follows the general pattern of the simulation correctly, it is most certainly not visually pleasing. While investigating, I looked for issues related to concurrency. I thought that perhaps a dynamic allocation of the pixel array in the image package might be causing conflicts of some sort, so I looked at img.Set(), which looks like this:
func (p *NRGBA) Set(x, y int, c color.Color) {
	if !(Point{x, y}.In(p.Rect)) {
		return
	}
	i := p.PixOffset(x, y)
	c1 := color.NRGBAModel.Convert(c).(color.NRGBA)
	s := p.Pix[i : i+4 : i+4] // Small cap improves performance, see https://golang.org/issue/27857
	s[0] = c1.R
	s[1] = c1.G
	s[2] = c1.B
	s[3] = c1.A
}
However, when I look at this it seems to make no sense. It appears that img.Pix stores all the pixel data in a sequential 1D array of integers representing colors, but the .Set() function seems to return immediately if the (x, y) coordinates passed to it are already found in the .Pix slice. What's even stranger is what appears to be some sort of implicit assignment (which I've never seen in Go), where 4 elements of the .Pix slice representing an individual pixel's color are taken out and assigned to s. And the strangest part is that s, c1 and i are never referenced again, returned, or stored in memory; they are simply thrown to garbage collection. But somehow this function appears to work sequentially, so I decided to let it do its thing and look at how the .Pix slice differs between the concurrent and non-concurrent implementations.
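For reference, both points above can be checked in isolation. The guard at the top of Set returns early only when (x, y) lies outside p.Rect, and s is a sub-slice that shares its backing array with p.Pix, so assigning to s[0] through s[3] writes directly into the image buffer. The following is a minimal sketch of the second point; setPixel is a hypothetical helper written for illustration, not part of the image package.

```go
package main

import "fmt"

// setPixel is a hypothetical stand-in that mirrors the shape of Set:
// it takes a 4-byte window into pix and writes the channels through it.
func setPixel(pix []uint8, i int, r, g, b, a uint8) {
	s := pix[i : i+4 : i+4] // s aliases pix's backing array; nothing is copied
	s[0] = r
	s[1] = g
	s[2] = b
	s[3] = a
}

func main() {
	pix := make([]uint8, 8) // room for two RGBA pixels
	setPixel(pix, 4, 10, 20, 30, 255)
	fmt.Println(pix) // prints [0 0 0 0 10 20 30 255]
}
```

The local variable s goes out of scope when the function returns, but the bytes it pointed at belong to pix and keep their new values.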
Here are the links to four pastebins. They contain the img.Pix data for 2 separate trials, arranged with each row holding an individual pixel's color values, starting from the top left of each image and moving down. The reason for two trials is to verify consistency for the single-threaded approach, which appears to be consistent. But as you can observe with a website like diffchecker.com, both multi-threaded tests differ from each other and from the single-threaded output.
Here are some observations about this data:
- The multi-threaded tests differ from the single-threaded tests, and the number of differences varies between runs.
- The number of additions and deletions between the single-threaded and multi-threaded output is identical, implying that all the data is present and is simply in the wrong order.
These observations may imply that, as we call the Set function, threads are colliding with each other on certain indices in the Pix array. But from looking at the Set function, every pixel is supposed to have a distinct place in the array, which is preallocated based on the width and height of the provided rectangle. That should make ordering absolute and collisions between threads impossible. Here's the function that's responsible for creating the image object:
// NewRGBA returns a new RGBA image with the given bounds.
func NewRGBA(r Rectangle) *RGBA {
	return &RGBA{
		Pix:    make([]uint8, pixelBufferLength(4, r, "RGBA")),
		Stride: 4 * r.Dx(),
		Rect:   r,
	}
}
So all in all, I really have no idea what's going on. There seems to be some weird behavior arising from the image package as multiple goroutines access the same slice, but since the indices of the slice are theoretically absolute (meaning unique for each pixel), there shouldn't be any ordering issues. The only possible issue I could think of is that the slice, despite being defined the way it was, is somehow being resized by the Set function, or at least shifted around, causing collisions. Any help figuring out what's going wrong, or any theories about what might be causing the problem, would be greatly appreciated. Cheers!
Answer 1
Score: 3
The code above produces many data races arising from goroutines attempting to write to the same pixel coordinates in the .Pix slice. The fix was within the renderRow function, where the width and height ranges of adjacent cells overlapped on each iteration because the loops used <= instead of <. Moral of the story: use -race to look for collisions, and always look for overwrites or concurrent reads of the same variable. Credit to @rustyx.