

Appending to a slice gives bad performance. Why?





I'm currently creating a game in Go and measuring the FPS. I'm noticing a loss of about 7 FPS when I use a for loop to append to slices like so:

vertexInfo := Opengl.OpenGLVertexInfo{}

for i := 0; i < 4; i = i + 1 {
	vertexInfo.Translations = append(vertexInfo.Translations, float32(s.x), float32(s.y), 0)
	vertexInfo.Rotations = append(vertexInfo.Rotations, 0, 0, 1, s.rot)
	vertexInfo.Scales = append(vertexInfo.Scales, s.xS, s.yS, 0)
	vertexInfo.Colors = append(vertexInfo.Colors, s.r, s.g, s.b, s.a)
}


I'm doing this for every sprite, on every draw. The question is: why do I get such a huge performance hit from just looping four times and appending the same thing to these slices? Is there a more efficient way to do this? It is not as if I'm adding an exorbitant amount of data. Each slice contains about 16 elements, as shown above (4 x 4).

When I simply put all 16 elements into a single []float32{1..16} literal, the FPS improves by about 4.

Update: I benchmarked each append, and each one seems to cost about 1 FPS. That seems like a lot considering this data is pretty static. I only need 4 iterations.

Update: Added github repo https://github.com/Triangle345/GT


Score: 6






The builtin append() has to allocate a new backing array if the capacity of the destination slice is less than the length the slice would have after the append. It also has to copy the existing elements from the old array to the newly allocated one, so there is a lot of overhead.

The slices you append to are most likely empty, since you created your Opengl.OpenGLVertexInfo value with an empty struct literal, which leaves its slice fields nil. Even though append() plans ahead and allocates a bigger array than is needed for the elements being appended, chances are that in your case several reallocations are needed to complete the 4 iterations.
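As a rough illustration of the point above, the following sketch counts how often append has to move to a new backing array. The helper name countReallocs is hypothetical and not part of the question's code, and the exact growth pattern depends on the Go version, so the first number may vary.

```go
package main

import "fmt"

// countReallocs (hypothetical helper) appends perIter elements to a slice
// iters times, starting from the given capacity, and counts how many of
// those append calls changed the capacity, i.e. allocated a new backing array.
func countReallocs(initialCap, iters, perIter int) int {
	s := make([]float32, 0, initialCap)
	chunk := make([]float32, perIter) // the values appended on each iteration
	reallocs := 0
	for i := 0; i < iters; i++ {
		before := cap(s)
		s = append(s, chunk...)
		if cap(s) != before {
			reallocs++
		}
	}
	return reallocs
}

func main() {
	// Starting empty: at least one reallocation, usually more.
	fmt.Println("starting empty:", countReallocs(0, 4, 4))
	// Starting with room for all 16 elements: no reallocation.
	fmt.Println("preallocated:  ", countReallocs(16, 4, 4))
}
```

With enough capacity up front, the second call reports zero reallocations, because append only allocates when the existing capacity is too small.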

You may avoid reallocations if you create and initialize vertexInfo like this:

vertexInfo := Opengl.OpenGLVertexInfo{
	Translations: []float32{float32(s.x), float32(s.y), 0, float32(s.x), float32(s.y), 0, float32(s.x), float32(s.y), 0, float32(s.x), float32(s.y), 0},
	Rotations:    []float64{0, 0, 1, s.rot, 0, 0, 1, s.rot, 0, 0, 1, s.rot, 0, 0, 1, s.rot},
	Scales:       []float64{s.xS, s.yS, 0, s.xS, s.yS, 0, s.xS, s.yS, 0, s.xS, s.yS, 0},
	Colors:       []float64{s.r, s.g, s.b, s.a, s.r, s.g, s.b, s.a, s.r, s.g, s.b, s.a, s.r, s.g, s.b, s.a},
}

Note that with this struct literal each backing array is allocated once, at its final size, so no reallocation is needed. But if other parts of your code (which we don't see) append further elements to these slices, those appends may cause reallocations. If that is the case, create the slices with a larger capacity that covers the "future" appends (e.g. make([]float64, 16, 32)).
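If you would rather keep the loop, one possible variant is to reserve the capacity with make and then append as before. This is only a sketch: VertexInfo and Sprite are hypothetical stand-ins for the real types in the question's repository, the slice element types follow the literal above, and the sprite field types are assumptions.

```go
package main

import "fmt"

// VertexInfo is a hypothetical stand-in for Opengl.OpenGLVertexInfo.
type VertexInfo struct {
	Translations []float32
	Rotations    []float64
	Scales       []float64
	Colors       []float64
}

// Sprite is a hypothetical sprite; the field types are assumptions.
type Sprite struct {
	x, y       float64
	xS, yS     float64
	rot        float64
	r, g, b, a float64
}

const vertsPerSprite = 4

// buildVertexInfo reserves the full capacity of every slice up front,
// so none of the appends in the loop has to reallocate.
func buildVertexInfo(s Sprite) VertexInfo {
	v := VertexInfo{
		Translations: make([]float32, 0, 3*vertsPerSprite),
		Rotations:    make([]float64, 0, 4*vertsPerSprite),
		Scales:       make([]float64, 0, 3*vertsPerSprite),
		Colors:       make([]float64, 0, 4*vertsPerSprite),
	}
	for i := 0; i < vertsPerSprite; i++ {
		v.Translations = append(v.Translations, float32(s.x), float32(s.y), 0)
		v.Rotations = append(v.Rotations, 0, 0, 1, s.rot)
		v.Scales = append(v.Scales, s.xS, s.yS, 0)
		v.Colors = append(v.Colors, s.r, s.g, s.b, s.a)
	}
	return v
}

func main() {
	v := buildVertexInfo(Sprite{x: 1, y: 2, xS: 1, yS: 1, rot: 0.5, r: 1, g: 1, b: 1, a: 1})
	fmt.Println(len(v.Translations), cap(v.Translations)) // 12 12
	fmt.Println(len(v.Colors), cap(v.Colors))             // 16 16
}
```

Each slice still costs one allocation per call, but the loop itself never outgrows the reserved capacity.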


Score: 4




An empty slice is empty. To append, it must allocate memory. And then you do more appends, which have to allocate even more memory.

To speed it up, use a fixed-size array, use make to create a slice with the correct length, or initialize the slice with its items when you declare it.
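A minimal sketch of the first two options, using the color data as an example. The function names are hypothetical, and the float64 element type is an assumption.

```go
package main

import "fmt"

// quadColorsArray fills a fixed-size array with the same RGBA color for
// all 4 vertices. Arrays have a fixed size, so nothing ever has to grow.
func quadColorsArray(r, g, b, a float64) [16]float64 {
	var out [16]float64
	for i := 0; i < 4; i++ {
		out[4*i+0] = r
		out[4*i+1] = g
		out[4*i+2] = b
		out[4*i+3] = a
	}
	return out
}

// quadColorsSlice creates a slice that already has the correct length
// and assigns by index instead of appending.
func quadColorsSlice(r, g, b, a float64) []float64 {
	out := make([]float64, 16)
	for i := 0; i < 4; i++ {
		out[4*i+0] = r
		out[4*i+1] = g
		out[4*i+2] = b
		out[4*i+3] = a
	}
	return out
}

func main() {
	arr := quadColorsArray(1, 0.5, 0.25, 1)
	fmt.Println(arr[4:8]) // [1 0.5 0.25 1]

	sl := quadColorsSlice(1, 0.5, 0.25, 1)
	fmt.Println(len(sl), cap(sl)) // 16 16
}
```

The array version can be turned into a slice with arr[:] if the rendering code expects one.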

  • Posted on August 28, 2015, 13:54:49
  • When reposting, please keep the link to this article: https://go.coder-hub.com/32264208.html



