2015-08-07 18:44:08 go

OpenGL 3: 20k sprites, slow framerate?

Question

I have successfully built an OpenGL 3.x animation in Go. However, the frame-by-frame update becomes noticeably slow once about 20k textured sprites are rendered. All the sprites do is move from the left side of the screen to the right. Keep in mind that they are all drawn on top of each other, because I was too lazy to randomize their locations.

I have an up-to-date PC that can run GTA5 on high settings, yet it cannot display 20k sprites (textured quads) smoothly in an OpenGL 3 environment.

I must be doing something wrong here. Maybe I need to pack all vertices into one VBO instead of creating a new VBO for each object? I also bind the buffers for every object. I'm not really sure what is causing this bottleneck. Can someone help? I'm not sure where to go from here.

I have attached my code as a reference for anyone who can give some tips on speeding up the rendering of 20k sprites in OpenGL 3:
http://pastebin.com/SHQtRPn7

Answer 1

Score: 2

Without having looked at the source code: use a single VBO, combine the geometry of all sprites that share a texture, and draw them with one draw call.
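
To make the batching idea concrete, here is a minimal sketch of the CPU side only. It assumes every sprite uses the same texture and that each vertex is laid out as x, y, u, v. The names Quad, BuildBatch and VertexCount are made up for illustration and do not come from the question's code. The resulting slice would be uploaded to one VBO and drawn with a single triangle draw call.

```go
package main

// Quad is a hypothetical sprite: an axis-aligned rectangle that shows
// the whole texture. It is not the type used in the question's code.
type Quad struct {
	X, Y, W, H float32
}

const (
	floatsPerVertex = 4 // x, y, u, v
	verticesPerQuad = 6 // two triangles per sprite
)

// BuildBatch packs the geometry of all quads into one vertex slice so
// that it can be uploaded to a single VBO.
func BuildBatch(quads []Quad) []float32 {
	verts := make([]float32, 0, len(quads)*verticesPerQuad*floatsPerVertex)
	for _, q := range quads {
		x0, y0 := q.X, q.Y
		x1, y1 := q.X+q.W, q.Y+q.H
		verts = append(verts,
			// first triangle
			x0, y0, 0, 0,
			x1, y0, 1, 0,
			x1, y1, 1, 1,
			// second triangle
			x0, y0, 0, 0,
			x1, y1, 1, 1,
			x0, y1, 0, 1,
		)
	}
	return verts
}

// VertexCount returns the number of vertices in the batch, which is the
// count a single draw call over the whole buffer would use.
func VertexCount(verts []float32) int {
	return len(verts) / floatsPerVertex
}
```

With 20,000 sprites this gives 120,000 vertices in one buffer, so the per-frame cost is one upload and one draw call rather than 20,000 of each.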

Answer 2

Score: 2

OpenGL calls are expensive. You are making lots of them, tens of thousands of times, in every single frame. Instead, do one big draw call if possible. If that is not possible, do one draw call per texture + program combination.

Instead of passing a matrix as a uniform, you can pass a matrix for each object you draw as a vertex attribute.

This code is not as good as it could be, but it performs orders of magnitude better than yours.

func (drawer *SpriteDrawer) Draw(sprites []Sprite) {
    if len(sprites) == 0 {
        return
    }
    // One program, one texture array and one camera uniform for the whole batch.
    drawer.Use()
    drawer.Texture.Bind(gl.TEXTURE_2D_ARRAY)
    tmp := drawer.GetTransform().To32()
    drawer.camera_uniform.UniformMatrix2x3f(false, &tmp)
    // Upload all sprites into a single buffer; each sprite is one vertex.
    vertexbuffer := gl.GenBuffer()
    defer vertexbuffer.Delete()
    vertexbuffer.Bind(gl.ARRAY_BUFFER)
    stride := int(unsafe.Sizeof(sprites[0]))
    gl.BufferData(gl.ARRAY_BUFFER, stride*len(sprites), sprites, gl.STREAM_DRAW)
    var transform1, transform2, texcoords, texlevel gl.AttribLocation
    transform1 = 0
    transform2 = 1
    texcoords = 2
    texlevel = 3
    // The per-sprite transform is passed as two 3-float attributes.
    transform1.AttribPointer(3, gl.FLOAT, false, stride, unsafe.Offsetof(sprites[0].Transform))
    transform2.AttribPointer(3, gl.FLOAT, false, stride, unsafe.Offsetof(sprites[0].Transform)+unsafe.Sizeof(sprites[0].Transform[0])*3)
    // Four texture coordinates starting at TextureLeft, then the texture layer.
    texcoords.AttribPointer(4, gl.FLOAT, false, stride, unsafe.Offsetof(sprites[0].TextureLeft))
    texlevel.AttribPointer(1, gl.FLOAT, false, stride, unsafe.Offsetof(sprites[0].Layer))
    transform1.EnableArray()
    transform2.EnableArray()
    texcoords.EnableArray()
    texlevel.EnableArray()
    // A single draw call for every sprite in the slice.
    gl.DrawArrays(gl.POINTS, 0, len(sprites))
    transform1.DisableArray()
    transform2.DisableArray()
    texcoords.DisableArray()
    texlevel.DisableArray()
}

The code above is taken from an existing library.
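
The Draw function relies on a Sprite type that the answer does not show. The following is only a guess at what such a record could look like, inferred from the attribute offsets used in Draw. The field names TextureTop, TextureRight and TextureBottom and the helper SpriteLayout are assumptions; the real definition may differ.

```go
package main

import "unsafe"

// Sprite is a guessed per-sprite vertex record. Only Transform,
// TextureLeft and Layer are named in the answer; the other fields are
// hypothetical and chosen so that four floats follow TextureLeft.
type Sprite struct {
	Transform     [6]float32 // 2x3 transform, read as two 3-float attributes
	TextureLeft   float32
	TextureTop    float32 // hypothetical
	TextureRight  float32 // hypothetical
	TextureBottom float32 // hypothetical
	Layer         float32 // layer in the texture array
}

// Layout collects the byte offsets that the attribute pointers need.
type Layout struct {
	Stride, Transform1, Transform2, TexCoords, Layer uintptr
}

// SpriteLayout computes stride and offsets the same way Draw does,
// using unsafe.Sizeof and unsafe.Offsetof on the struct fields.
func SpriteLayout() Layout {
	var s Sprite
	return Layout{
		Stride:     unsafe.Sizeof(s),
		Transform1: unsafe.Offsetof(s.Transform),
		Transform2: unsafe.Offsetof(s.Transform) + unsafe.Sizeof(s.Transform[0])*3,
		TexCoords:  unsafe.Offsetof(s.TextureLeft),
		Layer:      unsafe.Offsetof(s.Layer),
	}
}
```

Under this assumed layout a sprite occupies 44 bytes (eleven float32 values), with the attributes starting at byte offsets 0, 12, 24 and 40. Because the struct holds only float32 fields, Go adds no padding, so a slice of sprites can be uploaded as one tightly packed buffer.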

Answer 3

Score: 1

One problem I see is that you are creating a VBO every frame. I'm not sure what you are trying to do, but if you just want to update your VBO, use glBufferSubData() instead. glBufferData() reallocates the buffer's data store every time you call it, so it is typically more expensive than glBufferSubData(), which only modifies the contents of the existing store. This should give your FPS a boost.
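
The allocate-once, update-in-place pattern can be sketched roughly as follows. To keep the example runnable without a GL context, the two OpenGL entry points are hidden behind a made-up interface, BufferAPI, and a fake implementation that only counts calls. In a real program the two methods would wrap glBufferData and glBufferSubData on a buffer that stays bound. All names here are assumptions for illustration.

```go
package main

// BufferAPI is a hypothetical stand-in for the two OpenGL calls
// discussed above, so the sketch does not need a GL context.
type BufferAPI interface {
	BufferData(sizeBytes int, data []float32)      // allocates storage and fills it
	BufferSubData(offsetBytes int, data []float32) // overwrites existing storage
}

// Streamer keeps one buffer alive across frames and remembers how much
// storage has been allocated for it.
type Streamer struct {
	api      BufferAPI
	capacity int // allocated size in bytes
}

// NewStreamer returns a Streamer with no storage allocated yet.
func NewStreamer(api BufferAPI) *Streamer {
	return &Streamer{api: api}
}

// Upload writes verts into the buffer. It allocates only when the data
// no longer fits and otherwise updates the existing storage in place.
func (s *Streamer) Upload(verts []float32) {
	size := len(verts) * 4 // a float32 is 4 bytes
	if size > s.capacity {
		s.api.BufferData(size, verts)
		s.capacity = size
		return
	}
	s.api.BufferSubData(0, verts)
}

// CountingAPI is a fake BufferAPI that records how often each call is made.
type CountingAPI struct {
	Allocations, Updates int
}

func (c *CountingAPI) BufferData(sizeBytes int, data []float32)      { c.Allocations++ }
func (c *CountingAPI) BufferSubData(offsetBytes int, data []float32) { c.Updates++ }
```

If Upload is called once per frame with vertex data of constant size, the fake records one allocation on the first frame and an in-place update on every later frame, which is the behaviour the answer recommends.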
