How should I handle Vulkan objects for issuing commands
Question
Okay so, we have queues, we submit command buffers to queues, we record render passes to command buffers, we add subpasses to render passes. Why so complicated? I understand that we can use subpasses to efficiently synchronize our commands, and there are whole queues that we can send to the GPU all at once. But why command buffers and render passes? That's not even all: there are command pools to create your command buffers. Why would I need multiple queues, each with multiple command buffers, each with multiple render passes, each with multiple subpasses?
Command pools are useful for multiple threads. Okay. So I can create my command buffer in my thread; why would I need two? Then why would I need two render passes if I can do everything in subpasses? Am I missing something?
To be clear, I am making a wrapper or an abstraction around Vulkan for my engine and am trying to grasp these concepts. Maybe this is the real problem: I'm trying to abstract everything without having enough knowledge and without doing pure Vulkan projects first, but in my mind it's easier to learn this way.
I tried looking for some explanations online but couldn't find any that satisfied my need.
Answer 1
Score: 5
> Why is issuing commands to the GPU so complicated in Vulkan
Issuing commands isn't complicated because of Vulkan; the task is just plain complicated. The GPU is quite possibly the most complicated component in your computer, possibly only slightly behind the x86 instruction decoder. Vulkan, Metal, and D3D12 weren't an evolution of GPU hardware; they merely exposed the lower-level functionality that had existed for a while.
> But why command buffers
Command buffers have been a thing for a very long time, probably going back to the very beginning of GPUs; they were just hidden from you. Recording commands is incredibly slow, primarily due to state validation (this is also why pipeline states exist and are precompiled). To alleviate this, the modern low-overhead APIs expose the ability to record command buffers in separate threads. This reduces the CPU bottleneck and allows for better multithreaded rendering.
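A minimal sketch of what per-thread recording can look like; the function and handle names here are my own placeholders (not anything from the answer), and error checking is omitted:

```cpp
#include <vulkan/vulkan.h>

// Sketch: a worker thread records its share of the frame into a command buffer
// allocated from that thread's own pool, so no locking is needed while recording.
void RecordThreadWork(VkCommandBuffer cmd)
{
    VkCommandBufferBeginInfo beginInfo{};
    beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
    beginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;

    vkBeginCommandBuffer(cmd, &beginInfo);
    // ... record this thread's slice of the work here:
    //     bind pipelines, set state, issue draw/dispatch/copy commands ...
    vkEndCommandBuffer(cmd);
    // The main thread later gathers all recorded buffers into one vkQueueSubmit.
}
```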
Command buffers also allow for secondary command buffers. These are useful if you have a set of commands that will be used multiple times, to keep you from spending CPU time re-recording identical commands over and over again. You can also re-use primary command buffers, if you're so inclined.
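As a rough illustration of the reuse idea, here is a sketch of replaying a pre-recorded secondary command buffer from a primary one; the names and the render-pass setup are assumptions of mine, not taken from the answer:

```cpp
#include <vulkan/vulkan.h>

// Sketch: execute a secondary command buffer that was recorded once
// (with VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT) instead of
// re-recording the same draw calls every frame.
void ReplayPrerecorded(VkCommandBuffer primary,
                       VkCommandBuffer prerecordedSecondary,
                       const VkRenderPassBeginInfo* renderPassInfo)
{
    vkCmdBeginRenderPass(primary, renderPassInfo,
                         VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS);
    vkCmdExecuteCommands(primary, 1, &prerecordedSecondary);
    vkCmdEndRenderPass(primary);
}
```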
> That's not even all: there are command pools to create your command buffers.
Part of the point of the low-overhead APIs is giving you more control over memory allocation. By splitting pools up by thread, Vulkan doesn't need internal synchronization during allocations, which is why command pools are externally synchronized (and frankly, you shouldn't be performing any synchronization). You're also advised to have individual pools for each frame in your swapchain; this makes it easier to release memory after it's no longer in use and reduces fragmentation in your memory pools.
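A sketch of the per-frame-pool idea, with assumed names (`queueFamily`, `framesInFlight`) and no error handling; it isn't from the answer, just one way the advice could look in code:

```cpp
#include <vulkan/vulkan.h>
#include <vector>

// Sketch: one transient command pool per in-flight frame.
std::vector<VkCommandPool> CreateFramePools(VkDevice device,
                                            uint32_t queueFamily,
                                            uint32_t framesInFlight)
{
    VkCommandPoolCreateInfo info{};
    info.sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO;
    info.queueFamilyIndex = queueFamily;                // pools belong to one queue family
    info.flags = VK_COMMAND_POOL_CREATE_TRANSIENT_BIT;  // buffers are short-lived

    std::vector<VkCommandPool> pools(framesInFlight);
    for (VkCommandPool& pool : pools)
        vkCreateCommandPool(device, &info, nullptr, &pool);
    return pools;
}

// At the start of a frame (after its fence has signalled), recycle the whole
// pool in one call instead of freeing individual command buffers.
void BeginFrame(VkDevice device, VkCommandPool framePool)
{
    vkResetCommandPool(device, framePool, 0);
    // Allocate/record fresh command buffers from framePool for this frame.
}
```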
> Why would I need multiple queues
You don't, but you have the option to. Queues serve two purposes. The first is to allow different queue types to execute simultaneously: a GPU might be able to execute graphics work, compute work, and copy work in parallel, so Vulkan exposes this. The second is to allow for queue priorities. You can have multiple queues in the same family with different priorities, letting you put low-priority activities in a lower-priority queue so they are less likely to starve high-priority tasks of compute time.
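For the priority point, here is a sketch of requesting two queues from one graphics-capable family with different priorities at device-creation time; the family index and the priority values (1.0 and 0.2) are placeholders of mine, not values from the answer:

```cpp
#include <vulkan/vulkan.h>

// Priorities must stay valid while vkCreateDevice reads them.
static const float kQueuePriorities[2] = { 1.0f, 0.2f };   // high, low

// Sketch: ask for two queues in the same (assumed) family.
VkDeviceQueueCreateInfo MakeQueueRequest(uint32_t graphicsFamily)
{
    VkDeviceQueueCreateInfo queueInfo{};
    queueInfo.sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
    queueInfo.queueFamilyIndex = graphicsFamily;
    queueInfo.queueCount = 2;
    queueInfo.pQueuePriorities = kQueuePriorities;
    return queueInfo;
}

// After vkCreateDevice has been called with the request above:
void FetchQueues(VkDevice device, uint32_t graphicsFamily,
                 VkQueue* highPriority, VkQueue* lowPriority)
{
    vkGetDeviceQueue(device, graphicsFamily, 0, highPriority);   // priority 1.0
    vkGetDeviceQueue(device, graphicsFamily, 1, lowPriority);    // priority 0.2
}
```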
<hr />
If you're making a wrapper around multiple graphics APIs, then your options mostly boil down to exposing the minimal functionality or the maximal functionality. If you expose the minimal functionality, you'd be better off just using OpenGL rather than attempting to effectively beat GPU manufacturers at writing an OpenGL driver.
Going the other direction and offering the maximum functionality will require emulating a lot of functionality, with few benefits. You can implement a command buffer wrapper on top of OpenGL, but you won't actually gain any of the parallel state-validation benefits. Queues would also be implemented exclusively in user mode, rather than in kernel mode and on the GPU itself.