Vulkan: SYNC-HAZARD-READ-AFTER-WRITE despite full pipeline barrier between operations
Question
I'm trying to use the synchronization validation of VK_LAYER_KHRONOS_validation in my Vulkan application, but I appear to be unable to "make it happy".
While debugging, I reduced the problem to the sequence vkCmdDispatch -> full pipeline barrier -> vkCmdCopyBuffer, which (according to my understanding) should not result in a RaW hazard.
An excerpt of the relevant part of the API dump is shown below:
Thread 0, Frame 0:
vkCmdBindPipeline(commandBuffer, pipelineBindPoint, pipeline) returns void:
commandBuffer: VkCommandBuffer = 0x5651fe89b9f0
pipelineBindPoint: VkPipelineBindPoint = VK_PIPELINE_BIND_POINT_COMPUTE (1)
pipeline: VkPipeline = 0x5651fff4e5e0
Thread 0, Frame 0:
vkCmdPushDescriptorSetKHR(commandBuffer, pipelineBindPoint, layout, set, descriptorWriteCount, pDescriptorWrites) returns void:
commandBuffer: VkCommandBuffer = 0x5651fe89b9f0
pipelineBindPoint: VkPipelineBindPoint = VK_PIPELINE_BIND_POINT_COMPUTE (1)
layout: VkPipelineLayout = 0x5651fff4e270
set: uint32_t = 0
descriptorWriteCount: uint32_t = 1
pDescriptorWrites: const VkWriteDescriptorSet* = 0x7ffe711a6130
pDescriptorWrites[0]: const VkWriteDescriptorSet = 0x7ffe711a6130:
sType: VkStructureType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET (35)
pNext: const void* = NULL
dstSet: VkDescriptorSet = 0
dstBinding: uint32_t = 0
dstArrayElement: uint32_t = 0
descriptorCount: uint32_t = 1
descriptorType: VkDescriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER (7)
pImageInfo: const VkDescriptorImageInfo* = UNUSED
pBufferInfo: const VkDescriptorBufferInfo* = 0x5651fea830b0
pBufferInfo[0]: const VkDescriptorBufferInfo = 0x5651fea830b0:
buffer: VkBuffer = 0x5651ffcbf560
offset: VkDeviceSize = 0
range: VkDeviceSize = 1048576
pTexelBufferView: const VkBufferView* = UNUSED
Thread 0, Frame 0:
vkCmdDispatch(commandBuffer, groupCountX, groupCountY, groupCountZ) returns void:
commandBuffer: VkCommandBuffer = 0x5651fe89b9f0
groupCountX: uint32_t = 1024
groupCountY: uint32_t = 1
groupCountZ: uint32_t = 1
Thread 0, Frame 0:
vkCmdPipelineBarrier2(commandBuffer, pDependencyInfo) returns void:
commandBuffer: VkCommandBuffer = 0x5651fe89b9f0
pDependencyInfo: const VkDependencyInfo* = 0x7ffe711a6280:
sType: VkStructureType = VK_STRUCTURE_TYPE_DEPENDENCY_INFO (1000314003)
pNext: const void* = NULL
dependencyFlags: VkDependencyFlags = 0
memoryBarrierCount: uint32_t = 1
pMemoryBarriers: const VkMemoryBarrier2* = 0x7ffe711a6130
pMemoryBarriers[0]: const VkMemoryBarrier2 = 0x7ffe711a6130:
sType: VkStructureType = VK_STRUCTURE_TYPE_MEMORY_BARRIER_2 (1000314000)
pNext: const void* = NULL
srcStageMask: VkPipelineStageFlags2 = 65536 (VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT)
srcAccessMask: VkAccessFlags2 = 98304 (VK_ACCESS_2_MEMORY_READ_BIT | VK_ACCESS_2_MEMORY_WRITE_BIT)
dstStageMask: VkPipelineStageFlags2 = 65536 (VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT)
dstAccessMask: VkAccessFlags2 = 98304 (VK_ACCESS_2_MEMORY_READ_BIT | VK_ACCESS_2_MEMORY_WRITE_BIT)
bufferMemoryBarrierCount: uint32_t = 0
pBufferMemoryBarriers: const VkBufferMemoryBarrier2* = NULL
imageMemoryBarrierCount: uint32_t = 0
pImageMemoryBarriers: const VkImageMemoryBarrier2* = NULL
Thread 0, Frame 0:
vkCmdCopyBuffer(commandBuffer, srcBuffer, dstBuffer, regionCount, pRegions) returns void:
commandBuffer: VkCommandBuffer = 0x5651fe89b9f0
srcBuffer: VkBuffer = 0x5651ffcbf560
dstBuffer: VkBuffer = 0x5651fe986280
regionCount: uint32_t = 1
pRegions: const VkBufferCopy* = 0x7ffe711a6090
pRegions[0]: const VkBufferCopy = 0x7ffe711a6090:
srcOffset: VkDeviceSize = 0
dstOffset: VkDeviceSize = 0
size: VkDeviceSize = 1048576
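The numeric masks in the barrier can be cross-checked against the flag names the dump prints next to them. The following is a minimal sketch, not part of the original application: the constants repeat the values from the Vulkan headers so that it compiles without the SDK, and `describe_access_mask` is a hypothetical helper name.

```cpp
#include <cstdint>
#include <string>

// Flag values as defined in the Vulkan headers, redefined locally so that
// this sketch compiles without the Vulkan SDK.
constexpr std::uint64_t kStageAllCommands  = 0x00010000ULL;  // VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT
constexpr std::uint64_t kAccessMemoryRead  = 0x00008000ULL;  // VK_ACCESS_2_MEMORY_READ_BIT
constexpr std::uint64_t kAccessMemoryWrite = 0x00010000ULL;  // VK_ACCESS_2_MEMORY_WRITE_BIT

// Hypothetical helper: names the two generic memory access bits in a mask and
// lumps anything else together as "<other bits>".
std::string describe_access_mask(std::uint64_t mask) {
    std::string out;
    auto append = [&](std::uint64_t bit, const char* name) {
        if ((mask & bit) != 0) {
            if (!out.empty()) out += " | ";
            out += name;
            mask &= ~bit;  // clear the bit so that leftovers can be detected
        }
    };
    append(kAccessMemoryRead, "VK_ACCESS_2_MEMORY_READ_BIT");
    append(kAccessMemoryWrite, "VK_ACCESS_2_MEMORY_WRITE_BIT");
    if (mask != 0) {
        if (!out.empty()) out += " | ";
        out += "<other bits>";
    }
    return out.empty() ? "0" : out;
}
```

Calling `describe_access_mask(98304)` names both generic memory access bits, matching what the dump shows for srcAccessMask and dstAccessMask, and 65536 equals `kStageAllCommands` for both stage masks.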
For reference, I've uploaded the full API dump here.
The synchronization validation reports the following.
SYNC-HAZARD-READ-AFTER-WRITE(ERROR / SPEC): msgNum: -455515022 - Validation Error: [ SYNC-HAZARD-READ-AFTER-WRITE ] Object 0: handle = 0x55eebfc30cb0, type = VK_OBJECT_TYPE_BUFFER; | MessageID = 0xe4d96472 | vkCmdCopyBuffer: Hazard READ_AFTER_WRITE for srcBuffer VkBuffer 0x55eebfc30cb0[], region 0. Access info (usage: SYNC_COPY_TRANSFER_READ, prior_usage: SYNC_COMPUTE_SHADER_SHADER_STORAGE_WRITE, write_barriers: 0, command: vkCmdDispatch, seq_no: 1, reset_no: 1).
However, according to my understanding, the full pipeline barrier between the dispatch and the buffer copy (in command order) should prevent them from overlapping, both in execution order and with respect to memory visibility.
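That reasoning can be written down as a small check. This is a deliberately simplified model of barrier scopes, not how the validation layer is implemented: it ignores the expansion of logically earlier and later stages, `Scope` and `barrier_orders_read_after_write` are hypothetical names, and VK_ACCESS_2_SHADER_WRITE_BIT stands in for the storage write named in the report.

```cpp
#include <cstdint>

// Flag values as defined in the Vulkan headers, redefined locally so that
// this sketch compiles without the Vulkan SDK.
constexpr std::uint64_t kStageComputeShader = 0x00000800ULL;  // VK_PIPELINE_STAGE_2_COMPUTE_SHADER_BIT
constexpr std::uint64_t kStageAllTransfer   = 0x00001000ULL;  // VK_PIPELINE_STAGE_2_ALL_TRANSFER_BIT
constexpr std::uint64_t kStageAllCommands   = 0x00010000ULL;  // VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT
constexpr std::uint64_t kAccessShaderWrite  = 0x00000040ULL;  // VK_ACCESS_2_SHADER_WRITE_BIT
constexpr std::uint64_t kAccessTransferRead = 0x00000800ULL;  // VK_ACCESS_2_TRANSFER_READ_BIT
constexpr std::uint64_t kAccessMemoryRead   = 0x00008000ULL;  // VK_ACCESS_2_MEMORY_READ_BIT
constexpr std::uint64_t kAccessMemoryWrite  = 0x00010000ULL;  // VK_ACCESS_2_MEMORY_WRITE_BIT

// Hypothetical type: either one memory access or one half of a barrier.
struct Scope {
    std::uint64_t stages;
    std::uint64_t accesses;
};

// A mask covers the wanted bits if it contains them directly or contains the
// wildcard bit that stands for all of them.
constexpr bool covers(std::uint64_t mask, std::uint64_t wildcard, std::uint64_t wanted) {
    return (mask & wildcard) != 0 || (mask & wanted) == wanted;
}

// Simplified rule: the earlier write must fall in the barrier's source scope
// and the later read must fall in its destination scope.
constexpr bool barrier_orders_read_after_write(Scope src, Scope dst, Scope write, Scope read) {
    return covers(src.stages, kStageAllCommands, write.stages)
        && covers(src.accesses, kAccessMemoryWrite, write.accesses)
        && covers(dst.stages, kStageAllCommands, read.stages)
        && covers(dst.accesses, kAccessMemoryRead, read.accesses);
}
```

With the masks from the dump (65536 for both stage masks, 98304 for both access masks), a compute shader write and a transfer read, the function returns true, so under this model the barrier covers both sides of the reported hazard. A barrier whose source access mask is 0 returns false for the same pair of accesses.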
Notably, other validation areas (apart from "best practices") do not report any problems.
I'm at a loss as to what the validation error is supposed to tell me. Either I've made a very dumb small mistake that I cannot find, my understanding of synchronization is wrong, or there is a bug in the validator.
Answer 1
Score: 0
It appears that this was a bug in the validation layers (present in version 1.3.236) that has been fixed in the latest release (1.3.239).
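To tell whether an installed layer is new enough, its version can be compared with the two releases mentioned above. The sketch below assumes the layer version is available as a packed Vulkan version number, for example from VkLayerProperties::specVersion as returned by vkEnumerateInstanceLayerProperties; whether that field tracks the layer release exactly is an assumption. `make_version` mirrors the bit layout of VK_MAKE_API_VERSION with variant 0, and `has_reported_fix` is a hypothetical helper.

```cpp
#include <cstdint>

// Same bit layout as VK_MAKE_API_VERSION with variant 0:
// major in bits 22..28, minor in bits 12..21, patch in bits 0..11.
constexpr std::uint32_t make_version(std::uint32_t major, std::uint32_t minor, std::uint32_t patch) {
    return (major << 22U) | (minor << 12U) | patch;
}

// Versions named in this answer.
constexpr std::uint32_t kReportedAffected = make_version(1, 3, 236);
constexpr std::uint32_t kReportedFixed    = make_version(1, 3, 239);

// Hypothetical helper: true if the packed layer version is at least the
// release in which the fix was reported. Says nothing about 1.3.237 or
// 1.3.238, which this answer does not cover.
constexpr bool has_reported_fix(std::uint32_t layer_version) {
    return layer_version >= kReportedFixed;
}
```

Because the variant is 0, packed versions compare in the same order as their (major, minor, patch) triples, so a plain integer comparison is enough here.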