I've recently been working on an offline vulkan renderer/compositor. Our initial implementation was a one-shot renderer - spawn the process, render the image, and exit. However, to amortize startup costs, we are converting it to a multi-shot renderer with an HTTP API. The first implementation simply used vkQueueWaitIdle()
, but in a multi-threaded environment this might be less than optimal as multiple command buffers are submitted to the same queue.
Using a fence allows the CPU to wait for the GPU to complete a specific command buffer, in this case, rendering the image and saving it to host memory.
In our renderer, we tried using a fence (incorrectly) with a timeout of 0, assuming it meant to wait indefinitely. We couldn't get it to work so we reverted back to using vkQueueWaitIdle()
which was fine for a one-shot renderer. However, after implementing our multi-threaded renderer, and attempting to (incorrectly) use fences again, we experienced corrupt output:
After checking the documentation, I found out we were doing it wrong:
If timeout is zero, then vkWaitForFences does not wait, but simply returns the current state of the fences. VK_TIMEOUT will be returned in this case if the condition is not satisfied, even though no actual wait was performed.
The correct implementation uses a loop; we can issue a warning if the job appears to be taking longer than expected:
In hindsight, this was a relatively trivial problem, however it highlighted the fact that Vulkan can sometimes be hard to comprehend in its entirety. I didn't write the original fence code, so without knowing any better, I initially suspected some problem with image barriers. When code gets bulky, it makes refactoring and the subsequent debugging harder.