I've recently been working on an offline Vulkan renderer/compositor. Our initial implementation was a one-shot renderer: spawn the process, render the image, and exit. However, to amortize startup costs, we are converting it to a multi-shot renderer with an HTTP API. The first implementation simply used vkQueueWaitIdle(), but in a multi-threaded environment this might be less than optimal when multiple command buffers are submitted to the same queue, because it waits for all work on the queue rather than for one specific submission.
Using a fence allows the CPU to wait for the GPU to complete a specific command buffer, in this case, rendering the image and saving it to host memory.
In our renderer, we tried using a fence (incorrectly) with a timeout of 0, assuming that meant waiting indefinitely. We couldn't get it to work, so we reverted to vkQueueWaitIdle(), which was fine for a one-shot renderer. However, after implementing our multi-threaded renderer and attempting to use fences (again incorrectly), we got corrupt output.
After checking the documentation, I found out we were doing it wrong:
If timeout is zero, then vkWaitForFences does not wait, but simply returns the current state of the fences. VK_TIMEOUT will be returned in this case if the condition is not satisfied, even though no actual wait was performed.
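To make the difference concrete without needing a GPU, here is a minimal toy model of those timeout semantics. ToyFence and WaitResult are hypothetical names invented for this sketch, not Vulkan types; the only thing it tries to mirror is that a zero timeout reports the current state and never blocks. (If all you want is a poll, Vulkan also has vkGetFenceStatus for that.)

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-ins for vk::Result::eSuccess / vk::Result::eTimeout.
enum class WaitResult { Success, Timeout };

// Toy fence: models only the timeout behaviour, not a real Vulkan fence.
class ToyFence {
public:
    // Mark the fence as signalled and wake any waiting threads.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signalled_ = true;
        }
        cv_.notify_all();
    }

    // Wait up to timeout_ns nanoseconds for the fence to be signalled.
    // A timeout of zero never blocks: it only reports the current state.
    WaitResult wait(std::uint64_t timeout_ns) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (timeout_ns == 0)
            return signalled_ ? WaitResult::Success : WaitResult::Timeout;
        const auto timeout = std::chrono::nanoseconds(
            static_cast<std::chrono::nanoseconds::rep>(timeout_ns));
        const bool ok = cv_.wait_for(lock, timeout,
                                     [this] { return signalled_; });
        return ok ? WaitResult::Success : WaitResult::Timeout;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signalled_ = false;
};
```

With this model, calling wait(0) on an unsignalled fence returns Timeout straight away, which is exactly the situation we were in: the CPU carried on and read the image before the GPU had finished writing it.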
The correct implementation uses a loop; we can issue a warning if the job appears to be taking longer than expected:
Console::debug("Submitting command buffer to GPU...");
// The Vulkan device:
vk::Device & device = _context->device();
// The command buffer we want to submit:
auto submits = vk::SubmitInfo()
    .setCommandBufferCount(1).setPCommandBuffers(&commands);
// The queue we are going to submit to:
auto queue = device.getQueue(graphics_queue, 0);
// Generate a temporary fence:
auto fence = device.createFenceUnique({});
// Submit the command buffer to the queue with the fence:
queue.submit(1, &submits, *fence);
// Loop until the fence is signalled:
while (true) {
    // Wait for 10ms (the timeout is in nanoseconds) for the render to complete:
    auto result = device.waitForFences(*fence, true, 10000000);
    // Check the result - if it's successful we are done:
    if (result == vk::Result::eSuccess)
        break;
    // Otherwise, we took longer than 10ms to render:
    Console::warn("Wait for fence: ", vk::to_string(result));
    // If the result wasn't a timeout (e.g. error), we fail:
    if (result != vk::Result::eTimeout)
        throw std::runtime_error("renderer failed");
}
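One thing the loop above doesn't do is give up: if the GPU hangs, it will keep warning forever. Below is a rough sketch of a bounded variant, written against a generic callable so it can be tried without a device. The names WaitStatus, to_timeout_ns and wait_with_deadline are hypothetical and not part of Vulkan or of our renderer; in real code, wait_once would presumably wrap device.waitForFences(*fence, true, timeout_ns) and map the vk::Result onto these three states.

```cpp
#include <chrono>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-ins for the vk::Result values the loop cares about.
enum class WaitStatus { Success, Timeout, Error };

// Convert a chrono duration to the nanosecond count a fence wait expects,
// which avoids magic numbers like 10000000.
constexpr std::uint64_t to_timeout_ns(std::chrono::nanoseconds d) {
    return static_cast<std::uint64_t>(d.count());
}

// Calls wait_once(timeout_ns) until it reports Success, giving up after
// max_attempts timeouts. Returns the number of timeouts that occurred.
template <typename WaitOnce>
int wait_with_deadline(WaitOnce&& wait_once,
                       std::chrono::milliseconds slice,
                       int max_attempts) {
    const std::uint64_t timeout_ns = to_timeout_ns(slice);
    for (int timeouts = 0; timeouts < max_attempts; ++timeouts) {
        const WaitStatus status = wait_once(timeout_ns);
        if (status == WaitStatus::Success)
            return timeouts;
        // Anything other than a timeout is treated as a hard failure.
        if (status != WaitStatus::Timeout)
            throw std::runtime_error("renderer failed");
        // A timeout just means the render is slow; a caller could log here.
    }
    throw std::runtime_error("renderer timed out");
}
```

Whether a hard limit is appropriate depends on the workload. For an HTTP service, failing one request after a bounded wait seems preferable to blocking a worker thread indefinitely, but the slice length and attempt count are values you would have to tune.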
In hindsight, this was a relatively trivial problem, but it highlighted the fact that Vulkan can sometimes be hard to comprehend in its entirety. I didn't write the original fence code, so, not knowing any better, I initially suspected a problem with image barriers. When code gets bulky, refactoring and the subsequent debugging both get harder.