Project Git repository: https://gitee.com/yaksue/yaksue-graphics
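Goal
I noticed that D3D12 and Vulkan, compared with D3D11 and OpenGL, add an operation called a "resource transition barrier". I want to understand:
- What is the purpose of a "resource transition barrier"?
- Which APIs do D3D12 and Vulkan use to implement it?
- Texture creation uses this operation; what are the details?
- Where else in the current project is it used?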
Resource Transitions
From the "D3D12 Dragon Book", section 4.2.3 Resource Transitions:
To implement common rendering effects, it is common for the GPU to write to a resource R in one step, and then, in a later step, read from the resource R. However, it would be a resource hazard to read from a resource if the GPU has not finished writing to it or not started writing at all. To solve this problem, Direct3D associates a state to resources. Resources are in a default state when they are created, and it is up to the application to tell Direct3D any state transitions. This enables the GPU to do any work it needs to do to make the transition and prevent resource hazards. For example, if we are writing to a resource, say a texture, we will set the texture state to a render target state; when we need to read the texture, we will change its state to a shader resource state. By informing Direct3D of a transition, the GPU can take steps to avoid the hazard, for example, by waiting for all the write operations to complete before reading from the resource. The burden of resource transition falls on the application developer for performance reasons. The application developer knows when these transitions are happening. An automatic transition tracking system would impose additional overhead.
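In other words: rendering effects often need the GPU to write to a resource R in one step and read from R in a later step. If the read happens before the write has finished, or before it has even started, that is a resource hazard. To prevent this, D3D associates a state with every resource. A resource starts in a default state, and the application must tell D3D about every state transition so that the GPU can do whatever is needed to avoid the hazard. For example, a texture that is being written is put in the render target state, and when it needs to be read it is switched to the shader resource state; the GPU can then wait for all writes to complete before reading. The burden falls on the application developer for performance reasons: the developer already knows when transitions happen, whereas an automatic tracking system would add overhead. (Note from the Chinese edition of the book: in D3D11 this work is handled entirely by the driver, which costs some performance; in D3D12 transitions are fully manual, so the driver no longer has to track resource states.)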
A resource transition is specified by setting an array of transition resource barriers on the command list; it is an array in case you want to transition multiple resources with one API call. In code, a resource barrier is represented by the D3D12_RESOURCE_BARRIER_DESC structure.
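That is, a resource transition is specified by setting an array of transition resource barriers on the command list; it is an array so that several resources can be transitioned with a single API call. The book names the structure D3D12_RESOURCE_BARRIER_DESC; in the actual d3d12.h header it is D3D12_RESOURCE_BARRIER. In practice CD3DX12_RESOURCE_BARRIER, a higher-level wrapper provided by "d3dx12.h", is used more often.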
The "Layout transitions" section of Images - Vulkan Tutorial gives a similar description:
One of the most common ways to perform layout transitions is using an image memory barrier. A pipeline barrier like that is generally used to synchronize access to resources, like ensuring that a write to a buffer completes before reading from it
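That is, an image memory barrier is one of the most common ways to perform a layout transition. A pipeline barrier of this kind is generally used to synchronize access to resources, for example to make sure a write to a buffer has completed before the buffer is read.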
API implementation
D3D12
D3D12 implements this through ID3D12GraphicsCommandList::ResourceBarrier.
The D3D12 Dragon Book discusses it with the following example:
CommandList->ResourceBarrier(1
, &CD3DX12_RESOURCE_BARRIER::Transition(
CurrentBackBuffer()
, D3D12_RESOURCE_STATE_PRESENT
, D3D12_RESOURCE_STATE_RENDER_TARGET));
This code transitions a texture representing the image we are displaying on screen from a presentation state to a render target state. Observe that the resource barrier is added to the command list. You can think of the resource barrier transition as a command itself instructing the GPU that the state of a resource is being transitioned, so that it can take the necessary steps to prevent a resource hazard when executing subsequent commands.
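In other words, this code takes the image currently displayed on screen and transitions it from the present state (D3D12_RESOURCE_STATE_PRESENT) to the render target state (D3D12_RESOURCE_STATE_RENDER_TARGET). Note that the resource barrier is recorded into the command list: the barrier is itself a command that tells the GPU a resource's state is changing, so that the GPU can take the steps needed to prevent a resource hazard when it executes the commands that follow.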
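To make the "a barrier is itself a command" point concrete, here is a minimal sketch that runs without the D3D12 SDK. `State`, `Resource`, `TransitionBarrier` and `MockCommandList` are hypothetical stand-ins invented for this illustration, not D3D12 types; the sketch only models the idea that recording a barrier changes nothing until the list is executed, and that the "before" state must match the state the resource is actually in.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT the D3D12 types.
enum class State { Present, RenderTarget, CopyDest, PixelShaderResource };

struct Resource {
    State state;  // state the (mock) GPU currently sees
};

struct TransitionBarrier {
    Resource* resource;
    State before;
    State after;
};

class MockCommandList {
public:
    // Same shape as ResourceBarrier(count, pointer): several transitions
    // can be recorded with one call.
    void ResourceBarrier(std::size_t count, const TransitionBarrier* barriers) {
        recorded_.insert(recorded_.end(), barriers, barriers + count);
    }

    // Recording changes nothing. The transitions take effect only when the
    // list is executed, in recording order.
    void Execute() {
        for (const TransitionBarrier& b : recorded_) {
            if (b.resource->state != b.before) {
                throw std::logic_error("barrier 'before' state does not match");
            }
            b.resource->state = b.after;
        }
        recorded_.clear();
    }

private:
    std::vector<TransitionBarrier> recorded_;
};
```

Recording a Present to RenderTarget barrier leaves the mock back buffer in `State::Present` until `Execute()` runs; executing a barrier whose `before` state is wrong throws, which loosely mirrors the kind of mistake the D3D12 debug layer reports.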
Vulkan's vkCmdPipelineBarrier
Vulkan implements this through vkCmdPipelineBarrier.
The "Layout transitions" section of Images - Vulkan Tutorial discusses it with the following example:
vkCmdPipelineBarrier(
commandBuffer,
0 /* TODO */, 0 /* TODO */,
0,
0, nullptr,
0, nullptr,
1, &barrier
);
All types of pipeline barriers are submitted using the same function. The first parameter after the command buffer specifies in which pipeline stage the operations occur that should happen before the barrier. The second parameter specifies the pipeline stage in which operations will wait on the barrier. The pipeline stages that you are allowed to specify before and after the barrier depend on how you use the resource before and after the barrier. The allowed values are listed in this table of the specification. For example, if you’re going to read from a uniform after the barrier, you would specify a usage of VK_ACCESS_UNIFORM_READ_BIT and the earliest shader that will read from the uniform as pipeline stage, for example VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT. It would not make sense to specify a non-shader pipeline stage for this type of usage and the validation layers will warn you when you specify a pipeline stage that does not match the type of usage.
The third parameter is either 0 or VK_DEPENDENCY_BY_REGION_BIT. The latter turns the barrier into a per-region condition. That means that the implementation is allowed to already begin reading from the parts of a resource that were written so far, for example.
The last three pairs of parameters reference arrays of pipeline barriers of the three available types: memory barriers, buffer memory barriers, and image memory barriers like the one we’re using here.
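To summarize: every type of pipeline barrier is submitted through the single function vkCmdPipelineBarrier. The first parameter after commandBuffer names the pipeline stage whose operations must happen before the barrier, and the second names the stage whose operations wait on the barrier. Which stages are allowed depends on how the resource is used before and after the barrier; the allowed values are listed in the specification's "Supported access types" table (that page loads slowly, so I copied the table into the appendix of this post). For example, to read a uniform buffer after the barrier, specify VK_ACCESS_UNIFORM_READ_BIT as the usage and the earliest shader stage that reads it, such as VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, as the pipeline stage. A non-shader stage makes no sense for this usage, which is why the table lists only shader stages for VK_ACCESS_UNIFORM_READ_BIT; if the stage does not match the usage, the validation layers warn.
The third parameter is either 0 or VK_DEPENDENCY_BY_REGION_BIT. The latter makes the barrier a per-region condition, which allows the implementation to start reading the parts of a resource that have already been written while other parts are still being written.
The last three pairs of parameters are arrays of the three barrier types: memory barriers, buffer memory barriers, and image memory barriers (only an image memory barrier is used here).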
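The "stage must match usage" rule can be sketched as a small lookup. The enums below are local stand-ins so the example compiles without the Vulkan SDK: the names echo the Vulkan flags, the values are arbitrary, and only a subset of the table's rows is covered. `SupportedStages` and `StageMatchesAccess` are hypothetical helpers, and the check encodes my understanding that an access type is acceptable when at least one stage in the mask supports it.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins, not the real VK_* constants (values are arbitrary).
enum Stage : std::uint32_t {
    kVertexShader       = 1u << 0,
    kFragmentShader     = 1u << 1,
    kComputeShader      = 1u << 2,
    kTransfer           = 1u << 3,
    kEarlyFragmentTests = 1u << 4,
    kLateFragmentTests  = 1u << 5,
};

enum class Access { UniformRead, ShaderRead, TransferRead, TransferWrite,
                    DepthStencilWrite };

// Stages that support each access type (a subset of the appendix table).
inline std::uint32_t SupportedStages(Access access) {
    const std::uint32_t anyShader = kVertexShader | kFragmentShader | kComputeShader;
    switch (access) {
        case Access::UniformRead:
        case Access::ShaderRead:
            return anyShader;
        case Access::TransferRead:
        case Access::TransferWrite:
            return kTransfer;
        case Access::DepthStencilWrite:
            return kEarlyFragmentTests | kLateFragmentTests;
    }
    return 0;
}

// True when at least one stage in stageMask supports the access type.
inline bool StageMatchesAccess(Access access, std::uint32_t stageMask) {
    return (stageMask & SupportedStages(access)) != 0;
}
```

With these definitions, `StageMatchesAccess(Access::UniformRead, kFragmentShader)` holds, while pairing `Access::UniformRead` with `kTransfer` does not, which is the mismatch the validation layers would complain about.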
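Resource transition barriers for textures
In "Graphics API Learning Project (11): Using Textures" I followed the tutorial code to implement texture sampling, but there was a lot of code and I had no time to analyze it. This post analyzes how a texture resource is created in D3D12 and in Vulkan, focusing on where resource transition barriers are used: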
Vulkan
0. Read the image data from a file with the stb library
int texWidth, texHeight, texChannels;
stbi_uc* pixels = stbi_load(ImageFile.c_str(), &texWidth, &texHeight, &texChannels, STBI_rgb_alpha);
VkDeviceSize imageSize = texWidth * texHeight * 4;
1. Create the intermediate (staging) resource
VkBuffer stagingBuffer;
VkDeviceMemory stagingBufferMemory;
createBuffer(imageSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, stagingBuffer, stagingBufferMemory);
2. Create the texture resource that is actually used
Note that the texture is then transitioned to VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL:
createImage(texWidth, texHeight, VK_FORMAT_R8G8B8A8_UNORM, VK_IMAGE_TILING_OPTIMAL, VK_IMAGE_USAGE_TRANSFER_SRC_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT | VK_IMAGE_USAGE_SAMPLED_BIT, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, result->textureImage, result->textureImageMemory);
transitionImageLayout(result->textureImage, VK_FORMAT_R8G8B8A8_UNORM, VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
3. Copy the data into the intermediate resource
void* data;
vkMapMemory(Device, stagingBufferMemory, 0, imageSize, 0, &data);
memcpy(data, pixels, (size_t)imageSize);
vkUnmapMemory(Device, stagingBufferMemory);
4. Copy the data from the intermediate resource to the actual texture
// Copy the data from the staging buffer to the actual resource
copyBufferToImage(stagingBuffer, result->textureImage, static_cast<uint32_t>(texWidth), static_cast<uint32_t>(texHeight));
where copyBufferToImage is:
VkCommandBuffer commandBuffer = beginSingleTimeCommands();
VkBufferImageCopy region{};
region.bufferOffset = 0;
region.bufferRowLength = 0;
region.bufferImageHeight = 0;
region.imageSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
region.imageSubresource.mipLevel = 0;
region.imageSubresource.baseArrayLayer = 0;
region.imageSubresource.layerCount = 1;
region.imageOffset = { 0, 0, 0 };
region.imageExtent = {
width,
height,
1
};
vkCmdCopyBufferToImage(commandBuffer, buffer, image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &region);
endSingleTimeCommands(commandBuffer);
As you can see, it uses the vkCmdCopyBufferToImage command.
5. Transition to VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
transitionImageLayout(result->textureImage, VK_FORMAT_R8G8B8A8_UNORM, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL);
D3D12
0. Read the image data from a file with the stb library
// Read the image
int texWidth, texHeight, texChannels;
stbi_uc* pixels = stbi_load(ImageFile.c_str(), &texWidth, &texHeight, &texChannels, STBI_rgb_alpha);
const UINT TexturePixelSize = 4; // STBI_rgb_alpha yields 4 channels
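1. Create the intermediate (upload) resource
Note that the intermediate resource is created in D3D12_RESOURCE_STATE_GENERIC_READ, the required initial state for a resource on an upload heap. GetRequiredIntermediateSize below queries the texture from step 2, so in the actual code that texture must already have been created.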
ComPtr<ID3D12Resource> textureUploadHeap;
{
// Size:
const UINT64 uploadBufferSize = GetRequiredIntermediateSize(result->TextureResource.Get(), 0, 1);
// Named locals: taking the address of a temporary is a non-standard compiler extension.
const CD3DX12_HEAP_PROPERTIES uploadHeapProps(D3D12_HEAP_TYPE_UPLOAD);
const CD3DX12_RESOURCE_DESC uploadBufferDesc = CD3DX12_RESOURCE_DESC::Buffer(uploadBufferSize);
// Create the GPU upload buffer.
ThrowIfFailed(Device->CreateCommittedResource(
&uploadHeapProps,
D3D12_HEAP_FLAG_NONE,
&uploadBufferDesc,
D3D12_RESOURCE_STATE_GENERIC_READ,
nullptr,
IID_PPV_ARGS(&textureUploadHeap)));
}
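2. Create the texture resource that is actually used
Note that it is created directly in D3D12_RESOURCE_STATE_COPY_DEST:
// Named local: taking the address of a temporary is a non-standard compiler extension.
const CD3DX12_HEAP_PROPERTIES defaultHeapProps(D3D12_HEAP_TYPE_DEFAULT);
ThrowIfFailed(Device->CreateCommittedResource(
&defaultHeapProps,
D3D12_HEAP_FLAG_NONE,
&textureDesc,
D3D12_RESOURCE_STATE_COPY_DEST,
nullptr,
IID_PPV_ARGS(&result->TextureResource)));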
3+4. Copy the data to the intermediate resource, then from the intermediate resource to the actual resource
This is done by the UpdateSubresources function:
D3D12_SUBRESOURCE_DATA textureData = {};
textureData.pData = pixels;
textureData.RowPitch = texWidth * TexturePixelSize;
textureData.SlicePitch = textureData.RowPitch * texHeight;
UpdateSubresources(CommandList.Get(), result->TextureResource.Get(), textureUploadHeap.Get(), 0, 0, 1, &textureData);
This function is defined in d3dx12.h. Its logic is: first MemcpySubresource copies pSrcData into pIntermediate, then the CopyTextureRegion command copies pIntermediate into pDestinationResource.
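The CPU-side half of that copy cannot always be a single memcpy, because D3D12 requires the row pitch of texture data placed in a buffer to be a multiple of 256 bytes (D3D12_TEXTURE_DATA_PITCH_ALIGNMENT), while the pixels returned by stb are tightly packed. The following is a minimal sketch of a row-by-row copy under that assumption; `AlignUp` and `CopyRows` are hypothetical helper names, and this is not the code from d3dx12.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Rounds value up to the next multiple of alignment (a power of two).
inline std::size_t AlignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Copies rowCount rows of rowSizeInBytes each. Source rows start
// srcRowPitch bytes apart, destination rows dstRowPitch bytes apart;
// padding bytes in the destination are left untouched.
inline void CopyRows(std::uint8_t* dst, std::size_t dstRowPitch,
                     const std::uint8_t* src, std::size_t srcRowPitch,
                     std::size_t rowSizeInBytes, std::size_t rowCount) {
    for (std::size_t y = 0; y < rowCount; ++y) {
        std::memcpy(dst + y * dstRowPitch, src + y * srcRowPitch, rowSizeInBytes);
    }
}
```

For a 3x2 RGBA image the source pitch is 12 bytes and the padded destination pitch is `AlignUp(12, 256)`, that is 256 bytes, so the second row lands at byte offset 256 in the intermediate buffer rather than at offset 12.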
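5. Transition to PIXEL_SHADER_RESOURCE
The texture is transitioned from D3D12_RESOURCE_STATE_COPY_DEST to D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE:
// Named local: taking the address of a temporary is a non-standard compiler extension.
const CD3DX12_RESOURCE_BARRIER toShaderResource = CD3DX12_RESOURCE_BARRIER::Transition(result->TextureResource.Get(), D3D12_RESOURCE_STATE_COPY_DEST, D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE);
CommandList->ResourceBarrier(1, &toShaderResource);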
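Comparison and summary
The two APIs follow a very similar approach; only the interfaces they use differ:
Step | Vulkan | D3D12 |
---|---|---|
0. Read the image | stbi_load | stbi_load |
1. Create the intermediate resource | createBuffer (VK_BUFFER_USAGE_TRANSFER_SRC_BIT, host-visible memory) | CreateCommittedResource (D3D12_HEAP_TYPE_UPLOAD, D3D12_RESOURCE_STATE_GENERIC_READ) |
2. Create the texture | createImage, then transitionImageLayout to VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL | CreateCommittedResource (D3D12_HEAP_TYPE_DEFAULT), created in D3D12_RESOURCE_STATE_COPY_DEST |
3. Copy to the intermediate resource | vkMapMemory, memcpy, vkUnmapMemory | MemcpySubresource (inside UpdateSubresources) |
4. Copy to the texture | vkCmdCopyBufferToImage | CopyTextureRegion (inside UpdateSubresources) |
5. Transition for shader reads | transitionImageLayout to VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL | ResourceBarrier to D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE |
(The graphics commands among these are vkCmdPipelineBarrier, issued by transitionImageLayout, and vkCmdCopyBufferToImage on the Vulkan side, and ResourceBarrier and CopyTextureRegion on the D3D12 side.)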
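Vulkan's transitionImageLayout
The transitionImageLayout function in the current project ends with a long chain of branch checks. In each branch, barrier.srcAccessMask must match sourceStage and barrier.dstAccessMask must match destinationStage; the valid pairings are listed in the "Supported access types" table (see the appendix). If they do not match, the validation layers issue a warning.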
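The function body is not reproduced in this post, so the following is only a sketch of how such a branch chain picks its masks. It follows the three cases used in the Vulkan Tutorial, which the project is based on, and may differ from the project's actual code. `Layout`, `Access`, `Stage`, `BarrierMasks` and `SelectBarrierMasks` are local stand-ins so that the sketch compiles without the Vulkan SDK; real code would assign the VK_* constants to the VkImageMemoryBarrier fields instead.

```cpp
#include <cassert>
#include <stdexcept>

// Local stand-ins named after the Vulkan enums; not the real VK_* values.
enum class Layout { Undefined, TransferDstOptimal, ShaderReadOnlyOptimal,
                    DepthStencilAttachmentOptimal };
enum class Access { None, TransferWrite, ShaderRead, DepthStencilReadWrite };
enum class Stage { TopOfPipe, Transfer, FragmentShader, EarlyFragmentTests };

struct BarrierMasks {
    Access srcAccessMask;
    Access dstAccessMask;
    Stage sourceStage;
    Stage destinationStage;
};

inline BarrierMasks SelectBarrierMasks(Layout oldLayout, Layout newLayout) {
    if (oldLayout == Layout::Undefined &&
        newLayout == Layout::TransferDstOptimal) {
        // Nothing to wait for; the upcoming copy (a transfer write) waits.
        return {Access::None, Access::TransferWrite,
                Stage::TopOfPipe, Stage::Transfer};
    }
    if (oldLayout == Layout::TransferDstOptimal &&
        newLayout == Layout::ShaderReadOnlyOptimal) {
        // The copy must finish before the fragment shader samples the image.
        return {Access::TransferWrite, Access::ShaderRead,
                Stage::Transfer, Stage::FragmentShader};
    }
    if (oldLayout == Layout::Undefined &&
        newLayout == Layout::DepthStencilAttachmentOptimal) {
        // Depth reads and writes first happen in the early fragment tests.
        return {Access::None, Access::DepthStencilReadWrite,
                Stage::TopOfPipe, Stage::EarlyFragmentTests};
    }
    throw std::invalid_argument("unsupported layout transition");
}
```

Each branch pairs an access mask with a stage from the same row of the appendix table: transfer write with the transfer stage, shader read with the fragment shader stage, and depth-stencil access with the early fragment tests stage.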
Other resource transition barriers in the current project
D3D12:
Before rendering:
// Indicate that the back buffer will be used as a render target.
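// Named local: taking the address of a temporary is a non-standard compiler extension.
const CD3DX12_RESOURCE_BARRIER toRenderTarget = CD3DX12_RESOURCE_BARRIER::Transition(
RenderTargets[CurrentBackBufferIndex].Get()
, D3D12_RESOURCE_STATE_PRESENT
, D3D12_RESOURCE_STATE_RENDER_TARGET);
CommandList->ResourceBarrier(1, &toRenderTarget);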
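After rendering is complete, when the frame needs to be presented:
// Indicate that the back buffer will now be used to present.
const CD3DX12_RESOURCE_BARRIER toPresent = CD3DX12_RESOURCE_BARRIER::Transition(
RenderTargets[CurrentBackBufferIndex].Get()
, D3D12_RESOURCE_STATE_RENDER_TARGET
, D3D12_RESOURCE_STATE_PRESENT);
CommandList->ResourceBarrier(1, &toPresent);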
Vulkan:
When creating the depth-stencil buffer:
transitionImageLayout(depthImage, depthFormat, VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL);
(However, according to Depth buffering - Vulkan Tutorial, this is not required: We don’t need to explicitly transition the layout of the image to a depth attachment because we’ll take care of this in the render pass.)
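Other questions worth discussing
- Why copy the data to an "intermediate resource" first and only then to the "actually used resource"? In the D3D12 example above, for instance, couldn't the intermediate resource itself be sampled as the texture?
- Why does copying from the intermediate resource to the actually used resource need a graphics command? Couldn't the copy be done directly on the CPU?
Appendix: Supported access types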
Access flag | Supported pipeline stages |
---|---|
VK_ACCESS_INDIRECT_COMMAND_READ_BIT | VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT |
VK_ACCESS_INDEX_READ_BIT | VK_PIPELINE_STAGE_VERTEX_INPUT_BIT |
VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT | VK_PIPELINE_STAGE_VERTEX_INPUT_BIT |
VK_ACCESS_UNIFORM_READ_BIT | VK_PIPELINE_STAGE_VERTEX_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT, VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, or VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT |
VK_ACCESS_SHADER_READ_BIT | VK_PIPELINE_STAGE_VERTEX_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT, VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, or VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT |
VK_ACCESS_SHADER_WRITE_BIT | VK_PIPELINE_STAGE_VERTEX_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT, VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, or VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT |
VK_ACCESS_INPUT_ATTACHMENT_READ_BIT | VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT |
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT |
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT | VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT |
VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT | VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT, or VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT |
VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT | VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT, or VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT |
VK_ACCESS_TRANSFER_READ_BIT | VK_PIPELINE_STAGE_TRANSFER_BIT |
VK_ACCESS_TRANSFER_WRITE_BIT | VK_PIPELINE_STAGE_TRANSFER_BIT |
VK_ACCESS_HOST_READ_BIT | VK_PIPELINE_STAGE_HOST_BIT |
VK_ACCESS_HOST_WRITE_BIT | VK_PIPELINE_STAGE_HOST_BIT |
VK_ACCESS_MEMORY_READ_BIT | Any |
VK_ACCESS_MEMORY_WRITE_BIT | Any |