CSharpGL(57) [Translation] Clearing the Screen in Vulkan

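This article is a translation of (http://ogldev.atspace.co.uk/www/tutorial51/tutorial51.html), written as an attempt at learning Vulkan.

Without translating it, I would keep rereading the first sentence and never finish learning.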

Background

+BIT祝威+ quietly leaves a copyright notice here:

Welcome back. I hope that you've been able to complete the previous tutorial successfully and you are now ready to continue. In this tutorial we will do a very basic operation that usually starts a new frame - clear the screen. In OpenGL this can be done very easily with just the glClear() command but as you can already assume - it's a totally different ball game with Vulkan. This tutorial will introduce us to three new and important Vulkan entities - swap chain, images and command buffers.

Let's look at a very simple OpenGL render loop which just clears the screen:

void RenderLoop()
{
    glClear(GL_COLOR_BUFFER_BIT);
    glutSwapBuffers();   // Or in GLFW: glfwSwapBuffers(pWindow);
}

What we have here is a GL command to clear the color buffer followed by a GLUT or GLFW call that swaps the front buffer, which is currently being displayed, with the back buffer (which is really the buffer that glClear targeted). These two seemingly innocent functions hide a ton of backstage activity by the OpenGL driver. What Vulkan does is to provide us with a standard interface to the low level operations that used to be the sole domain of the OpenGL driver. Now we have to take control and manage these backstage activities ourselves.

So now let's think about what really happens in the driver when it executes that render loop. In most graphics drivers there is a concept of a command buffer. The command buffer is a memory buffer that the driver fills with GPU instructions. The driver translates the GL commands into GPU instructions. It submits the command buffer to the GPU and there is usually some form of a queue of all the command buffers that are currently in flight. The GPU picks up the command buffers one by one and executes their contents. The command buffers contain instructions, pointers to resources, state changes and everything else that the GPU needs in order to execute the OpenGL commands correctly. Each command buffer can potentially contain multiple OpenGL commands (and it usually does because it is more efficient). It is up to the driver to decide how to batch the OpenGL commands into the command buffers. The GPU informs the driver whenever a command buffer is completed and the driver can stall the application to prevent it from getting too far ahead of the GPU (e.g. the GPU renders frame N while the application is already at frame N+10).

This model works pretty well. Why do we need to change it? Well, putting the driver in charge of command buffer management prevents us from making some important performance optimizations that only we can make. For example, consider the Mesh class that we developed in previous tutorials when we studied the Assimp library. Rendering a mesh meant that in each frame we had to submit the same group of draw commands when the only real change was a few matrices that controlled the transformation. For each draw command the driver had to do a considerable amount of work, and repeating that work in every frame is a waste of time. What if we could create a command buffer for this mesh class ahead of time and just submit it in each frame (while changing the matrices somehow)? That's the whole idea behind Vulkan. For the OpenGL driver the frame is just a series of GL commands and the driver doesn't understand what exactly the application is doing. It doesn't even know that these commands will be repeated in the next frame. Only the app designer understands what's going on and can create command buffers in a way that will match the structure of the application.
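The record-once, submit-every-frame idea can be illustrated without any Vulkan at all. What follows is only a toy C++ sketch: ToyCommandBuffer, Record(), Replay() and RenderFrames() are hypothetical names made up for this illustration, and a "command" is just a callable that appends a string to a log.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A toy "command": any callable that appends to a log (hypothetical, not Vulkan).
using ToyCommand = std::function<void(std::vector<std::string>&)>;

// Toy stand-in for a command buffer: commands are recorded once and can be
// replayed any number of times.
class ToyCommandBuffer {
public:
    // Store a command for later execution; nothing runs at record time.
    void Record(ToyCommand cmd) {
        assert(static_cast<bool>(cmd));  // an empty callable cannot be replayed
        m_commands.push_back(std::move(cmd));
    }

    // "Submit": execute every recorded command in recording order.
    void Replay(std::vector<std::string>& log) const {
        for (const ToyCommand& cmd : m_commands) {
            cmd(log);
        }
    }

private:
    std::vector<ToyCommand> m_commands;
};

// Record two commands once, then replay them for the given number of frames.
std::vector<std::string> RenderFrames(int frameCount) {
    ToyCommandBuffer cmdBuf;
    cmdBuf.Record([](std::vector<std::string>& log) { log.push_back("clear"); });
    cmdBuf.Record([](std::vector<std::string>& log) { log.push_back("draw mesh"); });

    std::vector<std::string> log;
    for (int frame = 0; frame < frameCount; frame++) {
        cmdBuf.Replay(log);  // no re-recording per frame
    }
    return log;
}
```

The recording cost is paid once, outside the frame loop, which is the saving the paragraph above describes.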

Another area where OpenGL never excelled is multi-threading. Submitting draw commands from different threads is possible but complex. The problem is that OpenGL was not designed with multi-threading in mind. Therefore, in most cases a graphics app has one rendering thread and uses multi-threading for the rest of the logic. Vulkan addresses multi-threading by allowing you to build command buffers concurrently and introduces the concepts of queues and semaphores to handle concurrency at the GPU level.

Let's get back to that render loop. By now you can imagine that what we are going to do is create a command buffer and add the clear instruction to it. What about swap buffers? We have been using GLUT/GLFW so we never gave much thought to it. GLUT/GLFW are not part of OpenGL. They are libraries built on top of windowing APIs such as GLX (Linux), WGL (Windows), EGL (Android) and CGL (Mac). They make it easy to build OS independent OpenGL programs. If you use the underlying APIs directly you will have to create an OpenGL context and window surface, which in general correspond to the instance and surface we created in the previous tutorial. The underlying APIs provide functions such as glXSwapBuffers() and eglSwapBuffers() in order to swap the front and back buffers that are hidden under the cover of the surface. They don't provide you much control beyond that.

Vulkan goes a step further by introducing the concepts of swap chain, images and presentation engine. The Vulkan spec describes the swap chain as an abstraction of an array of presentable images that are associated with the surface. The images are what is actually being displayed on the screen and only one can be displayed at a time. While an image is displayed the application is free to prepare the remaining images and queue them for presentation. The total number of images can also be controlled by the application.

The presentation engine represents the display on the platform. It is responsible for picking up images from the queue, presenting them on the screen and notifying the application when an image can be reused.

Now that we understand these concepts let's review what we need to add to the previous tutorial in order to make it clear the screen. Here are the one-time initialization steps:

  1. Get the command buffer queue from the logical device. Remember that the device create info included an array of VkDeviceQueueCreateInfo structures with the number of queues from each family to create. For simplicity we are using just one queue from the graphics family. So this queue was already created in the previous tutorial. We just need to get its handle.
  2. Create the swap chain and get the handles to its images.
  3. Create a command buffer and add the clear instruction to it.

And here's what we need to do in the render loop:

  1. Acquire the next image from the swap chain.
  2. Submit the command buffer.
  3. Submit a request to present the image.
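The three per-frame steps above can be pictured with a small model. This is a toy C++ sketch and not the Vulkan API: ToySwapChain, AcquireNextImage(), Present() and RunFrames() are hypothetical names, and the model assumes the image that was on screen is released as soon as the next one is presented. A real presentation engine is free to hand images back in a different order.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy model of a swap chain (hypothetical, not the Vulkan API).
class ToySwapChain {
public:
    explicit ToySwapChain(std::size_t imageCount) {
        assert(imageCount >= 2);  // the model needs one image on screen and one to render to
        for (std::size_t i = 0; i < imageCount; i++) {
            m_free.push_back(i);
        }
    }

    // Step 1: take the next image the application may render to.
    std::size_t AcquireNextImage() {
        assert(!m_free.empty());
        std::size_t index = m_free.front();
        m_free.pop_front();
        return index;
    }

    // Step 3: hand the image to the "presentation engine". The image that
    // was on screen before is released and becomes available again.
    void Present(std::size_t index) {
        if (m_hasDisplayed) {
            m_free.push_back(m_displayed);
        }
        m_displayed = index;
        m_hasDisplayed = true;
    }

private:
    std::deque<std::size_t> m_free;
    std::size_t m_displayed = 0;
    bool m_hasDisplayed = false;
};

// Run the per-frame steps and return the order in which images were shown.
std::vector<std::size_t> RunFrames(std::size_t imageCount, int frameCount) {
    ToySwapChain swapChain(imageCount);
    std::vector<std::size_t> shown;
    for (int frame = 0; frame < frameCount; frame++) {
        std::size_t index = swapChain.AcquireNextImage();  // 1. acquire
        // 2. submit the command buffer recorded for this image (omitted in the model)
        swapChain.Present(index);                          // 3. present
        shown.push_back(index);
    }
    return shown;
}
```

With two images the model simply alternates between them, which is the double buffering behavior the tutorial sets up later.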

Now let's review the code to accomplish this.

Source walkthru

All the logic that needs to be developed for this tutorial will go into the following class:

class OgldevVulkanApp
{
public:

    OgldevVulkanApp(const char* pAppName);

    ~OgldevVulkanApp();

    void Init();

    void Run();

private:

    void CreateSwapChain();
    void CreateCommandBuffer();
    void RecordCommandBuffers();
    void RenderScene();

    std::string m_appName;
    VulkanWindowControl* m_pWindowControl;
    OgldevVulkanCore m_core;
    std::vector<VkImage> m_images;
    VkSwapchainKHR m_swapChainKHR;
    VkQueue m_queue;
    std::vector<VkCommandBuffer> m_cmdBufs;
    VkCommandPool m_cmdBufPool;
};

What we have here are a couple of public functions (Init() and Run()) that will be called from main() later on and several private member functions that are based on the steps that were described in the previous section. In addition, there are a few private member variables. The VulkanWindowControl and OgldevVulkanCore objects, which were part of the main() function in the previous tutorial, were moved here. We also have a vector of images, a swap chain object, a command queue, a vector of command buffers and a command buffer pool. Now let's look at the Init() function:

void OgldevVulkanApp::Init()
{
#ifdef WIN32
    m_pWindowControl = new Win32Control(m_appName.c_str());
#else
    m_pWindowControl = new XCBControl();
#endif
    m_pWindowControl->Init(WINDOW_WIDTH, WINDOW_HEIGHT);

    m_core.Init(m_pWindowControl);

    // Fetch queue 0 of the selected queue family (created in the previous tutorial)
    vkGetDeviceQueue(m_core.GetDevice(), m_core.GetQueueFamily(), 0, &m_queue);

    CreateSwapChain();
    CreateCommandBuffer();
    RecordCommandBuffers();
}

This function starts in a similar fashion to the previous tutorial by creating and initializing the window control and Vulkan core objects. After that we call the private members to create the swap chain and command buffer and to record the clear instruction into the command buffer. Note the call to vkGetDeviceQueue(). This Vulkan function fetches the handle of a VkQueue object from the device. The first three parameters are the device, the index of the queue family and the index of the queue in that queue family (zero in our case because there is only one queue). The driver returns the result in the last parameter. The two getter functions used here, GetDevice() and GetQueueFamily(), were added to the Vulkan core object in this tutorial.

Let's review the creation of the swap chain step by step:

void OgldevVulkanApp::CreateSwapChain()
{
    const VkSurfaceCapabilitiesKHR& SurfaceCaps = m_core.GetSurfaceCaps();

    assert(SurfaceCaps.currentExtent.width != -1);

The first thing we need to do is to fetch the surface capabilities from the Vulkan core object. Remember that in the previous tutorial we populated a physical device database in the Vulkan core object with info about all the physical devices in the system. Some of that info was not generic but specific to the combination of the physical device and the surface that was created earlier. An example is the VkSurfaceCapabilitiesKHR vector which contains a VkSurfaceCapabilitiesKHR structure for each physical device. The function GetSurfaceCaps() indexes into that vector using the physical device index (which was selected in the previous tutorial). The VkSurfaceCapabilitiesKHR structure contains a lot of info on the surface. The currentExtent member describes the current size of the surface. Its type is VkExtent2D, which contains a width and a height. Theoretically, the current extent should contain the dimensions that we set when creating the surface and I have found that to be true on both Linux and Windows. In several examples (including the one in the Khronos SDK) I saw some logic which checks whether the width of the current extent is -1 and if so overwrites that with the desired dimensions. I found that logic to be redundant so I just placed the assert you see above.
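For readers who hit that assert, the fallback logic mentioned above might look roughly like the following. This is a minimal sketch rather than the tutorial's code: Extent2D and SurfaceCaps are local stand-ins that mirror a few fields of VkExtent2D and VkSurfaceCapabilitiesKHR so the snippet compiles without the Vulkan headers, and ChooseExtent() is a hypothetical helper. The width of -1 in the assert is the same value as 0xFFFFFFFF once converted to an unsigned 32-bit integer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Local stand-ins for VkExtent2D and part of VkSurfaceCapabilitiesKHR.
struct Extent2D {
    std::uint32_t width;
    std::uint32_t height;
};

struct SurfaceCaps {
    Extent2D currentExtent;
    Extent2D minImageExtent;
    Extent2D maxImageExtent;
};

// Special width/height meaning "the surface size is determined by the swap chain".
constexpr std::uint32_t kUndefinedExtent = 0xFFFFFFFFu;

// Hypothetical helper: use the surface's current extent when it is defined,
// otherwise fall back to the desired size clamped to the supported range.
Extent2D ChooseExtent(const SurfaceCaps& caps, Extent2D desired) {
    if (caps.currentExtent.width != kUndefinedExtent) {
        return caps.currentExtent;
    }
    assert(caps.minImageExtent.width <= caps.maxImageExtent.width);
    assert(caps.minImageExtent.height <= caps.maxImageExtent.height);

    Extent2D result;
    result.width = std::max(caps.minImageExtent.width,
                            std::min(desired.width, caps.maxImageExtent.width));
    result.height = std::max(caps.minImageExtent.height,
                             std::min(desired.height, caps.maxImageExtent.height));
    return result;
}
```

The result would then be assigned to the imageExtent member of the swap chain create info instead of using currentExtent directly.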

    uint NumImages = 2;

    assert(NumImages >= SurfaceCaps.minImageCount);
    assert(NumImages <= SurfaceCaps.maxImageCount);

Next we set the number of images that we will create in the swap chain to 2. This mimics the behavior of double buffering in OpenGL. I added assertions to make sure that this number is within the valid range of the platform. I assume that you won't hit these assertions but if you do you can try with one image only.
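An alternative to asserting is to clamp the requested count into the supported range. The sketch below assumes a hypothetical helper named ChooseImageCount() that takes the two limits as plain integers. One detail worth knowing is that a maxImageCount of 0 in VkSurfaceCapabilitiesKHR means there is no upper limit, in which case the second assertion above would fire even though 2 images are fine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: clamp the desired image count into the range the
// surface supports. A maxImageCount of 0 means "no upper limit".
std::uint32_t ChooseImageCount(std::uint32_t desired,
                               std::uint32_t minImageCount,
                               std::uint32_t maxImageCount) {
    assert(maxImageCount == 0 || minImageCount <= maxImageCount);

    std::uint32_t count = std::max(desired, minImageCount);
    if (maxImageCount != 0) {
        count = std::min(count, maxImageCount);
    }
    return count;
}
```

In the tutorial's code this would be called with SurfaceCaps.minImageCount and SurfaceCaps.maxImageCount, and the result stored in NumImages.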

    VkSwapchainCreateInfoKHR SwapChainCreateInfo = {};

    SwapChainCreateInfo.sType            = VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR;
    SwapChainCreateInfo.surface          = m_core.GetSurface();
    SwapChainCreateInfo.minImageCount    = NumImages;

The function that creates the swap chain takes most of its parameters from the VkSwapchainCreateInfoKHR structure. The first three members are obvious - the structure type, the surface handle and the number of images. Once created the swap chain is permanently attached to the same surface.

    SwapChainCreateInfo.imageFormat      = m_core.GetSurfaceFormat().format;
    SwapChainCreateInfo.imageColorSpace  = m_core.GetSurfaceFormat().colorSpace;

Next come the image format and color space. The image format was discussed in the previous tutorial. It describes the layout of data in image memory. It contains stuff such as channels (red, green and/or blue) and format (float, normalized int, etc). The color space describes the way the values are matched to colors. For example, this can be linear or sRGB. We will take both from the physical device database.

    SwapChainCreateInfo.imageExtent      = SurfaceCaps.currentExtent;

We can create the swap chain with a different size than the surface. For now, just grab the current extent from the surface capabilities structure.

    SwapChainCreateInfo.imageUsage       = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT;

We need to tell the driver how we are going to use this swap chain. We do that by specifying a combination of usage bits, of which there are 8 in total. For example, the swap chain images can be used as a source or destination of a transfer (buffer copy) operation, as a depth stencil attachment, etc. We just want a standard color buffer so we use the bit above.
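Usage bits are combined with bitwise OR and can be checked against the supportedUsageFlags member of the surface capabilities. The following is a small sketch under assumptions: the three constants are local copies of the corresponding VkImageUsageFlagBits values so that the snippet compiles without vulkan.h, and UsageSupported() is a hypothetical helper. As a side note, the Vulkan specification lists VK_IMAGE_USAGE_TRANSFER_DST_BIT as a requirement for images cleared with vkCmdClearColorImage(), so a stricter setup would probably OR that bit into imageUsage as well.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of three VkImageUsageFlagBits values (so no Vulkan header is needed).
constexpr std::uint32_t kUsageTransferSrc     = 0x00000001;
constexpr std::uint32_t kUsageTransferDst     = 0x00000002;
constexpr std::uint32_t kUsageColorAttachment = 0x00000010;

// Hypothetical helper: true when every requested usage bit is also present
// in the mask of usages that the surface supports.
bool UsageSupported(std::uint32_t requested, std::uint32_t supported) {
    assert(requested != 0);  // a swap chain must request at least one usage
    return (requested & supported) == requested;
}
```

For example, UsageSupported(kUsageColorAttachment | kUsageTransferDst, supportedMask) would tell whether both bits can be requested together.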

    SwapChainCreateInfo.preTransform     = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR;

The preTransform field was designed for handheld devices that can change their orientation (cellular phones and tablets). It specifies how the orientation must be changed before presentation (90 degrees, 180 degrees, etc). It is more relevant to Android so we just tell the driver not to do any orientation change.

    SwapChainCreateInfo.imageArrayLayers = 1;

imageArrayLayers is intended for stereoscopic applications where rendering takes place from more than one location and the results are combined before presentation. An example is VR where you want to render the scene from each eye separately. We are not going to do that today so just specify 1.

    SwapChainCreateInfo.imageSharingMode = VK_SHARING_MODE_EXCLUSIVE;

Swap chain images can be shared by queues of different families. We don't need that, so we request exclusive access: the images will only be used by queues of the queue family we selected previously.

    SwapChainCreateInfo.presentMode      = VK_PRESENT_MODE_FIFO_KHR;

Earlier we briefly touched on the presentation engine, which is the part of the platform involved in actually taking the swap chain image and putting it on the screen. This engine also exists in OpenGL where it is quite limited in comparison to Vulkan. In OpenGL you can select between single and double buffering. Double buffering avoids tearing by switching the buffers only on VSync and you have some control over the number of VSyncs per second. That's it. Vulkan, however, provides you with no less than four different modes of operation that allow a higher level of flexibility and performance. We will be conservative here and use the FIFO mode which is the most similar to OpenGL double buffering.

    SwapChainCreateInfo.clipped          = true;

The clipped field indicates whether the driver can discard parts of the image that are outside of the visible surface. There are some obscure cases where this matters but not in our case.

    SwapChainCreateInfo.compositeAlpha   = VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR;

compositeAlpha controls the manner in which the image is combined with other surfaces. This is only relevant on some operating systems so we simply treat the image as opaque.

    VkResult res = vkCreateSwapchainKHR(m_core.GetDevice(), &SwapChainCreateInfo, NULL, &m_swapChainKHR);
    CHECK_VULKAN_ERROR("vkCreateSwapchainKHR error %d\n", res);

Finally, we can create the swap chain and get its handle.

    uint NumSwapChainImages = 0;
    res = vkGetSwapchainImagesKHR(m_core.GetDevice(), m_swapChainKHR, &NumSwapChainImages, NULL);
    CHECK_VULKAN_ERROR("vkGetSwapchainImagesKHR error %d\n", res);

When we created the swap chain we specified the minimum number of images it should contain. In the above call we fetch the actual number of images that were created. Passing NULL as the last parameter tells the function to return only the count.

    m_images.resize(NumSwapChainImages);
    m_cmdBufs.resize(NumSwapChainImages);

    // Second call: this time the image handles are written into m_images
    res = vkGetSwapchainImagesKHR(m_core.GetDevice(), m_swapChainKHR, &NumSwapChainImages, &(m_images[0]));
    CHECK_VULKAN_ERROR("vkGetSwapchainImagesKHR error %d\n", res);
}

We have to get the handles of all the swap chain images so we resize the image handle vector accordingly. We also resize the command buffer vector because we will record a dedicated command buffer for each image in the swap chain.
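The pair of vkGetSwapchainImagesKHR() calls follows a count-then-fill idiom that shows up throughout Vulkan. Here is a self-contained sketch of that idiom with a mock in place of the real call. MockGetImages() and FetchImages() are hypothetical, and the "handles" are plain integers chosen for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an enumeration call such as vkGetSwapchainImagesKHR:
// when pItems is null it only reports the count, otherwise it writes up to
// *pCount items and updates *pCount to the number actually written.
void MockGetImages(std::uint32_t* pCount, int* pItems) {
    assert(pCount != nullptr);
    static const int kImages[] = {100, 101, 102};  // pretend image handles
    const std::uint32_t available = 3;

    if (pItems == nullptr) {
        *pCount = available;
        return;
    }
    *pCount = std::min(*pCount, available);
    std::copy(kImages, kImages + *pCount, pItems);
}

// The two-call idiom: query the count, size the vector, then fetch the items.
std::vector<int> FetchImages() {
    std::uint32_t count = 0;
    MockGetImages(&count, nullptr);        // first call: count only
    std::vector<int> images(count);
    MockGetImages(&count, images.data());  // second call: fill the array
    images.resize(count);                  // keep only what was written
    return images;
}
```

Sizing the vector between the two calls is what keeps the second call from writing past the end of the array.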

The following function creates the command buffers:

void OgldevVulkanApp::CreateCommandBuffer()
{
    VkCommandPoolCreateInfo cmdPoolCreateInfo = {};
    cmdPoolCreateInfo.sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO;
    cmdPoolCreateInfo.queueFamilyIndex = m_core.GetQueueFamily();

    VkResult res = vkCreateCommandPool(m_core.GetDevice(), &cmdPoolCreateInfo, NULL, &m_cmdBufPool);
    CHECK_VULKAN_ERROR("vkCreateCommandPool error %d\n", res);

Command buffers are not created directly. Instead, they must be allocated from pools. As expected, the motivation is performance. By making command buffers part of a pool, better memory management and reuse can be implemented. It is important to note that the pools are not thread safe. This means that any action on the pool or its command buffers must be explicitly synchronized by the application. So if you want multiple threads to create command buffers in parallel you can either do this synchronization or simply create a different pool for each thread.

The function vkCreateCommandPool() creates the pool. It takes a VkCommandPoolCreateInfo structure parameter whose most important member is the queue family index. All command buffers allocated from this pool must be submitted to queues from this queue family.

    VkCommandBufferAllocateInfo cmdBufAllocInfo = {};
    cmdBufAllocInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
    cmdBufAllocInfo.commandPool = m_cmdBufPool;
    cmdBufAllocInfo.commandBufferCount = m_images.size();
    cmdBufAllocInfo.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY;

    res = vkAllocateCommandBuffers(m_core.GetDevice(), &cmdBufAllocInfo, &m_cmdBufs[0]);
    CHECK_VULKAN_ERROR("vkAllocateCommandBuffers error %d\n", res);
}

We are now ready to create the command buffers. In the VkCommandBufferAllocateInfo structure we specify the pool we have just created and the number of command buffers (we need a dedicated command buffer per image in the swap chain). We also specify whether this is a primary or secondary command buffer. Primary command buffers are the common vehicle for submitting commands to the GPU but they cannot reference each other. This means that you can have two very similar command buffers but you still need to record everything into each one. You cannot share the common stuff between them. This is where secondary command buffers come in. They cannot be directly submitted to the queues but they can be referenced by primary command buffers which solves the problem of sharing. At this point we only need primary command buffers.
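The sharing relationship between primary and secondary command buffers can be pictured with a toy model. Nothing below is Vulkan code: ToySecondary, ToyPrimary and Flatten() are hypothetical names, and commands are plain strings. The model assumes a primary buffer's own commands come first, followed by the commands of every secondary buffer it references.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy secondary buffer: holds commands that several primaries want to share.
struct ToySecondary {
    std::vector<std::string> commands;
};

// Toy primary buffer: its own commands plus references to secondaries.
struct ToyPrimary {
    std::vector<std::string> ownCommands;
    std::vector<const ToySecondary*> referenced;

    // Produce the full command stream: own commands first, then the
    // commands of each referenced secondary buffer in order.
    std::vector<std::string> Flatten() const {
        std::vector<std::string> out = ownCommands;
        for (const ToySecondary* secondary : referenced) {
            assert(secondary != nullptr);
            out.insert(out.end(), secondary->commands.begin(), secondary->commands.end());
        }
        return out;
    }
};
```

Two primaries that differ only in their first command can both point at the same ToySecondary, so the shared part is recorded once.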

Now let's record the clear instruction into our new command buffers.

void OgldevVulkanApp::RecordCommandBuffers()
{
    VkCommandBufferBeginInfo beginInfo = {};
    beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
    beginInfo.flags = VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT;

Recording of command buffers must be done inside a region of the code explicitly delimited by calls to vkBeginCommandBuffer() and vkEndCommandBuffer(). In the VkCommandBufferBeginInfo structure we have a field named 'flags' where we tell the driver that the command buffers will be resubmitted to the queue over and over again. There are other usage models but for now we don't need them.

    VkClearColorValue clearColor = { 164.0f/256.0f, 30.0f/256.0f, 34.0f/256.0f, 0.0f };
    VkClearValue clearValue = {};
    clearValue.color = clearColor;

We have to specify our clear color using the two structures above. The first one is a union of four float/int/uint values which allows different ways to do that. The second one is a union of a VkClearColorValue structure and a VkClearDepthStencilValue structure. This scheme is used in parts of the API that can take either of the two structures. We go with the color case. Since I'm very creative today I used the RGB values from the color of the Vulkan logo ;-)

Note that each color channel goes from 0 (darkest) to 1 (brightest) and that this endless spectrum of real numbers is divided into 256 discrete segments, which is why I divide by 256.
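A common variant of this conversion divides by 255 instead, so that the largest 8-bit value maps to exactly 1.0. The sketch below shows that variant. NormalizeChannel() and NormalizeColor() are hypothetical helpers, not part of the tutorial's code, and the difference from dividing by 256 is far too small to see in a clear color.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper: map an 8-bit channel value (0..255) to a float in [0, 1].
float NormalizeChannel(int value) {
    assert(value >= 0 && value <= 255);
    return static_cast<float>(value) / 255.0f;
}

// Convert 8-bit RGBA channels to the four floats a clear color expects.
std::array<float, 4> NormalizeColor(int r, int g, int b, int a) {
    std::array<float, 4> color = {
        NormalizeChannel(r), NormalizeChannel(g), NormalizeChannel(b), NormalizeChannel(a)
    };
    return color;
}
```

NormalizeColor(164, 30, 34, 0) gives the logo color used above, with the red channel at roughly 0.643.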

    VkImageSubresourceRange imageRange = {};
    imageRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
    imageRange.levelCount = 1;
    imageRange.layerCount = 1;

We need to specify the range of the image that we want to clear. In future tutorials we will study more complex schemes where there will be multiple mipmap levels, layers, etc. For now we just want the basics so we specify one mipmap level and one layer. The aspectMask field tells the driver whether to clear the color, depth or stencil (or a combination of them). We are only interested in the color aspect of the images.

    for (uint i = 0 ; i < m_cmdBufs.size() ; i++) {
        VkResult res = vkBeginCommandBuffer(m_cmdBufs[i], &beginInfo);
        CHECK_VULKAN_ERROR("vkBeginCommandBuffer error %d\n", res);

        vkCmdClearColorImage(m_cmdBufs[i], m_images[i], VK_IMAGE_LAYOUT_GENERAL, &clearColor, 1, &imageRange);

        res = vkEndCommandBuffer(m_cmdBufs[i]);
        CHECK_VULKAN_ERROR("vkEndCommandBuffer error %d\n", res);
    }
}

We are now ready to record the command buffers. As mentioned earlier, the commands that do the actual recording must be inside a block marked by calls that begin and end a command buffer. For that we specify the command buffer to record to and the beginInfo structure which we already prepared. Since we have an array of command buffers (one buffer per swap chain image) the entire thing is enclosed inside a for loop. vkCmdClearColorImage() records the clear instruction into the command buffer. As parameters it takes the command buffer to record to, the target image, the layout of the image in memory, the clear color, the number of VkImageSubresourceRange structures to use and a pointer to an array of these structures (only one in our case).

We prepared everything we need and we can now code our main render function. In standard OpenGL this usually means specifying a list of GL commands to draw stuff followed by a swap buffers call (be it GLUT, GLFW or any other windowing API). For the driver it means a tedious repetition of command buffer recording and submission where changes from one frame to the next are relatively small (changes in shader matrices, etc). But in Vulkan all our command buffers are already recorded! We just need to queue them to the GPU. Since we have to be more verbose in Vulkan we also need to manage how we acquire an image for rendering and how we tell the presentation engine to display it.

void OgldevVulkanApp::RenderScene()
{
    uint ImageIndex = 0;

    // UINT64_MAX: block until an image is available; no semaphore or fence is used yet
    VkResult res = vkAcquireNextImageKHR(m_core.GetDevice(), m_swapChainKHR, UINT64_MAX, NULL, NULL, &ImageIndex);
    CHECK_VULKAN_ERROR("vkAcquireNextImageKHR error %d\n", res);

The first thing we need to do is to acquire an image from the presentation engine which is available for rendering. We can acquire more than one image (e.g. if we plan to render two or more frames ahead) in an advanced scenario but for now one image will be enough. The API call above takes the device and swap chain as the first two parameters, respectively. The third parameter is the amount of time we're prepared to wait until that function returns. Often, the presentation engine cannot provide an image immediately because it needs to wait for an image to be released or for some internal OS or GPU event (e.g. the VSync signal of the display). If we specify zero we make this a non-blocking call which means that if an image is available we get it immediately and if not the function returns right away without one (with the VK_NOT_READY status). Any value above zero and below the maximum value of an unsigned 64-bit integer is a timeout in nanoseconds. The value of UINT64_MAX will cause the function to return only when an image becomes available (or some internal error occurred). This seems like the safest course of action for us here. The next two parameters are a semaphore and a fence, respectively. Vulkan was designed with a lot of asynchronous operation in mind. This means that you can define inter-dependencies between queues on the GPU, between the CPU and GPU, etc. This allows you to submit work to the image even if it is not really ready to be rendered to (which is a bit counterintuitive given what vkAcquireNextImageKHR is supposed to do but can still happen). The semaphore and fence are synchronization primitives that must be waited upon before the actual rendering to the image can begin. A semaphore syncs between stuff on the GPU and a fence syncs between the host CPU and the GPU. As you can see, I've specified NULL in both cases, which might be unsafe and theoretically is not supposed to work, yet it does. This may be because of the simplicity of our application. It allowed me to postpone all the synchronization business to a later date. Please let me know if you encounter problems because of this. The last parameter to the function is the index of the image that became available.
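The three timeout regimes described above can be summarized in a few lines. This is only an illustrative sketch: DescribeTimeout() is a hypothetical helper that classifies a timeout value and does not call into Vulkan.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: describe how an acquire call would treat a timeout
// value, following the three cases in the text.
std::string DescribeTimeout(std::uint64_t timeoutNs) {
    if (timeoutNs == 0) {
        return "non-blocking";       // return immediately, with or without an image
    }
    if (timeoutNs == UINT64_MAX) {
        return "wait indefinitely";  // return only when an image is available
    }
    assert(timeoutNs > 0 && timeoutNs < UINT64_MAX);
    return "wait up to " + std::to_string(timeoutNs) + " ns";
}
```

A value such as 1000000 therefore means a wait of at most one millisecond.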

    VkSubmitInfo submitInfo = {};
    submitInfo.sType                = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    submitInfo.commandBufferCount   = 1;
    submitInfo.pCommandBuffers      = &m_cmdBufs[ImageIndex];   // the buffer recorded for the acquired image
    
    // One VkSubmitInfo, no fence: the host gets no completion signal for this submission.
    res = vkQueueSubmit(m_queue, 1, &submitInfo, NULL);
    CHECK_VULKAN_ERROR("vkQueueSubmit error %d\n", res);

 

Now that we have an image, let's submit the work to the queue. The vkQueueSubmit() function takes the handle of a queue, the number of VkSubmitInfo structures and a pointer to the corresponding array. The last parameter is a fence, which we will conveniently ignore for now. VkSubmitInfo actually contains 8 members in addition to the standard sType, but we are going to use only 2 (so just imagine how much complexity is still down there). We specify that we have one command buffer and provide its address (the one that corresponds to the acquired image). The Vulkan spec notes that submitting work can have a high overhead and encourages us to pack as many command buffers as we can into a single call to minimize that overhead. In this simple example we don't have an opportunity to do that, but we should keep it in mind as our application becomes more complex.
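To make the batching advice concrete, here is a minimal sketch of describing several command buffers in one submission. It assumes nothing beyond the two members used in the tutorial. `FakeCommandBuffer`, `FakeSubmitInfo` and `MakeBatchedSubmit` are hypothetical stand-ins so that the example compiles without the Vulkan SDK. In a real application the types would be VkCommandBuffer and VkSubmitInfo, and the result would be passed to vkQueueSubmit().

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-ins (assumptions for illustration only). They mirror the two
// VkSubmitInfo members that the tutorial fills in.
using FakeCommandBuffer = std::uintptr_t;

struct FakeSubmitInfo {
    std::uint32_t            commandBufferCount = 0;
    const FakeCommandBuffer* pCommandBuffers    = nullptr;
};

// Describes every recorded command buffer in a single submit info, so one
// submission call can hand the whole batch to the queue instead of one call
// per buffer. The vector must outlive the returned struct, which only
// borrows its storage.
inline FakeSubmitInfo MakeBatchedSubmit(const std::vector<FakeCommandBuffer>& cmdBufs)
{
    assert(!cmdBufs.empty());
    FakeSubmitInfo info;
    info.commandBufferCount = static_cast<std::uint32_t>(cmdBufs.size());
    info.pCommandBuffers    = cmdBufs.data();
    return info;
}
```

With three recorded buffers, `MakeBatchedSubmit` yields a count of 3 and a pointer to the first element, which is the same shape the tutorial builds by hand for a single buffer.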

    VkPresentInfoKHR presentInfo = {};
    presentInfo.sType              = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR;
    presentInfo.swapchainCount     = 1;
    presentInfo.pSwapchains        = &m_swapChainKHR;
    presentInfo.pImageIndices      = &ImageIndex;
    
    res = vkQueuePresentKHR(m_queue, &presentInfo);
    CHECK_VULKAN_ERROR("vkQueuePresentKHR error %d\n" , res);
}

 

Once the previous API call has returned we know that the command buffer is on its way to the GPU queue, but we have no idea when exactly it is going to be executed, and frankly, we don't really care. Command buffers in a queue are guaranteed to be processed in the order of submission, and since we submit a present command after the clear command into the same queue we know that the image will be cleared before it is presented. So the vkQueuePresentKHR() call is basically a marker that ends the frame and tells the presentation engine to display it. This function takes two parameters: a queue which has presentation capabilities (we took care of that when initializing the device and queue) and a pointer to a VkPresentInfoKHR structure. This structure contains, among other things, two arrays of equal size: a swap chain array and an image index array. This means that you can queue a present command to multiple swap chains, where each swap chain is connected to a different window. Every swap chain in the array has a corresponding image index which specifies which image will be presented. The swapchainCount member says how many swap chains and images we are going to present.
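The pairing between the swap chain array and the image index array can be sketched as follows. This is an illustration under stated assumptions, not the tutorial's code: `FakeSwapchain`, `FakePresentInfo` and `MakePresentInfo` are hypothetical stand-ins that mirror the three VkPresentInfoKHR members used above, so the example compiles without the Vulkan SDK.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-ins (assumptions for illustration only) for VkSwapchainKHR and the
// VkPresentInfoKHR members that the tutorial sets.
using FakeSwapchain = std::uintptr_t;

struct FakePresentInfo {
    std::uint32_t        swapchainCount = 0;
    const FakeSwapchain* pSwapchains    = nullptr;
    const std::uint32_t* pImageIndices  = nullptr;
};

// Element i of imageIndices names the image to present on swapchains[i], so
// the two arrays must have the same length. swapchainCount is that shared
// length. Both vectors must outlive the returned struct.
inline FakePresentInfo MakePresentInfo(const std::vector<FakeSwapchain>& swapchains,
                                       const std::vector<std::uint32_t>& imageIndices)
{
    assert(swapchains.size() == imageIndices.size());
    FakePresentInfo info;
    info.swapchainCount = static_cast<std::uint32_t>(swapchains.size());
    info.pSwapchains    = swapchains.data();
    info.pImageIndices  = imageIndices.data();
    return info;
}
```

The tutorial's single-window case is the same structure with both arrays of length one, which is why it can pass the addresses of two plain variables.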

void OgldevVulkanApp::Run()
{
    while (true) {
        RenderScene();
    }
}

 

Our main render function is very simple. We loop endlessly and call the function that we have just reviewed.

int main(int argc, char** argv)
{
    OgldevVulkanApp app("Tutorial 51");
    
    app.Init();
    
    app.Run();
    
    return 0;
}

 

The main function is also very simple. We declare an OgldevVulkanApp object, initialize it and run it.

That's it for today. I hope that your window is clear. Next time we will draw a triangle.

 
