CUDA Programming Figures

Figures from the CUDA C++ Programming Guide

 Figure 7. Matrix Multiplication without Shared Memory

 Figure 8. Matrix Multiplication with Shared Memory
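The difference between the two matrix multiplication schemes captioned above can be sketched on the CPU. This is a rough host-side model, not the guide's kernel: `matmul_untiled`, `matmul_tiled`, `TILE`, and the `global_loads` counter are hypothetical names, plain local arrays stand in for `__shared__` tiles, and square matrices whose size is a multiple of `TILE` are assumed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CPU model; TILE plays the role of the thread-block edge length.
constexpr int TILE = 4;

// Untiled scheme: each output element walks a full row of A and a full column
// of B, fetching every operand from "global" memory (the std::vector).
std::vector<float> matmul_untiled(const std::vector<float>& A,
                                  const std::vector<float>& B,
                                  int n, long& global_loads) {
    std::vector<float> C(n * n, 0.0f);
    for (int row = 0; row < n; ++row)
        for (int col = 0; col < n; ++col) {
            float acc = 0.0f;
            for (int k = 0; k < n; ++k) {
                acc += A[row * n + k] * B[k * n + col];
                global_loads += 2;  // one element of A, one element of B
            }
            C[row * n + col] = acc;
        }
    return C;
}

// Tiled scheme: each "block" stages one TILE x TILE tile of A and of B into
// local arrays (standing in for shared memory), then reuses them TILE times.
std::vector<float> matmul_tiled(const std::vector<float>& A,
                                const std::vector<float>& B,
                                int n, long& global_loads) {
    assert(n % TILE == 0);  // assumption: size is a multiple of the tile edge
    std::vector<float> C(n * n, 0.0f);
    for (int br = 0; br < n / TILE; ++br)
        for (int bc = 0; bc < n / TILE; ++bc) {
            float acc[TILE][TILE] = {};
            for (int t = 0; t < n / TILE; ++t) {
                float As[TILE][TILE], Bs[TILE][TILE];
                // Load phase: one global read per tile element.
                for (int r = 0; r < TILE; ++r)
                    for (int c = 0; c < TILE; ++c) {
                        As[r][c] = A[(br * TILE + r) * n + (t * TILE + c)];
                        Bs[r][c] = B[(t * TILE + r) * n + (bc * TILE + c)];
                        global_loads += 2;
                    }
                // Compute phase: operands come only from the staged tiles.
                for (int r = 0; r < TILE; ++r)
                    for (int c = 0; c < TILE; ++c)
                        for (int k = 0; k < TILE; ++k)
                            acc[r][c] += As[r][k] * Bs[k][c];
            }
            for (int r = 0; r < TILE; ++r)
                for (int c = 0; c < TILE; ++c)
                    C[(br * TILE + r) * n + (bc * TILE + c)] = acc[r][c];
        }
    return C;
}
```

In this model the tiled version issues `TILE` times fewer global loads than the untiled one while producing the same product, which is the reuse that the shared-memory version is meant to exploit. On a real device the load and compute phases would additionally be separated by `__syncthreads()`.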

 Figure 20. Examples of Global Memory Accesses by a Warp, 4-Byte Word per Thread, and Associated Memory Transactions for Compute Capabilities 3.x and Beyond
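The relationship between a warp's addresses and the resulting memory transactions can be approximated with a small host-side model. The sketch below assumes a 32-byte transaction granularity and simply counts the distinct 32-byte segments touched by 32 threads. `count_segments` and `sequential_warp` are hypothetical helper names, and real hardware behavior also depends on cache configuration, which this model ignores.

```cpp
#include <cstdint>
#include <set>
#include <vector>

constexpr int kWarpSize = 32;
constexpr std::uint64_t kSegmentBytes = 32;  // assumed transaction granularity

// Counts the distinct 32-byte segments touched by the given per-thread byte
// addresses, each thread reading word_bytes bytes.
int count_segments(const std::vector<std::uint64_t>& byte_addresses,
                   std::uint64_t word_bytes = 4) {
    std::set<std::uint64_t> segments;
    for (std::uint64_t addr : byte_addresses) {
        segments.insert(addr / kSegmentBytes);
        // A word that straddles a segment boundary touches the next one too.
        segments.insert((addr + word_bytes - 1) / kSegmentBytes);
    }
    return static_cast<int>(segments.size());
}

// Thread i of the warp reads the 4-byte word at base + 4 * i.
std::vector<std::uint64_t> sequential_warp(std::uint64_t base) {
    std::vector<std::uint64_t> addrs;
    for (int lane = 0; lane < kWarpSize; ++lane)
        addrs.push_back(base + 4u * static_cast<std::uint64_t>(lane));
    return addrs;
}
```

Under these assumptions, an aligned sequential warp starting at byte 128 touches four segments, any permutation of the same addresses still touches four, and shifting the base to byte 132 makes the warp spill into a fifth segment.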

 Figure 21. Strided Shared Memory Accesses. Examples for devices of compute capability 3.x (in 32-bit mode) or compute capability 5.x and 6.x

 Figure 22. Irregular Shared Memory Accesses. Examples for devices of compute capability 3.x, 5.x, or 6.x.
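The strided and irregular shared memory patterns captioned above can be explored with a simple bank model. The sketch assumes 32 banks with successive 32-bit words mapped to successive banks, and treats threads that read the same word as a broadcast rather than a conflict. `conflict_degree` and `strided_warp` are hypothetical names, and word indices are assumed to be non-negative.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <vector>

constexpr int kNumBanks = 32;      // assumed: 32 banks, one 32-bit word wide
constexpr int kLanesPerWarp = 32;

// Returns the worst-case number of distinct words requested from one bank.
// 1 means conflict-free; N means an N-way bank conflict.
int conflict_degree(const std::vector<int>& word_indices) {
    std::map<int, std::set<int>> words_per_bank;
    for (int word : word_indices)
        words_per_bank[word % kNumBanks].insert(word);  // same word: broadcast
    int worst = 1;
    for (const auto& entry : words_per_bank)
        worst = std::max(worst, static_cast<int>(entry.second.size()));
    return worst;
}

// Thread i of the warp accesses word i * stride.
std::vector<int> strided_warp(int stride) {
    std::vector<int> words;
    for (int lane = 0; lane < kLanesPerWarp; ++lane)
        words.push_back(lane * stride);
    return words;
}
```

With this model, strides of one and three words are conflict-free, a stride of two words gives a two-way conflict, and a warp in which every thread reads the same word counts as a conflict-free broadcast.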

 

Reference link:

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#arithmetic-instructions__throughput-native-arithmetic-instructions
