A Summary and Comparison of Some Nvidia GPU Specifications

Desktop GPUs: Titan

GeForce GTX Titan Black
capability 3.5 (Kepler)
global memory: 6081 MB, L2 cache: 1.5 MB
cores: 15 (multiprocessors) * 192 (6*32) = 2880 cores
max size of a thread block: (1024 * 1024 * 64)
max size of a grid: (2147483647 * 65535 * 65535)
Titan V
capability 7.0 (Volta)

Desktop GPUs: GTX

GTX 1060:
capability 6.1 (Pascal)
GPU clock: 1.7 GHz, memory clock: 4 GHz, memory bus width: 192-bit
global memory: 3014 MB, L2 cache: 1.5 MB
cores: 9 (multiprocessors) * 128 (4*32) = 1152 cores
max size of a thread block: (1024 * 1024 * 64)
max size of a grid: (2147483647 * 65535 * 65535)
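GP106 core: 1280 CUDA cores (??), 106 texture units, 48 raster units (ROPs), 192-bit memory bus, 6 GB of GDDR5 memory (??), 1506 MHz core clock, 1709 MHz boost clock, 8 GHz effective memory clock (??). The question marks flag figures that do not match the values queried from the card above (1152 cores, 3014 MB, 4 GHz memory clock).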
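The two sets of GTX 1060 numbers above can be reconciled with a little arithmetic. The following is only a minimal sketch: the helper names are made up for this note, and it assumes that the total core count is simply multiprocessors times cores per multiprocessor, and that the "effective" GDDR5 clock is twice the memory clock reported by a device query.

```python
def total_cuda_cores(sm_count: int, cores_per_sm: int) -> int:
    """Total CUDA cores = multiprocessors x cores per multiprocessor."""
    return sm_count * cores_per_sm


def effective_memory_clock_ghz(reported_clock_ghz: float,
                               transfers_per_clock: int = 2) -> float:
    """Effective memory clock, assuming 2 transfers per reported clock (GDDR5)."""
    return reported_clock_ghz * transfers_per_clock


# Values queried from the card listed above: 9 SMs, 128 cores/SM, 4 GHz memory clock.
print(total_cuda_cores(9, 128))          # 1152
print(effective_memory_clock_ghz(4.0))   # 8.0

# 1280 cores at 128 cores/SM would require 10 SMs, one more than this card reports.
print(1280 // 128)                       # 10
```

Under these assumptions the 8 GHz effective figure is consistent with the 4 GHz memory clock, while 1280 cores and 6 GB would describe a 10-multiprocessor configuration rather than the 9-multiprocessor, 3 GB card queried here.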

Thin-and-Light Laptop GPUs

(Mechrevo S1): MX150 (Pascal)
Device 0: "GeForce MX150"
CUDA Driver Version / Runtime Version 10.0 / 10.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 2003 MBytes (2099904512 bytes)
( 3) Multiprocessors, (128) CUDA Cores/MP: 384 CUDA Cores
GPU Max Clock rate: 1532 MHz (1.53 GHz)
Memory Clock rate: 3004 Mhz
Memory Bus Width: 64-bit
L2 Cache Size: 524288 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 3 / 0
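One detail in the output above is easy to misread: the per-axis block limit (1024, 1024, 64) and the limit of 1024 threads per block apply at the same time, so a block of 1024 x 1024 x 64 threads is not launchable. The sketch below checks a launch shape against the MX150 limits copied from the output; the function names are hypothetical and this is only an illustration of the rule, not part of any CUDA API.

```python
from math import prod

# Limits copied from the MX150 output above.
MAX_BLOCK_DIM = (1024, 1024, 64)            # per-axis limit (x, y, z)
MAX_THREADS_PER_BLOCK = 1024                # limit on x * y * z
MAX_GRID_DIM = (2147483647, 65535, 65535)   # per-axis limit (x, y, z)


def is_valid_block(block):
    """True if an (x, y, z) block shape respects the per-axis and total limits."""
    within_axes = all(1 <= d <= limit for d, limit in zip(block, MAX_BLOCK_DIM))
    return within_axes and prod(block) <= MAX_THREADS_PER_BLOCK


def is_valid_grid(grid):
    """True if an (x, y, z) grid shape respects the per-axis limits."""
    return all(1 <= d <= limit for d, limit in zip(grid, MAX_GRID_DIM))


print(is_valid_block((32, 32, 1)))       # True: 1024 threads in total
print(is_valid_block((1024, 1024, 64)))  # False: each axis is legal, the product is not
print(is_valid_block((1, 1, 128)))       # False: z exceeds 64
print(is_valid_grid((65536, 1, 1)))      # True: only y and z are capped at 65535
```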

Tegra SoC

Jetson Xavier
capability 7.2 (Volta)
GPU clock: 1.5 GHz, memory clock: 1.37 GHz, memory bus width: 256-bit
global memory: 15698 MB, L2 cache: 0.5 MB
cores: 8 (multiprocessors) * 64 (4*16) = 512 cores
max size of a thread block: (1024 * 1024 * 64)
max size of a grid: (2147483647 * 65535 * 65535)
Jetson TX2 (GP10B)
capability 6.2 (Pascal)
Jetson TX1
capability 5.3 (Maxwell)

Others

(Dell Precision 5520) Device 0: "Quadro M1200"
CUDA Capability Major/Minor version number: 5.0 (Maxwell)
Total amount of global memory: 4046 MBytes (4242604032 bytes)
( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
GPU Max Clock rate: 1148 MHz (1.15 GHz)
Memory Clock rate: 2505 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
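The memory clock and bus width reported for the MX150 and the Quadro M1200 are enough for a rough comparison of theoretical peak memory bandwidth. The sketch below assumes double-data-rate memory (two transfers per reported clock), which is the usual convention for this kind of estimate; the function name is made up for this note and the results are upper bounds, not measurements.

```python
def peak_bandwidth_gb_s(memory_clock_mhz: float, bus_width_bits: int,
                        transfers_per_clock: int = 2) -> float:
    """Theoretical peak bandwidth in GB/s (assumes double data rate by default)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_clock * memory_clock_mhz * 1e6 * bytes_per_transfer / 1e9


# (memory clock in MHz, bus width in bits), copied from the outputs above.
devices = {
    "GeForce MX150": (3004, 64),
    "Quadro M1200": (2505, 128),
}

for name, (clock_mhz, width_bits) in devices.items():
    print(f"{name}: {peak_bandwidth_gb_s(clock_mhz, width_bits):.2f} GB/s")
# GeForce MX150: 48.06 GB/s
# Quadro M1200: 80.16 GB/s
```

Despite its lower memory clock, the M1200 comes out ahead in this estimate because its bus is twice as wide.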
