A summary and comparison of some Nvidia GPU parameters

  1. Desktop GPUs: Titan
  • GeForce GTX Titan Black
    capability 3.5 (Kepler)
    global memory: 6081 MB, L2 cache: 1.5 MB
    cores: 15 (Multiprocessors) * 192 (6*32) = 2880 cores
    max size of a thread block: (1024, 1024, 64)
    max size of a grid: (2147483647, 65535, 65535)

  • Titan V
    capability 7.0 (Volta)
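
The core counts listed in this article are the product of the multiprocessor (SM) count and the cores per SM, and the latter depends on the compute capability. The following is a minimal sketch of that arithmetic; the table `CORES_PER_SM` and the function `total_cuda_cores` are names made up for this illustration, and the table only covers the capabilities mentioned in this article.

```python
# CUDA cores per SM, keyed by compute capability (major, minor).
# Illustrative table: only the capabilities that appear in this article.
CORES_PER_SM = {
    (3, 5): 192,  # Kepler
    (5, 0): 128,  # Maxwell
    (5, 3): 128,  # Maxwell (Tegra)
    (6, 1): 128,  # Pascal
    (6, 2): 128,  # Pascal (Tegra)
    (7, 0): 64,   # Volta
    (7, 2): 64,   # Volta (Tegra)
}


def total_cuda_cores(capability, sm_count):
    """Return sm_count multiplied by the cores per SM for this capability."""
    if capability not in CORES_PER_SM:
        raise ValueError(f"unknown compute capability {capability}")
    return sm_count * CORES_PER_SM[capability]


# GeForce GTX Titan Black: capability 3.5, 15 multiprocessors.
print(total_cuda_cores((3, 5), 15))  # 2880
```

The same call with `(6, 1)` and 9 multiprocessors gives the 1152 cores listed for the GTX 1060.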

  1. Desktop GPUs: GTX
  • GTX 1060:
    capability 6.1 (Pascal)
    GPU clock: 1.7 GHz, memory clock: 4 GHz, memory bus width: 192-bit
    global memory: 3014 MB, L2 cache: 1.5 MB
    cores: 9 (Multiprocessors) * 128 (4*32) = 1152 cores
    max size of a thread block: (1024, 1024, 64)
    max size of a grid: (2147483647, 65535, 65535)
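    Published specs for the GP106 chip list 1280 CUDA cores (??), 106 texture units (most spec listings give 80), 48 ROPs, a 192-bit memory bus, 6 GB of GDDR5 (??), a 1506 MHz base clock, a 1709 MHz boost clock, and an 8 GHz effective memory clock (??). The figures marked ?? do not match the values queried above. They appear to describe the 6 GB GTX 1060, whereas the card queried here looks like the 3 GB variant with 9 multiprocessors; the 8 GHz effective rate is twice the 4 GHz memory clock reported above.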
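
The thread block and grid limits above are per-axis maximums. A block is additionally capped at 1024 threads in total, so (1024, 1024, 64) is not itself a launchable block shape. The sketch below checks a launch configuration against these limits; `check_launch` and the constant names are hypothetical helpers for illustration, not part of any CUDA API.

```python
from math import prod

# Limits as listed in this article for its devices.
MAX_BLOCK_DIM = (1024, 1024, 64)
MAX_THREADS_PER_BLOCK = 1024
MAX_GRID_DIM = (2147483647, 65535, 65535)


def check_launch(grid, block):
    """Return a list of problems with a (grid, block) configuration.

    Both arguments are (x, y, z) tuples. An empty list means the
    configuration fits within the limits above.
    """
    problems = []
    for axis, size, limit in zip("xyz", block, MAX_BLOCK_DIM):
        if not 1 <= size <= limit:
            problems.append(f"block.{axis}={size} outside 1..{limit}")
    threads = prod(block)
    if threads > MAX_THREADS_PER_BLOCK:
        problems.append(
            f"{threads} threads per block exceeds {MAX_THREADS_PER_BLOCK}"
        )
    for axis, size, limit in zip("xyz", grid, MAX_GRID_DIM):
        if not 1 <= size <= limit:
            problems.append(f"grid.{axis}={size} outside 1..{limit}")
    return problems


print(check_launch((4096, 1, 1), (32, 32, 1)))   # fits: empty list
print(check_launch((1, 1, 1), (1024, 1024, 64)))  # too many threads per block
```

A 32 x 32 block uses exactly 1024 threads, which is the largest square 2D block these devices accept.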
  1. Thin-and-light laptop GPUs
  • (Mechrevo S1): MX150 (Pascal)
    Device 0: "GeForce MX150"
    CUDA Driver Version / Runtime Version 10.0 / 10.0
    CUDA Capability Major/Minor version number: 6.1
    Total amount of global memory: 2003 MBytes (2099904512 bytes)
    ( 3) Multiprocessors, (128) CUDA Cores/MP: 384 CUDA Cores
    GPU Max Clock rate: 1532 MHz (1.53 GHz)
    Memory Clock rate: 3004 Mhz
    Memory Bus Width: 64-bit
    L2 Cache Size: 524288 bytes
    Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
    Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
    Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
    Total amount of constant memory: 65536 bytes
    Total amount of shared memory per block: 49152 bytes
    Total number of registers available per block: 65536
    Warp size: 32
    Maximum number of threads per multiprocessor: 2048
    Maximum number of threads per block: 1024
    Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
    Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
    Maximum memory pitch: 2147483647 bytes
    Texture alignment: 512 bytes
    Concurrent copy and kernel execution: Yes with 2 copy engine(s)
    Run time limit on kernels: Yes
    Integrated GPU sharing Host Memory: No
    Support host page-locked memory mapping: Yes
    Alignment requirement for Surfaces: Yes
    Device has ECC support: Disabled
    Device supports Unified Addressing (UVA): Yes
    Device supports Compute Preemption: Yes
    Supports Cooperative Kernel Launch: Yes
    Supports MultiDevice Co-op Kernel Launch: Yes
    Device PCI Domain ID / Bus ID / location ID: 0 / 3 / 0
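
The memory clock and bus width reported above are enough for a rough estimate of peak memory bandwidth. The sketch below assumes double-data-rate memory (two transfers per reported clock), which is the convention used in the theoretical bandwidth formula of NVIDIA's CUDA C++ Best Practices Guide; the function name and the `transfers_per_clock` parameter are chosen here for illustration.

```python
def theoretical_bandwidth_gb_s(memory_clock_mhz, bus_width_bits,
                               transfers_per_clock=2):
    """Estimate peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    transfers_per_clock=2 is an assumption: it treats the reported
    memory clock as a double-data-rate clock.
    """
    bytes_per_transfer = bus_width_bits / 8
    bytes_per_second = (memory_clock_mhz * 1e6
                        * bytes_per_transfer * transfers_per_clock)
    return bytes_per_second / 1e9


# GeForce MX150 values from the deviceQuery output above.
mx150 = theoretical_bandwidth_gb_s(3004, 64)
print(f"MX150: {mx150:.1f} GB/s")
```

This is an upper bound from the reported clocks, not a measured figure; achieved bandwidth in real kernels is lower.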
  1. Tegra SoC
  • Jetson Xavier
    capability 7.2 (Volta)
    GPU clock: 1.5 GHz, memory clock: 1.37 GHz, memory bus width: 256-bit
    global memory: 15698 MB, L2 cache: 0.5 MB
    cores: 8 (Multiprocessors) * 64 (4*16) = 512 cores
    max size of a thread block: (1024, 1024, 64)
    max size of a grid: (2147483647, 65535, 65535)

  • Jetson TX2 GP10B
    capability 6.2 (Pascal)

  • Jetson TX1
    capability 5.3 (Maxwell)

  1. Others
  • (Dell Precision 5520) Device 0: "Quadro M1200"
    CUDA Capability Major/Minor version number: 5.0 (Maxwell)
    Total amount of global memory: 4046 MBytes (4242604032 bytes)
    ( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
    GPU Max Clock rate: 1148 MHz (1.15 GHz)
    Memory Clock rate: 2505 Mhz
    Memory Bus Width: 128-bit
    L2 Cache Size: 2097152 bytes
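
To compare several devices side by side, the deviceQuery text can be turned into records. The following is a minimal sketch that assumes the line formats shown in this article; `PATTERNS` and `parse_device_query` are illustrative names, and other deviceQuery versions may print fields differently.

```python
import re

# The Quadro M1200 lines as quoted in this article.
SAMPLE = """\
Device 0: "Quadro M1200"
  CUDA Capability Major/Minor version number: 5.0 (Maxwell)
  Total amount of global memory: 4046 MBytes (4242604032 bytes)
  ( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
  GPU Max Clock rate: 1148 MHz (1.15 GHz)
  Memory Clock rate: 2505 Mhz
  Memory Bus Width: 128-bit
  L2 Cache Size: 2097152 bytes
"""

# Field name -> (regex with one capture group, converter).
PATTERNS = {
    "name": (r'Device \d+: "([^"]+)"', str),
    "capability": (r"Major/Minor version number:\s*(\d+\.\d+)", str),
    "global_mem_mb": (r"Total amount of global memory:\s*(\d+) MBytes", int),
    "sm_count": (r"\(\s*(\d+)\) Multiprocessors", int),
    "cores_per_sm": (r"\(\s*(\d+)\) CUDA Cores/MP", int),
    "gpu_clock_mhz": (r"GPU Max Clock rate:\s*(\d+) MHz", int),
    "mem_clock_mhz": (r"Memory Clock rate:\s*(\d+) MHz", int),
    "bus_width_bits": (r"Memory Bus Width:\s*(\d+)-bit", int),
    "l2_bytes": (r"L2 Cache Size:\s*(\d+) bytes", int),
}


def parse_device_query(text):
    """Extract the fields in PATTERNS from deviceQuery-style text.

    Fields that are not found are simply left out of the result.
    """
    record = {}
    for key, (pattern, convert) in PATTERNS.items():
        # IGNORECASE because deviceQuery prints both "MHz" and "Mhz".
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            record[key] = convert(match.group(1))
    return record


record = parse_device_query(SAMPLE)
print(record["name"], record["sm_count"] * record["cores_per_sm"], "cores")
```

Parsing each device's output into a dictionary like this makes it straightforward to tabulate the values from the sections above.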