CUDA Programming: Querying Device Properties

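The purpose of CUDA is to offload large amounts of computation to the GPU so that it runs faster and saves time. We want to allocate memory and execute code on the device (the graphics card), and a modern graphics card may contain more than one GPU. Some NVIDIA products, such as the GeForce GTX TITAN Z, put two GPUs on a single card, so a computer fitted with that card has two CUDA-capable processors. Before using them, a program needs to know how many devices are present and what each one supports.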

As of CUDA 3.0, the cudaDeviceProp structure contains the following fields:

struct cudaDeviceProp {
    char name[256];
    size_t totalGlobalMem;
    size_t sharedMemPerBlock;
    int regsPerBlock;
    int warpSize;
    size_t memPitch;
    int maxThreadsPerBlock;
    int maxThreadsDim[3];
    int maxGridSize[3];
    size_t totalConstMem;
    int major;
    int minor;
    int clockRate;
    size_t textureAlignment;
    int deviceOverlap;
    int multiProcessorCount;
    int kernelExecTimeoutEnabled;
    int integrated;
    int canMapHostMemory;
    int computeMode;
    int maxTexture1D;
    int maxTexture2D[2];
    int maxTexture3D[3];
    int maxTexture2DArray[3];
    int concurrentKernels;
};

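Some of these fields are self-explanatory from their names, but for completeness each one is described in the table below.

Field                       Description
name                        ASCII string identifying the device
totalGlobalMem              Amount of global memory on the device, in bytes
sharedMemPerBlock           Maximum amount of shared memory a single block may use, in bytes
regsPerBlock                Number of 32-bit registers available per block
warpSize                    Number of threads in a warp
memPitch                    Maximum pitch allowed for memory copies, in bytes
maxThreadsPerBlock          Maximum number of threads a block may contain
maxThreadsDim[3]            Maximum number of threads along each dimension of a block
maxGridSize[3]              Maximum number of blocks along each dimension of a grid
totalConstMem               Amount of available constant memory, in bytes
major                       Major revision of the device's compute capability
minor                       Minor revision of the device's compute capability
clockRate                   Clock frequency, in kilohertz
textureAlignment            Alignment requirement for textures
deviceOverlap               Whether the device can perform a memory copy and run a kernel at the same time
multiProcessorCount         Number of multiprocessors on the device
kernelExecTimeoutEnabled    Whether there is a run-time limit on kernels executed on the device
integrated                  Whether the device is an integrated GPU (part of the chipset) rather than a discrete one
canMapHostMemory            Whether the device can map host memory into the CUDA device address space
computeMode                 The device's compute mode: default, exclusive, or prohibited
maxTexture1D                Maximum size of a 1D texture
maxTexture2D[2]             Maximum dimensions of a 2D texture
maxTexture3D[3]             Maximum dimensions of a 3D texture
maxTexture2DArray[3]        Maximum dimensions of a 2D texture array
concurrentKernels           Whether the device can run multiple kernels from the same context simultaneously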
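When only one or two of these properties are needed, the runtime's cudaDeviceGetAttribute call can fetch a single value without filling the whole structure. The sketch below is an illustration, not part of the original article: it assumes device 0 exists, and the helper khz_to_mhz is a hypothetical name introduced only for display.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper: convert a clock rate reported in kHz to MHz.
static double khz_to_mhz(int khz) {
    return khz / 1000.0;
}

int main(void) {
    int device = 0;  // assumption: at least one CUDA device is present
    int smCount = 0;
    int clockKHz = 0;

    // Query the multiprocessor count of the chosen device.
    cudaError_t err = cudaDeviceGetAttribute(&smCount, cudaDevAttrMultiProcessorCount, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceGetAttribute failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Query the clock rate, which the runtime reports in kilohertz.
    err = cudaDeviceGetAttribute(&clockKHz, cudaDevAttrClockRate, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceGetAttribute failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("Multiprocessors: %d\n", smCount);
    printf("Clock rate: %.1f MHz\n", khz_to_mhz(clockKHz));
    return 0;
}
```

This is mainly useful when a property is checked repeatedly or only a single limit matters; for a full report, filling the structure once is simpler.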


The following program queries and prints these properties for every device:

#include <stdio.h>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    int count = 0;
    // Ask the runtime how many CUDA-capable devices are present.
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed\n");
        return 1;
    }
    for (int i = 0; i < count; i++)
    {
        // Fill prop with the properties of device i.
        if (cudaGetDeviceProperties(&prop, i) != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties failed for device %d\n", i);
            continue;
        }
        printf("   --- General Information for device %d ---\n", i);
        printf("Name: %s\n", prop.name);
        printf("Compute capability: %d.%d\n", prop.major, prop.minor);
        printf("Clock rate: %d kHz\n", prop.clockRate);
        printf("Device copy overlap: ");
        if (prop.deviceOverlap)
            printf("Enabled\n");
        else
            printf("Disabled\n");
        printf("Kernel execution timeout: ");
        if (prop.kernelExecTimeoutEnabled)
            printf("Enabled\n");
        else
            printf("Disabled\n");
        printf("   --- Memory Information for device %d ---\n", i);
        // size_t fields are printed with %zu rather than %ld.
        printf("Total global mem: %zu\n", prop.totalGlobalMem);
        printf("Total constant mem: %zu\n", prop.totalConstMem);
        printf("Max mem pitch: %zu\n", prop.memPitch);
        printf("Texture alignment: %zu\n", prop.textureAlignment);
        printf("   --- MP Information for device %d ---\n", i);
        printf("Multiprocessor count: %d\n", prop.multiProcessorCount);
        printf("Shared mem per block: %zu\n", prop.sharedMemPerBlock);
        printf("Registers per block: %d\n", prop.regsPerBlock);
        printf("Threads in warp: %d\n", prop.warpSize);
        printf("Max threads per block: %d\n", prop.maxThreadsPerBlock);
        printf("Max thread dimensions: (%d, %d, %d)\n", prop.maxThreadsDim[0], prop.maxThreadsDim[1], prop.maxThreadsDim[2]);
        printf("Max grid dimensions: (%d, %d, %d)\n", prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);
        printf("\n");
    }
    return 0;
}
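Printing the properties is usually a first step; a program often goes on to pick a device based on them. The following is a minimal sketch of that idea, not code from the original article. The required compute capability (3.0) is an arbitrary assumption, and capability_at_least is a hypothetical helper name.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper: true if compute capability major.minor is at least
// reqMajor.reqMinor.
static bool capability_at_least(int major, int minor, int reqMajor, int reqMinor) {
    return major > reqMajor || (major == reqMajor && minor >= reqMinor);
}

int main(void) {
    const int reqMajor = 3;  // assumed requirement, chosen only for illustration
    const int reqMinor = 0;

    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed\n");
        return 1;
    }

    // Walk the devices and select the first one that meets the requirement.
    for (int i = 0; i < count; i++) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) != cudaSuccess)
            continue;
        if (capability_at_least(prop.major, prop.minor, reqMajor, reqMinor)) {
            if (cudaSetDevice(i) != cudaSuccess)
                continue;
            printf("Using device %d: %s (compute capability %d.%d)\n",
                   i, prop.name, prop.major, prop.minor);
            return 0;
        }
    }

    fprintf(stderr, "No device with compute capability >= %d.%d found\n",
            reqMajor, reqMinor);
    return 1;
}
```

The runtime also provides cudaChooseDevice, which takes a partially filled cudaDeviceProp and returns the closest matching device; the explicit loop above simply makes the selection rule visible.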

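With this basic picture of the GPU's properties, later GPU computations can select a device and handle memory-related limits with much less guesswork.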
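As one example of putting the queried limits to use, the sketch below sizes a kernel launch from maxThreadsPerBlock and maxGridSize. It is an illustration under stated assumptions rather than the article's own code: the kernel scale, the helper blocks_for, the element count, and the starting block size of 256 are all hypothetical choices, and device 0 is assumed to exist.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper: blocks needed to cover n elements (ceiling division).
static int blocks_for(int n, int threadsPerBlock) {
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Hypothetical kernel: each thread doubles one element.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= 2.0f;
}

int main(void) {
    const int n = 1 << 20;  // assumed problem size

    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed\n");
        return 1;
    }

    // Never request more threads per block than the device allows.
    int threads = 256;
    if (threads > prop.maxThreadsPerBlock)
        threads = prop.maxThreadsPerBlock;

    // Make sure the grid fits within the device's x-dimension limit.
    int blocks = blocks_for(n, threads);
    if (blocks > prop.maxGridSize[0]) {
        fprintf(stderr, "Grid of %d blocks exceeds device limit %d\n",
                blocks, prop.maxGridSize[0]);
        return 1;
    }

    float *d = NULL;
    if (cudaMalloc((void **)&d, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d, 0, n * sizeof(float));

    scale<<<blocks, threads>>>(d, n);
    cudaError_t err = cudaDeviceSynchronize();
    cudaFree(d);

    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Launched %d blocks of %d threads\n", blocks, threads);
    return 0;
}
```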
