I adapted the NVIDIA CUDA 6.5 Device Query Example to encapsulate it in a cleaner class structure. The code is undocumented for now, but it is fairly straightforward: it presents the parameters of each CUDA device in the system. The CCUDAInfo class is small; it contains the device count and an array of the devices themselves. The CCUDADeviceInfo class contains the bulk of the useful information. Both classes overload the ostream << operator and throw an exception if a CUDA call fails. CUDA must be initialized with cuInit before using either class.
The class can be used as follows:
#include <cstdio>
#include <cstdlib>
#include <exception>
#include <iostream>

#include <cuda.h>
#include <helper_cuda_drvapi.h>
#include <drvapi_error_string.h>

#include "CCUDAInfo.h"

int main(int argc, char **argv)
{
    std::cout << "Starting ... \n";

    // Init CUDA for application:
    CUresult error_id = cuInit(0);
    if (error_id != CUDA_SUCCESS)
    {
        std::cerr << "cuInit Failed. Returned " << error_id << ": "
                  << getCudaDrvErrorString(error_id) << std::endl;
        printf("Result = FAIL\n");
        exit(EXIT_FAILURE);
    }

    // Load and display the CUDA Info Class:
    try
    {
        CCUDAInfo cinfo;
        std::cout << cinfo << "\n";
    }
    catch (const std::exception &ex)
    {
        // Both info classes throw if a CUDA call fails.
        std::cout << "Error: " << ex.what() << "\n";
    }

    return 0;
}
With the following output:
Starting ...
CUDA Driver Version: 6.5
Device Count: 1
*** DEVICE 0 ***
Name: GeForce GT 650M
Compute Capability: 3.0
Clock Rate: 835000
Compute Mode: 0
CUDA CORES: 384
Cores Per MP: 192
Device ID: 0
ECC Enabled: No
Is Tesla: No
Kernel Timeout Enabled: Yes
L2 Cache Size: 262144
Max Block Dim: 1024, 1024, 64
Max Grid Dim: 2147483647, 65535, 65535
Max 1D Texture Size: 65536
Max 1D Layered Texture Size: 16384, 2048
Max 2D Texture Size: 65536, 65536
Max 2D Layers Texture Size: 16384, 16384, 2048
Max 3D Texture Size: 4096, 4096, 4096
Max Threads Per Block: 1024
Max Threads Per Multiprocessor: 2048
Memory Bus Width: 128
Memory Clock Rate: 2 Ghz
Memory Pitch Bytes: 2147483647
Multiprocessor Count: 2
PCI Bus ID: 1
PCI Device ID: 0
Registers Per Block: 65536
Shared Memory Per Block: 49152
Total Constant Memory Bytes: 65536
Total Global Memory Bytes: 1073741824
Warp Size: 32
Supports Concurrent Kernels: Yes
Supports GPU Overlap: Yes
Supports Integrated GPU Sharing Host Memory: No
Supports Map Host Memory: Yes
Supports Unified Addressing: No
Surface Alignment Required: Yes
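Several of these values are easier to read after a small conversion. The helpers below are a sketch under stated assumptions: the function names are hypothetical and not part of the classes, the clock rate is taken to be in kHz (the unit the driver API uses for that attribute), and the driver version is taken to be encoded as 1000 * major + 10 * minor, which is how the NVIDIA sample decodes it.

```cpp
#include <cstdint>

// Hypothetical helper type: a decoded driver version such as 6.5.
struct DriverVersion {
    int majorPart;
    int minorPart;
};

// Assumes the encoding 1000 * major + 10 * minor, e.g. 6050 for 6.5.
DriverVersion decodeDriverVersion(int encoded) {
    DriverVersion v;
    v.majorPart = encoded / 1000;
    v.minorPart = (encoded % 1000) / 10;
    return v;
}

// "CUDA CORES" is the multiprocessor count times the cores per multiprocessor.
int totalCudaCores(int multiprocessorCount, int coresPerMP) {
    return multiprocessorCount * coresPerMP;
}

// Assumes the reported clock rate is in kHz.
double kHzToMHz(int kHz) {
    return kHz / 1000.0;
}

// Convert a byte count to whole mebibytes (1 MiB = 1024 * 1024 bytes).
std::uint64_t bytesToMiB(std::uint64_t bytes) {
    return bytes / (1024ULL * 1024ULL);
}
```

Applied to the output above, 2 multiprocessors at 192 cores each give the 384 cores listed, 835000 kHz is 835 MHz, and 1073741824 bytes is 1024 MiB.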
The files can be found here.