| Name |
| |
| NV_device_attribute_query |
| |
| Name Strings |
| |
| cl_nv_device_attribute_query |
| |
| Number |
| |
| OpenCL Extension #18 |
| |
| Contributors |
| |
| Cyril Zeller, NVIDIA Corporation |
| Yogesh Kini, NVIDIA Corporation |
| Kedar Patil, NVIDIA Corporation |
| |
| Notice |
| |
| Copyright NVIDIA Corporation, 2009. |
| |
| IP Status |
| |
| NVIDIA Proprietary. |
| |
| Version |
| |
| October 5, 2009 (version 1.0) |
| |
| Dependencies |
| |
| OpenCL 1.0 is required. |
| |
| Overview |
| |
| This extension provides a mechanism to query device attributes specific to |
| NVIDIA hardware. This will enable the programmer to optimize OpenCL kernels |
| based on the specifics of the hardware. |
| |
| Details |
| |
| The OpenCL 1.0 specification allows the programmer to query various device |
| attributes. The complete list of these attributes is given in table 4.3. |
| However, there is no way to query vendor-specific information. This |
| extension extends this table to include NVIDIA-specific device attribute |
| queries. |
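
Before relying on the new queries, an application would normally confirm that the device advertises the name string `cl_nv_device_attribute_query`. The sketch below shows one way to test for it in the space-separated extension list that `clGetDeviceInfo` returns for `CL_DEVICE_EXTENSIONS`. The helper name `has_extension` is hypothetical and not part of this extension; the function works on a plain C string so that it compiles without the OpenCL headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` appears as a whole,
 * space-separated token in `ext_list`. A bare strstr() is not enough,
 * because one extension name may be a prefix of another. */
static bool has_extension(const char *ext_list, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = ext_list;

    if (name_len == 0)
        return false;

    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning of the list or after a space... */
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        /* ...and must end at the end of the list or before a space. */
        bool ends_token = (p[name_len] == '\0') || (p[name_len] == ' ');
        if (starts_token && ends_token)
            return true;
        p += 1; /* keep scanning after a partial match */
    }
    return false;
}
```

A caller would pass the string obtained from `CL_DEVICE_EXTENSIONS` as `ext_list` and `"cl_nv_device_attribute_query"` as `name`, and issue the NVIDIA-specific queries only when the result is true.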
| |
| This extension extends table 4.3 of the OpenCL 1.0 specification to include |
| the following: |
| |
| |------------------------------------------------------------------------------------------------------------------| |
| | CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV | cl_uint | Returns the major revision number that defines the CUDA | |
| | | | compute capability of the device. | |
| |------------------------------------------------------------------------------------------------------------------| |
| | CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV | cl_uint | Returns the minor revision number that defines the CUDA | |
| | | | compute capability of the device. | |
| |------------------------------------------------------------------------------------------------------------------| |
| | CL_DEVICE_REGISTERS_PER_BLOCK_NV | cl_uint | Maximum number of 32-bit registers available to a | |
| | | | work-group; this number is shared by all work-groups | |
| | | | simultaneously resident on a multiprocessor. | |
| |------------------------------------------------------------------------------------------------------------------| |
| | CL_DEVICE_WARP_SIZE_NV | cl_uint | Warp size in work-items. | |
| | | | | |
| |------------------------------------------------------------------------------------------------------------------| |
| | CL_DEVICE_GPU_OVERLAP_NV | cl_bool | Returns CL_TRUE if the device can concurrently copy memory | |
| | | | between host and device while executing a kernel, or | |
| | | | CL_FALSE if not. | |
| |------------------------------------------------------------------------------------------------------------------| |
| | CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV | cl_bool | CL_TRUE if there is a run time limit for kernels executed | |
| | | | on the device, or CL_FALSE if not. | |
| |------------------------------------------------------------------------------------------------------------------| |
| | CL_DEVICE_INTEGRATED_MEMORY_NV | cl_bool | CL_TRUE if the device is integrated with the memory | |
| | | | subsystem, or CL_FALSE if not. | |
| |------------------------------------------------------------------------------------------------------------------| |
| |
| To query one of these attributes, call clGetDeviceInfo with the |
| corresponding constant above as the param_name argument. |
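
As an illustration of how the queried values might feed into host-side tuning, the sketch below compares a device's compute capability against a required minimum and rounds a requested work-group size up to a multiple of the warp size. The struct `nv_device_attrs` and both helper functions are hypothetical names introduced for this example. The values are assumed to have been read with `clGetDeviceInfo` beforehand and are stored as plain `unsigned` so that the sketch compiles without the OpenCL headers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for values assumed to have been read earlier,
 * for example:
 *   cl_uint warp;
 *   clGetDeviceInfo(device, CL_DEVICE_WARP_SIZE_NV, sizeof(warp), &warp, NULL);
 */
struct nv_device_attrs {
    unsigned cc_major;   /* from CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV */
    unsigned cc_minor;   /* from CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV */
    unsigned warp_size;  /* from CL_DEVICE_WARP_SIZE_NV */
};

/* True if the device's compute capability is at least major.minor. */
static bool cc_at_least(const struct nv_device_attrs *a,
                        unsigned major, unsigned minor)
{
    return a->cc_major > major ||
           (a->cc_major == major && a->cc_minor >= minor);
}

/* Rounds a requested work-group size up to the next multiple of the warp
 * size. Returns the request unchanged if the warp size is reported as 0. */
static size_t round_up_to_warp(const struct nv_device_attrs *a,
                               size_t requested)
{
    size_t w = a->warp_size;
    if (w == 0)
        return requested;
    return ((requested + w - 1) / w) * w;
}
```

The rounded size is only a starting point: it must still respect the limit reported by `CL_DEVICE_MAX_WORK_GROUP_SIZE` before it is passed to `clEnqueueNDRangeKernel`.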
| |