| Name |
| |
| NVX_gpu_multicast2 |
| |
| Name Strings |
| |
| GL_NVX_gpu_multicast2 |
| |
| Contact |
| |
| Joshua Schnarr, NVIDIA Corporation (jschnarr 'at' nvidia.com) |
| Ingo Esser, NVIDIA Corporation (iesser 'at' nvidia.com) |
| |
| Contributors |
| |
| Robert Menzel, NVIDIA |
| Ralf Biermann, NVIDIA |
| |
| Status |
| |
| Complete. |
| |
| Version |
| |
| Last Modified Date: July 23, 2019 |
| Author Revision: 8 |
| |
| Number |
| |
| OpenGL Extension #543 |
| |
| Dependencies |
| |
| This extension is written against the OpenGL 4.6 specification |
| (Compatibility Profile), dated October 24, 2016. |
| |
| This extension requires NV_gpu_multicast. |
| |
| This extension requires EXT_device_group. |
| |
| This extension requires NV_viewport_array. |
| |
| This extension requires NV_clip_space_w_scaling. |
| |
| This extension requires NVX_progress_fence. |
| |
| |
| Overview |
| |
| This extension provides additional mechanisms that influence multicast rendering which is |
| simultaneous rendering to multiple GPUs. |
| |
| New Procedures and Functions |
| |
| uint AsyncCopyImageSubDataNVX( |
| sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *waitValueArray, |
| uint srcGpu, GLbitfield dstGpuMask, |
| uint srcName, GLenum srcTarget, int srcLevel, int srcX, int srcY, int srcZ, |
| uint dstName, GLenum dstTarget, int dstLevel, int dstX, int dstY, int dstZ, |
| sizei srcWidth, sizei srcHeight, sizei srcDepth, |
| sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray); |
| |
| sync AsyncCopyBufferSubDataNVX( |
| sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *fenceValueArray, |
| uint readGpu, GLbitfield writeGpuMask, |
| uint readBuffer, uint writeBuffer, |
| GLintptr readOffset, GLintptr writeOffset, sizeiptr size, |
| sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray); |
| |
| void UploadGpuMaskNVX(bitfield mask); |
| |
| void MulticastViewportArrayvNVX(uint gpu, uint first, sizei count, const float *v); |
| |
| void MulticastScissorArrayvNVX(uint gpu, uint first, sizei count, const int *v); |
| |
| void MulticastViewportPositionWScaleNVX(uint gpu, uint index, float xcoeff, float ycoeff); |
| |
| |
| New Tokens |
| |
| Accepted by the <pname> parameter of GetIntegerv and GetInteger64v: |
| |
| UPLOAD_GPU_MASK_NVX 0x954A |
| |
| Additions to Chapter 20 (Multicast Rendering) added to the OpenGL 4.5 (Compatibility Profile) |
| Specification by NV_gpu_multicast |
| |
| Additions to Section 20.1 (Controlling Individual GPUs) |
| |
| Texture data uploads using the functions TexImage1D, TexImage2D, TexImage3D, |
| TexSubImage1D, TexSubImage2D and TexSubImage3D are restricted to a specific set of GPUs with |
| |
| void UploadGpuMaskNVX(bitfield mask); |
| |
| This command also restricts buffer object data uploads using the functions BufferStorage, |
| NamedBufferStorage, BufferSubData and NamedBufferSubData to the specified set of GPUs. |
| |
| Further this command also restricts buffer object clears using the functions ClearBufferData, |
| ClearNamedBufferData, ClearBufferSubData and ClearNamedBufferSubData. |
| |
| The following errors apply to UploadGpuMaskNVX: |
| |
| INVALID_VALUE is generated |
| * if <mask> is zero, |
| * if <mask> is greater than or equal to 2^n, where n is equal to MULTICAST_GPUS_NV |
| |
| If the command does not generate an error, UPLOAD_GPU_MASK_NVX is set to <mask>. |
| |
| The default value of UPLOAD_GPU_MASK_NVX is (2^n)-1. |
| |
| If a function restricted by UploadGpuMaskNVX operates on textures or buffer objects |
| with GPU-shared storage type (as opposed to per-GPU storage), UPLOAD_GPU_MASK_NVX is ignored. |
| |
| Modify Section 20.2 (Multi-GPU Buffer Storage) |
| |
| Append the following paragraphs: |
| |
| To initiate a copy of buffer data without waiting for it to complete, use the following command: |
| |
| void AsyncCopyBufferSubDataNVX( |
| sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *fenceValueArray, |
| uint readGpu, GLbitfield writeGpuMask, |
| uint readBuffer, uint writeBuffer, |
| GLintptr readOffset, GLintptr writeOffset, sizeiptr size, |
| sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray); |
| |
| This command behaves equivalently to MulticastCopyBufferSubDataNV, except that it may be |
| performed concurrently with commands submitted in the future. |
| Fence semaphore objects created with CreateProgressFenceNVX are used for synchronization of one or |
| multiple copies. |
| An array of <waitSemaphoreCount> synchronization objects can be specified in the <waitSemaphoresArray> |
| parameter as a pointer to the array of semaphore objects. |
| The copy will wait for all fence semaphores in the <waitSemaphoreArray> array to be reach or exceed |
| their corresponding fence value in <fenceValueArray> before starting the transfer. |
| A signal operation for each of the <signalSemaphoreCount> semaphores in <signalSemaphoresArray> is written |
| after the copy with the corresponding fence value in <signalValueArray>. |
| To wait for the copy to complete, use WaitSemaphoreui64NVX or ClientWaitSemaphoreui64NVX to wait |
| for the semaphores in <signalSemaphoreArray> to be signalled with the fence values in <signalValueArray>. |
| |
| Modify Section 20.3.1 (Copying Image Data Between GPUs) |
| |
| Insert the following paragraphs above the line starting "To copy pixel values": |
| |
| To initiate a copy of texel data without waiting for it to complete, use the following command: |
| |
| void AsyncCopyImageSubDataNVX( |
| sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *waitValueArray, |
| uint srcGpu, GLbitfield dstGpuMask, |
| uint srcName, GLenum srcTarget, int srcLevel, int srcX, int srcY, int srcZ, |
| uint dstName, GLenum dstTarget, int dstLevel, int dstX, int dstY, int dstZ, |
| sizei srcWidth, sizei srcHeight, sizei srcDepth, |
| sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray); |
| |
| This command behaves equivalently to MulticastCopyImageSubDataNV, except that it may be |
| performed concurrently with commands submitted in the future. |
| Fence semaphore objects created with CreateProgressFenceNVX are used for synchronization of one or |
| multiple copies. An array of <waitSemaphoreCount> synchronization objects can be specified in the |
| <waitSemaphoreArray> parameter as a pointer to the array of semaphore objects. |
| The copy will wait for all fence semaphores in the <waitSemaphoresArray> array to be reach or exceed |
| their corresponding fence value in <fenceValueArray> before starting the transfer. |
| A signal operation for each of the <signalSemaphoreCount> semaphores in <signalSemaphoreArray> is written |
| after the copy with the corresponding fence value in <signalValueArray>. |
| To wait for the copy to complete, use WaitSemaphoreui64NVX or ClientWaitSemaphoreui64NVX to wait |
| for the semaphores in <signalSemaphoresArray> to be signalled with the fence values in <signalValueArray>. |
| |
| Additions to Chapter 13 (Fixed-Function Vertex Post-Processing) added to the OpenGL 4.5 (Compatibility Profile) |
| |
| Modify Section 13.6 (Coordinate transformations) |
| |
| Viewport transformation parameters for multiple viewports are specified using |
| |
| MulticastViewportArrayvNVX(uint gpu, uint first, sizei count, const float * v); |
| |
| where the array of viewport parameters can be controlled for each multicast GPU, respectively. |
| |
| A set of scissor rectangles that are each applied to the corresponding viewport is specified |
| using |
| |
| MulticastScissorArrayvNVX(uint gpu, uint first, sizei count, const int *v); |
| |
| where the rectangle parameters can be controlled for each multicast GPU, respectively. |
| |
| |
| If VIEWPORT_POSITION_W_SCALE_NV is enabled, the w coordinates for each |
| primitive sent to a given viewport will be scaled as a function of |
| its x and y coordinates using the following equation: |
| |
| w' = xcoeff * x + ycoeff * y + w; |
| |
| The coefficients for "x" and "y" used in the above equation depend on the |
| viewport index and can be controlled for each multicast GPU, respectively, by the command |
| |
| MulticastViewportPositionWScaleNVX(uint gpu, uint index, float xcoeff, float ycoeff); |
| |
| An error INVALID_VALUE error is generated if <gpu> is greater than or equal to MULTICAST_GPUS_NV. |
| |
| Additions to the OpenGL Shading Language Specification, Version 4.50 |
| |
| Including the following line in a shader can be used to enumerate multicast GPUs |
| by using the shader built-in variable gl_DeviceIndex: |
| |
| #extension GL_EXT_device_group : enable |
| |
| Each multicast GPU contains a unique device index in the gl_DeviceIndex variable. |
| |
| Errors |
| |
| Relaxation of INVALID_ENUM errors |
| --------------------------------- |
| GetIntegerv and GetInteger64v now accept new tokens as |
| described in the "New Tokens" section. |
| |
| New State |
| |
| Additions to Table 23.6 Buffer Object State |
| Initial |
| Get Value Type Get Command Value Description Sec. Attribute |
| -------------------------- ------ ----------- ----- ----------------------- ---- --------- |
| UPLOAD_GPU_MASK_NVX Z+ GetIntegerv * Mask of GPUs that 20.1 - |
| restricts buffer data |
| writes |
| * See section 20.1 |
| |
| |
| New Implementation Dependent State |
| |
| None. |
| |
| Sample Code |
| |
| None. |
| |
| Issues |
| |
| None. |
| |
| Revision History |
| |
| Rev. Date Author Changes |
| ---- -------- -------- ----------------------------------------------- |
| 1 09/20/17 jschnarr initial draft |
| 2 02/23/18 rbiermann updated draft with new functions |
| 3 05/23/18 rbiermann updated draft with new ViewportArray and AsyncCopy functions |
| 4 06/08/18 rbiermann added NVX_progress_fence for synchronization objects |
| 5 08/15/18 rbiermann updated draft with gl_deviceIndex |
| 6 04/16/19 rbiermann updated draft with UploadGpuMaskNVX |
| 7 07/19/19 rbiermann updated draft with modifications of UploadGpuMaskNVX section |
| 8 07/23/19 rbiermann updated draft with support of Clear(Named)Buffer(Sub)Data by UploadGpuMaskNVX |
| |