Name
NVX_gpu_multicast2
Name Strings
GL_NVX_gpu_multicast2
Contact
Joshua Schnarr, NVIDIA Corporation (jschnarr 'at' nvidia.com)
Ingo Esser, NVIDIA Corporation (iesser 'at' nvidia.com)
Contributors
Robert Menzel, NVIDIA
Ralf Biermann, NVIDIA
Status
Complete.
Version
Last Modified Date: July 23, 2019
Author Revision: 8
Number
OpenGL Extension #543
Dependencies
This extension is written against the OpenGL 4.6 specification
(Compatibility Profile), dated October 24, 2016.
This extension requires NV_gpu_multicast.
This extension requires EXT_device_group.
This extension requires NV_viewport_array.
This extension requires NV_clip_space_w_scaling.
This extension requires NVX_progress_fence.
Overview
This extension provides additional mechanisms that influence multicast rendering, that is,
simultaneous rendering to multiple GPUs.
New Procedures and Functions
uint AsyncCopyImageSubDataNVX(
sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *waitValueArray,
uint srcGpu, GLbitfield dstGpuMask,
uint srcName, GLenum srcTarget, int srcLevel, int srcX, int srcY, int srcZ,
uint dstName, GLenum dstTarget, int dstLevel, int dstX, int dstY, int dstZ,
sizei srcWidth, sizei srcHeight, sizei srcDepth,
sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);
sync AsyncCopyBufferSubDataNVX(
sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *fenceValueArray,
uint readGpu, GLbitfield writeGpuMask,
uint readBuffer, uint writeBuffer,
GLintptr readOffset, GLintptr writeOffset, sizeiptr size,
sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);
void UploadGpuMaskNVX(bitfield mask);
void MulticastViewportArrayvNVX(uint gpu, uint first, sizei count, const float *v);
void MulticastScissorArrayvNVX(uint gpu, uint first, sizei count, const int *v);
void MulticastViewportPositionWScaleNVX(uint gpu, uint index, float xcoeff, float ycoeff);
New Tokens
Accepted by the <pname> parameter of GetIntegerv and GetInteger64v:
UPLOAD_GPU_MASK_NVX 0x954A
Additions to Chapter 20 (Multicast Rendering) added to the OpenGL 4.5 (Compatibility Profile)
Specification by NV_gpu_multicast
Additions to Section 20.1 (Controlling Individual GPUs)
Texture data uploads using the functions TexImage1D, TexImage2D, TexImage3D,
TexSubImage1D, TexSubImage2D and TexSubImage3D are restricted to a specific set of GPUs with
void UploadGpuMaskNVX(bitfield mask);
This command also restricts buffer object data uploads using the functions BufferStorage,
NamedBufferStorage, BufferSubData and NamedBufferSubData to the specified set of GPUs.
Further this command also restricts buffer object clears using the functions ClearBufferData,
ClearNamedBufferData, ClearBufferSubData and ClearNamedBufferSubData.
The following errors apply to UploadGpuMaskNVX:
INVALID_VALUE is generated
* if <mask> is zero,
* if <mask> is greater than or equal to 2^n, where n is equal to MULTICAST_GPUS_NV.
If the command does not generate an error, UPLOAD_GPU_MASK_NVX is set to <mask>.
The default value of UPLOAD_GPU_MASK_NVX is (2^n)-1.
If a function restricted by UploadGpuMaskNVX operates on textures or buffer objects
with GPU-shared storage type (as opposed to per-GPU storage), UPLOAD_GPU_MASK_NVX is ignored.
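The error conditions and the default value above can be sketched as plain C helpers. This is a minimal sketch rather than driver code: the names default_upload_mask and upload_mask_is_valid are hypothetical, and gpu_count stands for the value of MULTICAST_GPUS_NV, assumed here to be between 1 and 32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the default UPLOAD_GPU_MASK_NVX, (2^n)-1, for n GPUs.
   Assumes 1 <= gpu_count <= 32, since the mask is a 32-bit bitfield. */
uint32_t default_upload_mask(unsigned gpu_count)
{
    assert(gpu_count >= 1 && gpu_count <= 32);
    /* Shift in 64 bits so that gpu_count == 32 does not overflow. */
    return (uint32_t)((UINT64_C(1) << gpu_count) - 1);
}

/* Hypothetical helper: true if UploadGpuMaskNVX(mask) would not generate
   INVALID_VALUE, i.e. mask is nonzero and less than 2^n. */
bool upload_mask_is_valid(uint32_t mask, unsigned gpu_count)
{
    return mask != 0 && mask <= default_upload_mask(gpu_count);
}
```

With two multicast GPUs, the masks 1, 2 and 3 pass this check, while 0 and 4 correspond to the two INVALID_VALUE cases.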
Modify Section 20.2 (Multi-GPU Buffer Storage)
Append the following paragraphs:
To initiate a copy of buffer data without waiting for it to complete, use the following command:
void AsyncCopyBufferSubDataNVX(
sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *fenceValueArray,
uint readGpu, GLbitfield writeGpuMask,
uint readBuffer, uint writeBuffer,
GLintptr readOffset, GLintptr writeOffset, sizeiptr size,
sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);
This command behaves equivalently to MulticastCopyBufferSubDataNV, except that it may be
performed concurrently with commands submitted in the future.
Fence semaphore objects created with CreateProgressFenceNVX are used for synchronization of one or
multiple copies.
An array of <waitSemaphoreCount> synchronization objects can be specified in the <waitSemaphoreArray>
parameter as a pointer to the array of semaphore objects.
The copy will wait for all fence semaphores in the <waitSemaphoreArray> array to reach or exceed
their corresponding fence value in <fenceValueArray> before starting the transfer.
A signal operation for each of the <signalSemaphoreCount> semaphores in <signalSemaphoreArray> is written
after the copy with the corresponding fence value in <signalValueArray>.
To wait for the copy to complete, use WaitSemaphoreui64NVX or ClientWaitSemaphoreui64NVX to wait
for the semaphores in <signalSemaphoreArray> to be signalled with the fence values in <signalValueArray>.
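The wait condition described above can be modelled in a few lines of C. This is only a sketch of the rule "reach or exceed": the function name copy_may_start and the current_values snapshot are hypothetical and do not correspond to any GL entry point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the wait condition: the copy may start once every wait
   semaphore has reached or exceeded its corresponding fence value.
   current_values[i] is a hypothetical snapshot of semaphore i's value. */
bool copy_may_start(const uint64_t *current_values,
                    const uint64_t *fence_values,
                    size_t wait_count)
{
    assert(wait_count == 0 || (current_values != NULL && fence_values != NULL));
    for (size_t i = 0; i < wait_count; ++i) {
        if (current_values[i] < fence_values[i]) {
            return false; /* semaphore i has not reached its value yet */
        }
    }
    return true; /* in this model, an empty wait list never blocks */
}
```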
Modify Section 20.3.1 (Copying Image Data Between GPUs)
Insert the following paragraphs above the line starting "To copy pixel values":
To initiate a copy of texel data without waiting for it to complete, use the following command:
void AsyncCopyImageSubDataNVX(
sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *waitValueArray,
uint srcGpu, GLbitfield dstGpuMask,
uint srcName, GLenum srcTarget, int srcLevel, int srcX, int srcY, int srcZ,
uint dstName, GLenum dstTarget, int dstLevel, int dstX, int dstY, int dstZ,
sizei srcWidth, sizei srcHeight, sizei srcDepth,
sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);
This command behaves equivalently to MulticastCopyImageSubDataNV, except that it may be
performed concurrently with commands submitted in the future.
Fence semaphore objects created with CreateProgressFenceNVX are used for synchronization of one or
multiple copies. An array of <waitSemaphoreCount> synchronization objects can be specified in the
<waitSemaphoreArray> parameter as a pointer to the array of semaphore objects.
The copy will wait for all fence semaphores in the <waitSemaphoreArray> array to reach or exceed
their corresponding fence value in <waitValueArray> before starting the transfer.
A signal operation for each of the <signalSemaphoreCount> semaphores in <signalSemaphoreArray> is written
after the copy with the corresponding fence value in <signalValueArray>.
To wait for the copy to complete, use WaitSemaphoreui64NVX or ClientWaitSemaphoreui64NVX to wait
for the semaphores in <signalSemaphoreArray> to be signalled with the fence values in <signalValueArray>.
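A common use of a destination GPU mask is to broadcast from one GPU to all the others. The sketch below builds such a mask; the helper name broadcast_dst_mask is hypothetical, and it assumes the convention that bit i of the mask stands for GPU i and that gpu_count (the value of MULTICAST_GPUS_NV) is between 1 and 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask selecting every multicast GPU except src_gpu,
   suitable as a <dstGpuMask> when broadcasting from one GPU.
   Bit i of the mask stands for GPU i. Assumes 1 <= gpu_count <= 32. */
uint32_t broadcast_dst_mask(unsigned src_gpu, unsigned gpu_count)
{
    assert(gpu_count >= 1 && gpu_count <= 32);
    assert(src_gpu < gpu_count);
    /* All GPUs: (2^n)-1, computed in 64 bits to allow n == 32. */
    uint32_t all = (uint32_t)((UINT64_C(1) << gpu_count) - 1);
    /* Clear the bit of the source GPU. */
    return all & ~(UINT32_C(1) << src_gpu);
}
```

For example, with four GPUs and GPU 1 as the source, the result is 0xD, which selects GPUs 0, 2 and 3.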
Additions to Chapter 13 (Fixed-Function Vertex Post-Processing) of the OpenGL 4.5 (Compatibility Profile) Specification
Modify Section 13.6 (Coordinate transformations)
Viewport transformation parameters for multiple viewports are specified using
void MulticastViewportArrayvNVX(uint gpu, uint first, sizei count, const float *v);
where the array of viewport parameters can be set independently for each multicast GPU.
A set of scissor rectangles, each applied to the corresponding viewport, is specified
using
void MulticastScissorArrayvNVX(uint gpu, uint first, sizei count, const int *v);
where the rectangle parameters can be set independently for each multicast GPU.
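As an illustration of the data passed in <v>, the sketch below fills an array of viewport parameters for one GPU. It assumes the same layout as ViewportArrayv, four floats per viewport (x, y, width, height); the helper name fill_viewports and the side-by-side tiling are illustrative choices, not part of this extension.

```c
#include <assert.h>
#include <stddef.h>

/* Fills v with `count` viewports of 4 floats each (x, y, width, height),
   the layout ViewportArrayv uses. As an illustrative assumption, the
   viewports tile the rectangle (x0, y0, width, height) side by side. */
void fill_viewports(float *v, size_t count,
                    float x0, float y0, float width, float height)
{
    assert(v != NULL && count > 0);
    float tile_width = width / (float)count;
    for (size_t i = 0; i < count; ++i) {
        v[4 * i + 0] = x0 + tile_width * (float)i; /* x */
        v[4 * i + 1] = y0;                         /* y */
        v[4 * i + 2] = tile_width;                 /* width */
        v[4 * i + 3] = height;                     /* height */
    }
}
```

The filled array would then be passed as <v> to MulticastViewportArrayvNVX for the GPU in question, with a different array per GPU where the views differ.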
If VIEWPORT_POSITION_W_SCALE_NV is enabled, the w coordinates for each
primitive sent to a given viewport will be scaled as a function of
its x and y coordinates using the following equation:
w' = xcoeff * x + ycoeff * y + w;
The coefficients for "x" and "y" used in the above equation depend on the
viewport index and can be set independently for each multicast GPU with the command
void MulticastViewportPositionWScaleNVX(uint gpu, uint index, float xcoeff, float ycoeff);
An INVALID_VALUE error is generated if <gpu> is greater than or equal to MULTICAST_GPUS_NV.
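The w scaling equation with per-GPU coefficients can be written out as a small C function. This is a sketch for a single viewport index: the WScale struct and the scale_w name are hypothetical, and the assert only mirrors the INVALID_VALUE condition on <gpu>.

```c
#include <assert.h>

/* Hypothetical per-GPU coefficients for one viewport index. */
typedef struct {
    float xcoeff;
    float ycoeff;
} WScale;

/* Computes w' = xcoeff * x + ycoeff * y + w using the coefficients
   stored for the given GPU. */
float scale_w(const WScale *per_gpu, unsigned gpu, unsigned gpu_count,
              float x, float y, float w)
{
    assert(per_gpu != 0);
    assert(gpu < gpu_count); /* mirrors the INVALID_VALUE condition on <gpu> */
    return per_gpu[gpu].xcoeff * x + per_gpu[gpu].ycoeff * y + w;
}
```

With coefficients (0.5, 0) on GPU 0 and (-0.5, 0) on GPU 1, a vertex at x = 2, y = 1, w = 1 yields w' = 2 on GPU 0 and w' = 0 on GPU 1.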
Additions to the OpenGL Shading Language Specification, Version 4.50
Including the following line in a shader makes the built-in variable gl_DeviceIndex
available, which identifies the multicast GPU executing the shader:
#extension GL_EXT_device_group : enable
On each multicast GPU, gl_DeviceIndex holds a device index unique to that GPU.
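A shader using gl_DeviceIndex might look like the following, embedded here as a C string as an application would pass it to ShaderSource. This is an illustrative sketch: the variable name device_tint_fragment_shader, the output name and the colors are assumptions, and the source has not been compiled against a multicast-capable driver.

```c
#include <assert.h>
#include <string.h>

/* Fragment shader source as a C string. gl_DeviceIndex comes from
   GL_EXT_device_group; the output name and the tint are illustrative. */
const char *device_tint_fragment_shader =
    "#version 450\n"
    "#extension GL_EXT_device_group : enable\n"
    "layout(location = 0) out vec4 color;\n"
    "void main()\n"
    "{\n"
    "    // GPU 0 renders red, every other GPU renders green.\n"
    "    color = (gl_DeviceIndex == 0) ? vec4(1.0, 0.0, 0.0, 1.0)\n"
    "                                  : vec4(0.0, 1.0, 0.0, 1.0);\n"
    "}\n";

/* Returns nonzero if the source enables GL_EXT_device_group. */
int shader_enables_device_group(const char *source)
{
    assert(source != NULL);
    return strstr(source, "#extension GL_EXT_device_group : enable") != NULL;
}
```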
Errors
Relaxation of INVALID_ENUM errors
---------------------------------
GetIntegerv and GetInteger64v now accept new tokens as
described in the "New Tokens" section.
New State
Additions to Table 23.6 Buffer Object State
                                         Initial
Get Value            Type  Get Command   Value    Description            Sec.  Attribute
-------------------  ----  ------------  -------  ---------------------  ----  ---------
UPLOAD_GPU_MASK_NVX  Z+    GetIntegerv   *        Mask of GPUs that      20.1  -
                                                  restricts buffer data
                                                  writes
* See section 20.1
New Implementation Dependent State
None.
Sample Code
None.
Issues
None.
Revision History
Rev.  Date      Author     Changes
----  --------  ---------  -----------------------------------------------
1     09/20/17  jschnarr   initial draft
2     02/23/18  rbiermann  updated draft with new functions
3     05/23/18  rbiermann  updated draft with new ViewportArray and AsyncCopy functions
4     06/08/18  rbiermann  added NVX_progress_fence for synchronization objects
5     08/15/18  rbiermann  updated draft with gl_DeviceIndex
6     04/16/19  rbiermann  updated draft with UploadGpuMaskNVX
7     07/19/19  rbiermann  updated draft with modifications of UploadGpuMaskNVX section
8     07/23/19  rbiermann  updated draft with support of Clear(Named)Buffer(Sub)Data by UploadGpuMaskNVX