| Name |
| |
| NV_command_list |
| |
| Name Strings |
| |
| GL_NV_command_list |
| |
| Contact |
| |
| Pierre Boudier, NVIDIA (pboudier 'at' nvidia.com) |
| Christoph Kubisch, NVIDIA (ckubisch 'at' nvidia.com) |
| Tristan Lorach, NVIDIA (tlorach 'at' nvidia.com) |
| |
| Contributors |
| |
| Jeff Bolz, NVIDIA |
| Corentin Wallez, NVIDIA |
| Markus Tavenrath, NVIDIA |
| Mark Kilgard, NVIDIA |
| Joseph Emmons, NVIDIA |
| Thomas Ludwig, MAXON |
| |
| Status |
| |
| Shipping with NVIDIA driver release 347.88 (March 2015) |
| |
| Version |
| |
| Last Modified Date: November 3, 2015 |
| Revision: 6 |
| |
| Number |
| |
| OpenGL Extension #477 |
| |
| Dependencies |
| |
| This extension interacts with NV_vertex_buffer_unified_memory. |
| |
| This extension interacts with NV_uniform_buffer_unified_memory. |
| |
| This extension interacts with NV_parameter_buffer_object. |
| |
| This extension interacts with ARB_robust_buffer_access_behavior. |
| |
| This extension interacts with NV_bindless_texture and ARB_bindless_texture. |
| |
| This extension interacts with NV_shader_buffer_load. |
| |
| This extension interacts with ARB_shader_draw_parameters. |
| |
| The extension is written against the OpenGL 4.4 Specification, |
| Compatibility Profile. |
| |
| Overview |
| |
| This extension adds a few new features designed to provide very low |
| overhead batching and replay of rendering commands and state changes: |
| |
| - A state object, which stores a pre-validated representation of the |
| state of (almost) the entire pipeline. |
| |
| - A more flexible and extensible MultiDrawIndirect (MDI) type of mechanism, using |
| a token-based command stream that can set up binding state and emit draw calls. |
| |
| - A set of functions to execute a list of the token-based command streams with state object |
| changes interleaved with the streams. |
| |
| - Command lists enabling compilation and reuse of sequences of command |
| streams and state object changes. |
| |
| Because state objects reflect the state of the entire pipeline, it is |
| expected that they can be pre-validated and executed efficiently. It is |
| also expected that when state objects are combined into a command list, |
| the command list can diff consecutive state objects to produce a reduced/ |
| optimized set of state changes specific to that transition. |
| |
| The token-based command stream can also be stored in regular buffer objects |
| and therefore be modified by the server itself. This allows more |
| complex work creation than the original MDI approach, which was limited |
| to emitting draw calls only. |
| |
| New Procedures and Functions |
| |
| void CreateStatesNV(sizei n, uint *states); |
| void DeleteStatesNV(sizei n, const uint *states); |
| boolean IsStateNV(uint state); |
| |
| void StateCaptureNV(uint state, enum mode); |
| |
| uint GetCommandHeaderNV(enum tokenID, uint size); |
| ushort GetStageIndexNV(enum shadertype); |
| |
| void DrawCommandsNV(enum primitiveMode, uint buffer, const intptr* indirects, const sizei* sizes, |
| uint count); |
| void DrawCommandsAddressNV(enum primitiveMode, const uint64* indirects, const sizei* sizes, |
| uint count); |
| |
| void DrawCommandsStatesNV(uint buffer, const intptr* indirects, const sizei* sizes, |
| const uint* states, const uint* fbos, uint count); |
| void DrawCommandsStatesAddressNV(const uint64* indirects, const sizei* sizes, |
| const uint* states, const uint* fbos, uint count); |
| |
| void CreateCommandListsNV(sizei n, uint *lists); |
| void DeleteCommandListsNV(sizei n, const uint *lists); |
| boolean IsCommandListNV(uint list); |
| |
| void ListDrawCommandsStatesClientNV(uint list, uint segment, const void** indirects, |
| const sizei* sizes, const uint* states, const uint* fbos, uint count); |
| |
| void CommandListSegmentsNV(uint list, uint segments); |
| void CompileCommandListNV(uint list); |
| void CallCommandListNV(uint list); |
| |
| New Tokens |
| |
| Used in DrawCommands*NV buffer formats, and accepted by the <tokenID> |
| parameter of GetCommandHeaderNV, which returns the encoded header: |
| |
| |
| TERMINATE_SEQUENCE_COMMAND_NV 0x0000 |
| NOP_COMMAND_NV 0x0001 |
| DRAW_ELEMENTS_COMMAND_NV 0x0002 |
| DRAW_ARRAYS_COMMAND_NV 0x0003 |
| DRAW_ELEMENTS_STRIP_COMMAND_NV 0x0004 |
| DRAW_ARRAYS_STRIP_COMMAND_NV 0x0005 |
| DRAW_ELEMENTS_INSTANCED_COMMAND_NV 0x0006 |
| DRAW_ARRAYS_INSTANCED_COMMAND_NV 0x0007 |
| ELEMENT_ADDRESS_COMMAND_NV 0x0008 |
| ATTRIBUTE_ADDRESS_COMMAND_NV 0x0009 |
| UNIFORM_ADDRESS_COMMAND_NV 0x000a |
| BLEND_COLOR_COMMAND_NV 0x000b |
| STENCIL_REF_COMMAND_NV 0x000c |
| LINE_WIDTH_COMMAND_NV 0x000d |
| POLYGON_OFFSET_COMMAND_NV 0x000e |
| ALPHA_REF_COMMAND_NV 0x000f |
| VIEWPORT_COMMAND_NV 0x0010 |
| SCISSOR_COMMAND_NV 0x0011 |
| FRONT_FACE_COMMAND_NV 0x0012 |
| |
| |
| Additions to Chapter 5 of the OpenGL 4.4 (Compatibility) Specification |
| (Shared Objects and Multiple Contexts) |
| |
| Add state objects and command lists to the set of objects that cannot be |
| shared between contexts. |
| |
| Additions to Chapter 7 of the OpenGL 4.4 (Compatibility) Specification |
| (Programs and Shaders) |
| |
| Modify Section 7.12.2, Shader Memory Access Synchronization |
| |
| (modify list of barrier bits) |
| |
| * COMMAND_BARRIER_BIT: Command data sourced from buffer objects by |
| Draw*Indirect, DispatchComputeIndirect and DrawCommands*NV commands |
| after the barrier will reflect data written by shaders prior to the |
| barrier. The buffer objects affected by this bit are derived from the |
| DRAW_INDIRECT_BUFFER and DISPATCH_INDIRECT_BUFFER bindings, or |
| from the arguments passed to DrawCommands*NV. |
| |
| Additions to Chapter 10 of the OpenGL 4.4 (Compatibility) Specification |
| (Drawing Commands) |
| |
| Add a new Section 10.X (Indirect Draw Commands With State Changes) |
| |
| Add a new subsection 10.X.1 (State Objects) |
| |
| The current state of the rendering pipeline can be captured into a state |
| object for later reuse with a new set of drawing commands. The name space |
| for state objects is the unsigned integers, with zero reserved. The |
| command: |
| |
| void CreateStatesNV(sizei n, uint *states); |
| |
| returns <n> previously unused state object names in <states>, and creates |
| a state object in the initial state for each name. |
| |
| State objects are deleted by calling |
| |
| void DeleteStatesNV(sizei n, const uint *states); |
| |
| <states> contains <n> names of state objects to be deleted. Once a state |
| object is deleted it has no contents and its name is again unused. Unused |
| names in <states> are silently ignored, as is the value zero. |
| |
| All the states that can be set via DrawCommandsStatesNV (as defined in |
| Section 10.X.2) are excluded from the captured state and will be inherited |
| from the most recent commands or GL context state. Binding state is, however, |
| never inherited from GL context, only from commands. |
| |
| |
| The command |
| |
| void StateCaptureNV(uint state, enum basicmode); |
| |
| captures the current state of the rendering pipeline into the object |
| indicated by <state>. <basicmode> indicates the basic Begin mode that this |
| state object must be used with, see Table 10.X.1.2 for compatibility |
| between primitive modes and basic modes. |
| |
| Table 10.X.1.2 (Primitive mode compatibility) |
| |
| basic primitive mode | compatible primitive mode |
| --------------------------------------------------------------------- |
| POINTS | POINTS |
| LINES | LINES |
| | LINE_STRIP |
| | LINE_LOOP |
| TRIANGLES | TRIANGLES |
| | TRIANGLE_STRIP |
| | TRIANGLE_FAN |
| QUADS | QUADS |
| | QUAD_STRIP |
| PATCHES | PATCHES |
| LINES_ADJACENCY | LINES_ADJACENCY |
| | LINE_STRIP_ADJACENCY |
| TRIANGLES_ADJACENCY | TRIANGLES_ADJACENCY |
| | TRIANGLE_STRIP_ADJACENCY |
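
The compatibility table above can be read as a lookup from a primitive mode to the basic mode that a state object must have been captured with. The following C sketch shows one way to express it. The enum values are the standard OpenGL primitive mode values, repeated locally so that the sketch compiles without GL headers; the function names `basic_mode_for` and `mode_is_compatible` are hypothetical and not part of the extension.

```c
#include <assert.h>

/* Standard OpenGL primitive mode values, repeated here so the sketch
   compiles without GL headers. */
enum {
    GL_POINTS = 0x0000, GL_LINES = 0x0001, GL_LINE_LOOP = 0x0002,
    GL_LINE_STRIP = 0x0003, GL_TRIANGLES = 0x0004,
    GL_TRIANGLE_STRIP = 0x0005, GL_TRIANGLE_FAN = 0x0006,
    GL_QUADS = 0x0007, GL_QUAD_STRIP = 0x0008,
    GL_LINES_ADJACENCY = 0x000A, GL_LINE_STRIP_ADJACENCY = 0x000B,
    GL_TRIANGLES_ADJACENCY = 0x000C, GL_TRIANGLE_STRIP_ADJACENCY = 0x000D,
    GL_PATCHES = 0x000E
};

/* Hypothetical helper: returns the basic primitive mode that is
   compatible with <mode> according to Table 10.X.1.2, or -1 if <mode>
   is not listed in the table. */
int basic_mode_for(unsigned mode)
{
    switch (mode) {
    case GL_POINTS:
        return GL_POINTS;
    case GL_LINES: case GL_LINE_STRIP: case GL_LINE_LOOP:
        return GL_LINES;
    case GL_TRIANGLES: case GL_TRIANGLE_STRIP: case GL_TRIANGLE_FAN:
        return GL_TRIANGLES;
    case GL_QUADS: case GL_QUAD_STRIP:
        return GL_QUADS;
    case GL_PATCHES:
        return GL_PATCHES;
    case GL_LINES_ADJACENCY: case GL_LINE_STRIP_ADJACENCY:
        return GL_LINES_ADJACENCY;
    case GL_TRIANGLES_ADJACENCY: case GL_TRIANGLE_STRIP_ADJACENCY:
        return GL_TRIANGLES_ADJACENCY;
    default:
        return -1;
    }
}

/* Hypothetical helper: non-zero if a state object captured with
   <basic> may be used to draw primitives of type <mode>. */
int mode_is_compatible(unsigned basic, unsigned mode)
{
    int required = basic_mode_for(mode);
    assert(required >= -1);
    return required >= 0 && (unsigned)required == basic;
}
```

For example, `mode_is_compatible(GL_TRIANGLES, GL_TRIANGLE_FAN)` is non-zero, while passing a strip mode as the basic mode always yields zero because only the left-hand column of the table holds basic modes.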
| |
| This rendering state includes: |
| |
| - Vertex attribute enable state, formats, types, relative offsets and strides. |
| |
| - Primitive state such as primitive restart and patch parameters, provoking vertex. |
| |
| - Immediate vertex attribute values as provided by glVertexAttrib* or |
| glVertexAttribI* |
| |
| - All active program binaries except compute (either from the active |
| program pipeline or from UseProgram) with their current subroutine |
| configuration. |
| |
| - Rasterization, multisample fragment operation, depth, stencil, and |
| blending state. |
| |
| - Rasterization state such as stippling and polygon modes and offsets. |
| |
| - Viewport, scissor, and depth range state. |
| |
| - Framebuffer attachment configuration: attachment state including attachment |
| formats, drawbuffer state, and target/layer information, but not including |
| actual attachments or sizes of attachments (these are stored separately). |
| |
| - Framebuffer attachment textures (but not their residency state). |
| |
| It does NOT include: |
| |
| - Bound vertex buffers or vertex unified addresses, or their offsets, |
| or bound index buffers/addresses. |
| |
| - Other program-related bindings, such as shader storage buffers, atomic counter buffers, texture |
| and sampler bindings. |
| |
| - Default-block uniform values from active programs |
| |
| - Blending constant color, front and back stencil reference values, alpha test threshold. |
| |
| - Polygon offset values. |
| |
| - Viewport and scissor rectangle for viewport index zero. |
| |
| Essentially all state that can be manipulated by the commands listed in 10.X.2 (Drawing with Commands) |
| is excluded from the state capture. |
| |
| INVALID_ENUM is generated if <basicmode> is not a basic primitive mode, as listed |
| in Table 10.X.1.2. |
| INVALID_OPERATION is generated if the default framebuffer is bound as either draw or read buffer. |
| INVALID_OPERATION is generated if transform feedback is enabled. |
| INVALID_OPERATION is generated if occlusion query is enabled. |
| INVALID_OPERATION is generated if the current active program or program pipeline |
| makes use of SHADER_STORAGE_BUFFER, ATOMIC_COUNTER_BUFFER or has uniforms defined |
| in the default uniform-block, or uniforms inheriting from fixed function state |
| (gl_ModelView etc.). |
| INVALID_OPERATION is generated if the current active program or program pipeline |
| uses uniform blocks that did not have the "commandBindableNV" flag set (see |
| "Modifications to the OpenGL Shading Language Specification" section). |
| INVALID_OPERATION is generated if neither program, nor program pipeline |
| objects are actively used. |
| |
| Add a new subsection 10.X.2 (Drawing with Commands) |
| |
| void DrawCommandsNV(enum mode, uint buffer, const intptr* indirects, const sizei* sizes, |
| uint count); |
| void DrawCommandsAddressNV(enum mode, const uint64* indirects, const sizei* sizes, |
| uint count); |
| |
| These commands accept arrays of buffer addresses (either an array of |
| offsets <indirects> into a buffer named by <buffer>, or an array of GPU |
| addresses <indirects>), and an array of sequence lengths in <sizes>. |
| All arrays have <count> entries. |
| The current binding state of vertex, element and uniform buffers is not |
| used; these bindings must be set via commands within the buffer. All other |
| state is inherited from the current OpenGL context. |
| |
| INVALID_ENUM is generated if <mode> is not an accepted value. |
| INVALID_VALUE is generated if <buffer> is not a valid buffer object. |
| INVALID_OPERATION is generated if a geometry shader is active and <mode> is |
| incompatible with the input primitive type of the geometry shader in the currently |
| installed program object. |
| INVALID_OPERATION is generated if the default (zero) frame buffer object is |
| currently bound as DRAW_FRAMEBUFFER; a non-zero frame buffer object is required. |
| |
| DrawCommandsNV and DrawCommandsAddressNV are equivalent to: |
| |
| Save current GL state; |
| enum indexType = UNSIGNED_SHORT; |
| for (uint i = 0; i < count; i++) { |
| uint64 address = address computed from <buffer>+<indirects>[i]; |
| |
| indexType = DrawCommandSequenceNV(<mode>, indexType, address, sizes[i]); |
| } |
| Restore current GL state; |
| |
| The command: |
| |
| enum DrawCommandSequenceNV(enum mode, enum indexType, void *address, sizei size); |
| |
| does not exist in the GL, but is used to describe functionality in the rest |
| of this section. |
| |
| DrawCommandSequenceNV is a flexible and extensible command that executes |
| simple state changes and draw commands based on a tokenized format. The |
| loop above illustrates that the state changes from one invocation will |
| influence the next. All rendering is performed as if the client states for |
| VERTEX_ATTRIB_ARRAY_UNIFIED_NV, ELEMENT_ARRAY_UNIFIED_NV and |
| UNIFORM_BUFFER_UNIFIED_NV are enabled. |
| |
| It is defined by the following pseudo code, tokens, and structures: |
| |
| |
| Table 10.X.2 (Token values and command structure names) |
| |
| tokenID | Command |
| --------------------------------------------------------------------- |
| TERMINATE_SEQUENCE_COMMAND_NV | TerminateSequenceCommandNV |
| NOP_COMMAND_NV | NOPCommandNV |
| DRAW_ELEMENTS_COMMAND_NV | DrawElementsCommandNV |
| DRAW_ARRAYS_COMMAND_NV | DrawArraysCommandNV |
| DRAW_ELEMENTS_STRIP_COMMAND_NV | DrawElementsCommandNV |
| DRAW_ARRAYS_STRIP_COMMAND_NV | DrawArraysCommandNV |
| DRAW_ELEMENTS_INSTANCED_COMMAND_NV | DrawElementsInstancedCommandNV |
| DRAW_ARRAYS_INSTANCED_COMMAND_NV | DrawArraysInstancedCommandNV |
| ELEMENT_ADDRESS_COMMAND_NV | ElementAddressCommandNV |
| ATTRIBUTE_ADDRESS_COMMAND_NV | AttributeAddressCommandNV |
| UNIFORM_ADDRESS_COMMAND_NV | UniformAddressCommandNV |
| BLEND_COLOR_COMMAND_NV | BlendColorCommandNV |
| STENCIL_REF_COMMAND_NV | StencilRefCommandNV |
| LINE_WIDTH_COMMAND_NV | LineWidthCommandNV |
| POLYGON_OFFSET_COMMAND_NV | PolygonOffsetCommandNV |
| ALPHA_REF_COMMAND_NV | AlphaRefCommandNV |
| VIEWPORT_COMMAND_NV | ViewportCommandNV |
| SCISSOR_COMMAND_NV | ScissorCommandNV |
| FRONT_FACE_COMMAND_NV | FrontFaceCommandNV |
| |
| |
| Tight packing is used for all structures |
| |
| typedef struct { |
| uint header; |
| } TerminateSequenceCommandNV; |
| |
| typedef struct { |
| uint header; |
| } NOPCommandNV; |
| |
| typedef struct { |
| uint header; |
| uint count; |
| uint firstIndex; |
| uint baseVertex; |
| } DrawElementsCommandNV; |
| |
| typedef struct { |
| uint header; |
| uint count; |
| uint first; |
| } DrawArraysCommandNV; |
| |
| typedef struct { |
| uint header; |
| uint mode; |
| uint count; |
| uint instanceCount; |
| uint firstIndex; |
| uint baseVertex; |
| uint baseInstance; |
| } DrawElementsInstancedCommandNV; |
| |
| typedef struct { |
| uint header; |
| uint mode; |
| uint count; |
| uint instanceCount; |
| uint first; |
| uint baseInstance; |
| } DrawArraysInstancedCommandNV; |
| |
| typedef struct { |
| uint header; |
| uint addressLo; |
| uint addressHi; |
| uint typeSizeInByte; |
| } ElementAddressCommandNV; |
| |
| typedef struct { |
| uint header; |
| uint index; |
| uint addressLo; |
| uint addressHi; |
| } AttributeAddressCommandNV; |
| |
| typedef struct { |
| uint header; |
| ushort index; |
| ushort stage; |
| uint addressLo; |
| uint addressHi; |
| } UniformAddressCommandNV; |
| |
| typedef struct { |
| uint header; |
| float red; |
| float green; |
| float blue; |
| float alpha; |
| } BlendColorCommandNV; |
| |
| typedef struct { |
| uint header; |
| uint frontStencilRef; |
| uint backStencilRef; |
| } StencilRefCommandNV; |
| |
| typedef struct { |
| uint header; |
| float lineWidth; |
| } LineWidthCommandNV; |
| |
| typedef struct { |
| uint header; |
| float scale; |
| float bias; |
| } PolygonOffsetCommandNV; |
| |
| typedef struct { |
| uint header; |
| float alphaRef; |
| } AlphaRefCommandNV; |
| |
| typedef struct { |
| uint header; |
| uint x; |
| uint y; |
| uint width; |
| uint height; |
| } ViewportCommandNV; // only ViewportIndex 0 |
| |
| typedef struct { |
| uint header; |
| uint x; |
| uint y; |
| uint width; |
| uint height; |
| } ScissorCommandNV; // only ViewportIndex 0 |
| |
| typedef struct { |
| uint header; |
| uint frontFace; // 0 for CW, 1 for CCW |
| } FrontFaceCommandNV; |
| |
| enum DrawCommandSequenceNV(enum mode, enum indexType, void *address, sizei size) |
| { |
| enum modeStrip; |
| if (mode == TRIANGLES) modeStrip = TRIANGLE_STRIP; |
| else if (mode == LINES) modeStrip = LINE_STRIP; |
| else if (mode == LINES_ADJACENCY) modeStrip = LINE_STRIP_ADJACENCY; |
| else if (mode == TRIANGLES_ADJACENCY) modeStrip = TRIANGLE_STRIP_ADJACENCY; |
| else if (mode == QUADS) modeStrip = QUAD_STRIP; |
| else modeStrip = mode; |
| |
| enum modeSpecial; |
| if (mode == LINES) modeSpecial = LINE_LOOP; |
| else if (mode == TRIANGLES) modeSpecial = TRIANGLE_FAN; |
| else modeSpecial = mode; |
| |
| void *current = address; |
| |
| while (current != (ubyte *)address + size) { |
| uint header = *(uint*)current; |
| |
| switch( GetTokenType(header)){ |
| case TERMINATE_SEQUENCE_COMMAND_NV: |
| { |
| return indexType; |
| } |
| break; |
| case NOP_COMMAND_NV: |
| |
| break; |
| case DRAW_ELEMENTS_COMMAND_NV: |
| { |
| DrawElementsCommandNV* cmd = (DrawElementsCommandNV*)current; |
| DrawElementsBaseVertex(mode, cmd->count, indexType, (void*)(cmd->firstIndex * sizeofindextype), cmd->baseVertex); |
| } |
| break; |
| case DRAW_ARRAYS_COMMAND_NV: |
| { |
| DrawArraysCommandNV* cmd = (DrawArraysCommandNV*)current; |
| DrawArrays(mode, cmd->first, cmd->count); |
| } |
| break; |
| case DRAW_ELEMENTS_STRIP_COMMAND_NV: |
| { |
| DrawElementsCommandNV* cmd = (DrawElementsCommandNV*)current; |
| DrawElementsBaseVertex(modeStrip, cmd->count, indexType, (void*)(cmd->firstIndex * sizeofindextype), cmd->baseVertex); |
| } |
| break; |
| case DRAW_ARRAYS_STRIP_COMMAND_NV: |
| { |
| DrawArraysCommandNV* cmd = (DrawArraysCommandNV*)current; |
| DrawArrays(modeStrip, cmd->first, cmd->count); |
| } |
| break; |
| case DRAW_ELEMENTS_INSTANCED_COMMAND_NV: |
| { |
| // undefined behavior if (cmd->mode != mode && cmd->mode != modeStrip && cmd->mode != modeSpecial) |
| |
| DrawElementsInstancedCommandNV* cmd = (DrawElementsInstancedCommandNV*)current; |
| DrawElementsIndirect(cmd->mode, indexType, &cmd->count); |
| } |
| break; |
| case DRAW_ARRAYS_INSTANCED_COMMAND_NV: |
| { |
| // undefined behavior if (cmd->mode != mode && cmd->mode != modeStrip && cmd->mode != modeSpecial) |
| |
| DrawArraysInstancedCommandNV* cmd = (DrawArraysInstancedCommandNV*)current; |
| DrawArraysIndirect(cmd->mode, &cmd->count); |
| } |
| break; |
| case ELEMENT_ADDRESS_COMMAND_NV: |
| { |
| ElementAddressCommandNV* cmd = (ElementAddressCommandNV*)current; |
| switch(cmd->typeSizeInByte){ |
| case 1: indexType = UNSIGNED_BYTE; break; |
| case 2: indexType = UNSIGNED_SHORT; break; |
| case 4: indexType = UNSIGNED_INT; break; |
| } |
| BufferAddressRangeNV(ELEMENT_ARRAY_ADDRESS_NV, 0, uint64(cmd->addressLo) | (uint64(cmd->addressHi)<<32), 0x7FFFFFFF); |
| } |
| break; |
| case ATTRIBUTE_ADDRESS_COMMAND_NV: |
| { |
| AttributeAddressCommandNV* cmd = (AttributeAddressCommandNV*)current; |
| BufferAddressRangeNV(VERTEX_ATTRIB_ARRAY_ADDRESS_NV, cmd->index, uint64(cmd->addressLo) | (uint64(cmd->addressHi)<<32), 0x7FFFFFFF); |
| } |
| break; |
| case UNIFORM_ADDRESS_COMMAND_NV: |
| { |
| UniformAddressCommandNV* cmd = (UniformAddressCommandNV*)current; |
| BufferAddressRangeNV(UNIFORM_BUFFER_ADDRESS_NV, cmd->index, uint64(cmd->addressLo) | (uint64(cmd->addressHi)<<32), 0x10000); |
| } |
| break; |
| case BLEND_COLOR_COMMAND_NV: |
| { |
| BlendColorCommandNV* cmd = (BlendColorCommandNV*)current; |
| BlendColor(cmd->red,cmd->green,cmd->blue,cmd->alpha); |
| } |
| break; |
| case STENCIL_REF_COMMAND_NV: |
| { |
| StencilRefCommandNV* cmd = (StencilRefCommandNV*)current; |
| StencilFuncSeparate(FRONT, asIs, cmd->frontStencilRef, asIs); |
| StencilFuncSeparate(BACK, asIs, cmd->backStencilRef, asIs); |
| } |
| break; |
| case LINE_WIDTH_COMMAND_NV: |
| { |
| LineWidthCommandNV* cmd = (LineWidthCommandNV*)current; |
| LineWidth(cmd->lineWidth); |
| } |
| break; |
| case POLYGON_OFFSET_COMMAND_NV: |
| { |
| PolygonOffsetCommandNV* cmd = (PolygonOffsetCommandNV*)current; |
| PolygonOffset(cmd->scale,cmd->bias); |
| } |
| break; |
| case ALPHA_REF_COMMAND_NV: |
| { |
| AlphaRefCommandNV* cmd = (AlphaRefCommandNV*)current; |
| AlphaFunc(asIs, cmd->alphaRef); |
| } |
| break; |
| case VIEWPORT_COMMAND_NV: |
| { |
| ViewportCommandNV* cmd = (ViewportCommandNV*)current; |
| Viewport (cmd->x,cmd->y,cmd->width,cmd->height); |
| } |
| break; |
| case SCISSOR_COMMAND_NV: |
| { |
| ScissorCommandNV* cmd = (ScissorCommandNV*)current; |
| Scissor(cmd->x,cmd->y,cmd->width,cmd->height); |
| } |
| break; |
| case FRONT_FACE_COMMAND_NV: |
| { |
| FrontFaceCommandNV* cmd = (FrontFaceCommandNV*)current; |
| FrontFace(cmd->frontFace ? CCW : CW); // 0 for CW, 1 for CCW, as in FrontFaceCommandNV |
| } |
| break; |
| } |
| |
| current = (ubyte *)current + GetTokenSize(header); |
| } |
| |
| return indexType; |
| } |
| |
| The commands called by DrawCommandSequenceNV are not required to generate |
| the errors they would normally produce. Providing erroneous data as parameters, |
| or generating state that would normally create errors when executed |
| by the server, can produce undefined results and may cause program |
| termination. |
| The residency of all resources referenced directly (buffer addresses inside tokens) |
| or indirectly (texture handles inside uniform buffer objects) must be managed |
| explicitly. |
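
The walk over a token sequence described by the pseudo code above can be illustrated with a small CPU-side model. This is only a sketch under stated assumptions: real header values are implementation specific and must come from GetCommandHeaderNV, so the encoding used by `fake_header` (token ID in the low 16 bits, byte size in the high 16 bits) is invented purely for illustration. The names `fake_header`, `fake_token_type`, `fake_token_size`, `append_words`, `WalkResult` and `walk_sequence` are hypothetical, and no GL calls are made; the model only counts draws and tracks the index size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    TERMINATE_SEQUENCE_COMMAND_NV = 0x0000,
    NOP_COMMAND_NV                = 0x0001,
    DRAW_ARRAYS_COMMAND_NV        = 0x0003,
    ELEMENT_ADDRESS_COMMAND_NV    = 0x0008
};

/* ASSUMPTION: invented header encoding, NOT what a driver returns. */
uint32_t fake_header(uint32_t tokenID, uint32_t sizeInBytes)
{
    return tokenID | (sizeInBytes << 16);
}
uint32_t fake_token_type(uint32_t header) { return header & 0xFFFFu; }
uint32_t fake_token_size(uint32_t header) { return header >> 16; }

/* Hypothetical helper: copies <n> 32-bit words to buf + offset and
   returns the new write offset. */
size_t append_words(unsigned char *buf, size_t offset,
                    const uint32_t *words, size_t n)
{
    memcpy(buf + offset, words, n * sizeof(uint32_t));
    return offset + n * sizeof(uint32_t);
}

typedef struct {
    uint32_t draws;            /* number of draw tokens seen            */
    uint32_t indexSizeInBytes; /* carried over like indexType in spec   */
    int      terminated;       /* non-zero if a terminate token was hit */
} WalkResult;

/* Mirrors the loop structure of DrawCommandSequenceNV. As in the spec,
   the token sizes must add up to exactly <size> unless a terminate
   token ends the sequence early. */
WalkResult walk_sequence(const unsigned char *address, size_t size,
                         uint32_t indexSizeInBytes)
{
    WalkResult r = { 0, indexSizeInBytes, 0 };
    const unsigned char *current = address;
    const unsigned char *end = address + size;

    while (current != end) {
        uint32_t header;
        memcpy(&header, current, sizeof header);

        switch (fake_token_type(header)) {
        case TERMINATE_SEQUENCE_COMMAND_NV:
            r.terminated = 1;
            return r;
        case DRAW_ARRAYS_COMMAND_NV:
            r.draws++;
            break;
        case ELEMENT_ADDRESS_COMMAND_NV: {
            /* typeSizeInByte is the fourth 32-bit member. */
            uint32_t typeSize;
            memcpy(&typeSize, current + 12, sizeof typeSize);
            r.indexSizeInBytes = typeSize;
            break;
        }
        default: /* NOP and tokens not modelled here */
            break;
        }
        /* A zero-sized token would never advance the cursor. */
        assert(fake_token_size(header) >= sizeof header);
        current += fake_token_size(header);
    }
    return r;
}
```

The model makes two properties of the pseudo code visible: state set by one token (here the index size) persists for later tokens and later sequences, and everything after a terminate token is ignored.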
| |
| |
| (XXX should we add something similar to CheckFramebufferStatus? for |
| debugging, that tests the content in software and throws error + offset into buffer |
| triggering the error) |
| |
| All BufferAddressRangeNV calls issued by DrawCommandSequenceNV are |
| effective independent of their appropriate client state being enabled or not. |
| |
| |
| uint GetCommandHeaderNV(enum tokenID, uint size) |
| |
| Returns the encoded 32-bit header value for a given command; the returned |
| value is implementation specific. |
| The <size> is only provided as a basic consistency check, since the size of each |
| structure is fixed and no padding is allowed. Its value is the total size of |
| the command structure, that is, the header plus the command-specific members. |
| INVALID_ENUM is generated if <tokenID> is not one of the values listed in Table 10.X.2. |
| INVALID_VALUE is generated if <size> does not match the fixed |
| size of a command defined by the spec. |
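
Because the structures are tightly packed, the <size> argument is simply the byte size of the corresponding structure. The following C sketch checks this at compile time for three of the structures. It assumes a compiler that supports `#pragma pack` and a 4-byte `float`, and it uses `uint32_t` in place of the spec's `uint`. The helper `command_size` is a hypothetical name and covers only the three tokens shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    DRAW_ELEMENTS_COMMAND_NV           = 0x0002,
    DRAW_ELEMENTS_INSTANCED_COMMAND_NV = 0x0006,
    BLEND_COLOR_COMMAND_NV             = 0x000b
};

#pragma pack(push, 1) /* the spec requires tight packing */
typedef struct {
    uint32_t header;
    uint32_t count;
    uint32_t firstIndex;
    uint32_t baseVertex;
} DrawElementsCommandNV;

typedef struct {
    uint32_t header;
    uint32_t mode;
    uint32_t count;
    uint32_t instanceCount;
    uint32_t firstIndex;
    uint32_t baseVertex;
    uint32_t baseInstance;
} DrawElementsInstancedCommandNV;

typedef struct {
    uint32_t header;
    float    red;
    float    green;
    float    blue;
    float    alpha;
} BlendColorCommandNV;
#pragma pack(pop)

/* Compile-time checks: header plus command-specific members. */
static_assert(sizeof(DrawElementsCommandNV) == 16, "header + 3 uints");
static_assert(sizeof(DrawElementsInstancedCommandNV) == 28, "header + 6 uints");
static_assert(sizeof(BlendColorCommandNV) == 20, "header + 4 floats");

/* Hypothetical helper: the value to pass as <size> for a token,
   or 0 for tokens this sketch does not model. */
uint32_t command_size(uint32_t tokenID)
{
    switch (tokenID) {
    case DRAW_ELEMENTS_COMMAND_NV:
        return (uint32_t)sizeof(DrawElementsCommandNV);
    case DRAW_ELEMENTS_INSTANCED_COMMAND_NV:
        return (uint32_t)sizeof(DrawElementsInstancedCommandNV);
    case BLEND_COLOR_COMMAND_NV:
        return (uint32_t)sizeof(BlendColorCommandNV);
    default:
        return 0;
    }
}
```

With a current context, the header for a token would then be requested with the token ID and the matching structure size, for example `sizeof(DrawElementsCommandNV)` for DRAW_ELEMENTS_COMMAND_NV.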
| |
| ushort GetStageIndexNV(enum shadertype) |
| |
| Returns the 16-bit value for a specific shader stage; the returned value |
| is implementation specific. The value is to be used with the stage field |
| within UniformAddressCommandNV tokens. |
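
As an illustration of how the header, stage index and GPU address fit together in a UniformAddressCommandNV token, the following C sketch fills the structure from a 64-bit address and recombines the two halves the same way the pseudo code for UNIFORM_ADDRESS_COMMAND_NV does. The helper names `make_uniform_address` and `uniform_address_of` are hypothetical. In real use the `header` and `stage` arguments would be the values returned by GetCommandHeaderNV and GetStageIndexNV; here they are opaque placeholders supplied by the caller.

```c
#include <assert.h>
#include <stdint.h>

#pragma pack(push, 1) /* the spec requires tight packing */
typedef struct {
    uint32_t header;
    uint16_t index;
    uint16_t stage;
    uint32_t addressLo;
    uint32_t addressHi;
} UniformAddressCommandNV;
#pragma pack(pop)

static_assert(sizeof(UniformAddressCommandNV) == 16, "tightly packed");

/* Hypothetical helper: builds a token, splitting the 64-bit GPU
   address into its low and high 32-bit halves. */
UniformAddressCommandNV make_uniform_address(uint32_t header,
                                             uint16_t index,
                                             uint16_t stage,
                                             uint64_t gpuAddress)
{
    UniformAddressCommandNV cmd;
    cmd.header    = header;
    cmd.index     = index;
    cmd.stage     = stage;
    cmd.addressLo = (uint32_t)(gpuAddress & 0xFFFFFFFFu);
    cmd.addressHi = (uint32_t)(gpuAddress >> 32);
    return cmd;
}

/* Hypothetical helper: recombines the halves as the pseudo code does:
   uint64(addressLo) | (uint64(addressHi) << 32). */
uint64_t uniform_address_of(const UniformAddressCommandNV *cmd)
{
    return (uint64_t)cmd->addressLo | ((uint64_t)cmd->addressHi << 32);
}
```

The `index` member corresponds to the uniform block binding and `stage` selects the shader stage whose binding is updated; both are 16-bit so that the token stays 16 bytes long.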
| |
| Add a new subsection 10.X.3 (Drawing with Commands and State Objects) |
| |
| State objects may be used in rendering with the commands: |
| |
| void DrawCommandsStatesNV(uint buffer, const intptr* indirects, const sizei* sizes, |
| const uint* states, const uint* fbos, uint count); |
| void DrawCommandsStatesAddressNV(const uint64* indirects, const sizei* sizes, |
| const uint* states, const uint* fbos, uint count); |
| |
| These commands accept arrays of buffer addresses (either an array of |
| offsets <indirects> into a buffer named by <buffer>, or an array of GPU |
| addresses <indirects>), an array of sequence lengths in <sizes>, and an |
| array of state object names in <states>, of which all names must be non-zero. |
| Frame buffer object names are stored in <fbos> and can |
| be either zero or non-zero. All arrays have <count> entries. |
| The residency of textures used as attachments inside the state object's |
| captured fbo or the passed fbo must be managed explicitly. |
| |
| INVALID_VALUE is generated if one entry of <states> is zero. |
| INVALID_OPERATION is generated if the fbo configuration from <fbos> |
| mismatches the configuration inside the corresponding state object |
| from <states>. |
| |
| DrawCommandsStatesNV and DrawCommandsStatesAddressNV are equivalent to: |
| |
| Save current GL state; |
| enum indexType = UNSIGNED_SHORT; |
| for (uint i = 0; i < count; i++) { |
| fbo = LookupFbo(fbos[i]); |
| stateObject = LookupStateObject(states[i]); |
| |
| if ( i == 0){ |
| Set full state captured by stateObject; |
| } |
| else { |
| Set difference of state going from <states>[i-1] to current stateObject; |
| } |
| |
| if ( fbo == 0) { |
| BindFramebuffer(FRAMEBUFFER, stateObject.fbo.name); |
| } |
| else if ( stateObject.fbo.configuration == fbo.configuration ){ |
| // The configuration excludes attachment textures and size information, however |
| // includes attached texture formats and other state (see StateCaptureNV). |
| |
| BindFramebuffer(FRAMEBUFFER, fbo.name); |
| } |
| else { |
| // Only compatible fbo states can be used. |
| |
| generate ERROR INVALID_OPERATION; |
| return; |
| } |
| |
| enum mode = primitive mode from stateObject; |
| |
| uint64 address = address computed from <buffer>+<indirects>[i]; |
| |
| indexType = DrawCommandSequenceNV(mode, indexType, address, sizes[i]); |
| } |
| Restore current GL state; |
| |
| where LookupFbo and LookupStateObject return the driver's internal fbo |
| and state object, stateObject.fbo is the fbo state captured in the state |
| object, and fbo.configuration and fbo.name are the current configuration of an fbo |
| and the fbo's name, respectively. |
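
The framebuffer selection rule in the pseudo code above can be isolated into a small C model. This is a sketch only: the types `FboModel` and `StateObjectModel` and the function `select_framebuffer` are hypothetical, and the `configuration` member is an invented stand-in (a single integer) for the attachment formats, drawbuffer state and target/layer information that the driver actually compares.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an fbo as seen by the pseudo code. */
typedef struct {
    uint32_t name;          /* GL object name, zero means "none"       */
    uint32_t configuration; /* ASSUMPTION: opaque configuration digest */
} FboModel;

/* Hypothetical model of a state object: only the captured fbo matters
   for this sketch. StateCaptureNV rejects the default framebuffer, so
   the captured fbo name is always non-zero. */
typedef struct {
    FboModel fbo;
} StateObjectModel;

/* Returns the name of the framebuffer that would be bound for one
   entry of <states>/<fbos>, or 0 where the pseudo code generates
   INVALID_OPERATION. <passed> may be NULL or have name 0 to select
   the fbo captured in the state object. */
uint32_t select_framebuffer(const StateObjectModel *state,
                            const FboModel *passed)
{
    assert(state != NULL);
    assert(state->fbo.name != 0);

    if (passed == NULL || passed->name == 0) {
        /* fbos[i] == 0: use the fbo captured by the state object. */
        return state->fbo.name;
    }
    if (passed->configuration == state->fbo.configuration) {
        /* Compatible configuration: attachments and sizes may differ. */
        return passed->name;
    }
    /* Only compatible fbo configurations can be used. */
    return 0;
}
```

This shows why one state object can be reused across several render targets: any fbo with the same configuration can be substituted, while an fbo with, for instance, a different attachment format is rejected.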
| |
| Add a new subsection 10.X.4 (Command Lists) |
| |
| A list of DrawCommandsStates* commands may be compiled into a command |
| list, for further optimization and efficient reuse. The name space for |
| command lists is the unsigned integers, with zero reserved. The command: |
| |
| void CreateCommandListsNV(sizei n, uint *lists); |
| |
| returns <n> previously unused command list names in <lists>, and creates |
| a command list in the initial state for each name. |
| |
| Command lists are deleted by calling |
| |
| void DeleteCommandListsNV(sizei n, const uint *lists); |
| |
| <lists> contains <n> names of command lists to be deleted. Once a command |
| list is deleted it has no contents and its name is again unused. Unused |
| names in <lists> are silently ignored, as is the value zero. |
| |
| The command |
| |
| void CommandListSegmentsNV(uint list, uint segments); |
| |
| indicates that <list> will have <segments> number of segments, each |
| of which is a list of command sequences that it enqueues. This must be |
| called before any commands are enqueued. In the initial state, a command |
| list has a single segment. |
| |
| A command list's initial state allows it to enqueue commands, but not to |
| be executed. The following command can be enqueued: |
| |
| void ListDrawCommandsStatesClientNV(uint list, uint segment, const void** indirects, |
| const sizei* sizes, const uint* states, const uint* fbos, |
| uint count); |
| |
| A list has multiple segments and each segment enqueues an ordered list of |
| command sequences. This command enqueues the equivalent of the DrawCommandsStatesNV |
| commands into the list indicated by <list> on the segment indicated by <segment> |
| except that the sequence data is copied from the sequences pointed to by the <indirects> |
| pointer. The <indirects> pointer should point to a list of size <count> of pointers, |
| each of which should point to a command sequence. |
| |
| The pre-validated state from <states> is saved into the command list, rather |
| than a reference to the state object (i.e. the state objects or fbos could be |
| deleted and the command list would be unaffected). This includes native |
| GPU addresses for all textures indirectly referenced through the fbos |
| passed or state objects' fbos attachments, therefore a recompile of the command list |
| is required if such referenced textures change their allocation (for example |
| due to resizing), as well as explicit management of the residency of |
| the textures prior to calling CallCommandListNV. |
| |
| ListDrawCommandsStatesClientNV performs a by-value copy of the |
| indirect data based on the provided client-side pointers. In this case |
| the content is fully immutable, while the buffer-based versions can |
| change the content of the buffers at any later time. |
| |
| The command |
| |
| void CompileCommandListNV(uint list); |
| |
| makes the list indicated by <list> switch from allowing collection of |
| commands to allowing its execution. At this time, the implementation may |
| generate optimized commands to transition between states as efficiently |
| as possible. Lists may be executed with the command |
| |
| void CallCommandListNV(uint list); |
| |
| This executes the command list indicated by <list>, which operates as if |
| the DrawCommandsStates* commands were replayed in the order they were |
| enqueued on each segment, starting from segment zero and proceeding to the |
| maximum segment. All buffer or texture resources' residency must be |
| managed explicitly, including texture attachments of the effective |
| fbos during list enqueuing. |
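
The life cycle of a command list described in this section (set the segment count, enqueue, compile, then call) can be summarised as a small state machine. The C sketch below is a CPU-side model only and makes no GL calls. All names in it (`CommandListModel`, `list_create`, `list_set_segments`, `list_enqueue`, `list_compile`, `list_call`) are hypothetical. It also assumes that out-of-order calls are simply rejected and that compilation is one-way; the text above does not state which errors an implementation generates in those cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { LIST_RECORDING, LIST_COMPILED } ListPhase;

typedef struct {
    ListPhase phase;
    uint32_t  segments; /* number of segments, one in the initial state */
    uint32_t  enqueued; /* total command sequences enqueued so far      */
} CommandListModel;

/* Initial state: a single segment, enqueuing allowed, not executable. */
CommandListModel list_create(void)
{
    CommandListModel l = { LIST_RECORDING, 1, 0 };
    return l;
}

/* Models CommandListSegmentsNV: must precede any enqueued commands. */
int list_set_segments(CommandListModel *l, uint32_t segments)
{
    assert(l != NULL);
    if (l->phase != LIST_RECORDING || l->enqueued != 0 || segments == 0)
        return 0;
    l->segments = segments;
    return 1;
}

/* Models ListDrawCommandsStatesClientNV: <count> sequences are copied
   by value into <segment>. */
int list_enqueue(CommandListModel *l, uint32_t segment, uint32_t count)
{
    assert(l != NULL);
    if (l->phase != LIST_RECORDING || segment >= l->segments)
        return 0;
    l->enqueued += count;
    return 1;
}

/* Models CompileCommandListNV: switches from collection to execution. */
int list_compile(CommandListModel *l)
{
    assert(l != NULL);
    if (l->phase != LIST_RECORDING)
        return 0;
    l->phase = LIST_COMPILED;
    return 1;
}

/* Models CallCommandListNV: only a compiled list can be executed. */
int list_call(const CommandListModel *l)
{
    assert(l != NULL);
    return l->phase == LIST_COMPILED;
}
```

A typical sequence in this model is `list_create`, `list_set_segments`, one or more `list_enqueue` calls per segment, `list_compile`, and then `list_call` once per frame.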
| |
| |
| Modifications to the OpenGL Shading Language Specification, Version 4.40 |
| |
| Including the following line in a shader can be used to control the |
| language features described in this extension: |
| |
| #extension GL_NV_command_list : <behavior> |
| |
| where <behavior> is as specified in section 3.3. |
| |
| New preprocessor #defines are added to the OpenGL Shading Language: |
| |
| #define GL_NV_command_list 1 |
| |
| |
| Modify Section 4.4.5, "Uniform and Shader Storage Block Layout Qualifiers" |
| |
| (modify first paragraph, p.78) Layout qualifiers can be used for uniform |
| and shader storage blocks, but not for non-block uniform declarations. |
| The layout qualifier identifiers (and shared keyword) for uniform and |
| shader storage blocks are |
| |
| layout-qualifier-id |
| shared |
| packed |
| std140 |
| std430 |
| row_major |
| column_major |
| binding = integer-constant-expression |
| offset = integer-constant-expression |
| align = integer-constant-expression |
| commandBindableNV |
| |
| (add paragraph prior to "When multiple arguments", p. 80) |
| The commandBindableNV qualifier enables the associated uniform block |
| to be updated via UniformAddressCommandNVs when executing |
| DrawCommandsStatesNV. When commandBindableNV is enabled the <binding> |
| identifier must be provided for each block; only this declared value |
| corresponds to the index field of a UniformAddressCommandNV. |
| A link time error will be generated if an index is greater than or equal to |
| MAX_PROGRAM_PARAMETER_BUFFER_BINDINGS_NV. |
| Changing the binding point by the OpenGL API may not influence this |
| associated index value and may cause UniformAddressCommandNVs to have |
| undefined behavior. |
| |
| Dependencies on OpenGL 4.4 (Core Profile) |
| |
| If only the core profile of OpenGL 4.4 is supported, references to |
| functionality deprecated by OpenGL 3.0 (built-in input/output/uniform variables |
| corresponding to fixed-function vertex attributes, fixed-function |
| vertex and fragment processing) should be removed and/or replaced with |
| functionality supported in the core profile. In such an environment, the |
| QUADS primitive type is not supported by the StateCaptureNV function. StateCaptureNV will |
| also ignore all references to deprecated state such as line stippling. |
| The ALPHA_REF_COMMAND_NV token is not allowed to be used, therefore GetCommandHeaderNV will |
| generate an error if this token enum is passed. |
| |
| Interactions with NV_shader_buffer_load |
| |
| The GPU addresses used in ELEMENT_ADDRESS_COMMAND_NV, |
| ATTRIBUTE_ADDRESS_COMMAND_NV and UNIFORM_ADDRESS_COMMAND_NV |
| can be queried via the API provided by NV_shader_buffer_load. Furthermore, |
| the same API must be used to ensure residency of such buffers |
| when draw commands using such addresses are issued. |
| |
| Interactions with NV_bindless_texture or ARB_bindless_texture |
| |
| Residency of fbo attachment textures referenced in state objects |
| or command lists must be managed explicitly using the API provided |
| by either of these extensions. |
| |
| Interactions with NV_parameter_buffer_object |
| |
| The UNIFORM_ADDRESS_COMMAND_NV described in (Drawing with Commands) will affect |
| the PROGRAM_PARAMETER_BUFFER of the target stage defined within the command |
| token. |
| |
| Interactions with ARB_robust_buffer_access_behavior |
| |
| The buffer setups performed by ELEMENT_ADDRESS_COMMAND_NV, |
| ATTRIBUTE_ADDRESS_COMMAND_NV and UNIFORM_ADDRESS_COMMAND_NV |
| do not provide the required buffer ranges for robust buffer |
| access. Therefore draw calls executed under this type of |
| buffer setup will not respect the robust buffer access rules. |
| |
| Interactions with ARB_shader_draw_parameters |
| |
| The drawing operations performed through this extension will not support |
| setting of the built-in GLSL values that were added by |
| ARB_shader_draw_parameters (gl_BaseInstanceARB, gl_BaseVertexARB, gl_DrawIDARB). |
| Accessing these variables will result in undefined values. |
| |
| Additions to the AGL/GLX/WGL Specifications |
| |
| None. |
| |
| GLX Protocol |
| |
| None. |
| |
| Errors |
| |
| |
| New State |
| |
| None. |
| |
| Issues |
| |
| 1) What motivates the design? |
| |
| The primary goal is to be able to reuse pre-validated command buffers. Other |
| APIs and proposals have addressed this with various incarnations of command |
| lists or state objects, but a recurring problem is that interactions between |
| various stages of the pipeline prevent this prevalidation and reuse. These |
| interactions are often hardware-specific (and differ from vendor to vendor |
| or even generation to generation) and new interactions are introduced by |
| new features that were not imagined when the prevalidation scheme was |
| proposed. |
| |
| We attempt to address this by having a monolithic state object that |
| encompasses (almost) the entire state of the pipeline. This should provide |
| enough information for all implementations to do any needed cross- |
| validation. We try to create these in a way that minimizes the new API |
| footprint - since we want ALL state (including any added in the future), we |
| just capture it from the current state of the context. |
| |
| We expect that a captured state object will be represented as a list of |
| commands to send to the GPU. While that list of commands may be fairly |
| large, it is also well-suited to filtering redundant changes when switching |
| from one state object to another (filtering may occur on the GPU, or by |
| some processing on the CPU). We anticipate that filtering will be applied |
| when compiling a command list, but it is likely that some (perhaps less |
| aggressive) filtering will also occur in unlisted DrawCommandsStates |
| commands. |
| |
| 2) Should binding state be captured? |
| |
| Binding state should not be captured, for multiple reasons. |
| |
| The memory management performed by the driver as part of legacy command |
| execution is expensive and not well-suited for the prevalidation of |
| commands. This can be replaced by explicit bindless memory management |
| APIs (e.g. Make*Resident). |
| |
| Resource bindings also require behind-the-scenes management of internal |
| GPU structures like texture handles. Again, this can be replaced by the |
| bindless APIs. |
| |
| 3) What FBO state should be captured? |
| |
| We definitely want to capture enough information to be able to do any |
| state-based recompiles of the fragment shader, which would include |
| drawbuffer state and format state. However, it is not desirable to have |
| all properties of the FBO be captured, e.g. if attachment width/height |
| were captured then state objects could become invalid if the window shape |
| changed. |
| |
| RESOLVED: state objects reference the FBO configuration, but passing |
| other compatible FBOs during rendering is possible. Furthermore the |
| VIEWPORT_COMMAND_NV allows setting the appropriate viewport state. |
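
As an illustration of why attachment dimensions need not be captured, the sketch below refreshes a viewport token after a window resize while the state object stays untouched. It assumes the ViewportCommandNV layout from this extension's token definitions (header, x, y, width, height, all 32-bit unsigned). The helper set_viewport_token is hypothetical, and the header is passed in rather than queried with GetCommandHeaderNV so that no GL context is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Token layout assumed from this extension's VIEWPORT_COMMAND_NV definition. */
typedef struct {
    uint32_t header; /* implementation-defined, queried via GetCommandHeaderNV */
    uint32_t x;
    uint32_t y;
    uint32_t width;
    uint32_t height;
} ViewportCommandNV;

/* Hypothetical helper: rewrites an existing viewport token so that it
 * covers a framebuffer of the given size, starting at the origin. */
static void set_viewport_token(ViewportCommandNV *token, uint32_t header,
                               uint32_t width, uint32_t height)
{
    assert(token != NULL);
    token->header = header;
    token->x      = 0;
    token->y      = 0;
    token->width  = width;
    token->height = height;
}
```

Only the token stream is rewritten on resize; the captured state objects remain valid as long as the new FBO is compatible.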
| |
| 4) Can UBOs be accessed? How? |
| |
| RESOLVED: We want to encourage the "first level of the scene graph" information read |
| by shaders to be accessed with fast UBO memory accesses. |
| UNIFORM_ADDRESS_COMMAND_NV provides this mechanism. |
| |
| 5) What about Compute? |
| |
| Compute does not have the same complex state interactions that the graphics |
| pipeline has, so it is not included in this extension. |
| |
| 6) What dynamic state should be allowed? |
| |
| There are some state values which are pretty much raw integer/floating |
| point data, where requiring a unique state object for each value would |
| drastically bloat the number of state objects needed and break batching. |
| We allow for a few such values to be set in the token command buffer |
| rather than in the state object. The current list is motivated by similar |
| state in other APIs, and may not be complete. |
| |
| 7) What are the "segments" in command lists? |
| |
| These are multiple "starting points" for appending commands to the list, |
| which are ultimately replayed segment by segment, in order. This may be useful to |
| build a multipass rendering algorithm with only a single traversal of the |
| scene graph. |
| |
| 8) When are state objects consumed into the list? |
| |
| This could either occur as the command is appended to the list, or during |
| CompileCommandListNV. |
| |
| RESOLVED: At ListDrawCommandsStatesClientNV time. |
| |
| 9) Do we want to have multiple modes in the same dispatch ? |
| |
| RESOLVED: yes, state-objects with different modes can be used, allowing |
| fast transitioning between those. Furthermore, it is possible to mix |
| LINES/LINE_STRIP/LINE_LOOP or TRIANGLES/TRIANGLE_STRIP/TRIANGLE_FAN and others |
| using the same state object, as long as their base primitive mode is the same. |
| |
| 10) Do we want to allow mixing DrawArrays and DrawElements in the same |
| dispatch ? |
| |
| RESOLVED: yes. |
| |
| 11) What happens if the token buffer is modified while it is being dispatched ? |
| |
| RESOLVED: there is no guarantee of coherency, so undefined behavior. |
| |
| 12) I would like to change states in the middle; how do I do this ? |
| |
| RESOLVED: you can select a new state object or state tokens, but you cannot change |
| state in the indirect buffer itself. |
| |
| 13) Is the token buffer multithread safe; does it scale ? |
| |
| RESOLVED: yes. It is trivial to allocate a token buffer per thread, and then submit |
| them sequentially in the main thread. Since the implementation is not involved |
| when the application writes to them, the only thread safety requirements are in |
| the application itself. |
| Command lists and state objects are, however, currently not shareable across contexts. |
| Because rendering is much more efficient now, the main dispatching thread can |
| spend the time on preparing state objects prior to drawing. The cost of glStateCaptureNV |
| is no worse than a classic API draw call, and thanks to temporal coherence few |
| states are "new" from frame to frame, so cached states can be reused. |
| |
| 14) Can I reuse token buffer multiple times ? |
| |
| RESOLVED: yes. |
| |
| 15) Should we use a fixed length decoding or at the very least a size in the header ? |
| |
| RESOLVED: fixed length is used. As a basic consistency check, the size is also passed to header generation. |
| The NOP command can be used to pad structures to custom sizes. |
| |
| 16) Can I do buffer updates in a single DrawCommands call ? |
| |
| RESOLVED: NO. |
| Updating memory in general requires synchronization, and having lots of |
| updates inside a single DrawCommands would become a performance bottleneck. |
| |
| 17) I want to implement some occlusion scheme and skip some of the draws; how do I do this ? |
| |
| RESOLVED: this extension does not offer a conditional render facility, but this can be |
| implemented by using NOP or preferably TERMINATE_SEQUENCE commands in the stream. |
| |
| 18) I want to implement some level of detail scheme; is that possible ? |
| |
| RESOLVED: you can use NOP or TERMINATE_SEQUENCE to skip the levels of detail that you don't want to draw. |
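
The NOP-based skipping mentioned in issues 17 and 18 can be sketched as follows. The sketch assumes that a NOP token consists of a single 32-bit header, so a larger token is disabled by overwriting each of its 32-bit words with that header. The function skip_token_with_nops and its parameters are hypothetical, and the NOP header value is passed in because the real value is implementation-defined and would be queried with GetCommandHeaderNV. The stream here is an ordinary CPU-side array; an application would apply the same write to the buffer that holds its tokens.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: disables one token in a command stream by
 * overwriting all of its 32-bit words with NOP headers.
 *
 * stream       token stream viewed as 32-bit words
 * stream_words total number of words in the stream
 * word_offset  first word of the token to disable
 * token_words  size of that token in words (e.g. 4 for a 16-byte token)
 * nop_header   header value for NOP tokens (implementation-defined)
 *
 * Returns the number of NOP tokens written. */
static size_t skip_token_with_nops(uint32_t *stream, size_t stream_words,
                                   size_t word_offset, size_t token_words,
                                   uint32_t nop_header)
{
    assert(stream != NULL);
    assert(word_offset <= stream_words);
    assert(token_words <= stream_words - word_offset);

    for (size_t i = 0; i < token_words; ++i) {
        stream[word_offset + i] = nop_header;
    }
    return token_words;
}
```

The original token contents are lost by this write, so an application that wants to re-enable the draw later would need to keep a copy of the token elsewhere.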
| |
| 19) Why can't I just get a token to change the state, and avoid specifying lists of |
| state and indirect buffers ? |
| |
| RESOLVED: Getting a token to specify a state switch would imply that the application |
| has access to a virtual address of state changes. This would potentially open a security |
| issue, since part of the validation may involve complex sequences of programming. |
| |
| 20) Instead of void** which means all commands must be stored in one buffer, could GLuint64** be used |
| when EnableClientState(DRAW_INDIRECT_UNIFIED_NV) is set? This would allow managing different command |
| buffers independently. |
| |
| RESOLVED: separate Address command added |
| |
| 21) How big can each indirect command list's buffer size be? |
| |
| RESOLVED: no limit required. |
| |
| 22) How to retrieve the "index" within UniformAddressCommandNV, or is that the GL binding point? |
| |
| RESOLVED: added commandBindableNV layout qualifier in GLSL for uniform blocks to ensure fixed binding unit. |
| Also added stage value to command. |
| |
| 23) In what condition is state that was modified by tokens left after the dispatch call? |
| |
| RESOLVED: state is reset. |
| |
| 24) What does working with this extension look like? |
| |
| You will find related samples at https://github.com/nvpro-samples |
| |
| 25) How can I use textures, images, shader storage or atomic counter buffers in combination with state objects? |
| |
| Textures and images are covered via NV/ARB_bindless_texture, you can store their handles inside uniform buffers. |
| Shader storage and atomic counter buffers are currently not directly exposed, however NV_gpu_shader5 allows |
| storing pointers to such buffers inside uniform buffers as well. Atomic counters can be replaced by regular |
| atomic increments. |
| |
| Alternatively, use DrawCommandsNV or DrawCommandsAddressNV, which do support GLSL programs with these |
| resource bindings, as well as default-block uniforms. |
| |
| |
| Revision History |
| |
| Rev. Date Author Changes |
| ---- -------- -------- ----------------------------------------- |
| 6 11/3/2015 ckubisch Rephrase what stateobjects capture and what not |
| 5 8/17/2015 ckubisch correct errors for DrawCommandsNV and DrawCommandsAddressNV |
| rendering to default framebuffer is not allowed. Clarify |
| which state is inherited (updated Issue 25). |
| 4 6/18/2015 ckubisch Add missing interaction with ARB_shader_draw_parameters |
| 3 5/27/2015 jemmons Multiple minor fixes and clarifications |
| 2 4/16/2015 pboudier Fix incorrect type (size_t is now sizei) in ListDrawCommandsStatesClientNV |
| 1 pboudier concept |
| jbolz base spec |
| ckubisch detailed spec |
| mjk Internal revisions |