Name Strings
Pat Brown, NVIDIA Corporation (pbrown 'at'
Last Modified Date: February 11, 2014
NVIDIA Revision: 2
OpenGL 4.0 (Core or Compatibility Profile) is required.
This extension is written against the OpenGL 4.2 Specification
(Compatibility Profile).
NV_gpu_program4 and NV_gpu_program5 are required.
ARB_shader_storage_buffer_object is required.
This specification interacts with NV_shader_atomic_float.
This extension provides assembly language support for shader storage
buffers (from the ARB_shader_storage_buffer_object extension) for all
program types supported by NV_gpu_program5, including compute programs
added by the NV_compute_program5 extension.
Assembly programs using this extension can read and write to the memory of
buffer objects bound to the binding points provided by
ARB_shader_storage_buffer_object.
New Procedures and Functions
New Tokens
Additions to Chapter 2 of the OpenGL 4.2 (Compatibility Profile) Specification
(OpenGL Operation)
(All modifications are relative to Section 2.X, GPU Programs, from the
NV_gpu_program4 specification, as modified by NV_gpu_program5.)
Modify Section 2.X.2, Program Grammar
(add the following grammar rules to the NV_gpu_program5 base grammar for
shader storage buffers)
<MemInstruction>      ::= <ATOMBop_instruction>
                        | <LDBop_instruction>
                        | <STBop_instruction>
<ATOMBop_instruction> ::= "ATOMB" <opModifiers> <instResult> ","
                          <instOperandV> "," <storageUseV>
<LDBop_instruction>   ::= "LDB" <opModifiers> <instResult> ","
                          <storageUseV>
<STBop_instruction>   ::= "STB" <opModifiers> <instOperandV> ","
                          <storageUseV>
<namingStatement>     ::= <STORAGE_statement>
<STORAGE_statement>   ::= "STORAGE" <establishName> <storageSingleInit>
                        | "STORAGE" <establishName> <optArraySize>
                          <storageMultipleInit>
<storageSingleInit>   ::= "=" <storageUseDS>
<storageMultipleInit> ::= "=" "{" <storageItemList> "}"
<storageItemList>     ::= <storageUseDM>
                        | <storageUseDM> "," <storageItemList>
<programSingleItem>   ::= <progStorLenParam>
<programMultipleItem> ::= <progStorLenParams>
<progStorLenParams>   ::= "program" "." "storagelen" <arrayMemAbs>
                        | "program" "." "storagelen" <arrayRange>
<progStorLenParam>    ::= "program" "." "storagelen" <arrayMemAbs>
<storageUseV>         ::= <storageVarName> <optArrayMem>
                        | <storageVarName> <arrayMem> <optArrayMem>
<storageUseDS>        ::= <storageBinding> <arrayMemAbs>
<storageUseDM>        ::= <storageBinding> <arrayMemAbs>
                        | <storageBinding> <arrayRange>
                        | <storageBinding>
<storageBinding>      ::= "program" "." "storage" <arrayMemAbs>
                        | "program" "." "storage" <arrayRange>
Modify Section 2.X.3.3, Program Parameters
Shader Storage Buffer Property Bindings
Binding                    Components  Underlying State
-------------------------  ----------  -------------------------------
program.storagelen[a]      (x,-,-,-)   program storage buffer a,
                                       variable-size array length
program.storagelen[a..b]   (x,-,-,-)   program storage buffer a..b,
                                       variable-size array length
If a program parameter binding matches "program.storagelen[a]", the "x"
component of the program parameter variable is filled with the unsized array
length associated with the shader storage buffer binding point <a>. If
the program is generated internally by the implementation when compiling a
GLSL shader, this length is the number of elements that can be stored in
an unsized array at the end of the associated GLSL shader storage block.
If there is no shader block associated with shader storage buffer binding
point <a>, or if the associated block does not end with an unsized array,
the "x" component will hold the integer value zero. If the program is an
assembly program specified via ProgramStringARB or LoadProgramNV, the "x"
component will hold the integer value zero. The "y", "z", and "w"
components of the variable are undefined.
Additionally, for program parameter array bindings,
"program.storagelen[a..b]" is equivalent to specifying unsized array
lengths for storage buffer binding points <a> through <b>, in order. A
program using any of these bindings will fail to load if <a> is greater
than <b>.
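As a rough illustration of the length reported in the "x" component for a
program generated from a GLSL shader, the C sketch below counts how many
whole elements of a trailing unsized array fit in the bound buffer range.
The function name and its parameters (bound_size, array_offset,
array_stride) are hypothetical and are not defined by this extension; the
arithmetic assumes the usual GLSL rule for the length of an unsized array at
the end of a storage block.

```c
#include <stddef.h>

/* Hypothetical helper, not part of this extension: number of whole elements
 * of a trailing unsized array that fit in the bound buffer range.
 * All sizes are expressed in basic machine units. */
size_t unsized_array_length(size_t bound_size, size_t array_offset,
                            size_t array_stride)
{
    /* No room for even one element, or a degenerate stride. */
    if (array_stride == 0 || bound_size <= array_offset)
        return 0;
    return (bound_size - array_offset) / array_stride;
}
```

For an assembly program specified via ProgramStringARB or LoadProgramNV no
such computation applies, since the "x" component is always zero.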
Add New Section 2.X.3.Y, Shader Storage Buffers, after Section 2.X.3.6,
Program Parameter Buffers
Shader storage buffers are arrays of basic machine units from which data
can be read or written using the LDB and STB instructions. Shader storage
buffers also support atomic memory operations using the ATOMB instruction.
The GL provides an implementation-dependent number of shader storage
buffer binding points to which buffer objects can be attached. Shader
storage buffer contents can be changed by updating the contents of a
bound buffer object, by changing the buffer object attached to a binding
point, or by using the ATOMB or STB instructions in a shader to modify the
contents of buffer object memory.
Shader storage buffer bindings are established by calling BindBufferBase,
BindBufferOffsetEXT, or BindBufferRange with a target of
SHADER_STORAGE_BUFFER, as documented in the
ARB_shader_storage_buffer_object extension. The number of shader storage
buffer binding points is given by the value of the constant
MAX_SHADER_STORAGE_BUFFER_BINDINGS. There is a limit on the maximum number
of basic machine units in a buffer object that can be accessed using any
single shader storage buffer binding point, given by the
implementation-dependent constant MAX_SHADER_STORAGE_BLOCK_SIZE.
Buffer objects larger than this size may be used, but the results of
accessing portions of the buffer object beyond the limit are undefined.
Shader storage buffer variables may only be used as operands in the ATOMB,
LDB, and STB instructions; they may not be used as results or
operands in general instructions. Shader storage buffer variables must be
declared explicitly via the <STORAGE_statement> grammar rule. Shader
storage buffer bindings can not be used directly in executable
instructions.
Shader storage buffer variables may be declared as arrays, but all
bindings assigned to the array must use the same binding point(s) and must
increase consecutively.
In explicit shader storage variable declarations, the bindings in Table
X.2 starting with "program.storage[a..b]" may be used, indicating that the
variable spans multiple buffer binding points. Such variables must be
accessed as arrays, with the first index specifying an offset into the
range of buffer object binding points. A buffer index of zero identifies
binding point <a>; an index of <b>-<a> identifies binding point <b>. If
such a variable is declared as an array, a second index must be provided
to identify the individual array element. A program will fail to compile
if such bindings are used when <a> or <b> is negative or greater than or
equal to the number of buffer binding points supported for the program
type, or if <a> is greater than <b>.
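The range rules above can be sketched in C as follows. This is only an
illustration: both function names and the num_binding_points parameter are
hypothetical. The sketch takes buffer index 0 as binding point <a>, so that
binding point <b> is reached at index <b> - <a>.

```c
#include <stdbool.h>

/* Hypothetical check mirroring the compile-time rule for a range
 * declaration [a..b]: neither end may be negative, both ends must be below
 * the number of binding points for the program type, and a <= b. */
bool storage_range_is_valid(int a, int b, int num_binding_points)
{
    if (a < 0 || b < 0)
        return false;
    if (a >= num_binding_points || b >= num_binding_points)
        return false;
    return a <= b;
}

/* Maps the first (buffer) index of such a variable to a binding point.
 * Index 0 selects <a>; returns -1 if the index falls outside the range. */
int storage_binding_for_index(int a, int b, int index)
{
    if (index < 0 || index > b - a)
        return -1;
    return a + index;
}
```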
Binding                        Components  Underlying State
-----------------------------  ----------  -----------------------------
program.storage[a][c]          (x,x,x,x)   shader storage buffer a,
                                           element c
program.storage[a][c..d]       (x,x,x,x)   shader storage buffer a,
                                           elements c through d
program.storage[a]             (x,x,x,x)   shader storage buffer a,
                                           all elements
program.storage[a..b][c]       (x,x,x,x)   shader storage buffers a
                                           through b, element c
program.storage[a..b][c..d]    (x,x,x,x)   shader storage buffers a
                                           through b, elements c
                                           through d
program.storage[a..b]          (x,x,x,x)   shader storage buffers a
                                           through b, all elements
Table X.2: Shader Storage Bindings. <a> and <b> indicate buffer
numbers, <c> and <d> indicate individual elements.
If a shader storage buffer binding matches "program.storage[a][c]", the
shader storage buffer variable is associated with element <c> of the
buffer object bound to binding point <a>. If no buffer object is bound to
binding point <a>, or the bound buffer object is not large enough to hold
an element <c>, reads from the binding return zero and writes to the
binding have no effect. The binding point <a> must be a nonnegative
integer constant.
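The out-of-range behavior described above (reads return zero, writes have no
effect) can be modeled with a small C sketch. The BoundBuffer type and both
functions are hypothetical stand-ins for illustration only; a NULL data
pointer models a binding point with no buffer object bound, and elements are
assumed to be 32-bit values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the buffer object bound to one binding point.
 * data == NULL models "no buffer object bound". */
typedef struct {
    uint32_t *data;
    size_t    count;   /* number of 32-bit elements the buffer can hold */
} BoundBuffer;

/* Reads element <c>; yields zero if nothing is bound or the buffer is too
 * small to hold that element. */
uint32_t checked_load(const BoundBuffer *buf, size_t element)
{
    if (buf->data == NULL || element >= buf->count)
        return 0;
    return buf->data[element];
}

/* Writes element <c>; silently does nothing in the same two cases. */
void checked_store(BoundBuffer *buf, size_t element, uint32_t value)
{
    if (buf->data == NULL || element >= buf->count)
        return;
    buf->data[element] = value;
}
```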
For shader storage array declarations, "program.storage[a][c..d]" is
equivalent to specifying elements <c> through <d> of the buffer object
bound to binding point <a> in order.
For shader storage array declarations, "program.storage[a]" is equivalent
to specifying the entire buffer -- elements 0 through <n>-1, where <n> is
either the size of the array (if declared) or the implementation-dependent
maximum shader storage buffer object size limit (if no size is declared).
When bindings beginning with "program.storage[a..b]" are used in a
variable declaration, they behave identically to the corresponding bindings
beginning with "program.storage[a]", except that the variable is filled
with a separate set of values for each buffer binding point from <a> to
<b>.
Modify Section 2.X.4, Program Execution Environment
(add to the opcode table)
            Modifiers
Instruction F I C S H D  Out  Inputs    Description
----------- - - - - - -  ---  --------  ------------------------------------
ATOMB       - - X - - -  s    v,su      atomic transaction to storage buffer
LDB         - - X X - F  v    su        load from storage buffer
STB         - - - - - -  -    v,su      store to storage buffer
Modify Section 2.X.4.5, Program Memory Access, from NV_gpu_program5
(modify first paragraph)
Programs may load from or store to buffer object memory via the ATOM
(atomic global memory operation), ATOMB (atomic storage buffer memory
operation), LDB (load from storage buffer), LDC (load constant), LOAD
(global load), STB (store to storage buffer), and STORE (global store)
instructions.
(Add to "Section 2.X.6, Program Options" of the NV_gpu_program4 extension,
as extended by NV_gpu_program5:)
+ Shader Storage Buffer Operations (NV_shader_storage_buffer)
If a program (of any type, including compute programs) specifies the
"NV_shader_storage_buffer" option, it may use the "ATOMB", "LDB", and
"STB" opcodes to perform atomic memory operations, loads, and stores to
shader storage buffers, and "STORAGE" to declare shader storage buffer
variables. If the option is not specified, a program will fail to load if
it contains "ATOMB", "LDB", or "STB" opcodes, or "STORAGE" declarations.
(add the following subsection to section 2.X.8, Program Instruction Set.)
Section 2.X.8.Z, ATOMB: Atomic Memory Operation (Storage Buffer Memory)
The ATOMB instruction performs an atomic memory operation by reading from
shader storage buffer memory specified by the second operand, computing a
new value based on the value read from memory and the first (vector)
operand, and then writing the result back to the same memory address. The
memory transaction is atomic, guaranteeing that no other write to the
memory accessed will occur between the time it is read and written by the
ATOMB instruction. The result of the ATOMB instruction is the scalar
value read from memory. The second operand used for the ATOMB instruction
must correspond to a shader storage variable declared using the "STORAGE"
statement; a program will fail to load if any other type of operand is
used for the second operand of an ATOMB instruction.
The ATOMB instruction has two required instruction modifiers. The atomic
modifier specifies the type of operation to be performed. The storage
modifier specifies the size and data type of the operand read from memory
and the base data type of the operation used to compute the value to be
written to memory.
atomic     storage
modifier   modifiers            operation
--------   ------------------   --------------------------------------
ADD        U32, S32, U64, F32   compute a sum
MIN        U32, S32             compute minimum
MAX        U32, S32             compute maximum
IWRAP      U32                  increment memory, wrapping at operand
DWRAP      U32                  decrement memory, wrapping at operand
AND        U32, S32             compute bit-wise AND
OR         U32, S32             compute bit-wise OR
XOR        U32, S32             compute bit-wise XOR
EXCH       U32, S32, U64, F32   exchange memory with operand
CSWAP      U32, S32, U64        compare-and-swap
Table X.Y, Supported atomic and storage modifiers for the ATOMB instruction.
Not all storage modifiers are supported by ATOMB, and the set of modifiers
allowed for any given instruction depends on the atomic modifier
specified. Table X.Y enumerates the set of atomic modifiers supported by
the ATOMB instruction, and the storage modifiers allowed for each.
tmp0 = VectorLoad(op0);
result = BufferMemoryLoad(op1, storageModifier);
switch (atomicModifier) {
case ADD:
  writeval = tmp0.x + result;
  break;
case MIN:
  writeval = min(tmp0.x, result);
  break;
case MAX:
  writeval = max(tmp0.x, result);
  break;
case IWRAP:
  writeval = (result >= tmp0.x) ? 0 : result+1;
  break;
case DWRAP:
  writeval = (result == 0 || result > tmp0.x) ? tmp0.x : result-1;
  break;
case AND:
  writeval = tmp0.x & result;
  break;
case OR:
  writeval = tmp0.x | result;
  break;
case XOR:
  writeval = tmp0.x ^ result;
  break;
case EXCH:
  writeval = tmp0.x;  // exchange memory with operand (per Table X.Y)
  break;
case CSWAP:
  if (result == tmp0.x) {
    writeval = tmp0.y;
  } else {
    return result;  // no memory store
  }
  break;
}
BufferMemoryStore(op1, writeval, storageModifier);
ATOMB performs a scalar atomic operation. The <y>, <z>, and <w>
components of the result vector are undefined.
ATOMB supports no base data type modifiers, but requires exactly one
storage modifier. The base data types of the result vector and the first
(vector) operand are derived from the storage modifier.
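To make the per-operation arithmetic of ATOMB concrete, the following C
sketch models the U32 case on a single word of memory. It is not an
implementation of the instruction: the AtomicOp enum and the
atomb_u32_model function are hypothetical names, the function is not
thread-safe and only mirrors the value computation, and the EXCH case
assumes the "exchange memory with operand" behavior listed in Table X.Y.

```c
#include <stdint.h>

typedef enum {
    ATOMIC_ADD, ATOMIC_MIN, ATOMIC_MAX, ATOMIC_IWRAP, ATOMIC_DWRAP,
    ATOMIC_AND, ATOMIC_OR, ATOMIC_XOR, ATOMIC_EXCH, ATOMIC_CSWAP
} AtomicOp;

/* Applies one operation to *mem and returns the value previously stored,
 * matching the scalar result of ATOMB. x and y stand for the .x and .y
 * components of the first operand; y is only used by CSWAP. */
uint32_t atomb_u32_model(AtomicOp op, uint32_t *mem, uint32_t x, uint32_t y)
{
    uint32_t old = *mem;
    uint32_t writeval = old;

    switch (op) {
    case ATOMIC_ADD:   writeval = x + old;                          break;
    case ATOMIC_MIN:   writeval = (x < old) ? x : old;              break;
    case ATOMIC_MAX:   writeval = (x > old) ? x : old;              break;
    case ATOMIC_IWRAP: writeval = (old >= x) ? 0 : old + 1;         break;
    case ATOMIC_DWRAP: writeval = (old == 0 || old > x) ? x : old - 1; break;
    case ATOMIC_AND:   writeval = x & old;                          break;
    case ATOMIC_OR:    writeval = x | old;                          break;
    case ATOMIC_XOR:   writeval = x ^ old;                          break;
    case ATOMIC_EXCH:  writeval = x;                                break;
    case ATOMIC_CSWAP:
        if (old != x)
            return old;        /* comparison failed: no memory store */
        writeval = y;
        break;
    }
    *mem = writeval;
    return old;
}
```

Note how IWRAP resets the counter to zero once the stored value reaches the
operand, while DWRAP reloads the operand when the stored value is zero or
already above it.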
Section 2.X.8.Z, LDB: Load from Storage Buffer Memory
The LDB instruction generates a result vector by fetching data from shader
storage buffer memory identified by the first operand, as described in
Section 2.X.4.5. The single operand for the LDB instruction must
correspond to a shader storage buffer variable declared using the
"STORAGE" statement; a program will fail to load if any other type of
operand is used in an LDB instruction.
result = BufferMemoryLoad(op0, storageModifier);
LDB supports no base data type modifiers, but requires exactly one storage
modifier. The base data type of the result vector is derived from the
storage modifier.
Section 2.X.8.Z, STB: Store to Storage Buffer Memory
The STB instruction writes the contents of the first vector operand to
shader storage buffer memory identified by the second operand, as
described in Section 2.X.4.5. This instruction generates no result. The
second operand for the STB instruction must correspond to a shader
storage buffer variable declared using the "STORAGE" statement; a program
will fail to load if any other type of operand is used in an STB
instruction.
tmp0 = VectorLoad(op0);
BufferMemoryStore(op1, tmp0, storageModifier);
STB supports no base data type modifiers, but requires exactly one storage
modifier. The base data type of the vector components of the first
operand is derived from the storage modifier.
Additions to Chapter 3 of the OpenGL 4.2 (Compatibility Profile) Specification
Additions to Chapter 4 of the OpenGL 4.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)
Additions to Chapter 5 of the OpenGL 4.2 (Compatibility Profile) Specification
(Special Functions)
Additions to Chapter 6 of the OpenGL 4.2 (Compatibility Profile) Specification
(State and State Requests)
Additions to the AGL/GLX/WGL Specifications
GLX Protocol
Dependencies on NV_shader_atomic_float
If NV_shader_atomic_float is not supported, the ADD and EXCH atomic
operations in the ATOMB instructions do not support the "F32" storage
modifier.
New State
None, but shares buffer bindings and other state with the
ARB_shader_storage_buffer_object extension.
New Implementation Dependent State
None, but shares implementation-dependent state with the
ARB_shader_storage_buffer_object extension.
Revision History
Revision 2, February 11, 2014 (pbrown)
- Fix typos in the descriptions of the "program.storage" bindings.
Revision 1, May 9, 2012 (pbrown)
- Internal spec development.