| Name |
| |
| NV_gpu_shader5 |
| |
| Name Strings |
| |
| GL_NV_gpu_shader5 |
| |
| Contact |
| |
| Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com) |
| |
| Contributors |
| |
| Barthold Lichtenbelt, NVIDIA |
| Chris Dodd, NVIDIA |
| Eric Werness, NVIDIA |
| Greg Roth, NVIDIA |
| Jeff Bolz, NVIDIA |
| Piers Daniell, NVIDIA |
| Daniel Rakos, AMD |
| Mathias Heyer, NVIDIA |
| |
| Status |
| |
| Shipping. |
| |
| Version |
| |
| Last Modified Date: 03/07/2017 |
| NVIDIA Revision: 11 |
| |
| Number |
| |
| OpenGL Extension #389 |
| OpenGL ES Extension #260 |
| |
| Dependencies |
| |
| This extension is written against the OpenGL 3.2 (Compatibility Profile) |
| Specification. |
| |
| This extension is written against version 1.50 (revision 09) of the OpenGL |
| Shading Language Specification. |
| |
| If implemented in OpenGL, OpenGL 3.2 and GLSL 1.50 are required. |
| |
| If implemented in OpenGL, ARB_gpu_shader5 is required. |
| |
| This extension interacts with ARB_gpu_shader5. |
| |
| This extension interacts with ARB_gpu_shader_fp64. |
| |
| This extension interacts with ARB_tessellation_shader. |
| |
| This extension interacts with NV_shader_buffer_load. |
| |
| This extension interacts with EXT_direct_state_access. |
| |
| This extension interacts with EXT_vertex_attrib_64bit and |
| NV_vertex_attrib_integer_64bit. |
| |
| This extension interacts with OpenGL ES 3.1 (dated October 29th 2014). |
| |
| This extension interacts with OpenGL ES Shading Language 3.1 (revision 3). |
| |
| If implemented in OpenGL ES, OpenGL ES 3.1 and GLSL ES 3.10 are required. |
| |
| If implemented in OpenGL ES, OES/EXT_gpu_shader5 and |
| EXT_shader_implicit_conversions are required. |
| |
| This extension interacts with OES/EXT_tessellation_shader. |
| |
| This extension interacts with OES/EXT_geometry_shader. |
| |
| Overview |
| |
| This extension provides a set of new features to the OpenGL Shading |
| Language and related APIs to support capabilities of new GPUs. Shaders |
| using the new functionality provided by this extension should enable this |
| functionality via the construct |
| |
| #extension GL_NV_gpu_shader5 : require (or enable) |
| |
| This extension was developed concurrently with the ARB_gpu_shader5 |
| extension, and provides a superset of the features provided there. The |
| features common to both extensions are documented in the ARB_gpu_shader5 |
| specification; this document describes only the additional language features |
| not available via ARB_gpu_shader5. A shader that enables this extension |
| via an #extension directive also implicitly enables the common |
| capabilities provided by ARB_gpu_shader5. |
| |
| In addition to the capabilities of ARB_gpu_shader5, this extension |
| provides a variety of new features for all shader types, including: |
| |
| * support for a full set of 8-, 16-, 32-, and 64-bit scalar and vector |
| data types, including uniform API, uniform buffer object, and shader |
| input and output support; |
| |
| * the ability to aggregate samplers into arrays, index these arrays with |
| arbitrary expressions, and not require that non-constant indices be |
| uniform across all shader invocations; |
| |
| * new built-in functions to pack and unpack 64-bit integer types into a |
| two-component 32-bit integer vector; |
| |
| * new built-in functions to pack and unpack 32-bit unsigned integer |
| types into a two-component 16-bit floating-point vector; |
| |
| * new built-in functions to convert double-precision floating-point |
| values to or from their 64-bit integer bit encodings; |
| |
| * new built-in functions to compute the composite of a set of boolean |
| conditions across a group of shader threads; |
| |
| * vector relational functions supporting comparisons of vectors of 8-, |
| 16-, and 64-bit integer types or 16-bit floating-point types; and |
| |
| * extending texel offset support to allow loading texel offsets from |
| regular integer operands computed at run-time, except for lookups with |
| gradients (textureGrad*). |
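| |
| As a rough host-side illustration of the 64-bit pack/unpack and bit-encoding |
| conversions listed above, the following C sketch reinterprets bits without |
| converting values. The names ivec2_host, pack_int2x32, unpack_int2x32, and |
| double_bits_to_int64 are hypothetical, and the component order (first |
| component holds the least significant 32 bits) is an assumption that this |
| section does not state. |

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side stand-in for a two-component 32-bit integer vector. */
typedef struct { int32_t x, y; } ivec2_host;

/* Assumption: x carries the low 32 bits and y the high 32 bits. */
static int64_t pack_int2x32(ivec2_host v)
{
    uint64_t bits = ((uint64_t)(uint32_t)v.y << 32) | (uint32_t)v.x;
    int64_t out;
    memcpy(&out, &bits, sizeof out);      /* reinterpret, do not convert */
    return out;
}

/* Inverse of pack_int2x32 under the same component-order assumption. */
static ivec2_host unpack_int2x32(int64_t value)
{
    uint64_t bits;
    uint32_t lo, hi;
    ivec2_host v;
    memcpy(&bits, &value, sizeof bits);
    lo = (uint32_t)bits;
    hi = (uint32_t)(bits >> 32);
    memcpy(&v.x, &lo, sizeof lo);
    memcpy(&v.y, &hi, sizeof hi);
    return v;
}

/* Returns the 64-bit integer holding the bit encoding of a double. */
static int64_t double_bits_to_int64(double d)
{
    int64_t out;
    assert(sizeof d == sizeof out);
    memcpy(&out, &d, sizeof out);
    return out;
}
```

| Packing followed by unpacking returns the original components, since no |
| step changes any bit. |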
| |
| This extension also provides additional support for processing patch |
| primitives (introduced by ARB_tessellation_shader). |
| ARB_tessellation_shader requires the use of a tessellation evaluation |
| shader when processing patches, which means that patches will never |
| survive past the tessellation pipeline stage. This extension lifts that |
| restriction, and allows patches to proceed further in the pipeline and be |
| used |
| |
| * as input to a geometry shader, using a new "patches" layout qualifier; |
| |
| * as input to transform feedback; |
| |
| * by fixed-function rasterization stages, in which case the patches are |
| drawn as independent points. |
| |
| Additionally, it allows geometry shaders to read per-patch attributes |
| written by a tessellation control shader using input variables declared |
| with "patch in". |
| |
| |
| New Procedures and Functions |
| |
| void Uniform1i64NV(int location, int64EXT x); |
| void Uniform2i64NV(int location, int64EXT x, int64EXT y); |
| void Uniform3i64NV(int location, int64EXT x, int64EXT y, int64EXT z); |
| void Uniform4i64NV(int location, int64EXT x, int64EXT y, int64EXT z, |
| int64EXT w); |
| void Uniform1i64vNV(int location, sizei count, const int64EXT *value); |
| void Uniform2i64vNV(int location, sizei count, const int64EXT *value); |
| void Uniform3i64vNV(int location, sizei count, const int64EXT *value); |
| void Uniform4i64vNV(int location, sizei count, const int64EXT *value); |
| |
| void Uniform1ui64NV(int location, uint64EXT x); |
| void Uniform2ui64NV(int location, uint64EXT x, uint64EXT y); |
| void Uniform3ui64NV(int location, uint64EXT x, uint64EXT y, uint64EXT z); |
| void Uniform4ui64NV(int location, uint64EXT x, uint64EXT y, uint64EXT z, |
| uint64EXT w); |
| void Uniform1ui64vNV(int location, sizei count, const uint64EXT *value); |
| void Uniform2ui64vNV(int location, sizei count, const uint64EXT *value); |
| void Uniform3ui64vNV(int location, sizei count, const uint64EXT *value); |
| void Uniform4ui64vNV(int location, sizei count, const uint64EXT *value); |
| |
| void GetUniformi64vNV(uint program, int location, int64EXT *params); |
| |
| |
| (The following function is also provided by NV_shader_buffer_load.) |
| |
| void GetUniformui64vNV(uint program, int location, uint64EXT *params); |
| |
| |
| (All of the following ProgramUniform* functions are supported if and only |
| if implemented in OpenGL ES or EXT_direct_state_access is supported.) |
| |
| void ProgramUniform1i64NV(uint program, int location, int64EXT x); |
| void ProgramUniform2i64NV(uint program, int location, int64EXT x, |
| int64EXT y); |
| void ProgramUniform3i64NV(uint program, int location, int64EXT x, |
| int64EXT y, int64EXT z); |
| void ProgramUniform4i64NV(uint program, int location, int64EXT x, |
| int64EXT y, int64EXT z, int64EXT w); |
| void ProgramUniform1i64vNV(uint program, int location, sizei count, |
| const int64EXT *value); |
| void ProgramUniform2i64vNV(uint program, int location, sizei count, |
| const int64EXT *value); |
| void ProgramUniform3i64vNV(uint program, int location, sizei count, |
| const int64EXT *value); |
| void ProgramUniform4i64vNV(uint program, int location, sizei count, |
| const int64EXT *value); |
| |
| void ProgramUniform1ui64NV(uint program, int location, uint64EXT x); |
| void ProgramUniform2ui64NV(uint program, int location, uint64EXT x, |
| uint64EXT y); |
| void ProgramUniform3ui64NV(uint program, int location, uint64EXT x, |
| uint64EXT y, uint64EXT z); |
| void ProgramUniform4ui64NV(uint program, int location, uint64EXT x, |
| uint64EXT y, uint64EXT z, uint64EXT w); |
| void ProgramUniform1ui64vNV(uint program, int location, sizei count, |
| const uint64EXT *value); |
| void ProgramUniform2ui64vNV(uint program, int location, sizei count, |
| const uint64EXT *value); |
| void ProgramUniform3ui64vNV(uint program, int location, sizei count, |
| const uint64EXT *value); |
| void ProgramUniform4ui64vNV(uint program, int location, sizei count, |
| const uint64EXT *value); |
| |
| |
| New Tokens |
| |
| Returned by the <type> parameter of GetActiveAttrib, GetActiveUniform, and |
| GetTransformFeedbackVarying: |
| |
| INT64_NV 0x140E |
| UNSIGNED_INT64_NV 0x140F |
| |
| INT8_NV 0x8FE0 |
| INT8_VEC2_NV 0x8FE1 |
| INT8_VEC3_NV 0x8FE2 |
| INT8_VEC4_NV 0x8FE3 |
| INT16_NV 0x8FE4 |
| INT16_VEC2_NV 0x8FE5 |
| INT16_VEC3_NV 0x8FE6 |
| INT16_VEC4_NV 0x8FE7 |
| INT64_VEC2_NV 0x8FE9 |
| INT64_VEC3_NV 0x8FEA |
| INT64_VEC4_NV 0x8FEB |
| UNSIGNED_INT8_NV 0x8FEC |
| UNSIGNED_INT8_VEC2_NV 0x8FED |
| UNSIGNED_INT8_VEC3_NV 0x8FEE |
| UNSIGNED_INT8_VEC4_NV 0x8FEF |
| UNSIGNED_INT16_NV 0x8FF0 |
| UNSIGNED_INT16_VEC2_NV 0x8FF1 |
| UNSIGNED_INT16_VEC3_NV 0x8FF2 |
| UNSIGNED_INT16_VEC4_NV 0x8FF3 |
| UNSIGNED_INT64_VEC2_NV 0x8FF5 |
| UNSIGNED_INT64_VEC3_NV 0x8FF6 |
| UNSIGNED_INT64_VEC4_NV 0x8FF7 |
| FLOAT16_NV 0x8FF8 |
| FLOAT16_VEC2_NV 0x8FF9 |
| FLOAT16_VEC3_NV 0x8FFA |
| FLOAT16_VEC4_NV 0x8FFB |
| |
| (If ARB_tessellation_shader is supported, the following enum is accepted |
| by a new primitive.) |
| |
| Accepted by the <primitiveMode> parameter of BeginTransformFeedback: |
| |
| PATCHES |
| |
| |
| |
| Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification |
| (OpenGL Operation) |
| |
| Modify Section 2.6.1, Begin and End, p. 22 |
| |
| (Extend language describing PATCHES introduced by ARB_tessellation_shader. |
| In particular, add the following to the end of the description of the |
| primitive type.) |
| |
| If a patch primitive is drawn, each patch is drawn separately as a |
| collection of points, with each patch vertex defining a separate point. |
| Extra vertices from an incomplete patch are never drawn. |
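| |
| The rule above can be sketched in C as follows. The function name |
| patch_points_drawn and its parameters are hypothetical; the sketch only |
| counts the points that result when patches reach rasterization. |

```c
#include <assert.h>

/* Number of points produced when <vertex_count> vertices are drawn as
 * patches of <patch_vertices> vertices each and reach rasterization. */
static unsigned patch_points_drawn(unsigned vertex_count,
                                   unsigned patch_vertices)
{
    assert(patch_vertices > 0);
    /* Leftover vertices of an incomplete trailing patch are never drawn. */
    return (vertex_count / patch_vertices) * patch_vertices;
}
```

| For example, ten vertices drawn with four-vertex patches yield two complete |
| patches, and the two leftover vertices are dropped. |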
| |
| |
| Modify Section 2.14.3, Vertex Attributes, p. 86 |
| |
| (modify the second paragraph, p. 87) ... exceeds MAX_VERTEX_ATTRIBS. For |
| the purposes of this comparison, attribute variables of the type i64vec3, |
| u64vec3, i64vec4, and u64vec4 count as consuming twice as many attributes |
| as equivalent single-precision types. |
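| |
| A minimal C sketch of this comparison, assuming every type passed in is a |
| scalar or vector (matrices are not modeled). The token values come from |
| the "New Tokens" section, written here with the usual GL_ prefix; the |
| function names attrib_slots and attribs_fit are hypothetical. |

```c
#include <assert.h>
#include <stddef.h>

/* Token values from the "New Tokens" section of this extension. */
#define GL_INT64_VEC3_NV          0x8FEA
#define GL_INT64_VEC4_NV          0x8FEB
#define GL_UNSIGNED_INT64_VEC3_NV 0x8FF6
#define GL_UNSIGNED_INT64_VEC4_NV 0x8FF7

/* Attributes consumed by one scalar or vector attribute variable. */
static unsigned attrib_slots(unsigned type)
{
    switch (type) {
    case GL_INT64_VEC3_NV:
    case GL_INT64_VEC4_NV:
    case GL_UNSIGNED_INT64_VEC3_NV:
    case GL_UNSIGNED_INT64_VEC4_NV:
        return 2;   /* twice the equivalent single-precision type */
    default:
        return 1;   /* assumption: any other scalar or vector type */
    }
}

/* Nonzero if the listed attribute types do not exceed the limit. */
static int attribs_fit(const unsigned *types, size_t count,
                       unsigned max_vertex_attribs)
{
    unsigned used = 0;
    size_t i;
    assert(count == 0 || types != NULL);
    for (i = 0; i < count; ++i)
        used += attrib_slots(types[i]);
    return used <= max_vertex_attribs;
}
```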
| |
| |
| (extend the list of types in the first paragraph, p. 88) |
| ... UNSIGNED_INT_VEC3, UNSIGNED_INT_VEC4, INT8_NV, INT8_VEC2_NV, |
| INT8_VEC3_NV, INT8_VEC4_NV, INT16_NV, INT16_VEC2_NV, INT16_VEC3_NV, |
| INT16_VEC4_NV, INT64_NV, INT64_VEC2_NV, INT64_VEC3_NV, INT64_VEC4_NV, |
| UNSIGNED_INT8_NV, UNSIGNED_INT8_VEC2_NV, UNSIGNED_INT8_VEC3_NV, |
| UNSIGNED_INT8_VEC4_NV, UNSIGNED_INT16_NV, UNSIGNED_INT16_VEC2_NV, |
| UNSIGNED_INT16_VEC3_NV, UNSIGNED_INT16_VEC4_NV, UNSIGNED_INT64_NV, |
| UNSIGNED_INT64_VEC2_NV, UNSIGNED_INT64_VEC3_NV, UNSIGNED_INT64_VEC4_NV, |
| FLOAT16_NV, FLOAT16_VEC2_NV, FLOAT16_VEC3_NV, or FLOAT16_VEC4_NV. |
| |
| |
| Modify Section 2.14.4, Uniform Variables, p. 89 |
| |
| (modify third paragraph, p. 90) ... uniform variable storage for a vertex |
| shader. A scalar or vector uniform with 64-bit integer components |
| will consume no more than 2<n> components, where <n> is 1 for scalars, and |
| the component count for vectors. A link error is generated ... |
| |
| (add to Table 2.13, p. 96) |
| |
| Type Name Token Keyword |
| -------------------- ---------------- |
| INT8_NV int8_t |
| INT8_VEC2_NV i8vec2 |
| INT8_VEC3_NV i8vec3 |
| INT8_VEC4_NV i8vec4 |
| INT16_NV int16_t |
| INT16_VEC2_NV i16vec2 |
| INT16_VEC3_NV i16vec3 |
| INT16_VEC4_NV i16vec4 |
| INT64_NV int64_t |
| INT64_VEC2_NV i64vec2 |
| INT64_VEC3_NV i64vec3 |
| INT64_VEC4_NV i64vec4 |
| UNSIGNED_INT8_NV uint8_t |
| UNSIGNED_INT8_VEC2_NV u8vec2 |
| UNSIGNED_INT8_VEC3_NV u8vec3 |
| UNSIGNED_INT8_VEC4_NV u8vec4 |
| UNSIGNED_INT16_NV uint16_t |
| UNSIGNED_INT16_VEC2_NV u16vec2 |
| UNSIGNED_INT16_VEC3_NV u16vec3 |
| UNSIGNED_INT16_VEC4_NV u16vec4 |
| UNSIGNED_INT64_NV uint64_t |
| UNSIGNED_INT64_VEC2_NV u64vec2 |
| UNSIGNED_INT64_VEC3_NV u64vec3 |
| UNSIGNED_INT64_VEC4_NV u64vec4 |
| FLOAT16_NV float16_t |
| FLOAT16_VEC2_NV f16vec2 |
| FLOAT16_VEC3_NV f16vec3 |
| FLOAT16_VEC4_NV f16vec4 |
| |
| (modify list of commands at the bottom of p. 99) |
| |
| void Uniform{1,2,3,4}{i64,ui64}NV(int location, T value); |
| void Uniform{1,2,3,4}{i64,ui64}vNV(int location, sizei count, const T *value); |
| |
| (insert after fourth paragraph, p. 100) The Uniform*i64{v}NV and |
| Uniform*ui64{v}NV commands will load <count> sets of one to four 64-bit |
| signed or unsigned integer values into a uniform location defined as a |
| 64-bit signed or unsigned integer scalar or vector type. |
| |
| |
| (modify "Uniform Buffer Object Storage", p. 102, adding two bullets after |
| the last "Members of type", and modifying the subsequent bullet) |
| |
| * Members of type int8_t, int16_t, and int64_t are extracted from a |
| buffer object by reading a single byte, short, or int64-typed value at |
| the specified offset. |
| |
| * Members of type uint8_t, uint16_t, and uint64_t are extracted from a |
| buffer object by reading a single ubyte, ushort, or uint64-typed value |
| at the specified offset. |
| |
| * Members of type float16_t are extracted from a buffer object by reading |
| a single half-typed value at the specified offset. |
| |
| * Vectors with N elements with basic data types of bool, int, uint, |
| float, double, int8_t, int16_t, int64_t, uint8_t, uint16_t, uint64_t, |
| or float16_t are extracted as N values in consecutive memory locations |
| beginning at the specified offset, with components stored in order with |
| the first (X) component at the lowest offset. The GL data type used for |
| component extraction is derived according to the rules for scalar |
| members above. |
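| |
| The extraction rules above can be mimicked on the host when reading back a |
| mapped buffer. This is a sketch only: the helper names and the i16vec3_host |
| struct are hypothetical, and it assumes the buffer contents use the host's |
| byte order. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side stand-in for an i16vec3 uniform block member. */
typedef struct { int16_t x, y, z; } i16vec3_host;

/* Copy <size> bytes of a member out of the buffer at <offset>. */
static void read_member(const unsigned char *buf, size_t buf_size,
                        size_t offset, void *dst, size_t size)
{
    assert(offset <= buf_size && size <= buf_size - offset);
    memcpy(dst, buf + offset, size);
}

/* int8_t member: a single byte at the specified offset. */
static int8_t read_int8(const unsigned char *buf, size_t n, size_t off)
{
    int8_t v;
    read_member(buf, n, off, &v, sizeof v);
    return v;
}

/* uint64_t member: a single uint64-typed value at the specified offset. */
static uint64_t read_uint64(const unsigned char *buf, size_t n, size_t off)
{
    uint64_t v;
    read_member(buf, n, off, &v, sizeof v);
    return v;
}

/* Vector member: N consecutive values, X component at the lowest offset. */
static i16vec3_host read_i16vec3(const unsigned char *buf, size_t n,
                                 size_t off)
{
    int16_t c[3];
    i16vec3_host v;
    read_member(buf, n, off, c, sizeof c);
    v.x = c[0]; v.y = c[1]; v.z = c[2];
    return v;
}
```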
| |
| |
| Modify Section 2.14.6, Varying Variables, p. 106 |
| |
| (modify third paragraph, p. 107) ... For the purposes of counting input |
| and output components consumed by a shader, variables declared as vectors, |
| matrices, and arrays will all consume multiple components. Each component |
| of variables declared as 64-bit integer scalars or vectors will be |
| counted as consuming two components. |
| |
| (add after the bulleted list, p. 108) For the purposes of counting the |
| total number of components to capture, each component of outputs declared |
| as 64-bit integer scalars or vectors will be counted as consuming two |
| components. |
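| |
| A small C sketch of this counting rule for scalar and vector outputs, |
| including arrays of them. The struct output_var and the function |
| components_consumed are hypothetical, and matrices are not modeled. |

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one scalar or vector output variable. */
struct output_var {
    unsigned components;    /* 1 for scalars, 2 to 4 for vectors */
    unsigned array_size;    /* 1 for non-arrays */
    int      is_64bit_int;  /* nonzero for int64_t/uint64_t based types */
};

/* Total components counted against the input/output or capture limit. */
static unsigned components_consumed(const struct output_var *vars,
                                    size_t count)
{
    unsigned total = 0;
    size_t i;
    for (i = 0; i < count; ++i) {
        /* Each 64-bit integer component counts as two components. */
        unsigned per_component = vars[i].is_64bit_int ? 2u : 1u;
        assert(vars[i].components >= 1 && vars[i].components <= 4);
        total += vars[i].components * vars[i].array_size * per_component;
    }
    return total;
}
```

| For example, an i64vec3 output plus a two-element array of vec4 counts as |
| 3 * 2 + 4 * 2 = 14 components. |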
| |
| |
| Modify Section 2.15.1, Geometry Shader Input Primitives, p. 118 |
| |
| (add new qualifier at the end of the section, p. 120) |
| |
| Patches (patches) |
| |
| Geometry shaders that operate on patches are valid for the PATCHES |
| primitive type. The number of vertices available to each program |
| invocation is equal to the vertex count of the variable-size patch, with |
| vertices presented to the geometry shader in the order specified in the |
| patch. |
| |
| |
| Modify Section 2.15.4, Geometry Shader Execution Environment, p. 121 |
| |
| (add to the end of "Geometry Shader Inputs", p. 123) |
| |
| Geometry shaders also support built-in and user-defined per-primitive |
| inputs. The following built-in inputs, not replicated per-vertex and not |
| contained in gl_in[], are supported: |
| |
| * The variable gl_PatchVerticesIn is filled with the number of vertices |
| in the input primitive. |
| |
| * The variables gl_TessLevelOuter[] and gl_TessLevelInner[] are arrays |
| holding outer and inner tessellation levels of an input patch. If a |
| tessellation control shader is active, the tessellation levels will be |
| taken from the corresponding outputs of the tessellation control |
| shader. Otherwise, the default levels provided as patch parameters |
| are used. Tessellation level values loaded in these variables are the |
| values prior to the clamping and rounding operations performed by the |
| primitive generator as described in Section 2.X.2 of |
| ARB_tessellation_shader. For triangular tessellation, |
| gl_TessLevelOuter[3] and gl_TessLevelInner[1] will be undefined. For |
| isoline tessellation, gl_TessLevelOuter[2], gl_TessLevelOuter[3], and |
| both values in gl_TessLevelInner[] are undefined. |
| |
| Additionally, a geometry shader with an input primitive type of "patches" |
| may declare per-patch input variables using the qualifier "patch in". |
| Unlike per-vertex inputs, per-patch inputs do not correspond to any |
| specific vertex in the input primitive, and are not indexed by vertex |
| number. Per-patch inputs declared as arrays have multiple values for the |
| input patch; similarly declared per-vertex inputs would indicate a single |
| value for each vertex in the output patch. User-defined per-patch input |
| variables are filled with corresponding per-patch output values written by |
| the tessellation control shader. If no tessellation control shader is |
| active, all such variables are undefined. |
| |
| Per-patch input variables and the built-in inputs "gl_PatchVerticesIn", |
| "gl_TessLevelOuter[]", and "gl_TessLevelInner[]" are supported only for |
| geometry shaders with an input primitive type of "patches". A program |
| will fail to link if any such variable is used in a geometry shader with |
| an input primitive type other than "patches". |
| |
| |
| Modify Section 2.19, Transform Feedback, p. 130 |
| |
| (add to Table 2.14, p. 131) |
| |
| Transform Feedback |
| primitiveMode allowed render primitive modes |
| ---------------------- --------------------------------- |
| PATCHES PATCHES |
| |
| |
| (modify first paragraph, p. 131) ... <primitiveMode> is one of TRIANGLES, |
| LINES, POINTS, or PATCHES and specifies the type of primitives that will |
| be recorded into the buffer objects bound for transform feedback (see |
| below). ... |
| |
| (modify last paragraph, p. 131 and first paragraph, p. 132, adding patch |
| support, and dealing with capture of 8- and 16-bit components) |
| |
| When an individual point, line, triangle, or patch primitive reaches the |
| transform feedback stage ... When capturing line, triangle, and patch |
| primitives, all attributes ... For multi-component varying variables or |
| varying array elements, the individual components are written in order. |
| For variables with 8- or 16-bit fixed- or floating-point components, |
| individual components will be converted to and stored as equivalent values |
| of type "int", "uint", or "float". The value for any attribute specified |
| ... |
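| |
| The widening of 8- and 16-bit components on capture can be illustrated in |
| C. This is a sketch, not the implementation: the function names are |
| hypothetical, and the 16-bit float layout (1 sign bit, 5 exponent bits, |
| 10 mantissa bits) is assumed to match the representation described in the |
| OpenGL Specification. |

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* int8_t component stored as an equivalent "int" value. */
static int32_t capture_int8(int8_t v) { return (int32_t)v; }

/* uint16_t component stored as an equivalent "uint" value. */
static uint32_t capture_uint16(uint16_t v) { return (uint32_t)v; }

/* float16_t component (given as raw bits) stored as a "float" value. */
static float capture_float16(uint16_t h)
{
    uint32_t sign = (uint32_t)(h >> 15) << 31;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    float f;

    assert(sizeof f == sizeof bits);
    if (exp == 0) {
        /* Zero or denormal: value is mant * 2^-24. */
        f = (float)mant * (1.0f / 16777216.0f);
        return sign ? -f : f;
    }
    if (exp == 31)
        bits = sign | 0x7F800000u | (mant << 13);          /* Inf or NaN */
    else
        bits = sign | ((exp + 112u) << 23) | (mant << 13); /* rebias 15 -> 127 */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

| Every 16-bit floating-point value is exactly representable as a 32-bit |
| float, so this widening loses no information. |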
| |
| (modify next-to-last paragraph, p. 132) ... is not incremented. If |
| transform feedback receives a primitive that fits in the remaining space |
| after such an overflow occurs, that primitive may or may not be recorded. |
| Primitives that fail to fit in the remaining space are never recorded. |
| |
| |
| Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification |
| (Rasterization) |
| |
| None. |
| |
| Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification |
| (Per-Fragment Operations and the Frame Buffer) |
| |
| None. |
| |
| Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification |
| (Special Functions) |
| |
| None. |
| |
| Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification |
| (State and State Requests) |
| |
| Modify Section 6.1.15, Shader and Program Queries, p. 332 |
| |
| (add to the first list of commands, p. 337) |
| |
| void GetUniformi64vNV(uint program, int location, int64EXT *params); |
| void GetUniformui64vNV(uint program, int location, uint64EXT *params); |
| |
| |
| Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile) |
| Specification (Invariance) |
| |
| None. |
| |
| Additions to the AGL/GLX/WGL Specifications |
| |
| None. |
| |
| Modifications to The OpenGL Shading Language Specification, Version 1.50 |
| (Revision 09) |
| |
| Including the following line in a shader can be used to control the |
| language features described in this extension: |
| |
| #extension GL_NV_gpu_shader5 : <behavior> |
| |
| where <behavior> is as specified in section 3.3. |
| |
| New preprocessor #defines are added to the OpenGL Shading Language: |
| |
| #define GL_NV_gpu_shader5 1 |
| |
| If the features of this extension are enabled by an #extension directive, |
| shading language features documented in the ARB_gpu_shader5 extension will |
| also be provided. |
| |
| |
| Modify Section 3.6, Keywords, p. 15 |
| |
| (add the following to the list of reserved keywords) |
| |
| int8_t i8vec2 i8vec3 i8vec4 |
| int16_t i16vec2 i16vec3 i16vec4 |
| int32_t i32vec2 i32vec3 i32vec4 |
| int64_t i64vec2 i64vec3 i64vec4 |
| uint8_t u8vec2 u8vec3 u8vec4 |
| uint16_t u16vec2 u16vec3 u16vec4 |
| uint32_t u32vec2 u32vec3 u32vec4 |
| uint64_t u64vec2 u64vec3 u64vec4 |
| float16_t f16vec2 f16vec3 f16vec4 |
| float32_t f32vec2 f32vec3 f32vec4 |
| float64_t f64vec2 f64vec3 f64vec4 |
| |
| (note: the "float64_t" and "f64vec*" types are available if and only if |
| ARB_gpu_shader_fp64 is also supported) |
| |
| |
| Modify Section 4.1, Basic Types, p. 18 |
| |
| (add to the basic "Transparent Types" table, p. 18) |
| |
| Types Meaning |
| -------- ---------------------------------------------------------- |
| int8_t an 8-bit signed integer |
| i8vec2 a two-component signed integer vector (8-bit components) |
| i8vec3 a three-component signed integer vector (8-bit components) |
| i8vec4 a four-component signed integer vector (8-bit components) |
| |
| int16_t a 16-bit signed integer |
| i16vec2 a two-component signed integer vector (16-bit components) |
| i16vec3 a three-component signed integer vector (16-bit components) |
| i16vec4 a four-component signed integer vector (16-bit components) |
| |
| int32_t a 32-bit signed integer |
| i32vec2 a two-component signed integer vector (32-bit components) |
| i32vec3 a three-component signed integer vector (32-bit components) |
| i32vec4 a four-component signed integer vector (32-bit components) |
| |
| int64_t a 64-bit signed integer |
| i64vec2 a two-component signed integer vector (64-bit components) |
| i64vec3 a three-component signed integer vector (64-bit components) |
| i64vec4 a four-component signed integer vector (64-bit components) |
| |
| uint8_t an 8-bit unsigned integer |
| u8vec2 a two-component unsigned integer vector (8-bit components) |
| u8vec3 a three-component unsigned integer vector (8-bit components) |
| u8vec4 a four-component unsigned integer vector (8-bit components) |
| |
| uint16_t a 16-bit unsigned integer |
| u16vec2 a two-component unsigned integer vector (16-bit components) |
| u16vec3 a three-component unsigned integer vector (16-bit components) |
| u16vec4 a four-component unsigned integer vector (16-bit components) |
| |
| uint32_t a 32-bit unsigned integer |
| u32vec2 a two-component unsigned integer vector (32-bit components) |
| u32vec3 a three-component unsigned integer vector (32-bit components) |
| u32vec4 a four-component unsigned integer vector (32-bit components) |
| |
| uint64_t a 64-bit unsigned integer |
| u64vec2 a two-component unsigned integer vector (64-bit components) |
| u64vec3 a three-component unsigned integer vector (64-bit components) |
| u64vec4 a four-component unsigned integer vector (64-bit components) |
| |
| float16_t a single 16-bit floating-point value |
| f16vec2 a two-component floating-point vector (16-bit components) |
| f16vec3 a three-component floating-point vector (16-bit components) |
| f16vec4 a four-component floating-point vector (16-bit components) |
| |
| float32_t a single 32-bit floating-point value |
| f32vec2 a two-component floating-point vector (32-bit components) |
| f32vec3 a three-component floating-point vector (32-bit components) |
| f32vec4 a four-component floating-point vector (32-bit components) |
| |
| float64_t a single 64-bit floating-point value |
| f64vec2 a two-component floating-point vector (64-bit components) |
| f64vec3 a three-component floating-point vector (64-bit components) |
| f64vec4 a four-component floating-point vector (64-bit components) |
| |
| |
| Modify Section 4.1.3, Integers, p. 20 |
| |
| (add after the first paragraph of the section, p. 20) |
| |
| Variables with the types "int8_t", "int16_t", and "int64_t" represent |
| signed integer values with exactly 8, 16, or 64 bits, respectively. |
| Variables with the types "uint8_t", "uint16_t", and "uint64_t" represent |
| unsigned integer values with exactly 8, 16, or 64 bits, respectively. |
| Variables with the types "int32_t" and "uint32_t" represent signed and |
| unsigned integer values with 32 bits, and are equivalent to "int" and |
| "uint" types, respectively. |
| |
| |
| (modify the grammar, p. 21, adding "L" and "UL" suffixes) |
| |
| integer-suffix: one of |
| |
| u U l L ul UL |
| |
| (modify next-to-last paragraph, p. 21) ... When the suffix "u" or "U" is |
| present, the literal has type <uint>. When the suffix "l" or "L" is |
| present, the literal has type <int64_t>. When the suffix "ul" or "UL" is |
| present, the literal has type <uint64_t>. Otherwise, the type is |
| <int>. ... |
| |
| |
| Modify Section 4.1.4, Floats, p. 22 |
| |
| (insert after second paragraph, p. 22) |
| |
| Variables of type "float16_t" represent floating-point values using |
| exactly 16 bits and are stored using the 16-bit floating-point |
| representation described in the OpenGL Specification. Variables of type |
| "float32_t" and "float64_t" represent floating-point values with 32 or 64 |
| bits, and are equivalent to "float" and "double" types, respectively. |
| |
| |
| Modify Section 4.1.7, Samplers, p. 23 |
| |
| (modify 1st paragraph of the section, deleting the restriction requiring |
| constant indexing of sampler arrays) ... Samplers may be aggregated into |
| arrays within a shader (using square brackets [ ]) and can be indexed with |
| general integer expressions. The results of accessing a sampler array |
| with an out-of-bounds index are undefined. ... |
| |
| (remove the additional restriction added by ARB_gpu_shader5 making a |
| similar edit requiring uniform indexing across shader invocations for |
| defined results. NV_gpu_shader5 has no such limitation.) |
| |
| |
| Modify Section 4.1.10, Implicit Conversions, p. 27 |
| |
| (modify table of implicit conversions) |
| |
| Can be implicitly |
| Type of expression converted to |
| -------------------- ----------------------------------------- |
| int uint, int64_t, uint64_t, float, double(*) |
| ivec2 uvec2, i64vec2, u64vec2, vec2, dvec2(*) |
| ivec3 uvec3, i64vec3, u64vec3, vec3, dvec3(*) |
| ivec4 uvec4, i64vec4, u64vec4, vec4, dvec4(*) |
| |
| int8_t int16_t int, int64_t, uint, uint64_t, float, double(*) |
| i8vec2 i16vec2 ivec2, i64vec2, uvec2, u64vec2, vec2, dvec2(*) |
| i8vec3 i16vec3 ivec3, i64vec3, uvec3, u64vec3, vec3, dvec3(*) |
| i8vec4 i16vec4 ivec4, i64vec4, uvec4, u64vec4, vec4, dvec4(*) |
| |
| int64_t uint64_t, double(*) |
| i64vec2 u64vec2, dvec2(*) |
| i64vec3 u64vec3, dvec3(*) |
| i64vec4 u64vec4, dvec4(*) |
| |
| uint uint64_t, float, double(*) |
| uvec2 u64vec2, vec2, dvec2(*) |
| uvec3 u64vec3, vec3, dvec3(*) |
| uvec4 u64vec4, vec4, dvec4(*) |
| |
| uint8_t uint16_t uint, uint64_t, float, double(*) |
| u8vec2 u16vec2 uvec2, u64vec2, vec2, dvec2(*) |
| u8vec3 u16vec3 uvec3, u64vec3, vec3, dvec3(*) |
| u8vec4 u16vec4 uvec4, u64vec4, vec4, dvec4(*) |
| |
| uint64_t double(*) |
| u64vec2 dvec2(*) |
| u64vec3 dvec3(*) |
| u64vec4 dvec4(*) |
| |
| float double(*) |
| vec2 dvec2(*) |
| vec3 dvec3(*) |
| vec4 dvec4(*) |
| |
| float16_t float, double(*) |
| f16vec2 vec2, dvec2(*) |
| f16vec3 vec3, dvec3(*) |
| f16vec4 vec4, dvec4(*) |
| |
| (*) if ARB_gpu_shader_fp64 is supported |
| |
| (Note: Expressions of type "int32_t", "uint32_t", "float32_t", and |
| "float64_t" are treated as identical to those of type "int", "uint", |
| "float", and "double", respectively. Implicit conversions to and from |
| these explicitly-sized types are allowed whenever conversions involving |
| the equivalent base type are allowed.) |
| |
| |
| (modify second paragraph of the section) No implicit conversions are |
| provided to convert from unsigned to signed integer types, from |
| floating-point to integer types, from higher-precision to lower-precision |
| types, from 8-bit to 16-bit types, or between matrix types. There are no |
| implicit array or structure conversions. |
| |
| (add before the final paragraph of the section, p. 27) |
| |
| When performing implicit conversion for binary operators, there may be |
| multiple data types to which the two operands can be converted. For |
| example, when adding an int8_t value to a uint16_t value, both values can |
| be implicitly converted to uint, uint64_t, float, and double. In such |
| cases, a floating-point type is chosen if either operand has a |
| floating-point type. Otherwise, an unsigned integer type is chosen if |
| either operand has an unsigned integer type. Otherwise, a signed integer |
| type is chosen. If operands can be converted to both 32- and 64-bit |
| versions of the chosen base data type, the 32-bit version is used. |
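| |
| The selection rule above can be modeled in C for scalar types. This is an |
| illustrative sketch: the enum, the conv[] table (transcribed from the |
| implicit conversion table of this section), and binary_result_type are |
| hypothetical names, and ARB_gpu_shader_fp64 is assumed to be supported so |
| that conversions to double are available. |

```c
#include <assert.h>

/* Scalar types from the implicit conversion table. */
typedef enum {
    T_INT8, T_INT16, T_INT, T_INT64,
    T_UINT8, T_UINT16, T_UINT, T_UINT64,
    T_FLOAT16, T_FLOAT, T_DOUBLE,
    T_COUNT
} scalar_type;

#define BIT(t)   (1u << (t))
#define FLT_SET  (BIT(T_FLOAT) | BIT(T_DOUBLE))
#define UINT_SET (BIT(T_UINT) | BIT(T_UINT64))

/* conv[t]: set of types that t can be implicitly converted to
 * (assumption: double is available). */
static const unsigned conv[T_COUNT] = {
    [T_INT8]    = BIT(T_INT) | BIT(T_INT64) | UINT_SET | FLT_SET,
    [T_INT16]   = BIT(T_INT) | BIT(T_INT64) | UINT_SET | FLT_SET,
    [T_INT]     = BIT(T_INT64) | UINT_SET | FLT_SET,
    [T_INT64]   = BIT(T_UINT64) | BIT(T_DOUBLE),
    [T_UINT8]   = UINT_SET | FLT_SET,
    [T_UINT16]  = UINT_SET | FLT_SET,
    [T_UINT]    = BIT(T_UINT64) | FLT_SET,
    [T_UINT64]  = BIT(T_DOUBLE),
    [T_FLOAT16] = FLT_SET,
    [T_FLOAT]   = BIT(T_DOUBLE),
    [T_DOUBLE]  = 0
};

static int is_float(scalar_type t)    { return t >= T_FLOAT16; }
static int is_unsigned(scalar_type t) { return t >= T_UINT8 && t <= T_UINT64; }

/* Returns the common type for a binary operator, or -1 if there is none. */
static int binary_result_type(scalar_type a, scalar_type b)
{
    unsigned common;
    scalar_type t32, t64;

    assert((unsigned)a < T_COUNT && (unsigned)b < T_COUNT);
    if (a == b)
        return a;                       /* no conversion needed */

    /* Types reachable from both operands (each operand reaches itself). */
    common = (conv[a] | BIT(a)) & (conv[b] | BIT(b));

    if (is_float(a) || is_float(b))            { t32 = T_FLOAT; t64 = T_DOUBLE; }
    else if (is_unsigned(a) || is_unsigned(b)) { t32 = T_UINT;  t64 = T_UINT64; }
    else                                       { t32 = T_INT;   t64 = T_INT64;  }

    if (common & BIT(t32)) return t32;  /* prefer the 32-bit version */
    if (common & BIT(t64)) return t64;
    return -1;
}
```

| With these definitions, adding an int8_t value to a uint16_t value selects |
| uint, matching the example in the paragraph above. |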
| |
| |
| Modify Section 4.3.4, Inputs, p. 31 |
| |
| (modify third paragraph of section, p. 31, allowing explicitly-sized |
| types) ... Vertex shader input variables can only be signed and unsigned |
| integers, floats, doubles, explicitly-sized integers and floating-point |
| values, vectors of any of these types, and matrices. ... |
| |
| (modify edits done in ARB_tessellation_shader adding support for "patch |
| in", allowing for geometry shaders as well) Additionally, tessellation |
| evaluation and geometry shaders support per-patch input variables declared |
| with the "patch in" qualifier. Per-patch input ... |
| |
| |
| (modify third paragraph, p. 32) ... Fragment inputs can only be signed and |
| unsigned integers, floats, doubles, explicitly-sized integers and |
| floating-point values, vectors of any of these types, matrices, or arrays |
| or structures of these. Fragment inputs declared as signed or unsigned |
| integers, doubles, 64-bit floating-point values, including vectors, |
| matrices, or arrays derived from those types, must be qualified as "flat". |
| |
| |
| Modify Section 4.3.6, Outputs, p. 33 |
| |
| (modify third paragraph of the section, p. 33) ... They can only be signed |
| and unsigned integers, floats, doubles, explicitly-sized integers and |
| floating-point values, vectors of any of these types, matrices, or arrays |
| or structures of these. |
| |
| (modify last paragraph, p. 33) ... Fragment outputs can only be signed |
| and unsigned integers, floats, explicitly-sized integers and |
| floating-point values with 32 or fewer bits, vectors of any of these |
| types, or arrays of these. Doubles, 64-bit integers or floating-point |
| values, vectors or arrays of those types, matrices, and structures cannot |
| be output. ... |
| |
| |
| Modify Section 4.3.8.1, Input Layout Qualifiers, p. 37 |
| |
| (add to the list of qualifiers for geometry shaders, p. 37) |
| |
| layout-qualifier-id: |
| ... |
| triangles_adjacency |
| patches |
| |
| (modify the "size of input arrays" table, p. 38) |
| |
| Layout Size of Input Arrays |
| ------------ -------------------- |
| patches gl_MaxPatchVertices |
| |
| (add paragraph below that table, p. 38) |
| |
| When using the input primitive type "patches", the geometry shader is used |
| to process a set of patches with vertex counts that may vary from patch to |
| patch. For the purposes of input array sizing, patches are treated as |
| having a vertex count fixed at the implementation-dependent maximum patch |
| size, gl_MaxPatchVertices. If a shader reads an input corresponding to a |
| vertex not found in the patch being processed, the values read are |
| undefined. |
| |
| |
| Modify Section 5.4.1, Conversion and Scalar Constructors, p. 49 |
| |
| (add after first list of constructor examples) |
| |
| Similar constructors are provided to convert to and from explicitly-sized |
| scalar data types, as well: |
| |
| float(uint8_t) // converts an 8-bit uint value to a float |
| int64_t(double) // converts a double value to a 64-bit int |
| float64_t(int16_t) // converts a 16-bit int value to a 64-bit float |
| uint16_t(bool) // converts a Boolean value to a 16-bit uint |
| |
| (replace final two paragraphs, p. 49, and the first paragraph, p. 50, |
| using more general language) |
| |
| When constructors are used to convert any floating-point type to any |
| integer type, the fractional part of the floating-point value is dropped. |
| It is undefined to convert a negative floating point value to an unsigned |
| integer type. |
| |
| When a constructor is used to convert any integer or floating-point type |
| to bool, 0 and 0.0 are converted to false, and non-zero values are |
| converted to true. When a constructor is used to convert a bool to any |
| integer or floating-point type, false is converted to 0 or 0.0, and true |
| is converted to 1 or 1.0. |
| |
| Constructors converting between signed and unsigned integers with the same |
| bit count always preserve the bit pattern of the input. This will change |
| the value of the argument if its most significant bit is set, converting a |
| negative signed integer to a large unsigned integer, or vice versa. |
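| |
| (NOTE: The following C sketch is illustrative only and is not part of the |
| specification. It models the constructor rules above on the CPU. The |
| "model_*" function names are hypothetical. Conversion of a negative |
| floating-point value to an unsigned type is undefined and is deliberately |
| not modeled.) |

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Models int64_t(double): the fractional part is dropped (truncation
   toward zero). Assumption of this sketch: the value fits in 64 bits. */
int64_t model_double_to_int64(double x) {
    assert(x > -9.0e18 && x < 9.0e18);
    return (int64_t)x;
}

/* Models bool(T): 0 and 0.0 become false, anything else becomes true. */
bool model_int_to_bool(int32_t x) { return x != 0; }
bool model_float_to_bool(float x) { return x != 0.0f; }

/* Models uint16_t(bool): false becomes 0, true becomes 1. */
uint16_t model_bool_to_uint16(bool b) { return b ? 1u : 0u; }

/* Models uint(int) and int(uint): same bit count, bit pattern preserved. */
uint32_t model_int_to_uint(int32_t x) {
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

int32_t model_uint_to_int(uint32_t x) {
    int32_t i;
    memcpy(&i, &x, sizeof i);
    return i;
}
```

| With this model, converting the signed value -1 yields the unsigned value |
| 0xFFFFFFFF, and converting 0x80000000 yields the most negative 32-bit |
| signed integer. |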
| |
| |
| Modify Section 5.9, Expressions, p. 57 |
| |
| (modify bulleted list as follows, adding support for expressions with |
| 64-bit integer types) |
| |
| Expressions in the shading language are built from the following: |
| |
| * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector |
| types, and all matrix types. |
| |
| ... |
| |
| * The arithmetic binary operators add (+), subtract (-), multiply (*), and |
| divide (/) operate on 32-bit integer, 64-bit integer, and floating-point |
| scalars, vectors, and matrices. If the fundamental types of the |
| operands do not match, the conversions from Section 4.1.10 "Implicit |
| Conversions" are applied to produce matching types. ... |
| |
| * The operator modulus (%) operates on 32- and 64-bit integer scalars or |
| vectors. If the fundamental types of the operands do not match, the |
| conversions from Section 4.1.10 "Implicit Conversions" are applied to |
| produce matching types. ... |
| |
| * The arithmetic unary operators negate (-), post- and pre-increment and |
| decrement (-- and ++) operate on 32-bit integer, 64-bit integer, and |
| floating-point values (including vectors and matrices). ... |
| |
| * The relational operators greater than (>), less than (<), greater than |
| or equal (>=), and less than or equal (<=) operate only on scalar 32-bit |
| integer, 64-bit integer, and floating-point expressions. The result is |
| scalar Boolean. The fundamental type of the two operands must match, |
| either as specified, or after one of the implicit type conversions |
| specified in Section 4.1.10. |
| ... |
| |
| * The equality operators equal (==), and not equal (!=) operate only on |
| scalar 32-bit integer, 64-bit integer, and floating-point expressions. |
| The result is scalar Boolean. The fundamental type of the two operands |
| must match, either as specified, or after one of the implicit type |
| conversions specified in Section 4.1.10. ... |
| |
| |
| Modify Section 6.1, Function Definitions, p. 63 |
| |
| (ARB_gpu_shader5 adds a set of rules for defining whether implicit |
| conversions for one matching function definition are better or worse than |
| those for another. These comparisons are done argument by argument. |
| Extend the edits made by ARB_gpu_shader5 to add several new rules for |
| comparing implicit conversions for a single argument, corresponding to the |
| new data types introduced by this extension.) |
| |
| To determine whether the conversion for a single argument in one match is |
| better than that for another match, the following rules are applied, in |
| order: |
| |
| 1. An exact match is better than a match involving any implicit |
| conversion. |
| |
| 2. A match involving a conversion from a signed integer, unsigned |
| integer, or floating-point type to a similar type having a larger |
| number of bits is better than a match involving any other implicit |
| conversion. The set of conversions qualifying under this rule is: |
| |
| source types destination types |
| ----------------- ----------------- |
| int8_t, int16_t int, int64_t |
| int int64_t |
| uint8_t, uint16_t uint, uint64_t |
| uint uint64_t |
| float16_t float |
| float double |
| |
| 3. A match involving one conversion in rule 2 is better than a match |
| involving another conversion in rule 2 if: |
| |
| (a) both conversions start with the same type and the first |
| conversion is to a type with a smaller number of bits (e.g., |
| converting from int16_t to int is preferred to converting |
| int16_t to int64_t), or |
| |
| (b) both conversions end with the same type and the first |
| conversion is from a type with a larger number of bits (e.g., |
| converting an "out" parameter from int16_t to int is preferred |
| to converting from int8_t to int). |
| |
| 4. A match involving an implicit conversion from any integer type to |
| float is better than a match involving an implicit conversion from |
| any integer type to double. |
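| |
| (NOTE: The following C sketch is illustrative only and is not part of the |
| specification. It applies rules 1 through 4 to a pair of single-argument |
| conversions. The type model and all names ("Kind", "Type", "Conv", |
| "compare_conversions") are hypothetical, and rule 2 is read here as "a |
| conversion from the table beats any other implicit conversion".) |

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type model: a fundamental kind plus a bit count. */
typedef enum { KIND_SINT, KIND_UINT, KIND_FLOAT } Kind;
typedef struct { Kind kind; int bits; } Type;
typedef struct { Type from; Type to; } Conv;

bool same_type(Type a, Type b) {
    return a.kind == b.kind && a.bits == b.bits;
}

/* True if the conversion appears in the rule 2 table. */
bool is_promotion(Conv c) {
    if (c.from.kind != c.to.kind) return false;
    if (c.from.kind == KIND_FLOAT)
        return (c.from.bits == 16 && c.to.bits == 32) ||
               (c.from.bits == 32 && c.to.bits == 64);
    if (c.from.bits == 8 || c.from.bits == 16)
        return c.to.bits == 32 || c.to.bits == 64;
    return c.from.bits == 32 && c.to.bits == 64;
}

/* True for an integer to floating-point conversion of the given width. */
bool is_int_to_float(Conv c, int float_bits) {
    return c.from.kind != KIND_FLOAT && c.to.kind == KIND_FLOAT &&
           c.to.bits == float_bits;
}

/* Returns +1 if a is better, -1 if b is better, and 0 if the rules do
   not distinguish the two conversions. */
int compare_conversions(Conv a, Conv b) {
    assert(a.from.bits > 0 && a.to.bits > 0);
    assert(b.from.bits > 0 && b.to.bits > 0);

    bool a_exact = same_type(a.from, a.to);
    bool b_exact = same_type(b.from, b.to);
    if (a_exact != b_exact) return a_exact ? 1 : -1;       /* rule 1 */
    if (a_exact) return 0;

    bool a_promo = is_promotion(a);
    bool b_promo = is_promotion(b);
    if (a_promo != b_promo) return a_promo ? 1 : -1;       /* rule 2 */

    if (a_promo) {                                         /* rule 3 */
        if (same_type(a.from, b.from) && a.to.bits != b.to.bits)
            return a.to.bits < b.to.bits ? 1 : -1;         /* 3(a) */
        if (same_type(a.to, b.to) && a.from.bits != b.from.bits)
            return a.from.bits > b.from.bits ? 1 : -1;     /* 3(b) */
        return 0;
    }

    if (is_int_to_float(a, 32) && is_int_to_float(b, 64)) return 1;  /* rule 4 */
    if (is_int_to_float(a, 64) && is_int_to_float(b, 32)) return -1;
    return 0;
}
```

| For example, comparing the conversion int16_t to int against int16_t to |
| int64_t returns +1 under rule 3(a), matching the example in the text. |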
| |
| |
| Modify Section 7.1, Vertex and Geometry Shader Special Variables, p. 69 |
| |
| (NOTE: These edits are written against the re-organized section in the |
| ARB_tessellation_shader specification.) |
| |
| (add to the list of built-in inputs for geometry shaders) In the geometry |
| language, built-in input and output variables are intrinsically declared |
| as: |
| |
| in int gl_PatchVerticesIn; |
| patch in float gl_TessLevelOuter[4]; |
| patch in float gl_TessLevelInner[2]; |
| |
| ... |
| |
| The input variable gl_PatchVerticesIn behaves as in the identically-named |
| tessellation control and evaluation shader inputs. |
| |
| The input variables gl_TessLevelOuter[] and gl_TessLevelInner[] behave as |
| in the identically-named tessellation evaluation shader inputs. |
| |
| |
| Modify Chapter 8, Built-in Functions, p. 81 |
| |
| (add to description of generic types, last paragraph of p. 69) ... Where |
| the input arguments (and corresponding output) can be int64_t, i64vec2, |
| i64vec3, or i64vec4, <genI64Type> is used as the argument. Where the |
| input arguments (and corresponding output) can be uint64_t, u64vec2, |
| u64vec3, or u64vec4, <genU64Type> is used as the argument. |
| |
| |
| Modify Section 8.3, Common Functions, p. 84 |
| |
| (add support for 64-bit integer packing and unpacking functions) |
| |
| Syntax: |
| |
| int64_t packInt2x32(ivec2 v); |
| uint64_t packUint2x32(uvec2 v); |
| |
| ivec2 unpackInt2x32(int64_t v); |
| uvec2 unpackUint2x32(uint64_t v); |
| |
| The functions packInt2x32() and packUint2x32() return a signed or unsigned |
| 64-bit integer obtained by packing the components of a two-component |
| signed or unsigned integer vector, respectively. The first vector |
| component specifies the 32 least significant bits; the second component |
| specifies the 32 most significant bits. |
| |
| The functions unpackInt2x32() and unpackUint2x32() return a signed or |
| unsigned integer vector built from a 64-bit signed or unsigned integer |
| scalar, respectively. The first component of the vector contains the 32 |
| least significant bits of the input; the second component contains the 32 |
| most significant bits. |
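| |
| (NOTE: The following C sketch is illustrative only and is not part of the |
| specification. It models the bit layout described above on the CPU. The |
| "ivec2_model" and "uvec2_model" structures and the "model_*" function |
| names are hypothetical stand-ins for the shading language types and |
| built-ins.) |

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { int32_t x, y; } ivec2_model;
typedef struct { uint32_t x, y; } uvec2_model;

/* First component supplies bits 0..31, second component bits 32..63. */
uint64_t model_packUint2x32(uvec2_model v) {
    return (uint64_t)v.x | ((uint64_t)v.y << 32);
}

uvec2_model model_unpackUint2x32(uint64_t v) {
    uvec2_model r = { (uint32_t)(v & 0xFFFFFFFFu), (uint32_t)(v >> 32) };
    return r;
}

/* The signed variants reuse the unsigned layout and keep the bit pattern. */
int64_t model_packInt2x32(ivec2_model v) {
    uvec2_model u = { (uint32_t)v.x, (uint32_t)v.y };
    uint64_t bits = model_packUint2x32(u);
    int64_t r;
    memcpy(&r, &bits, sizeof r);
    return r;
}

ivec2_model model_unpackInt2x32(int64_t v) {
    uint64_t bits;
    memcpy(&bits, &v, sizeof bits);
    uvec2_model u = model_unpackUint2x32(bits);
    ivec2_model r;
    memcpy(&r.x, &u.x, sizeof r.x);
    memcpy(&r.y, &u.y, sizeof r.y);
    return r;
}

/* Usage example: pack two words and recover them again. */
void model_pack2x32_example(void) {
    uvec2_model v = { 0x89ABCDEFu, 0x01234567u };
    uint64_t packed = model_packUint2x32(v);
    assert(packed == 0x0123456789ABCDEFull);
    uvec2_model back = model_unpackUint2x32(packed);
    assert(back.x == v.x && back.y == v.y);
}
```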
| |
| |
| (add support for 16-bit floating-point packing and unpacking functions) |
| |
| Syntax: |
| |
| uint packFloat2x16(f16vec2 v); |
| f16vec2 unpackFloat2x16(uint v); |
| |
| The function packFloat2x16() returns an unsigned integer obtained by |
| interpreting the components of a two-component 16-bit floating-point |
| vector as integers according to OpenGL Specification, and then packing the |
| two 16-bit integers into a 32-bit unsigned integer. The first vector |
| component specifies the 16 least significant bits of the result; the |
| second component specifies the 16 most significant bits. |
| |
| The function unpackFloat2x16() returns a two-component vector with 16-bit |
| floating-point components obtained by unpacking a 32-bit unsigned integer |
| into a pair of 16-bit values, and interpreting those values as 16-bit |
| floating-point numbers according to the OpenGL Specification. The first |
| component of the vector is obtained from the 16 least significant bits of |
| the input; the second component is obtained from the 16 most significant |
| bits. |
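| |
| (NOTE: The following C sketch is illustrative only and is not part of the |
| specification. C has no standard 16-bit floating-point type, so the |
| sketch assumes each component is already available as its 16-bit encoding |
| and models only the packing step. The "f16vec2_bits" structure and the |
| "model_*" function names are hypothetical.) |

```c
#include <assert.h>
#include <stdint.h>

/* Each member holds the 16-bit encoding of one floating-point component. */
typedef struct { uint16_t x, y; } f16vec2_bits;

/* First component supplies bits 0..15, second component bits 16..31. */
uint32_t model_packFloat2x16(f16vec2_bits v) {
    return (uint32_t)v.x | ((uint32_t)v.y << 16);
}

f16vec2_bits model_unpackFloat2x16(uint32_t p) {
    f16vec2_bits v = { (uint16_t)(p & 0xFFFFu), (uint16_t)(p >> 16) };
    return v;
}

/* Usage example: 0x3C00 encodes 1.0 and 0xC000 encodes -2.0. */
void model_float2x16_example(void) {
    f16vec2_bits v = { 0x3C00u, 0xC000u };
    uint32_t packed = model_packFloat2x16(v);
    assert(packed == 0xC0003C00u);
    f16vec2_bits back = model_unpackFloat2x16(packed);
    assert(back.x == v.x && back.y == v.y);
}
```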
| |
| |
| (add functions to get/set the bit encoding for floating-point values) |
| |
| 64-bit floating-point data types in the OpenGL shading language are |
| specified to be encoded according to the IEEE specification for |
| double-precision floating-point values. The functions below allow shaders |
| to convert double-precision floating-point values to and from 64-bit |
| signed or unsigned integers representing their encoding. |
| |
| To obtain signed or unsigned integer values holding the encoding of a |
| floating-point value, use: |
| |
| genI64Type doubleBitsToInt64(genDType value); |
| genU64Type doubleBitsToUint64(genDType value); |
| |
| Conversions are done on a component-by-component basis. |
| |
| To obtain a floating-point value corresponding to a signed or unsigned |
| integer encoding, use: |
| |
| genDType int64BitsToDouble(genI64Type value); |
| genDType uint64BitsToDouble(genU64Type value); |
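| |
| (NOTE: The following C sketch is illustrative only and is not part of the |
| specification. It models the scalar forms of these functions on the CPU, |
| assuming the host "double" uses the 64-bit IEEE encoding. The "model_*" |
| function names are hypothetical.) |

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the encoding of a double as an unsigned 64-bit integer. */
uint64_t model_doubleBitsToUint64(double value) {
    uint64_t bits;
    assert(sizeof(double) == sizeof(uint64_t)); /* assumption of this sketch */
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Same bit pattern, viewed as a signed 64-bit integer. */
int64_t model_doubleBitsToInt64(double value) {
    int64_t bits;
    assert(sizeof(double) == sizeof(int64_t));
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Reinterpret an integer encoding as a double; no numeric conversion. */
double model_uint64BitsToDouble(uint64_t value) {
    double d;
    assert(sizeof(double) == sizeof(uint64_t));
    memcpy(&d, &value, sizeof d);
    return d;
}

double model_int64BitsToDouble(int64_t value) {
    double d;
    assert(sizeof(double) == sizeof(int64_t));
    memcpy(&d, &value, sizeof d);
    return d;
}
```

| In this model, 1.0 maps to the encoding 0x3FF0000000000000, and passing |
| that encoding back through model_uint64BitsToDouble() returns 1.0. |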
| |
| |
| (add functions to evaluate predicates over groups of threads) |
| |
| Syntax: |
| |
| bool anyThreadNV(bool value); |
| bool allThreadsNV(bool value); |
| bool allThreadsEqualNV(bool value); |
| |
| Implementations of the OpenGL Shading Language may, but are not required |
| to, run multiple shader threads for a single stage as a SIMD thread group, |
| where individual execution threads are assigned to thread groups in an |
| undefined, implementation-dependent order. Algorithms may benefit from |
| being able to evaluate a composite of boolean values over all active |
| threads in the thread group. |
| |
| The function anyThreadNV() returns true if and only if <value> is true for |
| at least one active thread in the group. The function allThreadsNV() |
| returns true if and only if <value> is true for all active threads in the |
| group. The function allThreadsEqualNV() returns true if <value> is the |
| same for all active threads in the group; the result of |
| allThreadsEqualNV() will be true if and only if anyThreadNV() and |
| allThreadsNV() would return the same value. |
| |
| Since these functions depend on the values of <value> in an undefined |
| group of threads, the value returned by these functions is largely |
| undefined. However, anyThreadNV() is guaranteed to return true if <value> |
| is true, and allThreadsNV() is guaranteed to return false if <value> is |
| false. |
| |
| Since implementations are generally not required to combine threads into |
| groups, simply returning <value> for anyThreadNV() and allThreadsNV() and |
| returning true for allThreadsEqualNV() is a legal implementation of these |
| functions. |
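| |
| (NOTE: The following C sketch is illustrative only and is not part of the |
| specification. It models one thread group as two arrays: the <value> |
| argument of each thread and a flag telling whether that thread is active. |
| The array representation and the "model_*" function names are |
| hypothetical; real implementations choose group membership themselves.) |

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* values[i] is the <value> argument of thread i; active[i] tells whether
   thread i is currently active. Both arrays have n entries. */
bool model_anyThreadNV(const bool *values, const bool *active, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (active[i] && values[i]) return true;
    return false;
}

bool model_allThreadsNV(const bool *values, const bool *active, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (active[i] && !values[i]) return false;
    return true;
}

/* True exactly when the "any" and "all" results agree. */
bool model_allThreadsEqualNV(const bool *values, const bool *active, size_t n) {
    size_t active_count = 0;
    for (size_t i = 0; i < n; i++)
        if (active[i]) active_count++;
    assert(active_count > 0); /* the calling thread is always active */
    return model_anyThreadNV(values, active, n) ==
           model_allThreadsNV(values, active, n);
}
```

| A group of size one reproduces the trivial implementation described |
| above: the "any" and "all" results equal <value>, and the "equal" result |
| is always true. |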
| |
| |
| Modify Section 8.6, Vector Relational Functions, p. 90 |
| |
| (modify the first paragraph, p. 90, adding support for relational |
| functions operating on explicitly-sized types) |
| |
| Relational and equality operators (<, <=, >, >=, ==, !=) are defined (or |
| reserved) to operate on scalars and produce scalar Boolean results. For |
| vector results, use the following built-in functions. In the definitions |
| below, the following terms are used as placeholders for all vector types |
| for a given fundamental data type: |
| |
| placeholder fundamental types |
| ----------- ------------------------------------------------ |
| bvec bvec2, bvec3, bvec4 |
| |
| ivec ivec2, ivec3, ivec4, i8vec2, i8vec3, i8vec4, |
| i16vec2, i16vec3, i16vec4, i64vec2, i64vec3, i64vec4 |
| |
| uvec uvec2, uvec3, uvec4, u8vec2, u8vec3, u8vec4, |
| u16vec2, u16vec3, u16vec4, u64vec2, u64vec3, u64vec4 |
| |
| vec vec2, vec3, vec4, dvec2(*), dvec3(*), dvec4(*), |
| f16vec2, f16vec3, f16vec4 |
| |
| (*) only if ARB_gpu_shader_fp64 is supported |
| |
| In all cases, the sizes of the input and return vectors for any |
| particular call must match. |
| |
| |
| Modify Section 8.7, Texture Lookup Functions, p. 91 |
| |
| (modify text for textureOffset() functions, p. 94, allowing non-constant |
| offsets) |
| |
| Do a texture lookup as in texture but with offset added to the (u,v,w) |
| texel coordinates before looking up each texel. The value <offset> need |
| not be constant; however, a limited range of offset values are supported. |
| If any component of <offset> is less than MIN_PROGRAM_TEXEL_OFFSET_EXT or |
| greater than MAX_PROGRAM_TEXEL_OFFSET_EXT, the offset applied to the |
| texture coordinates is undefined. Note that offset does not apply to the |
| layer coordinate for texture arrays. This is explained in detail in |
| section 3.9.9 of the OpenGL Specification (Version 3.2, Compatibility |
| Profile), where offset is (delta_u, delta_v, delta_w). Note that texel |
| offsets are also not supported for cube maps. |
| |
| (Note: This lifting of the constant offset restriction also applies to |
| texelFetchOffset, p. 95, textureProjOffset, p. 95, textureLodOffset, |
| p. 96, textureProjLodOffset, p. 96.) |
| |
| |
| (modify the description of the textureGradOffset() functions, p. 97, |
| preserving the restriction on constant offsets) |
| |
| Do a texture lookup with both explicit gradient and offset, as described |
| in textureGrad and textureOffset. For these functions, the offset value |
| must be a constant expression. A limited range of offset values are |
| supported; the minimum and maximum offset values are |
| implementation-dependent and given by MIN_PROGRAM_TEXEL_OFFSET and |
| MAX_PROGRAM_TEXEL_OFFSET, respectively. |
| |
| |
| (modify the description of the textureProjGradOffset() functions, |
| p. 98, preserving the restriction on constant offsets) |
| |
| Do a texture lookup projectively and with explicit gradient as described |
| in textureProjGrad, as well as with offset, as described in textureOffset. |
| For these functions, the offset value must be a constant expression. A |
| limited range of offset values are supported; the minimum and maximum |
| offset values are implementation-dependent and given by |
| MIN_PROGRAM_TEXEL_OFFSET and MAX_PROGRAM_TEXEL_OFFSET, respectively. |
| |
| (modify the description of the textureGatherOffsets() functions, |
| added in ARB_gpu_shader5, to remove the restriction on constant offsets) |
| |
| The textureGatherOffsets() functions operate identically ... |
| selecting the texel T_i0_j0 of that footprint. The specified values in |
| <offsets> need not be constant. A limited range of ... |
| |
| Modify Section 9, Shading Language Grammar, p. 92 |
| |
| !!! TBD !!! |
| |
| |
| GLX Protocol |
| |
| TBD |
| |
| Interactions with OpenGL ES 3.1 |
| |
| If implemented in OpenGL ES, NV_gpu_shader5 acts as a superset |
| of functionality provided by OES_gpu_shader5. |
| |
| A shader that enables this extension |
| via an #extension directive also implicitly enables the common |
| capabilities provided by OES_gpu_shader5. |
| |
| Replace references to ARB_gpu_shader5 with OES_gpu_shader5 and |
| EXT_shader_implicit_conversions (as appropriate). |
| Replace references to ARB_geometry_shader with OES/EXT_geometry_shader. |
| Replace references to ARB_tessellation_shader with OES/EXT_tessellation_shader. |
| |
| Replace references to int64EXT and uint64EXT with int64 and uint64, |
| respectively. |
| |
| The specification should be edited as follows to include new |
| ProgramUniform* functions. |
| |
| (modify the ProgramUniform* language) |
| |
| The following commands: |
| |
| .... |
| void ProgramUniform{1,2,3,4}{i64,ui64}NV |
| (uint program, int location, T value); |
| void ProgramUniform{1,2,3,4}{i64,ui64}vNV |
| (uint program, int location, const T *value); |
| |
| operate identically to the corresponding command where "Program" is |
| deleted from the name (and extension suffixes are dropped or updated |
| appropriately) except, rather than updating the currently active program |
| object, these "Program" commands update the program object named by the |
| <program> parameter. ... |
| |
| Changes to Section 2.6.1 "Begin and End" don't apply. |
| |
| Disregard the introduction of 64-bit integer or floating-point vertex |
| attribute types. |
| |
| Interactions with OpenGL ES Shading Language 3.10, revision 3 |
| |
| If implemented in GLSL ES, NV_gpu_shader5 acts as a superset |
| of functionality provided by OES_gpu_shader5 and |
| EXT_shader_implicit_conversions. |
| |
| A shader that enables this extension via an #extension directive |
| also implicitly enables the common capabilities provided by |
| OES_gpu_shader5 and EXT_shader_implicit_conversions. |
| |
| Replace references to ARB_tessellation_shader with OES/EXT_tessellation_shader. |
| |
| Implicit conversions between GLSL ES types are introduced by |
| EXT_shader_implicit_conversions instead of ARB_gpu_shader5. |
| |
| Disregard the notion of 'double' types as vertex shader inputs. |
| |
| Section 4.1.7.2 "Images" |
| Remove the third sentence, which restricts access to arrays of images to |
| constant integral expressions. |
| |
| This leaves indexing into arrays of images at the 'dynamically uniform |
| integral expressions' default introduced by OES_gpu_shader5. |
| |
| Modify Section 4.3.9 "Interface Blocks", as modified by OES_gpu_shader5 |
| |
| NV_gpu_shader5 also lifts OES_gpu_shader5 restrictions with |
| regard to indexing into arrays of uniform blocks and shader |
| storage blocks. |
| |
| Change sentence |
| "All indices used to index a shader storage block array must be |
| constant integral expressions. A uniform block array can only |
| be indexed with a dynamically uniform integral expression, |
| otherwise results are undefined." into |
| |
| "Arbitrary indices may be used to index a uniform block array; |
| integral constant expressions are not required. If the index |
| used to access an array of uniform blocks is out-of-bounds, |
| the results of the access are undefined." |
| |
| Indexing into arrays of shader storage blocks defaults to |
| 'dynamically uniform integral expressions'. |
| |
| Changes to Section 4.3.9, p.48 "Interface Blocks" |
| |
| Replace the sentence |
| "All indices used to index a shader storage block array must be |
| constant integral expressions. A uniform block array can only |
| be indexed with a dynamically uniform integral expression, |
| otherwise results are undefined." |
| with |
| "Arbitrary indices may be used to index a uniform block array; |
| integral constant expressions are not required. If the index |
| used to access an array of uniform blocks is out-of-bounds, the |
| results of the access are undefined." |
| |
| 4.4.1.1 "Compute Shader Inputs" change |
| |
| "layout-qualifier-id: |
| local_size_x = integer-constant |
| local_size_y = integer-constant |
| local_size_z = integer-constant" into |
| |
| "layout-qualifier-id: |
| local_size_x = integer-constant-expression |
| local_size_y = integer-constant-expression |
| local_size_z = integer-constant-expression" |
| |
| Section 4.4.1.gs "Geometry Shader Inputs" change |
| |
| "<layout-qualifier-id> |
| ... |
| invocations = integer-constant" into |
| |
| "<layout-qualifier-id> |
| ... |
| invocations = integer-constant-expression" |
| |
| Section 4.4.2 "Output Layout Qualifiers" change |
| |
| "layout-qualifier-id: |
| location = integer-constant" into |
| |
| "layout-qualifier-id: |
| location = integer-constant-expression" |
| |
| Section 4.4.2.ts "Tessellation Control Outputs" change |
| |
| "layout-qualifier-id: |
| vertices = integer-constant" into |
| |
| "layout-qualifier-id: |
| vertices = integer-constant-expression" |
| |
| Section 4.4.3 "Uniform Variable Layout Qualifiers" change |
| |
| "layout-qualifier-id: |
| location = integer-constant" into |
| |
| "layout-qualifier-id: |
| location = integer-constant-expression" |
| |
| Section 4.4.4 "Uniform and Shader Storage Block Layout Qualifiers" change |
| |
| "layout-qualifier-id: |
| ... |
| binding = integer-constant" into |
| |
| "layout-qualifier-id: |
| ... |
| binding = integer-constant-expression" |
| |
| Section 4.4.5 "Opaque Uniform Layout Qualifiers" change |
| |
| "layout-qualifier-id: |
| binding = integer-constant" into |
| |
| "layout-qualifier-id: |
| binding = integer-constant-expression" |
| |
| Change sentence |
| "A link-time error will result if two shaders in a program |
| specify different integer-constant bindings for the same |
| opaque-uniform name." into |
| |
| "A link-time error will result if two shaders in a program |
| specify different bindings for the same opaque-uniform |
| name." |
| |
| Section 4.4.6 "Atomic Counter Layout Qualifiers" change |
| |
| "layout-qualifier-id: |
| binding = integer-constant |
| offset = integer-constant" into |
| |
| "layout-qualifier-id: |
| binding = integer-constant-expression |
| offset = integer-constant-expression" |
| |
| Section 4.4.7 "Format Layout Qualifiers" change |
| |
| "layout-qualifier-id: |
| ... |
| binding = integer-constant" into |
| |
| "layout-qualifier-id: |
| ... |
| binding = integer-constant-expression" |
| |
| Section 4.7.3 "Precision Qualifiers" |
| |
| After "Literal constants do not have precision qualifiers." add |
| "Neither do explicitly sized types such as int8_t, uint32_t, |
| float16_t etc." |
| |
| Dependencies on OES_gpu_shader5 |
| |
| In addition to allowing arbitrary indexing into arrays of samplers, this |
| extension also lifts OES_gpu_shader5 restrictions for indexing |
| arrays of images and shader storage blocks. Additionally, it allows |
| usage of 'integer-constant-expressions' for layout qualifiers that |
| formerly took 'integer-constant'. |
| |
| In Section 'Overview': change the bullet point |
| |
| "* the ability to aggregate samplers into arrays...." |
| |
| to |
| |
| "* the ability to index into arrays of samplers, uniforms and shader |
| storage blocks with arbitrary expressions, and not require that |
| non-constant indices be uniform across all shader invocations." |
| |
| "* the ability to index into arrays of images using dynamically |
| uniform integers." |
| |
| "* the ability to use 'integer-constant-expressions' in place of |
| 'integer-constant' for layout qualifiers." |
| |
| Dependencies on OES/EXT_tessellation_shader and OpenGL ES 3.2 |
| |
| If implemented in OpenGL ES 3.1 or earlier and |
| OES/EXT_tessellation_shader is not supported, language introduced by |
| this extension describing processing patches in geometry shaders, |
| transform feedback, and rasterization should be removed. |
| |
| If implemented in OpenGL ES 3.2 or implemented in |
| OpenGL ES 3.1 and OES/EXT_tessellation_shader is supported: |
| |
| It is legal to send patches past the tessellation stage -- the |
| following language from OES/EXT_tessellation_shader is removed: |
| |
| Patch primitives are not supported by pipeline stages below the |
| tessellation evaluation shader. |
| |
| It is legal to use a tessellation control shader without a tessellation |
| evaluation shader. |
| |
| Remove from the bullet list describing reasons for link failure below the |
| LinkProgram command on p. 70 (as modified by OES/EXT_tessellation_shader): |
| |
| * the program is not separable and contains no object to form a |
| tessellation evaluation shader; or |
| |
| Modify section 11.1.2.1, "Output Variables" on p. 262 (as modified |
| by the OES/EXT_geometry_shader extension): |
| |
| Into the paragraph starting with |
| "Each program object can specify a set of output variables from one |
| shader to be recorded in transform feedback mode..." |
| |
| Insert after the tessellation evaluation shader bullet point: |
| * tessellation control shader |
| |
| |
| Modify section 11.1.3.11, "Validation" to replace the bullet point |
| starting with "One but not both of the tessellation..." on p. 271 |
| |
| * the tessellation evaluation but not tessellation control stage |
| has an active program with corresponding executable shader. |
| |
| |
| Modify section 11.1ts, "Tessellation" |
| |
| Replace |
| "Tessellation is considered active if and only if the active |
| program object or program pipeline object includes both a |
| tessellation control shader and a tessellation evaluation shader." |
| with |
| "Tessellation is considered active if and only if the active |
| program object or program pipeline object includes a tessellation |
| control shader." |
| |
| Replace |
| "An INVALID_OPERATION error is generated by any command that |
| transfers vertices to the GL if the current program state has one |
| but not both of a tessellation control shader and tessellation |
| evaluation shader." |
| with |
| "An INVALID_OPERATION error is generated by any command that |
| transfers vertices to the GL if the current program state has a |
| tessellation evaluation shader but not a tessellation control |
| shader." |
| |
| Modify section 12.1.2 "Transform Feedback Primitive Capture" |
| |
| Replace the second paragraph of the section on p. 274 (as modified |
| by OES/EXT_tessellation_shader): |
| |
| The data captured in transform feedback mode depends on the active |
| programs on each of the shader stages. If a program is active for the |
| geometry shader stage, transform feedback captures the vertices of each |
| primitive emitted by the geometry shader. Otherwise, if a program is |
| active for the tessellation evaluation shader stage, transform feedback |
| captures each primitive produced by the tessellation primitive generator, |
| whose vertices are processed by the tessellation evaluation shader. |
| Otherwise, if a program is active for the tessellation control shader stage, |
| transform feedback captures each output patch of that stage. |
| Otherwise, transform feedback captures each primitive processed by the |
| vertex shader. |
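| |
| (NOTE: The following C sketch is illustrative only and is not part of the |
| specification. It expresses the stage selection order of the paragraph |
| above as a function. The "Stage" and "ActiveStages" types and the |
| function name are hypothetical, and the sketch assumes a vertex shader is |
| always present.) |

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    STAGE_VERTEX,
    STAGE_TESS_CONTROL,
    STAGE_TESS_EVALUATION,
    STAGE_GEOMETRY
} Stage;

/* One flag per shader stage that has an active program. */
typedef struct {
    bool vertex;
    bool tess_control;
    bool tess_evaluation;
    bool geometry;
} ActiveStages;

/* Returns the stage whose output transform feedback captures: the last
   active stage, checked in the order given by the text. */
Stage model_transform_feedback_source(ActiveStages s) {
    assert(s.vertex); /* assumption of this sketch */
    if (s.geometry) return STAGE_GEOMETRY;
    if (s.tess_evaluation) return STAGE_TESS_EVALUATION;
    if (s.tess_control) return STAGE_TESS_CONTROL;
    return STAGE_VERTEX;
}
```

| With only a vertex shader and a tessellation control shader active, the |
| function returns STAGE_TESS_CONTROL, which corresponds to capturing each |
| output patch of that stage. |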
| |
| Modify the second paragraph following ResumeTransformFeedback on p. 277 |
| (as modified by OES/EXT_tessellation_shader): |
| |
| When transform feedback is active and not paused ... If a tessellation |
| or geometry shader is active, the type of primitive emitted |
| by that shader is used instead of the <mode> parameter passed to drawing |
| commands for the purposes of this error check. If tessellation |
| and geometry shaders are both active, the output primitive |
| type of the geometry shader will be used for the purposes of this error. |
| Any primitive type may be used while transform feedback is paused. |
| |
| |
| Modify section 13.3, "Points" |
| |
| After |
| "The point size is determined by the last active stage before the |
| rasterizer:" |
| |
| Add a new bullet point to the list, between the |
| tessellation evaluation shader and the vertex shader: |
| |
| * the tessellation control shader, if active and no tessellation |
| evaluation shader is active; |
| |
| Dependencies on OES/EXT_geometry_shader |
| |
| If implemented in GLSL ES and OES/EXT_geometry_shader is not supported, |
| disregard all changes to geometry shader related functionality. |
| |
| Dependencies on ARB_gpu_shader5 |
| |
| This extension also incorporates all the changes to the OpenGL Shading |
| Language made by ARB_gpu_shader5; enabling this extension by a #extension |
| directive in shader code also enables all features of ARB_gpu_shader5 as |
| though the shader code has also declared |
| |
| #extension GL_ARB_gpu_shader5 : enable |
| |
| The converse is not true; implementations supporting both extensions |
| should not provide the shading language features in this extension if |
| shader code #extension directives enable only ARB_gpu_shader5. |
| |
| This specification and ARB_gpu_shader5 both lift the restriction in GLSL |
| 1.50 requiring that indexing in arrays of samplers must be done with |
| constant expressions. However, ARB_gpu_shader5 specifies that results are |
| undefined if the indices would diverge if multiple shader invocations are |
| run in lockstep. This extension does not impose the non-divergent |
| indexing requirement. |
| |
| Dependencies on ARB_gpu_shader_fp64 |
| |
| This extension and ARB_gpu_shader_fp64 both provide support for shading |
| language variables with 64-bit components. If both extensions are |
| supported, the various edits describing this new support should be |
| combined. |
| |
| If ARB_gpu_shader_fp64 is not supported, the following edits should be |
| removed: |
| |
| * language adding the data types "float64_t", "f64vec2", "f64vec3", and |
| "f64vec4"; |
| |
| * language allowing implicit conversions of various types to double, |
| dvec2, dvec3, or dvec4; and |
| |
| * the built-in functions doubleBitsToInt64(), doubleBitsToUint64(), |
| int64BitsToDouble(), and uint64BitsToDouble(). |
| |
| Dependencies on ARB_tessellation_shader |
| |
| If ARB_tessellation_shader is not supported, language introduced by this |
| extension describing processing patches in geometry shaders, transform |
| feedback, and rasterization should be removed. |
| |
| If this extension and ARB_tessellation_shader are supported, it is legal |
| to send patches past the tessellation stage -- the following language from |
| ARB_tessellation_shader is removed: |
| |
| Patch primitives are not supported by pipeline stages below the |
| tessellation evaluation shader. If there is no active program object or |
| the active program object does not contain a tessellation evaluation |
| shader, the error INVALID_OPERATION is generated by Begin (or vertex |
| array commands that implicitly call Begin) if the primitive mode is |
| PATCHES. |
| |
| Dependencies on NV_shader_buffer_load |
| |
| If NV_shader_buffer_load is supported, that specification should be edited |
| as follows, to allow pointers to dereference the new data types added by |
| this extension. |
| |
| Modify "Section 2.20.X, Shader Memory Access" from NV_shader_buffer_load. |
| |
| (add rules for loads of variables having the new data types from this |
| extension to the list of bullets following "When a shader dereferences a |
| pointer variable") |
| |
| - Data of type "int8_t", "int16_t", "int32_t", and "int64_t" are read |
| from or written to memory as a single 8-, 16-, 32-, or 64-bit signed |
| integer value at the specified GPU address. |
| |
| - Data of type "uint8_t", "uint16_t", "uint32_t", and "uint64_t" are read |
| from or written to memory as a single 8-, 16-, 32-, or 64-bit unsigned |
| integer value at the specified GPU address. |
| |
| - Data of type "float16_t", "float32_t", and "float64_t" are read from or |
| written to memory as a single 16-, 32-, or 64-bit floating-point value |
| at the specified GPU address. |
| |
| Dependencies on EXT_direct_state_access |
| |
| If EXT_direct_state_access is supported, that specification should be |
| edited as follows to include new ProgramUniform* functions. |
| |
| (modify the ProgramUniform* language) |
| |
| The following commands: |
| |
| .... |
| void ProgramUniform{1,2,3,4}{i64,ui64}NV |
| (uint program, int location, T value); |
| void ProgramUniform{1,2,3,4}{i64,ui64}vNV |
| (uint program, int location, const T *value); |
| |
| operate identically to the corresponding command where "Program" is |
| deleted from the name (and extension suffixes are dropped or updated |
| appropriately) except, rather than updating the currently active program |
| object, these "Program" commands update the program object named by the |
| <program> parameter. ... |
| |
| Dependencies on EXT_vertex_attrib_64bit and NV_vertex_attrib_integer_64bit |
| |
| The EXT_vertex_attrib_64bit extension provides the ability to specify |
| 64-bit floating-point vertex attributes in a GLSL vertex shader and to |
| specify the values of these attributes via the OpenGL API. To |
| successfully compile vertex shaders with fp64 input variables, it is |
| necessary to include |
| |
| #extension GL_EXT_vertex_attrib_64bit : enable |
| |
| in the shader text. |
| |
| However, this extension is considered to enable 64-bit |
| floating-point and integer inputs. Provided EXT_vertex_attrib_64bit |
| and NV_vertex_attrib_integer_64bit are supported, including the |
| following code in a vertex shader |
| |
| #extension GL_NV_gpu_shader5 : enable |
| |
| will enable 64-bit floating-point or integer input variables whose |
| values would be specified using the OpenGL API mechanisms found in |
| the EXT_vertex_attrib_64bit and NV_vertex_attrib_integer_64bit |
| extensions. |
| |
| |
| Errors |
| |
| None. |
| |
| New State |
| |
| None. |
| |
| New Implementation Dependent State |
| |
| None. |
| |
| Issues |
| |
| (1) What implicit conversions are supported by this extension on top of |
| those provided by related extensions? |
| |
| RESOLVED: ARB_gpu_shader5 and ARB_gpu_shader_fp64 provide new implicit |
| conversions from "int" to "uint", and from "int", "uint", and "float" to |
| "double". |
| |
| This extension provides integer types of multiple sizes and supports |
| implicit conversions from small integer types to 32- or 64-bit integer |
| types of the same signedness, as well as float and double. It also |
| provides floating-point types of multiple sizes and supports implicit |
| conversions from smaller to larger types. Additionally, it supports |
| conversion from 64-bit integer types to double. |
| |
| (2) How do these implicit conversions impact binary operators? |
| |
| RESOLVED: For binary operators, we prefer converting to a common type |
| that is as close as possible in size and type to the original |
| expression. |
| |
| (3) How do these implicit conversions impact function overloading rules? |
| |
| RESOLVED: We extend the preference rules in ARB_gpu_shader5 to account |
| for the new data types, adding rules to: |
| |
| * favor new "promotions" in integer/floating point types (previously, |
| the only promotion was float-to-double) |
| |
| * for promotions, favor conversion to the type closer in size (e.g., |
| prefer converting from int16_t to int over converting to int64_t) |
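| |
| The size preference in the second bullet can be sketched in C as follows. |
| This is a deliberately simplified model that looks only at bit widths |
| within one kind of type (for example, signed integers); it ignores the |
| other overloading rules. The function name is hypothetical. |
| |
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the extension: given the bit width of
 * an argument and the bit widths of candidate parameter types of the same
 * kind, return the index of the preferred candidate, i.e. the narrowest
 * one that is at least as wide as the argument.
 * Returns -1 when no candidate is wide enough. */
int preferred_candidate(int arg_bits, const int *candidate_bits, size_t count)
{
    int best = -1;
    assert(candidate_bits != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (candidate_bits[i] < arg_bits)
            continue;                 /* narrowing is never implicit */
        if (best < 0 || candidate_bits[i] < candidate_bits[best])
            best = (int)i;
    }
    return best;
}
```
| |
| With a 16-bit argument and candidates of 64 and 32 bits, the model picks |
| the 32-bit candidate, matching the int16_t example above. |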
| |
| (4) What should be done to distinguish between 32- and 64-bit integer |
| constants? |
| |
| RESOLVED: We will use "L" and "UL" to identify signed and unsigned |
| 64-bit integer constants; the use of "L" matches a similar ("long") |
| suffix in the C programming language. C leaves the size of integer |
| types implementation-dependent, and many implementations require an "LL" |
| suffix to declare 64-bit integer constants. With our size definitions, |
| "L" will be considered sufficient to make an integer constant 64-bit. |
| |
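| For comparison, the C behavior mentioned in issue (4) can be observed |
| with the sketch below. C only guarantees that "long" has at least 32 |
| bits and "long long" at least 64, so the storage size of an "L" constant |
| varies between implementations. The function names are hypothetical. |
| |
```c
#include <limits.h>
#include <stddef.h>

/* Storage size in bits of a C constant with the "L" suffix (type long).
 * Implementation-dependent: commonly 32 or 64. */
size_t bits_of_long_constant(void)
{
    return sizeof(1L) * CHAR_BIT;
}

/* Storage size in bits of a C constant with the "LL" suffix
 * (type long long). Always at least 64. */
size_t bits_of_long_long_constant(void)
{
    return sizeof(1LL) * CHAR_BIT;
}
```
| |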
| (5) Should we provide support for vertex attributes with 64-bit components, |
| and if so, how should the support be provided in the OpenGL API? |
| |
| RESOLVED: Yes, this seems like useful functionality, particularly for |
| applications wanting to provide double-precision or 64-bit integer data |
| to shaders performing computations on such types. We provide |
| VertexAttribL* entry points for 64-bit components in the separate |
| EXT_vertex_attrib_64bit and NV_vertex_attrib_integer_64bit extensions, which |
| should be supported on all implementations supporting this extension. |
| |
| (6) Should we allow vertex attributes with 8- or 16-bit components in the |
| shading language, and if so, how does it interact with the OpenGL API? |
| |
| RESOLVED: Yes, but we will use existing APIs to specify such |
| attributes, which already typically allow 8- and 16-bit components on |
| the API side. Vertex attribute components (other than 64-bit ones) |
| specified by the API will be converted from the type specified in the |
| vertex attribute commands to the component type of the attribute. For |
| floating-point values, that may involve 16-to-32 bit conversion or vice |
| versa. For integer types, that may involve dropping all but the least |
| significant bits of attribute components. |
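| |
| For the integer case, "dropping all but the least significant bits" can |
| be modeled in C as below. This is an illustrative sketch assuming a |
| 32-bit unsigned value on the API side; the function names are |
| hypothetical. |
| |
```c
#include <stdint.h>

/* Hypothetical model: narrow a 32-bit API-side integer component to an
 * 8-bit shader component by keeping only the low 8 bits. */
uint8_t narrow_to_uint8(uint32_t api_value)
{
    return (uint8_t)(api_value & 0xFFu);
}

/* Same idea for a 16-bit shader component: keep only the low 16 bits. */
uint16_t narrow_to_uint16(uint32_t api_value)
{
    return (uint16_t)(api_value & 0xFFFFu);
}
```
| |
| Under this model, an API value of 0x12345678 becomes 0x78 in a uint8_t |
| component and 0x5678 in a uint16_t component. |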
| |
| (7) Should we support uniforms with double or 64-bit attribute types, and |
| if so, how? Should we support uniforms with <32-bit components, and |
| if so, how? |
| |
| RESOLVED: We will support uniforms of all component types, either in a |
| buffer object (via OpenGL 3.1 or ARB_uniform_buffer_object) or in |
| storage associated with the program. |
| |
| When uniforms are stored in a buffer object, they are stored using their |
| native data types according to the pre-existing packing and layout |
| rules. Those rules were already written to be able to accommodate both |
| the larger and smaller new data types. |
| |
| Uniforms stored in program objects are loaded with Uniform* APIs. There |
| are no pre-existing uniform APIs accepting doubles or other "long" |
| types, so there was no clear need to add an extra "L" to the name to |
| distinguish from other APIs like we do with VertexAttribL* APIs. |
| |
| Uniforms with 8- and 16- bit components are loaded with the "larger" |
| Uniform*{i,ui,f} APIs; it didn't seem worth it to add numerous entry |
| points to the APIs to handle all those new types. |
| |
| (8) How do the uniform loading commands introduced by this extension |
| interact with similar commands added by NV_shader_buffer_load? |
| |
| RESOLVED: NV_shader_buffer_load provided the command Uniformui64NV to |
| load pointer uniforms with a single 64-bit unsigned integer. This |
| extension provides vectors of 64-bit unsigned integers, so we needed |
| Uniform{2,3,4}ui64NV commands. We chose to provide a Uniform1ui64NV |
| command, which will be functionally equivalent to Uniformui64NV. |
| |
| (9) How will transform feedback work for capturing variables with double |
| or 64-bit components? Should we support transform feedback on |
| variables with components with fewer than 32 bits? |
| |
| RESOLVED: Transform feedback will support variables with any component |
| size. Components with fewer than 32 bits are converted to their |
| equivalent 32-bit types. |
| |
| For doubles and variables with 64-bit components, each component |
| captured will count as a 64-bit value and occupy two components for the |
| purpose of component counting rules. This could be a problem for the |
| SEPARATE_ATTRIBS mode, since the minimum component limit is four, which |
| would not be sufficient to capture a dvec3 or dvec4. However, |
| implementations supporting this extension should also be able to support |
| ARB_transform_feedback3, which extends INTERLEAVED_ATTRIBS mode to |
| capture vertex attribute values interleaved into multiple buffers. That |
| functionality effectively obsoletes the SEPARATE_ATTRIBS mode, since it |
| is a functional superset. |
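| |
| The component counting described above can be sketched in C as follows. |
| The limit of four is the minimum SEPARATE_ATTRIBS component limit cited |
| in this issue; the function and constant names are hypothetical. |
| |
```c
#include <assert.h>

/* Minimum SEPARATE_ATTRIBS component limit cited in the text above. */
enum { MIN_SEPARATE_COMPONENTS = 4 };

/* Components consumed by one captured vector. Components of 64 bits
 * count twice; narrower components are captured as 32-bit values and
 * count once. */
int captured_components(int vector_size, int component_bits)
{
    assert(vector_size >= 1 && vector_size <= 4);
    return vector_size * (component_bits == 64 ? 2 : 1);
}

/* Nonzero if the vector fits within the minimum separate-mode limit. */
int fits_minimum_separate_limit(int vector_size, int component_bits)
{
    return captured_components(vector_size, component_bits)
           <= MIN_SEPARATE_COMPONENTS;
}
```
| |
| A dvec2 counts as four components and fits; a dvec3 counts as six and a |
| dvec4 as eight, so neither fits within the minimum limit. |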
| |
| We considered support for capturing 8- and 16-bit values directly, which |
| had a number of problems. First, full byte addressing might impose both |
| alignment issues (e.g., capturing a uint8_t followed by a float might |
| misalign the float) and additional hardware implementation burdens. One |
| other option would be to pack multiple values into a 32-bit integer |
| (e.g., f16vec2 would be packed with .x in the LSBs and .y in the MSBs). |
| This could work, even with word addressing, but would require padding |
| for odd sizes (e.g., f16vec3 padded to two words, with the second word |
| holding only .z). It would also have endianness issues; packed values |
| would look like arrays of the corresponding smaller type on |
| little-endian systems, but not on big-endian ones. |
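| |
| The endianness concern can be demonstrated on the host with the C sketch |
| below, which packs two 16-bit values into a 32-bit word (.x in the LSBs, |
| .y in the MSBs) and then views the word as an array of two 16-bit |
| values. The function names are hypothetical, and only little- and |
| big-endian hosts are considered. |
| |
```c
#include <stdint.h>
#include <string.h>

/* Pack two 16-bit values into one 32-bit word: x in the least
 * significant bits, y in the most significant bits. */
uint32_t pack_2x16(uint16_t x, uint16_t y)
{
    return (uint32_t)x | ((uint32_t)y << 16);
}

/* Returns 1 if the host stores the least significant byte first. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1u;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* View the packed word as an array of two 16-bit values, as a reader of
 * the captured buffer would. */
void view_as_u16_array(uint32_t word, uint16_t out[2])
{
    memcpy(out, &word, sizeof word);
}
```
| |
| On a little-endian host the array reads back as {x, y}; on a big-endian |
| host it reads back as {y, x}. |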
| |
| (10) What precision will be used for computation, storage, and inter-stage |
| transfer of 8- and 16-bit component data types? |
| |
| RESOLVED: The components may be considered to occupy a full 32 bits for |
| the purposes of input/output component count limits. 8- and 16-bit |
| values should, however, be passed at that precision. |
| |
| (11) Is the new support for non-constant texel offsets completely |
| orthogonal? |
| |
| RESOLVED: No. Non-constant offsets are not supported for the existing |
| functions textureGradOffset() and textureProjGradOffset(). |
| |
| (12) Should we provide functions like intBitsToFloat() that operate on |
| 16-bit floating-point values? |
| |
| RESOLVED: Not in this extension. Such conversions can be performed |
| using the following code: |
| |
| uint16_t float16BitsToUint16(float16_t v) |
| { |
| return uint16_t(packFloat2x16(f16vec2(v, 0))); |
| } |
| |
| float16_t uint16BitsToFloat16(uint16_t v) |
| { |
| return unpackFloat2x16(uint(v)).x; |
| } |
| |
| (13) Should we provide distinct sized types for 32-bit integers and |
| floats, and 64-bit floats? Should we provide those types as aliases |
| for existing unsized types? Or should we provide no such types at |
| all? |
| |
| RESOLVED: We will provide sized versions of these types, which are |
| defined as completely equivalent to unsized types according to the |
| following table: |
| |
| unsized type sized types |
| ------------- --------------- |
| int int32_t |
| uint uint32_t |
| float float32_t |
| double float64_t |
| |
| Vector types with sized and unsized components have equivalent |
| relationships. |
| |
| Note that the nominally "unsized" data types in the GLSL 1.30 spec are |
| actually sized. The specification explicitly defines signed and unsigned |
| integers (int, uint) to be 32-bit values. It also defines |
| floating-point values to "match the IEEE single precision floating-point |
| definition for precision and dynamic range", which are also 32-bit |
| values. |
| |
| This type equivalence has minor implications on function overloading: |
| |
| * You can't declare separate versions of a function with an "int" |
| argument in one version and an "int32_t" argument in another. |
| |
| * Because there is no implicit conversion between equivalent types, we |
| will get an exact match if an argument is declared with one type |
| (e.g., "int") in the caller and a textually different but equivalent |
| type ("int32_t") in the function. |
| |
| Note that the type equivalence also applies to API data type queries. |
| For example, the type INT will be returned for a variable declared as |
| "int32_t". |
| |
| (14) What are functions like anyThreadNV() and allThreadsNV() good for? |
| |
| RESOLVED: If an implementation performs SIMD thread execution, |
| divergent branching may result in reduced performance if the "if" and |
| "else" blocks of an "if" statement are executed sequentially. For |
| example, an algorithm may have both a "fast path" that performs a |
| computation quickly for a subset of all cases and a "slow path" that |
| performs a computation slowly but correctly. When performing SIMD |
| execution, code like the following: |
| |
| if (condition) { |
| result = do_fast_path(...); |
| } else { |
| result = do_slow_path(...); |
| } |
| |
| may end up executing *both* the fast and slow paths for a SIMD thread |
| group if <condition> diverges, and may execute more slowly than simply |
| executing the slow path unconditionally. These functions allow code |
| like: |
| |
| if (allThreadsNV(condition)) { |
| result = do_fast_path(...); |
| } else { |
| result = do_slow_path(...); |
| } |
| |
| that executes the fast path if and only if it can be used for *all* |
| threads in the group. For thread groups where <condition> diverges, |
| this algorithm would unconditionally run the slow path, but would never |
| run both in sequence. |
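| |
| The difference between the two forms can be modeled on the host with the |
| C sketch below, which treats a thread group as an array of per-thread |
| condition values. This is an illustrative model only, not how |
| allThreadsNV() is implemented; the function names are hypothetical. |
| |
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a group-wide vote: true only if the condition
 * holds for every thread in the group. */
bool all_threads(const bool *condition, size_t group_size)
{
    assert(condition != NULL || group_size == 0);
    for (size_t i = 0; i < group_size; ++i) {
        if (!condition[i])
            return false;
    }
    return true;
}

/* Number of code paths a group executes with a plain per-thread branch:
 * two if the condition diverges within the group, otherwise one. */
int paths_with_plain_branch(const bool *condition, size_t group_size)
{
    bool any_true = false, any_false = false;
    assert(condition != NULL || group_size == 0);
    for (size_t i = 0; i < group_size; ++i) {
        if (condition[i])
            any_true = true;
        else
            any_false = true;
    }
    return (any_true ? 1 : 0) + (any_false ? 1 : 0);
}
```
| |
| When the vote is used as the branch condition, every thread in the group |
| sees the same result, so the group executes exactly one path. |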
| |
| There may be other cases where "voting" across shader invocations may be |
| useful. Note that we provide no control over how shader invocations may |
| be packed within a SIMD thread group, unlike various "compute" APIs |
| (CUDA, OpenCL). |
| |
| (15) Can the 64-bit uniform APIs be used to load values for uniforms of |
| type "bool", "bvec2", "bvec3", or "bvec4"? |
| |
| RESOLVED: No. OpenGL 2.0 and beyond did allow "bool" variables to be |
| set with Uniform*i* and Uniform*f APIs, and OpenGL 3.0 extended that |
| support to Uniform*ui* for orthogonality. But it seems pointless to |
| extend this capability forward to 64-bit Uniform APIs as well. |
| |
| (19) The ARB_tessellation_shader extension adds support for patch |
| primitives that might survive to the transform feedback stage. How |
| are such primitives captured? |
| |
| RESOLVED: If patch primitives survive to the transform feedback stage, |
| they are recorded on a patch-by-patch basis. Incomplete patches are not |
| recorded. As with other primitive types, if the transform feedback |
| buffers do not contain enough space to capture an entire patch, no |
| vertices are recorded. |
| |
| Note that the only way to get patch primitives all the way to transform |
| feedback is to have tessellation evaluation and geometry shaders |
| disabled; the output streams from both of those shader stages are |
| collections of points, lines, or triangles. |
| |
| (20) Previous transform feedback allowed capturing only fixed-size |
| primitives; this extension supports variable-sized patches. What |
| interactions does this functionality have with transform feedback |
| buffer overflow? |
| |
| RESOLVED: With fixed-size point, line, or triangle primitives, once any |
| primitive fails to be recorded due to insufficient space, all subsequent |
| primitives would also fail. With variable-size patch primitives, the |
| transform feedback stage might first receive a large patch that doesn't |
| fit, followed by a smaller patch that could squeeze into the remaining |
| space. |
| |
| To allow for different types of implementation of this extension without |
| requiring special-case handling of this corner case, we've chosen to |
| leave this behavior undefined -- the smaller patch may or may not be |
| recorded. |
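| |
| The two behaviors permitted by this resolution can be sketched in C as |
| follows. The model counts vertices rather than bytes, and the |
| "sticky_overflow" switch is a hypothetical way of selecting between the |
| two implementation styles; neither name comes from the specification. |
| |
```c
#include <stdbool.h>
#include <stddef.h>

/* Count the patches recorded into a buffer holding `capacity` vertices.
 * sticky_overflow = true:  once one patch fails to fit, all later
 *                          patches are dropped as well.
 * sticky_overflow = false: each patch is tested independently against
 *                          the remaining space.
 * A patch is recorded whole or not at all. */
size_t recorded_patches(const size_t *patch_sizes, size_t count,
                        size_t capacity, bool sticky_overflow)
{
    size_t used = 0, recorded = 0;
    bool overflowed = false;
    for (size_t i = 0; i < count; ++i) {
        if (overflowed && sticky_overflow)
            continue;                        /* drop everything after a failure */
        if (patch_sizes[i] <= capacity - used) {
            used += patch_sizes[i];          /* the whole patch fits */
            ++recorded;
        } else {
            overflowed = true;               /* no vertices are written */
        }
    }
    return recorded;
}
```
| |
| For patches of 4, 16, and 3 vertices and a capacity of 8, the sticky |
| model records one patch and the non-sticky model records two. With |
| fixed-size primitives the two models always agree. |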
| |
| |
| Revision History |
| |
| Rev. Date Author Changes |
| ---- -------- -------- ----------------------------------------- |
| 11 03/07/17 mheyer Update OpenGL ES interactions to clarify |
| that using a tessellation control shader |
| without a tessellation evaluation shader |
| is legal, and PATCHES can be sent past the |
| tessellation stage. |
| |
| 10 04/16/16 mheyer Add OpenGL ES interactions (written before |
| revision 9, but not published) |
| |
| 9 02/19/16 pbrown Clarify that non-constant offset vectors are |
| supported in textureGatherOffsets(). |
| |
| 8 09/11/14 pbrown Fix incorrect implicit conversions, which |
| follow the general pattern of little->big |
| and int->uint->float. Thanks to Daniel |
| Rakos, author of similar functionality in |
| the AMD_gpu_shader_int64 spec. |
| |
| 7 11/08/10 pbrown Fix typos in description of packFloat2x16 and |
| unpackFloat2x16. |
| |
| 6 03/23/10 pbrown Update overview, dependencies, remove references |
| to old extension names. Extend the function |
| overloading prioritization rules from |
| ARB_gpu_shader5 to account for new data types. |
| Major overhaul of the issues section to match |
| the refactoring done to produce ARB specs. |
| |
| 5 03/08/10 pbrown Add interaction with EXT_vertex_attrib_64bit and |
| NV_vertex_attrib_integer_64bit; enabling this |
| extension automatically enables 64-bit floating- |
| point and integer vertex inputs. |
| |
| 4 03/01/10 pbrown Fix prototype for GetUniformui64vNV. |
| |
| 3 01/14/10 pbrown Fix with updated enum assignments. |
| |
| 2 12/08/09 pbrown Add explicit component counting rules for |
| 64-bit integer attributes similar to those |
| in the ARB_gpu_shader_fp64 spec. |
| |
| 1 pbrown Internal revisions. |