extensions/NV/NV_gpu_shader5.txt - external/github.com/KhronosGroup/OpenGL-Registry - Git at Google

 Name

     NV_gpu_shader5

 Name Strings

     GL_NV_gpu_shader5

 Contact

     Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

 Contributors

     Barthold Lichtenbelt, NVIDIA
     Chris Dodd, NVIDIA
     Eric Werness, NVIDIA
     Greg Roth, NVIDIA
     Jeff Bolz, NVIDIA
     Piers Daniell, NVIDIA
     Daniel Rakos, AMD
     Mathias Heyer, NVIDIA

 Status

     Shipping.

 Version

     Last Modified Date:         03/07/2017
     NVIDIA Revision:            11

 Number

     OpenGL Extension #389
     OpenGL ES Extension #260

 Dependencies

     This extension is written against the OpenGL 3.2 (Compatibility Profile)
     Specification.

     This extension is written against version 1.50 (revision 09) of the OpenGL
     Shading Language Specification.

     If implemented in OpenGL, OpenGL 3.2 and GLSL 1.50 are required.

     If implemented in OpenGL, ARB_gpu_shader5 is required.

     This extension interacts with ARB_gpu_shader5.

     This extension interacts with ARB_gpu_shader_fp64.

     This extension interacts with ARB_tessellation_shader.

     This extension interacts with NV_shader_buffer_load.

     This extension interacts with EXT_direct_state_access.

     This extension interacts with EXT_vertex_attrib_64bit and
     NV_vertex_attrib_integer_64bit.

     This extension interacts with OpenGL ES 3.1 (dated October 29th 2014).

     This extension interacts with OpenGL ES Shading Language 3.1 (revision 3).

     If implemented in OpenGL ES, OpenGL ES 3.1 and GLSL ES 3.10 are required.

     If implemented in OpenGL ES, OES/EXT_gpu_shader5 and EXT_shader_implicit-
     _conversions are required.

     This extension interacts with OES/EXT_tessellation_shader

     This extension interacts with OES/EXT_geometry_shader

 Overview

     This extension provides a set of new features to the OpenGL Shading
     Language and related APIs to support capabilities of new GPUs.  Shaders
     using the new functionality provided by this extension should enable this
     functionality via the construct

       #extension GL_NV_gpu_shader5 : require     (or enable)

     This extension was developed concurrently with the ARB_gpu_shader5
     extension, and provides a superset of the features provided there.  The
     features common to both extensions are documented in the ARB_gpu_shader5
     specification; this document describes only the addition language features
     not available via ARB_gpu_shader5.  A shader that enables this extension
     via an #extension directive also implicitly enables the common
     capabilities provided by ARB_gpu_shader5.

     In addition to the capabilities of ARB_gpu_shader5, this extension
     provides a variety of new features for all shader types, including:

       * support for a full set of 8-, 16-, 32-, and 64-bit scalar and vector
         data types, including uniform API, uniform buffer object, and shader
         input and output support;

       * the ability to aggregate samplers into arrays, index these arrays with
         arbitrary expressions, and not require that non-constant indices be
         uniform across all shader invocations;

       * new built-in functions to pack and unpack 64-bit integer types into a
         two-component 32-bit integer vector;

       * new built-in functions to pack and unpack 32-bit unsigned integer
         types into a two-component 16-bit floating-point vector;

       * new built-in functions to convert double-precision floating-point
         values to or from their 64-bit integer bit encodings;

       * new built-in functions to compute the composite of a set of boolean
         conditions a group of shader threads;

       * vector relational functions supporting comparisons of vectors of 8-,
         16-, and 64-bit integer types or 16-bit floating-point types; and

       * extending texel offset support to allow loading texel offsets from
         regular integer operands computed at run-time, except for lookups with
         gradients (textureGrad*).

     This extension also provides additional support for processing patch
     primitives (introduced by ARB_tessellation_shader).
     ARB_tessellation_shader requires the use of a tessellation evaluation
     shader when processing patches, which means that patches will never
     survive past the tessellation pipeline stage.  This extension lifts that
     restriction, and allows patches to proceed further in the pipeline and be
     used

       * as input to a geometry shader, using a new "patches" layout qualifier;

       * as input to transform feedback;

       * by fixed-function rasterization stages, in which case the patches are
         drawn as independent points.

     Additionally, it allows geometry shaders to read per-patch attributes
     written by a tessellation control shader using input variables declared
     with "patch in".


 New Procedures and Functions

     void Uniform1i64NV(int location, int64EXT x);
     void Uniform2i64NV(int location, int64EXT x, int64EXT y);
     void Uniform3i64NV(int location, int64EXT x, int64EXT y, int64EXT z);
     void Uniform4i64NV(int location, int64EXT x, int64EXT y, int64EXT z,
                        int64EXT w);
     void Uniform1i64vNV(int location, sizei count, const int64EXT *value);
     void Uniform2i64vNV(int location, sizei count, const int64EXT *value);
     void Uniform3i64vNV(int location, sizei count, const int64EXT *value);
     void Uniform4i64vNV(int location, sizei count, const int64EXT *value);

     void Uniform1ui64NV(int location, uint64EXT x);
     void Uniform2ui64NV(int location, uint64EXT x, uint64EXT y);
     void Uniform3ui64NV(int location, uint64EXT x, uint64EXT y, uint64EXT z);
     void Uniform4ui64NV(int location, uint64EXT x, uint64EXT y, uint64EXT z,
                        uint64EXT w);
     void Uniform1ui64vNV(int location, sizei count, const uint64EXT *value);
     void Uniform2ui64vNV(int location, sizei count, const uint64EXT *value);
     void Uniform3ui64vNV(int location, sizei count, const uint64EXT *value);
     void Uniform4ui64vNV(int location, sizei count, const uint64EXT *value);

     void GetUniformi64vNV(uint program, int location, int64EXT *params);


     (The following function is also provided by NV_shader_buffer_load.)

     void GetUniformui64vNV(uint program, int location, uint64EXT *params);


     (All of the following ProgramUniform* functions are supported if and only
      if implemented in OpenGL ES or EXT_direct_state_access is supported.)

     void ProgramUniform1i64NV(uint program, int location, int64EXT x);
     void ProgramUniform2i64NV(uint program, int location, int64EXT x,
                               int64EXT y);
     void ProgramUniform3i64NV(uint program, int location, int64EXT x,
                               int64EXT y, int64EXT z);
     void ProgramUniform4i64NV(uint program, int location, int64EXT x,
                               int64EXT y, int64EXT z, int64EXT w);
     void ProgramUniform1i64vNV(uint program, int location, sizei count,
                                const int64EXT *value);
     void ProgramUniform2i64vNV(uint program, int location, sizei count,
                                const int64EXT *value);
     void ProgramUniform3i64vNV(uint program, int location, sizei count,
                                const int64EXT *value);
     void ProgramUniform4i64vNV(uint program, int location, sizei count,
                                const int64EXT *value);

     void ProgramUniform1ui64NV(uint program, int location, uint64EXT x);
     void ProgramUniform2ui64NV(uint program, int location, uint64EXT x,
                                uint64EXT y);
     void ProgramUniform3ui64NV(uint program, int location, uint64EXT x,
                                uint64EXT y, uint64EXT z);
     void ProgramUniform4ui64NV(uint program, int location, uint64EXT x,
                                uint64EXT y, uint64EXT z, uint64EXT w);
     void ProgramUniform1ui64vNV(uint program, int location, sizei count,
                                 const uint64EXT *value);
     void ProgramUniform2ui64vNV(uint program, int location, sizei count,
                                 const uint64EXT *value);
     void ProgramUniform3ui64vNV(uint program, int location, sizei count,
                                 const uint64EXT *value);
     void ProgramUniform4ui64vNV(uint program, int location, sizei count,
                                 const uint64EXT *value);


 New Tokens

     Returned by the <type> parameter of GetActiveAttrib, GetActiveUniform, and
     GetTransformFeedbackVarying:

         INT64_NV                                        0x140E
         UNSIGNED_INT64_NV                               0x140F

         INT8_NV                                         0x8FE0
         INT8_VEC2_NV                                    0x8FE1
         INT8_VEC3_NV                                    0x8FE2
         INT8_VEC4_NV                                    0x8FE3
         INT16_NV                                        0x8FE4
         INT16_VEC2_NV                                   0x8FE5
         INT16_VEC3_NV                                   0x8FE6
         INT16_VEC4_NV                                   0x8FE7
         INT64_VEC2_NV                                   0x8FE9
         INT64_VEC3_NV                                   0x8FEA
         INT64_VEC4_NV                                   0x8FEB
         UNSIGNED_INT8_NV                                0x8FEC
         UNSIGNED_INT8_VEC2_NV                           0x8FED
         UNSIGNED_INT8_VEC3_NV                           0x8FEE
         UNSIGNED_INT8_VEC4_NV                           0x8FEF
         UNSIGNED_INT16_NV                               0x8FF0
         UNSIGNED_INT16_VEC2_NV                          0x8FF1
         UNSIGNED_INT16_VEC3_NV                          0x8FF2
         UNSIGNED_INT16_VEC4_NV                          0x8FF3
         UNSIGNED_INT64_VEC2_NV                          0x8FF5
         UNSIGNED_INT64_VEC3_NV                          0x8FF6
         UNSIGNED_INT64_VEC4_NV                          0x8FF7
         FLOAT16_NV                                      0x8FF8
         FLOAT16_VEC2_NV                                 0x8FF9
         FLOAT16_VEC3_NV                                 0x8FFA
         FLOAT16_VEC4_NV                                 0x8FFB

     (If ARB_tessellation_shader is supported, the following enum is accepted
      by a new primitive.)

     Accepted by the <primitiveMode> parameter of BeginTransformFeedback:

         PATCHES


 Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
 (OpenGL Operation)

     Modify Section 2.6.1, Begin and End, p. 22

     (Extend language describing PATCHES introduced by ARB_tessellation_shader.
     It particular, add the following to the end of the description of the
     primitive type.)

     If a patch primitive is drawn, each patch is drawn separately as a
     collection of points, which each patch vertex definining a separate point.
     Extra vertices from an incomplete patch are never drawn.


     Modify Section 2.14.3, Vertex Attributes, p. 86

     (modify the second paragraph, p. 87) ... exceeds MAX_VERTEX_ATTRIBS.  For
     the purposes of this comparison, attribute variables of the type i64vec3,
     u64vec3, i64vec4, and u64vec4 count as consuming twice as many attributes
     as equivalent single-precision types.


     (extend the list of types in the first paragraph, p. 88)
     ... UNSIGNED_INT_VEC3, UNSIGNED_INT_VEC4, INT8_NV, INT8_VEC2_NV,
     INT8_VEC3_NV, INT8_VEC4_NV, INT16_NV, INT16_VEC2_NV, INT16_VEC3_NV,
     INT16_VEC4_NV, INT64_NV, INT64_VEC2_NV, INT64_VEC3_NV, INT64_VEC4_NV,
     UNSIGNED_INT8_NV, UNSIGNED_INT8_VEC2_NV, UNSIGNED_INT8_VEC3_NV,
     UNSIGNED_INT8_VEC4_NV, UNSIGNED_INT16_NV, UNSIGNED_INT16_VEC2_NV,
     UNSIGNED_INT16_VEC3_NV, UNSIGNED_INT16_VEC4_NV, UNSIGNED_INT64_NV,
     UNSIGNED_INT64_VEC2_NV, UNSIGNED_INT64_VEC3_NV, UNSIGNED_INT64_VEC4_NV,
     FLOAT16_NV, FLOAT16_VEC2_NV, FLOAT16_VEC3_NV, or FLOAT16_VEC4_NV.


     Modify Section 2.14.4, Uniform Variables, p. 89

     (modify third paragraph, p. 90) ... uniform variable storage for a vertex
     shader.  A scalar or vector uniform with with 64-bit integer components
     will consume no more than 2<n> components, where <n> is 1 for scalars, and
     the component count for vectors.  A link error is generated ...

     (add to Table 2.13, p. 96)

       Type Name Token           Keyword
       --------------------      ----------------
       INT8_NV                   int8_t
       INT8_VEC2_NV              i8vec2
       INT8_VEC3_NV              i8vec3
       INT8_VEC4_NV              i8vec4
       INT16_NV                  int16_t
       INT16_VEC2_NV             i16vec2
       INT16_VEC3_NV             i16vec3
       INT16_VEC4_NV             i16vec4
       INT64_NV                  int64_t
       INT64_VEC2_NV             i64vec2
       INT64_VEC3_NV             i64vec3
       INT64_VEC4_NV             i64vec4
       UNSIGNED_INT8_NV          uint8_t
       UNSIGNED_INT8_VEC2_NV     u8vec2
       UNSIGNED_INT8_VEC3_NV     u8vec3
       UNSIGNED_INT8_VEC4_NV     u8vec4
       UNSIGNED_INT16_NV         uint16_t
       UNSIGNED_INT16_VEC2_NV    u16vec2
       UNSIGNED_INT16_VEC3_NV    u16vec3
       UNSIGNED_INT16_VEC4_NV    u16vec4
       UNSIGNED_INT64_NV         uint64_t
       UNSIGNED_INT64_VEC2_NV    u64vec2
       UNSIGNED_INT64_VEC3_NV    u64vec3
       UNSIGNED_INT64_VEC4_NV    u64vec4
       FLOAT16_NV                float16_t
       FLOAT16_VEC2_NV           f16vec2
       FLOAT16_VEC3_NV           f16vec3
       FLOAT16_VEC4_NV           f16vec4

     (modify list of commands at the bottom of p. 99)

       void Uniform{1,2,3,4}{i64,ui64}NV(int location, T value);
       void Uniform{1,2,3,4}{i64,ui64}vNV(int location, T value);

     (insert after fourth paragraph, p. 100) The Uniform*i64{v}NV and
     Uniform*ui64{v}NV commands will load <count> sets of one to four 64-bit
     signed or unsigned integer values into a uniform location defined as a
     64-bit signed or unsigned integer scalar or vector types.


     (modify "Uniform Buffer Object Storage", p. 102, adding two bullets after
      the last "Members of type", and modifying the subsequent bullet)

      * Members of type int8_t, int16_t, and int64_t are extracted from a
        buffer object by reading a single byte, short, or int64-typed value at
        the specified offset.

      * Members of type uint8_t, uint16_t, and uint64_t are extracted from a
        buffer object by reading a single ubyte, ushort, or uint64-typed value
        at the specified offset.

      * Members of type float16_t are extracted from a buffer object by reading
        a single half-typed value at the specified offset.

      * Vectors with N elements with basic data types of bool, int, uint,
        float, double, int8_t, int16_t, int64_t, uint8_t, uint16_t, uint64_t,
        or float16_t are extracted as N values in consecutive memory locations
        beginning at the specified offset, with components stored in order with
        the first (X) component at the lowest offset. The GL data type used for
        component extraction is derived according to the rules for scalar
        members above.


     Modify Section 2.14.6, Varying Variables, p. 106

     (modify third paragraph, p. 107) ... For the purposes of counting input
     and output components consumed by a shader, variables declared as vectors,
     matrices, and arrays will all consume multiple components.  Each component
     of variables declared as 64-bit integer scalars or vectors, will be
     counted as consuming two components.

     (add after the bulleted list, p. 108) For the purposes of counting the
     total number of components to capture, each component of outputs declared
     as 64-bit integer scalars or vectors will be counted as consuming two
     components.


     Modify Section 2.15.1, Geometry Shader Input Primitives, p. 118

     (add new qualifier at the end of the section, p. 120)

     Patches (patches)

     Geometry shaders that operate on patches are valid for the PATCHES
     primitive type.  The number of vertices available to each program
     invocation is equal to the vertex count of the variable-size patch, with
     vertices presented to the geometry shader in the order specified in the
     patch.


     Modify Section 2.15.4, Geometry Shader Execution Environment, p. 121

     (add to the end of "Geometry Shader Inputs", p. 123)

     Geometry shaders also support built-in and user-defined per-primitive
     inputs.  The following built-in inputs, not replicated per-vertex and not
     contained in gl_in[], are supported:

       * The variable gl_PatchVerticesIn is filled with the number of the
         vertices in the input primitive.

       * The variables gl_TessLevelOuter[] and gl_TessLevelInner[] are arrays
         holding outer and inner tessellation levels of an input patch.  If a
         tessellation control shader is active, the tessellation levels will be
         taken from the corresponding outputs of the tessellation control
         shader.  Otherwise, the default levels provided as patch parameters
         are used.  Tessellation level values loaded in these variables will be
         prior to the clamping and rounding operations performed by the
         primitive generator as described in Section 2.X.2 of
         ARB_tessellation_shader.  For triangular tessellation,
         gl_TessLevelOuter[3] and gl_TessLevelInner[1] will be undefined.  For
         isoline tessellation, gl_TessLevelOuter[2], gl_TessLevelOuter[3], and
         both values in gl_TessLevelInner[] are undefined.

     Additionally, a geometry shader with an input primitive type of "patches"
     may declare per-patch input variables using the qualifier "patch in".
     Unlike per-vertex inputs, per-patch inputs do not correspond to any
     specific vertex in the input primitive, and are not indexed by vertex
     number.  Per-patch inputs declared as arrays have multiple values for the
     input patch; similarly declared per-vertex inputs would indicate a single
     value for each vertex in the output patch.  User-defined per-patch input
     variables are filled with corresponding per-patch output values written by
     the tessellation control shader.  If no tessellation control shader is
     active, all such variables are undefined.

     Per-patch input variables and the built-in inputs "gl_PatchVerticesIn",
     "gl_TessLevelOuter[]", and "gl_TessLevelInner[]" are supported only for
     geometry shaders with an input primitive type of "patches".  A program
     will fail to link if any such variable is used in a geometry shader with a
     input primitive type other than "patches".


     Modify Section 2.19, Transform Feedback, p. 130

     (add to Table 2.14, p. 131)

       Transform Feedback
       primitiveMode               allowed render primitive modes
       ----------------------      ---------------------------------
       PATCHES                     PATCHES


     (modify first paragraph, p. 131) ... <primitiveMode> is one of TRIANGLES,
     LINES, POINTS, or PATCHES and specifies the type of primitives that will
     be recorded into the buffer objects bound for transform feedback (see
     below). ...

     (modify last paragraph, p. 131 and first paragraph, p. 132, adding patch
     support, and dealing with capture of 8- and 16-bit components)

     When an individual point, line, triangle, or patch primitive reaches the
     transform feedback stage ...  When capturing line, triangle, and patch
     primitives, all attributes ...  For multi-component varying variables or
     varying array elements, the individual components are written in order.
     For variables with 8- or 16-bit fixed- or floating-point components,
     individual components will be converted to and stored as equivalent values
     of type "int", "uint", or "float".  The value for any attribute specified
     ...

     (modify next-to-last paragraph, p. 132) ... is not incremented.  If
     transform feedback receives a primitive that fits in the remaining space
     after such an overflow occurs, that primitive may or may not be recorded.
     Primitives that fail to fit in the remaining space are never recorded.


 Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
 (Rasterization)

     None.

 Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
 (Per-Fragment Operations and the Frame Buffer)

     None.

 Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
 (Special Functions)

     None.

 Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
 (State and State Requests)

     Modify Section 6.1.15, Shader and Program Queries, p. 332

     (add to the first list of commands, p. 337)

       void GetUniformi64vNV(uint program, int location, int64EXT *params);
       void GetUniformui64vNV(uint program, int location, uint64EXT *params);


 Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
 Specification (Invariance)

     None.

 Additions to the AGL/GLX/WGL Specifications

     None.

 Modifications to The OpenGL Shading Language Specification, Version 1.50
 (Revision 09)

     Including the following line in a shader can be used to control the
     language features described in this extension:

       #extension GL_NV_gpu_shader5 : <behavior>

     where <behavior> is as specified in section 3.3.

     New preprocessor #defines are added to the OpenGL Shading Language:

       #define GL_NV_gpu_shader5         1

     If the features of this extension are enabled by an #extension directive,
     shading language features documented in the ARB_gpu_shader5 extension will
     also be provided.


     Modify Section 3.6, Keywords, p. 15

     (add the following to the list of reserved keywords)

     int8_t              i8vec2          i8vec3          i8vec4
     int16_t             i16vec2         i16vec3         i16vec4
     int32_t             i32vec2         i32vec3         i32vec4
     int64_t             i64vec2         i64vec3         i64vec4
     uint8_t             u8vec2          u8vec3          u8vec4
     uint16_t            u16vec2         u16vec3         u16vec4
     uint32_t            u32vec2         u32vec3         u32vec4
     uint64_t            u64vec2         u64vec3         u64vec4
     float16_t           f16vec2         f16vec3         f16vec4
     float32_t           f32vec2         f32vec3         f32vec4
     float64_t           f64vec2         f64vec3         f64vec4

     (note:  the "float64_t" and "f64vec*" types are available if and only if
     ARB_gpu_shader_fp64 is also supported)


     Modify Section 4.1, Basic Types, p. 18

     (add to the basic "Transparent Types" table, p. 18)

       Types       Meaning
       --------    ----------------------------------------------------------
       int8_t      an 8-bit signed integer
       i8vec2      a two-component signed integer vector (8-bit components)
       i8vec3      a three-component signed integer vector (8-bit components)
       i8vec4      a four-component signed integer vector (8-bit components)

       int16_t     a 16-bit signed integer
       i16vec2     a two-component signed integer vector (16-bit components)
       i16vec3     a three-component signed integer vector (16-bit components)
       i16vec4     a four-component signed integer vector (16-bit components)

       int32_t     a 32-bit signed integer
       i32vec2     a two-component signed integer vector (32-bit components)
       i32vec3     a three-component signed integer vector (32-bit components)
       i32vec4     a four-component signed integer vector (32-bit components)

       int64_t     a 64-bit signed integer
       i64vec2     a two-component signed integer vector (64-bit components)
       i64vec3     a three-component signed integer vector (64-bit components)
       i64vec4     a four-component signed integer vector (64-bit components)

       uint8_t     a 8-bit unsigned integer
       u8vec2      a two-component unsigned integer vector (8-bit components)
       u8vec3      a three-component unsigned integer vector (8-bit components)
       u8vec4      a four-component unsigned integer vector (8-bit components)

       uint16_t    a 16-bit unsigned integer
       u16vec2     a two-component unsigned integer vector (16-bit components)
       u16vec3     a three-component unsigned integer vector (16-bit components)
       u16vec4     a four-component unsigned integer vector (16-bit components)

       uint32_t    a 32-bit unsigned integer
       u32vec2     a two-component unsigned integer vector (32-bit components)
       u32vec3     a three-component unsigned integer vector (32-bit components)
       u32vec4     a four-component unsigned integer vector (32-bit components)

       uint64_t    a 64-bit unsigned integer
       u64vec2     a two-component unsigned integer vector (64-bit components)
       u64vec3     a three-component unsigned integer vector (64-bit components)
       u64vec4     a four-component unsigned integer vector (64-bit components)

       float16_t   a single 16-bit floating-point value
       f16vec2     a two-component floating-point vector (16-bit components)
       f16vec3     a three-component floating-point vector (16-bit components)
       f16vec4     a four-component floating-point vector (16-bit components)

       float32_t   a single 32-bit floating-point value
       f32vec2     a two-component floating-point vector (32-bit components)
       f32vec3     a three-component floating-point vector (32-bit components)
       f32vec4     a four-component floating-point vector (32-bit components)

       float64_t   a single 64-bit floating-point value
       f64vec2     a two-component floating-point vector (64-bit components)
       f64vec3     a three-component floating-point vector (64-bit components)
       f64vec4     a four-component floating-point vector (64-bit components)


     Modify Section 4.1.3, Integers, p. 20

     (add after the first paragraph of the section, p. 20)

     Variables with the types "int8_t", "int16_t", and "int64_t" represent
     signed integer values with exactly 8, 16, or 64 bits, respectively.
     Variables with the type "uint8_t", "uint16_t", and "uint64_t" represent
     unsigned integer values with exactly 8, 16, or 64 bits, respectively.
     Variables with the type "int32_t" and "uint32_t" represent signed and
     unsigned integer values with 32 bits, and are equivalent to "int" and
     "uint" types, respectively.


     (modify the grammar, p. 21, adding "L" and "UL" suffixes)

       integer-suffix:  one of

         u U l L ul UL

     (modify next-to-last paragraph, p. 21) ... When the suffix "u" or "U" is
     present, the literal has type <uint>.  When the suffix "l" or "L" is
     present, the literal has type <int64_t>.  When the suffix "ul" or "UL" is
     present, the literal has type <uint64_t>.  Otherwise, the type is
     <int>. ...


     Modify Section 4.1.4, Floats, p. 22

     (insert after second paragraph, p. 22)

     Variables of type "float16_t" represent floating-point using exactly 16
     bits and are stored using the 16-bit floating-point representation
     described in the OpenGL Specification.  Variables of type "float32_t"
     and "float64_t" represent floating-point with 32 or 64 bits, and are
     equivalent to "float" and "double" types, respectively.


     Modify Section 4.1.7, Samplers, p. 23

     (modify 1st paragraph of the section, deleting the restriction requiring
     constant indexing of sampler arrays) ... Samplers may aggregated into
     arrays within a shader (using square brackets [ ]) and can be indexed with
     general integer expressions.  The results of accessing a sampler array
     with an out-of-bounds index are undefined. ...

     (remove the additional restriction added by ARB_gpu_shader5 making a
     similar edit requiring uniform indexing across shader invocations for
     defined results.  NV_gpu_shader5 has no such limitation.)


     Modify Section 4.1.10, Implicit Conversions, p. 27

     (modify table of implicit conversions)

                                 Can be implicitly
         Type of expression        converted to
         --------------------    -----------------------------------------
         int                     uint, int64_t, uint64_t, float, double(*)
         ivec2                   uvec2, i64vec2, u64vec2, vec2, dvec2(*)
         ivec3                   uvec3, i64vec3, u64vec3, vec3, dvec3(*)
         ivec4                   uvec4, i64vec4, u64vec4, vec4, dvec4(*)

         int8_t   int16_t        int, int64_t, uint, uint64_t, float, double(*)
         i8vec2   i16vec2        ivec2, i64vec2, uvec2, u64vec2, vec2, dvec2(*)
         i8vec3   i16vec3        ivec3, i64vec3, uvec3, u64vec3, vec3, dvec3(*)
         i8vec4   i16vec4        ivec4, i64vec4, uvec4, u64vec4, vec4, dvec4(*)

         int64_t                 uint64_t, double(*)
         i64vec2                 u64vec2, dvec2(*)
         i64vec3                 u64vec3, dvec3(*)
         i64vec4                 u64vec4, dvec4(*)

         uint                    uint64_t, float, double(*)
         uvec2                   u64vec2, vec2, dvec2(*)
         uvec3                   u64vec3, vec3, dvec3(*)
         uvec4                   u64vec4, vec4, dvec4(*)

         uint8_t  uint16_t       uint, uint64_t, float, double(*)
         u8vec2   u16vec2        uvec2, u64vec2, vec2, dvec2(*)
         u8vec3   i16vec3        uvec3, u64vec3, vec3, dvec3(*)
         u8vec4   i16vec4        uvec4, u64vec4, vec4, dvec4(*)

         uint64_t                double(*)
         u64vec2                 dvec2(*)
         u64vec3                 dvec3(*)
         u64vec4                 dvec4(*)

         float                   double(*)
         vec2                    dvec2(*)
         vec3                    dvec3(*)
         vec4                    dvec4(*)

         float16_t               float, double(*)
         f16vec2                 vec2, dvec2(*)
         f16vec3                 vec3, dvec3(*)
         f16vec4                 vec4, dvec4(*)

         (*) if ARB_gpu_shader_fp64 is supported

     (Note:  Expressions of type "int32_t", "uint32_t", "float32_t", and
     "float64_t" are treated as identical to those of type "int", "uint",
     "float", and "double", respectively.  Implicit conversions to and from
     these explicitly-sized types are allowed whenever conversions involving
     the equivalent base type are allowed.)


     (modify second paragraph of the section) No implicit conversions are
     provided to convert from unsigned to signed integer types, from
     floating-point to integer types, from higher-precision to lower-precision
     types, from 8-bit to 16-bit types, or between matrix types.  There are no
     implicit array or structure conversions.

     (add before the final paragraph of the section, p. 27)

     (insert before the final paragraph of the section) When performing
     implicit conversion for binary operators, there may be multiple data types
     to which the two operands can be converted.  For example, when adding an
     int8_t value to a uint16_t value, both values can be implicitly converted
     to uint, uint64_t, float, and double.  In such cases, a floating-point
     type is chosen if either operand has a floating-point type.  Otherwise, an
     unsigned integer type is chosen if either operand has an unsigned integer
     type.  Otherwise, a signed integer type is chosen.  If operands can be
     converted to both 32- and 64-bit versions of the chosen base data type,
     the 32-bit version is used.


     Modify Section 4.3.4, Inputs, p. 31

     (modify third paragraph of section, p. 31, allowing explicitly-sized
     types) ... Vertex shader inputs variables can only be signed and unsigned
     integers, floats, doubles, explicitly-sized integers and floating-point
     values, vectors of any of these types, and matrices.  ...

     (modify edits done in ARB_tessellation_shader adding support for "patch
     in", allowing for geometry shaders as well) Additionally, tessellation
     evaluation and geometry shaders support per-patch input variables declared
     with the "patch in" qualifier.  Per-patch input ...


     (modify third paragraph, p. 32) ... Fragment inputs can only be signed and
     unsigned integers, floats, doubles, explicitly-sized integers and
     floating-point values, vectors of any of these types, matrices, or arrays
     or structures of these.  Fragment inputs declared as signed or unsigned
     integers, doubles, 64-bit floating-point values, including vectors,
     matrices, or arrays derived from those types, must be qualified as "flat".


     Modify Section 4.3.6, Outputs, p. 33

     (modify third paragraph of the section, p. 33) ... They can only be signed
     and unsigned integers, floats, doubles, explicitly-sized integers and
     floating-point values, vectors of any of these types, matrices, or arrays
     or structures of these.

     (modify last paragraph, p. 33) ...  Fragment outputs can only be signed
     and unsigned integers, floats, explicitly-sized integers and
     floating-point values with 32 or fewer bits, vectors of any of these
     types, or arrays of these.  Doubles, 64-bit integers or floating-point
     values, vectors or arrays of those types, matrices, and structures cannot
     be output. ...


     Modify Section 4.3.8.1, Input Layout Qualifiers, p. 37

     (add to the list of qualifiers for geometry shaders, p. 37)

       layout-qualifier-id:
         ...
         triangles_adjacency
         patches

     (modify the "size of input arrays" table, p. 38)

         Layout          Size of Input Arrays
       ------------      --------------------
         patches         gl_MaxPatchVertices

     (add paragraph below that table, p. 38)

     When using the input primitive type "patches", the geometry shader is used
     to process a set of patches with vertex counts that may vary from patch to
     patch.  For the purposes of input array sizing, patches are treated as
     having a vertex count fixed at the implementation-dependent maximum patch
     size, gl_MaxPatchVertices.  If a shader reads an input corresponding to a
     vertex not found in the patch being processed, the values read are
     undefined.


     Modify Section 5.4.1, Conversion and Scalar Constructors, p. 49

     (add after first list of constructor examples)

     Similar constructors are provided to convert to and from explicitly-sized
     scalar data types, as well:

       float(uint8_t)      // converts an 8-bit uint value to a float
       int64_t(double)     // converts a double value to a 64-bit int
       float64_t(int16_t)  // converts a 16-bit int value to a 64-bit float
       uint16_t(bool)      // converts a Boolean value to a 16-bit uint

     (replace final two paragraphs, p. 49, and the first paragraph, p. 50,
     using more general language)

     When constructors are used to convert any floating-point type to any
     integer type, the fractional part of the floating-point value is dropped.
     It is undefined to convert a negative floating point value to an unsigned
     integer type.

     When a constructor is used to convert any integer or floating-point type
     to bool, 0 and 0.0 are converted to false, and non-zero values are
     converted to true.  When a constructor is used to convert a bool to any
     integer or floating-point type, false is converted to 0 or 0.0, and true
     is converted to 1 or 1.0.

     Constructors converting between signed and unsigned integers with the same
     bit count always preserve the bit pattern of the input.  This will change
     the value of the argument if its most significant bit is set, converting a
     negative signed integer to a large unsigned integer, or vice versa.


     Modify Section 5.9, Expressions, p. 57

     (modify bulleted list as follows, adding support for expressions with
     64-bit integer types)

     Expressions in the shading language are built from the following:

     * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector
       types, and all matrix types.

     ...

     * The arithmetic binary operators add (+), subtract (-), multiply (*), and
       divide (/) operate on 32-bit integer, 64-bit integer, and floating-point
       scalars, vectors, and matrices.  If the fundamental types of the
       operands do not match, the conversions from Section 4.1.10 "Implicit
       Conversions" are applied to produce matching types.  ...

     * The operator modulus (%) operate on 32- and 64-bit integer scalars or
       vectors. If the fundamental types of the operands do not match, the
       conversions from Section 4.1.10 "Implicit Conversions" are applied to
       produce matching types.  ...

     * The arithmetic unary operators negate (-), post- and pre-increment and
       decrement (-- and ++) operate on 32-bit integer, 64-bit integer, and
       floating-point values (including vectors and matrices). ...

     * The relational operators greater than (>), less than (<), and less than
       or equal (<=) operate only on scalar 32-bit integer, 64-bit integer, and
       floating-point expressions.  The result is scalar Boolean.  The
       fundamental type of the two operands must match, either as specified, or
       after one of the implicit type conversions specified in Section 4.1.10.
       ...

     * The equality operators equal (==), and not equal (!=) operate only on
       scalar 32-bit integer, 64-bit integer, and floating-point expressions.
       The result is scalar Boolean.  The fundamental type of the two operands
       must match, either as specified, or after one of the implicit type
       conversions specified in Section 4.1.10.  ...


     Modify Section 6.1, Function Definitions, p. 63

     (ARB_gpu_shader5 adds a set of rules for defining whether implicit
     conversions for one matching function definition are better or worse than
     those for another.  These comparisons are done argument by argument.
     Extend the edits made by ARB_gpu_shader5 to add several new rules for
     comparing implicit conversions for a single argument, corresponding to the
     new data types introduced by this extension.)

      To determine whether the conversion for a single argument in one match is
      better than that for another match, the following rules are applied, in
      order:

        1.  An exact match is better than a match involving any implicit
            conversion.

        2.  A match involving a conversion from a signed integer, unsigned
            integer, or floating-point type to a similar type having a larger
            number of bits is better a match not involving another conversion.
            The set of conversions qualifying under this rule are:

             source types                destination types
             -----------------           -----------------
             int8_t, int16_t             int, int64_t
             int                         int64_t
             uint8_t, uint16_t           uint, uint64_t
             uint                        uint64_t
             float16_t                   float
             float                       double

        3.  A match involving one conversion in rule 2 is better than a match
            involving another conversion in rule 2 if:

             (a) both conversions start with the same type and the first
                 conversion is to a type with a smaller number of bits (e.g.,
                 converting from int16_t to int is preferred to converting
                 int16_t to int64_t), or

             (b) both conversions end with the same type and the first
                 conversion is from a type with a larger number of bits (e.g.,
                 converting an "out" parameter from int16_t to int is preferred
                 to convering from int8_t to int).

        4. A match involving an implicit conversion from any integer type to
           float is better than a match involving an implicit conversion from
           any integer type to double.


     Modify Section 7.1, Vertex and Geometry Shader Special Variables, p. 69

     (NOTE:  These edits are written against the re-organized section in the
     ARB_tessellation_shader specification.)

     (add to the list of built-ins inputs for geometry shaders) In the geometry
     language, built-in input and output variables are intrinsically declared
     as:

       in int gl_PatchVerticesIn;
       patch in float gl_TessLevelOuter[4];
       patch in float gl_TessLevelInner[2];

     ...

     The input variable gl_PatchVerticesIn behaves as in the identically-named
     tessellation control and evaluation shader inputs.

     The input variables gl_TessLevelOuter[] and gl_TessLevelInner[] behave as
     in the identically-named tessellation evaluation shader inputs.


     Modify Chapter 8, Built-in Functions, p. 81

     (add to description of generic types, last paragraph of p. 69) ...  Where
     the input arguments (and corresponding output) can be int64_t, i64vec2,
     i64vec3, or i64vec4, <genI64Type> is used as the argument.  Where the
     input arguments (and corresponding output) can be uint64_t, u64vec2,
     u64vec3, or u64vec4, <genU64Type> is used as the argument.


     Modify Section 8.3, Common Functions, p. 84

     (add support for 64-bit integer packing and unpacking functions)

     Syntax:

       int64_t  packInt2x32(ivec2 v);
       uint64_t packUint2x32(uvec2 v);

       ivec2  unpackInt2x32(int64_t v);
       uvec2  unpackUint2x32(uint64_t v);

     The functions packInt2x32() and packUint2x32() return a signed or unsigned
     64-bit integer obtained by packing the components of a two-component
     signed or unsigned integer vector, respectively.  The first vector
     component specifies the 32 least significant bits; the second component
     specifies the 32 most significant bits.

     The functions unpackInt2x32() and unpackUint2x32() return a signed or
     unsigned integer vector built from a 64-bit signed or unsigned integer
     scalar, respectively.  The first component of the vector contains the 32
     least significant bits of the input; the second component consists the 32
     most significant bits.


     (add support for 16-bit floating-point packing and unpacking functions)

     Syntax:

       uint      packFloat2x16(f16vec2 v);
       f16vec2   unpackFloat2x16(uint v);

     The function packFloat2x16() returns an unsigned integer obtained by
     interpreting the components of a two-component 16-bit floating-point
     vector as integers according to OpenGL Specification, and then packing the
     two 16-bit integers into a 32-bit unsigned integer.  The first vector
     component specifies the 16 least significant bits of the result; the
     second component specifies the 16 most significant bits.

     The function unpackFloat2x16() returns a two-component vector with 16-bit
     floating-point components obtained by unpacking a 32-bit unsigned integer
     into a pair of 16-bit values, and interpreting those values as 16-bit
     floating-point numbers according to the OpenGL Specification.  The first
     component of the vector is obtained from the 16 least significant bits of
     the input; the second component is obtained from the 16 most significant
     bits.


     (add functions to get/set the bit encoding for floating-point values)

     64-bit floating-point data types in the OpenGL shading language are
     specified to be encoded according to the IEEE specification for
     double-precision floating-point values.  The functions below allow shaders
     to convert double-precision floating-point values to and from 64-bit
     signed or unsigned integers representing their encoding.

     To obtain signed or unsigned integer values holding the encoding of a
     floating-point value, use:

       genI64Type doubleBitsToInt64(genDType value);
       genU64Type doubleBitsToUint64(genDType value);

     Conversions are done on a component-by-component basis.

     To obtain a floating-point value corresponding to a signed or unsigned
     integer encoding, use:

       genDType int64BitsToDouble(genI64Type value);
       genDType uint64BitsToDouble(genU64Type value);


     (add functions to evaluate predicates over groups of threads)

     Syntax:

       bool anyThreadNV(bool value);
       bool allThreadsNV(bool value);
       bool allThreadsEqualNV(bool value);

     Implementations of the OpenGL Shading Language may, but are not required,
     to run multiple shader threads for a single stage as a SIMD thread group,
     where individual execution threads are assigned to thread groups in an
     undefined, implementation-dependent order.  Algorithms may benefit from
     being able to evaluate a composite of boolean values over all active
     threads in the thread group.

     The function anyThreadNV() returns true if and only if <value> is true for
     at least one active thread in the group.  The function allThreadsNV()
     returns true if and only if <value> is true for all active threads in the
     group.  The function allThreadsEqualNV() returns true if <value> is the
     same for all active threads in the group; the result of
     allThreadsEqualNV() will be true if and only if anyThreadNV() and
     allThreadsNV() would return the same value.

     Since these functions depends on the values of <value> in an undefined
     group of threads, the value returned by these functions is largely
     undefined.  However, anyThreadNV() is guaranteed to return true if <value>
     is true, and allThreadsNV() is guaranteed to return false if <value> is
     false.

     Since implementations are generally not required to combine threads into
     groups, simply returning <value> for anyThreadNV() and allThreadsNV() and
     returning true for allThreadsEqualNV() is a legal implementation of these
     functions.


     Modify Section 8.6, Vector Relational Functions, p. 90

     (modify the first paragraph, p. 90, adding support for relational
     functions operating on explicitly-sized types)

     Relational and equality operators (<, <=, >, >=, ==, !=) are defined (or
     reserved) to operate on scalars and produce scalar Boolean results.  For
     vector results, use the following built-in functions.  In the definitions
     below, the following terms are used as placeholders for all vector types
     for a given fundamental data type:

         placeholder     fundamental types
         -----------     ------------------------------------------------
         bvec            bvec2, bvec3, bvec4

         ivec            ivec2, ivec3, ivec4, i8vec2, i8vec3, i8vec4,
                         i16vec2, i16vec3, i16vec4, i64vec2, i64vec3, i64vec4

         uvec            uvec2, uvec3, uvec4, u8vec2, u8vec3, u8vec4,
                         u16vec2, u16vec3, u16vec4, u64vec2, u64vec3, u64vec4

         vec             vec2, vec3, vec4, dvec2(*), dvec3(*), dvec4(*),
                         f16vec2, f16vec3, f16vec4

         (*) only if ARB_gpu_shader_fp64 is supported

     In all cases, the sizes of the input and return vectors for any
     particular call must match.


     Modify Section 8.7, Texture Lookup Functions, p. 91

     (modify text for textureOffset() functions, p. 94, allowing non-constant
     offsets)

     Do a texture lookup as in texture but with offset added to the (u,v,w)
     texel coordinates before looking up each texel.  The value <offset> need
     not be constant; however, a limited range of offset values are supported.
     If any component of <offset> is less than MIN_PROGRAM_TEXEL_OFFSET_EXT or
     greater than MAX_PROGRAM_TEXEL_OFFSET_EXT, the offset applied to the
     texture coordinates is undefined.  Note that offset does not apply to the
     layer coordinate for texture arrays. This is explained in detail in
     section 3.9.9 of the OpenGL Specification (Version 3.2, Compatibility
     Profile), where offset is (delta_u, delta_v, delta_w).  Note that texel
     offsets are also not supported for cube maps.

     (Note:  This lifting of the constant offset restriction also applies to
     texelFetchOffset, p. 95, textureProjOffset, p. 95, textureLodOffset,
     p. 96, textureProjLodOffset, p. 96.)


     (modify the description of the textureGradOffset() functions, p. 97,
     preserving the restriction on constant offsets)

     Do a texture lookup with both explicit gradient and offset, as described
     in textureGrad and textureOffset.  For these functions, the offset value
     must be a constant expression.  A limited range of offset values are
     supported; the minimum and maximum offset values are
     implementation-dependent and given by MIN_PROGRAM_TEXEL_OFFSET and
     MAX_PROGRAM_TEXEL_OFFSET, respectively.


     (modify the description of the textureProjGradOffset() functions,
     p. 98, preserving the restriction on constant offsets)

     Do a texture lookup projectively and with explicit gradient as described
     in textureProjGrad, as well as with offset, as described in textureOffset.
     For these functions, the offset value must be a constant expression.  A
     limited range of offset values are supported; the minimum and maximum
     offset values are implementation-dependent and given by
     MIN_PROGRAM_TEXEL_OFFSET and MAX_PROGRAM_TEXEL_OFFSET, respectively.

     (modify the description of the textureGatherOffsets() functions,
      added in ARB_gpu_shader5, to remove the restriction on constant offsets)

     The textureGatherOffsets() functions operate identically ...
     selecting the texel T_i0_j0 of that footprint.  The specified values in
     <offsets> need not be constant.  A limited range of ...

     Modify Section 9, Shading Language Grammar, p. 92

     !!! TBD !!!


 GLX Protocol

     TBD

 Interactions with OpenGL ES 3.1

     If implemented in OpenGL ES, NV_gpu_shader5 acts as a superset
     of functionality provided by OES_gpu_shader5.

     A shader that enables this extension
     via an #extension directive also implicitly enables the common
     capabilities provided by OES_gpu_shader5.

     Replace references to ARB_gpu_shader5 with OES_gpu_shader5 and
     EXT_shader_implicit_conversions (as appropriate).
     Replace references to ARB_geometry_shader with OES/EXT_geometry_shader.
     Replace references to ARB_tessellation_shader with OES/EXT_tessellation_shader.

     Replace references to int64EXT and uint64EXT with int64 and uint64,
     respectively.

     The specification should be edited as follows to include new
     ProgramUniform* functions.

     (modify the ProgramUniform* language)

     The following commands:

         ....
         void ProgramUniform{1,2,3,4}{i64,ui64}NV
             (uint program int location, T value);
         void ProgramUniform{1,2,3,4}{i64,ui64}vNV
             (uint program, int location, const T *value);

     operate identically to the corresponding command where "Program" is
     deleted from the name (and extension suffixes are dropped or updated
     appropriately) except, rather than updating the currently active program
     object, these "Program" commands update the program object named by the
     <program> parameter.  ...

     Changes to Section 2.6.1 "Begin and End" don't apply.

     Disregard introduction of 64bit -integer or -floating point vertex
     attribute types.

 Interactions with OpenGL ES Shading Language 3.10, revision 3

     If implemented in GLSL ES, NV_gpu_shader5 acts as a superset
     of functionality provided by OES_gpu_shader5 and
     EXT_shader_implicit_conversions.

     A shader that enables this extension via an #extension directive
     also implicitly enables the common capabilities provided by
     OES_gpu_shader5 and EXT_shader_implicit_conversions.

     Replace references to ARB_tessellation_shader with OES/EXT_tessellation_shader.

     Implicit conversion between GLSL ES types are introduced by
     EXT_shader_implicit_conversions instead of ARB_gpu_shader5.

     Disregard the notion of 'double' types as vertex shader inputs.

     Section 4.1.7.2 "Images"
         Remove the third sentence restricts
         access to arrays of images to constant integral expression.

         This essentially leaves it to the 'dynamically uniform integral
         expressions' default as OES_gpu_shader5 introduced.

     Modify Section 4.3.9 "Interface Blocks", as modified OES_gpu_shader5

         NV_gpu_shader5 also lifts OES_gpu_shader5 restrictions with
         regard to indexing into arrays of uniforms blocks and shader
         storage blocks.

         Change sentence
         "All indices used to index a shader storage block array must be
          constant integral expressions. A uniform block array can only
          be indexed with a dynamically uniform integral expression,
          otherwise results are undefined." into

         "Arbitrary indices may be used to index a uniform block array;
          integral constant expressions are not required. If the index
          used to access an array of uniform blocks is out-of-bounds,
          the results of the access are undefined."

         Indexing into arrays  of shader storage blocks defaults to
         'dynamically uniform integral expressions'.

     Changes to Section 4.3.9, p.48 "Interface Blocks"

         Replace the sentence
         "All indices used to index a shader storage block array must be
          constant integral expressions. A uniform block array can only
          be indexed with a dynamically uniform integral expression,
          otherwise results are undefined."
         with
         "Arbitrary indices may be used to index a uniform block array;
          integral constant expressions are not required. If the index
          used to access an array of uniform blocks is out-of-bounds, the
          results of the access are undefined."

     4.4.1.1 "Compute Shader Inputs" change

         "layout-qualifier-id:
             local_size_x = integer-constant
             local_size_y = integer-constant
             local_size_z = integer-constant" into

         "layout-qualifier-id:
             local_size_x = integer-constant-expression
             local_size_y = integer-constant-expression
             local_size_z = integer-constant-expression"

     Section 4.4.1.gs "Geometry Shader Inputs" change

         "<layout-qualifier-id>
             ...
             invocations = integer-constant"  into

         "<layout-qualifier-id>
             ...
             invocations = integer-constant-expression"

     Section 4.4.2 "Output Layout Qualifiers" change

         "layout-qualifier-id:
             location = integer-constant" into

         "layout-qualifier-id:
             location = integer-constant-expression"

     Section 4.4.2.ts "Tessellation Control Outputs" change

         "layout-qualifier-id
             vertices = integer-constant"  into

         "layout-qualifier-id:
             vertices = integer-constant-expression"

     Section 4.4.3 "Uniform Variable Layout Qualifiers" change

         "layout-qualifier-id:
             location = integer-constant" into

         "layout-qualifier-id:
             location = integer-constant-expression"

     Section 4.4.4 "Uniform and Shader Storage Block Layout Qualifiers" change

         "layout-qualifier-id:
             ...
             binding = integer-constant" into

         "layout-qualifier-id:
             ...
             binding = integer-constant-expression"

     Section 4.4.5 "Opaque Uniform Layout Qualifiers" change

         "layout-qualifier-id:
             binding = integer-constant" into

         "layout-qualifier-id:
             binding = integer-constant-expression"

     Change sentence
         "A link-time error will result if two shaders in a program
          specify different integer-constant bindings for the same
          opaque-uniform name." into

          "A link-time error will result if two shaders in a program
           specify different bindings for the same opaque-uniform
           name."

     Section 4.4.6 "Atomic Counter Layout Qualifiers" change

         "layout-qualifier-id:
             binding = integer-constant
              offset = integer-constant" into

         "layout-qualifier-id:
             binding = integer-constant-expression
              offset = integer-constant-expression"

     Section 4.4.7 "Format Layout Qualifiers" change

         "layout-qualifier-id:
             ...
             binding = integer-constant" into

         "layout-qualifier-id:
             ...
             binding = integer-constant-expression"

     Section 4.7.3 "Precision Qualifiers"

     After "Literal constants do not have precision qualifiers." add
     "Neither do explicitly sized types such as int8_t, uint32_t,
     float16_t etc."

 Dependencies on OES_gpu_shader5

     In addition to allowing arbitrary indexing arrays of samplers, this
     extension also lifts OES_gpu_shader5 restrictions for indexing
     arrays of images and shader storage blocks. Additionally, it allows
     usage of 'integer-constant-expressions' for layout qualifiers that
     formerly took 'integer-constant'.

     In Section 'Overview': change the bullet point

     "* the ability to aggregate samplers into arrays...."

     to

     "* the ability to index into arrays of samplers, uniforms and shader
        storage blocks with arbitrary expressions, and not require that
        non-constant indices be uniform across all shader invocations."

     "* the ability to index into arrays of images using dynamically
        uniform integers."

     "* the ability to use 'integer-constant-expressions' in place of
        'integer-constant' for layout qualifiers."

 Dependencies on OES/EXT_tessellation_shader and OpenGL ES 3.2

     If implemented in OpenGL ES 3.1 or earlier and
     OES/EXT_tessellation_shader is not supported, language introduced by
     this extension describing processing patches in geometry shaders,
     transform feedback, and rasterization should be removed.

     If implemented in OpenGL ES 3.2 or implemented in
     OpenGL ES 3.1 and OES/EXT_tessellation_shader is supported:

     It is legal to send patches past the tessellation stage -- the
     following language from OES/EXT_tessellation_shader is removed:

       Patch primitives are not supported by pipeline stages below the
       tessellation evaluation shader.

     It is legal to use a tessellation control shader without a tessellation
     evaluation shader.

     Remove from the bullet list describing reasons for link failure below the
     LinkProgram command on p. 70 (as modified by OES/EXT_tessellation_shader):

       * the program is not separable and contains no object to form a
       tessellation evaluation shader; or

     Modify section 11.1.2.1, "Output Variables" on p. 262 (as modified
     by the OES/EXT_geometry_shader extension):

     Into the paragraph starting with
      "Each program object can specify a set of output variables from one
       shader to be recorded in transform feedback mode..."

     Insert after the tesselation evaluation shader bullet point:
       * tesselation control shader


     Modify section 11.1.3.11, "Validation" to replace the bullet point
     starting with "One but not both of the tessellation..." on p. 271

       * the tessellation evaluation but not tessellation control stage
         has an active program with corresponding executable shader.


     Modify section 11.1ts, "Tessellation"

     Replace
       "Tessellation is considered active if and only if the active
       program object or program pipeline object includes both a
       tessellation control shader and a tessellation evaluation shader."
     with
       "Tessellation is considered active if and only if the active
       program object or program pipeline object includes a tessellation
       control shader."

     Replace
       "An INVALID_OPERATION error is generated by any command that
       transfers vertices to the GL if the current program state has one
       but not both of a tessellation control shader and tessellation
       evaluation shader."
     with
       "An INVALID_OPERATION error is generated by any command that
       transfers vertices to the GL if the current program state has a
       tessellation evaluation shader but not a tessellation control
       shader."

     Modify section 12.1.2 "Transform Feedback Primitive Capture"

     Replace the second paragraph of the section on p. 274 (as modified
     by OES/EXT_tessellation_shader):

     The data captured in transform feedback mode depends on the active
     programs on each of the shader stages. If a program is active for the
     geometry shader stage, transform feedback captures the vertices of each
     primitive emitted by the geometry shader. Otherwise, if a program is
     active for the tessellation evaluation shader stage, transform feedback
     captures each primitive produced by the tessellation primitive generator,
     whose vertices are processed by the tessellation evaluation shader.
     Otherwise, if a program is active for the tessellation control shader stage,
     transform feedback captures each output patch of that stage.
     Otherwise, transform feedback captures each primitive processed by the
     vertex shader.

     Modify the second paragraph following ResumeTransformFeedback on p. 277
     (as modified by OES/EXT_tessellation_shader):

     When transform feedback is active and not paused ... If a tessellation
     or geometry shader is active, the type of primitive emitted
     by that shader is used instead of the <mode> parameter passed to drawing
     commands for the purposes of this error check. If tessellation
     and geometry shaders are both active, the output primitive
     type of the geometry shader will be used for the purposes of this error.
     Any primitive type may be used while transform feedback is paused.


     Modify section 13.3, "Points"

     After
       "The point size is determined by the last active stage before the
       rasterizer:"

     Add a new bullet point to the list, between the
     tessellation evaluation shader and the vertex shader:

       * the tessellation control shader, if active and no tessellation
         evaluation shader is active;

 Dependencies on OES/EXT_geometry_shader

     If implemented in GLSL ES and OES/EXT_geometry_shader is not supported,
     disregard all changes to geometry shader related functionality.

 Dependencies on ARB_gpu_shader5

     This extension also incorporates all the changes to the OpenGL Shading
     Language made by ARB_gpu_shader5; enabling this extension by a #extension
     directive in shader code also enables all features of ARB_gpu_shader5 as
     though the shader code has also declared

       #extension GL_ARB_gpu_shader5 : enable

     The converse is not true; implementations supporting both extensions
     should not provide the shading language features in this extension if
     shader code #extension directives enable only ARB_gpu_shader5.

     This specification and ARB_gpu_shader5 both lift the restriction in GLSL
     1.50 requiring that indexing in arrays of samplers must be done with
     constant expressions.  However, ARB_gpu_shader5 specifies that results are
     undefined if the indices would diverge if multiple shader invocations are
     run in lockstep.  This extension does not impose the non-divergent
     indexing requirement.

 Dependencies on ARB_gpu_shader_fp64

     This extension and ARB_gpu_shader_fp64 both provide support for shading
     language variables with 64-bit components.  If both extensions are
     supported, the various edits describing this new support should be
     combined.

     If ARB_gpu_shader_fp64 is not supported, the following edits should be
     removed:

      * language adding the data types "float64_t", "f64vec2", "f64vec3", and
        "f64vec4";

      * language allowing implicit conversions of various types to double,
        dvec2, dvec3, or dvec4; and

      * the built-in functions doubleBitsToInt64(), doubleBitsToUint64(),
        int64BitsToDouble(), and uint64BitsToDouble().

 Dependencies on ARB_tessellation_shader

     If ARB_tessellation_shader is not supported, language introduced by this
     extension describing processing patches in geometry shaders, transform
     feedback, and rasterization should be removed.

     If this extension and ARB_tessellation_shader are supported, it is legal
     to send patches past the tessellation stage -- the following language from
     ARB_tessellation_shader is removed:

       Patch primitives are not supported by pipeline stages below the
       tessellation evaluation shader.  If there is no active program object or
       the active program object does not contain a tessellation evaluation
       shader, the error INVALID_OPERATION is generated by Begin (or vertex
       array commands that implicitly call Begin) if the primitive mode is
       PATCHES.

 Dependencies on NV_shader_buffer_load

     If NV_shader_buffer_load is supported, that specification should be edited
     as follows, to allow pointers to dereference the new data types added by
     this extension.

     Modify "Section 2.20.X, Shader Memory Access" from NV_shader_buffer_load.

     (add rules for loads of variables having the new data types from this
     extension to the list of bullets following "When a shader dereferences a
     pointer variable")

     - Data of type "int8_t," "int16_t", "int32_t", and "int64_t" are read
       from or written to memory as a single 8-, 16-, 32-, or 64-bit signed
       integer value at the specified GPU address.

     - Data of type "uint8_t," "uint16_t", "uint32_t", and "uint64_t" are read
       from or written to memory as a single 8-, 16-, 32-, or 64-bit unsigned
       integer value at the specified GPU address.

     - Data of type "float16_t", "float32_t", and "float64_t" are read from or
       written to memory as a single 16-, 32-, or 64-bit floating-point value
       at the specified GPU address.

 Dependencies on EXT_direct_state_access

     If EXT_direct_state_access is supported, that specification should be
     edited as follows to include new ProgramUniform* functions.

     (modify the ProgramUniform* language)

     The following commands:

         ....
         void ProgramUniform{1,2,3,4}{i64,ui64}NV
             (uint program int location, T value);
         void ProgramUniform{1,2,3,4}{i64,ui64}vNV
             (uint program, int location, const T *value);

     operate identically to the corresponding command where "Program" is
     deleted from the name (and extension suffixes are dropped or updated
     appropriately) except, rather than updating the currently active program
     object, these "Program" commands update the program object named by the
     <program> parameter.  ...

 Dependencies on EXT_vertex_attrib_64bit and NV_vertex_attrib_integer_64bit

     The EXT_vertex_attrib_64bit extension provides the ability to specify
     64-bit floating-point vertex attributes in a GLSL vertex shader and the
     specify the values of these attributes via the OpenGL API.  To
     successfully compile vertex shaders with fp64 input variables, is
     necessary to include

       #extension GL_EXT_vertex_attrib_64bit : enable

     in the shader text.

     However, this extension is considered to enable 64-bit
     floating-point and integer inputs. Provided EXT_vertex_attrib_64bit
     and NV_vertex_attrib_integer_64bit are supported, including the
     following code in a vertex shader

       #extension GL_NV_gpu_shader5 : enable

     will enable 64-bit floating-point or integer input variables whose
     values would be specified using the OpenGL API mechanisms found in
     the EXT_vertex_attrib_64bit and NV_vertex_attrib_integer_64bit
     extensions.


 Errors

     None.

 New State

     None.

 New Implementation Dependent State

     None.

 Issues

     (1) What implicit conversions are supported by this extension on top of
         those provided by related extensions?

       RESOLVED:  ARB_gpu_shader5 and ARB_gpu_shader_fp64 provide new implicit
       conversions from "int" to "uint", and from "int", "uint", and "float" to
       "double".

       This extension provides integer types of multiple sizes and supports
       implicit conversions from small integer types to 32- or 64-bit integer
       types of the same signedness, as well as float and double.  It also
       provides floating-point types of multiple sizes and supports implicit
       conversions from smaller to larger types.  Additionally, it supports
       conversion from 64-bit integer types to double.

     (2) How do these implicit conversions impact binary operators?

       RESOLVED:  For binary operators, we prefer converting to a common type
       that is as close as possible in size and type to the original
       expression.

     (3) How do these implicit conversions impact function overloading rules?

       RESOLVED:  We extend the preference rules in ARB_gpu_shader5 to account
       for the new data types, adding rules to:

         * favor new "promotions" in integer/floating point types (previously,
           the only promotion was float-to-double)

         * for promotions, favor conversion to the type closer in size (e.g.,
           prefer converting from int16_t to int over converting to int64_t)

     (4) What should be done to distinguish between 32- and 64-bit integer
         constants?

       RESOLVED:  We will use "L" and "UL" to identify signed and unsigned
       64-bit integer constants; the use of "L" matches a similar ("long")
       suffix in the C programming language.  C leaves the size of integer
       types implementation-dependent, and many implementations require an "LL"
       suffix to declare 64-bit integer constants.  With our size definitions,
       "L" will be considered sufficient to make an integer constant 64-bit.

     (5) Should provide support for vertex attributes with 64-bit components,
         and if so, how should the support be provided in the OpenGL API?

       RESOLVED:  Yes, this seems like useful functionality, particularly for
       applications wanting to provide double-precision or 64-bit integer data
       to shaders performing computations on such types.  We provide
       VertexAttribL* entry points for 64-bit components in the separate
       EXT_vertex_attrib_64bit and NV_vertex_attrib_64bit extensions, which
       should be supported on all implementations supporting this extension.

     (6) Should we allow vertex attributes with 8- or 16-bit components in the
         shading language, and if so, how does it interact with the OpenGL API?

       RESOLVED:  Yes, but we will use existing APIs to specify such
       attributes, which already typically allow 8- and 16-bit components on
       the API side.  Vertex attribute components (other than 64-bit ones)
       specified by the API will be converted from the type specified in the
       vertex attribute commands to the component type of the attribute.  For
       floating-point values, that may involve 16-to-32 bit conversion or vice
       versa.  For integer types, that may involve dropping all but the least
       significant bits of attribute components.

     (7) Should we support uniforms with double or 64-bit attribute types, and
         if so, how?  Should we support uniforms with <32-bit components, and
         if so, how?

       RESOLVED:  We will support uniforms of all component types, either in a
       buffer object (via OpenGL 3.1 or ARB_uniform_buffer_object) or in
       storage associated with the program.

       When uniforms are stored in buffer object, they are stored using their
       native data types according to the pre-existing packing and layout
       rules.  Those rules were already written to be able to accommodate both
       the larger and smaller new data types.

       Uniforms stored in program objects are loaded with Uniform* APIs.  There
       are no pre-existing uniform APIs accepting doubles or other "long"
       types, so there was no clear need to add an extra "L" to the name to
       distinguish from other APIs like we do with VertexAttribL* APIs.

       Uniforms with 8- and 16- bit components are loaded with the "larger"
       Uniform*{i,ui,f} APIs; it didn't seem worth it to add numerous entry
       points to the APIs to handle all those new types.

     (8) How do the uniform loading commands introduced by this extension
         interact similar commands added by NV_shader_buffer_load?

       RESOLVED:  NV_shader_buffer_load provided the command Uniformui64NV to
       load pointer uniforms with a single 64-bit unsigned integer.  This
       extension provides vectors of 64-bit unsigned integers, so we needed
       Uniform{2,3,4}ui64NV commands.  We chose to provide a Uniform1ui64NV
       command, which will be functionally equivalent to Uniformui64NV.

     (9) How will transform feedback work for capturing variables with double
         or 64-bit components?  Should we support transform feedback on
         variables with components with fewer than 32 bits?

       RESOLVED:  Transform feedback will support variables with any component
       size.  Components with fewer than 32-bits are converted to their
       equivalent 32-bit types.

       For doubles and variables with 64-bit components, each component
       captured will count as 64-bit values and occupy two components for the
       purpose of component counting rules.  This could be a problem for the
       SEPARATE_ATTRIBS mode, since the minimum component limit is four, which
       would not be sufficient to capture a dvec3 or dvec4.  However,
       implementations supporting this extension should also be able to support
       ARB_transform_feedback3, which extends INTERLEAVED_ATTRIBS mode to
       capture vertex attribute values interleaved into multiple buffers.  That
       functionality effectively obsoletes the SEPARATE_ATTRIBS mode, since it
       is a functional superset.

       We considered support for capturing 8- and 16-bit values directly, which
       had a number of problems.  First, full byte addressing might impose both
       alignment issues (e.g., capturing a uint8_t followed by a float might
       misalign the float) and additional hardware implementation burdens.  One
       other option would be to pack multiple values into a 32-bit integer
       (e.g., f16vec2 would be packed with .x in the LSBs and .y in the MSBs).
       This could work, even with word addressing, but would require padding
       for odd sizes (e.g., f16vec2 padded to two words, with the second word
       holding only .z).  It would also have endianness issues; packed values
       would look like arrays of the corresponding smaller type on
       little-endian systems, but not on big-endian ones.

     (10) What precision will be used for computation, storage, and inter-stage
          transfer of 8- and 16-bit component data types?

       RESOLVED:  The components may be considered to occupy a full 32 bits for
       the purposes of input/output component count limits.  8- and 16-bit
       values should, however, be passed at that precision.

     (11) Is the new support for non-constant texel offsets completely
          orthogonal?

       RESOLVED:  No.  Non-constant offsets are not supported for the existing
       functions textureGradOffset() and textureProjGradOffset().

     (12) Should we provide functions like intBitsToFloat() that operate on
          16-bit floating-point values?

       RESOLVED:  Not in this extension.  Such conversions can be performed
       using the following code:

         uint16_t float16BitsToUint16(float16_t v)
         {
           return uint16_t(packFloat2x16(f16vec2(v, 0));
         }

         float16_t uint16BitsToFloat16(uint16_t v)
         {
           return unpackFloat2x16(uint(v)).x;
         }

     (13) Should we provide distinct sized types for 32-bit integers and
          floats, and 64-bit floats?  Should we provide those types as aliases
          for existing unsized types?  Or should we provide no such types at
          all?

       RESOLVED:  We will provide sized versions of these types, which are
       defined as completely equivalent to unsized types according to the
       following table:

         unsized type     sized types
         -------------    ---------------
         int              int32_t
         uint             uint32_t
         float            float32_t
         double           float64_t

       Vector types with sized and unsized components have equivalent
       relationships.

       Note that the nominally "unsized" data types in the GLSL 1.30 spec are
       actually sized.  The specification explicitly defines signed and unsized
       integers (int, uint) to be 32-bit values.  It also defines
       floating-point values to "match the IEEE single precision floating-point
       definition for precision and dynamic range", which are also 32-bit
       values.

       This type equivalence has minor implications on function overloading:

         * You can't declare separate versions of a function with an "int"
           argument in one version and an "int32_t" argument in another.

         * Because there is no implicit conversion between equivalent types, we
           will get an exact match if an argument is declared with one type
           (e.g., "int") in the caller and a textually different but equivalent
           type ("int32_t") in the function.

       Note that the type equivalence also applies to API data type queries.
       For example, the type INT will be returned for a variable declared as
       "int32_t".

     (14) What are functions like anyThreadNV() and allThreadsNV() good for?

       NRESOLVED:  If an implementation performs SIMD thread execution,
       divergent branching may result in reduced performance if the "if" and
       "else" blocks of an "if" statement are executed sequentially.  For
       example, an algorithm may have both a "fast path" that performs a
       computation quickly for a subset of all cases and a "fast path" that
       performs a computation quickly but correctly.  When performing SIMD
       execution, code like the following:

         if (condition) {
           result = do_fast_path(...);
         } else {
           result = do_slow_path(...);
         }

       may end up executing *both* the fast and slow paths for a SIMD thread
       group if <condition> diverges, and may execute more slowly than simply
       executing the slow path unconditionally.  These functions allow code
       like:

         if (allThreadsNV(condition)) {
           result = do_fast_path(...);
         } else {
           result = do_slow_path(...);
         }

       that executes the fast path if and only if it can be used for *all*
       threads in the group.  For thread groups where <condition> diverges,
       this algorithm would unconditionally run the slow path, but would never
       run both in sequence.

       There may be other cases where "voting" across shader invocations may be
       useful.  Note that we provide no control over how shader invocations may
       be packed within a SIMD thread group, unlike various "compute" APIs
       (CUDA, OpenCL).

     (15) Can the 64-bit uniform APIs be used to load values for uniforms of
          type "bool", "bvec2", "bvec3", or "bvec4"?

       RESOLVED:  No.  OpenGL 2.0 and beyond did allow "bool" variable to be
       set with Uniform*i* and Uniform*f APIs, and OpenGL 3.0 extended that
       support to Uniform*ui* for orthogonality.  But it seems pointless to
       extended this capability forward to 64-bit Uniform APIs as well.

     (19) The ARB_tessellation_shader extension adds support for patch
          primitives that might survive to the transform feedback stage.  How
          are such primitives captured?

       RESOLVED:  If patch primitives survive to the transform feedback stage,
       they are recorded on a patch-by-patch basis.  Incomplete patches are not
       recorded.  As with other primitive types, if the transform feedback
       buffers do not contain enough space to capture an entire patch, no
       vertices are recorded.

       Note that the only way to get patch primitives all the way to transform
       feedback is to have tessellation evaluation and geometry shaders
       disabled; the output streams from both of those shader stages are
       collections of points, lines, or triangles.

     (20) Previous transform feedback allowed capturing only fixed-size
          primitives; this extension supports variable-sized patches.  What
          interactions does this functionality have with transform feedback
          buffer overflow?

       RESOLVED:  With fixed-size point, line, or triangle primitives, once any
       primitive fails to be recorded due to insufficient space, all subsequent
       primitives would also fail.  With variable-size patch primitives, the
       transform feedback stage might first receive a large patch that doesn't
       fit, followed by a smaller patch that could squeeze into the remaining
       space.

       To allow for different types of implementation of this extension without
       requiring special-case handling of this corner case, we've chosen to
       leave this behavior undefined -- the smaller patch may or may not be
       recorded.


 Revision History

     Rev.    Date    Author    Changes
     ----  --------  --------  -----------------------------------------
     11    03/07/17  mheyer    Update OpenGL ES interactions to clarify
                               that using a tessellation control shader
                               without a tessellation evaluation shader
                               is legal, and PATCHES can be sent past the
                               tessellation stage.

     10    04/16/16  mheyer    Add OpenGL ES interactions (written before
                               revision 9, but not published)

      9    02/19/16  pbrown    Clarify that non-constant offset vectors are
                               supported in textureGatherOffsets().

      8    09/11/14  pbrown    Fix incorrect implicit conversions, which
                               follow the general pattern of little->big
                               and int->uint->float.  Thanks to Daniel
                               Rakos, author of similar functionality in
                               the AMD_gpu_shader_int64 spec.

      7    11/08/10  pbrown    Fix typos in description of packFloat2x16 and
                               unpackFloat2x16.

      6    03/23/10  pbrown    Update overview, dependencies, remove references
                               to old extension names.  Extend the function
                               overloading prioritization rules from
                               ARB_gpu_shader5 to account for new data types.
                               Major overhaul of the issues section to match
                               the refactoring done to produce ARB specs.

      5    03/08/10  pbrown    Add interaction with EXT_vertex_attrib_64bit and
                               NV_vertex_attrib_integer_64bit; enabling this
                               extension automatically enables 64-bit floating-
                               point and integer vertex inputs.

      4    03/01/10  pbrown    Fix prototype for GetUniformui64vNV.

      3    01/14/10  pbrown    Fix with updated enum assignments.

      2    12/08/09  pbrown    Add explicit component counting rules for
                               64-bit integer attributes similar to those
                               in the ARB_gpu_shader_fp64 spec.

      1              pbrown    Internal revisions.