 Name

     ARB_gpu_shader_fp64

 Name Strings

     GL_ARB_gpu_shader_fp64

 Contact

     Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

 Contributors

     Barthold Lichtenbelt, NVIDIA
     Bill Licea-Kane, AMD
     Bruce Merry, ARM
     Chris Dodd, NVIDIA
     Eric Werness, NVIDIA
     Graham Sellers, AMD
     Greg Roth, NVIDIA
     Jeff Bolz, NVIDIA
     Nick Haemel, AMD
     Pierre Boudier, AMD
     Piers Daniell, NVIDIA

 Notice

     Copyright (c) 2010-2013 The Khronos Group Inc. Copyright terms at
         http://www.khronos.org/registry/speccopyright.html

 Status

     Complete. Approved by the ARB at the 2010/01/22 F2F meeting.
     Approved by the Khronos Board of Promoters on March 10, 2010.

 Version

     Last Modified Date:         August 27, 2012
     NVIDIA Revision:            11

 Number

     ARB Extension #89

 Dependencies

     This extension is written against the OpenGL 3.2 (Compatibility Profile)
     Specification.

     This extension is written against version 1.50 (revision 09) of the OpenGL
     Shading Language Specification.

     OpenGL 3.2 and GLSL 1.50 are required.

     This extension interacts with EXT_direct_state_access.

     This extension interacts with NV_shader_buffer_load.

 Overview

     This extension allows GLSL shaders to use double-precision floating-point
     data types, including vectors and matrices of doubles.  Doubles may be
     used as inputs, outputs, and uniforms.

     The shading language supports various arithmetic and comparison operators
     on double-precision scalar, vector, and matrix types, and provides a set
     of built-in functions including:

       * square roots and inverse square roots;

       * fused floating-point multiply-add operations;

       * splitting a floating-point number into a significand and exponent
         (frexp), or building a floating-point number from a significand and
         exponent (ldexp);

       * absolute value, sign tests, various functions to round to an integer
         value, modulus, minimum, maximum, clamping, blending two values, step
         functions, and testing for infinity and NaN values;

       * packing and unpacking doubles into a pair of 32-bit unsigned integers;

       * matrix component-wise multiplication, and computation of outer
         products, transposes, determinants, and inverses; and

       * vector relational functions.

     Double-precision versions of angle, trigonometry, and exponential
     functions are not supported.

     Implicit conversions are supported from integer and single-precision
     floating-point values to doubles, and this extension uses the relaxed
     function overloading rules specified by the ARB_gpu_shader5 extension to
     resolve ambiguities.

     This extension provides API functions for specifying double-precision
     uniforms in the default uniform block, including functions similar to the
     uniform functions added by EXT_direct_state_access (if supported).

     This extension provides an "LF" suffix for specifying double-precision
     constants.  Floating-point constants without a suffix in GLSL are treated
     as single-precision values for backward compatibility with versions not
     supporting doubles; similar constants are treated as double-precision
     values in the "C" programming language.

     This extension does not support interpolation of double-precision values;
     doubles used as fragment shader inputs must be qualified as "flat".
     Additionally, this extension does not allow vertex attributes with 64-bit
     components.  That support is added separately by EXT_vertex_attrib_64bit.

 IP Status

     No known IP claims.

 New Procedures and Functions

     void Uniform1d(int location, double x);
     void Uniform2d(int location, double x, double y);
     void Uniform3d(int location, double x, double y, double z);
     void Uniform4d(int location, double x, double y, double z, double w);
     void Uniform1dv(int location, sizei count, const double *value);
     void Uniform2dv(int location, sizei count, const double *value);
     void Uniform3dv(int location, sizei count, const double *value);
     void Uniform4dv(int location, sizei count, const double *value);

     void UniformMatrix2dv(int location, sizei count, boolean transpose,
                           const double *value);
     void UniformMatrix3dv(int location, sizei count, boolean transpose,
                           const double *value);
     void UniformMatrix4dv(int location, sizei count, boolean transpose,
                           const double *value);
     void UniformMatrix2x3dv(int location, sizei count, boolean transpose,
                             const double *value);
     void UniformMatrix2x4dv(int location, sizei count, boolean transpose,
                             const double *value);
     void UniformMatrix3x2dv(int location, sizei count, boolean transpose,
                             const double *value);
     void UniformMatrix3x4dv(int location, sizei count, boolean transpose,
                             const double *value);
     void UniformMatrix4x2dv(int location, sizei count, boolean transpose,
                             const double *value);
     void UniformMatrix4x3dv(int location, sizei count, boolean transpose,
                             const double *value);

     void GetUniformdv(uint program, int location, double *params);

     (All of the following ProgramUniform* functions are supported if and only
      if EXT_direct_state_access is supported.)

     void ProgramUniform1dEXT(uint program, int location, double x);
     void ProgramUniform2dEXT(uint program, int location, double x, double y);
     void ProgramUniform3dEXT(uint program, int location, double x, double y,
                              double z);
     void ProgramUniform4dEXT(uint program, int location, double x, double y,
                              double z, double w);
     void ProgramUniform1dvEXT(uint program, int location, sizei count,
                               const double *value);
     void ProgramUniform2dvEXT(uint program, int location, sizei count,
                               const double *value);
     void ProgramUniform3dvEXT(uint program, int location, sizei count,
                               const double *value);
     void ProgramUniform4dvEXT(uint program, int location, sizei count,
                               const double *value);

     void ProgramUniformMatrix2dvEXT(uint program, int location, sizei count,
                                     boolean transpose, const double *value);
     void ProgramUniformMatrix3dvEXT(uint program, int location, sizei count,
                                     boolean transpose, const double *value);
     void ProgramUniformMatrix4dvEXT(uint program, int location, sizei count,
                                     boolean transpose, const double *value);
     void ProgramUniformMatrix2x3dvEXT(uint program, int location, sizei count,
                                       boolean transpose, const double *value);
     void ProgramUniformMatrix2x4dvEXT(uint program, int location, sizei count,
                                       boolean transpose, const double *value);
     void ProgramUniformMatrix3x2dvEXT(uint program, int location, sizei count,
                                       boolean transpose, const double *value);
     void ProgramUniformMatrix3x4dvEXT(uint program, int location, sizei count,
                                       boolean transpose, const double *value);
     void ProgramUniformMatrix4x2dvEXT(uint program, int location, sizei count,
                                       boolean transpose, const double *value);
     void ProgramUniformMatrix4x3dvEXT(uint program, int location, sizei count,
                                       boolean transpose, const double *value);

 New Tokens

     Returned in the <type> parameter of GetActiveUniform and
     GetTransformFeedbackVarying:

         DOUBLE                                          0x140A
         DOUBLE_VEC2                                     0x8FFC
         DOUBLE_VEC3                                     0x8FFD
         DOUBLE_VEC4                                     0x8FFE
         DOUBLE_MAT2                                     0x8F46
         DOUBLE_MAT3                                     0x8F47
         DOUBLE_MAT4                                     0x8F48
         DOUBLE_MAT2x3                                   0x8F49
         DOUBLE_MAT2x4                                   0x8F4A
         DOUBLE_MAT3x2                                   0x8F4B
         DOUBLE_MAT3x4                                   0x8F4C
         DOUBLE_MAT4x2                                   0x8F4D
         DOUBLE_MAT4x3                                   0x8F4E


 Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
 (OpenGL Operation)

     Modify Section 2.14.4, Uniform Variables, p. 89

     (modify third paragraph, p. 90) ... uniform variable storage for a vertex
     shader.  A uniform matrix with single- or double-precision components will
     consume no more than 4 * min(r,c) or 8 * min(r,c) uniform components,
     respectively.  A scalar or vector uniform with double-precision components
     will consume no more than 2<n> components, where <n> is 1 for scalars, and
     the component count for vectors.  A link error is generated ...
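
     The limits above can be sketched in C as follows.  This is an
     illustrative, non-normative sketch; the function names are hypothetical
     and are not part of this extension or of the GL API.

```c
#include <assert.h>

/* Smaller of two matrix dimensions. */
static int min_int(int a, int b) { return a < b ? a : b; }

/* Upper bound on uniform components for a double-precision scalar
 * (n == 1) or vector (n == 2..4): 2 * n. */
int max_double_vector_uniform_components(int n)
{
    assert(n >= 1 && n <= 4);
    return 2 * n;
}

/* Upper bound on uniform components for a matrix with <cols> columns and
 * <rows> rows: 4 * min(r,c) for single precision, 8 * min(r,c) for
 * double precision. */
int max_matrix_uniform_components(int cols, int rows, int is_double)
{
    assert(cols >= 2 && cols <= 4 && rows >= 2 && rows <= 4);
    return (is_double ? 8 : 4) * min_int(rows, cols);
}
```

     For example, a dmat2x4 uniform is bounded by 8 * min(4,2) = 16
     components, and a dvec3 uniform by 6 components.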

     (add to Table 2.13, p. 96)

       Type Name Token           Keyword
       --------------------      ----------------
       DOUBLE                    double
       DOUBLE_VEC2               dvec2
       DOUBLE_VEC3               dvec3
       DOUBLE_VEC4               dvec4
       DOUBLE_MAT2               dmat2
       DOUBLE_MAT3               dmat3
       DOUBLE_MAT4               dmat4
       DOUBLE_MAT2x3             dmat2x3
       DOUBLE_MAT2x4             dmat2x4
       DOUBLE_MAT3x2             dmat3x2
       DOUBLE_MAT3x4             dmat3x4
       DOUBLE_MAT4x2             dmat4x2
       DOUBLE_MAT4x3             dmat4x3

     (modify list of commands at the bottom of p. 99)

       void Uniform{1,2,3,4}d(int location, T value);
       void Uniform{1,2,3,4}dv(int location, sizei count, T value);
       void UniformMatrix{2,3,4}dv
            (int location, sizei count, boolean transpose,
             const double *value);
       void UniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dv
            (int location, sizei count, boolean transpose,
             const double *value);

     (insert after fourth paragraph, p. 100) The Uniform*d{v} commands will
     load <count> sets of one to four double-precision floating-point values
     into a uniform location defined as a double, a double vector, or an array
     of double scalars or vectors.

     (modify fifth paragraph, p. 100) The UniformMatrix{2,3,4}fv and
     UniformMatrix{2,3,4}dv commands will load <count> 2x2, 3x3, or 4x4
     matrices (corresponding to 2, 3, or 4 in the command name) of single- or
     double-precision floating-point values, respectively, into ...

     (replace second bullet on the middle of p. 101, regarding
      INVALID_OPERATION errors in Uniform* commands)

      * if the type of the uniform declared in the shader does not match the
        component type and count indicated in the Uniform* command name (where
        a boolean uniform component type is considered to match any of the
        Uniform*i{v}, Uniform*ui{v}, or Uniform*f{v} commands),

     (modify sixth paragraph, p. 100) The UniformMatrix{2x3,3x2,2x4,
     4x2,3x4,4x3}fv and UniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dv commands will
     load <count> 2x3, 3x2, 2x4, 4x2, 3x4, or 4x3 matrices (corresponding to
     the numbers in the command name) of single- or double-precision
     floating-point values, respectively, into ...
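
     As a non-normative illustration of how the <value> array of a
     UniformMatrix*dv command is laid out, the following C sketch computes
     how many doubles are read and where a given element is found.  It
     relies on the core specification's rule that matrices are supplied in
     column-major order when <transpose> is FALSE and in row-major order
     when it is TRUE.  The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of doubles read from <value> for <count> matrices that each
 * have <cols> columns and <rows> rows. */
size_t matrix_uniform_doubles(size_t count, int cols, int rows)
{
    assert(cols >= 2 && cols <= 4 && rows >= 2 && rows <= 4);
    return count * (size_t)cols * (size_t)rows;
}

/* Index into <value> of element (row, col) of matrix number <m>. */
size_t matrix_element_index(size_t m, int cols, int rows,
                            int row, int col, int transpose)
{
    assert(row >= 0 && row < rows && col >= 0 && col < cols);
    size_t base = m * (size_t)cols * (size_t)rows;
    return base + (transpose ? (size_t)(row * cols + col)    /* row-major */
                             : (size_t)(col * rows + row));  /* column-major */
}
```

     For UniformMatrix2x3dv (2 columns, 3 rows) with <count> of 2, the
     array must hold 12 doubles.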

     (modify "Uniform Buffer Object Storage", p. 102, adding a bullet after the
      last "Members of type", and modifying the subsequent bullet)

      * Members of type double are extracted from a buffer object by reading a
        single double-typed value at the specified offset.

      * Vectors with N elements with basic data types of bool, int, uint,
        float, or double are extracted as N values in consecutive memory
        locations beginning at the specified offset, with components stored in
        order with the first (X) component at the lowest offset. The GL data
        type used for component extraction is derived according to the rules
        for scalar members above.
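
     The extraction rules above can be mimicked on the CPU as in the
     following non-normative C sketch.  It assumes that the host "double"
     uses the same 64-bit representation and byte order as the buffer
     contents; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read a single double stored at byte <offset> of a buffer of <size> bytes. */
double read_double_member(const unsigned char *buf, size_t size, size_t offset)
{
    double v;
    assert(offset <= size && size - offset >= sizeof v);
    memcpy(&v, buf + offset, sizeof v);  /* safe for any host alignment */
    return v;
}

/* Read an n-component double vector: n values in consecutive memory
 * locations, with the first (X) component at the lowest offset. */
void read_dvec_member(const unsigned char *buf, size_t size, size_t offset,
                      int n, double out[4])
{
    assert(n >= 2 && n <= 4);
    for (int i = 0; i < n; ++i)
        out[i] = read_double_member(buf, size,
                                    offset + (size_t)i * sizeof(double));
}
```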


     Modify Section 2.14.6, Varying Variables, p. 106

     (modify third paragraph, p. 107) ... For the purposes of counting input
     and output components consumed by a shader, variables declared as vectors,
     matrices, and arrays will all consume multiple components.  Each component
     of variables declared as double-precision floating-point scalars, vectors,
     or matrices may be counted as consuming two components.

     (add after the bulleted list, p. 108) For the purposes of counting the
     total number of components to capture, each component of outputs declared
     as double-precision floating-point scalars, vectors, or matrices may be
     counted as consuming two components.


     Modify Section 2.19, Transform Feedback, p. 130

     (add to end of first paragraph, p. 132) ...  The results of appending a
     varying variable to a transform feedback buffer are undefined if any
     component of that variable would be written at an offset not aligned to
     the size of the component.
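
     An application can check this alignment condition before choosing
     which varyings to capture.  The C sketch below is non-normative; it
     assumes that the components of one variable are written back to back,
     so checking the first component is sufficient.  The function name is
     hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if a variable whose components are <component_size> bytes
 * each would be written at an aligned position when appended at byte
 * <offset>, and 0 otherwise.  Later components follow at multiples of
 * <component_size>, so they share the alignment of the first one. */
int varying_offset_is_aligned(size_t offset, size_t component_size)
{
    assert(component_size > 0);
    return offset % component_size == 0;
}
```

     For example, a dvec2 appended directly after a vec3 would start at
     byte offset 12, which is not a multiple of 8.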


 Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
 (Rasterization)

     None.

 Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
 (Per-Fragment Operations and the Frame Buffer)

     None.

 Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
 (Special Functions)

     None.

 Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
 (State and State Requests)

     Modify Section 6.1.15, Shader and Program Queries, p. 332

     (add to the first list of commands, p. 337)

       void GetUniformdv(uint program, int location, double *params);


 Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
 Specification (Invariance)

     None.

 Additions to the AGL/GLX/WGL Specifications

     None.

 Modifications to The OpenGL Shading Language Specification, Version 1.50
 (Revision 09)

     Including the following line in a shader can be used to control the
     language features described in this extension:

       #extension GL_ARB_gpu_shader_fp64 : <behavior>

     where <behavior> is as specified in section 3.3.

     New preprocessor #defines are added to the OpenGL Shading Language:

       #define GL_ARB_gpu_shader_fp64    1


     Modify Section 3.6, Keywords, p. 14

     (add the following to the list of keywords, p. 14)

     double              dvec2           dvec3           dvec4

     dmat2               dmat3           dmat4
     dmat2x2             dmat2x3         dmat2x4
     dmat3x2             dmat3x3         dmat3x4
     dmat4x2             dmat4x3         dmat4x4

     (remove "double", "dvec2", "dvec3", and "dvec4" from the list of
     keywords reserved for future use, p. 15)


     Modify Section 4.1, Basic Types, p. 17

     (add to the basic "Transparent Types" table, pp. 17-18)

       Types       Meaning
       --------    ----------------------------------------------------------
       double      a single double-precision floating-point scalar
       dvec2       a two-component double-precision floating-point vector
       dvec3       a three-component double-precision floating-point vector
       dvec4       a four-component double-precision floating-point vector

       dmat2       a 2x2 double-precision floating-point matrix
       dmat3       a 3x3 double-precision floating-point matrix
       dmat4       a 4x4 double-precision floating-point matrix
       dmat2x2     same as dmat2
       dmat2x3     a double-precision matrix with 2 columns and 3 rows
       dmat2x4     a double-precision matrix with 2 columns and 4 rows
       dmat3x2     a double-precision matrix with 3 columns and 2 rows
       dmat3x3     same as dmat3
       dmat3x4     a double-precision matrix with 3 columns and 4 rows
       dmat4x2     a double-precision matrix with 4 columns and 2 rows
       dmat4x3     a double-precision matrix with 4 columns and 3 rows
       dmat4x4     same as dmat4


     Modify Section 4.1.4, Floats, p. 22

     (modify two paragraphs of the section, adding support for doubles)

     Single- and double-precision floating-point values are available for use
     in a variety of scalar calculations.  Floating-point variables are defined
     as in the following example:

       float a, b = 1.5;
       double c, d = 2.0LF;

     As an input value to one of the processing units, a single- or
     double-precision floating-point variable is expected to match the IEEE
     floating-point definition for precision and dynamic range of the
     corresponding type.  It is not required that the precision of internal
     processing for operands of type "float" match the IEEE floating-point
     specification for floating-point operations, but the minimum guidelines
     for precision established by the OpenGL specification must be met.
     Treatment of conditions such as divide by 0 may lead to an unspecified
     result, but in no case should such a condition lead to the interruption or
     termination of processing.

     (modify the grammar, p. 22, adding the "lf" and "LF" suffixes)

       floating-suffix:  one of

         f F lf LF

     (modify last paragraph, p. 22) ...  including before a suffix.  When the
     suffix "lf" or "LF" is present, the literal has type <double>.  Otherwise,
     the literal has type <float>.  A leading unary ...
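
     The practical effect of the suffix can be illustrated in C, where the
     roles are reversed relative to GLSL: a C constant with the "f" suffix
     is single-precision, like an unsuffixed GLSL literal, and an unsuffixed
     C constant is double-precision, like a GLSL literal with the "LF"
     suffix.  The sketch below is non-normative and its function names are
     hypothetical.

```c
/* The constant 0.1 rounded to single precision and then widened, which
 * is what initializing a double from an unsuffixed GLSL literal yields. */
double from_single_literal(void) { return (double)0.1f; }

/* The same constant rounded directly to double precision, the analogue
 * of writing 0.1LF in GLSL. */
double from_double_literal(void) { return 0.1; }
```

     The two results differ by roughly 1.5e-9, so a shader that needs full
     double precision in a constant should use the suffix.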


     Modify Section 4.1.6, Matrices, p. 23

     (modify the first paragraph of the section)

     The OpenGL Shading Language has built-in types for 2×2, 2×3, 2×4, 3×2,
     3×3, 3×4, 4×2, 4×3, and 4×4 matrices of single- and double-precision
     floating-point numbers.  Matrix types beginning with "mat" have
     single-precision components; matrix types beginning with "dmat" have
     double-precision components.  The first number in the type is the number
     of columns, the second is the number of rows. Example matrix declarations:

       mat2 mat2D;
       mat3 optMatrix;
       mat4 view, projection;
       mat4x4 view; // an alternate way of declaring a mat4
       mat3x2 m; // a matrix with 3 columns and 2 rows
       dmat4 highPrecisionMVP;
       dmat2x4 skinnyAndTallWithBigComponents;

     ...

     Modify Section 4.1.10, Implicit Conversions, p. 27

     (modify table of implicit conversions)

                                 Can be implicitly
         Type of expression        converted to
         ---------------------   -------------------
         int                     uint(*), float, double
         ivec2                   uvec2(*), vec2, dvec2
         ivec3                   uvec3(*), vec3, dvec3
         ivec4                   uvec4(*), vec4, dvec4

         uint                    float, double
         uvec2                   vec2, dvec2
         uvec3                   vec3, dvec3
         uvec4                   vec4, dvec4

         float                   double
         vec2                    dvec2
         vec3                    dvec3
         vec4                    dvec4

         mat2                    dmat2
         mat3                    dmat3
         mat4                    dmat4
         mat2x3                  dmat2x3
         mat2x4                  dmat2x4
         mat3x2                  dmat3x2
         mat3x4                  dmat3x4
         mat4x2                  dmat4x2
         mat4x3                  dmat4x3

         (*) if ARB_gpu_shader5 or NV_gpu_shader5 is supported

     (modify second paragraph of the section) No implicit conversions are
     provided to convert from unsigned to signed integer types, from
     floating-point to integer types, or from higher-precision to
     lower-precision types.  There are no implicit array or structure
     conversions.

     (insert before the final paragraph of the section, p. 27) When performing
     implicit conversion for binary operators, there may be multiple data types
     to which the two operands can be converted.  For example, when adding an
     int value to a uint value, both values can be implicitly converted to
     uint, float, and double.  In such cases, a floating-point type is chosen
     if either operand has a floating-point type.  Otherwise, an unsigned
     integer type is chosen if either operand has an unsigned integer type.
     Otherwise, a signed integer type is chosen.  If operands can be implicitly
     converted to multiple data types deriving from the same base data type,
     the type with the smallest component size is used.
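
     The selection rule above can be sketched in C for scalar base types.
     This is a non-normative illustration; it assumes that the int-to-uint
     conversion marked (*) in the table is available, and the type and
     function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { T_INT, T_UINT, T_FLOAT, T_DOUBLE } BaseType;

/* Implicit conversions from the table, plus the identity conversion. */
static bool converts_to(BaseType from, BaseType to)
{
    if (from == to) return true;
    switch (from) {
    case T_INT:   return to == T_UINT || to == T_FLOAT || to == T_DOUBLE;
    case T_UINT:  return to == T_FLOAT || to == T_DOUBLE;
    case T_FLOAT: return to == T_DOUBLE;
    default:      return false;
    }
}

static bool is_floating(BaseType t) { return t == T_FLOAT || t == T_DOUBLE; }

/* Type to which both operands of a binary operator are converted. */
BaseType binary_result_type(BaseType a, BaseType b)
{
    /* Listed so that the smaller component size comes first per class. */
    static const BaseType order[] = { T_INT, T_UINT, T_FLOAT, T_DOUBLE };
    bool want_float = is_floating(a) || is_floating(b);
    bool want_unsigned = !want_float && (a == T_UINT || b == T_UINT);

    for (size_t i = 0; i < sizeof order / sizeof order[0]; ++i) {
        BaseType t = order[i];
        if (!converts_to(a, t) || !converts_to(b, t)) continue;
        if (want_float) { if (!is_floating(t)) continue; }
        else if (want_unsigned) { if (t != T_UINT) continue; }
        else if (t != T_INT) continue;
        return t;
    }
    assert(0 && "no common type");
    return T_DOUBLE;
}
```

     With this sketch, int combined with uint yields uint, int combined
     with float yields float rather than double, and any operand combined
     with double yields double.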


     Modify Section 4.3.4, Inputs, p. 31

     (modify third paragraph of the section, p. 31) ... Vertex shader inputs
     can only be single-precision floating-point scalars, vectors, or matrices,
     or signed and unsigned integers and integer vectors.  Vertex shader inputs
     can also form arrays of these types, but not structures.

     (modify third paragraph, p. 32, allowing doubles as inputs and disallowing
     as non-flat fragment inputs) ... Fragment inputs can only be signed and
     unsigned integers and integer vectors, float, floating-point vectors,
     double, double-precision vectors, single- or double-precision matrices, or
     arrays or structures of these. Fragment shader inputs that are signed or
     unsigned integers, integer vectors, doubles, double-precision vectors, or
     double-precision matrices must be qualified with the interpolation
     qualifier flat.


     Modify Section 4.3.6, Outputs, p. 33

     (modify third paragraph of the section, p. 33) They can only be float,
     double, single- or double-precision floating-point vectors or matrices,
     signed or unsigned integers or integer vectors, or arrays or structures of
     any of these.

     (modify last paragraph, p. 33) ... Fragment outputs can only be float,
     single-precision floating-point vectors, signed or unsigned integers or
     integer vectors, or arrays of these. ...


     Modify Section 5.4.1, Conversion and Scalar Constructors, p. 49

     (add double to the first list of constructor examples)

     Converting between scalar types is done as the following prototypes
     indicate:

       int(uint)     // converts an unsigned integer value to a signed integer
       int(float)    // converts a float value to a signed integer
       int(double)   // converts a double value to a signed integer
       int(bool)     // converts a Boolean value to a signed integer
       uint(int)     // converts a signed integer value to an unsigned integer
       uint(float)   // converts a float value to an unsigned integer
       uint(double)  // converts a double value to an unsigned integer
       uint(bool)    // converts a Boolean value to an unsigned integer
       float(int)    // converts a signed integer value to a float
       float(uint)   // converts an unsigned integer value to a float
       float(double) // converts a double value to a float
       float(bool)   // converts a Boolean value to a float
       double(int)   // converts a signed integer value to a double
       double(uint)  // converts an unsigned integer value to a double
       double(float) // converts a float value to a double
       double(bool)  // converts a Boolean value to a double
       bool(int)     // converts a signed integer value to a Boolean
       bool(uint)    // converts an unsigned integer value to a Boolean
       bool(float)   // converts a float value to a Boolean
       bool(double)  // converts a double value to a Boolean

     (modify second paragraph of the section, p. 49) When constructors are used
     to convert any floating-point type to an integer, the fractional part of
     the floating-point value is dropped. ...

     (modify third paragraph of the section, p. 49) When a constructor is used
     to convert any integer or floating-point type to bool, 0 and 0.0 are
     converted to false, and non-zero values are converted to true.  When a
     constructor is used to convert a bool to any integer or floating-point
     type, false is converted to 0 or 0.0, and true is converted to 1 or 1.0.
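
     The constructor rules above correspond closely to C conversions, as
     the following non-normative sketch shows.  The function names are
     hypothetical; the range check exists only because out-of-range
     conversions are undefined in C.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* int(double): the fractional part of the value is dropped. */
int double_to_int(double d)
{
    assert(d > (double)INT_MIN - 1.0 && d < (double)INT_MAX + 1.0);
    return (int)d;
}

/* bool(double): 0.0 is converted to false, non-zero values to true. */
bool double_to_bool(double d) { return d != 0.0; }

/* double(bool): false is converted to 0.0, true to 1.0. */
double bool_to_double(bool b) { return b ? 1.0 : 0.0; }
```

     For example, both 2.9 and -2.9 lose their fractional part, giving 2
     and -2 respectively.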


     Modify Section 5.4.2, Vector and Matrix Constructors, p. 50

     (modify the last paragraph, p. 50) If the basic type (bool, int, uint,
     float, or double) of a parameter to a constructor does not match the basic
     type of the object being constructed, the scalar construction rules
     (above) are used to convert the parameters.


     (add to the first group of examples, p. 52)

       dmat2(dvec2, dvec2)
       dmat3(dvec3, dvec3, dvec3)
       dmat4(dvec4, dvec4, dvec4, dvec4)
       dmat2x4(dvec3, double,   // first column
               double, dvec3)   // second column


     Modify Section 5.9, Expressions, p. 57

     (modify bulleted list as follows, adding support for double-precision
     floating-point types)

     Expressions in the shading language are built from the following:

     * Constants of type bool, int, uint, float, double, all vector types and
       all matrix types.

     ...

     * The arithmetic binary operators add (+), subtract (-), multiply (*), and
       divide (/) operate on integer, single-precision floating-point, and
       double-precision floating-point scalars, vectors, and matrices.  If the
       fundamental type (integer, single-precision floating-point,
       double-precision floating-point) of the operands do not match, the
       conversions from Section 4.1.10 "Implicit Conversions" are applied to
       produce matching types.  ...

     * The arithmetic unary operators negate (-), post- and pre-increment and
       decrement (-- and ++) operate on integer, single-precision
       floating-point, or double-precision floating-point values (including
       vectors and matrices). ...

     * The relational operators greater than (>), less than (<), greater than
       or equal (>=), and less than or equal (<=) operate only on scalar
       integer, single-precision floating-point, or double-precision
       floating-point expressions.  The
       result is scalar Boolean.  The fundamental type of the two operands must
       match, either as specified, or after one of the implicit type
       conversions specified in Section 4.1.10.  ...

       ...


     Modify Chapter 8, Built-in Functions, p. 81

     (add to description of generic types, last paragraph of p. 81) ... Where
     the input arguments (and corresponding output) can be double, dvec2,
     dvec3, or dvec4, <genDType> is used as the argument.  ... Similarly, <mat>
     is used for any matrix basic type with single-precision components and
     <dmat> is used for any matrix basic type with double-precision components.


     Modify Section 8.2, Exponential Functions, p. 83

     (add overloads for double-precision square roots)

       genDType sqrt(genDType x);
       genDType inversesqrt(genDType x);


     Modify Section 8.3, Common Functions, p. 84

     (add support for double-precision floating-point multiply-add)

     Syntax:

       genDType fma(genDType a, genDType b, genDType c);

     The function fma() performs a fused double-precision floating-point
     multiply-add to compute the value a*b+c.  The results of fma() may not be
     identical to evaluating the expression (a*b)+c, because the computation
     may be performed in a single operation with intermediate precision
     different from that used to compute a non-fma() expression.

     The results of fma() are guaranteed to be invariant given fixed inputs
     <a>, <b>, and <c>, as though the result were taken from a variable
     declared as "precise".


     (add support for double-precision frexp and ldexp functions)

     Syntax:

       genDType frexp(genDType x, out genIType exp);
       genDType ldexp(genDType x, in genIType exp);

     The function frexp() splits each double-precision floating-point number in
     <x> into its binary significand, a floating-point number in the range
     [0.5, 1.0), and an integral exponent of two, such that:

       x = significand * 2 ^ exponent

     The significand is returned by the function; the exponent is returned in
     the parameter <exp>.  For a floating-point value of zero, the significand
     and exponent are both zero.  For a floating-point value that is an
     infinity or is not a number, the results of frexp() are undefined.

     If the input <x> is a vector, this operation is performed in a
     component-wise manner; the value returned by the function and the value
     written to <exp> are vectors with the same number of components as <x>.

     The function ldexp() builds a double-precision floating-point number from
     each significand component in <x> and the corresponding integral exponent
     of two in <exp>, returning:

       significand * 2 ^ exponent

     If this product is too large to be represented as a double-precision
     floating-point value, the result is considered undefined.

     If the input <x> is a vector, this operation is performed in a
     component-wise manner; the value passed in <exp> and returned by the
     function are vectors with the same number of components as <x>.
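
     The scalar behavior described above matches that of the C library
     functions of the same names, which the following non-normative sketch
     uses.  The struct and function names are hypothetical.

```c
#include <math.h>

/* A double split into its significand and power-of-two exponent. */
typedef struct {
    double significand;
    int    exponent;
} SplitDouble;

/* Split x so that x = significand * 2 ^ exponent. */
SplitDouble split_double(double x)
{
    SplitDouble s;
    s.significand = frexp(x, &s.exponent);
    return s;
}

/* Rebuild the value from its significand and exponent. */
double join_double(SplitDouble s)
{
    return ldexp(s.significand, s.exponent);
}
```

     For example, 8.0 splits into a significand of 0.5 and an exponent of
     4, and 0.0 splits into a significand and exponent that are both zero.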


     (add overloads for double-precision functions)

       genDType abs(genDType x);
       genDType sign(genDType x);
       genDType floor(genDType x);
       genDType trunc(genDType x);
       genDType round(genDType x);
       genDType roundEven(genDType x);
       genDType ceil(genDType x);
       genDType fract(genDType x);
       genDType mod(genDType x, double y);
       genDType mod(genDType x, genDType y);
       genDType modf(genDType x, out genDType i);
       genDType min(genDType x, genDType y);
       genDType min(genDType x, double y);
       genDType max(genDType x, genDType y);
       genDType max(genDType x, double y);
       genDType clamp(genDType x, genDType minVal, genDType maxVal);
       genDType clamp(genDType x, double minVal, double maxVal);
       genDType mix(genDType x, genDType y, genDType a);
       genDType mix(genDType x, genDType y, double a);
       genDType mix(genDType x, genDType y, genBType a);
       genDType step(genDType edge, genDType x);
       genDType step(double edge, genDType x);
       genDType smoothstep(genDType edge0, genDType edge1, genDType x);
       genDType smoothstep(double edge0, double edge1, genDType x);
       genBType isnan(genDType x);
       genBType isinf(genDType x);


     (add support for 64-bit floating-point packing and unpacking functions)

     Syntax:

       double   packDouble2x32(uvec2 v);
       uvec2    unpackDouble2x32(double v);

     The function packDouble2x32() returns a double obtained by packing the
     components of a two-component unsigned integer vector into a 64-bit value
     and interpreting its bits according to the IEEE double-precision
     floating-point representation.  The first vector component specifies the
     32 least significant bits; the second component specifies the 32 most
     significant bits.

     The function unpackDouble2x32() returns a two-component unsigned integer
     vector obtained by interpreting a double using the 64-bit IEEE
     double-precision floating-point representation and unpacking into two
     32-bit halves.  The first component of the vector contains the 32 least
     significant bits of the double; the second component contains the 32 most
     significant bits.
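
     The bit layout used by these two functions can be reproduced in C as
     in the following non-normative sketch.  It assumes that the host
     "double" is a 64-bit IEEE value stored with the same byte order as a
     64-bit integer; the type and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for uvec2: x holds the 32 least significant bits and y the
 * 32 most significant bits. */
typedef struct {
    uint32_t x;
    uint32_t y;
} UVec2;

/* Analogue of packDouble2x32(). */
double pack_double_2x32(UVec2 v)
{
    uint64_t bits = ((uint64_t)v.y << 32) | v.x;
    double d;
    memcpy(&d, &bits, sizeof d);  /* reinterpret the bits, no conversion */
    return d;
}

/* Analogue of unpackDouble2x32(). */
UVec2 unpack_double_2x32(double d)
{
    uint64_t bits;
    UVec2 v;
    memcpy(&bits, &d, sizeof bits);
    v.x = (uint32_t)(bits & 0xFFFFFFFFu);
    v.y = (uint32_t)(bits >> 32);
    return v;
}
```

     For example, unpacking 1.0 gives x = 0 and y = 0x3FF00000.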


     Modify Section 8.4, Geometric Functions, p. 87

     (add double-precision equivalents for existing geometric functions)

       double length(genDType x);
       double distance(genDType p0, genDType p1);
       double dot(genDType x, genDType y);
       dvec3 cross(dvec3 x, dvec3 y);
       genDType normalize(genDType x);
       genDType faceforward(genDType N, genDType I, genDType Nref);
       genDType reflect(genDType I, genDType N);
       genDType refract(genDType I, genDType N, double eta);


     Modify Section 8.5, Matrix Functions, p. 89

     (add double-precision equivalents for existing matrix functions)

       dmat matrixCompMult(dmat x, dmat y);
       dmat2 outerProduct(dvec2 c, dvec2 r);
       dmat3 outerProduct(dvec3 c, dvec3 r);
       dmat4 outerProduct(dvec4 c, dvec4 r);
       dmat2x3 outerProduct(dvec3 c, dvec2 r);
       dmat3x2 outerProduct(dvec2 c, dvec3 r);
       dmat2x4 outerProduct(dvec4 c, dvec2 r);
       dmat4x2 outerProduct(dvec2 c, dvec4 r);
       dmat3x4 outerProduct(dvec4 c, dvec3 r);
       dmat4x3 outerProduct(dvec3 c, dvec4 r);
       dmat2 transpose(dmat2 m);
       dmat3 transpose(dmat3 m);
       dmat4 transpose(dmat4 m);
       dmat2x3 transpose(dmat3x2 m);
       dmat3x2 transpose(dmat2x3 m);
       dmat2x4 transpose(dmat4x2 m);
       dmat4x2 transpose(dmat2x4 m);
       dmat3x4 transpose(dmat4x3 m);
       dmat4x3 transpose(dmat3x4 m);
       double determinant(dmat2 m);
       double determinant(dmat3 m);
       double determinant(dmat4 m);
       dmat2 inverse(dmat2 m);
       dmat3 inverse(dmat3 m);
       dmat4 inverse(dmat4 m);
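
     As an illustration only, the relationship between argument sizes and
     the result type of the non-square outerProduct() overloads can be
     modeled in C.  The sketch assumes the existing GLSL conventions: <c> is
     treated as a column vector, <r> as a row vector, and dmat2x3 has two
     columns of three rows indexed as m[column][row].  The function name
     outer_product_2x3 is hypothetical.

```c
/* Hypothetical model of: dmat2x3 outerProduct(dvec3 c, dvec2 r).
   The result has as many rows as c has components (3) and as many
   columns as r has components (2); m is indexed m[column][row]. */
static void outer_product_2x3(const double c[3], const double r[2],
                              double m[2][3])
{
    for (int col = 0; col < 2; ++col) {
        for (int row = 0; row < 3; ++row) {
            m[col][row] = c[row] * r[col];
        }
    }
}
```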


     Modify Section 8.6, Vector Relational Functions, p. 90

     (modify the first paragraph, p. 90, adding support for relational
     functions operating on double precision types)

     Relational and equality operators (<, <=, >, >=, ==, !=) are defined (or
     reserved) to operate on scalars and produce scalar Boolean results.  For
     vector results, use the following built-in functions.  In the definitions
     below, the following terms are used as placeholders for all vector types
     for a given fundamental data type.  In all cases, the sizes of the input
     and return vectors for any particular call must match.

         placeholder     fundamental types
         -----------     ------------------------------------------------
         bvec            bvec2, bvec3, bvec4

         ivec            ivec2, ivec3, ivec4

         uvec            uvec2, uvec3, uvec4

         vec             vec2, vec3, vec4, dvec2, dvec3, dvec4


     Modify Section 9, Shading Language Grammar, p. 92

     !!! TBD !!!


 GLX Protocol

     !!! TBD

 Dependencies on ARB_gpu_shader5

     If ARB_gpu_shader5 is not supported, the changes to the function
     overloading rules in the OpenGL Shading Language Specification provided
     there should be included in this extension.

 Dependencies on NV_gpu_shader5

     This extension and NV_gpu_shader5 both provide support for shading
     language variables with 64-bit components.  If both extensions are
     supported, the various edits describing this new support should be
     combined.

 Dependencies on EXT_direct_state_access

     If EXT_direct_state_access is not supported, references to the
     ProgramUniform*d*EXT functions should be removed.

     If EXT_direct_state_access is supported, that specification should be
     edited as follows:

     (modify the ProgramUniform* language)

     The following commands:

         ....
         void ProgramUniform{1,2,3,4}dEXT(uint program, int location, T value);
         void ProgramUniform{1,2,3,4}dvEXT (uint program, int location,
                                           const T *value);
         void ProgramUniformMatrix{2,3,4}dvEXT
              (uint program, int location, sizei count, boolean transpose,
               const double *value);
         void ProgramUniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dvEXT
              (uint program, int location, sizei count, boolean transpose,
               const double *value);

     operate identically to the corresponding command where "Program" is
     deleted from the name (and extension suffixes are dropped or updated
     appropriately) except, rather than updating the currently active program
     object, these "Program" commands update the program object named by the
     <program> parameter.  ...

 Dependencies on NV_shader_buffer_load

     If NV_shader_buffer_load is supported, that specification should be edited
     as follows:

     Modify "Section 2.20.X, Shader Memory Access" from NV_shader_buffer_load.

     (add rules for loads of variables having the new data types from this
     extension to the list of bullets following "When a shader dereferences a
     pointer variable")

       - Data of type "double" are read from or written to memory as one
         double-typed value at the specified GPU address.


 Errors

     None.

 New State

     None.

 New Implementation Dependent State

     None.

 Issues

     (1) How do double-precision types interact with the rules for storing
     uniforms in a buffer object?

       RESOLVED:  The rules were already written with data types larger and
       smaller than those in the original GLSL in mind.  Single precision
       floats typically take four bytes; doubles take eight bytes.  The larger
       storage requirement for doubles means a larger alignment requirement;
       doubles still need to be size-aligned.

     (2) Should double-precision vertex shader inputs be supported?

       RESOLVED:  Not in this extension.  Such support will be added by the
       EXT_vertex_attrib_64bit extension.

     (3) Should double-precision fragment shader outputs be supported?

       RESOLVED:  Not in this extension.  Note that we don't have
       double-precision framebuffer formats to accept such values.

     (4) Should transform feedback be able to capture double-precision
     components?

       RESOLVED:  Yes.  However, undefined behavior will occur unless all
       components are captured to size-aligned offsets.

       If any variable captured in transform feedback has double-precision
       components, the practical requirements for defined behavior are:

         (a) the offset of the base of a buffer object must be a multiple of
             eight bytes;

         (b) the amount of data captured per vertex must be a multiple of eight
             bytes; and

         (c) each double-precision variable captured must be aligned to a
             multiple of eight bytes relative to the beginning of a vertex.

       If capturing a mix of single- and double-precision components, it might
       be necessary to use the "gl_SkipComponents1" variable from
       ARB_transform_feedback3 to force proper alignment.
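
       As an illustration only, an application could check conditions (a)
       through (c) itself before starting transform feedback.  The sketch
       below assumes interleaved capture into a single buffer with variables
       packed tightly in capture order; the CapturedVar type and the
       function xfb_layout_is_aligned are hypothetical, and a
       "gl_SkipComponents1" entry is modeled as one 4-byte component.

```c
#include <stdbool.h>
#include <stddef.h>

/* One captured variable: component count and bytes per component
   (4 for single precision, 8 for double precision). */
typedef struct {
    size_t components;
    size_t component_size;
} CapturedVar;

/* Returns true if no 8-byte alignment requirement is violated.  When no
   double-precision variable is captured, conditions (a)-(c) do not apply
   and the function returns true. */
static bool xfb_layout_is_aligned(size_t buffer_offset,
                                  const CapturedVar *vars, size_t count)
{
    size_t offset = 0;          /* byte offset from the start of a vertex */
    bool has_double = false;

    for (size_t i = 0; i < count; ++i) {
        if (vars[i].component_size == 8) {
            has_double = true;
            if (offset % 8 != 0) {
                return false;   /* violates (c) */
            }
        }
        offset += vars[i].components * vars[i].component_size;
    }
    if (!has_double) {
        return true;
    }
    /* (a) buffer base offset and (b) per-vertex size, both multiples of 8 */
    return buffer_offset % 8 == 0 && offset % 8 == 0;
}
```

       For example, capturing a float followed by a dvec2 fails condition
       (c) in this model, while inserting one skipped 4-byte component
       between them places the dvec2 at offset 8 and gives a 24-byte vertex.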

       We considered the possibility of adding error checks to throw errors in
       cases where undefined behavior might occur, but chose not to include
       such errors.  For OpenGL 3.0-style transform feedback, cases (b) and (c)
       are solely a function of the variables captured and could be detected
       when a program object is linked.  (Such an error would be more
       problematic for transform feedback via NV_transform_feedback, where the
       set of variables captured can be updated without relinking.)  For case
       (a), the requirement of OpenGL 3.0 is that transform feedback buffer
       offsets must be a multiple of 4 bytes; enforcing a stricter 8-byte
       alignment would require either a backward-incompatible change or a
       Begin-time error that checks the offset of transform feedback buffers
       against the current program.

     (5) Should we have double-precision matrix types?  We didn't add integer
         matrices, but integer matrix math is fairly uncommon.

       RESOLVED:  Yes, we will support all matrix sizes in double-precision.
       We will also provide double-precision equivalents for all matrix
       operators and built-in matrix functions.

     (6) What should be done to distinguish between single- and
         double-precision floating-point constants?

       RESOLVED:  We will use "LF" to identify double-precision floating-point
       constants.  Here, we depart from the C standard.  In C, floating-point
       constants without a suffix are implicitly double-precision and require a
       "F" suffix to specify a single-precision constant.  However, GLSL has
       historically provided no support for double precision.  Changing to C
       rules would materially affect the behavior of pre-existing shaders that
       add an #extension line for this extension, since constants with no
       suffix have meant "float" up to now.  Additionally, such a change would
       likely have required that we introduce implicit conversions from double
       to float; otherwise, assigning a constant with no suffix to a float
       would result in a compile-time error.
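
       As an illustration only, the C rule that this resolution departs from
       can be demonstrated with a C11 _Generic selection.  In C, an
       unsuffixed constant has type double and the "f" suffix selects float;
       with this extension, GLSL does the reverse: an unsuffixed constant
       remains a float and the "LF" suffix selects double.  The macro and
       function names below are hypothetical.

```c
/* Evaluates to 1 if the expression has type double, 0 otherwise (C11). */
#define IS_DOUBLE(expr) _Generic((expr), double: 1, default: 0)

/* Evaluates to 1 if the expression has type float, 0 otherwise (C11). */
#define IS_FLOAT(expr) _Generic((expr), float: 1, default: 0)

/* In C, a constant with no suffix is a double. */
static int c_unsuffixed_constant_is_double(void)
{
    return IS_DOUBLE(1.5);
}

/* In C, the "f" suffix is required to obtain a float constant. */
static int c_f_suffixed_constant_is_float(void)
{
    return IS_FLOAT(1.5f);
}
```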

     (7) Should we require IEEE 754-compliant behavior for NaNs and
         infinities?  Denorms?

       RESOLVED:  Following historical precedent in the GLSL and OpenGL APIs
       not defining special-case floating-point behavior, we chose not to do so
       in this extension.

     (8) Should we provide double-precision versions of all the built-ins that
         take a <genType>, which are currently defined to be floats and
         floating-point vectors?

       RESOLVED:  We provide double-precision versions of most of the built-in
       functions supported by GLSL.  We opted not to provide double-precision
       functions for special trigonometry, exponential, derivative, and noise
       functions.

     (9) Are double-precision "varyings" (values passed between shader stages)
         supported by this extension?  If so, is double-precision interpolation
         supported?

       RESOLVED:  Double-precision shader inputs and outputs are supported,
       except for vertex shader inputs and fragment shader outputs.
       Additionally, double-precision vertex shader inputs are provided by the
       separate extension EXT_vertex_attrib_64bit.  No known extension provides
       double-precision fragment outputs, but that doesn't seem important since
       OpenGL provides no pixel/texture formats with double-precision
       components that could reasonably receive such outputs.

       Interpolation is not supported in this extension for double-precision
       floating-point components.  As with integer types in OpenGL 3.0,
       double-precision floating-point fragment shader inputs must be qualified
       as "flat".

       Note that this extension reformulates the spec language requiring "flat"
       qualifiers, in addition to adding doubles to the list of "flat" types.
       In GLSL 1.30, the spec applies these requirements to vertex shader
       outputs but imposes no requirement on fragment inputs.  We move this
       requirement to fragment inputs, since vertex shader outputs may be
       passed to tessellation or geometry shaders without interpolation, and
       thus without the need for qualification by "flat".

     (15) Can the 64-bit uniform APIs be used to load values for uniforms of
          type "bool", "bvec2", "bvec3", or "bvec4"?

       RESOLVED:  No.  OpenGL 2.0 and beyond did allow "bool" variables to be
       set with Uniform*i* and Uniform*f APIs, and OpenGL 3.0 extended that
       support to Uniform*ui* for orthogonality.  But it seems pointless to
       extend this capability forward to 64-bit Uniform APIs as well.

     (19) Should we support any implicit conversion of matrix types, now that
          we have both "mat4" and "dmat4"?

       RESOLVED:  No.  It doesn't seem worth the trouble.


 Revision History

     Rev.    Date    Author    Changes
     ----  --------  --------  -----------------------------------------
     11    08/27/12  pbrown    Clarify that Uniform*d can not be used to load
                               uniforms with boolean types (bug 9345); import
                               issue (15) on the topic from NV_gpu_shader5.

     10    03/23/10  pbrown    Update issues section to include fp64 issues
                               that were left behind in NV_gpu_shader5 when the
                               specs were refactored.

      9    02/02/10  pbrown    Specify that capturing any component at an
                               offset that is not size-aligned results in
                               undefined behavior (bug 5863).

      8    01/29/10  pbrown    Remove shading language and API support for
                               double-precision vertex attributes; moved to the
                               EXT_vertex_attrib_64bit specification (bug
                               5953).  Added clarification disallowing
                               double-precision fragment shader outputs.

      7    01/29/10  pbrown    Delete accidental modifications to the language
                               for equal and not equal operators (bug 5904),
                               which already supported all types.

      6    01/15/10  pbrown    Modify the spec rules for counting attributes,
                               input and output components, and components
                               to capture in transform feedback to permit,
                               but not require, double-precision values to
                               require twice as many resources as single-
                               precision equivalents (bug 5855).

      5    01/14/10  pbrown    Minor updates from spec reviews.

      4    12/10/09  pbrown    Functionality updates from spec review:
                               Allow implicit conversion from mat*->dmat*.
                               Rename fmad and [un]packFloat2x32 to fma
                               and [un]packDouble2x32.  Add overlooked
                               fp64 versions of geometric functions.

      3    12/10/09  pbrown    Convert from EXT to ARB.

      2    12/08/09  pbrown    Miscellaneous fixes from spec review:  Clarified
                               input/output component counting rules, where
                               each fp64 value counts double.  General typo
                               fixes and language clarifications.

      1              pbrown    Internal revisions.