
Name | |

ARB_gpu_shader_fp64 | |

Name Strings | |

GL_ARB_gpu_shader_fp64 | |

Contact | |

Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com) | |

Contributors | |

Barthold Lichtenbelt, NVIDIA | |

Bill Licea-Kane, AMD | |

Bruce Merry, ARM | |

Chris Dodd, NVIDIA | |

Eric Werness, NVIDIA | |

Graham Sellers, AMD | |

Greg Roth, NVIDIA | |

Jeff Bolz, NVIDIA | |

Nick Haemel, AMD | |

Pierre Boudier, AMD | |

Piers Daniell, NVIDIA | |

Notice | |

Copyright (c) 2010-2013 The Khronos Group Inc. Copyright terms at | |

http://www.khronos.org/registry/speccopyright.html | |

Specification Update Policy | |

Khronos-approved extension specifications are updated in response to | |

issues and bugs prioritized by the Khronos OpenGL Working Group. For | |

extensions which have been promoted to a core Specification, fixes will | |

first appear in the latest version of that core Specification, and will | |

eventually be backported to the extension document. This policy is | |

described in more detail at | |

https://www.khronos.org/registry/OpenGL/docs/update_policy.php | |

Status | |

Complete. Approved by the ARB at the 2010/01/22 F2F meeting. | |

Approved by the Khronos Board of Promoters on March 10, 2010. | |

Version | |

Last Modified Date: August 27, 2012 | |

NVIDIA Revision: 11 | |

Number | |

ARB Extension #89 | |

Dependencies | |

This extension is written against the OpenGL 3.2 (Compatibility Profile) | |

Specification. | |

This extension is written against version 1.50 (revision 09) of the OpenGL | |

Shading Language Specification. | |

OpenGL 3.2 and GLSL 1.50 are required. | |

This extension interacts with EXT_direct_state_access. | |

This extension interacts with NV_shader_buffer_load. | |

Overview | |

This extension allows GLSL shaders to use double-precision floating-point | |

data types, including vectors and matrices of doubles. Doubles may be | |

used as inputs, outputs, and uniforms. | |

The shading language supports various arithmetic and comparison operators | |

on double-precision scalar, vector, and matrix types, and provides a set | |

of built-in functions including: | |

* square roots and inverse square roots; | |

* fused floating-point multiply-add operations; | |

* splitting a floating-point number into a significand and exponent | |

(frexp), or building a floating-point number from a significand and | |

exponent (ldexp); | |

* absolute value, sign tests, various functions to round to an integer | |

value, modulus, minimum, maximum, clamping, blending two values, step | |

functions, and testing for infinity and NaN values; | |

* packing and unpacking doubles into a pair of 32-bit unsigned integers; | |

* matrix component-wise multiplication, and computation of outer | |

products, transposes, determinants, and inverses; and | |

* vector relational functions. | |

Double-precision versions of angle, trigonometry, and exponential | |

functions are not supported. | |

Implicit conversions are supported from integer and single-precision | |

floating-point values to doubles, and this extension uses the relaxed | |

function overloading rules specified by the ARB_gpu_shader5 extension to | |

resolve ambiguities. | |

This extension provides API functions for specifying double-precision | |

uniforms in the default uniform block, including functions similar to the | |

uniform functions added by EXT_direct_state_access (if supported). | |

This extension provides an "LF" suffix for specifying double-precision | |

constants. Floating-point constants without a suffix in GLSL are treated | |

as single-precision values for backward compatibility with versions not | |

supporting doubles; similar constants are treated as double-precision | |

values in the "C" programming language. | |
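
The suffix matters because a single-precision constant that is later widened to double keeps only single-precision accuracy. The following is a minimal C sketch of that effect, not part of the extension. C is used only because it has both kinds of constant; the function names are hypothetical. The C spellings 0.1f and 0.1 correspond to the GLSL spellings 0.1 and 0.1LF under this extension.

```c
/* Returns a float constant after widening to double. The widening
   cannot restore the bits lost when 0.1 was rounded to single precision. */
double widened_float_constant(void)
{
    float f = 0.1f;      /* like an unsuffixed GLSL constant */
    return (double)f;
}

/* Returns the same decimal value rounded directly to double precision. */
double double_constant(void)
{
    return 0.1;          /* like 0.1LF in GLSL with this extension */
}
```

On an IEEE host the two results differ by roughly 1.5e-9, which is the error a shader would inherit by omitting the suffix on a constant meant to be double precision.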

This extension does not support interpolation of double-precision values; | |

doubles used as fragment shader inputs must be qualified as "flat". | |

Additionally, this extension does not allow vertex attributes with 64-bit | |

components. That support is added separately by EXT_vertex_attrib_64bit. | |

IP Status | |

No known IP claims. | |

New Procedures and Functions | |

void Uniform1d(int location, double x); | |

void Uniform2d(int location, double x, double y); | |

void Uniform3d(int location, double x, double y, double z); | |

void Uniform4d(int location, double x, double y, double z, double w); | |

void Uniform1dv(int location, sizei count, const double *value); | |

void Uniform2dv(int location, sizei count, const double *value); | |

void Uniform3dv(int location, sizei count, const double *value); | |

void Uniform4dv(int location, sizei count, const double *value); | |

void UniformMatrix2dv(int location, sizei count, boolean transpose, | |

const double *value); | |

void UniformMatrix3dv(int location, sizei count, boolean transpose, | |

const double *value); | |

void UniformMatrix4dv(int location, sizei count, boolean transpose, | |

const double *value); | |

void UniformMatrix2x3dv(int location, sizei count, boolean transpose, | |

const double *value); | |

void UniformMatrix2x4dv(int location, sizei count, boolean transpose, | |

const double *value); | |

void UniformMatrix3x2dv(int location, sizei count, boolean transpose, | |

const double *value); | |

void UniformMatrix3x4dv(int location, sizei count, boolean transpose, | |

const double *value); | |

void UniformMatrix4x2dv(int location, sizei count, boolean transpose, | |

const double *value); | |

void UniformMatrix4x3dv(int location, sizei count, boolean transpose, | |

const double *value); | |

void GetUniformdv(uint program, int location, double *params); | |

(All of the following ProgramUniform* functions are supported if and only | |

if EXT_direct_state_access is supported.) | |

void ProgramUniform1dEXT(uint program, int location, double x); | |

void ProgramUniform2dEXT(uint program, int location, double x, double y); | |

void ProgramUniform3dEXT(uint program, int location, double x, double y, | |

double z); | |

void ProgramUniform4dEXT(uint program, int location, double x, double y, | |

double z, double w); | |

void ProgramUniform1dvEXT(uint program, int location, sizei count, | |

const double *value); | |

void ProgramUniform2dvEXT(uint program, int location, sizei count, | |

const double *value); | |

void ProgramUniform3dvEXT(uint program, int location, sizei count, | |

const double *value); | |

void ProgramUniform4dvEXT(uint program, int location, sizei count, | |

const double *value); | |

void ProgramUniformMatrix2dvEXT(uint program, int location, sizei count, | |

boolean transpose, const double *value); | |

void ProgramUniformMatrix3dvEXT(uint program, int location, sizei count, | |

boolean transpose, const double *value); | |

void ProgramUniformMatrix4dvEXT(uint program, int location, sizei count, | |

boolean transpose, const double *value); | |

void ProgramUniformMatrix2x3dvEXT(uint program, int location, sizei count, | |

boolean transpose, const double *value); | |

void ProgramUniformMatrix2x4dvEXT(uint program, int location, sizei count, | |

boolean transpose, const double *value); | |

void ProgramUniformMatrix3x2dvEXT(uint program, int location, sizei count, | |

boolean transpose, const double *value); | |

void ProgramUniformMatrix3x4dvEXT(uint program, int location, sizei count, | |

boolean transpose, const double *value); | |

void ProgramUniformMatrix4x2dvEXT(uint program, int location, sizei count, | |

boolean transpose, const double *value); | |

void ProgramUniformMatrix4x3dvEXT(uint program, int location, sizei count, | |

boolean transpose, const double *value); | |

New Tokens | |

Returned in the <type> parameter of GetActiveUniform, and | |

GetTransformFeedbackVarying: | |

DOUBLE | |

DOUBLE_VEC2 0x8FFC | |

DOUBLE_VEC3 0x8FFD | |

DOUBLE_VEC4 0x8FFE | |

DOUBLE_MAT2 0x8F46 | |

DOUBLE_MAT3 0x8F47 | |

DOUBLE_MAT4 0x8F48 | |

DOUBLE_MAT2x3 0x8F49 | |

DOUBLE_MAT2x4 0x8F4A | |

DOUBLE_MAT3x2 0x8F4B | |

DOUBLE_MAT3x4 0x8F4C | |

DOUBLE_MAT4x2 0x8F4D | |

DOUBLE_MAT4x3 0x8F4E | |

Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification | |

(OpenGL Operation) | |

Modify Section 2.14.4, Uniform Variables, p. 89 | |

(modify third paragraph, p. 90) ... uniform variable storage for a vertex | |

shader. A uniform matrix with single- or double-precision components will | |

consume no more than 4 * min(r,c) or 8 * min(r,c) uniform components, | |

respectively. A scalar or vector uniform with double-precision components | |

will consume no more than 2<n> components, where <n> is 1 for scalars, and | |

the component count for vectors. A link error is generated ... | |
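
The bounds in the paragraph above can be sketched as a small C helper. This is an illustration only: the function name and parameters are hypothetical, and the assumption that a single-precision scalar or vector consumes at most <n> components comes from the base specification rather than from the text quoted here.

```c
/* Upper bound on the uniform components consumed by one non-array
   uniform. cols == 1 describes a scalar (rows == 1) or a vector
   (rows == component count); cols > 1 describes a matrix. */
int max_uniform_components(int cols, int rows, int is_double)
{
    if (cols == 1) {
        /* scalar or vector: <n> components, 2<n> for doubles */
        return is_double ? 2 * rows : rows;
    }
    /* matrix: 4 * min(r,c) for floats, 8 * min(r,c) for doubles */
    int m = rows < cols ? rows : cols;
    return (is_double ? 8 : 4) * m;
}
```

For example, a dvec3 is bounded by 6 components and a dmat3x4 by 24.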

(add to Table 2.13, p. 96) | |

Type Name Token Keyword | |

-------------------- ---------------- | |

DOUBLE double | |

DOUBLE_VEC2 dvec2 | |

DOUBLE_VEC3 dvec3 | |

DOUBLE_VEC4 dvec4 | |

DOUBLE_MAT2 dmat2 | |

DOUBLE_MAT3 dmat3 | |

DOUBLE_MAT4 dmat4 | |

DOUBLE_MAT2x3 dmat2x3 | |

DOUBLE_MAT2x4 dmat2x4 | |

DOUBLE_MAT3x2 dmat3x2 | |

DOUBLE_MAT3x4 dmat3x4 | |

DOUBLE_MAT4x2 dmat4x2 | |

DOUBLE_MAT4x3 dmat4x3 | |

(modify list of commands at the bottom of p. 99) | |

void Uniform{1,2,3,4}d(int location, T value); | |

void Uniform{1,2,3,4}dv(int location, sizei count, const T *value);

void UniformMatrix{2,3,4}dv | |

(int location, sizei count, boolean transpose, | |

const double *value); | |

void UniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dv | |

(int location, sizei count, boolean transpose, | |

const double *value); | |

(insert after fourth paragraph, p. 100) The Uniform*d{v} commands will | |

load <count> sets of one to four double-precision floating-point values | |

into a uniform location defined as a double, a double vector, or an array | |

of double scalars or vectors. | |

(modify fifth paragraph, p. 100) The UniformMatrix{2,3,4}fv and | |

UniformMatrix{2,3,4}dv commands will load <count> 2x2, 3x3, or 4x4 | |

matrices (corresponding to 2, 3, or 4 in the command name) of single- or | |

double-precision floating-point values, respectively, into ... | |

(replace second bullet on the middle of p. 101, regarding | |

INVALID_OPERATION errors in Uniform* commands)

* if the type of the uniform declared in the shader does not match the | |

component type and count indicated in the Uniform* command name (where | |

a boolean uniform component type is considered to match any of the | |

Uniform*i{v}, Uniform*ui{v}, or Uniform*f{v} commands), | |

(modify sixth paragraph, p. 100) The UniformMatrix{2x3,3x2,2x4, | |

4x2,3x4,4x3}fv and UniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dv commands will | |

load <count> 2x3, 3x2, 2x4, 4x2, 3x4, or 4x3 matrices (corresponding to | |

the numbers in the command name) of single- or double-precision | |

floating-point values, respectively, into ... | |
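
As a hedged illustration of preparing <value> for a non-square matrix, the C sketch below reorders a dmat2x3 (2 columns, 3 rows) from row-major application storage into column-major order. It assumes the base specification's rule that matrices are read in column-major order when <transpose> is FALSE; the helper name is hypothetical.

```c
/* Reorder a 3-row by 2-column matrix stored row by row into the
   column-major layout: column 0 (3 values), then column 1 (3 values). */
void pack_dmat2x3_column_major(const double *row_major, double *column_major)
{
    for (int col = 0; col < 2; ++col) {
        for (int row = 0; row < 3; ++row) {
            column_major[col * 3 + row] = row_major[row * 2 + col];
        }
    }
}
```

The reordered array could then be passed to UniformMatrix2x3dv with <count> of 1 and <transpose> of FALSE. Passing the original row-major array with <transpose> of TRUE is expected to have the same effect.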

(modify "Uniform Buffer Object Storage", p. 102, adding a bullet after the | |

last "Members of type", and modifying the subsequent bullet) | |

* Members of type double are extracted from a buffer object by reading a | |

single double-typed value at the specified offset. | |

* Vectors with N elements with basic data types of bool, int, uint, | |

float, or double are extracted as N values in consecutive memory | |

locations beginning at the specified offset, with components stored in | |

order with the first (X) component at the lowest offset. The GL data | |

type used for component extraction is derived according to the rules | |

for scalar members above. | |
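
The extraction rule for double-precision vectors can be mirrored on the application side. The C sketch below reads an N-component double vector from a client copy of a uniform buffer. It assumes the host "double" is the same 64-bit IEEE format the GL uses; the function name and parameters are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Read n consecutive doubles starting at byte offset, with the first
   (X) component at the lowest address. memcpy avoids any alignment
   assumptions about the buffer pointer itself. */
void read_dvec(const unsigned char *buffer, size_t offset, int n, double *out)
{
    for (int i = 0; i < n; ++i) {
        memcpy(&out[i], buffer + offset + (size_t)i * sizeof(double),
               sizeof(double));
    }
}
```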

Modify Section 2.14.6, Varying Variables, p. 106 | |

(modify third paragraph, p. 107) ... For the purposes of counting input | |

and output components consumed by a shader, variables declared as vectors, | |

matrices, and arrays will all consume multiple components. Each component | |

of variables declared as double-precision floating-point scalars, vectors, | |

or matrices may be counted as consuming two components. | |

(add after the bulleted list, p. 108) For the purposes of counting the | |

total number of components to capture, each component of outputs declared | |

as double-precision floating-point scalars, vectors, or matrices may be | |

counted as consuming two components. | |

Modify Section 2.19, Transform Feedback, p. 130 | |

(add to end of first paragraph, p. 132) ... The results of appending a | |

varying variable to a transform feedback buffer are undefined if any | |

component of that variable would be written at an offset not aligned to | |

the size of the component. | |
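
An application laying out interleaved captures may want to check this condition ahead of time. The following C sketch shows one way to do so; the helper name is hypothetical and the check is only the arithmetic implied by the sentence above.

```c
#include <stddef.h>

/* Returns 1 if a component of component_size bytes written at byte
   offset is aligned to its own size, and 0 otherwise. */
int capture_offset_is_aligned(size_t offset, size_t component_size)
{
    return component_size != 0 && offset % component_size == 0;
}
```

For example, a double captured immediately after a single float would start at offset 4, which fails the check for an 8-byte component.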

Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification | |

(Rasterization) | |

None. | |

Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification | |

(Per-Fragment Operations and the Frame Buffer) | |

None. | |

Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification | |

(Special Functions) | |

None. | |

Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification | |

(State and State Requests) | |

Modify Section 6.1.15, Shader and Program Queries, p. 332 | |

(add to the first list of commands, p. 337) | |

void GetUniformdv(uint program, int location, double *params); | |

Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile) | |

Specification (Invariance) | |

None. | |

Additions to the AGL/GLX/WGL Specifications | |

None. | |

Modifications to The OpenGL Shading Language Specification, Version 1.50 | |

(Revision 09) | |

Including the following line in a shader can be used to control the | |

language features described in this extension: | |

#extension GL_ARB_gpu_shader_fp64 : <behavior> | |

where <behavior> is as specified in section 3.3. | |

New preprocessor #defines are added to the OpenGL Shading Language: | |

#define GL_ARB_gpu_shader_fp64 1 | |

Modify Section 3.6, Keywords, p. 14 | |

(add the following to the list of keywords, p. 14) | |

double dvec2 dvec3 dvec4 | |

dmat2 dmat3 dmat4 | |

dmat2x2 dmat2x3 dmat2x4 | |

dmat3x2 dmat3x3 dmat3x4 | |

dmat4x2 dmat4x3 dmat4x4 | |

(remove "double", "dvec2", "dvec3", and "dvec4" from the list of | |

keywords reserved for future use, p. 15) | |

Modify Section 4.1, Basic Types, p. 17 | |

(add to the basic "Transparent Types" table, pp. 17-18) | |

Types Meaning | |

-------- ---------------------------------------------------------- | |

double a single double-precision floating point scalar | |

dvec2 a two-component double-precision floating-point vector

dvec3 a three-component double-precision floating-point vector

dvec4 a four-component double-precision floating-point vector

dmat2 a 2x2 double-precision floating-point matrix | |

dmat3 a 3x3 double-precision floating-point matrix | |

dmat4 a 4x4 double-precision floating-point matrix | |

dmat2x2 same as dmat2 | |

dmat2x3 a double-precision matrix with 2 columns and 3 rows | |

dmat2x4 a double-precision matrix with 2 columns and 4 rows | |

dmat3x2 a double-precision matrix with 3 columns and 2 rows | |

dmat3x3 same as dmat3 | |

dmat3x4 a double-precision matrix with 3 columns and 4 rows | |

dmat4x2 a double-precision matrix with 4 columns and 2 rows | |

dmat4x3 a double-precision matrix with 4 columns and 3 rows | |

dmat4x4 same as dmat4 | |

Modify Section 4.1.4, Floats, p. 22 | |

(modify two paragraphs of the section, adding support for doubles) | |

Single- and double-precision floating-point values are available for use | |

in a variety of scalar calculations. Floating-point variables are defined | |

as in the following example: | |

float a, b = 1.5; | |

double c, d = 2.0LF; | |

As an input value to one of the processing units, a single or | |

double-precision floating-point variable is expected to match the IEEE | |

floating-point definition for precision and dynamic range of the | |

corresponding type. It is not required that the precision of internal | |

processing for operands of type "float" match the IEEE floating-point | |

specification for floating-point operations, but the minimum guidelines | |

for precision established by the OpenGL specification must be met. | |

Treatment of conditions such as divide by 0 may lead to an unspecified | |

result, but in no case should such a condition lead to the interruption or | |

termination of processing. | |

(modify the grammar, p. 22, adding "lf" and "LF" suffixes)

floating-suffix: one of | |

f F lf LF | |

(modify last paragraph, p. 22) ... including before a suffix. When the | |

suffix "lf" or "LF" is present, the literal has type <double>. Otherwise, | |

the literal has type <float>. A leading unary ... | |
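
The typing rule can be sketched as a small C classifier, shown here for illustration only. It assumes its argument is already a well-formed floating-point literal and looks only at the last two characters; the function name is hypothetical.

```c
#include <string.h>

/* Returns 1 if the literal would have type double (suffix "lf" or
   "LF"), and 0 if it would have type float (no suffix, "f", or "F").
   Mixed-case spellings such as "Lf" are not in the grammar. */
int literal_is_double(const char *literal)
{
    size_t len = strlen(literal);
    if (len < 2) {
        return 0;
    }
    const char *suffix = literal + len - 2;
    return strcmp(suffix, "lf") == 0 || strcmp(suffix, "LF") == 0;
}
```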

Modify Section 4.1.6, Matrices, p. 23 | |

(modify the first paragraph of the section) | |

The OpenGL Shading Language has built-in types for 2x2, 2x3, 2x4, 3x2,

3x3, 3x4, 4x2, 4x3, and 4x4 matrices of single- and double-precision

floating-point numbers. Matrix types beginning with "mat" have | |

single-precision components; matrix types beginning with "dmat" have | |

double-precision components. The first number in the type is the number | |

of columns, the second is the number of rows. Example matrix declarations: | |

mat2 mat2D; | |

mat3 optMatrix; | |

mat4 view, projection; | |

mat4x4 view; // an alternate way of declaring a mat4 | |

mat3x2 m; // a matrix with 3 columns and 2 rows | |

dmat4 highPrecisionMVP; | |

dmat2x4 skinnyAndTallWithBigComponents; | |

... | |

Modify Section 4.1.10, Implicit Conversions, p. 27 | |

(modify table of implicit conversions) | |

Can be implicitly | |

Type of expression converted to | |

--------------------- ------------------- | |

int uint(*), float, double | |

ivec2 uvec2(*), vec2, dvec2 | |

ivec3 uvec3(*), vec3, dvec3 | |

ivec4 uvec4(*), vec4, dvec4 | |

uint float, double | |

uvec2 vec2, dvec2 | |

uvec3 vec3, dvec3 | |

uvec4 vec4, dvec4 | |

float double | |

vec2 dvec2 | |

vec3 dvec3 | |

vec4 dvec4 | |

mat2 dmat2 | |

mat3 dmat3 | |

mat4 dmat4 | |

mat2x3 dmat2x3 | |

mat2x4 dmat2x4 | |

mat3x2 dmat3x2 | |

mat3x4 dmat3x4 | |

mat4x2 dmat4x2 | |

mat4x3 dmat4x3 | |

(*) if ARB_gpu_shader5 or NV_gpu_shader5 is supported | |

(modify second paragraph of the section) No implicit conversions are | |

provided to convert from unsigned to signed integer types, from | |

floating-point to integer types, or from higher-precision to | |

lower-precision types. There are no implicit array or structure | |

conversions. | |

(insert before the final paragraph of the section, p. 27) When performing

implicit conversion for binary operators, there may be multiple data types | |

to which the two operands can be converted. For example, when adding an | |

int value to a uint value, both values can be implicitly converted to | |

uint, float, and double. In such cases, a floating-point type is chosen | |

if either operand has a floating-point type. Otherwise, an unsigned | |

integer type is chosen if either operand has an unsigned integer type. | |

Otherwise, a signed integer type is chosen. If operands can be implicitly | |

converted to multiple data types deriving from the same base data type, | |

the type with the smallest component size is used. | |
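
The selection rules above reduce to a short priority order for scalar operands. The C sketch below is one reading of those rules, with hypothetical names. It covers only the four numeric base types and assumes the int-to-uint conversion is available (ARB_gpu_shader5 or NV_gpu_shader5).

```c
typedef enum { T_INT, T_UINT, T_FLOAT, T_DOUBLE } BaseType;

/* Type to which both operands of a binary operator are converted.
   Floating-point wins over integer, unsigned wins over signed, and
   float is preferred over double unless an operand is already double
   (double cannot be implicitly narrowed). */
BaseType binary_operand_type(BaseType a, BaseType b)
{
    if (a == T_DOUBLE || b == T_DOUBLE) return T_DOUBLE;
    if (a == T_FLOAT  || b == T_FLOAT)  return T_FLOAT;
    if (a == T_UINT   || b == T_UINT)   return T_UINT;
    return T_INT;
}
```

Adding an int to a uint therefore yields uint, even though both operands could also be converted to float or double.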

Modify Section 4.3.4, Inputs, p. 31 | |

(modify third paragraph of the section, p. 31) ... Vertex shader inputs | |

can only be single-precision floating-point scalars, vectors, or matrices, | |

or signed and unsigned integers and integer vectors. Vertex shader inputs | |

can also form arrays of these types, but not structures. | |

(modify third paragraph, p. 32, allowing doubles as inputs and disallowing | |

as non-flat fragment inputs) ... Fragment inputs can only be signed and | |

unsigned integers and integer vectors, float, floating-point vectors, | |

double, double-precision vectors, single- or double-precision matrices, or | |

arrays or structures of these. Fragment shader inputs that are signed or | |

unsigned integers, integer vectors, doubles, double-precision vectors, or | |

double-precision matrices must be qualified with the interpolation | |

qualifier flat. | |
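
A hedged example of these input rules follows: a fragment shader source held in a C string, as an application might pass to ShaderSource. The shader is a sketch only. All variable names are hypothetical, and it assumes the preceding shader stage declares a matching "flat out dvec3 worldPos".

```c
/* Hypothetical fragment shader using double-precision inputs and
   uniforms. The double input is qualified "flat", and the result is
   narrowed explicitly because fragment outputs remain single precision. */
const char *const kFp64FragmentShader =
    "#version 150\n"
    "#extension GL_ARB_gpu_shader_fp64 : require\n"
    "\n"
    "uniform dmat4 highPrecisionMVP;\n"
    "flat in dvec3 worldPos;   // double inputs must be flat\n"
    "out vec4 fragColor;       // outputs stay single precision\n"
    "\n"
    "void main()\n"
    "{\n"
    "    double scale = 0.5LF; // LF suffix gives a double literal\n"
    "    dvec4 p = highPrecisionMVP * dvec4(worldPos * scale, 1.0LF);\n"
    "    fragColor = vec4(p);  // explicit narrowing constructor\n"
    "}\n";
```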

Modify Section 4.3.6, Outputs, p. 33 | |

(modify third paragraph of the section, p. 33) They can only be float, | |

double, single- or double-precision floating-point vectors or matrices, | |

signed or unsigned integers or integer vectors, or arrays or structures of | |

any of these.

(modify last paragraph, p. 33) ... Fragment outputs can only be float, | |

single-precision floating-point vectors, signed or unsigned integers or | |

integer vectors, or arrays of these. ... | |

Modify Section 5.4.1, Conversion and Scalar Constructors, p. 49 | |

(add double to the first list of constructor examples) | |

Converting between scalar types is done as the following prototypes | |

indicate: | |

int(uint) // converts an unsigned integer value to a signed integer | |

int(float) // converts a float value to a signed integer | |

int(double) // converts a double value to a signed integer | |

int(bool) // converts a Boolean value to a signed integer | |

uint(int) // converts a signed integer value to an unsigned integer | |

uint(float) // converts a float value to an unsigned integer | |

uint(double) // converts a double value to an unsigned integer | |

uint(bool) // converts a Boolean value to an unsigned integer | |

float(int) // converts a signed integer value to a float | |

float(uint) // converts an unsigned integer value to a float | |

float(double) // converts a double value to a float | |

float(bool) // converts a Boolean value to a float | |

double(int) // converts a signed integer value to a double | |

double(uint) // converts an unsigned integer value to a double | |

double(float) // converts a float value to a double | |

double(bool) // converts a Boolean value to a double | |

bool(int) // converts a signed integer value to a Boolean | |

bool(uint) // converts an unsigned integer value to a Boolean | |

bool(float) // converts a float value to a Boolean | |

bool(double) // converts a double value to a Boolean | |

(modify second paragraph of the section, p. 49) When constructors are used | |

to convert any floating-point type to an integer, the fractional part of | |

the floating-point value is dropped. ... | |

(modify third paragraph of the section, p. 49) When a constructor is used | |

to convert any integer or floating-point type to bool, 0 and 0.0 are | |

converted to false, and non-zero values are converted to true. When a | |

constructor is used to convert a bool to any integer or floating-point | |

type, false is converted to 0 or 0.0, and true is converted to 1 or 1.0. | |
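
C casts behave the same way for in-range values, so the two rules above can be sketched in C as follows. The function names are hypothetical, and values outside the range of "int" are left out of scope, since C does not define that conversion.

```c
/* int(double): the fractional part is dropped (truncation toward zero). */
int int_from_double(double d)
{
    return (int)d;
}

/* bool(double): 0.0 becomes false, any non-zero value becomes true. */
int bool_from_double(double d)
{
    return d != 0.0;
}

/* double(bool): false becomes 0.0 and true becomes 1.0. */
double double_from_bool(int b)
{
    return b ? 1.0 : 0.0;
}
```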

Modify Section 5.4.2, Vector and Matrix Constructors, p. 50 | |

(modify the last paragraph, p. 50) If the basic type (bool, int, uint, | |

float, or double) of a parameter to a constructor does not match the basic | |

type of the object being constructed, the scalar construction rules | |

(above) are used to convert the parameters. | |

(add to the first group of examples, p. 52) | |

dmat2(dvec2, dvec2) | |

dmat3(dvec3, dvec3, dvec3) | |

dmat4(dvec4, dvec4, dvec4, dvec4) | |

dmat2x4(dvec3, double, // first column | |

double, dvec3) // second column | |

Modify Section 5.9, Expressions, p. 57 | |

(modify bulleted list as follows, adding support for double-precision | |

floating-point types) | |

Expressions in the shading language are built from the following: | |

* Constants of type bool, int, uint, float, double, all vector types and | |

all matrix types. | |

... | |

* The arithmetic binary operators add (+), subtract (-), multiply (*), and | |

divide (/) operate on integer, single-precision floating-point, and | |

double-precision floating-point scalars, vectors, and matrices. If the | |

fundamental type (integer, single-precision floating-point, | |

double-precision floating-point) of the operands does not match, the

conversions from Section 4.1.10 "Implicit Conversions" are applied to | |

produce matching types. ... | |

* The arithmetic unary operators negate (-), post- and pre-increment and | |

decrement (-- and ++) operate on integer, single-precision | |

floating-point, or double-precision floating-point values (including | |

vectors and matrices). ... | |

* The relational operators greater than (>), less than (<), greater than

or equal (>=), and less than or equal (<=) operate only on scalar integer, single-precision

floating-point, or double-precision floating-point expressions. The | |

result is scalar Boolean. The fundamental type of the two operands must | |

match, either as specified, or after one of the implicit type | |

conversions specified in Section 4.1.10. ... | |

... | |

Modify Chapter 8, Built-in Functions, p. 81 | |

(add to description of generic types, last paragraph of p. 81) ... Where | |

the input arguments (and corresponding output) can be double, dvec2, | |

dvec3, or dvec4, <genDType> is used as the argument. ... Similarly, <mat> | |

is used for any matrix basic type with single-precision components and | |

<dmat> is used for any matrix basic type with double-precision components. | |

Modify Section 8.2, Exponential Functions, p. 83 | |

(add overloads for double-precision square roots) | |

genDType sqrt(genDType x); | |

genDType inversesqrt(genDType x); | |

Modify Section 8.3, Common Functions, p. 84 | |

(add support for double-precision floating-point multiply-add) | |

Syntax: | |

genDType fma(genDType a, genDType b, genDType c); | |

The function fma() performs a fused double-precision floating-point | |

multiply-add to compute the value a*b+c. The results of fma() may not be | |

identical to evaluating the expression (a*b)+c, because the computation | |

may be performed in a single operation with intermediate precision | |

different from that used to compute a non-fma() expression. | |

The results of fma() are guaranteed to be invariant given fixed inputs | |

<a>, <b>, and <c>, as though the result were taken from a variable | |

declared as "precise". | |

(add support for double-precision frexp and ldexp functions) | |

Syntax: | |

genDType frexp(genDType x, out genIType exp); | |

genDType ldexp(genDType x, in genIType exp); | |

The function frexp() splits each double-precision floating-point number in | |

<x> into its binary significand, a floating-point number in the range | |

[0.5, 1.0), and an integral exponent of two, such that: | |

x = significand * 2 ^ exponent | |

The significand is returned by the function; the exponent is returned in | |

the parameter <exp>. For a floating-point value of zero, the significand

and exponent are both zero. For a floating-point value that is an | |

infinity or is not a number, the results of frexp() are undefined. | |

If the input <x> is a vector, this operation is performed in a | |

component-wise manner; the value returned by the function and the value | |

written to <exp> are vectors with the same number of components as <x>. | |

The function ldexp() builds a double-precision floating-point number from | |

each significand component in <x> and the corresponding integral exponent | |

of two in <exp>, returning: | |

significand * 2 ^ exponent | |

If this product is too large to be represented as a double-precision | |

floating-point value, the result is considered undefined. | |

If the input <x> is a vector, this operation is performed in a | |

component-wise manner; the value passed in <exp> and returned by the | |

function are vectors with the same number of components as <x>. | |
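
The C library functions frexp() and ldexp() follow the same convention for finite inputs, so the component-wise behavior can be sketched in C. The wrapper names are hypothetical. Note one difference: C defines results for infinities, NaNs, and overflow, whereas the text above leaves them undefined.

```c
#include <math.h>

/* Component-wise frexp: x[i] == significand[i] * 2^exponent[i], with
   the magnitude of each non-zero significand in [0.5, 1.0). */
void frexp_dvec(const double *x, int n, double *significand, int *exponent)
{
    for (int i = 0; i < n; ++i) {
        significand[i] = frexp(x[i], &exponent[i]);
    }
}

/* Component-wise ldexp: result[i] = significand[i] * 2^exponent[i]. */
void ldexp_dvec(const double *significand, const int *exponent, int n,
                double *result)
{
    for (int i = 0; i < n; ++i) {
        result[i] = ldexp(significand[i], exponent[i]);
    }
}
```

For the input (8.0, 0.0, -3.0), the significands are (0.5, 0.0, -0.75) and the exponents are (4, 0, 2); applying ldexp_dvec to those results reproduces the input.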

(add overloads for double-precision functions) | |

genDType abs(genDType x); | |

genDType sign(genDType x); | |

genDType floor(genDType x); | |

genDType trunc(genDType x); | |

genDType round(genDType x); | |

genDType roundEven(genDType x); | |

genDType ceil(genDType x); | |

genDType fract(genDType x); | |

genDType mod(genDType x, double y); | |

genDType mod(genDType x, genDType y); | |

genDType modf(genDType x, out genDType i); | |

genDType min(genDType x, genDType y); | |

genDType min(genDType x, double y); | |

genDType max(genDType x, genDType y); | |

genDType max(genDType x, double y); | |

genDType clamp(genDType x, genDType minVal, genDType maxVal); | |

genDType clamp(genDType x, double minVal, double maxVal); | |

genDType mix(genDType x, genDType y, genDType a); | |

genDType mix(genDType x, genDType y, double a); | |

genDType mix(genDType x, genDType y, genBType a); | |

genDType step(genDType edge, genDType x); | |

genDType step(double edge, genDType x); | |

genDType smoothstep(genDType edge0, genDType edge1, genDType x); | |

genDType smoothstep(double edge0, double edge1, genDType x); | |

genBType isnan(genDType x); | |

genBType isinf(genDType x); | |

(add support for 64-bit floating-point packing and unpacking functions) | |

Syntax: | |

double packDouble2x32(uvec2 v); | |

uvec2 unpackDouble2x32(double v); | |

The function packDouble2x32() returns a double obtained by packing the | |

components of a two-component unsigned integer vector into a 64-bit value | |

and interpreting its bits according to the IEEE double-precision

floating-point representation. The first vector component specifies the | |

32 least significant bits; the second component specifies the 32 most | |

significant bits. | |

The function unpackDouble2x32() returns a two-component unsigned integer | |

vector obtained by interpreting a double using the 64-bit IEEE | |

double-precision floating-point representation and unpacking into two | |

32-bit halves. The first component of the vector contains the 32 least | |

significant bits of the double; the second component contains the 32 most

significant bits. | |
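
The bit layout described above can be reproduced on the host for testing. The C sketch below assumes the host "double" is a 64-bit IEEE value; the function names are hypothetical stand-ins for the GLSL built-ins.

```c
#include <stdint.h>
#include <string.h>

/* lo supplies the 32 least significant bits, hi the 32 most significant. */
double pack_double_2x32(uint32_t lo, uint32_t hi)
{
    uint64_t bits = ((uint64_t)hi << 32) | lo;
    double d;
    memcpy(&d, &bits, sizeof d);   /* reinterpret the bits, no conversion */
    return d;
}

/* Inverse operation: split the 64-bit pattern of d into two halves. */
void unpack_double_2x32(double d, uint32_t *lo, uint32_t *hi)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    *lo = (uint32_t)(bits & 0xFFFFFFFFu);
    *hi = (uint32_t)(bits >> 32);
}
```

For example, the value 1.0 has the bit pattern 0x3FF0000000000000, so it corresponds to the vector (0x00000000, 0x3FF00000).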

Modify Section 8.4, Geometric Functions, p. 87 | |

(add double-precision equivalents for existing geometric functions) | |

double length(genDType x); | |

double distance(genDType p0, genDType p1); | |

double dot(genDType x, genDType y); | |

dvec3 cross(dvec3 x, dvec3 y); | |

genDType normalize(genDType x); | |

genDType faceforward(genDType N, genDType I, genDType Nref); | |

genDType reflect(genDType I, genDType N); | |

genDType refract(genDType I, genDType N, double eta); | |

Modify Section 8.5, Matrix Functions, p. 89 | |

(add double-precision equivalents for existing matrix functions) | |

dmat matrixCompMult(dmat x, dmat y); | |

dmat2 outerProduct(dvec2 c, dvec2 r); | |

dmat3 outerProduct(dvec3 c, dvec3 r); | |

dmat4 outerProduct(dvec4 c, dvec4 r); | |

dmat2x3 outerProduct(dvec3 c, dvec2 r); | |

dmat3x2 outerProduct(dvec2 c, dvec3 r); | |

dmat2x4 outerProduct(dvec4 c, dvec2 r); | |

dmat4x2 outerProduct(dvec2 c, dvec4 r); | |

dmat3x4 outerProduct(dvec4 c, dvec3 r); | |

dmat4x3 outerProduct(dvec3 c, dvec4 r); | |

dmat2 transpose(dmat2 m); | |

dmat3 transpose(dmat3 m); | |

dmat4 transpose(dmat4 m); | |

dmat2x3 transpose(dmat3x2 m); | |

dmat3x2 transpose(dmat2x3 m); | |

dmat2x4 transpose(dmat4x2 m); | |

dmat4x2 transpose(dmat2x4 m); | |

dmat3x4 transpose(dmat4x3 m); | |

dmat4x3 transpose(dmat3x4 m); | |

double determinant(dmat2 m); | |

double determinant(dmat3 m); | |

double determinant(dmat4 m); | |

dmat2 inverse(dmat2 m); | |

dmat3 inverse(dmat3 m); | |

dmat4 inverse(dmat4 m); | |
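
The argument order of the non-square outerProduct overloads is easy to misread, so a C sketch of one overload may help. It follows the existing single-precision definition, in which <c> is treated as a column vector and <r> as a row vector, and uses the column-first indexing of the matrix types. The function name is hypothetical.

```c
/* dmat2x3 outerProduct(dvec3 c, dvec2 r): the result has 2 columns
   (one per component of r) and 3 rows (one per component of c).
   m is indexed as m[column][row]. */
void outer_product_dmat2x3(const double c[3], const double r[2],
                           double m[2][3])
{
    for (int col = 0; col < 2; ++col) {
        for (int row = 0; row < 3; ++row) {
            m[col][row] = c[row] * r[col];
        }
    }
}
```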

Modify Section 8.6, Vector Relational Functions, p. 90 | |

(modify the first paragraph, p. 90, adding support for relational | |

functions operating on double precision types) | |

Relational and equality operators (<, <=, >, >=, ==, !=) are defined (or | |

reserved) to operate on scalars and produce scalar Boolean results. For | |

vector results, use the following built-in functions. In the definitions | |

below, the following terms are used as placeholders for all vector types | |

for a given fundamental data type. In all cases, the sizes of the input | |

and return vectors for any particular call must match. | |

placeholder    fundamental types | |

-----------    ------------------------------------------------ | |

bvec           bvec2, bvec3, bvec4 | |

ivec           ivec2, ivec3, ivec4 | |

uvec           uvec2, uvec3, uvec4 | |

vec            vec2, vec3, vec4, dvec2, dvec3, dvec4 | |
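Because the "vec" placeholder now also covers dvec2, dvec3, and dvec4, a component-wise relational function applied to double-precision vectors returns a Boolean vector of the same size. The C sketch below models this for the three-component case. The types and function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GLSL dvec3 and bvec3. */
typedef struct { double v[3]; } dvec3_t;
typedef struct { bool v[3]; } bvec3_t;

/* Models lessThan(dvec3 x, dvec3 y): component-wise x < y. */
bvec3_t less_than3(dvec3_t x, dvec3_t y) {
    bvec3_t r;
    for (int i = 0; i < 3; ++i)
        r.v[i] = x.v[i] < y.v[i];
    return r;
}

/* Models equal(dvec3 x, dvec3 y): component-wise x == y. */
bvec3_t equal3(dvec3_t x, dvec3_t y) {
    bvec3_t r;
    for (int i = 0; i < 3; ++i)
        r.v[i] = x.v[i] == y.v[i];
    return r;
}
```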

Modify Section 9, Shading Language Grammar, p. 92 | |

!!! TBD !!! | |

GLX Protocol | |

!!! TBD | |

Dependencies on ARB_gpu_shader5 | |

If ARB_gpu_shader5 is not supported, the changes to the function | |

overloading rules in the OpenGL Shading Language Specification provided | |

there should be included in this extension. | |

Dependencies on NV_gpu_shader5 | |

This extension and NV_gpu_shader5 both provide support for shading | |

language variables with 64-bit components. If both extensions are | |

supported, the various edits describing this new support should be | |

combined. | |

Dependencies on EXT_direct_state_access | |

If EXT_direct_state_access is not supported, references to the | |

ProgramUniform*d*EXT functions should be removed. | |

If EXT_direct_state_access is supported, that specification should be | |

edited as follows: | |

(modify the ProgramUniform* language) | |

The following commands: | |

.... | |

void ProgramUniform{1,2,3,4}dEXT(uint program, int location, T value); | |

void ProgramUniform{1,2,3,4}dvEXT (uint program, int location, | |

const T *value); | |

void ProgramUniformMatrix{2,3,4}dvEXT | |

(uint program, int location, sizei count, boolean transpose, | |

const double *value); | |

void ProgramUniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dvEXT | |

(uint program, int location, sizei count, boolean transpose, | |

const double *value); | |

operate identically to the corresponding command where "Program" is | |

deleted from the name (and extension suffixes are dropped or updated | |

appropriately) except, rather than updating the currently active program | |

object, these "Program" commands update the program object named by the | |

<program> parameter. ... | |

Dependencies on NV_shader_buffer_load | |

If NV_shader_buffer_load is supported, that specification should be edited | |

as follows: | |

Modify "Section 2.20.X, Shader Memory Access" from NV_shader_buffer_load. | |

(add rules for loads of variables having the new data types from this | |

extension to the list of bullets following "When a shader dereferences a | |

pointer variable") | |

- Data of type "double" are read from or written to memory as one | |

double-typed value at the specified GPU address. | |

Errors | |

None. | |

New State | |

None. | |

New Implementation Dependent State | |

None. | |

Issues | |

(1) How do double-precision types interact with the rules for storing | |

uniforms in a buffer object? | |

RESOLVED: The rules were already written with data types larger and | |

smaller than those in the original GLSL in mind. Single precision | |

floats typically take four bytes; doubles take eight bytes. The larger | |

storage requirement for doubles means a larger alignment requirement; | |

doubles still need to be size-aligned. | |
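As a rough illustration of what size-alignment means for a uniform block, the following C sketch places scalar and vector members at aligned offsets. It assumes the "std140" vector rules from the uniform buffer object specification (base alignment of N, 2N, 4N, and 4N for one to four components of size N). The helper names are hypothetical, and matrices, arrays, and structures are not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of alignment (a power of two). */
size_t align_up(size_t offset, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Assumed std140 base alignment for a scalar or vector:
 * N, 2N, 4N, 4N for 1, 2, 3, 4 components of size N bytes. */
size_t std140_vector_alignment(size_t component_size, int components) {
    assert(components >= 1 && components <= 4);
    size_t factor = components == 1 ? 1 : components == 2 ? 2 : 4;
    return component_size * factor;
}

/* Place one member at the next aligned offset; advances *cursor past it. */
size_t place_member(size_t *cursor, size_t component_size, int components) {
    size_t off = align_up(*cursor,
                          std140_vector_alignment(component_size, components));
    *cursor = off + component_size * (size_t)components;
    return off;
}
```

Under these assumptions, a float followed by a double lands at offsets 0 and 8, and a dvec3 placed next is aligned to 32 bytes, twice the 16-byte alignment of a vec3.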

(2) Should double-precision vertex shader inputs be supported? | |

RESOLVED: Not in this extension. Such support will be added by the | |

EXT_vertex_attrib_64bit extension. | |

(3) Should double-precision fragment shader outputs be supported? | |

RESOLVED: Not in this extension. Note that we don't have | |

double-precision framebuffer formats to accept such values. | |

(4) Should transform feedback be able to capture double-precision | |

components? | |

RESOLVED: Yes. However, undefined behavior will occur unless all | |

components are captured to size-aligned offsets. | |

If any variable captured in transform feedback has double-precision | |

components, the practical requirements for defined behavior are: | |

(a) the offset of the base of a buffer object must be a multiple of | |

eight bytes; | |

(b) the amount of data captured per vertex must be a multiple of eight | |

bytes; and | |

(c) each double-precision variable captured must be aligned to a | |

multiple of eight bytes relative to the beginning of a vertex. | |

If capturing a mix of single- and double-precision components, it might | |

be necessary to use the "gl_SkipComponents1" variable from | |

ARB_transform_feedback3 to force proper alignment. | |
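The three requirements above can be checked ahead of time by an application. The C sketch below is one possible host-side check, not something the extension provides: the captured_var_t record and the function name are hypothetical, variables are assumed to be captured back to back in a single interleaved buffer, and a "gl_SkipComponents1" entry is modeled as one 4-byte component.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one captured variable (sizes in bytes). */
typedef struct {
    size_t component_size;  /* 4 for float/int, 8 for double */
    size_t components;      /* number of components captured */
} captured_var_t;

/* Returns true if the capture layout satisfies requirements (a)-(c).
 * Layouts without any double-precision variable are not restricted here. */
bool capture_is_aligned(size_t buffer_offset,
                        const captured_var_t *vars, size_t count) {
    size_t cursor = 0;        /* byte offset within one vertex */
    bool has_double = false;
    bool doubles_aligned = true;
    for (size_t i = 0; i < count; ++i) {
        if (vars[i].component_size == 8) {
            has_double = true;
            if (cursor % 8 != 0)
                doubles_aligned = false;          /* violates (c) */
        }
        cursor += vars[i].component_size * vars[i].components;
    }
    if (!has_double)
        return true;
    return buffer_offset % 8 == 0                 /* (a) */
        && cursor % 8 == 0                        /* (b) */
        && doubles_aligned;                       /* (c) */
}
```

Capturing a float followed by a dvec2 fails the check because the dvec2 starts at byte 4. Adding one skipped 4-byte component after the float moves the dvec2 to byte 8 and makes the per-vertex size 24 bytes, which passes.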

We considered the possibility of adding error checks to throw errors in | |

cases where undefined behavior might occur, but chose not to include | |

such errors. For OpenGL 3.0-style transform feedback, cases (b) and (c) | |

are solely a function of the variables captured and could be detected when a | |

program object is linked. (Such an error would be more problematic for | |

transform feedback via NV_transform_feedback, where the set of variables | |

captured can be updated without relinking.) For case (a), the | |

requirement of OpenGL 3.0 is that transform feedback buffer offsets must | |

be a multiple of 4 bytes; enforcing a stricter 8-byte alignment would | |

require either a backward-incompatible change or a Begin-time error that | |

checks the offset of transform feedback buffers against the current | |

program. | |

(5) Should we have double-precision matrix types? We didn't add integer | |

matrices, but integer matrix math is fairly uncommon. | |

RESOLVED: Yes, we will support all matrix sizes in double-precision. | |

We will also provide double-precision equivalents for all matrix | |

operators and built-in matrix functions. | |

(6) What should be done to distinguish between single- and | |

double-precision floating-point constants? | |

RESOLVED: We will use "LF" to identify double-precision floating-point | |

constants. Here, we depart from the C standard. In C, floating-point | |

constants without a suffix are implicitly double-precision and require a | |

"F" suffix to specify a single-precision constant. However, GLSL has | |

historically provided no support for double precision. Changing to C | |

rules would materially affect the behavior of pre-existing shaders that | |

add an #extension line for this extension, since constants with no | |

suffix have meant "float" up to now. Additionally, such a change would | |

likely have required that we introduce implicit conversions from double | |

to float; otherwise, assigning a constant with no suffix to a float | |

would result in a compile-time error. | |
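The C behavior that this issue contrasts with can be observed directly. The sketch below uses the C11 _Generic selection to report the type of a floating-point constant. The macro and function names are made up for this example. In GLSL with this extension the defaults are reversed: a constant with no suffix remains a float, and "LF" marks a double.

```c
#include <assert.h>

/* Evaluate to 1 if the expression has exactly the named type (C11). */
#define IS_DOUBLE(x) _Generic((x), double: 1, default: 0)
#define IS_FLOAT(x)  _Generic((x), float: 1, default: 0)

/* In C, an unsuffixed constant is a double; "F" or "f" selects float. */
static_assert(IS_DOUBLE(1.0), "unsuffixed C constant is double");
static_assert(IS_FLOAT(1.0f), "f-suffixed C constant is float");

/* Returns 1 when the C rules described above hold for this compiler. */
int c_unsuffixed_constant_is_double(void) {
    return IS_DOUBLE(1.0) && IS_FLOAT(1.0f) && !IS_DOUBLE(1.0f);
}
```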

(7) Should we require IEEE 754-compliant behavior for NaNs and | |

infinities? Denorms? | |

RESOLVED: Following historical precedent in the GLSL and OpenGL APIs | |

not defining special-case floating-point behavior, we chose not to do so | |

in this extension. | |

(8) Should we provide double-precision versions of all the built-ins that | |

take a <genType>, which are currently defined to be floats and | |

floating-point vectors? | |

RESOLVED: We provide double-precision versions of most of the built-in | |

functions supported by GLSL. We opted not to provide double-precision | |

functions for special trigonometry, exponential, derivative, and noise | |

functions. | |

(9) Are double-precision "varyings" (values passed between shader stages) | |

supported by this extension? If so, is double-precision interpolation | |

supported? | |

RESOLVED: Double-precision shader inputs and outputs are supported, | |

except for vertex shader inputs and fragment shader outputs. | |

Additionally, double-precision vertex shader inputs are provided by the | |

separate extension EXT_vertex_attrib_64bit. No known extension provides | |

double-precision fragment outputs, but that doesn't seem important since | |

OpenGL provides no pixel/texture formats with double-precision | |

components that could reasonably receive such outputs. | |

Interpolation is not supported in this extension for double-precision | |

floating-point components. As with integer types in OpenGL 3.0, | |

double-precision floating-point fragment shader inputs must be qualified | |

as "flat". | |

Note that this extension reformulates the spec language requiring "flat" | |

qualifiers, in addition to adding doubles to the list of "flat" types. | |

In GLSL 1.30, the spec applies these requirements to vertex shader | |

outputs but imposes no requirement on fragment inputs. We move this | |

requirement to fragment inputs, since vertex shader outputs may be | |

passed to tessellation or geometry shaders without interpolation, and | |

thus without the need for qualification by "flat". | |

(15) Can the 64-bit uniform APIs be used to load values for uniforms of | |

type "bool", "bvec2", "bvec3", or "bvec4"? | |

RESOLVED: No. OpenGL 2.0 and beyond did allow "bool" variables to be | |

set with Uniform*i* and Uniform*f APIs, and OpenGL 3.0 extended that | |

support to Uniform*ui* for orthogonality. But it seems pointless to | |

extend this capability forward to 64-bit Uniform APIs as well. | |

(19) Should we support any implicit conversion of matrix types, now that | |

we have both "mat4" and "dmat4"? | |

RESOLVED: No. It doesn't seem worth the trouble. | |

Revision History | |

Rev.  Date      Author    Changes | |

----  --------  --------  ----------------------------------------- | |

11    08/27/12  pbrown    Clarify that Uniform*d can not be used to load | |

                          uniforms with boolean types (bug 9345); import | |

                          issue (15) on the topic from NV_gpu_shader5. | |

10    03/23/10  pbrown    Update issues section to include fp64 issues | |

                          that were left behind in NV_gpu_shader5 when the | |

                          specs were refactored. | |

9     02/02/10  pbrown    Specify that capturing any component at an | |

                          offset that is not size-aligned results in | |

                          undefined behavior (bug 5863). | |

8     01/29/10  pbrown    Remove shading language and API support for | |

                          double-precision vertex attributes; moved to the | |

                          EXT_vertex_attrib_64bit specification (bug | |

                          5953). Added clarification disallowing | |

                          double-precision fragment shader outputs. | |

7     01/29/10  pbrown    Delete accidental modifications to the language | |

                          for equal and not equal operators (bug 5904), | |

                          which already supported all types. | |

6     01/15/10  pbrown    Modify the spec rules for counting attributes, | |

                          input and output components, and components | |

                          to capture in transform feedback to permit, | |

                          but not require, double-precision values to | |

                          require twice as many resources as single- | |

                          precision equivalents (bug 5855). | |

5     01/14/10  pbrown    Minor updates from spec reviews. | |

4     12/10/09  pbrown    Functionality updates from spec review: | |

                          Allow implicit conversion from mat*->dmat*. | |

                          Rename fmad and [un]packFloat2x32 to fma | |

                          and [un]packDouble2x32. Add overlooked | |

                          fp64 versions of geometric functions. | |

3     12/10/09  pbrown    Convert from EXT to ARB. | |

2     12/08/09  pbrown    Miscellaneous fixes from spec review: Clarified | |

                          input/output component counting rules, where | |

                          each fp64 value counts double. General typo | |

                          fixes and language clarifications. | |

1               pbrown    Internal revisions. |