
 Name

     EXT_gpu_shader5

 Name Strings

     GL_EXT_gpu_shader5

 Contact

     Jon Leech (oddhack 'at' sonic.net)
     Daniel Koch, NVIDIA (dkoch 'at' nvidia.com)

 Contributors

     Daniel Koch, NVIDIA (dkoch 'at' nvidia.com)
     Pat Brown, NVIDIA (pbrown 'at' nvidia.com)
     Jesse Hall, Google
     Maurice Ribble, Qualcomm
     Bill Licea-Kane, Qualcomm
     Graham Connor, Imagination
     Ben Bowman, Imagination
     Jonathan Putsman, Imagination
     Marcin Kantoch, Mobica
     Slawomir Grajewski, Intel
     Contributors to ARB_gpu_shader5

 Notice

     Copyright (c) 2010-2013 The Khronos Group Inc. Copyright terms at
         http://www.khronos.org/registry/speccopyright.html

     Portions Copyright (c) 2013-2014 NVIDIA Corporation.

 Status

     Complete.

 Version

     Last Modified Date: March 27, 2015
     Revision: 12

 Number

     OpenGL ES Extension #178

 Dependencies

     OpenGL ES 3.1 and OpenGL ES Shading Language 3.10 are required.

     This specification is written against the OpenGL ES 3.1 (March 17,
     2014) and OpenGL ES 3.10 Shading Language (March 17, 2014)
     Specifications.

     This extension interacts with EXT_geometry_shader.

 Overview

     This extension provides a set of new features to the OpenGL ES Shading
     Language and related APIs to support capabilities of new GPUs, extending
     the capabilities of version 3.10 of the OpenGL ES Shading Language.
     Shaders using the new functionality provided by this extension should
     enable this functionality via the construct

       #extension GL_EXT_gpu_shader5 : require     (or enable)

     This extension provides a variety of new features for all shader types,
     including:

       * support for indexing into arrays of opaque types (samplers
         and atomic counters) using dynamically uniform integer expressions;

       * support for indexing into arrays of images and shader storage blocks
         using only constant integral expressions;

       * extending the uniform block capability to allow shaders to index
         into an array of uniform blocks;

       * a "precise" qualifier allowing computations to be carried out exactly
         as specified in the shader source to avoid optimization-induced
         invariance issues (which might cause cracking in tessellation);

       * new built-in functions supporting:

         * fused floating-point multiply-add operations;

       * extending the textureGather() built-in functions provided by
         OpenGL ES Shading Language 3.10:

         * allowing shaders to use arbitrary offsets computed at run-time to
           select a 2x2 footprint to gather from; and
         * allowing shaders to use separate independent offsets for each of
           the four texels returned, instead of requiring a fixed 2x2
           footprint.

 New Procedures and Functions

     None

 New Tokens

     None

 Additions to the OpenGL ES 3.1 Specification

     Add to the end of section 8.13.2, "Coordinate Wrapping and Texel
     Selection":

     ... texture source color of (0,0,0,1) for all four source texels.

     The textureGatherOffsets built-in shader functions return a vector
     derived from sampling four texels in the image array of level
     <level_base>. For each of the four texel offsets specified by the
     <offsets> argument, the rules for the LINEAR minification filter are
     applied to identify a 2x2 texel footprint, from which the single texel
     T_i0_j0 is selected. A four-component vector is then assembled by taking
     a single component from each of the four T_i0_j0 texels in the same
     manner as for the textureGather function.


 Additions to the OpenGL ES Shading Language 3.10 Specification

     Including the following line in a shader can be used to control the
     language features described in this extension:

       #extension GL_EXT_gpu_shader5 : <behavior>

     where <behavior> is as specified in section 3.4.

     A new preprocessor #define is added to the OpenGL ES Shading Language:

       #define GL_EXT_gpu_shader5        1
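
     Example (non-normative; not part of the specification edits). A minimal
     fragment shader sketch that uses the macro to fall back gracefully. It
     assumes, as is conventional for GLSL extensions, that the macro is
     defined only by compilers that support the extension. The names MADD,
     v_color, u_scale, u_bias and fragColor are hypothetical.

```glsl
#version 310 es
// Enable the extension only where the compiler advertises it.
#ifdef GL_EXT_gpu_shader5
#extension GL_EXT_gpu_shader5 : enable
#define MADD(a, b, c) fma((a), (b), (c))
#else
// Fallback: plain multiply and add when fma() is unavailable.
#define MADD(a, b, c) ((a) * (b) + (c))
#endif

precision highp float;

in vec4 v_color;        // hypothetical interpolated input
uniform vec4 u_scale;   // hypothetical uniform
uniform vec4 u_bias;    // hypothetical uniform
out vec4 fragColor;

void main()
{
    fragColor = MADD(v_color, u_scale, u_bias);
}
```

     Using "enable" rather than "require" inside the #ifdef keeps the shader
     compilable on implementations that lack the extension.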


     Modifications to Section 3.7 (Keywords)

     Remove "precise" from the list of reserved keywords and add it to the
     list of keywords.

     Remove the last paragraph from section 3.9.3 "Dynamically Uniform
     Expressions" (starting "The definition is not used in this version...")


     Add to the introduction to section 4.1.7, "Opaque Types" on p. 26:

     When aggregated into arrays within a shader, opaque types can only be
     indexed with a dynamically uniform integral expression (see section
     3.9.3) unless otherwise noted; otherwise, results are undefined.
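
     Example (non-normative; not part of the specification edits). A sketch
     of indexing a sampler array with a dynamically uniform index, here a
     uniform integer, which has the same value for every invocation in a
     draw call. The names u_layers, u_layerIndex, v_uv and fragColor are
     hypothetical, and the application is assumed to keep u_layerIndex
     within the array bounds.

```glsl
#version 310 es
#extension GL_EXT_gpu_shader5 : require
precision mediump float;

uniform sampler2D u_layers[4];   // hypothetical array of samplers
uniform int u_layerIndex;        // dynamically uniform: same for all invocations
in vec2 v_uv;
out vec4 fragColor;

void main()
{
    // Without this extension, ESSL 3.10 requires a constant integral
    // expression as the index here.
    fragColor = texture(u_layers[u_layerIndex], v_uv);
}
```

     An index derived from a per-fragment input such as v_uv would generally
     not be dynamically uniform, and results would then be undefined.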


     Replace the first paragraph of section 4.1.7.1, "Samplers" (removing the
     second sentence) on p. 27:

     Sampler types (e.g., sampler2D) are opaque types, declared and behaving
     as described above for opaque types.

     Sampler variables are ...


     Modify Section 4.3.9 "Interface Blocks", as modified by
     EXT_geometry_shader and EXT_shader_io_blocks:

     (modify the paragraph starting "For uniform or shader storage blocks
     declared as an array", removing the requirement for indexing uniform
     blocks using constant expressions)

     For uniform or shader storage blocks declared as an array, each
     individual array element corresponds to a separate buffer object bind
     range, backing one instance of the block. As the array size indicates
     the number of buffer objects needed, uniform and shader storage block
     array declarations must specify an array size. All indices used to index
     a shader storage block array must be constant integral expressions. A
     uniform block array can only be indexed with a dynamically uniform
     integral expression; otherwise, results are undefined.
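
     Example (non-normative; not part of the specification edits). A compute
     shader sketch contrasting the two rules: the uniform block array is
     indexed with a dynamically uniform value, while the shader storage
     block array is indexed only with constants. All block, instance and
     uniform names are hypothetical.

```glsl
#version 310 es
#extension GL_EXT_gpu_shader5 : require

layout(local_size_x = 64) in;

// Four instances, backed by uniform buffer bindings 0..3.
layout(std140, binding = 0) uniform Params {
    vec4 scale;
} u_params[4];

// Two instances, backed by shader storage buffer bindings 0..1.
layout(std430, binding = 0) buffer Data {
    vec4 values[];
} u_data[2];

uniform int u_paramSet;   // dynamically uniform; assumed to be in [0, 3]

void main()
{
    uint i = gl_GlobalInvocationID.x;
    if (i >= uint(u_data[0].values.length())) {
        return;   // stay inside the destination buffer
    }
    // Shader storage block arrays: constant indices only.
    // Uniform block arrays: a dynamically uniform index is allowed.
    u_data[0].values[i] = u_data[1].values[i] * u_params[u_paramSet].scale;
}
```

     The sketch assumes both storage buffers hold the same number of
     elements.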


     Add new section 4.9gs5 before section 4.10 "Order of Qualification":

     4.9gs5 The Precise Qualifier

     Some algorithms may require that floating-point computations be carried
     out in exactly the manner specified in the source code, even if the
     implementation supports optimizations that could produce nearly
     equivalent results with higher performance. For example, many GL
     implementations support a "multiply-add" that can compute values such as

       float result = (float(a) * float(b)) + float(c);

     in a single operation. The result of a floating-point multiply-add may
     not always be identical to first doing a multiply yielding a
     floating-point result, and then doing a floating-point add. By default,
     implementations are permitted to perform optimizations that effectively
     modify the order of the operations used to evaluate an expression, even
     if those optimizations may produce slightly different results relative
     to unoptimized code.

     The qualifier "precise" will ensure that operations contributing to a
     variable's value are performed in the order and with the precision
     specified in the source code. Order of evaluation is determined by
     operator precedence and parentheses, as described in Section 5.
     Expressions must be evaluated with a precision consistent with the
     operation; for example, multiplying two "float" values must produce a
     single value with "float" precision. This effectively prohibits the
     arbitrary use of fused multiply-add operations if the intermediate
     multiply result is kept at a higher precision. For example:

       precise out vec4 position;

     declares that computations used to produce the value of "position" must
     be performed precisely using the order and precision specified. As with
     the invariant qualifier (section 4.6.1), the precise qualifier may be
     used to qualify a built-in or previously declared user-defined variable
     as being precise:

       out vec3 Color;
       precise Color;            // make existing Color be precise

     This qualifier will affect the evaluation of expressions used on the
     right-hand side of an assignment if and only if:

       * the variable assigned to is qualified as "precise"; or

       * the value assigned is used later in the same function, either
         directly or indirectly, on the right-hand side of an assignment to a
         variable declared as "precise".

     Expressions computed in a function are treated as precise only if
     assigned to a variable qualified as "precise" in that same function. Any
     other expressions within a function are not automatically treated as
     precise, even if they are used to determine a value that is returned by
     the function and directly assigned to a variable qualified as "precise".

     Some examples of the use of "precise" include:

       in vec4 a, b, c, d;
       precise out vec4 v;

       float func(float e, float f, float g, float h)
       {
         return (e*f) + (g*h);            // no special precision
       }

       float func2(float e, float f, float g, float h)
       {
         precise float result = (e*f) + (g*h);  // ensures a precise return value
         return result;
       }

       void func3(float i, float j, precise out float k)
       {
         k = i * i + j;                   // precise, due to <k> declaration
       }

       void main(void)
       {
         vec3 r = vec3(a * b);           // precise, used to compute v.xyz
         vec3 s = vec3(c * d);           // precise, used to compute v.xyz
         v.xyz = r + s;                      // precise
         v.w = (a.w * b.w) + (c.w * d.w);    // precise
         v.x = func(a.x, b.x, c.x, d.x);     // values computed in func()
                                             // are NOT precise
         v.x = func2(a.x, b.x, c.x, d.x);    // precise!
         func3(a.x * b.x, c.x * d.x, v.x);   // precise!
       }
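
     Example (non-normative; not part of the specification edits). A sketch
     of qualifying a built-in variable as precise in a vertex shader, so
     that the operations contributing to gl_Position are carried out in the
     order and precision written. The names a_position and u_mvp are
     hypothetical.

```glsl
#version 310 es
#extension GL_EXT_gpu_shader5 : require

in vec4 a_position;   // hypothetical vertex attribute
uniform mat4 u_mvp;   // hypothetical model-view-projection matrix

// Re-qualify the built-in output as precise.
precise gl_Position;

void main()
{
    // The multiply is assigned to a precise variable in this function,
    // so it is evaluated as written rather than reordered or fused.
    gl_Position = u_mvp * a_position;
}
```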


     Modify Section 8.3, Common Functions, p. 104

     (add support for floating-point multiply-add)

     Syntax:

       genType fma(genType a, genType b, genType c);

     Computes and returns a * b + c.

     In uses where the return value is eventually consumed by a variable
     declared as precise:

     * fma() is considered a single operation, whereas the expression
       "a*b + c" consumed by a variable declared precise is considered two
       operations.
     * The precision of fma() can differ from the precision of the expression
       "a*b + c".
     * fma() will be computed with the same precision as any other fma()
       consumed by a precise variable, giving invariant results for the same
       input values of a, b, and c.

     Otherwise, in the absence of precise consumption, there are no special
     constraints on the number of operations or difference in precision
     between fma() and the expression "a*b + c".
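
     Example (non-normative; not part of the specification edits). A sketch
     contrasting fma() with the equivalent expression when both results are
     consumed by precise variables. The names v_a, v_b, v_c and fragColor
     are hypothetical; whether the two results actually differ depends on
     the implementation and the input values.

```glsl
#version 310 es
#extension GL_EXT_gpu_shader5 : require
precision highp float;

in vec3 v_a;
in vec3 v_b;
in vec3 v_c;
out vec4 fragColor;

void main()
{
    // Treated as a single operation.
    precise vec3 fused = fma(v_a, v_b, v_c);

    // Treated as two operations: a multiply, then an add.
    precise vec3 split = v_a * v_b + v_c;

    // Visualize any difference between the two evaluations.
    fragColor = vec4(abs(fused - split), 1.0);
}
```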


     Modify the table of functions in section 8.9.3 "Texture Gather
     Functions", changing the "Description" column for the existing
     textureGatherOffset functions on p. 127:

     Description

         Perform a texture gather operation as in textureGather offset by
         <offset> as described in textureOffset, except that the <offset> can
         be variable (non-constant) and the implementation-dependent minimum
         and maximum offset values are given by the values of
         MIN_PROGRAM_TEXTURE_GATHER_OFFSET and
         MAX_PROGRAM_TEXTURE_GATHER_OFFSET, respectively.
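
     Example (non-normative; not part of the specification edits). A sketch
     of a gather with an offset supplied at run time through a uniform. The
     names u_tex, u_offset, v_uv and fragColor are hypothetical, and the
     application is assumed to keep u_offset within the queried
     MIN_PROGRAM_TEXTURE_GATHER_OFFSET and MAX_PROGRAM_TEXTURE_GATHER_OFFSET
     limits.

```glsl
#version 310 es
#extension GL_EXT_gpu_shader5 : require
precision mediump float;

uniform sampler2D u_tex;
uniform ivec2 u_offset;   // non-constant offset, set by the application
in vec2 v_uv;
out vec4 fragColor;

void main()
{
    // Gather the red component (comp = 0) of the 2x2 footprint that is
    // displaced by u_offset texels from v_uv.
    fragColor = textureGatherOffset(u_tex, v_uv, u_offset, 0);
}
```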


     Add new textureGatherOffsets functions to the same table, on p. 127:

     Syntax

         gvec4 textureGatherOffsets(gsampler2D sampler, vec2 P,
                                    ivec2 offsets[4] [, int comp])
         gvec4 textureGatherOffsets(gsampler2DArray sampler, vec3 P,
                                    ivec2 offsets[4] [, int comp])
         vec4 textureGatherOffsets(sampler2DShadow sampler, vec2 P,
                                   float refZ, ivec2 offsets[4])
         vec4 textureGatherOffsets(sampler2DArrayShadow sampler, vec3 P,
                                   float refZ, ivec2 offsets[4])

     Description

         Operate identically to textureGatherOffset except that <offsets> is
         used to determine the location of the four texels to sample. Each of
         the four texels is obtained by applying the corresponding offset in
          <offsets> as a (u,v) coordinate offset to <P>, identifying the
         four-texel linear footprint, and then selecting texel (i0,j0) of
         that footprint. The specified values in <offsets> must be constant
         integral expressions.
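
     Example (non-normative; not part of the specification edits). A sketch
     that gathers four texels in a cross pattern around the sample point
     and averages them. The names u_tex, v_uv, kOffsets and fragColor are
     hypothetical; the offsets are constants, as required, and are assumed
     to lie within the implementation's gather offset limits.

```glsl
#version 310 es
#extension GL_EXT_gpu_shader5 : require
precision mediump float;

uniform sampler2D u_tex;
in vec2 v_uv;
out vec4 fragColor;

// One independent, constant offset per returned texel.
const ivec2 kOffsets[4] = ivec2[4](ivec2( 0, -2),
                                   ivec2(-2,  0),
                                   ivec2( 2,  0),
                                   ivec2( 0,  2));

void main()
{
    // Each component of r holds the red channel (comp = 0) of the texel
    // selected by the corresponding offset.
    vec4 r = textureGatherOffsets(u_tex, v_uv, kOffsets, 0);

    // Average the four gathered values.
    fragColor = vec4(dot(r, vec4(0.25)));
}
```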

 New Implementation Dependent State

     None.

 Issues

     Note: These issues apply specifically to the definition of the
     EXT_gpu_shader5 specification, which is based on the OpenGL extension
     ARB_gpu_shader5 as updated in OpenGL 4.x. Resolved issues from
     ARB_gpu_shader5 have been removed, but some remain applicable to this
     extension. ARB_gpu_shader5 can be found in the OpenGL Registry.

     (1) What functionality was removed relative to ARB_gpu_shader5?

       - Instanced geometry support (moved into EXT_geometry_shader)
       - Implicit conversions (moved to EXT_shader_implicit_conversions)
       - Interactions with features not supported by the underlying
         ES 3.1 API and Shading Language, including:
         * interactions with ARB_gpu_shader_fp64 and NV_gpu_shader5, including
           support for double-precision in implicit conversions and function
           overload resolution
         * multiple vertex streams (these require ARB_transform_feedback3)
         * textureGather built-in variants for cube map array and rectangle
           texture samplers.
         * shading language function overloading rules involving the type
           double
       - Functionality already in OpenGL ES 3.00, including packing and
         unpacking of 16-bit types and converting floating-point values to or
         from their integer bit encodings.
       - Functionality already in OpenGL ES 3.10, including
         * splitting and building floating-point numbers from a significand and
           exponent, integer bitfield manipulation, and packing and unpacking
           vectors of 8-bit fixed-point data types.
         * a subset of the textureGather and textureGatherOffset builtins
           (but some textureGather builtins remain in this extension).
       - Functionality already in OES_sample_variables, including support for
         reading a mask of covered samples in a fragment shader.
       - Functionality already in OES_shader_multisample_interpolation,
         including support for interpolating a fragment shader input at a
         programmable offset relative to the pixel center, a programmable
         sample number, or at the centroid.
       - MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS (Issue 8).

     (2) What functionality was changed and added relative to
         ARB_gpu_shader5?

       - Support for indexing into arrays of samplers was extended to all
         opaque types, and the description of allowed indices was rewritten
         in terms of dynamically uniform expressions, as was done when
         ARB_gpu_shader5 was promoted into OpenGL 4.0.
       - The only remaining API interaction is an increase in a
         minimum-maximum value, so no "Changes to the OpenGL ES Specification"
         sections are included above.
       - Arrays of images and shader storage blocks can only be indexed
         with constant integral expressions.

     (3) What should the rules on GLSL suffixing be?

     RESOLVED: "precise" is not a reserved keyword in ESSL 3.00, but it is
     a keyword in GLSL 4.40. ESSL 3.10 updated the reserved keyword list
     to include all keywords used or reserved in GLSL 4.40 (but not otherwise
     used in ES) and thus we can use "precise" in this spec by moving it
     from the reserved keywords section. See bug 11179.

     (4) Are changes to the "Order of Qualification" section needed?

     RESOLVED. No. ESSL 3.10 relaxes the ordering constraints similarly to
     GLSL 4.40, so no modifications to section 4.7 in 3.00 (4.10 in 3.10)
     are needed in this extension.

     (5) Are any more changes needed to the descriptions of texture gather?

     Probably not. Bug 11109 suggests cleanup to be applied to both desktop
     API and language specifications to make them cleaner and more
     consistent. The important parts of this cleanup were done in the texture
     gather functionality folded into ES 3.1, although some small language
     tweaks may still be needed.

     (6) Moved to EXT_shader_implicit_conversions Issue 4.

     (7) Should uniform and shader storage blocks be backable with buffer
         object subranges?

     RESOLVED: Yes. The section 4.3.7 "Interface Blocks" language picked up
     from desktop GL allows this (they are called "bind ranges"). This is a
     spec oversight in ES, because BindBufferRange is fully supported in
     OpenGL ES 3.0.

     (8) Where is MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS?

     RESOLVED. It was not added in Core GL because ARB_texture_gather and
     ARB_gpu_shader5 were both added to GL 4.0 and thus the query was
     unneeded. Since OpenGL ES 3.1 also includes texture gather and the
     multi-component gather support from gpu_shader5, the query was also
     unnecessary there and here.  Bug 11002.

     (9) Some vendors may not be able to support dynamic indexing
     of arrays of images or shader storage blocks. What should we use instead?

     RESOLVED: Only allowing 'constant integral expression' instead of
     'dynamically uniform integer expression' for arrays of images or shader
     storage blocks. For images this is done by carving out an exception in the
     general language for opaque types. For shader storage blocks, different
     rules are given for arrays of uniform blocks and arrays of shader storage
     blocks.

 Revision History

     Revision 1, 2013/10/27 (Jon Leech)
         - Initial version based on ARB_gpu_shader5

     Revision 2, 2013/11/06 (Jon Leech)
         - Update Issues list with unresolved issues 4-7, which are dependent
           on decisions to be made by the ARB and ES working groups.
         - Remove {un,}packUnorm2x16EXT (already in ESSL 3.00)
         - Match changes to ES 3.1 texture gather language, but still
           reorganize the textureGather functions into their own subsection &
           table. ES 3.1 restored the [, int comp] argument to the functions
           it defined. Removed sampler2DRect variants incorrectly left in.
         - Clean up function overloading example text and opened bug 11178 to
           resolve possible problems with the GLSL 4.40 language this is
           based on.
         - Remove reference to image2DMS, since there is no longer any image
           load/store support for multisample textures in ES 3.1
         - Add issue (8) regarding "bind ranges".

     Revision 3, 2013/11/14 (Jon Leech)
         - Resolve function overloading issue 7, per bug 11178.

     Revision 4, 2013/11/20 (Jon Leech)
         - Sync with ES 3.1 spec language update.
         - Refer to ES 3.1 instead of ES 3plus.

     Revision 5, 2013/11/21 (Daniel Koch)
         - removed implicit conversion language (to a separate document).
         - updated textureGather functions to reflect the shadow gather
           functionality being added in ES 3.1.
         - added issue 9.

     Revision 6, 2013/12/18 (Daniel Koch)
         - minor cleanup
         - added issue 10, restrict arrays of images to const-int-expr

     Revision 7, 2014/02/12 (Daniel Koch)
         - restrict indexing arrays of shader storage blocks to const-int-expr.
         - Resolved issues 4, 5, 8, 9, 10 and supporting edits.

     Revision 8, 2014/03/10 (Jon Leech)
         - Rebase on OpenGL ES 3.1 and change suffix to EXT.
         - Remove textureGather functions already present in the existing
           GLSL-ES 3.10 spec section 8.9.3

     Revision 9, 2014/03/26 (Daniel Koch)
         - update contributors

     Revision 10, 2014/03/28 (Jon Leech)
         - Sync with released ES 3.1 specs. Reflow text.

     Revision 11, 2014/04/01 (Daniel Koch)
         - Update contributors

     Revision 12, 2015/03/27 (Daniel Koch)
         - Add missing function and token sections.
	Name

	EXT_gpu_shader5

	Name Strings

	GL_EXT_gpu_shader5

	Contact

	Jon Leech (oddhack 'at' sonic.net)
	Daniel Koch, NVIDIA (dkoch 'at' nvidia.com)

	Contributors

	Daniel Koch, NVIDIA (dkoch 'at' nvidia.com)
	Pat Brown, NVIDIA (pbrown 'at' nvidia.com)
	Jesse Hall, Google
	Maurice Ribble, Qualcomm
	Bill Licea-Kane, Qualcomm
	Graham Connor, Imagination
	Ben Bowman, Imagination
	Jonathan Putsman, Imagination
	Marcin Kantoch, Mobica
	Slawomir Grajewski, Intel
	Contributors to ARB_gpu_shader5

	Notice

	Copyright (c) 2010-2013 The Khronos Group Inc. Copyright terms at
	http://www.khronos.org/registry/speccopyright.html

	Portions Copyright (c) 2013-2014 NVIDIA Corporation.

	Status

	Complete.

	Version

	Last Modified Date: March 27, 2015
	Revision: 12

	Number

	OpenGL ES Extension #178

	Dependencies

	OpenGL ES 3.1 and OpenGL ES Shading Language 3.10 are required.

	This specification is written against the OpenGL ES 3.1 (March 17,
	2014) and OpenGL ES 3.10 Shading Language (March 17, 2014)
	Specifications.

	This extension interacts with EXT_geometry_shader.

	Overview

	This extension provides a set of new features to the OpenGL ES Shading
	Language and related APIs to support capabilities of new GPUs, extending
	the capabilities of version 3.10 of the OpenGL ES Shading Language.
	Shaders using the new functionality provided by this extension should
	enable this functionality via the construct

	#extension GL_EXT_gpu_shader5 : require (or enable)

	This extension provides a variety of new features for all shader types,
	including:

	* support for indexing into arrays of opaque types (samplers,
	and atomic counters) using dynamically uniform integer expressions;

	* support for indexing into arrays of images and shader storage blocks
	using only constant integral expressions;

	* extending the uniform block capability to allow shaders to index
	into an array of uniform blocks;

	* a "precise" qualifier allowing computations to be carried out exactly
	as specified in the shader source to avoid optimization-induced
	invariance issues (which might cause cracking in tessellation);

	* new built-in functions supporting:

	* fused floating-point multiply-add operations;

	* extending the textureGather() built-in functions provided by
	OpenGL ES Shading Language 3.10:

	* allowing shaders to use arbitrary offsets computed at run-time to
	select a 2x2 footprint to gather from; and
	* allowing shaders to use separate independent offsets for each of
	the four texels returned, instead of requiring a fixed 2x2
	footprint.

	New Procedures and Functions

	None

	New Tokens

	None

	Additions to the OpenGL ES 3.1 Specification

	Add to the end of section 8.13.2, "Coordinate Wrapping and Texel
	Selection":

	... texture source color of (0,0,0,1) for all four source texels.

	The textureGatherOffsets built-in shader functions return a vector
	derived from sampling four texels in the image array of level
	<level_base>. For each of the four texel offsets specified by the
	<offsets> argument, the rules for the LINEAR minification filter are
	applied to identify a 2x2 texel footprint, from which the single texel
	T_i0_j0 is selected. A four-component vector is then assembled by taking
	a single component from each of the four T_i0_j0 texels in the same
	manner as for the textureGather function.


	Additions to the OpenGL ES Shading Language 3.10 Specification

	Including the following line in a shader can be used to control the
	language features described in this extension:

	#extension GL_EXT_gpu_shader5 : <behavior>

	where <behavior> is as specified in section 3.4.

	A new preprocessor #define is added to the OpenGL ES Shading Language:

	#define GL_EXT_gpu_shader5 1


	Modifications to Section 3.7 (Keywords)

	Remove "precise" from the list of reserved keywords and add it to the
	list of keywords.

	Remove the last paragraph from section 3.9.3 "Dynamically Uniform
	Expressions" (starting "The definition is not used in this version...")


	Add to the introduction to section 4.1.7, "Opaque Types" on p. 26:

	When aggregated into arrays within a shader, opaque types can only be
	indexed with a dynamically uniform integral expression (see section
	3.9.3) unless otherwise noted; otherwise, results are undefined.


	Replace the first paragraph of section 4.1.7.1, "Samplers" (removing the
	second sentence) on p. 27:

	Sampler types (e.g., sampler2D) are opaque types, declared and behaving
	as described above for opaque types.

	Sampler variables are ...



	Modify Section 4.3.9 "Interface Blocks", as modified by
	EXT_geometry_shader and EXT_shader_io_blocks:

	(modify the paragraph starting "For uniform or shader storage blocks
	declared as an array", removing the requirement for indexing uniform
	blocks using constant expressions)

	For uniform or shader storage blocks declared as an array, each
	individual array element corresponds to a separate buffer object bind
	range, backing one instance of the block. As the array size indicates
	the number of buffer objects needed, uniform and shader storage block
	array declarations must specify an array size. All indices used to index
	a shader storage block array must be constant integral expressions. A
	uniform block array can only be indexed with a dynamically uniform
	integral expression, otherwise results are undefined.


	Add new section 4.9gs5 before section 4.10 "Order of Qualification":

	4.9gs5 The Precise Qualifier

	Some algorithms may require that floating-point computations be carried
	out in exactly the manner specified in the source code, even if the
	implementation supports optimizations that could produce nearly
	equivalent results with higher performance. For example, many GL
	implementations support a "multiply-add" that can compute values such as

	float result = (float(a) * float(b)) + float(c);

	in a single operation. The result of a floating-point multiply-add may
	not always be identical to first doing a multiply yielding a
	floating-point result, and then doing a floating-point add. By default,
	implementations are permitted to perform optimizations that effectively
	modify the order of the operations used to evaluate an expression, even
	if those optimizations may produce slightly different results relative
	to unoptimized code.

	The qualifier "precise" will ensure that operations contributing to a
	variable's value are performed in the order and with the precision
	specified in the source code. Order of evaluation is determined by
	operator precedence and parentheses, as described in Section &5.
	Expressions must be evaluated with a precision consistent with the
	operation; for example, multiplying two "float" values must produce a
	single value with "float" precision. This effectively prohibits the
	arbitrary use of fused multiply-add operations if the intermediate
	multiply result is kept at a higher precision. For example:

	precise out vec4 position;

	declares that computations used to produce the value of "position" must
	be performed precisely using the order and precision specified. As with
	the invariant qualifier (section &4.6.1), the precise qualifier may be
	used to qualify a built-in or previously declared user-defined variable
	as being precise:

	out vec3 Color;
	precise Color; // make existing Color be precise

	This qualifier will affect the evaluation of expressions used on the
	right-hand side of an assignment if and only if:

	* the variable assigned to is qualified as "precise"; or

	* the value assigned is used later in the same function, either
	directly or indirectly, on the right-hand of an assignment to a
	variable declared as "precise".

	Expressions computed in a function are treated as precise only if
	assigned to a variable qualified as "precise" in that same function. Any
	other expressions within a function are not automatically treated as
	precise, even if they are used to determine a value that is returned by
	the function and directly assigned to a variable qualified as "precise".

	Some examples of the use of "precise" include:

	in vec4 a, b, c, d;
	precise out vec4 v;

	float func(float e, float f, float g, float h)
	{
	return (ef) + (gh); // no special precision
	}

	float func2(float e, float f, float g, float h)
	{
	precise result = (ef) + (gh); // ensures a precise return value
	return result;
	}

	float func3(float i, float j, precise out float k)
	{
	k = i * i + j; // precise, due to <k> declaration
	}

	void main(void)
	{
	vec4 r = vec3(a * b); // precise, used to compute v.xyz
	vec4 s = vec3(c * d); // precise, used to compute v.xyz
	v.xyz = r + s; // precise
	v.w = (a.w * b.w) + (c.w * d.w); // precise
	v.x = func(a.x, b.x, c.x, d.x); // values computed in func()
	// are NOT precise
	v.x = func2(a.x, b.x, c.x, d.x); // precise!
	func3(a.x * b.x, c.x * d.x, v.x); // precise!
	}


	Modify Section 8.3, Common Functions, p. 104

	(add support for floating-point multiply-add)

	Syntax:

	genType fma(genType a, genType b, genType c);

	Computes and returns a * b + c.

	In uses where the return value is eventually consumed by a variable
	declared as precise:

	* fma() is considered a single operation, whereas the expression
	"a*b + c" consumed by a variable declared precise is considered two
	operations.
	* The precision of fma() can differ from the precision of the expression
	"a*b + c".
	* fma() will be computed with the same precision as any other fma()
	consumed by a precise variable, giving invariant results for the same
	input values of a, b, and c.

	Otherwise, in the absence of precise consumption, there are no special
	constraints on the number of operations or difference in precision
	between fma() and the expression "a*b + c".


	Modify the table of functions in section 8.9.3 "Texture Gather
	Functions", changing the "Description" column for the existing
	textureGatherOffset functions on p. 127:

	Description

	Perform a texture gather operation as in textureGather offset by
	<offset> as described in textureOffset, except that the <offset> can
	be variable (non-constant) and the implementation-dependent minimum
	and maximum offset values are given by the values of
	MIN_PROGRAM_TEXTURE_GATHER_OFFSET and
	MAX_PROGRAM_TEXTURE_GATHER_OFFSET, respectively.


	Add new textureGatherOffsets functions to the same table, on p. 127:

	Syntax

	gvec4 textureGatherOffsets(gsampler2D sampler, vec2 P,
	ivec2 offsets[4] [, int comp])
	gvec4 textureGatherOffsets(gsampler2DArray sampler, vec3 P,
	ivec2 offsets[4] [, int comp])
	vec4 textureGatherOffsets(sampler2DShadow sampler, vec2 P,
	float refZ, ivec2 offsets[4])
	vec4 textureGatherOffsets(sampler2DArrayShadow sampler, vec3 P,
	float refZ, ivec2 offsets[4])

	Description

	Operate identically to textureGatherOffset except that <offsets> is
	used to determine the location of the four texels to sample. Each of
	the four texels is obtained by applying the corresponding offset in
	<offsets> as a (u,v) coordinate offset to <coord>, identifying the
	four-texel linear footprint, and then selecting texel (i0,j0) of
	that footprint. The specified values in <offsets> must be constant
	integral expressions.

	New Implementation Dependent State

	None.

	Issues

	Note: These issues apply specifically to the definition of the
	EXT_gpu_shader5 specification, which is based on the OpenGL extension
	ARB_gpu_shader5 as updated in OpenGL 4.x. Resolved issues from
	ARB_gpu_shader5 have been removed, but some remain applicable to this
	extension. ARB_gpu_shader5 can be found in the OpenGL Registry.

	(1) What functionality was removed relative to ARB_gpu_shader5?

	- Instanced geometry support (moved into EXT_geometry_shader)
	- Implicit conversions (moved to EXT_shader_implicit_conversions)
	- Interactions with features not supported by the underlying
	ES 3.1 API and Shading Language, including:
	* interactions with ARB_gpu_shader_fp64 and NV_gpu_shader, including
	support for double-precision in implicit conversions and function
	overload resolution
	* multiple vertex streams (these require ARB_transform_feedback3)
	* textureGather built-in variants for cube map array and rectangle
	texture samplers
	* shading language function overloading rules involving the type
	double
	- Functionality already in OpenGL ES 3.00, including packing and
	unpacking of 16-bit types and converting floating-point values to or
	from their integer bit encodings.
	- Functionality already in OpenGL ES 3.10, including
	* splitting and building floating-point numbers from a significand and
	exponent, integer bitfield manipulation, and packing and unpacking
	vectors of 8-bit fixed-point data types.
	* a subset of the textureGather and textureGatherOffset builtins
	(but some textureGather builtins remain in this extension).
	- Functionality already in OES_sample_variables, including support for
	reading a mask of covered samples in a fragment shader.
	- Functionality already in OES_shader_multisample_interpolation,
	including support for interpolating a fragment shader input at a
	programmable offset relative to the pixel center, a programmable
	sample number, or at the centroid.
	- MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS (Issue 8).

	(2) What functionality was changed and added relative to
	ARB_gpu_shader5?

	- Support for indexing into arrays of samplers was extended to all
	opaque types, and the description of allowed indices was rewritten
	in terms of dynamically uniform expressions, as was done when
	ARB_gpu_shader5 was promoted into OpenGL 4.0.
	- The only remaining API interaction is an increase in a
	minimum-maximum value, so no "Changes to the OpenGL ES Specification"
	sections are included above.
	- Arrays of images and shader storage blocks can only be indexed
	with constant integral expressions.

	(3) What should the rules on GLSL suffixing be?

	RESOLVED: "precise" is not a reserved keyword in ESSL 3.00, but it is
	a keyword in GLSL 4.40. ESSL 3.10 updated the reserved keyword list
	to include all keywords used or reserved in GLSL 4.40 (but not otherwise
	used in ES) and thus we can use "precise" in this spec by moving it
	from the reserved keywords section. See bug 11179.

	(4) Are changes to the "Order of Qualification" section needed?

	RESOLVED: No. ESSL 3.10 relaxes the ordering constraints similarly to
	GLSL 4.40, so this extension needs no modifications to section 4.7
	in 3.00 (4.10 in 3.10).

	(5) Are any more changes needed to the descriptions of texture gather?

	Probably not. Bug 11109 suggests cleanup to be applied to both desktop
	API and language specifications to make them cleaner and more
	consistent. The important parts of this cleanup were done in the texture
	gather functionality folded into ES 3.1, although some small language
	tweaks may still be needed.

	(6) Moved to EXT_shader_implicit_conversions Issue 4.

	(7) Should uniform and shader storage blocks be backable with buffer
	object subranges?

	RESOLVED: Yes. The section 4.3.7 "Interface Blocks" language picked up
	from desktop GL allows this (they are called "bind ranges"). This is a
	spec oversight in ES, because BindBufferRange is fully supported in
	OpenGL ES 3.0.
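
	To make the "bind range" idea concrete, the following C sketch lays
	out several blocks inside a single buffer object and computes the
	offset and size of each subrange. It is an illustration only: the
	BufferRange type and the function names are hypothetical, and the
	alignment is passed in as a parameter; an application would query it
	(for example UNIFORM_BUFFER_OFFSET_ALIGNMENT for uniform blocks)
	before relying on the result.

```c
#include <stddef.h>

/* Hypothetical description of one subrange of a buffer object. */
typedef struct {
    size_t offset;   /* byte offset of the range within the buffer */
    size_t size;     /* byte size of the range */
} BufferRange;

/* Round value up to the next multiple of alignment (alignment > 0). */
static size_t align_up(size_t value, size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

/*
 * Place count blocks back to back in one buffer, starting each block at
 * an aligned offset. Returns the total number of bytes the buffer needs.
 *
 * Each resulting range could then back one block with a call such as
 *   glBindBufferRange(GL_UNIFORM_BUFFER, index, buffer, offset, size);
 */
static size_t layout_ranges(const size_t *sizes, size_t count,
                            size_t alignment, BufferRange *out) {
    size_t cursor = 0;
    for (size_t i = 0; i < count; ++i) {
        cursor = align_up(cursor, alignment);
        out[i].offset = cursor;
        out[i].size = sizes[i];
        cursor += sizes[i];
    }
    return cursor;
}
```

	For block sizes of 64, 100, and 16 bytes and an assumed alignment
	of 256, this yields offsets 0, 256, and 512 and a total buffer size
	of 528 bytes.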

	(8) Where is MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS?

	RESOLVED. It was not added in Core GL because ARB_texture_gather and
	ARB_gpu_shader5 were both added to GL 4.0 and thus the query was
	unneeded. Since OpenGL ES 3.1 also includes texture gather and the
	multi-component gather support from gpu_shader5, the query was also
	unnecessary there and here. Bug 11002.

	(9) Some vendors may not be able to support dynamic indexing
	of arrays of images or shader storage blocks. What should we use instead?

	RESOLVED: Only allow a 'constant integral expression' instead of a
	'dynamically uniform integer expression' when indexing arrays of images
	or shader storage blocks. For images this is done by carving out an
	exception in the
	general language for opaque types. For shader storage blocks, different
	rules are given for arrays of uniform blocks and arrays of shader storage
	blocks.

	Revision History

	Revision 1, 2013/10/27 (Jon Leech)
	- Initial version based on ARB_gpu_shader5

	Revision 2, 2013/11/06 (Jon Leech)
	- Update Issues list with unresolved issues 4-7, which are dependent
	on decisions to be made by the ARB and ES working groups.
	- Remove {un,}packUnorm2x16EXT (already in ESSL 3.00)
	- Match changes to ES 3.1 texture gather language, but still
	reorganize the textureGather functions into their own subsection &
	table. ES 3.1 restored the [, int comp] argument to the functions
	it defined. Removed sampler2DRect variants incorrectly left in.
	- Clean up function overloading example text and opened bug 11178 to
	resolve possible problems with the GLSL 4.40 language this is
	based on.
	- Remove reference to image2DMS, since there is no longer any image
	load/store support for multisample textures in ES 3.1
	- Add issue (8) regarding "bind ranges".

	Revision 3, 2013/11/14 (Jon Leech)
	- Resolve function overloading issue 7, per bug 11178.

	Revision 4, 2013/11/20 (Jon Leech)
	- Sync with ES 3.1 spec language update.
	- Refer to ES 3.1 instead of ES 3plus.

	Revision 5, 2013/11/21 (Daniel Koch)
	- removed implicit conversion language (to a separate document).
	- updated textureGather functions to reflect the shadow gather
	functionality being added in ES 3.1.
	- added issue 9.

	Revision 6, 2013/12/18 (Daniel Koch)
	- minor cleanup
	- added issue 10, restrict arrays of images to const-int-expr

	Revision 7, 2014/02/12 (Daniel Koch)
	- restrict indexing arrays of shader storage blocks to const-int-expr.
	- Resolved issues 4, 5, 8, 9, 10 and supporting edits.

	Revision 8, 2014/03/10 (Jon Leech)
	- Rebase on OpenGL ES 3.1 and change suffix to EXT.
	- Remove textureGather functions already present in the existing
	GLSL-ES 3.10 spec section 8.9.3

	Revision 9, 2014/03/26 (Daniel Koch)
	- update contributors

	Revision 10, 2014/03/28 (Jon Leech)
	- Sync with released ES 3.1 specs. Reflow text.

	Revision 11, 2014/04/01 (Daniel Koch)
	- Update contributors

	Revision 12, 2015/03/27 (Daniel Koch)
	- Add missing function and token sections.