blob: 11424ed7fb8d6eb0c65f3c2ccdc2c812bf6d3a8f [file] [log] [blame]
Name Strings
Slawomir Grajewski, Intel (slawomir.grajewski 'at'
Tim Foley, Intel
Brent Insko, Intel
Tomasz Janczak, Intel
Marco Salvi, Intel
Larry Seiler, Intel
Tomasz Poniecki, Intel
Last Modified Date: 08/29/2013
INTEL Revision: 3
OpenGL Extension #441
This extension is written against the OpenGL 4.4 (Core Profile)
This extension is written against version 4.40 (revision 6) of the OpenGL
Shading Language Specification.
OpenGL 4.2 or ARB_shader_image_load_store is required;GLSL 4.20 is
Graphics devices may execute in parallel fragment shaders referring to the
same window xy coordinates. Framebuffer writes are guaranteed to be
processed in primitive rasterization order, but there is no order guarantee
for other instructions and image or buffer object accesses in
The extension introduces a new GLSL built-in function,
beginFragmentShaderOrderingINTEL(), which blocks execution of a fragment
shader invocation until invocations from previous primitives that map to
the same xy window coordinates (and same sample when per-sample shading
is active) complete their execution. All memory transactions from previous
fragment shader invocations are made visible to the fragment shader
invocation that called beginFragmentShaderOrderingINTEL() when the function
Including the following line in a shader can be used to control the
language features described in this extension:
#extension GL_INTEL_fragment_shader_ordering : <behavior>
where <behavior> is as specified in section 3.3.
New preprocessor #defines are added to the OpenGL Shading Language:
#define GL_INTEL_fragment_shader_ordering 1
New Procedures and Functions
New Tokens
Additions to the OpenGL 4.4 (Core Profile) Specification
Modify section 2.11.13 Shader Memory Access
(add new text in last bullet on p. 143)
The exception is when the built-in function
beginFragmentShaderOrderingINTEL() is used. This function creates a
region in a fragment shader in which memory operations complete and are
made visible in rasterization order for fragments that overlap in xy
window coordinates.
(add new paragraph after second paragraph on p. 144)
The built-in function beginFragmentShaderOrderingINTEL() may be used to
provide ordering of reads and writes performed by fragment shader
invocations that overlap in xy window coordinates. Calling
beginFragmentShaderOrderingINTEL() guarantees that any memory
transactions issued by shader invocations from previous primitives,
mapped to same xy window coordinates (and same sample when per-sample
shading is active), complete and are visible to the shader invocation
that called beginFragmentShaderOrderingINTEL().
Modifications to the OpenGL Shading Language Specification, Version 4.40
(revision 6)
Modify Section 8.17 Shader Memory Control Functions
(add to the table, p. 178)
void beginFragmentShaderOrderingINTEL()
This function is available only in fragment shaders.
The beginFragmentShaderOrderingINTEL() function blocks fragment shader
execution until completion of all shader invocations from previous
primitives that map to the same window xy coordinates (and same sample
if appropriate). All memory transactions from previous fragment shader
invocations mapped to same xy window coordinates are made visible to
the current fragment shader invocation when this function returns.
When per-fragment shading is active, the ordering guarantee applies
regardless of whether previous and current invocations cover overlapping
or disjoint sets of samples under same fragment of window xy coordinates.
When per-sample shading is active, the ordering guarantee applies to
shader invocations mapped to the same sample number under same window xy
coordinates. This function has no effect on shader execution for fragments
with non-overlapping window xy coordinates.
The beginFragmentShaderOrderingINTEL() function can be called under control
flow, including non-uniform control flow. Synchronization and ordering
guarantees are valid only if the conditional branch is taken during
execution. It is valid for a fragment shader to dynamically execute
multiple calls to beginFragmentShaderOrderingINTEL(). In such cases, the
first executed call will block as described above, but subsequent calls
will never block.
Note there is no explicit built-in function to signal the end of the region
that should be ordered. Instead, the region that will be ordered logically
extends to the end of fragment shader execution.
GLSL code example
layout(binding = 0, rgba8) uniform image2D image;
vec4 main()
... compute output color
if (color.w > 0) // potential non-uniform control flow
... read/modify/write image // ordered access guaranteed
... no ordering guarantees (as varying branch might not be taken)
... update image again // ordered access guaranteed
New State
New Implementation Dependent State
Revision History
Rev. Date Author Changes
---- -------- -------- -----------------------------------------
1 07/30/13 S.Grajewski Initial revision.
2 08/26/13 T.Janczak removed per-image synchronization
3 08/29/13 T.Janczak incorporate internal review feedback