blob: b733b0736eb222fdcd4012d42231e8ed06d6c90f [file] [log] [blame]
Name
NV_shading_rate_image
Name Strings
GL_NV_shading_rate_image
Contact
Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)
Contributors
Daniel Koch, NVIDIA
Mark Kilgard, NVIDIA
Jeff Bolz, NVIDIA
Mathias Schott, NVIDIA
Pyarelal Knowles, NVIDIA
Status
Shipping
Version
Last Modified: March 15, 2019
Revision: 2
Number
OpenGL Extension #531
OpenGL ES Extension #315
Dependencies
This extension is written against the OpenGL 4.5 Specification
(Compatibility Profile), dated October 24, 2016.
OpenGL 4.5 or OpenGL ES 3.2 is required.
This extension requires support for the OpenGL Shading Language (GLSL)
extension "NV_shading_rate_image", which can be found at the Khronos Group
Github site here:
https://github.com/KhronosGroup/GLSL
This extension interacts trivially with ARB_sample_locations and
NV_sample_locations.
This extension interacts with NV_scissor_exclusive.
This extension interacts with NV_conservative_raster.
This extension interacts with NV_conservative_raster_underestimation.
This extension interacts with EXT_raster_multisample.
NV_framebuffer_mixed_samples is required.
If implemented in OpenGL ES, at least one of NV_viewport_array or
OES_viewport_array is required.
Overview
By default, OpenGL runs a fragment shader once for each pixel covered by a
primitive being rasterized. When using multisampling, the outputs of that
fragment shader are broadcast to each covered sample of the fragment's
pixel. When using multisampling, applications can also request that the
fragment shader be run once per color sample (when using the "sample"
qualifier on one or more active fragment shader inputs), or run a fixed
number of times per pixel using SAMPLE_SHADING enable and the
MinSampleShading frequency value. In all of these approaches, the number
of fragment shader invocations per pixel is fixed, based on API state.
This extension allows applications to bind and enable a shading rate image
that can be used to vary the number of fragment shader invocations across
the framebuffer. This can be useful for applications like eye tracking
for virtual reality, where the portion of the framebuffer that the user is
looking at directly can be processed at high frequency, while distant
corners of the image can be processed at lower frequency. The shading
rate image is an immutable-format two-dimensional or two-dimensional array
texture that uses a format of R8UI. Each texel represents a fixed-size
rectangle in the framebuffer, covering 16x16 pixels in the initial
implementation of this extension. When rasterizing a primitive covering
one of these rectangles, the OpenGL implementation reads the texel in the
bound shading rate image and looks up the fetched value in a palette of
shading rates. The shading rate used can vary from (finest) 16 fragment
shader invocations per pixel to (coarsest) one fragment shader invocation
for each 4x4 block of pixels.
When this extension is advertised by an OpenGL implementation, the
implementation must also support the GLSL extension
"GL_NV_shading_rate_image" (documented separately), which provides new
built-in variables that allow fragment shaders to determine the effective
shading rate used for each fragment. Additionally, the GLSL extension also
provides new layout qualifiers allowing the interlock functionality provided
by ARB_fragment_shader_interlock to guarantee mutual exclusion across an
entire fragment when the shading rate specifies multiple pixels per fragment
shader invocation.
Note that this extension requires the use of a framebuffer object; the
shading rate image and related state are ignored when rendering to the
default framebuffer.
New Procedures and Functions
void BindShadingRateImageNV(uint texture);
void ShadingRateImagePaletteNV(uint viewport, uint first, sizei count,
const enum *rates);
void GetShadingRateImagePaletteNV(uint viewport, uint entry,
enum *rate);
void ShadingRateImageBarrierNV(boolean synchronize);
void ShadingRateImageBarrierNV(enum order);
void ShadingRateSampleOrderCustomNV(enum rate, uint samples,
const int *locations);
void GetShadingRateSampleLocationivNV(enum rate, uint samples,
uint index, int *location);
New Tokens
Accepted by the <cap> parameter of Enable, Disable, and IsEnabled, by the
<target> parameter of Enablei, Disablei, IsEnabledi, EnableIndexedEXT,
DisableIndexedEXT, and IsEnabledIndexedEXT, and by the <pname> parameter
of GetBooleanv, GetIntegerv, GetInteger64v, GetFloatv, GetDoublev,
GetDoubleIndexedv, GetBooleani_v, GetIntegeri_v, GetInteger64i_v,
GetFloati_v, GetDoublei_v, GetBooleanIndexedvEXT, GetIntegerIndexedvEXT,
and GetFloatIndexedvEXT:
SHADING_RATE_IMAGE_NV 0x9563
Accepted in the <rates> parameter of ShadingRateImagePaletteNV and the
<rate> parameter of ShadingRateSampleOrderCustomNV and
GetShadingRateSampleLocationivNV; returned in the <rate> parameter of
GetShadingRateImagePaletteNV:
SHADING_RATE_NO_INVOCATIONS_NV 0x9564
SHADING_RATE_1_INVOCATION_PER_PIXEL_NV 0x9565
SHADING_RATE_1_INVOCATION_PER_1X2_PIXELS_NV 0x9566
SHADING_RATE_1_INVOCATION_PER_2X1_PIXELS_NV 0x9567
SHADING_RATE_1_INVOCATION_PER_2X2_PIXELS_NV 0x9568
SHADING_RATE_1_INVOCATION_PER_2X4_PIXELS_NV 0x9569
SHADING_RATE_1_INVOCATION_PER_4X2_PIXELS_NV 0x956A
SHADING_RATE_1_INVOCATION_PER_4X4_PIXELS_NV 0x956B
SHADING_RATE_2_INVOCATIONS_PER_PIXEL_NV 0x956C
SHADING_RATE_4_INVOCATIONS_PER_PIXEL_NV 0x956D
SHADING_RATE_8_INVOCATIONS_PER_PIXEL_NV 0x956E
SHADING_RATE_16_INVOCATIONS_PER_PIXEL_NV 0x956F
Accepted by the <pname> parameter of GetBooleanv, GetDoublev,
GetIntegerv, and GetFloatv:
SHADING_RATE_IMAGE_BINDING_NV 0x955B
SHADING_RATE_IMAGE_TEXEL_WIDTH_NV 0x955C
SHADING_RATE_IMAGE_TEXEL_HEIGHT_NV 0x955D
SHADING_RATE_IMAGE_PALETTE_SIZE_NV 0x955E
MAX_COARSE_FRAGMENT_SAMPLES_NV 0x955F
Accepted by the <order> parameter of ShadingRateSampleOrderNV:
SHADING_RATE_SAMPLE_ORDER_DEFAULT_NV 0x95AE
SHADING_RATE_SAMPLE_ORDER_PIXEL_MAJOR_NV 0x95AF
SHADING_RATE_SAMPLE_ORDER_SAMPLE_MAJOR_NV 0x95B0
Modifications to the OpenGL 4.5 Specification (Compatibility Profile)
Modify Section 14.3.1, Multisampling, p. 532
(add to the end of the section)
When using a shading rate image (Section 14.4.1), rasterization may
produce fragments covering multiple pixels, where each pixel is treated as
a sample. If SHADING_RATE_IMAGE_NV is enabled for any viewport,
primitives will be processed with multisample rasterization rules,
regardless of the MULTISAMPLE enable or the value of SAMPLE_BUFFERS. If
the framebuffer has no multisample buffers, each pixel is treated as
having a single sample located at the pixel center.
Delete Section 14.3.1.1, Sample Shading, p. 532. The functionality in
this section is moved to the new Section 14.4, "Shading Rate Control".
Add new section before Section 14.4, Points, p. 533
Section 14.4, Shading Rate Control
By default, each fragment processed by programmable fragment processing
(chapter 15) [[compatibility only: or fixed-function fragment processing
(chapter 16)]] corresponds to a single pixel with a single (x,y)
coordinate. When using multisampling, implementations are permitted to run
separate fragment shader invocations for each sample, but often only run a
single invocation for all samples of the fragment. We will refer to the
density of fragment shader invocations in a particular framebuffer region
as the _shading rate_. Applications can use the shading rate to increase
the size of fragments to cover multiple pixels and reduce the amount of
fragment shader work. Applications can also use the shading rate to
explicitly control the minimum number of fragment shader invocations when
multisampling.
Section 14.4.1, Shading Rate Image
Applications can specify the use of a shading rate that varies by (x,y)
location using a _shading rate image_. Use of a shading rate image is
enabled or disabled for all viewports using Enable or Disable with target
SHADING_RATE_IMAGE_NV. Use of a shading rate image is enabled or disabled
for a specific viewport using Enablei or Disablei with the constant
SHADING_RATE_IMAGE_NV and the index of the selected viewport. The shading
rate image may only be used with a framebuffer object. When rendering to
the default framebuffer, the shading rate image operations in this section
are disabled.
The shading rate image is a texture that can be bound with the command
void BindShadingRateImageNV(uint texture);
This command unbinds the current shading rate image, if any. If <texture>
is zero, no new texture is bound. If <texture> is non-zero, it must be
the name of an existing immutable-format texture with a target of
TEXTURE_2D or TEXTURE_2D_ARRAY with a format of R8UI. If <texture> has
multiple mipmap levels, only the base level will be used as the shading
rate image.
Errors
INVALID_VALUE is generated if <texture> is not zero and is not the
name of an existing texture object.
INVALID_OPERATION is generated if <texture> is not an immutable-format
texture, has a format other than R8UI, or has a texture target other
than TEXTURE_2D or TEXTURE_2D_ARRAY.
When rasterizing a primitive covering pixel (x,y) with a shading rate
image having a target of TEXTURE_2D, a two-dimensional texel coordinate
(u,v) is generated, where:
u = floor(x / SHADING_RATE_IMAGE_TEXEL_WIDTH_NV)
v = floor(y / SHADING_RATE_IMAGE_TEXEL_HEIGHT_NV)
and where SHADING_RATE_IMAGE_TEXEL_WIDTH_NV and
SHADING_RATE_IMAGE_TEXEL_HEIGHT_NV are the width and height of the
implementation-dependent footprint of each shading rate image texel in the
framebuffer. If the bound shading rate image has a target of
TEXTURE_2D_ARRAY, a three-dimensional texture coordinate (u,v,w) is
generated, where u and v are computed as above. The coordinate w is set
to the layer L of the framebuffer being rendered to if L is less than the
number of layers in the shading rate image, or zero otherwise.
If a texel with coordinates (u,v) or (u,v,w) exists in the bound shading
rate image, the value of the 8-bit R component of the texel is used as the
shading rate index. If the (u,v) or (u,v,w) coordinate is outside the
extent of the shading rate image, or if no shading rate image is bound,
zero will be used as the shading rate index.
A shading rate index is mapped to a _base shading rate_ using a lookup
table called the shading rate image palette. There is a separate palette
for each viewport. The number of entries in each palette is given by the
implementation-dependent constant SHADING_RATE_IMAGE_PALETTE_SIZE_NV. The
base shading rate for an (x,y) coordinate with a shading rate index of <i>
will be given by palette entry <i>. If the shading rate index is greater
than or equal to the palette size, the results of the palette lookup are
undefined.
Shading rate image palettes are updated using the command
void ShadingRateImagePaletteNV(uint viewport, uint first, sizei count,
const enum *rates);
<viewport> specifies the number of the viewport whose palette should be
updated. <rates> is an array of <count> shading rate enums and is used to
update entries <first> through <first> + <count> - 1 in the palette. The
set of shading rate values accepted in <rates> is given in Table X.1. The
default value for all palette entries is
SHADING_RATE_1_INVOCATION_PER_PIXEL_NV.
Shading Rate Size Invocations
------------------------------------------- ----- -----------
SHADING_RATE_NO_INVOCATIONS_NV - 0
SHADING_RATE_1_INVOCATION_PER_PIXEL_NV 1x1 1
SHADING_RATE_1_INVOCATION_PER_1X2_PIXELS_NV 1x2 1
SHADING_RATE_1_INVOCATION_PER_2X1_PIXELS_NV 2x1 1
SHADING_RATE_1_INVOCATION_PER_2X2_PIXELS_NV 2x2 1
SHADING_RATE_1_INVOCATION_PER_2X4_PIXELS_NV 2x4 1
SHADING_RATE_1_INVOCATION_PER_4X2_PIXELS_NV 4x2 1
SHADING_RATE_1_INVOCATION_PER_4X4_PIXELS_NV 4x4 1
SHADING_RATE_2_INVOCATIONS_PER_PIXEL_NV 1x1 2
SHADING_RATE_4_INVOCATIONS_PER_PIXEL_NV 1x1 4
SHADING_RATE_8_INVOCATIONS_PER_PIXEL_NV 1x1 8
SHADING_RATE_16_INVOCATIONS_PER_PIXEL_NV 1x1 16
Table X.1: Shading rates accepted by ShadingRateImagePaletteNV. An
entry of "<W>x<H>" in the "Size" column indicates that the shading
rate results in fragments with a width and height (in pixels) of <W>
and <H>, respectively. The entry in the "Invocations" column
specifies the number of fragment shader invocations that should be
generated for each fragment.
Errors
INVALID_VALUE is generated if <viewport> is greater than or equal to
MAX_VIEWPORTS or if <first> plus <count> is greater than
SHADING_RATE_IMAGE_PALETTE_SIZE_NV.
INVALID_ENUM is generated if any entry in <rates> is not a valid
shading rate.
Individual entries in the shading rate palette can be queried using the
command:
void GetShadingRateImagePaletteNV(uint viewport, uint entry,
enum *rate);
where <viewport> specifies the viewport of the palette to query and
<entry> specifies the palette entry number. A single enum from Table X.1
is returned in <rate>.
Errors
INVALID_VALUE is generated if <viewport> is greater than or equal to
MAX_VIEWPORTS or if <entry> is greater than or equal to
SHADING_RATE_IMAGE_PALETTE_SIZE_NV.
If the shading rate image is enabled, a base shading rate will be obtained
as described above. If the shading rate image is disabled, the base
shading rate will be SHADING_RATE_1_INVOCATION_PER_PIXEL_NV. In either
case, the shading rate will be adjusted as described in the following
sections.
The rasterization hardware that reads from the shading rate image may
cache texels it reads for maximum performance. If the shading rate image
is updated using commands such as TexSubImage2D, image stores in shaders,
or by framebuffer writes performed when the shading rate image is bound to
a framebuffer object, this cache may retain out-of-date texture data.
Calling
void ShadingRateImageBarrierNV(boolean synchronize);
with <synchronize> set to TRUE ensures that rendering commands submitted
after the barrier don't access old shading rate image data updated
directly (TexSubImage2D) or indirectly (rendering, image stores) by
commands submitted before the barrier. If <synchronize> is set to FALSE,
ShadingRateImageBarrierNV doesn't wait on the completion of commands
submitted before the barrier. If an application has ensured that all
prior commands updating the shading rate image have completed using sync
objects or other mechanism, <synchronize> can be safely set to FALSE.
Otherwise, the lack of synchronization may cause subsequent rendering
commands to source the shading rate image before prior updates have
completed.
Section 14.4.2, Sample Shading
When the shading rate image is disabled, sample shading can be used to
specify a minimum number of fragment shader invocations to generate for
each fragment. When the shading rate image is enabled, sample shading can
be used to adjust the shading rate to increase the number of fragment
shader invocations generated for each primitive. Sample shading is
controlled by calling Enable or Disable with target SAMPLE_SHADING. If
MULTISAMPLE or SAMPLE_SHADING is disabled, sample shading has no effect.
When sample shading is active, an integer sample shading factor is derived
based on the value provided in the command:
void MinSampleShading(float value);
When the shading rate image is disabled, a <value> of 0.0 specifies that
the minimum number of fragment shader invocations for the shading rate be
executed and a <value> of 1.0 specifies that a fragment shader should be
on each shadeable sample with separate values per sample. When the
shading rate image is enabled, <value> is used to derive a sample shading
rate that can adjust the shading rate. <value> is not clamped to [0.0,
1.0]; values larger than 1.0 can be used to force larger adjustments to
the shading rate.
The sample shading factor is computed from <value> in an
implementation-dependent manner but must be greater than or equal to:
factor = max(ceil(value * max_shaded_samples), 1)
In this computation, <max_shaded_samples> is the maximum number of
fragment shader invocations per fragment, and is equal to:
- the number of color samples, if the framebuffer has color attachments;
- the number of depth/stencil samples, if the framebuffer has
depth/stencil attachments but no color attachments; or
- the value of FRAMEBUFFER_DEFAULT_SAMPLES if the framebuffer has no
attachments.
If the framebuffer has non-multisample attachments, the maximum number of
shaded samples per pixel is always one.
Section 14.4.3, Shading Rate Adjustment
Once a base shading rate has been established, it is adjusted to produce a
final shading rate.
First, if the base shading rate specifies multiple pixels for a fragment,
the shading rate is adjusted in an implementation-dependent manner to
limit the total number of coverage samples for the "coarse" fragment.
After adjustment, the maximum number of samples will not exceed the
implementation-dependent maximum MAX_COARSE_FRAGMENT_SAMPLES_NV. However,
implementations are permitted to clamp to a lower number of coverage
samples if required. Table X.2 describes the clamping performed in the
initial implementation of this extension.
Coverage Samples per Pixel
Base rate 2 4 8 16
--------- ----- ----- ----- -----
1x2 - - - 1x1
2x1 - - 1x1 1x1
2x2 - - 1x2 1x1
2x4 - 2x2 1x2 1x1
4x2 2x2 2x2 1x2 1x1
4x4 2x4 2x2 1x2 1x1
Table X.2, Coarse shading rate adjustment for total coverage sample
count for the initial implementation of this extension, where
MAX_COARSE_FRAGMENT_SAMPLES_NV is 16. The entries in the "2", "4", "8",
and "16" columns indicate the fragment size for the adjusted shading
rate.
If sample shading is enabled and the sample shading factor is greater than
one, the base shading rate is further adjusted to result in more shader
invocations per pixel. Table X.3 describes how the shading rate is
adjusted in the initial implementation of this extension.
Sample Shading Factor
Base rate 2 4 8 16
---------- --------- ------- -------- --------
1x1 / 1 1x1 / 2 1x1 / 4 1x1 / 8 1x1 / 16
1x2 / 1 1x1 / 1 1x1 / 2 1x1 / 4 1x1 / 8
2x1 / 1 1x1 / 1 1x1 / 2 1x1 / 4 1x1 / 8
2x2 / 1 1x2 / 1 1x1 / 1 1x1 / 2 1x1 / 4
2x4 / 1 2x2 / 1 1x2 / 1 1x1 / 1 1x1 / 2
4x2 / 1 2x2 / 1 2x1 / 1 1x1 / 1 1x1 / 2
4x4 / 1 2x4 / 1 2x2 / 1 1x2 / 1 1x1 / 1
1x1 / 2 1x1 / 4 1x1 / 8 1x1 / 16 1x1 / 16
1x1 / 4 1x1 / 8 1x1 / 16 1x1 / 16 1x1 / 16
1x1 / 8 1x1 / 16 1x1 / 16 1x1 / 16 1x1 / 16
1x1 / 16 1x1 / 16 1x1 / 16 1x1 / 16 1x1 / 16
Table X.3, Shading rate adjustment based on the sample shading factor in
the initial implementation of this extension. All rates in this table
are of the form "<W>x<H> / <I>", indicating a fragment size of <W>x<H>
pixels with <I> invocations per fragment.
If RASTER_MULTISAMPLE_EXT is enabled and the shading rate indicates
multiple fragment shader invocations per pixel, implementations are
permitted to adjust the shading rate to reduce the number of invocations
per pixel. In this case, implementations are not required to support more
than one invocations per pixel.
If the active fragment shader uses any inputs that are qualified with
"sample" (unique values per sample), including the built-ins "gl_SampleID"
and "gl_SamplePosition", the shader code is written to expect a separate
shader invocation for each shaded sample. For such fragment shaders, the
shading rate is set to the maximum number of shader invocations per pixel
(SHADING_RATE_16_INVOCATIONS_PER_PIXEL_NV). This adjustment effectively
disables the shading rate image.
Finally, if the shading rate indicates multiple fragment shader
invocations per sample, the total number of invocations per fragment in
the shading rate is clamped to the maximum number of shaded samples per
pixel described in section 14.4.2.
Section 14.4.4, Shading Rate Application
If the palette indicates a shading rate of SHADING_RATE_NO_INVOCATIONS_NV,
for pixel (x,y), no fragments will be generated for that pixel.
When the final shading rate for pixel (x,y) is results in fragments with a
width and height of <W> and <H>, where either <W> or <H> is greater than
one, a single fragment will be produced for that pixel that also includes
all other pixels covered by the same primitive whose coordinates (x',y')
satisfy:
floor(x / W) == floor(x' / W), and
floor(y / H) == floor(y' / H).
This combined fragment is considered to have multiple coverage samples;
the total number of samples in this fragment is given by
samples = A * B * S
where <A> and <B> are the width and height of the combined fragment, in
pixels, and <S> is the number of coverage samples per pixel in the draw
framebuffer. The set of coverage samples in the fragment is the union of
the per-pixel coverage samples in each of the fragment's pixels. The
location and order of coverage samples within each pixel in the combined
fragment are the same as the location and order used for single-pixel
fragments. Each coverage sample in the set of pixels belonging to the
combined fragment is assigned a unique sample number in the range
[0,<S>-1]. When rendering to a framebuffer object, the order of coverage
samples can be specified for each combination of fragment size and
coverage sample count. When using the default framebuffer, the coverage
samples are ordered in an implementation-dependent manner. The command
void ShadingRateSampleOrderNV(enum order);
sets the coverage sample order for all valid combinations of shading rate
and per-pixel sample coverage count. If <order> is
COARSE_SAMPLE_ORDER_DEFAULT_NV, coverage samples are ordered in an
implementation-dependent default order. If <order> is
COARSE_SAMPLE_ORDER_PIXEL_MAJOR_NV, coverage samples in the combined
fragment will be ordered sequentially, sorted first by pixel coordinate
(in row-major order) and then by per-pixel coverage sample number. If
<order> is COARSE_SAMPLE_ORDER_SAMPLE_MAJOR_NV, coverage samples in the
combined fragment will be ordered sequentially, sorted first by per-pixel
coverage sample number and then by pixel coordinate (in row-major order).
When processing a fragment using an ordering specified by
COARSE_SAMPLE_ORDER_PIXEL_MAJOR_NV sample <cs> in the combined fragment
will be assigned to coverage sample <ps> of pixel (px,py) specified by:
px = fx + (floor(cs / fsc) % fw)
py = fy + floor(cs / (fsc * fw))
ps = cs % fsc
where the lower-leftmost pixel in the fragment has coordinates (fx,fy),
the fragment width and height are <fw> and <fh>, respectively, and there
are <fsc> coverage samples per pixel. When processing a fragment with an
ordering specified by COARSE_SAMPLE_ORDER_SAMPLE_MAJOR_NV, sample <cs> in
the combined fragment will be assigned using:
px = fx + (cs % fw)
py = fy + (floor(cs / fw) % fh)
ps = floor(cs / (fw * fh))
Additionally, the command
void ShadingRateSampleOrderCustomNV(enum rate, uint samples,
const int *locations);
specifies the order of coverage samples for fragments using a shading rate
of <rate> with <samples> coverage samples per pixel. <rate> must be one
of the shading rates specified in Table X.1 and must specify a shading
rate with more than one pixel per fragment. <locations> specifies an
array of N (x,y,s) tuples, where N is the product the fragment width
indicated by <rate>, the fragment height indicated by <rate>, and
<samples>. For each (x,y,s) tuple specified in <locations>, <x> must be
in the range [0,fw-1], y must be in the range [0,fh-1], and s must be in
the range [0,fsc-1]. No two tuples in <locations> may have the same
values.
When using a sample order specified by ShadingRateSampleOrderCustomNV,
sample <cs> in the combined fragment will be assigned using:
px = fx + locations[3 * cs + 0]
py = fy + locations[3 * cs + 1]
ps = locations[3 * cs + 2]
where all terms in these equations are defined as in the equations
specified for ShadingRateSampleOrderNV and are consistent with a shading
rate of <rate> and a per-pixel sample count of <samples>.
Errors
* INVALID_ENUM is generated if <rate> is not one of the enums in Table
X.1.
* INVALID_OPERATION is generated if <rate> does not specify a
shading rate palette entry that specifies fragments with more than
one pixel.
* INVALID_VALUE is generated if <sampleCount> is not 1, 2, 4, or 8.
* INVALID_OPERATION is generated if the product of the fragment width
indicated by <rate>, the fragment height indicated by <rate>, and
samples is greater than MAX_COARSE_FRAGMENT_SAMPLES_NV.
* INVALID_VALUE is generated if any (x,y,s) tuple in <locations> has
negative values of <x>, <y>, or <s>, has an <x> value greater than or
equal to the width of fragments using <rate>, has a <y> value greater
than or equal to the height of fragments using <rate>, or has an <s>
value greater than or equal to <sampleCount>.
* INVALID_OPERATION is generated if any pair of (x,y,s) tuples in
<locations> have identical values.
In the initial state, the order of coverage samples in combined fragments
is implementation-dependent, but will be identical to the order obtained
by passing COARSE_SAMPLE_ORDER_DEFAULT_NV to ShadingRateSampleOrderNV.
The command
void GetShadingRateSampleLocationivNV(enum rate, uint samples,
uint index, int *location);
can be used to determine the specific pixel and sample number for each
numbered sample in a single- or multi-pixel fragment when the final
shading rate is <rate> and uses <samples> coverage samples per pixel.
<index> specifies a sample number in the fragment. Three integers are
returned in <location>, and are interpreted in the same manner as each
(x,y,s) tuples passed to ShadingRateSampleOrderCustomNV. The command
GetMultisamplefv can be used to determine the location of the identified
sample <s> within a combined fragment pixel identified by (x,y).
Errors
INVALID_OPERATION is returned if <rate> is
SHADING_RATE_NO_INVOCATIONS_NV.
INVALID_VALUE is returned if <index> is greater than or equal to the
number of coverage samples in the draw framebuffer in a combined pixel
for a shading rate given by <rate>.
When the final shading rate for pixel (x,y) specifies single-pixel
fragments, a single fragment with S samples numbered in the range
[0,<S>-1] will be generated when (x,y) is covered.
If the final shading rate for the fragment containing pixel (x,y) produces
fragments covering multiple pixels, a single fragment shader invocation
will be generated for the combined fragment. When using fragments with
multiple pixels per fragment, fragment shader outputs (e.g., color values
and gl_FragDepth) will be broadcast to all covered pixels/samples of the
fragment. If a "discard" is used in a fragment shader, none of the
pixels/samples of the fragment will be updated.
If the final shading rate for pixel (x,y) indicates <N> fragment shader
invocations per fragment, <N> separate fragment shader invocations will be
generated for the single-pixel fragment. Each coverage sample in the
fragment is assigned to one of the <N> fragment shader invocations in an
implementation-dependent manner.
If sample shading is enabled and the final shading rate results in
multiple fragment shader invocations per pixel, each fragment shader
invocation for a pixel will have a separate set of interpolated input
values. If sample shading is disabled, interpolated fragment shader
inputs not qualified with "centroid" may have the same value for each
invocation.
Modify Section 14.6.X, Conservative Rasterization from the
NV_conservative_raster extension specification
(add to the end of the section)
When the shading rate results in fragments covering more than one pixel,
coverage evaluation for conservative rasterization will be performed
independently for each pixel. In a such a case, a pixel considered not to
be covered by a conservatively rasterized primitive will still be
considered uncovered even if a neighboring pixel in the same fragment is
covered.
Modify Section 14.9.2, Scissor Test
(add to the end of the section)
When the shading rate results in fragments covering more than one pixel,
the scissor tests are performed separately for each pixel in the fragment.
If a pixel covered by a fragment fails either the scissor or exclusive
scissor test, that pixel is treated as though it was not covered by the
primitive. If all pixels covered by a fragment are either not covered by
the primitive being rasterized or fail either scissor test, the fragment
is discarded.
Modify Section 14.9.3, Multisample Fragment Operations (p. 562)
(modify the end of the first paragraph to indicate that sample mask
operations are performed when using the shading rate image, which can
produce coarse fragments where each pixel is considered a "sample")
... This step is skipped if MULTISAMPLE is disabled or if the value of
SAMPLE_BUFFERS is not one, unless SHADING_RATE_IMAGE_NV is enabled for one
or more viewports.
(add to the end of the section)
When the shading rate results in fragments covering more than one pixel,
each fragment will a composite coverage mask that includes separate
coverage bits for each sample in each pixel covered by the fragment. This
composite coverage mask will be used by the GLSL built-in input variable
gl_SampleMaskIn[] and updated according to the built-in output variable
gl_SampleMask[]. Each bit number in this composite mask maps to a
specific pixel and sample number within that pixel.
When building the composite coverage mask for a fragment, rasterization
logic evaluates separate per-pixel coverage masks and then modifies each
per-pixel mask as described in this section. After that, it assembles the
composite mask by applying the mapping of composite mask bits to
pixels/samples, which can be queried using GetShadingRateSampleLocationfvNV.
When using the output sample mask gl_SampleMask[] to determine which
samples should be updated by subsequent per-fragment operations, a set of
separate per-pixel output masks is extracted by reversing the mapping used
to generate the composite sample mask.
Modify Section 15.1, Fragment Shader Variables (p. 566)
(modify fourth paragraph, p. 567, specifying how "centroid" works for
multi-pixel fragments)
When interpolating input variables, the default screen-space location at
which these variables are sampled is defined in previous rasterization
sections. The default location may be overriden by interpolation
qualifiers. When interpolating variables declared using "centroid in",
the variable is sampled at a location inside the area of the fragment that
is covered by the primitive generating the fragment. ...
Modify Section 15.2.2, Shader Inputs (p. 566), as edited by
NV_conservative_raster_underestimation
(add to new paragraph on gl_FragFullyCoveredNV)
When CONSERVATIVE_RASTERIZATION_NV or CONSERVATIVE_RASTERIZATION2_NV is
enabled, the built-in read-only variable gl_FragFullyCoveredNV is set to
true if the fragment is fully covered by the generating primitive, and
false otherwise. When the shading rate results in fragments covering more
than one pixel, gl_FragFullyCoveredNV will be true if and only if all
pixels covered by the fragment are fully covered by the primitive being
rasterized.
Modify Section 17.3, Per-Fragment Operations (p. 587)
(insert a new paragraph after the first paragraph of the section)
If the fragment covers multiple pixels, the operations described in the
section are performed independently for each pixel covered by the
fragment. The set of samples covered by each pixel is determined by
extracting the portion of the fragment's composite coverage that applies
to that pixel, as described in section 14.9.3.
Dependencies on ARB_sample_locations and NV_sample_locations
If ARB_sample_locations or NV_sample_locations is supported, applications
can enable programmable sample locations instead of the default sample
locations, and also configure sample locations that may vary from pixel to
pixel.
When using "coarse" shading rates covering multiple pixels, the coarse
fragment is considered to include the samples of all the pixels it
contains. Each sample of each pixel in the coarse fragment is mapped to
exactly one sample in the coarse fragment. The location of each sample in
the coarse fragment is determined by mapping the sample to a pixel (px,py)
and a sample <s> within the identified pixel. The exact location of that
identified sample is the same as it would be for one-pixel fragments. If
programmable sample locations are enabled, those locations will be used.
If the sample location pixel grid is enabled, those locations will depend
on the (x,y) coordinate of the containing pixel.
Dependencies on NV_scissor_exclusive
If NV_scissor_exclusive is not supported, remove references to the
exclusive scissor test in section 14.9.2.
Dependencies on NV_sample_mask_override_coverage
If NV_sample_mask_override_coverage is supported, applications are able to
use the sample mask to enable coverage for samples not covered by the
primitive being rasterized. When this extension is used in conjunction
with a shading rate where fragments cover multiple pixels, it's possible
for the sample mask override to enable coverage for pixels that would
normally be discarded. For example, this can enable coverage in pixels
that are not covered by the primitive being rasterized or that fail the
scissor test.
Dependencies on NV_conservative_raster
If NV_conservative_raster is supported, conservative rasterization
evaluates coverage per pixel, even when using a shading rate that
specifies multiple pixels per fragment.
If NV_conservative_raster is not supported, remove edits to the "Section
14.6.X" section from that extension.
Dependencies on NV_conservative_raster_underestimation
If NV_conservative_raster_underestimation is supported, and conservative
rasterization is enabled with a shading rate that specifies multiple
pixels per fragment, gl_FragFullyCoveredNV will be true if and only if all
pixels covered by the fragment are fully covered by the primitive being
rasterized.
If NV_conservative_raster_underestimation is not supported, remove edits
to Section 15.2.2 related to gl_FragFullyCoveredNV.
Dependencies on EXT_raster_multisample
If EXT_raster_multisample is not supported, remove the language allowing
implementations to reduce the number of fragment shader invocations
per pixel if RASTER_MULTISAMPLE_EXT is enabled.
Interactions with NV_viewport_array or OES_viewport_array
If NV_viewport_array is supported, references to MAX_VIEWPORTS and
GetFloati_v apply to MAX_VIEWPORTS_NV and GetFloati_vNV respecively.
If OES_viewport_array is supported, references to MAX_VIEWPORTS and
GetFloati_v apply to MAX_VIEWPORTS_OES and GetFloati_vOES respectively.
Interactions with OpenGL ES 3.2
If implemented in OpenGL ES, remove all references to GetDoublev,
GetDoublei_v, EnableIndexedEXT, DisableIndexedEXT, IsEnabledIndexedEXT,
GetBooleanIndexedvEXT, GetIntegerIndexedvEXT, GetFloatIndexedvEXT and
GetDoubleIndexedv.
If implemented in OpenGL ES, remove all references to the MULTISAMPLE enable
state.
Additions to the AGL/GLX/WGL Specifications
None
Errors
See the "Errors" sections for individual commands above.
New State
Get Value Get Command Type Initial Value Description Sec. Attribute
--------- --------------- ---- ------------- ----------- ---- ---------
SHADING_RATE_IMAGE_NV IsEnabledi 16+ x FALSE Use shading rate image to 14.4.1 enable
B determine shading rate for
a given viewport
SHADING_RATE_IMAGE_ GetIntegerv Z 0 Texture object bound for 14.4.1 none
BINDING_NV use as a shading rate image
<none> GetShadingRate- 16+ x SHADING_RATE_1_- Shading rate palette 14.4.1 none
ImagePaletteNV 16+ x INVOCATION_PER_- entries
Z12 PIXEL_NV
<none> GetShadingRate- many n/a Locations of individual 14.4.3 none
SampleLocation- 3xZ+ samples in "coarse"
fragments
New Implementation Dependent State
Minimum
Get Value Type Get Command Value Description Sec.
--------- ----- --------------- ------- ------------------------ ------
SHADING_RATE_IMAGE_ Z+ GetIntegerv 1 Width (in pixels) covered by 14.4.1
TEXEL_WIDTH_NV each shading rate image texel
SHADING_RATE_IMAGE_ Z+ GetIntegerv 1 Height (in pixels) covered by 14.4.1
TEXEL_HEIGHT_NV each shading rate image texel
SHADING_RATE_IMAGE_ Z+ GetIntegerv 16 Number of entries in each 14.4.1
PALETTE_SIZE_NV viewport's shading rate
palette
MAX_COARSE_FRAGMENT_ Z+ GetIntegerv 1 Maximum number of samples in 14.4.3
PALETTE_SIZE_NV "coarse" fragments
Issues
(1) How should we name this extension?
RESOLVED: We are calling this extension NV_shading_rate_image. We use
the term "shading rate" to indicate the variable number of fragment
shader invocations that will be spawned for a particular neighborhood of
covered pixels. The extension can support shading rates running one
invocation for multiple pixels and/or multiple invocations for a single
pixel. We use "image" in the extension name because we allow
applications to control the shading rate using an image, where each
pixel specifies a shading rate for a portion of the framebuffer.
We considered a name like "NV_variable_rate_shading", but decided that
name didn't sufficiently distinguish between this extension (where
shading rate varies across the framebuffer at once) from an extension
where an API is provided to change the shading rate for the entire
framebuffer. For example, the MinSampleShadingARB() API in
ARB_sample_shading allows an application to run one thread per pixel
(0.0) for some draw calls and one thread per sample (1.0) for others.
(2) Should this extension support only off-screen (FBO) rendering or can
it also support on-screen rendering?
RESOLVED: This extension only supports rendering to a framebuffer
object; the feature is disabled when rendering to the default
framebuffer. In some window system environments, the default
framebuffer may be a subset of a larger framebuffer allocation
corresponding the full screen. Because the initial hardware
implementation of this extension always uses (x,y) coordinates relative
to the framebuffer allocation to determine the shading rate, the shading
rate would depend on the location of a window on the screen and change
as the window moves. While some window systems may have separate
default framebuffer allocations for each window, we've chosen to
disallow use of the shading rate image with the default framebuffer
globally instead of adding a "Can I use the shading rate image with a
default framebuffer?" query.
(3) How does this feature work with per-sample shading?
RESOLVED: When using per-sample shading, an application is expecting a
fragment shader to run with a separate invocation per sample. The
shading rate image might allow for a "coarsening" that would break such
shaders. We've chosen to override the shading rate (effectively
disabling the shading rate image) when per-sample shading is used.
(4) Should BindShadingRateImageNV take any arguments to bind a subset of
a complex texture (e.g., a specific layer of an array texture or a
non-base mipmap level)?
RESOLVED: No. Applications can use texture views to create texture
that refer to the desired subset of a more complex texture, if required.
(5) Does a shading rate image need to be bound in order to use the shading
rate feature?
RESOLVED: No. The behavior where there is no texture bound when
SHADING_RATE_IMAGE_NV is enabled is explicitly defined to behave as if a
lookup was performed and returned zero. If an application wants to use
a constant rate other than SHADING_RATE_1_INVOCATION_PER_PIXEL_NV, it
can enable SHADING_RATE_IMAGE_NV, ensure no image is bound, and define
the entries for index zero in the relevant palette(s) to contain the
desired shading rate. This technique can be used to emulate 16x
multisampling on implementations that don't support it by binding larger
4x multisample textures to the framebuffer and then setting a shading
rate of SHADING_RATE_1_INVOCATION_PER_2X2_PIXELS_NV.
(6) How is the FRAGMENT_SHADER_INVOCATIONS_ARB query (from
ARB_pipeline_statistics_query) handled with fragments covering
multiple pixels?
RESOLVED: The fragment shader invocation for each multi-pixel fragment
is counted exactly once.
(7) How do we handle the combination of variable-rate shading (including
multiple invocations per pixel) and target-independent rasterization
(i.e., RASTER_MULTISAMPLE_EXT)?
RESOLVED: In EXT_raster_multisample, the specification allows
implementations to run a single fragment shader invocation for each
pixel, even if sample shading would normally call for multiple
invocations per pixel:
If RASTER_MULTISAMPLE_EXT is enabled, the number of unique samples to
process is implementation-dependent and need not be more than one.
The shading rates in this extension calling for multiple fragment shader
invocations per pixel behave similarly to sample shading, so we extend
the allowance to this extension as well. If the shading rate in a
region of the framebuffer calls for multiple fragment shader invocations
per pixel, implementations are permitted to modify the shading rate and
need not support more than one invocation per pixel.
(8) Both the shading rate image and the framebuffer attachments can be
layered or non-layered. Do they have to match?
RESOLVED: No. When using a shading rate image with a target of
TEXTURE_2D with a layered framebuffer, all layers in the framebuffer
will use the same two-dimensional shading rate image. When using a
shading rate image with a target of TEXTURE_2D_ARRAY with a non-layered
framebuffer, layer zero of the shading rate image will be used, except
perhaps in the (undefined behavior) case where a shader writes a
non-zero value to gl_Layer.
(9) When using shading rates that specify "coarse" fragments covering
multiple pixels, we will generate a combined coverage mask that
combines the coverage masks of all pixels covered by the fragment. By
default, these masks are combined in an implementation-dependent
order. Should we provide a mechanism allowing applications to query
or specify an exact order?
RESOLVED: Yes, this feature is useful for cases where most of the
fragment shader can be evaluated once for an entire coarse fragment, but
where some per-pixel computations are also required. For example, a
per-pixel alpha test may want to kill all the samples for some pixels in
a coarse fragment. This sort of test can be implemented using an output
sample mask, but such a shader would need to know which bit in the mask
corresponds to each sample in the coarse fragment. The command
ShadingRateSampleOrderNV allows applications to specify simple orderings
for all combinations, while ShadingRateSampleOrderCustomNV allows for
completely customized orders for each combination.
(10) How do centroid-sampled variables work with fragments larger than one
pixel?
RESOLVED: For single-pixel fragments, attributes declared with
"centroid" are sampled at an implementation-dependent location in the
intersection of the area of the primitive being rasterized and the area
of the pixel that corresponds to the fragment. With multi-pixel
fragments, we follow a similar pattern, using the intersection of the
primitive and the *set* of pixels corresponding to the fragment.
One important thing to keep in mind when using such "coarse" shading
rates is that fragment attributes are sampled at the center of the
fragment by default, regardless of the set of pixels/samples covered by
the fragment. For fragments with a size of 4x4 pixels, this center
location will be more than two pixels (1.5 * sqrt(2)) away from the
center of the pixels at the corners of the fragment. When rendering a
primitive that covers only a small part of a coarse fragment,
interpolating a color outside the primitive can produce overly bright or
dark color values if the color values have a large gradient. To deal
with this, an application can use centroid sampling on attributes where
"extrapolation" artifacts can lead to overly bright or dark pixels.
Note that this same problem also exists for multisampling with
single-pixel fragments, but is less severe because it only affects
certain samples of a pixel and such bright/dark samples may be averaged
with other samples that don't have a similar problem.
(11) How does this feature interact with multisampling?
RESOLVED: The shading rate image can produce "coarse" fragments larger
than one pixel, which we want to behave a lot like regular multisample.
One can consider each coarse fragment to be a lot like a "pixel", where
the individual pixels covered by the fragment are treated as "samples".
When the shading rate is enabled, we override several rules related to
multisampling:
(a) Multisample rasterization rules apply, even if we don't have
multisample buffers or if MULTISAMPLE is disabled.
(b) Coverage for the pixels comprising a coarse fragment is combined
into a single aggregate coverage mask that can be read using the
fragment shader input "gl_SampleMaskIn[]".
(c) Coverage for pixels comprising a coarse fragment can be modified using
the fragment shader output "gl_SampleMask[]", which is also
interpreted as an aggregate coverage mask.
Note that (a) means that point and line primitives may be rasterized
differently depending on whether the shading rate image is enabled or
disabled.
Also, please refer to issues in the GLSL extension specification.
Revision History
Revision 2 (pknowles)
- ES interactions.
Revision 1 (pbrown)
- Internal revisions.