| Name |
| |
| ARB_pipeline_statistics_query |
| |
| Name Strings |
| |
| GL_ARB_pipeline_statistics_query |
| |
| Contact |
| |
| Brian Paul, VMware Inc. (brianp 'at' vmware.com) |
| |
| Contributors |
| |
| Brian Paul, VMware |
| Daniel Rakos, AMD |
| Graham Sellers, AMD |
| Pat Brown, NVIDIA |
| Piers Daniell, NVIDIA |
| |
| Notice |
| |
| Copyright (c) 2014 The Khronos Group Inc. Copyright terms at |
| http://www.khronos.org/registry/speccopyright.html |
| |
| Status |
| |
| Complete. |
| Approved by the ARB on June 26, 2014. |
| Ratified by the Khronos Board of Promoters on August 7, 2014. |
| |
| Version |
| |
| Date: 2017-07-23 |
| Revision: 11 |
| |
| Number |
| |
| ARB Extension #171 |
| |
| Dependencies |
| |
| OpenGL 3.0 is required. |
| |
| The extension is written against the OpenGL 4.4 Specification, Core |
| Profile, March 19, 2014. |
| |
| OpenGL 3.2 and ARB_geometry_shader4 affect the definition of this |
| extension. |
| |
| OpenGL 4.0 and ARB_gpu_shader5 affect the definition of this extension. |
| |
| OpenGL 4.0 and ARB_tessellation_shader affect the definition of this |
| extension. |
| |
| OpenGL 4.3 and ARB_compute_shader affect the definition of this extension. |
| |
| This extension interacts with AMD_transform_feedback4. |
| |
| Overview |
| |
| This extension introduces new query types that allow applications to get |
| statistics information about different parts of the pipeline: |
| |
| * Number of vertices and primitives issued to the GL; |
| |
| * Number of times a vertex shader, tessellation evaluation shader, |
| geometry shader, fragment shader, and compute shader was invoked; |
| |
| * Number of patches processed by the tessellation control shader stage; |
| |
| * Number of primitives emitted by a geometry shader; |
| |
| * Number of primitives that entered the primitive clipping stage; |
| |
| * Number of primitives that are output by the primitive clipping stage; |
| |
| IP Status |
| |
| No known IP claims. |
| |
| New Procedures and Functions |
| |
| None. |
| |
| New Tokens |
| |
| Accepted by the <target> parameter of BeginQuery, EndQuery, GetQueryiv, |
| BeginQueryIndexed, EndQueryIndexed and GetQueryIndexediv: |
| |
| VERTICES_SUBMITTED_ARB 0x82EE |
| PRIMITIVES_SUBMITTED_ARB 0x82EF |
| VERTEX_SHADER_INVOCATIONS_ARB 0x82F0 |
| TESS_CONTROL_SHADER_PATCHES_ARB 0x82F1 |
| TESS_EVALUATION_SHADER_INVOCATIONS_ARB 0x82F2 |
| GEOMETRY_SHADER_INVOCATIONS 0x887F |
| GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB 0x82F3 |
| FRAGMENT_SHADER_INVOCATIONS_ARB 0x82F4 |
| COMPUTE_SHADER_INVOCATIONS_ARB 0x82F5 |
| CLIPPING_INPUT_PRIMITIVES_ARB 0x82F6 |
| CLIPPING_OUTPUT_PRIMITIVES_ARB 0x82F7 |
| |
| Additions to Chapter 4 of the OpenGL 4.4 (Core Profile) Specification (Event Model) |
| |
| Modify Section 4.2, Query Objects and Asynchronous Queries |
| |
| (add to the end of the bullet list on the first paragraph on p. 39) |
| |
| * Submission queries with a target of VERTICES_SUBMITTED_ARB and |
| PRIMITIVES_SUBMITTED_ARB return information on the number of vertices |
| and primitives transferred to the GL, respectively (see section 10.11). |
| |
| * Vertex shader queries with a target of VERTEX_SHADER_INVOCATIONS_ARB |
| return information on the number of times the vertex shader has been |
| invoked (see section 11.1.4). |
| |
| * Tessellation shader queries with a target of TESS_CONTROL_SHADER_- |
| PATCHES_ARB and TESS_EVALUATION_SHADER_INVOCATIONS_ARB return |
| information on the number of patches processed by the tessellation |
| control shader stage and the number of times the tessellation |
| evaluation shader has been invoked, respectively (see section 11.2.4). |
| |
| * Geometry shader queries with a target of GEOMETRY_SHADER_INVOCATIONS |
| and GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB return information on the |
| number of times the geometry shader has been invoked and the number of |
| primitives it emitted (see section 11.3.5). |
| |
| * Primitive clipping queries with a target of CLIPPING_INPUT_- |
| PRIMITIVES_ARB and CLIPPING_OUTPUT_PRIMITIVES_ARB return information |
| on the number of primitives that were processed in the primitive |
| clipping stage and the number of primitives that were output by the |
| primitive clipping stage and are further processed by the |
| rasterization stage, respectively (see section 13.5.2). |
| |
| * Fragment shader queries with a target of FRAGMENT_SHADER_INVOCATIONS_- |
| ARB return information on the number of times the fragment shader has |
| been invoked (see section 15.3). |
| |
| * Compute shader queries with a target of COMPUTE_SHADER_INVOCATIONS_ARB |
| return information on the number of times the compute shader has been |
| invoked (see section 19.2). |
| |
| (replace the INVALID_ENUM error for the <target> parameter of |
| BeginQueryIndexed on p. 40): |
| |
| An INVALID_ENUM error is generated if <target> is TIMESTAMP, or is not |
| one of the query object targets described in section 4.2. |
| |
| (replace the INVALID_ENUM error for the <target> parameter of |
| EndQueryIndexed on p. 41): |
| |
| An INVALID_ENUM error is generated if <target> is TIMESTAMP, or is not |
| one of the query object targets described in section 4.2. |
| |
| (modify the INVALID_VALUE error for <index> on non-indexed <target>s on |
| p. 42): |
| |
| An INVALID_OPERATION error is generated if <target> is a valid target |
| other than PRIMITIVES_GENERATED or |
| TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN, and <index> is not zero. |
| |
| |
| Modify Section 4.2.1, Query Object Queries |
| |
| (add before the errors section for GetQueryIndexediv on p. 43) |
| |
| For pipeline statistics queries (VERTICES_SUBMITTED_ARB, PRIMITIVES_- |
| SUBMITTED_ARB, VERTEX_SHADER_INVOCATIONS_ARB, TESS_CONTROL_SHADER_- |
| PATCHES_ARB, TESS_EVALUATION_SHADER_INVOCATIONS_ARB, GEOMETRY_SHADER_- |
| INVOCATIONS, FRAGMENT_SHADER_INVOCATIONS_ARB, COMPUTE_SHADER_- |
| INVOCATIONS_ARB, GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB, CLIPPING_- |
| INPUT_PRIMITIVES_ARB, CLIPPING_OUTPUT_PRIMITIVES_ARB), if the number |
| of bits is non-zero, the minimum number of bits allowed is 32. |
| |
| (replace the INVALID_ENUM error for the <target> parameter of |
| GetQueryIndexediv on p. 43): |
| |
| An INVALID_ENUM error is generated if <target> is not one of the query |
| object targets described in section 4.2. |
| |
| (modify the INVALID_VALUE error for <index> on non-indexed <target>s on |
| p. 43): |
| |
| An INVALID_OPERATION error is generated if <target> is a valid target |
| other than PRIMITIVES_GENERATED or |
| TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN, and <index> is not zero. |
| |
| Additions to Chapter 10 of the OpenGL 4.4 (Core Profile) Specification (Vertex Specification and Drawing Commands) |
| |
| Add new Section after 10.10, Conditional Rendering |
| |
| 10.11 Submission Queries |
| |
| Submission queries use query objects to track the number of vertices and |
| primitives that are issued to the GL using draw commands. |
| |
| When BeginQuery is called with a target of VERTICES_SUBMITTED_ARB, the |
| submitted vertices count maintained by the GL is set to zero. When a |
| vertices submitted query is active, the submitted vertices count is |
| incremented every time a vertex is transferred to the GL (see sections |
| 10.3.4, and 10.5). In case of primitive types with adjacency information |
| (see sections 10.1.11 through 10.1.14) implementations may or may not |
| count vertices not belonging to the main primitive. In case of line loop |
| primitives implementations are allowed to count the first vertex twice |
| for the purposes of VERTICES_SUBMITTED_ARB queries. Additionally, |
| vertices corresponding to incomplete primitives may or may not be counted. |
| |
| When BeginQuery is called with a target of PRIMITIVES_SUBMITTED_ARB, the |
| submitted primitives count maintained by the GL is set to zero. When a |
| primitives submitted query is active, the submitted primitives count is |
| incremented every time a point, line, triangle, or patch primitive is |
| transferred to the GL (see sections 10.1, 10.3.5, and 10.5). Restarting |
| a primitive topology using the primitive restart index has no effect on |
| the issued primitives count. Incomplete primitives may or may not be |
| counted. |
| |
| Additions to Chapter 11 of the OpenGL 4.4 (Core Profile) Specification (Programmable Vertex Processing) |
| |
| Modify Section 11.1.3, Shader Execution |
| |
| (add after bullet list on p. 352) |
| |
| Implementations are allowed to skip the execution of certain shader |
| invocations, and to execute additional shader invocations for any shader |
| type during programmable vertex processing due to implementation dependent |
| reasons, including the execution of shader invocations that don't have an |
| active program object present for the particular shader stage, as long as |
| the results of rendering otherwise remain unchanged. |
| |
| Add new Section after 11.1.3, Shader Execution |
| |
| 11.1.4 Vertex Shader Queries |
| |
| Vertex shader queries use query objects to track the number of vertex |
| shader invocations. |
| |
| When BeginQuery is called with a target of VERTEX_SHADER_INVOCATIONS_ARB, |
| the vertex shader invocations count maintained by the GL is set to zero. |
| When a vertex shader invocations query is active, the counter is |
| incremented every time the vertex shader is invoked (see section 11.1). |
| |
| The result of vertex shader queries may be implementation dependent due |
| to reasons described in section 11.1.3. |
| |
| Add new Section after 11.2.3, Tessellation Evaluation Shaders |
| |
| 11.2.4 Tessellation Shader Queries |
| |
| Tessellation shader queries use query objects to track the number of |
| tessellation control shader and tessellation evaluation shader invocations. |
| |
| When BeginQuery is called with a target of TESS_CONTROL_SHADER_PATCHES_ARB, |
| the tessellation control shader patches count maintained by the GL is set |
| to zero. When a tessellation control shader patches query is active, the |
| counter is incremented every time a patch is processed by the tessellation |
| control shader stage (see section 11.2.1). |
| |
| When BeginQuery is called with a target of TESS_EVALUATION_SHADER_- |
| INVOCATIONS_ARB, the tessellation evaluation shader invocations count |
| maintained by the GL is set to zero. When a tessellation evaluation shader |
| invocations query is active, the counter is incremented every time the |
| tessellation evaluation shader is invoked (see section 11.2.3). |
| |
| The result of tessellation shader queries may be implementation dependent |
| due to reasons described in section 11.1.3. |
| |
| Add new Section after 11.3.4, Geometry Shader Execution Environment |
| |
| 11.3.5 Geometry Shader Queries |
| |
| Geometry shader queries use query objects to track the number of geometry |
| shader invocations and the number of primitives those emitted. |
| |
| When BeginQuery is called with a target of GEOMETRY_SHADER_INVOCATIONS, |
| the geometry shader invocations count maintained by the GL is set to zero. |
| When a geometry shader invocations query is active, the counter is |
| incremented every time the geometry shader is invoked (see section 11.3). |
| In case of instanced geometry shaders (see section 11.3.4.2) the geometry |
| shader invocations count is incremented for each separate instanced |
| invocation. |
| |
| When BeginQuery is called with a target of GEOMETRY_SHADER_PRIMITIVES_- |
| EMITTED_ARB, the geometry shader output primitives count maintained by the |
| GL is set to zero. When a geometry shader primitives emitted query is |
| active, the counter is incremented every time the geometry shader emits |
| a primitive to a vertex stream. Implementations may or may not count |
| primitives emitted to a vertex stream that isn't further processed by the |
| GL (see section 11.3.2). Restarting primitive topology using the shading |
| language built-in functions EndPrimitive or EndStreamPrimitive does not |
| increment the geometry shader output primitives count. |
| |
| The result of geometry shader queries may be implementation dependent due |
| to reasons described in section 11.1.3. |
| |
| Additions to Chapter 13 of the OpenGL 4.4 (Core Profile) Specification (Fixed-Function Vertex Post-Processing) |
| |
| Modify Section 13.5, Primitive Clipping |
| |
| (add new paragraph before the last paragraph of the section on p. 405) |
| |
| Implementations are allowed to pass incoming primitives unchanged and to |
| output multiple primitives for an incoming primitive due to implementation |
| dependent reasons as long as the results of rendering otherwise remain |
| unchanged. |
| |
| Add new Section after 13.5.1, Clipping Shader Outputs |
| |
| 13.5.2 Primitive Clipping Queries |
| |
| Primitive clipping queries use query objects to track the number of |
| primitives that are processed by the primitive clipping stage and the |
| number of primitives that are output by the primitive clipping stage and |
| are further processed by the rasterization stage. |
| |
| When BeginQuery is called with a target of CLIPPING_INPUT_PRIMITIVES_ARB, |
| the clipping input primitives count maintained by the GL is set to zero. |
| When a clipping input primitives query is active, the counter is |
| incremented every time a primitive reaches the primitive clipping stage |
| (see section 13.5). |
| |
| When BeginQuery is called with a target of CLIPPING_OUTPUT_PRIMITIVES_ARB, |
| the clipping output primitives count maintained by the GL is set to zero. |
| When a clipping output primitives query is active, the counter is |
| incremented every time a primitive passes the primitive clipping stage. |
| The actual number of primitives output by the primitive clipping stage for |
| a particular input primitive is implementation dependent (see section 13.5) |
| but must satisfy the following conditions: |
| |
| * If at least one vertex of the input primitive lies inside the clipping |
| volume, the counter is incremented by one or more. |
| |
| * Otherwise, the counter is incremented by zero or more. |
| |
| If RASTERIZER_DISCARD is enabled, implementations are allowed to discard |
| primitives right after the optional transform feedback state (see Section |
| 14.1). As a result, if RASTERIZER_DISCARD is enabled, the clipping input |
| and output primitives count may not be incremented. |
| |
| Additions to Chapter 15 of the OpenGL 4.4 (Core Profile) Specification (Programmable Fragment Processing) |
| |
| Modify Section 15.2, Shader Execution |
| |
| (add after first paragraph on p. 434) |
| |
| Implementations are allowed to skip the execution of certain fragment |
| shader invocations, and to execute additional fragment shader invocations |
| during programmable fragment processing due to implementation dependent |
| reasons, including the execution of fragment shader invocations when there |
| isn't an active program object present for the fragment shader stage, as |
| long as the results of rendering otherwise remain unchanged. |
| |
| Add new Section after 15.2, Shader Execution |
| |
| 15.3 Fragment Shader Queries |
| |
| Fragment shader queries use query objects to track the number of fragment |
| shader invocations. |
| |
| When BeginQuery is called with a target of FRAGMENT_SHADER_INVOCATIONS_ARB, |
| the fragment shader invocations count maintained by the GL is set to zero. |
| When a fragment shader invocations query is active, the counter is |
| incremented every time the fragment shader is invoked (see section 15.2). |
| |
| The result of fragment shader queries may be implementation dependent due |
| to reasons described in section 15.2. |
| |
| Additions to Chapter 19 of the OpenGL 4.4 (Core Profile) Specification (Compute Shaders) |
| |
| Add new Section after 19.1, Compute Shader Variables |
| |
| 19.2 Compute Shader Queries |
| |
| Compute shader queries use query objects to track the number of compute |
| shader invocations. |
| |
| When BeginQuery is called with a target of COMPUTE_SHADER_INVOCATIONS_ARB, |
| the compute shader invocations count maintained by the GL is set to zero. |
| When a compute shader invocations query is active, the counter is |
| incremented every time the compute shader is invoked (see chapter 19). |
| |
| Implementations are allowed to skip the execution of certain compute |
| shader invocations, and to execute additional compute shader invocations |
| due to implementation dependent reasons as long as the results of |
| rendering otherwise remain unchanged. |
| |
| Additions to the AGL/EGL/GLX/WGL Specifications |
| |
| None. |
| |
| Dependencies on OpenGL 3.2 and ARB_geometry_shader4 |
| |
| If OpenGL 3.2 and ARB_geometry_shader4 are not supported then remove all |
| references to GEOMETRY_SHADER_INVOCATIONS and |
| GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB. |
| |
| Dependencies on OpenGL 4.0 and ARB_gpu_shader5 |
| |
| If OpenGL 4.0 and ARB_gpu_shader5 are not supported then rename |
| GEOMETRY_SHADER_INVOCATIONS to GEOMETRY_SHADER_INVOCATIONS_ARB. |
| |
| Dependencies on OpenGL 4.0 and ARB_tessellation_shader |
| |
| If OpenGL 4.0 and ARB_tessellation_shader are not supported then remove |
| all references to TESS_CONTROL_SHADER_PATCHES_ARB and |
| TESS_EVALUATION_SHADER_INVOCATIONS_ARB. |
| |
| Dependencies on OpenGL 4.3 and ARB_compute_shader |
| |
| If OpenGL 4.3 and ARB_compute_shader are not supported then remove all |
| references to COMPUTE_SHADER_INVOCATIONS_ARB. |
| |
| Dependencies on AMD_transform_feedback4 |
| |
| If AMD_transform_feedback4 is supported then GEOMETRY_SHADER_PRIMITIVES_- |
| EMITTED_ARB counts primitives emitted to any of the vertex streams for |
| which STREAM_RASTERIZATION_AMD is enabled. |
| |
| New State |
| |
| Modify Table 23.74, Miscellaneous |
| |
| (update the state table to cover the new query types on p. 599) |
| |
| Get Value Type Get Command Initial Value Description Sec. |
| ------------- ----- ----------- ------------- ------------------------- ----- |
| CURRENT_QUERY 18xZ+ GetQueryiv 0 Active query object names 4.2.1 |
| |
| New Implementation Dependent State |
| |
| Modify Table 23.69, Implementation Dependent Values |
| |
| (update the state table to cover the new query types on p. 594) |
| |
| Get Value Type Get Command Minimum Value Description Sec. |
| ------------------ ----- ----------- -------------- ------------------ ----- |
| QUERY_COUNTER_BITS 18xZ+ GetQueryiv see sec. 4.2.1 Asynchronous query 4.1.1 |
| counter bits |
| |
| Issues |
| |
| (1) Why is this extension necessary? |
| |
| RESOLVED: A competing graphics API supports this feature. This extension |
| will allow one to easier implement that API's features on top of OpenGL. |
| Also, this feature could be useful for profiling tools and debuggers. |
| |
| (2) Should a single query (such as GL_PIPELINE_STATISTICS) return all the |
| statistics in an 11-field record or should there be separate queries? |
| |
| DISCUSSION: |
| |
| Single query: Returning 11 values in one query may be trouble if we want |
| to extend the set of statistics in the future. It would probably require |
| defining a whole new query. Also, if someone is only interested in one |
| or two queries there may be overhead in querying all the statistics at |
| once. Also, the interaction with GL_ARB_query_buffer_object is not |
| clear. Would all 11 values be written to the buffer or would we define a |
| set of 11 enumerants to specify which value is queried? |
| |
| Multiple queries: Defining 11 separate queries is straight-forward. |
| But if the underlying hardware is designed to collect the whole set of |
| statistics, it may be inefficient to support separate queries. |
| |
| RESOLVED: Define 11 separate queries to avoid problems with future |
| statistic queries. |
| |
| (3) Can the result of pipeline statistic queries be used for conditional |
| rendering? |
| |
| DISCUSSION: It doesn't make sense if one query of 11 values is used. |
| It could make sense if there are 11 separate queries. But is there |
| a legitimate use case for this? D3D10 doesn't allow this. |
| |
| RESOLVED: No. |
| |
| (4) Should pipeline statistics use the glBegin/EndQuery() interface or |
| the glQueryCounter() interface? |
| |
| DISCUSSION: The glBegin/EndQuery interface matches what D3D10 uses. |
| To count the statistics between points A and B with glQueryCounter() |
| one would query the statistic counter at point A and again at point B |
| and compute the difference. A problem with this approach is that the |
| statistic counters would always have to be running because we wouldn't |
| know when they might be queried. That could be inefficient/inconvenient. |
| |
| RESOLVED: Use the glBegin/EndQuery interface. |
| |
| (5) How accurate should the statistics be? |
| |
| RESOLVED: None of the statistics have to be exact, thus implementations |
| might return slightly different results for any of them. |
| |
| (6) What should this extension be called? |
| |
| DISCUSSION: This extension provides similar functionality to that of |
| D3D's pipeline statistics queries thus it makes sense to call this |
| extension similarly (even though there is a separate classification of |
| the individual queries in this specification). |
| |
| RESOLVED: ARB_pipeline_statistics_query. |
| |
| (7) Can multiple pipeline statistics queries be active at the same time? |
| |
| RESOLVED: Yes, as long as they have different targets. Otherwise it is |
| an error. |
| |
| (8) What stage the VERTICES_SUBMITTED_ARB and PRIMITIVES_SUBMITTED_ARB |
| belong to? What do they count? |
| |
| DISCUSSION: There is no separate pipeline stage introduced in the |
| specification that matches D3D's "input assembler" stage. While the |
| latest version of the GL specification mentions a "vertex puller" stage |
| in the pipeline diagram, this stage does not have a corresponding |
| chapter in the specification that introduces it. |
| |
| RESOLVED: Introduce VERTICES_SUBMITTED_ARB and PRIMITIVES_SUBMITTED_ARB |
| in chapter 10, Vertex Specification and Drawing Command. They count the |
| total number of vertices and primitives processed by the GL. Including |
| multiple instances. |
| |
| (9) What does 'number of primitives' mean in case of PRIMITIVES_SUBMITTED_- |
| ARB, GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB, CLIPPING_INPUT_- |
| PRIMITIVES_ARB, and CLIPPING_OUTPUT_PRIMITIVES_ARB queries? |
| |
| DISCUSSION: The specification heavily overloaded the term primitive. |
| E.g. a triangle strip is considered a primitive type and primitive index |
| is meant to restart the 'primitive', however, on the other hand, |
| gl_PrimitiveID is incremented for each individual triangle of a triangle |
| strip and despite a geometry shader operates on primitives, it works |
| also on the indivudal triangles of a triangle strip. |
| |
| RESOLVED: The number of individual points, lines, triangles, or patches |
| are counted (or polygons, in case of CLIPPING_OUTPUT_PRIMTIIVES_ARB). |
| |
| (10) Why doesn't GEOMETRY_SHADER_INVOCATIONS have an ARB suffix? |
| |
| RESOLVED: We reuse the existing token introduced by ARB_gpu_shader5 that |
| was previously only accepted by GetProgramiv and meant to return the |
| invocation count of instanced geometry shaders. |
| |
| (11) What does GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB count? How is it |
| different than PRIMITIVES_GENERATED? |
| |
| RESOLVED: GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB counts primitives that |
| were output by the geometry shader. All vertex streams are considered, |
| but implementations are allowed to not count primitives that aren't |
| further processed by the GL. If no goemetry shader is present then the |
| counter may or may not be incremented. |
| |
| (12) What does CLIPPING_INPUT_PRIMITIVES_ARB count? |
| |
| RESOLVED: The number of primitives that reach the primitive clipping |
| stage. However, see issue (13) for more details. |
| |
| (13) What is the result of a CLIPPING_INPUT_PRIMITIVES_ARB query in case |
| RASTERIZER_DISCARD is enabled? |
| |
| DISCUSSION: Currently RASTERIZER_DISCARD is specified to be happening |
| after primitive clipping, however, some implementations might discard |
| primitives right after the transform feedback stage if RASTERIZER_DISCARD |
| is enabled. This is perfectly legal from a spec point of view, as none |
| of the vertex post-processing operations after transform feedback have |
| any effect if RASTERIZER_DISCARD is enabled. |
| |
| RESOLVED: Allow implementations to not count clipping input and output |
| primitives if RASTERIZER_DISCARD is enabled. |
| |
| (14) What does CLIPPING_OUTPUT_PRIMITIVES_ARB count? |
| |
| DISCUSSION: The specification defines primitive clipping as an operation |
| on points, lines, or polygons. Points and lines are of no interest as |
| they always generate at most one output primitive even if clipped. |
| On the other hand, according to the specification, triangles are handled |
| as polygons and in case clipping happens new vertices are added to the |
| polygon but it still remains a single polygon. |
| |
| Actual hardware, on the other hand, is likely to support only triangle |
| rasterization so in these cases each vertex added due to clipping |
| implies the generation of another triangle. |
| |
| Also, the specification defines primitive clipping to be water-tight, |
| i.e. in theory hardware should always clip primitives that have one of |
| their vertices fall out of any of the clip planes. In practice, however, |
| this can be fairly sub-optimal as primitive clipping can be way more |
| expensive than if the non-visible parts of the primitives would be |
| discarded at e.g. the pixel ownership test, so hardware often uses |
| guardbands that allow some primitives to pass through the clipper |
| unchanged even though they partially fall outside of the clip volume. |
| |
| All of these hardware optimizations are legal from the specification's |
| point of view, but make it difficult to define the meaning of this new |
| counter without introducing severe restrictions to how a GL |
| implementation should handle certain cases. |
| |
| RESOLVED: Define CLIPPING_OUTPUT_PRIMITIVES_ARB so that they count |
| the actual number of primitives output by the primitive clipping stage |
| by the implemenation, which might include primitives output for |
| implementation dependent reasons. The only guarantees on the number |
| of output primitives are the following: |
| |
| * If at least one vertex of the primitive lies inside the clipping |
| volume, the counter is incremented by one or more. |
| * Otherwise, the counter is incremented by zero or more. |
| |
| (15) Do we need to add any language to discuss why certain shader |
| invocation counts might not match the "expected" values in practice? |
| |
| DISCUSSION: Implementations might be able to do optimizations that |
| allow avoiding the execution of certain invocations in some |
| circumstances while also might need "helper" invocations in other cases. |
| |
| RESOLVED: Add language to describe that such behavior is allowed as long |
| as the results of the rendering otherwise remain unchanged. |
| |
| (16) What should be the result of VERTEX_SHADER_INVOCATIONS_ARB, |
| TESS_CONTROL_SHADER_PATCHES_ARB, TESS_EVALUATION_SHADER_- |
| INVOCATIONS_ARB, GEOMETRY_SHADER_INVOCATIONS, FRAGMENT_SHADER_- |
| INVOCATIONS_ARB and COMPUTE_SHADER_INVOCATIONS_ARB if the current |
| program does not contain a shader of the appropriate type? |
| |
| DISCUSSION: D3D is vague about the exact specification of this scenario, |
| except that it explicitly allows geometry shader invocations count to |
| increment also if there is no geometry shader. |
| |
| In case of OpenGL, however, the programmable fragment processing stage |
| is undefined if there is no fragment shader in the current program. This |
| is because some implementations might require to run a fragment shader |
| even if the application developer does not need one. |
| |
| RESOLVED: Add language to describe that implementations are allowed to |
| increment these counters even if there isn't a current program for the |
| particular shader stage. |
| |
| (17) Due to the introduction of a lot of new query types the error section |
| of query object related commands like BeginQueryIndexed, |
| EndQueryIndexed and GetQueryIndexediv became pretty bloated. |
| Shouldn't we introduce some new tables for indexed and non-indexed |
| query types and reference those in the error sections instead? |
| |
| RESOLVED: Probably, but not as part of this extension. |
| |
| (18) What are VERTEX_SHADER_INVOCATIONS_ARB queries useful for? |
| |
| DISCUSSION: In most cases VERTEX_SHADER_INVOCATIONS_ARB queries are |
| likely to return the same results as VERTICES_SUBMITTED_ARB queries. |
| However, implementations are allowed to perform optimizations that |
| enable avoiding the re-processing of the same vertex in case of an |
| indexed draw command. This is often referred to as vertex reuse or |
| post-transform vertex cache optimization. In case such optimizations |
| are applied, the number of vertex shader invocations can be smaller |
| than the number of vertices issued. |
| |
| RESOLVED: They can be used together with VERTICES_SUBMITTED_ARB queries |
| to analyze how efficiently the index ordering takes advantage of the |
| post-transform vertex cache. |
| |
| (19) Does GEOMETRY_SHADER_INVOCATIONS queries account for instanced |
| geometry shaders? |
| |
| RESOLVED: Yes, GEOMETRY_SHADER_INVOCATIONS queries count the total |
| number of geometry shader executions, including individual invocations |
| of an instanced geometry shader. |
| |
| (20) What are CLIPPING_INPUT_PRIMITIVES_ARB and CLIPPING_OUTPUT_- |
| PRIMITIVES_ARB queries useful for? |
| |
| RESOLVED: These two types of queries can be used together to determine a |
| conservative estimate on how efficiently the primitive clipping stage is |
| used. If the rasterizer primitives count is substantially lower than the |
| clipper primitives count, it may indicate that too many primitives were |
| tried to be rendered that ended up outside of the viewport. On the other |
| hand, if the rasterizer primitives count is substantially higher than |
| the clipper primitives count, it may indicate that too many primitives |
| were clipped and primitive clipping might have become the bottleneck of |
| the rendering pipeline. |
| |
| (21) What are FRAGMENT_SHADER_INVOCATIONS_ARB queries useful for? |
| |
| DISCUSSION: In many cases the hardware can perform early per-fragment |
| tests which might result in the fragment shader not being executed. |
| These and similar optimizations might result in a lower fragment shader |
| invocation count than expected. |
| |
| RESOLVED: They can be used to analyze how efficiently the application |
| takes advantage of early per-fragment tests and other fragment shader |
| optimizations. |
| |
| (22) What is the behavior of pipeline statistics queries returning |
| information about primitive counts in case of legacy primitive types |
| like quads or polygons? |
| |
| DISCUSSION: This extension is intentionally written against the core |
| profile of the specification as defining the behavior of these queries |
| for legacy primitive types would be either non-portable or too relaxed |
| to be useful for any reasonably accurate measurement. |
| |
| RESOLVED: Undefined, as this is a core profile extension. |
| |
| (23) How do operations like Clear, TexSubImage, etc. affect the results of |
| the newly introduced queries? |
| |
| DISCUSSION: Implementations might require "helper" rendering commands be |
| issued to implement certain operations like Clear, TexSubImage, etc. |
| |
| RESOLVED: They don't. Only application submitted rendering commands |
| should have an effect on the results of the queries. |
| |
| (24) Should partial primitives be counted by submission queries? |
| |
| DISCUSSION: Consider the example of calling DrawArrays with <mode> |
| TRIANGLES and <count> of 8. |
| Should VERTICES_SUBMITTED_ARB return 6 or 8? |
| Should PRIMITIVES_SUBMITTED_ARB return 2 or 3? |
| |
| RESOLVED: Undefined, incomplete primitives and vertices of incomplete |
| primitives may or may not be counted by PRIMITIVES_SUBMITTED_ARB and |
| VERTICES_SUBMITTED_ARB queries, respectively. |
| |
| (25) What should we count in case of tessellation control shaders? |
| |
| DISCUSSION: While OpenGL tessellation control shaders are defined to |
| be invoked once per vertex, D3D defines the same shader stage to be |
| executed once per patch. |
| |
| RESOLVED: The number of patches processed by the tessellation control |
| shader stage is counted. |
| |
| (26) Should VERTICES_SUBMITTED_ARB count adjacent vertices in case of |
| primitives with adjacency? |
| |
| DISCUSSION: Implementations have different answers for this. |
| |
| RESOLVED: Allow both. It is up to the implementation whether adjacent |
| vertices are counted. |
| |
| (27) Should VERTICES_SUBMITTED_ARB count vertices multiple times in case |
| of primitive types that reuse vertices (e.g. LINE_LOOP, LINE_STRIP, |
| TRIANGLE_STRIP)? |
| |
| RESOLVED: No for strip primitives, but allow (but not require) counting |
| the first vertex twice for line loop primitives. |
| |
| Revision History |
| |
| Revision 11, 2017/07/23 (Jon Leech) |
| - Replace the long list of valid <target> parameters for |
| BeginQueryIndexed, EndQueryIndexed, and GetQueryIndexediv with a |
| reference to the list of query targets in section 4.2 (gitlab #18). |
| - Add the new query targets to those for which the <index> parameter |
| of BeginQueryIndexed and GetQueryIndexediv must be zero (gitlab |
| #26). |
| |
| Revision 10, 2014/10/30 (Daniel Rakos) |
| - Relaxed the behavior of VERTICES_SUBMITTED_ARB queries for primitives |
| with adjacency to allow counting of adjacent vertices. |
| - Relaxed the behavior of GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB queries |
| to also allow counting primitives emitted to all vertex streams. |
| |
| Revision 9, 2014/10/08 (Daniel Rakos) |
| - Specified that the vertices submitted count is only incremented for |
| vertices belonging to the main primitive in case of primitives with |
| adjacency. |
| - Relaxed the definition of VERTICES_SUBMITTED_ARB queries to allow |
| implementations to count the first vertex twice for line loop |
| primitives. |
| - Changed the definition of GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB |
| queries to only count primitives emitted to vertex streams that are |
| further processed by the GL. |
| - Added interaction with AMD_transform_feedback4. |
| |
| Revision 8, 2014/06/27 (Daniel Rakos) |
| - Renamed tessellation control shader query to TESS_CONTROL_SHADER_- |
| PATCHES_ARB and updated language respectively. |
| |
| Revision 7, 2014/05/09 (Daniel Rakos) |
| - Resolved issue (24), updated resolution of issue (5). |
| |
| Revision 6, 2014/05/06 (Daniel Rakos) |
| - Added issue (24). |
| |
| Revision 5, 2014/04/25 (Daniel Rakos) |
| - Renamed to ARB_pipeline_statistics_query. |
| - Replaced EXT suffixes with ARB ones. |
| - Resolved outstanding issues and added language to the spec to explain |
| these resolutions. |
| |
| Revision 4, 2014/04/23 (Daniel Rakos) |
| - Fixed some typos. |
| - Renamed primitive clipping queries to CLIPPING_INPUT_PRIMITIVES_EXT |
| and CLIPPING_OUTPUT_PRIMITIVES_EXT. |
| - Resolved issues (2), (9), (12), (18), and (20). |
| - Updated suggestions for issues (11), (13), (14), (15), and (16). |
| - Added issue (23). |
| |
| Revision 3, 2014/04/16 (Daniel Rakos) |
| - Major rewrite of the spec language that clarifies in what pipeline |
| stage the various queries take place and what exactly is counted. |
| - Added issues (6) through (22). |
| - Removed conformance testing section (a separate conformance test spec |
| will be created). |
| - Added state table changes. |
| - Clarified dependencies on other extensions. |
| |
| Revision 2, 2014/04/09 (Brian Paul) |
| - Break the original single 11-valued query into 11 individual queries. |
| - Added issues (2), (3), (4), and (5). |
| - Added conformance testing section. |
| |
| Revision 1, 2014/02/03 (Brian Paul) |
| - Initial revision. |