| Name |
| |
| ARB_timer_query |
| |
| Name Strings |
| |
| GL_ARB_timer_query |
| |
| Contact |
| |
| Piers Daniell, NVIDIA Corporation (pdaniell 'at' nvidia.com) |
| |
| Contributors |
| |
| Axel Mamode, Sony |
| Brian Paul, Tungsten Graphics |
| Bruce Merry, ARM |
| James Jones, NVIDIA Corporation |
| Pat Brown, NVIDIA |
| Remi Arnaud, Sony |
| |
| Notice |
| |
| Copyright (c) 2010-2013 The Khronos Group Inc. Copyright terms at |
| http://www.khronos.org/registry/speccopyright.html |
| |
| Status |
| |
| Complete. Approved by the ARB at the 2010/01/22 F2F meeting. |
| Approved by the Khronos Board of Promoters on March 10, 2010. |
| |
| Version |
| |
| Last Modified Date: August 9, 2014 |
| Revision: 13 |
| |
| Number |
| |
| ARB Extension #85 |
| |
| Dependencies |
| |
| This extension is written against the OpenGL 3.2 specification. |
| |
| Overview |
| |
| Applications can benefit from accurate timing information in a number of |
| different ways. During application development, timing information can |
| help identify application or driver bottlenecks. At run time, |
| applications can use timing information to dynamically adjust the amount |
| of detail in a scene to achieve constant frame rates. OpenGL |
| implementations have historically provided little to no useful timing |
| information. Applications can get some idea of timing by reading timers |
| on the CPU, but these timers are not synchronized with the graphics |
| rendering pipeline. Reading a CPU timer does not guarantee the completion |
| of a potentially large amount of graphics work accumulated before the |
| timer is read, and will thus produce wildly inaccurate results. |
| glFinish() can be used to determine when previous rendering commands have |
| been completed, but will idle the graphics pipeline and adversely affect |
| application performance. |
| |
| This extension provides a query mechanism that can be used to determine
| the amount of time it takes to fully complete a set of GL commands
| without stalling the rendering pipeline. It uses the query object
| mechanisms first introduced in the occlusion query extension, which allow |
| time intervals to be polled asynchronously by the application. |
| |
| IP Status |
| |
| No known IP claims. |
| |
| New Procedures and Functions |
| |
| void QueryCounter(uint id, enum target); |
| |
| void GetQueryObjecti64v(uint id, enum pname, int64 *params); |
| void GetQueryObjectui64v(uint id, enum pname, uint64 *params); |
| |
| New Tokens |
| |
| Accepted by the <target> parameter of BeginQuery, EndQuery, and |
| GetQueryiv: |
| |
| TIME_ELAPSED 0x88BF |
| |
| Accepted by the <target> parameter of GetQueryiv and QueryCounter. |
| Accepted by the <value> parameter of GetBooleanv, GetIntegerv, |
| GetInteger64v, GetFloatv, and GetDoublev: |
| |
| TIMESTAMP 0x8E28 |
| |
| Additions to Chapter 2 of the OpenGL 3.2 (Core Profile) Specification |
| (OpenGL Operation) |
| |
| (Modify table 2.1, Correspondence of command suffix letters to GL argument |
| types, p. 14) Add one new type and suffix: |
| |
| Letter Corresponding GL Type |
| ------ --------------------- |
| ui64 uint64 |
| |
| (Modify Section 2.14, Asynchronous Queries, p. 89) |
| |
| Asynchronous queries provide a mechanism to return information about the |
| processing of a sequence of GL commands. There are three query types |
| supported by the GL. Transform feedback queries (see section 2.16) return |
| information on the number of vertices and primitives processed by the GL |
| and written to one or more buffer objects. Occlusion queries (see section |
| 4.1.6) count the number of fragments or samples that pass the depth test. |
| Timer queries (section 5.4) record the amount of time needed to fully |
| process these commands or the current time of the GL. |
| |
| Additions to Chapter 3 of the OpenGL 3.2 Specification (Rasterization) |
| |
| None. |
| |
| Additions to Chapter 4 of the OpenGL 3.2 Specification (Per-Fragment |
| Operations and the Framebuffer) |
| |
| None. |
| |
| Additions to Chapter 5 of the OpenGL 3.2 Specification (Special Functions) |
| |
| (Add new Section 5.4, Timer Queries, p. 246) |
| |
| Timer queries use query objects to track the amount of time needed to |
| fully complete a set of GL commands, or to determine the current time |
| of the GL. |
| |
| When BeginQuery and EndQuery are called with a <target> of |
| TIME_ELAPSED, the GL prepares to start and stop the timer used for |
| timer queries. The timer is started or stopped when the effects from all |
| previous commands on the GL client and server state and the framebuffer |
| have been fully realized. The BeginQuery and EndQuery commands may return |
| before the timer is actually started or stopped. When the timer is
| finally stopped, the elapsed time (in nanoseconds) is written to
| the corresponding query object as the query result value, and the query |
| result for that object is marked as available. |
| |
| If the elapsed time overflows the number of bits, <n>, available to hold |
| elapsed time, its value becomes undefined. It is recommended, but not |
| required, that implementations handle this overflow case by saturating at |
| 2^n - 1. |
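
The recommended saturating behavior can be modeled in a few lines of C. This is a minimal sketch of the arithmetic only, not driver code: the helper names `counter_max` and `saturate_elapsed_ns` are hypothetical, and `bits` stands for the value an application would read back with QUERY_COUNTER_BITS.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value an n-bit unsigned counter can hold: 2^n - 1 (1 <= n <= 64). */
static uint64_t counter_max(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    /* Shifting a 64-bit value by 64 is undefined in C, so handle it apart. */
    return (bits == 64) ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/* Hypothetical helper: clamp a true elapsed time (in nanoseconds) to what an
 * n-bit counter could report if the implementation saturates on overflow.
 * Saturation is recommended by the text above, not required. */
static uint64_t saturate_elapsed_ns(uint64_t elapsed_ns, unsigned bits)
{
    uint64_t max = counter_max(bits);
    return (elapsed_ns > max) ? max : elapsed_ns;
}
```

Because saturation is only recommended, an application that reads back exactly `counter_max(bits)` can treat the value as "at least this long", but it cannot rely on every implementation reporting overflow this way.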
| |
| A timer query object is created with the command |
| |
| void QueryCounter(uint id, enum target); |
| |
| <target> must be TIMESTAMP. If <id> is an unused query object name, the |
| name is marked as used and associated with a new query object of type |
| TIMESTAMP. Otherwise <id> must be the name of an existing query object |
| of that type. |
| |
| When QueryCounter is called, the GL records the current time into |
| the corresponding query object. The time is recorded after all previous |
| commands on the GL client and server state and the framebuffer have been |
| fully realized. When the time is recorded, the query result for that |
| object is marked available. QueryCounter timer queries can be used
| within a BeginQuery / EndQuery block where the <target> is TIME_ELAPSED,
| and they do not affect the result of that query object.
| |
| ** core profile only |
| QueryCounter fails and an INVALID_OPERATION error is generated if <id>
| is not a name returned from a previous call to GenQueries, or if such a |
| name has since been deleted with DeleteQueries. |
| ** end core profile only |
| |
| If <id> is already in use within a BeginQuery / EndQuery block, or if |
| <id> is the name of an existing query object whose type does not match |
| <target>, an INVALID_OPERATION error is generated. |
| |
| The current time of the GL may be queried by calling GetIntegerv or |
| GetInteger64v with the symbolic constant TIMESTAMP. This will return the |
| GL time after all previous commands have reached the GL server but have
| not necessarily executed yet. By using a combination of this synchronous
| get command and the asynchronous timestamp query object target, |
| applications can measure the latency between when commands reach the GL |
| server and when they are realized in the framebuffer. |
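
The latency measurement described above reduces to one subtraction once both values are in hand. The sketch below assumes the application has already read the synchronous TIMESTAMP value (for example with GetInteger64v) and the QUERY_RESULT of a TIMESTAMP query object; the helper name `frame_latency_ns` and the choice to clamp at zero are illustrative assumptions, not part of this extension.

```c
#include <stdint.h>

/* flush_time_ns: GL time sampled synchronously with the TIMESTAMP get.
 * end_time_ns:   result of a TIMESTAMP query object issued with QueryCounter.
 * Returns how long the queued work took to complete after the synchronous
 * sample, or 0 if the GL had already finished by then. */
static uint64_t frame_latency_ns(uint64_t flush_time_ns, uint64_t end_time_ns)
{
    if (end_time_ns <= flush_time_ns) {
        return 0; /* the GL was not lagging behind the application */
    }
    return end_time_ns - flush_time_ns;
}
```

Both inputs come from TIMESTAMP, so they share a clock; the sketch does not attempt to handle counter wrap-around.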
| |
| Additions to Chapter 6 of the OpenGL 3.2 Specification (State and State
| Requests) |
| |
| (Modify Section 6.1.6, Asynchronous Queries, p. 255) |
| |
| Section 6.1.6, Asynchronous Queries |
| |
| The command |
| |
| boolean IsQuery(uint id); |
| |
| returns TRUE if <id> is the name of a query object. If <id> is zero, or if |
| <id> is a non-zero value that is not the name of a query object, IsQuery |
| returns FALSE. |
| |
| Information about a query target can be queried with the command |
| |
| void GetQueryiv(enum target, enum pname, int *params); |
| |
| <target> identifies the query target and can be SAMPLES_PASSED for |
| occlusion queries, PRIMITIVES_GENERATED and |
| TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN for primitive queries, or |
| TIME_ELAPSED or TIMESTAMP for timer queries. |
| |
| If <pname> is CURRENT_QUERY, the name of the currently active query for |
| <target>, or zero if no query is active, will be placed in <params>. |
| |
| If <pname> is QUERY_COUNTER_BITS, the implementation-dependent number of |
| bits used to hold the query result for <target> will be placed in |
| <params>. The number of query counter bits may be zero, in which case |
| the counter contains no useful information. |
| |
| For primitive queries (PRIMITIVES_GENERATED and |
| TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN) if the number of bits is non-zero, |
| the minimum number of bits allowed is 32. |
| |
| For occlusion queries (SAMPLES_PASSED), if the number of bits is |
| non-zero, the minimum number of bits allowed is a function of the |
| implementation's maximum viewport dimensions (MAX_VIEWPORT_DIMS). The |
| counter must be able to represent at least two overdraws for every pixel |
| in the viewport. The formula to compute the allowable minimum value |
| (where <n> is the minimum number of bits) is: |
| |
| n = min(32, ceil(log_2(maxViewportWidth * maxViewportHeight * 2))). |
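
As a worked instance of the formula above, the following C sketch evaluates it with integer arithmetic only. The function names are hypothetical, and the two dimensions stand for the values an application would obtain from MAX_VIEWPORT_DIMS.

```c
#include <assert.h>
#include <stdint.h>

/* ceil(log2(x)) for x >= 1, computed without floating point. */
static unsigned ceil_log2_u64(uint64_t x)
{
    unsigned bits = 0;
    uint64_t pow2 = 1;
    while (pow2 < x && bits < 64) {
        pow2 <<= 1; /* unsigned overflow wraps to 0; the bits < 64 test ends the loop */
        bits++;
    }
    return bits;
}

/* n = min(32, ceil(log2(maxViewportWidth * maxViewportHeight * 2))) */
static unsigned min_occlusion_counter_bits(uint32_t max_w, uint32_t max_h)
{
    assert(max_w >= 1 && max_h >= 1);
    uint64_t pixels = (uint64_t)max_w * max_h; /* fits in 64 bits */
    if (pixels > UINT64_MAX / 2) {
        return 32; /* doubling would overflow; the result is capped at 32 anyway */
    }
    unsigned n = ceil_log2_u64(pixels * 2);
    return (n < 32) ? n : 32;
}
```

For example, maximum viewport dimensions of 4096 x 4096 give 2^25 samples for two overdraws, so the minimum is 25 bits, while 65536 x 65536 would need 33 bits and is capped at 32.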
| |
| For timer queries (TIME_ELAPSED and TIMESTAMP), if the number
| of bits is non-zero, the minimum number of bits allowed is 30, which
| will allow at least 1 second of timing.
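
The relationship between counter width and the longest measurable interval is simple arithmetic, sketched here in C. The helper name `max_interval_seconds` is hypothetical; `bits` is the value reported for QUERY_COUNTER_BITS, and the result assumes the counter holds nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Longest interval, in seconds, that an n-bit nanosecond counter can hold. */
static double max_interval_seconds(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    uint64_t max_ns = (bits == 64) ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
    return (double)max_ns / 1e9; /* 1e9 nanoseconds per second */
}
```

A 30-bit counter holds up to 2^30 - 1 ns, or about 1.07 seconds, which is where the "at least 1 second" figure comes from; a 32-bit counter holds about 4.29 seconds.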
| |
| The state of a query object can be queried with the commands |
| |
| void GetQueryObjectiv(uint id, enum pname, int *params); |
| void GetQueryObjectuiv(uint id, enum pname, uint *params); |
| void GetQueryObjecti64v(uint id, enum pname, int64 *params); |
| void GetQueryObjectui64v(uint id, enum pname, uint64 *params); |
| |
| If <id> is not the name of a query object, or if the query object named |
| by <id> is currently active, then an INVALID_OPERATION error is |
| generated. |
| |
| If <pname> is QUERY_RESULT, then the query object's result |
| value is returned as a single integer in <params>. If the value is so |
| large in magnitude that it cannot be represented with the requested type, |
| then the nearest value representable using the requested type is |
| returned. If the number of query counter bits for target is zero, then |
| the result is returned as a single integer with the value zero. |
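
The "nearest representable value" rule matters when a timer result wider than 32 bits is read through the 32-bit entry points. The C sketch below models what an application could expect to observe in that case; it is an illustration of the rule, not implementation code, and the function names are hypothetical.

```c
#include <stdint.h>

/* Model of reading a 64-bit result through GetQueryObjectiv:
 * values above INT32_MAX come back as INT32_MAX. */
static int32_t clamp_result_to_int32(uint64_t result)
{
    return (result > (uint64_t)INT32_MAX) ? INT32_MAX : (int32_t)result;
}

/* Model of reading a 64-bit result through GetQueryObjectuiv:
 * values above UINT32_MAX come back as UINT32_MAX. */
static uint32_t clamp_result_to_uint32(uint64_t result)
{
    return (result > UINT32_MAX) ? UINT32_MAX : (uint32_t)result;
}
```

Under this model a 5-second elapsed time (5,000,000,000 ns) would be reported as 2,147,483,647 through the signed 32-bit query and 4,294,967,295 through the unsigned one, which is why the 64-bit entry points are the safer choice for timer results.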
| |
| There may be an indeterminate delay before the above query returns. If |
| <pname> is QUERY_RESULT_AVAILABLE, FALSE is returned if such a delay |
| would be required; otherwise TRUE is returned. It must always be true |
| that if any query object returns a result available of TRUE, all queries |
| of the same type issued prior to that query must also return TRUE. |
| |
| Querying the state for any given query object forces that query to
| complete within a finite amount of time.
| |
| If multiple queries are issued using the same object name prior to |
| calling GetQueryObject[u]i[64]v, the result and availability information |
| returned will always be from the last query issued. The results from any |
| queries before the last one will be lost if they are not retrieved before |
| starting a new query on the same <target> and <id>. |
| |
| Interactions with NV_present_video and NV_video_capture |
| |
| The GL timer recorded by this extension is the same timer as that used |
| by the NV_present_video and NV_video_capture extensions. This allows |
| the timer to be used with any of these extensions interchangeably. |
| |
| Interactions with the Compatibility Profile |
| |
| In the compatibility profile, query objects support application-provided |
| names, and the language requiring an error if <id> is not a name
| returned from GenQueries is removed. This is noted in the body text |
| above. |
| |
| Errors |
| |
| The error INVALID_ENUM is generated if BeginQuery or EndQuery is called |
| where <target> is not SAMPLES_PASSED, |
| TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN or TIME_ELAPSED. |
| |
| The error INVALID_ENUM is generated if GetQueryiv is called where |
| <target> is not SAMPLES_PASSED, TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN, |
| TIME_ELAPSED or TIMESTAMP. |
| |
| The error INVALID_ENUM is generated if QueryCounter is called where |
| <target> is not TIMESTAMP. |
| |
| The error INVALID_OPERATION is generated if QueryCounter is called |
| on a query object that is already in use inside a BeginQuery/EndQuery. |
| |
| The error INVALID_OPERATION is generated if QueryCounter is called on |
| a query object whose type is not TIMESTAMP. |
| |
| (in the core profile only) |
| The error INVALID_OPERATION is generated if QueryCounter is called |
| where <id> is not a name returned from a previous call to GenQueries, |
| or if such a name has since been deleted with DeleteQueries. |
| |
| The error INVALID_OPERATION is generated if GetQueryObjecti64v or |
| GetQueryObjectui64v is called where <id> is not the name of a query |
| object. |
| |
| The error INVALID_OPERATION is generated if GetQueryObjecti64v or |
| GetQueryObjectui64v is called where <id> is the name of a currently |
| active query object. |
| |
| The error INVALID_ENUM is generated if GetQueryObjecti64v or |
| GetQueryObjectui64v is called where <pname> is not QUERY_RESULT or |
| QUERY_RESULT_AVAILABLE. |
| |
| New State |
| |
| None. |
| |
| Examples |
| |
| (1) Here is some rough sample code that demonstrates the intended usage |
| of this extension. |
| |
| GLuint queries[N]; |
| GLint available = 0; |
| // timer queries can contain more than 32 bits of data, so always |
| // query them using the 64 bit types to avoid overflow |
| GLuint64 timeElapsed = 0; |
| |
| // Create N query objects.
| glGenQueries(N, queries); |
| |
| // Start query 1 |
| glBeginQuery(GL_TIME_ELAPSED, queries[0]); |
| |
| // Draw object 1 |
| .... |
| |
| // End query 1 |
| glEndQuery(GL_TIME_ELAPSED); |
| |
| ... |
| |
| // Start query N |
| glBeginQuery(GL_TIME_ELAPSED, queries[N-1]); |
| |
| // Draw object N |
| .... |
| |
| // End query N |
| glEndQuery(GL_TIME_ELAPSED); |
| |
| // Wait for all results to become available |
| while (!available) { |
| glGetQueryObjectiv(queries[N-1], GL_QUERY_RESULT_AVAILABLE, &available); |
| } |
| |
| for (int i = 0; i < N; i++) {
| // See how much time the rendering of object i took in nanoseconds. |
| glGetQueryObjectui64v(queries[i], GL_QUERY_RESULT, &timeElapsed); |
| |
| // Do something useful with the time. Note that care should be |
| // taken to use all significant bits of the result, not just the |
| // least significant 32 bits. |
| AdjustObjectLODBasedOnDrawTime(i, timeElapsed); |
| } |
| |
| This example is sub-optimal in that it stalls at the end of every |
| frame to wait for query results. Ideally, the collection of results |
| would be delayed one frame to minimize the amount of time spent |
| waiting for the GPU to finish rendering. |
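
One way to delay collection by a frame is to keep a small ring of query objects and read back an older slot while issuing into the current one. The sketch below shows only the slot bookkeeping, under the assumption of two frames in flight; `FRAMES_IN_FLIGHT`, `issue_slot`, `has_readback`, and `readback_slot` are hypothetical names, and the GL calls appear only in comments.

```c
#include <assert.h>
#include <stddef.h>

enum { FRAMES_IN_FLIGHT = 2 }; /* assumed ring depth; larger values delay readback further */

/* Slot used for the query issued during frame `frame`, e.g.
 * glBeginQuery(GL_TIME_ELAPSED, queries[issue_slot(frame)]). */
static size_t issue_slot(unsigned long frame)
{
    return (size_t)(frame % FRAMES_IN_FLIGHT);
}

/* The first FRAMES_IN_FLIGHT - 1 frames have no older result to read. */
static int has_readback(unsigned long frame)
{
    return frame >= FRAMES_IN_FLIGHT - 1;
}

/* Slot to read during frame `frame`: the query issued
 * FRAMES_IN_FLIGHT - 1 frames earlier, e.g.
 * glGetQueryObjectui64v(queries[readback_slot(frame)], GL_QUERY_RESULT, &ns). */
static size_t readback_slot(unsigned long frame)
{
    assert(has_readback(frame));
    return (size_t)((frame + 1) % FRAMES_IN_FLIGHT);
}
```

With a depth of two, frame 1 reads the query issued in frame 0 while issuing into the other slot, so the result has had a full frame to become available. Checking QUERY_RESULT_AVAILABLE before reading is still advisable, since the delay is not guaranteed to be enough.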
| |
| (2) This example is basically the same as the example above but uses |
| QueryCounter instead. |
| |
| GLuint queries[N+1]; |
| GLint available = 0; |
| // timer queries can contain more than 32 bits of data, so always |
| // query them using the 64 bit types to avoid overflow |
| GLuint64 timeStart, timeEnd, timeElapsed = 0; |
| |
| // Create N+1 query objects.
| glGenQueries(N+1, queries); |
| |
| // Query current timestamp 1 |
| glQueryCounter(queries[0], GL_TIMESTAMP); |
| |
| // Draw object 1 |
| .... |
| |
| // Query current timestamp N |
| glQueryCounter(queries[N-1], GL_TIMESTAMP); |
| |
| // Draw object N |
| .... |
| |
| // Query current timestamp N+1 |
| glQueryCounter(queries[N], GL_TIMESTAMP); |
| |
| // Wait for all results to become available |
| while (!available) { |
| glGetQueryObjectiv(queries[N], GL_QUERY_RESULT_AVAILABLE, &available); |
| } |
| |
| for (int i = 0; i < N; i++) {
| // See how much time the rendering of object i took in nanoseconds. |
| glGetQueryObjectui64v(queries[i], GL_QUERY_RESULT, &timeStart); |
| glGetQueryObjectui64v(queries[i+1], GL_QUERY_RESULT, &timeEnd); |
| timeElapsed = timeEnd - timeStart; |
| |
| // Do something useful with the time. Note that care should be |
| // taken to use all significant bits of the result, not just the |
| // least significant 32 bits. |
| AdjustObjectLODBasedOnDrawTime(i, timeElapsed); |
| } |
| |
| (3) This example demonstrates how to measure the latency between GL |
| commands reaching the server and being realized in the framebuffer. |
| |
| /* Submit a frame of rendering commands */ |
| while (!doneRendering) { |
| ... |
| glDrawElements(...); |
| } |
| |
| /* |
| * Measure rendering latency: |
| * |
| * Some commands may have already been submitted to hardware, |
| * and some of those may have already completed. The goal is |
| * to measure the time it takes for the remaining commands to |
| * complete, thereby measuring how far behind the app the GPU |
| * is lagging, but without synchronizing the GPU with the CPU. |
| */ |
| |
| /* Queue a query to find out when the frame finishes on the GL */ |
| glQueryCounter(endFrameQuery, GL_TIMESTAMP); |
| |
| /* Get the current GL time without stalling the GL */ |
| glGetInteger64v(GL_TIMESTAMP, &flushTime);
| |
| /* Finish the frame, submitting outstanding commands to the GL */ |
| SwapBuffers(); |
| |
| /* Render another frame */ |
| |
| /* |
| * Later, compare the query result of <endFrameQuery> |
| * and <flushTime> to measure the latency of the frame |
| */ |
| |
| |
| Issues from EXT_timer_query |
| |
| (1) What time interval is being measured? |
| |
| RESOLVED: The timer starts when all commands prior to BeginQuery() have |
| been fully executed. At that point, everything that should be drawn by |
| those commands has been written to the framebuffer. The timer stops |
| when all commands prior to EndQuery() have been fully executed. |
| |
| (2) What unit of time will time intervals be returned in? |
| |
| RESOLVED: Nanoseconds (10^-9 seconds). This unit of measurement allows |
| for reasonably accurate timing of even small blocks of rendering |
| commands. The granularity of the timer is implementation-dependent. A |
| 32-bit query counter can express intervals of up to approximately 4 |
| seconds. |
| |
| (3) What should be the minimum number of counter bits for timer queries? |
| |
| RESOLVED: 30 bits, which will allow timing sections that take up to 1 |
| second to render. |
| |
| (4) How are counter results of more than 32 bits returned? |
| |
| RESOLVED: Via two new datatypes, int64EXT and uint64EXT, and their |
| corresponding GetQueryObject entry points. These types hold integer |
| values and have a minimum bit width of 64. |
| |
| UPDATE: This resolution was relevant for EXT_timer_query and OpenGL 2.0. |
| OpenGL 3.2 now has int64 and uint64 datatypes as part of the core spec. |
| |
| (5) Should the extension measure total time elapsed between the full |
| completion of the BeginQuery and EndQuery commands, or just time |
| spent in the graphics library? |
| |
| RESOLVED: This extension will measure the total time elapsed between |
| the full completion of these commands. Future extensions may implement |
| a query to determine time elapsed at different stages of the graphics |
| pipeline. |
| |
| (6) This extension introduces a second query type supported by |
| BeginQuery/EndQuery. Can multiple query types be active |
| simultaneously? |
| |
| RESOLVED: Yes; an application may perform an occlusion query and a |
| timer query simultaneously. An application can not perform multiple |
| occlusion queries or multiple timer queries simultaneously. An |
| application also can not use the same query object for an occlusion |
| query and a timer query simultaneously. |
| |
| (7) Do query objects have a query type permanently associated with them? |
| |
| RESOLVED: No. A single query object can be used to perform different |
| types of queries, but not at the same time. |
| |
| Having a fixed type for each query object simplifies some aspects of the |
| implementation -- not having to deal with queries with different result |
| sizes, for example. It would also mean that BeginQuery() with a query |
| object of the "wrong" type would result in an INVALID_OPERATION error. |
| |
| UPDATE: This resolution was relevant for EXT_timer_query and OpenGL 2.0. |
| Since EXT_transform_feedback has since been incorporated into the core, |
| the resolution is that BeginQuery will generate error INVALID_OPERATION |
| if <id> represents a query object of a different type. |
| |
| (8) How predictable/repeatable are the results returned by the timer |
| query? |
| |
| RESOLVED: In general, the amount of time needed to render the same |
| primitives should be fairly constant. But there may be many other |
| system issues (e.g., context switching on the CPU and GPU, virtual |
| memory page faults, memory cache behavior on the CPU and GPU) that can |
| cause times to vary wildly. |
| |
| Note that modern GPUs are generally highly pipelined, and may be |
| processing different primitives in different pipeline stages |
| simultaneously. In this extension, the timers start and stop when the |
| BeginQuery/EndQuery commands reach the bottom of the rendering pipeline. |
| What that means is that by the time the timer starts, the GL driver on |
| the CPU may have started work on GL commands issued after BeginQuery, |
| and the higher pipeline stages (e.g., vertex transformation) may have |
| started as well. |
| |
| (9) What should the new 64 bit integer type be called? |
| |
| RESOLVED: The new types will be called GLint64EXT/GLuint64EXT. The new
| command suffixes will be i64 and ui64. These names clearly convey the |
| minimum size of the types. These types are similar to the C99 standard |
| type int_least64_t, but we use names similar to the C99 optional type |
| int64_t for simplicity. |
| |
| UPDATE: This resolution was relevant for EXT_timer_query and OpenGL 2.0. |
| OpenGL 3.2 now has int64 and uint64 datatypes as part of the core spec. |
| The i64 suffix already exists in OpenGL 3.2 and the ui64 suffix has been |
| added as part of this extension. |
| |
| Issues |
| |
| (10) What about tile-based implementations? The effects of a command are |
| not complete until the frame is completely rendered. Timing recorded |
| before the frame is complete may not be what developers expect. Also |
| the amount of time needed to render the same primitives is not |
| consistent, which conflicts with issue (8) above. The time depends on |
| how early or late in the scene it is placed. |
| |
| RESOLVED: The current language, as written, adequately supports
| tile-based rendering. Developers are warned that using timers on
| tile-based implementations may not produce the results they expect,
| since rendering is not done in a linear order. Timing results are
| calculated when the frame is completed and may depend on how early or
| late in the scene the timed commands are placed.
| |
| (11) Can the GL implementation use different clocks to implement the |
| TIME_ELAPSED and TIMESTAMP queries? |
| |
| RESOLVED: Yes, the implementation can use different internal clocks to |
| implement TIME_ELAPSED and TIMESTAMP. If different clocks are |
| used it is possible there is a slight discrepancy when comparing queries |
| made from TIME_ELAPSED and TIMESTAMP; they may have slight |
| differences when both are used to measure the same sequence. However, this |
| is unlikely to affect real applications since comparing the two queries is |
| not expected to be useful. |
| |
| (12) Why do BeginQuery and QueryCounter have the same arguments in the |
| opposite order? |
| |
| RESOLVED: Due to an unfortunate oversight, which cannot be fixed at |
| this point. |
| |
| |
| Revision History |
| |
| Rev. Date Author Changes |
| ---- ------------ -------- ------------------------------------------- |
| 13 Aug 9, 2014 Jon Leech Fix typo in example 3 (bug 12552). |
| |
| 12 Jul 11, 2013 Jon Leech Change type of queries[] in sample code to |
| GLuint (public bug 432). |
| |
| 11 Apr 13, 2012 Jon Leech Clean up error language, add error for |
| query objects which are not of type |
| TIMESTAMP, and add issue 12 (Khronos |
| internal bug 7662) |
| |
| 10 June 3, 2011 dkoch Add INVALID_OPERATION error when calling |
| QueryCounter with a non-generated <id> in |
| the core profile (Khronos internal bug 7662). |
| |
| 9 Dec 18, 2009 pdaniell Remove ambiguous language about "interruptions
| to the GL". Rename CURRENT_TIME to TIMESTAMP. |
| |
| 8 Dec 10, 2009 Jon Leech Improve description of QueryCounter command. |
| |
| 7 Dec 10, 2009 Jon Leech Replace non-ASCII punctuation. |
| |
| 6 Dec 07, 2009 pdaniell Remove ARB suffix from new tokens for core. |
| |
| 5 Oct 29, 2009 pdaniell TIMESTAMP_ARB renamed to CURRENT_TIME_ARB. |
| Issue (11) raised about using different |
| clocks to implement CURRENT_TIME and |
| TIME_ELAPSED queries. Add example (3) for |
| calculating the GL latency. |
| |
| 4 Oct 23, 2009 pdaniell Add support for TIMESTAMP_ARB as a <value> |
| to Get* to allow synchronous time query. |
| |
| 3 Oct 15, 2009 pdaniell Resolved Issue (10). Added Interactions |
| with NV_present_video and NV_video_capture |
| section. |
| |
| 2 Oct 15, 2009 pdaniell Clarified some of the old EXT_timer_query |
| Issues wrt OpenGL 3.2. Added specification |
| for the TIMESTAMP_ARB time. Added new Issue |
| for tile-based implementations. Issue 3 |
| resolution added to the spec. |
| |
| 1 Oct 13, 2009 pdaniell Initial revision based on EXT_timer_query |