Name
ARM_thread_limit_hint
Name Strings
cl_arm_thread_limit_hint
Contributors
Robert Elliott, ARM Ltd.
Kévin Petit, ARM Ltd.
Contact
Kévin Petit, ARM Ltd. (kevin.petit 'at' ARM.com)
IP Status
No claims or disclosures are known to exist.
Version
Revision: #3, Sept 28th, 2017
Number
OpenCL Extension #41
Status
Complete.
Extension Type
OpenCL device extension
Dependencies
Requires OpenCL version 1.0 or later.
Overview
This extension enables an application to provide a hint for the maximum
number of threads allowed to run concurrently on a compute unit. On devices
that support it, this limits the number of threads used by a kernel
instance, which lowers pressure on caches.
Header File
No host changes needed.
Glossary
No new terminology is introduced by this extension.
New Types
None
New Procedures and Functions
The new kernel qualifier
__attribute__((arm_thread_limit_hint(N)))
Description
The attribute can be specified as part of the declaration of a kernel and
provides a hint to the implementation that using fewer threads is desired.
The implementation will accept any number between 0 and
CL_DEVICE_MAX_WORK_GROUP_SIZE and choose the closest number that can be
used.
If the hint is larger than the maximum workgroup size supported by the
kernel for that device, it is not honored.
If the hint is smaller than the requested workgroup size for the kernel
instance, it is not honored.
If the hint is not honored, a warning is reported through context_notify,
the notification callback registered when the context is created.
The hint will be honored on devices which support this feature.
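The two "not honored" rules above amount to a range check. The following C sketch is illustrative only: the function name and its parameters are hypothetical and not part of the extension or of any OpenCL API. It assumes that the kernel's maximum workgroup size is the value a host would read with clGetKernelWorkGroupInfo (CL_KERNEL_WORK_GROUP_SIZE) and that the requested workgroup size is the product of the local work size dimensions passed at enqueue. It does not model how an implementation picks "the closest number that can be used".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of the extension or of any OpenCL API.
 * It restates the two "not honored" rules from the description:
 *   - the hint is larger than the kernel's maximum workgroup size
 *     on the device;
 *   - the hint is smaller than the workgroup size requested for the
 *     kernel instance.
 * A device without support for the feature may ignore the hint anyway. */
static bool thread_limit_hint_may_be_honored(size_t hint,
                                             size_t kernel_max_wg_size,
                                             size_t requested_wg_size)
{
    /* Precondition of this sketch: the launch itself is valid. */
    assert(requested_wg_size <= kernel_max_wg_size);

    if (hint > kernel_max_wg_size)
        return false;   /* larger than the kernel supports on the device */
    if (hint < requested_wg_size)
        return false;   /* smaller than the requested workgroup size */
    return true;
}
```

Under these assumptions, a hint of 64 with a requested workgroup size of 64 passes the check, while a hint of 32 with the same request does not.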
New Tokens
OpenCL kernel code now has access to:
#pragma OPENCL EXTENSION cl_arm_thread_limit_hint : enable
The preprocessor define cl_arm_thread_limit_hint is also present, so kernel
code can test for the extension with #ifdef.
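The define covers detection inside kernel code. A host application that wants to know in advance whether the hint can take effect could, as an optional step, look for the name string in the device's space-separated extension list. The sketch below shows only the string matching; the helper name has_extension is hypothetical, and the list is assumed to be the string returned by clGetDeviceInfo with CL_DEVICE_EXTENSIONS (that call is omitted so the sketch compiles without the OpenCL headers).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` appears as a whole,
 * space-separated token in `ext_list`. Matching whole tokens avoids a
 * false positive when `name` is only a prefix of a longer extension name. */
static bool has_extension(const char *ext_list, const char *name)
{
    assert(ext_list != NULL && name != NULL);

    size_t name_len = strlen(name);
    if (name_len == 0)
        return false;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[name_len] == '\0') || (p[name_len] == ' ');
        if (starts_token && ends_token)
            return true;
        p++;  /* keep scanning after a partial match */
    }
    return false;
}
```

A call such as has_extension(extensions, "cl_arm_thread_limit_hint") then tells the host whether the device advertises the extension. The check is informational only, since the kernel-side #ifdef already keeps the source portable.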
Interactions with other extensions
None
Issues
None
Sample Code
The following is a basic example of use; nothing else is required for
the extension to function:
// Check for extension and define a throttle value if it is present. This
// is portable to drivers or devices without support for the extension.
#ifdef cl_arm_thread_limit_hint
#pragma OPENCL EXTENSION cl_arm_thread_limit_hint : enable
#define THROTTLE_ATTRIBUTE __attribute__((arm_thread_limit_hint(64)))
#else
#define THROTTLE_ATTRIBUTE
#endif
kernel THROTTLE_ATTRIBUTE void throttled_kernel(global int *in, global int *out)
{
    // Kernel body ...
}
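The sample above covers the kernel side, which is all the extension needs. Because a hint that is not honored is reported as a warning through the context notification callback, a host application may optionally register one to see such warnings. The following C sketch is an assumption-laden illustration: the names on_context_notify and notify_stats are hypothetical, the callback uses the pfn_notify signature of clCreateContext, and CL_CALLBACK is defined empty here only so that the sketch compiles without the OpenCL headers. The extension does not specify the wording of the warning, so the callback simply logs whatever text it receives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* CL_CALLBACK normally comes from the OpenCL headers; it is defined empty
 * here only so that this sketch compiles on its own. */
#ifndef CL_CALLBACK
#define CL_CALLBACK
#endif

/* Hypothetical user data: counts the notifications received. */
struct notify_stats {
    unsigned count;
};

/* Uses the pfn_notify signature of clCreateContext. */
static void CL_CALLBACK on_context_notify(const char *errinfo,
                                          const void *private_info,
                                          size_t cb,
                                          void *user_data)
{
    (void)private_info;  /* implementation-defined payload, unused here */
    (void)cb;            /* size of that payload, unused here */
    assert(user_data != NULL);

    struct notify_stats *stats = user_data;
    stats->count++;
    fprintf(stderr, "OpenCL notification: %s\n",
            errinfo ? errinfo : "(none)");
}
```

To use it, a host would pass on_context_notify and a pointer to a notify_stats object as the pfn_notify and user_data arguments of clCreateContext. The implementation may invoke the callback asynchronously, so a real application would need to synchronize access to the counter.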
Conformance Tests
None
Revision History
Revision: #1, Feb 2nd, 2015 - Initial revision
Revision: #2, Feb 23rd, 2015 - Tidied up some of the language, added _hint
to the extension name for consistency with other extensions.
Revision: #3, Sept 28th, 2017 - Relaxed the constraints on the number of
threads accepted. Clarified the wording.