Add cl_arm_integer_dot_product extension

Signed-off-by: Kevin Petit <kevin.petit@arm.com>
diff --git a/extensions/arm/cl_arm_integer_dot_product.txt b/extensions/arm/cl_arm_integer_dot_product.txt
new file mode 100644
index 0000000..6c22f20
--- /dev/null
+++ b/extensions/arm/cl_arm_integer_dot_product.txt
@@ -0,0 +1,113 @@
+Name
+
+    ARM_integer_dot_product
+
+Name Strings
+
+    cl_arm_integer_dot_product_int8
+    cl_arm_integer_dot_product_accumulate_int8
+    cl_arm_integer_dot_product_accumulate_int16
+
+Contributors
+
+    Kevin Petit, ARM Ltd.
+    Abel Bernabeu, ARM Ltd.
+    Giridhar Tammana, ARM Ltd.
+
+Contact
+
+    Kevin Petit, ARM Ltd. (kevin.petit 'at' ARM.com)
+
+IP Status
+
+    No claims or disclosures are known to exist.
+
+Version
+
+    Revision: #3, November 17th, 2017
+
+Number
+
+    OpenCL Extension #?
+
+Status
+
+    Complete.
+
+Extension Type
+
+    OpenCL device extension
+
+Dependencies
+
+    Requires OpenCL version 1.2 or later.
+
+Overview
+
+    This extension adds built-in functions giving direct access to specialised
+    integer dot product instructions that are supported on some devices. An
+    application wishing to support a fallback path for devices where a specific
+    function is not available may use the following pattern:
+
+        #ifdef cl_arm_integer_dot_product_XXX
+        #pragma OPENCL EXTENSION cl_arm_integer_dot_product_XXX : enable
+            code using arm_dot_xxx()
+        #else
+            alternative implementation
+        #endif
+
+Header File
+
+    cl_ext.h
+
+New Procedures and Functions
+
+    Built-in Functions
+
+        int arm_dot(char4 a, char4 b);
+        uint arm_dot(uchar4 a, uchar4 b);
+
+    Description
+
+        These functions are available when cl_arm_integer_dot_product_int8 is reported.
+        The cl_arm_integer_dot_product_int8 preprocessor macro is then defined.
+
+        The value returned is:
+
+            (a.x * b.x) + (a.y * b.y) + (a.z * b.z) + (a.w * b.w).
+
+        Operands are zero- or sign-extended to 32-bit before the multiplications.
+
+    Built-in Functions
+
+        int arm_dot_acc(char4 a, char4 b, int acc);
+        uint arm_dot_acc(uchar4 a, uchar4 b, uint acc);
+
+    Description
+
+        These functions are available when cl_arm_integer_dot_product_accumulate_int8
+        is reported. The cl_arm_integer_dot_product_accumulate_int8 preprocessor
+        macro is then defined.
+
+        The value returned is:
+
+            acc + [ (a.x * b.x) + (a.y * b.y) + (a.z * b.z) + (a.w * b.w) ].
+
+        Operands are zero- or sign-extended to 32-bit before the multiplications.
+
+    Built-in Functions
+
+        int arm_dot_acc(short2 a, short2 b, int acc);
+        uint arm_dot_acc(ushort2 a, ushort2 b, uint acc);
+
+    Description
+
+        These functions are available when cl_arm_integer_dot_product_accumulate_int16
+        is reported. The cl_arm_integer_dot_product_accumulate_int16 preprocessor
+        macro is then defined.
+
+        The value returned is:
+
+            acc + [ (a.x * b.x) + (a.y * b.y) ].
+
+        Operands are zero- or sign-extended to 32-bit before the multiplications.
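
Reference note (not part of the patch): the arithmetic given in the Description sections can be sketched in plain host-side C, which may be handy as a starting point for the "alternative implementation" branch of the fallback pattern or for checking results. This is only a sketch under stated assumptions. The `*_t` struct types and the `ref_*` function names are hypothetical stand-ins, not part of the extension or of any OpenCL header. The spec text does not say what happens when the sum overflows 32 bits, so the signed variants below assume the result fits in an `int32_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side stand-ins for the OpenCL C vector types. */
typedef struct { int8_t   x, y, z, w; } char4_t;
typedef struct { uint8_t  x, y, z, w; } uchar4_t;
typedef struct { int16_t  x, y; } short2_t;
typedef struct { uint16_t x, y; } ushort2_t;

/* arm_dot(char4, char4): sign-extend each lane to 32 bits, multiply, sum. */
int32_t ref_dot_char4(char4_t a, char4_t b)
{
    return (int32_t)a.x * (int32_t)b.x + (int32_t)a.y * (int32_t)b.y
         + (int32_t)a.z * (int32_t)b.z + (int32_t)a.w * (int32_t)b.w;
}

/* arm_dot(uchar4, uchar4): zero-extend each lane to 32 bits, multiply, sum. */
uint32_t ref_dot_uchar4(uchar4_t a, uchar4_t b)
{
    return (uint32_t)a.x * (uint32_t)b.x + (uint32_t)a.y * (uint32_t)b.y
         + (uint32_t)a.z * (uint32_t)b.z + (uint32_t)a.w * (uint32_t)b.w;
}

/* arm_dot_acc(char4, char4, int): acc plus the 4-lane signed dot product.
 * Assumption: the sum fits in 32 bits (overflow is not modelled here). */
int32_t ref_dot_acc_char4(char4_t a, char4_t b, int32_t acc)
{
    return acc + ref_dot_char4(a, b);
}

/* arm_dot_acc(uchar4, uchar4, uint): acc plus the 4-lane unsigned dot product. */
uint32_t ref_dot_acc_uchar4(uchar4_t a, uchar4_t b, uint32_t acc)
{
    return acc + ref_dot_uchar4(a, b);
}

/* arm_dot_acc(short2, short2, int): acc plus the 2-lane signed dot product.
 * Assumption: the sum fits in 32 bits (overflow is not modelled here). */
int32_t ref_dot_acc_short2(short2_t a, short2_t b, int32_t acc)
{
    return acc + ((int32_t)a.x * (int32_t)b.x + (int32_t)a.y * (int32_t)b.y);
}

/* arm_dot_acc(ushort2, ushort2, uint): acc plus the 2-lane unsigned dot product. */
uint32_t ref_dot_acc_ushort2(ushort2_t a, ushort2_t b, uint32_t acc)
{
    return acc + ((uint32_t)a.x * (uint32_t)b.x + (uint32_t)a.y * (uint32_t)b.y);
}

/* Small self-check with hand-computed values. */
void ref_dot_self_check(void)
{
    char4_t  a  = { 1, -2, 3, -4 },       b  = { 5, 6, -7, 8 };
    uchar4_t ua = { 255, 255, 255, 255 }, ub = { 255, 255, 255, 255 };

    assert(ref_dot_char4(a, b) == -60);          /* 5 - 12 - 21 - 32 */
    assert(ref_dot_uchar4(ua, ub) == 260100u);   /* 4 * 255 * 255 */
    assert(ref_dot_acc_char4(a, b, 100) == 40);  /* 100 + (-60) */
}
```

The explicit casts before each multiplication mirror the "zero- or sign-extended to 32-bit before the multiplications" wording; without them, C's integer promotions would still widen 8-bit lanes but could hit signed overflow for 16-bit unsigned lanes.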