
 Name String

     cl_intel_motion_estimation

 Contributors

     Nico Galoppo, Intel
     Craig Hansen-Sturm, Intel
     Yejun Guo, Intel
     Ben Ashbaugh, Intel

 Contact

     Ben Ashbaugh, Intel (ben.ashbaugh 'at' intel.com)

 IP Status

     TBD

 Version

     Version 2, March 18, 2016

 Number

     OpenCL Extension #23

 Status

     Final draft

 Dependencies

     OpenCL 1.2 and support for the cl_intel_accelerator extension are required.
     This extension is written against revision 19 of the OpenCL 1.2
     specification and against version 2 of the cl_intel_accelerator extension
     specification.

 Overview

     This document describes an OpenCL extension to expose basic motion estimation
     capabilities. There are two key parts of this extension: The first is a
     built-in kernel for frame-based motion estimation that is exposed using the
     built-in kernel infrastructure added to OpenCL 1.2. The second is the ability
     to create motion estimation accelerators, which can be used to control
     behavior of the motion estimation engine. The motion estimation accelerators
     use the framework described in the cl_intel_accelerator extension.

 New Procedures and Functions

     None

 New Tokens

     Accepted as the <accelerator_type> parameter of clCreateAcceleratorINTEL:

         CL_ACCELERATOR_TYPE_MOTION_ESTIMATION_INTEL 0x0

     Accepted as values for fields of the cl_motion_estimation_desc_intel structure:

         Accepted values for <mb_block_type>:

         CL_ME_MB_TYPE_16x16_INTEL                   0x0
         CL_ME_MB_TYPE_8x8_INTEL                     0x1
         CL_ME_MB_TYPE_4x4_INTEL                     0x2

         Accepted values for <subpixel_mode>:

         CL_ME_SUBPIXEL_MODE_INTEGER_INTEL           0x0
         CL_ME_SUBPIXEL_MODE_HPEL_INTEL              0x1
         CL_ME_SUBPIXEL_MODE_QPEL_INTEL              0x2

         Accepted values for <sad_adjust_mode>:

         CL_ME_SAD_ADJUST_MODE_NONE_INTEL            0x0
         CL_ME_SAD_ADJUST_MODE_HAAR_INTEL            0x1

         Accepted values for <search_path_type>:

         CL_ME_SEARCH_PATH_RADIUS_2_2_INTEL          0x0
         CL_ME_SEARCH_PATH_RADIUS_4_4_INTEL          0x1
         CL_ME_SEARCH_PATH_RADIUS_16_12_INTEL        0x5

 New Types

     Accepted as a type pointed to by <accelerator_desc> for
     clCreateAcceleratorINTEL, and pointed to by <param_value> for
     clGetAcceleratorInfoINTEL:

         typedef struct _cl_motion_estimation_desc_intel {
             cl_uint     mb_block_type;
             cl_uint     subpixel_mode;
             cl_uint     sad_adjust_mode;
             cl_uint     search_path_type;
         } cl_motion_estimation_desc_intel;

 Modifications to Section 5.XX.1 - "Creating Accelerator Objects" in the
 cl_intel_accelerator specification:

     Add a table of accelerator types, immediately following the description of
     <accelerator_type>:

    "----------------------------------------------------------------------------
     cl_accelerator_type_intel  Description
     -------------------------  -----------
     CL_ACCELERATOR_TYPE_       Create a basic full-frame motion estimation
     MOTION_ESTIMATION_INTEL    accelerator
     ----------------------------------------------------------------------------"

     Add a table of accelerator descriptors, immediately following the
     description of <descriptor>:

    "----------------------------------------------------------------------------
     Descriptor Type                  Description
     ---------------                  -----------
     cl_motion_estimation_desc_intel  Used to represent the configuration state
                                      of basic full-frame motion estimation
                                      accelerators
     ----------------------------------------------------------------------------"

 Add a new sub-section "Motion Estimation Accelerators" to Section 5.XX.3 -
 "Accelerator Descriptors" in the cl_intel_accelerator specification:

    "The cl_motion_estimation_desc_intel descriptor structure is used to create
     an accelerator to control behavior of the motion estimation engine. This
     motion estimation descriptor structure is defined as:

     typedef struct _cl_motion_estimation_desc_intel {
         cl_uint     mb_block_type;
         cl_uint     subpixel_mode;
         cl_uint     sad_adjust_mode;
         cl_uint     search_path_type;
     } cl_motion_estimation_desc_intel;

     <mb_block_type> describes the size of the blocks (partitioning mode)
     considered by the motion estimator. A frame is first divided into
     macroblocks of size 16x16, and then may be sub-divided into blocks
     of size 8x8 or 4x4, or retained as 16x16 blocks. This field influences
     the size of the output image(s) and buffer(s) of the motion estimation
     algorithms defined in this extension. Valid values are described in the
     table below.

     ----------------------------------------------------------------------------
     Type                                   Description
     ----                                   -----------
     CL_ME_MB_TYPE_16x16_INTEL              Each block is of size 16x16 pixels
     CL_ME_MB_TYPE_8x8_INTEL                Each block is of size 8x8 pixels
     CL_ME_MB_TYPE_4x4_INTEL                Each block is of size 4x4 pixels
     ----------------------------------------------------------------------------

     <subpixel_mode> defines the search precision (and hence, the precision of
     the returned motion vectors). Valid values are described in the table below.

     ----------------------------------------------------------------------------
     Type                                   Description
     ----                                   -----------
     CL_ME_SUBPIXEL_MODE_INTEGER_INTEL      Integer pixel mode searching
     CL_ME_SUBPIXEL_MODE_HPEL_INTEL         Half-pixel mode searching
     CL_ME_SUBPIXEL_MODE_QPEL_INTEL         Quarter-pixel mode searching
     ----------------------------------------------------------------------------

     <sad_adjust_mode> specifies the distortion measure adjustment used for the
     motion search SAD (Sum of Absolute Differences) comparison. Valid values are
     described in the table below.

     ----------------------------------------------------------------------------
     Type                                   Description
     ----                                   -----------
     CL_ME_SAD_ADJUST_MODE_NONE_INTEL       Non-adjusted SAD
     CL_ME_SAD_ADJUST_MODE_HAAR_INTEL       Haar transformed SATD (frequency space)
     ----------------------------------------------------------------------------

     <search_path_type> specifies the search path and search radius used when
     matching blocks in the neighborhood of each pixel block. Currently, all
     search algorithms exhaustively match the source block against pixel blocks
     of the reference frame within a [Rx, Ry] radius of the default search
     center, which is the center of the co-located source macroblock (the
     macroblock that the block belongs to) in the reference frame, optionally
     offset by a predicted motion vector. Valid values are described in the
     table below.

     ----------------------------------------------------------------------------
     Flag                                   Description
     ----                                   -----------
     CL_ME_SEARCH_PATH_RADIUS_2_2_INTEL     Exhaustive search within
                                            [+/-2, +/-2] radius
     CL_ME_SEARCH_PATH_RADIUS_4_4_INTEL     Exhaustive search within
                                            [+/-4, +/-4] radius
     CL_ME_SEARCH_PATH_RADIUS_16_12_INTEL   Exhaustive search within
                                            [+/-16, +/-12] radius
     ----------------------------------------------------------------------------"
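
     As an illustration of the descriptor fields described above, the following
     C sketch fills in a descriptor and checks that every field holds one of the
     listed token values. It is not part of the extension: the me_desc type and
     the me_search_radius and me_desc_is_valid helpers are hypothetical names,
     and the token values are repeated from the "New Tokens" section only so
     that the example compiles without the OpenCL headers.

```c
#include <stdint.h>

/* Token values copied from the "New Tokens" section. A real program would
   take them from the OpenCL extension headers instead of redefining them. */
#define CL_ME_MB_TYPE_16x16_INTEL            0x0
#define CL_ME_MB_TYPE_8x8_INTEL              0x1
#define CL_ME_MB_TYPE_4x4_INTEL              0x2
#define CL_ME_SUBPIXEL_MODE_INTEGER_INTEL    0x0
#define CL_ME_SUBPIXEL_MODE_HPEL_INTEL       0x1
#define CL_ME_SUBPIXEL_MODE_QPEL_INTEL       0x2
#define CL_ME_SAD_ADJUST_MODE_NONE_INTEL     0x0
#define CL_ME_SAD_ADJUST_MODE_HAAR_INTEL     0x1
#define CL_ME_SEARCH_PATH_RADIUS_2_2_INTEL   0x0
#define CL_ME_SEARCH_PATH_RADIUS_4_4_INTEL   0x1
#define CL_ME_SEARCH_PATH_RADIUS_16_12_INTEL 0x5

/* Hypothetical local mirror of cl_motion_estimation_desc_intel
   (cl_uint is a 32-bit unsigned integer). */
typedef struct {
    uint32_t mb_block_type;
    uint32_t subpixel_mode;
    uint32_t sad_adjust_mode;
    uint32_t search_path_type;
} me_desc;

/* Hypothetical helper: map a <search_path_type> token to its [Rx, Ry] radius.
   Returns 1 on success and 0 for an unknown token. The token values are not
   contiguous (0x0, 0x1, 0x5), so a simple range check is not sufficient. */
int me_search_radius(uint32_t search_path_type, int *rx, int *ry)
{
    switch (search_path_type) {
    case CL_ME_SEARCH_PATH_RADIUS_2_2_INTEL:   *rx = 2;  *ry = 2;  return 1;
    case CL_ME_SEARCH_PATH_RADIUS_4_4_INTEL:   *rx = 4;  *ry = 4;  return 1;
    case CL_ME_SEARCH_PATH_RADIUS_16_12_INTEL: *rx = 16; *ry = 12; return 1;
    default:                                   return 0;
    }
}

/* Hypothetical helper: returns 1 if every field holds a listed token value. */
int me_desc_is_valid(const me_desc *d)
{
    int rx, ry;
    return d->mb_block_type   <= CL_ME_MB_TYPE_4x4_INTEL
        && d->subpixel_mode   <= CL_ME_SUBPIXEL_MODE_QPEL_INTEL
        && d->sad_adjust_mode <= CL_ME_SAD_ADJUST_MODE_HAAR_INTEL
        && me_search_radius(d->search_path_type, &rx, &ry);
}
```

     A descriptor that passes such a check would then be handed to
     clCreateAcceleratorINTEL together with its size, as described in the
     cl_intel_accelerator specification.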

 Add a new sub-section "Using Motion Estimation Accelerators" to Section 5.XX.4 -
 "Using Accelerators" in the cl_intel_accelerator specification:

    "Motion estimation accelerators control behavior of the motion estimation
     engine and are used by motion estimation kernels.  The
     cl_intel_motion_estimation extension adds a built-in kernel for basic
     motion estimation.  This section describes the motion estimation built-in
     kernel and how to set the motion estimation accelerator as an argument to
     the motion estimation built-in kernel.

     The motion estimation built-in kernel uses the built-in kernel functionality
     added in OpenCL 1.2. After creating the motion estimation built-in kernel,
     the motion estimation accelerator can be set as a kernel argument to control
     the motion estimation engine.

     There is currently no OpenCL C type for motion estimation accelerators, and
     motion estimation accelerators can only be used with the motion estimation
     built-in kernel.

     The kernel

         __kernel void block_motion_estimate_intel(
             accelerator_intel_t accelerator,
             __read_only  image2d_t src_image,
             __read_only  image2d_t ref_image,
             __global short2 * prediction_motion_vector_buffer,
             __global short2 * motion_vector_buffer,
             __global ushort * residuals);

     computes motion vectors by comparing a 2D source image with a 2D reference
     image, producing an output field of motion vectors. The algorithm searches
     for the best match of each pixel block in the source image by searching a
     region of the reference image, centered on the coordinates of that pixel
     block in the source image.

     The starting search coordinate may optionally be offset by a prediction
     motion vector. Additionally, the kernel may optionally return a field of
     per-pixel-block best-match distortion (SAD or SATD) values.
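
     The search described above can be pictured with a CPU reference sketch in
     C. This is only an illustration of exhaustive block matching with plain
     SAD at integer precision, not the behavior of the motion estimation
     engine: the me_match type and the block_sad and search_block functions are
     hypothetical, the sign convention of the displacement is an assumption,
     candidates that leave the frame are simply skipped (also an assumption),
     and sub-pixel refinement and Haar adjustment are not modeled.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result type: integer-pixel displacement from the source block
   position to the best-matching reference block, plus the SAD of that match. */
typedef struct { int dx, dy; unsigned sad; } me_match;

/* Sum of absolute differences between the bs x bs block at (sx, sy) in src
   and the bs x bs block at (rx, ry) in ref. Both images use the same stride. */
unsigned block_sad(const uint8_t *src, const uint8_t *ref, int stride,
                   int sx, int sy, int rx, int ry, int bs)
{
    unsigned sad = 0;
    for (int y = 0; y < bs; ++y)
        for (int x = 0; x < bs; ++x)
            sad += (unsigned)abs((int)src[(sy + y) * stride + sx + x] -
                                 (int)ref[(ry + y) * stride + rx + x]);
    return sad;
}

/* Exhaustively test every integer displacement within [radius_x, radius_y]
   of the search center. The center is the block position (bx, by) offset by
   the prediction (pred_x, pred_y); pass (0, 0) for "no prediction".
   If no candidate lies inside the frame, the returned sad stays UINT_MAX. */
me_match search_block(const uint8_t *src, const uint8_t *ref,
                      int width, int height, int bx, int by, int bs,
                      int radius_x, int radius_y, int pred_x, int pred_y)
{
    me_match best = { 0, 0, UINT_MAX };
    for (int dy = pred_y - radius_y; dy <= pred_y + radius_y; ++dy) {
        for (int dx = pred_x - radius_x; dx <= pred_x + radius_x; ++dx) {
            int rx = bx + dx, ry = by + dy;
            /* Assumption: skip candidates that fall outside the frame. */
            if (rx < 0 || ry < 0 || rx + bs > width || ry + bs > height)
                continue;
            unsigned sad = block_sad(src, ref, width, bx, by, rx, ry, bs);
            if (sad < best.sad) {
                best.dx = dx;
                best.dy = dy;
                best.sad = sad;
            }
        }
    }
    return best;
}
```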

     When enqueuing this kernel, the <global_work_size> and <global_work_offset>
     determine the region of interest (ROI) of the input frame in pixels. The
     <global_work_offset> determines the offset to the start of the region of
     interest. The <global_work_size> determines the width and height of the
     region of interest. The amount of data written to the <motion_vector_buffer>
     is dependent on the size of the region of interest and the partitioning
     mode specified by the accelerator.

     <accelerator> must be a valid accelerator object created by
     clCreateAcceleratorINTEL.  For the block_motion_estimate_intel kernel, the
     type of the accelerator must be CL_ACCELERATOR_TYPE_MOTION_ESTIMATION_INTEL.

     <src_image> is the input source image, typically representing 8-bit luminance
     information. The <image_channel_order> and <image_data_type> of <src_image>
     are restricted as follows:

     ----------------------------------------------------------------------------
     Channel Order              Src Channel Data Type
     -------------              ---------------------
     CL_R                       CL_UNORM_INT8
     ----------------------------------------------------------------------------

     <ref_image> is the input reference image, typically representing 8-bit
     luminance information. The <image_channel_order> and <image_data_type> of
     <ref_image> must match those of <src_image>, as follows:

     ----------------------------------------------------------------------------
     Channel Order              Ref Channel Data Type
     -------------              ---------------------
     CL_R                       CL_UNORM_INT8
     ----------------------------------------------------------------------------

     <motion_vector_buffer> is the output motion vector buffer, representing a
     vector field of pixel block motion vectors, stored linearly in row-major
     order. The elements of this buffer describe a motion vector for the
     corresponding pixel block, with its x/y components packed as two 16-bit
     integer values. Each component is encoded as an S13.2 fixed-point value
     (two's complement). The precision is specified by the <subpixel_mode>
     of the <accelerator>. The size of the returned data is determined by the
     <mb_block_type> of the <accelerator>. The <motion_vector_buffer> must be
     sized so that it fits the results of all pixel blocks of the source image,
     i.e. the number of full or partial 16x16 source pixel macroblocks times the
     number of returned motion vectors per macroblock (1, 4, or 16).
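
     The sizing rule and the fixed-point encoding above can be sketched in C as
     follows. The function names are hypothetical and the sketch assumes only
     what the paragraph states: partial macroblocks count as whole macroblocks,
     and an S13.2 value has two fractional bits, so one unit corresponds to a
     quarter pixel.

```c
#include <stddef.h>
#include <stdint.h>

/* Motion vectors returned per 16x16 macroblock for each <mb_block_type>
   token value (0x0 = 16x16, 0x1 = 8x8, 0x2 = 4x4). Returns 0 if unknown. */
size_t me_vectors_per_macroblock(uint32_t mb_block_type)
{
    switch (mb_block_type) {
    case 0x0: return 1;
    case 0x1: return 4;
    case 0x2: return 16;
    default:  return 0;
    }
}

/* Number of short2 elements the motion vector buffer must hold for a
   width x height source image. Partial macroblocks count as whole ones. */
size_t me_motion_vector_count(size_t width, size_t height,
                              uint32_t mb_block_type)
{
    size_t mb_x = (width + 15) / 16;   /* macroblock columns, rounded up */
    size_t mb_y = (height + 15) / 16;  /* macroblock rows, rounded up */
    return mb_x * mb_y * me_vectors_per_macroblock(mb_block_type);
}

/* Convert one S13.2 motion vector component to pixels.
   Two fractional bits mean the raw value is in quarter-pixel units. */
double me_component_to_pixels(int16_t raw)
{
    return raw / 4.0;
}
```

     For a 1920x1080 source image this gives 120 x 68 = 8160 macroblocks, and
     therefore 8160, 32640, or 130560 motion vectors for the 16x16, 8x8, and
     4x4 modes. Each short2 element occupies 4 bytes, so the buffer size in
     bytes is four times the element count.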

     <prediction_motion_vector_buffer> is an optional input buffer representing a
     vector field of prediction motion vectors, stored linearly in row-major
     order. When provided, this buffer must contain one prediction motion vector
     for each full or partial 16x16 pixel macroblock in the region of interest.
     The prediction motion vector is used to offset the default search center
     for each 16x16 pixel block search window. The default search center is the
     center of the co-located source macroblock in the reference frame.  Note
     that in feedback algorithms, where the output <motion_vector_buffer> of one
     frame is used as the input <prediction_motion_vector_buffer> for the next
     frame,
     the output buffer may need to be downsampled. The
     <prediction_motion_vector_buffer> argument is optional and may be set to
     NULL, in which case the prediction motion vectors are implied to be (0,0).
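
     One possible way to downsample an output field for such a feedback loop is
     sketched below in C. It rests on assumptions that this specification does
     not confirm: that the 8x8 or 4x4 output field is a row-major grid of block
     vectors (2x2 or 4x4 vectors per macroblock), that averaging is an
     acceptable reduction, and that prediction vectors use the same units as
     the returned vectors. The mv16 type and me_downsample_vectors function are
     hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the OpenCL short2 element type. */
typedef struct { int16_t x, y; } mv16;

/* Reduce a row-major field of (mb_w * n) x (mb_h * n) block vectors to one
   vector per macroblock by averaging each n x n group. Use n = 2 for 8x8
   blocks and n = 4 for 4x4 blocks. Integer division truncates toward zero.
   Assumption: the input is laid out as a plain row-major grid of blocks. */
void me_downsample_vectors(const mv16 *in, mv16 *out,
                           size_t mb_w, size_t mb_h, size_t n)
{
    size_t in_w = mb_w * n;  /* input row length in vectors */
    for (size_t my = 0; my < mb_h; ++my) {
        for (size_t mx = 0; mx < mb_w; ++mx) {
            int32_t sum_x = 0, sum_y = 0;
            for (size_t j = 0; j < n; ++j) {
                for (size_t i = 0; i < n; ++i) {
                    const mv16 *v = &in[(my * n + j) * in_w + mx * n + i];
                    sum_x += v->x;
                    sum_y += v->y;
                }
            }
            int32_t count = (int32_t)(n * n);
            out[my * mb_w + mx].x = (int16_t)(sum_x / count);
            out[my * mb_w + mx].y = (int16_t)(sum_y / count);
        }
    }
}
```

     In the 16x16 mode there is already one vector per macroblock, so no
     reduction of this kind would be needed.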

     <residuals> is an optional output buffer representing a field of residuals
     or "distortion values", one for each returned motion vector, stored linearly
     in row-major order. These "residuals of the compensated image" represent the
     sum-of-absolute-differences (SAD) between the source frame pixel block and
     the best-match reference frame pixel block that produced the returned motion
     vector. The <sad_adjust_mode> of the <accelerator> determines whether plain
     SAD or SATD (Haar-adjusted) values are returned. The <residuals> argument is
     optional and may be set to NULL, in which case residual information is not
     returned.

     The motion estimation built-in kernel is queued for execution by the host
     application using clEnqueueNDRangeKernel(). This function will return the
     usual error codes, augmented with the following specific error codes for
     this kernel:

       * CL_INVALID_WORK_DIMENSION if <work_dim> is not 2. This built-in kernel
         requires a 2D ND-range.
       * CL_INVALID_WORK_GROUP_SIZE if <local_work_size> is not NULL. This
         built-in kernel requires the work-group size to be set by the runtime.
       * CL_INVALID_IMAGE_SIZE if the region of interest defined by the
         <global_work_size> and <global_work_offset> exceeds the dimensions of
         the <src_image> or <ref_image>.
       * CL_INVALID_IMAGE_FORMAT_DESCRIPTOR if the <src_image> or <ref_image>
         kernel arguments do not meet the image format restrictions described
         above."
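
     An application may want to check the first three conditions above before
     enqueuing the kernel. The following C sketch shows one way to do so. The
     me_check enumeration and me_check_ndrange function are hypothetical, the
     order in which the conditions are tested is an assumption, and the image
     dimensions passed in are assumed to be the smaller of the <src_image> and
     <ref_image> dimensions. The image format check is not covered.

```c
#include <stddef.h>

/* Hypothetical result codes, one per condition checked below. */
typedef enum {
    ME_CHECK_OK,
    ME_CHECK_INVALID_WORK_DIMENSION,
    ME_CHECK_INVALID_WORK_GROUP_SIZE,
    ME_CHECK_INVALID_IMAGE_SIZE
} me_check;

/* Validate ND-range arguments for the motion estimation built-in kernel.
   global_work_size must point to two elements; global_work_offset may be
   NULL, which is treated as an offset of (0, 0). */
me_check me_check_ndrange(unsigned work_dim,
                          const size_t *global_work_offset,
                          const size_t *global_work_size,
                          const size_t *local_work_size,
                          size_t image_width, size_t image_height)
{
    if (work_dim != 2)
        return ME_CHECK_INVALID_WORK_DIMENSION;

    /* The work-group size must be left to the runtime. */
    if (local_work_size != NULL)
        return ME_CHECK_INVALID_WORK_GROUP_SIZE;

    size_t off_x = global_work_offset ? global_work_offset[0] : 0;
    size_t off_y = global_work_offset ? global_work_offset[1] : 0;

    /* Written with subtraction so that offset + size cannot overflow. */
    if (off_x > image_width  || global_work_size[0] > image_width  - off_x ||
        off_y > image_height || global_work_size[1] > image_height - off_y)
        return ME_CHECK_INVALID_IMAGE_SIZE;

    return ME_CHECK_OK;
}
```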

 Revision History

     Version 1 - Initial Revision
     Version 2 - Formatting and minor bug fixes.