Name Strings
Pierre Boudier
Graham Sellers
Benedikt Kessler
Ofer Rosenberg
Brian Sumner, AMD (Brian.Sumner 'at'
IP Status
No known IP issues.
Version 1.2, Nov 3, 2011
OpenCL Extension #25
Extension Type
OpenCL platform extension
OpenCL 1.1 is required
This extension defines an API that allows improved control of the
physical memory used by the graphics device.
It allows memory allocated by the graphics driver to be shared with
other devices on the bus by exposing a write-only bus address. One
example application is a video capture device that DMAs directly
into GPU memory.
It also offers the reverse operation: specifying a buffer allocated on
another device to be used for write access by the GPU.
Table and chapter numbers mentioned below match OpenCL 1.1 Rev 44.
New Procedures and Functions
cl_int clEnqueueWaitSignalAMD(cl_command_queue command_queue,
cl_mem mem_object, cl_uint value, cl_uint num_events,
const cl_event *event_wait_list, cl_event *event);
cl_int clEnqueueWriteSignalAMD(cl_command_queue command_queue,
cl_mem mem_object, cl_uint value, cl_ulong offset,
cl_uint num_events, const cl_event *event_wait_list,
cl_event *event);
cl_int clEnqueueMakeBuffersResidentAMD(cl_command_queue command_queue,
cl_uint num_mem_objects, cl_mem* mem_objects,
cl_bool blocking_make_resident,
cl_bus_address_amd * bus_addresses, cl_uint num_events,
const cl_event *event_wait_list, cl_event *event);
New Types
typedef struct _cl_bus_address_amd
{
    cl_long surfbusaddress;
    cl_long signalbusaddress;
} cl_bus_address_amd;
New Tokens
Accepted by the <flags> parameter of clCreateBuffer.
New command types for the events returned by the above functions
Additions to Table 5.4 (List of supported cl_mem_flags values)
cl_mem_flags                 | Description
CL_MEM_BUS_ADDRESSABLE_AMD   | This flag specifies that the application
                             | wants the OpenCL implementation to
                             | create a buffer that can be accessed by
                             | remote device DMA.
                             | are mutually exclusive.
CL_MEM_EXTERNAL_PHYSICAL_AMD | This flag specifies that the application
                             | wants the OpenCL implementation to
                             | create a buffer from memory already
                             | allocated on a remote device.
                             | are mutually exclusive.
                             | are mutually exclusive.
Additions to section 5.2.1 (Creating Buffer Objects)
This extension defines two new flags passed to clCreateBuffer,
CL_MEM_BUS_ADDRESSABLE_AMD and CL_MEM_EXTERNAL_PHYSICAL_AMD, which create
OpenCL buffers that are used to communicate with a remote device.
In addition, the extension defines the following structure which provides
the bus address information:
typedef struct _cl_bus_address_amd
{
    cl_long surfbusaddress;
    cl_long signalbusaddress;
} cl_bus_address_amd;
There are two types of buffer objects that can be created:
1. A buffer object which represents a buffer created in the device's
memory that can be shared with a remote device.
This buffer is created using CL_MEM_BUS_ADDRESSABLE_AMD.
The application may initialize this memory by adding the mem flag
CL_MEM_COPY_HOST_PTR and providing data in host_ptr. Otherwise, the
buffer contents are undefined.
The application is required to make buffers resident before accessing
them from a remote device. Using these buffers without making them
resident leads to undefined behavior (in particular, the addresses
of other buffers may become invalid). See the next section for making
buffers resident.
2. A write-only buffer object which represents a remote buffer located on
the remote device. There is no actual memory allocation on the device in
this case. This buffer is created using CL_MEM_EXTERNAL_PHYSICAL_AMD. A
kernel running on the device can only write to this buffer.
When creating a buffer using CL_MEM_EXTERNAL_PHYSICAL_AMD, the application
is required to pass a pointer to a cl_bus_address_amd struct as the
host_ptr argument of clCreateBuffer.
<surfbusaddress> contains the page-aligned physical starting address
of the backing store preallocated by the application on a remote device.
<signalbusaddress> contains the page-aligned physical starting
address of the preallocated signaling surface.
Both bus addresses must have been allocated from the same device and
memory pool. Failure will occur if multiple buffers are created for the
same <surfbusaddress>.
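The address requirements above can be checked on the host before calling
clCreateBuffer. The following is a minimal sketch and not part of the
extension: the struct is a local stand-in for cl_bus_address_amd (so that
the example compiles without OpenCL headers), the helper names are
hypothetical, and the 4096-byte page size is an assumption that should be
replaced by the page size of the actual platform.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for cl_bus_address_amd; field names follow this section. */
typedef struct {
    int64_t surfbusaddress;
    int64_t signalbusaddress;
} bus_address_sketch;

/* ASSUMPTION: 4 KiB pages. Query the real page size on a real system. */
#define SKETCH_PAGE_SIZE 4096u

/* An address is acceptable if it is non-zero and a multiple of the page size. */
static bool is_page_aligned_nonzero(int64_t addr)
{
    return addr != 0 && ((uint64_t)addr % SKETCH_PAGE_SIZE) == 0;
}

/* Hypothetical pre-check mirroring the "0 or not page aligned" rule. */
static bool bus_address_is_usable(const bus_address_sketch *a)
{
    return a != NULL
        && is_page_aligned_nonzero(a->surfbusaddress)
        && is_page_aligned_nonzero(a->signalbusaddress);
}
```

A caller could run bus_address_is_usable() on the struct it intends to pass
in host_ptr and skip the clCreateBuffer call when the check fails.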
Map/unmap and read operations are not supported for external physical memory.
Sub-buffers are not supported for either bus addressable memory or external
physical memory.
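The restrictions stated in this section can be summarized as a small lookup.
This is only an illustrative sketch: the enum and function names are
hypothetical, and a result of false means "not ruled out by the text of this
section", not a guarantee that an implementation supports the operation.

```c
#include <stdbool.h>

/* Hypothetical names for the two buffer types described in this section. */
typedef enum { BUF_BUS_ADDRESSABLE, BUF_EXTERNAL_PHYSICAL } buffer_kind;

/* Hypothetical names for the operations this section mentions. */
typedef enum {
    OP_KERNEL_WRITE,
    OP_KERNEL_READ,
    OP_MAP,
    OP_UNMAP,
    OP_READ,
    OP_SUB_BUFFER
} buffer_op;

/* Returns true if this section explicitly rules the operation out. */
static bool op_is_ruled_out(buffer_kind kind, buffer_op op)
{
    /* Sub-buffers are not supported for either buffer type. */
    if (op == OP_SUB_BUFFER)
        return true;

    /* External physical buffers are write-only for kernels and do not
       support map, unmap, or read operations. */
    if (kind == BUF_EXTERNAL_PHYSICAL)
        return op != OP_KERNEL_WRITE;

    /* Nothing else is ruled out for bus addressable buffers here. */
    return false;
}
```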
The following errors are added to clCreateBuffer:
CL_MEM_OBJECT_ALLOCATION_FAILURE is generated if the <flags> parameter of
clCreateBuffer includes CL_MEM_EXTERNAL_PHYSICAL_AMD and the remote bus
address cannot be mapped to the device address space.
CL_INVALID_HOST_PTR is generated if the <flags> parameter of clCreateBuffer
includes CL_MEM_EXTERNAL_PHYSICAL_AMD and the <surfbusaddress> or
<signalbusaddress> field of the struct passed in host_ptr is 0 or not page
aligned.
CL_OUT_OF_HOST_MEMORY is generated if the <flags> parameter of
clCreateBuffer includes CL_MEM_BUS_ADDRESSABLE_AMD and no memory can be
allocated with a valid bus address.
CL_OUT_OF_RESOURCES is generated if bus addressable memory is already used
by another application or context.
New section 5.4.4 (Making buffers resident)
The application requires the bus address in order to access the buffers
from a remote device. As the OS may rearrange buffers to make space for
other memory allocations, the buffers must be made resident before they
are accessed from the remote device.
The following API is used to make buffers resident:
cl_int clEnqueueMakeBuffersResidentAMD(cl_command_queue command_queue,
cl_uint num_mem_objects,
const cl_mem *mem_objects,
cl_bool blocking_make_resident,
cl_bus_address_amd * bus_addresses,
cl_uint num_events_in_wait_list,
const cl_event *event_wait_list,
cl_event *event);
The memory objects passed need to be buffers created with
clEnqueueMakeBuffersResidentAMD returns CL_SUCCESS if the function is
executed successfully. Otherwise, it returns one of the following errors:
CL_INVALID_OPERATION is generated if any of the pointer parameters
of clEnqueueMakeBuffersResidentAMD are NULL (and <num_mem_objects> is
greater than 0).
CL_INVALID_OPERATION is generated if any of the mem_objects passed to
clEnqueueMakeBuffersResidentAMD was not a valid cl_mem object created with
CL_OUT_OF_HOST_MEMORY is generated if any of the mem_objects passed to
clEnqueueMakeBuffersResidentAMD could not be made resident, in which case
the buffer or signal bus addresses are returned as 0.
New section 5.4.5 (Memory Object Synchronization)
The following API is used to synchronize with the remote device. This
synchronization lets the device know when the other device has finished
writing to the buffer or buffers.
cl_int clEnqueueWaitSignalAMD(cl_command_queue command_queue,
cl_mem buffer,
cl_uint value,
cl_uint num_events,
const cl_event *event_wait_list,
cl_event *event);
This command instructs OpenCL to wait until <value> is written to
<buffer> before issuing the next command.
cl_int clEnqueueWriteSignalAMD(cl_command_queue command_queue,
cl_mem buffer,
cl_uint value,
cl_ulong offset,
cl_uint num_events,
const cl_event *event_wait_list,
cl_event *event);
This command instructs OpenCL to write <value> to the signal address +
<offset> of <buffer> (which must be a buffer created with
CL_MEM_EXTERNAL_PHYSICAL_AMD). This should be done after a write operation
by the device into that buffer is complete. Consecutive marker values must
keep increasing.
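Because consecutive marker values must keep increasing, the application has
to remember the last value it used for each signal. The following is a
minimal sketch of such bookkeeping. The type and function names are
hypothetical, and refusing to continue at the maximum 32-bit value is an
assumption, since the text above does not say how wraparound is handled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-signal bookkeeping: the last marker value handed out. */
typedef struct {
    uint32_t last_value;
} signal_sequence;

/* Produces the next marker value, strictly greater than the previous one.
   Returns false instead of wrapping around (ASSUMPTION: wraparound is
   treated as an error because its behavior is not specified here). */
static bool signal_sequence_next(signal_sequence *seq, uint32_t *out_value)
{
    if (seq->last_value == UINT32_MAX)
        return false;
    seq->last_value += 1;
    *out_value = seq->last_value;
    return true;
}
```

The value obtained this way would be passed as <value> to
clEnqueueWriteSignalAMD on the writing side, and the same value to
clEnqueueWaitSignalAMD on the side that waits for the data.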
These commands return CL_SUCCESS if the function is executed successfully.
Otherwise, they return one of the following errors:
CL_INVALID_MEM_OBJECT is generated if the <buffer> parameter of
clEnqueueWaitSignalAMD or clEnqueueWriteSignalAMD is not a valid buffer
object.
CL_INVALID_COMMAND_QUEUE is generated if the <command_queue> parameter of
clEnqueueWaitSignalAMD or clEnqueueWriteSignalAMD is not a valid command
queue.
CL_INVALID_MEM_OBJECT is generated if the <buffer> parameter of
clEnqueueWaitSignalAMD does not represent a buffer allocated with
CL_INVALID_MEM_OBJECT is generated if the <buffer> parameter of
clEnqueueWriteSignalAMD does not represent a buffer defined as
CL_MEM_EXTERNAL_PHYSICAL_AMD.
CL_INVALID_BUFFER_SIZE is generated if the <offset> parameter of
clEnqueueWriteSignalAMD would lead to a write beyond the size of <buffer>.
CL_INVALID_VALUE is generated if the signal address used by
clEnqueueWriteSignalAMD or clEnqueueWaitSignalAMD for <buffer> is invalid
(for example 0).
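The CL_INVALID_BUFFER_SIZE condition can be approximated on the host before
enqueueing a write signal. This sketch is not taken from any implementation.
It assumes that the signal value occupies 4 bytes (the size of cl_uint) and
that <offset> is measured in bytes against the size of <buffer>; neither
detail is stated in the text above. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a 4-byte value written at `offset` stays within a buffer
   of `buffer_size` bytes. Written to avoid overflow in offset + size. */
static bool write_signal_offset_in_bounds(uint64_t offset, uint64_t buffer_size)
{
    const uint64_t value_size = sizeof(uint32_t); /* ASSUMPTION: cl_uint-sized write */

    if (buffer_size < value_size)
        return false;
    return offset <= buffer_size - value_size;
}
```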