| Name |
| |
| ARM_import_memory |
| |
| Name Strings |
| |
| cl_arm_import_memory |
| cl_arm_import_memory_host |
| cl_arm_import_memory_dma_buf |
| cl_arm_import_memory_protected |
| |
| cl_arm_import_memory will be reported if at least one of the other extension |
| strings is also reported. |
| |
| Contributors |
| |
| Robert Elliott, Arm Ltd. |
| Vatsalya Prasad, Arm Ltd. |
| Kévin Petit, Arm Ltd. |
| Sam Laynton, Arm Ltd. |
| James Morrissey, Arm Ltd. |
| |
| Contact |
| |
| Kévin Petit, ARM (kevin.petit 'at' ARM.com) |
| |
| IP Status |
| |
| No claims or disclosures are known to exist. |
| |
| Version |
| |
| Revision: #6, Jan 5th, 2018 |
| |
| Number |
| |
| OpenCL Extension #38 |
| |
| Status |
| |
| Complete. |
| |
| Extension Type |
| |
| OpenCL device extension |
| |
| Dependencies |
| |
| Requires OpenCL version 1.0 or later. Requires OpenCL 1.2 for host access |
| cl_mem_flags use when importing, as these were introduced in OpenCL 1.2. |
| |
| Overview |
| |
| This extension adds a new function that allows for direct memory import into |
| OpenCL via the clImportMemoryARM function. |
| |
| Memory imported through this interface will be mapped into the device's page |
| tables directly, providing zero copy access. It will never fall back to copy |
| operations and aliased buffers, instead producing an error when mapping is |
| not possible. |
| |
| Types of memory supported for import are specified as additional extension |
| strings. |
| |
| Header File |
| |
| cl_ext.h |
| |
| Glossary |
| |
| Protected memory: a zone of memory that is protected using TrustZone Media |
| Protection. |
| |
| New Types |
| |
| None |
| |
| New Procedures and Functions |
| |
| The new function |
| |
| cl_mem clImportMemoryARM( cl_context context, |
| cl_mem_flags flags, |
| const cl_import_properties_arm *properties, |
| void *memory, |
| size_t size, |
| cl_int *errorcode_ret ); |
| |
| Description |
| |
| Given a suitable pointer to an external memory allocation <memory> this |
| function will map the memory into the device page tables. |
| |
| The flags argument provides standard OpenCL memory object flags. |
| |
| Valid <flags> are: |
| |
| * CL_MEM_READ_WRITE, CL_MEM_WRITE_ONLY, CL_MEM_READ_ONLY |
| * CL_MEM_HOST_WRITE_ONLY, CL_MEM_HOST_READ_ONLY, CL_MEM_HOST_NO_ACCESS - |
| where the host flags are only hints and only apply from OpenCL 1.2 |
| onwards. |
| * CL_MEM_USE_HOST_PTR - this flag has no effect, it is ignored. |
| |
| The properties argument provides a list of key value pairs, with a zero |
| terminator. properties can be NULL or point to a single zero value if the |
| default behaviour is desired. |
| |
| Valid <properties> are: |
| |
| * CL_IMPORT_TYPE_ARM |
| * CL_IMPORT_TYPE_PROTECTED_ARM |
| |
| Valid values for CL_IMPORT_TYPE_ARM are: |
| |
| * CL_IMPORT_TYPE_HOST_ARM - this is the default |
| * CL_IMPORT_TYPE_DMA_BUF_ARM |
| |
| Valid values for CL_IMPORT_TYPE_PROTECTED_ARM are: |
| |
| * CL_FALSE - the memory is imported with no special behaviour, this is |
| the default. |
| * CL_TRUE - the memory is imported as protected memory. |
| |
| When CL_IMPORT_TYPE_PROTECTED_ARM is set to CL_TRUE, the following |
| restrictions apply: |
| |
| * CL_MEM_HOST_NO_ACCESS is the only accepted host access flag, all others |
| are rejected. If host access flags are omitted, then this is assumed. |
| * CL_IMPORT_TYPE_DMA_BUF_ARM is the only valid value for |
| CL_IMPORT_TYPE_ARM. |
| |
| If <properties> is NULL, default values are taken. |
| |
| Valid <memory> pointer is dependent on the TYPE passed in properties. |
| |
| Errors |
| |
| CL_INVALID_CONTEXT on invalid context. |
| |
| CL_INVALID_VALUE on invalid flag input. |
| |
| CL_INVALID_PROPERTY when invalid properties or combination of properties |
| are passed. |
| |
| CL_INVALID_VALUE if memory is NULL. |
| |
| CL_INVALID_OPERATION when host virtual pages in the range of <memory> to |
| <memory>+<size> are not mapped in the userspace address space. This does |
| _not_ include cases where physical pages are not allocated. For specific |
| behaviour see documentation for those memory types. |
| |
| CL_INVALID_OPERATION when an imported memory object has been passed to |
| one of the following API functions (they can't be used with imported |
| memory objects): |
| |
| * clEnqueueMapBuffer |
| * clEnqueueMapImage |
| * clEnqueueUnmapMemObject |
| * clEnqueueReadImage |
| * clEnqueueWriteImage |
| * clEnqueueReadBuffer |
| * clEnqueueReadBufferRect |
| * clEnqueueWriteBuffer |
| * clEnqueueWriteBufferRect |
| * clEnqueueCopyBuffer |
| * clEnqueueCopyBufferRect |
| * clEnqueueCopyBufferToImage |
| * clEnqueueCopyImageToBuffer |
| * clEnqueueCopyImage |
| * clEnqueueFillBuffer |
| * clEnqueueFillImage |
| |
| CL_INVALID_OPERATION if a host memory flag other than CL_MEM_HOST_NO_ACCESS |
| is used when CL_IMPORT_TYPE_PROTECTED_ARM is set to CL_TRUE. |
| |
| CL_INVALID_OPERATION if a command that makes use of a protected memory |
| object is enqueued to a command queue that has CL_QUEUE_PROFILING_ENABLE |
| set. |
| |
| Futher error information may be reported via the cl_context callback |
| function. |
| |
| New memory import types |
| |
| Linux dma_buf memory type - CL_IMPORT_TYPE_DMA_BUF_ARM |
| |
| If the extension string cl_arm_import_memory_dma_buf is exposed then |
| importing from dma_buf file handles is supported. |
| |
| The CL runtime manages dma_buf memory coherency between the host CPU and |
| GPU. It is the application's responsibility to ensure visibility in memory |
| of changes done by devices which aren't in the same coherency domain as |
| the GPU and CPU before using that memory from an OpenCL command. This can |
| be achieved either by not enqueueing the workload until the data is |
| visible, or by using a user event to prevent the command from being |
| executed until the expected data has reached memory. |
| |
| Flags attached to a dma_buf file handle take precedence over memory flags |
| supplied to clImportMemoryARM. For example, if a dma_buf allocation |
| originally created with a read-only flag is passed to clImportMemoryARM |
| with the READ_WRITE flag, the more restrictive READ_ONLY will take |
| precedence. |
| |
| dma_buf allocations are page-aligned and their size is a whole number of |
| pages. |
| |
| If an application only requires communication between the host CPU and |
| the GPU, it should favour using host imports as described further. |
| |
| See also, the code example below. |
| |
| Host process memory type - CL_IMPORT_TYPE_HOST_ARM |
| |
| If the extension string cl_arm_import_memory_host is exposed then importing |
| from normal userspace allocations (such as those created via malloc) is |
| supported. |
| |
| If the host OS is linux and overcommit of VA is allowed, then this |
| function will commit and pin physical pages for the VA range. This may |
| cause larger physical allocations than the application typically provokes |
| if memory is sparsely used. In this case sub-ranges of the host allocation |
| should be passed to the import function individually. |
| |
| It is the application's responsibility to align for the datatype being |
| accessed. Though the application is free to provide allocations without |
| any specific alignment on coherent systems, there is a requirement to |
| provide pointers aligned to a cache line on systems where there is no |
| HW-managed coherency between CPU and GPU. When alignment is less than a |
| page size then whole pages touched by addresses in the range of <memory> |
| to <memory>+<size> will be mapped into the device. If the page is already |
| mapped by another unaligned import, an error will occur. |
| |
| Cache coherency will be HW-managed on systems where it is supported. |
| Otherwise, cache maintenance operations will be added by the CL runtime |
| where needed. |
| |
| Importing host memory that is otherwise being used by a device outside |
| of the CPU/GPU coherency domain isn't guaranteed to work and the GPU |
| caches may contain stale data. |
| |
| Importing dma_buf pages through a CPU mapping is undefined. |
| |
| Importing two allocations that aren't page-aligned and that request |
| different memory flags is unsupported; an error will be returned. |
| |
| This method is recommended to be used when interoperating with an existing |
| host library which performs its own allocation and cannot be passed |
| handles to mapped OpenCL buffers. |
| |
| See also, the code example below. |
| |
| Protected memory |
| |
| If the extension string cl_arm_import_memory_protected is exposed then |
| using CL_IMPORT_TYPE_PROTECTED_ARM in the list of <properties> is |
| allowed. If the property's value is set to CL_TRUE, then the memory |
| is imported as protected memory. |
| |
| New Tokens |
| |
| None |
| |
| Interactions with other extensions |
| |
| This extension produces cl_mem memory objects which are compatible with all |
| other uses of cl_mem in the standard API, including creating images from |
| the resulting cl_mem and subject to the restrictions listed in this |
| document. |
| |
| In order to guarantee data consistency, applications must ensure that neither |
| the host nor any device attempt to perform simultaneous read and write |
| operations on any part of the memory backing an imported buffer or sub-buffers |
| created therefrom, even if these accesses do not overlap. For example, this |
| implies that it is not possible to write part of the memory backing an imported |
| buffer on the host while reading a sub-buffer created from that buffer on a |
| device, even if the memory written by the host is not visible through the |
| sub-buffer. |
| |
| This extension also provides an alternative to image import via EGL. |
| |
| Sample Code |
| |
| CL_IMPORT_TYPE_DMA_BUF_ARM |
| |
| #define WIDTH 1024 |
| #define HEIGHT 512 |
| |
| // Create buffer to be used as a hardware texture with graphics APIs (can also |
| // include video/camera use flags here) |
| int dma_buf_handle = get_dma_buf_handle_from_exporter_kernel_module( ..., WIDTH * HEIGHT * 2 ); |
| |
| cl_int error = CL_SUCCESS; |
| cl_mem buffer = clImportMemoryARM( ctx, |
| CL_MEM_READ_WRITE, |
| { CL_IMPORT_TYPE_ARM, CL_IMPORT_TYPE_DMA_BUF_ARM, 0 }, |
| &dma_buf_handle |
| WIDTH * HEIGHT * 2, |
| &error ); |
| |
| if( error == CL_SUCCESS ) |
| { |
| // Use <buffer> as you would any other cl_mem buffer |
| } |
| |
| |
| CL_IMPORT_TYPE_HOST_ARM |
| |
| #define WIDTH 1024 |
| #define HEIGHT 512 |
| |
| // tightly packed buffer we will treat as RGB565 |
| char *buffer = malloc( WIDTH * HEIGHT * 2 ); |
| |
| // The type CL_IMPORT_TYPE_HOST_ARM can be omitted as it is the default |
| cl_int error = CL_SUCCESS; |
| cl_mem buffer = clImportMemoryARM( ctx, |
| CL_MEM_READ_WRITE, |
| NULL, |
| buffer, |
| WIDTH * HEIGHT * 2, |
| &error ); |
| |
| if( error == CL_SUCCESS ) |
| { |
| // Use <buffer> as you would any other cl_mem buffer |
| } |
| |
| CL_IMPORT_TYPE_PROTECTED_ARM |
| |
| #define BUFFER_SIZE 4096 |
| |
| // Platform specific method of allocating protected memory and obtaining |
| // a handle. For example, from a protected dma_buf or ION region. |
| int protected_fd = protected_alloc( BUFFER_SIZE ); |
| |
| cl_int error = CL_SUCCESS; |
| // Import memory of type dma_buf and identify it as protected |
| const cl_import_properties_arm properties = { |
| CL_IMPORT_TYPE_ARM, CL_IMPORT_TYPE_DMA_BUF_ARM, |
| CL_IMPORT_TYPE_PROTECTED_ARM, CL_TRUE, |
| 0 |
| }; |
| cl_mem buffer = clImportMemoryARM( context, |
| CL_MEM_READ_WRITE, |
| properties, |
| &protected_fd, |
| BUFFER_SIZE, |
| &error ); |
| |
| if( error == CL_SUCCESS ) |
| { |
| // Use <buffer> as cl_mem buffer, observing the usage restrictions above. |
| } |
| |
| Conformance Tests |
| |
| None |
| |
| Revision History |
| |
| Revision: #1, Apr 27th, 2015 - Initial revision |
| Revision: #2, Apr 28th, 2015 - Added properties field to avoid type |
| inferrence. Added Issues section. |
| Revision: #3, May 5th, 2015 - Added image support info in Issues. |
| Revision: #4, Aug 4th, 2015 - Revised based on implementation and design |
| changes made during review. |
| Revision: #5, May 3rd, 2017 - Additional restrictions on host operations |
| and general cleanup / clarification. |
| Revision: #6, Jan 5th, 2018 - Support creating a sub-buffer from an imported |
| buffer. |
| Revision: #7, Jun 19th, 2018 - Add support for protected memory imports. |