| <!DOCTYPE html> |
| <html lang="en"> |
| <head> |
| <meta charset="UTF-8"> |
| <!--[if IE]><meta http-equiv="X-UA-Compatible" content="IE=edge"><![endif]--> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> |
| <meta name="generator" content="Asciidoctor 1.5.8"> |
| <meta name="author" content="Khronos® OpenCL Working Group"> |
| <title>The OpenCL™ Specification</title> |
| <style> |
| /*! normalize.css v2.1.2 | MIT License | git.io/normalize */ |
| /* ========================================================================== HTML5 display definitions ========================================================================== */ |
| /** Correct `block` display not defined in IE 8/9. */ |
| article, aside, details, figcaption, figure, footer, header, hgroup, main, nav, section, summary { display: block; } |
| |
| /** Correct `inline-block` display not defined in IE 8/9. */ |
| audio, canvas, video { display: inline-block; } |
| |
| /** Prevent modern browsers from displaying `audio` without controls. Remove excess height in iOS 5 devices. */ |
| audio:not([controls]) { display: none; height: 0; } |
| |
| /** Address `[hidden]` styling not present in IE 8/9. Hide the `template` element in IE, Safari, and Firefox < 22. */ |
| [hidden], template { display: none; } |
| |
| script { display: none !important; } |
| |
| /* ========================================================================== Base ========================================================================== */ |
| /** 1. Set default font family to sans-serif. 2. Prevent iOS text size adjust after orientation change, without disabling user zoom. */ |
| html { font-family: sans-serif; /* 1 */ -ms-text-size-adjust: 100%; /* 2 */ -webkit-text-size-adjust: 100%; /* 2 */ } |
| |
| /** Remove default margin. */ |
| body { margin: 0; } |
| |
| /* ========================================================================== Links ========================================================================== */ |
| /** Remove the gray background color from active links in IE 10. */ |
| a { background: transparent; } |
| |
| /** Address `outline` inconsistency between Chrome and other browsers. */ |
| a:focus { outline: thin dotted; } |
| |
| /** Improve readability when focused and also mouse hovered in all browsers. */ |
| a:active, a:hover { outline: 0; } |
| |
| /* ========================================================================== Typography ========================================================================== */ |
| /** Address variable `h1` font-size and margin within `section` and `article` contexts in Firefox 4+, Safari 5, and Chrome. */ |
| h1 { font-size: 2em; margin: 0.67em 0; } |
| |
| /** Address styling not present in IE 8/9, Safari 5, and Chrome. */ |
| abbr[title] { border-bottom: 1px dotted; } |
| |
| /** Address style set to `bolder` in Firefox 4+, Safari 5, and Chrome. */ |
| b, strong { font-weight: bold; } |
| |
| /** Address styling not present in Safari 5 and Chrome. */ |
| dfn { font-style: italic; } |
| |
| /** Address differences between Firefox and other browsers. */ |
| hr { -moz-box-sizing: content-box; box-sizing: content-box; height: 0; } |
| |
| /** Address styling not present in IE 8/9. */ |
| mark { background: #ff0; color: #000; } |
| |
| /** Correct font family set oddly in Safari 5 and Chrome. */ |
| code, kbd, pre, samp { font-family: monospace, serif; font-size: 1em; } |
| |
| /** Improve readability of pre-formatted text in all browsers. */ |
| pre { white-space: pre-wrap; } |
| |
| /** Set consistent quote types. */ |
| q { quotes: "\201C" "\201D" "\2018" "\2019"; } |
| |
| /** Address inconsistent and variable font size in all browsers. */ |
| small { font-size: 80%; } |
| |
| /** Prevent `sub` and `sup` affecting `line-height` in all browsers. */ |
| sub, sup { font-size: 75%; line-height: 0; position: relative; vertical-align: baseline; } |
| |
| sup { top: -0.5em; } |
| |
| sub { bottom: -0.25em; } |
| |
| /* ========================================================================== Embedded content ========================================================================== */ |
| /** Remove border when inside `a` element in IE 8/9. */ |
| img { border: 0; } |
| |
| /** Correct overflow displayed oddly in IE 9. */ |
| svg:not(:root) { overflow: hidden; } |
| |
| /* ========================================================================== Figures ========================================================================== */ |
| /** Address margin not present in IE 8/9 and Safari 5. */ |
| figure { margin: 0; } |
| |
| /* ========================================================================== Forms ========================================================================== */ |
| /** Define consistent border, margin, and padding. */ |
| fieldset { border: 1px solid #c0c0c0; margin: 0 2px; padding: 0.35em 0.625em 0.75em; } |
| |
| /** 1. Correct `color` not being inherited in IE 8/9. 2. Remove padding so people aren't caught out if they zero out fieldsets. */ |
| legend { border: 0; /* 1 */ padding: 0; /* 2 */ } |
| |
| /** 1. Correct font family not being inherited in all browsers. 2. Correct font size not being inherited in all browsers. 3. Address margins set differently in Firefox 4+, Safari 5, and Chrome. */ |
| button, input, select, textarea { font-family: inherit; /* 1 */ font-size: 100%; /* 2 */ margin: 0; /* 3 */ } |
| |
| /** Address Firefox 4+ setting `line-height` on `input` using `!important` in the UA stylesheet. */ |
| button, input { line-height: normal; } |
| |
| /** Address inconsistent `text-transform` inheritance for `button` and `select`. All other form control elements do not inherit `text-transform` values. Correct `button` style inheritance in Chrome, Safari 5+, and IE 8+. Correct `select` style inheritance in Firefox 4+ and Opera. */ |
| button, select { text-transform: none; } |
| |
| /** 1. Avoid the WebKit bug in Android 4.0.* where (2) destroys native `audio` and `video` controls. 2. Correct inability to style clickable `input` types in iOS. 3. Improve usability and consistency of cursor style between image-type `input` and others. */ |
| button, html input[type="button"], input[type="reset"], input[type="submit"] { -webkit-appearance: button; /* 2 */ cursor: pointer; /* 3 */ } |
| |
| /** Re-set default cursor for disabled elements. */ |
| button[disabled], html input[disabled] { cursor: default; } |
| |
| /** 1. Address box sizing set to `content-box` in IE 8/9. 2. Remove excess padding in IE 8/9. */ |
| input[type="checkbox"], input[type="radio"] { box-sizing: border-box; /* 1 */ padding: 0; /* 2 */ } |
| |
| /** 1. Address `appearance` set to `searchfield` in Safari 5 and Chrome. 2. Address `box-sizing` set to `border-box` in Safari 5 and Chrome (include `-moz` to future-proof). */ |
| input[type="search"] { -webkit-appearance: textfield; /* 1 */ -moz-box-sizing: content-box; -webkit-box-sizing: content-box; /* 2 */ box-sizing: content-box; } |
| |
| /** Remove inner padding and search cancel button in Safari 5 and Chrome on OS X. */ |
| input[type="search"]::-webkit-search-cancel-button, input[type="search"]::-webkit-search-decoration { -webkit-appearance: none; } |
| |
| /** Remove inner padding and border in Firefox 4+. */ |
| button::-moz-focus-inner, input::-moz-focus-inner { border: 0; padding: 0; } |
| |
| /** 1. Remove default vertical scrollbar in IE 8/9. 2. Improve readability and alignment in all browsers. */ |
| textarea { overflow: auto; /* 1 */ vertical-align: top; /* 2 */ } |
| |
| /* ========================================================================== Tables ========================================================================== */ |
| /** Remove most spacing between table cells. */ |
| table { border-collapse: collapse; border-spacing: 0; } |
| |
| meta.foundation-mq-small { font-family: "only screen and (min-width: 768px)"; width: 768px; } |
| |
| meta.foundation-mq-medium { font-family: "only screen and (min-width:1280px)"; width: 1280px; } |
| |
| meta.foundation-mq-large { font-family: "only screen and (min-width:1440px)"; width: 1440px; } |
| |
| *, *:before, *:after { -moz-box-sizing: border-box; -webkit-box-sizing: border-box; box-sizing: border-box; } |
| |
| html, body { font-size: 100%; } |
| |
| body { background: white; color: #222222; padding: 0; margin: 0; font-family: "Helvetica Neue", "Helvetica", Helvetica, Arial, sans-serif; font-weight: normal; font-style: normal; line-height: 1; position: relative; cursor: auto; } |
| |
| a:hover { cursor: pointer; } |
| |
| img, object, embed { max-width: 100%; height: auto; } |
| |
| object, embed { height: 100%; } |
| |
| img { -ms-interpolation-mode: bicubic; } |
| |
| #map_canvas img, #map_canvas embed, #map_canvas object, .map_canvas img, .map_canvas embed, .map_canvas object { max-width: none !important; } |
| |
| .left { float: left !important; } |
| |
| .right { float: right !important; } |
| |
| .text-left { text-align: left !important; } |
| |
| .text-right { text-align: right !important; } |
| |
| .text-center { text-align: center !important; } |
| |
| .text-justify { text-align: justify !important; } |
| |
| .hide { display: none; } |
| |
| .antialiased { -webkit-font-smoothing: antialiased; } |
| |
| img { display: inline-block; vertical-align: middle; } |
| |
| textarea { height: auto; min-height: 50px; } |
| |
| select { width: 100%; } |
| |
| object, svg { display: inline-block; vertical-align: middle; } |
| |
| .center { margin-left: auto; margin-right: auto; } |
| |
| .spread { width: 100%; } |
| |
| p.lead, .paragraph.lead > p, #preamble > .sectionbody > .paragraph:first-of-type p { font-size: 1.21875em; line-height: 1.6; } |
| |
| .subheader, .admonitionblock td.content > .title, .audioblock > .title, .exampleblock > .title, .imageblock > .title, .listingblock > .title, .literalblock > .title, .stemblock > .title, .openblock > .title, .paragraph > .title, .quoteblock > .title, table.tableblock > .title, .verseblock > .title, .videoblock > .title, .dlist > .title, .olist > .title, .ulist > .title, .qlist > .title, .hdlist > .title { line-height: 1.4; color: black; font-weight: 300; margin-top: 0.2em; margin-bottom: 0.5em; } |
| |
| /* Typography resets */ |
| div, dl, dt, dd, ul, ol, li, h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6, pre, form, p, blockquote, th, td { margin: 0; padding: 0; direction: ltr; } |
| |
| /* Default Link Styles */ |
| a { color: #0068b0; text-decoration: none; line-height: inherit; } |
| a:hover, a:focus { color: #333333; } |
| a img { border: none; } |
| |
| /* Default paragraph styles */ |
| p { font-family: Noto, sans-serif; font-weight: normal; font-size: 1em; line-height: 1.6; margin-bottom: 0.75em; text-rendering: optimizeLegibility; } |
| p aside { font-size: 0.875em; line-height: 1.35; font-style: italic; } |
| |
| /* Default header styles */ |
| h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6 { font-family: Noto, sans-serif; font-weight: normal; font-style: normal; color: black; text-rendering: optimizeLegibility; margin-top: 0.5em; margin-bottom: 0.5em; line-height: 1.2125em; } |
| h1 small, h2 small, h3 small, #toctitle small, .sidebarblock > .content > .title small, h4 small, h5 small, h6 small { font-size: 60%; color: #4d4d4d; line-height: 0; } |
| |
| h1 { font-size: 2.125em; } |
| |
| h2 { font-size: 1.6875em; } |
| |
| h3, #toctitle, .sidebarblock > .content > .title { font-size: 1.375em; } |
| |
| h4 { font-size: 1.125em; } |
| |
| h5 { font-size: 1.125em; } |
| |
| h6 { font-size: 1em; } |
| |
| hr { border: solid #dddddd; border-width: 1px 0 0; clear: both; margin: 1.25em 0 1.1875em; height: 0; } |
| |
| /* Helpful Typography Defaults */ |
| em, i { font-style: italic; line-height: inherit; } |
| |
| strong, b { font-weight: bold; line-height: inherit; } |
| |
| small { font-size: 60%; line-height: inherit; } |
| |
| code { font-family: Consolas, "Liberation Mono", Courier, monospace; font-weight: normal; color: #264357; } |
| |
| /* Lists */ |
| ul, ol, dl { font-size: 1em; line-height: 1.6; margin-bottom: 0.75em; list-style-position: outside; font-family: Noto, sans-serif; } |
| |
| ul, ol { margin-left: 1.5em; } |
| ul.no-bullet, ol.no-bullet { margin-left: 1.5em; } |
| |
| /* Unordered Lists */ |
| ul li ul, ul li ol { margin-left: 1.25em; margin-bottom: 0; font-size: 1em; /* Override nested font-size change */ } |
| ul.square li ul, ul.circle li ul, ul.disc li ul { list-style: inherit; } |
| ul.square { list-style-type: square; } |
| ul.circle { list-style-type: circle; } |
| ul.disc { list-style-type: disc; } |
| ul.no-bullet { list-style: none; } |
| |
| /* Ordered Lists */ |
| ol li ul, ol li ol { margin-left: 1.25em; margin-bottom: 0; } |
| |
| /* Definition Lists */ |
| dl dt { margin-bottom: 0.3em; font-weight: bold; } |
| dl dd { margin-bottom: 0.75em; } |
| |
| /* Abbreviations */ |
| abbr, acronym { text-transform: uppercase; font-size: 90%; color: black; border-bottom: 1px dotted #dddddd; cursor: help; } |
| |
| abbr { text-transform: none; } |
| |
| /* Blockquotes */ |
| blockquote { margin: 0 0 0.75em; padding: 0.5625em 1.25em 0 1.1875em; border-left: 1px solid #dddddd; } |
| blockquote cite { display: block; font-size: 0.8125em; color: #5e93b8; } |
| blockquote cite:before { content: "\2014 \0020"; } |
| blockquote cite a, blockquote cite a:visited { color: #5e93b8; } |
| |
| blockquote, blockquote p { line-height: 1.6; color: #333333; } |
| |
| /* Microformats */ |
| .vcard { display: inline-block; margin: 0 0 1.25em 0; border: 1px solid #dddddd; padding: 0.625em 0.75em; } |
| .vcard li { margin: 0; display: block; } |
| .vcard .fn { font-weight: bold; font-size: 0.9375em; } |
| |
| .vevent .summary { font-weight: bold; } |
| .vevent abbr { cursor: auto; text-decoration: none; font-weight: bold; border: none; padding: 0 0.0625em; } |
| |
| @media only screen and (min-width: 768px) { h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6 { line-height: 1.4; } |
| h1 { font-size: 2.75em; } |
| h2 { font-size: 2.3125em; } |
| h3, #toctitle, .sidebarblock > .content > .title { font-size: 1.6875em; } |
| h4 { font-size: 1.4375em; } } |
| /* Tables */ |
| table { background: white; margin-bottom: 1.25em; border: solid 1px #d8d8ce; } |
| table thead, table tfoot { background: -webkit-linear-gradient(top, #add386, #90b66a); background: linear-gradient(to bottom, #add386, #90b66a); font-weight: bold; } |
| table thead tr th, table thead tr td, table tfoot tr th, table tfoot tr td { padding: 0.5em 0.625em 0.625em; font-size: inherit; color: white; text-align: left; } |
| table tr th, table tr td { padding: 0.5625em 0.625em; font-size: inherit; color: #6d6e71; } |
| table tr.even, table tr.alt, table tr:nth-of-type(even) { background: #edf2f2; } |
| table thead tr th, table tfoot tr th, table tbody tr td, table tr td, table tfoot tr td { display: table-cell; line-height: 1.4; } |
| |
| body { -moz-osx-font-smoothing: grayscale; -webkit-font-smoothing: antialiased; tab-size: 4; } |
| |
| h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6 { line-height: 1.4; } |
| |
| a:hover, a:focus { text-decoration: underline; } |
| |
| .clearfix:before, .clearfix:after, .float-group:before, .float-group:after { content: " "; display: table; } |
| .clearfix:after, .float-group:after { clear: both; } |
| |
| *:not(pre) > code { font-size: inherit; font-style: normal !important; letter-spacing: 0; padding: 0; background-color: white; -webkit-border-radius: 0; border-radius: 0; line-height: inherit; word-wrap: break-word; } |
| *:not(pre) > code.nobreak { word-wrap: normal; } |
| *:not(pre) > code.nowrap { white-space: nowrap; } |
| |
| pre, pre > code { line-height: 1.6; color: #264357; font-family: Consolas, "Liberation Mono", Courier, monospace; font-weight: normal; } |
| |
| em em { font-style: normal; } |
| |
| strong strong { font-weight: normal; } |
| |
| .keyseq { color: #333333; } |
| |
| kbd { font-family: Consolas, "Liberation Mono", Courier, monospace; display: inline-block; color: black; font-size: 0.65em; line-height: 1.45; background-color: #f7f7f7; border: 1px solid #ccc; -webkit-border-radius: 3px; border-radius: 3px; -webkit-box-shadow: 0 1px 0 rgba(0, 0, 0, 0.2), 0 0 0 0.1em white inset; box-shadow: 0 1px 0 rgba(0, 0, 0, 0.2), 0 0 0 0.1em white inset; margin: 0 0.15em; padding: 0.2em 0.5em; vertical-align: middle; position: relative; top: -0.1em; white-space: nowrap; } |
| |
| .keyseq kbd:first-child { margin-left: 0; } |
| |
| .keyseq kbd:last-child { margin-right: 0; } |
| |
| .menuseq, .menuref { color: #000; } |
| |
| .menuseq b:not(.caret), .menuref { font-weight: inherit; } |
| |
| .menuseq { word-spacing: -0.02em; } |
| .menuseq b.caret { font-size: 1.25em; line-height: 0.8; } |
| .menuseq i.caret { font-weight: bold; text-align: center; width: 0.45em; } |
| |
| b.button:before, b.button:after { position: relative; top: -1px; font-weight: normal; } |
| |
| b.button:before { content: "["; padding: 0 3px 0 2px; } |
| |
| b.button:after { content: "]"; padding: 0 2px 0 3px; } |
| |
| #header, #content, #footnotes, #footer { width: 100%; margin-left: auto; margin-right: auto; margin-top: 0; margin-bottom: 0; max-width: 62.5em; *zoom: 1; position: relative; padding-left: 1.5em; padding-right: 1.5em; } |
| #header:before, #header:after, #content:before, #content:after, #footnotes:before, #footnotes:after, #footer:before, #footer:after { content: " "; display: table; } |
| #header:after, #content:after, #footnotes:after, #footer:after { clear: both; } |
| |
| #content { margin-top: 1.25em; } |
| |
| #content:before { content: none; } |
| |
| #header > h1:first-child { color: black; margin-top: 2.25rem; margin-bottom: 0; } |
| #header > h1:first-child + #toc { margin-top: 8px; border-top: 1px solid #dddddd; } |
| #header > h1:only-child, body.toc2 #header > h1:nth-last-child(2) { border-bottom: 1px solid #dddddd; padding-bottom: 8px; } |
| #header .details { border-bottom: 1px solid #dddddd; line-height: 1.45; padding-top: 0.25em; padding-bottom: 0.25em; padding-left: 0.25em; color: #5e93b8; display: -ms-flexbox; display: -webkit-flex; display: flex; -ms-flex-flow: row wrap; -webkit-flex-flow: row wrap; flex-flow: row wrap; } |
| #header .details span:first-child { margin-left: -0.125em; } |
| #header .details span.email a { color: #333333; } |
| #header .details br { display: none; } |
| #header .details br + span:before { content: "\00a0\2013\00a0"; } |
| #header .details br + span.author:before { content: "\00a0\22c5\00a0"; color: #333333; } |
| #header .details br + span#revremark:before { content: "\00a0|\00a0"; } |
| #header #revnumber { text-transform: capitalize; } |
| #header #revnumber:after { content: "\00a0"; } |
| |
| #content > h1:first-child:not([class]) { color: black; border-bottom: 1px solid #dddddd; padding-bottom: 8px; margin-top: 0; padding-top: 1rem; margin-bottom: 1.25rem; } |
| |
| #toc { border-bottom: 0 solid #dddddd; padding-bottom: 0.5em; } |
| #toc > ul { margin-left: 0.125em; } |
| #toc ul.sectlevel0 > li > a { font-style: italic; } |
| #toc ul.sectlevel0 ul.sectlevel1 { margin: 0.5em 0; } |
| #toc ul { font-family: Noto, sans-serif; list-style-type: none; } |
| #toc li { line-height: 1.3334; margin-top: 0.3334em; } |
| #toc a { text-decoration: none; } |
| #toc a:active { text-decoration: underline; } |
| |
| #toctitle { color: black; font-size: 1.2em; } |
| |
| @media only screen and (min-width: 768px) { #toctitle { font-size: 1.375em; } |
| body.toc2 { padding-left: 15em; padding-right: 0; } |
| #toc.toc2 { margin-top: 0 !important; background-color: white; position: fixed; width: 15em; left: 0; top: 0; border-right: 1px solid #dddddd; border-top-width: 0 !important; border-bottom-width: 0 !important; z-index: 1000; padding: 1.25em 1em; height: 100%; overflow: auto; } |
| #toc.toc2 #toctitle { margin-top: 0; margin-bottom: 0.8rem; font-size: 1.2em; } |
| #toc.toc2 > ul { font-size: 0.9em; margin-bottom: 0; } |
| #toc.toc2 ul ul { margin-left: 0; padding-left: 1em; } |
| #toc.toc2 ul.sectlevel0 ul.sectlevel1 { padding-left: 0; margin-top: 0.5em; margin-bottom: 0.5em; } |
| body.toc2.toc-right { padding-left: 0; padding-right: 15em; } |
| body.toc2.toc-right #toc.toc2 { border-right-width: 0; border-left: 1px solid #dddddd; left: auto; right: 0; } } |
| @media only screen and (min-width: 1280px) { body.toc2 { padding-left: 20em; padding-right: 0; } |
| #toc.toc2 { width: 20em; } |
| #toc.toc2 #toctitle { font-size: 1.375em; } |
| #toc.toc2 > ul { font-size: 0.95em; } |
| #toc.toc2 ul ul { padding-left: 1.25em; } |
| body.toc2.toc-right { padding-left: 0; padding-right: 20em; } } |
| #content #toc { border-style: solid; border-width: 1px; border-color: #e6e6e6; margin-bottom: 1.25em; padding: 1.25em; background: white; -webkit-border-radius: 0; border-radius: 0; } |
| #content #toc > :first-child { margin-top: 0; } |
| #content #toc > :last-child { margin-bottom: 0; } |
| |
| #footer { max-width: 100%; background-color: transparent; padding: 1.25em; } |
| |
| #footer-text { color: black; line-height: 1.44; } |
| |
| #content { margin-bottom: 0.625em; } |
| |
| .sect1 { padding-bottom: 0.625em; } |
| |
| @media only screen and (min-width: 768px) { #content { margin-bottom: 1.25em; } |
| .sect1 { padding-bottom: 1.25em; } } |
| .sect1:last-child { padding-bottom: 0; } |
| |
| .sect1 + .sect1 { border-top: 0 solid #dddddd; } |
| |
| #content h1 > a.anchor, h2 > a.anchor, h3 > a.anchor, #toctitle > a.anchor, .sidebarblock > .content > .title > a.anchor, h4 > a.anchor, h5 > a.anchor, h6 > a.anchor { position: absolute; z-index: 1001; width: 1.5ex; margin-left: -1.5ex; display: block; text-decoration: none !important; visibility: hidden; text-align: center; font-weight: normal; } |
| #content h1 > a.anchor:before, h2 > a.anchor:before, h3 > a.anchor:before, #toctitle > a.anchor:before, .sidebarblock > .content > .title > a.anchor:before, h4 > a.anchor:before, h5 > a.anchor:before, h6 > a.anchor:before { content: "\00A7"; font-size: 0.85em; display: block; padding-top: 0.1em; } |
| #content h1:hover > a.anchor, #content h1 > a.anchor:hover, h2:hover > a.anchor, h2 > a.anchor:hover, h3:hover > a.anchor, #toctitle:hover > a.anchor, .sidebarblock > .content > .title:hover > a.anchor, h3 > a.anchor:hover, #toctitle > a.anchor:hover, .sidebarblock > .content > .title > a.anchor:hover, h4:hover > a.anchor, h4 > a.anchor:hover, h5:hover > a.anchor, h5 > a.anchor:hover, h6:hover > a.anchor, h6 > a.anchor:hover { visibility: visible; } |
| #content h1 > a.link, h2 > a.link, h3 > a.link, #toctitle > a.link, .sidebarblock > .content > .title > a.link, h4 > a.link, h5 > a.link, h6 > a.link { color: black; text-decoration: none; } |
| #content h1 > a.link:hover, h2 > a.link:hover, h3 > a.link:hover, #toctitle > a.link:hover, .sidebarblock > .content > .title > a.link:hover, h4 > a.link:hover, h5 > a.link:hover, h6 > a.link:hover { color: black; } |
| |
| .audioblock, .imageblock, .literalblock, .listingblock, .stemblock, .videoblock { margin-bottom: 1.25em; } |
| |
| .admonitionblock td.content > .title, .audioblock > .title, .exampleblock > .title, .imageblock > .title, .listingblock > .title, .literalblock > .title, .stemblock > .title, .openblock > .title, .paragraph > .title, .quoteblock > .title, table.tableblock > .title, .verseblock > .title, .videoblock > .title, .dlist > .title, .olist > .title, .ulist > .title, .qlist > .title, .hdlist > .title { text-rendering: optimizeLegibility; text-align: left; } |
| |
| table.tableblock > caption.title { white-space: nowrap; overflow: visible; max-width: 0; } |
| |
| .paragraph.lead > p, #preamble > .sectionbody > .paragraph:first-of-type p { color: black; } |
| |
| table.tableblock #preamble > .sectionbody > .paragraph:first-of-type p { font-size: inherit; } |
| |
| .admonitionblock > table { border-collapse: separate; border: 0; background: none; width: 100%; } |
| .admonitionblock > table td.icon { text-align: center; width: 80px; } |
| .admonitionblock > table td.icon img { max-width: initial; } |
| .admonitionblock > table td.icon .title { font-weight: bold; font-family: Noto, sans-serif; text-transform: uppercase; } |
| .admonitionblock > table td.content { padding-left: 1.125em; padding-right: 1.25em; border-left: 1px solid #dddddd; color: #5e93b8; } |
| .admonitionblock > table td.content > :last-child > :last-child { margin-bottom: 0; } |
| |
| .exampleblock > .content { border-style: solid; border-width: 1px; border-color: #e6e6e6; margin-bottom: 1.25em; padding: 1.25em; background: white; -webkit-border-radius: 0; border-radius: 0; } |
| .exampleblock > .content > :first-child { margin-top: 0; } |
| .exampleblock > .content > :last-child { margin-bottom: 0; } |
| |
| .sidebarblock { border-style: solid; border-width: 1px; border-color: #e6e6e6; margin-bottom: 1.25em; padding: 1.25em; background: white; -webkit-border-radius: 0; border-radius: 0; } |
| .sidebarblock > :first-child { margin-top: 0; } |
| .sidebarblock > :last-child { margin-bottom: 0; } |
| .sidebarblock > .content > .title { color: black; margin-top: 0; } |
| |
| .exampleblock > .content > :last-child > :last-child, .exampleblock > .content .olist > ol > li:last-child > :last-child, .exampleblock > .content .ulist > ul > li:last-child > :last-child, .exampleblock > .content .qlist > ol > li:last-child > :last-child, .sidebarblock > .content > :last-child > :last-child, .sidebarblock > .content .olist > ol > li:last-child > :last-child, .sidebarblock > .content .ulist > ul > li:last-child > :last-child, .sidebarblock > .content .qlist > ol > li:last-child > :last-child { margin-bottom: 0; } |
| |
| .literalblock pre, .listingblock pre:not(.highlight), .listingblock pre[class="highlight"], .listingblock pre[class^="highlight "], .listingblock pre.CodeRay, .listingblock pre.prettyprint { background: #eeeeee; } |
| .sidebarblock .literalblock pre, .sidebarblock .listingblock pre:not(.highlight), .sidebarblock .listingblock pre[class="highlight"], .sidebarblock .listingblock pre[class^="highlight "], .sidebarblock .listingblock pre.CodeRay, .sidebarblock .listingblock pre.prettyprint { background: #f2f1f1; } |
| |
| .literalblock pre, .literalblock pre[class], .listingblock pre, .listingblock pre[class] { border: 1px hidden #666666; -webkit-border-radius: 0; border-radius: 0; word-wrap: break-word; padding: 1.25em 1.5625em 1.125em 1.5625em; font-size: 0.8125em; } |
| .literalblock pre.nowrap, .literalblock pre[class].nowrap, .listingblock pre.nowrap, .listingblock pre[class].nowrap { overflow-x: auto; white-space: pre; word-wrap: normal; } |
| @media only screen and (min-width: 768px) { .literalblock pre, .literalblock pre[class], .listingblock pre, .listingblock pre[class] { font-size: 0.90625em; } } |
| @media only screen and (min-width: 1280px) { .literalblock pre, .literalblock pre[class], .listingblock pre, .listingblock pre[class] { font-size: 1em; } } |
| |
| .literalblock.output pre { color: #eeeeee; background-color: #264357; } |
| |
| .listingblock pre.highlightjs { padding: 0; } |
| .listingblock pre.highlightjs > code { padding: 1.25em 1.5625em 1.125em 1.5625em; -webkit-border-radius: 0; border-radius: 0; } |
| |
| .listingblock > .content { position: relative; } |
| |
| .listingblock code[data-lang]:before { display: none; content: attr(data-lang); position: absolute; font-size: 0.75em; top: 0.425rem; right: 0.5rem; line-height: 1; text-transform: uppercase; color: #999; } |
| |
| .listingblock:hover code[data-lang]:before { display: block; } |
| |
| .listingblock.terminal pre .command:before { content: attr(data-prompt); padding-right: 0.5em; color: #999; } |
| |
| .listingblock.terminal pre .command:not([data-prompt]):before { content: "$"; } |
| |
| table.pyhltable { border-collapse: separate; border: 0; margin-bottom: 0; background: none; } |
| |
| table.pyhltable td { vertical-align: top; padding-top: 0; padding-bottom: 0; line-height: 1.6; } |
| |
| table.pyhltable td.code { padding-left: .75em; padding-right: 0; } |
| |
| pre.pygments .lineno, table.pyhltable td:not(.code) { color: #999; padding-left: 0; padding-right: .5em; border-right: 1px solid #dddddd; } |
| |
| pre.pygments .lineno { display: inline-block; margin-right: .25em; } |
| |
| table.pyhltable .linenodiv { background: none !important; padding-right: 0 !important; } |
| |
| .quoteblock { margin: 0 1em 0.75em 1.5em; display: table; } |
| .quoteblock > .title { margin-left: -1.5em; margin-bottom: 0.75em; } |
| .quoteblock blockquote, .quoteblock blockquote p { color: #333333; font-size: 1.15rem; line-height: 1.75; word-spacing: 0.1em; letter-spacing: 0; font-style: italic; text-align: justify; } |
| .quoteblock blockquote { margin: 0; padding: 0; border: 0; } |
| .quoteblock blockquote:before { content: "\201c"; float: left; font-size: 2.75em; font-weight: bold; line-height: 0.6em; margin-left: -0.6em; color: black; text-shadow: 0 1px 2px rgba(0, 0, 0, 0.1); } |
| .quoteblock blockquote > .paragraph:last-child p { margin-bottom: 0; } |
| .quoteblock .attribution { margin-top: 0.5em; margin-right: 0.5ex; text-align: right; } |
| .quoteblock .quoteblock { margin-left: 0; margin-right: 0; padding: 0.5em 0; border-left: 3px solid #5e93b8; } |
| .quoteblock .quoteblock blockquote { padding: 0 0 0 0.75em; } |
| .quoteblock .quoteblock blockquote:before { display: none; } |
| |
| .verseblock { margin: 0 1em 0.75em 1em; } |
| .verseblock pre { font-family: "Open Sans", "DejaVu Sans", sans-serif; font-size: 1.15rem; color: #333333; font-weight: 300; text-rendering: optimizeLegibility; } |
| .verseblock pre strong { font-weight: 400; } |
| .verseblock .attribution { margin-top: 1.25rem; margin-left: 0.5ex; } |
| |
| .quoteblock .attribution, .verseblock .attribution { font-size: 0.8125em; line-height: 1.45; font-style: italic; } |
| .quoteblock .attribution br, .verseblock .attribution br { display: none; } |
| .quoteblock .attribution cite, .verseblock .attribution cite { display: block; letter-spacing: -0.025em; color: #5e93b8; } |
| |
| .quoteblock.abstract { margin: 0 0 0.75em 0; display: block; } |
| .quoteblock.abstract blockquote, .quoteblock.abstract blockquote p { text-align: left; word-spacing: 0; } |
| .quoteblock.abstract blockquote:before, .quoteblock.abstract blockquote p:first-of-type:before { display: none; } |
| |
| table.tableblock { max-width: 100%; border-collapse: separate; } |
| table.tableblock td > .paragraph:last-child p > p:last-child, table.tableblock th > p:last-child, table.tableblock td > p:last-child { margin-bottom: 0; } |
| |
| table.tableblock, th.tableblock, td.tableblock { border: 0 solid #d8d8ce; } |
| |
| table.grid-all > thead > tr > .tableblock, table.grid-all > tbody > tr > .tableblock { border-width: 0 1px 1px 0; } |
| |
| table.grid-all > tfoot > tr > .tableblock { border-width: 1px 1px 0 0; } |
| |
| table.grid-cols > * > tr > .tableblock { border-width: 0 1px 0 0; } |
| |
| table.grid-rows > thead > tr > .tableblock, table.grid-rows > tbody > tr > .tableblock { border-width: 0 0 1px 0; } |
| |
| table.grid-rows > tfoot > tr > .tableblock { border-width: 1px 0 0 0; } |
| |
| table.grid-all > * > tr > .tableblock:last-child, table.grid-cols > * > tr > .tableblock:last-child { border-right-width: 0; } |
| |
| table.grid-all > tbody > tr:last-child > .tableblock, table.grid-all > thead:last-child > tr > .tableblock, table.grid-rows > tbody > tr:last-child > .tableblock, table.grid-rows > thead:last-child > tr > .tableblock { border-bottom-width: 0; } |
| |
| table.frame-all { border-width: 1px; } |
| |
| table.frame-sides { border-width: 0 1px; } |
| |
| table.frame-topbot { border-width: 1px 0; } |
| |
| th.halign-left, td.halign-left { text-align: left; } |
| |
| th.halign-right, td.halign-right { text-align: right; } |
| |
| th.halign-center, td.halign-center { text-align: center; } |
| |
| th.valign-top, td.valign-top { vertical-align: top; } |
| |
| th.valign-bottom, td.valign-bottom { vertical-align: bottom; } |
| |
| th.valign-middle, td.valign-middle { vertical-align: middle; } |
| |
| table thead th, table tfoot th { font-weight: bold; } |
| |
| tbody tr th { display: table-cell; line-height: 1.4; background: -webkit-linear-gradient(top, #add386, #90b66a); } |
| |
| tbody tr th, tbody tr th p, tfoot tr th, tfoot tr th p { color: white; font-weight: bold; } |
| |
| p.tableblock > code:only-child { background: none; padding: 0; } |
| |
| p.tableblock { font-size: 1em; } |
| |
| td > div.verse { white-space: pre; } |
| |
| ol { margin-left: 1.75em; } |
| |
| ul li ol { margin-left: 1.5em; } |
| |
| dl dd { margin-left: 1.125em; } |
| |
| dl dd:last-child, dl dd:last-child > :last-child { margin-bottom: 0; } |
| |
| ol > li p, ul > li p, ul dd, ol dd, .olist .olist, .ulist .ulist, .ulist .olist, .olist .ulist { margin-bottom: 0.375em; } |
| |
| ul.checklist, ul.none, ol.none, ul.no-bullet, ol.no-bullet, ol.unnumbered, ul.unstyled, ol.unstyled { list-style-type: none; } |
| |
| ul.no-bullet, ol.no-bullet, ol.unnumbered { margin-left: 0.625em; } |
| |
| ul.unstyled, ol.unstyled { margin-left: 0; } |
| |
| ul.checklist { margin-left: 0.625em; } |
| |
| ul.checklist li > p:first-child > .fa-square-o:first-child, ul.checklist li > p:first-child > .fa-check-square-o:first-child { width: 1.25em; font-size: 0.8em; position: relative; bottom: 0.125em; } |
| |
| ul.checklist li > p:first-child > input[type="checkbox"]:first-child { margin-right: 0.25em; } |
| |
| ul.inline { display: -ms-flexbox; display: -webkit-box; display: flex; -ms-flex-flow: row wrap; -webkit-flex-flow: row wrap; flex-flow: row wrap; list-style: none; margin: 0 0 0.375em -0.75em; } |
| |
| ul.inline > li { margin-left: 0.75em; } |
| |
| .unstyled dl dt { font-weight: normal; font-style: normal; } |
| |
| ol.arabic { list-style-type: decimal; } |
| |
| ol.decimal { list-style-type: decimal-leading-zero; } |
| |
| ol.loweralpha { list-style-type: lower-alpha; } |
| |
| ol.upperalpha { list-style-type: upper-alpha; } |
| |
| ol.lowerroman { list-style-type: lower-roman; } |
| |
| ol.upperroman { list-style-type: upper-roman; } |
| |
| ol.lowergreek { list-style-type: lower-greek; } |
| |
| .hdlist > table, .colist > table { border: 0; background: none; } |
| .hdlist > table > tbody > tr, .colist > table > tbody > tr { background: none; } |
| |
| td.hdlist1, td.hdlist2 { vertical-align: top; padding: 0 0.625em; } |
| |
| td.hdlist1 { font-weight: bold; padding-bottom: 0.75em; } |
| |
| .literalblock + .colist, .listingblock + .colist { margin-top: -0.5em; } |
| |
| .colist > table tr > td:first-of-type { padding: 0.4em 0.75em 0 0.75em; line-height: 1; vertical-align: top; } |
| .colist > table tr > td:first-of-type img { max-width: initial; } |
| .colist > table tr > td:last-of-type { padding: 0.25em 0; } |
| |
| .thumb, .th { line-height: 0; display: inline-block; border: solid 4px white; -webkit-box-shadow: 0 0 0 1px #dddddd; box-shadow: 0 0 0 1px #dddddd; } |
| |
| .imageblock.left, .imageblock[style*="float: left"] { margin: 0.25em 0.625em 1.25em 0; } |
| .imageblock.right, .imageblock[style*="float: right"] { margin: 0.25em 0 1.25em 0.625em; } |
| .imageblock > .title { margin-bottom: 0; } |
| .imageblock.thumb, .imageblock.th { border-width: 6px; } |
| .imageblock.thumb > .title, .imageblock.th > .title { padding: 0 0.125em; } |
| |
| .image.left, .image.right { margin-top: 0.25em; margin-bottom: 0.25em; display: inline-block; line-height: 0; } |
| .image.left { margin-right: 0.625em; } |
| .image.right { margin-left: 0.625em; } |
| |
| a.image { text-decoration: none; display: inline-block; } |
| a.image object { pointer-events: none; } |
| |
| sup.footnote, sup.footnoteref { font-size: 0.875em; position: static; vertical-align: super; } |
| sup.footnote a, sup.footnoteref a { text-decoration: none; } |
| sup.footnote a:active, sup.footnoteref a:active { text-decoration: underline; } |
| |
| #footnotes { padding-top: 0.75em; padding-bottom: 0.75em; margin-bottom: 0.625em; } |
| #footnotes hr { width: 20%; min-width: 6.25em; margin: -0.25em 0 0.75em 0; border-width: 1px 0 0 0; } |
| #footnotes .footnote { padding: 0 0.375em 0 0.225em; line-height: 1.3334; font-size: 0.875em; margin-left: 1.2em; margin-bottom: 0.2em; } |
| #footnotes .footnote a:first-of-type { font-weight: bold; text-decoration: none; margin-left: -1.05em; } |
| #footnotes .footnote:last-of-type { margin-bottom: 0; } |
| #content #footnotes { margin-top: -0.625em; margin-bottom: 0; padding: 0.75em 0; } |
| |
| .gist .file-data > table { border: 0; background: #fff; width: 100%; margin-bottom: 0; } |
| .gist .file-data > table td.line-data { width: 99%; } |
| |
| div.unbreakable { page-break-inside: avoid; } |
| |
| .big { font-size: larger; } |
| |
| .small { font-size: smaller; } |
| |
| .underline { text-decoration: underline; } |
| |
| .overline { text-decoration: overline; } |
| |
| .line-through { text-decoration: line-through; } |
| |
| .aqua { color: #00bfbf; } |
| |
| .aqua-background { background-color: #00fafa; } |
| |
| .black { color: black; } |
| |
| .black-background { background-color: black; } |
| |
| .blue { color: #0000bf; } |
| |
| .blue-background { background-color: #0000fa; } |
| |
| .fuchsia { color: #bf00bf; } |
| |
| .fuchsia-background { background-color: #fa00fa; } |
| |
| .gray { color: #606060; } |
| |
| .gray-background { background-color: #7d7d7d; } |
| |
| .green { color: #006000; } |
| |
| .green-background { background-color: #007d00; } |
| |
| .lime { color: #00bf00; } |
| |
| .lime-background { background-color: #00fa00; } |
| |
| .maroon { color: #600000; } |
| |
| .maroon-background { background-color: #7d0000; } |
| |
| .navy { color: #000060; } |
| |
| .navy-background { background-color: #00007d; } |
| |
| .olive { color: #606000; } |
| |
| .olive-background { background-color: #7d7d00; } |
| |
| .purple { color: #600060; } |
| |
| .purple-background { background-color: #7d007d; } |
| |
| .red { color: #bf0000; } |
| |
| .red-background { background-color: #fa0000; } |
| |
| .silver { color: #909090; } |
| |
| .silver-background { background-color: #bcbcbc; } |
| |
| .teal { color: #006060; } |
| |
| .teal-background { background-color: #007d7d; } |
| |
| .white { color: #bfbfbf; } |
| |
| .white-background { background-color: #fafafa; } |
| |
| .yellow { color: #bfbf00; } |
| |
| .yellow-background { background-color: #fafa00; } |
| |
| span.icon > .fa { cursor: default; } |
| a span.icon > .fa { cursor: inherit; } |
| |
| .admonitionblock td.icon [class^="fa icon-"] { font-size: 2.5em; text-shadow: 1px 1px 2px rgba(0, 0, 0, 0.5); cursor: default; } |
| .admonitionblock td.icon .icon-note:before { content: "\f05a"; color: #29475c; } |
| .admonitionblock td.icon .icon-tip:before { content: "\f0eb"; text-shadow: 1px 1px 2px rgba(155, 155, 0, 0.8); color: #111; } |
| .admonitionblock td.icon .icon-warning:before { content: "\f071"; color: #bf6900; } |
| .admonitionblock td.icon .icon-caution:before { content: "\f06d"; color: #bf3400; } |
| .admonitionblock td.icon .icon-important:before { content: "\f06a"; color: #bf0000; } |
| |
| .conum[data-value] { display: inline-block; color: #fff !important; background-color: black; -webkit-border-radius: 100px; border-radius: 100px; text-align: center; font-size: 0.75em; width: 1.67em; height: 1.67em; line-height: 1.67em; font-family: "Open Sans", "DejaVu Sans", sans-serif; font-style: normal; font-weight: bold; } |
| .conum[data-value] * { color: #fff !important; } |
| .conum[data-value] + b { display: none; } |
| .conum[data-value]:after { content: attr(data-value); } |
| pre .conum[data-value] { position: relative; top: -0.125em; } |
| |
| b.conum * { color: inherit !important; } |
| |
| .conum:not([data-value]):empty { display: none; } |
| |
| h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6 { border-bottom: 1px solid #dddddd; } |
| |
| .sect1 { padding-bottom: 0; } |
| |
| #toctitle { color: #00406F; font-weight: normal; margin-top: 1.5em; } |
| |
| .sidebarblock { border-color: #aaa; } |
| |
| code { -webkit-border-radius: 4px; border-radius: 4px; } |
| |
| p.tableblock.header { color: #6d6e71; } |
| |
| .literalblock pre, .listingblock pre { background: #eeeeee; } |
| |
| </style> |
| <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css"> |
| <style> |
| /* Stylesheet for CodeRay to match GitHub theme | MIT License | http://foundation.zurb.com */ |
| /*pre.CodeRay {background-color:#f7f7f8;}*/ |
| .CodeRay .line-numbers{border-right:1px solid #d8d8d8;padding:0 0.5em 0 .25em} |
| .CodeRay span.line-numbers{display:inline-block;margin-right:.5em;color:rgba(0,0,0,.3)} |
| .CodeRay .line-numbers strong{color:rgba(0,0,0,.4)} |
| table.CodeRay{border-collapse:separate;border-spacing:0;margin-bottom:0;border:0;background:none} |
| table.CodeRay td{vertical-align: top;line-height:1.45} |
| table.CodeRay td.line-numbers{text-align:right} |
| table.CodeRay td.line-numbers>pre{padding:0;color:rgba(0,0,0,.3)} |
| table.CodeRay td.code{padding:0 0 0 .5em} |
| table.CodeRay td.code>pre{padding:0} |
| .CodeRay .debug{color:#fff !important;background:#000080 !important} |
| .CodeRay .annotation{color:#007} |
| .CodeRay .attribute-name{color:#000080} |
| .CodeRay .attribute-value{color:#700} |
| .CodeRay .binary{color:#509} |
| .CodeRay .comment{color:#998;font-style:italic} |
| .CodeRay .char{color:#04d} |
| .CodeRay .char .content{color:#04d} |
| .CodeRay .char .delimiter{color:#039} |
| .CodeRay .class{color:#458;font-weight:bold} |
| .CodeRay .complex{color:#a08} |
| .CodeRay .constant,.CodeRay .predefined-constant{color:#008080} |
| .CodeRay .color{color:#099} |
| .CodeRay .class-variable{color:#369} |
| .CodeRay .decorator{color:#b0b} |
| .CodeRay .definition{color:#099} |
| .CodeRay .delimiter{color:#000} |
| .CodeRay .doc{color:#970} |
| .CodeRay .doctype{color:#34b} |
| .CodeRay .doc-string{color:#d42} |
| .CodeRay .escape{color:#666} |
| .CodeRay .entity{color:#800} |
| .CodeRay .error{color:#808} |
| .CodeRay .exception{color:inherit} |
| .CodeRay .filename{color:#099} |
| .CodeRay .function{color:#900;font-weight:bold} |
| .CodeRay .global-variable{color:#008080} |
| .CodeRay .hex{color:#058} |
| .CodeRay .integer,.CodeRay .float{color:#099} |
| .CodeRay .include{color:#555} |
| .CodeRay .inline{color:#000} |
| .CodeRay .inline .inline{background:#ccc} |
| .CodeRay .inline .inline .inline{background:#bbb} |
| .CodeRay .inline .inline-delimiter{color:#d14} |
| .CodeRay .inline-delimiter{color:#d14} |
| .CodeRay .important{color:#555;font-weight:bold} |
| .CodeRay .interpreted{color:#b2b} |
| .CodeRay .instance-variable{color:#008080} |
| .CodeRay .label{color:#970} |
| .CodeRay .local-variable{color:#963} |
| .CodeRay .octal{color:#40e} |
| .CodeRay .predefined{color:#369} |
| .CodeRay .preprocessor{color:#579} |
| .CodeRay .pseudo-class{color:#555} |
| .CodeRay .directive{font-weight:bold} |
| .CodeRay .type{font-weight:bold} |
| .CodeRay .predefined-type{color:inherit} |
| .CodeRay .reserved,.CodeRay .keyword {color:#000;font-weight:bold} |
| .CodeRay .key{color:#808} |
| .CodeRay .key .delimiter{color:#606} |
| .CodeRay .key .char{color:#80f} |
| .CodeRay .value{color:#088} |
| .CodeRay .regexp .delimiter{color:#808} |
| .CodeRay .regexp .content{color:#808} |
| .CodeRay .regexp .modifier{color:#808} |
| .CodeRay .regexp .char{color:#d14} |
| .CodeRay .regexp .function{color:#404;font-weight:bold} |
| .CodeRay .string{color:#d20} |
| .CodeRay .string .string .string{background:#ffd0d0} |
| .CodeRay .string .content{color:#d14} |
| .CodeRay .string .char{color:#d14} |
| .CodeRay .string .delimiter{color:#d14} |
| .CodeRay .shell{color:#d14} |
| .CodeRay .shell .delimiter{color:#d14} |
| .CodeRay .symbol{color:#990073} |
| .CodeRay .symbol .content{color:#a60} |
| .CodeRay .symbol .delimiter{color:#630} |
| .CodeRay .tag{color:#008080} |
| .CodeRay .tag-special{color:#d70} |
| .CodeRay .variable{color:#036} |
| .CodeRay .insert{background:#afa} |
| .CodeRay .delete{background:#faa} |
| .CodeRay .change{color:#aaf;background:#007} |
| .CodeRay .head{color:#f8f;background:#505} |
| .CodeRay .insert .insert{color:#080} |
| .CodeRay .delete .delete{color:#800} |
| .CodeRay .change .change{color:#66f} |
| .CodeRay .head .head{color:#f4f} |
| </style> |
| <link rel="stylesheet" href="../katex/katex.min.css"> |
| <script src="../katex/katex.min.js"></script> |
| <script src="../katex/contrib/auto-render.min.js"></script> |
| <!-- Use KaTeX to render math once document is loaded, see |
| https://github.com/Khan/KaTeX/tree/master/contrib/auto-render --> |
| <script> |
| document.addEventListener("DOMContentLoaded", function () { |
| renderMathInElement( |
| document.body, |
| { |
| delimiters: [ |
| { left: "$$", right: "$$", display: true}, |
| { left: "\\[", right: "\\]", display: true}, |
| { left: "$", right: "$", display: false}, |
| { left: "\\(", right: "\\)", display: false} |
| ] |
| } |
| ); |
| }); |
| </script></head> |
| <body class="book toc2 toc-left" style="max-width: 100;"> |
| <div id="header"> |
| <h1>The OpenCL<sup>™</sup> Specification</h1> |
| <div class="details"> |
| <span id="author" class="author">Khronos<sup>®</sup> OpenCL Working Group</span><br> |
| <span id="revnumber">version v3.0.6,</span> |
| <span id="revdate">Fri, 18 Dec 2020 12:00:00 +0000</span> |
| <br><span id="revremark">from git branch: master commit: e9a4d468b1a0a38c1e10b8af484bb2bbb495e2b7</span> |
| </div> |
| <div id="toc" class="toc2"> |
| <div id="toctitle">Table of Contents</div> |
| <ul class="sectlevel1"> |
| <li><a href="#_introduction">1. Introduction</a> |
| <ul class="sectlevel2"> |
| <li><a href="#_normative_references">1.1. Normative References</a></li> |
| <li><a href="#_version_numbers">1.2. Version Numbers</a></li> |
| <li><a href="#unified-spec">1.3. Unified Specification</a></li> |
| </ul> |
| </li> |
| <li><a href="#_glossary">2. Glossary</a></li> |
| <li><a href="#_the_opencl_architecture">3. The OpenCL Architecture</a> |
| <ul class="sectlevel2"> |
| <li><a href="#_platform_model">3.1. Platform Model</a></li> |
| <li><a href="#_execution_model">3.2. Execution Model</a> |
| <ul class="sectlevel3"> |
| <li><a href="#_mapping_work_items_onto_an_ndrange">3.2.1. Mapping work-items onto an NDRange</a></li> |
| <li><a href="#_execution_of_kernel_instances">3.2.2. Execution of kernel-instances</a></li> |
| <li><a href="#device-side-enqueue">3.2.3. Device-side enqueue</a></li> |
| <li><a href="#execution-model-sync">3.2.4. Synchronization</a></li> |
| <li><a href="#_categories_of_kernels">3.2.5. Categories of Kernels</a></li> |
| </ul> |
| </li> |
| <li><a href="#_memory_model">3.3. Memory Model</a> |
| <ul class="sectlevel3"> |
| <li><a href="#_fundamental_memory_regions">3.3.1. Fundamental Memory Regions</a></li> |
| <li><a href="#_memory_objects">3.3.2. Memory Objects</a></li> |
| <li><a href="#shared-virtual-memory">3.3.3. Shared Virtual Memory</a></li> |
| <li><a href="#_memory_consistency_model_for_opencl_1_x">3.3.4. Memory Consistency Model for OpenCL 1.x</a></li> |
| <li><a href="#memory-consistency-model">3.3.5. Memory Consistency Model for OpenCL 2.x</a></li> |
| <li><a href="#_overview_of_atomic_and_fence_operations">3.3.6. Overview of atomic and fence operations</a></li> |
| <li><a href="#memory-ordering-rules">3.3.7. Memory Ordering Rules</a></li> |
| </ul> |
| </li> |
| <li><a href="#opencl-framework">3.4. The OpenCL Framework</a> |
| <ul class="sectlevel3"> |
| <li><a href="#_mixed_version_support">3.4.1. Mixed Version Support</a></li> |
| <li><a href="#_backwards_compatibility">3.4.2. Backwards Compatibility</a></li> |
| <li><a href="#_versioning">3.4.3. Versioning</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li><a href="#opencl-platform-layer">4. The OpenCL Platform Layer</a> |
| <ul class="sectlevel2"> |
| <li><a href="#_querying_platform_info">4.1. Querying Platform Info</a></li> |
| <li><a href="#platform-querying-devices">4.2. Querying Devices</a></li> |
| <li><a href="#_partitioning_a_device">4.3. Partitioning a Device</a></li> |
| <li><a href="#_contexts">4.4. Contexts</a></li> |
| </ul> |
| </li> |
| <li><a href="#opencl-runtime">5. The OpenCL Runtime</a> |
| <ul class="sectlevel2"> |
| <li><a href="#_command_queues">5.1. Command Queues</a></li> |
| <li><a href="#_buffer_objects">5.2. Buffer Objects</a> |
| <ul class="sectlevel3"> |
| <li><a href="#_creating_buffer_objects">5.2.1. Creating Buffer Objects</a></li> |
| <li><a href="#_reading_writing_and_copying_buffer_objects">5.2.2. Reading, Writing and Copying Buffer Objects</a></li> |
| <li><a href="#_filling_buffer_objects">5.2.3. Filling Buffer Objects</a></li> |
| <li><a href="#_mapping_buffer_objects">5.2.4. Mapping Buffer Objects</a></li> |
| </ul> |
| </li> |
| <li><a href="#_image_objects">5.3. Image Objects</a> |
| <ul class="sectlevel3"> |
| <li><a href="#_creating_image_objects">5.3.1. Creating Image Objects</a></li> |
| <li><a href="#_querying_list_of_supported_image_formats">5.3.2. Querying List of Supported Image Formats</a></li> |
| <li><a href="#_reading_writing_and_copying_image_objects">5.3.3. Reading, Writing and Copying Image Objects</a></li> |
| <li><a href="#_filling_image_objects">5.3.4. Filling Image Objects</a></li> |
| <li><a href="#_copying_between_image_and_buffer_objects">5.3.5. Copying between Image and Buffer Objects</a></li> |
| <li><a href="#_mapping_image_objects">5.3.6. Mapping Image Objects</a></li> |
| <li><a href="#image-object-queries">5.3.7. Image Object Queries</a></li> |
| </ul> |
| </li> |
| <li><a href="#_pipes">5.4. Pipes</a> |
| <ul class="sectlevel3"> |
| <li><a href="#_creating_pipe_objects">5.4.1. Creating Pipe Objects</a></li> |
| <li><a href="#_pipe_object_queries">5.4.2. Pipe Object Queries</a></li> |
| </ul> |
| </li> |
| <li><a href="#_querying_unmapping_migrating_retaining_and_releasing_memory_objects">5.5. Querying, Unmapping, Migrating, Retaining and Releasing Memory Objects</a> |
| <ul class="sectlevel3"> |
| <li><a href="#_retaining_and_releasing_memory_objects">5.5.1. Retaining and Releasing Memory Objects</a></li> |
| <li><a href="#unmapping-mapped-memory">5.5.2. Unmapping Mapped Memory Objects</a></li> |
| <li><a href="#accessing-mapped-regions">5.5.3. Accessing mapped regions of a memory object</a></li> |
| <li><a href="#_migrating_memory_objects">5.5.4. Migrating Memory Objects</a></li> |
| <li><a href="#memory-object-queries">5.5.5. Memory Object Queries</a></li> |
| </ul> |
| </li> |
| <li><a href="#_shared_virtual_memory">5.6. Shared Virtual Memory</a> |
| <ul class="sectlevel3"> |
| <li><a href="#_svm_sharing_granularity_coarse_and_fine_grained_sharing">5.6.1. SVM sharing granularity: coarse- and fine- grained sharing</a></li> |
| <li><a href="#_memory_consistency_for_svm_allocations">5.6.2. Memory consistency for SVM allocations</a></li> |
| </ul> |
| </li> |
| <li><a href="#_sampler_objects">5.7. Sampler Objects</a> |
| <ul class="sectlevel3"> |
| <li><a href="#_creating_sampler_objects">5.7.1. Creating Sampler Objects</a></li> |
| <li><a href="#_sampler_object_queries">5.7.2. Sampler Object Queries</a></li> |
| </ul> |
| </li> |
| <li><a href="#_program_objects">5.8. Program Objects</a> |
| <ul class="sectlevel3"> |
| <li><a href="#_creating_program_objects">5.8.1. Creating Program Objects</a></li> |
| <li><a href="#_retaining_and_releasing_program_objects">5.8.2. Retaining and Releasing Program Objects</a></li> |
| <li><a href="#_setting_spir_v_specialization_constants">5.8.3. Setting SPIR-V specialization constants</a></li> |
| <li><a href="#_building_program_executables">5.8.4. Building Program Executables</a></li> |
| <li><a href="#_separate_compilation_and_linking_of_programs">5.8.5. Separate Compilation and Linking of Programs</a></li> |
| <li><a href="#compiler-options">5.8.6. Compiler Options</a></li> |
| <li><a href="#linker-options">5.8.7. Linker Options</a></li> |
| <li><a href="#_unloading_the_opencl_compiler">5.8.8. Unloading the OpenCL Compiler</a></li> |
| <li><a href="#_program_object_queries">5.8.9. Program Object Queries</a></li> |
| </ul> |
| </li> |
| <li><a href="#_kernel_objects">5.9. Kernel Objects</a> |
| <ul class="sectlevel3"> |
| <li><a href="#_creating_kernel_objects">5.9.1. Creating Kernel Objects</a></li> |
| <li><a href="#_setting_kernel_arguments">5.9.2. Setting Kernel Arguments</a></li> |
| <li><a href="#_copying_kernel_objects">5.9.3. Copying Kernel Objects</a></li> |
| <li><a href="#_kernel_object_queries">5.9.4. Kernel Object Queries</a></li> |
| </ul> |
| </li> |
| <li><a href="#_executing_kernels">5.10. Executing Kernels</a></li> |
| <li><a href="#event-objects">5.11. Event Objects</a></li> |
| <li><a href="#markers-barriers-waiting-for-events">5.12. Markers, Barriers and Waiting for Events</a></li> |
| <li><a href="#_out_of_order_execution_of_kernels_and_memory_object_commands">5.13. Out-of-order Execution of Kernels and Memory Object Commands</a></li> |
| <li><a href="#profiling-operations">5.14. Profiling Operations on Memory Objects and Kernels</a></li> |
| <li><a href="#_flush_and_finish">5.15. Flush and Finish</a></li> |
| </ul> |
| </li> |
| <li><a href="#_associated_opencl_specification">6. Associated OpenCL specification</a> |
| <ul class="sectlevel2"> |
| <li><a href="#spirv-il">6.1. SPIR-V Intermediate Language</a></li> |
| <li><a href="#opencl-extensions">6.2. Extensions to OpenCL</a></li> |
| <li><a href="#opencl-c-kernel-language">6.3. The OpenCL C Kernel Language</a></li> |
| </ul> |
| </li> |
| <li><a href="#opencl-embedded-profile">7. OpenCL Embedded Profile</a></li> |
| <li><a href="#_host_environment_and_thread_safety">Appendix A: Host environment and thread safety</a> |
| <ul class="sectlevel2"> |
| <li><a href="#shared-opencl-objects">Shared OpenCL Objects</a></li> |
| <li><a href="#_multiple_host_threads">Multiple Host Threads</a></li> |
| <li><a href="#_global_constructors_and_destructors">Global constructors and destructors</a></li> |
| </ul> |
| </li> |
| <li><a href="#_portability">Appendix B: Portability</a></li> |
| <li><a href="#data-types">Appendix C: Application Data Types</a> |
| <ul class="sectlevel2"> |
| <li><a href="#scalar-data-types">Supported Application Scalar Data Types</a></li> |
| <li><a href="#vector-data-types">Supported Application Vector Data Types</a></li> |
| <li><a href="#alignment-app-data-types">Alignment of Application Data Types</a></li> |
| <li><a href="#_vector_literals">Vector Literals</a></li> |
| <li><a href="#vector-components">Vector Components</a> |
| <ul class="sectlevel3"> |
| <li><a href="#_named_vector_components_notation">Named vector components notation</a></li> |
| <li><a href="#_highlow_vector_component_notation">High/Low vector component notation</a></li> |
| <li><a href="#_native_vector_type_notation">Native vector type notation</a></li> |
| </ul> |
| </li> |
| <li><a href="#_implicit_conversions">Implicit Conversions</a></li> |
| <li><a href="#_explicit_casts">Explicit Casts</a></li> |
| <li><a href="#_other_operators_and_functions">Other operators and functions</a></li> |
| <li><a href="#_application_constant_definitions">Application constant definitions</a></li> |
| </ul> |
| </li> |
| <li><a href="#check-copy-overlap">Appendix D: Checking for Memory Copy Overlap</a></li> |
| <li><a href="#changes_to_opencl">Appendix E: Changes to OpenCL</a> |
| <ul class="sectlevel2"> |
| <li><a href="#_summary_of_changes_from_opencl_1_0_to_opencl_1_1">Summary of changes from OpenCL 1.0 to OpenCL 1.1</a></li> |
| <li><a href="#_summary_of_changes_from_opencl_1_1_to_opencl_1_2">Summary of changes from OpenCL 1.1 to OpenCL 1.2</a></li> |
| <li><a href="#_summary_of_changes_from_opencl_1_2_to_opencl_2_0">Summary of changes from OpenCL 1.2 to OpenCL 2.0</a></li> |
| <li><a href="#_summary_of_changes_from_opencl_2_0_to_opencl_2_1">Summary of changes from OpenCL 2.0 to OpenCL 2.1</a></li> |
| <li><a href="#_summary_of_changes_from_opencl_2_1_to_opencl_2_2">Summary of changes from OpenCL 2.1 to OpenCL 2.2</a></li> |
| <li><a href="#_summary_of_changes_from_opencl_2_2_to_opencl_3_0">Summary of changes from OpenCL 2.2 to OpenCL 3.0</a></li> |
| </ul> |
| </li> |
| <li><a href="#error_codes">Appendix F: Error Codes</a></li> |
| <li><a href="#error_other_misc_enums">Appendix G: Other Miscellaneous Enums</a></li> |
| <li><a href="#opencl-3.0-backwards-compatibility">Appendix H: OpenCL 3.0 Backwards Compatibility</a> |
| <ul class="sectlevel2"> |
| <li><a href="#_shared_virtual_memory_2">Shared Virtual Memory</a></li> |
| <li><a href="#_memory_consistency_model">Memory Consistency Model</a></li> |
| <li><a href="#_device_side_enqueue">Device-Side Enqueue</a></li> |
| <li><a href="#_pipes_2">Pipes</a></li> |
| <li><a href="#_program_scope_global_variables">Program Scope Global Variables</a></li> |
| <li><a href="#_non_uniform_work_groups">Non-Uniform Work Groups</a></li> |
| <li><a href="#_read_write_images">Read-Write Images</a></li> |
| <li><a href="#_creating_2d_images_from_buffers">Creating 2D Images from Buffers</a></li> |
| <li><a href="#_srgb_images">sRGB Images</a></li> |
| <li><a href="#_depth_images">Depth Images</a></li> |
| <li><a href="#_device_and_host_timer_synchronization">Device and Host Timer Synchronization</a></li> |
| <li><a href="#_intermediate_language_programs">Intermediate Language Programs</a></li> |
| <li><a href="#_subgroups">Subgroups</a></li> |
| <li><a href="#_program_initialization_and_clean_up_kernels">Program Initialization and Clean-Up Kernels</a></li> |
| <li><a href="#_3d_image_writes">3D Image Writes</a></li> |
| <li><a href="#_work_group_collective_functions">Work Group Collective Functions</a></li> |
| <li><a href="#_generic_address_space">Generic Address Space</a></li> |
| <li><a href="#_language_features_that_were_already_optional">Language Features that Were Already Optional</a></li> |
| </ul> |
| </li> |
| <li><a href="#_acknowledgements">Acknowledgements</a></li> |
| </ul> |
| </div> |
| </div> |
| <div id="content"> |
| <div id="preamble"> |
| <div class="sectionbody"> |
| <div style="page-break-after: always;"></div> |
| <div class="paragraph"> |
| <p>Copyright 2008-2020 The Khronos Group.</p> |
| </div> |
| <div class="paragraph"> |
| <p>This specification is protected by copyright laws and contains material proprietary |
| to the Khronos Group, Inc. Except as described by these terms, it or any components |
| may not be reproduced, republished, distributed, transmitted, displayed, broadcast |
| or otherwise exploited in any manner without the express prior written permission |
| of Khronos Group.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Khronos Group grants a conditional copyright license to use and reproduce the |
| unmodified specification for any purpose, without fee or royalty, EXCEPT no licenses |
| to any patent, trademark or other intellectual property rights are granted under |
| these terms. Parties desiring to implement the specification and make use of |
| Khronos trademarks in relation to that implementation, and receive reciprocal patent |
| license protection under the Khronos IP Policy must become Adopters and confirm the |
| implementation as conformant under the process defined by Khronos for this |
| specification; see <a href="https://www.khronos.org/adopters" class="bare">https://www.khronos.org/adopters</a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Khronos Group makes no, and expressly disclaims any, representations or warranties, |
| express or implied, regarding this specification, including, without limitation: |
| merchantability, fitness for a particular purpose, non-infringement of any |
| intellectual property, correctness, accuracy, completeness, timeliness, and |
| reliability. Under no circumstances will the Khronos Group, or any of its Promoters, |
| Contributors or Members, or their respective partners, officers, directors, |
| employees, agents or representatives be liable for any damages, whether direct, |
| indirect, special or consequential damages for lost revenues, lost profits, or |
| otherwise, arising from or in connection with these materials.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Vulkan and Khronos are registered trademarks, and OpenXR, SPIR, SPIR-V, SYCL, WebGL, |
| WebCL, OpenVX, OpenVG, EGL, COLLADA, glTF, NNEF, OpenKODE, OpenKCAM, StreamInput, |
| OpenWF, OpenSL ES, OpenMAX, OpenMAX AL, OpenMAX IL, OpenMAX DL, OpenML and DevU are |
| trademarks of the Khronos Group Inc. ASTC is a trademark of ARM Holdings PLC, |
| OpenCL is a trademark of Apple Inc. and OpenGL and OpenML are registered trademarks |
| and the OpenGL ES and OpenGL SC logos are trademarks of Silicon Graphics |
| International used under license by Khronos. All other product names, trademarks, |
| and/or company names are used solely for identification and belong to their |
| respective owners.</p> |
| </div> |
| <div style="page-break-after: always;"></div> |
| </div> |
| </div> |
| <div class="sect1"> |
| <h2 id="_introduction"><a class="anchor" href="#_introduction"></a>1. Introduction</h2> |
| <div class="sectionbody"> |
| <div class="paragraph"> |
| <p>Modern processor architectures have embraced parallelism as an important |
| pathway to increased performance. |
| Facing technical challenges with higher clock speeds in a fixed power |
| envelope, Central Processing Units (CPUs) now improve performance by adding |
| multiple cores. |
| Graphics Processing Units (GPUs) have also evolved from fixed-function |
| rendering devices into programmable parallel processors. |
| As today's computer systems often include highly parallel CPUs, GPUs and |
| other types of processors, it is important to enable software developers to |
| take full advantage of these heterogeneous processing platforms.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Creating applications for heterogeneous parallel processing platforms is |
| challenging because traditional programming approaches for multi-core CPUs |
| and GPUs are very different. |
| CPU-based parallel programming models are typically based on standards but |
| usually assume a shared address space and do not encompass vector |
| operations. |
| General-purpose GPU programming models address complex memory hierarchies |
| and vector operations but are traditionally platform-, vendor- or |
| hardware-specific. |
| These limitations make it difficult for a developer to access the compute |
| power of heterogeneous CPUs, GPUs and other types of processors from a |
| single, multi-platform source code base. |
| More than ever, software developers need to be able to take full advantage |
| of heterogeneous processing platforms, from high-performance compute servers |
| through desktop computer systems to handheld devices, that include a diverse |
| mix of parallel CPUs, GPUs and other processors such as DSPs and the |
| Cell/B.E. processor.</p> |
| </div> |
| <div class="paragraph"> |
| <p><strong>OpenCL</strong> (Open Computing Language) is an open, royalty-free standard for |
| general-purpose parallel programming across CPUs, GPUs and other processors, |
| giving software developers portable and efficient access to the power of |
| these heterogeneous processing platforms.</p> |
| </div> |
| <div class="paragraph"> |
| <p>OpenCL supports a wide range of applications, from embedded and |
| consumer software to HPC solutions, through a low-level, high-performance, |
| portable abstraction. |
| By creating an efficient, close-to-the-metal programming interface, OpenCL |
| will form the foundation layer of a parallel computing ecosystem of |
| platform-independent tools, middleware and applications. |
| OpenCL is particularly suited to play an increasingly significant role in |
| emerging interactive graphics applications that combine general parallel |
| compute algorithms with graphics rendering pipelines.</p> |
| </div> |
| <div class="paragraph"> |
| <p>OpenCL consists of an API for coordinating parallel computation across |
| heterogeneous processors, a cross-platform programming language, and a |
| cross-platform intermediate language with a well-specified computation |
| environment. |
| The OpenCL standard:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>Supports both data- and task-based parallel programming models</p> |
| </li> |
| <li> |
| <p>Supports kernels written using a subset of ISO C99 with extensions |
| for parallel execution</p> |
| </li> |
| <li> |
| <p>Supports kernels represented by a portable and self-contained |
| intermediate language (e.g. SPIR-V) with support for parallel execution</p> |
| </li> |
| <li> |
| <p>Defines consistent numerical requirements based on IEEE 754</p> |
| </li> |
| <li> |
| <p>Defines a configuration profile for handheld and embedded devices</p> |
| </li> |
| <li> |
| <p>Supports efficient interop with OpenGL, OpenGL ES and other APIs</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>This document begins with an overview of basic concepts and the architecture |
| of OpenCL, followed by a detailed description of its execution model, memory |
| model and synchronization support. |
| It then discusses the OpenCL platform and runtime API. |
| Some examples are given that describe sample compute use-cases and how they |
| would be written in OpenCL. |
| The specification is divided into a core specification that any OpenCL |
| compliant implementation must support; a handheld/embedded profile which |
| relaxes the OpenCL compliance requirements for handheld and embedded |
| devices; and a set of optional extensions that are likely to move into the |
| core specification in later revisions of the OpenCL specification.</p> |
| </div> |
| <div class="sect2"> |
| <h3 id="_normative_references"><a class="anchor" href="#_normative_references"></a>1.1. Normative References</h3> |
| <div class="paragraph"> |
| <p>Normative references are references to external documents or resources |
| with which implementers of OpenCL must comply, in whole or in the specified |
| portions, as described in this specification.</p> |
| </div> |
| <div id="iso-c11" class="paragraph"> |
| <p><em>ISO/IEC 9899:2011 - Information technology - Programming languages - C</em>, |
| <a href="https://www.iso.org/standard/57853.html" class="bare">https://www.iso.org/standard/57853.html</a> (final specification), |
| <a href="http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1570.pdf" class="bare">http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1570.pdf</a> (last public |
| draft).</p> |
| </div> |
| </div> |
| <div class="sect2"> |
| <h3 id="_version_numbers"><a class="anchor" href="#_version_numbers"></a>1.2. Version Numbers</h3> |
| <div class="paragraph"> |
| <p>The OpenCL version number follows a <em>major.minor-revision</em> scheme. When this |
| version number is used within the API it generally only includes the |
| <em>major.minor</em> components of the version number.</p> |
| </div> |
| <div class="paragraph"> |
| <p>A difference in the <em>major</em> or <em>minor</em> version number indicates that some |
| amount of new functionality has been added to the specification, and may also |
| include behavior changes and bug fixes. |
| Functionality may also be deprecated or removed when the <em>major</em> or <em>minor</em> |
| version changes.</p> |
| </div> |
| <div class="paragraph"> |
| <p>A difference in the <em>revision</em> number indicates small changes to the |
| specification, typically to fix a bug or to clarify language. |
| When the <em>revision</em> number changes there may be an impact on the behavior of |
| existing functionality, but this should not affect backwards compatibility. |
| Functionality should not be added or removed when the <em>revision</em> number |
| changes.</p> |
| </div> |
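| <div class="paragraph"> |
| <p>As a rough illustration of how the <em>major.minor</em> components might be read |
| on the host, the sketch below parses a version string of the form |
| <code>OpenCL major.minor vendor-text</code>, which is the layout used by version |
| queries such as <code>CL_PLATFORM_VERSION</code>. |
| The helper name <code>parse_opencl_version</code> is hypothetical and is not part of |
| the OpenCL API.</p> |
| </div> |

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not an OpenCL API call): extracts the major and minor
 * components from a string such as "OpenCL 2.1 ExampleVendor".
 * Returns 1 when both components were read, 0 otherwise. */
int parse_opencl_version(const char *text, int *major, int *minor)
{
    assert(text && major && minor);
    return sscanf(text, "OpenCL %d.%d", major, minor) == 2;
}
```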
| </div> |
| <div class="sect2"> |
| <h3 id="unified-spec"><a class="anchor" href="#unified-spec"></a>1.3. Unified Specification</h3> |
| <div class="paragraph"> |
| <p>This document specifies all versions of the OpenCL API.</p> |
| </div> |
| <div class="paragraph"> |
| <p>There are three ways that an OpenCL feature may be described in terms of what |
| versions of OpenCL support that feature.</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>Missing before <em>major.minor</em>: Features that were introduced in |
| version <em>major.minor</em>. Implementations of an earlier version of OpenCL |
| will not provide these features.</p> |
| </li> |
| <li> |
| <p>Deprecated by <em>major.minor</em>: Features that were deprecated |
| in version <em>major.minor</em>, see the definition of deprecation in the |
| glossary.</p> |
| </li> |
| <li> |
| <p>Universal: Features that have no mention of what version they are missing |
| before or deprecated by are available in all versions of OpenCL.</p> |
| </li> |
| </ul> |
| </div> |
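| <div class="paragraph"> |
| <p>The "missing before" rule above can be sketched as a simple version |
| comparison. |
| The type <code>spec_version</code> and both functions are hypothetical names chosen |
| for this illustration; they do not appear in the OpenCL API.</p> |
| </div> |

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type for illustration only. */
typedef struct { int major; int minor; } spec_version;

/* Orders two versions: negative, zero, or positive, in the style of strcmp. */
int version_compare(spec_version a, spec_version b)
{
    if (a.major != b.major) return a.major < b.major ? -1 : 1;
    if (a.minor != b.minor) return a.minor < b.minor ? -1 : 1;
    return 0;
}

/* A feature "missing before" a given version is provided only by
 * implementations of that version or a later one. */
bool feature_provided(spec_version implementation, spec_version missing_before)
{
    assert(missing_before.major >= 1); /* OpenCL versions start at 1.0 */
    return version_compare(implementation, missing_before) >= 0;
}
```

| <div class="paragraph"> |
| <p>Under this sketch a universal feature can be modeled as missing before 1.0, |
| so that every implementation provides it.</p> |
| </div> |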
| </div> |
| </div> |
| </div> |
| <div class="sect1"> |
| <h2 id="_glossary"><a class="anchor" href="#_glossary"></a>2. Glossary</h2> |
| <div class="sectionbody"> |
| <div class="dlist"> |
| <dl> |
| <dt class="hdlist1">Application </dt> |
| <dd> |
| <p>The combination of the program running on the host and OpenCL devices.</p> |
| </dd> |
| <dt class="hdlist1">Acquire semantics </dt> |
| <dd> |
| <p>One of the memory order semantics defined for synchronization |
| operations. |
| Acquire semantics apply to atomic operations that load from memory. |
| Given two units of execution, <strong>A</strong> and <strong>B</strong>, acting on a shared atomic |
| object <strong>M</strong>, if <strong>A</strong> uses an atomic load of <strong>M</strong> with acquire semantics to |
| synchronize-with an atomic store to <strong>M</strong> by <strong>B</strong> that used release |
| semantics, then <strong>A</strong>'s atomic load will occur before any subsequent |
| operations by <strong>A</strong>. |
| Note that the memory orders <em>release</em>, <em>sequentially consistent</em>, and |
| <em>acquire_release</em> all include <em>release semantics</em> and effectively pair |
| with a load using acquire semantics.</p> |
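| <div class="paragraph"> |
| <p>The pairing described above can be sketched with host-side C11 atomics, |
| on which the OpenCL memory consistency model is based. |
| This is a minimal illustration, not OpenCL kernel code: the names |
| <code>payload</code>, <code>ready</code>, <code>publish</code> and <code>try_consume</code> are hypothetical, |
| and in a real program the two functions would run in different units of |
| execution.</p> |
| </div> |

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* ordinary, non-atomic data */
static atomic_bool ready;  /* the shared atomic object M; zero-initialized to false */

/* Unit of execution B: write the payload, then publish it with a store
 * that uses release semantics. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Unit of execution A: a load with acquire semantics. If it observes the
 * released store, the read of payload that follows sees B's write. */
bool try_consume(int *out)
{
    assert(out);
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```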
| </dd> |
| <dt class="hdlist1">Acquire release semantics </dt> |
| <dd> |
| <p>A memory order semantics for synchronization operations (such as atomic |
| operations) that has the properties of both acquire and release memory |
| orders. |
| It is used with read-modify-write operations.</p> |
| </dd> |
| <dt class="hdlist1">Atomic operations </dt> |
| <dd> |
| <p>Operations that at any point, and from any perspective, have either |
| occurred completely, or not at all. |
| Memory orders associated with atomic operations may constrain the |
| visibility of loads and stores with respect to the atomic operations |
| (see <em>relaxed semantics</em>, <em>acquire semantics</em>, <em>release semantics</em> or |
| <em>acquire release semantics</em>).</p> |
| </dd> |
| <dt class="hdlist1">Blocking and Non-Blocking Enqueue API calls </dt> |
| <dd> |
| <p>A <em>non-blocking enqueue API call</em> places a <em>command</em> on a |
| <em>command-queue</em> and returns immediately to the host. |
| The <em>blocking-mode enqueue API calls</em> do not return to the host until |
| the command has completed.</p> |
| </dd> |
| <dt class="hdlist1">Barrier </dt> |
| <dd> |
| <p>There are three types of <em>barriers</em>: a command-queue barrier, a |
| work-group barrier and a sub-group barrier.</p> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>The OpenCL API provides a function to enqueue a <em>command-queue</em> |
| <em>barrier</em> command. |
| This <em>barrier</em> command ensures that all previously enqueued commands to |
| a command-queue have finished execution before any following <em>commands</em> |
| enqueued in the <em>command-queue</em> can begin execution.</p> |
| </li> |
| <li> |
| <p>The OpenCL kernel execution model provides built-in <em>work-group barrier</em> |
| functionality. |
| This <em>barrier</em> built-in function can be used by a <em>kernel</em> executing on |
| a <em>device</em> to perform synchronization between <em>work-items</em> in a |
| <em>work-group</em> executing the <em>kernel</em>. |
| All the <em>work-items</em> of a <em>work-group</em> must execute the <em>barrier</em> |
| construct before any are allowed to continue execution beyond the |
| <em>barrier</em>.</p> |
| </li> |
| <li> |
| <p>The OpenCL kernel execution model provides built-in <em>sub-group barrier</em> |
| functionality. |
| This <em>barrier</em> built-in function can be used by a <em>kernel</em> executing on |
| a <em>device</em> to perform synchronization between <em>work-items</em> in a |
| <em>sub-group</em> executing the <em>kernel</em>. |
| All the <em>work-items</em> of a <em>sub-group</em> must execute the <em>barrier</em> |
| construct before any are allowed to continue execution beyond the |
| <em>barrier</em>.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| </dd> |
| <dt class="hdlist1">Buffer Object </dt> |
| <dd> |
| <p>A memory object that stores a linear collection of bytes. |
| Buffer objects are accessible using a pointer in a <em>kernel</em> executing on |
| a <em>device</em>. |
| Buffer objects can be manipulated by the host using OpenCL API calls. |
| A <em>buffer object</em> encapsulates the following information:</p> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>Size in bytes.</p> |
| </li> |
| <li> |
| <p>Properties that describe usage information and which region to allocate |
| from.</p> |
| </li> |
| <li> |
| <p>Buffer data.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| </dd> |
| <dt class="hdlist1">Built-in Kernel </dt> |
| <dd> |
| <p>A <em>built-in kernel</em> is a <em>kernel</em> that is executed on an OpenCL <em>device</em> |
| or <em>custom device</em> by fixed-function hardware or in firmware. |
| <em>Applications</em> can query the <em>built-in kernels</em> supported by a <em>device</em> |
| or <em>custom device</em>. |
| A <em>program object</em> can only contain <em>kernels</em> written in OpenCL C or |
| <em>built-in kernels</em> but not both. |
| See also <em>Kernel</em> and <em>Program</em>.</p> |
| </dd> |
| <dt class="hdlist1">Child kernel </dt> |
| <dd> |
| <p>See <em>Device-side enqueue</em>.</p> |
| </dd> |
| <dt class="hdlist1">Command </dt> |
| <dd> |
| <p>The OpenCL operations that are submitted to a <em>command-queue</em> for |
| execution. |
| For example, OpenCL commands issue kernels for execution on a compute |
| device, manipulate memory objects, etc.</p> |
| </dd> |
| <dt class="hdlist1">Command-queue </dt> |
| <dd> |
| <p>An object that holds <em>commands</em> that will be executed on a specific |
| <em>device</em>. |
| The <em>command-queue</em> is created on a specific <em>device</em> in a <em>context</em>. |
| <em>Commands</em> to a <em>command-queue</em> are queued in-order but may be executed |
| in-order or out-of-order. |
| Refer to <em>In-order Execution</em> and <em>Out-of-order Execution</em>.</p> |
| </dd> |
| <dt class="hdlist1">Command-queue Barrier </dt> |
| <dd> |
| <p>See <em>Barrier</em>.</p> |
| </dd> |
| <dt class="hdlist1">Command synchronization </dt> |
| <dd> |
| <p>Constraints on the order that commands are launched for execution on a |
| device defined in terms of the synchronization points that occur between |
| commands in host command-queues and between commands in device-side |
| command-queues. |
| See <em>synchronization points</em>.</p> |
| </dd> |
| <dt class="hdlist1">Complete </dt> |
| <dd> |
| <p>The final state in the six state model for the execution of a command. |
| The transition into this state is signaled through event objects |
| or callback functions associated with a command.</p> |
| </dd> |
| <dt class="hdlist1">Compute Device Memory </dt> |
| <dd> |
| <p>This refers to one or more memories attached to the compute device.</p> |
| </dd> |
| <dt class="hdlist1">Compute Unit </dt> |
| <dd> |
| <p>An OpenCL <em>device</em> has one or more <em>compute units</em>. |
| A <em>work-group</em> executes on a single <em>compute unit</em>. |
| A <em>compute unit</em> is composed of one or more <em>processing elements</em> and |
| <em>local memory</em>. |
| A <em>compute unit</em> may also include dedicated texture filter units that |
| can be accessed by its processing elements.</p> |
| </dd> |
| <dt class="hdlist1">Concurrency </dt> |
| <dd> |
| <p>A property of a system in which a set of tasks can remain |
| active and make progress at the same time. |
| To utilize concurrent execution when running a program, a programmer |
| must identify the concurrency in their problem, expose it within the |
| source code, and then exploit it using a notation that supports |
| concurrency.</p> |
| </dd> |
| <dt class="hdlist1">Constant Memory </dt> |
| <dd> |
| <p>A region of <em>global memory</em> that remains constant during the execution |
| of a <em>kernel</em>. |
| The <em>host</em> allocates and initializes memory objects placed into |
| <em>constant memory</em>.</p> |
| </dd> |
| <dt class="hdlist1">Context </dt> |
| <dd> |
| <p>The environment within which the kernels execute and the domain in which |
| synchronization and memory management are defined. |
| The <em>context</em> includes a set of <em>devices</em>, the memory accessible to |
| those <em>devices</em>, the corresponding memory properties and one or more |
| <em>command-queues</em> used to schedule execution of a <em>kernel(s)</em> or |
| operations on <em>memory objects</em>.</p> |
| </dd> |
| <dt class="hdlist1">Control flow </dt> |
| <dd> |
| <p>The flow of instructions executed by a work-item. |
| Multiple logically related work-items may or may not execute the same |
| control flow. |
| The control flow is said to be <em>converged</em> if all the work-items in the |
| set execute the same stream of instructions. |
| In a <em>diverged</em> control flow, the work-items in the set execute |
| different instructions. |
| At a later point, if a diverged control flow becomes converged, it is |
| said to be a re-converged control flow.</p> |
| </dd> |
| <dt class="hdlist1">Converged control flow </dt> |
| <dd> |
| <p>See <em>Control flow</em>.</p> |
| </dd> |
| <dt class="hdlist1">Custom Device </dt> |
| <dd> |
| <p>An OpenCL <em>device</em> that fully implements the OpenCL Runtime but does not |
| support <em>programs</em> written in OpenCL C. |
| A custom device may be specialized non-programmable hardware that is |
| very power efficient and performant for directed tasks or hardware with |
| limited programmable capabilities such as specialized DSPs. |
| Custom devices are not OpenCL conformant. |
| Custom devices may support an online compiler. |
| Programs for custom devices can be created using the OpenCL runtime APIs |
| that allow OpenCL programs to be created from source (if an online |
| compiler is supported) and/or binary, or from <em>built-in kernels</em> |
| supported by the <em>device</em>. |
| See also <em>Device</em>.</p> |
| </dd> |
| <dt class="hdlist1">Data Parallel Programming Model </dt> |
| <dd> |
| <p>Traditionally, this term refers to a programming model where concurrency |
| is expressed as instructions from a single program applied to multiple |
| elements within a set of data structures. |
| The term has been generalized in OpenCL to refer to a model wherein a |
| set of instructions from a single program are applied concurrently to |
| each point within an abstract domain of indices.</p> |
| </dd> |
| <dt class="hdlist1">Data race </dt> |
| <dd> |
| <p>The execution of a program contains a data race if it contains two |
| actions in different work-items or host threads where (1) one action |
| modifies a memory location and the other action reads or modifies the |
| same memory location, and (2) at least one of these actions is not |
| atomic, or the corresponding memory scopes are not inclusive, and (3) |
| the actions are global actions unordered by the global-happens-before |
| relation or are local actions unordered by the local-happens-before |
| relation.</p> |
| </dd> |
| <dt class="hdlist1">Deprecation </dt> |
| <dd> |
| <p>Existing features are marked as deprecated if their usage is not |
| recommended as that feature is being de-emphasized, superseded and may |
| be removed from a future version of the specification.</p> |
| </dd> |
| <dt class="hdlist1">Device </dt> |
| <dd> |
| <p>A <em>device</em> is a collection of <em>compute units</em>. |
| A <em>command-queue</em> is used to queue <em>commands</em> to a <em>device</em>. |
| Examples of <em>commands</em> include executing <em>kernels</em>, or reading and |
| writing <em>memory objects</em>. |
| OpenCL devices typically correspond to a GPU, a multi-core CPU, and |
| other processors such as DSPs and the Cell/B.E. |
| processor.</p> |
| </dd> |
| <dt class="hdlist1">Device-side enqueue </dt> |
| <dd> |
| <p>A mechanism whereby a kernel-instance is enqueued by a kernel-instance |
| running on a device without direct involvement by the host program. |
| This produces <em>nested parallelism</em>; i.e. additional levels of |
| concurrency are nested inside a running kernel-instance. |
| The kernel-instance executing on a device (the <em>parent kernel</em>) enqueues |
| a kernel-instance (the <em>child kernel</em>) to a device-side command queue. |
| Child and parent kernels execute asynchronously though a parent kernel |
| does not complete until all of its child-kernels have completed.</p> |
| </dd> |
| <dt class="hdlist1">Diverged control flow </dt> |
| <dd> |
| <p>See <em>Control flow</em>.</p> |
| </dd> |
| <dt class="hdlist1">Ended </dt> |
| <dd> |
| <p>The fifth state in the six state model for the execution of a command. |
| The transition into this state occurs when execution of a command has |
| ended. |
| When a Kernel-enqueue command ends, all of the work-groups associated |
| with that command have finished their execution.</p> |
| </dd> |
| <dt class="hdlist1">Event Object </dt> |
| <dd> |
| <p>An <em>event object</em> encapsulates the status of an operation such as a |
| <em>command</em>. |
| It can be used to synchronize operations in a context.</p> |
| </dd> |
| <dt class="hdlist1">Event Wait List </dt> |
| <dd> |
| <p>An <em>event wait list</em> is a list of <em>event objects</em> that can be used to |
| control when a particular <em>command</em> begins execution.</p> |
| </dd> |
| <dt class="hdlist1">Fence </dt> |
| <dd> |
| <p>A memory ordering operation without an associated atomic object. |
| A fence can use <em>acquire semantics</em>, <em>release semantics</em>, or <em>acquire |
| release semantics</em>.</p> |
| </dd> |
| <dt class="hdlist1">Framework </dt> |
| <dd> |
| <p>A software system that contains the set of components to support |
| software development and execution. |
| A <em>framework</em> typically includes libraries, APIs, runtime systems, |
| compilers, etc.</p> |
| </dd> |
| <dt class="hdlist1">Generic address space </dt> |
| <dd> |
| <p>An address space that includes the <em>private</em>, <em>local</em>, and <em>global</em> |
| address spaces available to a device. |
| The generic address space supports conversion of pointers to and from |
| private, local and global address spaces, and hence lets a programmer |
| write a single function that at compile time can take arguments from any |
| of the three named address spaces.</p> |
| </dd> |
| <dt class="hdlist1">Global Happens before </dt> |
| <dd> |
| <p>See <em>Happens before</em>.</p> |
| </dd> |
| <dt class="hdlist1">Global ID </dt> |
| <dd> |
| <p>A <em>global ID</em> is used to uniquely identify a <em>work-item</em> and is derived |
| from the number of <em>global work-items</em> specified when executing a |
| <em>kernel</em>. |
| The <em>global ID</em> is an N-dimensional value that starts at (0, 0, … 0). |
| See also <em>Local ID</em>.</p> |
| </dd> |
| <dt class="hdlist1">Global Memory </dt> |
| <dd> |
| <p>A memory region accessible to all <em>work-items</em> executing in a <em>context</em>. |
| It is accessible to the <em>host</em> using <em>commands</em> such as read, write and |
| map. |
| <em>Global memory</em> is included within the <em>generic address space</em> that |
| includes the private and local address spaces.</p> |
| </dd> |
| <dt class="hdlist1">GL share group </dt> |
| <dd> |
| <p>A <em>GL share group</em> object manages shared OpenGL or OpenGL ES resources |
| such as textures, buffers, framebuffers, and renderbuffers and is |
| associated with one or more GL context objects. |
| The <em>GL share group</em> is typically an opaque object and not directly |
| accessible.</p> |
| </dd> |
| <dt class="hdlist1">Handle </dt> |
| <dd> |
| <p>An opaque type that references an <em>object</em> allocated by OpenCL. |
| Any operation on an <em>object</em> occurs by reference to that object's handle.</p> |
| </dd> |
| <dt class="hdlist1">Happens before </dt> |
| <dd> |
| <p>An ordering relationship between operations that execute on multiple |
| units of execution. |
| If an operation A happens-before operation B then A must occur before B; |
| in particular, any value written by A will be visible to B. |
| We define two separate happens before relations: <em>global-happens-before</em> |
| and <em>local-happens-before</em>. |
| These are defined in <a href="#memory-ordering-rules">Memory Ordering Rules</a>.</p> |
| </dd> |
| <dt class="hdlist1">Host </dt> |
| <dd> |
| <p>The <em>host</em> interacts with the <em>context</em> using the OpenCL API.</p> |
| </dd> |
| <dt class="hdlist1">Host-thread </dt> |
| <dd> |
| <p>The unit of execution that executes the statements in the host program.</p> |
| </dd> |
| <dt class="hdlist1">Host pointer </dt> |
| <dd> |
| <p>A pointer to memory that is in the virtual address space on the <em>host</em>.</p> |
| </dd> |
| <dt class="hdlist1">Illegal </dt> |
| <dd> |
| <p>Behavior of a system that is explicitly not allowed and will be reported |
| as an error when encountered by OpenCL.</p> |
| </dd> |
| <dt class="hdlist1">Image Object </dt> |
| <dd> |
| <p>A <em>memory object</em> that stores a two- or three-dimensional structured |
| array. |
| Image data can only be accessed with read and write functions. |
| The read functions use a <em>sampler</em>.</p> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>The <em>image object</em> encapsulates the following information:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>Dimensions of the image.</p> |
| </li> |
| <li> |
| <p>Description of each element in the image.</p> |
| </li> |
| <li> |
| <p>Properties that describe usage information and which region to allocate |
| from.</p> |
| </li> |
| <li> |
| <p>Image data.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>The elements of an image are selected from a list of predefined image |
| formats.</p> |
| </div> |
| </div> |
| </div> |
| </dd> |
| <dt class="hdlist1">Implementation Defined </dt> |
| <dd> |
| <p>Behavior that is explicitly allowed to vary between conforming |
| implementations of OpenCL. |
| An OpenCL implementor is required to document the implementation-defined |
| behavior.</p> |
| </dd> |
| <dt class="hdlist1">Independent Forward Progress </dt> |
| <dd> |
| <p>If an entity supports independent forward progress, then if it is |
| otherwise not dependent on any actions due to be performed by any other |
| entity (for example it does not wait on a lock held by, and thus that |
| must be released by, any other entity), then its execution cannot be |
| blocked by the execution of any other entity in the system (it will not |
| be starved). |
| Work-items in a sub-group, for example, typically do not support |
| independent forward progress, so one work-item in a sub-group may be |
| completely blocked (starved) if a different work-item in the same |
| sub-group enters a spin loop.</p> |
| </dd> |
| <dt class="hdlist1">In-order Execution </dt> |
| <dd> |
| <p>A model of execution in OpenCL where the <em>commands</em> in a <em>command-queue</em> |
| are executed in order of submission with each <em>command</em> running to |
| completion before the next one begins. |
| See Out-of-order Execution.</p> |
| </dd> |
| <dt class="hdlist1">Intermediate Language </dt> |
| <dd> |
| <p>A lower-level language that may be used to create programs. |
| SPIR-V is a required intermediate language (IL) for OpenCL 2.1 and 2.2 devices. |
| Other OpenCL devices may optionally support SPIR-V or other ILs.</p> |
| </dd> |
| <dt class="hdlist1">Kernel </dt> |
| <dd> |
| <p>A <em>kernel</em> is a function declared in a <em>program</em> and executed on an |
| OpenCL <em>device</em>. |
| A <em>kernel</em> is identified by the <code>__kernel</code> or <code>kernel</code> qualifier applied to |
| any function defined in a <em>program</em>.</p> |
| </dd> |
| <dt class="hdlist1">Kernel-instance </dt> |
| <dd> |
| <p>The work carried out by an OpenCL program occurs through the execution |
| of kernel-instances on devices. |
| The kernel instance is the <em>kernel object</em>, the values associated with |
| the arguments to the kernel, and the parameters that define the |
| <em>NDRange</em> index space.</p> |
| </dd> |
| <dt class="hdlist1">Kernel Object </dt> |
| <dd> |
| <p>A <em>kernel object</em> encapsulates a specific <em>kernel</em> function declared |
| in a <em>program</em> and the argument values to be used when executing this |
| <em>kernel</em> function.</p> |
| </dd> |
| <dt class="hdlist1">Kernel Language </dt> |
| <dd> |
| <p>A language that is used to represent source code for a kernel. |
| Kernels may be directly created from OpenCL C kernel language |
| source strings. |
| Other kernel languages may be supported by compiling to SPIR-V, |
| another supported Intermediate Language, or to a device-specific |
| program binary format.</p> |
| </dd> |
| <dt class="hdlist1">Launch </dt> |
| <dd> |
| <p>The transition of a command from the <em>submitted</em> state to the <em>ready</em> |
| state. |
| See <em>Ready</em>.</p> |
| </dd> |
| <dt class="hdlist1">Local ID </dt> |
| <dd> |
| <p>A <em>local ID</em> specifies a unique <em>work-item ID</em> within a given |
| <em>work-group</em> that is executing a <em>kernel</em>. |
| The <em>local ID</em> is an N-dimensional value that starts at (0, 0, … 0). |
| See also <em>Global ID</em>.</p> |
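| <div class="paragraph"> |
| <p>As a hedged illustration of how a <em>local ID</em> relates to a <em>global ID</em>, |
| the sketch below computes the global ID in one dimension from the work-group |
| ID, the work-group size and the local ID. |
| It assumes a uniform work-group size; the helper name <code>global_id_1d</code> and |
| its parameters are hypothetical.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: global ID of a work-item in one dimension, assuming
 * every work-group has the same size (local_size). */
size_t global_id_1d(size_t group_id, size_t local_size,
                    size_t local_id, size_t global_offset)
{
    assert(local_id < local_size); /* local IDs run from 0 to local_size - 1 */
    return global_offset + group_id * local_size + local_id;
}
```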
| </dd> |
| <dt class="hdlist1">Local Memory </dt> |
| <dd> |
| <p>A memory region associated with a <em>work-group</em> and accessible only by |
| <em>work-items</em> in that <em>work-group</em>. |
| <em>Local memory</em> is included within the <em>generic address space</em> that |
| includes the private and global address spaces.</p> |
| </dd> |
| <dt class="hdlist1">Marker </dt> |
| <dd> |
| <p>A <em>command</em> queued in a <em>command-queue</em> that can be used to tag all |
| <em>commands</em> queued before the <em>marker</em> in the <em>command-queue</em>. |
| The <em>marker</em> command returns an <em>event</em> which can be used by the |
| <em>application</em> to queue a wait on the marker event i.e. wait for all |
| commands queued before the <em>marker</em> command to complete.</p> |
| </dd> |
| <dt class="hdlist1">Memory Consistency Model </dt> |
| <dd> |
| <p>Rules that define which values are observed when multiple units of |
| execution load data from any shared memory plus the synchronization |
| operations that constrain the order of memory operations and define |
| synchronization relationships. |
| The memory consistency model in OpenCL is based on the memory model from |
| the ISO C11 programming language.</p> |
| </dd> |
| <dt class="hdlist1">Memory Objects </dt> |
| <dd> |
| <p>A <em>memory object</em> is a handle to a reference counted region of <em>Global |
| Memory</em>. |
| Also see <em>Buffer Object</em> and <em>Image Object</em>.</p> |
| </dd> |
| <dt class="hdlist1">Memory Regions (or Pools) </dt> |
| <dd> |
| <p>A distinct address space in OpenCL. |
| <em>Memory regions</em> may overlap in physical memory though OpenCL will treat |
| them as logically distinct. |
| The <em>memory regions</em> are denoted as <em>private</em>, <em>local</em>, <em>constant</em>, and |
| <em>global</em>.</p> |
| </dd> |
| <dt class="hdlist1">Memory Scopes </dt> |
| <dd> |
| <p>These memory scopes define a hierarchy of visibilities when analyzing |
| the ordering constraints of memory operations. |
| They are defined by the values of the <strong>memory_scope</strong> enumeration |
| constant. |
| Current values are <strong>memory_scope_work_item</strong> (memory constraints only |
| apply to a single work-item and in practice apply only to image |
| operations), <strong>memory_scope_sub_group</strong> (memory-ordering constraints only |
| apply to work-items executing in a sub-group), <strong>memory_scope_work_group</strong> |
| (memory-ordering constraints only apply to work-items executing in a |
| work-group), <strong>memory_scope_device</strong> (memory-ordering constraints only |
| apply to work-items executing on a single device) and |
| <strong>memory_scope_all_svm_devices</strong> (memory-ordering constraints only apply |
| to work-items executing across multiple devices and when using shared |
| virtual memory).</p> |
| </dd> |
| <dt class="hdlist1">Modification Order </dt> |
| <dd> |
| <p>All modifications to a particular atomic object M occur in some |
| particular <em>total order</em>, called the <em>modification order</em> of M. |
| If A and B are modifications of an atomic object M, and A happens-before |
| B, then A shall precede B in the modification order of M. |
| Note that the modification order of an atomic object M is independent of |
| whether M is in local or global memory.</p> |
| </dd> |
| <dt class="hdlist1">Nested Parallelism </dt> |
| <dd> |
| <p>See <em>device-side enqueue</em>.</p> |
| </dd> |
| <dt class="hdlist1">Object </dt> |
| <dd> |
| <p>Objects are abstract representations of the resources that can be |
| manipulated by the OpenCL API. |
| Examples include <em>program objects</em>, <em>kernel objects</em>, and <em>memory |
| objects</em>.</p> |
| </dd> |
| <dt class="hdlist1">Out-of-Order Execution </dt> |
| <dd> |
| <p>A model of execution in which <em>commands</em> placed in the <em>command-queue</em> may |
| begin and complete execution in any order consistent with constraints |
| imposed by <em>event wait lists</em> and <em>command-queue barriers</em>. |
| See <em>In-order Execution</em>.</p> |
| </dd> |
| <dt class="hdlist1">Parent device </dt> |
| <dd> |
| <p>The OpenCL <em>device</em> which is partitioned to create <em>sub-devices</em>. |
| Not all <em>parent devices</em> are <em>root devices</em>. |
| A <em>root device</em> might be partitioned and the <em>sub-devices</em> partitioned |
| again. |
| In this case, the first set of <em>sub-devices</em> would be <em>parent devices</em> |
| of the second set, but not the <em>root devices</em>. |
| Also see <em>Device</em>, <em>parent device</em> and <em>root device</em>.</p> |
| </dd> |
| <dt class="hdlist1">Parent kernel </dt> |
| <dd> |
| <p>See <em>Device-side enqueue</em>.</p> |
| </dd> |
| <dt class="hdlist1">Pipe </dt> |
| <dd> |
| <p>The <em>pipe</em> memory object conceptually is an ordered sequence of data |
| items. |
| A pipe has two endpoints: a write endpoint into which data items are |
| inserted, and a read endpoint from which data items are removed. |
| At any one time, only one kernel instance may write into a pipe, and |
| only one kernel instance may read from a pipe. |
| To support the producer consumer design pattern, one kernel instance |
| connects to the write endpoint (the producer) while another kernel |
| instance connects to the reading endpoint (the consumer).</p> |
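| <div class="paragraph"> |
| <p>The two-endpoint behavior described above can be modeled on the host as a |
| small first-in, first-out ring buffer. |
| This is only a conceptual sketch: <code>pipe_model</code>, <code>PIPE_CAPACITY</code> and the |
| two functions are hypothetical names, and the code does not use or reproduce |
| the OpenCL pipe API.</p> |
| </div> |

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PIPE_CAPACITY 4  /* hypothetical capacity, in data items */

/* Host-side model of a pipe: an ordered sequence with two endpoints. */
typedef struct {
    int    items[PIPE_CAPACITY];
    size_t head;   /* read endpoint: index of the oldest item */
    size_t count;  /* number of items currently stored */
} pipe_model;

/* Write endpoint (producer): appends one item; fails when the pipe is full. */
bool pipe_model_write(pipe_model *p, int item)
{
    assert(p);
    if (p->count == PIPE_CAPACITY)
        return false;
    p->items[(p->head + p->count) % PIPE_CAPACITY] = item;
    p->count++;
    return true;
}

/* Read endpoint (consumer): removes the oldest item; fails when empty. */
bool pipe_model_read(pipe_model *p, int *item)
{
    assert(p && item);
    if (p->count == 0)
        return false;
    *item = p->items[p->head];
    p->head = (p->head + 1) % PIPE_CAPACITY;
    p->count--;
    return true;
}
```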
| </dd> |
| <dt class="hdlist1">Platform </dt> |
| <dd> |
| <p>The <em>host</em> plus a collection of <em>devices</em> managed by the OpenCL |
| <em>framework</em> that allow an application to share <em>resources</em> and execute |
| <em>kernels</em> on <em>devices</em> in the <em>platform</em>.</p> |
| </dd> |
| <dt class="hdlist1">Private Memory </dt> |
| <dd> |
| <p>A region of memory private to a <em>work-item</em>. |
| Variables defined in one <em>work-item</em>'s <em>private memory</em> are not visible |
| to another <em>work-item</em>.</p> |
| </dd> |
| <dt class="hdlist1">Processing Element </dt> |
| <dd> |
| <p>A virtual scalar processor. |
| A work-item may execute on one or more processing elements.</p> |
| </dd> |
| <dt class="hdlist1">Program </dt> |
| <dd> |
| <p>An OpenCL <em>program</em> consists of a set of <em>kernels</em>. |
| <em>Programs</em> may also contain auxiliary functions called by the |
| <em>kernel</em> functions and constant data.</p> |
| </dd> |
| <dt class="hdlist1">Program Object </dt> |
| <dd> |
| <p>A <em>program object</em> encapsulates the following information:</p> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>A reference to an associated <em>context</em>.</p> |
| </li> |
| <li> |
| <p>A <em>program</em> source or binary.</p> |
| </li> |
| <li> |
| <p>The latest successfully built program executable, the list of <em>devices</em> |
| for which the program executable is built, the build options used and a |
| build log.</p> |
| </li> |
| <li> |
| <p>The number of <em>kernel objects</em> currently attached.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| </dd> |
| <dt class="hdlist1">Queued </dt> |
| <dd> |
| <p>The first state in the six state model for the execution of a command. |
| The transition into this state occurs when the command is enqueued into |
| a command-queue.</p> |
| </dd> |
| <dt class="hdlist1">Ready </dt> |
| <dd> |
| <p>The third state in the six state model for the execution of a command. |
| The transition into this state occurs when pre-requisites constraining |
| execution of a command have been met; i.e. the command has been |
| launched. |
| When a kernel-enqueue command is launched, work-groups associated with |
| the command are placed in a device's work-pool from which they are |
| scheduled for execution.</p> |
| </dd> |
| <dt class="hdlist1">Re-converged Control Flow </dt> |
| <dd> |
| <p>See <em>Control flow</em>.</p> |
| </dd> |
| <dt class="hdlist1">Reference Count </dt> |
| <dd> |
| <p>The life span of an OpenCL object is determined by its <em>reference |
| count</em>, an internal count of the number of references to the object. |
| When you create an object in OpenCL, its <em>reference count</em> is set to |
| one. |
| Subsequent calls to the appropriate <em>retain</em> API (such as |
| <a href="#clRetainContext"><strong>clRetainContext</strong></a>, <a href="#clRetainCommandQueue"><strong>clRetainCommandQueue</strong></a>) increment the <em>reference |
| count</em>. |
| Calls to the appropriate <em>release</em> API (such as <a href="#clReleaseContext"><strong>clReleaseContext</strong></a>, |
| <a href="#clReleaseCommandQueue"><strong>clReleaseCommandQueue</strong></a>) decrement the <em>reference count</em>. |
| Implementations may also modify the <em>reference count</em>, e.g. to track |
| attached objects or to ensure correct operation of in-progress or |
| scheduled activities. |
| The object becomes inaccessible to host code when the number of |
| <em>release</em> operations performed matches the number of <em>retain</em> operations |
| plus the allocation of the object. |
| At this point the reference count may be zero but this is not |
| guaranteed.</p> |
| </dd> |
| <dt class="hdlist1">Relaxed Consistency </dt> |
| <dd> |
| <p>A memory consistency model in which the contents of memory visible to |
| different <em>work-items</em> or <em>commands</em> may be different except at a |
| <em>barrier</em> or other explicit synchronization points.</p> |
| </dd> |
| <dt class="hdlist1">Relaxed Semantics </dt> |
| <dd> |
| <p>A memory order semantics for atomic operations that implies no order |
| constraints. |
| The operation is <em>atomic</em> but it has no impact on the order of memory |
| operations.</p> |
| </dd> |
| <dt class="hdlist1">Release Semantics </dt> |
| <dd> |
| <p>One of the memory order semantics defined for synchronization |
| operations. |
| Release semantics apply to atomic operations that store to memory. |
| Given two units of execution, <strong>A</strong> and <strong>B</strong>, acting on a shared atomic |
| object <strong>M</strong>, if <strong>A</strong> uses an atomic store of <strong>M</strong> with release semantics to |
| synchronize-with an atomic load to <strong>M</strong> by <strong>B</strong> that used acquire |
| semantics, then <strong>A</strong>'s atomic store will occur <em>after</em> any prior |
| operations by <strong>A</strong>. |
| Note that the memory orders <em>acquire</em>, <em>sequentially consistent</em>, and |
| <em>acquire_release</em> all include <em>acquire semantics</em> and effectively pair |
| with a store using release semantics.</p> |
| </dd> |
| <dt class="hdlist1">Remainder work-groups </dt> |
| <dd> |
| <p>When the work-groups associated with a kernel-instance are defined, the |
| sizes of a work-group in each dimension may not evenly divide the size |
| of the NDRange in the corresponding dimensions. |
| The result is a collection of work-groups on the boundaries of the |
| NDRange that are smaller than the base work-group size. |
| These are known as <em>remainder work-groups</em>.</p> |
| </dd> |
| <dt class="hdlist1">Running </dt> |
| <dd> |
| <p>The fourth state in the six state model for the execution of a command. |
| The transition into this state occurs when the execution of the command |
| starts. |
| When a kernel-enqueue command starts, one or more work-groups associated |
| with the command start to execute.</p> |
| </dd> |
| <dt class="hdlist1">Root device </dt> |
| <dd> |
| <p>A <em>root device</em> is an OpenCL <em>device</em> that has not been partitioned. |
| Also see <em>Device</em>, <em>Parent device</em> and <em>Root device</em>.</p> |
| </dd> |
| <dt class="hdlist1">Resource </dt> |
| <dd> |
| <p>A class of <em>objects</em> defined by OpenCL. |
| An instance of a <em>resource</em> is an <em>object</em>. |
| The most common <em>resources</em> are the <em>context</em>, <em>command-queue</em>, <em>program |
| objects</em>, <em>kernel objects</em>, and <em>memory objects</em>. |
| Computational resources are hardware elements that participate in the |
| action of advancing a program counter. |
| Examples include the <em>host</em>, <em>devices</em>, <em>compute units</em> and <em>processing |
| elements</em>.</p> |
| </dd> |
| <dt class="hdlist1">Retain, Release </dt> |
| <dd> |
| <p>The action of incrementing (retain) and decrementing (release) the |
| reference count using an OpenCL <em>object</em>. |
| This is a bookkeeping mechanism to make sure the system doesn’t |
| remove an <em>object</em> before all instances that use this <em>object</em> have |
| finished. |
| Refer to <em>Reference Count</em>.</p> |
| </dd> |
| <dt class="hdlist1">Sampler </dt> |
| <dd> |
| <p>An <em>object</em> that describes how to sample an image when the image is read |
| in the <em>kernel</em>. |
| The image read functions take a <em>sampler</em> as an argument. |
| The <em>sampler</em> specifies the image addressing-mode i.e. how out-of-range |
| image coordinates are handled, the filter mode, and whether the input |
| image coordinate is a normalized or unnormalized value.</p> |
| </dd> |
| <dt class="hdlist1">Scope inclusion </dt> |
| <dd> |
| <p>Two actions <strong>A</strong> and <strong>B</strong> are defined to have an inclusive scope if they |
| have the same scope <strong>P</strong> such that: (1) if <strong>P</strong> is |
| <strong>memory_scope_sub_group</strong>, and <strong>A</strong> and <strong>B</strong> are executed by work-items |
| within the same sub-group, or (2) if <strong>P</strong> is <strong>memory_scope_work_group</strong>, |
| and <strong>A</strong> and <strong>B</strong> are executed by work-items within the same work-group, |
| or (3) if <strong>P</strong> is <strong>memory_scope_device</strong>, and <strong>A</strong> and <strong>B</strong> are executed by |
| work-items on the same device, or (4) if <strong>P</strong> is |
| <strong>memory_scope_all_svm_devices</strong>, if <strong>A</strong> and <strong>B</strong> are executed by host |
| threads or by work-items on one or more devices that can share SVM |
| memory with each other and the host process.</p> |
| </dd> |
| <dt class="hdlist1">Sequenced before </dt> |
| <dd> |
| <p>A relation between evaluations executed by a single unit of execution. |
| Sequenced-before is an asymmetric, transitive, pair-wise relation that |
| induces a partial order between evaluations. |
| Given any two evaluations A and B, if A is sequenced-before B, then the |
| execution of A shall precede the execution of B.</p> |
| </dd> |
| <dt class="hdlist1">Sequential consistency </dt> |
| <dd> |
| <p>Sequential consistency interleaves the steps executed by each unit of |
| execution. |
| Each access to a memory location sees the last assignment to that |
| location in that interleaving.</p> |
| </dd> |
| <dt class="hdlist1">Sequentially consistent semantics </dt> |
| <dd> |
| <p>One of the memory order semantics defined for synchronization |
| operations. |
| When using sequentially-consistent synchronization operations, the loads |
| and stores within one unit of execution appear to execute in program |
| order (i.e., the sequenced-before order), and loads and stores from |
| different units of execution appear to be simply interleaved.</p> |
| </dd> |
| <dt class="hdlist1">Shared Virtual Memory (SVM) </dt> |
| <dd> |
| <p>An address space exposed to both the host and the devices within a |
| context. |
| SVM causes addresses to be meaningful between the host and all of the |
| devices within a context and therefore supports the use of pointer based |
| data structures in OpenCL kernels. |
| It logically extends a portion of the global memory into the host |
| address space therefore giving work-items access to the host address |
| space. |
| There are three types of SVM in OpenCL:</p> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="dlist"> |
| <dl> |
| <dt class="hdlist1"><em>Coarse-Grained buffer SVM</em> </dt> |
| <dd> |
| <p>Sharing occurs at the granularity of regions of OpenCL buffer memory |
| objects.</p> |
| </dd> |
| <dt class="hdlist1"><em>Fine-Grained buffer SVM</em> </dt> |
| <dd> |
| <p>Sharing occurs at the granularity of individual loads/stores into bytes |
| within OpenCL buffer memory objects.</p> |
| </dd> |
| <dt class="hdlist1"><em>Fine-Grained system SVM</em> </dt> |
| <dd> |
| <p>Sharing occurs at the granularity of individual loads/stores into bytes |
| occurring anywhere within the host memory.</p> |
| </dd> |
| </dl> |
| </div> |
| </div> |
| </div> |
| </dd> |
| <dt class="hdlist1">SIMD </dt> |
| <dd> |
| <p>Single Instruction Multiple Data. |
| A programming model where a <em>kernel</em> is executed concurrently on |
| multiple <em>processing elements</em> each with its own data and a shared |
| program counter. |
| All <em>processing elements</em> execute a strictly identical set of |
| instructions.</p> |
| </dd> |
| <dt class="hdlist1">Specialization constants </dt> |
| <dd> |
| <p>Specialization constants are special constant objects that do not |
| have known constant values in an intermediate language (e.g. SPIR-V). |
| Applications may provide updated values for the specialization constants |
| before a program is built. |
| Specialization constants that do not receive a value from an application |
| shall use the default specialization constant value.</p> |
| </dd> |
| <dt class="hdlist1">SPMD </dt> |
| <dd> |
| <p>Single Program Multiple Data. |
| A programming model where a <em>kernel</em> is executed concurrently on |
| multiple <em>processing elements</em> each with its own data and its own |
| program counter. |
| Hence, while all computational resources run the same <em>kernel</em> they |
| maintain their own instruction counter and due to branches in a |
| <em>kernel</em>, the actual sequence of instructions can be quite different |
| across the set of <em>processing elements</em>.</p> |
| </dd> |
| <dt class="hdlist1">Sub-device </dt> |
| <dd> |
| <p>An OpenCL <em>device</em> can be partitioned into multiple <em>sub-devices</em>. |
| The new <em>sub-devices</em> alias specific collections of compute units within |
| the parent <em>device</em>, according to a partition scheme. |
| The <em>sub-devices</em> may be used in any situation that their parent |
| <em>device</em> may be used. |
| Partitioning a <em>device</em> does not destroy the parent <em>device</em>, which may |
| continue to be used alongside and intermingled with its child |
| <em>sub-devices</em>. |
| Also see <em>Device</em>, <em>Parent device</em> and <em>Root device</em>.</p> |
| </dd> |
| <dt class="hdlist1">Sub-group </dt> |
| <dd> |
| <p>Sub-groups are an implementation-dependent grouping of work-items within |
| a work-group. |
| The size and number of sub-groups is implementation-defined.</p> |
| </dd> |
| <dt class="hdlist1">Sub-group Barrier </dt> |
| <dd> |
| <p>See <em>Barrier</em>.</p> |
| </dd> |
| <dt class="hdlist1">Submitted </dt> |
| <dd> |
| <p>The second state in the six state model for the execution of a command. |
| The transition into this state occurs when the command is flushed from |
| the command-queue and submitted for execution on the device. |
| Once submitted, a programmer can assume a command will execute once its |
| prerequisites have been met.</p> |
| </dd> |
| <dt class="hdlist1">SVM Buffer </dt> |
| <dd> |
| <p>A memory allocation enabled to work with <em>Shared Virtual Memory (SVM)</em>. |
| Depending on how the SVM buffer is created, it can be a coarse-grained |
| or fine-grained SVM buffer. |
| Optionally it may be wrapped by a <em>Buffer Object</em>. |
| See <em>Shared Virtual Memory (SVM)</em>.</p> |
| </dd> |
| <dt class="hdlist1">Synchronization </dt> |
| <dd> |
| <p>Synchronization refers to mechanisms that constrain the order of |
| execution and the visibility of memory operations between two or more |
| units of execution.</p> |
| </dd> |
| <dt class="hdlist1">Synchronization operations </dt> |
| <dd> |
| <p>Operations that define memory order constraints in a program. |
| They play a special role in controlling how memory operations in one |
| unit of execution (such as work-items or, when using SVM, a host thread) |
| are made visible to another. |
| Synchronization operations in OpenCL include <em>atomic operations</em> and |
| <em>fences</em>.</p> |
| </dd> |
| <dt class="hdlist1">Synchronization point </dt> |
| <dd> |
| <p>A synchronization point between a pair of commands (A and B) assures |
| that the results of command A happen-before command B is launched (i.e. |
| enters the ready state).</p> |
| </dd> |
| <dt class="hdlist1">Synchronizes with </dt> |
| <dd> |
| <p>A relation between operations in two different units of execution that |
| defines a memory order constraint in global memory |
| (<em>global-synchronizes-with</em>) or local memory |
| (<em>local-synchronizes-with</em>).</p> |
| </dd> |
| <dt class="hdlist1">Task Parallel Programming Model </dt> |
| <dd> |
| <p>A programming model in which computations are expressed in terms of |
| multiple concurrent tasks executing in one or more <em>command-queues</em>. |
| The concurrent tasks can be running different <em>kernels</em>.</p> |
| </dd> |
| <dt class="hdlist1">Thread-safe </dt> |
| <dd> |
| <p>An OpenCL API call is considered to be <em>thread-safe</em> if the internal |
| state as managed by OpenCL remains consistent when called simultaneously |
| by multiple <em>host</em> threads. |
| OpenCL API calls that are <em>thread-safe</em> allow an application to call |
| these functions in multiple <em>host</em> threads without having to implement |
| mutual exclusion across these <em>host</em> threads i.e. they are also |
| re-entrant-safe.</p> |
| </dd> |
| <dt class="hdlist1">Undefined </dt> |
| <dd> |
| <p>The behavior of an OpenCL API call, built-in function used inside a |
| <em>kernel</em> or execution of a <em>kernel</em> that is explicitly not defined by |
| OpenCL. |
| A conforming implementation is not required to specify what occurs when |
| an undefined construct is encountered in OpenCL.</p> |
| </dd> |
| <dt class="hdlist1">Unit of execution </dt> |
| <dd> |
| <p>A generic term for a process, OS managed thread running on the host (a |
| host-thread), kernel-instance, host program, work-item or any other |
| executable agent that advances the work associated with a program.</p> |
| </dd> |
| <dt class="hdlist1">Work-group </dt> |
| <dd> |
| <p>A collection of related <em>work-items</em> that execute on a single <em>compute |
| unit</em>. |
| The <em>work-items</em> in the group execute the same <em>kernel-instance</em> and |
| share <em>local</em> <em>memory</em> and <em>work-group functions</em>.</p> |
| </dd> |
| <dt class="hdlist1">Work-group Barrier </dt> |
| <dd> |
| <p>See <em>Barrier</em>.</p> |
| </dd> |
| <dt class="hdlist1">Work-group Function </dt> |
| <dd> |
| <p>A function that carries out collective operations across all the |
| work-items in a work-group. |
| Available collective operations are a barrier, reduction, broadcast, |
| prefix sum, and evaluation of a predicate. |
| A work-group function must occur within a <em>converged control flow</em>; i.e. |
| all work-items in the work-group must encounter precisely the same |
| work-group function.</p> |
| </dd> |
| <dt class="hdlist1">Work-group Synchronization </dt> |
| <dd> |
| <p>Constraints on the order of execution for work-items in a single |
| work-group.</p> |
| </dd> |
| <dt class="hdlist1">Work-pool </dt> |
| <dd> |
| <p>A logical pool associated with a device that holds commands and |
| work-groups from kernel-instances that are ready to execute. |
| OpenCL does not constrain the order that commands and work-groups are |
| scheduled for execution from the work-pool; i.e. a programmer must |
| assume that they could be interleaved. |
| There is one work-pool per device used by all command-queues associated |
| with that device. |
| The work-pool may be implemented in any manner as long as it assures |
| that work-groups placed in the pool will eventually execute.</p> |
| </dd> |
| <dt class="hdlist1">Work-item </dt> |
| <dd> |
| <p>One of a collection of parallel executions of a <em>kernel</em> invoked on a |
| <em>device</em> by a <em>command</em>. |
| A <em>work-item</em> is executed by one or more <em>processing elements</em> as part |
| of a <em>work-group</em> executing on a <em>compute unit</em>. |
| A <em>work-item</em> is distinguished from other work-items by its <em>global ID</em> |
| or the combination of its <em>work-group</em> ID and its <em>local ID</em> within a |
| <em>work-group</em>.</p> |
| </dd> |
| </dl> |
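| <div class="paragraph"> |
| <p>The bookkeeping described in the <em>Reference Count</em> entry above can be |
| sketched in plain C. |
| This is only a model under stated assumptions: <code>model_object</code>, |
| <code>model_create</code>, <code>model_retain</code> and <code>model_release</code> are hypothetical names, |
| not OpenCL API entry points, and the model ignores any additional |
| references an implementation may hold internally.</p> |
| </div> |

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an OpenCL object; not a real OpenCL type. */
typedef struct {
    unsigned ref_count;  /* set to one when the object is created */
    bool accessible;     /* false once the host has released its last reference */
} model_object;

/* Creating an object sets its reference count to one. */
model_object model_create(void)
{
    model_object obj = { 1u, true };
    return obj;
}

/* Models a retain call (such as clRetainContext): increments the count. */
void model_retain(model_object *obj)
{
    assert(obj->accessible);
    obj->ref_count++;
}

/* Models a release call (such as clReleaseContext): decrements the count.
 * Returns true when this release made the object inaccessible to the host. */
bool model_release(model_object *obj)
{
    assert(obj->accessible && obj->ref_count > 0);
    obj->ref_count--;
    if (obj->ref_count == 0) {
        obj->accessible = false;
    }
    return !obj->accessible;
}
```

| <div class="paragraph"> |
| <p>In this model, one create followed by one retain requires two releases |
| before the object becomes inaccessible. |
| A real implementation may keep the object alive longer, which is why the |
| observed count at that point is not guaranteed to be zero.</p> |
| </div> |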
| </div> |
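| <div class="paragraph"> |
| <p>The arithmetic behind the <em>Remainder work-groups</em> entry can be |
| illustrated for a single NDRange dimension. |
| The helper names <code>num_work_groups</code> and <code>remainder_size</code> are illustrative |
| only and are not part of the OpenCL API.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* Number of work-groups needed to cover one NDRange dimension,
 * counting a trailing remainder work-group if there is one. */
size_t num_work_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    return (global_size + local_size - 1) / local_size;
}

/* Size of the remainder work-group in one dimension, or 0 when the
 * base work-group size divides the NDRange size evenly. */
size_t remainder_size(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    return global_size % local_size;
}
```

| <div class="paragraph"> |
| <p>For an NDRange dimension of 1000 work-items and a base work-group size of |
| 64, this gives 16 work-groups, the last of which is a remainder work-group |
| of 40 work-items.</p> |
| </div> |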
| </div> |
| </div> |
| <div class="sect1"> |
| <h2 id="_the_opencl_architecture"><a class="anchor" href="#_the_opencl_architecture"></a>3. The OpenCL Architecture</h2> |
| <div class="sectionbody"> |
| <div class="paragraph"> |
| <p><strong>OpenCL</strong> is an open industry standard for programming a heterogeneous |
| collection of CPUs, GPUs and other discrete computing devices organized into |
| a single platform. |
| It is more than a language. |
| OpenCL is a framework for parallel programming and includes a language, API, |
| libraries and a runtime system to support software development. |
| Using OpenCL, for example, a programmer can write general purpose programs |
| that execute on GPUs without the need to map their algorithms onto a 3D |
| graphics API such as OpenGL or DirectX.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The target of OpenCL is expert programmers wanting to write portable yet |
| efficient code. |
| This includes library writers, middleware vendors, and performance oriented |
| application programmers. |
| Therefore, OpenCL provides a low-level hardware abstraction plus a framework |
| to support programming, and many details of the underlying hardware are |
| exposed.</p> |
| </div> |
| <div class="paragraph"> |
| <p>To describe the core ideas behind OpenCL, we will use a hierarchy of models:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>Platform Model</p> |
| </li> |
| <li> |
| <p>Memory Model</p> |
| </li> |
| <li> |
| <p>Execution Model</p> |
| </li> |
| <li> |
| <p>Programming Model</p> |
| </li> |
| </ul> |
| </div> |
| <div class="sect2"> |
| <h3 id="_platform_model"><a class="anchor" href="#_platform_model"></a>3.1. Platform Model</h3> |
| <div class="paragraph"> |
| <p>The <a href="#platform-model-image">Platform model</a> for OpenCL is defined below. |
| The model consists of a <strong>host</strong> connected to one or more <strong>OpenCL devices</strong>. |
| An OpenCL device is divided into one or more <strong>compute units</strong> (CUs) which are |
| further divided into one or more <strong>processing elements</strong> (PEs). |
| Computations on a device occur within the processing elements.</p> |
| </div> |
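| <div class="paragraph"> |
| <p>The containment hierarchy just described (host, devices, compute units, |
| processing elements) can be sketched with plain C structures. |
| The type and function names below are hypothetical and exist only for |
| illustration; a real application would query this information through |
| the OpenCL API instead.</p> |
| </div> |

```c
#include <stddef.h>

/* Hypothetical descriptions of the platform model; not OpenCL API types. */
typedef struct {
    size_t num_processing_elements;  /* PEs in this compute unit (one or more) */
} compute_unit;

typedef struct {
    const compute_unit *units;       /* CUs that make up this device */
    size_t num_units;                /* one or more */
} device;

typedef struct {
    const device *devices;           /* devices connected to the host */
    size_t num_devices;              /* one or more */
} platform;

/* Total number of processing elements across all devices of a platform. */
size_t total_processing_elements(const platform *p)
{
    size_t total = 0;
    for (size_t d = 0; d < p->num_devices; ++d) {
        for (size_t u = 0; u < p->devices[d].num_units; ++u) {
            total += p->devices[d].units[u].num_processing_elements;
        }
    }
    return total;
}
```

| <div class="paragraph"> |
| <p>A platform with one device of two 8-PE compute units and a second device |
| with a single 4-PE compute unit would report 20 processing elements in |
| this model.</p> |
| </div> |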
| <div class="paragraph"> |
| <p>An OpenCL application is implemented as both host code and device kernel |
| code. |
| The host code portion of an OpenCL application runs on a host processor |
| according to the models native to the host platform. |
| The OpenCL application host code submits the kernel code as commands from |
| the host to OpenCL devices. |
| An OpenCL device executes the command's computation on the processing |
| elements within the device.</p> |
| </div> |
| <div class="paragraph"> |
| <p>An OpenCL device has considerable latitude on how computations are mapped |
| onto the device's processing elements. |
| When processing elements within a compute unit execute the same sequence of |
| statements across the processing elements, the control flow is said to be |
| <em>converged</em>. |
| Hardware optimized for executing a single stream of instructions over |
| multiple processing elements is well suited to converged control flows. |
| When the control flow varies from one processing element to another, it is |
| said to be <em>diverged</em>. |
| While a kernel always begins execution with a converged control flow, due to |
| branching statements within a kernel, converged and diverged control flows |
| may occur within a single kernel. |
| This provides a great deal of flexibility in the algorithms that can be |
| implemented with OpenCL.</p> |
| </div> |
| <div id="platform-model-image" class="imageblock text-center"> |
| <div class="content"> |
| <img src="" alt="platform model"> |
| </div> |
| <div class="title">Figure 1. Platform Model …​ one host plus one or more compute devices each with one or more compute units composed of one or more processing elements.</div> |
| </div> |
| <div class="paragraph"> |
| <p>Programmers may provide programs in the form of OpenCL C source strings, |
| the SPIR-V intermediate language, or as implementation-defined binary objects. |
| An OpenCL platform provides a compiler to translate programs of these |
| forms into executable program objects. |
| The device code compiler may be <em>online</em> or <em>offline</em>. |
| An <em>online</em> <em>compiler</em> is available during host program execution using |
| standard APIs. |
| An <em>offline compiler</em> is invoked outside of host program control, using |
| platform-specific methods. |
| The OpenCL runtime allows developers to retrieve a previously compiled device |
| program executable, and to load and execute such an executable |
| later.</p> |
| </div> |
| <div class="paragraph"> |
| <p>OpenCL defines two kinds of platform profiles: a <em>full profile</em> and a |
| reduced-functionality <em>embedded profile</em>. |
| A full profile platform must provide an online compiler for all its devices. |
| An embedded platform may provide an online compiler, but is not required to |
| do so.</p> |
| </div> |
| <div class="paragraph"> |
| <p>A device may expose special purpose functionality as a <em>built-in kernel</em>. |
| The platform provides APIs for enumerating and invoking the built-in |
| kernels offered by a device, but otherwise does not define their |
| construction or semantics. |
| A <em>custom device</em> supports only built-in kernels, and cannot be programmed |
| via a kernel language.</p> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Built-in kernels and custom devices are <a href="#unified-spec">missing before</a> |
| version 1.2. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>All device types support the OpenCL execution model, the OpenCL memory |
| model, and the APIs used in OpenCL to manage devices.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The platform model is an abstraction describing how OpenCL views the |
| hardware. |
| The relationship between the elements of the platform model and the hardware |
| in a system may be a fixed property of a device or it may be a dynamic |
| feature of a program dependent on how a compiler optimizes code to best |
| utilize physical hardware.</p> |
| </div> |
| </div> |
| <div class="sect2"> |
| <h3 id="_execution_model"><a class="anchor" href="#_execution_model"></a>3.2. Execution Model</h3> |
| <div class="paragraph"> |
| <p>The OpenCL execution model is defined in terms of two distinct units of |
| execution: <strong>kernels</strong> that execute on one or more OpenCL devices and a <strong>host |
| program</strong> that executes on the host. |
| With regard to OpenCL, the kernels are where the "work" associated with a |
| computation occurs. |
| This work occurs through <strong>work-items</strong> that execute in groups |
| (<strong>work-groups</strong>).</p> |
| </div> |
| <div class="paragraph"> |
| <p>A kernel executes within a well-defined context managed by the host. |
| The context defines the environment within which kernels execute. |
| It includes the following resources:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>Devices</strong>: One or more devices exposed by the OpenCL platform.</p> |
| </li> |
| <li> |
| <p><strong>Kernel Objects</strong>: The OpenCL functions with their associated argument |
| values that run on OpenCL devices.</p> |
| </li> |
| <li> |
| <p><strong>Program Objects</strong>: The program source and executable that implement the |
| kernels.</p> |
| </li> |
| <li> |
| <p><strong>Memory Objects</strong>: Variables visible to the host and the OpenCL devices. |
| Instances of kernels operate on these objects as they execute.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>The host program uses the OpenCL API to create and manage the context. |
| Functions from the OpenCL API enable the host to interact with a device |
| through a <em>command-queue</em>. |
| Each command-queue is associated with a single device. |
| The commands placed into the command-queue fall into one of three types:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>Kernel-enqueue commands</strong>: Enqueue a kernel for execution on a device.</p> |
| </li> |
| <li> |
| <p><strong>Memory commands</strong>: Transfer data between the host and device memory, |
| between memory objects, or map and unmap memory objects from the host |
| address space.</p> |
| </li> |
| <li> |
| <p><strong>Synchronization commands</strong>: Explicit synchronization points that define |
| order constraints between commands.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>In addition to commands submitted from the host command-queue, a kernel |
| running on a device can enqueue commands to a device-side command queue. |
| This results in <em>child kernels</em> enqueued by a kernel executing on a device |
| (the <em>parent kernel</em>). |
| Regardless of whether the command-queue resides on the host or a device, |
| each command passes through six states.</p> |
| </div> |
| <div class="olist arabic"> |
| <ol class="arabic"> |
| <li> |
| <p><strong>Queued</strong>: The command is enqueued to a command-queue. |
| A command may reside in the queue until it is flushed either explicitly |
| (a call to <a href="#clFlush"><strong>clFlush</strong></a>) or implicitly by some other command.</p> |
| </li> |
| <li> |
| <p><strong>Submitted</strong>: The command is flushed from the command-queue and submitted |
| for execution on the device. |
| Once flushed from the command-queue, a command will execute after any |
| prerequisites for execution are met.</p> |
| </li> |
| <li> |
| <p><strong>Ready</strong>: All prerequisites constraining execution of a command have been |
| met. |
| The command, or for a kernel-enqueue command the collection of work |
| groups associated with a command, is placed in a device work-pool from |
| which it is scheduled for execution.</p> |
| </li> |
| <li> |
| <p><strong>Running</strong>: Execution of the command starts. |
| For the case of a kernel-enqueue command, one or more work-groups |
| associated with the command start to execute.</p> |
| </li> |
| <li> |
| <p><strong>Ended</strong>: Execution of a command ends. |
| When a kernel-enqueue command ends, all of the work-groups associated |
| with that command have finished their execution. |
| <em>Immediate side effects</em>, i.e. those associated with the kernel but not |
| necessarily with its child kernels, are visible to other units of |
| execution. |
| These side effects include updates to values in global memory.</p> |
| </li> |
| <li> |
| <p><strong>Complete</strong>: The command and its child commands have finished execution |
| and the status of the event object, if any, associated with the command |
| is set to <a href="#CL_COMPLETE"><code>CL_COMPLETE</code></a>.</p> |
| </li> |
| </ol> |
| </div> |
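| <div class="paragraph"> |
| <p>The six states listed above can be modelled as a small state machine. |
| The sketch below assumes a command that terminates normally and advances |
| one state at a time; the enumerator names and the helper |
| <code>is_valid_transition</code> are illustrative and are not OpenCL API constants.</p> |
| </div> |

```c
#include <stdbool.h>

/* The six conceptual command states, in the order listed above.
 * Names and values are illustrative, not OpenCL API constants. */
typedef enum {
    STATE_QUEUED,
    STATE_SUBMITTED,
    STATE_READY,
    STATE_RUNNING,
    STATE_ENDED,
    STATE_COMPLETE
} command_state;

/* In this model a command only moves forward, one state at a time,
 * and STATE_COMPLETE is final. Abnormal termination is not modelled. */
bool is_valid_transition(command_state from, command_state to)
{
    return to <= STATE_COMPLETE && (int)to == (int)from + 1;
}
```

| <div class="paragraph"> |
| <p>Under these assumptions, moving from queued to submitted is accepted, |
| while skipping from queued directly to ready, or leaving the complete |
| state, is rejected.</p> |
| </div> |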
| <div class="paragraph"> |
| <p>The <a href="#profiled-states-image">execution states and the transitions between |
| them</a> are summarized below. |
| These states and the concept of a device work-pool are conceptual elements |
| of the execution model. |
| An implementation of OpenCL has considerable freedom in how these are |
| exposed to a program. |
| Five of the transitions, however, are directly observable through a |
| profiling interface. |
| These <a href="#profiled-states-image">profiled states</a> are shown below.</p> |
| </div> |
| <div id="profiled-states-image" class="imageblock text-center"> |
| <div class="content"> |
| <img src="" alt="profiled states"> |
| </div> |
| <div class="title">Figure 2. The states and transitions between states defined in the OpenCL execution model. A subset of these transitions is exposed through the <a href="#profiling-operations">profiling interface</a>.</div> |
| </div> |
| <div class="paragraph"> |
| <p>Commands communicate their status through <em>Event objects</em>. |
| Successful completion is indicated by setting the event status associated |
| with a command to <a href="#CL_COMPLETE"><code>CL_COMPLETE</code></a>. |
| Unsuccessful completion results in abnormal termination of the command which |
| is indicated by setting the event status to a negative value. |
| In this case, the command-queue associated with the abnormally terminated |
| command and all other command-queues in the same context may no longer be |
| available and their behavior is implementation defined.</p> |
| </div> |
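| <div class="paragraph"> |
| <p>A host program typically distinguishes three outcomes when it inspects |
| an event status: still pending, completed successfully, or terminated |
| abnormally. |
| The sketch below uses local stand-in constants so that it builds without |
| the OpenCL headers; the <code>MODEL_*</code> values are assumed to mirror the |
| execution status values defined there, and <code>classify_status</code> is a |
| hypothetical helper.</p> |
| </div> |

```c
/* Local stand-ins assumed to mirror the execution status values in the
 * OpenCL headers; defined here so the sketch builds without <CL/cl.h>. */
enum {
    MODEL_COMPLETE  = 0,
    MODEL_RUNNING   = 1,
    MODEL_SUBMITTED = 2,
    MODEL_QUEUED    = 3
};

typedef enum {
    OUTCOME_PENDING,  /* command has not finished yet */
    OUTCOME_SUCCESS,  /* event status equals the complete value */
    OUTCOME_FAILED    /* negative status: abnormal termination */
} outcome;

/* Classifies an event execution status as described in the text above. */
outcome classify_status(int status)
{
    if (status < 0) {
        return OUTCOME_FAILED;
    }
    if (status == MODEL_COMPLETE) {
        return OUTCOME_SUCCESS;
    }
    return OUTCOME_PENDING;
}
```

| <div class="paragraph"> |
| <p>Checking for a negative value first matters because, after abnormal |
| termination, the behavior of the command-queues in the same context is |
| implementation defined.</p> |
| </div> |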
| <div class="paragraph"> |
| <p>A command submitted to a device will not launch until prerequisites that |
| constrain the order of commands have been resolved. |
| These prerequisites have three sources:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>They may arise from commands submitted to a command-queue that constrain |
| the order in which commands are launched. |
| For example, commands that follow a command queue barrier will not |
| launch until all commands prior to the barrier are complete.</p> |
| </li> |
| <li> |
| <p>The second source of prerequisites is dependencies between commands |
| expressed through events. |
| A command may include an optional list of events. |
| The command will wait and not launch until all the events in the list |
| are in the state <a href="#CL_COMPLETE"><code>CL_COMPLETE</code></a>. |
| By this mechanism, event objects define order constraints between |
| commands and coordinate execution between the host and one or more |
| devices.</p> |
| </li> |
| <li> |
| <p>The third source of prerequisites can be the presence of non-trivial C |
| initializers or C++ constructors for program scope global variables. |
| In this case, the OpenCL C/C++ compiler shall generate program |
| initialization kernels that perform C initialization or C++ |
| construction. |
| These kernels must be executed by the OpenCL runtime on a device before any |
| kernel from the same program can be executed on the same device. |
| The ND-range for any program initialization kernel is (1,1,1). |
| When multiple programs are linked together, the order of execution of |
| program initialization kernels that belong to different programs is |
| undefined.</p> |
| </li> |
| </ul> |
| </div> |
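| <div class="paragraph"> |
| <p>The event-based prerequisite can be pictured with a small, host-independent model. |
| The sketch below is illustrative only: <code>model_can_launch</code> and the <code>MODEL_*</code> names are hypothetical and not part of the OpenCL API, although the non-negative status values mirror the numeric values of <code>CL_COMPLETE</code>, <code>CL_RUNNING</code>, <code>CL_SUBMITTED</code> and <code>CL_QUEUED</code>. |
| Treating a negative status in the wait list as a reason not to launch is an assumption of this model.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status values for this model; they mirror the numeric
 * values of CL_COMPLETE, CL_RUNNING, CL_SUBMITTED and CL_QUEUED. */
enum { MODEL_COMPLETE = 0, MODEL_RUNNING = 1, MODEL_SUBMITTED = 2, MODEL_QUEUED = 3 };

/* Outcome of checking a command's event wait list. */
typedef enum { MODEL_WAIT, MODEL_LAUNCH, MODEL_ABORT } model_launch_t;

/* Decide whether a command may launch, given the statuses of the events
 * in its wait list. An empty list imposes no event prerequisites. */
model_launch_t model_can_launch(const int *wait_status, size_t count)
{
    assert(count == 0 || wait_status != NULL);
    model_launch_t result = MODEL_LAUNCH;
    for (size_t i = 0; i < count; ++i) {
        if (wait_status[i] < 0)
            return MODEL_ABORT;   /* a prerequisite terminated abnormally (model assumption) */
        if (wait_status[i] != MODEL_COMPLETE)
            result = MODEL_WAIT;  /* at least one prerequisite is still pending */
    }
    return result;
}
```

| <div class="paragraph"> |
| <p>In this model a command with an empty wait list is immediately eligible to launch, which corresponds to a command enqueued without an event list.</p> |
| </div> |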
| <div class="paragraph"> |
| <p>Program clean up may result in the execution of one or more program clean up |
| kernels by the OpenCL runtime. |
| This is due to the presence of non-trivial C++ destructors for |
| program scope variables. |
| The ND-range for executing any program clean up kernel is (1,1,1). |
| The order of execution of clean up kernels from different programs (that are |
| linked together) is undefined.</p> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Program initialization and clean-up kernels are <a href="#unified-spec">missing before</a> version 2.2. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>Note that C initializers, C++ constructors, or C++ destructors for program |
| scope variables cannot use pointers to coarse grain and fine grain SVM |
| allocations.</p> |
| </div> |
| <div class="paragraph"> |
| <p>A command may be submitted to a device and yet have no visible side effects |
| outside of waiting on and satisfying event dependences. |
| Examples include markers, kernels executed over ranges of no work-items or |
| copy operations with zero sizes. |
| Such commands may pass directly from the <em>ready</em> state to the <em>ended</em> state.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Command execution can be blocking or non-blocking. |
| Consider a sequence of OpenCL commands. |
| For blocking commands, the OpenCL API functions that enqueue commands don’t |
| return until the command has completed. |
| Alternatively, OpenCL functions that enqueue non-blocking commands return |
| immediately and require that a programmer defines dependencies between |
| enqueued commands to ensure that enqueued commands are not launched before |
| needed resources are available. |
| In both cases, the actual execution of the command may occur asynchronously |
| with execution of the host program.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Commands within a single command-queue execute relative to each other in one |
| of two modes:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>In-order Execution</strong>: Commands and any side effects associated with |
| commands appear to the OpenCL application as if they execute in the same |
| order they are enqueued to a command-queue.</p> |
| </li> |
| <li> |
| <p><strong>Out-of-order Execution</strong>: Commands execute in any order constrained only |
| by explicit synchronization points (e.g. through command queue barriers) |
| or explicit dependencies on events.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Multiple command-queues can be present within a single context. |
| Multiple command-queues execute commands independently. |
| Event objects visible to the host program can be used to define |
| synchronization points between commands in multiple command queues. |
| If such synchronization points are established between commands in multiple |
| command-queues, an implementation must assure that the command-queues |
| progress concurrently and correctly account for the dependencies established |
| by the synchronization points. |
| For a detailed explanation of synchronization points, see the execution model |
| <a href="#execution-model-sync">Synchronization</a> section.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The core of the OpenCL execution model is defined by how the kernels |
| execute. |
| When a kernel-enqueue command submits a kernel for execution, an index space |
| is defined. |
| The kernel, the argument values associated with the arguments to the kernel, |
| and the parameters that define the index space define a <em>kernel-instance</em>. |
| When a kernel-instance executes on a device, the kernel function executes |
| for each point in the defined index space. |
| Each of these executing kernel functions is called a <em>work-item</em>. |
| The work-items associated with a given kernel-instance are managed by the |
| device in groups called <em>work-groups</em>. |
| These work-groups define a coarse-grained decomposition of the index space. |
| Work-groups are further divided into <em>sub-groups</em>, which provide an |
| additional level of control over execution.</p> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Sub-groups are <a href="#unified-spec">missing before</a> version 2.1. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>Work-items have a global ID based on their coordinates within the index |
| space. |
| They can also be defined in terms of their work-group and the local ID |
| within a work-group. |
| The details of this mapping are described in the following section.</p> |
| </div> |
| <div class="sect3"> |
| <h4 id="_mapping_work_items_onto_an_ndrange"><a class="anchor" href="#_mapping_work_items_onto_an_ndrange"></a>3.2.1. Mapping work-items onto an NDRange</h4> |
| <div class="paragraph"> |
| <p>The index space supported by OpenCL is called an NDRange. |
| An NDRange is an N-dimensional index space, where N is one, two or three. |
| The NDRange is decomposed into work-groups forming blocks that cover the |
| index space. |
| An NDRange is defined by three integer arrays of length N:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>The extent of the index space (or global size) in each dimension.</p> |
| </li> |
| <li> |
| <p>An offset index F indicating the initial value of the indices in each |
| dimension (zero by default).</p> |
| </li> |
| <li> |
| <p>The size of a work-group (local size) in each dimension.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Each work-item’s global ID is an N-dimensional tuple. |
| The global ID components are values in the range from F to F plus the |
| number of elements in that dimension minus one.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Unless a kernel comes from a source that disallows it, e.g. OpenCL C 1.x or |
| using <code>-cl-uniform-work-group-size</code>, the size of work-groups in |
| an NDRange (the local size) need not be the same for all work-groups. |
| In this case, any single dimension for which the global size is not |
| divisible by the local size will be partitioned into two regions. |
| One region will have work-groups that have the same number of work-items as |
| was specified for that dimension by the programmer (the local size). |
| The other region will have work-groups with fewer than the number of |
| work-items specified by the local size parameter in that dimension (the |
| <em>remainder work-groups</em>). |
| Work-group sizes could be non-uniform in multiple dimensions, potentially |
| producing work-groups of up to 4 different sizes in a 2D range and 8 |
| different sizes in a 3D range.</p> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Non-uniform work-group sizes are <a href="#unified-spec">missing before</a> version |
| 2.0. |
| </td> |
| </tr> |
| </table> |
| </div> |
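| <div class="paragraph"> |
| <p>The partitioning of a dimension into uniform and remainder work-groups is plain integer arithmetic. |
| The following sketch is not part of the OpenCL API: <code>dim_split_t</code>, <code>split_dimension</code> and <code>distinct_group_shapes</code> are hypothetical names chosen for illustration. |
| It assumes that non-uniform work-groups are permitted for the kernel.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* Per-dimension split of an NDRange into uniform and remainder work-groups. */
typedef struct {
    size_t full_groups;     /* work-groups holding exactly local_size work-items */
    size_t remainder_size;  /* work-items in the remainder work-group, 0 if none */
} dim_split_t;

dim_split_t split_dimension(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    dim_split_t s;
    s.full_groups = global_size / local_size;
    s.remainder_size = global_size % local_size;
    return s;
}

/* Number of distinct work-group shapes across `dims` dimensions. Each
 * dimension that has both full and remainder work-groups doubles the count. */
size_t distinct_group_shapes(const size_t *global, const size_t *local, size_t dims)
{
    size_t shapes = 1;
    for (size_t d = 0; d < dims; ++d) {
        dim_split_t s = split_dimension(global[d], local[d]);
        if (s.remainder_size != 0 && s.full_groups != 0)
            shapes *= 2;
    }
    return shapes;
}
```

| <div class="paragraph"> |
| <p>For instance, a global size of 10 with a local size of 4 gives two full work-groups and one remainder work-group of 2 work-items in that dimension.</p> |
| </div> |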
| <div class="paragraph"> |
| <p>Each work-item is assigned to a work-group and given a local ID to represent |
| its position within the work-group. |
| A work-item’s local ID is an N-dimensional tuple with components in the |
| range from zero to the size of the work-group in that dimension minus one.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Work-groups are assigned IDs similarly. |
| The number of work-groups in each dimension is not directly defined but is |
| inferred from the local and global NDRanges provided when a kernel-instance |
| is enqueued. |
| A work-group’s ID is an N-dimensional tuple with components in the range 0 |
| to the ceiling of the global size in that dimension divided by the local |
| size in the same dimension, minus one. |
| As a result, the combination of a work-group ID and the local-ID within a |
| work-group uniquely defines a work-item. |
| Each work-item is identifiable in two ways: in terms of a global index, and |
| in terms of a work-group index plus a local index within a work-group.</p> |
| </div> |
| <div class="paragraph"> |
| <p>For example, consider the <a href="#index-space-image">2-dimensional index space</a> |
| shown below. |
| We input the index space for the work-items (G<sub>x</sub>, G<sub>y</sub>), the size of each |
| work-group (S<sub>x</sub>, S<sub>y</sub>) and the global ID offset (F<sub>x</sub>, F<sub>y</sub>). |
| The global indices define a G<sub>x</sub> by G<sub>y</sub> index space where the total number |
| of work-items is the product of G<sub>x</sub> and G<sub>y</sub>. |
| The local indices define an S<sub>x</sub> by S<sub>y</sub> index space where the number of |
| work-items in a single work-group is the product of S<sub>x</sub> and S<sub>y</sub>. |
| Given the size of each work-group and the total number of work-items we can |
| compute the number of work-groups. |
| A 2-dimensional index space is used to uniquely identify a work-group. |
| Each work-item is identified by its global ID (<em>g</em><sub>x</sub>, <em>g</em><sub>y</sub>) or by the |
| combination of the work-group ID (<em>w</em><sub>x</sub>, <em>w</em><sub>y</sub>), the size of each |
| work-group (S<sub>x</sub>,S<sub>y</sub>) and the local ID (s<sub>x</sub>, s<sub>y</sub>) inside the work-group |
| such that</p> |
| </div> |
| <div class="ulist none"> |
| <ul class="none"> |
| <li> |
| <p>(g<sub>x</sub>, g<sub>y</sub>) = (w<sub>x</sub> × S<sub>x</sub> + s<sub>x</sub> + F<sub>x</sub>, w<sub>y</sub> × S<sub>y</sub> + s<sub>y</sub> + F<sub>y</sub>)</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>The number of work-groups can be computed as:</p> |
| </div> |
| <div class="ulist none"> |
| <ul class="none"> |
| <li> |
| <p>(W<sub>x</sub>, W<sub>y</sub>) = (ceil(G<sub>x</sub> / S<sub>x</sub>), ceil(G<sub>y</sub> / S<sub>y</sub>))</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Given a global ID and the work-group size, the work-group ID for a work-item |
| is computed as:</p> |
| </div> |
| <div class="ulist none"> |
| <ul class="none"> |
| <li> |
| <p>(w<sub>x</sub>, w<sub>y</sub>) = ( (g<sub>x</sub> - s<sub>x</sub> - F<sub>x</sub>) / S<sub>x</sub>, (g<sub>y</sub> - s<sub>y</sub> - F<sub>y</sub>) / S<sub>y</sub> )</p> |
| </li> |
| </ul> |
| </div> |
| <div id="index-space-image" class="imageblock text-center"> |
| <div class="content"> |
| <img src="" alt="index space"> |
| </div> |
| <div class="title">Figure 3. An example of an NDRange index space showing work-items, their global IDs and their mapping onto the pair of work-group and local IDs. In this case, we assume that in each dimension, the size of the work-group evenly divides the global NDRange size (i.e. all work-groups have the same size) and that the offset is equal to zero.</div> |
| </div> |
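| <div class="paragraph"> |
| <p>The relations above can be checked per dimension with integer arithmetic. |
| The sketch below uses the same symbols (G, S, F, g, w, s); the function names are hypothetical and are not OpenCL built-ins. |
| Because the local ID s is always less than S, the work-group ID can be obtained by integer division of (g - F) by S, which is how the sketch computes it.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* W = ceil(G / S): number of work-groups in one dimension. */
size_t num_work_groups(size_t G, size_t S)
{
    assert(S > 0);
    return G / S + (G % S != 0);
}

/* g = w * S + s + F: global ID from work-group ID w and local ID s. */
size_t global_id(size_t w, size_t S, size_t s, size_t F)
{
    assert(s < S);
    return w * S + s + F;
}

/* w = (g - s - F) / S, computed without knowing s: integer division
 * of (g - F) by S discards the local ID. */
size_t work_group_id(size_t g, size_t S, size_t F)
{
    assert(S > 0 && g >= F);
    return (g - F) / S;
}

/* s = (g - F) mod S: local ID recovered from the global ID. */
size_t local_id(size_t g, size_t S, size_t F)
{
    assert(S > 0 && g >= F);
    return (g - F) % S;
}
```

| <div class="paragraph"> |
| <p>With S = 4 and F = 5, the work-item with w = 2 and s = 1 has global ID 2 × 4 + 1 + 5 = 14, and the two inverse functions recover w = 2 and s = 1 from g = 14.</p> |
| </div> |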
| <div class="paragraph"> |
| <p>Within a work-group, work-items may be divided into sub-groups. |
| The mapping of work-items to sub-groups is implementation-defined and may be |
| queried at runtime. |
| While sub-groups may be used in multi-dimensional work-groups, each |
| sub-group is 1-dimensional and any given work-item may query which sub-group |
| it is a member of.</p> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Sub-groups are <a href="#unified-spec">missing before</a> version 2.1. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>Work-items are mapped into sub-groups through a combination of compile-time |
| decisions and the parameters of the dispatch. |
| The mapping to sub-groups is invariant for the duration of a kernel’s |
| execution, across dispatches of a given kernel with the same work-group |
| dimensions, between dispatches and query operations consistent with the |
| dispatch parameterization, and from one work-group to another within the |
| dispatch (excluding the trailing edge work-groups in the presence of |
| non-uniform work-group sizes). |
| In addition, all sub-groups within a work-group will be the same size, apart |
| from the sub-group with the maximum index which may be smaller if the size |
| of the work-group is not evenly divisible by the size of the sub-groups.</p> |
| </div> |
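| <div class="paragraph"> |
| <p>The size rule for sub-groups can be illustrated with a short calculation. |
| This is a sketch under stated assumptions: the sub-group size is chosen by the implementation, so the value passed as <code>sg_size</code> is hypothetical, and both function names are invented for this example rather than taken from the OpenCL API.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* Number of sub-groups in a work-group of `wg_items` work-items when the
 * implementation uses sub-groups of `sg_size` work-items. */
size_t sub_group_count(size_t wg_items, size_t sg_size)
{
    assert(sg_size > 0);
    return wg_items / sg_size + (wg_items % sg_size != 0);
}

/* Size of sub-group `index`. Only the sub-group with the maximum index
 * may be smaller than sg_size. */
size_t sub_group_size_at(size_t wg_items, size_t sg_size, size_t index)
{
    size_t count = sub_group_count(wg_items, sg_size);
    assert(index < count);
    size_t tail = wg_items % sg_size;
    return (index == count - 1 && tail != 0) ? tail : sg_size;
}
```

| <div class="paragraph"> |
| <p>A work-group of 100 work-items with an assumed sub-group size of 32 yields four sub-groups, the last of which holds 4 work-items. |
| Setting <code>sg_size</code> equal to <code>wg_items</code> models a single sub-group per work-group.</p> |
| </div> |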
| <div class="paragraph"> |
| <p>In the degenerate case, a single sub-group must be supported for each |
| work-group. |
| In this situation all sub-group scope functions are equivalent to their |
| work-group level equivalents.</p> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="_execution_of_kernel_instances"><a class="anchor" href="#_execution_of_kernel_instances"></a>3.2.2. Execution of kernel-instances</h4> |
| <div class="paragraph"> |
| <p>The work carried out by an OpenCL program occurs through the execution of |
| kernel-instances on compute devices. |
| To understand the details of OpenCL’s execution model, we need to consider |
| how a kernel object moves from the kernel-enqueue command, into a |
| command-queue, executes on a device, and completes.</p> |
| </div> |
| <div class="paragraph"> |
| <p>A kernel object is defined as a function within the program object and a |
| collection of arguments connecting the kernel to a set of argument values. |
| The host program enqueues a kernel object to the command queue along with |
| the NDRange and the work-group decomposition. |
| These define a <em>kernel-instance</em>. |
| In addition, an optional set of events may be defined when the kernel is |
| enqueued. |
| The events associated with a particular kernel-instance are used to |
| constrain when the kernel-instance is launched with respect to other |
| commands in the queue or to commands in other queues within the same |
| context.</p> |
| </div> |
| <div class="paragraph"> |
| <p>A kernel-instance is submitted to a device. |
| For an in-order command-queue, the kernel-instances appear to launch and |
| then execute in the order in which they were enqueued. The term <em>appear</em> emphasizes |
| that when there are no dependencies between commands, and hence differences |
| in the order that commands execute cannot be observed in a program, an |
| implementation can reorder commands even in an in-order command-queue. |
| For an out-of-order command-queue, kernel-instances wait to be launched |
| until:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>Synchronization commands enqueued prior to the kernel-instance are |
| satisfied.</p> |
| </li> |
| <li> |
| <p>Each of the events in an optional event list defined when the |
| kernel-instance was enqueued is set to <a href="#CL_COMPLETE"><code>CL_COMPLETE</code></a>.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Once these conditions are met, the kernel-instance is launched and the |
| work-groups associated with the kernel-instance are placed into a pool of |
| ready to execute work-groups. |
| This pool is called a <em>work-pool</em>. |
| The work-pool may be implemented in any manner as long as it assures that |
| work-groups placed in the pool will eventually execute. |
| The device schedules work-groups from the work-pool for execution on the |
| compute units of the device. |
| The kernel-enqueue command is complete when all work-groups associated with |
| the kernel-instance end their execution, updates to global memory associated |
| with a command are visible globally, and the device signals successful |
| completion by setting the event associated with the kernel-enqueue command |
| to <a href="#CL_COMPLETE"><code>CL_COMPLETE</code></a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>While a command-queue is associated with only one device, a single device |
| may be associated with multiple command-queues all feeding into the single |
| work-pool. |
| A device may also be associated with command queues associated with |
| different contexts within the same platform, again all feeding into the |
| single work-pool. |
| The device will pull work-groups from the work-pool and execute them on one |
| or several compute units in any order; possibly interleaving execution of |
| work-groups from multiple commands. |
| A conforming implementation may choose to serialize the work-groups so a |
| correct algorithm cannot assume that work-groups will execute in parallel. |
| There is no safe and portable way to synchronize across the independent |
| execution of work-groups since once in the work-pool, they can execute in |
| any order.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The work-items within a single sub-group execute concurrently but not |
| necessarily in parallel (i.e. they are not guaranteed to make independent |
| forward progress). |
| Therefore, only high-level synchronization constructs (e.g. sub-group |
| functions such as barriers) that apply to all the work-items in a sub-group |
| are well defined and included in OpenCL.</p> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Sub-groups are <a href="#unified-spec">missing before</a> version 2.1. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>Sub-groups execute concurrently within a given work-group and with |
| appropriate device support (see <a href="#platform-querying-devices">Querying |
| Devices</a>), may make independent forward progress with respect to each |
| other, with respect to host threads and with respect to any entities |
| external to the OpenCL system but running on an OpenCL device, even in the |
| absence of work-group barrier operations. |
| In this situation, sub-groups are able to internally synchronize using |
| barrier operations without synchronizing with each other and may perform |
| operations that rely on runtime dependencies on operations other sub-groups |
| perform.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The work-items within a single work-group execute concurrently but are only |
| guaranteed to make independent progress in the presence of sub-groups and |
| device support. |
| In the absence of this capability, only high-level synchronization |
| constructs (e.g. work-group functions such as barriers) that apply to all |
| the work-items in a work-group are well defined and included in OpenCL for |
| synchronization within the work-group.</p> |
| </div> |
| <div class="paragraph"> |
| <p>In the absence of synchronization functions (e.g. a barrier), work-items |
| within a sub-group may be serialized. |
| In the presence of sub-group functions, work-items within a sub-group may |
| be serialized before any given sub-group function, between dynamically |
| encountered pairs of sub-group functions and between a work-group function |
| and the end of the kernel.</p> |
| </div> |
| <div class="paragraph"> |
| <p>In the absence of independent forward progress of constituent sub-groups, |
| work-items within a work-group may be serialized before, after or between |
| work-group synchronization functions.</p> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="device-side-enqueue"><a class="anchor" href="#device-side-enqueue"></a>3.2.3. Device-side enqueue</h4> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Device-side enqueue is <a href="#unified-spec">missing before</a> version 2.0. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>Algorithms may need to generate additional work as they execute. |
| In many cases, this additional work cannot be determined statically; so the |
| work associated with a kernel only emerges at runtime as the kernel-instance |
| executes. |
| This capability could be implemented in logic running within the host |
| program, but involvement of the host may add significant overhead and/or |
| complexity to the application control flow. |
| A more efficient approach would be to nest kernel-enqueue commands from |
| inside other kernels. |
| This <strong>nested parallelism</strong> can be realized by supporting the enqueuing of |
| kernels on a device without direct involvement by the host program; |
| so-called <strong>device-side enqueue</strong>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Device-side kernel-enqueue commands are similar to host-side kernel-enqueue |
| commands. |
| The kernel executing on a device (the <strong>parent kernel</strong>) enqueues a |
| kernel-instance (the <strong>child kernel</strong>) to a device-side command queue. |
| This is an out-of-order command-queue and follows the same behavior as the |
| out-of-order command-queues exposed to the host program. |
| Commands enqueued to a device side command-queue generate and use events to |
| enforce order constraints just as for the command-queue on the host. |
| These events, however, are only visible to the parent kernel running on the |
| device. |
| When these prerequisite events take on the value <a href="#CL_COMPLETE"><code>CL_COMPLETE</code></a>, the |
| work-groups associated with the child kernel are launched into the device’s |
| work-pool. |
| The device then schedules them for execution on the compute units of the |
| device. |
| Child and parent kernels execute asynchronously. |
| However, a parent will not indicate that it is complete by setting its event |
| to <a href="#CL_COMPLETE"><code>CL_COMPLETE</code></a> until all child kernels have ended execution and have |
| signaled completion by setting any associated events to the value |
| <a href="#CL_COMPLETE"><code>CL_COMPLETE</code></a>. |
| Should any child kernel complete with an event status set to a negative |
| value (i.e. abnormally terminate), the parent kernel will abnormally |
| terminate and propagate the child’s negative event value as the value of the |
| parent’s event. |
| If there are multiple children that have an event status set to a negative |
| value, the selection of which child’s negative event value is propagated is |
| implementation-defined.</p> |
| </div> |
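| <div class="paragraph"> |
| <p>The completion rule for a parent kernel can be summarized with a small model. |
| The sketch below is illustrative and not part of the OpenCL API: <code>parent_event_status</code> and the <code>MODEL_EV_*</code> names are hypothetical. |
| It assumes the parent’s own work-groups have already ended, and it propagates the first negative child status it encounters, which is only one of the choices an implementation is free to make.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status values for this model; 0 mirrors CL_COMPLETE and
 * any positive value stands for "not yet complete". */
enum { MODEL_EV_COMPLETE = 0, MODEL_EV_PENDING = 1 };

/* Event status a parent kernel may report once its own work-groups have
 * ended, given the event statuses of its child kernels. */
int parent_event_status(const int *child_status, size_t count)
{
    assert(count == 0 || child_status != NULL);
    int pending = 0;
    for (size_t i = 0; i < count; ++i) {
        if (child_status[i] < 0)
            return child_status[i];  /* propagate a child's error; taking the first is a model choice */
        if (child_status[i] != MODEL_EV_COMPLETE)
            pending = 1;             /* a child has not completed yet */
    }
    return pending ? MODEL_EV_PENDING : MODEL_EV_COMPLETE;
}
```

| <div class="paragraph"> |
| <p>In this model a parent with no children, or whose children have all completed, reports completion, while a single abnormally terminated child makes the parent report that child’s negative value.</p> |
| </div> |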
| </div> |
| <div class="sect3"> |
| <h4 id="execution-model-sync"><a class="anchor" href="#execution-model-sync"></a>3.2.4. Synchronization</h4> |
| <div class="paragraph"> |
| <p>Synchronization refers to mechanisms that constrain the order of execution |
| between two or more units of execution. |
| Consider the following three domains of synchronization in OpenCL:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>Work-group synchronization: Constraints on the order of execution for |
| work-items in a single work-group</p> |
| </li> |
| <li> |
| <p>Sub-group synchronization: Constraints on the order of execution for |
| work-items in a single sub-group. |
| Note: Sub-groups are <a href="#unified-spec">missing before</a> version 2.1</p> |
| </li> |
| <li> |
| <p>Command synchronization: Constraints on the order of commands launched |
| for execution</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Synchronization across all work-items within a single work-group is carried |
| out using a <em>work-group function</em>. |
| These functions carry out collective operations across all the work-items in |
| a work-group. |
| Available collective operations are: barrier, reduction, broadcast, prefix |
| sum, and evaluation of a predicate. |
| A work-group function must occur within a converged control flow; i.e. all |
| work-items in the work-group must encounter precisely the same work-group |
| function. |
| For example, if a work-group function occurs within a loop, the work-items |
| must encounter the same work-group function in the same loop iterations. |
| All the work-items of a work-group must execute the work-group function and |
| complete reads and writes to memory before any are allowed to continue |
| execution beyond the work-group function. |
| Work-group functions that apply between work-groups are not provided in |
| OpenCL since OpenCL does not define forward-progress or ordering relations |
| between work-groups, hence collective synchronization operations are not |
| well defined.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Synchronization across all work-items within a single sub-group is carried |
| out using a <em>sub-group function</em>. |
| These functions carry out collective operations across all the work-items in |
| a sub-group. |
| Available collective operations are: barrier, reduction, broadcast, prefix |
| sum, and evaluation of a predicate. |
| A sub-group function must occur within a converged control flow; i.e. all |
| work-items in the sub-group must encounter precisely the same sub-group |
| function. |
| For example, if a sub-group function occurs within a loop, the work-items |
| must encounter the same sub-group function in the same loop iterations. |
| All the work-items of a sub-group must execute the sub-group function and |
| complete reads and writes to memory before any are allowed to continue |
| execution beyond the sub-group function. |
| Synchronization between sub-groups must either be performed using work-group |
| functions, or through memory operations. |
| Memory operations should be used carefully for sub-group synchronization, |
| because forward progress of sub-groups relative to each other is only |
| optionally supported by OpenCL implementations.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Command synchronization is defined in terms of distinct <strong>synchronization |
| points</strong>. |
| The synchronization points occur between commands in host command-queues and |
| between commands in device-side command-queues. |
| The synchronization points defined in OpenCL include:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>Launching a command:</strong> A kernel-instance is launched onto a device after |
| all events that kernel is waiting-on have been set to <a href="#CL_COMPLETE"><code>CL_COMPLETE</code></a>.</p> |
| </li> |
| <li> |
| <p><strong>Ending a command:</strong> Child kernels may be enqueued such that they wait |
| for the parent kernel to reach the <em>ended</em> state before they can be |
| launched. |
| In this case, the ending of the parent command defines a synchronization |
| point.</p> |
| </li> |
| <li> |
| <p><strong>Completion of a command:</strong> A kernel-instance is complete after all of |
| the work-groups in the kernel and all of its child kernels have |
| completed. |
| This is signaled to the host, a parent kernel or other kernels within |
| command queues by setting the value of the event associated with a |
| kernel to <a href="#CL_COMPLETE"><code>CL_COMPLETE</code></a>.</p> |
| </li> |
| <li> |
| <p><strong>Blocking Commands:</strong> A blocking command defines a synchronization point |
| between the unit of execution that calls the blocking API function and |
| the enqueued command reaching the complete state.</p> |
| </li> |
| <li> |
| <p><strong>Command-queue barrier:</strong> The command-queue barrier ensures that all |
| previously enqueued commands have completed before subsequently enqueued |
| commands can be launched.</p> |
| </li> |
| <li> |
| <p><a href="#clFinish"><strong>clFinish</strong></a>: This function blocks until all previously enqueued commands |
| in the command queue have completed after which <a href="#clFinish"><strong>clFinish</strong></a> defines a |
| synchronization point and the <a href="#clFinish"><strong>clFinish</strong></a> function returns.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>A synchronization point between a pair of commands (A and B) assures that |
| the results of command A <em>happen-before</em> command B is launched. |
| This requires that any updates to memory from command A complete and are |
| made available to other commands before the synchronization point completes. |
| Likewise, this requires that command B waits until after the synchronization |
| point before loading values from global memory. |
| The concept of a synchronization point works in a similar fashion for |
| commands such as a barrier that apply to two sets of commands. |
| All the commands prior to the barrier must complete and make their results |
| available to following commands. |
| Furthermore, any commands following the barrier must wait for the commands |
| prior to the barrier before loading values and continuing their execution.</p> |
| </div> |
| <div class="paragraph"> |
| <p>These <em>happens-before</em> relationships are a fundamental part of the OpenCL 2.x |
| memory model. |
| When applied at the level of commands, they are straightforward to define at |
| a language level in terms of ordering relationships between different |
| commands. |
| Ordering memory operations inside different commands, however, requires |
| rules more complex than can be captured by the high level concept of a |
| synchronization point. |
| These rules are described in detail in <a href="#memory-ordering-rules">Memory |
| Ordering Rules</a>.</p> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="_categories_of_kernels"><a class="anchor" href="#_categories_of_kernels"></a>3.2.5. Categories of Kernels</h4> |
| <div class="paragraph"> |
| <p>The OpenCL execution model supports three types of kernels:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>OpenCL kernels</strong> are managed by the OpenCL API as kernel objects |
| associated with kernel functions within program objects. |
| OpenCL program objects are created and built using OpenCL APIs. |
| The OpenCL API includes functions to query the kernel languages and |
| intermediate languages that may be used to create OpenCL program |
| objects for a device.</p> |
| </li> |
| <li> |
| <p><strong>Native kernels</strong> are accessed through a host function pointer. |
| Native kernels are queued for execution along with OpenCL kernels on a |
| device and share memory objects with OpenCL kernels. |
| For example, these native kernels could be functions defined in |
| application code or exported from a library. |
| The ability to execute native kernels is optional within OpenCL and the |
| semantics of native kernels are implementation-defined. |
| The OpenCL API includes functions to query capabilities of a device |
| to determine if this capability is supported.</p> |
| </li> |
| <li> |
| <p><strong>Built-in kernels</strong> are tied to a particular device and are not built at
| runtime from source code in a program object.
| The common use of built-in kernels is to expose fixed-function hardware
| or firmware associated with a particular OpenCL device or custom device.
| The semantics of a built-in kernel may be defined outside of OpenCL and
| hence are implementation-defined.
| Note: Built-in kernels are <a href="#unified-spec">missing before</a> version 1.2.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>All three types of kernels are manipulated through the OpenCL command queues |
| and must conform to the synchronization points defined in the OpenCL |
| execution model.</p> |
| </div> |
| </div> |
| </div> |
| <div class="sect2"> |
| <h3 id="_memory_model"><a class="anchor" href="#_memory_model"></a>3.3. Memory Model</h3> |
| <div class="paragraph"> |
| <p>The OpenCL memory model describes the structure, contents, and behavior of |
| the memory exposed by an OpenCL platform as an OpenCL program runs. |
| The model allows a programmer to reason about values in memory as the host |
| program and multiple kernel-instances execute.</p> |
| </div> |
| <div class="paragraph"> |
| <p>An OpenCL program defines a context that includes a host, one or more |
| devices, command-queues, and memory exposed within the context. |
| Consider the units of execution involved with such a program. |
| The host program runs as one or more host threads managed by the operating |
| system running on the host (the details of which are defined outside of |
| OpenCL). |
| There may be multiple devices in a single context which all have access to |
| memory objects defined by OpenCL. |
| On a single device, multiple work-groups may execute in parallel with |
| potentially overlapping updates to memory. |
| Finally, within a single work-group, multiple work-items concurrently |
| execute, once again with potentially overlapping updates to memory.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The memory model must precisely define how the values in memory as seen from |
| each of these units of execution interact so a programmer can reason about |
| the correctness of OpenCL programs. |
| We define the memory model in four parts.</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>Memory regions: The distinct memories visible to the host and the |
| devices that share a context.</p> |
| </li> |
| <li> |
| <p>Memory objects: The objects defined by the OpenCL API and their |
| management by the host and devices.</p> |
| </li> |
| <li> |
| <p>Shared Virtual Memory: A virtual address space exposed to both the host |
| and the devices within a context. |
| Note: SVM is <a href="#unified-spec">missing before</a> version 2.0.</p> |
| </li> |
| <li> |
| <p>Consistency Model: Rules that define which values are observed when |
| multiple units of execution load data from memory plus the atomic/fence |
| operations that constrain the order of memory operations and define |
| synchronization relationships.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="sect3"> |
| <h4 id="_fundamental_memory_regions"><a class="anchor" href="#_fundamental_memory_regions"></a>3.3.1. Fundamental Memory Regions</h4> |
| <div class="paragraph"> |
| <p>Memory in OpenCL is divided into two parts.</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>Host Memory:</strong> The memory directly available to the host. |
| The detailed behavior of host memory is defined outside of OpenCL. |
| Memory objects move between the Host and the devices through functions |
| within the OpenCL API or through a shared virtual memory interface.</p> |
| </li> |
| <li> |
| <p><strong>Device Memory:</strong> Memory directly available to kernels executing on |
| OpenCL devices.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Device memory consists of four named address spaces or <em>memory regions</em>:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>Global Memory:</strong> This memory region permits read/write access to all |
| work-items in all work-groups running on any device within a context. |
| Work-items can read from or write to any element of a memory object. |
| Reads and writes to global memory may be cached depending on the |
| capabilities of the device.</p> |
| </li> |
| <li> |
| <p><strong>Constant Memory</strong>: A region of global memory that remains constant |
| during the execution of a kernel-instance. |
| The host allocates and initializes memory objects placed into constant |
| memory.</p> |
| </li> |
| <li> |
| <p><strong>Local Memory</strong>: A memory region local to a work-group. |
| This memory region can be used to allocate variables that are shared by |
| all work-items in that work-group.</p> |
| </li> |
| <li> |
| <p><strong>Private Memory</strong>: A region of memory private to a work-item. |
| Variables defined in one work-item’s private memory are not visible to
| another work-item.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>The <a href="#memory-regions-image">memory regions</a> and their relationship to the |
| OpenCL Platform model are summarized below. |
| Local and private memories are always associated with a particular device. |
| The global and constant memories, however, are shared between all devices |
| within a given context. |
| An OpenCL device may include a cache to support efficient access to these |
| shared memories.</p> |
| </div> |
| <div class="paragraph"> |
| <p>To understand memory in OpenCL, it is important to appreciate the |
| relationships between these named address spaces. |
| The four named address spaces available to a device are disjoint, meaning
| they do not overlap.
| This is a logical relationship, however, and an implementation may choose to |
| let these disjoint named address spaces share physical memory.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Programmers often need functions callable from kernels where the pointers |
| manipulated by those functions can point to multiple named address spaces. |
| This saves a programmer from the error-prone and wasteful practice of |
| creating multiple copies of functions, one for each named address space.
| Therefore the global, local and private address spaces belong to a single |
| <em>generic address space</em>. |
| This is closely modeled after the concept of a generic address space used in |
| the embedded C standard (ISO/IEC 9899:1999). |
| Since they all belong to a single generic address space, the following |
| properties are supported for pointers to named address spaces in device |
| memory:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>A pointer to the generic address space can be cast to a pointer to a |
| global, local or private address space.</p>
| </li> |
| <li> |
| <p>A pointer to a global, local or private address space can be cast to a |
| pointer to the generic address space.</p> |
| </li> |
| <li> |
| <p>A pointer to a global, local or private address space can be implicitly |
| converted to a pointer to the generic address space, but the converse is |
| not allowed.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>The constant address space is disjoint from the generic address space.</p> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| The generic address space is <a href="#unified-spec">missing before</a> version |
| 2.0. |
| </td> |
| </tr> |
| </table> |
| </div> |
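| <div class="paragraph">
| <p>The conversion rules above can be sketched in a few lines.
| This is a minimal illustration, not normative text: the function names are
| hypothetical, and the preprocessor fallback that defines the
| <code>global</code> and <code>local</code> qualifiers away is an assumption
| added only so that the same source also builds as plain C on a host.
| Under an OpenCL C 2.0 compiler the unqualified pointer parameter refers to
| the generic address space.</p>
| </div>

```c
/* Plain-C fallback (an assumption for illustration only): outside an
   OpenCL C compiler the address-space qualifiers are defined away. */
#ifndef __OPENCL_VERSION__
#define global
#define local
#endif

/* Unqualified pointer parameter: in OpenCL C 2.0 this points to the
   generic address space, so one copy of the function serves all callers. */
float sum_generic(const float *p, int n)
{
    float total = 0.0f;
    for (int i = 0; i < n; ++i)
        total += p[i];
    return total;
}

/* global -> generic: implicit conversion, no cast required. */
float sum_global(global const float *g, int n)
{
    return sum_generic(g, n);
}

/* local -> generic: implicit conversion, no cast required. */
float sum_local(local const float *l, int n)
{
    return sum_generic(l, n);
}

/* generic -> global: needs an explicit cast, and the caller must know
   that p really refers to global memory. */
float first_global(const float *p)
{
    global const float *g = (global const float *)p;
    return g[0];
}
```

| <div class="paragraph">
| <p>Without a generic address space, <code>sum_generic</code> would have to be
| duplicated once per named address space.
| A pointer to the constant address space cannot be passed to it, because
| that address space is disjoint from the generic one.</p>
| </div>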
| <div class="paragraph"> |
| <p>The addresses of memory associated with memory objects in Global memory are |
| not preserved between kernel instances, between a device and the host, and |
| between devices. |
| In this regard global memory acts as a global pool of memory objects rather |
| than an address space. |
| This restriction is relaxed when shared virtual memory (SVM) is used.</p> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Shared virtual memory is <a href="#unified-spec">missing before</a> version 2.0. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>SVM causes addresses to be meaningful between the host and all of the |
| devices within a context, hence supporting the use of pointer-based data
| structures in OpenCL kernels.
| It logically extends a portion of the global memory into the host address |
| space giving work-items access to the host address space. |
| On platforms with hardware support for a shared address space between the |
| host and one or more devices, SVM may also provide a more efficient way to |
| share data between devices and the host. |
| Details about SVM are presented in <a href="#shared-virtual-memory">Shared Virtual |
| Memory</a>.</p> |
| </div> |
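| <div class="paragraph">
| <p>One practical consequence of addresses not being preserved for non-SVM
| memory objects is that linked structures stored in a buffer are commonly
| expressed with indices rather than pointers.
| The following host-side sketch illustrates the idea in plain C; the type and
| function names are hypothetical, and <code>memcpy</code> merely stands in for
| a transfer that places the same bytes at a different base address.</p>
| </div>

```c
#include <string.h>

#define LIST_END (-1)

/* A list node that links by array index instead of by address, so the
   structure stays valid when its bytes are copied to another location. */
typedef struct {
    int value;
    int next;   /* index of the next node, or LIST_END */
} node;

/* Sum the values reachable from node `head` inside `pool`. */
int list_sum(const node *pool, int head)
{
    int total = 0;
    for (int i = head; i != LIST_END; i = pool[i].next)
        total += pool[i].value;
    return total;
}

/* Stand-in for a host/device transfer: the bytes arrive at a different
   base address, as they may for a non-SVM buffer. */
void copy_pool(node *dst, const node *src, int count)
{
    memcpy(dst, src, (size_t)count * sizeof(node));
}
```

| <div class="paragraph">
| <p>A list built from raw pointers would still refer to the source addresses
| after the copy.
| With SVM, addresses are meaningful on both sides, so ordinary pointers can
| be used instead of indices.</p>
| </div>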
| <div id="memory-regions-image" class="imageblock text-center"> |
| <div class="content"> |
| <img src="" alt="memory regions"> |
| </div> |
| <div class="title">Figure 4. The named address spaces exposed in an OpenCL Platform. Global and Constant memories are shared between the one or more devices within a context, while local and private memories are associated with a single device. Each device may include an optional cache to support efficient access to its view of the global and constant address spaces.</div>
| </div> |
| <div class="paragraph"> |
| <p>A programmer may use the features of the <a href="#memory-consistency-model">memory |
| consistency model</a> to manage safe access to global memory from multiple |
| work-items potentially running on one or more devices. |
| In addition, when using shared virtual memory (SVM), the memory consistency |
| model may also be used to ensure that host threads safely access memory |
| locations in the shared memory region.</p> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="_memory_objects"><a class="anchor" href="#_memory_objects"></a>3.3.2. Memory Objects</h4> |
| <div class="paragraph"> |
| <p>The contents of global memory are <em>memory objects</em>. |
| A memory object is a handle to a reference counted region of global memory. |
| Memory objects use the OpenCL type <em>cl_mem</em> and fall into three distinct |
| classes.</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>Buffer</strong>: A memory object stored as a block of contiguous memory and |
| used as a general purpose object to hold data used in an OpenCL program. |
| The types of the values within a buffer may be any of the built-in types
| (such as int, float), vector types, or user-defined structures. |
| The buffer can be manipulated through pointers much as one would with |
| any block of memory in C.</p> |
| </li> |
| <li> |
| <p><strong>Image</strong>: An image memory object holds one-, two-, or three-dimensional
| images. |
| The formats are based on the standard image formats used in graphics |
| applications. |
| An image is an opaque data structure managed by functions defined in the |
| OpenCL API. |
| To optimize the manipulation of images stored in the texture memories |
| found in many GPUs, OpenCL kernels have traditionally been disallowed |
| from both reading and writing a single image. |
| In OpenCL 2.0, however, we have relaxed this restriction by providing |
| synchronization and fence operations that let programmers properly |
| synchronize their code to safely allow a kernel to read and write a |
| single image.</p> |
| </li> |
| <li> |
| <p><strong>Pipe</strong>: The <em>pipe</em> memory object conceptually is an ordered sequence of |
| data items. |
| A pipe has two endpoints: a write endpoint into which data items are |
| inserted, and a read endpoint from which data items are removed. |
| At any one time, only one kernel instance may write into a pipe, and |
| only one kernel instance may read from a pipe. |
| To support the producer consumer design pattern, one kernel instance |
| connects to the write endpoint (the producer) while another kernel |
| instance connects to the reading endpoint (the consumer). |
| Note: The <em>pipe</em> memory object is <a href="#unified-spec">missing before</a> |
| version 2.0.</p> |
| </li> |
| </ul> |
| </div> |
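| <div class="paragraph">
| <p>The pipe described above behaves conceptually like a bounded
| first-in, first-out queue with one write endpoint and one read endpoint.
| The sketch below is a single-threaded host-side model in plain C, not the
| OpenCL pipe API; the names, the fixed capacity, and the <code>int</code>
| packet type are assumptions chosen for brevity.</p>
| </div>

```c
#include <stddef.h>

#define PIPE_CAPACITY 4

/* Host-side model of a pipe: a bounded FIFO of int packets. */
typedef struct {
    int    packets[PIPE_CAPACITY];
    size_t read_pos;   /* position of the oldest packet (read endpoint) */
    size_t count;      /* number of packets currently stored            */
} pipe_model;

/* Write endpoint: returns 0 on success, -1 if the pipe is full. */
int pipe_model_write(pipe_model *p, int packet)
{
    if (p->count == PIPE_CAPACITY)
        return -1;
    p->packets[(p->read_pos + p->count) % PIPE_CAPACITY] = packet;
    p->count++;
    return 0;
}

/* Read endpoint: returns 0 and stores the oldest packet, -1 if empty. */
int pipe_model_read(pipe_model *p, int *packet)
{
    if (p->count == 0)
        return -1;
    *packet = p->packets[p->read_pos];
    p->read_pos = (p->read_pos + 1) % PIPE_CAPACITY;
    p->count--;
    return 0;
}
```

| <div class="paragraph">
| <p>In the producer-consumer pattern, the producer kernel instance would only
| call the write operation and the consumer kernel instance only the read
| operation; data items leave the pipe in the order they were inserted.</p>
| </div>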
| <div class="paragraph"> |
| <p>Memory objects are allocated by host APIs. |
| The host program can provide the runtime with a pointer to a block of |
| contiguous memory to hold the memory object when the object is created
| (<a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a>). |
| Alternatively, the physical memory can be managed by the OpenCL runtime and |
| not be directly accessible to the host program.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Allocation of and access to memory objects within the different memory regions
| vary between the host and work-items running on a device.
| This is summarized in the <a href="#memory-regions-table">Memory Regions</a> table, |
| which describes whether the kernel or the host can allocate from a memory |
| region, the type of allocation (static at compile time vs. |
| dynamic at runtime) and the type of access allowed (i.e. whether the kernel |
| or the host can read and/or write to a memory region).</p> |
| </div> |
| <table id="memory-regions-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 1. Memory Regions</caption> |
| <colgroup> |
| <col style="width: 12.5%;"> |
| <col style="width: 12.5%;"> |
| <col style="width: 18.75%;"> |
| <col style="width: 18.75%;"> |
| <col style="width: 18.75%;"> |
| <col style="width: 18.75%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top"></th> |
| <th class="tableblock halign-left valign-top"></th> |
| <th class="tableblock halign-left valign-top">Global</th> |
| <th class="tableblock halign-left valign-top">Constant</th> |
| <th class="tableblock halign-left valign-top">Local</th> |
| <th class="tableblock halign-left valign-top">Private</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top" rowspan="2"><p class="tableblock"><strong>Host</strong></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Allocation</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Dynamic</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Dynamic</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Dynamic</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">None</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Access</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Read/Write to Buffers and Images, but not Pipes</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Read/Write</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">None</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">None</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top" rowspan="2"><p class="tableblock"><strong>Kernel</strong></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Allocation</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Static (program scope variables)</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Static (program scope variables)</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Static for parent kernel, |
| Dynamic for child kernels</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Static</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Access</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Read/Write</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Read-only</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Read/Write, |
| No access to child kernel memory</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Read/Write</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="paragraph"> |
| <p>The <a href="#memory-regions-table">Memory Regions</a> table shows the different |
| memory regions in OpenCL and how memory objects are allocated and accessed |
| by the host and by an executing instance of a kernel. |
| For kernels, we distinguish between the behavior of local memory |
| for a parent kernel and its child kernels.</p> |
| </div> |
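| <div class="paragraph">
| <p>The "Access" rows of the Memory Regions table can be encoded as a small
| lookup, which is sometimes convenient when validating how an application
| intends to use each region.
| This is only a sketch: the enumerators and helper functions are hypothetical
| names and are not part of the OpenCL API.</p>
| </div>

```c
typedef enum {
    REGION_GLOBAL, REGION_CONSTANT, REGION_LOCAL, REGION_PRIVATE,
    REGION_COUNT
} region;

typedef enum {
    ACCESS_NONE = 0, ACCESS_READ = 1, ACCESS_WRITE = 2, ACCESS_READ_WRITE = 3
} access_mode;

/* "Access" row for the host in the Memory Regions table. */
static const access_mode host_access[REGION_COUNT] = {
    ACCESS_READ_WRITE, /* global: buffers and images, but not pipes */
    ACCESS_READ_WRITE, /* constant */
    ACCESS_NONE,       /* local */
    ACCESS_NONE        /* private */
};

/* "Access" row for a kernel in the Memory Regions table. */
static const access_mode kernel_access[REGION_COUNT] = {
    ACCESS_READ_WRITE, /* global */
    ACCESS_READ,       /* constant: read-only */
    ACCESS_READ_WRITE, /* local: no access to child kernel memory */
    ACCESS_READ_WRITE  /* private */
};

/* Returns 1 if the host may perform `wanted` on region `r`, else 0. */
int host_can(region r, access_mode wanted)
{
    return ((int)host_access[r] & (int)wanted) == (int)wanted;
}

/* Returns 1 if a kernel may perform `wanted` on region `r`, else 0. */
int kernel_can(region r, access_mode wanted)
{
    return ((int)kernel_access[r] & (int)wanted) == (int)wanted;
}
```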
| <div class="paragraph"> |
| <p>Once allocated, a memory object is made available to kernel-instances |
| running on one or more devices. |
| In addition to <a href="#shared-virtual-memory">Shared Virtual Memory</a>, there are |
| three basic ways to manage the contents of buffers between the host and |
| devices.</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>Read/Write/Fill commands</strong>: The data associated with a memory object is |
| explicitly read and written between the host and global memory regions |
| using commands enqueued to an OpenCL command queue. |
| Note: Fill commands are <a href="#unified-spec">missing before</a> version 1.2.</p> |
| </li> |
| <li> |
| <p><strong>Map/Unmap commands</strong>: Data from the memory object is mapped into a |
| contiguous block of memory accessed through a host accessible pointer. |
| The host program enqueues a <em>map</em> command on a block of a memory object
| before it can be safely manipulated by the host program. |
| When the host program is finished working with the block of memory, the |
| host program enqueues an <em>unmap</em> command to allow a kernel-instance to |
| safely read and/or write the buffer.</p> |
| </li> |
| <li> |
| <p><strong>Copy commands:</strong> The data associated with a memory object is copied |
| between two buffers, each of which may reside either on the host or on |
| the device.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Read, Write, and Map commands can be blocking or non-blocking operations.
| The OpenCL function call for a blocking memory transfer returns once the
| command (memory transfer) has completed.
| At this point the associated memory resources on the host can be safely
| reused, and subsequent operations on the host can rely on the transfer
| having completed.
| For a non-blocking memory transfer, the OpenCL function call returns as soon |
| as the command is enqueued.</p> |
| </div> |
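| <div class="paragraph">
| <p>The difference between the two modes can be made concrete with a toy
| model of an in-order queue.
| The code below does not call the OpenCL API: every <code>sim_</code> name is
| hypothetical, and the model deliberately defers all work until the queue is
| finished so that the hazard of touching host memory too early is visible.
| A real implementation may start a non-blocking transfer at any time after it
| is enqueued.</p>
| </div>

```c
#include <string.h>
#include <stddef.h>

#define SIM_QUEUE_DEPTH 8

typedef struct { int complete; } sim_event;

typedef struct {
    void       *dst;
    const void *src;
    size_t      size;
    sim_event  *event;
} sim_command;

typedef struct {
    sim_command pending[SIM_QUEUE_DEPTH];
    int         count;
} sim_queue;

/* Execute every pending command in order and mark its event complete. */
void sim_finish(sim_queue *q)
{
    for (int i = 0; i < q->count; ++i) {
        memcpy(q->pending[i].dst, q->pending[i].src, q->pending[i].size);
        q->pending[i].event->complete = 1;
    }
    q->count = 0;
}

/* Enqueue a read. Returns 0 on success, -1 if the queue is full.
   blocking != 0: return only after the transfer has completed.
   blocking == 0: return as soon as the command is enqueued. */
int sim_enqueue_read(sim_queue *q, int blocking, void *dst,
                     const void *src, size_t size, sim_event *event)
{
    if (q->count == SIM_QUEUE_DEPTH)
        return -1;
    event->complete = 0;
    q->pending[q->count].dst   = dst;
    q->pending[q->count].src   = src;
    q->pending[q->count].size  = size;
    q->pending[q->count].event = event;
    q->count++;
    if (blocking)
        sim_finish(q);   /* in-order queue: earlier commands finish too */
    return 0;
}
```

| <div class="paragraph">
| <p>After a non-blocking call the destination must not be read until the
| associated event reports completion; after a blocking call it can be used
| immediately.</p>
| </div>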
| <div class="paragraph"> |
| <p>Memory objects are bound to a context and hence can appear in multiple |
| kernel-instances running on more than one physical device. |
| The OpenCL platform must support a large range of hardware platforms |
| including systems that do not support a single shared address space in |
| hardware; hence the ways memory objects can be shared between
| kernel-instances are restricted.
| The basic principle is that multiple read operations on memory objects from |
| multiple kernel-instances that overlap in time are allowed, but mixing |
| overlapping reads and writes into the same memory objects from different |
| kernel instances is only allowed when fine grained synchronization is used |
| with <a href="#shared-virtual-memory">Shared Virtual Memory</a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>When global memory is manipulated by multiple kernel-instances running on |
| multiple devices, the OpenCL runtime system must manage the association of |
| memory objects with a given device. |
| In most cases the OpenCL runtime will implicitly associate a memory object |
| with a device. |
| A kernel instance is naturally associated with the command queue to which |
| the kernel was submitted. |
| Since a command-queue can only access a single device, the queue uniquely |
| defines which device is involved with any given kernel-instance; hence |
| defining a clear association between memory objects, kernel-instances and |
| devices. |
| Programmers may anticipate these associations in their programs and |
| explicitly manage association of memory objects with devices in order to |
| improve performance.</p> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="shared-virtual-memory"><a class="anchor" href="#shared-virtual-memory"></a>3.3.3. Shared Virtual Memory</h4> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| Shared virtual memory is <a href="#unified-spec">missing before</a> |
| version 2.0. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>OpenCL extends the global memory region into the host memory region through |
| a shared virtual memory (SVM) mechanism. |
| There are three types of SVM in OpenCL:</p>
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>Coarse-Grained buffer SVM</strong>: Sharing occurs at the granularity of |
| regions of OpenCL buffer memory objects. |
| Consistency is enforced at synchronization points and with map/unmap |
| commands to drive updates between the host and the device. |
| This form of SVM is similar to non-SVM use of memory; however, it lets |
| kernel-instances share pointer-based data structures (such as |
| linked-lists) with the host program. |
| Program scope global variables are treated as per-device coarse-grained |
| SVM for addressing and sharing purposes.</p> |
| </li> |
| <li> |
| <p><strong>Fine-Grained buffer SVM</strong>: Sharing occurs at the granularity of |
| individual loads/stores into bytes within OpenCL buffer memory objects. |
| Loads and stores may be cached. |
| This means consistency is guaranteed at synchronization points. |
| If the optional OpenCL atomics are supported, they can be used to |
| provide fine-grained control of memory consistency.</p> |
| </li> |
| <li> |
| <p><strong>Fine-Grained system SVM</strong>: Sharing occurs at the granularity of |
| individual loads/stores into bytes occurring anywhere within the host |
| memory. |
| Loads and stores may be cached so consistency is guaranteed at |
| synchronization points. |
| If the optional OpenCL atomics are supported, they can be used to |
| provide fine-grained control of memory consistency.</p> |
| </li> |
| </ul> |
| </div> |
| <table id="svm-summary-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 2. A summary of shared virtual memory (SVM) options in OpenCL</caption> |
| <colgroup> |
| <col style="width: 20%;"> |
| <col style="width: 20%;"> |
| <col style="width: 20%;"> |
| <col style="width: 20%;"> |
| <col style="width: 20%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-center valign-top"></th> |
| <th class="tableblock halign-center valign-top">Granularity of sharing</th> |
| <th class="tableblock halign-center valign-top">Memory Allocation</th> |
| <th class="tableblock halign-center valign-top">Mechanisms to enforce Consistency</th> |
| <th class="tableblock halign-center valign-top">Explicit updates between host and device</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">Non-SVM buffers</p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">OpenCL Memory objects (buffer)</p></td>
| <td class="tableblock halign-center valign-top"><p class="tableblock"><a href="#clCreateBuffer"><strong>clCreateBuffer</strong></a><br> |
| <a href="#clCreateBufferWithProperties"><strong>clCreateBufferWithProperties</strong></a></p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">Host synchronization points on the same or between devices.</p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">yes, through Map and Unmap commands.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">Coarse-Grained buffer SVM</p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">OpenCL Memory objects (buffer)</p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock"><a href="#clSVMAlloc"><strong>clSVMAlloc</strong></a></p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">Host synchronization points between devices</p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">yes, through Map and Unmap commands.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">Fine-Grained buffer SVM</p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">Bytes within OpenCL Memory objects (buffer)</p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock"><a href="#clSVMAlloc"><strong>clSVMAlloc</strong></a></p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">Synchronization points plus atomics (if supported)</p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">No</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">Fine-Grained system SVM</p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">Bytes within Host memory (system)</p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">Host memory allocation mechanisms (e.g. malloc)</p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">Synchronization points plus atomics (if supported)</p></td> |
| <td class="tableblock halign-center valign-top"><p class="tableblock">No</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="paragraph"> |
| <p>Coarse-Grained buffer SVM is required in the core OpenCL specification. |
| The two finer grained approaches are optional features in OpenCL. |
| The various SVM mechanisms to access host memory from the work-items |
| associated with a kernel instance are <a href="#svm-summary-table">summarized |
| above</a>.</p> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="_memory_consistency_model_for_opencl_1_x"><a class="anchor" href="#_memory_consistency_model_for_opencl_1_x"></a>3.3.4. Memory Consistency Model for OpenCL 1.x</h4> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| This memory consistency model is <a href="#unified-spec">deprecated |
| by</a> version 2.0. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>OpenCL 1.x uses a relaxed consistency memory model; i.e. the state of memory |
| visible to a work-item is not guaranteed to be consistent across the collection |
| of work-items at all times.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Within a work-item, memory has load/store consistency.
| Local memory is consistent across work-items in a single work-group at a |
| work-group barrier. |
| Global memory is consistent across work-items in a single work-group at a |
| work-group barrier, but there are no guarantees of memory consistency between |
| different work-groups executing a kernel.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Memory consistency for memory objects shared between enqueued commands is |
| enforced at a synchronization point.</p> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="memory-consistency-model"><a class="anchor" href="#memory-consistency-model"></a>3.3.5. Memory Consistency Model for OpenCL 2.x</h4> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| This memory consistency model is <a href="#unified-spec">missing |
| before</a> version 2.0. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>The OpenCL 2.x memory model tells programmers what they can expect from an
| OpenCL 2.x implementation: which memory operations are guaranteed to happen in
| which order and which memory values each read operation will return.
| The memory model tells compiler writers which restrictions they must follow
| when implementing compiler optimizations: which variables they can cache in
| registers and when they can move reads or writes around a barrier or atomic
| operation.
| The memory model also tells hardware designers about limitations on hardware |
| optimizations; for example, when they must flush or invalidate hardware |
| caches.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The memory consistency model in OpenCL 2.x is based on the memory model from |
| the ISO C11 programming language. |
| To help make the presentation more precise and self-contained, we include
| paragraphs taken, verbatim or with modification, from the ISO C11 international standard.
| When a paragraph is taken or modified from the C11 standard, it is |
| identified as such along with its original location in the <a href="#iso-c11">C11 |
| standard</a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>For programmers, the most intuitive model is the <em>sequential consistency</em> |
| memory model. |
| Sequential consistency interleaves the steps executed by each of the units |
| of execution. |
| Each access to a memory location sees the last assignment to that location |
| in that interleaving. |
| While sequential consistency is relatively straightforward for a programmer |
| to reason about, implementing sequential consistency is expensive. |
| Therefore, OpenCL 2.x implements a relaxed memory consistency model; i.e. it is |
| possible to write programs where the loads from memory violate sequential |
| consistency. |
| Fortunately, if a program does not contain any races and if the program only |
| uses atomic operations that utilize the sequentially consistent memory order |
| (the default memory ordering for OpenCL 2.x), OpenCL programs appear to execute |
| with sequential consistency.</p> |
| </div> |
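| <div class="paragraph">
| <p>The interleaving view of sequential consistency can be checked by brute
| force on a classic two-variable example.
| Unit A executes <code>x = 1; r1 = y;</code> and unit B executes
| <code>y = 1; r2 = x;</code>, with both variables initially zero.
| The sketch below enumerates every interleaving of the four steps and reports
| which final values of <code>r1</code> and <code>r2</code> can occur.
| The function names are hypothetical, and the program models the definition
| only; it does not run on multiple threads or devices.</p>
| </div>

```c
/* Unit A: x = 1; r1 = y;      Unit B: y = 1; r2 = x;
   x and y start at 0. */

/* Run one interleaving. Bit i of `schedule` (i = 0..3) selects which unit
   takes step i: 1 for A, 0 for B. Exactly two bits must be set. */
static void run_interleaving(unsigned schedule, int *r1, int *r2)
{
    int x = 0, y = 0;
    int a_step = 0, b_step = 0;
    for (int i = 0; i < 4; ++i) {
        if (schedule & (1u << i)) {
            if (a_step++ == 0) x = 1; else *r1 = y;
        } else {
            if (b_step++ == 0) y = 1; else *r2 = x;
        }
    }
}

/* Number of set bits among the low four bits of v. */
static int popcount4(unsigned v)
{
    return (int)((v & 1u) + ((v >> 1) & 1u) + ((v >> 2) & 1u) + ((v >> 3) & 1u));
}

/* Returns 1 if some sequentially consistent interleaving ends with
   r1 == want_r1 and r2 == want_r2, otherwise 0. */
int sc_outcome_possible(int want_r1, int want_r2)
{
    for (unsigned schedule = 0; schedule < 16; ++schedule) {
        int r1 = -1, r2 = -1;
        if (popcount4(schedule) != 2)
            continue;   /* each unit must take exactly two steps */
        run_interleaving(schedule, &r1, &r2);
        if (r1 == want_r1 && r2 == want_r2)
            return 1;
    }
    return 0;
}
```

| <div class="paragraph">
| <p>No interleaving yields <code>r1 == 0</code> and <code>r2 == 0</code>, so a
| program that observes that result is not executing with sequential
| consistency.
| Under a relaxed model such an outcome can be observed unless the program is
| race-free and uses sequentially consistent atomic operations.</p>
| </div>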
| <div class="paragraph"> |
| <p>Programmers can, to some degree, control how the memory model is relaxed by
| choosing the memory order for synchronization operations. |
| The precise semantics of synchronization and the memory orders are formally |
| defined in <a href="#memory-ordering-rules">Memory Ordering Rules</a>. |
| Here, we give a high level description of how these memory orders apply to |
| atomic operations on atomic objects shared between units of execution. |
| OpenCL 2.x memory_order choices are based on those from the ISO C11 standard |
| memory model. |
| They are specified in certain OpenCL functions through the following |
| enumeration constants:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>memory_order_relaxed</strong>: implies no order constraints. |
| This memory order can be used safely to increment counters that are |
| concurrently incremented, but it doesn’t guarantee anything about the |
| ordering with respect to operations to other memory locations. |
| It can also be used, for example, to do ticket allocation and by expert |
| programmers implementing lock-free algorithms.</p> |
| </li> |
| <li> |
| <p><strong>memory_order_acquire</strong>: A synchronization operation (fence or atomic |
| operation) that has acquire semantics "acquires" side effects from a release |
| operation that synchronizes with it: if an acquire synchronizes with a |
| release, the acquiring unit of execution will see all side effects |
| preceding that release (and possibly subsequent side effects). As part |
| of carefully-designed protocols, programmers can use an "acquire" to |
| safely observe the work of another unit of execution.</p> |
| </li> |
| <li> |
| <p><strong>memory_order_release</strong>: A synchronization operation (fence or atomic |
| operation) that has release semantics "releases" side effects to an |
| acquire operation that synchronizes with it. |
| All side effects that precede the release are included in the release. |
| As part of carefully-designed protocols, programmers can use a "release" |
| to make changes made in one unit of execution visible to other units of |
| execution.</p> |
| </li> |
| </ul> |
| </div> |
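| <div class="paragraph"> |
| <p>The counter use case for <strong>memory_order_relaxed</strong> can be sketched on the host side with C11 atomics, on which the OpenCL memory orders are based. |
| This is only an illustrative analogue, not OpenCL C kernel code: the names <code>run_relaxed_counter</code>, <code>counter_worker</code>, <code>NUM_UNITS</code>, and <code>INCREMENTS_PER_UNIT</code> are hypothetical, and POSIX threads stand in for units of execution.</p> |
| </div> |

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_UNITS 4
#define INCREMENTS_PER_UNIT 10000

/* Shared counter. Relaxed increments are still indivisible. */
static atomic_int shared_counter;

static void *counter_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS_PER_UNIT; ++i) {
        /* No ordering with respect to other memory locations is implied. */
        atomic_fetch_add_explicit(&shared_counter, 1, memory_order_relaxed);
    }
    return NULL;
}

/* Starts the workers, waits for them, and returns the final count. */
int run_relaxed_counter(void) {
    pthread_t units[NUM_UNITS];
    atomic_store_explicit(&shared_counter, 0, memory_order_relaxed);
    for (int i = 0; i < NUM_UNITS; ++i) {
        int rc = pthread_create(&units[i], NULL, counter_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_UNITS; ++i) {
        pthread_join(units[i], NULL);
    }
    /* pthread_join orders every worker's increments before this load. */
    return atomic_load_explicit(&shared_counter, memory_order_relaxed);
}
```

| <div class="paragraph"> |
| <p>No increment is lost because each read-modify-write is atomic, even though the relaxed order says nothing about accesses to other locations.</p> |
| </div> |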
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| In general, an acquire is not required to synchronize with any particular |
| release in every execution. |
| However, certain executions force synchronization. |
| See the description of <a href="#memory-ordering-fence">Fence Operations</a> for |
| detailed rules for when synchronization must occur. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>memory_order_acq_rel</strong>: A synchronization operation with acquire-release |
| semantics has the properties of both the acquire and release memory |
| orders. |
| It is typically used to order read-modify-write operations.</p> |
| </li> |
| <li> |
| <p><strong>memory_order_seq_cst</strong>: The loads and stores of each unit of execution |
| appear to execute in program (i.e., sequenced-before) order, and the |
| loads and stores from different units of execution appear to be simply |
| interleaved.</p> |
| </li> |
| </ul> |
| </div> |
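| <div class="paragraph"> |
| <p>A release/acquire hand-off of the kind described in the list above can be sketched with C11 atomics. |
| This is a host-side analogue under stated assumptions, not OpenCL C: the names <code>mp_payload</code>, <code>mp_ready</code>, <code>mp_producer</code>, <code>mp_consumer</code>, and <code>run_message_passing</code> are hypothetical, POSIX threads stand in for units of execution, and no memory scope is involved.</p> |
| </div> |

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int mp_payload;        /* plain, non-atomic data */
static atomic_int mp_ready;   /* atomic flag used for synchronization */

static void *mp_producer(void *arg) {
    (void)arg;
    mp_payload = 42;
    /* Release: the preceding write to mp_payload is included in the release. */
    atomic_store_explicit(&mp_ready, 1, memory_order_release);
    return NULL;
}

static void *mp_consumer(void *arg) {
    int *out = arg;
    /* Acquire: spin until the load reads the value written by the release. */
    while (atomic_load_explicit(&mp_ready, memory_order_acquire) == 0) {
    }
    /* The acquire synchronized with the release, so the payload is visible. */
    *out = mp_payload;
    return NULL;
}

/* Runs one producer and one consumer; returns what the consumer observed. */
int run_message_passing(void) {
    pthread_t producer, consumer;
    int observed = -1;
    mp_payload = 0;
    atomic_store_explicit(&mp_ready, 0, memory_order_relaxed);
    int rc = pthread_create(&consumer, NULL, mp_consumer, &observed);
    assert(rc == 0);
    rc = pthread_create(&producer, NULL, mp_producer, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(producer, NULL);
    pthread_join(consumer, NULL);
    return observed;
}
```

| <div class="paragraph"> |
| <p>If both operations used <strong>memory_order_relaxed</strong> instead, the non-atomic read of the payload would race with the write.</p> |
| </div> |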
| <div class="paragraph"> |
| <p>Regardless of which memory_order is specified, resolving constraints on |
| memory operations across a heterogeneous platform adds considerable overhead |
| to the execution of a program. |
| An OpenCL platform may be able to optimize certain operations that depend on |
| the features of the memory consistency model by restricting the scope of the |
| memory operations. |
| Distinct memory scopes are defined by the values of the memory_scope |
| enumeration constant:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>memory_scope_work_item</strong>: memory-ordering constraints only apply within |
| the work-item <sup class="footnote">[<a id="_footnoteref_1" class="footnote" href="#_footnotedef_1" title="View footnote.">1</a>]</sup>.</p> |
| </li> |
| <li> |
| <p><strong>memory_scope_sub_group</strong>: memory-ordering constraints only apply within |
| the sub-group.</p> |
| </li> |
| <li> |
| <p><strong>memory_scope_work_group</strong>: memory-ordering constraints only apply to |
| work-items executing within a single work-group.</p> |
| </li> |
| <li> |
| <p><strong>memory_scope_device</strong>: memory-ordering constraints only apply to |
| work-items executing on a single device.</p> |
| </li> |
| <li> |
| <p><strong>memory_scope_all_svm_devices</strong>: memory-ordering constraints apply to |
| work-items executing across multiple devices and (when using SVM) the |
| host. |
| A release performed with <strong>memory_scope_all_svm_devices</strong> to a buffer that |
| does not have the <a href="#CL_MEM_SVM_ATOMICS"><code>CL_MEM_<wbr>SVM_<wbr>ATOMICS</code></a> flag set will commit to at least |
| <strong>memory_scope_device</strong> visibility, with full synchronization of the |
| buffer at a queue synchronization point (e.g. an OpenCL event).</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>These memory scopes define a hierarchy of visibilities when analyzing the |
| ordering constraints of memory operations. |
| For example, if a programmer knows that a sequence of memory operations will |
| only be associated with a collection of work-items from a single work-group |
| (and hence will run on a single device), the implementation is spared the |
| overhead of managing the memory orders across other devices within the same |
| context. |
| This can substantially reduce overhead in a program. |
| All memory scopes are valid when used on global memory or local memory. |
| For local memory, all visibility is constrained to within a given work-group |
| and scopes wider than <strong>memory_scope_work_group</strong> carry no additional meaning.</p> |
| </div> |
| <div class="paragraph"> |
| <p>In the following subsections (leading up to <a href="#opencl-framework">OpenCL |
| Framework</a>), we will explain the synchronization constructs and detailed |
| rules needed to use the OpenCL 2.x relaxed memory models. |
| It is important to appreciate, however, that many programs do not benefit |
| from relaxed memory models. |
| Even expert programmers have a difficult time using atomics and fences to |
| write correct programs with relaxed memory models. |
| A large number of OpenCL programs can be written using a simplified memory |
| model. |
| This is accomplished by following these guidelines:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>Write programs that manage safe sharing of global memory objects through |
| the synchronization points defined by the command queues.</p> |
| </li> |
| <li> |
| <p>Restrict low-level synchronization inside work-groups to the work-group |
| functions such as barrier.</p> |
| </li> |
| <li> |
| <p>If you want sequential consistency behavior with system allocations or |
| fine-grain SVM buffers with atomics support, use only |
| <strong>memory_order_seq_cst</strong> operations with the scope |
| <strong>memory_scope_all_svm_devices</strong>.</p> |
| </li> |
| <li> |
| <p>If you want sequential consistency behavior when not using system |
| allocations or fine-grain SVM buffers with atomics support, use only |
| <strong>memory_order_seq_cst</strong> operations with the scope <strong>memory_scope_device</strong> |
| or <strong>memory_scope_all_svm_devices</strong>.</p> |
| </li> |
| <li> |
| <p>Ensure your program has no races.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>If these guidelines are followed in your OpenCL programs, you can skip the |
| detailed rules behind the relaxed memory models and go directly to |
| <a href="#opencl-framework">OpenCL Framework</a>.</p> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="_overview_of_atomic_and_fence_operations"><a class="anchor" href="#_overview_of_atomic_and_fence_operations"></a>3.3.6. Overview of atomic and fence operations</h4> |
| <div class="paragraph"> |
| <p>OpenCL 2.x has a number of <em>synchronization operations</em> that are used to define |
| memory order constraints in a program. |
| They play a special role in controlling how memory operations in one unit of |
| execution (such as a work-item or, when using SVM, a host thread) are made |
| visible to another. |
| There are two types of synchronization operations in OpenCL: <em>atomic |
| operations</em> and <em>fences</em>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Atomic operations are indivisible. |
| They either occur completely or not at all. |
| These operations are used to order memory operations between units of |
| execution and hence they are parameterized with the memory_order and |
| memory_scope parameters defined by the OpenCL memory consistency model. |
| The atomic operations for OpenCL kernel languages are similar to the |
| corresponding operations defined by the C11 standard.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The OpenCL 2.x atomic operations apply to variables of an atomic type (a |
| subset of those in the C11 standard) including atomic versions of the int, |
| uint, long, ulong, float, double, half, intptr_t, uintptr_t, size_t, and |
| ptrdiff_t types. |
| However, support for some of these atomic types depends on support for the |
| corresponding regular types.</p> |
| </div> |
| <div class="paragraph"> |
| <p>An atomic operation on one or more memory locations is either an acquire |
| operation, a release operation, or both an acquire and release operation. |
| An atomic operation without an associated memory location is a fence and can |
| be either an acquire fence, a release fence, or both an acquire and release |
| fence. |
| In addition, there are relaxed atomic operations, which do not have |
| synchronization properties, and atomic read-modify-write operations, which |
| have special characteristics. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.4, paragraph 5, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>The orders <strong>memory_order_acquire</strong> (used for reads), <strong>memory_order_release</strong> |
| (used for writes), and <strong>memory_order_acq_rel</strong> (used for read-modify-write |
| operations) are used for simple communication between units of execution |
| using shared variables. |
| Informally, executing a <strong>memory_order_release</strong> on an atomic object A makes |
| all previous side effects visible to any unit of execution that later |
| executes a <strong>memory_order_acquire</strong> on A. |
| The orders <strong>memory_order_acquire</strong>, <strong>memory_order_release</strong>, and |
| <strong>memory_order_acq_rel</strong> do not provide sequential consistency for race-free |
| programs because they will not ensure that atomic stores followed by atomic |
| loads become visible to other units of execution in that order.</p> |
| </div> |
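| <div class="paragraph"> |
| <p>The difference can be illustrated with the classic "store buffering" pattern, sketched here with C11 atomics as a host-side analogue. |
| Each unit stores to one atomic and then loads the other. |
| With <strong>memory_order_seq_cst</strong>, the outcome in which both loads return 0 is forbidden; with release stores and acquire loads it would be permitted. |
| The names <code>sb_x</code>, <code>sb_y</code>, <code>sb_unit_a</code>, <code>sb_unit_b</code>, and <code>count_forbidden_outcomes</code> are hypothetical, and POSIX threads stand in for units of execution.</p> |
| </div> |

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int sb_x, sb_y;
static int sb_r1, sb_r2;   /* results, read only after both units are joined */

static void *sb_unit_a(void *arg) {
    (void)arg;
    atomic_store_explicit(&sb_x, 1, memory_order_seq_cst);
    sb_r1 = atomic_load_explicit(&sb_y, memory_order_seq_cst);
    return NULL;
}

static void *sb_unit_b(void *arg) {
    (void)arg;
    atomic_store_explicit(&sb_y, 1, memory_order_seq_cst);
    sb_r2 = atomic_load_explicit(&sb_x, memory_order_seq_cst);
    return NULL;
}

/* Repeats the pattern and counts how often both loads returned 0. */
int count_forbidden_outcomes(int iterations) {
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        pthread_t a, b;
        atomic_store_explicit(&sb_x, 0, memory_order_seq_cst);
        atomic_store_explicit(&sb_y, 0, memory_order_seq_cst);
        int rc = pthread_create(&a, NULL, sb_unit_a, NULL);
        assert(rc == 0);
        rc = pthread_create(&b, NULL, sb_unit_b, NULL);
        assert(rc == 0);
        (void)rc;
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        /* In a single total order, one store precedes the other unit's load. */
        if (sb_r1 == 0 && sb_r2 == 0) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

| <div class="paragraph"> |
| <p>A count of zero is guaranteed only for the sequentially consistent version; a zero count from a weaker variant would show that the outcome did not happen to occur, not that it cannot.</p> |
| </div> |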
| <div id="atomic-fence-orders" class="paragraph"> |
| <p>The fence operation is atomic_work_item_fence, which includes a memory_order |
| argument as well as the memory_scope and cl_mem_fence_flags arguments. |
| Depending on the memory_order argument, this operation:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>has no effects, if <strong>memory_order_relaxed</strong>;</p> |
| </li> |
| <li> |
| <p>is an acquire fence, if <strong>memory_order_acquire</strong>;</p> |
| </li> |
| <li> |
| <p>is a release fence, if <strong>memory_order_release</strong>;</p> |
| </li> |
| <li> |
| <p>is both an acquire fence and a release fence, if <strong>memory_order_acq_rel</strong>;</p> |
| </li> |
| <li> |
| <p>is a sequentially-consistent fence with both acquire and release |
| semantics, if <strong>memory_order_seq_cst</strong>.</p> |
| </li> |
| </ul> |
| </div> |
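| <div class="paragraph"> |
| <p>The acquire and release fence cases above can be sketched with the C11 function <code>atomic_thread_fence</code>, used here as a rough host-side analogue of <code>atomic_work_item_fence</code>. |
| The C11 function takes only a memory order, so the flags and memory_scope arguments have no counterpart in this sketch. |
| The names <code>fence_payload</code>, <code>fence_flag</code>, <code>fence_writer</code>, <code>fence_reader</code>, and <code>run_fence_handoff</code> are hypothetical.</p> |
| </div> |

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int fence_payload;        /* plain, non-atomic data */
static atomic_int fence_flag;    /* accessed only with relaxed operations */

static void *fence_writer(void *arg) {
    (void)arg;
    fence_payload = 7;
    /* Release fence: orders the payload write before the relaxed store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&fence_flag, 1, memory_order_relaxed);
    return NULL;
}

static void *fence_reader(void *arg) {
    int *out = arg;
    while (atomic_load_explicit(&fence_flag, memory_order_relaxed) == 0) {
    }
    /* Acquire fence: orders the relaxed load before the payload read. */
    atomic_thread_fence(memory_order_acquire);
    *out = fence_payload;
    return NULL;
}

/* Runs one writer and one reader; returns the value the reader observed. */
int run_fence_handoff(void) {
    pthread_t writer, reader;
    int observed = -1;
    fence_payload = 0;
    atomic_store_explicit(&fence_flag, 0, memory_order_relaxed);
    int rc = pthread_create(&reader, NULL, fence_reader, &observed);
    assert(rc == 0);
    rc = pthread_create(&writer, NULL, fence_writer, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(writer, NULL);
    pthread_join(reader, NULL);
    return observed;
}
```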
| <div class="paragraph"> |
| <p>If specified, the cl_mem_fence_flags argument must be <code>CLK_IMAGE_MEM_FENCE</code>, |
| <code>CLK_GLOBAL_MEM_FENCE</code>, <code>CLK_LOCAL_MEM_FENCE</code>, or <code>CLK_GLOBAL_MEM_FENCE | |
| CLK_LOCAL_MEM_FENCE</code>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The <code>atomic_work_item_fence(CLK_IMAGE_MEM_FENCE, …​)</code> built-in function must be |
| used to make sure that sampler-less writes are visible to later reads by the |
| same work-item. |
| Without use of the atomic_work_item_fence function, write-read coherence on |
| image objects is not guaranteed: if a work-item reads from an image to which |
| it has previously written without an intervening atomic_work_item_fence, it |
| is not guaranteed that those previous writes are visible to the work-item.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The synchronization operations in OpenCL 2.x can be parameterized by a |
| memory_scope. |
| Memory scopes control the extent that an atomic operation or fence is |
| visible with respect to the memory model. |
| These memory scopes may be used when performing atomic operations and fences |
| on global memory and local memory. |
| When used on global memory, visibility is bounded by the capabilities of that |
| memory. |
| When used on a fine-grained non-atomic SVM buffer, a coarse-grained SVM |
| buffer, or a non-SVM buffer, operations parameterized with |
| <strong>memory_scope_all_svm_devices</strong> will behave as if they were parameterized |
| with <strong>memory_scope_device</strong>. |
| When used on local memory, visibility is bounded by the work-group and, as a |
| result, memory_scope with wider visibility than <strong>memory_scope_work_group</strong> |
| will be reduced to <strong>memory_scope_work_group</strong>.</p> |
| </div> |
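| <div class="paragraph"> |
| <p>The scope reduction described above can be modelled as a small function. |
| This is a sketch for reasoning only: the types <code>scope_t</code> and <code>mem_kind_t</code> and the function <code>effective_scope</code> are hypothetical and are not part of any OpenCL API.</p> |
| </div> |

```c
#include <assert.h>

/* Hypothetical model of the memory scopes, narrowest to widest. */
typedef enum {
    SCOPE_WORK_ITEM,
    SCOPE_SUB_GROUP,
    SCOPE_WORK_GROUP,
    SCOPE_DEVICE,
    SCOPE_ALL_SVM_DEVICES
} scope_t;

/* Hypothetical classification of the memory an operation targets. */
typedef enum {
    MEM_LOCAL,
    MEM_GLOBAL_NON_SVM,           /* non-SVM buffer */
    MEM_GLOBAL_SVM_COARSE,        /* coarse-grained SVM buffer */
    MEM_GLOBAL_SVM_FINE,          /* fine-grained SVM buffer without atomics */
    MEM_GLOBAL_SVM_FINE_ATOMICS   /* fine-grained SVM buffer with atomics */
} mem_kind_t;

/* Returns the scope an operation effectively behaves as having. */
scope_t effective_scope(scope_t requested, mem_kind_t kind) {
    assert(requested >= SCOPE_WORK_ITEM && requested <= SCOPE_ALL_SVM_DEVICES);
    if (kind == MEM_LOCAL) {
        /* Local memory: visibility is bounded by the work-group. */
        return requested > SCOPE_WORK_GROUP ? SCOPE_WORK_GROUP : requested;
    }
    if (kind != MEM_GLOBAL_SVM_FINE_ATOMICS &&
        requested == SCOPE_ALL_SVM_DEVICES) {
        /* Without SVM atomics support, all_svm_devices behaves as device. */
        return SCOPE_DEVICE;
    }
    return requested;
}
```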
| <div class="paragraph"> |
| <p>Two actions <strong>A</strong> and <strong>B</strong> are defined to have an inclusive scope if they have |
| the same scope <strong>P</strong> such that:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>P</strong> is <strong>memory_scope_sub_group</strong> and <strong>A</strong> and <strong>B</strong> are executed by |
| work-items within the same sub-group.</p> |
| </li> |
| <li> |
| <p><strong>P</strong> is <strong>memory_scope_work_group</strong> and <strong>A</strong> and <strong>B</strong> are executed by |
| work-items within the same work-group.</p> |
| </li> |
| <li> |
| <p><strong>P</strong> is <strong>memory_scope_device</strong> and <strong>A</strong> and <strong>B</strong> are executed by work-items |
| on the same device when <strong>A</strong> and <strong>B</strong> apply to an SVM allocation or <strong>A</strong> |
| and <strong>B</strong> are executed by work-items in the same kernel or one of its |
| children when <strong>A</strong> and <strong>B</strong> apply to a <code>cl_mem</code> buffer.</p> |
| </li> |
| <li> |
| <p><strong>P</strong> is <strong>memory_scope_all_svm_devices</strong> and <strong>A</strong> and <strong>B</strong> are executed by |
| host threads or by work-items on one or more devices that can share SVM |
| memory with each other and the host process.</p> |
| </li> |
| </ul> |
| </div> |
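| <div class="paragraph"> |
| <p>The four cases above can be written as a predicate over a simplified description of two actions. |
| This is a sketch for reasoning only: the type <code>action_t</code>, its fields, and the function <code>inclusive_scope</code> are hypothetical, and the model assumes that work-group and sub-group identifiers are unique across all running kernels.</p> |
| </div> |

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { P_SUB_GROUP, P_WORK_GROUP, P_DEVICE, P_ALL_SVM_DEVICES } scope_p;

/* Hypothetical description of where an action executes. */
typedef struct {
    scope_p scope;      /* memory scope the action was parameterized with */
    bool on_host;       /* executed by a host thread, not a work-item */
    int device;         /* device id (ignored for host threads) */
    int kernel_family;  /* id shared by a kernel and its child kernels */
    int work_group;     /* work-group id, assumed globally unique */
    int sub_group;      /* sub-group id, assumed globally unique */
    bool shares_svm;    /* device can share SVM with other devices and host */
} action_t;

/* targets_svm: both actions apply to an SVM allocation (else a cl_mem buffer). */
bool inclusive_scope(const action_t *a, const action_t *b, bool targets_svm) {
    assert(a != NULL && b != NULL);
    if (a->scope != b->scope) {
        return false;  /* the definition requires the same scope P */
    }
    bool both_work_items = !a->on_host && !b->on_host;
    switch (a->scope) {
    case P_SUB_GROUP:
        return both_work_items && a->sub_group == b->sub_group;
    case P_WORK_GROUP:
        return both_work_items && a->work_group == b->work_group;
    case P_DEVICE:
        if (!both_work_items) {
            return false;
        }
        /* SVM: same device. cl_mem buffer: same kernel or its children. */
        return targets_svm ? a->device == b->device
                           : a->kernel_family == b->kernel_family;
    case P_ALL_SVM_DEVICES:
        return (a->on_host || a->shares_svm) && (b->on_host || b->shares_svm);
    }
    return false;
}
```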
| </div> |
| <div class="sect3"> |
| <h4 id="memory-ordering-rules"><a class="anchor" href="#memory-ordering-rules"></a>3.3.7. Memory Ordering Rules</h4> |
| <div class="paragraph"> |
| <p>Fundamentally, the issue in a memory model is to understand the orderings in |
| time of modifications to objects in memory. |
| Modifying an object or calling a function that modifies an object are side |
| effects, i.e. changes in the state of the execution environment. |
| Evaluation of an expression in general includes both value computations and |
| initiation of side effects. |
| Value computation for an lvalue expression includes determining the identity |
| of the designated object. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.3, paragraph 2, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>We assume that the OpenCL kernel language and host programming languages |
| have a sequenced-before relation between the evaluations executed by a |
| single unit of execution. |
| This sequenced-before relation is an asymmetric, transitive, pair-wise |
| relation between those evaluations, which induces a partial order among |
| them. |
| Given any two evaluations <strong>A</strong> and <strong>B</strong>, if <strong>A</strong> is sequenced-before <strong>B</strong>, then |
| the execution of <strong>A</strong> shall precede the execution of <strong>B</strong>. |
| (Conversely, if <strong>A</strong> is sequenced-before <strong>B</strong>, then <strong>B</strong> is sequenced-after |
| <strong>A</strong>.) If <strong>A</strong> is not sequenced-before or sequenced-after <strong>B</strong>, then <strong>A</strong> and |
| <strong>B</strong> are unsequenced. |
| Evaluations <strong>A</strong> and <strong>B</strong> are indeterminately sequenced when <strong>A</strong> is either |
| sequenced-before or sequenced-after <strong>B</strong>, but it is unspecified which. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.3, paragraph 3, modified.]</a></p> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Sequenced-before is a partial order of the operations executed by a |
| single unit of execution (e.g. a host thread or work-item). |
| It generally corresponds to the source program order of those operations, and |
| is partial because of the undefined argument evaluation order of the OpenCL C |
| kernel language. |
| </td> |
| </tr> |
| </table> |
| </div> |
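| <div class="paragraph"> |
| <p>The effect of unspecified argument evaluation order can be seen in a small C sketch. |
| The two calls in the argument list are indeterminately sequenced: each runs to completion, but which one runs first is unspecified. |
| The names <code>record</code>, <code>add</code>, <code>trace</code>, and <code>run_argument_order</code> are hypothetical.</p> |
| </div> |

```c
#include <assert.h>

static char trace[3];   /* two tags plus a terminating NUL */
static int trace_len;

/* Appends a tag to the trace so the evaluation order can be inspected. */
static int record(char tag) {
    assert(trace_len < 2);
    trace[trace_len++] = tag;
    return 0;
}

static int add(int a, int b) {
    return a + b;
}

/* Returns the order in which the two argument expressions were evaluated. */
const char *run_argument_order(void) {
    trace_len = 0;
    trace[0] = trace[1] = trace[2] = '\0';
    /* record('a') and record('b') are indeterminately sequenced. */
    (void)add(record('a'), record('b'));
    /* The full expression above is sequenced-before this return statement. */
    return trace;
}
```

| <div class="paragraph"> |
| <p>A conforming implementation may produce either order, so a program must not depend on one of them.</p> |
| </div> |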
| <div class="paragraph"> |
| <p>In an OpenCL kernel language, the value of an object visible to a work-item |
| W at a particular point is the initial value of the object, a value stored |
| in the object by W, or a value stored in the object by another work-item or |
| host thread, according to the rules below. |
| Depending on details of the host programming language, the value of an |
| object visible to a host thread may also be the value stored in that object |
| by another work-item or host thread. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.4, paragraph 2, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>Two expression evaluations conflict if one of them modifies a memory |
| location and the other one reads or modifies the same memory location. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.4, paragraph 4.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>All modifications to a particular atomic object <strong>M</strong> occur in some particular |
| total order, called the modification order of <strong>M</strong>. |
| If <strong>A</strong> and <strong>B</strong> are modifications of an atomic object <strong>M</strong>, and <strong>A</strong> |
| happens-before <strong>B</strong> (as defined below), then <strong>A</strong> shall precede <strong>B</strong> in the |
| modification order of <strong>M</strong>. |
| Note that the modification order of an atomic object <strong>M</strong> is independent of |
| whether <strong>M</strong> is in local or global memory. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.4, paragraph 7, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>A release sequence begins with a release operation <strong>A</strong> on an atomic object |
| <strong>M</strong> and is the maximal contiguous sub-sequence of side effects in the |
| modification order of <strong>M</strong>, where the first operation is <strong>A</strong> and every |
| subsequent operation either is performed by the same work-item or host |
| thread that performed the release or is an atomic read-modify-write |
| operation. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.4, paragraph 10, modified.]</a></p> |
| </div> |
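| <div class="paragraph"> |
| <p>The definition can be made concrete by walking a modification order represented as an array. |
| This is a sketch for reasoning only: the type <code>modification_t</code> and the function <code>release_sequence_length</code> are hypothetical.</p> |
| </div> |

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One side effect in the modification order of an atomic object M. */
typedef struct {
    int unit;         /* id of the work-item or host thread that performed it */
    bool is_release;  /* the operation is a release operation */
    bool is_rmw;      /* the operation is an atomic read-modify-write */
} modification_t;

/*
 * Returns the number of side effects in the release sequence headed by
 * mods[head], counting the head itself, or 0 if mods[head] is not a release.
 */
size_t release_sequence_length(const modification_t *mods, size_t n,
                               size_t head) {
    assert(mods != NULL && head < n);
    if (!mods[head].is_release) {
        return 0;
    }
    size_t len = 1;
    for (size_t i = head + 1; i < n; ++i) {
        bool same_unit = mods[i].unit == mods[head].unit;
        if (!same_unit && !mods[i].is_rmw) {
            break;  /* contiguity ends at the first non-qualifying effect */
        }
        ++len;
    }
    return len;
}
```

| <div class="paragraph"> |
| <p>A plain store by a different unit ends the sequence, even if later side effects would otherwise qualify.</p> |
| </div> |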
| <div class="paragraph"> |
| <p>OpenCL’s local and global memories are disjoint. |
| Kernels may access both kinds of memory while host threads may only access |
| global memory. |
| Furthermore, the <em>flags</em> argument of OpenCL’s work_group_barrier function |
| specifies which memory operations the function will make visible: these |
| memory operations can be, for example, just the ones to local memory, or the |
| ones to global memory, or both. |
| Since the visibility of memory operations can be specified for local memory |
| separately from global memory, we define two related but independent |
| relations, <em>global-synchronizes-with</em> and <em>local-synchronizes-with</em>. |
| Certain operations on global memory may global-synchronize-with other |
| operations performed by another work-item or host thread. |
| An example is a release atomic operation in one work-item that |
| global-synchronizes-with an acquire atomic operation in a second work-item. |
| Similarly, certain atomic operations on local objects in kernels can |
| local-synchronize-with other atomic operations on those local objects. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.4, paragraph 11, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>We define two separate happens-before relations: global-happens-before and |
| local-happens-before.</p> |
| </div> |
| <div class="paragraph"> |
| <p>A global memory action <strong>A</strong> global-happens-before a global memory action <strong>B</strong> |
| if</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>A</strong> is sequenced-before <strong>B</strong>, or</p> |
| </li> |
| <li> |
| <p><strong>A</strong> global-synchronizes-with <strong>B</strong>, or</p> |
| </li> |
| <li> |
| <p>For some global memory action <strong>C</strong>, <strong>A</strong> global-happens-before <strong>C</strong> and <strong>C</strong> |
| global-happens-before <strong>B</strong>.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>A local memory action <strong>A</strong> local-happens-before a local memory action <strong>B</strong> if</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>A</strong> is sequenced-before <strong>B</strong>, or</p> |
| </li> |
| <li> |
| <p><strong>A</strong> local-synchronizes-with <strong>B</strong>, or</p> |
| </li> |
| <li> |
| <p>For some local memory action <strong>C</strong>, <strong>A</strong> local-happens-before <strong>C</strong> and <strong>C</strong> |
| local-happens-before <strong>B</strong>.</p> |
| </li> |
| </ul> |
| </div> |
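| <div class="paragraph"> |
| <p>Each happens-before relation is the transitive closure of sequenced-before together with the matching synchronizes-with relation, which can be computed for a small execution as follows. |
| This is a sketch for reasoning only: the type <code>execution_t</code> and the functions <code>compute_happens_before</code> and <code>has_cycle</code> are hypothetical, and the same code serves for either the global or the local relation.</p> |
| </div> |

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ACTIONS 8

/* A small execution: actions are numbered 0..n-1. */
typedef struct {
    int n;
    bool sb[MAX_ACTIONS][MAX_ACTIONS];  /* sb[i][j]: i is sequenced-before j */
    bool sw[MAX_ACTIONS][MAX_ACTIONS];  /* sw[i][j]: i synchronizes-with j */
} execution_t;

/* Fills hb with the transitive closure of the union of sb and sw. */
void compute_happens_before(const execution_t *e,
                            bool hb[MAX_ACTIONS][MAX_ACTIONS]) {
    assert(e->n >= 0 && e->n <= MAX_ACTIONS);
    for (int i = 0; i < e->n; ++i) {
        for (int j = 0; j < e->n; ++j) {
            hb[i][j] = e->sb[i][j] || e->sw[i][j];
        }
    }
    /* Warshall's algorithm: add i -> j whenever i -> k and k -> j. */
    for (int k = 0; k < e->n; ++k) {
        for (int i = 0; i < e->n; ++i) {
            for (int j = 0; j < e->n; ++j) {
                if (hb[i][k] && hb[k][j]) {
                    hb[i][j] = true;
                }
            }
        }
    }
}

/* A cycle exists exactly when some action happens-before itself. */
bool has_cycle(bool hb[MAX_ACTIONS][MAX_ACTIONS], int n) {
    for (int i = 0; i < n; ++i) {
        if (hb[i][i]) {
            return true;
        }
    }
    return false;
}
```

| <div class="paragraph"> |
| <p>For example, with a data write (action 0) sequenced-before a release (action 1), an acquire (action 2) sequenced-before a data read (action 3), and the release synchronizing with the acquire, the closure orders the write before the read.</p> |
| </div> |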
| <div class="paragraph"> |
| <p>An OpenCL 2.x implementation shall ensure that no program execution |
| demonstrates a cycle in either the local-happens-before relation or the |
| global-happens-before relation.</p> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| The global- and local-happens-before relations are critical to |
| defining what values are read and when data races occur. |
| The global-happens-before relation, for example, defines what global memory |
| operations definitely happen before what other global memory operations. |
| If an operation <strong>A</strong> global-happens-before operation <strong>B</strong>, then <strong>A</strong> must occur |
| before <strong>B</strong>; in particular, any write done by <strong>A</strong> will be visible to <strong>B</strong>. |
| The local-happens-before relation has similar properties for local memory. |
| Programmers can use the local- and global-happens-before relations to reason |
| about the order of program actions. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>A visible side effect <strong>A</strong> on a global object <strong>M</strong> with respect to a value |
| computation <strong>B</strong> of <strong>M</strong> satisfies the conditions:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>A</strong> global-happens-before <strong>B</strong>, and</p> |
| </li> |
| <li> |
| <p>there is no other side effect <strong>X</strong> to <strong>M</strong> such that <strong>A</strong> |
| global-happens-before <strong>X</strong> and <strong>X</strong> global-happens-before <strong>B</strong>.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>We define visible side effects for local objects <strong>M</strong> similarly. |
| The value of a non-atomic scalar object <strong>M</strong>, as determined by evaluation |
| <strong>B</strong>, shall be the value stored by the visible side effect <strong>A</strong>. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.4, paragraph 19, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>The execution of a program contains a data race if it contains two |
| conflicting actions <strong>A</strong> and <strong>B</strong> in different units of execution, and</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>(1) at least one of <strong>A</strong> or <strong>B</strong> is not atomic, or <strong>A</strong> and <strong>B</strong> do not have |
| inclusive memory scope, and</p> |
| </li> |
| <li> |
| <p>(2) the actions are global actions unordered by the |
| global-happens-before relation or are local actions unordered by the |
| local-happens-before relation.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Any such data race results in undefined behavior. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.4, paragraph 25, modified.]</a></p> |
| </div> |
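| <div class="paragraph"> |
| <p>The data race definition above reduces to a boolean predicate once the relevant facts about a pair of actions are known. |
| This is a sketch for reasoning only: the type <code>action_pair_t</code> and the function <code>is_data_race</code> are hypothetical, and each field is assumed to have been determined elsewhere.</p> |
| </div> |

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Facts about two actions A and B, assumed to be computed elsewhere. */
typedef struct {
    bool conflict;         /* same location, and at least one modifies it */
    bool different_units;  /* A and B are in different units of execution */
    bool a_atomic;         /* A is an atomic operation */
    bool b_atomic;         /* B is an atomic operation */
    bool inclusive_scope;  /* A and B have inclusive memory scope */
    bool ordered_by_hb;    /* ordered by the happens-before relation that
                              matches their memory (global or local) */
} action_pair_t;

bool is_data_race(const action_pair_t *p) {
    assert(p != NULL);
    if (!p->conflict || !p->different_units) {
        return false;
    }
    /* (1) at least one is not atomic, or the scopes are not inclusive. */
    bool condition1 = !p->a_atomic || !p->b_atomic || !p->inclusive_scope;
    /* (2) the actions are unordered by the relevant happens-before. */
    bool condition2 = !p->ordered_by_hb;
    return condition1 && condition2;
}
```

| <div class="paragraph"> |
| <p>Note that two atomic operations can still race if their memory scopes are not inclusive.</p> |
| </div> |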
| <div class="paragraph"> |
| <p>We also define the visible sequence of side effects on local and global |
| atomic objects. |
| The remaining paragraphs of this subsection define this sequence for a |
| global atomic object <strong>M</strong>; the visible sequence of side effects for a local |
| atomic object is defined similarly by using the local-happens-before |
| relation.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The visible sequence of side effects on a global atomic object <strong>M</strong>, with |
| respect to a value computation <strong>B</strong> of <strong>M</strong>, is a maximal contiguous |
| sub-sequence of side effects in the modification order of <strong>M</strong>, where the |
| first side effect is visible with respect to <strong>B</strong>, and for every side effect, |
| it is not the case that <strong>B</strong> global-happens-before it. |
| The value of <strong>M</strong>, as determined by evaluation <strong>B</strong>, shall be the value stored |
| by some operation in the visible sequence of <strong>M</strong> with respect to <strong>B</strong>. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.4, paragraph 22, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>If an operation <strong>A</strong> that modifies an atomic object <strong>M</strong> global-happens-before |
| an operation <strong>B</strong> that modifies <strong>M</strong>, then <strong>A</strong> shall be earlier than <strong>B</strong> in |
| the modification order of <strong>M</strong>. |
| This requirement is known as write-write coherence.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If a value computation <strong>A</strong> of an atomic object <strong>M</strong> global-happens-before a |
| value computation <strong>B</strong> of <strong>M</strong>, and <strong>A</strong> takes its value from a side effect <strong>X</strong> |
| on <strong>M</strong>, then the value computed by <strong>B</strong> shall either equal the value stored |
| by <strong>X</strong>, or be the value stored by a side effect <strong>Y</strong> on <strong>M</strong>, where <strong>Y</strong> |
| follows <strong>X</strong> in the modification order of <strong>M</strong>. |
| This requirement is known as read-read coherence. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.4, paragraph 22, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>If a value computation <strong>A</strong> of an atomic object <strong>M</strong> global-happens-before an |
| operation <strong>B</strong> on <strong>M</strong>, then <strong>A</strong> shall take its value from a side effect <strong>X</strong> |
| on <strong>M</strong>, where <strong>X</strong> precedes <strong>B</strong> in the modification order of <strong>M</strong>. |
| This requirement is known as read-write coherence.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If a side effect <strong>X</strong> on an atomic object <strong>M</strong> global-happens-before a value |
| computation <strong>B</strong> of <strong>M</strong>, then the evaluation <strong>B</strong> shall take its value from |
| <strong>X</strong> or from a side effect <strong>Y</strong> that follows <strong>X</strong> in the modification order of |
| <strong>M</strong>. |
| This requirement is known as write-read coherence.</p> |
| </div> |
| <div class="sect4"> |
| <h5 id="_atomic_operations"><a class="anchor" href="#_atomic_operations"></a>3.3.7.1. Atomic Operations</h5> |
| <div class="paragraph"> |
| <p>This and the following sections describe how different program actions in kernel |
| C code and the host program contribute to the local- and |
| global-happens-before relations. |
| This section discusses ordering rules for OpenCL 2.x atomic operations.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#device-side-enqueue">Device-side enqueue</a> defines the enumerated type |
| memory_order.</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>For <strong>memory_order_relaxed</strong>, no operation orders memory.</p> |
| </li> |
| <li> |
| <p>For <strong>memory_order_release</strong>, <strong>memory_order_acq_rel</strong>, and |
| <strong>memory_order_seq_cst</strong>, a store operation performs a release operation |
| on the affected memory location.</p> |
| </li> |
| <li> |
| <p>For <strong>memory_order_acquire</strong>, <strong>memory_order_acq_rel</strong>, and |
| <strong>memory_order_seq_cst</strong>, a load operation performs an acquire operation |
| on the affected memory location. |
| <a href="#iso-c11">[C11 standard, Section 7.17.3, paragraphs 2-4, modified.]</a></p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Certain built-in functions synchronize with other built-in functions |
| performed by another unit of execution. |
| This is true for pairs of release and acquire operations under specific |
| circumstances. |
| An atomic operation <strong>A</strong> that performs a release operation on a global object |
| <strong>M</strong> global-synchronizes-with an atomic operation <strong>B</strong> that performs an |
| acquire operation on <strong>M</strong> and reads a value written by any side effect in the |
| release sequence headed by <strong>A</strong>. |
| A similar rule holds for atomic operations on objects in local memory: an |
| atomic operation <strong>A</strong> that performs a release operation on a local object <strong>M</strong> |
| local-synchronizes-with an atomic operation <strong>B</strong> that performs an acquire |
| operation on <strong>M</strong> and reads a value written by any side effect in the release |
| sequence headed by <strong>A</strong>. |
| <a href="#iso-c11">[C11 standard, Section 5.1.2.4, paragraph 11, modified.]</a></p> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Atomic operations specifying <strong>memory_order_relaxed</strong> are relaxed only |
| with respect to memory ordering. |
| Implementations must still guarantee that any given atomic access to a |
| particular atomic object be indivisible with respect to all other atomic |
| accesses to that object. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>There shall exist a single total order <strong>S</strong> for all <strong>memory_order_seq_cst</strong> |
| operations that is consistent with the modification orders for all affected |
| locations, as well as the appropriate global-happens-before and |
| local-happens-before orders for those locations, such that each |
| <strong>memory_order_seq_cst</strong> operation <strong>B</strong> that loads a value from an atomic object |
| <strong>M</strong> in global or local memory observes one of the following values:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>the result of the last modification <strong>A</strong> of <strong>M</strong> that precedes <strong>B</strong> in <strong>S</strong>, |
| if it exists, or</p> |
| </li> |
| <li> |
| <p>if <strong>A</strong> exists, the result of some modification of <strong>M</strong> in the visible |
| sequence of side effects with respect to <strong>B</strong> that is not |
| <strong>memory_order_seq_cst</strong> and that does not happen before <strong>A</strong>, or</p> |
| </li> |
| <li> |
| <p>if <strong>A</strong> does not exist, the result of some modification of <strong>M</strong> in the |
| visible sequence of side effects with respect to <strong>B</strong> that is not |
| <strong>memory_order_seq_cst</strong>. |
| <a href="#iso-c11">[C11 standard, Section 7.17.3, paragraph 6, modified.]</a></p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Let X and Y be two <strong>memory_order_seq_cst</strong> operations. |
| If X local-synchronizes-with or global-synchronizes-with Y then X both |
| local-synchronizes-with Y and global-synchronizes-with Y.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If the total order <strong>S</strong> exists, the following rules hold:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>For an atomic operation <strong>B</strong> that reads the value of an atomic object |
| <strong>M</strong>, if there is a <strong>memory_order_seq_cst</strong> fence <strong>X</strong> sequenced-before |
| <strong>B</strong>, then <strong>B</strong> observes either the last <strong>memory_order_seq_cst</strong> |
| modification of <strong>M</strong> preceding <strong>X</strong> in the total order <strong>S</strong> or a later |
| modification of <strong>M</strong> in its modification order. |
| <a href="#iso-c11">[C11 standard, Section 7.17.3, paragraph 9.]</a></p> |
| </li> |
| <li> |
| <p>For atomic operations <strong>A</strong> and <strong>B</strong> on an atomic object <strong>M</strong>, where <strong>A</strong> |
| modifies <strong>M</strong> and <strong>B</strong> takes its value, if there is a |
| <strong>memory_order_seq_cst</strong> fence <strong>X</strong> such that <strong>A</strong> is sequenced-before <strong>X</strong> |
| and <strong>B</strong> follows <strong>X</strong> in <strong>S</strong>, then <strong>B</strong> observes either the effects of <strong>A</strong> |
| or a later modification of <strong>M</strong> in its modification order. |
| <a href="#iso-c11">[C11 standard, Section 7.17.3, paragraph 10.]</a></p> |
| </li> |
| <li> |
| <p>For atomic operations <strong>A</strong> and <strong>B</strong> on an atomic object <strong>M</strong>, where <strong>A</strong> |
| modifies <strong>M</strong> and <strong>B</strong> takes its value, if there are |
| <strong>memory_order_seq_cst</strong> fences <strong>X</strong> and <strong>Y</strong> such that <strong>A</strong> is |
| sequenced-before <strong>X</strong>, <strong>Y</strong> is sequenced-before <strong>B</strong>, and <strong>X</strong> precedes <strong>Y</strong> |
| in <strong>S</strong>, then <strong>B</strong> observes either the effects of <strong>A</strong> or a later |
| modification of <strong>M</strong> in its modification order. |
| <a href="#iso-c11">[C11 standard, Section 7.17.3, paragraph 11.]</a></p> |
| </li> |
| <li> |
| <p>For atomic operations <strong>A</strong> and <strong>B</strong> on an atomic object <strong>M</strong>, if there are |
| <strong>memory_order_seq_cst</strong> fences <strong>X</strong> and <strong>Y</strong> such that <strong>A</strong> is |
| sequenced-before <strong>X</strong>, <strong>Y</strong> is sequenced-before <strong>B</strong>, and <strong>X</strong> precedes <strong>Y</strong> |
| in <strong>S</strong>, then <strong>B</strong> occurs later than <strong>A</strong> in the modification order of <strong>M</strong>.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| <strong>memory_order_seq_cst</strong> ensures sequential consistency only for a |
| program that is (1) free of data races, and (2) exclusively uses |
| <strong>memory_order_seq_cst</strong> synchronization operations. |
| Any use of weaker ordering will invalidate this guarantee unless extreme |
| care is used. |
| In particular, <strong>memory_order_seq_cst</strong> fences ensure a total order only for |
| the fences themselves. |
| Fences cannot, in general, be used to restore sequential consistency for |
| atomic operations with weaker ordering specifications. |
| </td> |
| </tr> |
| </table> |
| </div> |
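| <div class="paragraph"> |
| <p>As an illustration of the single total order <strong>S</strong>, the following sketch runs the classic "store buffering" test. It is a host-side C11 analogue using POSIX threads, chosen as an assumption for illustration; the function and variable names are hypothetical. When every operation is <strong>memory_order_seq_cst</strong>, the outcome in which both loads return 0 would require a cycle in <strong>S</strong>, so it should never be observed.</p> |
| </div> |

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical host-side C11 analogue; not OpenCL kernel code. */
static atomic_int x, y;
static int r1, r2;   /* written by the threads, read only after join */

static void *thread_a(void *unused) {
    (void)unused;
    atomic_store_explicit(&x, 1, memory_order_seq_cst);
    r1 = atomic_load_explicit(&y, memory_order_seq_cst);
    return NULL;
}

static void *thread_b(void *unused) {
    (void)unused;
    atomic_store_explicit(&y, 1, memory_order_seq_cst);
    r2 = atomic_load_explicit(&x, memory_order_seq_cst);
    return NULL;
}

/* Counts rounds in which both loads observed 0 (forbidden under seq_cst). */
int count_forbidden_outcomes(int rounds) {
    int forbidden = 0;
    assert(rounds >= 0);
    for (int i = 0; i < rounds; ++i) {
        pthread_t a, b;
        atomic_store(&x, 0);
        atomic_store(&y, 0);
        pthread_create(&a, NULL, thread_a, NULL);
        pthread_create(&b, NULL, thread_b, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        if (r1 == 0 && r2 == 0)
            ++forbidden;
    }
    return forbidden;
}
```

| <div class="paragraph"> |
| <p>If the four operations were weakened to <strong>memory_order_relaxed</strong>, or to release stores and acquire loads, the forbidden outcome would become permitted, which is the loss of sequential consistency that the note above warns about.</p> |
| </div> |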
| <div class="paragraph"> |
| <p>Atomic read-modify-write operations should always read the last value (in |
| the modification order) stored before the write associated with the |
| read-modify-write operation. |
| <a href="#iso-c11">[C11 standard, Section 7.17.3, paragraph 12.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p><span class="underline">Implementations should ensure that no "out-of-thin-air" values |
| are computed that circularly depend on their own computation.</span></p> |
| </div> |
| <div class="paragraph"> |
| <p>Note: Under the rules described above, and independent of the previously |
| footnoted C++ issue, it is known that <em>x == y == 42</em> is a valid final state |
| in the following problematic example:</p> |
| </div> |
| <div class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c">global atomic_int x = ATOMIC_VAR_INIT(<span class="integer">0</span>); |
| local atomic_int y = ATOMIC_VAR_INIT(<span class="integer">0</span>); |
| |
| <span class="label">unit_of_execution_1:</span> |
| ... [execution not reading or writing x or y, leading up to:] |
| <span class="predefined-type">int</span> t = atomic_load_explicit(&y, memory_order_acquire); |
| atomic_store_explicit(&x, t, memory_order_release); |
| |
| <span class="label">unit_of_execution_2:</span> |
| ... [execution not reading or writing x or y, leading up to:] |
| <span class="predefined-type">int</span> t = atomic_load_explicit(&x, memory_order_acquire); |
| atomic_store_explicit(&y, t, memory_order_release);</code></pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>This is not useful behavior and implementations should not exploit this |
| phenomenon. |
| It should be expected that in the future this may be disallowed by |
| appropriate updates to the memory model description by the OpenCL committee.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Implementations should make atomic stores visible to atomic loads within a |
| reasonable amount of time. |
| <a href="#iso-c11">[C11 standard, Section 7.17.3, paragraph 16.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>As long as the following conditions are met, a host program sharing SVM memory |
| with a kernel executing on one or more OpenCL 2.x devices may use atomic and |
| synchronization operations to ensure that its assignments, and those of the |
| kernel, are visible to each other:</p> |
| </div> |
| <div class="olist arabic"> |
| <ol class="arabic"> |
| <li> |
| <p>Either fine-grained buffer or fine-grained system SVM must be used to |
| share memory. |
| While coarse-grained buffer SVM allocations may support atomic |
| operations, visibility on these allocations is not guaranteed except at |
| map and unmap operations.</p> |
| </li> |
| <li> |
| <p>The optional OpenCL 2.x SVM atomic-controlled visibility specified by |
| provision of the <a href="#CL_MEM_SVM_ATOMICS"><code>CL_MEM_<wbr>SVM_<wbr>ATOMICS</code></a> flag must be supported by the device |
| and the flag provided to the SVM buffer on allocation.</p> |
| </li> |
| <li> |
| <p>The host atomic and synchronization operations must be compatible with |
| those of an OpenCL kernel language. |
| This requires that the size and representation of the data types that |
| the host atomic operations act on be consistent with the OpenCL kernel |
| language atomic types.</p> |
| </li> |
| </ol> |
| </div> |
| <div class="paragraph"> |
| <p>If these conditions are met, the host operations will apply at |
| all_svm_devices scope.</p> |
| </div> |
| </div> |
| <div class="sect4"> |
| <h5 id="memory-ordering-fence"><a class="anchor" href="#memory-ordering-fence"></a>3.3.7.2. Fence Operations</h5> |
| <div class="paragraph"> |
| <p>This section describes how the OpenCL 2.x fence operations contribute to the |
| local- and global-happens-before relations.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Earlier, we introduced synchronization primitives called fences. |
| Fences can utilize the acquire memory_order, release memory_order, or both. |
| A fence with acquire semantics is called an acquire fence; a fence with |
| release semantics is called a release fence. The <a href="#atomic-fence-orders">overview of atomic and fence operations</a> section describes the memory orders |
| that result in acquire and release fences.</p> |
| </div> |
| <div class="paragraph"> |
| <p>A global release fence <strong>A</strong> global-synchronizes-with a global acquire fence |
| <strong>B</strong> if there exist atomic operations <strong>X</strong> and <strong>Y</strong>, both operating on some |
| global atomic object <strong>M</strong>, such that <strong>A</strong> is sequenced-before <strong>X</strong>, <strong>X</strong> |
| modifies <strong>M</strong>, <strong>Y</strong> is sequenced-before <strong>B</strong>, <strong>Y</strong> reads the value written by |
| <strong>X</strong> or a value written by any side effect in the hypothetical release |
| sequence <strong>X</strong> would head if it were a release operation, and the scopes |
| of <strong>A</strong> and <strong>B</strong> are inclusive. |
| <a href="#iso-c11">[C11 standard, Section 7.17.4, paragraph 2, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>A global release fence <strong>A</strong> global-synchronizes-with an atomic operation <strong>B</strong> |
| that performs an acquire operation on a global atomic object <strong>M</strong> if there |
| exists an atomic operation <strong>X</strong> such that <strong>A</strong> is sequenced-before <strong>X</strong>, <strong>X</strong> |
| modifies <strong>M</strong>, <strong>B</strong> reads the value written by <strong>X</strong> or a value written by any |
| side effect in the hypothetical release sequence <strong>X</strong> would head if it were a |
| release operation, and the scopes of <strong>A</strong> and <strong>B</strong> are inclusive. |
| <a href="#iso-c11">[C11 standard, Section 7.17.4, paragraph 3, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>An atomic operation <strong>A</strong> that is a release operation on a global atomic |
| object <strong>M</strong> global-synchronizes-with a global acquire fence <strong>B</strong> if there |
| exists some atomic operation <strong>X</strong> on <strong>M</strong> such that <strong>X</strong> is sequenced-before |
| <strong>B</strong> and reads the value written by <strong>A</strong> or a value written by any side effect |
| in the release sequence headed by <strong>A</strong>, and the scopes of <strong>A</strong> and <strong>B</strong> are |
| inclusive. |
| <a href="#iso-c11">[C11 standard, Section 7.17.4, paragraph 4, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>A local release fence <strong>A</strong> local-synchronizes-with a local acquire fence <strong>B</strong> |
| if there exist atomic operations <strong>X</strong> and <strong>Y</strong>, both operating on some local |
| atomic object <strong>M</strong>, such that <strong>A</strong> is sequenced-before <strong>X</strong>, <strong>X</strong> modifies <strong>M</strong>, |
| <strong>Y</strong> is sequenced-before <strong>B</strong>, and <strong>Y</strong> reads the value written by <strong>X</strong> or a |
| value written by any side effect in the hypothetical release sequence <strong>X</strong> |
| would head if it were a release operation, and the scopes of <strong>A</strong> and <strong>B</strong> are |
| inclusive. |
| <a href="#iso-c11">[C11 standard, Section 7.17.4, paragraph 2, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>A local release fence <strong>A</strong> local-synchronizes-with an atomic operation <strong>B</strong> |
| that performs an acquire operation on a local atomic object <strong>M</strong> if there |
| exists an atomic operation <strong>X</strong> such that <strong>A</strong> is sequenced-before <strong>X</strong>, <strong>X</strong> |
| modifies <strong>M</strong>, and <strong>B</strong> reads the value written by <strong>X</strong> or a value written by |
| any side effect in the hypothetical release sequence <strong>X</strong> would head if it |
| were a release operation, and the scopes of <strong>A</strong> and <strong>B</strong> are inclusive. |
| <a href="#iso-c11">[C11 standard, Section 7.17.4, paragraph 3, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>An atomic operation <strong>A</strong> that is a release operation on a local atomic object |
| <strong>M</strong> local-synchronizes-with a local acquire fence <strong>B</strong> if there exists some |
| atomic operation <strong>X</strong> on <strong>M</strong> such that <strong>X</strong> is sequenced-before <strong>B</strong> and reads |
| the value written by <strong>A</strong> or a value written by any side effect in the |
| release sequence headed by <strong>A</strong>, and the scopes of <strong>A</strong> and <strong>B</strong> are inclusive. |
| <a href="#iso-c11">[C11 standard, Section 7.17.4, paragraph 4, modified.]</a></p> |
| </div> |
| <div class="paragraph"> |
| <p>Let <strong>X</strong> and <strong>Y</strong> be two work-item fences that each have both the |
| <code>CLK_GLOBAL_MEM_FENCE</code> and <code>CLK_LOCAL_MEM_FENCE</code> flags set. |
| <strong>X</strong> global-synchronizes-with <strong>Y</strong> and <strong>X</strong> local-synchronizes-with <strong>Y</strong> if the |
| conditions required for <strong>X</strong> to global-synchronize-with <strong>Y</strong> are met, the |
| conditions required for <strong>X</strong> to local-synchronize-with <strong>Y</strong> are met, or both |
| sets of conditions are met.</p> |
| </div> |
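| <div class="paragraph"> |
| <p>The fence-to-fence rule can be sketched with a host-side C11 analogue, using POSIX threads and <code>atomic_thread_fence</code> in place of the kernel-language <code>atomic_work_item_fence</code>. This substitution is an assumption made for illustration: C11 fences carry no address-space flags or memory scope, so the inclusive-scope condition has no counterpart here. The names <code>payload</code>, <code>flag</code>, and <code>run_fence_handoff</code> are hypothetical. The comments label the operations <strong>A</strong>, <strong>X</strong>, <strong>Y</strong>, and <strong>B</strong> as in the rules above.</p> |
| </div> |

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical host-side C11 analogue; not OpenCL kernel code. */
static int payload;       /* ordinary, non-atomic data */
static atomic_int flag;   /* plays the role of the atomic object M */

static void *producer(void *unused) {
    (void)unused;
    payload = 42;
    atomic_thread_fence(memory_order_release);              /* release fence A */
    atomic_store_explicit(&flag, 1, memory_order_relaxed);  /* X modifies M */
    return NULL;
}

static void *consumer(void *out) {
    /* Y: spin until a relaxed load reads the value written by X. */
    while (atomic_load_explicit(&flag, memory_order_relaxed) == 0) {
    }
    atomic_thread_fence(memory_order_acquire);              /* acquire fence B */
    *(int *)out = payload;   /* ordered after the producer's write */
    return NULL;
}

/* Returns the payload value observed by the consumer. */
int run_fence_handoff(void) {
    pthread_t p, c;
    int seen = 0;
    int rc;
    payload = 0;
    atomic_store(&flag, 0);
    rc = pthread_create(&c, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&p, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

| <div class="paragraph"> |
| <p>Both atomic accesses to <code>flag</code> are relaxed; the ordering of the non-atomic <code>payload</code> write before its read comes entirely from the two fences, because <strong>A</strong> is sequenced-before <strong>X</strong> and <strong>Y</strong> is sequenced-before <strong>B</strong>.</p> |
| </div> |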
| </div> |
| <div class="sect4"> |
| <h5 id="_work_group_functions"><a class="anchor" href="#_work_group_functions"></a>3.3.7.3. Work-group Functions</h5> |
| <div class="paragraph"> |
| <p>The OpenCL kernel execution model includes collective operations across the |
| work-items within a single work-group. |
| These are called work-group functions, and include functions such as |
| barriers, scans, reductions, and broadcasts. |
| We will first discuss the work-group barrier function. |
| Other work-group functions are discussed afterwards.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The barrier function provides a mechanism for a kernel to synchronize the |
| work-items within a single work-group: informally, each work-item of the |
| work-group must execute the barrier before any are allowed to proceed. |
| It also orders memory operations to a specified combination of one or more |
| address spaces such as local memory or global memory, in a similar manner to |
| a fence.</p> |
| </div> |
| <div class="paragraph"> |
| <p>To precisely specify the memory ordering semantics for barrier, we need to |
| distinguish between a dynamic and a static instance of the call to a |
| barrier. |
| A call to a barrier can appear in a loop, for example, and each execution of |
| the same static barrier call results in a new dynamic instance of the |
| barrier that will independently synchronize a work-group's work-items.</p> |
| </div> |
| <div class="paragraph"> |
| <p>A work-item executing a dynamic instance of a barrier results in two |
| operations, both fences, that are called the entry and exit fences. |
| These fences obey all the rules for fences specified elsewhere in this |
| chapter as well as the following:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>The entry fence is a release fence with the same flags and scope as |
| requested for the barrier.</p> |
| </li> |
| <li> |
| <p>The exit fence is an acquire fence with the same flags and scope as |
| requested for the barrier.</p> |
| </li> |
| <li> |
| <p>For each work-item the entry fence is sequenced before the exit fence.</p> |
| </li> |
| <li> |
| <p>If the flags have <code>CLK_GLOBAL_MEM_FENCE</code> set then for each work-item the |
| entry fence global-synchronizes-with the exit fence of all other |
| work-items in the same work-group.</p> |
| </li> |
| <li> |
| <p>If the flags have <code>CLK_LOCAL_MEM_FENCE</code> set then for each work-item the |
| entry fence local-synchronizes-with the exit fence of all other |
| work-items in the same work-group.</p> |
| </li> |
| </ul> |
| </div> |
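| <div class="paragraph"> |
| <p>The entry and exit fence rules above can be pictured with a rough host-side analogue. The sketch below uses a POSIX <code>pthread_barrier_t</code> in place of the kernel-language <code>work_group_barrier</code>, and an ordinary array in place of local memory; both substitutions are assumptions made for illustration, and the names <code>slot</code>, <code>seen</code>, <code>work_item</code>, and <code>run_barrier_exchange</code> are hypothetical. Each thread writes its own slot before the barrier and reads a neighbour's slot after it.</p> |
| </div> |

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define WORKERS 4

/* Hypothetical host-side analogue; not OpenCL kernel code. */
static pthread_barrier_t barrier;
static int slot[WORKERS];   /* stands in for local memory */
static int seen[WORKERS];

static void *work_item(void *arg) {
    int id = (int)(intptr_t)arg;
    slot[id] = 100 + id;                   /* write before the barrier */
    /* Plays the role of the entry (release) and exit (acquire) fences. */
    pthread_barrier_wait(&barrier);
    seen[id] = slot[(id + 1) % WORKERS];   /* read a neighbour's write */
    return NULL;
}

/* Returns 1 if every worker observed its neighbour's pre-barrier write. */
int run_barrier_exchange(void) {
    pthread_t tid[WORKERS];
    int ok = 1;
    int rc = pthread_barrier_init(&barrier, NULL, WORKERS);
    assert(rc == 0);
    (void)rc;
    for (int i = 0; i < WORKERS; ++i)
        pthread_create(&tid[i], NULL, work_item, (void *)(intptr_t)i);
    for (int i = 0; i < WORKERS; ++i)
        pthread_join(tid[i], NULL);
    pthread_barrier_destroy(&barrier);
    for (int i = 0; i < WORKERS; ++i)
        if (seen[i] != 100 + (i + 1) % WORKERS)
            ok = 0;
    return ok;
}
```

| <div class="paragraph"> |
| <p>In kernel code the corresponding pattern would be a write to local memory, a barrier with <code>CLK_LOCAL_MEM_FENCE</code> set, and then a read of another work-item's element; the entry fence of the writer local-synchronizes-with the exit fence of the reader.</p> |
| </div> |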
| <div class="paragraph"> |
| <p>Other work-group functions include such functions as scans, reductions, |
| and broadcasts, and are described in the kernel language and IL specifications. |
| The use of these work-group functions implies sequenced-before relationships |
| between statements within the execution of a single work-item in order to |
| satisfy data dependencies. |
| For example, a work-item that provides a value to a work-group function must |
| behave as if it generates that value before beginning execution of that |
| work-group function. |
| Furthermore, the programmer must ensure that all work-items in a work-group |
| execute the same work-group function call site, or dynamic work-group |
| function instance.</p> |
| </div> |
| </div> |
| <div class="sect4"> |
| <h5 id="_sub_group_functions"><a class="anchor" href="#_sub_group_functions"></a>3.3.7.4. Sub-group Functions</h5> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Sub-group functions are <a href="#unified-spec">missing before</a> version 2.1. |
| Also see extension <strong>cl_khr_subgroups</strong>. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>The OpenCL kernel execution model includes collective operations across the |
| work-items within a single sub-group. |
| These are called sub-group functions. |
| We will first discuss the sub-group barrier. |
| Other sub-group functions are discussed afterwards.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The barrier function provides a mechanism for a kernel to synchronize the |
| work-items within a single sub-group: informally, each work-item of the |
| sub-group must execute the barrier before any are allowed to proceed. |
| It also orders memory operations to a specified combination of one or more |
| address spaces such as local memory or global memory, in a similar manner to |
| a fence.</p> |
| </div> |
| <div class="paragraph"> |
| <p>To precisely specify the memory ordering semantics for barrier, we need to |
| distinguish between a dynamic and a static instance of the call to a |
| barrier. |
| A call to a barrier can appear in a loop, for example, and each execution of |
| the same static barrier call results in a new dynamic instance of the |
| barrier that will independently synchronize a sub-group's work-items.</p> |
| </div> |
| <div class="paragraph"> |
| <p>A work-item executing a dynamic instance of a barrier results in two |
| operations, both fences, that are called the entry and exit fences. |
| These fences obey all the rules for fences specified elsewhere in this |
| chapter as well as the following:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>The entry fence is a release fence with the same flags and scope as |
| requested for the barrier.</p> |
| </li> |
| <li> |
| <p>The exit fence is an acquire fence with the same flags and scope as |
| requested for the barrier.</p> |
| </li> |
| <li> |
| <p>For each work-item the entry fence is sequenced before the exit fence.</p> |
| </li> |
| <li> |
| <p>If the flags have <code>CLK_GLOBAL_MEM_FENCE</code> set then for each work-item the |
| entry fence global-synchronizes-with the exit fence of all other |
| work-items in the same sub-group.</p> |
| </li> |
| <li> |
| <p>If the flags have <code>CLK_LOCAL_MEM_FENCE</code> set then for each work-item the |
| entry fence local-synchronizes-with the exit fence of all other |
| work-items in the same sub-group.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Other sub-group functions include such functions as scans, reductions, |
| and broadcasts, and are described in the kernel languages and IL specifications. |
| The use of these sub-group functions implies sequenced-before relationships |
| between statements within the execution of a single work-item in order to |
| satisfy data dependencies. |
| For example, a work-item that provides a value to a sub-group function must |
| behave as if it generates that value before beginning execution of that |
| sub-group function. |
| Furthermore, the programmer must ensure that all work-items in a sub-group |
| execute the same sub-group function call site, or dynamic sub-group |
| function instance.</p> |
| </div> |
| </div> |
| <div class="sect4"> |
| <h5 id="_host_side_and_device_side_commands"><a class="anchor" href="#_host_side_and_device_side_commands"></a>3.3.7.5. Host-side and Device-side Commands</h5> |
| <div class="paragraph"> |
| <p>This section describes how the OpenCL API functions associated with |
| command-queues contribute to happens-before relations. |
| There are two types of command queues and associated API functions in OpenCL |
| 2.x; <em>host command-queues</em> and <em>device command-queues</em>. |
| The interaction of these command queues with the memory model is for the |
| most part equivalent. |
| In a few cases, a rule applies only to the host command-queue. |
| We will indicate these special cases by specifically denoting the host |
| command-queue in the memory ordering rule. |
| SVM memory consistency in such instances is implied only with respect to |
| synchronizing host commands.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Memory ordering rules in this section apply to all memory objects (buffers, |
| images and pipes) as well as to SVM allocations where no earlier, and more |
| fine-grained, rules apply.</p> |
| </div> |
| <div class="paragraph"> |
| <p>In the remainder of this section, we assume that each command <strong>C</strong> enqueued |
| onto a command-queue has an associated event object <strong>E</strong> that signals its |
| execution status, regardless of whether <strong>E</strong> was returned to the unit of |
| execution that enqueued <strong>C</strong>. |
| We also distinguish between the API function call that enqueues a command |
| <strong>C</strong> and creates an event <strong>E</strong>, the execution of <strong>C</strong>, and the completion of |
| <strong>C</strong> (which marks the event <strong>E</strong> as complete).</p> |
| </div> |
| <div class="paragraph"> |
| <p>The ordering and synchronization rules for API commands are defined as |
| follows:</p> |
| </div> |
| <div class="olist arabic"> |
| <ol class="arabic"> |
| <li> |
| <p>If an API function call <strong>X</strong> enqueues a command <strong>C</strong>, then <strong>X</strong> |
| global-synchronizes-with <strong>C</strong>. |
| For example, a host API function to enqueue a kernel |
| global-synchronizes-with the start of that kernel-instance's execution, |
| so that memory updates sequenced-before the enqueue kernel function call |
| will global-happen-before any kernel reads or writes to those same |
| memory locations. |
| For a device-side enqueue, global memory updates sequenced before <strong>X</strong> |
| happen-before <strong>C</strong> reads or writes to those memory locations only in the |
| case of fine-grained SVM.</p> |
| </li> |
| <li> |
| <p>If <strong>E</strong> is an event upon which a command <strong>C</strong> waits, then <strong>E</strong> |
| global-synchronizes-with <strong>C</strong>. |
| In particular, if <strong>C</strong> waits on an event <strong>E</strong> that is tracking the |
| execution status of the command <strong>C1</strong>, then memory operations done by |
| <strong>C1</strong> will global-happen-before memory operations done by <strong>C</strong>. |
| As an example, assume we have an OpenCL program using coarse-grain SVM |
| sharing that enqueues a kernel to a host command-queue to manipulate the |
| contents of a region of a buffer that the host thread then accesses |
| after the kernel completes. |
| To do this, the host thread can call <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> to enqueue a |
| blocking-mode map command to map that buffer region, specifying that the |
| map command must wait on an event signaling the kernel's completion. |
| When <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> returns, any memory operations performed by |
| the kernel to that buffer region will global-happen-before subsequent |
| memory operations made by the host thread.</p> |
| </li> |
| <li> |
| <p>If a command <strong>C</strong> has an event <strong>E</strong> that signals its completion, then <strong>C</strong> |
| global-synchronizes-with <strong>E</strong>.</p> |
| </li> |
| <li> |
| <p>For a command <strong>C</strong> enqueued to a host-side command queue, if <strong>C</strong> has an |
| event <strong>E</strong> that signals its completion, then <strong>E</strong> global-synchronizes-with |
| an API call <strong>X</strong> that waits on <strong>E</strong>. |
| For example, if a host thread or kernel-instance calls the |
| wait-for-events function on <strong>E</strong> (e.g. the <a href="#clWaitForEvents"><strong>clWaitForEvents</strong></a> function |
| called from a host thread), then <strong>E</strong> global-synchronizes-with that |
| wait-for-events function call.</p> |
| </li> |
| <li> |
| <p>If commands <strong>C</strong> and <strong>C1</strong> are enqueued in that sequence onto an in-order |
| command-queue, then the event (including the event implied between <strong>C</strong> |
| and <strong>C1</strong> due to the in-order queue) signaling <strong>C</strong>'s completion |
| global-synchronizes-with <strong>C1</strong>. |
| Note that in OpenCL 2.x, only a host command-queue can be configured as |
| an in-order queue.</p> |
| </li> |
| <li> |
| <p>If an API call enqueues a marker command <strong>C</strong> with an empty list of |
| events upon which <strong>C</strong> should wait, then the events of all commands |
| enqueued prior to <strong>C</strong> in the command-queue global-synchronize-with <strong>C</strong>.</p> |
| </li> |
| <li> |
| <p>If a host API call enqueues a command-queue barrier command <strong>C</strong> with an |
| empty list of events on which <strong>C</strong> should wait, then the events of all |
| commands enqueued prior to <strong>C</strong> in the command-queue |
| global-synchronize-with <strong>C</strong>. |
| In addition, the event signaling the completion of <strong>C</strong> |
| global-synchronizes-with all commands enqueued after <strong>C</strong> in the |
| command-queue.</p> |
| </li> |
| <li> |
| <p>If a host thread executes a <a href="#clFinish"><strong>clFinish</strong></a> call <strong>X</strong>, then the events of all |
| commands enqueued prior to <strong>X</strong> in the command-queue |
| global-synchronize-with <strong>X</strong>.</p> |
| </li> |
| <li> |
| <p>The start of a kernel-instance <strong>K</strong> global-synchronizes-with all |
| operations in the work-items of <strong>K</strong>. |
| Note that this includes the execution of any atomic operations by the |
| work-items in a program using fine-grain SVM.</p> |
| </li> |
| <li> |
| <p>All operations of all work-items of a kernel-instance <strong>K</strong> |
| global-synchronize-with the event signaling the completion of <strong>K</strong>. |
| Note that this also includes the execution of any atomic operations by |
| the work-items in a program using fine-grain SVM.</p> |
| </li> |
| <li> |
| <p>If a callback procedure <strong>P</strong> is registered on an event <strong>E</strong>, then <strong>E</strong> |
| global-synchronizes-with all operations of <strong>P</strong>. |
| Note that callback procedures are only defined for commands within host |
| command-queues.</p> |
| </li> |
| <li> |
| <p>If <strong>C</strong> is a command that waits for an event <strong>E</strong>'s completion, and API |
| function call <strong>X</strong> sets the status of a user event <strong>E</strong> to |
| <a href="#CL_COMPLETE"><code>CL_COMPLETE</code></a> (for example, from a host thread using a |
| <a href="#clSetUserEventStatus"><strong>clSetUserEventStatus</strong></a> function), then <strong>X</strong> global-synchronizes-with <strong>C</strong>.</p> |
| </li> |
| <li> |
| <p>If a device enqueues a command <strong>C</strong> with the |
| <code>CLK_ENQUEUE_FLAGS_WAIT_KERNEL</code> flag, then the end state of the parent |
| kernel instance global-synchronizes-with <strong>C</strong>.</p> |
| </li> |
| <li> |
| <p>If a work-group enqueues a command <strong>C</strong> with the |
| <code>CLK_ENQUEUE_FLAGS_WAIT_WORK_GROUP</code> flag, then the end state of the |
| work-group global-synchronizes-with <strong>C</strong>.</p> |
| </li> |
| </ol> |
| </div> |
| <div class="paragraph"> |
| <p>When using an out-of-order command queue, a wait on an event or a marker or |
| command-queue barrier command can be used to ensure the correct ordering of |
| dependent commands. |
| In those cases, the wait for the event or the marker or barrier command will |
| provide the necessary global-synchronizes-with relation.</p> |
| </div> |
| <div class="paragraph"> |
| <p>In this situation:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>access to shared locations or disjoint locations in a single <code>cl_mem</code> |
| object when using atomic operations from different kernel instances |
| enqueued from the host such that one or more of the atomic operations is |
| a write is implementation-defined and correct behavior is not guaranteed |
| except at synchronization points.</p> |
| </li> |
| <li> |
| <p>access to shared locations or disjoint locations in a single <code>cl_mem</code> |
| object when using atomic operations from different kernel instances |
| consisting of a parent kernel and any number of child kernels enqueued |
| by that kernel is guaranteed under the memory ordering rules described |
| earlier in this section.</p> |
| </li> |
| <li> |
| <p>access to shared locations or disjoint locations in a single program |
| scope global variable, coarse-grained SVM allocation or fine-grained SVM |
| allocation when using atomic operations from different kernel instances |
| enqueued from the host to a single device is guaranteed under the memory |
| ordering rules described earlier in this section.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>If fine-grain SVM is used but without support for the OpenCL 2.x atomic |
| operations, then the host and devices can concurrently read the same memory |
| locations and can concurrently update non-overlapping memory regions, but |
| attempts to update the same memory locations are undefined. |
| Memory consistency is guaranteed at the OpenCL synchronization points |
| without the need for calls to <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> and |
| <a href="#clEnqueueUnmapMemObject"><strong>clEnqueueUnmapMemObject</strong></a>. |
| For fine-grained SVM buffers it is guaranteed that at synchronization points |
| only values written by the kernel will be updated. |
| No writes to fine-grained SVM buffers can be introduced that were not in the |
| original program.</p> |
| </div> |
| <div class="paragraph"> |
| <p>In the remainder of this section, we discuss a few points regarding the |
| ordering rules for commands with a host command queue.</p> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| The OpenCL 1.x specification defines a synchronization point as a |
| kernel-instance or host program location where the contents of memory |
| visible to different work-items or command-queue commands are the same. |
| It also states that waiting on an event and a command-queue barrier are |
| synchronization points between commands in command-queues. |
| Four of the rules listed above (2, 4, 7, and 8) cover these OpenCL |
| synchronization points. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>A map operation (<a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> or <a href="#clEnqueueMapImage"><strong>clEnqueueMapImage</strong></a>) performed on a |
| non-SVM buffer or a coarse-grained SVM buffer is allowed to overwrite the |
| entire target region with the latest runtime view of the data as seen by the |
| command with which the map operation synchronizes, whether the values were |
| written by the executing kernels or not. |
| Any values that were changed within this region by another kernel or host |
| thread while the kernel synchronizing with the map operation was executing |
| may be overwritten by the map operation.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Access to non-SVM <code>cl_mem</code> buffers and coarse-grained SVM allocations is |
| ordered at synchronization points between host commands. |
| In the presence of an out-of-order command queue or a set of command queues |
| mapped to the same device, multiple kernel instances may execute |
| concurrently on the same device.</p> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div class="sect2"> |
| <h3 id="opencl-framework"><a class="anchor" href="#opencl-framework"></a>3.4. The OpenCL Framework</h3> |
| <div class="paragraph"> |
| <p>The OpenCL framework allows applications to use a host and one or more |
| OpenCL devices as a single heterogeneous parallel computer system. |
| The framework contains the following components:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><strong>OpenCL Platform layer</strong>: The platform layer allows the host program to |
| discover OpenCL devices and their capabilities and to create contexts.</p> |
| </li> |
| <li> |
| <p><strong>OpenCL Runtime</strong>: The runtime allows the host program to manipulate |
| contexts once they have been created.</p> |
| </li> |
| <li> |
| <p><strong>OpenCL Compiler</strong>: The OpenCL compiler creates program executables that |
| contain OpenCL kernels. |
| The OpenCL compiler may build program executables from OpenCL C source |
| strings, the SPIR-V intermediate language, or device-specific program |
| binary objects, depending on the capabilities of a device. |
| Other kernel languages or intermediate languages may be supported by |
| some implementations.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="sect3"> |
| <h4 id="_mixed_version_support"><a class="anchor" href="#_mixed_version_support"></a>3.4.1. Mixed Version Support</h4> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Mixed version support is <a href="#unified-spec">missing before</a> version 1.1. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>OpenCL supports devices with different capabilities under a single platform. |
| This includes devices which conform to different versions of the OpenCL |
| specification. |
| There are three version identifiers to consider for an OpenCL system: the |
| platform version, the version of a device, and the version(s) of the kernel |
| language or IL supported on a device.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The platform version indicates the version of the OpenCL runtime that is |
| supported. |
| This includes all of the APIs that the host can use to interact with |
| resources exposed by the OpenCL runtime; including contexts, memory objects, |
| devices, and command queues.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The device version is an indication of the device’s capabilities separate |
| from the runtime and compiler as represented by the device info returned by |
| <a href="#clGetDeviceInfo"><strong>clGetDeviceInfo</strong></a>. |
| Examples of attributes associated with the device version are resource |
| limits (e.g., minimum size of local memory per compute unit) and extended |
| functionality (e.g., list of supported KHR extensions). |
| The version returned corresponds to the highest version of the OpenCL |
| specification for which the device is conformant, but is not higher than the |
| platform version.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The language version for a device represents the OpenCL programming language |
| features a developer can assume are supported on a given device. |
| The version reported is the highest version of the language supported.</p> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="_backwards_compatibility"><a class="anchor" href="#_backwards_compatibility"></a>3.4.2. Backwards Compatibility</h4> |
| <div class="paragraph"> |
| <p>Backwards compatibility is an important goal for the OpenCL standard. |
| Backwards compatibility is expected such that a device will consume earlier |
| versions of the OpenCL C programming language and the SPIR-V intermediate language with the following |
| minimum requirements:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>An OpenCL 1.x device must support at least one 1.x version of the OpenCL C programming language.</p> |
| </li> |
| <li> |
| <p>An OpenCL 2.0 device must support all the requirements of an OpenCL 1.2 device in addition to the OpenCL C 2.0 programming language. |
| If multiple language versions are supported, the compiler defaults to using the OpenCL C 1.2 language version. |
| To utilize the OpenCL 2.0 Kernel programming language, a programmer must specifically pass the appropriate compiler build option (<code>-cl-std=CL2.0</code>). |
| The language version must not be higher than the platform version, but may exceed the <a href="#opencl-c-version">device version</a>.</p> |
| </li> |
| <li> |
| <p>An OpenCL 2.1 device must support all the requirements of an OpenCL 2.0 device in addition to the SPIR-V intermediate language at version 1.0 or above. |
| Intermediate language versioning is encoded as part of the binary object and no flags are required to be passed to the compiler.</p> |
| </li> |
| <li> |
| <p>An OpenCL 2.2 device must support all the requirements of an OpenCL 2.0 device in addition to the SPIR-V intermediate language at version 1.2 or above. |
| Intermediate language versioning is encoded as a part of the binary object and no flags are required to be passed to the compiler.</p> |
| </li> |
| <li> |
| <p>OpenCL 3.0 is designed to enable any OpenCL implementation supporting OpenCL 1.2 or newer to easily support and transition to OpenCL 3.0, by making many features in OpenCL 2.0, 2.1, or 2.2 optional. |
| This means that OpenCL 3.0 is backwards compatible with OpenCL 1.2, but is not necessarily backwards compatible with OpenCL 2.0, 2.1, or 2.2.</p> |
| <div class="paragraph"> |
| <p>An OpenCL 3.0 platform must implement all OpenCL 3.0 APIs, but some APIs may return an error code unconditionally when a feature is not supported by any devices in the platform. |
| Whenever a feature is optional, it will be paired with a query to determine whether the feature is supported. |
| The queries will enable correctly written applications to selectively use all optional features without generating any OpenCL errors, if desired.</p> |
| </div> |
| <div class="paragraph"> |
| <p>OpenCL 3.0 also adds a new version of the OpenCL C programming language, which makes many features in OpenCL C 2.0 optional. |
| The new version of OpenCL C is backwards compatible with OpenCL C 1.2, but is not backwards compatible with OpenCL C 2.0. |
| The new version of OpenCL C must be explicitly requested via the <code>-cl-std=</code> build option, otherwise a program will continue to be compiled using the highest OpenCL C 1.x language version supported for the device.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Whenever an OpenCL C feature is optional in the new version of the OpenCL C programming language, it will be paired with a feature macro, such as <code>__opencl_c_feature_name</code>, and a corresponding API query. |
| If a feature macro is defined then the feature is supported by the OpenCL C compiler, otherwise the optional feature is not supported.</p> |
| </div> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>In order to allow future versions of OpenCL to support new types of |
| devices, minor releases of OpenCL may add new profiles where some |
| features that are currently required for all OpenCL devices become |
| optional. |
| All features that are required for an OpenCL profile will also be |
| required for that profile in subsequent minor releases of OpenCL, |
| thereby guaranteeing backwards compatibility for applications |
| targeting specific profiles. |
| It is therefore strongly recommended that applications |
| <a href="#CL_DEVICE_PROFILE">query the profile</a> supported by the OpenCL device |
| they are running on in order to remain robust to future changes.</p> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="_versioning"><a class="anchor" href="#_versioning"></a>3.4.3. Versioning</h4> |
| <div class="paragraph"> |
| <p>The OpenCL specification is regularly updated with bug fixes and clarifications. |
| Occasionally new functionality is added to the core and extensions. In order to |
| indicate to developers how and when these changes are made to the specification, |
| and to provide a way to identify each set of changes, the OpenCL API, C language, |
| intermediate languages and extensions maintain a version number. Built-in kernels |
| are also versioned.</p> |
| </div> |
| <div class="sect4"> |
| <h5 id="_versions"><a class="anchor" href="#_versions"></a>3.4.3.1. Versions</h5> |
| <div class="paragraph"> |
| <p>A version number comprises three logical fields:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>The <em>major</em> version indicates a significant change. Backwards compatibility may |
| break across major versions.</p> |
| </li> |
| <li> |
| <p>The <em>minor</em> version indicates the addition of new functionality with backwards |
| compatibility for any existing profiles.</p> |
| </li> |
| <li> |
| <p>The <em>patch</em> version indicates bug fixes, clarifications and general improvements.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Version numbers are represented using the <code>cl_version</code> type, which is an alias for |
| a 32-bit integer. The fields are packed as follows:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>The <em>major</em> version is a 10-bit integer packed into bits 31-22.</p> |
| </li> |
| <li> |
| <p>The <em>minor</em> version is a 10-bit integer packed into bits 21-12.</p> |
| </li> |
| <li> |
| <p>The <em>patch</em> version is a 12-bit integer packed into bits 11-0.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>This enables versions to be ordered using standard C/C++ operators.</p> |
| </div> |
| <div class="paragraph"> |
| <p>A number of convenience macros are provided by the OpenCL Headers to make |
| working with version numbers easier.</p> |
| </div> |
| <div class="paragraph"> |
| <p><code>CL_VERSION_MAJOR</code> extracts the <em>major</em> version from a packed <code>cl_version</code>.<br> |
| <code>CL_VERSION_MINOR</code> extracts the <em>minor</em> version from a packed <code>cl_version</code>.<br> |
| <code>CL_VERSION_PATCH</code> extracts the <em>patch</em> version from a packed <code>cl_version</code>.<br> |
| <code>CL_MAKE_VERSION</code> returns a packed <code>cl_version</code> from a <em>major</em>, <em>minor</em> and |
| <em>patch</em> version.</p> |
| </div> |
| <div class="paragraph"> |
| <p>These are defined as follows:</p> |
| </div> |
| <div class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c"><span class="keyword">typedef</span> cl_uint cl_version; |
| |
| <span class="preprocessor">#define</span> CL_VERSION_MAJOR_BITS (<span class="integer">10</span>) |
| <span class="preprocessor">#define</span> CL_VERSION_MINOR_BITS (<span class="integer">10</span>) |
| <span class="preprocessor">#define</span> CL_VERSION_PATCH_BITS (<span class="integer">12</span>) |
| |
| <span class="preprocessor">#define</span> CL_VERSION_MAJOR_MASK ((<span class="integer">1</span> << CL_VERSION_MAJOR_BITS) - <span class="integer">1</span>) |
| <span class="preprocessor">#define</span> CL_VERSION_MINOR_MASK ((<span class="integer">1</span> << CL_VERSION_MINOR_BITS) - <span class="integer">1</span>) |
| <span class="preprocessor">#define</span> CL_VERSION_PATCH_MASK ((<span class="integer">1</span> << CL_VERSION_PATCH_BITS) - <span class="integer">1</span>) |
| |
| <span class="preprocessor">#define</span> CL_VERSION_MAJOR(version) \ |
| ((version) >> (CL_VERSION_MINOR_BITS + CL_VERSION_PATCH_BITS)) |
| |
| <span class="preprocessor">#define</span> CL_VERSION_MINOR(version) \ |
| (((version) >> CL_VERSION_PATCH_BITS) & CL_VERSION_MINOR_MASK) |
| |
| <span class="preprocessor">#define</span> CL_VERSION_PATCH(version) ((version) & CL_VERSION_PATCH_MASK) |
| |
| <span class="preprocessor">#define</span> CL_MAKE_VERSION(major, minor, patch) \ |
| ((((major)& CL_VERSION_MAJOR_MASK) << \ |
| (CL_VERSION_MINOR_BITS + CL_VERSION_PATCH_BITS)) | \ |
| (((minor)& CL_VERSION_MINOR_MASK) << \ |
| CL_VERSION_PATCH_BITS) | \ |
| ((patch) & CL_VERSION_PATCH_MASK))</code></pre> |
| </div> |
| </div> |
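| <div class="paragraph"> |
| <p>As an illustration of the packing scheme, the following minimal sketch compares two packed versions with an ordinary relational operator and formats a packed version as text. |
| It repeats the macro definitions from the listing above so that it builds without the OpenCL headers. |
| The use of <code>uint32_t</code> as a stand-in for <code>cl_uint</code> and the helper names <code>version_at_least</code> and <code>format_version</code> are assumptions made for this sketch and are not part of the OpenCL API.</p> |
| </div> |
| <div class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c">#include <stdint.h> |
| #include <stdio.h> |
| |
| /* Assumption: stand-in for cl_uint so the sketch builds without CL headers. */ |
| typedef uint32_t cl_version; |
| |
| #define CL_VERSION_MAJOR_BITS (10) |
| #define CL_VERSION_MINOR_BITS (10) |
| #define CL_VERSION_PATCH_BITS (12) |
| |
| #define CL_VERSION_MAJOR_MASK ((1 << CL_VERSION_MAJOR_BITS) - 1) |
| #define CL_VERSION_MINOR_MASK ((1 << CL_VERSION_MINOR_BITS) - 1) |
| #define CL_VERSION_PATCH_MASK ((1 << CL_VERSION_PATCH_BITS) - 1) |
| |
| #define CL_VERSION_MAJOR(version) \ |
|     ((version) >> (CL_VERSION_MINOR_BITS + CL_VERSION_PATCH_BITS)) |
| #define CL_VERSION_MINOR(version) \ |
|     (((version) >> CL_VERSION_PATCH_BITS) & CL_VERSION_MINOR_MASK) |
| #define CL_VERSION_PATCH(version) ((version) & CL_VERSION_PATCH_MASK) |
| |
| #define CL_MAKE_VERSION(major, minor, patch) \ |
|     ((((major) & CL_VERSION_MAJOR_MASK) << \ |
|       (CL_VERSION_MINOR_BITS + CL_VERSION_PATCH_BITS)) | \ |
|      (((minor) & CL_VERSION_MINOR_MASK) << CL_VERSION_PATCH_BITS) | \ |
|      ((patch) & CL_VERSION_PATCH_MASK)) |
| |
| /* Hypothetical helper: nonzero when `have` is the same as or newer than `want`. |
|    The major field occupies the most significant bits, so integer order |
|    matches (major, minor, patch) order. */ |
| static int version_at_least(cl_version have, cl_version want) |
| { |
|     return have >= want; |
| } |
| |
| /* Hypothetical helper: writes "major.minor.patch" into buf and returns |
|    the value returned by snprintf. */ |
| static int format_version(cl_version v, char *buf, size_t size) |
| { |
|     return snprintf(buf, size, "%u.%u.%u", |
|                     (unsigned)CL_VERSION_MAJOR(v), |
|                     (unsigned)CL_VERSION_MINOR(v), |
|                     (unsigned)CL_VERSION_PATCH(v)); |
| }</code></pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>For example, <code>version_at_least(CL_MAKE_VERSION(3, 0, 0), CL_MAKE_VERSION(1, 2, 0))</code> evaluates to a nonzero value, because the packed value for 3.0.0 is numerically greater than the packed value for 1.2.0.</p> |
| </div> |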
| </div> |
| <div class="sect4"> |
| <h5 id="_version_name_pairing"><a class="anchor" href="#_version_name_pairing"></a>3.4.3.2. Version name pairing</h5> |
| <div class="paragraph"> |
| <p>It is sometimes necessary to associate a version with the entity it applies to |
| (e.g., an extension or a built-in kernel). This is done using a dedicated |
| <a href="#cl_name_version"><code>cl_name_<wbr>version</code></a> structure, defined as follows:</p> |
| </div> |
| <div id="cl_name_version" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++"><span class="keyword">typedef</span> <span class="keyword">struct</span> cl_name_version { |
| cl_version version; |
| <span class="predefined-type">char</span> name[CL_NAME_VERSION_MAX_NAME_SIZE]; |
| } cl_name_version;</code></pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>The <code>name</code> field is an array of <code>CL_NAME_VERSION_MAX_NAME_SIZE</code> bytes used as |
| storage for a NUL-terminated string whose maximum length is therefore |
| <code>CL_NAME_VERSION_MAX_NAME_SIZE - 1</code>.</p> |
| </div> |
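| <div class="paragraph"> |
| <p>The length limit on <code>name</code> can be made concrete with a small sketch that fills in the structure and rejects names that do not fit. |
| The type and macro definitions are local stand-ins so that the sketch builds without the OpenCL headers; the value 64 used for <code>CL_NAME_VERSION_MAX_NAME_SIZE</code> and the helper name <code>set_name_version</code> are assumptions of this sketch, not definitions taken from this section.</p> |
| </div> |
| <div class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c">#include <stdint.h> |
| #include <string.h> |
| |
| /* Assumptions: local stand-ins for the OpenCL header definitions. */ |
| typedef uint32_t cl_version; |
| #define CL_NAME_VERSION_MAX_NAME_SIZE 64 |
| |
| typedef struct cl_name_version { |
|     cl_version version; |
|     char name[CL_NAME_VERSION_MAX_NAME_SIZE]; |
| } cl_name_version; |
| |
| /* Hypothetical helper: stores `name` and `version` in `out`. |
|    Returns 0 on success, or -1 if `name` plus its NUL terminator |
|    does not fit into the fixed-size name field. */ |
| static int set_name_version(cl_name_version *out, const char *name, |
|                             cl_version version) |
| { |
|     size_t len = strlen(name); |
| |
|     if (len > CL_NAME_VERSION_MAX_NAME_SIZE - 1) |
|         return -1; |
|     memcpy(out->name, name, len + 1); /* copy includes the NUL terminator */ |
|     out->version = version; |
|     return 0; |
| }</code></pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Checking the length before copying, rather than truncating silently, keeps a truncated name from being mistaken for a different, shorter extension or kernel name.</p> |
| </div> |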
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div class="sect1"> |
| <h2 id="opencl-platform-layer"><a class="anchor" href="#opencl-platform-layer"></a>4. The OpenCL Platform Layer</h2> |
| <div class="sectionbody"> |
| <div class="paragraph"> |
| <p>This section describes the OpenCL platform layer which implements |
| platform-specific features that allow applications to query OpenCL devices, |
| device configuration information, and to create OpenCL contexts using one or |
| more devices.</p> |
| </div> |
| <div class="sect2"> |
| <h3 id="_querying_platform_info"><a class="anchor" href="#_querying_platform_info"></a>4.1. Querying Platform Info</h3> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>The list of platforms available can be obtained with the function:</p> |
| </div> |
| <div id="clGetPlatformIDs" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clGetPlatformIDs( |
| cl_uint num_entries, |
| cl_platform_id* platforms, |
| cl_uint* num_platforms);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>num_entries</em> is the number of <code>cl_platform_<wbr>id</code> entries that can be added to |
| <em>platforms</em>. |
| If <em>platforms</em> is not <code>NULL</code>, then <em>num_entries</em> must be greater than zero.</p> |
| </li> |
| <li> |
| <p><em>platforms</em> returns a list of OpenCL platforms found. |
| The <code>cl_platform_<wbr>id</code> values returned in <em>platforms</em> can be used to identify a |
| specific OpenCL platform. |
| If <em>platforms</em> is <code>NULL</code>, this argument is ignored. |
| The number of OpenCL platforms returned is the minimum of the value |
| specified by <em>num_entries</em> and the number of OpenCL platforms available.</p> |
| </li> |
| <li> |
| <p><em>num_platforms</em> returns the number of OpenCL platforms available. |
| If <em>num_platforms</em> is <code>NULL</code>, this argument is ignored.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clGetPlatformIDs"><strong>clGetPlatformIDs</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>num_entries</em> is equal to zero and <em>platforms</em> is |
| not <code>NULL</code> or if both <em>num_platforms</em> and <em>platforms</em> are <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>Specific information about an OpenCL platform can be obtained with |
| the function:</p> |
| </div> |
| <div id="clGetPlatformInfo" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clGetPlatformInfo( |
| cl_platform_id platform, |
| cl_platform_info param_name, |
| size_t param_value_size, |
| <span class="directive">void</span>* param_value, |
| size_t* param_value_size_ret);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>platform</em> refers to the platform ID returned by <a href="#clGetPlatformIDs"><strong>clGetPlatformIDs</strong></a> or can |
| be <code>NULL</code>. |
| If <em>platform</em> is <code>NULL</code>, the behavior is implementation-defined.</p> |
| </li> |
| <li> |
| <p><em>param_name</em> is an enumeration constant that identifies the platform |
| information being queried. |
| It can be one of the following values as specified in the |
| <a href="#platform-queries-table">Platform Queries</a> table.</p> |
| </li> |
| <li> |
| <p><em>param_value</em> is a pointer to the memory location where the appropriate values for a |
| given <em>param_name</em>, as specified in the <a href="#platform-queries-table">Platform |
| Queries</a> table, will be returned. |
| If <em>param_value</em> is <code>NULL</code>, it is ignored.</p> |
| </li> |
| <li> |
| <p><em>param_value_size</em> specifies the size in bytes of memory pointed to by |
| <em>param_value</em>. |
| This size in bytes must be ≥ the size of the return type specified in the |
| <a href="#platform-queries-table">Platform Queries</a> table.</p> |
| </li> |
| <li> |
| <p><em>param_value_size_ret</em> returns the actual size in bytes of data being |
| queried by <em>param_name</em>. |
| If <em>param_value_size_ret</em> is <code>NULL</code>, it is ignored.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>The information that can be queried using <a href="#clGetPlatformInfo"><strong>clGetPlatformInfo</strong></a> is specified |
| in the <a href="#platform-queries-table">Platform Queries</a> table.</p> |
| </div> |
| <table id="platform-queries-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 3. List of supported param_names by <a href="#clGetPlatformInfo">clGetPlatformInfo</a></caption> |
| <colgroup> |
| <col style="width: 33%;"> |
| <col style="width: 17%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Platform Info</th> |
| <th class="tableblock halign-left valign-top">Return Type</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_PLATFORM_PROFILE"></a><a href="#CL_PLATFORM_PROFILE"><code>CL_PLATFORM_<wbr>PROFILE</code></a> <sup class="footnote">[<a id="_footnoteref_2" class="footnote" href="#_footnotedef_2" title="View footnote.">2</a>]</sup></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[] <sup class="footnote">[<a id="_footnoteref_3" class="footnote" href="#_footnotedef_3" title="View footnote.">3</a>]</sup></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">OpenCL profile string. |
| Returns the profile name supported by the implementation. |
| The profile name returned can be one of the following strings:</p> |
| <p class="tableblock"> FULL_PROFILE - if the implementation supports the OpenCL |
| specification (functionality defined as part of the core |
| specification and does not require any extensions to be supported).</p> |
| <p class="tableblock"> EMBEDDED_PROFILE - if the implementation supports the OpenCL |
| embedded profile. |
| The embedded profile is defined to be a subset for each version of |
| OpenCL. |
| The embedded profile for OpenCL is described in |
| <a href="#opencl-embedded-profile">OpenCL Embedded Profile</a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_PLATFORM_VERSION"></a><a href="#CL_PLATFORM_VERSION"><code>CL_PLATFORM_<wbr>VERSION</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">OpenCL version string. |
| Returns the OpenCL version supported by the implementation. |
| This version string has the following format:</p> |
| <p class="tableblock"> <em>OpenCL<space><major_version.minor_version><space><platform-specific |
| information></em></p> |
| <p class="tableblock"> The <em>major_version.minor_version</em> value returned will be one of 1.0, |
| 1.1, 1.2, 2.0, 2.1, 2.2 or 3.0.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_PLATFORM_NUMERIC_VERSION"></a><a href="#CL_PLATFORM_NUMERIC_VERSION"><code>CL_PLATFORM_<wbr>NUMERIC_<wbr>VERSION</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_version</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the detailed (major, minor, patch) version supported by the |
| platform. The major and minor version numbers returned must match |
| those returned via <a href="#CL_PLATFORM_VERSION"><code>CL_PLATFORM_<wbr>VERSION</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_PLATFORM_NAME"></a><a href="#CL_PLATFORM_NAME"><code>CL_PLATFORM_<wbr>NAME</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Platform name string.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_PLATFORM_VENDOR"></a><a href="#CL_PLATFORM_VENDOR"><code>CL_PLATFORM_<wbr>VENDOR</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Platform vendor string.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_PLATFORM_EXTENSIONS"></a><a href="#CL_PLATFORM_EXTENSIONS"><code>CL_PLATFORM_<wbr>EXTENSIONS</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns a space separated list of extension names (the extension |
| names themselves do not contain any spaces) supported by the |
| platform. |
| Each extension that is supported by all devices associated with this |
| platform must be reported here.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_PLATFORM_EXTENSIONS_WITH_VERSION"></a><a href="#CL_PLATFORM_EXTENSIONS_WITH_VERSION"><code>CL_PLATFORM_<wbr>EXTENSIONS_<wbr>WITH_<wbr>VERSION</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#cl_name_version"><code>cl_name_<wbr>version</code></a>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns an array of description (name and version) structures that lists |
| all the extensions supported by the platform. The same extension name |
| must not be reported more than once. The list of extensions reported |
| must match the list reported via <a href="#CL_PLATFORM_EXTENSIONS"><code>CL_PLATFORM_<wbr>EXTENSIONS</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_PLATFORM_HOST_TIMER_RESOLUTION"></a><a href="#CL_PLATFORM_HOST_TIMER_RESOLUTION"><code>CL_PLATFORM_<wbr>HOST_<wbr>TIMER_<wbr>RESOLUTION</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.1.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_ulong</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the resolution of the host timer in nanoseconds as used by |
| <a href="#clGetDeviceAndHostTimer"><strong>clGetDeviceAndHostTimer</strong></a>.</p> |
| <p class="tableblock"> Support for device and host timer synchronization is required for |
| platforms supporting OpenCL 2.1 or 2.2. |
| This value must be 0 for devices that do not support device and |
| host timer synchronization.</p></td> |
| </tr> |
| </tbody> |
| </table> |
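| <div class="paragraph"> |
| <p>The string formats described in the table above can be handled along the following lines. |
| This is a minimal sketch that assumes the <a href="#CL_PLATFORM_VERSION"><code>CL_PLATFORM_<wbr>VERSION</code></a> and <a href="#CL_PLATFORM_EXTENSIONS"><code>CL_PLATFORM_<wbr>EXTENSIONS</code></a> strings have already been retrieved with <a href="#clGetPlatformInfo"><strong>clGetPlatformInfo</strong></a>. |
| The helper names <code>parse_platform_version</code> and <code>has_extension</code> are hypothetical and are not part of the OpenCL API.</p> |
| </div> |
| <div class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c">#include <stdio.h> |
| #include <string.h> |
| |
| /* Hypothetical helper: extracts major and minor from a string of the form |
|    "OpenCL <major>.<minor> <platform-specific information>". |
|    Returns 1 on success and 0 if the string does not match. */ |
| static int parse_platform_version(const char *s, unsigned *major, |
|                                   unsigned *minor) |
| { |
|     return sscanf(s, "OpenCL %u.%u", major, minor) == 2; |
| } |
| |
| /* Hypothetical helper: returns 1 if `name` appears as a complete, |
|    space-delimited token in `list`, and 0 otherwise. A plain substring |
|    search is not enough, because one extension name may be a prefix |
|    of another. */ |
| static int has_extension(const char *list, const char *name) |
| { |
|     size_t len = strlen(name); |
|     const char *p = list; |
| |
|     if (len == 0) |
|         return 0; |
|     while ((p = strstr(p, name)) != NULL) { |
|         int starts_token = (p == list) || (p[-1] == ' '); |
|         int ends_token = (p[len] == ' ') || (p[len] == '\0'); |
|         if (starts_token && ends_token) |
|             return 1; |
|         p += len; |
|     } |
|     return 0; |
| }</code></pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>On platforms that report OpenCL 3.0, the numeric queries <a href="#CL_PLATFORM_NUMERIC_VERSION"><code>CL_PLATFORM_<wbr>NUMERIC_<wbr>VERSION</code></a> and <a href="#CL_PLATFORM_EXTENSIONS_WITH_VERSION"><code>CL_PLATFORM_<wbr>EXTENSIONS_<wbr>WITH_<wbr>VERSION</code></a> return the same information in structured form, which avoids string parsing.</p> |
| </div> |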
| <div class="paragraph"> |
| <p><a href="#clGetPlatformInfo"><strong>clGetPlatformInfo</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following |
| errors <sup class="footnote">[<a id="_footnoteref_4" class="footnote" href="#_footnotedef_4" title="View footnote.">4</a>]</sup>.</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_PLATFORM"><code>CL_INVALID_<wbr>PLATFORM</code></a> if <em>platform</em> is not a valid platform.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>param_name</em> is not one of the supported values or |
| if the size in bytes specified by <em>param_value_size</em> is less than the size of the return |
| type as specified in the <a href="#platform-queries-table">OpenCL Platform |
| Queries</a> table, and <em>param_value</em> is not a <code>NULL</code> value.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div class="sect2"> |
| <h3 id="platform-querying-devices"><a class="anchor" href="#platform-querying-devices"></a>4.2. Querying Devices</h3> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>The list of devices available on a platform can be obtained using the |
| function <sup class="footnote">[<a id="_footnoteref_5" class="footnote" href="#_footnotedef_5" title="View footnote.">5</a>]</sup>:</p> |
| </div> |
| <div id="clGetDeviceIDs" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clGetDeviceIDs( |
| cl_platform_id platform, |
| cl_device_type device_type, |
| cl_uint num_entries, |
| cl_device_id* devices, |
| cl_uint* num_devices);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>platform</em> refers to the platform ID returned by <a href="#clGetPlatformIDs"><strong>clGetPlatformIDs</strong></a> or can |
| be <code>NULL</code>. |
| If <em>platform</em> is <code>NULL</code>, the behavior is implementation-defined.</p> |
| </li> |
| <li> |
| <p><em>device_type</em> is a bitfield that identifies the type of OpenCL device. |
| The <em>device_type</em> can be used to query specific OpenCL devices or all OpenCL |
| devices available. |
| The valid values for <em>device_type</em> are specified in the |
| <a href="#device-types-table">Device Types</a> table.</p> |
| </li> |
| <li> |
| <p><em>num_entries</em> is the number of <code>cl_device_<wbr>id</code> entries that can be added to |
| <em>devices</em>. |
| If <em>devices</em> is not <code>NULL</code>, then <em>num_entries</em> must be greater than zero.</p> |
| </li> |
| <li> |
| <p><em>devices</em> returns a list of OpenCL devices found. |
| The <code>cl_device_<wbr>id</code> values returned in <em>devices</em> can be used to identify a |
| specific OpenCL device. |
| If <em>devices</em> is <code>NULL</code>, this argument is ignored. |
| The number of OpenCL devices returned is the minimum of the value specified |
| by <em>num_entries</em> and the number of OpenCL devices whose type matches |
| <em>device_type</em>.</p> |
| </li> |
| <li> |
| <p><em>num_devices</em> returns the number of OpenCL devices available that match |
| <em>device_type</em>. |
| If <em>num_devices</em> is <code>NULL</code>, this argument is ignored.</p> |
| </li> |
| </ul> |
| </div> |
| <table id="device-types-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 4. List of supported device_types by <a href="#clGetDeviceIDs">clGetDeviceIDs</a></caption> |
| <colgroup> |
| <col style="width: 50%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Device Type</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_TYPE_CPU"></a><a href="#CL_DEVICE_TYPE_CPU"><code>CL_DEVICE_<wbr>TYPE_<wbr>CPU</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">An OpenCL device similar to a traditional CPU (Central Processing Unit). |
| The host processor that executes OpenCL host code may also be considered |
| a CPU OpenCL device.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_TYPE_GPU"></a><a href="#CL_DEVICE_TYPE_GPU"><code>CL_DEVICE_<wbr>TYPE_<wbr>GPU</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">An OpenCL device similar to a GPU (Graphics Processing Unit). |
| Many systems include a dedicated processor for graphics or rendering |
| that may be considered a GPU OpenCL device.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_TYPE_ACCELERATOR"></a><a href="#CL_DEVICE_TYPE_ACCELERATOR"><code>CL_DEVICE_<wbr>TYPE_<wbr>ACCELERATOR</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Dedicated devices that may accelerate OpenCL programs, such as FPGAs |
| (Field Programmable Gate Arrays), DSPs (Digital Signal Processors), or |
| AI (Artificial Intelligence) processors.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_TYPE_CUSTOM"></a><a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Specialized devices that implement some of the OpenCL runtime APIs but |
| do not support all required OpenCL functionality.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_TYPE_DEFAULT"></a><a href="#CL_DEVICE_TYPE_DEFAULT"><code>CL_DEVICE_<wbr>TYPE_<wbr>DEFAULT</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The default OpenCL device in the platform. |
| The default OpenCL device must not be a <a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a> device.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_TYPE_ALL"></a><a href="#CL_DEVICE_TYPE_ALL"><code>CL_DEVICE_<wbr>TYPE_<wbr>ALL</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">All OpenCL devices available in the platform, except for |
| <a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a> devices.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="paragraph"> |
| <p>The device type is purely informational and has no semantic meaning.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Some devices may be more than one type. |
| For example, a <a href="#CL_DEVICE_TYPE_CPU"><code>CL_DEVICE_<wbr>TYPE_<wbr>CPU</code></a> device may also be a |
| <a href="#CL_DEVICE_TYPE_GPU"><code>CL_DEVICE_<wbr>TYPE_<wbr>GPU</code></a> device, or a <a href="#CL_DEVICE_TYPE_ACCELERATOR"><code>CL_DEVICE_<wbr>TYPE_<wbr>ACCELERATOR</code></a> device |
| may also be some other, more descriptive device type. |
| <a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a> devices must not be combined with any other |
| device types.</p> |
| </div> |
| <div class="paragraph"> |
| <p>One device in the platform should be a <a href="#CL_DEVICE_TYPE_DEFAULT"><code>CL_DEVICE_<wbr>TYPE_<wbr>DEFAULT</code></a> device. |
| The default device should also be a more specific device type, such |
| as <a href="#CL_DEVICE_TYPE_CPU"><code>CL_DEVICE_<wbr>TYPE_<wbr>CPU</code></a> or <a href="#CL_DEVICE_TYPE_GPU"><code>CL_DEVICE_<wbr>TYPE_<wbr>GPU</code></a>.</p> |
| </div> |
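| <div class="paragraph"> |
| <p>As a minimal sketch of the rules above, the following C fragment checks a device-type bit-field for the one combination that is not permitted and lists the type bits that are set. |
| The <code>dev_type_t</code> typedef, the <code>DEV_TYPE_*</code> macros and both helper functions are hypothetical stand-ins introduced for illustration, so that the fragment builds without the OpenCL headers; an application would use <code>cl_device_type</code> and the <code>CL_DEVICE_TYPE_*</code> constants instead.</p> |
| </div> |

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical local stand-ins for the cl_device_type bits in CL/cl.h,
   so that this sketch builds without an OpenCL SDK. */
typedef uint64_t dev_type_t;
#define DEV_TYPE_DEFAULT     ((dev_type_t)1 << 0)
#define DEV_TYPE_CPU         ((dev_type_t)1 << 1)
#define DEV_TYPE_GPU         ((dev_type_t)1 << 2)
#define DEV_TYPE_ACCELERATOR ((dev_type_t)1 << 3)
#define DEV_TYPE_CUSTOM      ((dev_type_t)1 << 4)

/* Returns 1 if the combination obeys the rule that a custom device
   carries no other type bit, and 0 otherwise. */
int dev_type_combination_ok(dev_type_t type)
{
    return !((type & DEV_TYPE_CUSTOM) != 0 && (type & ~DEV_TYPE_CUSTOM) != 0);
}

/* Writes a space-separated list of the type names set in `type` into buf
   and returns how many names were written. */
int dev_type_describe(dev_type_t type, char *buf, size_t size)
{
    static const struct { dev_type_t bit; const char *name; } names[] = {
        { DEV_TYPE_DEFAULT,     "DEFAULT" },
        { DEV_TYPE_CPU,         "CPU" },
        { DEV_TYPE_GPU,         "GPU" },
        { DEV_TYPE_ACCELERATOR, "ACCELERATOR" },
        { DEV_TYPE_CUSTOM,      "CUSTOM" },
    };
    size_t used = 0;
    int count = 0;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof names / sizeof names[0]; ++i) {
        if ((type & names[i].bit) == 0)
            continue;
        int n = snprintf(buf + used, size - used, "%s%s",
                         count ? " " : "", names[i].name);
        if (n < 0 || (size_t)n >= size - used)
            break;                      /* output truncated, stop here */
        used += (size_t)n;
        ++count;
    }
    return count;
}
```

| <div class="paragraph"> |
| <p>For example, a value with the default and GPU bits set passes <code>dev_type_combination_ok</code>, while the custom bit combined with any other bit does not.</p> |
| </div> |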
| <div class="paragraph"> |
| <p><a href="#clGetDeviceIDs"><strong>clGetDeviceIDs</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_PLATFORM"><code>CL_INVALID_<wbr>PLATFORM</code></a> if <em>platform</em> is not a valid platform.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE_TYPE"><code>CL_INVALID_<wbr>DEVICE_<wbr>TYPE</code></a> if <em>device_type</em> is not a valid value.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>num_entries</em> is equal to zero and <em>devices</em> is not |
| <code>NULL</code> or if both <em>num_devices</em> and <em>devices</em> are <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_DEVICE_NOT_FOUND"><code>CL_DEVICE_<wbr>NOT_<wbr>FOUND</code></a> if no OpenCL devices that matched <em>device_type</em> were |
| found.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
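| <div class="paragraph"> |
| <p>The argument rules above lead to a common two-call pattern: first ask only for the device count, then allocate storage and fetch the handles. |
| The sketch below illustrates that pattern against a hypothetical stand-in, <code>sk_get_device_ids</code>, which mimics only the <a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> conditions over a made-up pool of three integer handles and omits the <em>platform</em> and <em>device_type</em> arguments. |
| All <code>sk_</code> and <code>SK_</code> names are assumptions made for illustration; real code would call <a href="#clGetDeviceIDs"><strong>clGetDeviceIDs</strong></a> with <code>cl_device_id</code> handles and check for the other errors listed above as well.</p> |
| </div> |

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Error codes with the same numeric values as CL_SUCCESS and
   CL_INVALID_VALUE in CL/cl.h. */
enum { SK_SUCCESS = 0, SK_INVALID_VALUE = -30 };

/* Hypothetical stand-in for clGetDeviceIDs over a fixed pool of three
   fake devices. It mimics only the argument rules for num_entries,
   devices and num_devices. */
int32_t sk_get_device_ids(uint32_t num_entries, int *devices,
                          uint32_t *num_devices)
{
    static const int pool[] = { 101, 102, 103 };    /* fake handles */
    const uint32_t available = 3;

    if ((num_entries == 0 && devices != NULL) ||
        (num_devices == NULL && devices == NULL))
        return SK_INVALID_VALUE;
    if (devices != NULL)
        for (uint32_t i = 0; i < num_entries && i < available; ++i)
            devices[i] = pool[i];
    if (num_devices != NULL)
        *num_devices = available;
    return SK_SUCCESS;
}

/* Two-call pattern: query the count, allocate, then fetch the handles.
   Returns a malloc'ed array that the caller frees, or NULL on error. */
int *sk_list_devices(uint32_t *count)
{
    uint32_t n = 0;

    assert(count != NULL);
    *count = 0;
    if (sk_get_device_ids(0, NULL, &n) != SK_SUCCESS || n == 0)
        return NULL;
    int *ids = malloc(n * sizeof *ids);
    if (ids == NULL)
        return NULL;
    if (sk_get_device_ids(n, ids, NULL) != SK_SUCCESS) {
        free(ids);
        return NULL;
    }
    *count = n;
    return ids;
}
```

| <div class="paragraph"> |
| <p>Querying the count first avoids guessing a fixed upper bound for <em>num_entries</em>, at the cost of one extra call.</p> |
| </div> |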
| <div class="paragraph"> |
| <p>The application can query specific capabilities of the OpenCL device(s) |
| returned by <a href="#clGetDeviceIDs"><strong>clGetDeviceIDs</strong></a>. |
| This can be used by the application to determine which device(s) to use.</p> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To get specific information about an OpenCL device, call the function:</p> |
| </div> |
| <div id="clGetDeviceInfo" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clGetDeviceInfo( |
| cl_device_id device, |
| cl_device_info param_name, |
| size_t param_value_size, |
| <span class="directive">void</span>* param_value, |
| size_t* param_value_size_ret);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>device</em> may be a device returned by <a href="#clGetDeviceIDs"><strong>clGetDeviceIDs</strong></a> or a sub-device |
| created by <a href="#clCreateSubDevices"><strong>clCreateSubDevices</strong></a>. |
| If <em>device</em> is a sub-device, the specific information for the sub-device |
| will be returned. |
| The information that can be queried using <a href="#clGetDeviceInfo"><strong>clGetDeviceInfo</strong></a> is specified in |
| the <a href="#device-queries-table">Device Queries</a> table.</p> |
| </li> |
| <li> |
| <p><em>param_name</em> is an enumeration constant that identifies the device |
| information being queried. |
| It can be one of the following values as specified in the |
| <a href="#device-queries-table">Device Queries</a> table.</p> |
| </li> |
| <li> |
| <p><em>param_value</em> is a pointer to the memory location where the appropriate values for a |
| given <em>param_name</em>, as specified in the <a href="#device-queries-table">Device |
| Queries</a> table, will be returned. |
| If <em>param_value</em> is <code>NULL</code>, it is ignored.</p> |
| </li> |
| <li> |
| <p><em>param_value_size</em> specifies the size in bytes of memory pointed to by |
| <em>param_value</em>. |
| This size in bytes must be ≥ the size of the return type specified in the |
| <a href="#device-queries-table">Device Queries</a> table.</p> |
| </li> |
| <li> |
| <p><em>param_value_size_ret</em> returns the actual size in bytes of data being |
| queried by <em>param_name</em>. |
| If <em>param_value_size_ret</em> is <code>NULL</code>, it is ignored.</p> |
| </li> |
| </ul> |
| </div> |
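| <div class="paragraph"> |
| <p>For queries whose result has a variable length, such as a <code>char</code>[] value, the parameters above allow a size-then-fetch pattern: pass <code>NULL</code> for <em>param_value</em> to obtain the size through <em>param_value_size_ret</em>, then call again with a buffer of that size. |
| The following is a sketch under stated assumptions, not a definitive implementation: <code>demo_get_info</code> is a hypothetical stand-in for a single string query, its result string is made up, and the <em>device</em> and <em>param_name</em> arguments are omitted so that the fragment builds without the OpenCL headers.</p> |
| </div> |

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Error codes with the same numeric values as CL_SUCCESS and
   CL_INVALID_VALUE in CL/cl.h. */
enum { DEMO_SUCCESS = 0, DEMO_INVALID_VALUE = -30 };

/* Hypothetical stand-in for one clGetDeviceInfo string query. */
int32_t demo_get_info(size_t param_value_size, void *param_value,
                      size_t *param_value_size_ret)
{
    static const char value[] = "SPIR-V_1.0";       /* made-up result */

    if (param_value != NULL) {
        if (param_value_size < sizeof value)
            return DEMO_INVALID_VALUE;              /* buffer too small */
        memcpy(param_value, value, sizeof value);
    }
    if (param_value_size_ret != NULL)
        *param_value_size_ret = sizeof value;       /* counts the '\0' */
    return DEMO_SUCCESS;
}

/* Size-then-fetch pattern for a variable-length result.
   Returns a malloc'ed string that the caller frees, or NULL on error. */
char *demo_query_string(void)
{
    size_t size = 0;

    if (demo_get_info(0, NULL, &size) != DEMO_SUCCESS || size == 0)
        return NULL;
    char *text = malloc(size);
    if (text == NULL)
        return NULL;
    if (demo_get_info(size, text, NULL) != DEMO_SUCCESS) {
        free(text);
        return NULL;
    }
    return text;
}
```

| <div class="paragraph"> |
| <p>For fixed-size results such as a <code>cl_uint</code>, a single call with <em>param_value_size</em> set to the size of the return type is sufficient.</p> |
| </div> |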
| <div class="paragraph"> |
| <p>The device queries described in the <a href="#device-queries-table">Device Queries</a> |
| table should return the same information for a root-level device (that is, a |
| device returned by <a href="#clGetDeviceIDs"><strong>clGetDeviceIDs</strong></a>) and for any sub-devices created from this |
| device, except for the following queries:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_DEVICE_GLOBAL_MEM_CACHE_SIZE"><code>CL_DEVICE_<wbr>GLOBAL_<wbr>MEM_<wbr>CACHE_<wbr>SIZE</code></a></p> |
| </li> |
| <li> |
| <p><a href="#CL_DEVICE_BUILT_IN_KERNELS"><code>CL_DEVICE_<wbr>BUILT_<wbr>IN_<wbr>KERNELS</code></a></p> |
| </li> |
| <li> |
| <p><a href="#CL_DEVICE_PARENT_DEVICE"><code>CL_DEVICE_<wbr>PARENT_<wbr>DEVICE</code></a></p> |
| </li> |
| <li> |
| <p><a href="#CL_DEVICE_PARTITION_TYPE"><code>CL_DEVICE_<wbr>PARTITION_<wbr>TYPE</code></a></p> |
| </li> |
| <li> |
| <p><a href="#CL_DEVICE_REFERENCE_COUNT"><code>CL_DEVICE_<wbr>REFERENCE_<wbr>COUNT</code></a></p> |
| </li> |
| </ul> |
| </div> |
| <table id="device-queries-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 5. List of supported param_names by <a href="#clGetDeviceInfo">clGetDeviceInfo</a></caption> |
| <colgroup> |
| <col style="width: 33%;"> |
| <col style="width: 17%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Device Info</th> |
| <th class="tableblock halign-left valign-top">Return Type</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_TYPE"></a><a href="#CL_DEVICE_TYPE"><code>CL_DEVICE_<wbr>TYPE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>type</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The type or types of the OpenCL device.</p> |
| <p class="tableblock"> Please see the <a href="#device-types-table">Device Types</a> table |
| for supported device types and device type combinations.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_VENDOR_ID"></a><a href="#CL_DEVICE_VENDOR_ID"><code>CL_DEVICE_<wbr>VENDOR_<wbr>ID</code></a> <sup class="footnote">[<a id="_footnoteref_6" class="footnote" href="#_footnotedef_6" title="View footnote.">6</a>]</sup></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A unique device vendor identifier.</p> |
| <p class="tableblock"> If the vendor has a PCI vendor ID, the low 16 bits must contain that PCI |
| vendor ID, and the remaining bits must be set to zero. Otherwise, the |
| value returned must be a valid Khronos vendor ID represented by type |
| <code>cl_khronos_<wbr>vendor_<wbr>id</code>. Khronos vendor IDs are allocated starting at |
| 0x10000, to distinguish them from the PCI vendor ID namespace.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_COMPUTE_UNITS"></a><a href="#CL_DEVICE_MAX_COMPUTE_UNITS"><code>CL_DEVICE_<wbr>MAX_<wbr>COMPUTE_<wbr>UNITS</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The number of parallel compute units on the OpenCL device. |
| A work-group executes on a single compute unit. |
| The minimum value is 1.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS"></a><a href="#CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS"><code>CL_DEVICE_<wbr>MAX_<wbr>WORK_<wbr>ITEM_<wbr>DIMENSIONS</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Maximum number of dimensions that specify the global and local work-item IDs |
| used by the data parallel execution model. (Refer to |
| <a href="#clEnqueueNDRangeKernel"><strong>clEnqueueNDRangeKernel</strong></a>). |
| The minimum value is 3 for devices that are not of type |
| <a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_WORK_ITEM_SIZES"></a><a href="#CL_DEVICE_MAX_WORK_ITEM_SIZES"><code>CL_DEVICE_<wbr>MAX_<wbr>WORK_<wbr>ITEM_<wbr>SIZES</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Maximum number of work-items that can be specified in each dimension |
| of the work-group to <a href="#clEnqueueNDRangeKernel"><strong>clEnqueueNDRangeKernel</strong></a>.</p> |
| <p class="tableblock"> Returns <em>n</em> <code>size_t</code> entries, where <em>n</em> is the value returned by the |
| query for <a href="#CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS"><code>CL_DEVICE_<wbr>MAX_<wbr>WORK_<wbr>ITEM_<wbr>DIMENSIONS</code></a>.</p> |
| <p class="tableblock"> The minimum value is (1, 1, 1) for devices that are not of type |
| <a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_WORK_GROUP_SIZE"></a><a href="#CL_DEVICE_MAX_WORK_GROUP_SIZE"><code>CL_DEVICE_<wbr>MAX_<wbr>WORK_<wbr>GROUP_<wbr>SIZE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Maximum number of work-items in a work-group that a device is |
| capable of executing on a single compute unit, for any given |
| kernel-instance running on the device. (Refer also to |
| <a href="#clEnqueueNDRangeKernel"><strong>clEnqueueNDRangeKernel</strong></a> and <a href="#CL_KERNEL_WORK_GROUP_SIZE"><code>CL_KERNEL_<wbr>WORK_<wbr>GROUP_<wbr>SIZE</code></a>). |
| The minimum value is 1. |
| The returned value is an upper limit and will not necessarily |
| maximize performance. |
| This maximum may be larger than supported by a specific kernel |
| (refer to the <a href="#CL_KERNEL_WORK_GROUP_SIZE"><code>CL_KERNEL_<wbr>WORK_<wbr>GROUP_<wbr>SIZE</code></a> query of <a href="#clGetKernelWorkGroupInfo"><strong>clGetKernelWorkGroupInfo</strong></a>).</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR"></a><a href="#CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>VECTOR_<wbr>WIDTH_<wbr>CHAR</code></a> <br> |
| <a id="CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT"></a><a href="#CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>VECTOR_<wbr>WIDTH_<wbr>SHORT</code></a> <br> |
| <a id="CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT"></a><a href="#CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>VECTOR_<wbr>WIDTH_<wbr>INT</code></a> <br> |
| <a id="CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG"></a><a href="#CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>VECTOR_<wbr>WIDTH_<wbr>LONG</code></a> <br> |
| <a id="CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT"></a><a href="#CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>VECTOR_<wbr>WIDTH_<wbr>FLOAT</code></a> <br> |
| <a id="CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE"></a><a href="#CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>VECTOR_<wbr>WIDTH_<wbr>DOUBLE</code></a><br> |
| <a id="CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF"></a><a href="#CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>VECTOR_<wbr>WIDTH_<wbr>HALF</code></a></p> |
| <p class="tableblock"> <a href="#CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>VECTOR_<wbr>WIDTH_<wbr>HALF</code></a> is <a href="#unified-spec">missing before</a> |
| version 1.1.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Preferred native vector width size for built-in scalar types that |
| can be put into vectors. |
| The vector width is defined as the number of scalar elements that |
| can be stored in the vector.</p> |
| <p class="tableblock"> If double precision is not supported, |
| <a href="#CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>VECTOR_<wbr>WIDTH_<wbr>DOUBLE</code></a> must return 0.</p> |
| <p class="tableblock"> If the <strong>cl_khr_fp16</strong> extension is not supported, |
| <a href="#CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>VECTOR_<wbr>WIDTH_<wbr>HALF</code></a> must return 0.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_NATIVE_VECTOR_WIDTH_CHAR"></a><a href="#CL_DEVICE_NATIVE_VECTOR_WIDTH_CHAR"><code>CL_DEVICE_<wbr>NATIVE_<wbr>VECTOR_<wbr>WIDTH_<wbr>CHAR</code></a> <br> |
| <a id="CL_DEVICE_NATIVE_VECTOR_WIDTH_SHORT"></a><a href="#CL_DEVICE_NATIVE_VECTOR_WIDTH_SHORT"><code>CL_DEVICE_<wbr>NATIVE_<wbr>VECTOR_<wbr>WIDTH_<wbr>SHORT</code></a> <br> |
| <a id="CL_DEVICE_NATIVE_VECTOR_WIDTH_INT"></a><a href="#CL_DEVICE_NATIVE_VECTOR_WIDTH_INT"><code>CL_DEVICE_<wbr>NATIVE_<wbr>VECTOR_<wbr>WIDTH_<wbr>INT</code></a> <br> |
| <a id="CL_DEVICE_NATIVE_VECTOR_WIDTH_LONG"></a><a href="#CL_DEVICE_NATIVE_VECTOR_WIDTH_LONG"><code>CL_DEVICE_<wbr>NATIVE_<wbr>VECTOR_<wbr>WIDTH_<wbr>LONG</code></a> <br> |
| <a id="CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT"></a><a href="#CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT"><code>CL_DEVICE_<wbr>NATIVE_<wbr>VECTOR_<wbr>WIDTH_<wbr>FLOAT</code></a> <br> |
| <a id="CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE"></a><a href="#CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE"><code>CL_DEVICE_<wbr>NATIVE_<wbr>VECTOR_<wbr>WIDTH_<wbr>DOUBLE</code></a><br> |
| <a id="CL_DEVICE_NATIVE_VECTOR_WIDTH_HALF"></a><a href="#CL_DEVICE_NATIVE_VECTOR_WIDTH_HALF"><code>CL_DEVICE_<wbr>NATIVE_<wbr>VECTOR_<wbr>WIDTH_<wbr>HALF</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.1.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the native ISA vector width. |
| The vector width is defined as the number of scalar elements that |
| can be stored in the vector.</p> |
| <p class="tableblock"> If double precision is not supported, |
| <a href="#CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE"><code>CL_DEVICE_<wbr>NATIVE_<wbr>VECTOR_<wbr>WIDTH_<wbr>DOUBLE</code></a> must return 0.</p> |
| <p class="tableblock"> If the <strong>cl_khr_fp16</strong> extension is not supported, |
| <a href="#CL_DEVICE_NATIVE_VECTOR_WIDTH_HALF"><code>CL_DEVICE_<wbr>NATIVE_<wbr>VECTOR_<wbr>WIDTH_<wbr>HALF</code></a> must return 0.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_CLOCK_FREQUENCY"></a><a href="#CL_DEVICE_MAX_CLOCK_FREQUENCY"><code>CL_DEVICE_<wbr>MAX_<wbr>CLOCK_<wbr>FREQUENCY</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Clock frequency of the device in MHz. |
| The meaning of this value is implementation-defined. |
| For devices with multiple clock domains, the clock frequency for any |
| of the clock domains may be returned. |
| For devices that dynamically change frequency for power or thermal |
| reasons, the returned clock frequency may be any valid frequency. |
| Note: This definition is <a href="#unified-spec">missing before</a> version 2.2.</p> |
| <p class="tableblock"> Maximum configured clock frequency of the device in MHz. |
| Note: This definition is <a href="#unified-spec">deprecated by</a> version 2.2.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_ADDRESS_BITS"></a><a href="#CL_DEVICE_ADDRESS_BITS"><code>CL_DEVICE_<wbr>ADDRESS_<wbr>BITS</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The default compute device address space size of the global address |
| space specified as an unsigned integer value in bits. |
| Currently supported values are 32 or 64 bits.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_MEM_ALLOC_SIZE"></a><a href="#CL_DEVICE_MAX_MEM_ALLOC_SIZE"><code>CL_DEVICE_<wbr>MAX_<wbr>MEM_<wbr>ALLOC_<wbr>SIZE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_ulong</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max size of memory object allocation in bytes. |
| The minimum value is max(min(1024 × 1024 × 1024, 1/4<sup>th</sup> |
| of <a href="#CL_DEVICE_GLOBAL_MEM_SIZE"><code>CL_DEVICE_<wbr>GLOBAL_<wbr>MEM_<wbr>SIZE</code></a>), 32 × 1024 × 1024) for |
| devices that are not of type <a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_IMAGE_SUPPORT"></a><a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if images are supported by the OpenCL device and <a href="#CL_FALSE"><code>CL_FALSE</code></a> |
| otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_READ_IMAGE_ARGS"></a><a href="#CL_DEVICE_MAX_READ_IMAGE_ARGS"><code>CL_DEVICE_<wbr>MAX_<wbr>READ_<wbr>IMAGE_<wbr>ARGS</code></a> <sup class="footnote">[<a id="_footnoteref_7" class="footnote" href="#_footnotedef_7" title="View footnote.">7</a>]</sup></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max number of image object arguments of a kernel declared with the |
| read_only qualifier. |
| The minimum value is 128 if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, the |
| value is 0 otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_WRITE_IMAGE_ARGS"></a><a href="#CL_DEVICE_MAX_WRITE_IMAGE_ARGS"><code>CL_DEVICE_<wbr>MAX_<wbr>WRITE_<wbr>IMAGE_<wbr>ARGS</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max number of image object arguments of a kernel declared with the |
| write_only qualifier. |
| The minimum value is 64 if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, the |
| value is 0 otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS"></a><a href="#CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS"><code>CL_DEVICE_<wbr>MAX_<wbr>READ_<wbr>WRITE_<wbr>IMAGE_<wbr>ARGS</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max number of image object arguments of a kernel declared with the |
| write_only or read_write qualifier.</p> |
| <p class="tableblock"> Support for read-write image arguments is required for an OpenCL 2.0, 2.1, |
| or 2.2 device if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>.</p> |
| <p class="tableblock"> The minimum value is 64 if the device supports read-write image arguments, |
| and must be 0 for devices that do not support read-write images.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_IL_VERSION"></a><a href="#CL_DEVICE_IL_VERSION"><code>CL_DEVICE_<wbr>IL_<wbr>VERSION</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.1. |
| Also see extension <strong>cl_khr_il_program</strong>.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The intermediate languages that can be supported by |
| <a href="#clCreateProgramWithIL"><strong>clCreateProgramWithIL</strong></a> for this device. |
| Returns a space-separated list of IL version strings of the form |
| &lt;IL_Prefix&gt;_&lt;Major_Version&gt;.&lt;Minor_Version&gt;.</p> |
| <p class="tableblock"> For an OpenCL 2.1 or 2.2 device, SPIR-V is a required IL prefix.</p> |
| <p class="tableblock"> If the device does not support intermediate language programs, the |
| value must be <code>""</code> (an empty string).</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_ILS_WITH_VERSION"></a><a href="#CL_DEVICE_ILS_WITH_VERSION"><code>CL_DEVICE_<wbr>ILS_<wbr>WITH_<wbr>VERSION</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0. |
| Also see extension <strong>cl_khr_il_program</strong>.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#cl_name_version"><code>cl_name_<wbr>version</code></a>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns an array of descriptions (name and version) for all supported |
| intermediate languages. Intermediate languages with the same name may be |
| reported more than once but each name and major/minor version |
| combination may only be reported once. The list of intermediate |
| languages reported must match the list reported via |
| <a href="#CL_DEVICE_IL_VERSION"><code>CL_DEVICE_<wbr>IL_<wbr>VERSION</code></a>.</p> |
| <p class="tableblock"> For an OpenCL 2.1 or 2.2 device, at least one version of SPIR-V must |
| be reported.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_IMAGE2D_MAX_WIDTH"></a><a href="#CL_DEVICE_IMAGE2D_MAX_WIDTH"><code>CL_DEVICE_<wbr>IMAGE2D_<wbr>MAX_<wbr>WIDTH</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max width in pixels of a 2D image, or of a 1D image not created from a |
| buffer object.</p> |
| <p class="tableblock"> The minimum value is 16384 if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, |
| the value is 0 otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_IMAGE2D_MAX_HEIGHT"></a><a href="#CL_DEVICE_IMAGE2D_MAX_HEIGHT"><code>CL_DEVICE_<wbr>IMAGE2D_<wbr>MAX_<wbr>HEIGHT</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max height of 2D image in pixels.</p> |
| <p class="tableblock"> The minimum value is 16384 if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, |
| the value is 0 otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_IMAGE3D_MAX_WIDTH"></a><a href="#CL_DEVICE_IMAGE3D_MAX_WIDTH"><code>CL_DEVICE_<wbr>IMAGE3D_<wbr>MAX_<wbr>WIDTH</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max width of 3D image in pixels.</p> |
| <p class="tableblock"> The minimum value is 2048 if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, |
| the value is 0 otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_IMAGE3D_MAX_HEIGHT"></a><a href="#CL_DEVICE_IMAGE3D_MAX_HEIGHT"><code>CL_DEVICE_<wbr>IMAGE3D_<wbr>MAX_<wbr>HEIGHT</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max height of 3D image in pixels.</p> |
| <p class="tableblock"> The minimum value is 2048 if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, |
| the value is 0 otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_IMAGE3D_MAX_DEPTH"></a><a href="#CL_DEVICE_IMAGE3D_MAX_DEPTH"><code>CL_DEVICE_<wbr>IMAGE3D_<wbr>MAX_<wbr>DEPTH</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max depth of 3D image in pixels.</p> |
| <p class="tableblock"> The minimum value is 2048 if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, |
| the value is 0 otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_IMAGE_MAX_BUFFER_SIZE"></a><a href="#CL_DEVICE_IMAGE_MAX_BUFFER_SIZE"><code>CL_DEVICE_<wbr>IMAGE_<wbr>MAX_<wbr>BUFFER_<wbr>SIZE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max number of pixels for a 1D image created from a buffer object.</p> |
| <p class="tableblock"> The minimum value is 65536 if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, |
| the value is 0 otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_IMAGE_MAX_ARRAY_SIZE"></a><a href="#CL_DEVICE_IMAGE_MAX_ARRAY_SIZE"><code>CL_DEVICE_<wbr>IMAGE_<wbr>MAX_<wbr>ARRAY_<wbr>SIZE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max number of images in a 1D or 2D image array.</p> |
| <p class="tableblock"> The minimum value is 2048 if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, |
| the value is 0 otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_SAMPLERS"></a><a href="#CL_DEVICE_MAX_SAMPLERS"><code>CL_DEVICE_<wbr>MAX_<wbr>SAMPLERS</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Maximum number of samplers that can be used in a kernel.</p> |
| <p class="tableblock"> The minimum value is 16 if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, |
| the value is 0 otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_IMAGE_PITCH_ALIGNMENT"></a><a href="#CL_DEVICE_IMAGE_PITCH_ALIGNMENT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>PITCH_<wbr>ALIGNMENT</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The row pitch alignment size in pixels for 2D images created from a |
| buffer. |
| The value returned must be a power of 2.</p> |
| <p class="tableblock"> Support for 2D images created from a buffer is required for an OpenCL 2.0, 2.1, |
| or 2.2 device if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>.</p> |
| <p class="tableblock"> This value must be 0 for devices that do not support 2D images created from a buffer.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT"></a><a href="#CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>BASE_<wbr>ADDRESS_<wbr>ALIGNMENT</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This query specifies the minimum alignment in pixels of the host_ptr |
| specified to <a href="#clCreateBuffer"><strong>clCreateBuffer</strong></a> or <a href="#clCreateBufferWithProperties"><strong>clCreateBufferWithProperties</strong></a> when a 2D image |
| is created from a buffer which was created using <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a>. |
| The value returned must be a power of 2.</p> |
| <p class="tableblock"> Support for 2D images created from a buffer is required for an OpenCL 2.0, 2.1, |
| or 2.2 device if <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>.</p> |
| <p class="tableblock"> This value must be 0 for devices that do not support 2D images created from a buffer.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_PIPE_ARGS"></a><a href="#CL_DEVICE_MAX_PIPE_ARGS"><code>CL_DEVICE_<wbr>MAX_<wbr>PIPE_<wbr>ARGS</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The maximum number of pipe objects that can be passed as arguments |
| to a kernel. |
| The minimum value is 16 for devices supporting pipes, and must be |
| 0 for devices that do not support pipes.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PIPE_MAX_ACTIVE_RESERVATIONS"></a><a href="#CL_DEVICE_PIPE_MAX_ACTIVE_RESERVATIONS"><code>CL_DEVICE_<wbr>PIPE_<wbr>MAX_<wbr>ACTIVE_<wbr>RESERVATIONS</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The maximum number of reservations that can be active for a pipe per |
| work-item in a kernel. |
| A work-group reservation is counted as one reservation per |
| work-item. |
| The minimum value is 1 for devices supporting pipes, and must be |
| 0 for devices that do not support pipes.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PIPE_MAX_PACKET_SIZE"></a><a href="#CL_DEVICE_PIPE_MAX_PACKET_SIZE"><code>CL_DEVICE_<wbr>PIPE_<wbr>MAX_<wbr>PACKET_<wbr>SIZE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The maximum size of a pipe packet in bytes.</p> |
| <p class="tableblock"> Support for pipes is required for an OpenCL 2.0, 2.1, or 2.2 device. |
| The minimum value is 1024 bytes if the device supports pipes, and must be |
| 0 for devices that do not support pipes.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_PARAMETER_SIZE"></a><a href="#CL_DEVICE_MAX_PARAMETER_SIZE"><code>CL_DEVICE_<wbr>MAX_<wbr>PARAMETER_<wbr>SIZE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max size in bytes of all arguments that can be passed to a kernel.</p> |
| <p class="tableblock"> The minimum value is 1024 for devices that are not of type |
| <a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a>. |
| For this minimum value, only a maximum of 128 arguments can be |
| passed to a kernel.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MEM_BASE_ADDR_ALIGN"></a><a href="#CL_DEVICE_MEM_BASE_ADDR_ALIGN"><code>CL_DEVICE_<wbr>MEM_<wbr>BASE_<wbr>ADDR_<wbr>ALIGN</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Alignment requirement (in bits) for sub-buffer offsets. |
| The minimum value is the size (in bits) of the largest OpenCL |
| built-in data type supported by the device (<code>long16</code> in FULL profile, |
| <code>long16</code> or <code>int16</code> in EMBEDDED profile) for devices that are not of |
| type <a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE"></a><a href="#CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE"><code>CL_DEVICE_<wbr>MIN_<wbr>DATA_<wbr>TYPE_<wbr>ALIGN_<wbr>SIZE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Deprecated by</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The smallest alignment in bytes that can be used for any data type. |
| The minimum value is the size (in bytes) of the largest OpenCL data |
| type supported by the device (<code>long16</code> in FULL profile, <code>long16</code> or |
| <code>int16</code> in EMBEDDED profile).</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_SINGLE_FP_CONFIG"></a><a href="#CL_DEVICE_SINGLE_FP_CONFIG"><code>CL_DEVICE_<wbr>SINGLE_<wbr>FP_<wbr>CONFIG</code></a> <sup class="footnote" id="_footnote_native-rounding-modes">[<a id="_footnoteref_8" class="footnote" href="#_footnotedef_8" title="View footnote.">8</a>]</sup></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>fp_<wbr>config</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Describes single precision floating-point capability of the device. |
| This is a bit-field that describes one or more of the following |
| values:</p> |
| <p class="tableblock"> <a id="CL_FP_DENORM"></a><a href="#CL_FP_DENORM"><code>CL_FP_<wbr>DENORM</code></a> - denorms are supported<br> |
| <a id="CL_FP_INF_NAN"></a><a href="#CL_FP_INF_NAN"><code>CL_FP_<wbr>INF_<wbr>NAN</code></a> - INF and quiet NaNs are supported<br> |
| <a id="CL_FP_ROUND_TO_NEAREST"></a><a href="#CL_FP_ROUND_TO_NEAREST"><code>CL_FP_<wbr>ROUND_<wbr>TO_<wbr>NEAREST</code></a> - round to nearest even rounding mode |
| supported<br> |
| <a id="CL_FP_ROUND_TO_ZERO"></a><a href="#CL_FP_ROUND_TO_ZERO"><code>CL_FP_<wbr>ROUND_<wbr>TO_<wbr>ZERO</code></a> - round to zero rounding mode supported<br> |
| <a id="CL_FP_ROUND_TO_INF"></a><a href="#CL_FP_ROUND_TO_INF"><code>CL_FP_<wbr>ROUND_<wbr>TO_<wbr>INF</code></a> - round to positive and negative infinity |
| rounding modes supported<br> |
| <a id="CL_FP_FMA"></a><a href="#CL_FP_FMA"><code>CL_FP_<wbr>FMA</code></a> - IEEE754-2008 fused multiply-add is supported<br> |
| <a id="CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT"></a><a href="#CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT"><code>CL_FP_<wbr>CORRECTLY_<wbr>ROUNDED_<wbr>DIVIDE_<wbr>SQRT</code></a> - divide and sqrt are correctly |
| rounded as defined by the IEEE754 specification<br> |
| <a id="CL_FP_SOFT_FLOAT"></a><a href="#CL_FP_SOFT_FLOAT"><code>CL_FP_<wbr>SOFT_<wbr>FLOAT</code></a> - Basic floating-point operations (such as |
| addition, subtraction, multiplication) are implemented in software</p> |
| <p class="tableblock"> For the full profile, the mandated minimum floating-point capability |
| for devices that are not of type <a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a> is:</p> |
| <p class="tableblock"> <a href="#CL_FP_ROUND_TO_NEAREST"><code>CL_FP_<wbr>ROUND_<wbr>TO_<wbr>NEAREST</code></a> |<br> |
| <a href="#CL_FP_INF_NAN"><code>CL_FP_<wbr>INF_<wbr>NAN</code></a>.</p> |
| <p class="tableblock"> For the embedded profile, see the |
| <a href="#embedded-profile-single-fp-config-requirements">dedicated table</a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_DOUBLE_FP_CONFIG"></a><a href="#CL_DEVICE_DOUBLE_FP_CONFIG"><code>CL_DEVICE_<wbr>DOUBLE_<wbr>FP_<wbr>CONFIG</code></a> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_8" title="View footnote.">8</a>]</sup></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2. |
| Also see extension <strong>cl_khr_fp64</strong>.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>fp_<wbr>config</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Describes double precision floating-point capability of the OpenCL |
| device. |
| This is a bit-field that describes one or more of the following |
| values:</p> |
| <p class="tableblock"> <a href="#CL_FP_DENORM"><code>CL_FP_<wbr>DENORM</code></a> - denorms are supported<br> |
| <a href="#CL_FP_INF_NAN"><code>CL_FP_<wbr>INF_<wbr>NAN</code></a> - INF and NaNs are supported<br> |
| <a href="#CL_FP_ROUND_TO_NEAREST"><code>CL_FP_<wbr>ROUND_<wbr>TO_<wbr>NEAREST</code></a> - round to nearest even rounding mode |
| supported<br> |
| <a href="#CL_FP_ROUND_TO_ZERO"><code>CL_FP_<wbr>ROUND_<wbr>TO_<wbr>ZERO</code></a> - round to zero rounding mode supported<br> |
| <a href="#CL_FP_ROUND_TO_INF"><code>CL_FP_<wbr>ROUND_<wbr>TO_<wbr>INF</code></a> - round to positive and negative infinity |
| rounding modes supported<br> |
| <a href="#CL_FP_FMA"><code>CL_FP_<wbr>FMA</code></a> - IEEE754-2008 fused multiply-add is supported<br> |
| <a href="#CL_FP_SOFT_FLOAT"><code>CL_FP_<wbr>SOFT_<wbr>FLOAT</code></a> - Basic floating-point operations (such as |
| addition, subtraction, multiplication) are implemented in software</p> |
| <p class="tableblock"> Double precision is an optional feature, so the mandated minimum |
| double precision floating-point capability is 0.</p> |
| <p class="tableblock"> If double precision is supported by the device, then the minimum |
| double precision floating-point capability is:</p> |
| <p class="tableblock"> <a href="#CL_FP_FMA"><code>CL_FP_<wbr>FMA</code></a> |<br> |
| <a href="#CL_FP_ROUND_TO_NEAREST"><code>CL_FP_<wbr>ROUND_<wbr>TO_<wbr>NEAREST</code></a> |<br> |
| <a href="#CL_FP_INF_NAN"><code>CL_FP_<wbr>INF_<wbr>NAN</code></a> |<br> |
| <a href="#CL_FP_DENORM"><code>CL_FP_<wbr>DENORM</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_GLOBAL_MEM_CACHE_TYPE"></a><a href="#CL_DEVICE_GLOBAL_MEM_CACHE_TYPE"><code>CL_DEVICE_<wbr>GLOBAL_<wbr>MEM_<wbr>CACHE_<wbr>TYPE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>mem_<wbr>cache_<wbr>type</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Type of global memory cache supported. |
| Valid values are: <a href="#CL_NONE"><code>CL_NONE</code></a>, <a id="CL_READ_ONLY_CACHE"></a><a href="#CL_READ_ONLY_CACHE"><code>CL_READ_<wbr>ONLY_<wbr>CACHE</code></a>, and |
| <a id="CL_READ_WRITE_CACHE"></a><a href="#CL_READ_WRITE_CACHE"><code>CL_READ_<wbr>WRITE_<wbr>CACHE</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE"></a><a href="#CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE"><code>CL_DEVICE_<wbr>GLOBAL_<wbr>MEM_<wbr>CACHELINE_<wbr>SIZE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Size of global memory cache line in bytes.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_GLOBAL_MEM_CACHE_SIZE"></a><a href="#CL_DEVICE_GLOBAL_MEM_CACHE_SIZE"><code>CL_DEVICE_<wbr>GLOBAL_<wbr>MEM_<wbr>CACHE_<wbr>SIZE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_ulong</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Size of global memory cache in bytes.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_GLOBAL_MEM_SIZE"></a><a href="#CL_DEVICE_GLOBAL_MEM_SIZE"><code>CL_DEVICE_<wbr>GLOBAL_<wbr>MEM_<wbr>SIZE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_ulong</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Size of global device memory in bytes.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE"></a><a href="#CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE"><code>CL_DEVICE_<wbr>MAX_<wbr>CONSTANT_<wbr>BUFFER_<wbr>SIZE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_ulong</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max size in bytes of a constant buffer allocation. |
| The minimum value is 64 KB for devices that are not of type |
| <a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_CONSTANT_ARGS"></a><a href="#CL_DEVICE_MAX_CONSTANT_ARGS"><code>CL_DEVICE_<wbr>MAX_<wbr>CONSTANT_<wbr>ARGS</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Max number of arguments declared with the <code>__constant</code> qualifier |
| in a kernel. |
| The minimum value is 8 for devices that are not of type |
| <a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_GLOBAL_VARIABLE_SIZE"></a><a href="#CL_DEVICE_MAX_GLOBAL_VARIABLE_SIZE"><code>CL_DEVICE_<wbr>MAX_<wbr>GLOBAL_<wbr>VARIABLE_<wbr>SIZE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The maximum number of bytes of storage that may be allocated for any |
| single variable declared in the global address space, whether in program |
| scope or inside a function in an OpenCL kernel language.</p> |
| <p class="tableblock"> Support for program scope global variables is required for an OpenCL 2.0, |
| 2.1, or 2.2 device. |
| The minimum value is 64 KB if the device supports program scope global |
| variables, and must be 0 for devices that do not support program scope |
| global variables.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_GLOBAL_VARIABLE_PREFERRED_TOTAL_SIZE"></a><a href="#CL_DEVICE_GLOBAL_VARIABLE_PREFERRED_TOTAL_SIZE"><code>CL_DEVICE_<wbr>GLOBAL_<wbr>VARIABLE_<wbr>PREFERRED_<wbr>TOTAL_<wbr>SIZE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Maximum preferred total size, in bytes, of all program variables in |
| the global address space. |
| This is a performance hint. |
| An implementation may place such variables in storage with optimized |
| device access. |
| This query returns the capacity of such storage. |
| The minimum value is 0.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_LOCAL_MEM_TYPE"></a><a href="#CL_DEVICE_LOCAL_MEM_TYPE"><code>CL_DEVICE_<wbr>LOCAL_<wbr>MEM_<wbr>TYPE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>local_<wbr>mem_<wbr>type</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Type of local memory supported. |
| This can be set to <a id="CL_LOCAL"></a><a href="#CL_LOCAL"><code>CL_LOCAL</code></a>, implying dedicated local memory storage |
| such as SRAM, or <a id="CL_GLOBAL"></a><a href="#CL_GLOBAL"><code>CL_GLOBAL</code></a>.</p> |
| <p class="tableblock"> For custom devices, <a href="#CL_NONE"><code>CL_NONE</code></a> can also be returned, indicating no local |
| memory support.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_LOCAL_MEM_SIZE"></a><a href="#CL_DEVICE_LOCAL_MEM_SIZE"><code>CL_DEVICE_<wbr>LOCAL_<wbr>MEM_<wbr>SIZE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_ulong</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Size of local memory region in bytes. |
| The minimum value is 32 KB for devices that are not of type |
| <a href="#CL_DEVICE_TYPE_CUSTOM"><code>CL_DEVICE_<wbr>TYPE_<wbr>CUSTOM</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_ERROR_CORRECTION_SUPPORT"></a><a href="#CL_DEVICE_ERROR_CORRECTION_SUPPORT"><code>CL_DEVICE_<wbr>ERROR_<wbr>CORRECTION_<wbr>SUPPORT</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if the device implements error correction for all |
| accesses to compute device memory (global and constant). |
| Is <a href="#CL_FALSE"><code>CL_FALSE</code></a> if the device does not implement such error correction.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_HOST_UNIFIED_MEMORY"></a><a href="#CL_DEVICE_HOST_UNIFIED_MEMORY"><code>CL_DEVICE_<wbr>HOST_<wbr>UNIFIED_<wbr>MEMORY</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.1 and <a href="#unified-spec">deprecated by</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if the device and the host have a unified memory subsystem |
| and is <a href="#CL_FALSE"><code>CL_FALSE</code></a> otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PROFILING_TIMER_RESOLUTION"></a><a href="#CL_DEVICE_PROFILING_TIMER_RESOLUTION"><code>CL_DEVICE_<wbr>PROFILING_<wbr>TIMER_<wbr>RESOLUTION</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Describes the resolution of the device timer. |
| This is measured in nanoseconds. |
| Refer to <a href="#profiling-operations">Profiling Operations</a> for details.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_ENDIAN_LITTLE"></a><a href="#CL_DEVICE_ENDIAN_LITTLE"><code>CL_DEVICE_<wbr>ENDIAN_<wbr>LITTLE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if the OpenCL device is a little endian device and |
| <a href="#CL_FALSE"><code>CL_FALSE</code></a> otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_AVAILABLE"></a><a href="#CL_DEVICE_AVAILABLE"><code>CL_DEVICE_<wbr>AVAILABLE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if the device is available and <a href="#CL_FALSE"><code>CL_FALSE</code></a> otherwise. |
| A device is considered to be available if the device can be expected |
| to successfully execute commands enqueued to the device.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_COMPILER_AVAILABLE"></a><a href="#CL_DEVICE_COMPILER_AVAILABLE"><code>CL_DEVICE_<wbr>COMPILER_<wbr>AVAILABLE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_FALSE"><code>CL_FALSE</code></a> if the implementation does not have a compiler available |
| to compile the program source.</p> |
| <p class="tableblock"> Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if the compiler is available. |
| This can be <a href="#CL_FALSE"><code>CL_FALSE</code></a> for the embedded platform profile only.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_LINKER_AVAILABLE"></a><a href="#CL_DEVICE_LINKER_AVAILABLE"><code>CL_DEVICE_<wbr>LINKER_<wbr>AVAILABLE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_FALSE"><code>CL_FALSE</code></a> if the implementation does not have a linker available. |
| Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if the linker is available.</p> |
| <p class="tableblock"> This can be <a href="#CL_FALSE"><code>CL_FALSE</code></a> for the embedded platform profile only.</p> |
| <p class="tableblock"> This must be <a href="#CL_TRUE"><code>CL_TRUE</code></a> if <a href="#CL_DEVICE_COMPILER_AVAILABLE"><code>CL_DEVICE_<wbr>COMPILER_<wbr>AVAILABLE</code></a> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_EXECUTION_CAPABILITIES"></a><a href="#CL_DEVICE_EXECUTION_CAPABILITIES"><code>CL_DEVICE_<wbr>EXECUTION_<wbr>CAPABILITIES</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>exec_<wbr>capabilities</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Describes the execution capabilities of the device. |
| This is a bit-field that describes one or more of the following |
| values:</p> |
| <p class="tableblock"> <a id="CL_EXEC_KERNEL"></a><a href="#CL_EXEC_KERNEL"><code>CL_EXEC_<wbr>KERNEL</code></a> - The OpenCL device can execute OpenCL kernels.<br> |
| <a id="CL_EXEC_NATIVE_KERNEL"></a><a href="#CL_EXEC_NATIVE_KERNEL"><code>CL_EXEC_<wbr>NATIVE_<wbr>KERNEL</code></a> - The OpenCL device can execute native |
| kernels.</p> |
| <p class="tableblock"> The mandated minimum capability is: <a href="#CL_EXEC_KERNEL"><code>CL_EXEC_<wbr>KERNEL</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_QUEUE_PROPERTIES"></a><a href="#CL_DEVICE_QUEUE_PROPERTIES"><code>CL_DEVICE_<wbr>QUEUE_<wbr>PROPERTIES</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Deprecated by</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_command_<wbr>queue_<wbr>properties</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">See description of <a href="#CL_DEVICE_QUEUE_ON_HOST_PROPERTIES"><code>CL_DEVICE_<wbr>QUEUE_<wbr>ON_<wbr>HOST_<wbr>PROPERTIES</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_QUEUE_ON_HOST_PROPERTIES"></a><a href="#CL_DEVICE_QUEUE_ON_HOST_PROPERTIES"><code>CL_DEVICE_<wbr>QUEUE_<wbr>ON_<wbr>HOST_<wbr>PROPERTIES</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_command_<wbr>queue_<wbr>properties</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Describes the on-host command-queue properties supported by the |
| device. |
| This is a bit-field that describes one or more of the following |
| values:</p> |
| <p class="tableblock"> <a href="#CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE"><code>CL_QUEUE_<wbr>OUT_<wbr>OF_<wbr>ORDER_<wbr>EXEC_<wbr>MODE_<wbr>ENABLE</code></a><br> |
| <a href="#CL_QUEUE_PROFILING_ENABLE"><code>CL_QUEUE_<wbr>PROFILING_<wbr>ENABLE</code></a></p> |
| <p class="tableblock"> These properties are described in the <a href="#queue-properties-table">Queue Properties</a> table.</p> |
| <p class="tableblock"> The mandated minimum capability is: <a href="#CL_QUEUE_PROFILING_ENABLE"><code>CL_QUEUE_<wbr>PROFILING_<wbr>ENABLE</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_QUEUE_ON_DEVICE_PROPERTIES"></a><a href="#CL_DEVICE_QUEUE_ON_DEVICE_PROPERTIES"><code>CL_DEVICE_<wbr>QUEUE_<wbr>ON_<wbr>DEVICE_<wbr>PROPERTIES</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_command_<wbr>queue_<wbr>properties</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Describes the on-device command-queue properties supported by the |
| device. |
| This is a bit-field that describes one or more of the following |
| values:</p> |
| <p class="tableblock"> <a href="#CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE"><code>CL_QUEUE_<wbr>OUT_<wbr>OF_<wbr>ORDER_<wbr>EXEC_<wbr>MODE_<wbr>ENABLE</code></a><br> |
| <a href="#CL_QUEUE_PROFILING_ENABLE"><code>CL_QUEUE_<wbr>PROFILING_<wbr>ENABLE</code></a></p> |
| <p class="tableblock"> These properties are described in the <a href="#queue-properties-table">Queue Properties</a> table.</p> |
| <p class="tableblock"> Support for on-device queues is required for an OpenCL 2.0, 2.1, or 2.2 device. |
| When on-device queues are supported, the mandated minimum capability is:</p> |
| <p class="tableblock"> <a href="#CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE"><code>CL_QUEUE_<wbr>OUT_<wbr>OF_<wbr>ORDER_<wbr>EXEC_<wbr>MODE_<wbr>ENABLE</code></a> |<br> |
| <a href="#CL_QUEUE_PROFILING_ENABLE"><code>CL_QUEUE_<wbr>PROFILING_<wbr>ENABLE</code></a>.</p> |
| <p class="tableblock"> Must be 0 for devices that do not support on-device queues.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE"></a><a href="#CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE"><code>CL_DEVICE_<wbr>QUEUE_<wbr>ON_<wbr>DEVICE_<wbr>PREFERRED_<wbr>SIZE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The preferred size of the device queue, in bytes. |
| Applications should use this size for the device queue to ensure |
| good performance.</p> |
| <p class="tableblock"> The minimum value is 16 KB for devices supporting on-device queues, |
| and must be 0 for devices that do not support on-device queues.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE"></a><a href="#CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE"><code>CL_DEVICE_<wbr>QUEUE_<wbr>ON_<wbr>DEVICE_<wbr>MAX_<wbr>SIZE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The maximum size of the device queue in bytes.</p> |
| <p class="tableblock"> The minimum value is 256 KB for the full profile and 64 KB for the |
| embedded profile for devices supporting on-device queues, |
| and must be 0 for devices that do not support on-device queues.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_ON_DEVICE_QUEUES"></a><a href="#CL_DEVICE_MAX_ON_DEVICE_QUEUES"><code>CL_DEVICE_<wbr>MAX_<wbr>ON_<wbr>DEVICE_<wbr>QUEUES</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The maximum number of device queues that can be created for this |
| device in a single context.</p> |
| <p class="tableblock"> The minimum value is 1 for devices supporting on-device queues, |
| and must be 0 for devices that do not support on-device queues.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_ON_DEVICE_EVENTS"></a><a href="#CL_DEVICE_MAX_ON_DEVICE_EVENTS"><code>CL_DEVICE_<wbr>MAX_<wbr>ON_<wbr>DEVICE_<wbr>EVENTS</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The maximum number of events in use by a device queue. |
| These refer to events returned by the <code>enqueue_</code> built-in functions |
| to a device queue or user events returned by the <code>create_user_event</code> |
| built-in function that have not been released.</p> |
| <p class="tableblock"> The minimum value is 1024 for devices supporting on-device queues, |
| and must be 0 for devices that do not support on-device queues.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_BUILT_IN_KERNELS"></a><a href="#CL_DEVICE_BUILT_IN_KERNELS"><code>CL_DEVICE_<wbr>BUILT_<wbr>IN_<wbr>KERNELS</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A semicolon-separated list of built-in kernels supported by the |
| device. |
| An empty string is returned if no built-in kernels are supported by |
| the device.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_BUILT_IN_KERNELS_WITH_VERSION"></a><a href="#CL_DEVICE_BUILT_IN_KERNELS_WITH_VERSION"><code>CL_DEVICE_<wbr>BUILT_<wbr>IN_<wbr>KERNELS_<wbr>WITH_<wbr>VERSION</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#cl_name_version"><code>cl_name_<wbr>version</code></a>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns an array of descriptions for the built-in kernels supported by |
| the device. Each built-in kernel may only be reported once. The list of |
| reported kernels must match the list returned via |
| <a href="#CL_DEVICE_BUILT_IN_KERNELS"><code>CL_DEVICE_<wbr>BUILT_<wbr>IN_<wbr>KERNELS</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PLATFORM"></a><a href="#CL_DEVICE_PLATFORM"><code>CL_DEVICE_<wbr>PLATFORM</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_platform_<wbr>id</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">The platform associated with this device.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_NAME"></a><a href="#CL_DEVICE_NAME"><code>CL_DEVICE_<wbr>NAME</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Device name string.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_VENDOR"></a><a href="#CL_DEVICE_VENDOR"><code>CL_DEVICE_<wbr>VENDOR</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Vendor name string.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DRIVER_VERSION"></a><a href="#CL_DRIVER_VERSION"><code>CL_DRIVER_<wbr>VERSION</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">OpenCL software driver version string. |
| Follows a vendor-specific format.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PROFILE"></a><a href="#CL_DEVICE_PROFILE"><code>CL_DEVICE_<wbr>PROFILE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">OpenCL profile string. |
| Returns the profile name supported by the device. |
| The profile name returned can be one of the following strings:</p> |
| <p class="tableblock"> FULL_PROFILE - if the device supports the OpenCL specification |
| (functionality defined as part of the core specification that does |
| not require any extensions to be supported).</p> |
| <p class="tableblock"> EMBEDDED_PROFILE - if the device supports the OpenCL embedded |
| profile.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_VERSION"></a><a href="#CL_DEVICE_VERSION"><code>CL_DEVICE_<wbr>VERSION</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">OpenCL version string. |
| Returns the OpenCL version supported by the device. This version |
| string has the following format:</p> |
| <p class="tableblock"> <em>OpenCL&lt;space&gt;&lt;major_version.minor_version&gt;&lt;space&gt;&lt;vendor-specific |
| information&gt;</em></p> |
| <p class="tableblock"> The major_version.minor_version value returned will be one of 1.0, |
| 1.1, 1.2, 2.0, 2.1, 2.2, or 3.0.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_NUMERIC_VERSION"></a><a href="#CL_DEVICE_NUMERIC_VERSION"><code>CL_DEVICE_<wbr>NUMERIC_<wbr>VERSION</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_version</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the detailed (major, minor, patch) version supported by the |
| device. The major and minor version numbers returned must match |
| those returned via <a href="#CL_DEVICE_VERSION"><code>CL_DEVICE_<wbr>VERSION</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_OPENCL_C_VERSION"></a><a href="#CL_DEVICE_OPENCL_C_VERSION"><code>CL_DEVICE_<wbr>OPENCL_<wbr>C_<wbr>VERSION</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.1 and <a href="#unified-spec">deprecated by</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the highest fully backwards compatible OpenCL C version |
| supported by the compiler for the device. |
| For devices supporting compilation from OpenCL C source, this will |
| return a version string with the following format:</p> |
| <p class="tableblock"> <em>OpenCL&lt;space&gt;C&lt;space&gt;&lt;major_version.minor_version&gt;&lt;space&gt;&lt;vendor-specific |
| information&gt;</em></p> |
| <p class="tableblock"> For devices that support compilation from OpenCL C source:</p> |
| <p class="tableblock"> Because OpenCL 3.0 is backwards compatible with OpenCL C 1.2, |
| an OpenCL 3.0 device must support at least OpenCL C 1.2. |
| An OpenCL 3.0 device may return an OpenCL C version newer |
| than OpenCL C 1.2 if and only if all optional OpenCL C |
| features are supported by the device for the newer version.</p> |
| <p class="tableblock"> Support for OpenCL C 2.0 is required for an OpenCL 2.0, OpenCL 2.1, |
| or OpenCL 2.2 device.</p> |
| <p class="tableblock"> Support for OpenCL C 1.2 is required for an OpenCL 1.2 device.</p> |
| <p class="tableblock"> Support for OpenCL C 1.1 is required for an OpenCL 1.1 device.</p> |
| <p class="tableblock"> Support for either OpenCL C 1.0 or OpenCL C 1.1 is required for |
| an OpenCL 1.0 device.</p> |
| <p class="tableblock"> For devices that do not support compilation from OpenCL C source, |
| such as when <a href="#CL_DEVICE_COMPILER_AVAILABLE"><code>CL_DEVICE_<wbr>COMPILER_<wbr>AVAILABLE</code></a> is <a href="#CL_FALSE"><code>CL_FALSE</code></a>, this |
| query may return an empty string.</p> |
| <p class="tableblock"> This query has been superseded by the <a href="#CL_DEVICE_OPENCL_C_ALL_VERSIONS"><code>CL_DEVICE_<wbr>OPENCL_<wbr>C_<wbr>ALL_<wbr>VERSIONS</code></a> |
| query, which returns a set of OpenCL C versions supported by a |
| device.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_OPENCL_C_ALL_VERSIONS"></a><a href="#CL_DEVICE_OPENCL_C_ALL_VERSIONS"><code>CL_DEVICE_<wbr>OPENCL_<wbr>C_<wbr>ALL_<wbr>VERSIONS</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#cl_name_version"><code>cl_name_<wbr>version</code></a>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns an array of name, version descriptions listing all the versions |
| of OpenCL C supported by the compiler for the device. |
| In each returned description structure, the name field is required to be |
| "OpenCL C". The list may include both newer non-backwards compatible |
| OpenCL C versions, such as OpenCL C 3.0, and older OpenCL C versions |
| with mandatory backwards compatibility. |
| The version returned by <a href="#CL_DEVICE_OPENCL_C_VERSION"><code>CL_DEVICE_<wbr>OPENCL_<wbr>C_<wbr>VERSION</code></a> is required to be |
| present in the list.</p> |
| <p class="tableblock"> For devices that support compilation from OpenCL C source:</p> |
| <p class="tableblock"> Because OpenCL 3.0 is backwards compatible with OpenCL C 1.2, |
| and OpenCL C 1.2 is backwards compatible with OpenCL C 1.1 and |
| OpenCL C 1.0, support for at least OpenCL C 3.0, OpenCL C 1.2, |
| OpenCL C 1.1, and OpenCL C 1.0 is required for an OpenCL 3.0 device.</p> |
| <p class="tableblock"> Support for OpenCL C 2.0, OpenCL C 1.2, OpenCL C 1.1, and OpenCL C |
| 1.0 is required for an OpenCL 2.0, OpenCL 2.1, or OpenCL 2.2 device.</p> |
| <p class="tableblock"> Support for OpenCL C 1.2, OpenCL C 1.1, and OpenCL C 1.0 is required |
| for an OpenCL 1.2 device.</p> |
| <p class="tableblock"> Support for OpenCL C 1.1 and OpenCL C 1.0 is required for an |
| OpenCL 1.1 device.</p> |
| <p class="tableblock"> Support for at least OpenCL C 1.0 is required for an OpenCL 1.0 device.</p> |
| <p class="tableblock"> For devices that do not support compilation from OpenCL C source, |
| this query may return an empty array.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_OPENCL_C_FEATURES"></a><a href="#CL_DEVICE_OPENCL_C_FEATURES"><code>CL_DEVICE_<wbr>OPENCL_<wbr>C_<wbr>FEATURES</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#cl_name_version"><code>cl_name_<wbr>version</code></a>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns an array of optional OpenCL C features supported by the |
| compiler for the device alongside the OpenCL C version that introduced |
| the feature macro. |
| For example, if a compiler supports an OpenCL C 3.0 feature, the |
| returned name will be the full name of the OpenCL C feature macro, and |
| the returned version will be 3.0.0.</p> |
| <p class="tableblock"> For devices that do not support compilation from OpenCL C source, |
| this query may return an empty array.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_EXTENSIONS"></a><a href="#CL_DEVICE_EXTENSIONS"><code>CL_DEVICE_<wbr>EXTENSIONS</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns a space-separated list of extension names (the extension |
| names themselves do not contain any spaces) supported by the device. |
| The list of extension names may include Khronos approved extension |
| names and vendor specified extension names.</p> |
| <p class="tableblock"> The following Khronos extension names must be returned by |
| all devices that support OpenCL 1.1:</p> |
| <p class="tableblock"> <strong>cl_khr_byte_addressable_store</strong><br> |
| <strong>cl_khr_global_int32_base_atomics</strong><br> |
| <strong>cl_khr_global_int32_extended_atomics</strong><br> |
| <strong>cl_khr_local_int32_base_atomics</strong><br> |
| <strong>cl_khr_local_int32_extended_atomics</strong></p> |
| <p class="tableblock"> Additionally, the following Khronos extension names must be returned |
| by all devices that support OpenCL 1.2 when and only when the optional |
| feature is supported:</p> |
| <p class="tableblock"> <strong>cl_khr_fp64</strong></p> |
| <p class="tableblock"> Additionally, the following Khronos extension names must be returned |
| by all devices that support OpenCL 2.0, OpenCL 2.1, or OpenCL 2.2. |
| For devices that support OpenCL 3.0, these extension names must |
| be returned when and only when the optional feature is supported:</p> |
| <p class="tableblock"> <strong>cl_khr_3d_image_writes</strong><br> |
| <strong>cl_khr_depth_images</strong><br> |
| <strong>cl_khr_image2d_from_buffer</strong></p> |
| <p class="tableblock"> Please refer to the OpenCL Extension Specification or vendor |
| provided documentation for a detailed description of these extensions.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_EXTENSIONS_WITH_VERSION"></a><a href="#CL_DEVICE_EXTENSIONS_WITH_VERSION"><code>CL_DEVICE_<wbr>EXTENSIONS_<wbr>WITH_<wbr>VERSION</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#cl_name_version"><code>cl_name_<wbr>version</code></a>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns an array of description (name and version) structures. The same |
| extension name must not be reported more than once. The list of |
| extensions reported must match the list reported via |
| <a href="#CL_DEVICE_EXTENSIONS"><code>CL_DEVICE_<wbr>EXTENSIONS</code></a>.</p> |
| <p class="tableblock"> See <a href="#CL_DEVICE_EXTENSIONS"><code>CL_DEVICE_<wbr>EXTENSIONS</code></a> for a list of extensions that are |
| required to be reported for a given OpenCL version.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PRINTF_BUFFER_SIZE"></a><a href="#CL_DEVICE_PRINTF_BUFFER_SIZE"><code>CL_DEVICE_<wbr>PRINTF_<wbr>BUFFER_<wbr>SIZE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Maximum size in bytes of the internal buffer that holds the output |
| of printf calls from a kernel. |
| The minimum value for the FULL profile is 1 MB.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PREFERRED_INTEROP_USER_SYNC"></a><a href="#CL_DEVICE_PREFERRED_INTEROP_USER_SYNC"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>INTEROP_<wbr>USER_<wbr>SYNC</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if the device's preference is for the user to be |
| responsible for synchronization when sharing memory objects between |
| OpenCL and other APIs such as DirectX, <a href="#CL_FALSE"><code>CL_FALSE</code></a> if the device / |
| implementation has a performant path for performing synchronization |
| of memory objects shared between OpenCL and other APIs such as |
| DirectX.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PARENT_DEVICE"></a><a href="#CL_DEVICE_PARENT_DEVICE"><code>CL_DEVICE_<wbr>PARENT_<wbr>DEVICE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>id</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the <code>cl_device_<wbr>id</code> of the parent device to which this |
| sub-device belongs. |
| If <em>device</em> is a root-level device, a <code>NULL</code> value is returned.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PARTITION_MAX_SUB_DEVICES"></a><a href="#CL_DEVICE_PARTITION_MAX_SUB_DEVICES"><code>CL_DEVICE_<wbr>PARTITION_<wbr>MAX_<wbr>SUB_<wbr>DEVICES</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the maximum number of sub-devices that can be created when a |
| device is partitioned.</p> |
| <p class="tableblock"> The value returned cannot exceed <a href="#CL_DEVICE_MAX_COMPUTE_UNITS"><code>CL_DEVICE_<wbr>MAX_<wbr>COMPUTE_<wbr>UNITS</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PARTITION_PROPERTIES"></a><a href="#CL_DEVICE_PARTITION_PROPERTIES"><code>CL_DEVICE_<wbr>PARTITION_<wbr>PROPERTIES</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>partition_<wbr>property</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the list of partition types supported by <em>device</em>. |
| This is an array of <code>cl_device_<wbr>partition_<wbr>property</code> values drawn from |
| the following list:</p> |
| <p class="tableblock"> <a href="#CL_DEVICE_PARTITION_EQUALLY"><code>CL_DEVICE_<wbr>PARTITION_<wbr>EQUALLY</code></a><br> |
| <a href="#CL_DEVICE_PARTITION_BY_COUNTS"><code>CL_DEVICE_<wbr>PARTITION_<wbr>BY_<wbr>COUNTS</code></a><br> |
| <a href="#CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN"><code>CL_DEVICE_<wbr>PARTITION_<wbr>BY_<wbr>AFFINITY_<wbr>DOMAIN</code></a></p> |
| <p class="tableblock"> If the device cannot be partitioned (i.e. there is no partitioning |
| scheme supported by the device that will return at least two |
| sub-devices), a value of 0 will be returned.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PARTITION_AFFINITY_DOMAIN"></a><a href="#CL_DEVICE_PARTITION_AFFINITY_DOMAIN"><code>CL_DEVICE_<wbr>PARTITION_<wbr>AFFINITY_<wbr>DOMAIN</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>affinity_<wbr>domain</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the list of supported affinity domains for partitioning the |
| device using <a href="#CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN"><code>CL_DEVICE_<wbr>PARTITION_<wbr>BY_<wbr>AFFINITY_<wbr>DOMAIN</code></a>. |
| This is a bit-field that describes one or more of the following |
| values:</p> |
| <p class="tableblock"> <a id="CL_DEVICE_AFFINITY_DOMAIN_NUMA"></a><a href="#CL_DEVICE_AFFINITY_DOMAIN_NUMA"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>NUMA</code></a><br> |
| <a id="CL_DEVICE_AFFINITY_DOMAIN_L4_CACHE"></a><a href="#CL_DEVICE_AFFINITY_DOMAIN_L4_CACHE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>L4_<wbr>CACHE</code></a><br> |
| <a id="CL_DEVICE_AFFINITY_DOMAIN_L3_CACHE"></a><a href="#CL_DEVICE_AFFINITY_DOMAIN_L3_CACHE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>L3_<wbr>CACHE</code></a><br> |
| <a id="CL_DEVICE_AFFINITY_DOMAIN_L2_CACHE"></a><a href="#CL_DEVICE_AFFINITY_DOMAIN_L2_CACHE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>L2_<wbr>CACHE</code></a><br> |
| <a id="CL_DEVICE_AFFINITY_DOMAIN_L1_CACHE"></a><a href="#CL_DEVICE_AFFINITY_DOMAIN_L1_CACHE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>L1_<wbr>CACHE</code></a><br> |
| <a id="CL_DEVICE_AFFINITY_DOMAIN_NEXT_PARTITIONABLE"></a><a href="#CL_DEVICE_AFFINITY_DOMAIN_NEXT_PARTITIONABLE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>NEXT_<wbr>PARTITIONABLE</code></a></p> |
| <p class="tableblock"> If the device does not support any affinity domains, a value of 0 |
| will be returned.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PARTITION_TYPE"></a><a href="#CL_DEVICE_PARTITION_TYPE"><code>CL_DEVICE_<wbr>PARTITION_<wbr>TYPE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>partition_<wbr>property</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the properties argument specified in <a href="#clCreateSubDevices"><strong>clCreateSubDevices</strong></a> if |
| <em>device</em> is a sub-device. |
| In the case where the properties argument to <a href="#clCreateSubDevices"><strong>clCreateSubDevices</strong></a> is |
| <a href="#CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN"><code>CL_DEVICE_<wbr>PARTITION_<wbr>BY_<wbr>AFFINITY_<wbr>DOMAIN</code></a>, |
| <a href="#CL_DEVICE_AFFINITY_DOMAIN_NEXT_PARTITIONABLE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>NEXT_<wbr>PARTITIONABLE</code></a>, the affinity domain |
| used to perform the partition will be returned. |
| This can be one of the following values:</p> |
| <p class="tableblock"> <a href="#CL_DEVICE_AFFINITY_DOMAIN_NUMA"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>NUMA</code></a><br> |
| <a href="#CL_DEVICE_AFFINITY_DOMAIN_L4_CACHE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>L4_<wbr>CACHE</code></a><br> |
| <a href="#CL_DEVICE_AFFINITY_DOMAIN_L3_CACHE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>L3_<wbr>CACHE</code></a><br> |
| <a href="#CL_DEVICE_AFFINITY_DOMAIN_L2_CACHE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>L2_<wbr>CACHE</code></a><br> |
| <a href="#CL_DEVICE_AFFINITY_DOMAIN_L1_CACHE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>L1_<wbr>CACHE</code></a></p> |
| <p class="tableblock"> Otherwise the implementation may either return a |
| <em>param_value_size_ret</em> of 0 (i.e. there is no partition type |
| associated with <em>device</em>) or return a property value of 0 (where 0 |
| is used to terminate the partition property list) in the memory that |
| <em>param_value</em> points to.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_REFERENCE_COUNT"></a><a href="#CL_DEVICE_REFERENCE_COUNT"><code>CL_DEVICE_<wbr>REFERENCE_<wbr>COUNT</code></a> <sup class="footnote">[<a id="_footnoteref_9" class="footnote" href="#_footnotedef_9" title="View footnote.">9</a>]</sup></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the <em>device</em> reference count. |
| If the device is a root-level device, a reference count of one is |
| returned.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_SVM_CAPABILITIES"></a><a href="#CL_DEVICE_SVM_CAPABILITIES"><code>CL_DEVICE_<wbr>SVM_<wbr>CAPABILITIES</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>svm_<wbr>capabilities</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Describes the various shared virtual memory (SVM) |
| allocation types the device supports. |
| This is a bit-field that describes a combination of the following |
| values:</p> |
| <p class="tableblock"> <a id="CL_DEVICE_SVM_COARSE_GRAIN_BUFFER"></a><a href="#CL_DEVICE_SVM_COARSE_GRAIN_BUFFER"><code>CL_DEVICE_<wbr>SVM_<wbr>COARSE_<wbr>GRAIN_<wbr>BUFFER</code></a> - Support for coarse-grain buffer |
| sharing using <a href="#clSVMAlloc"><strong>clSVMAlloc</strong></a>. |
| Memory consistency is guaranteed at synchronization points and the |
| host must use calls to <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> and |
| <a href="#clEnqueueUnmapMemObject"><strong>clEnqueueUnmapMemObject</strong></a>.<br> |
| <a id="CL_DEVICE_SVM_FINE_GRAIN_BUFFER"></a><a href="#CL_DEVICE_SVM_FINE_GRAIN_BUFFER"><code>CL_DEVICE_<wbr>SVM_<wbr>FINE_<wbr>GRAIN_<wbr>BUFFER</code></a> - Support for fine-grain buffer |
| sharing using <a href="#clSVMAlloc"><strong>clSVMAlloc</strong></a>. |
| Memory consistency is guaranteed at synchronization points without |
| the need for <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> and <a href="#clEnqueueUnmapMemObject"><strong>clEnqueueUnmapMemObject</strong></a>.<br> |
| <a id="CL_DEVICE_SVM_FINE_GRAIN_SYSTEM"></a><a href="#CL_DEVICE_SVM_FINE_GRAIN_SYSTEM"><code>CL_DEVICE_<wbr>SVM_<wbr>FINE_<wbr>GRAIN_<wbr>SYSTEM</code></a> - Support for sharing the host’s |
| entire virtual memory including memory allocated using <strong>malloc</strong>. |
| Memory consistency is guaranteed at synchronization points.<br> |
| <a id="CL_DEVICE_SVM_ATOMICS"></a><a href="#CL_DEVICE_SVM_ATOMICS"><code>CL_DEVICE_<wbr>SVM_<wbr>ATOMICS</code></a> - Support for the OpenCL 2.0 atomic |
| operations that provide memory consistency across the host and all |
| OpenCL devices supporting fine-grain SVM allocations.</p> |
| <p class="tableblock"> The mandated minimum capability for an OpenCL 2.0, 2.1, or 2.2 device is |
| <a href="#CL_DEVICE_SVM_COARSE_GRAIN_BUFFER"><code>CL_DEVICE_<wbr>SVM_<wbr>COARSE_<wbr>GRAIN_<wbr>BUFFER</code></a>.</p> |
| <p class="tableblock"> For other device versions there is no mandated minimum capability.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PREFERRED_PLATFORM_ATOMIC_ALIGNMENT"></a><a href="#CL_DEVICE_PREFERRED_PLATFORM_ATOMIC_ALIGNMENT"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>PLATFORM_<wbr>ATOMIC_<wbr>ALIGNMENT</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the value representing the preferred alignment in bytes for |
| OpenCL 2.0 fine-grained SVM atomic types. |
| This query can return 0, which indicates that the preferred alignment |
| is the natural size of the type.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PREFERRED_GLOBAL_ATOMIC_ALIGNMENT"></a><a href="#CL_DEVICE_PREFERRED_GLOBAL_ATOMIC_ALIGNMENT"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>GLOBAL_<wbr>ATOMIC_<wbr>ALIGNMENT</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the value representing the preferred alignment in bytes for |
| OpenCL 2.0 atomic types to global memory. |
| This query can return 0, which indicates that the preferred alignment |
| is the natural size of the type.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PREFERRED_LOCAL_ATOMIC_ALIGNMENT"></a><a href="#CL_DEVICE_PREFERRED_LOCAL_ATOMIC_ALIGNMENT"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>LOCAL_<wbr>ATOMIC_<wbr>ALIGNMENT</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the value representing the preferred alignment in bytes for |
| OpenCL 2.0 atomic types to local memory. |
| This query can return 0, which indicates that the preferred alignment |
| is the natural size of the type.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_MAX_NUM_SUB_GROUPS"></a><a href="#CL_DEVICE_MAX_NUM_SUB_GROUPS"><code>CL_DEVICE_<wbr>MAX_<wbr>NUM_<wbr>SUB_<wbr>GROUPS</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.1.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Maximum number of sub-groups in a work-group that a device is |
| capable of executing on a single compute unit, for any given |
| kernel-instance running on the device.</p> |
| <p class="tableblock"> The minimum value is 1 if the device supports sub-groups, and must be |
| 0 for devices that do not support sub-groups. |
| Support for sub-groups is required for an OpenCL 2.1 or 2.2 device.</p> |
| <p class="tableblock"> (Refer also to <a href="#clGetKernelSubGroupInfo"><strong>clGetKernelSubGroupInfo</strong></a>.)</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_SUB_GROUP_INDEPENDENT_FORWARD_PROGRESS"></a><a href="#CL_DEVICE_SUB_GROUP_INDEPENDENT_FORWARD_PROGRESS"><code>CL_DEVICE_<wbr>SUB_<wbr>GROUP_<wbr>INDEPENDENT_<wbr>FORWARD_<wbr>PROGRESS</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.1.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if this device supports independent forward progress of |
| sub-groups, <a href="#CL_FALSE"><code>CL_FALSE</code></a> otherwise.</p> |
| <p class="tableblock"> This query must return <a href="#CL_TRUE"><code>CL_TRUE</code></a> for devices that support the |
| <strong>cl_khr_subgroups</strong> extension, and must return <a href="#CL_FALSE"><code>CL_FALSE</code></a> for |
| devices that do not support sub-groups.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES"></a><a href="#CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>MEMORY_<wbr>CAPABILITIES</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>atomic_<wbr>capabilities</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Describes the various memory orders and scopes that the device supports for atomic memory operations. |
| This is a bit-field that describes a combination of the following |
| values:</p> |
| <p class="tableblock"> <a id="CL_DEVICE_ATOMIC_ORDER_RELAXED"></a><a href="#CL_DEVICE_ATOMIC_ORDER_RELAXED"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>ORDER_<wbr>RELAXED</code></a> - Support for the <strong>relaxed</strong> memory order.<br> |
| <a id="CL_DEVICE_ATOMIC_ORDER_ACQ_REL"></a><a href="#CL_DEVICE_ATOMIC_ORDER_ACQ_REL"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>ORDER_<wbr>ACQ_<wbr>REL</code></a> - Support for the <strong>acquire</strong>, <strong>release</strong>, and <strong>acquire-release</strong> memory orders.<br> |
| <a id="CL_DEVICE_ATOMIC_ORDER_SEQ_CST"></a><a href="#CL_DEVICE_ATOMIC_ORDER_SEQ_CST"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>ORDER_<wbr>SEQ_<wbr>CST</code></a> - Support for the <strong>sequentially consistent</strong> memory order.</p> |
| <p class="tableblock"> Because atomic memory orders are hierarchical, a device that supports a strong memory order must also support all weaker memory orders.</p> |
| <p class="tableblock"> <a id="CL_DEVICE_ATOMIC_SCOPE_WORK_ITEM"></a><a href="#CL_DEVICE_ATOMIC_SCOPE_WORK_ITEM"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>SCOPE_<wbr>WORK_<wbr>ITEM</code></a> <sup class="footnote">[<a id="_footnoteref_10" class="footnote" href="#_footnotedef_10" title="View footnote.">10</a>]</sup> - Support for memory ordering constraints that apply to a single work-item.<br> |
| <a id="CL_DEVICE_ATOMIC_SCOPE_WORK_GROUP"></a><a href="#CL_DEVICE_ATOMIC_SCOPE_WORK_GROUP"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>SCOPE_<wbr>WORK_<wbr>GROUP</code></a> - Support for memory ordering constraints that apply to all work-items in a work-group.<br> |
| <a id="CL_DEVICE_ATOMIC_SCOPE_DEVICE"></a><a href="#CL_DEVICE_ATOMIC_SCOPE_DEVICE"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>SCOPE_<wbr>DEVICE</code></a> - Support for memory ordering constraints that apply to all work-items executing on the device.<br> |
| <a id="CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES"></a><a href="#CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>SCOPE_<wbr>ALL_<wbr>DEVICES</code></a> - Support for memory ordering constraints that apply to all work-items executing across all devices that can share SVM memory with each other and the host process.</p> |
| <p class="tableblock"> Because atomic scopes are hierarchical, a device that supports a wide scope must also support all narrower scopes, except for the work-item scope, which is a special case.</p> |
| <p class="tableblock"> The mandated minimum capability is:</p> |
| <p class="tableblock"> <a href="#CL_DEVICE_ATOMIC_ORDER_RELAXED"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>ORDER_<wbr>RELAXED</code></a> |<br> |
| <a href="#CL_DEVICE_ATOMIC_SCOPE_WORK_GROUP"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>SCOPE_<wbr>WORK_<wbr>GROUP</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_ATOMIC_FENCE_CAPABILITIES"></a><a href="#CL_DEVICE_ATOMIC_FENCE_CAPABILITIES"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>FENCE_<wbr>CAPABILITIES</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>atomic_<wbr>capabilities</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Describes the various memory orders and scopes that the device supports for atomic fence operations. |
| This is a bit-field that has the same set of possible values as described for <a href="#CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>MEMORY_<wbr>CAPABILITIES</code></a>.</p> |
| <p class="tableblock"> The mandated minimum capability is:</p> |
| <p class="tableblock"> <a href="#CL_DEVICE_ATOMIC_ORDER_RELAXED"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>ORDER_<wbr>RELAXED</code></a> |<br> |
| <a href="#CL_DEVICE_ATOMIC_ORDER_ACQ_REL"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>ORDER_<wbr>ACQ_<wbr>REL</code></a> |<br> |
| <a href="#CL_DEVICE_ATOMIC_SCOPE_WORK_GROUP"><code>CL_DEVICE_<wbr>ATOMIC_<wbr>SCOPE_<wbr>WORK_<wbr>GROUP</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_NON_UNIFORM_WORK_GROUP_SUPPORT"></a><a href="#CL_DEVICE_NON_UNIFORM_WORK_GROUP_SUPPORT"><code>CL_DEVICE_<wbr>NON_<wbr>UNIFORM_<wbr>WORK_<wbr>GROUP_<wbr>SUPPORT</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if the device supports non-uniform work-groups, and <a href="#CL_FALSE"><code>CL_FALSE</code></a> otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_WORK_GROUP_COLLECTIVE_FUNCTIONS_SUPPORT"></a><a href="#CL_DEVICE_WORK_GROUP_COLLECTIVE_FUNCTIONS_SUPPORT"><code>CL_DEVICE_<wbr>WORK_<wbr>GROUP_<wbr>COLLECTIVE_<wbr>FUNCTIONS_<wbr>SUPPORT</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if the device supports work-group collective functions (e.g. <code>work_group_broadcast</code>, <code>work_group_reduce</code>, and <code>work_group_scan</code>), and <a href="#CL_FALSE"><code>CL_FALSE</code></a> otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_GENERIC_ADDRESS_SPACE_SUPPORT"></a><a href="#CL_DEVICE_GENERIC_ADDRESS_SPACE_SUPPORT"><code>CL_DEVICE_<wbr>GENERIC_<wbr>ADDRESS_<wbr>SPACE_<wbr>SUPPORT</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if the device supports the generic address space and its associated built-in functions, and <a href="#CL_FALSE"><code>CL_FALSE</code></a> otherwise.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_DEVICE_ENQUEUE_CAPABILITIES"></a><a href="#CL_DEVICE_DEVICE_ENQUEUE_CAPABILITIES"><code>CL_DEVICE_<wbr>DEVICE_<wbr>ENQUEUE_<wbr>CAPABILITIES</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>device_<wbr>enqueue_<wbr>capabilities</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Describes device-side enqueue capabilities of the device. |
| This is a bit-field that describes one or more of the following |
| values:</p> |
| <p class="tableblock"> <a id="CL_DEVICE_QUEUE_SUPPORTED"></a><a href="#CL_DEVICE_QUEUE_SUPPORTED"><code>CL_DEVICE_<wbr>QUEUE_<wbr>SUPPORTED</code></a> - Device supports device-side enqueue and on-device queues.<br> |
| <a id="CL_DEVICE_QUEUE_REPLACEABLE_DEFAULT"></a><a href="#CL_DEVICE_QUEUE_REPLACEABLE_DEFAULT"><code>CL_DEVICE_<wbr>QUEUE_<wbr>REPLACEABLE_<wbr>DEFAULT</code></a> - Device supports a replaceable default on-device queue.</p> |
| <p class="tableblock"> If <a href="#CL_DEVICE_QUEUE_REPLACEABLE_DEFAULT"><code>CL_DEVICE_<wbr>QUEUE_<wbr>REPLACEABLE_<wbr>DEFAULT</code></a> is set, <a href="#CL_DEVICE_QUEUE_SUPPORTED"><code>CL_DEVICE_<wbr>QUEUE_<wbr>SUPPORTED</code></a> must also be set.</p> |
| <p class="tableblock"> Devices that set <a href="#CL_DEVICE_QUEUE_SUPPORTED"><code>CL_DEVICE_<wbr>QUEUE_<wbr>SUPPORTED</code></a> for <a href="#CL_DEVICE_DEVICE_ENQUEUE_CAPABILITIES"><code>CL_DEVICE_<wbr>DEVICE_<wbr>ENQUEUE_<wbr>CAPABILITIES</code></a> must also return <a href="#CL_TRUE"><code>CL_TRUE</code></a> for <a href="#CL_DEVICE_GENERIC_ADDRESS_SPACE_SUPPORT"><code>CL_DEVICE_<wbr>GENERIC_<wbr>ADDRESS_<wbr>SPACE_<wbr>SUPPORT</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PIPE_SUPPORT"></a><a href="#CL_DEVICE_PIPE_SUPPORT"><code>CL_DEVICE_<wbr>PIPE_<wbr>SUPPORT</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Is <a href="#CL_TRUE"><code>CL_TRUE</code></a> if the device supports pipes, and <a href="#CL_FALSE"><code>CL_FALSE</code></a> otherwise.</p> |
| <p class="tableblock"> Devices that return <a href="#CL_TRUE"><code>CL_TRUE</code></a> for <a href="#CL_DEVICE_PIPE_SUPPORT"><code>CL_DEVICE_<wbr>PIPE_<wbr>SUPPORT</code></a> must also return <a href="#CL_TRUE"><code>CL_TRUE</code></a> for <a href="#CL_DEVICE_GENERIC_ADDRESS_SPACE_SUPPORT"><code>CL_DEVICE_<wbr>GENERIC_<wbr>ADDRESS_<wbr>SPACE_<wbr>SUPPORT</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PREFERRED_WORK_GROUP_SIZE_MULTIPLE"></a><a href="#CL_DEVICE_PREFERRED_WORK_GROUP_SIZE_MULTIPLE"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>WORK_<wbr>GROUP_<wbr>SIZE_<wbr>MULTIPLE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the preferred multiple of work-group size for the given device. |
| This is a performance hint intended as a guide when specifying the local work size argument to <a href="#clEnqueueNDRangeKernel"><strong>clEnqueueNDRangeKernel</strong></a>.</p> |
| <p class="tableblock"> (Refer also to <a href="#clGetKernelWorkGroupInfo"><strong>clGetKernelWorkGroupInfo</strong></a>, where <a href="#CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE"><code>CL_KERNEL_<wbr>PREFERRED_<wbr>WORK_<wbr>GROUP_<wbr>SIZE_<wbr>MULTIPLE</code></a> |
| can return a different value from <a href="#CL_DEVICE_PREFERRED_WORK_GROUP_SIZE_MULTIPLE"><code>CL_DEVICE_<wbr>PREFERRED_<wbr>WORK_<wbr>GROUP_<wbr>SIZE_<wbr>MULTIPLE</code></a>, one that may be better suited to a specific kernel.)</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_LATEST_CONFORMANCE_VERSION_PASSED"></a><a href="#CL_DEVICE_LATEST_CONFORMANCE_VERSION_PASSED"><code>CL_DEVICE_<wbr>LATEST_<wbr>CONFORMANCE_<wbr>VERSION_<wbr>PASSED</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Returns the latest version of the conformance test suite that this device |
| has fully passed in accordance with the official conformance process.</p></td> |
| </tr> |
| </tbody> |
| </table> |
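| <div class="paragraph"> |
| <p>The hierarchy rules stated for <code>CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES</code> in the table above can be checked mechanically. |
| The following is a minimal sketch, not part of the API: the <code>DEMO_*</code> names and their bit values are stand-ins chosen so that the example compiles without the OpenCL headers, and an application would test the <code>CL_DEVICE_ATOMIC_*</code> bits returned by <strong>clGetDeviceInfo</strong> instead.</p> |
| </div> |

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit values (assumptions for this sketch only). Real code would
   use the CL_DEVICE_ATOMIC_* macros from the OpenCL headers instead. */
typedef uint64_t demo_atomic_caps;
#define DEMO_ORDER_RELAXED      (1u << 0)
#define DEMO_ORDER_ACQ_REL      (1u << 1)
#define DEMO_ORDER_SEQ_CST      (1u << 2)
#define DEMO_SCOPE_WORK_ITEM    (1u << 3)
#define DEMO_SCOPE_WORK_GROUP   (1u << 4)
#define DEMO_SCOPE_DEVICE       (1u << 5)
#define DEMO_SCOPE_ALL_DEVICES  (1u << 6)

/* True if `caps` contains every bit set in `required`. */
bool demo_has_all(demo_atomic_caps caps, demo_atomic_caps required)
{
    return (caps & required) == required;
}

/* Checks a memory-capabilities bit-field against the rules in the table:
   - the mandated minimum (relaxed order, work-group scope) is present;
   - a stronger order implies every weaker order;
   - a wider scope implies every narrower scope, except work-item scope. */
bool demo_atomic_caps_consistent(demo_atomic_caps caps)
{
    if (!demo_has_all(caps, DEMO_ORDER_RELAXED | DEMO_SCOPE_WORK_GROUP))
        return false;
    if ((caps & DEMO_ORDER_SEQ_CST) && !(caps & DEMO_ORDER_ACQ_REL))
        return false;
    if ((caps & DEMO_SCOPE_ALL_DEVICES) && !(caps & DEMO_SCOPE_DEVICE))
        return false;
    return true;
}
```

| <div class="paragraph"> |
| <p>Because the mandated minimum already contains the weakest order and the narrowest required scope, only the two upper levels of each hierarchy need an explicit implication check. |
| The work-item scope bit is deliberately not constrained, since the table treats it as a special case.</p> |
| </div> |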
| <div class="paragraph"> |
| <p><a href="#clGetDeviceInfo"><strong>clGetDeviceInfo</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE"><code>CL_INVALID_<wbr>DEVICE</code></a> if <em>device</em> is not a valid device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>param_name</em> is not one of the supported values, or |
| if the size in bytes specified by <em>param_value_size</em> is less than the size of the return |
| type as specified in the <a href="#device-queries-table">Device Queries</a> table |
| and <em>param_value</em> is not <code>NULL</code>, or if <em>param_name</em> is a value |
| that is available as an extension and the corresponding extension is not |
| supported by the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
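| <div class="paragraph"> |
| <p>Several rows of the <a href="#device-queries-table">Device Queries</a> table state dependencies between query results: a replaceable default on-device queue requires device-side enqueue, and both device-side enqueue and pipes require the generic address space. |
| The sketch below shows one way an application or test might verify those rules after querying a device. |
| The structure, the function, and the <code>DEMO_*</code> bit values are hypothetical stand-ins so that the example compiles without the OpenCL headers.</p> |
| </div> |

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit values (assumptions for this sketch only). Real code would
   use CL_DEVICE_QUEUE_SUPPORTED and CL_DEVICE_QUEUE_REPLACEABLE_DEFAULT. */
#define DEMO_QUEUE_SUPPORTED            (1u << 0)
#define DEMO_QUEUE_REPLACEABLE_DEFAULT  (1u << 1)

/* Hypothetical snapshot of three query results for one device. */
struct demo_device_features {
    uint64_t enqueue_caps;       /* CL_DEVICE_DEVICE_ENQUEUE_CAPABILITIES */
    bool pipe_support;           /* CL_DEVICE_PIPE_SUPPORT */
    bool generic_address_space;  /* CL_DEVICE_GENERIC_ADDRESS_SPACE_SUPPORT */
};

/* Returns true if the snapshot obeys the dependency rules in the table. */
bool demo_features_consistent(const struct demo_device_features *f)
{
    bool queue = (f->enqueue_caps & DEMO_QUEUE_SUPPORTED) != 0;
    bool replaceable = (f->enqueue_caps & DEMO_QUEUE_REPLACEABLE_DEFAULT) != 0;

    if (replaceable && !queue)
        return false;  /* replaceable default queue needs device-side enqueue */
    if (queue && !f->generic_address_space)
        return false;  /* device-side enqueue needs the generic address space */
    if (f->pipe_support && !f->generic_address_space)
        return false;  /* pipes need the generic address space */
    return true;
}
```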
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To query device and host timestamps, call the function:</p> |
| </div> |
| <div id="clGetDeviceAndHostTimer" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clGetDeviceAndHostTimer( |
| cl_device_id device, |
| cl_ulong* device_timestamp, |
| cl_ulong* host_timestamp);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clGetDeviceAndHostTimer"><strong>clGetDeviceAndHostTimer</strong></a> is <a href="#unified-spec">missing before</a> version 2.1. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>device</em> is a device returned by <a href="#clGetDeviceIDs"><strong>clGetDeviceIDs</strong></a>.</p> |
| </li> |
| <li> |
| <p><em>device_timestamp</em> will be updated with the value of the device timer in |
| nanoseconds. |
| The resolution of the timer is the same as the device profiling timer |
| returned by <a href="#clGetDeviceInfo"><strong>clGetDeviceInfo</strong></a> and the <a href="#CL_DEVICE_PROFILING_TIMER_RESOLUTION"><code>CL_DEVICE_<wbr>PROFILING_<wbr>TIMER_<wbr>RESOLUTION</code></a> |
| query.</p> |
| </li> |
| <li> |
| <p><em>host_timestamp</em> will be updated with the value of the host timer in |
| nanoseconds at the closest possible point in time to that at which |
| <em>device_timestamp</em> was returned. |
| The resolution of the timer may be queried via <a href="#clGetPlatformInfo"><strong>clGetPlatformInfo</strong></a> and the |
| flag <a href="#CL_PLATFORM_HOST_TIMER_RESOLUTION"><code>CL_PLATFORM_<wbr>HOST_<wbr>TIMER_<wbr>RESOLUTION</code></a>.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clGetDeviceAndHostTimer"><strong>clGetDeviceAndHostTimer</strong></a> returns a reasonably synchronized pair of |
| timestamps from the device timer and the host timer as seen by <em>device</em>. |
| Implementations may need to execute this query with a high latency in order |
| to provide reasonable synchronization of the timestamps. |
| The host timestamp and device timestamp returned by this function and |
| <a href="#clGetHostTimer"><strong>clGetHostTimer</strong></a> each have an implementation defined timebase. |
| The timestamps will always be in their respective timebases regardless of |
| which query function is used. |
| The timestamp returned from <a href="#clGetEventProfilingInfo"><strong>clGetEventProfilingInfo</strong></a> for an event on a |
| device and a device timestamp queried from the same device will always be in |
| the same timebase.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clGetDeviceAndHostTimer"><strong>clGetDeviceAndHostTimer</strong></a> will return <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> with time values in |
| <em>device_timestamp</em> and <em>host_timestamp</em> if both are provided. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE"><code>CL_INVALID_<wbr>DEVICE</code></a> if <em>device</em> is not a valid device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if the platform associated with <em>device</em> does not |
| support device and host timer synchronization.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>host_timestamp</em> or <em>device_timestamp</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
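| <div class="paragraph"> |
| <p>A common use of the synchronized pair is to express a device-timebase value, such as one read from <strong>clGetEventProfilingInfo</strong>, in the host timebase. |
| The following is a minimal sketch of that arithmetic. |
| The structure and function names are hypothetical, and the sketch assumes that both timers advance at the same rate between the moment the pair was sampled and the event being converted; it does not model drift.</p> |
| </div> |

```c
#include <stdint.h>

/* Hypothetical holder for the two values written by one call to
   clGetDeviceAndHostTimer. */
struct demo_timer_pair {
    uint64_t device_ns;  /* device_timestamp */
    uint64_t host_ns;    /* host_timestamp */
};

/* Converts a device-timebase timestamp to the host timebase by applying
   the distance from the synchronized device timestamp to the synchronized
   host timestamp. Handles events both after and before the sync point.
   Assumes the result does not underflow or overflow 64 bits. */
uint64_t demo_device_to_host_ns(struct demo_timer_pair sync,
                                uint64_t device_event_ns)
{
    if (device_event_ns >= sync.device_ns)
        return sync.host_ns + (device_event_ns - sync.device_ns);
    return sync.host_ns - (sync.device_ns - device_event_ns);
}
```

| <div class="paragraph"> |
| <p>Because the call may have high latency, an application would typically sample the pair occasionally and reuse it for many conversions rather than sampling once per event.</p> |
| </div> |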
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To query the host clock, call the function:</p> |
| </div> |
| <div id="clGetHostTimer" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clGetHostTimer( |
| cl_device_id device, |
| cl_ulong* host_timestamp);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clGetHostTimer"><strong>clGetHostTimer</strong></a> is <a href="#unified-spec">missing before</a> version 2.1. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>device</em> is a device returned by <a href="#clGetDeviceIDs"><strong>clGetDeviceIDs</strong></a>.</p> |
| </li> |
| <li> |
| <p><em>host_timestamp</em> will be updated with the value of the current timer in |
| nanoseconds. |
| The resolution of the timer may be queried via <a href="#clGetPlatformInfo"><strong>clGetPlatformInfo</strong></a> and the |
| flag <a href="#CL_PLATFORM_HOST_TIMER_RESOLUTION"><code>CL_PLATFORM_<wbr>HOST_<wbr>TIMER_<wbr>RESOLUTION</code></a>.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clGetHostTimer"><strong>clGetHostTimer</strong></a> returns the current value of the host clock as seen by |
| <em>device</em>. |
| This value is in the same timebase as the <em>host_timestamp</em> returned from |
| <a href="#clGetDeviceAndHostTimer"><strong>clGetDeviceAndHostTimer</strong></a>. |
| The implementation will return with as low a latency as possible to allow a |
| correlation with a subsequent application sampled time. |
| The host timestamp and device timestamp returned by this function and |
| <a href="#clGetDeviceAndHostTimer"><strong>clGetDeviceAndHostTimer</strong></a> each have an implementation defined timebase. |
| The timestamps will always be in their respective timebases regardless of |
| which query function is used. |
| The timestamp returned from <a href="#clGetEventProfilingInfo"><strong>clGetEventProfilingInfo</strong></a> for an event on a |
| device and a device timestamp queried from the same device will always be in |
| the same timebase.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clGetHostTimer"><strong>clGetHostTimer</strong></a> will return <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> with a time value in |
| <em>host_timestamp</em> if provided. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE"><code>CL_INVALID_<wbr>DEVICE</code></a> if <em>device</em> is not a valid device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if the platform associated with <em>device</em> does not |
| support device and host timer synchronization.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>host_timestamp</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
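| <div class="paragraph"> |
| <p>The low-latency guarantee is what makes it practical to correlate the host timer with a clock the application samples itself. |
| One possible approach, sketched below under stated assumptions, is to read the application clock immediately before and after the <strong>clGetHostTimer</strong> call and treat the midpoint as the moment the host timestamp was taken. |
| The names are hypothetical, all values are assumed to be in nanoseconds, and <code>app_after_ns</code> is assumed to be no smaller than <code>app_before_ns</code>.</p> |
| </div> |

```c
#include <stdint.h>

/* Hypothetical result: how far the application clock is ahead of the
   OpenCL host timer, and how uncertain that estimate is. */
struct demo_clock_offset {
    int64_t offset_ns;        /* application clock minus host timer */
    uint64_t uncertainty_ns;  /* half the bracketing interval, rounded up */
};

/* Estimates the offset from three samples taken in this order:
   application clock, clGetHostTimer, application clock. */
struct demo_clock_offset demo_estimate_offset(uint64_t app_before_ns,
                                              uint64_t host_ns,
                                              uint64_t app_after_ns)
{
    struct demo_clock_offset r;
    uint64_t width = app_after_ns - app_before_ns;
    uint64_t midpoint = app_before_ns + width / 2;

    /* Subtract in the unsigned domain, then apply the sign explicitly. */
    if (midpoint >= host_ns)
        r.offset_ns = (int64_t)(midpoint - host_ns);
    else
        r.offset_ns = -(int64_t)(host_ns - midpoint);

    r.uncertainty_ns = (width + 1) / 2;
    return r;
}
```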
| </div> |
| </div> |
| </div> |
| <div class="sect2"> |
| <h3 id="_partitioning_a_device"><a class="anchor" href="#_partitioning_a_device"></a>4.3. Partitioning a Device</h3> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Partitioning devices is <a href="#unified-spec">missing before</a> version 1.2. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To create sub-devices partitioning an OpenCL device, call the function:</p> |
| </div> |
| <div id="clCreateSubDevices" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clCreateSubDevices( |
| cl_device_id in_device, |
| <span class="directive">const</span> cl_device_partition_property* properties, |
| cl_uint num_devices, |
| cl_device_id* out_devices, |
| cl_uint* num_devices_ret);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clCreateSubDevices"><strong>clCreateSubDevices</strong></a> is <a href="#unified-spec">missing before</a> version 1.2. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>in_device</em> is the device to be partitioned.</p> |
| </li> |
| <li> |
| <p><em>properties</em> specifies how <em>in_device</em> is to be partitioned, described by a |
| partition name and its corresponding value. |
| Each partition name is immediately followed by the corresponding desired |
| value. |
| The list is terminated with 0. |
| The list of supported partitioning schemes is described in the |
| <a href="#subdevice-partition-table">Subdevice Partition</a> table. |
| Only one of the listed partitioning schemes can be specified in |
| <em>properties</em>.</p> |
| </li> |
| <li> |
| <p><em>num_devices</em> is the size of memory pointed to by <em>out_devices</em> specified as |
| the number of <code>cl_device_<wbr>id</code> entries.</p> |
| </li> |
| <li> |
| <p><em>out_devices</em> is the buffer where the OpenCL sub-devices will be returned. |
| If <em>out_devices</em> is <code>NULL</code>, this argument is ignored. |
| If <em>out_devices</em> is not <code>NULL</code>, <em>num_devices</em> must be greater than or equal |
| to the number of sub-devices that <em>in_device</em> may be partitioned into according |
| to the partitioning scheme specified in <em>properties</em>.</p> |
| </li> |
| <li> |
| <p><em>num_devices_ret</em> returns the number of sub-devices that <em>in_device</em> may be |
| partitioned into according to the partitioning scheme specified in |
| <em>properties</em>. |
| If <em>num_devices_ret</em> is <code>NULL</code>, it is ignored.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clCreateSubDevices"><strong>clCreateSubDevices</strong></a> creates an array of sub-devices that each reference a |
| non-intersecting set of compute units within <em>in_device</em>, according to the |
| partition scheme given by <em>properties</em>. |
| The output sub-devices may be used in every way that the root (or parent) |
| device can be used, including creating contexts, building programs, further |
| calls to <a href="#clCreateSubDevices"><strong>clCreateSubDevices</strong></a> and creating command-queues. |
| When a command-queue is created against a sub-device, the commands enqueued |
| on the queue are executed only on the sub-device.</p> |
| </div> |
| <table id="subdevice-partition-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 6. List of supported partition schemes by <a href="#clCreateSubDevices">clCreateSubDevices</a></caption> |
| <colgroup> |
| <col style="width: 33%;"> |
| <col style="width: 17%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Partition Property</th> |
| <th class="tableblock halign-left valign-top">Partition Value</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PARTITION_EQUALLY"></a><a href="#CL_DEVICE_PARTITION_EQUALLY"><code>CL_DEVICE_<wbr>PARTITION_<wbr>EQUALLY</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Split the aggregate device into as many smaller aggregate devices as |
| can be created, each containing <em>n</em> compute units. |
| The value <em>n</em> is passed as the value accompanying this property. |
| If <em>n</em> does not divide evenly into |
| <a href="#CL_DEVICE_MAX_COMPUTE_UNITS"><code>CL_DEVICE_<wbr>MAX_<wbr>COMPUTE_<wbr>UNITS</code></a>, then the remaining compute |
| units are not used.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PARTITION_BY_COUNTS"></a><a href="#CL_DEVICE_PARTITION_BY_COUNTS"><code>CL_DEVICE_<wbr>PARTITION_<wbr>BY_<wbr>COUNTS</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This property is followed by a list of compute unit counts |
| terminated with 0 or <a id="CL_DEVICE_PARTITION_BY_COUNTS_LIST_END"></a><a href="#CL_DEVICE_PARTITION_BY_COUNTS_LIST_END"><code>CL_DEVICE_<wbr>PARTITION_<wbr>BY_<wbr>COUNTS_<wbr>LIST_<wbr>END</code></a>. |
| For each non-zero count <em>m</em> in the list, a sub-device is created |
| with <em>m</em> compute units in it.</p> |
| <p class="tableblock"> The number of non-zero count entries in the list may not exceed |
| <a href="#CL_DEVICE_PARTITION_MAX_SUB_DEVICES"><code>CL_DEVICE_<wbr>PARTITION_<wbr>MAX_<wbr>SUB_<wbr>DEVICES</code></a>.</p> |
| <p class="tableblock"> The total number of compute units specified may not exceed |
| <a href="#CL_DEVICE_MAX_COMPUTE_UNITS"><code>CL_DEVICE_<wbr>MAX_<wbr>COMPUTE_<wbr>UNITS</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN"></a><a href="#CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN"><code>CL_DEVICE_<wbr>PARTITION_<wbr>BY_<wbr>AFFINITY_<wbr>DOMAIN</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>affinity_<wbr>domain</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Split the device into smaller aggregate devices containing one or |
| more compute units that all share part of a cache hierarchy. |
| The value accompanying this property may be drawn from the following |
| list:</p> |
| <p class="tableblock"> <a href="#CL_DEVICE_AFFINITY_DOMAIN_NUMA"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>NUMA</code></a> - Split the device into sub-devices |
| comprised of compute units that share a NUMA node.</p> |
| <p class="tableblock"> <a href="#CL_DEVICE_AFFINITY_DOMAIN_L4_CACHE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>L4_<wbr>CACHE</code></a> - Split the device into |
| sub-devices comprised of compute units that share a level 4 data |
| cache.</p> |
| <p class="tableblock"> <a href="#CL_DEVICE_AFFINITY_DOMAIN_L3_CACHE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>L3_<wbr>CACHE</code></a> - Split the device into |
| sub-devices comprised of compute units that share a level 3 data |
| cache.</p> |
| <p class="tableblock"> <a href="#CL_DEVICE_AFFINITY_DOMAIN_L2_CACHE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>L2_<wbr>CACHE</code></a> - Split the device into |
| sub-devices comprised of compute units that share a level 2 data |
| cache.</p> |
| <p class="tableblock"> <a href="#CL_DEVICE_AFFINITY_DOMAIN_L1_CACHE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>L1_<wbr>CACHE</code></a> - Split the device into |
| sub-devices comprised of compute units that share a level 1 data |
| cache.</p> |
| <p class="tableblock"> <a href="#CL_DEVICE_AFFINITY_DOMAIN_NEXT_PARTITIONABLE"><code>CL_DEVICE_<wbr>AFFINITY_<wbr>DOMAIN_<wbr>NEXT_<wbr>PARTITIONABLE</code></a> - Split the device |
| along the next partitionable affinity domain. |
| The implementation shall find the first level along which the device |
| or sub-device may be further subdivided in the order NUMA, L4, L3, |
| L2, L1, and partition the device into sub-devices comprised of |
| compute units that share memory subsystems at this level.</p> |
| <p class="tableblock"> The user may determine what happened by calling |
| <a href="#clGetDeviceInfo"><strong>clGetDeviceInfo</strong></a>(<a href="#CL_DEVICE_PARTITION_TYPE"><code>CL_DEVICE_<wbr>PARTITION_<wbr>TYPE</code></a>) on the sub-devices.</p></td> |
| </tr> |
| </tbody> |
| </table> |
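| <div class="paragraph"> |
| <p>For <code>CL_DEVICE_PARTITION_EQUALLY</code>, the number of sub-devices and the number of compute units left unused follow directly from integer division. |
| The following sketch illustrates that arithmetic only; the structure and function names are hypothetical, and treating <em>n</em> = 0 as "no sub-devices" is a choice made for this example, whereas an implementation would be expected to report an error for an invalid value.</p> |
| </div> |

```c
#include <stdint.h>

/* Hypothetical result of an equal partition. */
struct demo_equal_split {
    uint32_t sub_devices;   /* sub-devices with n compute units each */
    uint32_t unused_units;  /* compute units that belong to no sub-device */
};

/* Computes how a device with `max_compute_units` compute units (the value
   of CL_DEVICE_MAX_COMPUTE_UNITS) splits into sub-devices of `n` units. */
struct demo_equal_split demo_split_equally(uint32_t max_compute_units,
                                           uint32_t n)
{
    struct demo_equal_split s = { 0, max_compute_units };
    if (n == 0)
        return s;  /* avoid division by zero; nothing can be created */
    s.sub_devices = max_compute_units / n;
    s.unused_units = max_compute_units % n;
    return s;
}
```

| <div class="paragraph"> |
| <p>For example, a device with 10 compute units and <em>n</em> = 4 gives two sub-devices and leaves two compute units unused.</p> |
| </div> |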
| <div class="paragraph"> |
| <p><a href="#clCreateSubDevices"><strong>clCreateSubDevices</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the partition is created |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE"><code>CL_INVALID_<wbr>DEVICE</code></a> if <em>in_device</em> is not a valid device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if values specified in <em>properties</em> are not valid or if |
| values specified in <em>properties</em> are valid but not supported by the |
| device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>out_devices</em> is not <code>NULL</code> and <em>num_devices</em> is |
| less than the number of sub-devices created by the partition scheme.</p> |
| </li> |
| <li> |
| <p><a href="#CL_DEVICE_PARTITION_FAILED"><code>CL_DEVICE_<wbr>PARTITION_<wbr>FAILED</code></a> if the partition name is supported by the |
| implementation but <em>in_device</em> could not be further partitioned.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE_PARTITION_COUNT"><code>CL_INVALID_<wbr>DEVICE_<wbr>PARTITION_<wbr>COUNT</code></a> if the partition name specified in |
| <em>properties</em> is <a href="#CL_DEVICE_PARTITION_BY_COUNTS"><code>CL_DEVICE_<wbr>PARTITION_<wbr>BY_<wbr>COUNTS</code></a> and the number of |
| sub-devices requested exceeds <a href="#CL_DEVICE_PARTITION_MAX_SUB_DEVICES"><code>CL_DEVICE_<wbr>PARTITION_<wbr>MAX_<wbr>SUB_<wbr>DEVICES</code></a> or the |
| total number of compute units requested exceeds |
| <a href="#CL_DEVICE_MAX_COMPUTE_UNITS"><code>CL_DEVICE_<wbr>MAX_<wbr>COMPUTE_<wbr>UNITS</code></a> for <em>in_device</em>, or the number of |
| compute units requested for one or more sub-devices is less than zero or |
| the number of sub-devices requested exceeds |
| <a href="#CL_DEVICE_MAX_COMPUTE_UNITS"><code>CL_DEVICE_<wbr>MAX_<wbr>COMPUTE_<wbr>UNITS</code></a> for <em>in_device</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
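| <div class="paragraph"> |
| <p>The conditions listed for <code>CL_INVALID_DEVICE_PARTITION_COUNT</code> can be checked by the application before calling <strong>clCreateSubDevices</strong>. |
| The following is a minimal sketch of such a pre-check. |
| The enumeration and function are hypothetical, the list is assumed to hold only the counts that follow <code>CL_DEVICE_PARTITION_BY_COUNTS</code> and to be terminated by 0, and <code>intptr_t</code> is used on the assumption that it matches the underlying type of <code>cl_device_partition_property</code>.</p> |
| </div> |

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome codes for this sketch. In the real API every
   failing case below is reported as CL_INVALID_DEVICE_PARTITION_COUNT. */
enum demo_counts_status {
    DEMO_COUNTS_OK,
    DEMO_COUNTS_NEGATIVE,
    DEMO_COUNTS_TOO_MANY_SUB_DEVICES,
    DEMO_COUNTS_TOO_MANY_COMPUTE_UNITS
};

/* Validates a 0-terminated list of compute unit counts against the
   device limits CL_DEVICE_PARTITION_MAX_SUB_DEVICES and
   CL_DEVICE_MAX_COMPUTE_UNITS, both supplied by the caller. */
enum demo_counts_status demo_check_counts(const intptr_t *counts,
                                          uint32_t max_sub_devices,
                                          uint32_t max_compute_units)
{
    uint64_t sub_devices = 0;
    uint64_t units = 0;

    for (size_t i = 0; counts[i] != 0; ++i) {
        if (counts[i] < 0)
            return DEMO_COUNTS_NEGATIVE;
        sub_devices += 1;
        if (sub_devices > max_sub_devices)
            return DEMO_COUNTS_TOO_MANY_SUB_DEVICES;
        units += (uint64_t)counts[i];
        if (units > max_compute_units)
            return DEMO_COUNTS_TOO_MANY_COMPUTE_UNITS;
    }
    return DEMO_COUNTS_OK;
}
```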
| <div class="paragraph"> |
| <p>A few examples that describe how to specify partition properties in the |
| <em>properties</em> argument to <a href="#clCreateSubDevices"><strong>clCreateSubDevices</strong></a> are given below:</p> |
| </div> |
| <div class="paragraph"> |
| <p>To partition a device containing 16 compute units into two sub-devices, each |
| containing 8 compute units, pass the following in <em>properties</em>:</p> |
| </div> |
| <div class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c">{ CL_DEVICE_PARTITION_EQUALLY, <span class="integer">8</span>, |
| <span class="integer">0</span> } <span class="comment">// 0 terminates the property list</span></code></pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>To partition a device with four compute units into two sub-devices with one |
| sub-device containing 3 compute units and the other sub-device 1 compute |
| unit, pass the following in <em>properties</em>:</p> |
| </div> |
| <div class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c">{ CL_DEVICE_PARTITION_BY_COUNTS, |
| <span class="integer">3</span>, <span class="integer">1</span>, CL_DEVICE_PARTITION_BY_COUNTS_LIST_END, |
| <span class="integer">0</span> } <span class="comment">// 0 terminates the property list</span></code></pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>To split a device along the outermost cache line (if any), pass the |
| following in <em>properties</em>:</p> |
| </div> |
| <div class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c">{ CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN, |
| CL_DEVICE_AFFINITY_DOMAIN_NEXT_PARTITIONABLE, |
| <span class="integer">0</span> } <span class="comment">// 0 terminates the property list</span></code></pre> |
| </div> |
| </div> |
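| <div class="paragraph"> |
| <p>The count-based list layout shown above, and the count checks reported as |
| <code>CL_INVALID_DEVICE_PARTITION_COUNT</code>, can be sketched in plain C. |
| This is an illustrative model only, not the code of any implementation: the |
| <code>EX_*</code> constants, the function <code>ex_check_counts</code>, and its two limit |
| parameters are hypothetical stand-ins for values an application would |
| normally take from the OpenCL headers and from a device query.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; real code would use the constants from the OpenCL headers. */
enum {
    EX_PARTITION_BY_COUNTS = 1,
    EX_PARTITION_BY_COUNTS_LIST_END = 0
};

/* Checks a { BY_COUNTS, n0, n1, ..., LIST_END, 0 } property list.
 * Returns 1 if the requested counts fit the given limits, 0 otherwise. */
int ex_check_counts(const intptr_t *props,
                    intptr_t max_compute_units,
                    intptr_t max_sub_devices)
{
    intptr_t total = 0;        /* compute units requested in total */
    intptr_t sub_devices = 0;  /* number of counts in the list */

    assert(props != NULL);
    if (props[0] != EX_PARTITION_BY_COUNTS)
        return 0;

    for (const intptr_t *p = props + 1; *p != EX_PARTITION_BY_COUNTS_LIST_END; ++p) {
        if (*p < 0)
            return 0;          /* negative count for a sub-device */
        total += *p;
        ++sub_devices;
    }

    /* Every listed count is at least 1 here, so total also bounds sub_devices. */
    return sub_devices <= max_sub_devices && total <= max_compute_units;
}
```

| <div class="paragraph"> |
| <p>With limits of 4 compute units and 4 sub-devices, the list <code>{ 3, 1 }</code> from |
| the example above passes this check, while <code>{ 3, 2 }</code> does not because it |
| requests five compute units in total.</p> |
| </div> |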
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To retain a device, call the function:</p> |
| </div> |
| <div id="clRetainDevice" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clRetainDevice( |
| cl_device_id device);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clRetainDevice"><strong>clRetainDevice</strong></a> is <a href="#unified-spec">missing before</a> version 1.2. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>device</em> is the OpenCL device to retain.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clRetainDevice"><strong>clRetainDevice</strong></a> increments the <em>device</em> reference count if <em>device</em> is a |
| valid sub-device created by a call to <a href="#clCreateSubDevices"><strong>clCreateSubDevices</strong></a>. |
| If <em>device</em> is a root level device i.e. a <code>cl_device_<wbr>id</code> returned by |
| <a href="#clGetDeviceIDs"><strong>clGetDeviceIDs</strong></a>, the <em>device</em> reference count remains unchanged.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clRetainDevice"><strong>clRetainDevice</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed successfully |
| or the device is a root-level device. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE"><code>CL_INVALID_<wbr>DEVICE</code></a> if <em>device</em> is not a valid device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To release a device, call the function:</p> |
| </div> |
| <div id="clReleaseDevice" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clReleaseDevice( |
| cl_device_id device);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clReleaseDevice"><strong>clReleaseDevice</strong></a> is <a href="#unified-spec">missing before</a> version 1.2. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>device</em> is the OpenCL device to release.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clReleaseDevice"><strong>clReleaseDevice</strong></a> decrements the <em>device</em> reference count if <em>device</em> is a |
| valid sub-device created by a call to <a href="#clCreateSubDevices"><strong>clCreateSubDevices</strong></a>. |
| If <em>device</em> is a root level device i.e. a <code>cl_device_<wbr>id</code> returned by |
| <a href="#clGetDeviceIDs"><strong>clGetDeviceIDs</strong></a>, the <em>device</em> reference count remains unchanged.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clReleaseDevice"><strong>clReleaseDevice</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE"><code>CL_INVALID_<wbr>DEVICE</code></a> if <em>device</em> is not a valid device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>After the <em>device</em> reference count becomes zero and all the objects attached |
| to <em>device</em> (such as command-queues) are released, the <em>device</em> object is |
| deleted. |
| Using this function to release a reference that was not obtained by creating |
| the object or by calling <a href="#clRetainDevice"><strong>clRetainDevice</strong></a> causes undefined behavior.</p> |
| </div> |
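| <div class="paragraph"> |
| <p>The different treatment of root-level devices and sub-devices by the retain |
| and release functions can be illustrated with a small model. |
| This is a sketch under stated assumptions, not how an implementation stores |
| devices: the <code>ex_device</code> type and the <code>ex_retain</code> and <code>ex_release</code> |
| functions are hypothetical.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a device handle; not the layout of a real cl_device_id. */
typedef struct ex_device {
    int is_sub_device;   /* nonzero if the device was created by partitioning */
    unsigned ref_count;  /* only changes for sub-devices in this model */
} ex_device;

/* Models the retain rule: only sub-devices are reference counted. */
void ex_retain(ex_device *d)
{
    assert(d != NULL);
    if (d->is_sub_device)
        d->ref_count++;
}

/* Models the release rule. Returns 1 when the last reference to a
 * sub-device has been dropped, 0 otherwise. */
int ex_release(ex_device *d)
{
    assert(d != NULL);
    if (!d->is_sub_device)
        return 0;              /* root-level device: count unchanged */
    assert(d->ref_count > 0);  /* releasing an unowned reference is undefined behavior in OpenCL */
    return --d->ref_count == 0;
}
```

| <div class="paragraph"> |
| <p>In this model a sub-device that starts with a count of 1 and is retained once |
| needs two releases before <code>ex_release</code> reports that the last reference is |
| gone, while a root-level device keeps its count no matter how often either |
| function is called.</p> |
| </div> |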
| </div> |
| </div> |
| </div> |
| <div class="sect2"> |
| <h3 id="_contexts"><a class="anchor" href="#_contexts"></a>4.4. Contexts</h3> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To create an OpenCL context, call the function:</p> |
| </div> |
| <div id="clCreateContext" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_context clCreateContext( |
| <span class="directive">const</span> cl_context_properties* properties, |
| cl_uint num_devices, |
| <span class="directive">const</span> cl_device_id* devices, |
| <span class="directive">void</span> (CL_CALLBACK* pfn_notify)(<span class="directive">const</span> <span class="predefined-type">char</span>* errinfo, <span class="directive">const</span> <span class="directive">void</span>* private_info, size_t cb, <span class="directive">void</span>* user_data), |
| <span class="directive">void</span>* user_data, |
| cl_int* errcode_ret);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>properties</em> specifies a list of context property names and their |
| corresponding values. |
| Each property name is immediately followed by the corresponding desired |
| value. |
| The list is terminated with 0. |
| The list of supported properties is described in the |
| <a href="#context-properties-table">Context Properties</a> table. |
| <em>properties</em> can be <code>NULL</code> in which case the platform that is selected is |
| implementation-defined.</p> |
| </li> |
| <li> |
| <p><em>num_devices</em> is the number of devices specified in the <em>devices</em> argument.</p> |
| </li> |
| <li> |
| <p><em>devices</em> is a pointer to a list of unique devices returned by |
| <a href="#clGetDeviceIDs"><strong>clGetDeviceIDs</strong></a> or sub-devices created by <a href="#clCreateSubDevices"><strong>clCreateSubDevices</strong></a> for a |
| platform. <sup class="footnote">[<a id="_footnoteref_11" class="footnote" href="#_footnotedef_11" title="View footnote.">11</a>]</sup></p> |
| </li> |
| <li> |
| <p><em>pfn_notify</em> is a callback function that can be registered by the |
| application. |
| This callback function will be used by the OpenCL implementation to report |
| information on errors during context creation as well as errors that occur |
| at runtime in this context. |
| This callback function may be called asynchronously by the OpenCL |
| implementation. |
| It is the application’s responsibility to ensure that the callback function |
| is thread-safe. |
| If <em>pfn_notify</em> is <code>NULL</code>, no callback function is registered.</p> |
| </li> |
| <li> |
| <p><em>user_data</em> will be passed as the <em>user_data</em> argument when <em>pfn_notify</em> is |
| called. |
| <em>user_data</em> can be <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><em>errcode_ret</em> will return an appropriate error code. |
| If <em>errcode_ret</em> is <code>NULL</code>, no error code is returned.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>The parameters to the callback function <em>pfn_notify</em> are:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>errinfo</em> is a pointer to an error string.</p> |
| </li> |
| <li> |
| <p><em>private_info</em> is a pointer to binary data returned by the OpenCL |
| implementation, and <em>cb</em> is the size of that data in bytes; the data can be |
| used to log additional information helpful in debugging the error.</p> |
| </li> |
| <li> |
| <p><em>user_data</em> is a pointer to user supplied data.</p> |
| </li> |
| </ul> |
| </div> |
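| <div class="paragraph"> |
| <p>Because <em>pfn_notify</em> may be called asynchronously, a callback that updates |
| shared state needs its own synchronization. |
| The following is a minimal sketch, assuming a C11 compiler that provides |
| <code>&lt;stdatomic.h&gt;</code>. |
| The <code>ex_error_log</code> type and the <code>ex_notify</code> function are hypothetical names, |
| and the <code>CL_CALLBACK</code> macro from the OpenCL headers is left out so that the |
| fragment compiles on its own.</p> |
| </div> |

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical application state handed to the callback through user_data. */
typedef struct ex_error_log {
    atomic_uint error_count;  /* updated atomically, so concurrent calls are safe */
} ex_error_log;

/* Same parameter list as pfn_notify. Real code would place CL_CALLBACK
 * between the return type and the function name. */
void ex_notify(const char *errinfo, const void *private_info,
               size_t cb, void *user_data)
{
    ex_error_log *log = user_data;
    assert(log != NULL);

    (void)private_info;  /* opaque implementation data; only its size is reported here */
    atomic_fetch_add(&log->error_count, 1u);
    fprintf(stderr, "OpenCL context error: %s (%zu bytes of private info)\n",
            errinfo ? errinfo : "(no message)", cb);
}
```

| <div class="paragraph"> |
| <p>An application using this sketch would pass <code>ex_notify</code> as <em>pfn_notify</em> and a |
| pointer to an initialized <code>ex_error_log</code> as <em>user_data</em>; the log object has to |
| outlive the context, since the callback may run at any time while the context |
| exists.</p> |
| </div> |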
| <div class="paragraph"> |
| <p>Contexts are used by the OpenCL runtime for managing objects such as |
| command-queues, memory, program and kernel objects and for executing kernels |
| on one or more devices specified in the context.</p> |
| </div> |
| <table id="context-properties-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 7. List of supported context creation properties by <a href="#clCreateContext">clCreateContext</a></caption> |
| <colgroup> |
| <col style="width: 33%;"> |
| <col style="width: 17%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Context Property</th> |
| <th class="tableblock halign-left valign-top">Property Value</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_CONTEXT_PLATFORM"></a><a href="#CL_CONTEXT_PLATFORM"><code>CL_CONTEXT_<wbr>PLATFORM</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_platform_<wbr>id</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Specifies the platform to use.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_CONTEXT_INTEROP_USER_SYNC"></a><a href="#CL_CONTEXT_INTEROP_USER_SYNC"><code>CL_CONTEXT_<wbr>INTEROP_<wbr>USER_<wbr>SYNC</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_bool</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Specifies whether the user is responsible for synchronization |
| between OpenCL and other APIs. |
| Please refer to the specific sections in the OpenCL Extension |
| Specification that describe sharing with other APIs for restrictions |
| on using this flag.</p> |
| <p class="tableblock"> If <a href="#CL_CONTEXT_INTEROP_USER_SYNC"><code>CL_CONTEXT_<wbr>INTEROP_<wbr>USER_<wbr>SYNC</code></a> is not specified, a default of |
| <a href="#CL_FALSE"><code>CL_FALSE</code></a> is assumed.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| There are a number of cases where error notifications need to be |
| delivered due to an error that occurs outside a context. |
| Such notifications may not be delivered through the <em>pfn_notify</em> callback. |
| Where these notifications go is implementation-defined. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clCreateContext"><strong>clCreateContext</strong></a> returns a valid non-zero context and <em>errcode_ret</em> is set |
| to <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the context is created successfully. |
| Otherwise, it returns a <code>NULL</code> value with the following error values |
| returned in <em>errcode_ret</em>:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_PLATFORM"><code>CL_INVALID_<wbr>PLATFORM</code></a> if <em>properties</em> is <code>NULL</code> and no platform could be |
| selected or if platform value specified in <em>properties</em> is not a valid |
| platform.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_PROPERTY"><code>CL_INVALID_<wbr>PROPERTY</code></a> if context property name in <em>properties</em> is not a |
| supported property name, if the value specified for a supported property |
| name is not valid, or if the same property name is specified more than |
| once. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>devices</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>num_devices</em> is equal to zero.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>pfn_notify</em> is <code>NULL</code> but <em>user_data</em> is not |
| <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE"><code>CL_INVALID_<wbr>DEVICE</code></a> if any device in <em>devices</em> is not a valid device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_DEVICE_NOT_AVAILABLE"><code>CL_DEVICE_<wbr>NOT_<wbr>AVAILABLE</code></a> if a device in <em>devices</em> is currently not |
| available even though the device was returned by <a href="#clGetDeviceIDs"><strong>clGetDeviceIDs</strong></a>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
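| <div class="paragraph"> |
| <p>The three conditions listed for <code>CL_INVALID_PROPERTY</code> (an unsupported name, |
| an invalid value, or a repeated name) amount to one pass over the |
| zero-terminated name and value pairs. |
| The sketch below models that pass in plain C. |
| It is not taken from any implementation: the <code>EX_*</code> constants and |
| <code>ex_scan_properties</code> are hypothetical, and the value check assumes that a |
| boolean property holds 0 or 1.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two context property names in the table above. */
enum { EX_CONTEXT_PLATFORM = 1, EX_CONTEXT_INTEROP_USER_SYNC = 2 };

/* Scans a zero-terminated { name, value, ..., 0 } list.
 * Returns 1 and stores the platform value (0 if none was given) when every
 * name is known, appears at most once, and has an acceptable value.
 * Returns 0 otherwise. */
int ex_scan_properties(const intptr_t *props, intptr_t *platform_out)
{
    int seen_platform = 0;
    int seen_sync = 0;

    assert(platform_out != NULL);
    *platform_out = 0;
    if (props == NULL)
        return 1;  /* NULL list: the platform choice is left to the implementation */

    /* Step over pairs, so a value of 0 is never mistaken for the terminator. */
    for (size_t i = 0; props[i] != 0; i += 2) {
        switch (props[i]) {
        case EX_CONTEXT_PLATFORM:
            if (seen_platform++)
                return 0;  /* same name given twice */
            *platform_out = props[i + 1];
            break;
        case EX_CONTEXT_INTEROP_USER_SYNC:
            if (seen_sync++)
                return 0;  /* same name given twice */
            if (props[i + 1] != 0 && props[i + 1] != 1)
                return 0;  /* not a boolean value */
            break;
        default:
            return 0;      /* unknown property name */
        }
    }
    return 1;
}
```

| <div class="paragraph"> |
| <p>Stepping through the list two entries at a time matters because a property |
| value may itself be 0; only a 0 in a name position ends the list.</p> |
| </div> |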
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| <div class="paragraph"> |
| <p>One or more devices may become unavailable after a context and the |
| command-queues that use those devices have been created and commands have |
| been queued to the command-queues. |
| In this case the behavior of OpenCL API calls that use this context (and its |
| command-queues) is implementation-defined. |
| The user callback function, if one was specified when the context was |
| created, can be used to record appropriate information from the <em>errinfo</em> and |
| <em>private_info</em> arguments passed to the callback function when the device |
| becomes unavailable.</p> |
| </div> |
| </td> |
| </tr> |
| </table> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To create an OpenCL context from a specific device |
| type <sup class="footnote">[<a id="_footnoteref_12" class="footnote" href="#_footnotedef_12" title="View footnote.">12</a>]</sup>, call the function:</p> |
| </div> |
| <div id="clCreateContextFromType" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_context clCreateContextFromType( |
| <span class="directive">const</span> cl_context_properties* properties, |
| cl_device_type device_type, |
| <span class="directive">void</span> (CL_CALLBACK* pfn_notify)(<span class="directive">const</span> <span class="predefined-type">char</span>* errinfo, <span class="directive">const</span> <span class="directive">void</span>* private_info, size_t cb, <span class="directive">void</span>* user_data), |
| <span class="directive">void</span>* user_data, |
| cl_int* errcode_ret);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>properties</em> specifies a list of context property names and their |
| corresponding values. |
| Each property name is immediately followed by the corresponding desired |
| value. |
| The list of supported properties is described in the |
| <a href="#context-properties-table">Context Properties</a> table. |
| <em>properties</em> can also be <code>NULL</code> in which case the platform that is selected |
| is implementation-defined.</p> |
| </li> |
| <li> |
| <p><em>device_type</em> is a bit-field that identifies the type of device and is |
| described in the <a href="#device-types-table">Device Types</a> table.</p> |
| </li> |
| <li> |
| <p><em>pfn_notify</em> and <em>user_data</em> are described in <a href="#clCreateContext"><strong>clCreateContext</strong></a>.</p> |
| </li> |
| <li> |
| <p><em>errcode_ret</em> will return an appropriate error code. |
| If <em>errcode_ret</em> is <code>NULL</code>, no error code is returned.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Only devices that are returned by <a href="#clGetDeviceIDs"><strong>clGetDeviceIDs</strong></a> for <em>device_type</em> are |
| used to create the context. |
| The context does not reference any sub-devices that may have been created |
| from these devices.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clCreateContextFromType"><strong>clCreateContextFromType</strong></a> returns a valid non-zero context and <em>errcode_ret</em> |
| is set to <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the context is created successfully. |
| Otherwise, it returns a <code>NULL</code> value with the following error values |
| returned in <em>errcode_ret</em>:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_PLATFORM"><code>CL_INVALID_<wbr>PLATFORM</code></a> if <em>properties</em> is <code>NULL</code> and no platform could be |
| selected or if platform value specified in <em>properties</em> is not a valid |
| platform.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_PROPERTY"><code>CL_INVALID_<wbr>PROPERTY</code></a> if context property name in <em>properties</em> is not a |
| supported property name, if the value specified for a supported property |
| name is not valid, or if the same property name is specified more than |
| once. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>pfn_notify</em> is <code>NULL</code> but <em>user_data</em> is not |
| <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE_TYPE"><code>CL_INVALID_<wbr>DEVICE_<wbr>TYPE</code></a> if <em>device_type</em> is not a valid value.</p> |
| </li> |
| <li> |
| <p><a href="#CL_DEVICE_NOT_AVAILABLE"><code>CL_DEVICE_<wbr>NOT_<wbr>AVAILABLE</code></a> if no devices that match <em>device_type</em> and |
| property values specified in <em>properties</em> are currently available.</p> |
| </li> |
| <li> |
| <p><a href="#CL_DEVICE_NOT_FOUND"><code>CL_DEVICE_<wbr>NOT_<wbr>FOUND</code></a> if no devices that match <em>device_type</em> and property |
| values specified in <em>properties</em> were found.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To retain a context, call the function:</p> |
| </div> |
| <div id="clRetainContext" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clRetainContext( |
| cl_context context);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> specifies the OpenCL context to retain.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clRetainContext"><strong>clRetainContext</strong></a> increments the <em>context</em> reference count.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clCreateContext"><strong>clCreateContext</strong></a> and <a href="#clCreateContextFromType"><strong>clCreateContextFromType</strong></a> perform an implicit retain. |
| This is very helpful for 3<sup>rd</sup> party libraries, which typically get a |
| context passed to them by the application. |
| However, it is possible that the application may delete the context without |
| informing the library. |
| Allowing functions to attach to (i.e. retain) and release a context solves |
| the problem of a context being used by a library no longer being valid.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clRetainContext"><strong>clRetainContext</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if <em>context</em> is not a valid OpenCL context.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To release a context, call the function:</p> |
| </div> |
| <div id="clReleaseContext" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clReleaseContext( |
| cl_context context);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> specifies the OpenCL context to release.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clReleaseContext"><strong>clReleaseContext</strong></a> decrements the <em>context</em> reference count. |
| After the reference count becomes zero and all the objects attached to |
| <em>context</em> (such as memory objects, command-queues) are released, the |
| <em>context</em> is deleted. |
| Using this function to release a reference that was not obtained by creating |
| the object or by calling <a href="#clRetainContext"><strong>clRetainContext</strong></a> causes undefined behavior.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clReleaseContext"><strong>clReleaseContext</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if <em>context</em> is not a valid OpenCL context.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To query information about a context, call the function:</p> |
| </div> |
| <div id="clGetContextInfo" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clGetContextInfo( |
| cl_context context, |
| cl_context_info param_name, |
| size_t param_value_size, |
| <span class="directive">void</span>* param_value, |
| size_t* param_value_size_ret);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> specifies the OpenCL context being queried.</p> |
| </li> |
| <li> |
| <p><em>param_name</em> is an enumeration constant that specifies the information to |
| query.</p> |
| </li> |
| <li> |
| <p><em>param_value</em> is a pointer to memory where the appropriate result being |
| queried is returned. |
| If <em>param_value</em> is <code>NULL</code>, it is ignored.</p> |
| </li> |
| <li> |
| <p><em>param_value_size</em> specifies the size in bytes of memory pointed to by |
| <em>param_value</em>. |
| This size must be greater than or equal to the size of return type as |
| described in the <a href="#context-info-table">Context Attributes</a> table.</p> |
| </li> |
| <li> |
| <p><em>param_value_size_ret</em> returns the actual size in bytes of data being |
| queried by <em>param_name</em>. |
| If <em>param_value_size_ret</em> is <code>NULL</code>, it is ignored.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>The list of supported <em>param_name</em> values and the information returned in |
| <em>param_value</em> by <a href="#clGetContextInfo"><strong>clGetContextInfo</strong></a> is described in the |
| <a href="#context-info-table">Context Attributes</a> table.</p> |
| </div> |
| <table id="context-info-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 8. List of supported param_names by <a href="#clGetContextInfo">clGetContextInfo</a></caption> |
| <colgroup> |
| <col style="width: 33%;"> |
| <col style="width: 17%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Context Info</th> |
| <th class="tableblock halign-left valign-top">Return Type</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_CONTEXT_REFERENCE_COUNT"></a><a href="#CL_CONTEXT_REFERENCE_COUNT"><code>CL_CONTEXT_<wbr>REFERENCE_<wbr>COUNT</code></a> <sup class="footnote">[<a id="_footnoteref_13" class="footnote" href="#_footnotedef_13" title="View footnote.">13</a>]</sup></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Return the <em>context</em> reference count.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_CONTEXT_NUM_DEVICES"></a><a href="#CL_CONTEXT_NUM_DEVICES"><code>CL_CONTEXT_<wbr>NUM_<wbr>DEVICES</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.1.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Return the number of devices in <em>context</em>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_CONTEXT_DEVICES"></a><a href="#CL_CONTEXT_DEVICES"><code>CL_CONTEXT_<wbr>DEVICES</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>id</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Return the list of devices and sub-devices in <em>context</em>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_CONTEXT_PROPERTIES"></a><a href="#CL_CONTEXT_PROPERTIES"><code>CL_CONTEXT_<wbr>PROPERTIES</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_context_<wbr>properties</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Return the properties argument specified in <a href="#clCreateContext"><strong>clCreateContext</strong></a> or |
| <a href="#clCreateContextFromType"><strong>clCreateContextFromType</strong></a>.</p> |
| <p class="tableblock"> If the <em>properties</em> argument specified in <a href="#clCreateContext"><strong>clCreateContext</strong></a> or |
| <a href="#clCreateContextFromType"><strong>clCreateContextFromType</strong></a> used to create <em>context</em> was not <code>NULL</code>, |
| the implementation must return the values specified in the |
| properties argument in the same order and without including |
| additional properties.</p> |
| <p class="tableblock"> If the <em>properties</em> argument specified in <a href="#clCreateContext"><strong>clCreateContext</strong></a> or |
| <a href="#clCreateContextFromType"><strong>clCreateContextFromType</strong></a> used to create <em>context</em> was <code>NULL</code>, the |
| implementation must return <em>param_value_size_ret</em> equal to 0, |
| indicating that there are no properties to be returned.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="paragraph"> |
| <p><a href="#clGetContextInfo"><strong>clGetContextInfo</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if <em>context</em> is not a valid context.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>param_name</em> is not one of the supported values or |
| if the size in bytes specified by <em>param_value_size</em> is less than the size of the return |
| type as specified in the <a href="#context-info-table">Context Attributes</a> |
| table and <em>param_value</em> is not a <code>NULL</code> value.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To register a callback function with a context that is called when |
| the context is destroyed, call the function</p> |
| </div> |
| <div id="clSetContextDestructorCallback" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clSetContextDestructorCallback( |
| cl_context context, |
| <span class="directive">void</span> (CL_CALLBACK* pfn_notify)(cl_context context, <span class="directive">void</span>* user_data), |
| <span class="directive">void</span>* user_data);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clSetContextDestructorCallback"><strong>clSetContextDestructorCallback</strong></a> is <a href="#unified-spec">missing before</a> version 3.0. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> specifies the OpenCL context to register the callback to.</p> |
| </li> |
| <li> |
| <p><em>pfn_notify</em> is the callback function to register. |
| This callback function may be called asynchronously by the OpenCL |
| implementation. |
| It is the application’s responsibility to ensure that the callback function |
| is thread-safe. |
| The parameters to this callback function are:</p> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> is the OpenCL context being deleted. |
| When the callback function is called by the implementation, this context |
| is no longer valid. |
| <em>context</em> is only provided for reference purposes.</p> |
| </li> |
| <li> |
| <p><em>user_data</em> is a pointer to user-supplied data.</p> |
| </li> |
| </ul> |
| </div> |
| </li> |
| <li> |
| <p><em>user_data</em> will be passed as the <em>user_data</em> argument when <em>pfn_notify</em> is |
| called. |
| <em>user_data</em> can be <code>NULL</code>.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Each call to <a href="#clSetContextDestructorCallback"><strong>clSetContextDestructorCallback</strong></a> registers the specified |
| callback function on a destructor callback stack associated with <em>context</em>. |
| The registered callback functions are called in the reverse order in |
| which they were registered. |
| If a context callback function was specified when <em>context</em> was created, |
| it will not be called after any context destructor callback is called. |
| Therefore, the context destructor callback provides a mechanism for an |
| application to safely re-use or free any <em>user_data</em> specified for the |
| context callback function when <em>context</em> was created.</p> |
| </div> |
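| <div class="paragraph"> |
| <p>The reverse-order behavior described above can be sketched with a small, self-contained C model. |
| Every name in it (<code>toy_context</code>, <code>toy_set_destructor_callback</code>, <code>toy_destroy</code>, <code>toy_record</code>) is a hypothetical stand-in invented for this illustration and is not part of the OpenCL API. |
| The fixed-size array is an assumption of the sketch, not a statement about how an implementation stores its callbacks.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CALLBACKS 8

typedef struct toy_context toy_context;
typedef void (*toy_destructor_fn)(toy_context *context, void *user_data);

/* Hypothetical stand-in for a context that owns a destructor callback stack. */
struct toy_context {
    toy_destructor_fn fn[TOY_MAX_CALLBACKS];
    void *user_data[TOY_MAX_CALLBACKS];
    size_t count; /* number of callbacks currently registered */
};

/* Push one callback onto the stack. Returns 0 on success, -1 if the
 * arguments are invalid or the fixed-size stack is full. */
int toy_set_destructor_callback(toy_context *ctx, toy_destructor_fn fn,
                                void *user_data)
{
    if (ctx == NULL || fn == NULL || ctx->count == TOY_MAX_CALLBACKS)
        return -1;
    ctx->fn[ctx->count] = fn;
    ctx->user_data[ctx->count] = user_data;
    ctx->count++;
    return 0;
}

/* Pop and invoke every callback, most recently registered first. */
void toy_destroy(toy_context *ctx)
{
    assert(ctx != NULL);
    while (ctx->count > 0) {
        ctx->count--;
        ctx->fn[ctx->count](ctx, ctx->user_data[ctx->count]);
    }
}

/* Example callback: appends the character behind user_data to a log. */
char toy_log[TOY_MAX_CALLBACKS + 1];
size_t toy_log_len;

void toy_record(toy_context *ctx, void *user_data)
{
    (void)ctx; /* provided for reference only */
    if (toy_log_len < TOY_MAX_CALLBACKS) {
        toy_log[toy_log_len++] = *(const char *)user_data;
        toy_log[toy_log_len] = '\0';
    }
}
```

| <div class="paragraph"> |
| <p>Registering <code>toy_record</code> three times with the tags A, B and C and then calling <code>toy_destroy</code> leaves the log reading C, B, A in this model, which mirrors the last-registered, first-called rule.</p> |
| </div> |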
| <div class="paragraph"> |
| <p><a href="#clSetContextDestructorCallback"><strong>clSetContextDestructorCallback</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is |
| executed successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if <em>context</em> is not a valid context.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>pfn_notify</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div class="sect1"> |
| <h2 id="opencl-runtime"><a class="anchor" href="#opencl-runtime"></a>5. The OpenCL Runtime</h2> |
| <div class="sectionbody"> |
| <div class="paragraph"> |
| <p>This section describes the API calls that manage OpenCL objects such as |
| command-queues, memory objects, program objects, and kernel objects for kernel |
| functions in a program, as well as the calls that enqueue commands to a |
| command-queue, such as executing a kernel or reading or writing a memory |
| object.</p> |
| </div> |
| <div class="sect2"> |
| <h3 id="_command_queues"><a class="anchor" href="#_command_queues"></a>5.1. Command Queues</h3> |
| <div class="paragraph"> |
| <p>OpenCL objects such as memory, program and kernel objects are created using |
| a context. |
| Operations on these objects are performed using a command-queue. |
| The command-queue can be used to queue a set of operations (referred to as |
| commands) in order. |
| Having multiple command-queues allows applications to queue multiple |
| independent commands without requiring synchronization. |
| This holds only as long as the objects those commands use are not shared |
| between command-queues. |
| Sharing of objects across multiple command-queues requires the |
| application to perform appropriate synchronization, as described in |
| <a href="#shared-opencl-objects">Shared OpenCL Objects</a>.</p> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To create a host or device command-queue on a specific device, call the |
| function</p> |
| </div> |
| <div id="clCreateCommandQueueWithProperties" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_command_queue clCreateCommandQueueWithProperties( |
| cl_context context, |
| cl_device_id device, |
| <span class="directive">const</span> cl_queue_properties* properties, |
| cl_int* errcode_ret);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clCreateCommandQueueWithProperties"><strong>clCreateCommandQueueWithProperties</strong></a> is <a href="#unified-spec">missing before</a> version 2.0. |
| Also see extension <strong>cl_khr_create_command_queue</strong>. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> must be a valid OpenCL context.</p> |
| </li> |
| <li> |
| <p><em>device</em> must be a device or sub-device associated with <em>context</em>. |
| It can either be in the list of devices and sub-devices specified when |
| <em>context</em> is created using <a href="#clCreateContext"><strong>clCreateContext</strong></a> or be a root device with the |
| same device type as specified when <em>context</em> is created using |
| <a href="#clCreateContextFromType"><strong>clCreateContextFromType</strong></a>.</p> |
| </li> |
| <li> |
| <p><em>properties</em> specifies a list of properties for the command-queue and their |
| corresponding values. |
| Each property name is immediately followed by the corresponding desired |
| value. |
| The list is terminated with 0. |
| The list of supported properties is described in the |
| <a href="#queue-properties-table">table below</a>. |
| If a supported property and its value are not specified in <em>properties</em>, its |
| default value will be used. |
| <em>properties</em> can be <code>NULL</code> in which case the default values for supported |
| command-queue properties will be used.</p> |
| </li> |
| <li> |
| <p><em>errcode_ret</em> will return an appropriate error code. |
| If <em>errcode_ret</em> is <code>NULL</code>, no error code is returned.</p> |
| </li> |
| </ul> |
| </div> |
| <table id="queue-properties-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 9. List of supported queue creation properties by <a href="#clCreateCommandQueueWithProperties">clCreateCommandQueueWithProperties</a></caption> |
| <colgroup> |
| <col style="width: 33%;"> |
| <col style="width: 17%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Queue Property</th> |
| <th class="tableblock halign-left valign-top">Property Value</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_QUEUE_PROPERTIES"></a><a href="#CL_QUEUE_PROPERTIES"><code>CL_QUEUE_<wbr>PROPERTIES</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_command_<wbr>queue_<wbr>properties</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This is a bitfield and can be set to a combination of the following |
| values:</p> |
| <p class="tableblock"> <a id="CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE"></a><a href="#CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE"><code>CL_QUEUE_<wbr>OUT_<wbr>OF_<wbr>ORDER_<wbr>EXEC_<wbr>MODE_<wbr>ENABLE</code></a> - Determines whether the |
| commands queued in the command-queue are executed in-order or |
| out-of-order. |
| If set, the commands in the command-queue are executed out-of-order. |
| Otherwise, commands are executed in-order.</p> |
| <p class="tableblock"> <a id="CL_QUEUE_PROFILING_ENABLE"></a><a href="#CL_QUEUE_PROFILING_ENABLE"><code>CL_QUEUE_<wbr>PROFILING_<wbr>ENABLE</code></a> - Enable or disable profiling of commands |
| in the command-queue. |
| If set, the profiling of commands is enabled. |
| Otherwise, profiling of commands is disabled.</p> |
| <p class="tableblock"> <a id="CL_QUEUE_ON_DEVICE"></a><a href="#CL_QUEUE_ON_DEVICE"><code>CL_QUEUE_<wbr>ON_<wbr>DEVICE</code></a> - Indicates that this is a device queue. |
| If <a href="#CL_QUEUE_ON_DEVICE"><code>CL_QUEUE_<wbr>ON_<wbr>DEVICE</code></a> is set, |
| <a href="#CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE"><code>CL_QUEUE_<wbr>OUT_<wbr>OF_<wbr>ORDER_<wbr>EXEC_<wbr>MODE_<wbr>ENABLE</code></a> |
| <sup class="footnote">[<a id="_footnoteref_14" class="footnote" href="#_footnotedef_14" title="View footnote.">14</a>]</sup> |
| must also be set. |
| <a href="#unified-spec">Missing before</a> version 2.0.</p> |
| <p class="tableblock"> <a id="CL_QUEUE_ON_DEVICE_DEFAULT"></a><a href="#CL_QUEUE_ON_DEVICE_DEFAULT"><code>CL_QUEUE_<wbr>ON_<wbr>DEVICE_<wbr>DEFAULT</code></a> |
| <sup class="footnote">[<a id="_footnoteref_15" class="footnote" href="#_footnotedef_15" title="View footnote.">15</a>]</sup> - |
| indicates that this is the default device queue. |
| This can only be used with <a href="#CL_QUEUE_ON_DEVICE"><code>CL_QUEUE_<wbr>ON_<wbr>DEVICE</code></a>. |
| <a href="#unified-spec">Missing before</a> version 2.0.</p> |
| <p class="tableblock"> If <a href="#CL_QUEUE_PROPERTIES"><code>CL_QUEUE_<wbr>PROPERTIES</code></a> is not specified, an in-order host |
| command-queue is created for the specified device.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_QUEUE_SIZE"></a><a href="#CL_QUEUE_SIZE"><code>CL_QUEUE_<wbr>SIZE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Specifies the size of the device queue in bytes.</p> |
| <p class="tableblock"> This can only be specified if <a href="#CL_QUEUE_ON_DEVICE"><code>CL_QUEUE_<wbr>ON_<wbr>DEVICE</code></a> is set in |
| <a href="#CL_QUEUE_PROPERTIES"><code>CL_QUEUE_<wbr>PROPERTIES</code></a>. |
| This must be a value ≤ <a href="#CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE"><code>CL_DEVICE_<wbr>QUEUE_<wbr>ON_<wbr>DEVICE_<wbr>MAX_<wbr>SIZE</code></a>.</p> |
| <p class="tableblock"> For best performance, this should be ≤ |
| <a href="#CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE"><code>CL_DEVICE_<wbr>QUEUE_<wbr>ON_<wbr>DEVICE_<wbr>PREFERRED_<wbr>SIZE</code></a>.</p> |
| <p class="tableblock"> If <a href="#CL_QUEUE_SIZE"><code>CL_QUEUE_<wbr>SIZE</code></a> is not specified, the device queue is created with |
| <a href="#CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE"><code>CL_DEVICE_<wbr>QUEUE_<wbr>ON_<wbr>DEVICE_<wbr>PREFERRED_<wbr>SIZE</code></a> as the size of the queue.</p></td> |
| </tr> |
| </tbody> |
| </table> |
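| <div class="paragraph"> |
| <p>The layout of <em>properties</em> (name, value, name, value, then a terminating 0) and the constraints in the table above can be illustrated with a minimal C sketch. |
| The <code>TOY_</code> names, their numeric values, and the function <code>toy_queue_properties_valid</code> are hypothetical stand-ins invented here; a real program would use the definitions from the OpenCL headers. |
| The sketch only reports whether a list is consistent with the table and does not try to reproduce which error code an implementation would return.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the names in the table above; the numeric
 * values are placeholders chosen for this sketch. */
typedef uint64_t toy_queue_properties;
#define TOY_QUEUE_PROPERTIES        0x1001u
#define TOY_QUEUE_SIZE              0x1002u
#define TOY_QUEUE_OUT_OF_ORDER      (1u << 0)
#define TOY_QUEUE_PROFILING         (1u << 1)
#define TOY_QUEUE_ON_DEVICE         (1u << 2)
#define TOY_QUEUE_ON_DEVICE_DEFAULT (1u << 3)
#define TOY_QUEUE_ALL_FLAGS \
    (TOY_QUEUE_OUT_OF_ORDER | TOY_QUEUE_PROFILING | \
     TOY_QUEUE_ON_DEVICE | TOY_QUEUE_ON_DEVICE_DEFAULT)

/* Returns 1 if the zero-terminated name/value list is consistent with the
 * rules in the table, 0 otherwise. max_device_queue_size plays the role of
 * the device's maximum on-device queue size. */
int toy_queue_properties_valid(const toy_queue_properties *props,
                               toy_queue_properties max_device_queue_size)
{
    toy_queue_properties flags = 0, size = 0;
    int has_size = 0;
    size_t i;

    if (props == NULL)
        return 1; /* NULL selects the default values */

    /* Walk the list two entries at a time: property name, then its value. */
    for (i = 0; props[i] != 0; i += 2) {
        if (props[i] == TOY_QUEUE_PROPERTIES) {
            flags = props[i + 1];
        } else if (props[i] == TOY_QUEUE_SIZE) {
            has_size = 1;
            size = props[i + 1];
        } else {
            return 0; /* property name not in the table */
        }
    }

    if (flags & ~(toy_queue_properties)TOY_QUEUE_ALL_FLAGS)
        return 0; /* unknown bit in the bitfield */
    if ((flags & TOY_QUEUE_ON_DEVICE) && !(flags & TOY_QUEUE_OUT_OF_ORDER))
        return 0; /* a device queue must also be out-of-order */
    if ((flags & TOY_QUEUE_ON_DEVICE_DEFAULT) && !(flags & TOY_QUEUE_ON_DEVICE))
        return 0; /* "default" only makes sense for a device queue */
    if (has_size && (!(flags & TOY_QUEUE_ON_DEVICE) ||
                     size > max_device_queue_size))
        return 0; /* size needs a device queue and must fit the maximum */
    return 1;
}
```

| <div class="paragraph"> |
| <p>In this model, a list holding only the profiling flag passes, while a list that sets the on-device flag without the out-of-order flag, or that gives a queue size for a host queue, is rejected.</p> |
| </div> |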
| <div class="paragraph"> |
| <p><a href="#clCreateCommandQueueWithProperties"><strong>clCreateCommandQueueWithProperties</strong></a> returns a valid non-zero command-queue |
| and <em>errcode_ret</em> is set to <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the command-queue is created |
| successfully. |
| Otherwise, it returns a <code>NULL</code> value with one of the following error values |
| returned in <em>errcode_ret</em>:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if <em>context</em> is not a valid context.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE"><code>CL_INVALID_<wbr>DEVICE</code></a> if <em>device</em> is not a valid device or is not associated |
| with <em>context</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if values specified in <em>properties</em> are not valid.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_QUEUE_PROPERTIES"><code>CL_INVALID_<wbr>QUEUE_<wbr>PROPERTIES</code></a> if values specified in <em>properties</em> are |
| valid but are not supported by the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To create a host command-queue on a specific device, call the function</p> |
| </div> |
| <div id="clCreateCommandQueue" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_command_queue clCreateCommandQueue( |
| cl_context context, |
| cl_device_id device, |
| cl_command_queue_properties properties, |
| cl_int* errcode_ret);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clCreateCommandQueue"><strong>clCreateCommandQueue</strong></a> is <a href="#unified-spec">deprecated by</a> version 2.0. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> must be a valid OpenCL context.</p> |
| </li> |
| <li> |
| <p><em>device</em> must be a device or sub-device associated with <em>context</em>. |
| It can either be in the list of devices and sub-devices specified when |
| <em>context</em> is created using <a href="#clCreateContext"><strong>clCreateContext</strong></a> or be a root device with the |
| same device type as specified when <em>context</em> is created using |
| <a href="#clCreateContextFromType"><strong>clCreateContextFromType</strong></a>.</p> |
| </li> |
| <li> |
| <p><em>properties</em> specifies a list of properties for the command-queue. |
| This is a bit-field and the supported properties are described in the |
| <a href="#legacy-queue-properties-table">table</a> below. |
| Only command-queue properties specified in this table can be used, |
| otherwise the value specified in <em>properties</em> is considered to be not |
| valid. |
| <em>properties</em> can be 0 in which case the default values for supported |
| command-queue properties will be used.</p> |
| </li> |
| </ul> |
| </div> |
| <table id="legacy-queue-properties-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 10. List of supported <code>cl_command_queue_properties</code> values by <a href="#clCreateCommandQueue">clCreateCommandQueue</a></caption> |
| <colgroup> |
| <col style="width: 50%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Command-Queue Properties</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE"><code>CL_QUEUE_<wbr>OUT_<wbr>OF_<wbr>ORDER_<wbr>EXEC_<wbr>MODE_<wbr>ENABLE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Determines whether the commands queued in the command-queue are executed |
| in-order or out-of-order. |
| If set, the commands in the command-queue are executed out-of-order. |
| Otherwise, commands are executed in-order.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_QUEUE_PROFILING_ENABLE"><code>CL_QUEUE_<wbr>PROFILING_<wbr>ENABLE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Enable or disable profiling of commands in the command-queue. |
| If set, the profiling of commands is enabled. |
| Otherwise, profiling of commands is disabled.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>errcode_ret</em> will return an appropriate error code. |
| If <em>errcode_ret</em> is <code>NULL</code>, no error code is returned.</p> |
| </li> |
| </ul> |
| </div> |
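| <div class="paragraph"> |
| <p>Because <a href="#clCreateCommandQueue"><strong>clCreateCommandQueue</strong></a> takes a bit-field while <a href="#clCreateCommandQueueWithProperties"><strong>clCreateCommandQueueWithProperties</strong></a> takes a zero-terminated list, code moving off the deprecated call has to repackage its flags. |
| The following C sketch shows one way this might look, assuming the bit-field carries the same meaning in both places. |
| The <code>TOY_</code> names, their values, and <code>toy_legacy_to_list</code> are hypothetical and exist only for this illustration.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the numeric values are placeholders. */
typedef uint64_t toy_queue_properties;
#define TOY_QUEUE_PROPERTIES   0x1001u
#define TOY_QUEUE_OUT_OF_ORDER (1u << 0)
#define TOY_QUEUE_PROFILING    (1u << 1)

/* Repackage a legacy bit-field as a name/value list terminated by 0.
 * out must have room for three elements. Returns the number of elements
 * written, or 0 if the bit-field contains a bit that is not in the legacy
 * table (such a value is treated as not valid). */
size_t toy_legacy_to_list(toy_queue_properties legacy_bits,
                          toy_queue_properties out[3])
{
    const toy_queue_properties allowed =
        TOY_QUEUE_OUT_OF_ORDER | TOY_QUEUE_PROFILING;

    assert(out != NULL);
    if (legacy_bits & ~allowed)
        return 0;

    out[0] = TOY_QUEUE_PROPERTIES; /* property name */
    out[1] = legacy_bits;          /* its value: the same bit-field */
    out[2] = 0;                    /* list terminator */
    return 3;
}
```

| <div class="paragraph"> |
| <p>A bit-field of 0 yields a list whose only property has the value 0, which in this model corresponds to the default in-order queue without profiling.</p> |
| </div> |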
| <div class="paragraph"> |
| <p><a href="#clCreateCommandQueue"><strong>clCreateCommandQueue</strong></a> returns a valid non-zero command-queue and <em>errcode_ret</em> |
| is set to <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the command-queue is created successfully. |
| Otherwise, it returns a <code>NULL</code> value with one of the following error values |
| returned in <em>errcode_ret</em>:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if <em>context</em> is not a valid context.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE"><code>CL_INVALID_<wbr>DEVICE</code></a> if <em>device</em> is not a valid device or is not associated |
| with <em>context</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if values specified in <em>properties</em> are not valid.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_QUEUE_PROPERTIES"><code>CL_INVALID_<wbr>QUEUE_<wbr>PROPERTIES</code></a> if values specified in <em>properties</em> are |
| valid but are not supported by the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To replace the default command queue on a device, call the function</p> |
| </div> |
| <div id="clSetDefaultDeviceCommandQueue" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clSetDefaultDeviceCommandQueue( |
| cl_context context, |
| cl_device_id device, |
| cl_command_queue command_queue);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clSetDefaultDeviceCommandQueue"><strong>clSetDefaultDeviceCommandQueue</strong></a> is <a href="#unified-spec">missing before</a> version 2.1. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> is the OpenCL context used to create <em>command_queue</em>.</p> |
| </li> |
| <li> |
| <p><em>device</em> is a valid OpenCL device associated with <em>context</em>.</p> |
| </li> |
| <li> |
| <p><em>command_queue</em> specifies a command queue object which replaces the |
| default device command queue.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clSetDefaultDeviceCommandQueue"><strong>clSetDefaultDeviceCommandQueue</strong></a> may be used to replace a default device |
| command queue created with <a href="#clCreateCommandQueueWithProperties"><strong>clCreateCommandQueueWithProperties</strong></a> and the |
| <a href="#CL_QUEUE_ON_DEVICE_DEFAULT"><code>CL_QUEUE_<wbr>ON_<wbr>DEVICE_<wbr>DEFAULT</code></a> flag.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clSetDefaultDeviceCommandQueue"><strong>clSetDefaultDeviceCommandQueue</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is |
| executed successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if <em>context</em> is not a valid context.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_DEVICE"><code>CL_INVALID_<wbr>DEVICE</code></a> if <em>device</em> is not a valid device or is not associated |
| with <em>context</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if <em>device</em> does not support a replaceable default on-device queue.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_COMMAND_QUEUE"><code>CL_INVALID_<wbr>COMMAND_<wbr>QUEUE</code></a> if <em>command_queue</em> is not a valid command-queue |
| for <em>device</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To retain a command queue, call the function</p> |
| </div> |
| <div id="clRetainCommandQueue" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clRetainCommandQueue( |
| cl_command_queue command_queue);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>command_queue</em> specifies the command-queue to be retained.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>The <em>command_queue</em> reference count is incremented.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clCreateCommandQueueWithProperties"><strong>clCreateCommandQueueWithProperties</strong></a> and <a href="#clCreateCommandQueue"><strong>clCreateCommandQueue</strong></a> perform an |
| implicit retain. |
| This is very helpful for 3<sup>rd</sup> party libraries, which typically get a |
| command-queue passed to them by the application. |
| However, it is possible that the application may delete the command-queue |
| without informing the library. |
| Allowing functions to attach to (i.e. retain) and release a command-queue |
| solves the problem of a command-queue being used by a library no longer |
| being valid.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clRetainCommandQueue"><strong>clRetainCommandQueue</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_COMMAND_QUEUE"><code>CL_INVALID_<wbr>COMMAND_<wbr>QUEUE</code></a> if <em>command_queue</em> is not a valid |
| command-queue.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To release a command queue, call the function</p> |
| </div> |
| <div id="clReleaseCommandQueue" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clReleaseCommandQueue( |
| cl_command_queue command_queue);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>command_queue</em> specifies the command-queue to be released.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>The <em>command_queue</em> reference count is decremented.</p> |
| </div> |
| <div class="paragraph"> |
| <p>After the <em>command_queue</em> reference count becomes zero and all commands |
| queued to <em>command_queue</em> have finished (e.g., kernel-instances, |
| memory object updates, etc.), the command-queue is deleted.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clReleaseCommandQueue"><strong>clReleaseCommandQueue</strong></a> performs an implicit flush to issue any previously |
| queued OpenCL commands in <em>command_queue</em>. |
| Using this function to release a reference that was not obtained by creating |
| the object or by calling <a href="#clRetainCommandQueue"><strong>clRetainCommandQueue</strong></a> causes undefined behavior.</p> |
| </div> |
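| <div class="paragraph"> |
| <p>The interplay between the reference count and outstanding commands can be made concrete with a small C model. |
| The type <code>toy_queue</code> and the <code>toy_queue_*</code> functions are hypothetical stand-ins written for this illustration; they track only two counters and say nothing about how an implementation manages a real command-queue.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a command-queue's lifetime bookkeeping. */
typedef struct {
    unsigned ref_count; /* references held by the application and libraries */
    unsigned pending;   /* commands queued but not yet finished */
    int deleted;        /* set to 1 once the queue has been deleted */
} toy_queue;

/* Deletion needs both conditions: no references and no unfinished commands. */
static void toy_maybe_delete(toy_queue *q)
{
    if (q->ref_count == 0 && q->pending == 0)
        q->deleted = 1;
}

/* Creation performs the implicit retain: the count starts at 1. */
void toy_queue_create(toy_queue *q)
{
    assert(q != NULL);
    q->ref_count = 1;
    q->pending = 0;
    q->deleted = 0;
}

void toy_queue_retain(toy_queue *q)
{
    assert(q != NULL && q->ref_count > 0);
    q->ref_count++;
}

void toy_queue_enqueue(toy_queue *q)
{
    assert(q != NULL && !q->deleted);
    q->pending++;
}

void toy_queue_command_finished(toy_queue *q)
{
    assert(q != NULL && q->pending > 0);
    q->pending--;
    toy_maybe_delete(q);
}

void toy_queue_release(toy_queue *q)
{
    assert(q != NULL && q->ref_count > 0);
    q->ref_count--;
    toy_maybe_delete(q);
}
```

| <div class="paragraph"> |
| <p>In this model, a library that retains the queue keeps it alive after the application releases its own reference, and even the final release does not delete the queue until the last pending command has finished.</p> |
| </div> |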
| <div class="paragraph"> |
| <p><a href="#clReleaseCommandQueue"><strong>clReleaseCommandQueue</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_COMMAND_QUEUE"><code>CL_INVALID_<wbr>COMMAND_<wbr>QUEUE</code></a> if <em>command_queue</em> is not a valid |
| command-queue.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To query information about a command-queue, call the function</p> |
| </div> |
| <div id="clGetCommandQueueInfo" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clGetCommandQueueInfo( |
| cl_command_queue command_queue, |
| cl_command_queue_info param_name, |
| size_t param_value_size, |
| <span class="directive">void</span>* param_value, |
| size_t* param_value_size_ret);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>command_queue</em> specifies the command-queue being queried.</p> |
| </li> |
| <li> |
| <p><em>param_name</em> specifies the information to query.</p> |
| </li> |
| <li> |
| <p><em>param_value</em> is a pointer to memory where the appropriate result being |
| queried is returned. |
| If <em>param_value</em> is <code>NULL</code>, it is ignored.</p> |
| </li> |
| <li> |
| <p><em>param_value_size</em> is used to specify the size in bytes of memory pointed to |
| by <em>param_value</em>. |
| This size must be ≥ size of return type as described in the |
| <a href="#command-queue-param-table">Command Queue Parameter</a> table. |
| If <em>param_value</em> is <code>NULL</code>, it is ignored.</p> |
| </li> |
| <li> |
| <p><em>param_value_size_ret</em> returns the actual size in bytes of data being |
| queried by <em>param_name</em>. |
| If <em>param_value_size_ret</em> is <code>NULL</code>, it is ignored.</p> |
| </li> |
| </ul> |
| </div> |
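| <div class="paragraph"> |
| <p>The parameter rules above allow a caller to ask for the required size first and fetch the data second. |
| The C sketch below models those rules for a variable-length value. |
| <code>toy_info</code>, <code>toy_get_info</code>, and the two status codes are hypothetical names with placeholder values, introduced only to show the argument handling; the sketch is not the OpenCL entry point.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes; the values are placeholders. */
#define TOY_SUCCESS       0
#define TOY_INVALID_VALUE (-1)

/* Hypothetical queryable value: some bytes and their length. */
typedef struct {
    const void *data;
    size_t size; /* size in bytes; may be 0 */
} toy_info;

/* Models the argument rules listed above:
 * - a NULL param_value is ignored (no data is copied, size is not checked);
 * - a non-NULL param_value needs param_value_size >= the data size;
 * - a non-NULL param_value_size_ret receives the actual data size. */
int toy_get_info(const toy_info *info, size_t param_value_size,
                 void *param_value, size_t *param_value_size_ret)
{
    assert(info != NULL);

    if (param_value != NULL) {
        if (param_value_size < info->size)
            return TOY_INVALID_VALUE;
        if (info->size > 0)
            memcpy(param_value, info->data, info->size);
    }
    if (param_value_size_ret != NULL)
        *param_value_size_ret = info->size;
    return TOY_SUCCESS;
}
```

| <div class="paragraph"> |
| <p>A caller of this model would first pass a <code>NULL</code> <em>param_value</em> to learn the size through <em>param_value_size_ret</em>, allocate that many bytes, and then call again with the buffer.</p> |
| </div> |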
| <div class="paragraph"> |
| <p>The list of supported <em>param_name</em> values and the information returned in |
| <em>param_value</em> by <a href="#clGetCommandQueueInfo"><strong>clGetCommandQueueInfo</strong></a> is described in the |
| <a href="#command-queue-param-table">Command Queue Parameter</a> table.</p> |
| </div> |
| <table id="command-queue-param-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 11. List of supported param_names by <a href="#clGetCommandQueueInfo">clGetCommandQueueInfo</a></caption> |
| <colgroup> |
| <col style="width: 33%;"> |
| <col style="width: 17%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Queue Info</th> |
| <th class="tableblock halign-left valign-top">Return Type</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_QUEUE_CONTEXT"></a><a href="#CL_QUEUE_CONTEXT"><code>CL_QUEUE_<wbr>CONTEXT</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_context</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Return the context specified when the command-queue is created.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_QUEUE_DEVICE"></a><a href="#CL_QUEUE_DEVICE"><code>CL_QUEUE_<wbr>DEVICE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_device_<wbr>id</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Return the device specified when the command-queue is created.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_QUEUE_REFERENCE_COUNT"></a><a href="#CL_QUEUE_REFERENCE_COUNT"><code>CL_QUEUE_<wbr>REFERENCE_<wbr>COUNT</code></a> <sup class="footnote">[<a id="_footnoteref_16" class="footnote" href="#_footnotedef_16" title="View footnote.">16</a>]</sup></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Return the command-queue reference count.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_QUEUE_PROPERTIES"></a><a href="#CL_QUEUE_PROPERTIES"><code>CL_QUEUE_<wbr>PROPERTIES</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_command_<wbr>queue_<wbr>properties</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Return the currently specified properties for the command-queue. |
| These properties are specified by the value associated with |
| <a href="#CL_QUEUE_PROPERTIES"><code>CL_QUEUE_<wbr>PROPERTIES</code></a> in the <em>properties</em> argument of |
| <a href="#clCreateCommandQueueWithProperties"><strong>clCreateCommandQueueWithProperties</strong></a>, or by the value of the <em>properties</em> |
| argument of <a href="#clCreateCommandQueue"><strong>clCreateCommandQueue</strong></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_QUEUE_PROPERTIES_ARRAY"></a><a href="#CL_QUEUE_PROPERTIES_ARRAY"><code>CL_QUEUE_<wbr>PROPERTIES_<wbr>ARRAY</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 3.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_queue_<wbr>properties</code>[]</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Return the properties argument specified in |
| <a href="#clCreateCommandQueueWithProperties"><strong>clCreateCommandQueueWithProperties</strong></a>.</p> |
| <p class="tableblock"> If the <em>properties</em> argument specified in |
| <a href="#clCreateCommandQueueWithProperties"><strong>clCreateCommandQueueWithProperties</strong></a> used to create <em>command_queue</em> |
| was not <code>NULL</code>, the implementation must return the values specified in |
| the properties argument in the same order and without including |
| additional properties.</p> |
| <p class="tableblock"> If <em>command_queue</em> was created using <a href="#clCreateCommandQueue"><strong>clCreateCommandQueue</strong></a>, or if the |
| <em>properties</em> argument specified in <a href="#clCreateCommandQueueWithProperties"><strong>clCreateCommandQueueWithProperties</strong></a> |
| was <code>NULL</code>, the implementation must return <em>param_value_size_ret</em> |
| equal to 0, indicating that there are no properties to be returned.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_QUEUE_SIZE"></a><a href="#CL_QUEUE_SIZE"><code>CL_QUEUE_<wbr>SIZE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Return the size of the device command-queue. |
| To be considered valid for this query, <em>command_queue</em> must be a |
| device command-queue.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_QUEUE_DEVICE_DEFAULT"></a><a href="#CL_QUEUE_DEVICE_DEFAULT"><code>CL_QUEUE_<wbr>DEVICE_<wbr>DEFAULT</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.1.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_command_<wbr>queue</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Return the current default command queue for the underlying device.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="paragraph"> |
| <p><a href="#clGetCommandQueueInfo"><strong>clGetCommandQueueInfo</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_COMMAND_QUEUE"><code>CL_INVALID_<wbr>COMMAND_<wbr>QUEUE</code></a> if <em>command_queue</em> is not a valid |
| command-queue, or if <em>command_queue</em> is not a valid command-queue |
| for <em>param_name</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>param_name</em> is not one of the supported values, or |
| if the size in bytes specified by <em>param_value_size</em> is less than the size of the return |
| type as specified in the <a href="#command-queue-param-table">Command Queue |
| Parameter</a> table and <em>param_value</em> is not <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
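| <div class="paragraph"> |
| <p>The size rule behind <a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> can be modelled without an OpenCL runtime. The following is a minimal sketch in C, not code from any implementation: <code>copy_query_result</code> and the <code>QUERY_*</code> status codes are hypothetical stand-ins, and the sketch assumes that the required size is reported through <em>param_value_size_ret</em> whenever that pointer is not <code>NULL</code>.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes standing in for CL_SUCCESS and CL_INVALID_VALUE. */
enum { QUERY_OK = 0, QUERY_INVALID_VALUE = 1 };

/*
 * Copy a query result of src_size bytes into the caller's buffer.
 * The call is rejected only when the caller supplied a buffer
 * (param_value != NULL) that is smaller than the result.
 */
static int copy_query_result(const void *src, size_t src_size,
                             size_t param_value_size, void *param_value,
                             size_t *param_value_size_ret)
{
    assert(src != NULL || src_size == 0);

    if (param_value != NULL) {
        if (param_value_size < src_size)
            return QUERY_INVALID_VALUE;
        if (src_size > 0)
            memcpy(param_value, src, src_size);
    }
    /* Assumption: the required size is reported when requested,
       including 0 for a queue created without a properties list. */
    if (param_value_size_ret != NULL)
        *param_value_size_ret = src_size;
    return QUERY_OK;
}
```

| <div class="paragraph"> |
| <p>Passing a <code>NULL</code> <em>param_value</em> with a non-<code>NULL</code> <em>param_value_size_ret</em> is the usual way to learn the required size before allocating storage for a variable-length result such as <a href="#CL_QUEUE_PROPERTIES_ARRAY"><code>CL_QUEUE_<wbr>PROPERTIES_<wbr>ARRAY</code></a>.</p> |
| </div> |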
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To enable or disable the properties of a command-queue, call the function</p> |
| </div> |
| <div id="clSetCommandQueueProperty" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clSetCommandQueueProperty( |
| cl_command_queue command_queue, |
| cl_command_queue_properties properties, |
| cl_bool enable, |
| cl_command_queue_properties* old_properties);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clSetCommandQueueProperty"><strong>clSetCommandQueueProperty</strong></a> is <a href="#unified-spec">deprecated by</a> version 1.1. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>command_queue</em> specifies the command-queue being modified.</p> |
| </li> |
| <li> |
| <p><em>properties</em> specifies the new list of properties for the command-queue. |
| This is a bit-field and the supported properties are described in the |
| <a href="#legacy-queue-properties-table">Command-Queue Properties table</a> for |
| <a href="#clCreateCommandQueue"><strong>clCreateCommandQueue</strong></a>. |
| Only command-queue properties specified in this table can be used, |
| otherwise the value specified in <em>properties</em> is considered to be not |
| valid.</p> |
| </li> |
| <li> |
| <p><em>enable</em> determines whether the values specified by <em>properties</em> are |
| enabled (if <em>enable</em> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>) or disabled (if <em>enable</em> is <a href="#CL_FALSE"><code>CL_FALSE</code></a>) |
| for the command-queue.</p> |
| </li> |
| <li> |
| <p><em>old_properties</em> returns the command-queue properties before they were |
| changed by <a href="#clSetCommandQueueProperty"><strong>clSetCommandQueueProperty</strong></a>. If <em>old_properties</em> is <code>NULL</code>, it |
| is ignored.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| <div class="paragraph"> |
| <p>Changing the <a href="#CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE"><code>CL_QUEUE_<wbr>OUT_<wbr>OF_<wbr>ORDER_<wbr>EXEC_<wbr>MODE_<wbr>ENABLE</code></a> command-queue property |
| will cause the OpenCL implementation to block until all previously queued |
| commands in <em>command_queue</em> have completed. This can be an expensive operation |
| and therefore changes to this property should only be done when absolutely |
| necessary.</p> |
| </div> |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clSetCommandQueueProperty"><strong>clSetCommandQueueProperty</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_COMMAND_QUEUE"><code>CL_INVALID_<wbr>COMMAND_<wbr>QUEUE</code></a> if <em>command_queue</em> is not a valid command-queue.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if values specified in <em>properties</em> are not valid.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_QUEUE_PROPERTIES"><code>CL_INVALID_<wbr>QUEUE_<wbr>PROPERTIES</code></a> if values specified in <em>properties</em> are |
| valid but are not supported by the device.</p> |
| </li> |
| </ul> |
| </div> |
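| <div class="paragraph"> |
| <p>The interaction of <em>properties</em>, <em>enable</em>, and <em>old_properties</em> amounts to a bit-field update. The sketch below illustrates that reading under stated assumptions: <code>queue_props_t</code>, <code>apply_queue_properties</code>, and the two <code>QUEUE_*</code> bits are hypothetical names introduced for illustration, and the blocking behaviour and validity checks described above are not modelled.</p> |
| </div> |

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for cl_command_queue_properties, a 64-bit bit-field. */
typedef uint64_t queue_props_t;

/* Illustrative bits only; real code would use the CL_QUEUE_* constants. */
#define QUEUE_OUT_OF_ORDER ((queue_props_t)1 << 0)
#define QUEUE_PROFILING    ((queue_props_t)1 << 1)

/*
 * Return the property bits after enabling or disabling `properties`.
 * The previous bits are stored in *old_properties unless it is NULL.
 */
static queue_props_t apply_queue_properties(queue_props_t current,
                                            queue_props_t properties,
                                            int enable,
                                            queue_props_t *old_properties)
{
    if (old_properties != NULL)
        *old_properties = current;
    return enable ? (current | properties) : (current & ~properties);
}
```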
| </div> |
| </div> |
| </div> |
| <div class="sect2"> |
| <h3 id="_buffer_objects"><a class="anchor" href="#_buffer_objects"></a>5.2. Buffer Objects</h3> |
| <div class="paragraph"> |
| <p>A <em>buffer</em> object stores a one-dimensional collection of elements. |
| Elements of a <em>buffer</em> object can be a scalar data type (such as an int or |
| float), a vector data type, or a user-defined structure.</p> |
| </div> |
| <div class="sect3"> |
| <h4 id="_creating_buffer_objects"><a class="anchor" href="#_creating_buffer_objects"></a>5.2.1. Creating Buffer Objects</h4> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>A <strong>buffer object</strong> may be created using the function</p> |
| </div> |
| <div id="clCreateBuffer" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_mem clCreateBuffer( |
| cl_context context, |
| cl_mem_flags flags, |
| size_t size, |
| <span class="directive">void</span>* host_ptr, |
| cl_int* errcode_ret);</code></pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>A <strong>buffer object</strong> may also be created with additional properties using the function</p> |
| </div> |
| <div id="clCreateBufferWithProperties" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_mem clCreateBufferWithProperties( |
| cl_context context, |
| <span class="directive">const</span> cl_mem_properties* properties, |
| cl_mem_flags flags, |
| size_t size, |
| <span class="directive">void</span>* host_ptr, |
| cl_int* errcode_ret);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clCreateBufferWithProperties"><strong>clCreateBufferWithProperties</strong></a> is <a href="#unified-spec">missing before</a> version 3.0. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> is a valid OpenCL context used to create the buffer object.</p> |
| </li> |
| <li> |
| <p><em>properties</em> is an optional list of properties for the buffer object and their corresponding values. |
| The list is terminated with the special property <code>0</code>. |
| If no properties are required, <em>properties</em> may be <code>NULL</code>. |
| OpenCL 3.0 does not define any optional properties for buffers.</p> |
| </li> |
| <li> |
| <p><em>flags</em> is a bit-field that is used to specify allocation and usage |
| information about the buffer memory object being created and is described in |
| the <a href="#memory-flags-table">supported memory flag values</a> table.</p> |
| </li> |
| <li> |
| <p><em>size</em> is the size in bytes of the buffer memory object to be allocated.</p> |
| </li> |
| <li> |
| <p><em>host_ptr</em> is a pointer to the buffer data that may already be allocated |
| by the application. |
| The size of the buffer that <em>host_ptr</em> points to must be greater than or equal to <em>size</em> |
| bytes.</p> |
| </li> |
| <li> |
| <p><em>errcode_ret</em> may return an appropriate error code. |
| If <em>errcode_ret</em> is <code>NULL</code>, no error code is returned.</p> |
| </li> |
| </ul> |
| </div> |
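| <div class="paragraph"> |
| <p>The layout of <em>properties</em> is a sequence of name and value pairs ending in a single <code>0</code>. The following sketch walks such a list and detects repeated names. It is an illustration only: <code>mem_property_t</code> and <code>count_mem_properties</code> are hypothetical names, and because OpenCL 3.0 defines no buffer properties, any property name used with it is likewise made up.</p> |
| </div> |

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for cl_mem_properties, a 64-bit integer type. */
typedef uint64_t mem_property_t;

/*
 * Walk a list laid out as name, value, name, value, ..., 0.
 * Returns the number of name/value pairs, or -1 if a name repeats.
 * A NULL list is treated as an empty list.
 */
static int count_mem_properties(const mem_property_t *properties)
{
    int count = 0;

    if (properties == NULL)
        return 0;
    for (const mem_property_t *p = properties; p[0] != 0; p += 2) {
        for (const mem_property_t *q = properties; q != p; q += 2) {
            if (q[0] == p[0])
                return -1;  /* repeated name: CL_INVALID_PROPERTY in the API */
        }
        ++count;
    }
    return count;
}
```

| <div class="paragraph"> |
| <p>Stepping two elements at a time matters: a property value may legitimately be <code>0</code>, so only a <code>0</code> in a name position ends the list.</p> |
| </div> |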
| <div class="paragraph"> |
| <p>The alignment requirements for data stored in buffer objects are described |
| in <a href="#alignment-app-data-types">Alignment of Application Data Types</a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If <a href="#clCreateBuffer"><strong>clCreateBuffer</strong></a> or <a href="#clCreateBufferWithProperties"><strong>clCreateBufferWithProperties</strong></a> is called with |
| <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> set in its <em>flags</em> argument, the contents of the |
| memory pointed to by <em>host_ptr</em> at the time of the <a href="#clCreateBuffer"><strong>clCreateBuffer</strong></a> call |
| define the initial contents of the buffer object.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If <a href="#clCreateBuffer"><strong>clCreateBuffer</strong></a> or <a href="#clCreateBufferWithProperties"><strong>clCreateBufferWithProperties</strong></a> is called with a |
| pointer returned by <a href="#clSVMAlloc"><strong>clSVMAlloc</strong></a> as its <em>host_ptr</em> argument, and |
| <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> is set in its <em>flags</em> argument, <a href="#clCreateBuffer"><strong>clCreateBuffer</strong></a> or |
| <a href="#clCreateBufferWithProperties"><strong>clCreateBufferWithProperties</strong></a> will succeed and return a valid non-zero |
| buffer object as long as the <em>size</em> argument is no larger than the |
| <em>size</em> argument passed in the original <a href="#clSVMAlloc"><strong>clSVMAlloc</strong></a> call. |
| The new buffer object returned has the shared memory as the underlying |
| storage. |
| Locations in the buffer's underlying shared memory can be operated on using |
| atomic operations to the device's level of support as defined in the memory |
| model.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clCreateBuffer"><strong>clCreateBuffer</strong></a> and <a href="#clCreateBufferWithProperties"><strong>clCreateBufferWithProperties</strong></a> return a valid non-zero |
| buffer object and set <em>errcode_ret</em> to <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the buffer object |
| is created successfully. |
| Otherwise, they return a <code>NULL</code> value with one of the following error values |
| returned in <em>errcode_ret</em>:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if <em>context</em> is not a valid context.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_PROPERTY"><code>CL_INVALID_<wbr>PROPERTY</code></a> if a property name in <em>properties</em> is not a |
| supported property name, if the value specified for a supported property |
| name is not valid, or if the same property name is specified more than |
| once.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if values specified in <em>flags</em> are not valid as defined |
| in the <a href="#memory-flags-table">Memory Flags</a> table.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_BUFFER_SIZE"><code>CL_INVALID_<wbr>BUFFER_<wbr>SIZE</code></a> if <em>size</em> is 0 or if <em>size</em> is greater than |
| <a href="#CL_DEVICE_MAX_MEM_ALLOC_SIZE"><code>CL_DEVICE_<wbr>MAX_<wbr>MEM_<wbr>ALLOC_<wbr>SIZE</code></a> for all devices in <em>context</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_HOST_PTR"><code>CL_INVALID_<wbr>HOST_<wbr>PTR</code></a> if <em>host_ptr</em> is <code>NULL</code> and <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> or |
| <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> is set in <em>flags</em>, or if <em>host_ptr</em> is not <code>NULL</code> |
| but neither <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> nor <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> is set in <em>flags</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_OBJECT_ALLOCATION_FAILURE"><code>CL_MEM_<wbr>OBJECT_<wbr>ALLOCATION_<wbr>FAILURE</code></a> if there is a failure to allocate |
| memory for the buffer object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
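| <div class="paragraph"> |
| <p>The <em>size</em> and <em>host_ptr</em> conditions in this list can be expressed as a small argument check. The sketch below is a model under stated assumptions, not an implementation: <code>check_buffer_args</code>, <code>args_status</code>, and the <code>MEM_*</code> bits are hypothetical stand-ins, the order in which the checks run is an arbitrary choice, and <code>largest_max_alloc</code> is assumed to hold the largest <a href="#CL_DEVICE_MAX_MEM_ALLOC_SIZE"><code>CL_DEVICE_<wbr>MAX_<wbr>MEM_<wbr>ALLOC_<wbr>SIZE</code></a> among the devices in the context.</p> |
| </div> |

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for CL_MEM_USE_HOST_PTR and CL_MEM_COPY_HOST_PTR. */
#define MEM_USE_HOST_PTR  ((uint64_t)1 << 3)
#define MEM_COPY_HOST_PTR ((uint64_t)1 << 5)

typedef enum {
    ARGS_OK,
    ARGS_INVALID_BUFFER_SIZE,
    ARGS_INVALID_HOST_PTR
} args_status;

/*
 * A size above largest_max_alloc exceeds the limit of every device
 * in the context, which is the condition the error list describes.
 */
static args_status check_buffer_args(uint64_t flags, size_t size,
                                     const void *host_ptr,
                                     uint64_t largest_max_alloc)
{
    const int wants_host_ptr =
        (flags & (MEM_USE_HOST_PTR | MEM_COPY_HOST_PTR)) != 0;

    if (size == 0 || (uint64_t)size > largest_max_alloc)
        return ARGS_INVALID_BUFFER_SIZE;
    /* host_ptr is required with USE/COPY and forbidden without them. */
    if (host_ptr == NULL && wants_host_ptr)
        return ARGS_INVALID_HOST_PTR;
    if (host_ptr != NULL && !wants_host_ptr)
        return ARGS_INVALID_HOST_PTR;
    return ARGS_OK;
}
```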
| <table id="memory-flags-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 12. List of supported memory flag values</caption> |
| <colgroup> |
| <col style="width: 50%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Memory Flags</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_READ_WRITE"></a><a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag specifies that the memory object will be read |
| and written by a kernel. |
| This is the default.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_WRITE_ONLY"></a><a href="#CL_MEM_WRITE_ONLY"><code>CL_MEM_<wbr>WRITE_<wbr>ONLY</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag specifies that the memory object will be |
| written but not read by a kernel.</p> |
| <p class="tableblock"> Reading from a buffer or image object created with <a href="#CL_MEM_WRITE_ONLY"><code>CL_MEM_<wbr>WRITE_<wbr>ONLY</code></a> |
| inside a kernel is undefined.</p> |
| <p class="tableblock"> <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a> and <a href="#CL_MEM_WRITE_ONLY"><code>CL_MEM_<wbr>WRITE_<wbr>ONLY</code></a> are mutually exclusive.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_READ_ONLY"></a><a href="#CL_MEM_READ_ONLY"><code>CL_MEM_<wbr>READ_<wbr>ONLY</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag specifies that the memory object is a |
| read-only memory object when used inside a kernel.</p> |
| <p class="tableblock"> Writing to a buffer or image object created with <a href="#CL_MEM_READ_ONLY"><code>CL_MEM_<wbr>READ_<wbr>ONLY</code></a> inside |
| a kernel is undefined.</p> |
| <p class="tableblock"> <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a> or <a href="#CL_MEM_WRITE_ONLY"><code>CL_MEM_<wbr>WRITE_<wbr>ONLY</code></a> and <a href="#CL_MEM_READ_ONLY"><code>CL_MEM_<wbr>READ_<wbr>ONLY</code></a> are mutually |
| exclusive.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_USE_HOST_PTR"></a><a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag is valid only if host_ptr is not <code>NULL</code>. |
| If specified, it indicates that the application wants the OpenCL |
| implementation to use memory referenced by host_ptr as the storage bits |
| for the memory object.</p> |
| <p class="tableblock"> The contents of the memory pointed to by host_ptr at the time of the |
| <a href="#clCreateBuffer"><strong>clCreateBuffer</strong></a>, <a href="#clCreateBufferWithProperties"><strong>clCreateBufferWithProperties</strong></a>, <a href="#clCreateImage"><strong>clCreateImage</strong></a>, |
| <a href="#clCreateImageWithProperties"><strong>clCreateImageWithProperties</strong></a>, <a href="#clCreateImage2D"><strong>clCreateImage2D</strong></a>, or <a href="#clCreateImage3D"><strong>clCreateImage3D</strong></a> |
| call define the initial contents of the memory object.</p> |
| <p class="tableblock"> OpenCL implementations are allowed to cache the contents pointed |
| to by host_ptr in device memory. |
| This cached copy can be used when kernels are executed on a device.</p> |
| <p class="tableblock"> The result of OpenCL commands that operate on multiple buffer objects |
| created with the same host_ptr or from overlapping host or SVM regions |
| is considered to be undefined.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_ALLOC_HOST_PTR"></a><a href="#CL_MEM_ALLOC_HOST_PTR"><code>CL_MEM_<wbr>ALLOC_<wbr>HOST_<wbr>PTR</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag specifies that the application wants the OpenCL implementation |
| to allocate memory from host accessible memory.</p> |
| <p class="tableblock"> <a href="#CL_MEM_ALLOC_HOST_PTR"><code>CL_MEM_<wbr>ALLOC_<wbr>HOST_<wbr>PTR</code></a> and <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> are mutually exclusive.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_COPY_HOST_PTR"></a><a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag is valid only if host_ptr is not <code>NULL</code>. |
| If specified, it indicates that the application wants the OpenCL |
| implementation to allocate memory for the memory object and copy the |
| data from memory referenced by host_ptr. |
| The implementation will copy the memory immediately and host_ptr is |
| available for reuse by the application when the <a href="#clCreateBuffer"><strong>clCreateBuffer</strong></a>, |
| <a href="#clCreateBufferWithProperties"><strong>clCreateBufferWithProperties</strong></a>, <a href="#clCreateImage"><strong>clCreateImage</strong></a>, <a href="#clCreateImageWithProperties"><strong>clCreateImageWithProperties</strong></a>, |
| <a href="#clCreateImage2D"><strong>clCreateImage2D</strong></a>, or <a href="#clCreateImage3D"><strong>clCreateImage3D</strong></a> operation returns.</p> |
| <p class="tableblock"> <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> and <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> are mutually exclusive.</p> |
| <p class="tableblock"> <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> can be used with <a href="#CL_MEM_ALLOC_HOST_PTR"><code>CL_MEM_<wbr>ALLOC_<wbr>HOST_<wbr>PTR</code></a> to |
| initialize the contents of the <code>cl_mem</code> object allocated using |
| host-accessible (e.g. PCIe) memory.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_HOST_WRITE_ONLY"></a><a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag specifies that the host will only write to the memory object |
| (using OpenCL APIs that enqueue a write or a map for write). |
| This can be used to optimize write access from the host (e.g. enable |
| write-combined allocations for memory objects for devices that |
| communicate with the host over a system bus such as PCIe).</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_HOST_READ_ONLY"></a><a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag specifies that the host will only read |
| the memory object (using OpenCL APIs that enqueue a read or a map for |
| read).</p> |
| <p class="tableblock"> <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a> and <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a> are mutually exclusive.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_HOST_NO_ACCESS"></a><a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag specifies that the host will not read or |
| write the memory object.</p> |
| <p class="tableblock"> <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a> or <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a> and |
| <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a> are mutually exclusive.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_KERNEL_READ_AND_WRITE"></a><a href="#CL_MEM_KERNEL_READ_AND_WRITE"><code>CL_MEM_<wbr>KERNEL_<wbr>READ_<wbr>AND_<wbr>WRITE</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag is only used by <a href="#clGetSupportedImageFormats"><strong>clGetSupportedImageFormats</strong></a> to query image |
| formats that may be both read from and written to by the same kernel |
| instance. |
| To create a memory object that may be read from and written to, use |
| <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a>.</p></td> |
| </tr> |
| </tbody> |
| </table> |
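| <div class="paragraph"> |
| <p>The mutual-exclusion rules in the table above reduce to "at most one flag from each group". The following sketch checks only those rules. The <code>MEM_*</code> macros and both helper functions are hypothetical stand-ins whose bit positions are illustrative, and the sketch does not check whether <em>host_ptr</em> is consistent with the flags.</p> |
| </div> |

```c
#include <stdint.h>

/* Illustrative stand-ins for the CL_MEM_* flags described in the table. */
#define MEM_READ_WRITE      ((uint64_t)1 << 0)
#define MEM_WRITE_ONLY      ((uint64_t)1 << 1)
#define MEM_READ_ONLY       ((uint64_t)1 << 2)
#define MEM_USE_HOST_PTR    ((uint64_t)1 << 3)
#define MEM_ALLOC_HOST_PTR  ((uint64_t)1 << 4)
#define MEM_COPY_HOST_PTR   ((uint64_t)1 << 5)
#define MEM_HOST_WRITE_ONLY ((uint64_t)1 << 7)
#define MEM_HOST_READ_ONLY  ((uint64_t)1 << 8)
#define MEM_HOST_NO_ACCESS  ((uint64_t)1 << 9)

/* Non-zero when at most one bit of `mask` is set in `flags`. */
static int at_most_one(uint64_t flags, uint64_t mask)
{
    const uint64_t bits = flags & mask;
    return (bits & (bits - 1)) == 0;
}

/* Non-zero when `flags` violates none of the mutual-exclusion rules. */
static int mem_flags_are_consistent(uint64_t flags)
{
    return at_most_one(flags, MEM_READ_WRITE | MEM_WRITE_ONLY | MEM_READ_ONLY)
        && at_most_one(flags, MEM_HOST_WRITE_ONLY | MEM_HOST_READ_ONLY |
                              MEM_HOST_NO_ACCESS)
        && at_most_one(flags, MEM_USE_HOST_PTR | MEM_ALLOC_HOST_PTR)
        && at_most_one(flags, MEM_USE_HOST_PTR | MEM_COPY_HOST_PTR);
}
```

| <div class="paragraph"> |
| <p>A value of <code>0</code> passes the check, which matches <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a> being the default, and the combination of the copy and alloc flags passes because the table permits it.</p> |
| </div> |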
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To create a new buffer object (referred to as a sub-buffer object) from an |
| existing buffer object, call the function</p> |
| </div> |
| <div id="clCreateSubBuffer" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_mem clCreateSubBuffer( |
| cl_mem buffer, |
| cl_mem_flags flags, |
| cl_buffer_create_type buffer_create_type, |
| <span class="directive">const</span> <span class="directive">void</span>* buffer_create_info, |
| cl_int* errcode_ret);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clCreateSubBuffer"><strong>clCreateSubBuffer</strong></a> is <a href="#unified-spec">missing before</a> version 1.1. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>buffer</em> must be a valid buffer object and cannot be a sub-buffer object.</p> |
| </li> |
| <li> |
| <p><em>flags</em> is a bit-field that is used to specify allocation and usage |
| information about the sub-buffer memory object being created and is |
| described in the <a href="#memory-flags-table">Memory Flags</a> table. |
| If the <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a>, <a href="#CL_MEM_READ_ONLY"><code>CL_MEM_<wbr>READ_<wbr>ONLY</code></a>, or <a href="#CL_MEM_WRITE_ONLY"><code>CL_MEM_<wbr>WRITE_<wbr>ONLY</code></a> values are |
| not specified in <em>flags</em>, they are inherited from the corresponding memory |
| access qualifiers associated with <em>buffer</em>. |
| The <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a>, <a href="#CL_MEM_ALLOC_HOST_PTR"><code>CL_MEM_<wbr>ALLOC_<wbr>HOST_<wbr>PTR</code></a>, and <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> |
| values cannot be specified in <em>flags</em> but are inherited from the |
| corresponding memory access qualifiers associated with <em>buffer</em>. |
| If <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> is specified in the memory access qualifier values |
| associated with <em>buffer</em> it does not imply any additional copies when the |
| sub-buffer is created from <em>buffer</em>. |
| If the <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a>, <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a>, or |
| <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a> values are not specified in <em>flags</em>, they are |
| inherited from the corresponding memory access qualifiers associated with |
| <em>buffer</em>.</p> |
| </li> |
| <li> |
| <p><em>buffer_create_type</em> and <em>buffer_create_info</em> describe the type of buffer |
| object to be created. |
| The list of supported values for <em>buffer_create_type</em> and corresponding |
| descriptor that <em>buffer_create_info</em> points to is described in the |
| <a href="#subbuffer-create-info-table">SubBuffer Attributes</a> table.</p> |
| </li> |
| </ul> |
| </div> |
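| <div class="paragraph"> |
| <p>The inheritance rules for <em>flags</em> can be summarised as three independent groups of bits. The sketch below is one possible model, not an implementation: <code>resolve_sub_buffer_flags</code> and the <code>MEM_*</code> masks are hypothetical names with illustrative bit positions, and the sketch does not check whether explicitly requested access flags are compatible with those of <em>buffer</em>.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for the CL_MEM_* flags, grouped by purpose. */
#define MEM_READ_WRITE      ((uint64_t)1 << 0)
#define MEM_WRITE_ONLY      ((uint64_t)1 << 1)
#define MEM_READ_ONLY       ((uint64_t)1 << 2)
#define MEM_USE_HOST_PTR    ((uint64_t)1 << 3)
#define MEM_ALLOC_HOST_PTR  ((uint64_t)1 << 4)
#define MEM_COPY_HOST_PTR   ((uint64_t)1 << 5)
#define MEM_HOST_WRITE_ONLY ((uint64_t)1 << 7)
#define MEM_HOST_READ_ONLY  ((uint64_t)1 << 8)
#define MEM_HOST_NO_ACCESS  ((uint64_t)1 << 9)

#define ACCESS_MASK      (MEM_READ_WRITE | MEM_WRITE_ONLY | MEM_READ_ONLY)
#define HOST_PTR_MASK    (MEM_USE_HOST_PTR | MEM_ALLOC_HOST_PTR | MEM_COPY_HOST_PTR)
#define HOST_ACCESS_MASK (MEM_HOST_WRITE_ONLY | MEM_HOST_READ_ONLY | MEM_HOST_NO_ACCESS)

/*
 * Compute the effective flags of a sub-buffer.
 * Returns 0 on success, or -1 if `requested` names a host-pointer flag,
 * which may only be inherited and never specified.
 */
static int resolve_sub_buffer_flags(uint64_t parent, uint64_t requested,
                                    uint64_t *effective)
{
    assert(effective != NULL);

    if (requested & HOST_PTR_MASK)
        return -1;

    uint64_t result = requested | (parent & HOST_PTR_MASK);
    if ((requested & ACCESS_MASK) == 0)
        result |= parent & ACCESS_MASK;       /* kernel access inherited */
    if ((requested & HOST_ACCESS_MASK) == 0)
        result |= parent & HOST_ACCESS_MASK;  /* host access inherited */

    *effective = result;
    return 0;
}
```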
| <table id="subbuffer-create-info-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 13. List of supported buffer creation types by <a href="#clCreateSubBuffer">clCreateSubBuffer</a></caption> |
| <colgroup> |
| <col style="width: 50%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Buffer Creation Type</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_BUFFER_CREATE_TYPE_REGION"></a><a href="#CL_BUFFER_CREATE_TYPE_REGION"><code>CL_BUFFER_<wbr>CREATE_<wbr>TYPE_<wbr>REGION</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.1.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Create a buffer object that represents a specific region in <em>buffer</em>.</p> |
| <p class="tableblock"> <em>buffer_create_info</em> is a pointer to a <a href="#cl_buffer_region"><code>cl_buffer_<wbr>region</code></a> structure |
| specifying a region of the buffer.</p> |
| <p class="tableblock"> If <em>buffer</em> is created with <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a>, the <em>host_ptr</em> |
| associated with the buffer object returned is <em>host_ptr + origin</em>.</p> |
| <p class="tableblock"> The buffer object returned references the data store allocated for |
| buffer and points to the region specified by <em>buffer_create_info</em> in |
| this data store.</p></td> |
| </tr> |
| </tbody> |
| </table> |
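| <div class="paragraph"> |
| <p>The relationship between a region and the host pointer of the resulting sub-buffer is plain pointer arithmetic. The following sketch illustrates it under stated assumptions: <code>buffer_region</code> is a local stand-in that mirrors the <code>origin</code> and <code>size</code> fields of <a href="#cl_buffer_region"><code>cl_buffer_<wbr>region</code></a>, <code>sub_buffer_host_ptr</code> is a hypothetical helper, and rejecting empty or out-of-bounds regions is assumed here rather than taken from this table. Device alignment requirements are not modelled.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in mirroring the two fields of cl_buffer_region. */
typedef struct {
    size_t origin;  /* byte offset into the parent buffer */
    size_t size;    /* length of the region in bytes */
} buffer_region;

/*
 * Return the host pointer associated with a sub-buffer of a parent that
 * was created with a host pointer: parent_host_ptr + origin.
 * Returns NULL if there is no parent host pointer, or if the region is
 * empty or does not fit inside the parent (assumed checks).
 */
static void *sub_buffer_host_ptr(void *parent_host_ptr, size_t parent_size,
                                 const buffer_region *region)
{
    assert(region != NULL);

    if (parent_host_ptr == NULL || region->size == 0)
        return NULL;
    /* Written to avoid overflow in origin + size. */
    if (region->origin > parent_size ||
        region->size > parent_size - region->origin)
        return NULL;
    return (unsigned char *)parent_host_ptr + region->origin;
}
```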
| <div class="paragraph"> |
| <p><a href="#clCreateSubBuffer"><strong>clCreateSubBuffer</strong></a> returns a valid non-zero buffer object and sets <em>errcode_ret</em> to <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the sub-buffer object is created |
| successfully. |
| Otherwise, it returns a <code>NULL</code> value with one of the following error values returned in <em>errcode_ret</em>:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_MEM_OBJECT"><code>CL_INVALID_<wbr>MEM_<wbr>OBJECT</code></a> if <em>buffer</em> is not a valid buffer object or is a |
| sub-buffer object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>buffer</em> was created with <a href="#CL_MEM_WRITE_ONLY"><code>CL_MEM_<wbr>WRITE_<wbr>ONLY</code></a> and |
| <em>flags</em> specifies <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a> or <a href="#CL_MEM_READ_ONLY"><code>CL_MEM_<wbr>READ_<wbr>ONLY</code></a>, or if <em>buffer</em> |
| was created with <a href="#CL_MEM_READ_ONLY"><code>CL_MEM_<wbr>READ_<wbr>ONLY</code></a> and <em>flags</em> specifies |
| <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a> or <a href="#CL_MEM_WRITE_ONLY"><code>CL_MEM_<wbr>WRITE_<wbr>ONLY</code></a>, or if <em>flags</em> specifies |
| <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> or <a href="#CL_MEM_ALLOC_HOST_PTR"><code>CL_MEM_<wbr>ALLOC_<wbr>HOST_<wbr>PTR</code></a> or <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>buffer</em> was created with <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a> and |
| <em>flags</em> specifies <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a>, or if <em>buffer</em> was created with |
| <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a> and <em>flags</em> specifies <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a>, or if |
| <em>buffer</em> was created with <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a> and <em>flags</em> specifies |
| <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a> or <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if the value specified in <em>buffer_create_type</em> is not |
| valid.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if the value(s) specified in <em>buffer_create_info</em> (for a |
| given <em>buffer_create_type</em>) are not valid or if <em>buffer_create_info</em> is |
| <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_OBJECT_ALLOCATION_FAILURE"><code>CL_MEM_<wbr>OBJECT_<wbr>ALLOCATION_<wbr>FAILURE</code></a> if there is a failure to allocate |
| memory for the sub-buffer object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if the region specified by the <a href="#cl_buffer_region"><code>cl_buffer_<wbr>region</code></a> |
| structure passed in <em>buffer_create_info</em> is out of bounds in <em>buffer</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_BUFFER_SIZE"><code>CL_INVALID_<wbr>BUFFER_<wbr>SIZE</code></a> if the <em>size</em> field of the <a href="#cl_buffer_region"><code>cl_buffer_<wbr>region</code></a> |
| structure passed in <em>buffer_create_info</em> is 0.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MISALIGNED_SUB_BUFFER_OFFSET"><code>CL_MISALIGNED_<wbr>SUB_<wbr>BUFFER_<wbr>OFFSET</code></a> if there are no devices in <em>context</em> |
| associated with <em>buffer</em> for which the <em>origin</em> field of the |
| <a href="#cl_buffer_region"><code>cl_buffer_<wbr>region</code></a> structure passed in <em>buffer_create_info</em> is |
| aligned to the <a href="#CL_DEVICE_MEM_BASE_ADDR_ALIGN"><code>CL_DEVICE_<wbr>MEM_<wbr>BASE_<wbr>ADDR_<wbr>ALIGN</code></a> value.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| <div class="paragraph"> |
| <p>Concurrent reading from, writing to and copying between both a buffer object |
| and its sub-buffer object(s) is undefined. |
| Concurrent reading from, writing to and copying between overlapping |
| sub-buffer objects created with the same buffer object is undefined. |
| Only reading from both a buffer object and its sub-buffer objects or reading |
| from multiple overlapping sub-buffer objects is defined.</p> |
| </div> |
| </td> |
| </tr> |
| </table> |
| </div> |
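| <div class="paragraph"> |
| <p>The overlap condition in the note above can be tested on the host before commands are enqueued. |
| The following is a minimal sketch, not part of the OpenCL API: <code>region_t</code> is a hypothetical stand-in for |
| <code>cl_buffer_region</code> (so the code compiles without the OpenCL headers) and <code>regions_overlap</code> is an |
| illustrative helper name. It assumes neither region wraps around the end of the address range.</p> |
| </div> |

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for cl_buffer_region; both fields are in bytes. */
typedef struct region_t {
    size_t origin;
    size_t size;
} region_t;

/* Returns true when the half-open byte ranges [origin, origin + size)
 * of a and b share at least one byte.
 * Assumption: origin + size does not overflow size_t for either region. */
static bool regions_overlap(const region_t *a, const region_t *b)
{
    assert(a != NULL && b != NULL);
    if (a->size == 0 || b->size == 0)
        return false; /* an empty range overlaps nothing */
    return a->origin < b->origin + b->size &&
           b->origin < a->origin + a->size;
}
```

| <div class="paragraph"> |
| <p>Two sub-buffers for which this helper returns true should only be read concurrently, |
| never written or copied to concurrently.</p> |
| </div> |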
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>The <a href="#cl_buffer_region"><code>cl_buffer_<wbr>region</code></a> structure specifies a region of a buffer object:</p> |
| </div> |
| <div id="cl_buffer_region" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++"><span class="keyword">typedef</span> <span class="keyword">struct</span> cl_buffer_region { |
| size_t origin; |
| size_t size; |
| } cl_buffer_region;</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>origin</em> is the offset in bytes of the region.</p> |
| </li> |
| <li> |
| <p><em>size</em> is the size in bytes of the region.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Constraints on the values of <em>origin</em> and <em>size</em> are specified for the |
| <a href="#clCreateSubBuffer"><strong>clCreateSubBuffer</strong></a> function to which this structure is passed.</p> |
| </div> |
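| <div class="paragraph"> |
| <p>As an illustration of those constraints, the sketch below pre-checks a region against the size of the parent buffer |
| and one device's alignment. All names are hypothetical: <code>region_t</code> mirrors <code>cl_buffer_region</code>, |
| and the <code>region_check</code> values only indicate which OpenCL error each case would correspond to. |
| It assumes the caller has already converted the <code>CL_DEVICE_MEM_BASE_ADDR_ALIGN</code> value, which is reported in bits, to bytes, |
| and it checks a single device, whereas <strong>clCreateSubBuffer</strong> reports misalignment only when no device in the context is satisfied.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cl_buffer_region; both fields are in bytes. */
typedef struct region_t {
    size_t origin;
    size_t size;
} region_t;

/* Illustrative result codes; comments name the corresponding OpenCL error. */
typedef enum region_check {
    REGION_OK,
    REGION_ZERO_SIZE,     /* CL_INVALID_BUFFER_SIZE */
    REGION_OUT_OF_BOUNDS, /* CL_INVALID_VALUE */
    REGION_MISALIGNED     /* CL_MISALIGNED_SUB_BUFFER_OFFSET */
} region_check;

/* parent_size: size in bytes of the parent buffer.
 * align_bytes: one device's base address alignment, already converted
 *              from bits to bytes by the caller (an assumption of this sketch). */
static region_check check_region(const region_t *r, size_t parent_size,
                                 size_t align_bytes)
{
    assert(r != NULL && align_bytes > 0);
    if (r->size == 0)
        return REGION_ZERO_SIZE;
    /* Written to avoid overflow in origin + size. */
    if (r->origin > parent_size || r->size > parent_size - r->origin)
        return REGION_OUT_OF_BOUNDS;
    if (r->origin % align_bytes != 0)
        return REGION_MISALIGNED;
    return REGION_OK;
}
```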
| </div> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="_reading_writing_and_copying_buffer_objects"><a class="anchor" href="#_reading_writing_and_copying_buffer_objects"></a>5.2.2. Reading, Writing and Copying Buffer Objects</h4> |
| <div class="paragraph"> |
| <p>The following functions enqueue commands to read from a buffer object to |
| host memory or write to a buffer object from host memory.</p> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To read from a buffer object to host memory or to write to a buffer object from |
| host memory, call one of the functions:</p> |
| </div> |
| <div id="clEnqueueReadBuffer" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clEnqueueReadBuffer( |
| cl_command_queue command_queue, |
| cl_mem buffer, |
| cl_bool blocking_read, |
| size_t offset, |
| size_t size, |
| <span class="directive">void</span>* ptr, |
| cl_uint num_events_in_wait_list, |
| <span class="directive">const</span> cl_event* event_wait_list, |
| cl_event* event);</code></pre> |
| </div> |
| </div> |
| <div id="clEnqueueWriteBuffer" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clEnqueueWriteBuffer( |
| cl_command_queue command_queue, |
| cl_mem buffer, |
| cl_bool blocking_write, |
| size_t offset, |
| size_t size, |
| <span class="directive">const</span> <span class="directive">void</span>* ptr, |
| cl_uint num_events_in_wait_list, |
| <span class="directive">const</span> cl_event* event_wait_list, |
| cl_event* event);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>command_queue</em> is a valid host command-queue in which the read / write |
| command will be queued. |
| <em>command_queue</em> and <em>buffer</em> must be created with the same OpenCL context.</p> |
| </li> |
| <li> |
| <p><em>buffer</em> refers to a valid buffer object.</p> |
| </li> |
| <li> |
| <p><em>blocking_read</em> and <em>blocking_write</em> indicate if the read and write |
| operations are <em>blocking</em> or <em>non-blocking</em> (see below).</p> |
| </li> |
| <li> |
| <p><em>offset</em> is the offset in bytes in the buffer object to read from or write |
| to.</p> |
| </li> |
| <li> |
| <p><em>size</em> is the size in bytes of data being read or written.</p> |
| </li> |
| <li> |
| <p><em>ptr</em> is a pointer to the buffer in host memory where data is to be read into |
| or written from.</p> |
| </li> |
| <li> |
| <p><em>event_wait_list</em> and <em>num_events_in_wait_list</em> specify events that need to |
| complete before this particular command can be executed. |
| If <em>event_wait_list</em> is <code>NULL</code>, then this particular command does not wait |
| on any event to complete. |
| If <em>event_wait_list</em> is <code>NULL</code>, <em>num_events_in_wait_list</em> must be 0. |
| If <em>event_wait_list</em> is not <code>NULL</code>, the list of events pointed to by |
| <em>event_wait_list</em> must be valid and <em>num_events_in_wait_list</em> must be |
| greater than 0. |
| The events specified in <em>event_wait_list</em> act as synchronization points. |
| The context associated with events in <em>event_wait_list</em> and <em>command_queue</em> |
| must be the same. |
| The memory associated with <em>event_wait_list</em> can be reused or freed after |
| the function returns.</p> |
| </li> |
| <li> |
| <p><em>event</em> returns an event object that identifies this read / write command |
| and can be used to query or queue a wait for this command to complete. |
| If <em>event</em> is <code>NULL</code> or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If <em>event_wait_list</em> and <em>event</em> are not <code>NULL</code>, <em>event</em> must not refer |
| to an element of the <em>event_wait_list</em> array.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>If <em>blocking_read</em> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, i.e. the read command is blocking, |
| <a href="#clEnqueueReadBuffer"><strong>clEnqueueReadBuffer</strong></a> does not return until the buffer data has been read |
| and copied into memory pointed to by <em>ptr</em>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If <em>blocking_read</em> is <a href="#CL_FALSE"><code>CL_FALSE</code></a>, i.e. the read command is non-blocking, |
| <a href="#clEnqueueReadBuffer"><strong>clEnqueueReadBuffer</strong></a> queues a non-blocking read command and returns. |
| The contents of the buffer that <em>ptr</em> points to cannot be used until the |
| read command has completed. |
| The <em>event</em> argument returns an event object which can be used to query the |
| execution status of the read command. |
| When the read command has completed, the contents of the buffer that <em>ptr</em> |
| points to can be used by the application.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If <em>blocking_write</em> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, the write command is blocking and does not |
| return until the command is complete, including transfer of the data. |
| The memory pointed to by <em>ptr</em> can be reused by the application after the |
| <a href="#clEnqueueWriteBuffer"><strong>clEnqueueWriteBuffer</strong></a> call returns.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If <em>blocking_write</em> is <a href="#CL_FALSE"><code>CL_FALSE</code></a>, the OpenCL implementation will use <em>ptr</em> to |
| perform a non-blocking write. |
| As the write is non-blocking the implementation can return immediately. |
| The memory pointed to by <em>ptr</em> cannot be reused by the application immediately |
| after the call returns. |
| The <em>event</em> argument returns an event object which can be used to query the |
| execution status of the write command. |
| When the write command has completed, the memory pointed to by <em>ptr</em> can |
| then be reused by the application.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clEnqueueReadBuffer"><strong>clEnqueueReadBuffer</strong></a> and <a href="#clEnqueueWriteBuffer"><strong>clEnqueueWriteBuffer</strong></a> return <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the |
| function is executed successfully. |
| Otherwise, they return one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_COMMAND_QUEUE"><code>CL_INVALID_<wbr>COMMAND_<wbr>QUEUE</code></a> if <em>command_queue</em> is not a valid host |
| command-queue.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if the context associated with <em>command_queue</em> and |
| <em>buffer</em> are not the same or if the context associated with |
| <em>command_queue</em> and events in <em>event_wait_list</em> are not the same.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_MEM_OBJECT"><code>CL_INVALID_<wbr>MEM_<wbr>OBJECT</code></a> if <em>buffer</em> is not a valid buffer object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if the region being read or written specified by |
| (<em>offset</em>, <em>size</em>) is out of bounds or if <em>ptr</em> is a <code>NULL</code> value.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_EVENT_WAIT_LIST"><code>CL_INVALID_<wbr>EVENT_<wbr>WAIT_<wbr>LIST</code></a> if <em>event_wait_list</em> is <code>NULL</code> and |
| <em>num_events_in_wait_list</em> > 0, or <em>event_wait_list</em> is not <code>NULL</code> and |
| <em>num_events_in_wait_list</em> is 0, or if event objects in <em>event_wait_list</em> |
| are not valid events.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MISALIGNED_SUB_BUFFER_OFFSET"><code>CL_MISALIGNED_<wbr>SUB_<wbr>BUFFER_<wbr>OFFSET</code></a> if <em>buffer</em> is a sub-buffer object and |
| the <em>origin</em> specified when the sub-buffer object was created is not aligned |
| to the <a href="#CL_DEVICE_MEM_BASE_ADDR_ALIGN"><code>CL_DEVICE_<wbr>MEM_<wbr>BASE_<wbr>ADDR_<wbr>ALIGN</code></a> value for the device associated with |
| <em>command_queue</em>. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST"><code>CL_EXEC_<wbr>STATUS_<wbr>ERROR_<wbr>FOR_<wbr>EVENTS_<wbr>IN_<wbr>WAIT_<wbr>LIST</code></a> if the read and write |
| operations are blocking and the execution status of any of the events in |
| <em>event_wait_list</em> is a negative integer value. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_OBJECT_ALLOCATION_FAILURE"><code>CL_MEM_<wbr>OBJECT_<wbr>ALLOCATION_<wbr>FAILURE</code></a> if there is a failure to allocate |
| memory for data store associated with <em>buffer</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if <a href="#clEnqueueReadBuffer"><strong>clEnqueueReadBuffer</strong></a> is called on <em>buffer</em> |
| which has been created with <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a> or |
| <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if <a href="#clEnqueueWriteBuffer"><strong>clEnqueueWriteBuffer</strong></a> is called on <em>buffer</em> |
| which has been created with <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a> or |
| <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
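| <div class="paragraph"> |
| <p>Some of the argument errors listed above can be caught on the host before the call is made. |
| The sketch below is illustrative only: <code>check_transfer_args</code> and the <code>xfer_check</code> values are hypothetical names, |
| the buffer size is assumed to be known to the caller, and event handles are passed as an opaque pointer so that the code |
| compiles without the OpenCL headers. It covers only the (<em>offset</em>, <em>size</em>), <em>ptr</em> and wait-list rules.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative result codes; comments name the corresponding OpenCL error. */
typedef enum xfer_check {
    XFER_OK,
    XFER_INVALID_VALUE,          /* CL_INVALID_VALUE */
    XFER_INVALID_EVENT_WAIT_LIST /* CL_INVALID_EVENT_WAIT_LIST */
} xfer_check;

/* Pre-checks the arguments of a linear read or write.
 * buffer_size is the size in bytes of the buffer object (assumed known). */
static xfer_check check_transfer_args(size_t buffer_size, size_t offset,
                                      size_t size, const void *ptr,
                                      const void *event_wait_list,
                                      unsigned num_events_in_wait_list)
{
    assert(buffer_size > 0); /* a valid buffer object never has size 0 */
    if (ptr == NULL)
        return XFER_INVALID_VALUE;
    /* Region (offset, size) must lie inside the buffer; overflow-safe form. */
    if (offset > buffer_size || size > buffer_size - offset)
        return XFER_INVALID_VALUE;
    /* A NULL list requires a count of 0 and a non-NULL list a count > 0. */
    if ((event_wait_list == NULL) != (num_events_in_wait_list == 0))
        return XFER_INVALID_EVENT_WAIT_LIST;
    return XFER_OK;
}
```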
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>The following functions enqueue commands to read a 2D or 3D rectangular |
| region from a buffer object to host memory or write a 2D or 3D rectangular |
| region to a buffer object from host memory.</p> |
| </div> |
| <div id="clEnqueueReadBufferRect" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clEnqueueReadBufferRect( |
| cl_command_queue command_queue, |
| cl_mem buffer, |
| cl_bool blocking_read, |
| <span class="directive">const</span> size_t* buffer_origin, |
| <span class="directive">const</span> size_t* host_origin, |
| <span class="directive">const</span> size_t* region, |
| size_t buffer_row_pitch, |
| size_t buffer_slice_pitch, |
| size_t host_row_pitch, |
| size_t host_slice_pitch, |
| <span class="directive">void</span>* ptr, |
| cl_uint num_events_in_wait_list, |
| <span class="directive">const</span> cl_event* event_wait_list, |
| cl_event* event);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clEnqueueReadBufferRect"><strong>clEnqueueReadBufferRect</strong></a> is <a href="#unified-spec">missing before</a> version 1.1. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div id="clEnqueueWriteBufferRect" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clEnqueueWriteBufferRect( |
| cl_command_queue command_queue, |
| cl_mem buffer, |
| cl_bool blocking_write, |
| <span class="directive">const</span> size_t* buffer_origin, |
| <span class="directive">const</span> size_t* host_origin, |
| <span class="directive">const</span> size_t* region, |
| size_t buffer_row_pitch, |
| size_t buffer_slice_pitch, |
| size_t host_row_pitch, |
| size_t host_slice_pitch, |
| <span class="directive">const</span> <span class="directive">void</span>* ptr, |
| cl_uint num_events_in_wait_list, |
| <span class="directive">const</span> cl_event* event_wait_list, |
| cl_event* event);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clEnqueueWriteBufferRect"><strong>clEnqueueWriteBufferRect</strong></a> is <a href="#unified-spec">missing before</a> version 1.1. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>command_queue</em> is a valid host command-queue in which the read / |
| write command will be queued. |
| <em>command_queue</em> and <em>buffer</em> must be created with the same OpenCL context.</p> |
| </li> |
| <li> |
| <p><em>buffer</em> refers to a valid buffer object.</p> |
| </li> |
| <li> |
| <p><em>blocking_read</em> and <em>blocking_write</em> indicate if the read and write |
| operations are <em>blocking</em> or <em>non-blocking</em> (see below).</p> |
| </li> |
| <li> |
| <p><em>buffer_origin</em> defines the (<em>x</em>, <em>y</em>, <em>z</em>) offset in the memory region |
| associated with <em>buffer</em>. |
| For a 2D rectangle region, the <em>z</em> value given by <em>buffer_origin</em>[2] should |
| be 0. |
| The offset in bytes is computed as <em>buffer_origin</em>[2] × |
| <em>buffer_slice_pitch</em> + <em>buffer_origin</em>[1] × <em>buffer_row_pitch</em> + |
| <em>buffer_origin</em>[0].</p> |
| </li> |
| <li> |
| <p><em>host_origin</em> defines the (<em>x</em>, <em>y</em>, <em>z</em>) offset in the memory region |
| pointed to by <em>ptr</em>. |
| For a 2D rectangle region, the <em>z</em> value given by <em>host_origin</em>[2] should be |
| 0. |
| The offset in bytes is computed as <em>host_origin</em>[2] × |
| <em>host_slice_pitch</em> + <em>host_origin</em>[1] × <em>host_row_pitch</em> + |
| <em>host_origin</em>[0].</p> |
| </li> |
| <li> |
| <p><em>region</em> defines the (<em>width</em> in bytes, <em>height</em> in rows, <em>depth</em> in slices) |
| of the 2D or 3D rectangle being read or written. |
| For a 2D rectangle, the <em>depth</em> value given by <em>region</em>[2] should be 1. |
| The values in <em>region</em> cannot be 0.</p> |
| </li> |
| <li> |
| <p><em>buffer_row_pitch</em> is the length of each row in bytes to be used for the |
| memory region associated with <em>buffer</em>. |
| If <em>buffer_row_pitch</em> is 0, <em>buffer_row_pitch</em> is computed as <em>region</em>[0].</p> |
| </li> |
| <li> |
| <p><em>buffer_slice_pitch</em> is the length of each 2D slice in bytes to be used for |
| the memory region associated with <em>buffer</em>. |
| If <em>buffer_slice_pitch</em> is 0, <em>buffer_slice_pitch</em> is computed as |
| <em>region</em>[1] × <em>buffer_row_pitch</em>.</p> |
| </li> |
| <li> |
| <p><em>host_row_pitch</em> is the length of each row in bytes to be used for the |
| memory region pointed to by <em>ptr</em>. |
| If <em>host_row_pitch</em> is 0, <em>host_row_pitch</em> is computed as <em>region</em>[0].</p> |
| </li> |
| <li> |
| <p><em>host_slice_pitch</em> is the length of each 2D slice in bytes to be used for |
| the memory region pointed to by <em>ptr</em>. |
| If <em>host_slice_pitch</em> is 0, <em>host_slice_pitch</em> is computed as <em>region</em>[1] |
| × <em>host_row_pitch</em>.</p> |
| </li> |
| <li> |
| <p><em>ptr</em> is a pointer to the buffer in host memory where data is to be read into |
| or written from.</p> |
| </li> |
| <li> |
| <p><em>event_wait_list</em> and <em>num_events_in_wait_list</em> specify events that need to |
| complete before this particular command can be executed. |
| If <em>event_wait_list</em> is <code>NULL</code>, then this particular command does not wait |
| on any event to complete. |
| If <em>event_wait_list</em> is <code>NULL</code>, <em>num_events_in_wait_list</em> must be 0. |
| If <em>event_wait_list</em> is not <code>NULL</code>, the list of events pointed to by |
| <em>event_wait_list</em> must be valid and <em>num_events_in_wait_list</em> must be |
| greater than 0. |
| The events specified in <em>event_wait_list</em> act as synchronization points. |
| The context associated with events in <em>event_wait_list</em> and <em>command_queue</em> |
| must be the same. |
| The memory associated with <em>event_wait_list</em> can be reused or freed after |
| the function returns.</p> |
| </li> |
| <li> |
| <p><em>event</em> returns an event object that identifies this read / write command |
| and can be used to query or queue a wait for this command to complete. |
| If <em>event</em> is <code>NULL</code> or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If <em>event_wait_list</em> and <em>event</em> are not <code>NULL</code>, <em>event</em> must not refer |
| to an element of the <em>event_wait_list</em> array.</p> |
| </li> |
| </ul> |
| </div> |
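| <div class="paragraph"> |
| <p>The pitch defaults and the byte-offset formula described above can be written out as follows. |
| This is a minimal sketch with hypothetical helper names (<code>effective_row_pitch</code>, <code>effective_slice_pitch</code>, |
| <code>rect_byte_offset</code>); the same arithmetic applies to the buffer side and to the host side, |
| and the sketch assumes the products do not overflow <code>size_t</code>.</p> |
| </div> |

```c
#include <assert.h>
#include <stddef.h>

/* A row pitch of 0 defaults to the row width, region[0], in bytes. */
static size_t effective_row_pitch(size_t row_pitch, const size_t region[3])
{
    assert(region != NULL);
    return row_pitch != 0 ? row_pitch : region[0];
}

/* A slice pitch of 0 defaults to region[1] rows of the effective row pitch. */
static size_t effective_slice_pitch(size_t slice_pitch, size_t row_pitch,
                                    const size_t region[3])
{
    assert(region != NULL);
    return slice_pitch != 0
               ? slice_pitch
               : region[1] * effective_row_pitch(row_pitch, region);
}

/* Byte offset of the rectangle's first byte:
 * origin[2] * slice_pitch + origin[1] * row_pitch + origin[0]. */
static size_t rect_byte_offset(const size_t origin[3], const size_t region[3],
                               size_t row_pitch, size_t slice_pitch)
{
    assert(origin != NULL);
    size_t rp = effective_row_pitch(row_pitch, region);
    size_t sp = effective_slice_pitch(slice_pitch, row_pitch, region);
    return origin[2] * sp + origin[1] * rp + origin[0];
}
```

| <div class="paragraph"> |
| <p>For example, with an origin of (16, 2, 1), a region of (64, 4, 2) and both pitches given as 0, the row pitch becomes 64, |
| the slice pitch becomes 256, and the offset is 1 × 256 + 2 × 64 + 16 = 400 bytes.</p> |
| </div> |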
| <div class="paragraph"> |
| <p>If <em>blocking_read</em> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, i.e. the read command is blocking, |
| <a href="#clEnqueueReadBufferRect"><strong>clEnqueueReadBufferRect</strong></a> does not return until the buffer data has been |
| read and copied into memory pointed to by <em>ptr</em>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If <em>blocking_read</em> is <a href="#CL_FALSE"><code>CL_FALSE</code></a>, i.e. the read command is non-blocking, |
| <a href="#clEnqueueReadBufferRect"><strong>clEnqueueReadBufferRect</strong></a> queues a non-blocking read command and returns. |
| The contents of the buffer that <em>ptr</em> points to cannot be used until the |
| read command has completed. |
| The <em>event</em> argument returns an event object which can be used to query the |
| execution status of the read command. |
| When the read command has completed, the contents of the buffer that <em>ptr</em> |
| points to can be used by the application.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If <em>blocking_write</em> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, the write command is blocking and does not |
| return until the command is complete, including transfer of the data. |
| The memory pointed to by <em>ptr</em> can be reused by the application after the |
| <a href="#clEnqueueWriteBufferRect"><strong>clEnqueueWriteBufferRect</strong></a> call returns.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If <em>blocking_write</em> is <a href="#CL_FALSE"><code>CL_FALSE</code></a>, the OpenCL implementation will use <em>ptr</em> to |
| perform a non-blocking write. |
| As the write is non-blocking the implementation can return immediately. |
| The memory pointed to by <em>ptr</em> cannot be reused by the application immediately |
| after the call returns. |
| The <em>event</em> argument returns an event object which can be used to query the |
| execution status of the write command. |
| When the write command has completed, the memory pointed to by <em>ptr</em> can |
| then be reused by the application.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clEnqueueReadBufferRect"><strong>clEnqueueReadBufferRect</strong></a> and <a href="#clEnqueueWriteBufferRect"><strong>clEnqueueWriteBufferRect</strong></a> return <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> |
| if the function is executed successfully. |
| Otherwise, they return one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_COMMAND_QUEUE"><code>CL_INVALID_<wbr>COMMAND_<wbr>QUEUE</code></a> if <em>command_queue</em> is not a valid host |
| command-queue.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if the context associated with <em>command_queue</em> and |
| <em>buffer</em> are not the same or if the context associated with |
| <em>command_queue</em> and events in <em>event_wait_list</em> are not the same.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_MEM_OBJECT"><code>CL_INVALID_<wbr>MEM_<wbr>OBJECT</code></a> if <em>buffer</em> is not a valid buffer object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>buffer_origin</em>, <em>host_origin</em>, or <em>region</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if the region being read or written specified by |
| (<em>buffer_origin</em>, <em>region</em>, <em>buffer_row_pitch</em>, <em>buffer_slice_pitch</em>) is |
| out of bounds.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if any <em>region</em> array element is 0.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>buffer_row_pitch</em> is not 0 and is less than |
| <em>region</em>[0].</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>host_row_pitch</em> is not 0 and is less than |
| <em>region</em>[0].</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>buffer_slice_pitch</em> is not 0 and is less than |
| <em>region</em>[1] × <em>buffer_row_pitch</em> and not a multiple of |
| <em>buffer_row_pitch</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>host_slice_pitch</em> is not 0 and is less than |
| <em>region</em>[1] × <em>host_row_pitch</em> and not a multiple of |
| <em>host_row_pitch</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>ptr</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_EVENT_WAIT_LIST"><code>CL_INVALID_<wbr>EVENT_<wbr>WAIT_<wbr>LIST</code></a> if <em>event_wait_list</em> is <code>NULL</code> and |
| <em>num_events_in_wait_list</em> > 0, or <em>event_wait_list</em> is not <code>NULL</code> and |
| <em>num_events_in_wait_list</em> is 0, or if event objects in <em>event_wait_list</em> |
| are not valid events.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MISALIGNED_SUB_BUFFER_OFFSET"><code>CL_MISALIGNED_<wbr>SUB_<wbr>BUFFER_<wbr>OFFSET</code></a> if <em>buffer</em> is a sub-buffer object and |
| the <em>origin</em> specified when the sub-buffer object was created is not aligned |
| to the <a href="#CL_DEVICE_MEM_BASE_ADDR_ALIGN"><code>CL_DEVICE_<wbr>MEM_<wbr>BASE_<wbr>ADDR_<wbr>ALIGN</code></a> value for the device associated with |
| <em>command_queue</em>. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST"><code>CL_EXEC_<wbr>STATUS_<wbr>ERROR_<wbr>FOR_<wbr>EVENTS_<wbr>IN_<wbr>WAIT_<wbr>LIST</code></a> if the read and write |
| operations are blocking and the execution status of any of the events in |
| <em>event_wait_list</em> is a negative integer value. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_OBJECT_ALLOCATION_FAILURE"><code>CL_MEM_<wbr>OBJECT_<wbr>ALLOCATION_<wbr>FAILURE</code></a> if there is a failure to allocate |
| memory for data store associated with <em>buffer</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if <a href="#clEnqueueReadBufferRect"><strong>clEnqueueReadBufferRect</strong></a> is called on <em>buffer</em> |
| which has been created with <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a> or |
| <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if <a href="#clEnqueueWriteBufferRect"><strong>clEnqueueWriteBufferRect</strong></a> is called on <em>buffer</em> |
| which has been created with <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a> or |
| <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
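| <div class="paragraph"> |
| <p>The <em>region</em>, pitch and bounds rules in the list above can be combined into one host-side pre-check. |
| The sketch below is an illustration with a hypothetical name (<code>rect_args_valid</code>), applied to the buffer side only. |
| Two assumptions are made: the size of the buffer is known to the caller, and a non-zero slice pitch is rejected when it is |
| either smaller than <em>region</em>[1] × row pitch or not a multiple of the row pitch, which is a stricter reading than the wording above. |
| Arithmetic overflow is not handled.</p> |
| </div> |

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when the rectangle described by (origin, region, pitches)
 * passes the checks sketched here for a buffer of buffer_size bytes. */
static bool rect_args_valid(const size_t origin[3], const size_t region[3],
                            size_t row_pitch, size_t slice_pitch,
                            size_t buffer_size)
{
    assert(buffer_size > 0); /* a valid buffer object never has size 0 */
    if (origin == NULL || region == NULL)
        return false;
    if (region[0] == 0 || region[1] == 0 || region[2] == 0)
        return false;
    if (row_pitch != 0 && row_pitch < region[0])
        return false;
    size_t rp = row_pitch != 0 ? row_pitch : region[0];

    /* Assumption: stricter reading of the slice pitch rule (see lead-in). */
    if (slice_pitch != 0 &&
        (slice_pitch < region[1] * rp || slice_pitch % rp != 0))
        return false;
    size_t sp = slice_pitch != 0 ? slice_pitch : region[1] * rp;

    /* One past the last byte touched: start of the last row of the last
     * slice, plus the row width. */
    size_t first = origin[2] * sp + origin[1] * rp + origin[0];
    size_t end = first + (region[2] - 1) * sp + (region[1] - 1) * rp + region[0];
    return end <= buffer_size;
}
```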
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| <div class="paragraph"> |
| <p>Calling <a href="#clEnqueueReadBuffer"><strong>clEnqueueReadBuffer</strong></a> to read a region of the buffer object with the |
| <em>ptr</em> argument value set to <em>host_ptr</em> + <em>offset</em>, where <em>host_ptr</em> is a |
| pointer to the memory region specified when the buffer object being read is |
| created with <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a>, must meet the following requirements in |
| order to avoid undefined behavior:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>All commands that use this buffer object or a memory object (buffer or |
| image) created from this buffer object have finished execution before |
| the read command begins execution.</p> |
| </li> |
| <li> |
| <p>The buffer object or memory objects created from this buffer object are |
| not mapped.</p> |
| </li> |
| <li> |
| <p>The buffer object or memory objects created from this buffer object are |
| not used by any command-queue until the read command has finished |
| execution.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Calling <a href="#clEnqueueReadBufferRect"><strong>clEnqueueReadBufferRect</strong></a> to read a region of the buffer object with |
| the <em>ptr</em> argument value set to <em>host_ptr</em> and with identical <em>host_origin</em> and |
| <em>buffer_origin</em> values, where <em>host_ptr</em> is a pointer to the |
| memory region specified when the buffer object being read is created with |
| <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a>, must meet the same requirements given above for |
| <a href="#clEnqueueReadBuffer"><strong>clEnqueueReadBuffer</strong></a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Calling <a href="#clEnqueueWriteBuffer"><strong>clEnqueueWriteBuffer</strong></a> to update the latest bits in a region of the |
| buffer object with the <em>ptr</em> argument value set to <em>host_ptr</em> + <em>offset</em>, |
| where <em>host_ptr</em> is a pointer to the memory region specified when the buffer |
| object being written is created with <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a>, must meet the |
| following requirements in order to avoid undefined behavior:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>The host memory region given by (<em>host_ptr</em> + <em>offset</em>, <em>cb</em>) contains |
| the latest bits when the enqueued write command begins execution.</p> |
| </li> |
| <li> |
| <p>The buffer object or memory objects created from this buffer object are |
| not mapped.</p> |
| </li> |
| <li> |
| <p>The buffer object or memory objects created from this buffer object are |
| not used by any command-queue until the write command has finished |
| execution.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Calling <a href="#clEnqueueWriteBufferRect"><strong>clEnqueueWriteBufferRect</strong></a> to update the latest bits in a region of |
| the buffer object with the <em>ptr</em> argument value set to <em>host_ptr</em> and with |
| <em>host_origin</em> and <em>buffer_origin</em> set to the same values, where <em>host_ptr</em> is a |
| pointer to the memory region specified when the buffer object being written |
| is created with <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a>, must meet the following requirements in |
| order to avoid undefined behavior:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>The host memory region given by (<em>buffer_origin</em>, <em>region</em>) contains the |
| latest bits when the enqueued write command begins execution.</p> |
| </li> |
| <li> |
| <p>The buffer object or memory objects created from this buffer object are |
| not mapped.</p> |
| </li> |
| <li> |
| <p>The buffer object or memory objects created from this buffer object are |
| not used by any command-queue until the write command has finished |
| execution.</p> |
| </li> |
| </ul> |
| </div> |
| </td> |
| </tr> |
| </table> |
| </div> |
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To enqueue a command to copy a buffer object identified by <em>src_buffer</em> to |
| another buffer object identified by <em>dst_buffer</em>, call the function</p> |
| </div> |
| <div id="clEnqueueCopyBuffer" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clEnqueueCopyBuffer( |
| cl_command_queue command_queue, |
| cl_mem src_buffer, |
| cl_mem dst_buffer, |
| size_t src_offset, |
| size_t dst_offset, |
| size_t size, |
| cl_uint num_events_in_wait_list, |
| <span class="directive">const</span> cl_event* event_wait_list, |
| cl_event* event);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>command_queue</em> refers to a host command-queue in which the copy command |
| will be queued. |
| The OpenCL context associated with <em>command_queue</em>, <em>src_buffer</em> and |
| <em>dst_buffer</em> must be the same.</p> |
| </li> |
| <li> |
| <p><em>src_offset</em> is the offset in bytes at which to begin copying data from |
| <em>src_buffer</em>.</p> |
| </li> |
| <li> |
| <p><em>dst_offset</em> is the offset in bytes at which to begin copying data into |
| <em>dst_buffer</em>.</p> |
| </li> |
| <li> |
| <p><em>size</em> refers to the size in bytes to copy.</p> |
| </li> |
| <li> |
| <p><em>event_wait_list</em> and <em>num_events_in_wait_list</em> specify events that need to |
| complete before this particular command can be executed. |
| If <em>event_wait_list</em> is <code>NULL</code>, then this particular command does not wait |
| on any event to complete. |
| If <em>event_wait_list</em> is <code>NULL</code>, <em>num_events_in_wait_list</em> must be 0. |
| If <em>event_wait_list</em> is not <code>NULL</code>, the list of events pointed to by |
| <em>event_wait_list</em> must be valid and <em>num_events_in_wait_list</em> must be |
| greater than 0. |
| The events specified in <em>event_wait_list</em> act as synchronization points. |
| The context associated with events in <em>event_wait_list</em> and <em>command_queue</em> |
| must be the same. |
| The memory associated with <em>event_wait_list</em> can be reused or freed after |
| the function returns.</p> |
| </li> |
| <li> |
| <p><em>event</em> returns an event object that identifies this copy command |
| and can be used to query or queue a wait for this command to complete. |
| If <em>event</em> is <code>NULL</code> or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If <em>event_wait_list</em> and <em>event</em> are not <code>NULL</code>, <em>event</em> must not refer |
| to an element of the <em>event_wait_list</em> array.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clEnqueueCopyBuffer"><strong>clEnqueueCopyBuffer</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_COMMAND_QUEUE"><code>CL_INVALID_<wbr>COMMAND_<wbr>QUEUE</code></a> if <em>command_queue</em> is not a valid host |
| command-queue.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if the context associated with <em>command_queue</em>, |
| <em>src_buffer</em> and <em>dst_buffer</em> are not the same or if the context |
| associated with <em>command_queue</em> and events in <em>event_wait_list</em> are not |
| the same.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_MEM_OBJECT"><code>CL_INVALID_<wbr>MEM_<wbr>OBJECT</code></a> if <em>src_buffer</em> or <em>dst_buffer</em> is not a valid |
| buffer object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>src_offset</em>, <em>dst_offset</em>, <em>size</em>, <em>src_offset</em> |
| + <em>size</em> or <em>dst_offset</em> + <em>size</em> require accessing elements |
| outside the <em>src_buffer</em> and <em>dst_buffer</em> buffer objects respectively.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_EVENT_WAIT_LIST"><code>CL_INVALID_<wbr>EVENT_<wbr>WAIT_<wbr>LIST</code></a> if <em>event_wait_list</em> is <code>NULL</code> and |
| <em>num_events_in_wait_list</em> > 0, or <em>event_wait_list</em> is not <code>NULL</code> and |
| <em>num_events_in_wait_list</em> is 0, or if event objects in <em>event_wait_list</em> |
| are not valid events.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MISALIGNED_SUB_BUFFER_OFFSET"><code>CL_MISALIGNED_<wbr>SUB_<wbr>BUFFER_<wbr>OFFSET</code></a> if <em>src_buffer</em> is a sub-buffer object |
| and the <em>offset</em> specified when the sub-buffer object is created is not |
| aligned to the <a href="#CL_DEVICE_MEM_BASE_ADDR_ALIGN"><code>CL_DEVICE_<wbr>MEM_<wbr>BASE_<wbr>ADDR_<wbr>ALIGN</code></a> value for the device associated |
| with <em>command_queue</em>. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MISALIGNED_SUB_BUFFER_OFFSET"><code>CL_MISALIGNED_<wbr>SUB_<wbr>BUFFER_<wbr>OFFSET</code></a> if <em>dst_buffer</em> is a sub-buffer object |
| and the <em>offset</em> specified when the sub-buffer object is created is not |
| aligned to the <a href="#CL_DEVICE_MEM_BASE_ADDR_ALIGN"><code>CL_DEVICE_<wbr>MEM_<wbr>BASE_<wbr>ADDR_<wbr>ALIGN</code></a> value for the device associated |
| with <em>command_queue</em>. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_COPY_OVERLAP"><code>CL_MEM_<wbr>COPY_<wbr>OVERLAP</code></a> if <em>src_buffer</em> and <em>dst_buffer</em> are the same buffer |
| or sub-buffer object and the source and destination regions overlap or |
| if <em>src_buffer</em> and <em>dst_buffer</em> are different sub-buffers of the same |
| associated buffer object and they overlap. |
| The regions overlap if <em>src_offset</em> ≤ <em>dst_offset</em> ≤ |
| <em>src_offset</em> + <em>size</em> - 1 or if <em>dst_offset</em> ≤ <em>src_offset</em> ≤ |
| <em>dst_offset</em> + <em>size</em> - 1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_OBJECT_ALLOCATION_FAILURE"><code>CL_MEM_<wbr>OBJECT_<wbr>ALLOCATION_<wbr>FAILURE</code></a> if there is a failure to allocate |
| memory for data store associated with <em>src_buffer</em> or <em>dst_buffer</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
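| <div class="paragraph"> |
| <p>The range and overlap conditions listed above can be sketched in plain C, with no OpenCL calls. |
| This is a minimal illustration only: the helper names <code>copy_regions_overlap</code> and |
| <code>copy_range_in_bounds</code> and the <code>buffer_size</code> parameter are hypothetical and not part of the OpenCL API. |
| The overlap test is an algebraic rearrangement of the inequalities given for <a href="#CL_MEM_COPY_OVERLAP"><code>CL_MEM_<wbr>COPY_<wbr>OVERLAP</code></a>, |
| chosen so that it cannot overflow.</p> |
| </div> |

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL function).
 * Returns true if [src_offset, src_offset + size) and
 * [dst_offset, dst_offset + size) share at least one byte.
 * For size >= 1 this is equivalent to
 *   src_offset <= dst_offset <= src_offset + size - 1, or
 *   dst_offset <= src_offset <= dst_offset + size - 1,
 * rewritten as a distance check to avoid overflow in offset + size. */
bool copy_regions_overlap(size_t src_offset, size_t dst_offset, size_t size)
{
    if (src_offset <= dst_offset)
        return dst_offset - src_offset < size;
    return src_offset - dst_offset < size;
}

/* Hypothetical helper (not an OpenCL function).
 * Returns true if [offset, offset + size) lies entirely inside a buffer
 * of buffer_size bytes. */
bool copy_range_in_bounds(size_t offset, size_t size, size_t buffer_size)
{
    return offset <= buffer_size && size <= buffer_size - offset;
}
```

| <div class="paragraph"> |
| <p>An application could run such checks before enqueuing a copy within a single buffer object, |
| although the implementation performs its own validation regardless.</p> |
| </div> |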
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To enqueue a command to copy a 2D or 3D rectangular region from the buffer |
| object identified by <em>src_buffer</em> to a 2D or 3D region in the buffer object |
| identified by <em>dst_buffer</em>, call the function</p> |
| </div> |
| <div id="clEnqueueCopyBufferRect" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clEnqueueCopyBufferRect( |
| cl_command_queue command_queue, |
| cl_mem src_buffer, |
| cl_mem dst_buffer, |
| <span class="directive">const</span> size_t* src_origin, |
| <span class="directive">const</span> size_t* dst_origin, |
| <span class="directive">const</span> size_t* region, |
| size_t src_row_pitch, |
| size_t src_slice_pitch, |
| size_t dst_row_pitch, |
| size_t dst_slice_pitch, |
| cl_uint num_events_in_wait_list, |
| <span class="directive">const</span> cl_event* event_wait_list, |
| cl_event* event);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clEnqueueCopyBufferRect"><strong>clEnqueueCopyBufferRect</strong></a> is <a href="#unified-spec">missing before</a> version 1.1. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>command_queue</em> refers to the host command-queue in which the copy command |
| will be queued. |
| The OpenCL context associated with <em>command_queue</em>, <em>src_buffer</em> and |
| <em>dst_buffer</em> must be the same.</p> |
| </li> |
| <li> |
| <p><em>src_origin</em> defines the (<em>x</em>, <em>y</em>, <em>z</em>) offset in the memory region |
| associated with <em>src_buffer</em>. |
| For a 2D rectangle region, the <em>z</em> value given by <em>src_origin</em>[2] should be |
| 0. |
| The offset in bytes is computed as <em>src_origin</em>[2] × <em>src_slice_pitch</em> |
| + <em>src_origin</em>[1] × <em>src_row_pitch</em> + <em>src_origin</em>[0].</p> |
| </li> |
| <li> |
| <p><em>dst_origin</em> defines the (<em>x</em>, <em>y</em>, <em>z</em>) offset in the memory region |
| associated with <em>dst_buffer</em>. |
| For a 2D rectangle region, the <em>z</em> value given by <em>dst_origin</em>[2] should be |
| 0. |
| The offset in bytes is computed as <em>dst_origin</em>[2] × <em>dst_slice_pitch</em> |
| + <em>dst_origin</em>[1] × <em>dst_row_pitch</em> + <em>dst_origin</em>[0].</p> |
| </li> |
| <li> |
| <p><em>region</em> defines the (<em>width</em> in bytes, <em>height</em> in rows, <em>depth</em> in slices) |
| of the 2D or 3D rectangle being copied. |
| For a 2D rectangle, the <em>depth</em> value given by <em>region</em>[2] should be 1. |
| The values in region cannot be 0.</p> |
| </li> |
| <li> |
| <p><em>src_row_pitch</em> is the length of each row in bytes to be used for the memory |
| region associated with <em>src_buffer</em>. |
| If <em>src_row_pitch</em> is 0, <em>src_row_pitch</em> is computed as <em>region</em>[0].</p> |
| </li> |
| <li> |
| <p><em>src_slice_pitch</em> is the length of each 2D slice in bytes to be used for the |
| memory region associated with <em>src_buffer</em>. |
| If <em>src_slice_pitch</em> is 0, <em>src_slice_pitch</em> is computed as <em>region</em>[1] |
| × <em>src_row_pitch</em>.</p> |
| </li> |
| <li> |
| <p><em>dst_row_pitch</em> is the length of each row in bytes to be used for the memory |
| region associated with <em>dst_buffer</em>. |
| If <em>dst_row_pitch</em> is 0, <em>dst_row_pitch</em> is computed as <em>region</em>[0].</p> |
| </li> |
| <li> |
| <p><em>dst_slice_pitch</em> is the length of each 2D slice in bytes to be used for the |
| memory region associated with <em>dst_buffer</em>. |
| If <em>dst_slice_pitch</em> is 0, <em>dst_slice_pitch</em> is computed as <em>region</em>[1] |
| × <em>dst_row_pitch</em>.</p> |
| </li> |
| <li> |
| <p><em>event_wait_list</em> and <em>num_events_in_wait_list</em> specify events that need to |
| complete before this particular command can be executed. |
| If <em>event_wait_list</em> is <code>NULL</code>, then this particular command does not wait |
| on any event to complete. |
| If <em>event_wait_list</em> is <code>NULL</code>, <em>num_events_in_wait_list</em> must be 0. |
| If <em>event_wait_list</em> is not <code>NULL</code>, the list of events pointed to by |
| <em>event_wait_list</em> must be valid and <em>num_events_in_wait_list</em> must be |
| greater than 0. |
| The events specified in <em>event_wait_list</em> act as synchronization points. |
| The context associated with events in <em>event_wait_list</em> and <em>command_queue</em> |
| must be the same. |
| The memory associated with <em>event_wait_list</em> can be reused or freed after |
| the function returns.</p> |
| </li> |
| <li> |
| <p><em>event</em> returns an event object that identifies this copy command |
| and can be used to query or queue a wait for this command to complete. |
| If <em>event</em> is <code>NULL</code> or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If <em>event_wait_list</em> and <em>event</em> are not <code>NULL</code>, <em>event</em> must not refer |
| to an element of the <em>event_wait_list</em> array.</p> |
| </li> |
| </ul> |
| </div> |
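| <div class="paragraph"> |
| <p>The pitch defaults and the byte-offset formula described for <em>src_origin</em> and <em>dst_origin</em> |
| can be written out as a small host-side sketch. |
| The type <code>rect_pitches</code> and the functions <code>resolve_pitches</code> and <code>rect_byte_offset</code> |
| are hypothetical names introduced for illustration; they are not part of the OpenCL API.</p> |
| </div> |

```c
#include <stddef.h>

/* Hypothetical helper type: the pitches actually used for one side
 * (source or destination) of a rectangular copy. */
typedef struct {
    size_t row_pitch;
    size_t slice_pitch;
} rect_pitches;

/* Apply the defaulting rules from the parameter descriptions:
 * a row pitch of 0 means region[0];
 * a slice pitch of 0 means region[1] * row_pitch. */
rect_pitches resolve_pitches(const size_t region[3],
                             size_t row_pitch, size_t slice_pitch)
{
    rect_pitches p;
    p.row_pitch = (row_pitch != 0) ? row_pitch : region[0];
    p.slice_pitch = (slice_pitch != 0) ? slice_pitch
                                       : region[1] * p.row_pitch;
    return p;
}

/* Byte offset of an (x, y, z) origin:
 * origin[2] * slice_pitch + origin[1] * row_pitch + origin[0]. */
size_t rect_byte_offset(const size_t origin[3], rect_pitches p)
{
    return origin[2] * p.slice_pitch + origin[1] * p.row_pitch + origin[0];
}
```

| <div class="paragraph"> |
| <p>For example, with <em>region</em> = (16, 4, 2), both pitches passed as 0 and an origin of (8, 1, 1), |
| the row pitch resolves to 16, the slice pitch to 64, and the byte offset to 1 × 64 + 1 × 16 + 8 = 88.</p> |
| </div> |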
| <div class="paragraph"> |
| <p>Copying begins at the source offset and destination offset which are |
| computed as described above in the descriptions of <em>src_origin</em> and |
| <em>dst_origin</em>. |
| Each byte of the region’s width is copied from the source offset to the |
| destination offset. |
| After copying each width, the source and destination offsets are incremented |
| by their respective source and destination row pitches. |
| After copying each 2D rectangle, the source and destination offsets are |
| incremented by their respective source and destination slice pitches.</p> |
| </div> |
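| <div class="paragraph"> |
| <p>The copy procedure described above can be mimicked on the host with nested loops over slices and rows. |
| The function <code>host_copy_rect</code> below is a hypothetical stand-in operating on ordinary host arrays, |
| not a description of how any OpenCL implementation performs the copy. |
| It assumes that the pitches are already resolved to non-zero values, that the starting offsets have been |
| computed from the origins, and that the source and destination regions do not overlap.</p> |
| </div> |

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side model of a rectangular copy.
 * region = { width in bytes, height in rows, depth in slices }.
 * All pitches must be non-zero (already resolved) and the two regions
 * must not overlap, because memcpy is used for each row. */
void host_copy_rect(const unsigned char *src, unsigned char *dst,
                    size_t src_offset, size_t dst_offset,
                    const size_t region[3],
                    size_t src_row_pitch, size_t src_slice_pitch,
                    size_t dst_row_pitch, size_t dst_slice_pitch)
{
    for (size_t z = 0; z < region[2]; ++z) {
        for (size_t y = 0; y < region[1]; ++y) {
            /* Start of row y in slice z on each side. */
            const unsigned char *s =
                src + src_offset + z * src_slice_pitch + y * src_row_pitch;
            unsigned char *d =
                dst + dst_offset + z * dst_slice_pitch + y * dst_row_pitch;
            memcpy(d, s, region[0]); /* copy one row of region[0] bytes */
        }
    }
}
```

| <div class="paragraph"> |
| <p>Indexing each row as offset + z × slice pitch + y × row pitch yields the same addresses as |
| incrementing the offsets by the row pitch after each row and by the slice pitch after each 2D rectangle.</p> |
| </div> |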
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| <div class="paragraph"> |
| <p>If <em>src_buffer</em> and <em>dst_buffer</em> are the same buffer object, <em>src_row_pitch</em> |
| must equal <em>dst_row_pitch</em> and <em>src_slice_pitch</em> must equal |
| <em>dst_slice_pitch</em>.</p> |
| </div> |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clEnqueueCopyBufferRect"><strong>clEnqueueCopyBufferRect</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_COMMAND_QUEUE"><code>CL_INVALID_<wbr>COMMAND_<wbr>QUEUE</code></a> if <em>command_queue</em> is not a valid host |
| command-queue.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if the context associated with <em>command_queue</em>, |
| <em>src_buffer</em> and <em>dst_buffer</em> are not the same or if the context |
| associated with <em>command_queue</em> and events in <em>event_wait_list</em> are not |
| the same.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_MEM_OBJECT"><code>CL_INVALID_<wbr>MEM_<wbr>OBJECT</code></a> if <em>src_buffer</em> or <em>dst_buffer</em> is not a valid |
| buffer object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>src_origin</em>, <em>dst_origin</em>, or <em>region</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if (<em>src_origin</em>, <em>region</em>, <em>src_row_pitch</em>, |
| <em>src_slice_pitch</em>) or (<em>dst_origin</em>, <em>region</em>, <em>dst_row_pitch</em>, |
| <em>dst_slice_pitch</em>) require accessing elements outside the <em>src_buffer</em> |
| and <em>dst_buffer</em> buffer objects respectively.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if any <em>region</em> array element is 0.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>src_row_pitch</em> is not 0 and is less than |
| <em>region</em>[0].</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>dst_row_pitch</em> is not 0 and is less than |
| <em>region</em>[0].</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>src_slice_pitch</em> is not 0 and is less than |
| <em>region</em>[1] × <em>src_row_pitch</em> or if <em>src_slice_pitch</em> is not 0 and |
| is not a multiple of <em>src_row_pitch</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>dst_slice_pitch</em> is not 0 and is less than |
| <em>region</em>[1] × <em>dst_row_pitch</em> or if <em>dst_slice_pitch</em> is not 0 and |
| is not a multiple of <em>dst_row_pitch</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>src_buffer</em> and <em>dst_buffer</em> are the same buffer |
| object and <em>src_slice_pitch</em> is not equal to <em>dst_slice_pitch</em> or |
| <em>src_row_pitch</em> is not equal to <em>dst_row_pitch</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_EVENT_WAIT_LIST"><code>CL_INVALID_<wbr>EVENT_<wbr>WAIT_<wbr>LIST</code></a> if <em>event_wait_list</em> is <code>NULL</code> and |
| <em>num_events_in_wait_list</em> > 0, or <em>event_wait_list</em> is not <code>NULL</code> and |
| <em>num_events_in_wait_list</em> is 0, or if event objects in <em>event_wait_list</em> |
| are not valid events.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_COPY_OVERLAP"><code>CL_MEM_<wbr>COPY_<wbr>OVERLAP</code></a> if <em>src_buffer</em> and <em>dst_buffer</em> are the same buffer |
| or sub-buffer object and the source and destination regions overlap or |
| if <em>src_buffer</em> and <em>dst_buffer</em> are different sub-buffers of the same |
| associated buffer object and they overlap. |
| Refer to <a href="#check-copy-overlap">Checking for Memory Copy Overlap</a> for |
| details on how to determine if source and destination regions overlap.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MISALIGNED_SUB_BUFFER_OFFSET"><code>CL_MISALIGNED_<wbr>SUB_<wbr>BUFFER_<wbr>OFFSET</code></a> if <em>src_buffer</em> is a sub-buffer object |
| and the <em>offset</em> specified when the sub-buffer object is created is not |
| aligned to the <a href="#CL_DEVICE_MEM_BASE_ADDR_ALIGN"><code>CL_DEVICE_<wbr>MEM_<wbr>BASE_<wbr>ADDR_<wbr>ALIGN</code></a> value for the device associated |
| with <em>command_queue</em>. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MISALIGNED_SUB_BUFFER_OFFSET"><code>CL_MISALIGNED_<wbr>SUB_<wbr>BUFFER_<wbr>OFFSET</code></a> if <em>dst_buffer</em> is a sub-buffer object |
| and the <em>offset</em> specified when the sub-buffer object is created is not |
| aligned to the <a href="#CL_DEVICE_MEM_BASE_ADDR_ALIGN"><code>CL_DEVICE_<wbr>MEM_<wbr>BASE_<wbr>ADDR_<wbr>ALIGN</code></a> value for the device associated |
| with <em>command_queue</em>. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_OBJECT_ALLOCATION_FAILURE"><code>CL_MEM_<wbr>OBJECT_<wbr>ALLOCATION_<wbr>FAILURE</code></a> if there is a failure to allocate |
| memory for data store associated with <em>src_buffer</em> or <em>dst_buffer</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
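| <div class="paragraph"> |
| <p>The <a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> conditions on <em>region</em> and the pitches for one side of the copy |
| can be collected into a single predicate, sketched below. |
| The function name <code>rect_pitches_valid</code> is hypothetical. |
| One assumption is made beyond the text above: when the row pitch is passed as 0, the slice-pitch checks |
| use the defaulted row pitch <em>region</em>[0].</p> |
| </div> |

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL function).
 * Returns true if region and the pitches for one side of a rectangular
 * copy pass the pitch-related CL_INVALID_VALUE checks listed above. */
bool rect_pitches_valid(const size_t region[3],
                        size_t row_pitch, size_t slice_pitch)
{
    /* No element of region may be 0. */
    if (region[0] == 0 || region[1] == 0 || region[2] == 0)
        return false;

    /* A non-zero row pitch must cover at least one row of the region. */
    if (row_pitch != 0 && row_pitch < region[0])
        return false;

    /* Assumption: a row pitch of 0 is treated as region[0] below. */
    size_t effective_row_pitch = (row_pitch != 0) ? row_pitch : region[0];

    if (slice_pitch != 0) {
        /* A non-zero slice pitch must cover region[1] rows ... */
        if (slice_pitch < region[1] * effective_row_pitch)
            return false;
        /* ... and must be a multiple of the row pitch. */
        if (slice_pitch % effective_row_pitch != 0)
            return false;
    }
    return true;
}
```

| <div class="paragraph"> |
| <p>The predicate would be applied once with the source pitches and once with the destination pitches; |
| it does not cover the bounds, overlap, or same-buffer pitch-equality checks.</p> |
| </div> |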
| </div> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="_filling_buffer_objects"><a class="anchor" href="#_filling_buffer_objects"></a>5.2.3. Filling Buffer Objects</h4> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| Filling buffer objects is <a href="#unified-spec">missing before</a> version 1.2. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To enqueue a command to fill a buffer object with a pattern of a given |
| pattern size, call the function</p> |
| </div> |
| <div id="clEnqueueFillBuffer" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clEnqueueFillBuffer( |
| cl_command_queue command_queue, |
| cl_mem buffer, |
| <span class="directive">const</span> <span class="directive">void</span>* pattern, |
| size_t pattern_size, |
| size_t offset, |
| size_t size, |
| cl_uint num_events_in_wait_list, |
| <span class="directive">const</span> cl_event* event_wait_list, |
| cl_event* event);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clEnqueueFillBuffer"><strong>clEnqueueFillBuffer</strong></a> is <a href="#unified-spec">missing before</a> version 1.2. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>command_queue</em> refers to the host command-queue in which the fill command |
| will be queued. |
| The OpenCL context associated with <em>command_queue</em> and <em>buffer</em> must be the |
| same.</p> |
| </li> |
| <li> |
| <p><em>buffer</em> is a valid buffer object.</p> |
| </li> |
| <li> |
| <p><em>pattern</em> is a pointer to the data pattern of size <em>pattern_size</em> in bytes. |
| <em>pattern</em> will be used to fill the region in <em>buffer</em> that starts at <em>offset</em> and |
| is <em>size</em> bytes long. |
| The data pattern must be a scalar or vector integer or floating-point data |
| type supported by OpenCL as described in <a href="#scalar-data-types">Shared |
| Application Scalar Data Types</a> and <a href="#vector-data-types">Supported |
| Application Vector Data Types</a>. |
| For example, if <em>buffer</em> is to be filled with a pattern of <code>float4</code> values, |
| then <em>pattern</em> will be a pointer to a <code>cl_float4</code> value and <em>pattern_size</em> |
| will be <code>sizeof(cl_float4)</code>. |
| The maximum value of <em>pattern_size</em> is the size of the largest integer or |
| floating-point vector data type supported by the OpenCL device. |
| The memory associated with <em>pattern</em> can be reused or freed after the |
| function returns.</p> |
| </li> |
| <li> |
| <p><em>offset</em> is the location in bytes of the region being filled in <em>buffer</em> and |
| must be a multiple of <em>pattern_size</em>.</p> |
| </li> |
| <li> |
| <p><em>size</em> is the size in bytes of the region being filled in <em>buffer</em> and must be a |
| multiple of <em>pattern_size</em>.</p> |
| </li> |
| <li> |
| <p><em>event_wait_list</em> and <em>num_events_in_wait_list</em> specify events that need to |
| complete before this particular command can be executed. |
| If <em>event_wait_list</em> is <code>NULL</code>, then this particular command does not wait |
| on any event to complete. |
| If <em>event_wait_list</em> is <code>NULL</code>, <em>num_events_in_wait_list</em> must be 0. |
| If <em>event_wait_list</em> is not <code>NULL</code>, the list of events pointed to by |
| <em>event_wait_list</em> must be valid and <em>num_events_in_wait_list</em> must be |
| greater than 0. |
| The events specified in <em>event_wait_list</em> act as synchronization points. |
| The context associated with events in <em>event_wait_list</em> and <em>command_queue</em> |
| must be the same. |
| The memory associated with <em>event_wait_list</em> can be reused or freed after |
| the function returns.</p> |
| </li> |
| <li> |
| <p><em>event</em> returns an event object that identifies this command |
| and can be used to query or queue a wait for this command to complete. |
| If <em>event</em> is <code>NULL</code> or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If <em>event_wait_list</em> and <em>event</em> are not <code>NULL</code>, <em>event</em> must not refer |
| to an element of the <em>event_wait_list</em> array.</p> |
| </li> |
| </ul> |
| </div> |
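| <div class="paragraph"> |
| <p>The effect of a fill on the buffer contents can be modelled on the host as repeated copies of the pattern. |
| The function <code>host_fill</code> below is a hypothetical illustration on an ordinary byte array, |
| not the OpenCL implementation; it assumes the arguments already satisfy the constraints above |
| (<em>pattern_size</em> is non-zero, and <em>offset</em> and <em>size</em> are multiples of <em>pattern_size</em>).</p> |
| </div> |

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side model of a pattern fill.
 * Writes size / pattern_size consecutive copies of the pattern into
 * buffer, starting at byte offset `offset`.
 * Preconditions (not checked here): pattern_size != 0 and
 * size is a multiple of pattern_size. */
void host_fill(unsigned char *buffer, const void *pattern,
               size_t pattern_size, size_t offset, size_t size)
{
    for (size_t pos = 0; pos < size; pos += pattern_size)
        memcpy(buffer + offset + pos, pattern, pattern_size);
}
```

| <div class="paragraph"> |
| <p>Filling 8 bytes at offset 4 with a 4-byte pattern, for instance, writes two copies of the pattern |
| into bytes 4 to 11 and leaves bytes 0 to 3 untouched.</p> |
| </div> |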
| <div class="paragraph"> |
| <p><a href="#clEnqueueFillBuffer"><strong>clEnqueueFillBuffer</strong></a> ignores the usage information given by the |
| <code>cl_mem_<wbr>flags</code> argument value specified when <em>buffer</em> is created, that is, |
| whether the memory object can be read or written by a kernel and/or the host.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clEnqueueFillBuffer"><strong>clEnqueueFillBuffer</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_COMMAND_QUEUE"><code>CL_INVALID_<wbr>COMMAND_<wbr>QUEUE</code></a> if <em>command_queue</em> is not a valid host |
| command-queue.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if the context associated with <em>command_queue</em> and |
| <em>buffer</em> are not the same or if the context associated with |
| <em>command_queue</em> and events in <em>event_wait_list</em> are not the same.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_MEM_OBJECT"><code>CL_INVALID_<wbr>MEM_<wbr>OBJECT</code></a> if <em>buffer</em> is not a valid buffer object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>offset</em> or <em>offset</em> + <em>size</em> requires accessing |
| elements outside the <em>buffer</em> object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>pattern</em> is <code>NULL</code> or if <em>pattern_size</em> is 0 or if |
| <em>pattern_size</em> is not one of { 1, 2, 4, 8, 16, 32, 64, 128 }.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>offset</em> or <em>size</em> is not a multiple of |
| <em>pattern_size</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_EVENT_WAIT_LIST"><code>CL_INVALID_<wbr>EVENT_<wbr>WAIT_<wbr>LIST</code></a> if <em>event_wait_list</em> is <code>NULL</code> and |
| <em>num_events_in_wait_list</em> > 0, or <em>event_wait_list</em> is not <code>NULL</code> and |
| <em>num_events_in_wait_list</em> is 0, or if event objects in <em>event_wait_list</em> |
| are not valid events.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MISALIGNED_SUB_BUFFER_OFFSET"><code>CL_MISALIGNED_<wbr>SUB_<wbr>BUFFER_<wbr>OFFSET</code></a> if <em>buffer</em> is a sub-buffer object and |
| the <em>offset</em> specified when the sub-buffer object is created is not aligned to the |
| <a href="#CL_DEVICE_MEM_BASE_ADDR_ALIGN"><code>CL_DEVICE_<wbr>MEM_<wbr>BASE_<wbr>ADDR_<wbr>ALIGN</code></a> value for the device associated with <em>command_queue</em>. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_OBJECT_ALLOCATION_FAILURE"><code>CL_MEM_<wbr>OBJECT_<wbr>ALLOCATION_<wbr>FAILURE</code></a> if there is a failure to allocate |
| memory for data store associated with <em>buffer</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
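| <div class="paragraph"> |
| <p>The argument checks behind the <a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> cases above can be sketched as one predicate. |
| The function <code>fill_args_valid</code> and its <code>buffer_size</code> parameter are hypothetical, |
| introduced only to make the conditions concrete. |
| The allowed pattern sizes { 1, 2, 4, 8, 16, 32, 64, 128 } are tested as "a power of two no larger than 128".</p> |
| </div> |

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL function).
 * Returns true if the arguments pass the CL_INVALID_VALUE checks listed
 * for a fill of a buffer that is buffer_size bytes long. */
bool fill_args_valid(const void *pattern, size_t pattern_size,
                     size_t offset, size_t size, size_t buffer_size)
{
    if (pattern == NULL)
        return false;

    /* pattern_size must be one of 1, 2, 4, 8, 16, 32, 64, 128:
     * non-zero, at most 128, and a power of two. */
    if (pattern_size == 0 || pattern_size > 128 ||
        (pattern_size & (pattern_size - 1)) != 0)
        return false;

    /* offset and size must both be multiples of pattern_size. */
    if (offset % pattern_size != 0 || size % pattern_size != 0)
        return false;

    /* [offset, offset + size) must lie inside the buffer
     * (written to avoid overflow in offset + size). */
    if (offset > buffer_size || size > buffer_size - offset)
        return false;

    return true;
}
```

| <div class="paragraph"> |
| <p>With a 16-byte pattern, such as a <code>cl_float4</code> value, an offset of 32 and a size of 64 pass for a 128-byte buffer, |
| whereas an offset of 8 fails the multiple-of-<em>pattern_size</em> check.</p> |
| </div> |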
| </div> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="_mapping_buffer_objects"><a class="anchor" href="#_mapping_buffer_objects"></a>5.2.4. Mapping Buffer Objects</h4> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To enqueue a command to map a region of the buffer object given by <em>buffer</em> |
| into the host address space and return a pointer to this mapped region, |
| call the function</p> |
| </div> |
| <div id="clEnqueueMapBuffer" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++"><span class="directive">void</span>* clEnqueueMapBuffer( |
| cl_command_queue command_queue, |
| cl_mem buffer, |
| cl_bool blocking_map, |
| cl_map_flags map_flags, |
| size_t offset, |
| size_t size, |
| cl_uint num_events_in_wait_list, |
| <span class="directive">const</span> cl_event* event_wait_list, |
| cl_event* event, |
| cl_int* errcode_ret);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>command_queue</em> must be a valid host command-queue.</p> |
| </li> |
| <li> |
| <p><em>blocking_map</em> indicates if the map operation is <em>blocking</em> or |
| <em>non-blocking</em>.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>If <em>blocking_map</em> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> does not return until the |
| specified region in <em>buffer</em> is mapped into the host address space and the |
| application can access the contents of the mapped region using the pointer |
| returned by <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If <em>blocking_map</em> is <a href="#CL_FALSE"><code>CL_FALSE</code></a>, i.e. the map operation is non-blocking, the |
| pointer to the mapped region returned by <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> cannot be used |
| until the map command has completed. |
| The <em>event</em> argument returns an event object which can be used to query the |
| execution status of the map command. |
| When the map command is completed, the application can access the contents |
| of the mapped region using the pointer returned by <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a>.</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>map_flags</em> is a bit-field and is described in the |
| <a href="#memory-map-flags-table">Memory Map Flags</a> table.</p> |
| </li> |
| <li> |
| <p><em>buffer</em> is a valid buffer object. |
| The OpenCL context associated with <em>command_queue</em> and <em>buffer</em> must be the |
| same.</p> |
| </li> |
| <li> |
| <p><em>offset</em> and <em>size</em> are the offset in bytes and the size of the region in |
| the buffer object that is being mapped.</p> |
| </li> |
| <li> |
| <p><em>event_wait_list</em> and <em>num_events_in_wait_list</em> specify events that need to |
| complete before this particular command can be executed. |
| If <em>event_wait_list</em> is <code>NULL</code>, then this particular command does not wait |
| on any event to complete. |
| If <em>event_wait_list</em> is <code>NULL</code>, <em>num_events_in_wait_list</em> must be 0. |
| If <em>event_wait_list</em> is not <code>NULL</code>, the list of events pointed to by |
| <em>event_wait_list</em> must be valid and <em>num_events_in_wait_list</em> must be |
| greater than 0. |
| The events specified in <em>event_wait_list</em> act as synchronization points. |
| The context associated with events in <em>event_wait_list</em> and <em>command_queue</em> |
| must be the same. |
| The memory associated with <em>event_wait_list</em> can be reused or freed after |
| the function returns.</p> |
| </li> |
| <li> |
| <p><em>event</em> returns an event object that identifies this command |
| and can be used to query or queue a wait for this command to complete. |
| If <em>event</em> is <code>NULL</code> or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If <em>event_wait_list</em> and <em>event</em> are not <code>NULL</code>, <em>event</em> must not refer |
| to an element of the <em>event_wait_list</em> array.</p> |
| </li> |
| <li> |
| <p><em>errcode_ret</em> will return an appropriate error code. |
| If <em>errcode_ret</em> is <code>NULL</code>, no error code is returned.</p> |
| </li> |
| </ul> |
| </div> |
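| <div class="paragraph"> |
| <p>The pairing rule for <em>event_wait_list</em> and <em>num_events_in_wait_list</em> described above can be sketched as a small host-side check. |
| The helper name <code>wait_list_args_consistent</code> is hypothetical and not part of the OpenCL API; it only mirrors the <code>NULL</code>/count pairing and does not inspect the events themselves.</p> |
| </div> |

```c
#include <stddef.h>

/* Hypothetical helper, not part of the OpenCL API.
 * Returns 1 when the (event_wait_list, num_events_in_wait_list) pair follows
 * the pairing rule: NULL goes with a count of 0, non-NULL with a count > 0.
 * Returns 0 for the combinations the text describes as an invalid wait list.
 * The validity of the individual events is not checked here. */
static int wait_list_args_consistent(const void *event_wait_list,
                                     unsigned int num_events_in_wait_list)
{
    if (event_wait_list == NULL)
        return num_events_in_wait_list == 0;
    return num_events_in_wait_list > 0;
}
```

| <div class="paragraph"> |
| <p>A caller could run such a check before enqueueing, purely as a debugging aid; the implementation performs its own validation regardless.</p> |
| </div> |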
| <div class="paragraph"> |
| <p>On success, <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> returns a pointer to the mapped region |
| and <em>errcode_ret</em> is set to <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>A <code>NULL</code> pointer is returned otherwise with one of the following error |
| values returned in <em>errcode_ret</em>:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_COMMAND_QUEUE"><code>CL_INVALID_<wbr>COMMAND_<wbr>QUEUE</code></a> if <em>command_queue</em> is not a valid host |
| command-queue.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if the contexts associated with <em>command_queue</em> and |
| <em>buffer</em> are not the same, or if the contexts associated with |
| <em>command_queue</em> and the events in <em>event_wait_list</em> are not the same.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_MEM_OBJECT"><code>CL_INVALID_<wbr>MEM_<wbr>OBJECT</code></a> if <em>buffer</em> is not a valid buffer object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if the region being mapped, given by (<em>offset</em>, <em>size</em>), is |
| out of bounds, if <em>size</em> is 0, or if the values specified in <em>map_flags</em> |
| are not valid.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_EVENT_WAIT_LIST"><code>CL_INVALID_<wbr>EVENT_<wbr>WAIT_<wbr>LIST</code></a> if <em>event_wait_list</em> is <code>NULL</code> and |
| <em>num_events_in_wait_list</em> > 0, or <em>event_wait_list</em> is not <code>NULL</code> and |
| <em>num_events_in_wait_list</em> is 0, or if event objects in <em>event_wait_list</em> |
| are not valid events.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MISALIGNED_SUB_BUFFER_OFFSET"><code>CL_MISALIGNED_<wbr>SUB_<wbr>BUFFER_<wbr>OFFSET</code></a> if <em>buffer</em> is a sub-buffer object and |
| <em>offset</em> specified when the sub-buffer object is created is not aligned |
| to <a href="#CL_DEVICE_MEM_BASE_ADDR_ALIGN"><code>CL_DEVICE_<wbr>MEM_<wbr>BASE_<wbr>ADDR_<wbr>ALIGN</code></a> value for the device associated with |
| <em>command_queue</em>. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MAP_FAILURE"><code>CL_MAP_<wbr>FAILURE</code></a> if there is a failure to map the requested region into |
| the host address space. |
| This error cannot occur for buffer objects created with |
| <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> or <a href="#CL_MEM_ALLOC_HOST_PTR"><code>CL_MEM_<wbr>ALLOC_<wbr>HOST_<wbr>PTR</code></a>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST"><code>CL_EXEC_<wbr>STATUS_<wbr>ERROR_<wbr>FOR_<wbr>EVENTS_<wbr>IN_<wbr>WAIT_<wbr>LIST</code></a> if the map operation is |
| blocking and the execution status of any of the events in |
| <em>event_wait_list</em> is a negative integer value. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_OBJECT_ALLOCATION_FAILURE"><code>CL_MEM_<wbr>OBJECT_<wbr>ALLOCATION_<wbr>FAILURE</code></a> if there is a failure to allocate |
| memory for data store associated with <em>buffer</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if <em>buffer</em> has been created with |
| <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a> or <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a> and <a href="#CL_MAP_READ"><code>CL_MAP_<wbr>READ</code></a> is set |
| in <em>map_flags</em> or if <em>buffer</em> has been created with |
| <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a> or <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a> and <a href="#CL_MAP_WRITE"><code>CL_MAP_<wbr>WRITE</code></a> or |
| <a href="#CL_MAP_WRITE_INVALIDATE_REGION"><code>CL_MAP_<wbr>WRITE_<wbr>INVALIDATE_<wbr>REGION</code></a> is set in <em>map_flags</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if mapping would lead to overlapping regions being |
| mapped for writing.</p> |
| </li> |
| </ul> |
| </div> |
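| <div class="paragraph"> |
| <p>The bounds condition on (<em>offset</em>, <em>size</em>) in the error list above can be illustrated with a minimal sketch. |
| The helper <code>map_region_in_bounds</code> and its <code>buffer_size</code> parameter are assumptions made for illustration; the comparison is arranged so that <em>offset</em> + <em>size</em> is never computed and cannot overflow.</p> |
| </div> |

```c
#include <stddef.h>

/* Hypothetical helper, not part of the OpenCL API.
 * Returns 1 if (offset, size) names a non-empty region that lies entirely
 * inside a buffer of buffer_size bytes, and 0 otherwise. */
static int map_region_in_bounds(size_t buffer_size, size_t offset, size_t size)
{
    if (size == 0)
        return 0;               /* a zero-sized region is rejected */
    if (offset > buffer_size)
        return 0;               /* region starts past the end of the buffer */
    /* offset <= buffer_size here, so the subtraction cannot wrap around */
    return size <= buffer_size - offset;
}
```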
| <div class="paragraph"> |
| <p>The pointer returned maps a region starting at <em>offset</em> and is at least |
| <em>size</em> bytes in size. |
| The result of a memory access outside this region is undefined.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If the buffer object is created with <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> set in <em>mem_flags</em>, |
| the following will be true:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>The <em>host_ptr</em> specified in <a href="#clCreateBuffer"><strong>clCreateBuffer</strong></a> or <a href="#clCreateBufferWithProperties"><strong>clCreateBufferWithProperties</strong></a> |
| will contain the latest bits in the region being mapped when the |
| <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> command has completed.</p> |
| </li> |
| <li> |
| <p>The pointer value returned by <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> will be derived from |
| the <em>host_ptr</em> specified when the buffer object is created.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>Mapped buffer objects are unmapped using <a href="#clEnqueueUnmapMemObject"><strong>clEnqueueUnmapMemObject</strong></a>. |
| This is described in <a href="#unmapping-mapped-memory">Unmapping Mapped Memory |
| Objects</a>.</p> |
| </div> |
| <table id="memory-map-flags-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 14. List of supported map flag values</caption> |
| <colgroup> |
| <col style="width: 50%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Map Flags</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MAP_READ"></a><a href="#CL_MAP_READ"><code>CL_MAP_<wbr>READ</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag specifies that the region being mapped in the memory object is |
| being mapped for reading.</p> |
| <p class="tableblock"> The pointer returned by <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> (<a href="#clEnqueueMapImage"><strong>clEnqueueMapImage</strong></a>) is |
| guaranteed to contain the latest bits in the region being mapped when |
| the <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> (<a href="#clEnqueueMapImage"><strong>clEnqueueMapImage</strong></a>) command has completed.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MAP_WRITE"></a><a href="#CL_MAP_WRITE"><code>CL_MAP_<wbr>WRITE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag specifies that the region being mapped in the memory object is |
| being mapped for writing.</p> |
| <p class="tableblock"> The pointer returned by <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> (<a href="#clEnqueueMapImage"><strong>clEnqueueMapImage</strong></a>) is |
| guaranteed to contain the latest bits in the region being mapped when |
| the <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> (<a href="#clEnqueueMapImage"><strong>clEnqueueMapImage</strong></a>) command has completed.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MAP_WRITE_INVALIDATE_REGION"></a><a href="#CL_MAP_WRITE_INVALIDATE_REGION"><code>CL_MAP_<wbr>WRITE_<wbr>INVALIDATE_<wbr>REGION</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">This flag specifies that the region being mapped in the memory object is |
| being mapped for writing.</p> |
| <p class="tableblock"> The contents of the region being mapped are to be discarded. |
| This is typically the case when the region being mapped is overwritten |
| by the host. |
| This flag allows the implementation to no longer guarantee that the |
| pointer returned by <a href="#clEnqueueMapBuffer"><strong>clEnqueueMapBuffer</strong></a> (<a href="#clEnqueueMapImage"><strong>clEnqueueMapImage</strong></a>) contains |
| the latest bits in the region being mapped which can be a significant |
| performance enhancement.</p> |
| <p class="tableblock"> <a href="#CL_MAP_WRITE_INVALIDATE_REGION"><code>CL_MAP_<wbr>WRITE_<wbr>INVALIDATE_<wbr>REGION</code></a> is mutually exclusive with |
| <a href="#CL_MAP_READ"><code>CL_MAP_<wbr>READ</code></a> and <a href="#CL_MAP_WRITE"><code>CL_MAP_<wbr>WRITE</code></a>.</p></td> |
| </tr> |
| </tbody> |
| </table> |
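| <div class="paragraph"> |
| <p>The mutual-exclusion rule in the table above can be sketched as a bit-field check. |
| The <code>SKETCH_MAP_*</code> constants are stand-ins with illustrative bit values, and <code>map_flags_valid</code> is a hypothetical helper; real code would use the <code>CL_MAP_*</code> constants from the OpenCL headers. |
| The table does not say whether a value of 0 is acceptable, so this sketch does not reject it.</p> |
| </div> |

```c
/* Stand-in constants for this sketch only; the bit values are illustrative.
 * Real code should use the CL_MAP_* constants from the OpenCL headers. */
#define SKETCH_MAP_READ                    (1ul << 0)
#define SKETCH_MAP_WRITE                   (1ul << 1)
#define SKETCH_MAP_WRITE_INVALIDATE_REGION (1ul << 2)

/* Hypothetical helper: returns 1 if map_flags uses only the three flags
 * from the table and does not combine WRITE_INVALIDATE_REGION with READ
 * or WRITE; returns 0 otherwise. */
static int map_flags_valid(unsigned long map_flags)
{
    const unsigned long known = SKETCH_MAP_READ | SKETCH_MAP_WRITE |
                                SKETCH_MAP_WRITE_INVALIDATE_REGION;

    if (map_flags & ~known)
        return 0;   /* a bit outside the table is set */

    if ((map_flags & SKETCH_MAP_WRITE_INVALIDATE_REGION) &&
        (map_flags & (SKETCH_MAP_READ | SKETCH_MAP_WRITE)))
        return 0;   /* mutually exclusive flags combined */

    return 1;
}
```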
| </div> |
| </div> |
| </div> |
| </div> |
| <div class="sect2"> |
| <h3 id="_image_objects"><a class="anchor" href="#_image_objects"></a>5.3. Image Objects</h3> |
| <div class="paragraph"> |
| <p>An <em>image</em> object is used to store a one-, two- or three-dimensional |
| texture, frame-buffer or image. |
| The elements of an image object are selected from a list of predefined image |
| formats. |
| The minimum number of elements in a memory object is one.</p> |
| </div> |
| <div class="sect3"> |
| <h4 id="_creating_image_objects"><a class="anchor" href="#_creating_image_objects"></a>5.3.1. Creating Image Objects</h4> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>An <strong>image object</strong> may be created using the function</p> |
| </div> |
| <div id="clCreateImage" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_mem clCreateImage( |
| cl_context context, |
| cl_mem_flags flags, |
| <span class="directive">const</span> cl_image_format* image_format, |
| <span class="directive">const</span> cl_image_desc* image_desc, |
| <span class="directive">void</span>* host_ptr, |
| cl_int* errcode_ret);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clCreateImage"><strong>clCreateImage</strong></a> is <a href="#unified-spec">missing before</a> version 1.2. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="paragraph"> |
| <p>An <strong>image object</strong> may also be created with additional properties using the function</p> |
| </div> |
| <div id="clCreateImageWithProperties" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_mem clCreateImageWithProperties( |
| cl_context context, |
| <span class="directive">const</span> cl_mem_properties* properties, |
| cl_mem_flags flags, |
| <span class="directive">const</span> cl_image_format* image_format, |
| <span class="directive">const</span> cl_image_desc* image_desc, |
| <span class="directive">void</span>* host_ptr, |
| cl_int* errcode_ret);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clCreateImageWithProperties"><strong>clCreateImageWithProperties</strong></a> is <a href="#unified-spec">missing before</a> version 3.0. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> is a valid OpenCL context used to create the image object.</p> |
| </li> |
| <li> |
| <p><em>properties</em> is an optional list of properties for the image object and their corresponding values. |
| The list is terminated with the special property <code>0</code>. |
| If no properties are required, <em>properties</em> may be <code>NULL</code>. |
| OpenCL 3.0 does not define any optional properties for images.</p> |
| </li> |
| <li> |
| <p><em>flags</em> is a bit-field that is used to specify allocation and usage |
| information about the image memory object being created and is described in |
| the <a href="#memory-flags-table">supported memory flag values</a> table.</p> |
| </li> |
| <li> |
| <p><em>image_format</em> is a pointer to a structure that describes format properties |
| of the image to be allocated. |
| A 1D image buffer or 2D image can be created from a buffer by specifying a |
| buffer object in the <em>image_desc</em>→<em>mem_object</em>. |
| A 2D image can be created from another 2D image object by specifying an |
| image object in the <em>image_desc</em>→<em>mem_object</em>. |
| Refer to the <a href="#image-format-descriptor">Image Format Descriptor</a> section |
| for a detailed description of the image format descriptor.</p> |
| </li> |
| <li> |
| <p><em>image_desc</em> is a pointer to a structure that describes type and dimensions |
| of the image to be allocated. |
| Refer to the <a href="#image-descriptor">Image Descriptor</a> section for a detailed |
| description of the image descriptor.</p> |
| </li> |
| <li> |
| <p><em>host_ptr</em> is a pointer to the image data that may already be allocated by |
| the application. |
| Refer to the <a href="#host-ptr-buffer-size-table">table below</a> for a description |
| of how large the buffer that <em>host_ptr</em> points to must be.</p> |
| </li> |
| <li> |
| <p><em>errcode_ret</em> will return an appropriate error code. |
| If <em>errcode_ret</em> is <code>NULL</code>, no error code is returned.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>The alignment requirements for data stored in image objects are described |
| in <a href="#alignment-app-data-types">Alignment of Application Data Types</a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>For all image types except <a href="#CL_MEM_OBJECT_IMAGE1D_BUFFER"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE1D_<wbr>BUFFER</code></a>, if the value |
| specified for <em>flags</em> is 0, the default is used which is <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>For the <a href="#CL_MEM_OBJECT_IMAGE1D_BUFFER"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE1D_<wbr>BUFFER</code></a> image type, or an image created from |
| another memory object (image or buffer), if the <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a>, |
| <a href="#CL_MEM_READ_ONLY"><code>CL_MEM_<wbr>READ_<wbr>ONLY</code></a> or <a href="#CL_MEM_WRITE_ONLY"><code>CL_MEM_<wbr>WRITE_<wbr>ONLY</code></a> values are not specified in <em>flags</em>, |
| they are inherited from the corresponding memory access qualifiers associated |
| with <em>mem_object</em>. |
| The <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a>, <a href="#CL_MEM_ALLOC_HOST_PTR"><code>CL_MEM_<wbr>ALLOC_<wbr>HOST_<wbr>PTR</code></a> and <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> |
| values cannot be specified in <em>flags</em> but are inherited from the |
| corresponding memory access qualifiers associated with <em>mem_object</em>. |
| If <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> is specified in the memory access qualifier values |
| associated with <em>mem_object</em> it does not imply any additional copies when |
| the image is created from <em>mem_object</em>. |
| If the <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a>, <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a> or |
| <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a> values are not specified in <em>flags</em>, they are |
| inherited from the corresponding memory access qualifiers associated with |
| <em>mem_object</em>.</p> |
| </div> |
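| <div class="paragraph"> |
| <p>The inheritance rules in the preceding paragraph can be sketched as a merge of two bit-fields. |
| The <code>SK_MEM_*</code> constants are stand-ins with illustrative bit values and <code>effective_image_flags</code> is a hypothetical helper; the sketch assumes the caller has not set any of the host-pointer flags in <em>flags</em>, since the text says they cannot be specified there.</p> |
| </div> |

```c
/* Stand-in constants for this sketch only; the bit values are illustrative.
 * Real code should use the CL_MEM_* constants from the OpenCL headers. */
#define SK_MEM_READ_WRITE      (1ul << 0)
#define SK_MEM_WRITE_ONLY      (1ul << 1)
#define SK_MEM_READ_ONLY       (1ul << 2)
#define SK_MEM_USE_HOST_PTR    (1ul << 3)
#define SK_MEM_ALLOC_HOST_PTR  (1ul << 4)
#define SK_MEM_COPY_HOST_PTR   (1ul << 5)
#define SK_MEM_HOST_WRITE_ONLY (1ul << 7)
#define SK_MEM_HOST_READ_ONLY  (1ul << 8)
#define SK_MEM_HOST_NO_ACCESS  (1ul << 9)

/* Hypothetical helper: combines the flags passed at image creation with the
 * flags of the memory object the image is created from (parent_flags). */
static unsigned long effective_image_flags(unsigned long flags,
                                           unsigned long parent_flags)
{
    const unsigned long access =
        SK_MEM_READ_WRITE | SK_MEM_WRITE_ONLY | SK_MEM_READ_ONLY;
    const unsigned long host_access =
        SK_MEM_HOST_WRITE_ONLY | SK_MEM_HOST_READ_ONLY | SK_MEM_HOST_NO_ACCESS;
    const unsigned long host_ptr =
        SK_MEM_USE_HOST_PTR | SK_MEM_ALLOC_HOST_PTR | SK_MEM_COPY_HOST_PTR;
    unsigned long result = flags;

    /* Device access qualifiers are inherited only when none is specified. */
    if ((flags & access) == 0)
        result |= parent_flags & access;

    /* Host access qualifiers follow the same rule. */
    if ((flags & host_access) == 0)
        result |= parent_flags & host_access;

    /* Host-pointer flags cannot be specified and are always inherited. */
    result |= parent_flags & host_ptr;

    return result;
}
```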
| <div class="paragraph"> |
| <p>For a 3D image or 2D image array, the image data specified by <em>host_ptr</em> is |
| stored as a linear sequence of adjacent 2D image slices or 2D images |
| respectively. |
| Each 2D image is a linear sequence of adjacent scanlines. |
| Each scanline is a linear sequence of image elements.</p> |
| </div> |
| <div class="paragraph"> |
| <p>For a 2D image, the image data specified by <em>host_ptr</em> is stored as a linear |
| sequence of adjacent scanlines. |
| Each scanline is a linear sequence of image elements.</p> |
| </div> |
| <div class="paragraph"> |
| <p>For a 1D image array, the image data specified by <em>host_ptr</em> is stored as a |
| linear sequence of adjacent 1D images. |
| Each 1D image is stored as a single scanline which is a linear sequence of |
| adjacent elements.</p> |
| </div> |
| <div class="paragraph"> |
| <p>For a 1D image or 1D image buffer, the image data specified by <em>host_ptr</em> is |
| stored as a single scanline which is a linear sequence of adjacent elements.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Image elements are stored according to their image format as described in the |
| <a href="#image-format-descriptor">Image Format Descriptor</a> section.</p> |
| </div> |
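| <div class="paragraph"> |
| <p>The host memory layouts described above can be summarized by a single byte-offset computation. |
| This is a sketch under stated assumptions: <code>image_element_offset</code> is a hypothetical helper, each scanline is assumed to occupy <code>row_pitch</code> bytes and each slice (or array element) <code>slice_pitch</code> bytes, and all elements are assumed to be <code>element_size</code> bytes.</p> |
| </div> |

```c
#include <stddef.h>

/* Hypothetical helper, not part of the OpenCL API.
 * Returns the byte offset, relative to host_ptr, of the element at
 * column x, scanline y, slice z.
 *  - 3D image or 2D image array: z selects the slice or array element.
 *  - 2D image: pass z = 0.
 *  - 1D image array: pass y = 0 and use z as the array index.
 *  - 1D image or 1D image buffer: pass y = 0 and z = 0. */
static size_t image_element_offset(size_t x, size_t y, size_t z,
                                   size_t element_size,
                                   size_t row_pitch,
                                   size_t slice_pitch)
{
    return z * slice_pitch + y * row_pitch + x * element_size;
}
```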
| <div class="paragraph"> |
| <p><a href="#clCreateImage"><strong>clCreateImage</strong></a> and <a href="#clCreateImageWithProperties"><strong>clCreateImageWithProperties</strong></a> return a valid non-zero |
| image object and <em>errcode_ret</em> is set to <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the image object |
| is created successfully. |
| Otherwise, they return a <code>NULL</code> value with one of the following error values |
| returned in <em>errcode_ret</em>:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if <em>context</em> is not a valid context.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_PROPERTY"><code>CL_INVALID_<wbr>PROPERTY</code></a> if a property name in <em>properties</em> is not a |
| supported property name, if the value specified for a supported property |
| name is not valid, or if the same property name is specified more than |
| once.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if values specified in <em>flags</em> are not valid.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_IMAGE_FORMAT_DESCRIPTOR"><code>CL_INVALID_<wbr>IMAGE_<wbr>FORMAT_<wbr>DESCRIPTOR</code></a> if values specified in <em>image_format</em> |
| are not valid or if <em>image_format</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_IMAGE_FORMAT_DESCRIPTOR"><code>CL_INVALID_<wbr>IMAGE_<wbr>FORMAT_<wbr>DESCRIPTOR</code></a> if a 2D image is created from a |
| buffer and the row pitch and base address alignment do not follow the |
| rules described for creating a 2D image from a buffer.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_IMAGE_FORMAT_DESCRIPTOR"><code>CL_INVALID_<wbr>IMAGE_<wbr>FORMAT_<wbr>DESCRIPTOR</code></a> if a 2D image is created from a 2D |
| image object and the rules described above are not followed.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_IMAGE_DESCRIPTOR"><code>CL_INVALID_<wbr>IMAGE_<wbr>DESCRIPTOR</code></a> if values specified in <em>image_desc</em> are not |
| valid or if <em>image_desc</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_IMAGE_SIZE"><code>CL_INVALID_<wbr>IMAGE_<wbr>SIZE</code></a> if image dimensions specified in <em>image_desc</em> |
| exceed the maximum image dimensions described in the |
| <a href="#device-queries-table">Device Queries</a> table for all devices |
| in <em>context</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_HOST_PTR"><code>CL_INVALID_<wbr>HOST_<wbr>PTR</code></a> if <em>host_ptr</em> is <code>NULL</code> and <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> or |
| <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> are set in <em>flags</em> or if <em>host_ptr</em> is not <code>NULL</code> |
| but <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> or <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> are not set in <em>flags</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if an image is being created from another memory object |
| (buffer or image) under one of the following circumstances: 1) |
| <em>mem_object</em> was created with <a href="#CL_MEM_WRITE_ONLY"><code>CL_MEM_<wbr>WRITE_<wbr>ONLY</code></a> and <em>flags</em> specifies |
| <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a> or <a href="#CL_MEM_READ_ONLY"><code>CL_MEM_<wbr>READ_<wbr>ONLY</code></a>, 2) <em>mem_object</em> was created with |
| <a href="#CL_MEM_READ_ONLY"><code>CL_MEM_<wbr>READ_<wbr>ONLY</code></a> and <em>flags</em> specifies <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a> or |
| <a href="#CL_MEM_WRITE_ONLY"><code>CL_MEM_<wbr>WRITE_<wbr>ONLY</code></a>, 3) <em>flags</em> specifies <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> or |
| <a href="#CL_MEM_ALLOC_HOST_PTR"><code>CL_MEM_<wbr>ALLOC_<wbr>HOST_<wbr>PTR</code></a> or <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if an image is being created from another memory object |
| (buffer or image) and <em>mem_object</em> was created with |
| <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a> and <em>flags</em> specifies <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a>, or |
| if <em>mem_object</em> was created with <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a> and <em>flags</em> |
| specifies <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a>, or if <em>mem_object</em> was created with |
| <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a> and <em>flags</em> specifies <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a> or |
| <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_IMAGE_FORMAT_NOT_SUPPORTED"><code>CL_IMAGE_<wbr>FORMAT_<wbr>NOT_<wbr>SUPPORTED</code></a> if there are no devices in <em>context</em> that |
| support <em>image_format</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_OBJECT_ALLOCATION_FAILURE"><code>CL_MEM_<wbr>OBJECT_<wbr>ALLOCATION_<wbr>FAILURE</code></a> if there is a failure to allocate |
| memory for image object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if there are no devices in <em>context</em> that support |
| images (i.e. <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> specified in the |
| <a href="#device-queries-table">Device Queries</a> table is <a href="#CL_FALSE"><code>CL_FALSE</code></a>).</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
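| <div class="paragraph"> |
| <p>The <em>host_ptr</em> condition in the error list above pairs a pointer with two flags, which can be sketched as follows. |
| The <code>SK_MEM_*</code> constants are stand-ins with illustrative bit values and <code>host_ptr_args_valid</code> is a hypothetical helper; real code would use the <code>CL_MEM_*</code> constants from the OpenCL headers.</p> |
| </div> |

```c
#include <stddef.h>

/* Stand-in constants for this sketch only; the bit values are illustrative. */
#define SK_MEM_USE_HOST_PTR  (1ul << 3)
#define SK_MEM_COPY_HOST_PTR (1ul << 5)

/* Hypothetical helper: returns 1 when host_ptr and flags agree, i.e.
 * host_ptr is non-NULL exactly when USE_HOST_PTR or COPY_HOST_PTR is set.
 * Returns 0 for the two mismatched combinations described in the text. */
static int host_ptr_args_valid(unsigned long flags, const void *host_ptr)
{
    const int flags_need_ptr =
        (flags & (SK_MEM_USE_HOST_PTR | SK_MEM_COPY_HOST_PTR)) != 0;
    const int have_ptr = (host_ptr != NULL);

    return flags_need_ptr == have_ptr;
}
```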
| <table id="host-ptr-buffer-size-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 15. Required <em>host_ptr</em> buffer sizes for images</caption> |
| <colgroup> |
| <col style="width: 50%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Image Type</th> |
| <th class="tableblock halign-left valign-top">Size of buffer that <em>host_ptr</em> points to</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_OBJECT_IMAGE1D"></a><a href="#CL_MEM_OBJECT_IMAGE1D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE1D</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">≥ image_row_pitch</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_OBJECT_IMAGE1D_BUFFER"></a><a href="#CL_MEM_OBJECT_IMAGE1D_BUFFER"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE1D_<wbr>BUFFER</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">≥ image_row_pitch</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_OBJECT_IMAGE2D"></a><a href="#CL_MEM_OBJECT_IMAGE2D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE2D</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">≥ image_row_pitch × image_height</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_OBJECT_IMAGE3D"></a><a href="#CL_MEM_OBJECT_IMAGE3D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE3D</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">≥ image_slice_pitch × image_depth</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_OBJECT_IMAGE1D_ARRAY"></a><a href="#CL_MEM_OBJECT_IMAGE1D_ARRAY"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE1D_<wbr>ARRAY</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">≥ image_slice_pitch × image_array_size</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_MEM_OBJECT_IMAGE2D_ARRAY"></a><a href="#CL_MEM_OBJECT_IMAGE2D_ARRAY"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE2D_<wbr>ARRAY</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.2.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">≥ image_slice_pitch × image_array_size</p></td> |
| </tr> |
| </tbody> |
| </table> |
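| <div class="paragraph"> |
| <p>The sizes in the table above can be computed directly from the pitches and dimensions. |
| In this minimal sketch the enumeration <code>sketch_image_type</code> and the helper <code>min_host_ptr_size</code> are hypothetical stand-ins for the <code>CL_MEM_OBJECT_*</code> image types; the parameters are named after the quantities used in the table.</p> |
| </div> |

```c
#include <stddef.h>

/* Stand-in for the CL_MEM_OBJECT_* image types; for illustration only. */
enum sketch_image_type {
    SKETCH_IMAGE1D,
    SKETCH_IMAGE1D_BUFFER,
    SKETCH_IMAGE2D,
    SKETCH_IMAGE3D,
    SKETCH_IMAGE1D_ARRAY,
    SKETCH_IMAGE2D_ARRAY
};

/* Hypothetical helper: minimum number of bytes that host_ptr must point to,
 * following the table row for the given image type. Parameters that a row
 * does not use are ignored. */
static size_t min_host_ptr_size(enum sketch_image_type type,
                                size_t image_row_pitch,
                                size_t image_slice_pitch,
                                size_t image_height,
                                size_t image_depth,
                                size_t image_array_size)
{
    switch (type) {
    case SKETCH_IMAGE1D:
    case SKETCH_IMAGE1D_BUFFER:
        return image_row_pitch;
    case SKETCH_IMAGE2D:
        return image_row_pitch * image_height;
    case SKETCH_IMAGE3D:
        return image_slice_pitch * image_depth;
    case SKETCH_IMAGE1D_ARRAY:
    case SKETCH_IMAGE2D_ARRAY:
        return image_slice_pitch * image_array_size;
    }
    return 0; /* not reached for the enumerators above */
}
```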
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>A <strong>2D image</strong> object can be created using the following function</p> |
| </div> |
| <div id="clCreateImage2D" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_mem clCreateImage2D( |
| cl_context context, |
| cl_mem_flags flags, |
| <span class="directive">const</span> cl_image_format* image_format, |
| size_t image_width, |
| size_t image_height, |
| size_t image_row_pitch, |
| <span class="directive">void</span>* host_ptr, |
| cl_int* errcode_ret);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clCreateImage2D"><strong>clCreateImage2D</strong></a> is <a href="#unified-spec">deprecated by</a> version 1.2. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> is a valid OpenCL context on which the image object is to be |
| created.</p> |
| </li> |
| <li> |
| <p><em>flags</em> is a bit-field that is used to specify allocation and usage |
| information about the image memory object being created and is described in |
| the <a href="#memory-flags-table">supported memory flag values</a> table. |
| If the value specified for <em>flags</em> is 0, the default is used which is |
| <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a>.</p> |
| </li> |
| <li> |
| <p><em>image_format</em> is a pointer to a structure that describes format properties |
| of the image to be allocated. |
| Refer to the <a href="#image-format-descriptor">Image Format Descriptor</a> section |
| for a detailed description of the image format descriptor.</p> |
| </li> |
| <li> |
| <p><em>image_width</em> and <em>image_height</em> are the width and height of the image in |
| pixels. |
| These must be values greater than or equal to 1.</p> |
| </li> |
| <li> |
| <p><em>image_row_pitch</em> is the scan-line pitch in bytes. |
| This must be 0 if <em>host_ptr</em> is <code>NULL</code> and can be either 0 or ≥ |
| <em>image_width</em> × size of element in bytes if <em>host_ptr</em> is not <code>NULL</code>. |
| If <em>host_ptr</em> is not <code>NULL</code> and <em>image_row_pitch</em> is 0, <em>image_row_pitch</em> |
| is calculated as <em>image_width</em> × size of element in bytes. |
| If <em>image_row_pitch</em> is not 0, it must be a multiple of the image element |
| size in bytes.</p> |
| </li> |
| <li> |
| <p><em>host_ptr</em> is a pointer to the image data that may already be allocated by |
| the application. |
| Refer to the <a href="#CL_MEM_OBJECT_IMAGE2D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE2D</code></a> entry in the |
| <a href="#host-ptr-buffer-size-table">required <em>host_ptr</em> buffer size table</a> for a |
| description of how large the buffer that <em>host_ptr</em> points to must be. |
| The image data specified by <em>host_ptr</em> is stored as a linear sequence of |
| adjacent scanlines. |
| Each scanline is a linear sequence of image elements. |
| Image elements are stored according to their image format as described in |
| the <a href="#image-format-descriptor">Image Format Descriptor</a> section.</p> |
| </li> |
| <li> |
| <p><em>errcode_ret</em> will return an appropriate error code. |
| If <em>errcode_ret</em> is <code>NULL</code>, no error code is returned.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clCreateImage2D"><strong>clCreateImage2D</strong></a> returns a valid non-zero image object and sets |
| <em>errcode_ret</em> to <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the image object is created |
| successfully. |
| Otherwise, it returns a <code>NULL</code> value with one of the following error values |
| returned in <em>errcode_ret</em>:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if <em>context</em> is not a valid context.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if values specified in <em>flags</em> are not valid.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_IMAGE_FORMAT_DESCRIPTOR"><code>CL_INVALID_<wbr>IMAGE_<wbr>FORMAT_<wbr>DESCRIPTOR</code></a> if values specified in <em>image_format</em> |
| are not valid or if <em>image_format</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_IMAGE_SIZE"><code>CL_INVALID_<wbr>IMAGE_<wbr>SIZE</code></a> if <em>image_width</em> or <em>image_height</em> are 0 or if they |
| exceed the maximum values specified in <a href="#CL_DEVICE_IMAGE2D_MAX_WIDTH"><code>CL_DEVICE_<wbr>IMAGE2D_<wbr>MAX_<wbr>WIDTH</code></a> or |
| <a href="#CL_DEVICE_IMAGE2D_MAX_HEIGHT"><code>CL_DEVICE_<wbr>IMAGE2D_<wbr>MAX_<wbr>HEIGHT</code></a> respectively for all devices in <em>context</em> or |
| if values specified by <em>image_row_pitch</em> do not follow rules described in the |
| argument description above.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_HOST_PTR"><code>CL_INVALID_<wbr>HOST_<wbr>PTR</code></a> if <em>host_ptr</em> is <code>NULL</code> and <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> or |
| <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> are set in <em>flags</em> or if <em>host_ptr</em> is not <code>NULL</code> |
| but <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> or <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> are not set in <em>flags</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_IMAGE_FORMAT_NOT_SUPPORTED"><code>CL_IMAGE_<wbr>FORMAT_<wbr>NOT_<wbr>SUPPORTED</code></a> if there are no devices in <em>context</em> that |
| support <em>image_format</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_OBJECT_ALLOCATION_FAILURE"><code>CL_MEM_<wbr>OBJECT_<wbr>ALLOCATION_<wbr>FAILURE</code></a> if there is a failure to allocate |
| memory for the image object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if there are no devices in <em>context</em> that support |
| images (i.e. <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> specified in the |
| <a href="#device-queries-table">Device Queries</a> table is <a href="#CL_FALSE"><code>CL_FALSE</code></a>).</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
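| <div class="paragraph"> |
| <p>The <em>image_row_pitch</em> rules in the argument description above can be sketched as a small host-side check. |
| This is a minimal illustration and not part of the OpenCL API: the helper name <code>resolve_row_pitch</code> and the |
| <code>element_size</code> parameter (size of one image element in bytes) are assumptions introduced here, and overflow |
| of the <em>image_width</em> × element size product is not handled.</p> |
| </div> |

```c
#include <stddef.h>

/* Hypothetical helper (not an OpenCL function): applies the image_row_pitch
 * rules for a 2D image. On success it returns 1 and stores the row pitch the
 * host data is expected to use in *resolved; on a rule violation it returns 0.
 * When host_ptr is NULL the pitch must be 0 and *resolved is left at 0,
 * because there is no host data whose layout needs describing. */
int resolve_row_pitch(size_t image_width, size_t element_size,
                      size_t image_row_pitch, const void *host_ptr,
                      size_t *resolved)
{
    size_t tight;

    if (image_width == 0 || element_size == 0 || resolved == NULL)
        return 0;

    if (host_ptr == NULL) {
        /* Rule: the pitch must be 0 if host_ptr is NULL. */
        if (image_row_pitch != 0)
            return 0;
        *resolved = 0;
        return 1;
    }

    /* Tightly packed scanline: image_width elements with no padding. */
    tight = image_width * element_size;

    if (image_row_pitch == 0) {
        /* Rule: a pitch of 0 is calculated as width x element size. */
        *resolved = tight;
        return 1;
    }

    /* Rule: a non-zero pitch must cover a full scanline and be a
     * multiple of the element size. */
    if (image_row_pitch < tight || image_row_pitch % element_size != 0)
        return 0;

    *resolved = image_row_pitch;
    return 1;
}
```

| <div class="paragraph"> |
| <p>Under these assumptions, a 640-pixel-wide image with 4-byte elements and a non-<code>NULL</code> <em>host_ptr</em> |
| resolves a pitch of 0 to 2560 bytes, while a pitch of 2562 is rejected because it is not a multiple of the element size.</p> |
| </div> |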
| </div> |
| </div> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>A <strong>3D image</strong> object can be created using the following function:</p> |
| </div> |
| <div id="clCreateImage3D" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_mem clCreateImage3D( |
| cl_context context, |
| cl_mem_flags flags, |
| <span class="directive">const</span> cl_image_format* image_format, |
| size_t image_width, |
| size_t image_height, |
| size_t image_depth, |
| size_t image_row_pitch, |
| size_t image_slice_pitch, |
| <span class="directive">void</span>* host_ptr, |
| cl_int* errcode_ret);</code></pre> |
| </div> |
| </div> |
| <div class="admonitionblock important"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-important" title="Important"></i> |
| </td> |
| <td class="content"> |
| <a href="#clCreateImage3D"><strong>clCreateImage3D</strong></a> is <a href="#unified-spec">deprecated by</a> version 1.2. |
| </td> |
| </tr> |
| </table> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> is a valid OpenCL context on which the image object is to be |
| created.</p> |
| </li> |
| <li> |
| <p><em>flags</em> is a bit-field that is used to specify allocation and usage |
| information about the image memory object being created and is described in |
| the <a href="#memory-flags-table">supported memory flag values</a> table. |
| If the value specified for <em>flags</em> is 0, the default is used which is |
| <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a>.</p> |
| </li> |
| <li> |
| <p><em>image_format</em> is a pointer to a structure that describes format properties |
| of the image to be allocated. |
| Refer to the <a href="#image-format-descriptor">Image Format Descriptor</a> section |
| for a detailed description of the image format descriptor.</p> |
| </li> |
| <li> |
| <p><em>image_width</em> and <em>image_height</em> are the width and height of the image in |
| pixels. |
| These must be values greater than or equal to 1.</p> |
| </li> |
| <li> |
| <p><em>image_depth</em> is the depth of the image in pixels. For <a href="#clCreateImage3D"><strong>clCreateImage3D</strong></a>, |
| this must be a value > 1.</p> |
| </li> |
| <li> |
| <p><em>image_row_pitch</em> is the scan-line pitch in bytes. |
| This must be 0 if <em>host_ptr</em> is <code>NULL</code> and can be either 0 or ≥ |
| <em>image_width</em> × size of element in bytes if <em>host_ptr</em> is not <code>NULL</code>. |
| If <em>host_ptr</em> is not <code>NULL</code> and <em>image_row_pitch</em> is 0, <em>image_row_pitch</em> |
| is calculated as <em>image_width</em> × size of element in bytes. |
| If <em>image_row_pitch</em> is not 0, it must be a multiple of the image element |
| size in bytes.</p> |
| </li> |
| <li> |
| <p><em>image_slice_pitch</em> is the size in bytes of each 2D slice in the 3D image. |
| This must be 0 if <em>host_ptr</em> is <code>NULL</code> and can be 0 or ≥ |
| <em>image_row_pitch</em> × <em>image_height</em> if <em>host_ptr</em> is not <code>NULL</code>. |
| If <em>host_ptr</em> is not <code>NULL</code> and <em>image_slice_pitch</em> is 0, |
| <em>image_slice_pitch</em> is calculated as <em>image_row_pitch</em> × |
| <em>image_height</em>. |
| If <em>image_slice_pitch</em> is not 0, it must be a multiple of the |
| <em>image_row_pitch</em>.</p> |
| </li> |
| <li> |
| <p><em>host_ptr</em> is a pointer to the image data that may already be allocated by |
| the application. |
| Refer to the <a href="#CL_MEM_OBJECT_IMAGE3D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE3D</code></a> entry in the |
| <a href="#host-ptr-buffer-size-table">required <em>host_ptr</em> buffer size table</a> for a |
| description of how large the buffer that <em>host_ptr</em> points to must be. |
| The image data specified by <em>host_ptr</em> is stored as a linear sequence of |
| adjacent 2D slices. |
| Each 2D slice is a linear sequence of adjacent scanlines. |
| Each scanline is a linear sequence of image elements. |
| Image elements are stored according to their image format as described in |
| the <a href="#image-format-descriptor">Image Format Descriptor</a> section.</p> |
| </li> |
| <li> |
| <p><em>errcode_ret</em> will return an appropriate error code. |
| If <em>errcode_ret</em> is <code>NULL</code>, no error code is returned.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clCreateImage3D"><strong>clCreateImage3D</strong></a> returns a valid non-zero image object and sets |
| <em>errcode_ret</em> to <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the image object is created |
| successfully. |
| Otherwise, it returns a <code>NULL</code> value with one of the following error values |
| returned in <em>errcode_ret</em>:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if <em>context</em> is not a valid context.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if values specified in <em>flags</em> are not valid.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_IMAGE_FORMAT_DESCRIPTOR"><code>CL_INVALID_<wbr>IMAGE_<wbr>FORMAT_<wbr>DESCRIPTOR</code></a> if values specified in <em>image_format</em> |
| are not valid or if <em>image_format</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_IMAGE_SIZE"><code>CL_INVALID_<wbr>IMAGE_<wbr>SIZE</code></a> if <em>image_width</em> or <em>image_height</em> are 0 or if |
| <em>image_depth</em> ≤ 1, or if they exceed the maximum values specified in |
| <a href="#CL_DEVICE_IMAGE3D_MAX_WIDTH"><code>CL_DEVICE_<wbr>IMAGE3D_<wbr>MAX_<wbr>WIDTH</code></a>, <a href="#CL_DEVICE_IMAGE3D_MAX_HEIGHT"><code>CL_DEVICE_<wbr>IMAGE3D_<wbr>MAX_<wbr>HEIGHT</code></a> or |
| <a href="#CL_DEVICE_IMAGE3D_MAX_DEPTH"><code>CL_DEVICE_<wbr>IMAGE3D_<wbr>MAX_<wbr>DEPTH</code></a> respectively for all devices in <em>context</em>, or |
| if values specified by <em>image_row_pitch</em> and <em>image_slice_pitch</em> do not |
| follow rules described in the argument description above.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_HOST_PTR"><code>CL_INVALID_<wbr>HOST_<wbr>PTR</code></a> if <em>host_ptr</em> is <code>NULL</code> and <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> or |
| <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> are set in <em>flags</em> or if <em>host_ptr</em> is not <code>NULL</code> |
| but <a href="#CL_MEM_COPY_HOST_PTR"><code>CL_MEM_<wbr>COPY_<wbr>HOST_<wbr>PTR</code></a> or <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a> are not set in <em>flags</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_IMAGE_FORMAT_NOT_SUPPORTED"><code>CL_IMAGE_<wbr>FORMAT_<wbr>NOT_<wbr>SUPPORTED</code></a> if there are no devices in <em>context</em> that |
| support <em>image_format</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_OBJECT_ALLOCATION_FAILURE"><code>CL_MEM_<wbr>OBJECT_<wbr>ALLOCATION_<wbr>FAILURE</code></a> if there is a failure to allocate |
| memory for the image object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if there are no devices in <em>context</em> that support |
| images (i.e. <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> specified in the |
| <a href="#device-queries-table">Device Queries</a> table is <a href="#CL_FALSE"><code>CL_FALSE</code></a>).</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
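| <div class="paragraph"> |
| <p>For a non-<code>NULL</code> <em>host_ptr</em>, the <em>image_slice_pitch</em> rules and the slice / scanline / element |
| ordering described above can be sketched as follows. |
| The helper names <code>resolve_slice_pitch</code> and <code>element_offset</code> and the <code>element_size</code> |
| parameter are assumptions made for illustration, not part of the OpenCL API; the sketch expects an already resolved, |
| non-zero row pitch and does not guard against arithmetic overflow.</p> |
| </div> |

```c
#include <stddef.h>

/* Hypothetical helper: resolves image_slice_pitch for host data
 * (host_ptr != NULL). row_pitch must already be resolved and non-zero.
 * Returns the slice pitch in bytes, or 0 if a rule is violated. */
size_t resolve_slice_pitch(size_t row_pitch, size_t image_height,
                           size_t image_slice_pitch)
{
    size_t tight;

    if (row_pitch == 0 || image_height == 0)
        return 0;

    /* Tightly packed slice: image_height scanlines of row_pitch bytes. */
    tight = row_pitch * image_height;

    if (image_slice_pitch == 0)
        return tight;                 /* calculated as row pitch x height */
    if (image_slice_pitch < tight)
        return 0;                     /* too small to hold one 2D slice */
    if (image_slice_pitch % row_pitch != 0)
        return 0;                     /* must be a multiple of the row pitch */
    return image_slice_pitch;
}

/* Hypothetical helper: byte offset of element (x, y, z) inside the host
 * data, following slices -> scanlines -> elements ordering. */
size_t element_offset(size_t x, size_t y, size_t z, size_t element_size,
                      size_t row_pitch, size_t slice_pitch)
{
    return z * slice_pitch + y * row_pitch + x * element_size;
}
```

| <div class="paragraph"> |
| <p>With a row pitch of 256 bytes and a height of 4, a slice pitch of 0 resolves to 1024 bytes, and the element at |
| (3, 2, 1) with 4-byte elements then starts at byte offset 1548.</p> |
| </div> |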
| </div> |
| </div> |
| <div class="sect4"> |
| <h5 id="image-format-descriptor"><a class="anchor" href="#image-format-descriptor"></a>5.3.1.1. Image Format Descriptor</h5> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>The <a href="#cl_image_format"><code>cl_image_<wbr>format</code></a> image format descriptor structure describes an image |
| format, and is defined as:</p> |
| </div> |
| <div id="cl_image_format" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++"><span class="keyword">typedef</span> <span class="keyword">struct</span> cl_image_format { |
| cl_channel_order image_channel_order; |
| cl_channel_type image_channel_data_type; |
| } cl_image_format;</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><code>image_channel_order</code> specifies the number of channels and the channel |
| layout i.e. the memory layout in which channels are stored in the image. |
| Valid values are described in the <a href="#image-channel-order-table">Image Channel |
| Order</a> table.</p> |
| </li> |
| <li> |
| <p><code>image_channel_data_type</code> describes the size of the channel data type. |
| The list of supported values is described in the |
| <a href="#image-channel-data-types-table">Image Channel Data Types</a> table. |
| The number of bits per element determined by the <code>image_channel_data_type</code> |
| and <code>image_channel_order</code> must be a power of two.</p> |
| </li> |
| </ul> |
| </div> |
| <table id="image-channel-order-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 16. List of supported Image Channel Order Values</caption> |
| <colgroup> |
| <col style="width: 50%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Image Channel Order</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_R"></a><a href="#CL_R"><code>CL_R</code></a>, <a id="CL_A"></a><a href="#CL_A"><code>CL_A</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Single channel image formats where the single channel represents a <code>RED</code> or <code>ALPHA</code> component.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_DEPTH"></a><a href="#CL_DEPTH"><code>CL_DEPTH</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A single channel image format where the single channel represents a <code>DEPTH</code> component.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_LUMINANCE"></a><a href="#CL_LUMINANCE"><code>CL_LUMINANCE</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A single channel image format where the single channel represents a <code>LUMINANCE</code> value. |
| The <code>LUMINANCE</code> value is replicated into the <code>RED</code>, <code>GREEN</code>, and <code>BLUE</code> components.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_INTENSITY"></a><a href="#CL_INTENSITY"><code>CL_INTENSITY</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A single channel image format where the single channel represents an <code>INTENSITY</code> value. |
| The <code>INTENSITY</code> value is replicated into the <code>RED</code>, <code>GREEN</code>, <code>BLUE</code>, and <code>ALPHA</code> components.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_RG"></a><a href="#CL_RG"><code>CL_RG</code></a>, <a id="CL_RA"></a><a href="#CL_RA"><code>CL_RA</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Two channel image formats. |
| The first channel always represents a <code>RED</code> component. |
| The second channel represents a <code>GREEN</code> component or an <code>ALPHA</code> component.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_Rx"></a><a href="#CL_Rx"><code>CL_Rx</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.1.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A two channel image format, where the first channel represents a <code>RED</code> component and the second channel is ignored.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_RGB"></a><a href="#CL_RGB"><code>CL_RGB</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A three channel image format, where the three channels represent <code>RED</code>, <code>GREEN</code>, and <code>BLUE</code> components.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_RGx"></a><a href="#CL_RGx"><code>CL_RGx</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.1.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A three channel image format, where the first two channels represent <code>RED</code> and <code>GREEN</code> components and the third channel is ignored.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_RGBA"></a><a href="#CL_RGBA"><code>CL_RGBA</code></a>, <a id="CL_ARGB"></a><a href="#CL_ARGB"><code>CL_ARGB</code></a>, <a id="CL_BGRA"></a><a href="#CL_BGRA"><code>CL_BGRA</code></a>, <a id="CL_ABGR"></a><a href="#CL_ABGR"><code>CL_ABGR</code></a></p> |
| <p class="tableblock"> <a href="#CL_ABGR"><code>CL_ABGR</code></a> is <a href="#unified-spec">missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Four channel image formats, where the four channels represent <code>RED</code>, <code>GREEN</code>, <code>BLUE</code>, and <code>ALPHA</code> components.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_RGBx"></a><a href="#CL_RGBx"><code>CL_RGBx</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 1.1.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A four channel image format, where the first three channels represent <code>RED</code>, <code>GREEN</code>, and <code>BLUE</code> components and the fourth channel is ignored.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_sRGB"></a><a href="#CL_sRGB"><code>CL_sRGB</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A three channel image format, where the three channels represent <code>RED</code>, <code>GREEN</code>, and <code>BLUE</code> components in the sRGB color space.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_sRGBA"></a><a href="#CL_sRGBA"><code>CL_sRGBA</code></a>, <a id="CL_sBGRA"></a><a href="#CL_sBGRA"><code>CL_sBGRA</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Four channel image formats, where the first three channels represent <code>RED</code>, <code>GREEN</code>, and <code>BLUE</code> components in the sRGB color space. |
| The fourth channel represents an <code>ALPHA</code> component.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_sRGBx"></a><a href="#CL_sRGBx"><code>CL_sRGBx</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.0.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A four channel image format, where the first three channels represent <code>RED</code>, <code>GREEN</code>, and <code>BLUE</code> components in the sRGB color space. |
| The fourth channel is ignored.</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <table id="image-channel-data-types-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 17. List of supported Image Channel Data Types</caption> |
| <colgroup> |
| <col style="width: 50%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Image Channel Data Type</th> |
| <th class="tableblock halign-left valign-top">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_SNORM_INT8"></a><a href="#CL_SNORM_INT8"><code>CL_SNORM_<wbr>INT8</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Each channel component is a normalized signed 8-bit integer value</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_SNORM_INT16"></a><a href="#CL_SNORM_INT16"><code>CL_SNORM_<wbr>INT16</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Each channel component is a normalized signed 16-bit integer value</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_UNORM_INT8"></a><a href="#CL_UNORM_INT8"><code>CL_UNORM_<wbr>INT8</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Each channel component is a normalized unsigned 8-bit integer value</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_UNORM_INT16"></a><a href="#CL_UNORM_INT16"><code>CL_UNORM_<wbr>INT16</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Each channel component is a normalized unsigned 16-bit integer value</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_UNORM_SHORT_565"></a><a href="#CL_UNORM_SHORT_565"><code>CL_UNORM_<wbr>SHORT_<wbr>565</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Represents a normalized 5-6-5 3-channel RGB image. |
| The channel order must be <a href="#CL_RGB"><code>CL_RGB</code></a> or <a href="#CL_RGBx"><code>CL_RGBx</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_UNORM_SHORT_555"></a><a href="#CL_UNORM_SHORT_555"><code>CL_UNORM_<wbr>SHORT_<wbr>555</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Represents a normalized x-5-5-5 4-channel xRGB image. |
| The channel order must be <a href="#CL_RGB"><code>CL_RGB</code></a> or <a href="#CL_RGBx"><code>CL_RGBx</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_UNORM_INT_101010"></a><a href="#CL_UNORM_INT_101010"><code>CL_UNORM_<wbr>INT_<wbr>101010</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Represents a normalized x-10-10-10 4-channel xRGB image. |
| The channel order must be <a href="#CL_RGB"><code>CL_RGB</code></a> or <a href="#CL_RGBx"><code>CL_RGBx</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_UNORM_INT_101010_2"></a><a href="#CL_UNORM_INT_101010_2"><code>CL_UNORM_<wbr>INT_<wbr>101010_<wbr>2</code></a></p> |
| <p class="tableblock"><a href="#unified-spec">Missing before</a> version 2.1.</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Represents a normalized 10-10-10-2 four-channel RGBA image. |
| The channel order must be <a href="#CL_RGBA"><code>CL_RGBA</code></a>.</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_SIGNED_INT8"></a><a href="#CL_SIGNED_INT8"><code>CL_SIGNED_<wbr>INT8</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Each channel component is an unnormalized signed 8-bit integer value</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_SIGNED_INT16"></a><a href="#CL_SIGNED_INT16"><code>CL_SIGNED_<wbr>INT16</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Each channel component is an unnormalized signed 16-bit integer value</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_SIGNED_INT32"></a><a href="#CL_SIGNED_INT32"><code>CL_SIGNED_<wbr>INT32</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Each channel component is an unnormalized signed 32-bit integer value</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_UNSIGNED_INT8"></a><a href="#CL_UNSIGNED_INT8"><code>CL_UNSIGNED_<wbr>INT8</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Each channel component is an unnormalized unsigned 8-bit integer value</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_UNSIGNED_INT16"></a><a href="#CL_UNSIGNED_INT16"><code>CL_UNSIGNED_<wbr>INT16</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Each channel component is an unnormalized unsigned 16-bit integer value</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_UNSIGNED_INT32"></a><a href="#CL_UNSIGNED_INT32"><code>CL_UNSIGNED_<wbr>INT32</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Each channel component is an unnormalized unsigned 32-bit integer value</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_HALF_FLOAT"></a><a href="#CL_HALF_FLOAT"><code>CL_HALF_<wbr>FLOAT</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Each channel component is a 16-bit half-float value</p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a id="CL_FLOAT"></a><a href="#CL_FLOAT"><code>CL_FLOAT</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">Each channel component is a single precision floating-point value</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="paragraph"> |
| <p>For example, to specify a normalized unsigned 8-bit-per-channel RGBA image, set |
| <code>image_channel_order</code> = <a href="#CL_RGBA"><code>CL_RGBA</code></a> and <code>image_channel_data_type</code> = |
| <a href="#CL_UNORM_INT8"><code>CL_UNORM_<wbr>INT8</code></a>. |
| The memory layout of this image format is described below:</p> |
| </div> |
| <table class="tableblock frame-all grid-all" style="width: 60%;"> |
| <colgroup> |
| <col style="width: 10%;"> |
| <col style="width: 10%;"> |
| <col style="width: 10%;"> |
| <col style="width: 10%;"> |
| <col style="width: 60%;"> |
| </colgroup> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">R</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">G</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">B</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">…​</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="paragraph"> |
| <p>with the corresponding byte offsets</p> |
| </div> |
| <table class="tableblock frame-all grid-all" style="width: 60%;"> |
| <colgroup> |
| <col style="width: 10%;"> |
| <col style="width: 10%;"> |
| <col style="width: 10%;"> |
| <col style="width: 10%;"> |
| <col style="width: 60%;"> |
| </colgroup> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">0</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">2</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">3</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">…​</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="paragraph"> |
| <p>Similarly, if <code>image_channel_order</code> = <a href="#CL_RGBA"><code>CL_RGBA</code></a> and <code>image_channel_data_type</code> = |
| <a href="#CL_SIGNED_INT16"><code>CL_SIGNED_<wbr>INT16</code></a>, the memory layout of this image format is described below:</p> |
| </div> |
| <table class="tableblock frame-all grid-all" style="width: 60%;"> |
| <colgroup> |
| <col style="width: 10%;"> |
| <col style="width: 10%;"> |
| <col style="width: 10%;"> |
| <col style="width: 10%;"> |
| <col style="width: 60%;"> |
| </colgroup> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">R</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">G</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">B</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">A</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">…​</p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="paragraph"> |
| <p>with the corresponding byte offsets</p> |
| </div> |
| <table class="tableblock frame-all grid-all" style="width: 60%;"> |
| <colgroup> |
| <col style="width: 10%;"> |
| <col style="width: 10%;"> |
| <col style="width: 10%;"> |
| <col style="width: 10%;"> |
| <col style="width: 60%;"> |
| </colgroup> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">0</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">2</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">4</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">6</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">…​</p></td> |
| </tr> |
| </tbody> |
| </table> |
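| <div class="paragraph"> |
| <p>The byte offsets in the two layouts above follow from the channel position and the size of one channel. |
| The sketch below illustrates this for unpacked formats only; the function names and the idea of passing the channel |
| count and channel size as plain integers are assumptions made for the example, not part of the OpenCL API.</p> |
| </div> |

```c
#include <stddef.h>

/* Hypothetical helper: byte offset of a channel inside one unpacked image
 * element. channel_index is the position in the channel order
 * (0 for the first channel, e.g. R in CL_RGBA). */
size_t channel_byte_offset(size_t channel_index, size_t bytes_per_channel)
{
    return channel_index * bytes_per_channel;
}

/* Hypothetical helper: size in bytes of one unpacked image element. */
size_t element_size_bytes(size_t num_channels, size_t bytes_per_channel)
{
    return num_channels * bytes_per_channel;
}

/* Returns 1 if value is a power of two, which is the requirement placed on
 * the number of bits per element; returns 0 otherwise. */
int is_power_of_two(size_t value)
{
    return value != 0 && (value & (value - 1)) == 0;
}
```

| <div class="paragraph"> |
| <p>For <code>CL_RGBA</code> with 1-byte channels the <code>ALPHA</code> channel (index 3) lands at offset 3, and with |
| 2-byte channels at offset 6, matching the tables above. |
| A four-channel element of 2-byte channels has 64 bits, which is a power of two, whereas three 1-byte channels would |
| give 24 bits, which is not.</p> |
| </div> |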
| <div class="paragraph"> |
| <p><code>image_channel_data_type</code> values of <a href="#CL_UNORM_SHORT_565"><code>CL_UNORM_<wbr>SHORT_<wbr>565</code></a>, <a href="#CL_UNORM_SHORT_555"><code>CL_UNORM_<wbr>SHORT_<wbr>555</code></a>, |
| <a href="#CL_UNORM_INT_101010"><code>CL_UNORM_<wbr>INT_<wbr>101010</code></a>, and <a href="#CL_UNORM_INT_101010_2"><code>CL_UNORM_<wbr>INT_<wbr>101010_<wbr>2</code></a> are special cases of packed |
| image formats where the channels of each element are packed into a single |
| unsigned short or unsigned int. |
| For these special packed image formats, the channels are normally packed |
| with the first channel in the most significant bits of the bitfield, and |
| successive channels occupying progressively less significant locations. |
| For <a href="#CL_UNORM_SHORT_565"><code>CL_UNORM_<wbr>SHORT_<wbr>565</code></a>, R is in bits 15:11, G is in bits 10:5 and B is in |
| bits 4:0. |
| For <a href="#CL_UNORM_SHORT_555"><code>CL_UNORM_<wbr>SHORT_<wbr>555</code></a>, bit 15 is undefined, R is in bits 14:10, G in bits |
| 9:5 and B in bits 4:0. |
| For <a href="#CL_UNORM_INT_101010"><code>CL_UNORM_<wbr>INT_<wbr>101010</code></a>, bits 31:30 are undefined, R is in bits 29:20, G in |
| bits 19:10 and B in bits 9:0. |
| For <a href="#CL_UNORM_INT_101010_2"><code>CL_UNORM_<wbr>INT_<wbr>101010_<wbr>2</code></a>, R is in bits 31:22, G in bits 21:12, B in bits |
| 11:2 and A in bits 1:0.</p> |
| </div> |
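| <div class="paragraph"> |
| <p>As an illustration of the bit layouts described above, the following C sketch packs already-quantized channel values into a single element. |
| The function names are hypothetical helpers and are not part of the OpenCL API; the sketch assumes the caller has already converted each channel to an integer of the right width, and it leaves the undefined bits as zero.</p> |
| </div> |

```c
#include <stdint.h>

/* CL_UNORM_SHORT_565: R in bits 15:11, G in bits 10:5, B in bits 4:0. */
uint16_t pack_565(unsigned r5, unsigned g6, unsigned b5)
{
    return (uint16_t)(((r5 & 0x1Fu) << 11) | ((g6 & 0x3Fu) << 5) | (b5 & 0x1Fu));
}

/* CL_UNORM_SHORT_555: bit 15 is undefined (left as 0 here),
 * R in bits 14:10, G in bits 9:5, B in bits 4:0. */
uint16_t pack_555(unsigned r5, unsigned g5, unsigned b5)
{
    return (uint16_t)(((r5 & 0x1Fu) << 10) | ((g5 & 0x1Fu) << 5) | (b5 & 0x1Fu));
}

/* CL_UNORM_INT_101010: bits 31:30 are undefined (left as 0 here),
 * R in bits 29:20, G in bits 19:10, B in bits 9:0. */
uint32_t pack_101010(uint32_t r10, uint32_t g10, uint32_t b10)
{
    return ((r10 & 0x3FFu) << 20) | ((g10 & 0x3FFu) << 10) | (b10 & 0x3FFu);
}

/* CL_UNORM_INT_101010_2: R in bits 31:22, G in bits 21:12,
 * B in bits 11:2, A in bits 1:0. */
uint32_t pack_101010_2(uint32_t r10, uint32_t g10, uint32_t b10, uint32_t a2)
{
    return ((r10 & 0x3FFu) << 22) | ((g10 & 0x3FFu) << 12) |
           ((b10 & 0x3FFu) << 2) | (a2 & 0x3u);
}

/* Extract the red field of a 565 element (bits 15:11). */
unsigned red_of_565(uint16_t element)
{
    return (element >> 11) & 0x1Fu;
}
```

| <div class="paragraph"> |
| <p>Masking each argument before shifting keeps an out-of-range channel value from spilling into a neighbouring field.</p> |
| </div> |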
| <div class="paragraph"> |
| <p>OpenCL implementations must maintain the minimum precision specified by the |
| number of bits in <code>image_channel_data_type</code>. |
| If the image format specified by <code>image_channel_order</code> and |
| <code>image_channel_data_type</code> cannot be supported by the OpenCL implementation, |
| then the call to <a href="#clCreateImage"><strong>clCreateImage</strong></a>, <a href="#clCreateImageWithProperties"><strong>clCreateImageWithProperties</strong></a>, |
| <a href="#clCreateImage2D"><strong>clCreateImage2D</strong></a>, or <a href="#clCreateImage3D"><strong>clCreateImage3D</strong></a> will return a <code>NULL</code> memory object.</p> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div class="sect4"> |
| <h5 id="image-descriptor"><a class="anchor" href="#image-descriptor"></a>5.3.1.2. Image Descriptor</h5> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>The <a href="#cl_image_desc"><code>cl_image_<wbr>desc</code></a> image descriptor structure describes the image type |
| and dimensions of an image or image array when creating an image using |
| <a href="#clCreateImage"><strong>clCreateImage</strong></a> or <a href="#clCreateImageWithProperties"><strong>clCreateImageWithProperties</strong></a>, and is defined as:</p> |
| </div> |
| <div id="cl_image_desc" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++"><span class="keyword">typedef</span> <span class="keyword">struct</span> cl_image_desc { |
| cl_mem_object_type image_type; |
| size_t image_width; |
| size_t image_height; |
| size_t image_depth; |
| size_t image_array_size; |
| size_t image_row_pitch; |
| size_t image_slice_pitch; |
| cl_uint num_mip_levels; |
| cl_uint num_samples; |
| <span class="keyword">union</span> { |
| cl_mem buffer; |
| cl_mem mem_object; |
| }; |
| } cl_image_desc;</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><code>image_type</code> describes the image type and must be either |
| <a href="#CL_MEM_OBJECT_IMAGE1D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE1D</code></a>, <a href="#CL_MEM_OBJECT_IMAGE1D_BUFFER"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE1D_<wbr>BUFFER</code></a>, |
| <a href="#CL_MEM_OBJECT_IMAGE1D_ARRAY"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE1D_<wbr>ARRAY</code></a>, <a href="#CL_MEM_OBJECT_IMAGE2D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE2D</code></a>, |
| <a href="#CL_MEM_OBJECT_IMAGE2D_ARRAY"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE2D_<wbr>ARRAY</code></a>, or <a href="#CL_MEM_OBJECT_IMAGE3D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE3D</code></a>.</p> |
| </li> |
| <li> |
| <p><code>image_width</code> is the width of the image in pixels. |
| For a 2D image and image array, the image width must be a value ≥ 1 and |
| ≤ <a href="#CL_DEVICE_IMAGE2D_MAX_WIDTH"><code>CL_DEVICE_<wbr>IMAGE2D_<wbr>MAX_<wbr>WIDTH</code></a>. |
| For a 3D image, the image width must be a value ≥ 1 and ≤ |
| <a href="#CL_DEVICE_IMAGE3D_MAX_WIDTH"><code>CL_DEVICE_<wbr>IMAGE3D_<wbr>MAX_<wbr>WIDTH</code></a>. |
| For a 1D image buffer, the image width must be a value ≥ 1 and ≤ |
| <a href="#CL_DEVICE_IMAGE_MAX_BUFFER_SIZE"><code>CL_DEVICE_<wbr>IMAGE_<wbr>MAX_<wbr>BUFFER_<wbr>SIZE</code></a>. |
| For a 1D image and 1D image array, the image width must be a value ≥ 1 |
| and ≤ <a href="#CL_DEVICE_IMAGE2D_MAX_WIDTH"><code>CL_DEVICE_<wbr>IMAGE2D_<wbr>MAX_<wbr>WIDTH</code></a>.</p> |
| </li> |
| <li> |
| <p><code>image_height</code> is the height of the image in pixels. |
| This is only used if the image is a 2D or 3D image, or a 2D image array. |
| For a 2D image or image array, the image height must be a value ≥ 1 and |
| ≤ <a href="#CL_DEVICE_IMAGE2D_MAX_HEIGHT"><code>CL_DEVICE_<wbr>IMAGE2D_<wbr>MAX_<wbr>HEIGHT</code></a>. |
| For a 3D image, the image height must be a value ≥ 1 and ≤ |
| <a href="#CL_DEVICE_IMAGE3D_MAX_HEIGHT"><code>CL_DEVICE_<wbr>IMAGE3D_<wbr>MAX_<wbr>HEIGHT</code></a>.</p> |
| </li> |
| <li> |
| <p><code>image_depth</code> is the depth of the image in pixels. |
| This is only used if the image is a 3D image and must be a value ≥ 1 and |
| ≤ <a href="#CL_DEVICE_IMAGE3D_MAX_DEPTH"><code>CL_DEVICE_<wbr>IMAGE3D_<wbr>MAX_<wbr>DEPTH</code></a>.</p> |
| </li> |
| <li> |
| <p><code>image_array_size</code> <sup class="footnote">[<a id="_footnoteref_17" class="footnote" href="#_footnotedef_17" title="View footnote.">17</a>]</sup> is the number of |
| images in the image array. |
| This is only used if the image is a 1D or 2D image array. |
| The value of <code>image_array_size</code>, if specified, must be ≥ 1 and |
| ≤ <a href="#CL_DEVICE_IMAGE_MAX_ARRAY_SIZE"><code>CL_DEVICE_<wbr>IMAGE_<wbr>MAX_<wbr>ARRAY_<wbr>SIZE</code></a>.</p> |
| </li> |
| <li> |
| <p><code>image_row_pitch</code> is the scan-line pitch in bytes. |
| This must be 0 if <em>host_ptr</em> is <code>NULL</code> and can be either 0 or ≥ |
| <code>image_width</code> × size of element in bytes if <em>host_ptr</em> is not <code>NULL</code>. |
| If <em>host_ptr</em> is not <code>NULL</code> and <code>image_row_pitch</code> = 0, <code>image_row_pitch</code> is |
| calculated as <code>image_width</code> × size of element in bytes. |
| If <code>image_row_pitch</code> is not 0, it must be a multiple of the image element |
| size in bytes. |
| For a 2D image created from a buffer, the pitch specified (or computed if |
| pitch specified is 0) must be a multiple of the maximum of the |
| <a href="#CL_DEVICE_IMAGE_PITCH_ALIGNMENT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>PITCH_<wbr>ALIGNMENT</code></a> value for all devices in the context |
| associated with the buffer specified by <code>mem_object</code> that support images.</p> |
| </li> |
| <li> |
| <p><code>image_slice_pitch</code> is the size in bytes of each 2D slice in the 3D image or |
| the size in bytes of each image in a 1D or 2D image array. |
| This must be 0 if <em>host_ptr</em> is <code>NULL</code>. |
| If <em>host_ptr</em> is not <code>NULL</code>, <code>image_slice_pitch</code> can be either 0 or ≥ |
| <code>image_row_pitch</code> × <code>image_height</code> for a 2D image array or 3D image |
| and can be either 0 or ≥ <code>image_row_pitch</code> for a 1D image array. |
| If <em>host_ptr</em> is not <code>NULL</code> and <code>image_slice_pitch</code> = 0, <code>image_slice_pitch</code> |
| is calculated as <code>image_row_pitch</code> × <code>image_height</code> for a 2D image |
| array or 3D image and <code>image_row_pitch</code> for a 1D image array. |
| If <code>image_slice_pitch</code> is not 0, it must be a multiple of the |
| <code>image_row_pitch</code>.</p> |
| </li> |
| <li> |
| <p><code>num_mip_levels</code> and <code>num_samples</code> must be 0.</p> |
| </li> |
| <li> |
| <p><code>mem_object</code> may refer to a valid buffer or image memory object. |
| <code>mem_object</code> can be a buffer memory object if <code>image_type</code> is |
| <a href="#CL_MEM_OBJECT_IMAGE1D_BUFFER"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE1D_<wbr>BUFFER</code></a> or |
| <a href="#CL_MEM_OBJECT_IMAGE2D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE2D</code></a> <sup class="footnote">[<a id="_footnoteref_18" class="footnote" href="#_footnotedef_18" title="View footnote.">18</a>]</sup>. |
| <code>mem_object</code> can be an image object if <code>image_type</code> is |
| <a href="#CL_MEM_OBJECT_IMAGE2D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE2D</code></a> <sup class="footnote">[<a id="_footnoteref_19" class="footnote" href="#_footnotedef_19" title="View footnote.">19</a>]</sup>. |
| Otherwise it must be <code>NULL</code>. |
| The image pixels are taken from the memory object&#8217;s data store. |
| When the contents of the specified memory object&#8217;s data store are modified, |
| those changes are reflected in the contents of the image object and |
| vice-versa at corresponding synchronization points.</p> |
| </li> |
| </ul> |
| </div> |
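| <div class="paragraph"> |
| <p>The pitch rules in the list above can be sketched as a few small C helpers. |
| These functions are hypothetical and not part of the OpenCL API; they assume a non-<code>NULL</code> <em>host_ptr</em>, a non-zero element size, and that the caller passes a height of 1 for a 1D image array.</p> |
| </div> |

```c
#include <stddef.h>

/* A row pitch of 0 means "tightly packed": image_width * element size. */
size_t effective_row_pitch(size_t row_pitch, size_t width, size_t elem_size)
{
    return row_pitch != 0 ? row_pitch : width * elem_size;
}

/* Non-zero row pitch must cover one row and be a multiple of the element size. */
int row_pitch_is_valid(size_t row_pitch, size_t width, size_t elem_size)
{
    if (row_pitch == 0)
        return 1;
    return row_pitch >= width * elem_size && row_pitch % elem_size == 0;
}

/* A slice pitch of 0 means row_pitch * height (height is 1 for a 1D image array).
 * row_pitch is the effective (non-zero) row pitch. */
size_t effective_slice_pitch(size_t slice_pitch, size_t row_pitch, size_t height)
{
    return slice_pitch != 0 ? slice_pitch : row_pitch * height;
}

/* Non-zero slice pitch must cover one slice and be a multiple of the row pitch. */
int slice_pitch_is_valid(size_t slice_pitch, size_t row_pitch, size_t height)
{
    if (slice_pitch == 0)
        return 1;
    if (row_pitch == 0)
        return 0; /* expects the effective row pitch, which is never 0 */
    return slice_pitch >= row_pitch * height && slice_pitch % row_pitch == 0;
}
```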
| <div class="paragraph"> |
| <p>For a 1D image buffer created from a buffer object, the <code>image_width</code> × |
| size of element in bytes must be ≤ size of the buffer object. |
| The image data in the buffer object is stored as a single scanline which is |
| a linear sequence of adjacent elements.</p> |
| </div> |
| <div class="paragraph"> |
| <p>For a 2D image created from a buffer object, the <code>image_row_pitch</code> × |
| <code>image_height</code> must be ≤ size of the buffer object specified by |
| <code>mem_object</code>. |
| The image data in the buffer object is stored as a linear sequence of |
| adjacent scanlines. |
| Each scanline is a linear sequence of image elements padded to |
| <code>image_row_pitch</code> bytes.</p> |
| </div> |
| <div class="paragraph"> |
| <p>For an image object created from another image object, the values specified |
| in the image descriptor except for <code>mem_object</code> must match the image |
| descriptor information associated with <code>mem_object</code>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Image elements are stored according to their image format as described in |
| <a href="#image-format-descriptor">Image Format Descriptor</a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If the buffer object specified by <code>mem_object</code> was created with |
| <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a>, the <em>host_ptr</em> specified to <a href="#clCreateBuffer"><strong>clCreateBuffer</strong></a> or |
| <a href="#clCreateBufferWithProperties"><strong>clCreateBufferWithProperties</strong></a> must be aligned to the maximum of the |
| <a href="#CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>BASE_<wbr>ADDRESS_<wbr>ALIGNMENT</code></a> value for all devices in the |
| context associated with the buffer specified by <code>mem_object</code> that |
| support images.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Creating a 2D image object from another 2D image object creates a new |
| 2D image object that shares the image data store with <code>mem_object</code> but views |
| the pixels in the image with a different image channel order. |
| Restrictions are:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>All of the values specified in <em>image_desc</em> must match the image descriptor |
| information associated with <code>mem_object</code>, except for <code>mem_object</code>.</p> |
| </li> |
| <li> |
| <p>The image channel data type specified in <em>image_format</em> must match the |
| image channel data type associated with <code>mem_object</code>.</p> |
| </li> |
| <li> |
| <p>The image channel order specified in <em>image_format</em> must be compatible |
| with the image channel order associated with <code>mem_object</code>. |
| Compatible image channel orders |
| <sup class="footnote">[<a id="_footnoteref_20" class="footnote" href="#_footnotedef_20" title="View footnote.">20</a>]</sup> are:</p> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| <table class="tableblock frame-all grid-all stretch"> |
| <colgroup> |
| <col style="width: 50%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Image Channel Order in <em>image_format</em>:</th> |
| <th class="tableblock halign-left valign-top">Image Channel Order associated with <code>mem_object</code>:</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_sBGRA"><code>CL_sBGRA</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_BGRA"><code>CL_BGRA</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_BGRA"><code>CL_BGRA</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_sBGRA"><code>CL_sBGRA</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_sRGBA"><code>CL_sRGBA</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_RGBA"><code>CL_RGBA</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_RGBA"><code>CL_RGBA</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_sRGBA"><code>CL_sRGBA</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_sRGB"><code>CL_sRGB</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_RGB"><code>CL_RGB</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_RGB"><code>CL_RGB</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_sRGB"><code>CL_sRGB</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_sRGBx"><code>CL_sRGBx</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_RGBx"><code>CL_RGBx</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_RGBx"><code>CL_RGBx</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_sRGBx"><code>CL_sRGBx</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_DEPTH"><code>CL_DEPTH</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_R"><code>CL_R</code></a></p></td> |
| </tr> |
| </tbody> |
| </table> |
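| <div class="paragraph"> |
| <p>The table above can be expressed as a lookup, sketched here in C. |
| The enumeration is a hypothetical stand-in for the corresponding <code>cl_channel_order</code> constants so that the example compiles without the OpenCL headers, and the function models only the pairs listed in the table.</p> |
| </div> |

```c
/* Hypothetical stand-ins for the cl_channel_order constants named in the table. */
typedef enum {
    ORD_R, ORD_DEPTH,
    ORD_RGB, ORD_SRGB,
    ORD_RGBX, ORD_SRGBX,
    ORD_RGBA, ORD_SRGBA,
    ORD_BGRA, ORD_SBGRA
} channel_order;

/* Returns 1 if an image with order `requested` may be created from an
 * image whose order is `existing`, according to the table rows only. */
int orders_compatible(channel_order requested, channel_order existing)
{
    switch (requested) {
    case ORD_SBGRA: return existing == ORD_BGRA;
    case ORD_BGRA:  return existing == ORD_SBGRA;
    case ORD_SRGBA: return existing == ORD_RGBA;
    case ORD_RGBA:  return existing == ORD_SRGBA;
    case ORD_SRGB:  return existing == ORD_RGB;
    case ORD_RGB:   return existing == ORD_SRGB;
    case ORD_SRGBX: return existing == ORD_RGBX;
    case ORD_RGBX:  return existing == ORD_SRGBX;
    case ORD_DEPTH: return existing == ORD_R;   /* one direction only */
    default:        return 0;
    }
}
```

| <div class="paragraph"> |
| <p>Note that the last row is one-directional: the table lists a depth view of an R image, but not the reverse.</p> |
| </div> |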
| <div class="openblock"> |
| <div class="content"> |
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| <div class="paragraph"> |
| <p>Concurrent reading from, writing to and copying between both a buffer object |
| and 1D image buffer or 2D image object associated with the buffer object is |
| undefined. |
| Only reading from both a buffer object and 1D image buffer or 2D image |
| object associated with the buffer object is defined.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Writing to an image created from a buffer and then reading from this buffer |
| in a kernel is undefined, even if appropriate synchronization operations |
| (such as a barrier) are performed between the writes and reads. |
| Similarly, writing to the buffer and reading from the image created from |
| this buffer is undefined, even with appropriate synchronization between the |
| writes and reads.</p> |
| </div> |
| </td> |
| </tr> |
| </table> |
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="_querying_list_of_supported_image_formats"><a class="anchor" href="#_querying_list_of_supported_image_formats"></a>5.3.2. Querying List of Supported Image Formats</h4> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>To get the list of image formats supported by an OpenCL implementation for a |
| specified context, image type, and allocation information, call the function</p> |
| </div> |
| <div id="clGetSupportedImageFormats" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clGetSupportedImageFormats( |
| cl_context context, |
| cl_mem_flags flags, |
| cl_mem_object_type image_type, |
| cl_uint num_entries, |
| cl_image_format* image_formats, |
| cl_uint* num_image_formats);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>context</em> is a valid OpenCL context on which the image object(s) will be |
| created.</p> |
| </li> |
| <li> |
| <p><em>flags</em> is a bit-field that is used to specify usage |
| information about the image formats being queried and is described in |
| the <a href="#memory-flags-table">Memory Flags</a> table. |
| <em>flags</em> may be <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a> to query image formats that may be read |
| from and written to by different kernel instances when correctly ordered by |
| event dependencies, or <a href="#CL_MEM_READ_ONLY"><code>CL_MEM_<wbr>READ_<wbr>ONLY</code></a> to query image formats that may |
| be read from by a kernel, or <a href="#CL_MEM_WRITE_ONLY"><code>CL_MEM_<wbr>WRITE_<wbr>ONLY</code></a> to query image formats that |
| may be written to by a kernel, or <a href="#CL_MEM_KERNEL_READ_AND_WRITE"><code>CL_MEM_<wbr>KERNEL_<wbr>READ_<wbr>AND_<wbr>WRITE</code></a> to query |
| image formats that may be both read from and written to by the same kernel |
| instance. |
| Please see <a href="#image-format-mapping">Image Format Mapping</a> for clarification.</p> |
| </li> |
| <li> |
| <p><em>image_type</em> describes the image type and must be either |
| <a href="#CL_MEM_OBJECT_IMAGE1D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE1D</code></a>, <a href="#CL_MEM_OBJECT_IMAGE1D_BUFFER"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE1D_<wbr>BUFFER</code></a>, <a href="#CL_MEM_OBJECT_IMAGE2D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE2D</code></a>, |
| <a href="#CL_MEM_OBJECT_IMAGE3D"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE3D</code></a>, <a href="#CL_MEM_OBJECT_IMAGE1D_ARRAY"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE1D_<wbr>ARRAY</code></a>, or |
| <a href="#CL_MEM_OBJECT_IMAGE2D_ARRAY"><code>CL_MEM_<wbr>OBJECT_<wbr>IMAGE2D_<wbr>ARRAY</code></a>.</p> |
| </li> |
| <li> |
| <p><em>num_entries</em> specifies the number of entries that can be returned in the |
| memory location given by <em>image_formats</em>.</p> |
| </li> |
| <li> |
| <p><em>image_formats</em> is a pointer to a memory location where the list of |
| supported image formats is returned. |
| Each entry describes a <a href="#cl_image_format"><code>cl_image_<wbr>format</code></a> structure supported by the OpenCL |
| implementation. |
| If <em>image_formats</em> is <code>NULL</code>, it is ignored.</p> |
| </li> |
| <li> |
| <p><em>num_image_formats</em> is the actual number of supported image formats for a |
| specific <em>context</em> and values specified by <em>flags</em>. |
| If <em>num_image_formats</em> is <code>NULL</code>, it is ignored.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clGetSupportedImageFormats"><strong>clGetSupportedImageFormats</strong></a> returns a union of image formats supported by |
| all devices in the context.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clGetSupportedImageFormats"><strong>clGetSupportedImageFormats</strong></a> returns <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the function is executed |
| successfully. |
| Otherwise, it returns one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if <em>context</em> is not a valid context.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>flags</em> or <em>image_type</em> is not valid, or if |
| <em>num_entries</em> is 0 and <em>image_formats</em> is not <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
| <div class="paragraph"> |
| <p>If <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> specified in the <a href="#device-queries-table">Device |
| Queries</a> table is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, the values assigned to |
| <a href="#CL_DEVICE_MAX_READ_IMAGE_ARGS"><code>CL_DEVICE_<wbr>MAX_<wbr>READ_<wbr>IMAGE_<wbr>ARGS</code></a>, <a href="#CL_DEVICE_MAX_WRITE_IMAGE_ARGS"><code>CL_DEVICE_<wbr>MAX_<wbr>WRITE_<wbr>IMAGE_<wbr>ARGS</code></a>, |
| <a href="#CL_DEVICE_IMAGE2D_MAX_WIDTH"><code>CL_DEVICE_<wbr>IMAGE2D_<wbr>MAX_<wbr>WIDTH</code></a>, <a href="#CL_DEVICE_IMAGE2D_MAX_HEIGHT"><code>CL_DEVICE_<wbr>IMAGE2D_<wbr>MAX_<wbr>HEIGHT</code></a>, |
| <a href="#CL_DEVICE_IMAGE3D_MAX_WIDTH"><code>CL_DEVICE_<wbr>IMAGE3D_<wbr>MAX_<wbr>WIDTH</code></a>, <a href="#CL_DEVICE_IMAGE3D_MAX_HEIGHT"><code>CL_DEVICE_<wbr>IMAGE3D_<wbr>MAX_<wbr>HEIGHT</code></a>, |
| <a href="#CL_DEVICE_IMAGE3D_MAX_DEPTH"><code>CL_DEVICE_<wbr>IMAGE3D_<wbr>MAX_<wbr>DEPTH</code></a>, and <a href="#CL_DEVICE_MAX_SAMPLERS"><code>CL_DEVICE_<wbr>MAX_<wbr>SAMPLERS</code></a> by the implementation |
| must be greater than or equal to the minimum values specified in the |
| <a href="#device-queries-table">Device Queries</a> table.</p> |
| </div> |
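| <div class="paragraph"> |
| <p>A common way to use this function is to call it once with <em>num_entries</em> set to 0 and <em>image_formats</em> set to <code>NULL</code> to obtain the count, allocate an array of that size, and call it again to fill the array. |
| The sketch below covers only the last step, searching the returned list for a particular format. |
| The structure is a hypothetical stand-in that mirrors the two fields of <code>cl_image_format</code> so that the example compiles without the OpenCL headers; the helper name is likewise an assumption.</p> |
| </div> |

```c
#include <stddef.h>

/* Hypothetical stand-in for cl_image_format (same two fields). */
typedef struct {
    unsigned int image_channel_order;
    unsigned int image_channel_data_type;
} image_format;

/* Returns 1 if (order, data_type) appears in the first `count` entries
 * of `formats`, 0 otherwise. A NULL or empty list never matches. */
int format_in_list(const image_format *formats, size_t count,
                   unsigned int order, unsigned int data_type)
{
    if (formats == NULL)
        return 0;
    for (size_t i = 0; i < count; ++i) {
        if (formats[i].image_channel_order == order &&
            formats[i].image_channel_data_type == data_type)
            return 1;
    }
    return 0;
}
```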
| </div> |
| </div> |
| <div class="sect4"> |
| <h5 id="minimum-list-of-supported-image-formats"><a class="anchor" href="#minimum-list-of-supported-image-formats"></a>5.3.2.1. Minimum List of Supported Image Formats</h5> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>The tables below describe the required minimum lists of supported image |
| formats. |
| To query all image formats supported by an implementation, call the function <a href="#clGetSupportedImageFormats"><strong>clGetSupportedImageFormats</strong></a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>For full profile devices supporting OpenCL 2.0, 2.1, or 2.2, the minimum |
| list of supported image formats for either reading or writing in a kernel |
| is:</p> |
| </div> |
| <table id="min-supported-image-formats-2.0" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 18. Minimum list of supported image formats for reading or writing (OpenCL 2.0, 2.1, or 2.2)</caption> |
| <colgroup> |
| <col style="width: 34%;"> |
| <col style="width: 33%;"> |
| <col style="width: 33%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">num_channels</th> |
| <th class="tableblock halign-left valign-top">channel_order</th> |
| <th class="tableblock halign-left valign-top">channel_data_type</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_R"><code>CL_R</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_UNORM_INT8"><code>CL_UNORM_<wbr>INT8</code></a><br> |
| <a href="#CL_UNORM_INT16"><code>CL_UNORM_<wbr>INT16</code></a><br> |
| <a href="#CL_SNORM_INT8"><code>CL_SNORM_<wbr>INT8</code></a><br> |
| <a href="#CL_SNORM_INT16"><code>CL_SNORM_<wbr>INT16</code></a><br> |
| <a href="#CL_SIGNED_INT8"><code>CL_SIGNED_<wbr>INT8</code></a><br> |
| <a href="#CL_SIGNED_INT16"><code>CL_SIGNED_<wbr>INT16</code></a><br> |
| <a href="#CL_SIGNED_INT32"><code>CL_SIGNED_<wbr>INT32</code></a><br> |
| <a href="#CL_UNSIGNED_INT8"><code>CL_UNSIGNED_<wbr>INT8</code></a><br> |
| <a href="#CL_UNSIGNED_INT16"><code>CL_UNSIGNED_<wbr>INT16</code></a><br> |
| <a href="#CL_UNSIGNED_INT32"><code>CL_UNSIGNED_<wbr>INT32</code></a><br> |
| <a href="#CL_HALF_FLOAT"><code>CL_HALF_<wbr>FLOAT</code></a><br> |
| <a href="#CL_FLOAT"><code>CL_FLOAT</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_DEPTH"><code>CL_DEPTH</code></a> <sup class="footnote">[<a id="_footnoteref_21" class="footnote" href="#_footnotedef_21" title="View footnote.">21</a>]</sup></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_UNORM_INT16"><code>CL_UNORM_<wbr>INT16</code></a><br> |
| <a href="#CL_FLOAT"><code>CL_FLOAT</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">2</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_RG"><code>CL_RG</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_UNORM_INT8"><code>CL_UNORM_<wbr>INT8</code></a><br> |
| <a href="#CL_UNORM_INT16"><code>CL_UNORM_<wbr>INT16</code></a><br> |
| <a href="#CL_SNORM_INT8"><code>CL_SNORM_<wbr>INT8</code></a><br> |
| <a href="#CL_SNORM_INT16"><code>CL_SNORM_<wbr>INT16</code></a><br> |
| <a href="#CL_SIGNED_INT8"><code>CL_SIGNED_<wbr>INT8</code></a><br> |
| <a href="#CL_SIGNED_INT16"><code>CL_SIGNED_<wbr>INT16</code></a><br> |
| <a href="#CL_SIGNED_INT32"><code>CL_SIGNED_<wbr>INT32</code></a><br> |
| <a href="#CL_UNSIGNED_INT8"><code>CL_UNSIGNED_<wbr>INT8</code></a><br> |
| <a href="#CL_UNSIGNED_INT16"><code>CL_UNSIGNED_<wbr>INT16</code></a><br> |
| <a href="#CL_UNSIGNED_INT32"><code>CL_UNSIGNED_<wbr>INT32</code></a><br> |
| <a href="#CL_HALF_FLOAT"><code>CL_HALF_<wbr>FLOAT</code></a><br> |
| <a href="#CL_FLOAT"><code>CL_FLOAT</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">4</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_RGBA"><code>CL_RGBA</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_UNORM_INT8"><code>CL_UNORM_<wbr>INT8</code></a><br> |
| <a href="#CL_UNORM_INT16"><code>CL_UNORM_<wbr>INT16</code></a><br> |
| <a href="#CL_SNORM_INT8"><code>CL_SNORM_<wbr>INT8</code></a><br> |
| <a href="#CL_SNORM_INT16"><code>CL_SNORM_<wbr>INT16</code></a><br> |
| <a href="#CL_SIGNED_INT8"><code>CL_SIGNED_<wbr>INT8</code></a><br> |
| <a href="#CL_SIGNED_INT16"><code>CL_SIGNED_<wbr>INT16</code></a><br> |
| <a href="#CL_SIGNED_INT32"><code>CL_SIGNED_<wbr>INT32</code></a><br> |
| <a href="#CL_UNSIGNED_INT8"><code>CL_UNSIGNED_<wbr>INT8</code></a><br> |
| <a href="#CL_UNSIGNED_INT16"><code>CL_UNSIGNED_<wbr>INT16</code></a><br> |
| <a href="#CL_UNSIGNED_INT32"><code>CL_UNSIGNED_<wbr>INT32</code></a><br> |
| <a href="#CL_HALF_FLOAT"><code>CL_HALF_<wbr>FLOAT</code></a><br> |
| <a href="#CL_FLOAT"><code>CL_FLOAT</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">4</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_BGRA"><code>CL_BGRA</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_UNORM_INT8"><code>CL_UNORM_<wbr>INT8</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">4</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_sRGBA"><code>CL_sRGBA</code></a> <sup class="footnote">[<a id="_footnoteref_22" class="footnote" href="#_footnotedef_22" title="View footnote.">22</a>]</sup></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_UNORM_INT8"><code>CL_UNORM_<wbr>INT8</code></a></p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="paragraph"> |
| <p>For full profile devices supporting other OpenCL versions, such as OpenCL 1.2 |
| or OpenCL 3.0, the minimum list of supported image formats for either reading |
| or writing in a kernel is:</p> |
| </div> |
| <table id="min-supported-image-formats" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 19. Minimum list of required image formats for reading or writing</caption> |
| <colgroup> |
| <col style="width: 34%;"> |
| <col style="width: 33%;"> |
| <col style="width: 33%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">num_channels</th> |
| <th class="tableblock halign-left valign-top">channel_order</th> |
| <th class="tableblock halign-left valign-top">channel_data_type</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">4</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_RGBA"><code>CL_RGBA</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_UNORM_INT8"><code>CL_UNORM_<wbr>INT8</code></a><br> |
| <a href="#CL_UNORM_INT16"><code>CL_UNORM_<wbr>INT16</code></a><br> |
| <a href="#CL_SIGNED_INT8"><code>CL_SIGNED_<wbr>INT8</code></a><br> |
| <a href="#CL_SIGNED_INT16"><code>CL_SIGNED_<wbr>INT16</code></a><br> |
| <a href="#CL_SIGNED_INT32"><code>CL_SIGNED_<wbr>INT32</code></a><br> |
| <a href="#CL_UNSIGNED_INT8"><code>CL_UNSIGNED_<wbr>INT8</code></a><br> |
| <a href="#CL_UNSIGNED_INT16"><code>CL_UNSIGNED_<wbr>INT16</code></a><br> |
| <a href="#CL_UNSIGNED_INT32"><code>CL_UNSIGNED_<wbr>INT32</code></a><br> |
| <a href="#CL_HALF_FLOAT"><code>CL_HALF_<wbr>FLOAT</code></a><br> |
| <a href="#CL_FLOAT"><code>CL_FLOAT</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">4</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_BGRA"><code>CL_BGRA</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_UNORM_INT8"><code>CL_UNORM_<wbr>INT8</code></a></p></td> |
| </tr> |
| </tbody> |
| </table> |
| <div class="paragraph"> |
| <p>For full profile devices that support reading from and writing to the same |
| image object from the same kernel instance (see <a href="#CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS"><code>CL_DEVICE_<wbr>MAX_<wbr>READ_<wbr>WRITE_<wbr>IMAGE_<wbr>ARGS</code></a>), |
| the minimum list of supported image formats for reading and writing in |
| the same kernel instance is:</p> |
| </div> |
| <table id="min-supported-image-formats-read-write" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 20. Minimum list of required image formats for reading and writing</caption> |
| <colgroup> |
| <col style="width: 34%;"> |
| <col style="width: 33%;"> |
| <col style="width: 33%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">num_channels</th> |
| <th class="tableblock halign-left valign-top">channel_order</th> |
| <th class="tableblock halign-left valign-top">channel_data_type</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_R"><code>CL_R</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_UNORM_INT8"><code>CL_UNORM_<wbr>INT8</code></a><br> |
| <a href="#CL_SIGNED_INT8"><code>CL_SIGNED_<wbr>INT8</code></a><br> |
| <a href="#CL_SIGNED_INT16"><code>CL_SIGNED_<wbr>INT16</code></a><br> |
| <a href="#CL_SIGNED_INT32"><code>CL_SIGNED_<wbr>INT32</code></a><br> |
| <a href="#CL_UNSIGNED_INT8"><code>CL_UNSIGNED_<wbr>INT8</code></a><br> |
| <a href="#CL_UNSIGNED_INT16"><code>CL_UNSIGNED_<wbr>INT16</code></a><br> |
| <a href="#CL_UNSIGNED_INT32"><code>CL_UNSIGNED_<wbr>INT32</code></a><br> |
| <a href="#CL_HALF_FLOAT"><code>CL_HALF_<wbr>FLOAT</code></a><br> |
| <a href="#CL_FLOAT"><code>CL_FLOAT</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock">4</p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_RGBA"><code>CL_RGBA</code></a></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_UNORM_INT8"><code>CL_UNORM_<wbr>INT8</code></a><br> |
| <a href="#CL_SIGNED_INT8"><code>CL_SIGNED_<wbr>INT8</code></a><br> |
| <a href="#CL_SIGNED_INT16"><code>CL_SIGNED_<wbr>INT16</code></a><br> |
| <a href="#CL_SIGNED_INT32"><code>CL_SIGNED_<wbr>INT32</code></a><br> |
| <a href="#CL_UNSIGNED_INT8"><code>CL_UNSIGNED_<wbr>INT8</code></a><br> |
| <a href="#CL_UNSIGNED_INT16"><code>CL_UNSIGNED_<wbr>INT16</code></a><br> |
| <a href="#CL_UNSIGNED_INT32"><code>CL_UNSIGNED_<wbr>INT32</code></a><br> |
| <a href="#CL_HALF_FLOAT"><code>CL_HALF_<wbr>FLOAT</code></a><br> |
| <a href="#CL_FLOAT"><code>CL_FLOAT</code></a></p></td> |
| </tr> |
| </tbody> |
| </table> |
| </div> |
| </div> |
| </div> |
| <div class="sect4"> |
| <h5 id="image-format-mapping"><a class="anchor" href="#image-format-mapping"></a>5.3.2.2. Image format mapping to OpenCL kernel language image access qualifiers</h5> |
| <div class="paragraph"> |
| <p>Image arguments to kernels may have the <code>read_only</code>, <code>write_only</code> or |
| <code>read_write</code> qualifier. |
| Not every image format supported by the device and platform may be |
| used with every one of these access qualifiers. |
| For each access qualifier, only images whose format is in the list of |
| formats returned by <a href="#clGetSupportedImageFormats"><strong>clGetSupportedImageFormats</strong></a> with the given flag |
| arguments in the <a href="#image-format-mapping-table">Image Format Mapping</a> table |
| are permitted. |
| It is not valid to pass an image supporting writing as both a <code>read_only</code> |
| image and a <code>write_only</code> image parameter, or to a <code>read_write</code> image |
| parameter and any other image parameter.</p> |
| </div> |
| <table id="image-format-mapping-table" class="tableblock frame-all grid-all stretch"> |
| <caption class="title">Table 21. Mapping from format flags passed to <a href="#clGetSupportedImageFormats">clGetSupportedImageFormats</a> to OpenCL kernel language image access qualifiers</caption> |
| <colgroup> |
| <col style="width: 50%;"> |
| <col style="width: 50%;"> |
| </colgroup> |
| <thead> |
| <tr> |
| <th class="tableblock halign-left valign-top">Access Qualifier</th> |
| <th class="tableblock halign-left valign-top">Memory Flags</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>read_only</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_MEM_READ_ONLY"><code>CL_MEM_<wbr>READ_<wbr>ONLY</code></a>,<br> |
| <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a>,<br> |
| <a href="#CL_MEM_KERNEL_READ_AND_WRITE"><code>CL_MEM_<wbr>KERNEL_<wbr>READ_<wbr>AND_<wbr>WRITE</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>write_only</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_MEM_WRITE_ONLY"><code>CL_MEM_<wbr>WRITE_<wbr>ONLY</code></a>,<br> |
| <a href="#CL_MEM_READ_WRITE"><code>CL_MEM_<wbr>READ_<wbr>WRITE</code></a>,<br> |
| <a href="#CL_MEM_KERNEL_READ_AND_WRITE"><code>CL_MEM_<wbr>KERNEL_<wbr>READ_<wbr>AND_<wbr>WRITE</code></a></p></td> |
| </tr> |
| <tr> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><code>read_write</code></p></td> |
| <td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#CL_MEM_KERNEL_READ_AND_WRITE"><code>CL_MEM_<wbr>KERNEL_<wbr>READ_<wbr>AND_<wbr>WRITE</code></a></p></td> |
| </tr> |
| </tbody> |
| </table> |
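<div class="paragraph">
<p>As an illustration only, the mapping in the table above can be written as a small lookup. The <code>FLAG_*</code> names and both helper functions are hypothetical stand-ins introduced for this sketch; a real program would pass the corresponding <code>CL_MEM_*</code> constants from the OpenCL headers to <strong>clGetSupportedImageFormats</strong>. The sketch only records which flags the table lists for each qualifier and does not query a device.</p>
</div>

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the memory flags named in the table. A real
 * program would use the cl_mem_flags constants from the OpenCL headers. */
enum {
    FLAG_READ_ONLY             = 1 << 0,
    FLAG_WRITE_ONLY            = 1 << 1,
    FLAG_READ_WRITE            = 1 << 2,
    FLAG_KERNEL_READ_AND_WRITE = 1 << 3
};

/* Returns the set of flags the table lists for the given kernel access
 * qualifier, or 0 if the qualifier string is not recognized. */
unsigned flags_for_qualifier(const char *qualifier)
{
    if (strcmp(qualifier, "read_only") == 0)
        return FLAG_READ_ONLY | FLAG_READ_WRITE | FLAG_KERNEL_READ_AND_WRITE;
    if (strcmp(qualifier, "write_only") == 0)
        return FLAG_WRITE_ONLY | FLAG_READ_WRITE | FLAG_KERNEL_READ_AND_WRITE;
    if (strcmp(qualifier, "read_write") == 0)
        return FLAG_KERNEL_READ_AND_WRITE;
    return 0;
}

/* True if the table lists `flag` in the row for `qualifier`. */
bool qualifier_lists_flag(const char *qualifier, unsigned flag)
{
    return (flags_for_qualifier(qualifier) & flag) != 0;
}
```

<div class="paragraph">
<p>For example, <code>qualifier_lists_flag("read_write", FLAG_READ_WRITE)</code> is false in this sketch, reflecting that the table lists only <code>CL_MEM_KERNEL_READ_AND_WRITE</code> for <code>read_write</code> image arguments.</p>
</div>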
| </div> |
| </div> |
| <div class="sect3"> |
| <h4 id="_reading_writing_and_copying_image_objects"><a class="anchor" href="#_reading_writing_and_copying_image_objects"></a>5.3.3. Reading, Writing and Copying Image Objects</h4> |
| <div class="openblock"> |
| <div class="content"> |
| <div class="paragraph"> |
| <p>The following functions enqueue commands to read from an image or image |
| array object to host memory or write to an image or image array object from |
| host memory.</p> |
| </div> |
| <div id="clEnqueueReadImage" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clEnqueueReadImage( |
| cl_command_queue command_queue, |
| cl_mem image, |
| cl_bool blocking_read, |
| <span class="directive">const</span> size_t* origin, |
| <span class="directive">const</span> size_t* region, |
| size_t row_pitch, |
| size_t slice_pitch, |
| <span class="directive">void</span>* ptr, |
| cl_uint num_events_in_wait_list, |
| <span class="directive">const</span> cl_event* event_wait_list, |
| cl_event* event);</code></pre> |
| </div> |
| </div> |
| <div id="clEnqueueWriteImage" class="listingblock"> |
| <div class="content"> |
| <pre class="CodeRay highlight"><code data-lang="c++">cl_int clEnqueueWriteImage( |
| cl_command_queue command_queue, |
| cl_mem image, |
| cl_bool blocking_write, |
| <span class="directive">const</span> size_t* origin, |
| <span class="directive">const</span> size_t* region, |
| size_t input_row_pitch, |
| size_t input_slice_pitch, |
| <span class="directive">const</span> <span class="directive">void</span>* ptr, |
| cl_uint num_events_in_wait_list, |
| <span class="directive">const</span> cl_event* event_wait_list, |
| cl_event* event);</code></pre> |
| </div> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><em>command_queue</em> refers to the host command-queue in which the read / write |
| command will be queued. |
| <em>command_queue</em> and <em>image</em> must be created with the same OpenCL context.</p> |
| </li> |
| <li> |
| <p><em>image</em> refers to a valid image or image array object.</p> |
| </li> |
| <li> |
| <p><em>blocking_read</em> and <em>blocking_write</em> indicate if the read and write |
| operations are <em>blocking</em> or <em>non-blocking</em>.</p> |
| </li> |
| <li> |
| <p><em>origin</em> defines the (<em>x</em>, <em>y</em>, <em>z</em>) offset in pixels in the 1D, 2D or 3D |
| image, the (<em>x</em>, <em>y</em>) offset and the image index in the 2D image array or |
| the (<em>x</em>) offset and the image index in the 1D image array. |
| If <em>image</em> is a 2D image object, <em>origin</em>[2] must be 0. |
| If <em>image</em> is a 1D image or 1D image buffer object, <em>origin</em>[1] and |
| <em>origin</em>[2] must be 0. |
| If <em>image</em> is a 1D image array object, <em>origin</em>[2] must be 0. |
| If <em>image</em> is a 1D image array object, <em>origin</em>[1] describes the image index |
| in the 1D image array. |
| If <em>image</em> is a 2D image array object, <em>origin</em>[2] describes the image index |
| in the 2D image array.</p> |
| </li> |
| <li> |
| <p><em>region</em> defines the (<em>width</em>, <em>height</em>, <em>depth</em>) in pixels of the 1D, 2D or |
| 3D rectangle, the (<em>width</em>, <em>height</em>) in pixels of the 2D rectangle and the |
| number of images of a 2D image array or the (<em>width</em>) in pixels of the 1D |
| rectangle and the number of images of a 1D image array. |
| If <em>image</em> is a 2D image object, <em>region</em>[2] must be 1. |
| If <em>image</em> is a 1D image or 1D image buffer object, <em>region</em>[1] and |
| <em>region</em>[2] must be 1. |
| If <em>image</em> is a 1D image array object, <em>region</em>[2] must be 1. |
| The values in <em>region</em> cannot be 0.</p> |
| </li> |
| <li> |
| <p><em>row_pitch</em> in <a href="#clEnqueueReadImage"><strong>clEnqueueReadImage</strong></a> and <em>input_row_pitch</em> in |
| <a href="#clEnqueueWriteImage"><strong>clEnqueueWriteImage</strong></a> is the length of each row in bytes. |
| This value must be greater than or equal to the element size in bytes |
| × <em>width</em>. |
| If <em>row_pitch</em> (or <em>input_row_pitch</em>) is set to 0, the appropriate row pitch |
| is calculated based on the size of each element in bytes multiplied by |
| <em>width</em>.</p> |
| </li> |
| <li> |
| <p><em>slice_pitch</em> in <a href="#clEnqueueReadImage"><strong>clEnqueueReadImage</strong></a> and <em>input_slice_pitch</em> in |
| <a href="#clEnqueueWriteImage"><strong>clEnqueueWriteImage</strong></a> is the size in bytes of the 2D slice of the 3D region |
| of a 3D image or each image of a 1D or 2D image array being read or written |
| respectively. |
| This must be 0 if <em>image</em> is a 1D or 2D image. |
| Otherwise this value must be greater than or equal to <em>row_pitch</em> × |
| <em>height</em>. |
| If <em>slice_pitch</em> (or <em>input_slice_pitch</em>) is set to 0, the appropriate slice |
| pitch is calculated based on the <em>row_pitch</em> × <em>height</em>.</p> |
| </li> |
| <li> |
| <p><em>ptr</em> is the pointer to a buffer in host memory where image data is to be |
| read from or to be written to. |
| The alignment requirements for <em>ptr</em> are specified in |
| <a href="#alignment-app-data-types">Alignment of Application Data Types</a>.</p> |
| </li> |
| <li> |
| <p><em>event_wait_list</em> and <em>num_events_in_wait_list</em> specify events that need to |
| complete before this particular command can be executed. |
| If <em>event_wait_list</em> is <code>NULL</code>, then this particular command does not wait |
| on any event to complete. |
| If <em>event_wait_list</em> is <code>NULL</code>, <em>num_events_in_wait_list</em> must be 0. |
| If <em>event_wait_list</em> is not <code>NULL</code>, the list of events pointed to by |
| <em>event_wait_list</em> must be valid and <em>num_events_in_wait_list</em> must be |
| greater than 0. |
| The events specified in <em>event_wait_list</em> act as synchronization points. |
| The context associated with events in <em>event_wait_list</em> and <em>command_queue</em> |
| must be the same. |
| The memory associated with <em>event_wait_list</em> can be reused or freed after |
| the function returns.</p> |
| </li> |
| <li> |
| <p><em>event</em> returns an event object that identifies this read / write command |
| and can be used to query or queue a wait for this command to complete. |
| If <em>event</em> is <code>NULL</code> or the enqueue is unsuccessful, no event will be |
| created and therefore it will not be possible to query the status of this |
| command or to wait for this command to complete. |
| If <em>event_wait_list</em> and <em>event</em> are not <code>NULL</code>, <em>event</em> must not refer |
| to an element of the <em>event_wait_list</em> array.</p> |
| </li> |
| </ul> |
| </div> |
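<div class="paragraph">
<p>The pitch rules above can be sketched in plain C as follows. The function names are hypothetical and are not part of the OpenCL API. The sketch assumes that <em>width</em> and <em>height</em> are <em>region</em>[0] and <em>region</em>[1], and it does not enforce the rule that the slice pitch must be 0 for 1D and 2D images.</p>
</div>

```c
#include <stdbool.h>
#include <stddef.h>

/* Row pitch actually used: the caller's value, or element_size * width
 * when the caller passes 0. */
size_t effective_row_pitch(size_t row_pitch, size_t element_size, size_t width)
{
    return row_pitch != 0 ? row_pitch : element_size * width;
}

/* Slice pitch actually used: the caller's value, or row_pitch * height
 * when the caller passes 0. `row_pitch` must already be the effective one. */
size_t effective_slice_pitch(size_t slice_pitch, size_t row_pitch, size_t height)
{
    return slice_pitch != 0 ? slice_pitch : row_pitch * height;
}

/* A non-zero row pitch must be at least element_size * width. */
bool row_pitch_is_valid(size_t row_pitch, size_t element_size, size_t width)
{
    return row_pitch == 0 || row_pitch >= element_size * width;
}

/* A non-zero slice pitch must be at least row_pitch * height, where
 * `row_pitch` is the effective row pitch. */
bool slice_pitch_is_valid(size_t slice_pitch, size_t row_pitch, size_t height)
{
    return slice_pitch == 0 || slice_pitch >= row_pitch * height;
}
```

<div class="paragraph">
<p>With 4-byte elements and a 640 × 480 region, passing 0 for both pitches gives an effective row pitch of 2560 bytes and an effective slice pitch of 1228800 bytes in this sketch.</p>
</div>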
| <div class="paragraph"> |
| <p>If <em>blocking_read</em> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, i.e. the read command is blocking, |
| <a href="#clEnqueueReadImage"><strong>clEnqueueReadImage</strong></a> does not return until the image data has been read and |
| copied into memory pointed to by <em>ptr</em>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If <em>blocking_read</em> is <a href="#CL_FALSE"><code>CL_FALSE</code></a> i.e. the read command is non-blocking, |
| <a href="#clEnqueueReadImage"><strong>clEnqueueReadImage</strong></a> queues a non-blocking read command and returns. |
| The contents of the buffer that <em>ptr</em> points to cannot be used until the |
| read command has completed. |
| The <em>event</em> argument returns an event object which can be used to query the |
| execution status of the read command. |
| When the read command has completed, the contents of the buffer that <em>ptr</em> |
| points to can be used by the application.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If <em>blocking_write</em> is <a href="#CL_TRUE"><code>CL_TRUE</code></a>, the write command is blocking and does not |
| return until the command is complete, including transfer of the data. |
| The memory pointed to by <em>ptr</em> can be reused by the application after the |
| <a href="#clEnqueueWriteImage"><strong>clEnqueueWriteImage</strong></a> call returns.</p> |
| </div> |
| <div class="paragraph"> |
| <p>If <em>blocking_write</em> is <a href="#CL_FALSE"><code>CL_FALSE</code></a>, the OpenCL implementation will use <em>ptr</em> to |
| perform a non-blocking write. |
| Because the write is non-blocking, the implementation can return immediately. |
| The memory pointed to by <em>ptr</em> cannot be reused by the application as soon as the |
| call returns. |
| The <em>event</em> argument returns an event object which can be used to query the |
| execution status of the write command. |
| When the write command has completed, the memory pointed to by <em>ptr</em> can |
| then be reused by the application.</p> |
| </div> |
| <div class="paragraph"> |
| <p><a href="#clEnqueueReadImage"><strong>clEnqueueReadImage</strong></a> and <a href="#clEnqueueWriteImage"><strong>clEnqueueWriteImage</strong></a> return <a href="#CL_SUCCESS"><code>CL_SUCCESS</code></a> if the |
| function is executed successfully. |
| Otherwise, they return one of the following errors:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p><a href="#CL_INVALID_COMMAND_QUEUE"><code>CL_INVALID_<wbr>COMMAND_<wbr>QUEUE</code></a> if <em>command_queue</em> is not a valid host |
| command-queue.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_CONTEXT"><code>CL_INVALID_<wbr>CONTEXT</code></a> if the contexts associated with <em>command_queue</em> and |
| <em>image</em> are not the same or if the contexts associated with |
| <em>command_queue</em> and events in <em>event_wait_list</em> are not the same.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_MEM_OBJECT"><code>CL_INVALID_<wbr>MEM_<wbr>OBJECT</code></a> if <em>image</em> is not a valid image object.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>origin</em> or <em>region</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if the region being read or written specified by |
| <em>origin</em> and <em>region</em> is out of bounds.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if values in <em>origin</em> and <em>region</em> do not follow rules |
| described in the argument description for <em>origin</em> and <em>region</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_VALUE"><code>CL_INVALID_<wbr>VALUE</code></a> if <em>ptr</em> is <code>NULL</code>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_EVENT_WAIT_LIST"><code>CL_INVALID_<wbr>EVENT_<wbr>WAIT_<wbr>LIST</code></a> if <em>event_wait_list</em> is <code>NULL</code> and |
| <em>num_events_in_wait_list</em> > 0, or <em>event_wait_list</em> is not <code>NULL</code> and |
| <em>num_events_in_wait_list</em> is 0, or if event objects in <em>event_wait_list</em> |
| are not valid events.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_IMAGE_SIZE"><code>CL_INVALID_<wbr>IMAGE_<wbr>SIZE</code></a> if image dimensions (image width, height, |
| specified or computed row and/or slice pitch) for <em>image</em> are not |
| supported by the device associated with <em>command_queue</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_IMAGE_FORMAT_NOT_SUPPORTED"><code>CL_IMAGE_<wbr>FORMAT_<wbr>NOT_<wbr>SUPPORTED</code></a> if the image format (image channel order and |
| data type) for <em>image</em> is not supported by the device associated with |
| <em>command_queue</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_MEM_OBJECT_ALLOCATION_FAILURE"><code>CL_MEM_<wbr>OBJECT_<wbr>ALLOCATION_<wbr>FAILURE</code></a> if there is a failure to allocate |
| memory for data store associated with <em>image</em>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if the device associated with <em>command_queue</em> does |
| not support images (i.e. <a href="#CL_DEVICE_IMAGE_SUPPORT"><code>CL_DEVICE_<wbr>IMAGE_<wbr>SUPPORT</code></a> specified in the |
| <a href="#device-queries-table">Device Queries</a> table is <a href="#CL_FALSE"><code>CL_FALSE</code></a>).</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if <a href="#clEnqueueReadImage"><strong>clEnqueueReadImage</strong></a> is called on <em>image</em> which |
| has been created with <a href="#CL_MEM_HOST_WRITE_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>WRITE_<wbr>ONLY</code></a> or <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_INVALID_OPERATION"><code>CL_INVALID_<wbr>OPERATION</code></a> if <a href="#clEnqueueWriteImage"><strong>clEnqueueWriteImage</strong></a> is called on <em>image</em> which |
| has been created with <a href="#CL_MEM_HOST_READ_ONLY"><code>CL_MEM_<wbr>HOST_<wbr>READ_<wbr>ONLY</code></a> or <a href="#CL_MEM_HOST_NO_ACCESS"><code>CL_MEM_<wbr>HOST_<wbr>NO_<wbr>ACCESS</code></a>.</p> |
| </li> |
| <li> |
| <p><a href="#CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST"><code>CL_EXEC_<wbr>STATUS_<wbr>ERROR_<wbr>FOR_<wbr>EVENTS_<wbr>IN_<wbr>WAIT_<wbr>LIST</code></a> if the read and write |
| operations are blocking and the execution status of any of the events in |
| <em>event_wait_list</em> is a negative integer value. |
| This error code is <a href="#unified-spec">missing before</a> version 1.1.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_RESOURCES"><code>CL_OUT_<wbr>OF_<wbr>RESOURCES</code></a> if there is a failure to allocate resources required |
| by the OpenCL implementation on the device.</p> |
| </li> |
| <li> |
| <p><a href="#CL_OUT_OF_HOST_MEMORY"><code>CL_OUT_<wbr>OF_<wbr>HOST_<wbr>MEMORY</code></a> if there is a failure to allocate resources |
| required by the OpenCL implementation on the host.</p> |
| </li> |
| </ul> |
| </div> |
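<div class="paragraph">
<p>An application may want to check <em>origin</em> and <em>region</em> on the host before enqueueing, mirroring the <code>CL_INVALID_VALUE</code> conditions listed above. The following is a minimal sketch under stated assumptions: <code>image_kind</code>, <code>origin_region_ok</code> and the <code>dims</code> parameter are hypothetical and not part of the OpenCL API. <code>dims</code> is assumed to hold the image extent along each axis, with the array size stored on the image-index axis of an image array.</p>
</div>

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical image classification used only by this sketch. */
typedef enum {
    IMG_1D, IMG_1D_BUFFER, IMG_1D_ARRAY, IMG_2D, IMG_2D_ARRAY, IMG_3D
} image_kind;

/* Returns true if origin/region satisfy the argument rules described for
 * clEnqueueReadImage and clEnqueueWriteImage. dims[i] is the assumed extent
 * of the image along axis i. */
bool origin_region_ok(image_kind kind, const size_t dims[3],
                      const size_t origin[3], const size_t region[3])
{
    if (origin == NULL || region == NULL)
        return false;                        /* origin or region is NULL */

    for (int i = 0; i < 3; ++i) {
        if (region[i] == 0)
            return false;                    /* values in region cannot be 0 */
        /* Overflow-safe form of origin[i] + region[i] <= dims[i]. */
        if (origin[i] > dims[i] || region[i] > dims[i] - origin[i])
            return false;                    /* region is out of bounds */
    }

    switch (kind) {
    case IMG_1D:
    case IMG_1D_BUFFER:
        return origin[1] == 0 && origin[2] == 0 &&
               region[1] == 1 && region[2] == 1;
    case IMG_1D_ARRAY:                       /* origin[1] is the image index */
    case IMG_2D:
        return origin[2] == 0 && region[2] == 1;
    case IMG_2D_ARRAY:                       /* origin[2] is the image index */
    case IMG_3D:
        return true;
    }
    return false;
}
```

<div class="paragraph">
<p>A check of this kind only catches argument errors early; the implementation still performs its own validation and may return any of the errors listed above.</p>
</div>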
| <div class="admonitionblock note"> |
| <table> |
| <tr> |
| <td class="icon"> |
| <i class="fa icon-note" title="Note"></i> |
| </td> |
| <td class="content"> |
| <div class="paragraph"> |
| <p>Calling <a href="#clEnqueueReadImage"><strong>clEnqueueReadImage</strong></a> to read a region of the <em>image</em> with the <em>ptr</em> |
| argument value set to <em>host_ptr</em> + (<em>origin</em>[2] × <em>image slice pitch</em> |
| + <em>origin</em>[1] × <em>image row pitch</em> + <em>origin</em>[0] × <em>bytes |
| per pixel</em>), where <em>host_ptr</em> is a pointer to the memory region specified |
| when the <em>image</em> being read is created with <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_<wbr>USE_<wbr>HOST_<wbr>PTR</code></a>, must meet |
| the following requirements in order to avoid undefined behavior:</p> |
| </div> |
| <div class="ulist"> |
| <ul> |
| <li> |
| <p>All commands that use this image object have finished execution before |
| the read command begins execution.</p> |
| </li> |
| <li> |
| <p>The <em>row_pitch</em> and <em>slice_pitch</em> argument values in |
| <a href="#clEnqueueReadImage"><strong>clEnqueueReadImage</strong></a> must be set to the image row pitch and slice pitch.</p> |
| </li> |
| <li> |
| <p>The image object is not mapped.</p> |
| </li> |
| <li> |
| <p>The image object is not used by any command-queue until the read command |
| has finished execution.</p> |
| </li> |
| </ul> |
| </div> |
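<div class="paragraph">
<p>The <em>ptr</em> expression used in this note is ordinary byte arithmetic on <em>host_ptr</em>. The sketch below shows that arithmetic in isolation; <code>region_byte_offset</code> and <code>region_host_ptr</code> are hypothetical helper names, and the pitches passed in are assumed to be the image's own row and slice pitch in bytes.</p>
</div>

```c
#include <stddef.h>

/* Byte offset of the pixel at `origin` from the start of the host memory:
 * origin[2] * slice pitch + origin[1] * row pitch + origin[0] * bytes per pixel. */
size_t region_byte_offset(const size_t origin[3],
                          size_t image_row_pitch,
                          size_t image_slice_pitch,
                          size_t bytes_per_pixel)
{
    return origin[2] * image_slice_pitch
         + origin[1] * image_row_pitch
         + origin[0] * bytes_per_pixel;
}

/* The ptr value described in the note: host_ptr advanced by that offset.
 * The cast to unsigned char * makes the addition count in bytes. */
void *region_host_ptr(void *host_ptr, const size_t origin[3],
                      size_t image_row_pitch, size_t image_slice_pitch,
                      size_t bytes_per_pixel)
{
    return (unsigned char *)host_ptr
         + region_byte_offset(origin, image_row_pitch,
                              image_slice_pitch, bytes_per_pixel);
}
```

<div class="paragraph">
<p>For instance, with a row pitch of 2560 bytes, a slice pitch of 1228800 bytes, 4 bytes per pixel and an origin of (2, 3, 1), the offset is 1228800 + 7680 + 8 = 1236488 bytes.</p>
</div>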
| <div class="paragraph"> |
| <p>Calling <a href="#clEnqueueWriteImage"><strong>clEnqueueWriteImage</strong></a> to update the latest bits in a region of the |
| <em>image</em> with the <em>ptr</em> argument value set to <em>host_ptr</em> + (<em>origin</em>[2] |
| × <em>image slice pitch</em> + <em>origin</em>[1] × <em>image row pitch</em> + |
| <em>origin</em>[0] × <em>bytes per pixel</em>), where <em>host_ptr</em> is a pointer to the |
| memory region specified when the <em>image</em> being written is created with |
| <a href="#CL_MEM_USE_HOST_PTR"><code>CL_MEM_ |