<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<!--[if IE]><meta http-equiv="X-UA-Compatible" content="IE=edge"><![endif]-->
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="generator" content="Asciidoctor 1.5.7">
<meta name="author" content="Khronos&#174; OpenCL Working Group">
<title>The OpenCL&#8482; Specification</title>
<style>
/*! normalize.css v2.1.2 | MIT License | git.io/normalize */
/* ========================================================================== HTML5 display definitions ========================================================================== */
/** Correct `block` display not defined in IE 8/9. */
article, aside, details, figcaption, figure, footer, header, hgroup, main, nav, section, summary { display: block; }
/** Correct `inline-block` display not defined in IE 8/9. */
audio, canvas, video { display: inline-block; }
/** Prevent modern browsers from displaying `audio` without controls. Remove excess height in iOS 5 devices. */
audio:not([controls]) { display: none; height: 0; }
/** Address `[hidden]` styling not present in IE 8/9. Hide the `template` element in IE, Safari, and Firefox < 22. */
[hidden], template { display: none; }
script { display: none !important; }
/* ========================================================================== Base ========================================================================== */
/** 1. Set default font family to sans-serif. 2. Prevent iOS text size adjust after orientation change, without disabling user zoom. */
html { font-family: sans-serif; /* 1 */ -ms-text-size-adjust: 100%; /* 2 */ -webkit-text-size-adjust: 100%; /* 2 */ }
/** Remove default margin. */
body { margin: 0; }
/* ========================================================================== Links ========================================================================== */
/** Remove the gray background color from active links in IE 10. */
a { background: transparent; }
/** Address `outline` inconsistency between Chrome and other browsers. */
a:focus { outline: thin dotted; }
/** Improve readability when focused and also mouse hovered in all browsers. */
a:active, a:hover { outline: 0; }
/* ========================================================================== Typography ========================================================================== */
/** Address variable `h1` font-size and margin within `section` and `article` contexts in Firefox 4+, Safari 5, and Chrome. */
h1 { font-size: 2em; margin: 0.67em 0; }
/** Address styling not present in IE 8/9, Safari 5, and Chrome. */
abbr[title] { border-bottom: 1px dotted; }
/** Address style set to `bolder` in Firefox 4+, Safari 5, and Chrome. */
b, strong { font-weight: bold; }
/** Address styling not present in Safari 5 and Chrome. */
dfn { font-style: italic; }
/** Address differences between Firefox and other browsers. */
hr { -moz-box-sizing: content-box; box-sizing: content-box; height: 0; }
/** Address styling not present in IE 8/9. */
mark { background: #ff0; color: #000; }
/** Correct font family set oddly in Safari 5 and Chrome. */
code, kbd, pre, samp { font-family: monospace, serif; font-size: 1em; }
/** Improve readability of pre-formatted text in all browsers. */
pre { white-space: pre-wrap; }
/** Set consistent quote types. */
q { quotes: "\201C" "\201D" "\2018" "\2019"; }
/** Address inconsistent and variable font size in all browsers. */
small { font-size: 80%; }
/** Prevent `sub` and `sup` affecting `line-height` in all browsers. */
sub, sup { font-size: 75%; line-height: 0; position: relative; vertical-align: baseline; }
sup { top: -0.5em; }
sub { bottom: -0.25em; }
/* ========================================================================== Embedded content ========================================================================== */
/** Remove border when inside `a` element in IE 8/9. */
img { border: 0; }
/** Correct overflow displayed oddly in IE 9. */
svg:not(:root) { overflow: hidden; }
/* ========================================================================== Figures ========================================================================== */
/** Address margin not present in IE 8/9 and Safari 5. */
figure { margin: 0; }
/* ========================================================================== Forms ========================================================================== */
/** Define consistent border, margin, and padding. */
fieldset { border: 1px solid #c0c0c0; margin: 0 2px; padding: 0.35em 0.625em 0.75em; }
/** 1. Correct `color` not being inherited in IE 8/9. 2. Remove padding so people aren't caught out if they zero out fieldsets. */
legend { border: 0; /* 1 */ padding: 0; /* 2 */ }
/** 1. Correct font family not being inherited in all browsers. 2. Correct font size not being inherited in all browsers. 3. Address margins set differently in Firefox 4+, Safari 5, and Chrome. */
button, input, select, textarea { font-family: inherit; /* 1 */ font-size: 100%; /* 2 */ margin: 0; /* 3 */ }
/** Address Firefox 4+ setting `line-height` on `input` using `!important` in the UA stylesheet. */
button, input { line-height: normal; }
/** Address inconsistent `text-transform` inheritance for `button` and `select`. All other form control elements do not inherit `text-transform` values. Correct `button` style inheritance in Chrome, Safari 5+, and IE 8+. Correct `select` style inheritance in Firefox 4+ and Opera. */
button, select { text-transform: none; }
/** 1. Avoid the WebKit bug in Android 4.0.* where (2) destroys native `audio` and `video` controls. 2. Correct inability to style clickable `input` types in iOS. 3. Improve usability and consistency of cursor style between image-type `input` and others. */
button, html input[type="button"], input[type="reset"], input[type="submit"] { -webkit-appearance: button; /* 2 */ cursor: pointer; /* 3 */ }
/** Re-set default cursor for disabled elements. */
button[disabled], html input[disabled] { cursor: default; }
/** 1. Address box sizing set to `content-box` in IE 8/9. 2. Remove excess padding in IE 8/9. */
input[type="checkbox"], input[type="radio"] { box-sizing: border-box; /* 1 */ padding: 0; /* 2 */ }
/** 1. Address `appearance` set to `searchfield` in Safari 5 and Chrome. 2. Address `box-sizing` set to `border-box` in Safari 5 and Chrome (include `-moz` to future-proof). */
input[type="search"] { -webkit-appearance: textfield; /* 1 */ -moz-box-sizing: content-box; -webkit-box-sizing: content-box; /* 2 */ box-sizing: content-box; }
/** Remove inner padding and search cancel button in Safari 5 and Chrome on OS X. */
input[type="search"]::-webkit-search-cancel-button, input[type="search"]::-webkit-search-decoration { -webkit-appearance: none; }
/** Remove inner padding and border in Firefox 4+. */
button::-moz-focus-inner, input::-moz-focus-inner { border: 0; padding: 0; }
/** 1. Remove default vertical scrollbar in IE 8/9. 2. Improve readability and alignment in all browsers. */
textarea { overflow: auto; /* 1 */ vertical-align: top; /* 2 */ }
/* ========================================================================== Tables ========================================================================== */
/** Remove most spacing between table cells. */
table { border-collapse: collapse; border-spacing: 0; }
meta.foundation-mq-small { font-family: "only screen and (min-width: 768px)"; width: 768px; }
meta.foundation-mq-medium { font-family: "only screen and (min-width:1280px)"; width: 1280px; }
meta.foundation-mq-large { font-family: "only screen and (min-width:1440px)"; width: 1440px; }
*, *:before, *:after { -moz-box-sizing: border-box; -webkit-box-sizing: border-box; box-sizing: border-box; }
html, body { font-size: 100%; }
body { background: white; color: #222222; padding: 0; margin: 0; font-family: "Helvetica Neue", "Helvetica", Helvetica, Arial, sans-serif; font-weight: normal; font-style: normal; line-height: 1; position: relative; cursor: auto; }
a:hover { cursor: pointer; }
img, object, embed { max-width: 100%; height: auto; }
object, embed { height: 100%; }
img { -ms-interpolation-mode: bicubic; }
#map_canvas img, #map_canvas embed, #map_canvas object, .map_canvas img, .map_canvas embed, .map_canvas object { max-width: none !important; }
.left { float: left !important; }
.right { float: right !important; }
.text-left { text-align: left !important; }
.text-right { text-align: right !important; }
.text-center { text-align: center !important; }
.text-justify { text-align: justify !important; }
.hide { display: none; }
.antialiased { -webkit-font-smoothing: antialiased; }
img { display: inline-block; vertical-align: middle; }
textarea { height: auto; min-height: 50px; }
select { width: 100%; }
object, svg { display: inline-block; vertical-align: middle; }
.center { margin-left: auto; margin-right: auto; }
.spread { width: 100%; }
p.lead, .paragraph.lead > p, #preamble > .sectionbody > .paragraph:first-of-type p { font-size: 1.21875em; line-height: 1.6; }
.subheader, .admonitionblock td.content > .title, .audioblock > .title, .exampleblock > .title, .imageblock > .title, .listingblock > .title, .literalblock > .title, .stemblock > .title, .openblock > .title, .paragraph > .title, .quoteblock > .title, table.tableblock > .title, .verseblock > .title, .videoblock > .title, .dlist > .title, .olist > .title, .ulist > .title, .qlist > .title, .hdlist > .title { line-height: 1.4; color: black; font-weight: 300; margin-top: 0.2em; margin-bottom: 0.5em; }
/* Typography resets */
div, dl, dt, dd, ul, ol, li, h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6, pre, form, p, blockquote, th, td { margin: 0; padding: 0; direction: ltr; }
/* Default Link Styles */
a { color: #0068b0; text-decoration: none; line-height: inherit; }
a:hover, a:focus { color: #333333; }
a img { border: none; }
/* Default paragraph styles */
p { font-family: Noto, sans-serif; font-weight: normal; font-size: 1em; line-height: 1.6; margin-bottom: 0.75em; text-rendering: optimizeLegibility; }
p aside { font-size: 0.875em; line-height: 1.35; font-style: italic; }
/* Default header styles */
h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6 { font-family: Noto, sans-serif; font-weight: normal; font-style: normal; color: black; text-rendering: optimizeLegibility; margin-top: 0.5em; margin-bottom: 0.5em; line-height: 1.2125em; }
h1 small, h2 small, h3 small, #toctitle small, .sidebarblock > .content > .title small, h4 small, h5 small, h6 small { font-size: 60%; color: #4d4d4d; line-height: 0; }
h1 { font-size: 2.125em; }
h2 { font-size: 1.6875em; }
h3, #toctitle, .sidebarblock > .content > .title { font-size: 1.375em; }
h4 { font-size: 1.125em; }
h5 { font-size: 1.125em; }
h6 { font-size: 1em; }
hr { border: solid #dddddd; border-width: 1px 0 0; clear: both; margin: 1.25em 0 1.1875em; height: 0; }
/* Helpful Typography Defaults */
em, i { font-style: italic; line-height: inherit; }
strong, b { font-weight: bold; line-height: inherit; }
small { font-size: 60%; line-height: inherit; }
code { font-family: Consolas, "Liberation Mono", Courier, monospace; font-weight: normal; color: #264357; }
/* Lists */
ul, ol, dl { font-size: 1em; line-height: 1.6; margin-bottom: 0.75em; list-style-position: outside; font-family: Noto, sans-serif; }
ul, ol { margin-left: 1.5em; }
ul.no-bullet, ol.no-bullet { margin-left: 1.5em; }
/* Unordered Lists */
ul li ul, ul li ol { margin-left: 1.25em; margin-bottom: 0; font-size: 1em; /* Override nested font-size change */ }
ul.square li ul, ul.circle li ul, ul.disc li ul { list-style: inherit; }
ul.square { list-style-type: square; }
ul.circle { list-style-type: circle; }
ul.disc { list-style-type: disc; }
ul.no-bullet { list-style: none; }
/* Ordered Lists */
ol li ul, ol li ol { margin-left: 1.25em; margin-bottom: 0; }
/* Definition Lists */
dl dt { margin-bottom: 0.3em; font-weight: bold; }
dl dd { margin-bottom: 0.75em; }
/* Abbreviations */
abbr, acronym { text-transform: uppercase; font-size: 90%; color: black; border-bottom: 1px dotted #dddddd; cursor: help; }
abbr { text-transform: none; }
/* Blockquotes */
blockquote { margin: 0 0 0.75em; padding: 0.5625em 1.25em 0 1.1875em; border-left: 1px solid #dddddd; }
blockquote cite { display: block; font-size: 0.8125em; color: #5e93b8; }
blockquote cite:before { content: "\2014 \0020"; }
blockquote cite a, blockquote cite a:visited { color: #5e93b8; }
blockquote, blockquote p { line-height: 1.6; color: #333333; }
/* Microformats */
.vcard { display: inline-block; margin: 0 0 1.25em 0; border: 1px solid #dddddd; padding: 0.625em 0.75em; }
.vcard li { margin: 0; display: block; }
.vcard .fn { font-weight: bold; font-size: 0.9375em; }
.vevent .summary { font-weight: bold; }
.vevent abbr { cursor: auto; text-decoration: none; font-weight: bold; border: none; padding: 0 0.0625em; }
@media only screen and (min-width: 768px) { h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6 { line-height: 1.4; }
h1 { font-size: 2.75em; }
h2 { font-size: 2.3125em; }
h3, #toctitle, .sidebarblock > .content > .title { font-size: 1.6875em; }
h4 { font-size: 1.4375em; } }
/* Tables */
table { background: white; margin-bottom: 1.25em; border: solid 1px #d8d8ce; }
table thead, table tfoot { background: -webkit-linear-gradient(top, #add386, #90b66a); font-weight: bold; }
table thead tr th, table thead tr td, table tfoot tr th, table tfoot tr td { padding: 0.5em 0.625em 0.625em; font-size: inherit; color: white; text-align: left; }
table tr th, table tr td { padding: 0.5625em 0.625em; font-size: inherit; color: #6d6e71; }
table tr.even, table tr.alt, table tr:nth-of-type(even) { background: #edf2f2; }
table thead tr th, table tfoot tr th, table tbody tr td, table tr td, table tfoot tr td { display: table-cell; line-height: 1.4; }
body { -moz-osx-font-smoothing: grayscale; -webkit-font-smoothing: antialiased; tab-size: 4; }
h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6 { line-height: 1.4; }
a:hover, a:focus { text-decoration: underline; }
.clearfix:before, .clearfix:after, .float-group:before, .float-group:after { content: " "; display: table; }
.clearfix:after, .float-group:after { clear: both; }
*:not(pre) > code { font-size: inherit; font-style: normal !important; letter-spacing: 0; padding: 0; background-color: white; -webkit-border-radius: 0; border-radius: 0; line-height: inherit; word-wrap: break-word; }
*:not(pre) > code.nobreak { word-wrap: normal; }
*:not(pre) > code.nowrap { white-space: nowrap; }
pre, pre > code { line-height: 1.6; color: #264357; font-family: Consolas, "Liberation Mono", Courier, monospace; font-weight: normal; }
em em { font-style: normal; }
strong strong { font-weight: normal; }
.keyseq { color: #333333; }
kbd { font-family: Consolas, "Liberation Mono", Courier, monospace; display: inline-block; color: black; font-size: 0.65em; line-height: 1.45; background-color: #f7f7f7; border: 1px solid #ccc; -webkit-border-radius: 3px; border-radius: 3px; -webkit-box-shadow: 0 1px 0 rgba(0, 0, 0, 0.2), 0 0 0 0.1em white inset; box-shadow: 0 1px 0 rgba(0, 0, 0, 0.2), 0 0 0 0.1em white inset; margin: 0 0.15em; padding: 0.2em 0.5em; vertical-align: middle; position: relative; top: -0.1em; white-space: nowrap; }
.keyseq kbd:first-child { margin-left: 0; }
.keyseq kbd:last-child { margin-right: 0; }
.menuseq, .menuref { color: #000; }
.menuseq b:not(.caret), .menuref { font-weight: inherit; }
.menuseq { word-spacing: -0.02em; }
.menuseq b.caret { font-size: 1.25em; line-height: 0.8; }
.menuseq i.caret { font-weight: bold; text-align: center; width: 0.45em; }
b.button:before, b.button:after { position: relative; top: -1px; font-weight: normal; }
b.button:before { content: "["; padding: 0 3px 0 2px; }
b.button:after { content: "]"; padding: 0 2px 0 3px; }
#header, #content, #footnotes, #footer { width: 100%; margin-left: auto; margin-right: auto; margin-top: 0; margin-bottom: 0; max-width: 62.5em; *zoom: 1; position: relative; padding-left: 1.5em; padding-right: 1.5em; }
#header:before, #header:after, #content:before, #content:after, #footnotes:before, #footnotes:after, #footer:before, #footer:after { content: " "; display: table; }
#header:after, #content:after, #footnotes:after, #footer:after { clear: both; }
#content { margin-top: 1.25em; }
#content:before { content: none; }
#header > h1:first-child { color: black; margin-top: 2.25rem; margin-bottom: 0; }
#header > h1:first-child + #toc { margin-top: 8px; border-top: 1px solid #dddddd; }
#header > h1:only-child, body.toc2 #header > h1:nth-last-child(2) { border-bottom: 1px solid #dddddd; padding-bottom: 8px; }
#header .details { border-bottom: 1px solid #dddddd; line-height: 1.45; padding-top: 0.25em; padding-bottom: 0.25em; padding-left: 0.25em; color: #5e93b8; display: -ms-flexbox; display: -webkit-flex; display: flex; -ms-flex-flow: row wrap; -webkit-flex-flow: row wrap; flex-flow: row wrap; }
#header .details span:first-child { margin-left: -0.125em; }
#header .details span.email a { color: #333333; }
#header .details br { display: none; }
#header .details br + span:before { content: "\00a0\2013\00a0"; }
#header .details br + span.author:before { content: "\00a0\22c5\00a0"; color: #333333; }
#header .details br + span#revremark:before { content: "\00a0|\00a0"; }
#header #revnumber { text-transform: capitalize; }
#header #revnumber:after { content: "\00a0"; }
#content > h1:first-child:not([class]) { color: black; border-bottom: 1px solid #dddddd; padding-bottom: 8px; margin-top: 0; padding-top: 1rem; margin-bottom: 1.25rem; }
#toc { border-bottom: 0 solid #dddddd; padding-bottom: 0.5em; }
#toc > ul { margin-left: 0.125em; }
#toc ul.sectlevel0 > li > a { font-style: italic; }
#toc ul.sectlevel0 ul.sectlevel1 { margin: 0.5em 0; }
#toc ul { font-family: Noto, sans-serif; list-style-type: none; }
#toc li { line-height: 1.3334; margin-top: 0.3334em; }
#toc a { text-decoration: none; }
#toc a:active { text-decoration: underline; }
#toctitle { color: black; font-size: 1.2em; }
@media only screen and (min-width: 768px) { #toctitle { font-size: 1.375em; }
body.toc2 { padding-left: 15em; padding-right: 0; }
#toc.toc2 { margin-top: 0 !important; background-color: white; position: fixed; width: 15em; left: 0; top: 0; border-right: 1px solid #dddddd; border-top-width: 0 !important; border-bottom-width: 0 !important; z-index: 1000; padding: 1.25em 1em; height: 100%; overflow: auto; }
#toc.toc2 #toctitle { margin-top: 0; margin-bottom: 0.8rem; font-size: 1.2em; }
#toc.toc2 > ul { font-size: 0.9em; margin-bottom: 0; }
#toc.toc2 ul ul { margin-left: 0; padding-left: 1em; }
#toc.toc2 ul.sectlevel0 ul.sectlevel1 { padding-left: 0; margin-top: 0.5em; margin-bottom: 0.5em; }
body.toc2.toc-right { padding-left: 0; padding-right: 15em; }
body.toc2.toc-right #toc.toc2 { border-right-width: 0; border-left: 1px solid #dddddd; left: auto; right: 0; } }
@media only screen and (min-width: 1280px) { body.toc2 { padding-left: 20em; padding-right: 0; }
#toc.toc2 { width: 20em; }
#toc.toc2 #toctitle { font-size: 1.375em; }
#toc.toc2 > ul { font-size: 0.95em; }
#toc.toc2 ul ul { padding-left: 1.25em; }
body.toc2.toc-right { padding-left: 0; padding-right: 20em; } }
#content #toc { border-style: solid; border-width: 1px; border-color: #e6e6e6; margin-bottom: 1.25em; padding: 1.25em; background: white; -webkit-border-radius: 0; border-radius: 0; }
#content #toc > :first-child { margin-top: 0; }
#content #toc > :last-child { margin-bottom: 0; }
#footer { max-width: 100%; background-color: transparent; padding: 1.25em; }
#footer-text { color: black; line-height: 1.44; }
#content { margin-bottom: 0.625em; }
.sect1 { padding-bottom: 0.625em; }
@media only screen and (min-width: 768px) { #content { margin-bottom: 1.25em; }
.sect1 { padding-bottom: 1.25em; } }
.sect1:last-child { padding-bottom: 0; }
.sect1 + .sect1 { border-top: 0 solid #dddddd; }
#content h1 > a.anchor, h2 > a.anchor, h3 > a.anchor, #toctitle > a.anchor, .sidebarblock > .content > .title > a.anchor, h4 > a.anchor, h5 > a.anchor, h6 > a.anchor { position: absolute; z-index: 1001; width: 1.5ex; margin-left: -1.5ex; display: block; text-decoration: none !important; visibility: hidden; text-align: center; font-weight: normal; }
#content h1 > a.anchor:before, h2 > a.anchor:before, h3 > a.anchor:before, #toctitle > a.anchor:before, .sidebarblock > .content > .title > a.anchor:before, h4 > a.anchor:before, h5 > a.anchor:before, h6 > a.anchor:before { content: "\00A7"; font-size: 0.85em; display: block; padding-top: 0.1em; }
#content h1:hover > a.anchor, #content h1 > a.anchor:hover, h2:hover > a.anchor, h2 > a.anchor:hover, h3:hover > a.anchor, #toctitle:hover > a.anchor, .sidebarblock > .content > .title:hover > a.anchor, h3 > a.anchor:hover, #toctitle > a.anchor:hover, .sidebarblock > .content > .title > a.anchor:hover, h4:hover > a.anchor, h4 > a.anchor:hover, h5:hover > a.anchor, h5 > a.anchor:hover, h6:hover > a.anchor, h6 > a.anchor:hover { visibility: visible; }
#content h1 > a.link, h2 > a.link, h3 > a.link, #toctitle > a.link, .sidebarblock > .content > .title > a.link, h4 > a.link, h5 > a.link, h6 > a.link { color: black; text-decoration: none; }
#content h1 > a.link:hover, h2 > a.link:hover, h3 > a.link:hover, #toctitle > a.link:hover, .sidebarblock > .content > .title > a.link:hover, h4 > a.link:hover, h5 > a.link:hover, h6 > a.link:hover { color: black; }
.audioblock, .imageblock, .literalblock, .listingblock, .stemblock, .videoblock { margin-bottom: 1.25em; }
.admonitionblock td.content > .title, .audioblock > .title, .exampleblock > .title, .imageblock > .title, .listingblock > .title, .literalblock > .title, .stemblock > .title, .openblock > .title, .paragraph > .title, .quoteblock > .title, table.tableblock > .title, .verseblock > .title, .videoblock > .title, .dlist > .title, .olist > .title, .ulist > .title, .qlist > .title, .hdlist > .title { text-rendering: optimizeLegibility; text-align: left; }
table.tableblock > caption.title { white-space: nowrap; overflow: visible; max-width: 0; }
.paragraph.lead > p, #preamble > .sectionbody > .paragraph:first-of-type p { color: black; }
table.tableblock #preamble > .sectionbody > .paragraph:first-of-type p { font-size: inherit; }
.admonitionblock > table { border-collapse: separate; border: 0; background: none; width: 100%; }
.admonitionblock > table td.icon { text-align: center; width: 80px; }
.admonitionblock > table td.icon img { max-width: initial; }
.admonitionblock > table td.icon .title { font-weight: bold; font-family: Noto, sans-serif; text-transform: uppercase; }
.admonitionblock > table td.content { padding-left: 1.125em; padding-right: 1.25em; border-left: 1px solid #dddddd; color: #5e93b8; }
.admonitionblock > table td.content > :last-child > :last-child { margin-bottom: 0; }
.exampleblock > .content { border-style: solid; border-width: 1px; border-color: #e6e6e6; margin-bottom: 1.25em; padding: 1.25em; background: white; -webkit-border-radius: 0; border-radius: 0; }
.exampleblock > .content > :first-child { margin-top: 0; }
.exampleblock > .content > :last-child { margin-bottom: 0; }
.sidebarblock { border-style: solid; border-width: 1px; border-color: #e6e6e6; margin-bottom: 1.25em; padding: 1.25em; background: white; -webkit-border-radius: 0; border-radius: 0; }
.sidebarblock > :first-child { margin-top: 0; }
.sidebarblock > :last-child { margin-bottom: 0; }
.sidebarblock > .content > .title { color: black; margin-top: 0; }
.exampleblock > .content > :last-child > :last-child, .exampleblock > .content .olist > ol > li:last-child > :last-child, .exampleblock > .content .ulist > ul > li:last-child > :last-child, .exampleblock > .content .qlist > ol > li:last-child > :last-child, .sidebarblock > .content > :last-child > :last-child, .sidebarblock > .content .olist > ol > li:last-child > :last-child, .sidebarblock > .content .ulist > ul > li:last-child > :last-child, .sidebarblock > .content .qlist > ol > li:last-child > :last-child { margin-bottom: 0; }
.literalblock pre, .listingblock pre:not(.highlight), .listingblock pre[class="highlight"], .listingblock pre[class^="highlight "], .listingblock pre.CodeRay, .listingblock pre.prettyprint { background: #eeeeee; }
.sidebarblock .literalblock pre, .sidebarblock .listingblock pre:not(.highlight), .sidebarblock .listingblock pre[class="highlight"], .sidebarblock .listingblock pre[class^="highlight "], .sidebarblock .listingblock pre.CodeRay, .sidebarblock .listingblock pre.prettyprint { background: #f2f1f1; }
.literalblock pre, .literalblock pre[class], .listingblock pre, .listingblock pre[class] { border: 1px hidden #666666; -webkit-border-radius: 0; border-radius: 0; word-wrap: break-word; padding: 1.25em 1.5625em 1.125em 1.5625em; font-size: 0.8125em; }
.literalblock pre.nowrap, .literalblock pre[class].nowrap, .listingblock pre.nowrap, .listingblock pre[class].nowrap { overflow-x: auto; white-space: pre; word-wrap: normal; }
@media only screen and (min-width: 768px) { .literalblock pre, .literalblock pre[class], .listingblock pre, .listingblock pre[class] { font-size: 0.90625em; } }
@media only screen and (min-width: 1280px) { .literalblock pre, .literalblock pre[class], .listingblock pre, .listingblock pre[class] { font-size: 1em; } }
.literalblock.output pre { color: #eeeeee; background-color: #264357; }
.listingblock pre.highlightjs { padding: 0; }
.listingblock pre.highlightjs > code { padding: 1.25em 1.5625em 1.125em 1.5625em; -webkit-border-radius: 0; border-radius: 0; }
.listingblock > .content { position: relative; }
.listingblock code[data-lang]:before { display: none; content: attr(data-lang); position: absolute; font-size: 0.75em; top: 0.425rem; right: 0.5rem; line-height: 1; text-transform: uppercase; color: #999; }
.listingblock:hover code[data-lang]:before { display: block; }
.listingblock.terminal pre .command:before { content: attr(data-prompt); padding-right: 0.5em; color: #999; }
.listingblock.terminal pre .command:not([data-prompt]):before { content: "$"; }
table.pyhltable { border-collapse: separate; border: 0; margin-bottom: 0; background: none; }
table.pyhltable td { vertical-align: top; padding-top: 0; padding-bottom: 0; line-height: 1.6; }
table.pyhltable td.code { padding-left: .75em; padding-right: 0; }
pre.pygments .lineno, table.pyhltable td:not(.code) { color: #999; padding-left: 0; padding-right: .5em; border-right: 1px solid #dddddd; }
pre.pygments .lineno { display: inline-block; margin-right: .25em; }
table.pyhltable .linenodiv { background: none !important; padding-right: 0 !important; }
.quoteblock { margin: 0 1em 0.75em 1.5em; display: table; }
.quoteblock > .title { margin-left: -1.5em; margin-bottom: 0.75em; }
.quoteblock blockquote, .quoteblock blockquote p { color: #333333; font-size: 1.15rem; line-height: 1.75; word-spacing: 0.1em; letter-spacing: 0; font-style: italic; text-align: justify; }
.quoteblock blockquote { margin: 0; padding: 0; border: 0; }
.quoteblock blockquote:before { content: "\201c"; float: left; font-size: 2.75em; font-weight: bold; line-height: 0.6em; margin-left: -0.6em; color: black; text-shadow: 0 1px 2px rgba(0, 0, 0, 0.1); }
.quoteblock blockquote > .paragraph:last-child p { margin-bottom: 0; }
.quoteblock .attribution { margin-top: 0.5em; margin-right: 0.5ex; text-align: right; }
.quoteblock .quoteblock { margin-left: 0; margin-right: 0; padding: 0.5em 0; border-left: 3px solid #5e93b8; }
.quoteblock .quoteblock blockquote { padding: 0 0 0 0.75em; }
.quoteblock .quoteblock blockquote:before { display: none; }
.verseblock { margin: 0 1em 0.75em 1em; }
.verseblock pre { font-family: "Open Sans", "DejaVu Sans", sans-serif; font-size: 1.15rem; color: #333333; font-weight: 300; text-rendering: optimizeLegibility; }
.verseblock pre strong { font-weight: 400; }
.verseblock .attribution { margin-top: 1.25rem; margin-left: 0.5ex; }
.quoteblock .attribution, .verseblock .attribution { font-size: 0.8125em; line-height: 1.45; font-style: italic; }
.quoteblock .attribution br, .verseblock .attribution br { display: none; }
.quoteblock .attribution cite, .verseblock .attribution cite { display: block; letter-spacing: -0.025em; color: #5e93b8; }
.quoteblock.abstract { margin: 0 0 0.75em 0; display: block; }
.quoteblock.abstract blockquote, .quoteblock.abstract blockquote p { text-align: left; word-spacing: 0; }
.quoteblock.abstract blockquote:before, .quoteblock.abstract blockquote p:first-of-type:before { display: none; }
table.tableblock { max-width: 100%; border-collapse: separate; }
table.tableblock td > .paragraph:last-child p > p:last-child, table.tableblock th > p:last-child, table.tableblock td > p:last-child { margin-bottom: 0; }
table.tableblock, th.tableblock, td.tableblock { border: 0 solid #d8d8ce; }
table.grid-all > thead > tr > .tableblock, table.grid-all > tbody > tr > .tableblock { border-width: 0 1px 1px 0; }
table.grid-all > tfoot > tr > .tableblock { border-width: 1px 1px 0 0; }
table.grid-cols > * > tr > .tableblock { border-width: 0 1px 0 0; }
table.grid-rows > thead > tr > .tableblock, table.grid-rows > tbody > tr > .tableblock { border-width: 0 0 1px 0; }
table.grid-rows > tfoot > tr > .tableblock { border-width: 1px 0 0 0; }
table.grid-all > * > tr > .tableblock:last-child, table.grid-cols > * > tr > .tableblock:last-child { border-right-width: 0; }
table.grid-all > tbody > tr:last-child > .tableblock, table.grid-all > thead:last-child > tr > .tableblock, table.grid-rows > tbody > tr:last-child > .tableblock, table.grid-rows > thead:last-child > tr > .tableblock { border-bottom-width: 0; }
table.frame-all { border-width: 1px; }
table.frame-sides { border-width: 0 1px; }
table.frame-topbot { border-width: 1px 0; }
th.halign-left, td.halign-left { text-align: left; }
th.halign-right, td.halign-right { text-align: right; }
th.halign-center, td.halign-center { text-align: center; }
th.valign-top, td.valign-top { vertical-align: top; }
th.valign-bottom, td.valign-bottom { vertical-align: bottom; }
th.valign-middle, td.valign-middle { vertical-align: middle; }
table thead th, table tfoot th { font-weight: bold; }
tbody tr th { display: table-cell; line-height: 1.4; background: -webkit-linear-gradient(top, #add386, #90b66a); }
tbody tr th, tbody tr th p, tfoot tr th, tfoot tr th p { color: white; font-weight: bold; }
p.tableblock > code:only-child { background: none; padding: 0; }
p.tableblock { font-size: 1em; }
td > div.verse { white-space: pre; }
ol { margin-left: 1.75em; }
ul li ol { margin-left: 1.5em; }
dl dd { margin-left: 1.125em; }
dl dd:last-child, dl dd:last-child > :last-child { margin-bottom: 0; }
ol > li p, ul > li p, ul dd, ol dd, .olist .olist, .ulist .ulist, .ulist .olist, .olist .ulist { margin-bottom: 0.375em; }
ul.checklist, ul.none, ol.none, ul.no-bullet, ol.no-bullet, ol.unnumbered, ul.unstyled, ol.unstyled { list-style-type: none; }
ul.no-bullet, ol.no-bullet, ol.unnumbered { margin-left: 0.625em; }
ul.unstyled, ol.unstyled { margin-left: 0; }
ul.checklist { margin-left: 0.625em; }
ul.checklist li > p:first-child > .fa-square-o:first-child, ul.checklist li > p:first-child > .fa-check-square-o:first-child { width: 1.25em; font-size: 0.8em; position: relative; bottom: 0.125em; }
ul.checklist li > p:first-child > input[type="checkbox"]:first-child { margin-right: 0.25em; }
ul.inline { display: -ms-flexbox; display: -webkit-box; display: flex; -ms-flex-flow: row wrap; -webkit-flex-flow: row wrap; flex-flow: row wrap; list-style: none; margin: 0 0 0.375em -0.75em; }
ul.inline > li { margin-left: 0.75em; }
.unstyled dl dt { font-weight: normal; font-style: normal; }
ol.arabic { list-style-type: decimal; }
ol.decimal { list-style-type: decimal-leading-zero; }
ol.loweralpha { list-style-type: lower-alpha; }
ol.upperalpha { list-style-type: upper-alpha; }
ol.lowerroman { list-style-type: lower-roman; }
ol.upperroman { list-style-type: upper-roman; }
ol.lowergreek { list-style-type: lower-greek; }
.hdlist > table, .colist > table { border: 0; background: none; }
.hdlist > table > tbody > tr, .colist > table > tbody > tr { background: none; }
td.hdlist1, td.hdlist2 { vertical-align: top; padding: 0 0.625em; }
td.hdlist1 { font-weight: bold; padding-bottom: 0.75em; }
.literalblock + .colist, .listingblock + .colist { margin-top: -0.5em; }
.colist > table tr > td:first-of-type { padding: 0.4em 0.75em 0 0.75em; line-height: 1; vertical-align: top; }
.colist > table tr > td:first-of-type img { max-width: initial; }
.colist > table tr > td:last-of-type { padding: 0.25em 0; }
.thumb, .th { line-height: 0; display: inline-block; border: solid 4px white; -webkit-box-shadow: 0 0 0 1px #dddddd; box-shadow: 0 0 0 1px #dddddd; }
.imageblock.left, .imageblock[style*="float: left"] { margin: 0.25em 0.625em 1.25em 0; }
.imageblock.right, .imageblock[style*="float: right"] { margin: 0.25em 0 1.25em 0.625em; }
.imageblock > .title { margin-bottom: 0; }
.imageblock.thumb, .imageblock.th { border-width: 6px; }
.imageblock.thumb > .title, .imageblock.th > .title { padding: 0 0.125em; }
.image.left, .image.right { margin-top: 0.25em; margin-bottom: 0.25em; display: inline-block; line-height: 0; }
.image.left { margin-right: 0.625em; }
.image.right { margin-left: 0.625em; }
a.image { text-decoration: none; display: inline-block; }
a.image object { pointer-events: none; }
sup.footnote, sup.footnoteref { font-size: 0.875em; position: static; vertical-align: super; }
sup.footnote a, sup.footnoteref a { text-decoration: none; }
sup.footnote a:active, sup.footnoteref a:active { text-decoration: underline; }
#footnotes { padding-top: 0.75em; padding-bottom: 0.75em; margin-bottom: 0.625em; }
#footnotes hr { width: 20%; min-width: 6.25em; margin: -0.25em 0 0.75em 0; border-width: 1px 0 0 0; }
#footnotes .footnote { padding: 0 0.375em 0 0.225em; line-height: 1.3334; font-size: 0.875em; margin-left: 1.2em; margin-bottom: 0.2em; }
#footnotes .footnote a:first-of-type { font-weight: bold; text-decoration: none; margin-left: -1.05em; }
#footnotes .footnote:last-of-type { margin-bottom: 0; }
#content #footnotes { margin-top: -0.625em; margin-bottom: 0; padding: 0.75em 0; }
.gist .file-data > table { border: 0; background: #fff; width: 100%; margin-bottom: 0; }
.gist .file-data > table td.line-data { width: 99%; }
div.unbreakable { page-break-inside: avoid; }
.big { font-size: larger; }
.small { font-size: smaller; }
.underline { text-decoration: underline; }
.overline { text-decoration: overline; }
.line-through { text-decoration: line-through; }
.aqua { color: #00bfbf; }
.aqua-background { background-color: #00fafa; }
.black { color: black; }
.black-background { background-color: black; }
.blue { color: #0000bf; }
.blue-background { background-color: #0000fa; }
.fuchsia { color: #bf00bf; }
.fuchsia-background { background-color: #fa00fa; }
.gray { color: #606060; }
.gray-background { background-color: #7d7d7d; }
.green { color: #006000; }
.green-background { background-color: #007d00; }
.lime { color: #00bf00; }
.lime-background { background-color: #00fa00; }
.maroon { color: #600000; }
.maroon-background { background-color: #7d0000; }
.navy { color: #000060; }
.navy-background { background-color: #00007d; }
.olive { color: #606000; }
.olive-background { background-color: #7d7d00; }
.purple { color: #600060; }
.purple-background { background-color: #7d007d; }
.red { color: #bf0000; }
.red-background { background-color: #fa0000; }
.silver { color: #909090; }
.silver-background { background-color: #bcbcbc; }
.teal { color: #006060; }
.teal-background { background-color: #007d7d; }
.white { color: #bfbfbf; }
.white-background { background-color: #fafafa; }
.yellow { color: #bfbf00; }
.yellow-background { background-color: #fafa00; }
span.icon > .fa { cursor: default; }
a span.icon > .fa { cursor: inherit; }
.admonitionblock td.icon [class^="fa icon-"] { font-size: 2.5em; text-shadow: 1px 1px 2px rgba(0, 0, 0, 0.5); cursor: default; }
.admonitionblock td.icon .icon-note:before { content: "\f05a"; color: #29475c; }
.admonitionblock td.icon .icon-tip:before { content: "\f0eb"; text-shadow: 1px 1px 2px rgba(155, 155, 0, 0.8); color: #111; }
.admonitionblock td.icon .icon-warning:before { content: "\f071"; color: #bf6900; }
.admonitionblock td.icon .icon-caution:before { content: "\f06d"; color: #bf3400; }
.admonitionblock td.icon .icon-important:before { content: "\f06a"; color: #bf0000; }
.conum[data-value] { display: inline-block; color: #fff !important; background-color: black; -webkit-border-radius: 100px; border-radius: 100px; text-align: center; font-size: 0.75em; width: 1.67em; height: 1.67em; line-height: 1.67em; font-family: "Open Sans", "DejaVu Sans", sans-serif; font-style: normal; font-weight: bold; }
.conum[data-value] * { color: #fff !important; }
.conum[data-value] + b { display: none; }
.conum[data-value]:after { content: attr(data-value); }
pre .conum[data-value] { position: relative; top: -0.125em; }
b.conum * { color: inherit !important; }
.conum:not([data-value]):empty { display: none; }
h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6 { border-bottom: 1px solid #dddddd; }
.sect1 { padding-bottom: 0; }
#toctitle { color: #00406F; font-weight: normal; margin-top: 1.5em; }
.sidebarblock { border-color: #aaa; }
code { -webkit-border-radius: 4px; border-radius: 4px; }
p.tableblock.header { color: #6d6e71; }
.literalblock pre, .listingblock pre { background: #eeeeee; }
</style>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
<style>
/* Stylesheet for CodeRay to match GitHub theme | MIT License | http://foundation.zurb.com */
/*pre.CodeRay {background-color:#f7f7f8;}*/
.CodeRay .line-numbers{border-right:1px solid #d8d8d8;padding:0 0.5em 0 .25em}
.CodeRay span.line-numbers{display:inline-block;margin-right:.5em;color:rgba(0,0,0,.3)}
.CodeRay .line-numbers strong{color:rgba(0,0,0,.4)}
table.CodeRay{border-collapse:separate;border-spacing:0;margin-bottom:0;border:0;background:none}
table.CodeRay td{vertical-align: top;line-height:1.45}
table.CodeRay td.line-numbers{text-align:right}
table.CodeRay td.line-numbers>pre{padding:0;color:rgba(0,0,0,.3)}
table.CodeRay td.code{padding:0 0 0 .5em}
table.CodeRay td.code>pre{padding:0}
.CodeRay .debug{color:#fff !important;background:#000080 !important}
.CodeRay .annotation{color:#007}
.CodeRay .attribute-name{color:#000080}
.CodeRay .attribute-value{color:#700}
.CodeRay .binary{color:#509}
.CodeRay .comment{color:#998;font-style:italic}
.CodeRay .char{color:#04d}
.CodeRay .char .content{color:#04d}
.CodeRay .char .delimiter{color:#039}
.CodeRay .class{color:#458;font-weight:bold}
.CodeRay .complex{color:#a08}
.CodeRay .constant,.CodeRay .predefined-constant{color:#008080}
.CodeRay .color{color:#099}
.CodeRay .class-variable{color:#369}
.CodeRay .decorator{color:#b0b}
.CodeRay .definition{color:#099}
.CodeRay .delimiter{color:#000}
.CodeRay .doc{color:#970}
.CodeRay .doctype{color:#34b}
.CodeRay .doc-string{color:#d42}
.CodeRay .escape{color:#666}
.CodeRay .entity{color:#800}
.CodeRay .error{color:#808}
.CodeRay .exception{color:inherit}
.CodeRay .filename{color:#099}
.CodeRay .function{color:#900;font-weight:bold}
.CodeRay .global-variable{color:#008080}
.CodeRay .hex{color:#058}
.CodeRay .integer,.CodeRay .float{color:#099}
.CodeRay .include{color:#555}
.CodeRay .inline{color:#000}
.CodeRay .inline .inline{background:#ccc}
.CodeRay .inline .inline .inline{background:#bbb}
.CodeRay .inline .inline-delimiter{color:#d14}
.CodeRay .inline-delimiter{color:#d14}
.CodeRay .important{color:#555;font-weight:bold}
.CodeRay .interpreted{color:#b2b}
.CodeRay .instance-variable{color:#008080}
.CodeRay .label{color:#970}
.CodeRay .local-variable{color:#963}
.CodeRay .octal{color:#40e}
.CodeRay .predefined{color:#369}
.CodeRay .preprocessor{color:#579}
.CodeRay .pseudo-class{color:#555}
.CodeRay .directive{font-weight:bold}
.CodeRay .type{font-weight:bold}
.CodeRay .predefined-type{color:inherit}
.CodeRay .reserved,.CodeRay .keyword {color:#000;font-weight:bold}
.CodeRay .key{color:#808}
.CodeRay .key .delimiter{color:#606}
.CodeRay .key .char{color:#80f}
.CodeRay .value{color:#088}
.CodeRay .regexp .delimiter{color:#808}
.CodeRay .regexp .content{color:#808}
.CodeRay .regexp .modifier{color:#808}
.CodeRay .regexp .char{color:#d14}
.CodeRay .regexp .function{color:#404;font-weight:bold}
.CodeRay .string{color:#d20}
.CodeRay .string .string .string{background:#ffd0d0}
.CodeRay .string .content{color:#d14}
.CodeRay .string .char{color:#d14}
.CodeRay .string .delimiter{color:#d14}
.CodeRay .shell{color:#d14}
.CodeRay .shell .delimiter{color:#d14}
.CodeRay .symbol{color:#990073}
.CodeRay .symbol .content{color:#a60}
.CodeRay .symbol .delimiter{color:#630}
.CodeRay .tag{color:#008080}
.CodeRay .tag-special{color:#d70}
.CodeRay .variable{color:#036}
.CodeRay .insert{background:#afa}
.CodeRay .delete{background:#faa}
.CodeRay .change{color:#aaf;background:#007}
.CodeRay .head{color:#f8f;background:#505}
.CodeRay .insert .insert{color:#080}
.CodeRay .delete .delete{color:#800}
.CodeRay .change .change{color:#66f}
.CodeRay .head .head{color:#f4f}
</style>
<link rel="stylesheet" href="../katex/katex.min.css">
<script src="../katex/katex.min.js"></script>
<script src="../katex/contrib/auto-render.min.js"></script>
<!-- Use KaTeX to render math once document is loaded, see
https://github.com/Khan/KaTeX/tree/master/contrib/auto-render -->
<script>
document.addEventListener("DOMContentLoaded", function () {
renderMathInElement(
document.body,
{
delimiters: [
{ left: "$$", right: "$$", display: true},
{ left: "\\[", right: "\\]", display: true},
{ left: "$", right: "$", display: false},
{ left: "\\(", right: "\\)", display: false}
]
}
);
});
</script></head>
<body class="book toc2 toc-left" style="max-width: 100;">
<div id="header">
<h1>The OpenCL<sup>&#8482;</sup> Specification</h1>
<div class="details">
<span id="author" class="author">Khronos<sup>&#174;</sup> OpenCL Working Group</span><br>
<span id="revnumber">version v2.2-10,</span>
<span id="revdate">Tue, 05 Feb 2019 21:16:07 +0000</span>
<br><span id="revremark">from git branch: commit: 00422daf5dc013f21ab633479577c7cc225150e2</span>
</div>
<div id="toc" class="toc2">
<div id="toctitle">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_introduction">1. Introduction</a>
<ul class="sectlevel2">
<li><a href="#_normative_references">1.1. Normative References</a></li>
<li><a href="#_version_numbers">1.2. Version Numbers</a></li>
</ul>
</li>
<li><a href="#_glossary">2. Glossary</a></li>
<li><a href="#_the_opencl_architecture">3. The OpenCL Architecture</a>
<ul class="sectlevel2">
<li><a href="#_platform_model">3.1. Platform Model</a></li>
<li><a href="#_execution_model">3.2. Execution Model</a></li>
<li><a href="#_memory_model">3.3. Memory Model</a></li>
<li><a href="#opencl-framework">3.4. The OpenCL Framework</a></li>
</ul>
</li>
<li><a href="#opencl-platform-layer">4. The OpenCL Platform Layer</a>
<ul class="sectlevel2">
<li><a href="#_querying_platform_info">4.1. Querying Platform Info</a></li>
<li><a href="#platform-querying-devices">4.2. Querying Devices</a></li>
<li><a href="#_partitioning_a_device">4.3. Partitioning a Device</a></li>
<li><a href="#_contexts">4.4. Contexts</a></li>
</ul>
</li>
<li><a href="#opencl-runtime">5. The OpenCL Runtime</a>
<ul class="sectlevel2">
<li><a href="#_command_queues">5.1. Command Queues</a></li>
<li><a href="#_buffer_objects">5.2. Buffer Objects</a></li>
<li><a href="#_image_objects">5.3. Image Objects</a></li>
<li><a href="#_pipes">5.4. Pipes</a></li>
<li><a href="#_querying_unmapping_migrating_retaining_and_releasing_memory_objects">5.5. Querying, Unmapping, Migrating, Retaining and Releasing Memory Objects</a></li>
<li><a href="#_shared_virtual_memory">5.6. Shared Virtual Memory</a></li>
<li><a href="#_sampler_objects">5.7. Sampler Objects</a></li>
<li><a href="#_program_objects">5.8. Program Objects</a></li>
<li><a href="#_kernel_objects">5.9. Kernel Objects</a></li>
<li><a href="#_executing_kernels">5.10. Executing Kernels</a></li>
<li><a href="#event-objects">5.11. Event Objects</a></li>
<li><a href="#markers-barriers-waiting-for-events">5.12. Markers, Barriers and Waiting for Events</a></li>
<li><a href="#_out_of_order_execution_of_kernels_and_memory_object_commands">5.13. Out-of-order Execution of Kernels and Memory Object Commands</a></li>
<li><a href="#profiling-operations">5.14. Profiling Operations on Memory Objects and Kernels</a></li>
<li><a href="#_flush_and_finish">5.15. Flush and Finish</a></li>
</ul>
</li>
<li><a href="#_associated_opencl_specification">6. Associated OpenCL specification</a>
<ul class="sectlevel2">
<li><a href="#spirv-il">6.1. SPIR-V Intermediate language</a></li>
<li><a href="#opencl-extensions">6.2. Extensions to OpenCL</a></li>
<li><a href="#_support_for_earlier_opencl_c_kernel_languages">6.3. Support for earlier OpenCL C kernel languages</a></li>
</ul>
</li>
<li><a href="#opencl-embedded-profile">7. OpenCL Embedded Profile</a></li>
<li><a href="#_shared_objects_thread_safety">Appendix A: Shared Objects, Thread Safety</a>
<ul class="sectlevel2">
<li><a href="#shared-opencl-objects">Shared OpenCL Objects</a></li>
<li><a href="#_multiple_host_threads">Multiple Host Threads</a></li>
</ul>
</li>
<li><a href="#_portability">Appendix B: Portability</a></li>
<li><a href="#data-types">Appendix C: Application Data Types</a>
<ul class="sectlevel2">
<li><a href="#scalar-data-types">Shared Application Scalar Data Types</a></li>
<li><a href="#vector-data-types">Supported Application Vector Data Types</a></li>
<li><a href="#alignment-app-data-types">Alignment of Application Data Types</a></li>
<li><a href="#_vector_literals">Vector Literals</a></li>
<li><a href="#vector-components">Vector Components</a></li>
<li><a href="#_implicit_conversions">Implicit Conversions</a></li>
<li><a href="#_explicit_casts">Explicit Casts</a></li>
<li><a href="#_other_operators_and_functions">Other operators and functions</a></li>
<li><a href="#_application_constant_definitions">Application constant definitions</a></li>
</ul>
</li>
<li><a href="#check-copy-overlap">Appendix D: CL_MEM_COPY_OVERLAP</a></li>
<li><a href="#_changes">Appendix E: Changes</a>
<ul class="sectlevel2">
<li><a href="#_summary_of_changes_from_opencl_1_0">Summary of changes from OpenCL 1.0</a></li>
<li><a href="#_summary_of_changes_from_opencl_1_1">Summary of changes from OpenCL 1.1</a></li>
<li><a href="#_summary_of_changes_from_opencl_1_2">Summary of changes from OpenCL 1.2</a></li>
<li><a href="#_summary_of_changes_from_opencl_2_0">Summary of changes from OpenCL 2.0</a></li>
<li><a href="#_summary_of_changes_from_opencl_2_1">Summary of changes from OpenCL 2.1</a></li>
</ul>
</li>
</ul>
</div>
</div>
<div id="content">
<div id="preamble">
<div class="sectionbody">
<div style="page-break-after: always;"></div>
<div class="paragraph">
<p>Copyright 2008-2019 The Khronos Group.</p>
</div>
<div class="paragraph">
<p>This specification is protected by copyright laws and contains material proprietary
to the Khronos Group, Inc. Except as described by these terms, it or any components
may not be reproduced, republished, distributed, transmitted, displayed, broadcast
or otherwise exploited in any manner without the express prior written permission
of Khronos Group.</p>
</div>
<div class="paragraph">
<p>Khronos Group grants a conditional copyright license to use and reproduce the
unmodified specification for any purpose, without fee or royalty, EXCEPT no licenses
to any patent, trademark or other intellectual property rights are granted under
these terms. Parties desiring to implement the specification and make use of
Khronos trademarks in relation to that implementation, and receive reciprocal patent
license protection under the Khronos IP Policy must become Adopters and confirm the
implementation as conformant under the process defined by Khronos for this
specification; see <a href="https://www.khronos.org/adopters" class="bare">https://www.khronos.org/adopters</a>.</p>
</div>
<div class="paragraph">
<p>Khronos Group makes no, and expressly disclaims any, representations or warranties,
express or implied, regarding this specification, including, without limitation:
merchantability, fitness for a particular purpose, non-infringement of any
intellectual property, correctness, accuracy, completeness, timeliness, and
reliability. Under no circumstances will the Khronos Group, or any of its Promoters,
Contributors or Members, or their respective partners, officers, directors,
employees, agents or representatives be liable for any damages, whether direct,
indirect, special or consequential damages for lost revenues, lost profits, or
otherwise, arising from or in connection with these materials.</p>
</div>
<div class="paragraph">
<p>Vulkan and Khronos are registered trademarks, and OpenXR, SPIR, SPIR-V, SYCL, WebGL,
WebCL, OpenVX, OpenVG, EGL, COLLADA, glTF, NNEF, OpenKODE, OpenKCAM, StreamInput,
OpenWF, OpenSL ES, OpenMAX, OpenMAX AL, OpenMAX IL, OpenMAX DL, OpenML and DevU are
trademarks of the Khronos Group Inc. ASTC is a trademark of ARM Holdings PLC,
OpenCL is a trademark of Apple Inc. and OpenGL and OpenML are registered trademarks
and the OpenGL ES and OpenGL SC logos are trademarks of Silicon Graphics
International used under license by Khronos. All other product names, trademarks,
and/or company names are used solely for identification and belong to their
respective owners.</p>
</div>
<div style="page-break-after: always;"></div>
<div class="paragraph">
<p><strong>Acknowledgements</strong></p>
</div>
<div class="paragraph">
<p>The OpenCL specification is the result of the contributions of many people,
representing a cross section of the desktop, hand-held, and embedded
computer industry.
Following is a partial list of the contributors, including the company that
they represented at the time of their contribution:</p>
</div>
<div class="paragraph">
<p>Chuck Rose, Adobe<br>
Eric Berdahl, Adobe<br>
Shivani Gupta, Adobe<br>
Bill Licea Kane, AMD<br>
Ed Buckingham, AMD<br>
Jan Civlin, AMD<br>
Laurent Morichetti, AMD<br>
Mark Fowler, AMD<br>
Marty Johnson, AMD<br>
Michael Mantor, AMD<br>
Norm Rubin, AMD<br>
Ofer Rosenberg, AMD<br>
Brian Sumner, AMD<br>
Victor Odintsov, AMD<br>
Aaftab Munshi, Apple<br>
Abe Stephens, Apple<br>
Alexandre Namaan, Apple<br>
Anna Tikhonova, Apple<br>
Chendi Zhang, Apple<br>
Eric Bainville, Apple<br>
David Hayward, Apple<br>
Giridhar Murthy, Apple<br>
Ian Ollmann, Apple<br>
Inam Rahman, Apple<br>
James Shearer, Apple<br>
MonPing Wang, Apple<br>
Tanya Lattner, Apple<br>
Mikael Bourges-Sevenier, Aptina<br>
Anton Lokhmotov, ARM<br>
Dave Shreiner, ARM<br>
Hedley Francis, ARM<br>
Robert Elliott, ARM<br>
Scott Moyers, ARM<br>
Tom Olson, ARM<br>
Anastasia Stulova, ARM<br>
Christopher Thompson-Walsh, Broadcom<br>
Holger Waechtler, Broadcom<br>
Norman Rink, Broadcom<br>
Andrew Richards, Codeplay<br>
Maria Rovatsou, Codeplay<br>
Alistair Donaldson, Codeplay<br>
Alastair Murray, Codeplay<br>
Stephen Frye, Electronic Arts<br>
Eric Schenk, Electronic Arts<br>
Daniel Laroche, Freescale<br>
David Neto, Google<br>
Robin Grosman, Huawei<br>
Craig Davies, Huawei<br>
Brian Horton, IBM<br>
Brian Watt, IBM<br>
Gordon Fossum, IBM<br>
Greg Bellows, IBM<br>
Joaquin Madruga, IBM<br>
Mark Nutter, IBM<br>
Mike Perks, IBM<br>
Sean Wagner, IBM<br>
Jon Parr, Imagination Technologies<br>
Robert Quill, Imagination Technologies<br>
James McCarthy, Imagination Technologies<br>
Jon Leech, Independent<br>
Aaron Kunze, Intel<br>
Aaron Lefohn, Intel<br>
Adam Lake, Intel<br>
Alexey Bader, Intel<br>
Allen Hux, Intel<br>
Andrew Brownsword, Intel<br>
Andrew Lauritzen, Intel<br>
Bartosz Sochacki, Intel<br>
Ben Ashbaugh, Intel<br>
Brian Lewis, Intel<br>
Geoff Berry, Intel<br>
Hong Jiang, Intel<br>
Jayanth Rao, Intel<br>
Josh Fryman, Intel<br>
Larry Seiler, Intel<br>
Mike MacPherson, Intel<br>
Murali Sundaresan, Intel<br>
Paul Lalonde, Intel<br>
Raun Krisch, Intel<br>
Stephen Junkins, Intel<br>
Tim Foley, Intel<br>
Timothy Mattson, Intel<br>
Yariv Aridor, Intel<br>
Michael Kinsner, Intel<br>
Kevin Stevens, Intel<br>
Benjamin Bergen, Los Alamos National Laboratory<br>
Roy Ju, Mediatek<br>
Bor-Sung Liang, Mediatek<br>
Rahul Agarwal, Mediatek<br>
Michal Witaszek, Mobica<br>
JenqKuen Lee, NTHU<br>
Amit Rao, NVIDIA<br>
Ashish Srivastava, NVIDIA<br>
Bastiaan Aarts, NVIDIA<br>
Chris Cameron, NVIDIA<br>
Christopher Lamb, NVIDIA<br>
Dibyapran Sanyal, NVIDIA<br>
Guatam Chakrabarti, NVIDIA<br>
Ian Buck, NVIDIA<br>
Jaydeep Marathe, NVIDIA<br>
Jian-Zhong Wang, NVIDIA<br>
Karthik Raghavan Ravi, NVIDIA<br>
Kedar Patil, NVIDIA<br>
Manjunath Kudlur, NVIDIA<br>
Mark Harris, NVIDIA<br>
Michael Gold, NVIDIA<br>
Neil Trevett, NVIDIA<br>
Richard Johnson, NVIDIA<br>
Sean Lee, NVIDIA<br>
Tushar Kashalikar, NVIDIA<br>
Vinod Grover, NVIDIA<br>
Xiangyun Kong, NVIDIA<br>
Yogesh Kini, NVIDIA<br>
Yuan Lin, NVIDIA<br>
Mayuresh Pise, NVIDIA<br>
Allan Tzeng, QUALCOMM<br>
Alex Bourd, QUALCOMM<br>
Anirudh Acharya, QUALCOMM<br>
Andrew Gruber, QUALCOMM<br>
Andrzej Mamona, QUALCOMM<br>
Benedict Gaster, QUALCOMM<br>
Bill Torzewski, QUALCOMM<br>
Bob Rychlik, QUALCOMM<br>
Chihong Zhang, QUALCOMM<br>
Chris Mei, QUALCOMM<br>
Colin Sharp, QUALCOMM<br>
David Garcia, QUALCOMM<br>
David Ligon, QUALCOMM<br>
Jay Yun, QUALCOMM<br>
Lee Howes, QUALCOMM<br>
Richard Ruigrok, QUALCOMM<br>
Robert J. Simpson, QUALCOMM<br>
Sumesh Udayakumaran, QUALCOMM<br>
Vineet Goel, QUALCOMM<br>
Lihan Bin, QUALCOMM<br>
Vlad Shimanskiy, QUALCOMM<br>
Jian Liu, QUALCOMM<br>
Tasneem Brutch, Samsung<br>
Yoonseo Choi, Samsung<br>
Dennis Adams, Sony<br>
Pär-Anders Aronsson, Sony<br>
Jim Rasmusson, Sony<br>
Thierry Lepley, STMicroelectronics<br>
Anton Gorenko, StreamHPC<br>
Jakub Szuppe, StreamHPC<br>
Vincent Hindriksen, StreamHPC<br>
Alan Ward, Texas Instruments<br>
Yuan Zhao, Texas Instruments<br>
Pete Curry, Texas Instruments<br>
Simon McIntosh-Smith, University of Bristol<br>
James Price, University of Bristol<br>
Paul Preney, University of Windsor<br>
Shane Peelar, University of Windsor<br>
Brian Hutsell, Vivante<br>
Mike Cai, Vivante<br>
Sumeet Kumar, Vivante<br>
Wei-Lun Kao, Vivante<br>
Xing Wang, Vivante<br>
Jeff Fifield, Xilinx<br>
Hem C. Neema, Xilinx<br>
Henry Styles, Xilinx<br>
Ralph Wittig, Xilinx<br>
Ronan Keryell, Xilinx<br>
AJ Guillon, YetiWare Inc<br></p>
</div>
<div style="page-break-after: always;"></div>
</div>
</div>
<div class="sect1">
<h2 id="_introduction">1. Introduction</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Modern processor architectures have embraced parallelism as an important
pathway to increased performance.
Facing technical challenges with higher clock speeds in a fixed power
envelope, Central Processing Units (CPUs) now improve performance by adding
multiple cores.
Graphics Processing Units (GPUs) have also evolved from fixed function
rendering devices into programmable parallel processors.
As today's computer systems often include highly parallel CPUs, GPUs and
other types of processors, it is important to enable software developers to
take full advantage of these heterogeneous processing platforms.</p>
</div>
<div class="paragraph">
<p>Creating applications for heterogeneous parallel processing platforms is
challenging as traditional programming approaches for multi-core CPUs and
GPUs are very different.
CPU-based parallel programming models are typically based on standards but
usually assume a shared address space and do not encompass vector
operations.
General purpose GPU programming models address complex memory hierarchies
and vector operations but are traditionally platform-, vendor- or
hardware-specific.
These limitations make it difficult for a developer to access the compute
power of heterogeneous CPUs, GPUs and other types of processors from a
single, multi-platform source code base.
More than ever, there is a need to enable software developers to take full
advantage of heterogeneous processing platforms, from high performance
compute servers through desktop computer systems to handheld devices, that
include a diverse mix of parallel CPUs, GPUs and other processors such as
DSPs and the Cell/B.E. processor.</p>
</div>
<div class="paragraph">
<p><strong>OpenCL</strong> (Open Computing Language) is an open royalty-free standard for
general purpose parallel programming across CPUs, GPUs and other processors,
giving software developers portable and efficient access to the power of
these heterogeneous processing platforms.</p>
</div>
<div class="paragraph">
<p>OpenCL supports a wide range of applications, from embedded and consumer
software to HPC solutions, through a low-level, high-performance, portable
abstraction.
By creating an efficient, close-to-the-metal programming interface, OpenCL
will form the foundation layer of a parallel computing ecosystem of
platform-independent tools, middleware and applications.
OpenCL is particularly suited to play an increasingly significant role in
emerging interactive graphics applications that combine general parallel
compute algorithms with graphics rendering pipelines.</p>
</div>
<div class="paragraph">
<p>OpenCL consists of an API for coordinating parallel computation across
heterogeneous processors; and a cross-platform intermediate language with a
well-specified computation environment.
The OpenCL standard:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Supports both data- and task-based parallel programming models</p>
</li>
<li>
<p>Utilizes a portable and self-contained intermediate representation with
support for parallel execution</p>
</li>
<li>
<p>Defines consistent numerical requirements based on IEEE 754</p>
</li>
<li>
<p>Defines a configuration profile for handheld and embedded devices</p>
</li>
<li>
<p>Efficiently interoperates with OpenGL, OpenGL ES and other graphics APIs</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>This document begins with an overview of basic concepts and the architecture
of OpenCL, followed by a detailed description of its execution model, memory
model and synchronization support.
It then discusses the OpenCL platform and runtime API.
Some examples are given that describe sample compute use-cases and how they
would be written in OpenCL.
The specification is divided into a core specification that any OpenCL
compliant implementation must support; a handheld/embedded profile which
relaxes the OpenCL compliance requirements for handheld and embedded
devices; and a set of optional extensions that are likely to move into the
core specification in later revisions of the OpenCL specification.</p>
</div>
<div class="sect2">
<h3 id="_normative_references">1.1. Normative References</h3>
<div class="paragraph">
<p>Normative references are references to external documents or resources
with which implementers of OpenCL must comply, in whole or in specified
part, as described in this specification.</p>
</div>
<div id="iso-c11" class="paragraph">
<p><em>ISO/IEC 9899:2011 - Information technology - Programming languages - C</em>,
<a href="https://www.iso.org/standard/57853.html" class="bare">https://www.iso.org/standard/57853.html</a> (final specification),
<a href="http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1570.pdf" class="bare">http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1570.pdf</a> (last public
draft).</p>
</div>
</div>
<div class="sect2">
<h3 id="_version_numbers">1.2. Version Numbers</h3>
<div class="paragraph">
<p>The OpenCL version number follows a <em>major.minor-revision</em> scheme. When this
version number is used within the API it generally only includes the
<em>major.minor</em> components of the version number.</p>
</div>
<div class="paragraph">
<p>A difference in the <em>major</em> or <em>minor</em> version number indicates that some
amount of new functionality has been added to the specification, and may also
include behavior changes and bug fixes.
Functionality may also be deprecated or removed when the <em>major</em> or <em>minor</em>
version changes.</p>
</div>
<div class="paragraph">
<p>A difference in the <em>revision</em> number indicates small changes to the
specification, typically to fix a bug or to clarify language.
When the <em>revision</em> number changes there may be an impact on the behavior of
existing functionality, but this should not affect backwards compatibility.
Functionality should not be added or removed when the <em>revision</em> number
changes.</p>
</div>
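<div class="paragraph">
<p>The <em>major.minor-revision</em> scheme described above can be illustrated with a
small host-side sketch.
The <code>spec_version</code> type and the <code>parse_spec_version</code> and
<code>same_feature_level</code> helpers are hypothetical names introduced for this
example only; they are not part of the OpenCL API.
The sketch assumes a version token such as <code>v2.2-10</code> or <code>2.2</code>, and treats
two versions as offering the same functionality when only the <em>revision</em>
differs.</p>
</div>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper type for this example; not part of the OpenCL API. */
typedef struct {
    int major;
    int minor;
    int revision; /* -1 when the token carries no revision component */
} spec_version;

/* Parses "major.minor" or "major.minor-revision", with an optional leading 'v'.
 * Returns 1 on success and 0 when the text does not follow the scheme. */
int parse_spec_version(const char *text, spec_version *out)
{
    int major = 0, minor = 0, revision = -1;
    int consumed = 0;

    if (text == NULL || out == NULL)
        return 0;
    if (*text == 'v')
        ++text;

    /* %n records how many characters the major.minor part used. */
    if (sscanf(text, "%d.%d%n", &major, &minor, &consumed) < 2)
        return 0;
    text += consumed;

    if (*text == '-') {
        if (sscanf(text + 1, "%d%n", &revision, &consumed) != 1 || revision < 0)
            return 0;
        text += 1 + consumed;
    }

    /* Reject trailing characters and negative components. */
    if (*text != '\0' || major < 0 || minor < 0)
        return 0;

    out->major = major;
    out->minor = minor;
    out->revision = revision;
    return 1;
}

/* Two versions share a feature level when major and minor match;
 * a revision change alone should not add or remove functionality. */
int same_feature_level(const spec_version *a, const spec_version *b)
{
    assert(a != NULL && b != NULL);
    return a->major == b->major && a->minor == b->minor;
}
```

<div class="paragraph">
<p>Comparing only the <em>major</em> and <em>minor</em> components mirrors how the version
number is generally used within the API, where the <em>revision</em> is omitted.</p>
</div>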
</div>
</div>
</div>
<div class="sect1">
<h2 id="_glossary">2. Glossary</h2>
<div class="sectionbody">
<div class="dlist">
<dl>
<dt class="hdlist1">Application </dt>
<dd>
<p>The combination of the program running on the host and OpenCL devices.</p>
</dd>
<dt class="hdlist1">Acquire semantics </dt>
<dd>
<p>One of the memory order semantics defined for synchronization
operations.
Acquire semantics apply to atomic operations that load from memory.
Given two units of execution, <strong>A</strong> and <strong>B</strong>, acting on a shared atomic
object <strong>M</strong>, if <strong>A</strong> uses an atomic load of <strong>M</strong> with acquire semantics to
synchronize-with an atomic store to <strong>M</strong> by <strong>B</strong> that used release
semantics, then <strong>A</strong>'s atomic load will occur before any subsequent
operations by <strong>A</strong>.
Note that the memory orders <em>release</em>, <em>sequentially consistent</em>, and
<em>acquire_release</em> all include <em>release semantics</em> and effectively pair
with a load using acquire semantics.</p>
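<div class="paragraph">
<p>As an illustration only, the pairing of a release store with an acquire load
can be sketched on the host with ISO C11 atomics from <code>&lt;stdatomic.h&gt;</code>.
This is an analogy under stated assumptions, not OpenCL kernel code: the
names <code>payload</code>, <code>ready</code>, <code>publish</code> and <code>try_consume</code> are hypothetical, with
<code>ready</code> playing the role of the shared atomic object <strong>M</strong>, <code>publish</code> the role
of <strong>B</strong>, and <code>try_consume</code> the role of <strong>A</strong>.</p>
</div>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state: ordinary data guarded by an atomic flag (M). */
static int payload;
static atomic_bool ready; /* static storage, so it starts out false */

/* Role of B: write the data, then publish it with a release store. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Role of A: an acquire load of the flag. If it observes the release
 * store, the earlier write to payload is visible to the read below. */
bool try_consume(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false; /* nothing published yet */
    *out = payload;
    return true;
}
```

<div class="paragraph">
<p>In real use <code>publish</code> and <code>try_consume</code> would run in different units of
execution; calling them one after the other from a single thread only
exercises the control flow, not the ordering guarantee.</p>
</div>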
</dd>
<dt class="hdlist1">Acquire release semantics </dt>
<dd>
<p>A memory order semantics for synchronization operations (such as atomic
operations) that has the properties of both acquire and release memory
orders.
It is used with read-modify-write operations.</p>
</dd>
<dt class="hdlist1">Atomic operations </dt>
<dd>
<p>Operations that at any point, and from any perspective, have either
occurred completely, or not at all.
Memory orders associated with atomic operations may constrain the
visibility of loads and stores with respect to the atomic operations
(see <em>relaxed semantics</em>, <em>acquire semantics</em>, <em>release semantics</em> or
<em>acquire release semantics</em>).</p>
</dd>
<dt class="hdlist1">Blocking and Non-Blocking Enqueue API calls </dt>
<dd>
<p>A <em>non-blocking enqueue API call</em> places a <em>command</em> on a
<em>command-queue</em> and returns immediately to the host.
The <em>blocking-mode enqueue API calls</em> do not return to the host until
the command has completed.</p>
</dd>
<dt class="hdlist1">Barrier </dt>
<dd>
<p>There are three types of <em>barriers</em>: a command-queue barrier, a
work-group barrier and a sub-group barrier.</p>
<div class="openblock">
<div class="content">
<div class="ulist">
<ul>
<li>
<p>The OpenCL API provides a function to enqueue a <em>command-queue</em>
<em>barrier</em> command.
This <em>barrier</em> command ensures that all previously enqueued commands to
a command-queue have finished execution before any following <em>commands</em>
enqueued in the <em>command-queue</em> can begin execution.</p>
</li>
<li>
<p>The OpenCL kernel execution model provides built-in <em>work-group barrier</em>
functionality.
This <em>barrier</em> built-in function can be used by a <em>kernel</em> executing on
a <em>device</em> to perform synchronization between <em>work-items</em> in a
<em>work-group</em> executing the <em>kernel</em>.
All the <em>work-items</em> of a <em>work-group</em> must execute the <em>barrier</em>
construct before any are allowed to continue execution beyond the
<em>barrier</em>.</p>
</li>
<li>
<p>The OpenCL kernel execution model provides built-in <em>sub-group barrier</em>
functionality.
This <em>barrier</em> built-in function can be used by a <em>kernel</em> executing on
a <em>device</em> to perform synchronization between <em>work-items</em> in a
<em>sub-group</em> executing the <em>kernel</em>.
All the <em>work-items</em> of a <em>sub-group</em> must execute the <em>barrier</em>
construct before any are allowed to continue execution beyond the
<em>barrier</em>.</p>
</li>
</ul>
</div>
</div>
</div>
</dd>
<dt class="hdlist1">Buffer Object </dt>
<dd>
<p>A memory object that stores a linear collection of bytes.
Buffer objects are accessible using a pointer in a <em>kernel</em> executing on
a <em>device</em>.
Buffer objects can be manipulated by the host using OpenCL API calls.
A <em>buffer object</em> encapsulates the following information:</p>
<div class="openblock">
<div class="content">
<div class="ulist">
<ul>
<li>
<p>Size in bytes.</p>
</li>
<li>
<p>Properties that describe usage information and which region to allocate
from.</p>
</li>
<li>
<p>Buffer data.</p>
</li>
</ul>
</div>
</div>
</div>
</dd>
<dt class="hdlist1">Built-in Kernel </dt>
<dd>
<p>A <em>built-in kernel</em> is a <em>kernel</em> that is executed on an OpenCL <em>device</em>
or <em>custom device</em> by fixed-function hardware or in firmware.
<em>Applications</em> can query the <em>built-in kernels</em> supported by a <em>device</em>
or <em>custom device</em>.
A <em>program object</em> can only contain <em>kernels</em> written in OpenCL C or
<em>built-in kernels</em> but not both.
See also <em>Kernel</em> and <em>Program</em>.</p>
</dd>
<dt class="hdlist1">Child kernel </dt>
<dd>
<p>See <em>Device-side enqueue</em>.</p>
</dd>
<dt class="hdlist1">Command </dt>
<dd>
<p>The OpenCL operations that are submitted to a <em>command-queue</em> for
execution.
For example, OpenCL commands issue kernels for execution on a compute
device, manipulate memory objects, etc.</p>
</dd>
<dt class="hdlist1">Command-queue </dt>
<dd>
<p>An object that holds <em>commands</em> that will be executed on a specific
<em>device</em>.
The <em>command-queue</em> is created on a specific <em>device</em> in a <em>context</em>.
<em>Commands</em> to a <em>command-queue</em> are queued in-order but may be executed
in-order or out-of-order.
Refer to <em>In-order Execution</em> and <em>Out-of-order Execution</em>.</p>
</dd>
<dt class="hdlist1">Command-queue Barrier </dt>
<dd>
<p>See <em>Barrier</em>.</p>
</dd>
<dt class="hdlist1">Command synchronization </dt>
<dd>
<p>Constraints on the order in which commands are launched for execution on a
device defined in terms of the synchronization points that occur between
commands in host command-queues and between commands in device-side
command-queues.
See <em>synchronization points</em>.</p>
</dd>
<dt class="hdlist1">Complete </dt>
<dd>
<p>The final state in the six state model for the execution of a command.
The transition into this state is signaled through event objects
or callback functions associated with a command.</p>
</dd>
<dt class="hdlist1">Compute Device Memory </dt>
<dd>
<p>This refers to one or more memories attached to the compute device.</p>
</dd>
<dt class="hdlist1">Compute Unit </dt>
<dd>
<p>An OpenCL <em>device</em> has one or more <em>compute units</em>.
A <em>work-group</em> executes on a single <em>compute unit</em>.
A <em>compute unit</em> is composed of one or more <em>processing elements</em> and
<em>local memory</em>.
A <em>compute unit</em> may also include dedicated texture filter units that
can be accessed by its processing elements.</p>
</dd>
<dt class="hdlist1">Concurrency </dt>
<dd>
<p>A property of a system in which a set of tasks can remain
active and make progress at the same time.
To utilize concurrent execution when running a program, a programmer
must identify the concurrency in their problem, expose it within the
source code, and then exploit it using a notation that supports
concurrency.</p>
</dd>
<dt class="hdlist1">Constant Memory </dt>
<dd>
<p>A region of <em>global memory</em> that remains constant during the execution
of a <em>kernel</em>.
The <em>host</em> allocates and initializes memory objects placed into
<em>constant memory</em>.</p>
</dd>
<dt class="hdlist1">Context </dt>
<dd>
<p>The environment within which the kernels execute and the domain in which
synchronization and memory management are defined.
The <em>context</em> includes a set of <em>devices</em>, the memory accessible to
those <em>devices</em>, the corresponding memory properties and one or more
<em>command-queues</em> used to schedule execution of one or more <em>kernels</em> or
operations on <em>memory objects</em>.</p>
</dd>
<dt class="hdlist1">Control flow </dt>
<dd>
<p>The flow of instructions executed by a work-item.
Multiple logically related work-items may or may not execute the same
control flow.
The control flow is said to be <em>converged</em> if all the work-items in the
set execute the same stream of instructions.
In a <em>diverged</em> control flow, the work-items in the set execute
different instructions.
At a later point, if a diverged control flow becomes converged, it is
said to be a re-converged control flow.</p>
</dd>
<dt class="hdlist1">Converged control flow </dt>
<dd>
<p>See <em>Control flow</em>.</p>
</dd>
<dt class="hdlist1">Custom Device </dt>
<dd>
<p>An OpenCL <em>device</em> that fully implements the OpenCL Runtime but does not
support <em>programs</em> written in OpenCL C.
A custom device may be specialized non-programmable hardware that is
very power efficient and performant for directed tasks or hardware with
limited programmable capabilities such as specialized DSPs.
Custom devices are not OpenCL conformant.
Custom devices may support an online compiler.
Programs for custom devices can be created using the OpenCL runtime APIs
that allow OpenCL programs to be created from source (if an online
compiler is supported) and/or binary, or from <em>built-in kernels</em>
supported by the <em>device</em>.
See also <em>Device</em>.</p>
</dd>
<dt class="hdlist1">Data Parallel Programming Model </dt>
<dd>
<p>Traditionally, this term refers to a programming model where concurrency
is expressed as instructions from a single program applied to multiple
elements within a set of data structures.
The term has been generalized in OpenCL to refer to a model wherein a
set of instructions from a single program are applied concurrently to
each point within an abstract domain of indices.</p>
</dd>
<dt class="hdlist1">Data race </dt>
<dd>
<p>The execution of a program contains a data race if it contains two
actions in different work-items or host threads where (1) one action
modifies a memory location and the other action reads or modifies the
same memory location, and (2) at least one of these actions is not
atomic, or the corresponding memory scopes are not inclusive, and (3)
the actions are global actions unordered by the global-happens-before
relation or are local actions unordered by the local-happens-before
relation.</p>
</dd>
<dt class="hdlist1">Deprecation </dt>
<dd>
<p>An existing feature is marked as deprecated if its use is not
recommended because the feature is being de-emphasized or superseded and may
be removed from a future version of the specification.</p>
</dd>
<dt class="hdlist1">Device </dt>
<dd>
<p>A <em>device</em> is a collection of <em>compute units</em>.
A <em>command-queue</em> is used to queue <em>commands</em> to a <em>device</em>.
Examples of <em>commands</em> include executing <em>kernels</em>, or reading and
writing <em>memory objects</em>.
OpenCL devices typically correspond to a GPU, a multi-core CPU, and
other processors such as DSPs and the Cell/B.E. processor.</p>
</dd>
<dt class="hdlist1">Device-side enqueue </dt>
<dd>
<p>A mechanism whereby a kernel-instance is enqueued by a kernel-instance
running on a device without direct involvement by the host program.
This produces <em>nested parallelism</em>; i.e. additional levels of
concurrency are nested inside a running kernel-instance.
The kernel-instance executing on a device (the <em>parent kernel</em>) enqueues
a kernel-instance (the <em>child kernel</em>) to a device-side command queue.
Child and parent kernels execute asynchronously, though a parent kernel
does not complete until all of its child kernels have completed.</p>
</dd>
<dt class="hdlist1">Diverged control flow </dt>
<dd>
<p>See <em>Control flow</em>.</p>
</dd>
<dt class="hdlist1">Ended </dt>
<dd>
<p>The fifth state in the six state model for the execution of a command.
The transition into this state occurs when execution of a command has
ended.
When a Kernel-enqueue command ends, all of the work-groups associated
with that command have finished their execution.</p>
</dd>
<dt class="hdlist1">Event Object </dt>
<dd>
<p>An <em>event object</em> encapsulates the status of an operation such as a
<em>command</em>.
It can be used to synchronize operations in a context.</p>
</dd>
<dt class="hdlist1">Event Wait List </dt>
<dd>
<p>An <em>event wait list</em> is a list of <em>event objects</em> that can be used to
control when a particular <em>command</em> begins execution.</p>
</dd>
<dt class="hdlist1">Fence </dt>
<dd>
<p>A memory ordering operation without an associated atomic object.
A fence can use <em>acquire semantics</em>, <em>release semantics</em>, or
<em>acquire release semantics</em>.</p>
</dd>
<dt class="hdlist1">Framework </dt>
<dd>
<p>A software system that contains the set of components to support
software development and execution.
A <em>framework</em> typically includes libraries, APIs, runtime systems,
compilers, etc.</p>
</dd>
<dt class="hdlist1">Generic address space </dt>
<dd>
<p>An address space that includes the <em>private</em>, <em>local</em>, and <em>global</em>
address spaces available to a device.
The generic address space supports conversion of pointers to and from
private, local and global address spaces, and hence lets a programmer
write a single function that at compile time can take arguments from any
of the three named address spaces.</p>
</dd>
<dt class="hdlist1">Global Happens before </dt>
<dd>
<p>See <em>Happens before</em>.</p>
</dd>
<dt class="hdlist1">Global ID </dt>
<dd>
<p>A <em>global ID</em> is used to uniquely identify a <em>work-item</em> and is derived
from the number of <em>global work-items</em> specified when executing a
<em>kernel</em>.
The <em>global ID</em> is an N-dimensional value that starts at (0, 0, &#8230;&#8203; 0).
See also <em>Local ID</em>.</p>
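<p>As a minimal sketch of how a <em>global ID</em> relates to a <em>local ID</em> in one
dimension, the helper below combines a work-group index with a local ID.
The function name <code>global_id_1d</code> is hypothetical, and the sketch assumes
a zero global offset and a uniform work-group size; it is plain C for
illustration, not kernel code.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the OpenCL API).
 * Assumptions: zero global offset, uniform work-group size.
 * Combines a work-group index and a local ID into the global ID
 * for a single dimension. */
size_t global_id_1d(size_t group_id, size_t local_size, size_t local_id)
{
    /* A local ID is only meaningful inside its work-group. */
    assert(local_id < local_size);
    return group_id * local_size + local_id;
}
```

<p>Under these assumptions, the work-item with local ID 5 in the third
work-group (index 2) of size 64 has global ID 133.</p>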
</dd>
<dt class="hdlist1">Global Memory </dt>
<dd>
<p>A memory region accessible to all <em>work-items</em> executing in a <em>context</em>.
It is accessible to the <em>host</em> using <em>commands</em> such as read, write and
map.
<em>Global memory</em> is included within the <em>generic address space</em> that
includes the private and local address spaces.</p>
</dd>
<dt class="hdlist1">GL share group </dt>
<dd>
<p>A <em>GL share group</em> object manages shared OpenGL or OpenGL ES resources
such as textures, buffers, framebuffers, and renderbuffers and is
associated with one or more GL context objects.
The <em>GL share group</em> is typically an opaque object and not directly
accessible.</p>
</dd>
<dt class="hdlist1">Handle </dt>
<dd>
<p>An opaque type that references an <em>object</em> allocated by OpenCL.
Any operation on an <em>object</em> occurs by reference to that object&#8217;s handle.</p>
</dd>
<dt class="hdlist1">Happens before </dt>
<dd>
<p>An ordering relationship between operations that execute on multiple
units of execution.
If an operation A happens-before operation B then A must occur before B;
in particular, any value written by A will be visible to B.
We define two separate happens-before relations: <em>global-happens-before</em>
and <em>local-happens-before</em>.
These are defined in <a href="#memory-ordering-rules">Memory Model: Memory
Ordering Rules</a>.</p>
</dd>
<dt class="hdlist1">Host </dt>
<dd>
<p>The <em>host</em> interacts with the <em>context</em> using the OpenCL API.</p>
</dd>
<dt class="hdlist1">Host-thread </dt>
<dd>
<p>The unit of execution that executes the statements in the host program.</p>
</dd>
<dt class="hdlist1">Host pointer </dt>
<dd>
<p>A pointer to memory that is in the virtual address space on the <em>host</em>.</p>
</dd>
<dt class="hdlist1">Illegal </dt>
<dd>
<p>Behavior of a system that is explicitly not allowed and will be reported
as an error when encountered by OpenCL.</p>
</dd>
<dt class="hdlist1">Image Object </dt>
<dd>
<p>A <em>memory object</em> that stores a two- or three-dimensional structured
array.
Image data can only be accessed with read and write functions.
The read functions use a <em>sampler</em>.</p>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <em>image object</em> encapsulates the following information:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Dimensions of the image.</p>
</li>
<li>
<p>Description of each element in the image.</p>
</li>
<li>
<p>Properties that describe usage information and which region to allocate
from.</p>
</li>
<li>
<p>Image data.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The elements of an image are selected from a list of predefined image
formats.</p>
</div>
</div>
</div>
</dd>
<dt class="hdlist1">Implementation Defined </dt>
<dd>
<p>Behavior that is explicitly allowed to vary between conforming
implementations of OpenCL.
An OpenCL implementor is required to document the implementation-defined
behavior.</p>
</dd>
<dt class="hdlist1">Independent Forward Progress </dt>
<dd>
<p>If an entity supports independent forward progress, then if it is
otherwise not dependent on any actions due to be performed by any other
entity (for example it does not wait on a lock held by, and thus that
must be released by, any other entity), then its execution cannot be
blocked by the execution of any other entity in the system (it will not
be starved).
Work-items in a sub-group, for example, typically do not support
independent forward progress, so one work-item in a sub-group may be
completely blocked (starved) if a different work-item in the same
sub-group enters a spin loop.</p>
</dd>
<dt class="hdlist1">In-order Execution </dt>
<dd>
<p>A model of execution in OpenCL where the <em>commands</em> in a <em>command-queue</em>
are executed in order of submission with each <em>command</em> running to
completion before the next one begins.
See <em>Out-of-order Execution</em>.</p>
</dd>
<dt class="hdlist1">Intermediate Language </dt>
<dd>
<p>A lower-level language that may be used to create programs.
SPIR-V is a required IL for OpenCL 2.2 runtimes.
Additional ILs may be accepted on an implementation-defined basis.</p>
</dd>
<dt class="hdlist1">Kernel </dt>
<dd>
<p>A <em>kernel</em> is a function declared in a <em>program</em> and executed on an
OpenCL <em>device</em>.
A <em>kernel</em> is identified by the <code>__kernel</code> or <code>kernel</code> qualifier applied to
any function defined in a <em>program</em>.</p>
</dd>
<dt class="hdlist1">Kernel-instance </dt>
<dd>
<p>The work carried out by an OpenCL program occurs through the execution
of kernel-instances on devices.
The kernel instance is the <em>kernel object</em>, the values associated with
the arguments to the kernel, and the parameters that define the
<em>NDRange</em> index space.</p>
</dd>
<dt class="hdlist1">Kernel Object </dt>
<dd>
<p>A <em>kernel object</em> encapsulates a specific <code>__kernel</code> function declared
in a <em>program</em> and the argument values to be used when executing this
<code>__kernel</code> function.</p>
</dd>
<dt class="hdlist1">Kernel Language </dt>
<dd>
<p>A language that is used to create source code for kernels.
Supported kernel languages include OpenCL C, OpenCL C++, and the OpenCL
dialect of SPIR-V.</p>
</dd>
<dt class="hdlist1">Launch </dt>
<dd>
<p>The transition of a command from the <em>submitted</em> state to the <em>ready</em>
state.
See <em>Ready</em>.</p>
</dd>
<dt class="hdlist1">Local ID </dt>
<dd>
<p>A <em>local ID</em> specifies a unique <em>work-item ID</em> within a given
<em>work-group</em> that is executing a <em>kernel</em>.
The <em>local ID</em> is an N-dimensional value that starts at (0, 0, &#8230;&#8203; 0).
See also <em>Global ID</em>.</p>
</dd>
<dt class="hdlist1">Local Memory </dt>
<dd>
<p>A memory region associated with a <em>work-group</em> and accessible only by
<em>work-items</em> in that <em>work-group</em>.
<em>Local memory</em> is included within the <em>generic address space</em> that
includes the private and global address spaces.</p>
</dd>
<dt class="hdlist1">Marker </dt>
<dd>
<p>A <em>command</em> queued in a <em>command-queue</em> that can be used to tag all
<em>commands</em> queued before the <em>marker</em> in the <em>command-queue</em>.
The <em>marker</em> command returns an <em>event</em> which can be used by the
<em>application</em> to queue a wait on the marker event, i.e. to wait for all
commands queued before the <em>marker</em> command to complete.</p>
</dd>
<dt class="hdlist1">Memory Consistency Model </dt>
<dd>
<p>Rules that define which values are observed when multiple units of
execution load data from any shared memory plus the synchronization
operations that constrain the order of memory operations and define
synchronization relationships.
The memory consistency model in OpenCL is based on the memory model from
the ISO C11 programming language.</p>
</dd>
<dt class="hdlist1">Memory Objects </dt>
<dd>
<p>A <em>memory object</em> is a handle to a reference counted region of <em>Global
Memory</em>.
Also see <em>Buffer Object</em> and <em>Image Object</em>.</p>
</dd>
<dt class="hdlist1">Memory Regions (or Pools) </dt>
<dd>
<p>A distinct address space in OpenCL.
<em>Memory regions</em> may overlap in physical memory though OpenCL will treat
them as logically distinct.
The <em>memory regions</em> are denoted as <em>private</em>, <em>local</em>, <em>constant</em>, and
<em>global</em>.</p>
</dd>
<dt class="hdlist1">Memory Scopes </dt>
<dd>
<p>These memory scopes define a hierarchy of visibilities when analyzing
the ordering constraints of memory operations.
They are defined by the values of the <strong>memory_scope</strong> enumeration
constant.
Current values are <strong>memory_scope_work_item</strong> (memory constraints only
apply to a single work-item and in practice apply only to image
operations), <strong>memory_scope_sub_group</strong> (memory-ordering constraints only
apply to work-items executing in a sub-group), <strong>memory_scope_work_group</strong>
(memory-ordering constraints only apply to work-items executing in a
work-group), <strong>memory_scope_device</strong> (memory-ordering constraints only
apply to work-items executing on a single device) and
<strong>memory_scope_all_svm_devices</strong> (memory-ordering constraints only apply
to work-items executing across multiple devices and when using shared
virtual memory).</p>
</dd>
<dt class="hdlist1">Modification Order </dt>
<dd>
<p>All modifications to a particular atomic object M occur in some
particular <em>total order</em>, called the <em>modification order</em> of M.
If A and B are modifications of an atomic object M, and A happens-before
B, then A shall precede B in the modification order of M.
Note that the modification order of an atomic object M is independent of
whether M is in local or global memory.</p>
</dd>
<dt class="hdlist1">Nested Parallelism </dt>
<dd>
<p>See <em>device-side enqueue</em>.</p>
</dd>
<dt class="hdlist1">Object </dt>
<dd>
<p>Objects are abstract representations of the resources that can be
manipulated by the OpenCL API.
Examples include <em>program objects</em>, <em>kernel objects</em>, and <em>memory
objects</em>.</p>
</dd>
<dt class="hdlist1">Out-of-Order Execution </dt>
<dd>
<p>A model of execution in which <em>commands</em> placed in the <em>work queue</em> may
begin and complete execution in any order consistent with constraints
imposed by <em>event wait lists</em> and <em>command-queue barriers</em>.
See <em>In-order Execution</em>.</p>
</dd>
<dt class="hdlist1">Parent device </dt>
<dd>
<p>The OpenCL <em>device</em> which is partitioned to create <em>sub-devices</em>.
Not all <em>parent devices</em> are <em>root devices</em>.
A <em>root device</em> might be partitioned and the <em>sub-devices</em> partitioned
again.
In this case, the first set of <em>sub-devices</em> would be <em>parent devices</em>
of the second set, but not the <em>root devices</em>.
Also see <em>Device</em>, <em>parent device</em> and <em>root device</em>.</p>
</dd>
<dt class="hdlist1">Parent kernel </dt>
<dd>
<p>See <em>Device-side enqueue</em>.</p>
</dd>
<dt class="hdlist1">Pipe </dt>
<dd>
<p>The <em>pipe</em> memory object conceptually is an ordered sequence of data
items.
A pipe has two endpoints: a write endpoint into which data items are
inserted, and a read endpoint from which data items are removed.
At any one time, only one kernel instance may write into a pipe, and
only one kernel instance may read from a pipe.
To support the producer-consumer design pattern, one kernel instance
connects to the write endpoint (the producer) while another kernel
instance connects to the read endpoint (the consumer).</p>
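<p>The two-endpoint behaviour described above can be sketched with a small
host-side model. This is an illustration only: the type <code>mock_pipe</code>, its
functions and the fixed capacity are hypothetical, first-in first-out
ordering is assumed, and the model does not use or represent the OpenCL
pipe API.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_PIPE_CAPACITY 4  /* hypothetical fixed capacity */

/* Hypothetical model of a pipe as a bounded FIFO of int items. */
typedef struct {
    int    items[MOCK_PIPE_CAPACITY];
    size_t head;   /* index of the next item at the read endpoint */
    size_t count;  /* number of items currently stored */
} mock_pipe;

void mock_pipe_init(mock_pipe *p)
{
    assert(p != NULL);
    p->head = 0;
    p->count = 0;
}

/* Write endpoint (producer): returns false when the pipe is full. */
bool mock_pipe_write(mock_pipe *p, int item)
{
    assert(p != NULL);
    if (p->count == MOCK_PIPE_CAPACITY)
        return false;
    p->items[(p->head + p->count) % MOCK_PIPE_CAPACITY] = item;
    p->count++;
    return true;
}

/* Read endpoint (consumer): returns false when the pipe is empty. */
bool mock_pipe_read(mock_pipe *p, int *out)
{
    assert(p != NULL && out != NULL);
    if (p->count == 0)
        return false;
    *out = p->items[p->head];
    p->head = (p->head + 1) % MOCK_PIPE_CAPACITY;
    p->count--;
    return true;
}
```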
</dd>
<dt class="hdlist1">Platform </dt>
<dd>
<p>The <em>host</em> plus a collection of <em>devices</em> managed by the OpenCL
<em>framework</em> that allow an application to share <em>resources</em> and execute
<em>kernels</em> on <em>devices</em> in the <em>platform</em>.</p>
</dd>
<dt class="hdlist1">Private Memory </dt>
<dd>
<p>A region of memory private to a <em>work-item</em>.
Variables defined in one <em>work-item</em>&#8217;s <em>private memory</em> are not visible
to another <em>work-item</em>.</p>
</dd>
<dt class="hdlist1">Processing Element </dt>
<dd>
<p>A virtual scalar processor.
A work-item may execute on one or more processing elements.</p>
</dd>
<dt class="hdlist1">Program </dt>
<dd>
<p>An OpenCL <em>program</em> consists of a set of <em>kernels</em>.
<em>Programs</em> may also contain auxiliary functions called by the
<code>__kernel</code> functions and constant data.</p>
</dd>
<dt class="hdlist1">Program Object </dt>
<dd>
<p>A <em>program object</em> encapsulates the following information:</p>
<div class="openblock">
<div class="content">
<div class="ulist">
<ul>
<li>
<p>A reference to an associated <em>context</em>.</p>
</li>
<li>
<p>A <em>program</em> source or binary.</p>
</li>
<li>
<p>The latest successfully built program executable, the list of <em>devices</em>
for which the program executable is built, the build options used and a
build log.</p>
</li>
<li>
<p>The number of <em>kernel objects</em> currently attached.</p>
</li>
</ul>
</div>
</div>
</div>
</dd>
<dt class="hdlist1">Queued </dt>
<dd>
<p>The first state in the six state model for the execution of a command.
The transition into this state occurs when the command is enqueued into
a command-queue.</p>
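<p>The six states named in this glossary (<em>Queued</em>, <em>Submitted</em>, <em>Ready</em>,
<em>Running</em>, <em>Ended</em> and <em>Complete</em>) can be sketched as a simple linear
progression. The enumerator names and the helper <code>cmd_next_state</code> below
are hypothetical and are not the constants used by the OpenCL API.</p>

```c
#include <assert.h>

/* Hypothetical enumeration of the six states, in the order given
 * by the glossary entries (first through final). */
typedef enum {
    CMD_QUEUED,     /* first state: enqueued into a command-queue */
    CMD_SUBMITTED,  /* second state: flushed and submitted to the device */
    CMD_READY,      /* third state: prerequisites met (launched) */
    CMD_RUNNING,    /* fourth state: execution has started */
    CMD_ENDED,      /* fifth state: execution has ended */
    CMD_COMPLETE    /* final state */
} cmd_state;

/* Advances a command by one state; CMD_COMPLETE is terminal. */
cmd_state cmd_next_state(cmd_state s)
{
    assert(s <= CMD_COMPLETE);
    return (s == CMD_COMPLETE) ? CMD_COMPLETE : (cmd_state)(s + 1);
}
```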
</dd>
<dt class="hdlist1">Ready </dt>
<dd>
<p>The third state in the six state model for the execution of a command.
The transition into this state occurs when pre-requisites constraining
execution of a command have been met; i.e. the command has been
launched.
When a kernel-enqueue command is launched, work-groups associated with
the command are placed in a device&#8217;s work-pool from which they are
scheduled for execution.</p>
</dd>
<dt class="hdlist1">Re-converged Control Flow </dt>
<dd>
<p>See <em>Control flow</em>.</p>
</dd>
<dt class="hdlist1">Reference Count </dt>
<dd>
<p>The life span of an OpenCL object is determined by its <em>reference
count</em>, an internal count of the number of references to the object.
When you create an object in OpenCL, its <em>reference count</em> is set to
one.
Subsequent calls to the appropriate <em>retain</em> API (such as
<strong>clRetainContext</strong>, <strong>clRetainCommandQueue</strong>) increment the <em>reference
count</em>.
Calls to the appropriate <em>release</em> API (such as <strong>clReleaseContext</strong>,
<strong>clReleaseCommandQueue</strong>) decrement the <em>reference count</em>.
Implementations may also modify the <em>reference count</em>, e.g. to track
attached objects or to ensure correct operation of in-progress or
scheduled activities.
The object becomes inaccessible to host code when the number of
<em>release</em> operations performed matches the number of <em>retain</em> operations
plus the allocation of the object.
At this point the reference count may be zero but this is not
guaranteed.</p>
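<p>The host-visible part of this bookkeeping can be sketched as follows.
The type <code>mock_object</code> and its functions are hypothetical stand-ins for an
OpenCL object and its <em>retain</em> and <em>release</em> calls. The sketch counts only
host references and deliberately ignores any additional references an
implementation may hold internally.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a reference-counted object as seen by host code. */
typedef struct {
    unsigned host_refs;   /* creation plus retains, minus releases */
    bool     accessible;  /* false once the host has released every reference */
} mock_object;

/* Creation sets the reference count to one. */
void mock_create(mock_object *o)
{
    assert(o != NULL);
    o->host_refs = 1;
    o->accessible = true;
}

/* Models a retain call: increments the count. */
void mock_retain(mock_object *o)
{
    assert(o != NULL && o->accessible);
    o->host_refs++;
}

/* Models a release call: decrements the count; the object becomes
 * inaccessible to the host when releases match creation plus retains. */
void mock_release(mock_object *o)
{
    assert(o != NULL && o->accessible);
    o->host_refs--;
    if (o->host_refs == 0)
        o->accessible = false;
}
```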
</dd>
<dt class="hdlist1">Relaxed Consistency </dt>
<dd>
<p>A memory consistency model in which the contents of memory visible to
different <em>work-items</em> or <em>commands</em> may be different except at a
<em>barrier</em> or other explicit synchronization points.</p>
</dd>
<dt class="hdlist1">Relaxed Semantics </dt>
<dd>
<p>A memory order semantics for atomic operations that implies no order
constraints.
The operation is <em>atomic</em> but it has no impact on the order of memory
operations.</p>
</dd>
<dt class="hdlist1">Release Semantics </dt>
<dd>
<p>One of the memory order semantics defined for synchronization
operations.
Release semantics apply to atomic operations that store to memory.
Given two units of execution, <strong>A</strong> and <strong>B</strong>, acting on a shared atomic
object <strong>M</strong>, if <strong>A</strong> uses an atomic store of <strong>M</strong> with release semantics to
synchronize-with an atomic load to <strong>M</strong> by <strong>B</strong> that used acquire
semantics, then <strong>A</strong>'s atomic store will occur <em>after</em> any prior
operations by <strong>A</strong>.
Note that the memory orders <em>acquire</em>, <em>sequentially consistent</em>, and
<em>acquire_release</em> all include <em>acquire semantics</em> and effectively pair
with a store using release semantics.</p>
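<p>A minimal sketch of this pairing, written with ISO C11
<code>&lt;stdatomic.h&gt;</code> on the host rather than OpenCL C atomics, is shown below.
The <code>mailbox</code> type and its functions are hypothetical. The flag <code>ready</code>
plays the role of the shared atomic object <strong>M</strong>, <code>mailbox_publish</code> is the
release store performed by <strong>A</strong>, and <code>mailbox_try_consume</code> is the acquire
load performed by <strong>B</strong>.</p>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message slot shared between two units of execution. */
typedef struct {
    int        payload;  /* ordinary, non-atomic data written by A */
    atomic_int ready;    /* the shared atomic object M */
} mailbox;

void mailbox_init(mailbox *m)
{
    assert(m != NULL);
    m->payload = 0;
    atomic_init(&m->ready, 0);
}

/* Unit A: write the payload, then store to M with release semantics,
 * so the payload write is ordered before the store. */
void mailbox_publish(mailbox *m, int value)
{
    assert(m != NULL);
    m->payload = value;
    atomic_store_explicit(&m->ready, 1, memory_order_release);
}

/* Unit B: load M with acquire semantics; if the flag is set, the
 * payload written before the release store is visible. */
bool mailbox_try_consume(mailbox *m, int *out)
{
    assert(m != NULL && out != NULL);
    if (atomic_load_explicit(&m->ready, memory_order_acquire) == 0)
        return false;
    *out = m->payload;
    return true;
}
```

<p>In a real program <code>mailbox_publish</code> and <code>mailbox_try_consume</code> would be
called from different units of execution; the consumer would retry until
the call returns true.</p>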
</dd>
<dt class="hdlist1">Remainder work-groups </dt>
<dd>
<p>When the work-groups associated with a kernel-instance are defined, the
sizes of a work-group in each dimension may not evenly divide the size
of the NDRange in the corresponding dimensions.
The result is a collection of work-groups on the boundaries of the
NDRange that are smaller than the base work-group size.
These are known as <em>remainder work-groups</em>.</p>
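<p>For a single dimension, the split into full-size and remainder
work-groups is integer division with remainder. The sketch below is
illustrative; the type <code>group_split</code> and the function <code>split_dimension</code>
are hypothetical names, not part of the OpenCL API.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for one NDRange dimension. */
typedef struct {
    size_t full_groups;     /* work-groups with the base work-group size */
    size_t remainder_size;  /* size of the remainder work-group, 0 if none */
} group_split;

/* Splits one NDRange dimension into full-size work-groups plus an
 * optional smaller remainder work-group at the boundary. */
group_split split_dimension(size_t ndrange_size, size_t base_group_size)
{
    group_split s;
    assert(base_group_size > 0);
    s.full_groups = ndrange_size / base_group_size;
    s.remainder_size = ndrange_size % base_group_size;
    return s;
}
```

<p>For example, an NDRange dimension of size 100 with a base work-group size
of 32 yields three full work-groups and one remainder work-group of size 4.</p>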
</dd>
<dt class="hdlist1">Running </dt>
<dd>
<p>The fourth state in the six state model for the execution of a command.
The transition into this state occurs when the execution of the command
starts.
When a Kernel-enqueue command starts, one or more work-groups associated
with the command start to execute.</p>
</dd>
<dt class="hdlist1">Root device </dt>
<dd>
<p>A <em>root device</em> is an OpenCL <em>device</em> that has not been partitioned.
Also see <em>Device</em>, <em>Parent device</em> and <em>Root device</em>.</p>
</dd>
<dt class="hdlist1">Resource </dt>
<dd>
<p>A class of <em>objects</em> defined by OpenCL.
An instance of a <em>resource</em> is an <em>object</em>.
The most common <em>resources</em> are the <em>context</em>, <em>command-queue</em>, <em>program
objects</em>, <em>kernel objects</em>, and <em>memory objects</em>.
Computational resources are hardware elements that participate in the
action of advancing a program counter.
Examples include the <em>host</em>, <em>devices</em>, <em>compute units</em> and <em>processing
elements</em>.</p>
</dd>
<dt class="hdlist1">Retain, Release </dt>
<dd>
<p>The action of incrementing (retain) and decrementing (release) the
reference count of an OpenCL <em>object</em>.
This is a bookkeeping function to make sure the system doesn&#8217;t
remove an <em>object</em> before all instances that use this <em>object</em> have
finished.
Refer to <em>Reference Count</em>.</p>
</dd>
<dt class="hdlist1">Sampler </dt>
<dd>
<p>An <em>object</em> that describes how to sample an image when the image is read
in the <em>kernel</em>.
The image read functions take a <em>sampler</em> as an argument.
The <em>sampler</em> specifies the image addressing mode (i.e. how out-of-range
image coordinates are handled), the filter mode, and whether the input
image coordinate is a normalized or unnormalized value.</p>
</dd>
<dt class="hdlist1">Scope inclusion </dt>
<dd>
<p>Two actions <strong>A</strong> and <strong>B</strong> are defined to have an inclusive scope if they
have the same scope <strong>P</strong> such that: (1) if <strong>P</strong> is
<strong>memory_scope_sub_group</strong>, and <strong>A</strong> and <strong>B</strong> are executed by work-items
within the same sub-group, or (2) if <strong>P</strong> is <strong>memory_scope_work_group</strong>,
and <strong>A</strong> and <strong>B</strong> are executed by work-items within the same work-group,
or (3) if <strong>P</strong> is <strong>memory_scope_device</strong>, and <strong>A</strong> and <strong>B</strong> are executed by
work-items on the same device, or (4) if <strong>P</strong> is
<strong>memory_scope_all_svm_devices</strong>, and <strong>A</strong> and <strong>B</strong> are executed by host
threads or by work-items on one or more devices that can share SVM
memory with each other and the host process.</p>
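<p>The four cases can be sketched as a predicate over two actions. The model
below is a simplification with hypothetical names. Each action is tagged
with integer indices for its device, work-group and sub-group. Host
threads are not modelled, work-group indices are assumed to be compared
within one kernel-instance, and all devices are assumed able to share SVM
with each other and the host.</p>

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enumeration mirroring the four scopes in the definition. */
typedef enum {
    SCOPE_SUB_GROUP,
    SCOPE_WORK_GROUP,
    SCOPE_DEVICE,
    SCOPE_ALL_SVM_DEVICES
} mem_scope;

/* Hypothetical description of where an action executes and with which scope. */
typedef struct {
    int       device;      /* index of the executing device */
    int       work_group;  /* index of the work-group on that device */
    int       sub_group;   /* index of the sub-group in that work-group */
    mem_scope scope;       /* memory scope P used by the action */
} action;

/* Returns true if actions a and b have an inclusive scope. */
bool scopes_inclusive(action a, action b)
{
    assert(a.scope <= SCOPE_ALL_SVM_DEVICES && b.scope <= SCOPE_ALL_SVM_DEVICES);
    if (a.scope != b.scope)
        return false;  /* the actions must have the same scope P */
    switch (a.scope) {
    case SCOPE_SUB_GROUP:       /* case (1): same sub-group */
        return a.device == b.device && a.work_group == b.work_group &&
               a.sub_group == b.sub_group;
    case SCOPE_WORK_GROUP:      /* case (2): same work-group */
        return a.device == b.device && a.work_group == b.work_group;
    case SCOPE_DEVICE:          /* case (3): same device */
        return a.device == b.device;
    case SCOPE_ALL_SVM_DEVICES: /* case (4): assumed SVM-sharing devices */
        return true;
    }
    return false;
}
```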
</dd>
<dt class="hdlist1">Sequenced before </dt>
<dd>
<p>A relation between evaluations executed by a single unit of execution.
Sequenced-before is an asymmetric, transitive, pair-wise relation that
induces a partial order between evaluations.
Given any two evaluations A and B, if A is sequenced-before B, then the
execution of A shall precede the execution of B.</p>
</dd>
<dt class="hdlist1">Sequential consistency </dt>
<dd>
<p>Sequential consistency interleaves the steps executed by each unit of
execution.
Each access to a memory location sees the last assignment to that
location in that interleaving.</p>
</dd>
<dt class="hdlist1">Sequentially consistent semantics </dt>
<dd>
<p>One of the memory order semantics defined for synchronization
operations.
When using sequentially-consistent synchronization operations, the loads
and stores within one unit of execution appear to execute in program
order (i.e., the sequenced-before order), and loads and stores from
different units of execution appear to be simply interleaved.</p>
</dd>
<dt class="hdlist1">Shared Virtual Memory (SVM) </dt>
<dd>
<p>An address space exposed to both the host and the devices within a
context.
SVM causes addresses to be meaningful between the host and all of the
devices within a context and therefore supports the use of pointer based
data structures in OpenCL kernels.
It logically extends a portion of the global memory into the host
address space therefore giving work-items access to the host address
space.
There are three types of SVM in OpenCL:</p>
<div class="openblock">
<div class="content">
<div class="dlist">
<dl>
<dt class="hdlist1"><em>Coarse-Grained buffer SVM</em> </dt>
<dd>
<p>Sharing occurs at the granularity of regions of OpenCL buffer memory
objects.</p>
</dd>
<dt class="hdlist1"><em>Fine-Grained buffer SVM</em> </dt>
<dd>
<p>Sharing occurs at the granularity of individual loads/stores into bytes
within OpenCL buffer memory objects.</p>
</dd>
<dt class="hdlist1"><em>Fine-Grained system SVM</em> </dt>
<dd>
<p>Sharing occurs at the granularity of individual loads/stores into bytes
occurring anywhere within the host memory.</p>
</dd>
</dl>
</div>
</div>
</div>
</dd>
<dt class="hdlist1">SIMD </dt>
<dd>
<p>Single Instruction Multiple Data.
A programming model where a <em>kernel</em> is executed concurrently on
multiple <em>processing elements</em> each with its own data and a shared
program counter.
All <em>processing elements</em> execute a strictly identical set of
instructions.</p>
</dd>
<dt class="hdlist1">Specialization constants </dt>
<dd>
<p>Specialization is intended for constant objects that will not have known
constant values until after initial generation of a module in an intermediate
representation format (e.g. SPIR-V). Such objects are called specialization
constants.
An application may provide values for the specialization constants that
will be used when the program is built from the intermediate format.
Specialization constants that do not receive a value from an application
shall use default values as defined in the OpenCL C++ or SPIR-V specification.</p>
</dd>
<dt class="hdlist1">SPMD </dt>
<dd>
<p>Single Program Multiple Data.
A programming model where a <em>kernel</em> is executed concurrently on
multiple <em>processing elements</em> each with its own data and its own
program counter.
Hence, while all computational resources run the same <em>kernel</em>, they
maintain their own instruction counter and, due to branches in a
<em>kernel</em>, the actual sequence of instructions can be quite different
across the set of <em>processing elements</em>.</p>
</dd>
<dt class="hdlist1">Sub-device </dt>
<dd>
<p>An OpenCL <em>device</em> can be partitioned into multiple <em>sub-devices</em>.
The new <em>sub-devices</em> alias specific collections of compute units within
the parent <em>device</em>, according to a partition scheme.
The <em>sub-devices</em> may be used in any situation in which their parent
<em>device</em> may be used.
Partitioning a <em>device</em> does not destroy the parent <em>device</em>, which may
continue to be used alongside and intermingled with its child
<em>sub-devices</em>.
Also see <em>Device</em>, <em>Parent device</em> and <em>Root device</em>.</p>
</dd>
<dt class="hdlist1">Sub-group </dt>
<dd>
<p>Sub-groups are an implementation-dependent grouping of work-items within
a work-group.
The size and number of sub-groups are implementation-defined.</p>
</dd>
<dt class="hdlist1">Sub-group Barrier </dt>
<dd>
<p>See <em>Barrier</em>.</p>
</dd>
<dt class="hdlist1">Submitted </dt>
<dd>
<p>The second state in the six-state model for the execution of a command.
The transition into this state occurs when the command is flushed from
the command-queue and submitted for execution on the device.
Once submitted, a programmer can assume a command will execute once its
prerequisites have been met.</p>
</dd>
<dt class="hdlist1">SVM Buffer </dt>
<dd>
<p>A memory allocation enabled to work with <em>Shared Virtual Memory (SVM)</em>.
Depending on how the SVM buffer is created, it can be a coarse-grained
or fine-grained SVM buffer.
Optionally it may be wrapped by a <em>Buffer Object</em>.
See <em>Shared Virtual Memory (SVM)</em>.</p>
</dd>
<dt class="hdlist1">Synchronization </dt>
<dd>
<p>Synchronization refers to mechanisms that constrain the order of
execution and the visibility of memory operations between two or more
units of execution.</p>
</dd>
<dt class="hdlist1">Synchronization operations </dt>
<dd>
<p>Operations that define memory order constraints in a program.
They play a special role in controlling how memory operations in one
unit of execution (such as a work-item or, when using SVM, a host thread)
are made visible to another.
Synchronization operations in OpenCL include <em>atomic operations</em> and
<em>fences</em>.</p>
</dd>
<dt class="hdlist1">Synchronization point </dt>
<dd>
<p>A synchronization point between a pair of commands (A and B) assures
that the results of command A happen before command B is launched (i.e.
enters the ready state).</p>
</dd>
<dt class="hdlist1">Synchronizes with </dt>
<dd>
<p>A relation between operations in two different units of execution that
defines a memory order constraint in global memory
(<em>global-synchronizes-with</em>) or local memory
(<em>local-synchronizes-with</em>).</p>
</dd>
<dt class="hdlist1">Task Parallel Programming Model </dt>
<dd>
<p>A programming model in which computations are expressed in terms of
multiple concurrent tasks executing in one or more <em>command-queues</em>.
The concurrent tasks can be running different <em>kernels</em>.</p>
</dd>
<dt class="hdlist1">Thread-safe </dt>
<dd>
<p>An OpenCL API call is considered to be <em>thread-safe</em> if the internal
state as managed by OpenCL remains consistent when called simultaneously
by multiple <em>host</em> threads.
OpenCL API calls that are <em>thread-safe</em> allow an application to call
these functions in multiple <em>host</em> threads without having to implement
mutual exclusion across these <em>host</em> threads; i.e. they are also
re-entrant-safe.</p>
</dd>
<dt class="hdlist1">Undefined </dt>
<dd>
<p>The behavior of an OpenCL API call, built-in function used inside a
<em>kernel</em> or execution of a <em>kernel</em> that is explicitly not defined by
OpenCL.
A conforming implementation is not required to specify what occurs when
an undefined construct is encountered in OpenCL.</p>
</dd>
<dt class="hdlist1">Unit of execution </dt>
<dd>
<p>A generic term for a process, OS-managed thread running on the host (a
host-thread), kernel-instance, host program, work-item or any other
executable agent that advances the work associated with a program.</p>
</dd>
<dt class="hdlist1">Work-group </dt>
<dd>
<p>A collection of related <em>work-items</em> that execute on a single <em>compute
unit</em>.
The <em>work-items</em> in the group execute the same <em>kernel-instance</em> and
share <em>local memory</em> and <em>work-group functions</em>.</p>
</dd>
<dt class="hdlist1">Work-group Barrier </dt>
<dd>
<p>See <em>Barrier</em>.</p>
</dd>
<dt class="hdlist1">Work-group Function </dt>
<dd>
<p>A function that carries out collective operations across all the
work-items in a work-group.
Available collective operations are a barrier, reduction, broadcast,
prefix sum, and evaluation of a predicate.
A work-group function must occur within a <em>converged control flow</em>; i.e.
all work-items in the work-group must encounter precisely the same
work-group function.</p>
</dd>
<dt class="hdlist1">Work-group Synchronization </dt>
<dd>
<p>Constraints on the order of execution for work-items in a single
work-group.</p>
</dd>
<dt class="hdlist1">Work-pool </dt>
<dd>
<p>A logical pool associated with a device that holds commands and
work-groups from kernel-instances that are ready to execute.
OpenCL does not constrain the order in which commands and work-groups are
scheduled for execution from the work-pool; i.e. a programmer must
assume that they could be interleaved.
There is one work-pool per device used by all command-queues associated
with that device.
The work-pool may be implemented in any manner as long as it assures
that work-groups placed in the pool will eventually execute.</p>
</dd>
<dt class="hdlist1">Work-item </dt>
<dd>
<p>One of a collection of parallel executions of a <em>kernel</em> invoked on a
<em>device</em> by a <em>command</em>.
A <em>work-item</em> is executed by one or more <em>processing elements</em> as part
of a <em>work-group</em> executing on a <em>compute unit</em>.
A <em>work-item</em> is distinguished from other work-items by its <em>global ID</em>
or the combination of its <em>work-group</em> ID and its <em>local ID</em> within a
<em>work-group</em>.</p>
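<div class="paragraph">
<p>The two ways of identifying a work-item can be related by simple index
arithmetic.
The following is a minimal sketch for a single dimension, assuming a uniform
work-group size and the usual relation
global ID = work-group ID * work-group size + local ID + global offset.
The helper names are hypothetical and are not part of the OpenCL API.</p>
</div>

```c
#include <stddef.h>

/* Hypothetical helpers (not OpenCL API calls): one-dimensional ID arithmetic.
 * Assumes every work-group has the same size (local_size > 0). */

/* Global ID from the work-group ID and the local ID within that work-group. */
size_t global_id_from_local(size_t group_id, size_t local_size,
                            size_t local_id, size_t global_offset)
{
    return group_id * local_size + local_id + global_offset;
}

/* Work-group ID recovered from a global ID. */
size_t group_id_from_global(size_t global_id, size_t local_size,
                            size_t global_offset)
{
    return (global_id - global_offset) / local_size;
}

/* Local ID recovered from a global ID. */
size_t local_id_from_global(size_t global_id, size_t local_size,
                            size_t global_offset)
{
    return (global_id - global_offset) % local_size;
}
```

<div class="paragraph">
<p>For example, with a work-group size of 64 and no offset, the work-item with
local ID 5 in work-group 2 has global ID 133, and the inverse helpers map 133
back to work-group 2 and local ID 5.</p>
</div>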
</dd>
</dl>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_the_opencl_architecture">3. The OpenCL Architecture</h2>
<div class="sectionbody">
<div class="paragraph">
<p><strong>OpenCL</strong> is an open industry standard for programming a heterogeneous
collection of CPUs, GPUs and other discrete computing devices organized into
a single platform.
It is more than a language.
OpenCL is a framework for parallel programming and includes a language, API,
libraries and a runtime system to support software development.
Using OpenCL, for example, a programmer can write general purpose programs
that execute on GPUs without the need to map their algorithms onto a 3D
graphics API such as OpenGL or DirectX.</p>
</div>
<div class="paragraph">
<p>The target of OpenCL is expert programmers wanting to write portable yet
efficient code.
This includes library writers, middleware vendors, and performance-oriented
application programmers.
Therefore, OpenCL provides a low-level hardware abstraction plus a framework
to support programming, and many details of the underlying hardware are
exposed.</p>
</div>
<div class="paragraph">
<p>To describe the core ideas behind OpenCL, we will use a hierarchy of models:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Platform Model</p>
</li>
<li>
<p>Memory Model</p>
</li>
<li>
<p>Execution Model</p>
</li>
<li>
<p>Programming Model</p>
</li>
</ul>
</div>
<div class="sect2">
<h3 id="_platform_model">3.1. Platform Model</h3>
<div class="paragraph">
<p>The <a href="#platform-model-image">Platform model</a> for OpenCL is defined below.
The model consists of a <strong>host</strong> connected to one or more <strong>OpenCL devices</strong>.
An OpenCL device is divided into one or more <strong>compute units</strong> (CUs) which are
further divided into one or more <strong>processing elements</strong> (PEs).
Computations on a device occur within the processing elements.</p>
</div>
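<div class="paragraph">
<p>The containment hierarchy described above can be pictured as nested data
structures.
The following is an illustrative sketch only: the type and function names are
hypothetical and do not correspond to OpenCL API types, and a real
implementation is free to represent devices differently.</p>
</div>

```c
#include <stddef.h>

/* Hypothetical types mirroring the platform model; not OpenCL API types. */
typedef struct {
    size_t num_processing_elements;   /* PEs inside this compute unit */
} ComputeUnit;

typedef struct {
    const ComputeUnit *compute_units; /* CUs inside this device */
    size_t num_compute_units;
} Device;

typedef struct {
    const Device *devices;            /* devices connected to the host */
    size_t num_devices;
} Platform;

/* Walks host -> devices -> compute units and sums the processing elements. */
size_t total_processing_elements(const Platform *platform)
{
    size_t total = 0;
    for (size_t d = 0; d < platform->num_devices; ++d) {
        const Device *dev = &platform->devices[d];
        for (size_t c = 0; c < dev->num_compute_units; ++c) {
            total += dev->compute_units[c].num_processing_elements;
        }
    }
    return total;
}
```

<div class="paragraph">
<p>A platform with one device of two 8-PE compute units and a second device
with a single 16-PE compute unit would report 32 processing elements in this
model.</p>
</div>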
<div class="paragraph">
<p>An OpenCL application is implemented as both host code and device kernel
code.
The host code portion of an OpenCL application runs on a host processor
according to the models native to the host platform.
The OpenCL application host code submits the kernel code as commands from
the host to OpenCL devices.
An OpenCL device executes the command's computation on the processing
elements within the device.</p>
</div>
<div class="paragraph">
<p>An OpenCL device has considerable latitude on how computations are mapped
onto the device's processing elements.
When processing elements within a compute unit execute the same sequence of
statements across the processing elements, the control flow is said to be
<em>converged</em>.
Hardware optimized for executing a single stream of instructions over
multiple processing elements is well suited to converged control flows.
When the control flow varies from one processing element to another, it is
said to be <em>diverged</em>.
While a kernel always begins execution with a converged control flow, due to
branching statements within a kernel, converged and diverged control flows
may occur within a single kernel.
This provides a great deal of flexibility in the algorithms that can be
implemented with OpenCL.</p>
</div>
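<div class="paragraph">
<p>The distinction between converged and diverged control flow can be
illustrated with a small host-side simulation.
This is a sketch under stated assumptions: each array element stands for the
data seen by one processing element, and <code>takes_branch</code> is a
hypothetical stand-in for a branch condition inside a kernel.
Neither function is part of OpenCL.</p>
</div>

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical branch condition evaluated by each processing element. */
static bool takes_branch(int value)
{
    return value > 0;
}

/* Returns true when every simulated processing element takes the same side
 * of the branch (converged), false when at least one differs (diverged). */
bool is_converged(const int *data, size_t n)
{
    if (n == 0) {
        return true; /* no processing elements, nothing can diverge */
    }
    bool first = takes_branch(data[0]);
    for (size_t i = 1; i < n; ++i) {
        if (takes_branch(data[i]) != first) {
            return false;
        }
    }
    return true;
}
```

<div class="paragraph">
<p>With inputs <code>{1, 2, 3, 4}</code> all simulated processing elements take
the branch and the flow stays converged; changing one input to a negative
value makes that element take the other path, so the flow is diverged.</p>
</div>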
<div id="platform-model-image" class="imageblock" style="text-align: center">
<div class="content">
<img src="" alt="platform model">
</div>
<div class="title">Figure 1. Platform Model &#8230;&#8203; one host plus one or more compute devices each with one or more compute units composed of one or more processing elements.</div>
</div>
<div class="paragraph">
<p>Programmers provide programs in the form of SPIR-V source binaries, OpenCL C
or OpenCL C++ source strings or implementation-defined binary objects.
The OpenCL platform provides a compiler to translate program input of either
form into executable program objects.
The device code compiler may be <em>online</em> or <em>offline</em>.
An <em>online</em> <em>compiler</em> is available during host program execution using
standard APIs.
An <em>offline compiler</em> is invoked outside of host program control, using
platform-specific methods.
The OpenCL runtime allows developers to retrieve a previously compiled device
program executable and to load and execute such an executable.</p>
</div>
<div class="paragraph">
<p>OpenCL defines two kinds of platform profiles: a <em>full profile</em> and a
reduced-functionality <em>embedded profile</em>.
A full profile platform must provide an online compiler for all its devices.
An embedded platform may provide an online compiler, but is not required to
do so.</p>
</div>
<div class="paragraph">
<p>A device may expose special purpose functionality as a <em>built-in function</em>.
The platform provides APIs for enumerating and invoking the built-in
functions offered by a device, but otherwise does not define their
construction or semantics.
A <em>custom device</em> supports only built-in functions, and cannot be programmed
via a kernel language.</p>
</div>
<div class="paragraph">
<p>All device types support the OpenCL execution model, the OpenCL memory
model, and the APIs used in OpenCL to manage devices.</p>
</div>
<div class="paragraph">
<p>The platform model is an abstraction describing how OpenCL views the
hardware.
The relationship between the elements of the platform model and the hardware
in a system may be a fixed property of a device or it may be a dynamic
feature of a program dependent on how a compiler optimizes code to best
utilize physical hardware.</p>
</div>
</div>
<div class="sect2">
<h3 id="_execution_model">3.2. Execution Model</h3>
<div class="paragraph">
<p>The OpenCL execution model is defined in terms of two distinct units of
execution: <strong>kernels</strong> that execute on one or more OpenCL devices and a <strong>host
program</strong> that executes on the host.
With regard to OpenCL, the kernels are where the "work" associated with a
computation occurs.
This work occurs through <strong>work-items</strong> that execute in groups
(<strong>work-groups</strong>).</p>
</div>
<div class="paragraph">
<p>A kernel executes within a well-defined context managed by the host.
The context defines the environment within which kernels execute.
It includes the following resources:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><strong>Devices</strong>: One or more devices exposed by the OpenCL platform.</p>
</li>
<li>
<p><strong>Kernel Objects</strong>: The OpenCL functions with their associated argument
values that run on OpenCL devices.</p>
</li>
<li>
<p><strong>Program Objects</strong>: The program source and executable that implement the
kernels.</p>
</li>
<li>
<p><strong>Memory Objects</strong>: Variables visible to the host and the OpenCL devices.