<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<!--[if IE]><meta http-equiv="X-UA-Compatible" content="IE=edge"><![endif]-->
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="generator" content="Asciidoctor 1.5.8">
<meta name="author" content="Khronos&#174; OpenCL Working Group">
<title>The OpenCL&#8482; C Specification</title>
<style>
/*! normalize.css v2.1.2 | MIT License | git.io/normalize */
/* ========================================================================== HTML5 display definitions ========================================================================== */
/** Correct `block` display not defined in IE 8/9. */
article, aside, details, figcaption, figure, footer, header, hgroup, main, nav, section, summary { display: block; }
/** Correct `inline-block` display not defined in IE 8/9. */
audio, canvas, video { display: inline-block; }
/** Prevent modern browsers from displaying `audio` without controls. Remove excess height in iOS 5 devices. */
audio:not([controls]) { display: none; height: 0; }
/** Address `[hidden]` styling not present in IE 8/9. Hide the `template` element in IE, Safari, and Firefox < 22. */
[hidden], template { display: none; }
script { display: none !important; }
/* ========================================================================== Base ========================================================================== */
/** 1. Set default font family to sans-serif. 2. Prevent iOS text size adjust after orientation change, without disabling user zoom. */
html { font-family: sans-serif; /* 1 */ -ms-text-size-adjust: 100%; /* 2 */ -webkit-text-size-adjust: 100%; /* 2 */ }
/** Remove default margin. */
body { margin: 0; }
/* ========================================================================== Links ========================================================================== */
/** Remove the gray background color from active links in IE 10. */
a { background: transparent; }
/** Address `outline` inconsistency between Chrome and other browsers. */
a:focus { outline: thin dotted; }
/** Improve readability when focused and also mouse hovered in all browsers. */
a:active, a:hover { outline: 0; }
/* ========================================================================== Typography ========================================================================== */
/** Address variable `h1` font-size and margin within `section` and `article` contexts in Firefox 4+, Safari 5, and Chrome. */
h1 { font-size: 2em; margin: 0.67em 0; }
/** Address styling not present in IE 8/9, Safari 5, and Chrome. */
abbr[title] { border-bottom: 1px dotted; }
/** Address style set to `bolder` in Firefox 4+, Safari 5, and Chrome. */
b, strong { font-weight: bold; }
/** Address styling not present in Safari 5 and Chrome. */
dfn { font-style: italic; }
/** Address differences between Firefox and other browsers. */
hr { -moz-box-sizing: content-box; box-sizing: content-box; height: 0; }
/** Address styling not present in IE 8/9. */
mark { background: #ff0; color: #000; }
/** Correct font family set oddly in Safari 5 and Chrome. */
code, kbd, pre, samp { font-family: monospace, serif; font-size: 1em; }
/** Improve readability of pre-formatted text in all browsers. */
pre { white-space: pre-wrap; }
/** Set consistent quote types. */
q { quotes: "\201C" "\201D" "\2018" "\2019"; }
/** Address inconsistent and variable font size in all browsers. */
small { font-size: 80%; }
/** Prevent `sub` and `sup` affecting `line-height` in all browsers. */
sub, sup { font-size: 75%; line-height: 0; position: relative; vertical-align: baseline; }
sup { top: -0.5em; }
sub { bottom: -0.25em; }
/* ========================================================================== Embedded content ========================================================================== */
/** Remove border when inside `a` element in IE 8/9. */
img { border: 0; }
/** Correct overflow displayed oddly in IE 9. */
svg:not(:root) { overflow: hidden; }
/* ========================================================================== Figures ========================================================================== */
/** Address margin not present in IE 8/9 and Safari 5. */
figure { margin: 0; }
/* ========================================================================== Forms ========================================================================== */
/** Define consistent border, margin, and padding. */
fieldset { border: 1px solid #c0c0c0; margin: 0 2px; padding: 0.35em 0.625em 0.75em; }
/** 1. Correct `color` not being inherited in IE 8/9. 2. Remove padding so people aren't caught out if they zero out fieldsets. */
legend { border: 0; /* 1 */ padding: 0; /* 2 */ }
/** 1. Correct font family not being inherited in all browsers. 2. Correct font size not being inherited in all browsers. 3. Address margins set differently in Firefox 4+, Safari 5, and Chrome. */
button, input, select, textarea { font-family: inherit; /* 1 */ font-size: 100%; /* 2 */ margin: 0; /* 3 */ }
/** Address Firefox 4+ setting `line-height` on `input` using `!important` in the UA stylesheet. */
button, input { line-height: normal; }
/** Address inconsistent `text-transform` inheritance for `button` and `select`. All other form control elements do not inherit `text-transform` values. Correct `button` style inheritance in Chrome, Safari 5+, and IE 8+. Correct `select` style inheritance in Firefox 4+ and Opera. */
button, select { text-transform: none; }
/** 1. Avoid the WebKit bug in Android 4.0.* where (2) destroys native `audio` and `video` controls. 2. Correct inability to style clickable `input` types in iOS. 3. Improve usability and consistency of cursor style between image-type `input` and others. */
button, html input[type="button"], input[type="reset"], input[type="submit"] { -webkit-appearance: button; /* 2 */ cursor: pointer; /* 3 */ }
/** Re-set default cursor for disabled elements. */
button[disabled], html input[disabled] { cursor: default; }
/** 1. Address box sizing set to `content-box` in IE 8/9. 2. Remove excess padding in IE 8/9. */
input[type="checkbox"], input[type="radio"] { box-sizing: border-box; /* 1 */ padding: 0; /* 2 */ }
/** 1. Address `appearance` set to `searchfield` in Safari 5 and Chrome. 2. Address `box-sizing` set to `border-box` in Safari 5 and Chrome (include `-moz` to future-proof). */
input[type="search"] { -webkit-appearance: textfield; /* 1 */ -moz-box-sizing: content-box; -webkit-box-sizing: content-box; /* 2 */ box-sizing: content-box; }
/** Remove inner padding and search cancel button in Safari 5 and Chrome on OS X. */
input[type="search"]::-webkit-search-cancel-button, input[type="search"]::-webkit-search-decoration { -webkit-appearance: none; }
/** Remove inner padding and border in Firefox 4+. */
button::-moz-focus-inner, input::-moz-focus-inner { border: 0; padding: 0; }
/** 1. Remove default vertical scrollbar in IE 8/9. 2. Improve readability and alignment in all browsers. */
textarea { overflow: auto; /* 1 */ vertical-align: top; /* 2 */ }
/* ========================================================================== Tables ========================================================================== */
/** Remove most spacing between table cells. */
table { border-collapse: collapse; border-spacing: 0; }
meta.foundation-mq-small { font-family: "only screen and (min-width: 768px)"; width: 768px; }
meta.foundation-mq-medium { font-family: "only screen and (min-width:1280px)"; width: 1280px; }
meta.foundation-mq-large { font-family: "only screen and (min-width:1440px)"; width: 1440px; }
*, *:before, *:after { -moz-box-sizing: border-box; -webkit-box-sizing: border-box; box-sizing: border-box; }
html, body { font-size: 100%; }
body { background: white; color: #222222; padding: 0; margin: 0; font-family: "Helvetica Neue", "Helvetica", Helvetica, Arial, sans-serif; font-weight: normal; font-style: normal; line-height: 1; position: relative; cursor: auto; }
a:hover { cursor: pointer; }
img, object, embed { max-width: 100%; height: auto; }
object, embed { height: 100%; }
img { -ms-interpolation-mode: bicubic; }
#map_canvas img, #map_canvas embed, #map_canvas object, .map_canvas img, .map_canvas embed, .map_canvas object { max-width: none !important; }
.left { float: left !important; }
.right { float: right !important; }
.text-left { text-align: left !important; }
.text-right { text-align: right !important; }
.text-center { text-align: center !important; }
.text-justify { text-align: justify !important; }
.hide { display: none; }
.antialiased { -webkit-font-smoothing: antialiased; }
img { display: inline-block; vertical-align: middle; }
textarea { height: auto; min-height: 50px; }
select { width: 100%; }
object, svg { display: inline-block; vertical-align: middle; }
.center { margin-left: auto; margin-right: auto; }
.spread { width: 100%; }
p.lead, .paragraph.lead > p, #preamble > .sectionbody > .paragraph:first-of-type p { font-size: 1.21875em; line-height: 1.6; }
.subheader, .admonitionblock td.content > .title, .audioblock > .title, .exampleblock > .title, .imageblock > .title, .listingblock > .title, .literalblock > .title, .stemblock > .title, .openblock > .title, .paragraph > .title, .quoteblock > .title, table.tableblock > .title, .verseblock > .title, .videoblock > .title, .dlist > .title, .olist > .title, .ulist > .title, .qlist > .title, .hdlist > .title { line-height: 1.4; color: black; font-weight: 300; margin-top: 0.2em; margin-bottom: 0.5em; }
/* Typography resets */
div, dl, dt, dd, ul, ol, li, h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6, pre, form, p, blockquote, th, td { margin: 0; padding: 0; direction: ltr; }
/* Default Link Styles */
a { color: #0068b0; text-decoration: none; line-height: inherit; }
a:hover, a:focus { color: #333333; }
a img { border: none; }
/* Default paragraph styles */
p { font-family: Noto, sans-serif; font-weight: normal; font-size: 1em; line-height: 1.6; margin-bottom: 0.75em; text-rendering: optimizeLegibility; }
p aside { font-size: 0.875em; line-height: 1.35; font-style: italic; }
/* Default header styles */
h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6 { font-family: Noto, sans-serif; font-weight: normal; font-style: normal; color: black; text-rendering: optimizeLegibility; margin-top: 0.5em; margin-bottom: 0.5em; line-height: 1.2125em; }
h1 small, h2 small, h3 small, #toctitle small, .sidebarblock > .content > .title small, h4 small, h5 small, h6 small { font-size: 60%; color: #4d4d4d; line-height: 0; }
h1 { font-size: 2.125em; }
h2 { font-size: 1.6875em; }
h3, #toctitle, .sidebarblock > .content > .title { font-size: 1.375em; }
h4 { font-size: 1.125em; }
h5 { font-size: 1.125em; }
h6 { font-size: 1em; }
hr { border: solid #dddddd; border-width: 1px 0 0; clear: both; margin: 1.25em 0 1.1875em; height: 0; }
/* Helpful Typography Defaults */
em, i { font-style: italic; line-height: inherit; }
strong, b { font-weight: bold; line-height: inherit; }
small { font-size: 60%; line-height: inherit; }
code { font-family: Consolas, "Liberation Mono", Courier, monospace; font-weight: normal; color: #264357; }
/* Lists */
ul, ol, dl { font-size: 1em; line-height: 1.6; margin-bottom: 0.75em; list-style-position: outside; font-family: Noto, sans-serif; }
ul, ol { margin-left: 1.5em; }
ul.no-bullet, ol.no-bullet { margin-left: 1.5em; }
/* Unordered Lists */
ul li ul, ul li ol { margin-left: 1.25em; margin-bottom: 0; font-size: 1em; /* Override nested font-size change */ }
ul.square li ul, ul.circle li ul, ul.disc li ul { list-style: inherit; }
ul.square { list-style-type: square; }
ul.circle { list-style-type: circle; }
ul.disc { list-style-type: disc; }
ul.no-bullet { list-style: none; }
/* Ordered Lists */
ol li ul, ol li ol { margin-left: 1.25em; margin-bottom: 0; }
/* Definition Lists */
dl dt { margin-bottom: 0.3em; font-weight: bold; }
dl dd { margin-bottom: 0.75em; }
/* Abbreviations */
abbr, acronym { text-transform: uppercase; font-size: 90%; color: black; border-bottom: 1px dotted #dddddd; cursor: help; }
abbr { text-transform: none; }
/* Blockquotes */
blockquote { margin: 0 0 0.75em; padding: 0.5625em 1.25em 0 1.1875em; border-left: 1px solid #dddddd; }
blockquote cite { display: block; font-size: 0.8125em; color: #5e93b8; }
blockquote cite:before { content: "\2014 \0020"; }
blockquote cite a, blockquote cite a:visited { color: #5e93b8; }
blockquote, blockquote p { line-height: 1.6; color: #333333; }
/* Microformats */
.vcard { display: inline-block; margin: 0 0 1.25em 0; border: 1px solid #dddddd; padding: 0.625em 0.75em; }
.vcard li { margin: 0; display: block; }
.vcard .fn { font-weight: bold; font-size: 0.9375em; }
.vevent .summary { font-weight: bold; }
.vevent abbr { cursor: auto; text-decoration: none; font-weight: bold; border: none; padding: 0 0.0625em; }
@media only screen and (min-width: 768px) { h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6 { line-height: 1.4; }
h1 { font-size: 2.75em; }
h2 { font-size: 2.3125em; }
h3, #toctitle, .sidebarblock > .content > .title { font-size: 1.6875em; }
h4 { font-size: 1.4375em; } }
/* Tables */
table { background: white; margin-bottom: 1.25em; border: solid 1px #d8d8ce; }
table thead, table tfoot { background: -webkit-linear-gradient(top, #add386, #90b66a); font-weight: bold; }
table thead tr th, table thead tr td, table tfoot tr th, table tfoot tr td { padding: 0.5em 0.625em 0.625em; font-size: inherit; color: white; text-align: left; }
table tr th, table tr td { padding: 0.5625em 0.625em; font-size: inherit; color: #6d6e71; }
table tr.even, table tr.alt, table tr:nth-of-type(even) { background: #edf2f2; }
table thead tr th, table tfoot tr th, table tbody tr td, table tr td, table tfoot tr td { display: table-cell; line-height: 1.4; }
body { -moz-osx-font-smoothing: grayscale; -webkit-font-smoothing: antialiased; tab-size: 4; }
h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6 { line-height: 1.4; }
a:hover, a:focus { text-decoration: underline; }
.clearfix:before, .clearfix:after, .float-group:before, .float-group:after { content: " "; display: table; }
.clearfix:after, .float-group:after { clear: both; }
*:not(pre) > code { font-size: inherit; font-style: normal !important; letter-spacing: 0; padding: 0; background-color: white; -webkit-border-radius: 0; border-radius: 0; line-height: inherit; word-wrap: break-word; }
*:not(pre) > code.nobreak { word-wrap: normal; }
*:not(pre) > code.nowrap { white-space: nowrap; }
pre, pre > code { line-height: 1.6; color: #264357; font-family: Consolas, "Liberation Mono", Courier, monospace; font-weight: normal; }
em em { font-style: normal; }
strong strong { font-weight: normal; }
.keyseq { color: #333333; }
kbd { font-family: Consolas, "Liberation Mono", Courier, monospace; display: inline-block; color: black; font-size: 0.65em; line-height: 1.45; background-color: #f7f7f7; border: 1px solid #ccc; -webkit-border-radius: 3px; border-radius: 3px; -webkit-box-shadow: 0 1px 0 rgba(0, 0, 0, 0.2), 0 0 0 0.1em white inset; box-shadow: 0 1px 0 rgba(0, 0, 0, 0.2), 0 0 0 0.1em white inset; margin: 0 0.15em; padding: 0.2em 0.5em; vertical-align: middle; position: relative; top: -0.1em; white-space: nowrap; }
.keyseq kbd:first-child { margin-left: 0; }
.keyseq kbd:last-child { margin-right: 0; }
.menuseq, .menuref { color: #000; }
.menuseq b:not(.caret), .menuref { font-weight: inherit; }
.menuseq { word-spacing: -0.02em; }
.menuseq b.caret { font-size: 1.25em; line-height: 0.8; }
.menuseq i.caret { font-weight: bold; text-align: center; width: 0.45em; }
b.button:before, b.button:after { position: relative; top: -1px; font-weight: normal; }
b.button:before { content: "["; padding: 0 3px 0 2px; }
b.button:after { content: "]"; padding: 0 2px 0 3px; }
#header, #content, #footnotes, #footer { width: 100%; margin-left: auto; margin-right: auto; margin-top: 0; margin-bottom: 0; max-width: 62.5em; *zoom: 1; position: relative; padding-left: 1.5em; padding-right: 1.5em; }
#header:before, #header:after, #content:before, #content:after, #footnotes:before, #footnotes:after, #footer:before, #footer:after { content: " "; display: table; }
#header:after, #content:after, #footnotes:after, #footer:after { clear: both; }
#content { margin-top: 1.25em; }
#content:before { content: none; }
#header > h1:first-child { color: black; margin-top: 2.25rem; margin-bottom: 0; }
#header > h1:first-child + #toc { margin-top: 8px; border-top: 1px solid #dddddd; }
#header > h1:only-child, body.toc2 #header > h1:nth-last-child(2) { border-bottom: 1px solid #dddddd; padding-bottom: 8px; }
#header .details { border-bottom: 1px solid #dddddd; line-height: 1.45; padding-top: 0.25em; padding-bottom: 0.25em; padding-left: 0.25em; color: #5e93b8; display: -ms-flexbox; display: -webkit-flex; display: flex; -ms-flex-flow: row wrap; -webkit-flex-flow: row wrap; flex-flow: row wrap; }
#header .details span:first-child { margin-left: -0.125em; }
#header .details span.email a { color: #333333; }
#header .details br { display: none; }
#header .details br + span:before { content: "\00a0\2013\00a0"; }
#header .details br + span.author:before { content: "\00a0\22c5\00a0"; color: #333333; }
#header .details br + span#revremark:before { content: "\00a0|\00a0"; }
#header #revnumber { text-transform: capitalize; }
#header #revnumber:after { content: "\00a0"; }
#content > h1:first-child:not([class]) { color: black; border-bottom: 1px solid #dddddd; padding-bottom: 8px; margin-top: 0; padding-top: 1rem; margin-bottom: 1.25rem; }
#toc { border-bottom: 0 solid #dddddd; padding-bottom: 0.5em; }
#toc > ul { margin-left: 0.125em; }
#toc ul.sectlevel0 > li > a { font-style: italic; }
#toc ul.sectlevel0 ul.sectlevel1 { margin: 0.5em 0; }
#toc ul { font-family: Noto, sans-serif; list-style-type: none; }
#toc li { line-height: 1.3334; margin-top: 0.3334em; }
#toc a { text-decoration: none; }
#toc a:active { text-decoration: underline; }
#toctitle { color: black; font-size: 1.2em; }
@media only screen and (min-width: 768px) { #toctitle { font-size: 1.375em; }
body.toc2 { padding-left: 15em; padding-right: 0; }
#toc.toc2 { margin-top: 0 !important; background-color: white; position: fixed; width: 15em; left: 0; top: 0; border-right: 1px solid #dddddd; border-top-width: 0 !important; border-bottom-width: 0 !important; z-index: 1000; padding: 1.25em 1em; height: 100%; overflow: auto; }
#toc.toc2 #toctitle { margin-top: 0; margin-bottom: 0.8rem; font-size: 1.2em; }
#toc.toc2 > ul { font-size: 0.9em; margin-bottom: 0; }
#toc.toc2 ul ul { margin-left: 0; padding-left: 1em; }
#toc.toc2 ul.sectlevel0 ul.sectlevel1 { padding-left: 0; margin-top: 0.5em; margin-bottom: 0.5em; }
body.toc2.toc-right { padding-left: 0; padding-right: 15em; }
body.toc2.toc-right #toc.toc2 { border-right-width: 0; border-left: 1px solid #dddddd; left: auto; right: 0; } }
@media only screen and (min-width: 1280px) { body.toc2 { padding-left: 20em; padding-right: 0; }
#toc.toc2 { width: 20em; }
#toc.toc2 #toctitle { font-size: 1.375em; }
#toc.toc2 > ul { font-size: 0.95em; }
#toc.toc2 ul ul { padding-left: 1.25em; }
body.toc2.toc-right { padding-left: 0; padding-right: 20em; } }
#content #toc { border-style: solid; border-width: 1px; border-color: #e6e6e6; margin-bottom: 1.25em; padding: 1.25em; background: white; -webkit-border-radius: 0; border-radius: 0; }
#content #toc > :first-child { margin-top: 0; }
#content #toc > :last-child { margin-bottom: 0; }
#footer { max-width: 100%; background-color: transparent; padding: 1.25em; }
#footer-text { color: black; line-height: 1.44; }
#content { margin-bottom: 0.625em; }
.sect1 { padding-bottom: 0.625em; }
@media only screen and (min-width: 768px) { #content { margin-bottom: 1.25em; }
.sect1 { padding-bottom: 1.25em; } }
.sect1:last-child { padding-bottom: 0; }
.sect1 + .sect1 { border-top: 0 solid #dddddd; }
#content h1 > a.anchor, h2 > a.anchor, h3 > a.anchor, #toctitle > a.anchor, .sidebarblock > .content > .title > a.anchor, h4 > a.anchor, h5 > a.anchor, h6 > a.anchor { position: absolute; z-index: 1001; width: 1.5ex; margin-left: -1.5ex; display: block; text-decoration: none !important; visibility: hidden; text-align: center; font-weight: normal; }
#content h1 > a.anchor:before, h2 > a.anchor:before, h3 > a.anchor:before, #toctitle > a.anchor:before, .sidebarblock > .content > .title > a.anchor:before, h4 > a.anchor:before, h5 > a.anchor:before, h6 > a.anchor:before { content: "\00A7"; font-size: 0.85em; display: block; padding-top: 0.1em; }
#content h1:hover > a.anchor, #content h1 > a.anchor:hover, h2:hover > a.anchor, h2 > a.anchor:hover, h3:hover > a.anchor, #toctitle:hover > a.anchor, .sidebarblock > .content > .title:hover > a.anchor, h3 > a.anchor:hover, #toctitle > a.anchor:hover, .sidebarblock > .content > .title > a.anchor:hover, h4:hover > a.anchor, h4 > a.anchor:hover, h5:hover > a.anchor, h5 > a.anchor:hover, h6:hover > a.anchor, h6 > a.anchor:hover { visibility: visible; }
#content h1 > a.link, h2 > a.link, h3 > a.link, #toctitle > a.link, .sidebarblock > .content > .title > a.link, h4 > a.link, h5 > a.link, h6 > a.link { color: black; text-decoration: none; }
#content h1 > a.link:hover, h2 > a.link:hover, h3 > a.link:hover, #toctitle > a.link:hover, .sidebarblock > .content > .title > a.link:hover, h4 > a.link:hover, h5 > a.link:hover, h6 > a.link:hover { color: black; }
.audioblock, .imageblock, .literalblock, .listingblock, .stemblock, .videoblock { margin-bottom: 1.25em; }
.admonitionblock td.content > .title, .audioblock > .title, .exampleblock > .title, .imageblock > .title, .listingblock > .title, .literalblock > .title, .stemblock > .title, .openblock > .title, .paragraph > .title, .quoteblock > .title, table.tableblock > .title, .verseblock > .title, .videoblock > .title, .dlist > .title, .olist > .title, .ulist > .title, .qlist > .title, .hdlist > .title { text-rendering: optimizeLegibility; text-align: left; }
table.tableblock > caption.title { white-space: nowrap; overflow: visible; max-width: 0; }
.paragraph.lead > p, #preamble > .sectionbody > .paragraph:first-of-type p { color: black; }
table.tableblock #preamble > .sectionbody > .paragraph:first-of-type p { font-size: inherit; }
.admonitionblock > table { border-collapse: separate; border: 0; background: none; width: 100%; }
.admonitionblock > table td.icon { text-align: center; width: 80px; }
.admonitionblock > table td.icon img { max-width: initial; }
.admonitionblock > table td.icon .title { font-weight: bold; font-family: Noto, sans-serif; text-transform: uppercase; }
.admonitionblock > table td.content { padding-left: 1.125em; padding-right: 1.25em; border-left: 1px solid #dddddd; color: #5e93b8; }
.admonitionblock > table td.content > :last-child > :last-child { margin-bottom: 0; }
.exampleblock > .content { border-style: solid; border-width: 1px; border-color: #e6e6e6; margin-bottom: 1.25em; padding: 1.25em; background: white; -webkit-border-radius: 0; border-radius: 0; }
.exampleblock > .content > :first-child { margin-top: 0; }
.exampleblock > .content > :last-child { margin-bottom: 0; }
.sidebarblock { border-style: solid; border-width: 1px; border-color: #e6e6e6; margin-bottom: 1.25em; padding: 1.25em; background: white; -webkit-border-radius: 0; border-radius: 0; }
.sidebarblock > :first-child { margin-top: 0; }
.sidebarblock > :last-child { margin-bottom: 0; }
.sidebarblock > .content > .title { color: black; margin-top: 0; }
.exampleblock > .content > :last-child > :last-child, .exampleblock > .content .olist > ol > li:last-child > :last-child, .exampleblock > .content .ulist > ul > li:last-child > :last-child, .exampleblock > .content .qlist > ol > li:last-child > :last-child, .sidebarblock > .content > :last-child > :last-child, .sidebarblock > .content .olist > ol > li:last-child > :last-child, .sidebarblock > .content .ulist > ul > li:last-child > :last-child, .sidebarblock > .content .qlist > ol > li:last-child > :last-child { margin-bottom: 0; }
.literalblock pre, .listingblock pre:not(.highlight), .listingblock pre[class="highlight"], .listingblock pre[class^="highlight "], .listingblock pre.CodeRay, .listingblock pre.prettyprint { background: #eeeeee; }
.sidebarblock .literalblock pre, .sidebarblock .listingblock pre:not(.highlight), .sidebarblock .listingblock pre[class="highlight"], .sidebarblock .listingblock pre[class^="highlight "], .sidebarblock .listingblock pre.CodeRay, .sidebarblock .listingblock pre.prettyprint { background: #f2f1f1; }
.literalblock pre, .literalblock pre[class], .listingblock pre, .listingblock pre[class] { border: 1px hidden #666666; -webkit-border-radius: 0; border-radius: 0; word-wrap: break-word; padding: 1.25em 1.5625em 1.125em 1.5625em; font-size: 0.8125em; }
.literalblock pre.nowrap, .literalblock pre[class].nowrap, .listingblock pre.nowrap, .listingblock pre[class].nowrap { overflow-x: auto; white-space: pre; word-wrap: normal; }
@media only screen and (min-width: 768px) { .literalblock pre, .literalblock pre[class], .listingblock pre, .listingblock pre[class] { font-size: 0.90625em; } }
@media only screen and (min-width: 1280px) { .literalblock pre, .literalblock pre[class], .listingblock pre, .listingblock pre[class] { font-size: 1em; } }
.literalblock.output pre { color: #eeeeee; background-color: #264357; }
.listingblock pre.highlightjs { padding: 0; }
.listingblock pre.highlightjs > code { padding: 1.25em 1.5625em 1.125em 1.5625em; -webkit-border-radius: 0; border-radius: 0; }
.listingblock > .content { position: relative; }
.listingblock code[data-lang]:before { display: none; content: attr(data-lang); position: absolute; font-size: 0.75em; top: 0.425rem; right: 0.5rem; line-height: 1; text-transform: uppercase; color: #999; }
.listingblock:hover code[data-lang]:before { display: block; }
.listingblock.terminal pre .command:before { content: attr(data-prompt); padding-right: 0.5em; color: #999; }
.listingblock.terminal pre .command:not([data-prompt]):before { content: "$"; }
table.pyhltable { border-collapse: separate; border: 0; margin-bottom: 0; background: none; }
table.pyhltable td { vertical-align: top; padding-top: 0; padding-bottom: 0; line-height: 1.6; }
table.pyhltable td.code { padding-left: .75em; padding-right: 0; }
pre.pygments .lineno, table.pyhltable td:not(.code) { color: #999; padding-left: 0; padding-right: .5em; border-right: 1px solid #dddddd; }
pre.pygments .lineno { display: inline-block; margin-right: .25em; }
table.pyhltable .linenodiv { background: none !important; padding-right: 0 !important; }
.quoteblock { margin: 0 1em 0.75em 1.5em; display: table; }
.quoteblock > .title { margin-left: -1.5em; margin-bottom: 0.75em; }
.quoteblock blockquote, .quoteblock blockquote p { color: #333333; font-size: 1.15rem; line-height: 1.75; word-spacing: 0.1em; letter-spacing: 0; font-style: italic; text-align: justify; }
.quoteblock blockquote { margin: 0; padding: 0; border: 0; }
.quoteblock blockquote:before { content: "\201c"; float: left; font-size: 2.75em; font-weight: bold; line-height: 0.6em; margin-left: -0.6em; color: black; text-shadow: 0 1px 2px rgba(0, 0, 0, 0.1); }
.quoteblock blockquote > .paragraph:last-child p { margin-bottom: 0; }
.quoteblock .attribution { margin-top: 0.5em; margin-right: 0.5ex; text-align: right; }
.quoteblock .quoteblock { margin-left: 0; margin-right: 0; padding: 0.5em 0; border-left: 3px solid #5e93b8; }
.quoteblock .quoteblock blockquote { padding: 0 0 0 0.75em; }
.quoteblock .quoteblock blockquote:before { display: none; }
.verseblock { margin: 0 1em 0.75em 1em; }
.verseblock pre { font-family: "Open Sans", "DejaVu Sans", sans-serif; font-size: 1.15rem; color: #333333; font-weight: 300; text-rendering: optimizeLegibility; }
.verseblock pre strong { font-weight: 400; }
.verseblock .attribution { margin-top: 1.25rem; margin-left: 0.5ex; }
.quoteblock .attribution, .verseblock .attribution { font-size: 0.8125em; line-height: 1.45; font-style: italic; }
.quoteblock .attribution br, .verseblock .attribution br { display: none; }
.quoteblock .attribution cite, .verseblock .attribution cite { display: block; letter-spacing: -0.025em; color: #5e93b8; }
.quoteblock.abstract { margin: 0 0 0.75em 0; display: block; }
.quoteblock.abstract blockquote, .quoteblock.abstract blockquote p { text-align: left; word-spacing: 0; }
.quoteblock.abstract blockquote:before, .quoteblock.abstract blockquote p:first-of-type:before { display: none; }
table.tableblock { max-width: 100%; border-collapse: separate; }
table.tableblock td > .paragraph:last-child p > p:last-child, table.tableblock th > p:last-child, table.tableblock td > p:last-child { margin-bottom: 0; }
table.tableblock, th.tableblock, td.tableblock { border: 0 solid #d8d8ce; }
table.grid-all > thead > tr > .tableblock, table.grid-all > tbody > tr > .tableblock { border-width: 0 1px 1px 0; }
table.grid-all > tfoot > tr > .tableblock { border-width: 1px 1px 0 0; }
table.grid-cols > * > tr > .tableblock { border-width: 0 1px 0 0; }
table.grid-rows > thead > tr > .tableblock, table.grid-rows > tbody > tr > .tableblock { border-width: 0 0 1px 0; }
table.grid-rows > tfoot > tr > .tableblock { border-width: 1px 0 0 0; }
table.grid-all > * > tr > .tableblock:last-child, table.grid-cols > * > tr > .tableblock:last-child { border-right-width: 0; }
table.grid-all > tbody > tr:last-child > .tableblock, table.grid-all > thead:last-child > tr > .tableblock, table.grid-rows > tbody > tr:last-child > .tableblock, table.grid-rows > thead:last-child > tr > .tableblock { border-bottom-width: 0; }
table.frame-all { border-width: 1px; }
table.frame-sides { border-width: 0 1px; }
table.frame-topbot { border-width: 1px 0; }
th.halign-left, td.halign-left { text-align: left; }
th.halign-right, td.halign-right { text-align: right; }
th.halign-center, td.halign-center { text-align: center; }
th.valign-top, td.valign-top { vertical-align: top; }
th.valign-bottom, td.valign-bottom { vertical-align: bottom; }
th.valign-middle, td.valign-middle { vertical-align: middle; }
table thead th, table tfoot th { font-weight: bold; }
tbody tr th { display: table-cell; line-height: 1.4; background: -webkit-linear-gradient(top, #add386, #90b66a); }
tbody tr th, tbody tr th p, tfoot tr th, tfoot tr th p { color: white; font-weight: bold; }
p.tableblock > code:only-child { background: none; padding: 0; }
p.tableblock { font-size: 1em; }
td > div.verse { white-space: pre; }
ol { margin-left: 1.75em; }
ul li ol { margin-left: 1.5em; }
dl dd { margin-left: 1.125em; }
dl dd:last-child, dl dd:last-child > :last-child { margin-bottom: 0; }
ol > li p, ul > li p, ul dd, ol dd, .olist .olist, .ulist .ulist, .ulist .olist, .olist .ulist { margin-bottom: 0.375em; }
ul.checklist, ul.none, ol.none, ul.no-bullet, ol.no-bullet, ol.unnumbered, ul.unstyled, ol.unstyled { list-style-type: none; }
ul.no-bullet, ol.no-bullet, ol.unnumbered { margin-left: 0.625em; }
ul.unstyled, ol.unstyled { margin-left: 0; }
ul.checklist { margin-left: 0.625em; }
ul.checklist li > p:first-child > .fa-square-o:first-child, ul.checklist li > p:first-child > .fa-check-square-o:first-child { width: 1.25em; font-size: 0.8em; position: relative; bottom: 0.125em; }
ul.checklist li > p:first-child > input[type="checkbox"]:first-child { margin-right: 0.25em; }
ul.inline { display: -ms-flexbox; display: -webkit-box; display: flex; -ms-flex-flow: row wrap; -webkit-flex-flow: row wrap; flex-flow: row wrap; list-style: none; margin: 0 0 0.375em -0.75em; }
ul.inline > li { margin-left: 0.75em; }
.unstyled dl dt { font-weight: normal; font-style: normal; }
ol.arabic { list-style-type: decimal; }
ol.decimal { list-style-type: decimal-leading-zero; }
ol.loweralpha { list-style-type: lower-alpha; }
ol.upperalpha { list-style-type: upper-alpha; }
ol.lowerroman { list-style-type: lower-roman; }
ol.upperroman { list-style-type: upper-roman; }
ol.lowergreek { list-style-type: lower-greek; }
.hdlist > table, .colist > table { border: 0; background: none; }
.hdlist > table > tbody > tr, .colist > table > tbody > tr { background: none; }
td.hdlist1, td.hdlist2 { vertical-align: top; padding: 0 0.625em; }
td.hdlist1 { font-weight: bold; padding-bottom: 0.75em; }
.literalblock + .colist, .listingblock + .colist { margin-top: -0.5em; }
.colist > table tr > td:first-of-type { padding: 0.4em 0.75em 0 0.75em; line-height: 1; vertical-align: top; }
.colist > table tr > td:first-of-type img { max-width: initial; }
.colist > table tr > td:last-of-type { padding: 0.25em 0; }
.thumb, .th { line-height: 0; display: inline-block; border: solid 4px white; -webkit-box-shadow: 0 0 0 1px #dddddd; box-shadow: 0 0 0 1px #dddddd; }
.imageblock.left, .imageblock[style*="float: left"] { margin: 0.25em 0.625em 1.25em 0; }
.imageblock.right, .imageblock[style*="float: right"] { margin: 0.25em 0 1.25em 0.625em; }
.imageblock > .title { margin-bottom: 0; }
.imageblock.thumb, .imageblock.th { border-width: 6px; }
.imageblock.thumb > .title, .imageblock.th > .title { padding: 0 0.125em; }
.image.left, .image.right { margin-top: 0.25em; margin-bottom: 0.25em; display: inline-block; line-height: 0; }
.image.left { margin-right: 0.625em; }
.image.right { margin-left: 0.625em; }
a.image { text-decoration: none; display: inline-block; }
a.image object { pointer-events: none; }
sup.footnote, sup.footnoteref { font-size: 0.875em; position: static; vertical-align: super; }
sup.footnote a, sup.footnoteref a { text-decoration: none; }
sup.footnote a:active, sup.footnoteref a:active { text-decoration: underline; }
#footnotes { padding-top: 0.75em; padding-bottom: 0.75em; margin-bottom: 0.625em; }
#footnotes hr { width: 20%; min-width: 6.25em; margin: -0.25em 0 0.75em 0; border-width: 1px 0 0 0; }
#footnotes .footnote { padding: 0 0.375em 0 0.225em; line-height: 1.3334; font-size: 0.875em; margin-left: 1.2em; margin-bottom: 0.2em; }
#footnotes .footnote a:first-of-type { font-weight: bold; text-decoration: none; margin-left: -1.05em; }
#footnotes .footnote:last-of-type { margin-bottom: 0; }
#content #footnotes { margin-top: -0.625em; margin-bottom: 0; padding: 0.75em 0; }
.gist .file-data > table { border: 0; background: #fff; width: 100%; margin-bottom: 0; }
.gist .file-data > table td.line-data { width: 99%; }
div.unbreakable { page-break-inside: avoid; }
.big { font-size: larger; }
.small { font-size: smaller; }
.underline { text-decoration: underline; }
.overline { text-decoration: overline; }
.line-through { text-decoration: line-through; }
.aqua { color: #00bfbf; }
.aqua-background { background-color: #00fafa; }
.black { color: black; }
.black-background { background-color: black; }
.blue { color: #0000bf; }
.blue-background { background-color: #0000fa; }
.fuchsia { color: #bf00bf; }
.fuchsia-background { background-color: #fa00fa; }
.gray { color: #606060; }
.gray-background { background-color: #7d7d7d; }
.green { color: #006000; }
.green-background { background-color: #007d00; }
.lime { color: #00bf00; }
.lime-background { background-color: #00fa00; }
.maroon { color: #600000; }
.maroon-background { background-color: #7d0000; }
.navy { color: #000060; }
.navy-background { background-color: #00007d; }
.olive { color: #606000; }
.olive-background { background-color: #7d7d00; }
.purple { color: #600060; }
.purple-background { background-color: #7d007d; }
.red { color: #bf0000; }
.red-background { background-color: #fa0000; }
.silver { color: #909090; }
.silver-background { background-color: #bcbcbc; }
.teal { color: #006060; }
.teal-background { background-color: #007d7d; }
.white { color: #bfbfbf; }
.white-background { background-color: #fafafa; }
.yellow { color: #bfbf00; }
.yellow-background { background-color: #fafa00; }
span.icon > .fa { cursor: default; }
a span.icon > .fa { cursor: inherit; }
.admonitionblock td.icon [class^="fa icon-"] { font-size: 2.5em; text-shadow: 1px 1px 2px rgba(0, 0, 0, 0.5); cursor: default; }
.admonitionblock td.icon .icon-note:before { content: "\f05a"; color: #29475c; }
.admonitionblock td.icon .icon-tip:before { content: "\f0eb"; text-shadow: 1px 1px 2px rgba(155, 155, 0, 0.8); color: #111; }
.admonitionblock td.icon .icon-warning:before { content: "\f071"; color: #bf6900; }
.admonitionblock td.icon .icon-caution:before { content: "\f06d"; color: #bf3400; }
.admonitionblock td.icon .icon-important:before { content: "\f06a"; color: #bf0000; }
.conum[data-value] { display: inline-block; color: #fff !important; background-color: black; -webkit-border-radius: 100px; border-radius: 100px; text-align: center; font-size: 0.75em; width: 1.67em; height: 1.67em; line-height: 1.67em; font-family: "Open Sans", "DejaVu Sans", sans-serif; font-style: normal; font-weight: bold; }
.conum[data-value] * { color: #fff !important; }
.conum[data-value] + b { display: none; }
.conum[data-value]:after { content: attr(data-value); }
pre .conum[data-value] { position: relative; top: -0.125em; }
b.conum * { color: inherit !important; }
.conum:not([data-value]):empty { display: none; }
h1, h2, h3, #toctitle, .sidebarblock > .content > .title, h4, h5, h6 { border-bottom: 1px solid #dddddd; }
.sect1 { padding-bottom: 0; }
#toctitle { color: #00406F; font-weight: normal; margin-top: 1.5em; }
.sidebarblock { border-color: #aaa; }
code { -webkit-border-radius: 4px; border-radius: 4px; }
p.tableblock.header { color: #6d6e71; }
.literalblock pre, .listingblock pre { background: #eeeeee; }
</style>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
<style>
/* Stylesheet for CodeRay to match GitHub theme | MIT License | http://foundation.zurb.com */
/*pre.CodeRay {background-color:#f7f7f8;}*/
.CodeRay .line-numbers{border-right:1px solid #d8d8d8;padding:0 0.5em 0 .25em}
.CodeRay span.line-numbers{display:inline-block;margin-right:.5em;color:rgba(0,0,0,.3)}
.CodeRay .line-numbers strong{color:rgba(0,0,0,.4)}
table.CodeRay{border-collapse:separate;border-spacing:0;margin-bottom:0;border:0;background:none}
table.CodeRay td{vertical-align: top;line-height:1.45}
table.CodeRay td.line-numbers{text-align:right}
table.CodeRay td.line-numbers>pre{padding:0;color:rgba(0,0,0,.3)}
table.CodeRay td.code{padding:0 0 0 .5em}
table.CodeRay td.code>pre{padding:0}
.CodeRay .debug{color:#fff !important;background:#000080 !important}
.CodeRay .annotation{color:#007}
.CodeRay .attribute-name{color:#000080}
.CodeRay .attribute-value{color:#700}
.CodeRay .binary{color:#509}
.CodeRay .comment{color:#998;font-style:italic}
.CodeRay .char{color:#04d}
.CodeRay .char .content{color:#04d}
.CodeRay .char .delimiter{color:#039}
.CodeRay .class{color:#458;font-weight:bold}
.CodeRay .complex{color:#a08}
.CodeRay .constant,.CodeRay .predefined-constant{color:#008080}
.CodeRay .color{color:#099}
.CodeRay .class-variable{color:#369}
.CodeRay .decorator{color:#b0b}
.CodeRay .definition{color:#099}
.CodeRay .delimiter{color:#000}
.CodeRay .doc{color:#970}
.CodeRay .doctype{color:#34b}
.CodeRay .doc-string{color:#d42}
.CodeRay .escape{color:#666}
.CodeRay .entity{color:#800}
.CodeRay .error{color:#808}
.CodeRay .exception{color:inherit}
.CodeRay .filename{color:#099}
.CodeRay .function{color:#900;font-weight:bold}
.CodeRay .global-variable{color:#008080}
.CodeRay .hex{color:#058}
.CodeRay .integer,.CodeRay .float{color:#099}
.CodeRay .include{color:#555}
.CodeRay .inline{color:#000}
.CodeRay .inline .inline{background:#ccc}
.CodeRay .inline .inline .inline{background:#bbb}
.CodeRay .inline .inline-delimiter{color:#d14}
.CodeRay .inline-delimiter{color:#d14}
.CodeRay .important{color:#555;font-weight:bold}
.CodeRay .interpreted{color:#b2b}
.CodeRay .instance-variable{color:#008080}
.CodeRay .label{color:#970}
.CodeRay .local-variable{color:#963}
.CodeRay .octal{color:#40e}
.CodeRay .predefined{color:#369}
.CodeRay .preprocessor{color:#579}
.CodeRay .pseudo-class{color:#555}
.CodeRay .directive{font-weight:bold}
.CodeRay .type{font-weight:bold}
.CodeRay .predefined-type{color:inherit}
.CodeRay .reserved,.CodeRay .keyword {color:#000;font-weight:bold}
.CodeRay .key{color:#808}
.CodeRay .key .delimiter{color:#606}
.CodeRay .key .char{color:#80f}
.CodeRay .value{color:#088}
.CodeRay .regexp .delimiter{color:#808}
.CodeRay .regexp .content{color:#808}
.CodeRay .regexp .modifier{color:#808}
.CodeRay .regexp .char{color:#d14}
.CodeRay .regexp .function{color:#404;font-weight:bold}
.CodeRay .string{color:#d20}
.CodeRay .string .string .string{background:#ffd0d0}
.CodeRay .string .content{color:#d14}
.CodeRay .string .char{color:#d14}
.CodeRay .string .delimiter{color:#d14}
.CodeRay .shell{color:#d14}
.CodeRay .shell .delimiter{color:#d14}
.CodeRay .symbol{color:#990073}
.CodeRay .symbol .content{color:#a60}
.CodeRay .symbol .delimiter{color:#630}
.CodeRay .tag{color:#008080}
.CodeRay .tag-special{color:#d70}
.CodeRay .variable{color:#036}
.CodeRay .insert{background:#afa}
.CodeRay .delete{background:#faa}
.CodeRay .change{color:#aaf;background:#007}
.CodeRay .head{color:#f8f;background:#505}
.CodeRay .insert .insert{color:#080}
.CodeRay .delete .delete{color:#800}
.CodeRay .change .change{color:#66f}
.CodeRay .head .head{color:#f4f}
</style>
<link rel="stylesheet" href="../katex/katex.min.css">
<script src="../katex/katex.min.js"></script>
<script src="../katex/contrib/auto-render.min.js"></script>
<!-- Use KaTeX to render math once document is loaded, see
https://github.com/Khan/KaTeX/tree/master/contrib/auto-render -->
<script>
document.addEventListener("DOMContentLoaded", function () {
renderMathInElement(
document.body,
{
delimiters: [
{ left: "$$", right: "$$", display: true},
{ left: "\\[", right: "\\]", display: true},
{ left: "$", right: "$", display: false},
{ left: "\\(", right: "\\)", display: false}
]
}
);
});
</script></head>
<body class="book toc2 toc-left" style="max-width: 100;">
<div id="header">
<h1>The OpenCL<sup>&#8482;</sup> C Specification</h1>
<div class="details">
<span id="author" class="author">Khronos<sup>&#174;</sup> OpenCL Working Group</span><br>
<span id="revnumber">version v3.0.5,</span>
<span id="revdate">Wed, 30 Sep 2020 00:00:00 +0000</span>
<br><span id="revremark">from git branch: master commit: 4d8a36725aa8af9658ab5cb62fdbf52adb44bcca</span>
</div>
<div id="toc" class="toc2">
<div id="toctitle">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#the-opencl-c-programming-language">6. The OpenCL C Programming Language</a>
<ul class="sectlevel2">
<li><a href="#unified-spec">6.1. Unified Specification</a></li>
<li><a href="#optional-functionality">6.2. Optional functionality</a>
<ul class="sectlevel3">
<li><a href="#features">6.2.1. Features</a></li>
<li><a href="#extensions">6.2.2. Extensions</a></li>
</ul>
</li>
<li><a href="#supported-data-types">6.3. Supported Data Types</a>
<ul class="sectlevel3">
<li><a href="#built-in-scalar-data-types">6.3.1. Built-in Scalar Data Types</a></li>
<li><a href="#built-in-vector-data-types">6.3.2. Built-in Vector Data Types</a></li>
<li><a href="#other-built-in-data-types">6.3.3. Other Built-in Data Types</a></li>
<li><a href="#reserved-data-types">6.3.4. Reserved Data Types</a></li>
<li><a href="#alignment-of-types">6.3.5. Alignment of Types</a></li>
<li><a href="#vector-literals">6.3.6. Vector Literals</a></li>
<li><a href="#vector-components">6.3.7. Vector Components</a></li>
<li><a href="#aliasing-rules">6.3.8. Aliasing Rules</a></li>
<li><a href="#keywords">6.3.9. Keywords</a></li>
</ul>
</li>
<li><a href="#conversions-and-type-casting">6.4. Conversions and Type Casting</a>
<ul class="sectlevel3">
<li><a href="#implicit-conversions">6.4.1. Implicit Conversions</a></li>
<li><a href="#explicit-casts">6.4.2. Explicit Casts</a></li>
<li><a href="#explicit-conversions">6.4.3. Explicit Conversions</a></li>
<li><a href="#reinterpreting-data-as-another-type">6.4.4. Reinterpreting Data As Another Type</a></li>
<li><a href="#pointer-casting">6.4.5. Pointer Casting</a></li>
<li><a href="#usual-arithmetic-conversions">6.4.6. Usual Arithmetic Conversions</a></li>
</ul>
</li>
<li><a href="#operators">6.5. Operators</a>
<ul class="sectlevel3">
<li><a href="#operators-arithmetic">6.5.1. Arithmetic Operators</a></li>
<li><a href="#operators-unary">6.5.2. Unary Operators</a></li>
<li><a href="#operators-prepost">6.5.3. Pre- and Post-Operators</a></li>
<li><a href="#operators-relational">6.5.4. Relational Operators</a></li>
<li><a href="#operators-equality">6.5.5. Equality Operators</a></li>
<li><a href="#operators-bitwise">6.5.6. Bitwise Operators</a></li>
<li><a href="#operators-logical">6.5.7. Logical Operators</a></li>
<li><a href="#operators-logical-unary">6.5.8. Unary Logical Operator</a></li>
<li><a href="#operators-ternary-selection">6.5.9. Ternary Selection Operator</a></li>
<li><a href="#operators-shift">6.5.10. Shift Operators</a></li>
<li><a href="#operators-sizeof">6.5.11. Sizeof Operator</a></li>
<li><a href="#operators-comma">6.5.12. Comma Operator</a></li>
<li><a href="#operators-indirection">6.5.13. Indirection Operator</a></li>
<li><a href="#operators-address">6.5.14. Address Operator</a></li>
<li><a href="#operators-assignment">6.5.15. Assignment Operator</a></li>
</ul>
</li>
<li><a href="#vector-operations">6.6. Vector Operations</a></li>
<li><a href="#address-space-qualifiers">6.7. Address Space Qualifiers</a>
<ul class="sectlevel3">
<li><a href="#global-or-global">6.7.1. <code>__global</code> (or <code>global</code>)</a></li>
<li><a href="#local-or-local">6.7.2. <code>__local</code> (or <code>local</code>)</a></li>
<li><a href="#constant-or-constant">6.7.3. <code>__constant</code> (or <code>constant</code>)</a></li>
<li><a href="#private-or-private">6.7.4. <code>__private</code> (or <code>private</code>)</a></li>
<li><a href="#the-generic-address-space">6.7.5. The Generic Address Space</a></li>
<li><a href="#changes-to-C99">6.7.6. Changes to C99</a></li>
</ul>
</li>
<li><a href="#access-qualifiers">6.8. Access Qualifiers</a></li>
<li><a href="#function-qualifiers">6.9. Function Qualifiers</a>
<ul class="sectlevel3">
<li><a href="#kernel-or-kernel">6.9.1. <code>__kernel</code> (or <code>kernel</code>)</a></li>
<li><a href="#optional-attribute-qualifiers">6.9.2. Optional Attribute Qualifiers</a></li>
</ul>
</li>
<li><a href="#storage-class-specifiers">6.10. Storage-Class Specifiers</a></li>
<li><a href="#restrictions">6.11. Restrictions</a></li>
<li><a href="#preprocessor-directives-and-macros">6.12. Preprocessor Directives and Macros</a></li>
<li><a href="#attribute-qualifiers">6.13. Attribute Qualifiers</a>
<ul class="sectlevel3">
<li><a href="#specifying-attributes-of-types">6.13.1. Specifying Attributes of Types</a></li>
<li><a href="#specifying-attributes-of-functions">6.13.2. Specifying Attributes of Functions</a></li>
<li><a href="#specifying-attributes-of-variables">6.13.3. Specifying Attributes of Variables</a></li>
<li><a href="#specifying-attributes-of-blocks-and-control-flow-statements">6.13.4. Specifying Attributes of Blocks and Control-Flow-Statements</a></li>
<li><a href="#specifying-attribute-for-unrolling-loops">6.13.5. Specifying Attribute For Unrolling Loops</a></li>
<li><a href="#extending-attribute-qualifiers">6.13.6. Extending Attribute Qualifiers</a></li>
</ul>
</li>
<li><a href="#blocks">6.14. Blocks</a>
<ul class="sectlevel3">
<li><a href="#declaring-and-using-a-block">6.14.1. Declaring and Using a Block</a></li>
<li><a href="#declaring-a-block-reference">6.14.2. Declaring a Block Reference</a></li>
<li><a href="#block-literal-expressions">6.14.3. Block Literal Expressions</a></li>
<li><a href="#control-flow">6.14.4. Control Flow</a></li>
<li><a href="#restrictions-1">6.14.5. Restrictions</a></li>
</ul>
</li>
<li><a href="#built-in-functions">6.15. Built-in Functions</a>
<ul class="sectlevel3">
<li><a href="#work-item-functions">6.15.1. Work-Item Functions</a></li>
<li><a href="#math-functions">6.15.2. Math Functions</a></li>
<li><a href="#integer-functions">6.15.3. Integer Functions</a></li>
<li><a href="#common-functions">6.15.4. Common Functions</a></li>
<li><a href="#geometric-functions">6.15.5. Geometric Functions</a></li>
<li><a href="#relational-functions">6.15.6. Relational Functions</a></li>
<li><a href="#vector-data-load-and-store-functions">6.15.7. Vector Data Load and Store Functions</a></li>
<li><a href="#synchronization-functions">6.15.8. Synchronization Functions</a></li>
<li><a href="#legacy-mem-fence-functions">6.15.9. Legacy Explicit Memory Fence Functions</a></li>
<li><a href="#address-space-qualifier-functions">6.15.10. Address Space Qualifier Functions</a></li>
<li><a href="#async-copies">6.15.11. Async Copies from Global to Local Memory, Local to Global Memory, and Prefetch</a></li>
<li><a href="#atomic-functions">6.15.12. Atomic Functions</a></li>
<li><a href="#miscellaneous-vector-functions">6.15.13. Miscellaneous Vector Functions</a></li>
<li><a href="#printf">6.15.14. printf</a></li>
<li><a href="#image-read-and-write-functions">6.15.15. Image Read and Write Functions</a></li>
<li><a href="#work-group-functions">6.15.16. Work-group Collective Functions</a></li>
<li><a href="#pipe-functions">6.15.17. Pipe Functions</a></li>
<li><a href="#enqueuing-kernels">6.15.18. Enqueuing Kernels</a></li>
<li><a href="#subgroup-functions">6.15.19. Subgroup Functions</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#opencl-numerical-compliance">7. OpenCL Numerical Compliance</a>
<ul class="sectlevel2">
<li><a href="#rounding-modes-1">7.1. Rounding Modes</a></li>
<li><a href="#inf-nan-and-denormalized-numbers">7.2. INF, NaN and Denormalized Numbers</a></li>
<li><a href="#floating-point-exceptions">7.3. Floating-Point Exceptions</a></li>
<li><a href="#relative-error-as-ulps">7.4. Relative Error as ULPs</a></li>
<li><a href="#edge-case-behavior">7.5. Edge Case Behavior</a>
<ul class="sectlevel3">
<li><a href="#additional-requirements-beyond-c99-tc2">7.5.1. Additional Requirements Beyond C99 TC2</a></li>
<li><a href="#changes-to-c99-tc2-behavior">7.5.2. Changes to C99 TC2 Behavior</a></li>
<li><a href="#edge-case-behavior-in-flush-to-zero-mode">7.5.3. Edge Case Behavior in Flush To Zero Mode</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#image-addressing-and-filtering">8. Image Addressing and Filtering</a>
<ul class="sectlevel2">
<li><a href="#image-coordinates">8.1. Image Coordinates</a></li>
<li><a href="#addressing-and-filter-modes">8.2. Addressing and Filter Modes</a></li>
<li><a href="#conversion-rules">8.3. Conversion Rules</a>
<ul class="sectlevel3">
<li><a href="#conversion-rules-for-normalized-integer-channel-data-types">8.3.1. Conversion rules for normalized integer channel data types</a></li>
<li><a href="#conversion-rules-for-half-precision-floating-point-channel-data-type">8.3.2. Conversion rules for half precision floating-point channel data type</a></li>
<li><a href="#conversion-rules-for-floating-point-channel-data-type">8.3.3. Conversion rules for floating-point channel data type</a></li>
<li><a href="#conversion-rules-for-signed-and-unsigned-8-bit-16-bit-and-32-bit-integer-channel-data-types">8.3.4. Conversion rules for signed and unsigned 8-bit, 16-bit and 32-bit integer channel data types</a></li>
<li><a href="#conversion-rules-for-srgba-and-sbgra-images">8.3.5. Conversion rules for sRGBA and sBGRA images</a></li>
</ul>
</li>
<li><a href="#selecting-an-image-from-an-image-array">8.4. Selecting an Image from an Image Array</a></li>
</ul>
</li>
<li><a href="#references">9. Normative References</a></li>
</ul>
</div>
</div>
<div id="content">
<div id="preamble">
<div class="sectionbody">
<div style="page-break-after: always;"></div>
<div class="paragraph">
<p>Copyright 2008-2020 The Khronos Group.</p>
</div>
<div class="paragraph">
<p>This specification is protected by copyright laws and contains material proprietary
to the Khronos Group, Inc. Except as described by these terms, it or any components
may not be reproduced, republished, distributed, transmitted, displayed, broadcast
or otherwise exploited in any manner without the express prior written permission
of Khronos Group.</p>
</div>
<div class="paragraph">
<p>Khronos Group grants a conditional copyright license to use and reproduce the
unmodified specification for any purpose, without fee or royalty, EXCEPT no licenses
to any patent, trademark or other intellectual property rights are granted under
these terms. Parties desiring to implement the specification and make use of
Khronos trademarks in relation to that implementation, and receive reciprocal patent
license protection under the Khronos IP Policy must become Adopters and confirm the
implementation as conformant under the process defined by Khronos for this
specification; see <a href="https://www.khronos.org/adopters" class="bare">https://www.khronos.org/adopters</a>.</p>
</div>
<div class="paragraph">
<p>Khronos Group makes no, and expressly disclaims any, representations or warranties,
express or implied, regarding this specification, including, without limitation:
merchantability, fitness for a particular purpose, non-infringement of any
intellectual property, correctness, accuracy, completeness, timeliness, and
reliability. Under no circumstances will the Khronos Group, or any of its Promoters,
Contributors or Members, or their respective partners, officers, directors,
employees, agents or representatives be liable for any damages, whether direct,
indirect, special or consequential damages for lost revenues, lost profits, or
otherwise, arising from or in connection with these materials.</p>
</div>
<div class="paragraph">
<p>Vulkan and Khronos are registered trademarks, and OpenXR, SPIR, SPIR-V, SYCL, WebGL,
WebCL, OpenVX, OpenVG, EGL, COLLADA, glTF, NNEF, OpenKODE, OpenKCAM, StreamInput,
OpenWF, OpenSL ES, OpenMAX, OpenMAX AL, OpenMAX IL, OpenMAX DL, OpenML and DevU are
trademarks of the Khronos Group Inc. ASTC is a trademark of ARM Holdings PLC,
OpenCL is a trademark of Apple Inc. and OpenGL and OpenML are registered trademarks
and the OpenGL ES and OpenGL SC logos are trademarks of Silicon Graphics
International used under license by Khronos. All other product names, trademarks,
and/or company names are used solely for identification and belong to their
respective owners.</p>
</div>
<div style="page-break-after: always;"></div>
</div>
</div>
<div class="sect1">
<h2 id="the-opencl-c-programming-language"><a class="anchor" href="#the-opencl-c-programming-language"></a>6. The OpenCL C Programming Language</h2>
<div class="sectionbody">
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>This document starts at chapter 6 to keep the section numbers historically
consistent with previous versions of the OpenCL and OpenCL C Programming
Language specifications.</p>
</div>
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>This section describes the OpenCL C programming language.
The OpenCL C programming language may be used to write kernels that execute
on an OpenCL device.</p>
</div>
<div class="paragraph">
<p>The OpenCL C programming language (also referred to as OpenCL C) is based
on the <a href="#C99-spec">ISO/IEC 9899:1999 Programming languages - C</a> specification
(also referred to as the C99 specification, or just C99), with extensions
and restrictions to support parallel kernels.
In addition, some features of OpenCL C are based on the <a href="#C11-spec">ISO/IEC
9899:2011 Information technology - Programming languages - C</a> specification
(also referred to as the C11 specification, or just C11).</p>
</div>
<div class="paragraph">
<p>This document describes the modifications and restrictions to C99 and C11
in OpenCL C.
Please refer to the C99 specification for a detailed description of the
language grammar.</p>
</div>
<div class="sect2">
<h3 id="unified-spec"><a class="anchor" href="#unified-spec"></a>6.1. Unified Specification</h3>
<div class="paragraph">
<p>This document specifies all versions of OpenCL C.</p>
</div>
<div class="paragraph">
<p>An OpenCL C feature may be described in one of the following ways, according
to which versions of OpenCL C specify that feature.</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Requires support for OpenCL C <em>major.minor</em> or newer: Features that were
introduced in version <em>major.minor</em>.
Compilers for an earlier version of OpenCL C will not provide these
features.</p>
<div class="ulist">
<ul>
<li>
<p>In some instances the variation "For OpenCL C <em>major.minor</em> or newer"
is used; it has the identical meaning.</p>
</li>
</ul>
</div>
</li>
<li>
<p>Requires support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the
<code>__opencl_c_feature_name</code> feature:
Features that were introduced in OpenCL C 2.0 as mandatory, but made
<a href="#optional-functionality">optional</a> in OpenCL C 3.0.
Compilers for OpenCL C 1.2 or earlier will not provide these features,
compilers for OpenCL C 2.0 will provide these features, and
compilers for OpenCL C 3.0 or newer may provide these features.</p>
</li>
<li>
<p>Requires support for OpenCL C 3.0 or newer and the
<code>__opencl_c_feature_name</code> feature: <a href="#optional-functionality">Optional</a> features that were introduced in OpenCL C 3.0.
Compilers for an earlier version of OpenCL C will not provide these
features; compilers for OpenCL C 3.0 or newer may provide these features.</p>
</li>
<li>
<p>Deprecated by OpenCL C <em>major.minor</em>: Features that were deprecated
in version <em>major.minor</em>. See the definition of deprecation in the
glossary of the main OpenCL specification.</p>
</li>
<li>
<p>Universal: Features that carry no mention of the version that introduced or
deprecated them are specified for all versions of OpenCL C.</p>
</li>
</ul>
</div>
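<div class="paragraph">
<p>As an informal illustration (not part of the normative text), the second
category in the list above might be tested in kernel source as sketched below.
The sketch assumes the predefined <code>__OPENCL_C_VERSION__</code> macro described in
<a href="#preprocessor-directives-and-macros">Preprocessor Directives and Macros</a>,
taken here to expand to <code>200</code> for OpenCL C 2.0 and <code>300</code> for OpenCL C 3.0.
The names <code>HAS_GENERIC_ADDRESS_SPACE</code>, <code>load_value</code> and <code>copy_values</code> are
hypothetical and chosen only for this example.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical helper macro: 1 when the generic address space is usable.
// OpenCL C 2.0 always provides it; OpenCL C 3.0 or newer provides it only
// when the __opencl_c_generic_address_space feature macro is defined.
#if (__OPENCL_C_VERSION__ &gt;= 200 &amp;&amp; __OPENCL_C_VERSION__ &lt; 300) || \
    (__OPENCL_C_VERSION__ &gt;= 300 &amp;&amp; defined(__opencl_c_generic_address_space))
#define HAS_GENERIC_ADDRESS_SPACE 1
#else
#define HAS_GENERIC_ADDRESS_SPACE 0
#endif

#if HAS_GENERIC_ADDRESS_SPACE
// Unqualified pointer parameter: refers to the generic address space.
float load_value(const float *p) { return *p; }
#else
// Without the generic address space, name the address space explicitly.
float load_value(const __global float *p) { return *p; }
#endif

__kernel void copy_values(__global const float *in, __global float *out)
{
    size_t i = get_global_id(0);
    out[i] = load_value(&amp;in[i]);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Folding the version test and the feature test into one helper macro keeps the
rest of the source free of repeated version arithmetic; other arrangements
are equally possible.</p>
</div>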
</div>
<div class="sect2">
<h3 id="optional-functionality"><a class="anchor" href="#optional-functionality"></a>6.2. Optional functionality</h3>
<div class="paragraph">
<p>Some language functionality is optional and will not be supported by all
devices. Such functionality is represented by optional language features or
language extensions. Support for optional functionality in OpenCL C is indicated
by the presence of special predefined macros.</p>
</div>
<div class="sect3">
<h4 id="features"><a class="anchor" href="#features"></a>6.2.1. Features</h4>
<div class="admonitionblock important">
<table>
<tr>
<td class="icon">
<i class="fa icon-important" title="Important"></i>
</td>
<td class="content">
Feature test macros <a href="#unified-spec">require</a> support for OpenCL C
3.0 or newer.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>Optional language features are described in this document. They are optional
from OpenCL C 3.0 onwards and therefore are not supported by all
implementations. When an OpenCL C 3.0 optional feature is supported, an
associated <em>feature test macro</em> will be predefined.</p>
</div>
<div class="paragraph">
<p>The following table describes OpenCL C 3.0 or newer features and their
meaning. The naming convention for the feature macros is
<code>__opencl_c_&lt;name&gt;</code>.</p>
</div>
<div class="paragraph">
<p>Feature macro identifiers are used as names of features in this document.</p>
</div>
<table id="table-optional-lang-features" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 1. Optional features in OpenCL C 3.0 or newer and their predefined macros.</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top"><strong>Feature Macro/Name</strong></th>
<th class="tableblock halign-left valign-top"><strong>Brief Description</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_3d_image_writes</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports built-in functions for writing to 3D image
objects.</p>
<p class="tableblock">OpenCL C compilers that define the feature macro <code>__opencl_c_3d_image_writes</code>
must also define the feature macro <code>__opencl_c_images</code>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_atomic_order_acq_rel</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports enumerations and built-in functions for atomic
operations with acquire and release memory consistency orders.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_atomic_order_seq_cst</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports enumerations and built-in functions for atomic
operations and fences with sequentially consistent memory consistency order.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_atomic_scope_device</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports enumerations and built-in functions for atomic
operations and fences with device memory scope.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_atomic_scope_all_devices</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports enumerations and built-in functions for atomic
operations and fences with memory scope across all devices that can
share SVM memory with each other and the host process.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_device_enqueue</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports built-in functions to enqueue additional work
from the device.</p>
<p class="tableblock">OpenCL C compilers that define the feature macro <code>__opencl_c_device_enqueue</code>
must also define the feature macro <code>__opencl_c_generic_address_space</code>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_generic_address_space</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports the unnamed generic address space.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_fp64</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports types and built-in functions with 64-bit
floating point types.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_images</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports types and built-in functions for images.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_int64</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports types and built-in functions with 64-bit
integers.</p>
<p class="tableblock">OpenCL C compilers for FULL profile devices or devices with 64-bit pointers
must always define the <code>__opencl_c_int64</code> feature macro.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_pipes</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports the pipe modifier and built-in functions
to read from and write to a pipe.</p>
<p class="tableblock">OpenCL C compilers that define the feature macro <code>__opencl_c_pipes</code> must
also define the feature macro <code>__opencl_c_generic_address_space</code>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_program_scope_global_variables</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports program scope variables in the global address
space.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_read_write_images</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports reading from and writing to the same image
object in a kernel.</p>
<p class="tableblock">OpenCL C compilers that define the feature macro
<code>__opencl_c_read_write_images</code> must also define the feature macro
<code>__opencl_c_images</code>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_subgroups</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports built-in functions operating on sub-groupings
of work-items.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>__opencl_c_work_group_collective_functions</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The OpenCL C compiler supports built-in functions that perform collective
operations across a work-group.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>In OpenCL C 3.0 or newer, feature macros must expand to the value <code>1</code> if the
feature macro is defined by the OpenCL C compiler. A feature macro must not be
defined if the feature is not supported by the OpenCL C compiler. A feature
macro may expand to a different value in the future, but if this occurs the
value of the feature macro must compare greater than the prior value of the
feature macro.</p>
</div>
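<div class="paragraph">
<p>The rules above can be sketched as a preprocessing-time test. This is a minimal sketch that relies only on macro names from the feature table; the <code>SKETCH_*</code> flags and the function <code>sketch_feature_mask</code> are hypothetical names introduced for illustration. Comparing with <code>&gt;= 1</code> instead of <code>== 1</code> keeps the test valid if a later version of the specification raises the value of a feature macro.</p>
</div>

```c
/* Hypothetical bit flags used only by this sketch. */
enum {
    SKETCH_FEATURE_FP64   = 1 << 0,
    SKETCH_FEATURE_INT64  = 1 << 1,
    SKETCH_FEATURE_IMAGES = 1 << 2,
    SKETCH_FEATURE_PIPES  = 1 << 3
};

/* The feature table requires the generic address space whenever pipes are
 * reported, so a compiler that violates this rule is rejected early. */
#if defined(__opencl_c_pipes) && !defined(__opencl_c_generic_address_space)
#error "__opencl_c_pipes is defined without __opencl_c_generic_address_space"
#endif

/* Returns a mask of the features reported by the compiler that built this
 * translation unit. An undefined macro contributes nothing to the mask. */
int sketch_feature_mask(void)
{
    int mask = 0;
#if defined(__opencl_c_fp64) && (__opencl_c_fp64 >= 1)
    mask |= SKETCH_FEATURE_FP64;
#endif
#if defined(__opencl_c_int64) && (__opencl_c_int64 >= 1)
    mask |= SKETCH_FEATURE_INT64;
#endif
#if defined(__opencl_c_images) && (__opencl_c_images >= 1)
    mask |= SKETCH_FEATURE_IMAGES;
#endif
#if defined(__opencl_c_pipes) && (__opencl_c_pipes >= 1)
    mask |= SKETCH_FEATURE_PIPES;
#endif
    return mask;
}
```

<div class="paragraph">
<p>Because an unsupported feature leaves its macro undefined, the <code>defined</code> test comes first and the value comparison is only a forward-compatibility guard.</p>
</div>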
</div>
<div class="sect3">
<h4 id="extensions"><a class="anchor" href="#extensions"></a>6.2.2. Extensions</h4>
<div class="paragraph">
<p>Optional functionality that is not defined in this document is referred to
as extensions. Extensions are described in
<a href="#opencl-extension-spec">the OpenCL Extension Specification</a>.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>Prior to OpenCL C 3.0, some optional features described in this document were
referred to as optional core features, and their presence could be indicated by
predefined extension macros. A feature that was an optional extension in an
earlier OpenCL version can still be used as an extension, i.e. the same
predefined extension macros remain valid in OpenCL C 3.0 or newer. However,
feature macros are preferred whenever possible.</p>
</div>
</td>
</tr>
</table>
</div>
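<div class="paragraph">
<p>One way to follow this preference is to test the feature macro first and fall back to the extension macro only when the feature macro is absent. The sketch below does this for 64-bit floating point, assuming the extension macro <code>cl_khr_fp64</code> is the one predefined by older compilers; <code>SKETCH_HAS_FP64</code>, <code>sketch_real</code> and <code>sketch_scale_add</code> are hypothetical names used only here.</p>
</div>

```c
#if defined(__opencl_c_fp64)
/* OpenCL C 3.0 or newer: the feature macro is the preferred test. */
#define SKETCH_HAS_FP64 1
#elif defined(cl_khr_fp64)
/* Older compilers: fall back to the predefined extension macro. The pragma
 * is assumed to be required only by the earliest OpenCL C versions. */
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#define SKETCH_HAS_FP64 1
#else
#define SKETCH_HAS_FP64 0
#endif

/* Pick the widest floating-point type the compiler reports. */
#if SKETCH_HAS_FP64
typedef double sketch_real;
#else
typedef float sketch_real;
#endif

/* Computes a * x + y in the selected type. */
sketch_real sketch_scale_add(sketch_real a, sketch_real x, sketch_real y)
{
    return a * x + y;
}
```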
</div>
</div>
<div class="sect2">
<h3 id="supported-data-types"><a class="anchor" href="#supported-data-types"></a>6.3. Supported Data Types</h3>
<div class="paragraph">
<p>The following data types are supported.</p>
</div>
<div class="sect3">
<h4 id="built-in-scalar-data-types"><a class="anchor" href="#built-in-scalar-data-types"></a>6.3.1. Built-in Scalar Data Types</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The following table describes the list of built-in scalar data types.</p>
</div>
<table id="table-builtin-scalar-types" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 2. Built-in Scalar Data Types</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Type</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>bool</code> <sup class="footnote">[<a id="_footnoteref_1" class="footnote" href="#_footnotedef_1" title="View footnote.">1</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A conditional data type which is either <em>true</em> or <em>false</em>.
The value <em>true</em> expands to the integer constant 1 and the value
<em>false</em> expands to the integer constant 0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A signed two&#8217;s complement 8-bit integer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>unsigned char</code>, <code>uchar</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An unsigned 8-bit integer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>short</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A signed two&#8217;s complement 16-bit integer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>unsigned short</code>, <code>ushort</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An unsigned 16-bit integer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>int</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A signed two&#8217;s complement 32-bit integer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>unsigned int</code>, <code>uint</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An unsigned 32-bit integer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>long</code> <sup class="footnote" id="_footnote_long">[<a id="_footnoteref_2" class="footnote" href="#_footnotedef_2" title="View footnote.">2</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A signed two&#8217;s complement 64-bit integer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>unsigned long</code>, <code>ulong</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_2" title="View footnote.">2</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An unsigned 64-bit integer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>float</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 32-bit floating-point.
The <code>float</code> data type must conform to the IEEE 754 single precision
storage format.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>double</code> <sup class="footnote">[<a id="_footnoteref_3" class="footnote" href="#_footnotedef_3" title="View footnote.">3</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 64-bit floating-point.
The <code>double</code> data type must conform to the IEEE 754 double precision
storage format.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.
Also see extension <strong>cl_khr_fp64</strong>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>half</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 16-bit floating-point.
The <code>half</code> data type must conform to the IEEE 754-2008 half precision
storage format.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code> <sup class="footnote" id="_footnote_size_t">[<a id="_footnoteref_4" class="footnote" href="#_footnotedef_4" title="View footnote.">4</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The unsigned integer type of the result of the <code>sizeof</code> operator.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>ptrdiff_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_4" title="View footnote.">4</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A signed integer type that is the result of subtracting two
pointers.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>intptr_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_4" title="View footnote.">4</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A signed integer type with the property that any valid pointer to
<code>void</code> can be converted to this type, then converted back to pointer
to <code>void</code>, and the result will compare equal to the original pointer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>uintptr_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_4" title="View footnote.">4</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An unsigned integer type with the property that any valid pointer
to <code>void</code> can be converted to this type, then converted back to
pointer to <code>void</code>, and the result will compare equal to the original
pointer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>void</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The <code>void</code> type comprises an empty set of values; it is an incomplete
type that cannot be completed.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>Most built-in scalar data types are also declared as appropriate types in
the OpenCL API (and header files) that can be used by an application.
The following table lists the built-in scalar data types in the OpenCL C
programming language and the corresponding data types available to the
application:</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Type in OpenCL Language</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>API type for application</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>bool</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">n/a</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>char</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_char</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>unsigned char</code>, <code>uchar</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uchar</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>short</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_short</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>unsigned short</code>, <code>ushort</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_ushort</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>int</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_int</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>unsigned int</code>, <code>uint</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>long</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_long</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>unsigned long</code>, <code>ulong</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_ulong</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>float</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_float</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>double</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_double</code> <sup class="footnote">[<a id="_footnoteref_5" class="footnote" href="#_footnotedef_5" title="View footnote.">5</a>]</sup></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>half</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_half</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>size_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">n/a</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>ptrdiff_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">n/a</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>intptr_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">n/a</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>uintptr_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">n/a</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>void</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>void</code></p></td>
</tr>
</tbody>
</table>
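<div class="paragraph">
<p>The API types exist because host C does not fix the width of types such as <code>long</code>, while OpenCL C does. The host-side sketch below illustrates the idea with stand-in typedefs; the <code>sketch_cl_*</code> names are hypothetical and merely mirror the <code>cl_*</code> types in the table, whose real definitions come from the OpenCL headers, which this example does not include.</p>
</div>

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for the API types listed in the table above. */
typedef int8_t   sketch_cl_char;
typedef uint8_t  sketch_cl_uchar;
typedef int16_t  sketch_cl_short;
typedef uint16_t sketch_cl_ushort;
typedef int32_t  sketch_cl_int;
typedef uint32_t sketch_cl_uint;
typedef int64_t  sketch_cl_long;
typedef uint64_t sketch_cl_ulong;
typedef float    sketch_cl_float;

/* Returns 1 when every stand-in has the bit width that OpenCL C assigns to
 * the corresponding built-in scalar type. */
int sketch_widths_match_opencl_c(void)
{
    return sizeof(sketch_cl_char)  * CHAR_BIT == 8
        && sizeof(sketch_cl_short) * CHAR_BIT == 16
        && sizeof(sketch_cl_int)   * CHAR_BIT == 32
        && sizeof(sketch_cl_long)  * CHAR_BIT == 64
        && sizeof(sketch_cl_float) * CHAR_BIT == 32;
}

/* Returns 1 only on hosts whose native long happens to be 64 bits wide.
 * The result is platform dependent, which is why application code should
 * not pass a plain host long where the kernel expects an OpenCL C long. */
int sketch_host_long_is_64_bit(void)
{
    return sizeof(long) * CHAR_BIT == 64;
}
```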
</div>
</div>
<div class="sect4">
<h5 id="the-half-data-type"><a class="anchor" href="#the-half-data-type"></a>6.3.1.1. The <code>half</code> Data Type</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>half</code> data type must be IEEE 754-2008 compliant.
<code>half</code> numbers have 1 sign bit, 5 exponent bits, and 10 mantissa bits.
The interpretation of the sign, exponent and mantissa is analogous to IEEE
754 floating-point numbers.
The exponent bias is 15.
The <code>half</code> data type must represent finite and normal numbers, denormalized
numbers, infinities and NaN.
Denormalized numbers for the <code>half</code> data type, which may be generated when
converting a <code>float</code> to a <code>half</code> using <strong>vstore_half</strong> and converting a <code>half</code>
to a <code>float</code> using <strong>vload_half</strong>, cannot be flushed to zero.
Conversions from <code>float</code> to <code>half</code> correctly round the mantissa to 11 bits
of precision.
Conversions from <code>half</code> to <code>float</code> are lossless; all <code>half</code> numbers are
exactly representable as <code>float</code> values.</p>
</div>
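<div class="paragraph">
<p>The bit layout described above (1 sign bit, 5 exponent bits, 10 mantissa bits, exponent bias 15) can be made concrete with a host-side decoder. The following is a minimal sketch in plain C, not the implementation of any OpenCL built-in; the function name <code>sketch_half_to_float</code> is hypothetical, and the code assumes the host <code>float</code> uses the IEEE 754 single precision format.</p>
</div>

```c
#include <stdint.h>
#include <string.h>

/* Decodes a 16-bit half bit pattern into a float. The conversion is exact
 * because every half value is representable as a float. */
float sketch_half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h >> 15) << 31;   /* 1 sign bit      */
    uint32_t exponent = (h >> 10) & 0x1Fu;       /* 5 exponent bits */
    uint32_t mantissa = h & 0x3FFu;              /* 10 mantissa bits */
    uint32_t bits;
    float result;

    if (exponent == 0x1Fu) {
        /* Infinity (mantissa == 0) or NaN (mantissa != 0). */
        bits = sign | 0x7F800000u | (mantissa << 13);
    } else if (exponent != 0) {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exponent + 112u) << 23) | (mantissa << 13);
    } else if (mantissa != 0) {
        /* Denormalized half: renormalize, since it is a normal float. */
        uint32_t e = 113u; /* float exponent field that encodes 2^-14 */
        while ((mantissa & 0x400u) == 0) {
            mantissa <<= 1;
            e--;
        }
        bits = sign | (e << 23) | ((mantissa & 0x3FFu) << 13);
    } else {
        bits = sign; /* positive or negative zero */
    }
    memcpy(&result, &bits, sizeof result);
    return result;
}
```

<div class="paragraph">
<p>For instance, the pattern <code>0x3C00</code> has exponent field 15 and a zero mantissa, so it decodes to 1.0, and <code>0x0001</code> is the smallest denormalized value, 2<sup>-24</sup>.</p>
</div>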
<div class="paragraph">
<p>The <code>half</code> data type can only be used to declare a pointer to a buffer that
contains <code>half</code> values.
A few valid examples are given below:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="directive">void</span>
bar (__global half *p)
{
...
}
__kernel <span class="directive">void</span>
foo (__global half *pg, __local half *pl)
{
__global half *ptr;
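<span class="predefined-type">int</span> offset = <span class="integer">0</span>; <span class="comment">// initialized so that the pointer arithmetic below is well defined</span>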
ptr = pg + offset;
bar(ptr);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Below are some examples that are not valid usage of the <code>half</code> type:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">half a;
half b[<span class="integer">100</span>];
half *p;
a = *p; <span class="comment">// not allowed. must use *vload_half* function</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>Loads from a pointer to a <code>half</code> and stores to a pointer to a <code>half</code> can be
performed using the <a href="#vector-data-load-and-store-functions">vector data load
and store functions</a> <strong>vload_half</strong>, <strong>vload_half<em>n</em></strong>, <strong>vloada_half<em>n</em></strong> and
<strong>vstore_half</strong>, <strong>vstore_half<em>n</em></strong>, and <strong>vstorea_half<em>n</em></strong>.
The load functions read scalar or vector <code>half</code> values from memory and
convert them to a scalar or vector <code>float</code> value.
The store functions take a scalar or vector <code>float</code> value as input, convert
it to a <code>half</code> scalar or vector value (with appropriate rounding mode) and
write the <code>half</code> scalar or vector value to memory.</p>
</div>
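<div class="paragraph">
<p>To illustrate what the conversion performed by a store involves, the host-side sketch below rounds a <code>float</code> to the 11 bits of precision of a <code>half</code>. It is not the implementation of <strong>vstore_half</strong>; it assumes round-to-nearest-even as the rounding mode and an IEEE 754 single precision host <code>float</code>, and <code>sketch_float_to_half_rte</code> is a hypothetical name.</p>
</div>

```c
#include <stdint.h>
#include <string.h>

/* Converts a float to a half bit pattern, rounding to nearest even. */
uint16_t sketch_float_to_half_rte(float f)
{
    uint32_t bits, sign, exponent, mantissa;
    uint32_t half, shift, remainder, halfway;

    memcpy(&bits, &f, sizeof bits);
    sign = (bits >> 16) & 0x8000u;
    exponent = (bits >> 23) & 0xFFu;
    mantissa = bits & 0x7FFFFFu;

    if (exponent == 0xFFu) {
        /* Infinity stays infinity; any NaN becomes a quiet NaN. */
        return (uint16_t)(sign | (mantissa ? 0x7E00u : 0x7C00u));
    }
    if (exponent >= 143u) {
        /* A magnitude of at least 2^16 overflows the half range. */
        return (uint16_t)(sign | 0x7C00u);
    }
    if (exponent >= 113u) {
        /* Normal half: rebias the exponent and drop 13 mantissa bits. */
        half = ((exponent - 112u) << 10) | (mantissa >> 13);
        shift = 13u;
    } else {
        /* Denormalized half or zero: make the implicit leading 1 explicit
         * and shift it into units of 2^-24. */
        shift = 126u - exponent;
        if (shift > 24u) {
            return (uint16_t)sign; /* too small: rounds to signed zero */
        }
        mantissa |= 0x800000u;
        half = mantissa >> shift;
    }

    /* Round to nearest, ties to even. A carry out of the mantissa bumps
     * the exponent, which also turns 0x7BFF + 1 into infinity. */
    remainder = mantissa & ((1u << shift) - 1u);
    halfway = 1u << (shift - 1u);
    if (remainder > halfway || (remainder == halfway && (half & 1u))) {
        half++;
    }
    return (uint16_t)(sign | half);
}
```

<div class="paragraph">
<p>Under this rounding mode a value exactly halfway between two <code>half</code> values goes to the one with an even mantissa, so 1 + 2<sup>-11</sup> becomes <code>0x3C00</code> (1.0) while 65520 rounds up to infinity.</p>
</div>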
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="built-in-vector-data-types"><a class="anchor" href="#built-in-vector-data-types"></a>6.3.2. Built-in Vector Data Types</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>char</code>, <code>unsigned char</code>, <code>short</code>, <code>unsigned short</code>, <code>int</code>, <code>unsigned int</code>,
<code>long</code>, <code>unsigned long</code>, <code>float</code> and <code>double</code> vector data types are supported.
<sup class="footnote">[<a id="_footnoteref_6" class="footnote" href="#_footnotedef_6" title="View footnote.">6</a>]</sup>
The vector data type is defined with the type name, i.e. <code>char</code>, <code>uchar</code>,
<code>short</code>, <code>ushort</code>, <code>int</code>, <code>uint</code>, <code>long</code>, <code>ulong</code>, <code>float</code>, or <code>double</code>
followed by a literal value <em>n</em> that defines the number of elements in the
vector.
Supported values of <em>n</em> are 2, 3, 4, 8, and 16 for all vector data types.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
Vector types with three elements, i.e. where <em>n</em> is 3, <a href="#unified-spec">require</a> support for OpenCL C 1.1 or newer.
</td>
</tr>
</table>
</div>
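<div class="paragraph">
<p>The naming rule above (an element type name followed by <em>n</em>) is easy to check mechanically, for example in host code that generates kernel source. The following host-side sketch is one hedged way to do so; <code>sketch_is_vector_width</code> and <code>sketch_vector_type_name</code> are hypothetical helpers, not part of any OpenCL API.</p>
</div>

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if n is a supported element count: 2, 3, 4, 8 or 16. */
int sketch_is_vector_width(int n)
{
    return n == 2 || n == 3 || n == 4 || n == 8 || n == 16;
}

/* Writes a vector type name such as "float4" into out. Returns 0 on
 * success, or -1 if base is not one of the listed element type names,
 * n is not a supported width, or out is too small. */
int sketch_vector_type_name(const char *base, int n, char *out, size_t cap)
{
    static const char *const bases[] = {
        "char", "uchar", "short", "ushort", "int", "uint",
        "long", "ulong", "float", "double"
    };
    const size_t count = sizeof bases / sizeof bases[0];
    size_t i;
    int written;

    if (!sketch_is_vector_width(n)) {
        return -1;
    }
    for (i = 0; i < count; i++) {
        if (strcmp(base, bases[i]) == 0) {
            break;
        }
    }
    if (i == count) {
        return -1; /* unknown element type name */
    }
    written = snprintf(out, cap, "%s%d", base, n);
    return (written < 0 || (size_t)written >= cap) ? -1 : 0;
}
```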
<div class="paragraph">
<p>The following table describes the list of built-in vector data types.</p>
</div>
<table id="table-builtin-vector-types" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 3. Built-in Vector Data Types</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Type</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>char<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A vector of <em>n</em> 8-bit signed two&#8217;s complement integer values.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>uchar<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A vector of <em>n</em> 8-bit unsigned integer values.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>short<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A vector of <em>n</em> 16-bit signed two&#8217;s complement integer values.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>ushort<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A vector of <em>n</em> 16-bit unsigned integer values.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>int<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A vector of <em>n</em> 32-bit signed two&#8217;s complement integer values.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>uint<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A vector of <em>n</em> 32-bit unsigned integer values.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>long<em>n</em></code> <sup class="footnote" id="_footnote_long-vec">[<a id="_footnoteref_7" class="footnote" href="#_footnotedef_7" title="View footnote.">7</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A vector of <em>n</em> 64-bit signed two&#8217;s complement integer values.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>ulong<em>n</em></code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_7" title="View footnote.">7</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A vector of <em>n</em> 64-bit unsigned integer values.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>float<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A vector of <em>n</em> 32-bit floating-point values.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>double<em>n</em></code> <sup class="footnote">[<a id="_footnoteref_8" class="footnote" href="#_footnotedef_8" title="View footnote.">8</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A vector of <em>n</em> 64-bit floating-point values.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.
Also see extension <strong>cl_khr_fp64</strong>.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>The built-in vector data types are also declared as appropriate types in the
OpenCL API (and header files) that can be used by an application.
The following table lists the built-in vector data types in the OpenCL C
programming language and the corresponding data types available to the
application:</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Type in OpenCL Language</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>API type for application</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>char<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_char<em>n</em></code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>uchar<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uchar<em>n</em></code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>short<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_short<em>n</em></code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>ushort<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_ushort<em>n</em></code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>int<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_int<em>n</em></code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>uint<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_uint<em>n</em></code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>long<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_long<em>n</em></code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>ulong<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_ulong<em>n</em></code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>float<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_float<em>n</em></code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>double<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_double<em>n</em></code></p></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="sect3">
<h4 id="other-built-in-data-types"><a class="anchor" href="#other-built-in-data-types"></a>6.3.3. Other Built-in Data Types</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The following table describes the list of additional data types supported by
OpenCL.</p>
</div>
<table id="table-other-builtin-types" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 4. Other Built-in Data Types</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Type</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image2d_t</code> <sup class="footnote" id="_footnote_image-functions">[<a id="_footnoteref_9" class="footnote" href="#_footnotedef_9" title="View footnote.">9</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 2D image.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image3d_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_9" title="View footnote.">9</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 3D image.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image2d_array_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_9" title="View footnote.">9</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 2D image array.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image1d_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_9" title="View footnote.">9</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 1D image.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image1d_buffer_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_9" title="View footnote.">9</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 1D image created from a buffer object.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image1d_array_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_9" title="View footnote.">9</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 1D image array.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image2d_depth_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_9" title="View footnote.">9</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 2D depth image.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer, also see
<code>cl_khr_depth_images</code> extension.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image2d_array_depth_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_9" title="View footnote.">9</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 2D depth image array.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer, also see
<code>cl_khr_depth_images</code> extension.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>sampler_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_9" title="View footnote.">9</a>]</sup></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A sampler type.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>queue_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A device command queue.
This queue can only be used to enqueue commands from kernels executing
on the device.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0, or OpenCL C 3.0 or
newer and the <code>__opencl_c_device_enqueue</code> feature.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>ndrange_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The N-dimensional range over which a kernel executes.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0, or OpenCL C 3.0 or
newer and the <code>__opencl_c_device_enqueue</code> feature.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>clk_event_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A device-side event that identifies a command enqueued to
a device command queue.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0, or OpenCL C 3.0 or
newer and the <code>__opencl_c_device_enqueue</code> feature.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>reserve_id_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A reservation ID.
This opaque type is used to identify the reservation for
<a href="#pipe-functions">reading and writing a pipe</a>.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0, or OpenCL C 3.0 or
newer and the <code>__opencl_c_pipes</code> feature.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>event_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An event.
This can be used to identify <a href="#async-copies">async copies</a> from
<code>global</code> to <code>local</code> memory and vice-versa.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_mem_fence_flags</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">This is a bitfield and can be 0 or a combination of the following
values ORed together:</p>
<p class="tableblock"> <code>CLK_GLOBAL_MEM_FENCE</code><br>
<code>CLK_LOCAL_MEM_FENCE</code><br>
<code>CLK_IMAGE_MEM_FENCE</code></p>
<p class="tableblock"> These flags are described in detail in the
<a href="#synchronization-functions">synchronization functions</a> section.</p></td>
</tr>
</tbody>
</table>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>The <code>image2d_t</code>, <code>image3d_t</code>, <code>image2d_array_t</code>, <code>image1d_t</code>,
<code>image1d_buffer_t</code>, <code>image1d_array_t</code>, <code>image2d_depth_t</code>,
<code>image2d_array_depth_t</code> and <code>sampler_t</code> types are only defined if the device
supports images, i.e. the value of the <a href="#opencl-device-queries"><code>CL_DEVICE_IMAGE_SUPPORT</code> device query</a> is <code>CL_TRUE</code>.
If this is the case then an OpenCL C 3.0 or newer compiler must also define
the <code>__opencl_c_images</code> feature macro.</p>
</div>
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The C99 derived types (arrays, structs, unions, functions, and pointers),
constructed from the built-in <a href="#built-in-scalar-data-types">scalar</a>,
<a href="#built-in-vector-data-types">vector</a>, and
<a href="#other-built-in-data-types">other</a> data types are supported, with specified
<a href="#restrictions">restrictions</a>.</p>
</div>
<div class="paragraph">
<p>The following table lists the other built-in data types in OpenCL C
described in <a href="#table-other-builtin-types">Other Built-in Data Types</a> and the corresponding data type
available to the application:</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Type in OpenCL C</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>API type for application</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image2d_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_mem</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image3d_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_mem</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image2d_array_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_mem</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image1d_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_mem</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image1d_buffer_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_mem</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image1d_array_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_mem</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image2d_depth_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_mem</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>image2d_array_depth_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_mem</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>sampler_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_sampler</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>queue_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_command_queue</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>ndrange_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">N/A</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>clk_event_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">N/A</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>reserve_id_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">N/A</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>event_t</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">N/A</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>cl_mem_fence_flags</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">N/A</p></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="sect3">
<h4 id="reserved-data-types"><a class="anchor" href="#reserved-data-types"></a>6.3.4. Reserved Data Types</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The data type names described in the following table are reserved and cannot
be used by applications as type names.
The vector data type names defined in <a href="#table-builtin-vector-types">Built-in Vector Data Types</a>, but
where <em>n</em> is any value other than 2, 3, 4, 8 and 16, are also reserved.</p>
</div>
<table id="table-reserved-types" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 5. Reserved Data Types</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Type</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>bool<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A boolean vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>half<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 16-bit floating-point vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>quad</code>, <code>quad<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 128-bit floating-point scalar and vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>complex half</code>, <code>complex half<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A complex 16-bit floating-point scalar and vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>imaginary half</code>, <code>imaginary half<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An imaginary 16-bit floating-point scalar and vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>complex float</code>, <code>complex float<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A complex 32-bit floating-point scalar and vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>imaginary float</code>, <code>imaginary float<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An imaginary 32-bit floating-point scalar and vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>complex double</code>, <code>complex double<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A complex 64-bit floating-point scalar and vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>imaginary double</code>, <code>imaginary double<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An imaginary 64-bit floating-point scalar and vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>complex quad</code>, <code>complex quad<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A complex 128-bit floating-point scalar and vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>imaginary quad</code>, <code>imaginary quad<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An imaginary 128-bit floating-point scalar and vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>float<em>n</em>x<em>m</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An <em>n</em> × <em>m</em> matrix of single precision floating-point values
stored in column-major order.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>double<em>n</em>x<em>m</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An <em>n</em> × <em>m</em> matrix of double precision floating-point values
stored in column-major order.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>long double</code>, <code>long double<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A floating-point scalar and vector type with at least as much
precision and range as a <code>double</code> and no more precision and range than
a <code>quad</code>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>long long</code>, <code>long long<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 128-bit signed integer scalar and vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>unsigned long long</code>,
<code>ulong long</code>,
<code>ulong long<em>n</em></code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A 128-bit unsigned integer scalar and vector.</p></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="sect3">
<h4 id="alignment-of-types"><a class="anchor" href="#alignment-of-types"></a>6.3.5. Alignment of Types</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>A data item declared to be a data type in memory is always aligned to the
size of the data type in bytes.
For example, a <code>float4</code> variable will be aligned to a 16-byte boundary, and a
<code>char2</code> variable will be aligned to a 2-byte boundary.</p>
</div>
<div class="paragraph">
<p>For 3-component vector data types, the size of the data type is <code>4 *
sizeof(component)</code>.
This means that a 3-component vector data type will be aligned to a <code>4 *
sizeof(component)</code> boundary.
The <strong>vload3</strong> and <strong>vstore3</strong> built-in functions can be used to read and write,
respectively, 3-component vector data types from an array of packed scalar
data type.</p>
</div>
<div class="paragraph">
<p>A built-in data type that is not a power of two bytes in size must be
aligned to the next larger power of two.
This rule applies to built-in types only, not structs or unions.</p>
</div>
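<div class="paragraph">
<p>The size and alignment rules above can be modelled with a short host-side sketch.
This is plain C99 rather than OpenCL C, and the helper names <code>vec_size</code>, <code>vec_align</code> and <code>pad_to</code> are hypothetical, introduced only for illustration:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">#include &lt;stddef.h&gt;

/* Hypothetical helper: size in bytes of an n-component vector whose
   components are elem_size bytes each. A 3-component vector occupies
   the same storage as a 4-component vector. */
size_t vec_size(size_t elem_size, unsigned n)
{
    return elem_size * (n == 3 ? 4u : n);
}

/* Built-in types are aligned to their size in bytes. */
size_t vec_align(size_t elem_size, unsigned n)
{
    return vec_size(elem_size, n);
}

/* Padding bytes needed to advance offset to the next multiple of align.
   Assumes align is a power of two. */
size_t pad_to(size_t offset, size_t align)
{
    return (align - (offset &amp; (align - 1))) &amp; (align - 1);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Under this model, <code>vec_size(sizeof(float), 3)</code> and <code>vec_size(sizeof(float), 4)</code> give the same result, which is why an array of <code>float3</code> occupies as much memory as an array of <code>float4</code>.</p>
</div>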
<div class="paragraph">
<p>The OpenCL compiler is responsible for aligning data items to the
appropriate alignment as required by the data type.
For arguments to a <code>__kernel</code> function declared to be a pointer to a data
type, the OpenCL compiler can assume that the pointee is always
appropriately aligned as required by the data type.
The behavior of an unaligned load or store is undefined, except for the
<a href="#vector-data-load-and-store-functions">vector data load and store
functions</a> <strong>vload<em>n</em></strong>, <strong>vload_half<em>n</em></strong>, <strong>vstore<em>n</em></strong>, and
<strong>vstore_half<em>n</em></strong>.
The vector load functions can read a vector from an address aligned to the
element type of the vector.
The vector store functions can write a vector to an address aligned to the
element type of the vector.</p>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="vector-literals"><a class="anchor" href="#vector-literals"></a>6.3.6. Vector Literals</h4>
<div class="paragraph">
<p>Vector literals can be used to create vectors from a list of scalars,
vectors or a mixture thereof.
A vector literal can be used either as a vector initializer or as a primary
expression.
A vector literal cannot be used as an l-value.</p>
</div>
<div class="paragraph">
<p>A vector literal is written as a parenthesized vector type followed by a
parenthesized comma-delimited list of parameters.
A vector literal operates as an overloaded function.
The available forms of the function are the set of possible argument
lists for which all arguments have the same element type as the result
vector, and the total number of elements is equal to the number of elements
in the result vector.
In addition, a form with a single scalar of the same type as the element
type of the vector is available.
For example, the following forms are available for <code>float4</code>:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">(float4)( <span class="predefined-type">float</span>, <span class="predefined-type">float</span>, <span class="predefined-type">float</span>, <span class="predefined-type">float</span> )
(float4)( float2, <span class="predefined-type">float</span>, <span class="predefined-type">float</span> )
(float4)( <span class="predefined-type">float</span>, float2, <span class="predefined-type">float</span> )
(float4)( <span class="predefined-type">float</span>, <span class="predefined-type">float</span>, float2 )
(float4)( float2, float2 )
(float4)( float3, <span class="predefined-type">float</span> )
(float4)( <span class="predefined-type">float</span>, float3 )
(float4)( <span class="predefined-type">float</span> )</code></pre>
</div>
</div>
<div class="paragraph">
<p>Operands are evaluated by standard rules for function evaluation, except
that implicit scalar widening shall not occur.
The order in which the operands are evaluated is undefined.
The operands are assigned to their respective positions in the result vector
as they appear in memory order.
That is, the first element of the first operand is assigned to <code>result.x</code>,
the second element of the first operand (or the first element of the second
operand if the first operand was a scalar) is assigned to <code>result.y</code>, etc.
In the case of the form that has a single scalar operand, the operand is
replicated across all lanes of the vector.</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float4 f = (float4)(<span class="float">1</span><span class="float">.0f</span>, <span class="float">2</span><span class="float">.0f</span>, <span class="float">3</span><span class="float">.0f</span>, <span class="float">4</span><span class="float">.0f</span>);
uint4 u = (uint4)(<span class="integer">1</span>); <span class="comment">// u will be (1, 1, 1, 1).</span>
float4 f = (float4)((float2)(<span class="float">1</span><span class="float">.0f</span>, <span class="float">2</span><span class="float">.0f</span>), (float2)(<span class="float">3</span><span class="float">.0f</span>, <span class="float">4</span><span class="float">.0f</span>));
float4 f = (float4)(<span class="float">1</span><span class="float">.0f</span>, (float2)(<span class="float">2</span><span class="float">.0f</span>, <span class="float">3</span><span class="float">.0f</span>), <span class="float">4</span><span class="float">.0f</span>);
float4 f = (float4)(<span class="float">1</span><span class="float">.0f</span>, <span class="float">2</span><span class="float">.0f</span>); <span class="comment">// error</span></code></pre>
</div>
</div>
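<div class="paragraph">
<p>The matching and assignment rules for vector literals can be modelled on the host with a small sketch.
This is plain C99 rather than OpenCL C; the function <code>make_float4</code> and its array-based operand encoding are assumptions made for illustration, not part of any OpenCL API:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">#include &lt;stddef.h&gt;

/* Hypothetical model of (float4)( ... ): operands[i] points to
   widths[i] consecutive floats (width 1 stands for a scalar operand).
   Returns 0 on success, -1 if no form of the literal matches. */
int make_float4(float result[4], const float *const operands[],
                const size_t widths[], size_t count)
{
    size_t total = 0, pos = 0;

    /* Single-scalar form: replicate the scalar across all four lanes. */
    if (count == 1 &amp;&amp; widths[0] == 1) {
        for (size_t i = 0; i != 4; ++i)
            result[i] = operands[0][0];
        return 0;
    }

    /* Otherwise the operand widths must add up to exactly four. */
    for (size_t i = 0; i != count; ++i)
        total += widths[i];
    if (total != 4)
        return -1;

    /* Copy operand elements into the result lanes in the order written. */
    for (size_t i = 0; i != count; ++i)
        for (size_t j = 0; j != widths[i]; ++j)
            result[pos++] = operands[i][j];
    return 0;
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>In this model, an operand list with widths 1, 2 and 1 corresponds to the form <code>(float4)( float, float2, float )</code>, while two scalar operands are rejected, matching the last line of the examples above.</p>
</div>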
</div>
<div class="sect3">
<h4 id="vector-components"><a class="anchor" href="#vector-components"></a>6.3.7. Vector Components</h4>
<div class="paragraph">
<p>The components of vector data types can be addressed as
<code>&lt;vector_data_type&gt;.xyzw</code>.
Vector data types with two or more components, such as <code>char2</code>, can access <code>.xy</code> elements.
Vector data types with three or more components, such as <code>uint3</code>, can access <code>.xyz</code> elements.
Vector data types with four or more components, such as <code>ulong4</code> or <code>float8</code>, can access <code>.xyzw</code> elements.</p>
</div>
<div class="paragraph">
<p>In OpenCL C 3.0, the components of vector data types can also be addressed as
<code>&lt;vector_data_type&gt;.rgba</code>.
Vector data types with two or more components can access <code>.rg</code> elements.
Vector data types with three or more components can access <code>.rgb</code> elements.
Vector data types with four or more components can access <code>.rgba</code> elements.</p>
</div>
<div class="paragraph">
<p>Accessing components beyond those declared for the vector type is an error.
For example:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float2 coord;
coord.x = <span class="float">1</span><span class="float">.0f</span>; <span class="comment">// is legal</span>
coord.r = <span class="float">1</span><span class="float">.0f</span>; <span class="comment">// is legal in OpenCL C 3.0</span>
coord.z = <span class="float">1</span><span class="float">.0f</span>; <span class="comment">// is illegal, since coord only has two components</span>
float3 pos;
pos.z = <span class="float">1</span><span class="float">.0f</span>; <span class="comment">// is legal</span>
pos.b = <span class="float">1</span><span class="float">.0f</span>; <span class="comment">// is legal in OpenCL C 3.0</span>
pos.w = <span class="float">1</span><span class="float">.0f</span>; <span class="comment">// is illegal, since pos only has three components</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>The component selection syntax allows multiple components to be selected by
appending their names after the period (<strong>.</strong>).</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float4 c;
c.xyzw = (float4)(<span class="float">1</span><span class="float">.0f</span>, <span class="float">2</span><span class="float">.0f</span>, <span class="float">3</span><span class="float">.0f</span>, <span class="float">4</span><span class="float">.0f</span>);
c.z = <span class="float">1</span><span class="float">.0f</span>;
c.xy = (float2)(<span class="float">3</span><span class="float">.0f</span>, <span class="float">4</span><span class="float">.0f</span>);
c.xyz = (float3)(<span class="float">3</span><span class="float">.0f</span>, <span class="float">4</span><span class="float">.0f</span>, <span class="float">5</span><span class="float">.0f</span>);</code></pre>
</div>
</div>
<div class="paragraph">
<p>The component selection syntax also allows components to be permuted or
replicated.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float4 pos = (float4)(<span class="float">1</span><span class="float">.0f</span>, <span class="float">2</span><span class="float">.0f</span>, <span class="float">3</span><span class="float">.0f</span>, <span class="float">4</span><span class="float">.0f</span>);
float4 swiz = pos.wzyx; <span class="comment">// swiz = (4.0f, 3.0f, 2.0f, 1.0f)</span>
float4 dup = pos.xxyy; <span class="comment">// dup = (1.0f, 1.0f, 2.0f, 2.0f)</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>The component group notation can occur on the left-hand side of an
expression.
To form an l-value, the swizzle must be applied to an l-value of vector type
and must contain no duplicate components.
The result is an l-value of scalar or vector type, depending on the number
of components specified.
Each component must be a supported scalar or vector type.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float4 pos = (float4)(<span class="float">1</span><span class="float">.0f</span>, <span class="float">2</span><span class="float">.0f</span>, <span class="float">3</span><span class="float">.0f</span>, <span class="float">4</span><span class="float">.0f</span>);
pos.xw = (float2)(<span class="float">5</span><span class="float">.0f</span>, <span class="float">6</span><span class="float">.0f</span>); <span class="comment">// pos = (5.0f, 2.0f, 3.0f, 6.0f)</span>
pos.wx = (float2)(<span class="float">7</span><span class="float">.0f</span>, <span class="float">8</span><span class="float">.0f</span>); <span class="comment">// pos = (8.0f, 2.0f, 3.0f, 7.0f)</span>
pos.xyz = (float3)(<span class="float">3</span><span class="float">.0f</span>, <span class="float">5</span><span class="float">.0f</span>, <span class="float">9</span><span class="float">.0f</span>); <span class="comment">// pos = (3.0f, 5.0f, 9.0f, 7.0f)</span>
pos.xx = (float2)(<span class="float">3</span><span class="float">.0f</span>, <span class="float">4</span><span class="float">.0f</span>); <span class="comment">// illegal - 'x' used twice</span>
<span class="comment">// illegal - mismatch between float2 and float4</span>
pos.xy = (float4)(<span class="float">1</span><span class="float">.0f</span>, <span class="float">2</span><span class="float">.0f</span>, <span class="float">3</span><span class="float">.0f</span>, <span class="float">4</span><span class="float">.0f</span>);
float4 a, b, c, d;
float16 x;
x = (float16)(a, b, c, d);
x = (float16)(a.xxxx, b.xyz, c.xyz, d.xyz, a.yzw);
<span class="comment">// illegal - component a.xxxxxxx is not a valid vector type</span>
x = (float16)(a.xxxxxxx, b.xyz, c.xyz, d.xyz);</code></pre>
</div>
</div>
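<div class="paragraph">
<p>The component selection rules can be summarised as a small checker.
The sketch below is plain C99 rather than OpenCL C, and <code>swizzle_indices</code> is a hypothetical helper that handles only the <code>.xyzw</code> names:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">#include &lt;string.h&gt;

/* Hypothetical model of .xyzw selection on an n-component vector.
   Fills idx[] with the selected element numbers and returns how many
   components were selected, or -1 if the selector is invalid.
   When as_lvalue is non-zero, duplicate components are rejected. */
int swizzle_indices(const char *sel, int n, int as_lvalue, int idx[4])
{
    static const char names[] = "xyzw";
    int seen[4] = {0, 0, 0, 0};
    size_t len = strlen(sel);

    /* This model accepts selections of one to four components. */
    if (len &lt; 1 || len &gt; 4)
        return -1;

    for (size_t i = 0; i != len; ++i) {
        const char *p = strchr(names, sel[i]);
        int k;

        if (p == NULL)
            return -1;              /* not one of x, y, z, w */
        k = (int)(p - names);
        if (k &gt;= n)
            return -1;              /* component beyond the vector width */
        if (as_lvalue &amp;&amp; seen[k])
            return -1;              /* duplicate component in an l-value */
        seen[k] = 1;
        idx[i] = k;
    }
    return (int)len;
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>With this model, <code>"xxyy"</code> is accepted when read from a 4-component vector but <code>"xx"</code> is rejected as an l-value, and <code>"z"</code> is rejected for a 2-component vector.</p>
</div>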
<div class="paragraph">
<p>Elements of vector data types can also be accessed using a numeric index to
refer to the appropriate element in the vector.
The numeric indices that can be used are given in the table below:</p>
</div>
<table id="table-vector-indices" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 6. Numeric indices for built-in vector data types</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Vector Components</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Numeric indices that can be used</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">2-component</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0, 1</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">3-component</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0, 1, 2</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">4-component</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0, 1, 2, 3</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">8-component</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0, 1, 2, 3, 4, 5, 6, 7</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">16-component</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
a, A, b, B, c, C, d, D, e, E, f, F</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>The numeric indices must be preceded by the letter <code>s</code> or <code>S</code>.</p>
</div>
<div class="paragraph">
<p>In the following example</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float8 f;</code></pre>
</div>
</div>
<div class="paragraph">
<p><code>f.s0</code> refers to the 1<sup>st</sup> element of the <code>float8</code> variable <code>f</code> and <code>f.s7</code>
refers to the 8<sup>th</sup> element of the <code>float8</code> variable <code>f</code>.</p>
</div>
<div class="paragraph">
<p>In the following example</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float16 x;</code></pre>
</div>
</div>
<div class="paragraph">
<p><code>x.sa</code> (or <code>x.sA</code>) refers to the 11<sup>th</sup> element of the <code>float16</code> variable
<code>x</code> and <code>x.sf</code> (or <code>x.sF</code>) refers to the 16<sup>th</sup> element of the <code>float16</code>
variable <code>x</code>.</p>
</div>
<div class="paragraph">
<p>The numeric indices used to refer to an appropriate element in the vector
cannot be intermixed with <code>.xyzw</code> notation used to access elements of a 1 ..
4 component vector.</p>
</div>
<div class="paragraph">
<p>For example</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float4 f, a;
a = f.x12w; <span class="comment">// illegal use of numeric indices with .xyzw</span>
a.xyzw = f.s0123; <span class="comment">// valid</span></code></pre>
</div>
</div>
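<div class="paragraph">
<p>The numeric index notation can be modelled as a small parser.
This is a plain C99 sketch rather than OpenCL C; <code>numeric_index</code> and <code>parse_numeric_selector</code> are hypothetical names, and the selector is passed as the text that follows the period:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">#include &lt;ctype.h&gt;
#include &lt;string.h&gt;

/* Map one index character (the part after s or S) to an element number.
   Returns -1 if it is not a valid index for an n-component vector. */
int numeric_index(char c, int n)
{
    static const char digits[] = "0123456789abcdef";
    const char *p;
    int k;

    if (c == '\0')
        return -1;
    p = strchr(digits, tolower((unsigned char)c));
    if (p == NULL)
        return -1;
    k = (int)(p - digits);
    return k &lt; n ? k : -1;
}

/* Parse a selector such as "s0123" for an n-component vector.
   Returns the number of elements selected, or -1 if the selector does
   not start with s or S, names an element out of range, or would
   produce a width other than 1, 2, 3, 4, 8 or 16. */
int parse_numeric_selector(const char *sel, int n, int idx[16])
{
    size_t len = strlen(sel);

    if (len &lt; 2 || (sel[0] != 's' &amp;&amp; sel[0] != 'S'))
        return -1;
    len -= 1;
    if (!(len &lt;= 4 || len == 8 || len == 16))
        return -1;
    for (size_t i = 0; i != len; ++i) {
        idx[i] = numeric_index(sel[i + 1], n);
        if (idx[i] &lt; 0)
            return -1;
    }
    return (int)len;
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>In this model <code>"s0123"</code> is accepted for a 4-component vector, whereas <code>"x12w"</code> is rejected because it mixes the two notations.</p>
</div>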
<div class="paragraph">
<p>Vector data types can use the <code>.lo</code> (or <code>.even</code>) and <code>.hi</code> (or <code>.odd</code>)
suffixes to get smaller vector types or to combine smaller vector types into a
larger vector type.
Multiple levels of <code>.lo</code> (or <code>.even</code>) and <code>.hi</code> (or <code>.odd</code>) suffixes can be
used until they refer to a scalar term.</p>
</div>
<div class="paragraph">
<p>The <code>.lo</code> suffix refers to the lower half of a given vector.
The <code>.hi</code> suffix refers to the upper half of a given vector.</p>
</div>
<div class="paragraph">
<p>The <code>.even</code> suffix refers to the even elements of a vector.
The <code>.odd</code> suffix refers to the odd elements of a vector.</p>
</div>
<div class="paragraph">
<p>Some examples to help illustrate this are given below:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float4 vf;
float2 low = vf.lo; <span class="comment">// returns vf.xy</span>
float2 high = vf.hi; <span class="comment">// returns vf.zw</span>
float2 even = vf.even; <span class="comment">// returns vf.xz</span>
float2 odd = vf.odd; <span class="comment">// returns vf.yw</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>The suffixes <code>.lo</code> (or <code>.even</code>) and <code>.hi</code> (or <code>.odd</code>) for a 3-component
vector type operate as if the 3-component vector type is a 4-component
vector type with the value in the <code>w</code> component undefined.</p>
</div>
<div class="paragraph">
<p>Some examples are given below:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float8 vf;
float4 odd = vf.odd;
float4 even = vf.even;
float2 high = vf.even.hi;
float2 low = vf.odd.lo;
<span class="comment">// interleave LR stereo stream</span>
float4 left, right;
float8 interleaved;
interleaved.even = left;
interleaved.odd = right;
<span class="comment">// deinterleave</span>
left = interleaved.even;
right = interleaved.odd;
<span class="comment">// transpose a 4x4 matrix</span>
<span class="directive">void</span> transpose( float4 m[<span class="integer">4</span>] )
{
<span class="comment">// read matrix into a float16 vector</span>
float16 x = (float16)( m[<span class="integer">0</span>], m[<span class="integer">1</span>], m[<span class="integer">2</span>], m[<span class="integer">3</span>] );
float16 t;
<span class="comment">// transpose</span>
t.even = x.lo;
t.odd = x.hi;
x.even = t.lo;
x.odd = t.hi;
<span class="comment">// write back</span>
m[<span class="integer">0</span>] = x.lo.lo; <span class="comment">// { m[0][0], m[1][0], m[2][0], m[3][0] }</span>
m[<span class="integer">1</span>] = x.lo.hi; <span class="comment">// { m[0][1], m[1][1], m[2][1], m[3][1] }</span>
m[<span class="integer">2</span>] = x.hi.lo; <span class="comment">// { m[0][2], m[1][2], m[2][2], m[3][2] }</span>
m[<span class="integer">3</span>] = x.hi.hi; <span class="comment">// { m[0][3], m[1][3], m[2][3], m[3][3] }</span>
}
float3 vf = (float3)(<span class="float">1</span><span class="float">.0f</span>, <span class="float">2</span><span class="float">.0f</span>, <span class="float">3</span><span class="float">.0f</span>);
float2 low = vf.lo; <span class="comment">// (1.0f, 2.0f);</span>
float2 high = vf.hi; <span class="comment">// (3.0f, _undefined_);</span></code></pre>
</div>
</div>
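<div class="paragraph">
<p>Which elements each suffix selects can be expressed as index arithmetic.
The following is a plain C99 sketch rather than OpenCL C; <code>half_kind</code> and <code>half_indices</code> are hypothetical names used only to illustrate the rule:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">/* Hypothetical model of the .lo, .hi, .even and .odd suffixes. */
enum half_kind { HALF_LO, HALF_HI, HALF_EVEN, HALF_ODD };

/* Writes the element numbers selected from an n-component vector
   (n is 2, 3, 4, 8 or 16) into idx[] and returns how many were written.
   A 3-component vector is treated as a 4-component vector, so element 3
   in the output stands for the undefined w component. */
int half_indices(enum half_kind kind, int n, int idx[8])
{
    int width = (n == 3) ? 4 : n;
    int half = width / 2;

    for (int i = 0; i != half; ++i) {
        switch (kind) {
        case HALF_LO:   idx[i] = i;         break;
        case HALF_HI:   idx[i] = half + i;  break;
        case HALF_EVEN: idx[i] = 2 * i;     break;
        case HALF_ODD:  idx[i] = 2 * i + 1; break;
        }
    }
    return half;
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>For a 4-component vector this yields elements 0 and 1 for <code>.lo</code>, 2 and 3 for <code>.hi</code>, 0 and 2 for <code>.even</code>, and 1 and 3 for <code>.odd</code>, which agrees with the <code>float4</code> example earlier in this section.</p>
</div>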
<div class="paragraph">
<p>Taking the address of a vector element is illegal and results in a
compilation error.
For example:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float8 vf;
<span class="predefined-type">float</span> *f = &amp;vf.x; m <span class="comment">// is illegal</span>
float2 *f2 = &amp;vf.s07; <span class="comment">// is illegal</span>
float4 *odd = &amp;vf.odd; <span class="comment">// is illegal</span>
float4 *even = &amp;vf.even; <span class="comment">// is illegal</span>
float2 *high = &amp;vf.even.hi; <span class="comment">// is illegal</span>
float2 *low = &amp;vf.odd.lo; <span class="comment">// is illegal</span></code></pre>
</div>
</div>
</div>
<div class="sect3">
<h4 id="aliasing-rules"><a class="anchor" href="#aliasing-rules"></a>6.3.8. Aliasing Rules</h4>
<div class="paragraph">
<p>OpenCL C programs shall comply with the C99 type-based aliasing rules
defined in <a href="#C99-spec">section 6.5, item 7 of the C99 Specification</a>.
The OpenCL C built-in vector data types are considered aggregate types
<sup class="footnote">[<a id="_footnoteref_10" class="footnote" href="#_footnotedef_10" title="View footnote.">10</a>]</sup> for the purpose of applying these
aliasing rules.</p>
</div>
</div>
<div class="sect3">
<h4 id="keywords"><a class="anchor" href="#keywords"></a>6.3.9. Keywords</h4>
<div class="paragraph">
<p>The following names are reserved for use as keywords in OpenCL C and shall
not be used otherwise.</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Names reserved as keywords by C99.</p>
</li>
<li>
<p>OpenCL C data types defined in <a href="#table-builtin-vector-types">Built-in Vector Data Types</a>,
<a href="#table-other-builtin-types">Other Built-in Data Types</a>, and <a href="#table-reserved-types">Reserved Data Types</a>.</p>
</li>
<li>
<p>Address space qualifiers: <code>__global</code>, <code>global</code>, <code>__local</code>, <code>local</code>,
<code>__constant</code>, <code>constant</code>, <code>__private</code>, and <code>private</code>.
<code>__generic</code> and <code>generic</code> are reserved for future use.</p>
</li>
<li>
<p>Function qualifiers: <code>__kernel</code> and <code>kernel</code>.</p>
</li>
<li>
<p>Access qualifiers: <code>__read_only</code>, <code>read_only</code>, <code>__write_only</code>,
<code>write_only</code>, <code>__read_write</code> and <code>read_write</code>.</p>
</li>
<li>
<p><code>uniform</code>, <code>pipe</code>.</p>
</li>
</ul>
</div>
</div>
</div>
<div class="sect2">
<h3 id="conversions-and-type-casting"><a class="anchor" href="#conversions-and-type-casting"></a>6.4. Conversions and Type Casting</h3>
<div class="sect3">
<h4 id="implicit-conversions"><a class="anchor" href="#implicit-conversions"></a>6.4.1. Implicit Conversions</h4>
<div class="paragraph">
<p>Implicit conversions between scalar built-in types defined in
<a href="#table-builtin-scalar-types">Built-in Scalar Data Types</a> (except <code>void</code> and <code>half</code>
<sup class="footnote">[<a id="_footnoteref_11" class="footnote" href="#_footnotedef_11" title="View footnote.">11</a>]</sup>) are supported.
When an implicit conversion is done, it is not just a re-interpretation of
the expression&#8217;s value but a conversion of that value to an equivalent value
in the new type.
For example, the integer value 5 will be converted to the floating-point
value 5.0.</p>
</div>
<div class="paragraph">
<p>Implicit conversions from a scalar type to a vector type are allowed.
In this case, the scalar may be subject to the usual arithmetic conversion
to the element type used by the vector.
The scalar type is then widened to the vector.</p>
</div>
<div class="paragraph">
<p>Implicit conversions between built-in vector data types are disallowed.</p>
</div>
<div class="paragraph">
<p>Implicit conversions for pointer types follow the rules described in the
<a href="#C99-spec">C99 Specification</a>.</p>
</div>
</div>
<div class="sect3">
<h4 id="explicit-casts"><a class="anchor" href="#explicit-casts"></a>6.4.2. Explicit Casts</h4>
<div class="paragraph">
<p>Standard typecasts for built-in scalar data types defined in
<a href="#table-builtin-scalar-types">Built-in Scalar Data Types</a> will perform appropriate conversion (except
<code>void</code> and <code>half</code> <sup class="footnote">[<a id="_footnoteref_12" class="footnote" href="#_footnotedef_12" title="View footnote.">12</a>]</sup>).
In the example below:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">float</span> f = <span class="float">1</span><span class="float">.0f</span>;
<span class="predefined-type">int</span> i = (<span class="predefined-type">int</span>)f;</code></pre>
</div>
</div>
<div class="paragraph">
<p><code>f</code> stores <code>0x3F800000</code> and <code>i</code> stores <code>0x1</code>, which is the floating-point
value <code>1.0f</code> in <code>f</code> converted to an integer value.</p>
</div>
<div class="paragraph">
<p>Explicit casts between vector types are not legal.
The examples below will generate a compilation error.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">int4 i;
uint4 u = (uint4) i; <span class="comment">// not allowed</span>
float4 f;
int4 i = (int4) f; <span class="comment">// not allowed</span>
float4 f;
int8 i = (int8) f; <span class="comment">// not allowed</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>Scalar to vector conversions may be performed by casting the scalar to the
desired vector data type.
Type casting will also perform appropriate arithmetic conversion.
The round to zero rounding mode will be used for conversions to built-in
integer vector types.
The default rounding mode will be used for conversions to floating-point
vector types.
When casting a <code>bool</code> to a vector integer data type, the vector components
will be set to -1 (i.e. all bits set) if the bool value is <em>true</em> and 0
otherwise.</p>
</div>
<div class="paragraph">
<p>Below are some correct examples of explicit casts.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">float</span> f = <span class="float">1</span><span class="float">.0f</span>;
float4 va = (float4)f;
<span class="comment">// va is a float4 vector with elements (f, f, f, f).</span>
uchar u = <span class="hex">0xFF</span>;
float4 vb = (float4)u;
<span class="comment">// vb is a float4 vector with elements</span>
<span class="comment">// ((float)u, (float)u, (float)u, (float)u).</span>
<span class="predefined-type">float</span> f = <span class="float">2</span><span class="float">.0f</span>;
int2 vc = (int2)f;
<span class="comment">// vc is an int2 vector with elements ((int)f, (int)f).</span>
uchar4 vtrue = (uchar4)<span class="predefined-constant">true</span>;
<span class="comment">// vtrue is a uchar4 vector with elements (0xff, 0xff,</span>
<span class="comment">// 0xff, 0xff).</span></code></pre>
</div>
</div>
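<div class="paragraph">
<p>The scalar-to-vector cast rules above can be imitated in plain C99 on the host.
The following is only a sketch, not OpenCL C: the types <code>emu_int4</code> and <code>emu_float4</code> and all function names are hypothetical stand-ins introduced for illustration.</p>
</div>

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the OpenCL C int4 and float4 types. */
typedef struct { int s[4]; } emu_int4;
typedef struct { float s[4]; } emu_float4;

/* Imitates (float4)x for a scalar x already converted to float. */
static emu_float4 emu_splat_float4(float x)
{
    emu_float4 v = { { x, x, x, x } };
    return v;
}

/* Imitates (int4)f: convert with round toward zero, then replicate. */
static emu_int4 emu_cast_float_to_int4(float f)
{
    int e = (int)f;  /* a C cast truncates toward zero, like _rtz */
    emu_int4 v = { { e, e, e, e } };
    return v;
}

/* Imitates (int4)b for a bool: true becomes -1 (all bits set), false 0. */
static emu_int4 emu_cast_bool_to_int4(bool b)
{
    int e = b ? -1 : 0;
    emu_int4 v = { { e, e, e, e } };
    return v;
}
```

<div class="paragraph">
<p>Under these assumptions, <code>emu_cast_float_to_int4(-2.9f)</code> yields four elements equal to -2, and <code>emu_cast_bool_to_int4(true)</code> yields four elements equal to -1.</p>
</div>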
</div>
<div class="sect3">
<h4 id="explicit-conversions"><a class="anchor" href="#explicit-conversions"></a>6.4.3. Explicit Conversions</h4>
<div class="paragraph">
<p>Explicit conversions may be performed using the</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">convert_destType(sourceType)</code></pre>
</div>
</div>
<div class="paragraph">
<p>suite of functions.
These provide a full set of type conversions between supported
<a href="#built-in-scalar-data-types">scalar</a>,
<a href="#built-in-vector-data-types">vector</a>, and
<a href="#other-built-in-data-types">other</a> data types except for the following
types: <code>bool</code>, <code>half</code>, <code>size_t</code>, <code>ptrdiff_t</code>, <code>intptr_t</code>, <code>uintptr_t</code>, and
<code>void</code>.</p>
</div>
<div class="paragraph">
<p>The number of elements in the source and destination vectors must match.</p>
</div>
<div class="paragraph">
<p>In the example below:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">uchar4 u;
int4 c = convert_int4(u);</code></pre>
</div>
</div>
<div class="paragraph">
<p><code>convert_int4</code> converts a <code>uchar4</code> vector <code>u</code> to an <code>int4</code> vector <code>c</code>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">float</span> f;
<span class="predefined-type">int</span> i = convert_int(f);</code></pre>
</div>
</div>
<div class="paragraph">
<p><code>convert_int</code> converts a <code>float</code> scalar <code>f</code> to an <code>int</code> scalar <code>i</code>.</p>
</div>
<div class="paragraph">
<p>The behavior of the conversion may be modified by one or two optional
modifiers that specify saturation for out-of-range inputs and rounding
behavior.</p>
</div>
<div class="paragraph">
<p>The full form of the scalar convert function is:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">destType convert_destType&lt;_sat&gt;&lt;_roundingMode&gt;(sourceType)</code></pre>
</div>
</div>
<div class="paragraph">
<p>where <code>destType</code> is the destination scalar type and <code>sourceType</code> is the source scalar type.</p>
</div>
<div class="paragraph">
<p>The full form of the vector convert function is:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">destTypen convert_destTypen&lt;_sat&gt;&lt;_roundingMode&gt;(sourceTypen)</code></pre>
</div>
</div>
<div class="paragraph">
<p>where <code>destTypen</code> is the n-element destination vector type and <code>sourceTypen</code> is the n-element source vector type.</p>
</div>
<div class="sect4">
<h5 id="data-types"><a class="anchor" href="#data-types"></a>6.4.3.1. Data Types</h5>
<div class="paragraph">
<p>Conversions are available for the following scalar types: <code>char</code>, <code>uchar</code>,
<code>short</code>, <code>ushort</code>, <code>int</code>, <code>uint</code>, <code>long</code>, <code>ulong</code>, <code>float</code>, and built-in
vector types derived therefrom.
The operand and result type must have the same number of elements.
The operand and result type may be the same type in which case the
conversion has no effect on the type or value of an expression.</p>
</div>
<div class="paragraph">
<p>Conversions between integer types follow the conversion rules specified in
<a href="#C99-spec">sections 6.3.1.1 and 6.3.1.3 of the C99 Specification</a> except
for <a href="#out-of-range-behavior">out-of-range behavior and saturated
conversions</a>.</p>
</div>
</div>
<div class="sect4">
<h5 id="rounding-modes"><a class="anchor" href="#rounding-modes"></a>6.4.3.2. Rounding Modes</h5>
<div class="paragraph">
<p>Conversions to and from floating-point type shall conform to IEEE-754
rounding rules.
Conversions may have an optional rounding mode modifier described in the
following table.</p>
</div>
<table id="table-rounding-mode" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 7. Rounding Modes</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Modifier</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Rounding Mode Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>_rte</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Round to nearest even</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>_rtz</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Round toward zero</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>_rtp</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Round toward positive infinity</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>_rtn</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Round toward negative infinity</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">no modifier specified</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use the default rounding mode for this destination
type, <code>_rtz</code> for conversion to integers or the
default rounding mode for conversion to
floating-point types.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>By default, conversions to integer type use the <code>_rtz</code> (round toward zero)
rounding mode and conversions to floating-point type
<sup class="footnote">[<a id="_footnoteref_13" class="footnote" href="#_footnotedef_13" title="View footnote.">13</a>]</sup> use the default rounding mode.
The only default floating-point rounding mode supported is round to nearest
even, i.e. the default rounding mode will be <code>_rte</code> for floating-point types.</p>
</div>
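<div class="paragraph">
<p>The effect of the four rounding modifiers on a float-to-integer conversion can be sketched in plain C99 on the host.
The functions below are hypothetical emulations, not the OpenCL C built-ins, and they are only meant for inputs with magnitude below 2<sup>24</sup>, where every integer result converts back to <code>float</code> exactly.</p>
</div>

```c
#include <assert.h>

/* Hypothetical emulation of convert_int_rtz: round toward zero. */
static int emu_convert_int_rtz(float f)
{
    /* Precondition of this sketch; also rejects NaN. */
    assert(f > -16777216.0f && f < 16777216.0f);
    return (int)f;                       /* a C cast truncates toward zero */
}

/* Hypothetical emulation of convert_int_rtn: round toward -infinity. */
static int emu_convert_int_rtn(float f)
{
    int t = emu_convert_int_rtz(f);
    return ((float)t > f) ? t - 1 : t;   /* step down for negative fractions */
}

/* Hypothetical emulation of convert_int_rtp: round toward +infinity. */
static int emu_convert_int_rtp(float f)
{
    int t = emu_convert_int_rtz(f);
    return ((float)t < f) ? t + 1 : t;   /* step up for positive fractions */
}

/* Hypothetical emulation of convert_int_rte: round to nearest even. */
static int emu_convert_int_rte(float f)
{
    int lo = emu_convert_int_rtn(f);
    float frac = f - (float)lo;          /* fractional part in [0, 1) */
    if (frac > 0.5f) return lo + 1;
    if (frac < 0.5f) return lo;
    return (lo % 2 == 0) ? lo : lo + 1;  /* tie: pick the even neighbor */
}
```

<div class="paragraph">
<p>For the input -2.5f, this sketch gives -2 for <code>_rtz</code>, -3 for <code>_rtn</code>, -2 for <code>_rtp</code>, and -2 for <code>_rte</code>.</p>
</div>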
</div>
<div class="sect4">
<h5 id="out-of-range-behavior"><a class="anchor" href="#out-of-range-behavior"></a>6.4.3.3. Out-of-Range Behavior and Saturated Conversions</h5>
<div class="paragraph">
<p>When the conversion operand is either greater than the greatest
representable destination value or less than the least representable
destination value, it is said to be out-of-range.
The result of out-of-range conversion is determined by the conversion rules
specified by <a href="#C99-spec">section 6.3 of the C99 Specification</a>.
When converting from a floating-point type to integer type, the behavior is
implementation-defined.</p>
</div>
<div class="paragraph">
<p>Conversions to integer type may opt to convert using the optional saturated
mode by appending the <code>_sat</code> modifier to the conversion function name.
When in saturated mode, values that are outside the representable range
shall clamp to the nearest representable value in the destination format.
(NaN should be converted to 0).</p>
</div>
<div class="paragraph">
<p>Conversions to floating-point type shall conform to IEEE-754 rounding rules.
The <code>_sat</code> modifier may not be used for conversions to floating-point
formats.</p>
</div>
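<div class="paragraph">
<p>The clamping performed by saturated conversions can be sketched in plain C99 on the host.
The function names below are hypothetical emulations rather than the OpenCL C built-ins, and the float case assumes a 32-bit <code>int</code>.</p>
</div>

```c
#include <limits.h>

/* Hypothetical emulation of convert_char_sat(short). */
static signed char emu_convert_char_sat(short s)
{
    if (s > SCHAR_MAX) return SCHAR_MAX;
    if (s < SCHAR_MIN) return SCHAR_MIN;
    return (signed char)s;
}

/* Hypothetical emulation of convert_ushort_sat(short):
   negative values clamp to 0, all others are representable. */
static unsigned short emu_convert_ushort_sat(short s)
{
    return (s < 0) ? 0 : (unsigned short)s;
}

/* Hypothetical emulation of convert_int_sat(float), assuming 32-bit int.
   The default _rtz rounding is supplied by the C cast. */
static int emu_convert_int_sat(float f)
{
    if (f != f) return 0;                      /* NaN maps to 0 */
    if (f >= 2147483648.0f) return INT_MAX;    /* 2^31 and above */
    if (f <= -2147483648.0f) return INT_MIN;   /* -2^31 and below */
    return (int)f;
}
```

<div class="paragraph">
<p>The range checks come before the cast because converting an out-of-range floating-point value to an integer is not well defined in host C either.</p>
</div>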
</div>
<div class="sect4">
<h5 id="explicit-conversion-examples"><a class="anchor" href="#explicit-conversion-examples"></a>6.4.3.4. Explicit Conversion Examples</h5>
<div class="paragraph">
<p>Example 1:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">short4 s;
<span class="comment">// negative values clamped to 0</span>
ushort4 u = convert_ushort4_sat( s );
<span class="comment">// values &gt; CHAR_MAX converted to CHAR_MAX</span>
<span class="comment">// values &lt; CHAR_MIN converted to CHAR_MIN</span>
char4 c = convert_char4_sat( s );</code></pre>
</div>
</div>
<div class="paragraph">
<p>Example 2:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float4 f;
<span class="comment">// values implementation defined for</span>
<span class="comment">// f &gt; INT_MAX, f &lt; INT_MIN or NaN</span>
int4 i = convert_int4( f );
<span class="comment">// values &gt; INT_MAX clamp to INT_MAX, values &lt; INT_MIN clamp</span>
<span class="comment">// to INT_MIN. NaN should produce 0.</span>
<span class="comment">// The _rtz rounding mode is used to produce the integer values.</span>
int4 i2 = convert_int4_sat( f );
<span class="comment">// similar to convert_int4, except that floating-point values</span>
<span class="comment">// are rounded to the nearest integer instead of truncated</span>
int4 i3 = convert_int4_rte( f );
<span class="comment">// similar to convert_int4_sat, except that floating-point values</span>
<span class="comment">// are rounded to the nearest integer instead of truncated</span>
int4 i4 = convert_int4_sat_rte( f );</code></pre>
</div>
</div>
<div class="paragraph">
<p>Example 3:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">int4 i;
<span class="comment">// convert ints to floats using the default rounding mode.</span>
float4 f = convert_float4( i );
<span class="comment">// convert ints to floats. integer values that cannot</span>
<span class="comment">// be exactly represented as floats should round up to the</span>
<span class="comment">// next representable float.</span>
float4 f2 = convert_float4_rtp( i );</code></pre>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="reinterpreting-data-as-another-type"><a class="anchor" href="#reinterpreting-data-as-another-type"></a>6.4.4. Reinterpreting Data As Another Type</h4>
<div class="paragraph">
<p>It is frequently necessary to reinterpret bits in a data type as another
data type in OpenCL.
This is typically required when direct access to the bits in a
floating-point type is needed, for example to mask off the sign bit or make
use of the result of a vector <a href="#operators-relational">relational operator</a>
on floating-point data <sup class="footnote">[<a id="_footnoteref_14" class="footnote" href="#_footnotedef_14" title="View footnote.">14</a>]</sup>.
Several methods to achieve this (non-) conversion are frequently practiced
in C, including pointer aliasing, unions and memcpy.
Of these, only memcpy is strictly correct in C99.
Since OpenCL does not provide <strong>memcpy</strong>, other methods are needed.</p>
</div>
<div class="sect4">
<h5 id="reinterpreting-types-using-unions"><a class="anchor" href="#reinterpreting-types-using-unions"></a>6.4.4.1. Reinterpreting Types Using Unions</h5>
<div class="paragraph">
<p>The OpenCL language extends the union to allow the program to access a
member of a union object using a member of a different type.
The relevant bytes of the representation of the object are treated as an
object of the type used for the access.
If the type used for access is larger than the representation of the object,
then the value of the additional bytes is undefined.</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// d only if double precision is supported</span>
<span class="keyword">union</span> { <span class="predefined-type">float</span> f; uint u; <span class="predefined-type">double</span> d; } u;
u.u = <span class="integer">1</span>; <span class="comment">// u.f contains 2**-149. u.d is undefined --</span>
<span class="comment">// depending on endianness the low or high half</span>
<span class="comment">// of d is unknown</span>
u.f = <span class="float">1</span><span class="float">.0f</span>; <span class="comment">// u.u contains 0x3f800000, u.d contains an</span>
<span class="comment">// undefined value -- depending on endianness</span>
<span class="comment">// the low or high half of d is unknown</span>
u.d = <span class="float">1</span><span class="float">.0</span>; <span class="comment">// u.u contains 0x3ff00000 (big endian) or 0</span>
<span class="comment">// (little endian). u.f contains either 0x1.ep0f</span>
<span class="comment">// (big endian) or 0.0f (little endian)</span></code></pre>
</div>
</div>
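<div class="paragraph">
<p>The same union technique can be tried in plain C on the host, where compilers generally support it as well.
This is a sketch only: the union type <code>emu_float_bits</code> and both helper names are hypothetical, and <code>float</code> is assumed to be a 32-bit IEEE-754 type.</p>
</div>

```c
#include <stdint.h>

/* Hypothetical union mirroring the float and uint members of the example. */
typedef union {
    float    f;
    uint32_t u;
} emu_float_bits;

/* Write the float member, read the integer member. */
static uint32_t emu_float_to_bits(float f)
{
    emu_float_bits x;
    x.f = f;
    return x.u;
}

/* Write the integer member, read the float member. */
static float emu_bits_to_float(uint32_t u)
{
    emu_float_bits x;
    x.u = u;
    return x.f;
}
```

<div class="paragraph">
<p>Because both members have the same size, no bytes are left undefined; the endianness caveats in the example above only arise for the wider <code>double</code> member.</p>
</div>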
</div>
<div class="sect4">
<h5 id="reinterpreting-types-using-as_type-and-as_typen"><a class="anchor" href="#reinterpreting-types-using-as_type-and-as_typen"></a>6.4.4.2. Reinterpreting Types Using <strong>as_type</strong>() and <strong>as_type<em>n</em></strong>()</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>All data types described in <a href="#table-builtin-scalar-types">Built-in Scalar Data Types</a> and
<a href="#table-builtin-vector-types">Built-in Vector Data Types</a> (except <code>bool</code>, <code>void</code>, and <code>half</code>
<sup class="footnote">[<a id="_footnoteref_15" class="footnote" href="#_footnotedef_15" title="View footnote.">15</a>]</sup>) may also be reinterpreted as another data type of
the same size using the <strong>as_type</strong>() operator for scalar data types and the
<strong>as_type<em>n</em></strong>() operator <sup class="footnote">[<a id="_footnoteref_16" class="footnote" href="#_footnotedef_16" title="View footnote.">16</a>]</sup> for vector
data types.
When the operand and result type contain the same number of elements, the
bits in the operand shall be returned directly without modification as the
new type.
The usual type promotion for function arguments shall not be performed.</p>
</div>
<div class="paragraph">
<p>For example, <code><strong>as_float</strong>(0x3f800000)</code> returns <code>1.0f</code>, which is the value
that the bit pattern <code>0x3f800000</code> has if viewed as an IEEE-754 single
precision value.</p>
</div>
<div class="paragraph">
<p>When the operand and result type contain a different number of elements, the
result shall be implementation-defined except if the operand is a
4-component vector and the result is a 3-component vector.
In this case, the bits in the operand shall be returned directly without
modification as the new type.
That is, a conforming implementation shall explicitly define a behavior, but
two conforming implementations need not have the same behavior when the
number of elements in the result and operand types does not match.
The implementation may define the result to contain all, some or none of the
original bits in whatever order it chooses.
It is an error to use the <strong>as_type</strong>() or <strong>as_type<em>n</em></strong>() operator to
reinterpret data to a type of a different number of bytes.</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">float</span> f = <span class="float">1</span><span class="float">.0f</span>;
uint u = as_uint(f); <span class="comment">// Legal. Contains: 0x3f800000</span>
float4 f = (float4)(<span class="float">1</span><span class="float">.0f</span>, <span class="float">2</span><span class="float">.0f</span>, <span class="float">3</span><span class="float">.0f</span>, <span class="float">4</span><span class="float">.0f</span>);
<span class="comment">// Legal. Contains:</span>
<span class="comment">// (int4)(0x3f800000, 0x40000000, 0x40400000, 0x40800000)</span>
int4 i = as_int4(f);
float4 f, g;
int4 is_less = f &lt; g;
<span class="comment">// Legal. f[i] = f[i] &lt; g[i] ? f[i] : 0.0f</span>
f = as_float4(as_int4(f) &amp; is_less);
<span class="predefined-type">int</span> i;
<span class="comment">// Legal. Result is implementation-defined.</span>
short2 j = as_short2(i);
int4 i;
<span class="comment">// Legal. Result is implementation-defined.</span>
short8 j = as_short8(i);
float4 f;
<span class="comment">// Error. Result and operand have different sizes</span>
double4 g = as_double4(f); <span class="comment">// Only if double precision is supported.</span>
float4 f;
<span class="comment">// Legal. g.xyz will have same values as f.xyz. g.w is undefined</span>
float3 g = as_float3(f);</code></pre>
</div>
</div>
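<div class="paragraph">
<p>On the host, where <strong>memcpy</strong> is available, the same-size reinterpretation can be sketched in plain C99.
The <code>emu_</code> functions below are hypothetical stand-ins for <strong>as_uint</strong>() and <strong>as_float</strong>(), assuming a 32-bit IEEE-754 <code>float</code>; the last two helpers show the sign-bit masking and mask-based selection mentioned above, reduced to one element.</p>
</div>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for as_uint(float): copy the bits unchanged. */
static uint32_t emu_as_uint(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);   /* same size is required */
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical stand-in for as_float(uint). */
static float emu_as_float(uint32_t u)
{
    float f;
    assert(sizeof f == sizeof u);
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Clear the sign bit, leaving exponent and mantissa untouched. */
static float emu_clear_sign(float f)
{
    return emu_as_float(emu_as_uint(f) & 0x7fffffffu);
}

/* One-element version of "as_float4(as_int4(f) & is_less)":
   keeps f when mask is -1 (all bits set), gives 0.0f when mask is 0. */
static float emu_mask_select(float f, int32_t mask)
{
    return emu_as_float(emu_as_uint(f) & (uint32_t)mask);
}
```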
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="pointer-casting"><a class="anchor" href="#pointer-casting"></a>6.4.5. Pointer Casting</h4>
<div class="paragraph">
<p>Pointers to old and new types may be cast back and forth to each other.
Casting a pointer to a new type represents an unchecked assertion that the
address is correctly aligned.
The developer will also need to know the endianness of the OpenCL device and
the endianness of the data to determine how the scalar and vector data
elements are stored in memory.</p>
</div>
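<div class="paragraph">
<p>The endianness concern can be illustrated with a plain C99 sketch for the host.
Both helper names are hypothetical; the second one assembles a value byte by byte, which is one way to avoid depending on either the alignment of the address or the byte order of the machine.</p>
</div>

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int emu_is_little_endian(void)
{
    const uint32_t probe = 1u;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* inspect the lowest-addressed byte */
    return first == 1;
}

/* Hypothetical helper: read a 32-bit value that is stored in
   little-endian byte order, regardless of host byte order. */
static uint32_t emu_load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

<div class="paragraph">
<p>Casting the byte pointer to a pointer to a 32-bit type and dereferencing it would give the same value as <code>emu_load_le32</code> only on a little-endian machine, and only if the address happens to be suitably aligned.</p>
</div>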
</div>
<div class="sect3">
<h4 id="usual-arithmetic-conversions"><a class="anchor" href="#usual-arithmetic-conversions"></a>6.4.6. Usual Arithmetic Conversions</h4>
<div class="paragraph">
<p>Many operators that expect operands of arithmetic type cause conversions and
yield result types in a similar way.
The purpose is to determine a common real type for the operands and result.
For the specified operands, each operand is converted, without change of
type domain, to a type whose corresponding real type is the common real
type.
For this purpose, all vector types shall be considered to have higher
conversion ranks than scalars.
Unless explicitly stated otherwise, the common real type is also the
corresponding real type of the result, whose type domain is the type domain
of the operands if they are the same, and complex otherwise.
This pattern is called the usual arithmetic conversions.
If the operands are of more than one vector type, then an error shall occur.
<a href="#implicit-conversions">Implicit conversions</a> between vector types are not
permitted.</p>
</div>
<div class="paragraph">
<p>Otherwise, if there is only a single vector type, and all other operands are
scalar types, the scalar types are converted to the type of the vector
element, then widened into a new vector containing the same number of
elements as the vector, by duplication of the scalar value across the width
of the new vector.
An error shall occur if any scalar operand has greater rank than the type of
the vector element.
For this purpose, the rank order is defined as follows:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>The rank of a floating-point type is greater than the rank of another
floating-point type, if the first floating-point type can exactly
represent all numeric values in the second floating-point type.
(For this purpose, the encoding of the floating-point value is used,
rather than the subset of the encoding usable by the device.)</p>
</li>
<li>
<p>The rank of any floating-point type is greater than the rank of any
integer type.</p>
</li>
<li>
<p>The rank of an integer type is greater than the rank of an integer type
with less precision.</p>
</li>
<li>
<p>The rank of an unsigned integer type is <strong>greater than</strong> the rank of a
signed integer type with the same precision
<sup class="footnote">[<a id="_footnoteref_17" class="footnote" href="#_footnotedef_17" title="View footnote.">17</a>]</sup>.</p>
</li>
<li>
<p>The rank of the bool type is less than the rank of any other type.</p>
</li>
<li>
<p>The rank of an enumerated type shall equal the rank of the compatible
integer type.</p>
</li>
<li>
<p>For all types, <code>T1</code>, <code>T2</code> and <code>T3</code>, if <code>T1</code> has greater rank than <code>T2</code>,
and <code>T2</code> has greater rank than <code>T3</code>, then <code>T1</code> has greater rank than
<code>T3</code>.</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>Otherwise, if all operands are scalar, the usual arithmetic conversions
apply, per <a href="#C99-spec">section 6.3.1.8 of the C99 Specification</a>.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>Both the standard orderings in <a href="#C99-spec">sections 6.3.1.8 and 6.3.1.1 of
the C99 Specification</a> were examined and rejected.
Had we used integer conversion rank here, <code>int4 + 0U</code> would have been legal
and had <code>int4</code> return type.
Had we used standard C99 usual arithmetic conversion rules for scalars, then
the standard integer promotion would have been performed on vector integer
element types and <code>short8 + char</code> would either have return type of <code>int8</code> or
be illegal.</p>
</div>
</td>
</tr>
</table>
</div>
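<div class="paragraph">
<p>The rank rule for mixing one vector operand with a scalar operand can be modeled as a simple ordering check.
The sketch below is plain C99 with hypothetical names; the enumerators are listed in increasing rank as implied by rules 1 to 5, with <code>half</code> placed below <code>float</code> as the narrower floating-point format.</p>
</div>

```c
/* Hypothetical type tags, in increasing rank order:
   bool lowest, then integers by precision (unsigned above signed of
   the same width), then floating-point types. */
typedef enum {
    EMU_BOOL, EMU_CHAR, EMU_UCHAR, EMU_SHORT, EMU_USHORT,
    EMU_INT, EMU_UINT, EMU_LONG, EMU_ULONG,
    EMU_HALF, EMU_FLOAT, EMU_DOUBLE
} emu_type;

/* Returns 1 if "vector op scalar" is accepted, 0 if it is an error:
   the scalar's rank must not be greater than the rank of the vector's
   element type. */
static int emu_scalar_can_widen(emu_type scalar, emu_type vector_elem)
{
    return scalar <= vector_elem;
}
```

<div class="paragraph">
<p>With this model, <code>int4 + 0U</code> is rejected because <code>uint</code> ranks above <code>int</code>, while <code>short8 + char</code> is accepted and keeps the <code>short8</code> type.</p>
</div>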
</div>
</div>
<div class="sect2">
<h3 id="operators"><a class="anchor" href="#operators"></a>6.5. Operators</h3>
<div class="sect3">
<h4 id="operators-arithmetic"><a class="anchor" href="#operators-arithmetic"></a>6.5.1. Arithmetic Operators</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The arithmetic operators add (<strong>+</strong>), subtract (<strong>-</strong>), multiply (<strong>*</strong>) and
divide (<strong>/</strong>) operate on built-in integer and floating-point scalar, and
vector data types.
The arithmetic operator remainder (<strong>%</strong>) operates on built-in integer scalar
and integer vector data types.
All arithmetic operators return a result of the same built-in type (integer or
floating-point) as the type of the operands, after operand type conversion.
After conversion, the following cases are valid:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>The two operands are scalars.
In this case, the operation is applied, resulting in a scalar.</p>
</li>
<li>
<p>One operand is a scalar, and the other is a vector.
In this case, the scalar may be subject to the usual arithmetic
conversion to the element type used by the vector operand.
The scalar type is then widened to a vector that has the same number of
components as the vector operand.
The operation is done component-wise resulting in the same size vector.</p>
</li>
<li>
<p>The two operands are vectors of the same type.
In this case, the operation is done component-wise resulting in the same
size vector.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>All other cases of implicit conversions are illegal.
Division on integer types which results in a value that lies outside of the
range bounded by the maximum and minimum representable values of the integer
type will not cause an exception but will result in an unspecified value.
A divide by zero with integer types does not cause an exception but will
result in an unspecified value.
Division by zero for floating-point types will result in ±∞ or
NaN as prescribed by the IEEE-754 standard.
Use the built-in functions <strong>dot</strong> and <strong>cross</strong> to get, respectively, the
vector dot product and the vector cross product.</p>
</div>
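<div class="paragraph">
<p>Since integer division by zero and out-of-range quotients produce unspecified values, code that needs a predictable result may guard the division itself.
The helper below is a hypothetical sketch in plain C99, where the same two cases are undefined; the <code>fallback</code> parameter is an assumption of this example, not part of any OpenCL C built-in.</p>
</div>

```c
#include <limits.h>

/* Hypothetical guard around integer division. Returns `fallback`
   in the two cases where the quotient has no specified value. */
static int emu_safe_div(int a, int b, int fallback)
{
    if (b == 0) return fallback;                   /* divide by zero */
    if (a == INT_MIN && b == -1) return fallback;  /* quotient exceeds INT_MAX */
    return a / b;
}
```

<div class="paragraph">
<p>A component-wise version for vector operands would apply the same two checks to each element.</p>
</div>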
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-unary"><a class="anchor" href="#operators-unary"></a>6.5.2. Unary Operators</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The arithmetic unary operators (<strong>+</strong> and <strong>-</strong>) operate on built-in scalar and
vector types.</p>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-prepost"><a class="anchor" href="#operators-prepost"></a>6.5.3. Pre- and Post-Operators</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The arithmetic post- and pre-increment and decrement operators (<strong>--</strong> and
<strong>++</strong>) operate on built-in scalar and vector types except the built-in scalar
and vector <code>float</code> types <sup class="footnote">[<a id="_footnoteref_18" class="footnote" href="#_footnotedef_18" title="View footnote.">18</a>]</sup>.
All unary operators work component-wise on their operands.
The result has the same type as the operand.
For post- and pre-increment and decrement, the expression must be one that
could be assigned to (an l-value).
Pre-increment and pre-decrement add or subtract 1 to the contents of the
expression they operate on, and the value of the pre-increment or
pre-decrement expression is the resulting value of that modification.
Post-increment and post-decrement expressions add or subtract 1 to the
contents of the expression they operate on, but the resulting expression has
the expression&#8217;s value before the post-increment or post-decrement was
executed.</p>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-relational"><a class="anchor" href="#operators-relational"></a>6.5.4. Relational Operators</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The relational operators greater than (<strong>&gt;</strong>), less than (<strong>&lt;</strong>), greater than or
equal (<strong>&gt;=</strong>), and less than or equal (<strong>&lt;=</strong>) operate on scalar and vector types
<sup class="footnote">[<a id="_footnoteref_19" class="footnote" href="#_footnotedef_19" title="View footnote.">19</a>]</sup>.
All relational operators result in an integer type.
After operand type conversion, the following cases are valid:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>The two operands are scalars.
In this case, the operation is applied, resulting in an <code>int</code> scalar.</p>
</li>
<li>
<p>One operand is a scalar, and the other is a vector.
In this case, the scalar may be subject to the usual arithmetic
conversion to the element type used by the vector operand.
The scalar type is then widened to a vector that has the same number of
components as the vector operand.
The operation is done component-wise resulting in the same size vector.</p>
</li>
<li>
<p>The two operands are vectors of the same type.
In this case, the operation is done component-wise resulting in the same
size vector.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>All other cases of implicit conversions are illegal.</p>
</div>
<div class="paragraph">
<p>The result is a scalar signed integer of type <code>int</code> if the source operands
are scalar and a vector signed integer type of the same size as the source
operands if the source operands are vector types.
Vector source operands of type <code>char<em>n</em></code> and <code>uchar<em>n</em></code> return a
<code>char<em>n</em></code> result; vector source operands of type <code>short<em>n</em></code> and
<code>ushort<em>n</em></code> return a <code>short<em>n</em></code> result; vector source operands of type
<code>int<em>n</em></code>, <code>uint<em>n</em></code> and <code>float<em>n</em></code> return an <code>int<em>n</em></code> result; vector
source operands of type <code>long<em>n</em></code>, <code>ulong<em>n</em></code> and <code>double<em>n</em></code> return a
<code>long<em>n</em></code> result.
For scalar types, the relational operators shall return 0 if the specified
relation is <em>false</em> and 1 if the specified relation is <em>true</em>.
For vector types, the relational operators shall return 0 if the specified
relation is <em>false</em> and -1 (i.e. all bits set) if the specified relation is
<em>true</em>.
The relational operators always return 0 if either argument is not a number
(NaN).</p>
</div>
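<div class="paragraph">
<p>The difference between the scalar and vector result conventions can be sketched in plain C99 on the host.
The struct types and function names below are hypothetical stand-ins for <code>float4</code>, <code>int4</code> and the <strong>&lt;</strong> operator.</p>
</div>

```c
/* Hypothetical stand-ins for the OpenCL C float4 and int4 types. */
typedef struct { float s[4]; } emu_float4;
typedef struct { int s[4]; } emu_int4;

/* Imitates the vector form of a < b: -1 (all bits set) where the
   relation holds, 0 where it does not. */
static emu_int4 emu_less_float4(emu_float4 a, emu_float4 b)
{
    emu_int4 r;
    for (int i = 0; i < 4; ++i) {
        /* A comparison involving NaN is false in C, so that lane is 0. */
        r.s[i] = (a.s[i] < b.s[i]) ? -1 : 0;
    }
    return r;
}

/* Scalar form: plain C already yields 1 for true and 0 for false. */
static int emu_less_float(float a, float b)
{
    return a < b;
}
```

<div class="paragraph">
<p>The all-bits-set convention for vectors is what allows the result to be used directly as a bit mask on the reinterpreted operand.</p>
</div>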
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-equality"><a class="anchor" href="#operators-equality"></a>6.5.5. Equality Operators</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The equality operators equal (<strong>==</strong>) and not equal (<strong>!=</strong>) operate on
built-in scalar and vector types <sup class="footnote">[<a id="_footnoteref_20" class="footnote" href="#_footnotedef_20" title="View footnote.">20</a>]</sup>.
All equality operators result in an integer type.
After operand type conversion, the following cases are valid:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>The two operands are scalars.
In this case, the operation is applied, resulting in a scalar.</p>
</li>
<li>
<p>One operand is a scalar, and the other is a vector.
In this case, the scalar may be subject to the usual arithmetic
conversion to the element type used by the vector operand.
The scalar type is then widened to a vector that has the same number of
components as the vector operand.
The operation is done component-wise resulting in the same size vector.</p>
</li>
<li>
<p>The two operands are vectors of the same type.
In this case, the operation is done component-wise resulting in the same
size vector.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>All other cases of implicit conversions are illegal.</p>
</div>
<div class="paragraph">
<p>The result is a scalar signed integer of type <code>int</code> if the source operands
are scalar and a vector signed integer type of the same size as the source
operands if the source operands are vector types.
Vector source operands of type <code>char<em>n</em></code> and <code>uchar<em>n</em></code> return a
<code>char<em>n</em></code> result; vector source operands of type <code>short<em>n</em></code> and
<code>ushort<em>n</em></code> return a <code>short<em>n</em></code> result; vector source operands of type
<code>int<em>n</em></code>, <code>uint<em>n</em></code> and <code>float<em>n</em></code> return an <code>int<em>n</em></code> result; vector
source operands of type <code>long<em>n</em></code>, <code>ulong<em>n</em></code> and <code>double<em>n</em></code> return a
<code>long<em>n</em></code> result.</p>
</div>
<div class="paragraph">
<p>For scalar types, the equality operators return 0 if the specified relation
is <em>false</em> and return 1 if the specified relation is <em>true</em>.
For vector types, the equality operators shall return 0 if the specified
relation is <em>false</em> and -1 (i.e. all bits set) if the specified relation is
<em>true</em>.
The equality operator equal (<strong>==</strong>) returns 0 if one or both arguments are
not a number (NaN).
The equality operator not equal (<strong>!=</strong>) returns 1 (for scalar source
operands) or -1 (for vector source operands) if one or both arguments are
not a number (NaN).</p>
</div>
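<div class="paragraph">
<p>The result types and NaN behavior described above can be illustrated with a
minimal sketch.
The kernel name <code>equality_sketch</code> and its arguments are hypothetical and
chosen only for illustration.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel; names and buffer layout are assumptions.
kernel void equality_sketch(global const float4 *a,
                            global const float4 *b,
                            global int4 *mask,
                            global int *flags)
{
    size_t i = get_global_id(0);

    // Vector comparison: float4 operands yield an int4 result whose
    // components are 0 (false) or -1 (true, all bits set).
    mask[i] = (a[i] == b[i]);

    // Scalar comparison: the result is an int that is 0 or 1.
    int same_x = (a[i].x == b[i].x);

    // x != x is 1 only when x is NaN, because != is true if either
    // operand is NaN.
    int x_is_nan = (a[i].x != a[i].x);

    flags[i] = same_x | (x_is_nan &lt;&lt; 1);
}</code></pre>
</div>
</div>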
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-bitwise"><a class="anchor" href="#operators-bitwise"></a>6.5.6. Bitwise Operators</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The bitwise operators and (<strong>&amp;</strong>), or (<strong>|</strong>), exclusive or (<strong>^</strong>), and not
(<strong>~</strong>) operate on all scalar and vector built-in types except the built-in
scalar and vector <code>float</code> types.
For vector built-in types, the operators are applied component-wise.
If one operand is a scalar and the other is a vector, the scalar may be
subject to the usual arithmetic conversion to the element type used by the
vector operand.
The scalar type is then widened to a vector that has the same number of
components as the vector operand.
The operation is done component-wise resulting in the same size vector.</p>
</div>
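<div class="paragraph">
<p>As a small sketch of the scalar widening rule, the hypothetical helper
functions below combine a vector with a scalar mask and apply the unary
complement to a vector.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical helpers; the names are illustrative only.
uint4 keep_low_byte(uint4 v)
{
    // The scalar 0xFFu is widened to (uint4)(0xFFu); the AND is then
    // applied to each component independently.
    return v &amp; 0xFFu;
}

int4 flip_bits(int4 v)
{
    // Component-wise bitwise complement.
    return ~v;
}</code></pre>
</div>
</div>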
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-logical"><a class="anchor" href="#operators-logical"></a>6.5.7. Logical Operators</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The logical operators and (<strong>&amp;&amp;</strong>) and or (<strong>||</strong>) operate on all scalar and
vector built-in types.
For scalar built-in types only, and (<strong>&amp;&amp;</strong>) will only evaluate the right hand
operand if the left hand operand compares unequal to 0.
For scalar built-in types only, or (<strong>||</strong>) will only evaluate the right hand
operand if the left hand operand compares equal to 0.
For built-in vector types, both operands are evaluated and the operators are
applied component-wise.
If one operand is a scalar and the other is a vector, the scalar may be
subject to the usual arithmetic conversion to the element type used by the
vector operand.
The scalar type is then widened to a vector that has the same number of
components as the vector operand.
The operation is done component-wise resulting in the same size vector.</p>
</div>
<div class="paragraph">
<p>The logical operator exclusive or (<strong>^^</strong>) is reserved.</p>
</div>
<div class="paragraph">
<p>The result is a scalar signed integer of type <code>int</code> if the source operands
are scalar and a vector signed integer type of the same size as the source
operands if the source operands are vector types.
Vector source operands of type <code>char<em>n</em></code> and <code>uchar<em>n</em></code> return a
<code>char<em>n</em></code> result; vector source operands of type <code>short<em>n</em></code> and
<code>ushort<em>n</em></code> return a <code>short<em>n</em></code> result; vector source operands of type
<code>int<em>n</em></code>, <code>uint<em>n</em></code> and <code>float<em>n</em></code> return an <code>int<em>n</em></code> result; vector
source operands of type <code>long<em>n</em></code>, <code>ulong<em>n</em></code> and <code>double<em>n</em></code> return a
<code>long<em>n</em></code> result.</p>
</div>
<div class="paragraph">
<p>For scalar types, the logical operators shall return 0 if the result of the
operation is <em>false</em> and 1 if the result is <em>true</em>.
For vector types, the logical operators shall return 0 if the result of the
operation is <em>false</em> and -1 (i.e. all bits set) if the result is <em>true</em>.</p>
</div>
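<div class="paragraph">
<p>The difference between scalar and vector operands can be sketched as
follows.
Both helper functions are hypothetical and exist only to contrast the two
cases.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical helpers; the names are illustrative only.
int safe_positive(global const int *p)
{
    // Scalar operands: *p is evaluated only if the left-hand operand
    // compares unequal to 0, so a null p is never dereferenced.
    // The result is an int that is 0 or 1.
    return (p != 0) &amp;&amp; (*p &gt; 0);
}

int4 both_nonzero(int4 a, int4 b)
{
    // Vector operands: both sides are always evaluated. Each result
    // component is -1 where both inputs are non-zero and 0 elsewhere.
    return a &amp;&amp; b;
}</code></pre>
</div>
</div>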
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-logical-unary"><a class="anchor" href="#operators-logical-unary"></a>6.5.8. Unary Logical Operator</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The logical unary operator not (<strong>!</strong>) operates on all scalar and vector
built-in types.
For built-in vector types, the operator is applied component-wise.</p>
</div>
<div class="paragraph">
<p>The result is a scalar signed integer of type <code>int</code> if the source operands
are scalar and a vector signed integer type of the same size as the source
operands if the source operands are vector types.
Vector source operands of type <code>char<em>n</em></code> and <code>uchar<em>n</em></code> return a
<code>char<em>n</em></code> result; vector source operands of type <code>short<em>n</em></code> and
<code>ushort<em>n</em></code> return a <code>short<em>n</em></code> result; vector source operands of type
<code>int<em>n</em></code>, <code>uint<em>n</em></code> and <code>float<em>n</em></code> return an <code>int<em>n</em></code> result; vector
source operands of type <code>long<em>n</em></code>, <code>ulong<em>n</em></code> and <code>double<em>n</em></code> return a
<code>long<em>n</em></code> result.</p>
</div>
<div class="paragraph">
<p>For scalar types, the result of the logical unary operator is 0 if the value
of its operand compares unequal to 0, and 1 if the value of its operand
compares equal to 0.
For vector types, the unary operator shall return a 0 if the value of its
operand compares unequal to 0, and -1 (i.e. all bits set) if the value of
its operand compares equal to 0.</p>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-ternary-selection"><a class="anchor" href="#operators-ternary-selection"></a>6.5.9. Ternary Selection Operator</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The ternary selection operator (<strong>?:</strong>) operates on three expressions (<em>exp1</em>
<strong>?</strong> <em>exp2</em> <strong>:</strong> <em>exp3</em>).
This operator evaluates the first expression <em>exp1</em>, which can be a scalar
or vector result except <code>float</code>.
If all three expressions are scalar values, the C99 rules for ternary
operator are followed.
If the result is a vector value, then this is equivalent to calling
<strong>select</strong>(<em>exp3</em>, <em>exp2</em>, <em>exp1</em>).
The <strong>select</strong> function is described in <a href="#table-builtin-relational">Built-in Scalar and Vector Relational Functions</a>.
The second and third expressions can be any type, as long as their types match,
or there is an <a href="#implicit-conversions">implicit conversion</a> that can be
applied to one of the expressions to make their types match, or one is a
vector and the other is a scalar and the scalar may be subject to the usual
arithmetic conversion to the element type used by the vector operand and
widened to the same type as the vector type.
This resulting matching type is the type of the entire expression.</p>
</div>
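<div class="paragraph">
<p>A minimal sketch of the vector case is shown below.
The two hypothetical helper functions are intended to produce the same
result, one written with the ternary operator and one with <strong>select</strong>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical helpers; the names are illustrative only.
float4 smaller_ternary(float4 a, float4 b)
{
    // (a &lt; b) is an int4 holding -1 where the relation is true and 0
    // elsewhere; each result component is taken from a or b accordingly.
    return (a &lt; b) ? a : b;
}

float4 smaller_select(float4 a, float4 b)
{
    // Same selection written as select(exp3, exp2, exp1).
    return select(b, a, a &lt; b);
}</code></pre>
</div>
</div>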
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-shift"><a class="anchor" href="#operators-shift"></a>6.5.10. Shift Operators</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The operators right-shift (<strong>&gt;&gt;</strong>) and left-shift (<strong>&lt;&lt;</strong>) operate on all scalar
and vector built-in types except the built-in scalar and vector <code>float</code>
types.
For built-in vector types, the operators are applied component-wise.
For the right-shift (<strong>&gt;&gt;</strong>) and left-shift (<strong>&lt;&lt;</strong>) operators, the rightmost
operand must be a scalar if the first operand is a scalar, and the rightmost
operand can be a vector or scalar if the first operand is a vector.</p>
</div>
<div class="paragraph">
<p>The result of <code>E1</code> <strong>&lt;&lt;</strong> <code>E2</code> is <code>E1</code> left-shifted by the value of the log<sub>2</sub>(N) least
significant bits of <code>E2</code>, viewed as an unsigned integer value, where N is the number of bits
used to represent the data type of <code>E1</code> after integer promotion
<sup class="footnote">[<a id="_footnoteref_21" class="footnote" href="#_footnotedef_21" title="View footnote.">21</a>]</sup>, if <code>E1</code> is a scalar, or the number of bits
used to represent the type of <code>E1</code> elements, if <code>E1</code> is a vector.
The vacated bits are filled with zeros.</p>
</div>
<div class="paragraph">
<p>The result of <code>E1</code> <strong>&gt;&gt;</strong> <code>E2</code> is <code>E1</code> right-shifted by the value of the log<sub>2</sub>(N) least
significant bits of <code>E2</code>, viewed as an unsigned integer value, where N is the
number of bits used to represent the data type of <code>E1</code> after integer
promotion, if <code>E1</code> is a scalar, or the number of bits used to represent the
type of <code>E1</code> elements, if <code>E1</code> is a vector.
If <code>E1</code> has an unsigned type or if <code>E1</code> has a signed type and a nonnegative
value, the vacated bits are filled with zeros.
If <code>E1</code> has a signed type and a negative value, the vacated bits are filled
with ones.</p>
</div>
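<div class="paragraph">
<p>The effect of using only the log<sub>2</sub>(N) least significant bits of the shift
count can be sketched as follows.
The helper function names are hypothetical, and the values in the comments
follow from the rules above.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical helpers; the names are illustrative only.
uint shift_scalar(uint x, uint n)
{
    // N = 32 for uint, so only the low 5 bits of n are used.
    // With n == 33 the result is the same as x &lt;&lt; 1.
    return x &lt;&lt; n;
}

uchar4 shift_vector(uchar4 v)
{
    // For a vector, N is the element width (8 for uchar), so only the
    // low 3 bits of each count are used: a count of 9 shifts by 1.
    return v &lt;&lt; (uchar4)(9);
}

int shift_signed(int x)
{
    // For a negative signed value the vacated bits are filled with
    // ones, so x == -16 gives -1.
    return x &gt;&gt; 4;
}</code></pre>
</div>
</div>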
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-sizeof"><a class="anchor" href="#operators-sizeof"></a>6.5.11. Sizeof Operator</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>sizeof</code> operator yields the size (in bytes) of its operand, which may be
an expression or the parenthesized name of a type.
The size includes any <a href="#alignment-of-types">padding bytes needed for alignment</a>.
The size is determined from the type of the operand.
The result is of type <code>size_t</code>.
If the type of the operand is a variable length array
<sup class="footnote">[<a id="_footnoteref_22" class="footnote" href="#_footnotedef_22" title="View footnote.">22</a>]</sup> type, the operand is
evaluated; otherwise, the operand is not evaluated and the result is an integer
constant.</p>
</div>
<div class="paragraph">
<p>When applied to an operand that has type <code>char</code> or <code>uchar</code>, the result is 1.
When applied to an operand that has type <code>short</code>, <code>ushort</code>, or <code>half</code> the
result is 2.
When applied to an operand that has type <code>int</code>, <code>uint</code> or <code>float</code>, the
result is 4.
When applied to an operand that has type <code>long</code>, <code>ulong</code> or <code>double</code>, the
result is 8.
When applied to an operand that is a vector type, the result is the number of
components times the size of each scalar component <sup class="footnote">[<a id="_footnoteref_23" class="footnote" href="#_footnotedef_23" title="View footnote.">23</a>]</sup>.
When applied to an operand that has array type, the result is the total
number of bytes in the array.
When applied to an operand that has structure or union type, the result is
the total number of bytes in such an object, including internal and trailing
padding.
The <code>sizeof</code> operator shall not be applied to an expression that has
function type or an incomplete type, to the parenthesized name of such a
type, or to an expression that designates a bit-field struct member
<sup class="footnote">[<a id="_footnoteref_24" class="footnote" href="#_footnotedef_24" title="View footnote.">24</a>]</sup>.</p>
</div>
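<div class="paragraph">
<p>A few of these sizes can be sketched in a hypothetical kernel that writes
<code>sizeof</code> results to a buffer.
The kernel name and output layout are assumptions made for illustration.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel; the name and output layout are assumptions.
typedef struct {
    float a[3];
    int   b[2];
} foo_t;

kernel void sizeof_sketch(global uint *out)
{
    float4 v[8];

    out[0] = sizeof(int);     // 4
    out[1] = sizeof(float4);  // 4 components * 4 bytes = 16
    out[2] = sizeof(v);       // whole array: 8 * 16 = 128; v is not evaluated
    out[3] = sizeof(foo_t);   // includes any internal and trailing padding
}</code></pre>
</div>
</div>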
<div class="paragraph">
<p>The behavior of applying the <code>sizeof</code> operator to the <code>bool</code>, <code>image2d_t</code>,
<code>image3d_t</code>, <code>image2d_array_t</code>, <code>image1d_t</code>, <code>image1d_buffer_t</code>,
<code>image1d_array_t</code>, <code>image2d_depth_t</code>, <code>image2d_array_depth_t</code>,
<code>sampler_t</code>, <code>queue_t</code>, <code>ndrange_t</code>, <code>clk_event_t</code>, <code>reserve_id_t</code>, and
<code>event_t</code> types is implementation-defined.</p>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-comma"><a class="anchor" href="#operators-comma"></a>6.5.12. Comma Operator</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The comma (<strong>,</strong>) operator operates on expressions by returning the type and
value of the right-most expression in a comma separated list of expressions.
All expressions are evaluated, in order, from left to right.</p>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-indirection"><a class="anchor" href="#operators-indirection"></a>6.5.13. Indirection Operator</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The unary (<strong>*</strong>) operator denotes indirection.
If the operand points to an object, the result is an l-value designating the
object.
If the operand has type "pointer to <em>type</em>", the result has type
"<em>type</em>".
If an invalid value has been assigned to the pointer, the behavior of the unary
<strong>*</strong> operator is undefined <sup class="footnote">[<a id="_footnoteref_25" class="footnote" href="#_footnotedef_25" title="View footnote.">25</a>]</sup>.</p>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-address"><a class="anchor" href="#operators-address"></a>6.5.14. Address Operator</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The unary (<strong>&amp;</strong>) operator returns the address of its operand.
If the operand has type "<em>type</em>", the result has type "pointer to
<em>type</em>".
If the operand is the result of a unary <strong>*</strong> operator, neither that operator
nor the <strong>&amp;</strong> operator is evaluated and the result is as if both were omitted,
except that the constraints on the operators still apply and the result is
not an l-value.
Similarly, if the operand is the result of a <strong>[]</strong> operator, neither the <strong>&amp;</strong>
operator nor the unary <strong>*</strong> that is implied by the <strong>[]</strong> is evaluated and the
result is as if the <strong>&amp;</strong> operator were removed and the <strong>[]</strong> operator were
changed to a <strong>+</strong> operator.
Otherwise, the result is a pointer to the object designated by its
operand <sup class="footnote">[<a id="_footnoteref_26" class="footnote" href="#_footnotedef_26" title="View footnote.">26</a>]</sup>.</p>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="operators-assignment"><a class="anchor" href="#operators-assignment"></a>6.5.15. Assignment Operator</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>Assignments of values to variable names are done with the assignment
operator (<strong>=</strong>), like</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><em>lvalue</em> = <em>expression</em></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The assignment operator stores the value of <em>expression</em> into <em>lvalue</em>.
The <em>expression</em> and <em>lvalue</em> must have the same type, or the expression
must have a type in <a href="#table-builtin-scalar-types">Built-in Scalar Data Types</a>, in which case an
implicit conversion will be done on the expression before the assignment is
done.</p>
</div>
<div class="paragraph">
<p>If <em>expression</em> is a scalar type and <em>lvalue</em> is a vector type, the scalar
is converted to the element type used by the vector operand.
The scalar type is then widened to a vector that has the same number of
components as the vector operand.
The operation is done component-wise resulting in the same size vector.</p>
</div>
<div class="paragraph">
<p>Any other desired type-conversions must be specified explicitly.
L-values must be writable.
Variables that are built-in types, entire structures or arrays, structure
fields, l-values with the field selector (<strong>.</strong>) applied to select components
or swizzles without repeated fields, l-values within parentheses, and
l-values dereferenced with the array subscript operator (<strong>[]</strong>) are all
l-values.
Other binary or unary expressions, function names, swizzles with repeated
fields, and constants cannot be l-values.
The ternary operator (<strong>?:</strong>) is also not allowed as an l-value.</p>
</div>
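<div class="paragraph">
<p>The l-value rules above can be sketched as follows.
The function name is hypothetical, and the lines marked as errors are left
in comments so that the sketch remains valid.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical helper; the name is illustrative only.
float4 lvalue_sketch(float2 p)
{
    float4 v;

    v = 3.0f;        // OK: the scalar is widened to (float4)(3.0f).
    v.xy = p;        // OK: swizzle without repeated fields.
    (v).z = 5.0f;    // OK: l-value within parentheses.

    // v.xx = p;     // Error: swizzle with repeated fields.
    // (v + v) = v;  // Error: a binary expression is not an l-value.

    return v;
}</code></pre>
</div>
</div>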
<div class="paragraph">
<p>The order of evaluation of the operands is unspecified.
If an attempt is made to modify the result of an assignment operator or to
access it after the next sequence point, the behavior is undefined.
Other assignment operators are the assignments add into (<strong>+=</strong>), subtract
from (<strong>-=</strong>), multiply into (<strong>*=</strong>), divide into (<strong>/=</strong>), modulus into (<strong>%=</strong>),
left shift by (<strong>&lt;&lt;=</strong>), right shift by (<strong>&gt;&gt;=</strong>), and into (<strong>&amp;=</strong>), inclusive
or into (<strong>|=</strong>), and exclusive or into (<strong>^=</strong>).</p>
</div>
<div class="paragraph">
<p>The expression</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><em>lvalue</em> <em>op</em> <strong>=</strong> <em>expression</em></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>is equivalent to</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><em>lvalue</em> = <em>lvalue</em> <em>op</em> <em>expression</em></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>and the <em>lvalue</em> and <em>expression</em> must satisfy the requirements for both
operator <em>op</em> and assignment (<strong>=</strong>).</p>
</div>
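<div class="paragraph">
<p>As a brief sketch of this equivalence, the hypothetical helper function
below applies several compound assignments to a vector.
Each one is annotated with the plain assignment it corresponds to.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical helper; the name is illustrative only.
int4 scale_and_mask(int4 v, int s)
{
    v *= s;          // v = v * s;  s is widened to an int4.
    v += (int4)(1);  // v = v + (int4)(1);
    v &gt;&gt;= 1;         // v = v &gt;&gt; 1;  vector shifted by a scalar count.
    v &amp;= 0xFF;       // v = v &amp; 0xFF;
    return v;
}</code></pre>
</div>
</div>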
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>Except for the <code>sizeof</code> operator, the <code>half</code> data type cannot be used with
any of the operators described in this section.</p>
</div>
</td>
</tr>
</table>
</div>
</div>
</div>
</div>
</div>
<div class="sect2">
<h3 id="vector-operations"><a class="anchor" href="#vector-operations"></a>6.6. Vector Operations</h3>
<div class="paragraph">
<p>Vector operations are component-wise.
Usually, when an operator is applied to a vector, it operates independently
on each component of the vector.</p>
</div>
<div class="paragraph">
<p>For example,</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float4 v, u;
<span class="predefined-type">float</span> f;
v = u + f;</code></pre>
</div>
</div>
<div class="paragraph">
<p>will be equivalent to</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">v.x = u.x + f;
v.y = u.y + f;
v.z = u.z + f;
v.w = u.w + f;</code></pre>
</div>
</div>
<div class="paragraph">
<p>And</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float4 v, u, w;
w = v + u;</code></pre>
</div>
</div>
<div class="paragraph">
<p>will be equivalent to</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">w.x = v.x + u.x;
w.y = v.y + u.y;
w.z = v.z + u.z;
w.w = v.w + u.w;</code></pre>
</div>
</div>
<div class="paragraph">
<p>and likewise for most operators and all integer and floating-point vector
types.</p>
</div>
</div>
<div class="sect2">
<h3 id="address-space-qualifiers"><a class="anchor" href="#address-space-qualifiers"></a>6.7. Address Space Qualifiers</h3>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>OpenCL implements the following disjoint named address spaces: <code>__global</code>,
<code>__local</code>, <code>__constant</code> and <code>__private</code>.
The address space qualifier may be used in variable declarations to specify
the region of memory that is used to allocate the object.
The C syntax for type qualifiers is extended in OpenCL to include an address
space name as a valid type qualifier.
If the type of an object is qualified by an address space name, the object
is allocated in the specified address space name.</p>
</div>
<div class="paragraph">
<p>The address space names without the <code>__</code> prefix, i.e. <code>global</code>, <code>local</code>,
<code>constant</code> and <code>private</code>, may be substituted for the corresponding address
space names with the <code>__</code> prefix.</p>
</div>
<div class="paragraph">
<p>The address space name for arguments to a function in a program, or local
variables of a function is <code>__private</code>.
All function arguments shall be in the <code>__private</code> address space.</p>
</div>
<div class="paragraph">
<p>Additionally, all function return values shall be in the <code>__private</code> address space.</p>
</div>
<div class="paragraph">
<p>For OpenCL C 2.0, or OpenCL 3.0 or newer with the
<code>__opencl_c_program_scope_global_variables</code> feature, the address space for a
variable at program scope or a <code>static</code> or <code>extern</code> variable inside a function
may be either <code>__constant</code> or <code>__global</code>,
and the address space defaults to <code>__global</code> if not specified.
Otherwise, the address space for a variable at program scope or a <code>static</code> or <code>extern</code>
variable inside a function must explicitly be <code>__constant</code>.</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// declares a pointer p in the global address space that</span>
<span class="comment">// points to an object in the global address space</span>
global <span class="predefined-type">int</span> *p;
<span class="directive">void</span> foo (...)
{
<span class="comment">// declares an array of 4 floats in the private address space</span>
<span class="predefined-type">float</span> x[<span class="integer">4</span>];
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>For OpenCL C 2.0, or with the <code>__opencl_c_generic_address_space</code> feature,
there is an additional unnamed generic address space. The unnamed generic
address space overlaps the named <code>__global</code>, <code>__local</code>, and <code>__private</code> address spaces; the named <code>__constant</code> address space is not in
the generic address space.</p>
</div>
<div class="paragraph">
<p>If the generic address space is supported,
pointers that are declared without pointing to a named address space point
to the generic address space.
Otherwise, when the generic address space is not supported, pointers that
are declared without pointing to a named address space point to the
<code>__private</code> address space.</p>
</div>
<div class="paragraph">
<p>Kernel function arguments declared to be a pointer or an array of a type
must point to one of the named address spaces <code>__global</code>, <code>__local</code> or
<code>__constant</code>.</p>
</div>
<div class="paragraph">
<p>A pointer to address space A can be assigned to a pointer to the same
address space A or be implicitly converted and assigned to a pointer
to the generic address space.
Casting a pointer to address space A to a pointer to address space B is
illegal if A and B are named address spaces and A is not the same as B.</p>
</div>
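<div class="paragraph">
<p>These pointer conversion rules can be sketched as follows.
The function and variable names are hypothetical, and the sketch assumes
that the generic address space is supported.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Assumes OpenCL C 2.0, or the __opencl_c_generic_address_space feature.
// Hypothetical helper; the names are illustrative only.
void pointer_sketch(global int *gp, local int *lp)
{
    global int *g2 = gp;  // OK: same named address space.
    int *generic_p = gp;  // OK: implicit conversion to the generic address space.
    generic_p = lp;       // OK: a local pointer also converts to generic.

    *generic_p = *g2;     // Copies one int from global to local memory.

    // local int *bad = (local int *)gp;  // Error: global and local are
    //                                    // different named address spaces.
}</code></pre>
</div>
</div>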
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// OK.</span>
<span class="predefined-type">int</span> f() { ... }
<span class="comment">// Error. Address space qualifier cannot be used with non-pointer return type.</span>
private <span class="predefined-type">int</span> f() { ... }
<span class="comment">// OK. Address space qualifier can be used with pointer return type.</span>
local <span class="predefined-type">int</span> *f() { ... }</code></pre>
</div>
</div>
<div class="paragraph">
<p>The <code>__global</code>, <code>__constant</code>, <code>__local</code>, <code>__private</code>, <code>global</code>,
<code>constant</code>, <code>local</code>, and <code>private</code> names are reserved for use as address
space qualifiers and shall not be used otherwise.
The <code>__generic</code> and <code>generic</code> names are reserved for future use.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>The size of pointers to different address spaces may differ.
It is not correct to assume that, for example, <code>sizeof(__global int *)</code>
always equals <code>sizeof(__local int *)</code>.</p>
</div>
</td>
</tr>
</table>
</div>
</div>
</div>
<div class="sect3">
<h4 id="global-or-global"><a class="anchor" href="#global-or-global"></a>6.7.1. <code>__global</code> (or <code>global</code>)</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>__global</code> or <code>global</code> address space name is used to refer to memory
objects (buffer or image objects) allocated from the <code>global</code> memory pool.</p>
</div>
<div class="paragraph">
<p>A buffer memory object can be declared as a pointer to a scalar, vector or
user-defined struct.
This allows the kernel to read and/or write any location in the buffer.</p>
</div>
<div class="paragraph">
<p>The actual size of the array memory object is determined when the memory
object is allocated via appropriate API calls in the host code.</p>
</div>
<div class="paragraph">
<p>Some examples are:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">global float4 *color; <span class="comment">// An array of float4 elements</span>
<span class="keyword">typedef</span> <span class="keyword">struct</span> {
<span class="predefined-type">float</span> a[<span class="integer">3</span>];
<span class="predefined-type">int</span> b[<span class="integer">2</span>];
} foo_t;
global foo_t *my_info; <span class="comment">// An array of foo_t elements.</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>As image objects are always allocated from the <code>global</code> address space, the
<code>__global</code> or <code>global</code> qualifier should not be specified for image types.
The elements of an image object cannot be directly accessed.
Built-in functions to read from and write to an image object are provided.</p>
</div>
<div class="paragraph">
<p>For OpenCL C 2.0, or with the <code>__opencl_c_program_scope_global_variables</code>
feature,
variables defined at program scope and <code>static</code> variables inside a function
can also be declared in the <code>global</code> address space.
They can be defined with any valid OpenCL C data type except for those in
<a href="#table-other-builtin-types">Other Built-in Data Types</a>.
Such program scope variables may be of any user-defined type,
or a pointer to a user-defined type.
In the presence of shared virtual memory, these pointers or pointer members
should work as expected as long as they are shared virtual memory pointers
and the referenced storage has been mapped appropriately.
These variables in the <code>global</code> address space have the same lifetime as the
program, and their values persist between calls to any of the kernels in the
program.
These variables are not shared across devices.
They have distinct storage.</p>
</div>
<div class="paragraph">
<p>Program scope and <code>static</code> variables in the <code>global</code> address space are zero
initialized by default. A constant expression may be given as an initializer.</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Note: these examples assume OpenCL C 2.0, or OpenCL C 3.0 or newer and the</span>
<span class="comment">// __opencl_c_program_scope_global_variables feature.</span>
global <span class="predefined-type">int</span> foo; <span class="comment">// OK.</span>
<span class="predefined-type">int</span> foo; <span class="comment">// OK. Declared in the global address space.</span>
global uchar buf[<span class="integer">512</span>]; <span class="comment">// OK.</span>
global <span class="predefined-type">int</span> baz = <span class="integer">12</span>; <span class="comment">// OK. Initialization is allowed.</span>
<span class="directive">static</span> global <span class="predefined-type">int</span> bat; <span class="comment">// OK. Internal linkage.</span>
<span class="directive">static</span> <span class="predefined-type">int</span> foo; <span class="comment">// OK. Declared in the global address space.</span>
<span class="directive">static</span> global <span class="predefined-type">int</span> foo; <span class="comment">// OK.</span>
<span class="predefined-type">int</span> *foo; <span class="comment">// OK. foo is allocated in the global address space.</span>
<span class="comment">// foo points to a location in the private or</span>
<span class="comment">// generic address space.</span>
<span class="directive">void</span> func(...)
{
<span class="predefined-type">int</span> *foo; <span class="comment">// OK. foo is allocated in the private address space.</span>
<span class="comment">// foo points to a location in the private or</span>
<span class="comment">// generic address space.</span>
...
}
global <span class="predefined-type">int</span> * global ptr; <span class="comment">// OK.</span>
<span class="predefined-type">int</span> * global ptr; <span class="comment">// OK.</span>
constant <span class="predefined-type">int</span> *global ptr=&amp;baz; <span class="comment">// Error, baz is in the global address</span>
<span class="comment">// space.</span>
global <span class="predefined-type">int</span> * constant ptr = &amp;baz; <span class="comment">// OK</span>
global image2d_t im; <span class="comment">// Error. Invalid type for program scope variables.</span>
global event_t ev; <span class="comment">// Error. Invalid type for program scope variables.</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>The <code>const</code> qualifier can also be used with the <code>__global</code> qualifier to
specify a read-only buffer memory object.</p>
</div>
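<div class="paragraph">
<p>As a minimal sketch of this usage, the kernel below takes one read-only buffer
and one writable buffer.
The kernel name <code>copy_scaled</code> and its parameters are hypothetical and chosen
only for illustration.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel void copy_scaled(global const float *in,  // read-only buffer
                        global float *out,       // writable buffer
                        float factor)
{
    size_t i = get_global_id(0);
    out[i] = in[i] * factor;   // OK: reads from in, writes to out
    // in[i] = 0.0f;           // would be an error: in points to const data
}</code></pre>
</div>
</div>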
</div>
</div>
</div>
<div class="sect3">
<h4 id="local-or-local"><a class="anchor" href="#local-or-local"></a>6.7.2. <code>__local</code> (or <code>local</code>)</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>__local</code> or <code>local</code> address space name is used to describe variables
that need to be allocated in local memory and are shared by all work-items
of a work-group.
Pointers to the <code>__local</code> address space are allowed as arguments to
functions (including kernel functions).
Variables declared in the <code>__local</code> address space inside a kernel function
must occur at kernel function scope.</p>
</div>
<div class="paragraph">
<p>Some examples of variables allocated in the <code>__local</code> address space inside
a kernel function are:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span> my_func(...)
{
local <span class="predefined-type">float</span> a; <span class="comment">// A single float allocated</span>
<span class="comment">// in local address space</span>
local <span class="predefined-type">float</span> b[<span class="integer">10</span>]; <span class="comment">// An array of 10 floats</span>
<span class="comment">// allocated in local address space.</span>
<span class="keyword">if</span> (...)
{
<span class="comment">// example of variable in __local address space but not</span>
<span class="comment">// declared at __kernel function scope.</span>
local <span class="predefined-type">float</span> c; <span class="comment">// not allowed.</span>
}
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Variables allocated in the <code>__local</code> address space inside a kernel
function cannot be initialized.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span> my_func(...)
{
local <span class="predefined-type">float</span> a = <span class="integer">1</span>; <span class="comment">// not allowed</span>
local <span class="predefined-type">float</span> b;
b = <span class="integer">1</span>; <span class="comment">// allowed</span>
}</code></pre>
</div>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>Variables allocated in the <code>__local</code> address space inside a kernel
function are allocated for each work-group executing the kernel and exist
only for the lifetime of the work-group executing the kernel.</p>
</div>
</td>
</tr>
</table>
</div>
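<div class="paragraph">
<p>Local memory can also be passed to a kernel as a pointer argument.
The following is a sketch only: the kernel name <code>reverse_in_group</code> is
hypothetical, and it assumes that the host sizes <code>scratch</code> to hold at least
one <code>int</code> per work-item of the work-group (for example by calling
<strong>clSetKernelArg</strong> with a <code>NULL</code> argument value and the required size) and
that the global size is a multiple of the work-group size.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel void reverse_in_group(global int *data, local int *scratch)
{
    size_t lid = get_local_id(0);
    size_t n   = get_local_size(0);
    size_t gid = get_global_id(0);

    scratch[lid] = data[gid];          // each work-item stores one element
    barrier(CLK_LOCAL_MEM_FENCE);      // make the stores visible to the work-group
    data[gid] = scratch[n - 1 - lid];  // read an element stored by another work-item
}</code></pre>
</div>
</div>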
</div>
</div>
</div>
<div class="sect3">
<h4 id="constant-or-constant"><a class="anchor" href="#constant-or-constant"></a>6.7.3. <code>__constant</code> (or <code>constant</code>)</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>__constant</code> or <code>constant</code> address space name is used to describe
variables that are allocated in <code>global</code> memory and accessed inside
kernels as read-only variables.
These read-only variables can be accessed by all (global) work-items of the
kernel during its execution.
Pointers to the <code>__constant</code> address space are allowed as arguments to
functions (including kernel functions) and for variables declared inside
functions.</p>
</div>
<div class="paragraph">
<p>All string literal storage shall be in the <code>__constant</code> address space.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>Each argument to a kernel that is a pointer to the <code>__constant</code> address
space is counted separately towards the maximum number of such arguments,
defined as the value of the <a href="#opencl-device-queries"><code>CL_DEVICE_MAX_CONSTANT_ARGS</code> device query</a>.</p>
</div>
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>Variables in the program scope can be declared in the <code>__constant</code> address
space.
Variables in the outermost scope of kernel functions can be declared in the
<code>__constant</code> address space.
These variables must be initialized, and each initializer must be a
compile-time constant.
Writing to such a variable results in a compile-time error.</p>
</div>
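<div class="paragraph">
<p>The sketch below illustrates both placements.
The names <code>weights</code>, <code>bias</code> and <code>smooth3</code> are hypothetical, and the kernel
assumes the host launches it so that every index it touches lies inside both
buffers.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Program scope: must be initialized with compile-time constants.
constant float weights[3] = { 0.25f, 0.5f, 0.25f };

kernel void smooth3(global const float *in, global float *out)
{
    constant float bias = 0.0f;       // outermost kernel scope, initialized
    size_t i = get_global_id(0) + 1;  // assumes in[i - 1] and in[i + 1] exist

    out[i] = weights[0] * in[i - 1]
           + weights[1] * in[i]
           + weights[2] * in[i + 1]
           + bias;
    // weights[0] = 1.0f;             // compile-time error: write to constant
}</code></pre>
</div>
</div>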
<div class="paragraph">
<p>Implementations are not required to aggregate these declarations into the
fewest number of constant arguments.
This behavior is implementation defined.</p>
</div>
<div class="paragraph">
<p>Thus portable code must conservatively assume that each variable declared
inside a function or in program scope allocated in the <code>__constant</code>
address space counts as a separate constant argument.</p>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="private-or-private"><a class="anchor" href="#private-or-private"></a>6.7.4. <code>__private</code> (or <code>private</code>)</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>Variables inside a kernel function not declared with an address space
qualifier, all variables inside non-kernel functions, and all function
arguments are in the <code>__private</code> or <code>private</code> address space.</p>
</div>
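<div class="paragraph">
<p>A short illustration of these three cases follows; the function and kernel
names are hypothetical.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float square(float x)   // x is a function argument: private
{
    float y = x * x;    // variable inside a non-kernel function: private
    return y;
}

kernel void squares(global float *data)  // the pointer data itself is private;
{                                         // it points into global memory
    size_t i = get_global_id(0);          // no qualifier inside a kernel: private
    private float v = square(data[i]);    // the qualifier may also be written out
    data[i] = v;
}</code></pre>
</div>
</div>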
</div>
</div>
</div>
<div class="sect3">
<h4 id="the-generic-address-space"><a class="anchor" href="#the-generic-address-space"></a>6.7.5. The Generic Address Space</h4>
<div class="openblock">
<div class="content">
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The functionality described in this section <a href="#unified-spec">requires</a>
support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the
<code>__opencl_c_generic_address_space</code> feature.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The following rules apply when using pointers that point to the generic
address space:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>A pointer that points to the <code>global</code>, <code>local</code> or <code>private</code> address
space can be implicitly converted to a pointer to the unnamed generic
address space but not vice-versa.</p>
</li>
<li>
<p>Pointer casts can be used to cast a pointer that points to the <code>global</code>,
<code>local</code> or <code>private</code> space to the unnamed generic address space and
vice-versa.</p>
</li>
<li>
<p>A pointer that points to the <code>constant</code> address space cannot be cast or
implicitly converted to the generic address space.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>A few examples follow.</p>
</div>
<div class="paragraph">
<p>This is the canonical example.
In this example, function <code>foo</code> is declared with an argument that is a
pointer with no address space qualifier.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="directive">void</span> foo(<span class="predefined-type">int</span> *a)
{
*a = *a + <span class="integer">2</span>;
}
kernel <span class="directive">void</span> k1(local <span class="predefined-type">int</span> *a)
{
...
foo(a);
...
}
kernel <span class="directive">void</span> k2(global <span class="predefined-type">int</span> *a)
{
...
foo(a);
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>In the example below, <code>var</code> is a pointer to the unnamed generic address space.
A pointer to the <code>global</code> or <code>local</code> address space may be assigned to <code>var</code>
depending on the result of a conditional expression.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span> bar(global <span class="predefined-type">int</span> *g, local <span class="predefined-type">int</span> *l)
{
<span class="predefined-type">int</span> *var;
<span class="keyword">if</span> (is_even(get_global_id(<span class="integer">0</span>)))
var = g;
<span class="keyword">else</span>
var = l;
*var = <span class="integer">42</span>;
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>In the example below, the same pointer to the unnamed generic address
space is used to point to objects allocated in different named address spaces.
A pointer to the unnamed generic address space may point to
objects in the <code>global</code>, <code>local</code>, and <code>private</code> address spaces,
but it is not legal for a pointer to the unnamed generic address space to
point to an object in the <code>constant</code> address space.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">int</span> *ptr;
global <span class="predefined-type">int</span> g;
ptr = &amp;g; <span class="comment">// legal</span>
local <span class="predefined-type">int</span> l;
ptr = &amp;l; <span class="comment">// legal</span>
private <span class="predefined-type">int</span> p;
ptr = &amp;p; <span class="comment">// legal</span>
constant <span class="predefined-type">int</span> c = <span class="integer">0</span>;
ptr = &amp;c; <span class="comment">// illegal</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>In the example below, pointers to named address spaces are assigned to
a pointer to the unnamed generic address space.
It is legal to assign a pointer to the <code>global</code>, <code>local</code>, and <code>private</code>
address spaces to a pointer to the unnamed generic address space without
an explicit cast.
It is not legal to assign a pointer to the <code>constant</code> address space to
a pointer to the unnamed generic address space.
It is also not legal to assign a pointer to the unnamed generic address
space to a pointer to a named address space without a cast.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">global <span class="predefined-type">int</span> *gp;
local <span class="predefined-type">int</span> *lp;
private <span class="predefined-type">int</span> *pp;
constant <span class="predefined-type">int</span> *cp;
<span class="predefined-type">int</span> *p;
p = gp; <span class="comment">// legal</span>
p = lp; <span class="comment">// legal</span>
p = pp; <span class="comment">// legal</span>
p = cp; <span class="comment">// illegal</span>
<span class="comment">// it is illegal to convert from a generic pointer</span>
<span class="comment">// to an explicit address space pointer without a cast:</span>
gp = p; <span class="comment">// compile-time error</span>
lp = p; <span class="comment">// compile-time error</span>
pp = p; <span class="comment">// compile-time error</span>
cp = p; <span class="comment">// compile-time error</span></code></pre>
</div>
</div>
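<div class="paragraph">
<p>For completeness, the following sketch shows the explicit cast that makes the
conversion from the generic address space to a named address space compile.
The kernel name <code>cast_back</code> is hypothetical.
The cast is only meaningful when the generic pointer really refers to a
location in the target address space.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel void cast_back(global int *g)
{
    int *p = g;                        // implicit: global to generic
    global int *gp = (global int *)p;  // explicit cast: generic to global
    *gp = 1;                           // OK: p refers to global memory

    // local int *lp = (local int *)p;
    // This would compile, but the behavior is undefined because p does
    // not refer to a location in the local address space.
}</code></pre>
</div>
</div>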
</div>
</div>
</div>
<div class="sect3">
<h4 id="changes-to-C99"><a class="anchor" href="#changes-to-C99"></a>6.7.6. Changes to C99</h4>
<div class="paragraph">
<p>This section details the modifications to the <a href="#C99-spec">C99
Specification</a> needed to incorporate the functionality of named address
space and the generic address space:</p>
</div>
<div class="paragraph">
<p><strong>Clause 6.2.5 - Types, replace paragraph 26 with the following paragraphs</strong>:</p>
</div>
<div class="paragraph">
<p>If type <code>T</code> is qualified by the address space qualifier for address space
<code>A</code>, then " <code>T</code> is in <code>A</code> ".
If type <code>T</code> is in address space <code>A</code>, a pointer to <code>T</code> is also a " pointer
into <code>A</code> " and the referenced address space of the pointer is <code>A</code>.</p>
</div>
<div class="paragraph">
<p>A pointer to <code>void</code> in any address space shall have the same representation
and alignment requirements as a pointer to a character type in the same
address space.
Similarly, pointers to differently access-qualified versions of compatible
types shall have the same representation and alignment requirements.
All pointers to structure types in the same address space shall have the
same representation and alignment requirements as each other.
All pointers to union types in the same address space shall have the same
representation and alignment requirements as each other.</p>
</div>
<div class="paragraph">
<p><strong>Clause 6.3.2.3 - Pointers, replace the first two paragraphs with the
following paragraphs</strong>:</p>
</div>
<div class="paragraph">
<p>If a pointer into one address space is converted to a pointer into another
address space, then unless the original pointer is a null pointer or the
location referred to by the original pointer is within the second address
space, the behavior is undefined.
(For the original pointer to refer to a location within the second address
space, the two address spaces must overlap).</p>
</div>
<div class="paragraph">
<p>A pointer to <code>void</code> in any address space may be converted to or from a
pointer to any incomplete or object type.
A pointer to any incomplete or object type in some address space may be
converted to a pointer to <code>void</code> in an enclosing address space and back
again; the result shall compare equal to the original pointer.</p>
</div>
<div class="paragraph">
<p>For any qualifier <em>q</em>, a pointer to a non-<em>q</em>-qualified type may be
converted to a pointer to the <em>q</em>-qualified version of the type (but with
the same address-space qualifier or the generic address space); the values
stored in the original and converted pointers shall compare equal.</p>
</div>
<div class="paragraph">
<p><strong>Clause 6.3.2.3 - Pointers, replace the last sentence of paragraph 4 with</strong>:</p>
</div>
<div class="paragraph">
<p>Conversion of a null pointer to another pointer type yields a null pointer
of that type.
Any two null pointers whose referenced address spaces overlap shall compare
equal.</p>
</div>
<div class="paragraph">
<p><strong>Clause 6.5.2.2 - Function calls, change the second bullet of paragraph 6
to</strong>:</p>
</div>
<div class="paragraph">
<p>both types are pointers to qualified or unqualified versions of a character
type or <code>void</code> in the same address space or one type is a pointer in a named
address space and the other is a pointer in the generic address space.</p>
</div>
<div class="paragraph">
<p><strong>Clause 6.5.6 - Additive operators, add another constraint paragraph</strong>:</p>
</div>
<div class="paragraph">
<p>For subtraction, if the two operands are pointers into different address
spaces, the address spaces must overlap.</p>
</div>
<div class="paragraph">
<p><strong>Clause 6.5.8 - Relational operators, add another constraint paragraph</strong>:</p>
</div>
<div class="paragraph">
<p>If the two operands are pointers into different address spaces, the address
spaces must overlap.</p>
</div>
<div class="paragraph">
<p><strong>Clause 6.5.8 - Relational operators, add a new paragraph between existing
paragraphs 3 and 4</strong>:</p>
</div>
<div class="paragraph">
<p>If the two operands are pointers into different address spaces, one of the
address spaces encloses the other.
The pointer into the enclosed address space is first converted to a pointer
to the same reference type except with any address-space qualifier removed
and any address-space qualifier of the other pointer&#8217;s reference type added.
(After this conversion, both pointers are pointers into the same address
space).</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span> test1()
{
global <span class="predefined-type">int</span> arr[<span class="integer">5</span>] = { <span class="integer">0</span>, <span class="integer">1</span>, <span class="integer">2</span>, <span class="integer">3</span>, <span class="integer">4</span> };
<span class="predefined-type">int</span> *p = &amp;arr[<span class="integer">1</span>];
global <span class="predefined-type">int</span> *q = &amp;arr[<span class="integer">3</span>];
<span class="comment">// q implicitly converted to the generic address space</span>
<span class="comment">// since the generic address space encloses the global</span>
<span class="comment">// address space</span>
<span class="keyword">if</span> (q &gt;= p)
printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">true</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>);
<span class="comment">// q implicitly converted to the generic address space</span>
<span class="comment">// since the generic address space encloses the global</span>
<span class="comment">// address space</span>
<span class="keyword">if</span> (p &lt;= q)
printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">true</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p><strong>Clause 6.5.9 - Equality operators, add another constraint paragraph</strong>:</p>
</div>
<div class="paragraph">
<p>If the two operands are pointers into different address spaces, the address
spaces must overlap.</p>
</div>
<div class="paragraph">
<p><strong>Clause 6.5.9 - Equality operators, replace paragraph 5 with</strong>:</p>
</div>
<div class="paragraph">
<p>Otherwise, at least one operand is a pointer.
If one operand is a pointer and the other is a null pointer constant, the
null pointer constant is converted to the type of the pointer.
If both operands are pointers, each of the following conversions is
performed as applicable:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>If the two operands are pointers into different address spaces, one of
the address spaces encloses the other.
The pointer into the enclosed address space is first converted to a
pointer to the same reference type except with any address-space
qualifier removed and any address-space qualifier of the other pointer&#8217;s
reference type added.
(After this conversion, both pointers are pointers into the same address
space).</p>
</li>
<li>
<p>Then, if one operand is a pointer to an object or incomplete type and
the other is a pointer to a qualified or unqualified version of <code>void</code>,
the former is converted to the type of the latter.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">int</span> *ptr = <span class="predefined-constant">NULL</span>;
local <span class="predefined-type">int</span> lval = SOME_VAL;
local <span class="predefined-type">int</span> *lptr = &amp;lval;
global <span class="predefined-type">int</span> gval = SOME_OTHER_VAL;
global <span class="predefined-type">int</span> *gptr = &amp;gval;
ptr = lptr;
<span class="keyword">if</span> (ptr == gptr) <span class="comment">// legal</span>
{
...
}
<span class="keyword">if</span> (ptr == lptr) <span class="comment">// legal</span>
{
...
}
<span class="keyword">if</span> (lptr == gptr) <span class="comment">// illegal, compiler error</span>
{
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Consider the following example:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">bool</span> callee(<span class="predefined-type">int</span> *p1, <span class="predefined-type">int</span> *p2)
{
<span class="keyword">if</span> (p1 == p2)
<span class="keyword">return</span> <span class="predefined-constant">true</span>;
<span class="keyword">return</span> <span class="predefined-constant">false</span>;
}
<span class="directive">void</span> caller()
{
global <span class="predefined-type">int</span> *gptr = <span class="hex">0xdeadbeef</span>;
private <span class="predefined-type">int</span> *pptr = <span class="hex">0xdeadbeef</span>;
<span class="comment">// behavior of callee is undefined</span>
<span class="predefined-type">bool</span> b = callee(gptr, pptr);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>The behavior of <code>callee</code> is undefined because <code>gptr</code> and <code>pptr</code> are in
different address spaces.
The example above would have the same undefined behavior if the equality
operator were replaced with a relational operator.</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">int</span> *ptr = <span class="predefined-constant">NULL</span>;
local <span class="predefined-type">int</span> *lptr = <span class="predefined-constant">NULL</span>;
global <span class="predefined-type">int</span> *gptr = <span class="predefined-constant">NULL</span>;
<span class="keyword">if</span> (ptr == <span class="predefined-constant">NULL</span>) <span class="comment">// legal</span>
{
...
}
<span class="keyword">if</span> (ptr == lptr) <span class="comment">// legal</span>
{
...
}
<span class="keyword">if</span> (lptr == gptr) <span class="comment">// compile-time error</span>
{
...
}
ptr = lptr; <span class="comment">// legal</span>
intptr_t l = (intptr_t)lptr;
<span class="keyword">if</span> (l == <span class="integer">0</span>) <span class="comment">// legal</span>
{
...
}
<span class="keyword">if</span> (l == <span class="predefined-constant">NULL</span>) <span class="comment">// legal</span>
{
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p><strong>Clause 6.5.9 - Equality operators, replace first sentence of paragraph 6
with</strong>:</p>
</div>
<div class="paragraph">
<p>Two pointers compare equal if and only if both are null pointers with
overlapping address spaces.</p>
</div>
<div class="paragraph">
<p><strong>Clause 6.5.15 - Conditional operator, add another constraint paragraph</strong>:</p>
</div>
<div class="paragraph">
<p>If the second and third operands are pointers into different address spaces,
the address spaces must overlap.</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span> test1()
{
global <span class="predefined-type">int</span> arr[<span class="integer">5</span>] = { <span class="integer">0</span>, <span class="integer">1</span>, <span class="integer">2</span>, <span class="integer">3</span>, <span class="integer">4</span> };
<span class="predefined-type">int</span> *p = &amp;arr[<span class="integer">1</span>];
global <span class="predefined-type">int</span> *q = &amp;arr[<span class="integer">3</span>];
local <span class="predefined-type">int</span> *r = <span class="predefined-constant">NULL</span>;
<span class="predefined-type">int</span> *val = <span class="predefined-constant">NULL</span>;
<span class="comment">// legal. 2nd and 3rd operands are in address spaces</span>
<span class="comment">// that overlap</span>
val = (q &gt;= p) ? q : p;
<span class="comment">// compiler error. 2nd and 3rd operands are in disjoint</span>
<span class="comment">// address spaces</span>
val = (q &gt;= p) ? q : r;
}</code></pre>
</div>
</div>
<div class="paragraph">
<p><strong>Clause 6.5.16.1 - Simple assignment, change the third and fourth bullets of
paragraph 1 to</strong>:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>both operands are pointers to qualified or unqualified versions of
compatible types, the referenced address space of the left encloses the
referenced address space of the right, and the type pointed to by the
left has all the qualifiers of the type pointed to by the right.</p>
</li>
<li>
<p>one operand is a pointer to an object or incomplete type and the other
is a pointer to a qualified or unqualified version of <code>void</code>, the
referenced address space of the left encloses the referenced address
space of the right, and the type pointed to by the left has all the
qualifiers of the type pointed to by the right.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span> f()
{
<span class="predefined-type">int</span> *ptr;
local <span class="predefined-type">int</span> *lptr;
global <span class="predefined-type">int</span> *gptr;
local <span class="predefined-type">int</span> val = <span class="integer">55</span>;
ptr = &amp;val; <span class="comment">// legal: implicit cast to generic, then assign</span>
lptr = ptr; <span class="comment">// illegal: no implicit cast from</span>
<span class="comment">// generic to local</span>
lptr = gptr; <span class="comment">// illegal: no implicit cast from</span>
<span class="comment">// global to local</span>
ptr = gptr; <span class="comment">// legal: implicit cast from global to generic,</span>
<span class="comment">// then assign</span>
}</code></pre>
</div>
</div>
<div class="paragraph">
<p><strong>Clause 6.7.2.1 - Structure and union specifiers, add a new constraint
paragraph</strong>:</p>
</div>
<div class="paragraph">
<p>Within a structure or union specifier, the type of a member shall not be
qualified by an address space qualifier.</p>
</div>
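<div class="paragraph">
<p>As an illustration of this constraint, the sketch below uses a hypothetical
structure <code>item</code>.
A member may not itself be qualified by an address space, but a member may
be a pointer whose referenced type is in a named address space.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">struct item {
    // global int count;  // error: member type has an address space qualifier
    int count;            // OK
    global float *data;   // OK: only the pointed-to type is in global
};

kernel void use_item(global float *buf)
{
    struct item it;       // the whole object is in the private address space
    it.count = 1;
    it.data  = buf;
    it.data[get_global_id(0)] = (float)it.count;
}</code></pre>
</div>
</div>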
<div class="paragraph">
<p><strong>Clause 6.7.3 - Type qualifiers, add three new constraint paragraphs</strong>:</p>
</div>
<div class="paragraph">
<p>No type shall be qualified by qualifiers for two or more different address
spaces.</p>
</div>
</div>
</div>
<div class="sect2">
<h3 id="access-qualifiers"><a class="anchor" href="#access-qualifiers"></a>6.8. Access Qualifiers</h3>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>Image objects specified as arguments to a kernel can be declared to be
read-only or write-only.</p>
</div>
<div class="paragraph">
<p>For OpenCL C 2.0, or with the <code>__opencl_c_read_write_images</code> feature,
image objects specified as arguments to a kernel can additionally be
declared to be read-write.</p>
</div>
<div class="paragraph">
<p>The <code>__read_only</code> (or <code>read_only</code>) access qualifier specifies that the
image object is only being read by a kernel or function.
The <code>__write_only</code> (or <code>write_only</code>) access qualifier specifies that the
image object is only being written to by a kernel or function.
The <code>__read_write</code> (or <code>read_write</code>) access qualifier specifies that the
image object may be both read from and written to by a kernel or function.</p>
</div>
<div class="paragraph">
<p>The default access qualifier is <code>read_only</code>, if no access qualifier is declared.</p>
</div>
<div class="paragraph">
<p>In the following example</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span>
foo (read_only image2d_t imageA,
write_only image2d_t imageB)
{
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p><code>imageA</code> is a read-only 2D image object, and <code>imageB</code> is a write-only 2D image
object.</p>
</div>
<div class="paragraph">
<p>The sampler-less read image and write image built-ins can be used with images
declared with the <code>__read_write</code> (or <code>read_write</code>) qualifier.
Calls to built-ins that read from an image using a sampler for images
declared with the <code>__read_write</code> (or <code>read_write</code>) qualifier will be a
compilation error.</p>
</div>
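<div class="paragraph">
<p>A minimal sketch of a read-write image kernel follows.
It assumes OpenCL C 2.0 or the <code>__opencl_c_read_write_images</code> feature, an
image format that can be read as floating-point values, and a 2D global size
that matches the image dimensions.
The kernel name <code>invert</code> is hypothetical.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel void invert(read_write image2d_t img)
{
    int2 pos = (int2)(get_global_id(0), get_global_id(1));

    float4 px = read_imagef(img, pos);            // sampler-less read: allowed
    write_imagef(img, pos, (float4)(1.0f) - px);  // inverts every channel

    // A read that takes a sampler argument would be a compilation error
    // for an image declared read_write.
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Each work-item here reads a pixel before writing that same pixel, so the
sketch does not rely on reading back a value written earlier in the kernel.</p>
</div>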
<div class="paragraph">
<p>Pipe objects specified as arguments to a kernel also use these access
qualifiers.
See the <a href="#pipe-functions">detailed description on how these access qualifiers
can be used with pipes</a>.</p>
</div>
<div class="paragraph">
<p>The <code>__read_only</code>, <code>__write_only</code>, <code>__read_write</code>, <code>read_only</code>,
<code>write_only</code> and <code>read_write</code> names are reserved for use as access
qualifiers and shall not be used otherwise.</p>
</div>
</div>
</div>
</div>
<div class="sect2">
<h3 id="function-qualifiers"><a class="anchor" href="#function-qualifiers"></a>6.9. Function Qualifiers</h3>
<div class="sect3">
<h4 id="kernel-or-kernel"><a class="anchor" href="#kernel-or-kernel"></a>6.9.1. <code>__kernel</code> (or <code>kernel</code>)</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>__kernel</code> (or <code>kernel</code>) qualifier declares a function to be a kernel
that can be executed by an application on one or more OpenCL devices.
The following rules apply to functions that are declared with this
qualifier:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>It can be executed on the device only</p>
</li>
<li>
<p>It can be called by the host</p>
</li>
<li>
<p>It is just a regular function call if a <code>__kernel</code> function is called
by another kernel function.</p>
</li>
</ul>
</div>
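<div class="paragraph">
<p>The last rule can be sketched as follows; the kernel names <code>scale</code> and
<code>scale_twice</code> are hypothetical.
When <code>scale_twice</code> is enqueued by the host, each call to <code>scale</code> runs in the
calling work-item like any other function call.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel void scale(global float *data, float factor)
{
    size_t i = get_global_id(0);
    data[i] *= factor;
}

kernel void scale_twice(global float *data, float factor)
{
    scale(data, factor);  // regular function call, not a new kernel launch
    scale(data, factor);
}</code></pre>
</div>
</div>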
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>Kernel functions with variables declared inside the function with the
<code>__local</code> or <code>local</code> qualifier can be called by the host using appropriate
APIs such as <strong>clEnqueueNDRangeKernel</strong>.</p>
</div>
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The <code>__kernel</code> and <code>kernel</code> names are reserved for use as function
qualifiers and shall not be used otherwise.</p>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="optional-attribute-qualifiers"><a class="anchor" href="#optional-attribute-qualifiers"></a>6.9.2. Optional Attribute Qualifiers</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>__kernel</code> qualifier can be used with the keyword <code>__attribute__</code> to
declare additional information about the kernel function as described below.</p>
</div>
<div class="paragraph">
<p>The optional <code>__attribute__((vec_type_hint(&lt;type&gt;)))</code>
<sup class="footnote">[<a id="_footnoteref_27" class="footnote" href="#_footnotedef_27" title="View footnote.">27</a>]</sup> is a hint to the compiler and is intended to be a
representation of the computational <em>width</em> of the <code>__kernel</code>, and should
serve as the basis for calculating processor bandwidth utilization when the
compiler is looking to autovectorize the code.
In the <code>__attribute__((vec_type_hint(&lt;type&gt;)))</code> qualifier &lt;type&gt; is one of
the built-in vector types listed in <a href="#table-builtin-vector-types">Built-in Vector Data Types</a> or the
constituent scalar element types.
If <code>vec_type_hint (&lt;type&gt;)</code> is not specified, the kernel is assumed to have
the <code>__attribute__((vec_type_hint(int)))</code> qualifier.</p>
</div>
<div class="paragraph">
<p>For example, where the developer specifies a width of <code>float4</code>, the compiler
should assume that the computation usually uses up to 4 lanes of a <code>float</code>
vector, and may decide to merge work-items or possibly even separate one
work-item into many threads to better match the hardware capabilities.
A conforming implementation is not required to autovectorize code, but shall
support the hint.
A compiler may autovectorize, even if no hint is provided.
If an implementation merges N work-items into one thread, it is responsible
for correctly handling cases where the number of <code>global</code> or <code>local</code>
work-items in any dimension modulo N is not zero.</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// autovectorize assuming float4 as the</span>
<span class="comment">// basic computation width</span>
__kernel __attribute__((vec_type_hint(float4)))
<span class="directive">void</span> foo( __global float4 *p ) { ... }
<span class="comment">// autovectorize assuming double as the</span>
<span class="comment">// basic computation width</span>
__kernel __attribute__((vec_type_hint(<span class="predefined-type">double</span>)))
<span class="directive">void</span> foo( __global float4 *p ) { ... }
<span class="comment">// autovectorize assuming int (default)</span>
<span class="comment">// as the basic computation width</span>
__kernel
<span class="directive">void</span> foo( __global float4 *p ) { ... }</code></pre>
</div>
</div>
<div class="paragraph">
<p>If, for example, a <code>__kernel</code> function is declared with</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><code>__attribute__(( vec_type_hint (float4)))</code></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>(meaning that most operations in the <code>__kernel</code> function are explicitly
vectorized using <code>float4</code>) and the kernel is running using Intel<sup>®</sup>
Advanced Vector Extensions (Intel<sup>®</sup> AVX), which implement an
8-float-wide vector unit, the autovectorizer might choose to merge two
work-items into one thread, running the second work-item in the high half of the
256-bit AVX register.</p>
</div>
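<div class="paragraph">
<p>The bookkeeping an implementation needs when it merges work-items can be sketched in plain host C.
This is an illustration only, not how any particular compiler does it: the helper names <code>merged_thread_count</code> and <code>live_work_items</code> are hypothetical and are not part of OpenCL.
The sketch shows the case called out earlier in this section, where the number of work-items is not a multiple of the merge factor and the last thread is only partly filled.</p>
</div>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API): number of hardware threads
   needed when `merge` work-items are packed into each thread. */
static size_t merged_thread_count(size_t work_items, size_t merge)
{
    assert(merge > 0);
    return (work_items + merge - 1) / merge; /* round up to cover the tail */
}

/* Hypothetical helper: number of work-items that are live in thread
   `thread`. Every thread is full except possibly the last one. */
static size_t live_work_items(size_t work_items, size_t merge, size_t thread)
{
    assert(merge > 0);
    size_t first = thread * merge; /* index of the first work-item in this thread */
    if (first >= work_items)
        return 0;                  /* thread lies past the end of the range */
    size_t remaining = work_items - first;
    return remaining < merge ? remaining : merge;
}
```

<div class="paragraph">
<p>With 7 work-items and a merge factor of 2 (two <code>float4</code> work-items per 256-bit register), the sketch yields 4 threads, and the last thread has a single live work-item, so its upper half must be masked off or otherwise ignored.</p>
</div>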
<div class="paragraph">
<p>As another example, a Power4 machine has two scalar double precision
floating-point units with a 6-cycle deep pipe.
An autovectorizer for the Power4 machine might choose to interleave six
kernels declared with the <code>__attribute__(( vec_type_hint (double2)))</code>
qualifier into one hardware thread, to ensure that there is always 12-way
parallelism available to saturate the FPUs.
It might also choose to merge 4 or 8 work-items (or some other number) if it
concludes that these are better choices, due to resource utilization
concerns or some preference for divisibility by 2.</p>
</div>
<div class="paragraph">
<p>The optional <code>__attribute__((work_group_size_hint(X, Y, Z)))</code> is a hint to
the compiler and is intended to specify the work-group size that may be used,
i.e. the value most likely to be specified by the <em>local_work_size</em> argument to
<strong>clEnqueueNDRangeKernel</strong>.
For example, the <code>__attribute__((work_group_size_hint(1, 1, 1)))</code> is a
hint to the compiler that the kernel will most likely be executed with a
work-group size of 1.</p>
</div>
<div class="paragraph">
<p>The optional <code>__attribute__((reqd_work_group_size(X, Y, Z)))</code> is the
work-group size that must be used as the <em>local_work_size</em> argument to
<strong>clEnqueueNDRangeKernel</strong>.
This allows the compiler to optimize the generated code appropriately for
this kernel.</p>
</div>
<div class="paragraph">
<p>If <code>Z</code> is one, the <em>work_dim</em> argument to <strong>clEnqueueNDRangeKernel</strong> can be 2
or 3.
If <code>Y</code> and <code>Z</code> are one, the <em>work_dim</em> argument to <strong>clEnqueueNDRangeKernel</strong>
can be 1, 2 or 3.</p>
</div>
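<div class="paragraph">
<p>The relationship between <code>reqd_work_group_size(X, Y, Z)</code> and <em>work_dim</em> can be expressed as a small host-side check.
The function below is a sketch in host C; <code>work_dim_allowed</code> is a hypothetical name, not an OpenCL API.
The two cases stated above (<code>Z</code> is one; <code>Y</code> and <code>Z</code> are one) come from the text, while the remaining case (<code>Z</code> greater than one requires a <em>work_dim</em> of 3) is an inference from them and should be treated as an assumption.</p>
</div>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side check (not part of the OpenCL API): returns true
   when `work_dim` is usable with a kernel declared with
   reqd_work_group_size(x, y, z). */
static bool work_dim_allowed(unsigned work_dim, size_t x, size_t y, size_t z)
{
    (void)x;              /* X never restricts work_dim */
    unsigned min_dim = 3; /* assumption: Z > 1 needs all three dimensions */
    if (z == 1)
        min_dim = (y == 1) ? 1 : 2; /* the two cases stated in the text */
    return work_dim >= min_dim && work_dim <= 3;
}
```

<div class="paragraph">
<p>For instance, a kernel declared with <code>reqd_work_group_size(64, 1, 1)</code> passes the check for a <em>work_dim</em> of 1, 2 or 3, whereas one declared with <code>reqd_work_group_size(8, 8, 1)</code> passes only for 2 or 3.</p>
</div>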
</div>
</div>
</div>
</div>
<div class="sect2">
<h3 id="storage-class-specifiers"><a class="anchor" href="#storage-class-specifiers"></a>6.10. Storage-Class Specifiers</h3>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>typedef</code> storage-class specifier is supported.
The <code>extern</code> and <code>static</code> storage-class specifiers are supported but
<a href="#unified-spec">require</a> support for OpenCL C 1.2 or newer.
The <code>auto</code> and <code>register</code> storage-class specifiers are not supported.</p>
</div>
<div class="paragraph">
<p>The <code>extern</code> storage-class specifier can only be used for functions (kernel
and non-kernel functions) and <code>global</code> variables declared in program scope
or variables declared inside a function (kernel and non-kernel functions).
The <code>static</code> storage-class specifier can only be used for non-kernel
functions, <code>global</code> variables declared in program scope and variables inside
a function declared in the <code>global</code> or <code>constant</code> address space.</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="directive">extern</span> constant float4 noise_table[<span class="integer">256</span>];
<span class="directive">static</span> constant float4 color_table[<span class="integer">256</span>];
<span class="directive">extern</span> kernel <span class="directive">void</span> my_foo(image2d_t img);
<span class="directive">extern</span> <span class="directive">void</span> my_bar(global <span class="predefined-type">float</span> *a);
kernel <span class="directive">void</span> my_func(image2d_t img, global <span class="predefined-type">float</span> *a)
{
<span class="directive">extern</span> constant float4 d;
<span class="directive">static</span> constant float4 b = (float4)(<span class="float">1</span><span class="float">.0f</span>); <span class="comment">// OK.</span>
<span class="directive">static</span> <span class="predefined-type">float</span> c; <span class="comment">// Error: No implicit address space</span>
global <span class="predefined-type">int</span> hurl; <span class="comment">// Error: Must be static</span>
...
my_foo(img);
...
my_bar(a);
...
<span class="keyword">while</span> (<span class="integer">1</span>)
{
<span class="directive">static</span> global <span class="predefined-type">int</span> inside; <span class="comment">// OK.</span>
...
}
...
}</code></pre>
</div>
</div>
</div>
</div>
</div>
<div class="sect2">
<h3 id="restrictions"><a class="anchor" href="#restrictions"></a>6.11. Restrictions</h3>
<div class="openblock">
<div class="content">
<div class="olist loweralpha">
<ol class="loweralpha" type="a">
<li>
<p>The use of pointers is somewhat restricted.
The following rules apply:</p>
<div class="ulist">
<ul>
<li>
<p>Arguments to kernel functions declared in a program that are pointers
must be declared with the <code>__global</code>, <code>__constant</code> or <code>__local</code>
qualifier.</p>
</li>
<li>
<p>A pointer declared with the <code>__constant</code> qualifier can only be
assigned to a pointer declared with the <code>__constant</code> qualifier.</p>
</li>
<li>
<p>Pointers to functions are not allowed.</p>
</li>
<li>
<p>Arguments to kernel functions in a program cannot be
declared as a pointer to a pointer(s).
Variables inside a function or arguments to non-kernel functions in a
program can be declared as a pointer to a pointer(s).
This restriction only applies to OpenCL C 1.2 or below.</p>
</li>
</ul>
</div>
</li>
<li>
<p>An image type (<code>image2d_t</code>, <code>image3d_t</code>, <code>image2d_array_t</code>, <code>image1d_t</code>,
<code>image1d_buffer_t</code> or <code>image1d_array_t</code>) can only be used as the type of
a function argument.
An image function argument cannot be modified.
Elements of an image can only be accessed using the built-in
<a href="#image-read-and-write-functions">image read and write functions</a>.</p>
<div class="paragraph">
<p>An image type cannot be used to declare a variable, a structure or union
field, an array of images, a pointer to an image, or the return type of a
function.
An image type cannot be used with the <code>__global</code>, <code>__private</code>,
<code>__local</code> and <code>__constant</code> address space qualifiers.</p>
</div>
<div class="paragraph">
<p>The sampler type (<code>sampler_t</code>) can only be used as the type of a function
argument or a variable declared in the program scope or the outermost scope
of a kernel function.
The behavior of a sampler variable declared in a non-outermost scope of a
kernel function is implementation-defined.
A sampler argument or variable cannot be modified.</p>
</div>
<div class="paragraph">
<p>The sampler type cannot be used to declare a structure or union field, an
array of samplers, a pointer to a sampler, or the return type of a function.
The sampler type cannot be used with the <code>__local</code> and <code>__global</code>
address space qualifiers.</p>
</div>
</li>
<li>
<p><a id="restrictions-bitfield"></a> Bit-field struct members are currently not
supported.</p>
</li>
<li>
<p><a id="restrictions-variable-length"></a> Variable length arrays and structures
with flexible (or unsized) arrays are not supported.</p>
</li>
<li>
<p>Variadic functions are not supported, with the exception of <code>printf</code> and
<code>enqueue_kernel</code>.</p>
</li>
<li>
<p>Variadic macros are not supported.
This restriction only applies to OpenCL C 2.0 or below.</p>
</li>
<li>
<p>If a list of parameters in a function declaration is empty, the function
takes no arguments. This is due to the above restriction on variadic
functions.</p>
</li>
<li>
<p>Unless defined in the OpenCL specification, the library functions,
macros, types, and constants defined in the C99 standard headers
<code>assert.h</code>, <code>ctype.h</code>, <code>complex.h</code>, <code>errno.h</code>, <code>fenv.h</code>, <code>float.h</code>,
<code>inttypes.h</code>, <code>limits.h</code>, <code>locale.h</code>, <code>setjmp.h</code>, <code>signal.h</code>,
<code>stdarg.h</code>, <code>stdio.h</code>, <code>stdlib.h</code>, <code>string.h</code>, <code>tgmath.h</code>, <code>time.h</code>,
<code>wchar.h</code> and <code>wctype.h</code> are not available and cannot be included by a
program.</p>
</li>
<li>
<p>The <code>auto</code> and <code>register</code> storage-class specifiers are not supported.</p>
</li>
<li>
<p>Predefined identifiers are not supported.
This restriction only applies to OpenCL C 1.1 or below.</p>
</li>
<li>
<p>Recursion is not supported.</p>
</li>
<li>
<p>The return type of a kernel function must be <code>void</code>.</p>
</li>
<li>
<p>Arguments to kernel functions in a program cannot be declared with the
built-in scalar types <code>bool</code>, <code>size_t</code>, <code>ptrdiff_t</code>, <code>intptr_t</code>, and
<code>uintptr_t</code> or a struct and/or union that contain fields declared to be
one of these built-in scalar types.
The size in bytes of these types is implementation-defined and can also
differ between the OpenCL device and the host processor, making it
difficult to allocate buffer objects to be passed as arguments to a
kernel declared as pointer to these types.</p>
</li>
<li>
<p><code>half</code> is not supported as <code>half</code> can be used as a storage format
<sup class="footnote">[<a id="_footnoteref_28" class="footnote" href="#_footnotedef_28" title="View footnote.">28</a>]</sup> only and is not a data type on which
floating-point arithmetic can be performed.</p>
</li>
<li>
<p>Whether or not irreducible control flow is illegal is
implementation-defined.</p>
</li>
<li>
<p>The following restriction only applies to OpenCL C 1.0, also see the
<strong>cl_khr_byte_addressable_store</strong> extension.
Built-in types that are less than 32-bits in size, i.e.
<code>char</code>, <code>uchar</code>, <code>char2</code>, <code>uchar2</code>, <code>short</code>, <code>ushort</code>, and <code>half</code>, have
the following restriction:</p>
<div class="ulist">
<ul>
<li>
<p>Writes to a pointer (or arrays) of type <code>char</code>,
<code>uchar</code>, <code>char2</code>, <code>uchar2</code>, <code>short</code>, <code>ushort</code>, and <code>half</code> or to
elements of a struct that are of type <code>char</code>, <code>uchar</code>, <code>char2</code>,
<code>uchar2</code>, <code>short</code> and <code>ushort</code> are not supported.
Refer to <em>section 9.9</em> for additional information.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The kernel example below shows what memory operations are not supported on
built-in types less than 32-bits in size.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span>
do_proc (__global <span class="predefined-type">char</span> *pA, <span class="predefined-type">short</span> b,
__global <span class="predefined-type">short</span> *pB)
{
<span class="predefined-type">char</span> x[<span class="integer">100</span>];
__private <span class="predefined-type">char</span> *px = x;
<span class="predefined-type">int</span> id = (<span class="predefined-type">int</span>)get_global_id(<span class="integer">0</span>);
<span class="predefined-type">short</span> f;
f = pB[id] + b; <span class="comment">// is allowed</span>
px[<span class="integer">1</span>] = pA[<span class="integer">1</span>]; <span class="comment">// error. px cannot be written.</span>
pB[id] = b; <span class="comment">// error. pB cannot be written</span>
}</code></pre>
</div>
</div>
</li>
<li>
<p>The type qualifiers <code>const</code>, <code>restrict</code> and <code>volatile</code> as defined by the
C99 specification are supported.
These qualifiers cannot be used with <code>image2d_t</code>, <code>image3d_t</code>,
<code>image2d_array_t</code>, <code>image2d_depth_t</code>, <code>image2d_array_depth_t</code>,
<code>image1d_t</code>, <code>image1d_buffer_t</code> and <code>image1d_array_t</code> types.
Types other than pointer types shall not use the <code>restrict</code> qualifier.</p>
</li>
<li>
<p>The event type (<code>event_t</code>) cannot be used as the type of a kernel
function argument.
The event type cannot be used to declare a program scope variable.
The event type cannot be used to declare a structure or union field.
The event type cannot be used with the <code>__local</code>, <code>__constant</code> and
<code>__global</code> address space qualifiers.</p>
</li>
<li>
<p>The <code>clk_event_t</code>, <code>ndrange_t</code> and <code>reserve_id_t</code> types cannot be used
as arguments to kernel functions that get enqueued from the host.
The <code>clk_event_t</code> and <code>reserve_id_t</code> types cannot be declared in program
scope.</p>
</li>
<li>
<p>For kernels enqueued by the host, every argument that is a pointer must
still be declared to point to a named address space.</p>
</li>
<li>
<p>A function in an OpenCL program cannot be called <code>main</code>.</p>
</li>
<li>
<p>Implicit function declaration is not supported.</p>
</li>
</ol>
</div>
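<div class="paragraph">
<p>The restriction on <code>bool</code>, <code>size_t</code> and related types as kernel arguments is usually worked around by passing fixed-width values instead.
The host C sketch below illustrates the idea; the struct <code>kernel_params</code>, its fields and the helper <code>make_kernel_params</code> are hypothetical names chosen for this example.
It assumes the kernel side would declare a matching struct using the OpenCL C types <code>ulong</code> (64-bit) and <code>uint</code> (32-bit).</p>
</div>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter block; the name and fields are illustrative only.
   Every field has a fixed width, so host and device agree on the layout. */
typedef struct {
    uint64_t element_count; /* used where size_t might be tempting */
    uint32_t enabled;       /* used where bool might be tempting: 0 or 1 */
    uint32_t reserved;      /* explicit padding keeps the size at 16 bytes */
} kernel_params;

/* Fill the block from host-native types, converting explicitly. */
static kernel_params make_kernel_params(size_t count, int enabled)
{
    kernel_params p;
    p.element_count = (uint64_t)count;
    p.enabled = enabled ? 1u : 0u;
    p.reserved = 0u;
    return p;
}
```

<div class="paragraph">
<p>Spelling out the padding field is a design choice: it avoids relying on either compiler to insert the same implicit padding at the end of the struct.</p>
</div>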
</div>
</div>
</div>
<div class="sect2">
<h3 id="preprocessor-directives-and-macros"><a class="anchor" href="#preprocessor-directives-and-macros"></a>6.12. Preprocessor Directives and Macros</h3>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The preprocessing directives defined by the C99 specification are supported.</p>
</div>
<div class="paragraph">
<p>The <strong>#pragma</strong> directive is described as:</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><strong>#pragma</strong> <em>pp-tokens<sub>opt</sub></em> <em>new-line</em></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>A <strong>#pragma</strong> directive where the preprocessing token <code>OPENCL</code> (used instead
of <strong><code>STDC</code></strong>) does not immediately follow <strong>pragma</strong> in the directive (prior to
any macro replacement) causes the implementation to behave in an
implementation-defined manner.
The behavior might cause translation to fail or cause the translator or the
resulting program to behave in a non-conforming manner.
Any such <strong>pragma</strong> that is not recognized by the implementation is ignored.
If the preprocessing token <code>OPENCL</code> does immediately follow <strong>#pragma</strong> in the
directive (prior to any macro replacement), then no macro replacement is
performed on the directive, and the directive shall have one of the
following forms whose meanings are described elsewhere:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// on-off-switch is one of ON, OFF, or DEFAULT</span>
<span class="preprocessor">#pragma</span> OPENCL FP_CONTRACT on-off-switch
<span class="preprocessor">#pragma</span> OPENCL EXTENSION extensionname : behavior
<span class="preprocessor">#pragma</span> OPENCL EXTENSION all : behavior</code></pre>
</div>
</div>
<div class="paragraph">
<p>The following predefined macro names are available.</p>
</div>
<div class="dlist">
<dl>
<dt class="hdlist1"><code>__FILE__</code> </dt>
<dd>
<p>The presumed name of the current source file (a character string
literal).</p>
</dd>
<dt class="hdlist1"><code>__LINE__</code> </dt>
<dd>
<p>The presumed line number (within the current source file) of the current
source line (an integer constant).</p>
</dd>
<dt class="hdlist1"><code>__OPENCL_VERSION__</code> </dt>
<dd>
<p>Substitutes an integer reflecting the version of OpenCL
supported by the OpenCL device.
The version of OpenCL described in this document will have
<code>__OPENCL_VERSION__</code> substitute the integer 300.</p>
</dd>
<dt class="hdlist1"><code>CL_VERSION_1_0</code> </dt>
<dd>
<p>Substitutes the integer 100 reflecting the OpenCL 1.0 version.
<a href="#unified-spec">Requires</a> support for OpenCL C 1.1 or newer.</p>
</dd>
<dt class="hdlist1"><code>CL_VERSION_1_1</code> </dt>
<dd>
<p>Substitutes the integer 110 reflecting the OpenCL 1.1 version.
<a href="#unified-spec">Requires</a> support for OpenCL C 1.1 or newer.</p>
</dd>
<dt class="hdlist1"><code>CL_VERSION_1_2</code> </dt>
<dd>
<p>Substitutes the integer 120 reflecting the OpenCL 1.2 version.
<a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p>
</dd>
<dt class="hdlist1"><code>CL_VERSION_2_0</code> </dt>
<dd>
<p>Substitutes the integer 200 reflecting the OpenCL 2.0 version.
<a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer.</p>
</dd>
<dt class="hdlist1"><code>CL_VERSION_3_0</code> </dt>
<dd>
<p>Substitutes the integer 300 reflecting the OpenCL 3.0 version.
<a href="#unified-spec">Requires</a> support for OpenCL C 3.0 or newer.</p>
</dd>
<dt class="hdlist1"><code>__OPENCL_C_VERSION__</code> </dt>
<dd>
<p>Substitutes an integer reflecting the OpenCL C version specified by the
<code>-cl-std</code> build option (see <a href="#opencl-spec">OpenCL Specification</a>) to
<strong>clBuildProgram</strong> or <strong>clCompileProgram</strong>.
If the <code>-cl-std</code> build option is not specified, the highest OpenCL C 1.x
language version supported by each device is used as the version of
OpenCL C when compiling the program for each device.
<a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p>
</dd>
<dt class="hdlist1"><code>__ROUNDING_MODE__</code> </dt>
<dd>
<p>Used to determine the current rounding mode and is set to rte.
Only affects the rounding mode of conversions to a float type.
<a href="#unified-spec">Deprecated by</a> OpenCL C 1.1, along with the
<strong>cl_khr_select_fprounding_mode</strong> extension.</p>
</dd>
<dt class="hdlist1"><code>__ENDIAN_LITTLE__</code> </dt>
<dd>
<p>Used to determine if the OpenCL device is a little endian architecture
or a big endian architecture (an integer constant of 1 if device is
little endian and is undefined otherwise).
Also refer to the value of the <a href="#opencl-device-queries"><code>CL_DEVICE_ENDIAN_LITTLE</code> device query</a>.</p>
</dd>
<dt class="hdlist1"><code>__kernel_exec(X, typen)</code> (and <code>kernel_exec(X, typen)</code>) </dt>
<dd>
<p>is defined as:</p>
</dd>
</dl>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">__kernel __attribute__((work_group_size_hint(X, <span class="integer">1</span>, <span class="integer">1</span>))) \
__attribute__((vec_type_hint(typen)))</code></pre>
</div>
</div>
<div class="dlist">
<dl>
<dt class="hdlist1"><code>__IMAGE_SUPPORT__</code> </dt>
<dd>
<p>Used to determine if the OpenCL device supports images.
This is an integer constant of 1 if images are supported and is
undefined otherwise.
Also refer to the value of the <a href="#opencl-device-queries"><code>CL_DEVICE_IMAGE_SUPPORT</code> device query</a> and the <code>__opencl_c_images</code>
feature.</p>
</dd>
<dt class="hdlist1"><code>__FAST_RELAXED_MATH__</code> </dt>
<dd>
<p>Used to determine if the <code>-cl-fast-relaxed-math</code> optimization option is
specified in build options given to <strong>clBuildProgram</strong> or
<strong>clCompileProgram</strong>.
This is an integer constant of 1 if the <code>-cl-fast-relaxed-math</code> build
option is specified and is undefined otherwise.</p>
</dd>
</dl>
</div>
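<div class="paragraph">
<p>The version macros above encode a version as a three-digit integer (100, 110, 120, 200, 300), which can be decoded or compared as sketched below.
The helper names <code>version_major</code>, <code>version_minor</code> and <code>language_at_least</code> and the macro <code>LANGUAGE_VERSION</code> are hypothetical.
The code is written as host C so that it can be compiled outside an OpenCL toolchain, in which case <code>__OPENCL_C_VERSION__</code> is not defined and the fallback value 0 is used.</p>
</div>

```c
#include <assert.h>

/* Hypothetical helpers, not part of OpenCL C: split a version integer such
   as 120 into its major (1) and minor (2) components. */
static int version_major(int version) { return version / 100; }
static int version_minor(int version) { return (version % 100) / 10; }

/* __OPENCL_C_VERSION__ requires OpenCL C 1.2 or newer, so guard its use and
   fall back to 0 when it is absent (older compilers, or a host C build). */
#if defined(__OPENCL_C_VERSION__)
#define LANGUAGE_VERSION __OPENCL_C_VERSION__
#else
#define LANGUAGE_VERSION 0
#endif

/* Returns nonzero when the language version is at least `required`,
   for example 120 before relying on __func__. */
static int language_at_least(int required)
{
    return LANGUAGE_VERSION >= required;
}
```

<div class="paragraph">
<p>The same comparison can be made directly in the preprocessor with <code>#if LANGUAGE_VERSION &gt;= 120</code> when code has to be excluded at translation time rather than at run time.</p>
</div>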
<div class="paragraph">
<p>The <code>NULL</code> macro expands to a null pointer constant.
An integer constant expression with the value 0, or such an expression cast
to type <code>void *</code> is called a <em>null pointer constant</em>.
<a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer.</p>
</div>
<div class="paragraph">
<p>The macro names defined by the C99 specification but not currently supported
by OpenCL are reserved for future use.</p>
</div>
<div class="paragraph">
<p>The predefined identifier <code>__func__</code> is available.
<a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p>
</div>
<div class="paragraph">
<p>In OpenCL C 3.0 or newer there are a number of optional predefined macros
indicating optional language features. Such macros are listed in the
<a href="#table-optional-lang-features">optional features in OpenCL C 3.0 table</a>.</p>
</div>
</div>
</div>
</div>
<div class="sect2">
<h3 id="attribute-qualifiers"><a class="anchor" href="#attribute-qualifiers"></a>6.13. Attribute Qualifiers</h3>
<div class="paragraph">
<p>This section describes the syntax with which <code>__attribute__</code> may be used,
and the constructs to which attribute specifiers bind.</p>
</div>
<div class="paragraph">
<p>An attribute specifier is of the form</p>
</div>
<div class="paragraph">
<p><code>__attribute__ ((<em>attribute-list</em>))</code>.</p>
</div>
<div class="paragraph">
<p>An attribute list is defined as:</p>
</div>
<div class="openblock bnf">
<div class="content">
<div class="dlist">
<dl>
<dt class="hdlist1"><em>attribute-list</em> : </dt>
<dd>
<p><em>attribute<sub>opt</sub></em><br>
<em>attribute-list</em> , <em>attribute<sub>opt</sub></em></p>
</dd>
<dt class="hdlist1"><em>attribute</em> : </dt>
<dd>
<p><em>attribute-token</em> <em>attribute-argument-clause<sub>opt</sub></em></p>
</dd>
<dt class="hdlist1"><em>attribute-token</em> : </dt>
<dd>
<p><em>identifier</em></p>
</dd>
<dt class="hdlist1"><em>attribute-argument-clause</em> : </dt>
<dd>
<p>( <em>attribute-argument-list</em> )</p>
</dd>
<dt class="hdlist1"><em>attribute-argument-list</em> : </dt>
<dd>
<p><em>attribute-argument</em><br>
<em>attribute-argument-list</em> , <em>attribute-argument</em></p>
</dd>
<dt class="hdlist1"><em>attribute-argument</em> : </dt>
<dd>
<p><em>assignment-expression</em></p>
</dd>
</dl>
</div>
</div>
</div>
<div class="paragraph">
<p>This syntax is taken directly from GCC but unlike GCC, which allows
attributes to be applied only to functions, types, and variables, OpenCL
attributes can be associated with:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>types;</p>
</li>
<li>
<p>functions;</p>
</li>
<li>
<p>variables;</p>
</li>
<li>
<p>blocks; and</p>
</li>
<li>
<p>control-flow statements.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>In general, the rules for how an attribute binds, for a given context, are
non-trivial and the reader is pointed to GCC&#8217;s documentation and Maurer and
Wong&#8217;s paper [see items 16 and 17 in <em>section 11</em> - <strong>References</strong>] for the details.</p>
</div>
<div class="sect3">
<h4 id="specifying-attributes-of-types"><a class="anchor" href="#specifying-attributes-of-types"></a>6.13.1. Specifying Attributes of Types</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The keyword <code>__attribute__</code> allows you to specify special attributes of
enum, struct and union types when you define such types.
This keyword is followed by an attribute specification inside double
parentheses.
Two attributes are currently defined for types: <code>aligned</code> and <code>packed</code>.</p>
</div>
<div class="paragraph">
<p>You may specify type attributes in an enum, struct or union type declaration
or definition, or for other types in a <code>typedef</code> declaration.</p>
</div>
<div class="paragraph">
<p>For an enum, struct or union type, you may specify attributes either between
the enum, struct or union tag and the name of the type, or just past the
closing curly brace of the <em>definition</em>.
The former syntax is preferred.</p>
</div>
<div class="dlist">
<dl>
<dt class="hdlist1"><code>aligned (<em>alignment</em>)</code> </dt>
</dl>
</div>
</div>
</div>
<div class="paragraph">
<p>This attribute specifies a minimum alignment (in bytes) for variables of the
specified type.
For example, the declarations:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="keyword">struct</span> S { <span class="predefined-type">short</span> f[<span class="integer">3</span>]; } __attribute__ ((aligned (<span class="integer">8</span>)));
<span class="keyword">typedef</span> <span class="predefined-type">int</span> more_aligned_int __attribute__ ((aligned (<span class="integer">8</span>)));</code></pre>
</div>
</div>
<div class="paragraph">
<p>force the compiler to ensure (as far as it can) that each variable whose
type is <code>struct S</code> or <code>more_aligned_int</code> will be allocated and aligned <em>at
least</em> on an 8-byte boundary.</p>
</div>
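<div class="paragraph">
<p>Because the attribute syntax is shared with GCC, the effect of the two declarations can be inspected with a host C compiler.
The sketch below assumes GCC or Clang (it uses their <code>__alignof__</code> extension) and a 2-byte <code>short</code>; an OpenCL C compiler is expected, but not guaranteed by this example, to produce the same layout.
The helper function names are hypothetical.</p>
</div>

```c
#include <assert.h>
#include <stddef.h>

/* The two declarations from the text, unchanged. */
struct S { short f[3]; } __attribute__ ((aligned (8)));
typedef int more_aligned_int __attribute__ ((aligned (8)));

/* Hypothetical helpers that expose the resulting layout for inspection. */
static size_t struct_S_alignment(void) { return __alignof__(struct S); }
static size_t struct_S_size(void) { return sizeof(struct S); }
static size_t more_aligned_int_alignment(void)
{
    return __alignof__(more_aligned_int);
}
```

<div class="paragraph">
<p>Note that raising the alignment of <code>struct S</code> to 8 also pads its size from 6 to 8 bytes, since the size of a type is always a multiple of its alignment.</p>
</div>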
<div class="paragraph">
<p>Note that the alignment of any given struct or union type is required by the
ISO C standard to be at least a perfect multiple of the lowest common
multiple of the alignments of all of the members of the struct or union in
question and must also be a power of two.
This means that you <em>can</em> effectively adjust the alignment of a struct or
union type by attaching an aligned attribute to any one of the members of
such a type, but the notation illustrated in the example above is a more
obvious, intuitive, and readable way to request the compiler to adjust the
alignment of an entire struct or union type.</p>
</div>
<div class="paragraph">
<p>As in the preceding example, you can explicitly specify the alignment (in
bytes) that you wish the compiler to use for a given struct or union type.
Alternatively, you can leave out the alignment factor and just ask the
compiler to align a type to the maximum useful alignment for the target
machine you are compiling for.
For example, you could write:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="keyword">struct</span> S { <span class="predefined-type">short</span> f[<span class="integer">3</span>]; } __attribute__ ((aligned));</code></pre>
</div>
</div>
<div class="paragraph">
<p>Whenever you leave out the alignment factor in an aligned attribute
specification, the compiler automatically sets the alignment for the type to
the largest alignment which is ever used for any data type on the target
machine you are compiling for.
In the example above, the size of each <code>short</code> is 2 bytes, and therefore the
size of the entire <code>struct S</code> type is 6 bytes.
The smallest power of two which is greater than or equal to that is 8, so
the compiler sets the alignment for the entire <code>struct S</code> type to 8 bytes.</p>
</div>
<div class="paragraph">
<p>Note that the effectiveness of aligned attributes may be limited by inherent
limitations of the OpenCL device and compiler.
For some devices, the OpenCL compiler may only be able to arrange for
variables to be aligned up to a certain maximum alignment.
If the OpenCL compiler is only able to align variables up to a maximum of
8-byte alignment, then specifying <code>aligned(16)</code> in an <code>__attribute__</code> will
still only provide you with 8-byte alignment.
See your platform-specific documentation for further information.</p>
</div>
<div class="paragraph">
<p>The aligned attribute can only increase the alignment; but you can decrease
it by specifying packed as well.
See below.</p>
</div>
<div class="openblock">
<div class="content">
<div class="dlist">
<dl>
<dt class="hdlist1"><code>packed</code> </dt>
</dl>
</div>
</div>
</div>
<div class="paragraph">
<p>This attribute, attached to a struct or union type definition, specifies that
each member of the structure or union is placed to minimize the memory
required.
When attached to an enum definition, it indicates that the smallest integral
type should be used.</p>
</div>
<div class="paragraph">
<p>Specifying this attribute for struct and union types is equivalent to
specifying the packed attribute on each of the structure or union members.</p>
</div>
<div class="paragraph">
<p>In the following example, the members of <code>my_packed_struct</code> are packed
closely together, but the internal layout of its <code>s</code> member is not packed.
To do that, struct <code>my_unpacked_struct</code> would need to be packed, too.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="keyword">struct</span> my_unpacked_struct
{
<span class="predefined-type">char</span> c;
<span class="predefined-type">int</span> i;
};
<span class="keyword">struct</span> __attribute__ ((packed)) my_packed_struct
{
<span class="predefined-type">char</span> c;
<span class="predefined-type">int</span> i;
<span class="keyword">struct</span> my_unpacked_struct s;
};</code></pre>
</div>
</div>
<div class="paragraph">
<p>You may only specify this attribute on the definition of an enum, struct or
union, not on a <code>typedef</code> which does not also define the enumerated type,
structure or union.</p>
</div>
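<div class="paragraph">
<p>The layout of the <code>my_packed_struct</code> example can be checked with a host C compiler that accepts the same attribute syntax.
The sketch below assumes GCC or Clang and a 4-byte <code>int</code> with 4-byte natural alignment; the helper function names are hypothetical.
It shows that packing removes the padding between the outer members while the nested <code>my_unpacked_struct</code> keeps its own internal padding.</p>
</div>

```c
#include <assert.h>
#include <stddef.h>

/* The two types from the text, unchanged. */
struct my_unpacked_struct
{
    char c;
    int i;
};

struct __attribute__ ((packed)) my_packed_struct
{
    char c;
    int i;
    struct my_unpacked_struct s;
};

/* Hypothetical helpers that expose the resulting layout for inspection. */
static size_t unpacked_size(void) { return sizeof(struct my_unpacked_struct); }
static size_t packed_size(void) { return sizeof(struct my_packed_struct); }
static size_t packed_offset_of_i(void)
{
    return offsetof(struct my_packed_struct, i);
}
static size_t packed_offset_of_s(void)
{
    return offsetof(struct my_packed_struct, s);
}
```

<div class="paragraph">
<p>Under the stated assumptions the unpacked struct occupies 8 bytes (3 bytes of padding after <code>c</code>), while the packed struct occupies 1 + 4 + 8 = 13 bytes, with <code>i</code> at offset 1 and <code>s</code> at offset 5.</p>
</div>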
</div>
<div class="sect3">
<h4 id="specifying-attributes-of-functions"><a class="anchor" href="#specifying-attributes-of-functions"></a>6.13.2. Specifying Attributes of Functions</h4>
<div class="paragraph">
<p>See <a href="#function-qualifiers">Function Qualifiers</a> for the function attribute
qualifiers currently supported.</p>
</div>
</div>
<div class="sect3">
<h4 id="specifying-attributes-of-variables"><a class="anchor" href="#specifying-attributes-of-variables"></a>6.13.3. Specifying Attributes of Variables</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The keyword <code>__attribute__</code> allows you to specify special attributes of
variables or structure fields.
This keyword is followed by an attribute specification inside double
parentheses.
The following attribute qualifiers are currently defined:</p>
</div>
<div class="dlist">
<dl>
<dt class="hdlist1"><code>aligned (<em>alignment</em>)</code> </dt>
<dd>
<p>This attribute specifies a minimum alignment for the variable or structure
field, measured in bytes.
For example, the declaration:</p>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">int</span> x __attribute__ ((aligned (<span class="integer">16</span>))) = <span class="integer">0</span>;</code></pre>
</div>
</div>
<div class="paragraph">
<p>causes the compiler to allocate the global variable <code>x</code> on a 16-byte
boundary.
The alignment value specified must be a power of two.</p>
</div>
<div class="paragraph">
<p>You can also specify the alignment of structure fields.
For example, to create a double-word aligned <code>int</code> pair, you could write:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="keyword">struct</span> foo { <span class="predefined-type">int</span> x[<span class="integer">2</span>] __attribute__ ((aligned (<span class="integer">8</span>))); };</code></pre>
</div>
</div>
<div class="paragraph">
<p>This is an alternative to creating a union with a <code>double</code> member that
forces the union to be double-word aligned.</p>
</div>
<div class="paragraph">
<p>As in the preceding examples, you can explicitly specify the alignment (in
bytes) that you wish the compiler to use for a given variable or structure
field.
Alternatively, you can leave out the alignment factor and just ask the
compiler to align a variable or field to the maximum useful alignment for
the target machine you are compiling for.
For example, you could write:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">short</span> array[<span class="integer">3</span>] __attribute__ ((aligned));</code></pre>
</div>
</div>
<div class="paragraph">
<p>Whenever you leave out the alignment factor in an aligned attribute
specification, the OpenCL compiler automatically sets the alignment for the
declared variable or field to the largest alignment which is ever used for
any data type on the target device you are compiling for.</p>
</div>
<div class="paragraph">
<p>When used on a struct, or struct member, the aligned attribute can only
increase the alignment; in order to decrease it, the packed attribute must
be specified as well.
When used as part of a <code>typedef</code>, the aligned attribute can both increase
and decrease alignment, and specifying the packed attribute will generate a
warning.</p>
</div>
<div class="paragraph">
<p>Note that the effectiveness of aligned attributes may be limited by inherent
limitations of the OpenCL device and compiler.
For some devices, the OpenCL compiler may only be able to arrange for
variables to be aligned up to a certain maximum alignment.
If the OpenCL compiler is only able to align variables up to a maximum of
8-byte alignment, then specifying <code>aligned(16)</code> in an <code>__attribute__</code> will
still only provide you with 8-byte alignment.
See your platform-specific documentation for further information.</p>
</div>
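<div class="paragraph">
<p>The declarations above can be checked on a host compiler.
The following is a minimal sketch, assuming host C11 built with GCC or Clang (which accept the same <code>__attribute__</code> spelling) rather than an OpenCL C device compiler; the helper <code>is_aligned</code> is a hypothetical name introduced only for this illustration.</p>
</div>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same declarations as in the text, compiled here as host C (assumption:
   GCC or Clang, not an OpenCL C device compiler). */
int x __attribute__ ((aligned (16))) = 0;

struct foo { int x[2] __attribute__ ((aligned (8))); };

/* No alignment factor: the compiler chooses the alignment itself. */
short array[3] __attribute__ ((aligned));

/* Hypothetical helper: returns 1 if address p is a multiple of alignment.
   alignment must be a power of two, as the attribute itself requires. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

<div class="paragraph">
<p>With these definitions, <code>is_aligned(&amp;x, 16)</code> is expected to be non-zero and <code>_Alignof(struct foo)</code> is expected to be 8 on such a host compiler.
A device compiler may cap the alignment as described above.</p>
</div>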
</dd>
<dt class="hdlist1"><code>packed</code> </dt>
<dd>
<p>The packed attribute specifies that a variable or structure field should
have the smallest possible alignment&#8201;&#8212;&#8201;one byte for a variable, unless you
specify a larger value with the aligned attribute.</p>
<div class="paragraph">
<p>Here is a structure in which the field <code>x</code> is packed, so that it immediately
follows <code>a</code>:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="keyword">struct</span> foo
{
<span class="predefined-type">char</span> a;
<span class="predefined-type">int</span> x[<span class="integer">2</span>] __attribute__ ((packed));
};</code></pre>
</div>
</div>
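<div class="paragraph">
<p>The effect of <code>packed</code> on member placement can be observed with <code>offsetof</code>.
This is a sketch under the assumption of a host C11 compiler such as GCC or Clang; the type names <code>foo_packed</code> and <code>foo_plain</code> and the helper functions are hypothetical and exist only for comparison.</p>
</div>

```c
#include <assert.h>
#include <stddef.h>

/* The structure from the text: x is packed and so directly follows a. */
struct foo_packed
{
    char a;
    int x[2] __attribute__ ((packed));
};

/* Hypothetical counterpart without the attribute, for comparison. */
struct foo_plain
{
    char a;
    int x[2];
};

/* Byte offset of x in each layout. */
size_t packed_offset(void) { return offsetof(struct foo_packed, x); }
size_t plain_offset(void)  { return offsetof(struct foo_plain, x); }

/* Sanity check: packing never moves x further from the start. */
void check_layout(void)
{
    assert(packed_offset() <= plain_offset());
}
```

<div class="paragraph">
<p>In the packed layout <code>x</code> starts at byte offset 1, directly after <code>a</code>; in the plain layout padding is inserted so that <code>x</code> starts at the natural alignment of <code>int</code>.</p>
</div>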
<div class="paragraph">
<p>An attribute list placed at the beginning of a user-defined type applies to
the variable of that type and not the type, while attributes following the
type body apply to the type.</p>
</div>
<div class="paragraph">
<p>For example:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">/* a has alignment of 128 */</span>
__attribute__((aligned(<span class="integer">128</span>))) <span class="keyword">struct</span> A {<span class="predefined-type">int</span> i;} a;
<span class="comment">/* b has alignment of 16 */</span>
__attribute__((aligned(<span class="integer">16</span>))) <span class="keyword">struct</span> B {<span class="predefined-type">double</span> d;}
__attribute__((aligned(<span class="integer">32</span>))) b ;
<span class="keyword">struct</span> A a1; <span class="comment">/* a1 has alignment of 4 */</span>
<span class="keyword">struct</span> B b1; <span class="comment">/* b1 has alignment of 32 */</span></code></pre>
</div>
</div>
</dd>
<dt class="hdlist1"><code>endian (<em>endiantype</em>)</code> </dt>
<dd>
<p>The endian attribute determines the byte ordering of a variable.
<em>endiantype</em> can be set to <code>host</code>, indicating that the variable uses the
endianness of the host processor, or to <code>device</code>, indicating that the
variable uses the endianness of the device on which the kernel will be
executed.
The default is <code>device</code>.</p>
<div class="paragraph">
<p>For example:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">global float4 *p __attribute__ ((endian(host)));</code></pre>
</div>
</div>
<div class="paragraph">
<p>specifies that data stored in memory pointed to by <code>p</code> will be in the host
endian format.</p>
</div>
<div class="paragraph">
<p>The endian attribute can only be applied to pointer types that are in the
<code>global</code> or <code>constant</code> address space.
The endian attribute cannot be used for variables that are not a pointer
type.
The endian attribute value for both pointers must be the same when one
pointer is assigned to another.</p>
</div>
</dd>
<dt class="hdlist1"><code>nosvm</code> </dt>
<dd>
<p>The <code>nosvm</code> attribute can be used with a pointer variable to inform the
compiler that the pointer does not refer to a shared virtual memory region.
<a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer.</p>
</dd>
</dl>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>The <code>nosvm</code> attribute is deprecated, and the compiler can ignore it.</p>
</div>
</td>
</tr>
</table>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="specifying-attributes-of-blocks-and-control-flow-statements"><a class="anchor" href="#specifying-attributes-of-blocks-and-control-flow-statements"></a>6.13.4. Specifying Attributes of Blocks and Control-Flow-Statements</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>For basic blocks and control-flow-statements the attribute is placed before
the structure in question, for example:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">__attribute__((attr1)) {...}
<span class="keyword">for</span> __attribute__((attr2)) (...) __attribute__((attr3)) {...}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Here <code>attr1</code> applies to the block in braces and <code>attr2</code> and <code>attr3</code> apply to
the loop&#8217;s control construct and body, respectively.</p>
</div>
<div class="paragraph">
<p>No attribute qualifiers for blocks and control-flow-statements are currently
defined.</p>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="specifying-attribute-for-unrolling-loops"><a class="anchor" href="#specifying-attribute-for-unrolling-loops"></a>6.13.5. Specifying Attribute For Unrolling Loops</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>__attribute__((opencl_unroll_hint))</code> and
<code>__attribute__((opencl_unroll_hint(n)))</code> attribute qualifiers can be used
to specify that a loop (for, while and do loops) can be unrolled.
This attribute qualifier can be used to specify full unrolling or partial
unrolling by a specified amount.
This is a compiler hint and the compiler may ignore this directive.</p>
</div>
<div class="paragraph">
<p><code>n</code> is the loop unrolling factor and must be a positive integral compile-time
constant expression.
An unroll factor of 1 disables unrolling.
If n is not specified, the compiler determines the unrolling factor for the
loop.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>The <code>__attribute__((opencl_unroll_hint(n)))</code> attribute qualifier must
appear immediately before the loop to be affected.</p>
</div>
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">__attribute__((opencl_unroll_hint(<span class="integer">2</span>)))
<span class="keyword">while</span> (*s != <span class="integer">0</span>)
*p++ = *s++;</code></pre>
</div>
</div>
<div class="paragraph">
<p>This tells the compiler to unroll the above while loop by a factor of 2.</p>
</div>
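<div class="paragraph">
<p>What unrolling by a factor of 2 can mean for a loop like the while loop above is sketched below in plain C.
This is only an illustration of the transformation: a compiler is free to unroll differently or not at all, and the function names <code>copy_rolled</code> and <code>copy_unrolled2</code> are hypothetical.</p>
</div>

```c
#include <assert.h>
#include <stddef.h>

/* Reference loop, as in the text: copies until the terminating 0 is seen
   (the terminator itself is not copied). Returns the advanced destination. */
char *copy_rolled(char *p, const char *s)
{
    assert(p != NULL && s != NULL);
    while (*s != 0)
        *p++ = *s++;
    return p;
}

/* Hand-unrolled by 2: each trip through the loop performs up to two copies.
   The exit test is repeated between them because the trip count is unknown. */
char *copy_unrolled2(char *p, const char *s)
{
    assert(p != NULL && s != NULL);
    while (*s != 0) {
        *p++ = *s++;
        if (*s == 0)
            break;
        *p++ = *s++;
    }
    return p;
}
```

<div class="paragraph">
<p>Both functions copy the same bytes; the unrolled form simply evaluates the loop back-edge half as often.</p>
</div>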
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">__attribute__((opencl_unroll_hint))
<span class="keyword">for</span> (<span class="predefined-type">int</span> i=<span class="integer">0</span>; i&lt;<span class="integer">2</span>; i++)
{
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>In the example above, the compiler will determine how much to unroll the
loop.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">__attribute__((opencl_unroll_hint(<span class="integer">1</span>)))
<span class="keyword">for</span> (<span class="predefined-type">int</span> i=<span class="integer">0</span>; i&lt;<span class="integer">32</span>; i++)
{
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>The above is an example where the loop should not be unrolled.</p>
</div>
<div class="paragraph">
<p>Below are some examples of invalid usage of
<code>__attribute__((opencl_unroll_hint(n)))</code>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">__attribute__((opencl_unroll_hint(-<span class="integer">1</span>)))
<span class="keyword">while</span> (...)
{
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>The above example is an invalid usage of the loop unroll factor as the loop
unroll factor is negative.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">__attribute__((opencl_unroll_hint))
<span class="keyword">if</span> (...)
{
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>The above example is invalid because the unroll attribute qualifier is used
on a non-loop construct.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span>
my_kernel( ... )
{
<span class="predefined-type">int</span> x;
__attribute__((opencl_unroll_hint(x)))
<span class="keyword">for</span> (<span class="predefined-type">int</span> i=<span class="integer">0</span>; i&lt;x; i++)
{
...
}
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>The above example is invalid because the loop unroll factor is not a
compile-time constant expression.</p>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="extending-attribute-qualifiers"><a class="anchor" href="#extending-attribute-qualifiers"></a>6.13.6. Extending Attribute Qualifiers</h4>
<div class="paragraph">
<p>The attribute syntax can be extended for standard language extensions and
vendor specific extensions.
Any extensions should follow the naming conventions outlined in the
introduction to <a href="#opencl-extension-spec">section 9 in the OpenCL 2.0
Extension Specification</a>.</p>
</div>
<div class="paragraph">
<p>Attributes are intended as useful hints to the compiler.
It is our intention that a particular implementation of OpenCL be free to
ignore all attributes and the resulting executable binary will produce the
same result.
This does not preclude an implementation from making use of the additional
information provided by attributes and performing optimizations or other
transformations as it sees fit.
In this case it is the programmer&#8217;s responsibility to guarantee that the
information provided is in some sense correct.</p>
</div>
</div>
</div>
<div class="sect2">
<h3 id="blocks"><a class="anchor" href="#blocks"></a>6.14. Blocks</h3>
<div class="openblock">
<div class="content">
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The functionality described in this section <a href="#unified-spec">requires</a>
support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the
<code>__opencl_c_device_enqueue</code> feature.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>This section describes the clang block syntax
<sup class="footnote">[<a id="_footnoteref_29" class="footnote" href="#_footnotedef_29" title="View footnote.">29</a>]</sup>.</p>
</div>
<div class="paragraph">
<p>Like a function type, a Block type is a pair consisting of a result value
type and a list of parameter types.
Blocks are intended to be used much like functions with the key distinction
being that in addition to executable code they also contain various variable
bindings to automatic (stack) or <code>global</code> memory.</p>
</div>
</div>
</div>
<div class="sect3">
<h4 id="declaring-and-using-a-block"><a class="anchor" href="#declaring-and-using-a-block"></a>6.14.1. Declaring and Using a Block</h4>
<div class="paragraph">
<p>You use the ^ operator to declare a Block variable and to indicate the
beginning of a Block literal.
The body of the Block itself is contained within {}, as shown in this
example (as usual with C, ; indicates the end of the statement):</p>
</div>
<div class="paragraph">
<p>The example is explained in the following illustration:</p>
</div>
<div class="paragraph">
<p><span class="image"><img src="" alt="block example" title="Block Example"></span></p>
</div>
<div class="paragraph">
<p>Notice that the Block is able to make use of variables from the same scope
in which it was defined.</p>
</div>
<div class="paragraph">
<p>If you declare a Block as a variable, you can then use it just as you would
a function:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">int</span> multiplier = <span class="integer">7</span>;
<span class="predefined-type">int</span> (^myBlock)(<span class="predefined-type">int</span>) = ^(<span class="predefined-type">int</span> num) {
<span class="keyword">return</span> num * multiplier;
};
printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">%d</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, myBlock(<span class="integer">3</span>));
<span class="comment">// prints 21</span></code></pre>
</div>
</div>
</div>
<div class="sect3">
<h4 id="declaring-a-block-reference"><a class="anchor" href="#declaring-a-block-reference"></a>6.14.2. Declaring a Block Reference</h4>
<div class="paragraph">
<p>Block variables hold references to Blocks.
You declare them using syntax similar to that you use to declare a pointer
to a function, except that you use ^ instead of *.
The Block type fully interoperates with the rest of the C type system.
The following are valid Block variable declarations:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="directive">void</span> (^blockReturningVoidWithVoidArgument)(<span class="directive">void</span>);
<span class="predefined-type">int</span> (^blockReturningIntWithIntAndCharArguments)(<span class="predefined-type">int</span>, <span class="predefined-type">char</span>);</code></pre>
</div>
</div>
<div class="paragraph">
<p>A Block that takes no arguments must specify <code>void</code> in the argument list.
A Block reference may not be dereferenced via the pointer dereference
operation *, and thus a Block&#8217;s size may not be computed at compile time.</p>
</div>
<div class="paragraph">
<p>Blocks are designed to be fully type safe by giving the compiler a full set
of metadata to use to validate use of Blocks, parameters passed to blocks,
and assignment of the return value.</p>
</div>
<div class="paragraph">
<p>You can also create types for Blocks&#8201;&#8212;&#8201;doing so is generally considered to
be best practice when you use a block with a given signature in multiple
places:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="keyword">typedef</span> <span class="predefined-type">float</span> (^MyBlockType)(<span class="predefined-type">float</span>, <span class="predefined-type">float</span>);
MyBlockType myFirstBlock = <span class="comment">/* ... */</span>;
MyBlockType mySecondBlock = <span class="comment">/* ... */</span>;</code></pre>
</div>
</div>
</div>
<div class="sect3">
<h4 id="block-literal-expressions"><a class="anchor" href="#block-literal-expressions"></a>6.14.3. Block Literal Expressions</h4>
<div class="paragraph">
<p>A Block literal expression produces a reference to a Block.
It is introduced by the use of the <strong>^</strong> token as a unary operator.</p>
</div>
<div class="openblock bnf">
<div class="content">
<div class="dlist">
<dl>
<dt class="hdlist1">Block_literal_expression : </dt>
<dd>
<p>^ <em>block_decl</em> <em>compound_statement_body</em></p>
</dd>
<dt class="hdlist1"><em>block_decl</em> : </dt>
<dd>
<p>empty<br>
<em>parameter_list</em><br>
<em>type_expression</em></p>
</dd>
</dl>
</div>
</div>
</div>
<div class="paragraph">
<p>where <em>type_expression</em> is extended to allow ^ as a Block reference where *
is allowed as a function reference.</p>
</div>
<div class="paragraph">
<p>The following Block literal:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">^ <span class="directive">void</span> (<span class="directive">void</span>) { printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">hello world</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>); }</code></pre>
</div>
</div>
<div class="paragraph">
<p>produces a reference to a Block with no arguments and no return value.</p>
</div>
<div class="paragraph">
<p>The return type is optional and is inferred from the return statements.
If the return statements return a value, they all must return a value of the
same type.
If there is no value returned the inferred type of the Block is <code>void</code>;
otherwise it is the type of the return statement value.
If the return type is omitted and the argument list is <code>( void )</code>, the <code>(
void )</code> argument list may also be omitted.</p>
</div>
<div class="paragraph">
<p>So:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">^ ( <span class="directive">void</span> ) { printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">hello world</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>); }</code></pre>
</div>
</div>
<div class="paragraph">
<p>and:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">^ { printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">hello world</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>); }</code></pre>
</div>
</div>
<div class="paragraph">
<p>are exactly equivalent constructs for the same expression.</p>
</div>
<div class="paragraph">
<p>The compound statement body establishes a new lexical scope within that of
its parent.
Variables used within the scope of the compound statement are bound to the
Block in the normal manner with the exception of those in automatic (stack)
storage.
Thus one may access functions and global variables as one would expect, as
well as <code>static</code> local variables.</p>
</div>
<div class="paragraph">
<p>Local automatic (stack) variables referenced within the compound statement
of a Block are imported and captured by the Block as const copies.
The capture (binding) is performed at the time of the Block literal
expression evaluation.</p>
</div>
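<div class="paragraph">
<p>The capture rule above (a const copy taken when the Block literal is evaluated) can be modelled in plain C without Block syntax.
The sketch below is an assumption-laden emulation, not how any OpenCL implementation is required to represent Blocks; <code>struct int_block</code>, <code>make_multiplier_block</code> and <code>call_block</code> are hypothetical names.</p>
</div>

```c
#include <assert.h>

/* Hypothetical stand-in for a Block: a code pointer plus captured state. */
struct int_block {
    int (*invoke)(const struct int_block *self, int num);
    const int multiplier;   /* const copy of the captured automatic variable */
};

/* Body of the emulated Block: uses only the captured copy. */
static int multiply_invoke(const struct int_block *self, int num)
{
    return num * self->multiplier;
}

/* Models evaluation of the Block literal: the current value of the
   variable is copied into the Block at this point. */
struct int_block make_multiplier_block(int multiplier)
{
    struct int_block b = { multiply_invoke, multiplier };
    return b;
}

/* Models calling the Block. */
int call_block(const struct int_block *b, int num)
{
    assert(b != 0 && b->invoke != 0);
    return b->invoke(b, num);
}
```

<div class="paragraph">
<p>If the emulated Block is created while the variable holds 7 and the variable is later changed to 10, <code>call_block</code> with argument 3 still yields 21, because the Block holds its own copy.</p>
</div>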
<div class="paragraph">
<p>The compiler is not required to capture a variable if it can prove that no
references to the variable will actually be evaluated.</p>
</div>
<div class="paragraph">
<p>The lifetime of variables declared in a Block is that of a function.</p>
</div>
<div class="paragraph">
<p>Block literal expressions may occur within Block literal expressions
(nested) and all variables captured by any nested blocks are implicitly also
captured in the scopes of their enclosing Blocks.</p>
</div>
<div class="paragraph">
<p>A Block literal expression may be used as the initialization value for Block
variables at global or local <code>static</code> scope.</p>
</div>
<div class="paragraph">
<p>You can also declare a Block as a global literal in program scope.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">int</span> GlobalInt = <span class="integer">0</span>;
<span class="predefined-type">int</span> (^getGlobalInt)(<span class="directive">void</span>) = ^{ <span class="keyword">return</span> GlobalInt; };</code></pre>
</div>
</div>
</div>
<div class="sect3">
<h4 id="control-flow"><a class="anchor" href="#control-flow"></a>6.14.4. Control Flow</h4>
<div class="paragraph">
<p>The compound statement of a Block is treated much like a function body with
respect to control flow in that continue, break and goto do not escape the
Block.</p>
</div>
</div>
<div class="sect3">
<h4 id="restrictions-1"><a class="anchor" href="#restrictions-1"></a>6.14.5. Restrictions</h4>
<div class="paragraph">
<p>The following Blocks features are currently not supported in OpenCL C.</p>
</div>
<div class="ulist">
<ul>
<li>
<p>The <code>__block</code> storage type.</p>
</li>
<li>
<p>The <strong>Block_copy</strong>() and <strong>Block_release</strong>() functions that copy and release
Blocks.</p>
</li>
<li>
<p>Blocks with variadic arguments.</p>
</li>
<li>
<p>Arrays of Blocks.</p>
</li>
<li>
<p>Blocks as structure and union members.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Block literals are assumed to allocate memory at the point of definition and
to be destroyed at the end of the same scope.
To support these behaviors, the following restrictions
<sup class="footnote">[<a id="_footnoteref_30" class="footnote" href="#_footnotedef_30" title="View footnote.">30</a>]</sup> apply in addition to the feature
restrictions above:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Block variables must be defined and used in a way that allows them to be
statically determinable at build or &#8220;link to executable&#8221; time.
In particular:</p>
<div class="ulist">
<ul>
<li>
<p>Block variables assigned in one scope must be used only within the same
or any nested scope.</p>
</li>
<li>
<p>The <code>extern</code> storage-class specifier cannot be used with program scope
block variables.</p>
</li>
<li>
<p>Block variable declarations are implicitly qualified with const.
Therefore all block variables must be initialized at declaration time
and may not be reassigned.</p>
</li>
<li>
<p>A block cannot be a return value or a parameter of a function.</p>
</li>
<li>
<p>Blocks cannot be used as expressions of the ternary selection operator
(<strong>?:</strong>).</p>
</li>
</ul>
</div>
</li>
<li>
<p>The unary operators (<strong>*</strong>) and (<strong>&amp;</strong>) cannot be used with a Block.</p>
</li>
<li>
<p>Pointers to Blocks are not allowed.</p>
</li>
<li>
<p>A Block cannot capture another Block variable declared in the outer
scope (Example 4).</p>
</li>
<li>
<p>Block capture semantics follow the regular C argument-passing convention,
i.e. arrays are captured by reference (decayed to pointers) and structs
are captured by value (Example 5).</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Some examples of legal and illegal usage of Blocks in OpenCL C are given
below.</p>
</div>
<div class="paragraph">
<p>Example 1:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="directive">void</span> foo(<span class="predefined-type">int</span> *x, <span class="predefined-type">int</span> (^bar)(<span class="predefined-type">int</span>, <span class="predefined-type">int</span>))
{
*x = bar(*x, *x);
}
kernel
<span class="directive">void</span> k(global <span class="predefined-type">int</span> *x, global <span class="predefined-type">int</span> *z)
{
<span class="keyword">if</span> (some expression)
foo(x, ^<span class="predefined-type">int</span>(<span class="predefined-type">int</span> x, <span class="predefined-type">int</span> y){<span class="keyword">return</span> x+y+*z;}); <span class="comment">// legal</span>
<span class="keyword">else</span>
foo(x, ^<span class="predefined-type">int</span>(<span class="predefined-type">int</span> x, <span class="predefined-type">int</span> y){<span class="keyword">return</span> (x*y)-*z;}); <span class="comment">// legal</span>
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Example 2:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel
<span class="directive">void</span> k(global <span class="predefined-type">int</span> *x, global <span class="predefined-type">int</span> *z)
{
<span class="predefined-type">int</span> (^tmp)(<span class="predefined-type">int</span>, <span class="predefined-type">int</span>);
<span class="keyword">if</span> (some expression)
{
tmp = ^<span class="predefined-type">int</span>(<span class="predefined-type">int</span> x, <span class="predefined-type">int</span> y){<span class="keyword">return</span> x+y+*z;}; <span class="comment">// illegal</span>
}
*x = foo(x, tmp);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Example 3:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="predefined-type">int</span> GlobalInt = <span class="integer">0</span>;
<span class="predefined-type">int</span> (^getGlobalInt)(<span class="directive">void</span>) = ^{ <span class="keyword">return</span> GlobalInt; }; <span class="comment">// legal</span>
<span class="predefined-type">int</span> (^getAnotherGlobalInt)(<span class="directive">void</span>); <span class="comment">// illegal</span>
<span class="directive">extern</span> <span class="predefined-type">int</span> (^getExternGlobalInt)(<span class="directive">void</span>); <span class="comment">// illegal</span>
<span class="directive">void</span> foo()
{
...
getGlobalInt = ^{ <span class="keyword">return</span> <span class="integer">0</span>; }; <span class="comment">// illegal - cannot assign to</span>
<span class="comment">// a global block variable</span>
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Example 4:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="directive">void</span> (^bl0)(<span class="directive">void</span>) = ^{
...
};
kernel <span class="directive">void</span> k()
{
<span class="directive">void</span>(^bl1)(<span class="directive">void</span>) = ^{
...
};
<span class="directive">void</span>(^bl2)(<span class="directive">void</span>) = ^{
bl0(); <span class="comment">// legal because bl0 is a global</span>
<span class="comment">// variable available in this scope</span>
bl1(); <span class="comment">// illegal because bl1 would have to be captured</span>
};
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Example 5:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="keyword">struct</span> v {
<span class="predefined-type">int</span> arr[<span class="integer">2</span>];
} s = {<span class="integer">0</span>, <span class="integer">1</span>};
<span class="directive">void</span> (^bl1)() = ^(){printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">%d</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, s.arr[<span class="integer">1</span>]);};
<span class="comment">// array content copied into captured struct location</span>
<span class="predefined-type">int</span> arr[<span class="integer">2</span>] = {<span class="integer">0</span>, <span class="integer">1</span>};
<span class="directive">void</span> (^bl2)() = ^(){printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">%d</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, arr[<span class="integer">1</span>]);};
<span class="comment">// array decayed to pointer while captured</span>
s.arr[<span class="integer">1</span>] = arr[<span class="integer">1</span>] = <span class="integer">8</span>;
bl1(); <span class="comment">// prints - 1</span>
bl2(); <span class="comment">// prints - 8</span></code></pre>
</div>
</div>
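<div class="paragraph">
<p>The difference between the two captures in Example 5 can be reproduced in plain C by treating each capture like an argument passed to a function.
This is a sketch only; the capture record types and helper functions below are hypothetical and merely mirror the argument-passing rule stated above.</p>
</div>

```c
#include <assert.h>

struct v {
    int arr[2];
};

/* Hypothetical capture records, mirroring C argument passing. */
struct capture_struct { struct v s; };       /* struct: copied by value */
struct capture_array  { const int *arr; };   /* array: decayed to a pointer */

/* Models capturing a struct: the whole struct, array member included,
   is copied into the capture record. */
struct capture_struct capture_s(struct v s)
{
    struct capture_struct c = { s };
    return c;
}

/* Models capturing an array: only the address of its first element is kept. */
struct capture_array capture_arr(const int *arr)
{
    struct capture_array c = { arr };
    assert(arr != 0);
    return c;
}

/* Read element 1 through each kind of capture. */
int read_s(const struct capture_struct *c)  { return c->s.arr[1]; }
int read_arr(const struct capture_array *c) { return c->arr[1]; }
```

<div class="paragraph">
<p>After both originals are modified, the struct capture still reads the old value while the array capture observes the new one, matching the <code>1</code> and <code>8</code> printed in Example 5.</p>
</div>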
</div>
</div>
<div class="sect2">
<h3 id="built-in-functions"><a class="anchor" href="#built-in-functions"></a>6.15. Built-in Functions</h3>
<div class="paragraph">
<p>The OpenCL C programming language provides a rich set of built-in functions
for scalar and vector operations.
Many of these functions have names similar to functions provided in common
C libraries, but they support scalar and vector argument types.
Applications should use the built-in functions wherever possible instead of
writing their own version.</p>
</div>
<div class="paragraph">
<p>User defined OpenCL C functions behave per C standard rules for functions as
defined in <a href="#C99-spec">section 6.9.1 of the C99 Specification</a>.
On entry to the function, the size of each variably modified parameter is
evaluated and the value of each argument expression is converted to the type
of the corresponding parameter as per the
<a href="#usual-arithmetic-conversions">usual arithmetic conversion rules</a>.
Built-in functions described in this section behave similarly, except that
in order to avoid ambiguity between multiple forms of the same built-in
function, implicit scalar widening shall not occur.
Note that some built-in functions described in this section do have forms
that operate on mixed scalar and vector types, however.</p>
</div>
<div class="sect3">
<h4 id="work-item-functions"><a class="anchor" href="#work-item-functions"></a>6.15.1. Work-Item Functions</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The following table describes the list of built-in work-item functions that
can be used to query the number of dimensions, the global and local work
size specified to <strong>clEnqueueNDRangeKernel</strong>, and the global and local
identifier of each work-item when this kernel is being executed on a device.</p>
</div>
<table id="table-work-item-functions" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 8. Built-in Work-Item Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">uint <strong>get_work_dim</strong>()</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the number of dimensions in use.
This is the value given to the <em>work_dim</em> argument specified in
<strong>clEnqueueNDRangeKernel</strong>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">size_t <strong>get_global_size</strong>(uint <em>dimindx</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the number of global work-items specified for dimension
identified by <em>dimindx</em>.
This value is given by the <em>global_work_size</em> argument to
<strong>clEnqueueNDRangeKernel</strong>.</p>
<p class="tableblock"> Valid values of <em>dimindx</em> are 0 to <strong>get_work_dim</strong>() - 1.
For other values of <em>dimindx</em>, <strong>get_global_size</strong>() returns 1.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">size_t <strong>get_global_id</strong>(uint <em>dimindx</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the unique global work-item ID value for the dimension identified
by <em>dimindx</em>.
The global work-item ID specifies the work-item ID based on the number
of global work-items specified to execute the kernel.</p>
<p class="tableblock"> Valid values of <em>dimindx</em> are 0 to <strong>get_work_dim</strong>() - 1.
For other values of <em>dimindx</em>, <strong>get_global_id</strong>() returns 0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">size_t <strong>get_local_size</strong>(uint <em>dimindx</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the number of local work-items specified in the dimension
identified by <em>dimindx</em>.
This value is at most the value given by the <em>local_work_size</em>
argument to <strong>clEnqueueNDRangeKernel</strong> if <em>local_work_size</em> is not
<code>NULL</code>; otherwise the OpenCL implementation chooses an appropriate
<em>local_work_size</em> value which is returned by this function.
If the kernel is executed with a non-uniform work-group size
<sup class="footnote">[<a id="_footnoteref_31" class="footnote" href="#_footnotedef_31" title="View footnote.">31</a>]</sup>, calls to this built-in from some
work-groups may return different values than calls to this built-in from
other work-groups.</p>
<p class="tableblock"> Valid values of <em>dimindx</em> are 0 to <strong>get_work_dim</strong>() - 1.
For other values of <em>dimindx</em>, <strong>get_local_size</strong>() returns 1.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">size_t <strong>get_enqueued_local_size</strong>(
uint <em>dimindx</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the same value as that returned by <strong>get_local_size</strong>(<em>dimindx</em>)
if the kernel is executed with a uniform work-group size.</p>
<p class="tableblock"> If the kernel is executed with a non-uniform work-group size, returns
the number of local work-items in each of the work-groups that make up
the uniform region of the global range in the dimension identified by
<em>dimindx</em>.
If the <em>local_work_size</em> argument to <strong>clEnqueueNDRangeKernel</strong> is not
<code>NULL</code>, this value will match the value specified in
<em>local_work_size</em>[<em>dimindx</em>].
If <em>local_work_size</em> is <code>NULL</code>, this value will match the local size
that the implementation determined would be most efficient at
implementing the uniform region of the global range.</p>
<p class="tableblock"> Valid values of <em>dimindx</em> are 0 to <strong>get_work_dim</strong>() - 1.
For other values of <em>dimindx</em>, <strong>get_enqueued_local_size</strong>() returns 1.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL 2.0 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">size_t <strong>get_local_id</strong>(uint <em>dimindx</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the unique local work-item ID, i.e. the ID of the work-item within
its work-group, for the dimension identified by <em>dimindx</em>.</p>
<p class="tableblock"> Valid values of <em>dimindx</em> are 0 to <strong>get_work_dim</strong>() - 1.
For other values of <em>dimindx</em>, <strong>get_local_id</strong>() returns 0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">size_t <strong>get_num_groups</strong>(uint <em>dimindx</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the number of work-groups that will execute a kernel for
the dimension identified by <em>dimindx</em>.</p>
<p class="tableblock"> Valid values of <em>dimindx</em> are 0 to <strong>get_work_dim</strong>() - 1.
For other values of <em>dimindx</em>, <strong>get_num_groups</strong>() returns 1.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">size_t <strong>get_group_id</strong>(uint <em>dimindx</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>get_group_id</strong> returns the work-group ID which is a number from 0 ..
<strong>get_num_groups</strong>(<em>dimindx</em>) - 1.</p>
<p class="tableblock"> Valid values of <em>dimindx</em> are 0 to <strong>get_work_dim</strong>() - 1.
For other values, <strong>get_group_id</strong>() returns 0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">size_t <strong>get_global_offset</strong>(uint <em>dimindx</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>get_global_offset</strong> returns the offset values specified in the
<em>global_work_offset</em> argument to <strong>clEnqueueNDRangeKernel</strong>.</p>
<p class="tableblock"> Valid values of <em>dimindx</em> are 0 to <strong>get_work_dim</strong>() - 1.
For other values, <strong>get_global_offset</strong>() returns 0.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.1 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">size_t <strong>get_global_linear_id</strong>()</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the work-item&#8217;s 1-dimensional global ID.</p>
<p class="tableblock"> For 1D work-groups, it is computed as <strong>get_global_id</strong>(0) -
<strong>get_global_offset</strong>(0).</p>
<p class="tableblock"> For 2D work-groups, it is computed as (<strong>get_global_id</strong>(1) -
<strong>get_global_offset</strong>(1)) * <strong>get_global_size</strong>(0) + (<strong>get_global_id</strong>(0) -
<strong>get_global_offset</strong>(0)).</p>
<p class="tableblock"> For 3D work-groups, it is computed as ((<strong>get_global_id</strong>(2) -
<strong>get_global_offset</strong>(2)) * <strong>get_global_size</strong>(1) * <strong>get_global_size</strong>(0))
+ ((<strong>get_global_id</strong>(1) - <strong>get_global_offset</strong>(1)) * <strong>get_global_size</strong>(0))
+ (<strong>get_global_id</strong>(0) - <strong>get_global_offset</strong>(0)).</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL 2.0 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">size_t <strong>get_local_linear_id</strong>()</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the work-item&#8217;s 1-dimensional local ID.</p>
<p class="tableblock"> For 1D work-groups, it is the same value as</p>
<p class="tableblock"> <strong>get_local_id</strong>(0).</p>
<p class="tableblock"> For 2D work-groups, it is computed as</p>
<p class="tableblock"> <strong>get_local_id</strong>(1) * <strong>get_local_size</strong>(0) + <strong>get_local_id</strong>(0).</p>
<p class="tableblock"> For 3D work-groups, it is computed as</p>
<p class="tableblock"> (<strong>get_local_id</strong>(2) * <strong>get_local_size</strong>(1) * <strong>get_local_size</strong>(0)) +
(<strong>get_local_id</strong>(1) * <strong>get_local_size</strong>(0)) + <strong>get_local_id</strong>(0).</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL 2.0 or newer.</p></td>
</tr>
</tbody>
</table>
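<div class="paragraph">
<p>The three <strong>get_global_linear_id</strong> formulas in the table above are the same expression in nested form, evaluated from the highest dimension down to dimension 0.
The following C sketch mirrors that arithmetic on the host.
It is an illustration only and is not part of the OpenCL C built-in library: the names <code>work_size_t</code> and <code>global_linear_id</code>, and the use of three-element arrays for the per-dimension values, are assumptions made for this example.</p>
</div>

```c
/* Hypothetical stand-in for size_t so that the sketch needs no headers. */
typedef unsigned long long work_size_t;

/*
 * Hypothetical host-side mirror of get_global_linear_id().
 *
 * work_dim : number of dimensions in use (1, 2 or 3)
 * id       : per-dimension values of get_global_id(d)
 * offset   : per-dimension values of get_global_offset(d)
 * size     : per-dimension values of get_global_size(d)
 *
 * The loop walks from the highest dimension down to dimension 0.
 * Each step multiplies the running value by the size of the current
 * dimension and adds the offset-corrected ID of that dimension, which
 * expands to the 1D, 2D and 3D formulas listed in the table.
 */
work_size_t global_linear_id(unsigned work_dim,
                             const work_size_t id[3],
                             const work_size_t offset[3],
                             const work_size_t size[3])
{
    work_size_t linear = 0;
    unsigned d;

    for (d = work_dim; d != 0; --d) {
        linear = linear * size[d - 1] + (id[d - 1] - offset[d - 1]);
    }
    return linear;
}
```

<div class="paragraph">
<p>For example, with a global size of (4, 5, 6), a global offset of (10, 20, 0) and a global ID of (11, 22, 3), the sketch computes 3 * 5 * 4 + 2 * 4 + 1 = 69.
Passing local IDs, local sizes and zero offsets to the same function reproduces the <strong>get_local_linear_id</strong> formulas.</p>
</div>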
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The functionality described in the following table <a href="#unified-spec">requires</a> support for OpenCL C 3.0 or newer and the <code>__opencl_c_subgroups</code>
feature.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The following table describes the list of built-in work-item functions that
can be used to query the size of a subgroup, the number of subgroups per work-group,
the identifier of the subgroup within a work-group, and the identifier of the
work-item within a subgroup when this kernel is being executed on a device.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<caption class="title">Table 9. Built-in Work-Item Functions for Subgroups</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top"><strong>Function</strong></th>
<th class="tableblock halign-left valign-top"><strong>Description</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><div class="content"><div class="paragraph">
<p>uint <strong>get_sub_group_size</strong>()</p>
</div></div></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the number of work items in the subgroup.
This value is no more than the maximum subgroup size and is
implementation-defined based on a combination of the compiled kernel and
the dispatch dimensions.
This will be a constant value for the lifetime of the subgroup.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><div class="content"><div class="paragraph">
<p>uint <strong>get_max_sub_group_size</strong>()</p>
</div></div></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the maximum size of a subgroup within the dispatch.
This value will be invariant for a given set of dispatch dimensions and a
kernel object compiled for a given device.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><div class="content"><div class="paragraph">
<p>uint <strong>get_num_sub_groups</strong>()</p>
</div></div></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the number of subgroups that the current work group is divided
into.</p>
<p class="tableblock"> This number will be constant for the duration of a work group&#8217;s execution.
If the kernel is executed with a non-uniform work group size
(i.e. the <em>global_work_size</em> values specified to <strong>clEnqueueNDRangeKernel</strong>
are not evenly divisible by the <em>local_work_size</em> values for any dimension),
calls to this built-in from some work groups may return different values
than calls to this built-in from other work groups.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><div class="content"><div class="paragraph">
<p>uint <strong>get_enqueued_num_sub_groups</strong>()</p>
</div></div></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the same value as that returned by <strong>get_num_sub_groups</strong> if the
kernel is executed with a uniform work group size.</p>
<p class="tableblock"> If the kernel is executed with a non-uniform work group size, returns the
number of subgroups in each of the work groups that make up the uniform
region of the global range.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><div class="content"><div class="paragraph">
<p>uint <strong>get_sub_group_id</strong>()</p>
</div></div></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>get_sub_group_id</strong> returns the subgroup ID which is a number from 0 ..
<strong>get_num_sub_groups</strong>() - 1.</p>
<p class="tableblock"> For <strong>clEnqueueTask</strong>, this returns 0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><div class="content"><div class="paragraph">
<p>uint <strong>get_sub_group_local_id</strong>()</p>
</div></div></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the unique work item ID within the current subgroup.
The mapping from <strong>get_local_id</strong>(<em>dimindx</em>) to <strong>get_sub_group_local_id</strong>
will be invariant for the lifetime of the work group.</p></td>
</tr>
</tbody>
</table>
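<div class="paragraph">
<p>The non-uniform work-group case referred to in both tables above can be made concrete with a little integer arithmetic.
The C sketch below models a single dimension and assumes that the remainder work-items form the last work-group of that dimension.
The names <code>work_size_t</code>, <code>num_groups</code> and <code>local_size_of_group</code> are hypothetical helpers for this example, not OpenCL built-ins.
Subgroup counts are implementation-defined, so they are deliberately not modeled.</p>
</div>

```c
/* Hypothetical stand-in for size_t so that the sketch needs no headers. */
typedef unsigned long long work_size_t;

/*
 * Number of work-groups in one dimension, as get_num_groups() would
 * report it: the global size divided by the enqueued local size,
 * rounded up so that a partial work-group is counted.
 */
work_size_t num_groups(work_size_t global_size,
                       work_size_t enqueued_local_size)
{
    return (global_size + enqueued_local_size - 1) / enqueued_local_size;
}

/*
 * Local size seen by work-group `group_id` in one dimension, as
 * get_local_size() would report it. Assumption: only the last
 * work-group of the dimension can be smaller, and it holds the
 * remainder of the division. Every other work-group reports the
 * enqueued local size, which is also what get_enqueued_local_size()
 * returns in every work-group.
 */
work_size_t local_size_of_group(work_size_t global_size,
                                work_size_t enqueued_local_size,
                                work_size_t group_id)
{
    work_size_t remainder = global_size % enqueued_local_size;
    work_size_t last = num_groups(global_size, enqueued_local_size) - 1;

    if (group_id != last || remainder == 0) {
        return enqueued_local_size;
    }
    return remainder;
}
```

<div class="paragraph">
<p>With a global size of 10 and an enqueued local size of 4, the sketch gives three work-groups: the first two report a local size of 4 and the last reports 2.
With a global size of 8 the division is exact, so every work-group reports 4 and the uniform and non-uniform queries agree.</p>
</div>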
</div>
</div>
</div>
<div class="sect3">
<h4 id="math-functions"><a class="anchor" href="#math-functions"></a>6.15.2. Math Functions</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The built-in math functions are categorized into the following:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>A list of built-in functions that have scalar or vector argument
versions, and,</p>
</li>
<li>
<p>A list of built-in functions that only take scalar <code>float</code> arguments.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The vector versions of the math functions operate component-wise.
The description is per-component.</p>
</div>
<div class="paragraph">
<p>The built-in math functions are not affected by the prevailing rounding mode
in the calling environment, and always return the same value as they would
if called with the round to nearest even rounding mode.</p>
</div>
<div class="paragraph">
<p>The <a href="#table-builtin-math">following table</a> describes the list of built-in
math functions that can take scalar or vector arguments.
We use the generic type name <code>gentype</code> to indicate that the function can take
<code>float</code>, <code>float2</code>, <code>float3</code>, <code>float4</code>, <code>float8</code>, <code>float16</code>, <code>double</code>
<sup class="footnote" id="_footnote_double-supported">[<a id="_footnoteref_32" class="footnote" href="#_footnotedef_32" title="View footnote.">32</a>]</sup>, <code>double2</code>,
<code>double3</code>, <code>double4</code>, <code>double8</code> or <code>double16</code> as the type for the arguments.
We use the generic type name <code>gentypef</code> to indicate that the function can
take <code>float</code>, <code>float2</code>, <code>float3</code>, <code>float4</code>, <code>float8</code>, or <code>float16</code> as the
type for the arguments.
We use the generic type name <code>gentyped</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_32" title="View footnote.">32</a>]</sup> to
indicate that the function can take <code>double</code>, <code>double2</code>, <code>double3</code>, <code>double4</code>,
<code>double8</code> or <code>double16</code> as the type for the arguments.
For any specific use of a function, the actual type has to be the same for
all arguments and the return type, unless otherwise specified.</p>
</div>
<table id="table-builtin-math" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 10. Built-in Scalar and Vector Argument Math Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>acos</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Arc cosine function. Returns an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>acosh</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Inverse hyperbolic cosine. Returns an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>acospi</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <strong>acos</strong>(<em>x</em>) / π.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>asin</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Arc sine function. Returns an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>asinh</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Inverse hyperbolic sine. Returns an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>asinpi</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <strong>asin</strong>(<em>x</em>) / π.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>atan</strong>(gentype <em>y_over_x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Arc tangent function. Returns an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>atan2</strong>(gentype <em>y</em>, gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Arc tangent of <em>y</em> / <em>x</em>. Returns an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>atanh</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Hyperbolic arc tangent. Returns an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>atanpi</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <strong>atan</strong>(<em>x</em>) / π.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>atan2pi</strong>(gentype <em>y</em>, gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <strong>atan2</strong>(<em>y</em>, <em>x</em>) / π.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>cbrt</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute cube-root.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>ceil</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Round to integral value using the round to positive infinity rounding
mode.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>copysign</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>x</em> with its sign changed to match the sign of <em>y</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>cos</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute cosine, where <em>x</em> is an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>cosh</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute hyperbolic cosine, where <em>x</em> is an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>cospi</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <strong>cos</strong>(π <em>x</em>).</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>erfc</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Complementary error function.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>erf</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Error function encountered in integrating the
<a href="http://mathworld.wolfram.com/NormalDistribution.html"><em>normal
distribution</em></a>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>exp</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute the base-<em>e</em> exponential of <em>x</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>exp2</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Exponential base 2 function.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>exp10</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Exponential base 10 function.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>expm1</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <em>e<sup>x</sup></em> - 1.0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>fabs</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute absolute value of a floating-point number.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>fdim</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> - <em>y</em> if <em>x</em> &gt; <em>y</em>, +0 if <em>x</em> is less than or equal to <em>y</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>floor</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Round to integral value using the round to negative infinity rounding
mode.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>fma</strong>(gentype <em>a</em>, gentype <em>b</em>, gentype <em>c</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the correctly rounded floating-point representation of the sum
of <em>c</em> with the infinitely precise product of <em>a</em> and <em>b</em>.
Rounding of intermediate products shall not occur.
Edge case behavior is per the IEEE 754-2008 standard.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>fmax</strong>(gentype <em>x</em>, gentype <em>y</em>)<br>
gentypef <strong>fmax</strong>(gentypef <em>x</em>, float <em>y</em>)<br>
gentyped <strong>fmax</strong>(gentyped <em>x</em>, double <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>y</em> if <em>x</em> &lt; <em>y</em>, otherwise it returns <em>x</em>.
If one argument is a NaN, <strong>fmax</strong>() returns the other argument.
If both arguments are NaNs, <strong>fmax</strong>() returns a NaN.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>fmin</strong>(gentype <em>x</em>, gentype <em>y</em>)<br>
gentypef <strong>fmin</strong>(gentypef <em>x</em>, float <em>y</em>)<br>
gentyped <strong>fmin</strong>(gentyped <em>x</em>, double <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>y</em> if <em>y</em> &lt; <em>x</em>, otherwise it returns <em>x</em>.
If one argument is a NaN, <strong>fmin</strong>() returns the other argument.
If both arguments are NaNs, <strong>fmin</strong>() returns a NaN.
<sup class="footnote">[<a id="_footnoteref_33" class="footnote" href="#_footnotedef_33" title="View footnote.">33</a>]</sup></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>fmod</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Modulus.
Returns <em>x</em> - <em>y</em> * <strong>trunc</strong>(<em>x</em>/<em>y</em>).</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>fract</strong>(gentype <em>x</em>, __global gentype <em>*iptr</em>)<br>
gentype <strong>fract</strong>(gentype <em>x</em>, __local gentype <em>*iptr</em>)<br>
gentype <strong>fract</strong>(gentype <em>x</em>, __private gentype <em>*iptr</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> gentype <strong>fract</strong>(gentype <em>x</em>, gentype <em>*iptr</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <strong>fmin</strong>(<em>x</em> - <strong>floor</strong>(<em>x</em>), <code>0x1.fffffep-1f</code>).
<strong>floor</strong>(<em>x</em>) is returned in <em>iptr</em>.
<sup class="footnote">[<a id="_footnoteref_34" class="footnote" href="#_footnotedef_34" title="View footnote.">34</a>]</sup></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float<em>n</em> <strong>frexp</strong>(float<em>n</em> <em>x</em>, __global int<em>n</em> *exp)<br>
float <strong>frexp</strong>(float <em>x</em>, __global int *exp)<br></p>
<p class="tableblock"> float<em>n</em> <strong>frexp</strong>(float<em>n</em> <em>x</em>, __local int<em>n</em> *exp)<br>
float <strong>frexp</strong>(float <em>x</em>, __local int *exp)<br></p>
<p class="tableblock"> float<em>n</em> <strong>frexp</strong>(float<em>n</em> <em>x</em>, __private int<em>n</em> *exp)<br>
float <strong>frexp</strong>(float <em>x</em>, __private int *exp)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> float<em>n</em> <strong>frexp</strong>(float<em>n</em> <em>x</em>, int<em>n</em> *exp)<br>
float <strong>frexp</strong>(float <em>x</em>, int *exp)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Extract mantissa and exponent from <em>x</em>.
For each component the mantissa returned is a <code>float</code> with magnitude
in the interval [1/2, 1) or 0.
Each component of <em>x</em> equals mantissa returned * 2<em><sup>exp</sup></em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">double<em>n</em> <strong>frexp</strong>(double<em>n</em> <em>x</em>, __global int<em>n</em> *exp)<br>
double <strong>frexp</strong>(double <em>x</em>, __global int *exp)<br></p>
<p class="tableblock"> double<em>n</em> <strong>frexp</strong>(double<em>n</em> <em>x</em>, __local int<em>n</em> *exp)<br>
double <strong>frexp</strong>(double <em>x</em>, __local int *exp)<br></p>
<p class="tableblock"> double<em>n</em> <strong>frexp</strong>(double<em>n</em> <em>x</em>, __private int<em>n</em> *exp)<br>
double <strong>frexp</strong>(double <em>x</em>, __private int *exp)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> double<em>n</em> <strong>frexp</strong>(double<em>n</em> <em>x</em>, int<em>n</em> *exp)<br>
double <strong>frexp</strong>(double <em>x</em>, int *exp)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Extract mantissa and exponent from <em>x</em>.
For each component the mantissa returned is a <code>double</code> with magnitude
in the interval [1/2, 1) or 0.
Each component of <em>x</em> equals mantissa returned * 2<em><sup>exp</sup></em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>hypot</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute the value of the square root of <em>x</em><sup>2</sup> + <em>y</em><sup>2</sup> without
undue overflow or underflow.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int<em>n</em> <strong>ilogb</strong>(float<em>n</em> <em>x</em>)<br>
int <strong>ilogb</strong>(float <em>x</em>)<br>
int<em>n</em> <strong>ilogb</strong>(double<em>n</em> <em>x</em>)<br>
int <strong>ilogb</strong>(double <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return the exponent as an integer value.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float<em>n</em> <strong>ldexp</strong>(float<em>n</em> <em>x</em>, int<em>n</em> <em>k</em>)<br>
float<em>n</em> <strong>ldexp</strong>(float<em>n</em> <em>x</em>, int <em>k</em>)<br>
float <strong>ldexp</strong>(float <em>x</em>, int <em>k</em>)<br>
double<em>n</em> <strong>ldexp</strong>(double<em>n</em> <em>x</em>, int<em>n</em> <em>k</em>)<br>
double<em>n</em> <strong>ldexp</strong>(double<em>n</em> <em>x</em>, int <em>k</em>)<br>
double <strong>ldexp</strong>(double <em>x</em>, int <em>k</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Multiply <em>x</em> by 2 to the power <em>k</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>lgamma</strong>(gentype <em>x</em>)<br></p>
<p class="tableblock"> float<em>n</em> <strong>lgamma_r</strong>(float<em>n</em> <em>x</em>, __global int<em>n</em> *<em>signp</em>)<br>
float <strong>lgamma_r</strong>(float <em>x</em>, __global int *<em>signp</em>)<br>
double<em>n</em> <strong>lgamma_r</strong>(double<em>n</em> <em>x</em>, __global int<em>n</em> *<em>signp</em>)<br>
double <strong>lgamma_r</strong>(double <em>x</em>, __global int *<em>signp</em>)<br></p>
<p class="tableblock"> float<em>n</em> <strong>lgamma_r</strong>(float<em>n</em> <em>x</em>, __local int<em>n</em> *<em>signp</em>)<br>
float <strong>lgamma_r</strong>(float <em>x</em>, __local int *<em>signp</em>)<br>
double<em>n</em> <strong>lgamma_r</strong>(double<em>n</em> <em>x</em>, __local int<em>n</em> *<em>signp</em>)<br>
double <strong>lgamma_r</strong>(double <em>x</em>, __local int *<em>signp</em>)<br></p>
<p class="tableblock"> float<em>n</em> <strong>lgamma_r</strong>(float<em>n</em> <em>x</em>, __private int<em>n</em> *<em>signp</em>)<br>
float <strong>lgamma_r</strong>(float <em>x</em>, __private int *<em>signp</em>)<br>
double<em>n</em> <strong>lgamma_r</strong>(double<em>n</em> <em>x</em>, __private int<em>n</em> *<em>signp</em>)<br>
double <strong>lgamma_r</strong>(double <em>x</em>, __private int *<em>signp</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> float<em>n</em> <strong>lgamma_r</strong>(float<em>n</em> <em>x</em>, int<em>n</em> *<em>signp</em>)<br>
float <strong>lgamma_r</strong>(float <em>x</em>, int *<em>signp</em>)<br>
double<em>n</em> <strong>lgamma_r</strong>(double<em>n</em> <em>x</em>, int<em>n</em> *<em>signp</em>)<br>
double <strong>lgamma_r</strong>(double <em>x</em>, int *<em>signp</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Log gamma function.
Returns the natural logarithm of the absolute value of the gamma
function.
The sign of the gamma function is returned in the <em>signp</em> argument of
<strong>lgamma_r</strong>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>log</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute natural logarithm.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>log2</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute a base 2 logarithm.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>log10</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute a base 10 logarithm.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>log1p</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute log<sub>e</sub>(1.0 + <em>x</em>).</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>logb</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute the exponent of <em>x</em>, which is the integral part of
log<em><sub>r</sub></em>(|<em>x</em>|).</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>mad</strong>(gentype <em>a</em>, gentype <em>b</em>, gentype <em>c</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>mad</strong> computes <em>a</em> * <em>b</em> + <em>c</em>.
The function may compute <em>a</em> * <em>b</em> + <em>c</em> with reduced accuracy
in the embedded profile. See the OpenCL SPIR-V Environment Specification
for details. On some hardware the mad instruction may provide better
performance than expanded computation of <em>a</em> * <em>b</em> + <em>c</em>.
<sup class="footnote">[<a id="_footnoteref_35" class="footnote" href="#_footnotedef_35" title="View footnote.">35</a>]</sup></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>maxmag</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>x</em> if |<em>x</em>| &gt; |<em>y</em>|, <em>y</em> if |<em>y</em>| &gt; |<em>x</em>|, otherwise
<strong>fmax</strong>(<em>x</em>, <em>y</em>).</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.1 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>minmag</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>x</em> if |<em>x</em>| &lt; |<em>y</em>|, <em>y</em> if |<em>y</em>| &lt; |<em>x</em>|, otherwise
<strong>fmin</strong>(<em>x</em>, <em>y</em>).</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.1 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>modf</strong>(gentype <em>x</em>, __global gentype <em>*iptr</em>)<br>
gentype <strong>modf</strong>(gentype <em>x</em>, __local gentype <em>*iptr</em>)<br>
gentype <strong>modf</strong>(gentype <em>x</em>, __private gentype <em>*iptr</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> gentype <strong>modf</strong>(gentype <em>x</em>, gentype <em>*iptr</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Decompose a floating-point number.
The <strong>modf</strong> function breaks the argument <em>x</em> into integral and
fractional parts, each of which has the same sign as the argument.
It stores the integral part in the object pointed to by <em>iptr</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float<em>n</em> <strong>nan</strong>(uint<em>n</em> <em>nancode</em>)<br>
float <strong>nan</strong>(uint <em>nancode</em>)<br>
double<em>n</em> <strong>nan</strong>(ulong<em>n</em> <em>nancode</em>)<br>
double <strong>nan</strong>(ulong <em>nancode</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns a quiet NaN.
The <em>nancode</em> may be placed in the significand of the resulting NaN.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>nextafter</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Computes the next representable floating-point value
following <em>x</em> in the direction of <em>y</em>.
Thus, if <em>y</em> is less than <em>x</em>, <strong>nextafter</strong>() returns the largest
representable floating-point number less than <em>x</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>pow</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <em>x</em> to the power <em>y</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float<em>n</em> <strong>pown</strong>(float<em>n</em> <em>x</em>, int<em>n</em> <em>y</em>)<br>
float <strong>pown</strong>(float <em>x</em>, int <em>y</em>)<br>
double<em>n</em> <strong>pown</strong>(double<em>n</em> <em>x</em>, int<em>n</em> <em>y</em>)<br>
double <strong>pown</strong>(double <em>x</em>, int <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <em>x</em> to the power <em>y</em>, where <em>y</em> is an integer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>powr</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <em>x</em> to the power <em>y</em>, where <em>x</em> is &gt;= 0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>remainder</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute the value <em>r</em> such that <em>r</em> = <em>x</em> - <em>n</em>*<em>y</em>, where <em>n</em> is the
integer nearest the exact value of <em>x</em>/<em>y</em>.
If there are two integers closest to <em>x</em>/<em>y</em>, <em>n</em> shall be the even
one.
If <em>r</em> is zero, it is given the same sign as <em>x</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float<em>n</em> <strong>remquo</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>, __global int<em>n</em> <em>*quo</em>)<br>
float <strong>remquo</strong>(float <em>x</em>, float <em>y</em>, __global int <em>*quo</em>)<br></p>
<p class="tableblock"> float<em>n</em> <strong>remquo</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>, __local int<em>n</em> <em>*quo</em>)<br>
float <strong>remquo</strong>(float <em>x</em>, float <em>y</em>, __local int <em>*quo</em>)<br></p>
<p class="tableblock"> float<em>n</em> <strong>remquo</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>, __private int<em>n</em> <em>*quo</em>)<br>
float <strong>remquo</strong>(float <em>x</em>, float <em>y</em>, __private int <em>*quo</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> float<em>n</em> <strong>remquo</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>, int<em>n</em> <em>*quo</em>)<br>
float <strong>remquo</strong>(float <em>x</em>, float <em>y</em>, int <em>*quo</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The <strong>remquo</strong> function computes the value <em>r</em> such that <em>r</em> = <em>x</em> -
<em>k</em>*<em>y</em>, where <em>k</em> is the integer nearest the exact value of <em>x</em>/<em>y</em>.
If there are two integers closest to <em>x</em>/<em>y</em>, <em>k</em> shall be the even
one.
If <em>r</em> is zero, it is given the same sign as <em>x</em>.
This is the same value that is returned by the <strong>remainder</strong> function.
<strong>remquo</strong> also calculates the lower seven bits of the integral quotient
<em>x</em>/<em>y</em>, and gives that value the same sign as <em>x</em>/<em>y</em>.
It stores this signed value in the object pointed to by <em>quo</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">double<em>n</em> <strong>remquo</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>, __global int<em>n</em> <em>*quo</em>)<br>
double <strong>remquo</strong>(double <em>x</em>, double <em>y</em>, __global int <em>*quo</em>)<br></p>
<p class="tableblock"> double<em>n</em> <strong>remquo</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>, __local int<em>n</em> <em>*quo</em>)<br>
double <strong>remquo</strong>(double <em>x</em>, double <em>y</em>, __local int <em>*quo</em>)<br></p>
<p class="tableblock"> double<em>n</em> <strong>remquo</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>, __private int<em>n</em> <em>*quo</em>)<br>
double <strong>remquo</strong>(double <em>x</em>, double <em>y</em>, __private int <em>*quo</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> double<em>n</em> <strong>remquo</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>, int<em>n</em> <em>*quo</em>)<br>
double <strong>remquo</strong>(double <em>x</em>, double <em>y</em>, int <em>*quo</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The <strong>remquo</strong> function computes the value <em>r</em> such that <em>r</em> = <em>x</em> -
<em>k</em>*<em>y</em>, where <em>k</em> is the integer nearest the exact value of <em>x</em>/<em>y</em>.
If there are two integers closest to <em>x</em>/<em>y</em>, <em>k</em> shall be the even
one.
If <em>r</em> is zero, it is given the same sign as <em>x</em>.
This is the same value that is returned by the <strong>remainder</strong> function.
<strong>remquo</strong> also calculates the lower seven bits of the integral quotient
<em>x</em>/<em>y</em>, and gives that value the same sign as <em>x</em>/<em>y</em>.
It stores this signed value in the object pointed to by <em>quo</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>rint</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Round to integral value (using round to nearest even rounding mode) in
floating-point format.
Refer to section 7.1 for a description of rounding modes.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float<em>n</em> <strong>rootn</strong>(float<em>n</em> <em>x</em>, int<em>n</em> <em>y</em>)<br>
float <strong>rootn</strong>(float <em>x</em>, int <em>y</em>)<br>
double<em>n</em> <strong>rootn</strong>(double<em>n</em> <em>x</em>, int<em>n</em> <em>y</em>)<br>
double <strong>rootn</strong>(double <em>x</em>, int <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <em>x</em> to the power 1/<em>y</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>round</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return the integral value nearest to <em>x</em>, rounding halfway cases away
from zero, regardless of the current rounding direction.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>rsqrt</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute inverse square root.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>sin</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute sine, where <em>x</em> is an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>sincos</strong>(gentype <em>x</em>, __global gentype <em>*cosval</em>)<br>
gentype <strong>sincos</strong>(gentype <em>x</em>, __local gentype <em>*cosval</em>)<br>
gentype <strong>sincos</strong>(gentype <em>x</em>, __private gentype <em>*cosval</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> gentype <strong>sincos</strong>(gentype <em>x</em>, gentype <em>*cosval</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute sine and cosine of <em>x</em>.
The computed sine is the return value and computed cosine is returned
in <em>cosval</em>, where <em>x</em> is an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>sinh</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute hyperbolic sine, where <em>x</em> is an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>sinpi</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <strong>sin</strong>(π <em>x</em>).</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>sqrt</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute square root.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>tan</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute tangent, where <em>x</em> is an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>tanh</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute hyperbolic tangent, where <em>x</em> is an angle in radians.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>tanpi</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <strong>tan</strong>(π <em>x</em>).</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>tgamma</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute the gamma function.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>trunc</strong>(gentype)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Round to integral value using the round to zero rounding mode.</p></td>
</tr>
</tbody>
</table>
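<div class="paragraph">
<p>As an illustration of the <strong>remquo</strong> and <strong>modf</strong> rows above, the following is a minimal host-side C99 sketch, not OpenCL C kernel code.
It assumes that the C99 library functions <code>remquof</code> and <code>modff</code> follow the same definitions as the built-ins described in the table.
The helper names <code>remainder_and_quotient</code> and <code>split_value</code> are hypothetical.
C99 only guarantees the low three bits of the quotient, whereas the table above specifies seven, so the sketch is best tried with quotients below 8.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">#include &lt;math.h&gt;

// Returns r = x - k*y, where k is the integer nearest x/y (ties go to the
// even integer), and stores the low bits of k, signed like x/y, through quo.
float remainder_and_quotient(float x, float y, int *quo) {
    return remquof(x, y, quo);
}

// Returns the fractional part of x and stores the integral part through
// iptr; both parts carry the sign of x.
float split_value(float x, float *iptr) {
    return modff(x, iptr);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>With these helpers, <code>remainder_and_quotient(7.0f, 2.0f, &amp;q)</code> returns -1.0f and stores 4 in <code>q</code>, because 3.5 ties to the even integer 4, and <code>split_value(-3.75f, &amp;ip)</code> returns -0.75f and stores -3.0f in <code>ip</code>.</p>
</div>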
<div class="paragraph">
<p>The table below describes the following functions:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>A subset of functions from <a href="#table-builtin-math">Built-in Scalar and Vector Argument Math Functions</a> that are defined with
the half_ prefix.
These functions are implemented with a minimum of 10 bits of accuracy,
i.e. a maximum error of 8192 ulp.</p>
</li>
<li>
<p>A subset of functions from <a href="#table-builtin-math">Built-in Scalar and Vector Argument Math Functions</a> that are defined with
the native_ prefix.
These functions may map to one or more native device instructions and
will typically have better performance compared to the corresponding
functions (without the <code>native_</code> prefix) described in
<a href="#table-builtin-math">Built-in Scalar and Vector Argument Math Functions</a>.
The accuracy (and in some cases the input range(s)) of these functions
is implementation-defined.</p>
</li>
<li>
<p><code>half_</code> and <code>native_</code> functions for the following basic operations:
divide and reciprocal.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>We use the generic type name <code>gentype</code> to indicate that the functions in the
following table can take <code>float</code>, <code>float2</code>, <code>float3</code>, <code>float4</code>, <code>float8</code> or
<code>float16</code> as the type for the arguments.</p>
</div>
<table id="table-builtin-half-native-math" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 11. Built-in Scalar and Vector <em>half</em> and <em>native</em> Math Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_cos</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute cosine.
<em>x</em> is an angle in radians, and must be in the range [-2<sup>16</sup>, +2<sup>16</sup>].</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_divide</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <em>x</em> / <em>y</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_exp</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute the base-<em>e</em> exponential of <em>x</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_exp2</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute the base-2 exponential of <em>x</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_exp10</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute the base-10 exponential of <em>x</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_log</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute natural logarithm.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_log2</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute a base 2 logarithm.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_log10</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute a base 10 logarithm.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_powr</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <em>x</em> to the power <em>y</em>, where <em>x</em> is &gt;= 0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_recip</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute reciprocal.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_rsqrt</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute inverse square root.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_sin</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute sine.
<em>x</em> is an angle in radians, and must be in the range [-2<sup>16</sup>, +2<sup>16</sup>].</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_sqrt</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute square root.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>half_tan</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute tangent.
<em>x</em> is an angle in radians, and must be in the range [-2<sup>16</sup>, +2<sup>16</sup>].</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_cos</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute cosine over an implementation-defined range, where <em>x</em> is an
angle in radians.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_divide</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <em>x</em> / <em>y</em> over an implementation-defined range.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_exp</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute the base-<em>e</em> exponential of <em>x</em> over an
implementation-defined range.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_exp2</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute the base-2 exponential of <em>x</em> over an implementation-defined
range.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_exp10</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute the base-10 exponential of <em>x</em> over an implementation-defined
range.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_log</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute natural logarithm over an implementation-defined range.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_log2</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute a base 2 logarithm over an
implementation-defined range.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_log10</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute a base 10 logarithm over
an implementation-defined range.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_powr</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute <em>x</em> to the power <em>y</em>, where <em>x</em> is &gt;= 0.
The ranges of <em>x</em> and <em>y</em> are implementation-defined.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_recip</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute reciprocal over an implementation-defined range.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_rsqrt</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute inverse square root over an implementation-defined range.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_sin</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute sine over an implementation-defined range, where <em>x</em> is an
angle in radians.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_sqrt</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute square root over an implementation-defined range.
The maximum error is implementation-defined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>native_tan</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Compute tangent over an implementation-defined range, where <em>x</em> is an
angle in radians.
The maximum error is implementation-defined.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>Support for denormal values is optional for <strong>half_</strong> functions.
The <strong>half_</strong> functions may return any result allowed by
<a href="#edge-case-behavior-in-flush-to-zero-mode">Edge Case Behavior</a>, even when
<code>-cl-denorms-are-zero</code> (see <a href="#opencl-spec">section 5.8.4.2 of the OpenCL
Specification</a>) is not in force.
Support for denormal values is implementation-defined for <strong>native_</strong>
functions.</p>
</div>
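<div class="paragraph">
<p>The 8192 ulp bound for the <strong>half_</strong> functions and the figure of 10 bits of accuracy can be related with a short calculation.
The following is a minimal host-side C99 sketch, not OpenCL C kernel code.
It assumes IEEE 754 single precision, where one ulp of a value in [1, 2) equals <code>FLT_EPSILON</code> (2<sup>-23</sup>).
The helper names are hypothetical, and <code>ulp_distance</code> uses a simplified measure of the ulp that is only meant for illustration.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">#include &lt;float.h&gt;
#include &lt;math.h&gt;

// Relative error that corresponds to an error of `ulps` units in the last
// place for a float in [1, 2), where one ulp equals FLT_EPSILON.
float ulps_to_relative_error(float ulps) {
    return ulps * FLT_EPSILON;
}

// Approximate distance between a result and a reference value, measured in
// ulps of the reference (the gap to the next float of larger magnitude).
float ulp_distance(float result, float reference) {
    float mag = fabsf(reference);
    float ulp = nextafterf(mag, INFINITY) - mag;
    return fabsf(result - reference) / ulp;
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Here <code>ulps_to_relative_error(8192.0f)</code> evaluates to 2<sup>-10</sup>, which is where the figure of 10 bits comes from: 8192 ulp is 2<sup>13</sup> * 2<sup>-23</sup> relative to a value near 1.</p>
</div>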
</div>
</div>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The following symbolic constants are available.
Their values are of type <code>float</code> and are accurate within the precision of a
single precision floating-point number.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Constant Name</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>MAXFLOAT</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of maximum non-infinite single-precision floating-point number.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>HUGE_VALF</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A positive <code>float</code> constant expression.
<code>HUGE_VALF</code> evaluates to +infinity.
Used as an error value returned by the built-in math functions.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>INFINITY</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A constant expression of type <code>float</code> representing positive or
unsigned infinity.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>NAN</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A constant expression of type <code>float</code> representing a quiet NaN.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>If double precision is supported by the device, for example when the
<code>__opencl_c_fp64</code> feature macro is present for OpenCL C 3.0 or newer, the
following symbolic constants are also available:</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Constant Name</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>HUGE_VAL</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">A positive <code>double</code> constant expression.
<code>HUGE_VAL</code> evaluates to +infinity.
Used as an error value returned by the built-in math functions.</p></td>
</tr>
</tbody>
</table>
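<div class="paragraph">
<p>The properties of the symbolic constants above can be checked with a few comparisons.
The following is a minimal host-side C99 sketch, not OpenCL C kernel code, and it assumes that the C99 macros <code>HUGE_VALF</code>, <code>INFINITY</code> and <code>NAN</code> behave as described in the table.
<code>MAXFLOAT</code> is not part of C99, so <code>FLT_MAX</code> stands in for it here on the assumption that both name the largest finite <code>float</code>.
The function names are hypothetical.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">#include &lt;float.h&gt;
#include &lt;math.h&gt;

// HUGE_VALF is positive and compares equal to +infinity.
int huge_valf_is_positive_infinity(void) {
    return HUGE_VALF &gt; 0.0f &amp;&amp; isinf(HUGE_VALF) &amp;&amp; HUGE_VALF == INFINITY;
}

// A quiet NaN is unordered: it does not even compare equal to itself.
int nan_is_unordered(void) {
    float q = NAN;
    return isnan(q) &amp;&amp; !(q == q);
}

// The largest finite float is still strictly below INFINITY.
int largest_finite_is_below_infinity(void) {
    return isfinite(FLT_MAX) &amp;&amp; FLT_MAX &lt; INFINITY;
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Each function returns 1 when the property holds, so code that uses <code>HUGE_VALF</code> as an error value can test for it with <strong>isinf</strong>, while a NaN result has to be detected with <strong>isnan</strong> rather than with <code>==</code>.</p>
</div>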
</div>
</div>
<div class="sect4">
<h5 id="floating-point-macros-and-pragmas"><a class="anchor" href="#floating-point-macros-and-pragmas"></a>6.15.2.1. Floating-point macros and pragmas</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>FP_CONTRACT</code> pragma can be used to allow (if the state is on) or
disallow (if the state is off) the implementation to contract expressions.
Each pragma can occur either outside external declarations or preceding all
explicit declarations and statements inside a compound statement.
When outside external declarations, the pragma takes effect from its
occurrence until another <code>FP_CONTRACT</code> pragma is encountered, or until the
end of the translation unit.
When inside a compound statement, the pragma takes effect from its
occurrence until another <code>FP_CONTRACT</code> pragma is encountered (including
within a nested compound statement), or until the end of the compound
statement; at the end of a compound statement the state for the pragma is
restored to its condition just before the compound statement.
If this pragma is used in any other context, the behavior is undefined.</p>
</div>
<div class="paragraph">
<p>The pragma definition to set <code>FP_CONTRACT</code> is:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// on-off-switch is one of ON, OFF, or DEFAULT.</span>
<span class="comment">// The DEFAULT value is ON.</span>
<span class="preprocessor">#pragma</span> OPENCL FP_CONTRACT on-off-<span class="keyword">switch</span></code></pre>
</div>
</div>
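<div class="paragraph">
<p>The effect of contraction can be seen by comparing a multiply followed by an add, which rounds twice, with a fused multiply-add, which rounds once.
The following is a minimal host-side C99 sketch, not OpenCL C kernel code.
It assumes IEEE 754 single precision and that <code>fmaf</code> rounds once, as C99 specifies.
The function names are hypothetical, and the <code>volatile</code> temporary is only a device of this sketch to stop the host compiler from contracting the first expression; in OpenCL C that role is played by setting <code>FP_CONTRACT</code> to <code>OFF</code>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">#include &lt;math.h&gt;

// Uncontracted form: the product is rounded to float before the addition.
// The volatile store keeps the compiler from fusing the two operations.
float mul_then_add(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

// Contracted form: a * b + c is computed exactly and rounded only once.
float fused_mul_add(float a, float b, float c) {
    return fmaf(a, b, c);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>With <em>a</em> = <em>b</em> = 1 + 2<sup>-12</sup> and <em>c</em> = -(1 + 2<sup>-11</sup>), the exact product is 1 + 2<sup>-11</sup> + 2<sup>-24</sup>, which rounds to 1 + 2<sup>-11</sup> in single precision.
<code>mul_then_add</code> therefore returns 0, whereas <code>fused_mul_add</code> returns 2<sup>-24</sup>.</p>
</div>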
<div class="paragraph">
<p>The <code>FP_FAST_FMAF</code> macro indicates whether the <strong>fma</strong> function is fast
compared with direct code for single precision floating-point.
If defined, the <code>FP_FAST_FMAF</code> macro shall indicate that the <strong>fma</strong> function
generally executes about as fast as, or faster than, a multiply and an add
of <code>float</code> operands.</p>
</div>
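<div class="paragraph">
<p>One way to act on <code>FP_FAST_FMAF</code> is to select the fused operation only when the macro is defined.
The following is a minimal host-side C99 sketch, where the same macro exists in <code>math.h</code>; in OpenCL C the call would be to <strong>fma</strong> rather than <code>fmaf</code>.
The function name <code>multiply_add</code> is hypothetical.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">#include &lt;math.h&gt;

// Computes a * b + c, using the fused operation only when the
// implementation advertises it as fast through FP_FAST_FMAF.
float multiply_add(float a, float b, float c) {
#ifdef FP_FAST_FMAF
    return fmaf(a, b, c);
#else
    return a * b + c;
#endif
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>The two branches can differ in the last bit of the result, because only the fused form rounds once, so this pattern suits code that values speed over bit-exact reproducibility across devices.</p>
</div>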
<div class="paragraph">
<p>The macro names given in the following list must use the values specified.
These constant expressions are suitable for use in <code>#if</code> preprocessing
directives.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="preprocessor">#define</span> FLT_DIG <span class="integer">6</span>
<span class="preprocessor">#define</span> FLT_MANT_DIG <span class="integer">24</span>
<span class="preprocessor">#define</span> FLT_MAX_10_EXP +<span class="integer">38</span>
<span class="preprocessor">#define</span> FLT_MAX_EXP +<span class="integer">128</span>
<span class="preprocessor">#define</span> FLT_MIN_10_EXP -<span class="integer">37</span>
<span class="preprocessor">#define</span> FLT_MIN_EXP -<span class="integer">125</span>
<span class="preprocessor">#define</span> FLT_RADIX <span class="integer">2</span>
<span class="preprocessor">#define</span> FLT_MAX <span class="hex">0x1</span>.fffffep127f
<span class="preprocessor">#define</span> FLT_MIN <span class="hex">0x1</span><span class="float">.0</span>p-<span class="integer">12</span><span class="float">6f</span>
<span class="preprocessor">#define</span> FLT_EPSILON <span class="hex">0x1</span><span class="float">.0</span>p-<span class="integer">2</span><span class="float">3f</span></code></pre>
</div>
</div>
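<div class="paragraph">
<p>The sketch below shows one way these macros might be used, assuming host C with <code>&lt;float.h&gt;</code> so that it compiles stand-alone.
The integer-valued macros are checked in an <code>#if</code> directive, while <code>FLT_EPSILON</code> is used in ordinary code.
The helper <code>nearly_equal</code> is a hypothetical name and not part of OpenCL C.</p>
</div>

```c
#include <float.h>

/* Integer-valued macros can be tested by the preprocessor. */
#if FLT_RADIX != 2 || FLT_MANT_DIG != 24
#error "this sketch assumes the float format described by the macros above"
#endif

/* Hypothetical helper: returns 1 when a and b differ by at most `ulps`
 * multiples of FLT_EPSILON, relative to the larger magnitude. */
static int nearly_equal(float a, float b, float ulps)
{
    float diff  = a > b ? a - b : b - a;
    float abs_a = a < 0.0f ? -a : a;
    float abs_b = b < 0.0f ? -b : b;
    float scale = abs_a > abs_b ? abs_a : abs_b;
    return diff <= ulps * FLT_EPSILON * scale;
}
```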
<div class="paragraph">
<p>The following table describes the built-in macro names given above in the
OpenCL C programming language and the corresponding macro names available to
the application.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Macro in OpenCL Language</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Macro for application</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>FLT_DIG</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_FLT_DIG</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>FLT_MANT_DIG</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_FLT_MANT_DIG</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>FLT_MAX_10_EXP</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_FLT_MAX_10_EXP</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>FLT_MAX_EXP</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_FLT_MAX_EXP</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>FLT_MIN_10_EXP</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_FLT_MIN_10_EXP</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>FLT_MIN_EXP</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_FLT_MIN_EXP</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>FLT_RADIX</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_FLT_RADIX</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>FLT_MAX</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_FLT_MAX</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>FLT_MIN</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_FLT_MIN</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>FLT_EPSILON</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_FLT_EPSILON</code></p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>The <code>FP_ILOGB0</code> and <code>FP_ILOGBNAN</code> macros shall expand to integer constant
expressions whose values are returned by <strong>ilogb</strong>(<em>x</em>) if <em>x</em> is zero or NaN, respectively.
The value of <code>FP_ILOGB0</code> shall be either <code>INT_MIN</code> or <code>-INT_MAX</code>.
The value of <code>FP_ILOGBNAN</code> shall be either <code>INT_MAX</code> or <code>INT_MIN</code>.</p>
</div>
<div class="paragraph">
<p>The following constants are also available.
They are of type <code>float</code> and are accurate within the precision of the
<code>float</code> type.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Constant</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_E_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of <em>e</em></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_LOG2E_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of log<sub>2</sub>e</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_LOG10E_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of log<sub>10</sub>e</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_LN2_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of log<sub>e</sub>2</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_LN10_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of log<sub>e</sub>10</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_PI_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of π</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_PI_2_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of π / 2</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_PI_4_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of π / 4</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_1_PI_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of 1 / π</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_2_PI_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of 2 / π</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_2_SQRTPI_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of 2 / √π</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_SQRT2_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of √2</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_SQRT1_2_F</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of 1 / √2</p></td>
</tr>
</tbody>
</table>
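<div class="paragraph">
<p>The following minimal sketch illustrates why the <code>_F</code> constants are convenient: the whole expression stays in <code>float</code>.
It is written as host C, so it supplies a stand-in definition of <code>M_PI_F</code> (an assumption made only so that the sketch compiles outside an OpenCL C compiler); the helper name <code>deg_to_rad</code> is hypothetical.</p>
</div>

```c
/* Stand-in for the OpenCL C constant when compiling as host C; an OpenCL C
 * compiler provides M_PI_F itself. The literal is pi rounded to float. */
#ifndef M_PI_F
#define M_PI_F 3.14159274101257f
#endif

/* Hypothetical helper: converts degrees to radians in single precision. */
static float deg_to_rad(float degrees)
{
    return degrees * (M_PI_F / 180.0f);
}
```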
<div class="paragraph">
<p>If double precision is supported by the device (for example, in OpenCL C 3.0
or newer, when the <code>__opencl_c_fp64</code> feature macro is present), then the
following macros and constants are also available:</p>
</div>
<div class="paragraph">
<p>The <code>FP_FAST_FMA</code> macro indicates whether the <strong>fma</strong>() family of functions
is fast compared with direct code for double precision floating-point.
If defined, the <code>FP_FAST_FMA</code> macro shall indicate that the <strong>fma</strong>() function
generally executes about as fast as, or faster than, a multiply and an add
of <code>double</code> operands.</p>
</div>
<div class="paragraph">
<p>The macro names given in the following list must use the values specified.
These constant expressions are suitable for use in <code>#if</code> preprocessing
directives.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="preprocessor">#define</span> DBL_DIG <span class="integer">15</span>
<span class="preprocessor">#define</span> DBL_MANT_DIG <span class="integer">53</span>
<span class="preprocessor">#define</span> DBL_MAX_10_EXP +<span class="integer">308</span>
<span class="preprocessor">#define</span> DBL_MAX_EXP +<span class="integer">1024</span>
<span class="preprocessor">#define</span> DBL_MIN_10_EXP -<span class="integer">307</span>
<span class="preprocessor">#define</span> DBL_MIN_EXP -<span class="integer">1021</span>
<span class="preprocessor">#define</span> DBL_MAX <span class="hex">0x1</span>.fffffffffffffp1023
<span class="preprocessor">#define</span> DBL_MIN <span class="hex">0x1</span><span class="float">.0</span>p-<span class="integer">1022</span>
<span class="preprocessor">#define</span> DBL_EPSILON <span class="hex">0x1</span><span class="float">.0</span>p-<span class="integer">52</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>The following table describes the built-in macro names given above in the
OpenCL C programming language and the corresponding macro names available to
the application.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Macro in OpenCL Language</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Macro for application</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>DBL_DIG</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_DBL_DIG</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>DBL_MANT_DIG</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_DBL_MANT_DIG</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>DBL_MAX_10_EXP</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_DBL_MAX_10_EXP</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>DBL_MAX_EXP</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_DBL_MAX_EXP</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>DBL_MIN_10_EXP</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_DBL_MIN_10_EXP</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>DBL_MIN_EXP</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_DBL_MIN_EXP</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>DBL_MAX</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_DBL_MAX</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>DBL_MIN</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_DBL_MIN</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>DBL_EPSILON</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_DBL_EPSILON</code></p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>The following constants are also available.
They are of type <code>double</code> and are accurate within the precision of the
<code>double</code> type.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Constant</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_E</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of <em>e</em></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_LOG2E</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of log<sub>2</sub>e</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_LOG10E</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of log<sub>10</sub>e</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_LN2</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of log<sub>e</sub>2</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_LN10</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of log<sub>e</sub>10</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_PI</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of π</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_PI_2</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of π / 2</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_PI_4</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of π / 4</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_1_PI</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of 1 / π</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_2_PI</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of 2 / π</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_2_SQRTPI</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of 2 / √π</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_SQRT2</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of √2</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>M_SQRT1_2</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Value of 1 / √2</p></td>
</tr>
</tbody>
</table>
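<div class="paragraph">
<p>Because the double precision macros and constants exist only when the device supports double precision, portable code may want to select a working type at compile time.
The following is a minimal sketch of that idea; <code>real_t</code>, <code>REAL_EPSILON</code>, <code>REAL_MANT_DIG</code> and <code>next_after_one</code> are hypothetical names.
When the sketch is compiled as host C, <code>__opencl_c_fp64</code> is not defined and the <code>float</code> branch is selected.</p>
</div>

```c
#include <float.h>

/* Select a working type once. __opencl_c_fp64 is the OpenCL C 3.0 feature
 * macro; when it is absent, fall back to single precision. */
#if defined(__opencl_c_fp64)
typedef double real_t;
#define REAL_EPSILON  DBL_EPSILON
#define REAL_MANT_DIG DBL_MANT_DIG
#else
typedef float real_t;
#define REAL_EPSILON  FLT_EPSILON
#define REAL_MANT_DIG FLT_MANT_DIG
#endif

/* Smallest value greater than 1 that is representable in real_t. */
static real_t next_after_one(void)
{
    return (real_t)1 + REAL_EPSILON;
}
```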
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="integer-functions"><a class="anchor" href="#integer-functions"></a>6.15.3. Integer Functions</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <a href="#table-builtin-functions">following table</a> describes the built-in integer
functions that take scalar or vector arguments.
The vector versions of the integer functions operate component-wise.
The description is per-component.</p>
</div>
<div class="paragraph">
<p>We use the generic type name <code>gentype</code> to indicate that the function can take
<code>char</code>, <code>char{2|3|4|8|16}</code>, <code>uchar</code>, <code>uchar{2|3|4|8|16}</code>, <code>short</code>,
<code>short{2|3|4|8|16}</code>, <code>ushort</code>, <code>ushort{2|3|4|8|16}</code>, <code>int</code>, <code>int{2|3|4|8|16}</code>,
<code>uint</code>, <code>uint{2|3|4|8|16}</code>, <code>long</code> <sup class="footnote">[<a id="_footnoteref_36" class="footnote" href="#_footnotedef_36" title="View footnote.">36</a>]</sup>,
<code>long{2|3|4|8|16}</code>, <code>ulong</code>, or <code>ulong{2|3|4|8|16}</code> as the type for the
arguments.
We use the generic type name <code>ugentype</code> to refer to unsigned versions of
<code>gentype</code>.
For example, if <code>gentype</code> is <code>char4</code>, <code>ugentype</code> is <code>uchar4</code>.
We also use the generic type name <code>sgentype</code> to indicate that the function
can take a scalar data type, i.e. <code>char</code>, <code>uchar</code>, <code>short</code>, <code>ushort</code>, <code>int</code>,
<code>uint</code>, <code>long</code>, or <code>ulong</code>, as the type for the arguments.
For built-in integer functions that take <code>gentype</code> and <code>sgentype</code> arguments,
the <code>gentype</code> argument must be a vector or scalar version of the <code>sgentype</code>
argument.
For example, if <code>sgentype</code> is <code>uchar</code>, <code>gentype</code> must be <code>uchar</code> or
<code>uchar{2|3|4|8|16}</code>.
For vector versions, <code>sgentype</code> is implicitly widened to <code>gentype</code> as
described for <a href="#operators-arithmetic">arithmetic operators</a>.</p>
</div>
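<div class="paragraph">
<p>To illustrate the widening rule, the following sketch models <code>gentype</code> = <code>int4</code> and <code>sgentype</code> = <code>int</code> for the <strong>clamp</strong> built-in.
It is plain host C, so the vector is represented by a hypothetical struct <code>int4_model</code>; in OpenCL C the built-in vector type and the built-in function would be used directly.</p>
</div>

```c
/* Hypothetical stand-in for the OpenCL C int4 type. */
typedef struct { int s[4]; } int4_model;

/* Scalar clamp: min(max(x, minval), maxval). */
static int clamp_int(int x, int minval, int maxval)
{
    int lower_bounded = x < minval ? minval : x;
    return lower_bounded > maxval ? maxval : lower_bounded;
}

/* Models gentype clamp(gentype x, sgentype minval, sgentype maxval):
 * the scalar bounds are applied to every component, as if they had first
 * been widened to a vector with all components equal. */
static int4_model clamp_int4(int4_model x, int minval, int maxval)
{
    int4_model r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = clamp_int(x.s[i], minval, maxval);
    return r;
}
```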
<div class="paragraph">
<p>For any specific use of a function, the actual type has to be the same for
all arguments and the return type unless otherwise specified.</p>
</div>
<table id="table-builtin-functions" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 12. Built-in Scalar and Vector Integer Argument Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">ugentype <strong>abs</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns |<em>x</em>|.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">ugentype <strong>abs_diff</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns |<em>x</em> - <em>y</em>| without modulo overflow.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>add_sat</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>x</em> + <em>y</em> and saturates the result.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>hadd</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns (<em>x</em> + <em>y</em>) &gt;&gt; 1.
The intermediate sum does not modulo overflow.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>rhadd</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns (<em>x</em> + <em>y</em> + 1) &gt;&gt; 1.
The intermediate sum does not modulo overflow.
<sup class="footnote">[<a id="_footnoteref_37" class="footnote" href="#_footnotedef_37" title="View footnote.">37</a>]</sup></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>clamp</strong>(gentype <em>x</em>, gentype <em>minval</em>, gentype <em>maxval</em>)<br>
gentype <strong>clamp</strong>(gentype <em>x</em>, sgentype <em>minval</em>, sgentype <em>maxval</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <strong>min</strong>(<strong>max</strong>(<em>x</em>, <em>minval</em>), <em>maxval</em>).
Results are undefined if <em>minval</em> &gt; <em>maxval</em>.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.1 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>clz</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the number of leading 0-bits in <em>x</em>, starting at the most
significant bit position.
If <em>x</em> is 0, returns the size in bits of the type of <em>x</em> or component
type of <em>x</em>, if <em>x</em> is a vector.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>ctz</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the count of trailing 0-bits in <em>x</em>.
If <em>x</em> is 0, returns the size in bits of the type of <em>x</em> or component
type of <em>x</em>, if <em>x</em> is a vector.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>mad_hi</strong>(gentype <em>a</em>, gentype <em>b</em>, gentype <em>c</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <strong>mul_hi</strong>(<em>a</em>, <em>b</em>) + <em>c</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>mad_sat</strong>(gentype <em>a</em>, gentype <em>b</em>, gentype <em>c</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>a</em> * <em>b</em> + <em>c</em> and saturates the result.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>max</strong>(gentype <em>x</em>, gentype <em>y</em>)<br></p>
<p class="tableblock"> For OpenCL C 1.1 or newer:<br></p>
<p class="tableblock"> gentype <strong>max</strong>(gentype <em>x</em>, sgentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>y</em> if <em>x</em> &lt; <em>y</em>, otherwise it returns <em>x</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>min</strong>(gentype <em>x</em>, gentype <em>y</em>)<br></p>
<p class="tableblock"> For OpenCL C 1.1 or newer:<br></p>
<p class="tableblock"> gentype <strong>min</strong>(gentype <em>x</em>, sgentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>y</em> if <em>y</em> &lt; <em>x</em>, otherwise it returns <em>x</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>mul_hi</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Computes <em>x</em> * <em>y</em> and returns the high half of the product of <em>x</em> and
<em>y</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>rotate</strong>(gentype <em>v</em>, gentype <em>i</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">For each element in <em>v</em>, the bits are shifted left by the number of
bits given by the corresponding element in <em>i</em> (subject to the usual
<a href="#operators-shift">shift modulo rules</a>).
Bits shifted off the left side of the element are shifted back in from
the right.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>sub_sat</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>x</em> - <em>y</em> and saturates the result.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">short <strong>upsample</strong>(char <em>hi</em>, uchar <em>lo</em>)<br>
ushort <strong>upsample</strong>(uchar <em>hi</em>, uchar <em>lo</em>)<br>
short<em>n</em> <strong>upsample</strong>(char<em>n</em> <em>hi</em>, uchar<em>n</em> <em>lo</em>)<br>
ushort<em>n</em> <strong>upsample</strong>(uchar<em>n</em> <em>hi</em>, uchar<em>n</em> <em>lo</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>result</em>[i] = ((short)<em>hi</em>[i] &lt;&lt; 8) | <em>lo</em>[i]<br>
<em>result</em>[i] = ((ushort)<em>hi</em>[i] &lt;&lt; 8) | <em>lo</em>[i]<br></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>upsample</strong>(short <em>hi</em>, ushort <em>lo</em>)<br>
uint <strong>upsample</strong>(ushort <em>hi</em>, ushort <em>lo</em>)<br>
int<em>n</em> <strong>upsample</strong>(short<em>n</em> <em>hi</em>, ushort<em>n</em> <em>lo</em>)<br>
uint<em>n</em> <strong>upsample</strong>(ushort<em>n</em> <em>hi</em>, ushort<em>n</em> <em>lo</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>result</em>[i] = ((int)<em>hi</em>[i] &lt;&lt; 16) | <em>lo</em>[i]<br>
<em>result</em>[i] = ((uint)<em>hi</em>[i] &lt;&lt; 16) | <em>lo</em>[i]<br></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">long <strong>upsample</strong>(int <em>hi</em>, uint <em>lo</em>)<br>
ulong <strong>upsample</strong>(uint <em>hi</em>, uint <em>lo</em>)<br>
long<em>n</em> <strong>upsample</strong>(int<em>n</em> <em>hi</em>, uint<em>n</em> <em>lo</em>)<br>
ulong<em>n</em> <strong>upsample</strong>(uint<em>n</em> <em>hi</em>, uint<em>n</em> <em>lo</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>result</em>[i] = ((long)<em>hi</em>[i] &lt;&lt; 32) | <em>lo</em>[i]<br>
<em>result</em>[i] = ((ulong)<em>hi</em>[i] &lt;&lt; 32) | <em>lo</em>[i]</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>popcount</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the number of non-zero bits in <em>x</em>.<br></p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p></td>
</tr>
</tbody>
</table>
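<div class="paragraph">
<p>The per-component behavior of several functions in the table can be sketched in host C for 32-bit (and, for <strong>upsample</strong>, 8-bit) scalars.
These are illustrative reference routines under that assumption, not the implementation used by any OpenCL device; all function names ending in a type suffix are hypothetical.</p>
</div>

```c
#include <stdint.h>

/* hadd: (x + y) >> 1 without overflow of the intermediate sum. */
static uint32_t hadd_u32(uint32_t x, uint32_t y)
{
    return (x >> 1) + (y >> 1) + (x & y & 1u);
}

/* rhadd: (x + y + 1) >> 1 without overflow of the intermediate sum. */
static uint32_t rhadd_u32(uint32_t x, uint32_t y)
{
    return (x >> 1) + (y >> 1) + ((x | y) & 1u);
}

/* abs_diff: |x - y| returned in the unsigned type, so the full range fits. */
static uint32_t abs_diff_i32(int32_t x, int32_t y)
{
    return x > y ? (uint32_t)x - (uint32_t)y : (uint32_t)y - (uint32_t)x;
}

/* add_sat: x + y clamped to [INT32_MIN, INT32_MAX]. */
static int32_t add_sat_i32(int32_t x, int32_t y)
{
    int64_t sum = (int64_t)x + (int64_t)y;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* mul_hi: high 32 bits of the 64-bit product. */
static uint32_t mul_hi_u32(uint32_t x, uint32_t y)
{
    return (uint32_t)(((uint64_t)x * (uint64_t)y) >> 32);
}

/* rotate: left rotation; the count is reduced modulo the bit width. */
static uint32_t rotate_u32(uint32_t v, uint32_t i)
{
    i &= 31u;
    return i ? (v << i) | (v >> (32u - i)) : v;
}

/* upsample: ushort result built from uchar hi and uchar lo. */
static uint16_t upsample_u8(uint8_t hi, uint8_t lo)
{
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```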
<div class="paragraph">
<p>The following table describes fast integer functions that can be used for
optimizing performance of kernels.
We use the generic type name <code>gentype</code> to indicate that the function can
take <code>int</code>, <code>int2</code>, <code>int3</code>, <code>int4</code>, <code>int8</code>, <code>int16</code>, <code>uint</code>, <code>uint2</code>,
<code>uint3</code>, <code>uint4</code>, <code>uint8</code> or <code>uint16</code> as the type for the arguments.</p>
</div>
<table id="table-builtin-fast-integer" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 13. Built-in 24-bit Integer Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>mad24</strong>(gentype <em>x</em>, gentype <em>y</em>, gentype <em>z</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Multiply two 24-bit integer values <em>x</em> and <em>y</em> and add the 32-bit
integer result to the 32-bit integer <em>z</em>.
Refer to definition of <strong>mul24</strong> to see how the 24-bit integer
multiplication is performed.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>mul24</strong>(gentype <em>x</em>, gentype <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Multiply two 24-bit integer values <em>x</em> and <em>y</em>.
<em>x</em> and <em>y</em> are 32-bit integers but only the low 24 bits are used to
perform the multiplication.
<strong>mul24</strong> should only be used when values in <em>x</em> and <em>y</em> are in the
range [-2<sup>23</sup>, 2<sup>23</sup>-1] if <em>x</em> and <em>y</em> are signed integers and in the
range [0, 2<sup>24</sup>-1] if <em>x</em> and <em>y</em> are unsigned integers.
If <em>x</em> or <em>y</em> is not in this range, the multiplication result is
implementation-defined.</p></td>
</tr>
</tbody>
</table>
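<div class="paragraph">
<p>The following host C sketch models the unsigned forms of <strong>mul24</strong> and <strong>mad24</strong> for in-range operands.
Two points are assumptions of the sketch rather than statements of the specification: the result is taken to be the low 32 bits of the product, and out-of-range operands are simply masked, which is only one of the behaviors an implementation may choose.
The names <code>fits_u24</code>, <code>mul24_u32</code> and <code>mad24_u32</code> are hypothetical.</p>
</div>

```c
#include <stdint.h>

/* Returns 1 when v lies in the unsigned 24-bit range [0, 2^24 - 1]. */
static int fits_u24(uint32_t v)
{
    return v <= 0xFFFFFFu;
}

/* Sketch of mul24 for uint operands: only the low 24 bits of each operand
 * take part. Keeping the low 32 bits of the product is an assumption. */
static uint32_t mul24_u32(uint32_t x, uint32_t y)
{
    return (x & 0xFFFFFFu) * (y & 0xFFFFFFu);
}

/* Sketch of mad24 for uint operands: mul24 result plus the 32-bit z. */
static uint32_t mad24_u32(uint32_t x, uint32_t y, uint32_t z)
{
    return mul24_u32(x, y) + z;
}
```

<div class="paragraph">
<p>A caller that cannot guarantee the operand range would check it first, for example with <code>fits_u24</code>, and fall back to an ordinary 32-bit multiply otherwise.</p>
</div>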
</div>
</div>
<div class="sect4">
<h5 id="integer-macros"><a class="anchor" href="#integer-macros"></a>6.15.3.1. Integer Macros</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The macro names given in the following list must use the values specified.
The values shall all be constant expressions suitable for use in <code>#if</code>
preprocessing directives.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="preprocessor">#define</span> CHAR_BIT <span class="integer">8</span>
<span class="preprocessor">#define</span> CHAR_MAX SCHAR_MAX
<span class="preprocessor">#define</span> CHAR_MIN SCHAR_MIN
<span class="preprocessor">#define</span> INT_MAX <span class="integer">2147483647</span>
<span class="preprocessor">#define</span> INT_MIN (-<span class="integer">2147483647</span> - <span class="integer">1</span>)
<span class="preprocessor">#define</span> LONG_MAX <span class="hex">0x7fffffffffffffff</span>L
<span class="preprocessor">#define</span> LONG_MIN (-<span class="hex">0x7fffffffffffffff</span>L - <span class="integer">1</span>)
<span class="preprocessor">#define</span> SCHAR_MAX <span class="integer">127</span>
<span class="preprocessor">#define</span> SCHAR_MIN (-<span class="integer">127</span> - <span class="integer">1</span>)
<span class="preprocessor">#define</span> SHRT_MAX <span class="integer">32767</span>
<span class="preprocessor">#define</span> SHRT_MIN (-<span class="integer">32767</span> - <span class="integer">1</span>)
<span class="preprocessor">#define</span> UCHAR_MAX <span class="integer">255</span>
<span class="preprocessor">#define</span> USHRT_MAX <span class="integer">65535</span>
<span class="preprocessor">#define</span> UINT_MAX <span class="hex">0xffffffff</span>
<span class="preprocessor">#define</span> ULONG_MAX <span class="hex">0xffffffffffffffff</span>UL</code></pre>
</div>
</div>
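<div class="paragraph">
<p>As a small illustration, the limit macros can drive both a preprocessor check and a saturating narrowing conversion.
The sketch is host C using <code>&lt;limits.h&gt;</code>; the helper <code>sat_to_schar</code> is a hypothetical name, and it uses <code>signed char</code> explicitly because plain <code>char</code> may be unsigned on a host compiler.</p>
</div>

```c
#include <limits.h>

/* The macros are usable in #if directives. */
#if CHAR_BIT != 8
#error "this sketch assumes CHAR_BIT is 8, as in the list above"
#endif

/* Hypothetical helper: converts an int to signed char with saturation. */
static signed char sat_to_schar(int v)
{
    if (v > SCHAR_MAX) return SCHAR_MAX;
    if (v < SCHAR_MIN) return SCHAR_MIN;
    return (signed char)v;
}
```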
<div class="paragraph">
<p>The following table describes the built-in macro names given above in the
OpenCL C programming language and the corresponding macro names available to
the application.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Macro in OpenCL Language</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Macro for application</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CHAR_BIT</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_CHAR_BIT</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CHAR_MAX</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_CHAR_MAX</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CHAR_MIN</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_CHAR_MIN</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>INT_MAX</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_INT_MAX</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>INT_MIN</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_INT_MIN</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>LONG_MAX</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_LONG_MAX</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>LONG_MIN</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_LONG_MIN</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>SCHAR_MAX</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_SCHAR_MAX</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>SCHAR_MIN</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_SCHAR_MIN</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>SHRT_MAX</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_SHRT_MAX</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>SHRT_MIN</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_SHRT_MIN</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>UCHAR_MAX</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_UCHAR_MAX</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>USHRT_MAX</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_USHRT_MAX</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>UINT_MAX</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_UINT_MAX</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>ULONG_MAX</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_ULONG_MAX</code></p></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="common-functions"><a class="anchor" href="#common-functions"></a>6.15.4. Common Functions</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <a href="#table-builtin-common">following table</a> describes the list of built-in
common functions.
These all operate component-wise.
The description is per-component.
We use the generic type name <code>gentype</code> to indicate that the function can take
<code>float</code>, <code>float2</code>, <code>float3</code>, <code>float4</code>, <code>float8</code>, <code>float16</code>, <code>double</code>
<sup class="footnote">[<a id="_footnoteref_38" class="footnote" href="#_footnotedef_38" title="View footnote.">38</a>]</sup>, <code>double2</code>, <code>double3</code>, <code>double4</code>,
<code>double8</code> or <code>double16</code> as the type for the arguments.
We use the generic type name <code>gentypef</code> to indicate that the function can
take <code>float</code>, <code>float2</code>, <code>float3</code>, <code>float4</code>, <code>float8</code>, or <code>float16</code> as the
type for the arguments.
We use the generic type name <code>gentyped</code> to indicate that the function can
take <code>double</code>, <code>double2</code>, <code>double3</code>, <code>double4</code>, <code>double8</code> or <code>double16</code> as
the type for the arguments.</p>
</div>
<div class="paragraph">
<p>The built-in common functions are implemented using the round to nearest
even rounding mode.
The built-in common functions may be implemented using contractions such
as <strong>mad</strong> or <strong>fma</strong>.</p>
</div>
<table id="table-builtin-common" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 14. Built-in Scalar and Vector Argument Common Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>clamp</strong>(gentype <em>x</em>, gentype <em>minval</em>, gentype <em>maxval</em>)<br>
gentypef <strong>clamp</strong>(gentypef <em>x</em>, float <em>minval</em>, float <em>maxval</em>)<br>
gentyped <strong>clamp</strong>(gentyped <em>x</em>, double <em>minval</em>, double <em>maxval</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <strong>fmin</strong>(<strong>fmax</strong>(<em>x</em>, <em>minval</em>), <em>maxval</em>).
Results are undefined if <em>minval</em> &gt; <em>maxval</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>degrees</strong>(gentype <em>radians</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Converts <em>radians</em> to degrees, i.e. (180 / π) * <em>radians</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>max</strong>(gentype <em>x</em>, gentype <em>y</em>)<br>
gentypef <strong>max</strong>(gentypef <em>x</em>, float <em>y</em>)<br>
gentyped <strong>max</strong>(gentyped <em>x</em>, double <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>y</em> if <em>x</em> &lt; <em>y</em>, otherwise it returns <em>x</em>.
If <em>x</em> or <em>y</em> is infinite or NaN, the return values are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>min</strong>(gentype <em>x</em>, gentype <em>y</em>)<br>
gentypef <strong>min</strong>(gentypef <em>x</em>, float <em>y</em>)<br>
gentyped <strong>min</strong>(gentyped <em>x</em>, double <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>y</em> if <em>y</em> &lt; <em>x</em>, otherwise it returns <em>x</em>.
If <em>x</em> or <em>y</em> is infinite or NaN, the return values are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>mix</strong>(gentype <em>x</em>, gentype <em>y</em>, gentype <em>a</em>)<br>
gentypef <strong>mix</strong>(gentypef <em>x</em>, gentypef <em>y</em>, float <em>a</em>)<br>
gentyped <strong>mix</strong>(gentyped <em>x</em>, gentyped <em>y</em>, double <em>a</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the linear blend of <em>x</em> and <em>y</em> implemented as:</p>
<p class="tableblock"> <em>x</em> + (<em>y</em> - <em>x</em>) * <em>a</em></p>
<p class="tableblock"> <em>a</em> must be a value in the range [0.0, 1.0].
If <em>a</em> is not in the range [0.0, 1.0], the return values are
undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>radians</strong>(gentype <em>degrees</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Converts <em>degrees</em> to radians, i.e. (π / 180) * <em>degrees</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>step</strong>(gentype <em>edge</em>, gentype <em>x</em>)<br>
gentypef <strong>step</strong>(float <em>edge</em>, gentypef <em>x</em>)<br>
gentyped <strong>step</strong>(double <em>edge</em>, gentyped <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns 0.0 if <em>x</em> &lt; <em>edge</em>, otherwise it returns 1.0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>smoothstep</strong>(gentype <em>edge0</em>, gentype <em>edge1</em>, gentype <em>x</em>)<br>
gentypef <strong>smoothstep</strong>(float <em>edge0</em>, float <em>edge1</em>, gentypef <em>x</em>)<br>
gentyped <strong>smoothstep</strong>(double <em>edge0</em>, double <em>edge1</em>, gentyped <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><div class="content"><div class="paragraph">
<p>Returns 0.0 if <em>x</em> &lt;= <em>edge0</em> and 1.0 if <em>x</em> &gt;= <em>edge1</em> and performs
smooth Hermite interpolation between 0 and 1 when <em>edge0</em> &lt; <em>x</em> &lt;
<em>edge1</em>.
This is useful in cases where you would want a threshold function with
a smooth transition.</p>
</div>
<div class="paragraph">
<p>This is equivalent to:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">gentype t;
t = clamp ((x - edge0) / (edge1 - edge0), <span class="integer">0</span>, <span class="integer">1</span>);
<span class="keyword">return</span> t * t * (<span class="integer">3</span> - <span class="integer">2</span> * t);</code></pre>
</div>
</div>
<div class="paragraph">
<p>Results are undefined if <em>edge0</em> &gt;= <em>edge1</em> or if <em>x</em>, <em>edge0</em> or <em>edge1</em> is
a NaN.</p>
</div></div></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>sign</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns 1.0 if <em>x</em> &gt; 0, -0.0 if <em>x</em> = -0.0, +0.0 if <em>x</em> = +0.0, or
-1.0 if <em>x</em> &lt; 0.
Returns 0.0 if <em>x</em> is a NaN.</p></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="sect3">
<h4 id="geometric-functions"><a class="anchor" href="#geometric-functions"></a>6.15.5. Geometric Functions</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <a href="#table-builtin-geometric">following table</a> describes the list of built-in
geometric functions.
These all operate component-wise.
The description is per-component.
<code>float<em>n</em></code> is <code>float</code>, <code>float2</code>, <code>float3</code>, or <code>float4</code> and <code>double<em>n</em></code> is
<code>double</code> <sup class="footnote">[<a id="_footnoteref_39" class="footnote" href="#_footnotedef_39" title="View footnote.">39</a>]</sup>, <code>double2</code>, <code>double3</code>, or
<code>double4</code>.</p>
</div>
<div class="paragraph">
<p>The built-in geometric functions are implemented using the round to nearest
even rounding mode.
The built-in geometric functions may be implemented using contractions such
as <strong>mad</strong> or <strong>fma</strong>.</p>
</div>
<table id="table-builtin-geometric" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 15. Built-in Scalar and Vector Argument Geometric Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float4 <strong>cross</strong>(float4 <em>p0</em>, float4 <em>p1</em>)<br>
float3 <strong>cross</strong>(float3 <em>p0</em>, float3 <em>p1</em>)<br>
double4 <strong>cross</strong>(double4 <em>p0</em>, double4 <em>p1</em>)<br>
double3 <strong>cross</strong>(double3 <em>p0</em>, double3 <em>p1</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the cross product of <em>p0.xyz</em> and <em>p1.xyz</em>.
The <em>w</em> component of a 4-component result will be 0.0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float <strong>dot</strong>(float<em>n</em> <em>p0</em>, float<em>n</em> <em>p1</em>)<br>
double <strong>dot</strong>(double<em>n</em> <em>p0</em>, double<em>n</em> <em>p1</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the dot product of <em>p0</em> and <em>p1</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float <strong>distance</strong>(float<em>n</em> <em>p0</em>, float<em>n</em> <em>p1</em>)<br>
double <strong>distance</strong>(double<em>n</em> <em>p0</em>, double<em>n</em> <em>p1</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the distance between <em>p0</em> and <em>p1</em>.
This is calculated as <strong>length</strong>(<em>p0</em> - <em>p1</em>).</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float <strong>length</strong>(float<em>n</em> <em>p</em>)<br>
double <strong>length</strong>(double<em>n</em> <em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the length of vector <em>p</em>, i.e., √(<em>p.x</em><sup>2</sup> + <em>p.y</em><sup>2</sup> + &#8230;&#8203;)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float<em>n</em> <strong>normalize</strong>(float<em>n</em> <em>p</em>)<br>
double<em>n</em> <strong>normalize</strong>(double<em>n</em> <em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns a vector in the same direction as <em>p</em> but with a length of 1.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float <strong>fast_distance</strong>(float<em>n</em> <em>p0</em>, float<em>n</em> <em>p1</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <strong>fast_length</strong>(<em>p0</em> - <em>p1</em>).</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float <strong>fast_length</strong>(float<em>n</em> <em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the length of vector <em>p</em> computed as:</p>
<p class="tableblock"> <strong>half_sqrt</strong>(<em>p.x</em><sup>2</sup> + <em>p.y</em><sup>2</sup> + &#8230;&#8203;)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float<em>n</em> <strong>fast_normalize</strong>(float<em>n</em> <em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><div class="content"><div class="paragraph">
<p>Returns a vector in the same direction as <em>p</em> but with a length of 1.
<strong>fast_normalize</strong> is computed as:</p>
</div>
<div class="paragraph">
<p><em>p</em> * <strong>half_rsqrt</strong>(<em>p.x</em><sup>2</sup> + <em>p.y</em><sup>2</sup> + &#8230;&#8203;)</p>
</div>
<div class="paragraph">
<p>The result shall be within 8192 ulps error from the infinitely precise
result of</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="keyword">if</span> (all(p == <span class="float">0</span><span class="float">.0f</span>))
result = p;
<span class="keyword">else</span>
result = p /
sqrt(p.x*p.x + p.y*p.y + ...);</code></pre>
</div>
</div>
<div class="paragraph">
<p>with the following exceptions:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>If the sum of squares is greater than <code>FLT_MAX</code> then the
floating-point values in the result vector are undefined.</p>
</li>
<li>
<p>If the sum of squares is less than <code>FLT_MIN</code> then the implementation
may return <em>p</em>.</p>
</li>
<li>
<p>If the device is in &#8220;denorms are flushed to zero&#8221; mode, individual
operand elements with magnitude less than <strong>sqrt</strong>(<code>FLT_MIN</code>) may be flushed
to zero before proceeding with the calculation.</p>
</li>
</ol>
</div></div></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="sect3">
<h4 id="relational-functions"><a class="anchor" href="#relational-functions"></a>6.15.6. Relational Functions</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <a href="#operators-relational">relational</a> and <a href="#operators-equality">equality</a>
operators (<strong>&lt;</strong>, <strong>&lt;=</strong>, <strong>&gt;</strong>, <strong>&gt;=</strong>, <strong>!=</strong>, <strong>==</strong>) can be used with scalar and
vector built-in types and produce a scalar or vector signed integer result
respectively.</p>
</div>
<div class="paragraph">
<p>The functions described in the <a href="#table-builtin-relational">following table</a> can
be used with built-in scalar or vector types as arguments and return a scalar or
vector integer result <sup class="footnote">[<a id="_footnoteref_40" class="footnote" href="#_footnotedef_40" title="View footnote.">40</a>]</sup>.
The argument type <code>gentype</code> refers to the following built-in types: <code>char</code>,
<code>char<em>n</em></code>, <code>uchar</code>, <code>uchar<em>n</em></code>, <code>short</code>, <code>short<em>n</em></code>, <code>ushort</code>,
<code>ushort<em>n</em></code>, <code>int</code>, <code>int<em>n</em></code>, <code>uint</code>, <code>uint<em>n</em></code>, <code>long</code>
<sup class="footnote">[<a id="_footnoteref_41" class="footnote" href="#_footnotedef_41" title="View footnote.">41</a>]</sup>, <code>long<em>n</em></code>, <code>ulong</code>, <code>ulong<em>n</em></code>, <code>float</code>,
<code>float<em>n</em></code>, <code>double</code> <sup class="footnote">[<a id="_footnoteref_42" class="footnote" href="#_footnotedef_42" title="View footnote.">42</a>]</sup>, and
<code>double<em>n</em></code>.
The argument type <code>igentype</code> refers to the built-in signed integer types
i.e. <code>char</code>, <code>char<em>n</em></code>, <code>short</code>, <code>short<em>n</em></code>, <code>int</code>, <code>int<em>n</em></code>, <code>long</code>
and <code>long<em>n</em></code>.
The argument type <code>ugentype</code> refers to the built-in unsigned integer types
i.e. <code>uchar</code>, <code>uchar<em>n</em></code>, <code>ushort</code>, <code>ushort<em>n</em></code>, <code>uint</code>, <code>uint<em>n</em></code>,
<code>ulong</code> and <code>ulong<em>n</em></code>.
<em>n</em> is 2, 3, 4, 8, or 16.</p>
</div>
<div class="paragraph">
<p>The functions <strong>isequal</strong>, <strong>isnotequal</strong>, <strong>isgreater</strong>, <strong>isgreaterequal</strong>,
<strong>isless</strong>, <strong>islessequal</strong>, <strong>islessgreater</strong>, <strong>isfinite</strong>, <strong>isinf</strong>, <strong>isnan</strong>,
<strong>isnormal</strong>, <strong>isordered</strong>, <strong>isunordered</strong> and <strong>signbit</strong> described in the
following table shall return a 0 if the specified relation is <em>false</em> and a
1 if the specified relation is <em>true</em> for scalar argument types.
These functions shall return a 0 if the specified relation is <em>false</em> and a
-1 (i.e. all bits set) if the specified relation is <em>true</em> for vector
argument types.</p>
</div>
<div class="paragraph">
<p>The relational functions <strong>isequal</strong>, <strong>isgreater</strong>, <strong>isgreaterequal</strong>, <strong>isless</strong>,
<strong>islessequal</strong>, and <strong>islessgreater</strong> always return 0 if either argument is not
a number (NaN).
<strong>isnotequal</strong> returns 1 if one or both arguments are not a number (NaN) and
the argument type is a scalar and returns -1 if one or both arguments are
not a number (NaN) and the argument type is a vector.</p>
</div>
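<div class="paragraph">
<p>The return-value conventions described above can be illustrated with a small plain C99 model. The <code>ref_</code> names are hypothetical and are not OpenCL C built-ins. The scalar forms return 1 for <em>true</em>. The <code>_component</code> form models a single component of a vector result, which is -1 (all bits set) for <em>true</em>. The sketch assumes the compiler follows IEEE 754 comparison semantics for NaN, i.e. it is not built with options such as fast-math.</p>
</div>

```c
#include <math.h>

/* Scalar form: 1 if the relation is true, 0 if it is false. */
int ref_isgreater(float x, float y)
{
    return (x > y) ? 1 : 0;
}

/* One component of the vector form: -1 (all bits set) if true, else 0. */
int ref_isgreater_component(float x, float y)
{
    return (x > y) ? -1 : 0;
}

/* isnotequal (scalar): true when either argument is a NaN. */
int ref_isnotequal(float x, float y)
{
    return (x != y) ? 1 : 0;
}

/* isordered (scalar): isequal(x, x) && isequal(y, y). */
int ref_isordered(float x, float y)
{
    return (x == x) && (y == y);
}

/* isunordered (scalar): non-zero if x or y is a NaN. */
int ref_isunordered(float x, float y)
{
    return isnan(x) || isnan(y);
}
```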
<table id="table-builtin-relational" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 16. Built-in Scalar and Vector Relational Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>isequal</strong>(float <em>x</em>, float <em>y</em>)<br>
int<em>n</em> <strong>isequal</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>)<br>
int <strong>isequal</strong>(double <em>x</em>, double <em>y</em>)<br>
long<em>n</em> <strong>isequal</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the component-wise compare of <em>x</em> == <em>y</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>isnotequal</strong>(float <em>x</em>, float <em>y</em>)<br>
int<em>n</em> <strong>isnotequal</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>)<br>
int <strong>isnotequal</strong>(double <em>x</em>, double <em>y</em>)<br>
long<em>n</em> <strong>isnotequal</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the component-wise compare of <em>x</em> != <em>y</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>isgreater</strong>(float <em>x</em>, float <em>y</em>)<br>
int<em>n</em> <strong>isgreater</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>)<br>
int <strong>isgreater</strong>(double <em>x</em>, double <em>y</em>)<br>
long<em>n</em> <strong>isgreater</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the component-wise compare of <em>x</em> &gt; <em>y</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>isgreaterequal</strong>(float <em>x</em>, float <em>y</em>)<br>
int<em>n</em> <strong>isgreaterequal</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>)<br>
int <strong>isgreaterequal</strong>(double <em>x</em>, double <em>y</em>)<br>
long<em>n</em> <strong>isgreaterequal</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the component-wise compare of <em>x</em> &gt;= <em>y</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>isless</strong>(float <em>x</em>, float <em>y</em>)<br>
int<em>n</em> <strong>isless</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>)<br>
int <strong>isless</strong>(double <em>x</em>, double <em>y</em>)<br>
long<em>n</em> <strong>isless</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the component-wise compare of <em>x</em> &lt; <em>y</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>islessequal</strong>(float <em>x</em>, float <em>y</em>)<br>
int<em>n</em> <strong>islessequal</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>)<br>
int <strong>islessequal</strong>(double <em>x</em>, double <em>y</em>)<br>
long<em>n</em> <strong>islessequal</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the component-wise compare of <em>x</em> &lt;= <em>y</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>islessgreater</strong>(float <em>x</em>, float <em>y</em>)<br>
int<em>n</em> <strong>islessgreater</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>)<br>
int <strong>islessgreater</strong>(double <em>x</em>, double <em>y</em>)<br>
long<em>n</em> <strong>islessgreater</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the component-wise compare of (<em>x</em> &lt; <em>y</em>) || (<em>x</em> &gt; <em>y</em>) .</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>isfinite</strong>(float)<br>
int<em>n</em> <strong>isfinite</strong>(float<em>n</em>)<br>
int <strong>isfinite</strong>(double)<br>
long<em>n</em> <strong>isfinite</strong>(double<em>n</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Test for finite value.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>isinf</strong>(float)<br>
int<em>n</em> <strong>isinf</strong>(float<em>n</em>)<br>
int <strong>isinf</strong>(double)<br>
long<em>n</em> <strong>isinf</strong>(double<em>n</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Test for infinity value (positive or negative).</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>isnan</strong>(float)<br>
int<em>n</em> <strong>isnan</strong>(float<em>n</em>)<br>
int <strong>isnan</strong>(double)<br>
long<em>n</em> <strong>isnan</strong>(double<em>n</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Test for a NaN.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>isnormal</strong>(float)<br>
int<em>n</em> <strong>isnormal</strong>(float<em>n</em>)<br>
int <strong>isnormal</strong>(double)<br>
long<em>n</em> <strong>isnormal</strong>(double<em>n</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Test for a normal value.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>isordered</strong>(float <em>x</em>, float <em>y</em>)<br>
int<em>n</em> <strong>isordered</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>)<br>
int <strong>isordered</strong>(double <em>x</em>, double <em>y</em>)<br>
long<em>n</em> <strong>isordered</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Test if arguments are ordered.
<strong>isordered</strong>() takes arguments <em>x</em> and <em>y</em>, and returns the result
<strong>isequal</strong>(<em>x</em>, <em>x</em>) &amp;&amp; <strong>isequal</strong>(<em>y</em>, <em>y</em>).</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>isunordered</strong>(float <em>x</em>, float <em>y</em>)<br>
int<em>n</em> <strong>isunordered</strong>(float<em>n</em> <em>x</em>, float<em>n</em> <em>y</em>)<br>
int <strong>isunordered</strong>(double <em>x</em>, double <em>y</em>)<br>
long<em>n</em> <strong>isunordered</strong>(double<em>n</em> <em>x</em>, double<em>n</em> <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Test if arguments are unordered.
<strong>isunordered</strong>() takes arguments <em>x</em> and <em>y</em>, returning non-zero if <em>x</em>
or <em>y</em> is NaN, and zero otherwise.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>signbit</strong>(float)<br>
int<em>n</em> <strong>signbit</strong>(float<em>n</em>)<br>
int <strong>signbit</strong>(double)<br>
long<em>n</em> <strong>signbit</strong>(double<em>n</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Test for sign bit.
The scalar version of the function returns a 1 if the sign bit in the
float is set else returns 0.
The vector version of the function returns the following for each
component in <code>float<em>n</em></code>: -1 (i.e. all bits set) if the sign bit in the
float is set else returns 0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>any</strong>(igentype <em>x</em>)</p>
<p class="tableblock">Scalar inputs to <strong>any</strong> are <a href="#unified-spec">deprecated by</a> OpenCL C version
3.0.</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns 1 if the most significant bit of <em>x</em> (for scalar inputs) or
of any component of <em>x</em> (for vector inputs) is set; otherwise returns 0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>all</strong>(igentype <em>x</em>)</p>
<p class="tableblock">Scalar inputs to <strong>all</strong> are <a href="#unified-spec">deprecated by</a> OpenCL C version
3.0.</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns 1 if the most significant bit of <em>x</em> (for scalar inputs) or
of all components of <em>x</em> (for vector inputs) is set; otherwise returns 0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>bitselect</strong>(gentype <em>a</em>, gentype <em>b</em>, gentype <em>c</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Each bit of the result is the corresponding bit of <em>a</em> if the
corresponding bit of <em>c</em> is 0.
Otherwise it is the corresponding bit of <em>b</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>select</strong>(gentype <em>a</em>, gentype <em>b</em>, igentype <em>c</em>)<br>
gentype <strong>select</strong>(gentype <em>a</em>, gentype <em>b</em>, ugentype <em>c</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">For each component of a vector type,</p>
<p class="tableblock"> <em>result[i]</em> = (MSB of <em>c[i]</em> is set) ? <em>b[i]</em> : <em>a[i]</em>.</p>
<p class="tableblock"> For a scalar type, <em>result</em> = <em>c</em> ? <em>b</em> : <em>a</em>.</p>
<p class="tableblock"> <code>igentype</code> and <code>ugentype</code> must have the same number of elements and
bits as <code>gentype</code> <sup class="footnote">[<a id="_footnoteref_43" class="footnote" href="#_footnotedef_43" title="View footnote.">43</a>]</sup>.</p></td>
</tr>
</tbody>
</table>
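<div class="paragraph">
<p>The difference between <strong>bitselect</strong> and the scalar and vector forms of <strong>select</strong> in the table above can be sketched in plain C99 on 32-bit values. The <code>ref_</code> names are hypothetical helpers, not OpenCL C built-ins. The <code>_component</code> function models one component of a vector <strong>select</strong>, which tests only the most significant bit of <em>c</em>, whereas the scalar form tests whether <em>c</em> is non-zero.</p>
</div>

```c
#include <stdint.h>

/* bitselect: each result bit comes from a where the bit of c is 0,
   and from b where the bit of c is 1. */
uint32_t ref_bitselect(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* One component of a vector select: b if the MSB of c is set, else a.
   For a signed 32-bit value the MSB is set exactly when c is negative. */
int32_t ref_select_component(int32_t a, int32_t b, int32_t c)
{
    return (c < 0) ? b : a;
}

/* Scalar select: c ? b : a. */
int32_t ref_select_scalar(int32_t a, int32_t b, int32_t c)
{
    return c ? b : a;
}
```

<div class="paragraph">
<p>With <em>c</em> equal to 1, the scalar model picks <em>b</em>, while the per-component vector model picks <em>a</em> because the most significant bit of 1 is clear. This is why the -1 results produced by the vector relational functions are the natural mask for the vector form.</p>
</div>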
</div>
</div>
</div>
<div class="sect3">
<h4 id="vector-data-load-and-store-functions"><a class="anchor" href="#vector-data-load-and-store-functions"></a>6.15.7. Vector Data Load and Store Functions</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <a href="#table-vector-loadstore">following table</a> describes the list of supported
functions that allow you to read and write vector types from a pointer to
memory.
We use the generic type <code>gentype</code> to indicate the built-in data types
<code>char</code>, <code>uchar</code>, <code>short</code>, <code>ushort</code>, <code>int</code>, <code>uint</code>, <code>long</code> <sup class="footnote">[<a id="_footnoteref_44" class="footnote" href="#_footnotedef_44" title="View footnote.">44</a>]</sup>, <code>ulong</code>,
<code>float</code> or <code>double</code> <sup class="footnote">[<a id="_footnoteref_45" class="footnote" href="#_footnotedef_45" title="View footnote.">45</a>]</sup>.
We use the generic type name <code>gentype<em>n</em></code> to represent n-element vectors
of <code>gentype</code> elements.
We use the type name <code>half<em>n</em></code> to represent n-element vectors of half
elements.
The suffix <em>n</em> is also used in the function names (i.e. <strong>vload<em>n</em></strong>,
<strong>vstore<em>n</em></strong> etc.), where <em>n</em> = 2, 3 <sup class="footnote">[<a id="_footnoteref_46" class="footnote" href="#_footnotedef_46" title="View footnote.">46</a>]</sup>, 4, 8 or
16.</p>
</div>
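<div class="paragraph">
<p>As a rough illustration of how the <em>offset</em> argument and the suffix <em>n</em> combine for these functions, the following plain C99 sketch models <strong>vload3</strong> and <strong>vstore3</strong> for <code>float</code> elements, using the address <code>p + (offset * n)</code> in units of <code>gentype</code> elements. The <code>float3_model</code> type and the <code>ref_</code> names are hypothetical stand-ins; address space qualifiers and alignment requirements are not modeled.</p>
</div>

```c
#include <stddef.h>

/* Hypothetical stand-in for a 3-element float vector. */
typedef struct { float s0, s1, s2; } float3_model;

/* Models vload3(offset, p): reads 3 consecutive floats
   starting at the element address p + offset * 3. */
float3_model ref_vload3(size_t offset, const float *p)
{
    const float *src = p + offset * 3;
    float3_model v = { src[0], src[1], src[2] };
    return v;
}

/* Models vstore3(data, offset, p): writes 3 consecutive floats
   starting at the element address p + offset * 3. */
void ref_vstore3(float3_model data, size_t offset, float *p)
{
    float *dst = p + offset * 3;
    dst[0] = data.s0;
    dst[1] = data.s1;
    dst[2] = data.s2;
}
```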
<table id="table-vector-loadstore" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 17. Built-in Vector Data Load and Store Functions</caption>
<colgroup>
<col style="width: 70%;">
<col style="width: 30%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype<em>n</em> <strong>vload<em>n</em></strong>(size_t <em>offset</em>, const __global gentype *<em>p</em>)<br>
gentype<em>n</em> <strong>vload<em>n</em></strong>(size_t <em>offset</em>, const __local gentype *<em>p</em>)<br>
gentype<em>n</em> <strong>vload<em>n</em></strong>(size_t <em>offset</em>, const __constant gentype *<em>p</em>)<br>
gentype<em>n</em> <strong>vload<em>n</em></strong>(size_t <em>offset</em>, const __private gentype *<em>p</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> gentype<em>n</em> <strong>vload<em>n</em></strong>(size_t <em>offset</em>, const gentype *<em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return <code>sizeof(gentype<em>n</em>)</code> bytes of data, where the first <code>(<em>n</em> *
sizeof(gentype))</code> bytes are read from the address
computed as <code>(<em>p</em> + (<em>offset</em> * <em>n</em>))</code>.
The computed address must be 8-bit aligned if <code>gentype</code> is <code>char</code> or
<code>uchar</code>; 16-bit aligned if <code>gentype</code> is <code>short</code> or <code>ushort</code>; 32-bit
aligned if <code>gentype</code> is <code>int</code>, <code>uint</code>, or <code>float</code>; and 64-bit aligned
if <code>gentype</code> is <code>long</code>, <code>ulong</code>, or <code>double</code>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>vstore<em>n</em></strong>(gentype<em>n</em> <em>data</em>, size_t <em>offset</em>, __global gentype *<em>p</em>)<br>
void <strong>vstore<em>n</em></strong>(gentype<em>n</em> <em>data</em>, size_t <em>offset</em>, __local gentype *<em>p</em>)<br>
void <strong>vstore<em>n</em></strong>(gentype<em>n</em> <em>data</em>, size_t <em>offset</em>, __private gentype *<em>p</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> void <strong>vstore<em>n</em></strong>(gentype<em>n</em> <em>data</em>, size_t <em>offset</em>, gentype *<em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Write <code><em>n</em> * sizeof(gentype)</code> bytes given by <em>data</em> to the address
computed as <code>(<em>p</em> + (<em>offset</em> * <em>n</em>))</code>.
The computed address must be 8-bit aligned if <code>gentype</code> is <code>char</code> or
<code>uchar</code>; 16-bit aligned if <code>gentype</code> is <code>short</code> or <code>ushort</code>; 32-bit
aligned if <code>gentype</code> is <code>int</code>, <code>uint</code>, or <code>float</code>; and 64-bit aligned
if <code>gentype</code> is <code>long</code>, <code>ulong</code>, or <code>double</code>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float <strong>vload_half</strong>(size_t <em>offset</em>, const __global half *<em>p</em>)<br>
float <strong>vload_half</strong>(size_t <em>offset</em>, const __local half *<em>p</em>)<br>
float <strong>vload_half</strong>(size_t <em>offset</em>, const __constant half *<em>p</em>)<br>
float <strong>vload_half</strong>(size_t <em>offset</em>, const __private half *<em>p</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> float <strong>vload_half</strong>(size_t <em>offset</em>, const half *<em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read <code>sizeof(half)</code> bytes of data from the address computed as <code>(<em>p</em>
+ <em>offset</em>)</code>.
The data read is interpreted as a <code>half</code> value.
The <code>half</code> value is converted to a <code>float</code> value and the <code>float</code> value
is returned.
The computed read address must be 16-bit aligned.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float<em>n</em> <strong>vload_half<em>n</em></strong>(size_t <em>offset</em>, const __global half *<em>p</em>)<br>
float<em>n</em> <strong>vload_half<em>n</em></strong>(size_t <em>offset</em>, const __local half *<em>p</em>)<br>
float<em>n</em> <strong>vload_half<em>n</em></strong>(size_t <em>offset</em>, const __constant half *<em>p</em>)<br>
float<em>n</em> <strong>vload_half<em>n</em></strong>(size_t <em>offset</em>, const __private half *<em>p</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> float<em>n</em> <strong>vload_half<em>n</em></strong>(size_t <em>offset</em>, const half *<em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read <code>(<em>n</em> * sizeof(half))</code> bytes of data from the address computed as
<code>(<em>p</em> + (<em>offset</em> * <em>n</em>))</code>.
The data read is interpreted as a <code>half<em>n</em></code> value.
The <code>half<em>n</em></code> value read is converted to a <code>float<em>n</em></code> value and
the <code>float<em>n</em></code> value is returned.
The computed read address must be 16-bit aligned.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>vstore_half</strong>(float <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half_rte</strong>(float <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half_rtz</strong>(float <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half_rtp</strong>(float <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half_rtn</strong>(float <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br></p>
<p class="tableblock"> void <strong>vstore_half</strong>(float <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half_rte</strong>(float <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half_rtz</strong>(float <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half_rtp</strong>(float <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half_rtn</strong>(float <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br></p>
<p class="tableblock"> void <strong>vstore_half</strong>(float <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half_rte</strong>(float <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half_rtz</strong>(float <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half_rtp</strong>(float <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half_rtn</strong>(float <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> void <strong>vstore_half</strong>(float <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half_rte</strong>(float <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half_rtz</strong>(float <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half_rtp</strong>(float <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half_rtn</strong>(float <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The <code>float</code> value given by <em>data</em> is first converted to a <code>half</code> value
using the appropriate rounding mode.
The <code>half</code> value is then written to the address computed as <code>(<em>p</em>
+ <em>offset</em>)</code>.
The computed address must be 16-bit aligned.</p>
<p class="tableblock"> <strong>vstore_half</strong> uses the default rounding mode.
The default rounding mode is round to nearest even.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>vstore_half<em>n</em></strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rte</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtz</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtp</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtn</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br></p>
<p class="tableblock"> void <strong>vstore_half<em>n</em></strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rte</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtz</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtp</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtn</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br></p>
<p class="tableblock"> void <strong>vstore_half<em>n</em></strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rte</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtz</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtp</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtn</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> void <strong>vstore_half<em>n</em></strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rte</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtz</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtp</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtn</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The <code>float<em>n</em></code> value given by <em>data</em> is converted to a <code>half<em>n</em></code>
value using the appropriate rounding mode.
<code><em>n</em> * sizeof(half)</code> bytes from the <code>half<em>n</em></code> value are then written to
the address computed as <code>(<em>p</em>
+ (<em>offset</em> * <em>n</em>))</code>.
The computed address must be 16-bit aligned.</p>
<p class="tableblock"> <strong>vstore_half<em>n</em></strong> uses the default rounding mode.
The default rounding mode is round to nearest even.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>vstore_half</strong>(double <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half_rte</strong>(double <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half_rtz</strong>(double <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half_rtp</strong>(double <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half_rtn</strong>(double <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br></p>
<p class="tableblock"> void <strong>vstore_half</strong>(double <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half_rte</strong>(double <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half_rtz</strong>(double <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half_rtp</strong>(double <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half_rtn</strong>(double <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br></p>
<p class="tableblock"> void <strong>vstore_half</strong>(double <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half_rte</strong>(double <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half_rtz</strong>(double <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half_rtp</strong>(double <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half_rtn</strong>(double <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> void <strong>vstore_half</strong>(double <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half_rte</strong>(double <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half_rtz</strong>(double <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half_rtp</strong>(double <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half_rtn</strong>(double <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The <code>double</code> value given by <em>data</em> is first converted to a <code>half</code>
value using the appropriate rounding mode.
The <code>half</code> value is then written to the address computed as <code>(<em>p</em>
+ <em>offset</em>)</code>.
The computed address must be 16-bit aligned.</p>
<p class="tableblock"> <strong>vstore_half</strong> uses the default rounding mode.
The default rounding mode is round to nearest even.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>vstore_half<em>n</em></strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rte</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtz</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtp</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtn</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br></p>
<p class="tableblock"> void <strong>vstore_half<em>n</em></strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rte</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtz</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtp</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtn</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br></p>
<p class="tableblock"> void <strong>vstore_half<em>n</em></strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rte</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtz</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtp</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtn</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> void <strong>vstore_half<em>n</em></strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rte</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtz</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtp</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstore_half<em>n</em>_rtn</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The <code>double<em>n</em></code> value given by <em>data</em> is converted to a <code>half<em>n</em></code>
value using the appropriate rounding mode.
<code><em>n</em> * sizeof(half)</code> bytes from the <code>half<em>n</em></code> value are then written to
the address computed as <code>(<em>p</em> + (<em>offset</em> * <em>n</em>))</code>.
The computed address must be 16-bit aligned.</p>
<p class="tableblock"> <strong>vstore_half<em>n</em></strong> uses the default rounding mode.
The default rounding mode is round to nearest even.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float<em>n</em> <strong>vloada_half<em>n</em></strong>(size_t <em>offset</em>, const __global half *<em>p</em>)<br>
float<em>n</em> <strong>vloada_half<em>n</em></strong>(size_t <em>offset</em>, const __local half *<em>p</em>)<br>
float<em>n</em> <strong>vloada_half<em>n</em></strong>(size_t <em>offset</em>, const __constant half *<em>p</em>)<br>
float<em>n</em> <strong>vloada_half<em>n</em></strong>(size_t <em>offset</em>, const __private half *<em>p</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> float<em>n</em> <strong>vloada_half<em>n</em></strong>(size_t <em>offset</em>, const half *<em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">For n = 2, 4, 8 and 16, read <code>sizeof(half<em>n</em>)</code> bytes of data from
the address computed as <code>(<em>p</em> + (<em>offset</em> * <em>n</em>))</code>.
The data read is interpreted as a <code>half<em>n</em></code> value.
The <code>half<em>n</em></code> value read is converted to a <code>float<em>n</em></code> value and
the <code>float<em>n</em></code> value is returned.
The computed address must be aligned to <code>sizeof(half<em>n</em>)</code> bytes.</p>
<p class="tableblock"> For n = 3, <strong>vloada_half3</strong> reads a <code>half3</code> from the address computed as
<code>(<em>p</em> + (<em>offset</em> * 4))</code> and returns a <code>float3</code>.
The computed address must be aligned to <code>sizeof(half) * 4</code> bytes.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>vstorea_half<em>n</em></strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rte</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtz</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtp</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtn</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br></p>
<p class="tableblock"> void <strong>vstorea_half<em>n</em></strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rte</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtz</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtp</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtn</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br></p>
<p class="tableblock"> void <strong>vstorea_half<em>n</em></strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rte</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtz</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtp</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtn</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> void <strong>vstorea_half<em>n</em></strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rte</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtz</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtp</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtn</strong>(float<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The <code>float<em>n</em></code> value given by <em>data</em> is converted to a <code>half<em>n</em></code>
value using the appropriate rounding mode.</p>
<p class="tableblock"> For n = 2, 4, 8 and 16, the <code>half<em>n</em></code> value is written to the
address computed as <code>(<em>p</em> + (<em>offset</em> * <em>n</em>))</code>.
The computed address must be aligned to <code>sizeof(half<em>n</em>)</code> bytes.</p>
<p class="tableblock"> For n = 3, the <code>half3</code> value is written
to the address computed as <code>(<em>p</em> + (<em>offset</em> * 4))</code>.
The computed address must be aligned to <code>sizeof(half) * 4</code> bytes.</p>
<p class="tableblock"> <strong>vstorea_half<em>n</em></strong> uses the default rounding mode.
The default rounding mode is round to nearest even.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>vstorea_half<em>n</em></strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rte</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtz</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtp</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtn</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __global half *<em>p</em>)<br></p>
<p class="tableblock"> void <strong>vstorea_half<em>n</em></strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rte</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtz</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtp</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtn</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __local half *<em>p</em>)<br></p>
<p class="tableblock"> void <strong>vstorea_half<em>n</em></strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rte</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtz</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtp</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtn</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, __private half *<em>p</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0, or OpenCL C 3.0 or newer with the
<code>__opencl_c_generic_address_space</code> feature:<br></p>
<p class="tableblock"> void <strong>vstorea_half<em>n</em></strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rte</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtz</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtp</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)<br>
void <strong>vstorea_half<em>n</em>_rtn</strong>(double<em>n</em> <em>data</em>, size_t <em>offset</em>, half *<em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The <code>double<em>n</em></code> value given by <em>data</em> is converted to a <code>half<em>n</em></code>
value using the appropriate rounding mode.</p>
<p class="tableblock"> For n = 2, 4, 8 and 16, the <code>half<em>n</em></code> value is written to the address
computed as <code>(<em>p</em> + (<em>offset</em> * <em>n</em>))</code>.
The computed address must be aligned to <code>sizeof(half<em>n</em>)</code> bytes.</p>
<p class="tableblock"> For n = 3, the <code>half3</code> value is written
to the address computed as <code>(<em>p</em> + (<em>offset</em> * 4))</code>.
The computed address must be aligned to <code>sizeof(half) * 4</code> bytes.</p>
<p class="tableblock"> <strong>vstorea_half<em>n</em></strong> uses the default rounding mode.
The default rounding mode is round to nearest even.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>The results of vector data load and store functions are undefined if the
address being read from or written to is not correctly aligned as described
in <a href="#table-vector-loadstore">Built-in Vector Data Load and Store Functions</a>.
For the store functions described in that table, the pointer argument <em>p</em>
can point to <code>global</code>, <code>local</code>, or <code>private</code> memory.
For the load functions, <em>p</em> can point to <code>global</code>, <code>local</code>,
<code>constant</code>, or <code>private</code> memory.</p>
</div>
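<div class="paragraph">
<p>The address computation for <strong>vload<em>n</em></strong> and <strong>vstore<em>n</em></strong> can be
illustrated with a minimal, non-normative sketch.
The kernel name <code>scale4</code> and its arguments are hypothetical, and the sketch
assumes both buffers hold at least four <code>float</code> elements per work-item.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code class="language-c" data-lang="c">// Hypothetical kernel: each work-item scales one group of four floats.
__kernel void scale4(__global const float *src,
                     __global float *dst,
                     float factor)
{
    size_t i = get_global_id(0);

    // Reads src[4*i] .. src[4*i + 3]; the address is (src + (i * 4)).
    // Only 32-bit (float) alignment is required, not sizeof(float4).
    float4 v = vload4(i, src);

    // Writes dst[4*i] .. dst[4*i + 3] to the address (dst + (i * 4)).
    vstore4(v * factor, i, dst);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Because <em>offset</em> is scaled by <em>n</em>, it counts whole vectors rather than
scalar elements.</p>
</div>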
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>Variants of the vector data load and store functions that take pointer
arguments pointing to the generic address space are also supported.</p>
</div>
</td>
</tr>
</table>
</div>
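<div class="paragraph">
<p>The half-precision load and store functions can be sketched in the same way.
The example below is illustrative only: the kernel names <code>halve</code> and
<code>halve3</code> are hypothetical, and it assumes that the buffers are large enough
and that, for the second kernel, three-component elements are laid out with a
stride of four <code>half</code> values.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code class="language-c" data-lang="c">// Hypothetical kernel: halve values stored as half, computing in float.
__kernel void halve(__global const half *src, __global half *dst)
{
    size_t i = get_global_id(0);

    // Reads the half at (src + i) and converts it to float.
    float x = vload_half(i, src);

    // Converts to half, rounding toward zero, and writes to (dst + i).
    vstore_half_rtz(x * 0.5f, i, dst);
}

// Hypothetical kernel: three-component data with a stride of four halfs.
__kernel void halve3(__global const half *src, __global half *dst)
{
    size_t i = get_global_id(0);

    // Reads a half3 from (src + (i * 4)) and converts it to float3.
    float3 v = vloada_half3(i, src);

    // Writes a half3 to (dst + (i * 4)) using the default rounding mode.
    vstorea_half3(v * 0.5f, i, dst);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>In the second kernel the offset is scaled by 4 rather than 3, which keeps every
computed address aligned to <code>sizeof(half) * 4</code> bytes provided the buffer
itself starts at such an address.</p>
</div>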
</div>
</div>
</div>
<div class="sect3">
<h4 id="synchronization-functions"><a class="anchor" href="#synchronization-functions"></a>6.15.8. Synchronization Functions</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The following table describes built-in functions to synchronize the work-items
in a work-group.</p>
</div>
<table id="table-builtin-synchronization" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 18. Built-in Work-Group Synchronization Functions</caption>
<colgroup>
<col style="width: 30%;">
<col style="width: 70%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top"><strong>Function</strong></th>
<th class="tableblock halign-left valign-top"><strong>Description</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>barrier</strong>(<br>
cl_mem_fence_flags <em>flags</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0 or newer, as an alias for <strong>barrier</strong>:<br></p>
<p class="tableblock"> void <strong>work_group_barrier</strong>(<br>
cl_mem_fence_flags <em>flags</em>)<br></p>
<p class="tableblock"> void <strong>work_group_barrier</strong>(<br>
cl_mem_fence_flags <em>flags</em>,<br>
memory_scope <em>scope</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">For these functions, if any work-item in a work-group encounters a
barrier, the barrier must be encountered by all work-items in the
work-group before any are allowed to continue execution beyond the
barrier.</p>
<p class="tableblock"> If the barrier is inside a conditional statement, then all
work-items in the work-group must enter the conditional if any work-item in the work-group enters the
conditional statement and executes the barrier.</p>
<p class="tableblock"> If the barrier is inside a loop, then all work-items in the work-group must execute
the barrier on each iteration of the loop if any work-item executes the barrier on that iteration.</p>
<p class="tableblock"> The <strong>barrier</strong> and <strong>work_group_barrier</strong> functions can specify which
memory operations become visible to the appropriate memory scope
identified by <em>scope</em> <sup class="footnote">[<a id="_footnoteref_47" class="footnote" href="#_footnotedef_47" title="View footnote.">47</a>]</sup>.
The <em>flags</em> argument specifies the memory address spaces.
This is a bitfield and can be set to 0 or a combination of the
following values OR&#8217;ed together.
When these flags are OR&#8217;ed together, the barrier acts as a
combined barrier for all address spaces specified by the flags,
ordering memory accesses both within and across the specified address
spaces.
For <strong>barrier</strong> and the <strong>work_group_barrier</strong> variant that does not take a
memory scope, the <em>scope</em> is <code>memory_scope_work_group</code>.</p>
<p class="tableblock"> <code>CLK_LOCAL_MEM_FENCE</code> - ensure
that all <code>local</code> memory accesses become visible to all work-items in the
work-group.
Note that the value of <em>scope</em> is ignored as the memory scope is
always <code>memory_scope_work_group</code>.</p>
<p class="tableblock"> <code>CLK_GLOBAL_MEM_FENCE</code> - ensure that
all <code>global</code> memory accesses become visible to the appropriate memory scope
as given by <em>scope</em>.</p>
<p class="tableblock"> <code>CLK_IMAGE_MEM_FENCE</code> - ensure that all image memory accesses
become visible to the appropriate scope given by <em>scope</em>.
The value of <em>scope</em> must be <code>memory_scope_work_group</code>.</p>
<p class="tableblock"> The values of <em>flags</em> and <em>scope</em> must be the same for all work-items
in the work-group.</p></td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The functionality described in the following table <a href="#unified-spec">requires</a> support for OpenCL 3.0 or newer and the <code>__opencl_c_subgroups</code>
feature.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The following table describes built-in functions to synchronize the work-items
in a subgroup.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<caption class="title">Table 19. Built-in Subgroup Synchronization Functions</caption>
<colgroup>
<col style="width: 30%;">
<col style="width: 70%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top"><strong>Function</strong></th>
<th class="tableblock halign-left valign-top"><strong>Description</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>sub_group_barrier</strong>(<br>
cl_mem_fence_flags <em>flags</em>)<br></p>
<p class="tableblock"> void <strong>sub_group_barrier</strong>(<br>
cl_mem_fence_flags <em>flags</em>,<br>
memory_scope <em>scope</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">For these functions, if any work-item in a subgroup encounters a
<strong>sub_group_barrier</strong>, the barrier must be encountered by all work-items in the
subgroup before any are allowed to continue execution beyond the barrier.</p>
<p class="tableblock"> If <strong>sub_group_barrier</strong> is inside a conditional statement, then all
work-items within the subgroup must enter the conditional if any work-item in
the subgroup enters the conditional statement and executes the
<strong>sub_group_barrier</strong>.</p>
<p class="tableblock"> If the <strong>sub_group_barrier</strong> is inside a loop, then all work-items in the subgroup
must execute the barrier on each iteration of the loop if any work-item executes the barrier on that iteration.</p>
<p class="tableblock"> The <strong>sub_group_barrier</strong> function can specify which
memory operations become visible to the appropriate memory scope
identified by <em>scope</em>.
The <em>flags</em> argument specifies the memory address spaces.
This is a bitfield and can be set to 0 or a combination of the
following values OR&#8217;ed together.
When these flags are OR&#8217;ed together, the barrier acts as a
combined barrier for all address spaces specified by the flags,
ordering memory accesses both within and across the specified address
spaces.
For the <strong>sub_group_barrier</strong> variant that does not take a
memory scope, the <em>scope</em> is <code>memory_scope_sub_group</code>.</p>
<p class="tableblock"> <code>CLK_LOCAL_MEM_FENCE</code> - The <strong>sub_group_barrier</strong> function will either flush
any variables stored in local memory or queue a memory fence to ensure
correct ordering of memory operations to local memory.</p>
<p class="tableblock"> <code>CLK_GLOBAL_MEM_FENCE</code> - The <strong>sub_group_barrier</strong> function will queue a
memory fence to ensure correct ordering of memory operations to global
memory.
This can be useful when work-items, for example, write to buffer objects
and then want to read the updated data from these buffer objects.</p>
<p class="tableblock"> <code>CLK_IMAGE_MEM_FENCE</code> - The <strong>sub_group_barrier</strong> function will queue a memory
fence to ensure correct ordering of memory operations to image objects.
This can be useful when work-items, for example, write to image objects
and then want to read the updated data from these image objects.</p>
<p class="tableblock"> The value of <em>scope</em> must match the requirements of the
<a href="#atomic-restrictions">atomic restrictions section</a>.</p></td>
</tr>
</tbody>
</table>
</div>
<div class="sect3">
<h4 id="legacy-mem-fence-functions"><a class="anchor" href="#legacy-mem-fence-functions"></a>6.15.9. Legacy Explicit Memory Fence Functions</h4>
<div class="openblock">
<div class="content">
<div class="admonitionblock important">
<table>
<tr>
<td class="icon">
<i class="fa icon-important" title="Important"></i>
</td>
<td class="content">
The memory fence functions described in this sub-section are
<a href="#unified-spec">deprecated by</a> OpenCL C 2.0.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The OpenCL C programming language implements the following explicit memory fence functions to provide ordering between memory operations of a work-item.</p>
</div>
<table id="table-builtin-explicit-memory-fences" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 20. Built-in Explicit Memory Fence Functions</caption>
<colgroup>
<col style="width: 30%;">
<col style="width: 70%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top"><strong>Function</strong></th>
<th class="tableblock halign-left valign-top"><strong>Description</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>mem_fence</strong>(<br>
cl_mem_fence_flags <em>flags</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Orders loads and stores of a work-item executing a kernel. This means that
loads and stores preceding the <strong>mem_fence</strong> will be committed to memory
before any loads and stores following the <strong>mem_fence</strong>.</p>
<p class="tableblock"> The <em>flags</em> argument specifies the memory address space and can be set to a
combination of the following literal values:</p>
<p class="tableblock"> <code>CLK_LOCAL_MEM_FENCE</code><br>
<code>CLK_GLOBAL_MEM_FENCE</code></p>
<p class="tableblock"> The value of <em>flags</em> must be the same for all work-items in the work-group.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>read_mem_fence</strong>(<br>
cl_mem_fence_flags <em>flags</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read memory barrier that orders only loads.</p>
<p class="tableblock"> The <em>flags</em> argument specifies the memory address space and can be set to a
combination of the following literal values:</p>
<p class="tableblock"> <code>CLK_LOCAL_MEM_FENCE</code><br>
<code>CLK_GLOBAL_MEM_FENCE</code></p>
<p class="tableblock"> The value of <em>flags</em> must be the same for all work-items in the work-group.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>write_mem_fence</strong>(<br>
cl_mem_fence_flags <em>flags</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Write memory barrier that orders only stores.</p>
<p class="tableblock"> The <em>flags</em> argument specifies the memory address space and can be set to a
combination of the following literal values:</p>
<p class="tableblock"> <code>CLK_LOCAL_MEM_FENCE</code><br>
<code>CLK_GLOBAL_MEM_FENCE</code></p>
<p class="tableblock"> The value of <em>flags</em> must be the same for all work-items in the work-group.</p></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="sect3">
<h4 id="address-space-qualifier-functions"><a class="anchor" href="#address-space-qualifier-functions"></a>6.15.10. Address Space Qualifier Functions</h4>
<div class="openblock">
<div class="content">
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The functionality described in this section <a href="#unified-spec">requires</a>
support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the
<code>__opencl_c_generic_address_space</code> feature.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>This section describes built-in functions to safely convert from pointers
to the generic address space to pointers to named address spaces, and to
query the appropriate fence flags for a pointer to the generic address space.
We use the generic type name <code>gentype</code> to indicate any of the built-in data
types supported by OpenCL C or a user-defined type.</p>
</div>
<table id="table-builtin-address-qualifier" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 21. Built-in Address Space Qualifier Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">global gentype * <strong>to_global</strong>(gentype *<em>ptr</em>)<br>
const global gentype * <strong>to_global</strong>(const gentype *<em>ptr</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns a pointer that points to a region in the <code>global</code> address
space if <strong>to_global</strong> can cast <em>ptr</em> to the <code>global</code> address space.
Otherwise it returns <code>NULL</code>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">local gentype * <strong>to_local</strong>(gentype *<em>ptr</em>)<br>
const local gentype * <strong>to_local</strong>(const gentype *<em>ptr</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns a pointer that points to a region in the <code>local</code> address space
if <strong>to_local</strong> can cast <em>ptr</em> to the <code>local</code> address space.
Otherwise it returns <code>NULL</code>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">private gentype * <strong>to_private</strong>(gentype *<em>ptr</em>)<br>
const private gentype * <strong>to_private</strong>(const gentype *<em>ptr</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns a pointer that points to a region in the <code>private</code> address
space if <strong>to_private</strong> can cast <em>ptr</em> to the <code>private</code> address space.
Otherwise it returns <code>NULL</code>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">cl_mem_fence_flags <strong>get_fence</strong>(gentype *<em>ptr</em>)<br>
cl_mem_fence_flags <strong>get_fence</strong>(const gentype *<em>ptr</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns a valid memory fence value for <em>ptr</em>.</p></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="sect3">
<h4 id="async-copies"><a class="anchor" href="#async-copies"></a>6.15.11. Async Copies from Global to Local Memory, Local to Global Memory, and Prefetch</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The OpenCL C programming language implements the <a href="#table-builtin-async-copy">following functions</a> that provide asynchronous copies between <code>global</code> and
<code>local</code> memory and a prefetch from <code>global</code> memory.</p>
</div>
<div class="paragraph">
<p>We use the generic type name <code>gentype</code> to indicate the built-in data types <code>char</code>,
<code>char{2|3|4|8|16}</code>, <code>uchar</code>, <code>uchar{2|3|4|8|16}</code>, <code>short</code>, <code>short{2|3|4|8|16}</code>,
<code>ushort</code>, <code>ushort{2|3|4|8|16}</code>, <code>int</code>, <code>int{2|3|4|8|16}</code>, <code>uint</code>,
<code>uint{2|3|4|8|16}</code>, <code>long</code> <sup class="footnote">[<a id="_footnoteref_48" class="footnote" href="#_footnotedef_48" title="View footnote.">48</a>]</sup>, <code>long{2|3|4|8|16}</code>,
<code>ulong</code>, <code>ulong{2|3|4|8|16}</code>, <code>float</code>, <code>float{2|3|4|8|16}</code>, <code>double</code>
<sup class="footnote">[<a id="_footnoteref_49" class="footnote" href="#_footnotedef_49" title="View footnote.">49</a>]</sup>, or <code>double{2|3|4|8|16}</code> as the type for
the arguments unless otherwise stated <sup class="footnote">[<a id="_footnoteref_50" class="footnote" href="#_footnotedef_50" title="View footnote.">50</a>]</sup>.</p>
</div>
<table id="table-builtin-async-copy" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 22. Built-in Async Copy and Prefetch Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">event_t <strong>async_work_group_copy</strong>(__local gentype *<em>dst</em>,
const __global gentype *<em>src</em>, size_t <em>num_gentypes</em>, event_t <em>event</em>)<br>
event_t <strong>async_work_group_copy</strong>(__global gentype *<em>dst</em>,
const __local gentype *<em>src</em>, size_t <em>num_gentypes</em>, event_t <em>event</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Perform an async copy of <em>num_gentypes</em> gentype elements from <em>src</em> to
<em>dst</em>.
The async copy is performed by all work-items in a work-group and this
built-in function must therefore be encountered by all work-items in a
work-group executing the kernel with the same argument values;
otherwise the results are undefined.
This rule applies to ND-ranges implemented with uniform and
non-uniform work-groups.</p>
<p class="tableblock"> Returns an event object that can be used by <strong>wait_group_events</strong> to
wait for the async copy to finish.
The <em>event</em> argument can also be used to associate the
<strong>async_work_group_copy</strong> with a previous async copy allowing an event
to be shared by multiple async copies; otherwise <em>event</em> should be
zero.</p>
<p class="tableblock"> 0 can be implicitly and explicitly cast to <code>event_t</code> type.</p>
<p class="tableblock"> If the <em>event</em> argument is non-zero, the event object supplied in the <em>event</em>
argument will be returned.</p>
<p class="tableblock"> This function does not perform any implicit synchronization of source
data such as using a <strong>barrier</strong> before performing the copy.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">event_t <strong>async_work_group_strided_copy</strong>(__local gentype *<em>dst</em>,
const __global gentype *<em>src</em>, size_t <em>num_gentypes</em>, size_t <em>src_stride</em>,
event_t <em>event</em>)<br>
event_t <strong>async_work_group_strided_copy</strong>(__global gentype *<em>dst</em>,
const __local gentype *<em>src</em>, size_t <em>num_gentypes</em>, size_t <em>dst_stride</em>,
event_t <em>event</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Perform an async gather of <em>num_gentypes</em> <code>gentype</code> elements from
<em>src</em> to <em>dst</em>.
The <em>src_stride</em> is the stride in elements for each <code>gentype</code>
element read from <em>src</em>.
The <em>dst_stride</em> is the stride in elements for each <code>gentype</code> element
written to <em>dst</em>.
The async gather is performed by all work-items in a work-group.
This built-in function must therefore be encountered by all work-items
in a work-group executing the kernel with the same argument values;
otherwise the results are undefined.
This rule applies to ND-ranges implemented with uniform and
non-uniform work-groups.</p>
<p class="tableblock"> Returns an event object that can be used by <strong>wait_group_events</strong> to
wait for the async copy to finish.
The <em>event</em> argument can also be used to associate the
<strong>async_work_group_strided_copy</strong> with a previous async copy allowing an
event to be shared by multiple async copies; otherwise <em>event</em> should
be zero.</p>
<p class="tableblock"> 0 can be implicitly and explicitly cast to <code>event_t</code> type.</p>
<p class="tableblock"> If the <em>event</em> argument is non-zero, the event object supplied in the <em>event</em>
argument will be returned.</p>
<p class="tableblock"> This function does not perform any implicit synchronization of source
data such as using a <strong>barrier</strong> before performing the copy.</p>
<p class="tableblock"> The behavior of <strong>async_work_group_strided_copy</strong> is undefined if
<em>src_stride</em> or <em>dst_stride</em> is 0, or if the <em>src_stride</em> or
<em>dst_stride</em> values cause the <em>src</em> or <em>dst</em> pointers to exceed the
upper bounds of the address space during the copy.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.1 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>wait_group_events</strong>(int <em>num_events</em>, event_t *<em>event_list</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Wait for events that identify the <strong>async_work_group_copy</strong> operations
to complete.
The event objects specified in <em>event_list</em> will be released after the
wait is performed.</p>
<p class="tableblock"> This function must be encountered by all work-items in a work-group
executing the kernel with the same <em>num_events</em> and event objects
specified in <em>event_list</em>; otherwise the results are undefined.
This rule applies to ND-ranges implemented with uniform and
non-uniform work-groups.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>prefetch</strong>(const __global gentype *<em>p</em>, size_t <em>num_gentypes</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Prefetch <code><em>num_gentypes</em> * sizeof(gentype)</code> bytes into the global
cache.
The prefetch instruction is applied to a work-item in a work-group and
does not affect the functional behavior of the kernel.</p></td>
</tr>
</tbody>
</table>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>The kernel must wait for the completion of all async copies using the
<strong>wait_group_events</strong> built-in function before exiting; otherwise the behavior
is undefined.</p>
</div>
</td>
</tr>
</table>
</div>
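<div class="paragraph">
<p>As a rough illustration of the element indexing described for <strong>async_work_group_strided_copy</strong>, the plain C sketch below models the two strided variants sequentially on a host.
The helper names <code>model_strided_gather</code> and <code>model_strided_scatter</code> are hypothetical and are not part of OpenCL C, and <code>int</code> stands in for <code>gentype</code>.
An actual implementation distributes the copy across the work-items of a work-group and completes asynchronously, so this only shows which elements are read and written.</p>
</div>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side model of the global-to-local variant:
 * element i of dst is read from src[i * src_stride]. */
void model_strided_gather(int *dst, const int *src,
                          size_t num_gentypes, size_t src_stride)
{
    assert(src_stride != 0); /* a stride of 0 is undefined behavior in OpenCL C */
    for (size_t i = 0; i < num_gentypes; ++i)
        dst[i] = src[i * src_stride];
}

/* Hypothetical host-side model of the local-to-global variant:
 * element i of src is written to dst[i * dst_stride]. */
void model_strided_scatter(int *dst, const int *src,
                           size_t num_gentypes, size_t dst_stride)
{
    assert(dst_stride != 0); /* a stride of 0 is undefined behavior in OpenCL C */
    for (size_t i = 0; i < num_gentypes; ++i)
        dst[i * dst_stride] = src[i];
}
```

<div class="paragraph">
<p>In this model the caller must size the strided buffer to hold at least <code>(num_gentypes - 1) * stride + 1</code> elements, which mirrors the requirement that the stride must not carry the pointers past the bounds of the address space.</p>
</div>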
</div>
</div>
</div>
<div class="sect3">
<h4 id="atomic-functions"><a class="anchor" href="#atomic-functions"></a>6.15.12. Atomic Functions</h4>
<div class="admonitionblock important">
<table>
<tr>
<td class="icon">
<i class="fa icon-important" title="Important"></i>
</td>
<td class="content">
The C11 style atomic functions in this sub-section <a href="#unified-spec">require</a> support for OpenCL 2.0 or newer. However, this statement does not
apply to the <a href="#atomic-legacy">"OpenCL C 1.x Legacy Atomics"</a> descriptions at
the end of this sub-section.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The OpenCL C programming language implements a subset of the C11 atomics
(refer to <a href="#C11-spec">section 7.17 of the C11 Specification</a>) and
synchronization operations.
These operations play a special role in making assignments in one work-item
visible to another.
A synchronization operation on one or more memory locations is either an
acquire operation, a release operation, or both an acquire and release
operation <sup class="footnote">[<a id="_footnoteref_51" class="footnote" href="#_footnotedef_51" title="View footnote.">51</a>]</sup>.
A synchronization operation without an associated memory location is a fence
and can be either an acquire fence, a release fence or both an acquire and
release fence.
In addition, there are relaxed atomic operations, which are not
synchronization operations, and atomic read-modify-write operations which
have special characteristics.</p>
</div>
<div class="paragraph">
<p>The types include</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><code>memory_order</code></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>which is an enumerated type whose enumerators identify memory ordering
constraints;</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><code>memory_scope</code></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>which is an enumerated type whose enumerators identify scope of memory
ordering constraints;</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><code>atomic_flag</code></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>which is a 32-bit integer type representing a primitive atomic flag; and
several atomic analogs of integer types.</p>
</div>
<div class="paragraph">
<p>In the following operation definitions:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>An A refers to one of the atomic types.</p>
</li>
<li>
<p>A C refers to its corresponding non-atomic type.</p>
</li>
<li>
<p>An M refers to the type of the other argument for arithmetic operations.
For atomic integer types, M is C.</p>
</li>
<li>
<p>The functions not ending in <code>_explicit</code> have the same semantics as the
corresponding <code>_explicit</code> function with <code>memory_order_seq_cst</code> for the
<code>memory_order</code> argument.</p>
</li>
<li>
<p>The functions that do not have a <code>memory_scope</code> argument have the same
semantics as the corresponding functions with the <code>memory_scope</code>
argument set to <code>memory_scope_device</code>.</p>
</li>
</ul>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>With fine-grained system SVM, sharing happens at the granularity of
individual loads and stores anywhere in host memory.
Memory consistency is always guaranteed at synchronization points, but to
obtain finer control over consistency, the OpenCL atomics functions may be
used to ensure that the updates to individual data values made by one unit
of execution are visible to other execution units.
In particular, when a host thread needs fine control over the consistency of
memory that is shared with one or more OpenCL devices, it must use atomic
and fence operations that are compatible with the C11 atomic operations.</p>
</div>
<div class="paragraph">
<p>We can&#8217;t require <a href="#C11-spec">C11 atomics</a> since host programs can be
implemented in other programming languages and versions of C or C++, but we
do require that the host programs use atomics and that those atomics be
compatible with those in C11.</p>
</div>
</td>
</tr>
</table>
</div>
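<div class="paragraph">
<p>To make the host-side requirement in the note above more concrete, the following is a minimal sketch of a host program written in C11 that publishes a value with a release store and reads it with an acquire load.
The <code>struct mailbox</code> type and the <code>mailbox_publish</code> and <code>mailbox_try_consume</code> helpers are hypothetical names introduced only for this illustration.
The sketch assumes the structure would be placed in a fine-grained SVM allocation shared with a device; that allocation step belongs to the OpenCL API and is not shown.</p>
</div>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared record; assumed to live in memory shared with a device. */
struct mailbox {
    int payload;      /* ordinary, non-atomic data */
    atomic_int ready; /* flag that publishes the payload */
};

/* Producer side: write the payload, then publish it with a release store. */
void mailbox_publish(struct mailbox *m, int value)
{
    assert(m != NULL);
    m->payload = value;
    atomic_store_explicit(&m->ready, 1, memory_order_release);
}

/* Consumer side: an acquire load of the flag; the payload is read only
 * once the flag has been observed as set. */
bool mailbox_try_consume(struct mailbox *m, int *out)
{
    assert(m != NULL && out != NULL);
    if (atomic_load_explicit(&m->ready, memory_order_acquire) == 0)
        return false;
    *out = m->payload;
    return true;
}
```

<div class="paragraph">
<p>The release store paired with the acquire load is what orders the non-atomic payload write before the read on the other side; a relaxed store and load on the flag would not provide that ordering.</p>
</div>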
<div class="sect4">
<h5 id="the-atomic_var_init-macro"><a class="anchor" href="#the-atomic_var_init-macro"></a>6.15.12.1. The <code>ATOMIC_VAR_INIT</code> macro</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>ATOMIC_VAR_INIT</code> macro expands to a token sequence suitable for
initializing an atomic object of a type that is initialization-compatible
with <em>value</em>.
An atomic object with automatic storage duration that is not explicitly
initialized using <code>ATOMIC_VAR_INIT</code> is initially in an indeterminate state;
however, the default (zero) initialization for objects with <code>static</code> storage
duration is guaranteed to produce a valid state.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="preprocessor">#define</span> ATOMIC_VAR_INIT(C value)</code></pre>
</div>
</div>
<div class="paragraph">
<p>This macro can only be used to initialize atomic objects that are declared
in program scope in the <code>global</code> address space.</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">global atomic_int guide = ATOMIC_VAR_INIT(<span class="integer">42</span>);</code></pre>
</div>
</div>
<div class="paragraph">
<p>Concurrent access to the variable being initialized, even via an atomic
operation, constitutes a data-race.</p>
</div>
</div>
</div>
</div>
<div class="sect4">
<h5 id="the-atomic_init-function"><a class="anchor" href="#the-atomic_init-function"></a>6.15.12.2. The atomic_init function</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>atomic_init</code> function non-atomically initializes the atomic object
pointed to by <em>obj</em> to the value <em>value</em>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Requires OpenCL C 3.0 or newer.</span>
<span class="directive">void</span> atomic_init(<span class="directive">volatile</span> __global A *obj, C value)
<span class="directive">void</span> atomic_init(<span class="directive">volatile</span> __local A *obj, C value)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the</span>
<span class="comment">// __opencl_c_generic_address_space feature.</span>
<span class="directive">void</span> atomic_init(<span class="directive">volatile</span> A *obj, C value)</code></pre>
</div>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">local atomic_int guide;
<span class="keyword">if</span> (get_local_id(<span class="integer">0</span>) == <span class="integer">0</span>)
    atomic_init(&amp;guide, <span class="integer">42</span>);
work_group_barrier(CLK_LOCAL_MEM_FENCE);</code></pre>
</div>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The function variant that uses the generic address space, i.e. no
explicit address space is listed, <a href="#unified-spec">requires</a> support for OpenCL
C 2.0, or OpenCL C 3.0 or newer and the <code>__opencl_c_generic_address_space</code>
feature.
</td>
</tr>
</table>
</div>
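<div class="paragraph">
<p>For comparison, C11 provides a host-side <code>atomic_init</code> with the same non-atomic initialization semantics.
The sketch below is plain C11 rather than OpenCL C, and the <code>struct counter</code> type and the <code>counter_setup</code> and <code>counter_bump</code> helpers are hypothetical names used only for illustration.
It shows initialization completing before any atomic operation touches the object.</p>
</div>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical counter type used only for this illustration. */
struct counter {
    atomic_int hits;
};

/* Non-atomic initialization: this must complete before any concurrent
 * access to the object, otherwise there is a data race. */
void counter_setup(struct counter *c, int start)
{
    assert(c != NULL);
    atomic_init(&c->hits, start);
}

/* Atomic read-modify-write with the default sequentially consistent
 * ordering; returns the value held before the increment. */
int counter_bump(struct counter *c)
{
    assert(c != NULL);
    return atomic_fetch_add(&c->hits, 1);
}
```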
</div>
</div>
</div>
<div class="sect4">
<h5 id="order-and-consistency"><a class="anchor" href="#order-and-consistency"></a>6.15.12.3. Order and Consistency</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The enumerated type <code>memory_order</code> specifies the detailed regular
(non-atomic) memory synchronization operations as defined in
<a href="#C11-spec">section 5.1.2.4 of the C11 Specification</a>, and may provide for
operation ordering.
The following table lists the enumeration constants:</p>
</div>
<table id="table-memory-orders" class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Memory Order</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Additional Notes</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>memory_order_relaxed</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>memory_order_acquire</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#unified-spec">Requires</a> support for OpenCL C 2.0, but in OpenCL C 3.0
or newer some uses require the <code>__opencl_c_atomic_order_acq_rel</code>
feature.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>memory_order_release</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#unified-spec">Requires</a> support for OpenCL C 2.0, but in OpenCL C 3.0
or newer some uses require the <code>__opencl_c_atomic_order_acq_rel</code>
feature.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>memory_order_acq_rel</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#unified-spec">Requires</a> support for OpenCL C 2.0, but in OpenCL C 3.0
or newer some uses require the <code>__opencl_c_atomic_order_acq_rel</code>
feature.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>memory_order_seq_cst</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#unified-spec">Requires</a> support for OpenCL C 2.0, or OpenCL C 3.0 or
newer and the <code>__opencl_c_atomic_order_seq_cst</code> feature.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>The <code>memory_order</code> can be used when performing atomic operations to <code>global</code>
or <code>local</code> memory.</p>
</div>
</div>
</div>
</div>
<div class="sect4">
<h5 id="memory-scope"><a class="anchor" href="#memory-scope"></a>6.15.12.4. Memory Scope</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The enumerated type <code>memory_scope</code> specifies whether the memory ordering
constraints given by <code>memory_order</code> apply to work-items in a subgroup,
work-items in a work-group, or work-items from one or more kernels executing
on the device or across devices (in the case of shared virtual memory).
The following table lists the enumeration constants:</p>
</div>
<table id="table-memory-scopes" class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Memory Scope</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Additional Notes</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>memory_scope_work_item</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>memory_scope_work_item</code> can only be used with <code>atomic_work_item_fence</code>
with flags set to <code>CLK_IMAGE_MEM_FENCE</code>.
<a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>memory_scope_sub_group</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#unified-spec">Requires</a> support for OpenCL C 3.0 or newer and the
<code>__opencl_c_subgroups</code> feature.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>memory_scope_work_group</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>memory_scope_device</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#unified-spec">Requires</a> support for OpenCL C 2.0, or OpenCL C 3.0 or
newer and the <code>__opencl_c_atomic_scope_device</code> feature.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>memory_scope_all_svm_devices</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="#unified-spec">Requires</a> support for OpenCL C 2.0, or OpenCL C 3.0 or
newer and the <code>__opencl_c_atomic_scope_all_devices</code> feature.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>memory_scope_all_devices</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">An alias for <code>memory_scope_all_svm_devices</code>.
<a href="#unified-spec">Requires</a> support for OpenCL C 3.0 or newer and the
<code>__opencl_c_atomic_scope_all_devices</code> feature.</p></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="sect4">
<h5 id="fences"><a class="anchor" href="#fences"></a>6.15.12.5. Fences</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The following fence operations are supported.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="directive">void</span> atomic_work_item_fence(cl_mem_fence_flags flags,
memory_order order,
memory_scope scope)
<span class="comment">// Older syntax memory fences are equivalent to atomic_work_item_fence with the</span>
<span class="comment">// same flags parameter, memory_scope_work_group scope, and ordering as follows:</span>
<span class="directive">void</span> mem_fence(cl_mem_fence_flags flags) <span class="comment">// memory_order_acq_rel</span>
<span class="directive">void</span> read_mem_fence(cl_mem_fence_flags flags) <span class="comment">// memory_order_acquire</span>
<span class="directive">void</span> write_mem_fence(cl_mem_fence_flags flags) <span class="comment">// memory_order_release</span></code></pre>
</div>
</div>
<div class="paragraph">
<p><code>flags</code> must be set to <code>CLK_GLOBAL_MEM_FENCE</code>, <code>CLK_LOCAL_MEM_FENCE</code>,
<code>CLK_IMAGE_MEM_FENCE</code> or a combination of these values ORed together;
otherwise the behavior is undefined.
The behavior of calling <code>atomic_work_item_fence</code> with <code>CLK_IMAGE_MEM_FENCE</code>
ORed together with either <code>CLK_GLOBAL_MEM_FENCE</code> or <code>CLK_LOCAL_MEM_FENCE</code> is
equivalent to calling <code>atomic_work_item_fence</code> individually for
<code>CLK_IMAGE_MEM_FENCE</code> and the other flags.
Passing both <code>CLK_GLOBAL_MEM_FENCE</code> and <code>CLK_LOCAL_MEM_FENCE</code> to
<code>atomic_work_item_fence</code> will synchronize memory operations to both <code>local</code>
and <code>global</code> memory through some shared atomic action, as described in
<a href="#opencl-spec">section 3.3.6.2 of the OpenCL Specification</a>.</p>
</div>
<div class="paragraph">
<p>Depending on the value of <em>order</em>, this operation:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>has no effects, if <em>order</em> == <code>memory_order_relaxed</code>.</p>
</li>
<li>
<p>is an acquire fence, if <em>order</em> == <code>memory_order_acquire</code>.</p>
</li>
<li>
<p>is a release fence, if <em>order</em> == <code>memory_order_release</code>.</p>
</li>
<li>
<p>is both an acquire fence and a release fence, if <em>order</em> ==
<code>memory_order_acq_rel</code>.</p>
</li>
<li>
<p>is a sequentially consistent acquire and release fence, if <em>order</em> ==
<code>memory_order_seq_cst</code>.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>For images declared with the <code>read_write</code> qualifier, the
<code>atomic_work_item_fence</code> must be called to make sure that writes to the
image by a work-item become visible to that work-item on subsequent reads to
that image by that work-item.</p>
</div>
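<div class="paragraph">
<p>As a minimal sketch of this requirement, the hypothetical kernel below writes
one pixel of a <code>read_write</code> image and fences before reading it back.
The names <code>scale_in_place</code>, <code>img</code>, <code>factor</code> and <code>out</code> are
illustrative assumptions, not part of this specification.
The sketch also assumes that read-write images are supported and that the
global work size matches the image dimensions.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Hypothetical kernel: scales a pixel in place, then reads the result back.</span>
kernel void scale_in_place(read_write image2d_t img,
                           float factor,
                           global float4 *out)
{
    int2 coord = (int2)((int)get_global_id(0), (int)get_global_id(1));

    float4 px = read_imagef(img, coord);
    write_imagef(img, coord, px * factor);

    <span class="comment">// Without this fence, the read below is not guaranteed to observe the</span>
    <span class="comment">// value written above by this same work-item.</span>
    atomic_work_item_fence(CLK_IMAGE_MEM_FENCE,
                           memory_order_acq_rel,
                           memory_scope_work_item);

    float4 updated = read_imagef(img, coord);
    out[coord.y * get_image_width(img) + coord.x] = updated;
}</code></pre>
</div>
</div>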
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The use of memory order and scope enumerations must respect the
<a href="#atomic-restrictions">restrictions section below</a>.
</td>
</tr>
</table>
</div>
</div>
</div>
</div>
<div class="sect4">
<h5 id="atomic-integer-and-floating-point-types"><a class="anchor" href="#atomic-integer-and-floating-point-types"></a>6.15.12.6. Atomic integer and floating-point types</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The supported atomic type names are:</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><code>atomic_int</code></p>
</li>
<li>
<p><code>atomic_uint</code></p>
</li>
<li>
<p><code>atomic_long</code> <sup class="footnote" id="_footnote_atomic-int64-supported">[<a id="_footnoteref_52" class="footnote" href="#_footnotedef_52" title="View footnote.">52</a>]</sup></p>
</li>
<li>
<p><code>atomic_ulong</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_52" title="View footnote.">52</a>]</sup></p>
</li>
<li>
<p><code>atomic_float</code></p>
</li>
<li>
<p><code>atomic_double</code> <sup class="footnote">[<a id="_footnoteref_53" class="footnote" href="#_footnotedef_53" title="View footnote.">53</a>]</sup></p>
</li>
<li>
<p><code>atomic_intptr_t</code> <sup class="footnote" id="_footnote_atomic-size_t-supported">[<a id="_footnoteref_54" class="footnote" href="#_footnotedef_54" title="View footnote.">54</a>]</sup></p>
</li>
<li>
<p><code>atomic_uintptr_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_54" title="View footnote.">54</a>]</sup></p>
</li>
<li>
<p><code>atomic_size_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_54" title="View footnote.">54</a>]</sup></p>
</li>
<li>
<p><code>atomic_ptrdiff_t</code> <sup class="footnoteref">[<a class="footnote" href="#_footnotedef_54" title="View footnote.">54</a>]</sup></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Arguments to a kernel can be declared to be a pointer to the above atomic
types or the <code>atomic_flag</code> type.</p>
</div>
<div class="paragraph">
<p>The representations of atomic integer, floating-point and pointer types have
the same size as their corresponding regular types.
The <code>atomic_flag</code> type must be implemented as a 32-bit integer.</p>
</div>
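<div class="paragraph">
<p>For illustration, a kernel argument that points to an atomic type might be
declared as in the sketch below.
The kernel name <code>count_positive</code> and its parameters are hypothetical.
The sketch assumes that the host has initialized the counter to zero and that
<code>memory_scope_device</code> is available (see the memory scope table above).
It uses <code>atomic_fetch_add_explicit</code>, one of the fetch-and-modify functions
described later in this section.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Hypothetical kernel: counts the positive elements of values.</span>
kernel void count_positive(global const float *values,
                           volatile global atomic_uint *counter)
{
    <span class="comment">// counter is a kernel argument declared as a pointer to an atomic type.</span>
    if (values[get_global_id(0)] &gt; 0.0f)
        atomic_fetch_add_explicit(counter, 1u,
                                  memory_order_relaxed,
                                  memory_scope_device);
}</code></pre>
</div>
</div>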
</div>
</div>
</div>
<div class="sect4">
<h5 id="operations-on-atomic-types"><a class="anchor" href="#operations-on-atomic-types"></a>6.15.12.7. Operations on atomic types</h5>
<div class="paragraph">
<p>There are only a few kinds of operations on atomic types, though there are
many instances of those kinds.
This section specifies each general kind.</p>
</div>
<div class="sect5">
<h6 id="atomic_store"><a class="anchor" href="#atomic_store"></a>6.15.12.7.1. <strong>The atomic_store Functions</strong></h6>
<div class="openblock">
<div class="content">
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Requires OpenCL C 3.0 or newer and both the __opencl_c_atomic_order_seq_cst</span>
<span class="comment">// and __opencl_c_atomic_scope_device features.</span>
<span class="directive">void</span> atomic_store(<span class="directive">volatile</span> __global A *object, C desired)
<span class="directive">void</span> atomic_store(<span class="directive">volatile</span> __local A *object, C desired)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and all of the</span>
<span class="comment">// __opencl_c_generic_address_space, __opencl_c_atomic_order_seq_cst and</span>
<span class="comment">// __opencl_c_atomic_scope_device features.</span>
<span class="directive">void</span> atomic_store(<span class="directive">volatile</span> A *object, C desired)
<span class="comment">// Requires OpenCL C 3.0 or newer and the __opencl_c_atomic_scope_device</span>
<span class="comment">// feature.</span>
<span class="directive">void</span> atomic_store_explicit(<span class="directive">volatile</span> __global A *object,
C desired,
memory_order order)
<span class="directive">void</span> atomic_store_explicit(<span class="directive">volatile</span> __local A *object,
C desired,
memory_order order)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and both the</span>
<span class="comment">// __opencl_c_generic_address_space and __opencl_c_atomic_scope_device</span>
<span class="comment">// features.</span>
<span class="directive">void</span> atomic_store_explicit(<span class="directive">volatile</span> A *object,
C desired,
memory_order order)
<span class="comment">// Requires OpenCL C 3.0 or newer.</span>
<span class="directive">void</span> atomic_store_explicit(<span class="directive">volatile</span> __global A *object,
C desired,
memory_order order,
memory_scope scope)
<span class="directive">void</span> atomic_store_explicit(<span class="directive">volatile</span> __local A *object,
C desired,
memory_order order,
memory_scope scope)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the</span>
<span class="comment">// __opencl_c_generic_address_space feature.</span>
<span class="directive">void</span> atomic_store_explicit(<span class="directive">volatile</span> A *object,
C desired,
memory_order order,
memory_scope scope)</code></pre>
</div>
</div>
<div class="paragraph">
<p>The <em>order</em> argument shall not be <code>memory_order_acquire</code>, nor
<code>memory_order_acq_rel</code>.
Atomically replace the value pointed to by <em>object</em> with the value of
<em>desired</em>.
Memory is affected according to the value of <em>order</em>.</p>
</div>
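<div class="paragraph">
<p>A minimal sketch of a release store follows.
The kernel name <code>publish</code> and its parameters are hypothetical, and the
choice of <code>memory_order_release</code> with <code>memory_scope_device</code> is an
assumption that only holds where the implementation supports that order and
scope.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Hypothetical kernel: one work-item writes a payload, then sets a flag.</span>
kernel void publish(global int *payload,
                    volatile global atomic_int *ready)
{
    if (get_global_id(0) == 0) {
        payload[0] = 42; <span class="comment">// ordinary, non-atomic write</span>

        <span class="comment">// The release store orders the payload write before the flag update.</span>
        atomic_store_explicit(ready, 1,
                              memory_order_release,
                              memory_scope_device);
    }
}</code></pre>
</div>
</div>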
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The non-explicit <code>atomic_store</code> function <a href="#unified-spec">requires</a>
support for OpenCL C 2.0, or OpenCL C 3.0 or newer and both the
<code>__opencl_c_atomic_order_seq_cst</code> and <code>__opencl_c_atomic_scope_device</code>
features.
For the explicit variants, memory order and scope enumerations must respect the
<a href="#atomic-restrictions">restrictions section below</a>.
</td>
</tr>
</table>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The function variants that use the generic address space, i.e. no
explicit address space is listed, <a href="#unified-spec">require</a> support for OpenCL
C 2.0, or OpenCL C 3.0 or newer and the <code>__opencl_c_generic_address_space</code>
feature.
</td>
</tr>
</table>
</div>
</div>
</div>
</div>
<div class="sect5">
<h6 id="atomic_load"><a class="anchor" href="#atomic_load"></a>6.15.12.7.2. <strong>The atomic_load Functions</strong></h6>
<div class="openblock">
<div class="content">
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Requires OpenCL C 3.0 or newer and both the __opencl_c_atomic_order_seq_cst</span>
<span class="comment">// and __opencl_c_atomic_scope_device features.</span>
C atomic_load(<span class="directive">volatile</span> __global A *object)
C atomic_load(<span class="directive">volatile</span> __local A *object)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and all of the</span>
<span class="comment">// __opencl_c_generic_address_space, __opencl_c_atomic_order_seq_cst and</span>
<span class="comment">// __opencl_c_atomic_scope_device features.</span>
C atomic_load(<span class="directive">volatile</span> A *object)
<span class="comment">// Requires OpenCL C 3.0 or newer and the __opencl_c_atomic_scope_device</span>
<span class="comment">// feature.</span>
C atomic_load_explicit(<span class="directive">volatile</span> __global A *object,
memory_order order)
C atomic_load_explicit(<span class="directive">volatile</span> __local A *object,
memory_order order)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and both the</span>
<span class="comment">// __opencl_c_generic_address_space and __opencl_c_atomic_scope_device</span>
<span class="comment">// features.</span>
C atomic_load_explicit(<span class="directive">volatile</span> A *object,
memory_order order)
<span class="comment">// Requires OpenCL C 3.0 or newer.</span>
C atomic_load_explicit(<span class="directive">volatile</span> __global A *object,
memory_order order,
memory_scope scope)
C atomic_load_explicit(<span class="directive">volatile</span> __local A *object,
memory_order order,
memory_scope scope)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the</span>
<span class="comment">// __opencl_c_generic_address_space feature.</span>
C atomic_load_explicit(<span class="directive">volatile</span> A *object,
memory_order order,
memory_scope scope)</code></pre>
</div>
</div>
<div class="paragraph">
<p>The <em>order</em> argument shall not be <code>memory_order_release</code> nor
<code>memory_order_acq_rel</code>.
Memory is affected according to the value of <em>order</em>.
Atomically returns the value pointed to by <em>object</em>.</p>
</div>
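<div class="paragraph">
<p>A minimal sketch of an acquire load follows.
The kernel name <code>consume</code>, its parameters, and the convention that a flag
value of 1 means "payload is valid" are hypothetical.
The sketch assumes that some other agent set the flag with a release operation
and that <code>memory_order_acquire</code> with <code>memory_scope_device</code> is supported.
It checks the flag once rather than spinning on it.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Hypothetical kernel: reads a payload only if a flag says it is valid.</span>
kernel void consume(global const int *payload,
                    volatile global atomic_int *ready,
                    global int *out)
{
    if (get_global_id(0) == 0) {
        int flag = atomic_load_explicit(ready,
                                        memory_order_acquire,
                                        memory_scope_device);

        <span class="comment">// If the flag was set by a release store, the acquire load makes the</span>
        <span class="comment">// writes that preceded that store visible here.</span>
        out[0] = (flag == 1) ? payload[0] : -1;
    }
}</code></pre>
</div>
</div>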
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The non-explicit <code>atomic_load</code> function <a href="#unified-spec">requires</a>
support for OpenCL C 2.0, or OpenCL C 3.0 or newer and both the
<code>__opencl_c_atomic_order_seq_cst</code> and <code>__opencl_c_atomic_scope_device</code>
features.
For the explicit variants, memory order and scope enumerations must respect the
<a href="#atomic-restrictions">restrictions section below</a>.
</td>
</tr>
</table>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The function variants that use the generic address space, i.e. no
explicit address space is listed, <a href="#unified-spec">require</a> support for OpenCL
C 2.0, or OpenCL C 3.0 or newer and the <code>__opencl_c_generic_address_space</code>
feature.
</td>
</tr>
</table>
</div>
</div>
</div>
</div>
<div class="sect5">
<h6 id="atomic_exchange"><a class="anchor" href="#atomic_exchange"></a>6.15.12.7.3. <strong>The atomic_exchange Functions</strong></h6>
<div class="openblock">
<div class="content">
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Requires OpenCL C 3.0 or newer and both the __opencl_c_atomic_order_seq_cst</span>
<span class="comment">// and __opencl_c_atomic_scope_device features.</span>
C atomic_exchange(<span class="directive">volatile</span> __global A *object, C desired)
C atomic_exchange(<span class="directive">volatile</span> __local A *object, C desired)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and all of the</span>
<span class="comment">// __opencl_c_generic_address_space, __opencl_c_atomic_order_seq_cst and</span>
<span class="comment">// __opencl_c_atomic_scope_device features.</span>
C atomic_exchange(<span class="directive">volatile</span> A *object, C desired)
<span class="comment">// Requires OpenCL C 3.0 or newer and the __opencl_c_atomic_scope_device</span>
<span class="comment">// feature.</span>
C atomic_exchange_explicit(<span class="directive">volatile</span> __global A *object,
C desired,
memory_order order)
C atomic_exchange_explicit(<span class="directive">volatile</span> __local A *object,
C desired,
memory_order order)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and both the</span>
<span class="comment">// __opencl_c_generic_address_space and __opencl_c_atomic_scope_device</span>
<span class="comment">// features.</span>
C atomic_exchange_explicit(<span class="directive">volatile</span> A *object,
C desired,
memory_order order)
<span class="comment">// Requires OpenCL C 3.0 or newer.</span>
C atomic_exchange_explicit(<span class="directive">volatile</span> __global A *object,
C desired,
memory_order order,
memory_scope scope)
C atomic_exchange_explicit(<span class="directive">volatile</span> __local A *object,
C desired,
memory_order order,
memory_scope scope)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the</span>
<span class="comment">// __opencl_c_generic_address_space feature.</span>
C atomic_exchange_explicit(<span class="directive">volatile</span> A *object,
C desired,
memory_order order,
memory_scope scope)</code></pre>
</div>
</div>
<div class="paragraph">
<p>Atomically replace the value pointed to by <em>object</em> with <em>desired</em>.
Memory is affected according to the value of <em>order</em>.
These operations are read-modify-write operations (as defined by
<a href="#C11-spec">section 5.1.2.4 of the C11 Specification</a>).
Atomically returns the value pointed to by <em>object</em> immediately before the
effects.</p>
</div>
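<div class="paragraph">
<p>Because the exchange returns the previous value, it can be used to let exactly
one work-item claim a resource, as in the sketch below.
The kernel name <code>claim_once</code> and its parameters are hypothetical.
The sketch assumes that the host has initialized <code>claimed</code> to zero and that
<code>memory_scope_device</code> is supported.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Hypothetical kernel: records the first work-item to claim a token.</span>
kernel void claim_once(volatile global atomic_int *claimed,
                       global int *winner)
{
    <span class="comment">// Only one work-item can observe the initial value 0; every later</span>
    <span class="comment">// exchange returns the 1 stored by an earlier one.</span>
    int previous = atomic_exchange_explicit(claimed, 1,
                                            memory_order_relaxed,
                                            memory_scope_device);
    if (previous == 0)
        *winner = (int)get_global_id(0);
}</code></pre>
</div>
</div>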
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The non-explicit <code>atomic_exchange</code> function <a href="#unified-spec">requires</a>
support for OpenCL C 2.0, or OpenCL C 3.0 or newer and both the
<code>__opencl_c_atomic_order_seq_cst</code> and <code>__opencl_c_atomic_scope_device</code>
features.
For the explicit variants, memory order and scope enumerations must respect the
<a href="#atomic-restrictions">restrictions section below</a>.
</td>
</tr>
</table>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The function variants that use the generic address space, i.e. no
explicit address space is listed, <a href="#unified-spec">require</a> support for OpenCL
C 2.0, or OpenCL C 3.0 or newer and the <code>__opencl_c_generic_address_space</code>
feature.
</td>
</tr>
</table>
</div>
</div>
</div>
</div>
<div class="sect5">
<h6 id="atomic_compare_exchange"><a class="anchor" href="#atomic_compare_exchange"></a>6.15.12.7.4. <strong>The atomic_compare_exchange Functions</strong></h6>
<div class="openblock">
<div class="content">
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Requires OpenCL C 3.0 or newer and both the __opencl_c_atomic_order_seq_cst</span>
<span class="comment">// and __opencl_c_atomic_scope_device features.</span>
<span class="predefined-type">bool</span> atomic_compare_exchange_strong(
<span class="directive">volatile</span> __global A *object,
__global C *expected, C desired)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong(
<span class="directive">volatile</span> __global A *object,
__local C *expected, C desired)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong(
<span class="directive">volatile</span> __global A *object,
__private C *expected, C desired)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong(
<span class="directive">volatile</span> __local A *object,
__global C *expected, C desired)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong(
<span class="directive">volatile</span> __local A *object,
__local C *expected, C desired)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong(
<span class="directive">volatile</span> __local A *object,
__private C *expected, C desired)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and all of the</span>
<span class="comment">// __opencl_c_generic_address_space, __opencl_c_atomic_order_seq_cst and</span>
<span class="comment">// __opencl_c_atomic_scope_device features.</span>
<span class="predefined-type">bool</span> atomic_compare_exchange_strong(
<span class="directive">volatile</span> A *object,
C *expected, C desired)
<span class="comment">// Requires OpenCL C 3.0 or newer.</span>
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> __global A *object,
__global C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> __global A *object,
__local C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> __global A *object,
__private C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> __local A *object,
__global C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> __local A *object,
__local C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> __local A *object,
__private C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the</span>
<span class="comment">// __opencl_c_generic_address_space feature.</span>
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> A *object,
C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="comment">// Requires OpenCL C 3.0 or newer.</span>
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> __global A *object,
__global C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> __global A *object,
__local C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> __global A *object,
__private C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> __local A *object,
__global C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> __local A *object,
__local C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> __local A *object,
__private C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the</span>
<span class="comment">// __opencl_c_generic_address_space feature.</span>
<span class="predefined-type">bool</span> atomic_compare_exchange_strong_explicit(
<span class="directive">volatile</span> A *object,
C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="comment">// Requires OpenCL C 3.0 or newer and both the __opencl_c_atomic_order_seq_cst</span>
<span class="comment">// and __opencl_c_atomic_scope_device features.</span>
<span class="predefined-type">bool</span> atomic_compare_exchange_weak(
<span class="directive">volatile</span> __global A *object,
__global C *expected, C desired)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak(
<span class="directive">volatile</span> __global A *object,
__local C *expected, C desired)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak(
<span class="directive">volatile</span> __global A *object,
__private C *expected, C desired)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak(
<span class="directive">volatile</span> __local A *object,
__global C *expected, C desired)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak(
<span class="directive">volatile</span> __local A *object,
__local C *expected, C desired)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak(
<span class="directive">volatile</span> __local A *object,
__private C *expected, C desired)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and all of the</span>
<span class="comment">// __opencl_c_generic_address_space, __opencl_c_atomic_order_seq_cst and</span>
<span class="comment">// __opencl_c_atomic_scope_device features.</span>
<span class="predefined-type">bool</span> atomic_compare_exchange_weak(
<span class="directive">volatile</span> A *object,
C *expected, C desired)
<span class="comment">// Requires OpenCL C 3.0 or newer.</span>
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> __global A *object,
__global C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> __global A *object,
__local C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> __global A *object,
__private C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> __local A *object,
__global C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> __local A *object,
__local C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> __local A *object,
__private C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the</span>
<span class="comment">// __opencl_c_generic_address_space feature.</span>
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> A *object,
C *expected,
C desired,
memory_order success,
memory_order failure)
<span class="comment">// Requires OpenCL C 3.0 or newer.</span>
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> __global A *object,
__global C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> __global A *object,
__local C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> __global A *object,
__private C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> __local A *object,
__global C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> __local A *object,
__local C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> __local A *object,
__private C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the</span>
<span class="comment">// __opencl_c_generic_address_space feature.</span>
<span class="predefined-type">bool</span> atomic_compare_exchange_weak_explicit(
<span class="directive">volatile</span> A *object,
C *expected,
C desired,
memory_order success,
memory_order failure,
memory_scope scope)</code></pre>
</div>
</div>
<div class="paragraph">
<p>The <code>failure</code> argument shall not be <code>memory_order_release</code> nor
<code>memory_order_acq_rel</code>.
The <code>failure</code> argument shall be no stronger than the <code>success</code> argument.
Atomically compares the value pointed to by <code>object</code> for equality with that
in <code>expected</code>, and if <em>true</em>, replaces the value pointed to by <code>object</code> with
<code>desired</code>, and if <em>false</em>, updates the value in <code>expected</code> with the value
pointed to by <code>object</code>.
Further, if the comparison is <em>true</em>, memory is affected according to the
value of <code>success</code>, and if the comparison is <em>false</em>, memory is affected
according to the value of <code>failure</code>.
If the comparison is <em>true</em>, these operations are atomic read-modify-write operations (as defined by
<a href="#C11-spec">section 5.1.2.4 of the C11 Specification</a>).
Otherwise, these operations are atomic load operations.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>The effect of the compare-and-exchange operations is</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="keyword">if</span> (memcmp(object, expected, <span class="keyword">sizeof</span>(*object)) == <span class="integer">0</span>)
memcpy(object, &amp;desired, <span class="keyword">sizeof</span>(*object));
<span class="keyword">else</span>
memcpy(expected, object, <span class="keyword">sizeof</span>(*object));</code></pre>
</div>
</div>
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The weak compare-and-exchange operations may fail spuriously
<sup class="footnote">[<a id="_footnoteref_55" class="footnote" href="#_footnotedef_55" title="View footnote.">55</a>]</sup>.
That is, even when the contents of memory referred to by <code>expected</code> and
<code>object</code> are equal, it may return zero and store back to <code>expected</code> the same
memory contents that were originally there.</p>
</div>
<div class="paragraph">
<p>These generic functions return the result of the comparison.</p>
</div>
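<div class="paragraph">
<p>Because a weak compare-and-exchange may fail spuriously, it is normally called
in a loop.
The sketch below builds an atomic multiply, an operation with no built-in
fetch-and-modify function, from such a loop.
The helper name <code>atomic_mul_uint</code> is hypothetical, and the use of
<code>memory_order_relaxed</code> with <code>memory_scope_device</code> is an assumption that
depends on the features the implementation supports.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Hypothetical helper: atomically multiplies *object by factor.</span>
void atomic_mul_uint(volatile global atomic_uint *object, uint factor)
{
    <span class="comment">// Initial guess for the current value; the loop corrects it if stale.</span>
    uint expected = atomic_load_explicit(object,
                                         memory_order_relaxed,
                                         memory_scope_device);

    <span class="comment">// On failure, expected is overwritten with the value currently in</span>
    <span class="comment">// *object, so each retry recomputes the desired value from it.</span>
    <span class="comment">// A spurious failure leaves expected unchanged and simply retries.</span>
    while (!atomic_compare_exchange_weak_explicit(object,
                                                  &amp;expected,
                                                  expected * factor,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed,
                                                  memory_scope_device))
    {
    }
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>An unsigned type is used in this sketch so that overflow of the product wraps
rather than being undefined.</p>
</div>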
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The non-explicit <code>atomic_compare_exchange_strong</code> and
<code>atomic_compare_exchange_weak</code> functions <a href="#unified-spec">require</a> support
for OpenCL C 2.0, or OpenCL C 3.0 or newer and both the
<code>__opencl_c_atomic_order_seq_cst</code> and <code>__opencl_c_atomic_scope_device</code>
features.
For the explicit variants, memory order and scope enumerations must respect the
<a href="#atomic-restrictions">restrictions section below</a>.
</td>
</tr>
</table>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The function variants that use the generic address space, i.e. no
explicit address space is listed, <a href="#unified-spec">require</a> support for OpenCL
C 2.0, or OpenCL C 3.0 or newer and the <code>__opencl_c_generic_address_space</code>
feature.
</td>
</tr>
</table>
</div>
</div>
</div>
</div>
<div class="sect5">
<h6 id="atomic_fetch_key"><a class="anchor" href="#atomic_fetch_key"></a>6.15.12.7.5. <strong>The atomic_fetch and modify Functions</strong></h6>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The following operations perform arithmetic and bitwise computations.
All of these operations are applicable to an object of any atomic integer
type.
The key, operator, and computation correspondence is given in the table below:</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 33.3333%;">
<col style="width: 33.3333%;">
<col style="width: 33.3334%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>key</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>op</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>computation</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>add</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>+</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">addition</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>sub</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>-</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">subtraction</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>or</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>|</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">bitwise inclusive or</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>xor</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>^</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">bitwise exclusive or</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>and</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>&amp;</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">bitwise and</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>min</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>min</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">compute min</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>max</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>max</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">compute max</p></td>
</tr>
</tbody>
</table>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>For <strong>atomic_fetch</strong> and modify functions with <strong>key</strong> = <code>add</code> or <code>sub</code> on
atomic types <code>atomic_intptr_t</code> and <code>atomic_uintptr_t</code>, <code>M</code> is <code>ptrdiff_t</code>.
For <strong>atomic_fetch</strong> and modify functions with <strong>key</strong> = <code>or</code>, <code>xor</code>, <code>and</code>,
<code>min</code> and <code>max</code> on atomic type <code>atomic_intptr_t</code>, <code>M</code> is <code>intptr_t</code>,
and on atomic type <code>atomic_uintptr_t</code>, <code>M</code> is <code>uintptr_t</code>.</p>
</div>
</td>
</tr>
</table>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Requires OpenCL C 3.0 or newer and both the __opencl_c_atomic_order_seq_cst</span>
<span class="comment">// and __opencl_c_atomic_scope_device features.</span>
C atomic_fetch_key(<span class="directive">volatile</span> __global A *object, M operand)
C atomic_fetch_key(<span class="directive">volatile</span> __local A *object, M operand)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and all of the</span>
<span class="comment">// __opencl_c_generic_address_space, __opencl_c_atomic_order_seq_cst and</span>
<span class="comment">// __opencl_c_atomic_scope_device features.</span>
C atomic_fetch_key(<span class="directive">volatile</span> A *object, M operand)
<span class="comment">// Requires OpenCL C 3.0 or newer and the __opencl_c_atomic_scope_device feature.</span>
C atomic_fetch_key_explicit(<span class="directive">volatile</span> __global A *object,
M operand,
memory_order order)
C atomic_fetch_key_explicit(<span class="directive">volatile</span> __local A *object,
M operand,
memory_order order)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and both the</span>
<span class="comment">// __opencl_c_generic_address_space and __opencl_c_atomic_scope_device</span>
<span class="comment">// features.</span>
C atomic_fetch_key_explicit(<span class="directive">volatile</span> A *object,
M operand,
memory_order order)
<span class="comment">// Requires OpenCL C 3.0 or newer.</span>
C atomic_fetch_key_explicit(<span class="directive">volatile</span> __global A *object,
M operand,
memory_order order,
memory_scope scope)
C atomic_fetch_key_explicit(<span class="directive">volatile</span> __local A *object,
M operand,
memory_order order,
memory_scope scope)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the</span>
<span class="comment">// __opencl_c_generic_address_space feature.</span>
C atomic_fetch_key_explicit(<span class="directive">volatile</span> A *object,
M operand,
memory_order order,
memory_scope scope)</code></pre>
</div>
</div>
<div class="paragraph">
<p>Atomically replaces the value pointed to by <code>object</code> with the result of the
computation applied to the value pointed to by <code>object</code> and the given
operand.
Memory is affected according to the value of <code>order</code>.
These operations are atomic read-modify-write operations (as defined by
<a href="#C11-spec">section 5.1.2.4 of the C11 Specification</a>).
For signed integer types, arithmetic is defined to use two&#8217;s complement
representation with silent wrap-around on overflow; there are no undefined
results.
For address types, the result may be an undefined address, but the
operations otherwise have no undefined behavior.
Returns atomically the value pointed to by <code>object</code> immediately before the
effects.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The non-explicit <code>atomic_fetch_key</code> functions <a href="#unified-spec">require</a>
support for OpenCL C 2.0, or OpenCL C 3.0 or newer and both the
<code>__opencl_c_atomic_order_seq_cst</code> and <code>__opencl_c_atomic_scope_device</code>
features.
For the explicit variants, memory order and scope enumerations must respect the
<a href="#atomic-restrictions">restrictions section below</a>.
</td>
</tr>
</table>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The function variants that use the generic address space, i.e. no
explicit address space is listed, <a href="#unified-spec">require</a> support for OpenCL
C 2.0, or OpenCL C 3.0 or newer and the <code>__opencl_c_generic_address_space</code>
feature.
</td>
</tr>
</table>
</div>
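<div class="paragraph">
<p>The return value and the wrap-around rule can be illustrated with a short sketch.
It is host-side C11 using <code>&lt;stdatomic.h&gt;</code>, which defines the same behavior for the <code>add</code>, <code>sub</code>, <code>or</code>, <code>xor</code> and <code>and</code> keys; the <code>min</code> and <code>max</code> keys have no C11 counterpart and are not shown.
The helper names <code>add_one</code> and <code>set_bits_were_set</code> are hypothetical.</p>
</div>

```c
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical helper: adds 1 and returns the value held immediately before
 * the addition. For signed atomic types the addition wraps silently. */
static int add_one(atomic_int *counter)
{
    return atomic_fetch_add(counter, 1);
}

/* Hypothetical helper: sets the bits in mask and reports whether any of them
 * were already set, using the previous value returned by atomic_fetch_or. */
static int set_bits_were_set(atomic_uint *flags, unsigned mask)
{
    unsigned before = atomic_fetch_or(flags, mask);
    return (before & mask) != 0;
}
```

<div class="paragraph">
<p>Under these assumptions, calling <code>add_one</code> on a counter that holds <code>INT_MAX</code> returns <code>INT_MAX</code> and leaves <code>INT_MIN</code> in the counter.</p>
</div>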
</div>
</div>
</div>
<div class="sect5">
<h6 id="atomic_flag"><a class="anchor" href="#atomic_flag"></a>6.15.12.7.6. <strong>Atomic Flag Type and Operations</strong></h6>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The <code>atomic_flag</code> type provides the classic test-and-set functionality.
It has two states, <em>set</em> (value is non-zero) and <em>clear</em> (value is 0).</p>
</div>
<div class="paragraph">
<p>In OpenCL C 2.0, operations on an object of type <code>atomic_flag</code> shall be
lock-free; in OpenCL C 3.0 or newer, they may be lock-free.</p>
</div>
<div class="paragraph">
<p>The macro <code>ATOMIC_FLAG_INIT</code> may be used to initialize an <code>atomic_flag</code> to the
<em>clear</em> state.
An <code>atomic_flag</code> that is not explicitly initialized with <code>ATOMIC_FLAG_INIT</code> is
initially in an indeterminate state.</p>
</div>
<div class="paragraph">
<p>This macro can only be used for atomic objects that are declared in program
scope in the <code>global</code> address space with the <code>atomic_flag</code> type.</p>
</div>
<div class="paragraph">
<p>Example:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">global atomic_flag guard = ATOMIC_FLAG_INIT;</code></pre>
</div>
</div>
</div>
</div>
</div>
<div class="sect5">
<h6 id="atomic_flag_test_and_set"><a class="anchor" href="#atomic_flag_test_and_set"></a>6.15.12.7.7. <strong>The atomic_flag_test_and_set Functions</strong></h6>
<div class="openblock">
<div class="content">
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Requires OpenCL C 3.0 or newer and both the __opencl_c_atomic_order_seq_cst</span>
<span class="comment">// and __opencl_c_atomic_scope_device features.</span>
<span class="predefined-type">bool</span> atomic_flag_test_and_set(
<span class="directive">volatile</span> __global atomic_flag *object)
<span class="predefined-type">bool</span> atomic_flag_test_and_set(
<span class="directive">volatile</span> __local atomic_flag *object)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and all of the</span>
<span class="comment">// __opencl_c_generic_address_space, __opencl_c_atomic_order_seq_cst and</span>
<span class="comment">// __opencl_c_atomic_scope_device features.</span>
<span class="predefined-type">bool</span> atomic_flag_test_and_set(
<span class="directive">volatile</span> atomic_flag *object)
<span class="comment">// Requires OpenCL C 3.0 or newer and the __opencl_c_atomic_scope_device</span>
<span class="comment">// feature.</span>
<span class="predefined-type">bool</span> atomic_flag_test_and_set_explicit(
<span class="directive">volatile</span> __global atomic_flag *object,
memory_order order)
<span class="predefined-type">bool</span> atomic_flag_test_and_set_explicit(
<span class="directive">volatile</span> __local atomic_flag *object,
memory_order order)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and both the</span>
<span class="comment">// __opencl_c_generic_address_space and __opencl_c_atomic_scope_device</span>
<span class="comment">// features.</span>
<span class="predefined-type">bool</span> atomic_flag_test_and_set_explicit(
<span class="directive">volatile</span> atomic_flag *object,
memory_order order)
<span class="comment">// Requires OpenCL C 3.0 or newer.</span>
<span class="predefined-type">bool</span> atomic_flag_test_and_set_explicit(
<span class="directive">volatile</span> __global atomic_flag *object,
memory_order order,
memory_scope scope)
<span class="predefined-type">bool</span> atomic_flag_test_and_set_explicit(
<span class="directive">volatile</span> __local atomic_flag *object,
memory_order order,
memory_scope scope)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the</span>
<span class="comment">// __opencl_c_generic_address_space feature.</span>
<span class="predefined-type">bool</span> atomic_flag_test_and_set_explicit(
<span class="directive">volatile</span> atomic_flag *object,
memory_order order,
memory_scope scope)</code></pre>
</div>
</div>
<div class="paragraph">
<p>Atomically sets the value pointed to by <code>object</code> to <em>true</em>.
Memory is affected according to the value of <code>order</code>.
These operations are atomic read-modify-write operations (as defined by
<a href="#C11-spec">section 5.1.2.4 of the C11 Specification</a>).
Returns atomically the value of the <code>object</code> immediately before the effects.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The non-explicit <code>atomic_flag_test_and_set</code> function <a href="#unified-spec">requires</a> support for OpenCL C 2.0, or OpenCL C 3.0 or newer and both the
<code>__opencl_c_atomic_order_seq_cst</code> and <code>__opencl_c_atomic_scope_device</code>
features.
For the explicit variants, memory order and scope enumerations must respect the
<a href="#atomic-restrictions">restrictions section below</a>.
</td>
</tr>
</table>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The function variants that use the generic address space, i.e. no
explicit address space is listed, <a href="#unified-spec">require</a> support for OpenCL
C 2.0, or OpenCL C 3.0 or newer and the <code>__opencl_c_generic_address_space</code>
feature.
</td>
</tr>
</table>
</div>
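<div class="paragraph">
<p>Since the function returns the state held before the flag was set, it can be used to detect the first caller.
The sketch below shows this with host-side C11 from <code>&lt;stdatomic.h&gt;</code>, not OpenCL C kernel code; the helper name <code>claim_once</code> is hypothetical.</p>
</div>

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical one-shot latch: returns true only for the first caller. */
static bool claim_once(atomic_flag *claimed)
{
    /* test_and_set returns the previous state: false means the flag was
     * clear and this call is the one that set it. */
    return !atomic_flag_test_and_set(claimed);
}
```

<div class="paragraph">
<p>Every call after the first finds the flag already set and therefore returns <code>false</code>.</p>
</div>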
</div>
</div>
</div>
<div class="sect5">
<h6 id="atomic_flag_clear"><a class="anchor" href="#atomic_flag_clear"></a>6.15.12.7.8. <strong>The atomic_flag_clear Functions</strong></h6>
<div class="openblock">
<div class="content">
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Requires OpenCL C 3.0 or newer and both the __opencl_c_atomic_order_seq_cst</span>
<span class="comment">// and __opencl_c_atomic_scope_device features.</span>
<span class="directive">void</span> atomic_flag_clear(<span class="directive">volatile</span> __global atomic_flag *object)
<span class="directive">void</span> atomic_flag_clear(<span class="directive">volatile</span> __local atomic_flag *object)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and all of the</span>
<span class="comment">// __opencl_c_generic_address_space, __opencl_c_atomic_order_seq_cst and</span>
<span class="comment">// __opencl_c_atomic_scope_device features.</span>
<span class="directive">void</span> atomic_flag_clear(<span class="directive">volatile</span> atomic_flag *object)
<span class="comment">// Requires OpenCL C 3.0 or newer and the __opencl_c_atomic_scope_device</span>
<span class="comment">// feature.</span>
<span class="directive">void</span> atomic_flag_clear_explicit(
<span class="directive">volatile</span> __global atomic_flag *object,
memory_order order)
<span class="directive">void</span> atomic_flag_clear_explicit(
<span class="directive">volatile</span> __local atomic_flag *object,
memory_order order)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and both the</span>
<span class="comment">// __opencl_c_generic_address_space and __opencl_c_atomic_scope_device</span>
<span class="comment">// features.</span>
<span class="directive">void</span> atomic_flag_clear_explicit(
<span class="directive">volatile</span> atomic_flag *object,
memory_order order)
<span class="comment">// Requires OpenCL C 3.0 or newer.</span>
<span class="directive">void</span> atomic_flag_clear_explicit(
<span class="directive">volatile</span> __global atomic_flag *object,
memory_order order,
memory_scope scope)
<span class="directive">void</span> atomic_flag_clear_explicit(
<span class="directive">volatile</span> __local atomic_flag *object,
memory_order order,
memory_scope scope)
<span class="comment">// Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the</span>
<span class="comment">// __opencl_c_generic_address_space feature.</span>
<span class="directive">void</span> atomic_flag_clear_explicit(
<span class="directive">volatile</span> atomic_flag *object,
memory_order order,
memory_scope scope)</code></pre>
</div>
</div>
<div class="paragraph">
<p>The <code>order</code> argument shall not be <code>memory_order_acquire</code> nor
<code>memory_order_acq_rel</code>.
Atomically sets the value pointed to by <code>object</code> to <em>false</em>.
Memory is affected according to the value of <code>order</code>.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The non-explicit <code>atomic_flag_clear</code> function <a href="#unified-spec">requires</a>
support for OpenCL C 2.0, or OpenCL C 3.0 or newer and both the
<code>__opencl_c_atomic_order_seq_cst</code> and <code>__opencl_c_atomic_scope_device</code>
features.
For the explicit variants, memory order and scope enumerations must respect the
<a href="#atomic-restrictions">restrictions section below</a>.
</td>
</tr>
</table>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The function variants that use the generic address space, i.e. no
explicit address space is listed, <a href="#unified-spec">require</a> support for OpenCL
C 2.0, or OpenCL C 3.0 or newer and the <code>__opencl_c_generic_address_space</code>
feature.
</td>
</tr>
</table>
</div>
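<div class="paragraph">
<p>Test-and-set and clear are commonly paired to form a simple guard.
The following minimal sketch uses host-side C11 from <code>&lt;stdatomic.h&gt;</code> and the explicit variants with acquire and release ordering; the type <code>guard_t</code> and the functions <code>guard_try_enter</code> and <code>guard_leave</code> are hypothetical names.
In OpenCL C the explicit functions can additionally take a <code>memory_scope</code> argument, which this host-side analogue cannot express.</p>
</div>

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical guard type built on a single atomic_flag. */
typedef struct {
    atomic_flag busy;
} guard_t;

/* Tries to enter the guarded region without blocking.
 * Returns true if the caller now owns the guard. */
static bool guard_try_enter(guard_t *g)
{
    /* acquire: accesses after a successful entry are not reordered before it */
    return !atomic_flag_test_and_set_explicit(&g->busy, memory_order_acquire);
}

/* Leaves the guarded region. */
static void guard_leave(guard_t *g)
{
    /* release is a permitted order for clear; acquire and acq_rel are not */
    atomic_flag_clear_explicit(&g->busy, memory_order_release);
}
```

<div class="paragraph">
<p>Only a non-blocking entry is sketched here; a second attempt to enter fails until <code>guard_leave</code> has cleared the flag.</p>
</div>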
</div>
</div>
</div>
</div>
<div class="sect4">
<h5 id="atomic-legacy"><a class="anchor" href="#atomic-legacy"></a>6.15.12.8. OpenCL C 1.x Legacy Atomics</h5>
<div class="admonitionblock important">
<table>
<tr>
<td class="icon">
<i class="fa icon-important" title="Important"></i>
</td>
<td class="content">
The atomic functions described in this sub-section <a href="#unified-spec">require</a> support for OpenCL C 1.1 or newer, and are <a href="#unified-spec">deprecated by</a> OpenCL C 2.0. Also see extensions
<code>cl_khr_global_int32_base_atomics</code>, <code>cl_khr_global_int32_extended_atomics</code>,
<code>cl_khr_local_int32_base_atomics</code>, and <code>cl_khr_local_int32_extended_atomics</code>.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>OpenCL C 1.x supported relaxed atomic operations via built-in functions
that could operate on any memory address in the <code>__global</code> or <code>__local</code>
address spaces.
Unlike C11-style atomics, these did not require dedicated atomic types;
instead they operated on 32-bit signed integers, 32-bit unsigned integers, and,
in the case of <strong>atomic_xchg</strong> only, single-precision floating-point values.
These were equivalent to atomic operations with <code>memory_order_relaxed</code>
consistency and <code>memory_scope_work_group</code> scope.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
Some implementations may implement legacy atomics with a stricter memory
consistency order than <code>memory_order_relaxed</code> or a broader scope than
<code>memory_scope_work_group</code>.
This is because all the stricter orders and broader scopes fully satisfy the
semantics of the minimum requirements.
</td>
</tr>
</table>
</div>
<table id="table-legacy-atomic-functions" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 23. Legacy Atomic Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>atomic_add</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_add</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_add</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_add</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p>
<p class="tableblock"> int <strong>atomic_add</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_add</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_add</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_add</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read the 32-bit value (referred to as <em>old</em>) stored at the location pointed to by
<em>p</em>. Compute (<em>old</em> + <em>val</em>) and store the result at the location pointed to by <em>p</em>.
The function returns <em>old</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>atomic_sub</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_sub</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_sub</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_sub</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p>
<p class="tableblock"> int <strong>atomic_sub</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_sub</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_sub</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_sub</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read the 32-bit value (referred to as <em>old</em>) stored at the location pointed to by
<em>p</em>. Compute (<em>old</em> - <em>val</em>) and store the result at the location pointed to by <em>p</em>.
The function returns <em>old</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>atomic_xchg</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_xchg</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_xchg</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_xchg</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p>
<p class="tableblock"> float <strong>atomic_xchg</strong>(volatile __global float *<em>p</em>, float <em>val</em>)<br></p>
<p class="tableblock"> int <strong>atomic_xchg</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_xchg</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_xchg</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_xchg</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p>
<p class="tableblock"> float <strong>atomic_xchg</strong>(volatile __local float *<em>p</em>, float <em>val</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Swaps the <em>old</em> value stored at location <em>p</em> with the new value given by
<em>val</em>. Returns the <em>old</em> value.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>atomic_inc</strong>(volatile __global int *<em>p</em>)<br>
int <strong>atom_inc</strong>(volatile __global int *<em>p</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_inc</strong>(volatile __global unsigned int *<em>p</em>)<br>
unsigned int <strong>atom_inc</strong>(volatile __global unsigned int *<em>p</em>)<br></p>
<p class="tableblock"> int <strong>atomic_inc</strong>(volatile __local int *<em>p</em>)<br>
int <strong>atom_inc</strong>(volatile __local int *<em>p</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_inc</strong>(volatile __local unsigned int *<em>p</em>)<br>
unsigned int <strong>atom_inc</strong>(volatile __local unsigned int *<em>p</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read the 32-bit value (referred to as <em>old</em>) stored at the location pointed to by
<em>p</em>. Compute (<em>old</em> + 1) and store the result at the location pointed to by <em>p</em>. The
function returns <em>old</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>atomic_dec</strong>(volatile __global int *<em>p</em>)<br>
int <strong>atom_dec</strong>(volatile __global int *<em>p</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_dec</strong>(volatile __global unsigned int *<em>p</em>)<br>
unsigned int <strong>atom_dec</strong>(volatile __global unsigned int *<em>p</em>)<br></p>
<p class="tableblock"> int <strong>atomic_dec</strong>(volatile __local int *<em>p</em>)<br>
int <strong>atom_dec</strong>(volatile __local int *<em>p</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_dec</strong>(volatile __local unsigned int *<em>p</em>)<br>
unsigned int <strong>atom_dec</strong>(volatile __local unsigned int *<em>p</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read the 32-bit value (referred to as <em>old</em>) stored at the location pointed to by
<em>p</em>. Compute (<em>old</em> - 1) and store the result at the location pointed to by <em>p</em>. The
function returns <em>old</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>atomic_cmpxchg</strong>(volatile __global int *<em>p</em>, int <em>cmp</em>, int <em>val</em>)<br>
int <strong>atom_cmpxchg</strong>(volatile __global int *<em>p</em>, int <em>cmp</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_cmpxchg</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>cmp</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_cmpxchg</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>cmp</em>, unsigned int <em>val</em>)<br></p>
<p class="tableblock"> int <strong>atomic_cmpxchg</strong>(volatile __local int *<em>p</em>, int <em>cmp</em>, int <em>val</em>)<br>
int <strong>atom_cmpxchg</strong>(volatile __local int *<em>p</em>, int <em>cmp</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_cmpxchg</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>cmp</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_cmpxchg</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>cmp</em>, unsigned int <em>val</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read the 32-bit value (referred to as <em>old</em>) stored at the location pointed to by
<em>p</em>. Compute (<em>old</em> == <em>cmp</em>) ? <em>val</em> : <em>old</em> and store the result at the location
pointed to by <em>p</em>. The function returns <em>old</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>atomic_min</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_min</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_min</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_min</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p>
<p class="tableblock"> int <strong>atomic_min</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_min</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_min</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_min</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read the 32-bit value (referred to as <em>old</em>) stored at the location pointed to by
<em>p</em>. Compute <strong>min</strong>(<em>old</em>, <em>val</em>) and store the minimum value at the location
pointed to by <em>p</em>. The function returns <em>old</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>atomic_max</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_max</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_max</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_max</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p>
<p class="tableblock"> int <strong>atomic_max</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_max</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_max</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_max</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read the 32-bit value (referred to as <em>old</em>) stored at the location pointed to by
<em>p</em>. Compute <strong>max</strong>(<em>old</em>, <em>val</em>) and store the maximum value at the location
pointed to by <em>p</em>. The function returns <em>old</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>atomic_and</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_and</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_and</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_and</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p>
<p class="tableblock"> int <strong>atomic_and</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_and</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_and</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_and</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read the 32-bit value (referred to as <em>old</em>) stored at the location pointed to by
<em>p</em>. Compute (<em>old</em> &amp; <em>val</em>) and store the result at the location pointed to by <em>p</em>.
The function returns <em>old</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>atomic_or</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_or</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_or</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_or</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p>
<p class="tableblock"> int <strong>atomic_or</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_or</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_or</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_or</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read the 32-bit value (referred to as <em>old</em>) stored at the location pointed to by
<em>p</em>. Compute (<em>old</em> | <em>val</em>) and store the result at the location pointed to by
<em>p</em>. The function returns <em>old</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>atomic_xor</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_xor</strong>(volatile __global int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_xor</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_xor</strong>(volatile __global unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p>
<p class="tableblock"> int <strong>atomic_xor</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br>
int <strong>atom_xor</strong>(volatile __local int *<em>p</em>, int <em>val</em>)<br></p>
<p class="tableblock"> unsigned int <strong>atomic_xor</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br>
unsigned int <strong>atom_xor</strong>(volatile __local unsigned int *<em>p</em>, unsigned int <em>val</em>)<br></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read the 32-bit value (referred to as <em>old</em>) stored at the location pointed to by
<em>p</em>. Compute (<em>old</em> ^ <em>val</em>) and store the result at the location pointed to by <em>p</em>.
The function returns <em>old</em>.</p></td>
</tr>
</tbody>
</table>
</div>
<div class="sect4">
<h5 id="atomic-restrictions"><a class="anchor" href="#atomic-restrictions"></a>6.15.12.9. Restrictions</h5>
<div class="openblock">
<div class="content">
<div class="ulist">
<ul>
<li>
<p>All operations on atomic types must be performed using the built-in
atomic functions.
C11 and C++11 support operators on atomic types.
OpenCL C does not support operators with atomic types.
Using atomic types with operators should result in a compilation error.</p>
</li>
<li>
<p>The <code>atomic_bool</code>, <code>atomic_char</code>, <code>atomic_uchar</code>, <code>atomic_short</code>,
<code>atomic_ushort</code>, <code>atomic_intmax_t</code> and <code>atomic_uintmax_t</code> types are not
supported by OpenCL C.</p>
</li>
<li>
<p>OpenCL C 2.0 requires that the built-in atomic functions on atomic types
are lock-free.
In OpenCL C 3.0 or newer, built-in atomic functions on atomic types may be
lock-free.</p>
</li>
<li>
<p>The <code>_Atomic</code> type specifier and <code>_Atomic</code> type qualifier are not supported
by OpenCL C.</p>
</li>
<li>
<p>The behavior of atomic operations where pointer arguments to the atomic
functions refer to an atomic type in the <code>private</code> address space is
undefined.</p>
</li>
<li>
<p>Using <code>memory_order_acquire</code> with any built-in atomic function except
<code>atomic_work_item_fence</code> <a href="#unified-spec">requires</a> support for OpenCL C
2.0, or OpenCL C 3.0 or newer and the <code>__opencl_c_atomic_order_acq_rel</code>
feature.</p>
</li>
<li>
<p>Using <code>memory_order_release</code> with any built-in atomic function except
<code>atomic_work_item_fence</code> <a href="#unified-spec">requires</a> support for OpenCL C
2.0, or OpenCL C 3.0 or newer and the <code>__opencl_c_atomic_order_acq_rel</code>
feature.</p>
</li>
<li>
<p>Using <code>memory_order_acq_rel</code> with any built-in atomic function except
<code>atomic_work_item_fence</code> <a href="#unified-spec">requires</a> support for OpenCL C
2.0, or OpenCL C 3.0 or newer and the <code>__opencl_c_atomic_order_acq_rel</code>
feature.</p>
</li>
<li>
<p>Using <code>memory_order_seq_cst</code> with any built-in atomic function
<a href="#unified-spec">requires</a> support for OpenCL C 2.0, or OpenCL C 3.0 or
newer and the <code>__opencl_c_atomic_order_seq_cst</code> feature.</p>
</li>
<li>
<p>Using <code>memory_scope_sub_group</code> with any built-in atomic function
<a href="#unified-spec">requires</a> support for OpenCL C 3.0 or newer and the
<code>__opencl_c_subgroups</code> feature.</p>
</li>
<li>
<p>Using <code>memory_scope_device</code> <a href="#unified-spec">requires</a> support for OpenCL
C 2.0, or OpenCL C 3.0 or newer and the
<code>__opencl_c_atomic_scope_device</code> feature.</p>
</li>
<li>
<p>Using <code>memory_scope_all_svm_devices</code> or <code>memory_scope_all_devices</code>
<a href="#unified-spec">requires</a> support for OpenCL C 2.0, or OpenCL C 3.0 or
newer and the <code>__opencl_c_atomic_scope_all_devices</code> feature.</p>
</li>
</ul>
</div>
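<div class="paragraph">
<p>As an illustration of the first restriction, the following minimal sketch counts positive inputs per work-group using only the built-in atomic functions.
The kernel name <code>count_positive</code> and its arguments are hypothetical, and the sketch assumes a one-dimensional NDRange, OpenCL C 2.0 or newer, and buffers sized to match the launch.
It uses only <code>memory_order_relaxed</code> and <code>memory_scope_work_group</code>, so none of the optional features listed above should be needed.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: counts positive inputs in each work-group.
kernel void count_positive(global const int *in, global int *per_group_count)
{
    local atomic_int count;

    // One work-item initializes the counter; the barrier makes it visible to the group.
    if (get_local_id(0) == 0)
        atomic_init(&amp;count, 0);
    barrier(CLK_LOCAL_MEM_FENCE);

    if (in[get_global_id(0)] &gt; 0) {
        // count++ or count += 1 is not supported: operators cannot be used with atomic types.
        atomic_fetch_add_explicit(&amp;count, 1, memory_order_relaxed,
                                  memory_scope_work_group);
    }
    barrier(CLK_LOCAL_MEM_FENCE);

    // One work-item publishes the total for its work-group.
    if (get_local_id(0) == 0)
        per_group_count[get_group_id(0)] =
            atomic_load_explicit(&amp;count, memory_order_relaxed,
                                 memory_scope_work_group);
}</code></pre>
</div>
</div>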
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="miscellaneous-vector-functions"><a class="anchor" href="#miscellaneous-vector-functions"></a>6.15.13. Miscellaneous Vector Functions</h4>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The OpenCL C programming language implements the following additional
built-in vector functions.
We use the generic type name <code>gentype<em>n</em></code> (or <code>gentype<em>m</em></code>) to indicate the
built-in data types <code>char{2|4|8|16}</code>, <code>uchar{2|4|8|16}</code>, <code>short{2|4|8|16}</code>,
<code>ushort{2|4|8|16}</code>, <code>half{2|4|8|16}</code> <sup class="footnote">[<a id="_footnoteref_56" class="footnote" href="#_footnotedef_56" title="View footnote.">56</a>]</sup>,
<code>int{2|4|8|16}</code>, <code>uint{2|4|8|16}</code>, <code>long{2|4|8|16}</code>
<sup class="footnote">[<a id="_footnoteref_57" class="footnote" href="#_footnotedef_57" title="View footnote.">57</a>]</sup>, <code>ulong{2|4|8|16}</code>, <code>float{2|4|8|16}</code>, or
<code>double{2|4|8|16}</code> <sup class="footnote">[<a id="_footnoteref_58" class="footnote" href="#_footnotedef_58" title="View footnote.">58</a>]</sup> as the type for
the arguments unless otherwise stated.
We use the generic name <code>ugentype<em>n</em></code> to indicate the built-in unsigned
integer data types.</p>
</div>
<table id="table-misc-vector" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 24. Built-in Miscellaneous Vector Functions</caption>
<colgroup>
<col style="width: 33.3333%;">
<col style="width: 66.6667%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>vec_step</strong>(gentype<em>n</em> <em>a</em>)<br>
int <strong>vec_step</strong>(char3 <em>a</em>)<br>
int <strong>vec_step</strong>(uchar3 <em>a</em>)<br>
int <strong>vec_step</strong>(short3 <em>a</em>)<br>
int <strong>vec_step</strong>(ushort3 <em>a</em>)<br>
int <strong>vec_step</strong>(half3 <em>a</em>)<br>
int <strong>vec_step</strong>(int3 <em>a</em>)<br>
int <strong>vec_step</strong>(uint3 <em>a</em>)<br>
int <strong>vec_step</strong>(long3 <em>a</em>)<br>
int <strong>vec_step</strong>(ulong3 <em>a</em>)<br>
int <strong>vec_step</strong>(float3 <em>a</em>)<br>
int <strong>vec_step</strong>(double3 <em>a</em>)<br>
int <strong>vec_step</strong>(<em>type</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The <strong>vec_step</strong> built-in function takes a built-in scalar or vector
data type argument and returns an integer value representing the
number of elements in the scalar or vector.</p>
<p class="tableblock"> For all scalar types, <strong>vec_step</strong> returns 1.</p>
<p class="tableblock"> The <strong>vec_step</strong> built-in functions that take a 3-component vector
return 4.</p>
<p class="tableblock"> <strong>vec_step</strong> may also take a pure type as an argument, e.g.
<strong>vec_step</strong>(float2).</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.1 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype<em>n</em> <strong>shuffle</strong>(gentype<em>m</em> <em>x</em>,
ugentype<em>n</em> <em>mask</em>)<br>
gentype<em>n</em> <strong>shuffle2</strong>(gentype<em>m</em> <em>x</em>,
gentype<em>m</em> <em>y</em>,
ugentype<em>n</em> <em>mask</em>)</p></td>
<td class="tableblock halign-left valign-top"><div class="content"><div class="paragraph">
<p>The <strong>shuffle</strong> and <strong>shuffle2</strong> built-in functions construct a
permutation of elements from one or two input vectors respectively
that are of the same type, returning a vector with the same element
type as the input and length that is the same as the shuffle mask.
The size of each element in the <em>mask</em> must match the size of each
element in the result.
For <strong>shuffle</strong>, only the <strong>ilogb</strong>(2<em>m</em>-1) least significant bits of each
<em>mask</em> element are considered.
For <strong>shuffle2</strong>, only the <strong>ilogb</strong>(2<em>m</em>-1)+1 least significant bits of
each <em>mask</em> element are considered.
Other bits in the mask shall be ignored.</p>
</div>
<div class="paragraph">
<p>The elements of the input vectors are numbered from left to right across one
or both of the vectors.
For this purpose, the number of elements in a vector is given by
<strong>vec_step</strong>(gentype<em>m</em>).
The shuffle <em>mask</em> operand specifies, for each element of the result vector,
which element of the one or two input vectors the result element gets.</p>
</div>
<div class="paragraph">
<p><a href="#unified-spec">Requires</a> support for OpenCL C 1.1 or newer.</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
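<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Each example below is independent of the others.</span>

<span class="comment">// Reverse the four elements of a: r.s0123 = a.wzyx</span>
uint4 mask = (uint4)(<span class="integer">3</span>, <span class="integer">2</span>, <span class="integer">1</span>, <span class="integer">0</span>);
float4 a;
float4 r = shuffle(a, mask);

<span class="comment">// Concatenate a and b: r.s0123 = a.xyzw, r.s4567 = b.xyzw</span>
uint8 mask = (uint8)(<span class="integer">0</span>, <span class="integer">1</span>, <span class="integer">2</span>, <span class="integer">3</span>, <span class="integer">4</span>, <span class="integer">5</span>, <span class="integer">6</span>, <span class="integer">7</span>);
float4 a, b;
float8 r = shuffle2(a, b, mask);

<span class="comment">// Select four elements of an 8-element vector</span>
uint4 mask;
float8 a;
float4 b;
b = shuffle(a, mask);</code></pre>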
</div>
</div>
<div class="paragraph">
<p>Examples that are not valid are:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">uint8 mask;
short16 a;
short8 b;
b = shuffle(a, mask); <span class="comment">// not valid: 32-bit mask elements do not match the 16-bit result elements</span></code></pre>
</div>
</div></div></td>
</tr>
</tbody>
</table>
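<div class="paragraph">
<p>The <strong>vec_step</strong> return values and the mask-bit rules described in the table can be sketched as follows.
The kernel name <code>misc_vector_demo</code> and its output buffers are hypothetical; the sketch is meant to be run by a single work-item and assumes <code>steps</code> holds at least three elements and <code>out</code> at least two.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel illustrating vec_step, shuffle and shuffle2.
kernel void misc_vector_demo(global int *steps, global float4 *out)
{
    float4 a  = (float4)(10.0f, 11.0f, 12.0f, 13.0f);
    float4 b  = (float4)(20.0f, 21.0f, 22.0f, 23.0f);
    float3 v3 = (float3)(1.0f, 2.0f, 3.0f);

    steps[0] = vec_step(v3);      // 4: 3-component vectors report 4
    steps[1] = vec_step(float2);  // 2: a pure type is also accepted
    steps[2] = vec_step(1.0f);    // 1: scalar argument

    // m = 4 for float4, so shuffle considers ilogb(7) = 2 bits of each mask element.
    // The mask values 4, 5, 6, 7 are therefore treated as 0, 1, 2, 3 and out[0] equals a.
    out[0] = shuffle(a, (uint4)(4, 5, 6, 7));

    // shuffle2 considers ilogb(7) + 1 = 3 bits: indices 0..3 select from a, 4..7 from b.
    // out[1] = (a.x, b.x, a.y, b.y) = (10, 20, 11, 21)
    out[1] = shuffle2(a, b, (uint4)(0, 4, 1, 5));
}</code></pre>
</div>
</div>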
</div>
</div>
</div>
<div class="sect3">
<h4 id="printf"><a class="anchor" href="#printf"></a>6.15.14. printf</h4>
<div class="openblock">
<div class="content">
<div class="admonitionblock important">
<table>
<tr>
<td class="icon">
<i class="fa icon-important" title="Important"></i>
</td>
<td class="content">
<strong>printf</strong> <a href="#unified-spec">requires</a> support for OpenCL C 1.2 or newer.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The OpenCL C programming language implements the <strong>printf</strong> function.</p>
</div>
<table id="table-printf" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 25. Built-in printf Function</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>printf</strong>(constant char * restrict <em>format</em>, &#8230;&#8203;)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The <strong>printf</strong> built-in function writes output to an
implementation-defined stream such as stdout under control of the
string pointed to by <em>format</em> that specifies how subsequent arguments
are converted for output.
If there are insufficient arguments for the format, the behavior is
undefined.
If the format is exhausted while arguments remain, the excess
arguments are evaluated (as always) but are otherwise ignored.
The <strong>printf</strong> function returns when the end of the format string is
encountered.</p>
<p class="tableblock"> <strong>printf</strong> returns 0 if it was executed successfully and -1 otherwise.</p></td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="sect4">
<h5 id="printf-output-synchronization"><a class="anchor" href="#printf-output-synchronization"></a>6.15.14.1. printf output synchronization</h5>
<div class="paragraph">
<p>When the event that is associated with a particular kernel invocation is
completed, the output of all printf() calls executed by this kernel
invocation is flushed to the implementation-defined output stream.
Calling <strong>clFinish</strong> on a command queue flushes all pending output by printf
in previously enqueued and completed commands to the implementation-defined
output stream.
In the case that printf is executed from multiple work-items concurrently,
there is no guarantee of ordering with respect to written data.
For example, it is valid for the output of a work-item with a global id
(0,0,1) to appear intermixed with the output of a work-item with a global id
(0,0,4) and so on.</p>
</div>
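<div class="paragraph">
<p>A minimal sketch of a kernel whose output order is not guaranteed is shown below.
The kernel name <code>report_ids</code> is hypothetical and a one-dimensional NDRange is assumed.
The lines it prints may reach the output stream in any order, and they are only guaranteed to have been flushed once the kernel's event has completed or <strong>clFinish</strong> has returned on the host.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: every work-item prints its own global id.
kernel void report_ids(void)
{
    // Lines from different work-items may appear in any order.
    printf("work-item %d\n", (int)get_global_id(0));
}</code></pre>
</div>
</div>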
</div>
<div class="sect4">
<h5 id="printf-format-string"><a class="anchor" href="#printf-format-string"></a>6.15.14.2. printf format string</h5>
<div class="paragraph">
<p>The format shall be a character sequence, beginning and ending in its
initial shift state.
The format is composed of zero or more directives: ordinary characters (not
<strong>%</strong>), which are copied unchanged to the output stream; and conversion
specifications, each of which results in fetching zero or more subsequent
arguments, converting them, if applicable, according to the corresponding
conversion specifier, and then writing the result to the output stream.
The format is in the constant address space and must be resolvable at
compile time, i.e. cannot be dynamically created by the executing program
itself.</p>
</div>
<div class="paragraph">
<p>Each conversion specification is introduced by the character <strong>%</strong>.
After the <strong>%</strong>, the following appear in sequence:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Zero or more <em>flags</em> (in any order) that modify the meaning of the
conversion specification.</p>
</li>
<li>
<p>An optional minimum <em>field width</em>.
If the converted value has fewer characters than the field width, it is
padded with spaces (by default) on the left (or right, if the left
adjustment flag, described later, has been given) to the field width.
The field width takes the form of a nonnegative decimal integer
<sup class="footnote">[<a id="_footnoteref_59" class="footnote" href="#_footnotedef_59" title="View footnote.">59</a>]</sup>.</p>
</li>
<li>
<p>An optional <em>precision</em> that gives the minimum number of digits to
appear for the <strong>d</strong>, <strong>i</strong>, <strong>o</strong>, <strong>u</strong>, <strong>x</strong>, and <strong>X</strong> conversions, the number
of digits to appear after the decimal-point character for <strong>a</strong>, <strong>A</strong>, <strong>e</strong>,
<strong>E</strong>, <strong>f</strong>, and <strong>F</strong> conversions, the maximum number of significant digits
for the <strong>g</strong> and <strong>G</strong> conversions, or the maximum number of bytes to be
written for <strong>s</strong> conversions.
The precision takes the form of a period (<strong>.</strong>) followed by an optional
decimal integer; if only the period is specified, the precision is taken
as zero.
If a precision appears with any other conversion specifier, the behavior
is undefined.</p>
</li>
<li>
<p>An optional <em>vector specifier</em>.</p>
</li>
<li>
<p>A <em>length modifier</em> that specifies the size of the argument.
The <em>length modifier</em> is required with a vector specifier and together
specifies the vector type.
<a href="#implicit-conversions">Implicit conversions</a> between vector types are
disallowed.
If the <em>vector specifier</em> is not specified, the <em>length modifier</em> is
optional.</p>
</li>
<li>
<p>A <em>conversion specifier</em> character that specifies the type of
conversion to be applied.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The flag characters and their meanings are:</p>
</div>
<div class="paragraph">
<p><strong>-</strong> The result of the conversion is left-justified within the field.
(It is right-justified if this flag is not specified.)</p>
</div>
<div class="paragraph">
<p><strong>+</strong> The result of a signed conversion always begins with a plus or minus
sign.
(It begins with a sign only when a negative value is converted if this flag
is not specified.) <sup class="footnote">[<a id="_footnoteref_60" class="footnote" href="#_footnotedef_60" title="View footnote.">60</a>]</sup></p>
</div>
<div class="paragraph">
<p><em>space</em> If the first character of a signed conversion is not a sign, or if a
signed conversion results in no characters, a space is prefixed to the
result.
If the <em>space</em> and <strong>+</strong> flags both appear, the <em>space</em> flag is ignored.</p>
</div>
<div class="paragraph">
<p><strong>#</strong> The result is converted to an &#8220;alternative form&#8221;.
For <strong>o</strong> conversion, it increases the precision, if and only if necessary,
to force the first digit of the result to be a zero (if the value and
precision are both 0, a single 0 is printed).
For <strong>x</strong> (or <strong>X</strong>) conversion, a nonzero result has <strong>0x</strong> (or <strong>0X</strong>)
prefixed to it.
For <strong>a</strong>, <strong>A</strong>, <strong>e</strong>, <strong>E</strong>, <strong>f</strong>, <strong>F</strong>, <strong>g</strong>, and <strong>G</strong> conversions,
the result of converting a floating-point number always contains a
decimal-point character, even if no digits follow it.
(Normally, a decimal-point character appears in the result of these
conversions only if a digit follows it.) For <strong>g</strong> and <strong>G</strong> conversions,
trailing zeros are <strong>not</strong> removed from the result.
For other conversions, the behavior is undefined.</p>
</div>
<div class="paragraph">
<p><strong>0</strong> For <strong>d</strong>, <strong>i</strong>, <strong>o</strong>, <strong>u</strong>, <strong>x</strong>, <strong>X</strong>, <strong>a</strong>, <strong>A</strong>, <strong>e</strong>,
<strong>E</strong>, <strong>f</strong>, <strong>F</strong>, <strong>g</strong>, and <strong>G</strong> conversions, leading zeros (following
any indication of sign or base) are used to pad to the field width rather
than performing space padding, except when converting an infinity or NaN.
If the <strong>0</strong> and <strong>-</strong> flags both appear, the <strong>0</strong> flag is ignored.
For <strong>d</strong>, <strong>i</strong>, <strong>o</strong>, <strong>u</strong>, <strong>x</strong>, and <strong>X</strong> conversions, if a precision
is specified, the <strong>0</strong> flag is ignored.
For other conversions, the behavior is undefined.</p>
</div>
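<div class="paragraph">
<p>The effect of the flags on scalar integer conversions can be sketched as follows.
The kernel name <code>flag_demo</code> is hypothetical; the brackets in the format strings are ordinary characters added only to make the padding visible, and the comments show the output expected from the rules above.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: intended to be run by a single work-item.
kernel void flag_demo(void)
{
    printf("[%5d]\n", 42);     // [   42]  right-justified by default
    printf("[%-5d]\n", 42);    // [42   ]  - flag: left-justified
    printf("[%+d]\n", 42);     // [+42]    + flag: sign always shown
    printf("[% d]\n", 42);     // [ 42]    space flag: space instead of a plus sign
    printf("[%#x]\n", 255u);   // [0xff]   # flag: 0x prefix for a nonzero value
    printf("[%#o]\n", 8u);     // [010]    # flag: leading zero for octal
    printf("[%05d]\n", 42);    // [00042]  0 flag: zero padding to the field width
    printf("[%05d]\n", -42);   // [-0042]  zeros follow the sign
}</code></pre>
</div>
</div>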
<div class="paragraph">
<p>The vector specifier and its meaning are:</p>
</div>
<div class="paragraph">
<p><strong>v</strong><em>n</em> Specifies that a following <strong>a</strong>, <strong>A</strong>, <strong>e</strong>, <strong>E</strong>, <strong>f</strong>, <strong>F</strong>, <strong>g</strong>, <strong>G</strong>,
<strong>d</strong>, <strong>i</strong>, <strong>o</strong>, <strong>u</strong>, <strong>x</strong>, or <strong>X</strong> conversion specifier applies to a vector
argument, where <em>n</em> is the size of the vector and must be 2, 3, 4, 8 or 16.</p>
</div>
<div class="paragraph">
<p>The vector value is displayed in the following general form:</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>value1 C value2 C &#8230;&#8203; C value<em>n</em></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>where C is a separator character.
The value for this separator character is a comma.</p>
</div>
<div class="paragraph">
<p>If the vector specifier is not used, the length modifiers and their meanings
are:</p>
</div>
<div class="paragraph">
<p><strong>hh</strong> Specifies that a following <strong>d</strong>, <strong>i</strong>, <strong>o</strong>, <strong>u</strong>, <strong>x</strong>, or <strong>X</strong> conversion
specifier applies to a <code>char</code> or <code>uchar</code> argument (the argument will have
been promoted according to the integer promotions, but its value shall be
converted to <code>char</code> or <code>uchar</code> before printing).</p>
</div>
<div class="paragraph">
<p><strong>h</strong> Specifies that a following <strong>d</strong>, <strong>i</strong>, <strong>o</strong>, <strong>u</strong>, <strong>x</strong>, or <strong>X</strong> conversion
specifier applies to a <code>short</code> or <code>ushort</code> argument (the argument will have
been promoted according to the integer promotions, but its value shall be
converted to <code>short</code> or <code>ushort</code> before printing).</p>
</div>
<div class="paragraph">
<p><strong>l</strong> (ell) Specifies that a following <strong>d</strong>, <strong>i</strong>, <strong>o</strong>, <strong>u</strong>, <strong>x</strong>, or <strong>X</strong>
conversion specifier applies to a <code>long</code> or <code>ulong</code> argument.
The <strong>l</strong> modifier is supported by the full profile.
For the embedded profile, the <strong>l</strong> modifier is supported only if 64-bit
integers are supported by the device.</p>
</div>
<div class="paragraph">
<p>If the vector specifier is used, the length modifiers and their meanings
are:</p>
</div>
<div class="paragraph">
<p><strong>hh</strong> Specifies that a following <strong>d</strong>, <strong>i</strong>, <strong>o</strong>, <strong>u</strong>, <strong>x</strong>, or <strong>X</strong> conversion
specifier applies to a <code>char<em>n</em></code> or <code>uchar<em>n</em></code> argument (the argument
will not be promoted).</p>
</div>
<div class="paragraph">
<p><strong>h</strong> Specifies that a following <strong>d</strong>, <strong>i</strong>, <strong>o</strong>, <strong>u</strong>, <strong>x</strong>, or <strong>X</strong> conversion
specifier applies to a <code>short<em>n</em></code> or <code>ushort<em>n</em></code> argument (the argument
will not be promoted); that a following <strong>a</strong>, <strong>A</strong>, <strong>e</strong>, <strong>E</strong>, <strong>f</strong>, <strong>F</strong>, <strong>g</strong>,
or <strong>G</strong> conversion specifier applies to a <code>half<em>n</em></code>
<sup class="footnote">[<a id="_footnoteref_61" class="footnote" href="#_footnotedef_61" title="View footnote.">61</a>]</sup> argument.</p>
</div>
<div class="paragraph">
<p><strong>hl</strong> This modifier can only be used with the vector specifier.
Specifies that a following <strong>d</strong>, <strong>i</strong>, <strong>o</strong>, <strong>u</strong>, <strong>x</strong>, or <strong>X</strong> conversion
specifier applies to an <code>int<em>n</em></code> or <code>uint<em>n</em></code> argument; that a following
<strong>a</strong>, <strong>A</strong>, <strong>e</strong>, <strong>E</strong>, <strong>f</strong>, <strong>F</strong>, <strong>g</strong>, or <strong>G</strong> conversion specifier applies to a
<code>float<em>n</em></code> argument.</p>
</div>
<div class="paragraph">
<p><strong>l</strong> (ell) Specifies that a following <strong>d</strong>, <strong>i</strong>, <strong>o</strong>, <strong>u</strong>, <strong>x</strong>, or <strong>X</strong>
conversion specifier applies to a <code>long<em>n</em></code> or <code>ulong<em>n</em></code> argument; that
a following <strong>a</strong>, <strong>A</strong>, <strong>e</strong>, <strong>E</strong>, <strong>f</strong>, <strong>F</strong>, <strong>g</strong>, or <strong>G</strong> conversion specifier
applies to a <code>double<em>n</em></code> argument.
The <strong>l</strong> modifier is supported by the full profile.
For the embedded profile, the <strong>l</strong> modifier is supported only if 64-bit
integers or double-precision floating-point are supported by the device.</p>
</div>
<div class="paragraph">
<p>If a vector specifier appears without a length modifier, the behavior is
undefined.
The vector data type described by the vector specifier and length modifier
must match the data type of the argument; otherwise the behavior is
undefined.</p>
</div>
<div class="paragraph">
<p>If a length modifier appears with any conversion specifier other than as
specified above, the behavior is undefined.</p>
</div>
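<div class="paragraph">
<p>Combining the vector specifier with a length modifier can be sketched as follows.
The kernel name <code>print_vectors</code> is hypothetical, and the comments show the output expected from the rules above, with a comma between elements.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: intended to be run by a single work-item.
kernel void print_vectors(void)
{
    int4   i4 = (int4)(1, 2, 3, 4);
    float2 f2 = (float2)(0.5f, 1.5f);
    uchar4 c4 = (uchar4)(10, 20, 30, 255);

    printf("%v4hld\n", i4);    // 1,2,3,4     hl + d: int4
    printf("%.1v2hlf\n", f2);  // 0.5,1.5     precision applies to each element
    printf("%v4hhx\n", c4);    // a,14,1e,ff  hh + x: uchar4
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>The precision precedes the vector specifier, and the length modifier follows it, matching the order in which the parts of a conversion specification are listed earlier in this section.</p>
</div>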
<div class="paragraph">
<p>The conversion specifiers and their meanings are:</p>
</div>
<div class="paragraph">
<p><strong>d,i</strong> The <code>int</code>, <code>char<em>n</em></code>, <code>short<em>n</em></code>, <code>int<em>n</em></code> or <code>long<em>n</em></code>
argument is converted to signed decimal in the style <em>[</em><strong>-</strong><em>]dddd</em>.
The precision specifies the minimum number of digits to appear; if the value
being converted can be represented in fewer digits, it is expanded with
leading zeros.
The default precision is 1.
The result of converting a zero value with a precision of zero is no
characters.</p>
</div>
<div class="paragraph">
<p><strong>o,u,x,X</strong> The <code>unsigned int</code>, <code>uchar<em>n</em></code>, <code>ushort<em>n</em></code>, <code>uint<em>n</em></code> or
<code>ulong<em>n</em></code> argument is converted to unsigned octal (<strong>o</strong>), unsigned decimal
(<strong>u</strong>), or unsigned hexadecimal notation (<strong>x</strong> or <strong>X</strong>) in the style <em>dddd</em>;
the letters <strong>abcdef</strong> are used for <strong>x</strong> conversion and the letters <strong>ABCDEF</strong>
for <strong>X</strong> conversion.
The precision specifies the minimum number of digits to appear; if the value
being converted can be represented in fewer digits, it is expanded with
leading zeros.
The default precision is 1.
The result of converting a zero value with a precision of zero is no
characters.</p>
</div>
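<div class="paragraph">
<p>The precision rules for the integer conversions can be sketched as follows.
The kernel name <code>precision_demo</code> is hypothetical; the brackets are ordinary characters that make empty or padded results visible, and the comments show the output expected from the rules above.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: intended to be run by a single work-item.
kernel void precision_demo(void)
{
    printf("[%.3d]\n", 7);      // [007]    expanded with leading zeros to 3 digits
    printf("[%.0d]\n", 0);      // []       zero value with zero precision: no characters
    printf("[%.4x]\n", 255u);   // [00ff]   minimum of 4 hexadecimal digits
    printf("[%5.3d]\n", 7);     // [  007]  precision first, then padding to the field width
    printf("[%05.3d]\n", 7);    // [  007]  0 flag is ignored when a precision is given
}</code></pre>
</div>
</div>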
<div class="paragraph">
<p><strong>f,F</strong> A <code>double</code>, <code>half<em>n</em></code>, <code>float<em>n</em></code> or <code>double<em>n</em></code> argument
representing a floating-point number is converted to decimal notation in the
style <em>[</em><strong>-</strong><em>]ddd</em><strong>.</strong><em>ddd</em>, where the number of digits after the
decimal-point character is equal to the precision specification.
If the precision is missing, it is taken as 6; if the precision is zero and
the <strong>#</strong> flag is not specified, no decimal-point character appears.
If a decimal-point character appears, at least one digit appears before it.
The value is rounded to the appropriate number of digits.
A <code>double</code>, <code>half<em>n</em></code>, <code>float<em>n</em></code> or <code>double<em>n</em></code> argument representing
an infinity is converted in one of the styles <em>[</em><strong>-</strong><em>]</em><strong>inf</strong> or
<em>[</em><strong>-</strong><em>]</em><strong>infinity</strong>; which style is used is implementation-defined.
A <code>double</code>, <code>half<em>n</em></code>, <code>float<em>n</em></code> or <code>double<em>n</em></code> argument representing
a NaN is converted in one of the styles <em>[</em><strong>-</strong><em>]</em><strong>nan</strong> or
<em>[</em><strong>-</strong><em>]</em><strong>nan(</strong><em>n-char-sequence</em><strong>)</strong>; which style is used, and the meaning of any <em>n-char-sequence</em>, is
implementation-defined.
The <strong>F</strong> conversion specifier produces <code>INF</code>, <code>INFINITY</code>, or <code>NAN</code> instead of
<strong>inf</strong>, <strong>infinity</strong>, or <strong>nan</strong>, respectively <sup class="footnote">[<a id="_footnoteref_62" class="footnote" href="#_footnotedef_62" title="View footnote.">62</a>]</sup>.</p>
</div>
<div class="paragraph">
<p><strong>e,E</strong> A <code>double</code>, <code>half<em>n</em></code>, <code>float<em>n</em></code> or <code>double<em>n</em></code> argument
representing a floating-point number is converted in the style
<em>[</em><strong>-</strong><em>]d</em><strong>.</strong><em>ddd</em> <strong>e±</strong><em>dd</em>, where there is one digit
(which is nonzero if the argument is nonzero) before the decimal-point
character and the number of digits after it is equal to the precision; if
the precision is missing, it is taken as 6; if the precision is zero and the
<strong>#</strong> flag is not specified, no decimal-point character appears.
The value is rounded to the appropriate number of digits.
The <strong>E</strong> conversion specifier produces a number with <strong>E</strong> instead of <strong>e</strong>
introducing the exponent.
The exponent always contains at least two digits, and only as many more
digits as necessary to represent the exponent.
If the value is zero, the exponent is zero.
A <code>double</code>, <code>half<em>n</em></code>, <code>float<em>n</em></code> or <code>double<em>n</em></code> argument representing
an infinity or NaN is converted in the style of an <strong>f</strong> or <strong>F</strong> conversion
specifier.</p>
</div>
<div class="paragraph">
<p><strong>g,G</strong> A <code>double</code>, <code>half<em>n</em></code>, <code>float<em>n</em></code> or <code>double<em>n</em></code> argument
representing a floating-point number is converted in style <strong>f</strong> or <strong>e</strong> (or in
style <strong>F</strong> or <strong>E</strong> in the case of a <strong>G</strong> conversion specifier), depending on
the value converted and the precision.
Let <em>P</em> equal the precision if nonzero, 6 if the precision is omitted, or
1 if the precision is zero.
Then, if a conversion with style <strong>E</strong> would have an exponent of <em>X</em>:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>if <em>P</em> &gt; <em>X</em> ≥ -4, the conversion is with style <strong>f</strong> (or <strong>F</strong>) and precision
<em>P</em> <strong>-</strong> (<em>X</em> <strong>+</strong> 1).</p>
</li>
<li>
<p>otherwise, the conversion is with style <strong>e</strong> (or <strong>E</strong>) and precision <em>P</em>
<strong>-</strong> 1.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>
Finally, unless the <strong>#</strong> flag is used, any trailing zeros are removed from
the fractional portion of the result and the decimal-point character is
removed if there is no fractional portion remaining.
A <code>double</code>, <code>half<em>n</em></code>, <code>float<em>n</em></code> or <code>double<em>n</em></code> argument
representing an infinity or NaN is converted in the style of an <strong>f</strong> or <strong>F</strong>
conversion specifier.</p>
</div>
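<div class="paragraph">
<p>The choice between the <strong>f</strong> and <strong>e</strong> styles can be sketched with the default precision, where <em>P</em> is 6.
The kernel name <code>g_demo</code> is hypothetical, and the comments show the exponent <em>X</em> and the output expected from the rules above.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: intended to be run by a single work-item.
kernel void g_demo(void)
{
    printf("%g\n", 0.0001f);     // X = -4: style f, prints 0.0001
    printf("%g\n", 0.00001f);    // X = -5: style e, prints 1e-05
    printf("%g\n", 123456.0f);   // X = 5:  style f, prints 123456
    printf("%g\n", 1234567.0f);  // X = 6:  style e, prints 1.23457e+06
    printf("%#g\n", 1.0f);       // # flag keeps trailing zeros, prints 1.00000
}</code></pre>
</div>
</div>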
<div class="paragraph">
<p><strong>a,A</strong> A <code>double</code>, <code>half<em>n</em></code>, <code>float<em>n</em></code> or <code>double<em>n</em></code> argument
representing a floating-point number is converted in the style
<em>[</em><strong>-</strong><em>]</em><strong>0x</strong><em>h</em><strong>.</strong><em>hhhh</em> <strong>p±</strong><em>d</em>, where there is one
hexadecimal digit (which is nonzero if the argument is a normalized
floating-point number and is otherwise unspecified) before the decimal-point
character <sup class="footnote">[<a id="_footnoteref_63" class="footnote" href="#_footnotedef_63" title="View footnote.">63</a>]</sup> and the number of hexadecimal digits
after it is equal to the precision; if the precision is missing, then the
precision is sufficient for an exact representation of the value; if the
precision is zero and the <strong>#</strong> flag is not specified, no decimal-point character
appears.
The letters <strong>abcdef</strong> are used for <strong>a</strong> conversion and the letters <strong>ABCDEF</strong>
for <strong>A</strong> conversion.
The <strong>A</strong> conversion specifier produces a number with <strong>X</strong> and <strong>P</strong> instead of
<strong>x</strong> and <strong>p</strong>.
The exponent always contains at least one digit, and only as many more
digits as necessary to represent the decimal exponent of 2.
If the value is zero, the exponent is zero.
A <code>double</code>, <code>half<em>n</em></code>, <code>float<em>n</em></code> or <code>double<em>n</em></code> argument representing
an infinity or NaN is converted in the style of an <strong>f</strong> or <strong>F</strong> conversion
specifier.</p>
</div>
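<div class="paragraph">
<p>Because the digit before the decimal point is only partly specified, the exact text produced by <strong>a</strong> can differ between implementations. What the missing-precision rule does guarantee is an exact representation. The following host-side C99 sketch checks that property by parsing the text back; <code>hexfloat_round_trips</code> is a hypothetical helper name, and the sketch assumes a C99 host library rather than an OpenCL device.</p>
</div>

```c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

/* Format v with %a and no precision, then parse the text back with
   strtod. Returns 1 if the round trip reproduces v exactly, else 0.
   Hypothetical helper, host C only. */
int hexfloat_round_trips(double v)
{
    char buf[64];
    int n = snprintf(buf, sizeof buf, "%a", v);
    if (n < 0 || (size_t)n >= sizeof buf)
        return 0;                      /* formatting failed or truncated */
    return strtod(buf, NULL) == v;     /* exact comparison is intended */
}
```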
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>The conversion specifiers <strong>e,E,g,G,a,A</strong> convert a <code>float</code> or <code>half</code> argument
that is a scalar type to a <code>double</code> only if the <code>double</code> data type is
supported, e.g. for OpenCL C 3.0 or newer the <code>__opencl_c_fp64</code> feature
macro is present.
If the <code>double</code> data type is not supported, the argument will be a <code>float</code>
instead of a <code>double</code> and the <code>half</code> type will be converted to a <code>float</code>.</p>
</div>
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p><strong>c</strong> The <code>int</code> argument is converted to an <code>unsigned char</code>, and the resulting
character is written.</p>
</div>
<div class="paragraph">
<p><strong>s</strong> The argument shall be a literal string
<sup class="footnote">[<a id="_footnoteref_64" class="footnote" href="#_footnotedef_64" title="View footnote.">64</a>]</sup>.
Characters from the literal string array are written up to (but not
including) the terminating null character.
If the precision is specified, no more than that many bytes are written.
If the precision is not specified or is greater than the size of the array,
the array shall contain a null character.</p>
</div>
<div class="paragraph">
<p><strong>p</strong> The argument shall be a pointer to <strong>void</strong>.
The pointer can refer to a memory region in the <code>global</code>, <code>constant</code>,
<code>local</code>, <code>private</code>, or generic address space.
The value of the pointer is converted to a sequence of printing characters
in an implementation-defined manner.</p>
</div>
<div class="paragraph">
<p><strong>%</strong> A <strong>%</strong> character is written.
No argument is converted.
The complete conversion specification shall be <strong>%%</strong>.</p>
</div>
<div class="paragraph">
<p>If a conversion specification is invalid, the behavior is undefined.
If any argument is not the correct type for the corresponding conversion
specification, the behavior is undefined.</p>
</div>
<div class="paragraph">
<p>In no case does a nonexistent or small field width cause truncation of a
field; if the result of a conversion is wider than the field width, the
field is expanded to contain the conversion result.</p>
</div>
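<div class="paragraph">
<p>The field width is therefore a minimum, never a limit. A minimal host-side C99 sketch of this behavior follows; <code>format_width2</code> is a hypothetical helper name, and the sketch assumes the host <code>snprintf</code> treats field widths the same way as described here.</p>
</div>

```c
#include <stdio.h>
#include <stddef.h>

/* Format v with a field width of 2. Short results are padded on the
   left with spaces; wider results expand the field and are not cut.
   Hypothetical helper, host C only. */
int format_width2(char *buf, size_t size, int v)
{
    return snprintf(buf, size, "%2d", v);
}
```

<div class="paragraph">
<p>Formatting <code>7</code> yields a space followed by <code>7</code>, while formatting <code>12345</code> yields all five digits.</p>
</div>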
<div class="paragraph">
<p>For <strong>a</strong> and <strong>A</strong> conversions, the value is correctly rounded to a hexadecimal
floating number with the given precision.</p>
</div>
<div class="paragraph">
<p>A few examples of printf are given below:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">float4 f = (float4)(<span class="float">1</span><span class="float">.0f</span>, <span class="float">2</span><span class="float">.0f</span>, <span class="float">3</span><span class="float">.0f</span>, <span class="float">4</span><span class="float">.0f</span>);
uchar4 uc = (uchar4)(<span class="hex">0xFA</span>, <span class="hex">0xFB</span>, <span class="hex">0xFC</span>, <span class="hex">0xFD</span>);
printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">f4 = %2.2v4hlf</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, f);
printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">uc = %#v4hhx</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, uc);</code></pre>
</div>
</div>
<div class="paragraph">
<p>The above two printf calls print the following:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">f4 = <span class="float">1</span><span class="float">.00</span>,<span class="float">2</span><span class="float">.00</span>,<span class="float">3</span><span class="float">.00</span>,<span class="float">4</span><span class="float">.00</span>
uc = <span class="hex">0xfa</span>,<span class="hex">0xfb</span>,<span class="hex">0xfc</span>,<span class="hex">0xfd</span></code></pre>
</div>
</div>
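<div class="paragraph">
<p>The first line of output can be modeled on the host as the scalar conversion <code>%2.2f</code> applied to each component, with a comma between components. The sketch below is only an emulation under that assumption; <code>format_v4f</code> is a hypothetical helper name, and a plain array of four <code>float</code> values stands in for <code>float4</code>.</p>
</div>

```c
#include <stdio.h>
#include <stddef.h>

/* Emulate "%2.2v4hlf" on the host: apply "%2.2f" to each of the four
   components and join them with commas. Returns the number of
   characters written, or -1 on error or if buf is too small.
   Hypothetical helper; a real kernel would simply call printf. */
int format_v4f(char *buf, size_t size, const float v[4])
{
    size_t used = 0;
    for (int i = 0; i < 4; ++i) {
        int n = snprintf(buf + used, size - used, "%s%2.2f",
                         i ? "," : "", (double)v[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;                 /* error or truncated output */
        used += (size_t)n;
    }
    return (int)used;
}
```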
<div class="paragraph">
<p>A few examples of valid use cases of printf for the conversion specifier <strong>s</strong>
are given below.
The argument value must be a pointer to a literal string.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span> my_kernel( ... )
{
printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">%s</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, <span class="string"><span class="delimiter">&quot;</span><span class="content">this is a test string</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>A few examples of invalid use cases of printf for the conversion specifier
<strong>s</strong> are given below:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span> my_kernel(global <span class="predefined-type">char</span> *s, ... )
{
printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">%s</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, s);
constant <span class="predefined-type">char</span> *p = <span class="string"><span class="delimiter">&quot;</span><span class="content">this is a test string</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>;
printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">%s</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, p);
printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">%s</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, &amp;p[<span class="integer">3</span>]);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>A few examples of invalid use cases of printf where data types given by the
vector specifier and length modifier do not match the argument type are
given below:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span> my_kernel(global <span class="predefined-type">char</span> *s, ... )
{
uint2 ui = (uint2)(<span class="hex">0x12345678</span>, <span class="hex">0x87654321</span>);
printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">unsigned short value = (%#v2hx)</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, ui);
printf(<span class="string"><span class="delimiter">&quot;</span><span class="content">unsigned char value = (%#v2hhx)</span><span class="char">\n</span><span class="delimiter">&quot;</span></span>, ui);
}</code></pre>
</div>
</div>
</div>
<div class="sect4">
<h5 id="differences-between-opencl-c-and-c99-printf"><a class="anchor" href="#differences-between-opencl-c-and-c99-printf"></a>6.15.14.3. Differences between OpenCL C and C99 printf</h5>
<div class="ulist">
<ul>
<li>
<p>The <strong>l</strong> modifier followed by a <strong>c</strong> conversion specifier or <strong>s</strong>
conversion specifier is not supported by OpenCL C.</p>
</li>
<li>
<p>The <strong>ll</strong>, <strong>j</strong>, <strong>z</strong>, <strong>t</strong>, and <strong>L</strong> length modifiers are not supported by
OpenCL C but are reserved.</p>
</li>
<li>
<p>The <strong>n</strong> conversion specifier is not supported by OpenCL C but is
reserved.</p>
</li>
<li>
<p>OpenCL C adds the optional <strong>v</strong><em>n</em> vector specifier to support printing
of vector types.</p>
</li>
<li>
<p>The conversion specifiers <strong>f</strong>, <strong>F</strong>, <strong>e</strong>, <strong>E</strong>, <strong>g</strong>, <strong>G</strong>, <strong>a</strong>, <strong>A</strong> convert
a <code>float</code> argument to a <code>double</code> only if the <code>double</code> data type is
supported.
Refer to the value of the <a href="#opencl-device-queries"><code>CL_DEVICE_DOUBLE_FP_CONFIG</code> device query</a>.
If the <code>double</code> data type is not supported, the argument will be a
<code>float</code> instead of a <code>double</code>.</p>
</li>
<li>
<p>For the embedded profile, the <strong>l</strong> length modifier is supported only if
64-bit integers are supported.</p>
</li>
<li>
<p>In OpenCL C, <strong>printf</strong> returns 0 if it was executed successfully and -1
otherwise.
In C99, by contrast, <strong>printf</strong> returns the number of characters printed, or a
negative value if an output or encoding error occurred.</p>
</li>
<li>
<p>In OpenCL C, the conversion specifier <strong>s</strong> can only be used for arguments
that are literal strings.</p>
</li>
</ul>
</div>
</div>
</div>
<div class="sect3">
<h4 id="image-read-and-write-functions"><a class="anchor" href="#image-read-and-write-functions"></a>6.15.15. Image Read and Write Functions</h4>
<div class="paragraph">
<p>The built-in functions defined in this section can only be used with image
memory objects.
An image memory object can be accessed by specific function calls that read
from and/or write to specific locations in the image.</p>
</div>
<div class="paragraph">
<p>Support for the image built-in functions is optional.
If a device supports images then the value of the <a href="#opencl-device-queries"><code>CL_DEVICE_IMAGE_SUPPORT</code> device query</a> is <code>CL_TRUE</code> and the OpenCL C
compiler for that device must define the <code>__IMAGE_SUPPORT__</code> macro.
A compiler for OpenCL C 3.0 or newer for that device must also support the
<code>__opencl_c_images</code> feature.</p>
</div>
<div class="paragraph">
<p>Image memory objects that are being read by a kernel should be declared with
the <code>read_only</code> qualifier.
<strong>write_image</strong> calls to image memory objects declared with the <code>read_only</code>
qualifier will generate a compilation error.
Image memory objects that are being written to by a kernel should be
declared with the <code>write_only</code> qualifier.
<strong>read_image</strong> calls to image memory objects declared with the <code>write_only</code>
qualifier will generate a compilation error.
<strong>read_image</strong> and <strong>write_image</strong> calls to the same image memory object in a
kernel are supported.
Image memory objects that are being read and written by a kernel should be
declared with the <code>read_write</code> qualifier.</p>
</div>
<div class="paragraph">
<p>The <strong>read_image</strong> calls return a four-component floating-point, integer or
unsigned integer color value.
The color values returned by <strong>read_image</strong> are identified as <em>x</em>, <em>y</em>, <em>z</em>,
<em>w</em> where <em>x</em> refers to the red component, <em>y</em> refers to the green
component, <em>z</em> refers to the blue component and <em>w</em> refers to the alpha
component.</p>
</div>
<div class="sect4">
<h5 id="samplers"><a class="anchor" href="#samplers"></a>6.15.15.1. Samplers</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The image read functions take a sampler argument.
The sampler can be passed as an argument to the kernel using
<strong>clSetKernelArg</strong>, or can be declared in the outermost scope of kernel
functions, or it can be a constant variable of type <code>sampler_t</code> declared in
the program source.</p>
</div>
<div class="paragraph">
<p>Sampler variables in a program are declared to be of type <code>sampler_t</code>.
A variable of <code>sampler_t</code> type declared in the program source must be
initialized with a 32-bit unsigned integer constant, which is interpreted as
a bit-field specifying the following properties:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Addressing Mode</p>
</li>
<li>
<p>Filter Mode</p>
</li>
<li>
<p>Normalized Coordinates</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>These properties control how elements of an image object are read by
<strong>read_image{f|i|ui}</strong>.</p>
</div>
<div class="paragraph">
<p>Samplers can also be declared as global constants in the program source
using the following syntax.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="directive">const</span> sampler_t &lt;sampler name&gt; = &lt;value&gt;</code></pre>
</div>
</div>
<div class="paragraph">
<p>or</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">constant sampler_t &lt;sampler name&gt; = &lt;value&gt;</code></pre>
</div>
</div>
<div class="paragraph">
<p>or</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">__constant sampler_t &lt;sampler name&gt; = &lt;value&gt;</code></pre>
</div>
</div>
<div class="paragraph">
<p>Note that samplers declared using the <code>constant</code> qualifier are not counted
towards the maximum number of arguments pointing to the constant address
space or the maximum size of the <code>constant</code> address space allowed per device
(i.e. the value of the <a href="#opencl-device-queries"><code>CL_DEVICE_MAX_CONSTANT_ARGS</code></a> and <a href="#opencl-device-queries"><code>CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE</code></a> device queries).</p>
</div>
<div class="paragraph">
<p>The sampler fields are described in the following table.</p>
</div>
<table id="table-sampler-descriptor" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 26. Sampler Descriptor</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Sampler State</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>&lt;normalized coords&gt;</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Specifies whether the <em>x</em>, <em>y</em> and <em>z</em> coordinates are passed in as
normalized or unnormalized values.
This must be a literal value and can be one of the following
predefined enums:</p>
<p class="tableblock"> <code>CLK_NORMALIZED_COORDS_TRUE</code> or <code>CLK_NORMALIZED_COORDS_FALSE</code>.</p>
<p class="tableblock"> The samplers used with an image in multiple calls to
<strong>read_image{f|i|ui}</strong> declared in a kernel must use the same value
for &lt;normalized coords&gt;.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>&lt;addressing mode&gt;</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Specifies the image addressing mode, i.e. how out-of-range image
coordinates are handled.
This must be a literal value and can be one of the following
predefined enums:</p>
<p class="tableblock"> <code>CLK_ADDRESS_MIRRORED_REPEAT</code> - Flip the image coordinate at every
integer junction.
This addressing mode can only be used with normalized coordinates.
If normalized coordinates are not used, this addressing mode may
generate image coordinates that are undefined.</p>
<p class="tableblock"> <code>CLK_ADDRESS_REPEAT</code> - out-of-range image coordinates are wrapped to
the valid range.
This addressing mode can only be used with normalized coordinates.
If normalized coordinates are not used, this addressing mode may
generate image coordinates that are undefined.</p>
<p class="tableblock"> <code>CLK_ADDRESS_CLAMP_TO_EDGE</code> - out-of-range image coordinates are
clamped to the extent.</p>
<p class="tableblock"> <code>CLK_ADDRESS_CLAMP</code> - out-of-range image coordinates will return a
border color <sup class="footnote">[<a id="_footnoteref_65" class="footnote" href="#_footnotedef_65" title="View footnote.">65</a>]</sup>.</p>
<p class="tableblock"> <code>CLK_ADDRESS_NONE</code> - for this addressing mode the programmer
guarantees that the image coordinates used to sample elements of the
image refer to a location inside the image; otherwise the results are
undefined.</p>
<p class="tableblock"> For 1D and 2D image arrays, the addressing mode applies only to the
<em>x</em> and (<em>x, y</em>) coordinates.
The addressing mode for the coordinate which specifies the array index
is always <code>CLK_ADDRESS_CLAMP_TO_EDGE</code>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>&lt;filter mode&gt;</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Specifies the filter mode to use.
This must be a literal value and can be one of the following
predefined enums: <code>CLK_FILTER_NEAREST</code> or <code>CLK_FILTER_LINEAR</code>.</p>
<p class="tableblock"> Refer to the <a href="#addressing-and-filter-modes">detailed description of
these filter modes</a>.</p></td>
</tr>
</tbody>
</table>
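<div class="paragraph">
<p>The effect of each addressing mode on an out-of-range coordinate can be sketched on the host for one dimension and a nearest filter. This is an illustration of the descriptions in the table, not the normative formulas from the detailed description of the addressing and filter modes. The names <code>addr_mode</code>, <code>resolve_texel</code>, <code>floor_to_int</code> and <code>BORDER</code> are hypothetical, the input is assumed to be a normalized coordinate, and <code>CLK_ADDRESS_NONE</code> is omitted because its out-of-range results are undefined.</p>
</div>

```c
/* Hypothetical mode names for this sketch; they are not the CLK_* enums. */
enum addr_mode {
    ADDR_CLAMP_TO_EDGE,
    ADDR_CLAMP,
    ADDR_REPEAT,
    ADDR_MIRRORED_REPEAT
};

#define BORDER (-1)   /* sentinel: the lookup yields the border color */

/* Largest integer not greater than x, for the small ranges used here. */
static int floor_to_int(float x)
{
    int i = (int)x;                    /* truncates toward zero */
    return (x < (float)i) ? i - 1 : i;
}

/* Map a normalized coordinate s to a texel index in [0, width - 1],
   or to BORDER, for a one-dimensional nearest-filter lookup. */
int resolve_texel(enum addr_mode mode, float s, int width)
{
    int i;
    switch (mode) {
    case ADDR_REPEAT:
        s -= (float)floor_to_int(s);   /* keep the fractional part */
        i = floor_to_int(s * (float)width);
        return (i > width - 1) ? i - width : i;
    case ADDR_MIRRORED_REPEAT: {
        /* Reduce to a period of 2, then reflect the second half. */
        float t = s - 2.0f * (float)floor_to_int(0.5f * s);
        if (t > 1.0f)
            t = 2.0f - t;
        i = floor_to_int(t * (float)width);
        return (i > width - 1) ? width - 1 : i;
    }
    case ADDR_CLAMP_TO_EDGE:
        i = floor_to_int(s * (float)width);
        return (i < 0) ? 0 : (i > width - 1) ? width - 1 : i;
    case ADDR_CLAMP:
        i = floor_to_int(s * (float)width);
        return (i < 0 || i > width - 1) ? BORDER : i;
    }
    return BORDER;                     /* not reached for valid modes */
}
```

<div class="paragraph">
<p>For a width of 4 and a coordinate of 1.25, this sketch selects texel 1 under the repeat mode and texel 3 under the mirrored repeat mode; a coordinate of 1.5 selects texel 3 under clamp-to-edge and the border under clamp.</p>
</div>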
<div class="paragraph">
<p><strong>Examples</strong>:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="directive">const</span> sampler_t samplerA = CLK_NORMALIZED_COORDS_TRUE |
CLK_ADDRESS_REPEAT |
CLK_FILTER_NEAREST;</code></pre>
</div>
</div>
<div class="paragraph">
<p><code>samplerA</code> specifies a sampler that uses normalized coordinates, the repeat
addressing mode and a nearest filter.</p>
</div>
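<div class="paragraph">
<p>Since a sampler initializer is interpreted as a bit-field, the three properties can be thought of as independent groups of bits that are combined with <code>|</code>. The host-side sketch below models that idea. The bit values and masks are illustrative assumptions only (this section does not define the numeric values of the <code>CLK_*</code> enums), and every <code>EX_</code> and <code>sampler_</code> name is hypothetical.</p>
</div>

```c
/* Illustrative bit assignments only. The numeric values of the real
   CLK_* enums are not given in this section and may differ. */
#define EX_NORMALIZED_COORDS_FALSE 0x00u
#define EX_NORMALIZED_COORDS_TRUE  0x01u
#define EX_ADDRESS_NONE            0x00u
#define EX_ADDRESS_CLAMP_TO_EDGE   0x02u
#define EX_ADDRESS_CLAMP           0x04u
#define EX_ADDRESS_REPEAT          0x06u
#define EX_ADDRESS_MIRRORED_REPEAT 0x08u
#define EX_FILTER_NEAREST          0x10u
#define EX_FILTER_LINEAR           0x20u

#define EX_NORMALIZED_MASK 0x01u
#define EX_ADDRESS_MASK    0x0Eu
#define EX_FILTER_MASK     0x30u

/* Extract each property from a combined sampler word. */
unsigned sampler_normalized(unsigned s) { return s & EX_NORMALIZED_MASK; }
unsigned sampler_address(unsigned s)    { return s & EX_ADDRESS_MASK; }
unsigned sampler_filter(unsigned s)     { return s & EX_FILTER_MASK; }

/* The repeat and mirrored repeat modes are only defined together with
   normalized coordinates; other combinations pass this check. */
int sampler_is_well_defined(unsigned s)
{
    unsigned a = sampler_address(s);
    if (a == EX_ADDRESS_REPEAT || a == EX_ADDRESS_MIRRORED_REPEAT)
        return sampler_normalized(s) == EX_NORMALIZED_COORDS_TRUE;
    return 1;
}
```

<div class="paragraph">
<p>A word built like <code>samplerA</code> (normalized coordinates, repeat addressing, nearest filter) passes the check, whereas the same word with unnormalized coordinates does not.</p>
</div>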
<div class="paragraph">
<p>The maximum number of samplers that can be declared in a kernel can be
queried using the <code>CL_DEVICE_MAX_SAMPLERS</code> token in <strong>clGetDeviceInfo</strong>.</p>
</div>
</div>
</div>
<div class="sect5">
<h6 id="determining-the-border-color-or-value"><a class="anchor" href="#determining-the-border-color-or-value"></a>6.15.15.1.1. <strong>Determining the border color or value</strong></h6>
<div class="paragraph">
<p>If <code>&lt;addressing mode&gt;</code> in the sampler is <code>CLK_ADDRESS_CLAMP</code>, then out-of-range
image coordinates return the border color.
The border color selected depends on the image channel order and can be one
of the following values:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>If the image channel order is <code>CL_A</code>, <code>CL_INTENSITY</code>, <code>CL_Rx</code>,
<code>CL_RA</code>, <code>CL_RGx</code>, <code>CL_RGBx</code>, <code>CL_sRGBx</code>, <code>CL_ARGB</code>, <code>CL_BGRA</code>,
<code>CL_ABGR</code>, <code>CL_RGBA</code>, <code>CL_sRGBA</code> or <code>CL_sBGRA</code>, the border color is
<code>(0.0f, 0.0f, 0.0f, 0.0f)</code>.</p>
</li>
<li>
<p>If the image channel order is <code>CL_R</code>, <code>CL_RG</code>, <code>CL_RGB</code>, or
<code>CL_LUMINANCE</code>, the border color is <code>(0.0f, 0.0f, 0.0f, 1.0f)</code>.</p>
</li>
<li>
<p>If the image channel order is <code>CL_DEPTH</code>, the border value is <code>0.0f</code>.</p>
</li>
</ul>
</div>
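<div class="paragraph">
<p>The selection rule in the list above amounts to choosing the alpha component of an otherwise zero color. A minimal host-side sketch follows; the enum <code>ex_channel_order</code>, the struct <code>ex_color</code> and the function <code>border_color</code> are hypothetical names that stand in for the real channel-order constants. <code>CL_DEPTH</code> is left out because its border value is a single <code>0.0f</code> rather than a four-component color.</p>
</div>

```c
/* Hypothetical stand-ins for the channel orders named in the list. */
enum ex_channel_order {
    EX_A, EX_INTENSITY, EX_Rx, EX_RA, EX_RGx, EX_RGBx, EX_sRGBx,
    EX_ARGB, EX_BGRA, EX_ABGR, EX_RGBA, EX_sRGBA, EX_sBGRA,
    EX_R, EX_RG, EX_RGB, EX_LUMINANCE
};

struct ex_color { float x, y, z, w; };

/* Return the border color for a channel order: orders without a stored
   alpha channel (R, RG, RGB, LUMINANCE) get w = 1.0f, all other listed
   orders get w = 0.0f. The x, y and z components are always 0.0f. */
struct ex_color border_color(enum ex_channel_order order)
{
    struct ex_color c = { 0.0f, 0.0f, 0.0f, 0.0f };
    switch (order) {
    case EX_R:
    case EX_RG:
    case EX_RGB:
    case EX_LUMINANCE:
        c.w = 1.0f;
        break;
    default:
        break;
    }
    return c;
}
```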
</div>
<div class="sect5">
<h6 id="srgb-images"><a class="anchor" href="#srgb-images"></a>6.15.15.1.2. <strong>sRGB Images</strong></h6>
<div class="paragraph">
<p>The built-in image read functions will perform sRGB to linear RGB
conversions if the image is an sRGB image.
Likewise, the built-in image write functions perform the linear to
sRGB conversion if the image is an sRGB image.</p>
</div>
<div class="paragraph">
<p>Only the R, G and B components are converted from linear to sRGB and
vice-versa.
The alpha component is returned as is.</p>
</div>
</div>
</div>
<div class="sect4">
<h5 id="built-in-image-read-functions"><a class="anchor" href="#built-in-image-read-functions"></a>6.15.15.2. Built-in Image Read Functions</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The following built-in function calls to read images with a sampler are
supported <sup class="footnote">[<a id="_footnoteref_66" class="footnote" href="#_footnotedef_66" title="View footnote.">66</a>]</sup>.</p>
</div>
<table id="table-image-read" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 27. Built-in Image Read Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float4 <strong>read_imagef</strong>(read_only image2d_t <em>image</em>, sampler_t <em>sampler</em>,
int2 <em>coord</em>)<br>
float4 <strong>read_imagef</strong>(read_only image2d_t <em>image</em>, sampler_t <em>sampler</em>,
float2 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use the coordinate (<em>coord.x</em>, <em>coord.y</em>) to do an element lookup in
the 2D image object specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [0.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to one of
the pre-defined packed formats or <code>CL_UNORM_INT8</code>, or
<code>CL_UNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [-1.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_SNORM_INT8</code>, or <code>CL_SNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values for image objects created
with <em>image_channel_data_type</em> set to <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.</p>
<p class="tableblock"> The <strong>read_imagef</strong> calls that take integer coordinates must use a
sampler with filter mode set to <code>CLK_FILTER_NEAREST</code>, normalized
coordinates set to <code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode
set to <code>CLK_ADDRESS_CLAMP_TO_EDGE</code>, <code>CLK_ADDRESS_CLAMP</code> or
<code>CLK_ADDRESS_NONE</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for image objects with
<em>image_channel_data_type</em> values not specified in the description
above are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int4 <strong>read_imagei</strong>(read_only image2d_t <em>image</em>, sampler_t <em>sampler</em>,
int2 <em>coord</em>)<br>
int4 <strong>read_imagei</strong>(read_only image2d_t <em>image</em>, sampler_t <em>sampler</em>,
float2 <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(read_only image2d_t <em>image</em>, sampler_t <em>sampler</em>,
int2 <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(read_only image2d_t <em>image</em>, sampler_t <em>sampler</em>,
float2 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use the coordinate (<em>coord.x</em>, <em>coord.y</em>) to do an element lookup in
the 2D image object specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagei</strong> and <strong>read_imageui</strong> return unnormalized signed integer
and unsigned integer values respectively.
Each channel will be stored in a 32-bit integer.</p>
<p class="tableblock"> <strong>read_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imagei</strong> are undefined.</p>
<p class="tableblock"> <strong>read_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imageui</strong> are undefined.</p>
<p class="tableblock"> The <strong>read_image{i|ui}</strong> calls support a nearest filter only.
The filter_mode specified in <em>sampler</em> must be set to
<code>CLK_FILTER_NEAREST</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> Furthermore, the <strong>read_image{i|ui}</strong> calls that take integer
coordinates must use a sampler with normalized coordinates set to
<code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode set to
<code>CLK_ADDRESS_CLAMP_TO_EDGE</code>, <code>CLK_ADDRESS_CLAMP</code> or
<code>CLK_ADDRESS_NONE</code>; otherwise the values returned are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float4 <strong>read_imagef</strong>(read_only image3d_t <em>image</em>, sampler_t <em>sampler</em>,
int4 <em>coord</em> )<br>
float4 <strong>read_imagef</strong>(read_only image3d_t <em>image</em>, sampler_t <em>sampler</em>,
float4 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use the coordinate (<em>coord.x</em>, <em>coord.y</em>, <em>coord.z</em>) to do an element
lookup in the 3D image object specified by <em>image</em>.
<em>coord.w</em> is ignored.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [0.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to one of
the pre-defined packed formats or <code>CL_UNORM_INT8</code>, or
<code>CL_UNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [-1.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_SNORM_INT8</code>, or <code>CL_SNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values for image objects created
with <em>image_channel_data_type</em> set to <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.</p>
<p class="tableblock"> The <strong>read_imagef</strong> calls that take integer coordinates must use a
sampler with filter mode set to <code>CLK_FILTER_NEAREST</code>, normalized
coordinates set to <code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode
set to <code>CLK_ADDRESS_CLAMP_TO_EDGE</code>, <code>CLK_ADDRESS_CLAMP</code> or
<code>CLK_ADDRESS_NONE</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for image objects with
<em>image_channel_data_type</em> values not specified in the description above are
undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int4 <strong>read_imagei</strong>(read_only image3d_t <em>image</em>, sampler_t <em>sampler</em>,
int4 <em>coord</em>)<br>
int4 <strong>read_imagei</strong>(read_only image3d_t <em>image</em>, sampler_t <em>sampler</em>,
float4 <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(read_only image3d_t <em>image</em>, sampler_t <em>sampler</em>,
int4 <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(read_only image3d_t <em>image</em>, sampler_t <em>sampler</em>,
float4 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use the coordinate (<em>coord.x</em>, <em>coord.y</em>, <em>coord.z</em>) to do an element
lookup in the 3D image object specified by <em>image</em>.
<em>coord.w</em> is ignored.</p>
<p class="tableblock"> <strong>read_imagei</strong> and <strong>read_imageui</strong> return unnormalized signed integer
and unsigned integer values respectively.
Each channel will be stored in a 32-bit integer.</p>
<p class="tableblock"> <strong>read_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imagei</strong> are undefined.</p>
<p class="tableblock"> <strong>read_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imageui</strong> are undefined.</p>
<p class="tableblock"> The <strong>read_image{i|ui}</strong> calls support a nearest filter only.
The filter_mode specified in <em>sampler</em> must be set to
<code>CLK_FILTER_NEAREST</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> Furthermore, the <strong>read_image{i|ui}</strong> calls that take integer
coordinates must use a sampler with normalized coordinates set to
<code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode set to
<code>CLK_ADDRESS_CLAMP_TO_EDGE</code>, <code>CLK_ADDRESS_CLAMP</code> or
<code>CLK_ADDRESS_NONE</code>; otherwise the values returned are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float4 <strong>read_imagef</strong>(read_only image2d_array_t <em>image</em>,
sampler_t <em>sampler</em>, int4 <em>coord</em>)<br>
float4 <strong>read_imagef</strong>(read_only image2d_array_t <em>image</em>,
sampler_t <em>sampler</em>, float4 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord.xy</em> to do an element lookup in the 2D image identified by
<em>coord.z</em> in the 2D image array specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [0.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to one of
the pre-defined packed formats or <code>CL_UNORM_INT8</code>, or
<code>CL_UNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [-1.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_SNORM_INT8</code>, or <code>CL_SNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values for image objects created
with <em>image_channel_data_type</em> set to <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.</p>
<p class="tableblock"> The <strong>read_imagef</strong> calls that take integer coordinates must use a
sampler with filter mode set to <code>CLK_FILTER_NEAREST</code>, normalized
coordinates set to <code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode
set to <code>CLK_ADDRESS_CLAMP_TO_EDGE</code>, <code>CLK_ADDRESS_CLAMP</code> or
<code>CLK_ADDRESS_NONE</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for image objects with
<em>image_channel_data_type</em> values not specified in the description above
are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int4 <strong>read_imagei</strong>(read_only image2d_array_t <em>image</em>, sampler_t <em>sampler</em>,
int4 <em>coord</em>)<br>
int4 <strong>read_imagei</strong>(read_only image2d_array_t <em>image</em>, sampler_t <em>sampler</em>,
float4 <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(read_only image2d_array_t <em>image</em>,
sampler_t <em>sampler</em>, int4 <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(read_only image2d_array_t <em>image</em>,
sampler_t <em>sampler</em>, float4 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord.xy</em> to do an element lookup in the 2D image identified by
<em>coord.z</em> in the 2D image array specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagei</strong> and <strong>read_imageui</strong> return unnormalized signed integer
and unsigned integer values respectively.
Each channel will be stored in a 32-bit integer.</p>
<p class="tableblock"> <strong>read_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imagei</strong> are undefined.</p>
<p class="tableblock"> <strong>read_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imageui</strong> are undefined.</p>
<p class="tableblock"> The <strong>read_image{i|ui}</strong> calls support a nearest filter only.
The filter_mode specified in <em>sampler</em> must be set to
<code>CLK_FILTER_NEAREST</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> Furthermore, the <strong>read_image{i|ui}</strong> calls that take integer
coordinates must use a sampler with normalized coordinates set to
<code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode set to
<code>CLK_ADDRESS_CLAMP_TO_EDGE</code>, <code>CLK_ADDRESS_CLAMP</code> or
<code>CLK_ADDRESS_NONE</code>; otherwise the values returned are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float4 <strong>read_imagef</strong>(read_only image1d_t <em>image</em>, sampler_t <em>sampler</em>,
int <em>coord</em>)<br>
float4 <strong>read_imagef</strong>(read_only image1d_t <em>image</em>, sampler_t <em>sampler</em>,
float <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord</em> to do an element lookup in the 1D image object specified
by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [0.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_UNORM_INT8</code>, <code>CL_UNORM_INT16</code>, or one of the pre-defined
packed formats.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [-1.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_SNORM_INT8</code> or <code>CL_SNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values for image objects created
with <em>image_channel_data_type</em> set to <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.</p>
<p class="tableblock"> The <strong>read_imagef</strong> calls that take integer coordinates must use a
sampler with filter mode set to <code>CLK_FILTER_NEAREST</code>, normalized
coordinates set to <code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode
set to <code>CLK_ADDRESS_CLAMP_TO_EDGE</code>, <code>CLK_ADDRESS_CLAMP</code> or
<code>CLK_ADDRESS_NONE</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for image objects with
<em>image_channel_data_type</em> values not specified in the description
above are undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int4 <strong>read_imagei</strong>(read_only image1d_t <em>image</em>, sampler_t <em>sampler</em>,
int <em>coord</em>)<br>
int4 <strong>read_imagei</strong>(read_only image1d_t <em>image</em>, sampler_t <em>sampler</em>,
float <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(read_only image1d_t <em>image</em>, sampler_t <em>sampler</em>,
int <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(read_only image1d_t <em>image</em>, sampler_t <em>sampler</em>,
float <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord</em> to do an element lookup in the 1D image object specified
by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagei</strong> and <strong>read_imageui</strong> return unnormalized signed integer
and unsigned integer values respectively.
Each channel will be stored in a 32-bit integer.</p>
<p class="tableblock"> <strong>read_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imagei</strong> are undefined.</p>
<p class="tableblock"> <strong>read_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imageui</strong> are undefined.</p>
<p class="tableblock"> The <strong>read_image{i|ui}</strong> calls support a nearest filter only.
The filter_mode specified in <em>sampler</em> must be set to
<code>CLK_FILTER_NEAREST</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> Furthermore, the <strong>read_image{i|ui}</strong> calls that take integer
coordinates must use a sampler with normalized coordinates set to
<code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode set to
<code>CLK_ADDRESS_CLAMP_TO_EDGE</code>, <code>CLK_ADDRESS_CLAMP</code> or
<code>CLK_ADDRESS_NONE</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float4 <strong>read_imagef</strong>(read_only image1d_array_t <em>image</em>,
sampler_t <em>sampler</em>, int2 <em>coord</em>)<br>
float4 <strong>read_imagef</strong>(read_only image1d_array_t <em>image</em>,
sampler_t <em>sampler</em>, float2 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord.x</em> to do an element lookup in the 1D image identified by
<em>coord.y</em> in the 1D image array specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [0.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_UNORM_INT8</code>, <code>CL_UNORM_INT16</code>, or one of the pre-defined
packed formats.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [-1.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_SNORM_INT8</code> or <code>CL_SNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values for image objects created
with <em>image_channel_data_type</em> set to <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.</p>
<p class="tableblock"> The <strong>read_imagef</strong> calls that take integer coordinates must use a
sampler with filter mode set to <code>CLK_FILTER_NEAREST</code>, normalized
coordinates set to <code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode
set to <code>CLK_ADDRESS_CLAMP_TO_EDGE</code>, <code>CLK_ADDRESS_CLAMP</code> or
<code>CLK_ADDRESS_NONE</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for image objects with
<em>image_channel_data_type</em> values not specified in the description above
are undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int4 <strong>read_imagei</strong>(read_only image1d_array_t <em>image</em>, sampler_t <em>sampler</em>,
int2 <em>coord</em>)<br>
int4 <strong>read_imagei</strong>(read_only image1d_array_t <em>image</em>, sampler_t <em>sampler</em>,
float2 <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(read_only image1d_array_t <em>image</em>,
sampler_t <em>sampler</em>, int2 <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(read_only image1d_array_t <em>image</em>,
sampler_t <em>sampler</em>, float2 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord.x</em> to do an element lookup in the 1D image identified by
<em>coord.y</em> in the 1D image array specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagei</strong> and <strong>read_imageui</strong> return unnormalized signed integer
and unsigned integer values respectively. Each channel will be stored
in a 32-bit integer.</p>
<p class="tableblock"> <strong>read_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imagei</strong> are undefined.</p>
<p class="tableblock"> <strong>read_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imageui</strong> are undefined.</p>
<p class="tableblock"> The <strong>read_image{i|ui}</strong> calls support a nearest filter only.
The filter_mode specified in <em>sampler</em> must be set to
<code>CLK_FILTER_NEAREST</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> Furthermore, the <strong>read_image{i|ui}</strong> calls that take integer
coordinates must use a sampler with normalized coordinates set to
<code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode set to
<code>CLK_ADDRESS_CLAMP_TO_EDGE</code>, <code>CLK_ADDRESS_CLAMP</code> or
<code>CLK_ADDRESS_NONE</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float <strong>read_imagef</strong>(read_only image2d_depth_t <em>image</em>,
sampler_t <em>sampler</em>, int2 <em>coord</em>)<br>
float <strong>read_imagef</strong>(read_only image2d_depth_t <em>image</em>,
sampler_t <em>sampler</em>, float2 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use the coordinate (<em>coord.x</em>, <em>coord.y</em>) to do an element lookup in
the 2D depth image object specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns a floating-point value in the range [0.0, 1.0]
for depth image objects created with <em>image_channel_data_type</em> set to
<code>CL_UNORM_INT16</code> or <code>CL_UNORM_INT24</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns a floating-point value for depth image objects
created with <em>image_channel_data_type</em> set to <code>CL_FLOAT</code>.</p>
<p class="tableblock"> The <strong>read_imagef</strong> calls that take integer coordinates must use a
sampler with filter mode set to <code>CLK_FILTER_NEAREST</code>, normalized
coordinates set to <code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode
set to <code>CLK_ADDRESS_CLAMP_TO_EDGE</code>, <code>CLK_ADDRESS_CLAMP</code> or
<code>CLK_ADDRESS_NONE</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for depth image objects with
<em>image_channel_data_type</em> values not specified in the description
above are undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer; see also the
<code>cl_khr_depth_images</code> extension.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float <strong>read_imagef</strong>(read_only image2d_array_depth_t <em>image</em>,
sampler_t <em>sampler</em>, int4 <em>coord</em>)<br>
float <strong>read_imagef</strong>(read_only image2d_array_depth_t <em>image</em>,
sampler_t <em>sampler</em>, float4 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord.xy</em> to do an element lookup in the 2D image identified by
<em>coord.z</em> in the 2D depth image array specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns a floating-point value in the range [0.0, 1.0]
for depth image objects created with <em>image_channel_data_type</em> set to
<code>CL_UNORM_INT16</code> or <code>CL_UNORM_INT24</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns a floating-point value for depth image objects
created with <em>image_channel_data_type</em> set to <code>CL_FLOAT</code>.</p>
<p class="tableblock"> The <strong>read_imagef</strong> calls that take integer coordinates must use a
sampler with filter mode set to <code>CLK_FILTER_NEAREST</code>, normalized
coordinates set to <code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode
set to <code>CLK_ADDRESS_CLAMP_TO_EDGE</code>, <code>CLK_ADDRESS_CLAMP</code> or
<code>CLK_ADDRESS_NONE</code>; otherwise the values returned are undefined.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for depth image objects with
<em>image_channel_data_type</em> values not specified in the description
above are undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer; see also the
<code>cl_khr_depth_images</code> extension.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
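<div class="paragraph">
<p>The [0.0, 1.0] and [-1.0, 1.0] ranges quoted for <strong>read_imagef</strong> in the table above come from converting normalized integer channel data to floating point. As an illustration only, the following host-side sketch in plain C (not OpenCL C) mirrors the conversion rules that the OpenCL specification gives for normalized integer channel data types. It assumes an exact division, whereas implementations are allowed a small rounding error. All function names are hypothetical and are not part of OpenCL.</p>
</div>

```c
/* Host-side C sketch (not OpenCL C). All function names are hypothetical. */

/* CL_UNORM_INT8: stored values 0..255 map linearly onto [0.0, 1.0]. */
float unorm_int8_to_float(unsigned char c)
{
    return (float)c / 255.0f;
}

/* CL_UNORM_INT16: stored values 0..65535 map linearly onto [0.0, 1.0]. */
float unorm_int16_to_float(unsigned short c)
{
    return (float)c / 65535.0f;
}

/* CL_SNORM_INT8: -127..127 map onto [-1.0, 1.0]; -128 is clamped to -1.0. */
float snorm_int8_to_float(signed char c)
{
    float f = (float)c / 127.0f;
    return f > -1.0f ? f : -1.0f;
}

/* CL_SNORM_INT16: -32767..32767 map onto [-1.0, 1.0]; -32768 is clamped. */
float snorm_int16_to_float(short c)
{
    float f = (float)c / 32767.0f;
    return f > -1.0f ? f : -1.0f;
}
```

<div class="paragraph">
<p>The clamp in the signed variants is what keeps the most negative stored value inside the documented [-1.0, 1.0] range.</p>
</div>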
</div>
</div>
</div>
<div class="sect4">
<h5 id="built-in-image-sampler-less-read-functions"><a class="anchor" href="#built-in-image-sampler-less-read-functions"></a>6.15.15.3. Built-in Image Sampler-less Read Functions</h5>
<div class="openblock">
<div class="content">
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
Sampler-less image read functions <a href="#unified-spec">require</a> support for
OpenCL C 1.2 or newer, with some functions requiring support for newer
versions of OpenCL C as noted in the <a href="#table-image-samplerless-read">table
below</a>.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The sampler-less image read functions behave exactly like the corresponding
<a href="#built-in-image-read-functions">built-in image read functions</a> that take
integer coordinates and a sampler with filter mode set to
<code>CLK_FILTER_NEAREST</code>, normalized coordinates set to
<code>CLK_NORMALIZED_COORDS_FALSE</code> and addressing mode set to <code>CLK_ADDRESS_NONE</code>.
There is one exception: when the <em>image_channel_data_type</em> is a
floating-point type (such as <code>CL_FLOAT</code>) and the channel data values are
denormalized, the sampler-less image read function may return the
denormalized data, while the image read function with a sampler argument may
flush the denormalized channel data values to zero.</p>
</div>
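<div class="paragraph">
<p>The denormal exception described above can be pictured with a small host-side sketch in plain C (not OpenCL C). It only models the two outcomes the text permits for a <code>CL_FLOAT</code> channel; which outcome a given device produces is up to the implementation. The helper names are hypothetical, and the constant is assumed to be the smallest normal single-precision value.</p>
</div>

```c
/* Host-side C sketch (not OpenCL C). All helper names are hypothetical. */

/* Returns 1 when x is a nonzero single-precision denormal, otherwise 0.
   1.17549435e-38f is the smallest normal float (2 to the power -126). */
int is_denormal(float x)
{
    float a = x > 0.0f ? x : -x;
    return (a > 0.0f) ? (1.17549435e-38f > a) : 0;
}

/* Outcome a read through a sampler MAY produce: denormal flushed to zero. */
float sampler_read_may_flush(float channel)
{
    return is_denormal(channel) ? 0.0f : channel;
}

/* Outcome a sampler-less read MAY produce: channel returned unchanged. */
float samplerless_read_may_keep(float channel)
{
    return channel;
}
```

<div class="paragraph">
<p>For normal values both helpers return the same result, which matches the statement that the two kinds of read otherwise behave identically.</p>
</div>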
<div class="paragraph">
<p><em>aQual</em> in the following table refers to one of the access qualifiers.
For sampler-less read functions, this may be <code>read_only</code> or <code>read_write</code>.</p>
</div>
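<div class="paragraph">
<p>For the array image entries of Table 28, the coordinate selects a layer and then an element within that layer, and the integer read functions return each channel as a 32-bit value. The host-side sketch below in plain C (not OpenCL C) models this for a four-channel <code>CL_SIGNED_INT8</code> 2D image array. The tightly packed, layer-major memory layout, the <code>texel_i32</code> type and the function name are all assumptions made for illustration; real image storage is implementation-defined, and the sketch performs no bounds checking.</p>
</div>

```c
/* Host-side C sketch (not OpenCL C). Type and function names are hypothetical. */

/* Stand-in for the OpenCL C int4 result: four 32-bit channels. */
typedef struct {
    int x, y, z, w;
} texel_i32;

/* Models a sampler-less read_imagei on a 2D image array with four
   CL_SIGNED_INT8 channels, assuming tightly packed, layer-major storage.
   (x, y) plays the role of coord.xy and layer plays the role of coord.z.
   Each 8-bit channel is sign-extended to a 32-bit integer. */
texel_i32 lookup_rgba_int8(const signed char *data, int width, int height,
                           int x, int y, int layer)
{
    const signed char *texel = data + 4 * ((layer * height + y) * width + x);
    texel_i32 result;
    result.x = texel[0];
    result.y = texel[1];
    result.z = texel[2];
    result.w = texel[3];
    return result;
}
```

<div class="paragraph">
<p>An unsigned variant would differ only in using unsigned types, so that each 8-bit channel is zero-extended rather than sign-extended.</p>
</div>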
<table id="table-image-samplerless-read" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 28. Built-in Image Sampler-less Read Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float4 <strong>read_imagef</strong>(<em>aQual</em> image2d_t <em>image</em>, int2 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use the coordinate (<em>coord.x</em>, <em>coord.y</em>) to do an element lookup in
the 2D image object specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [0.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_UNORM_INT8</code>, <code>CL_UNORM_INT16</code>, or one of the pre-defined
packed formats.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [-1.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_SNORM_INT8</code> or <code>CL_SNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values for image objects created
with <em>image_channel_data_type</em> set to <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for image objects with
<em>image_channel_data_type</em> values not specified in the description
above are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int4 <strong>read_imagei</strong>(<em>aQual</em> image2d_t <em>image</em>, int2 <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(<em>aQual</em> image2d_t <em>image</em>, int2 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use the coordinate (<em>coord.x</em>, <em>coord.y</em>) to do an element lookup in
the 2D image object specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagei</strong> and <strong>read_imageui</strong> return unnormalized signed integer
and unsigned integer values respectively. Each channel will be stored
in a 32-bit integer.</p>
<p class="tableblock"> <strong>read_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imagei</strong> are undefined.</p>
<p class="tableblock"> <strong>read_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imageui</strong> are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float4 <strong>read_imagef</strong>(<em>aQual</em> image3d_t <em>image</em>, int4 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use the coordinate (<em>coord.x</em>, <em>coord.y</em>, <em>coord.z</em>) to do an element
lookup in the 3D image object specified by <em>image</em>.
<em>coord.w</em> is ignored.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [0.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_UNORM_INT8</code>, <code>CL_UNORM_INT16</code>, or one of the pre-defined
packed formats.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [-1.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_SNORM_INT8</code> or <code>CL_SNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values for image objects created
with <em>image_channel_data_type</em> set to <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for image objects with
<em>image_channel_data_type</em> values not specified in the description above are
undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int4 <strong>read_imagei</strong>(<em>aQual</em> image3d_t <em>image</em>, int4 <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(<em>aQual</em> image3d_t <em>image</em>, int4 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use the coordinate (<em>coord.x</em>, <em>coord.y</em>, <em>coord.z</em>) to do an element
lookup in the 3D image object specified by <em>image</em>.
<em>coord.w</em> is ignored.</p>
<p class="tableblock"> <strong>read_imagei</strong> and <strong>read_imageui</strong> return unnormalized signed integer
and unsigned integer values respectively.
Each channel will be stored in a 32-bit integer.</p>
<p class="tableblock"> <strong>read_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imagei</strong> are undefined.</p>
<p class="tableblock"> <strong>read_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imageui</strong> are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float4 <strong>read_imagef</strong>(<em>aQual</em> image2d_array_t <em>image</em>, int4 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord.xy</em> to do an element lookup in the 2D image identified by
<em>coord.z</em> in the 2D image array specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [0.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_UNORM_INT8</code>, <code>CL_UNORM_INT16</code>, or one of the pre-defined
packed formats.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [-1.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_SNORM_INT8</code> or <code>CL_SNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values for image objects created
with <em>image_channel_data_type</em> set to <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for image objects with
<em>image_channel_data_type</em> values not specified in the description
above are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int4 <strong>read_imagei</strong>(<em>aQual</em> image2d_array_t <em>image</em>, int4 <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(<em>aQual</em> image2d_array_t <em>image</em>, int4 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord.xy</em> to do an element lookup in the 2D image identified by
<em>coord.z</em> in the 2D image array specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagei</strong> and <strong>read_imageui</strong> return unnormalized signed integer
and unsigned integer values respectively. Each channel will be stored
in a 32-bit integer.</p>
<p class="tableblock"> <strong>read_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imagei</strong> are undefined.</p>
<p class="tableblock"> <strong>read_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imageui</strong> are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float4 <strong>read_imagef</strong>(<em>aQual</em> image1d_t <em>image</em>, int <em>coord</em>)<br>
float4 <strong>read_imagef</strong>(<em>aQual</em> image1d_buffer_t <em>image</em>, int <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord</em> to do an element lookup in the 1D image or 1D image buffer
object specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [0.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_UNORM_INT8</code>, <code>CL_UNORM_INT16</code>, or one of the pre-defined
packed formats.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [-1.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_SNORM_INT8</code> or <code>CL_SNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values for image objects created
with <em>image_channel_data_type</em> set to <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for image objects with
<em>image_channel_data_type</em> values not specified in the description
above are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int4 <strong>read_imagei</strong>(<em>aQual</em> image1d_t <em>image</em>, int <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(<em>aQual</em> image1d_t <em>image</em>, int <em>coord</em>)<br>
int4 <strong>read_imagei</strong>(<em>aQual</em> image1d_buffer_t <em>image</em>, int <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(<em>aQual</em> image1d_buffer_t <em>image</em>, int <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord</em> to do an element lookup in the 1D image or 1D image buffer
object specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagei</strong> and <strong>read_imageui</strong> return unnormalized signed integer
and unsigned integer values respectively. Each channel will be stored
in a 32-bit integer.</p>
<p class="tableblock"> <strong>read_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imagei</strong> are undefined.</p>
<p class="tableblock"> <strong>read_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imageui</strong> are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float4 <strong>read_imagef</strong>(<em>aQual</em> image1d_array_t <em>image</em>, int2 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord.x</em> to do an element lookup in the 1D image identified by
<em>coord.y</em> in the 1D image array specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [0.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to one of
the pre-defined packed formats, <code>CL_UNORM_INT8</code>, or
<code>CL_UNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values in the range [-1.0, 1.0]
for image objects created with <em>image_channel_data_type</em> set to
<code>CL_SNORM_INT8</code> or <code>CL_SNORM_INT16</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns floating-point values for image objects created
with <em>image_channel_data_type</em> set to <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for image objects with
<em>image_channel_data_type</em> values not specified in the description
above are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int4 <strong>read_imagei</strong>(<em>aQual</em> image1d_array_t <em>image</em>, int2 <em>coord</em>)<br>
uint4 <strong>read_imageui</strong>(<em>aQual</em> image1d_array_t <em>image</em>, int2 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord.x</em> to do an element lookup in the 1D image identified by
<em>coord.y</em> in the 1D image array specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagei</strong> and <strong>read_imageui</strong> return unnormalized signed integer
and unsigned integer values respectively. Each channel will be stored
in a 32-bit integer.</p>
<p class="tableblock"> <strong>read_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imagei</strong> are undefined.</p>
<p class="tableblock"> <strong>read_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> If the <em>image_channel_data_type</em> is not one of the above values, the
values returned by <strong>read_imageui</strong> are undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float <strong>read_imagef</strong>(<em>aQual</em> image2d_depth_t <em>image</em>, int2 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use the coordinate (<em>coord.x</em>, <em>coord.y</em>) to do an element lookup in
the 2D depth image object specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns a floating-point value in the range [0.0, 1.0]
for depth image objects created with <em>image_channel_data_type</em> set to
<code>CL_UNORM_INT16</code> or <code>CL_UNORM_INT24</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns a floating-point value for depth image objects
created with <em>image_channel_data_type</em> set to <code>CL_FLOAT</code>.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for image objects with
<em>image_channel_data_type</em> values not specified in the description
above are undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer; also see the
<code>cl_khr_depth_images</code> extension.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">float <strong>read_imagef</strong>(<em>aQual</em> image2d_array_depth_t <em>image</em>, int4 <em>coord</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Use <em>coord.xy</em> to do an element lookup in the 2D image identified by
<em>coord.z</em> in the 2D depth image array specified by <em>image</em>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns a floating-point value in the range [0.0, 1.0]
for depth image objects created with <em>image_channel_data_type</em> set to
<code>CL_UNORM_INT16</code> or <code>CL_UNORM_INT24</code>.</p>
<p class="tableblock"> <strong>read_imagef</strong> returns a floating-point value for depth image objects
created with <em>image_channel_data_type</em> set to <code>CL_FLOAT</code>.</p>
<p class="tableblock"> Values returned by <strong>read_imagef</strong> for image objects with
<em>image_channel_data_type</em> values not specified in the description
above are undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer; also see the
<code>cl_khr_depth_images</code> extension.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
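<div class="paragraph">
<p>As an illustration of the 1D image array reads described above, the following is a minimal sketch rather than normative text.
The kernel name <code>sum_layers</code> and the <code>out</code> buffer are hypothetical, and the sketch assumes that <em>image</em> was created with one of the signed integer channel data types listed for <strong>read_imagei</strong> and that <code>out</code> holds at least one element per pixel of image width.
It requires OpenCL C 1.2 or newer because it uses <code>image1d_array_t</code>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code class="language-c" data-lang="c">// Hypothetical kernel: adds up the element at position x across all layers
// of a 1D image array. Assumes a CL_SIGNED_INT8/16/32 channel data type;
// for any other type the values returned by read_imagei are undefined.
kernel void sum_layers(read_only image1d_array_t src,
                       global int4 *out)
{
    int x = (int)get_global_id(0);

    // Skip work-items that fall outside the image width.
    if (x &gt;= get_image_width(src))
        return;

    int layers = (int)get_image_array_size(src);
    int4 acc = (int4)(0);

    for (int layer = 0; layer &lt; layers; ++layer) {
        // coord.x selects the element, coord.y selects the 1D image (layer).
        acc += read_imagei(src, (int2)(x, layer));
    }

    out[x] = acc;
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Each channel of the result is a 32-bit integer, so the accumulation above does not narrow the values read from 8-bit or 16-bit images.</p>
</div>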
</div>
</div>
</div>
<div class="sect4">
<h5 id="built-in-image-write-functions"><a class="anchor" href="#built-in-image-write-functions"></a>6.15.15.4. Built-in Image Write Functions</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The following built-in function calls to write images are supported.</p>
</div>
<div class="paragraph">
<p><em>aQual</em> in the following table refers to one of the access qualifiers.
For write functions this may be <code>write_only</code> or <code>read_write</code>.</p>
</div>
<table id="table-image-write" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 29. Built-in Image Write Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>write_imagef</strong>(<em>aQual</em> image2d_t <em>image</em>, int2 <em>coord</em>, float4 <em>color</em>)<br>
void <strong>write_imagei</strong>(<em>aQual</em> image2d_t <em>image</em>, int2 <em>coord</em>, int4 <em>color</em>)<br>
void <strong>write_imageui</strong>(<em>aQual</em> image2d_t <em>image</em>, int2 <em>coord</em>, uint4 <em>color</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Write <em>color</em> value to location specified by <em>coord.xy</em> in the 2D
image object specified by <em>image</em>.
Appropriate data format conversion to the specified image format is
done before writing the color value.
<em>coord.x</em> and <em>coord.y</em> are considered to be unnormalized coordinates,
and must be in the range [0, image width-1] and [0, image height-1]
respectively.</p>
<p class="tableblock"> <strong>write_imagef</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the pre-defined packed formats
or set to <code>CL_SNORM_INT8</code>, <code>CL_UNORM_INT8</code>, <code>CL_SNORM_INT16</code>,
<code>CL_UNORM_INT16</code>, <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.
Appropriate data format conversion will be done to convert channel
data from a floating-point value to actual data format in which the
channels are stored.</p>
<p class="tableblock"> <strong>write_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> <strong>write_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> The behavior of <strong>write_imagef</strong>, <strong>write_imagei</strong> and <strong>write_imageui</strong> for
image objects created with <em>image_channel_data_type</em> values not
specified in the description above or with (<em>x</em>, <em>y</em>) coordinate
values that are not in the range [0, image width-1] and [0, image
height-1], respectively, is undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>write_imagef</strong>(<em>aQual</em> image2d_array_t <em>image</em>, int4 <em>coord</em>,
float4 <em>color</em>)<br>
void <strong>write_imagei</strong>(<em>aQual</em> image2d_array_t <em>image</em>, int4 <em>coord</em>,
int4 <em>color</em>)<br>
void <strong>write_imageui</strong>(<em>aQual</em> image2d_array_t <em>image</em>, int4 <em>coord</em>,
uint4 <em>color</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Write <em>color</em> value to location specified by <em>coord.xy</em> in the 2D
image identified by <em>coord.z</em> in the 2D image array specified by
<em>image</em>.
Appropriate data format conversion to the specified image format is
done before writing the color value.
<em>coord.x</em>, <em>coord.y</em> and <em>coord.z</em> are considered to be unnormalized
coordinates, and must be in the range [0, image width-1], [0, image
height-1], and [0, image number of layers-1], respectively.</p>
<p class="tableblock"> <strong>write_imagef</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the pre-defined packed formats
or set to <code>CL_SNORM_INT8</code>, <code>CL_UNORM_INT8</code>, <code>CL_SNORM_INT16</code>,
<code>CL_UNORM_INT16</code>, <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.
Appropriate data format conversion will be done to convert channel
data from a floating-point value to actual data format in which the
channels are stored.</p>
<p class="tableblock"> <strong>write_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> <strong>write_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> The behavior of <strong>write_imagef</strong>, <strong>write_imagei</strong> and <strong>write_imageui</strong> for
image objects created with <em>image_channel_data_type</em> values not
specified in the description above or with (<em>x</em>, <em>y</em>, <em>z</em>) coordinate
values that are not in the range [0, image width-1], [0, image
height-1], and [0, image number of layers-1], respectively, is
undefined.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>write_imagef</strong>(<em>aQual</em> image1d_t <em>image</em>, int <em>coord</em>,
float4 <em>color</em>)<br>
void <strong>write_imagei</strong>(<em>aQual</em> image1d_t <em>image</em>, int <em>coord</em>,
int4 <em>color</em>)<br>
void <strong>write_imageui</strong>(<em>aQual</em> image1d_t <em>image</em>, int <em>coord</em>,
uint4 <em>color</em>)<br>
void <strong>write_imagef</strong>(<em>aQual</em> image1d_buffer_t <em>image</em>, int <em>coord</em>,
float4 <em>color</em>)<br>
void <strong>write_imagei</strong>(<em>aQual</em> image1d_buffer_t <em>image</em>, int <em>coord</em>,
int4 <em>color</em>)<br>
void <strong>write_imageui</strong>(<em>aQual</em> image1d_buffer_t <em>image</em>, int <em>coord</em>,
uint4 <em>color</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Write <em>color</em> value to location specified by <em>coord</em> in the 1D image
or 1D image buffer object specified by <em>image</em>.
Appropriate data format conversion to the specified image format is
done before writing the color value.
<em>coord</em> is considered to be an unnormalized coordinate, and must be in
the range [0, image width-1].</p>
<p class="tableblock"> <strong>write_imagef</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the pre-defined packed formats
or set to <code>CL_SNORM_INT8</code>, <code>CL_UNORM_INT8</code>, <code>CL_SNORM_INT16</code>,
<code>CL_UNORM_INT16</code>, <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.
Appropriate data format conversion will be done to convert channel
data from a floating-point value to actual data format in which the
channels are stored.</p>
<p class="tableblock"> <strong>write_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> <strong>write_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> The behavior of <strong>write_imagef</strong>, <strong>write_imagei</strong> and <strong>write_imageui</strong> for
image objects created with <em>image_channel_data_type</em> values not
specified in the description above, or with a coordinate value that is
not in the range [0, image width-1], is undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>write_imagef</strong>(<em>aQual</em> image1d_array_t <em>image</em>, int2 <em>coord</em>,
float4 <em>color</em>)<br>
void <strong>write_imagei</strong>(<em>aQual</em> image1d_array_t <em>image</em>, int2 <em>coord</em>,
int4 <em>color</em>)<br>
void <strong>write_imageui</strong>(<em>aQual</em> image1d_array_t <em>image</em>, int2 <em>coord</em>,
uint4 <em>color</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Write <em>color</em> value to location specified by <em>coord.x</em> in the 1D image
identified by <em>coord.y</em> in the 1D image array specified by <em>image</em>.
Appropriate data format conversion to the specified image format is
done before writing the color value.
<em>coord.x</em> and <em>coord.y</em> are considered to be unnormalized coordinates
and must be in the range [0, image width-1] and [0, image number of
layers-1], respectively.</p>
<p class="tableblock"> <strong>write_imagef</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the pre-defined packed formats
or set to <code>CL_SNORM_INT8</code>, <code>CL_UNORM_INT8</code>, <code>CL_SNORM_INT16</code>,
<code>CL_UNORM_INT16</code>, <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.
Appropriate data format conversion will be done to convert channel
data from a floating-point value to actual data format in which the
channels are stored.</p>
<p class="tableblock"> <strong>write_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> <strong>write_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> The behavior of <strong>write_imagef</strong>, <strong>write_imagei</strong> and <strong>write_imageui</strong> for
image objects created with <em>image_channel_data_type</em> values not
specified in the description above or with (<em>x</em>, <em>y</em>) coordinate
values that are not in the range [0, image width-1] and [0, image
number of layers-1], respectively, is undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or newer.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>write_imagef</strong>(<em>aQual</em> image2d_depth_t <em>image</em>, int2 <em>coord</em>,
float <em>depth</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Write <em>depth</em> value to location specified by <em>coord.xy</em> in the 2D
depth image object specified by <em>image</em>.
Appropriate data format conversion to the specified image format is
done before writing the depth value.
<em>coord.x</em> and <em>coord.y</em> are considered to be unnormalized coordinates,
and must be in the range [0, image width-1], and [0, image height-1],
respectively.</p>
<p class="tableblock"> <strong>write_imagef</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to <code>CL_UNORM_INT16</code>, <code>CL_UNORM_INT24</code> or
<code>CL_FLOAT</code>.
Appropriate data format conversion will be done to convert the depth value
from a floating-point value to the actual data format associated with the
image.</p>
<p class="tableblock"> The behavior of <strong>write_imagef</strong> for
image objects created with <em>image_channel_data_type</em> values not
specified in the description above or with (<em>x</em>, <em>y</em>) coordinate
values that are not in the range [0, image width-1] and [0, image
height-1], respectively, is undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer; also see the
<code>cl_khr_depth_images</code> extension.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>write_imagef</strong>(<em>aQual</em> image2d_array_depth_t <em>image</em>, int4 <em>coord</em>,
float <em>depth</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Write <em>depth</em> value to location specified by <em>coord.xy</em> in the 2D
image identified by <em>coord.z</em> in the 2D depth image array specified by
<em>image</em>.
Appropriate data format conversion to the specified image format is
done before writing the depth value.
<em>coord.x</em>, <em>coord.y</em> and <em>coord.z</em> are considered to be unnormalized
coordinates, and must be in the range [0, image width-1], [0, image
height-1], and [0, image number of layers-1], respectively.</p>
<p class="tableblock"> <strong>write_imagef</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to <code>CL_UNORM_INT16</code>, <code>CL_UNORM_INT24</code> or
<code>CL_FLOAT</code>.
Appropriate data format conversion will be done to convert the depth value
from a floating-point value to the actual data format associated with the
image.</p>
<p class="tableblock"> The behavior of <strong>write_imagef</strong> for
image objects created with <em>image_channel_data_type</em> values not
specified in the description above or with (<em>x</em>, <em>y</em>, <em>z</em>) coordinate
values that are not in the range [0, image width-1], [0, image
height-1], and [0, image number of layers-1], respectively, is undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer; also see the
<code>cl_khr_depth_images</code> extension.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>write_imagef</strong>(<em>aQual</em> image3d_t <em>image</em>, int4 <em>coord</em>,
float4 <em>color</em>)<br>
void <strong>write_imagei</strong>(<em>aQual</em> image3d_t <em>image</em>, int4 <em>coord</em>,
int4 <em>color</em>)<br>
void <strong>write_imageui</strong>(<em>aQual</em> image3d_t <em>image</em>, int4 <em>coord</em>,
uint4 <em>color</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Write <em>color</em> value to location specified by <em>coord.xyz</em> in the 3D image
object specified by <em>image</em>.
Appropriate data format conversion to the specified image format is
done before writing the color value.
<em>coord.x</em>, <em>coord.y</em> and <em>coord.z</em> are considered to be unnormalized
coordinates, and must be in the range [0, image width-1], [0, image
height-1], and [0, image depth-1], respectively.</p>
<p class="tableblock"> <strong>write_imagef</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the pre-defined packed formats
or set to <code>CL_SNORM_INT8</code>, <code>CL_UNORM_INT8</code>, <code>CL_SNORM_INT16</code>,
<code>CL_UNORM_INT16</code>, <code>CL_HALF_FLOAT</code> or <code>CL_FLOAT</code>.
Appropriate data format conversion will be done to convert channel
data from a floating-point value to actual data format in which the
channels are stored.</p>
<p class="tableblock"> <strong>write_imagei</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_SIGNED_INT8</code>,<br>
<code>CL_SIGNED_INT16</code> and<br>
<code>CL_SIGNED_INT32</code>.</p>
<p class="tableblock"> <strong>write_imageui</strong> can only be used with image objects created with
<em>image_channel_data_type</em> set to one of the following values:</p>
<p class="tableblock"> <code>CL_UNSIGNED_INT8</code>,<br>
<code>CL_UNSIGNED_INT16</code> and<br>
<code>CL_UNSIGNED_INT32</code>.</p>
<p class="tableblock"> The behavior of <strong>write_imagef</strong>, <strong>write_imagei</strong> and <strong>write_imageui</strong> for
image objects with <em>image_channel_data_type</em> values not specified in
the description above or with (<em>x</em>, <em>y</em>, <em>z</em>) coordinate values that
are not in the range [0, image width-1], [0, image height-1], and [0,
image depth-1], respectively, is undefined.</p>
<p class="tableblock"> <a href="#unified-spec">Requires</a> support for OpenCL C 2.0, or OpenCL C 3.0 or
newer and the <code>__opencl_c_3d_image_writes</code> feature, or the
<code>cl_khr_3d_image_writes</code> extension.</p></td>
</tr>
</tbody>
</table>
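<div class="paragraph">
<p>The coordinate and format rules in the table above can be sketched as follows.
This is an illustrative example only: the kernel name <code>fill_gradient</code> is hypothetical, and the sketch assumes that <em>image</em> was created with a channel data type accepted by <strong>write_imagef</strong>, such as <code>CL_UNORM_INT8</code>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code class="language-c" data-lang="c">// Hypothetical kernel: fills a 2D image with a left-to-right gray gradient.
// Assumes dst uses a channel data type that write_imagef accepts.
kernel void fill_gradient(write_only image2d_t dst)
{
    int2 coord = (int2)((int)get_global_id(0), (int)get_global_id(1));
    int2 dim = get_image_dim(dst);

    // Writes outside [0, width-1] x [0, height-1] are undefined, so
    // work-items beyond the image extent return without writing.
    if (coord.x &gt;= dim.x || coord.y &gt;= dim.y)
        return;

    // Map x to [0.0, 1.0]; max() avoids dividing by zero for width 1.
    float level = (float)coord.x / (float)max(dim.x - 1, 1);

    // The float4 color is converted to the image's storage format on write.
    write_imagef(dst, coord, (float4)(level, level, level, 1.0f));
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>The explicit range check matters when the global work size is rounded up beyond the image dimensions, which is a common launch pattern.</p>
</div>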
</div>
</div>
</div>
<div class="sect4">
<h5 id="built-in-image-query-functions"><a class="anchor" href="#built-in-image-query-functions"></a>6.15.15.5. Built-in Image Query Functions</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The following built-in function calls to query image information are
supported.</p>
</div>
<div class="paragraph">
<p><em>aQual</em> in the following table refers to one of the access qualifiers.
For query functions this may be <code>read_only</code>, <code>write_only</code> or <code>read_write</code>.</p>
</div>
<table id="table-image-query" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 30. Built-in Image Query Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>get_image_width</strong>(<em>aQual</em> image2d_t <em>image</em>)<br>
int <strong>get_image_width</strong>(<em>aQual</em> image3d_t <em>image</em>)<br></p>
<p class="tableblock"> For OpenCL C 1.2 or newer:<br></p>
<p class="tableblock"> int <strong>get_image_width</strong>(<em>aQual</em> image1d_t <em>image</em>)<br>
int <strong>get_image_width</strong>(<em>aQual</em> image1d_buffer_t <em>image</em>)<br>
int <strong>get_image_width</strong>(<em>aQual</em> image1d_array_t <em>image</em>)<br>
int <strong>get_image_width</strong>(<em>aQual</em> image2d_array_t <em>image</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0 or newer, also see <code>cl_khr_depth_images</code> extension:<br></p>
<p class="tableblock"> int <strong>get_image_width</strong>(<em>aQual</em> image2d_depth_t <em>image</em>)<br>
int <strong>get_image_width</strong>(<em>aQual</em> image2d_array_depth_t <em>image</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return the image width in pixels.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>get_image_height</strong>(<em>aQual</em> image2d_t <em>image</em>)<br>
int <strong>get_image_height</strong>(<em>aQual</em> image3d_t <em>image</em>)<br></p>
<p class="tableblock"> For OpenCL C 1.2 or newer:<br></p>
<p class="tableblock"> int <strong>get_image_height</strong>(<em>aQual</em> image2d_array_t <em>image</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0 or newer, also see <code>cl_khr_depth_images</code> extension:<br></p>
<p class="tableblock"> int <strong>get_image_height</strong>(<em>aQual</em> image2d_depth_t <em>image</em>)<br>
int <strong>get_image_height</strong>(<em>aQual</em> image2d_array_depth_t <em>image</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return the image height in pixels.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>get_image_depth</strong>(<em>aQual</em> image3d_t <em>image</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return the image depth in pixels.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>get_image_channel_data_type</strong>(<em>aQual</em> image2d_t <em>image</em>)<br>
int <strong>get_image_channel_data_type</strong>(<em>aQual</em> image3d_t <em>image</em>)<br></p>
<p class="tableblock"> For OpenCL C 1.2 or newer:<br></p>
<p class="tableblock"> int <strong>get_image_channel_data_type</strong>(<em>aQual</em> image1d_t <em>image</em>)<br>
int <strong>get_image_channel_data_type</strong>(<em>aQual</em> image1d_buffer_t <em>image</em>)<br>
int <strong>get_image_channel_data_type</strong>(<em>aQual</em> image1d_array_t <em>image</em>)<br>
int <strong>get_image_channel_data_type</strong>(<em>aQual</em> image2d_array_t <em>image</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0 or newer, also see <code>cl_khr_depth_images</code> extension:<br></p>
<p class="tableblock"> int <strong>get_image_channel_data_type</strong>(<em>aQual</em> image2d_depth_t <em>image</em>)<br>
int <strong>get_image_channel_data_type</strong>(<em>aQual</em> image2d_array_depth_t <em>image</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return the channel data type. Valid values are:</p>
<p class="tableblock"> <code>CLK_SNORM_INT8</code><br>
<code>CLK_SNORM_INT16</code><br>
<code>CLK_UNORM_INT8</code><br>
<code>CLK_UNORM_INT16</code><br>
<code>CLK_UNORM_SHORT_565</code><br>
<code>CLK_UNORM_SHORT_555</code><br>
<code>CLK_UNORM_INT_101010</code><br>
<code>CLK_SIGNED_INT8</code><br>
<code>CLK_SIGNED_INT16</code><br>
<code>CLK_SIGNED_INT32</code><br>
<code>CLK_UNSIGNED_INT8</code><br>
<code>CLK_UNSIGNED_INT16</code><br>
<code>CLK_UNSIGNED_INT32</code><br>
<code>CLK_HALF_FLOAT</code><br>
<code>CLK_FLOAT</code><br></p>
<p class="tableblock"> Additionally, for OpenCL C 3.0 or newer:<br></p>
<p class="tableblock"> <code>CLK_UNORM_INT_101010_2</code> <sup class="footnote">[<a id="_footnoteref_67" class="footnote" href="#_footnotedef_67" title="View footnote.">67</a>]</sup></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>get_image_channel_order</strong>(<em>aQual</em> image2d_t <em>image</em>)<br>
int <strong>get_image_channel_order</strong>(<em>aQual</em> image3d_t <em>image</em>)<br></p>
<p class="tableblock"> For OpenCL C 1.2 or newer:<br></p>
<p class="tableblock"> int <strong>get_image_channel_order</strong>(<em>aQual</em> image1d_t <em>image</em>)<br>
int <strong>get_image_channel_order</strong>(<em>aQual</em> image1d_buffer_t <em>image</em>)<br>
int <strong>get_image_channel_order</strong>(<em>aQual</em> image1d_array_t <em>image</em>)<br>
int <strong>get_image_channel_order</strong>(<em>aQual</em> image2d_array_t <em>image</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0 or newer, also see <code>cl_khr_depth_images</code> extension:<br></p>
<p class="tableblock"> int <strong>get_image_channel_order</strong>(<em>aQual</em> image2d_depth_t <em>image</em>)<br>
int <strong>get_image_channel_order</strong>(<em>aQual</em> image2d_array_depth_t <em>image</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return the image channel order. Valid values are:</p>
<p class="tableblock"> <code>CLK_A</code><br>
<code>CLK_R</code><br>
<code>CLK_RG</code><br>
<code>CLK_RA</code><br>
<code>CLK_RGB</code><br>
<code>CLK_RGBA</code><br>
<code>CLK_ARGB</code><br>
<code>CLK_BGRA</code><br>
<code>CLK_INTENSITY</code><br>
<code>CLK_LUMINANCE</code><br></p>
<p class="tableblock"> Additionally, for OpenCL C 1.1 or newer:<br></p>
<p class="tableblock"> <code>CLK_Rx</code><br>
<code>CLK_RGx</code><br>
<code>CLK_RGBx</code><br></p>
<p class="tableblock"> Additionally, for OpenCL C 2.0 or newer:<br></p>
<p class="tableblock"> <code>CLK_ABGR</code><br>
<code>CLK_DEPTH</code><br>
<code>CLK_sRGB</code><br>
<code>CLK_sRGBx</code><br>
<code>CLK_sRGBA</code><br>
<code>CLK_sBGRA</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int2 <strong>get_image_dim</strong>(<em>aQual</em> image2d_t <em>image</em>)<br></p>
<p class="tableblock"> For OpenCL C 1.2 or newer:<br></p>
<p class="tableblock"> int2 <strong>get_image_dim</strong>(<em>aQual</em> image2d_array_t <em>image</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0 or newer, also see <code>cl_khr_depth_images</code> extension:<br></p>
<p class="tableblock"> int2 <strong>get_image_dim</strong>(<em>aQual</em> image2d_depth_t <em>image</em>)<br>
int2 <strong>get_image_dim</strong>(<em>aQual</em> image2d_array_depth_t <em>image</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return the 2D image width and height as an int2 type.
The width is returned in the <em>x</em> component, and the height in the <em>y</em>
component.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int4 <strong>get_image_dim</strong>(<em>aQual</em> image3d_t <em>image</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return the 3D image width, height, and depth as an <code>int4</code> type.
The width is returned in the <em>x</em> component, height in the <em>y</em>
component, depth in the <em>z</em> component and the <em>w</em> component is 0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">For OpenCL C 1.2 or newer:<br></p>
<p class="tableblock"> size_t <strong>get_image_array_size</strong>(<em>aQual</em> image2d_array_t <em>image</em>)<br></p>
<p class="tableblock"> For OpenCL C 2.0 or newer, also see <code>cl_khr_depth_images</code> extension:<br></p>
<p class="tableblock"> size_t <strong>get_image_array_size</strong>(<em>aQual</em> image2d_array_depth_t <em>image</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return the number of images in the 2D image array.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">For OpenCL C 1.2 or newer:<br></p>
<p class="tableblock"> size_t <strong>get_image_array_size</strong>(<em>aQual</em> image1d_array_t <em>image</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return the number of images in the 1D image array.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>The values returned by <strong>get_image_channel_data_type</strong> and
<strong>get_image_channel_order</strong> as specified in <a href="#table-image-query">Built-in Image Query Functions</a> with the
<code>CLK_</code> prefixes correspond to the <code>CL_</code> prefixes used to describe the
<a href="#opencl-channel-order">image channel order</a> and
<a href="#opencl-channel-data-type">data type</a> in the <a href="#opencl-spec">OpenCL
Specification</a>.
For example, both <code>CL_UNORM_INT8</code> and <code>CLK_UNORM_INT8</code> refer to an image
channel data type that is a normalized unsigned 8-bit integer.</p>
</div>
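<div class="paragraph">
<p>As a rough illustration of what "normalized" means for this data type, the sketch below converts a stored 8-bit channel value to the floating-point value in the range [0.0, 1.0] that a floating-point image read would produce.
It is plain host-side C rather than OpenCL C, and the helper name <code>unorm_int8_to_float</code> is hypothetical.</p>
</div>

```c
/* Hypothetical helper (not part of OpenCL C): models how a stored
   CL_UNORM_INT8 channel value in 0..255 maps to a normalized float
   in the range [0.0, 1.0]. */
static float unorm_int8_to_float(unsigned char stored)
{
    return (float)stored / 255.0f;
}
```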
</div>
</div>
</div>
<div class="sect4">
<h5 id="reading-and-writing-to-the-same-image-in-a-kernel"><a class="anchor" href="#reading-and-writing-to-the-same-image-in-a-kernel"></a>6.15.15.6. Reading and writing to the same image in a kernel</h5>
<div class="paragraph">
<p>The <strong>atomic_work_item_fence</strong>(<code>CLK_IMAGE_MEM_FENCE</code>) built-in function can be
used to make sure that sampler-less writes are visible to later reads by the
same work-item.
Only a scope of <code>memory_scope_work_item</code> and an order of
<code>memory_order_acq_rel</code> are valid for <code>atomic_work_item_fence</code> when passed the
<code>CLK_IMAGE_MEM_FENCE</code> flag.
If multiple work-items are writing to and reading from multiple locations in
an image, <strong>work_group_barrier</strong>(<code>CLK_IMAGE_MEM_FENCE</code>) should be used instead.</p>
</div>
<div class="paragraph">
<p>Consider the following example:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span>
foo(read_write image2d_t img, ... )
{
int2 coord;
coord.x = (<span class="predefined-type">int</span>)get_global_id(<span class="integer">0</span>);
coord.y = (<span class="predefined-type">int</span>)get_global_id(<span class="integer">1</span>);
float4 clr = read_imagef(img, coord);
...
write_imagef(img, coord, clr);
<span class="comment">// required to ensure that following read from image at</span>
<span class="comment">// location coord returns the latest color value.</span>
atomic_work_item_fence(
CLK_IMAGE_MEM_FENCE,
memory_order_acq_rel,
memory_scope_work_item);
float4 clr_new = read_imagef(img, coord);
...
}</code></pre>
</div>
</div>
</div>
<div class="sect4">
<h5 id="mapping-image-channels-to-color-values-returned-by-read_image-and-color-values-passed-to-write_image-to-image-channels"><a class="anchor" href="#mapping-image-channels-to-color-values-returned-by-read_image-and-color-values-passed-to-write_image-to-image-channels"></a>6.15.15.7. Mapping image channels to color values returned by read_image and color values passed to write_image to image channels</h5>
<div class="paragraph">
<p>The following table describes the mapping of the number of channels of an
image element to the appropriate components in the <code>float4</code>, <code>int4</code> or
<code>uint4</code> vector data type for the color values returned by
<strong>read_image{f|i|ui}</strong> or supplied to <strong>write_image{f|i|ui}</strong>.
The unmapped components will be set to 0.0 for red, green and blue channels
and will be set to 1.0 for the alpha channel.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Channel Order</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>float4</code>, <code>int4</code> or <code>uint4</code> <strong>components of channel data</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_R</code>, <code>CL_Rx</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">(r, 0.0, 0.0, 1.0)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_A</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">(0.0, 0.0, 0.0, a)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_RG</code>, <code>CL_RGx</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">(r, g, 0.0, 1.0)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_RA</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">(r, 0.0, 0.0, a)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_RGB</code>, <code>CL_RGBx</code>, <code>CL_sRGB</code>, <code>CL_sRGBx</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">(r, g, b, 1.0)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_RGBA</code>, <code>CL_BGRA</code>, <code>CL_ARGB</code>, <code>CL_ABGR</code>, <code>CL_sRGBA</code>, <code>CL_sBGRA</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">(r, g, b, a)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_INTENSITY</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">(I, I, I, I)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CL_LUMINANCE</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">(L, L, L, 1.0)</p></td>
</tr>
</tbody>
</table>
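<div class="paragraph">
<p>The mapping in the table above can be sketched as a small sequential C model.
This is not OpenCL C device code: the type <code>model_float4</code>, the enumeration <code>model_channel_order</code> and the function <code>map_channels</code> are hypothetical names introduced only for illustration, and the model assumes the channel values are already supplied in the logical order named by the channel order (for example {r, a} for an RA image).</p>
</div>

```c
/* Sequential C model of the channel-to-component mapping table.
   All names here are hypothetical and are not part of OpenCL C. */
typedef struct { float x, y, z, w; } model_float4;

typedef enum {
    MODEL_R, MODEL_A, MODEL_RG, MODEL_RA, MODEL_RGB,
    MODEL_RGBA, MODEL_INTENSITY, MODEL_LUMINANCE
} model_channel_order;

/* ch points to the channel values of one image element, in the logical
   order named by the channel order (e.g. {r, a} for MODEL_RA). */
static model_float4 map_channels(model_channel_order order, const float *ch)
{
    /* Unmapped components default to 0.0 for r, g, b and 1.0 for alpha. */
    model_float4 v = { 0.0f, 0.0f, 0.0f, 1.0f };

    switch (order) {
    case MODEL_R:         v.x = ch[0]; break;
    case MODEL_A:         v.w = ch[0]; break;
    case MODEL_RG:        v.x = ch[0]; v.y = ch[1]; break;
    case MODEL_RA:        v.x = ch[0]; v.w = ch[1]; break;
    case MODEL_RGB:       v.x = ch[0]; v.y = ch[1]; v.z = ch[2]; break;
    case MODEL_RGBA:      v.x = ch[0]; v.y = ch[1]; v.z = ch[2]; v.w = ch[3]; break;
    case MODEL_INTENSITY: v.x = v.y = v.z = v.w = ch[0]; break;
    case MODEL_LUMINANCE: v.x = v.y = v.z = ch[0]; break;
    }
    return v;
}
```

<div class="paragraph">
<p>The <code>x</code> variants and the sRGB orders are omitted from this sketch because, per the table, they map to the same components as their base orders.</p>
</div>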
<div class="paragraph">
<p>For <code>CL_DEPTH</code> images, a scalar value is returned by <strong>read_imagef</strong> or
supplied to <strong>write_imagef</strong>.
<a href="#unified-spec">Requires</a> support for OpenCL C 2.0 or newer, also see
<code>cl_khr_depth_images</code> extension.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>A kernel that uses a sampler with the <code>CL_ADDRESS_CLAMP</code> addressing mode
with multiple images may result in additional samplers being used internally
by an implementation.
If the same sampler is used with multiple images called via
<strong>read_image{f|i|ui}</strong>, then it is possible that an implementation may need to
allocate an additional sampler to handle the different border color values
that may be needed depending on the image formats being used.
These implementation-allocated samplers will count against the maximum
number of samplers supported by the device and given by
<code>CL_DEVICE_MAX_SAMPLERS</code>.
Enqueuing a kernel that requires more samplers than the implementation can
support will result in a <code>CL_OUT_OF_RESOURCES</code> error being returned.</p>
</div>
</td>
</tr>
</table>
</div>
</div>
</div>
<div class="sect3">
<h4 id="work-group-functions"><a class="anchor" href="#work-group-functions"></a>6.15.16. Work-group Collective Functions</h4>
<div class="openblock">
<div class="content">
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The functionality described in this section <a href="#unified-spec">requires</a>
support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the
<code>__opencl_c_work_group_collective_functions</code> feature.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>This section describes built-in functions that perform collective operations
across a work-group.
These built-in functions must be encountered by all work-items in a
work-group executing the kernel.
We use the generic type name <code>gentype</code> to indicate the built-in data types
<code>half</code> <sup class="footnote">[<a id="_footnoteref_68" class="footnote" href="#_footnotedef_68" title="View footnote.">68</a>]</sup>, <code>int</code>, <code>uint</code>, <code>long</code>
<sup class="footnote">[<a id="_footnoteref_69" class="footnote" href="#_footnotedef_69" title="View footnote.">69</a>]</sup>, <code>ulong</code>, <code>float</code> or <code>double</code>
<sup class="footnote">[<a id="_footnoteref_70" class="footnote" href="#_footnotedef_70" title="View footnote.">70</a>]</sup> as the type for the arguments.</p>
</div>
<table id="table-builtin-work-group" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 31. Built-in Work-group Collective Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>work_group_all</strong>(int <em>predicate</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Evaluates <em>predicate</em> for all work-items in the work-group and returns
a non-zero value if <em>predicate</em> evaluates to non-zero for all
work-items in the work-group.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>work_group_any</strong>(int <em>predicate</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Evaluates <em>predicate</em> for all work-items in the work-group and returns
a non-zero value if <em>predicate</em> evaluates to non-zero for any
work-items in the work-group.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>work_group_broadcast</strong>(gentype <em>a</em>, size_t <em>local_id</em>)<br>
gentype <strong>work_group_broadcast</strong>(gentype <em>a</em>, size_t <em>local_id_x</em>,
size_t <em>local_id_y</em>)<br>
gentype <strong>work_group_broadcast</strong>(gentype <em>a</em>, size_t <em>local_id_x</em>,
size_t <em>local_id_y</em>, size_t <em>local_id_z</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Broadcast the value of <em>a</em> for the work-item identified by <em>local_id</em> to
all work-items in the work-group.</p>
<p class="tableblock"> <em>local_id</em> must be the same value for all work-items in the
work-group.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>work_group_reduce_&lt;op&gt;</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return result of reduction operation specified by <strong>&lt;op&gt;</strong> for all
values of <em>x</em> specified by work-items in a work-group.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>work_group_scan_exclusive_&lt;op&gt;</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Do an exclusive scan operation specified by <strong>&lt;op&gt;</strong> of all values
specified by work-items in the work-group. The scan results are
returned for each work-item.</p>
<p class="tableblock"> The scan order is defined by increasing 1D linear global ID within the
work-group.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>work_group_scan_inclusive_&lt;op&gt;</strong>(gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Do an inclusive scan operation specified by <strong>&lt;op&gt;</strong> of all values
specified by work-items in the work-group. The scan results are
returned for each work-item.</p>
<p class="tableblock"> The scan order is defined by increasing 1D linear global ID within the
work-group.</p></td>
</tr>
</tbody>
</table>
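<div class="paragraph">
<p>The semantics of <strong>work_group_all</strong>, <strong>work_group_any</strong> and the one-dimensional <strong>work_group_broadcast</strong> can be sketched with a sequential C model in which element <em>i</em> of an array stands for the argument passed by the work-item with local ID <em>i</em>.
This is host-side C, not device code, and every <code>model_</code> name is hypothetical. The model returns 1 where the table only promises a non-zero value.</p>
</div>

```c
/* Sequential model of a work-group of n work-items.
   pred[i] (or a[i]) is the argument passed by work-item i.
   The model_ names are hypothetical, not OpenCL C built-ins. */

/* Non-zero only if every work-item passed a non-zero predicate. */
static int model_work_group_all(const int *pred, int n)
{
    int i;
    for (i = 0; i < n; ++i)
        if (pred[i] == 0)
            return 0;
    return 1;
}

/* Non-zero if at least one work-item passed a non-zero predicate. */
static int model_work_group_any(const int *pred, int n)
{
    int i;
    for (i = 0; i < n; ++i)
        if (pred[i] != 0)
            return 1;
    return 0;
}

/* Every work-item receives the value held by work-item local_id. */
static int model_work_group_broadcast(const int *a, int local_id)
{
    return a[local_id];
}
```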
<div class="paragraph">
<p>The <strong>&lt;op&gt;</strong> in <strong>work_group_reduce_&lt;op&gt;</strong>, <strong>work_group_scan_exclusive_&lt;op&gt;</strong> and
<strong>work_group_scan_inclusive_&lt;op&gt;</strong> defines the operator and can be <strong>add</strong>,
<strong>min</strong> or <strong>max</strong>.</p>
</div>
<div class="paragraph">
<p>The inclusive scan operation takes a binary operator <strong>op</strong> with an identity I
and <em>n</em> (where <em>n</em> is the size of the work-group) elements [a<sub>0</sub>, a<sub>1</sub>, &#8230;&#8203;
a<sub>n-1</sub>] and returns [a<sub>0</sub>, (a<sub>0</sub> <strong>op</strong> a<sub>1</sub>), &#8230;&#8203; (a<sub>0</sub> <strong>op</strong> a<sub>1</sub> <strong>op</strong> &#8230;&#8203;
<strong>op</strong> a<sub>n-1</sub>)].
If <strong>op</strong> = add, the identity I is 0.
If <strong>op</strong> = min, the identity I is <code>INT_MAX</code>, <code>UINT_MAX</code>, <code>LONG_MAX</code>,
<code>ULONG_MAX</code>, for <code>int</code>, <code>uint</code>, <code>long</code>, <code>ulong</code> types and is <code>+INF</code> for
floating-point types.
Similarly if <strong>op</strong> = max, the identity I is <code>INT_MIN</code>, 0, <code>LONG_MIN</code>, 0 and
<code>-INF</code>.</p>
</div>
<div class="paragraph">
<p>Consider the following example:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="directive">void</span> foo(<span class="predefined-type">int</span> *p)
{
...
<span class="predefined-type">int</span> prefix_sum_val = work_group_scan_inclusive_add(
p[get_local_id(<span class="integer">0</span>)]);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>For the example above, let&#8217;s assume that the work-group size is 8 and <em>p</em>
points to the following elements [3 1 7 0 4 1 6 3].
Work-item 0 calls <strong>work_group_scan_inclusive_add</strong> with 3 and returns 3.
Work-item 1 calls <strong>work_group_scan_inclusive_add</strong> with 1 and returns 4.
The full set of values returned by <strong>work_group_scan_inclusive_add</strong> for
work-items 0 &#8230;&#8203; 7 is [3 4 11 11 15 16 22 25].</p>
</div>
<div class="paragraph">
<p>The exclusive scan operation takes a binary associative operator <strong>op</strong> with
an identity I and n (where n is the size of the work-group) elements [a<sub>0</sub>,
a<sub>1</sub>, &#8230;&#8203; a<sub>n-1</sub>] and returns [I, a<sub>0</sub>, (a<sub>0</sub> <strong>op</strong> a<sub>1</sub>), &#8230;&#8203; (a<sub>0</sub> <strong>op</strong>
a<sub>1</sub> <strong>op</strong> &#8230;&#8203; <strong>op</strong> a<sub>n-2</sub>)].
For the example above, the exclusive scan add operation on the ordered set
[3 1 7 0 4 1 6 3] would return [0 3 4 11 11 15 16 22].</p>
</div>
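<div class="paragraph">
<p>Both scan definitions can be checked with a sequential C reference model for <strong>op</strong> = add.
The sketch below is host-side C, not device code: it assumes the per-work-item inputs are laid out in scan order in an array, and the function names <code>model_scan_inclusive_add</code> and <code>model_scan_exclusive_add</code> are hypothetical.</p>
</div>

```c
/* Sequential reference model of the work-group scans for op = add.
   in[i] is the value supplied by work-item i and out[i] is the value
   returned to it. The model_ names are hypothetical. */

static void model_scan_inclusive_add(const int *in, int *out, int n)
{
    int i;
    int running = 0;           /* identity I for add */
    for (i = 0; i < n; ++i) {
        running += in[i];      /* include the work-item's own value */
        out[i] = running;
    }
}

static void model_scan_exclusive_add(const int *in, int *out, int n)
{
    int i;
    int running = 0;           /* identity I for add */
    for (i = 0; i < n; ++i) {
        out[i] = running;      /* exclude the work-item's own value */
        running += in[i];
    }
}
```

<div class="paragraph">
<p>In this model the last element of the inclusive scan equals the result a reduction with the same operator would return for the whole work-group.</p>
</div>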
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>The order of floating-point operations is not guaranteed for the
<strong>work_group_reduce_&lt;op&gt;</strong>, <strong>work_group_scan_inclusive_&lt;op&gt;</strong> and
<strong>work_group_scan_exclusive_&lt;op&gt;</strong> built-in functions that operate on <code>half</code>,
<code>float</code> and <code>double</code> data types.
The order of these floating-point operations is also non-deterministic for a
given work-group.</p>
</div>
</td>
</tr>
</table>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="pipe-functions"><a class="anchor" href="#pipe-functions"></a>6.15.17. Pipe Functions</h4>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The functionality described in this section <a href="#unified-spec">requires</a>
support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the <code>__opencl_c_pipes</code> feature.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>A pipe is identified by specifying the <code>pipe</code> keyword with a type.
The data type specifies the size of each packet in the pipe.
The <code>pipe</code> keyword is a type modifier.
When it is applied to another type <strong>T</strong>, the result is a pipe type whose
elements (or packets) are of type <strong>T</strong>.
The packet type <strong>T</strong> may be any supported OpenCL C scalar or vector integer
or floating-point data type, or a user-defined type built from these scalar
and vector data types.</p>
</div>
<div class="paragraph">
<p>Examples:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">pipe int4 pipeA; <span class="comment">// a pipe with int4 packets</span>
pipe user_type_t pipeB; <span class="comment">// a pipe with user_type_t packets</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>The <code>read_only</code> (or <code>__read_only</code>) and <code>write_only</code> (or <code>__write_only</code>)
qualifiers must be used with the <code>pipe</code> qualifier when a pipe is a parameter
of a kernel or of a user-defined function to identify if a pipe can be read
from or written to by a kernel and its callees and enqueued child kernels.
If no qualifier is specified, <code>read_only</code> is assumed.</p>
</div>
<div class="paragraph">
<p>A kernel cannot read from and write to the same pipe object.
Using the <code>read_write</code> (or <code>__read_write</code>) qualifier with the <code>pipe</code>
qualifier is a compilation error.</p>
</div>
<div class="paragraph">
<p>In the following example</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span>
foo (read_only pipe fooA_t pipeA,
write_only pipe fooB_t pipeB)
{
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p><code>pipeA</code> is a read-only pipe object, and <code>pipeB</code> is a write-only pipe object.</p>
</div>
<div class="paragraph">
<p>The macro <code>CLK_NULL_RESERVE_ID</code> refers to an invalid reservation ID.</p>
</div>
<div class="sect4">
<h5 id="restrictions-3"><a class="anchor" href="#restrictions-3"></a>6.15.17.1. Restrictions</h5>
<div class="ulist">
<ul>
<li>
<p>Pipes can only be passed as arguments to a function (including kernel
functions).
The <a href="#operators">C operators</a> cannot be used with variables declared
with the pipe qualifier.</p>
</li>
<li>
<p>The <code>pipe</code> qualifier cannot be used with variables declared inside a
kernel, a structure or union field, a pointer type, an array, global
variables declared in program scope or the return type of a function.</p>
</li>
</ul>
</div>
</div>
<div class="sect4">
<h5 id="built-in-pipe-read-and-write-functions"><a class="anchor" href="#built-in-pipe-read-and-write-functions"></a>6.15.17.2. Built-in Pipe Read and Write Functions</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The OpenCL C programming language implements the following built-in
functions that read from or write to a pipe.
We use the generic type name <code>gentype</code> to indicate that the built-in OpenCL C scalar
or vector integer or floating-point data types
<sup class="footnote">[<a id="_footnoteref_71" class="footnote" href="#_footnotedef_71" title="View footnote.">71</a>]</sup> or any user-defined type built from these
scalar and vector data types can be used as the type for the arguments to the
pipe functions listed in the following table.</p>
</div>
<table id="table-builtin-pipe" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 32. Built-in Pipe Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>read_pipe</strong>(read_only pipe gentype <em>p</em>, gentype *<em>ptr</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read packet from pipe <em>p</em> into <em>ptr</em>.
Returns 0 if <strong>read_pipe</strong> is successful and a negative value if the
pipe is empty.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>write_pipe</strong>(write_only pipe gentype <em>p</em>, const gentype *<em>ptr</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Write packet specified by <em>ptr</em> to pipe <em>p</em>.
Returns 0 if <strong>write_pipe</strong> is successful and a negative value if the
pipe is full.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>read_pipe</strong>(read_only pipe gentype <em>p</em>, reserve_id_t <em>reserve_id</em>,
uint <em>index</em>, gentype *<em>ptr</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Read packet from the reserved area of the pipe referred to by
<em>reserve_id</em> and <em>index</em> into <em>ptr</em>.</p>
<p class="tableblock"> The reserved pipe entries are referred to by indices that go from 0
&#8230;&#8203; <em>num_packets</em> - 1.</p>
<p class="tableblock"> Returns 0 if <strong>read_pipe</strong> is successful and a negative value otherwise.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>write_pipe</strong>(write_only pipe gentype <em>p</em>, reserve_id_t
<em>reserve_id</em>, uint <em>index</em>, const gentype *<em>ptr</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Write packet specified by <em>ptr</em> to the reserved area of the pipe
referred to by <em>reserve_id</em> and <em>index</em>.</p>
<p class="tableblock"> The reserved pipe entries are referred to by indices that go from 0
&#8230;&#8203; <em>num_packets</em> - 1.</p>
<p class="tableblock"> Returns 0 if <strong>write_pipe</strong> is successful and a negative value
otherwise.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">reserve_id_t <strong>reserve_read_pipe</strong>(read_only pipe gentype <em>p</em>,
uint <em>num_packets</em>)<br>
reserve_id_t <strong>reserve_write_pipe</strong>(write_only pipe gentype <em>p</em>,
uint <em>num_packets</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Reserve <em>num_packets</em> entries for reading from or writing to pipe <em>p</em>.
Returns a valid reservation ID if the reservation is successful.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>commit_read_pipe</strong>(read_only pipe gentype <em>p</em>,
reserve_id_t <em>reserve_id</em>)<br>
void <strong>commit_write_pipe</strong>(write_only pipe gentype <em>p</em>,
reserve_id_t <em>reserve_id</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Indicates that all reads from or writes to the <em>num_packets</em> packets associated with
reservation <em>reserve_id</em> are completed.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">bool <strong>is_valid_reserve_id</strong>(reserve_id_t <em>reserve_id</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Return <em>true</em> if <em>reserve_id</em> is a valid reservation ID and <em>false</em>
otherwise.</p></td>
</tr>
</tbody>
</table>
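<div class="paragraph">
<p>The return-value convention of the non-reserving <strong>read_pipe</strong> and <strong>write_pipe</strong> overloads can be sketched with a small first-in, first-out model in plain C.
This is a single-threaded host-side illustration, not device code: the <code>model_pipe</code> type, the <code>MODEL_PIPE_CAPACITY</code> constant and both function names are hypothetical, packets are assumed to be of type <code>int</code>, and -1 stands in for the unspecified negative value.</p>
</div>

```c
/* Single-threaded C model of a pipe with int packets.
   All names are hypothetical; a real pipe's capacity is chosen when
   the pipe object is created on the host. */
#define MODEL_PIPE_CAPACITY 4

typedef struct {
    int packets[MODEL_PIPE_CAPACITY];
    unsigned head;   /* slot of the oldest packet */
    unsigned count;  /* number of packets currently in the pipe */
} model_pipe;

/* Returns 0 on success and a negative value if the pipe is full. */
static int model_write_pipe(model_pipe *p, const int *ptr)
{
    if (p->count == MODEL_PIPE_CAPACITY)
        return -1;
    p->packets[(p->head + p->count) % MODEL_PIPE_CAPACITY] = *ptr;
    p->count++;
    return 0;
}

/* Returns 0 on success and a negative value if the pipe is empty. */
static int model_read_pipe(model_pipe *p, int *ptr)
{
    if (p->count == 0)
        return -1;
    *ptr = p->packets[p->head];
    p->head = (p->head + 1) % MODEL_PIPE_CAPACITY;
    p->count--;
    return 0;
}
```

<div class="paragraph">
<p>Because a failed call leaves the destination untouched in this model, callers are expected to check the return value before using the packet.</p>
</div>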
</div>
</div>
</div>
<div class="sect4">
<h5 id="built-in-work-group-pipe-read-and-write-functions"><a class="anchor" href="#built-in-work-group-pipe-read-and-write-functions"></a>6.15.17.3. Built-in Work-group Pipe Read and Write Functions</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The OpenCL C programming language implements the following built-in pipe
functions that operate at a work-group level.
These built-in functions must be encountered by all work-items in a
work-group executing the kernel with the same argument values; otherwise the
behavior is undefined.
We use the generic type name <code>gentype</code> to indicate that the built-in OpenCL C scalar
or vector integer or floating-point data types
<sup class="footnote">[<a id="_footnoteref_72" class="footnote" href="#_footnotedef_72" title="View footnote.">72</a>]</sup> or any user-defined type built from these
scalar and vector data types can be used as the type for the arguments to the
pipe functions listed in the following table.</p>
</div>
<table id="table-builtin-pipe-work-group" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 33. Built-in Pipe Work-group Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">reserve_id_t <strong>work_group_reserve_read_pipe</strong>(read_only pipe gentype <em>p</em>,
uint <em>num_packets</em>)<br>
reserve_id_t <strong>work_group_reserve_write_pipe</strong>(write_only pipe gentype <em>p</em>,
uint <em>num_packets</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Reserve <em>num_packets</em> entries for reading from or writing to pipe <em>p</em>.
Returns a valid reservation ID if the reservation is successful.</p>
<p class="tableblock"> The reserved pipe entries are referred to by indices that go from 0
&#8230;&#8203; <em>num_packets</em> - 1.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>work_group_commit_read_pipe</strong>(read_only pipe gentype <em>p</em>,
reserve_id_t <em>reserve_id</em>)<br>
void <strong>work_group_commit_write_pipe</strong>(write_only pipe gentype <em>p</em>,
reserve_id_t <em>reserve_id</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Indicates that all reads from or writes to the <em>num_packets</em> packets associated with
reservation <em>reserve_id</em> are completed.</p></td>
</tr>
</tbody>
</table>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>The <strong>read_pipe</strong> and <strong>write_pipe</strong> functions that take a reservation ID as an
argument can be used to read from or write to a packet index one or
multiple times.
If a packet index that is reserved for writing is not written to using the
<strong>write_pipe</strong> function, the contents of that packet in the pipe are
undefined.
<strong>commit_read_pipe</strong> and <strong>work_group_commit_read_pipe</strong> remove the entries
reserved for reading from the pipe.
<strong>commit_write_pipe</strong> and <strong>work_group_commit_write_pipe</strong> ensure that the
entries reserved for writing are all added in-order as one contiguous set of
packets to the pipe.</p>
</div>
</td>
</tr>
</table>
</div>
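<div class="paragraph">
<p>The reserve, write and commit sequence described in the note above can be sketched with a single-threaded C model of the write side of a pipe.
This is host-side C, not device code, and it makes several simplifying assumptions that are labeled in the comments: packets are of type <code>int</code>, only one reservation may be active at a time, the base slot doubles as the reservation ID, and conditions that are undefined in OpenCL C (such as an out-of-range index) are reported as -1.
All <code>model_</code> names and <code>MODEL_CAPACITY</code> are hypothetical.</p>
</div>

```c
/* Single-threaded C model of reserved writes to a pipe.
   Hypothetical names throughout; not OpenCL C built-ins. */
#define MODEL_CAPACITY 8

typedef struct {
    int packets[MODEL_CAPACITY];
    int committed;  /* packets already visible to readers */
    int reserved;   /* size of the one active reservation, 0 if none */
} model_reserving_pipe;

/* Models reserve_write_pipe. Returns the base slot, used here as the
   reservation ID, or -1 to model an invalid reservation ID. */
static int model_reserve_write(model_reserving_pipe *p, int num_packets)
{
    if (p->reserved != 0 || p->committed + num_packets > MODEL_CAPACITY)
        return -1;
    p->reserved = num_packets;
    return p->committed;
}

/* Models write_pipe with a reservation ID. Indices run from 0 to
   num_packets - 1 and may be written in any order, any number of
   times. The -1 result for a bad ID or index is a modeling choice;
   OpenCL C leaves that behavior undefined. */
static int model_write_reserved(model_reserving_pipe *p, int reserve_id,
                                int index, const int *ptr)
{
    if (reserve_id < 0 || index < 0 || index >= p->reserved)
        return -1;
    p->packets[reserve_id + index] = *ptr;
    return 0;
}

/* Models commit_write_pipe: the reserved packets are added to the
   pipe in order as one contiguous set. */
static void model_commit_write(model_reserving_pipe *p, int reserve_id)
{
    (void)reserve_id;  /* only one reservation exists in this model */
    p->committed += p->reserved;
    p->reserved = 0;
}
```

<div class="paragraph">
<p>Nothing written through a reservation counts as part of the pipe in this model until the commit, which mirrors the requirement that every reserved entry be written before the reservation is committed.</p>
</div>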
<div class="paragraph">
<p>The number of active reservations (i.e. reservation IDs that have been reserved but not committed) per
work-item or work-group for a pipe in a kernel executing on a device cannot
exceed the value of the <a href="#opencl-device-queries"><code>CL_DEVICE_PIPE_MAX_ACTIVE_RESERVATIONS</code> device query</a>.</p>
</div>
<div class="paragraph">
<p>Work-item based reservations made by a work-item are ordered in the pipe as
they are ordered in the program.
Reservations made by different work-items that belong to the same work-group
can be ordered using the work-group barrier function.
The order of work-item based reservations that belong to different
work-groups is implementation defined.</p>
</div>
<div class="paragraph">
<p>Work-group based reservations made by a work-group are ordered in the pipe
as they are ordered in the program.
The order of work-group based reservations by different work-groups is
implementation defined.</p>
</div>
</div>
</div>
</div>
<div class="sect4">
<h5 id="built-in-pipe-query-functions"><a class="anchor" href="#built-in-pipe-query-functions"></a>6.15.17.4. Built-in Pipe Query Functions</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The OpenCL C programming language implements the following built-in query
functions for a pipe.
We use the generic type name <code>gentype</code> to indicate that the built-in OpenCL C scalar
or vector integer or floating-point data types
<sup class="footnote">[<a id="_footnoteref_73" class="footnote" href="#_footnotedef_73" title="View footnote.">73</a>]</sup> or any user-defined type built from these
scalar and vector data types can be used as the type for the arguments to the
pipe functions listed in the following table.</p>
</div>
<div class="paragraph">
<p><em>aQual</em> in the following table refers to one of the access qualifiers.
For pipe query functions this may be <code>read_only</code> or <code>write_only</code>.</p>
</div>
<table id="table-builtin-pipe-query" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 34. Built-in Pipe Query Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">uint <strong>get_pipe_num_packets</strong>(<em>aQual</em> pipe gentype <em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the number of available entries in the pipe.
The number of available entries in a pipe is a dynamic value.
The value returned should be considered immediately stale.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">uint <strong>get_pipe_max_packets</strong>(<em>aQual</em> pipe gentype <em>p</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the maximum number of packets specified when <em>p</em> was
created.</p></td>
</tr>
</tbody>
</table>
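<div class="paragraph">
<p>As a minimal sketch of how these queries might be used, the kernel below
records a snapshot of a pipe&#8217;s capacity and current occupancy.
The kernel name <code>snapshot_pipe</code> and the <code>stats</code> buffer are hypothetical and
only illustrate the call syntax.
Because the packet count is immediately stale, the snapshot is suitable for
diagnostics or heuristics, not for deciding whether a later read or write
will succeed.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: stats must point to at least two uint values.
kernel void
snapshot_pipe(read_only pipe int p, global uint *stats)
{
    // a single work-item records the snapshot
    if (get_global_id(0) == 0)
    {
        // fixed when the pipe was created
        stats[0] = get_pipe_max_packets(p);
        // dynamic value; may already be out of date when stored
        stats[1] = get_pipe_num_packets(p);
    }
}</code></pre>
</div>
</div>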
</div>
</div>
</div>
<div class="sect4">
<h5 id="restrictions-4"><a class="anchor" href="#restrictions-4"></a>6.15.17.5. Restrictions</h5>
<div class="paragraph">
<p>The following behavior is undefined:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>A kernel fails to call <strong>reserve_pipe</strong> before calling <strong>read_pipe</strong> or
<strong>write_pipe</strong> that take a reservation ID.</p>
</li>
<li>
<p>A kernel calls <strong>read_pipe</strong>, <strong>write_pipe</strong>, <strong>commit_read_pipe</strong> or
<strong>commit_write_pipe</strong> with an invalid reservation ID.</p>
</li>
<li>
<p>A kernel calls <strong>read_pipe</strong> or <strong>write_pipe</strong> with a valid reservation ID
but with an <em>index</em> that is not a value in the range [0,
<em>num_packets</em>-1] specified to the corresponding call to <strong>reserve_pipe</strong>.</p>
</li>
<li>
<p>A kernel calls <strong>read_pipe</strong> or <strong>write_pipe</strong> with a reservation ID that
has already been committed (i.e. a <strong>commit_read_pipe</strong> or
<strong>commit_write_pipe</strong> with this reservation ID has already been called).</p>
</li>
<li>
<p>A kernel fails to call <strong>commit_read_pipe</strong> for any reservation ID
obtained by a prior call to <strong>reserve_read_pipe</strong>.</p>
</li>
<li>
<p>A kernel fails to call <strong>commit_write_pipe</strong> for any reservation ID
obtained by a prior call to <strong>reserve_write_pipe</strong>.</p>
</li>
<li>
<p>The contents of the reserved data packets in the pipe are undefined if
the kernel does not call <strong>write_pipe</strong> for all entries that were reserved
by the corresponding call to <strong>reserve_pipe</strong>.</p>
</li>
<li>
<p>Calls to <strong>read_pipe</strong> that take a reservation ID and to <strong>commit_read_pipe</strong>,
or to <strong>write_pipe</strong> that take a reservation ID and to <strong>commit_write_pipe</strong>, for
a given reservation ID must be made by the same kernel that made the
reservation using <strong>reserve_read_pipe</strong> or <strong>reserve_write_pipe</strong>.
The reservation ID cannot be passed to another kernel, including child
kernels.</p>
</li>
</ul>
</div>
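<div class="paragraph">
<p>The restrictions above can be illustrated by a reservation sequence that
stays within them.
The following is a sketch only: the kernel name <code>consume_packets</code>, the
<code>out</code> buffer and the packet count of 4 are hypothetical, and <code>out</code> is assumed
to hold four entries per work-item.
Each work-item reserves, reads only indices inside the reserved range, and
commits the reservation itself.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: out must hold 4 ints per work-item.
kernel void
consume_packets(read_only pipe int in_pipe, global int *out)
{
    const uint num_packets = 4;   // illustrative reservation size
    size_t base = get_global_id(0) * num_packets;

    // reserve before any read_pipe call that takes a reservation ID
    reserve_id_t rid = reserve_read_pipe(in_pipe, num_packets);

    // never use an invalid reservation ID
    if (is_valid_reserve_id(rid))
    {
        // index stays in the range [0, num_packets-1]
        for (uint i = 0; i &lt; num_packets; i++)
        {
            int value;
            read_pipe(in_pipe, rid, i, &amp;value);
            out[base + i] = value;
        }
        // commit exactly once, in the kernel that made the reservation;
        // rid must not be used after this call
        commit_read_pipe(in_pipe, rid);
    }
}</code></pre>
</div>
</div>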
</div>
</div>
<div class="sect3">
<h4 id="enqueuing-kernels"><a class="anchor" href="#enqueuing-kernels"></a>6.15.18. Enqueuing Kernels</h4>
<div class="openblock">
<div class="content">
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The functionality described in this section <a href="#unified-spec">requires</a>
support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the
<code>__opencl_c_device_enqueue</code> feature.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>This section describes built-in functions that allow a kernel to
enqueue additional work to the same device, without host interaction.
A kernel may enqueue code represented by Block syntax, and control execution
order with event dependencies including user events and markers.
There are several advantages to using the Block syntax: it is more compact;
it does not require a cl_kernel object; and enqueuing can be done as a
single semantic step.</p>
</div>
<div class="paragraph">
<p>The following table describes the list of built-in functions that can be
used to enqueue one or more kernels.</p>
</div>
<div class="paragraph">
<p>The macro <code>CLK_NULL_EVENT</code> refers to an invalid device event.
The macro <code>CLK_NULL_QUEUE</code> refers to an invalid device queue.</p>
</div>
</div>
</div>
<div class="sect4">
<h5 id="built-in-functions-enqueuing-a-kernel"><a class="anchor" href="#built-in-functions-enqueuing-a-kernel"></a>6.15.18.1. Built-in Functions - Enqueuing a kernel</h5>
<table id="table-builtin-kernel-enqueue" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 35. Built-in Kernel Enqueue Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top"><strong>Built-in Function</strong></th>
<th class="tableblock halign-left valign-top"><strong>Description</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>enqueue_kernel</strong>(queue_t <em>queue</em>, kernel_enqueue_flags_t <em>flags</em>,
const ndrange_t <em>ndrange</em>, void (^<em>block</em>)(void))<br>
int <strong>enqueue_kernel</strong>(queue_t <em>queue</em>, kernel_enqueue_flags_t <em>flags</em>,
const ndrange_t <em>ndrange</em>, uint <em>num_events_in_wait_list</em>,
const clk_event_t *<em>event_wait_list</em>, clk_event_t *<em>event_ret</em>,
void (^<em>block</em>)(void))<br>
int <strong>enqueue_kernel</strong>(queue_t <em>queue</em>, kernel_enqueue_flags_t <em>flags</em>,
const ndrange_t <em>ndrange</em>, void (^<em>block</em>)(local void *, &#8230;&#8203;),
uint size0, &#8230;&#8203;)<br>
int <strong>enqueue_kernel</strong>(queue_t <em>queue</em>, kernel_enqueue_flags_t <em>flags</em>,
const ndrange_t <em>ndrange</em>, uint <em>num_events_in_wait_list</em>,
const clk_event_t *<em>event_wait_list</em>, clk_event_t *<em>event_ret</em>,
void (^<em>block</em>)(local void *, &#8230;&#8203;), uint size0, &#8230;&#8203;)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Enqueue the block for execution to <em>queue</em>.</p>
<p class="tableblock"> If an event is returned, <strong>enqueue_kernel</strong> performs an implicit retain
on the returned event.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>The <strong>enqueue_kernel</strong> built-in function allows a work-item to enqueue a
block.
Work-items can enqueue multiple blocks to one or more device queues.</p>
</div>
<div class="paragraph">
<p>The <strong>enqueue_kernel</strong> built-in function returns <code>CLK_SUCCESS</code> if the block is
enqueued successfully and returns <code>CLK_ENQUEUE_FAILURE</code> otherwise.
If the -g compile option is specified in compiler options passed to
<strong>clCompileProgram</strong> or <strong>clBuildProgram</strong> when compiling or building the parent
program, the following errors may be returned instead of
<code>CLK_ENQUEUE_FAILURE</code> to indicate why <strong>enqueue_kernel</strong> failed to enqueue the
block:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><code>CLK_INVALID_QUEUE</code> if <em>queue</em> is not a valid device queue.</p>
</li>
<li>
<p><code>CLK_INVALID_NDRANGE</code> if <em>ndrange</em> is not a valid ND-range descriptor or
if the program was compiled with <code>-cl-uniform-work-group-size</code> and the
<em>local_work_size</em> is specified in <em>ndrange</em> but the <em>global_work_size</em>
specified in <em>ndrange</em> is not a multiple of the <em>local_work_size</em>.</p>
</li>
<li>
<p><code>CLK_INVALID_EVENT_WAIT_LIST</code> if <em>event_wait_list</em> is <code>NULL</code> and
<em>num_events_in_wait_list</em> &gt; 0, or if <em>event_wait_list</em> is not <code>NULL</code> and
<em>num_events_in_wait_list</em> is 0, or if event objects in <em>event_wait_list</em>
are not valid events.</p>
</li>
<li>
<p><code>CLK_DEVICE_QUEUE_FULL</code> if <em>queue</em> is full.</p>
</li>
<li>
<p><code>CLK_INVALID_ARG_SIZE</code> if the size of a local memory argument is 0.</p>
</li>
<li>
<p><code>CLK_EVENT_ALLOCATION_FAILURE</code> if <em>event_ret</em> is not <code>NULL</code> and an event
could not be allocated.</p>
</li>
<li>
<p><code>CLK_OUT_OF_RESOURCES</code> if there is a failure to queue the block in
<em>queue</em> because of insufficient resources needed to execute the kernel.</p>
</li>
</ul>
</div>
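<div class="paragraph">
<p>The return-code behavior described above could be checked as in the
following sketch.
The kernel name <code>checked_parent</code>, the <code>status</code> buffer and the child ND-range
of 64 work-items are hypothetical; <code>data</code> is assumed to hold at least 64
entries.
Without the -g option the only failure value a caller can rely on is
<code>CLK_ENQUEUE_FAILURE</code>, so the sketch compares against <code>CLK_SUCCESS</code> rather
than testing for a specific error.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: data holds at least 64 ints, status at least 1.
kernel void
checked_parent(global int *data, global int *status)
{
    // enqueue from a single work-item so that only one child is created
    if (get_global_id(0) == 0)
    {
        int err = enqueue_kernel(get_default_queue(),
                                 CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
                                 ndrange_1D(64),
                                 ^{ data[get_global_id(0)] += 1; });

        // CLK_SUCCESS on success; otherwise CLK_ENQUEUE_FAILURE or,
        // when built with -g, one of the more specific codes
        status[0] = err;

        if (err != CLK_SUCCESS)
        {
            // illustrative fallback: do a reduced amount of work inline
            data[0] += 1;
        }
    }
}</code></pre>
</div>
</div>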
<div class="paragraph">
<p>Below are some examples of how to enqueue a block.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span>
my_func_A(global <span class="predefined-type">int</span> *a, global <span class="predefined-type">int</span> *b, global <span class="predefined-type">int</span> *c)
{
...
}
kernel <span class="directive">void</span>
my_func_B(global <span class="predefined-type">int</span> *a, global <span class="predefined-type">int</span> *b, global <span class="predefined-type">int</span> *c)
{
ndrange_t ndrange;
<span class="comment">// build ndrange information</span>
...
<span class="comment">// example - enqueue a kernel as a block</span>
enqueue_kernel(get_default_queue(),
CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
ndrange,
^{my_func_A(a, b, c);});
...
}
kernel <span class="directive">void</span>
my_func_C(global <span class="predefined-type">int</span> *a, global <span class="predefined-type">int</span> *b, global <span class="predefined-type">int</span> *c)
{
ndrange_t ndrange;
<span class="comment">// build ndrange information</span>
...
<span class="comment">// note that a, b and c are variables in scope of</span>
<span class="comment">// the block</span>
<span class="directive">void</span> (^my_block_A)(<span class="directive">void</span>) = ^{my_func_A(a, b, c);};
<span class="comment">// enqueue the block variable</span>
enqueue_kernel(get_default_queue(),
CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
ndrange,
my_block_A);
...
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>The example below shows how to declare a block literal and enqueue it.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span>
my_func(global <span class="predefined-type">int</span> *a, global <span class="predefined-type">int</span> *b)
{
ndrange_t ndrange;
<span class="comment">// build ndrange information</span>
...
<span class="comment">// note that a and b are variables in scope of</span>
<span class="comment">// the block</span>
<span class="directive">void</span> (^my_block_A)(<span class="directive">void</span>) =
^{
size_t id = get_global_id(<span class="integer">0</span>);
b[id] += a[id];
};
<span class="comment">// enqueue the block variable</span>
enqueue_kernel(get_default_queue(),
CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
ndrange,
my_block_A);
<span class="comment">// or we could have done the following</span>
enqueue_kernel(get_default_queue(),
CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
ndrange,
^{
size_t id = get_global_id(<span class="integer">0</span>);
b[id] += a[id];
});
}</code></pre>
</div>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>Blocks passed to enqueue_kernel cannot use global variables or stack
variables local to the enclosing lexical scope that are a pointer type in
the <code>local</code> or <code>private</code> address space.</p>
</div>
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>Example:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span>
foo(global <span class="predefined-type">int</span> *a, local <span class="predefined-type">int</span> *lptr, ...)
{
enqueue_kernel(get_default_queue(),
CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
ndrange,
^{
size_t id = get_global_id(<span class="integer">0</span>);
local <span class="predefined-type">int</span> *p = lptr; <span class="comment">// undefined behavior</span>
} );
}</code></pre>
</div>
</div>
</div>
<div class="sect4">
<h5 id="arguments-that-are-a-pointer-type-to-local-address-space"><a class="anchor" href="#arguments-that-are-a-pointer-type-to-local-address-space"></a>6.15.18.2. Arguments that are a pointer type to local address space</h5>
<div class="paragraph">
<p>A block passed to enqueue_kernel can have arguments declared to be a pointer
to <code>local</code> memory.
The enqueue_kernel built-in function variants allow blocks to be enqueued
with a variable number of arguments.
Each argument must be declared to be a <code>void</code> pointer to local memory.
These enqueue_kernel built-in function variants also have a corresponding
number of arguments each of type <code>uint</code> that follow the block argument.
These arguments specify the size of each local memory pointer argument of
the enqueued block.</p>
</div>
<div class="paragraph">
<p>Some examples follow:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span>
my_func_A_local_arg1(global <span class="predefined-type">int</span> *a, local <span class="predefined-type">int</span> *lptr, ...)
{
...
}
kernel <span class="directive">void</span>
my_func_A_local_arg2(global <span class="predefined-type">int</span> *a,
local <span class="predefined-type">int</span> *lptr1, local float4 *lptr2, ...)
{
...
}
kernel <span class="directive">void</span>
my_func_B(global <span class="predefined-type">int</span> *a, ...)
{
...
ndrange_t ndrange = ndrange_1D(...);
uint local_mem_size = compute_local_mem_size();
enqueue_kernel(get_default_queue(),
CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
ndrange,
^(local <span class="directive">void</span> *p){
my_func_A_local_arg1(a, (local <span class="predefined-type">int</span> *)p, ...);},
local_mem_size);
}
kernel <span class="directive">void</span>
my_func_C(global <span class="predefined-type">int</span> *a, ...)
{
...
ndrange_t ndrange = ndrange_1D(...);
<span class="directive">void</span> (^my_blk_A)(local <span class="directive">void</span> *, local <span class="directive">void</span> *) =
^(local <span class="directive">void</span> *lptr1, local <span class="directive">void</span> *lptr2){
my_func_A_local_arg2(
a,
(local <span class="predefined-type">int</span> *)lptr1,
(local float4 *)lptr2, ...);};
<span class="comment">// calculate local memory size for lptr</span>
<span class="comment">// argument in local address space for my_blk_A</span>
uint local_mem_size = compute_local_mem_size();
enqueue_kernel(get_default_queue(),
CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
ndrange,
my_blk_A,
local_mem_size, local_mem_size*<span class="integer">4</span>);
}</code></pre>
</div>
</div>
</div>
<div class="sect4">
<h5 id="a-complete-example"><a class="anchor" href="#a-complete-example"></a>6.15.18.3. A Complete Example</h5>
<div class="paragraph">
<p>The example below shows how to implement an iterative algorithm where the
host enqueues the first instance of the nd-range kernel (dp_func_A).
The kernel dp_func_A will launch a kernel (evaluate_dp_work_A) that will
determine if new nd-range work needs to be performed.
If new nd-range work does need to be performed, then evaluate_dp_work_A will
enqueue a new instance of dp_func_A.
This process is repeated until all the work is completed.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span>
dp_func_A(queue_t q, ...)
{
...
<span class="comment">// queue a single instance of evaluate_dp_work_A to</span>
<span class="comment">// device queue q. queued kernel begins execution after</span>
<span class="comment">// kernel dp_func_A finishes</span>
<span class="keyword">if</span> (get_global_id(<span class="integer">0</span>) == <span class="integer">0</span>)
{
enqueue_kernel(q,
CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
ndrange_1D(<span class="integer">1</span>),
^{evaluate_dp_work_A(q, ...);});
}
}
kernel <span class="directive">void</span>
evaluate_dp_work_A(queue_t q,...)
{
<span class="comment">// check if more work needs to be performed</span>
<span class="predefined-type">bool</span> more_work = check_new_work(...);
<span class="keyword">if</span> (more_work)
{
size_t global_work_size = compute_global_size(...);
<span class="directive">void</span> (^dp_func_A_blk)(<span class="directive">void</span>) =
^{dp_func_A(q, ...);};
<span class="comment">// get local WG-size for kernel dp_func_A</span>
size_t local_work_size =
get_kernel_work_group_size(dp_func_A_blk);
<span class="comment">// build nd-range descriptor</span>
ndrange_t ndrange = ndrange_1D(global_work_size,
local_work_size);
<span class="comment">// enqueue dp_func_A</span>
enqueue_kernel(q,
CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
ndrange,
dp_func_A_blk);
}
...
}</code></pre>
</div>
</div>
</div>
<div class="sect4">
<h5 id="determining-when-a-child-kernel-begins-execution"><a class="anchor" href="#determining-when-a-child-kernel-begins-execution"></a>6.15.18.4. Determining when a child kernel begins execution</h5>
<div class="paragraph">
<p>The <code>kernel_enqueue_flags_t</code> <sup class="footnote">[<a id="_footnoteref_74" class="footnote" href="#_footnotedef_74" title="View footnote.">74</a>]</sup> argument
to the <code>enqueue_kernel</code> built-in functions can be used to specify when the child
kernel begins execution.
Supported values are described in the table below:</p>
</div>
<table id="table-kernel-enqueue-flags" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 36. Kernel Enqueue Flags</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>kernel_enqueue_flags_t</code> <strong>enum</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CLK_ENQUEUE_FLAGS_NO_WAIT</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Indicates that the enqueued kernels do not need to wait for the parent
kernel to finish execution before they begin execution.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CLK_ENQUEUE_FLAGS_WAIT_KERNEL</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Indicates that all work-items of the parent kernel must finish
executing and all immediate <sup class="footnote">[<a id="_footnoteref_75" class="footnote" href="#_footnotedef_75" title="View footnote.">75</a>]</sup> side
effects committed before the enqueued child kernel may begin execution.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CLK_ENQUEUE_FLAGS_WAIT_WORK_GROUP</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Indicates that the enqueued kernels wait only for the work-group that
enqueued the kernels to finish before they begin execution.
<sup class="footnote">[<a id="_footnoteref_76" class="footnote" href="#_footnotedef_76" title="View footnote.">76</a>]</sup></p></td>
</tr>
</tbody>
</table>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>The <code>kernel_enqueue_flags_t</code> flags are useful when a kernel enqueued from
the host and executing on a device enqueues kernels on the device.
The kernel enqueued from the host may not have an event associated with it.
The <code>kernel_enqueue_flags_t</code> flags allow the developer to indicate when the
child kernels can begin execution.</p>
</div>
</td>
</tr>
</table>
</div>
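<div class="paragraph">
<p>As an illustration of <code>CLK_ENQUEUE_FLAGS_WAIT_WORK_GROUP</code>, the sketch below
lets one work-item per work-group enqueue a child kernel that post-processes
only the elements written by that work-group.
The kernel name <code>stage1</code> and the arithmetic are hypothetical; the sketch
assumes a one-dimensional parent ND-range whose size matches <code>buf</code>.
The child can start as soon as its enqueuing work-group has finished, without
waiting for the rest of the parent kernel.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: buf has one float per parent work-item.
kernel void
stage1(global float *buf)
{
    size_t gid = get_global_id(0);
    buf[gid] = buf[gid] * 2.0f;

    // one enqueue per work-group
    if (get_local_id(0) == 0)
    {
        // range of elements owned by this work-group
        size_t count = get_local_size(0);
        size_t first = gid;   // local ID 0 holds the group's first global ID

        enqueue_kernel(get_default_queue(),
                       CLK_ENQUEUE_FLAGS_WAIT_WORK_GROUP,
                       ndrange_1D(count),
                       ^{
                           // first and buf are captured from the
                           // enclosing scope
                           size_t i = first + get_global_id(0);
                           buf[i] += 1.0f;
                       });
    }
}</code></pre>
</div>
</div>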
</div>
<div class="sect4">
<h5 id="determining-when-a-parent-kernel-has-finished-execution"><a class="anchor" href="#determining-when-a-parent-kernel-has-finished-execution"></a>6.15.18.5. Determining when a parent kernel has finished execution</h5>
<div class="paragraph">
<p>A parent kernel&#8217;s execution status is considered to be complete when it and
all its child kernels have finished execution.
The execution status of a parent kernel will be <code>CL_COMPLETE</code> if this kernel
and all its child kernels finish execution successfully.
The execution status of the kernel will be an error code (given by a
negative integer value) if it or any of its child kernels encounter an
error, or are abnormally terminated.</p>
</div>
<div class="paragraph">
<p>For example, assume that the host enqueues a kernel <code>k</code> for execution on a
device.
Kernel <code>k</code>, when executing on the device, enqueues kernels <code>A</code> and <code>B</code> to one
or more device queues.
The enqueue_kernel call to enqueue kernel <code>B</code> specifies the event associated
with kernel <code>A</code> in the <code>event_wait_list</code> argument, i.e. wait for kernel <code>A</code>
to finish execution before kernel <code>B</code> can begin execution.
Let&#8217;s assume kernel <code>A</code> enqueues kernels <code>X</code>, <code>Y</code> and <code>Z</code>.
Kernel <code>A</code> is considered to have finished execution, i.e. its execution
status is <code>CL_COMPLETE</code>, only after <code>A</code> and the kernels <code>A</code> enqueued (and
any kernels these enqueued kernels enqueue and so on) have finished
execution.</p>
</div>
</div>
<div class="sect4">
<h5 id="built-in-functions-kernel-query-functions"><a class="anchor" href="#built-in-functions-kernel-query-functions"></a>6.15.18.6. Built-in Functions - Kernel Query Functions</h5>
<div class="openblock">
<div class="content">
<table id="table-builtin-kernel-query" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 37. Built-in Kernel Query Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Built-in Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">uint <strong>get_kernel_work_group_size</strong>(void (^block)(void))<br>
uint <strong>get_kernel_work_group_size</strong>(void (^block)(local void *, &#8230;&#8203;))</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the maximum work-group size that can be used to execute
<em>block</em> on the device executing the calling kernel.</p>
<p class="tableblock"> <em>block</em> specifies the block to be enqueued.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">uint <strong>get_kernel_preferred_</strong> <strong>work_group_size_multiple</strong>(
void (^block)(void))<br>
uint <strong>get_kernel_preferred_</strong> <strong>work_group_size_multiple</strong>(
void (^block)(local void *, &#8230;&#8203;))</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the preferred multiple of work-group size for launch.
This is a performance hint.
Specifying a work-group size that is not a multiple of the value
returned by this query as the value of the local work size argument to
enqueue_kernel will not fail to enqueue the block for execution unless
the work-group size specified is larger than the device maximum.</p></td>
</tr>
</tbody>
</table>
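<div class="paragraph">
<p>One way the two queries might be combined is sketched below: the local
work size is chosen as the largest multiple of the preferred value that does
not exceed the maximum, and the global size is rounded up to a multiple of
it.
The kernel name <code>launcher</code>, the element count <code>n</code> and the doubling
operation are hypothetical, and the rounding policy is one reasonable choice
rather than a requirement.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: data holds n ints.
kernel void
launcher(global int *data, uint n)
{
    // a single work-item performs the launch; nothing to do for n == 0
    if (get_global_id(0) != 0 || n == 0)
        return;

    void (^child)(void) =
        ^{
            size_t i = get_global_id(0);
            // guard the padding added by rounding up the global size
            if (i &lt; n)
                data[i] *= 2;
        };

    uint max_wg = get_kernel_work_group_size(child);
    uint mult = get_kernel_preferred_work_group_size_multiple(child);

    // largest multiple of mult that fits; fall back to max_wg otherwise
    uint local_size = (mult &lt;= max_wg) ? (max_wg / mult) * mult : max_wg;

    // round the global size up to a multiple of the local size
    size_t groups = ((size_t)n + local_size - 1) / local_size;
    size_t global_size = groups * local_size;

    enqueue_kernel(get_default_queue(),
                   CLK_ENQUEUE_FLAGS_NO_WAIT,
                   ndrange_1D(global_size, local_size),
                   child);
}</code></pre>
</div>
</div>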
</div>
</div>
</div>
<div class="sect4">
<h5 id="built-in-functions-queuing-other-commands"><a class="anchor" href="#built-in-functions-queuing-other-commands"></a>6.15.18.7. Built-in Functions - Queuing other commands</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The following table describes the list of built-in functions that can be
used to enqueue commands such as a marker.</p>
</div>
<table id="table-builtin-other-enqueue" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 38. Built-in Other Enqueue Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Built-in Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>enqueue_marker</strong>(queue_t <em>queue</em>, uint <em>num_events_in_wait_list</em>,
const clk_event_t *<em>event_wait_list</em>, clk_event_t *<em>event_ret</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Enqueue a marker command to <em>queue</em>.</p>
<p class="tableblock"> The marker command waits for a list of events specified by
<em>event_wait_list</em> to complete before the marker completes.</p>
<p class="tableblock"> <em>event_ret</em> must not be <code>NULL</code> as otherwise this is a no-op.</p>
<p class="tableblock"> If an event is returned, <strong>enqueue_marker</strong> performs an implicit retain
on the returned event.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>The <strong>enqueue_marker</strong> built-in function returns <code>CLK_SUCCESS</code> if the marker
command is enqueued successfully and returns <code>CLK_ENQUEUE_FAILURE</code>
otherwise.
If the -g compile option is specified in compiler options passed to
<strong>clCompileProgram</strong> or <strong>clBuildProgram</strong>, the following errors may be returned
instead of <code>CLK_ENQUEUE_FAILURE</code> to indicate why <strong>enqueue_marker</strong> failed to
enqueue the marker command:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><code>CLK_INVALID_QUEUE</code> if <em>queue</em> is not a valid device queue.</p>
</li>
<li>
<p><code>CLK_INVALID_EVENT_WAIT_LIST</code> if <em>event_wait_list</em> is <code>NULL</code>, or if
<em>event_wait_list</em> is not <code>NULL</code> and <em>num_events_in_wait_list</em> is 0, or
if event objects in <em>event_wait_list</em> are not valid events.</p>
</li>
<li>
<p><code>CLK_DEVICE_QUEUE_FULL</code> if <em>queue</em> is full.</p>
</li>
<li>
<p><code>CLK_EVENT_ALLOCATION_FAILURE</code> if <em>event_ret</em> is not <code>NULL</code> and an event
could not be allocated.</p>
</li>
<li>
<p><code>CLK_OUT_OF_RESOURCES</code> if there is a failure to queue the marker command in
<em>queue</em> because of insufficient resources.</p>
</li>
</ul>
</div>
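<div class="paragraph">
<p>A marker can act as a join point for several earlier commands.
In the sketch below, which uses the hypothetical kernel name <code>fan_in</code> and
hypothetical single-element buffers, two child kernels are enqueued, a marker
waits on both of their events, and a third child waits only on the marker&#8217;s
event.
Every event returned by an enqueue function is released once it is no longer
needed by the parent.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: a, b and sum each point to at least one int.
kernel void
fan_in(global int *a, global int *b, global int *sum)
{
    if (get_global_id(0) != 0)
        return;

    queue_t q = get_default_queue();
    clk_event_t evts[2];
    clk_event_t joined;

    if (enqueue_kernel(q, CLK_ENQUEUE_FLAGS_NO_WAIT, ndrange_1D(1),
                       0, NULL, &amp;evts[0],
                       ^{ a[0] += 1; }) != CLK_SUCCESS)
        return;

    if (enqueue_kernel(q, CLK_ENQUEUE_FLAGS_NO_WAIT, ndrange_1D(1),
                       0, NULL, &amp;evts[1],
                       ^{ b[0] += 1; }) != CLK_SUCCESS)
    {
        release_event(evts[0]);
        return;
    }

    // the marker completes only after both children have completed
    if (enqueue_marker(q, 2, evts, &amp;joined) == CLK_SUCCESS)
    {
        // the third child depends on the marker alone
        enqueue_kernel(q, CLK_ENQUEUE_FLAGS_NO_WAIT, ndrange_1D(1),
                       1, &amp;joined, NULL,
                       ^{ sum[0] = a[0] + b[0]; });
        release_event(joined);
    }

    release_event(evts[0]);
    release_event(evts[1]);
}</code></pre>
</div>
</div>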
</div>
</div>
</div>
<div class="sect4">
<h5 id="built-in-functions-event-functions"><a class="anchor" href="#built-in-functions-event-functions"></a>6.15.18.8. Built-in Functions - Event Functions</h5>
<div class="openblock">
<div class="content">
<div class="paragraph">
<p>The following table describes the list of built-in functions that work on
events.</p>
</div>
<table id="table-builtin-event" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 39. Built-in Event Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top"><strong>Built-in Function</strong></th>
<th class="tableblock halign-left valign-top"><strong>Description</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>retain_event</strong>(clk_event_t <em>event</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Increments the event reference count. <em>event</em> must be an event
returned by enqueue_kernel or enqueue_marker or a user event.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>release_event</strong>(clk_event_t <em>event</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Decrements the event reference count.
The event object is deleted once the event reference count is zero,
the specific command identified by this event has completed (or
terminated) and there are no commands in any device command queue that
require a wait for this event to complete.</p>
<p class="tableblock"> <em>event</em> must be an event returned by enqueue_kernel, enqueue_marker or
a user event.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">clk_event_t <strong>create_user_event</strong>()</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Create a user event.
Returns the user event.
The execution status of the user event created is set to
<code>CL_SUBMITTED</code>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">bool <strong>is_valid_event</strong>(clk_event_t <em>event</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns <em>true</em> if <em>event</em> is a valid event.
Otherwise returns <em>false</em>.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>set_user_event_status</strong>(clk_event_t <em>event</em>, int <em>status</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Sets the execution status of a user event.
<em>event</em> must be a user-event.
<em>status</em> can be either <code>CL_COMPLETE</code> or a negative integer value
indicating an error.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>capture_event_profiling_info</strong>(clk_event_t <em>event</em>,
clk_profiling_info <em>name</em>, global void *<em>value</em>)</p></td>
<td class="tableblock halign-left valign-top"><div class="content"><div class="paragraph">
<p>Captures the profiling information for functions that are enqueued as
commands.
The specific function being referred to is: enqueue_kernel.
These enqueued commands are identified by unique event objects.
The profiling information will be available in <em>value</em> once the
command identified by <em>event</em> has completed.</p>
</div>
<div class="paragraph">
<p><em>event</em> must be an event returned by enqueue_kernel.</p>
</div>
<div class="paragraph">
<p><em>name</em> identifies which profiling information is to be queried and can be:</p>
</div>
<div class="paragraph">
<p><code>CLK_PROFILING_COMMAND_EXEC_TIME</code></p>
</div>
<div class="paragraph">
<p><em>value</em> is a pointer to two 64-bit values.</p>
</div>
<div class="paragraph">
<p>The first 64-bit value describes the elapsed time <code>CL_PROFILING_COMMAND_END</code>
- <code>CL_PROFILING_COMMAND_START</code> for the command identified by <em>event</em> in
nanoseconds.</p>
</div>
<div class="paragraph">
<p>The second 64-bit value describes the elapsed time
<code>CL_PROFILING_COMMAND_COMPLETE</code> - <code>CL_PROFILING_COMMAND_START</code> for the
command identified by <em>event</em> in nanoseconds.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>The behavior of capture_event_profiling_info when called multiple times for
the same <em>event</em> is undefined.</p>
</div>
</td>
</tr>
</table>
</div></div></td>
</tr>
</tbody>
</table>
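<div class="paragraph">
<p>The profiling entry in the table above could be used as in the following
sketch.
The kernel name <code>profiled_parent</code>, the child ND-range of 64 work-items and
the <code>prof</code> buffer are hypothetical; <code>prof</code> is assumed to point to two 64-bit
values in global memory that the host reads after the parent kernel has
completed.
The capture is requested exactly once for the event, since repeated calls for
the same event are undefined.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">// Hypothetical kernel: data holds at least 64 ints, prof two ulongs.
kernel void
profiled_parent(global int *data, global ulong *prof)
{
    if (get_global_id(0) != 0)
        return;

    clk_event_t evt;
    int err = enqueue_kernel(get_default_queue(),
                             CLK_ENQUEUE_FLAGS_NO_WAIT,
                             ndrange_1D(64),
                             0, NULL, &amp;evt,
                             ^{ data[get_global_id(0)] += 1; });

    if (err == CLK_SUCCESS)
    {
        // prof[0] and prof[1] are filled in once the child has
        // completed; they are not valid immediately after this call
        capture_event_profiling_info(evt,
                                     CLK_PROFILING_COMMAND_EXEC_TIME,
                                     prof);
        release_event(evt);
    }
}</code></pre>
</div>
</div>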
<div class="paragraph">
<p>Events can be used to identify commands enqueued to a command-queue from the
host.
These events created by the OpenCL runtime can only be used on the host,
i.e. as events passed in the <em>event_wait_list</em> argument to various
<strong>clEnqueue</strong> APIs or runtime APIs that take events as arguments, such as
<strong>clRetainEvent</strong>, <strong>clReleaseEvent</strong>, and <strong>clGetEventProfilingInfo</strong>.</p>
</div>
<div class="paragraph">
<p>Similarly, events can be used to identify commands enqueued to a device
queue (from a kernel).
These event objects cannot be passed to the host or used by OpenCL runtime
APIs such as the <strong>clEnqueue</strong> APIs or runtime APIs that take event arguments.</p>
</div>
<div class="paragraph">
<p><strong>clRetainEvent</strong> and <strong>clReleaseEvent</strong> will return <code>CL_INVALID_OPERATION</code> if
<em>event</em> specified is an event that refers to any kernel enqueued to a device
queue using <strong>enqueue_kernel</strong> or <strong>enqueue_marker</strong>, or is a user event created
by <strong>create_user_event</strong>.</p>
</div>
<div class="paragraph">
<p>Similarly, <strong>clSetUserEventStatus</strong> can only be used to set the execution
status of events created using <strong>clCreateUserEvent</strong>.
The execution status of user events created on the device can be set using the
<strong>set_user_event_status</strong> built-in function.</p>
</div>
<div class="paragraph">
<p>The example below shows how events can be used with kernels enqueued to
multiple device queues.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="directive">extern</span> <span class="directive">void</span> barA_kernel(...);
<span class="directive">extern</span> <span class="directive">void</span> barB_kernel(...);
kernel <span class="directive">void</span>
foo(queue_t q0, queue_t q1, ...)
{
...
clk_event_t evt0;
<span class="comment">// enqueue kernel to queue q0</span>
enqueue_kernel(q0,
CLK_ENQUEUE_FLAGS_NO_WAIT,
ndrange_A,
<span class="integer">0</span>, <span class="predefined-constant">NULL</span>, &amp;evt0,
^{barA_kernel(...);} );
<span class="comment">// enqueue kernel to queue q1</span>
enqueue_kernel(q1,
CLK_ENQUEUE_FLAGS_NO_WAIT,
ndrange_B,
<span class="integer">1</span>, &amp;evt0, <span class="predefined-constant">NULL</span>,
^{barB_kernel(...);} );
<span class="comment">// release event evt0. The event is actually released only</span>
<span class="comment">// after barA_kernel enqueued in queue q0 has finished</span>
<span class="comment">// execution and barB_kernel, which is enqueued in queue q1</span>
<span class="comment">// and waits for evt0, has been submitted for execution,</span>
<span class="comment">// i.e. the wait for evt0 is satisfied.</span>
release_event(evt0);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>The example below shows how the marker command can be used with kernels
enqueued to a device queue.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">kernel <span class="directive">void</span>
foo(queue_t q, ...)
{
...
clk_event_t marker_event;
clk_event_t events[<span class="integer">2</span>];
enqueue_kernel(q,
CLK_ENQUEUE_FLAGS_NO_WAIT,
ndrange,
<span class="integer">0</span>, <span class="predefined-constant">NULL</span>, &amp;events[<span class="integer">0</span>],
^{barA_kernel(...);} );
enqueue_kernel(q,
CLK_ENQUEUE_FLAGS_NO_WAIT,
ndrange,
<span class="integer">0</span>, <span class="predefined-constant">NULL</span>, &amp;events[<span class="integer">1</span>],
^{barB_kernel(...);} );
<span class="comment">// barA_kernel and barB_kernel can be executed</span>
<span class="comment">// out of order. We need to wait for both of these</span>
<span class="comment">// kernels to finish execution before barC_kernel</span>
<span class="comment">// starts execution so we enqueue a marker command and</span>
<span class="comment">// then enqueue barC_kernel that waits on the event</span>
<span class="comment">// associated with the marker.</span>
enqueue_marker(q, <span class="integer">2</span>, events, &amp;marker_event);
enqueue_kernel(q,
CLK_ENQUEUE_FLAGS_NO_WAIT,
ndrange,
<span class="integer">1</span>, &amp;marker_event, <span class="predefined-constant">NULL</span>,
^{barC_kernel(...);} );
release_event(events[<span class="integer">0</span>]);
release_event(events[<span class="integer">1</span>]);
release_event(marker_event);
}</code></pre>
</div>
</div>
</div>
</div>
</div>
<div class="sect4">
<h5 id="built-in-functions-helper-functions"><a class="anchor" href="#built-in-functions-helper-functions"></a>6.15.18.9. Built-in Functions - Helper Functions</h5>
<div class="openblock">
<div class="content">
<table id="table-builtin-helper" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 40. Built-in Helper Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Built-in Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Description</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">queue_t <strong>get_default_queue</strong>(void)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the default device queue.
If a default device queue has not been created, <code>CLK_NULL_QUEUE</code> is
returned.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">ndrange_t <strong>ndrange_1D</strong>(size_t <em>global_work_size</em>)<br>
ndrange_t <strong>ndrange_1D</strong>(size_t <em>global_work_size</em>,
size_t <em>local_work_size</em>)<br>
ndrange_t <strong>ndrange_1D</strong>(size_t <em>global_work_offset</em>,
size_t <em>global_work_size</em>, size_t <em>local_work_size</em>)<br>
ndrange_t <strong>ndrange_2D</strong>(const size_t <em>global_work_size</em>[2])<br>
ndrange_t <strong>ndrange_2D</strong>(const size_t <em>global_work_size</em>[2],
const size_t <em>local_work_size</em>[2])<br>
ndrange_t <strong>ndrange_2D</strong>(const size_t <em>global_work_offset</em>[2],
const size_t <em>global_work_size</em>[2],
const size_t <em>local_work_size</em>[2])<br>
ndrange_t <strong>ndrange_3D</strong>(const size_t <em>global_work_size</em>[3])<br>
ndrange_t <strong>ndrange_3D</strong>(const size_t <em>global_work_size</em>[3],
const size_t <em>local_work_size</em>[3])<br>
ndrange_t <strong>ndrange_3D</strong>(const size_t <em>global_work_offset</em>[3],
const size_t <em>global_work_size</em>[3],
const size_t <em>local_work_size</em>[3])</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Builds a 1D, 2D or 3D ND-range descriptor.</p></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
<div class="sect3">
<h4 id="subgroup-functions"><a class="anchor" href="#subgroup-functions"></a>6.15.19. Subgroup Functions</h4>
<div class="openblock">
<div class="content">
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The functionality described in this section <a href="#unified-spec">requires</a>
support for OpenCL C 3.0 or newer and the <code>__opencl_c_subgroups</code> feature.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The table below describes OpenCL C programming language built-in functions that operate on a subgroup level.
These built-in functions must be encountered by all work items in the subgroup executing the kernel.
For the functions below, the generic type name <code>gentype</code> may be one of the
supported built-in scalar data types <code>int</code>, <code>uint</code>, <code>long</code>
<sup class="footnote">[<a id="_footnoteref_77" class="footnote" href="#_footnotedef_77" title="View footnote.">77</a>]</sup>, <code>ulong</code>, <code>half</code> <sup class="footnote">[<a id="_footnoteref_78" class="footnote" href="#_footnotedef_78" title="View footnote.">78</a>]</sup>,
<code>float</code>, and <code>double</code> <sup class="footnote">[<a id="_footnoteref_79" class="footnote" href="#_footnotedef_79" title="View footnote.">79</a>]</sup>.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<caption class="title">Table 41. Built-in Subgroup Collective Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top"><strong>Function</strong></th>
<th class="tableblock halign-left valign-top"><strong>Description</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>sub_group_all</strong> (int <em>predicate</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Evaluates <em>predicate</em> for all work items in the subgroup and returns a
non-zero value if <em>predicate</em> evaluates to non-zero for all work items in
the subgroup.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">int <strong>sub_group_any</strong> (int <em>predicate</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Evaluates <em>predicate</em> for all work items in the subgroup and returns a
non-zero value if <em>predicate</em> evaluates to non-zero for any work items in
the subgroup.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>sub_group_broadcast</strong> (<br>
gentype <em>x</em>, uint <em>sub_group_local_id</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Broadcasts the value of <em>x</em> for the work item identified by
<em>sub_group_local_id</em> (value returned by <strong>get_sub_group_local_id</strong>) to all
work items in the subgroup.</p>
<p class="tableblock"> <em>sub_group_local_id</em> must be the same value for all work items in the
subgroup.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>sub_group_reduce_&lt;op&gt;</strong> (<br>
gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the result of the reduction operation specified by <strong>&lt;op&gt;</strong> for all values of
<em>x</em> specified by work items in a subgroup.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>sub_group_scan_exclusive_&lt;op&gt;</strong> (<br>
gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Performs an exclusive scan operation specified by <strong>&lt;op&gt;</strong> of all values specified
by work items in a subgroup.
The scan results are returned for each work item.</p>
<p class="tableblock"> The scan order is defined by increasing subgroup local ID within the
subgroup.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">gentype <strong>sub_group_scan_inclusive_&lt;op&gt;</strong> (<br>
gentype <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Performs an inclusive scan operation specified by <strong>&lt;op&gt;</strong> of all values specified
by work items in a subgroup.
The scan results are returned for each work item.</p>
<p class="tableblock"> The scan order is defined by increasing subgroup local ID within the
subgroup.</p></td>
</tr>
</tbody>
</table>
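<div class="paragraph">
<p>The semantics of <strong>sub_group_all</strong>, <strong>sub_group_any</strong> and <strong>sub_group_broadcast</strong> can be sketched as a sequential host-side model in plain C.
This is only an illustration under stated assumptions: the <code>model_</code> functions are hypothetical names, element <em>i</em> of each array stands for the argument supplied by the work item whose subgroup local ID is <em>i</em>, and the model returns 1 where the built-ins are only required to return a non-zero value.</p>
</div>

```c
#include <stddef.h>

/* Non-zero only if every work item supplied a non-zero predicate. */
int model_sub_group_all(const int *predicate, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (predicate[i] == 0) {
            return 0;
        }
    }
    return 1;
}

/* Non-zero if at least one work item supplied a non-zero predicate. */
int model_sub_group_any(const int *predicate, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (predicate[i] != 0) {
            return 1;
        }
    }
    return 0;
}

/* Every work item receives the x of the work item with the given local ID.
 * The caller must pass sub_group_local_id < n. */
int model_sub_group_broadcast(const int *x, size_t n, size_t sub_group_local_id)
{
    (void)n; /* only documents the valid range of sub_group_local_id */
    return x[sub_group_local_id];
}
```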
<div class="paragraph">
<p>The <strong>&lt;op&gt;</strong> in <strong>sub_group_reduce_&lt;op&gt;</strong>, <strong>sub_group_scan_inclusive_&lt;op&gt;</strong> and <strong>sub_group_scan_exclusive_&lt;op&gt;</strong> defines the operator and can be <strong>add</strong>, <strong>min</strong> or <strong>max</strong>.</p>
</div>
<div class="paragraph">
<p>The exclusive scan operation takes a binary operator <strong>op</strong> with an identity I and <em>n</em> elements [a<sub>0</sub>, a<sub>1</sub>, &#8230;&#8203; a<sub>n-1</sub>], where <em>n</em> is the size of the subgroup, and returns [I, a<sub>0</sub>, (a<sub>0</sub> <strong>op</strong> a<sub>1</sub>), &#8230;&#8203; (a<sub>0</sub> <strong>op</strong> a<sub>1</sub> <strong>op</strong> &#8230;&#8203; <strong>op</strong> a<sub>n-2</sub>)].</p>
</div>
<div class="paragraph">
<p>The inclusive scan operation takes a binary operator <strong>op</strong> with an identity I and <em>n</em> elements [a<sub>0</sub>, a<sub>1</sub>, &#8230;&#8203; a<sub>n-1</sub>], where <em>n</em> is the size of the subgroup, and returns [a<sub>0</sub>, (a<sub>0</sub> <strong>op</strong> a<sub>1</sub>), &#8230;&#8203; (a<sub>0</sub> <strong>op</strong> a<sub>1</sub> <strong>op</strong> &#8230;&#8203; <strong>op</strong> a<sub>n-1</sub>)].</p>
</div>
<div class="paragraph">
<p>If <strong>op</strong> = <strong>add</strong>, the identity I is 0.
If <strong>op</strong> = <strong>min</strong>, the identity I is <code>INT_MAX</code>, <code>UINT_MAX</code>, <code>LONG_MAX</code>, <code>ULONG_MAX</code>, for <code>int</code>, <code>uint</code>, <code>long</code>, <code>ulong</code> types and is <code>+INF</code> for
floating-point types.
Similarly, if <strong>op</strong> = <strong>max</strong>, the identity I is <code>INT_MIN</code>, 0, <code>LONG_MIN</code>, 0 and <code>-INF</code>, respectively.</p>
</div>
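<div class="paragraph">
<p>The exclusive and inclusive scan definitions, together with the identity values, can be sketched as a sequential reference model in plain C.
The names <code>binary_op</code>, <code>op_add</code>, <code>op_min</code>, <code>op_max</code>, <code>model_scan_exclusive</code> and <code>model_scan_inclusive</code> are hypothetical and exist only for this illustration; the model covers <code>int</code> only and processes elements in order of increasing subgroup local ID.</p>
</div>

```c
#include <limits.h>
#include <stddef.h>

typedef int (*binary_op)(int, int);

int op_add(int a, int b) { return a + b; } /* identity: 0       */
int op_min(int a, int b) { return a < b ? a : b; } /* identity: INT_MAX */
int op_max(int a, int b) { return a > b ? a : b; } /* identity: INT_MIN */

/* out = [I, a0, (a0 op a1), ..., (a0 op ... op a(n-2))] */
void model_scan_exclusive(const int *in, int *out, size_t n,
                          binary_op op, int identity)
{
    int acc = identity;
    for (size_t i = 0; i < n; ++i) {
        out[i] = acc;          /* result excludes the element itself */
        acc = op(acc, in[i]);
    }
}

/* out = [a0, (a0 op a1), ..., (a0 op ... op a(n-1))] */
void model_scan_inclusive(const int *in, int *out, size_t n, binary_op op)
{
    if (n == 0) {
        return;
    }
    int acc = in[0];
    out[0] = acc;
    for (size_t i = 1; i < n; ++i) {
        acc = op(acc, in[i]);  /* result includes the element itself */
        out[i] = acc;
    }
}
```

<div class="paragraph">
<p>With the input [3, 1, 4, 1, 5], the exclusive <strong>add</strong> scan in this model yields [0, 3, 4, 8, 9] and the inclusive <strong>add</strong> scan yields [3, 4, 8, 9, 14]; the exclusive <strong>min</strong> scan starts with <code>INT_MAX</code>.</p>
</div>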
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>The order of floating-point operations is not guaranteed for the <strong>sub_group_reduce_&lt;op&gt;</strong>, <strong>sub_group_scan_inclusive_&lt;op&gt;</strong> and <strong>sub_group_scan_exclusive_&lt;op&gt;</strong> built-in functions that operate on <code>half</code>, <code>float</code> and <code>double</code> data types.
The order of these floating-point operations is also non-deterministic for a given subgroup.</p>
</div>
</td>
</tr>
</table>
</div>
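<div class="paragraph">
<p>The reason the order of operations matters is that floating-point addition is not associative.
The plain C sketch below, which assumes IEEE 754 single precision on the host, adds the same three values in two different orders; the function names are hypothetical and chosen only for this illustration.</p>
</div>

```c
/* (a + b) + c, with the intermediate sum rounded to float.
 * The volatile qualifier keeps the compiler from carrying
 * the intermediate value in a wider format. */
float sum_left_first(float a, float b, float c)
{
    volatile float partial = a + b;
    return partial + c;
}

/* a + (b + c), with the intermediate sum rounded to float. */
float sum_right_first(float a, float b, float c)
{
    volatile float partial = b + c;
    return a + partial;
}
```

<div class="paragraph">
<p>For the inputs 1.0e8f, -1.0e8f and 1.0f, the first ordering returns 1.0f, while the second returns 0.0f because -1.0e8f + 1.0f rounds back to -1.0e8f.
A floating-point reduction or scan may therefore return different results depending on the order the implementation chooses.</p>
</div>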
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The functionality described in the following table <a href="#unified-spec">requires</a> support for OpenCL C 3.0 or newer and the <code>__opencl_c_subgroups</code>
and <code>__opencl_c_pipes</code> features.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The following table describes built-in pipe functions that operate at a
subgroup level.
These built-in functions must be encountered by all work items in a subgroup
executing the kernel with the same argument values, otherwise the behavior
is undefined.
We use the generic type name <code>gentype</code> to indicate that the built-in OpenCL C
scalar or vector integer or floating-point data types, or any user-defined
type built from these scalar and vector data types, can be used as the type
for the arguments to the pipe functions listed in the following table.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<caption class="title">Table 42. Built-in Subgroup Pipe Functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top"><strong>Function</strong></th>
<th class="tableblock halign-left valign-top"><strong>Description</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">reserve_id_t <strong>sub_group_reserve_read_pipe</strong> (<br>
read_only pipe gentype <em>pipe</em>,<br>
uint <em>num_packets</em>)</p>
<p class="tableblock"> reserve_id_t <strong>sub_group_reserve_write_pipe</strong> (<br>
write_only pipe gentype <em>pipe</em>,<br>
uint <em>num_packets</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Reserve <em>num_packets</em> entries for reading from or writing to <em>pipe</em>.
Returns a valid non-zero reservation ID if the reservation is successful
and 0 otherwise.</p>
<p class="tableblock"> The reserved pipe entries are referred to by indices that go from 0 &#8230;&#8203;
<em>num_packets</em> - 1.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">void <strong>sub_group_commit_read_pipe</strong> (<br>
read_only pipe gentype <em>pipe</em>,<br>
reserve_id_t <em>reserve_id</em>)</p>
<p class="tableblock"> void <strong>sub_group_commit_write_pipe</strong> (<br>
write_only pipe gentype <em>pipe</em>,<br>
reserve_id_t <em>reserve_id</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Indicates that all reads and writes to <em>num_packets</em> associated with
reservation <em>reserve_id</em> are completed.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>Note: Reservations made by a subgroup are ordered in the pipe as they are
ordered in the program.
Reservations made by different subgroups that belong to the same work group
can be ordered using subgroup synchronization.
The order of subgroup based reservations that belong to different work
groups is implementation defined.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
The functionality described in the following table <a href="#unified-spec">requires</a> support for OpenCL C 3.0 or newer and the <code>__opencl_c_subgroups</code>
and <code>__opencl_c_device_enqueue</code> features.
</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>The following table describes built-in functions to query subgroup
information for a block to be enqueued.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<caption class="title">Table 43. Built-in Subgroup Kernel Query Functions</caption>
<colgroup>
<col style="width: 55.5555%;">
<col style="width: 44.4445%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top"><strong>Built-in Function</strong></th>
<th class="tableblock halign-left valign-top"><strong>Description</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">uint <strong>get_kernel_sub_group_count_for_ndrange</strong> (<br>
const ndrange_t <em>ndrange</em>,<br>
void (^block)(void));</p>
<p class="tableblock"> uint <strong>get_kernel_sub_group_count_for_ndrange</strong> (<br>
const ndrange_t <em>ndrange</em>,<br>
void (^block)(local void *, &#8230;&#8203;));</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the number of subgroups in each work group of the dispatch (except
for the last in cases where the global size does not divide cleanly into
work groups) given the combination of the passed ndrange and block.</p>
<p class="tableblock"> <em>block</em> specifies the block to be enqueued.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">uint <strong>get_kernel_max_sub_group_size_for_ndrange</strong> (<br>
const ndrange_t <em>ndrange</em>,<br>
void (^block)(void));<br></p>
<p class="tableblock"> uint <strong>get_kernel_max_sub_group_size_for_ndrange</strong> (<br>
const ndrange_t <em>ndrange</em>,<br>
void (^block)(local void *, &#8230;&#8203;));</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Returns the maximum subgroup size for a block.</p></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="opencl-numerical-compliance"><a class="anchor" href="#opencl-numerical-compliance"></a>7. OpenCL Numerical Compliance</h2>
<div class="sectionbody">
<div class="paragraph">
<p>This section describes features of the <a href="#C99-spec">C99</a> and IEEE 754
standards that must be supported by all OpenCL compliant devices.</p>
</div>
<div class="paragraph">
<p>This section describes the functionality that must be supported by all
OpenCL devices for single precision floating-point numbers.
Currently, only single precision floating-point is a requirement.
Double precision floating-point is an optional feature.</p>
</div>
<div class="sect2">
<h3 id="rounding-modes-1"><a class="anchor" href="#rounding-modes-1"></a>7.1. Rounding Modes</h3>
<div class="paragraph">
<p>Floating-point calculations may be carried out internally with extra
precision and then rounded to fit into the destination type.
IEEE 754 defines four possible rounding modes:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Round to nearest even</p>
</li>
<li>
<p>Round toward +∞</p>
</li>
<li>
<p>Round toward -∞</p>
</li>
<li>
<p>Round toward zero</p>
</li>
</ul>
</div>
<div class="paragraph">
<p><em>Round to nearest even</em> is currently the only rounding mode required by the
OpenCL specification for single precision and double precision operations and is
therefore the default rounding mode
<sup class="footnote">[<a id="_footnoteref_80" class="footnote" href="#_footnotedef_80" title="View footnote.">80</a>]</sup>.
In addition, only static selection of rounding mode is supported.
Dynamically reconfiguring the rounding modes as specified by the IEEE 754
spec is unsupported.</p>
</div>
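<div class="paragraph">
<p>To illustrate what round to nearest even means when a result has more significant bits than the destination can hold, the plain C sketch below drops the low bits of an integer significand and rounds ties to the even neighbor.
The function name <code>round_nearest_even_shift</code> is hypothetical, and the sketch models only the rounding rule itself, not a complete floating-point operation.</p>
</div>

```c
#include <stdint.h>

/* Drop the low drop_bits bits of value, rounding to nearest with ties
 * going to the even quotient. Requires 0 <= drop_bits < 64. */
uint64_t round_nearest_even_shift(uint64_t value, unsigned drop_bits)
{
    if (drop_bits == 0) {
        return value;
    }
    uint64_t quotient = value >> drop_bits;
    uint64_t remainder = value & (((uint64_t)1 << drop_bits) - 1);
    uint64_t half = (uint64_t)1 << (drop_bits - 1);

    if (remainder > half) {
        quotient += 1;            /* above the midpoint: round up      */
    } else if (remainder == half && (quotient & 1) != 0) {
        quotient += 1;            /* exact tie: move to the even value */
    }
    return quotient;
}
```

<div class="paragraph">
<p>Dropping one bit from 16777217 (2<sup>24</sup> + 1) is a tie and keeps the even quotient 8388608, which corresponds to 16777216; dropping one bit from 16777219 is also a tie and moves up to the even quotient 8388610, which corresponds to 16777220.
These are the values one would expect when converting those integers to single precision under round to nearest even.</p>
</div>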
</div>
<div class="sect2">
<h3 id="inf-nan-and-denormalized-numbers"><a class="anchor" href="#inf-nan-and-denormalized-numbers"></a>7.2. INF, NaN and Denormalized Numbers</h3>
<div class="paragraph">
<p><code>INF</code> and NaNs must be supported.
Support for signaling NaNs is not required.</p>
</div>
<div class="paragraph">
<p>Support for denormalized numbers with single precision floating-point is
optional.
Denormalized single precision floating-point numbers passed as input or
produced as the output of single precision floating-point operations such as
add, sub, mul, divide, and the functions defined in <a href="#math-functions">math
functions</a>, <a href="#common-functions">common functions</a>, and
<a href="#geometric-functions">geometric functions</a> may be flushed to zero.</p>
</div>
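<div class="paragraph">
<p>A host-side sketch of what flushing to zero means for single precision is given below in plain C.
The function names are hypothetical, and preserving the sign of the flushed value is an assumption of this sketch rather than something stated in the text above.</p>
</div>

```c
#include <float.h>

/* A denormalized float is non-zero and smaller in magnitude than FLT_MIN,
 * the smallest normalized value. NaN fails the comparisons and is not
 * reported as denormal. */
int is_denormal_float(float x)
{
    return x != 0.0f && x > -FLT_MIN && x < FLT_MIN;
}

/* Model of a flush-to-zero device: denormal values become zero,
 * all other values pass through unchanged. Keeping the sign of the
 * input on the zero result is an assumption of this sketch. */
float flush_denormal_to_zero(float x)
{
    if (is_denormal_float(x)) {
        return x < 0.0f ? -0.0f : 0.0f;
    }
    return x;
}
```

<div class="paragraph">
<p>Code that must behave the same on devices with and without denormal support can apply such a model to its reference results before comparing them with device output.</p>
</div>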
</div>
<div class="sect2">
<h3 id="floating-point-exceptions"><a class="anchor" href="#floating-point-exceptions"></a>7.3. Floating-Point Exceptions</h3>
<div class="paragraph">
<p>Floating-point exceptions are disabled in OpenCL.
The result of a floating-point exception must match the IEEE 754 spec for
the exceptions not enabled case.
Whether and when the implementation sets floating-point flags or raises
floating-point exceptions is implementation-defined.
This standard provides no method for querying, clearing or setting
floating-point flags or trapping raised exceptions.
Because trap mechanisms perform poorly and are not portable, and because
servicing precise exceptions in a vector context is impractical
(especially on heterogeneous hardware), such features are discouraged.</p>
</div>
<div class="paragraph">
<p>Implementations that nevertheless support such operations through an
extension to the standard shall initialize with all exception flags cleared
and the exception masks set so that exceptions raised by arithmetic
operations do not trigger a trap to be taken.
However, if the underlying work-item is reused by the implementation, the
implementation is not responsible for re-clearing the flags or resetting
exception masks to default values before entering the kernel.
That is to say that kernels that do not inspect flags or enable traps are
licensed to expect that their arithmetic will not trigger a trap.
Those kernels that do examine flags or enable traps are responsible for
clearing flag state and disabling all traps before returning control to the
implementation.
Whether or when the underlying work-item (and accompanying global
floating-point state if any) is reused is implementation-defined.</p>
</div>
<div class="paragraph">
<p>The expressions <strong>math_errorhandling</strong> and <code>MATH_ERREXCEPT</code> are reserved for
use by this standard, but not defined.
Implementations that extend this specification with support for
floating-point exceptions shall define <strong>math_errorhandling</strong> and
<code>MATH_ERREXCEPT</code> per <a href="#C99-spec">TC2 to the C99 Specification</a>.</p>
</div>
</div>
<div class="sect2">
<h3 id="relative-error-as-ulps"><a class="anchor" href="#relative-error-as-ulps"></a>7.4. Relative Error as ULPs</h3>
<div class="paragraph">
<p>In this section we discuss the maximum relative error defined as ulp (units
in the last place).
Addition, subtraction, multiplication, fused multiply-add and conversion
between integer and a single precision floating-point format are IEEE 754
compliant and are therefore correctly rounded.
Conversion between floating-point formats and
<a href="#explicit-conversions">explicit conversions</a> must be correctly rounded.</p>
</div>
<div class="paragraph">
<p>The ULP is defined as follows:</p>
</div>
<div class="exampleblock">
<div class="content">
<div class="paragraph">
<p>If <em>x</em> is a real number that lies between two finite consecutive
floating-point numbers <em>a</em> and <em>b</em>, without being equal to one of them, then
ulp(<em>x</em>) = |<em>b</em> - <em>a</em>|, otherwise ulp(<em>x</em>) is the distance between the two
non-equal finite floating-point numbers nearest <em>x</em>.
Moreover, ulp(NaN) is NaN.</p>
</div>
</div>
</div>
<div class="paragraph">
<p><em>Attribution: This definition was taken with consent from Jean-Michel Muller
with slight clarification for behavior at zero.</em></p>
</div>
<div class="exampleblock">
<div class="content">
<div class="paragraph">
<p>Jean-Michel Muller. On the definition of ulp(x). RR-5504, INRIA. 2005, pp.16. &lt;inria-00070503&gt;
Currently hosted at
<a href="https://hal.inria.fr/inria-00070503/document">https://hal.inria.fr/inria-00070503/document</a>.</p>
</div>
</div>
</div>
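<div class="paragraph">
<p>The ULP definition can be sketched in plain C for single precision results measured against a double precision reference.
This is a minimal sketch under stated assumptions: the host uses IEEE 754 binary32 and binary64, the reference satisfies |<em>x</em>| ≤ <code>FLT_MAX</code>, and when <em>x</em> is itself a floating-point number the "two nearest non-equal numbers" are read as <em>x</em> and its neighbor toward zero.
All function names are hypothetical.</p>
</div>

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret between float and its bit pattern without aliasing issues. */
float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

uint32_t bits_from_float(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* ulp(x) with respect to single precision, for |x| <= FLT_MAX. */
double ulp_single(double x)
{
    if (x != x) {
        return x;                          /* ulp(NaN) is NaN */
    }
    double ax = x < 0.0 ? -x : x;          /* float spacing is symmetric */
    float f = (float)ax;                   /* a float adjacent to ax */
    uint32_t bits = bits_from_float(f);

    if ((double)f == ax) {                 /* x is itself a float */
        if (bits == 0) {
            return (double)float_from_bits(1);   /* zero and smallest denormal */
        }
        return ax - (double)float_from_bits(bits - 1);
    }
    if ((double)f < ax) {                  /* a = f, b = next float up */
        return (double)float_from_bits(bits + 1) - (double)f;
    }
    return (double)f - (double)float_from_bits(bits - 1); /* b = f */
}

/* Error of a float result, in ULPs of the reference value. */
double ulp_error(float result, double reference)
{
    double diff = (double)result - reference;
    if (diff < 0.0) {
        diff = -diff;
    }
    return diff / ulp_single(reference);
}
```

<div class="paragraph">
<p>For example, with a reference of 1 + 2<sup>-25</sup>, which lies strictly between the floats 1 and 1 + 2<sup>-23</sup>, the sketch gives ulp = 2<sup>-23</sup>, and a result of 1 + 2<sup>-23</sup> has an error of 0.75 ulp.</p>
</div>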
<div class="paragraph">
<p>The following table describes the minimum accuracy of single precision
floating-point arithmetic operations given as ULP values.
The reference value used to compute the ULP value of an arithmetic operation
is the infinitely precise result.
0 ulp is used for math functions that do not require rounding.</p>
</div>
<table id="table-ulp-float-math" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 44. ULP values for single precision built-in math functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Min Accuracy - ULP values</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> + <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> - <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> * <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>1.0 / <em>x</em></strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2.5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> / <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2.5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>acos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>acospi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>asin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>asinpi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atan2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 6 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atanpi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atan2pi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 6 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>acosh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>asinh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atanh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cbrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>ceil</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>clamp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>copysign</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cosh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cospi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cross</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">absolute error tolerance of 'max * max * (3 * FLT_EPSILON)' per vector component, where <em>max</em> is the maximum input operand magnitude</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>degrees</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>distance</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2.5 + 2n ulp, for gentype with vector width <em>n</em></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>dot</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">absolute error tolerance of 'max * max * (2n - 1) * FLT_EPSILON', for vector width <em>n</em> and maximum input operand magnitude <em>max</em> across all vector components</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>erfc</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>erf</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>exp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>exp2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>exp10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>expm1</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fabs</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fdim</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>floor</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fma</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fmax</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fmin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fmod</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fract</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>frexp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>hypot</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>ilogb</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>length</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2.75 + 0.5n ulp, for gentype with vector width <em>n</em></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>ldexp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>lgamma</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Undefined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>lgamma_r</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Undefined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log1p</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>logb</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>mad</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implemented either as a correctly rounded fma or
as a multiply followed by an add, both of which are
correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>max</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>maxmag</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>min</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>minmag</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>mix</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">absolute error tolerance of 1e-3</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>modf</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>nan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>nextafter</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>normalize</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 + n ulp, for gentype with vector width <em>n</em></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>pow</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>pown</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>powr</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>radians</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>remainder</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>remquo</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>rint</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>rootn</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>round</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>rsqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sign</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sincos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp for sine and cosine values</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sinh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sinpi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>smoothstep</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">absolute error tolerance of 1e-5</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>step</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tanh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tanpi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 6 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tgamma</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>trunc</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_cos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_divide</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_exp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_exp2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_exp10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_log</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_log2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_log10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_powr</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_recip</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_rsqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_sin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_sqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_tan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fast_distance</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8191.5 + 2n ulp, for gentype with vector width <em>n</em></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fast_length</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8191.5 + n ulp, for gentype with vector width <em>n</em></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fast_normalize</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 + n ulp, for gentype with vector width <em>n</em></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_cos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_divide</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_exp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_exp2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_exp10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_log</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_log2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_log10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_powr</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_recip</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_rsqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_sin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_sqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_tan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
</tbody>
</table>
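<div class="paragraph">
<p>As an illustration only, the absolute error tolerance in the <strong>dot</strong> row of the table above can be checked on the host roughly as follows.
This is a minimal sketch in plain C, not part of the specification: the helper names <code>dot_tolerance</code> and <code>dot_within_tolerance</code> are hypothetical, and a double-precision sum is assumed to be an adequate stand-in for the infinitely precise result.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code class="language-c" data-lang="c">#include &lt;float.h&gt;
#include &lt;math.h&gt;

/* Hypothetical helper: absolute error tolerance for dot on vectors of
   width n, following the table entry
   max * max * (2n - 1) * FLT_EPSILON,
   where max is the largest input magnitude across all components. */
static float dot_tolerance(const float *a, const float *b, int n)
{
    float max_mag = 0.0f;
    for (int i = 0; i &lt; n; ++i) {
        max_mag = fmaxf(max_mag, fmaxf(fabsf(a[i]), fabsf(b[i])));
    }
    return max_mag * max_mag * (float)(2 * n - 1) * FLT_EPSILON;
}

/* Hypothetical helper: returns 1 if result lies within the tolerance of a
   double-precision reference dot product, and 0 otherwise.
   Assumption: the double sum approximates the infinitely precise result. */
static int dot_within_tolerance(float result, const float *a,
                                const float *b, int n)
{
    double ref = 0.0;
    for (int i = 0; i &lt; n; ++i) {
        ref += (double)a[i] * (double)b[i];
    }
    return fabs((double)result - ref) &lt;= (double)dot_tolerance(a, b, n);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>For example, with the four-component inputs (1, 2, 3, 4) and (0.5, -1, 2, 0.25), <em>max</em> is 4 and <em>n</em> is 4, so the tolerance is 4 * 4 * 7 * FLT_EPSILON = 112 * FLT_EPSILON, or about 1.3e-5.
The exact dot product of these inputs is 5.5, so a device result of 5.5 passes the check while 5.6 does not.</p>
</div>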
<div class="paragraph">
<p>The following table describes the minimum accuracy of single-precision
floating-point arithmetic operations, given as ULP values, for the embedded
profile.
The reference value used to compute the ULP value of an arithmetic operation
is the infinitely precise result.
0 ulp is used for math functions that do not require rounding.</p>
</div>
<table id="table-ulp-embedded" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 45. ULP values for the embedded profile</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Min Accuracy - ULP values</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> + <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> - <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> * <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">1.0 / <em>x</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> / <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>acos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>acospi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>asin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>asinpi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atan2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 6 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atanpi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atan2pi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 6 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>acosh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>asinh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atanh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cbrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>ceil</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>clamp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>copysign</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cosh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cospi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cross</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>degrees</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>distance</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>dot</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>erfc</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>erf</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>exp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>exp2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>exp10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>expm1</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fabs</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fdim</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>floor</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fma</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fmax</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fmin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fmod</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fract</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>frexp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>hypot</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>ilogb</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>ldexp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>length</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log1p</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>logb</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>mad</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Any value allowed (infinite ulp)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>max</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>maxmag</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>min</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>minmag</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>mix</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>modf</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>nan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>normalize</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>nextafter</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>pow</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>pown</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>powr</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>radians</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>remainder</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>remquo</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>rint</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>rootn</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>round</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>rsqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sign</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sincos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp for sine and cosine values</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sinh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sinpi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>smoothstep</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>step</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tanh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tanpi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 6 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tgamma</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>trunc</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_cos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_divide</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_exp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_exp2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_exp10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_log</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_log2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_log10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_powr</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_recip</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_rsqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_sin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_sqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>half_tan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 8192 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fast_distance</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fast_length</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fast_normalize</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_cos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_divide</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_exp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_exp2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_exp10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_log</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_log2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_log10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_powr</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_recip</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_rsqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_sin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_sqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>native_tan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>The <a href="#table-float-ulp-relaxed">following table</a> describes the minimum accuracy
of commonly used single precision floating-point arithmetic operations given
as ULP values if the <code>-cl-unsafe-math-optimizations</code> compiler option is
specified when compiling or building an OpenCL program.
For derived implementations, the operations used in the derivation may
themselves be relaxed according to the following table.
The minimum accuracy of math functions not defined in the following table
when the <code>-cl-unsafe-math-optimizations</code> compiler option is specified is as defined
in <a href="#table-ulp-float-math">ULP values for single precision built-in math functions</a> when operating in the full profile, and as
defined in <a href="#table-ulp-embedded">ULP values for the embedded profile</a> when operating in the embedded profile.
The reference value used to compute the ULP value of an arithmetic operation
is the infinitely precise result.
0 ulp is used for math functions that do not require rounding.</p>
</div>
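<div class="paragraph">
<p>As an illustration of how an error expressed in ULP can be measured, the following is a minimal sketch in host C, not a normative definition.
It assumes that a double precision value is an adequate stand-in for the infinitely precise result, that all inputs are finite, and that one ulp is the spacing of single precision values in the binade containing the reference (clamped to the subnormal spacing 2<sup>-149</sup>).
The names <code>float_ulp_at</code> and <code>ulp_error</code> are hypothetical and are not part of OpenCL C.</p>
</div>

```c
#include <math.h>

/* Spacing of single precision values around `reference`.
   Simplification (an assumption of this sketch): uses the binade of the
   reference and clamps to the subnormal spacing 2^-149. Finite inputs only. */
double float_ulp_at(double reference)
{
    int exponent;
    int e;

    if (reference == 0.0)
        return ldexp(1.0, -149);

    /* |reference| = m * 2^exponent with m in [0.5, 1) */
    frexp(fabs(reference), &exponent);
    e = exponent - 1;            /* floor(log2(|reference|)) */
    if (e < -126)
        e = -126;                /* below the smallest normal exponent */
    return ldexp(1.0, e - 23);   /* 23 fraction bits in single precision */
}

/* Error of `result` relative to `reference`, expressed in float ulps. */
double ulp_error(float result, double reference)
{
    return fabs((double)result - reference) / float_ulp_at(reference);
}
```

<div class="paragraph">
<p>Under these assumptions, a function listed as ≤ 4 ulp would be expected to satisfy <code>ulp_error(result, reference) &lt;= 4.0</code> for every tested input.</p>
</div>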
<div class="paragraph">
<p>Defined minimum accuracy of single precision floating-point arithmetic
operations and builtins with <code>-cl-unsafe-math-optimizations</code> <a href="#unified-spec">requires</a> support for OpenCL C 2.0 or newer.</p>
</div>
<table id="table-float-ulp-relaxed" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 46. ULP values for single precision built-in math functions with unsafe math optimizations in the full and embedded profiles</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Min Accuracy - ULP values</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">1.0 / <em>x</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2.5 ulp for <em>x</em> in the domain of 2<sup>-126</sup> to 2<sup>126</sup> for the full
profile, and ≤ 3 ulp for the embedded profile.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> / <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2.5 ulp for <em>x</em> in the domain of 2<sup>-62</sup> to 2<sup>62</sup> and <em>y</em> in the
domain of 2<sup>-62</sup> to 2<sup>62</sup> for the full profile, and ≤ 3 ulp for
the embedded profile.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>acos</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4096 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>acospi</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implemented as <strong>acos</strong>(<em>x</em>) * <code>M_1_PI_F</code>.
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>asin</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4096 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>asinpi</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implemented as <strong>asin</strong>(<em>x</em>) * <code>M_1_PI_F</code>.
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atan</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4096 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atan2</strong>(<em>y</em>, <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implemented as <strong>atan</strong>(<em>y</em> / <em>x</em>) for <em>x</em> &gt; 0, <strong>atan</strong>(<em>y</em> / <em>x</em>) +
<code>M_PI_F</code> for <em>x</em> &lt; 0 and <em>y</em> &gt; 0, and <strong>atan</strong>(<em>y</em> / <em>x</em>) -
<code>M_PI_F</code> for <em>x</em> &lt; 0 and <em>y</em> &lt; 0.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atanpi</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implemented as <strong>atan</strong>(<em>x</em>) * <code>M_1_PI_F</code>.
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atan2pi</strong>(<em>y</em>, <em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implemented as <strong>atan2</strong>(<em>y</em>, <em>x</em>) * <code>M_1_PI_F</code>.
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>acosh</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implemented as <strong>log</strong>(<em>x</em> + <strong>sqrt</strong>(<em>x</em> * <em>x</em> - 1)).</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>asinh</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implemented as <strong>log</strong>(<em>x</em> + <strong>sqrt</strong>(<em>x</em> * <em>x</em> + 1)).</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cbrt</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implemented as <strong>rootn</strong>(<em>x</em>, 3).
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cos</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">For <em>x</em> in the domain [-π, π], the maximum absolute error
is ≤ 2<sup>-11</sup> and larger otherwise.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cosh</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Defined for <em>x</em> in the domain [-88,88] and implemented as 0.5f *
(<strong>exp</strong>(<em>x</em>) + <strong>exp</strong>(-<em>x</em>)).
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cospi</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">For <em>x</em> in the domain [-1, 1], the maximum absolute error is ≤
2<sup>-11</sup> and larger otherwise.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>exp</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 + <strong>floor</strong>(<strong>fabs</strong>(2 * <em>x</em>)) ulp for the full profile, and ≤
4 ulp for the embedded profile.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>exp2</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 + <strong>floor</strong>(<strong>fabs</strong>(2 * <em>x</em>)) ulp for the full profile, and ≤
4 ulp for the embedded profile.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>exp10</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Derived implementations implement this as <strong>exp2</strong>(<em>x</em> * <strong>log2</strong>(10)).
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>expm1</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Derived implementations implement this as <strong>exp</strong>(<em>x</em>) - 1.
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">For <em>x</em> in the domain [0.5, 2] the maximum absolute error is ≤
2<sup>-21</sup>; otherwise the maximum error is ≤ 3 ulp for the full profile
and ≤ 4 ulp for the embedded profile.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log2</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">For <em>x</em> in the domain [0.5, 2] the maximum absolute error is ≤
2<sup>-21</sup>; otherwise the maximum error is ≤ 3 ulp for the full profile
and ≤ 4 ulp for the embedded profile.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log10</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">For <em>x</em> in the domain [0.5, 2] the maximum absolute error is ≤
2<sup>-21</sup>; otherwise the maximum error is ≤ 3 ulp for the full profile
and ≤ 4 ulp for the embedded profile.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log1p</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Derived implementations implement this as <strong>log</strong>(<em>x</em> + 1).
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>pow</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Undefined for <em>x</em> = 0 and <em>y</em> = 0.
Undefined for <em>x</em> &lt; 0 and non-integer <em>y</em>.
Undefined for <em>x</em> &lt; 0 and <em>y</em> outside the domain [-2<sup>24</sup>, 2<sup>24</sup>].
For <em>x</em> &gt; 0, or for <em>x</em> &lt; 0 and even <em>y</em>, derived implementations implement
this as <strong>exp2</strong>(<em>y</em> * <strong>log2</strong>(<strong>fabs</strong>(<em>x</em>))).
For <em>x</em> &lt; 0 and odd <em>y</em>, derived implementations implement this as
-<strong>exp2</strong>(<em>y</em> * <strong>log2</strong>(<strong>fabs</strong>(<em>x</em>))).
For <em>x</em> == 0 and nonzero <em>y</em>, derived implementations return zero.
For non-derived implementations, the error is ≤ 8192 ulp.
<sup class="footnote">[<a id="_footnoteref_81" class="footnote" href="#_footnotedef_81" title="View footnote.">81</a>]</sup></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>pown</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Defined only for integer values of <em>y</em>.
Undefined for <em>x</em> = 0 and <em>y</em> = 0.
For <em>x</em> &gt;= 0, or for <em>x</em> &lt; 0 and even <em>y</em>, derived implementations
implement this as <strong>exp2</strong>(<em>y</em> * <strong>log2</strong>(<strong>fabs</strong>(<em>x</em>))).
For <em>x</em> &lt; 0 and odd <em>y</em>, derived implementations implement this as
-<strong>exp2</strong>(<em>y</em> * <strong>log2</strong>(<strong>fabs</strong>(<em>x</em>))).
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>powr</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Defined only for <em>x</em> &gt;= 0.
Undefined for <em>x</em> = 0 and <em>y</em> = 0.
Derived implementations implement this as <strong>exp2</strong>(<em>y</em> * <strong>log2</strong>(<em>x</em>)).
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>rootn</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Defined for <em>x</em> &gt; 0 when <em>y</em> is nonzero; derived implementations
implement this case as <strong>exp2</strong>(<strong>log2</strong>(<em>x</em>) / <em>y</em>).
Defined for <em>x</em> &lt; 0 when <em>y</em> is odd; derived implementations implement
this case as -<strong>exp2</strong>(<strong>log2</strong>(-<em>x</em>) / <em>y</em>).
Defined for <em>x</em> = +/-0 when <em>y</em> &gt; 0; derived implementations
return +0 in this case.
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sin</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">For <em>x</em> in the domain [-π, π], the maximum absolute error is
≤ 2<sup>-11</sup> and larger otherwise.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sincos</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">ulp values as defined for <strong>sin</strong>(<em>x</em>) and <strong>cos</strong>(<em>x</em>)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sinh</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Defined for <em>x</em> in the domain [-88, 88].
For <em>x</em> in [-2<sup>-10</sup>, 2<sup>-10</sup>], derived implementations implement this as <em>x</em>.
For <em>x</em> outside of [-2<sup>-10</sup>, 2<sup>-10</sup>], derived implementations implement this as 0.5f *
(<strong>exp</strong>(<em>x</em>) - <strong>exp</strong>(-<em>x</em>)).
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sinpi</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">For <em>x</em> in the domain [-1, 1], the maximum absolute error is ≤
2<sup>-11</sup> and larger otherwise.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tan</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Derived implementations implement this as <strong>sin</strong>(<em>x</em>) * (<code>1.0f</code> /
<strong>cos</strong>(<em>x</em>)).
For non-derived implementations, the error is ≤ 8192 ulp.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tanpi</strong>(<em>x</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Derived implementations implement this as <strong>tan</strong>(<em>x</em> * <code>M_PI_F</code>).
For non-derived implementations, the error is ≤ 8192 ulp for <em>x</em>
in the domain [-1, 1].</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> * <em>y</em> + <em>z</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implemented either as a correctly rounded <strong>fma</strong> or as a multiply and
an add both of which are correctly rounded.</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>The following table describes the minimum accuracy of double precision
floating-point arithmetic operations given as ULP values.
The reference value used to compute the ULP value of an arithmetic operation
is the infinitely precise result.
0 ulp is used for math functions that do not require rounding.</p>
</div>
<table id="table-ulp-double" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 47. ULP values for double precision built-in math functions</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Function</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Min Accuracy - ULP values</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> + <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> - <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> * <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">1.0 / <em>x</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><em>x</em> / <em>y</em></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>acos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>acospi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>asin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>asinpi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atan2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 6 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atanpi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atan2pi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 6 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>acosh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>asinh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>atanh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cbrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>ceil</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>clamp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>copysign</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cosh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cospi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>cross</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">absolute error tolerance of 'max * max * (3 * FLT_EPSILON)' per vector component, where <em>max</em> is the maximum input operand magnitude</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>degrees</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>distance</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5.5 + 2n ulp, for gentype with vector width <em>n</em></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>dot</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">absolute error tolerance of 'max * max * (2n - 1) * FLT_EPSILON', for vector width <em>n</em> and maximum input operand magnitude <em>max</em> across all vector components</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>erfc</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>erf</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>exp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>exp2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>exp10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>expm1</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fabs</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fdim</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>floor</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fma</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fmax</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fmin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fmod</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>fract</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>frexp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>hypot</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>ilogb</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>length</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5.5 + n ulp, for gentype with vector width <em>n</em></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>ldexp</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log2</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log10</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 3 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>log1p</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>logb</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>mad</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Any value allowed (infinite ulp)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>max</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>maxmag</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>min</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>minmag</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>mix</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>modf</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>nan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>nextafter</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>normalize</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4.5 + n ulp, for gentype with vector width <em>n</em></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>pow</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>pown</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>powr</strong>(<em>x</em>, <em>y</em>)</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>radians</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>remainder</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>remquo</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>rint</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>rootn</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>round</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>rsqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 2 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sign</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sin</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sincos</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp for sine and cosine values</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sinh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sinpi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 4 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>smoothstep</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Implementation-defined</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>step</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>sqrt</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tan</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tanh</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 5 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tanpi</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 6 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>tgamma</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">≤ 16 ulp</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>trunc</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Correctly rounded</p></td>
</tr>
</tbody>
</table>
</div>
<div class="sect2">
<h3 id="edge-case-behavior"><a class="anchor" href="#edge-case-behavior"></a>7.5. Edge Case Behavior</h3>
<div class="paragraph">
<p>The edge case behavior of the <a href="#math-functions">math functions</a> shall
conform to <a href="#C99-spec">sections F.9 and G.6 of the C99 Specification</a>,
except <a href="#additional-requirements-beyond-c99-tc2">where noted below</a>.</p>
</div>
<div class="sect3">
<h4 id="additional-requirements-beyond-c99-tc2"><a class="anchor" href="#additional-requirements-beyond-c99-tc2"></a>7.5.1. Additional Requirements Beyond C99 TC2</h4>
<div class="paragraph">
<p>All functions that return a NaN should return a quiet NaN.</p>
</div>
<div class="paragraph">
<p><strong>half_&lt;funcname&gt;</strong> functions behave identically to the function of the same
name without the <strong>half_</strong> prefix.
They must conform to the same edge case requirements (<a href="#C99-spec">see
sections F.9 and G.6 of the C99 Specification</a>).
For other cases, except where otherwise noted, these single precision
functions are permitted to have up to 8192 ulps of error (as measured in the
single precision result), although better accuracy is encouraged.</p>
</div>
<div class="paragraph">
<p>The usual allowances for <a href="#relative-error-as-ulps">rounding error</a> or
<a href="#edge-case-behavior-in-flush-to-zero-mode">flushing behavior</a> shall not
apply for those values for which <a href="#C99-spec">section F.9 of the C99
Specification</a>, or the <a href="#additional-requirements-beyond-c99-tc2">additional
requirements</a> and <a href="#edge-case-behavior-in-flush-to-zero-mode">edge case
behavior</a> below (and similar sections for other floating-point precisions)
prescribe a result (e.g. <strong>ceil</strong>(-1 &lt; <em>x</em> &lt; 0) returns -0).
Those values shall produce exactly the prescribed answers, and no other.
Where the ± symbol is used, the sign shall be preserved.
For example, <strong>sin</strong>(±0) = ±0 shall be interpreted to mean
<strong>sin</strong>(+0) is +0 and <strong>sin</strong>(-0) is -0.</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><strong>acospi</strong>(1) = +0.</p>
</li>
<li>
<p><strong>acospi</strong>(<em>x</em>) returns a NaN for |<em>x</em>| &gt; 1.</p>
</li>
<li>
<p><strong>asinpi</strong>(±0) = ±0.</p>
</li>
<li>
<p><strong>asinpi</strong>(<em>x</em>) returns a NaN for |<em>x</em>| &gt; 1.</p>
</li>
<li>
<p><strong>atanpi</strong>(±0) = ±0.</p>
</li>
<li>
<p><strong>atanpi</strong>(±∞) = ±0.5.</p>
</li>
<li>
<p><strong>atan2pi</strong>(±0, -0) = ±1.</p>
</li>
<li>
<p><strong>atan2pi</strong>(±0, +0) = ±0.</p>
</li>
<li>
<p><strong>atan2pi</strong>(±0, <em>x</em>) returns ±1 for <em>x</em> &lt; 0.</p>
</li>
<li>
<p><strong>atan2pi</strong>(±0, <em>x</em>) returns ±0 for <em>x</em> &gt; 0.</p>
</li>
<li>
<p><strong>atan2pi</strong>(<em>y</em>, ±0) returns -0.5 for <em>y</em> &lt; 0.</p>
</li>
<li>
<p><strong>atan2pi</strong>(<em>y</em>, ±0) returns 0.5 for <em>y</em> &gt; 0.</p>
</li>
<li>
<p><strong>atan2pi</strong>(±<em>y</em>, -∞) returns ±1 for finite <em>y</em> &gt; 0.</p>
</li>
<li>
<p><strong>atan2pi</strong>(±<em>y</em>, +∞) returns ±0 for finite <em>y</em> &gt; 0.</p>
</li>
<li>
<p><strong>atan2pi</strong>(±∞, <em>x</em>) returns ±0.5 for finite <em>x</em>.</p>
</li>
<li>
<p><strong>atan2pi</strong>(±∞, -∞) returns ±0.75.</p>
</li>
<li>
<p><strong>atan2pi</strong>(±∞, +∞) returns ±0.25.</p>
</li>
<li>
<p><strong>ceil</strong>(-1 &lt; <em>x</em> &lt; 0) returns -0.</p>
</li>
<li>
<p><strong>cospi</strong>(±0) returns 1.</p>
</li>
<li>
<p><strong>cospi</strong>(<em>n</em> + 0.5) is +0 for any integer <em>n</em> where <em>n</em> + 0.5 is
representable.</p>
</li>
<li>
<p><strong>cospi</strong>(±∞) returns a NaN.</p>
</li>
<li>
<p><strong>exp10</strong>(-∞) returns +0.</p>
</li>
<li>
<p><strong>exp10</strong>(+∞) returns +∞.</p>
</li>
<li>
<p><strong>distance</strong>(<em>x</em>, <em>y</em>) calculates the distance from <em>x</em> to <em>y</em> without
overflow or extraordinary precision loss due to underflow.</p>
</li>
<li>
<p><strong>fdim</strong>(any, NaN) returns NaN.</p>
</li>
<li>
<p><strong>fdim</strong>(NaN, any) returns NaN.</p>
</li>
<li>
<p><strong>fmod</strong>(±0, NaN) returns NaN.</p>
</li>
<li>
<p><strong>frexp</strong>(±∞, <em>exp</em>) returns ±∞ and stores 0 in
<em>exp</em>.</p>
</li>
<li>
<p><strong>frexp</strong>(NaN, <em>exp</em>) returns the NaN and stores 0 in <em>exp</em>.</p>
</li>
<li>
<p><strong>fract</strong>(<em>x</em>, <em>iptr</em>) shall not return a value greater than or equal to
1.0, and shall not return a value less than 0.</p>
</li>
<li>
<p><strong>fract</strong>(+0, <em>iptr</em>) returns +0 and +0 in <em>iptr</em>.</p>
</li>
<li>
<p><strong>fract</strong>(-0, <em>iptr</em>) returns -0 and -0 in <em>iptr</em>.</p>
</li>
<li>
<p><strong>fract</strong>(+∞, <em>iptr</em>) returns +0 and +∞ in <em>iptr</em>.</p>
</li>
<li>
<p><strong>fract</strong>(-∞, <em>iptr</em>) returns -0 and -∞ in <em>iptr</em>.</p>
</li>
<li>
<p><strong>fract</strong>(NaN, <em>iptr</em>) returns the NaN and NaN in <em>iptr</em>.</p>
</li>
<li>
<p><strong>length</strong> calculates the length of a vector without overflow or
extraordinary precision loss due to underflow.</p>
</li>
<li>
<p><strong>lgamma_r</strong>(<em>x</em>, <em>signp</em>) returns 0 in <em>signp</em> if <em>x</em> is zero or a
negative integer.</p>
</li>
<li>
<p><strong>nextafter</strong>(-0, <em>y</em> &gt; 0) returns smallest positive denormal value.</p>
</li>
<li>
<p><strong>nextafter</strong>(+0, <em>y</em> &lt; 0) returns smallest negative denormal value.</p>
</li>
<li>
<p><strong>normalize</strong> shall reduce the vector to unit length, pointing in the
same direction without overflow or extraordinary precision loss due to
underflow.</p>
</li>
<li>
<p><strong>normalize</strong>(<em>v</em>) returns <em>v</em> if all elements of <em>v</em> are zero.</p>
</li>
<li>
<p><strong>normalize</strong>(<em>v</em>) returns a vector full of NaNs if any element is a NaN.</p>
</li>
<li>
<p><strong>normalize</strong>(<em>v</em>) for which any element in <em>v</em> is infinite shall proceed
as if the elements in <em>v</em> were replaced as follows:</p>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="keyword">for</span> (i = <span class="integer">0</span>; i &lt; <span class="keyword">sizeof</span>(v) / <span class="keyword">sizeof</span>(v[<span class="integer">0</span>]); i++)
v[i] = isinf(v[i]) ? copysign(<span class="float">1</span><span class="float">.0</span>, v[i]) : <span class="float">0</span><span class="float">.0</span> * v[i];</code></pre>
</div>
</div>
</li>
<li>
<p><strong>pow</strong>(±0, -∞) returns +∞.</p>
</li>
<li>
<p><strong>pown</strong>(<em>x</em>, 0) is 1 for any <em>x</em>, even zero, NaN or infinity.</p>
</li>
<li>
<p><strong>pown</strong>(±0, <em>n</em>) is ±∞ for odd <em>n</em> &lt; 0.</p>
</li>
<li>
<p><strong>pown</strong>(±0, <em>n</em>) is +∞ for even <em>n</em> &lt; 0.</p>
</li>
<li>
<p><strong>pown</strong>(±0, <em>n</em>) is +0 for even <em>n</em> &gt; 0.</p>
</li>
<li>
<p><strong>pown</strong>(±0, <em>n</em>) is ±0 for odd <em>n</em> &gt; 0.</p>
</li>
<li>
<p><strong>powr</strong>(<em>x</em>, ±0) is 1 for finite <em>x</em> &gt; 0.</p>
</li>
<li>
<p><strong>powr</strong>(±0, <em>y</em>) is +∞ for finite <em>y</em> &lt; 0.</p>
</li>
<li>
<p><strong>powr</strong>(±0, -∞) is +∞.</p>
</li>
<li>
<p><strong>powr</strong>(±0, <em>y</em>) is +0 for <em>y</em> &gt; 0.</p>
</li>
<li>
<p><strong>powr</strong>(+1, <em>y</em>) is 1 for finite <em>y</em>.</p>
</li>
<li>
<p><strong>powr</strong>(<em>x</em>, <em>y</em>) returns NaN for <em>x</em> &lt; 0.</p>
</li>
<li>
<p><strong>powr</strong>(±0, ±0) returns NaN.</p>
</li>
<li>
<p><strong>powr</strong>(+∞, ±0) returns NaN.</p>
</li>
<li>
<p><strong>powr</strong>(+1, ±∞) returns NaN.</p>
</li>
<li>
<p><strong>powr</strong>(<em>x</em>, NaN) returns the NaN for <em>x</em> &gt;= 0.</p>
</li>
<li>
<p><strong>powr</strong>(NaN, <em>y</em>) returns the NaN.</p>
</li>
<li>
<p><strong>rint</strong>(-0.5 &lt;= <em>x</em> &lt; 0) returns -0.</p>
</li>
<li>
<p><strong>remquo</strong>(<em>x</em>, <em>y</em>, &amp;<em>quo</em>) returns a NaN and 0 in <em>quo</em> if <em>x</em> is
±∞, or if <em>y</em> is 0 and the other argument is non-NaN, or if
either argument is a NaN.</p>
</li>
<li>
<p><strong>rootn</strong>(±0, <em>n</em>) is ±∞ for odd <em>n</em> &lt; 0.</p>
</li>
<li>
<p><strong>rootn</strong>(±0, <em>n</em>) is +∞ for even <em>n</em> &lt; 0.</p>
</li>
<li>
<p><strong>rootn</strong>(±0, <em>n</em>) is +0 for even <em>n</em> &gt; 0.</p>
</li>
<li>
<p><strong>rootn</strong>(±0, <em>n</em>) is ±0 for odd <em>n</em> &gt; 0.</p>
</li>
<li>
<p><strong>rootn</strong>(<em>x</em>, <em>n</em>) returns a NaN for <em>x</em> &lt; 0 when <em>n</em> is even.</p>
</li>
<li>
<p><strong>rootn</strong>(<em>x</em>, 0) returns a NaN.</p>
</li>
<li>
<p><strong>round</strong>(-0.5 &lt; <em>x</em> &lt; 0) returns -0.</p>
</li>
<li>
<p><strong>sinpi</strong>(±0) returns ±0.</p>
</li>
<li>
<p><strong>sinpi</strong>(+<em>n</em>) returns +0 for positive integers <em>n</em>.</p>
</li>
<li>
<p><strong>sinpi</strong>(-<em>n</em>) returns -0 for negative integers <em>n</em>.</p>
</li>
<li>
<p><strong>sinpi</strong>(±∞) returns a NaN.</p>
</li>
<li>
<p><strong>tanpi</strong>(±0) returns ±0.</p>
</li>
<li>
<p><strong>tanpi</strong>(±∞) returns a NaN.</p>
</li>
<li>
<p><strong>tanpi</strong>(<em>n</em>) is <strong>copysign</strong>(0.0, <em>n</em>) for even integers <em>n</em>.</p>
</li>
<li>
<p><strong>tanpi</strong>(<em>n</em>) is <strong>copysign</strong>(0.0, - <em>n</em>) for odd integers <em>n</em>.</p>
</li>
<li>
<p><strong>tanpi</strong>(<em>n</em> + 0.5) for even integer <em>n</em> is +∞ where <em>n</em> + 0.5 is
representable.</p>
</li>
<li>
<p><strong>tanpi</strong>(<em>n</em> + 0.5) for odd integer <em>n</em> is -∞ where <em>n</em> + 0.5 is
representable.</p>
</li>
<li>
<p><strong>trunc</strong>(-1 &lt; <em>x</em> &lt; 0) returns -0.</p>
</li>
</ul>
</div>
</div>
<div class="sect3">
<h4 id="changes-to-c99-tc2-behavior"><a class="anchor" href="#changes-to-c99-tc2-behavior"></a>7.5.2. Changes to C99 TC2 Behavior</h4>
<div class="paragraph">
<p><strong>modf</strong> behaves as though implemented by:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">gentype modf(gentype value, gentype *iptr)
{
*iptr = trunc( value );
<span class="keyword">return</span> copysign(isinf( value ) ? <span class="float">0</span><span class="float">.0</span> : value - *iptr, value);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p><strong>rint</strong> always rounds according to the round-to-nearest-even rounding
mode, even if the caller is in some other rounding mode.</p>
</div>
</div>
<div class="sect3">
<h4 id="edge-case-behavior-in-flush-to-zero-mode"><a class="anchor" href="#edge-case-behavior-in-flush-to-zero-mode"></a>7.5.3. Edge Case Behavior in Flush To Zero Mode</h4>
<div class="paragraph">
<p>If denormals are flushed to zero, then a function may return one of four
results:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>Any conforming result for non-flush-to-zero mode.</p>
</li>
<li>
<p>If the result given by 1.
is a sub-normal before rounding, it may be flushed to zero.</p>
</li>
<li>
<p>Any non-flushed conforming result for the function if one or more of its
sub-normal operands are flushed to zero.</p>
</li>
<li>
<p>If the result of 3.
is a sub-normal before rounding, the result may be flushed to zero.</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>In each of the above cases, if an operand or result is flushed to zero, the
sign of the zero is undefined.</p>
</div>
<div class="paragraph">
<p>If subnormals are flushed to zero, a device may choose to conform to the
following edge cases for <strong>nextafter</strong> instead of those listed in the
<a href="#additional-requirements-beyond-c99-tc2">additional requirements</a> section.</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><strong>nextafter</strong>(+smallest normal, <em>y</em> &lt; +smallest normal) = +0.</p>
</li>
<li>
<p><strong>nextafter</strong>(-smallest normal, <em>y</em> &gt; -smallest normal) = -0.</p>
</li>
<li>
<p><strong>nextafter</strong>(-0, <em>y</em> &gt; 0) returns smallest positive normal value.</p>
</li>
<li>
<p><strong>nextafter</strong>(+0, <em>y</em> &lt; 0) returns smallest negative normal value.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>For clarity, subnormals or denormals are defined to be the set of
representable numbers in the range 0 &lt; <em>x</em> &lt; <code>TYPE_MIN</code> and <code>-TYPE_MIN</code> &lt;
<em>x</em> &lt; -0.
They do not include ±0.
A non-zero number is said to be sub-normal before rounding if after
normalization, its radix-2 exponent is less than (<code>TYPE_MIN_EXP</code> - 1)
<sup class="footnote">[<a id="_footnoteref_82" class="footnote" href="#_footnotedef_82" title="View footnote.">82</a>]</sup>.</p>
</div>
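<div class="paragraph">
<p>The exponent test in this definition can be sketched for single precision as
follows.
The function names are hypothetical, and a <code>double</code> is assumed to stand in for
the unrounded single precision result.
Since <code>frexp</code> normalizes to a mantissa in [0.5, 1), its exponent is one greater
than the exponent of the 1.<em>f</em> × 2<sup>E</sup> form used here.</p>
</div>

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical check: is a finite value, held in double as a stand-in for an
 * unrounded float result, sub-normal before rounding to float? */
int is_subnormal_before_rounding(double exact)
{
    int e;

    assert(isfinite(exact));            /* the sketch only covers finite values */
    if (exact == 0.0)
        return 0;                       /* +-0 is not a subnormal */
    (void)frexp(exact, &e);             /* exact = m * 2^e with 0.5 <= |m| < 1 */
    /* Normalized as 1.f * 2^E, the radix-2 exponent is E = e - 1. */
    return (e - 1) < (FLT_MIN_EXP - 1);
}

/* Flush such a result to zero. The sign of the zero is undefined in
 * flush-to-zero mode; +0 is chosen here arbitrarily. */
double flush_if_subnormal(double exact)
{
    return is_subnormal_before_rounding(exact) ? 0.0 : exact;
}
```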
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="image-addressing-and-filtering"><a class="anchor" href="#image-addressing-and-filtering"></a>8. Image Addressing and Filtering</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Let w<sub>t</sub>, h<sub>t</sub> and d<sub>t</sub> be the width, height (or image array size for a 1D
image array) and depth (or image array size for a 2D image array) of the
image in pixels.
Let <em>coord.xy</em> (also referred to as (<em>s</em>,<em>t</em>)) or <em>coord.xyz</em> (also referred
to as (<em>s</em>,<em>t</em>,<em>r</em>)) be the coordinates specified to <strong>read_image{f|i|ui}</strong>.
The sampler specified in <strong>read_image{f|i|ui}</strong> is used to determine how to
sample the image and return an appropriate color.</p>
</div>
<div class="sect2">
<h3 id="image-coordinates"><a class="anchor" href="#image-coordinates"></a>8.1. Image Coordinates</h3>
<div class="paragraph">
<p>The sampler determines whether image coordinates are normalized, which
affects how they are interpreted.
If image coordinates specified to <strong>read_image{f|i|ui}</strong> are normalized (as
specified in the sampler), the <em>s</em>, <em>t</em>, and <em>r</em> coordinate values are
multiplied by w<sub>t</sub>, h<sub>t</sub>, and d<sub>t</sub> respectively to generate the unnormalized
coordinate values.
For image arrays, the image array coordinate (i.e. <em>t</em> if it is a 1D image
array or <em>r</em> if it is a 2D image array) specified to <strong>read_image{f|i|ui}</strong>
must always be the unnormalized image coordinate value.</p>
</div>
<div class="paragraph">
<p>Let (<em>u</em>,<em>v</em>,<em>w</em>) represent the unnormalized image coordinate values.</p>
</div>
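<div class="paragraph">
<p>The conversion to (<em>u</em>,<em>v</em>,<em>w</em>) can be sketched in host C as follows.
The type <code>image_coord</code>, the function <code>to_unnormalized</code> and the <code>normalized</code>
flag are hypothetical stand-ins for the sampler state, and the sketch does
not cover image arrays, whose array coordinate is never scaled.</p>
</div>

```c
#include <assert.h>

/* Hypothetical container for the unnormalized coordinates (u, v, w). */
typedef struct { float u, v, w; } image_coord;

/* Scale (s, t, r) by the image size when the sampler uses normalized
 * coordinates; otherwise pass the coordinates through unchanged. */
image_coord to_unnormalized(float s, float t, float r,
                            int wt, int ht, int dt, int normalized)
{
    assert(wt > 0 && ht > 0 && dt > 0);

    image_coord c = { s, t, r };
    if (normalized) {
        c.u = s * (float)wt;
        c.v = t * (float)ht;
        c.w = r * (float)dt;
    }
    return c;
}
```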
</div>
<div class="sect2">
<h3 id="addressing-and-filter-modes"><a class="anchor" href="#addressing-and-filter-modes"></a>8.2. Addressing and Filter Modes</h3>
<div class="paragraph">
<p>We first describe how the addressing and filter modes are applied to
generate the appropriate sample locations to read from the image if the
addressing mode is neither <code>CLK_ADDRESS_REPEAT</code> nor
<code>CLK_ADDRESS_MIRRORED_REPEAT</code>.</p>
</div>
<div class="paragraph">
<p>After generating the image coordinate (<em>u</em>,<em>v</em>,<em>w</em>) we apply the appropriate
addressing and filter mode to generate the appropriate sample locations to
read from the image.</p>
</div>
<div class="paragraph">
<p>If values in (<em>u</em>,<em>v</em>,<em>w</em>) are <code>INF</code> or NaN, the behavior of
<strong>read_image{f|i|ui}</strong> is undefined.</p>
</div>
<div class="paragraph">
<p><strong>Filter Mode</strong> <code>CLK_FILTER_NEAREST</code></p>
</div>
<div class="paragraph">
<p>When filter mode is <code>CLK_FILTER_NEAREST</code>, the image element in the image
that is nearest (in Manhattan distance) to that specified by (<em>u</em>,<em>v</em>,<em>w</em>)
is obtained.
This means the image element at location (<em>i</em>,<em>j</em>,<em>k</em>) becomes the image
element value, where</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">i = address_mode((<span class="predefined-type">int</span>)floor(u))
j = address_mode((<span class="predefined-type">int</span>)floor(v))
k = address_mode((<span class="predefined-type">int</span>)floor(w))</code></pre>
</div>
</div>
<div class="paragraph">
<p>For a 3D image, the image element at location (<em>i</em>,<em>j</em>,<em>k</em>) becomes the
color value.
For a 2D image, the image element at location (<em>i</em>,<em>j</em>) becomes the color
value.</p>
</div>
<div class="paragraph">
<p>The following table describes the address_mode function.</p>
</div>
<table id="table-address-modes-texel-location" class="tableblock frame-all grid-all stretch">
<caption class="title">Table 48. Addressing modes to generate texel location</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Addressing Mode</strong></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><strong>Result of address_mode(coord)</strong></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CLK_ADDRESS_CLAMP_TO_EDGE</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">clamp (coord, 0, size - 1)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CLK_ADDRESS_CLAMP</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">clamp (coord, -1, size)</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>CLK_ADDRESS_NONE</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">coord</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>The <code>size</code> term in this table is w<sub>t</sub> for <em>u</em>, h<sub>t</sub> for <em>v</em> and d<sub>t</sub> for
<em>w</em>.</p>
</div>
<div class="paragraph">
<p>The <code>clamp</code> function used in this table is defined as:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">clamp(a, b, c) = (a &lt; b) ? b : ((a &gt; c) ? c : a)</code></pre>
</div>
</div>
<div class="paragraph">
<p>If the selected texel location (<em>i</em>,<em>j</em>,<em>k</em>) refers to a location outside
the image, the border color is used as the color value for this texel.</p>
</div>
<div class="paragraph">
<p><strong>Filter Mode</strong> <code>CLK_FILTER_LINEAR</code></p>
</div>
<div class="paragraph">
<p>When filter mode is <code>CLK_FILTER_LINEAR</code>, a 2×2 square of image
elements for a 2D image or a 2×2×2 cube of image elements for a
3D image is selected.
This 2×2 square or 2×2×2 cube is obtained as follows.</p>
</div>
<div class="paragraph">
<p>Let</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">i0 = address_mode((<span class="predefined-type">int</span>)floor(u - <span class="float">0</span><span class="float">.5</span>))
j0 = address_mode((<span class="predefined-type">int</span>)floor(v - <span class="float">0</span><span class="float">.5</span>))
k0 = address_mode((<span class="predefined-type">int</span>)floor(w - <span class="float">0</span><span class="float">.5</span>))
i1 = address_mode((<span class="predefined-type">int</span>)floor(u - <span class="float">0</span><span class="float">.5</span>) + <span class="integer">1</span>)
j1 = address_mode((<span class="predefined-type">int</span>)floor(v - <span class="float">0</span><span class="float">.5</span>) + <span class="integer">1</span>)
k1 = address_mode((<span class="predefined-type">int</span>)floor(w - <span class="float">0</span><span class="float">.5</span>) + <span class="integer">1</span>)
a = frac(u - <span class="float">0</span><span class="float">.5</span>)
b = frac(v - <span class="float">0</span><span class="float">.5</span>)
c = frac(w - <span class="float">0</span><span class="float">.5</span>)</code></pre>
</div>
</div>
<div class="paragraph">
<p>where <code>frac(x)</code> denotes the fractional part of x and is computed as <code>x -
floor(x)</code>.</p>
</div>
<div class="paragraph">
<p>For a 3D image, the image element value is found as</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">T = (<span class="integer">1</span> - a) * (<span class="integer">1</span> - b) * (<span class="integer">1</span> - c) * T_i0j0k0
+ a * (<span class="integer">1</span> - b) * (<span class="integer">1</span> - c) * T_i1j0k0
+ (<span class="integer">1</span> - a) * b * (<span class="integer">1</span> - c) * T_i0j1k0
+ a * b * (<span class="integer">1</span> - c) * T_i1j1k0
+ (<span class="integer">1</span> - a) * (<span class="integer">1</span> - b) * c * T_i0j0k1
+ a * (<span class="integer">1</span> - b) * c * T_i1j0k1
+ (<span class="integer">1</span> - a) * b * c * T_i0j1k1
+ a * b * c * T_i1j1k1</code></pre>
</div>
</div>
<div class="paragraph">
<p>where <code>T_ijk</code> is the image element at location (<em>i</em>,<em>j</em>,<em>k</em>) in the 3D image.</p>
</div>
<div class="paragraph">
<p>For a 2D image, the image element value is found as</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">T = (<span class="integer">1</span> - a) * (<span class="integer">1</span> - b) * T_i0j0
+ a * (<span class="integer">1</span> - b) * T_i1j0
+ (<span class="integer">1</span> - a) * b * T_i0j1
+ a * b * T_i1j1</code></pre>
</div>
</div>
<div class="paragraph">
<p>where <code>T_ij</code> is the image element at location (<em>i</em>,<em>j</em>) in the 2D image.</p>
</div>
<div class="paragraph">
<p>If any of the selected <code>T_ijk</code> or <code>T_ij</code> in the above equations refers to a
location outside the image, the border color is used as the color value for
<code>T_ijk</code> or <code>T_ij</code>.</p>
</div>
<div class="paragraph">
<p>If the image channel type is <code>CL_FLOAT</code> or <code>CL_HALF_FLOAT</code> and any of the
image elements <code>T_ijk</code> or <code>T_ij</code> is <code>INF</code> or NaN, the behavior of the built-in
image read function is undefined.</p>
</div>
<div class="paragraph">
<p>We now discuss how the addressing and filter modes are applied to generate
the appropriate sample locations to read from the image if the addressing
mode is <code>CLK_ADDRESS_REPEAT</code>.</p>
</div>
<div class="paragraph">
<p>If values in (<em>s</em>,<em>t</em>,<em>r</em>) are <code>INF</code> or NaN, the behavior of the built-in
image read functions is undefined.</p>
</div>
<div class="paragraph">
<p><strong>Filter Mode</strong> <code>CLK_FILTER_NEAREST</code></p>
</div>
<div class="paragraph">
<p>When filter mode is <code>CLK_FILTER_NEAREST</code>, the image element at location
(<em>i</em>,<em>j</em>,<em>k</em>) becomes the image element value, with <em>i</em>, <em>j</em>, and <em>k</em>
computed as</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">u = (s - floor(s)) * w_t
i = (<span class="predefined-type">int</span>)floor(u)
<span class="keyword">if</span> (i &gt; w_t - <span class="integer">1</span>)
i = i - w_t
v = (t - floor(t)) * h_t
j = (<span class="predefined-type">int</span>)floor(v)
<span class="keyword">if</span> (j &gt; h_t - <span class="integer">1</span>)
j = j - h_t
w = (r - floor(r)) * d_t
k = (<span class="predefined-type">int</span>)floor(w)
<span class="keyword">if</span> (k &gt; d_t - <span class="integer">1</span>)
k = k - d_t</code></pre>
</div>
</div>
<div class="paragraph">
<p>For a 3D image, the image element at location (<em>i</em>,<em>j</em>,<em>k</em>) becomes the
color value.
For a 2D image, the image element at location (<em>i</em>,<em>j</em>) becomes the color
value.</p>
</div>
<div class="paragraph">
<p><strong>Filter Mode</strong> <code>CLK_FILTER_LINEAR</code></p>
</div>
<div class="paragraph">
<p>When filter mode is <code>CLK_FILTER_LINEAR</code>, a 2×2 square of image
elements for a 2D image or a 2×2×2 cube of image elements for a
3D image is selected.
This 2×2 square or 2×2×2 cube is obtained as follows.</p>
</div>
<div class="paragraph">
<p>Let</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">u = (s - floor(s)) * w_t
i0 = (<span class="predefined-type">int</span>)floor(u - <span class="float">0</span><span class="float">.5</span>)
i1 = i0 + <span class="integer">1</span>
<span class="keyword">if</span> (i0 &lt; <span class="integer">0</span>)
i0 = w_t + i0
<span class="keyword">if</span> (i1 &gt; w_t - <span class="integer">1</span>)
i1 = i1 - w_t
v = (t - floor(t)) * h_t
j0 = (<span class="predefined-type">int</span>)floor(v - <span class="float">0</span><span class="float">.5</span>)
j1 = j0 + <span class="integer">1</span>
<span class="keyword">if</span> (j0 &lt; <span class="integer">0</span>)
j0 = h_t + j0
<span class="keyword">if</span> (j1 &gt; h_t - <span class="integer">1</span>)
j1 = j1 - h_t
w = (r - floor(r)) * d_t
k0 = (<span class="predefined-type">int</span>)floor(w - <span class="float">0</span><span class="float">.5</span>)
k1 = k0 + <span class="integer">1</span>
<span class="keyword">if</span> (k0 &lt; <span class="integer">0</span>)
k0 = d_t + k0
<span class="keyword">if</span> (k1 &gt; d_t - <span class="integer">1</span>)
k1 = k1 - d_t
a = frac(u - <span class="float">0</span><span class="float">.5</span>)
b = frac(v - <span class="float">0</span><span class="float">.5</span>)
c = frac(w - <span class="float">0</span><span class="float">.5</span>)</code></pre>
</div>
</div>
<div class="paragraph">
<p>where <code>frac(x)</code> denotes the fractional part of x and is computed as <code>x -
floor(x)</code>.</p>
</div>
<div class="paragraph">
<p>For a 3D image, the image element value is found as</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">T = (<span class="integer">1</span> - a) * (<span class="integer">1</span> - b) * (<span class="integer">1</span> - c) * T_i0j0k0
+ a * (<span class="integer">1</span> - b) * (<span class="integer">1</span> - c) * T_i1j0k0
+ (<span class="integer">1</span> - a) * b * (<span class="integer">1</span> - c) * T_i0j1k0
+ a * b * (<span class="integer">1</span> - c) * T_i1j1k0
+ (<span class="integer">1</span> - a) * (<span class="integer">1</span> - b) * c * T_i0j0k1
+ a * (<span class="integer">1</span> - b) * c * T_i1j0k1
+ (<span class="integer">1</span> - a) * b * c * T_i0j1k1
+ a * b * c * T_i1j1k1</code></pre>
</div>
</div>
<div class="paragraph">
<p>where <code>T_ijk</code> is the image element at location (<em>i</em>,<em>j</em>,<em>k</em>) in the 3D image.</p>
</div>
<div class="paragraph">
<p>For a 2D image, the image element value is found as</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">T = (<span class="integer">1</span> - a) * (<span class="integer">1</span> - b) * T_i0j0
+ a * (<span class="integer">1</span> - b) * T_i1j0
+ (<span class="integer">1</span> - a) * b * T_i0j1
+ a * b * T_i1j1</code></pre>
</div>
</div>
<div class="paragraph">
<p>where <code>T_ij</code> is the image element at location (<em>i</em>,<em>j</em>) in the 2D image.</p>
</div>
<div class="paragraph">
<p>If the image channel type is <code>CL_FLOAT</code> or <code>CL_HALF_FLOAT</code> and any of the
image elements <code>T_ijk</code> or <code>T_ij</code> is <code>INF</code> or NaN, the behavior of the built-in
image read function is undefined.</p>
</div>
<div class="paragraph">
<p>We now discuss how the addressing and filter modes are applied to generate
the appropriate sample locations to read from the image if the addressing
mode is <code>CLK_ADDRESS_MIRRORED_REPEAT</code>.
The <code>CLK_ADDRESS_MIRRORED_REPEAT</code> addressing mode causes the image to be
read as if it is tiled at every integer seam with the interpretation of the
image data flipped at each integer crossing.
For example, the (<em>s</em>,<em>t</em>,<em>r</em>) coordinates between 1 and 2 are addressed
into the image as coordinates from 1 down to 0.
If values in (<em>s</em>,<em>t</em>,<em>r</em>) are <code>INF</code> or NaN, the behavior of the built-in
image read functions is undefined.</p>
</div>
<div class="paragraph">
<p><strong>Filter Mode</strong> <code>CLK_FILTER_NEAREST</code></p>
</div>
<div class="paragraph">
<p>When filter mode is <code>CLK_FILTER_NEAREST</code>, the image element at location
(<em>i</em>,<em>j</em>,<em>k</em>) becomes the image element value, with <em>i</em>, <em>j</em>, and <em>k</em>
computed as</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">s' = <span class="float">2.0f</span> * rint(<span class="float">0.5f</span> * s)
s' = fabs(s - s')
u = s' * w_t
i = (<span class="predefined-type">int</span>)floor(u)
i = min(i, w_t - <span class="integer">1</span>)
t' = <span class="float">2.0f</span> * rint(<span class="float">0.5f</span> * t)
t' = fabs(t - t')
v = t' * h_t
j = (<span class="predefined-type">int</span>)floor(v)
j = min(j, h_t - <span class="integer">1</span>)
r' = <span class="float">2.0f</span> * rint(<span class="float">0.5f</span> * r)
r' = fabs(r - r')
w = r' * d_t
k = (<span class="predefined-type">int</span>)floor(w)
k = min(k, d_t - <span class="integer">1</span>)</code></pre>
</div>
</div>
<div class="paragraph">
<p>For a 3D image, the image element at location (<em>i</em>,<em>j</em>,<em>k</em>) becomes the
color value.
For a 2D image, the image element at location (<em>i</em>,<em>j</em>) becomes the color
value.</p>
</div>
<div class="paragraph">
<p><strong>Filter Mode</strong> <code>CLK_FILTER_LINEAR</code></p>
</div>
<div class="paragraph">
<p>When filter mode is <code>CLK_FILTER_LINEAR</code>, a 2×2 square of image
elements for a 2D image or a 2×2×2 cube of image elements for a
3D image is selected.
This 2×2 square or 2×2×2 cube is obtained as follows.</p>
</div>
<div class="paragraph">
<p>Let</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">s' = <span class="float">2.0f</span> * rint(<span class="float">0.5f</span> * s)
s' = fabs(s - s')
u = s' * w_t
i0 = (<span class="predefined-type">int</span>)floor(u - <span class="float">0.5f</span>)
i1 = i0 + <span class="integer">1</span>
i0 = max(i0, <span class="integer">0</span>)
i1 = min(i1, w_t - <span class="integer">1</span>)
t' = <span class="float">2.0f</span> * rint(<span class="float">0.5f</span> * t)
t' = fabs(t - t')
v = t' * h_t
j0 = (<span class="predefined-type">int</span>)floor(v - <span class="float">0.5f</span>)
j1 = j0 + <span class="integer">1</span>
j0 = max(j0, <span class="integer">0</span>)
j1 = min(j1, h_t - <span class="integer">1</span>)
r' = <span class="float">2.0f</span> * rint(<span class="float">0.5f</span> * r)
r' = fabs(r - r')
w = r' * d_t
k0 = (<span class="predefined-type">int</span>)floor(w - <span class="float">0.5f</span>)
k1 = k0 + <span class="integer">1</span>
k0 = max(k0, <span class="integer">0</span>)
k1 = min(k1, d_t - <span class="integer">1</span>)
a = frac(u - <span class="float">0.5</span>)
b = frac(v - <span class="float">0.5</span>)
c = frac(w - <span class="float">0.5</span>)</code></pre>
</div>
</div>
<div class="paragraph">
<p>where <code>frac(x)</code> denotes the fractional part of x and is computed as <code>x -
floor(x)</code>.</p>
</div>
<div class="paragraph">
<p>For a 3D image, the image element value is found as</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">T = (<span class="integer">1</span> - a) * (<span class="integer">1</span> - b) * (<span class="integer">1</span> - c) * T_i0j0k0
+ a * (<span class="integer">1</span> - b) * (<span class="integer">1</span> - c) * T_i1j0k0
+ (<span class="integer">1</span> - a) * b * (<span class="integer">1</span> - c) * T_i0j1k0
+ a * b * (<span class="integer">1</span> - c) * T_i1j1k0
+ (<span class="integer">1</span> - a) * (<span class="integer">1</span> - b) * c * T_i0j0k1
+ a * (<span class="integer">1</span> - b) * c * T_i1j0k1
+ (<span class="integer">1</span> - a) * b * c * T_i0j1k1
+ a * b * c * T_i1j1k1</code></pre>
</div>
</div>
<div class="paragraph">
<p>where <code>T_ijk</code> is the image element at location (<em>i</em>,<em>j</em>,<em>k</em>) in the 3D image.</p>
</div>
<div class="paragraph">
<p>For a 2D image, the image element value is found as</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">T = (<span class="integer">1</span> - a) * (<span class="integer">1</span> - b) * T_i0j0
+ a * (<span class="integer">1</span> - b) * T_i1j0
+ (<span class="integer">1</span> - a) * b * T_i0j1
+ a * b * T_i1j1</code></pre>
</div>
</div>
<div class="paragraph">
<p>where <code>T_ij</code> is the image element at location (<em>i</em>,<em>j</em>) in the 2D image.</p>
</div>
<div class="paragraph">
<p>For a 1D image, the image element value is found as</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c">T = (<span class="integer">1</span> - a) * T_i0
+ a * T_i1</code></pre>
</div>
</div>
<div class="paragraph">
<p>where <code>T_i</code> is the image element at location (<em>i</em>) in the 1D image.</p>
</div>
<div class="paragraph">
<p>If the image channel type is <code>CL_FLOAT</code> or <code>CL_HALF_FLOAT</code> and any of the
image elements <code>T_ijk</code> or <code>T_ij</code> is <code>INF</code> or NaN, the behavior of the built-in
image read function is undefined.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
<div class="paragraph">
<p>If the sampler uses unnormalized coordinates (floating-point or integer
coordinates), filter mode <code>CLK_FILTER_NEAREST</code>, and one of the addressing modes
<code>CLK_ADDRESS_NONE</code>, <code>CLK_ADDRESS_CLAMP_TO_EDGE</code> or <code>CLK_ADDRESS_CLAMP</code>, the
<a href="#addressing-and-filter-modes">location of the image element in the image</a>
given by (<em>i</em>,<em>j</em>,<em>k</em>) will be computed without any loss of precision.</p>
</div>
<div class="paragraph">
<p>For all other sampler combinations of normalized or unnormalized
coordinates, filter and addressing modes, the relative error or precision of
the addressing mode calculations and the image filter operation is not
defined by this revision of the OpenCL specification.
To ensure a minimum precision of image addressing and filter calculations
across any OpenCL device for these sampler combinations, developers should
unnormalize the image coordinate in the kernel, read the required image
elements with calls to <strong>read_image{f|i|ui}</strong> using a sampler with
unnormalized coordinates, filter mode <code>CLK_FILTER_NEAREST</code>, and addressing
mode <code>CLK_ADDRESS_NONE</code>, <code>CLK_ADDRESS_CLAMP_TO_EDGE</code> or
<code>CLK_ADDRESS_CLAMP</code>, and then interpolate the color values read from the
image in the kernel to generate the filtered color value.</p>
</div>
</td>
</tr>
</table>
</div>
</div>
<div class="sect2">
<h3 id="conversion-rules"><a class="anchor" href="#conversion-rules"></a>8.3. Conversion Rules</h3>
<div class="paragraph">
<p>In this section we discuss conversion rules that are applied when reading
and writing images in a kernel.</p>
</div>
<div class="sect3">
<h4 id="conversion-rules-for-normalized-integer-channel-data-types"><a class="anchor" href="#conversion-rules-for-normalized-integer-channel-data-types"></a>8.3.1. Conversion rules for normalized integer channel data types</h4>
<div class="paragraph">
<p>In this section we discuss converting normalized integer channel data types
to floating-point values and vice-versa.</p>
</div>
<div class="sect4">
<h5 id="converting-normalized-integer-channel-data-types-to-floating-point-values"><a class="anchor" href="#converting-normalized-integer-channel-data-types-to-floating-point-values"></a>8.3.1.1. Converting normalized integer channel data types to floating-point values</h5>
<div class="paragraph">
<p>For images created with image channel data type of <code>CL_UNORM_INT8</code> and
<code>CL_UNORM_INT16</code>, <strong>read_imagef</strong> will convert the channel values from an
8-bit or 16-bit unsigned integer to normalized floating-point values in the
range [<code>0.0f</code>, <code>1.0f</code>].</p>
</div>
<div class="paragraph">
<p>For images created with image channel data type of <code>CL_SNORM_INT8</code> and
<code>CL_SNORM_INT16</code>, <strong>read_imagef</strong> will convert the channel values from an
8-bit or 16-bit signed integer to normalized floating-point values in the
range [<code>-1.0f</code>, <code>1.0f</code>].</p>
</div>
<div class="paragraph">
<p>These conversions are performed as follows:</p>
</div>
<div class="paragraph">
<p><code>CL_UNORM_INT8</code> (8-bit unsigned integer) → <code>float</code></p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>normalized <code>float</code> value = <code>(float)c / 255.0f</code></p>
</li>
</ul>
</div>
<div class="paragraph">
<p><code>CL_UNORM_INT_101010</code> (10-bit unsigned integer) → <code>float</code></p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>normalized <code>float</code> value = <code>(float)c / 1023.0f</code></p>
</li>
</ul>
</div>
<div class="paragraph">
<p><code>CL_UNORM_INT16</code> (16-bit unsigned integer) → <code>float</code></p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>normalized <code>float</code> value = <code>(float)c / 65535.0f</code></p>
</li>
</ul>
</div>
<div class="paragraph">
<p><code>CL_SNORM_INT8</code> (8-bit signed integer) → <code>float</code></p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>normalized <code>float</code> value = <strong>max</strong>(<code>-1.0f</code>, <code>(float)c / 127.0f</code>)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p><code>CL_SNORM_INT16</code> (16-bit signed integer) → <code>float</code></p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>normalized <code>float</code> value = <strong>max</strong>(<code>-1.0f</code>, <code>(float)c / 32767.0f</code>)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The precision of the above conversions is &lt;= 1.5 ulp except for the
following cases.</p>
</div>
<div class="paragraph">
<p>For <code>CL_UNORM_INT8</code></p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>0 must convert to <code>0.0f</code> and</p>
</li>
<li>
<p>255 must convert to <code>1.0f</code></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>For <code>CL_UNORM_INT_101010</code></p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>0 must convert to <code>0.0f</code> and</p>
</li>
<li>
<p>1023 must convert to <code>1.0f</code></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>For <code>CL_UNORM_INT16</code></p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>0 must convert to <code>0.0f</code> and</p>
</li>
<li>
<p>65535 must convert to <code>1.0f</code></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>For <code>CL_SNORM_INT8</code></p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>-128 and -127 must convert to <code>-1.0f</code>,</p>
</li>
<li>
<p>0 must convert to <code>0.0f</code> and</p>
</li>
<li>
<p>127 must convert to <code>1.0f</code></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>For <code>CL_SNORM_INT16</code></p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>-32768 and -32767 must convert to <code>-1.0f</code>,</p>
</li>
<li>
<p>0 must convert to <code>0.0f</code> and</p>
</li>
<li>
<p>32767 must convert to <code>1.0f</code></p>
</li>
</ul>
</div>
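<div class="paragraph">
<p>The conversion formulas and the exact-value requirements listed above can be illustrated with the following C sketch.
The function names are hypothetical; the sketch relies on single-precision division and is not meant to describe how any particular device performs the conversion.</p>
</div>

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned normalized integer -> float in [0.0f, 1.0f]. */
static float unorm8_to_float(uint8_t c)
{
    return (float)c / 255.0f;
}

static float unorm10_to_float(uint16_t c)
{
    assert(c <= 1023);   /* only the low 10 bits are meaningful */
    return (float)c / 1023.0f;
}

static float unorm16_to_float(uint16_t c)
{
    return (float)c / 65535.0f;
}

/* max(-1.0f, f): maps the most negative encoding to -1.0f. */
static float snorm_floor(float f)
{
    return (f < -1.0f) ? -1.0f : f;
}

/* Signed normalized integer -> float in [-1.0f, 1.0f]. */
static float snorm8_to_float(int8_t c)
{
    return snorm_floor((float)c / 127.0f);
}

static float snorm16_to_float(int16_t c)
{
    return snorm_floor((float)c / 32767.0f);
}
```

<div class="paragraph">
<p>The <code>max</code> step is what makes both -128 and -127 convert to <code>-1.0f</code> for 8-bit data, and both -32768 and -32767 for 16-bit data.</p>
</div>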
</div>
<div class="sect4">
<h5 id="converting-floating-point-values-to-normalized-integer-channel-data-types"><a class="anchor" href="#converting-floating-point-values-to-normalized-integer-channel-data-types"></a>8.3.1.2. Converting floating-point values to normalized integer channel data types</h5>
<div class="paragraph">
<p>For images created with image channel data type of <code>CL_UNORM_INT8</code> and
<code>CL_UNORM_INT16</code>, <strong>write_imagef</strong> will convert the floating-point color value
to an 8-bit or 16-bit unsigned integer.</p>
</div>
<div class="paragraph">
<p>For images created with image channel data type of <code>CL_SNORM_INT8</code> and
<code>CL_SNORM_INT16</code>, <strong>write_imagef</strong> will convert the floating-point color value
to an 8-bit or 16-bit signed integer.</p>
</div>
<div class="paragraph">
<p>The preferred method for converting floating-point values to normalized
integer values is as follows:</p>
</div>
<div class="paragraph">
<p><code>float</code> → <code>CL_UNORM_INT8</code> (8-bit unsigned integer)</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><strong>convert_uchar_sat_rte</strong>(<code>f * 255.0f</code>)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p><code>float</code> → <code>CL_UNORM_INT_101010</code> (10-bit unsigned integer)</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><strong>min</strong>(<strong>convert_ushort_sat_rte</strong>(<code>f * 1023.0f</code>), <code>0x3ff</code>)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p><code>float</code> → <code>CL_UNORM_INT16</code> (16-bit unsigned integer)</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><strong>convert_ushort_sat_rte</strong>(<code>f * 65535.0f</code>)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p><code>float</code> → <code>CL_SNORM_INT8</code> (8-bit signed integer)</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><strong>convert_char_sat_rte</strong>(<code>f * 127.0f</code>)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p><code>float</code> → <code>CL_SNORM_INT16</code> (16-bit signed integer)</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><strong>convert_short_sat_rte</strong>(<code>f * 32767.0f</code>)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Please refer to the <a href="#out-of-range-behavior">out-of-range behavior and
saturated conversion</a> rules.</p>
</div>
<div class="paragraph">
<p>OpenCL implementations may choose to approximate the rounding mode used in
the conversions described above.
If a rounding mode other than round to nearest even (<code>_rte</code>) is used, the
absolute error of the implementation-dependent rounding mode relative to
the result produced by the round to nearest even rounding mode must be ≤
0.6.</p>
</div>
<div class="paragraph">
<p><code>float</code> → <code>CL_UNORM_INT8</code> (8-bit unsigned integer)</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>Let f<sub>preferred</sub> = <strong>convert_uchar_sat_rte</strong>(f * <code>255.0f</code>)</p>
</li>
<li>
<p>Let f<sub>approx</sub> = <strong>convert_uchar_sat_&lt;impl-rounding-mode&gt;</strong>(f * <code>255.0f</code>)</p>
</li>
<li>
<p><strong>fabs</strong>(f<sub>preferred</sub> - f<sub>approx</sub>) must be &lt;= 0.6</p>
</li>
</ul>
</div>
<div class="paragraph">
<p><code>float</code> → <code>CL_UNORM_INT_101010</code> (10-bit unsigned integer)</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>Let f<sub>preferred</sub> = <strong>convert_ushort_sat_rte</strong>(f * <code>1023.0f</code>)</p>
</li>
<li>
<p>Let f<sub>approx</sub> = <strong>convert_ushort_sat_&lt;impl-rounding-mode&gt;</strong>(f *
<code>1023.0f</code>)</p>
</li>
<li>
<p><strong>fabs</strong>(f<sub>preferred</sub> - f<sub>approx</sub>) must be &lt;= 0.6</p>
</li>
</ul>
</div>
<div class="paragraph">
<p><code>float</code> → <code>CL_UNORM_INT16</code> (16-bit unsigned integer)</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>Let f<sub>preferred</sub> = <strong>convert_ushort_sat_rte</strong>(f * <code>65535.0f</code>)</p>
</li>
<li>
<p>Let f<sub>approx</sub> = <strong>convert_ushort_sat_&lt;impl-rounding-mode&gt;</strong>(f *
<code>65535.0f</code>)</p>
</li>
<li>
<p><strong>fabs</strong>(f<sub>preferred</sub> - f<sub>approx</sub>) must be &lt;= 0.6</p>
</li>
</ul>
</div>
<div class="paragraph">
<p><code>float</code> → <code>CL_SNORM_INT8</code> (8-bit signed integer)</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>Let f<sub>preferred</sub> = <strong>convert_char_sat_rte</strong>(f * <code>127.0f</code>)</p>
</li>
<li>
<p>Let f<sub>approx</sub> = <strong>convert_char_sat_&lt;impl-rounding-mode&gt;</strong>(f * <code>127.0f</code>)</p>
</li>
<li>
<p><strong>fabs</strong>(f<sub>preferred</sub> - f<sub>approx</sub>) must be &lt;= 0.6</p>
</li>
</ul>
</div>
<div class="paragraph">
<p><code>float</code> → <code>CL_SNORM_INT16</code> (16-bit signed integer)</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>Let f<sub>preferred</sub> = <strong>convert_short_sat_rte</strong>(f * <code>32767.0f</code>)</p>
</li>
<li>
<p>Let f<sub>approx</sub> = <strong>convert_short_sat_&lt;impl-rounding-mode&gt;</strong>(f *
<code>32767.0f</code>)</p>
</li>
<li>
<p><strong>fabs</strong>(f<sub>preferred</sub> - f<sub>approx</sub>) must be &lt;= 0.6</p>
</li>
</ul>
</div>
</div>
</div>
<div class="sect3">
<h4 id="conversion-rules-for-half-precision-floating-point-channel-data-type"><a class="anchor" href="#conversion-rules-for-half-precision-floating-point-channel-data-type"></a>8.3.2. Conversion rules for half precision floating-point channel data type</h4>
<div class="paragraph">
<p>For images created with a channel data type of <code>CL_HALF_FLOAT</code>, the
conversions from <code>half</code> to <code>float</code> are lossless (as described in
<a href="#the-half-data-type">"The half data type"</a>).
Conversions from <code>float</code> to <code>half</code> round the mantissa using the round to
nearest even or round to zero rounding mode.
Denormalized numbers for the <code>half</code> data type which may be generated when
converting a <code>float</code> to a <code>half</code> may be flushed to zero.
A <code>float</code> NaN must be converted to an appropriate NaN in the <code>half</code> type.
A <code>float</code> <code>INF</code> must be converted to an appropriate <code>INF</code> in the <code>half</code>
type.</p>
</div>
</div>
<div class="sect3">
<h4 id="conversion-rules-for-floating-point-channel-data-type"><a class="anchor" href="#conversion-rules-for-floating-point-channel-data-type"></a>8.3.3. Conversion rules for floating-point channel data type</h4>
<div class="paragraph">
<p>The following rules apply for reading and writing images created with
channel data type of <code>CL_FLOAT</code>.</p>
</div>
<div class="ulist">
<ul>
<li>
<p>NaNs may be converted to a NaN value supported by the device.</p>
</li>
<li>
<p>Denorms can be flushed to zero.</p>
</li>
<li>
<p>All other values must be preserved.</p>
</li>
</ul>
</div>
</div>
<div class="sect3">
<h4 id="conversion-rules-for-signed-and-unsigned-8-bit-16-bit-and-32-bit-integer-channel-data-types"><a class="anchor" href="#conversion-rules-for-signed-and-unsigned-8-bit-16-bit-and-32-bit-integer-channel-data-types"></a>8.3.4. Conversion rules for signed and unsigned 8-bit, 16-bit and 32-bit integer channel data types</h4>
<div class="paragraph">
<p>Calls to <strong>read_imagei</strong> with channel data type values of <code>CL_SIGNED_INT8</code>,
<code>CL_SIGNED_INT16</code> and <code>CL_SIGNED_INT32</code> return the unmodified integer values
stored in the image at the specified location.</p>
</div>
<div class="paragraph">
<p>Calls to <strong>read_imageui</strong> with channel data type values of <code>CL_UNSIGNED_INT8</code>,
<code>CL_UNSIGNED_INT16</code> and <code>CL_UNSIGNED_INT32</code> return the unmodified integer
values stored in the image at the specified location.</p>
</div>
<div class="paragraph">
<p>Calls to <strong>write_imagei</strong> will perform one of the following conversions:</p>
</div>
<div class="paragraph">
<p>32-bit signed integer → 8-bit signed integer</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><strong>convert_char_sat</strong>(i)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>32-bit signed integer → 16-bit signed integer</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><strong>convert_short_sat</strong>(i)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>32-bit signed integer → 32-bit signed integer</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>no conversion is performed</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Calls to <strong>write_imageui</strong> will perform one of the following conversions:</p>
</div>
<div class="paragraph">
<p>32-bit unsigned integer → 8-bit unsigned integer</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><strong>convert_uchar_sat</strong>(i)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>32-bit unsigned integer → 16-bit unsigned integer</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><strong>convert_ushort_sat</strong>(i)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>32-bit unsigned integer → 32-bit unsigned integer</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>no conversion is performed</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The conversions described in this section must be correctly saturated.</p>
</div>
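<div class="paragraph">
<p>The saturating behavior required of these conversions can be sketched in C as follows.
The helper names are hypothetical host-side stand-ins for the <strong>convert_char_sat</strong>, <strong>convert_short_sat</strong>, <strong>convert_uchar_sat</strong> and <strong>convert_ushort_sat</strong> built-ins named above.</p>
</div>

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 32-bit signed value to [lo, hi]. */
static int32_t sat_range(int32_t i, int32_t lo, int32_t hi)
{
    assert(lo <= hi);
    return (i < lo) ? lo : ((i > hi) ? hi : i);
}

/* 32-bit signed -> 8-bit and 16-bit signed, saturating. */
static int8_t char_sat(int32_t i)
{
    return (int8_t)sat_range(i, INT8_MIN, INT8_MAX);
}

static int16_t short_sat(int32_t i)
{
    return (int16_t)sat_range(i, INT16_MIN, INT16_MAX);
}

/* 32-bit unsigned -> 8-bit and 16-bit unsigned, saturating.
 * Only the upper bound can be exceeded for unsigned inputs. */
static uint8_t uchar_sat(uint32_t i)
{
    return (uint8_t)((i > UINT8_MAX) ? UINT8_MAX : i);
}

static uint16_t ushort_sat(uint32_t i)
{
    return (uint16_t)((i > UINT16_MAX) ? UINT16_MAX : i);
}
```

<div class="paragraph">
<p>Out-of-range values stick at the nearest representable limit instead of wrapping: in this sketch 300 becomes 127 as a signed 8-bit value and 255 as an unsigned 8-bit value.</p>
</div>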
</div>
<div class="sect3">
<h4 id="conversion-rules-for-srgba-and-sbgra-images"><a class="anchor" href="#conversion-rules-for-srgba-and-sbgra-images"></a>8.3.5. Conversion rules for sRGBA and sBGRA images</h4>
<div class="paragraph">
<p>Standard RGB (sRGB) data roughly displays colors in a linear ramp of
luminosity levels such that an average observer, under average viewing
conditions, can view them as perceptually equal steps on an average display.
All 0&#8217;s map to <code>0.0f</code>, and all 1&#8217;s map to <code>1.0f</code>.
The sequence of unsigned integer encodings between all 0&#8217;s and all 1&#8217;s
represents a nonlinear progression in the floating-point interpretation of
the numbers between <code>0.0f</code> and <code>1.0f</code>.
For more detail, see the <a href="#sRGB-spec">sRGB color standard</a>.</p>
</div>
<div class="paragraph">
<p>Conversion from sRGB space is done automatically by the <strong>read_imagef</strong> built-in
functions if the image channel order is one of the sRGB values described
above.
When reading from an sRGB image, the conversion from sRGB to linear RGB is
performed before the filter specified in the sampler passed to
<strong>read_imagef</strong> is applied.
If the format has an alpha channel, the alpha data is stored in linear color
space.
Conversion to sRGB space is done automatically by the <strong>write_imagef</strong> built-in
functions if the image channel order is one of the sRGB values described
above and the device supports writing to sRGB images.</p>
</div>
<div class="paragraph">
<p>The following is the conversion rule for converting a normalized 8-bit
unsigned integer sRGB color value to a floating-point linear RGB color value
using <strong>read_imagef</strong>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="comment">// Convert the normalized 8-bit unsigned integer R, G and B channel values</span>
<span class="comment">// to a floating-point value (call it c) as per rules described in section</span>
<span class="comment">// 8.3.1.1.</span>
<span class="keyword">if</span> (c &lt;= <span class="float">0</span><span class="float">.04045</span>),
result = c / <span class="integer">1</span><span class="float">2</span><span class="float">.92</span>;
<span class="keyword">else</span>
result = powr((c + <span class="float">0</span><span class="float">.055</span>) / <span class="float">1</span><span class="float">.055</span>, <span class="float">2</span><span class="float">.4</span>);</code></pre>
</div>
</div>
<div class="paragraph">
<p>The resulting floating-point value, if converted back to an sRGB value
without rounding to an 8-bit unsigned integer value, must be within 0.5 ulp
of the original sRGB value.</p>
</div>
<div class="paragraph">
<p>The following are the conversion rules for converting a linear RGB
floating-point color value (call it <em>c</em>) to a normalized 8-bit unsigned
integer sRGB value using <strong>write_imagef</strong>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="c"><span class="keyword">if</span> (isnan(c))
c = <span class="float">0</span><span class="float">.0</span>;
<span class="keyword">if</span> (c &gt; <span class="float">1</span><span class="float">.0</span>)
c = <span class="float">1</span><span class="float">.0</span>;
<span class="keyword">else</span> <span class="keyword">if</span> (c &lt; <span class="float">0</span><span class="float">.0</span>)
c = <span class="float">0</span><span class="float">.0</span>;
<span class="keyword">else</span> <span class="keyword">if</span> (c &lt; <span class="float">0</span><span class="float">.0031308</span>)
c = <span class="integer">1</span><span class="float">2</span><span class="float">.92</span> * c;
<span class="keyword">else</span>
c = <span class="float">1</span><span class="float">.055</span> * powr(c, <span class="float">1</span><span class="float">.0</span>/<span class="float">2</span><span class="float">.4</span>) - <span class="float">0</span><span class="float">.055</span>;
scaled_reference_result = c * <span class="integer">255</span>;
channel_component = floor(scaled_reference_result + <span class="float">0</span><span class="float">.5</span>);</code></pre>
</div>
</div>
<div class="paragraph">
<p>The precision of the above conversion should be such that</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p><code>|generated_channel_component - scaled_reference_result|</code> ≤ 0.6</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>where <code>generated_channel_component</code> is the actual value that the
implementation produces and that is being checked for conformance.</p>
</div>
</div>
</div>
<div class="sect2">
<h3 id="selecting-an-image-from-an-image-array"><a class="anchor" href="#selecting-an-image-from-an-image-array"></a>8.4. Selecting an Image from an Image Array</h3>
<div class="paragraph">
<p>Let (<em>u</em>,<em>v</em>,<em>w</em>) represent the unnormalized image coordinate values for
reading from and/or writing to a 2D image in a 2D image array.</p>
</div>
<div class="paragraph">
<p>When read using a sampler, the 2D image layer selected is computed as:</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>layer = <strong>clamp</strong>(<strong>rint</strong>(<em>w</em>), 0, d<sub>t</sub> - 1)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>otherwise the layer selected is computed as:</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>layer = <em>w</em></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>(since <em>w</em> is already an integer) and the result is undefined if <em>w</em> is not
one of the integers 0, 1, &#8230;&#8203; d<sub>t</sub> - 1.</p>
</div>
<div class="paragraph">
<p>Let (<em>u</em>,<em>v</em>) represent the unnormalized image coordinate values for reading
from and/or writing to a 1D image in a 1D image array.</p>
</div>
<div class="paragraph">
<p>When read using a sampler, the 1D image layer selected is computed as:</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>layer = <strong>clamp</strong>(<strong>rint</strong>(<em>v</em>), 0, h<sub>t</sub> - 1)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>otherwise the layer selected is computed as:</p>
</div>
<div class="ulist none">
<ul class="none">
<li>
<p>layer = <em>v</em></p>
</li>
</ul>
</div>
<div class="paragraph">
<p>(since <em>v</em> is already an integer) and the result is undefined if <em>v</em> is not
one of the integers 0, 1, &#8230;&#8203; h<sub>t</sub> - 1.</p>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="references"><a class="anchor" href="#references"></a>9. Normative References</h2>
<div class="sectionbody">
<div class="olist arabic">
<ol class="arabic">
<li>
<p><a id="C99-spec"></a> &#8220;ISO/IEC 9899:1999 - Programming languages - C&#8221;, with
technical corrigenda TC1 and TC2,
<a href="https://www.iso.org/standard/29237.html" class="bare">https://www.iso.org/standard/29237.html</a> .
References are to sections of this specific version, referred to as the
&#8220;C99 Specification&#8221;, although other versions exist.</p>
</li>
<li>
<p><a id="C11-spec"></a> &#8220;ISO/IEC 9899:2011 - Information technology - Programming
languages - C&#8221;, <a href="https://www.iso.org/standard/57853.html" class="bare">https://www.iso.org/standard/57853.html</a> .
References are to sections of this specific version, referred to as the
&#8220;C11 Specification&#8221;, although other versions exist.</p>
</li>
<li>
<p><a id="opencl-spec"></a> &#8220;The OpenCL Specification, Version 3.0, Unified&#8221;,
<a href="https://www.khronos.org/registry/OpenCL/" class="bare">https://www.khronos.org/registry/OpenCL/</a> .
References are to sections and tables of this specific version, although
other versions exist.</p>
</li>
<li>
<p><a id="opencl-device-queries"></a> &#8220;Device Queries&#8221; are defined in the
<a href="#opencl-spec">OpenCL Specification</a> for <strong>clGetDeviceInfo</strong>, and the
individual queries are defined in the &#8220;OpenCL Device Queries&#8221; table
(4.3) of that Specification.</p>
</li>
<li>
<p><a id="opencl-channel-order"></a> &#8220;Image Channel Order&#8221; is
defined in the <a href="#opencl-spec">OpenCL Specification</a> in the &#8220;Image
Format Descriptor&#8221; section (5.3.1.1), and the individual channel orders
are defined in the &#8220;List of supported Image Channel Order Values&#8221;
table (5.6) of that Specification.</p>
</li>
<li>
<p><a id="opencl-channel-data-type"></a> &#8220;Image Channel
Data Type&#8221; is defined in the <a href="#opencl-spec">OpenCL Specification</a> in the
&#8220;Image Format Descriptor&#8221; section (5.3.1.1), and the individual
channel data types are defined in the &#8220;List of supported Image Channel
Data Types&#8221; table (5.7) of that Specification.</p>
</li>
<li>
<p><a id="opencl-extension-spec"></a> &#8220;The OpenCL Extension Specification, Version
3.0, Unified&#8221;, <a href="https://www.khronos.org/registry/OpenCL/" class="bare">https://www.khronos.org/registry/OpenCL/</a> .
References are to sections and tables of this specific version, although
other versions exist.</p>
</li>
<li>
<p><a id="sRGB-spec"></a> &#8220;IEC 61966-2-1:1999 Multimedia systems and equipment -
Colour measurement and management - Part 2-1: Colour management -
Default RGB colour space - sRGB&#8221;,
<a href="https://webstore.iec.ch/publication/6169" class="bare">https://webstore.iec.ch/publication/6169</a> .</p>
</li>
</ol>
</div>
</div>
</div>
</div>
<div id="footnotes">
<hr>
<div class="footnote" id="_footnotedef_1">
<a href="#_footnoteref_1">1</a>. When any scalar value is converted to <code>bool</code>, the result is 0 if the value compares equal to 0; otherwise, the result is 1.
</div>
<div class="footnote" id="_footnotedef_2">
<a href="#_footnoteref_2">2</a>. The <code>long</code>, <code>unsigned long</code> and <code>ulong</code> scalar types are optional types for EMBEDDED profile devices that are supported if the value of the <code>CL_DEVICE_EXTENSIONS</code> device query contains <strong>cles_khr_int64</strong>. An OpenCL C 3.0 compiler must also define the <code>__opencl_c_int64</code> feature macro unconditionally for FULL profile devices, or for EMBEDDED profile devices that support these types.
</div>
<div class="footnote" id="_footnotedef_3">
<a href="#_footnoteref_3">3</a>. The <code>double</code> scalar type is an optional type that is supported if the value of the <code>CL_DEVICE_DOUBLE_FP_CONFIG</code> device query is not zero. If this is the case then an OpenCL C 3.0 compiler must also define the <code>__opencl_c_fp64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_4">
<a href="#_footnoteref_4">4</a>. This is a 32-bit type if the value of the <code>CL_DEVICE_ADDRESS_BITS</code> device query is 32, and a 64-bit type if the value of the query is 64.
</div>
<div class="footnote" id="_footnotedef_5">
<a href="#_footnoteref_5">5</a>. <a href="#unified-spec">Requires</a> support for OpenCL C 1.2 or above. Also see extension <strong>cl_khr_fp64</strong>.
</div>
<div class="footnote" id="_footnotedef_6">
<a href="#_footnoteref_6">6</a>. Built-in vector data types are supported by the OpenCL implementation even if the underlying compute device does not natively support any or all of the vector data types. They are to be converted by the device compiler to appropriate instructions that use underlying built-in types supported natively by the compute device. Refer to Appendix B in the OpenCL API specification for a description of the order of the components of a vector type in memory.
</div>
<div class="footnote" id="_footnotedef_7">
<a href="#_footnoteref_7">7</a>. The <code>long<em>n</em></code> and <code>ulong<em>n</em></code> vector types are optional types for EMBEDDED profile devices that are supported if the value of the <code>CL_DEVICE_EXTENSIONS</code> device query contains <strong>cles_khr_int64</strong>. An OpenCL C 3.0 compiler must also define the <code>__opencl_c_int64</code> feature macro unconditionally for FULL profile devices, or for EMBEDDED profile devices that support these types.
</div>
<div class="footnote" id="_footnotedef_8">
<a href="#_footnoteref_8">8</a>. The <code>double<em>n</em></code> vector type is an optional type that is supported if the value of the <code>CL_DEVICE_DOUBLE_FP_CONFIG</code> device query is not zero. If this is the case then an OpenCL C 3.0 compiler must also define the <code>__opencl_c_fp64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_9">
<a href="#_footnoteref_9">9</a>. Refer to the detailed description of the built-in <a href="#image-read-and-write-functions">Image Read and Write Functions</a> that use this type.
</div>
<div class="footnote" id="_footnotedef_10">
<a href="#_footnoteref_10">10</a>. That is, for the purpose of applying type-based aliasing rules, a built-in vector data type will be considered equivalent to the corresponding array type.
</div>
<div class="footnote" id="_footnotedef_11">
<a href="#_footnoteref_11">11</a>. Unless the <strong>cl_khr_fp16</strong> extension is supported and has been enabled.
</div>
<div class="footnote" id="_footnotedef_12">
<a href="#_footnoteref_12">12</a>. Unless the <strong>cl_khr_fp16</strong> extension is supported and has been enabled.
</div>
<div class="footnote" id="_footnotedef_13">
<a href="#_footnoteref_13">13</a>. For conversions to floating-point format, when a finite source value exceeds the maximum representable finite floating-point destination value, the rounding mode will affect whether the result is the maximum finite floating-point value or infinity of same sign as the source value, per IEEE-754 rules for rounding.
</div>
<div class="footnote" id="_footnotedef_14">
<a href="#_footnoteref_14">14</a>. In addition, some other extensions to the C language designed to support a particular vector ISA (e.g. AltiVec&#8482;, CELL Broadband Engine&#8482; Architecture) use such conversions in conjunction with swizzle operators to achieve type un-conversion. So as to support legacy code of this type, <strong>as_typen</strong>() allows conversions between vectors of the same size but different numbers of elements, even though the behavior of this sort of conversion is not likely to be portable except to other OpenCL implementations for the same hardware architecture.<br> AltiVec is a trademark of Motorola Inc.<br> Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc.
</div>
<div class="footnote" id="_footnotedef_15">
<a href="#_footnoteref_15">15</a>. Unless the <strong>cl_khr_fp16</strong> extension is supported and has been enabled.
</div>
<div class="footnote" id="_footnotedef_16">
<a href="#_footnoteref_16">16</a>. While the union is intended to reflect the organization of data in memory, the <strong>as_type</strong>() and <strong>as_type<em>n</em></strong>() constructs are intended to reflect the organization of data in register. The <strong>as_type</strong>() and <strong>as_type<em>n</em></strong>() constructs are intended to compile to no instructions on devices that use a shared register file designed to operate on both the operand and result types. Note that while differences in memory organization are expected to largely be limited to those arising from endianness, the register based representation may also differ due to size of the element in register. For example, an architecture may load a <code>char</code> into a 32-bit register, or a <code>char</code> vector into a SIMD vector register with fixed 32-bit element size. If the element count does not match, then the implementation should pick a data representation that most closely matches what would happen if an appropriate result type operator was applied to a register containing data of the source type. If the number of elements matches, then the <strong>as_type<em>n</em></strong>() should faithfully reproduce the behavior expected from a similar data type reinterpretation using memory/unions. So, for example if an implementation stores all single precision data as <code>double</code> in register, it should implement <strong>as_int</strong>(<code>float</code>) by first down-converting the <code>double</code> to single precision and then (if necessary) moving the single precision bits to a register suitable for operating on integer data. If data stored in different address spaces do not have the same endianness, then the &#8220;dominant endianness&#8221; of the device should prevail.
</div>
<div class="footnote" id="_footnotedef_17">
<a href="#_footnoteref_17">17</a>. This is different from the standard integer conversion rank described in <a href="#C99-spec">section 6.3.1.1 of the C99 Specification</a>.
</div>
<div class="footnote" id="_footnotedef_18">
<a href="#_footnoteref_18">18</a>. The pre- and post- increment operators may have unexpected behavior on floating-point values and are therefore not supported for floating-point scalar and vector built-in types. For example, if variable <em>a</em> has type <code>float</code> and holds the value <code>0x1.0p25f</code>, then <code><em>a</em>++</code> returns <code>0x1.0p25f</code>.<br> Also, <code>(<em>a</em>++)--</code> is not guaranteed to return <em>a</em>, if <em>a</em> has fractional value.<br> In non-default rounding modes, <code>(<em>a</em>++)--</code> may produce the same result as <code><em>a</em>++</code> or <code><em>a</em>--</code> for large <em>a</em>.
</div>
<div class="footnote" id="_footnotedef_19">
<a href="#_footnoteref_19">19</a>. To test whether any or all elements in the result of a vector relational operator test <em>true</em>, for example to use in the context of an <strong>if ( )</strong> statement, please see the <a href="#relational-functions"><strong>any</strong> and <strong>all</strong> built-ins</a>.
</div>
<div class="footnote" id="_footnotedef_20">
<a href="#_footnoteref_20">20</a>. To test whether any or all elements in the result of a vector relational operator test <em>true</em>, for example to use in the context of an <strong>if ( )</strong> statement, please see the <a href="#relational-functions"><strong>any</strong> and <strong>all</strong> built-ins</a>.
</div>
<div class="footnote" id="_footnotedef_21">
<a href="#_footnoteref_21">21</a>. Integer promotion is described in <a href="#C99-spec">section 6.3.1.1 of the C99 Specification</a>.
</div>
<div class="footnote" id="_footnotedef_22">
<a href="#_footnoteref_22">22</a>. Variable length arrays are <a href="#restrictions-variable-length">not supported in OpenCL C</a>.
</div>
<div class="footnote" id="_footnotedef_23">
<a href="#_footnoteref_23">23</a>. Except for 3-component vectors whose size is defined as 4 times the size of each scalar component.
</div>
<div class="footnote" id="_footnotedef_24">
<a href="#_footnoteref_24">24</a>. Bit-field struct members are <a href="#restrictions-bitfield">not supported in OpenCL C</a>.
</div>
<div class="footnote" id="_footnotedef_25">
<a href="#_footnoteref_25">25</a>. Among the invalid values for dereferencing a pointer by the unary <strong>*</strong> operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime. If <strong>*P</strong> is an l-value and <strong>T</strong> is the name of an object pointer type, <strong>*(T)P</strong> is an l-value that has a type compatible with that to which <strong>T</strong> points.
</div>
<div class="footnote" id="_footnotedef_26">
<a href="#_footnoteref_26">26</a>. Thus, <strong>&amp;*E</strong> is equivalent to <strong>E</strong> (even if <strong>E</strong> is a null pointer), and <strong>&amp;(E1[E2])</strong> is equivalent to <strong>((E1) + (E2))</strong>. It is always true that if <strong>E</strong> is an l-value that is a valid operand of the unary <strong>&amp;</strong> operator, <strong>*&amp;E</strong> is an l-value equal to <strong>E</strong>.
</div>
<div class="footnote" id="_footnotedef_27">
<a href="#_footnoteref_27">27</a>. Implicit in autovectorization is the assumption that any libraries called from the <code>__kernel</code> must be recompilable at run time to handle cases where the compiler decides to merge or separate work-items. This probably means that such libraries can never be hard-coded binaries or that hard-coded binaries must be accompanied either by source or some retargetable intermediate representation. This may be a code security question for some.
</div>
<div class="footnote" id="_footnotedef_28">
<a href="#_footnoteref_28">28</a>. Unless the <strong>cl_khr_fp16</strong> extension is supported and has been enabled.
</div>
<div class="footnote" id="_footnotedef_29">
<a href="#_footnoteref_29">29</a>. This syntax is already part of the clang source tree on which most vendors have based their OpenCL implementations. Additionally, blocks based closures are supported by the clang open source C compiler as well as Mac OS X&#8217;s C and Objective C compilers. Specifically, Mac OS X&#8217;s Grand Central Dispatch allows applications to queue tasks as a block.
</div>
<div class="footnote" id="_footnotedef_30">
<a href="#_footnoteref_30">30</a>. OpenCL C <a href="#restrictions">does not allow function pointers</a> primarily because it is difficult or expensive to implement generic indirections to executable code in many hardware architectures that OpenCL targets. OpenCL C&#8217;s design of Blocks is intended to respect that same condition, yielding the restrictions listed here. As such, Blocks allow a form of dynamically enqueued function scheduling without providing a form of runtime synchronous dynamic dispatch analogous to function pointers.
</div>
<div class="footnote" id="_footnotedef_31">
<a href="#_footnoteref_31">31</a>. I.e. the <em>global_work_size</em> values specified to <strong>clEnqueueNDRangeKernel</strong> are not evenly divisible by the <em>local_work_size</em> values for each dimension.
</div>
<div class="footnote" id="_footnotedef_32">
<a href="#_footnoteref_32">32</a>. Only if double precision is supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_fp64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_33">
<a href="#_footnoteref_33">33</a>. <strong>fmin</strong> and <strong>fmax</strong> behave as defined by C99 and may not match the IEEE 754-2008 definition for <strong>minNum</strong> and <strong>maxNum</strong> with regard to signaling NaNs. Specifically, signaling NaNs may behave as quiet NaNs.
</div>
<div class="footnote" id="_footnotedef_34">
<a href="#_footnoteref_34">34</a>. The <strong>min</strong>() operator is there to prevent <strong>fract</strong>(-small) from returning 1.0. It returns the largest positive floating-point number less than 1.0.
</div>
<div class="footnote" id="_footnotedef_35">
<a href="#_footnoteref_35">35</a>. The user is cautioned that for some usages, e.g. <strong>mad</strong>(a, b, -a*b), the definition of <strong>mad</strong>() is loose enough in the embedded profile that almost any result is allowed from <strong>mad</strong>() for some values of a and b.
</div>
<div class="footnote" id="_footnotedef_36">
<a href="#_footnoteref_36">36</a>. Only if 64-bit integers are supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_int64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_37">
<a href="#_footnoteref_37">37</a>. Frequently vector operations need n + 1 bits temporarily to calculate a result. The <strong>rhadd</strong> instruction gives you an extra bit without needing to upsample and downsample. This can be a profound performance win.
</div>
<div class="footnote" id="_footnotedef_38">
<a href="#_footnoteref_38">38</a>. Only if double precision is supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_fp64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_39">
<a href="#_footnoteref_39">39</a>. Only if double precision is supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_fp64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_40">
<a href="#_footnoteref_40">40</a>. If an implementation extends this specification to support IEEE-754 flags or exceptions, then all built-in functions defined in the following table shall proceed without raising the <em>invalid</em> floating-point exception when one or more of the operands are NaNs.
</div>
<div class="footnote" id="_footnotedef_41">
<a href="#_footnoteref_41">41</a>. Only if 64-bit integers are supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_int64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_42">
<a href="#_footnoteref_42">42</a>. Only if double precision is supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_fp64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_43">
<a href="#_footnoteref_43">43</a>. This definition means that the behavior of select and the ternary operator for vector and scalar types is dependent on different interpretations of the bit pattern of <em>c</em>.
</div>
<div class="footnote" id="_footnotedef_44">
<a href="#_footnoteref_44">44</a>. Only if 64-bit integers are supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_int64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_45">
<a href="#_footnoteref_45">45</a>. Only if double precision is supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_fp64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_46">
<a href="#_footnoteref_46">46</a>. <strong>vload3</strong> and <strong>vload_half3</strong> read (<em>x</em>,<em>y</em>,<em>z</em>) components from address <code>(<em>p</em> + (<em>offset</em> * 3))</code> into a 3-component vector. <strong>vstore3</strong> and <strong>vstore_half3</strong> write (<em>x</em>,<em>y</em>,<em>z</em>) components from a 3-component vector to address <code>(<em>p</em> + (<em>offset</em> * 3))</code>. In addition, <strong>vloada_half3</strong> reads (<em>x</em>,<em>y</em>,<em>z</em>) components from address <code>(<em>p</em> + (<em>offset</em> * 4))</code> into a 3-component vector and <strong>vstorea_half3</strong> writes (<em>x</em>,<em>y</em>,<em>z</em>) components from a 3-component vector to address <code>(<em>p</em> + (<em>offset</em> * 4))</code>. Whether <strong>vloada_half3</strong> and <strong>vstorea_half3</strong> read/write padding data between the third vector element and the next alignment boundary is implementation defined. The <strong>vloada_</strong> and <strong>vstorea_</strong> variants are provided to access data that is aligned to the size of the vector, and are intended to enable performance on hardware that can take advantage of the increased alignment.
</div>
<div class="footnote" id="_footnotedef_47">
<a href="#_footnoteref_47">47</a>. Refer to the description and restrictions for <a href="#memory-scope"><code>memory_scope</code></a>.
</div>
<div class="footnote" id="_footnotedef_48">
<a href="#_footnoteref_48">48</a>. Only if 64-bit integers are supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_int64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_49">
<a href="#_footnoteref_49">49</a>. Only if double precision is supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_fp64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_50">
<a href="#_footnoteref_50">50</a>. <strong>async_work_group_copy</strong> and <strong>async_work_group_strided_copy</strong> for 3-component vector types behave as <strong>async_work_group_copy</strong> and <strong>async_work_group_strided_copy</strong> respectively for 4-component vector types.
</div>
<div class="footnote" id="_footnotedef_51">
<a href="#_footnoteref_51">51</a>. The <a href="#C11-spec">C11</a> consume operation is not supported.
</div>
<div class="footnote" id="_footnotedef_52">
<a href="#_footnoteref_52">52</a>. The <code>atomic_long</code> and <code>atomic_ulong</code> types are supported if the <strong>cl_khr_int64_base_atomics</strong> and <strong>cl_khr_int64_extended_atomics</strong> extensions are supported and have been enabled. If this is the case then an OpenCL C 3.0 compiler must also define the <code>__opencl_c_int64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_53">
<a href="#_footnoteref_53">53</a>. The <code>atomic_double</code> type is only supported if double precision is supported and the <strong>cl_khr_int64_base_atomics</strong> and <strong>cl_khr_int64_extended_atomics</strong> extensions are supported and have been enabled. If this is the case then an OpenCL C 3.0 compiler must also define the <code>__opencl_c_fp64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_54">
<a href="#_footnoteref_54">54</a>. If the device address space is 64 bits, the data types <code>atomic_intptr_t</code>, <code>atomic_uintptr_t</code>, <code>atomic_size_t</code> and <code>atomic_ptrdiff_t</code> are supported if the <strong>cl_khr_int64_base_atomics</strong> and <strong>cl_khr_int64_extended_atomics</strong> extensions are supported and have been enabled.
</div>
<div class="footnote" id="_footnotedef_55">
<a href="#_footnoteref_55">55</a>. This spurious failure enables implementation of compare-and-exchange on a broader class of machines, e.g. load-locked store-conditional machines.
</div>
<div class="footnote" id="_footnotedef_56">
<a href="#_footnoteref_56">56</a>. Only if the <strong>cl_khr_fp16</strong> extension is supported and has been enabled.
</div>
<div class="footnote" id="_footnotedef_57">
<a href="#_footnoteref_57">57</a>. Only if 64-bit integers are supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_int64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_58">
<a href="#_footnoteref_58">58</a>. Only if double precision is supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_fp64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_59">
<a href="#_footnoteref_59">59</a>. Note that <strong>0</strong> is taken as a flag, not as the beginning of a field width.
</div>
<div class="footnote" id="_footnotedef_60">
<a href="#_footnoteref_60">60</a>. The results of all floating conversions of a negative zero, and of negative values that round to zero, include a minus sign.
</div>
<div class="footnote" id="_footnotedef_61">
<a href="#_footnoteref_61">61</a>. Only if the <strong>cl_khr_fp16</strong> extension is supported and has been enabled.
</div>
<div class="footnote" id="_footnotedef_62">
<a href="#_footnoteref_62">62</a>. When applied to infinite and NaN values, the <strong>-</strong>, <strong>+</strong>, and <em>space</em> flag characters have their usual meaning; the <strong>#</strong> and <strong>0</strong> flag characters have no effect.
</div>
<div class="footnote" id="_footnotedef_63">
<a href="#_footnoteref_63">63</a>. Binary implementations can choose the hexadecimal digit to the left of the decimal-point character so that subsequent digits align to nibble (4-bit) boundaries.
</div>
<div class="footnote" id="_footnotedef_64">
<a href="#_footnoteref_64">64</a>. No special provisions are made for multibyte characters. The behavior of <strong>printf</strong> with the <strong>s</strong> conversion specifier is undefined if the argument value is not a pointer to a literal string.
</div>
<div class="footnote" id="_footnotedef_65">
<a href="#_footnoteref_65">65</a>. This is similar to the <code>GL_ADDRESS_CLAMP_TO_BORDER</code> addressing mode.
</div>
<div class="footnote" id="_footnotedef_66">
<a href="#_footnoteref_66">66</a>. Note that the built-in function calls to read images with a sampler are not supported for <code>image1d_buffer_t</code> image types.
</div>
<div class="footnote" id="_footnotedef_67">
<a href="#_footnoteref_67">67</a>. Although <code>CL_UNORM_INT_101010_2</code> was added in OpenCL 2.1, because there was no OpenCL C 2.1 this image channel order <a href="#unified-spec">requires</a> OpenCL 3.0.
</div>
<div class="footnote" id="_footnotedef_68">
<a href="#_footnoteref_68">68</a>. Only if the <strong>cl_khr_fp16</strong> extension is supported and has been enabled.
</div>
<div class="footnote" id="_footnotedef_69">
<a href="#_footnoteref_69">69</a>. Only if 64-bit integers are supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_int64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_70">
<a href="#_footnoteref_70">70</a>. Only if double precision is supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_fp64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_71">
<a href="#_footnoteref_71">71</a>. The <code>half</code> scalar and vector types can only be used if the <strong>cl_khr_fp16</strong> extension is supported and has been enabled. The <code>double</code> scalar and vector types can only be used if <code>double</code> precision is supported, e.g. for OpenCL C 3.0 the <code>__opencl_c_fp64</code> feature macro is present.
</div>
<div class="footnote" id="_footnotedef_72">
<a href="#_footnoteref_72">72</a>. The <code>half</code> scalar and vector types can only be used if the <strong>cl_khr_fp16</strong> extension is supported and has been enabled. The <code>double</code> scalar and vector types can only be used if <code>double</code> precision is supported, e.g. for OpenCL C 3.0 the <code>__opencl_c_fp64</code> feature macro is present.
</div>
<div class="footnote" id="_footnotedef_73">
<a href="#_footnoteref_73">73</a>. The <code>half</code> scalar and vector types can only be used if the <strong>cl_khr_fp16</strong> extension is supported and has been enabled. The <code>double</code> scalar and vector types can only be used if <code>double</code> precision is supported, e.g. for OpenCL C 3.0 the <code>__opencl_c_fp64</code> feature macro is present.
</div>
<div class="footnote" id="_footnotedef_74">
<a href="#_footnoteref_74">74</a>. Implementations are not required to honor this flag. However, an implementation must not schedule the kernel launch earlier than the point specified by this flag.
</div>
<div class="footnote" id="_footnotedef_75">
<a href="#_footnoteref_75">75</a>. Here, <em>immediate</em> means that side effects resulting from child kernels are not included. The side effects include stores to <code>global</code> memory and pipe reads and writes.
</div>
<div class="footnote" id="_footnotedef_76">
<a href="#_footnoteref_76">76</a>. This acts as a memory synchronization point between work-items in a work-group and child kernels enqueued by work-items in the work-group.
</div>
<div class="footnote" id="_footnotedef_77">
<a href="#_footnoteref_77">77</a>. Only if 64-bit integers are supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_int64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_78">
<a href="#_footnoteref_78">78</a>. Only if the <strong>cl_khr_fp16</strong> extension is supported and has been enabled.
</div>
<div class="footnote" id="_footnotedef_79">
<a href="#_footnoteref_79">79</a>. Only if double precision is supported. In OpenCL C 3.0 this will be indicated by the presence of the <code>__opencl_c_fp64</code> feature macro.
</div>
<div class="footnote" id="_footnotedef_80">
<a href="#_footnoteref_80">80</a>. Except for the embedded profile where either round to zero or round to nearest rounding mode may be supported for single precision floating-point.
</div>
<div class="footnote" id="_footnotedef_81">
<a href="#_footnoteref_81">81</a>. On some implementations, <strong>powr</strong>() or <strong>pown</strong>() may perform faster than <strong>pow</strong>(). If <em>x</em> is known to be &gt;= 0, consider using <strong>powr</strong>() in place of <strong>pow</strong>(), or if <em>y</em> is known to be an integer, consider using <strong>pown</strong>() in place of <strong>pow</strong>().
</div>
<div class="footnote" id="_footnotedef_82">
<a href="#_footnoteref_82">82</a>. Here <code>TYPE_MIN</code> and <code>TYPE_MIN_EXP</code> should be substituted by constants appropriate to the floating-point type under consideration, such as <code>FLT_MIN</code> and <code>FLT_MIN_EXP</code> for <code>float</code>.
</div>
</div>
<div id="footer">
<div id="footer-text">
Version v3.0.5<br>
Last updated 2020-09-29 16:22:41 -0700
</div>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.4/MathJax.js?config=TeX-MML-AM_HTMLorMML"></script>
</body>
</html>