<!-- buildOptionsInc.xml, section 5.6.4 -->
<bridgehead>Compiler Options</bridgehead>
<para>
The compiler options are categorized as preprocessor options, options for math intrinsics,
options that control optimization, and miscellaneous options. This specification defines
a standard set of options that must be supported by the OpenCL C compiler when building
program executables online or offline. These may be extended by a set of vendor- or
platform-specific options.
</para>
<bridgehead>Preprocessor Options</bridgehead>
<para>
These options control the OpenCL C preprocessor, which is run on each program source before
actual compilation.
</para>
<para>
-D options are processed in the order they are given in the
<varname>options</varname> argument to
<function>clBuildProgram</function> or <function>clCompileProgram</function>.
</para>
<variablelist>
<varlistentry>
<term>-D name</term>
<listitem>
<para>
Predefine <varname>name</varname> as a macro, with definition 1.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>-D name=definition</term>
<listitem>
<para>
The contents of <varname>definition</varname> are tokenized and processed as
if they appeared during translation phase three in a '#define' directive. In
particular, the definition will be truncated by embedded newline characters.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>-I dir</term>
<listitem>
<para>
Add the directory <varname>dir</varname> to the list of directories to be
searched for header files.
</para>
</listitem>
</varlistentry>
</variablelist>
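<para>
The following is a minimal sketch of how a host application might assemble the
preprocessor part of the <varname>options</varname> string. The helper name
<code>preprocessor_options</code> and the macro names are hypothetical and are not part
of the OpenCL API. The sketch keeps the <code>-D</code> options in the order given and
rejects a definition that contains a newline, since such a definition would be truncated.
It does not attempt to quote definitions that contain spaces.
</para>
<programlisting><![CDATA[
```python
def preprocessor_options(defines, include_dirs=()):
    """Return a string of -D and -I options, preserving the given order.

    `defines` is a sequence of (name, definition) pairs. A definition of
    None produces the bare `-D name` form, which predefines the macro as 1.
    This helper is illustrative only; it is not an OpenCL API.
    """
    parts = []
    for name, definition in defines:
        if definition is None:
            parts.append(f"-D {name}")
        else:
            # An embedded newline would truncate the definition, so reject
            # it here instead of silently losing part of the value.
            if "\n" in definition:
                raise ValueError(f"definition of {name} contains a newline")
            parts.append(f"-D {name}={definition}")
    for directory in include_dirs:
        parts.append(f"-I {directory}")
    return " ".join(parts)


# Hypothetical macro and directory names, for illustration only.
options = preprocessor_options(
    [("USE_FAST_PATH", None), ("TILE", "16")], ["include"]
)
print(options)  # -D USE_FAST_PATH -D TILE=16 -I include
```
]]></programlisting>
<para>
The resulting string would then be passed as the <varname>options</varname> argument to
<function>clBuildProgram</function> or <function>clCompileProgram</function>.
</para>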
<bridgehead>Math Intrinsics Options</bridgehead>
<para>
These options control compiler behavior regarding floating-point arithmetic. These options trade
off between speed and correctness.
</para>
<variablelist>
<varlistentry>
<term>-cl-single-precision-constant</term>
<listitem>
<para>
Treat double precision floating-point constants as single precision constants.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>-cl-denorms-are-zero</term>
<listitem>
<para>
This option controls how single precision and double precision denormalized
numbers are handled. If specified as a build option, the single precision
denormalized numbers may be flushed to zero; double precision denormalized
numbers may also be flushed to zero if the optional extension for double
precision is supported. This is intended to be a performance hint and the
OpenCL compiler can choose not to flush denorms to zero if the device supports
single precision (or double precision) denormalized numbers.
</para>
<para>
This option is ignored for single precision numbers if the
device does not support single precision denormalized numbers
i.e. <constant>CL_FP_DENORM</constant> bit is not set in
<constant>CL_DEVICE_SINGLE_FP_CONFIG</constant>.
</para>
<para>
This option is ignored for double precision numbers if the device
does not support double precision or if it does support double
precision but not double precision denormalized
numbers i.e. <constant>CL_FP_DENORM</constant> bit is not set in
<constant>CL_DEVICE_DOUBLE_FP_CONFIG</constant>.
</para>
<para>
This flag only applies for scalar and vector single precision floating-point
variables and computations on these floating-point variables inside a
program. It does not apply to reading from or writing to image objects.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>-cl-fp32-correctly-rounded-divide-sqrt</term>
<listitem>
<para>
The <code>-cl-fp32-correctly-rounded-divide-sqrt</code> build option to
<function>clBuildProgram</function> or <function>clCompileProgram</function>
allows an application to specify that single precision floating-point divide
(x/y and 1/x) and sqrt used in the program source are correctly rounded.
If this build option is not specified, the minimum numerical accuracy of
single precision floating-point divide and sqrt is as defined in section
7.4 of the OpenCL specification.
</para>
<para>
This build option can only be specified if
<constant>CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT</constant> is set
in <constant>CL_DEVICE_SINGLE_FP_CONFIG</constant> (as defined
in the table of allowed values for <varname>param_name</varname> for
<citerefentry><refentrytitle>clGetDeviceInfo</refentrytitle></citerefentry>) for
the devices for which the program is being built. <function>clBuildProgram</function>
or <function>clCompileProgram</function> will fail to compile the program for
a device if the <code>-cl-fp32-correctly-rounded-divide-sqrt</code> option is specified
and <constant>CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT</constant> is not set for
the device.
</para>
</listitem>
</varlistentry>
</variablelist>
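<para>
As an illustration of the device checks described above, the sketch below selects math
intrinsics options from a <constant>CL_DEVICE_SINGLE_FP_CONFIG</constant> bitfield that
the application is assumed to have already queried with
<function>clGetDeviceInfo</function>. The function name
<code>math_intrinsics_options</code> and its parameters are hypothetical. The two bit
values are those found in the OpenCL <code>cl.h</code> header.
</para>
<programlisting><![CDATA[
```python
# Bit values as defined in the OpenCL cl.h header.
CL_FP_DENORM = 1 << 0
CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT = 1 << 7


def math_intrinsics_options(single_fp_config,
                            flush_denorms=False,
                            correct_divide_sqrt=False):
    """Return the math intrinsics options that make sense for one device.

    `single_fp_config` is the CL_DEVICE_SINGLE_FP_CONFIG bitfield.
    This helper is illustrative only; it is not an OpenCL API.
    """
    options = []
    if flush_denorms and (single_fp_config & CL_FP_DENORM):
        # The option is ignored for single precision when CL_FP_DENORM is
        # not set, so this sketch only emits it when it can have an effect.
        options.append("-cl-denorms-are-zero")
    if correct_divide_sqrt:
        if not (single_fp_config & CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT):
            # The build would fail for this device, so report it early.
            raise ValueError(
                "device does not set CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT"
            )
        options.append("-cl-fp32-correctly-rounded-divide-sqrt")
    return " ".join(options)


config = CL_FP_DENORM | CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT
print(math_intrinsics_options(config, flush_denorms=True,
                              correct_divide_sqrt=True))
```
]]></programlisting>
<para>
Raising an error before the build is a design choice of this sketch. An application
could equally let <function>clBuildProgram</function> fail and inspect the build status.
</para>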
<bridgehead>Optimization Options</bridgehead>
<para>
These options control various sorts of optimizations. Turning on optimization flags
makes the compiler attempt to improve the performance and/or code size at the expense of
compilation time and possibly the ability to debug the program.
</para>
<variablelist>
<varlistentry>
<term>-cl-opt-disable</term>
<listitem>
<para>
This option disables all optimizations. By default, optimizations are enabled.
</para>
</listitem> </varlistentry>
</variablelist>
<para>
The following options control compiler behavior regarding floating-point arithmetic. These
options trade off between performance and correctness and must be specifically enabled.
These options are not turned on by default since they can result in incorrect output for
programs which depend on an exact implementation of IEEE 754 rules/specifications for
math functions.
</para>
<variablelist>
<varlistentry>
<term>-cl-mad-enable</term>
<listitem>
<para>
Allow <code>a * b + c</code> to be replaced by a <code>mad</code>. The
<code>mad</code> computes <code>a * b + c</code> with reduced accuracy. For
example, some OpenCL devices implement <code>mad</code> by truncating the result
of <code>a * b</code> before adding it to <code>c</code>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>-cl-no-signed-zeros</term>
<listitem>
<para>
Allow optimizations for floating-point arithmetic that ignore the signedness of
zero. IEEE 754 arithmetic specifies the distinct behavior of <code>+0.0</code>
and <code>-0.0</code> values, which then prohibits simplification of expressions
such as <code>x+0.0</code> or <code>0.0*x</code> (even with
<code>-cl-finite-math-only</code>). This option implies that the sign of a zero result isn't significant.
</para>
</listitem> </varlistentry>
<varlistentry>
<term>-cl-unsafe-math-optimizations</term>
<listitem>
<para>
Allow optimizations for floating-point arithmetic that (a) assume that
arguments and results are valid, (b) may violate the IEEE 754 standard, and (c)
may violate the OpenCL numerical compliance requirements as defined in section
7.4 for single precision and double precision floating-point, and edge case
behavior in section 7.5. This option includes the <code>-cl-no-signed-zeros</code> and
<code>-cl-mad-enable</code> options.
</para>
</listitem> </varlistentry>
<varlistentry>
<term>-cl-finite-math-only</term>
<listitem>
<para>
Allow optimizations for floating-point arithmetic that assume that arguments and
results are not NaNs or &#x000B1;&#x0221E;. This option may violate the OpenCL
numerical compliance requirements defined in section 7.4 for single precision
and double precision floating point, and edge case behavior in section 7.5.
</para>
</listitem> </varlistentry>
<varlistentry>
<term>-cl-fast-relaxed-math</term>
<listitem>
<para>
Sets the optimization options <code>-cl-finite-math-only</code> and
<code>-cl-unsafe-math-optimizations</code>. This allows optimizations for floating-point
arithmetic that may violate the IEEE 754 standard and the OpenCL numerical
compliance requirements defined in the specification in section 7.4 for
single precision and double precision floating-point, and edge case
behavior in section 7.5. This option also relaxes the precision of commonly
used math functions (refer to
table 7.2 defined in section 7.4).
This option causes the preprocessor macro
<code>__FAST_RELAXED_MATH__</code> to be defined in the OpenCL program.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>-cl-uniform-work-group-size</term>
<listitem>
<para>
This requires that the global work-size be
a multiple of the work-group size specified to
<citerefentry><refentrytitle>clEnqueueNDRangeKernel</refentrytitle></citerefentry>.
Allow optimizations that are made possible by this restriction.
</para>
</listitem>
</varlistentry>
</variablelist>
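<para>
Several of the options above imply others. The sketch below expands a set of options into
the full set that is in effect, using only the implications stated in this section. The
table <code>IMPLIED</code> and the function <code>effective_options</code> are
hypothetical names chosen for this example.
</para>
<programlisting><![CDATA[
```python
# Implications taken from the option descriptions in this section.
IMPLIED = {
    "-cl-fast-relaxed-math": [
        "-cl-finite-math-only",
        "-cl-unsafe-math-optimizations",
    ],
    "-cl-unsafe-math-optimizations": [
        "-cl-no-signed-zeros",
        "-cl-mad-enable",
    ],
}


def effective_options(options):
    """Return the set of options in effect, including implied ones.

    This helper is illustrative only; it is not an OpenCL API.
    """
    seen = set()
    pending = list(options)
    while pending:
        option = pending.pop()
        if option not in seen:
            seen.add(option)
            # Follow implications transitively.
            pending.extend(IMPLIED.get(option, []))
    return seen


for option in sorted(effective_options(["-cl-fast-relaxed-math"])):
    print(option)
```
]]></programlisting>
<para>
A single <code>-cl-fast-relaxed-math</code> therefore brings four further options into
effect, which is worth keeping in mind when a kernel depends on the sign of zero or on
NaN handling.
</para>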
<bridgehead>Options to Request or Suppress Warnings</bridgehead>
<para>
Warnings are diagnostic messages that report constructions which are not inherently erroneous
but which are risky or suggest there may have been an error. The following language-independent
options do not enable specific warnings but control the kinds of diagnostics
produced by the OpenCL compiler.
</para>
<variablelist>
<varlistentry>
<term>-w</term>
<listitem>
<para>
Inhibit all warning messages.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>-Werror</term>
<listitem>
<para>
Make all warnings into errors.
</para>
</listitem>
</varlistentry>
</variablelist>
<bridgehead>Options Controlling the OpenCL C Version</bridgehead>
<para>
The following option controls the version of OpenCL C that the compiler accepts.
</para>
<variablelist>
<varlistentry>
<term>-cl-std=</term>
<listitem>
<para>
Determine the OpenCL C language version to use. A value for this option must
be provided. Valid values are:
</para>
<para>
<code>CL1.1</code> - Support all OpenCL C programs that use the OpenCL C language features
defined in section 6 of the OpenCL 1.1 specification.
</para>
<para>
<code>CL1.2</code> - Support all OpenCL C programs that use the OpenCL C language features
defined in section 6 of the OpenCL 1.2 specification.
</para>
<para>
<code>CL2.0</code> - Support all OpenCL C programs that use the OpenCL C language features
defined in section 6 of the OpenCL 2.0 specification.
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
Calls to <function>clBuildProgram</function> or <function>clCompileProgram</function> with
the <code>-cl-std=CL1.1</code> option will fail to compile the program for any devices with
<constant>CL_DEVICE_OPENCL_C_VERSION</constant> = OpenCL C 1.0.
</para>
<para>
Calls to <function>clBuildProgram</function> or <function>clCompileProgram</function> with
the <code>-cl-std=CL1.2</code> option will fail to compile the program for any devices with
<constant>CL_DEVICE_OPENCL_C_VERSION</constant> = OpenCL C 1.0 or OpenCL C 1.1.
</para>
<para>
Calls to <function>clBuildProgram</function> or <function>clCompileProgram</function> with
the <code>-cl-std=CL2.0</code> option will fail to compile the program for any devices with
<constant>CL_DEVICE_OPENCL_C_VERSION</constant> = OpenCL C 1.0, OpenCL C 1.1, or OpenCL C 1.2.
</para>
<para>
If the <code>-cl-std</code> build option is not specified, the
highest OpenCL C 1.x language version supported
by each device is used when compiling the program
for each device. Applications are required
to specify the <code>-cl-std=CL2.0</code> option if they want
to compile or build their programs with
OpenCL C 2.0.
</para>
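<para>
The version rules above can be sketched as a small check performed before building. The
sketch assumes that <constant>CL_DEVICE_OPENCL_C_VERSION</constant> returns a string that
begins with <code>OpenCL C</code> followed by a <code>major.minor</code> version, and it
treats every combination not listed above as acceptable, which is an inference rather than
a statement in the text. All function names are hypothetical.
</para>
<programlisting><![CDATA[
```python
def parse_version(text):
    """Turn a 'major.minor' string such as '1.2' into a tuple (1, 2)."""
    major, minor = text.split(".")
    return int(major), int(minor)


def device_c_version(version_string):
    """Extract the version from a CL_DEVICE_OPENCL_C_VERSION string.

    Assumed layout: 'OpenCL C <major.minor> <vendor-specific text>'.
    """
    fields = version_string.split()
    return parse_version(fields[2])


def can_compile(cl_std, version_string):
    """Return False when -cl-std=<cl_std> must fail for this device.

    `cl_std` is the option value, for example 'CL1.2'.
    This helper is illustrative only; it is not an OpenCL API.
    """
    requested = parse_version(cl_std[len("CL"):])
    return requested <= device_c_version(version_string)


print(can_compile("CL1.2", "OpenCL C 1.1 ExampleVendor"))  # False
print(can_compile("CL2.0", "OpenCL C 2.0 ExampleVendor"))  # True
```
]]></programlisting>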
<bridgehead>Options Enabled by the cl_khr_spir Extension</bridgehead>
<variablelist>
<varlistentry>
<term>-x spir</term>
<listitem>
<para>
Indicates that the binary is in SPIR format.
</para>
</listitem>
</varlistentry>
</variablelist>
<variablelist>
<varlistentry>
<term>-spir-std</term>
<listitem>
<para>
Specifies the version of the SPIR specification that
describes the format and meaning of the binary. For
example, if the binary is as described in
SPIR version 1.2, then <code>-spir-std=1.2</code> must
be specified. Failing to specify these compile options
may result in implementation-defined behavior.
</para>
</listitem>
</varlistentry>
</variablelist>
<bridgehead>Options for Querying Kernel Argument Information</bridgehead>
<variablelist>
<varlistentry>
<term>-cl-kernel-arg-info</term>
<listitem>
<para>
This option allows the compiler to store information about the
arguments of the kernels in the program executable. The argument
information stored includes the argument name, its type, and
the address and access qualifiers used. Refer to the description of
<citerefentry><refentrytitle>clGetKernelArgInfo</refentrytitle></citerefentry>
for how to query this information.
</para>
</listitem>
</varlistentry>
</variablelist>
<bridgehead>Options for Debugging Your Program</bridgehead>
<variablelist>
<varlistentry>
<term>-g</term>
<listitem>
<para>
This option can currently be used to generate
additional errors for the built-in functions
that allow you to enqueue commands on a device
(refer to section 6.13.17).
</para>
</listitem>
</varlistentry>
</variablelist>
<bridgehead>Linker Options</bridgehead>
<para>
This specification defines a standard set of linker options that must be supported by the
OpenCL C compiler when linking compiled programs online or offline. These linker options
are categorized as library linking options and program linking options. These may be
extended by a set of vendor- or platform-specific options.
</para>
<bridgehead>Library Linking Options</bridgehead>
<para>
The following options can be specified when creating a library of compiled binaries.
</para>
<variablelist>
<varlistentry>
<term>-create-library</term>
<listitem>
<para>
Create a library of compiled binaries specified
in <varname>input_programs</varname> argument to
<citerefentry><refentrytitle>clLinkProgram</refentrytitle></citerefentry>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>-enable-link-options</term>
<listitem>
<para>
Allows the linker to modify the library behavior based on one or more link
options (described in Program Linking Options, below) when this library is
linked with a program executable. This option must be specified with the
<code>-create-library</code> option.
</para>
</listitem>
</varlistentry>
</variablelist>
<bridgehead>Program Linking Options</bridgehead>
<para>
The following options can be specified when linking a program executable.
</para>
<itemizedlist mark="disc">
<listitem> <code>-cl-denorms-are-zero</code> </listitem>
<listitem> <code>-cl-no-signed-zeros</code> </listitem>
<listitem> <code>-cl-unsafe-math-optimizations</code> </listitem>
<listitem> <code>-cl-finite-math-only</code> </listitem>
<listitem> <code>-cl-fast-relaxed-math</code> </listitem>
</itemizedlist>
<para>
The linker may apply these options to all compiled program objects specified to
<citerefentry><refentrytitle>clLinkProgram</refentrytitle></citerefentry>. For libraries,
the linker may apply these options only to those which were created with the
<code>-enable-link-options</code> option.
</para>
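<para>
The constraint between the two library linking options can be checked before calling
<citerefentry><refentrytitle>clLinkProgram</refentrytitle></citerefentry>. The following
is a minimal sketch; the function name <code>check_link_options</code> is hypothetical,
and the sketch deliberately accepts unknown options because vendors may define their own.
</para>
<programlisting><![CDATA[
```python
# Options that may be given when linking a program executable.
PROGRAM_LINK_OPTIONS = {
    "-cl-denorms-are-zero",
    "-cl-no-signed-zeros",
    "-cl-unsafe-math-optimizations",
    "-cl-finite-math-only",
    "-cl-fast-relaxed-math",
}


def check_link_options(options):
    """Split a linker options string and check the library constraint.

    Returns a pair (library_options, program_options). Options that are
    in neither group are ignored here, since they may be vendor-specific.
    This helper is illustrative only; it is not an OpenCL API.
    """
    tokens = options.split()
    library = [t for t in tokens
               if t in ("-create-library", "-enable-link-options")]
    program = [t for t in tokens if t in PROGRAM_LINK_OPTIONS]
    # -enable-link-options must be specified with -create-library.
    if "-enable-link-options" in library and "-create-library" not in library:
        raise ValueError("-enable-link-options requires -create-library")
    return library, program


print(check_link_options("-create-library -enable-link-options"))
print(check_link_options("-cl-finite-math-only -cl-denorms-are-zero"))
```
]]></programlisting>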
<bridgehead>Separate Compilation and Linking of Programs</bridgehead>
<para>
OpenCL programs are compiled and linked to support the following:
</para>
<itemizedlist mark="disc">
<listitem>
Separate compilation and link stages. Program sources can be compiled to generate
a compiled binary object and linked in a separate stage with other compiled program
objects to form the program executable.
</listitem>
<listitem>
Embedded headers. In OpenCL 1.0 and 1.1, the <code>-I</code> build option could be used to
specify the list of directories to be searched for header files that are included
by program sources. OpenCL 1.2 extends this by allowing the header sources to
come from program objects instead of just header files.
</listitem>
<listitem>
Libraries. The linker can be used to link compiled objects and libraries into a
program executable or to create a library of compiled binaries.
</listitem>
</itemizedlist>
<!-- 23-Dec-2013, rev. 19 -->