| # SPIR-V Assembly language syntax | 
 |  | 
 | ## Overview | 
 |  | 
 | The assembly attempts to adhere to the binary form from Section 3 of the SPIR-V | 
 | spec as closely as possible, with one exception aiming at improving the text's | 
 | readability.  The `<result-id>` generated by an instruction is moved to the | 
 | beginning of that instruction and followed by an `=` sign.  This allows us to | 
 | distinguish between variable definitions and uses and locate value definitions | 
 | more easily. | 
 |  | 
 | Here is an example: | 
 |  | 
 | ``` | 
 |      OpCapability Shader | 
 |      OpMemoryModel Logical Simple | 
 |      OpEntryPoint GLCompute %3 "main" | 
 |      OpExecutionMode %3 LocalSize 64 64 1 | 
 | %1 = OpTypeVoid | 
 | %2 = OpTypeFunction %1 | 
 | %3 = OpFunction %1 None %2 | 
 | %4 = OpLabel | 
 |      OpReturn | 
 |      OpFunctionEnd | 
 | ``` | 
 |  | 
 | A module is a sequence of instructions, separated by whitespace. | 
 | An instruction is an opcode name followed by operands, separated by | 
 | whitespace.  Typically each instruction is presented on its own line, | 
 | but the assembler does not enforce this rule. | 
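
The whitespace rule can be illustrated with a small tokenizer. This is a minimal Python sketch, not the assembler's actual implementation: the function name `tokenize` is hypothetical, and comments and error reporting are left out. It keeps a double-quoted string together as one token, so whitespace inside a string does not split it.

```python
def tokenize(text):
    """Split assembly text into whitespace-separated tokens.

    A double-quoted string is kept as a single token, quotes included.
    """
    tokens = []
    i, n = 0, len(text)
    while i < n:
        if text[i].isspace():
            i += 1
            continue
        start = i
        if text[i] == '"':
            i += 1
            while i < n and text[i] != '"':
                # A backslash escapes the next character, so skip both.
                i += 2 if text[i] == "\\" else 1
            i += 1  # consume the closing quote
        else:
            while i < n and not text[i].isspace():
                i += 1
        tokens.append(text[start:i])
    return tokens
```

With this sketch, `tokenize("OpCapability Shader\nOpReturn")` and `tokenize("OpCapability Shader OpReturn")` yield the same three tokens, which is why the line layout does not matter.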
 |  | 
 | The opcode names and expected operands are described in Section 3 of | 
 | the SPIR-V specification.  An operand is one of: | 
 | * a literal integer: A decimal integer, or a hexadecimal integer. | 
 |   A hexadecimal integer is indicated by a leading `0x` or `0X`.  A hex | 
 |   integer supplied for a signed integer value will be sign-extended. | 
 |   For example, `0xffff` supplied as the literal for an `OpConstant` | 
 |   on a signed 16-bit integer type will be interpreted as the value `-1`. | 
 | * a literal floating point number, in decimal or hexadecimal form. | 
 |   See [below](#floats). | 
 | * a literal string. | 
 |    * A literal string is everything following a double-quote `"` until the | 
 |      following un-escaped double-quote. This includes special characters such | 
 |      as newlines. | 
 |    * A backslash `\` may be used to escape characters in the string. The `\` | 
 |      may be used to escape a double-quote or a `\` but is simply ignored when | 
 |      preceding any other character. | 
 | * a named enumerated value, specific to that operand position.  For example, | 
 |   the `OpMemoryModel` takes a named Addressing Model operand (e.g. `Logical` or | 
 |   `Physical32`), and a named Memory Model operand (e.g. `Simple` or `OpenCL`). | 
 |   Named enumerated values are only meaningful in specific positions, and will | 
 |   otherwise generate an error. | 
 | * a mask expression, consisting of one or more mask enum names separated | 
 |   by `|`.  For example, the expression `NotNaN|NotInf|NSZ` denotes the mask | 
 |   which is the combination of the `NotNaN`, `NotInf`, and `NSZ` flags. | 
 | * an injected immediate integer: `!<integer>`.  See [below](#immediate). | 
 | * an ID, e.g. `%foo`. See [below](#id). | 
 | * the name of an extended instruction.  For example, `sqrt` in an extended | 
 |   instruction such as `%f = OpExtInst %f32 %OpenCLImport sqrt %arg`. | 
 | * the name of an opcode for OpSpecConstantOp, but where the `Op` prefix | 
 |   is removed.  For example, the following indicates the use of an integer | 
 |   addition in a specialization constant computation: | 
 |   `%sum = OpSpecConstantOp %i32 IAdd %a %b` | 
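
Three of the operand rules above are easy to check by hand. The following Python sketch is illustrative only: the helper names (`sign_extend_hex`, `unescape`, `mask_value`) are hypothetical, and the small `FP_FAST_MATH_MODE` table lists just the three flags used in the mask example, with their values taken from the SPIR-V specification.

```python
# Subset of the FP Fast Math Mode flags, enough for the example above.
FP_FAST_MATH_MODE = {"NotNaN": 0x1, "NotInf": 0x2, "NSZ": 0x4}


def sign_extend_hex(text, bits):
    """Interpret a hex literal as a signed integer of the given width."""
    value = int(text, 16)  # int() accepts a leading 0x or 0X with base 16
    if value >= 1 << bits:
        raise ValueError(f"{text} does not fit in {bits} bits")
    if value & (1 << (bits - 1)):  # top bit set: the value is negative
        value -= 1 << bits
    return value


def unescape(quoted):
    """Return the value of a quoted string literal.

    A backslash is dropped and the character after it is kept verbatim.
    """
    body = quoted[1:-1]  # strip the surrounding double quotes
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            i += 1  # skip the backslash itself
        out.append(body[i])
        i += 1
    return "".join(out)


def mask_value(expression, names):
    """Combine the flags named in a `|`-separated mask expression."""
    value = 0
    for name in expression.split("|"):
        value |= names[name]
    return value
```

For instance, `sign_extend_hex("0xffff", 16)` returns `-1`, and `mask_value("NotNaN|NotInf|NSZ", FP_FAST_MATH_MODE)` returns `7`.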
 |  | 
 | ## ID Definitions & Usage | 
 | <a name="id"></a> | 
 |  | 
 | An ID _definition_ pertains to the `<result-id>` of an instruction, and ID | 
 | _usage_ is a use of an ID as an input to an instruction. | 
 |  | 
 | An ID in the assembly language begins with `%` and must be followed by a name | 
 | consisting of one or more letters, numbers or underscore characters. | 
 |  | 
 | For every ID in the assembly program, the assembler generates a unique number | 
 | called the ID's internal number. Then each ID reference translates into its | 
 | internal number in the SPIR-V output. Internal numbers are unique within the | 
 | compilation unit: no two IDs in the same unit will share internal numbers. | 
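
The naming rule and the name-to-number mapping can be sketched as follows. This is an assumption-laden illustration in Python, not the assembler's code: the class `IdTable` is hypothetical, "letters" is taken to mean ASCII letters, and numbering IDs in first-seen order starting at 1 is only one possible policy. The text above only guarantees that each ID gets a unique number.

```python
import re

# "%" followed by one or more letters, digits, or underscores.
ID_PATTERN = re.compile(r"%[A-Za-z0-9_]+")


class IdTable:
    """Maps ID names to unique internal numbers (hypothetical policy)."""

    def __init__(self):
        self._numbers = {}

    def number_for(self, token):
        if not ID_PATTERN.fullmatch(token):
            raise ValueError(f"not a valid ID: {token!r}")
        name = token[1:]
        if name not in self._numbers:
            # Assumption: hand out 1, 2, 3, ... in order of first appearance.
            self._numbers[name] = len(self._numbers) + 1
        return self._numbers[name]
```

Every later reference to the same name returns the number assigned at its first appearance, so definitions and uses of `%main` end up as the same word in the binary.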
 |  | 
 | The disassembler generates IDs where the name is always a decimal number | 
 | greater than 0. | 
 |  | 
 | The example from the overview can be rewritten using more user-friendly names, as follows: | 
 | ``` | 
 |           OpCapability Shader | 
 |           OpMemoryModel Logical Simple | 
 |           OpEntryPoint GLCompute %main "main" | 
 |           OpExecutionMode %main LocalSize 64 64 1 | 
 |   %void = OpTypeVoid | 
 | %fnMain = OpTypeFunction %void | 
 |   %main = OpFunction %void None %fnMain | 
 | %lbMain = OpLabel | 
 |           OpReturn | 
 |           OpFunctionEnd | 
 | ``` | 
 |  | 
 | ## Floating point literals | 
 | <a name="floats"></a> | 
 |  | 
 | The assembler and disassembler support floating point literals in both | 
 | decimal and hexadecimal form. | 
 |  | 
 | The syntax for a floating point literal is the same as floating point | 
 | constants in the C programming language, except: | 
 | * An optional leading minus (`-`) is part of the literal. | 
 | * An optional type specifier suffix is not allowed. | 
 |  | 
 | Infinity and NaN values are expressed in hexadecimal float literals | 
 | by using the maximum representable exponent for the bit width. | 
 |  | 
 | For example, in 32-bit floating point, 8 bits are used for the exponent, and the | 
 | exponent bias is 127.  So the maximum representable unbiased exponent is 128. | 
 | Therefore, we represent the infinities and some NaNs as follows: | 
 |  | 
 | ``` | 
 | %float32 = OpTypeFloat 32 | 
 | %inf     = OpConstant %float32 0x1p+128 | 
 | %neginf  = OpConstant %float32 -0x1p+128 | 
 | %aNaN    = OpConstant %float32 0x1.8p+128 | 
 | %moreNaN = OpConstant %float32 -0x1.0002p+128 | 
 | ``` | 
 | The assembler preserves all the bits of a NaN value.  For example, the encoding | 
 | of `%aNaN` in the previous example is the same as the word with bits | 
 | `0x7fc00000`, and `%moreNaN` is encoded as `0xff800100`. | 
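
The bit patterns quoted above can be reproduced with a short calculation. The Python sketch below is not the assembler's parser: the function names are hypothetical, and it only accepts normalized literals of the form `[-]0x1[.hhhhhh]p<exp>` with at most 23 fraction bits. It adds the bias of 127 to the exponent and places the fraction in the low 23 bits.

```python
import re
import struct

_HEX_FLOAT = re.compile(r"(-?)0[xX]1(?:\.([0-9a-fA-F]{0,6}))?[pP]([+-]?\d+)")


def f32_bits_from_hex(text):
    """Encode a normalized hex float literal as a 32-bit word."""
    match = _HEX_FLOAT.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported literal: {text!r}")
    sign = 1 if match.group(1) else 0
    # Pad the fraction to 6 hex digits (24 bits); binary32 stores 23.
    fraction24 = int((match.group(2) or "").ljust(6, "0"), 16)
    if fraction24 & 1:
        raise ValueError("fraction needs more than 23 bits")
    biased_exponent = int(match.group(3)) + 127
    if not 1 <= biased_exponent <= 255:
        raise ValueError("exponent out of range for this sketch")
    return (sign << 31) | (biased_exponent << 23) | (fraction24 >> 1)


def f32_from_bits(bits):
    """Reinterpret a 32-bit word as a Python float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

Here `hex(f32_bits_from_hex("0x1.8p+128"))` gives `0x7fc00000` and `hex(f32_bits_from_hex("-0x1.0002p+128"))` gives `0xff800100`, matching the encodings stated above.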
 |  | 
 | The disassembler prints infinite, NaN, and subnormal values in hexadecimal form. | 
 | Zero and normal values are printed in decimal form with enough digits | 
 | to preserve all significand bits. | 
 |  | 
 | ## Arbitrary Integers | 
 | <a name="immediate"></a> | 
 |  | 
 | When writing tests it can be useful to emit an invalid 32-bit word into the | 
 | binary stream at arbitrary positions within the assembly. To insert an | 
 | arbitrary word into the stream, use the `!` prefix, which takes the form | 
 | `!<integer>`. Here is an example: | 
 |  | 
 | ``` | 
 | OpCapability !0x0000FF00 | 
 | ``` | 
 |  | 
 | Any token in a valid assembly program may be replaced by `!<integer>` -- even | 
 | tokens that dictate how the rest of the instruction is parsed.  Consider, for | 
 | example, the following assembly program: | 
 |  | 
 | ``` | 
 | %4 = OpConstant %1 123 456 789 OpExecutionMode %2 LocalSize 11 22 33 | 
 | OpExecutionMode %3 InputLines | 
 | ``` | 
 |  | 
 | The tokens `OpConstant`, `LocalSize`, and `InputLines` may be replaced by random | 
 | `!<integer>` values, and the assembler will still assemble an output binary with | 
 | three instructions.  It will not necessarily be valid SPIR-V, but it will | 
 | faithfully reflect the input text. | 
 |  | 
 | You may wonder how the assembler recognizes the instruction structure (including | 
 | instruction boundaries) in the text with certain crucial tokens replaced by | 
 | arbitrary integers.  If, say, `OpConstant` becomes a `!<integer>` whose value | 
 | differs from the binary representation of `OpConstant` (remember that this | 
 | feature is intended for fine-grain control in SPIR-V testing), the assembler | 
 | generally has no idea what that value stands for.  So how does it know there is | 
 | exactly one `<id>` and three number literals following in that instruction, | 
 | before the next one begins?  And if `LocalSize` is replaced by an arbitrary | 
 | `!<integer>`, how does it know to take the next three tokens (instead of zero or | 
 | one, both of which are possible once the certainty that `LocalSize` provided | 
 | is gone)?  The answer is a simple rule governing the parsing of instructions | 
 | with `!<integer>` in them: | 
 |  | 
 | When a token in the assembly program is a `!<integer>`, that integer value is | 
 | emitted into the binary output, and parsing proceeds differently than before: | 
 | each subsequent token not recognized as an OpCode or a `<result-id>` is emitted | 
 | into the binary output without any checking; when a recognizable OpCode or a | 
 | `<result-id>` is eventually encountered, it begins a new instruction and parsing | 
 | returns to normal.  (If a subsequent OpCode is never found, then this alternate | 
 | parsing mode handles all the remaining tokens in the program.) | 
 |  | 
 | The assembler processes the tokens encountered in alternate parsing mode as | 
 | follows: | 
 |  | 
 | * If the token is a number literal, since context may be lost, the number | 
 |   is interpreted as a 32-bit value and output as a single word.  In order to | 
 |   specify multiple-word literals in alternate-parsing mode, further uses of | 
 |   `!<integer>` tokens may be required. | 
 |   All formats supported by `strtoul()` are accepted. | 
 | * If the token is a string literal, it outputs a sequence of words representing | 
 |   the string as defined in the SPIR-V specification for Literal String. | 
 | * If the token is an ID, it outputs the ID's internal number. | 
 | * If the token is another `!<integer>`, it outputs that integer. | 
 | * Any other token causes the assembler to quit with an error. | 
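
The rule and the token handling above can be sketched in Python. This is a simplified model rather than the real assembler, and everything named here is hypothetical: `KNOWN_OPCODES` is a tiny stand-in for the full opcode table, IDs are numbered in first-seen order, string escapes are ignored, and a number literal must start with a digit. `parse_integer` imitates the base detection of `strtoul()` with base 0 (hex, octal, or decimal).

```python
KNOWN_OPCODES = {"OpConstant", "OpExecutionMode", "OpReturn"}  # stand-in subset


def parse_integer(text):
    """Base detection in the style of strtoul(text, NULL, 0)."""
    if text[:2].lower() == "0x":
        return int(text, 16)
    if len(text) > 1 and text[0] == "0":
        return int(text, 8)
    return int(text, 10)


def string_words(quoted):
    """Pack a string as UTF-8, nul-terminated, four bytes per word."""
    data = quoted[1:-1].encode("utf-8") + b"\0"
    data += b"\0" * (-len(data) % 4)  # pad to a whole number of words
    return [int.from_bytes(data[i:i + 4], "little")
            for i in range(0, len(data), 4)]


def alternate_mode(tokens, start, ids):
    """Emit words from tokens[start], which must be a !<integer>.

    Returns (words, index of the token where normal parsing resumes).
    """
    words = []
    i = start
    while i < len(tokens):
        tok = tokens[i]
        is_result_id = (tok.startswith("%") and i + 1 < len(tokens)
                        and tokens[i + 1] == "=")
        if tok in KNOWN_OPCODES or is_result_id:
            break  # a new instruction begins; back to normal parsing
        if tok.startswith("!"):
            words.append(parse_integer(tok[1:]) & 0xFFFFFFFF)
        elif tok.startswith('"'):
            words.extend(string_words(tok))
        elif tok.startswith("%"):
            words.append(ids.setdefault(tok, len(ids) + 1))
        elif tok[0].isdigit():
            words.append(parse_integer(tok) & 0xFFFFFFFF)
        else:
            raise ValueError(f"cannot handle {tok!r} in alternate mode")
        i += 1
    return words, i
```

For the tokens `!0x0004002B %1 %2 "abc" OpReturn`, the sketch emits the four words `0x0004002B`, `1`, `2`, and `0x00636261`, then hands control back at `OpReturn`.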
 |  | 
 | Note that this has some interesting consequences, including: | 
 |  | 
 | * When an OpCode is replaced by `!<integer>`, the integer value should encode | 
 |   the instruction's word count, as specified in the physical-layout section of | 
 |   the SPIR-V specification. | 
 |  | 
 | * Consecutive instructions may have their OpCode replaced by `!<integer>` and | 
 |   still produce valid SPIR-V.  For example, `!262187 %1 %2 "abc" !327739 %1 %3 6 | 
 |   %2` will successfully assemble into SPIR-V declaring a constant and a | 
 |   PrivateGlobal variable. | 
 |  | 
 | * Enums (such as `DontInline` or `SubgroupMemory`, for instance) are not handled | 
 |   by the alternate parsing mode.  They must be replaced by `!<integer>` for | 
 |   successful assembly. | 
 |  | 
 | * The `<result-id>` on the left-hand side of an assignment cannot be a | 
 |   `!<integer>`. The `<result-id>` can still be manually controlled if desired | 
 |   by expressing the entire instruction as `!<integer>` tokens for its opcode and | 
 |   operands. | 
 |  | 
 | * The `=` sign cannot be processed by the alternate parsing mode if the OpCode | 
 |   following it is a `!<integer>`. | 
 |  | 
 | * When replacing a named ID with `!<integer>`, it is possible to generate | 
 |   unintentionally valid SPIR-V.  If the integer provided happens to equal a | 
 |   number generated for an existing named ID, it will result in a reference to | 
 |   that named ID being output.  This may be valid SPIR-V, contrary to the | 
 |   presumed intention of the writer. | 
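
The first word of an instruction packs the word count into the high 16 bits and the opcode number into the low 16 bits, which is how the integers `262187` and `327739` used above were chosen. The Python helpers below are a hypothetical sketch of that arithmetic; the opcode numbers 43 (`OpConstant`), 59 (`OpVariable`), and 17 (`OpCapability`) come from the SPIR-V specification.

```python
def first_word(word_count, opcode):
    """Build the first word of an instruction."""
    return (word_count << 16) | opcode


def split_first_word(word):
    """Return (word_count, opcode) for an instruction's first word."""
    return word >> 16, word & 0xFFFF
```

`split_first_word(262187)` returns `(4, 43)`, a four-word `OpConstant`, and `split_first_word(327739)` returns `(5, 59)`, a five-word `OpVariable`.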
 |  | 
 | ## Notes | 
 |  | 
 | * Some enumerants cannot be used by name, because the target instruction | 
 | in which they are meaningful takes an ID reference instead of a literal value. | 
 | For example: | 
 |    * Named enumerated value `CmdExecTime` from section 3.30 Kernel | 
 |      Profiling Info is used in constructing a mask value supplied as | 
 |      an ID for `OpCaptureEventProfilingInfo`.  But no other instruction | 
 |      has enough context to bring the enumerant names from section 3.30 | 
 |      into scope. | 
 |    * Similarly, the names in section 3.29 Kernel Enqueue Flags are used to | 
 |      construct a value supplied as an ID to the Flags argument of | 
 |      `OpEnqueueKernel`. | 
 |    * Similarly for the names in section 3.25 Memory Semantics. | 
 |    * Similarly for the names in section 3.27 Scope. | 
 | * Some enumerants cannot be used by name, because they only name values | 
 | returned by an instruction: | 
 |    * Enumerants from 3.12 Image Channel Order name possible values returned | 
 |      by the `OpImageQueryOrder` instruction. | 
 |    * Enumerants from 3.13 Image Channel Data Type name possible values | 
 |      returned by the `OpImageQueryFormat` instruction. |