 Name

     NV_tessellation_program5

 Name Strings

     (none)

 Contact

     Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

 Status

     Shipping.

 Version

     Last Modified Date:         12/19/2011
     NVIDIA Revision:            3

 Number

     391

 Dependencies

     OpenGL 1.1 is required.

     This extension is written against the OpenGL 2.1 specification.

     NV_gpu_program5 is required.  This extension is supported if and only if
     "GL_NV_gpu_program5" is found in the extension string.  This extension is
     written against the NV_gpu_program5 extension.

     This specification interacts with ARB_tessellation_shader.

     This specification interacts with NV_parameter_buffer_object.

 Overview

     This extension, in conjunction with the ARB_tessellation_shader extension,
     introduces a new tessellation stage to the OpenGL primitive processing
     pipeline.  The ARB_tessellation_shader extension provides programmable
     shading functionality using the OpenGL Shading Language as its base; this
     extension provides assembly programmable shaders building on the family of
     assembly programmability extensions including ARB_vertex_program,
     ARB_fragment_program, NV_gpu_program4, and NV_geometry_program4.

     This extension adds a new basic primitive type, called a patch, which
     consists of an array of vertices plus some associated per-patch state.  It
     also adds two new assembly program types:  a tessellation control program
     that transforms a patch into a new patch and a tessellation evaluation
     program that computes the position and attributes of each vertex produced
     by the tessellator.

     When tessellation is active, it begins by running the optional
     tessellation control program, if enabled.  This program consumes a
     variable-size input patch and produces a new fixed-size output patch.  The
     output patch consists of an array of vertices, and a set of per-patch
     attributes.  The per-patch attributes include tessellation levels that
     control how finely the patch will be tessellated.  For each patch
     processed, multiple tessellation control program invocations are performed
     -- one per output patch vertex.  Each tessellation control program
     invocation writes all the attributes of its corresponding output patch
     vertex.  A tessellation control program may also read the per-vertex
     outputs of other tessellation control program invocations, as well as read
     and write shared per-patch outputs.  The tessellation control program
     invocations for a single patch effectively run as a group.  The GL
     automatically synchronizes threads to ensure that when executing a given
     instruction, all previous instructions have completed for all program
     invocations in the group.

     The tessellation primitive generator then decomposes a patch into a new
     set of primitives using the tessellation levels to determine how finely
     tessellated the output should be.  The primitive generator begins with
     either a triangle or a quad, and splits each outer edge of the primitive
     into a number of segments approximately equal to the corresponding element
     of the outer tessellation level array.  The interior of the primitive is
     tessellated according to elements of the inner tessellation level array.
     The primitive generator has three modes:  TRIANGLES and QUADS split a
     triangular or quad-shaped patch into a set of triangles that cover the
     original patch; ISOLINES_NV splits a quad-shaped patch into a set of line
     strips spanning the patch.  Each vertex generated by the tessellation
     primitive generator is assigned a (u,v) or (u,v,w) coordinate indicating
     its relative location in the subdivided triangle or quad.
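
     The edge subdivision described above can be illustrated with a small
     Python model.  This is a sketch, not part of the extension:  the function
     names are hypothetical, the clamping and rounding rules are those of the
     spacing modes in ARB_tessellation_shader, and max_level stands in for
     MAX_TESS_GEN_LEVEL (assumed here to be 64).

```python
import math

def edge_segments(level, spacing="EQUAL", max_level=64):
    """Number of segments an outer edge is split into for a tessellation level.

    Assumption: clamping/rounding follows ARB_tessellation_shader; max_level
    is a stand-in for the implementation-dependent MAX_TESS_GEN_LEVEL.
    """
    if spacing == "EQUAL":
        # Clamp to [1, max], then round up to the nearest integer.
        return math.ceil(min(max(level, 1.0), max_level))
    if spacing == "FRACTIONAL_EVEN":
        # Clamp to [2, max], then round up to the nearest even integer.
        n = math.ceil(min(max(level, 2.0), max_level))
        return n if n % 2 == 0 else n + 1
    if spacing == "FRACTIONAL_ODD":
        # Clamp to [1, max - 1], then round up to the nearest odd integer.
        n = math.ceil(min(max(level, 1.0), max_level - 1))
        return n if n % 2 == 1 else n + 1
    raise ValueError("unknown spacing: " + spacing)

def equal_edge_coords(level, max_level=64):
    """Parametric coordinates along one edge for EQUAL spacing."""
    n = edge_segments(level, "EQUAL", max_level)
    return [i / n for i in range(n + 1)]
```

     With EQUAL spacing, an outer level of 2.3 yields three equal segments.
     The fractional modes also place two shorter segments on each edge, which
     this sketch does not model.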

     For each vertex produced by the tessellation primitive generator, the
     tessellation evaluation program is run to compute its position and other
     attributes of the vertex, using its (u,v) or (u,v,w) coordinate.  When
     computing the final vertex attributes, the tessellation evaluation program
     can also read the attributes of any of the vertices of the patch written
     by the tessellation control program.  Tessellation evaluation program
     invocations are completely independent, although all invocations for a
     single patch share the same collection of input vertices and per-patch
     attributes.

     The tessellator operates on vertices after they have been transformed by a
     vertex program or fixed-function vertex processing.  The primitives
     generated by the tessellator are passed further down the OpenGL pipeline,
     where they can be used as inputs to geometry programs, transform feedback,
     and the rasterizer.

     The tessellation control and evaluation programs are both optional.  If
     neither program type is present, the tessellation stage has no effect.  If
     no tessellation control program is present, the input patch provided by
     the application is passed directly to the tessellation primitive
     generator, and a set of fixed tessellation level parameters (specified via
     the PatchParameterfv function) is used to control primitive generation.
     If no tessellation evaluation program is present, the output patch
     produced by the tessellation control program is passed as a patch to
     subsequent pipeline stages, where it can be consumed by geometry programs,
     transform feedback, or the rasterizer.
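
     The four combinations of enabled programs can be summarized in a small
     Python model.  This is only an illustrative sketch of the rules in the
     paragraph above; the function name and the dictionary keys are
     hypothetical and do not correspond to any GL query.

```python
def tessellation_stage(tcp_enabled, tep_enabled):
    """Summarize tessellation behavior for a pair of program enables."""
    if tep_enabled:
        # The primitive generator runs only when an evaluation program exists.
        return {
            "primitive_generator_runs": True,
            "patch_for_generator": ("TCP output patch" if tcp_enabled
                                    else "application input patch"),
            "tess_levels_from": ("TCP per-patch outputs" if tcp_enabled
                                 else "PatchParameterfv defaults"),
            "passed_downstream": "tessellated primitives",
        }
    # Without an evaluation program, a patch is passed on unchanged in type.
    return {
        "primitive_generator_runs": False,
        "patch_for_generator": None,
        "tess_levels_from": None,
        "passed_downstream": ("TCP output patch" if tcp_enabled
                              else "application input patch"),
    }
```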


 New Procedures and Functions

     None

     (Note:  The PatchParameteri and PatchParameterfv functions from
      ARB_tessellation_shader will also be used by this extension.)

 New Tokens

     Accepted by the <cap> parameter of Disable, Enable, and IsEnabled,
     by the <pname> parameter of GetBooleanv, GetIntegerv, GetFloatv,
     and GetDoublev, and by the <target> parameter of ProgramStringARB,
     BindProgramARB, ProgramEnvParameter4[df][v]ARB,
     ProgramLocalParameter4[df][v]ARB, GetProgramEnvParameter[df]vARB,
     GetProgramLocalParameter[df]vARB, GetProgramivARB and
     GetProgramStringARB:

         TESS_CONTROL_PROGRAM_NV                         0x891E
         TESS_EVALUATION_PROGRAM_NV                      0x891F

     Accepted by the <target> parameter of ProgramBufferParametersfvNV,
     ProgramBufferParametersIivNV, and ProgramBufferParametersIuivNV,
     BindBufferRangeNV, BindBufferOffsetNV, BindBufferBaseNV, and BindBuffer
     and the <value> parameter of GetIntegerIndexedvEXT:

         TESS_CONTROL_PROGRAM_PARAMETER_BUFFER_NV        0x8C74
         TESS_EVALUATION_PROGRAM_PARAMETER_BUFFER_NV     0x8C75

     Accepted by the <pname> parameter of GetProgramivARB:

         MAX_PROGRAM_PATCH_ATTRIBS_NV                    0x86D8

     (Note:  Various enumerants from ARB_tessellation_shader will also be used
      by this extension.)


 Additions to Chapter 2 of the OpenGL 2.1 Specification (OpenGL Operation)

     (Incorporate Section 2.X of the ARB_tessellation_shader specification,
      Tessellation in its entirety.)

     Insert a new section after Section 2.X.1 in ARB_tessellation_shader,
     Tessellation Control Shaders

     Tessellation Control Programs

     Each patch primitive may be optionally processed by a tessellation control
     program, which operates similarly to the tessellation control shader
     described above.  Tessellation control programs are enabled by calling
     Enable with the value TESS_CONTROL_PROGRAM_NV.  If a GLSL program is
     active, the tessellation control program enable is ignored and treated as
     disabled unless the program contains only fragment shaders.
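
     The interaction between the enable and an active GLSL program can be
     written as a predicate.  The Python sketch below is a model of the rule
     stated above, with a hypothetical function name and a hypothetical
     representation of the GLSL program as a collection of stage names.

```python
def assembly_tess_program_in_effect(enabled, glsl_active, glsl_stages=()):
    """Return True if the assembly program enable is honored.

    enabled      -- state set by Enable/Disable for the program target
    glsl_active  -- True if a GLSL program object is active
    glsl_stages  -- shader stages in the active GLSL program (assumed
                    representation, e.g. ("vertex", "fragment"))
    """
    if not enabled:
        return False
    if not glsl_active:
        return True
    # With GLSL active, the enable is honored only for fragment-only programs.
    return set(glsl_stages) == {"fragment"}
```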

     When enabled, each patch primitive received by the GL will be processed by
     the tessellation control program to produce a new patch.  The tessellation
     control program emits a patch with a fixed number of vertices, given by
     the value specified in the VERTICES_OUT declaration.  It computes the
     attributes of each vertex of the output patch in parallel, and assembles
     the emitted vertices into an output patch.  The program also computes
     per-patch tessellation level values that control the number of vertices
     produced by the tessellation primitive generator when that patch is
     processed.  The program may also compute additional generic per-patch
     attributes that may be accessed by invocations of the tessellation
     evaluation program or a subsequent geometry program when processing the
     patch.  When the tessellation control program completes, the input patch
     is discarded and the output patch is processed by the remainder of the GL
     pipeline.

     Each patch processed by the tessellation control program will result in
     multiple program invocations (threads), with one invocation per output
     patch vertex.  Each program invocation has a corresponding output patch
     vertex, and can write per-vertex attributes only for that vertex.  All
     program invocations may read and write per-patch attributes of the output
     patch, and may read per-vertex attributes of any vertex in the output
     patch.

     The tessellation control program threads are run as a group, and execute
     effectively in lock-step.  In this model, the execution of each
     instruction completes for all active threads before the execution of
     the subsequent instruction is started.  All threads in the group are
     initially active, but the set of active threads changes as flow-control
     instructions are encountered.  Full details on the execution model are
     specified in Section 2.X.5.
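
     The lock-step model can be pictured with a short Python simulation.  This
     is a simplified sketch under stated assumptions:  instructions are
     modeled as hypothetical Python callables, all threads stay active, and
     flow control is not modeled.  It shows why a thread can safely read
     another thread's per-vertex output once the writing instruction has
     completed for the whole group.

```python
def run_lockstep(num_invocations, instructions):
    """Run each instruction for every invocation before starting the next."""
    out_vertices = [{} for _ in range(num_invocations)]  # output patch
    temps = [{} for _ in range(num_invocations)]         # per-thread temps
    for instr in instructions:
        for inv in range(num_invocations):
            instr(inv, out_vertices, temps[inv])
    return out_vertices, temps

def write_own_position(inv, out_vertices, temp):
    # Each thread may write per-vertex attributes only for its own vertex.
    out_vertices[inv]["position"] = (float(inv), 0.0, 0.0, 1.0)

def read_neighbor_position(inv, out_vertices, temp):
    # Any thread may read per-vertex attributes of any output vertex.
    neighbor = (inv + 1) % len(out_vertices)
    temp["neighbor"] = out_vertices[neighbor]["position"]

outs, temps = run_lockstep(3, [write_own_position, read_neighbor_position])
```

     Because the write completes for all three threads first, thread 2 reads
     a defined value for output vertex 0.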

     Tessellation control programs execute using the instruction set documented
     in the GL_NV_gpu_program5 extension specification.  Tessellation control
     programs can read attributes from all vertices of the input patch, and
     each vertex attribute access must identify the vertex number being
     accessed.  For example, "vertex[1].position" and "vertex.in[1].position"
     identify the position of the second vertex (numbered "1") in the input
     patch.  Programs may also read attributes of all vertices of the output
     patch (e.g., "vertex.out[2].position") and per-patch attributes of the
     output patch (e.g., "primitive.out.attrib[3]").  In both cases, the output
     patch vertices or attributes accessed in this manner are undefined unless
     written by a previous instruction executed on one of the threads.
     Programs may also write attributes of their corresponding vertex in the
     output patch (e.g., "result.attrib[0]") and shared per-patch attributes
     (e.g., "result.patch.attrib[4]").  When writing output patch vertex
     attributes, a vertex number is not supplied.

     The only input primitives supported by tessellation control programs are
     patches.  The error INVALID_OPERATION is generated by Begin (or vertex
     array functions that implicitly call Begin) if a tessellation control
     program is active and <mode> is not PATCHES_NV.


     Modify section after Section 2.X.2 of ARB_tessellation_shader,
     Tessellation Primitive Generation

     (add to the end of the section describing the operation of the
      tessellation primitive generator when assembly tessellation evaluation
      programs are used)

     If no GLSL program object is active, or if the active program contains
     only a fragment shader, the tessellation primitive generator will be
     active if and only if an assembly tessellation evaluation program is
     enabled.  When a tessellation evaluation program is used, the tessellation
     primitive generator will operate in exactly the manner described above,
     except that the parameters controlling tessellation will be taken from
     declaration statements in the tessellation evaluation program.  The
     declaration statements used to specify each tessellation parameter are as
     described in Table X.1.

           GLSL Program Parameter        TEP Declaration
           ------------------------      -----------------
           TESS_GEN_MODE_NV              TESS_MODE
           TESS_GEN_SPACING_NV           TESS_SPACING
           TESS_GEN_VERTEX_ORDER_NV      TESS_VERTEX_ORDER
           TESS_GEN_POINT_MODE_NV        TESS_POINT_MODE

       Table X.1, Parameters used to control tessellation when a program object
       with a tessellation evaluation shader is active and their tessellation
       evaluation program equivalents.

     If no tessellation control program is enabled, the tessellation levels are
     taken from the default tessellation levels specified by calling
     PatchParameterfvNV with a <pname> of PATCH_DEFAULT_OUTER_LEVEL_NV or
     PATCH_DEFAULT_INNER_LEVEL_NV.

     If a GLSL program containing only a fragment shader is active, any
     tessellation-related program parameters in effect when the program was
     linked have no effect on tessellation.


     Insert a new section after Section 2.X.3 in ARB_tessellation_shader,
     Tessellation Evaluation Shaders

     Tessellation Evaluation Programs

     If a tessellation evaluation program is active, the tessellation primitive
     generator will subdivide a basic primitive and run the tessellation
     evaluation program on each generated vertex.  Tessellation evaluation
     programs are enabled by calling Enable with the value
     TESS_EVALUATION_PROGRAM_NV.  If a GLSL program is active, the tessellation
     evaluation program enable is ignored and treated as disabled unless the
     program contains only fragment shaders.

     When tessellation evaluation programs are enabled, each patch primitive
     received by the GL will trigger the tessellation primitive generator to
     perform primitive subdivision and generate a new set of vertices.  For
     each generated vertex, the tessellation evaluation program is invoked.
     Each tessellation evaluation program invocation produces a single output
     vertex.  These vertices are assembled into primitives according to the
     subdivision produced by the tessellation primitive generator, and these
     primitives are processed by the remainder of the GL pipeline.  The input
     patch used by the tessellation evaluation program is discarded.

     Tessellation evaluation programs execute using the instruction set
     documented in the GL_NV_gpu_program5 extension specification and in a
     manner similar to vertex programs.  Tessellation evaluation programs can
     read attributes from all vertices of the input patch, and each vertex
     attribute access must identify the vertex number being accessed.  For
     example, "vertex[1].position" identifies the transformed position of
     "vertex[1]", which is the second vertex in the input patch.  Additionally,
     the special attribute variable "vertex.tesscoord" is available to specify
     the location of the vertex within the subdivided primitive.  Per-patch
     attributes, including the tessellation levels, are also available.
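
     As an illustration of how an evaluation program might combine the input
     patch vertices with "vertex.tesscoord", the Python sketch below
     bilinearly interpolates the corners of a four-vertex quad patch.  The
     interpolation scheme and the corner ordering are assumptions made for the
     example; the extension leaves the computation entirely to the program.

```python
def lerp(a, b, t):
    """Componentwise linear interpolation between two vectors."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def evaluate_quad_vertex(patch_positions, tesscoord):
    """Compute a vertex position from four corner positions and (u, v).

    Assumed corner order: vertex 0 at (0,0), 1 at (1,0), 2 at (0,1),
    3 at (1,1) in (u, v) space.
    """
    u, v = tesscoord
    bottom = lerp(patch_positions[0], patch_positions[1], u)
    top = lerp(patch_positions[2], patch_positions[3], u)
    return lerp(bottom, top, v)

corners = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (2, 2, 0)]
center = evaluate_quad_vertex(corners, (0.5, 0.5))
```

     Each generated vertex is evaluated independently, so the same function
     can be applied to every (u,v) coordinate produced for the patch.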

     The only input primitives supported by tessellation evaluation programs
     are patches.  The error INVALID_OPERATION is generated by Begin (or vertex
     array functions that implicitly call Begin) if a tessellation evaluation
     program is active and <mode> is not PATCHES_NV.


     Modify Section 2.X.2 of NV_gpu_program4, Program Grammar

     (replace third paragraph)

     Tessellation control programs are required to begin with the header string
     "!!NVtcp5.0".  Tessellation evaluation programs are required to begin with
     the header string "!!NVtep5.0".  These header strings identify the
     subsequent program body as being a tessellation control or evaluation
     program, respectively, and indicate that they should be parsed according
     to the base NV_gpu_program5 grammar plus the additions below.  Program
     string parsing begins with the character immediately following the header
     string.
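
     The role of the header string can be sketched in Python as follows.  The
     function name is hypothetical and the sketch only classifies the program
     and returns the text that follows the header; it does not parse the
     program body.

```python
HEADERS = {
    "!!NVtcp5.0": "tessellation control",
    "!!NVtep5.0": "tessellation evaluation",
}

def split_program_header(program_string):
    """Return (program kind, program body) based on the header string."""
    for header, kind in HEADERS.items():
        if program_string.startswith(header):
            # Parsing begins with the character right after the header.
            return kind, program_string[len(header):]
    raise ValueError("not a tessellation control or evaluation program")

kind, body = split_program_header("!!NVtep5.0\nEND\n")
```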

     (For tessellation control programs, add the following grammar rules to the
      NV_gpu_program5 base grammar)

     <declSequence>          ::= <declaration> <declSequence>

     <attribUseV>            ::= <attribColor> "." <faceType> <swizzleSuffix>
                               | <attribColor> "." <faceType> "." <colorType>
                                 <swizzleSuffix>

     <resultUseW>            ::= <resultVarName> <arrayMem> <optWriteMask>
                               | <resultColor> <optWriteMask>
                               | <resultColor> "." <colorType> <optWriteMask>
                               | <resultColor> "." <faceType> <optWriteMask>
                               | <resultColor> "." <faceType> "." <colorType>
                                 "." <optWriteMask>

     <resultUseD>            ::= <resultColor> <optFaceColorType>
                               | <resultMulti>

     <optFaceColorType>      ::= <optColorType>
                               | "." <faceType> <optColorType>

     <declaration>           ::= "VERTICES_OUT" <int>

     <attribBasic>           ::= <vtxPrefix> "position"
                               | <vtxPrefix> "fogcoord"
                               | <vtxPrefix> "pointsize"
                               | <vtxPrefix> "id"
                               | <attribTexCoord> <optArrayMemAbs>
                               | <attribClip> <arrayMemAbs>
                               | <attribGeneric> <arrayMemAbs>
                               | <primPrefix> "." "id"
                               | <primPrefix> "." "invocation"
                               | <primPrefix> "." "vertexcount"
                               | <attribTessOuter> <arrayMemAbs>
                               | <attribTessInner> <arrayMemAbs>
                               | <attribPatchGeneric> <arrayMemAbs>

     <attribColor>           ::= <vtxPrefix> "color"

     <attribMulti>           ::= <attribTexCoord> <arrayRange>
                               | <attribClip> <arrayRange>
                               | <attribGeneric> <arrayRange>
                               | <attribTessOuter> <arrayRange>
                               | <attribTessInner> <arrayRange>
                               | <attribPatchGeneric> <arrayRange>

     <attribTexCoord>        ::= <vtxPrefix> "texcoord"

     <attribClip>            ::= <vtxPrefix> "clip"

     <attribGeneric>         ::= <vtxPrefix> "attrib"

     <attribTessOuter>       ::= <primPrefix> "." "tessouter"

     <attribTessInner>       ::= <primPrefix> "." "tessinner"

     <attribPatchGeneric>    ::= <primPrefix> "." "patch" "." "attrib"

     <vtxPrefix>             ::= "vertex" "."
                               | "vertex" <arrayMemAbs> "."
                               | "vertex" "." "in" <optArrayMemAbs> "."
                               | "vertex" "." "out" <optArrayMemAbs> "."

     <primPrefix>            ::= "primitive" "."
                               | "primitive" "." "in" "."
                               | "primitive" "." "out" "."

     <resultBasic>           ::= <resPrefix> "position"
                               | <resPrefix> "fogcoord"
                               | <resPrefix> "pointsize"
                               | <resultTexCoord> <optArrayMemAbs>
                               | <resultClip> <arrayMemAbs>
                               | <resultGeneric> <arrayMemAbs>
                               | <resPrefix> "id"
                               | <resultTessOuter> <arrayMemAbs>
                               | <resultTessInner> <arrayMemAbs>
                               | <resultPatchGeneric> <arrayMemAbs>

     <resultColor>           ::= <resPrefix> "color"

     <resultMulti>           ::= <resultTexCoord> <arrayRange>
                               | <resultClip> <arrayRange>
                               | <resultGeneric> <arrayRange>
                               | <resultTessOuter> <arrayRange>
                               | <resultTessInner> <arrayRange>
                               | <resultPatchGeneric> <arrayRange>

     <resultTexCoord>        ::= <resPrefix> "texcoord"

     <resultClip>            ::= <resPrefix> "clip"

     <resultGeneric>         ::= <resPrefix> "attrib"

     <resultTessOuter>       ::= <resPrefix> "." "patch" "." "tessouter"

     <resultTessInner>       ::= <resPrefix> "." "patch" "." "tessinner"

     <resultPatchGeneric>    ::= <resPrefix> "." "patch" "." "attrib"

     <resPrefix>             ::= "result" "."


     (For tessellation evaluation programs, add the following grammar rules to
      the NV_gpu_program5 base grammar)

     <declSequence>          ::= <declaration> <declSequence>

     <attribUseV>            ::= <attribColor> "." <faceType> <swizzleSuffix>
                               | <attribColor> "." <faceType> "." <colorType>
                                 <swizzleSuffix>

     <resultUseW>            ::= <resultVarName> <arrayMem> <optWriteMask>
                               | <resultColor> <optWriteMask>
                               | <resultColor> "." <colorType> <optWriteMask>
                               | <resultColor> "." <faceType> <optWriteMask>
                               | <resultColor> "." <faceType> "." <colorType>
                                 "." <optWriteMask>

     <resultUseD>            ::= <resultColor> <optFaceColorType>
                               | <resultMulti>

     <optFaceColorType>      ::= <optColorType>
                               | "." <faceType> <optColorType>

     <declaration>           ::= "TESS_MODE" <declTessMode>
                               | "TESS_SPACING" <declTessSpacing>
                               | "TESS_VERTEX_ORDER" <declTessVtxOrder>
                               | "TESS_POINT_MODE"

     <declTessMode>          ::= "TRIANGLES"
                               | "QUADS"
                               | "ISOLINES"

     <declTessSpacing>       ::= "EQUAL"
                               | "FRACTIONAL_ODD"
                               | "FRACTIONAL_EVEN"

     <declTessVtxOrder>      ::= "CW"
                               | "CCW"

     <attribBasic>           ::= <vtxPrefix> "position"
                               | <vtxPrefix> "fogcoord"
                               | <vtxPrefix> "pointsize"
                               | <vtxPrefix> "id"
                               | <attribTexCoord> <optArrayMemAbs>
                               | <attribClip> <arrayMemAbs>
                               | <attribGeneric> <arrayMemAbs>
                               | "vertex" "." "tesscoord"
                               | <primPrefix> "id"
                               | <primPrefix> "vertexcount"
                               | <attribTessOuter> <optArrayMemAbs>
                               | <attribTessInner> <optArrayMemAbs>
                               | <attribPatchGeneric> <optArrayMemAbs>

     <attribColor>           ::= <vtxPrefix> "color"

     <attribMulti>           ::= <attribTexCoord> <arrayRange>
                               | <attribClip> <arrayRange>
                               | <attribGeneric> <arrayRange>
                               | <attribTessOuter> <arrayRange>
                               | <attribTessInner> <arrayRange>
                               | <attribPatchGeneric> <arrayRange>

     <attribTexCoord>        ::= <vtxPrefix> "texcoord"

     <attribClip>            ::= <vtxPrefix> "clip"

     <attribGeneric>         ::= <vtxPrefix> "attrib"

     <attribTessOuter>       ::= <primPrefix> "." "tessouter"

     <attribTessInner>       ::= <primPrefix> "." "tessinner"

     <attribPatchGeneric>    ::= <primPrefix> "." "patch" "." "attrib"

     <vtxPrefix>             ::= "vertex" "."
                               | "vertex" <arrayMemAbs> "."
                               | "vertex" "." "in" <optArrayMemAbs> "."
                               | "vertex" "." "out" <optArrayMemAbs> "."

     <primPrefix>            ::= "primitive" "."
                               | "primitive" "." "in" "."

     <resultBasic>           ::= <resPrefix> "position"
                               | <resPrefix> "fogcoord"
                               | <resPrefix> "pointsize"
                               | <resultTexCoord> <optArrayMemAbs>
                               | <resultClip> <arrayMemAbs>
                               | <resultGeneric> <arrayMemAbs>
                               | <resPrefix> "id"

     <resultColor>           ::= <resPrefix> "color"

     <resultMulti>           ::= <resultTexCoord> <arrayRange>
                               | <resultClip> <arrayRange>
                               | <resultGeneric> <arrayRange>

     <resultTexCoord>        ::= <resPrefix> "texcoord"

     <resultClip>            ::= <resPrefix> "clip"

     <resultGeneric>         ::= <resPrefix> "attrib"

     <resPrefix>             ::= "result" "."


     (add the following subsection to section 2.X.3.2 of NV_gpu_program4,
      Program Attribute Variables)

     Tessellation control and evaluation program attribute variables describe
     inputs accessible to the program.  There are several different classes of
     attribute bindings available, identified by the binding prefix.  The set
     of attribute binding classes and their corresponding prefixes are
     described in Table X.2.  The specific attributes for each class are
     identified by a binding suffix.

       Attribute Binding Prefix   Description
       ------------------------   ------------------------------------------
       vertex[m]                  Vertex <m> of the input patch
       vertex.in[m]               Vertex <m> of the input patch
       vertex                     Array spanning vertices of the input patch
                                    or the specific vertex being evaluated
       vertex.in                  Array spanning vertices of the input patch
                                    or the specific vertex being evaluated
       primitive                  Per-patch value of the input patch
       primitive.in               Per-patch value of the input patch
       vertex.out[m]              Vertex <m> of the output patch
       vertex.out                 Array spanning vertices of the output patch
       primitive.out              Per-patch value of the output patch

       Table X.2, Tessellation Control and Evaluation Program Attribute Binding
       Prefixes.  <m> refers to a constant integer vertex number in the input
       or output patch.

     If an attribute binding prefix matches "vertex[m]" or "vertex.in[m]", the
     attribute binding refers to an attribute of the vertex numbered <m> in the
     input patch.  If <m> is greater than or equal to the number of vertices in
     the input patch, the values corresponding to the binding are undefined.

     If an attribute binding prefix matches "vertex" or "vertex.in" and the
     suffix identifies an attribute of the vertex being processed by a
     tessellation evaluation program (e.g., "tesscoord"), the attribute binding
     refers to that attribute.

     If an attribute binding prefix matches "vertex" or "vertex.in" and the
     suffix identifies any other vertex attribute, the attribute binding refers
     to that specific attribute for each of the vertices of the input patch.
     Bindings of this form may only be used in explicit variable declarations.
     If the variable declaration identifies an array, the program will fail to
     load unless each binding in the binding list uses an attribute prefix of
     this form.  When such variables are used in instructions, they must be
     accessed as an array, with the first array index identifying the vertex
     number.  If such variables are declared as an array, a second array index
     must be provided to identify the specific per-vertex attribute to select.
     If the first array index is negative or greater than or equal to the
     number of vertices in the input patch, the value obtained is undefined.

     If an attribute binding prefix matches "primitive" or "primitive.in", the
     attribute binding refers to an attribute of the input patch.

     If a tessellation control program attribute binding prefix matches
     "vertex.out[m]", the attribute binding refers to an attribute of the
     vertex numbered <m> in the output patch.  These attributes correspond to
     per-vertex output values written by the tessellation control program
     thread numbered <m>.  A program will fail to load if the vertex number <m>
     is greater than or equal to the number of vertices in the output patch.
     Tessellation evaluation programs do not have an output patch and do not
     support this attribute binding prefix.

     If a tessellation control program attribute binding prefix matches
     "vertex.out", the attribute binding identifies a specific attribute for
     each of the vertices of the output patch.  Bindings of this form may only
     be used in explicit variable declarations, and all the usage rules
     described above for bindings using the prefix "vertex.in" apply.  If the
     vertex number identified when accessing such variables is negative or
     greater than or equal to the number of vertices in the output patch, the
     resulting values are undefined.  Tessellation evaluation programs do not
     have an output patch and do not support this attribute binding prefix.

     If an attribute binding prefix matches "primitive.out", the attribute
     binding refers to a per-patch attribute of the output patch.  These
     attributes correspond to per-patch result values written by one of the
     tessellation control program threads.  Tessellation evaluation programs do
     not have an output patch and do not support this attribute binding prefix.

     The following examples illustrate various legal and illegal program
     bindings and their meanings.

       ATTRIB pos = vertex.position;
       ATTRIB pos2 = vertex.in[2].position;
       ATTRIB outpos = vertex.out.position;
       ATTRIB outpos2 = vertex.out[2].position;
       ATTRIB texcoords[] = { vertex.texcoord[0..3] };
       ATTRIB tcoords1[4] = { vertex[1].texcoord[1..4] };
       ATTRIB outattr[2] = { vertex.out.attrib[0..1] };
       INT TEMP A0;
       ...
       MOV R0, pos[1];                   # position of input vertex 1
       MOV R0, vertex[1].position;       # position of input vertex 1
       MOV R0, pos2;                     # position of input vertex 2
       MOV R0, outpos;                   # ILLEGAL - needs a vertex number
       MOV R0, outpos[1];                # position of output vertex 1 (TCP)
       MOV R0, outpos2;                  # position of output vertex 2 (TCP)
       MOV R0, texcoords[A0.x][1];       # texcoord 1 of input vertex A0.x
       MOV R0, texcoords[A0.x][A0.y];    # texcoord A0.y of input vertex A0.x
       MOV R0, tcoords1[2];              # texcoord 3 of input vertex 1
       MOV R0, outattr[A0.x][1];         # generic attr 1 of output vertex
                                         # A0.x (TCP)
       MOV R0, vertex[A0.x].texcoord[1]; # ILLEGAL -- vertex number must be
                                         # constant or must use variables like
                                         # "texcoords" using bindings w/o
                                         # vertex numbers
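
     The two-level indexing used by variables such as "texcoords" can be
     modeled in Python.  This is an illustrative sketch only:  the patch
     contents, the helper names, and the UNDEFINED sentinel are hypothetical
     stand-ins for values that the GL leaves undefined.

```python
UNDEFINED = object()  # stand-in for "the value obtained is undefined"

def expand_range(base, first, last):
    """Expand a binding such as texcoord[0..3] into attribute names."""
    return [f"{base}[{i}]" for i in range(first, last + 1)]

def read_binding(patch, attrib_names, vertex_index, attrib_index=0):
    """Model an access such as texcoords[vertex_index][attrib_index]."""
    if vertex_index < 0 or vertex_index >= len(patch):
        return UNDEFINED  # vertex number outside the input patch
    # Attributes never written by the previous stage are also undefined.
    return patch[vertex_index].get(attrib_names[attrib_index], UNDEFINED)

# Hypothetical three-vertex input patch; each vertex maps names to values.
patch = [{"position": (float(v), 0.0, 0.0, 1.0),
          "texcoord[0]": (0.0, float(v)),
          "texcoord[1]": (1.0, float(v))} for v in range(3)]

pos = ["position"]                          # ATTRIB pos = vertex.position
texcoords = expand_range("texcoord", 0, 3)  # vertex.texcoord[0..3]
tcoords1 = expand_range("texcoord", 1, 4)   # vertex[1].texcoord[1..4]
```

     In this model "pos[1]" corresponds to read_binding(patch, pos, 1), and
     the third element of tcoords1 names texcoord 3, matching the comment on
     "tcoords1[2]" in the example above.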

     Attributes from input patch vertices will be obtained from the per-vertex
     outputs of the previous program used to generate the vertex in question.
     For tessellation evaluation programs, that previous program would be the
     tessellation control program, if enabled, or the vertex program otherwise.
     For tessellation control programs, the previous program is always the
     vertex program.  Tessellation control and evaluation program attributes
     should be read using the same component data type used to write the
     corresponding vertex program results.  If input patch vertices are
     specified to come from vertex program outputs but no vertex program is
     enabled, the values are instead produced from fixed-function vertex
     processing.  The value of any attribute corresponding to a vertex output
     not written by the previous program stage is undefined, as are the values
     of all generic attributes if the vertex was produced by fixed-function
     vertex processing.

     Attributes from output patch vertices are only available in tessellation
     control programs, and will be obtained from the per-vertex outputs of the
     same program.  When executing an instruction, the values of any output
     patch vertex attribute are undefined unless the corresponding program
     output was written by a previously executed instruction.

     Per-patch attributes of the input patch are only available in tessellation
     evaluation and geometry programs.  If a tessellation control program is
     enabled, they will be obtained from the corresponding per-patch outputs of
     the tessellation control program producing the patch, and any attributes
     not written by any thread of the control program are undefined.  If no
     tessellation control program is enabled, the inner and outer tessellation
     levels are taken from the default tessellation levels, and all other
     per-patch attributes are undefined.

     Per-patch attributes of the output patch are available only in
     tessellation control programs and will be obtained from the per-patch
     outputs of the same program.  When executing an instruction, the values of
     any output patch attribute are undefined unless the corresponding program
     output was written by a previously executed instruction.

     The attributes of the vertices of an input or output patch are
     selected by an attribute binding suffix, as identified in Table X.3.  All
     such bindings correspond to one of multiple patch vertices and require a
     vertex number, either in the binding prefix used in the instruction or as
     the first array index when using an explicitly declared attribute variable
     whose bindings have no vertex number.

       Vertex Binding Suffix      Components   Description
       ------------------------   ----------   ----------------------------
       position                    (x,y,z,w)   clip coordinates
       color                       (r,g,b,a)   front primary color
       color.primary               (r,g,b,a)   front primary color
       color.secondary             (r,g,b,a)   front secondary color
       color.front                 (r,g,b,a)   front primary color
       color.front.primary         (r,g,b,a)   front primary color
       color.front.secondary       (r,g,b,a)   front secondary color
       color.back                  (r,g,b,a)   back primary color
       color.back.primary          (r,g,b,a)   back primary color
       color.back.secondary        (r,g,b,a)   back secondary color
       fogcoord                    (f,-,-,-)   fog coordinate
       pointsize                   (s,-,-,-)   point size
       texcoord                    (s,t,r,q)   texture coordinate, unit 0
       texcoord[n]                 (s,t,r,q)   texture coordinate, unit n
       attrib[n]                   (x,y,z,w)   generic interpolant n
       clip[n]                     (d,-,-,-)   clip plane distance
       texcoord[n..o]              (s,t,r,q)   array of texture coordinates
       attrib[n..o]                (x,y,z,w)   array of generic interpolants
       clip[n..o]                  (d,-,-,-)   array of clip distances
       id                          (id,-,-,-)  vertex id

       Table X.3, Tessellation Control and Evaluation Program Per-Patch Vertex
       Attribute Bindings.  <n> and <o> refer to integer constants.

     If an attribute binding suffix matches "position", the "x", "y", "z" and
     "w" components of the attribute variable are filled with the "x", "y",
     "z", and "w" components, respectively, of the transformed position of the
     specified vertex, in clip coordinates.

     If an attribute binding suffix matches any binding in Table X.3 beginning
     with "color", the "x", "y", "z", and "w" components of the attribute
     variable are filled with the "r", "g", "b", and "a" components,
     respectively, of the corresponding color of the specified vertex.
     Bindings containing "front" and "back" refer to the front and back colors,
     respectively.  Bindings containing "primary" and "secondary" refer to
     primary and secondary colors, respectively.  If face or color type is
     omitted in the binding, the binding is treated as though "front" and
     "primary", respectively, were specified.

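     The defaulting rule for color bindings can be modeled in a few lines of
     Python.  This is only an illustrative sketch and not part of the
     extension; the helper name normalize_color_suffix is hypothetical, and
     the sketch does not enforce the qualifier order shown in Table X.3.

```python
def normalize_color_suffix(suffix):
    """Expand a "color..." binding suffix to its full face/type form.

    Hypothetical helper: an omitted face is treated as "front" and an
    omitted color type is treated as "primary", as the text describes.
    """
    parts = suffix.split(".")
    if parts[0] != "color":
        raise ValueError("not a color binding suffix: " + suffix)
    face, kind = "front", "primary"
    for part in parts[1:]:
        if part in ("front", "back"):
            face = part
        elif part in ("primary", "secondary"):
            kind = part
        else:
            raise ValueError("unknown color qualifier: " + part)
    return "color.%s.%s" % (face, kind)
```
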
     If an attribute binding suffix matches "fogcoord", the "x" component of
     the attribute variable is filled with the fog coordinate of the specified
     vertex.  The "y", "z", and "w" components are undefined.

     If an attribute binding suffix matches "pointsize", the "x" component of
     the attribute variable is filled with the point size of the specified
     vertex.  If the vertex was produced by fixed-function vertex processing,
     the point size attribute is undefined.  The "y", "z", and "w" components
     are always undefined.

     If an attribute binding suffix matches "texcoord" or "texcoord[n]", the
     "x", "y", "z", and "w" coordinates of the attribute variable are filled
     with the "s", "t", "r", and "q" coordinates of texture coordinate set <n>
     of the specified vertex.  If <n> is omitted, texture coordinate set zero
     is used.

     If an attribute binding suffix matches "attrib[n]", the "x", "y", "z", and
     "w" components of the attribute variable are filled with the "x", "y",
     "z", and "w" coordinates of generic interpolant <n> of the specified
     vertex.  All generic interpolants will be undefined when the vertex is
     produced by fixed-function vertex processing.

     If an attribute binding suffix matches "clip[n]", the "x" component of the
     attribute variable is filled with the clip distance of the specified
     vertex for clip plane <n>, as written by the vertex program.  If the
     vertex was produced by fixed-function vertex processing or a
     position-invariant vertex program, the clip distance is obtained by
     computing the per-clip plane dot product:

       (p_1' p_2' p_3' p_4') dot (x_e y_e z_e w_e),

     at the vertex location, as described in section 2.12.  The clip distance
     for clip plane <n> is undefined if clip plane <n> is disabled.  The "y",
     "z", and "w" components of the attribute are undefined.

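     As a worked instance of the per-clip plane dot product, the following
     Python sketch evaluates it for one plane and one vertex.  The function
     name and the numeric values are assumptions chosen for illustration.

```python
def clip_distance(plane, eye_pos):
    """Dot product of clip plane coefficients with an eye-space position.

    plane   -- (p_1', p_2', p_3', p_4') clip plane coefficients
    eye_pos -- (x_e, y_e, z_e, w_e) eye coordinates of the vertex
    """
    return sum(p * c for p, c in zip(plane, eye_pos))

# Illustrative values only: the half-space y >= 1 and a vertex at y = 3.
d = clip_distance((0.0, 1.0, 0.0, -1.0), (2.0, 3.0, 5.0, 1.0))
```
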
     If an attribute binding suffix matches "texcoord[n..o]", "attrib[n..o]",
     or "clip[n..o]", a sequence of 1+<o>-<n> texture coordinate, generic
     attribute, or clip distance bindings is created.  For texture coordinate
     bindings, it is as though the sequence "vertex[m].texcoord[n],
     vertex[m].texcoord[n+1], ...  vertex[m].texcoord[o]" were specified.  These
     bindings are available only in explicit declarations of array variables.
     A program will fail to load if <n> is greater than <o>.
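
     The expansion of a range binding can be sketched as follows.  This is a
     Python model written for illustration; expand_range is a hypothetical
     helper, and a ValueError stands in for the program failing to load.

```python
def expand_range(prefix, name, n, o):
    """Expand "<name>[n..o]" into the 1+o-n individual bindings it implies."""
    if n > o:
        # The text says such a program fails to load.
        raise ValueError("program fails to load: %d > %d" % (n, o))
    return ["%s.%s[%d]" % (prefix, name, i) for i in range(n, o + 1)]

# The "tcoords1" declaration from the examples: vertex[1].texcoord[1..4].
tcoords1 = expand_range("vertex[1]", "texcoord", 1, 4)
```

     Under this model, element 2 of "tcoords1" is "vertex[1].texcoord[3]",
     which agrees with the comment on "tcoords1[2]" in the binding examples.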

     If an attribute binding suffix matches "id", the "x" component is filled
     with the vertex ID of the specified vertex.  If the vertex was generated
     by a previous program, the attribute variable is filled with the vertex ID
     result written by that program.  Otherwise, the vertex ID is undefined.
     The "y", "z", and "w" components of the attribute are undefined.

     Attribute bindings other than those corresponding to individual vertices
     in input and output patches are identified in Table X.4.  All of these
     items except for "vertex.tesscoord" are per-patch attributes, and require
     one of the prefixes beginning with "primitive".

       Primitive Binding Suffix   Components  Description
       ------------------------   ----------  ----------------------------
       id                         (id,-,-,-)  primitive number
       invocation                 (id,-,-,-)  tess. control invocation
       vertexcount                (c,-,-,-)   vertices in primitive
       tessouter[n]               (x,-,-,-)   outer tess. level n
       tessinner[n]               (x,-,-,-)   inner tess. level n
       patch.attrib[n]            (x,y,z,w)   generic patch attribute n
       tessouter[n..o]            (x,-,-,-)   outer tess. levels n to o
       tessinner[n..o]            (x,-,-,-)   inner tess. levels n to o
       patch.attrib[n..o]         (x,y,z,w)   generic patch attrib n to o
       vertex.tesscoord (*)       (u,v,w,-)   tess. coordinate in [0,1]

       Table X.4, Tessellation Control and Evaluation Miscellaneous Attribute
       Bindings.  <n> and <o> refer to integer constants.

     If an attribute binding suffix matches "id", the "x" component is filled
     with the number of primitives received by the GL since the last time Begin
     was called (directly or indirectly via vertex array functions).  The first
     primitive generated after a Begin is numbered zero, and the primitive ID
     counter is incremented after every individual point, line, or polygon
     primitive is processed.  Restarting a primitive topology using the
     primitive restart index has no effect on the primitive ID counter.  The
     "y", "z", and "w" components of the variable are always undefined.  This
     suffix may only be used with the prefixes "primitive", "primitive.in", or
     "primitive.out", and produces the same value in all cases.

     If a tessellation control program attribute binding suffix matches
     "invocation", the "x" component is filled with the thread number of the
     program invocation.  The invocation number identifies the number of the
     vertex in the output patch whose attributes are produced by this
     invocation, and is in the range [0..<n>-1], where <n> is given by the
     VERTICES_OUT declaration.  The "y", "z", and "w" components of the
     variable are always undefined.  This suffix is not available to
     tessellation evaluation programs and may only be used with the prefixes
     "primitive", "primitive.in", or "primitive.out", and produces the same
     value in all cases.

     If an attribute binding suffix matches "vertexcount", the "x" component is
     filled with the number of vertices in the input primitive being processed.
     The "y", "z", and "w" components of the variable are always undefined.
     This suffix is available only with the prefixes "primitive" and
     "primitive.in".

     If an attribute binding suffix matches "tessouter[n]", the "x" component
     is filled with the per-patch outer tessellation level numbered <n> of the
     identified input or output patch.  <n> must be less than four.  The "y",
     "z", and "w" components are always undefined.  This suffix is available
     only with the prefixes "primitive", "primitive.in", and "primitive.out".
     For tessellation control programs, this suffix is available only with
     "primitive.out".

     If an attribute binding suffix matches "tessinner[n]", the "x" component
     is filled with the per-patch inner tessellation level numbered <n> of the
     identified input or output patch.  <n> must be less than two.  The "y",
     "z", and "w" components are always undefined.  This suffix is available
     only with the prefixes "primitive", "primitive.in", and "primitive.out".
     For tessellation control programs, this suffix is available only with
     "primitive.out".

     If an attribute binding suffix matches "patch.attrib[n]", the "x", "y",
     "z", and "w" components are filled with the corresponding components of
     the per-patch generic attribute numbered <n> of the identified input or
     output patch.  This suffix is available only with the prefixes
     "primitive", "primitive.in", and "primitive.out".  For tessellation
     control programs, this suffix is available only with "primitive.out".
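
     The prefix and index rules for "tessouter", "tessinner", and
     "patch.attrib" can be summarized by a small Python check.  It is a sketch
     only:  the program labels "control" and "evaluation" and the function
     name are assumptions, and any implementation-dependent limit on the
     number of per-patch generic attributes is not modeled.

```python
# Index limits stated in the text; no fixed limit is stated for patch.attrib.
INDEX_LIMITS = {"tessouter": 4, "tessinner": 2, "patch.attrib": None}
PREFIXES = ("primitive", "primitive.in", "primitive.out")

def patch_binding_allowed(program, prefix, suffix, n):
    """Return True if "<prefix>.<suffix>[n]" is allowed by the rules above.

    program -- "control" or "evaluation" (labels chosen for this sketch)
    """
    if suffix not in INDEX_LIMITS or prefix not in PREFIXES:
        return False
    limit = INDEX_LIMITS[suffix]
    if n < 0 or (limit is not None and n >= limit):
        return False
    # Control programs may read these only from the output patch.
    if program == "control" and prefix != "primitive.out":
        return False
    # Output patch attributes are available only in control programs.
    if program == "evaluation" and prefix == "primitive.out":
        return False
    return True
```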

     If an attribute binding suffix matches "tessouter[n..o]",
     "tessinner[n..o]", or "patch.attrib[n..o]", a sequence of 1+<o>-<n> outer
     tessellation level, inner tessellation level, or per-patch generic
     attribute bindings is created.  For per-patch generic attribute bindings,
     it is as though the sequence "primitive.patch.attrib[n],
     primitive.patch.attrib[n+1], ...  primitive.patch.attrib[o]" were
     specified.  These bindings are available only in explicit declarations of
     array variables.  A program will fail to load if <n> is greater than <o>.

     If a tessellation evaluation program attribute binding suffix matches
     "vertex.tesscoord", the "x", "y", and "z" components are filled with the
     floating-point (u,v,w) values, respectively, corresponding to the vertex
     being processed by the tessellation evaluation program.  For triangle
     tessellation, the (u,v,w) values are barycentric coordinates that specify
     the location of the vertex relative to the three corners of the subdivided
     triangle.  The (u,v,w) values are in the range [0,1] and sum to one.  For
     quad and isoline tessellation, the (u,v) values are in the range [0,1] and
     specify the relative horizontal and vertical position in the subdivided
     quad.  The third component of the (u,v,w) vector is undefined for quad and
     isoline tessellation.  The "w" component of the variable is always
     undefined.  This suffix is not available to tessellation control programs
     and may only be used with the prefix "vertex".
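
     For triangle tessellation, one common use of the (u,v,w) values is a
     barycentric blend of three corner attributes.  The Python sketch below
     shows that arithmetic; it is not mandated by this extension, and the
     pairing of u, v, and w with particular corners is an assumption made for
     the example.

```python
def interpolate_triangle(tesscoord, corner0, corner1, corner2):
    """Blend three corner values with barycentric weights (u, v, w)."""
    u, v, w = tesscoord
    return tuple(u * a + v * b + w * c
                 for a, b, c in zip(corner0, corner1, corner2))

# Weights in [0,1] that sum to one, applied to three 2D corner positions.
p = interpolate_triangle((0.25, 0.25, 0.5), (0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
```

     Because the weights sum to one, a coordinate of (1,0,0) returns the first
     corner value unchanged.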


     (add the following subsection to section 2.X.3.5 of NV_gpu_program4,
      Program Results.)

     The attributes of individual output vertices are written by tessellation
     control and evaluation programs.  For tessellation control programs, these
     attributes are those of the output patch vertex corresponding to the
     program invocation.  For tessellation evaluation programs, these
     attributes specify the attributes of the vertex in the tessellated patch
     corresponding to the program invocation.  The set of allowable per-vertex
     result variable bindings is the same for tessellation control and
     evaluation programs.  These bindings correspond to attributes of output
     vertices and are given in Table X.5.

       Binding                        Components  Description
       -----------------------------  ----------  ----------------------------
       result.position                (x,y,z,w)   position in clip coordinates
       result.color                   (r,g,b,a)   front-facing primary color
       result.color.primary           (r,g,b,a)   front-facing primary color
       result.color.secondary         (r,g,b,a)   front-facing secondary color
       result.color.front             (r,g,b,a)   front-facing primary color
       result.color.front.primary     (r,g,b,a)   front-facing primary color
       result.color.front.secondary   (r,g,b,a)   front-facing secondary color
       result.color.back              (r,g,b,a)   back-facing primary color
       result.color.back.primary      (r,g,b,a)   back-facing primary color
       result.color.back.secondary    (r,g,b,a)   back-facing secondary color
       result.fogcoord                (f,*,*,*)   fog coordinate
       result.pointsize               (s,*,*,*)   point size
       result.texcoord                (s,t,r,q)   texture coordinate, unit 0
       result.texcoord[n]             (s,t,r,q)   texture coordinate, unit n
       result.attrib[n]               (x,y,z,w)   generic interpolant n
       result.clip[n]                 (d,*,*,*)   clip plane distance
       result.texcoord[n..o]          (s,t,r,q)   texture coordinates n thru o
       result.attrib[n..o]            (x,y,z,w)   generic interpolants n thru o
       result.clip[n..o]              (d,*,*,*)   clip distances n thru o

       Table X.5:  Tessellation Control and Evaluation Program Per-Vertex
       Result Variable Bindings.  Components labeled "*" are unused.

     If a result variable binding matches "result.position", updates to the
     "x", "y", "z", and "w" components of the result variable modify the "x",
     "y", "z", and "w" components, respectively, of the transformed vertex's
     clip coordinates.  Final window coordinates of vertices used for
     rasterization will be generated for the vertex as described in section
     2.14.4.4.

     If a result variable binding match begins with "result.color", updates to
     the "x", "y", "z", and "w" components of the result variable modify the
     "r", "g", "b", and "a" components, respectively, of the corresponding
     vertex color attribute in Table X.3.  Color bindings that do not specify
     "front" or "back" are considered to refer to front-facing colors.  Color
     bindings that do not specify "primary" or "secondary" are considered to
     refer to primary colors.

     If a result variable binding matches "result.fogcoord", updates to the "x"
     component of the result variable set the transformed vertex's fog
     coordinate.  Updates to the "y", "z", and "w" components of the result
     variable have no effect.

     If a result variable binding matches "result.pointsize", updates to the
     "x" component of the result variable set the transformed vertex's point
     size.  Updates to the "y", "z", and "w" components of the result variable
     have no effect.

     If a result variable binding matches "result.texcoord" or
     "result.texcoord[n]", updates to the "x", "y", "z", and "w" components of
     the result variable set the "s", "t", "r" and "q" components,
     respectively, of the transformed vertex's texture coordinates for texture
     unit <n>.  If "[n]" is omitted, texture unit zero is selected.

     If a result variable binding matches "result.attrib[n]", updates to the
     "x", "y", "z", and "w" components of the result variable set the "x", "y",
     "z", and "w" components of the generic interpolant <n>.

     If a result variable binding matches "result.clip[n]", updates to the "x"
     component of the result variable set the clip distance for clip plane <n>.

     If a result variable binding matches "result.texcoord[n..o]",
     "result.attrib[n..o]", or "result.clip[n..o]", a sequence of 1+<o>-<n>
     bindings is created.  For texture coordinates, it is as though the
     sequence "result.texcoord[n], result.texcoord[n+1],
     ... result.texcoord[o]" were specified.  These bindings are available only
     in explicit declarations of array variables.  A program will fail to load
     if <n> is greater than <o>.

     In addition to per-vertex attribute bindings, a set of per-patch result
     bindings are available to tessellation control programs, as described in
     Table X.6.  These bindings are not available to tessellation evaluation
     programs.

       Binding                        Components  Description
       -----------------------------  ----------  ----------------------------
       result.patch.tessouter[n]      (x,*,*,*)   tessctl outer level n
       result.patch.tessinner[n]      (x,*,*,*)   tessctl inner level n
       result.patch.attrib[n]         (x,y,z,w)   per-patch generic attrib n
       result.patch.tessouter[n..o]   (x,*,*,*)   tessctl outer levels n thru o
       result.patch.tessinner[n..o]   (x,*,*,*)   tessctl inner levels n thru o
       result.patch.attrib[n..o]      (x,y,z,w)   per-patch attribs n thru o

       Table X.6:  Tessellation Control Per-Patch Result Variable Bindings.
       Components labeled "*" are unused.

     If a result variable binding matches "result.patch.tessouter[n]", updates
     to the "x" component set the outer tessellation level numbered <n> for the
     output patch.  Updates to the "y", "z", and "w" components have no effect.

     If a result variable binding matches "result.patch.tessinner[n]", updates
     to the "x" component set the inner tessellation level numbered <n> for the
     output patch.  Updates to the "y", "z", and "w" components have no effect.

     If a result variable binding matches "result.patch.attrib[n]", updates to
     the "x", "y", "z", and "w" components of the result variable set the "x",
     "y", "z", and "w" components of the per-patch generic attribute numbered
     <n> for the output patch.

     If a result variable binding matches "result.patch.tessouter[n..o]",
     "result.patch.tessinner[n..o]", or "result.patch.attrib[n..o]", a sequence
     of 1+<o>-<n> bindings is created.  For per-patch generic attributes, it is
     as though the sequence "result.patch.attrib[n], result.patch.attrib[n+1],
     ...  result.patch.attrib[o]" were specified.  These bindings are available
     only in explicit declarations of array variables.  A program will fail to
     load if <n> is greater than <o>.


     Modify Section 2.X.5 of NV_gpu_program4, Program Flow Control

     (modify spec language at the end of the section to account for the
      different flow control model for tessellation control programs)

     Tessellation Control Program Flow Control

     For tessellation control programs, there are multiple program invocations
     for each patch processed that run as a group.  Any given program
     invocation can read per-vertex or per-patch attributes of the output
     patch, which may be computed during the execution of the program and may
     be computed by a different program invocation.  To provide defined
     behavior for such accesses, we specify that all threads for each patch run
     as a group.  When executing any block of instructions, all active threads
     will complete the execution of one instruction before starting the
     execution of the subsequent instruction.  Flow control instructions may
     cause the flow of threads in a group to diverge and will modify the set of
     active threads.  The handling of flow control instructions is described in
     more detail below.

     A tessellation control program is handled by executing all instructions in
     a block of instructions corresponding to the main subroutine, with all
     threads initially active.  This block consists of all instructions between
     the "main" label and the next subroutine label.  If no "main" label is
     present, the block starts with the first instruction in the program.  If
     there is no subroutine label following the beginning of the block, the
     block ends at the END instruction.  Instructions in the block are executed
     in order until all threads reach a termination condition.  A thread will
     terminate:

       * if it executes a RET anywhere within the main subroutine, unless the
         RET instruction is conditional and the condition code test fails; or

       * if it completes the execution of all instructions in the subroutine
         block.

     When an individual thread terminates processing of the main subroutine,
     the thread will become inactive and remain inactive for the remainder of
     program execution.  When all threads have terminated the main subroutine
     block, program execution is complete and the output patch is passed to
     subsequent pipeline stages.

     When a CAL instruction is executed, the current set of active threads will
     execute a block of instructions corresponding to the specified subroutine
     label.  This block consists of all instructions between the specified
     label and the next subroutine label.  If there is no subroutine label
     following the beginning of the block, the block ends at the END
     instruction.  Instructions in the block are executed in order until all
     active threads reach a termination condition.  A thread will complete
     execution of a subroutine block:

       * if the CAL instruction is conditional and the condition code test
         fails;

       * if it executes a RET anywhere within the subroutine block, unless the
         RET instruction is conditional and the condition code test fails; or

       * if it completes the execution of all instructions in the subroutine
         block.

     When an individual thread terminates processing of a called subroutine,
     the thread will become inactive and remain inactive until all threads have
     reached their termination condition.  When all threads have terminated the
     subroutine, execution continues at the instruction following the CAL
     instruction.  All threads active for the initial CAL instruction become
     active again; all other threads will remain inactive.

     When a REP instruction is executed, the current set of active threads will
     repeatedly execute the instructions between the REP and corresponding
     ENDREP instruction in order.  Execution of this instruction loop will
     continue until all threads active when the REP instruction is executed
     reach a termination condition.  A thread will terminate the processing of
     a REP/ENDREP block:

       * if the REP instruction specifies a loop count, and the initial loop
         count is not positive;

       * if the REP instruction specifies a loop count, and the current value
         of the loop count for the thread reaches zero when decremented by an
         ENDREP instruction;

       * if a RET instruction is executed anywhere within the REP/ENDREP block,
         unless the RET instruction is conditional and the condition code test
         fails; or

       * if a BRK instruction is executed inside the REP/ENDREP block, unless
         the BRK instruction is contained inside a more-deeply nested
         REP/ENDREP block or the BRK instruction is conditional and the
         condition code test fails.

     When an individual thread terminates processing of a REP/ENDREP loop, the
     thread will become inactive and remain inactive until all threads have
     terminated the loop.  When all threads have terminated the loop, execution
     continues at the instruction following the ENDREP instruction.  All
     threads active for the initial REP instruction become active again,
     unless they executed a RET instruction inside the REP/ENDREP block.  All
     other threads will be inactive.
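
     The loop-count termination rules can be modeled with a set of active
     threads, as in the Python sketch below.  It covers only REP instructions
     with a loop count; BRK, CONT, and RET are deliberately not modeled, and
     the function name and callback signature are assumptions.

```python
def run_rep(loop_counts, body):
    """Run a REP/ENDREP body in lock-step for a group of threads.

    loop_counts -- dict mapping thread id -> initial loop count
    body        -- callable(iteration, active) invoked once per pass with
                   the sorted list of threads still active in the loop
    Returns the number of passes made over the loop body.
    """
    remaining = dict(loop_counts)
    # A thread whose initial count is not positive leaves the loop at once.
    active = {t for t, n in remaining.items() if n > 0}
    passes = 0
    while active:
        body(passes, sorted(active))
        passes += 1
        # ENDREP decrements each active thread's count; zero terminates it.
        for t in list(active):
            remaining[t] -= 1
            if remaining[t] == 0:
                active.discard(t)
    return passes
```

     In this model, threads with loop counts 2, 0, and 3 produce three passes:
     the second thread never enters the loop, and the last pass runs with only
     the third thread active.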

     If a conditional CONT instruction is executed inside a REP/ENDREP block,
     all active threads passing the condition code test will become inactive
     and remain inactive until the next ENDREP instruction.  If all active
     threads become inactive following the completion of a CONT instruction,
     processing continues at the next ENDIF or ENDREP instruction.  An
     unconditional CONT instruction is treated identically to a conditional
     CONT instruction where all active threads pass the condition code test.

     When an IF instruction belonging to an IF/ELSE/ENDIF block is executed,
     the current set of active threads is split into two groups.  The first
     group consists of all active threads passing the condition code test, and
     will execute a block of instructions between the IF and ELSE.  The second
     group consists of all active threads failing the condition code test, and
     will execute a block of instructions between the ELSE and ENDIF.
     Instructions within each group are executed in lock-step order.  However,
     the order of execution of instructions for threads in the first group is
     undefined relative to those in the second group.

     When executing a block of instructions for either of the two groups in an
     IF/ELSE/ENDIF block, instructions within the block will be executed in
     order with only the threads in that group active.  The instructions of the
     block are executed until all threads in the group reach a block
     termination condition.  A thread will terminate the processing of its
     block:

       * if it executes a RET instruction, unless the RET instruction is
         conditional and the condition code test fails;

       * if it executes a BRK or CONT instruction inside the IF/ENDIF block,
         unless that instruction is contained in a more-deeply nested
         REP/ENDREP block or if the instruction is conditional and the
         condition code test fails; or

       * if it completes the execution of all instructions in the instruction
         block.

     When both groups have completed their instruction blocks, execution
     continues at the instruction following the ENDIF.  No instruction
     following the ENDIF will be executed until both groups have completed.  At
     that point, any thread active for the IF instruction will become active
     again unless the execution of its instruction block was terminated due to
     the execution of a RET, BRK, or CONT instruction.  All other threads will
     be inactive.
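
     The thread bookkeeping for an IF/ELSE/ENDIF block can be sketched in
     Python as a partition of the active set followed by a merge at the ENDIF.
     The function names are hypothetical, and the sketch tracks only which
     threads are active, not the instructions they execute.

```python
def split_if_else(active, passes_test):
    """Partition the active threads of an IF/ELSE/ENDIF block.

    active      -- iterable of active thread ids
    passes_test -- callable(thread id) -> bool, the condition code test
    Returns (if_group, else_group) as sorted lists.
    """
    threads = list(active)
    if_group = sorted(t for t in threads if passes_test(t))
    else_group = sorted(t for t in threads if not passes_test(t))
    return if_group, else_group

def threads_after_endif(active, exited_early):
    """Threads active again after ENDIF: those active at the IF that did not
    leave their block through RET, BRK, or CONT."""
    return sorted(set(active) - set(exited_early))
```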

     An IF instruction belonging to an IF/ENDIF block (with no corresponding
     ELSE) is handled as above, except that only one thread group is created.
     That group will consist of all active threads passing the condition code
     test, and it executes a block of instructions between the IF and ENDIF.

     The order of execution imposed by this flow control model typically
     produces defined results when a tessellation control program writes an
     output patch attribute, and then reads it (possibly on a different thread)
     for further computation.  There are two cases where undefined instruction
     execution order will lead to undefined attribute values.  When two or more
     threads access an attribute in a single executed instruction:

       * the value of the attribute after the instruction completes will be
         undefined if multiple threads write different values; and

       * the value of the attribute read by one thread will be undefined if the
         same attribute is written by another thread executing the same
         instruction.

     Also, when an IF/ELSE/ENDIF block is executed and a thread from each of
     the two thread groups accesses an attribute within its block:

       * the value of the attribute after the completion of the block will be
         undefined if both threads write different values; and

       * the value of the attribute read by one thread will be undefined if the
         same attribute is written by another thread.

     If either thread group in an IF/ELSE/ENDIF block issues CAL instructions,
     these restrictions also apply to the instructions executed in the called
     subroutine.
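
     The two single-instruction cases can be checked mechanically.  The Python
     sketch below takes the accesses that the threads of a group make to one
     output patch attribute while executing one instruction and reports
     whether either case applies.  The tuple layout and function name are
     assumptions made for this illustration.

```python
def instruction_is_hazard_free(accesses):
    """Check one lock-step instruction for the two single-instruction cases.

    accesses -- list of (thread, kind, value) tuples for a single output
                patch attribute, where kind is "read" or "write"; value is
                ignored for reads
    Returns True if neither undefined-value case applies.
    """
    writers = {t for t, kind, _ in accesses if kind == "write"}
    readers = {t for t, kind, _ in accesses if kind == "read"}
    written = {v for _, kind, v in accesses if kind == "write"}
    # Case 1: multiple threads write different values.
    if len(written) > 1:
        return False
    # Case 2: a thread reads while a different thread writes.
    if any(w != r for w in writers for r in readers):
        return False
    return True
```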

     The additional complexities of this tessellation control program flow
     control model are not fundamentally incompatible with the simpler flow
     control rules above.  They are simply intended to provide a useful model
     allowing for multiple cooperating threads.  In particular, the two models
     are completely equivalent if the number of tessellation control program
     threads per patch is one.


     (add the following subsections to section 2.X.6 of NV_gpu_program4,
      Program Options.)

     Section 2.X.6.Y, Tessellation Control Program Options

     No options are supported at present for tessellation control programs.


     Section 2.X.6.Y, Tessellation Evaluation Program Options

     No options are supported at present for tessellation evaluation programs.


     (add the following subsections to section 2.X.7 of NV_gpu_program4,
      Program Declarations.)

     Section 2.X.7.Y, Tessellation Control Program Declarations

     Tessellation control programs support one type of declaration statement,
     as described below.

     - Output Vertex Count (VERTICES_OUT)

     The VERTICES_OUT statement declares the number of vertices in the output
     patch produced by the tessellation control program, which also specifies
     the number of program invocations for each input patch.  The single
     argument must be a positive integer less than or equal to the value of the
     implementation-dependent limit MAX_PATCH_VERTICES_NV.  Each program
     invocation will have the same inputs except for the built-in input
     variable "primitive.invocation".  This variable will be an integer between
     0 and <n>-1, where <n> is the declared number of invocations.  A program
     will fail to load unless it contains exactly one VERTICES_OUT declaration.
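
     As an informal illustration, not part of the specification language, the
     load-time rule and invocation numbering above can be modelled as follows.
     The function name and the list-of-pairs form of the declarations are
     assumptions made for this sketch, as is treating an out-of-range argument
     as a load failure.  The limit passed in stands for the value an
     application would query as MAX_PATCH_VERTICES_NV.

```python
def invocation_ids(declarations, max_patch_vertices):
    """Return the "primitive.invocation" values for one input patch.

    declarations: list of (keyword, argument) pairs, a hypothetical
    pre-parsed form of the program's declaration statements.
    Raises ValueError where this sketch assumes the load would fail.
    """
    counts = [arg for (kw, arg) in declarations if kw == "VERTICES_OUT"]
    if len(counts) != 1:
        raise ValueError("exactly one VERTICES_OUT declaration is required")
    n = counts[0]
    if not isinstance(n, int) or n < 1 or n > max_patch_vertices:
        raise ValueError("VERTICES_OUT must be in [1, MAX_PATCH_VERTICES_NV]")
    # One invocation per output vertex, numbered 0 .. n-1.
    return list(range(n))
```

     For a program declaring four output vertices, this model yields the
     invocation numbers 0, 1, 2, and 3.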


     Section 2.X.7.Y, Tessellation Evaluation Program Declarations

     Tessellation evaluation programs support several declaration statements.
     Each of these may be included at most once in a tessellation evaluation
     program.

     - Tessellation Primitive Generation Mode (TESS_MODE)

     The TESS_MODE statement declares the type of subdivision performed by the
     tessellation primitive generator when the tessellation evaluation program
     is active, as described for the TESS_GEN_MODE_NV parameter in Section
     2.X.2.  The single argument must be "TRIANGLES", "QUADS", or "ISOLINES".
     A tessellation evaluation program will fail to load if it has no
     primitive generation mode declaration.

     - Tessellation Primitive Spacing (TESS_SPACING)

     The TESS_SPACING statement declares the type of spacing the tessellation
     primitive generator applies when subdividing primitive edges, as described
     for the TESS_GEN_SPACING_NV parameter in Section 2.X.2.  The single
     argument must be "EQUAL", "FRACTIONAL_ODD", or "FRACTIONAL_EVEN".  If a
     program omits a spacing declaration, "EQUAL" will be used.

     - Tessellation Vertex Order (TESS_VERTEX_ORDER)

     The TESS_VERTEX_ORDER statement declares the order of the vertices in the
     triangles emitted by the tessellation primitive generator in TRIANGLES or
     QUADS mode, as described for the TESS_GEN_VERTEX_ORDER_NV parameter in
     Section 2.X.2.  The single argument must be "CW" or "CCW".  If a program
     omits a vertex order declaration, "CCW" will be used.

     - Tessellation Point Mode (TESS_POINT_MODE)

     The TESS_POINT_MODE statement declares that the tessellation primitive
     generator will emit points for each vertex in the subdivided primitive
     instead of lines or triangles, as described for the TESS_GEN_POINT_MODE_NV
     parameter in Section 2.X.2.  The declaration takes no arguments.  If a
     program omits a point mode declaration, the primitives emitted will be
     lines (for ISOLINES mode) or triangles (for TRIANGLES and QUADS mode).
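
     The defaults and load-time rules for the four declarations above can be
     summarized by the following informal sketch.  It is a model only:  the
     names resolve_tess_state and emitted_primitive, the list-of-pairs input
     form, and the rejection of any other declaration keyword are assumptions
     of this example, not part of the extension.

```python
ALLOWED = {
    "TESS_MODE": {"TRIANGLES", "QUADS", "ISOLINES"},
    "TESS_SPACING": {"EQUAL", "FRACTIONAL_ODD", "FRACTIONAL_EVEN"},
    "TESS_VERTEX_ORDER": {"CW", "CCW"},
}
# Values used when a program omits the corresponding declaration.
DEFAULTS = {
    "TESS_SPACING": "EQUAL",
    "TESS_VERTEX_ORDER": "CCW",
    "TESS_POINT_MODE": False,
}


def resolve_tess_state(declarations):
    """Map (keyword, argument) pairs to the effective tessellation state."""
    state = dict(DEFAULTS)
    seen = set()
    for kw, arg in declarations:
        if kw in seen:
            raise ValueError(kw + " may be declared at most once")
        seen.add(kw)
        if kw == "TESS_POINT_MODE":
            if arg is not None:
                raise ValueError("TESS_POINT_MODE takes no arguments")
            state[kw] = True
        elif kw in ALLOWED:
            if arg not in ALLOWED[kw]:
                raise ValueError("bad argument for " + kw)
            state[kw] = arg
        else:
            # This sketch models only the four statements described above.
            raise ValueError("unmodelled declaration " + kw)
    if "TESS_MODE" not in state:
        raise ValueError("TESS_MODE declaration is required")
    return state


def emitted_primitive(state):
    """Primitive type produced by the tessellation primitive generator."""
    if state["TESS_POINT_MODE"]:
        return "points"
    return "lines" if state["TESS_MODE"] == "ISOLINES" else "triangles"
```

     A program declaring only a QUADS mode resolves to EQUAL spacing, CCW
     vertex order, and triangle output in this model.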


 Additions to Chapter 3 of the OpenGL 1.5 Specification (Rasterization)

     None.

 Additions to Chapter 4 of the OpenGL 1.5 Specification (Per-Fragment
 Operations and the Frame Buffer)

     None.

 Additions to Chapter 5 of the OpenGL 1.5 Specification (Special Functions)

     None.

 Additions to Chapter 6 of the OpenGL 1.5 Specification (State and
 State Requests)

     None.

 Additions to Appendix A of the OpenGL 1.5 Specification (Invariance)

     None.

 Additions to the AGL/GLX/WGL Specifications

     None.

 GLX Protocol

     None.

 Errors

     The error INVALID_OPERATION is generated if Begin, or any command that
     implicitly calls Begin, is called when tessellation control programs are
     enabled and the currently bound tessellation control program object does
     not contain a valid program.

     The error INVALID_OPERATION is generated if Begin, or any command that
     implicitly calls Begin, is called when tessellation evaluation programs
     are enabled and the currently bound tessellation evaluation program object
     does not contain a valid program.

     The error INVALID_OPERATION is generated if Begin, or any command that
     implicitly calls Begin, is called when tessellation control programs are
     enabled and <mode> is not PATCHES_NV.

     The error INVALID_OPERATION is generated if Begin, or any command that
     implicitly calls Begin, is called when tessellation evaluation programs
     are enabled and <mode> is not PATCHES_NV.
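
     The four conditions above can be read as a single check performed at
     Begin time.  The following sketch is an informal model covering only the
     errors listed in this section; the function name and its boolean
     parameters are hypothetical and do not correspond to any GL entry point.

```python
def begin_error(mode, tcp_enabled, tcp_valid, tep_enabled, tep_valid):
    """Return the error this section assigns to a Begin call, if any.

    mode is the primitive mode as a string such as "PATCHES_NV".
    tcp_* and tep_* describe the tessellation control and evaluation
    program enables and whether the bound object holds a valid program.
    """
    if tcp_enabled and not tcp_valid:
        return "INVALID_OPERATION"
    if tep_enabled and not tep_valid:
        return "INVALID_OPERATION"
    if (tcp_enabled or tep_enabled) and mode != "PATCHES_NV":
        return "INVALID_OPERATION"
    return "NO_ERROR"
```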

 New State

     (Modify ARB_vertex_program, Table X.6 -- Program State)

                                                      Initial
     Get Value                  Type    Get Command    Value  Description              Sec.    Attribute
     -------------------------  ----    -----------   ------- ------------------------ ------  ---------
     TESS_CONTROL_PROGRAM_NV     B      IsEnabled      FALSE  Tessellation control     2.14.6  enable
                                                              program enable
     TESS_EVALUATION_PROGRAM_NV  B      IsEnabled      FALSE  Tess. evaluation         2.14.6  enable
                                                              program enable

     TESS_CONTROL_PROGRAM_       Z+     GetIntegerv      0    Active tess control      2.14.1  -
       PARAMETER_BUFFER_NV                                    program buffer object
                                                              binding
     TESS_CONTROL_PROGRAM_       nxZ+   GetInteger-      0    Buffer objects bound for 2.14.1  -
       PARAMETER_BUFFER_NV              IndexedvEXT           tess. control program use

     TESS_EVALUATION_PROGRAM_    Z+     GetIntegerv      0    Active tess evaluation   2.14.1  -
       PARAMETER_BUFFER_NV                                    program buffer object
                                                              binding
     TESS_EVALUATION_PROGRAM_    nxZ+   GetInteger-      0    Buffer objects bound for 2.14.1  -
       PARAMETER_BUFFER_NV              IndexedvEXT           tess. eval. program use


     Additionally, some tessellation-related state applicable to this extension
     is added by ARB_tessellation_shader.

 New Implementation Dependent State

                                                              Minimum
     Get Value                         Type  Get Command       Value   Description             Sec.     Attrib
     --------------------------------  ----  ---------------  -------  ----------------------- -------- ------
     MAX_PROGRAM_PATCH_ATTRIBS_NV       Z+   GetProgramivARB     30    number of generic patch 2.X.3.2    -
                                                                       attribute vectors
                                                                       supported

     Additionally, some tessellation-related state applicable to this extension
     is added by ARB_tessellation_shader.


 Dependencies on ARB_tessellation_shader

     This spec incorporates the text of ARB_tessellation_shader in its
     entirety.  If ARB_tessellation_shader is not supported, language
     documenting GLSL tessellation control and evaluation shaders should be
     removed; tessellation would be available only using the assembly
     interface.  Language describing the operation of patch primitives and the
     tessellation primitive generator would be retained.

 Dependencies on NV_parameter_buffer_object

     The NV_parameter_buffer_object (PaBO) extension provides the ability to
     bind buffer objects to be read by vertex, geometry, and fragment programs.

     If NV_parameter_buffer_object is supported, this extension adds the
     ability to bind buffer objects to be accessed by tessellation control and
     evaluation programs.  The NV_parameter_buffer_object specification should
     be modified to accept the enums TESS_CONTROL_PROGRAM_PARAMETER_BUFFER_NV
     and TESS_EVALUATION_PROGRAM_PARAMETER_BUFFER_NV where the three previously
     defined enums (for vertex, geometry, and fragment programs) are accepted.

     If NV_parameter_buffer_object is not supported, references to the two new
     buffer object binding points should be removed.

 Issues

     (1) How does tessellation fit into the existing GL pipeline?

       RESOLVED:  See issue (1) in the ARB_tessellation_shader specification,
       which contains beautifully crafted ASCII art depicting the pipeline.

     (2) What other considerations were involved in the design of the
         tessellation API?

       RESOLVED:  Go look at the detailed issues section of the GLSL-based
       ARB_tessellation_shader specification.  There are a good number of
       issues that apply equally to the assembly APIs that won't be duplicated
       here.

     (3) Should the tessellation-related parameters (e.g., the primitive
         decomposition, spacing, vertex orientation) be context state or
         provided with the program?  If the latter, how should they be
         provided?

       RESOLVED:  We are providing declaration statements to specify each of
       these parameters in the tessellation evaluation program.  Because they
       are part of the program text, they can't be changed independently of the
       program.  We don't think that limitation is serious, and the same
       limitation applies to GLSL shaders (you need to re-link when changing
       these parameters).

       Putting these declarations in the shader means that it wasn't necessary
       to create a new "tessellation parameter" API to set this state.  Such an
       API would only apply to assembly programs and could be a source of
       confusion if developers thought it might apply to GLSL shaders as well.

     (4) The programming model for tessellation control programs supports
         multiple threads, each providing attributes for a single vertex.  But
         it also supports the ability to read the per-vertex outputs written by
         other threads and to read and write shared per-patch attribute
         outputs.  The latter capabilities require some sort of synchronization
         to ensure consistently ordered reads and writes whenever possible.
         How should this be handled?

       RESOLVED:  We will expose a programming model where we run groups of <N>
       parallel threads in lock-step.  In this model, all <N> threads
       effectively retire one instruction before starting the next.  This
       execution model provides a simple abstraction, and provides an obvious
       instruction order allowing an application to avoid most read-write and
       write-write hazards.
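
       As a rough illustration of the lock-step abstraction, the following
       sketch runs every thread through instruction k before any thread
       starts instruction k+1.  The helper names and the dictionary used for
       outputs are assumptions of this example.  It models the abstraction
       only and says nothing about how an implementation schedules threads.

```python
def run_lockstep(program, n):
    """Run n threads over a list of instructions in lock-step.

    Each instruction is a callable (tid, n, outputs).  Every thread
    retires instruction k before any thread starts instruction k+1.
    """
    outputs = {}
    for instruction in program:
        # The order of threads inside one instruction is arbitrary here;
        # a well-formed program must not depend on it.
        for tid in range(n):
            instruction(tid, n, outputs)
    return outputs


def write_own_vertex(tid, n, outputs):
    # Each thread writes only its own per-vertex output.
    outputs[("vertex", tid)] = 10 * tid


def read_neighbor(tid, n, outputs):
    # Each thread reads the output written by the next thread.
    outputs[("copy", tid)] = outputs[("vertex", (tid + 1) % n)]


result = run_lockstep([write_own_vertex, read_neighbor], 3)
```

       Because all writes of the first instruction retire before the second
       instruction starts, every read in this example sees a defined value,
       whatever order the threads are visited in.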

       There are three places where we have explicitly undefined behavior:

         * If flow control diverges in an IF/ELSE/ENDIF block, the relative
           order of writes in the "IF" side of the block and those in the
           "ELSE" side of the block is undefined.

         * If multiple threads write different values to the same per-patch
           attribute in the same instruction, the order in which the writes
           land is undefined.

         * If any single instruction has one thread reading a per-vertex output
           or a per-patch attribute and another thread writing the same output,
           the order in which the reads and writes land is undefined.
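
       The second and third cases can be checked per instruction.  The sketch
       below is an informal model:  the function name, the string names used
       for outputs, and the (reads, writes) representation are assumptions of
       this example.  It does not model the first case (divergent IF/ELSE
       blocks).

```python
def instruction_hazards(thread_ops):
    """Classify hazards among threads executing one instruction.

    thread_ops holds one (reads, writes) pair per thread, where reads is
    a set of output names and writes maps output names to values.
    Returns a set of (kind, name) pairs.
    """
    hazards = set()
    for i, (_, writes_i) in enumerate(thread_ops):
        for j, (reads_j, writes_j) in enumerate(thread_ops):
            if i == j:
                continue
            for name, value in writes_i.items():
                # Two threads writing different values to the same output.
                if name in writes_j and writes_j[name] != value:
                    hazards.add(("write-write", name))
                # One thread reading an output another thread writes.
                if name in reads_j:
                    hazards.add(("read-write", name))
    return hazards
```

       In this model, two threads writing the same value to a per-patch
       attribute in one instruction produce no hazard, while writing
       different values does.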

       Implementations need not actually run the threads in this manner, as
       long as the compiler properly synchronizes threads at the points where
       execution order dependencies do occur.  Since the NV_gpu_program4
       programming model uses structured branching (e.g., IF/ELSE/ENDIF
       blocks), the points at which threads may diverge and converge again are
       easily identified.  We expect that the number of such synchronization
       points will be low for most tessellation control programs.

       One other approach considered is to limit the flow control model and the
       capabilities of the system to result in a minimal number of required
       synchronization points.  For example, the tessellation control program
       might be split into phases where the capabilities of each thread to
       access outputs would be limited.  For example, one might have a
       three-phase model like the following:

                       Per-Vertex Outputs           Per-Patch Outputs
          Phase     can read?    can write?      can read?    can write?
          -----     ---------    ----------      ---------    ----------
            1          NO           YES             NO           NO
            2          YES          NO              NO           YES(a)
            3          YES          NO              YES(a)       YES(b)

      In this model, there would be two explicit synchronization points --
      between each pair of phases.  The limits on access prevent most cases
      where conflicts could occur (e.g., you can't read any per-vertex outputs
      until you're completely done writing all).  To further limit conflicts,
      per-patch attributes might be divided into two sets -- set (a) can be
      written only in phase 2 and read only in phase 3, and set (b) can be
      written only in phase 3.
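
      For reference, the access rules of this alternative model, which was
      considered but not adopted, can be written down as a lookup table.
      The names PHASE_RULES and access_allowed and the class labels
      "vertex", "patch_a", and "patch_b" are hypothetical and exist only in
      this sketch.

```python
# Accesses permitted in each phase, per class of output.
# "patch_a" and "patch_b" are the two sets of per-patch attributes.
PHASE_RULES = {
    1: {"vertex": {"write"}, "patch_a": set(), "patch_b": set()},
    2: {"vertex": {"read"}, "patch_a": {"write"}, "patch_b": set()},
    3: {"vertex": {"read"}, "patch_a": {"read"}, "patch_b": {"write"}},
}


def access_allowed(phase, output_class, access):
    """Return True if the phased model permits this access."""
    return access in PHASE_RULES[phase][output_class]
```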

      We decided to expose a general model on the grounds that having the
      compiler automatically determine possible synchronization points was easy
      enough.  Optimizing compilers that reorder instructions already have to
      deal with this exact type of issue -- they can't move instructions that
      write a variable past subsequent instructions that read it.

      The programming model adopted for GLSL in ARB_tessellation_shader
      similarly has a set of parallel threads running one executable, but it
      provides a barrier() call that serves as a synchronization point and can
      be used to split shader execution into phases.

      Note that while all previous OpenGL programmability extensions exposed a
      model of completely independent threads (i.e., one thread can't read the
      outputs of another), threads weren't always completely independent!  In
      fragment programs/shaders, some texture and all partial derivative
      built-ins (dFdx, dFdy in GLSL) require screen-space derivatives.  If the
      quantity used for derivatives is computed by the shader, OpenGL
      implementations generally run threads in groups arranged by screen-space
      location and approximate derivatives by computing differences of the
      inputs between threads.  This approach requires the same sort of
      automatic synchronization between threads, since derivatives implicitly
      read values computed by other threads.
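
      The derivative case can be sketched as follows.  The 2x2 grouping, the
      indexing convention quad[y][x], and the function name are assumptions
      of this example; the text above says only that threads are grouped by
      screen-space location.

```python
def quad_derivatives(quad, x, y):
    """Approximate derivatives for the thread at (x, y) in a 2x2 group.

    quad[y][x] holds the value computed by each of the four threads.
    Each derivative is a difference between this thread's row or column
    neighbors, so it implicitly reads values computed by other threads.
    """
    ddx = quad[y][1] - quad[y][0]
    ddy = quad[1][x] - quad[0][x]
    return ddx, ddy
```

      All four threads must have computed their values before any of these
      differences is taken, which is the synchronization referred to above.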


 Revision History

     Rev.    Date    Author    Changes
     ----  --------  --------  -----------------------------------------
      3    12/19/11  pbrown    Clarify that "primitive.tessouter[n]",
                               "primitive.tessinner[n]", and "primitive.
                               patch.attrib[n]" are not available on the input
                               patch for tessellation control programs.  Remove
                               stray language referring to a non-existent
                               vector tessellation level.

      2    03/22/10  pbrown    Rename references to ARB_tessellation_shader
                               (formerly EXT).  Minor other cleanups, including
                               the issues section.

      1              pbrown    Internal revisions.