| Name |
| |
| NV_vertex_program3 |
| |
| Name Strings |
| |
| GL_NV_vertex_program3 |
| |
| Contact |
| |
| Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com) |
| |
| Status |
| |
| Shipping. |
| |
| Version |
| |
| Last Modified Date: 10/12/2009 |
| NVIDIA Revision: 7 |
| |
| Number |
| |
| 306 |
| |
| Dependencies |
| |
| ARB_vertex_program is required. |
| NV_vertex_program2_option is required. |
| This extension interacts with ARB_fragment_program_shadow. |
| |
| Overview |
| |
| This extension, like the NV_vertex_program2_option extension, |
| provides additional vertex program functionality to extend the |
| standard ARB_vertex_program language and execution environment. |
| ARB programs wishing to use this added functionality need only add: |
| |
| OPTION NV_vertex_program3; |
| |
| to the beginning of their vertex programs. |
| |
| New functionality provided by this extension, above and beyond that |
| already provided by the NV_vertex_program2_option extension, includes: |
| |
| * texture lookups in vertex programs, |
| |
| * ability to push and pop address registers on the stack, |
| |
| * address register-relative addressing for vertex attribute and |
| result arrays, and |
| |
| * a second four-component condition code. |
| |
| Issues |
| |
| Should we provide a separate "!!VP3.0" program type, like the |
| "!!VP2.0" type defined in NV_vertex_program2? |
| |
| RESOLVED: No. Since ARB_vertex_program has been fully defined |
| (it wasn't in the !!VP2.0 time-frame), we will simply define |
| language extensions to !!ARBvp1.0 that expose new functionality. |
| The NV_vertex_program2_option specification followed this same |
| pattern for the NV3X family (GeForce FX, Quadro FX). |
| |
| Should this be called "NV_vertex_program3_option"? |
| |
| RESOLVED: No. The similar extension to !!ARBvp1.0 called |
| "NV_vertex_program2_option" got that name only because the simpler |
| "NV_vertex_program2" name had already been used. |
| |
| Is there a limit on the number of texture units that can be accessed |
| by a vertex program? |
| |
| RESOLVED: Yes. The limit may be lower than the total number of texture |
| image units available and is given by the implementation-dependent |
| constant MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB. Any program that attempts |
| to use more unique texture image units will fail to load. Programs can |
| use any texture image unit number, as long as they don't use too many |
| simultaneously. As an example, the GeForce 6 series of GPUs provides 16 |
| texture image units accessible to vertex programs, but no more than four |
| can be used simultaneously. It is not an error to use texture image |
| units 12-15 in a program. |
| |
| This limitation is identical to the one in the ARB_vertex_shader |
| extension -- both extensions use the same enum to query the number of |
| available image units. Violating this limit in GLSL results in a link |
| error. |
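
The limit applies to the number of distinct units a program references, not to the unit numbers themselves. The following C sketch illustrates that distinction. It is not driver code: the function names are hypothetical, unit numbers are assumed to lie in [0, 31], and max_vertex_units stands in for the value queried through MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Count the distinct texture image units a program references.
 * Assumption for this sketch: unit numbers are in [0, 31]. */
static int count_unique_units(const int *units, size_t n)
{
    uint32_t seen = 0;
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t bit = UINT32_C(1) << units[i];
        if ((seen & bit) == 0) {
            seen |= bit;
            count++;
        }
    }
    return count;
}

/* Hypothetical load-time check: max_vertex_units stands in for the value
 * queried through MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB. */
static bool program_would_load(const int *units, size_t n, int max_vertex_units)
{
    return count_unique_units(units, n) <= max_vertex_units;
}
```

With a limit of four, a program referencing units 12, 13, 14, and 15 (even repeatedly) passes this check, while one referencing units 0, 1, 2, 3, and 15 does not.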
| |
| Is there a restriction on the texture targets that can be accessed by a |
| vertex program? |
| |
| RESOLVED: Yes -- for any texture image unit, vertex and fragment |
| processing cannot use different targets. If they do, an |
| INVALID_OPERATION is generated at Begin-time. This resolution is |
| consistent with the resolution of the same issue in the ARB_vertex_shader |
| extension and OpenGL 2.0. |
| |
| Since vertices don't have screen space partial derivatives, how is |
| the LOD used for texture accesses defined? |
| |
| RESOLVED: The TXL instruction allows a program to explicitly |
| set an LOD; the LOD for all other texture instructions is zero. |
| The texture LOD biases specified in the texture object and the texture |
| environment do apply to all vertex texture lookups. |
| |
| |
| New Procedures and Functions |
| |
| None. |
| |
| New Tokens |
| |
| Accepted by the <pname> parameter of GetBooleanv, GetIntegerv, |
| GetFloatv, and GetDoublev: |
| |
| MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB 0x8B4C |
| |
| Additions to Chapter 2 of the OpenGL 1.4 Specification (OpenGL Operation) |
| |
| Modify Section 2.14.2, Vertex Program Grammar and Restrictions |
| |
| (mostly add to existing grammar rules, as extended by |
| NV_vertex_program2_option) |
| |
| <optionName> ::= "NV_vertex_program3" |
| |
| <instruction> ::= <TexInstruction> |
| |
| <ALUInstruction> ::= <ASTACKop_instruction> |
| |
| <TexInstruction> ::= <TEXop_instruction> |
| |
| <ASTACKop_instruction> ::= <PUSHAop> <instOperandAddrVNS> |
| | <POPAop> <instResultAddr> |
| |
| <PUSHAop> ::= "PUSHA" |
| |
| <POPAop> ::= "POPA" |
| |
| <TEXop_instruction> ::= <TEXop> <instResult> "," <instOperandV> "," |
| <texTarget> |
| |
| <TEXop> ::= "TEX" |
| | "TXP" |
| | "TXB" |
| | "TXL" |
| |
| <texTarget> ::= <texImageUnit> "," <texTargetType> |
| |
| <texImageUnit> ::= "texture" <optTexImageUnitNum> |
| |
| <optTexImageUnitNum> ::= /* empty */ |
| | "[" <texImageUnitNum> "]" |
| |
| <texImageUnitNum> ::= <integer> |
| /*[0,MAX_TEXTURE_IMAGE_UNITS_ARB-1]*/ |
| |
| <texTargetType> ::= "1D" |
| | "2D" |
| | "3D" |
| | "CUBE" |
| | "RECT" |
| |
| <attribVtxBasic> ::= "texcoord" "[" <arrayMemRel> "]" |
| | "attrib" "[" <arrayMemRel> "]" |
| |
| <resultVtxBasic> ::= "texcoord" "[" <arrayMemRel> "]" |
| |
| <ccMaskRule> ::= "EQ0" |
| | "GE0" |
| | "GT0" |
| | "LE0" |
| | "LT0" |
| | "NE0" |
| | "TR0" |
| | "FL0" |
| | "EQ1" |
| | "GE1" |
| | "GT1" |
| | "LE1" |
| | "LT1" |
| | "NE1" |
| | "TR1" |
| | "FL1" |
| |
| (modify description of reserved identifiers) |
| |
| ... The following strings are reserved keywords and may not be used |
| as identifiers: |
| |
| ABS, ADD, ADDRESS, ALIAS, ARA, ARL, ARR, ATTRIB, BRA, CAL, COS, |
| DP3, DP4, DPH, DST, END, EX2, EXP, FLR, FRC, LG2, LIT, LOG, MAD, |
| MAX, MIN, MOV, MUL, OPTION, OUTPUT, PARAM, POPA, POW, PUSHA, RCC, |
| RCP, RET, RSQ, SEQ, SFL, SGE, SGT, SIN, SLE, SLT, SNE, SUB, SSG, |
| STR, SWZ, TEMP, TEX, TXB, TXL, TXP, XPD, program, result, state, |
| and vertex. |
| |
| Modify Section 2.14.3.1, Vertex Attributes |
| |
| (add new bindings to binding table) |
| |
| Vertex Attribute Binding Components Underlying State |
| ------------------------ ---------- -------------------------------- |
| ... |
| vertex.texcoord[A+n] (s,t,r,q) indexed texture coordinate |
| vertex.attrib[A+n] (x,y,z,w) indexed generic vertex attribute |
| |
| If a vertex attribute binding matches "vertex.texcoord[A+n]", where |
| "A" is a component of an address register (Section 2.14.3.5), a |
| texture coordinate number <c> is computed by adding the current |
| value of the address register component and <n>. The "x", "y", |
| "z", and "w" components of the vertex attribute variable are |
| filled with the "s", "t", "r", and "q" components, respectively, |
| of the vertex texture coordinates for texture unit <c>. If <c> |
| is negative or greater than or equal to MAX_TEXTURE_COORDS_ARB, |
| the vertex attribute variable is undefined. |
| |
| If a vertex attribute binding matches "vertex.attrib[A+n]", where |
| "A" is a component of an address register (Section 2.14.3.5), a |
| vertex attribute number <a> is computed by adding the current value |
| of the address register component and <n>. The "x", "y", "z", and |
| "w" components of the vertex attribute variable are filled with the |
| "x", "y", "z", and "w" components, respectively, of generic vertex |
| attribute <a>. If <a> is negative or greater than or equal to |
| MAX_VERTEX_ATTRIBS_ARB, the vertex attribute variable is undefined. |
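
The index computation for "vertex.texcoord[A+n]" and "vertex.attrib[A+n]" can be sketched in C as follows. The function name and the array_size parameter are hypothetical; array_size stands in for MAX_TEXTURE_COORDS_ARB or MAX_VERTEX_ATTRIBS_ARB. An out-of-range index is reported to the caller here only so the sketch has defined behavior; the specification leaves the fetched value undefined in that case.

```c
#include <stdbool.h>

/* Resolve an "[A+n]" binding: add the address register component to the
 * constant offset. Returns false when the index falls outside
 * [0, array_size - 1], the case where the fetched value is undefined. */
static bool resolve_relative_index(int addr_component, int offset,
                                   int array_size, int *index_out)
{
    int index = addr_component + offset;
    if (index < 0 || index >= array_size) {
        return false;
    }
    *index_out = index;
    return true;
}
```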
| |
| Modify Section 2.14.3.4, Vertex Program Results |
| |
| (add new binding to binding table) |
| |
| Binding Components Description |
| ----------------------------- ---------- ---------------------------- |
| ... |
| result.texcoord[A+n] (s,t,r,q) indexed texture coordinate |
| |
| If a result variable binding matches "result.texcoord[A+n]", where "A" |
| is a component of an address register (Section 2.14.3.5), a texture |
| coordinate number <c> is computed by adding the current value of |
| the address register component and <n>. Updates to the "x", "y", |
| "z", and "w" components of the result variable set the "s", "t", |
| "r" and "q" components, respectively, of the transformed vertex's |
| texture coordinates for texture unit <c>. If <c> is negative or |
| greater than or equal to MAX_TEXTURE_COORDS_ARB, the effects of |
| updates to the result variable are undefined and may overwrite |
| other program results. |
| |
| Modify Section 2.14.3.X, Condition Code Registers (added in |
| NV_vertex_program2_option) |
| |
| The vertex program condition code registers are two four-component |
| vectors, called CC0 and CC1. Each component of these registers is one |
| of four enumerated values: GT (greater than), EQ (equal), LT (less |
| than), or UN (unordered). The condition code registers can be used |
| to mask writes to registers and to evaluate conditional branches. |
| |
| Most vertex program instructions can optionally update one of the |
| two condition code registers. When a vertex program instruction |
| updates a condition code register, a condition code component is set |
| to LT if the corresponding component of the result is less than zero, |
| EQ if it is equal to zero, GT if it is greater than zero, and UN if |
| it is NaN (not a number). |
| |
| The condition code registers are initialized to vectors of EQ values |
| each time a vertex program executes. |
| |
| Modify Section 2.14.3.7, Vertex Program Resource Limits |
| |
| (add new paragraph to end of section) In addition to the previous limits, |
| the number of unique texture image units that can be accessed |
| simultaneously by a vertex program is limited. The limit is given by the |
| implementation-dependent constant MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB, and |
| may be lower than the total number of texture image units provided. If |
| the number of texture image units referenced by a vertex program exceeds |
| this limit, the program will fail to load. |
| |
| Modify Section 2.14.4, Vertex Program Execution Environment |
| |
| (modify Begin-time error language for vertex program execution to cover |
| invalid texture uses) |
| |
| If vertex program mode is enabled and the currently bound program object |
| does not contain a valid vertex program, the error INVALID_OPERATION will |
| be generated by Begin, RasterPos, and any command that implicitly calls |
| Begin (e.g., DrawArrays). |
| |
| If vertex program mode is enabled and the currently bound program object |
| accesses a texture image unit, the texture target used must be consistent |
| with the target (if any) used for fragment processing. If vertex and |
| fragment processing require the use of different texture targets on the |
| same texture image unit, the error INVALID_OPERATION will be generated by |
| Begin, RasterPos, and any command that implicitly calls Begin. |
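
The Begin-time consistency rule can be sketched as a per-unit comparison. This C fragment is illustrative only: the type and function names are hypothetical, NUM_IMAGE_UNITS is an arbitrary unit count chosen for the sketch, and TARGET_UNUSED marks a unit that a processing stage does not access.

```c
#include <stdbool.h>

enum { NUM_IMAGE_UNITS = 16 };  /* hypothetical unit count for this sketch */

typedef enum {
    TARGET_UNUSED, TARGET_1D, TARGET_2D, TARGET_3D, TARGET_CUBE, TARGET_RECT
} TexTarget;

/* Returns false when vertex and fragment processing use different targets
 * on the same texture image unit, the case that yields INVALID_OPERATION. */
static bool targets_consistent(const TexTarget vertex_use[NUM_IMAGE_UNITS],
                               const TexTarget fragment_use[NUM_IMAGE_UNITS])
{
    for (int unit = 0; unit < NUM_IMAGE_UNITS; unit++) {
        TexTarget v = vertex_use[unit];
        TexTarget f = fragment_use[unit];
        if (v != TARGET_UNUSED && f != TARGET_UNUSED && v != f) {
            return false;
        }
    }
    return true;
}
```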
| |
| (modify instruction table) There are forty-eight vertex program |
| instructions. Vertex program instructions may have up to eight |
| variants, including a suffix of "C" or "C0" to allow an update of |
| condition code register zero (section 2.14.3.X), a suffix of "C1" |
| to allow an update of condition code register one, and a suffix of |
| "_SAT" to clamp the result vector components to the range [0,1]. |
| For example, the eight forms of the "ADD" instruction are "ADD", |
| "ADDC", "ADDC0", "ADDC1", "ADD_SAT", "ADDC_SAT", "ADDC0_SAT", and |
| "ADDC1_SAT". The instructions and their respective input and output |
| parameters are summarized in Table X.5. |
| |
| Modifiers |
| Instruction C S Inputs Output Description |
| ----------- - - ------ ------ -------------------------------- |
| ABS X X v v absolute value |
| ADD X X v,v v add |
| ARA X - a a address register add |
| ARL X - s a address register load |
| ARR X - v a address register load (round) |
| BRA - - c - branch |
| CAL - - c - subroutine call |
| COS X X s ssss cosine |
| DP3 X X v,v ssss 3-component dot product |
| DP4 X X v,v ssss 4-component dot product |
| DPH X X v,v ssss homogeneous dot product |
| DST X X v,v v distance vector |
| EX2 X X s ssss exponential base 2 |
| EXP X X s v exponential base 2 (approximate) |
| FLR X X v v floor |
| FRC X X v v fraction |
| LG2 X X s ssss logarithm base 2 |
| LIT X X v v compute light coefficients |
| LOG X X s v logarithm base 2 (approximate) |
| MAD X X v,v,v v multiply and add |
| MAX X X v,v v maximum |
| MIN X X v,v v minimum |
| MOV X X v v move |
| MUL X X v,v v multiply |
| POPA - - - a pop address register |
| POW X X s,s ssss exponentiate |
| PUSHA - - a - push address register |
| RCC X X s ssss reciprocal (clamped) |
| RCP X X s ssss reciprocal |
| RET - - c - subroutine return |
| RSQ X X s ssss reciprocal square root |
| SEQ X X v,v v set on equal |
| SFL X X v,v v set on false |
| SGE X X v,v v set on greater than or equal |
| SGT X X v,v v set on greater than |
| SIN X X s ssss sine |
| SLE X X v,v v set on less than or equal |
| SLT X X v,v v set on less than |
| SNE X X v,v v set on not equal |
| SSG X X v v set sign |
| STR X X v,v v set on true |
| SUB X X v,v v subtract |
| SWZ X X v v extended swizzle |
| TEX X X v v texture lookup |
| TXB X X v v texture lookup with LOD bias |
| TXL X X v v texture lookup with explicit LOD |
| TXP X X v v projective texture lookup |
| XPD X X v,v v cross product |
| |
| Table X.5: Summary of vertex program instructions. The columns |
| "C" and "S" indicate whether the "C", "C0", and "C1" condition code |
| update modifiers, and the "_SAT" saturation modifiers, respectively, |
| are supported for the opcode. "v" indicates a floating-point vector |
| input or output, "s" indicates a floating-point scalar input, |
| "ssss" indicates a scalar output replicated across a 4-component |
| result vector, "a" indicates a vector address register, and "c" |
| indicates a condition code test. |
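
For an opcode marked "X" in both the "C" and "S" columns, the eight variant mnemonics are the product of four condition code suffixes and two saturation suffixes. The C sketch below enumerates them in the order used for the "ADD" example above. The helper names and buffer sizes are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum { NUM_VARIANTS = 8, MNEMONIC_LEN = 16 };

/* Fill out[] with the eight variants of an opcode that supports both the
 * condition code update and saturation modifiers. */
static void opcode_variants(const char *opcode,
                            char out[NUM_VARIANTS][MNEMONIC_LEN])
{
    static const char *const cc_suffix[4] = { "", "C", "C0", "C1" };
    static const char *const sat_suffix[2] = { "", "_SAT" };
    int k = 0;
    for (int s = 0; s < 2; s++) {
        for (int c = 0; c < 4; c++) {
            snprintf(out[k], MNEMONIC_LEN, "%s%s%s",
                     opcode, cc_suffix[c], sat_suffix[s]);
            k++;
        }
    }
}

/* True if mnemonic is one of the eight variants of opcode. */
static bool is_variant_of(const char *mnemonic, const char *opcode)
{
    char variants[NUM_VARIANTS][MNEMONIC_LEN];
    opcode_variants(opcode, variants);
    for (int i = 0; i < NUM_VARIANTS; i++) {
        if (strcmp(mnemonic, variants[i]) == 0) {
            return true;
        }
    }
    return false;
}
```

Opcodes with a "-" in either column (for example BRA or PUSHA) have fewer forms, so this enumeration does not apply to them.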
| |
| Rewrite Section 2.14.4.3, Vertex Program Destination Register Update |
| |
| A vertex program instruction can optionally clamp the results of |
| a floating-point result vector to the range [0,1]. The components |
| of the result vector are clamped to [0,1] if the saturation suffix |
| "_SAT" is present in the instruction. |
| |
| Most vertex program instructions write a 4-component result vector to |
| a single temporary or vertex result register. Writes to individual |
| components of the destination register are controlled by individual |
| component write masks specified as part of the instruction. |
| |
| The component write mask is specified by the <optionalMask> rule |
| found in the <maskedDstReg> rule. If the optional mask is "", |
| all components are enabled. Otherwise, the optional mask names |
| the individual components to enable. The characters "x", "y", |
| "z", and "w" match the x, y, z, and w components respectively. |
| For example, an optional mask of ".xzw" indicates that the x, z, |
| and w components should be enabled for writing but the y component |
| should not. The grammar requires that the destination register mask |
| components must be listed in "xyzw" order. The condition code write |
| mask is specified by the <ccMask> rule found in the <instResultCC> |
| and <instResultAddrCC> rules. If a condition code mask is given, the |
| selected condition code register is loaded and swizzled according to |
| the swizzle codes specified by <swizzleSuffix>. Each component of the |
| swizzled condition code is tested according to the rule given by |
| <ccMaskRule>. |
| <ccMaskRule> may have the values "EQ", "NE", "LT", "GE", "LE", or "GT", |
| which mean to enable writes if the corresponding condition code field |
| evaluates to equal, not equal, less than, greater than or equal, less |
| than or equal, or greater than, respectively. Comparisons involving |
| condition codes of "UN" (unordered) evaluate to true for "NE" and |
| false otherwise. For example, if the condition code is (GT,LT,EQ,GT) |
| and the condition code mask is "(NE.zyxw)", the swizzle operation |
| will load (EQ,LT,GT,GT) and the mask will thus enable writes on |
| the y, z, and w components. In addition, "TR" always enables writes |
| and "FL" always disables writes, regardless of the condition code. |
| If the condition code mask is empty, it is treated as "(TR)". |
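
The swizzle-then-test step can be checked with a small C sketch. The enum and function names are hypothetical, and selection between the two condition code registers is left out: the caller passes in the register already chosen. Bit 0 of the returned mask corresponds to x and bit 3 to w.

```c
#include <stdbool.h>

typedef enum { CC_EQ, CC_GT, CC_LT, CC_UN } CondCode;
typedef enum {
    RULE_EQ, RULE_NE, RULE_LT, RULE_GE, RULE_LE, RULE_GT, RULE_TR, RULE_FL
} MaskRule;

/* Evaluate one condition code field against a mask rule. "UN" passes only
 * the "NE" test (and "TR", which always passes). */
static bool test_cc(MaskRule rule, CondCode field)
{
    switch (rule) {
    case RULE_EQ: return field == CC_EQ;
    case RULE_NE: return field != CC_EQ;
    case RULE_LT: return field == CC_LT;
    case RULE_GE: return field == CC_GT || field == CC_EQ;
    case RULE_LE: return field == CC_LT || field == CC_EQ;
    case RULE_GT: return field == CC_GT;
    case RULE_TR: return true;
    case RULE_FL: return false;
    }
    return false;
}

/* swizzle[i] names the source component (0 = x ... 3 = w) that feeds
 * component i of the swizzled condition code. */
static unsigned cc_write_mask(const CondCode cc[4], const int swizzle[4],
                              MaskRule rule)
{
    unsigned mask = 0;
    for (int i = 0; i < 4; i++) {
        if (test_cc(rule, cc[swizzle[i]])) {
            mask |= 1u << i;
        }
    }
    return mask;
}
```

For the condition code (GT,LT,EQ,GT) and the mask "(NE.zyxw)", the swizzle is {2, 1, 0, 3} and the function returns 0xE, that is, writes enabled on y, z, and w.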
| |
| Each component of the destination register is updated with the result |
| of the vertex program instruction if and only if the component is |
| enabled for writes by both the component write mask and the condition |
| code write mask. Otherwise, the component of the destination register |
| remains unchanged. |
| |
| A vertex program instruction can also optionally update the condition |
| code register. The condition code is updated if the condition |
| code register update suffix "C" is present in the instruction. |
| The instruction "ADDC" will update the condition code; the otherwise |
| equivalent instruction "ADD" will not. If condition code updates |
| are enabled, each component of the destination register enabled |
| for writes is compared to zero. The corresponding component of |
| the condition code is set to "LT", "EQ", or "GT", if the written |
| component is less than, equal to, or greater than zero, respectively. |
| Condition code components are set to "UN" if the written component is |
| NaN (not a number). Values of -0.0 and +0.0 both evaluate to "EQ". |
| If a component of the destination register is not enabled for writes, |
| the corresponding condition code component is also unchanged. |
| |
| In the following example code, |
| |
| # R1=(-2, 0, 2, NaN) R0 CC |
| MOVC R0, R1; # ( -2, 0, 2, NaN) (LT,EQ,GT,UN) |
| MOVC R0.xyz, R1.yzwx; # ( 0, 2, NaN, NaN) (EQ,GT,UN,UN) |
| MOVC R0 (NE), R1.zywx; # ( 0, 0, NaN, -2) (EQ,EQ,UN,LT) |
| |
| the first instruction writes (-2,0,2,NaN) to R0 and updates the |
| condition code to (LT,EQ,GT,UN). In the second instruction, only the |
| "x", "y", and "z" components of R0 and the condition code are updated, |
| so R0 ends up with (0,2,NaN,NaN) and the condition code ends up with |
| (EQ,GT,UN,UN). In the third instruction, the condition code mask |
| disables writes to the x component (its condition code field is "EQ"), |
| so R0 ends up with (0,0,NaN,-2) and the condition code ends up with |
| (EQ,EQ,UN,LT). |
| |
| The following pseudocode illustrates the process of writing a |
| result vector to the destination register. In the pseudocode, |
| "instrSaturate" is TRUE if and only if result saturation is |
| enabled, "instrMask" refers to the component write mask given by |
| the <optWriteMask> rule. "ccMaskRule" refers to the condition code |
| mask rule given by <ccMask> and "updatecc" is TRUE if and only if |
| condition code updates are enabled. "result", "destination", and "cc" |
| refer to the result vector, the register selected by <dstRegister> |
| and the condition code, respectively. Condition codes do not exist |
| in the VP1 execution environment. |
| |
| boolean TestCC(CondCode field) { |
| switch (ccMaskRule) { |
| case "EQ": return (field == "EQ"); |
| case "NE": return (field != "EQ"); |
| case "LT": return (field == "LT"); |
| case "GE": return (field == "GT" || field == "EQ"); |
| case "LE": return (field == "LT" || field == "EQ"); |
| case "GT": return (field == "GT"); |
| case "TR": return TRUE; |
| case "FL": return FALSE; |
| case "": return TRUE; |
| } |
| } |
| |
| enum GenerateCC(float value) { |
| if (isnan(value)) { // NaN never compares equal, so test it explicitly |
| return UN; |
| } else if (value < 0) { |
| return LT; |
| } else if (value == 0) { |
| return EQ; |
| } else { |
| return GT; |
| } |
| } |
| |
| void UpdateDestination(floatVec destination, floatVec result) |
| { |
| floatVec merged; |
| ccVec mergedCC; |
| |
| // Clamp result components to [0,1] if requested in the instruction. |
| if (instrSaturate) { |
| if (result.x < 0) result.x = 0; |
| else if (result.x > 1) result.x = 1; |
| if (result.y < 0) result.y = 0; |
| else if (result.y > 1) result.y = 1; |
| if (result.z < 0) result.z = 0; |
| else if (result.z > 1) result.z = 1; |
| if (result.w < 0) result.w = 0; |
| else if (result.w > 1) result.w = 1; |
| } |
| |
| // Merge the converted result into the destination register, under |
| // control of the compile- and run-time write masks. |
| merged = destination; |
| mergedCC = cc; |
| if (instrMask.x && TestCC(cc.c***)) { |
| merged.x = result.x; |
| if (updatecc) mergedCC.x = GenerateCC(result.x); |
| } |
| if (instrMask.y && TestCC(cc.*c**)) { |
| merged.y = result.y; |
| if (updatecc) mergedCC.y = GenerateCC(result.y); |
| } |
| if (instrMask.z && TestCC(cc.**c*)) { |
| merged.z = result.z; |
| if (updatecc) mergedCC.z = GenerateCC(result.z); |
| } |
| if (instrMask.w && TestCC(cc.***c)) { |
| merged.w = result.w; |
| if (updatecc) mergedCC.w = GenerateCC(result.w); |
| } |
| |
| // Write out the new destination register and condition code. |
| destination = merged; |
| cc = mergedCC; |
| } |
| |
| While this rule describes floating-point results, the same logic |
| applies to the integer results generated by the ARA, ARL, and ARR |
| instructions. |
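
The destination update rule can be exercised against the three-instruction MOVC example given earlier in this section. The C sketch below is a simplified model, not the implementation: all names are hypothetical, only one condition code register is modeled, operand swizzles are applied by hand, and the condition code write mask is computed by the caller before the update (here only the unswizzled "NE" rule is provided).

```c
#include <math.h>
#include <stdbool.h>

typedef enum { CC_EQ, CC_GT, CC_LT, CC_UN } CondCode;
typedef struct { float v[4]; } Vec4;
typedef struct { CondCode c[4]; } CCVec;

static CondCode generate_cc(float value)
{
    if (isnan(value)) return CC_UN;
    if (value < 0.0f) return CC_LT;
    if (value == 0.0f) return CC_EQ;   /* covers both -0.0 and +0.0 */
    return CC_GT;
}

/* Condition code write mask for an unswizzled "(NE)" test; bit 0 is x. */
static unsigned ne_mask(const CCVec *cc)
{
    unsigned mask = 0;
    for (int i = 0; i < 4; i++) {
        if (cc->c[i] != CC_EQ) mask |= 1u << i;
    }
    return mask;
}

/* Write result into dst where both masks enable the component, and update
 * the matching condition code fields when update_cc is set. */
static void update_destination(Vec4 *dst, CCVec *cc, Vec4 result,
                               unsigned write_mask, unsigned cc_mask,
                               bool saturate, bool update_cc)
{
    for (int i = 0; i < 4; i++) {
        float r = result.v[i];
        if (saturate) {
            if (r < 0.0f) r = 0.0f;
            else if (r > 1.0f) r = 1.0f;
        }
        if ((write_mask & cc_mask) & (1u << i)) {
            dst->v[i] = r;
            if (update_cc) cc->c[i] = generate_cc(r);
        }
    }
}

/* Replays the example with R1 = (-2, 0, 2, NaN). */
static void run_movc_example(Vec4 *r0, CCVec *cc)
{
    const float r1[4] = { -2.0f, 0.0f, 2.0f, NAN };
    Vec4 src;

    /* MOVC R0, R1; */
    src = (Vec4){ { r1[0], r1[1], r1[2], r1[3] } };
    update_destination(r0, cc, src, 0xFu, 0xFu, false, true);

    /* MOVC R0.xyz, R1.yzwx; */
    src = (Vec4){ { r1[1], r1[2], r1[3], r1[0] } };
    update_destination(r0, cc, src, 0x7u, 0xFu, false, true);

    /* MOVC R0 (NE), R1.zywx; */
    src = (Vec4){ { r1[2], r1[1], r1[3], r1[0] } };
    update_destination(r0, cc, src, 0xFu, ne_mask(cc), false, true);
}
```

Starting from a condition code of all EQ, the sketch finishes with R0 = (0, 0, NaN, -2) and a condition code of (EQ,EQ,UN,LT), matching the walkthrough above.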
| |
| Add to Section 2.14.4.5, Vertex Program Options |
| |
| Section 2.14.4.5.3, NV_vertex_program3 Program Option |
| |
| If a vertex program specifies the "NV_vertex_program3" option, the |
| ARB_vertex_program grammar and execution environment are extended |
| to take advantage of all the features of the "NV_vertex_program2" |
| option, plus the following features: |
| |
| * several new instructions: |
| |
| * POPA -- pop address register off stack |
| * PUSHA -- push address register onto stack |
| * TEX -- texture lookup |
| * TXB -- texture lookup w/LOD bias |
| * TXL -- texture lookup w/explicit LOD |
| * TXP -- projective texture lookup |
| |
| * address register-relative addressing for vertex texture |
| coordinate and generic attribute arrays, |
| |
| * address register-relative addressing for vertex texture |
| coordinate result array, and |
| |
| * a second four-component condition code. |
| |
| |
| Modify Section 2.14.5.34, RET: Subroutine Call Return |
| |
| The RET instruction conditionally returns from a subroutine initiated |
| by a CAL instruction by popping an instruction reference off the |
| top of the call stack and transferring control to the referenced |
| instruction. The following pseudocode describes the operation of |
| the instruction: |
| |
| if (TestCC(cc.c***) || TestCC(cc.*c**) || |
| TestCC(cc.**c*) || TestCC(cc.***c)) { |
| if (callStackDepth <= 0) { |
| // terminate vertex program normally |
| } else { |
| callStackDepth--; |
| if (callStack[callStackDepth] is an instruction reference) { |
| instruction = callStack[callStackDepth]; |
| } else { |
| // terminate vertex program abnormally |
| } |
| } |
| |
| // continue execution at <instruction> |
| } else { |
| // do nothing |
| } |
| |
| In the pseudocode, <callStackDepth> is the depth of the call stack, |
| <callStack> is an array holding the call stack, and <instruction> is |
| a reference to an instruction previously pushed onto the call stack. |
| |
| If the call stack is empty when RET executes, the vertex program |
| terminates normally. |
| |
| The vertex program terminates abnormally if the entry at the top of the |
| call stack is not an instruction reference pushed by CAL. When a vertex |
| program terminates abnormally, all of the vertex program results are |
| undefined. |
| |
| Add to Section 2.14.5, Vertex Program Instruction Set |
| |
| Section 2.14.5.43, POPA: Pop Address Register Stack |
| |
| The POPA instruction generates an integer result vector by popping |
| an entry off of the call stack. |
| |
| if (callStackDepth <= 0) { |
| terminate vertex program; |
| } else { |
| callStackDepth--; |
| if (callStack[callStackDepth] is an address register) { |
| iresult = callStack[callStackDepth]; |
| } else { |
| terminate vertex program; |
| } |
| } |
| |
| POPA does not support non-default write masks; a program will fail to load |
| if it includes a component write mask other than ".xyzw" or a condition |
| code write mask test other than "TR". |
| |
| In the pseudocode, <callStackDepth> is the current depth of the call |
| stack and <callStack> is an array holding the call stack. |
| |
| The vertex program terminates abnormally if it executes a POPA instruction |
| when the call stack is empty, or when the entry at the top of the call |
| stack is not an address register pushed by PUSHA. When a vertex program |
| terminates abnormally, all of the vertex program results are undefined. |
| |
| Section 2.14.5.44, PUSHA: Push Address Register Stack |
| |
| The PUSHA instruction pushes the address register operand onto the |
| call stack, which is also used for subroutine calls. The PUSHA |
| instruction does not generate a result vector. |
| |
| tmp = AddrVectorLoad(op0); |
| if (callStackDepth >= MAX_PROGRAM_CALL_DEPTH_NV) { |
| terminate vertex program; |
| } else { |
| callStack[callStackDepth] = tmp; |
| callStackDepth++; |
| } |
| |
| In the pseudocode, <callStackDepth> is the current depth of the call |
| stack and <callStack> is an array holding the call stack. |
| |
| The vertex program terminates abnormally if it executes a PUSHA |
| instruction when the call stack is full. When a vertex program terminates |
| abnormally, all of the vertex program results are undefined. |
| |
| Component swizzling is not supported when the operand is loaded. |
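
Because PUSHA and POPA share the call stack with CAL and RET, each entry effectively carries a kind. The C sketch below models the termination rules described for RET, POPA, and PUSHA. It rests on several assumptions: all names are hypothetical, MAX_CALL_DEPTH is an arbitrary stand-in for MAX_PROGRAM_CALL_DEPTH_NV, and the CAL overflow behavior is assumed to mirror PUSHA, since CAL is not defined in this text.

```c
enum { MAX_CALL_DEPTH = 4 };  /* stand-in for MAX_PROGRAM_CALL_DEPTH_NV */

typedef enum { ENTRY_INSTRUCTION, ENTRY_ADDRESS } EntryKind;
typedef enum { RUN_OK, RUN_END_NORMAL, RUN_END_ABNORMAL } RunStatus;

typedef struct {
    EntryKind kind;
    int instruction;   /* valid when kind == ENTRY_INSTRUCTION */
    int address[4];    /* valid when kind == ENTRY_ADDRESS */
} StackEntry;

typedef struct {
    StackEntry entries[MAX_CALL_DEPTH];
    int depth;
} CallStack;

/* PUSHA: abnormal termination when the stack is full. */
static RunStatus do_pusha(CallStack *s, const int addr[4])
{
    if (s->depth >= MAX_CALL_DEPTH) return RUN_END_ABNORMAL;
    StackEntry *e = &s->entries[s->depth++];
    e->kind = ENTRY_ADDRESS;
    for (int i = 0; i < 4; i++) e->address[i] = addr[i];
    return RUN_OK;
}

/* POPA: abnormal termination on an empty stack or a non-address entry. */
static RunStatus do_popa(CallStack *s, int addr[4])
{
    if (s->depth <= 0) return RUN_END_ABNORMAL;
    const StackEntry *e = &s->entries[--s->depth];
    if (e->kind != ENTRY_ADDRESS) return RUN_END_ABNORMAL;
    for (int i = 0; i < 4; i++) addr[i] = e->address[i];
    return RUN_OK;
}

/* CAL: overflow handling is an assumption modeled on PUSHA. */
static RunStatus do_cal(CallStack *s, int return_instruction)
{
    if (s->depth >= MAX_CALL_DEPTH) return RUN_END_ABNORMAL;
    StackEntry *e = &s->entries[s->depth++];
    e->kind = ENTRY_INSTRUCTION;
    e->instruction = return_instruction;
    return RUN_OK;
}

/* RET (condition already passed): normal termination on an empty stack,
 * abnormal termination when the top entry was pushed by PUSHA. */
static RunStatus do_ret(CallStack *s, int *next_instruction)
{
    if (s->depth <= 0) return RUN_END_NORMAL;
    const StackEntry *e = &s->entries[--s->depth];
    if (e->kind != ENTRY_INSTRUCTION) return RUN_END_ABNORMAL;
    *next_instruction = e->instruction;
    return RUN_OK;
}
```

Note the asymmetry on an empty stack: do_ret reports normal termination, while do_popa reports abnormal termination.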
| |
| Section 2.14.5.45, TEX: Texture Lookup |
| |
| The TEX instruction uses the single vector operand to perform a |
| lookup in the specified texture map, yielding a 4-component result |
| vector containing filtered texel values. The (s,t,r,q) coordinates |
| used for the texture lookup are (x,y,z,1), where x, y, and z are |
| components of the vector operand. |
| |
| tmp = VectorLoad(op0); |
| result = TextureSample(tmp.x, tmp.y, tmp.z, 1.0, 0.0, unit, target); |
| |
| where <unit> and <target> are the texture image unit number and |
| target type, matching the <texImageUnitNum> and <texTargetType> |
| grammar rules. |
| |
| The resulting sample is mapped to RGBA as described in Table 3.21, |
| and the R, G, B, and A values are written to the x, y, z, and w |
| components, respectively, of the result vector. |
| |
| Since partial derivatives of the texture coordinates are not defined, |
| the base LOD value for vertex texture lookups is defined to be |
| zero. The value of lambda' used in equation 3.16 will be simply |
| clamp(texobj_bias + texunit_bias). |
| |
| Section 2.14.5.46, TXB: Texture Lookup (With LOD Bias) |
| |
| The TXB instruction uses the single vector operand to perform a |
| lookup in the specified texture map, yielding a 4-component result |
| vector containing filtered texel values. The (s,t,r,q) coordinates |
| used for the texture lookup are (x,y,z,1), where x, y, and z are |
| components of the vector operand. The w component of the operand |
| is used as an additional LOD bias. |
| |
| tmp = VectorLoad(op0); |
| result = TextureSample(tmp.x, tmp.y, tmp.z, 1.0, tmp.w, unit, target); |
| |
| where <unit> and <target> are the texture image unit number and |
| target type, matching the <texImageUnitNum> and <texTargetType> |
| grammar rules. |
| |
| The resulting sample is mapped to RGBA as described in Table 3.21, |
| and the R, G, B, and A values are written to the x, y, z, and w |
| components, respectively, of the result vector. |
| |
| Since partial derivatives of the texture coordinates are not defined, |
| the base LOD value for vertex texture lookups is defined to be |
| zero. The value of lambda' used in equation 3.16 will be simply |
| clamp(texobj_bias + texunit_bias + tmp.w). |
| |
| Since the base LOD value is zero, the TXB instruction is completely |
| equivalent to the TXL instruction, where the w component contains |
| an explicit base LOD value. |
| |
| Section 2.14.5.47, TXL: Texture Lookup (With Explicit LOD) |
| |
| The TXL instruction uses the single vector operand to perform a |
| lookup in the specified texture map, yielding a 4-component result |
| vector containing filtered texel values. The (s,t,r,q) coordinates |
| used for the texture lookup are (x,y,z,1), where x, y, and z are |
| components of the vector operand. The w component of the operand |
| is used as the base LOD for the texture lookup. |
| |
| tmp = VectorLoad(op0); |
| result = TextureSampleLOD(tmp.x, tmp.y, tmp.z, 1.0, tmp.w, unit, target); |
| |
| where <unit> and <target> are the texture image unit number and |
| target type, matching the <texImageUnitNum> and <texTargetType> |
| grammar rules. |
| |
| The resulting sample is mapped to RGBA as described in Table 3.21, |
| and the R, G, B, and A values are written to the x, y, z, and w |
| components, respectively, of the result vector. |
| |
| The value of lambda' used in equation 3.16 will be simply tmp.w + |
| clamp(texobj_bias + texunit_bias), where tmp.w is the base LOD. |
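
The lambda' expressions given for TEX, TXB, and TXL can be compared numerically. The following C sketch is illustrative only. It assumes that clamp() limits the summed bias to [-max_bias, +max_bias], where max_bias stands in for the implementation's maximum texture LOD bias; that range and the function names are assumptions, not part of this text.

```c
/* Clamp a summed LOD bias to [-max_bias, +max_bias] (assumed range). */
static float clamp_bias(float bias, float max_bias)
{
    if (bias < -max_bias) return -max_bias;
    if (bias > max_bias) return max_bias;
    return bias;
}

/* TEX (and TXP): base LOD is zero, only the two state biases apply. */
static float lambda_tex(float texobj_bias, float texunit_bias, float max_bias)
{
    return clamp_bias(texobj_bias + texunit_bias, max_bias);
}

/* TXB: the operand's w component joins the bias sum inside the clamp. */
static float lambda_txb(float texobj_bias, float texunit_bias, float w,
                        float max_bias)
{
    return clamp_bias(texobj_bias + texunit_bias + w, max_bias);
}

/* TXL: the operand's w component is the base LOD, outside the clamp. */
static float lambda_txl(float texobj_bias, float texunit_bias, float w,
                        float max_bias)
{
    return w + clamp_bias(texobj_bias + texunit_bias, max_bias);
}
```

With biases of 0.5 and 0.25, a w of 1.0, and a max_bias of 2.0, both lambda_txb and lambda_txl give 1.75. As the formulas are written, the two differ only when the clamp is active: with w = 3.0 the same inputs give 2.0 for lambda_txb and 3.75 for lambda_txl.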
| |
| Section 2.14.5.48, TXP: Texture Lookup (Projective) |
| |
| The TXP instruction uses the single vector operand to perform a |
| lookup in the specified texture map, yielding a 4-component result |
| vector containing filtered texel values. The (s,t,r,q) coordinates |
| used for the texture lookup are (x,y,z,w), where x, y, z, and w are |
| the four components of the vector operand. |
| |
| tmp = VectorLoad(op0); |
| result = TextureSample(tmp.x, tmp.y, tmp.z, tmp.w, 0.0, unit, target); |
| |
| where <unit> and <target> are the texture image unit number and |
| target type, matching the <texImageUnitNum> and <texTargetType> |
| grammar rules. |
| |
| The resulting sample is mapped to RGBA as described in Table 3.21, |
| and the R, G, B, and A values are written to the x, y, z, and w |
| components, respectively, of the result vector. |
| |
| Since partial derivatives of the texture coordinates are not defined, |
| the base LOD value for vertex texture lookups is defined to be |
| zero. The value of lambda' used in equation 3.16 will be simply |
| clamp(texobj_bias + texunit_bias). |
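
TXP differs from TEX only in the q coordinate passed to the lookup. The C sketch below shows the homogeneous divide that a projective lookup is commonly understood to perform on (s,t,r,q). That divide is an assumption drawn from the base OpenGL texturing rules rather than from this text, and the type and function names are hypothetical.

```c
typedef struct { float s, t, r; } ProjectedCoord;

/* Divide the first three coordinates by q. With q = 1, as used by TEX,
 * TXB, and TXL, the coordinates pass through unchanged. */
static ProjectedCoord project_coords(float s, float t, float r, float q)
{
    ProjectedCoord p = { s / q, t / q, r / q };
    return p;
}
```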
| |
| Additions to Chapter 3 of the OpenGL 1.4 Specification (Rasterization) |
| |
| None. |
| |
| Additions to Chapter 4 of the OpenGL 1.4 Specification (Per-Fragment |
| Operations and the Frame Buffer) |
| |
| None. |
| |
| Additions to Chapter 5 of the OpenGL 1.4 Specification (Special Functions) |
| |
| None. |
| |
| Additions to Chapter 6 of the OpenGL 1.4 Specification (State and |
| State Requests) |
| |
| None. |
| |
| Additions to Appendix A of the OpenGL 1.4 Specification (Invariance) |
| |
| None. |
| |
| Additions to the AGL/GLX/WGL Specifications |
| |
| None. |
| |
| Dependencies on ARB_vertex_program |
| |
| ARB_vertex_program is required. |
| |
| This specification and NV_vertex_program2_option are based on a |
| modified version of the grammar published in the ARB_vertex_program |
| specification. This modified grammar includes a few structural |
| changes to better accommodate new functionality from this and |
| other extensions, but should be functionally equivalent to the |
| ARB_vertex_program grammar. See NV_vertex_program2_option for |
| details on the base grammar. |
| |
| Dependencies on NV_vertex_program2_option |
| |
| NV_vertex_program2_option is required. |
| |
| If the NV_vertex_program3 program option is specified, all |
| the functionality described in both this extension and the |
| NV_vertex_program2_option specification is available. |
| |
| Dependencies on ARB_fragment_program_shadow |
| |
| If this extension and ARB_fragment_program_shadow are both supported, |
| vertex programs may include the option statement: |
| |
| OPTION ARB_fragment_program_shadow; |
| |
| which enables the use of SHADOW1D, SHADOW2D, and SHADOWRECT texture |
| targets in texture lookup instructions, as described in the |
| ARB_fragment_program_shadow specification. |
| |
| NVIDIA NOTE: Drivers prior to September 2006 do not support the use of |
| this option, and will not accept texture lookups with SHADOW1D, SHADOW2D, |
| and SHADOWRECT targets. Shadow mapping in vertex programs will result in |
| software fallbacks on GeForce 6 and GeForce 7 series GPUs, but may be done |
| in hardware on future GPUs. |
| |
| Errors |
| |
| None. |
| |
| New State |
| |
| None. |
| |
| New Implementation Dependent State: |
| |
| Minimum |
| Get Value Type Get Command Value Description Section Attr. |
| --------- ---- ----------- ------- -------------------------- -------- ----- |
| MAX_VERTEX_TEXTURE_ Z+ GetIntegerv 1 Number of separate texture 2.14.3.7 - |
| IMAGE_UNITS_ARB image units that can be |
| accessed by a vertex program |
| |
| Revision History |
| |
| Rev. Date Author Changes |
| ---- -------- -------- -------------------------------------------- |
| 7 10/12/09 pbrown Update grammar/documentation of PUSHA/POPA to |
| reflect the implementation. <instResultAddr> is |
| used for POPA with some semantic checks. Note |
| that some driver versions erroneously allowed |
| conditional write masks on POPA. Also clarify |
| that ARB_fragment_program_shadow includes |
| support for "SHADOWRECT". |
| |
| 6 09/27/06 pbrown Document that ARB_fragment_program_shadow is |
| allowed, to enable the use of "SHADOW1D" and |
| "SHADOW2D" targets for texture lookups. |
| |
| 5 11/07/05 pbrown Fix PUSHA documentation to specify the right |
| constant name used for overflow testing. |
| |
| 4 09/01/05 pbrown Fix spec language to document that a vertex |
| program will fail to compile if it uses "too |
| many" textures -- previously only documented |
| in the issues section. |
| |
| 3 08/25/05 pbrown Document that using a different texture target |
| than fragment processing on the same texture |
| unit results in an INVALID_OPERATION error at |
| Begin time. This is consistent with GLSL |
| language in the ARB_shader_objects and OpenGL |
| 2.0 specifications. The implementation has |
| always done this, but it was overlooked in |
| the spec language. |
| |
| 2 06/23/04 pbrown Documented that vertex results are undefined |
| when a vertex program terminates abnormally |
| (e.g., PUSHA/POPA stack overflow/underflow). |
| Documented error in RET if the top of the call |
| stack contains a value written by PUSHA. |
| |
| 1 -------- pbrown Initial pre-release revisions. |
| |