| Name |
| |
| NV_vertex_program1_1 |
| |
| Name Strings |
| |
| GL_NV_vertex_program1_1 |
| |
| Contact |
| |
| Mark J. Kilgard, NVIDIA Corporation (mjk 'at' nvidia.com) |
| |
| Contributors |
| |
| Pat Brown |
| Erik Lindholm |
| Steve Glanville |
| Erik Faye-Lund |
| |
| Notice |
| |
| Copyright NVIDIA Corporation, 2001, 2002. |
| |
| IP Status |
| |
| NVIDIA Proprietary. |
| |
| Status |
| |
| Version 1.0 |
| |
| Version |
| |
| NVIDIA Date: March 4, 2014 |
| Version: 8 |
| |
| Number |
| |
| 266 |
| |
| Dependencies |
| |
| Written based on the wording of the OpenGL 1.2.1 specification and |
| requires OpenGL 1.2.1. |
| |
| Assumes support for the NV_vertex_program extension. |
| |
| Overview |
| |
| This extension adds four new vertex program instructions (DPH, |
| RCC, SUB, and ABS). |
| |
| This extension also supports a position-invariant vertex program |
| option. A vertex program is position-invariant when it generates |
| the _exact_ same homogeneous position and window space position
| for a vertex as conventional OpenGL transformation (ignoring vertex |
| blending and weighting). |
| |
| By default, vertex programs are _not_ guaranteed to be |
| position-invariant because there is no guarantee made that the way |
| a vertex program might compute its homogeneous position is exactly
| identical to the way conventional OpenGL transformation computes
| its homogeneous positions. In a position-invariant vertex program,
| the homogeneous position (HPOS) is not output by the program.
| Instead, the OpenGL implementation is expected to compute the HPOS
| for position-invariant vertex programs in a manner exactly identical
| to how the homogeneous position and window position are computed
| for a vertex by conventional OpenGL transformation. In this way |
| position-invariant vertex programs guarantee correct multi-pass |
| rendering semantics in cases where multiple passes are rendered and |
| the second and subsequent passes use a GL_EQUAL depth test. |
| |
| Issues |
| |
| How should options to the vertex program semantics be handled? |
| |
| RESOLUTION: A VP1.1 vertex program can contain a sequence |
| of options. This extension provides a single option |
| ("NV_position_invariant"). Specifying an option changes the |
| way the program's subsequent instruction sequence is parsed,
| may add new semantic checks, and modifies the semantics by which |
| the vertex program is executed. |
| |
| Should this extension provide SUB and ABS instructions even though |
| the functionality can be accomplished with ADD and MAX? |
| |
| RESOLUTION: Yes. Although SUB and ABS provide no functionality that
| could not be accomplished in VP1.0 with ADD and MAX idioms, they
| make vertex programs more understandable.
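| 
| As a hedged illustration of the idioms mentioned above, the following
| C sketch models SUB as an ADD with a negated second operand and ABS
| as a MAX of the operand and its negation. The vec4 type and all
| helper names are hypothetical and exist only for this example; they
| are not part of the extension.
| 
```c
/* Hypothetical 4-component register type used only for this sketch. */
typedef struct { float x, y, z, w; } vec4;

/* Models the per-operand negate modifier ("-" optionalSign). */
static vec4 vp_neg(vec4 a) {
    vec4 r = { -a.x, -a.y, -a.z, -a.w };
    return r;
}

/* Models the VP1.0 ADD instruction (component-wise add). */
static vec4 vp_add(vec4 a, vec4 b) {
    vec4 r = { a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w };
    return r;
}

/* Models the VP1.0 MAX instruction (component-wise maximum). */
static vec4 vp_max(vec4 a, vec4 b) {
    vec4 r = { a.x > b.x ? a.x : b.x, a.y > b.y ? a.y : b.y,
               a.z > b.z ? a.z : b.z, a.w > b.w ? a.w : b.w };
    return r;
}

/* "SUB R, A, B" expressed as the idiom "ADD R, A, -B". */
static vec4 sub_idiom(vec4 a, vec4 b) { return vp_add(a, vp_neg(b)); }

/* "ABS R, A" expressed as the idiom "MAX R, A, -A". */
static vec4 abs_idiom(vec4 a) { return vp_max(a, vp_neg(a)); }
```
| 
| The idioms give the same results as the dedicated instructions; the
| VP1.1 opcodes only make the intent of the program text easier to read.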
| |
| Should the optionalSign in a VP1.1 accept both "-" and "+"? |
| |
| RESOLUTION: Yes. The "+" does not negate its operand but is |
| available for symmetry.
| |
| Is relative addressing available to position-invariant version 1.1 |
| vertex programs? |
| |
| RESOLUTION: No. This reflects a hardware restriction. |
| |
| Should something be said about the relative performance of |
| position-invariant vertex programs and conventional vertex programs? |
| |
| RESOLUTION: For architectural reasons, position-invariant vertex |
| programs may be _slightly_ faster than conventional vertex programs. |
| This is true in the GeForce3 architecture. If your vertex program |
| transforms the object-space position to clip-space with four DP4 |
| instructions using the tracked GL_MODELVIEW_PROJECTION_NV matrix, |
| consider using position-invariant vertex programs. Do not expect a |
| measurable performance improvement unless vertex program processing |
| is your bottleneck and your vertex program is relatively short. |
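| 
| To make the comparison concrete, the sketch below holds two program
| strings as C constants: a conventional program that writes HPOS with
| four DP4 instructions, and a position-invariant counterpart that
| omits them. It assumes the tracked modelview-projection matrix
| occupies program parameters c[0] through c[3]; that register choice,
| the constant names, and the writes_hpos helper are illustrative and
| not mandated by this extension.
| 
```c
#include <string.h>

/* Conventional VP1.1 program: transforms the position explicitly.
   Assumes c[0]..c[3] track GL_MODELVIEW_PROJECTION_NV (illustrative). */
static const char conventional_vp[] =
    "!!VP1.1\n"
    "DP4 o[HPOS].x, c[0], v[OPOS];\n"
    "DP4 o[HPOS].y, c[1], v[OPOS];\n"
    "DP4 o[HPOS].z, c[2], v[OPOS];\n"
    "DP4 o[HPOS].w, c[3], v[OPOS];\n"
    "MOV o[COL0], v[COL0];\n"
    "END";

/* Position-invariant VP1.1 program: the implementation computes HPOS,
   so the four DP4 instructions are dropped. */
static const char position_invariant_vp[] =
    "!!VP1.1\n"
    "OPTION NV_position_invariant;\n"
    "MOV o[COL0], v[COL0];\n"
    "END";

/* Hypothetical helper: nonzero if the program text mentions HPOS. */
static int writes_hpos(const char *program) {
    return strstr(program, "HPOS") != NULL;
}
```
| 
| Either string would be handed to LoadProgramNV with the
| VERTEX_PROGRAM_NV target; only the first one spends instructions on
| the position transform.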
| |
| Should position-invariant vertex programs have a lower limit on the |
| maximum instructions? |
| |
| RESOLUTION: Yes. The driver takes care to transform the position
| with the same instructions that conventional transformation uses,
| and this consumes a few vertex program instructions.
| |
| New Procedures and Functions |
| |
| None. |
| |
| New Tokens |
| |
| None. |
| |
| Additions to Chapter 2 of the OpenGL 1.2.1 Specification (OpenGL Operation) |
| |
| 2.14.1.9 Vertex Program Register Accesses |
| |
| Replace the first two sentences and update Table X.4: |
| |
| "There are 21 vertex program instructions. The instructions and their |
| respective input and output parameters are summarized in Table X.4." |
| |
|         Inputs              Output (vector or
| Opcode  (scalar or vector)  replicated scalar)  Operation
| ------  ------------------  ------------------  --------------------------
| ARL     s                   address register    address register load
| MOV     v                   v                   move
| MUL     v,v                 v                   multiply
| ADD     v,v                 v                   add
| MAD     v,v,v               v                   multiply and add
| RCP     s                   ssss                reciprocal
| RSQ     s                   ssss                reciprocal square root
| DP3     v,v                 ssss                3-component dot product
| DP4     v,v                 ssss                4-component dot product
| DST     v,v                 v                   distance vector
| MIN     v,v                 v                   minimum
| MAX     v,v                 v                   maximum
| SLT     v,v                 v                   set on less than
| SGE     v,v                 v                   set on greater equal than
| EXP     s                   v                   exponential base 2
| LOG     s                   v                   logarithm base 2
| LIT     v                   v                   light coefficients
| DPH     v,v                 ssss                homogeneous dot product
| RCC     s                   ssss                reciprocal clamped
| SUB     v,v                 v                   subtract
| ABS     v                   v                   absolute value
| |
| Table X.4: Summary of vertex program instructions. "v" indicates a |
| vector input or output, "s" indicates a scalar input, and "ssss" indicates |
| a scalar output replicated across a 4-component vector. |
| |
| Add four new sections describing the DPH, RCC, SUB, and ABS |
| instructions. |
| |
| "2.14.1.10.18 DPH: Homogeneous Dot Product |
| |
| The DPH instruction computes the four-component dot product of the
| two source vectors, with the W component of the first source vector
| assumed to be 1.0, and assigns the result to the destination register.
| |
| t.x = source0.c***; |
| t.y = source0.*c**; |
| t.z = source0.**c*; |
| if (negate0) { |
| t.x = -t.x; |
| t.y = -t.y; |
| t.z = -t.z; |
| } |
| u.x = source1.c***; |
| u.y = source1.*c**; |
| u.z = source1.**c*; |
| u.w = source1.***c; |
| if (negate1) { |
| u.x = -u.x; |
| u.y = -u.y; |
| u.z = -u.z; |
| u.w = -u.w; |
| } |
| v.x = t.x * u.x + t.y * u.y + t.z * u.z + u.w; |
| if (xmask) destination.x = v.x; |
| if (ymask) destination.y = v.x; |
| if (zmask) destination.z = v.x; |
| if (wmask) destination.w = v.x; |
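| 
| The arithmetic core of the pseudocode above can be sketched in C as
| follows. This is a minimal illustration only: the vec4 type and the
| dph function name are hypothetical, and swizzle, negate, and write
| mask handling are deliberately left out.
| 
```c
/* Hypothetical 4-component register type used only for this sketch. */
typedef struct { float x, y, z, w; } vec4;

/* Homogeneous dot product: a.xyz . b.xyz + b.w.
   The W component of the first source (a.w) is ignored, which is the
   same as treating it as 1.0. */
static float dph(vec4 a, vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + b.w;
}
```
| 
| For example, with a = (1, 2, 3, 99) and b = (4, 5, 6, 7) the result
| is 4 + 10 + 18 + 7 = 39; the 99 in a.w has no effect.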
| |
| 2.14.1.10.19 RCC: Reciprocal Clamped |
| |
| The RCC instruction inverts the value of the source scalar, clamps |
| the result as described below, and stores the clamped result into |
| the destination register. The reciprocal of exactly 1.0 must be |
| exactly 1.0. |
| |
| Additionally (before clamping) the reciprocal of negative infinity |
| gives [-0.0, -0.0, -0.0, -0.0]; the reciprocal of negative zero gives |
| [-Inf, -Inf, -Inf, -Inf]; the reciprocal of positive zero gives |
| [+Inf, +Inf, +Inf, +Inf]; and the reciprocal of positive infinity |
| gives [0.0, 0.0, 0.0, 0.0]. |
| |
| t.x = source0.c; |
| if (negate0) { |
| t.x = -t.x; |
| } |
| if (t.x == 1.0f) { |
| u.x = 1.0f; |
| } else { |
| u.x = 1.0f / t.x; |
| } |
| if (Positive(u.x)) { |
| if (u.x > 1.84467e+019) { |
| u.x = 1.84467e+019; // the IEEE 32-bit binary value 0x5F800000 |
| } else if (u.x < 5.42101e-020) { |
| u.x = 5.42101e-020; // the IEEE 32-bit binary value 0x1F800000
| } |
| } else { |
| if (u.x < -1.84467e+019) { |
| u.x = -1.84467e+019; // the IEEE 32-bit binary value 0xDF800000 |
| } else if (u.x > -5.42101e-020) { |
| u.x = -5.42101e-020; // the IEEE 32-bit binary value 0x9F800000 |
| } |
| } |
| if (xmask) destination.x = u.x; |
| if (ymask) destination.y = u.x; |
| if (zmask) destination.z = u.x; |
| if (wmask) destination.w = u.x; |
| |
| where Positive(x) is true for +0 and other positive values and false |
| for -0 and other negative values; and |
| |
| | u.x - IEEE(1.0f/t.x) | < 1.0f/(2^22) |
| |
| for 1.0f <= t.x <= 2.0f. The intent of this precision requirement is |
| that this amount of relative precision apply over all values of t.x." |
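| 
| The clamping behavior can be sketched in C as shown below. The sketch
| assumes IEEE 754 single-precision float arithmetic (so that division
| by zero yields an infinity) and writes the clamp limits as the exact
| powers of two that correspond to the bit patterns 0x5F800000 and
| 0x1F800000. The rcc function name is hypothetical, and the negate
| modifier, write masks, and the precision requirement are not modeled.
| 
```c
#include <math.h>

/* Reciprocal clamped, following the RCC pseudocode.
   Assumes IEEE 754 float semantics; NaN inputs are not addressed. */
static float rcc(float t) {
    const float hi = 0x1p64f;   /* 2^64,  about 1.84467e+19 (0x5F800000) */
    const float lo = 0x1p-64f;  /* 2^-64, about 5.42101e-20 (0x1F800000) */

    /* The reciprocal of exactly 1.0 must be exactly 1.0. */
    float u = (t == 1.0f) ? 1.0f : 1.0f / t;

    if (!signbit(u)) {          /* Positive(u): +0 and positive values */
        if (u > hi) u = hi;
        else if (u < lo) u = lo;
    } else {                    /* -0 and negative values */
        if (u < -hi) u = -hi;
        else if (u > -lo) u = -lo;
    }
    return u;
}
```
| 
| Using signbit rather than a comparison against zero keeps -0 on the
| negative branch, matching the definition of Positive(x) above. As a
| result, rcc never returns zero or an infinity for non-NaN inputs.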
| |
| 2.14.1.10.20 SUB: Subtract |
| |
| The SUB instruction subtracts the second source vector from the
| first source vector, component-wise, and stores the result into the
| destination register.
| |
| t.x = source0.c***; |
| t.y = source0.*c**; |
| t.z = source0.**c*; |
| t.w = source0.***c; |
| if (negate0) { |
| t.x = -t.x; |
| t.y = -t.y; |
| t.z = -t.z; |
| t.w = -t.w; |
| } |
| u.x = source1.c***; |
| u.y = source1.*c**; |
| u.z = source1.**c*; |
| u.w = source1.***c; |
| if (negate1) { |
| u.x = -u.x; |
| u.y = -u.y; |
| u.z = -u.z; |
| u.w = -u.w; |
| } |
| if (xmask) destination.x = t.x - u.x; |
| if (ymask) destination.y = t.y - u.y; |
| if (zmask) destination.z = t.z - u.z; |
| if (wmask) destination.w = t.w - u.w; |
| |
| 2.14.1.10.21 ABS: Absolute Value |
| |
| The ABS instruction assigns the component-wise absolute value of a |
| source vector into the destination register. |
| |
| t.x = source0.c***; |
| t.y = source0.*c**; |
| t.z = source0.**c*; |
| t.w = source0.***c; |
| if (xmask) destination.x = (t.x >= 0) ? t.x : -t.x; |
| if (ymask) destination.y = (t.y >= 0) ? t.y : -t.y; |
| if (zmask) destination.z = (t.z >= 0) ? t.z : -t.z; |
| if (wmask) destination.w = (t.w >= 0) ? t.w : -t.w; |
| |
| Insert sections 2.14.A and 2.14.B after section 2.14.4 |
| |
| "2.14.A Version 1.1 Vertex Programs |
| |
| Version 1.1 vertex programs provide support for the DPH, RCC, SUB, |
| and ABS instructions (see sections 2.14.1.10.18 through 2.14.1.10.21). |
| |
| Version 1.1 vertex programs are loaded with the LoadProgramNV command |
| (see section 2.14.1.7). The target must be VERTEX_PROGRAM_NV to |
| load a version 1.1 vertex program. The initial "!!VP1.1" token
| designates that the program should be parsed and treated as a
| version 1.1 vertex program.
| 
| Version 1.1 programs must conform to an expanded version of the
| grammar for vertex programs. The version 1.1 vertex program
| grammar for syntactically valid sequences is the same as the grammar
| defined in section 2.14.1.7 with the following modified rules:
| |
| <program> ::= "!!VP1.1" <optionSequence> <instructionSequence> "END" |
| |
| <optionSequence> ::= <optionSequence> <option> |
| | "" |
| |
| <option> ::= "OPTION" "NV_position_invariant" ";" |
| |
| <VECTORop> ::= "MOV" |
| | "LIT" |
| | "ABS" |
| |
| <SCALARop> ::= "RCP" |
| | "RSQ" |
| | "EXP" |
| | "LOG" |
| | "RCC" |
| |
| <BINop> ::= "MUL" |
| | "ADD" |
| | "DP3" |
| | "DP4" |
| | "DST" |
| | "MIN" |
| | "MAX" |
| | "SLT" |
| | "SGE" |
| | "DPH" |
| | "SUB" |
| |
| <optionalSign> ::= "-" |
| | "+" |
| | "" |
| |
| Except for supporting the additional DPH, RCC, SUB, and ABS |
| instructions, version 1.1 vertex programs with no options specified |
| otherwise behave in the same manner as version 1.0 vertex programs. |
| |
| 2.14.B Position-invariant Vertex Program Option |
| |
| By default, vertex programs are _not_ guaranteed to be |
| position-invariant because there is no guarantee made that the |
| way a vertex program might compute its homogeneous position is
| exactly identical to the way conventional OpenGL transformation
| computes its homogeneous positions. However, in a position-invariant
| vertex program, the homogeneous position (HPOS) is not output by
| the program. Instead, the OpenGL implementation is expected to
| compute the HPOS for position-invariant vertex programs in a manner
| exactly identical to how the homogeneous position and window position
| are computed for a vertex by conventional OpenGL transformation
| (assuming vertex weighting and vertex blending are disabled). In this
| way position-invariant vertex programs guarantee correct multi-pass
| rendering semantics in cases where multiple passes are rendered with
| conventional OpenGL transformation and position-invariant vertex
| programs and the second and subsequent passes use an EQUAL depth test.
| |
| If an <option> with the identifier "NV_position_invariant" is |
| encountered during the parsing of the program, the specified program |
| is presumed to be position-invariant. |
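| 
| As a rough illustration of how the <optionSequence> rule could be
| recognized, the C sketch below checks the "!!VP1.1" header and then
| consumes zero or more "OPTION NV_position_invariant ;" sequences.
| All function names are hypothetical, and the scanner is simplified:
| it treats tokens as plain prefixes and does not handle comments or
| validate the instruction sequence that follows.
| 
```c
#include <ctype.h>
#include <string.h>

/* Skips whitespace, then returns the position just past tok if the
   text starts with tok, or NULL otherwise. */
static const char *accept_tok(const char *p, const char *tok) {
    size_t n = strlen(tok);
    while (*p && isspace((unsigned char)*p)) p++;
    return strncmp(p, tok, n) == 0 ? p + n : NULL;
}

/* Returns 1 for a position-invariant VP1.1 program, 0 for a VP1.1
   program with no options, and -1 if the header is missing or an
   option is not recognized. */
static int vp11_is_position_invariant(const char *program) {
    int invariant = 0;
    const char *p;

    if (strncmp(program, "!!VP1.1", 7) != 0) return -1;
    p = program + 7;

    for (;;) {
        const char *q = accept_tok(p, "OPTION");
        if (!q) break;                    /* end of <optionSequence> */
        q = accept_tok(q, "NV_position_invariant");
        if (q) q = accept_tok(q, ";");
        if (!q) return -1;                /* unknown or malformed option */
        invariant = 1;
        p = q;
    }
    return invariant;
}
```
| 
| A loader built along these lines would use the returned flag to select
| the restricted grammar rules and instruction limit described in the
| rest of this section.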
| |
| When a position-invariant vertex program is specified, the |
| <vertexResultRegName> rule is replaced with the following rule |
| (that does not provide "HPOS"): |
| |
| <vertexResultRegName> ::= "COL0" |
| | "COL1" |
| | "BFC0" |
| | "BFC1" |
| | "FOGC" |
| | "PSIZ" |
| | "TEX0" |
| | "TEX1" |
| | "TEX2" |
| | "TEX3" |
| | "TEX4" |
| | "TEX5" |
| | "TEX6" |
| | "TEX7" |
| |
| While position-invariant version 1.1 vertex programs provide |
| position-invariance, such programs do not provide support for |
| relative program parameter addressing. The <relProgParamReg> rule |
| for version 1.1 position-invariant vertex programs is replaced by |
| (eliminating the relative addressing cases): |
| |
| <relProgParamReg> ::= "c" "[" <addrReg> "]" |
| |
| Note that while the ARL instruction is still available to |
| position-invariant version 1.1 vertex programs, it provides no |
| meaningful functionality without support for relative addressing. |
| |
| The semantic restriction for vertex program instruction length is |
| changed in the case of position-invariant vertex programs to the |
| following: A position-invariant vertex program fails to load if it |
| contains more than 124 instructions. |
| |
| " |
| |
| Additions to Chapter 4 of the OpenGL 1.2.1 Specification (Per-Fragment |
| Operations and the Framebuffer) |
| |
| None |
| |
| Additions to Chapter 5 of the OpenGL 1.2.1 Specification (Special Functions) |
| |
| None |
| |
| Additions to Chapter 6 of the OpenGL 1.2.1 Specification (State and |
| State Requests) |
| |
| None |
| |
| Additions to the AGL/GLX/WGL Specifications |
| |
| None |
| |
| GLX Protocol |
| |
| None |
| |
| Errors |
| |
| None |
| |
| New State |
| |
| None |
| |
| Revision History |
| |
| Rev.  Date      Author  Changes
| ----  --------  ------  ----------------------------------------
| 8     03/04/14  mjk     RCC decimal value corrections