
Name

NV_vertex_program1_1

Name Strings

GL_NV_vertex_program1_1

Contact

Mark J. Kilgard, NVIDIA Corporation (mjk 'at' nvidia.com)

Contributors

Pat Brown
Erik Lindholm
Steve Glanville
Erik Faye-Lund

Notice

Copyright NVIDIA Corporation, 2001, 2002.

IP Status

NVIDIA Proprietary.

Status

Version 1.0

Version

NVIDIA Date: March 4, 2014
Version: 8

Number

266

Dependencies

Written based on the wording of the OpenGL 1.2.1 specification and
requires OpenGL 1.2.1.

Assumes support for the NV_vertex_program extension.

Overview

This extension adds four new vertex program instructions (DPH,
RCC, SUB, and ABS).

This extension also supports a position-invariant vertex program
option. A vertex program is position-invariant when it generates
the _exact_ same homogeneous position and window space position
for a vertex as conventional OpenGL transformation (ignoring vertex
blending and weighting).

By default, vertex programs are _not_ guaranteed to be
position-invariant because there is no guarantee that the way
a vertex program computes its homogeneous position is exactly
identical to the way conventional OpenGL transformation computes
its homogeneous positions. In a position-invariant vertex program,
the homogeneous position (HPOS) is not output by the program.
Instead, the OpenGL implementation is expected to compute the HPOS
for position-invariant vertex programs in a manner exactly identical
to how the homogeneous position and window position are computed
for a vertex by conventional OpenGL transformation. In this way
position-invariant vertex programs guarantee correct multi-pass
rendering semantics in cases where multiple passes are rendered and
the second and subsequent passes use a GL_EQUAL depth test.
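To illustrate why exact invariance matters for a GL_EQUAL depth test, here is a minimal Python sketch. It is not part of the extension: the function names and input values are hypothetical, and Python doubles stand in for whatever arithmetic the hardware actually uses. It evaluates the same four-term dot product with two different accumulation orders, as two different transform paths might, and shows that the results need not be exactly equal.

```python
def dot4_left_to_right(row, v):
    """Accumulate the four products as ((p0 + p1) + p2) + p3."""
    p = [a * b for a, b in zip(row, v)]
    return ((p[0] + p[1]) + p[2]) + p[3]


def dot4_right_to_left(row, v):
    """Accumulate the same four products as p0 + (p1 + (p2 + p3))."""
    p = [a * b for a, b in zip(row, v)]
    return p[0] + (p[1] + (p[2] + p[3]))


def passes_equal_depth_test(incoming_depth, stored_depth):
    """Model of an EQUAL depth comparison: only exactly equal values pass."""
    return incoming_depth == stored_depth


# Hypothetical matrix row and object-space position, chosen to expose rounding.
row = [0.1, 0.2, 0.3, 0.0]
position = [1.0, 1.0, 1.0, 1.0]

first_pass = dot4_left_to_right(row, position)    # e.g. the conventional path
second_pass = dot4_right_to_left(row, position)   # e.g. a vertex program path

# The two results differ in the last bit, so the second pass would be rejected.
print(first_pass, second_pass, passes_equal_depth_test(second_pass, first_pass))
```

A position-invariant program sidesteps the problem because the implementation, not the program, computes the position in every pass.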

Issues

How should options to the vertex program semantics be handled?

RESOLUTION: A VP1.1 vertex program can contain a sequence
of options. This extension provides a single option
("NV_position_invariant"). Specifying an option changes the
way the program's subsequent instruction sequence is parsed,
may add new semantic checks, and modifies the semantics by which
the vertex program is executed.

Should this extension provide SUB and ABS instructions even though
the functionality can be accomplished with ADD and MAX?

RESOLUTION: Yes. Although SUB and ABS provide no functionality that
could not be accomplished in VP1.0 with ADD and MAX idioms, they
make vertex programs more understandable.
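The ADD and MAX idioms referred to here can be sketched in Python as follows. This is an illustrative model only: the helper names are hypothetical, vectors are plain four-element lists, and the assembly lines in the comments are examples of how the idioms might be written, not text taken from this specification.

```python
def add(a, b):
    """Component-wise ADD."""
    return [x + y for x, y in zip(a, b)]


def vmax(a, b):
    """Component-wise MAX."""
    return [max(x, y) for x, y in zip(a, b)]


def negate(a):
    """Source operand negation (the "-" prefix on an operand)."""
    return [-x for x in a]


def sub(a, b):
    """VP1.1 form: SUB R0, R1, R2;"""
    return [x - y for x, y in zip(a, b)]


def vabs(a):
    """VP1.1 form: ABS R0, R1;"""
    return [abs(x) for x in a]


r1 = [1.0, -2.0, 3.5, 0.0]
r2 = [4.0, 0.5, -1.5, 2.0]

sub_idiom = add(r1, negate(r2))    # VP1.0 idiom: ADD R0, R1, -R2;
abs_idiom = vmax(r1, negate(r1))   # VP1.0 idiom: MAX R0, R1, -R1;

print(sub_idiom == sub(r1, r2), abs_idiom == vabs(r1))
```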

Should the optionalSign in a VP1.1 program accept both "-" and "+"?

RESOLUTION: Yes. The "+" does not negate its operand but is
available for symmetry.

Is relative addressing available to position-invariant version 1.1
vertex programs?

RESOLUTION: No. This reflects a hardware restriction.

Should something be said about the relative performance of
position-invariant vertex programs and conventional vertex programs?

RESOLUTION: For architectural reasons, position-invariant vertex
programs may be _slightly_ faster than conventional vertex programs.
This is true in the GeForce3 architecture. If your vertex program
transforms the object-space position to clip-space with four DP4
instructions using the tracked GL_MODELVIEW_PROJECTION_NV matrix,
consider using position-invariant vertex programs. Do not expect a
measurable performance improvement unless vertex program processing
is your bottleneck and your vertex program is relatively short.
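For reference, the four-DP4 position transform mentioned above can be modeled with a short Python sketch. The register assignment in the comments (matrix rows tracked in c[0] through c[3]) is an assumption made for illustration, as are the function names; the specification does not prescribe where the matrix is tracked.

```python
def dp4(a, b):
    """Four-component dot product, as computed by one DP4 instruction."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def transform_position(mvp_rows, opos):
    """Model of the four instructions a conventional program would contain:

        DP4 o[HPOS].x, c[0], v[OPOS];
        DP4 o[HPOS].y, c[1], v[OPOS];
        DP4 o[HPOS].z, c[2], v[OPOS];
        DP4 o[HPOS].w, c[3], v[OPOS];

    (c[0]..c[3] is an assumed location for the tracked matrix rows.)
    """
    return [dp4(row, opos) for row in mvp_rows]


# Hypothetical matrix: a translation by (2, 3, 4).
mvp = [
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 1.0, 0.0, 3.0],
    [0.0, 0.0, 1.0, 4.0],
    [0.0, 0.0, 0.0, 1.0],
]
hpos = transform_position(mvp, [1.0, 1.0, 1.0, 1.0])
print(hpos)
```

A position-invariant program would simply omit these four instructions and leave the position to the implementation.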

Should position-invariant vertex programs have a lower limit on the
maximum number of instructions?

RESOLUTION: Yes. The driver takes care to use the same instructions
for position transformation that conventional transformation uses,
and this requires a few vertex program instructions.

New Procedures and Functions

None.

New Tokens

None.

Additions to Chapter 2 of the OpenGL 1.2.1 Specification (OpenGL Operation)

2.14.1.9 Vertex Program Register Accesses

Replace the first two sentences and update Table X.4:

"There are 21 vertex program instructions. The instructions and their
respective input and output parameters are summarized in Table X.4."

                            Output
        Inputs              (vector or
Opcode  (scalar or vector)  replicated scalar)  Operation
------  ------------------  ------------------  ----------------------------
ARL     s                   address register    address register load
MOV     v                   v                   move
MUL     v,v                 v                   multiply
ADD     v,v                 v                   add
MAD     v,v,v               v                   multiply and add
RCP     s                   ssss                reciprocal
RSQ     s                   ssss                reciprocal square root
DP3     v,v                 ssss                3-component dot product
DP4     v,v                 ssss                4-component dot product
DST     v,v                 v                   distance vector
MIN     v,v                 v                   minimum
MAX     v,v                 v                   maximum
SLT     v,v                 v                   set on less than
SGE     v,v                 v                   set on greater than or equal
EXP     s                   v                   exponential base 2
LOG     s                   v                   logarithm base 2
LIT     v                   v                   light coefficients
DPH     v,v                 ssss                homogeneous dot product
RCC     s                   ssss                reciprocal clamped
SUB     v,v                 v                   subtract
ABS     v                   v                   absolute value

Table X.4: Summary of vertex program instructions. "v" indicates a
vector input or output, "s" indicates a scalar input, and "ssss" indicates
a scalar output replicated across a 4-component vector.

Add four new sections describing the DPH, RCC, SUB, and ABS
instructions.

"2.14.1.10.18 DPH: Homogeneous Dot Product

The DPH instruction assigns the four-component dot product of the
two source vectors where the W component of the first source vector
is assumed to be 1.0 into the destination register.

    t.x = source0.c***;
    t.y = source0.*c**;
    t.z = source0.**c*;
    if (negate0) {
        t.x = -t.x;
        t.y = -t.y;
        t.z = -t.z;
    }
    u.x = source1.c***;
    u.y = source1.*c**;
    u.z = source1.**c*;
    u.w = source1.***c;
    if (negate1) {
        u.x = -u.x;
        u.y = -u.y;
        u.z = -u.z;
        u.w = -u.w;
    }
    v.x = t.x * u.x + t.y * u.y + t.z * u.z + u.w;
    if (xmask) destination.x = v.x;
    if (ymask) destination.y = v.x;
    if (zmask) destination.z = v.x;
    if (wmask) destination.w = v.x;
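The DPH pseudo-code can be modeled with a small Python sketch. It assumes the sources have already been swizzled, represents vectors as four-element lists, and uses hypothetical helper names (dph, write_masked) that do not appear in the specification.

```python
def dph(source0, source1, negate0=False, negate1=False):
    """Homogeneous dot product: source0.w is ignored and treated as 1.0."""
    t = list(source0[:3])          # only x, y, z of the first source are read
    u = list(source1[:4])
    if negate0:
        t = [-c for c in t]
    if negate1:
        u = [-c for c in u]
    return t[0] * u[0] + t[1] * u[1] + t[2] * u[2] + u[3]


def write_masked(destination, scalar, mask="xyzw"):
    """Replicate the scalar result into the components enabled by the mask."""
    result = list(destination)
    for i, name in enumerate("xyzw"):
        if name in mask:
            result[i] = scalar
    return result


# The w component (99.0) of the first source does not affect the result.
value = dph([1.0, 2.0, 3.0, 99.0], [1.0, 1.0, 1.0, 1.0])
print(value, write_masked([0.0, 0.0, 0.0, 0.0], value, "xz"))
```

Note that negating the first source flips only the x, y, and z terms; the implied 1.0 that multiplies the second source's w component is not negated.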

2.14.1.10.19 RCC: Reciprocal Clamped

The RCC instruction inverts the value of the source scalar, clamps
the result as described below, and stores the clamped result into
the destination register. The reciprocal of exactly 1.0 must be
exactly 1.0.

Additionally (before clamping) the reciprocal of negative infinity
gives [-0.0, -0.0, -0.0, -0.0]; the reciprocal of negative zero gives
[-Inf, -Inf, -Inf, -Inf]; the reciprocal of positive zero gives
[+Inf, +Inf, +Inf, +Inf]; and the reciprocal of positive infinity
gives [0.0, 0.0, 0.0, 0.0].

    t.x = source0.c;
    if (negate0) {
        t.x = -t.x;
    }
    if (t.x == 1.0f) {
        u.x = 1.0f;
    } else {
        u.x = 1.0f / t.x;
    }
    if (Positive(u.x)) {
        if (u.x > 1.84467e+019) {
            u.x = 1.84467e+019;   // the IEEE 32-bit binary value 0x5F800000
        } else if (u.x < 5.42101e-020) {
            u.x = 5.42101e-020;   // the IEEE 32-bit binary value 0x1F800000
        }
    } else {
        if (u.x < -1.84467e+019) {
            u.x = -1.84467e+019;  // the IEEE 32-bit binary value 0xDF800000
        } else if (u.x > -5.42101e-020) {
            u.x = -5.42101e-020;  // the IEEE 32-bit binary value 0x9F800000
        }
    }
    if (xmask) destination.x = u.x;
    if (ymask) destination.y = u.x;
    if (zmask) destination.z = u.x;
    if (wmask) destination.w = u.x;

where Positive(x) is true for +0 and other positive values and false
for -0 and other negative values; and

    | u.x - IEEE(1.0f/t.x) | < 1.0f/(2^22)

for 1.0f <= t.x <= 2.0f. The intent of this precision requirement is
that this amount of relative precision apply over all values of t.x."
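The clamping behavior can be checked with a short Python sketch. It decodes the bit patterns quoted in the pseudo-code with the standard struct module, which shows that the clamp limits are exactly 2^64 and 2^-64. The function names are hypothetical, the arithmetic is done in Python doubles rather than 32-bit floats, and NaN inputs and the precision requirement are not modeled.

```python
import math
import struct


def bits_to_float(bits):
    """Interpret a 32-bit pattern as an IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


RCC_MAX = bits_to_float(0x5F800000)   # 2**64,  about 1.84467e+19
RCC_MIN = bits_to_float(0x1F800000)   # 2**-64, about 5.42101e-20


def rcc(x):
    """Reciprocal with the magnitude clamped to [RCC_MIN, RCC_MAX]."""
    if x == 0.0:
        u = math.copysign(math.inf, x)    # 1/+0 = +Inf, 1/-0 = -Inf
    elif math.isinf(x):
        u = math.copysign(0.0, x)         # 1/+Inf = +0, 1/-Inf = -0
    else:
        u = 1.0 / x
    positive = math.copysign(1.0, u) > 0  # Positive(): true for +0, false for -0
    magnitude = min(max(abs(u), RCC_MIN), RCC_MAX)
    return magnitude if positive else -magnitude


print(rcc(0.0), rcc(-0.0), rcc(math.inf), rcc(-math.inf), rcc(4.0))
```

Because of the clamp, RCC never returns an infinity or a zero for these inputs, which is what distinguishes it from RCP.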

2.14.1.10.20 SUB: Subtract

The SUB instruction subtracts the second source vector from the
first source vector, component by component, and stores the result
into the destination register.

    t.x = source0.c***;
    t.y = source0.*c**;
    t.z = source0.**c*;
    t.w = source0.***c;
    if (negate0) {
        t.x = -t.x;
        t.y = -t.y;
        t.z = -t.z;
        t.w = -t.w;
    }
    u.x = source1.c***;
    u.y = source1.*c**;
    u.z = source1.**c*;
    u.w = source1.***c;
    if (negate1) {
        u.x = -u.x;
        u.y = -u.y;
        u.z = -u.z;
        u.w = -u.w;
    }
    if (xmask) destination.x = t.x - u.x;
    if (ymask) destination.y = t.y - u.y;
    if (zmask) destination.z = t.z - u.z;
    if (wmask) destination.w = t.w - u.w;

2.14.1.10.21 ABS: Absolute Value

The ABS instruction assigns the component-wise absolute value of a
source vector into the destination register.

    t.x = source0.c***;
    t.y = source0.*c**;
    t.z = source0.**c*;
    t.w = source0.***c;
    if (xmask) destination.x = (t.x >= 0) ? t.x : -t.x;
    if (ymask) destination.y = (t.y >= 0) ? t.y : -t.y;
    if (zmask) destination.z = (t.z >= 0) ? t.z : -t.z;
    if (wmask) destination.w = (t.w >= 0) ? t.w : -t.w;

Insert sections 2.14.A and 2.14.B after section 2.14.4

"2.14.A Version 1.1 Vertex Programs

Version 1.1 vertex programs provide support for the DPH, RCC, SUB,
and ABS instructions (see sections 2.14.1.10.18 through 2.14.1.10.21).

Version 1.1 vertex programs are loaded with the LoadProgramNV command
(see section 2.14.1.7). The target must be VERTEX_PROGRAM_NV to
load a version 1.1 vertex program. The initial "!!VP1.1" token
designates that the program should be parsed and treated as a
version 1.1 vertex program.

Version 1.1 programs must conform to an expanded version of the
grammar for version 1.0 vertex programs. The version 1.1 vertex program
grammar for syntactically valid sequences is the same as the grammar
defined in section 2.14.1.7 with the following modified rules:

    <program>        ::= "!!VP1.1" <optionSequence> <instructionSequence> "END"

    <optionSequence> ::= <optionSequence> <option>
                       | ""

    <option>         ::= "OPTION" "NV_position_invariant" ";"

    <VECTORop>       ::= "MOV"
                       | "LIT"
                       | "ABS"

    <SCALARop>       ::= "RCP"
                       | "RSQ"
                       | "EXP"
                       | "LOG"
                       | "RCC"

    <BINop>          ::= "MUL"
                       | "ADD"
                       | "DP3"
                       | "DP4"
                       | "DST"
                       | "MIN"
                       | "MAX"
                       | "SLT"
                       | "SGE"
                       | "DPH"
                       | "SUB"

    <optionalSign>   ::= "-"
                       | "+"
                       | ""

Except for supporting the additional DPH, RCC, SUB, and ABS
instructions, version 1.1 vertex programs with no options specified
otherwise behave in the same manner as version 1.0 vertex programs.
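As a rough illustration of the <program> and <optionSequence> rules, the following Python sketch splits a program string into its options and the remaining text. It is a simplified, hypothetical helper rather than a real parser: it does not handle comments, does not validate the instruction sequence, and the function name and sample program are made up for this example.

```python
import re

# "OPTION" <identifier> ";" with optional surrounding whitespace.
_OPTION = re.compile(r"\s*OPTION\s+([A-Za-z_][A-Za-z0-9_]*)\s*;")

# The only option this extension defines.
KNOWN_OPTIONS = {"NV_position_invariant"}


def parse_vp11_header(text):
    """Return (options, body) for a "!!VP1.1" program string.

    Raises ValueError if the header, an option name, or the final "END"
    does not match the grammar rules sketched above.
    """
    if not text.startswith("!!VP1.1"):
        raise ValueError("not a version 1.1 vertex program")
    pos = len("!!VP1.1")
    options = []
    while True:
        match = _OPTION.match(text, pos)
        if match is None:
            break
        name = match.group(1)
        if name not in KNOWN_OPTIONS:
            raise ValueError("unknown option: " + name)
        options.append(name)
        pos = match.end()
    body = text[pos:].strip()
    if not body.endswith("END"):
        raise ValueError('program must end with "END"')
    return options, body


program = """!!VP1.1
OPTION NV_position_invariant;
MOV o[COL0], v[COL0];
END"""

print(parse_vp11_header(program))
```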

2.14.B Position-invariant Vertex Program Option

By default, vertex programs are _not_ guaranteed to be
position-invariant because there is no guarantee that the
way a vertex program computes its homogeneous position is
exactly identical to the way conventional OpenGL transformation
computes its homogeneous positions. However, in a position-invariant
vertex program, the homogeneous position (HPOS) is not output by
the program. Instead, the OpenGL implementation is expected to
compute the HPOS for position-invariant vertex programs in a manner
exactly identical to how the homogeneous position and window position
are computed for a vertex by conventional OpenGL transformation
(assuming vertex weighting and vertex blending are disabled). In this
way position-invariant vertex programs guarantee correct multi-pass
rendering semantics in cases where multiple passes are rendered with
conventional OpenGL transformation and position-invariant vertex
programs and the second and subsequent passes use an EQUAL depth test.

If an <option> with the identifier "NV_position_invariant" is
encountered during the parsing of the program, the specified program
is presumed to be position-invariant.

When a position-invariant vertex program is specified, the
<vertexResultRegName> rule is replaced with the following rule
(that does not provide "HPOS"):

    <vertexResultRegName> ::= "COL0"
                            | "COL1"
                            | "BFC0"
                            | "BFC1"
                            | "FOGC"
                            | "PSIZ"
                            | "TEX0"
                            | "TEX1"
                            | "TEX2"
                            | "TEX3"
                            | "TEX4"
                            | "TEX5"
                            | "TEX6"
                            | "TEX7"

While position-invariant version 1.1 vertex programs provide
position-invariance, such programs do not provide support for
relative program parameter addressing. The <relProgParamReg> rule
for version 1.1 position-invariant vertex programs is replaced by
(eliminating the relative addressing cases):

    <relProgParamReg> ::= "c" "[" <addrReg> "]"

Note that while the ARL instruction is still available to
position-invariant version 1.1 vertex programs, it provides no
meaningful functionality without support for relative addressing.

The semantic restriction for vertex program instruction length is
changed in the case of position-invariant vertex programs to the
following: A position-invariant vertex program fails to load if it
contains more than 124 instructions.
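Two of the restrictions above (no write to HPOS, and at most 124 instructions) can be sketched as a Python check. This is a hypothetical helper operating on raw instruction text: it splits on ";" rather than parsing the grammar, ignores comments, and does not examine relative addressing.

```python
import re

MAX_POSITION_INVARIANT_INSTRUCTIONS = 124

# An opcode followed by o[HPOS] as the destination operand.
_HPOS_WRITE = re.compile(r"^\s*[A-Z0-9]+\s+o\[\s*HPOS\s*\]")


def position_invariant_errors(instruction_text):
    """Return a list of reasons a position-invariant program would be rejected."""
    statements = [s.strip() for s in instruction_text.split(";") if s.strip()]
    if statements and statements[-1] == "END":
        statements.pop()              # the terminator is not an instruction
    errors = []
    if len(statements) > MAX_POSITION_INVARIANT_INSTRUCTIONS:
        errors.append("too many instructions: %d" % len(statements))
    for statement in statements:
        if _HPOS_WRITE.match(statement):
            errors.append("writes HPOS: " + statement)
    return errors


print(position_invariant_errors("MOV o[COL0], v[COL0]; END"))
print(position_invariant_errors("DP4 o[HPOS].x, c[0], v[OPOS]; END"))
```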

" | |

Additions to Chapter 4 of the OpenGL 1.2.1 Specification (Per-Fragment
Operations and the Framebuffer)

None

Additions to Chapter 5 of the OpenGL 1.2.1 Specification (Special Functions)

None

Additions to Chapter 6 of the OpenGL 1.2.1 Specification (State and
State Requests)

None

Additions to the AGL/GLX/WGL Specifications

None

GLX Protocol

None

Errors

None

New State

None

Revision History

Rev.  Date      Author     Changes
----  --------  ---------  ----------------------------------------
8     03/04/14  mjk        RCC decimal value corrections