Name
NV_vertex_program3
Name Strings
GL_NV_vertex_program3
Contact
Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)
Status
Shipping.
Version
Last Modified Date: 10/12/2009
NVIDIA Revision: 7
Number
306
Dependencies
ARB_vertex_program is required.
NV_vertex_program2_option is required.
This extension interacts with ARB_fragment_program_shadow.
Overview
This extension, like the NV_vertex_program2_option extension,
provides additional vertex program functionality to extend the
standard ARB_vertex_program language and execution environment.
ARB programs wishing to use this added functionality need only add:
OPTION NV_vertex_program3;
to the beginning of their vertex programs.
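As an illustration, the sketch below embeds a small vertex program that requests the option and performs one vertex texture lookup. The program text is a hypothetical example written for this note, not taken from the specification, and requests_vp3() is an illustrative host-side helper that only recognizes the option when it is the first statement after the header (it does not skip comments or other OPTION lines).

```c
#include <string.h>

/* Hypothetical example program: samples texture image unit 0 in the
   vertex stage and forwards the texel as the vertex color. */
static const char *const kVp3Source =
    "!!ARBvp1.0\n"
    "OPTION NV_vertex_program3;\n"
    "TEMP texel;\n"
    "TEX texel, vertex.texcoord[0], texture[0], 2D;\n"
    "MOV result.color, texel;\n"
    "DP4 result.position.x, state.matrix.mvp.row[0], vertex.position;\n"
    "DP4 result.position.y, state.matrix.mvp.row[1], vertex.position;\n"
    "DP4 result.position.z, state.matrix.mvp.row[2], vertex.position;\n"
    "DP4 result.position.w, state.matrix.mvp.row[3], vertex.position;\n"
    "END\n";

/* Returns 1 if the NV_vertex_program3 option statement is the first
   statement after the "!!ARBvp1.0" header (only whitespace in between),
   and 0 otherwise. */
static int requests_vp3(const char *src) {
    static const char header[] = "!!ARBvp1.0";
    static const char option[] = "OPTION NV_vertex_program3;";
    if (strncmp(src, header, sizeof header - 1) != 0) return 0;
    src += sizeof header - 1;
    while (*src == ' ' || *src == '\t' || *src == '\r' || *src == '\n') src++;
    return strncmp(src, option, sizeof option - 1) == 0;
}
```

A program string like this would normally be handed to the ARB_vertex_program loading entry point; that step is omitted here because it needs a live GL context.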
New functionality provided by this extension, above and beyond that
already provided by the NV_vertex_program2_option extension, includes:
* texture lookups in vertex programs,
* ability to push and pop address registers on the stack,
* address register-relative addressing for vertex attribute and
result arrays, and
* a second four-component condition code.
Issues
Should we provide a separate "!!VP3.0" program type, like the
"!!VP2.0" type defined in NV_vertex_program2?
RESOLVED: No. Since ARB_vertex_program has been fully defined
(it wasn't in the !!VP2.0 time-frame), we will simply define
language extensions to !!ARBvp1.0 that expose new functionality.
The NV_vertex_program2_option specification followed this same
pattern for the NV3X family (GeForce FX, Quadro FX).
Should this be called "NV_vertex_program3_option"?
RESOLVED: No. The similar extension to !!ARBvp1.0 called
"NV_vertex_program2_option" got that name only because the simpler
"NV_vertex_program2" name had already been used.
Is there a limit on the number of texture units that can be accessed
by a vertex program?
RESOLVED: Yes. The limit may be lower than the total number of texture
image units available and is given by the implementation-dependent
constant MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB. Any program that attempts
to use more unique texture image units will fail to load. Programs can
use any texture image unit number, as long as they don't use too many
simultaneously. As an example, the GeForce 6 series of GPUs provides 16
texture image units accessible to vertex programs, but no more than four
can be used simultaneously. It is not an error to use texture image
units 12-15 in a program.
This limitation is identical to the one in the ARB_vertex_shader
extension -- both extensions use the same enum to query the number of
available image units. Violating this limit in GLSL results in a link
error.
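The rule counts distinct texture image units, not unit numbers or lookup instructions. A minimal sketch of that check follows; the function names are illustrative, and the limit is passed in as a parameter standing in for the queried value of MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB.

```c
#include <stddef.h>

/* Counts the distinct texture image unit numbers in units[0..n-1].
   Unit numbers are assumed to lie in [0, 31] for this sketch. */
static int count_unique_units(const int *units, size_t n) {
    unsigned long seen = 0;  /* bit i is set once unit i has been seen */
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned long bit = 1UL << units[i];
        if (!(seen & bit)) {
            seen |= bit;
            count++;
        }
    }
    return count;
}

/* Returns 1 if a program referencing these units stays within the limit. */
static int program_would_load(const int *units, size_t n, int max_vertex_units) {
    return count_unique_units(units, n) <= max_vertex_units;
}
```

With a limit of four, a program that samples units 12, 13, 14, and 15 (even repeatedly) passes the check, while one that samples units 0 through 4 does not.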
Is there a restriction on the texture targets that can be accessed by a
vertex program?
RESOLVED: Yes -- for any texture image unit, vertex and fragment
processing cannot use different targets. If they do, an
INVALID_OPERATION error is generated at Begin-time. This resolution is
consistent with the resolution of the same issue in the ARB_vertex_shader
extension and OpenGL 2.0.
Since vertices don't have screen space partial derivatives, how is
the LOD used for texture accesses defined?
RESOLVED: The TXL instruction allows a program to explicitly
set an LOD; the LOD for all other texture instructions is zero.
The texture LOD biases specified in the texture object and environment
do apply to all vertex texture lookups.
New Procedures and Functions
None.
New Tokens
Accepted by the <pname> parameter of GetBooleanv, GetIntegerv,
GetFloatv, and GetDoublev:
MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB 0x8B4C
Additions to Chapter 2 of the OpenGL 1.4 Specification (OpenGL Operation)
Modify Section 2.14.2, Vertex Program Grammar and Restrictions
(mostly add to existing grammar rules, as extended by
NV_vertex_program2_option)
<optionName> ::= "NV_vertex_program3"
<instruction> ::= <TexInstruction>
<ALUInstruction> ::= <ASTACKop_instruction>
<TexInstruction> ::= <TEXop_instruction>
<ASTACKop_instruction> ::= <PUSHAop> <instOperandAddrVNS>
| <POPAop> <instResultAddr>
<PUSHAop> ::= "PUSHA"
<POPAop> ::= "POPA"
<TEXop_instruction> ::= <TEXop> <instResult> "," <instOperandV> ","
<texTarget>
<TEXop> ::= "TEX"
| "TXP"
| "TXB"
| "TXL"
<texTarget> ::= <texImageUnit> "," <texTargetType>
<texImageUnit> ::= "texture" <optTexImageUnitNum>
<optTexImageUnitNum> ::= /* empty */
| "[" <texImageUnitNum> "]"
<texImageUnitNum> ::= <integer>
/*[0,MAX_TEXTURE_IMAGE_UNITS_ARB-1]*/
<texTargetType> ::= "1D"
| "2D"
| "3D"
| "CUBE"
| "RECT"
<attribVtxBasic> ::= "texcoord" "[" <arrayMemRel> "]"
| "attrib" "[" <arrayMemRel> "]"
<resultVtxBasic> ::= "texcoord" "[" <arrayMemRel> "]"
<ccMaskRule> ::= "EQ0"
| "GE0"
| "GT0"
| "LE0"
| "LT0"
| "NE0"
| "TR0"
| "FL0"
| "EQ1"
| "GE1"
| "GT1"
| "LE1"
| "LT1"
| "NE1"
| "TR1"
| "FL1"
(modify description of reserved identifiers)
... The following strings are reserved keywords and may not be used
as identifiers:
ABS, ADD, ADDRESS, ALIAS, ARA, ARL, ARR, ATTRIB, BRA, CAL, COS,
DP3, DP4, DPH, DST, END, EX2, EXP, FLR, FRC, LG2, LIT, LOG, MAD,
MAX, MIN, MOV, MUL, OPTION, OUTPUT, PARAM, POPA, POW, PUSHA, RCC,
RCP, RET, RSQ, SEQ, SFL, SGE, SGT, SIN, SLE, SLT, SNE, SUB, SSG,
STR, SWZ, TEMP, TEX, TXB, TXL, TXP, XPD, program, result, state,
and vertex.
Modify Section 2.14.3.1, Vertex Attributes
(add new bindings to binding table)
Vertex Attribute Binding  Components  Underlying State
------------------------  ----------  --------------------------------
...
vertex.texcoord[A+n]      (s,t,r,q)   indexed texture coordinate
vertex.attrib[A+n]        (x,y,z,w)   indexed generic vertex attribute
If a vertex attribute binding matches "vertex.texcoord[A+n]", where
"A" is a component of an address register (Section 2.14.3.5), a
texture coordinate number <c> is computed by adding the current
value of the address register component and <n>. The "x", "y",
"z", and "w" components of the vertex attribute variable are
filled with the "s", "t", "r", and "q" components, respectively,
of the vertex texture coordinates for texture unit <c>. If <c>
is negative or greater than or equal to MAX_TEXTURE_COORDS_ARB,
the vertex attribute variable is undefined.
If a vertex attribute binding matches "vertex.attrib[A+n]", where
"A" is a component of an address register (Section 2.14.3.5), a
vertex attribute number <a> is computed by adding the current value
of the address register component and <n>. The "x", "y", "z", and
"w" components of the vertex attribute variable are filled with the
"x", "y", "z", and "w" components, respectively, of generic vertex
attribute <a>. If <a> is negative or greater than or equal to
MAX_VERTEX_ATTRIBS_ARB, the vertex attribute variable is undefined.
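The index arithmetic for relative addressing can be sketched as follows. The helper name and its return convention are illustrative; the limit parameter stands in for MAX_TEXTURE_COORDS_ARB or MAX_VERTEX_ATTRIBS_ARB, and a return value of 0 models the "undefined" case rather than any defined fallback behavior.

```c
/* Resolves "array[A+n]": adds the address register component to the
   constant offset. Returns 1 and stores the index in *out when it lies
   in [0, limit-1]; returns 0 (binding undefined) otherwise. */
static int resolve_relative_index(int addr_component, int offset,
                                  int limit, int *out) {
    int index = addr_component + offset;
    if (index < 0 || index >= limit) {
        return 0;
    }
    *out = index;
    return 1;
}
```

For example, with eight texture coordinate sets, an address register component of 2 and n = 1 select texture coordinate set 3, while a component of 7 with n = 1 falls outside the array.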
Modify Section 2.14.3.4, Vertex Program Results
(add new binding to binding table)
Binding                        Components  Description
-----------------------------  ----------  ----------------------------
...
result.texcoord[A+n]           (s,t,r,q)   indexed texture coordinate
If a result variable binding matches "result.texcoord[A+n]", where "A"
is a component of an address register (Section 2.14.3.5), a texture
coordinate number <c> is computed by adding the current value of
the address register component and <n>. Updates to the "x", "y",
"z", and "w" components of the result variable set the "s", "t",
"r" and "q" components, respectively, of the transformed vertex's
texture coordinates for texture unit <c>. If <c> is negative or
greater than or equal to MAX_TEXTURE_COORDS_ARB, the effects of
updates to the result variable are undefined and may overwrite
other program results.
Modify Section 2.14.3.X, Condition Code Registers (added in
NV_vertex_program2_option)
The vertex program condition code registers are two four-component
vectors, called CC0 and CC1. Each component of these registers is one
of four enumerated values: GT (greater than), EQ (equal), LT (less
than), or UN (unordered). The condition code registers can be used
to mask writes to registers and to evaluate conditional branches.
Most vertex program instructions can optionally update one of the
two condition code registers. When a vertex program instruction
updates a condition code register, a condition code component is set
to LT if the corresponding component of the result is less than zero,
EQ if it is equal to zero, GT if it is greater than zero, and UN if
it is NaN (not a number).
The condition code registers are initialized to vectors of EQ values
each time a vertex program executes.
Modify Section 2.14.3.7, Vertex Program Resource Limits
(add new paragraph to end of section) In addition to the previous limits,
the number of unique texture image units that can be accessed
simultaneously by a vertex program is limited. The limit is given by the
implementation-dependent constant MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB, and
may be lower than the total number of texture image units provided. If
the number of texture image units referenced by a vertex program exceeds
this limit, the program will fail to load.
Modify Section 2.14.4, Vertex Program Execution Environment
(modify Begin-time error language for vertex program execution to cover
invalid texture uses)
If vertex program mode is enabled and the currently bound program object
does not contain a valid vertex program, the error INVALID_OPERATION will
be generated by Begin, RasterPos, and any command that implicitly calls
Begin (e.g., DrawArrays).
If vertex program mode is enabled and the currently bound program object
accesses a texture image unit, the texture target used must be consistent
with the target (if any) used for fragment processing. If vertex and
fragment processing require the use of different texture targets on the
same texture image unit, the error INVALID_OPERATION will be generated by
Begin, RasterPos, and any command that implicitly calls Begin.
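The consistency rule can be pictured as a per-unit comparison. In this sketch the enum, the function name, and the use of TARGET_NONE for "unit not used by this stage" are all illustrative assumptions, not part of the API.

```c
typedef enum {
    TARGET_NONE = 0,  /* the stage does not access this unit */
    TARGET_1D, TARGET_2D, TARGET_3D, TARGET_CUBE, TARGET_RECT
} TexTarget;

/* Returns the first texture image unit on which vertex and fragment
   processing require different targets, or -1 if there is no conflict.
   A conflict corresponds to INVALID_OPERATION at Begin time. */
static int find_target_conflict(const TexTarget *vertex_targets,
                                const TexTarget *fragment_targets,
                                int num_units) {
    for (int i = 0; i < num_units; i++) {
        if (vertex_targets[i] != TARGET_NONE &&
            fragment_targets[i] != TARGET_NONE &&
            vertex_targets[i] != fragment_targets[i]) {
            return i;
        }
    }
    return -1;
}
```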
(modify instruction table) There are forty-eight vertex program
instructions. Vertex program instructions may have up to eight
variants, including a suffix of "C" or "C0" to allow an update of
condition code register zero (section 2.14.3.X), a suffix of "C1"
to allow an update of condition code register one, and a suffix of
"_SAT" to clamp the result vector components to the range [0,1].
For example, the eight forms of the "ADD" instruction are "ADD",
"ADDC", "ADDC0", "ADDC1", "ADD_SAT", "ADDC_SAT", "ADDC0_SAT", and
"ADDC1_SAT". The instructions and their respective input and output
parameters are summarized in Table X.5.
             Modifiers
Instruction  C  S  Inputs  Output  Description
-----------  -  -  ------  ------  --------------------------------
ABS          X  X  v       v       absolute value
ADD          X  X  v,v     v       add
ARA          X  -  a       a       address register add
ARL          X  -  s       a       address register load
ARR          X  -  v       a       address register load (round)
BRA          -  -  c       -       branch
CAL          -  -  c       -       subroutine call
COS          X  X  s       ssss    cosine
DP3          X  X  v,v     ssss    3-component dot product
DP4          X  X  v,v     ssss    4-component dot product
DPH          X  X  v,v     ssss    homogeneous dot product
DST          X  X  v,v     v       distance vector
EX2          X  X  s       ssss    exponential base 2
EXP          X  X  s       v       exponential base 2 (approximate)
FLR          X  X  v       v       floor
FRC          X  X  v       v       fraction
LG2          X  X  s       ssss    logarithm base 2
LIT          X  X  v       v       compute light coefficients
LOG          X  X  s       v       logarithm base 2 (approximate)
MAD          X  X  v,v,v   v       multiply and add
MAX          X  X  v,v     v       maximum
MIN          X  X  v,v     v       minimum
MOV          X  X  v       v       move
MUL          X  X  v,v     v       multiply
POPA         -  -  -       a       pop address register
POW          X  X  s,s     ssss    exponentiate
PUSHA        -  -  a       -       push address register
RCC          X  X  s       ssss    reciprocal (clamped)
RCP          X  X  s       ssss    reciprocal
RET          -  -  c       -       subroutine return
RSQ          X  X  s       ssss    reciprocal square root
SEQ          X  X  v,v     v       set on equal
SFL          X  X  v,v     v       set on false
SGE          X  X  v,v     v       set on greater than or equal
SGT          X  X  v,v     v       set on greater than
SIN          X  X  s       ssss    sine
SLE          X  X  v,v     v       set on less than or equal
SLT          X  X  v,v     v       set on less than
SNE          X  X  v,v     v       set on not equal
SSG          X  X  v       v       set sign
STR          X  X  v,v     v       set on true
SUB          X  X  v,v     v       subtract
SWZ          X  X  v       v       extended swizzle
TEX          X  X  v       v       texture lookup
TXB          X  X  v       v       texture lookup with LOD bias
TXL          X  X  v       v       texture lookup with explicit LOD
TXP          X  X  v       v       projective texture lookup
XPD          X  X  v,v     v       cross product
Table X.5: Summary of vertex program instructions. The columns
"C" and "S" indicate whether the "C", "C0", and "C1" condition code
update modifiers, and the "_SAT" saturation modifiers, respectively,
are supported for the opcode. "v" indicates a floating-point vector
input or output, "s" indicates a floating-point scalar input,
"ssss" indicates a scalar output replicated across a 4-component
result vector, "a" indicates a vector address register, and "c"
indicates a condition code test.
Rewrite Section 2.14.4.3, Vertex Program Destination Register Update
A vertex program instruction can optionally clamp the results of
a floating-point result vector to the range [0,1]. The components
of the result vector are clamped to [0,1] if the saturation suffix
"_SAT" is present in the instruction.
Most vertex program instructions write a 4-component result vector to
a single temporary or vertex result register. Writes to individual
components of the destination register are controlled by individual
component write masks specified as part of the instruction.
The component write mask is specified by the <optionalMask> rule
found in the <maskedDstReg> rule. If the optional mask is "",
all components are enabled. Otherwise, the optional mask names
the individual components to enable. The characters "x", "y",
"z", and "w" match the x, y, z, and w components respectively.
For example, an optional mask of ".xzw" indicates that the x, z,
and w components should be enabled for writing but the y component
should not. The grammar requires that the destination register mask
components must be listed in "xyzw" order.
The condition code write mask is specified by the <ccMask> rule found
in the <instResultCC> and <instResultAddrCC> rules. If a condition
code mask is present, the selected condition code register is loaded
and swizzled according to the swizzle codes specified by
<swizzleSuffix>. Each component of the swizzled
condition code is tested according to the rule given by <ccMaskRule>.
<ccMaskRule> may have the values "EQ", "NE", "LT", "GE", "LE", or "GT",
which mean to enable writes if the corresponding condition code field
evaluates to equal, not equal, less than, greater than or equal, less
than or equal, or greater than, respectively. Comparisons involving
condition codes of "UN" (unordered) evaluate to true for "NE" and
false otherwise. For example, if the condition code is (GT,LT,EQ,GT)
and the condition code mask is "(NE.zyxw)", the swizzle operation
will load (EQ,LT,GT,GT) and the mask will thus enable writes on
the y, z, and w components. In addition, "TR" always enables writes
and "FL" always disables writes, regardless of the condition code.
If the condition code mask is empty, it is treated as "(TR)".
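The swizzle-then-test step can be sketched in C as shown below. The enum names and the bit-mask return convention (bit 0 = x through bit 3 = w) are illustrative choices for this note; the comparison logic follows the rules stated above.

```c
typedef enum { CC_EQ, CC_GT, CC_LT, CC_UN } CondCode;
typedef enum {
    RULE_EQ, RULE_NE, RULE_LT, RULE_GE, RULE_LE, RULE_GT, RULE_TR, RULE_FL
} MaskRule;

/* Tests one condition code field against the mask rule. */
static int test_cc(MaskRule rule, CondCode field) {
    switch (rule) {
    case RULE_EQ: return field == CC_EQ;
    case RULE_NE: return field != CC_EQ;  /* also true for UN */
    case RULE_LT: return field == CC_LT;
    case RULE_GE: return field == CC_GT || field == CC_EQ;
    case RULE_LE: return field == CC_LT || field == CC_EQ;
    case RULE_GT: return field == CC_GT;
    case RULE_TR: return 1;
    case RULE_FL: return 0;
    }
    return 0;
}

/* swz[i] names the condition code component (0=x .. 3=w) that is tested
   for destination component i. Returns a 4-bit write-enable mask. */
static unsigned cc_write_mask(const CondCode cc[4], const int swz[4],
                              MaskRule rule) {
    unsigned mask = 0;
    for (int i = 0; i < 4; i++) {
        if (test_cc(rule, cc[swz[i]])) {
            mask |= 1u << i;
        }
    }
    return mask;
}
```

For the condition code (GT,LT,EQ,GT) and the mask "(NE.zyxw)", the swizzle array is {2, 1, 0, 3} and the resulting mask enables y, z, and w.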
Each component of the destination register is updated with the result
of the vertex program instruction if and only if the component is
enabled for writes by both the component write mask and the condition
code write mask. Otherwise, the component of the destination register
remains unchanged.
A vertex program instruction can also optionally update one of the
condition code registers. The selected register is updated if a
condition code update suffix ("C" or "C0" for register zero, "C1" for
register one) is present in the instruction. The instruction "ADDC"
will update condition code register zero; the otherwise
equivalent instruction "ADD" will not. If condition code updates
are enabled, each component of the destination register enabled
for writes is compared to zero. The corresponding component of
the condition code is set to "LT", "EQ", or "GT", if the written
component is less than, equal to, or greater than zero, respectively.
Condition code components are set to "UN" if the written component is
NaN (not a number). Values of -0.0 and +0.0 both evaluate to "EQ".
If a component of the destination register is not enabled for writes,
the corresponding condition code component is also unchanged.
In the following example code,
# R1=(-2, 0, 2, NaN) R0 CC
MOVC R0, R1; # ( -2, 0, 2, NaN) (LT,EQ,GT,UN)
MOVC R0.xyz, R1.yzwx; # ( 0, 2, NaN, NaN) (EQ,GT,UN,UN)
MOVC R0 (NE), R1.zywx; # ( 0, 0, NaN, -2) (EQ,EQ,UN,LT)
the first instruction writes (-2,0,2,NaN) to R0 and updates the
condition code to (LT,EQ,GT,UN). In the second instruction, only the
"x", "y", and "z" components of R0 and the condition code are updated,
so R0 ends up with (0,2,NaN,NaN) and the condition code ends up with
(EQ,GT,UN,UN). In the third instruction, the condition code mask
disables writes to the x component (its condition code field is "EQ"),
so R0 ends up with (0,0,NaN,-2) and the condition code ends up with
(EQ,EQ,UN,LT).
The following pseudocode illustrates the process of writing a
result vector to the destination register. In the pseudocode,
"instrSaturate" is TRUE if and only if result saturation is
enabled, "instrMask" refers to the component write mask given by
the <optWriteMask> rule. "ccMaskRule" refers to the condition code
mask rule given by <ccMask> and "updatecc" is TRUE if and only if
condition code updates are enabled. "result", "destination", and "cc"
refer to the result vector, the register selected by <dstRegister>
and the condition code, respectively. Condition codes do not exist
in the VP1 execution environment.
boolean TestCC(CondCode field) {
switch (ccMaskRule) {
case "EQ": return (field == "EQ");
case "NE": return (field != "EQ");
case "LT": return (field == "LT");
case "GE": return (field == "GT" || field == "EQ");
case "LE": return (field == "LT" || field == "EQ");
case "GT": return (field == "GT");
case "TR": return TRUE;
case "FL": return FALSE;
case "": return TRUE;
}
}
enum GenerateCC(float value) {
if (isnan(value)) {  // NaN never compares equal, so test it explicitly
return UN;
} else if (value < 0) {
return LT;
} else if (value == 0) {
return EQ;
} else {
return GT;
}
}
void UpdateDestination(floatVec destination, floatVec result)
{
floatVec merged;
ccVec mergedCC;
// Clamp result components to [0,1] if requested in the instruction.
if (instrSaturate) {
if (result.x < 0) result.x = 0;
else if (result.x > 1) result.x = 1;
if (result.y < 0) result.y = 0;
else if (result.y > 1) result.y = 1;
if (result.z < 0) result.z = 0;
else if (result.z > 1) result.z = 1;
if (result.w < 0) result.w = 0;
else if (result.w > 1) result.w = 1;
}
// Merge the converted result into the destination register, under
// control of the compile- and run-time write masks.
merged = destination;
mergedCC = cc;
if (instrMask.x && TestCC(cc.c***)) {
merged.x = result.x;
if (updatecc) mergedCC.x = GenerateCC(result.x);
}
if (instrMask.y && TestCC(cc.*c**)) {
merged.y = result.y;
if (updatecc) mergedCC.y = GenerateCC(result.y);
}
if (instrMask.z && TestCC(cc.**c*)) {
merged.z = result.z;
if (updatecc) mergedCC.z = GenerateCC(result.z);
}
if (instrMask.w && TestCC(cc.***c)) {
merged.w = result.w;
if (updatecc) mergedCC.w = GenerateCC(result.w);
}
// Write out the new destination register and condition code.
destination = merged;
cc = mergedCC;
}
While this rule describes floating-point results, the same logic
applies to the integer results generated by the ARA, ARL, and ARR
instructions.
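The three-instruction MOVC example given earlier in this section can be replayed with a small C model of the destination update. This is a simplified sketch: it omits saturation, models a single condition code register, does not swizzle the condition code, and supports only the "NE" and "TR" mask rules through a flag. All names are illustrative.

```c
#include <math.h>

typedef enum { CC_EQ, CC_GT, CC_LT, CC_UN } CondCode;

/* Classifies a written value, mirroring GenerateCC. */
static CondCode generate_cc(float v) {
    if (isnan(v)) return CC_UN;
    if (v < 0.0f) return CC_LT;
    if (v == 0.0f) return CC_EQ;  /* -0.0 and +0.0 both land here */
    return CC_GT;
}

/* Models "MOVC dst.<write_mask> (<rule>), src.<swz>".
   write_mask: bit 0 = x .. bit 3 = w. test_ne: 1 for "(NE)", 0 for "(TR)". */
static void movc(float dst[4], CondCode cc[4], const float src[4],
                 const int swz[4], unsigned write_mask, int test_ne) {
    float merged[4];
    CondCode merged_cc[4];
    for (int i = 0; i < 4; i++) {
        int cc_enabled = test_ne ? (cc[i] != CC_EQ) : 1;
        merged[i] = dst[i];
        merged_cc[i] = cc[i];
        if (((write_mask >> i) & 1u) && cc_enabled) {
            merged[i] = src[swz[i]];
            merged_cc[i] = generate_cc(merged[i]);
        }
    }
    /* Write back only after all four tests used the old condition code. */
    for (int i = 0; i < 4; i++) {
        dst[i] = merged[i];
        cc[i] = merged_cc[i];
    }
}
```

Starting from R1 = (-2, 0, 2, NaN) and a condition code of all EQ, the calls movc(R0, cc, R1, xyzw, 0xF, 0), movc(R0, cc, R1, yzwx, 0x7, 0), and movc(R0, cc, R1, zywx, 0xF, 1) reproduce the register and condition code values listed in that example.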
Add to Section 2.14.4.5, Vertex Program Options
Section 2.14.4.5.3, NV_vertex_program3 Program Option
If a vertex program specifies the "NV_vertex_program3" option, the
ARB_vertex_program grammar and execution environment are extended
to take advantage of all the features of the "NV_vertex_program2"
option, plus the following features:
* several new instructions:
* POPA -- pop address register off stack
* PUSHA -- push address register onto stack
* TEX -- texture lookup
* TXB -- texture lookup w/LOD bias
* TXL -- texture lookup w/explicit LOD
* TXP -- projective texture lookup
* address register-relative addressing for vertex texture
coordinate and generic attribute arrays,
* address register-relative addressing for vertex texture
coordinate result array, and
* a second four-component condition code.
Modify Section 2.14.5.34, RET: Subroutine Call Return
The RET instruction conditionally returns from a subroutine initiated
by a CAL instruction by popping an instruction reference off the
top of the call stack and transferring control to the referenced
instruction. The following pseudocode describes the operation of
the instruction:
if (TestCC(cc.c***) || TestCC(cc.*c**) ||
TestCC(cc.**c*) || TestCC(cc.***c)) {
if (callStackDepth <= 0) {
// terminate vertex program normally
} else {
callStackDepth--;
if (callStack[callStackDepth] is an instruction reference) {
instruction = callStack[callStackDepth];
} else {
// terminate vertex program abnormally
}
}
// continue execution at <instruction>
} else {
// do nothing
}
In the pseudocode, <callStackDepth> is the depth of the call stack,
<callStack> is an array holding the call stack, and <instruction> is
a reference to an instruction previously pushed onto the call stack.
If the call stack is empty when RET executes, the vertex program
terminates normally.
The vertex program terminates abnormally if the entry at the top of the
call stack is not an instruction reference pushed by CAL. When a vertex
program terminates abnormally, all of the vertex program results are
undefined.
Add to Section 2.14.5, Vertex Program Instruction Set
Section 2.14.5.43, POPA: Pop Address Register Stack
The POPA instruction generates an integer result vector by popping
an entry off of the call stack.
if (callStackDepth <= 0) {
terminate vertex program;
} else {
callStackDepth--;
if (callStack[callStackDepth] is an address register) {
iresult = callStack[callStackDepth];
} else {
terminate vertex program;
}
}
POPA does not support non-default write masks; a program will fail to load
if it includes a component write mask other than ".xyzw" or a condition
code write mask test other than "TR".
In the pseudocode, <callStackDepth> is the current depth of the call
stack and <callStack> is an array holding the call stack.
The vertex program terminates abnormally if it executes a POPA instruction
when the call stack is empty, or when the entry at the top of the call
stack is not an address register pushed by PUSHA. When a vertex program
terminates abnormally, all of the vertex program results are undefined.
Section 2.14.5.44, PUSHA: Push Address Register Stack
The PUSHA instruction pushes the address register operand onto the
call stack, which is also used for subroutine calls. The PUSHA
instruction does not generate a result vector.
tmp = AddrVectorLoad(op0);
if (callStackDepth >= MAX_PROGRAM_CALL_DEPTH_NV) {
terminate vertex program;
} else {
callStack[callStackDepth] = tmp;
callStackDepth++;
}
In the pseudocode, <callStackDepth> is the current depth of the call
stack and <callStack> is an array holding the call stack.
The vertex program terminates abnormally if it executes a PUSHA
instruction when the call stack is full. When a vertex program terminates
abnormally, all of the vertex program results are undefined.
Component swizzling is not supported when the operand is loaded.
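Because PUSHA/POPA and CAL/RET share one stack, each entry effectively carries a tag saying what kind of value it holds. The sketch below models that with a tagged struct. The type names, the small MAX_CALL_DEPTH value (a stand-in for MAX_PROGRAM_CALL_DEPTH_NV), and the cal() helper are assumptions made for illustration; cal() exists here only to place a non-address entry on the stack.

```c
#define MAX_CALL_DEPTH 4  /* hypothetical stand-in for MAX_PROGRAM_CALL_DEPTH_NV */

typedef enum { ENTRY_INSTRUCTION_REF, ENTRY_ADDRESS_REGISTER } EntryKind;
typedef struct { EntryKind kind; int value[4]; } StackEntry;
typedef struct {
    StackEntry entries[MAX_CALL_DEPTH];
    int depth;
    int aborted;  /* set when the program would terminate abnormally */
} CallStack;

static void call_stack_init(CallStack *s) { s->depth = 0; s->aborted = 0; }

/* PUSHA: push a four-component address register; abort on overflow. */
static void pusha(CallStack *s, const int addr[4]) {
    if (s->depth >= MAX_CALL_DEPTH) { s->aborted = 1; return; }
    s->entries[s->depth].kind = ENTRY_ADDRESS_REGISTER;
    for (int i = 0; i < 4; i++) s->entries[s->depth].value[i] = addr[i];
    s->depth++;
}

/* Illustrative CAL: push an instruction reference (overflow handling assumed). */
static void cal(CallStack *s, int return_instruction) {
    if (s->depth >= MAX_CALL_DEPTH) { s->aborted = 1; return; }
    s->entries[s->depth].kind = ENTRY_INSTRUCTION_REF;
    s->entries[s->depth].value[0] = return_instruction;
    s->depth++;
}

/* POPA: pop into an address register; abort on underflow or if the top
   entry was not pushed by PUSHA. */
static void popa(CallStack *s, int addr[4]) {
    if (s->depth <= 0) { s->aborted = 1; return; }
    s->depth--;
    if (s->entries[s->depth].kind != ENTRY_ADDRESS_REGISTER) {
        s->aborted = 1;
        return;
    }
    for (int i = 0; i < 4; i++) addr[i] = s->entries[s->depth].value[i];
}
```

A PUSHA followed by a POPA round-trips the register, whereas a POPA that finds an entry pushed by CAL, or an empty stack, marks the program as abnormally terminated.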
Section 2.14.5.45, TEX: Texture Lookup
The TEX instruction uses the single vector operand to perform a
lookup in the specified texture map, yielding a 4-component result
vector containing filtered texel values. The (s,t,r,q) coordinates
used for the texture lookup are (x,y,z,1), where x, y, and z are
components of the vector operand.
tmp = VectorLoad(op0);
result = TextureSample(tmp.x, tmp.y, tmp.z, 1.0, 0.0, unit, target);
where <unit> and <target> are the texture image unit number and
target type, matching the <texImageUnitNum> and <texTargetType>
grammar rules.
The resulting sample is mapped to RGBA as described in Table 3.21,
and the R, G, B, and A values are written to the x, y, z, and w
components, respectively, of the result vector.
Since partial derivatives of the texture coordinates are not defined,
the base LOD value for vertex texture lookups is defined to be
zero. The value of lambda' used in equation 3.16 will be simply
clamp(texobj_bias + texunit_bias).
Section 2.14.5.46, TXB: Texture Lookup (With LOD Bias)
The TXB instruction uses the single vector operand to perform a
lookup in the specified texture map, yielding a 4-component result
vector containing filtered texel values. The (s,t,r,q) coordinates
used for the texture lookup are (x,y,z,1), where x, y, and z are
components of the vector operand. The w component of the operand
is used as an additional LOD bias.
tmp = VectorLoad(op0);
result = TextureSample(tmp.x, tmp.y, tmp.z, 1.0, tmp.w, unit, target);
where <unit> and <target> are the texture image unit number and
target type, matching the <texImageUnitNum> and <texTargetType>
grammar rules.
The resulting sample is mapped to RGBA as described in Table 3.21,
and the R, G, B, and A values are written to the x, y, z, and w
components, respectively, of the result vector.
Since partial derivatives of the texture coordinates are not defined,
the base LOD value for vertex texture lookups is defined to be
zero. The value of lambda' used in equation 3.16 will be simply
clamp(texobj_bias + texunit_bias + tmp.w).
Since the base LOD value is zero, the TXB instruction is completely
equivalent to the TXL instruction, where the w component contains
an explicit base LOD value.
Section 2.14.5.47, TXL: Texture Lookup (With Explicit LOD)
The TXL instruction uses the single vector operand to perform a
lookup in the specified texture map, yielding a 4-component result
vector containing filtered texel values. The (s,t,r,q) coordinates
used for the texture lookup are (x,y,z,1), where x, y, and z are
components of the vector operand. The w component of the operand
is used as the base LOD for the texture lookup.
tmp = VectorLoad(op0);
result = TextureSampleLOD(tmp.x, tmp.y, tmp.z, 1.0, tmp.w, unit, target);
where <unit> and <target> are the texture image unit number and
target type, matching the <texImageUnitNum> and <texTargetType>
grammar rules.
The resulting sample is mapped to RGBA as described in Table 3.21,
and the R, G, B, and A values are written to the x, y, z, and w
components, respectively, of the result vector.
The value of lambda' used in equation 3.16 will be simply tmp.w +
clamp(texobj_bias + texunit_bias), where tmp.w is the base LOD.
Section 2.14.5.48, TXP: Texture Lookup (Projective)
The TXP instruction uses the single vector operand to perform a
lookup in the specified texture map, yielding a 4-component result
vector containing filtered texel values. The (s,t,r,q) coordinates
used for the texture lookup are (x,y,z,w), where x, y, z, and w are
the four components of the vector operand.
tmp = VectorLoad(op0);
result = TextureSample(tmp.x, tmp.y, tmp.z, tmp.w, 0.0, unit, target);
where <unit> and <target> are the texture image unit number and
target type, matching the <texImageUnitNum> and <texTargetType>
grammar rules.
The resulting sample is mapped to RGBA as described in Table 3.21,
and the R, G, B, and A values are written to the x, y, z, and w
components, respectively, of the result vector.
Since partial derivatives of the texture coordinates are not defined,
the base LOD value for vertex texture lookups is defined to be
zero. The value of lambda' used in equation 3.16 will be simply
clamp(texobj_bias + texunit_bias).
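The lambda' expressions for the four texture instructions can be transcribed side by side. In this sketch the function names are illustrative, and the clamp range is assumed to be [-max_bias, +max_bias], with max_bias standing in for the implementation's maximum texture LOD bias.

```c
/* Clamps v to [lo, hi]. */
static float clampf(float v, float lo, float hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* TEX and TXP: base LOD is zero, so only the clamped biases remain. */
static float lambda_tex(float texobj_bias, float texunit_bias, float max_bias) {
    return clampf(texobj_bias + texunit_bias, -max_bias, max_bias);
}

/* TXB: the operand's w component joins the bias sum inside the clamp. */
static float lambda_txb(float texobj_bias, float texunit_bias,
                        float max_bias, float w) {
    return clampf(texobj_bias + texunit_bias + w, -max_bias, max_bias);
}

/* TXL: the operand's w component is the base LOD, added outside the clamp. */
static float lambda_txl(float texobj_bias, float texunit_bias,
                        float max_bias, float w) {
    return w + clampf(texobj_bias + texunit_bias, -max_bias, max_bias);
}
```

With the formulas transcribed literally like this, TXB and TXL produce the same value whenever the bias sum stays inside the clamp range.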
Additions to Chapter 3 of the OpenGL 1.4 Specification (Rasterization)
None.
Additions to Chapter 4 of the OpenGL 1.4 Specification (Per-Fragment
Operations and the Frame Buffer)
None.
Additions to Chapter 5 of the OpenGL 1.4 Specification (Special Functions)
None.
Additions to Chapter 6 of the OpenGL 1.4 Specification (State and
State Requests)
None.
Additions to Appendix A of the OpenGL 1.4 Specification (Invariance)
None.
Additions to the AGL/GLX/WGL Specifications
None.
Dependencies on ARB_vertex_program
ARB_vertex_program is required.
This specification and NV_vertex_program2_option are based on a
modified version of the grammar published in the ARB_vertex_program
specification. This modified grammar includes a few structural
changes to better accommodate new functionality from this and
other extensions, but should be functionally equivalent to the
ARB_vertex_program grammar. See NV_vertex_program2_option for
details on the base grammar.
Dependencies on NV_vertex_program2_option
NV_vertex_program2_option is required.
If the NV_vertex_program3 program option is specified, all
the functionality described in both this extension and the
NV_vertex_program2_option specification is available.
Dependencies on ARB_fragment_program_shadow
If this extension and ARB_fragment_program_shadow are both supported,
vertex programs may include the option statement:
OPTION ARB_fragment_program_shadow;
which enables the use of SHADOW1D, SHADOW2D, and SHADOWRECT texture
targets in texture lookup instructions, as described in the
ARB_fragment_program_shadow specification.
NVIDIA NOTE: Drivers prior to September 2006 do not support the use of
this option, and will not accept texture lookups with SHADOW1D, SHADOW2D,
and SHADOWRECT targets. Shadow mapping in vertex programs will result in
software fallbacks on GeForce 6 and GeForce 7 series GPUs, but may be done
in hardware on future GPUs.
Errors
None.
New State
None.
New Implementation Dependent State:
                                        Minimum
Get Value            Type  Get Command  Value    Description                   Section   Attr.
-------------------  ----  -----------  -------  ----------------------------  --------  -----
MAX_VERTEX_TEXTURE_  Z+    GetIntegerv  1        Number of separate texture    2.14.3.7  -
  IMAGE_UNITS_ARB                                image units that can be
                                                 accessed by a vertex program
Revision History
Rev.  Date      Author    Changes
----  --------  --------  --------------------------------------------
7     10/12/09  pbrown    Update grammar/documentation of PUSHA/POPA to
                          reflect the implementation. <instResultAddr> is
                          used for POPA with some semantic checks. Note
                          that some driver versions erroneously allowed
                          conditional write masks on POPA. Also clarify
                          that ARB_fragment_program_shadow includes
                          support for "SHADOWRECT".
6     09/27/06  pbrown    Document that ARB_fragment_program_shadow is
                          allowed, to enable the use of "SHADOW1D" and
                          "SHADOW2D" targets for texture lookups.
5     11/07/05  pbrown    Fix PUSHA documentation to specify the right
                          constant name used for overflow testing.
4     09/01/05  pbrown    Fix spec language to document that a vertex
                          program will fail to compile if it uses "too
                          many" textures -- previously only documented
                          in the issues section.
3     08/25/05  pbrown    Document that using a different texture target
                          than fragment processing on the same texture
                          unit results in an INVALID_OPERATION error at
                          Begin time. This is consistent with GLSL
                          language in the ARB_shader_objects and OpenGL
                          2.0 specifications. The implementation has
                          always done this, but it was overlooked in
                          the spec language.
2     06/23/04  pbrown    Documented that vertex results are undefined
                          when a vertex program terminates abnormally
                          (e.g., PUSHA/POPA stack overflow/underflow).
                          Documented error in RET if the top of the call
                          stack contains a value written by PUSHA.
1     --------  pbrown    Initial pre-release revisions.