D3DBook:Shaders

From GDWiki

Shaders

Any stage that is not configured with a state object needs a shader program to process the data. Typical programs contain sections that calculate positions, texture coordinates or colors.

Common Shader core

All three shader stages are based on the same common shader core, which defines the general function set. Every shader core receives data from the previous pipeline stage through its input registers and feeds the following stages through its output registers. Additional sources are the 16 constant buffers and the 128 attached shader resources, which are accessed through views. A Direct3D 10 shader core can read these resources directly or use one of the 16 attached sampler objects. Every shader core can work with 32-bit floating-point and integer values, including bitwise operations.

HLSL

The programs that are executed by the three shader cores are written in HLSL (High Level Shader Language). This C-based language was introduced with Direct3D 9 as an alternative to the assembler-like shader programming of Direct3D 8. With Direct3D 10 it becomes the only way to write shader programs.

HLSL variable types

As a C derivative, HLSL reuses some of the variable types of that language.


Type Description
bool Boolean data (true or false)
int 32-bit signed integer.
uint 32-bit unsigned integer.
half 16-bit floating-point type. Supported only for compatibility with HLSL for earlier Direct3D versions.
float 32-bit floating-point type.
double 64-bit floating-point type.
string Text type. Cannot be used in shader programs; supported only by the effect framework.

Table: HLSL base variable types

As an addition to the C language, the float type can be limited in its range. The two type modifiers snorm and unorm restrict it to the ranges -1 to 1 and 0 to 1, respectively.

Another difference from C is the native support for vector and matrix types. Every base type can be used to create vectors with up to four elements and matrices of any size up to four elements in each direction. There are two different ways to specify a vector or matrix.

You can append the number of rows and columns to the name of the base type:

float2 texCoord;   // renamed: "texture" is a reserved word in HLSL
float4 position;
float1 color;
float4x4 view;

A vector or a matrix with a size of one is not the same as the base type.

The more verbose alternative uses the template syntax known from C++:

vector <float,2> texCoord;   // renamed: "texture" is a reserved word in HLSL
vector <float,4> position;
vector <float,1> color;
matrix <float, 4, 4> view;

To define more complex types, you can combine base, vector and matrix types into structs.

struct Input
{
    float3 Position;
    float3 Normal;
    float2 Texturecoordinates;
};

The last C element that made it into HLSL is the typedef.

HLSL functions

To declare and define a function, HLSL follows C but makes some changes.

Each parameter that is part of the function parameter list can have an additional modifier.


modifier Description
in The parameter is only used as input.
out The parameter is only used as output.
inout The parameter is used in both directions.
uniform Uniform parameters are provided from the CPU and stay the same for all elements that are part of the same draw call.

Table: HLSL function parameter modifier

Wherever possible, HLSL implements functions from the common C runtime library in the same way that is already known. Other functions are new, as they have no counterpart in C. The number of both kinds is limited:


Name Syntax Description
abs abs(x) Returns the absolute value of x (per component).
acos acos(x) Returns the arccosine of each component of x. Each component should be in the range [-1, 1].
all all(x) Test if all components of x are nonzero.
any any(x) Test if any component of x is nonzero.
append append(x) Appends data to the geometry shader output stream.
asfloat asfloat(x) Reinterprets the bit pattern of x as a float.
asin asin(x) Returns the arcsine of each component of x. Each component should be in the range [-1, 1]. The return values are in the range [-pi/2, pi/2].
asint asint(x) Reinterprets the bit pattern of x as a signed integer.
asuint asuint(x) Reinterprets the bit pattern of x as an unsigned integer.
atan atan(x) Returns the arctangent of x. The return values are in the range [-pi/2, pi/2].
atan2 atan2(y, x) Returns the arctangent of y/x. The signs of y and x are used to determine the quadrant of the return values in the range [-pi, pi]. atan2 is well-defined for every point other than the origin, even if x equals 0 and y does not equal 0.
ceil ceil(x) Returns the smallest integer which is greater than or equal to x.
clamp clamp(x, min, max) Clamps x to the range [min, max].
clip clip(x) Discards the current pixel, if any component of x is less than zero. This can be used to simulate clip planes, if each component of x represents the distance from a plane.
cos cos(x) Returns the cosine of x.
cosh cosh(x) Returns the hyperbolic cosine of x.
cross cross(a, b) Returns the cross product of two 3D vectors a and b.
D3DCOLORtoUBYTE4 D3DCOLORtoUBYTE4(x) Swizzles and scales components of the 4D vector x to compensate for the lack of UBYTE4 support in some hardware.
ddx ddx(x) Returns the partial derivative of x with respect to the screen-space x-coordinate.
ddy ddy(x) Returns the partial derivative of x with respect to the screen-space y-coordinate.
degrees degrees(x) Converts x from radians to degrees.
determinant determinant(m) Returns the determinant of the square matrix m.
distance distance(a, b) Returns the distance between two points, a and b.
dot dot(a, b) Returns the dot product of two vectors, a and b.
exp exp(x) Returns the base-e exponential of x.
exp2 exp2(x) Returns the base-2 exponential of x (per component).
faceforward faceforward(n, i, ng) Returns -n * sign(dot(i, ng)).
floor floor(x) Returns the greatest integer which is less than or equal to x.
fmod fmod(a, b) Returns the floating point remainder f of a / b such that a = i * b + f, where i is an integer, f has the same sign as a, and the absolute value of f is less than the absolute value of b.
frac frac(x) Returns the fractional part f of x, such that f is a value greater than or equal to 0, and less than 1.
frexp frexp(x, exp) Returns the mantissa and exponent of x. frexp returns the mantissa, and the exponent is stored in the output parameter exp. If x is 0, the function returns 0 for both the mantissa and the exponent.
fwidth fwidth(x) Returns abs(ddx(x)) + abs(ddy(x))
isfinite isfinite(x) Returns true if x is finite, false otherwise.
isinf isinf(x) Returns true if x is +INF or -INF, false otherwise.
isnan isnan(x) Returns true if x is NAN or QNAN, false otherwise.
ldexp ldexp(x, exp) Returns x * 2^exp.
length length(v) Returns the length of the vector v.
lerp lerp(a, b, s) Returns a + s(b - a). This linearly interpolates between a and b, such that the return value is a when s is 0, and b when s is 1.
lit lit(n dot l, n dot h, m) Returns a lighting vector (ambient, diffuse, specular, 1): ambient = 1; diffuse = (n dot l < 0) ? 0 : n dot l; specular = (n dot l < 0) || (n dot h < 0) ? 0 : pow(n dot h, m).
log log(x) Returns the base-e logarithm of x. If x is negative, the function returns indefinite. If x is 0, the function returns -INF.
log10 log10(x) Returns the base-10 logarithm of x. If x is negative, the function returns indefinite. If x is 0, the function returns -INF.
log2 log2(x) Returns the base-2 logarithm of x. If x is negative, the function returns indefinite. If x is 0, the function returns -INF.
max max(a, b) Selects the greater of a and b.
min min(a, b) Selects the lesser of a and b.
modf modf(x, out ip) Splits the value x into fractional and integer parts, each of which has the same sign as x. The signed fractional portion of x is returned. The integer portion is stored in the output parameter ip.
mul mul(a, b) Performs matrix multiplication between a and b. If a is a vector, it is treated as a row vector. If b is a vector, it is treated as a column vector. The inner dimensions, that is the number of columns of a and the number of rows of b, must be equal. The result has the dimension (rows of a) x (columns of b).
noise noise(x) Not yet implemented.
normalize normalize(v) Returns the normalized vector v / length(v). If the length of v is 0, the result is indefinite.
pow pow(x, y) Returns x^y.
radians radians(x) Converts x from degrees to radians.
reflect reflect(i, n) Returns the reflection vector v, given the entering ray direction i, and the surface normal n, such that v = i - 2 * dot(i, n) * n.
refract refract(i, n, R) Returns the refraction vector v, given the entering ray direction i, the surface normal n, and the relative index of refraction R. If the angle between i and n is too great for a given R, refract returns (0,0,0).
restartstrip restartstrip() Restart the current primitive strip and start a new primitive strip.
round round(x) Rounds x to the nearest integer
rsqrt rsqrt(x) Returns 1 / sqrt(x)
saturate saturate(x) Clamps x to the range [0, 1]
sign sign(x) Computes the sign of x. Returns -1 if x is less than 0, 0 if x equals 0, and 1 if x is greater than zero.
sin sin(x) Returns the sine of x
sincos sincos(x, out s, out c) Returns the sine and cosine of x. sin(x) is stored in the output parameter s. cos(x) is stored in the output parameter c.
sinh sinh(x) Returns the hyperbolic sine of x
smoothstep smoothstep(min, max, x) Returns 0 if x < min. Returns 1 if x > max. Returns a smooth Hermite interpolation between 0 and 1, if x is in the range [min, max].
sqrt sqrt(x) Returns the square root of x (per component).
step step(a, x) Returns (x >= a) ? 1 : 0
tan tan(x) Returns the tangent of x
tanh tanh(x) Returns the hyperbolic tangent of x
tex1D tex1D(s, t) 1D texture lookup. s is a sampler or a sampler1D object. t is a scalar.
tex1Dgrad tex1Dgrad(s, t, ddx, ddy) 1D gradient texture lookup. s is a sampler or sampler1D object. t is a 4D vector. The gradient values (ddx, ddy) select the appropriate mipmap level of the texture for sampling.
tex1Dlod tex1Dlod(s, t) 1D texture lookup with LOD. s is a sampler or sampler1D object. t is a 4D vector. The mipmap LOD is specified in t.
tex1Dproj tex1Dproj(s, t) 1D projective texture lookup. s is a sampler or sampler1D object. t is a 4D vector. t is divided by its last component before the lookup takes place.
tex2D tex2D(s, t) 2D texture lookup. s is a sampler or a sampler2D object. t is a 2D texture coordinate.
tex2Dgrad tex2Dgrad(s, t, ddx, ddy) 2D gradient texture lookup. s is a sampler or sampler2D object. t is a 4D vector. The gradient values (ddx, ddy) select the appropriate mipmap level of the texture for sampling.
tex2Dlod tex2Dlod(s, t) 2D texture lookup with LOD. s is a sampler or sampler2D object. t is a 4D vector. The mipmap LOD is specified in t.
tex2Dproj tex2Dproj(s, t) 2D projective texture lookup. s is a sampler or sampler2D object. t is a 4D vector. t is divided by its last component before the lookup takes place.
tex3D tex3D(s, t) 3D volume texture lookup. s is a sampler or a sampler3D object. t is a 3D texture coordinate.
tex3Dgrad tex3Dgrad(s, t, ddx, ddy) 3D gradient texture lookup. s is a sampler or sampler3D object. t is a 4D vector. The gradient values (ddx, ddy) select the appropriate mipmap level of the texture for sampling.
tex3Dlod tex3Dlod(s, t) 3D texture lookup with LOD. s is a sampler or sampler3D object. t is a 4D vector. The mipmap LOD is specified in t.
tex3Dproj tex3Dproj(s, t) 3D projective volume texture lookup. s is a sampler or sampler3D object. t is a 4D vector. t is divided by its last component before the lookup takes place.
texCUBE texCUBE(s, t) 3D cube texture lookup. s is a sampler or a samplerCUBE object. t is a 3D texture coordinate.
texCUBEgrad texCUBEgrad(s, t, ddx, ddy) 3D gradient cube texture lookup. s is a sampler or samplerCUBE object. t is a 4D vector. The gradient values (ddx, ddy) select the appropriate mipmap level of the texture for sampling.
texCUBElod texCUBElod(s, t) 3D cube texture lookup with LOD. s is a sampler or samplerCUBE object. t is a 4D vector. The mipmap LOD is specified in t.
texCUBEproj texCUBEproj(s, t) 3D projective cube texture lookup. s is a sampler or samplerCUBE object. t is a 4D vector. t is divided by its last component before the lookup takes place.
transpose transpose(m) Returns the transpose of the matrix m. If the source has the dimension rows x columns, the result has the dimension columns x rows.

Table: HLSL intrinsic functions
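
To make a few of the table entries more concrete, the following is a minimal CPU-side sketch in C++ of what saturate, lerp, step, smoothstep and reflect compute, following the formulas given in the table. It is not how the GPU or the HLSL compiler implements them. The Float3 type and the capitalized function names are made up for this illustration and are not part of any Direct3D header, and the cubic 3t^2 - 2t^3 used for smoothstep is the usual Hermite polynomial, which the table itself does not spell out.

```cpp
#include <algorithm>
#include <array>

// Hypothetical stand-in for an HLSL float3 (illustration only).
using Float3 = std::array<float, 3>;

// saturate(x): clamps x to the range [0, 1].
inline float Saturate(float x)
{
    return std::min(std::max(x, 0.0f), 1.0f);
}

// lerp(a, b, s): returns a + s * (b - a).
inline float Lerp(float a, float b, float s)
{
    return a + s * (b - a);
}

// step(a, x): returns 1 if x >= a, otherwise 0.
inline float Step(float a, float x)
{
    return (x >= a) ? 1.0f : 0.0f;
}

// smoothstep(min, max, x): 0 below min, 1 above max, and a smooth
// Hermite curve (assumed here to be 3t^2 - 2t^3) in between.
inline float SmoothStep(float lo, float hi, float x)
{
    const float t = Saturate((x - lo) / (hi - lo));
    return t * t * (3.0f - 2.0f * t);
}

// dot(a, b): dot product of two 3D vectors.
inline float Dot(const Float3& a, const Float3& b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// reflect(i, n): returns i - 2 * dot(i, n) * n.
inline Float3 Reflect(const Float3& i, const Float3& n)
{
    const float d = 2.0f * Dot(i, n);
    return { i[0] - d * n[0], i[1] - d * n[1], i[2] - d * n[2] };
}
```

For example, a ray travelling along (1, -1, 0) that hits a surface with the normal (0, 1, 0) is reflected to (1, 1, 0), and Lerp(2, 4, 0.5) yields 3. In HLSL the same functions work per component on vectors of any supported size.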

HLSL classes

Although primarily C based, HLSL knows C++-like object classes: one for each shader resource type and another for the output stream that is written by a geometry shader.

For every valid type that can be used in a shader resource view object there is an HLSL object class:


HLSL type Shader resource view type Description
Buffer D3D10_SRV_DIMENSION_BUFFER A buffer resource.
Texture1D D3D10_SRV_DIMENSION_TEXTURE1D A one-dimensional texture.
Texture1DArray D3D10_SRV_DIMENSION_TEXTURE1DARRAY An array of one-dimensional textures.
Texture2D D3D10_SRV_DIMENSION_TEXTURE2D A two-dimensional texture.
Texture2DArray D3D10_SRV_DIMENSION_TEXTURE2DARRAY An array of two-dimensional textures.
Texture2DMS D3D10_SRV_DIMENSION_TEXTURE2DMS A two-dimensional texture with multisampling.
Texture2DMSArray D3D10_SRV_DIMENSION_TEXTURE2DMSARRAY An array of two-dimensional textures with multisampling.
Texture3D D3D10_SRV_DIMENSION_TEXTURE3D A three-dimensional texture.
TextureCube D3D10_SRV_DIMENSION_TEXTURECUBE A cube texture.

The HLSL shader resource objects support the following methods.


Method Description
GetDimensions Reports the width, height and number of mip maps. Cannot be used with views that represent a buffer.
Load Loads a texture value without using a sampler.
Sample Uses a sampler to read from a texture.
SampleCmp Like Sample, but this method does an additional comparison against a provided reference value.
SampleCmpLevelZero Works like SampleCmp but only uses mip map level zero.
SampleGrad Uses the provided gradients for the sample instead of calculating them.
SampleLevel Samples from a specified mip map level.

Table: HLSL texture object methods


HLSL flow control attributes

As an extension, some of the C flow control statements support attributes that control how the compiler handles them.

  • for/while
    • unroll(x): instead of inserting loop instructions, the compiler unrolls the loop.
    • loop: the compiler uses a real loop.
  • if
    • branch: inserts code that lets the shader decide at runtime which side is taken.
    • flatten: always calculates both sides and then decides which results are used.
  • switch
    • branch: converts the switch to a cascade of if instructions that use the "branch" attribute.
    • flatten: executes all cases and decides at the end which results are taken.
    • forcecase: forces the compiler to use a real hardware case.
    • call: the compiler generates a sub function for each case.

Geometry Shader

If a shader is used in the geometry shader stage, it can use two additional HLSL functions:

  • Append: Adds an additional vertex to the output stream.
  • RestartStrip: Terminates the current strip primitive and starts a new one.

To make use of them, the shader function needs a stream object as an inout parameter. HLSL knows three different types of stream objects:

  • PointStream
  • LineStream
  • TriangleStream

As template object types, they need to be used together with a structure that describes the output vertex format. This can differ from the input format.

As HLSL supports these two functions inside a loop, a variable number of vertices can be output with each shader invocation. To allow for future GPU optimizations, it is necessary to declare the maximum number of vertices that will be generated.

[maxvertexcount(10)]
 
void GSMain(point GSIn input[1], inout PointStream<GSOut> PointOutputStream)
{
    ...
}

Pixel Shader

A Direct3D 10 pixel shader is mostly identical to the common shader core. But as its input values can be based on the output of multiple vertex shader runs, it needs additional information. Therefore HLSL provides four interpolation usage specifiers to manage this behavior.


Modifier Description
linear The values are linearly interpolated between the output values of the used vertices. This is the default modifier.
centroid Uses centroid interpolation to improve results in the case of anti-aliasing.
nointerpolation Values are not interpolated. This is the only mode available for integer variables.
noperspective The values are interpolated but without perspective correction.

Table: pixel shader interpolation modifiers

Compile Shader

Even with HLSL as the only way to program the shader stages, it is still necessary to compile your shaders into a binary format before they can be used to create shader objects. This makes it possible to do this expensive operation ahead of time. As the compiler is part of the core runtime, you can compile all your shaders during the setup process. It is even possible to compile the shaders as part of the build process and ship only the binary code. In this case the HLSL code does not need to leave the development system.

As already shown, the common shader core is only the foundation, and each shader stage has different limitations. Therefore it is necessary to tell the compiler on which shader stage the result of the compilation will later be used. Besides adjusting all validation checks, the selected profile is included in the binary shader for later use.

In addition to the primary output, the compile method can deliver a human-readable list of errors and warnings.

Create Shader

In the end, the binary shader code is used to create the real shader object. It makes no difference whether it was compiled directly or loaded from disk. In every case the runtime checks whether the shader was compiled with the right signature for the type you want to create. Additionally, there is a hash signature that provides some kind of protection against modification. If it doesn't match, the shader object will not be created.

To create a normal shader you need only the binary code and its size. But as there is no state object for the stream out unit, comparable to the input layout object for the input assembler, this information needs to be provided during the creation of the geometry shader that will generate the output stream.

Reflect Shader

Instead of creating a shader object, Direct3D 10 can create a shader reflection object from the binary code. Besides enumerating all input and output registers, you can query it for the resources that the shader consumes. This is useful if you want to write your own effect framework instead of using the default one. Finally, it gives you access to statistical information about the compiled shader.

The following example shows how to create a shader reflection and loop over all elements it contains.

// pCode and CodeSize refer to the compiled binary shader code.
ID3D10ShaderReflection* pReflector = NULL;
D3D10ReflectShader (pCode, CodeSize, &pReflector);

// General description first
D3D10_SHADER_DESC ShaderDesc;
pReflector->GetDesc (&ShaderDesc);

// enumerate the input elements
for (UINT Input = 0 ; Input < ShaderDesc.InputParameters ; ++Input)
{
    D3D10_SIGNATURE_PARAMETER_DESC InputDesc;
    pReflector->GetInputParameterDesc (Input, &InputDesc);
}

// enumerate the output elements
for (UINT Output = 0 ; Output < ShaderDesc.OutputParameters ; ++Output)
{
    D3D10_SIGNATURE_PARAMETER_DESC OutputDesc;
    pReflector->GetOutputParameterDesc (Output, &OutputDesc);
}

// enumerate the resource bindings
for (UINT Resource = 0 ; Resource < ShaderDesc.BoundResources ; ++Resource)
{
    D3D10_SHADER_INPUT_BIND_DESC ResourceDesc;
    pReflector->GetResourceBindingDesc (Resource, &ResourceDesc);
}
 
// enumerate the constant buffers
for (UINT Buffer = 0 ; Buffer < ShaderDesc.ConstantBuffers ; ++Buffer)
{
    ID3D10ShaderReflectionConstantBuffer* pBuffer = 
        pReflector->GetConstantBufferByIndex (Buffer);
 
    D3D10_SHADER_BUFFER_DESC BufferDesc;
    pBuffer->GetDesc (&BufferDesc);
 
    // enumerate the variables in the buffer
    for (UINT Variable = 0 ; Variable < BufferDesc.Variables; ++Variable)
    {
        ID3D10ShaderReflectionVariable* pVariable =
            pBuffer->GetVariableByIndex (Variable);
 
        D3D10_SHADER_VARIABLE_DESC VariableDesc;
        pVariable->GetDesc (&VariableDesc);
    }
}
 
// Finally release the reflection object. 
pReflector->Release ();