Any stage that is not configured with a state object needs a shader program to process the data. Typical programs contain sections that calculate positions, texture coordinates or colors.
All three shader stages are based on the same common shader core, which defines the general function set. Every shader core receives data from the previous pipeline stage through its input registers and feeds the following stages through its output registers. Additional sources are the 16 constant buffers and the 128 attached shader resources, which are accessed through views. A Direct3D 10 shader core can read these resources directly or use one of the 16 attached sampler objects. Every shader core can work with 32-bit floating-point and integer values, including bitwise operations.
The programs that are executed by the three shader cores are written in HLSL (High Level Shading Language). This C-based language was introduced with Direct3D 9 as an alternative to the assembler-like shader programming of Direct3D 8. With Direct3D 10 it becomes the only way to write shader programs.
As a C derivative, HLSL reuses some of the variable types of that language.
| Type | Description |
| bool | Boolean data (true or false) |
| int | Signed integer with 32 bit |
| uint | Unsigned integer with 32 bit |
| half | Floating point type with 16 bit. It is only supported to be compatible with HLSL for former Direct3D versions. |
| float | Floating point type with 32 bit. |
| double | Floating point type with 64 bit. |
| string | Text type. Cannot be used in shader programs. Supported only by the effect framework. |
Table: HLSL base variable types
As an addition to the C language, the float type can be limited in its range. The two type modifiers snorm and unorm restrict a value to the ranges -1 to 1 and 0 to 1, respectively.
Another difference from C is the native support of vector and matrix variable types. You can use every base type to create vectors with up to four elements and matrices of any size up to four elements in each direction. There are two different ways to specify a vector or matrix.
You can add the number of rows and columns to the name of the base type:

    float2 texCoord;   // "texture" is a reserved word in HLSL, so a different name is used
    float4 position;
    float1 color;
    float4x4 view;
A vector or a matrix with a size of one is not the same as the base type.
The more verbose alternative uses the template syntax known from C++:

    vector <float, 2> texCoord;   // "texture" is a reserved word in HLSL, so a different name is used
    vector <float, 4> position;
    vector <float, 1> color;
    matrix <float, 4, 4> view;
To define more complex types you can combine base, vector and matrix types into structs.

    struct Input
    {
        float3 Position;
        float3 Normal;
        float2 Texturecoordinates;
    };
The last C element that made it into HLSL is the typedef.
HLSL follows C when declaring and defining functions, but makes some changes.
Each parameter that is part of the function parameter list can have an additional modifier.
| modifier | Description |
| in | The parameter is only used as input. |
| out | The parameter is only used as output. |
| inout | The parameter is used in both directions. |
| uniform | Uniform parameters are provided from the CPU and stay the same for all elements that are part of the same draw call. |
Table: HLSL function parameter modifiers
Wherever possible, HLSL implements functions from the common C runtime library in the way that is already known. Other functions are new because they have no counterpart in C. The number of functions of both kinds is limited.
| Name | Syntax | Description | |
| abs | abs(x) | Returns the absolute value of x (per component). | |
| acos | acos(x) | Returns the arccosine of each component of x. Each component should be in the range [-1, 1]. | |
| all | all(x) | Test if all components of x are nonzero. | |
| any | any(x) | Test if any component of x is nonzero. | |
| append | append(x) | Append data to the geometry shader out stream. | |
| asfloat | asfloat(x) | Interprets the bit pattern of x as a float. | |
| asin | asin(x) | Returns the arcsine of each component of x. Each component should be in the range [-1, 1]. The return values are in the range [-pi/2, pi/2]. | |
| asint | asint(x) | Interprets the bit pattern of x as an integer. | |
| asuint | asuint(x) | Interprets the bit pattern of x as an unsigned integer. | |
| atan | atan(x) | Returns the arctangent of x. The return values are in the range [-pi/2, pi/2]. | |
| atan2 | atan2(y, x) | Returns the arctangent of y/x. The signs of y and x are used to determine the quadrant of the return values in the range [-pi, pi]. atan2 is well-defined for every point other than the origin, even if x equals 0 and y does not equal 0. | |
| ceil | ceil(x) | Returns the smallest integer which is greater than or equal to x. | |
| clamp | clamp(x, min, max) | Clamps x to the range [min, max]. | |
| clip | clip(x) | Discards the current pixel, if any component of x is less than zero. This can be used to simulate clip planes, if each component of x represents the distance from a plane. | |
| cos | cos(x) | Returns the cosine of x. | |
| cosh | cosh(x) | Returns the hyperbolic cosine of x. | |
| cross | cross(a, b) | Returns the cross product of two 3D vectors a and b. | |
| D3DCOLORtoUBYTE4 | D3DCOLORtoUBYTE4(x) | Swizzles and scales components of the 4D vector x to compensate for the lack of UBYTE4 support in some hardware. | |
| ddx | ddx(x) | Returns the partial derivative of x with respect to the screen-space x-coordinate. | |
| ddy | ddy(x) | Returns the partial derivative of x with respect to the screen-space y-coordinate. | |
| degrees | degrees(x) | Converts x from radians to degrees. | |
| determinant | determinant(m) | Returns the determinant of the square matrix m. | |
| distance | distance(a, b) | Returns the distance between two points, a and b. | |
| dot | dot(a, b) | Returns the dot product of two vectors, a and b. | |
| exp | exp(x) | Returns the base-e exponential e^x (per component). | |
| exp2 | exp2(x) | Returns the base-2 exponential 2^x (per component). | |
| faceforward | faceforward(n, i, ng) | Returns -n * sign(dot(i, ng)). | |
| floor | floor(x) | Returns the greatest integer which is less than or equal to x. | |
| fmod | fmod(a, b) | Returns the floating point remainder f of a / b such that a = i * b + f, where i is an integer, f has the same sign as a, and the absolute value of f is less than the absolute value of b. | |
| frac | frac(x) | Returns the fractional part f of x, such that f is a value greater than or equal to 0, and less than 1. | |
| frexp | frexp(x, exp) | Returns the mantissa and exponent of x. frexp returns the mantissa, and the exponent is stored in the output parameter exp. If x is 0, the function returns 0 for both the mantissa and the exponent. | |
| fwidth | fwidth(x) | Returns abs(ddx(x)) + abs(ddy(x)) | |
| isfinite | isfinite(x) | Returns true if x is finite, false otherwise. | |
| isinf | isinf(x) | Returns true if x is +INF or -INF, false otherwise. | |
| isnan | isnan(x) | Returns true if x is NAN or QNAN, false otherwise. | |
| ldexp | ldexp(x, exp) | Returns x * 2^exp. | |
| length | length(v) | Returns the length of the vector v. | |
| lerp | lerp(a, b, s) | Returns a + s(b - a). This linearly interpolates between a and b, such that the return value is a when s is 0, and b when s is 1. | |
| lit | lit(n dot l, n dot h, m) | Returns a lighting vector (ambient, diffuse, specular, 1): ambient = 1; diffuse = (n dot l < 0) ? 0 : n dot l; specular = 0 if n dot l < 0 or n dot h < 0, otherwise (n dot h)^m. | |
| log | log(x) | Returns the base-e logarithm of x. If x is negative, the function returns indefinite. If x is 0, the function returns +INF. | |
| log10 | log10(x) | Returns the base-10 logarithm of x. If x is negative, the function returns indefinite. If x is 0, the function returns +INF. | |
| log2 | log2(x) | Returns the base-2 logarithm of x. If x is negative, the function returns indefinite. If x is 0, the function returns +INF. | |
| max | max(a, b) | Selects the greater of a and b. | |
| min | min(a, b) | Selects the lesser of a and b. | |
| modf | modf(x, out ip) | Splits the value x into fractional and integer parts, each of which has the same sign as x. The signed fractional portion of x is returned. The integer portion is stored in the output parameter ip. | |
| mul | mul(a, b) | Performs matrix multiplication between a and b. If a is a vector, it is treated as a row vector. If b is a vector, it is treated as a column vector. The inner dimensions (the columns of a and the rows of b) must be equal. The result has the dimension (rows of a) x (columns of b). | |
| noise | noise(x) | Not yet implemented. | |
| normalize | normalize(v) | Returns the normalized vector v / length(v). If the length of v is 0, the result is indefinite. | |
| pow | pow(x, y) | Returns x^y. | |
| radians | radians(x) | Converts x from degrees to radians. | |
| reflect | reflect(i, n) | Returns the reflection vector v, given the entering ray direction i, and the surface normal n, such that v = i - 2 * n * dot(i, n). | |
| refract | refract(i, n, R) | Returns the refraction vector v, given the entering ray direction i, the surface normal n, and the relative index of refraction R. If the angle between i and n is too great for a given R, refract returns (0,0,0). | |
| restartstrip | restartstrip() | Restart the current primitive strip and start a new primitive strip. | |
| round | round(x) | Rounds x to the nearest integer | |
| rsqrt | rsqrt(x) | Returns 1 / sqrt(x) | |
| saturate | saturate(x) | Clamps x to the range [0, 1] | |
| sign | sign(x) | Computes the sign of x. Returns -1 if x is less than 0, 0 if x equals 0, and 1 if x is greater than zero. | |
| sin | sin(x) | Returns the sine of x | |
| sincos | sincos(x, out s, out c) | Returns the sine and cosine of x. sin(x) is stored in the output parameter s. cos(x) is stored in the output parameter c. | |
| sinh | sinh(x) | Returns the hyperbolic sine of x | |
| smoothstep | smoothstep(min, max, x) | Returns 0 if x < min. Returns 1 if x > max. Returns a smooth Hermite interpolation between 0 and 1, if x is in the range [min, max]. | |
| sqrt | sqrt(x) | Returns the square root of x (per component). | |
| step | step(a, x) | Returns (x >= a) ? 1 : 0 | |
| tan | tan(x) | Returns the tangent of x | |
| tanh | tanh(x) | Returns the hyperbolic tangent of x | |
| tex1D | tex1D(s, t) | 1D texture lookup. s is a sampler or a sampler1D object. t is a scalar. | |
| tex1Dgrad | tex1Dgrad(s, t, ddx, ddy) | 1D gradient texture lookup. s is a sampler or sampler1D object. t is a 4D vector. The gradient values (ddx, ddy) select the appropriate mipmap level of the texture for sampling. | |
| tex1Dlod | tex1Dlod(s, t) | 1D texture lookup with LOD. s is a sampler or sampler1D object. t is a 4D vector. The mipmap LOD is specified in t. | |
| tex1Dproj | tex1Dproj(s, t) | 1D projective texture lookup. s is a sampler or sampler1D object. t is a 4D vector. t is divided by its last component before the lookup takes place. | |
| tex2D | tex2D(s, t) | 2D texture lookup. s is a sampler or a sampler2D object. t is a 2D texture coordinate. | |
| tex2Dgrad | tex2Dgrad(s, t, ddx, ddy) | 2D gradient texture lookup. s is a sampler or sampler2D object. t is a 4D vector. The gradient values (ddx, ddy) select the appropriate mipmap level of the texture for sampling. | |
| tex2Dlod | tex2Dlod(s, t) | 2D texture lookup with LOD. s is a sampler or sampler2D object. t is a 4D vector. The mipmap LOD is specified in t. | |
| tex2Dproj | tex2Dproj(s, t) | 2D projective texture lookup. s is a sampler or sampler2D object. t is a 4D vector. t is divided by its last component before the lookup takes place. | |
| tex3D | tex3D(s, t) | 3D volume texture lookup. s is a sampler or a sampler3D object. t is a 3D texture coordinate. | |
| tex3Dgrad | tex3Dgrad(s, t, ddx, ddy) | 3D gradient texture lookup. s is a sampler or sampler3D object. t is a 4D vector. The gradient values (ddx, ddy) select the appropriate mipmap level of the texture for sampling. | |
| tex3Dlod | tex3Dlod(s, t) | 3D texture lookup with LOD. s is a sampler or sampler3D object. t is a 4D vector. The mipmap LOD is specified in t. | |
| tex3Dproj | tex3Dproj(s, t) | 3D projective volume texture lookup. s is a sampler or sampler3D object. t is a 4D vector. t is divided by its last component before the lookup takes place. | |
| texCUBE | texCUBE(s, t) | 3D cube texture lookup. s is a sampler or a samplerCUBE object. t is a 3D texture coordinate. | |
| texCUBEgrad | texCUBEgrad(s, t, ddx, ddy) | 3D gradient cube texture lookup. s is a sampler or samplerCUBE object. t is a 4D vector. The gradient values (ddx, ddy) select the appropriate mipmap level of the texture for sampling. | |
| texCUBElod | texCUBElod(s, t) | 3D cube texture lookup with LOD. s is a sampler or samplerCUBE object. t is a 4D vector. The mipmap LOD is specified in t. | |
| texCUBEproj | texCUBEproj(s, t) | 3D projective cube texture lookup. s is a sampler or samplerCUBE object. t is a 4D vector. t is divided by its last component before the lookup takes place. | |
| transpose | transpose(m) | Returns the transpose of the matrix m. If the source has the dimension rows x columns, the result has the dimension columns x rows. |
Table: HLSL intrinsic functions
Although HLSL is primarily C-based, it also knows C++-like object classes: one for each shader resource type and another for the output stream that is written by a geometry shader.
For every valid type that can be used in a shader resource view object there is an HLSL object class.
| HLSL type | Shader resource view type | Description |
| Buffer | D3D10_SRV_DIMENSION_BUFFER | A buffer resource. |
| Texture1D | D3D10_SRV_DIMENSION_TEXTURE1D | A one-dimensional texture. |
| Texture1DArray | D3D10_SRV_DIMENSION_TEXTURE1DARRAY | An array of one-dimensional textures. |
| Texture2D | D3D10_SRV_DIMENSION_TEXTURE2D | A two-dimensional texture. |
| Texture2DArray | D3D10_SRV_DIMENSION_TEXTURE2DARRAY | An array of two-dimensional textures. |
| Texture2DMS | D3D10_SRV_DIMENSION_TEXTURE2DMS | A two-dimensional texture with multisampling. |
| Texture2DMSArray | D3D10_SRV_DIMENSION_TEXTURE2DMSARRAY | An array of two-dimensional textures with multisampling. |
| Texture3D | D3D10_SRV_DIMENSION_TEXTURE3D | A three-dimensional texture. |
| TextureCube | D3D10_SRV_DIMENSION_TEXTURECUBE | A cube texture |
The HLSL shader resource objects support the following methods.
| Method | Description |
| GetDimensions | Report the width, height and number of mip maps. Cannot be used with views that represent a buffer. |
| Load | Load a texture value without using a sampler. |
| Sample | Use a sampler to read from a texture. |
| SampleCmp | Like Sample, but this method does an additional comparison against a provided reference value. |
| SampleCmpLevelZero | Works like SampleCmp but only uses mip map level zero. |
| SampleGrad | Uses the provided gradients for the sample instead of calculating them. |
| SampleLevel | Sample from a specified level. |
Table: HLSL texture object methods
As an extension, some of the C flow control statements support attributes that control how the compiler should handle them.
A shader that runs on the geometry shader stage can use two additional HLSL functions, listed above as append and restartstrip.
To make use of them, the shader function needs a stream object as an inout parameter. HLSL knows three different types of stream objects: PointStream, LineStream and TriangleStream.
As template object types, they need to be used together with a structure that describes the output vertex format. This format can differ from the input format.
Because HLSL supports these two functions inside a loop, a variable number of vertices can be output with each shader invocation. To allow the GPU to optimize, it is necessary to declare the maximum number of vertices that will be generated.

    [maxvertexcount(10)]
    void GSMain(point GSIn input[1], inout PointStream<GSOut> PointOutputStream)
    {
        ...
    }

A Direct3D 10 pixel shader is mostly identical to the common shader core. But because its input values can be based on the output of multiple vertex shader runs, it needs additional information. Therefore HLSL provides four interpolation usage specifiers to manage this behavior.
| Modifier | Description |
| linear | The values will be linearly interpolated between the output values of the vertices used. This is the default modifier. |
| centroid | Use centroid interpolation to improve results in the case of anti-aliasing. |
| nointerpolation | Values will not be interpolated. Required for integer variables, which cannot be interpolated. |
| noperspective | The values will be interpolated but without perspective correction. |
Table: pixel shader interpolation modifiers
Even with HLSL as the only way to program the shader stages, it is still necessary to compile your shaders into a binary format before they can be used to create shader objects. This makes it possible to do this expensive operation ahead of time. As the compiler is part of the core runtime, you can compile all your shaders during the setup process. It is even possible to compile the shaders as part of the build process and only include the binary code. In this case the HLSL code does not need to leave the development system.
As already shown, the common shader core is only the basis and each shader stage has different limitations. Therefore it is necessary to tell the compiler on which shader stage the result of the compilation will later be used. Besides adjusting all validation checks, the selected profile is included in the binary shader for later use.
In addition to the primary output, the compile method can deliver a human-readable list of errors and warnings.
In the end the binary shader code is used to create the real shader object. It makes no difference whether it was compiled directly or loaded from disk. In every case the runtime checks whether the shader was compiled with the right signature for the type you want to create. Additionally, there is a hash signature that provides some protection against modification. If it does not match, the shader object will not be created.
To create a normal shader you need only the binary code and its size. But because there is no state object for the stream output unit comparable to the input layout object for the input assembler, the stream output information must be provided during the creation of the geometry shader that generates the output stream.
Instead of creating a shader object, Direct3D 10 can create a shader reflection object from the binary code. Besides enumerating all input and output registers, you can query it for the resources that the shader consumes. This is useful if you want to write your own effect framework instead of using the default one. Finally, it gives you access to the statistical information for the compiled shader.
The following example shows how to create a shader reflection object and loop over all elements it contains.

    #include <d3d10.h>

    // pCode and CodeSize refer to the compiled binary shader.
    ID3D10ShaderReflection* pReflector = NULL;
    D3D10_SHADER_DESC ShaderDesc;
    D3D10_SIGNATURE_PARAMETER_DESC InputDesc;
    D3D10_SIGNATURE_PARAMETER_DESC OutputDesc;
    D3D10_SHADER_INPUT_BIND_DESC ResourceDesc;

    D3D10ReflectShader (pCode, CodeSize, &pReflector);

    // General description first
    pReflector->GetDesc (&ShaderDesc);

    // enumerate the input elements
    for (UINT Input = 0 ; Input < ShaderDesc.InputParameters ; ++Input)
    {
        pReflector->GetInputParameterDesc (Input, &InputDesc);
    }

    // enumerate the output elements
    for (UINT Output = 0 ; Output < ShaderDesc.OutputParameters ; ++Output)
    {
        pReflector->GetOutputParameterDesc (Output, &OutputDesc);
    }

    // enumerate the resource bindings
    for (UINT Resource = 0 ; Resource < ShaderDesc.BoundResources ; ++Resource)
    {
        pReflector->GetResourceBindingDesc (Resource, &ResourceDesc);
    }

    // enumerate the constant buffers
    for (UINT Buffer = 0 ; Buffer < ShaderDesc.ConstantBuffers ; ++Buffer)
    {
        ID3D10ShaderReflectionConstantBuffer* pBuffer =
            pReflector->GetConstantBufferByIndex (Buffer);

        D3D10_SHADER_BUFFER_DESC BufferDesc;
        pBuffer->GetDesc (&BufferDesc);

        // enumerate the variables in the buffer
        for (UINT Variable = 0 ; Variable < BufferDesc.Variables ; ++Variable)
        {
            ID3D10ShaderReflectionVariable* pVariable =
                pBuffer->GetVariableByIndex (Variable);

            D3D10_SHADER_VARIABLE_DESC VariableDesc;
            pVariable->GetDesc (&VariableDesc);
        }
    }

    // Finally release the reflection object.
    pReflector->Release ();