D3DBook:Procedural Textures

From GDWiki

Procedural Textures

Image:Pvgps7_1_sphere_wood.png

Texturing can make or break the realism of a 3D rendered scene, as surface color and detailing are, along with geometric fidelity, among the main features that capture the human visual system’s attention. Using 2-dimensional bitmaps to store and render textures has been common almost since the global adoption of real-time 3D graphics, about one and a half decades ago, and volumetric bitmaps have recently become commonplace as the on-board memory and texel cache sizes of graphics accelerators have increased over the years.

The requirements for texture detail, color depth and flexibility of control are increasing sharply as consumers demand crisper graphics and more realism with each new generation of graphics hardware. In modern 3D games, for example, it is not uncommon to observe that (almost) each grain of sand, each blade of grass and each wisp of cloud has been visualized using a skillfully crafted combination of textures.

Each texture takes up some amount of memory. The more detailed and deep in color the texture is, the more space it takes on the graphics card. Volumetric textures take many times more memory than 2-dimensional textures because each slice of the volume is, in practice, a 2D texture: the memory requirement of a volume is its width times its height times its depth times the bit depth of a texel.

As games and applications grow more complex, so does the demand for more detailed and deeper textures, and for more of them. Modern online role-playing games may have hundreds of different locations, each with a distinct texture set, and the games may even have tools for the players to define their own textures for use in the game world.

The amount of memory it takes to store all these textures as bitmaps can easily become a problem. The number of textures multiplied by their quality translates into massive amounts of data to store on hard drives, in system memory and in graphics memory. It is possible to compress textures and use them directly in compressed formats, but this can only make textures so much smaller, and compression often introduces artifacts in the perceived color of the texels, which decreases rendering quality.

The following table shows a comparison of common uncompressed 2-dimensional RGBA texture sizes:

Dimensions (width x height x bits per pixel)    Memory requirement
256 x 256 x 32 bpp                              262144 bytes (256 KB)
512 x 512 x 32 bpp                              1048576 bytes (1 MB)
1024 x 1024 x 32 bpp                            4194304 bytes (4 MB)
2048 x 2048 x 32 bpp                            16777216 bytes (16 MB)
4096 x 4096 x 32 bpp                            67108864 bytes (64 MB)
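
The figures in the table, and the volume texture requirement described earlier, follow from multiplying the texel count by the texel size. The following is a minimal C++ sketch of that arithmetic, not code from the original sample; the function name textureBytes is hypothetical, and the sketch assumes a single mip level, whole-byte texel formats and no driver padding.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for one uncompressed mip level of a texture.
// A 2D texture is treated as a volume with depth == 1.
// Assumption: no mip chain and no alignment padding are counted.
std::uint64_t textureBytes(std::uint32_t width, std::uint32_t height,
                           std::uint32_t depth, std::uint32_t bitsPerTexel)
{
    assert(bitsPerTexel % 8 == 0);  // sketch handles whole-byte texels only
    const std::uint64_t texels =
        static_cast<std::uint64_t>(width) * height * depth;
    return texels * (bitsPerTexel / 8);
}
```

For example, textureBytes(256, 256, 256, 32) gives 67108864 bytes, so a 256 x 256 x 256 RGBA volume needs as much memory as a 4096 x 4096 2D texture.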

Storage requirements for compressed textures grow with the square of the edge length, just as they do for uncompressed textures.

In recent years, programmable rasterizers have become commonplace, allowing programmer-defined mathematical operations to affect the rendering of pixels to the frame buffer. In Direct3D 8, where programmable pixel shaders were introduced to the general gaming and real-time visualization public, it was possible to use a modest number of mathematical operations and texture fetches to determine the color of pixels to be rendered.

Direct3D 9-era graphics processors were capable of increased complexity in the shader programs, allowing things such as dependent reads and predicated instructions to be used. Dependent reading means that the address of a texel to be read from a texture can be calculated inside the pixel shader based on values calculated in the same shader or read from another texture. Predicated instructions refer to semi-conditional execution where the effective resulting code flow of a shader is based on conditions calculated in the same shader.

Direct3D 10 defines the latest generation of the pixel shader architecture. Texture read count and instruction count limits are all but eliminated. Shaders can read and write data in the graphics pipeline more freely than ever, and the results of a pixel shader can be read from the earlier shader stages (the vertex shader and the geometry shader) with just a few commands executed inside the graphics processor.

Pixel shaders can be used to evaluate the color of the rendered pixels from inputs other than textures, such as incoming interpolated per-vertex colors, texture coordinates, positions or normals. This makes it possible to generate infinitely intricate surface detailing without using bitmaps at all, by using bitmaps as seed values for the generation of the final detail, or by using bitmaps in conjunction with other formulas.

The technique of generating new data based on seed values is referred to as “procedural”, because specific per-primitive procedures are used to generate the output values, instead of values calculated in previous steps of the rendering or read from data buffers such as textures.

A huge benefit of procedural texture generation as compared to using bitmap textures is that procedural textures take only as much memory as the definition of the procedure takes. Also, depending on the complexity of the parameters passed to the procedure, a procedurally-shaded surface can be drawn with practically infinite detail and variation, as the values are not primarily bound to explicitly stored values as in bitmaps. These traits make procedural textures very useful for rendering complex scenes with a huge number of different surface types.


Simple Procedural Pixel Shader

Image:Pvgps7_2_sphere_wood_checker.png

This render of the wooden ball uses a procedural shader to calculate the color and shininess of the rendered surface. A small noise volume supplies a noise seed to the pixel shader: although it is possible to calculate pseudo-random values in the shader itself, it is impractical, as random number generation algorithms generally take a relatively large amount of time. If needed, however, pseudo-random numbers can be generated in shaders using the same algorithms and almost, if not exactly, the same code (C++ versus HLSL) as in off-line processing.

The noise used here is simply a volume of random RGBA values. A more sophisticated noise pattern could be used for higher-order effects.
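
A volume of random RGBA values like this one could be filled on the CPU before being uploaded as the volume texture. The following C++ sketch shows one possible way to do so; the function name makeNoiseVolume, the 8-bit-per-channel layout and the use of std::mt19937 are assumptions made for illustration, not details taken from the original sample, and the upload to the graphics card is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Fill a size x size x size volume with random 8-bit RGBA texels.
// Assumption: tightly packed texels, 4 bytes each, no mip levels.
std::vector<std::uint8_t> makeNoiseVolume(std::size_t size, std::uint32_t seed)
{
    assert(size > 0);
    std::mt19937 rng(seed);  // same seed always produces the same volume
    std::vector<std::uint8_t> texels(size * size * size * 4);
    for (std::uint8_t& channel : texels)
        channel = static_cast<std::uint8_t>(rng() & 0xFFu);  // keep the low byte
    return texels;
}
```

Seeding the generator explicitly keeps the noise, and therefore the wood pattern, reproducible from run to run.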

The wood color gradients are specified with a simple interpolation between some shades of brown, by using the values fetched from the seed volume as the interpolation parameter.
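
The interpolation described above is the same linear blend that the HLSL lerp intrinsic performs, a + t * (b - a). A small C++ sketch of it follows; the Color3 type, the lerpColor function and the two brown shades are hypothetical, since the actual colors used for the render are not listed here.

```cpp
#include <cassert>

struct Color3 { float r, g, b; };

// Per-channel linear interpolation, matching HLSL lerp(a, b, t).
// t is the noise value fetched from the seed volume, expected in [0, 1].
Color3 lerpColor(Color3 a, Color3 b, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    return { a.r + t * (b.r - a.r),
             a.g + t * (b.g - a.g),
             a.b + t * (b.b - a.b) };
}

// Hypothetical shades of brown, chosen only for illustration.
const Color3 kDarkBrown  = { 0.25f, 0.125f, 0.0f };
const Color3 kLightBrown = { 0.75f, 0.5f,   0.25f };
```

A noise value of 0 returns the first shade, 1 returns the second, and values in between give the gradient that reads as wood grain.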

The positions of the dark and light brown checkers are determined by using true bitwise operations, a new capability in Direct3D 10. In legacy pixel shaders, bitwise operations were generally emulated with floating-point instructions, which hindered performance considerably when such operations were used in bulk.

The pixel shader used to generate the checker colors, implemented in Direct3D 10 HLSL, is as follows:

// This variable controls the checker frequency with respect to
// the unit cube:
int sizeMultiplier = 8;
 
// This function returns whether a given point lies in a black or a
// white volumetric checker:
bool Checker(float3 inputCoord)
{
    //Booleans are now a first-class citizen of the shader model,
    // as are bitwise operations: 
    bool x = (int)(inputCoord.x*sizeMultiplier) & 2;
    bool y = (int)(inputCoord.y*sizeMultiplier) & 2;
    bool z = (int)(inputCoord.z*sizeMultiplier) & 2;
 
    // Checkerboard pattern is formed by inverting the boolean flag
    // at each dimension separately:
    return (x != y != z);
}
 
// This is the texture containing the pre-calculated noise.
// note that we could calculate the noise inside the pixel shader
// too, seeded by the pixel position or some other value.
// however, pre-calculated noise is a good general-purpose seed 
// for many effects and it is efficient.
Texture3D noiseTex;
 
// This is the sampler used to fetch values from the 
// volume noise texture:
SamplerState VolumeSampler
{
    Filter = MIN_MAG_LINEAR_MIP_POINT;
    AddressU = Wrap;
    AddressV = Wrap;
    AddressW = Wrap;
};
 
//This shader colors the geometry with volumetric wooden checkers:
// (The parameter is named "input" to match its uses below.)
float4 CheckeredWoodPS( VS_OUTPUT input ) : SV_Target
{ 
    float4 ret = (float4)1.0;	
    float3 tcScale;
    bool inChecker = Checker(input.WSPos.xyz);
 
    //if we are in an odd checker, use the darker colors.
    //it is assumed that the shadesOfBrown array is initialized
    // with appropriate colors as rgb triplets:
    if (inChecker)
    {
        //scale the coordinates used for noise fetching:
        tcScale = float3(4,1,1);
        ret.xyz = lerp(shadesOfBrown[0], shadesOfBrown[1],
        noiseTex.Sample(VolumeSampler, input.WSPos.xyz * tcScale).x);
    }
    else 
    //we are in even checker; use slightly different colors 
    //and noise scale:
    {
        tcScale = float3(1,8,1);
        ret.xyz = lerp(shadesOfBrown[2], shadesOfBrown[3],
        noiseTex.Sample(VolumeSampler, input.WSPos.xyz * tcScale).x);
    }
 
    //OMITTED for clarity: Modulate the incoming diffuse color with 
    //the checker colors
 
    return ret;
}
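
The checker logic can be tried outside the shader. The following is a CPU-side C++ sketch that mirrors the Checker function above, including its cast and its bit test; the Float3 type and the lowercase function name are hypothetical stand-ins for the HLSL types.

```cpp
struct Float3 { float x, y, z; };

// Same role as in the shader: checker frequency relative to the unit cube.
const int sizeMultiplier = 8;

// Mirrors the HLSL Checker(): test bit 1 of each scaled, truncated coordinate.
bool checker(Float3 p)
{
    const bool x = (static_cast<int>(p.x * sizeMultiplier) & 2) != 0;
    const bool y = (static_cast<int>(p.y * sizeMultiplier) & 2) != 0;
    const bool z = (static_cast<int>(p.z * sizeMultiplier) & 2) != 0;

    // (x != y) != z is a three-way XOR: the flag flips once per axis.
    return (x != y) != z;
}
```

Because bit 1 is tested, each checker is 2 / sizeMultiplier units wide (0.25 with the value above); testing bit 0 would halve that. The integer cast truncates toward zero, so checkers that straddle a zero coordinate come out wider than the rest; flooring the scaled coordinate before the cast would avoid this if negative positions matter.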


Advanced Pixel Shaders

Image:Pvgps7_3_snowflake.png

While the previous example would be relatively easy to implement with older pixel shader architectures, the performance and flexibility of shader pipelines offered by the Direct3D 10-capable cards will enable much more complex per-pixel shading logic like image compression and decompression, precisely indexed data lookups and practically unlimited (though performance bound) per-pixel instruction counts for ray tracing and fractal-type effects.

To illustrate the potential power of Direct3D 10 pixel shaders, we will render a procedural snowflake entirely in the pixel shader. The coefficients used to create the geometry of the flake reside in a constant buffer that can be filled by the application hosting the shader to generate an unlimited number of variations of the snowflake.

The constant buffer defines one half of a root branch of the flake. The evaluation is effectively repeated 12 times per snowflake: first to mirror the half-branch, and then to array the whole branch in a radial fashion to produce the 6 branches of the final snowflake. The actual mirroring and arraying is done at the geometry level, with texture coordinates driving the positions of the flake edges.

To determine whether the current pixel is inside the snowflake area, the triangles stored in the branch constant data are iterated with a fill algorithm that checks whether the pixel is inside each triangle and maintains a Boolean flag recording whether the pixel is inside the snowflake area.

The contributions of all the triangles to the pixel are combined with an XOR operation that toggles the “insideness” flag of the current pixel for each overlapping triangle area.
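
The inside test and the XOR combination can be checked on the CPU before moving to the shader. The following C++ sketch is one way to express them; the Float2 and Triangle2 types and the helper names are hypothetical, and the sketch assumes non-degenerate triangles (a zero-area triangle would divide by zero).

```cpp
#include <vector>

struct Float2 { float x, y; };
struct Triangle2 { Float2 p0, p1, p2; };

// 2D cross product of u and v.
float det(Float2 u, Float2 v) { return u.x * v.y - u.y * v.x; }

Float2 sub(Float2 a, Float2 b) { return { a.x - b.x, a.y - b.y }; }

// True when pnt lies strictly inside the sector at apex spanned by e1 and e2,
// i.e. pnt = apex + a * e1 + b * e2 with a > 0 and b > 0.
bool inSector(Float2 pnt, Float2 apex, Float2 e1, Float2 e2)
{
    const float d = det(e1, e2);
    const float a =  (det(pnt, e2) - det(apex, e2)) / d;
    const float b = -(det(pnt, e1) - det(apex, e1)) / d;
    return a > 0.0f && b > 0.0f;
}

// A point inside the sectors at two of the vertices is inside the triangle.
bool pointInTriangle(Float2 pnt, const Triangle2& t)
{
    return inSector(pnt, t.p0, sub(t.p1, t.p0), sub(t.p2, t.p0))
        && inSector(pnt, t.p1, sub(t.p2, t.p1), sub(t.p0, t.p1));
}

// XOR accumulation: every triangle covering the point toggles the flag.
bool inFlake(Float2 pnt, const std::vector<Triangle2>& tris)
{
    bool inside = false;
    for (const Triangle2& t : tris)
        inside = (inside != pointInTriangle(pnt, t));
    return inside;
}
```

With XOR, a point covered by an even number of triangles ends up outside the shape, so a smaller triangle placed over a larger one cuts a hole in it rather than adding to it.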

The following code implements the pixel shader used to rasterize the snowflake:

//A triangle data type:
struct triang2
{
     float2 p0, p1, p2;
}; // a struct declaration must end with a semicolon
 
// Establish a maximum number of snowflake triangles:
#define MAX_SF_TRIS 20
 
// The host application fills the snowflake data buffer
// with triangles:
cbuffer snowflakeData
{
    int numFlakeTris;
    triang2 flakeTris[MAX_SF_TRIS];
}
 
// Find the determinant of the matrix formed by u and v.
// Effectively determines if the vectors are clockwise
// or counterclockwise related to each other.
float det(float2 u, float2 v)
{
    return u.x * v.y - u.y * v.x; //u cross v
}
 
// Find if a point is inside a given triangle:
bool PointInTriangle(float2 pnt, triang2 tri)
{
    float2 v0, v1, v2;
    float a1, a2, b1, b2;
 
    // Test if the vector from p0 to pnt is inside the 
    // sector formed by p1-p0 and p2-p0:
    v0 = tri.p0;
    v1 = tri.p1 - tri.p0;
    v2 = tri.p2 - tri.p0;
    a1 = (det(pnt, v2)-det(v0, v2)) / det(v1, v2);
    b1 = -(det(pnt, v1)-det(v0, v1)) / det(v1, v2);
 
    // Test again from different vertex:
    v0 = tri.p1;
    v1 = tri.p2 - tri.p1;
    v2 = tri.p0 - tri.p1;
    a2 = (det(pnt, v2)-det(v0, v2)) / det(v1, v2);
    b2 = -(det(pnt, v1)-det(v0, v1)) / det(v1, v2);
 
    // We don't need to test the third sector separately,
    // because if the point was in the first two sectors,
    // it must already be inside the triangle.
 
    //Just return the combined result of the first two tests:
 
    if ((a1 > 0) && (b1 > 0) && (a2 > 0) && (b2 > 0))
        return true;
    else
        return false; 
}
 
// The vertex shader is omitted; it just transforms the
// geometry to projection space and passes through the UV coords.
 
// The pixel shader for the snow flake. 
// Input texture coordinates are used.
float4 snowflakePS(PS_In input) : SV_Target
{
 
    //by default, we are not inside the flake:
    bool inFlake = false;
 
    //iterate over all flake triangles:
    for (int i = 0; i < numFlakeTris; i++)
    {
        //utilize bitwise XOR to establish the flake shape:
        inFlake ^= PointInTriangle(input.uv, flakeTris[i]);
    }
 
    // we draw the combined effect of all the flake triangles:
    return (inFlake?float4(1.0,1.0,1.0,1.0):float4(0.0,0.0,0.0,0.0));
 
}

Note that this is intentionally a heavy approach to rendering a snowflake. In most applications, this method of rendering snowflakes would be an order of magnitude more complex than is actually needed, because snowflakes are usually very small in screen space and thus it would be very wasteful to render them with all the detail that our approach generates.

That said, this technique does illustrate the flexibility and power of the pixel shaders in Direct3D 10. More practical applications can and will be developed that take advantage of them in common rendering tasks and special effects.
