D3DBook:Shadow Volumes

Volumetric Shadows

Image:Pvgps3_2_lamppost_sharp.png

When a photon hits an opaque object, a volume can be thought to form on the far side of the object, which that photon cannot enter without first reflecting off some other surface. This volume encompasses everything that lies behind the opaque object from the photon’s point of view. For each particle inside the volume, the contribution of the current photon to the light accumulated at that particle is reduced to zero, so the particle is in shadow with regard to that photon.

Now, take into account that everything is made of particles in real life, including the air that we appear to see through. In reality, light interacts with the air particles all the way from the photon emitters to the light-sensing cells in our eyes. This phenomenon can be observed as distortion of distant objects on a hot day, because hot air (for example, above a desert floor or a pavement heated by the sun) refracts light slightly differently than cool air. As the cold and hot air mix, the background seems to undulate in small waves; the same effect is behind “desert mirages”.

It is worth noting that the light-sensing cells in our eyes are simply a special case of material, in which a substantial portion of the incoming photons is converted to bio-electrical pulses for our brains to handle as vision; for this reason, all of the steps for interacting with “ordinary” material particles would also need to be repeated for each receptor cell.

If you think that the real process sounds very complex, you’re quite right. As it is still not practical to simulate this behavior for each discrete photon against each discrete material particle in the scene, we make some broad approximations regarding the flow of light in order to establish our photon obstruction volumes for simulation:

- We will assume that the geometry which is to cast the shadows is uniformly opaque; that is, every point of the shadow caster blocks the photons equally. We can simulate varying per-object opacity by varying the alpha factor with which the shadow is finally blended into the scene.

- Conceptually, we regard the photon source as a single point or direction. In reality, such singular light sources do not exist.

- A light is either occluded or reflected/absorbed by the occluding geometry. This can be thought of as using one large photon for the test instead of trillions of them as in a real-life scene.

- We do not take into account that photons which reflect off the shadow caster could in fact end up in the shadow volume by way of another reflecting material. This is a distinct possibility in reality, but the human eye cannot easily discern more than three orders of reflections, and only a special scene (like a hall of mirrors) would make the phenomenon truly stand out. Then again, a hall of mirrors is difficult to render with a rasterizer anyway, because practically all material reflections depend on each other.


Theory of Implementation of Volumetric Shadows

Image:Pvgps3_3_extrusion.png

The idea of extruding the shadow volume, as seen in the above image, is quite simple from a programming perspective when we use the approximations listed previously:

“If a given edge is shared between two faces - one of which faces the light source and one which doesn’t – extrude it very far along the direction that the photon would go over that edge. Repeat for all edges of the occluding geometry.”
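The rule above boils down to a facing test per triangle and a comparison per shared edge. The following is a minimal C++ sketch of that idea, not code from the original article: the `Vec3` and `Triangle` types, the helper names, and the counterclockwise front-face winding are all assumptions made for illustration.

```cpp
#include <array>

struct Vec3 { float x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

using Triangle = std::array<Vec3, 3>;

// True when the front side of the triangle faces a point light.
// Assumption: front faces are wound counterclockwise.
bool facesLight(const Triangle& t, Vec3 lightPos) {
    Vec3 n = cross(sub(t[1], t[0]), sub(t[2], t[0])); // un-normalized face normal
    return dot(n, sub(lightPos, t[0])) > 0.0f;
}

// An edge shared by triangles a and b is a silhouette edge when exactly
// one of the two triangles faces the light.
bool isSilhouetteEdge(const Triangle& a, const Triangle& b, Vec3 lightPos) {
    return facesLight(a, lightPos) != facesLight(b, lightPos);
}
```

Only the sign of the dot product matters here, so the normal does not need to be normalized.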

A legacy implementation of shadow volume extrusion would generally either use the CPU exclusively, or generate helper geometry on the CPU in the form of fins and shells that are degenerate (of zero area) at rest, and then extrude them conditionally in a vertex shader to form the volume.

The following metacode illustrates an implementation of how a shadow volume would be generated on the CPU:

for each light l
  for each triangle t of the mesh
    if triangle t is facing the light l
      copy the triangle t to the shadow volume as is (near cap)
      apply bias to the triangle along its normal

      for each adjacent triangle tn of triangle t
        if triangle tn is facing away from the light l
          take the edge e between triangles t and tn

          copy the vertices of the edge e and move them by
          the desired extrusion distance along the vector
          from the light l to the vertices

          construct two triangles to form a quad between the
          original edge vertices and the new vertices

          apply bias to the quad along its normal
    else
      copy the vertices of the triangle t to the shadow
      volume and move them by the extrusion distance along
      the vector from the light l to the vertices (far cap)

      apply bias to the triangle along its normal
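The edge extrusion step of the metacode can be sketched in C++ as follows. This is only an illustration under stated assumptions: the light is a point light, and the `Vec3` type and the function names `extrudeFromLight` and `extrudeEdge` are hypothetical helpers, not part of any SDK.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Moves point p away from a point light by `distance`, along the
// direction from the light to p.
Vec3 extrudeFromLight(Vec3 p, Vec3 lightPos, float distance) {
    Vec3 d{p.x - lightPos.x, p.y - lightPos.y, p.z - lightPos.z};
    float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    assert(len > 0.0f); // the vertex must not coincide with the light
    float s = distance / len;
    return {p.x + d.x * s, p.y + d.y * s, p.z + d.z * s};
}

// Builds the side quad for one silhouette edge (e0, e1) as two triangles
// (six vertices) between the original edge and its extruded copy.
std::array<Vec3, 6> extrudeEdge(Vec3 e0, Vec3 e1, Vec3 lightPos, float distance) {
    Vec3 f0 = extrudeFromLight(e0, lightPos, distance);
    Vec3 f1 = extrudeFromLight(e1, lightPos, distance);
    return std::array<Vec3, 6>{{e0, e1, f0, e1, f1, f0}};
}
```

The winding of the two triangles has to match the convention used by the rest of the volume; the order shown here is one arbitrary choice.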


Visualizing the Shadow Volume

Image:Pvgps3_4_sphere_vol_add_n_regular.png

After it has been generated, the shadow volume can be drawn just like the geometry that it was extruded from, to help verify that the algorithm works and that the volume is extruded in the right direction. The above example image is rendered using additive blending so that the overlapping faces extruded from the edges are easier to see. Of course, the programmer is free to implement the visualization in any way suitable for the scene in question.

Drawing the shadow legacy style

More commonly than being drawn itself, the extruded geometry is used, in conjunction with the calculated scene depth, as part of a stencil mask that clips the rendering of the actual shadow to the pixels that would not receive photons as simulated by the extrusion algorithm.

First, we lock the geometry buffers: the one that contains the original mesh and the one that will contain the extruded shadow volume. For each edge that lies between a back-facing and a front-facing triangle (as seen from the light source), we make a quad (two triangles) from the original edge vertices and copies of them moved far from the original geometry in the direction of the “photons” that hit the edge.

A common practice in creating a shadow volume is to generate “caps” for the volume to limit its extents with regard to other geometry. For this, we determine for each triangle whether it is facing the light. If it is, we copy it to the shadow volume as is. If it is not, it is copied to the back side of the volume and translated by the same vector used to extrude the edges. When the “z-fail” method of shadow drawing is used, the caps prevent the situation in which the observer is inside the shadow volume and every pixel except the ones actually inside the volume is drawn as shadowed.

At this step, we can move the resulting faces by applying a bias (a correction vector) to the extruded quads along their normals so that the shadow geometry doesn’t cause so-called “z-fighting”. Z-fighting is a condition in which the shadow and the source geometry are at exactly the same screen-space depth and thus cause artifacts when rendered (as the rasterizer has to “guess” which of them to draw on top of the other). This code, written in C++, shows how to apply a bias value to the triangles in order to eliminate z artifacts. Note that in practice the bias values are best applied inside the extrusion loop:

// VERTEX is assumed to be a structure with a D3DXVECTOR3 member
// named "position". The actual definition of the type is omitted
// here for brevity.
VERTEX* allVertices;

// The vertex buffer is assumed to be mapped/locked and the data
// pointer copied to allVertices before this point.

// The distance to move the triangles along their normals.
const float bias = 0.001f;

for (int i = 0; i < numTriangles; i++)
{
  // Index of the first vertex of the current triangle. Each triangle
  // has exactly three vertices, so the other two are found at
  // vi0 + 1 and vi0 + 2.
  int vi0 = i * 3;

  // The face normal is calculated from the vertex positions as the
  // normalized cross product of two triangle edges.
  D3DXVECTOR3 edge1 = allVertices[vi0 + 1].position - allVertices[vi0].position;
  D3DXVECTOR3 edge2 = allVertices[vi0 + 2].position - allVertices[vi0].position;
  D3DXVECTOR3 normal;
  D3DXVec3Cross(&normal, &edge1, &edge2);
  D3DXVec3Normalize(&normal, &normal);

  // Now we apply the bias to the original vertices:
  for (int j = 0; j < 3; j++)
  {
    // This does the actual moving of the vertices.
    allVertices[vi0 + j].position -= normal * bias;
  }
}

// The vertex buffer is unmapped/unlocked after the manipulation.
// This step is omitted for brevity.

Note that this implementation doesn’t even attempt to use special vertex shaders for extrusion, so it can be used on any hardware that supports stencil buffers for the drawing step up next. There are methods of using a vertex shader to do the extrusion step, but for simplicity we will only cover this basic approach and the one implemented with Direct3D 10 shaders. For the example implementations, I chose the z-fail method for the stencil masking because it produces solid shadows without artifacts even when the observer is inside the shadow volume. The z-fail algorithm is efficient in that it doesn’t need to render a full-screen, all-encompassing quad in order to visualize the areas in the shadow volume; it simply doesn’t draw the scene where the stencil mask condition fails based on the volume.

Before the stencil buffer values are laid out, we render an ambient pass of the scene to establish a scene color for the pixels that are destined to be in shadow, and the depth of the scene for the shadows to clip to. Rendering the ambient color adds realism to the scene because, in real life, there are no perfectly shadowed objects: some light will always arrive by way of reflection or from other light sources, and the ambient scene pass emulates this. The z-values are written so that the shadows will clip to the rest of the scene correctly. The stencil buffer operations should be disabled while rendering the ambient pass.

After preparing the shadow volume and drawing the ambient and z pass, we set up some render states regarding depth and stencil buffer operations. As we need to mask in the pixels that are in the shadow volume and on some other geometry, we need to express the following sequence (assuming the z-fail render method):

  1. Disable writing to the depth buffer and the color buffer. We don’t need to see the actual volume geometry, but we do need to manipulate the stencil buffer with it.
  2. Set the cull mode to “none”, so that back-facing triangles are processed.
  3. The depth compare function should be “less”, so that the shadow volume geometry behaves like other objects in depth order but is drawn just slightly nearer to the observer than other geometry (which normally uses the “less or equal” function for depth comparison) to avoid depth artifacts.
  4. Two-sided stencil buffering should be enabled; otherwise, we would need separate passes for incrementing and decrementing the stencil buffer values for front- and back-facing volume geometry respectively.
  5. The stencil reference value should be set to 1 to establish a default behavior for the mask, of not rendering the shadow. The stencil mask and stencil write mask should have all bits set; that is, have the value 0xffffffff in order to use all available stencil bits if needed.
  6. Both the clockwise and counterclockwise stencil functions should be set to “always” to denote that each and every pixel of the volume should contribute to the stencil buffer values.
  7. The counterclockwise stencil z-fail operation should be specified as “increment” and the clockwise one as “decrement”, so that the stencil values are properly incremented and decremented based on whether the current pixel is front-facing or back-facing.
  8. The stencil pass operation for both clockwise and counterclockwise faces should be set to “keep”, so that the stencil values are not modified when the drawn pixels pass the depth test and therefore are of no interest to us.

This code listing, borrowed from the DirectX SDK February 2007 sample “ShadowVolume10”, shows how to set the depth stencil state for two-sided z-fail rendering:

DepthStencilState TwoSidedStencil
{
    DepthEnable = true;
    DepthWriteMask = ZERO;
    DepthFunc = Less;
 
    // Setup stencil states
    StencilEnable = true;
    StencilReadMask = 0xFFFFFFFF;
    StencilWriteMask = 0xFFFFFFFF;
 
    BackFaceStencilFunc = Always;
    BackFaceStencilDepthFail = Incr;
    BackFaceStencilPass = Keep;
    BackFaceStencilFail = Keep;
 
    FrontFaceStencilFunc = Always;
    FrontFaceStencilDepthFail = Decr;
    FrontFaceStencilPass = Keep;
    FrontFaceStencilFail = Keep;
};

In addition to the above depth stencil state, the technique should also set the culling rasterizer state to off in order to let the back-facing triangles affect the stencil buffer, and the blend state to “no blending” so that unnecessary color writes are avoided.

After these steps, the extruded shadow volume can be rendered. For each pixel, the above conditions are evaluated, and the stencil buffer is incremented or decremented if the conditions match the current pixel. Rendering the shadow volume establishes a stencil mask that can be used to prevent the shadowed pixels of the actual scene from being rendered, as opposed to drawing a shadow visualization color on top of them.
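To make the counting concrete, the per-pixel effect of the two-sided z-fail state can be simulated on the CPU. The sketch below is illustrative only: `VolumeFragment` and `zFailStencil` are hypothetical names, the stencil buffer is assumed to be 8 bits wide and cleared to 0, and the increment and decrement operations are assumed to wrap around rather than saturate.

```cpp
#include <cstdint>
#include <vector>

// One shadow-volume fragment covering the pixel being examined.
struct VolumeFragment {
    float depth;      // depth of the volume face at this pixel
    bool frontFacing; // true if it comes from a front-facing triangle
};

// Returns the stencil value of one pixel after all shadow-volume
// fragments covering it have been rasterized with the z-fail rules.
std::uint8_t zFailStencil(float sceneDepth,
                          const std::vector<VolumeFragment>& fragments,
                          std::uint8_t initial = 0) {
    std::uint8_t stencil = initial;
    for (const VolumeFragment& f : fragments) {
        bool depthTestPasses = f.depth < sceneDepth; // depth function "less"
        if (depthTestPasses) {
            continue; // stencil pass operation is "keep"
        }
        // Depth test failed: back faces increment, front faces decrement.
        // The 8-bit arithmetic wraps around, e.g. 0 - 1 becomes 255.
        if (f.frontFacing) {
            --stencil;
        } else {
            ++stencil;
        }
    }
    return stencil;
}
```

With a volume whose front face is at depth 0.2 and back face at 0.8, a scene point at depth 0.5 lies inside the volume and ends up with a stencil value of 1, while scene points at depth 0.1 or 0.9 end up with 0.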

Moving on, we can now render the shadowed scene. As the lit scene must be drawn only where the stencil mask marks a pixel as not being in shadow, we need to establish some stencil and depth rules again:

  1. Z-buffering and stencil buffering should be enabled.
  2. Z function should be reset to its default value of “less or equal” in order to render normally.
  3. If an ambient scene color was laid out before the stencil buffer manipulation step, we need to enable alpha blending, set the blending operation to “add” and both source and destination blend factors to “one” in order to additively blend the colors not in the shadows with the ambient colors in the color buffer that lack the shadow information.
  4. Stencil reference value should be 1 to establish the default mask value so that geometry not in shadow will be rendered.
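  5. Stencil function should be set to “greater” to ensure that the correct pixels will be masked in. Direct3D compares the reference value against the stored stencil value, so with a reference of 1 (and a stencil buffer cleared to 0) the test passes only where the stored value is still 0, that is, where the pixel is not in shadow.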
  6. Stencil pass behavior should be set to “keep” so that the stencil values don’t get modified in the next step.

Now you can render the scene normally, and those pixels that are in shadow as determined by the stencil mask (created from the scene depth and the shadow volume) are discarded based on the stencil value.
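The combined effect of the stencil test and the additive blend on a single pixel can be sketched in C++ as follows. The `Color` type and the function names are hypothetical, and an 8-bit stencil buffer is assumed. The comparison mirrors how Direct3D evaluates the stencil test, with the masked reference value on the left-hand side and the masked stored value on the right.

```cpp
#include <cstdint>

struct Color { float r, g, b; };

// Stencil function "greater": passes when the masked reference value is
// greater than the masked value stored in the stencil buffer.
bool stencilGreater(std::uint8_t ref, std::uint8_t stored,
                    std::uint8_t readMask = 0xFF) {
    return (ref & readMask) > (stored & readMask);
}

// Final color of one pixel: the ambient pass is always present, and the
// diffuse pass is added on top only where the stencil test passes.
Color shadePixel(Color ambient, Color diffuse, std::uint8_t stored,
                 std::uint8_t ref = 1) {
    if (!stencilGreater(ref, stored)) {
        return ambient; // pixel is in shadow; the diffuse pass is masked out
    }
    // Additive blending with both blend factors set to "one".
    return {ambient.r + diffuse.r, ambient.g + diffuse.g, ambient.b + diffuse.b};
}
```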

The following code illustrates the most important steps to render volumetric shadows using the z-fail technique.

//simple pixel shader that returns a constant ambient color.
// it is assumed that a PSSceneIn structure contains the 
// output of the ordinary vertex shader – position and normal.
float4 PSAmbient(PSSceneIn input) : SV_Target
{   
    return g_vAmbient; //global ambient color variable
}
 
//pixel shader for diffuse color:
float4 PSDiffuse(PSSceneIn input) : SV_Target
{
    return input.color; //just pass on the diffuse color from vs
}
 
//pixel shader for shadow rendering:
float4 PSShadow(PSShadowIn input) : SV_Target
{
//PSShadowIn contains only the position from the shadow
//vertex shader.
    return float4( 0.0f, 0.0f, 0.0f, 0.5f ); //black shadow
}
 
//VSShadow omitted - outputs vertices in world space to gs
//VSScene omitted - calculates diffuse color of the scene and
// outputs vertices in projection space
 
// GSShadowmain geometry shader is discussed later.
 
 
//The following techniques then utilize the above functionality:
 
//renders the scene diffuse colors:
technique10 RenderDiff
{
    pass p0
    {
        SetVertexShader( CompileShader( vs_4_0, VSScene() ) );
        SetGeometryShader( NULL );
        SetPixelShader( CompileShader( ps_4_0, PSDiffuse() ) );
 
//since the diffuse pass occurs after the ambient color has
//been already laid out, we need to blend it additively:
        SetBlendState( AdditiveBlending, float4( 0.0f, 0.0f, 0.0f, 0.0f ), 0xFFFFFFFF );
 
//reset the depth stencil state to ordinary rendering:
        SetDepthStencilState( RenderNonShadows, 0 ); 
//and enable back-face culling:
        SetRasterizerState( EnableCulling );
    }  
}
 
// this technique calls the shadow volume extrusion and 
// manipulates the stencil buffer:
technique10 RenderShadow
{
    pass p0
    {
        SetVertexShader( CompileShader( vs_4_0, VSShadow() ) );
        SetGeometryShader( CompileShader( gs_4_0, GSShadowmain() ) );
        SetPixelShader( CompileShader( ps_4_0, PSShadow() ) );
//we don't want to write colors here:
        SetBlendState( DisableFrameBuffer, float4( 0.0f, 0.0f, 0.0f, 0.0f ), 0xFFFFFFFF );
        SetDepthStencilState( TwoSidedStencil, 1 );
        SetRasterizerState( DisableCulling );
    }  
}
 
//this simple technique just lays out the ambient color of the scene:
technique10 RenderAmbient
{
    pass p0
    {
      SetVertexShader( CompileShader( vs_4_0, VSScene() ) );
      SetGeometryShader( NULL );
      SetPixelShader( CompileShader( ps_4_0, PSAmbient() ) );
 
      SetBlendState( NoBlending, 
      float4( 0.0f, 0.0f, 0.0f, 0.0f ), 0xFFFFFFFF );
      SetDepthStencilState( EnableDepth, 0 );
      SetRasterizerState( EnableCulling );
    }  
}
 
 
// These techniques are used in the host application as follows:
// 1: render ambient
// 2: render shadow
// 3: render diffuse, which is now clipped by the shadow stencil.

It is possible to render volumes of multiple lights at the same time, and the z-fail algorithm still works as intended.

The volumetric shadow technique in general produces very sharp shadows, but it is possible to integrate over a number of lights to simulate an area light that creates smooth shadow edges. This requires additional passes for drawing the shadowed scene, though, and the CPU needs to be utilized between the passes to control their parameters. In the DirectX SDK, a method of using a vertex shader to do the extrusion is presented to alleviate this cost, but it still requires the CPU to initiate each pass and feed it with its parameters.
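As a rough illustration of the integration idea, the area light can be replaced by several point samples, and the sharp (binary) shadow results averaged. The C++ sketch below assumes a ring-shaped arrangement of samples in the XZ plane; `ringLightSamples`, `litFraction` and the `isLit` callback are hypothetical names standing in for one shadow pass per sample.

```cpp
#include <cmath>
#include <functional>
#include <vector>

struct Vec3 { float x, y, z; };

// Places `count` point-light samples evenly on a circle of the given
// radius around `center`, lying in the XZ plane (an arbitrary choice).
std::vector<Vec3> ringLightSamples(Vec3 center, float radius, int count) {
    std::vector<Vec3> samples;
    const float twoPi = 6.28318530718f;
    for (int i = 0; i < count; ++i) {
        float a = twoPi * static_cast<float>(i) / static_cast<float>(count);
        samples.push_back(Vec3{center.x + radius * std::cos(a),
                               center.y,
                               center.z + radius * std::sin(a)});
    }
    return samples;
}

// Fraction of the light samples that reach a surface point. `isLit`
// stands in for the result of one sharp shadow pass for one sample.
float litFraction(const std::vector<Vec3>& samples,
                  const std::function<bool(const Vec3&)>& isLit) {
    if (samples.empty()) {
        return 0.0f;
    }
    int lit = 0;
    for (const Vec3& s : samples) {
        if (isLit(s)) {
            ++lit;
        }
    }
    return static_cast<float>(lit) / static_cast<float>(samples.size());
}
```

A point that is visible from only some of the samples receives a fractional value, which is what softens the shadow edge; the cost is one shadow pass per sample.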

Bottlenecks in the legacy approach

The volumetric shadowing technique, as described previously, has a couple of bottlenecks that become evident when implementing it on hardware older than the Direct3D 10 generation. These mainly have to do with unnecessary CPU utilization and graphics bus bandwidth saturation.

First, it takes considerable CPU time to perform the volume extrusion as opposed to having the graphics card do it. The graphics processing unit is optimized for parallel vector calculations (of which the extrusion operation is full). Compare it to the central processing unit which, while very flexible with regard to the nature of the calculations, only allows a small number of parallel vector operations per instruction and is therefore inefficient in most 3D graphics calculations.

The CPU load can be eased by doing the extrusion in the vertex shader, but then another problem surfaces: a large amount of extra geometry, in the form of degenerate triangles for each edge and open vertex, must be calculated and baked into the geometry buffers, which in turn takes a lot of extra memory and graphics bus bandwidth.

In cases with large numbers of graphics primitives, this also causes the geometry processing pipeline to choke, because the extrusion vertex shader is run for each vertex regardless of whether it is going to be extruded or not, and this load is multiplied by the number of extra vertices that make it possible to use the vertex shader for the operation to begin with.

Direct3D 10 allows us to do the extrusion step entirely in the graphics processor, without CPU intervention, thus effectively eliminating both of these bottlenecks in one sweep. This, combined with the immense geometry processing power available on Direct3D 10 cards due to the unified shader pipelines, allows for large performance boosts over the legacy technique presented earlier.

Using Geometry Shader to Implement Volumetric Shadow Extrusion

Image:Pvgps3_5_triangleadj.png

To extrude the shadow geometry in Direct3D 10, we can use the new component of the graphics pipeline, the geometry shader. A geometry shader works on one graphics primitive (like a triangle, line or point) at a time, and can access adjacency information of the current primitive so as to determine whether the current triangle’s edge is a candidate for extrusion along the light’s path.

In addition to the positions, normals and other data associated with the vertices of the geometry to be extruded by the geometry shader, we need to pass the adjacency data to the pipeline in the form of extra index data for each triangle, interleaved with the original indices of the geometry. The vertex data input parameter to the geometry shader is represented as an array of vertex structures with six elements, so that the adjacent vertices of the triangle are available in addition to the triangle itself.
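One way to produce such interleaved index data on the CPU is sketched below in C++. It assumes a triangle list with consistent winding, and it follows the layout that the geometry shader in this section relies on: elements 0, 2 and 4 are the triangle itself, and elements 1, 3 and 5 are the vertices opposite its three edges. The function name and the handling of open edges (falling back to the triangle’s own third vertex) are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Converts a triangle list (3 indices per triangle) into a triangle list
// with adjacency (6 indices per triangle):
//   v0, adj(v0-v1), v1, adj(v1-v2), v2, adj(v2-v0)
std::vector<std::uint32_t> buildAdjacencyIndices(
        const std::vector<std::uint32_t>& tris) {
    using Edge = std::pair<std::uint32_t, std::uint32_t>;
    std::map<Edge, std::uint32_t> opposite; // directed edge -> third vertex
    const std::size_t triCount = tris.size() / 3;

    for (std::size_t t = 0; t < triCount; ++t) {
        for (std::size_t e = 0; e < 3; ++e) {
            std::uint32_t a = tris[3 * t + e];
            std::uint32_t b = tris[3 * t + (e + 1) % 3];
            std::uint32_t c = tris[3 * t + (e + 2) % 3];
            opposite[Edge(a, b)] = c;
        }
    }

    std::vector<std::uint32_t> out;
    out.reserve(triCount * 6);
    for (std::size_t t = 0; t < triCount; ++t) {
        for (std::size_t e = 0; e < 3; ++e) {
            std::uint32_t a = tris[3 * t + e];
            std::uint32_t b = tris[3 * t + (e + 1) % 3];
            std::uint32_t c = tris[3 * t + (e + 2) % 3];
            out.push_back(a);
            // With consistent winding, the neighbour traverses the shared
            // edge in the opposite direction (b, a).
            auto it = opposite.find(Edge(b, a));
            // Assumption: an open edge falls back to this triangle's own
            // third vertex.
            out.push_back(it != opposite.end() ? it->second : c);
        }
    }
    return out;
}
```

For two triangles (0, 1, 2) and (2, 1, 3) sharing the edge 1-2, the first triangle receives vertex 3 as the adjacent vertex of that edge, and the second triangle receives vertex 0.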

First, the geometry shader determines whether the current triangle is facing the light source. If it isn’t, there’s no need to run the rest of the shader, as the current triangle wouldn’t be affected by the light anyway, and the geometry shader can fall through with no output for the current triangle. If the triangle does face the light source, we perform a few more steps in the geometry shader. For each edge of the triangle, it is determined whether that particular edge is on the silhouette of the geometry with regard to the light position. If it is, new triangles are created inside the geometry shader to construct the extrusion faces for that edge.

After the edges have been extruded, the current triangle is capped from both the front and back side to construct a solid volume out of it, in a similar fashion as the legacy approach does. As was discussed before, this prevents the volume from “leaking” when the observer is inside it during the shadow visualization phase.

The generated primitives are output from the geometry shader to a TriangleStream object, which has a template type corresponding to the input of the pixel shader. The TriangleStream has two methods: Append() adds one vertex to the output, and RestartStrip() terminates the current triangle strip being output and begins a new one. New vertices calculated in the geometry shader are appended to the output stream, and a new triangle strip is started after each extruded edge, after the start cap and after the end cap, to distinguish the different parts of the output for the rasterizing steps that follow the geometry shader. The bias value is applied to the new geometry in the geometry shader, for the same reason as in the legacy implementation.
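The strip semantics can be mimicked on the CPU to see how appended vertices turn into triangles. The class below is a hypothetical C++ stand-in, not the HLSL object itself; it stores integer vertex ids, and it flips every second triangle of a strip to keep the winding consistent, which is one common convention and an assumption here.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// CPU-side sketch of an output stream with Append/RestartStrip behaviour.
class TriangleStreamSketch {
public:
    // Adds one vertex; from the third vertex of a strip onwards, every
    // appended vertex completes one more triangle.
    void append(int vertexId) {
        strip_.push_back(vertexId);
        const std::size_t n = strip_.size();
        if (n < 3) {
            return;
        }
        const std::size_t first = n - 3;
        if (first % 2 == 0) {
            triangles_.push_back(std::array<int, 3>{
                {strip_[first], strip_[first + 1], strip_[first + 2]}});
        } else {
            // Odd triangles swap two vertices to keep the winding.
            triangles_.push_back(std::array<int, 3>{
                {strip_[first], strip_[first + 2], strip_[first + 1]}});
        }
    }

    // Ends the current strip; the next vertices start a new one.
    void restartStrip() { strip_.clear(); }

    const std::vector<std::array<int, 3>>& triangles() const {
        return triangles_;
    }

private:
    std::vector<int> strip_;
    std::vector<std::array<int, 3>> triangles_;
};
```

A four-vertex strip yields the two triangles of an extruded quad, and a three-vertex strip yields one cap triangle. Three quads and two caps add up to 3 × 4 + 2 × 3 = 18 vertices per input triangle.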

//The following geometry shader code is borrowed from the DirectX SDK samples:
 
//
// Helper to detect a silhouette edge and extrude a volume from it
//
void DetectAndProcessSilhouette( float3 N,         // Un-normalized triangle normal
                                 GSShadowIn v1,    // Shared vertex
                                 GSShadowIn v2,    // Shared vertex
                                 GSShadowIn vAdj,  // Adjacent triangle vertex
                                 inout TriangleStream<PSShadowIn> ShadowTriangleStream // triangle stream
                                 )
{    
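    // Un-normalized normal of the adjacent triangle. Note that, as listed
    // here, NAdj is never tested, so every edge passed in is extruded.
    float3 NAdj = cross( v2.pos - vAdj.pos, v1.pos - vAdj.pos );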
 
    float3 outpos[4];
    float3 extrude1 = normalize(v1.pos - g_vLightPos);
    float3 extrude2 = normalize(v2.pos - g_vLightPos);
 
    outpos[0] = v1.pos + g_fExtrudeBias*extrude1;
    outpos[1] = v1.pos + g_fExtrudeAmt*extrude1;
    outpos[2] = v2.pos + g_fExtrudeBias*extrude2;
    outpos[3] = v2.pos + g_fExtrudeAmt*extrude2;
 
    // Extrude silhouette to create two new triangles
    PSShadowIn Out;
    for(int v=0; v<4; v++)
    {
        Out.pos = mul( float4(outpos[v],1), g_mViewProj );
        ShadowTriangleStream.Append( Out );
    }
    ShadowTriangleStream.RestartStrip();
}
 
 
//This is the crux of the code. It calls the above function as
//needed, and generates the cap geometry for the shadow volume.
 
//
// GS for generating shadow volumes
//
[maxvertexcount(18)]
void GSShadowmain( triangleadj GSShadowIn In[6], inout TriangleStream<PSShadowIn> ShadowTriangleStream )
{
    // Compute un-normalized triangle normal
    float3 N = cross( In[2].pos - In[0].pos, In[4].pos - In[0].pos );
 
    // Compute direction from this triangle to the light
    float3 lightDir = g_vLightPos - In[0].pos;
 
    //if we're facing the light
    if( dot(N, lightDir) > 0.0f )
    {
        // For each edge of the triangle, determine if it is a silhouette edge
        DetectAndProcessSilhouette( lightDir, In[0], In[2], In[1], ShadowTriangleStream );
        DetectAndProcessSilhouette( lightDir, In[2], In[4], In[3], ShadowTriangleStream );
        DetectAndProcessSilhouette( lightDir, In[4], In[0], In[5], ShadowTriangleStream );
 
        //near cap
        PSShadowIn Out;
        for(int v=0; v<6; v+=2)
        {
            float3 extrude = normalize(In[v].pos - g_vLightPos);
 
            float3 pos = In[v].pos + g_fExtrudeBias*extrude;
            Out.pos = mul( float4(pos,1), g_mViewProj );
            ShadowTriangleStream.Append( Out );
        }
        ShadowTriangleStream.RestartStrip();
 
        //far cap (reverse the order)
        for(int v=4; v>=0; v-=2)
        {
            float3 extrude = normalize(In[v].pos - g_vLightPos);
 
            float3 pos = In[v].pos + g_fExtrudeAmt*extrude;
            Out.pos = mul( float4(pos,1), g_mViewProj );
            ShadowTriangleStream.Append( Out );
        }
        ShadowTriangleStream.RestartStrip();
    }
}

Apart from the extrusion step discussed here, the application of the extruded data to create the stencil values is the same in both the old and the new way. However, doing the extrusion in the geometry shader saves valuable clock cycles on both the CPU and the GPU as compared to using vertex shaders with extra geometry. Programmers using Direct3D 10 should embrace the new method for increased geometry performance, with virtually no drawbacks as compared to the legacy way.

It should be noted that the volumetric shadow technique itself has a flaw that is not trivial to fix: the regular lighting of the scene typically uses normals interpolated between the vertices, whereas the shadow volume extrusion is based on the normals of the faces at the silhouette edge of the geometry. This means that the extruded edges of the shadow volume overlap the regular shading in parts where the triangles of the edge would be lit by the interpolated light values. The overlap is best described as an irregular sawtooth edge artifact along the triangles connected to the silhouette edges on the lit side of the geometry, as seen in the following image (with the artifact boundary marked with a red line):

Image:Pvgps3_6_sphere_vol_artifd.png

While it is possible to lessen the effect of the artifact by using higher-resolution geometry to hide the edge lighting inconsistency, it is often a better decision to use shadow maps instead.
