The previous nine chapters cover all of the lighting content. Because they discuss a significant number of methods and algorithms, this final chapter is included as a reference: it compares and summarises the key points of interest for each approach discussed so far. For further information on any point summarised here, refer back to the relevant chapter.
In simple terms there are two phases to lighting: transport and interaction. The complexity of each phase can be varied, most notably by choosing between global and local illumination. Some key points:
| Global | Local |
|---|---|
| Based on a physical understanding of light energy’s interaction with the entire environment | Only considers the relationship between source and destination |
| Tends to be very expensive in terms of performance | Can be easily implemented for real-time performance |
| Generates much higher quality images | Generates acceptable quality images |
| Shadowing is implicitly included | Requires an explicit shadowing algorithm to be implemented |
Global illumination is an area of active research, and in recent years considerable advances in performance have been made. Techniques such as pre-computed radiance transfer and spherical harmonics are popular for bridging the real-time gap. It would not be surprising if global illumination became a realistic option for general lighting in real-time applications in the future.
The use of lighting models should be an integral part of any graphics architecture – it is possible to shoe-horn lighting models into existing code, but consideration in the design phase is always better. Consider the following:
The resolution of lighting calculations has a big impact on the quality of the final image. Refer to the following images from the third chapter:

From left to right, image 10.1 shows per-triangle, per-vertex and per-pixel lighting. The left-most is the cheapest and the right-most is the most expensive. The cost of both per-triangle and per-vertex lighting depends on the complexity of the geometry being rendered; the cost of per-pixel lighting also varies with the output resolution.
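As a rough way to see how these costs scale, the following C++ sketch counts lighting evaluations per frame for each resolution. The `FrameInfo` structure, the function names and the `overdraw` factor are illustrative assumptions, not figures or code from the earlier chapters.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one rendered frame (all names are illustrative).
struct FrameInfo {
    std::uint64_t triangleCount; // triangles submitted for rendering
    std::uint64_t vertexCount;   // vertices processed by the vertex shader
    std::uint32_t width;         // output resolution in pixels
    std::uint32_t height;
    double        overdraw;      // assumed average number of times each pixel is shaded
};

// Per-triangle lighting: one evaluation per face, so cost follows geometry.
std::uint64_t perTriangleEvaluations(const FrameInfo& f) {
    return f.triangleCount;
}

// Per-vertex lighting: one evaluation per vertex, so cost also follows geometry.
std::uint64_t perVertexEvaluations(const FrameInfo& f) {
    return f.vertexCount;
}

// Per-pixel lighting: one evaluation per shaded pixel, so cost follows
// the output resolution (scaled here by the assumed overdraw factor).
std::uint64_t perPixelEvaluations(const FrameInfo& f) {
    assert(f.overdraw >= 0.0);
    const double shaded = static_cast<double>(f.width) * f.height * f.overdraw;
    return static_cast<std::uint64_t>(shaded);
}
```

With a hypothetical 10,000-triangle mesh rendered at 1280x720 and an overdraw of 1.0, the per-pixel count (921,600) is far larger than either geometry-based count. Doubling the resolution in each dimension quadruples it while the other two stay the same.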
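There are four types of per-pixel lighting to consider. In order of performance (fastest to slowest) and quality (lowest to highest), as listed in the per-pixel technique table later in this chapter, they are: normal mapping, parallax mapping with offset limiting, ray-traced, and ray-traced with shadows.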
There are several storage-reduction methods available for per-pixel lighting techniques:
If realism is the priority then understanding the material being modelled is essential. The human brain is very good at picking up mismatches or incorrect results for materials it is familiar with in the real world.
Real-world values for energy response, colour and other inputs can be hard to find but do exist. Matching values between rendered images and real-world measurements can make a noticeable difference.
Six main lighting models were introduced in this section of the book. These should cover most uses but are by no means the only lighting models available. For more specific requirements it is worth researching other available models – searching online for other work by the authors of the papers referenced in this section is a good starting point.
Some basic highlights of the models are as follows:
Phong and Blinn-Phong
Cook-Torrance
Oren-Nayar
Strauss
Ward
Ashikhmin-Shirley
It is worth noting that an individual model may perform well for a particular class of materials, but this rarely means that it supports only that class. Appropriate use of the controlling parameters allows most models to be general purpose.
Drawing general conclusions about performance on a massively parallel system, such as one combining CPUs and GPUs, is very difficult. However, the number of instructions in the assembly output from the HLSL compiler can be a reasonable indicator of which approaches are likely to be the most expensive to execute:
| Light source | Approximate number of instructions |
|---|---|
| Directional | 19 |
| Point | 38 |
| Spot | 44 |
| Area | 46-548 |
The per-pixel techniques all rely on supporting data: a normal map, a height map or both. The following table lists the storage required, assuming 512x512 textures stored at 8 bits per channel:
| Per-pixel technique | Approximate number of instructions | Storage required (megabytes) |
|---|---|---|
| Normal mapping | 9 | 1.33 |
| Parallax with offset limiting | 15 | 1.58 |
| Parallax with offset limiting (simple normal generation) | 28 | 0.25 |
| Parallax with offset limiting (Sobel filter) | 53 | 0.25 |
| Ray-traced | 68 | 1.58 |
| Ray-traced (simple normal generation) | 81 | 0.25 |
| Ray-traced (Sobel filter) | 105 | 0.25 |
| Ray-traced with shadows | 114 | 1.58 |
| Ray-traced with shadows (simple normal generation) | 126 | 0.25 |
| Ray-traced with shadows (Sobel filter) | 151 | 0.25 |
The above table lists the number of instructions for the pixel shader component, which is the most significant part of any per-pixel technique. All of the techniques also rely on a 36-instruction vertex shader to perform the tangent-space transformations.
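The storage column can be reproduced with a little arithmetic. The sketch below is one interpretation that matches the published figures: it assumes a four-channel normal map stored with a full mip chain and a single-channel height map stored without one. Neither assumption is stated in this chapter, and the function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for a square texture with 8-bit channels.
//   size     : edge length in texels (assumed to be a power of two)
//   channels : number of 8-bit channels per texel
//   withMips : if true, include every mip level down to 1x1
std::uint64_t textureBytes(std::uint32_t size, std::uint32_t channels, bool withMips) {
    assert(size > 0 && (size & (size - 1)) == 0);
    std::uint64_t bytes = 0;
    for (std::uint32_t s = size; s >= 1; s /= 2) {
        bytes += static_cast<std::uint64_t>(s) * s * channels;
        if (!withMips) {
            break; // only the top level is stored
        }
    }
    return bytes;
}

// Convert bytes to megabytes (1 MB taken as 1024 * 1024 bytes).
double toMegabytes(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

// Assumed layouts: RGBA normal map with mips, single-channel height map without.
std::uint64_t normalMapBytes() { return textureBytes(512, 4, true); }
std::uint64_t heightMapBytes() { return textureBytes(512, 1, false); }
```

Under these assumptions the normal map comes to roughly 1.33 MB and the height map to 0.25 MB. Techniques that need both total roughly 1.58 MB, while those that generate normals from the height map need only the 0.25 MB.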
Once a light source and a per-pixel lighting technique have been chosen, a lighting model is needed to generate the final colour. The following table summarises the performance characteristics of all the models shown in previous chapters; they use Phong shading as their framework.
| Lighting model | Approximate number of instructions |
|---|---|
| Blinn-Phong | 23 |
| Phong | 26 |
| Ward (Isotropic with look-up) | 33 |
| Oren-Nayar (simplified evaluation with look-up) | 38 |
| Cook-Torrance (look-up roughness, approximated Fresnel) | 44 |
| Ward (Isotropic) | 44 |
| Ward (Anisotropic) | 48 |
| Cook-Torrance (Beckmann roughness, approximated Fresnel) | 52 |
| Cook-Torrance (look-up roughness, simple Fresnel) | 54 |
| Cook-Torrance (Gaussian roughness, approximated Fresnel) | 54 |
| Strauss | 60 |
| Oren-Nayar (simplified evaluation) | 62 |
| Cook-Torrance (Beckmann roughness, simple Fresnel) | 62 |
| Cook-Torrance (Gaussian roughness, simple Fresnel) | 65 |
| Ashikhmin-Shirley | 71 |
| Oren-Nayar (full evaluation) | 77 |
With three fundamental choices when implementing a lighting model (light source, per-pixel technique and lighting model) it is difficult to represent all performance combinations in a manageable form. However, using the previous three tables it is possible to put some rough estimates together for how a particular method is likely to perform.
For example, combining a spot light with ray-traced per-pixel lighting and the Ashikhmin-Shirley model is going to produce a very complex shader, whilst a combination of a point light, parallax mapping with offset limiting and Phong produces a much shorter one.
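A minimal sketch of such an estimate simply adds the three table entries together. Treating the counts as additive is an assumption: the compiler may share or eliminate instructions across the three parts, and the lighting-model figures were measured inside their own framework, so the sum is only a rough guide for comparing combinations. The constant and function names are illustrative.

```cpp
#include <cassert>

// Approximate instruction counts copied from the three tables in this chapter.
constexpr int kVertexShader           = 36; // tangent-space vertex shader
constexpr int kPointLight             = 38;
constexpr int kSpotLight              = 44;
constexpr int kParallaxOffsetLimiting = 15;
constexpr int kRayTraced              = 68;
constexpr int kPhong                  = 26;
constexpr int kAshikhminShirley       = 71;

// Rough pixel-shader estimate, assuming the three parts simply add up.
int estimatePixelShader(int lightSource, int perPixelTechnique, int lightingModel) {
    assert(lightSource >= 0 && perPixelTechnique >= 0 && lightingModel >= 0);
    return lightSource + perPixelTechnique + lightingModel;
}

// Rough total including the vertex shader shared by all per-pixel techniques.
int estimateTotal(int lightSource, int perPixelTechnique, int lightingModel) {
    return kVertexShader + estimatePixelShader(lightSource, perPixelTechnique, lightingModel);
}
```

Under this assumption the spot light, ray-traced, Ashikhmin-Shirley combination comes to roughly 183 pixel shader instructions. The point light, parallax, Phong combination comes to roughly 79.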
It is important to realise that the implementations of algorithms provided in this section are not exhaustively optimized. They are not intentionally sub-optimal, but clarity and simplicity were chosen over complex but potentially faster methods.
Several chapters show how look-up textures can be used to balance between storage space and instruction counts. This basic principle can be employed in many other areas and is an important optimization opportunity.
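The trade-off can be illustrated on the CPU with a one-dimensional table standing in for a look-up texture. This sketch is not one of the look-up textures from the earlier chapters: it tabulates a simple specular power term, `pow(x, exponent)`, chosen only as a convenient stand-in, and the class name and table size are illustrative assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// A 1D look-up table that replaces repeated pow() evaluations with a fetch.
class SpecularLut {
public:
    // Precompute pow(x, exponent) at `samples` evenly spaced points in [0, 1].
    SpecularLut(float exponent, std::size_t samples) : table_(samples) {
        assert(samples >= 2);
        for (std::size_t i = 0; i < samples; ++i) {
            const float x = static_cast<float>(i) / static_cast<float>(samples - 1);
            table_[i] = std::pow(x, exponent);
        }
    }

    // Fetch with linear interpolation between the two nearest entries,
    // roughly what a linearly filtered, clamped texture sample would do.
    float sample(float x) const {
        x = std::fmin(std::fmax(x, 0.0f), 1.0f); // clamp to the table's domain
        const float pos = x * static_cast<float>(table_.size() - 1);
        const std::size_t i0 = static_cast<std::size_t>(pos);
        const std::size_t i1 = (i0 + 1 < table_.size()) ? i0 + 1 : i0;
        const float t = pos - static_cast<float>(i0);
        return table_[i0] + t * (table_[i1] - table_[i0]);
    }

    // Storage paid in exchange for the saved arithmetic.
    std::size_t storageBytes() const { return table_.size() * sizeof(float); }

private:
    std::vector<float> table_;
};
```

A larger table costs more storage but tracks the original function more closely. A smaller one saves memory at the cost of visible error wherever the function changes quickly.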
The advanced programming constructs that come as standard in shader model 4, such as looping and branching, also offer numerous optimization opportunities. Expensive evaluations can be dynamically eliminated by branching on an important input; for example, executing the lighting calculation only for surfaces oriented towards the light requires a simple `if( dot( normal, light ) >= 0.0f ) { /* code */ }` construct.
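The same early-out can be mirrored in C++ to show the control flow. This is a CPU-side illustration rather than shader code. `expensiveLighting` is a hypothetical placeholder (a plain Lambert term here) for whichever lighting model is in use, and the evaluation counter exists only to make the skipped work visible.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Lighting vectors are expected to be normalised; allow a small tolerance.
inline bool isApproxUnit(const Vec3& v) {
    const float lengthSquared = dot(v, v);
    return lengthSquared > 0.99f && lengthSquared < 1.01f;
}

// Hypothetical stand-in for an expensive lighting model (here just N.L).
inline float expensiveLighting(const Vec3& normal, const Vec3& light) {
    return dot(normal, light);
}

// Only evaluate the lighting model for surfaces oriented towards the light.
// `evaluations` counts how many times the expensive path was actually taken.
inline float shade(const Vec3& normal, const Vec3& light, int& evaluations) {
    assert(isApproxUnit(normal) && isApproxUnit(light));
    if (dot(normal, light) >= 0.0f) {
        ++evaluations;
        return expensiveLighting(normal, light);
    }
    return 0.0f; // facing away from the light: skip the calculation entirely
}
```

Calling `shade` once with the light along the normal and once with it directly behind the surface leaves the counter at one, since the second call returns before reaching the lighting model.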
Whilst the software side of Direct3D 10 can efficiently express these constructs, it is worth noting that not all hardware will do such a good job. In previous generations different GPUs could have wildly different performance characteristics for dynamic flow control. This made it difficult to write one shader that would get the best performance from all hardware, and it is quite possible that the same will happen with Direct3D 10 hardware. The exact details are beyond the scope of this book; referring to IHV-provided information is the best approach.