The lighting model put forward by Michael Oren and Shree Nayar in their 1994 paper is a substantial departure from the previous three. Not only does it deviate from the otherwise standard Lambertian model but it is also a diffuse-only model and doesn’t focus on modelling specular highlights.
This model is particularly useful as it specialises in rough surfaces – such as brick, concrete and also clothing materials such as velvet. A key characteristic of a rough surface is the degree of light scattering due to the uneven surface; this gives rise to several observable qualities that cannot be replicated using the more traditional lighting models.
Consider the full moon when viewed on a clear night. In this situation its shape is circular and has a very clearly defined edge when seen against the background of space. However, we know that the moon is a spherical body and that towards the edge of the full moon its surface is oriented away from our point of view and the light source. Using a standard Lambertian approach, the basis for previous models, this would mean that the edge of the full moon should tend towards darkness – but this is quite clearly not the case! The following image shows a comparison with a Lambertian sphere on the left and a rough sphere on the right.

This deviation is down to the scattering of light from a rough surface – all of the tiny imperfections scatter and reflect light in many more directions than a smooth surface might. The previous chapter covering the Cook-Torrance model introduced the idea of ‘micro-facets’ and this concept provides a foundation for the Oren-Nayar model.
Despite the model breaking Lambertian norms, each micro-facet on the surface is assumed to show Lambertian reflection; it is the interaction of these facets that allows the overall result to deviate. The previous chapter covered distribution functions and self-shadowing and this model introduces the concept of inter-reflection.

The preceding diagram shows a situation where a facet receiving no direct light is viewed. On the left, where no inter-reflection is considered, the observed facet is dark and unlit. On the right the facet indirectly receives light energy due to inter-reflection and consequently appears lit to the viewer. This effect is most noticeable on rough surfaces because smooth surfaces have small, if any, facets to generate inter-reflection.
For rough surfaces it is quite possible for the visible side of a facet to be oriented away from the light source yet its opposite and invisible side to be oriented towards the light source. By assuming that the lit side reflects some of the received light energy the visible side of the facet will in turn reflect some light energy back towards the viewer.
As will become evident over the remainder of this chapter, this model is one of the more complex to implement simply due to the heavy use of spherical coordinates and trigonometric functions. This doesn’t invalidate it as an option for real-time computer graphics; rather it requires consideration regarding its usage. In simple terms it should be used sparingly and only where the extra computational expense is going to have a significant influence on the final image.
The previous Cook-Torrance model is easy to break down into constituent parts, but unfortunately the rendering equations proposed by Oren and Nayar are more monolithic. Spherical coordinates are used in the original equations as they offer numerous mathematical advantages; however, for implementation purposes this is inconvenient – most Direct3D graphics will be based on the Cartesian coordinate system. The next section of this chapter demonstrates how it is possible to convert to a more shader-friendly vector form.
Unit vectors in Cartesian form require three components – x, y and z. Due to their length being exactly 1.0 they can be represented by only two values when using spherical coordinates – θ and Φ. Conversion can be performed according to the following equations:
$\phi = \cos^{-1} z$
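As a rough illustration of this two-angle representation, the following C++ sketch converts a unit vector to a pair of angles and back. It assumes a tangent-space vector whose z component lies along the surface normal, so the polar angle corresponds to the cos⁻¹ z expression above; the azimuth via `atan2` is an assumption of this sketch. The names `Spherical`, `ToSpherical` and `ToCartesian` are hypothetical and deliberately avoid assigning θ or φ to either angle.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Two angles describing a unit-length direction (hypothetical helper type).
struct Spherical { float polar; float azimuth; };

// Convert a unit-length tangent-space vector to two angles.
// Assumption: z is the component along the surface normal.
inline Spherical ToSpherical(float x, float y, float z)
{
    // The two-angle form only holds for unit-length input.
    assert(std::fabs(x * x + y * y + z * z - 1.0f) < 1e-3f);

    Spherical s;
    // Clamp so rounding error cannot push acos outside its domain.
    s.polar   = std::acos(std::max(-1.0f, std::min(1.0f, z)));
    s.azimuth = std::atan2(y, x); // angle within the x,y plane
    return s;
}

// Inverse conversion back to Cartesian components.
inline void ToCartesian(const Spherical& s, float& x, float& y, float& z)
{
    x = std::sin(s.polar) * std::cos(s.azimuth);
    y = std::sin(s.polar) * std::sin(s.azimuth);
    z = std::cos(s.polar);
}
```

A vector pointing straight along the normal, (0, 0, 1), has a polar angle of zero; a vector lying in the surface plane has a polar angle of π/2.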
The Oren-Nayar equation takes the following values as input:
| Φi, θi | Light’s direction vector |
| Φr, θr | View direction vector |
| σ | Roughness coefficient, between 0.0 and 1.0 |
| E0 | Incoming light energy |
| ρ/π | Diffuse colour for the current pixel |
The complete equation, broken down in order to make it easier to read in printed form, is as follows:
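Written out in LaTeX, the direct-illumination component $L_1$ below is consistent, term by term, with the HLSL implementation given later in this chapter. The inter-reflection component $L_2$ does not appear in that shader code; it is quoted here in the form published in [Oren&Nayar94] and should be checked against the paper before use.

```latex
\begin{aligned}
\alpha &= \max(\theta_i, \theta_r), \qquad \beta = \min(\theta_i, \theta_r) \\
C_1 &= 1 - 0.5\,\frac{\sigma^2}{\sigma^2 + 0.33} \\
C_2 &= \begin{cases}
  0.45\,\dfrac{\sigma^2}{\sigma^2 + 0.09}\,\sin\alpha
    & \text{if } \cos(\phi_r - \phi_i) \ge 0 \\[2ex]
  0.45\,\dfrac{\sigma^2}{\sigma^2 + 0.09}
    \left(\sin\alpha - \left(\dfrac{2\beta}{\pi}\right)^{3}\right)
    & \text{otherwise}
\end{cases} \\
C_3 &= 0.125\,\frac{\sigma^2}{\sigma^2 + 0.09}
       \left(\frac{4\alpha\beta}{\pi^2}\right)^{2} \\
L_1 &= \frac{\rho}{\pi}\,E_0 \cos\theta_i
       \left[ C_1 + \cos(\phi_r - \phi_i)\,C_2 \tan\beta
       + \bigl(1 - \lvert\cos(\phi_r - \phi_i)\rvert\bigr)\,C_3
         \tan\frac{\alpha + \beta}{2} \right] \\
L_2 &= 0.17\,\frac{\rho^2}{\pi}\,E_0 \cos\theta_i\,
       \frac{\sigma^2}{\sigma^2 + 0.13}
       \left[ 1 - \cos(\phi_r - \phi_i)\left(\frac{2\beta}{\pi}\right)^{2} \right] \\
L_r &= L_1 + L_2
\end{aligned}
```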
Evaluating the above equation in full will require six calls to trigonometric functions (three cosines, two tangents and one sine). This, combined with the possibility of converting Cartesian input vectors into spherical coordinates, makes it a very complex evaluation and definitely not the most performance-friendly!
Due to the computational complexity of the previous equation, Oren and Nayar proposed a simplification as part of the same publication. As with most simplifications this sacrifices quality and accuracy, but this qualitative model was arrived at by studying the significance of the many terms in the previous equation. Both C3 and inter-reflection proved to have a minimal effect on the final image yet were responsible for a substantial amount of the computation – the following simplified equation is based on the removal of these two terms:
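In LaTeX, and matching the structure of the simplified HLSL shader given later in this chapter, the qualitative model can be written as follows. The constants shown for $A$ and $B$ are those of $C_1$ and $C_2$ in the full model.

```latex
\begin{aligned}
A   &= 1 - 0.5\,\frac{\sigma^2}{\sigma^2 + 0.33} \\
B   &= 0.45\,\frac{\sigma^2}{\sigma^2 + 0.09} \\
L_r &= \frac{\rho}{\pi}\,E_0 \cos\theta_i
       \left[ A + B \max\bigl(0, \cos(\phi_r - \phi_i)\bigr)
       \sin\alpha \tan\beta \right]
\end{aligned}
```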
The A and B terms are directly derived from C1 and C2 in the original lighting model. An oft-missed footnote in the original paper suggests that replacing the 0.33 constant in the original C1 equation with 0.57 can improve the quality slightly.
It is immediately obvious that this simplification has removed a lot of the work required for the original model. The simplified model still depends on four trigonometric functions, but as will be shown in the following section it is possible to further reduce this.
As previously shown, the equations are heavy on spherical coordinates and trigonometry. This doesn’t suit GPUs, so it is better to convert to vector form where possible.
Common terms in vector form:
The circular angle, γ, can be converted into vector mathematics by projecting the two angles onto the x,y plane in tangent space. When dealing with unit vectors it is standard to replace the cosine function with a much simpler dot-product operation. The full equation in vector form:
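As a CPU-side illustration of these vector-form terms, the C++ sketch below computes α, β and γ directly from unit-length normal, light and view vectors. The `Vec3` type and the helper names are hypothetical. Note one deliberate assumption: the projected vectors are divided by their lengths so that γ equals cos(φr − φi) exactly, whereas the HLSL listings in this chapter take the dot product of the projections without that division.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3  Sub(Vec3 a, Vec3 b)     { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline Vec3  Scale(Vec3 a, float s)  { return { a.x * s, a.y * s, a.z * s }; }
inline float Dot(Vec3 a, Vec3 b)     { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct AngleTerms { float alpha, beta, gamma; };

// n, l and v are assumed to be unit length.
inline AngleTerms ComputeAngleTerms(Vec3 n, Vec3 l, Vec3 v)
{
    assert(std::fabs(Dot(n, n) - 1.0f) < 1e-3f);

    // Cosines of the polar angles, clamped for acos.
    const float NdotL = std::max(-1.0f, std::min(1.0f, Dot(n, l)));
    const float NdotV = std::max(-1.0f, std::min(1.0f, Dot(n, v)));

    const float thetaI = std::acos(NdotL);
    const float thetaR = std::acos(NdotV);

    // Project l and v onto the plane perpendicular to n.
    const Vec3 lProj = Sub(l, Scale(n, NdotL));
    const Vec3 vProj = Sub(v, Scale(n, NdotV));

    // Normalise the projections; fall back to 0 when either vector
    // is parallel to the normal and the azimuth is undefined.
    const float lenProduct = std::sqrt(Dot(lProj, lProj) * Dot(vProj, vProj));
    const float gamma = (lenProduct > 1e-6f) ? Dot(lProj, vProj) / lenProduct : 0.0f;

    return { std::max(thetaI, thetaR), std::min(thetaI, thetaR), gamma };
}
```

With the normal along z, a light at 60 degrees from the normal and a viewer at 45 degrees on the opposite side of the normal give α = π/3, β = π/4 and γ = −1.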
The above equations when implemented in HLSL are as follows:
float4 psOrenNayarComplex( in VS_OUTPUT f ) : SV_TARGET
{
    const float PI = 3.14159f;

    // Make sure the interpolated inputs and
    // constant parameters are normalized
    float3 n = normalize( f.normal );
    float3 l = normalize( -vLightDirection );
    float3 v = normalize( pCameraPosition - f.world );

    // Compute the other aliases
    float alpha = max( acos( dot( v, n ) ), acos( dot( l, n ) ) );
    float beta = min( acos( dot( v, n ) ), acos( dot( l, n ) ) );

    // Dot product of v and l projected onto the plane perpendicular
    // to n. The projections are not normalized here, so gamma is
    // cos( phi_r - phi_i ) scaled by the projected lengths.
    float gamma = dot( v - n * dot( v, n ), l - n * dot( l, n ) );

    float rough_sq = fRoughness * fRoughness;

    float C1 = 1.0f - 0.5f * ( rough_sq / ( rough_sq + 0.33f ) );

    // C2 has two forms depending on the sign of gamma
    float C2 = 0.45f * ( rough_sq / ( rough_sq + 0.09f ) );
    if( gamma >= 0 )
    {
        C2 *= sin( alpha );
    }
    else
    {
        C2 *= ( sin( alpha ) - pow( (2 * beta) / PI, 3 ) );
    }

    float C3 = (1.0f / 8.0f);
    C3 *= ( rough_sq / ( rough_sq + 0.09f ) );
    C3 *= pow( ( 4.0f * alpha * beta ) / (PI * PI), 2 );

    float A = gamma * C2 * tan( beta );
    float B = (1 - abs( gamma )) * C3 * tan( (alpha + beta) / 2.0f );

    float3 final = cDiffuse * max( 0.0f, dot( n, l ) ) * ( C1 + A + B );

    return float4( final, 1.0f );
}
Evaluation of the C2 term requires the use of a dynamic branch in the shader code which is not ideal. Use of the [branch] and [flatten] attributes can push the compiler in either direction, but the compiled shader code may well end up evaluating both branches in order to achieve this.
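One way to see what "evaluating both branches" amounts to is to write the C2 term with a 0/1 mask instead of control flow. The C++ sketch below is only an illustration under that assumption; the function names are hypothetical, and whether a shader compiler produces something equivalent is not guaranteed. The masked form always pays for the cubic term, which is the trade-off being described.

```cpp
#include <cassert>
#include <cmath>

// Reference version with an explicit branch, mirroring the shader code.
inline float C2Branching(float roughSq, float alpha, float beta, float gamma)
{
    const float PI = 3.14159265f;
    float c2 = 0.45f * (roughSq / (roughSq + 0.09f));
    if (gamma >= 0.0f)
    {
        c2 *= std::sin(alpha);
    }
    else
    {
        c2 *= std::sin(alpha) - std::pow((2.0f * beta) / PI, 3.0f);
    }
    return c2;
}

// Branch-free variant: the cubic term is always computed and then
// multiplied by a mask that is 1 when gamma < 0 and 0 otherwise.
inline float C2Masked(float roughSq, float alpha, float beta, float gamma)
{
    assert(roughSq >= 0.0f);
    const float PI = 3.14159265f;
    const float mask  = (gamma < 0.0f) ? 1.0f : 0.0f;
    const float cubic = std::pow((2.0f * beta) / PI, 3.0f);
    return 0.45f * (roughSq / (roughSq + 0.09f)) * (std::sin(alpha) - mask * cubic);
}
```

Both functions should agree for any sign of gamma; only the way the selection is expressed differs.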
The simplified equation in vector form is straight-forward and still relies on the previous conversions of α, β and γ:
Implemented in HLSL:
float4 psOrenNayarSimple
(
    in VS_OUTPUT f,
    uniform bool UseLookUpTexture
) : SV_TARGET
{
    // Make sure the interpolated inputs and
    // constant parameters are normalized
    float3 n = normalize( f.normal );
    float3 l = normalize( -vLightDirection );
    float3 v = normalize( pCameraPosition - f.world );

    // Both dot-products are in the range -1.0 to +1.0
    float VdotN = dot( v, n );
    float LdotN = dot( l, n );

    // Compute the other aliases. The projections of v and l are
    // not normalized, so gamma is cos( phi_r - phi_i ) scaled by
    // the projected lengths.
    float gamma = dot
    (
        v - n * VdotN,
        l - n * LdotN
    );

    float rough_sq = fRoughness * fRoughness;

    float A = 1.0f - 0.5f * (rough_sq / (rough_sq + 0.57f));
    float B = 0.45f * (rough_sq / (rough_sq + 0.09f));

    float C;
    if( UseLookUpTexture )
    {
        // Remap the two dot-products from -1.0..+1.0 to
        // 0.0..1.0, which is the range a texture lookup expects:
        float2 tc = float2
        (
            (VdotN + 1.0f) / 2.0f,
            (LdotN + 1.0f) / 2.0f
        );
        C = texSinTanLookup.Sample( DefaultSampler, tc ).r;
    }
    else
    {
        float alpha = max( acos( VdotN ), acos( LdotN ) );
        float beta = min( acos( VdotN ), acos( LdotN ) );
        C = sin(alpha) * tan(beta);
    }

    float3 final = (A + B * max( 0.0f, gamma ) * C);
    return float4( cDiffuse * max( 0.0f, dot( n, l ) ) * final, 1.0f );
}
Both the full and simplified forms still rely on the sin()/tan() terms. The simplified form of the model makes it trivial to factor out these expensive operations and replace them with a single texture fetch, as shown by the presence of the compile-time switch in the above HLSL code. The following function from the main application shows how to generate the look-up texture referenced by the HLSL:
HRESULT CreateSinTanLookupTexture( ID3D10Device* pDevice )
{
    HRESULT hr = S_OK;

    // The remapped dot-product will be between 0.0 and 1.0
    // covering 180 degrees thus a look-up of 512 pixels
    // gives roughly 1/3rd of a degree accuracy which
    // should be enough...
    const UINT LOOKUP_DIMENSION = 512;

    // Describe the texture
    D3D10_TEXTURE2D_DESC texDesc;
    texDesc.ArraySize = 1;
    texDesc.BindFlags = D3D10_BIND_SHADER_RESOURCE;
    texDesc.CPUAccessFlags = 0;
    texDesc.Format = DXGI_FORMAT_R32_FLOAT;
    texDesc.Height = LOOKUP_DIMENSION;
    texDesc.Width = LOOKUP_DIMENSION;
    texDesc.MipLevels = 1;
    texDesc.MiscFlags = 0;
    texDesc.SampleDesc.Count = 1;
    texDesc.SampleDesc.Quality = 0;
    texDesc.Usage = D3D10_USAGE_IMMUTABLE;

    // Generate the initial data
    float* fLookup = new float[ LOOKUP_DIMENSION * LOOKUP_DIMENSION ];

    for( UINT x = 0; x < LOOKUP_DIMENSION; ++x )
    {
        for( UINT y = 0; y < LOOKUP_DIMENSION; ++y )
        {
            // This following fragment is a direct conversion of
            // the code that appears in the HLSL shader
            float VdotN = static_cast< float >( x )
                        / static_cast< float >( LOOKUP_DIMENSION );
            float LdotN = static_cast< float >( y )
                        / static_cast< float >( LOOKUP_DIMENSION );

            // Convert the 0.0..1.0 ranges to be -1.0..+1.0
            VdotN *= 2.0f;
            VdotN -= 1.0f;
            LdotN *= 2.0f;
            LdotN -= 1.0f;

            float alpha = max( acosf( VdotN ), acosf( LdotN ) );
            float beta = min( acosf( VdotN ), acosf( LdotN ) );

            fLookup[ x + y * LOOKUP_DIMENSION ]
                = sinf( alpha ) * tanf( beta );
        }
    }

    D3D10_SUBRESOURCE_DATA initialData;
    initialData.pSysMem = fLookup;
    initialData.SysMemPitch = sizeof(float) * LOOKUP_DIMENSION;
    initialData.SysMemSlicePitch = 0;

    // Create the actual texture
    hr = pDevice->CreateTexture2D
    (
        &texDesc,
        &initialData,
        &g_pSinTanTexture
    );
    if( FAILED( hr ) )
    {
        SAFE_DELETE_ARRAY( fLookup );
        return hr;
    }

    // Create a view onto the texture
    ID3D10ShaderResourceView* pLookupRV = NULL;
    hr = pDevice->CreateShaderResourceView
    (
        g_pSinTanTexture,
        NULL,
        &pLookupRV
    );
    if( FAILED( hr ) )
    {
        SAFE_RELEASE( pLookupRV );
        SAFE_RELEASE( g_pSinTanTexture );
        SAFE_DELETE_ARRAY( fLookup );
        return hr;
    }

    // Bind it to the effect variable
    ID3D10EffectShaderResourceVariable *pFXVar
        = g_pEffect->GetVariableByName("texSinTanLookup")->AsShaderResource( );
    if( !pFXVar->IsValid() )
    {
        SAFE_RELEASE( pLookupRV );
        SAFE_RELEASE( g_pSinTanTexture );
        SAFE_DELETE_ARRAY( fLookup );
        // hr still holds a success code at this point, so report
        // the missing effect variable explicitly
        return E_FAIL;
    }

    pFXVar->SetResource( pLookupRV );

    // Clear up any intermediary resources
    SAFE_RELEASE( pLookupRV );
    SAFE_DELETE_ARRAY( fLookup );

    return hr;
}
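To check that the table and the shader agree on how the two dot-products are mapped to texture coordinates, the same idea can be exercised without Direct3D. The C++ sketch below builds the table in a `std::vector` and fetches from it with a nearest-texel lookup. All function names are hypothetical, and the nearest-texel fetch is a simplification: the sampler state used by the real shader (filtering and texel-centre addressing) will give slightly different values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Direct evaluation of the term stored in the look-up texture.
inline float SinTanDirect(float VdotN, float LdotN)
{
    const float a = std::acos(VdotN);
    const float b = std::acos(LdotN);
    return std::sin(std::max(a, b)) * std::tan(std::min(a, b));
}

// Build an n x n table; texel (x, y) stores the term for the
// dot-products obtained by remapping x/n and y/n to -1.0..+1.0.
inline std::vector<float> BuildSinTanTable(unsigned n)
{
    assert(n > 1);
    std::vector<float> table(static_cast<std::size_t>(n) * n);
    for (unsigned y = 0; y < n; ++y)
    {
        for (unsigned x = 0; x < n; ++x)
        {
            const float VdotN = 2.0f * static_cast<float>(x) / static_cast<float>(n) - 1.0f;
            const float LdotN = 2.0f * static_cast<float>(y) / static_cast<float>(n) - 1.0f;
            table[x + y * n] = SinTanDirect(VdotN, LdotN);
        }
    }
    return table;
}

// Nearest-texel fetch using the same -1..+1 to 0..1 remap as the shader.
inline float SinTanLookup(const std::vector<float>& table, unsigned n,
                          float VdotN, float LdotN)
{
    const float u = (VdotN + 1.0f) * 0.5f;
    const float w = (LdotN + 1.0f) * 0.5f;
    const unsigned x = std::min(n - 1, static_cast<unsigned>(u * static_cast<float>(n)));
    const unsigned y = std::min(n - 1, static_cast<unsigned>(w * static_cast<float>(n)));
    return table[x + y * n];
}
```

Comparing `SinTanLookup` against `SinTanDirect` for a handful of inputs gives a quick feel for the quantisation error introduced by a given table size.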
With some work it would be possible to apply this same principle to the original and more complex Oren-Nayar equations, but the choices are less obvious. For example, many of the expensive evaluations require three inputs, which would require a volume texture to be used as a look-up – something that isn’t good for storage (a low-quality volume texture of 128x128x128 stored at half-precision still requires 4 MB, which is four times as much as the reasonable-quality look-up generated in the above fragment of code).
Roughness is the key parameter affecting the output from the Oren-Nayar model. The following set of images shows the two main techniques compared at various roughness coefficients:
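The storage figures quoted here can be checked with a few lines of arithmetic. The C++ sketch below assumes a single channel per texel (2 bytes at half precision, 4 bytes for the `DXGI_FORMAT_R32_FLOAT` look-up) and no mip levels; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kBytesPerHalf  = 2; // one 16-bit half-precision channel
constexpr std::size_t kBytesPerFloat = 4; // one 32-bit float channel

// Bytes needed by a cubic volume texture with a single mip level.
constexpr std::size_t VolumeBytes(std::size_t dim, std::size_t bytesPerTexel)
{
    return dim * dim * dim * bytesPerTexel;
}

// Bytes needed by a square 2D texture with a single mip level.
constexpr std::size_t SquareBytes(std::size_t dim, std::size_t bytesPerTexel)
{
    return dim * dim * bytesPerTexel;
}

// 128^3 texels * 2 bytes = 4,194,304 bytes
static_assert(VolumeBytes(128, kBytesPerHalf) == 4u * 1024u * 1024u, "volume is 4 MB");
// 512^2 texels * 4 bytes = 1,048,576 bytes
static_assert(SquareBytes(512, kBytesPerFloat) == 1u * 1024u * 1024u, "2D look-up is 1 MB");
```

Under these assumptions the volume texture is four times the size of the 512x512 look-up despite its lower resolution and precision.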

As is noted in their paper, when roughness is 0.0 (a completely smooth surface) the model exhibits Lambertian behaviour in the same way as other lighting models discussed in this section.
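The Lambertian limit is easy to confirm numerically for the simplified model: with zero roughness the A term becomes 1 and the B term becomes 0, leaving only the diffuse colour multiplied by max(0, N·L). The C++ sketch below evaluates just the bracketed factor, using the same constants as the simplified HLSL shader; the function name is hypothetical and the angle inputs are assumed to be precomputed.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Scalar factor A + B * max(0, gamma) * sin(alpha) * tan(beta) of the
// simplified model. Multiply by the diffuse colour and max(0, N.L)
// to obtain the shaded result.
inline float OrenNayarSimpleFactor(float roughness, float alpha,
                                   float beta, float gamma)
{
    assert(roughness >= 0.0f);
    const float roughSq = roughness * roughness;
    const float A = 1.0f - 0.5f * (roughSq / (roughSq + 0.57f));
    const float B = 0.45f * (roughSq / (roughSq + 0.09f));
    return A + B * std::max(0.0f, gamma) * std::sin(alpha) * std::tan(beta);
}
```

For a roughness of 0.0 the factor is 1 regardless of the angles, so the output reduces to plain Lambertian shading; any non-zero roughness moves it away from 1.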
It is difficult to find any visual differences between the complex model and its simplified form – testament to the fact that the removed terms didn’t make a significant contribution to the outcome. The most obvious difference is that the simplified model comes across as being brighter, but the distribution of light is still the same. It may well be possible to find some examples where the complex model generates significantly better results, but for the most part it would be safe to work with the simplified model unless it showed any obvious signs of incorrectness.
Despite the visual differences being negligible the compiled output is substantially different – the original complex form is 24% longer than the simplified one. The instruction count drops from 77 to 62 between them which is surprising given their near-equal results. The use of a texture look-up knocks 24 instructions off the simplified model bringing it to a total of 38.
A simple count of instructions can be a useful measure of the evaluation cost for a shader, but it should come as no surprise that individual instructions will have different costs. Ten expensive instructions may well be slower than twenty cheap ones even though the latter appears to be twice as long on paper. The assembly for the complex Oren-Nayar equation relies upon five SINCOS instructions, dropping to three in the simplified form, and none appear when relying on a look-up texture. Thus the drop from 77 to 38 instructions could be more significant than it might initially seem.
[Oren&Nayar94] “Generalization of Lambert’s reflectance model”, Michael Oren and Shree K. Nayar. ACM SIGGRAPH 1994 Proceedings, ISBN 0-89791-667-0