http://msdn.microsoft.com/en-us/library/windows/desktop/ee416307(v=v...
This article explains the complexity of CSMs; gives details on the possible variations of the CSM algorithms; describes the two most common filtering techniques, percentage closer filtering (PCF) and filtering with variance shadow maps (VSMs); identifies and addresses some of the common pitfalls associated with adding filtering to CSMs; and shows how to map CSMs to Direct3D 10 through Direct3D 11 hardware. The code used in this article can be found in the DirectX Software Development Kit (SDK) in the CascadedShadowMaps11 and VarianceShadows11 samples. This article will prove most useful after the techniques covered in the technical article Common Techniques to Improve Shadow Depth Maps are implemented.
In Figure 1, quality is shown (left to right) from highest to lowest. The series of grids representing shadow maps with a view frustum (inverted cone in red) shows how pixel coverage is affected with different resolution shadow maps. Shadows are of the highest quality (white pixels) when there is a 1:1 ratio mapping pixels in light space to texels in the shadow map. Perspective aliasing occurs in the form of large, blocky texture maps (left image) when too many pixels map to the same shadow texel. When the shadow map is too large, it is undersampled. In this case, texels are skipped, shimmering artifacts are introduced, and performance is affected.
Figure 2. CSM shadow quality
Figure 2 shows cutouts from the highest quality section in each shadow map in Figure 1. The shadow map with the most closely placed pixels (at the apex) is nearest the eye. Technically, these are maps of the same size, with white and grey used to exemplify the success of the cascaded shadow map. White is ideal because it shows good coverage: a 1:1 ratio for eye-space pixels and shadow-map texels.
CSMs require the following steps per frame.
1. Partition the frustum into subfrusta.
2. Compute an orthographic projection for each subfrustum.
3. Render a shadow map for each subfrustum.
4. Render the scene.
   1. Bind the shadow maps and render.
   2. The vertex shader does the following:
      - Computes texture coordinates for each light subfrustum (unless the needed texture coordinate is calculated in the pixel shader).
      - Transforms and lights the vertex, and so on.
   3. The pixel shader does the following:
      - Determines the proper shadow map.
      - Transforms the texture coordinates if necessary.
      - Samples the cascade.
      - Lights the pixel.
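Step 2 of the list above can be illustrated on the CPU. The following is a minimal sketch, not the code of the CascadedShadowMaps11 sample: it assumes a left-handed view space with +Z forward, and the names Vec3, Mat3, Bounds, sliceCorners, and orthoBounds are hypothetical. It builds the eight corners of one subfrustum, rotates them into light space, and takes their axis-aligned bounds, which would serve as the extents of the orthographic projection.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

struct Vec3 { float x, y, z; };
struct Mat3 { float m[3][3]; };   // hypothetical row-major rotation: view space -> light space
struct Bounds { Vec3 min, max; };

// Eight view-space corners of the frustum slice between zNear and zFar.
std::array<Vec3, 8> sliceCorners(float fovY, float aspect, float zNear, float zFar) {
    assert(zNear > 0.0f && zFar > zNear);
    const float tanY = std::tan(fovY * 0.5f);
    const float tanX = tanY * aspect;
    const float depths[2] = {zNear, zFar};
    const float signs[2] = {-1.0f, 1.0f};
    std::array<Vec3, 8> corners{};
    std::size_t i = 0;
    for (float z : depths)
        for (float sy : signs)
            for (float sx : signs)
                corners[i++] = {sx * tanX * z, sy * tanY * z, z};
    return corners;
}

Vec3 rotate(const Mat3& r, const Vec3& v) {
    return {r.m[0][0] * v.x + r.m[0][1] * v.y + r.m[0][2] * v.z,
            r.m[1][0] * v.x + r.m[1][1] * v.y + r.m[1][2] * v.z,
            r.m[2][0] * v.x + r.m[2][1] * v.y + r.m[2][2] * v.z};
}

// Axis-aligned bounds of the slice in light space; these bounds give the
// left/right/bottom/top/near/far extents of the orthographic projection.
Bounds orthoBounds(const std::array<Vec3, 8>& viewCorners, const Mat3& viewToLight) {
    const Vec3 first = rotate(viewToLight, viewCorners[0]);
    Bounds b{first, first};
    for (const Vec3& c : viewCorners) {
        const Vec3 p = rotate(viewToLight, c);
        b.min = {std::min(b.min.x, p.x), std::min(b.min.y, p.y), std::min(b.min.z, p.z)};
        b.max = {std::max(b.max.x, p.x), std::max(b.max.y, p.y), std::max(b.max.z, p.z)};
    }
    return b;
}
```

A real implementation would also account for the light's translation and for shadow casters that lie outside the view frustum but between the light and the subfrustum.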
In practice, recalculating the frustum splits per frame causes shadow edges to shimmer. The generally accepted practice is to use a static set of cascade intervals per scenario. In this scenario, the interval along the Z-axis is used to describe a subfrustum that occurs when partitioning the frustum. Determining the correct size intervals for a given scene depends upon several factors.
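As an illustration of a static set of cascade intervals, the sketch below converts fixed partition fractions into eye-space far depths once, rather than recomputing splits every frame. The function name cascadeFarDepths and the representation of the intervals as fractions of the view range are assumptions made for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: converts fixed partition fractions (strictly
// increasing, each in (0, 1]) into the eye-space far depth of each cascade.
std::vector<float> cascadeFarDepths(float zNear, float zFar,
                                    const std::vector<float>& fractions) {
    assert(zFar > zNear);
    std::vector<float> depths;
    depths.reserve(fractions.size());
    float previous = 0.0f;
    for (float f : fractions) {
        assert(f > previous && f <= 1.0f);
        depths.push_back(zNear + f * (zFar - zNear));
        previous = f;
    }
    return depths;
}
```

For example, fractions of 0.25, 0.5, and 1.0 over a view range of 1 to 101 give far depths of 26, 51, and 101. A ground-level camera would typically use smaller early fractions so that more splits fall near the eye.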
Orientation of the Scene Geometry
With respect to scene geometry, camera orientation affects cascade interval selection. For example, a camera very near the ground, such as a ground camera in a football game, has a different static set of cascade intervals than a camera in the sky. Figure 4 shows some different cameras and their respective partitions. When the scene's Z-range is very large, more split planes are required. For example, when the eye is very near the ground plane, but distant objects are still visible, multiple cascades can be necessary. Dividing the frustum so that more splits are near the eye (where perspective aliasing is changing the fastest) is also valuable. When most of the geometry is clumped into a small section (such as an overhead view or a flight simulator) of the view frustum, fewer cascades are necessary.
Figure 4. Different configurations require different frustum splits
(Left) When geometry has a high dynamic range in Z, lots of cascades are required. (Center) When the geometry has low dynamic range in Z, there is little benefit from multiple frustums. (Right) Only three partitions are needed when the dynamic range is medium.
Orientation of the Light and the Camera
Each cascade's projection matrix is fit tightly around its corresponding subfrustum. In configurations where the view camera and the light directions are orthogonal, the cascades can be fit tightly with little overlap. The overlap becomes larger as the light and the view camera move into parallel alignment (Figure 5). When the light and the view camera are nearly parallel, the configuration is called "dueling frusta," and it is a very hard scenario for most shadowing algorithms. It is not uncommon to constrain the light and camera so that this scenario does not occur. CSMs, however, perform much better than many other algorithms in this scenario.
Figure 5. Cascade overlap increases as light direction becomes parallel with camera direction
Many CSM implementations use fixed-size frusta. The pixel shader can use the Z-depth to index into the array of cascades when the frustum is split in fixed-size intervals.
Fit to Scene
All of the frusta can be created with the same near plane. This forces the cascades to overlap. The CascadedShadowMaps11 sample calls this technique fit to scene.
Fit to Cascade
Alternatively, frusta can be created with the actual partition interval being used as near and far planes. This causes a tighter fit, but degenerates to fit to scene in the case of dueling frusta. The CascadedShadowMaps11 sample calls this technique fit to cascade. These two methods are shown in Figure 6. Fit to cascade wastes less resolution. The problem with fit to cascade is that the orthographic projection grows and shrinks based on the orientation of the view frustum. The fit to scene technique pads the orthographic projection by the maximum size of the view frustum, removing the artifacts that appear when the view camera moves. Common Techniques to Improve Shadow Depth Maps addresses the artifacts that appear when the light moves in the section "Moving the light in texel sized increments."
Figure 6. Fit to scene vs. fit to cascade
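The difference between the two techniques comes down to which near plane each cascade uses. The sketch below is an illustration only; FitMode, Interval, and cascadeInterval are hypothetical names, and the splits are assumed to be stored as the far depth of each cascade in increasing order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class FitMode { Scene, Cascade };
struct Interval { float begin, end; };

// Returns the depth interval that the given cascade must cover.
// Fit to scene: every cascade starts at the camera near plane (cascades overlap).
// Fit to cascade: each cascade starts where the previous one ends (tighter fit).
Interval cascadeInterval(FitMode mode, std::size_t index, float cameraNear,
                         const std::vector<float>& splits) {
    assert(index < splits.size());
    float begin = cameraNear;
    if (mode == FitMode::Cascade && index > 0) {
        begin = splits[index - 1];
    }
    return {begin, splits[index]};
}
```

With splits of 10, 40, and 100 and a near plane of 1, the last cascade covers 1 to 100 under fit to scene and 40 to 100 under fit to cascade.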
In interval-based selection (Figure 7), the vertex shader computes the position in world-space of the vertex.
C++
C++
fCurrentPixelDepth = Input.vDepth;
Interval-based cascade selection uses a vector comparison and a dot product to determine the correct cascade. The CASCADE_COUNT_FLAG specifies the number of cascades. The m_fCascadeFrustumsEyeSpaceDepths_data contains the view frustum partitions. After the comparison, fComparison contains a value of 1 where the current pixel depth is larger than the barrier, and a value of 0 where the current pixel depth is smaller. A dot product sums these values into an array index.
C++
float4 vCurrentPixelDepth = Input.vDepth;
float4 fComparison = ( vCurrentPixelDepth > m_fCascadeFrustumsEyeSpaceDepths_data[0] );
float fIndex = dot(
    float4( CASCADE_COUNT_FLAG > 0,
            CASCADE_COUNT_FLAG > 1,
            CASCADE_COUNT_FLAG > 2,
            CASCADE_COUNT_FLAG > 3 ),
    fComparison );
// Clamp to the last valid cascade index.
fIndex = min( fIndex, CASCADE_COUNT_FLAG - 1 );
iCurrentCascadeIndex = (int)fIndex;
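The same comparison-and-sum logic can be written in scalar form to make it easier to follow. This is a CPU-side sketch, not shader code from the sample; selectCascadeByInterval and kMaxCascades are hypothetical names, and farDepths is assumed to hold the eye-space far depth of each cascade.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

constexpr int kMaxCascades = 4;

// Scalar equivalent of the vector comparison and dot product: count how many
// enabled cascade boundaries the pixel depth lies beyond, then clamp the
// result to the last valid cascade index.
int selectCascadeByInterval(float pixelDepth,
                            const std::array<float, kMaxCascades>& farDepths,
                            int cascadeCount) {
    assert(cascadeCount >= 1 && cascadeCount <= kMaxCascades);
    int index = 0;
    for (int i = 0; i < kMaxCascades; ++i) {
        const int enabled = (cascadeCount > i) ? 1 : 0;
        const int beyond = (pixelDepth > farDepths[i]) ? 1 : 0;
        index += enabled * beyond;
    }
    return std::min(index, cascadeCount - 1);
}
```

With far depths of 10, 40, 100, and 400 and three active cascades, a pixel at depth 25 selects cascade 1, and a pixel beyond every boundary is clamped to cascade 2.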
Once the cascade is selected, the texture coordinate must be transformed to the correct cascade.
C++
This texture coordinate is then used to sample the texture with the X-coordinate and the Y-coordinate. The Z-coordinate is used to do the final depth comparison.
Map-Based Cascade Selection
Map-based selection (Figure 8) tests against the four sides of the cascades to find the tightest map that covers the specific pixel. Instead of calculating the position in world space, the vertex shader calculates the view-space position for every cascade. The pixel shader iterates over the cascades in order to scale and shift the texture coordinates so that they index the current cascade. The texture coordinate is then tested against the texture bounds. When the X and Y values of the texture coordinate fall inside a cascade, they are used to sample the texture. The Z-coordinate is used to do the final depth comparison.
Figure 8. Map-based cascade selection
Interval-based selection is slightly faster than map-based selection because the cascade selection can be done directly. Map-based selection must intersect the texture coordinate with the cascade bounds. Map-based selection uses the cascade more efficiently when shadow maps do not align perfectly (see Figure 8).
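The map-based loop can be sketched on the CPU as follows. This is an illustration under stated assumptions, not the sample's shader: Vec2, CascadeTransform, and selectCascadeByMap are hypothetical names, each cascade is assumed to be described by a per-axis scale and offset, and the cascades are assumed to be ordered from tightest to widest.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };
struct CascadeTransform { Vec2 scale, offset; };  // hypothetical per-cascade scale and shift

// Scales and shifts the light-view position into each cascade's texture space
// in turn and returns the index of the first cascade whose [0, 1] x [0, 1]
// range contains it, or -1 when no cascade covers the position.
int selectCascadeByMap(Vec2 lightViewPos,
                       const std::vector<CascadeTransform>& cascades,
                       Vec2* outTexCoord) {
    assert(outTexCoord != nullptr);
    for (std::size_t i = 0; i < cascades.size(); ++i) {
        const CascadeTransform& c = cascades[i];
        const Vec2 tc{lightViewPos.x * c.scale.x + c.offset.x,
                      lightViewPos.y * c.scale.y + c.offset.y};
        const bool inside = tc.x >= 0.0f && tc.x <= 1.0f &&
                            tc.y >= 0.0f && tc.y <= 1.0f;
        if (inside) {
            *outTexCoord = tc;
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

The loop is what makes this approach slightly slower than interval-based selection, but it always picks the tightest cascade that actually contains the pixel.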
(Left) A visible seam can be seen where cascades overlap. (Right) When the cascades are blended between, no seam occurs.
Filtering ordinary shadow maps does not produce soft, blurred shadows. The filtering hardware blurs the depth values, and then compares those blurred values to the light space texel. The hard edge resulting from the pass/fail test still exists. Blurring shadow maps only serves to erroneously move the hard edge. PCF enables filtering on shadow maps. The general idea of PCF is to calculate a percentage of the pixel in shadow based on the number of subsamples that pass the depth test over the total number of subsamples.
Direct3D 10 and Direct3D 11 hardware can perform PCF. The input to a PCF sampler consists of the texture coordinate and a comparison depth value. For simplicity, PCF is explained with a four-tap filter. The texture sampler reads the texture four times, similar to a standard filter. However, the returned result is a percentage of the pixels that passed the depth test. Figure 10 shows how a pixel that passes one of the four depth tests is 25 percent in shadow. The actual value returned is a linear interpolation based on the subtexel coordinates of the texture reads to produce a smooth gradient. Without this linear interpolation, the four-tap PCF would only be able to return five values: { 0.0, 0.25, 0.5, 0.75, 1.0 }.
Figure 10. PCF filtered image, with 25 percent of the selected pixel covered
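The four-tap filter described above can be emulated on the CPU to show where the interpolation happens: the depth test is performed per texel first, and only the pass/fail results are blended. This is a sketch for illustration; DepthMap and pcfFourTap are hypothetical names, coordinates are assumed to be in texel units with texel centers at integer + 0.5, and a result of 1 is taken to mean fully lit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct DepthMap {
    int width, height;
    std::vector<float> texels;  // row-major depths
    float at(int x, int y) const {
        x = std::max(0, std::min(x, width - 1));   // clamp addressing
        y = std::max(0, std::min(y, height - 1));
        return texels[static_cast<std::size_t>(y) * width + x];
    }
};

// Compares first, then bilinearly blends the four pass/fail results.
float pcfFourTap(const DepthMap& map, float u, float v, float compareDepth) {
    assert(map.width > 0 && map.height > 0);
    const float fx = u - 0.5f;
    const float fy = v - 0.5f;
    const int x0 = static_cast<int>(std::floor(fx));
    const int y0 = static_cast<int>(std::floor(fy));
    const float wx = fx - static_cast<float>(x0);  // subtexel weights
    const float wy = fy - static_cast<float>(y0);
    auto lit = [&](int x, int y) {
        return compareDepth <= map.at(x, y) ? 1.0f : 0.0f;
    };
    const float top = lit(x0, y0) * (1.0f - wx) + lit(x0 + 1, y0) * wx;
    const float bottom = lit(x0, y0 + 1) * (1.0f - wx) + lit(x0 + 1, y0 + 1) * wx;
    return top * (1.0f - wy) + bottom * wy;
}
```

Sampling exactly between four texels where three pass the test returns 0.75; moving the sample point toward the failing texel lowers the result smoothly rather than in steps of 0.25.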
It is also possible to do PCF without hardware support or extend PCF to larger kernels. Some techniques even sample with a weighted kernel. To do this, create a kernel (such as a Gaussian) for an N × N grid. The weights must add up to 1. The texture is then sampled N² times. Each sample is scaled by the corresponding weights in the kernel. The CascadedShadowMaps11 sample uses this approach.
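A weighted kernel of this kind can be sketched as follows. The code is an illustration rather than the sample's implementation; gaussianKernel and weightedPcf are hypothetical names, and the N × N neighborhood of shadow-map depths is assumed to have been gathered already into a flat array.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Builds a normalized N x N Gaussian kernel; the weights sum to 1.
std::vector<float> gaussianKernel(int n, float sigma) {
    assert(n > 0 && sigma > 0.0f);
    std::vector<float> weights(static_cast<std::size_t>(n) * n);
    const float center = (n - 1) * 0.5f;
    float sum = 0.0f;
    for (int y = 0; y < n; ++y) {
        for (int x = 0; x < n; ++x) {
            const float dx = x - center;
            const float dy = y - center;
            const float w = std::exp(-(dx * dx + dy * dy) / (2.0f * sigma * sigma));
            weights[static_cast<std::size_t>(y) * n + x] = w;
            sum += w;
        }
    }
    for (float& w : weights) w /= sum;
    return weights;
}

// Each of the N * N depth tests contributes its kernel weight when it passes.
float weightedPcf(const std::vector<float>& kernel,
                  const std::vector<float>& depths, float compareDepth) {
    assert(kernel.size() == depths.size());
    float percentLit = 0.0f;
    for (std::size_t i = 0; i < kernel.size(); ++i) {
        if (compareDepth <= depths[i]) percentLit += kernel[i];
    }
    return percentLit;
}
```

Because the weights are normalized, a fully lit neighborhood returns approximately 1 and a fully shadowed one returns 0, with texels near the kernel center contributing the most.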
Depth Bias
Depth bias becomes even more important when large PCF kernels are used. It is only valid to compare a pixel's light-space depth against the texel it maps to in the depth map. The depth map texel's neighbors refer to a different position. This depth is likely to be similar, but can be very different depending on the scene. Figure 11 highlights the artifacts that occur. A single depth is compared to three neighboring texels in the shadow map. One of the depth tests erroneously fails because its depth does not correlate to the computed light-space depth of the current geometry. The recommended solution to this problem is to use a larger offset. Too large an offset, however, can result in Peter Panning. Calculating a tight near plane and far plane helps reduce the effects of using an offset.
Figure 11. Erroneous self-shadowing
The erroneous self-shadowing results from comparing a pixel's light-space depth to shadow-map texels that do not correlate with it. The depth in light-space correlates to shadow texel 2 in the depth map. Texel 1 is greater than the light-space depth while 2 is equal and 3 is less. Texels 2 and 3 pass the depth test, while Texel 1 fails.
Calculating a Per-Texel Depth Bias with DDX and DDY for Large PCFs
Calculating a per-texel depth bias with ddx and ddy for large PCFs is a technique that calculates the correct depth bias (assuming the surface is planar) for the adjacent shadow map texel. This technique fits the comparison depth to a plane using the derivative information. Because this technique is computationally complex, it should be used only when a GPU has compute cycles to spare. When very large kernels are used, this may be the only technique that works to remove self-shadowing artifacts without causing Peter Panning. Figure 12 highlights the problem. The depth in light-space is known for the one texel that is being compared. The light-space depths that correspond to the neighboring texels in the depth map are unknown.
Figure 12. Scene and depth map
The rendered scene is shown at left, and the depth map with a sample texel block is shown at right. The eye-space texel maps to the pixel labeled D in the center of the block. This comparison is accurate. The correct depth in eye space correlating to the pixels that neighbor D is unknown. Mapping the neighboring texels back to eye space is possible only if we assume the pixel pertains to the same triangle as D. The depth is known for the texel that correlates with the light-space position. The depth is unknown for the neighboring texels in the depth map.
At a high level, this technique uses the ddx and ddy HLSL operations to find the derivative of the light-space position. This is nontrivial because the derivative operations return the gradient of the light-space depth with respect to screen space. To convert this to a gradient of the light-space depth with respect to light space, a conversion matrix must be calculated.
Explanation with Shader Code
The details of the rest of the algorithm are given as an explanation of the shader code that performs this operation. This code can be found in the CascadedShadowMaps11 sample. Figure 13 shows how the light-space texture coordinates map to the depth map and how the derivatives in X and Y can be used to create a transformation matrix.
Figure 13. Screen-space to light-space matrix
The derivatives of the light-space position in X and Y are used to create this matrix. The first step is to calculate the derivative of the light-view-space position.
C++
Direct3D 11 class GPUs calculate these derivatives by running a 2 × 2 quad of pixels in parallel and subtracting the texture coordinates from the neighbor in X for ddx and from the neighbor in Y for ddy. These two derivatives make up the rows of a 2 × 2 matrix. In its current form, this matrix could be used to convert screen-space neighboring pixels to light-space slopes. However, the inverse of this matrix is needed: a matrix that transforms light-space neighboring pixels to screen-space slopes.
C++
float2x2 matScreentoShadow = float2x2( vShadowTexDDX.xy, vShadowTexDDY.xy );
// The determinant is needed to invert the 2 x 2 matrix.
float fDeterminant = determinant( matScreentoShadow );
float fInvDeterminant = 1.0f / fDeterminant;
float2x2 matShadowToScreen = float2x2 (
    matScreentoShadow._22 * fInvDeterminant,
    matScreentoShadow._12 * -fInvDeterminant,
    matScreentoShadow._21 * -fInvDeterminant,
    matScreentoShadow._11 * fInvDeterminant );
This matrix is then used to transform the two texels above and to the right of the current texel. These neighbors are represented as an offset from the current texel.
C++
float2 vRightShadowTexelLocation = float2( m_fTexelSize, 0.0f );
float2 vUpShadowTexelLocation = float2( 0.0f, m_fTexelSize );
float2 vRightTexelDepthRatio = mul( vRightShadowTexelLocation, matShadowToScreen );
float2 vUpTexelDepthRatio = mul( vUpShadowTexelLocation, matShadowToScreen );
The ratio that the matrix creates is finally multiplied by the depth derivatives to calculate the depth offsets for the neighboring pixels.
C++
These weights can now be used in a PCF loop to add an offset to the position.
C++
for( int x = m_iPCFBlurForLoopStart; x < m_iPCFBlurForLoopEnd; ++x )
{
    for( int y = m_iPCFBlurForLoopStart; y < m_iPCFBlurForLoopEnd; ++y )
    {
        // depthcompare is assumed to start each tap at the biased light-space
        // depth of the current pixel; its declaration is not shown in this excerpt.
        if ( USE_DERIVATIVES_FOR_DEPTH_OFFSET_FLAG )
        {
            depthcompare += fRightTexelDepthDelta * ( (float) x ) +
                            fUpTexelDepthDelta * ( (float) y );
        }
        // Compare the transformed pixel depth to the depth read
        // from the map.
        fPercentLit += g_txShadow.SampleCmpLevelZero( g_samShadow,
            float2(
                vShadowTexCoord.x + ( ( (float) x ) * m_fNativeTexelSizeInX ),
                vShadowTexCoord.y + ( ( (float) y ) * m_fTexelSize ) ),
            depthcompare );
    }
}
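The derivative steps above can be checked on the CPU with a planar surface whose depth gradient is known. The sketch below mirrors the matrix inversion and row-vector multiplication of the shader code, but it is an illustration only: Vec3, DepthDeltas, and perTexelDepthDeltas are hypothetical names, and the ddx and ddy inputs stand in for the values the GPU would supply.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };        // x, y: shadow texture coordinate; z: light-space depth
struct DepthDeltas { float right, up; };

// ddx/ddy are the screen-space derivatives of the light-space position.
// Returns the depth change for a one-texel step right and up in the shadow map.
DepthDeltas perTexelDepthDeltas(Vec3 ddx, Vec3 ddy, float texelSize) {
    // Rows of the screen-to-shadow matrix are ddx.xy and ddy.xy.
    const float det = ddx.x * ddy.y - ddx.y * ddy.x;
    assert(std::fabs(det) > 1e-12f);
    const float inv = 1.0f / det;
    // Inverse of the 2 x 2 matrix (shadow-to-screen).
    const float m11 = ddy.y * inv;
    const float m12 = -ddx.y * inv;
    const float m21 = -ddy.x * inv;
    const float m22 = ddx.x * inv;
    // Row vector times matrix: screen-space step that corresponds to one texel.
    const float rightX = texelSize * m11;
    const float rightY = texelSize * m12;
    const float upX = texelSize * m21;
    const float upY = texelSize * m22;
    // Multiply the screen-space step by the screen-space depth derivatives.
    return {rightX * ddx.z + rightY * ddy.z,
            upX * ddx.z + upY * ddy.z};
}
```

For a plane with depth = 2u + 3v in shadow-map coordinates and a texel size of 0.01, the function returns deltas of approximately 0.02 and 0.03, regardless of how the plane is oriented on screen.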
Adding the derivative-based offsets for CSMs presents some challenges because the derivative is calculated within divergent flow control. The problem stems from a fundamental way that GPUs operate. Direct3D 11 GPUs operate on 2 × 2 quads of pixels. To perform a derivative, GPUs generally subtract the current pixel's copy of a variable from the neighboring pixel's copy of that same variable. How this happens varies from GPU to GPU. The texture coordinates are determined by map-based or interval-based cascade selection. Some pixels in a pixel quad can choose a different cascade than the rest of the pixels. This results in visible seams between shadow maps because the derivative-based offsets are now completely wrong. The solution is to perform the derivative on light-view-space texture coordinates. These coordinates are the same for every cascade.
Padding for PCF Kernels
PCF kernels index outside of a cascade partition if the shadow buffer is not padded. The solution is to pad the outer rim of the cascade by one-half the size of the PCF kernel. This must be implemented in the shader that selects the cascade and in the projection matrix that must render the cascade large enough that the border is preserved.
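The padding arithmetic can be sketched as follows. This is one possible way to express it, not the sample's code: PaddedRange, paddedCascadeRange, and paddedProjectionScale are hypothetical names, the kernel size is given in texels, and the projection scale assumes the unpadded cascade should land exactly in the interior of the padded texture.

```cpp
#include <cassert>

struct PaddedRange { float minTex, maxTex; };

// Texture-coordinate range that the cascade-selection test should accept:
// the outer rim of half a kernel width is reserved as a border.
PaddedRange paddedCascadeRange(int kernelSize, int shadowMapSize) {
    assert(kernelSize >= 1 && shadowMapSize > kernelSize);
    const float border = (kernelSize * 0.5f) / static_cast<float>(shadowMapSize);
    return {border, 1.0f - border};
}

// Factor by which the orthographic extents would be enlarged so that the
// original cascade area still fits inside the padded interior.
float paddedProjectionScale(int kernelSize, int shadowMapSize) {
    assert(kernelSize >= 1 && shadowMapSize > kernelSize);
    return static_cast<float>(shadowMapSize) /
           static_cast<float>(shadowMapSize - kernelSize);
}
```

For a 4-texel kernel on a 1024-texel shadow map, the accepted range is 2/1024 to 1022/1024 and the projection grows by a factor of roughly 1.004.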
Algorithm Details
VSMs work by rendering the depth and the depth squared to a two-channel shadow map. This two-channel shadow map can then be blurred and filtered just like a normal texture. The algorithm then uses Chebyshev's inequality in the pixel shader to estimate the fraction of pixel area that would pass the depth test. The pixel shader fetches the depth and depth-squared values.
C++
float float
C++
If the depth comparison fails, the percentage of the pixel that is lit is estimated. Variance is calculated as average-of-squares minus square-of-average.
C++
float variance = ( fAvgZ2 ) - ( fAvgZ * fAvgZ );
variance = min( 1.0f, max( 0.0f, variance + 0.00001f ) );
C++
Light Bleeding
The biggest drawback to VSMs is light bleeding (Figure 16). Light bleeding occurs when multiple shadow casters occlude each other along edges. VSMs shade the edges of shadows based on depth disparities. When shadows overlap each other, a depth disparity exists in the center of a region that should be shadowed. This is a problem with using the VSM algorithm.
Figure 16. VSM light bleeding
A partial solution to the problem is to raise the fPercentLit to a power. This has the effect of dampening the blur, which can cause artifacts where depth disparity is small. Sometimes there exists a magical value that alleviates the problem.
C++
An alternative to raising the percent lit to a power is to avoid configurations where shadows overlap. Even highly tuned shadow configurations have several constraints on light, camera, and geometry. Light bleeding is also lessened by using higher resolution textures. Layered variance shadow maps (LVSMs) solve the problem at the expense of breaking the frustum into layers that are perpendicular to the light. The number of maps required would be quite large when CSMs are also being used. Additionally, Andrew Lauritzen, co-author of the paper on VSMs and author of a paper on LVSMs, discussed combining exponential shadow maps (ESMs) with VSMs to counteract light bleeding in a Beyond3D forum thread.
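The VSM estimate and the power-based reduction of light bleeding discussed in this section can be combined in a short CPU sketch. It uses the one-sided form of Chebyshev's inequality, variance / (variance + d²), together with the variance clamp shown earlier; the function name vsmPercentLit and the bleedExponent parameter are hypothetical, and the exponent value is a tuning choice rather than a recommended constant.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// avgZ and avgZ2 are the filtered depth and depth-squared values read from
// the two-channel shadow map; depth is the light-space depth of the pixel.
float vsmPercentLit(float avgZ, float avgZ2, float depth, float bleedExponent) {
    assert(bleedExponent >= 1.0f);
    if (depth <= avgZ) {
        return 1.0f;  // depth comparison passes: fully lit
    }
    // Variance is average-of-squares minus square-of-average, clamped.
    float variance = avgZ2 - avgZ * avgZ;
    variance = std::min(1.0f, std::max(0.0f, variance + 0.00001f));
    // One-sided Chebyshev bound on the fraction of the filter area that is lit.
    const float d = depth - avgZ;
    const float pMax = variance / (variance + d * d);
    // Raising the bound to a power darkens partially lit regions,
    // which reduces light bleeding at the cost of a harder edge.
    return std::pow(pMax, bleedExponent);
}
```

With a mean depth of 0.5, a variance of about 0.01, and a pixel depth of 0.6, the bound is about 0.5 with an exponent of 1 and about 0.06 with an exponent of 4.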
Using gradients with CSMs can produce a seam along the border between two cascades as seen in Figure 17. The sample instruction uses derivatives between pixels to calculate information, such as the mipmap level, needed by the filter. This causes a problem in particular for mipmap selection or anisotropic filtering. When pixels in a quad take different branches in the shader, the derivatives calculated by the GPU hardware are invalid. This results in a jagged seam along the shadow map.
Figure 17. Seams on cascade borders due to anisotropic filtering with divergent flow control
This problem is solved by computing the derivatives on the position in light-view space; the light-view space coordinate is not specific to the selected cascade. The computed derivatives can be scaled by the scale portion of the projection-texture matrix to the correct mipmap level.
C++
float3 vShadowTexCoordDDX = ddx( vShadowMapTextureCoordViewSpace );
vShadowTexCoordDDX *= m_vCascadeScale[iCascade].xyz;
float3 vShadowTexCoordDDY = ddy( vShadowMapTextureCoordViewSpace );
vShadowTexCoordDDY *= m_vCascadeScale[iCascade].xyz;
mapDepth += g_txShadow.SampleGrad( g_samShadow, vShadowTexCoord.xyz,
    vShadowTexCoordDDX, vShadowTexCoordDDY );
Summary
CSMs offer a solution to the perspective aliasing problem. There are several possible configurations to get the needed visual fidelity for a title. PCF and VSMs are widely used and should be combined with CSMs to reduce aliasing.
References
Donnelly, W. and Lauritzen, A. Variance shadow maps. In SI3D '06: Proceedings of the 2006 symposium on Interactive 3D graphics and games. 2006. pp. 161–165. New York, NY, USA: ACM Press.
Lauritzen, Andrew and McCool, Michael. Layered variance shadow maps. Proceedings of graphics interface 2008, May 28–30, 2008, Windsor, Ontario, Canada.
Engel, Wolfgang F. Section 4. Cascaded Shadow Maps. ShaderX5, Advanced Rendering Techniques, Wolfgang F. Engel, Ed. Charles River Media, Boston, Massachusetts. 2006. pp. 197–206.
© 2013 Microsoft. All rights reserved.