Posted by redbeard on May 16, 2011
For my CubeWorld prototype, I wanted to try some screen-space effects like SSAO (screen-space ambient occlusion) and compare the performance of deferred lighting against standard forward-rendered lights. I was also interested in implementing a deferred renderer for its own sake, as I hadn’t experimented with the concept before.
I found some good foundation code & explanation in the articles at http://www.catalinzima.com/tutorials/deferred-rendering-in-xna/, which got me started with some directional and point light functionality. I made a few modifications here and there, such as combining multiple directional lights into a single pass and taking some liberties with the C# and shader code; I also used a procedural cube instead of a sphere mesh for my point light. I also found some good intro material in the NVidia deferred rendering presentation from “6800 Leagues Under the Sea”: http://developer.nvidia.com/presentations-6800-leagues-under-sea, which includes a few optimizations that can help if you’re not using XNA (I’ll get to that). The performance of the deferred lighting is quite good on my PC, although I haven’t tried it extensively on the Xbox.
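To illustrate what combining multiple directional lights into a single pass amounts to, here is a minimal per-pixel sketch in Python rather than the actual shader code. It assumes plain diffuse (N·L) lighting only; the function name `accumulate_directional` and the data layout are made up for the example and are not taken from the tutorial.

```python
def accumulate_directional(normal, lights):
    """Sum the diffuse contribution of several directional lights for one pixel.

    normal: unit surface normal (x, y, z), as read from the G-Buffer.
    lights: list of (direction, colour) pairs. `direction` is the unit vector
            the light travels along, so a surface is lit when dot(n, -d) > 0.
    Returns the accumulated light colour (r, g, b) for the light buffer.
    """
    total = [0.0, 0.0, 0.0]
    for direction, colour in lights:
        # Lambert term, clamped so back-facing surfaces receive no light.
        n_dot_l = max(0.0, -sum(n * d for n, d in zip(normal, direction)))
        for i in range(3):
            total[i] += colour[i] * n_dot_l
    return tuple(total)
```

Looping over the lights inside one pass means the G-Buffer is read once per pixel instead of once per light, which is the point of merging them.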
After seeing the deferred shading in action, I wanted to make even more use of the G-Buffer for effects that can take advantage of it, and one of the primary effects I’m interested in is SSAO, because the cube world looks rather artificial with all the faces shaded relatively flatly. I implemented the SSAO shader described in a gamedev.net article, which provides dense and somewhat unintuitive code, but it works and the rest of the article explains the concepts used. The article offered little guidance for tweaking the 4 input parameters (such as “bias” and “scale”), but I found some numbers which appeared to work, and named them more intuitively for my internal API. I’m currently using only a 9×9 separated blur rather than the 15×15 suggested in the article. The effect works, but the screen-space random field is plain to see, and it seems to be more pronounced on distant geometry; I can probably do some more work to try and resolve those artifacts. A much more distracting artifact is the total loss of ambient occlusion at the edges of the screen in certain conditions; I’m not sure if there’s a reasonable solution for that. I may try some static AO calculations for each cube face to see if I can get stable results that way.
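For reference, the idea behind a 9×9 separated blur can be sketched on the CPU as below. This is only an illustration: it uses equal (box) weights and clamps at the image edges, which are assumptions for the example and not necessarily the weights used by the article's shader, and the function names are hypothetical.

```python
def blur_1d(image, radius, horizontal):
    """Box-blur a 2D list of floats along one axis, clamping at the edges."""
    height, width = len(image), len(image[0])
    taps = 2 * radius + 1
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for offset in range(-radius, radius + 1):
                if horizontal:
                    sx, sy = min(max(x + offset, 0), width - 1), y
                else:
                    sx, sy = x, min(max(y + offset, 0), height - 1)
                total += image[sy][sx]
            out[y][x] = total / taps
    return out


def separable_blur(image, radius=4):
    """Horizontal pass followed by a vertical pass over its result.

    With radius 4 each pass reads 9 samples, so a pixel costs 18 reads
    in total instead of the 81 a single-pass 9x9 kernel would need.
    """
    return blur_1d(blur_1d(image, radius, True), radius, False)
```

The saving grows with kernel size: a separated 15×15 blur needs 30 reads per pixel rather than 225.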
The overall flow of my deferred renderer, currently (1 or more “passes” per step below):
- Render all scene geometry into G-Buffer
- Generate noisy SSAO buffer at half-resolution
- Blur SSAO buffer horizontally and vertically at full-resolution
- Accumulate directional and point lights, one per pass
- Combine lighting, albedo, and SSAO into final image
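The last step of the list above can be sketched per pixel as follows. This is one plausible way to composite, assuming the occlusion term simply scales all accumulated light; the exact formula is not spelled out here, and `combine_pixel` is a hypothetical name for the example.

```python
def combine_pixel(albedo, light, ao):
    """Composite one pixel from the albedo, light and SSAO buffers.

    albedo: surface colour (r, g, b) from the G-Buffer, each in [0, 1].
    light:  accumulated light colour (r, g, b) from the light buffer.
    ao:     blurred occlusion factor, 1.0 = fully open, 0.0 = fully occluded.
    Assumption: AO multiplies the whole lighting term, not only an ambient part.
    """
    return tuple(min(1.0, max(0.0, a * l * ao)) for a, l in zip(albedo, light))
```

Applying the occlusion only to an ambient term, rather than to all lighting, would be an equally reasonable variant.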
Some issues I ran into when implementing my deferred shading in XNA:
- XNA does not allow you to separate the depth-buffer from a render-target, which means you cannot use the stencil optimization for light volumes discussed in the NVidia “6800 Leagues” presentation. The optimization lets you shade only the pixels which are actually within the light volume, rather than all the ones that are covered by it on screen but are too distant to be affected. This requires that you retain the depth buffer from the geometry pass, use it to depth-test and store stencil values for the light geometry, and then use those stencil values with a different render-target, specifically the light accumulation buffer.
- Xbox 360 has 10MB of framebuffer memory linked to the GPU, which works great if you’re rendering a single 1280×720 render-target and depth-buffer at 4 bytes per pixel each (about 7MB in total). When you want 3 render-targets and a depth-buffer, you can either “tile” the framebuffer and re-draw the geometry multiple times, or you can drop the resolution until all the buffers fit; I opted for the latter, using 1024×576 (for 16:9). XNA doesn’t expose the ability to resolve the depth-buffer to a texture, which means you must include your own depth render-target in your G-Buffer; without that extra target, the resolution could be increased. On PC, the memory limitation is lifted, but you still can’t read back depth via D3D9, so the extra buffer is still required.
- I can see visible banding on my point lights; I’m not sure if this is due to banding in the light buffer itself or in the final compositing. XNA 4.0 exposes the HdrBlendable format, which on Xbox uses a 10-bit floating-point value per component, but with only 7 bits of mantissa I’m not convinced it offers any less banding than 8-bit fixed-point components, just a different pattern.
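Two of the numbers in the list above can be checked with a little arithmetic; the sketch below does so in Python. It assumes every buffer uses 4 bytes per pixel, reads “10MB” as 10 MiB, and models the floating-point format as a generic normalised float with 7 stored mantissa bits, ignoring denormals and the limited exponent range of the real hardware format. The helper names are made up for the example.

```python
import math

EDRAM_BYTES = 10 * 1024 * 1024  # "10MB" of framebuffer memory, read as 10 MiB


def buffers_bytes(width, height, buffer_count, bytes_per_pixel=4):
    """Total size of `buffer_count` equally sized buffers (assumed 4 bytes/pixel)."""
    return width * height * buffer_count * bytes_per_pixel


def fits_in_edram(width, height, buffer_count):
    """True if all buffers fit in framebuffer memory without tiling."""
    return buffers_bytes(width, height, buffer_count) <= EDRAM_BYTES


def float_step(value, mantissa_bits=7):
    """Spacing between neighbouring normalised floats around `value`.

    Assumption: implicit leading one and `mantissa_bits` stored bits;
    denormals and exponent limits are ignored.
    """
    _, exponent = math.frexp(value)  # value = m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 1 - mantissa_bits)


FIXED8_STEP = 1.0 / 255.0  # spacing of an 8-bit fixed-point component
```

Under these assumptions, `fits_in_edram(1280, 720, 2)` and `fits_in_edram(1024, 576, 4)` hold while `fits_in_edram(1280, 720, 4)` does not. `float_step(0.75)` is 1/256, about the same as `FIXED8_STEP`, and `float_step(1.5)` is 1/128, which is coarser. That fits the suspicion that the format moves the banding around more than it removes it, at least in the brighter half of the range.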
Screenshots of my results:
- Directional and point lights: screenshot (debug display shows albedo, depth, normals, and lighting)
- SSAO random samples before blurring: screenshot (slightly more noisy than it should be, due to non-normalized random vectors)
- SSAO after blurring: screenshot
- Comparison images from before deferred shading was implemented: shot 1, shot 2
Other resources I came across while implementing these things: