Deferred Lighting and Shading


1 Deferred Lighting and Shading
© 2004 Blue Shift, Inc. With Implementation Details and Demos for the PC (DX9), Xbox, and PS2 Platforms

2 Presenters
- Rich Geldreich, Ensemble Studios
- Matt Pritchard, Ensemble Studios
- John Brooks, Blue Shift, Inc.

3 Session Overview
- High-level description
- PS2 and Xbox demos (the Xbox demo will be shown early)
- Pros and cons
- Theory and concepts
- Platform-specific details
- Advanced stuff, future research
Note: This is a huge topic; we can't do it justice in ~25 slides.

4 What is it?
- Deferred lighting and shading is the decoupling of geometry rasterization from lighting and (possibly) shading
- Scene passes only write attributes, not lit pixels
- Light-independent attributes are dumped to buffers in one or more deferment passes
- Some common attributes for deferred lighting: normal, position/depth
- Common attributes for deferred shading: albedo, specular color/power
- Basically, the raw data needed for lighting (and maybe shading) is saved and reused in later passes
- Lights become 2D image-space ops
Note: From now on, when we say "Deferred Shading" we usually mean "Deferred Lighting/Shading". (A sketch of one possible attribute layout follows this slide.)
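To make the idea concrete, here is a minimal C++ sketch of what a per-pixel attribute ("G-buffer") record might hold. The field choices, types, and precisions are assumptions for illustration only, not the layout used in the demos.

```cpp
// Hypothetical per-pixel attribute ("G-buffer") layout for a deferred renderer.
// Field choices and precisions are illustrative; real engines pack these far
// more tightly (see the platform-specific slides later in this deck).
#include <cstdint>

struct GBufferPixel {
    // Deferred lighting inputs
    float   normal[3];    // view-space unit normal
    float   depth;        // view-space Z; full position is recovered from this + screen XY

    // Deferred shading inputs
    uint8_t albedo[3];    // diffuse reflectance (sRGB or YCbCr-packed in practice)
    uint8_t gloss;        // specular intensity
    uint8_t specPower;    // optional: Phong exponent or BRDF/NDF index
};
```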

5 Deferred Shading Demos
- Xbox: Gladiator 2K4
  - The 2K3 version was shown at last year's GDC to a limited audience
  - This year's version is a bit more optimized, packs albedo colors differently (YCbCr instead of sRGB), and implements a "metalness" hack for better specular highlights
- PS2: Bunny Demo (created by Blue Shift, Inc.)
  - Object-space normal maps, gloss maps
  - Directional lights with specular highlights
  - Two-hemisphere area lighting (one common formulation is sketched below)
  - HDR accumulation, tone mapping
Note: Show Gladiator!
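The "two-hemisphere area lighting" bullet refers to a sky/ground style ambient term. The slides don't give the demo's exact math, so the following C++ snippet is only one common formulation, shown as a sketch:

```cpp
// One common formulation of two-hemisphere ("sky/ground") area lighting: blend
// between a ground color and a sky color based on how far the normal points
// toward the sky. The Bunny demo's exact math isn't given in the slides.
struct Vec3 { float x, y, z; };

Vec3 hemisphereLight(const Vec3& n,        // unit surface normal
                     const Vec3& up,       // unit "sky" direction
                     const Vec3& sky, const Vec3& ground)
{
    float ndotu = n.x*up.x + n.y*up.y + n.z*up.z;
    float t = 0.5f * ndotu + 0.5f;         // remap [-1,1] to [0,1]
    return Vec3{ ground.x + (sky.x - ground.x) * t,
                 ground.y + (sky.y - ground.y) * t,
                 ground.z + (sky.z - ground.z) * t };
}
```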

6 Deferred Lighting, Deferred Shading, or Both?
- Lighting – light accumulation
  - The process can take into account a pixel's location, normal (or full tangent frame for anisotropic shading), and some aspect(s) of the illumination model's BRDF (e.g. specular power for Phong)
  - Basically, "lighting" = irradiance computation for direct light sources, with diffuse and specular contributions accumulated separately
- Shading – the process of computing an output pixel's final color (or exit radiance for HDR) using all of the pixel's attributes (such as albedo, specular color, etc.), the accumulated light, and a shading function (a sketch of one possible combine follows this slide)
- Shading can be done in many places:
  - During the lighting passes (only the final color is accumulated)
  - All at once, as a final image-space post process
  - During a final scene render pass
Notes: Really "pixel" or "fragment", depending on what you're doing and where. Lighting and shading are a little blurred because we partially factor our BRDF into the lighting passes. We've approximated the diffuse-only irradiance contribution from indirect lighting using low-order Spherical Harmonic (SH) encoded irradiance volumes [2], but we're not going to cover that here.
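As an assumed example of the lighting/shading split for a Phong-style model, the shading step might combine the stored attributes with the separately accumulated diffuse and specular light as follows. The exact combine used in the demos isn't spelled out in the slides.

```cpp
// Assumed example of the "shading" step for a Phong-style model: combine the
// per-pixel attributes with the separately accumulated diffuse and specular
// light. The exact combine used in the demos isn't given in the slides.
struct Vec3 { float x, y, z; };

static Vec3 mul(const Vec3& a, const Vec3& b) { return { a.x*b.x, a.y*b.y, a.z*b.z }; }
static Vec3 add(const Vec3& a, const Vec3& b) { return { a.x+b.x, a.y+b.y, a.z+b.z }; }

// exit radiance = albedo * accumulated diffuse + specular color * accumulated specular
Vec3 shadePixel(const Vec3& albedo,        // from the attribute buffer
                const Vec3& specColor,     // from the attribute buffer
                const Vec3& diffuseAccum,  // from the light accumulation pass(es)
                const Vec3& specAccum)
{
    return add(mul(albedo, diffuseAccum), mul(specColor, specAccum));
}
```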

7 Why Bother with Deferred?
- Some shadow techniques (stencil shadow volumes, "forward" shadow buffering) require a Z pre-pass anyway; why not write some per-pixel attributes too, instead of just depth?
- Predictability: the smaller or further away the light, the less processor time it takes to compute its influence, independent of the number of objects the light illuminates
- Consistency: the entire scene can be lit using the same illumination model, shadowing technique, etc.
- Quality: most of the lighting equation is recomputed per-pixel
  - A good example is light attenuation (falloff): no per-vertex approximations that break down on large triangles
- Simplicity: scene rasterization passes require attribute dumping only
  - No need to handle a zillion combinations of directional/omni/spot/etc. lights in your vertex/pixel programs
Notes: The speed hit from writing attributes can be won back in the lighting passes. Some platforms have higher fill rate when writing just depth, so there must be some justification for writing depth + attributes. We say "processor time" and not just "GPU time" because on PS2 things are blurred a bit. TODO: scene rasterization overdraw cost is independent of the number of lights.

8 Conceptual Operation
- Render the scene to attribute buffer(s)
  - Example attributes: normal, albedo, specular color, depth
- For each light:
  - Shadow pass (stencil ops, shadow buffering, etc.)
    - Shadow buffering can benefit from having the scene's depth buffer available as a texture
  - Render the light's bounding volume and accumulate diffuse/specular light contributions, or lit/shaded pixels
    - May use a screenspace quad (directional light/full-screen light), a CPU-projected/clipped screenspace n-gon, a 3D object with front-face culling and projective texturing, tilization, etc.
- (Optional) Shade the scene
  - Render a full-screen quad and apply the same shader to all pixels, or...
  - Render the scene again to support arbitrary material shaders
- Perform image-space post-processing effects
  - Particles, blooming, streaks, fog, tone mapping, etc.
(A CPU-side sketch of this frame flow follows this slide.)
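Here is the frame flow above written as CPU-side C++ pseudocode. Every type and function is a hypothetical stand-in (stubbed so the sketch compiles), not a real engine or D3D API.

```cpp
// The frame flow above as CPU-side pseudocode. All types and functions are
// hypothetical stand-ins, stubbed so the sketch compiles.
#include <vector>

struct Scene {}; struct Camera {}; struct Light {};

void renderAttributes(const Scene&, const Camera&) { /* write normal/albedo/gloss/depth */ }
void renderShadow(const Scene&, const Light&)      { /* stencil volumes or shadow buffer */ }
void drawLightVolume(const Light&, const Camera&)  { /* accumulate diffuse & specular light */ }
void shadeFullScreenQuad()                         { /* combine attributes + accumulated light */ }
void renderPostEffects()                           { /* particles, bloom, fog, tone mapping */ }

void renderFrame(const Scene& scene, const Camera& cam,
                 const std::vector<Light>& lights)
{
    // 1. Attribute ("deferment") pass: rasterize geometry, write attributes only
    renderAttributes(scene, cam);

    // 2. Lighting passes: each light becomes a 2D image-space operation
    for (const Light& light : lights) {
        renderShadow(scene, light);
        drawLightVolume(light, cam);
    }

    // 3. Optional shading pass, then image-space post processing
    shadeFullScreenQuad();
    renderPostEffects();
}
```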

9 Other Advantages
- Eliminates hard limits on the number of lights influencing individual objects
  - The number of active lights per object in forward shading engines is usually constrained in some way by the maximum size of a single vertex/pixel program
- Potentially reduced overdraw cost
  - Depends on the lighting complexity
  - Not as important on GPUs with fast Z occlusion hardware
- Lighting passes can take advantage of the fast Z occlusion hardware on modern GPUs
  - Occluded lights, or lights entirely in front of other geometry, can be almost free
- On some platforms (PS2) it may be the only way of achieving bump-mapped, per-pixel lighting with many lights
  - The PS2 is quite capable of rasterizing per-pixel attributes to buffers
  - Lighting and shading are performed at full floating-point precision!
Notes: On the max vertex/pixel shader size problem: there are lots of workarounds, but in the end they usually involve visual compromises that hurt the consistency of the lighting. Low per-vertex cost: no need for huge, fancy vertex programs; attribute rendering is simple. "Forward shading" = immediate shading, i.e. lighting and shading as you render scene geometry.

10 Cons of Deferred Shading 1/2
- Can be bandwidth/memory intensive
  - On some platforms, attributes must be written once and read back many times
  - Multiple attribute buffers at high framebuffer resolutions gobble memory
- Very high hardware requirements
  - Especially at typical PC game resolutions, or with FSAA
- Alpha blending is problematic
  - It's all smoke and mirrors anyway; we'll cover some alternative techniques
- Not practical on the PC without DX9+ hardware (maybe doable on DX8.1 hardware)

11 Cons of Deferred Shading 2/2
- Per-material shaders or multiple BRDFs are impractical on some platforms
  - On some platforms/shader models it's infeasible to use multiple pixel shader programs to shade the scene
  - Fully texture-driven shaders take on more importance
- Alpha blending
  - "Real" alpha blending (the over operator [6]) is difficult
  - To do it "correctly", multiple pixels must be independently lit and shaded before applying the over operator
  - Excuse: alpha blending is a total hack anyway
  - We'll show some hacks that can be decent alternatives on some platforms

12 What Attributes to Render?
- Minimal attributes:
  - Albedo (typically sRGB or YCbCr)
  - Gloss (specular intensity, scalar)
  - Normal (typically a unit vector); can also write yaw/pitch, (ZSign, Y, X), etc.
  - Depth (scalar: float, int, or spread over several components)
    - Could also write the full 3D position, but this is unnecessary: the full position can easily be "recovered" given the pixel's screenspace location and Z depth (see the sketch after this slide)
- Additional attributes:
  - Specular color, power, BRDF or NDF (Normal Distribution Function) index, etc.
  - For anisotropic shading: perturbed tangent (XYZ) and/or binormal (XYZ)
  - Special effects: object ID (int)
  - Additional shader function parameters (example: flags)
Note: Even the minimal output scenario recovers the full pixel position.
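For the "recover position from screenspace location and Z depth" point, a typical reconstruction for a standard perspective projection looks like the C++ sketch below; engines often bake the projection factors into shader constants or interpolate frustum-corner rays instead.

```cpp
// Sketch of recovering a view-space position from a pixel's screen location
// and its stored linear view-space depth. Assumes a standard perspective
// projection described by a vertical FOV and aspect ratio.
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 reconstructViewPos(float pixelX, float pixelY,   // pixel coordinates
                        float width, float height,    // render target size
                        float viewZ,                  // stored linear view-space depth
                        float fovY, float aspect)     // projection parameters
{
    // Map pixel coords to normalized device coords in [-1, 1] (Y flipped)
    float ndcX =  (2.0f * (pixelX + 0.5f) / width  - 1.0f);
    float ndcY = -(2.0f * (pixelY + 0.5f) / height - 1.0f);

    // Scale by the frustum extents at z = 1, then push out to the stored depth
    float tanHalfFov = std::tan(0.5f * fovY);
    return Vec3{ ndcX * tanHalfFov * aspect * viewZ,
                 ndcY * tanHalfFov * viewZ,
                 viewZ };
}
```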

13 Deferred on DX9 Hardware
- Attribute passes can use Multiple Render Targets (MRT)
  - Unfortunately, depth or position must be written explicitly, so up to one whole render target is lost; we can't alias the Z-buffer to a texture like on Xbox
- Can write higher-precision normals, which look substantially better than 8-8-8
- Less need to pack data; however, PS_2_0 gives us lots of ways to pack data
Note: Some scenes look like trash with 24-bit normals. (An illustrative MRT setup sketch follows this slide.)
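A rough illustration of binding multiple render targets for the attribute pass on D3D9. The formats and target count here are assumptions for the sketch, not the presenters' layout.

```cpp
// Illustrative D3D9 MRT setup for the attribute pass. The formats and target
// count are assumptions for this sketch, not the presenters' layout.
// Note: some DX9 parts require all MRTs to share the same bit depth
// (check D3DPMISCCAPS_MRTINDEPENDENTBITDEPTHS and D3DCAPS9::NumSimultaneousRTs).
#include <d3d9.h>

bool bindAttributeTargets(IDirect3DDevice9* dev, UINT w, UINT h,
                          IDirect3DTexture9* tex[3], IDirect3DSurface9* surf[3])
{
    const D3DFORMAT fmt[3] = {
        D3DFMT_A8R8G8B8,       // RT0: albedo + gloss
        D3DFMT_A16B16G16R16F,  // RT1: higher-precision normal
        D3DFMT_R32F            // RT2: linear depth (the Z-buffer can't be sampled)
    };

    for (int i = 0; i < 3; ++i) {
        if (FAILED(dev->CreateTexture(w, h, 1, D3DUSAGE_RENDERTARGET, fmt[i],
                                      D3DPOOL_DEFAULT, &tex[i], NULL)))
            return false;
        tex[i]->GetSurfaceLevel(0, &surf[i]);
        dev->SetRenderTarget(i, surf[i]);   // attribute pixel shader writes oC0..oC2
    }
    return true;
    // Later, in the lighting passes, bind these as textures: dev->SetTexture(stage, tex[i]).
}
```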

14 Deferred on DX9 Hardware: HDR Light Accumulation
- Alpha blending to HDR surfaces isn't supported by current DX9-class hardware
- Lighting passes are just 2D screenspace ops, so it's easy to predict which portions of the framebuffer are affected by each light
- We've accumulated HDR light in-place (i.e. coherent read-modify-write to a single buffer bound as both render target and texture source) on the Radeon 9800 series by rendering 32x32 tiles and switching to dummy render targets after each light
  - Render targets must be temporarily switched to unrelated surfaces to flush the R3xx's backside caches (?)
Notes: There are lots of ways of doing HDR accumulation on today's hardware; no time to cover them all. Reading and writing the same buffer is an evil hack, but it's worked for us since early '03. The D3D debug runtime doesn't like this. (A sketch of the tile loop follows this slide.)
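A sketch of the in-place accumulation trick described above: the HDR buffer is bound as both render target and texture, each light's screen footprint is drawn as 32x32 tiles, and a dummy render target is bound between lights. The tile-drawing helper is a hypothetical stub, and the pattern as a whole is hardware-specific and not sanctioned by the D3D runtime.

```cpp
// Sketch of in-place HDR light accumulation via read-modify-write to one buffer.
// Hardware-specific hack; not sanctioned by the D3D runtime.
#include <d3d9.h>

static void drawLightTile(IDirect3DDevice9* /*dev*/, LONG /*x*/, LONG /*y*/)
{
    // Hypothetical helper: would draw one 32x32 screenspace quad running the
    // light accumulation pixel shader over that tile.
}

void accumulateLightsInPlace(IDirect3DDevice9* dev,
                             IDirect3DTexture9* hdrTex, IDirect3DSurface9* hdrSurf,
                             IDirect3DSurface9* dummySurf,
                             const RECT* lightRects, int numLights)
{
    for (int i = 0; i < numLights; ++i) {
        dev->SetRenderTarget(0, hdrSurf);
        dev->SetTexture(0, hdrTex);                  // same surface: read-modify-write

        const RECT& r = lightRects[i];               // screen bounds affected by this light
        for (LONG y = r.top; y < r.bottom; y += 32)
            for (LONG x = r.left; x < r.right; x += 32)
                drawLightTile(dev, x, y);

        dev->SetRenderTarget(0, dummySurf);          // switch away to flush caches
    }
}
```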

15 Deferred on Xbox Round One: Two Scene Passes
- Two scene rendering passes:
  - Render albedo and gloss ("C buffer")
  - Render normal and object ID ("N buffer")
- Alias the depth/stencil buffer to a linear 32-bit A8R8G8B8 texture
- Tricky part: omni lights
  - Use texm3x3pad / texm3x3tex to "unproject" [9] and transform pixels to Normalized Light Space (NLS) [10] (a sketch of one possible NLS encoding follows this slide)
  - A volume texture lookup fetches the NLS light vector and attenuation
  - Rotate the light vector to view space, renormalize in the combiners
  - Dump the light vector and attenuation to a temporary buffer
  - Perform the usual Phong lighting calculations in another pass in the combiners
  - Accumulate lit and shaded pixels
Notes: This technique has been used in a shipped title. HILO_1 dot mapping is key to making this possible.
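The slides don't spell out what the NLS volume texture stores, but one plausible encoding (an assumption, shown only to make the idea concrete) is a normalized light vector plus an attenuation factor per texel:

```cpp
// Assumed content of a Normalized Light Space (NLS) volume texture texel: a
// pixel's position relative to the light, scaled by the light radius, indexes
// the volume; each texel stores a normalized light vector and an attenuation
// value. The actual encoding used in the deck's demo is not specified.
#include <cmath>

struct Vec3 { float x, y, z; };

struct NLSTexel { float lightVec[3]; float attenuation; };

NLSTexel buildNlsTexel(const Vec3& nlsPos)   // position in [-1,1]^3 light space
{
    float d = std::sqrt(nlsPos.x*nlsPos.x + nlsPos.y*nlsPos.y + nlsPos.z*nlsPos.z);
    float inv = (d > 0.0f) ? 1.0f / d : 0.0f;
    NLSTexel t;
    t.lightVec[0] = -nlsPos.x * inv;         // direction from the surface toward the light
    t.lightVec[1] = -nlsPos.y * inv;
    t.lightVec[2] = -nlsPos.z * inv;
    t.attenuation = (d < 1.0f) ? (1.0f - d) * (1.0f - d) : 0.0f;  // assumed falloff curve
    return t;
}
```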

16 Deferred on Xbox Round Two: One Scene Pass
- We don't have Multiple Render Targets on Xbox, but we can fake it by tightly packing attributes:
  - High word = packed albedo and gloss
  - Low word = packed normal (1-7-8); one possible packing is sketched below
- Other than that, this doesn't differ from the two-pass technique in any major way, aside from a more refined and optimized implementation
- See Rich's Gladiator presentation for more details
Notes: The Xbox could be pushed further; it should be possible to do a scene roughly 2x more visually complex at 30Hz. Object-space normal mapping and shadow LODs would make a big difference in Gladiator.
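The exact bit layout isn't given in the transcript, so the following C++ pack/unpack is only one possible reading of the "1-7-8" normal format: 1 sign bit for Z, 7 bits of Y, 8 bits of X, with |Z| reconstructed on unpack.

```cpp
// One assumed interpretation of a "1-7-8" packed normal (Z sign, 7-bit Y,
// 8-bit X). Purely illustrative; the shipped packing may differ.
#include <cstdint>
#include <cmath>

uint16_t packNormal178(float x, float y, float z)
{
    uint16_t zs = (z < 0.0f) ? 1 : 0;
    uint16_t yq = (uint16_t)((y * 0.5f + 0.5f) * 127.0f + 0.5f);  // 7 bits
    uint16_t xq = (uint16_t)((x * 0.5f + 0.5f) * 255.0f + 0.5f);  // 8 bits
    return (uint16_t)((zs << 15) | (yq << 8) | xq);
}

void unpackNormal178(uint16_t p, float& x, float& y, float& z)
{
    x = ((p & 0xFF) / 255.0f) * 2.0f - 1.0f;
    y = (((p >> 8) & 0x7F) / 127.0f) * 2.0f - 1.0f;
    float zz = 1.0f - x*x - y*y;                   // |Z| from unit-length constraint
    z = std::sqrt(zz > 0.0f ? zz : 0.0f) * ((p >> 15) ? -1.0f : 1.0f);
}
```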

17 Deferred Shading on PS2: Benefits of Deferred on the PS2
- Allows per-pixel shading on PS2!
- Extensive per-pixel programmability:
  - Floating point (including divide, sqrt, random, etc.)
  - Integer data swizzling
  - Lookup tables
  - Data load/store (pixel-local or between pixels)
  - Looping
  - Branching
- Unlimited shader length
- Flexible shader memory use (shader program, constants, etc.)
- Enables high-end effects like normal mapping and per-pixel lighting

18 Deferred Shading on PS2: Drawbacks to Deferred on the PS2
- GPU fill rate
  - DMA attribute buffers from VRAM
  - DMA shaded pixels to VRAM
  - Multiple render passes needed to store per-pixel attributes
- CPU performance
  - Based on shader length and number of pixels shaded
  - Focus on efficient algorithms and tight asm code
- VRAM memory
  - Attribute buffers
  - Attribute textures
- DRAM memory
  - Per-pixel attributes
  - Shaded pixel data

19 Alpha Blending - It’s a Pain
- Stippling/screen-door transparency [4] [9]
  - An old-school hack, useful in console games
  - Console video encoders and most displays form a big low-pass filter, so the pattern is invisible except near edges
  - Can alternate the stipple pattern every frame (a minimal sketch of the screen-door test follows this slide)
  - Stippled surfaces interact with stencil shadows in a natural way
  - Can also use stippling while rendering shadow volumes, for less-than-fully-dark shadows [11]
  - No explicit sorting required
- Depth peeling [5]
  - For surfaces that need blending: peel back 1-3 layers
  - Independently light each layer, then composite everything together
- Hybrid techniques
  - Forward (immediate) shade surfaces that need blending
  - Alpha blend over the top of the deferred engine's output
Notes: We've used stippling in a shipping title. Stippling is less useful on HDTV displays, for obvious reasons.
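A minimal sketch of the screen-door test, assuming a tiny 2x2 dither pattern that shifts each frame; real implementations typically use a larger ordered-dither matrix and may bake the pattern into stencil or alpha-test state.

```cpp
// Minimal screen-door transparency sketch: keep or reject a pixel based on a
// per-position dither threshold, alternating the pattern every frame as the
// slide suggests. A 2x2 pattern is used here only for brevity.
bool stippleKeepPixel(int px, int py, float alpha, int frame)
{
    // 2x2 threshold pattern, shifted each frame so the holes move between frames
    static const float threshold[2][2] = { { 0.25f, 0.75f },
                                           { 1.00f, 0.50f } };
    int x = (px + frame) & 1;
    int y = py & 1;
    return alpha >= threshold[y][x];   // true = shade the pixel, false = "discard"
}
```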

20 Deferred on DX9 Hardware: Other Neat Stuff We've Tried
- 2D Normal Distribution Functions (NDFs) [8]
  - Anisotropic shading, direction-of-anisotropy maps
  - The half-angle vector is rotated to perturbed tangent space; its XY is used as texcoords for the specular and iridescence map lookups
  - Requires storage of per-pixel perturbed tangent vectors
- Arbitrary material shaders
  - Deferred lighting, but forward (immediate) shading
  - Re-render the scene after lighting; use the PS 3.0 "vPos" register or texture projection to read the accumulated light, then shade each fragment using any material pixel shader
- SH Encoded Irradiance Volumes [2]
  - Use the renderer to create HDR "radiance probes" for any position in a scene; the output is an HDR environment map (cubemap)
  - Radiance probes can be quickly converted to irradiance environment maps [7] using Spherical Harmonics, completely on the GPU
  - SH coefficients can be packed into multiple volume textures and reused in later passes

21 References
[1] Nicolas Thibieroz, "Deferred Shading with Multiple Render Targets". ShaderX2: Shader Tips & Tricks.
[2] Peter-Pike Sloan, "Efficient Evaluation of Irradiance Environment Maps". ShaderX2: Shader Tips & Tricks.
[3] Dean Calver, "Photo-realistic Deferred Lighting".
[4] Tom McReynolds, "Advanced Graphics Programming Techniques Using OpenGL".
[5] Cass Everitt, NVIDIA Order Independent Transparency demo.
[6] Jim Blinn, "Jim Blinn's Corner: Dirty Pixels".
[7] Ravi Ramamoorthi, Pat Hanrahan, "An Efficient Representation for Irradiance Environment Maps".
[8] Jan Kautz, "Rendering with Hand Crafted Shading Models". Game Programming Gems 3.
[9] Atman Binstock, private conversation, 2001.
[10] Alex Vlachos, John Isidoro, Chris Oat, "Textures as Lookup Tables for Per-Pixel Lighting". Game Programming Gems 3.
[11] Atman Binstock, private conversation, 2003.

