Presentation transcript: "Light Propagation Volumes in CryEngine® 3"
1. Light Propagation Volumes in CryEngine® 3
Anton Kaplanyan, graphics researcher at Crytek
New lighting technique
Advances in Real-Time Rendering in 3D Graphics and Games
2. Agenda
Introduction
CryEngine® 3 lighting pipeline overview
Core idea
Applications (with video)
Improvements
Combination with other technologies (with video)
Optimizations for consoles
Conclusion and future work
Live demo
You can take a look at the agenda. Meanwhile, I'll introduce Crytek. We have 5 studios across Europe and more than 30 nationalities. We have already released multiple AAA games, such as FarCry and Crysis. We have our own engine, with many licensees across the world. There are three iterations of the engine: CryEngine (with FarCry), CryEngine 2 (with Crysis), and CryEngine 3, announced at GDC'09.
3. Introduction to real-time graphics
Strictly fixed budget per frame
Many techniques are not physically based
Consistent performance
Game production is complicated
This talk is mostly about massive and indirect lighting
This is a high-level talk
More implementation details in the paper
Strictly fixed budget per frame: at least 30 frames per second, which means at most about 33 milliseconds per frame. Many techniques are not physically based because of the hard constraint on the frame's time budget. We need to provide consistent rendering performance that scales well with rendering complexity. Interdependencies with game production complicate production, so we try to avoid any possible interdependency and keep the production pipeline as simple as possible. That also means that we avoid any precomputation approaches. This talk is mostly about massive and indirect lighting. Massive lighting is the lighting of a limited area with a large number of light sources (e.g. an apartment interior). There are no implementation details in this presentation; it is only a high-level talk. You can find all the details in the paper.
4. CryEngine® 3 renderer overview (1 / 5)
Xbox 360 / PlayStation 3 / DirectX 9.0c / 10 / (11 soon…)
This is a cross-platform engine that supports multiple platforms and graphics APIs: Xbox 360, PlayStation 3, DirectX 9 and DirectX 10. DirectX 11 support is coming soon. The engine is completely multithreaded. We have seamless world streaming technology for huge levels. You can see some completely different environments created with CryEngine recently: top left, jungle; top right, a frozen area; bottom right, the San Francisco GDC demo; bottom left, a recently created forest area. We're working on Crysis 2 for PC and consoles right now.
5. CryEngine® 3 renderer overview (2 / 5)
Unified shadow maps solution [Mittring07]
We have a unified shadow maps solution for different types of light sources, which decreases the number of shader combinations and decouples shadowing from scene rendering.
6. CryEngine® 3 renderer overview (3 / 5)
SSAO [Kajalin09], [Mittring09]
Then we decided to improve the indirect lighting further, so Kajalin introduced the Screen-Space Ambient Occlusion technique while working at Crytek. This technique is a real-time approximation of ambient occlusion using screen-space scene depth. It was recently enhanced to take normal-mapped surfaces into account.
7. CryEngine® 3 renderer overview (4 / 5)
Deferred lighting [Mittring09]
Minimal G-Buffer
Sun / Omni / Projectors / Caustics / Deferred light probes
Then we decided to go a step further and implement deferred lighting in CryEngine 3. This approach decreases the number of shader combinations dramatically. Because of the minimal G-Buffer layout, it becomes console friendly, especially for consoles with a limited amount of render target memory, like the Xbox 360. You can see the layout of the G-Buffer on the left. The G-Buffer consists of depth, normal and specular power. These values are enough to support lighting with different light types, like sun, point lights and projectors, and even deferred caustics and deferred light probes. Deferred light probes are image-based indirect lighting; you can see an example on the right. A deferred light probe consists of two cubemaps: a specular cubemap and a low-resolution diffuse-convolved cubemap. The diffuse-convolved cubemap approximates precomputed diffuse global illumination from distant objects. Deferred light probes are precomputed and placed by artists in appropriate dark places where necessary.
8. CryEngine® 3 renderer overview (5 / 5)
Lighting accumulation pipeline:
  Apply global / local hemispherical ambient
  Optionally: replace it with Deferred Light Probes locally
  [Global illumination solution should take place here]
  Multiply indirect term by SSAO to apply ambient occlusion
  Apply Direct Lighting on top of Indirect Lighting
The lighting pipeline in CryEngine 3 is layered scene lighting, which consists of several layers. First of all, we apply global or local hemispherical ambient. Then it is optionally replaced by deferred light probes placed locally by artists. At this point we have the indirect lighting of the scene. Then the result of indirect lighting is multiplied by SSAO to apply occlusion information to the indirect lighting. Finally, we add direct lighting on top of that. So the global illumination term should take place inside the indirect lighting term. Now I'd like to give an overview of recent real-time rendering trends.
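The layering described on this slide can be illustrated with a minimal Python sketch for a single pixel and a single colour channel. The function and parameter names are hypothetical, and adding the global illumination term to the indirect layer is an assumption; the slide only says that GI belongs inside the indirect term.

```python
def accumulate_lighting(ambient, direct, ssao, probe=None, gi=0.0):
    """Layered lighting for one pixel and one colour channel (plain floats).

    Hypothetical helper: the names and the additive GI term are assumptions.
    """
    # Layer 1: hemispherical ambient, optionally replaced by a local light probe.
    indirect = probe if probe is not None else ambient
    # Layer 2 (assumed additive): global illumination joins the indirect term.
    indirect += gi
    # Layer 3: SSAO attenuates the indirect term only.
    indirect *= ssao
    # Layer 4: direct lighting goes on top, unaffected by SSAO.
    return indirect + direct
```

For example, `accumulate_lighting(0.2, 1.0, 0.5)` scales only the ambient part by the occlusion factor before the direct light is added.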
9. Real-time rendering development trends
Rendering is a multi-dimensional query [Mittring09]
  R = R(View, Geometry, Material, Lighting)
Divide-and-conquer strategy, some examples:
  Shadow maps (decouple visibility queries)
  Deferred techniques (decouple lighting / shading)
  Screen-space techniques (SSAO, SSGI, etc.)
  Reprojection techniques (partially decouple view)
Why?
  Fewer interdependencies => more consistent performance
  Future trends: friendly to parallel and distributed computations
Rendering is a multi-dimensional query. Rendering performance depends on many arguments, like viewer position, geometry complexity, material variety and lighting complexity. The "divide-and-conquer" strategy is an attempt to decompose this query into separate queries, and it is widely used in real-time graphics. Examples…
10. Paper reference icon
This icon means that details are in the paper
Now I'd like to introduce this icon. When you see it on a slide, you can find more details about the slide content in the paper. (Adobe Acrobat registered trademark.)
12. Light Propagation Volumes: Goals
Decouples lighting complexity from screen coverage (resolution × overdraw)
Radiance caching and storing technique
Massive lighting with point light sources
Global illumination
Participating media rendering (still work in progress…)
Console friendly (Xbox 360, PlayStation 3)
What goals do we pursue by developing this technique? The first is to decouple lighting complexity from screen coverage, and thereby decrease computation and memory footprint. The main idea behind that is to cache and store the radiance of the lighting. That could allow us to do massive lighting (lighting with a crazy amount of light sources inside some limited area, like an apartment), global illumination and participating media rendering (which is still in progress). Another goal is to make it work on consoles.
13. Related work
Irradiance Volumes [GSHG97], [Tatarchuk04], [Oat05] + Signed Distance Fields [Evans06]
Lightcuts: A Scalable Approach to Illumination [WFABDG05]
Multiresolution Splatting for Indirect Illumination [NW09]
Hierarchical Image-Space Radiosity for Interactive Global Illumination [NSW09]
Non-interleaved Deferred Shading of Interleaved Sample Patterns [SIMP06]
Let's start with related work. The irradiance caching technique was introduced by Greger et al. and then extended to real-time graphics by Tatarchuk and Oat. There are several other techniques that serve the same purpose: decoupling lighting complexity from screen coverage. You can find them in the references. Unfortunately, considering each paper would take a lot of time. You can find a description and overview of these techniques in the paper. Now, let's start with an overview of SH Irradiance Volumes…
14. SH Irradiance volumes
A grid of irradiance samples is taken throughout the scene
Each irradiance sample is stored in SH form
At render time, the volume is queried and nearby irradiance samples are interpolated to estimate the global illumination at a point in the scene
From [GSHG97], [Tatarchuk04]
SH Irradiance Volumes are used to cache irradiance samples in the cells of a volumetric grid to accelerate global illumination. The approach was later extended by Tatarchuk and Oat with an approximation of irradiance in the SH basis. At render time, the volume is queried and nearby irradiance samples are interpolated to estimate the global illumination at a point in the scene. Now let's consider the idea behind Light Propagation Volumes…
15. Low-frequency radiance volumes
Similar to SH Irradiance Volumes [Tatarchuk04]
Stores radiance distribution instead
Low-resolution 3D texture on GPU (up to 32³ texels)
SH approximation is low order (up to linear band)
Radiance is not smooth [GSHG97]
But what is the error introduced by approximating it?
From [GSHG97]
We use a caching approach very similar to SH Irradiance Volumes, but we store the radiance distribution instead of irradiance samples. The radiance distribution is the outgoing lighting over a sphere. The radiance volume has a very low resolution, up to 32 by 32 by 32 texels. We use a low SH order, up to the linear band (which amounts to 4 coefficients), which fits well into common texture formats. But according to Greger et al., the radiance is not smooth. As you can see in the illustration, the radiance on the left at point x is discontinuous because of the different energy emitted by the four walls. You can see the SH approximation on the right. Now let's discuss the error introduced by that approximation…
16. Radiance approximation
Error of the spatial approximation depends on
  Density and size / radii of light sources
Error of the angular approximation depends on
  Shape of light source
  Frequency of angular radiance distribution of light source
  Distance to the light source
    Compensated by the energy fall-off
Preserves mean energy and major radiance flow direction
Enough if we want to eventually get irradiance
The resulting error of approximating radiance with a radiance volume consists of spatial and angular errors. The spatial error is caused by the low resolution of the radiance volume and depends on the density of emitters and on attenuation factors like radii. The angular approximation error, which is the SH approximation error, depends on the shape of the particular light source, the frequency of its angular distribution, and the distance from the particular radiance volume cell to the light source. The last factor means that the farther the radiance volume cell is from the light source, the sharper the radiance distribution is (you can see it in the illustration). This factor is compensated by the light source's energy fall-off with distance. This is a very rough approximation of the radiance distribution; however, it still preserves the mean radiance distribution and the major direction of the radiance distribution. And this approximation serves the final purpose of extracting the irradiance from it. The research on this approximation is still in progress… But why do we need to store radiance instead of irradiance then?
17. Light propagation in radiance volume
Start with given initial radiance distribution from emitters
Iterative process of radiance propagation
6-point axial stencil for adjacent cells
Gathering, more efficient for GPUs
Energy conserving
Each iteration adds to result, then propagates further
Imagine we have an initial radiance distribution placed only into the cells where the emitters are. This is a very convenient situation, because we need to render only one pixel per light source. So, we want to get a final radiance distribution over the grid. The proposed solution is to propagate the radiance iteratively. Each iteration applies a 6-point axial stencil to each cell. What does that mean? It means that for each cell we transmit the radiance from the adjacent axial cells with a gathering scheme. Gathering is GPU friendly. The result of each iteration is collected into the final radiance volume, and the next iteration is applied to the result of the previous one. There is an illustration of a single iteration for a single cell in 2D at the bottom of the slide.
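The iteration structure can be illustrated with a deliberately simplified Python sketch. It is an assumption-laden toy and does not reproduce the scheme from the paper: each cell holds a single scalar energy instead of SH coefficients, every cell gathers one sixth of each axial neighbour's energy, and the accumulated result starts from the initial injection. All names are hypothetical.

```python
# The six axial neighbours of the stencil.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def propagate(initial, n, iterations):
    """Scalar toy propagation on an n x n x n grid.

    initial maps (x, y, z) -> energy. Returns the accumulated grid as a dict.
    Directional (SH) information is ignored here, which is a simplification.
    """
    accumulated = dict(initial)   # assumed: the result starts with the injection
    current = dict(initial)       # energy propagated in the previous iteration
    for _ in range(iterations):
        gathered = {}
        for x in range(n):
            for y in range(n):
                for z in range(n):
                    # Gathering: the cell pulls energy from its six neighbours.
                    total = 0.0
                    for dx, dy, dz in NEIGHBOURS:
                        total += current.get((x + dx, y + dy, z + dz), 0.0) / 6.0
                    if total:
                        gathered[(x, y, z)] = total
        # Each iteration adds to the result, then is propagated further.
        for cell, energy in gathered.items():
            accumulated[cell] = accumulated.get(cell, 0.0) + energy
        current = gathered
    return accumulated
```

In this toy version energy is conserved per iteration as long as nothing leaves the grid, because every cell hands exactly one sixth of its energy to each of its six neighbours.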
18. Light propagation in radiance volume
Here is the result of several radiance propagation iterations. The top row is the initial radiance distribution; as you can see, there are a lot of light sources. The top left quad is a single slice of the radiance texture, so the 3D radiance texture is unwrapped in this picture. Notice that the process is highly attenuated, which means we can limit it to just several iterations (8 to 16 for a 32x32x32 radiance volume, depending on the initial intensity of the light sources). Note that the resulting radiance distribution is an accumulation of all these radiance iterations.
19. Rendering with Light Propagation Volume
Regular shading, similar to SH Irradiance Volumes
  Simple 3D texture look-up using world-space position
  Integrate with normal's cosine lobe to get irradiance
  Simple computation in the shader for 2nd order SH
  Lighting for transparent objects and participating media
Deferred shading / lighting
  Draw volume's shape into accumulation buffer
  Supports almost all deferred optimizations
Scene lighting with light propagation volumes is very similar to lighting with SH Irradiance Volumes, but we need to compute irradiance from radiance. This is aided by the GPU. Deferred lighting with optimizations is used in CryEngine 3: stencil prepass, scissor rectangle, depth bounds test, etc.
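The step "integrate with the normal's cosine lobe" can be illustrated with a short Python sketch for two SH bands (4 coefficients). It assumes that the coefficients describe incident radiance as a function of the direction the light arrives from, that the band-1 basis is ordered (y, z, x) with positive signs, and that a single colour channel is processed. These conventions and the function names are assumptions for illustration; the shader code in the paper may differ.

```python
import math

Y0 = 0.5 * math.sqrt(1.0 / math.pi)   # band-0 SH basis constant (~0.282095)
Y1 = 0.5 * math.sqrt(3.0 / math.pi)   # band-1 SH basis constant (~0.488603)
A0 = math.pi                          # cosine-lobe convolution weight, band 0
A1 = 2.0 * math.pi / 3.0              # cosine-lobe convolution weight, band 1

def sh_basis(direction):
    """Evaluate the 4 basis functions for a unit direction (assumed ordering)."""
    x, y, z = direction
    return (Y0, Y1 * y, Y1 * z, Y1 * x)

def irradiance(coeffs, normal):
    """Irradiance at a surface with the given unit normal from 4 SH coefficients."""
    b = sh_basis(normal)
    value = A0 * coeffs[0] * b[0] + A1 * (
        coeffs[1] * b[1] + coeffs[2] * b[2] + coeffs[3] * b[3])
    # A low-order approximation can dip below zero, so clamp the result.
    return max(0.0, value)
```

As a sanity check, uniform incident radiance of 1 projects to the coefficients `(4 * math.pi * Y0, 0, 0, 0)`, and the function then returns π for any normal, which matches the analytic irradiance of a uniformly lit hemisphere.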
20. Massive Lighting with point light sources
This scene is "programmer art"; it hasn't been created by artists.
21. Massive lighting
Option 1: Inject initial energy, then propagate radiance
  A bit faster for a crazy amount of lights
Option 2: Add pre-propagated radiance into each cell
  Simple analytical equation in the shader for point lights
  Higher quality, no propagation error
Error depends on the ratio (light source radius / cell size)
  Radius threshold for lighting with radiance volume
There are two ways to get the final radiance distribution for point light sources inside the radiance volume. The first way is to inject a point primitive with the initial radiance distribution into the appropriate cell of the radiance volume. This is a very good way to inject a lot of light sources and then do a single light propagation step for all injected light sources. However, there is another way to do it. We can inject the precomputed, pre-propagated radiance into all the cells covered by a particular light source. This way gives better quality of propagated radiance, because the radiance is propagated analytically during injection, so we avoid the error introduced by the propagation step. The error of massive lighting depends on the ratio of the light source's radius to the size of a radiance volume cell. Thus we use a threshold on the light source radius to decide whether the scene needs to be lit by the radiance volume or in the regular way.
22. Glossy reflections with Light Propagation Volumes
Accumulative traversal (ignores reflection occlusion)
Several look-ups along reflected ray from camera
Collect incoming radiance from this direction
Integrate over the cone of incoming direction
Cone angle depends on:
  Glossiness of surface
  Distance from look-up to point p
Approximates the integration with Phong BRDF
Another topic is how to achieve reflections with a radiance volume. Let's talk about glossy reflections. The problem is that we have only a low-frequency radiance distribution, so we can extract only very glossy reflections, and the results are close to diffuse lighting. But we can increase the frequency of the radiance distribution by traversing several cells along the reflected direction. So, the technique is to traverse the radiance volume along the reflected eye direction and do several look-ups from the volume (see illustration). For each look-up we compute the incoming lighting from this direction by integrating with an approximation of the view-dependent part of the Phong BRDF. We approximate the view-dependent part of the Phong BRDF by a cone, so the cone angle depends on the surface's glossiness. But the cone angle should also depend on the distance from the look-up to the surface point p. The cone angle shrinks with the distance to the surface point to take into account the decreasing solid angle subtended by the surfel at p.
23. Glossy reflections example
Note the glossy reflections of the glowing red teapot on the metallic wall. The teapot is represented as a set of red emitters here.
24. Massive lighting: Results
NVIDIA GeForce GTX 280 GPU, Intel Core 2 Quad 2.66 GHz, DirectX 9.0c API, HDR 1280x720, no MSAA, volume size: 32³
As you can see from these results, the slope of the plot for light propagation volumes is very steep. Instancing is not used for this measurement, which means that we have one draw call per light source. That might explain the existence of the slope at all.
25. Massive lighting video
This is so-called "programmer art"; it hasn't been done by artists. Notice that this technique is highly useful for emulating sophisticated lighting, like the indirect lighting of complex architectural interiors.
26. Global Illumination with Light Propagation Volumes
27. Global Illumination with Light Propagation Volumes
Instant Radiosity [Keller97]
  The main idea is to represent light bouncing as a set of secondary light sources: Virtual Point Lights (VPL)
Splatting Indirect Illumination [DS07]
  Based on Instant Radiosity
  Reflective Shadow Maps (RSM) are used to generate the initial set of VPLs on the GPU
  Importance sampling of VPLs from RSM
Firstly, I'd like to explain the Instant Radiosity and Splatting Indirect Illumination solutions, because our approach is partially based on them. The idea behind instant radiosity is to represent indirect lighting as direct lighting from emitting surfaces. So each surfel of the emitting surfaces is represented as a secondary light source called a "virtual point light" (VPL). This is an efficient approach for gathering indirect lighting with the GPU. The splatting indirect illumination technique extends this approach with reflective shadow maps and importance sampling from the reflective shadow map. What is a reflective shadow map?
28. Reflective Shadow Maps
Reflective Shadow Map: efficient VPL generator
Shadow map with MRT layout: depth, normal and color
The reflective shadow map approach is the most efficient way to generate a regular grid of VPLs from the lit surfels of the scene. It is a regular shadow map, but with multiple render targets that store not only depth but also normal and surface color, as you can see in the image. This allows us to get all the information for a single VPL from a single pixel of the RSM.
29. Global Illumination with Light Propagation Volumes
Inject the initial radiance from VPLs into radiance volume
  Point rendering
  Place each point into appropriate cell
    Using vertex texture fetch / R2VB
  Approximate initial radiance of each VPL with SH
    Simple analytical expression in shader
Propagate the radiance
Render scene with propagated radiance
The main performance issue of the Splatting Indirect Illumination technique is the rendering of a huge number of VPLs. The idea behind the "global illumination with light propagation volumes" technique is to solve this issue. The algorithm consists of the following steps. First, inject the initial radiance distribution of VPLs from the RSM. The injection is implemented as rendering a large number of point primitives with the GPU, placing each primitive into the appropriate cell of the light propagation volume with a vertex shader transformation and using vertex texture fetch from the RSM (or render to vertex buffer). In the pixel shader we convert the information from the RSM into the initial radiance distribution for each VPL. These computations are very simple for two bands of SH. Second, propagate radiance across the grid as mentioned before. Third, light the scene with the final radiance distribution.
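The injection step can be illustrated with a CPU-side Python sketch. It assumes that each VPL is a (position, normal, flux) tuple for one colour channel, that the initial radiance of a VPL is modelled as a clamped cosine lobe around its normal, and that the band-1 coefficients are ordered (y, z, x). The function names and the normalisation are hypothetical; the actual shader expression is given in the paper and may differ.

```python
import math

def cell_index(position, volume_min, cell_size, resolution):
    """Map a world-space position to an integer cell index, or None if outside."""
    idx = tuple(int(math.floor((p - m) / cell_size))
                for p, m in zip(position, volume_min))
    return idx if all(0 <= i < resolution for i in idx) else None

def vpl_sh(normal, flux):
    """Two-band SH projection of a clamped cosine lobe around a unit normal."""
    x, y, z = normal
    c0 = math.sqrt(math.pi) / 2.0    # band-0 coefficient of the clamped cosine
    c1 = math.sqrt(math.pi / 3.0)    # band-1 scale of the clamped cosine
    return (flux * c0, flux * c1 * y, flux * c1 * z, flux * c1 * x)

def inject(vpls, volume_min, cell_size, resolution):
    """Accumulate the SH coefficients of all VPLs into their grid cells."""
    grid = {}
    for position, normal, flux in vpls:
        idx = cell_index(position, volume_min, cell_size, resolution)
        if idx is None:
            continue  # VPLs outside the volume are dropped in this sketch
        sh = vpl_sh(normal, flux)
        old = grid.get(idx, (0.0, 0.0, 0.0, 0.0))
        # Additive accumulation plays the role of additive blending on the GPU.
        grid[idx] = tuple(a + b for a, b in zip(old, sh))
    return grid
```

Because every VPL touches exactly one cell, the cost of injection in this sketch grows with the number of VPLs and not with the number of screen pixels they would otherwise light.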
30. Implementation details
Light Propagation Volume moves with camera
3D cell-size snapping for volume movement
2D texel-size snapping for RSM movement
RSM is higher in resolution than radiance volume
Smart down-sampling of RSM
Here are some implementation details for the global illumination technique. The light propagation volume moves with the camera. We do a lot of work to provide a stable and consistent solution regardless of camera, object and light movement. A discrete, integer-cell-size movement step is applied to the movement of the volume, which provides stable world-space positions for the cells of the volume. The RSM moves with the volume to provide efficient coverage of the scene by VPLs. A discrete one-texel-size movement step is applied to the RSM movement as well. This provides consistent rasterization of small and highly discontinuous geometry during the RSM generation step. We use an RSM with a much higher resolution than a slice of the light propagation volume to provide a smooth radiance distribution of injected VPLs. We also do smart down-sampling of the RSM with VPL clustering, which facilitates the injection step afterwards, especially on consoles. You can find details and implementation code for that in the paper.
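The snapping idea can be shown with a minimal Python sketch. It assumes that the volume follows a reference position (for example the camera) and that this position is rounded down to whole multiples of the cell size; the function name and the choice of rounding down are assumptions for illustration.

```python
import math

def snap_to_grid(position, cell_size):
    """Snap each coordinate down to a whole multiple of cell_size.

    Small movements then leave the volume in place, and larger ones shift it
    by an integer number of cells, so cells keep stable world-space positions.
    """
    return tuple(math.floor(c / cell_size) * cell_size for c in position)
```

The same rounding, applied in two dimensions with the world-space size of one RSM texel as the step, would give the texel-size snapping mentioned for the RSM.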
31. Global Illumination with Light Propagation Volumes: Results
Note that these images show different types of scenes: outdoor, indoor, a "Cornell box"-like environment, and foliage. Let's talk about some issues of this approach.
32. Issue: Cell-alignment of VPLs
Injection of VPLs involves position shifting
  Position of injected VPL becomes grid-aligned
  Consequence of spatial radiance approximation
Unwanted radiance bleeding
  Lighting of double-sided and thin geometry
During the injection step, the VPL is moved to the closest cell of the volume. That causes some unwanted bleeding and lighting of the other faces of thin and double-sided geometry.
33. Cell-alignment of VPLs: Bleeding example
Note the bleeding of radiance through the double-sided roof of the hut in the left image. The roof is a set of VPLs; it is injected into the volume, and the scene is lit by that volume. You can see the ground truth on the right side. So, how can we solve this issue?
34. Cell-alignment of VPLs: Solution
VPL half-cell shifting
  towards normal
  towards light direction
Coupled by anisotropic bilateral filtering
  During final rendering pass
  Sample radiance with offset by surface normal
  Compute radiance gradient
  Compare radiance with radiance gradient
There are two steps to solve the issue. The first one is VPL shifting. The VPL is shifted towards the surface normal and the initial light direction by half of the cell size. The shifting guarantees that the VPL is always injected into the cell in front of its actual position. But that's not enough to solve the problem completely. The issue still exists because of the trilinear interpolation of the radiance distribution, and we cannot shift the VPL further because that starts introducing an injection error. The anisotropic bilateral filtering fixes the problem of the trilinear hardware approximation of the radiance distribution. The idea behind the filter is to check whether the radiance direction and the radiance gradient direction match. This forms the filtering decision for the bilateral filter. You can find more details on these techniques in the paper.
35. Cascaded Light Propagation Volumes for GI
One grid is limited in dimensions and low resolution
Multiresolution approach for radiance volumes
  Similar to Cascaded Shadow Maps technique [SD02]
  Preserves surrounding radiance outside of the view
Each cascade is independent
  With separate RSM for each cascade
  Transmit radiance across adjacent edges
  Filter objects by size for particular RSM
Efficient hierarchical representation of radiance emitters
Since one grid is limited in world-space dimensions and very low in resolution, sometimes it's not sufficient for computing GI. So we propose a cascaded approach for light propagation volumes. This approach is very similar to the cascaded approach for shadow maps; however, we still need to preserve some area for each cascade around the camera to keep bleeders that are not in the view. So the cascades are nested. Each cascade is processed in a completely independent way and has its own RSM. We also transmit radiance across the cascades' boundaries to provide better propagation and a seamless connection. Since we completely control RSM generation for each cascade, we can filter objects of different sizes into different cascades. This solution leads us to a very efficient hierarchical representation of the radiance emitters of the scene.
37. Global Illumination: Combination with SSAO
No secondary occlusion for light propagation volumes
Can be approximated by Ambient Occlusion term
Image labels: SSAO on, GI off; SSAO off, GI on; GI + SSAO
SSAO serves as a good approximation of secondary occlusion for GI. Note that SSAO could be extended to screen-space directional occlusion by using the main direction from the Light Propagation Volume cell for a particular screen pixel.
38. Global Illumination: Combination with SSGI
Screen-Space Global Illumination [RGS09]
Limitations of SSGI
  Only screen-space information
  Huge kernel radius for close objects
Limitations of Light Propagation Volumes
  Local solution
  Low resolution spatial approximation
Supplementing each other
  Custom blending
Screen-space global illumination is a screen-space technique similar to SSAO, but it takes color into account. Thus the technique provides screen-space color bleeding. Now I'll talk about the disadvantages of SSGI and of the current GI technique. SSGI has some disadvantages. It has only screen-space information for color bleeding, which means that it skips objects that are not in the view or are hidden by other objects. Additionally, we need to increase the screen-space kernel radius for SSGI sampling for objects that are close to the camera, which might lead to inconsistent performance. On the other hand, GI with LPV has its own disadvantages. It's always a local technique, so it's not possible to cover the whole field of view with light propagation volumes. Secondly, it still has a low-resolution spatial approximation, which could lead to missing bleeding from small objects. A constant screen-space radius for the SSGI kernel provides consistent bleeding for distant objects and supplements bleeding details for close small objects. More details about custom blending and combining these technologies are in the paper.
39. Global Illumination: Combination with SSGI
Notice the lack of GI in the far panorama in the left screenshot.
Image labels: SSGI off; SSGI on
40. Optimizations for consoles: Xbox 360 / PS3
3D texture look-up with trilinear filtering
Radiance volume is 32 bpp for all three SH textures
Xbox 360, ~3.5 ms per frame
  Vertex texture fetching for RSM injection
  Work-around to resolve into particular slice of 3D texture
PlayStation 3, ~3.4 ms per frame
  Emulate signed blending in the shader
  R2VB for RSM injection (using memory remapping)
  Render to unwrapped 2D RT then remap as 3D texture
Moreover, this technique works well on consoles. We need 6 volumes of 32x32x32 for GI and one RSM. With reuse of RT memory, this leads to less than 1 MB of video memory. On Xbox 360 it's possible to resolve part of an EDRAM surface into a particular slice of a tiled 3D texture, but there is a bug in the API. You can find the work-around for this bug in the paper. Also, on PS3 it's possible to remap the same RT as an MSAA downscaled RT, which facilitates the rendering pass a lot. It takes less than 3.5 ms per frame on both consoles for in-game scenes, including RSM generation, which makes this technology useful for game production on consoles. You can find more details about console optimizations in the paper.
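The memory figure can be checked with a short worked calculation in Python. It assumes that each of the six 32x32x32 volumes uses 32 bits per texel, as the slide states for the SH textures; the RSM is not included because its size is not given in this transcript. The function name is hypothetical.

```python
def volume_bytes(resolution=32, bits_per_texel=32):
    """Size in bytes of one cubic volume texture (assumed uncompressed)."""
    return resolution ** 3 * bits_per_texel // 8

def total_volume_kib(volume_count=6):
    """Combined size of all volumes in KiB."""
    return volume_count * volume_bytes() / 1024
```

Under these assumptions one volume takes 131072 bytes (128 KiB) and six take 768 KiB, which leaves room under the quoted 1 MB for the RSM when render target memory is reused.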
41. Future work
Better radiance approximation…
Participating media rendering
Occlusion for indirect lighting
Multiple bounces
Improve quality
Improved propagation scheme
Better angular approximation
Adaptive grids
Support for arbitrary types of light sources
42. References
[DS07] Dachsbacher, C., Stamminger, M. Splatting Indirect Illumination.
[Evans06] Evans, A. Fast Approximations for Global Illumination on Dynamic Scenes.
[GSHG97] Greger, G., Shirley, P., Hubbard, P., Greenberg, D. The Irradiance Volume.
[Isidoro05] Isidoro, J. Filtering Cubemaps: Angular Extent Filtering and Edge Seam Fixup Methods.
[Kajalin09] Kajalin, V. Screen-space ambient occlusion, Shader X7.
[Keller97] Keller, A. Instant radiosity.
[Mittring07] Mittring, M. Finding Next Gen – CryEngine 2.
[Mittring09] Mittring, M. A bit more Deferred – CryEngine3.
[NSW09] Nichols, G., Shopf, J., Wyman, C. Hierarchical Image-Space Radiosity for Interactive Global Illumination.
[NW09] Nichols, G., Wyman, C. Multiresolution Splatting for Indirect Illumination.
[Oat05] Oat, C. 2006. Irradiance Volumes for Real-Time Rendering, ShaderX 5.
[RGS09] Ritschel, T., Grosch, T., Seidel, H.-P. Approximating Dynamic Global Illumination in Image Space.
[SD02] Stamminger, M., Drettakis, G. Perspective shadow maps.
[SIMP06] Segovia, B., Iehl, J. C., Mitanchey, R., Peroche, B. Non-interleaved Deferred Shading of Interleaved Sample Patterns.
[Tatarchuk04] Tatarchuk, N. Irradiance Volumes for Games.
[WFABDG05] Walter, B., Fernandez, S., Arbree, A., Bala, K., Donikian, M., Greenberg, D. Lightcuts: A Scalable Approach to Illumination.
More details in the paper at
You can find the paper at the AMD portal in the developers section and at
43. Acknowledgment
Michael Endres, Felix Dodd, Marco Siegel, Frank Meinl, Alexandra Cicorschi, Helder Pinto, Efgeni Bischoff and other artists and designers at Crytek, for the created scenes
Martin Mittring, Vladimir Kajalin, Tiago Sousa, Ury Zhilinsky, Mark Atkinson, Evgeny Adamenkov and the whole Crytek R&D team
Special thanks to Carsten Dachsbacher and Natalia Tatarchuk
44. Real-time rendering has come to a new era: the era of programmable rendering
We're looking for more professionals
Live demo
45. Thank you for your attention!
Questions?
Live demo