LOD Case Study & Application

1 LOD Case Study & Application
Robert Huebner Nihilistic Software

2 Speaker Bio President and Director of Technology for Nihilistic Software Currently working on “Starcraft: Ghost” for Blizzard Entertainment Previous credits include Vampire: The Masquerade, Jedi Knight: Dark Forces 2, Descent International Game Developers Association (IGDA) Board Member Game Developers Conference (GDC) Advisory Board

3 Purpose of Talk Review some of the topics and ideas presented earlier in the course Try to explain what worked for us, and what didn’t This talk is a “case study in progress” for our current Gamecube and XBOX work Still tweaking and changing some LOD schemes

4 Starcraft: Ghost (needs LOD too!)

5 Goal of LOD Back on Pre-3D-hardware PCs, we would spend a LOT of CPU to avoid drawing a few triangles The cost of rendering was much higher We were willing to spend significant CPU to eliminate a single triangle Systems like ROAM, view-dependent LOD Current hardware renders fast, so we only spend CPU if we can discard a lot of triangles Or if it saves us state changes, texture fetches, memory bandwidth, or other costly processing

6 General Block Diagram [Figure: block diagram with RAM, CPU, GPU, Frame buffer, FIFO, Vertex Unit, Texture Mem, Pixel Unit]

7 Data Flow Management Managing data flow and bandwidth is an important performance metric Each platform has different architectures So our choice of LOD differs for each platform Each main data path can utilize different LOD techniques to increase throughput We try to do this without wasting CPU or memory resources, which are also scarce

8 Where Do We Use LOD? [Figure: the same block diagram with RAM, CPU, GPU, Framebuffer, FIFO, Vertex Unit, Texture Mem, Pixel Unit]

9 Classes of Game LOD The design of most console systems is dominated by three data paths: The RAM->GPU path and GPU throughput is managed with geometric LOD The GPU->Framebuffer path is managed via shader LOD The Texture->GPU path is managed with MIP-mapping and shader LOD

10 Games Vs. Research The biggest problems we run into when adapting academic LOD systems to game use are: Dealing with additional properties of meshes Vertex normals, texture, UV coordinates, etc. Avoiding the need for general-purpose processing at the vertex level Maintaining data in a format that our hardware can process directly

11 Runtime Selection In our engine, all LOD processing for a given object is driven by a single value The LOD value is stored both as a float (0.0 to 1.0) and as a discrete BYTE (1..X) Each sub-system that wants to do LOD can use either version of the LOD metric to control behavior

12 Runtime Selection The LOD metric is stored for each object or “sector” (world section) Based on many factors (highest to lowest weight) Estimated screen space (size / distance) Overall performance or estimated triangle counts for scene (scene metric) Current player control mode (interact or cutscene, combat or stealth) “Importance” of the object (active AI vs. inactive AI) Viewing angle for terrain blocks
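
As a rough illustration of the single shared LOD value described on the two Runtime Selection slides, the sketch below combines a few of the listed factors into a float and quantizes it into a discrete level. The weights, the field names, and the convention that level 1 is the most detailed are all assumptions made for this example; the talk does not give the actual formula.

```python
from dataclasses import dataclass


@dataclass
class LodInputs:
    radius: float      # bounding-sphere radius of the object (assumed input)
    distance: float    # distance from the camera to the object
    scene_load: float  # 0.0 (light scene) .. 1.0 (scene over budget)
    importance: float  # 0.0 (e.g. inactive AI) .. 1.0 (e.g. active AI)


def lod_factor(inp):
    """Return a float LOD value: 0.0 (lowest detail) .. 1.0 (full detail)."""
    # Screen-space estimate: size / distance, clamped to [0, 1].
    screen = min(1.0, inp.radius / max(inp.distance, 1e-6))
    # Hypothetical weights, ordered from highest to lowest influence.
    score = 0.6 * screen + 0.25 * (1.0 - inp.scene_load) + 0.15 * inp.importance
    return max(0.0, min(1.0, score))


def lod_level(factor, num_levels):
    """Quantize the float value to a discrete level 1..num_levels (1 = most detailed)."""
    level = 1 + int((1.0 - factor) * num_levels)
    return min(num_levels, level)
```

Each sub-system can then read whichever form suits it: the float for smooth effects such as fades, the discrete level for picking among precompiled variants.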

13 Geometric LOD Geometric LOD is the most interesting & complex topic for games There are three main goals we try to achieve with geometric LOD: Send less data to the GPU to avoid exceeding its throughput Utilize less bus bandwidth moving data into the graphics unit Try to achieve a constant average triangle size to balance load between vertex and pixel units

14 Compiled Models Most game engines are constructed to load “compiled” models Vertex data is adjusted to match native format Triangles are batched to minimize state changes and fit within hardware limits Optimum strips are constructed DisplayLists/Pushbuffers are compiled Compiled models are highly platform-specific

15 Basic LOD Choices Based on platform specifics, we select a simple half-edge collapse operation as the basis of our LOD Minimizes memory use, vertex data remains unchanged Minimizes dynamically changing vertex data, which minimizes bandwidth & FIFO space Allows us to address problems with property discontinuities

16 Calculating LOD We perform all our LOD computation off-line during model compilation We offer the artists a choice of LOD metric to use when computing automatic LOD levels We chose an LOD scheme that is based on half-edge collapse operations only Less memory, more static data set The LOD is constructed based on edge score Each edge in the model is given a score based on its length, curvature, or other factors Vertices are also given scores to control which endpoint is preserved during the edge collapse

17 Calculating LOD We begin by building an augmented “collapse vertex” structure for the model Links to neighbor verts (edges) Links to associated faces Link and score of “least cost” edge Identification of “border” or “seam” verts Links to “paired” verts Links to the actual “render” vertices This process happens after vertices are split due to texture/normal/UV changes This means one collapse vertex can be linked to multiple “export” vertices

18 Calculating LOD We add game-specific restrictions to LOD
Either adjust the vertex score, exempt it entirely, or link its removal to that of another vertex Texture or UV mapping “seams” due to composited textures Vertex normal discontinuities (hard edge) Unpaired edges Artist influence (blind vertex data in Maya) We also use domain-specific knowledge to adjust scoring algorithm Terrain blocks use z (height) differential as main score factor Shadow/collision LOD ignores texture/UV seams

19 Calculating LOD Once we have a full set of edge scores, we select the least cost edge and remove its least cost vertex Half-edge collapse to the higher-cost endpoint Record the operation in fields in our underlying data Remove degenerate triangles Re-compute all edge costs in neighboring triangles Repeat until only non-collapsible edges remain

20 Note on quality Our reduction and scoring system is simple, but accuracy suffers Because of this, we have found that the last 10% or so of the collapse operations are judged by artists as being unsatisfactory We allow the export process to specify some control over the quality Limit on the maximum cost collapse that will be executed (default excludes about 10% of operations) Object-specific tweaks to the computed LOD factor

21 Calculating LOD The results of this operation are two new data fields in our renderable vertex structure The “collapseOrder” field gives the ordering of the collapse operation The “collapseTo” field is the destination vertex for the edge collapse operation that removes this vertex from the mesh Using these fields, we can export the LOD in various ways in the final compilation Since the LOD metrics are all export-side, we can adopt improvements periodically without affecting run-time data Just re-export to get benefits of better reduction
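
The off-line loop from the Calculating LOD slides can be sketched roughly as follows. This is a simplified illustration, not the exporter described in the talk: the edge score is edge length only, all edge costs are recomputed on every pass instead of only in neighboring triangles, and `vertex_score` and `locked` (standing in for seam or border vertices that are exempt from removal) are hypothetical inputs.

```python
import math


def build_collapse_data(positions, triangles, vertex_score, locked=frozenset()):
    """Greedy half-edge collapse. Returns (collapse_order, collapse_to).

    collapse_order[v] is the step at which v is removed (None if it survives);
    collapse_to[v] is the surviving endpoint that v was merged into.
    """
    tris = [tuple(t) for t in triangles]
    collapse_order = [None] * len(positions)
    collapse_to = [None] * len(positions)
    step = 0
    while True:
        # Gather the edges of the current (already reduced) mesh.
        edges = set()
        for a, b, c in tris:
            edges.update((frozenset((a, b)), frozenset((b, c)), frozenset((c, a))))
        best = None
        for edge in edges:
            u, v = sorted(edge)
            # Remove the lower-scored endpoint and keep the higher-scored one.
            remove, keep = (u, v) if vertex_score[u] <= vertex_score[v] else (v, u)
            if remove in locked:
                remove, keep = keep, remove
            if remove in locked:
                continue  # both endpoints are exempt: non-collapsible edge
            # Length-only cost; the extra fields make tie-breaking deterministic.
            key = (math.dist(positions[u], positions[v]), remove, keep)
            if best is None or key < best:
                best = key
        if best is None:
            break  # only non-collapsible edges (or no edges) remain
        _, remove, keep = best
        collapse_order[remove] = step
        collapse_to[remove] = keep
        step += 1
        # Re-point faces at the surviving vertex and drop degenerate triangles.
        tris = [tuple(keep if i == remove else i for i in t) for t in tris]
        tris = [t for t in tris if len(set(t)) == 3]
    return collapse_order, collapse_to
```

Because vertex positions are never modified, only these two per-vertex fields need to be stored alongside the existing render data.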

22 Discrete LOD Discrete LOD is still the workhorse of game mesh LOD
Each level can undergo heavy pre-processing for strip-ordering or displaylist creation Artists can hand-tune the reduction for visual accuracy Can optionally replace both vertices and index lists, or just indices to save memory We represent discrete LOD by loading multiple sets of face index lists, or separate “index buffers” Vertex data is unchanged

23 Exporting Discrete LOD
We can use our computed data to export any number of discrete LOD steps Pick a desired number of vertices for the LOD level Calculate how many collapse operations will reach this level Build an indexed ordering for the mesh For any vertex with a “collapseOrder” value lower than the # of operations, replace its index with its “collapseTo” index Repeat until a vertex is reached that has a higher collapseOrder field Process each index ordering for strips & cache coherency, create packets, etc.
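
A minimal sketch of the export step above, assuming the per-vertex fields are available as two parallel lists named `collapse_order` and `collapse_to`, with `None` marking vertices that are never removed (a convention chosen for this example). Since each collapse removes exactly one vertex, the number of operations for a target vertex count is the original count minus the target.

```python
def export_discrete_lod(indices, collapse_order, collapse_to, num_collapses):
    """Return a triangle index list with the first num_collapses collapses applied."""

    def resolve(v):
        # Follow collapseTo links while the vertex was removed by one of the
        # first num_collapses operations.
        while collapse_order[v] is not None and collapse_order[v] < num_collapses:
            v = collapse_to[v]
        return v

    out = []
    for i in range(0, len(indices), 3):
        tri = [resolve(v) for v in indices[i:i + 3]]
        if len(set(tri)) == 3:  # skip triangles that became degenerate
            out.extend(tri)
    return out
```

The resulting index list would then go through the usual strip and cache-coherency processing for that level.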

24 Discrete Blended LOD To minimize “popping” that occurs during the LOD switch, we can use image-space blending When an object needs to change between discrete LOD levels, it is queued for blending During blending, the object is actually rendered twice, at both LOD levels, and the alpha values are cross-faded In practice, we find this is useful for larger objects or terrain blocks, but not useful for typical models

25 Continuous LOD Continuous LOD can be an effective extension to discrete-LOD for games Reductions with greater granularity can avoid visible “popping” It can also save memory compared to storing a high number of discrete levels Our continuous implementation is based mainly on half-edge collapse This is the best way to keep our data static

26 CLOD Implementation To implement run-time CLOD, what we’re effectively doing is moving our off-line creation of discrete LOD index lists to the run-time engine To save memory, we re-order vertices in order of their “collapseOrder” field We export a separate parallel array to contain the “collapseTo” index for each vertex

27 CLOD Runtime At run-time, we select a desired number of vertices and repeat the recursive collapse process Each index replaced with its collapseTo until a value less than the desired size is reached For efficiency, we re-order our original index list in reverse-collapse order This allows us to stop when the first degenerate triangle is detected during the collapse process The result is a new indexing of the mesh with the precise number of vertices requested Result is cached in our model instance data
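
The run-time re-indexing can be sketched as below. The sketch assumes a particular sort direction that the slides do not spell out: vertices are ordered so that the vertex removed first has the highest index (so `collapse_to[v] < v` for every removable vertex), and triangles are ordered so that the triangle that degenerates first comes last.

```python
def clod_indices(indices, collapse_to, num_verts):
    """Re-index a triangle list so it uses only vertices 0..num_verts-1.

    Assumes num_verts >= 1 and the vertex and triangle orderings described above.
    """
    out = []
    for i in range(0, len(indices), 3):
        tri = []
        for v in indices[i:i + 3]:
            # Follow collapseTo until the index falls inside the kept range.
            while v >= num_verts:
                v = collapse_to[v]
            tri.append(v)
        if len(set(tri)) < 3:
            break  # every later triangle is degenerate as well
        out.extend(tri)
    return out
```

With this layout the kept vertices are always a prefix of the vertex array, which is why the vertex data itself can stay static.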

28 CLOD Advantages This method maps moderately well to console needs
The vertex data remains static and indexable Re-indexing can be cached over multiple frames to amortize costs Minimal storage costs above cost of storing basic model data 2 bytes per vert fixed-cost Can actually be more memory-efficient than discrete LOD, but not by a lot

29 CLOD Disadvantages The biggest challenge with CLOD is to optimize the index ordering Normally we perform intense, off-line strip generation to achieve this With an index list that could change every frame, we aren’t able to spend time generating strips We can still “compile” displaylists, etc. but at some additional cost Skip strips and similar techniques of partial-strip buffering can help address these concerns Exploit the fact that most of the model remains unchanged after each step

30 Non-Geometric LOD

31 Vertex Shader LOD Vertex “shader” refers to the processing path required to setup each vertex in the scene Newer PC and console hardware allow for extremely complex vertex operations including transformation, blending, and lighting The throughput of the GPU in verts/sec varies by orders of magnitude depending on the processing required Un-textured, un-lit = 30M V/s Dual-texture, 4 Lights = 9M V/s

32 Lighting LOD One of the most costly parts of vertex processing is lighting calculation Generally the cost increases linearly with the number of active lights. All games do basic operations like selecting the X brightest nearby lights for each mesh The number of lights X can be increased/decreased based on LOD metrics
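
One possible shape for LOD-driven light selection is sketched here. The brightness estimate (intensity with a simple distance falloff), the `(position, intensity)` light representation, and the mapping from the float LOD value to a light count are assumptions for illustration only.

```python
def select_lights(lights, mesh_pos, lod_factor, max_lights=4):
    """Pick the brightest lights at mesh_pos; lower LOD values allow fewer lights.

    lights is a list of (position, intensity) tuples.
    """
    count = int(lod_factor * max_lights)

    def brightness(light):
        pos, intensity = light
        d2 = sum((a - b) ** 2 for a, b in zip(pos, mesh_pos))
        return intensity / (1.0 + d2)  # simple distance falloff

    return sorted(lights, key=brightness, reverse=True)[:count]
```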

33 Pre-lighting Because lighting is so expensive, a common optimization is to pre-calculate lights when possible A non-moving (or rarely-moving object) can have the lighting contribution from all nearby, non-moving lights calculated offline & stored in per-vertex color channel As long as certain conditions hold, the object is rendered with a 0-light path If additional moving lights come into range, the hardware allows us to add dynamic and pre-calculated colors in hardware If the object moves, it can revert to real-time lighting

34 Lighting LOD At lower LOD levels, we can use simpler lighting equations Use a static envmap (spherical or cubic) and normal-based texture projection to approximate diffuse lighting Switch to purely ambient lighting or directional lighting at low LOD At lower LOD levels, shadow generation is reduced or disabled Remove self-shadowing, remove accurate projected shadow volumes or textures

35 Projected Lighting A common technique in current games is to use texture projection to simulate complex lighting scenarios Generally this requires an additional rendering pass on affected meshes At lower LOD, we attempt to replace a projected light with a similar point or spotlight Match color & size to approximate the texture effect We also begin to exclude smaller objects from projection Light will affect walls, but not characters

36 Vertex Shader LOD After lighting, the next most costly operation is skinning or blending the vertex Can be performed by fixed-function matrix-palette blending, or programmable vertex shader Our goal with LOD is to use the existing model data but to simplify the vertex processing math We create N versions of all active game vertex processing functions All accept the same input data Selection is driven at run-time by the shared “LOD Factor” Essentially it's discrete vertex LOD

37 Model Coordinate System
We store vertex position and normal data in “model space” This enables us to select between several types of vertex processing when needed If we ignore all bone associations and render with a single transform, we get the “at-rest” model pose If we store bone influences in sorted order, we can blend only against the first bone to get less-accurate skinning
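
To illustrate why sorted bone influences make cheaper skinning paths possible, here is a small CPU-side sketch. The 3x4 matrix layout, the function names, and the choice to renormalize the truncated weights are assumptions; on real hardware this logic would live in the vertex processing variants themselves.

```python
def transform(m, p):
    """Apply a 3x4 affine matrix (three rows of [r0, r1, r2, t]) to point p."""
    return tuple(row[0] * p[0] + row[1] * p[1] + row[2] * p[2] + row[3] for row in m)


def skin_vertex(pos, influences, bone_matrices, max_bones):
    """Blend a model-space position against at most max_bones influences.

    influences is a list of (bone_index, weight) sorted by descending weight.
    max_bones == 0 ignores all bones and returns the at-rest position.
    """
    used = influences[:max_bones]
    if not used:
        return tuple(pos)
    total = sum(w for _, w in used)  # renormalize the truncated weights
    out = [0.0, 0.0, 0.0]
    for bone, w in used:
        p = transform(bone_matrices[bone], pos)
        for k in range(3):
            out[k] += p[k] * (w / total)
    return tuple(out)
```

The same vertex data feeds all three paths (full blend, first bone only, at-rest), so switching between them needs no extra storage.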

38 Skeleton LOD The number of bones in a model skeleton can also affect performance Our vertex shader offers a fixed number of matrices that can be loaded into hardware registers simultaneously This limits the number of faces we can render before re-loading these registers (batch size) We can replace a vertex->bone binding with that bone’s parent to eliminate “leaf” bones Their geometry will behave as if the removed bones are fused in their at-rest pose This needs to be done off-line because it affects how we split the model into render groups
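
The leaf-bone replacement could look roughly like this as an off-line pass. The data layout (a parent index per bone, one bone per vertex binding) is a simplification assumed for the example; a real exporter would also handle multiple weighted influences per vertex.

```python
def remove_leaf_bones(parents, bindings):
    """Rebind vertices attached to leaf bones to the parent of that bone.

    parents[b] is the parent of bone b (-1 for the root);
    bindings[v] is the bone that vertex v follows.
    Returns (new_bindings, removed_bones).
    """
    has_child = {p for p in parents if p >= 0}
    # A leaf is a non-root bone that no other bone names as its parent.
    leaves = {b for b in range(len(parents)) if b not in has_child and parents[b] >= 0}
    new_bindings = [parents[b] if b in leaves else b for b in bindings]
    return new_bindings, leaves
```

Running the pass repeatedly would strip one layer of the skeleton at a time, giving progressively coarser skeleton levels.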

39 Other Vertex LOD At lower LOD, we replace accurate reflected-normal vectors with camera-space normal vectors Requires less CPU assistance on some platforms We can often reduce the accuracy of skinning/blending for normal vectors before we do the same for position vectors Effects of inaccurate normals are far less obvious

40 Pixel Shader LOD Pixel shader LOD simply means having multiple implementations of each raster-level visual effect Alternate versions would achieve a similar visual result with fewer render passes, texture stages, or texture fetches Disabling multi-pass techniques is particularly effective because it benefits geometric LOD as well Reducing texture stages or fetches increases pixel fill-rate Generally implemented simply as multiple code paths selectable according to LOD metrics Light mapped walls can revert to vertex-lit Bumpmaps, Envmaps are blended out

41 Imposters The most extreme form of geometric LOD is replacing a complex object with an imposter The imposter can be a flat, textured quad Or it can be a simple geometric shell The goal is to approximate the shape & color of the original object at great distances Some game objects are always rendered as imposters Particles, explosions, bullets, foliage

42 Billboard Imposter The billboard imposter replaces a complex shape with a flat textured quad Can be rotated to face the camera in 1, 2 or 3 axes, depending on object symmetry The texture can contain multiple frames to represent different angles or animation frames The engine can blend between frames to improve fidelity, or use 3D volume textures to perform hardware blending Typically billboard imposters use masked (1-bit alpha) texture images so the actual quad outline is not visible “Z sprites” can provide imposters that z-buffer more accurately, particularly useful in clusters of objects
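
For a billboard with several pre-rendered angles, frame selection and the blend weight between neighboring frames might be computed as in this sketch. It assumes frames are spaced evenly around a single vertical axis; the function name and return layout are made up for the example.

```python
import math


def billboard_frames(view_angle, num_frames):
    """Return (frame_a, frame_b, blend) for a view angle given in radians.

    blend is the weight of frame_b; frame_a gets 1.0 - blend.
    """
    step = 2.0 * math.pi / num_frames
    x = (view_angle % (2.0 * math.pi)) / step
    frame_a = int(x) % num_frames
    frame_b = (frame_a + 1) % num_frames
    return frame_a, frame_b, x - int(x)
```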

43 Dynamic Texture Imposter
Render-to-texture is a common & reasonably efficient console pipeline Non-dynamic texture imposters use valuable texture memory Gives better simulation of animation, lighting, and movement of the replaced objects We allocate a pool of textures for dynamic imposters at startup and re-use them when necessary A large crowd scene might re-use each imposter many times
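
The pooled allocation described above can be sketched with a small least-recently-used cache. The LRU eviction policy and the class and method names are assumptions; the slides only say that a fixed pool is allocated at startup and re-used.

```python
from collections import OrderedDict


class ImposterPool:
    """Fixed pool of render-target slots, reused least-recently-used first."""

    def __init__(self, size):
        self.free = list(range(size))
        self.in_use = OrderedDict()  # object id -> texture slot

    def acquire(self, obj_id):
        """Return (slot, needs_render) for obj_id."""
        if obj_id in self.in_use:
            self.in_use.move_to_end(obj_id)  # mark as recently used
            return self.in_use[obj_id], False
        if self.free:
            slot = self.free.pop()
        else:
            _, slot = self.in_use.popitem(last=False)  # evict the oldest entry
        self.in_use[obj_id] = slot
        return slot, True
```

`needs_render` tells the caller whether the imposter texture must be redrawn or whether the cached image from an earlier frame can be shown again.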

44 Geometric Imposter A Geometric imposter uses a rigid 3D model in place of a complex articulated 3D model The “rigid mesh” vertex shader is usually several times faster than skinned/blended The imposter can use simpler shaders, fewer textures, and larger render batches Geometric imposters look better when viewed from multiple angles (object rotating or camera panning) Can take up less memory than multi-frame texture imposters, and can render nearly as quickly

45 Terrain LOD Terrain LOD is often handled specially
Mainly because the terrain is very large compared to the viewer (player) Our terrain is not stored as a heightfield, so we can do more arbitrary shapes We break the terrain into separate blocks according to a 2D grid overlay

46 Terrain LOD Each block has discrete LOD levels pre-computed and compiled into display lists At run-time, an LOD factor is computed for each block Based on distance, viewing angle, viewer height Vertices that lie along the boundaries between blocks are not subject to removal This avoids opening gaps and allows each block to LOD independently Image-space blending can help hide switches

47 Image Processing Techniques
Z-Fade Gameplay elements that are only of player interest at close range can be alpha blended out at increasing z-distance Powerups, small detail models, ground cover foliage, atmosphere objects, etc. Depth of Field effects If the game utilizes a depth-of-field effect to blur distant objects, the game can use far more aggressive distance LOD schemes
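
A Z-fade can be as simple as a linear alpha ramp over a distance range, as in this sketch. The parameter names and the linear falloff are assumptions; any monotonic curve would serve the same purpose.

```python
def z_fade_alpha(z, fade_start, fade_end):
    """Return 1.0 up to fade_start, 0.0 beyond fade_end, linear in between."""
    if z <= fade_start:
        return 1.0
    if z >= fade_end:
        return 0.0
    return (fade_end - z) / (fade_end - fade_start)
```

Objects whose alpha reaches 0.0 can be skipped entirely rather than drawn fully transparent.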

48 Non-Visual LOD Creating a special LOD geometry for shadow projection
Could use more aggressive methods beyond half-edge collapse to generate silhouettes Because shadows don’t have texture/lighting concerns, we can be more aggressive in choosing algorithms Automatic Collision geometry Currently we create collision geometry using simple volume shapes, or convex hull algorithms More demanding games could use some of the volume-based LOD reductions to create better-fit collision geometry

49 Future Directions Subdivision & curved surfaces
If future platforms increase RAM sizes and are fast enough to render 1-tri-per-pixel, it's unclear if subdiv is needed However, artists are adopting this rapidly for cutscene work, so data-sharing is an appealing benefit Subdivision with hardware support that was effectively “free” would definitely find an audience Otherwise, we expect that next-generation projects will continue to encode more data into textures and use programmable shaders to simulate details

50 Future Directions Vertex processing hardware is becoming more general-purpose Will allow more meaningful per-vertex processing for LOD schemes Possibly more emphasis on view-dependent schemes

51 References Garland and Heckbert, "Surface Simplification Using Quadric Error Metrics", SIGGRAPH 97. Bischoff, "Towards Hardware Implementation of Loop Subdivision", Proceedings 2000 SIGGRAPH/EUROGRAPHICS Workshop on Graphics Hardware, August 2000. Brickhill, "Practical Implementation Techniques for Multi-Resolution Subdivision Surfaces", GDC Conference Proceeding, 2001.

