Ray Tracing Animated Scenes using Coherent Grid Traversal I Wald, T Ize, A Kensler, A Knoll, S Parker SCI Institute, University of Utah.

Slides:



Advertisements
Similar presentations
Sven Woop Computer Graphics Lab Saarland University
Advertisements

Christian Lauterbach COMP 770, 2/16/2009. Overview  Acceleration structures  Spatial hierarchies  Object hierarchies  Interactive Ray Tracing techniques.
Physically Based Real-time Ray Tracing Ryan Overbeck.
Collision Detection CSCE /60 What is Collision Detection?  Given two geometric objects, determine if they overlap.  Typically, at least one of.
Ray Tracing Ray Tracing 1 Basic algorithm Overview of pbrt Ray-surface intersection (triangles, …) Ray Tracing 2 Brute force: Acceleration data structures.
Computer graphics & visualization Collisions. computer graphics & visualization Simulation and Animation – SS07 Jens Krüger – Computer Graphics and Visualization.
Ray Tracing CMSC 635. Basic idea How many intersections?  Pixels  ~10 3 to ~10 7  Rays per Pixel  1 to ~10  Primitives  ~10 to ~10 7  Every ray.
A Coherent Grid Traversal Algorithm for Volume Rendering Ioannis Makris Supervisors: Philipp Slusallek*, Céline Loscos *Computer Graphics Lab, Universität.
Two Methods for Fast Ray-Cast Ambient Occlusion Samuli Laine and Tero Karras NVIDIA Research.
CSE 381 – Advanced Game Programming Scene Management
Two-Level Grids for Ray Tracing on GPUs
RT06 conferenceVlastimil Havran On the Fast Construction of Spatial Hierarchies for Ray Tracing Vlastimil Havran 1,2 Robert Herzog 1 Hans-Peter Seidel.
Experiences with Streaming Construction of SAH KD Trees Stefan Popov, Johannes Günther, Hans-Peter Seidel, Philipp Slusallek.
Afrigraph 2004 Interactive Ray-Tracing of Free-Form Surfaces Carsten Benthin Ingo Wald Philipp Slusallek Computer Graphics Lab Saarland University, Germany.
Fast and Accurate Soft Shadows using a Real-Time Beam Tracer Ravi Ramamoorthi Columbia Vision and Graphics Center Columbia University
Ray Tracing Acceleration Structures Solomon Boulos 4/16/2004.
Tomas Mőller © 2000 Speeding up your game The scene graph Culling techniques Level-of-detail rendering (LODs) Collision detection Resources and pointers.
RT 08 Efficient Clustered BVH Update Algorithm for Highly-Dynamic Models Symposium on Interactive Ray Tracing 2008 Los Angeles, California Kirill Garanzha.
Specialized Acceleration Structures for Ray-Tracing Warren Hunt Bill Mark.
Overview Class #10 (Feb 18) Assignment #2 (due in two weeks) Deformable collision detection Some simulation on graphics hardware... –Thursday: Cem Cebenoyan.
Ray Tracing Dynamic Scenes using Selective Restructuring Sung-eui Yoon Sean Curtis Dinesh Manocha Univ. of North Carolina at Chapel Hill Lawrence Livermore.
Apex Point Map for Constant-Time Bounding Plane Approximation Samuli Laine Tero Karras NVIDIA.
1 Advanced Scene Management System. 2 A tree-based or graph-based representation is good for 3D data management A tree-based or graph-based representation.
Interactive k-D Tree GPU Raytracing Daniel Reiter Horn, Jeremy Sugerman, Mike Houston and Pat Hanrahan.
RT08, August ‘08 Large Ray Packets for Real-time Whitted Ray Tracing Ryan Overbeck Columbia University Ravi Ramamoorthi Columbia University William R.
Collision Detection David Johnson Cs6360 – Virtual Reality.
Week 1 - Friday.  What did we talk about last time?  C#  SharpDX.
Computer Graphics 2 Lecture x: Acceleration Techniques for Ray-Tracing Benjamin Mora 1 University of Wales Swansea Dr. Benjamin Mora.
Interactive Ray Tracing: From bad joke to old news David Luebke University of Virginia.
Spatial Data Structures Jason Goffeney, 4/26/2006 from Real Time Rendering.
10/09/2001CS 638, Fall 2001 Today Spatial Data Structures –Why care? –Octrees/Quadtrees –Kd-trees.
1 Speeding Up Ray Tracing Images from Virtual Light Field Project ©Slides Anthony Steed 1999 & Mel Slater 2004.
Stefan PopovHigh Performance GPU Ray Tracing Real-time Ray Tracing on GPU with BVH-based Packet Traversal Stefan Popov, Johannes Günther, Hans- Peter Seidel,
On a Few Ray Tracing like Algorithms and Structures. -Ravi Prakash Kammaje -Swansea University.
Institute of C omputer G raphics, TU Braunschweig Hybrid Scene Structuring with Application to Ray Tracing 24/02/1999 Gordon Müller, Dieter Fellner 1 Hybrid.
MIT EECS 6.837, Durand and Cutler Acceleration Data Structures for Ray Tracing.
Saarland University, Germany B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek.
Click to edit Master title style HCCMeshes: Hierarchical-Culling oriented Compact Meshes Tae-Joon Kim 1, Yongyoung Byun 1, Yongjin Kim 2, Bochang Moon.
Understanding the Efficiency of Ray Traversal on GPUs Timo Aila Samuli Laine NVIDIA Research.
Interactive Rendering With Coherent Ray Tracing Eurogaphics 2001 Wald, Slusallek, Benthin, Wagner Comp 238, UNC-CH, September 10, 2001 Joshua Stough.
Fast BVH Construction on GPUs (Eurographics 2009) Park, Soonchan KAIST (Korea Advanced Institute of Science and Technology)
Supporting Animation and Interaction Ingo Wald SCI Institute, University of Utah
Hierarchical Penumbra Casting Samuli Laine Timo Aila Helsinki University of Technology Hybrid Graphics, Ltd.
CSCE 552 Spring D Models By Jijun Tang. Triangles Fundamental primitive of pipelines  Everything else constructed from them  (except lines and.
Memory Management and Parallelization Paul Arthur Navrátil The University of Texas at Austin.
Ray Tracing Animated Scenes using Motion Decomposition Johannes Günther, Heiko Friedrich, Ingo Wald, Hans-Peter Seidel, and Philipp Slusallek.
Interactive Ray Tracing of Dynamic Scenes Tomáš DAVIDOVIČ Czech Technical University.
Coherent Hierarchical Culling: Hardware Occlusion Queries Made Useful Jiri Bittner 1, Michael Wimmer 1, Harald Piringer 2, Werner Purgathofer 1 1 Vienna.
Ray Tracing II. HW1 Part A due October 10 Camera module Object module –Read from a file –Sphere and Light only Ray tracer module: –No shading. No reflection.
Graphics Interface 2009 The-Kiet Lu Kok-Lim Low Jianmin Zheng 1.
Compact, Fast and Robust Grids for Ray Tracing Ares Lagae & Philip Dutré 19 th Eurographics Symposium on Rendering EGSR 2008Wednesday, June 25th.
Advanced topics Advanced Multimedia Technology: Computer Graphics Yung-Yu Chuang 2006/01/04 with slides by Brian Curless, Zoran Popovic, Mario Costa Sousa.
1 Saarland University, Germany 2 DFKI Saarbrücken, Germany.
Compact, Fast and Robust Grids for Ray Tracing
Ray Tracing Optimizations
Ray Tracing Acceleration (5). Ray Tracing Acceleration Techniques Too Slow! Uniform grids Spatial hierarchies K-D Octtree BSP Hierarchical grids Hierarchical.
1 Advanced Scene Management. 2 This is a game-type-oriented issue Bounding Volume Hierarchies (BVHs) Binary space partition trees (BSP Trees) “Quake”
IIIT, Hyderabad Ray Tracing Dynamic Scenes on the GPU Sashidhar Guntury Centre for Visual Information Technology IIIT, Hyderabad. India.
CHC ++: Coherent Hierarchical Culling Revisited Oliver Mattausch, Jiří Bittner, Michael Wimmer Institute of Computer Graphics and Algorithms Vienna University.
1 The Method of Precomputing Triangle Clusters for Quick BVH Builder and Accelerated Ray Tracing Kirill Garanzha Department of Software for Computers Bauman.
SHADOW CASTER CULLING FOR EFFICIENT SHADOW MAPPING JIŘÍ BITTNER 1 OLIVER MATTAUSCH 2 ARI SILVENNOINEN 3 MICHAEL WIMMER 2 1 CZECH TECHNICAL UNIVERSITY IN.
Bounding Volume Hierarchies and Spatial Partitioning
STBVH: A Spatial-Temporal BVH for Efficient Multi-Segment Motion Blur
Two-Level Grids for Ray Tracing on GPUs
Bounding Volume Hierarchies and Spatial Partitioning
Real-Time Ray Tracing Stefan Popov.
Christian Lauterbach GPGPU presentation 3/5/2007
Accelerated Single Ray Tracing for Wide Vector Units
Collision Detection.
GEARS: A General and Efficient Algorithm for Rendering Shadows
Presentation transcript:

Ray Tracing Animated Scenes using Coherent Grid Traversal I Wald, T Ize, A Kensler, A Knoll, S Parker SCI Institute, University of Utah

Coherent Grid Traversal A new traversal techniques for uniform grids … that makes packet/frustum traversal compatible w/ grids … thus achieves performance competitive w/ fastest kd-trees … and which allows for per-frame rebuilds (dynamic scenes)

What’s so special about grids? Since 70’ies: Lots of different RT data structures

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree Grid

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree Grid

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree Grid

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree Grid

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree Grid Kd-tree

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree Grid Kd-tree

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree Grid Kd-tree

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree Grid Kd-tree

What’s so special about grids? Since 70’ies: Lots of different RT data structures BVH Octree Grid Kd-tree  Of all these, grid is only that is not hierarchical !

What’s so special about grids? Grid is not hierarchical…  Much simpler to build (similar to 3D-rasterization, very fast) Build-times in the paper: 2.2M “Soda Hall” in 110 ms  Ideally suited for handling dynamic scenes Full rebuild every frame, no restrictions at all !

What is so special about dynamic scenes ? All of the recent advancements of RT are for kd-trees ! – Pre-2000: Tie between grids and kd-trees… – [Wald ’01]: New concept  “coherent ray tracing” (for kd-tree) Trace “packets” of coherent rays  10x faster than single rays – [Woop ’05]: First RT hardware prototype  RPU (for kd-tree) – [Reshetov ’05]: New concept  “multilevel ray tracing” (kd-tree) Trace packets using bounding frusta  another 10x faster than CRT ! But: (good) kd-trees are (too) costly to build…

Ray Tracing & Dynamic Scenes SIGGRAPH ‘05: Dynamic Scenes huge problem – Ray tracing has become very fast (MLRT: ~100fps) – If ray tracing is to ever replace rasterization, it must support dynamic scenes (games…) – But: All our fast RT algos are for kd-trees… – … and kd-trees can’t do dynamic scenes …

Ray Tracing & Dynamic Scenes SIGGRAPH ‘05: Dynamic Scenes huge problem Since then, lots of research – Lazy kd-tree construction (Razor [Stoll, Mark ‘06]) – Fast BVH and kd-tree construction (yet unpublished) – Motion decomposition [Günther et al. ‘06] – Dynamic BVHs [Wald et al. ‘06, Lauterbach et al. ’06] – Hybrid BVH/kd-trees [Woop ‘06, Havran ‘06, Wachter ‘06, …] – Coherent Grid Traversal [Wald et al. ’06]

Using grids for dynamics – Where’s the problem ? 2005: Grid too slow to traverse (vs kd-tree)… Fact: Fast RT needs “packets” & “frusta” concepts – Traverse multiple packets over same node of DS Rather simple for hierarchical data structures… – Test both children in turn for overlap w/ packet – If child overlaps: traverse it, else: skip it. (it’s as simple as that) … but not for grids

Grids and Packets – Where’s the problem ? Packets & grids: “Non-trivial task” – In which order to test the nodes ? ABCD or ABDC ? – What to do when packet diverges? 3DDDA etc break in that case… – Split diverging packet ? Quickly degenerates to single-ray traversal… – Fix by re-merging packets ? Non-trivial & costly … AB C D EG F

Coherent Grid Traversal First: Transform all rays into “canonical grid space” – i.e., [0,0,0]-[Nx,Ny,Nz]

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays”

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays”

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays” – Traverse “slice by slice” instead of “cell to cell” Pick “major traversal axis” (e.g., max component of 1 st ray)

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays” – Traverse “slice by slice” instead of “cell to cell” 1

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays” – Traverse “slice by slice” instead of “cell to cell” 12

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays” – Traverse “slice by slice” instead of “cell to cell” 123

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays” – Traverse “slice by slice” instead of “cell to cell” 1234

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays” – Traverse “slice by slice” instead of “cell to cell” – For each slice, compute frustum/slice overlap

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays” – Traverse “slice by slice” instead of “cell to cell” – For each slice, compute frustum/slice overlap

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays” – Traverse “slice by slice” instead of “cell to cell” – For each slice, compute frustum/slice overlap Float-to-int gives overlapped cell IDs

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays” – Traverse “slice by slice” instead of “cell to cell” – For each slice, compute frustum/slice overlap Float-to-int gives overlapped cell IDs Intersect all cells in given slice

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays” – Traverse “slice by slice” instead of “cell to cell” – For each slice, compute frustum/slice overlap Float-to-int gives overlapped cell IDs Intersect all cells in given slice – Loop: incrementally compute next slice’s overlap box 4 additions…

Coherent Grid Traversal Idea: Consider only frustum, not “set of rays” – Traverse “slice by slice” instead of “cell to cell” – For each slice, compute frustum/slice overlap Float-to-int gives overlapped cell IDs Intersect all cells in given slice – Loop: incrementally compute next slice’s overlap box 4 additions…

CGT features Expensive setup phase – Transform rays to canonical grid coordinate system – Determine major march direction (simple) – Compute min/max bounding planes (slopes and offsets) – Compute first and last slice to be traversed (full frustum clip) But: Very simple traversal step – Overlap box update: 4 float additions (1 SIMD instruction) – Get cell IDs: 4 float-to-int truncations (SIMD…) – Loop over overlapped cells (avg: cells per slice)

Traversal fast, but … Grid usually less efficient than kd-tree

Traversal fast, but … Grid usually less efficient than kd-tree – Cannot adapt to geometry as well  more intersections

Traversal fast, but … Grid usually less efficient than kd-tree – Cannot adapt to geometry as well  more intersections – Tris straddle many cells  re-intersection

Traversal fast, but … Grid usually less efficient than kd-tree – Cannot adapt to geometry as well  more intersections – Tris straddle many cells  re-intersection First sight: Frustum makes it worse…

Traversal fast, but … Grid usually less efficient than kd-tree – Cannot adapt to geometry as well  more intersections – Tris straddle many cells  re-intersection First sight: Frustum makes it worse… – Rays isec tris outside “their” cells

Traversal fast, but … Grid usually less efficient than kd-tree – Cannot adapt to geometry as well  more intersections – Tris straddle many cells  re-intersection First sight: Frustum makes it worse… – Rays isec tris outside “their” cells – Re-isec aggravated by width of frustum

Traversal fast, but … Grid usually less efficient than kd-tree First sight: Frustum makes it worse… But: Two easy fixes

Traversal fast, but … Grid usually less efficient than kd-tree First sight: Frustum makes it worse… But: Two easy fixes – Bad culling  SIMD Frustum culling in Packet/Tri Isec [Dmitriev et al.] Outside frustum  cull!

Traversal fast, but … Grid usually less efficient than kd-tree First sight: Frustum makes it worse… But: Two easy fixes – Bad culling  SIMD Frustum culling in Packet/Tri Isec [Dmitriev et al.] – Re-intersection: Mailboxing [Haines] Mailbox detects re-intersection

CGT efficiency Surprise: Mailboxing & Frustum culling very effective – Both standard techniques, both limited success for kd-trees – Grid & Frustum: Exactly counter weak points of CGT … “Hand” – Grid w/o FC & MB : 14 M ray-tri isecs – Grid with FC & MB:.9 M ray-tri isecs (14x less) – Kd-tree :.85M ray-tri isecs (5% less than grid) – And: cost indep of #rays  very cheap (amortize)

Results

Choice of optimal packet size Cost of trv step roughly independent of rays – Larger packets = More “potential” for amortization (+) More rays/packet = Larger Frustum – More visited cell, more tris (-)  There must be some crossover point … – Resample model (armadillo) at various resolutions – Measure runtime w/ 2x2 packets, 4x4, 8x8, …

Choice of optimal packet size Best is 4x4 (green) or 8x8 (blue)  4x4 as default

Impact of Method: Compare to single-ray & kd-tree Comparison to single-ray grid – Fast single-ray traverser, macrocell if advantageous, … – Speedup 6.5x to 20.9x, usually ~10x Comparison to kd-tree – To OpenRT: 2x-8x faster (2M Soda Hall: 4.5x) – To MLRT: ~3x slower (but much less optimized) – Tests performed on “kd-tree friendly” models

Overall Performance Build time: Usually affordable even on single CPU… Traversal results (1024^2, dual 3.2 GHz Xeon PC) – X/Y: X=raycast only; Y=raytrace+shade+texture+shadows “Hand” 16K triangles 34.5/15.3 fps “Poser” 78K triangles 15.8/7.8 fps “Fairy” 174K triangles 3.4/1.2 fps “Marbles” 8.8K triangles 57.1/26.2 fps “Toys” 11K triangles 29.3/10.2 fps

Example “Utah Fairy” Scene: – 174K tris,fully animated – several independently animated objects (fairy, dragonfly, ferns) – heavy texturing – large variations in scale and density Video: Screenshot from 16-core Opteron PC – ~$40k, BUT … – …~180 GFlops (1 CELL has 256)

Example “Utah Fairy” Scene: – 174K tris,fully animated – several independently animated objects (fairy, dragonfly, ferns) – heavy texturing – large variations in scale and density Video: Screenshot from 16-core Opteron PC – ~$40k, BUT … – …~180 GFlops (1 CELL has 256)

Discussion Comparison to state-of-the-art BVH or kd-tree – Somewhat harder to code and “get right” than, e.g., BVH – Usually somewhat slower (~1.5x-3x) – More susceptible to incoherence & teapot-in-stadium cases Pure frustum tech.: Visits all cells in frustum even if not touched by any ray! BUT: – It works at all ! (Who’d have thought 12m ago ?) – ~10x faster than single-ray grid – Benefits better from additional coherence (4x AA at 2x cost) – “Maybe” better suited for regular data or special HW (Cell, GPUs) – Most flexible wrt dynamic  no limitation at all

Conclusion Have developed a new technique that Makes grid compatible with packets & frusta Is competitive with BVHs and kd-trees Most general in handling dynamic scenes

Thanks a lot… Acknowledgements Co-authors: Thiago Ize, Andrew Kensler, Aaron Knoll, Steven Parker People involved in related & follow-up projects: S Boulos, P Shirley, C Robertson, C Gribble, H Friedrich, J Günther, … Models: DAZ Productions, A Kensler