Fast BVH Construction on GPUs (Eurographics 2009) Park, Soonchan KAIST (Korea Advanced Institute of Science and Technology)


2 Contents ● What is a BVH ● Motivation ● Three Algorithms to Construct a BVH ● LBVH ● SAH Hierarchy Construction ● Hybrid GPU Construction Algorithm ● Results & Analysis

4 What is a BVH? ● Bounding Volume Hierarchy ● A tree structure over a set of geometric objects ● Enables fast computation for: ● Ray tracing ● Collision detection ● Visibility culling

5 What is a BVH? ● Issues in BVH construction ● Construction time ● Effectiveness of the constructed hierarchy ● How much improvement the BVH brings – Median subdivision vs. Surface Area Heuristic (SAH)

6 Motivation ● BVH construction: almost all prior work is about purely serial construction algorithms ● Goal: efficient parallel algorithms for manycore processors → How can the steps of BVH construction be made suitable for parallel computation?

9 LBVH ● Linear Bounding Volume Hierarchy ● Simplest approach to parallelizing BVH construction ● Sort the input primitives by their Morton codes ● BVH construction → sorting (O(n log n))

10 Morton Codes (Z-order) ● A space-filling curve ● Preserves spatial locality well ● Expresses a position in space as a string of bits
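
The bit expression mentioned on this slide can be sketched as follows. This is a minimal illustration, assuming 10 bits per axis (a 30-bit code) and coordinates already normalized to [0, 1); the function names `expand_bits` and `morton3d` are chosen here for illustration and do not come from the slides.

```python
def expand_bits(v: int) -> int:
    """Spread the low 10 bits of v so that two zero bits follow each bit."""
    v &= 0x3FF
    v = (v | (v << 16)) & 0x030000FF
    v = (v | (v << 8)) & 0x0300F00F
    v = (v | (v << 4)) & 0x030C30C3
    v = (v | (v << 2)) & 0x09249249
    return v


def morton3d(x: float, y: float, z: float) -> int:
    """30-bit Morton code for a point with coordinates in [0, 1)."""
    def quantize(c: float) -> int:
        # Map [0, 1) onto the integer grid 0..1023, clamping out-of-range input.
        return min(max(int(c * 1024.0), 0), 1023)

    # Interleave as x2 y2 z2 x1 y1 z1 ... with x in the most significant slot.
    return (expand_bits(quantize(x)) << 2) | (expand_bits(quantize(y)) << 1) | expand_bits(quantize(z))
```

Points that are close in space tend to share the leading bits of their codes, which is why sorting by code groups nearby primitives together.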

11 Morton Codes (Z-order)

12 LBVH ● Linear BVH ● Sort the primitives along the curve with a parallel radix sort [SHG08] ● Each primitive's position is expressed as a bit string ● How do we build the hierarchy?
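
The sorting step could look roughly like the sketch below. It is a serial least-significant-digit radix sort, used here only as a stand-in for the parallel radix sort of [SHG08]; the histogram, prefix-sum and scatter phases are the parts a GPU version would parallelize. The function name and the 8-bit digit size are assumptions made for this illustration.

```python
def radix_sort_by_code(codes, key_bits=30, digit_bits=8):
    """Return primitive indices ordered by Morton code (stable LSD radix sort)."""
    order = list(range(len(codes)))
    radix = 1 << digit_bits
    mask = radix - 1
    for shift in range(0, key_bits, digit_bits):
        # 1) Histogram: count how many keys have each digit value.
        counts = [0] * radix
        for i in order:
            counts[(codes[i] >> shift) & mask] += 1
        # 2) Exclusive prefix sum: first output slot for each digit value.
        offsets = [0] * radix
        total = 0
        for d in range(radix):
            offsets[d] = total
            total += counts[d]
        # 3) Stable scatter: only indices move, never the geometry.
        out = [0] * len(order)
        for i in order:
            d = (codes[i] >> shift) & mask
            out[offsets[d]] = i
            offsets[d] += 1
        order = out
    return order
```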

13 LBVH ● Building the hierarchy ● Compare every primitive i with primitive i+1 ● Find the level at which they are separated ● Make a list of (primitive index, separation level) pairs ● Re-sort the list by level → This gives the intervals at each level!
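
The steps on this slide can be sketched as below. This is an illustrative reading, not the authors' implementation: it assumes that a pair separated at some level is also separated at every deeper level, so it emits one entry per level from the first differing level downward. Indices are 0-based, and an entry `(i, level)` means a split between sorted primitives `i` and `i + 1`. The names `split_list` and `intervals_at_level` are hypothetical.

```python
def split_list(sorted_codes, levels, bits_per_level=3):
    """Build the (index, level) split list from Morton codes sorted ascending."""
    total_bits = levels * bits_per_level
    pairs = []
    for i in range(len(sorted_codes) - 1):
        diff = sorted_codes[i] ^ sorted_codes[i + 1]
        if diff == 0:
            continue  # identical codes are never separated
        # Number of leading bits the two codes share, converted to a 1-based level.
        shared = total_bits - diff.bit_length()
        first_level = shared // bits_per_level + 1
        for level in range(first_level, levels + 1):
            pairs.append((i, level))
    pairs.sort(key=lambda p: p[1])  # stable: keeps index order within a level
    return pairs


def intervals_at_level(pairs, level, n):
    """Turn the splits of one level into inclusive (first, last) index intervals."""
    cuts = sorted(i for i, lvl in pairs if lvl == level)
    starts = [0] + [c + 1 for c in cuts]
    ends = cuts + [n - 1]
    return list(zip(starts, ends))
```

For example, with one bit per level and the codes `000, 001, 010, 100, 101, 111`, level 2 yields the intervals `(0, 1), (2, 2), (3, 4), (5, 5)`, one per distinct two-bit prefix.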

14 Example ● Split list of (primitive index, separation level) pairs: (6,1) (3,2) (6,2) (2,3) (3,3) (4,3) (5,3) (6,3) (7,3) (1,4) (2,4) (3,4) (4,4) (5,4) (6,4) (7,4)

15-19 [Figures: the example hierarchy built up level by level (LEVEL 1, LEVEL 2, LEVEL 3, ...)]

20 LBVH ● Pros ● Very fast: same complexity as sorting ● We use a parallel radix sort [SHG08] ● Cons ● The constructed hierarchy is not optimized ● It uniformly subdivides space at the median ● A leaf can contain multiple primitives

22 What is SAH ● Surface Area Heuristic ● An answer to how to build an optimized hierarchy ● “Which of a number of partitions of primitives will be better?” ● “Which of a number of possible positions to split space will be better?”

23 What is SAH ● SAH-optimized construction can also be achieved in O(n log n) [WH06] ● Process for SAH construction ● Recursively split the set of geometric primitives (usually into two parts per step, giving a binary tree) ● Evaluate each candidate split with a cost function ● The cost function can be defined as needed ● Choose the candidate with the lowest cost ● Checking all possible split positions can be costly ● A sampling method can be applied instead
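
The slide does not spell out the cost function. A commonly used form of the SAH cost for splitting a node P into children L and R is shown below; the symbols are assumptions introduced here: A(·) is the surface area of a bounding box, N_L and N_R are the primitive counts of the two children, C_t is the cost of a traversal step and C_i the cost of one primitive intersection.

```latex
C(P \rightarrow \{L, R\}) = C_t + \frac{A(L)}{A(P)}\, N_L\, C_i + \frac{A(R)}{A(P)}\, N_R\, C_i
```

Since C_t, C_i and A(P) are the same for every candidate split of a given node, comparing candidates only requires the quantity A(L) N_L + A(R) N_R.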

24 GPU SAH Construction ● Breadth-first construction using work queues ● Parallelization! [Figure: input queue and output queue]
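
A minimal sketch of breadth-first construction with an input and an output queue is given below. It is serial Python; on a GPU each entry of the input queue would be handled in parallel. The median split is only a placeholder for the SAH split, and the node layout and the name `build_breadth_first` are assumptions made for this illustration.

```python
def build_breadth_first(num_prims, max_leaf_size=1):
    """Build a BVH topology over primitives 0..num_prims-1, one level per pass."""
    nodes = [{"first": 0, "last": num_prims - 1, "left": None, "right": None}]
    input_queue = [0]  # ids of nodes that may still need splitting
    passes = 0
    while input_queue:
        output_queue = []
        for node_id in input_queue:  # conceptually: one parallel task per entry
            node = nodes[node_id]
            count = node["last"] - node["first"] + 1
            if count <= max_leaf_size:
                continue  # leaf: nothing is pushed to the output queue
            mid = node["first"] + count // 2 - 1  # placeholder for the SAH split
            for first, last in ((node["first"], mid), (mid + 1, node["last"])):
                nodes.append({"first": first, "last": last, "left": None, "right": None})
                output_queue.append(len(nodes) - 1)
            node["left"], node["right"] = len(nodes) - 2, len(nodes) - 1
        input_queue = output_queue  # the output of this pass feeds the next one
        passes += 1
    return nodes, passes
```

The number of passes equals the depth of the tree, so the amount of work available per pass grows as construction moves down the hierarchy.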

25 Data-Parallel SAH Split ● Two steps for performing an SAH split ● Determine the best split position by evaluating the SAH ● Reorder the primitives (according to the new split)

26 Data-Parallel SAH Split ● Determine the best split position ● Approximate SAH computation ● Generate k uniformly sampled split candidates for each of the three axes (test all samples in parallel using 3k threads) ● Each thread computes the SAH cost for its split candidate ● Find the split candidate with the lowest cost ● Reorder the primitives ● According to the new split ● Only reorder the indices ● No copying of geometry
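
The two steps can be sketched as follows. This is a serial illustration under several assumptions not stated on the slide: primitives are given as axis-aligned boxes `((xmin, ymin, zmin), (xmax, ymax, zmax))`, they are classified by box centroid, candidates are spaced uniformly inside the node bounds, and the cost omits constant factors (it compares area times count only). Each of the 3k loop iterations corresponds to what one thread would compute. All function names are hypothetical.

```python
def surface_area(lo, hi):
    dx, dy, dz = (max(h - l, 0.0) for l, h in zip(lo, hi))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def bounds(boxes, ids):
    lo = tuple(min(boxes[i][0][a] for i in ids) for a in range(3))
    hi = tuple(max(boxes[i][1][a] for i in ids) for a in range(3))
    return lo, hi


def partition(boxes, ids, axis, pos):
    """Split ids by box centroid; the geometry itself is never copied."""
    left = [i for i in ids if 0.5 * (boxes[i][0][axis] + boxes[i][1][axis]) < pos]
    right = [i for i in ids if 0.5 * (boxes[i][0][axis] + boxes[i][1][axis]) >= pos]
    return left, right


def best_sampled_split(boxes, ids, k=8):
    """Evaluate k uniform candidates per axis; return (cost, axis, pos) or None."""
    lo, hi = bounds(boxes, ids)
    best = None
    for axis in range(3):
        for s in range(1, k + 1):  # one "thread" per (axis, s) pair
            pos = lo[axis] + (hi[axis] - lo[axis]) * s / (k + 1)
            left, right = partition(boxes, ids, axis, pos)
            if not left or not right:
                continue  # degenerate candidate: everything on one side
            cost = (surface_area(*bounds(boxes, left)) * len(left)
                    + surface_area(*bounds(boxes, right)) * len(right))
            if best is None or cost < best[0]:
                best = (cost, axis, pos)
    return best


def reorder_indices(boxes, ids, axis, pos):
    """Return the reordered index list and the number of indices on the left."""
    left, right = partition(boxes, ids, axis, pos)
    return left + right, len(left)
```

Reordering only the index list keeps the memory traffic per split small, which matches the "no copying of geometry" point on the slide.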

27 Small Split Operation ● Two main bottlenecks ● The initial splits at the top level of the hierarchy are very slow ● Large # of primitives at the top level – Addressed by the hybrid method (discussed later) ● Large # of small splits at the lower levels ● Problems ● Higher compaction costs caused by the large # of splits ● Vector utilization is low (few primitives per split) ● The large # of small splits is the problem → Use a different split kernel for small splits

28 Small Split Operation ● Main idea ● Set a threshold to define a “small split” ● Depends on geometry data & cache size (32) ● Use the processor’s local memory ● To maintain a local work queue ● To keep all the geometric primitives ● Pros ● Reduces memory bandwidth ● Decreases # of threads ● Maximizes utilization of vector operations ● Avoids waiting for memory access → 15~20% speedup

29 Small Split Operation [Figure: time and # of active splits per split level]

31 Hybrid GPU Construction Algorithm ● LBVH ● The resulting hierarchy is not optimized ● Shallow hierarchy ● Large # of primitives at the leaves ● But fast ● GPU SAH construction ● Relatively slow ● Overhead at the first levels ● But it can build an optimized hierarchy ● Solution ● Top levels → use LBVH ● Remaining levels → use GPU SAH construction
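
One way to picture the hybrid scheme is sketched below, assuming 30-bit Morton codes with 3 bits per level: the top levels come from grouping primitives by the leading bits of their codes, and each resulting cluster would then be handed to the SAH builder (not shown). The function name and the parameter `top_levels` are assumptions for this illustration, not names from the slides.

```python
from itertools import groupby


def top_level_clusters(codes, top_levels=2, code_bits=30, bits_per_level=3):
    """Group primitive indices by the Morton-code prefix of the top levels."""
    shift = code_bits - top_levels * bits_per_level
    # Sort indices by code; a radix sort would be used in practice.
    order = sorted(range(len(codes)), key=lambda i: codes[i])
    # Consecutive primitives with the same prefix form one cluster.
    return [list(group) for _, group in groupby(order, key=lambda i: codes[i] >> shift)]
```

Increasing `top_levels` moves more of the work to the fast LBVH stage, while decreasing it leaves more levels to the SAH stage, which is the sense in which the hybrid build can be tuned.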

33 Results ● Rendered several scenes ● Compared against other setups ● Single-core, unoptimized CPU SAH ● Full SAH ● Standard CPU BVH ray tracer using ray packets ● Compared on ● Construction time, how well optimized the hierarchy is, fps

34-36 Results [Tables: construction time and absolute/relative ray tracing performance]

37 Results ● GPU SAH ● Shows better performance than CPU SAH ● Good optimization ● LBVH ● Fast, but not optimized ● Scene dependent ● Hybrid ● Between GPU SAH & LBVH ● Can be customized

38 Analysis ● Current GPU architectures have several features useful for constructing hierarchies ● Dedicated graphics memory → significantly higher memory bandwidth ● Explicitly managed fast local memory ● Discussed in Small Split Operation ● Memory ● 113 bytes/triangle ● Worst case: one triangle per leaf → This allows multi-million-triangle models on current GPUs
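
As a quick worked example of the memory figure, take a model of two million triangles (a size chosen here purely for illustration) at the stated worst case of 113 bytes per triangle:

```latex
113\ \tfrac{\text{bytes}}{\text{triangle}} \times 2 \times 10^{6}\ \text{triangles} = 2.26 \times 10^{8}\ \text{bytes} \approx 226\ \text{MB}
```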

39 Analysis ● Bottleneck analysis [Figure: core overhead and memory overhead]

40 Analysis ● Time distribution [Figure: full SAH build vs. hybrid build] * Rest = reading/writing BVH node information, setting up splits, and the rest of the steps ● “Note that Hybrid build is 10 times faster”

41 Video ● YouTube video

42 References ● [SHG08] SATISH N., HARRIS M., GARLAND M.: Designing efficient sorting algorithms for manycore GPUs. Under review (2008). ● [WH06] WALD I., HAVRAN V.: On building fast kd-trees for ray tracing, and on doing that in O(N log N). In Proc. of IEEE Symp. on Interactive Ray Tracing (2006), pp. 61–69.