Photon Mapping on Programmable Graphics Hardware. Timothy J. Purcell, Mike Cammarano, Pat Hanrahan (Stanford University); Craig Donner, Henrik Wann Jensen (University of California, San Diego).

Presentation transcript:

Photon Mapping on Programmable Graphics Hardware
Timothy J. Purcell, Mike Cammarano, Pat Hanrahan (Stanford University)
Craig Donner, Henrik Wann Jensen (University of California, San Diego)

Motivation
– Interactive global illumination on the GPU
– Nearly have sufficient compute power and flexibility
– Explore GPU-based computation algorithms

Related Work
– CPU-based interactive global illumination
  – Supercomputers [Parker et al.]
  – Clusters [Tole et al., Wald et al.]
– Global illumination on programmable GPUs
  – Ray tracing [Carr et al., Purcell et al.]
  – Photon mapping [Ma et al.]
  – Radiosity [Carr et al., Coombe et al.]
  – Translucency [Carr et al., Stamminger et al.]

Photon Mapping Algorithm Review
– Photon tracing
  – Emission, scattering, storing into kd-tree
  – Similar to ray tracing
– Rendering
  – Ray tracing for direct illumination
  – Photon map visualization
  – Indirect bounce

Computational Challenge for GPUs #1
– Constructing an irregular or sparse data structure

Computational Challenge for GPUs #2
– Adaptive nearest neighbor search
  – Noise vs. blur

Photon Mapping on the CPU
– Balanced kd-tree
  – Compact storage of photons
  – Efficient
    – O(log n) search
– Priority queue
  – Nearest neighbor search
  – Incremental insertion and removal of photons

Algorithmic Changes for the GPU
– Direct visualization of photon map
  – Keeps rendering costs low
– Use grid instead of kd-tree
  – Tried kd-tree…
  – Kd-tree construction is difficult
  – Radiance estimate
    – Fixed radius search works fine
    – Adaptive search needs priority queue
– No priority queue
  – Can’t build on GPU
  – Too much state

Contributions
– Mapped complete grid-based photon mapping algorithm onto the GPU
  – Including photon tracing, ray tracing, etc.
– Implemented an adaptive k-nearest neighbor search
  – kNN-grid
– Show how to construct a sparse data structure on the GPU
  – Bitonic merge sort with binary search
  – Stencil routing

Configuring the GPU for Computing
– GPU as data parallel compute engine
  – Fragment programs execute compute kernels
  – Screen sized quad initializes computation
  – SIMD execution
– Floating point texture memory
  – Render-to-texture for intermediate results
  – Data structure storage
  – Pointer dereferencing via dependent fetches

Computational Challenge #1 Building a Sparse Data Structure

– Requires scatter
  – Dependent texture write
– Why don’t we have fragment scatter?
  – Fragment processing has highly coherent blocked memory writes
  – Extra hardware support would be needed
    – Write hazards
    – Memory latencies

Scatter on the GPU
– Sort photons into grid cells
  – Grid cell is sort key
– Simulate scatter with fragment programs
  – Bitonic merge sort followed by binary search
  – Compact grid
  – O(log² n) rendering passes
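
The "grid cell is sort key" idea can be sketched on the CPU as follows. This is only an illustration under assumptions the slides do not state: a uniform 3D grid, a row-major flattening of the cell index, and clamping of photons that fall outside the grid. The names `cell_key`, `grid_min`, `cell_size` and `dims` are hypothetical.

```python
import math

def cell_key(pos, grid_min, cell_size, dims):
    """Return a flat integer key for the grid cell containing pos.

    Assumptions (not from the slides): uniform cubic cells, row-major
    flattening, and clamping of out-of-range photons to the border cells.
    """
    idx = []
    for p, g, d in zip(pos, grid_min, dims):
        i = int(math.floor((p - g) / cell_size))
        idx.append(min(max(i, 0), d - 1))  # clamp into the grid
    x, y, z = idx
    nx, ny, _ = dims
    return x + nx * (y + ny * z)

# Sorting photons by this key places all photons of one cell next to
# each other, which is what the sort-based scatter relies on.
photons = [(1.5, 2.5, 3.5), (0.5, 0.5, 0.5), (1.2, 2.9, 3.1)]
photons.sort(key=lambda p: cell_key(p, (0.0, 0.0, 0.0), 1.0, (4, 4, 4)))
```

After the sort, the two photons that share a cell are adjacent in the list, so a cell only needs the index of its first photon to find all of them.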

Bitonic Merge Sort
– O(log² n) rendering passes
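
A minimal CPU sketch of why the pass count is O(log² n): in a bitonic sorting network, every element is recomputed in each pass from itself and one partner, which is a gather and therefore fits the fragment-program model. The code below is plain Python rather than the Cg used in the talk; `bitonic_sort_passes` is a hypothetical name, and the input length is assumed to be a power of two.

```python
def bitonic_sort_passes(keys):
    """Sort keys with a bitonic network; return (sorted list, pass count).

    Each inner loop body models one rendering pass: every output element
    reads its own value and its partner's value from the previous buffer.
    """
    n = len(keys)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    data = list(keys)
    passes = 0
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # compare distance within this merge stage
            out = []
            for i in range(n):               # one "fragment" per element
                partner = i ^ j
                ascending = (i & k) == 0     # sort direction of this block
                keep_min = (i < partner) == ascending
                a, b = data[i], data[partner]
                out.append(min(a, b) if keep_min else max(a, b))
            data = out                       # ping-pong to the next buffer
            passes += 1
            j //= 2
        k *= 2
    return data, passes

sorted_keys, num_passes = bitonic_sort_passes([5, 0, 2, 5, 0, 2, 0, 7])
```

For n = 8 the loops run 1 + 2 + 3 = 6 passes, that is log n (log n + 1) / 2, which is the O(log² n) figure on the slide.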

Binary Search
– Grid cell searches for self in photon list
  – If none, find first element in next cell
– Empty grid cells waste compute
– log(n) + 1 steps
[Figure sequence: binary search over the sorted photon list for the first v5 photon; frames: initialize, step 1, step 2, step 3, step 4]
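
The search each grid cell performs can be sketched as a fixed-length lower-bound binary search. This is an illustrative CPU version, not the fragment program from the talk: `first_photon_index` is a hypothetical name, and the photon list is reduced to a sorted list of cell ids. The loop always runs the same number of iterations, which mirrors a fixed number of rendering passes.

```python
def first_photon_index(sorted_cells, cell_id):
    """Index of the first photon whose cell id is >= cell_id.

    If the cell is empty, this is the first photon of the next non-empty
    cell (or len(sorted_cells) if there is none).
    """
    n = len(sorted_cells)
    lo, hi = 0, n
    steps = n.bit_length()      # floor(log2 n) + 1 iterations
    for _ in range(steps):
        if lo < hi:             # once converged, remaining steps do nothing
            mid = (lo + hi) // 2
            if sorted_cells[mid] < cell_id:
                lo = mid + 1
            else:
                hi = mid
    return lo

cells = [0, 0, 2, 2, 2, 5, 5]   # cell id of each photon after sorting
starts = [first_photon_index(cells, c) for c in range(7)]
```

Every cell runs the full search whether or not it holds photons, which is the wasted compute for empty cells noted on the slide.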

Scatter on the GPU
– Vertex programs can scatter
  – Draw point to buffer
  – Collisions?
– Stencil routing
  – Limit photon count per grid cell
    – Pre-allocate grid cell space
  – Draw photons as points
    – Vertex program computes grid cell
  – Stencil buffer controls location within cell
  – Single rendering pass

Stencil Routing
– Fix each grid cell size to n² pixels
– Draw fat points to cover each fat cell
  – glPointSize(n)
[Figure: a vertex (photon_pos) passes through the vertex program and is drawn into the flattened grid; the example cell covers 4 pixels]

Stencil Routing
– Control location written to with stencil
  – Pass when stencil is n² − 1
  – Stencil always increments
– Location written depends on draw order
[Figure: a vertex (photon_pos) passes through the vertex program into the flattened grid, next to the stencil values; labels: 1 pixel, 4 pixels]
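
The routing rule can be simulated on the CPU to see where each photon lands. This is a sketch under one assumption the slides do not spell out: each cell's n² stencil values start as the distinct pattern 0, 1, ..., n² − 1. The names `stencil_route` and `cell_of` are hypothetical, and stencil overflow (real stencil buffers have limited bit depth) is ignored.

```python
def stencil_route(photons, cell_of, num_cells, n):
    """Simulate stencil routing; return the per-cell pixel contents.

    photons  : iterable of photons, in draw order
    cell_of  : function mapping a photon to its grid cell index
               (stands in for the vertex program)
    n        : point size, so each cell owns n * n pixels
    """
    slots = n * n
    target = slots - 1
    # Assumed initial stencil pattern: distinct values 0 .. n*n-1 per cell.
    stencil = [list(range(slots)) for _ in range(num_cells)]
    pixels = [[None] * slots for _ in range(num_cells)]
    for photon in photons:
        cell = cell_of(photon)
        for px in range(slots):          # the fat point covers the whole cell
            if stencil[cell][px] == target:
                pixels[cell][px] = photon    # stencil test passes: write
            stencil[cell][px] += 1           # stencil always increments
    return pixels

draws = [("a", 0), ("b", 1), ("c", 0), ("d", 0), ("e", 0), ("f", 0)]
routed = stencil_route(draws, cell_of=lambda p: p[1], num_cells=2, n=2)
```

With n = 2, exactly one pixel of a cell passes the test per draw, so the first four photons of cell 0 each land in their own pixel and the fifth, `("f", 0)`, is dropped. That is the per-cell photon limit mentioned on the earlier slide.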

Computational Challenge #2 Adaptive Nearest Neighbor Search

– Iterative algorithm
– Accept or reject photons in cell visit order

kNN-grid Algorithm
[Figure sequence: a sample point, the photons in the estimate, and the current candidate photon on the grid; the example wants a 4-photon estimate. Numbers in parentheses give the photon count shown in the corresponding frames.]
– Candidate photons must be within max search radius
– Visit voxels in order of distance to sample point
– If current number of photons in estimate is less than number requested, grow search radius (1, 2)
– Don’t add photons outside maximum search radius (2)
– Don’t grow search radius when photon is outside maximum radius (2)
– Add photons within search radius (3, 4)
– Don’t expand search radius if enough photons already found (4)
– Add photons within search radius (5)
– Visit all other voxels accessible within determined search radius (6)
  – Add photons within search radius
– Finds all photons within a sphere centered about sample point (6)
– May locate more than requested k-nearest neighbors
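
The accept/reject rules above can be put together in a short CPU sketch. It is only an illustration under stated assumptions, not the fragment-program implementation from the talk: the grid is a Python dictionary, voxels are ordered by the distance from the sample point to the voxel's box, and the search radius starts at zero. The names `build_grid`, `voxel_distance` and `knn_grid` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

def build_grid(photons, cell_size):
    """Bucket photon positions by integer voxel coordinates."""
    grid = defaultdict(list)
    for p in photons:
        grid[tuple(int(math.floor(c / cell_size)) for c in p)].append(p)
    return grid

def voxel_distance(cell, point, cell_size):
    """Distance from point to the nearest point of the voxel's box."""
    sq = 0.0
    for idx, c in zip(cell, point):
        lo, hi = idx * cell_size, (idx + 1) * cell_size
        d = max(lo - c, 0.0, c - hi)
        sq += d * d
    return math.sqrt(sq)

def knn_grid(grid, cell_size, sample, k, max_radius):
    """Return (photons in estimate, final search radius)."""
    lo = [int(math.floor((c - max_radius) / cell_size)) for c in sample]
    hi = [int(math.floor((c + max_radius) / cell_size)) for c in sample]
    cells = list(product(*(range(a, b + 1) for a, b in zip(lo, hi))))
    cells.sort(key=lambda cell: voxel_distance(cell, sample, cell_size))
    radius, estimate = 0.0, []
    for cell in cells:                      # visit voxels nearest first
        if len(estimate) >= k and voxel_distance(cell, sample, cell_size) > radius:
            continue                        # voxel lies outside the fixed radius
        for p in grid.get(cell, ()):
            d = math.dist(p, sample)
            if d <= radius:
                estimate.append(p)          # inside current radius: accept
            elif len(estimate) < k and d <= max_radius:
                estimate.append(p)          # too few photons: accept and grow
                radius = d
    return estimate, radius

photons = [(0.5, 0.9), (0.6, 0.5), (1.2, 0.5), (3.0, 0.5)]
grid = build_grid(photons, 1.0)
found, r = knn_grid(grid, 1.0, sample=(0.5, 0.5), k=1, max_radius=1.0)
```

In this example a 1-photon estimate returns two photons: the first photon visited sets the radius to about 0.4, and the second lies inside that radius, so it is accepted as well. This is the "may locate more than requested" behaviour noted on the last slide of the sequence.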

System Implementation
– NVIDIA GeForce FX 5900 Ultra (NV35)
– Cg compiler 1.1
[Figure: system flow with boxes Trace Photons, Build Photon Map, Ray Trace Scene, Compute Radiance Estimate, and the labels Compute Lighting and Render Image]

Demos

Glass Ball – Bitonic Sort 512x384, 5K photons

Glass Ball – Stencil Routing 512x384, 5K photons

Ring – Bitonic Sort 512x384, 16K photons

Ring – Stencil Routing 512x384, 16K photons

Cornell Box – Bitonic Sort 512x512, 65K photons

Cornell Box – Stencil Routing 512x512, 65K photons

Cornell Box – Increased Search Radius

Open Issues (1)
– How to prevent program execution over a subset of pixels?
  – Non-uniform pixel computation distribution
    – Radiance estimate
  – KILL is only a write mask
  – Early-z occlusion culling
    – No pixel level control
– Compute mask, branching, or stream buffer?
  – Improve radiance estimate speed by 30-70% over tiling

Open Issues (2)
– Scatter
  – Makes (a programmer’s) life easier
  – Is it worth implementing?
    – Gain factor of log² n avoiding sort

Future Work
– Kd-trees
– Photon power redistribution
– Adaptive sampling
– Progressive refinement

Conclusions
– The GPU can compute an entire global illumination solution
  – Nearly interactive
– Implemented an adaptive k-nearest neighbor query for the GPU
  – kNN-grid
– Shown how to construct sparse data structures on the GPU
  – Bitonic merge sort and binary search
  – Stencil routing
– Sorting and searching algorithms applicable to other computations

Acknowledgments
– Stanford FlashG
  – Ian Buck, Mike Houston, Kekoa Proudfoot
– Stencil routing
  – Kurt Akeley, Matt Papakipos
– Hardware and drivers
  – David Kirk, Nick Triantos
– Funding
  – NVIDIA, DARPA, NSF, 3Com