
1
Photon Mapping on Programmable Graphics Hardware
Timothy J. Purcell, Mike Cammarano, Pat Hanrahan (Stanford University)
Craig Donner, Henrik Wann Jensen (University of California, San Diego)

2
Motivation

3
Motivation
- Interactive global illumination on the GPU
  - Nearly have sufficient compute power and flexibility
  - Explore GPU-based computation algorithms

4
Related Work
- CPU-based interactive global illumination
  - Supercomputers [Parker et al.]
  - Clusters [Tole et al., Wald et al.]
- Global illumination on programmable GPUs
  - Ray tracing [Carr et al., Purcell et al.]
  - Photon mapping [Ma et al.]
  - Radiosity [Carr et al., Coombe et al.]
  - Translucency [Carr et al., Stamminger et al.]

5
Photon Mapping Algorithm Review
- Photon tracing
  - Emission, scattering, storing into kd-tree
  - Similar to ray tracing
- Rendering
  - Ray tracing for direct illumination
  - Photon map visualization
  - Indirect bounce

6
Computational Challenge for GPUs #1
- Constructing an irregular or sparse data structure

7
Computational Challenge for GPUs #2
- Adaptive nearest neighbor search
- Noise vs. blur

9
Photon Mapping on the CPU
- Balanced kd-tree
  - Compact storage of photons
  - Efficient
  - O(log n) search
- Priority queue
  - Nearest neighbor search
  - Incremental insertion and removal of photons

10
Algorithmic Changes for the GPU
- Direct visualization of photon map
  - Keeps rendering costs low
- Use grid instead of kd-tree
  - Tried kd-tree…
  - Kd-tree construction is difficult
  - Radiance estimate
    - Fixed radius search works fine
    - Adaptive search needs priority queue
- No priority queue
  - Can’t build on GPU
  - Too much state

11
Contributions
- Mapped complete grid-based photon mapping algorithm onto the GPU
  - Including photon tracing, ray tracing, etc.
- Implemented an adaptive k-nearest neighbor search
  - kNN-grid
- Show how to construct a sparse data structure on the GPU
  - Bitonic merge sort with binary search
  - Stencil routing

12
Configuring the GPU for Computing
- GPU as data parallel compute engine
  - Fragment programs execute compute kernels
  - Screen sized quad initializes computation
  - SIMD execution
- Floating point texture memory
  - Render-to-texture for intermediate results
  - Data structure storage
  - Pointer dereferencing via dependent fetches

13
Computational Challenge #1: Building a Sparse Data Structure

14
- Requires scatter
  - Dependent texture write
- Why don’t we have fragment scatter?
  - Fragment processing has highly coherent blocked memory writes
  - Extra hardware support would be needed
  - Write hazards
  - Memory latencies

15
Scatter on the GPU
- Sort photons into grid cells
  - Grid cell is sort key
- Simulate scatter with fragment programs
  - Bitonic merge sort followed by binary search
  - Compact grid
  - O(log² n) rendering passes

16
Bitonic Merge Sort
[Figure: bitonic merge sort of an eight-element array, ending in the sorted sequence 1 2 3 4 5 6 7 8.]
- O(log² n) rendering passes
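The pass structure behind the O(log² n) figure can be sketched on the CPU. This is only an illustrative Python simulation, not the talk's fragment program: the function name `bitonic_sort_passes` and the list-based "texture" are assumptions. Each inner loop over `i` stands in for one rendering pass, in which every element only reads (gathers) its own value and its partner's value, which is the access pattern a fragment program allows.

```python
def bitonic_sort_passes(keys):
    """Sort a power-of-two-length list with a bitonic network.

    Returns (sorted_list, pass_count). Each pass is written as a pure
    gather: output element i reads inputs i and partner, never writes
    anywhere else.
    """
    n = len(keys)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    data = list(keys)
    passes = 0
    size = 2                      # length of the bitonic sequences being merged
    while size <= n:
        stride = size // 2        # distance between compared elements
        while stride >= 1:
            out = [None] * n
            for i in range(n):    # one simulated fragment per element
                partner = i ^ stride
                ascending = (i & size) == 0
                lo = min(data[i], data[partner])
                hi = max(data[i], data[partner])
                # The lower index keeps the minimum in an ascending block,
                # the maximum in a descending block.
                keep_min = (i < partner) == ascending
                out[i] = lo if keep_min else hi
            data = out
            passes += 1
            stride //= 2
        size *= 2
    return data, passes


result, passes = bitonic_sort_passes([3, 7, 4, 8, 6, 2, 1, 5])
print(result, passes)
```

For n = 8 the loop runs log₂ n · (log₂ n + 1) / 2 = 6 passes; by the same count, 2¹⁶ keys would need 16 · 17 / 2 = 136 passes.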

17
Binary Search
- Grid cell searches for self in photon list
  - If none, find first element in next cell
  - Empty grid cells waste compute
- log(n) + 1 steps

22
Binary Search
- Grid cell searches for self in photon list
  - If none, find first element in next cell
  - Empty grid cells waste compute
- log(n) + 1 steps
[Figure: binary search over the sorted photon list (photons labeled by grid cell v0, v2, v5), searching for the first v5 photon; shown at initialization and after steps 1 to 4.]
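The per-cell search described above amounts to a lower-bound binary search with a fixed iteration count, so every grid cell can run the same number of steps in lockstep. The following is a minimal CPU sketch in Python under that reading; `first_photon_index` and the example list are hypothetical, with the list modeled on the v0 v0 v2 v2 v5 photon labels in the figure.

```python
def first_photon_index(sorted_cells, cell):
    """Return the index of the first photon whose cell id is >= cell.

    sorted_cells holds the grid-cell id of each photon, already sorted.
    If the cell is empty, the result is the first photon of the next
    occupied cell (or len(sorted_cells) if there is none).
    """
    n = len(sorted_cells)
    lo, hi = 0, n
    steps = n.bit_length()        # floor(log2 n) + 1 steps for n >= 1
    for _ in range(steps):        # fixed count: no data-dependent loop exit
        if lo < hi:
            mid = (lo + hi) // 2
            if sorted_cells[mid] < cell:
                lo = mid + 1
            else:
                hi = mid
    return lo


photon_cells = [0, 0, 2, 2, 5]    # photons sorted by grid cell
starts = [first_photon_index(photon_cells, c) for c in range(6)]
print(starts)
```

Cells 1, 3 and 4 hold no photons here, so they resolve to the start of the next occupied cell; they still cost a full search each, which is the wasted compute the slide mentions.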

23
Scatter on the GPU
- Vertex programs can scatter
  - Draw point to buffer
  - Collisions?

24
Scatter on the GPU
- Vertex programs can scatter
  - Draw point to buffer
  - Collisions?
- Stencil routing
  - Limit photon count per grid cell
    - Pre-allocate grid cell space
  - Draw photons as points
    - Vertex program computes grid cell
  - Stencil buffer controls location within cell
  - Single rendering pass

25
Stencil Routing
- Fix each grid cell size to n² pixels
- Draw fat points to cover each fat cell
  - glPointSize(n)
[Figure: a vertex (photon_pos) passes through the vertex program and is drawn into the flattened grid as a point covering 4 pixels.]

26
Stencil Routing
- Control location written to with stencil
  - Pass when stencil is n² - 1
  - Stencil always increments
- Location written depends on draw order
[Figure: a vertex (photon_pos) passes through the vertex program into the flattened grid; the point covers 4 pixels and the stencil test lets 1 pixel through. Stencil values per 2x2 cell start at 0 1 / 2 3 and read 1 2 / 3 4 after one photon has been drawn into the cell.]
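The routing rule above can be checked with a small CPU simulation. This Python sketch is an assumption-laden model, not GPU code: `stencil_route`, the nested lists standing in for the color and stencil buffers, and the unbounded integer stencil values are all illustrative (a real stencil buffer has a limited bit depth). Each cell owns n² pixels whose stencil values start at 0 .. n² - 1; drawing a photon touches every pixel of its cell, writes only where the stencil equals n² - 1, and increments the stencil everywhere it touched.

```python
def stencil_route(photon_cells, num_cells, n):
    """Route photons into pre-allocated cells of n*n pixels each.

    photon_cells[i] is the grid cell of photon i, in draw order.
    Returns (color, stencil, dropped): color[cell][pixel] holds the id of
    the photon stored there (or None), dropped counts photons that
    arrived after their cell was full.
    """
    capacity = n * n
    stencil = [list(range(capacity)) for _ in range(num_cells)]
    color = [[None] * capacity for _ in range(num_cells)]
    dropped = 0
    for photon_id, cell in enumerate(photon_cells):
        wrote = False
        for pixel in range(capacity):          # the fat point covers the cell
            if stencil[cell][pixel] == capacity - 1:
                color[cell][pixel] = photon_id  # stencil test passes here only
                wrote = True
            stencil[cell][pixel] += 1           # stencil always increments
        if not wrote:
            dropped += 1
    return color, stencil, dropped


color, stencil, dropped = stencil_route([1, 1, 0], num_cells=2, n=2)
print(color, stencil, dropped)
```

In this run the two photons for cell 1 land in pixels 3 and 2 of that cell, in draw order, and cell 0's stencil reads 1 2 3 4 after its single photon, matching the values in the figure. A fifth photon sent to the same 2x2 cell would pass no stencil test and be dropped, which is how the per-cell photon limit arises.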

27
Computational Challenge #2: Adaptive Nearest Neighbor Search

28
- Iterative algorithm
- Accept or reject photons in cell visit order

29
kNN-grid Algorithm
[Figure legend: sample point, photons in estimate, candidate photon. Want a 4 photon estimate.]

30
kNN-grid Algorithm
- Candidate photons must be within max search radius
- Visit voxels in order of distance to sample point

31
kNN-grid Algorithm
- If current number of photons in estimate is less than number requested, grow search radius
[Figure: 1 photon in estimate.]

32
kNN-grid Algorithm
- If current number of photons in estimate is less than number requested, grow search radius
[Figure: 2 photons in estimate.]

33
kNN-grid Algorithm
- Don’t add photons outside maximum search radius
- Don’t grow search radius when photon is outside maximum radius
[Figure: 2 photons in estimate.]

34
kNN-grid Algorithm
- Add photons within search radius
[Figure: 3 photons in estimate.]

35
kNN-grid Algorithm
- Add photons within search radius
[Figure: 4 photons in estimate.]

36
kNN-grid Algorithm
- Don’t expand search radius if enough photons already found
[Figure: 4 photons in estimate.]

37
kNN-grid Algorithm
- Add photons within search radius
[Figure: 5 photons in estimate.]

38
kNN-grid Algorithm
- Visit all other voxels accessible within determined search radius
- Add photons within search radius
[Figure: 6 photons in estimate.]

39
kNN-grid Algorithm
- Finds all photons within a sphere centered about sample point
- May locate more than requested k-nearest neighbors
[Figure: 6 photons in estimate.]
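The accept/reject rules walked through above can be collected into one routine. This is a hedged 2D Python sketch of that reading of the slides, not the authors' Cg implementation: `knn_grid`, the dictionary-based grid, and the choice to sort only occupied voxels are assumptions made for illustration. It keeps no priority queue: the search radius only grows while fewer than k photons have been accepted, and afterwards photons are accepted only if they fall inside that radius.

```python
import math


def knn_grid(photons, sample, k, max_radius, cell_size):
    """Approximate k-nearest-neighbor gather over a uniform 2D grid.

    Returns (estimate, radius): the accepted photons and the final
    search radius. May return more than k photons.
    """
    grid = {}
    for p in photons:                                   # bin photons by voxel
        key = (math.floor(p[0] / cell_size), math.floor(p[1] / cell_size))
        grid.setdefault(key, []).append(p)

    def voxel_distance(key):
        # Distance from the sample point to the nearest point of the voxel.
        dx = max(key[0] * cell_size - sample[0], 0.0,
                 sample[0] - (key[0] + 1) * cell_size)
        dy = max(key[1] * cell_size - sample[1], 0.0,
                 sample[1] - (key[1] + 1) * cell_size)
        return math.hypot(dx, dy)

    order = sorted((voxel_distance(key), key) for key in grid)
    estimate, radius = [], 0.0
    for vdist, key in order:                            # nearest voxels first
        if vdist > max_radius:
            break
        if len(estimate) >= k and vdist > radius:
            break                                       # nothing closer remains
        for p in grid[key]:
            d = math.hypot(p[0] - sample[0], p[1] - sample[1])
            if d > max_radius:
                continue                                # never accept, never grow
            if len(estimate) < k:
                estimate.append(p)
                radius = max(radius, d)                 # grow radius to include it
            elif d <= radius:
                estimate.append(p)                      # inside the fixed radius
    return estimate, radius


photons = [(0.9, 0.9), (0.1, 0.1), (0.3, 0.3)]
found, r = knn_grid(photons, sample=(0.0, 0.0), k=1, max_radius=2.0, cell_size=1.0)
print(len(found), r)
```

In this example the first photon visited happens to be the farthest one, so the radius grows to reach it and the two closer photons are then accepted as well: three photons are returned for k = 1, illustrating that the search may locate more than the requested number.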

40
System Implementation
- NVIDIA GeForce FX 5900 Ultra (NV35)
- Cg compiler 1.1
[Figure: pipeline stages Trace Photons, Build Photon Map, Ray Trace Scene, Compute Radiance Estimate, Compute Lighting, Render Image.]

41
Demos

42
Glass Ball – Bitonic Sort 18s @ 512x384, 5K photons

43
Glass Ball – Stencil Routing 11s @ 512x384, 5K photons

44
Ring – Bitonic Sort 9s @ 512x384, 16K photons

45
Ring – Stencil Routing 8s @ 512x384, 16K photons

46
Cornell Box – Bitonic Sort 64s @ 512x512, 65K photons

47
Cornell Box – Stencil Routing 47s @ 512x512, 65K photons

48
Cornell Box – Increased Search Radius

49
Open Issues (1)
- How to prevent program execution over a subset of pixels?
  - Non-uniform pixel computation distribution
  - Radiance estimate
- KILL is only a write mask
- Early-z occlusion culling
  - No pixel level control
- Compute mask, branching, or stream buffer?
  - Improve radiance estimate speed by 30-70% over tiling

50
Open Issues (2)
- Scatter
  - Makes (a programmer’s) life easier
  - Is it worth implementing?
  - Gain factor of log² n avoiding sort

51
Future Work
- Kd-trees
- Photon power redistribution
- Adaptive sampling
- Progressive refinement

52
Conclusions
- The GPU can compute an entire global illumination solution
  - Nearly interactive
- Implemented an adaptive k-nearest neighbor query for the GPU
  - kNN-grid
- Shown how to construct sparse data structures on the GPU
  - Bitonic merge sort and binary search
  - Stencil routing
- Sorting and searching algorithms applicable to other computations

53
Acknowledgments
- Stanford FlashG
  - Ian Buck, Mike Houston, Kekoa Proudfoot
- Stencil routing
  - Kurt Akeley, Matt Papakipos
- Hardware and drivers
  - David Kirk, Nick Triantos
- Funding
  - NVIDIA, DARPA, NSF, 3Com
