Interactive Geometric Computations using Graphics Processors Naga K. Govindaraju UNC Chapel Hill.

Presentation transcript:

Interactive Geometric Computations using Graphics Processors Naga K. Govindaraju UNC Chapel Hill

2 Outline
- Overview
- Interactive Collision Detection
- Conclusions and Future Work

3 Geometric Computations
- Well studied
  - Computer graphics, computational geometry, etc.
- Widely used in games, simulations, and virtual reality applications

4 Graphics Processing Units (GPUs)
- Well-designed for visibility computations
  - Rasterization – image-space visibility
- Massively parallel
  - Render millions of polygons per second
- Well suited for image-based algorithms
- High growth rate

5 Recent growth rate of Graphics Processing Units
Card              Million triangles/sec
Radeon 9700 Pro   325
GeForce FX
Radeon 9800 XT    412
GeForce FX
GeForce FX

6 Graphics Processing Units (GPUs)
- Well-designed for visibility computations
  - Rasterization – image-space visibility
- Massively parallel
  - Render millions of polygons per second
- Well suited for image-based algorithms
- High growth rate

7 GPUs: Geometric Computations
Used for geometric applications:
- Minkowski sums [Kim et al. 02]
- CSG rendering [Goldfeather et al. 89, Rossignac et al. 90]
- Voronoi computation [Hoff et al. 01, 02, Sud et al. 04]
- Isosurface computation [Pascucci 04]
- Map simplification [Mustafa et al. 01]

8 GPUs for Geometric Computations: Issues
- Precision
- Frame-buffer readbacks

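9 [Figure: GPU pipeline for computing the visibility of triangles. The CPU draws a stream of triangles, sending a stream of vertices plus setup commands and state to the GPU. On the GPU, the vertex processing engines produce a stream of transformed vertices for the setup engine; the pixel processing engines apply the alpha test, stencil test, and depth test and produce a stream of visible pixels; the count of visible pixels is returned.]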

10 [Figure: the GPU pipeline for computing the visibility of triangles (CPU, vertex processing engines, setup engine, pixel processing engines with alpha, stencil, and depth tests), annotated with "IEEE Floating Point (32-bit)" and "Limited Resolution!"]

11 Frame-Buffer Precision
Pixel processing engines, stream of visible pixels: limited resolution!
Resolution along X, Y, Z:
- X – 12 bits fixed precision
- Y – 12 bits fixed precision
- Z – 24 bits fixed precision
On CPU – 32-bit or 64-bit floating-point precision
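To make the precision gap concrete, here is a minimal sketch of what 24-bit fixed precision along Z means. It assumes depth is mapped linearly from [0, 1] to an unsigned 24-bit integer, which is a common convention but is not stated on the slide; the function name `quantize_depth` is hypothetical.

```python
def quantize_depth(z: float, bits: int = 24) -> int:
    """Map a depth z in [0, 1] to an unsigned fixed-point value with `bits` bits.

    Assumption: a linear mapping, as in a typical fixed-point depth buffer.
    """
    if not 0.0 <= z <= 1.0:
        raise ValueError("depth must lie in [0, 1]")
    levels = (1 << bits) - 1          # largest representable value
    return round(z * levels)


# Two depths that are distinct as 64-bit floats on the CPU...
z_a = 0.25
z_b = 0.25 + 1e-9

# ...collapse to the same 24-bit value in the depth buffer.
same_in_buffer = quantize_depth(z_a) == quantize_depth(z_b)

# Smallest depth difference a 24-bit buffer can resolve under this mapping.
depth_step = 1.0 / ((1 << 24) - 1)
```

Under this assumption, two surfaces closer together than `depth_step` cannot be ordered by the depth test, which is the kind of error the later reliability slides address.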

12 Frame-Buffer Readback
- Involve stalls
- Affect throughput
- Slow!

13 Frame-Buffer Readback Performance
- Readback of a 1Kx1K frame-buffer takes 18 ms over PCI-Express
- Graphics driver – 61.45
Data courtesy: www.techreport.com, June 2004

14 Frame-Buffer Readback Performance
- Readback of a 1Kx1K frame-buffer takes 18 ms over PCI-Express
- Graphics driver – 61.45
Data courtesy: June 2004
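A quick back-of-the-envelope calculation shows what the 18 ms figure implies. The slide does not say how many bytes per pixel were read, so the 4 bytes per pixel used here is an assumption; `readback_stats` is a hypothetical helper.

```python
def readback_stats(width: int, height: int, bytes_per_pixel: int, readback_ms: float):
    """Return (effective bandwidth in MB/s, upper bound on frames per second)
    if every frame performs one full frame-buffer readback."""
    total_bytes = width * height * bytes_per_pixel
    seconds = readback_ms / 1000.0
    bandwidth_mb_s = total_bytes / seconds / 1e6
    max_fps = 1000.0 / readback_ms
    return bandwidth_mb_s, max_fps


# 1Kx1K frame-buffer, 18 ms per readback (from the slide),
# 4 bytes per pixel (assumed).
bandwidth_mb_s, max_fps = readback_stats(1024, 1024, 4, 18.0)
```

Under these assumptions the readback alone caps the frame rate at roughly 55 frames per second before any rendering or collision work is done.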

15 [Chart: GPU growth rate, CPU growth rate, and AGP bandwidth growth rate. Courtesy: Anselmo Lastra]

16 Outline
- Overview
- Interactive Collision Detection
- Conclusions and Future Work

17 Features
- Interactive collision detection between complex objects
  - Large number of objects
  - High primitive count
  - Non-convex objects
  - Open and closed objects

18 Non-rigid Motion
- Deformable objects
- Changing topology
- Self-collisions

19 Related Work
- Object-space techniques
- Image-space techniques

20 Object-Space Techniques
- Broad phase – Compute object pairs in close proximity
  - Spatial partitioning
  - Sweep-and-prune
- Narrow phase – Check each pair for exact collision detection
  - Convex objects
  - Spatial partitioning
  - Bounding volume hierarchies
- Surveys in [Klosowski 1998, Redon et al. 2002, Lin and Manocha 2003]
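For readers unfamiliar with the broad phase, the following is a simplified, single-axis sketch of sweep-and-prune. It is not taken from the talk: real implementations typically sort bounding-box endpoints on all three axes and update the sort incrementally between frames. The function name and the interval representation are assumptions.

```python
def sweep_and_prune(intervals):
    """Report index pairs (i, j), i < j, whose 1-D intervals overlap.

    `intervals` is a list of (lo, hi) extents of each object's bounding
    box along one axis. Objects are swept in order of increasing `lo`.
    """
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    active = []      # objects whose interval may still overlap later ones
    pairs = set()
    for i in order:
        lo, _ = intervals[i]
        # Drop objects that end before the current one starts.
        active = [j for j in active if intervals[j][1] >= lo]
        for j in active:
            pairs.add((min(i, j), max(i, j)))
        active.append(i)
    return pairs


candidate_pairs = sweep_and_prune([(0, 2), (1, 3), (4, 5), (2.5, 4.5)])
```

Only the reported candidate pairs would be passed on to the narrow phase for exact tests.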

21 Limitations of Object-Space Techniques
- Considerable pre-processing
- Hard to achieve real-time performance on complex deformable models

22 Collision Detection using Graphics Hardware
- Primitive rasterization – sorting in screen-space
- Interference tests

23 Image-Space Techniques
Use of graphics hardware:
- CSG rendering [Goldfeather et al. 1989, Rossignac et al. 1990]
- Interferences and cross-sections [Shinya and Forgue 1991, Rossignac et al. 1992, Myszkowski 1995, Baciu et al. 1998]
- Minkowski sums [Kim et al. 2002]
- Cloth animation [Vassilev et al. 2001]
- Virtual surgery [Lombardo et al. 1999]
- Proximity computation [Hoff et al. 2001, 2002]

24 Limitations of Image-Space Techniques
- Pairs of objects
- Stencil-based; limited to closed models
- Image precision
- Frame buffer readbacks

25 Collision Detection: Outline
- Overview
- Collision Detection: CULLIDE
- Inter- and Intra-Object Collision Detection: Quick-CULLIDE
- Reliable Collision Detection: FAR
- Analysis

26 Overview
- Potentially Colliding Set (PCS) computation
- Exact collision tests on the PCS

27 Algorithm
Object-Level Pruning → Sub-object-Level Pruning → Exact Tests
- Pruning stages: GPU-based PCS computation
- Exact tests: using CPU

28 Potentially Colliding Set (PCS) [Figure: PCS]

29 Potentially Colliding Set (PCS) [Figure: PCS]

30 Algorithm
Object-Level Pruning → Sub-object-Level Pruning → Exact Tests

31 Visibility Computations
Lemma 1: An object O does not collide with a set of objects S if O is fully visible with respect to S.

32 Visibility of Objects
An object is fully visible if it is completely in front of the remaining objects.
[Figure with labels O, $O_1$, $O_2$, and View]

33 Visibility for Collisions: Geometric Interpretation
Sufficient but not a necessary condition for the existence of a separating surface with unit depth complexity.
[Figure with labels O, $O_1$, $O_2$, and View]

34 PCS Pruning
Lemma 2: Given n objects $O_1, O_2, \dots, O_n$, an object $O_i$ does not belong to the PCS if it does not collide with $O_1, \dots, O_{i-1}, O_{i+1}, \dots, O_n$.
- Prune objects that do not collide

35 PCS Pruning [Figure: objects $O_1, O_2, \dots, O_{i-1}, O_i, O_{i+1}, \dots, O_{n-1}, O_n$]

36 PCS Computation
- Each object tested against all objects but itself
- Naive algorithm is $O(n^2)$
- Linear time algorithm
  - Uses two-pass rendering approach
  - Conservative solution

37 PCS Computation: First Pass [Figure: render $O_1, O_2, \dots, O_{i-1}, O_i, O_{i+1}, \dots, O_{n-1}, O_n$]

38 PCS Computation: First Pass [Figure: render $O_1, O_2, \dots, O_{i-1}, O_i$. Fully visible? Yes: $O_i$ does not collide with $O_1, O_2, \dots, O_{i-1}$]

39 PCS Computation: First Pass [Figure: render $O_1, O_2, \dots, O_{i-1}, O_i, O_{i+1}, \dots, O_{n-1}, O_n$. Fully visible?]

40 PCS Computation: Second Pass [Figure: render $O_1, O_2, \dots, O_{i-1}, O_i, O_{i+1}, \dots, O_{n-1}, O_n$]

41 PCS Computation: Second Pass [Figure: render $O_i, O_{i+1}, \dots, O_{n-1}, O_n$. Fully visible? Yes: $O_i$ does not collide with $O_{i+1}, \dots, O_{n-1}, O_n$]

42 PCS Computation: Second Pass [Figure: render $O_1, O_2, \dots, O_{i-1}, O_i, O_{i+1}, \dots, O_{n-1}, O_n$. Fully visible?]

43 PCS Computation [Figure: $O_1, O_2, \dots, O_{i-1}, O_i, O_{i+1}, \dots, O_{n-1}, O_n$, with the fully visible objects marked]

44 PCS Computation [Figure: the set $O_1, O_2, O_3, \dots, O_{i-1}, O_i, O_{i+1}, \dots, O_{n-2}, O_{n-1}, O_n$ pruned to $O_1, O_3, \dots, O_{i-1}, O_{i+1}, \dots, O_{n-1}$]
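The two-pass pruning walked through on the slides above can be sketched in software. This is a CPU stand-in under stated assumptions, not the GPU implementation: each object is a hypothetical "footprint" mapping a pixel to the nearest and farthest depth the object covers there, the depth buffer is a dictionary, and "fully visible" means every fragment lies strictly in front of what has already been rendered.

```python
from typing import Dict, Iterable, List, Set, Tuple

Pixel = Tuple[int, int]
# Hypothetical representation: pixel -> (nearest depth, farthest depth).
Footprint = Dict[Pixel, Tuple[float, float]]


def fully_visible_in_pass(objects: List[Footprint], order: Iterable[int]) -> Set[int]:
    """Test each object for full visibility against the objects rendered
    before it in `order`, then render it into the depth buffer."""
    depth_buffer: Dict[Pixel, float] = {}
    visible: Set[int] = set()
    for i in order:
        footprint = objects[i]
        if all(z_far < depth_buffer.get(p, float("inf"))
               for p, (_, z_far) in footprint.items()):
            visible.add(i)
        for p, (z_near, _) in footprint.items():
            depth_buffer[p] = min(depth_buffer.get(p, float("inf")), z_near)
    return visible


def potentially_colliding_set(objects: List[Footprint]) -> List[int]:
    """Prune objects that are fully visible in both passes."""
    n = len(objects)
    first = fully_visible_in_pass(objects, range(n))             # O_1 .. O_n
    second = fully_visible_in_pass(objects, reversed(range(n)))  # O_n .. O_1
    return [i for i in range(n) if i not in (first & second)]


scene = [
    {(0, 0): (1.0, 2.0)},   # overlaps the next object in depth
    {(0, 0): (1.5, 3.0)},
    {(5, 5): (1.0, 2.0)},   # isolated object
]
pcs = potentially_colliding_set(scene)
```

Each object is tested and rendered once per pass, so the work is linear in the number of objects. The result is conservative: an object that is not fully visible in both passes stays in the set even if an exact test would later show it is collision-free.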

45 Algorithm
Object-Level Pruning → Sub-object-Level Pruning → Exact Tests

46 CULLIDE Algorithm
Object-Level Pruning → Sub-object-Level Pruning → Exact Tests
- Exact overlap tests using CPU

47 Full Visibility Queries on GPUs
- We require a query that tests if a primitive is fully visible or not
- Current hardware supports occlusion queries
  - These test whether any part of a primitive is visible
- Our solution
  - Change the sign of the depth function

48 Full Visibility Queries on GPUs
[Table comparing the depth functions GEQUAL and LESS by whether all fragments pass or fail; its entries include "Query not supported" and "Occlusion query"]
Examples: HP_occlusion_test, NV_occlusion_query
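The idea of changing the sign of the depth function can be illustrated with a small software model. This is a sketch under assumptions, not the hardware API: fragments are hypothetical `(pixel, depth)` pairs, the depth buffer is a dictionary cleared to infinity, and `occlusion_query` mimics a query that counts fragments passing the depth test.

```python
import operator

# Software stand-ins for the two depth functions named on the slide.
DEPTH_FUNCS = {"LESS": operator.lt, "GEQUAL": operator.ge}


def occlusion_query(fragments, depth_buffer, func):
    """Count the fragments that pass the depth test `func`."""
    test = DEPTH_FUNCS[func]
    return sum(1 for pixel, z in fragments
               if test(z, depth_buffer.get(pixel, float("inf"))))


def is_fully_visible(fragments, depth_buffer):
    """With the reversed test (GEQUAL), zero passing fragments means no
    fragment lies at or behind the buffer, so the primitive is fully visible."""
    return occlusion_query(fragments, depth_buffer, "GEQUAL") == 0


depth_buffer = {(0, 0): 0.5}
in_front = [((0, 0), 0.3), ((1, 0), 0.9)]    # nothing behind existing geometry
straddling = [((0, 0), 0.3), ((0, 0), 0.7)]  # one fragment behind it

less_counts = (occlusion_query(in_front, depth_buffer, "LESS"),
               occlusion_query(straddling, depth_buffer, "LESS"))
```

With LESS, both primitives report a non-zero count, so the query only says that some part is visible. With GEQUAL, a zero count distinguishes the fully visible primitive.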

49 Bandwidth Analysis
- Read back only integer identifiers
- Computation at high screen resolutions

50 Live Demo: CULLIDE
- Laptop
  - 1.6 GHz Pentium IV CPU
  - NVIDIA GeForce FX 700 GoGL
  - AGP 4X

51 Live Demo: CULLIDE
- Environment
  - Dragon – 250K polygons
  - Bunny – 35K polygons
- Average frame rate – 15 frames per second!

52 Interactive Collision Detection: Outline
- Overview
- Collision Detection: CULLIDE
- Inter- and Intra-Object Collision Detection: Quick-CULLIDE
- Reliable Collision Detection: FAR
- Analysis

53 Quick-CULLIDE
- Improved two-pass algorithm
- Utilize visibility relationships among objects across different views

54 Quick-CULLIDE: Visibility Sets
- Decompose PCS into four disjoint sets
  - FFV (First pass Fully Visible)
  - SFV (Second pass Fully Visible)
  - NFV (Not Fully Visible in either pass)
  - BFV (Both passes Fully Visible)
- Visibility sets have five interesting properties!
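The decomposition into four sets can be written down directly. This sketch assumes that, because the sets are disjoint, FFV holds objects fully visible in the first pass only and SFV those fully visible in the second pass only; the function name and the input sets are hypothetical.

```python
def visibility_sets(objects, first_pass_fv, second_pass_fv):
    """Split `objects` into the disjoint sets FFV, SFV, BFV, and NFV, given
    the sets of objects found fully visible in each pass."""
    sets = {"FFV": set(), "SFV": set(), "BFV": set(), "NFV": set()}
    for obj in objects:
        in_first = obj in first_pass_fv
        in_second = obj in second_pass_fv
        if in_first and in_second:
            sets["BFV"].add(obj)      # fully visible in both passes
        elif in_first:
            sets["FFV"].add(obj)      # first pass only (assumed reading)
        elif in_second:
            sets["SFV"].add(obj)      # second pass only (assumed reading)
        else:
            sets["NFV"].add(obj)      # not fully visible in either pass
    return sets


result = visibility_sets(["a", "b", "c", "d"],
                         first_pass_fv={"a", "c"},
                         second_pass_fv={"b", "c"})
```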

55 Visibility Sets: Properties
Lemma 1: FFV and SFV are collision-free sets

56 PCS Computation: First Pass [Figure: render $O_1, O_2, \dots, O_{i-1}, O_i, \dots, O_j, \dots, O_{n-1}, O_n$]

57 PCS Computation: First Pass [Figure: render $O_1, O_2, \dots, O_{i-1}, O_i, \dots, O_j, \dots, O_{n-1}, O_n$, with the fully visible objects marked]

58 Visibility Sets: Properties
Lemma 2: It is sufficient to test visibility of objects in FFV in second pass only

59 PCS Computation: First Pass [Figure: $O_1, O_2, \dots, O_{i-1}, O_i, O_{i+1}, \dots, O_{n-1}, O_n$]

60 PCS Computation: First Pass [Figure: render $O_1, O_2, \dots, O_{i-1}, O_i$]

61 PCS Computation: First Pass [Figure: $O_1, O_2, \dots, O_{i-1}, O_i, O_{i+1}, \dots, O_{n-1}, O_n$, labeled "Not colliding" and "Collision tested in second pass"]

62 Visibility Sets: Properties
Lemma 3: It is sufficient to render objects in FFV in first pass only!

63 PCS Computation: First Pass [Figure: $O_1, O_2, \dots, O_{i-1}, O_i, O_{i+1}, \dots, O_{n-1}, O_n$]

64 PCS Computation: First Pass [Figure: render $O_1, O_2, \dots, O_{i-1}, O_i$]

65 PCS Computation [Figure: render $O_1, O_2, \dots, O_{i-1}, O_i, O_{i+1}, \dots, O_{n-1}, O_n$, labeled "Not colliding"]

66 Visibility Sets: Properties
Lemma 4: It is sufficient to test the visibility of objects in SFV in first pass only!

67 Visibility Sets: Properties
Lemma 5: It is sufficient to render objects in SFV in second pass only!

68 Quick-CULLIDE: Advantages
- Better culling efficiency
  - Lower depth complexity than CULLIDE
  - Always better than CULLIDE
- Faster computational performance
  - Lower number of visibility queries and rendering operations
- Can handle self-collisions

69 Self-Collisions: Definition
Pairs of overlapping triangles in an object that are not neighboring

70 Self-Collisions: Definition
Pairs of overlapping triangles in an object that are not neighboring

71 Self-Collisions
- Occur in most deformable simulations
[Figure: artifacts. Image courtesy: Baraff and Witkin, SIGGRAPH 2003]

72 Our Solution
- Classification of contacts between triangles in an object
  - Touching contacts
  - Penetrating contacts

73 Contacts: Classification [Figure with panels (a), (b), (c), labeled "Touching Contacts" and "Penetrating Contact"]

74 Solution
- Ignore touching contacts
- Consider only penetrating contacts
- Redefine fully visible
  - We pass a fragment when a touching contact occurs
  - Pass all fragments with depth ≤ corresponding depths in frame-buffer
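The redefinition of "fully visible" amounts to switching from a strict to a non-strict depth comparison. The following is a minimal software sketch under assumptions: fragments are hypothetical `(pixel, depth)` pairs and the depth buffer is a dictionary cleared to infinity.

```python
def is_fully_visible(fragments, depth_buffer, ignore_touching=True):
    """Return True if every fragment passes the depth test.

    With ignore_touching=True a fragment passes when its depth is <= the
    stored depth, so a touching contact (equal depth) does not count as a
    collision. With ignore_touching=False the comparison is strict.
    """
    def passes(z, stored):
        return z <= stored if ignore_touching else z < stored

    return all(passes(z, depth_buffer.get(pixel, float("inf")))
               for pixel, z in fragments)


depth_buffer = {(0, 0): 0.4}
touching = [((0, 0), 0.4)]      # same depth as the neighbouring triangle
penetrating = [((0, 0), 0.6)]   # lies behind the stored depth
```

Neighbouring triangles in a mesh share edges and vertices, so without the relaxed comparison every triangle would appear to collide with its neighbours.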

75 Live Demo: Quick-CULLIDE
- Laptop
  - 1.6 GHz Pentium IV CPU
  - NVIDIA GeForce FX 700 GoGL
  - AGP 4X

76 Live Demo: Cloth Simulation
- Cloth – 20K triangles
- Average frame rate – 13 frames per second!

77 Interactive Collision Detection: Outline
- Overview
- Collision Detection: CULLIDE
- Self-Collision Detection: S-CULLIDE
- Reliable Collision Detection: FAR
- Analysis

78 Inaccuracies in GPU-Based Algorithms
- Image sampling
- Depth buffer precision

79 Image Sampling
Occurs when a primitive is nearly parallel to view direction

80 Image Sampling
Primitives are rasterized but no intersecting points are sampled by hardware
[Figure: viewport with pixel centers C and the intersecting point]

81 Depth Buffer Precision
Intersecting points are sampled but precision is not sufficient
[Figure: viewport with pixel centers C, the intersecting point, and triangle $T_1$]

82 Our Solution
- Sufficiently fatten the triangles
- Use Minkowski sums
Minkowski sum: $A \oplus B = \{a + b : a \in A,\ b \in B\}$
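As a small worked instance of the definition, the Minkowski sum of two finite point sets can be computed by adding every pair of points. The talk applies the operation to triangles and spheres; the finite sets `P` and `L` below are illustrative only.

```python
from itertools import product


def minkowski_sum(A, B):
    """Minkowski sum {a + b : a in A, b in B} of two finite point sets,
    each given as an iterable of equal-length coordinate tuples."""
    return {tuple(x + y for x, y in zip(a, b)) for a, b in product(A, B)}


P = {(0, 0), (1, 0), (0, 1)}   # corners of a small triangle
L = {(0, 0), (2, 0)}           # endpoints of a horizontal segment

P_plus_L = minkowski_sum(P, L)
```

Each point of `P` appears once at its original position and once shifted by (2, 0), which is the discrete analogue of sweeping the triangle along the segment.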

83 Minkowski Sum: Example [Figure: P, L, and their Minkowski sum $P \oplus L$]

84 Reliability
Lemma 1: Under orthographic transformation O, the rasterization of the Minkowski sum $Q^S = Q \oplus S$, where Q is a point in 3-D space that projects inside a pixel X and S is a sphere bounding a pixel centered at the origin, generates two samples for X that bound the depth value of Q.

85–91 Reliability
Under orthographic transformation O, the rasterization of the Minkowski sum $Q^S = Q \oplus S$, where Q is a point in 3-D space that projects inside a pixel X and S is a sphere centered at the origin bounding a pixel, samples X with at least two fragments bounding the depth value of Q.
[Figure sequence in the x–z plane: the point Q, pixel X, the sphere S, and the resulting sample depths]
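The statement about a point Q can be checked numerically. This sketch makes several assumptions that the slides do not spell out: pixels are squares of width `pixel_width`, the pixel is sampled at its center, the view direction is the z axis, and the sphere bounding a pixel has a radius of half the pixel diagonal. The function name is hypothetical.

```python
import math


def sample_depths(q, pixel_center, pixel_width=1.0):
    """Depths at which the ray through `pixel_center` (orthographic, along z)
    enters and leaves the sphere of Q + S centered at q = (x, y, z).

    Returns None if the ray misses the sphere.
    """
    radius = pixel_width * math.sqrt(2.0) / 2.0   # half the pixel diagonal
    dx = q[0] - pixel_center[0]
    dy = q[1] - pixel_center[1]
    dist_sq = dx * dx + dy * dy
    if dist_sq > radius * radius:
        return None
    half_chord = math.sqrt(max(0.0, radius * radius - dist_sq))
    return q[2] - half_chord, q[2] + half_chord


# Q projects inside the unit pixel [0, 1] x [0, 1] whose center is (0.5, 0.5).
q = (0.9, 0.1, 5.0)
near, far = sample_depths(q, (0.5, 0.5))
```

Because any point inside the pixel is within half a diagonal of the pixel center, the ray through the center always hits the sphere, and the entry and exit depths lie on either side of the depth of Q.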

92 Reliability
Lemma 2: Given a primitive $P$ and its Minkowski sum $P^S = P \oplus S$, let $X$ be a pixel partly or fully covered by the orthographic projection of $P$, and let
- $P_X = \{p \in P : p \text{ projects inside } X\}$,
- Min-Depth(P, X) = minimum depth value in $P_X$,
- Max-Depth(P, X) = maximum depth value in $P_X$.
The rasterization of $P_X^S = P_X \oplus S$ generates at least two fragments whose depth values bound both Min-Depth(P, X) and Max-Depth(P, X) for each pixel X.

93 Reliability [Figure, x–z plane: primitive P]
Given a primitive $P$

94 Reliability [Figure, x–z plane: pixel X and $P_X$]
$P_X$ is the portion of $P$ projecting inside pixel X

95 Reliability [Figure, x–z plane: sphere S, pixel X, and $P_X$]
S is a sphere centered at origin bounding pixel X

96 Reliability [Figure, x–z plane: pixel X, $P_X$, and $P_X^S$]
If we compute the Minkowski sum $P_X^S = P_X \oplus S$,

97 Reliability [Figure, x–z plane: pixel X, $P_X$, $P_X^S$, and the sample depths]
then the rasterization of the Minkowski sum $P_X^S$ generates two fragments

98 Reliability [Figure, x–z plane: pixel X, $P_X$, $P_X^S$, and the sample depths]
and the fragments bound depth values in $P_X$

99 Reliability
Theorem 1: Given the Minkowski sums of two primitives with S, $P_1^S = P_1 \oplus S$ and $P_2^S = P_2 \oplus S$: if $P_1$ and $P_2$ overlap, then the rasterizations of their Minkowski sums under orthographic projection overlap in the viewport.

100 Reliability [Figure, x–z plane: pixel X, $P_1$, and $P_2$]
Given two primitives $P_1$ and $P_2$

101 Reliability [Figure: pixel X, $P_1$, and $P_2$ intersecting in 3-D]
If $P_1$ and $P_2$ intersect in 3-D,

102 Reliability [Figure: pixel X, $P_1$, and $P_2$]
and we compute their Minkowski sums with a pixel-sized sphere centered at origin,

103 Reliability [Figure, x–z plane: pixel X, $P_1$, and $P_2$]
the rasterizations of the Minkowski sums overlap in image-space

104 Reliability
Corollary 1: Given the Minkowski sums of two primitives with S, $P_1^S = P_1 \oplus S$ and $P_2^S = P_2 \oplus S$: if the rasterizations of $P_1^S$ and $P_2^S$ under orthographic projection do not overlap in the viewport, then $P_1$ and $P_2$ do not overlap in 3-D.
- Useful in collision culling: apply fattened primitives $P_1^S$ in CULLIDE

105 [Figure, x–z plane: pixel X, primitives $P_1$ and $P_2$, and their Minkowski sums $P_1^S$ and $P_2^S$]

106 Bounding Offsets of a Triangle
- Exact Offsets
  - Three edge-aligned cylinders, three spheres, two triangles
  - Can be rendered using fragment programs
  - Expensive!
- Oriented Bounding Box (OBB)
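One way to bound the offset of a triangle with an OBB can be sketched as follows. The slides do not describe the construction, so the choice of axes here (first edge direction, triangle normal, and their cross product) is an assumption, as are all function names. The box is conservative: any point within distance `r` of the triangle projects within `r` of the triangle's extent on every axis.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    length = math.sqrt(dot(a, a))   # zero for a degenerate triangle
    return tuple(x / length for x in a)


def fattened_obb(tri, r):
    """Return (axes, bounds) of an oriented box enclosing every point
    within distance r of the non-degenerate triangle `tri`."""
    v0, v1, v2 = tri
    u = normalize(sub(v1, v0))                       # along the first edge
    n = normalize(cross(sub(v1, v0), sub(v2, v0)))   # triangle normal
    w = cross(n, u)                                  # completes the frame
    axes = (u, w, n)
    bounds = []
    for axis in axes:
        proj = [dot(v, axis) for v in tri]
        bounds.append((min(proj) - r, max(proj) + r))
    return axes, bounds


def contains(obb, p):
    axes, bounds = obb
    return all(lo <= dot(p, axis) <= hi for axis, (lo, hi) in zip(axes, bounds))


triangle = ((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 1.0, 0.0))
obb = fattened_obb(triangle, 0.5)
```

A box like this has only six faces to rasterize, compared with the cylinders, spheres, and triangles of the exact offset, at the cost of being a looser bound.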

107 OBB Construction [Figure with panels (a), (b), (c), (d)]

108 Union of OBBs

109 Live Demo: FAR
- Laptop
  - 1.6 GHz Pentium IV CPU
  - NVIDIA GeForce FX 700 GoGL
  - AGP 4X

110 Live Demo: FAR
- Environment
  - Tree – 4000 triangles
  - Leaf – 200 triangles, 200 leaves
  - Scene – 44K triangles
- Average frame rate – 15 frames per second!

111 Interactive Collision Detection: Outline
- Overview
- Collision Detection: CULLIDE
- Self-Collision Detection: S-CULLIDE
- Reliable Collision Detection: FAR
- Analysis
  - Performance
  - Pruning efficiency
  - Precision

112 Analysis: Performance
- Based on pruning algorithm in CULLIDE
- Factors
  - Output size
  - Rasterization optimizations
  - Number of objects
  - Number of triangles per object
  - Image resolution

113 Analysis: Performance [Chart: collision time (in ms) vs. number of objects; NV30 GPU, Pentium IV 2 GHz CPU]

114 Analysis: Performance [Chart: collision time (in ms) vs. number of polygons; NV30 GPU, Pentium IV 2 GHz CPU]

115 Analysis: Performance [Chart: collision time (in ms) vs. screen resolution; NV30 GPU, Pentium IV 2 GHz CPU]

116 Analysis: Pruning Efficiency
- Input complexity
- Relative object configurations
- Pruning efficiency in
  - Object-Level Culling
  - Subobject-Level Culling

117 Comparison: FAR and I-COLLIDE

118 Analysis: Accuracy
- CULLIDE and S-CULLIDE: Image resolution
- FAR: IEEE 32-bit floating-point precision
- Comparison: FAR vs. CULLIDE

119 Accuracy: FAR vs. CULLIDE

120 Outline
- Overview
- Interactive Collision Detection
- Conclusions and Future Work

121 Conclusions
- Designed efficient geometric algorithms on GPUs
  - Interactive collision detection
- Applied them to complex 3-D environments
- Compared to prior state-of-the-art algorithms
  - Significant speedups in some cases

122 Advantages
- Generality
- Accuracy
  - Image-precision for shadow generation algorithms
  - IEEE 32-bit floating-point precision for collision computations
- Low bandwidth
  - No readbacks

123 Advantages
- Significant culling
- Practicality
  - Designed on commodity hardware
  - Assumes availability of occlusion queries

124 Limitations
- Precision
  - Shadow and self-collision algorithms are limited by image-precision
  - Accuracy can be improved
- Pair computation
  - Algorithms compute potential sets

125 Thank You
Questions or Comments?