
1 The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL Cache-Oblivious Mesh Layouts. Sung-Eui Yoon (1), Peter Lindstrom (2), Valerio Pascucci (2), Dinesh Manocha (1). 1: University of North Carolina - Chapel Hill; 2: Lawrence Livermore National Laboratory. http://gamma.cs.unc.edu/COL

2 Goal: Compute cache-coherent layouts of polygonal meshes ♦ For geometric processing and visualization ♦ Handle any kind of polygonal model (e.g., irregular geometry)

3 Motivation: High growth rate of computational power of CPUs and GPUs. Figure: growth rate during 1993 – 2004. Courtesy: http://www.hcibook.com/e3/online/moores-law/

4 Memory Hierarchies and Caches: CPU or GPU, fast memory or cache, slow memory, and disk, with data moved between levels by block transfer. Access time: 10^0 ns (fast memory or cache), 10^2 ns (slow memory), 10^6 ns (disk)

5 Cache-Coherent Layouts: Cache-aware ♦ Optimized for particular cache parameters (e.g., block size). Cache-oblivious ♦ Minimizes data access time without any knowledge of cache parameters ♦ Directly applicable to various hardware and memory hierarchies

6 CAD Model – Double Eagle Tanker Model: 82 million triangles; irregular distribution of geometry

7 Isosurface and Scanned Models: Isosurface, 100M triangles; St. Matthew, 372M triangles

8 Main Contribution: Algorithm to compute cache-oblivious layouts of polygonal meshes ♦ Cache-oblivious metric ♦ Multilevel optimization framework ♦ Applicable to hierarchical representations

9 Live Demo – View-Dependent Rendering (VDR) on a GeForce Go 6800 Ultra: Based on a multiresolution hierarchy ♦ Dynamically computes simplification ♦ Cache-oblivious layout is used to minimize GPU vertex cache misses

10 Related Work: Cache-coherent algorithms; mesh layouts

11 Cache-Coherent Algorithms: Cache-aware [Coleman and McKinley 95, Vitter 01, Sen et al. 02]; cache-oblivious [Frigo et al. 99, Arge et al. 04]. These focus on specific problems such as sorting and linear algebra computations

12 Mesh Layouts: Rendering sequences ♦ Triangle strips ♦ [Deering 95, Hoppe 99, Bogomjakov and Gotsman 02] Processing sequences ♦ [Isenburg and Gumhold 03, Isenburg and Lindstrom 04] These assume that the access pattern globally follows the layout order!

13 Mesh Layouts: Space-filling curves ♦ [Sagan 94, Velho and Gomes 91, Pascucci and Frank 01, Lindstrom and Pascucci 01, Gopi and Eppstein 04] These assume geometric regularity!

14 Outline: Overview; Cache-oblivious metric; Results

15 Outline: Overview; Cache-oblivious metric; Results

16 Overview: Multilevel optimization; cache-oblivious metric; local permutations. Figure: an input graph on vertices v_a, v_b, v_c, v_d and the resulting 1D layout of the same vertices

17 Graph-based Representation: Undirected graph, G = (V, E) ♦ Represents access patterns of applications. Vertex ♦ Data element (e.g., mesh vertex or mesh triangle). Edge ♦ Connects two vertices if they are likely to be accessed sequentially. Figure: example graph on v_a, v_b, v_c, v_d

18 Problem Statement: Vertex layout of G = (V, E) ♦ One-to-one mapping of vertices to indices in the 1D layout. Compute a layout that minimizes the expected number of cache misses. Figure: the graph on v_a, v_b, v_c, v_d mapped to layout indices 1, 2, 3, 4

19 Local Permutation: Vertex layout

20 Terminology: Layout mapping ♦ Assigns each vertex its index in the 1D layout. Edge span of (v_a, v_b) ♦ Absolute difference between the layout indices of v_a and v_b

21 Terminology ♦ Set of edges having edge span i in the layout

22 Terminology: Edge span distribution ♦ Number of edges at each edge span i, where i is in [1, n]. Figure: histogram of number of edges against edge span
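The terminology on slides 20 to 22 can be sketched in a few lines of C++. This is a minimal illustration, not code from the talk: the names `Edge`, `edge_span` and `edge_span_distribution`, and the choice of 1-based layout indices stored in `position`, are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <utility>
#include <vector>

// An undirected edge between two vertex ids in [0, n).
using Edge = std::pair<int, int>;

// Edge span of an edge: absolute difference of its endpoints' layout
// indices. position[v] is the (assumed 1-based) index of vertex v.
int edge_span(const std::vector<int>& position, const Edge& e) {
    return std::abs(position[e.first] - position[e.second]);
}

// Edge span distribution: histogram[i] is the number of edges whose span
// is i. Index 0 is unused because distinct vertices have span >= 1.
std::vector<int> edge_span_distribution(const std::vector<int>& position,
                                        const std::vector<Edge>& edges) {
    std::vector<int> histogram(position.size(), 0);
    for (const Edge& e : edges) {
        int span = edge_span(position, e);
        assert(span >= 1 && span < static_cast<int>(position.size()));
        ++histogram[span];
    }
    return histogram;
}
```

For a four-vertex cycle laid out in order 1, 2, 3, 4, three edges have span 1 and the closing edge has span 3.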

23 Cache Miss Ratio Function (CMRF): Probability of a cache miss for a given edge span i. Figure: cache miss ratio (probability of a cache miss, from 0 to 1) against edge span i (from 1 to n-1)

24 Number of Cache Misses at Runtime: Estimated by multiplying two factors ♦ Runtime edge span distribution ♦ CMRF. Figure: a 1D layout accessed along edges of span 2, span 4 and span 2, with the CMRF values for those spans summed

25 Number of Cache Misses at Runtime: Figure: the same 1D layout with edge spans 2, 4 and 2, with the sum annotated as runtime edge span distribution and CMRF

26 Expected Number of Cache Misses ♦ Approximate the runtime edge span distribution with that of the layout. Equation labels: edge span distribution of the layout; the number of vertices
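The estimate described on slides 24 to 26 (weight the CMRF by how often each edge span occurs) can be sketched as follows. The slide's own equation is not preserved in the transcript, so the normalisation by the total number of edges, the function names, and the step-shaped example CMRF mentioned afterwards are all assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// histogram[i] = number of layout edges with span i (index 0 unused).
// cmrf(i)      = assumed probability of a cache miss for edge span i.
// Returns the estimated miss probability of one access along an edge
// drawn uniformly from the layout's edges.
double expected_miss_ratio(const std::vector<int>& histogram,
                           const std::function<double(int)>& cmrf) {
    long total_edges = 0;
    double weighted = 0.0;
    for (std::size_t i = 1; i < histogram.size(); ++i) {
        total_edges += histogram[i];
        weighted += histogram[i] * cmrf(static_cast<int>(i));
    }
    assert(total_edges > 0);
    return weighted / static_cast<double>(total_edges);
}

// Scales the per-access ratio by a number of runtime accesses.
double expected_cache_misses(const std::vector<int>& histogram,
                             const std::function<double(int)>& cmrf,
                             long num_accesses) {
    return num_accesses * expected_miss_ratio(histogram, cmrf);
}
```

With a hypothetical step CMRF that misses whenever the span is 2 or more, a histogram of three span-1 edges and one span-3 edge gives a ratio of 0.25, or 25 expected misses per 100 accesses.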

27 Outline: Overview; Cache-oblivious metric; Results

28 Cache-Oblivious Metric: Decides whether a local permutation reduces the number of cache misses ♦ Probabilistic formulation ♦ Reduces to geometric volume computation

29 Does a Local Permutation Decrease Cache Misses?

30 Does a Local Permutation Decrease Cache Misses?

31 Monotonicity of CMRF: Assume the CMRF is a monotonically increasing function of edge span. Figure: cache miss ratio (from 0 to 1) against edge span i (from 1 to ∞)

32 Exact Cache-Oblivious Metric: Equation labels: all the possible cache configurations; monotonicity of CMRF

33 Geometric Formulation: Figure: a half hyperspace and a closed hyperspace, each drawn on axes p_1 and p_2 from the origin

34 Geometric Volume Computation: Assume each CMRF to be equally likely. Half hyperspace (blue area) ♦ Space of CMRFs that reduce cache misses. Figure axes: p_1, p_2
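One way to build intuition for "the space of CMRFs that reduce cache misses" is to estimate its relative volume by sampling. The sketch below is a Monte Carlo stand-in, not the exact or centroid-based computation from the talk. It assumes a CMRF over spans 1..m is a non-decreasing vector with entries in [0, 1], draws such vectors uniformly by sorting independent uniform samples, and counts how often the permutation lowers the span-weighted miss estimate. The name `fraction_of_cmrfs_improved` and the `delta` encoding are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// delta[i] = (number of edges with span i+1 after the local permutation)
//          - (number of edges with span i+1 before it), for i in [0, m).
// Returns the fraction of sampled monotone CMRFs p for which
// sum_i delta[i] * p[i] < 0, i.e. the permutation reduces expected misses.
double fraction_of_cmrfs_improved(const std::vector<int>& delta,
                                  int num_samples, unsigned seed = 12345u) {
    assert(!delta.empty() && num_samples > 0);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::vector<double> p(delta.size());
    int improved = 0;
    for (int s = 0; s < num_samples; ++s) {
        // Sorting i.i.d. uniforms yields a uniform sample over
        // non-decreasing vectors in [0, 1]^m.
        for (double& value : p) value = unit(rng);
        std::sort(p.begin(), p.end());
        double change = 0.0;
        for (std::size_t i = 0; i < delta.size(); ++i) change += delta[i] * p[i];
        if (change < 0.0) ++improved;
    }
    return static_cast<double>(improved) / num_samples;
}
```

A permutation that turns one span-3 edge into a span-1 edge (`delta = {1, 0, -1}`) improves essentially every monotone CMRF, so the estimate is close to 1; the reverse change gives 0. A result above 0.5 corresponds to the larger of the two volumes.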

35 Geometric Volume Computation: Time complexity ♦ Exact: [Lasserre and Zeron 01] ♦ Approximate: [Kannan et al. 97]

36 Fast and Approximate Volume Comparison: Define a top polytope in the closed hyperspace ♦ Compute the centroid, C, of the top polytope. Figure: top polytope and centroid C on axes p_1, p_2

37 Fast and Approximate Volume Comparison: Use the centroid for approximate volume comparison ♦ The volume containing the centroid is likely to be larger. Figure: centroid C on axes p_1, p_2

38 Bound of Approximation: 0.1% ~ 0.3% compared to the exact metric

39 Final Approximate Metric: Centroid; pack non-zero to 1,…, m

40 Layout Optimization: Find an optimal layout that minimizes our metric ♦ Combinatorial optimization problem

41 Multilevel Minimization: Step 1: Coarsening

42 Multilevel Minimization: Step 2: Ordering of coarsest graph

43 Multilevel Minimization: Step 3: Refinement and local optimization
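The local optimization in step 3 can be pictured as a greedy search over local permutations. The sketch below covers only that step, on a single level, and makes two loud assumptions: the only local permutation tried is a swap of neighbouring slots, and the cost is a stand-in (sum of log edge spans) rather than the metric from the talk. `layout_cost` and `refine_layout` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;

// Stand-in cost (an assumption, not the talk's metric): sum of
// log(edge span), which increases monotonically with every span.
double layout_cost(const std::vector<int>& position,
                   const std::vector<Edge>& edges) {
    double cost = 0.0;
    for (const Edge& e : edges) {
        assert(e.first != e.second);  // self-loops have no span
        int span = std::abs(position[e.first] - position[e.second]);
        cost += std::log(static_cast<double>(span));
    }
    return cost;
}

// order[k] is the vertex stored at slot k of the 1D layout.
// Tries swapping neighbouring slots, keeps a swap only if it lowers the
// cost, and stops when a full pass finds no improving swap (greedy, so
// the result is a local optimum only).
std::vector<int> refine_layout(std::vector<int> order,
                               const std::vector<Edge>& edges) {
    const std::size_t n = order.size();
    std::vector<int> position(n);
    for (std::size_t k = 0; k < n; ++k) position[order[k]] = static_cast<int>(k);

    double best = layout_cost(position, edges);
    bool improved = true;
    while (improved) {
        improved = false;
        for (std::size_t k = 0; k + 1 < n; ++k) {
            std::swap(order[k], order[k + 1]);
            std::swap(position[order[k]], position[order[k + 1]]);
            double cost = layout_cost(position, edges);
            if (cost < best - 1e-12) {
                best = cost;
                improved = true;
            } else {  // undo the swap
                std::swap(order[k], order[k + 1]);
                std::swap(position[order[k]], position[order[k + 1]]);
            }
        }
    }
    return order;
}
```

On the path graph 0-1-2-3 with starting order 0, 2, 1, 3, a single accepted swap restores the order 0, 1, 2, 3, where every edge has span 1.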

44 Outline: Overview; Cache-oblivious layouts; Results

45 Layout Computation Time: Processes 70 million vertices per hour ♦ Takes 2.6 hours to lay out the St. Matthew model (372 million triangles) ♦ 2.4 GHz Pentium 4 PC with 1 GB main memory

46 Edge Span Distributions of Different Layouts: Figure: number of edges against edge span for the cache-oblivious layout, the spectral layout and the original layout

47 Applications: View-dependent rendering; collision detection; isocontour extraction

48 View-Dependent Rendering: Lay out vertices and triangles of CHPM [Yoon et al. 04] ♦ Reduces misses of the GPU vertex cache

49 View-Dependent Rendering

Model                 # of Tri.   Our layout   Simplification layout [Yoon et al. 04]
St. Matthew           372M        106 M/s      23 M/s
Isosurface            100M        90 M/s       20 M/s
Double Eagle Tanker   82M         47 M/s       22 M/s

Speedups annotated on the slide: 4.5X, 2.1X. Peak performance: 145 M tri / s on GeForce 6800 Ultra

50 Realtime Captured Video – St. Matthew Model

51 Comparison with Other Rendering Sequences: Our layout vs. universal rendering sequences [Bogomjakov and Gotsman 2002]

52 Comparison with Other Rendering Sequences: [Hoppe 99] ♦ Optimized for a 16-entry vertex cache with FIFO replacement. Our layout ♦ Optimized for no particular cache size
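Comparisons like this one are usually made by replaying a triangle index stream through a simulated vertex cache and counting misses. The following is a minimal sketch of such a simulator for a FIFO cache. It is not the measurement code behind the slide, and the function names are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Replays a vertex index stream through a FIFO cache holding at most
// cache_size vertices and returns the number of misses.
std::size_t count_fifo_misses(const std::vector<int>& indices,
                              std::size_t cache_size) {
    assert(cache_size > 0);
    std::deque<int> cache;  // front = oldest entry
    std::size_t misses = 0;
    for (int v : indices) {
        // A hit leaves a FIFO cache unchanged (no reordering, unlike LRU).
        if (std::find(cache.begin(), cache.end(), v) != cache.end()) continue;
        ++misses;
        if (cache.size() == cache_size) cache.pop_front();  // evict oldest
        cache.push_back(v);
    }
    return misses;
}

// Average cache miss ratio: misses per triangle for an indexed
// triangle list (three indices per triangle).
double average_cache_miss_ratio(const std::vector<int>& indices,
                                std::size_t cache_size) {
    assert(!indices.empty() && indices.size() % 3 == 0);
    return static_cast<double>(count_fifo_misses(indices, cache_size)) /
           (indices.size() / 3);
}
```

Running the same index stream with several values of `cache_size` shows how a layout tuned for one size behaves at others, which is the property a cache-oblivious layout aims to make robust.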

53 Performance during View-Dependent Rendering: Our layout ♦ Optimized for various resolutions. [Hoppe 99] ♦ Optimized for full resolution

54 Comparison with Space-Filling Curve on Power Plant Model: Our layout vs. space-filling curve (Z-curve)

55 Collision Detection: Bounding volume hierarchies ♦ Widely used to accelerate collision detection ♦ Traversed to find the contacting area ♦ Uses pre-computed layouts of OBB trees [Gottschalk et al. 96]

56 Rigid Body Simulation

57 Collision Detection Time: Depth-first layout vs. cache-oblivious layout; 2X on average

58 Isocontour Extraction: Contour tree [van Kreveld et al. 97] ♦ Use mesh as the input graph ♦ Extract an isocontour that is orthogonal to the z-axis. Model: Puget Sound, 134 M triangles; isocontour z(x,y) = 500m

59 Comparison – First Extraction of Z(x,y) = 500m: Relative performance over the Z-axis sorted layout, which is nearly optimized for this particular isocontour. Chart values: 2, 21, 13, 1. Disk access time is the bottleneck

60 Comparison – Second Extraction of Z(x,y) = 500m: Relative performance over the Z-axis sorted layout. Chart values: 2, 21, 13, 379, 212, 1, 0.8. Memory and L1/L2 cache access times are the bottleneck

61 Limitations: Assumptions on CMRF ♦ May not work well for all applications. Does not compute global optimum ♦ Greedy solution

62 Advantages: General ♦ Applicable to all kinds of polygonal models ♦ Works well for various applications. Cache-oblivious ♦ Can benefit every level of the hierarchy, from CPU/GPU cache to memory and disk. No modification of runtime application ♦ Only layout computation

63 OpenCCL: Cache-Coherent Layouts of Graphs and Meshes: Source code for computing a cache-coherent layout ♦ Easy to use. Example for a triangle with vertices 0, 1, 2:

    CLayoutGraph Graph (NumVertex);   // graph with NumVertex vertices
    Graph.AddEdge (0, 1);             // one edge per pair of vertices
    Graph.AddEdge (0, 2);             //   likely to be accessed together
    Graph.AddEdge (1, 2);
    int Order [NumVertex];            // receives the computed layout
    Graph.ComputeOrdering (Order);

Google “Cache Oblivious Mesh Layout” or http://gamma.cs.unc.edu/COL
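Once an ordering has been computed, the mesh itself has to be rewritten in that order before the unmodified runtime application loads it. The sketch below shows one way to do that. It does not use OpenCCL, and it assumes a convention the slide does not state: `order[k]` is taken to be the original id of the vertex placed at position k. `Vec3`, `Triangle` and `apply_vertex_order` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
using Triangle = std::array<int, 3>;  // three vertex indices

// ASSUMPTION: order[k] is the original id of the vertex that the layout
// places at position k. If a library reports the inverse mapping
// (position of each vertex), invert it before calling this function.
void apply_vertex_order(const std::vector<int>& order,
                        std::vector<Vec3>& vertices,
                        std::vector<Triangle>& triangles) {
    assert(order.size() == vertices.size());
    std::vector<int> new_index(order.size(), -1);
    std::vector<Vec3> reordered(order.size());
    for (std::size_t k = 0; k < order.size(); ++k) {
        int old_id = order[k];
        assert(old_id >= 0 && static_cast<std::size_t>(old_id) < order.size());
        assert(new_index[old_id] == -1);  // each vertex appears exactly once
        new_index[old_id] = static_cast<int>(k);
        reordered[k] = vertices[old_id];
    }
    vertices.swap(reordered);
    // Remap triangle corners so they still reference the same geometry.
    for (Triangle& t : triangles)
        for (int& corner : t) corner = new_index[corner];
}
```

The triangle list keeps its own order here; laying out triangles as well would need a second ordering computed over a graph whose vertices are triangles.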

64 Conclusion: Novel algorithm for computing cache-oblivious mesh layouts ♦ Casts the problem as an optimization ♦ Probabilistically computes the expected number of cache misses ♦ Achieves significant improvements (2 to 20X) without modifying runtime applications

65 Ongoing and Future Work: Apply to other applications ♦ Simplification and approximate collision detection [Yoon et al. 04] ♦ Shortest path computation, etc. Investigate optimality

66 Ongoing and Future Work: Cache-Oblivious Layouts of Bounding Volume Hierarchies [Yoon and Manocha 05] ♦ Tech. Report, University of North Carolina at Chapel Hill

67 Acknowledgements: Anonymous donor ♦ Power plant model. Digital Michelangelo Project at Stanford University ♦ St. Matthew model. LLNL ASCI VIEWS ♦ Isosurface model. Newport News Shipbuilding ♦ Double Eagle tanker

68 Acknowledgements: Army Research Office; DARPA; Intel Corporation; Lawrence Livermore Nat’l Lab.; National Science Foundation; Office of Naval Research; RDECOM

69 Acknowledgements: Martin Isenburg, Dawoon Jung, Brandon Lloyd, Elise London, Brian Salomon, Avneesh Sud

70 Questions? Project URL: http://gamma.cs.unc.edu/COL

