I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis Pankaj K. Agarwal, Lars Arge, and Ke Yi Duke University University of Aarhus.

Presentation transcript:

I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis Pankaj K. Agarwal, Lars Arge, and Ke Yi Duke University University of Aarhus

The Union-Find Problem
A universe of N elements: x_1, x_2, …, x_N
Initially N singleton sets: {x_1}, {x_2}, …, {x_N}
Each set has a representative
Maintain the partition under
–Union(x_i, x_j): joins the sets containing x_i and x_j
–Find(x_i): returns the representative of the set containing x_i

The Solution
[Figure: the sets drawn as rooted trees, with the roots marked as representatives]
Union(d, h): link-by-rank
Find(n): path compression
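The two techniques named on this slide can be sketched in memory as follows. This is a minimal illustration of the textbook disjoint-set forest, not code from the talk; the class name `UnionFind` and its method names are assumptions made for this example.

```python
class UnionFind:
    """In-memory disjoint-set forest with link-by-rank and path compression."""

    def __init__(self, elements):
        # Every element starts as the root (representative) of its own set.
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        """Join the sets of x and y; return False if the union is redundant."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        # Link-by-rank: hang the root of smaller rank below the other root.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```

Each `find` follows parent pointers to arbitrary elements, which is cheap in memory but can cost one I/O per pointer once the structure no longer fits in memory.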

Complexity
O(N α(N)) for a sequence of N union and find operations [Tarjan 75]
–α(): inverse Ackermann function (grows very slowly)
–Optimal in the worst case [Tarjan 79, Fredman and Saks 89]
Batched (off-line) version
–Entire sequence known in advance
–Can be improved to linear on a RAM [Gabow and Tarjan 85]
–Not possible on a pointer machine [Tarjan 79]

Simple and Good, as long as … The entire data structure fits in memory

The I/O Model
Main memory of size M
Disk of infinite size
One I/O transfers B items between memory and disk

Our Results
An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os (expected)
–Same as sorting
–Optimal in the worst case
A practical algorithm using O(sort(N) log(N/M)) I/Os
Applications to terrain analysis
–Topological persistence: O(sort(N)) I/Os
–Contour trees: O(sort(N)) I/Os
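To get a feel for the sort(N) bound, the following sketch evaluates (N/B) log_{M/B}(N/B) with all constant factors dropped. The function name `sort_io_bound`, the clamp to at least one pass, and the sample parameter values are assumptions made for illustration; N, M and B are all counted in items.

```python
import math

def sort_io_bound(n, m, b):
    """Evaluate (N/B) * log_{M/B}(N/B), ignoring constant factors."""
    blocks = n / b                 # number of disk blocks holding the input
    fan_out = m / b                # number of blocks that fit in memory
    # Assumption of this sketch: count at least one pass over the data.
    passes = max(1.0, math.log(blocks) / math.log(fan_out))
    return blocks * passes

# Hypothetical parameters: 2^30 items, memory for 2^20 items, blocks of 2^10 items.
bound = sort_io_bound(2**30, 2**20, 2**10)
print(bound, "block transfers, versus", 2**30, "if every operation cost one I/O")
```

With these hypothetical values the logarithm contributes only a factor of about two, so the bound is close to the cost of scanning the input a couple of times.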

I/O-Efficient Batched Union-Find
Assumption: no redundant unions
–Each union must join two different sets
–This assumption is removed later
Two-stage algorithm
–Convert to interval union-find: compute an order on the elements such that each union joins two adjacent sets
–Solve batched interval union-find

Union Graph
1: Union(d, g)
2: Union(a, c)
3: Union(r, b)
4: Union(a, e)
5: Union(e, i)
6: Union(r, a)
7: Union(a, d)
8: Union(d, h)
9: Union(b, f)
[Figure: the union graph on the elements r, a, b, c, d, e, f, g, h, i, with one edge per union, and equivalent union trees]
The union graph is a tree if there are no redundant unions
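As a small in-memory illustration, the union graph for the nine unions listed on this slide can be built and checked for being a tree as follows. The helper names `union_graph` and `is_tree` are hypothetical, and the time stamp of a union is assumed to be its 1-based position in the sequence.

```python
from collections import defaultdict

UNIONS = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
          ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]

def union_graph(unions):
    """One edge per union, weighted by the (1-based) time stamp of the union."""
    adj = defaultdict(list)
    for t, (u, v) in enumerate(unions, start=1):
        adj[u].append((v, t))
        adj[v].append((u, t))
    return adj

def is_tree(adj, num_edges):
    """A graph is a tree if it has (#nodes - 1) edges and is connected."""
    if num_edges != len(adj) - 1:
        return False
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v, _ in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)

print(is_tree(union_graph(UNIONS), len(UNIONS)))
```

Appending a union whose operands are already in the same set, for example Union(g, h), adds a cycle, so the check then fails.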

Transforming the Union Tree
[Figure: the union tree on r, a, b, c, d, e, f, g, h, i, transformed step by step]
Weights along each root-to-leaf path decrease

Formulating as a Batched Problem
[Figure: the original union tree and the transformed tree]
For each edge, find the lowest ancestor edge with a higher weight

Cast in a Geometry Setting
[Figure: Euler tour of the union tree]
Euler tour: in O(sort(N)) I/Os [Chiang et al. 95]
x: positions in the tour
y: weight

Cast in a Geometry Setting
For each edge, find the lowest ancestor edge with a higher weight
Equivalently: for each segment, find the shortest segment above and containing it
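The geometric reformulation can be tried on the nine-union example with a small in-memory sketch. Each tree edge becomes a horizontal segment whose x-extent is the part of the Euler tour spent below that edge and whose y-coordinate is the edge weight. The quadratic search below is only a stand-in for the I/O-efficient batched solution, and the function names, the choice of r as root, and the use of 1-based union positions as weights are assumptions of this example.

```python
from collections import defaultdict

UNIONS = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
          ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]

def rooted_tree(unions, root):
    """Root the union tree; weight[v] is the weight of the edge from v to its parent."""
    adj = defaultdict(list)
    for t, (u, v) in enumerate(unions, start=1):
        adj[u].append((v, t))
        adj[v].append((u, t))
    parent, weight, stack = {root: None}, {}, [root]
    while stack:
        u = stack.pop()
        for v, t in adj[u]:
            if v not in parent:
                parent[v], weight[v] = u, t
                stack.append(v)
    return parent, weight

def euler_segments(parent, weight, root):
    """Map each edge (named by its lower endpoint) to a segment (x1, x2, y)."""
    children = defaultdict(list)
    for v, p in parent.items():
        if p is not None:
            children[p].append(v)
    seg, pos = {}, 0

    def visit(u):
        nonlocal pos
        for c in children[u]:
            pos += 1                      # tour walks down the edge to c
            start = pos
            visit(c)
            pos += 1                      # tour walks back up the same edge
            seg[c] = (start, pos, weight[c])

    visit(root)
    return seg

def shortest_segment_above(seg):
    """Brute force: for each segment, the shortest one that contains it and lies above it."""
    answer = {}
    for c, (x1, x2, y) in seg.items():
        best = None
        for a, (ax1, ax2, ay) in seg.items():
            if ax1 < x1 and x2 < ax2 and ay > y:
                if best is None or ax2 - ax1 < seg[best][1] - seg[best][0]:
                    best = a
        answer[c] = best                  # None means no heavier ancestor edge exists
    return answer

parent, weight = rooted_tree(UNIONS, "r")
print(shortest_segment_above(euler_segments(parent, weight, "r")))
```

Segments that contain a given segment correspond exactly to its ancestor edges, and these are nested, so the shortest qualifying segment is the lowest ancestor edge with a higher weight.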

Distribution Sweeping
M/B vertical slabs
[Figure: the slabs, with labels marking what is checked at this level and what is checked recursively]
Total cost: O(sort(N))

In-Order Traversal
Weights along each root-to-leaf path decrease
At u, with children u_1, …, u_k (in increasing order of weight):
1. Recursively visit the subtree at u_1
2. Return u
3. For i = 2, …, k: recursively visit the subtree at u_i
[Figure: the transformed tree and the resulting order of the elements]
Claim: this traversal produces the right order
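The conversion to interval union-find can be sketched end to end on the nine-union example. The slides do not spell out how a node is re-attached during the transformation; this sketch assumes that each node is hung below the lower endpoint of the lowest ancestor edge with a higher weight, or below the root if no such edge exists. All function names are hypothetical, and the walk up the tree replaces the batched geometric step used for large inputs.

```python
from collections import defaultdict

UNIONS = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
          ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]

def rooted_tree(unions, root):
    """Root the union tree; weight[v] is the time stamp of the edge from v to its parent."""
    adj = defaultdict(list)
    for t, (u, v) in enumerate(unions, start=1):
        adj[u].append((v, t))
        adj[v].append((u, t))
    parent, weight, stack = {root: None}, {}, [root]
    while stack:
        u = stack.pop()
        for v, t in adj[u]:
            if v not in parent:
                parent[v], weight[v] = u, t
                stack.append(v)
    return parent, weight

def transform(parent, weight):
    """Assumed rule: climb past every ancestor whose own parent edge is lighter."""
    new_parent = {}
    for u, p in parent.items():
        a = p
        while a is not None and parent[a] is not None and weight[a] < weight[u]:
            a = parent[a]
        new_parent[u] = a
    return new_parent

def in_order(new_parent, weight, root):
    """Visit the lightest child's subtree, then the node, then the other subtrees."""
    children = defaultdict(list)
    for u, p in new_parent.items():
        if p is not None:
            children[p].append(u)
    order = []

    def visit(u):
        kids = sorted(children[u], key=weight.get)
        if kids:
            visit(kids[0])
        order.append(u)
        for c in kids[1:]:
            visit(c)

    visit(root)
    return order

def unions_join_adjacent(order, unions):
    """Check that every union merges two neighbouring intervals of the order."""
    lo = {x: i for i, x in enumerate(order)}   # interval ends, stored per set label
    hi = dict(lo)
    label = {x: x for x in order}
    for u, v in unions:
        s, t = label[u], label[v]
        if lo[s] > lo[t]:
            s, t = t, s
        if s == t or hi[s] + 1 != lo[t]:
            return False
        for x in order[lo[t]:hi[t] + 1]:
            label[x] = s
        hi[s] = hi[t]
    return True

parent, weight = rooted_tree(UNIONS, "r")
order = in_order(transform(parent, weight), weight, "r")
print(order, unions_join_adjacent(order, UNIONS))
```

More than one order can satisfy the adjacency property, so the check at the end is the meaningful test rather than the exact sequence printed.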

Solving Interval Union-Find
Union: x: two operands, y: time stamp
Find: x: operand, y: time stamp; the answer is the representative
Four instances of batched ray shooting: O(sort(N))
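One way to read the ray-shooting picture is sketched below; it is an interpretation and is not confirmed by the slides. Assume the elements sit at positions 0..N-1 of an order in which every union joins adjacent sets, so that union number t removes exactly one boundary between neighbouring positions. A find with time stamp t is then answered by shooting one ray to the left and one to the right until each hits a boundary that still exists at time t. The order used here, the function names, and the convention that a find with time stamp t sees all unions up to and including t are assumptions, and the loops are an in-memory stand-in that covers only the find side.

```python
import math

# Assumed order: one order in which every union below joins two adjacent sets.
ORDER = list("brcaeigdhf")
UNIONS = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
          ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]

def boundary_removal_times(order, unions):
    """removed[i] is the time at which the boundary between positions i and i+1 disappears."""
    pos = {x: i for i, x in enumerate(order)}
    removed = [math.inf] * (len(order) - 1)
    for t, (u, v) in enumerate(unions, start=1):
        lo, hi = sorted((pos[u], pos[v]))
        alive = [i for i in range(lo, hi) if removed[i] == math.inf]
        if len(alive) != 1:
            raise ValueError("union %d does not join two adjacent intervals" % t)
        removed[alive[0]] = t
    return removed

def find_interval(removed, p, t):
    """Return (left, right): the positions spanned by the set containing p at time t."""
    left = p
    while left > 0 and removed[left - 1] <= t:        # ray to the left
        left -= 1
    right = p
    while right < len(removed) and removed[right] <= t:  # ray to the right
        right += 1
    return left, right

removed = boundary_removal_times(ORDER, UNIONS)
print(find_interval(removed, ORDER.index("a"), 4))
```

Any fixed rule, such as taking the leftmost element of the returned interval, would then name a representative; which rule the talk uses is not stated in the transcript.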

Handling Redundant Unions
Union tree becomes a general graph
Compute the minimum spanning tree
–O(sort(N)) I/Os (randomized) [Chiang et al. 95]
–O(sort(N) log log B) I/Os (deterministic) [Arge et al. 04]
–Deterministic O(sort(N)) I/Os if the graph is planar
Only MST edges are non-redundant
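The link between redundant unions and the minimum spanning tree can be illustrated in memory. The sketch assumes that each union edge is weighted by its time stamp, which the slide does not state explicitly. Under that assumption, scanning the unions in time order and keeping only those that merge two different sets is Kruskal's algorithm, so the kept edges form the minimum spanning forest. The function name and the sample sequence are made up for this example.

```python
def non_redundant(unions):
    """Kruskal with time stamps as weights: keep a union only if it merges two sets."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    kept = []
    for t, (u, v) in enumerate(unions, start=1):
        ru, rv = find(u), find(v)
        if ru != rv:                        # the edge joins two components: an MST edge
            parent[ru] = rv
            kept.append((t, u, v))
    return kept

# Hypothetical sequence in which the fourth union is redundant.
sequence = [("d", "g"), ("a", "c"), ("a", "d"), ("c", "g"), ("r", "b")]
print(non_redundant(sequence))
```

Dropping the redundant unions leaves a union graph without cycles, which is the situation assumed by the earlier stages.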

Applications
1. Topological Persistence
2. Contour Trees

Application: Topological Persistence
Introduced by Edelsbrunner et al.
Measures importance on a surface
–Feature extraction
–Topological de-noising
Many applications
–Surface modeling
–Shape analysis
–Terrain analysis
–Computational biology

Topological Persistence Illustrated

Formulated as Batched Union-Find
Represented as a triangulated mesh
Consider minimum-saddle pairs
On reaching
–A minimum or maximum: do nothing
–A regular point u: issue union(u, v) for a lower neighbor v
–A saddle u: let v and w be nodes from the two connected pieces of u's lower link; issue find(v), find(w), union(u, v), union(u, w)
[Figure: the lower link of a vertex]
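The per-vertex rule on this slide can be sketched as a function that turns one mesh vertex into a short list of union-find operations. The sketch assumes an interior vertex whose neighbours are given in cyclic order around it, distinct heights, and at most two lower-link pieces; the data layout (`height` and `link` dictionaries) and the function names are hypothetical.

```python
def lower_link_components(u, height, link):
    """Split the cyclic neighbour list of u into maximal runs of lower neighbours."""
    ring = link[u]
    lower = [height[v] < height[u] for v in ring]
    if not any(lower):
        return []
    if all(lower):
        return [list(ring)]
    start = lower.index(False)        # begin after a higher neighbour so runs never wrap
    comps, run = [], []
    for k in range(1, len(ring) + 1):
        idx = (start + k) % len(ring)
        if lower[idx]:
            run.append(ring[idx])
        elif run:
            comps.append(run)
            run = []
    return comps

def operations_for(u, height, link):
    """Operations issued on reaching vertex u, following the rule on the slide."""
    comps = lower_link_components(u, height, link)
    if not comps:
        return []                                   # minimum
    if len(comps) == 1:
        if len(comps[0]) == len(link[u]):
            return []                               # maximum
        return [("union", u, comps[0][0])]          # regular point
    if len(comps) == 2:                             # saddle
        v, w = comps[0][0], comps[1][0]
        return [("find", v), ("find", w), ("union", u, v), ("union", u, w)]
    raise ValueError("vertices with more than two lower-link pieces are not handled here")

# Hypothetical saddle: lower and higher neighbours alternate around u.
height = {"u": 5, "n1": 1, "n2": 9, "n3": 2, "n4": 8}
link = {"u": ["n1", "n2", "n3", "n4"]}
print(operations_for("u", height, link))
```

Calling this for every vertex in a fixed processing order (increasing height is the natural choice, though the transcript does not state it) would yield the whole operation sequence, which can then be handed to the batched algorithm.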

Experiment 1: Random Union-Find (128MB memory)

Experiment 2: Topological Persistence on Terrain Data
Neuse River Basin of North Carolina: ~0.5 billion points

Experiment 2: Topological Persistence on Terrain Data
Entire data set (0.5 billion points): the in-memory algorithm (IM) fails and the external-memory algorithm (EM) takes 10 hours
128MB memory

Contour Trees

Summary
An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os
–Optimal in the worst case
A practical algorithm using O(sort(N) log(N/M)) I/Os
Applications to terrain analysis
–Topological persistence: O(sort(N)) I/Os
–Contour trees: O(sort(N)) I/Os
Open question
–On-line case: can we get below O(N α(N)) I/Os?

Thank you!

Previous Results
Directly maintain contours
–O(N log N) time [van Kreveld et al. 97]
–Needs union-split-find for circular lists
–Does not extend to higher dimensions
Two sweeps maintaining components, then merge
–O(N log N) time [Carr et al. 03]
–Extends to arbitrary dimensions

Join Tree and Split Tree
[Figure: a join tree and a split tree, with the qualified nodes marked]

Final Contour Tree
[Figure: join tree, split tree, and the resulting contour tree]
Hard to BATCH!

Another Characterization
[Figure: join tree, split tree, and contour tree, with nodes u, v, w marked]
Let w be the highest node that is a descendant of v in the join tree and an ancestor of u in the split tree; then (u, w) is a contour tree edge
Now can BATCH!

Map to Rectangles
[Figure: nodes u, v, w in the join tree and the split tree, mapped to rectangles]
Can be solved in O(sort(N)) I/Os (practical, too)

Topological Persistence

Label Nodes with Intervals Using Euler tour (O(sort(N)) I/Os)
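Interval labels turn ancestor tests into interval containment, which is what makes a geometric formulation possible. The sketch below labels each node with the tour positions at which it is entered and left; the recursive in-memory traversal stands in for the I/O-efficient Euler tour, and the function names and the small example tree are assumptions made for illustration.

```python
def interval_labels(children, root):
    """Label every node with (enter, leave) positions of an Euler-tour-style traversal."""
    label, clock = {}, 0

    def visit(u):
        nonlocal clock
        clock += 1
        start = clock
        for c in children.get(u, []):
            visit(c)
        clock += 1
        label[u] = (start, clock)

    visit(root)
    return label

def is_ancestor(label, a, d):
    """a is a proper ancestor of d exactly when a's interval strictly contains d's."""
    return label[a][0] < label[d][0] and label[d][1] < label[a][1]

# Hypothetical rooted tree used only to exercise the labels.
children = {"r": ["a", "b"], "a": ["c", "d", "e"], "b": ["f"], "d": ["g", "h"], "e": ["i"]}
label = interval_labels(children, "r")
print(is_ancestor(label, "a", "g"), is_ancestor(label, "b", "g"))
```

With one such label per node in each of the two trees, a condition of the form "descendant in one tree and ancestor in the other" becomes a pair of containment tests.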
