A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees. David R. Karger, Philip N. Klein, Robert E. Tarjan.


A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees David R. Karger, Philip N. Klein, Robert E. Tarjan

Talk Outline Objective & related work from the literature Intuition Definitions Algorithm Proof & Analysis Conclusion and future work

Objective A minimum spanning tree is a tree formed from a subset of the edges of a given undirected graph, with two properties: 1. it spans the graph, i.e., it includes every vertex of the graph, and 2. it is minimum, i.e., the total weight of all its edges is as low as possible. Find a minimum spanning tree of a graph in linear time with very high probability!!

Related Work Boruvka 1926 (textbook algorithms), Yao 1975, Cheriton and Tarjan 1976, Fredman and Tarjan 1987, Gabow 1986, Chazelle 1995. All deterministic results! How about a randomized one??

Intuition Cycle Property Cut Property Randomization

Intuition For any cycle C in the graph, the heaviest edge of C does not appear in the minimum spanning tree.

Cycle Property (figure: a cycle in the graph with its heaviest edge highlighted)

Cycle Property For any graph, find all possible cycles and remove the heaviest edge from each cycle. Then we get a minimum spanning tree?? How about the time complexity? How do we detect the cycles in the graph??

Cut Property For any proper nonempty subset X of the vertices, the lightest edge with exactly one endpoint in X belongs to the minimum spanning tree.

Cut Property

Boruvka Algorithm For each vertex, select the minimum-weight edge incident to the vertex. Contract all the selected edges, replacing each connected component defined by the selected edges with a single vertex, and deleting all resulting isolated vertices, loops (edges both of whose endpoints are the same), and all but the lowest-weight edge among each set of multiple edges. O(m log n)
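A single Boruvka step can be sketched as follows (a minimal Python illustration with our own helper names, not the paper's code; it assumes distinct edge weights and represents edges as (weight, u, v) tuples, keeping isolated vertices for simplicity):

```python
def find(parent, v):
    # Path-compressing find for the union-find forest used during contraction.
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v

def boruvka_step(n, edges):
    """One Boruvka step on a graph with vertices 0..n-1.

    edges: list of (weight, u, v). Returns (n2, contracted_edges, chosen):
    n2 is the number of vertices after contraction, contracted_edges is the
    cleaned edge list over new vertex ids (loops and all but the lightest
    parallel edge dropped), and chosen is the set of selected edges, all of
    which belong to the MSF by the cut property.
    """
    parent = list(range(n))
    cheapest = [None] * n
    for e in edges:                          # each vertex picks its lightest edge
        w, u, v = e
        for x in (u, v):
            if cheapest[x] is None or w < cheapest[x][0]:
                cheapest[x] = e
    chosen = {e for e in cheapest if e is not None}
    for _, u, v in chosen:                   # contract the selected edges
        parent[find(parent, u)] = find(parent, v)
    roots = sorted({find(parent, v) for v in range(n)})
    new_id = {r: i for i, r in enumerate(roots)}
    best = {}                                # keep only lightest parallel edge
    for w, u, v in edges:
        a, b = new_id[find(parent, u)], new_id[find(parent, v)]
        if a == b:
            continue                         # drop self-loops
        key = (min(a, b), max(a, b))
        if key not in best or w < best[key][0]:
            best[key] = (w, a, b)
    return len(roots), list(best.values()), chosen
```

For example, one step on a 4-cycle with weights 1 to 4 contracts the whole graph into a single vertex and selects the three lightest edges, which form the MST.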

Randomization How can randomization help us achieve our goal? Boruvka + Cycle Property + Randomization = Linear time with very high probability

Definition

Definition Let G be a graph with weighted edges. w(x, y): the weight of edge {x, y}. If F is a forest that is a subgraph of G, F(x, y): the path (if any) connecting x and y in F; w_F(x, y): the maximum weight of an edge on F(x, y), with the convention that w_F(x, y) = ∞ if x and y are not connected in F.

Definition An edge {x, y} is F-heavy if w(x, y) > w_F(x, y); otherwise, {x, y} is F-light.
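These definitions can be checked mechanically. The sketch below (our own naive helpers, not the paper's method; the paper instead uses a linear-time verification algorithm) computes w_F(x, y) by walking the unique forest path and classifies an edge accordingly. Edges are (weight, u, v) tuples:

```python
import math
from collections import defaultdict

def w_F(forest_edges, x, y):
    """Maximum edge weight on the path from x to y in forest F (inf if none)."""
    adj = defaultdict(list)
    for w, u, v in forest_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    # DFS carrying the maximum weight seen so far; in a forest the path is unique.
    stack, seen = [(x, -math.inf)], {x}
    while stack:
        node, mx = stack.pop()
        if node == y:
            return mx
        for nxt, w in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, max(mx, w)))
    return math.inf            # x and y not connected in F

def is_F_heavy(forest_edges, edge):
    # F-heavy means strictly heavier than the heaviest edge on the F-path.
    w, x, y = edge
    return w > w_F(forest_edges, x, y)
```

Note that an edge between two different trees of F gets w_F = ∞ and is therefore F-light, and every edge of F is F-light, matching the remarks on the next slides.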

F-heavy & F-light (figure: a forest F with an example F-light edge and an F-heavy edge)

F-heavy & F-light Note that the edges of F are all F-light. For any forest F, no F-heavy edge can be in the minimum spanning forest of G. Cycle Property!!

Algorithm Recursive function call. Input: an undirected graph. Output: a minimum spanning forest. Time: O(m) with very high probability; O(min{n², m log n}) in the worst case.

Algorithm Step 1. Apply two successive Boruvka steps to the graph, thereby reducing the number of vertices by at least a factor of four.

Algorithm

Algorithm Step 2. In the contracted graph, choose a subgraph H by selecting each edge independently with probability 1/2. Apply the algorithm recursively to H, producing a minimum spanning forest F of H. Find all the F-heavy edges (both those in H and those not in H) and delete them.


Algorithm Step 3. Apply the algorithm recursively to the remaining graph to compute a spanning forest F′. Return those edges contracted in Step 1 together with the edges of F′.
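Putting Steps 1-3 together, the whole recursion can be sketched as below (a toy Python rendering of ours, assuming distinct edge weights; the naive F-heavy test and contraction do not achieve linear time, so this shows only the structure of the algorithm, not its complexity):

```python
import math
import random
from collections import defaultdict

def _find(p, v):
    while p[v] != v:
        p[v] = p[p[v]]
        v = p[v]
    return v

def _boruvka_step(n, edges):
    # One Boruvka step; edges are (weight, u, v, original_edge) tuples.
    p = list(range(n))
    cheapest = [None] * n
    for e in edges:
        for x in (e[1], e[2]):
            if cheapest[x] is None or e[0] < cheapest[x][0]:
                cheapest[x] = e
    chosen = {e for e in cheapest if e is not None}
    for _, u, v, _orig in chosen:
        p[_find(p, u)] = _find(p, v)
    roots = sorted({_find(p, v) for v in range(n)})
    rid = {r: i for i, r in enumerate(roots)}
    best = {}                        # drop loops, keep lightest parallel edge
    for w, u, v, orig in edges:
        a, b = rid[_find(p, u)], rid[_find(p, v)]
        if a != b:
            key = (min(a, b), max(a, b))
            if key not in best or w < best[key][0]:
                best[key] = (w, a, b, orig)
    return len(roots), list(best.values()), {e[3] for e in chosen}

def _max_on_forest_path(forest, x, y):
    # w_F(x, y): max edge weight on the F-path from x to y, inf if disconnected.
    adj = defaultdict(list)
    for w, u, v, _ in forest:
        adj[u].append((v, w))
        adj[v].append((u, w))
    stack, seen = [(x, -math.inf)], {x}
    while stack:
        node, mx = stack.pop()
        if node == y:
            return mx
        for nxt, w in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, max(mx, w)))
    return math.inf

def _msf(n, edges):
    if not edges:
        return set()
    result = set()
    for _ in range(2):                                   # Step 1: two Boruvka steps
        n, edges, chosen = _boruvka_step(n, edges)
        result |= chosen
    if not edges:
        return result
    H = [e for e in edges if random.random() < 0.5]      # Step 2: sample H
    F_orig = _msf(n, H)                                  # F = MSF of H
    F = [e for e in edges if e[3] in F_orig]
    light = [e for e in edges                            # delete F-heavy edges
             if e[0] <= _max_on_forest_path(F, e[1], e[2])]
    return result | _msf(n, light)                       # Step 3: recurse

def kkt_msf(n, edges):
    """Minimum spanning forest, returned as a set of the input (w, u, v) edges."""
    return _msf(n, [(w, u, v, (w, u, v)) for w, u, v in edges])
```

Because the Step 1 contractions (cut property) and the Step 2 deletions (cycle property) are both safe, the returned forest is the unique MSF whatever the coin flips; only the running time is random.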


Algorithm (figure: the edges of the contracted graph partitioned into edges of H and those not in H, each classified as F-light or F-heavy)

Analysis Correctness? Worst-case time complexity? Expected time complexity?

Correctness Completeness By the cut property, every edge contracted during Step 1 is in the minimum spanning forest. Hence the remaining edges of the minimum spanning forest of the original graph form a minimum spanning forest of the contracted graph.

Correctness Soundness By the cycle property, the edges deleted in Step 2 do not belong to the minimum spanning forest. By the inductive hypothesis, the minimum spanning forest of the remaining graph is correctly determined in the recursive call of Step 3.

Worst-case time complexity The worst-case running time of the minimum spanning forest algorithm is O(min{n², m log n}), the same as the bound for Boruvka's algorithm. Count the total number of edges. Step 1 reduces the number of vertices to at most ¼ of the original, so a subproblem at depth d contains at most n/4^d vertices. Summing over all subproblems gives an O(min{n², m log n}) bound on the total number of edges.

Worst-case time complexity Parent: E(G). Left child: E(H). Right child: E(G′). Number of edges in the next recursion level ≤ E(G*) + E(F) ≤ E(G) − V(G)/2 + V(G)/4 (the edges of F are counted in both children).

Worst-case time complexity (figure: the recursion tree; each level contains at most m edges in total)

Worst-case time complexity The total time spent in Steps 1-3 is linear in the number of edges: Step 1 is just two steps of Boruvka's algorithm. Step 2 takes linear time using the modified Dixon-Rauch-Tarjan verification algorithm: the F-heavy edges of G can be computed in time linear in the number of edges of G.

Analysis Given a graph G with n vertices and m edges. A Boruvka step forms connected components and replaces each by a single vertex. Since each component contains at least 2 vertices, at most n/2 vertices remain after one step. For a component C with |C| vertices, at least |C| − 1 edges are removed. Thus the number of edges removed is at least Σ_{C∈Γ}(|C| − 1) = n − |Γ|, where Γ is the set of connected components. Since there are at most n/2 components, at least n/2 edges are removed.

Analysis Given F = MST(H), for (x, y) in H: 1. If (x, y) is in F, then (x, y) is F-light. 2. If (x, y) is not in F, suppose (x, y) were F-light; then the heaviest edge of the cycle P ∪ {(x, y)}, where P = F(x, y), would lie on P, yet that edge belongs to F, and by the cycle property it belongs to no MST. This is a contradiction, so (x, y) is F-heavy. 3. Thus each F-light edge of H is also in F, and vice versa.

Analysis By the distribution of edges between H and G′, the edges of F are counted twice: once in the call MST(H) and once in the call MST(G′).

Analysis The binary tree represents the recursive invocations of MST: the left child represents the invocation MST(H), and the right child represents the invocation MST(G′). Since 2 Boruvka steps are performed before invoking MST(H) and MST(G′), the number of vertices is reduced by a factor of 4 per level. Thus the depth of the recursion is at most log₄ n.

Analysis - Worst Case Given graph G with m edges and n vertices: 1. After 2 Boruvka steps, at most n/4 vertices and m − n/2 edges remain in G*. This also holds for H and G′, which are subgraphs of G*. 2. Since F = MST(H), F has at most v_H − 1 edges, and thus fewer than n/4. 3. By the edge distribution, e_H + e_G′ ≤ e_G* + e_F ≤ m − n/2 + v_H ≤ m − n/2 + n/4 ≤ m. Thus the total number of edges in the subproblems is at most that of the original problem.

Analysis - Worst Case 4. Since the total number of edges of the subproblems at the same depth is bounded by m, and the depth is at most log₄ n, the overall number of edges is at most m log₄ n. 5. Since the number of vertices of a subproblem at depth d is at most n/4^d, its number of edges is at most (n/4^d)². The overall number of edges is therefore also bounded by Σ_d 2^d (n/4^d)² = O(n²). 6. Since the running time of the algorithm is proportional to the number of edges, the worst-case time complexity is O(min{n², m log n}).
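The geometric sum behind item 5 can be written out explicitly (our arithmetic, consistent with the slide's O(n²) bound, summing the at most 2^d subproblems at each depth d):

```latex
\sum_{d \ge 0} 2^{d}\left(\frac{n}{4^{d}}\right)^{2}
  = n^{2}\sum_{d \ge 0}\left(\frac{1}{8}\right)^{d}
  = \frac{8}{7}\,n^{2}
  = O(n^{2})
```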

Analysis – Average Case Here we analyze the average case by partitioning the invocations into "left paths" (the red paths in the figure). After counting the edges of the subproblems along each left path, we sum them up to obtain the overall estimate.

Analysis – Average Case 1. For G* with k edges, after sampling each edge with probability 1/2, E[e_H] = k/2. Since G* ⊆ G, we have E[e_G*] ≤ E[e_G] and E[e_H] = E[e_G*]/2 ≤ E[e_G]/2. 2. Along a left path starting with E[e_G] = k, the expected total number of edges is at most k + k/2 + k/4 + … ≤ 2k.

Analysis – Average Case Given v_G* = n and F = MST(H), where H ⊆ G*: 1. Each F-light edge has probability 1/2 of being sampled into H. 2. Since the F-light edges of H are exactly the edges of F, the chance that an F-light edge ends up in F is also 1/2. 3. An edge e heavier than the heaviest edge of F is never F-light, since e would be the heaviest edge of the cycle it forms with its path in F. 4. Thus the heaviest F-light edge is always in F. Given e_F = k, e_G′ is the number of trials needed for k successes (selection into H), which follows a negative binomial distribution.

Analysis – Average Case Given v_G* = n and F = MST(H), where H ⊆ G*: 5. For e_F = k, e_G′ has a negative binomial distribution with parameters 1/2 and k, so E[e_G′] = k/(1/2) = 2k. 6. Summing over all cases, we get E[e_G′] = Σ_k Pr[e_F = k] · 2k = 2·E[e_F] < 2n.
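The negative binomial claim in step 5 is easy to sanity-check numerically (a toy simulation of ours, not from the paper): flip a fair coin until k heads appear and count the flips; the average should be close to k/(1/2) = 2k.

```python
import random

def trials_until_k_successes(k, p, rng):
    """Number of Bernoulli(p) trials needed to observe k successes."""
    trials = successes = 0
    while successes < k:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials

rng = random.Random(0)      # fixed seed for reproducibility
k = 10
mean = sum(trials_until_k_successes(k, 0.5, rng) for _ in range(20000)) / 20000
# mean comes out close to 2k = 20, matching E[e_G'] = k / (1/2)
```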

Analysis – Average Case 1. For all right subproblems, the expected sum of edges is at most Σ_{d≥1} 2^(d−1) · 2(n/4^d) = Σ_{d≥1} n/2^d ≤ n (at depth d there are at most 2^(d−1) right subproblems, each on at most n/4^d vertices). 2. For each left path, the expected total number of edges is at most twice that of its leading subproblem, which is the root or a right child. So the overall expected number of edges is at most 2(m + n). 3. Since the running time is proportional to the overall number of edges, its expected value is O(m + n) = O(m).

Analysis – Probability of Linearity Chernoff Bound: Given X_1, …, X_k as i.i.d. Bernoulli(p) random variables and 0 < δ < 1, we have Pr[Σ_i X_i < (1 − δ)pk] ≤ exp(−δ²pk/2). Thus the probability of fewer than s successes (each with chance p) within k trials, for s < pk, is at most exp(−(pk − s)²/(2pk)).

Analysis – Probability of Linearity Given a left path whose leading problem G has e_G = k. At each level, each surviving edge is kept in the next subproblem with probability 1/2, and each kept edge contributes 1 to the total edge count; the path ends when the k-th edge removal occurs. The probability that the total number of edges reaches k + 3k is the probability that fewer than k removals occur in 4k trials. By the Chernoff bound, this probability is exp(−Ω(k)).

Analysis – Probability of Linearity 1. Given v_G* = n′. Each edge of G′ has chance 1/2 of being in F. Since e_F ≤ n′ − 1, the probability that e_G′ > 3n′ is the probability of fewer than n′ − 1 successes in 3n′ trials; by the Chernoff bound this is exp(−Ω(n′)). 2. There are at most n/2 vertices in total over all the graphs G*. Taking all the trials as a whole, the probability that there are more than 3n/2 edges in all right subproblems combined is exp(−Ω(n)).

Analysis – Probability of Linearity Combining the previous two analyses, the probability that the total number of edges never exceeds 3(m + 3n/2) is at least 1 − exp(−Ω(m)) − Σ_{G′∈Γ} exp(−Ω(v_G′)), where Γ is the set of all right subproblems. Thus the probability that the time complexity is O(m) is 1 − exp(−Ω(m)).