Data Structures for Disjoint Sets Manolis Koubarakis Data Structures and Programming Techniques 1.



Dynamic Sets
Sets are fundamental to mathematics and also to computer science. In computer science, we usually study dynamic sets, i.e., sets that can grow, shrink or otherwise change over time. The data structures we have presented so far in this course offer us ways to represent finite, dynamic sets and manipulate them on a computer.

Dictionaries
A dynamic set that supports insertion of elements, deletion of elements and membership testing is called a dictionary. Many of the data structures we have presented so far can be used to implement a dictionary (e.g., a linked list, a hash table, a (2,4) tree).

Disjoint Sets

Definitions

Definitions (cont’d)

Determining the Connected Components of an Undirected Graph
One of the many applications of disjoint-set data structures is determining the connected components of an undirected graph. We have already shown how to solve this problem by using DFS. The implementation based on disjoint sets that we will present here is more appropriate when the edges of the graph are not static, e.g., when edges are added dynamically and we need to maintain the connected components as each edge is added.

Computing the Connected Components of an Undirected Graph

Computing the Connected Components (cont’d)

Example Graph
[Figure: an undirected graph on the vertices a, b, c, d, e, f, g, h, i, j.]

The Collection of Disjoint Sets After Each Edge is Processed
Edge processed   Collection of disjoint sets
initial sets     {a} {b} {c} {d} {e} {f} {g} {h} {i} {j}
(b,d)            {a} {b,d} {c} {e} {f} {g} {h} {i} {j}
(e,g)            {a} {b,d} {c} {e,g} {f} {h} {i} {j}
(a,c)            {a,c} {b,d} {e,g} {f} {h} {i} {j}
(h,i)            {a,c} {b,d} {e,g} {f} {h,i} {j}
(a,b)            {a,b,c,d} {e,g} {f} {h,i} {j}
(e,f)            {a,b,c,d} {e,f,g} {h,i} {j}
(b,c)            {a,b,c,d} {e,f,g} {h,i} {j}
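The edge-by-edge processing in the table above can be sketched in C. This is a minimal illustration, not the pseudocode from the slides: it assumes the vertices a to j are mapped to the integers 0 to 9, stores each set as a tree in a parent array, and uses the hypothetical names init_sets, find_set, connected_components and same_component.

```c
#include <assert.h>

#define NV 10   /* vertices a..j mapped to 0..9 (an assumption of this sketch) */

int parent[NV];

/* MAKE-SET for every vertex: each vertex starts in its own set. */
void init_sets(void) {
    for (int v = 0; v < NV; v++) parent[v] = v;
}

/* FIND-SET: follow parent links up to the representative. */
int find_set(int v) {
    assert(v >= 0 && v < NV);
    while (parent[v] != v) v = parent[v];
    return v;
}

/* Process the edges in order and return the number of components left. */
int connected_components(const char edges[][2], int nedges) {
    int components = NV;
    init_sets();
    for (int k = 0; k < nedges; k++) {
        int ru = find_set(edges[k][0] - 'a');
        int rv = find_set(edges[k][1] - 'a');
        if (ru != rv) {          /* endpoints in different sets: UNION them */
            parent[ru] = rv;
            components--;
        }
    }
    return components;
}

/* Two vertices are connected iff they have the same representative. */
int same_component(char u, char v) {
    return find_set(u - 'a') == find_set(v - 'a');
}

/* The edges of the example, in the order listed in the table. */
const char example_edges[7][2] = {
    {'b','d'}, {'e','g'}, {'a','c'}, {'h','i'}, {'a','b'}, {'e','f'}, {'b','c'}
};
```

Calling connected_components(example_edges, 7) leaves four sets, matching the last row of the table; the final edge (b,c) changes nothing because b and c already share a representative.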

Minimum Spanning Trees
Another application of the disjoint-set operations, one that we have already seen, is Kruskal’s algorithm for computing the minimum spanning tree (MST) of a graph.

Maintaining Equivalence Relations

Examples of Equivalence Relations
Equality.
Equivalent type definitions in programming languages. For example, consider the following type definitions in C:
    typedef struct { int a; int b; } A;
    typedef A B;
    typedef A C;
    typedef A D;
The types A, B, C and D are equivalent in the sense that variables of one type can be assigned to variables of the other types without requiring any casting.

Equivalence Classes

Example

The Equivalence Problem

The Equivalence Problem (cont’d)

Example (cont’d)

Example (cont’d)

Linked-List Representation of Disjoint Sets
A simple way to implement a disjoint-set data structure is to represent each set by a linked list. The first object in each linked list serves as its set’s representative. The remaining objects can appear in the list in any order. Each object in the linked list contains a set member, a pointer to the object containing the next set member, and a pointer back to the representative.

The Structure of Each List Object
[Figure: a list object with three fields: the set member, a pointer to the next object, and a pointer back to the representative.]

Example: the Sets {c, h, e, b} and {f, g, d}
[Figure: two linked lists, c -> h -> e -> b and f -> g -> d; every object also points back to the first object of its list.]
The representatives of the two sets are c and f.
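The linked-list representation described above can be sketched in C as follows. The struct and function names (ListNode, make_set, find_set, union_sets) are illustrative choices rather than the slides' own, and this simple UNION walks the first list to find its end instead of keeping a tail pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* One list object: the set member, the next object, and the representative. */
struct ListNode {
    char member;
    struct ListNode *next;   /* next object in the list, NULL at the end */
    struct ListNode *rep;    /* pointer back to the first object of the list */
};

/* MAKE-SET: a one-object list whose representative is the object itself. */
struct ListNode *make_set(char x) {
    struct ListNode *n = malloc(sizeof *n);
    assert(n != NULL);
    n->member = x;
    n->next = NULL;
    n->rep = n;
    return n;
}

/* FIND-SET: a single pointer dereference. */
struct ListNode *find_set(struct ListNode *x) {
    return x->rep;
}

/* UNION: append y's list to x's list and redirect every representative
   pointer in y's list to x's representative. */
struct ListNode *union_sets(struct ListNode *x, struct ListNode *y) {
    struct ListNode *rx = find_set(x), *ry = find_set(y);
    if (rx == ry) return rx;                 /* already the same set */
    struct ListNode *tail = rx;
    while (tail->next != NULL) tail = tail->next;
    tail->next = ry;
    for (struct ListNode *p = ry; p != NULL; p = p->next) p->rep = rx;
    return rx;
}
```

In this sketch FIND-SET takes constant time, while UNION takes time proportional to the lengths of both lists; storing a tail pointer in the representative would remove the walk over the first list, leaving only the updates to the appended list.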

Implementation of MAKE-SET and FIND-SET

Implementation of UNION

Amortized Analysis
In an amortized analysis, the time required to perform a sequence of data structure operations is averaged over all operations performed. Amortized analysis can be used to show that the average cost of an operation is small, if one averages over a sequence of operations, even though a single operation might be expensive. Amortized analysis differs from average-case analysis in that probability is not involved; an amortized analysis guarantees the average performance of each operation in the worst case.

Techniques for Amortized Analysis

Complexity Parameters for the Disjoint-Set Data Structures

Complexity of Operations for the Linked List Representation

Complexity (cont’d)

Proof

Operations
[Table with columns: Operation, Number of objects updated.]

Proof (cont’d)

The Weighted Union Heuristic

Theorem

Proof

Proof (cont’d)

Complexity (cont’d)

Disjoint-Set Forests
In a faster implementation of disjoint sets, we represent sets by rooted trees. Each node of a tree represents one set member and each tree represents a set. In a tree, each set member points only to its parent. The root of each tree contains the representative of the set and is its own parent. A collection of such trees, one per set, forms a disjoint-set forest.

Example: the Sets {b, c, e, h} and {d, f, g}
[Figure: two rooted trees, one with root c for the set {b, c, e, h} and one with root f for the set {d, f, g}.]
The representatives of the two sets are c and f.

Implementing MAKE-SET, FIND-SET and UNION
A MAKE-SET operation simply creates a tree with just one node. A FIND-SET operation can be implemented by chasing parent pointers until we find the root of the tree. The nodes visited on this path towards the root constitute the find path. A UNION operation can be implemented by making the root of one tree point to the root of the other.
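A minimal pointer-based sketch of the three operations just described, without any balancing heuristic. The names TreeNode, make_set, find_set and union_sets are assumptions made for illustration; they do not come from the slides.

```c
#include <assert.h>
#include <stdlib.h>

struct TreeNode {
    char member;
    struct TreeNode *parent;   /* the root is its own parent */
};

/* MAKE-SET: a tree with a single node that is its own parent. */
struct TreeNode *make_set(char x) {
    struct TreeNode *n = malloc(sizeof *n);
    assert(n != NULL);
    n->member = x;
    n->parent = n;
    return n;
}

/* FIND-SET: chase parent pointers along the find path to the root. */
struct TreeNode *find_set(struct TreeNode *x) {
    while (x->parent != x) x = x->parent;
    return x;
}

/* UNION: make the root of x's tree point to the root of y's tree. */
void union_sets(struct TreeNode *x, struct TreeNode *y) {
    struct TreeNode *rx = find_set(x), *ry = find_set(y);
    if (rx != ry) rx->parent = ry;
}
```

Because nothing limits the height of the trees in this version, a sequence of unions can produce a tree that is a single long chain, and FIND-SET then takes time linear in the size of the set.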

Example: the Union of Sets {b, c, e, h} and {d, f, g}
[Figure: a single tree on the nodes b, c, d, e, f, g, h obtained by making the root of one tree point to the root of the other.]

Complexity

The Union by Rank Heuristic

The Path Compression Heuristic
The second heuristic, path compression, is also simple and very effective. This heuristic is used during FIND-SET operations to make each node on the find path point directly to the root. In this way, trees with small height are constructed. Path compression does not change any ranks.

The Path Compression Heuristic Graphically
[Figure: a tree on the nodes b, c, d, e, f before path compression.]

The Path Compression Heuristic Graphically (cont’d)
[Figure: the same tree on the nodes b, c, d, e, f after path compression.]

Implementing Disjoint-Set Forests

Pseudocode

Pseudocode (cont’d)

The FIND-SET Procedure
Notice that the FIND-SET procedure is a two-pass method: it makes one pass up the find path to find the root, and it makes a second pass back down the find path to update each node so that it points directly to the root. The second pass is made as the recursive calls return.
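The two-pass behaviour described above can be sketched in C with a recursive find_set. The pseudocode slides are not part of this transcript, so this follows the common textbook formulation of union by rank with path compression; the arrays parent and rank, the capacity MAXN and the function names are all assumptions of this sketch.

```c
#include <assert.h>

#define MAXN 100        /* capacity chosen arbitrarily for this sketch */

int parent[MAXN];
int rank[MAXN];         /* upper bound on the height of each node */

/* MAKE-SET: a one-node tree of rank 0. */
void make_set(int x) {
    assert(x >= 0 && x < MAXN);
    parent[x] = x;
    rank[x] = 0;
}

/* FIND-SET with path compression. The recursive calls climb to the root
   (first pass); as they return, each node on the find path is made to
   point directly to the root (second pass). Ranks are left untouched. */
int find_set(int x) {
    if (parent[x] != x)
        parent[x] = find_set(parent[x]);
    return parent[x];
}

/* LINK by rank: the root of smaller rank points to the root of larger
   rank; on a tie, y becomes the root and its rank grows by one. */
void link_roots(int x, int y) {
    if (rank[x] > rank[y]) {
        parent[y] = x;
    } else {
        parent[x] = y;
        if (rank[x] == rank[y]) rank[y]++;
    }
}

/* UNION: link the two roots unless the elements are already together. */
void union_sets(int x, int y) {
    int rx = find_set(x), ry = find_set(y);
    if (rx != ry) link_roots(rx, ry);
}
```

For example, after make_set on 0 to 3 followed by union_sets(0,1), union_sets(2,3) and union_sets(1,3), element 0 still points to 1; one call to find_set(0) returns the root 3 and leaves 0 pointing directly at it.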

Complexity

Implementation in C
Let us assume that the members of the sets are integers in the range 0 to N-1. The simplest way to implement the disjoint-set data structure in C is to use an array id[N] of integers that take values in the range 0 to N-1. This array keeps track of both the representative and the members of each set. Initially, we set id[i]=i for each i between 0 and N-1. This is equivalent to N MAKE-SET operations that create the initial versions of the sets. To implement the UNION operation for the sets that contain integers p and q, we scan the array id and change all the array elements that have the value id[p] to have the value id[q]. The implementation of FIND-SET(p) simply returns the value of id[p].

Implementation in C (cont’d)
The program on the next slide initializes the array id, and then reads pairs of integers (p,q) and performs the operation UNION(p,q) if p and q are not in the same set yet. The program is an implementation of the equivalence problem defined earlier. Similar programs can be written for the other applications of disjoint sets presented.

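Implementation in C (cont’d)

    #include <stdio.h>
    #define N 10000   /* value lost in the transcript; 10000 is an assumed placeholder */

    int main(void)
    {
        int i, p, q, t, id[N];

        /* N MAKE-SET operations: every element is its own representative. */
        for (i = 0; i < N; i++) id[i] = i;

        while (scanf("%d %d", &p, &q) == 2) {
            if (id[p] == id[q]) continue;   /* already in the same set */
            /* UNION: relabel every member of p's set with q's representative. */
            for (t = id[p], i = 0; i < N; i++)
                if (id[i] == t) id[i] = id[q];
            printf("%d %d\n", p, q);
        }
        return 0;
    }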

Implementation in C (cont’d)
The extension of this implementation to the case where sets are represented by linked lists is left as an exercise.

Implementation in C (cont’d)
The disjoint-set forest data structure can easily be implemented by changing the meaning of the elements of array id. Now each id[i] represents an element of a set and points to another element of that set. The root element points to itself. The program on the next slide illustrates this functionality. FIND-SET is implemented by following these links from p and from q until a root is reached. Once we have found the roots i and j of the two sets, the UNION operation is simply implemented by the assignment statement id[i]=j.

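Implementation in C (cont’d)

    #include <stdio.h>
    #define N 10000   /* value lost in the transcript; 10000 is an assumed placeholder */

    int main(void)
    {
        int i, j, p, q, id[N];

        for (i = 0; i < N; i++) id[i] = i;   /* every element is a root */

        while (scanf("%d %d", &p, &q) == 2) {
            for (i = p; i != id[i]; i = id[i]) ;   /* FIND-SET(p): climb to the root */
            for (j = q; j != id[j]; j = id[j]) ;   /* FIND-SET(q) */
            if (i == j) continue;                  /* same root, same set */
            id[i] = j;                             /* UNION: p's root points to q's root */
            printf("%d %d\n", p, q);
        }
        return 0;
    }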

Implementation in C (cont’d)
We can implement a weighted version of the UNION operation by keeping track of the sizes of the two trees and making the root of the smaller tree point to the root of the larger. The code on the next slide implements this functionality by making use of an array sz[N] (for size).

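Implementation in C (cont’d)

    #include <stdio.h>
    #define N 10000   /* value lost in the transcript; 10000 is an assumed placeholder */

    int main(void)
    {
        int i, j, p, q, id[N], sz[N];

        /* Every element starts as the root of a tree of size 1. */
        for (i = 0; i < N; i++) { id[i] = i; sz[i] = 1; }

        while (scanf("%d %d", &p, &q) == 2) {
            for (i = p; i != id[i]; i = id[i]) ;   /* root of p's tree */
            for (j = q; j != id[j]; j = id[j]) ;   /* root of q's tree */
            if (i == j) continue;
            /* Weighted UNION: the smaller tree hangs under the larger one. */
            if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; }
            else               { id[j] = i; sz[i] += sz[j]; }
            printf("%d %d\n", p, q);
        }
        return 0;
    }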

Implementation in C (cont’d)
In a similar way, we can implement the union by rank heuristic. Both this heuristic and the path compression heuristic are left as exercises.

Readings
T. H. Cormen, C. E. Leiserson and R. L. Rivest. Introduction to Algorithms. MIT Press. Chapter 22.
Robert Sedgewick. Algorithms in C (Αλγόριθμοι σε C). 3rd American Edition. Kleidarithmos Publications (Εκδόσεις Κλειδάριθμος). Chapter 1.