Union-find Algorithm Presented by Michael Cassarino.

Introduction This presentation discusses the union-find algorithm, which is primarily used to maintain a collection of disjoint sets and to determine whether two elements are connected (belong to the same set). The goal of this presentation is to define the key elements of the union-find algorithm and to provide an overview of its purpose, implementation, and efficiency. Finally, an example is presented showing how Kruskal’s algorithm uses the union-find algorithm to help solve the minimum spanning tree problem.

Outline: Disjoint sets; Union-find algorithm; Spanning trees; Kruskal’s algorithm

Disjoint Sets Disjoint sets are a collection of sets whose members do not overlap: no element belongs to more than one set (the sets are mutually exclusive). Disjoint sets can be thought of as a series of trees in a forest. They are primarily used to determine the connected components of an undirected graph.

Disjoint Sets Each disjoint set has a representative that identifies the set. The representative is a member of the set and can be thought of as the parent of the other members in the set. It does not matter which member of the set is the representative. The only thing that matters is that, as long as the set has not been modified, the same member is returned whenever the representative is requested. Typically the first member of the set is made the representative.

Disjoint Sets Disjoint sets are typically implemented using an array that has an element for each item in the disjoint sets. If the element is the representative of a set, then its value equals its index number. Otherwise the value is the index number of another element in the set, giving rise to a linked list that eventually ends at the representative.
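
The array representation described above can be sketched in Python as follows. This is a minimal illustration only; the name `parent`, the helper `representative`, and the sample contents are assumptions made for the example and do not come from the presentation.

```python
# parent[i] == i marks a representative; otherwise parent[i] is the index
# of another member of the same set.
# Hypothetical contents: two sets, {0, 1, 2} and {3, 4}.
parent = [0, 0, 1, 3, 3]

def representative(parent, x):
    """Follow the links from x until reaching an element that points to itself."""
    while parent[x] != x:
        x = parent[x]
    return x
```

Here element 2 points to 1, which points to 0, so following the links from 2 ends at representative 0.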

Disjoint Sets The disjoint set implementation is not limited to a linked list structure. A tree structure, where more than one element points to the same parent, can also be used. The benefits of this type of structure are discussed later in this presentation.

Disjoint Sets Some algorithms group n distinct elements into a collection of equivalence classes (disjoint sets). For example, you may have n people and want to group them by income level before evaluating them. These disjoint sets grow dynamically as the algorithm progresses due to merging of sets. Two important operations on these disjoint sets are: (1) finding which disjoint set a given element belongs to; (2) uniting two disjoint sets.

Union-find Algorithm Used to detect cycles: find the representatives of two nodes and compare them. If the representatives are the same, a path already exists between the two nodes, so an edge joining them would create a cycle. If the representatives differ, the two sets are joined together. Four primary functions: Make (x), Find (x), Union (x, y), Link (x, y).
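
As a rough sketch of cycle detection with these operations, the function below processes the edges of an undirected graph one at a time. The function name `has_cycle` and the edge-list input format are assumptions made for this example, and the find used is the simplest version with no optimizations.

```python
def has_cycle(num_nodes, edges):
    """Return True if the undirected edge list contains a cycle."""
    parent = list(range(num_nodes))  # every node starts as its own set

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edges:
        a, b = find(u), find(v)
        if a == b:
            # Same representative: u and v are already connected,
            # so this edge closes a cycle.
            return True
        parent[b] = a  # otherwise join the two sets
    return False
```

For example, the path 0-1-2 has no cycle, while adding the edge (2, 0) creates one.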

Union-find Algorithm Make (x) Used to initialize the elements being evaluated by creating a series of disjoint sets, each of which has a single element as its only member and representative. Because the sets are disjoint, an element cannot already be in another set. The pseudo code below initializes all x elements at once, numbered 0 to x – 1.
Pseudo Code:
    Make (x)
        for i = 0 to x – 1
            Array[i] = i
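
A Python rendering of the Make pseudo code might look like the following. The name `make_sets` is an assumption made for this sketch, and a Python list stands in for the array.

```python
def make_sets(n):
    """Create n singleton sets: element i is its own representative."""
    return list(range(n))

array = make_sets(5)
# Count the representatives (elements that point to themselves).
num_sets = sum(1 for i, p in enumerate(array) if i == p)
```

Immediately after initialization every element is a representative, so the number of sets equals the number of elements.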

Union-find Algorithm Find (x) Determines the set in which an element ‘x’ resides. Returns the representative of the set that contains ‘x’. The number of steps to complete a Find (x) is proportional to the number of elements that must be traversed until the representative is found.
Pseudo Code:
    Find (x)
        if Array[x] != x
            return Find (Array[x])
        return x
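
The Find pseudo code translates almost directly to Python. The sketch below also includes a hypothetical helper, `find_steps`, that counts how many links are traversed, to illustrate that the cost is proportional to the length of the path to the representative. Both function names and the sample array are assumptions made for this example.

```python
def find(array, x):
    """Recursive Find, mirroring the pseudo code."""
    if array[x] != x:
        return find(array, array[x])
    return x

def find_steps(array, x):
    """Iterative Find that also reports the number of links followed."""
    steps = 0
    while array[x] != x:
        x = array[x]
        steps += 1
    return x, steps

# Hypothetical worst case: a single chain 4 -> 3 -> 2 -> 1 -> 0.
chain = [0, 0, 1, 2, 3]
```

On the chain above, finding the representative of element 4 follows four links, one per element between it and the representative.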

Union-find Algorithm Union (x,y) Unites the disjoint sets that contain ‘x’ and ‘y’ into a new set if ‘x’ and ‘y’ are not already connected. A destructive operation in the sense that the original sets are lost when they are combined to form the new set. Implemented by calling two Find operations and one Link operation on the resulting representatives. Each Union (x, y) that joins two different sets reduces the number of sets by one. After n-1 such Union (x,y) operations, only one set will remain.
Pseudo Code:
    Union (x, y)
        A = Find (x)
        B = Find (y)
        if A != B then Link (A, B)

Union-find Algorithm Link (x,y) Completes the Union by making the representative of one of the disjoint sets point to the representative of the other disjoint set. Both arguments are representatives (the results of the two Find calls). Depending on the type of data structure used, Link(x,y) can be very simple or very complex.
Pseudo Code:
    Link (x, y)
        Array[y] = x    (for a simple tree structure)
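
Union and Link can be sketched together in Python as shown below. Note that in this sketch Link receives the two representatives returned by Find rather than the original elements; linking non-representatives would detach part of a set. The function names, the boolean return value of `union`, and the sample calls are assumptions made for this example.

```python
def find(array, x):
    while array[x] != x:
        x = array[x]
    return x

def link(array, a, b):
    """Make representative a the parent of representative b."""
    array[b] = a

def union(array, x, y):
    """Unite the sets containing x and y. Returns True if a merge happened."""
    a, b = find(array, x), find(array, y)
    if a == b:
        return False      # already connected, nothing to do
    link(array, a, b)     # link the representatives, not x and y
    return True

array = list(range(5))
results = [union(array, 0, 1), union(array, 1, 2), union(array, 0, 2)]
remaining_sets = sum(1 for i, p in enumerate(array) if i == p)
```

The first two calls each merge two sets, while the third finds that 0 and 2 are already connected, leaving three of the original five sets.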

Union-find Algorithm Implementation: There are two basic structures that are commonly used when implementing Union- find: Linked List Rooted Tree

Union-find Algorithm Linked List Implementation: The simplest implementation is to represent each disjoint set as a linked list. The first element in the linked list is the representative. The Find operation traverses a list until it reaches the representative, taking at worst O(n) time. The Union operation calls Find for each of its arguments and then simply concatenates the two linked lists. This implementation makes Find inefficient because after m Union operations you can end up with one long list. The cost of performing m Union operations is O(mn).

Union-find Algorithm Linked List Implementation Improvements: The most obvious improvement to a linked list structure is to have each element maintain a pointer directly to its representative. This plays the same role that Path Compression plays in the tree implementation and improves the cost of doing a Find to O(1). However, when Union concatenates two lists it must spend additional time changing the representative pointer of each element in the list that was appended onto the end of the other list. This takes O(n) time, so the cost of performing m Union operations is still O(mn). A further improvement is to have each representative keep track of the number of elements in its list and to have Union always append the smaller list onto the end of the larger list. This guarantees that the representative pointers of the smaller list are always the ones updated, so an element's pointer changes at most log n times, because the list containing it at least doubles in size each time. This is called Weighted Union, and it improves the cost of doing m Union operations to O(m log n).
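
The two linked-list improvements can be sketched as follows. As a simplification, Python lists stand in for linked lists here, and the class name `ListSets` and the `updates` counter are assumptions added purely to make the cost visible; they are not part of the presentation.

```python
class ListSets:
    def __init__(self, n):
        self.rep = list(range(n))                  # direct pointer to the representative
        self.members = {i: [i] for i in range(n)}  # one member list per representative
        self.updates = 0                           # representative pointers rewritten so far

    def find(self, x):
        return self.rep[x]                         # O(1): no traversal needed

    def union(self, x, y):
        a, b = self.rep[x], self.rep[y]
        if a == b:
            return
        # Weighted union: always append the smaller list onto the larger one.
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a
        for m in self.members[b]:
            self.rep[m] = a
            self.updates += 1
        self.members[a].extend(self.members.pop(b))

sets = ListSets(8)
for x, y in [(0, 1), (2, 3), (4, 5), (6, 7), (0, 2), (4, 6), (0, 4)]:
    sets.union(x, y)
```

Merging eight elements pairwise in this order rewrites 12 pointers in total, within the n log n = 24 bound that weighted union gives for n = 8.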

Union-find Algorithm Rooted Tree Implementation: The most common implementation is to represent each disjoint set as a tree. Each child node in the tree points to its parent, with the root of the tree being the representative. The Find operation is implemented by walking up the tree to the root, taking at worst O(n) time. The Union operation calls Find for each of its arguments and then makes the root of one tree a child of the root of the other tree, a step that itself takes O(1) time. This implementation is inefficient because you can end up with a tree of linear height that makes subsequent Finds costly. The cost of performing m Union operations is O(mn).

Union-find Algorithm Rooted Tree Implementation Improvements: Similar to the linked list implementation, there are two methods of improving the efficiency of the rooted tree implementation. Path compression takes each node between the argument node and the root and points it directly at the root instead of at its parent. This adds work to the Find that performs it, because you have to travel up the path to find the root and then traverse the path again to change the parent pointers, but it reduces the cost of each subsequent Find. Union by rank, which is similar to the linked list's weighted Union, stores an upper bound on the height of the tree (its rank) in the root and always makes the root of the lower-ranked tree a child of the root of the higher-ranked tree. This keeps the height of the new tree small, which makes subsequent Find operations less costly. Union by rank alone brings the cost of m operations to O(m log n). When Union by Rank and Path Compression are used together, the cost of m operations drops to O(m α(n)), where α is the very slowly growing inverse Ackermann function, so the total is nearly linear in m.
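
A minimal sketch of a rooted-tree structure with both improvements is given below. The class name `DisjointSet` and its attribute names are assumptions made for this example; the two-pass `find` follows the up-then-rewrite description above.

```python
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n          # upper bound on the height of each root's tree

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        a, b = self.find(x), self.find(y)
        if a == b:
            return False
        # Union by rank: the lower-ranked root becomes the child.
        if self.rank[a] < self.rank[b]:
            a, b = b, a
        self.parent[b] = a
        if self.rank[a] == self.rank[b]:
            self.rank[a] += 1
        return True

# Path compression on a hand-built chain 4 -> 3 -> 2 -> 1 -> 0.
ds = DisjointSet(6)
ds.parent = [0, 0, 1, 2, 3, 5]
root = ds.find(4)

# Union by rank on four singletons.
ds2 = DisjointSet(4)
ds2.union(0, 1)
ds2.union(2, 3)
ds2.union(0, 2)
```

After the single `find(4)` call, every node on the chain points straight at the root, so later finds on those nodes follow one link at most.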

Spanning Trees A spanning tree of a graph is a subgraph that connects all the vertices of the graph without creating a cycle. In other words, it is a set of edges that reaches every vertex in the graph and forms a tree; for a connected graph with n vertices it has exactly n - 1 edges. A graph may have many spanning trees. The weight of a spanning tree is the sum of the weights of its edges.

Minimum Spanning Tree A minimum spanning tree is a spanning tree of a graph that has the smallest possible weight. The minimum spanning tree problem has a long history that dates back to the first algorithms in 1926. A minimum spanning tree is useful in constructing networks by determining the way to connect a set of sites using the smallest amount of wire. In fact much of the work on minimum spanning tree algorithms was done by the phone company.

Kruskal’s Algorithm An algorithm for computing the minimum spanning tree of a graph by continually joining two trees in a forest until there is only one tree (the minimum spanning tree). Relies on the Union-find Algorithm to check for cycles and to add an edge to the minimum spanning tree. A greedy algorithm because it chooses the cheapest edge at each step and adds it to the minimum spanning tree.

Kruskal’s Algorithm Overview: Initially creates a forest of trees, each a singleton set that represents one vertex in the graph. Creates a list of every edge in the graph. Sorts the edges from lowest weight to highest weight. Until n - 1 edges (where n is the number of vertices in the graph) are added to the minimum spanning tree, the following takes place: The next lowest cost edge is extracted from the list. A Union operation is performed to see if the edge would form a cycle. If a cycle would be formed (both vertices of the edge are already in the same tree of the forest), the edge is discarded. Otherwise the Union operation unites the two respective trees in the forest and the edge is added to the minimum spanning tree.
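
The overview above can be sketched in Python roughly as follows. The function name `kruskal`, the `(weight, u, v)` edge format, and the sample graph are assumptions made for this example. The inner `find` uses path halving, a common variant of path compression, which is a choice made here rather than something the presentation specifies.

```python
def kruskal(num_vertices, edges):
    """edges: iterable of (weight, u, v). Returns (total_weight, tree_edges)."""
    parent = list(range(num_vertices))       # forest of singleton trees

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    tree, total = [], 0
    for weight, u, v in sorted(edges):       # lowest weight first
        if len(tree) == num_vertices - 1:
            break                            # spanning tree is complete
        a, b = find(u), find(v)
        if a != b:                           # different trees: no cycle
            parent[b] = a
            tree.append((u, v, weight))
            total += weight
        # else: both endpoints are in the same tree, so the edge is discarded
    return total, tree

# Hypothetical weighted graph on four vertices.
sample_edges = [(1, 0, 1), (4, 0, 2), (3, 0, 3), (2, 1, 2), (5, 1, 3), (6, 2, 3)]
total, tree = kruskal(4, sample_edges)
```

On this sample graph the three cheapest edges happen to join different trees, so they form the minimum spanning tree with total weight 6.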

Kruskal’s Algorithm Efficiency: Requires O(V) time to create the forest of trees, where V is the number of vertices in the graph. Requires O(E log E) time for sorting the edges, where E is the number of edges. Requires O(E) time to traverse the sorted edges. Requires O(log V) time for each Union-find operation when union by rank is used, or O(E log V) for all of them. Efficiencies simplify to a time complexity of O(E log E).
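
The final simplification can be spelled out as below. This derivation assumes a connected simple graph, so that V - 1 ≤ E ≤ V(V - 1)/2, and a union-find structure using union by rank, so that each operation costs O(log V). Neither assumption is stated explicitly on the slide.

```latex
\begin{align*}
T(V, E) &= \underbrace{O(V)}_{\text{create forest}}
         + \underbrace{O(E \log E)}_{\text{sort edges}}
         + \underbrace{O(E \log V)}_{\text{union-find operations}} \\
E \le \tfrac{V(V-1)}{2} < V^2 &\implies \log E < 2 \log V \\
V - 1 \le E &\implies O(V) = O(E) \text{ and } \log V = O(\log E) \\
T(V, E) &= O(E \log E) = O(E \log V)
\end{align*}
```

Sorting therefore dominates, and the bound can be written with either log E or log V.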

Kruskal’s Algorithm Example: Problem statement: There are four islands separated by a small body of water. The inhabitants of the islands want to build bridges that will allow them to drive from one island to any of the others. However, because of the cost of building the bridges they want to determine the locations of the bridges that will produce the lowest building cost.

Kruskal’s Algorithm Example Step 1: Create a forest of trees, each a singleton set that represents the vertices in the graph.

Kruskal’s Algorithm Example Step 2: Create a list of every possible edge in the graph.

Kruskal’s Algorithm Example Step 3: Sort the edges from lowest to highest weight.

Kruskal’s Algorithm Example Step 4: Build minimum spanning tree until n-1 edges are added.
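
The four steps can be traced in code as sketched below. The bridge costs are hypothetical values chosen only for illustration, since the figures from the original slides are not available here; the island names A to D and all variable names are likewise assumptions.

```python
# Hypothetical bridge costs between four islands (not from the presentation).
costs = {("A", "B"): 3, ("A", "C"): 1, ("A", "D"): 4,
         ("B", "C"): 2, ("B", "D"): 6, ("C", "D"): 5}

# Step 1: a forest of singleton trees, one per island.
parent = {island: island for island in "ABCD"}

def find(x):
    while parent[x] != x:
        x = parent[x]
    return x

# Steps 2 and 3: list every possible bridge, then sort by cost.
edges = sorted(costs.items(), key=lambda item: item[1])

# Step 4: add bridges until n - 1 = 3 have been chosen.
bridges, log = [], []
for (u, v), cost in edges:
    if len(bridges) == len(parent) - 1:
        break
    a, b = find(u), find(v)
    if a == b:
        log.append((u, v, "discarded"))   # would form a cycle
    else:
        parent[b] = a
        bridges.append((u, v, cost))
        log.append((u, v, "added"))

total_cost = sum(cost for _, _, cost in bridges)
```

With these made-up costs, the A-B bridge is discarded because A and B are already connected through C, and the three chosen bridges cost 7 in total.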

Union-find Algorithm Question And Answer Session
