Union Find ADT Data type for disjoint sets: makeSet(x): Given an element x create a singleton set that contains only this element. Return a locator/handle.


1 Union Find ADT. A data type for disjoint sets with three operations. makeSet(x): given an element x, create a singleton set that contains only this element, and return a locator/handle for x in the data structure. find(x): given a handle for an element x, find the set that contains x, and return a handle/identifier/pointer/label for this set. union(A,B): given two set identifiers, create the union of the two sets.

2 Union Find ADT. Applications: keeping track of the connected components of a dynamic graph that changes through insertions of nodes and edges; Kruskal's minimum spanning tree algorithm.
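To illustrate the second application, here is a minimal sketch of Kruskal's algorithm on top of a deliberately naive union-find (a plain parent array, no balancing). The function name `kruskal` and the `(weight, u, v)` edge format are assumptions made for this example, not part of the slides.

```python
def kruskal(n, edges):
    """Return (total_weight, chosen_edges) of a minimum spanning forest.

    n     -- number of vertices, labelled 0..n-1 (assumed labelling)
    edges -- iterable of (weight, u, v) tuples (assumed format)
    """
    parent = list(range(n))  # makeSet for every vertex

    def find(x):
        # Walk up to the root, which labels the component of x.
        while parent[x] != x:
            x = parent[x]
        return x

    total, chosen = 0, []
    for w, u, v in sorted(edges):      # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                   # u and v lie in different components
            parent[ru] = rv            # union of the two components
            total += w
            chosen.append((u, v, w))
    return total, chosen
```

An edge is kept exactly when its endpoints are in different sets, which is the connected-components query from the first application.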

3 Union Find ADT. List implementation: the elements of a set are stored in a list; each node has a backward pointer to the tail, and the tail of the list contains the label for the set. makeSet(x) needs constant time; find(x) also needs constant time.

4 Union Find ADT. Union: take the smaller of the two sets, change all its backward pointers to the label of the larger set, and insert the smaller list at the head of the larger one. Time: O(min(|A|,|B|)).
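The list implementation can be sketched as follows. This is a simplified variant, not the exact structure on the slides: instead of backward pointers to a tail node, each element stores its set label in a dictionary, and the smaller list is appended to the larger one rather than inserted at its head. The class name `ListUnionFind` is hypothetical.

```python
class ListUnionFind:
    """Disjoint sets stored as explicit member lists (illustrative sketch)."""

    def __init__(self):
        self.label = {}     # element -> label of the set containing it
        self.members = {}   # label   -> list of elements in that set

    def make_set(self, x):
        self.label[x] = x
        self.members[x] = [x]
        return x            # the element itself serves as the handle

    def find(self, x):
        return self.label[x]            # constant time

    def union(self, a, b):
        """Merge the sets labelled a and b; return the surviving label."""
        if a == b:
            return a
        if len(self.members[a]) > len(self.members[b]):
            a, b = b, a                 # make a the smaller set
        for x in self.members[a]:       # O(min(|A|, |B|)) relabelings
            self.label[x] = b
        self.members[b].extend(self.members.pop(a))
        return b
```

Only the elements of the smaller set are touched, which is what yields the O(min(|A|,|B|)) bound.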

5 Union Find. Lemma. The amortized running times are: find O(1), makeSet O(log n), union O(1). Proof idea: we partially charge the cost of a union operation to the elements involved. In total we charge at most O(log n) to each element. Since each element has to be created with makeSet(x), we obtain the amortized bound by inflating the cost of makeSet(x) to O(log n). find: actual cost and amortized cost are the same. union(A,B): if A == B, do nothing; the check takes time O(1). Otherwise add the smaller set to the larger one and charge the cost of this to the elements of the smaller set; each element is charged one. The cost left for the union itself: zero.

6 Union Find. How much do we charge to an element? Observation: whenever we charge one to an element x, the size of the set A_x that contains x increases by at least a factor of 2. Since no set has more than n elements, this can happen at most log n times, so the total charge to an element is at most O(log n).
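The doubling argument can be checked empirically. The sketch below merges n singleton sets in random order with the smaller-into-larger rule and records how often each element is relabeled (that is, charged). The function name `max_charge` and the random merge order are assumptions for illustration only.

```python
import random


def max_charge(n, seed=0):
    """Merge n singletons in random order; return the largest per-element charge."""
    rng = random.Random(seed)
    members = {i: [i] for i in range(n)}   # label -> elements of that set
    charge = [0] * n                       # relabelings paid by each element
    while len(members) > 1:
        a, b = rng.sample(list(members), 2)  # two distinct set labels
        if len(members[a]) > len(members[b]):
            a, b = b, a                      # a is the smaller set
        for x in members[a]:
            charge[x] += 1                   # x moves into a set at least twice as big
        members[b].extend(members.pop(a))
    return max(charge)
```

For any merge order the result cannot exceed log2(n), because each charge at least doubles the size of the set containing the element.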

7 Union Find. Implementation via trees: the root of the tree is the label of the set. Only pointers to the parent exist, so we cannot list all elements of a given set. Problem: find is no longer constant time. [Figure: example trees representing disjoint sets]

8 Union Find: Tree Implementation. Union of two sets: the root of the tree is the label of the set; store the size of a subtree in the root. [Figure: two trees annotated with subtree sizes]

9 Union Find: Tree Implementation. Union of two sets: the root of the tree is the label of the set; store the size of a subtree in the root; make the smaller tree a child of the root of the larger one. [Figure: the two trees after the union]
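A minimal sketch of the tree implementation with union by size follows. The class name `TreeUnionFind` and the dictionary-based storage are assumptions for this example; find does not yet use path compression.

```python
class TreeUnionFind:
    """Disjoint sets as parent-pointer trees with union by size (sketch)."""

    def __init__(self):
        self.parent = {}   # node -> parent; a root points to itself
        self.size = {}     # root -> number of nodes in its tree

    def make_set(self, x):
        self.parent[x] = x
        self.size[x] = 1
        return x

    def find(self, x):
        # Go upwards until the root is reached; the root labels the set.
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Link the trees with roots a and b; return the new root."""
        if a == b:
            return a
        if self.size[a] > self.size[b]:
            a, b = b, a                  # a is the root of the smaller tree
        self.parent[a] = b               # smaller tree becomes child of larger
        self.size[b] += self.size[a]
        return b
```

Here union expects set labels (roots), as in the ADT, so callers pass the results of find.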

10 Union Find: Tree Implementation. Find: go upwards until you find the root, then make all visited nodes children of the root (path compression). [Figure: the tree before path compression]

11 Union Find: Tree Implementation. Find: go upwards until you find the root, then make all visited nodes children of the root (path compression). [Figure: the tree after path compression]
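Path compression can be sketched as a two-pass find over a parent map. The function name `find` and the dictionary representation (a root maps to itself) are assumptions for this example.

```python
def find(parent, x):
    """Return the root of x and make every visited node a child of the root."""
    # First pass: go upwards until the root is found.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: re-hang every node on the traversed path below the root.
    while x != root:
        nxt = parent[x]
        parent[x] = root
        x = nxt
    return root
```

For example, on the chain 1 -> 2 -> 3 -> 4 a call `find(parent, 1)` returns 4 and leaves 1, 2 and 3 as direct children of 4.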

12 Tree Implementation: Analysis. union(A,B) can be done in time O(1); makeSet(x) is still trivial: O(1); the cost of find(x) may be large. Observation: the height of the trees is at most O(log n): if the distance of an element x to the root increases, the number of elements in the tree that contains x at least doubles. Hence the cost of find(x) is at most O(log n), without amortization.

13 Tree Implementation: Analysis. Can we do better? Yes, with amortization! Definitions: n(v) := the number of nodes that were in the subtree rooted at v when v became the child of another node. rank r(v) := $\lfloor \log_2 n(v) \rfloor$, which implies $n(v) \ge 2^{r(v)}$. Lemma: the rank of a parent p must be strictly larger than the rank of its child c. After c is linked to p, the rank of c does not change anymore, while the rank of p might still increase. Directly after the linking, the subtree of p contains at least 2 n(c) nodes, and it loses none of them while p is a root, so $r(p) \ge \lfloor \log_2(2\,n(c)) \rfloor = 1 + \lfloor \log_2 n(c) \rfloor = 1 + r(c) > r(c)$.

14 Tree Implementation: Analysis. Theorem. There are at most $n/2^s$ nodes of rank s. Proof: a node of rank s had at least $2^s$ nodes in its subtree when it became a child of another node. Nodes of the same rank have disjoint subtrees, as they cannot be ancestors of each other. [Observe that a node that was initially not in a subtree T cannot join this subtree via path compression.] More precisely: each node in the tree sees during its lifetime at most one ancestor of rank s; for each rank-s node there are at least $2^s$ nodes that have seen it; hence there can be at most $n/2^s$ nodes of rank s.

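15 Tree Implementation: Analysis. Definitions: $t(0) := 1$ and $t(g) := 2^{t(g-1)}$ for $g \ge 1$; $\log^*(n) := \min\{\, i \mid t(i) \ge n \,\}$. Theorem: we can obtain the following amortized running times: makeSet(x): $O(\log^* n)$; find(x): $O(\log^* n)$; union(A,B): O(1).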

16 Tree Implementation: Analysis. Group-number: a node with rank r[v] is in rank-group $\log^*(r[v])$. This means that rank-group g contains the ranks t(g-1)+1, …, t(g). There are at most $\log^*(n)$ different rank-groups.
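The rank-groups can be made concrete with a short sketch. It assumes that t is the tower function, t(0) = 1 and t(g) = 2^t(g-1), and that the group of a rank is its iterated logarithm; this is the usual choice in this style of analysis, and it matches the stated ranges t(g-1)+1, …, t(g). The names `tower`, `log_star` and `rank_group` are hypothetical.

```python
def tower(g):
    """Assumed tower function: t(0) = 1, t(g) = 2 ** t(g - 1)."""
    t = 1
    for _ in range(g):
        t = 2 ** t
    return t


def log_star(n):
    """Smallest i with tower(i) >= n (iterated logarithm under this convention)."""
    i = 0
    while tower(i) < n:
        i += 1
    return i


def rank_group(rank):
    """Group-number of a node with the given rank."""
    return log_star(rank)
```

Under these assumptions group 2 holds ranks 3 and 4, group 3 holds ranks 5 to 16, and group 4 holds ranks 17 to 65536, which shows how few groups exist for any practical n.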

17 Tree Implementation: Analysis. Accounting scheme: create an account for every find operation and an account for every node. The cost of a find operation is equal to the length of the path traversed. We charge the cost of going from v to parent[v] as follows: if the parent of v does not change due to path compression, we charge the cost to the find-account (at most cost 1 per find); if the group-number of rank[v] is the same as that of rank[parent[v]] (before starting path compression), we charge the cost to the node-account of v; otherwise we charge the cost to the find-account.

18 Tree Implementation: Analysis. Observations: a find-account is charged only $O(\log^* n)$ (the maximum number of rank-groups). After a node is charged, its parent is re-assigned to a node higher up in the tree, so the parent gets a larger rank. After some time the parent is in a larger rank-group, and the node will never be charged again. The charge to a node in rank-group g is at most t(g) - t(g-1) ≤ t(g). What is the total number of operations charged to nodes? The total charge is at most $\sum_g n(g) \cdot t(g)$, where n(g) is the number of nodes in group g.

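19 Tree Implementation: Analysis. For $g \ge 1$: $n(g) \le \sum_{s=t(g-1)+1}^{t(g)} \frac{n}{2^s} \le \frac{n}{2^{t(g-1)}} = \frac{n}{t(g)}$, hence $\sum_g n(g) \cdot t(g) \le n(0) \cdot t(0) + \sum_{g \ge 1} n = O(n \log^* n)$, as there are only $\log^*(n)$ groups.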

20 Tree Implementation: Analysis. If there are only n elements, we charge at most $O(n \log^* n)$ to these elements. Charging $O(\log^* n)$ to every makeSet() operation gives the result. The analysis is not tight. In fact it has been shown that the amortized time for the union-find implementation with path compression is O(α(n)), where α(n) is the inverse Ackermann function, which grows a lot slower than $\log^* n$. There is also a lower bound of Ω(α(n)).

