# Union-Find structure Mikko Malinen School of Computing University of Eastern Finland.


Basic set operations. Given several sets, find the one to which an element a belongs. Form the union of two sets, which are usually assumed to be disjoint. Test whether an element belongs to a set. Add an element to a set. Remove an element from a set if it is in that set. If the set is linearly ordered, find its smallest element.

Union-Find structure. An abstract data type set(T) has three procedures: createset(x: T) returns set; findset(x: T) returns set; union(S1, S2: set) returns set.

createset(x) forms a set consisting of one element, {x}. findset(x) returns the set to which x belongs. union(S1, S2) forms the union of the sets S1 and S2. In the union operation the sets S1 and S2 are destroyed, so no element can belong to more than one set. We are interested in a task that consists of a sequence of createset, union and findset operations.

Trivial solution: representing a set by a list. The union can be formed in constant time by combining the lists. findset takes time O(n) when there are n elements.
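The list representation can be sketched as follows. This is a minimal illustration, not the presenter's code: the names `Node` and `ListSet`, and the choice of a singly linked list with head and tail pointers (which is what makes constant-time concatenation possible), are assumptions.

```python
class Node:
    """One cell of a singly linked list (hypothetical helper)."""
    def __init__(self, value):
        self.value = value
        self.next = None


class ListSet:
    """A set stored as a linked list with head and tail pointers."""
    def __init__(self, x):
        node = Node(x)
        self.head = node
        self.tail = node

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node.value
            node = node.next


def union(s1, s2):
    """Append s2 to s1 in O(1). s2 is emptied, mirroring the destructive union."""
    s1.tail.next = s2.head
    s1.tail = s2.tail
    s2.head = s2.tail = None
    return s1


def findset(sets, x):
    """Scan the lists until x is found: O(n) for n elements in total."""
    for s in sets:
        if any(value == x for value in s):
            return s
    return None
```

The caller is assumed to keep the collection of live sets itself, for example as a Python list from which the consumed set is dropped after each union.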

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Trivial solution: representing a set by a bit vector. Let U be an ordered base set with |U| = n. A subset S of U is represented as an n-bit vector whose i-th bit is 1 if the i-th element of U belongs to S. Union can be implemented as a bit vector operation (in one step, if n is not too big); in general it requires time O(|U|), and each set requires space O(|U|). findset requires time O(n).
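A small sketch of the bit vector idea, using Python integers as bit vectors. The base set `U`, the `index` lookup table and the helper names `member` and `elements` are assumptions made for illustration only.

```python
U = ["a", "b", "c", "d", "e"]              # ordered base set, n = |U| = 5
index = {u: i for i, u in enumerate(U)}    # position of each element in U


def createset(x):
    """Bit vector with only the bit of x set."""
    return 1 << index[x]


def union(s1, s2):
    """Bitwise OR; a single machine operation when n fits in one word."""
    return s1 | s2


def member(s, x):
    """True if the bit of x is set in s."""
    return (s >> index[x]) & 1 == 1


def findset(sets, x):
    """Test each candidate set in turn until the one containing x is found."""
    for s in sets:
        if member(s, x):
            return s
    return None


def elements(s):
    """Decode a bit vector back into the elements of U that it contains."""
    return [u for u in U if member(s, u)]
```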

Trivial solution: representing a set as a table. Union requires time O(n). findset can be implemented in constant time if the elements are ordered; otherwise it takes O(n).
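One possible reading of the table representation is sketched below, assuming the elements are the integers 0..n-1 so that an element can index the table directly. The class name `TableSets` and the use of integer set labels are hypothetical.

```python
class TableSets:
    """label[i] holds the identifier of the set that contains element i."""

    def __init__(self, n):
        # Initially every element is alone in a set named after itself.
        self.label = list(range(n))

    def findset(self, x):
        """Constant time: a single table lookup."""
        return self.label[x]

    def union(self, s1, s2):
        """O(n): relabel every member of set s2 as a member of set s1."""
        for i, s in enumerate(self.label):
            if s == s2:
                self.label[i] = s1
        return s1
```

This trades the cheap union of the list representation for a cheap findset.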

Tree representation. Sets are represented by a forest, in which each set is a tree. We choose the root of a tree to be the representative of the set: if vertex x is the root of the tree T, then [x] denotes the set formed by the vertices of T.


Tree implementation. The operation makeset(x) forms a tree whose only vertex is the root x. In the operation findset(x), a path is followed from vertex x upwards until the root y is reached; [y] is then the result. The operation union([x],[y]) is implemented by making vertex x a child of vertex y; [y] is then the union set. Problem: the tree may become unbalanced.
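The tree implementation, and the way it can degenerate, might be sketched like this. The class name `Forest`, the `depth` helper and the convention that a root is its own parent are assumptions, not part of the slides.

```python
class Forest:
    """Plain forest representation, without balancing or path compression."""

    def __init__(self):
        self.parent = {}

    def makeset(self, x):
        # Assumed convention: a root points to itself.
        self.parent[x] = x

    def findset(self, x):
        """Follow parent links upwards until the root is reached."""
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        """x and y must be roots; x becomes a child of y."""
        self.parent[x] = y
        return y

    def depth(self, x):
        """Number of links between x and its root (hypothetical helper)."""
        steps = 0
        while self.parent[x] != x:
            x = self.parent[x]
            steps += 1
        return steps


# An unlucky union order produces a chain 1 -> 2 -> 3 -> 4 -> 5.
forest = Forest()
for v in range(1, 6):
    forest.makeset(v)
for v in range(1, 5):
    forest.union(forest.findset(v), v + 1)
```

In this chain, findset(1) has to walk through every other vertex, which is the imbalance the slide refers to.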

Solutions to unbalanced trees. Solution 1: balancing. In the operation union([x],[y]), the new root is whichever of x and y has the taller tree. Solution 2: path compression. When a root y has been found as the result of findset(x), the parent of every vertex on the path from x to y is set to y.
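Both solutions can be combined in one structure, as in the sketch below. The class name `UnionFind` and the `rank` field are assumptions: once path compression is applied, the stored value is only an upper bound on the true height, which is why it is usually called a rank. Unlike the slide's union([x],[y]), this `union` accepts arbitrary elements and looks up their roots first.

```python
class UnionFind:
    """Forest with balancing (union by rank) and path compression."""

    def __init__(self):
        self.parent = {}
        self.rank = {}      # upper bound on the height of each root's tree

    def makeset(self, x):
        self.parent[x] = x
        self.rank[x] = 0

    def findset(self, x):
        # First pass: locate the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): hang every vertex on the path
        # directly under the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.findset(x), self.findset(y)
        if rx == ry:
            return rx
        # Balancing: the root of the taller tree becomes the new root.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return rx
```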


Time complexity. We examine an operation sequence with n makeset operations and m findset operations. New elements are created only by makeset, so n is the number of elements and n - 1 is an upper bound on the number of union operations. Despite balancing, a tree of height log n may be formed. If we assumed every findset operation to be this expensive, the whole task would require time O(m log n). This estimate is too pessimistic.
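The log n height bound can be justified by a short induction. This is a sketch that assumes balancing alone, without path compression; the notation $h(T)$ for the height and $|T|$ for the number of vertices of a tree $T$ is introduced here for illustration.

```latex
\textbf{Claim.} With balancing, every tree $T$ satisfies $|T| \ge 2^{h(T)}$.

\textbf{Base case.} After makeset, $|T| = 1 = 2^{0}$ and $h(T) = 0$.

\textbf{Step.} Let $T_1, T_2$ be united with $h(T_1) \ge h(T_2)$, so the root of $T_1$ becomes the new root.
\begin{itemize}
  \item If $h(T_1) > h(T_2)$, the height stays $h(T_1)$ and
        $|T_1| + |T_2| \ge |T_1| \ge 2^{h(T_1)}$.
  \item If $h(T_1) = h(T_2) = h$, the height becomes $h + 1$ and
        $|T_1| + |T_2| \ge 2^{h} + 2^{h} = 2^{h+1}$.
\end{itemize}

\textbf{Consequence.} With $n$ elements, $2^{h(T)} \le |T| \le n$, so
$h(T) \le \log_2 n$, and $m$ findset operations cost at most
$O(m \log n)$ in total.
```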

Time complexity. A more accurate analysis is based on the idea of balancing the costs. Let A be the Ackermann function; a kind of inverse of it, α(m, n), grows extremely slowly: α(m, n) <= 3 for all conceivable values of the arguments m and n. If a union-find task has n union operations and m findset operations, with m >= n, it can be executed in time O(m α(m, n)) (proof omitted).

Applications


PNN Clustering

Pseudo code

Example of the overall process (figure): M=5000, M=4999, M=4998, ..., M=50, ..., M=16, M=15.

Detailed example of the process

Example - 25 Clusters: MSE ≈ 1.01 × 10^9

Example - 24 Clusters: MSE ≈ 1.03 × 10^9

Example - 23 Clusters: MSE ≈ 1.06 × 10^9

Example - 22 Clusters: MSE ≈ 1.09 × 10^9

Example - 21 Clusters: MSE ≈ 1.12 × 10^9

Example - 20 Clusters: MSE ≈ 1.16 × 10^9

Example - 19 Clusters: MSE ≈ 1.19 × 10^9

Example - 18 Clusters: MSE ≈ 1.23 × 10^9

Example - 17 Clusters: MSE ≈ 1.26 × 10^9

Example - 16 Clusters: MSE ≈ 1.34 × 10^9

Example - 15 Clusters: MSE ≈ 1.34 × 10^9

Revised PNN algorithm

The complexity of the Union-Find program depends on n, the number of union operations, and m, the number of findset operations. Traditional partitioning takes time T = NM. Querying in the revised PNN algorithm is fast when the number of queries is low.
