
1 Iterative Optimization of Hierarchical Clusterings Doug Fisher Department of Computer Science, Vanderbilt University Journal of Artificial Intelligence Research 4 (1996) 147-179 Presentation: Yugong Cheng 04/23/02

2 Outline
–Introduction
–Objective Function
–Iterative Optimization Methods and Experiments
–Simplification of Hierarchical Clustering
–Conclusion
–Final Exam Questions
–Summary

3 Introduction
Clustering is a process of unsupervised learning, which groups objects into clusters.
Major clustering methods:
–Partitioning
–Hierarchical
–Density-based
–Grid-based
–Model-based

4 Introduction (Continued)
Clustering systems differ in their
–objective function
–control (search) strategy
Usually a search strategy cannot be both computationally inexpensive and offer guarantees about the quality of the resulting clustering.

5 Introduction (Continued)
This paper discusses the use of iterative optimization and simplification to construct clusterings that satisfy both conditions:
–high quality
–computationally inexpensive
The suggested method involves two steps:
–constructing an initial clustering inexpensively
–using an iterative optimization method to improve the clustering

6 Category Utility
CU(C_k) = P(C_k) \sum_i \sum_j \left[ P(A_i = V_{ij} \mid C_k)^2 - P(A_i = V_{ij})^2 \right]
PU(\{C_1, C_2, \ldots, C_N\}) = \sum_k CU(C_k) / N
where an observation is a vector of values V_ij along attributes (or variables) A_i. This measure rewards clusters C_k that increase the predictability of the V_ij within C_k (i.e., P(A_i = V_{ij} | C_k)) relative to their predictability in the population as a whole (i.e., P(A_i = V_{ij})).
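
As a concrete illustration, here is a minimal Python sketch of both measures over nominal data; the tuple-of-values representation and positional attribute indexing are choices made for this sketch, not the paper's:

```python
from collections import Counter

def category_utility(cluster, population, attributes):
    """CU(C_k): P(C_k) times the summed gain in squared predictability of
    each attribute value inside the cluster over the whole population."""
    p_ck = len(cluster) / len(population)
    score = 0.0
    for a in attributes:                        # attribute positions
        within = Counter(obs[a] for obs in cluster)
        overall = Counter(obs[a] for obs in population)
        for v in overall:                       # every value V_ij seen overall
            score += (within[v] / len(cluster)) ** 2 \
                   - (overall[v] / len(population)) ** 2
    return p_ck * score

def partition_utility(partition, population, attributes):
    """PU: mean CU over the N clusters of a partition."""
    return sum(category_utility(c, population, attributes)
               for c in partition) / len(partition)

# Example: a clean split scores higher than a mixed split.
data = [(0, 0), (0, 0), (1, 1), (1, 1)]
print(partition_utility([data[:2], data[2:]], data, [0, 1]))     # 0.5
print(partition_utility([data[::2], data[1::2]], data, [0, 1]))  # 0.0
```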


8 Hierarchical Sorting
Given an observation and the current partition, evaluate the quality of the clusterings that result from
–placing the observation in each of the existing clusters
–creating a new cluster that covers only the new observation
Select the option that yields the highest quality score (PU); see the sketch below.
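
A minimal sketch of this greedy step, reusing partition_utility from the earlier sketch; the list-of-lists partition representation is an assumption of the sketch:

```python
def sort_observation(obs, partition, population, attributes):
    """Try obs in every existing cluster and in a new singleton cluster;
    return the candidate partition with the highest PU.  `population` is
    every observation seen so far, including obs."""
    candidates = [partition[:i] + [partition[i] + [obs]] + partition[i + 1:]
                  for i in range(len(partition))]
    candidates.append(partition + [[obs]])        # option: new singleton
    return max(candidates,
               key=lambda cand: partition_utility(cand, population, attributes))
```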


10 Iterative Optimization Methods
–Reorder-resort (CLUSTER/2-style): seed selection, reordering, and re-clustering
–Iterative redistribution of single observations: moving single observations one by one
–Iterative hierarchical redistribution: moving a cluster together with its sub-tree

11 Reorder-resort (k-means)
k-means: k random seeds are selected, and k clusters are grown around these attractors; the centroids of the clusters are then picked as new seeds, and new clusters are grown. The process iterates until there is no further improvement in the quality of the generated clustering.
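
For reference, a minimal plain k-means sketch on numeric tuples (a standard formulation, not the paper's code):

```python
import random

def kmeans(points, k, max_iters=100):
    """Grow k clusters around random seeds, recompute centroids as the new
    seeds, and repeat until the assignment stops changing."""
    seeds = random.sample(points, k)
    assignment = None
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        new_assignment = []
        for p in points:
            i = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, seeds[j])))
            clusters[i].append(p)
            new_assignment.append(i)
        if new_assignment == assignment:          # converged
            return clusters
        assignment = new_assignment
        seeds = [tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else seeds[i]
                 for i, cl in enumerate(clusters)]
    return clusters
```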

12 Reorder-resort (k-means)
–Ordering the data so that consecutive observations are dissimilar (based on Euclidean distance) leads to good clusterings
–A biased "dissimilarity" ordering is extracted from the hierarchical clustering
–The full procedure: initial sorting, extraction of a dissimilarity ordering, re-clustering (see the sketch below)
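
One way to realize such an extraction; the nested-list tree representation and the round-robin policy are simplifying assumptions (the paper's extraction is more involved):

```python
def dissimilarity_ordering(tree):
    """Take one observation from each top-level subtree in turn, so that
    consecutive observations tend to come from different regions.  The
    hierarchy is a nested list whose innermost items are observations."""
    def flatten(t):
        if not isinstance(t, list):
            return [t]
        return [obs for subtree in t for obs in flatten(subtree)]
    queues = [flatten(subtree) for subtree in tree]
    order = []
    while any(queues):
        for q in queues:                  # round-robin across subtrees
            if q:
                order.append(q.pop(0))
    return order

# Example: alternates between the two regions of the hierarchy.
print(dissimilarity_ordering([[(0,), (1,)], [(8,), (9,)]]))
# [(0,), (8,), (1,), (9,)]
```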

13 Iterative Redistribution of Single Observations
–Moves single observations from cluster to cluster
–A cluster that contains only one observation is removed, and its single observation is resorted
–Iterate until two consecutive iterations yield the same clustering
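
A sketch of the sequential form, reusing sort_observation and partition_utility from above; observations are assumed to be distinct tuples:

```python
def redistribute_singles(partition, attributes):
    """Pull each observation out of its cluster and resort it greedily,
    dissolving any cluster this empties; repeat passes until two
    consecutive passes yield the same clustering."""
    population = [obs for c in partition for obs in c]
    prev = None
    while prev != frozenset(frozenset(c) for c in partition):
        prev = frozenset(frozenset(c) for c in partition)
        for obs in population:
            home = next(c for c in partition if obs in c)
            home.remove(obs)
            if not home:                      # dissolve the emptied cluster
                partition.remove(home)
            partition = sort_observation(obs, partition, population, attributes)
    return partition
```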

14 Single Observation Redistribution Variations
–The ISODATA algorithm determines a target cluster for each observation but moves nothing until targets for all observations have been determined
–A sequential version moves each observation as soon as its target is identified through sorting
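
A sketch of the batch (ISODATA-style) pass under the same representation; restricting targets to the existing clusters is a simplification made here:

```python
def batch_redistribute(partition, attributes):
    """Phase 1: pick a target cluster for every observation against the
    *fixed* current partition.  Phase 2: apply all moves at once.  The
    sequential variant instead moves each observation as soon as its
    target is identified."""
    population = [obs for c in partition for obs in c]
    targets = []
    for obs in population:
        stripped = [[o for o in c if o != obs] for c in partition]

        def pu_if_moved(i):
            cand = [c + [obs] if j == i else c for j, c in enumerate(stripped)]
            return partition_utility([cl for cl in cand if cl],
                                     population, attributes)

        targets.append((obs, max(range(len(partition)), key=pu_if_moved)))
    moved = [[] for _ in partition]               # phase 2: apply all moves
    for obs, i in targets:
        moved[i].append(obs)
    return [c for c in moved if c]
```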

15 Iterative Hierarchical Redistribution
–Takes large steps in the search for a better clustering
–Resorts a sub-tree instead of a single observation
–Removing a sub-tree requires that the various counts of its ancestors be decremented; likewise, the host cluster's variable-value counts must be incremented

16 Scheme
Given an existing hierarchical clustering, a recursive loop examines sibling clusters in the hierarchy in a depth-first fashion. An inner, iterative loop examines each sibling based on the objective function, and repeats until two consecutive iterations lead to the same set of siblings.

17 (Continued)
The recursive loop then turns its attention to the children of each of the remaining siblings. Eventually the leaves are reached and resorted. The recursive loop is applied repeatedly until no changes occur from one pass to the next; a sketch of this control structure follows.
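
A simplified sketch of the control structure over an explicit tree; the Node class, the leaf-promotion step, and recomputing counts from leaves (instead of the paper's incremental count updates) are all assumptions of this sketch:

```python
class Node:
    """Minimal cluster-tree node: a leaf stores observations, an internal
    node stores child subtrees."""
    def __init__(self, children=None, observations=None):
        self.children = children or []
        self.observations = observations or []

    def leaves(self):
        if not self.children:
            return list(self.observations)
        return [obs for c in self.children for obs in c.leaves()]

def children_pu(node, population, attributes):
    """PU of the partition induced by a node's children (their leaf sets)."""
    return partition_utility([c.leaves() for c in node.children],
                             population, attributes)

def hierarchical_redistribution(node, population, attributes):
    """Outer recursive loop over the tree; the inner iterative loop keeps
    re-homing whole sibling subtrees while that improves PU among this
    node's children, then recurses into the surviving children."""
    improved = True
    while improved and len(node.children) > 1:
        improved = False
        best = children_pu(node, population, attributes)
        for child in list(node.children):
            for sibling in list(node.children):
                if sibling is child:
                    continue
                node.children.remove(child)
                wrapped = not sibling.children and sibling.observations
                if wrapped:        # promote a leaf host's data into a child
                    sibling.children = [Node(observations=sibling.observations)]
                    sibling.observations = []
                sibling.children.append(child)        # tentative move
                if children_pu(node, population, attributes) > best:
                    improved = True
                    break
                sibling.children.remove(child)        # undo the move
                if wrapped:
                    sibling.observations = sibling.children.pop().observations
                node.children.append(child)
            if improved:
                break
    for child in node.children:
        hierarchical_redistribution(child, population, attributes)
```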


19 Experiment Conditions
–The initial clustering is generated by hierarchical sorting on either a random ordering of the observations, or a similarity ordering, which samples observations within the same region before sampling observations from differing regions
–The optimization strategies are then applied
–The primary goal of clustering is assumed to be the discovery of a single-level partitioning of the data that is of optimal quality

20 Comparison between Iterative Optimization Strategies

21 Main findings from the table:
–Hierarchical redistribution achieves the highest mean PU scores, in both the random and the similarity case, in 3 of 4 domains
–Reordering and re-clustering comes closest to hierarchical redistribution's performance in all cases, and betters it in 1 domain
–Single-observation redistribution modestly improves an initial sort, but is substantially worse than the other two optimization methods

22 Time Requirements

23 Level of Tree

24 Simplifying Hierarchical Clusterings
–Simplify the hierarchical clustering and minimize classification cost
–Minimize error rate
–A validation set is used to identify the frontier of clusters used for prediction of each variable
–A node that lies below the frontier of every variable is pruned

25 Validation
For each variable A_i, the objects from the validation set are each classified through the hierarchical clustering with the value of variable A_i "masked" for purposes of classification. At each cluster encountered during classification, the observation's value for A_i is compared to the most probable value for A_i at that cluster. A count of all correct predictions for each variable at each cluster is maintained. A preferred frontier is then identified for each variable that maximizes the number of correct predictions for that variable.
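
A sketch of the frontier search over the Node tree from the earlier sketch; the greedy-PU rule used here to route a masked observation downward is an assumption, not necessarily the paper's classification procedure:

```python
def most_probable_value(node, a):
    """Modal value of attribute a among a node's leaf observations."""
    return Counter(obs[a] for obs in node.leaves()).most_common(1)[0][0]

def best_frontier(node, validation, a, attributes):
    """Return (correct_count, frontier_nodes) for attribute a: either stop
    at this node, predicting its modal value for every validation
    observation routed here, or descend to the children, whichever
    predicts more validation values correctly."""
    here = sum(obs[a] == most_probable_value(node, a) for obs in validation)
    if not node.children:
        return here, [node]
    unmasked = [i for i in attributes if i != a]   # mask attribute a
    routed = {id(c): [] for c in node.children}
    for obs in validation:                         # route obs to one child
        best_child = max(node.children, key=lambda c: partition_utility(
            [c.leaves() + [obs]] + [s.leaves() for s in node.children if s is not c],
            node.leaves() + [obs], unmasked))
        routed[id(best_child)].append(obs)
    below, frontier = 0, []
    for c in node.children:
        score, f = best_frontier(c, routed[id(c)], a, attributes)
        below += score
        frontier += f
    return (here, [node]) if here >= below else (below, frontier)
```

A node that falls below the chosen frontier for every variable is then a candidate for pruning.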



28 Concluding Remarks
There are three phases in searching the space of hierarchical clusterings:
–inexpensive generation of an initial clustering
–iterative optimization of the clustering
–retrospective simplification of the generated clustering
The new method, hierarchical redistribution optimization, works well.

29 Final Exam Questions
1. The main idea of the paper is to construct clusterings that satisfy two conditions: 1) name the conditions, 2) name the two steps used to satisfy them.
1) The conditions are that the clusterings be of high quality and computationally inexpensive to build.
2) First construct a clustering inexpensively (hierarchical sorting), then use an iterative optimization method to improve its quality (reorder-resort, iterative redistribution of single observations, or hierarchical redistribution).

30 Final Exam Question 2. Describe the three iterative methods for clustering optimization.
–Reorder-resort (k-means): extract a biased "dissimilarity" ordering from the initial hierarchical clustering, then perform k-means partitioning iteratively.
–Iterative redistribution of single observations: move single observations one by one. A cluster that contains only one observation is removed and its single observation is resorted. Iterate until two consecutive iterations yield the same clustering.
–Hierarchical redistribution: takes large steps in the search for a better clustering by resorting sub-trees instead of single observations. A recursive loop examines sibling clusters in a depth-first fashion; an inner, iterative loop examines each sibling based on the objective function and repeats until two consecutive iterations lead to the same set of siblings. The recursive loop then turns to the children of each remaining sibling, until eventually the leaves are reached and resorted; the whole procedure is applied repeatedly until no changes occur from one pass to the next.

31 Final Exam Question 3.
(1) The cluster is better when the relative CU score is a) big, b) small, c) equal to 0. A cluster is better with a higher CU score, so the answer is a).
(2) Which ordering method is better? a) random ordering, b) similarity ordering. A dissimilarity ordering yields better clusterings, so a random ordering of the samples is better: the answer is a).

