Cut-based & divisive clustering
Clustering algorithms: Part 2b
Pasi Fränti, 17.3.2014
Speech & Image Processing Unit, School of Computing
University of Eastern Finland, Joensuu, FINLAND

Part I: Cut-based clustering

Cut-based clustering
What is a cut?
Can we use graph theory in clustering?
Is normalized cut useful?
Are cut-based algorithms efficient?

Clustering method
Clustering method = defines the problem
Clustering algorithm = solves the problem
Problem defined as a cost function:
– Goodness of one cluster
– Similarity vs. distance
– Global vs. local cost function (what is a “cut”)
Solution: algorithm to solve the problem

Cut-based clustering
Usually assumes a graph
Based on the concept of a cut
Includes implicit assumptions, which often:
– Are no different from clustering in vector space
– Imply sub-optimal heuristics
– Are sometimes even false!

Cut-based clustering methods
– Minimum-spanning-tree based clustering (single link)
– Split-and-merge (Lin & Chen, TKDE 2005): split the data set using K-means, then merge similar clusters based on Gaussian-distribution cluster similarity.
– Split-and-merge (Liu, Jiang & Kot, PR 2009): split the data into a large number of subclusters, then remove and add prototypes until no change.
– DIVFRP (Zhong et al., PRL 2008): divide according to the furthest-point heuristic.
– Normalized cut (Shi & Malik, PAMI 2000): cut-based; minimizes the disassociation between the groups while maximizing the association within the groups.
– Ratio cut (Hagen & Kahng, 1992)
– Mcut (Ding et al., ICDM 2001)
– Max k-cut (Frieze & Jerrum, 1997)
– Feng et al. (PRL): particle swarm optimization for selecting the hyperplane.
Details to be added later…

Clustering a graph
But where do we get this graph…?

Distance graph
Calculated from the vector space!

Space complexity of the graph
Complete distance graph: N∙(N−1)/2 edges = O(N²)
But…
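A minimal sketch of building the complete distance graph from vector data, assuming NumPy and SciPy are available (all names here are illustrative, not from the slides):

import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
X = rng.random((100, 2))       # N = 100 points in a 2-D vector space

# The condensed form stores exactly N*(N-1)/2 pairwise distances: O(N^2) space.
edges = pdist(X)               # shape (4950,) for N = 100
W = squareform(edges)          # full N x N weight matrix of the complete graph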

Minimum spanning tree (MST)
From the distance graph to its MST
Works with simple examples like this
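A sketch of the MST-based (single-link) clustering listed earlier, assuming SciPy; mst_clusters and k are illustrative names:

import numpy as np
from scipy.sparse import coo_matrix
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components

def mst_clusters(X, k):
    # Build the MST of the complete distance graph (N-1 edges),
    # then cut its k-1 heaviest edges; the connected components
    # that remain are the single-link clusters.
    mst = minimum_spanning_tree(squareform(pdist(X))).tocoo()
    keep = np.ones(len(mst.data), dtype=bool)
    keep[np.argsort(mst.data)[::-1][:k - 1]] = False
    pruned = coo_matrix((mst.data[keep], (mst.row[keep], mst.col[keep])),
                        shape=mst.shape)
    return connected_components(pruned, directed=False)[1]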

Cut
Graph cut → resulting clusters
The cost function is to maximize the weight of the edges cut,
which is equivalent to minimizing the within-cluster edge weights.
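A small numerical check of this equivalence, assuming NumPy (the weights and the partition are arbitrary illustrative values): every edge is either cut or within a cluster, so total weight = cut weight + within weight, and maximizing one minimizes the other.

import numpy as np

rng = np.random.default_rng(1)
n = 6
W = np.triu(rng.random((n, n)), 1)     # upper-triangular edge weights
W = W + W.T                            # symmetric weight matrix, zero diagonal
labels = np.array([0, 0, 0, 1, 1, 1])  # one possible 2-way partition

same = labels[:, None] == labels[None, :]
within = W[same].sum() / 2             # within-cluster edge weight
cut = W[~same].sum() / 2               # weight of the edges cut
assert np.isclose(within + cut, np.triu(W, 1).sum())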

Cut
Graph cut → resulting clusters
Equivalent to minimizing MSE!

Stopping criterion
Ends up in a local minimum
(Divisive vs. agglomerative)

Clustering method

Conclusions of “cut”
Cut → same as partition
Cut-based method → empty concept
Cut-based algorithm → same as divisive
Graph-based clustering → flawed concept
Clustering of a graph → a more relevant topic

Part II: Divisive algorithms

Divisive approach
Motivation:
– Efficiency of the divide-and-conquer approach
– Hierarchy of clusters as a result
– Useful when solving for the number of clusters
Challenges:
– Design problem 1: which cluster to split?
– Design problem 2: how to split?
– Sub-optimal local optimization at best

Split-based (divisive) clustering

Select cluster to be split
Heuristic choices:
– Cluster with highest variance (MSE)
– Cluster with most skew distribution (3rd moment)
Optimal choice (use this!):
– Tentatively split all clusters
– Select the one that decreases MSE the most!
Complexity of the choice:
– Heuristics take time to compute the measure
– The optimal choice takes only twice (2×) more time!!!
– The measures can be stored; only the two new clusters appearing at each step need to be recalculated (see the sketch below).
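A sketch of the optimal choice, assuming scikit-learn; here the tentative split is a plain 2-means run rather than the principal-axis split of the following slides, and all names are illustrative:

import numpy as np
from sklearn.cluster import KMeans

def sse(X):
    # Sum of squared errors of one cluster around its centroid.
    return ((X - X.mean(axis=0)) ** 2).sum()

def pick_cluster_to_split(clusters):
    # Tentatively 2-split every cluster and return the index of the
    # one whose split decreases the total SSE the most. In practice
    # the gains can be cached: only the two new clusters created at
    # each step need to be re-evaluated.
    gains = []
    for X in clusters:
        lab = KMeans(n_clusters=2, n_init=1, random_state=0).fit_predict(X)
        gains.append(sse(X) - sse(X[lab == 0]) - sse(X[lab == 1]))
    return int(np.argmax(gains))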

Selection example
Biggest MSE… but dividing this one decreases MSE more

Selection example Only two new values need to be calculated

How to split
Centroid methods:
– Heuristic 1: replace C by C−ε and C+ε
– Heuristic 2: two furthest vectors
– Heuristic 3: two random vectors
Partition according to the principal axis:
– Calculate the principal axis
– Select a dividing point along the axis
– Divide by a hyperplane
– Calculate the centroids of the two sub-clusters

Splitting along the principal axis (pseudocode)
Step 1: Calculate the principal axis.
Step 2: Select a dividing point.
Step 3: Divide the points by a hyperplane.
Step 4: Calculate the centroids of the new clusters.
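A minimal NumPy sketch of these four steps; the simple mean-projection heuristic is used as the dividing point here, while the optimal choice is the topic of the next slides:

import numpy as np

def split_by_principal_axis(X):
    c = X.mean(axis=0)
    # Step 1: principal axis = eigenvector of the covariance matrix
    # with the largest eigenvalue (eigh returns them in ascending order).
    _, eigvecs = np.linalg.eigh(np.cov((X - c).T))
    axis = eigvecs[:, -1]
    # Step 2: dividing point; projections are centered, so 0 = the mean.
    proj = (X - c) @ axis
    # Step 3: divide by the hyperplane through the dividing point.
    left, right = X[proj <= 0], X[proj > 0]
    # Step 4: centroids of the two new sub-clusters.
    return (left, left.mean(axis=0)), (right, right.mean(axis=0))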

Example of dividing
(Dividing hyperplane, principal axis)

Optimal dividing point (pseudocode of Step 2)
Step 2.1: Calculate the projections on the principal axis.
Step 2.2: Sort the vectors according to the projection.
Step 2.3: FOR each vector x_i DO:
  – Divide using x_i as the dividing point.
  – Calculate the distortions D1 and D2 of the two subsets.
Step 2.4: Choose the point minimizing D1 + D2.
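A direct, unoptimized sketch of Step 2, assuming NumPy; this version recomputes D1 and D2 from scratch for each candidate, while the next slide shows how the scan can run with O(1) updates per step:

import numpy as np

def optimal_dividing_point(X, axis):
    proj = X @ axis
    order = np.argsort(proj)          # Step 2.2: sort by projection
    Xs = X[order]
    best, best_i = np.inf, 1
    for i in range(1, len(Xs)):       # Step 2.3: try every dividing point
        d1 = ((Xs[:i] - Xs[:i].mean(axis=0)) ** 2).sum()
        d2 = ((Xs[i:] - Xs[i:].mean(axis=0)) ** 2).sum()
        if d1 + d2 < best:            # Step 2.4: keep the minimum D1 + D2
            best, best_i = d1 + d2, i
    return order[:best_i], order[best_i:]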

Finding the dividing point
Calculating the error for the next dividing point and updating the centroids can be done in O(1) time!!! (See the formulas below.)
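A standard formulation of such O(1) updates (a reconstruction, not verbatim from the slides): when the scan moves the next vector x from subset D2 (size n2, centroid c2) into subset D1 (size n1, centroid c1),

  c1' = (n1·c1 + x) / (n1 + 1)          c2' = (n2·c2 − x) / (n2 − 1)
  D1' = D1 + n1/(n1 + 1) · ‖x − c1‖²    D2' = D2 − n2/(n2 − 1) · ‖x − c2‖²

so each step needs only a constant number of vector operations.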

Sub-optimality of the split

Example of splitting process
Dividing hyperplane, principal axis
2 clusters → 3 clusters

Example of splitting process
4 clusters → 5 clusters

Example of splitting process
6 clusters → 7 clusters

Example of splitting process
8 clusters → 9 clusters

Example of splitting process
10 clusters → 11 clusters

Example of splitting process
12 clusters → 13 clusters

Example of splitting process
14 clusters → 15 clusters (MSE = 1.94)

K-means refinement
Result directly after split: MSE = 1.94
Result after re-partition: MSE = 1.39
Result after K-means: MSE = 1.33
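A sketch of the refinement stage, assuming scikit-learn; initializing K-means with the centroids produced by the splitting process gives the re-partition pass (one iteration) and the full K-means result (run to convergence). The helper name refine is illustrative:

import numpy as np
from sklearn.cluster import KMeans

def refine(X, centroids, iters=300):
    # max_iter=1 corresponds roughly to a single re-partition pass;
    # the default run continues until convergence.
    km = KMeans(n_clusters=len(centroids), init=np.asarray(centroids),
                n_init=1, max_iter=iters).fit(X)
    return km.cluster_centers_, km.labels_, km.inertia_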

Time complexity
Number of processed vectors, assuming that clusters are always split into two equal halves, and assuming an unequal split into sizes n_max and n_min (see the sketch below).
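A sketch of the accounting under the stated assumptions (not the slide's formulas verbatim): with equal halves, every level of the splitting hierarchy processes all N vectors once, and reaching K clusters takes log2 K levels, so

  N + 2·(N/2) + 4·(N/4) + … = N·log2(K)

vectors are processed in total. With an unequal split into n_max and n_min the recursion depth is governed by the ratio n_max/n; in the extreme case n_min = O(1), every split removes only a constant number of vectors and the total degenerates to O(N·K).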

Time complexity
Number of vectors processed at each step; sorting the vectors is the bottleneck (see below).
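Reconstructing this in the same hedged spirit: splitting one cluster of size n is dominated by sorting the projections, O(n·log n), since the projection and the O(1)-update scan are only O(n). With balanced splits each level of the hierarchy then costs O(N·log N) in total, giving

  O(N·log N·log K)

for the whole divisive process down to K clusters.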

Comparison of results
(Birch1 data set)

Conclusions
– Divisive algorithms are efficient
– Good-quality clustering
– Several non-trivial design choices
– Selection of the dividing axis can be improved!

References
1. P. Fränti, T. Kaukoranta and O. Nevalainen, "On the splitting method for vector quantization codebook generation", Optical Engineering, 36(11), November 1997.
2. C.-R. Lin and M.-S. Chen, "Combining partitional and hierarchical algorithms for robust and efficient data clustering with cohesion self-merging", IEEE TKDE, 17(2), 2005.
3. M. Liu, X. Jiang and A.C. Kot, "A multi-prototype clustering algorithm", Pattern Recognition, 42 (2009).
4. J. Shi and J. Malik, "Normalized cuts and image segmentation", IEEE TPAMI, 22(8), 2000.
5. L. Feng, M.-H. Qiu, Y.-X. Wang, Q.-L. Xiang, Y.-F. Yang and K. Liu, "A fast divisive clustering algorithm using an improved discrete particle swarm optimizer", Pattern Recognition Letters.
6. C. Zhong, D. Miao, R. Wang and X. Zhou, "DIVFRP: An automatic divisive hierarchical clustering method based on the furthest reference points", Pattern Recognition Letters, 29 (2008) 2067–2077.