Clustering Methods: Part 2d Pasi Fränti 31.3.2014 Speech & Image Processing Unit School of Computing University of Eastern Finland Joensuu, FINLAND Swap-based.


1 Clustering Methods: Part 2d Pasi Fränti 31.3.2014 Speech & Image Processing Unit School of Computing University of Eastern Finland Joensuu, FINLAND Swap-based algorithms

2 Part I: Random Swap algorithm
P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4), 358-369, 2000.

3 Pseudo code of Random Swap
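The pseudo code on this slide did not survive extraction. As a minimal sketch of the general idea (swap one centroid to a random data vector, fine-tune with a few K-means iterations, keep the result only if the MSE improves), the following pure-Python version may help. All function names, the number of K-means iterations and the use of full repartition are assumptions made for illustration, not the authors' pseudo code.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def partition(data, centroids):
    """Index of the nearest centroid for every data vector."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(x, centroids[j]))
            for x in data]

def update_centroids(data, labels, centroids):
    """Mean of each cluster; an empty cluster keeps its old centroid."""
    new = []
    for j, old in enumerate(centroids):
        members = [x for x, l in zip(data, labels) if l == j]
        if members:
            new.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new.append(list(old))
    return new

def mse(data, labels, centroids):
    """Mean squared error of the partition."""
    return sum(sq_dist(x, centroids[l]) for x, l in zip(data, labels)) / len(data)

def random_swap(data, m, iterations=200, kmeans_iters=2, seed=0):
    """Hypothetical Random Swap sketch: trial swap, K-means, accept if better."""
    rng = random.Random(seed)
    centroids = [list(x) for x in rng.sample(data, m)]
    labels = partition(data, centroids)
    best = mse(data, labels, centroids)
    for _ in range(iterations):
        trial = [list(c) for c in centroids]
        # Swap: replace a random centroid by a random data vector.
        trial[rng.randrange(m)] = list(rng.choice(data))
        trial_labels = partition(data, trial)
        # Fine-tune the trial solution by a few K-means iterations.
        for _ in range(kmeans_iters):
            trial = update_centroids(data, trial_labels, trial)
            trial_labels = partition(data, trial)
        cost = mse(data, trial_labels, trial)
        if cost < best:  # keep the swap only if it improves the solution
            centroids, labels, best = trial, trial_labels, cost
    return centroids, labels, best
```

A call such as `random_swap(data, 3)` returns the centroids, the partition and the final MSE; the sketch repartitions the whole data set after each swap, which is simpler but slower than a local repartition.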

4 Demonstration of the algorithm

5 Centroid swap

6 Local repartition

7 Fine-tuning by K-means 1st iteration

8 Fine-tuning by K-means 2nd iteration

9 Fine-tuning by K-means 3rd iteration

10 Fine-tuning by K-means 16th iteration

11 Fine-tuning by K-means 17th iteration

12 Fine-tuning by K-means 18th iteration

13 Fine-tuning by K-means 19th iteration

14 Fine-tuning by K-means Final result after 25 iterations

15 Implementation of the swap
1. Random swap:
2. Re-partition vectors from old cluster:
3. Create new cluster:
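The formulas for the three steps are missing from the transcript. One way to read them is sketched below, assuming the partition was optimal before the swap: vectors of the removed cluster are compared against all centroids, while every other vector only needs to be compared against the new centroid. The function name and argument layout are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def swap_with_local_repartition(data, centroids, labels, rng):
    """Apply one random swap in place and repair the partition locally.

    Assumes labels[i] was the nearest centroid of data[i] before the call.
    Returns the index of the swapped centroid.
    """
    m = len(centroids)
    # 1. Random swap: pick centroid j and move it to a random data vector.
    j = rng.randrange(m)
    centroids[j] = list(rng.choice(data))
    for i, x in enumerate(data):
        if labels[i] == j:
            # 2. Vectors of the old cluster j: search among all centroids.
            labels[i] = min(range(m), key=lambda k: sq_dist(x, centroids[k]))
        elif sq_dist(x, centroids[j]) < sq_dist(x, centroids[labels[i]]):
            # 3. Other vectors: join the new cluster only if it is closer.
            labels[i] = j
    return j
```

Step 2 touches about N/M vectors with M distance computations each, and step 3 touches the remaining vectors with one extra distance computation each, so the whole repair stays linear in N.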

16 Random swap as local search: Study neighbor solutions

17 Random swap as local search: Select one and move

18 Role of K-means: Fine-tune solution by hill-climbing technique!

19 Role of K-means: Consider only local optima!

20 Effective search space Role of swap: reduce search space

21 Chain reaction by K-means after swap

22 Independency of initialization: results for T = 5000 iterations (Worst, Best, Initial)

23 Part II: Efficiency of Random Swap

24 Probability of good swap
Select a proper centroid for removal:
- There are $M$ clusters in total: $p_{\text{removal}} = 1/M$.
Select a proper new location:
- There are $N$ choices: $p_{\text{add}} = 1/N$.
- Only $M$ are significantly different: $p_{\text{add}} = 1/M$.
In total:
- $M^2$ significantly different swaps.
- Probability of each different swap is $p_{\text{swap}} = 1/M^2$.
- Open question: how many of these are good?

25 Number of neighbors
Open question: what is the size of the neighborhood ($\alpha$)?
- Voronoi neighbors
- Neighbors by distance

26 Observed number of neighbors: data set S2

27 Average number of neighbors

28 Probability of not finding good swap: Expected number of iterations Estimated number of iterations:
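The formulas on this slide were lost in extraction. A derivation that is consistent with the conclusions stated later in the deck (logarithmic dependency on q, quadratic dependency on M) can be sketched as follows. It assumes that a swap is good when both the removed centroid and the new location fall among $\alpha$ suitable choices out of $M$, and that iterations are independent; both are assumptions, not statements recovered from the slide.

```latex
% Assumption: success probability of a single swap is p = (alpha/M)^2.
\begin{align*}
p &= \left(\frac{\alpha}{M}\right)^{2} \\
% Probability q of not finding a good swap in T independent iterations:
q &= (1 - p)^{T} = \left(1 - \frac{\alpha^{2}}{M^{2}}\right)^{T} \\
% Solving for T, then using ln(1 - x) \approx -x for small x:
T &= \frac{\ln q}{\ln\left(1 - \alpha^{2}/M^{2}\right)}
   \approx -\ln q \cdot \frac{M^{2}}{\alpha^{2}}
\end{align*}
```

For example, with $\alpha = 4$, $M = 15$ and $q = 0.01$, the exact form gives $T \approx 62.4$ and the approximation gives $T \approx 64.8$.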

29 Estimated number of iterations depending on T (panels: S1, S2, S3, S4)
Observed = number of iterations needed in practice.
Estimated = estimate of the number of iterations needed for a given q.

30 Probability of success (p) depending on T

31 Probability of failure (q) depending on T

32 Observed probabilities depending on dimensionality

33 Bounds for the number of iterations Upper limit: Lower limit similarly; resulting in:

34 Multiple swaps (w) Probability for performing less than w swaps: Expected number of iterations:

35 Number of swaps needed Example from image quantization

36 Efficiency of the random swap
Total time to find correct clustering:
- Time per iteration × Number of iterations
Time complexity of a single step:
- Swap: $O(1)$
- Remove cluster: $2M \cdot N/M = O(N)$
- Add cluster: $2N = O(N)$
- Centroids: $2 \cdot (2N/M) + 2\alpha + 2 = O(N/M)$
- (Fast) K-means iteration: $4\alpha N = O(\alpha N)$ *
* See Fast K-means for analysis.

37 Time complexity and the observed number of steps

38 Time spent by K-means iterations

39 Effect of K-means iterations

40 Total time complexity
Time complexity of a single step (t): $t = O(\alpha N)$
Number of iterations needed (T):
Total time:

41 Time complexity: conclusions
1. Logarithmic dependency on $q$
2. Linear dependency on $N$
3. Quadratic dependency on $M$ (with a large number of clusters, the method can be too slow)
4. Inverse dependency on $\alpha$ (worst case $\alpha = 2$) (the higher the dimensionality and the cluster overlap, the faster the method)
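These four dependencies can be checked numerically with a small sketch. It assumes that one iteration succeeds with probability p = (α/M)^2 and costs about αN operations; the success model and both function names are assumptions made for illustration, since the deck's own formulas for T are not in the transcript.

```python
import math

def iterations_needed(q, m, alpha):
    """Smallest T with (1 - (alpha/m)^2)^T <= q (assumed success model)."""
    p = (alpha / m) ** 2
    return math.ceil(math.log(q) / math.log(1.0 - p))

def total_cost(q, m, alpha, n):
    """Iterations times the assumed O(alpha * N) cost of one iteration."""
    return iterations_needed(q, m, alpha) * alpha * n

if __name__ == "__main__":
    n = 5000
    for m in (15, 30, 60):          # doubling M roughly quadruples T
        print("M =", m, "T =", iterations_needed(0.01, m, 4))
    for alpha in (2, 4, 8):         # larger alpha lowers the total cost
        print("alpha =", alpha, "cost =", total_cost(0.01, 15, alpha, n))
    for q in (0.1, 0.01, 0.001):    # T grows only logarithmically in 1/q
        print("q =", q, "T =", iterations_needed(q, 15, 4))
```

Under this model T scales with M^2/α^2, so multiplying by the per-iteration cost αN leaves a single factor of 1/α, which matches the inverse dependency listed above.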

42 Time-distortion performance

48 References
Random swap algorithm:
- P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4), 358-369, 2000.
- P. Fränti, J. Kivijärvi and O. Nevalainen, "Tabu search algorithm for codebook generation in VQ", Pattern Recognition, 31 (8), 1139-1148, August 1998.
Pseudo code:
- http://cs.joensuu.fi/sipu/soft/
Efficiency of Random swap algorithm:
- P. Fränti, O. Virmajoki and V. Hautamäki, "Efficiency of random swap based clustering", IAPR Int. Conf. on Pattern Recognition (ICPR'08), Tampa, FL, Dec 2008.

49 Part III: Example when 4 swaps needed

50 1st swap: MSE = $4.2 \times 10^9$, MSE = $3.4 \times 10^9$

51 2nd swap: MSE = $3.1 \times 10^9$, MSE = $3.0 \times 10^9$

52 3rd swap: MSE = $2.3 \times 10^9$, MSE = $2.1 \times 10^9$

53 4th swap: MSE = $1.9 \times 10^9$, MSE = $1.7 \times 10^9$

54 Final result: MSE = $1.3 \times 10^9$

55 Part IV: Deterministic Swap

56 Deterministic swap Costs for the swap: From where to where?

57 Cluster removal
- Merge two existing clusters [Frigui 1997, Kaukoranta 1998], following the spirit of agglomerative clustering.
- Local optimization: remove the prototype that increases the cost function value least [Fritzke 1997, Likas 2003, Fränti 2006].
- Smart swap: find the two nearest prototypes and remove one of them randomly [Chen, 2010].
- Pairwise swap: locate a pair of inconsistent prototypes in two solutions [Zhao, 2012].

58 Cluster addition
1. Select an existing cluster
- Depending on strategy: 1..M choices.
- Each choice takes $O(N)$ time to test.
2. Select a location within this cluster
- Add new prototype
- Consider only existing points

59 Select the cluster
Cluster with the biggest MSE
- Intuitive heuristic [Fritzke 1997, Chen 2010]
- Computationally demanding:
Local optimization
- Try all clusters for the addition [Likas et al, 2003]
- Computationally demanding: $O(NM)$ to $O(N^2)$

60 Select the location
1. Current prototype + ε [Fritzke 1997]
2. Furthest vector [Fränti et al 1997]
3. Any other split heuristic [Fränti et al, 1997]
4. Random location
5. Every possible location [Likas et al, 2003]
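As an illustration of one deterministic addition, the sketch below combines two of the heuristics listed above: pick the cluster with the largest distortion, then place the new prototype at the vector of that cluster furthest from its centroid. The function name and the choice to combine these two particular heuristics are assumptions for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def furthest_point_location(data, labels, centroids):
    """Return (cluster index, data vector) for a deterministic addition.

    The cluster is the non-empty one with the largest total distortion;
    the location is its vector furthest from the cluster centroid.
    """
    m = len(centroids)
    distortion = [0.0] * m
    size = [0] * m
    for x, l in zip(data, labels):
        distortion[l] += sq_dist(x, centroids[l])
        size[l] += 1
    # Only non-empty clusters can receive the new prototype.
    candidates = [j for j in range(m) if size[j] > 0]
    target = max(candidates, key=lambda j: distortion[j])
    members = [x for x, l in zip(data, labels) if l == target]
    location = max(members, key=lambda x: sq_dist(x, centroids[target]))
    return target, location
```

Computing the distortions is one pass over the data, so this choice costs O(N), in line with the O(N) cost per tested cluster mentioned for cluster addition.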

61 Complexity of swaps

62 Furthest point in cluster (figure labels: Prototype removed; Cluster where added; Furthest point selected)

63 Smart swap
Initialization: $O(MN)$
Swap iteration:
- Finding nearest pair: $O(M^2)$
- Calculating distortion: $O(N)$
- Sorting clusters: $O(M \log M)$
- Evaluation of result: $O(N)$
- Repartition and fine-tuning: $O(\alpha N)$
Total: $O(MN + M^2 + I \cdot \alpha N)$
Number of iterations expected: $< 2M$
Estimated total time: $O(2M^2 N)$

64 Nearest prototypes Cluster with largest distortion

65 Smart swap pseudo code
SmartSwap(X, M) → C, P
  C ← InitializeCentroids(X);
  P ← PartitionDataset(X, C);
  Maxorder ← log2(M);
  order ← 1;
  WHILE order < Maxorder
    c_i, c_j ← FindNearestPair(C);
    S ← SortClustersByDistortion(P, C);
    c_swap ← RandomSelect(c_i, c_j);
    c_location ← s_order;
    C_new ← Swap(c_swap, c_location);
    P_new ← LocalRepartition(P, C_new);
    KmeansIteration(P_new, C_new);
    IF f(C_new) < f(C), THEN
      order ← 1;
      C ← C_new;
    ELSE
      order ← order + 1;
  KmeansIteration(P, C);
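The pseudo code above can be turned into a runnable sketch along the following lines. Several details are assumptions because the pseudo code does not fix them: the new location is a random vector of the target cluster, a full repartition replaces LocalRepartition for brevity, two K-means iterations are used for fine-tuning, and the accepted partition is kept together with the accepted centroids.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def partition(data, centroids):
    """Index of the nearest centroid for every data vector."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(x, centroids[j]))
            for x in data]

def kmeans_iteration(data, centroids):
    """One K-means iteration; an empty cluster keeps its old centroid."""
    labels = partition(data, centroids)
    new = []
    for j, old in enumerate(centroids):
        members = [x for x, l in zip(data, labels) if l == j]
        new.append([sum(c) / len(members) for c in zip(*members)]
                   if members else list(old))
    return new, partition(data, new)

def cost(data, labels, centroids):
    """Total squared error f(C) of the partition."""
    return sum(sq_dist(x, centroids[l]) for x, l in zip(data, labels))

def smart_swap(data, m, seed=0, kmeans_iters=2):
    rng = random.Random(seed)
    centroids = [list(x) for x in rng.sample(data, m)]
    labels = partition(data, centroids)
    max_order = math.log2(m)
    order = 1
    while order < max_order:
        # FindNearestPair, then RandomSelect one of the two for removal.
        pair = min(((a, b) for a in range(m) for b in range(a + 1, m)),
                   key=lambda ab: sq_dist(centroids[ab[0]], centroids[ab[1]]))
        swap = rng.choice(pair)
        # SortClustersByDistortion (largest first); s_order is the target.
        distortion = [0.0] * m
        for x, l in zip(data, labels):
            distortion[l] += sq_dist(x, centroids[l])
        ranked = sorted(range(m), key=lambda k: -distortion[k])
        members = [x for x, l in zip(data, labels) if l == ranked[order - 1]]
        if not members:              # nothing to place the prototype on
            order += 1
            continue
        trial = [list(c) for c in centroids]
        trial[swap] = list(rng.choice(members))   # assumed location rule
        for _ in range(kmeans_iters):
            trial, trial_labels = kmeans_iteration(data, trial)
        if cost(data, trial_labels, trial) < cost(data, labels, centroids):
            centroids, labels, order = trial, trial_labels, 1
        else:
            order += 1
    return kmeans_iteration(data, centroids)
```

Calling `smart_swap(data, 4)` returns the final centroids and partition. Because Maxorder is log2(M), a value of M = 4 allows only the largest-distortion cluster to be tried before the loop stops.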

66 Pairwise swap
- Unpaired prototypes
- Nearest neighbors of each other
- Nearest neighbor of the other set further than in the same set → Subject to swap

67 Combinations of random and deterministic swap
Variant   Removal                       Addition
RR        Random                        Random
RD        Random                        Deterministic
DR        Deterministic                 Random
DD        Deterministic                 Deterministic
D2R       Deterministic + data update   Random
D2D       Deterministic + data update   Deterministic

68 Summary of the time complexities
              Random removal      Deterministic removal
              RR       RD         DR       DD       D2R      D2D
Removal       O(1)     O(1)       O(MN)    O(MN)    O(αN)    O(αN)
Addition      O(1)     O(N)       O(1)     O(N)     O(1)     O(N)
Repartition   O(N)     O(N)       O(N)     O(N)     O(N)     O(N)
K-means       O(αN)    O(αN)      O(MN)    O(MN)    O(αN)    O(αN)

69 Profiles of the processing time

70 Test data sets

71 Birch data sets: Birch1, Birch2, Birch3

72 Experiments Bridge RD DD DR Random Swap

73 Experiments Bridge

74 Experiments Birch 2 Random Swap DD DR RD

75 Experiments Miss America

76 Quality comparisons (MSE) with 10 second time constraint
                                 Birch 2 (×10^6)  Birch 1 (×10^8)  Europe (×10^7)  Miss America  House   Bridge
Repeated Random                  22.35            13.10            2.37            8.34          12.12   251.32
Repeated K-means                 4.10             5.49             1.52            5.92          6.58    177.66
Random Swap                      4.43             5.70             1.26            5.85          6.41    174.08
RD-variant                       2.78             5.11             1.02            5.58          6.10    171.20
Average speed-up from RR to RD   18:1             4:1              6:1             5:1           4:1     2:1

77 Literature
1. P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4), 358-369, 2000.
2. P. Fränti, J. Kivijärvi and O. Nevalainen, "Tabu search algorithm for codebook generation in VQ", Pattern Recognition, 31 (8), 1139-1148, August 1998.
3. P. Fränti, O. Virmajoki and V. Hautamäki, "Efficiency of random swap based clustering", IAPR Int. Conf. on Pattern Recognition (ICPR'08), Tampa, FL, Dec 2008.
4. P. Fränti, M. Tuononen and O. Virmajoki, "Deterministic and randomized local search algorithms for clustering", IEEE Int. Conf. on Multimedia and Expo (ICME'08), Hannover, Germany, 837-840, June 2008.
5. P. Fränti and O. Virmajoki, "On the efficiency of swap-based clustering", Int. Conf. on Adaptive and Natural Computing Algorithms (ICANNGA'09), Kuopio, Finland, LNCS 5495, 303-312, April 2009.

78 Literature
6. J. Chen, Q. Zhao and P. Fränti, "Smart swap for more efficient clustering", Int. Conf. Green Circuits and Systems (ICGCS'10), Shanghai, China, 446-450, June 2010.
7. B. Fritzke, "The LBG-U method for vector quantization – an improvement over LBG inspired from neural networks", Neural Processing Letters, 5 (1), 35-45, 1997.
8. P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering problems", Pattern Recognition, 39 (5), 761-765, May 2006.
9. T. Kaukoranta, P. Fränti and O. Nevalainen, "Iterative split-and-merge algorithm for VQ codebook generation", Optical Engineering, 37 (10), 2726-2732, October 1998.
10. H. Frigui and R. Krishnapuram, "Clustering by competitive agglomeration", Pattern Recognition, 30 (7), 1109-1119, July 1997.

79 Literature
11. A. Likas, N. Vlassis and J.J. Verbeek, "The global k-means clustering algorithm", Pattern Recognition, 36, 451-461, 2003.
12. PAM (Kaufman and Rousseeuw, 1987)
13. CLARA (Kaufman and Rousseeuw, 1990)
14. CLARANS (A Clustering Algorithm based on Randomized Search) (Ng and Han, 1994)
15. R.T. Ng and J. Han, "CLARANS: A method for clustering objects for spatial data mining", IEEE Transactions on Knowledge and Data Engineering, 14 (5), September/October 2002.

