
1 Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
Learning multiple nonredundant clusterings
Presenter: Wei-Hao Huang
Authors: Ying Cui, Xiaoli Z. Fern, Jennifer G. Dy
TKDD, 2010

2 Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments

3 Motivation
• Data often admit multiple groupings that are reasonable and interesting from different perspectives.
• Traditional clustering is restricted to finding a single clustering.

4 Objectives
• Propose a new clustering paradigm for finding all nonredundant clustering solutions of the data.

5 Methodology
• Orthogonal clustering
  ─ Cluster space
• Clustering in orthogonal subspaces
  ─ Feature space
• Automatically finding the number of clusters
• Stopping criteria

6 Orthogonal Clustering Framework
[Figure: orthogonal clustering framework applied to data X (Face dataset)]

7 Orthogonal clustering
• Residue space
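
The residue-space equation on this slide did not survive in the transcript. The following is a minimal sketch of one plausible reading, assuming each point is projected onto the subspace orthogonal to the centroid of the cluster it was assigned to, so that the next clustering cannot reuse that direction. The names `residue` and `centroid` and the toy data are illustrative and not taken from the slides.

```python
def residue(x, mu):
    """Project x onto the subspace orthogonal to centroid mu (assumed form)."""
    dot_xm = sum(a * b for a, b in zip(x, mu))
    dot_mm = sum(b * b for b in mu)
    if dot_mm == 0.0:
        return list(x)  # a zero centroid spans nothing, so x is unchanged
    scale = dot_xm / dot_mm
    return [a - scale * b for a, b in zip(x, mu)]


def centroid(members):
    """Coordinate-wise mean of a list of points."""
    return [sum(col) / len(members) for col in zip(*members)]


# Toy data: four 2-D points with a given first clustering.
points = [[2.0, 2.0], [3.0, 0.5], [-3.0, 2.0], [-1.0, 0.0]]
labels = [0, 0, 1, 1]
centroids = [centroid([p for p, l in zip(points, labels) if l == k]) for k in (0, 1)]

# Each residue is orthogonal to the centroid of its own cluster.
residues = [residue(p, centroids[l]) for p, l in zip(points, labels)]
```

Clustering the `residues` instead of the original points is what would yield the next, different grouping under this assumption.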

8 Clustering in orthogonal subspaces
• Feature space
  ─ Linear discriminant analysis (LDA)
  ─ Singular value decomposition (SVD)
  ─ LDA vs. SVD
• Projection: Y = A^T X

9 Clustering in orthogonal subspaces
• Residue space
  ─ A^(t) = eigenvectors of …
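
The matrix whose eigenvectors define A^(t) is not preserved in the transcript. The sketch below is therefore only one plausible instantiation. It assumes A holds the leading eigenvectors of M M^T, where M is the matrix of cluster centroids (a PCA-style choice), and that the residue is obtained with the standard orthogonal projector P = I - A (A^T A)^{-1} A^T. The function names and the toy data are hypothetical.

```python
import numpy as np


def orthogonal_projector(A):
    """Return P = I - A (A^T A)^{-1} A^T, the projector onto the complement of span(A)."""
    d = A.shape[0]
    return np.eye(d) - A @ np.linalg.inv(A.T @ A) @ A.T


def next_residue(X, centroids, n_components):
    """X: (d, n) data as columns; centroids: (d, k) cluster means as columns."""
    # Assumption: A = leading eigenvectors of M M^T, with M the centroid matrix.
    eigvals, eigvecs = np.linalg.eigh(centroids @ centroids.T)
    order = np.argsort(eigvals)[::-1][:n_components]
    A = eigvecs[:, order]
    P = orthogonal_projector(A)
    return P @ X, A


rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))                            # 50 points in 3-D
M = np.array([[2.0, -2.0], [0.0, 0.0], [0.1, -0.1]])    # two toy centroids
X_next, A = next_residue(X, M, n_components=1)

# The current clustering would be found in Y = A^T X; the residue has no
# component left along A, so the next clustering must use other directions.
Y = A.T @ X
```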

10 Comparing Method 1 and Method 2
• Residue space
• Method 1
• Method 2
• Method 1 is a special case of Method 2.
  ─ A^(t) = eigenvectors of …
  ─ If M' = M, then P_1 = P_2

11 Experiments
• PCA is used to reduce dimensionality
• Clustering
  ─ K-means clustering: keep the solution with the smallest SSE
  ─ Gaussian mixture model (GMM) clustering: keep the solution with the largest likelihood
• Datasets
  ─ Synthetic
  ─ Real-world: Face, WebKB text, Vowel phoneme, Digit
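
Keeping the k-means solution with the smallest SSE is usually done by running k-means from several random initializations. The following is a minimal pure-Python sketch of that pattern. The number of restarts, the iteration cap, and the function names are assumptions, not values from the slides.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_once(points, k, rng, n_iter=20):
    """One run of plain k-means from a random initialization."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return centers, labels, sse


def best_of_restarts(points, k, n_restarts=10, seed=0):
    """Run k-means several times and keep the run with the smallest SSE."""
    rng = random.Random(seed)
    runs = [kmeans_once(points, k, rng) for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[2])


points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
best_centers, best_labels, best_sse = best_of_restarts(points, k=2)
```

The GMM counterpart follows the same pattern, except that the run with the largest log-likelihood is kept.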

12 Experiments
• Evaluation

13 Experiments
• Synthetic

14 Experiments
• Face dataset

15 Experiments
• WebKB dataset
• Vowel phoneme dataset

16 Experiments
• Digit dataset

17 Experiments
• Finding the number of clusters
  ─ K-means: gap statistic
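
The slide names the gap statistic but its formula is not preserved. The sketch below follows the commonly used definition, Gap(k) = E*[log W_k] - log W_k, with reference data drawn uniformly over the bounding box of the data and the rule "smallest k with Gap(k) >= Gap(k+1) - s_{k+1}". Whether the presentation used exactly this variant is an assumption, and all function names and parameter defaults are illustrative.

```python
import numpy as np


def kmeans_sse(X, k, rng, n_restarts=5, n_iter=30):
    """Smallest within-cluster SSE over a few random restarts of plain k-means."""
    best = np.inf
    for _ in range(n_restarts):
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = X[labels == j].mean(axis=0)
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        best = min(best, d2.min(axis=1).sum())
    return best


def gap_statistic(X, k_max, n_refs=10, seed=0):
    """Return Gap(k) and its error term s_k for k = 1..k_max."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = np.log(kmeans_sse(X, k, rng))
        # Reference sets: uniform samples over the bounding box of X.
        ref_log_w = np.array([
            np.log(kmeans_sse(rng.uniform(lo, hi, size=X.shape), k, rng))
            for _ in range(n_refs)
        ])
        gaps.append(ref_log_w.mean() - log_w)
        errs.append(ref_log_w.std() * np.sqrt(1.0 + 1.0 / n_refs))
    return np.array(gaps), np.array(errs)


def choose_k(gaps, errs):
    """Smallest k with Gap(k) >= Gap(k+1) - s_{k+1}; else the largest k tried."""
    for i in range(len(gaps) - 1):
        if gaps[i] >= gaps[i + 1] - errs[i + 1]:
            return i + 1
    return len(gaps)


# Two well-separated toy blobs in 2-D.
demo_rng = np.random.default_rng(1)
X = np.vstack([demo_rng.normal(0.0, 0.5, size=(30, 2)),
               demo_rng.normal(10.0, 0.5, size=(30, 2))])
gaps, errs = gap_statistic(X, k_max=4)
k_opt = choose_k(gaps, errs)
```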

18 Experiments
• Finding the number of clusters
  ─ GMM: BIC
• Stopping criteria
  ─ SSE falls below 10% of its value at the first iteration
  ─ K_opt = 1
  ─ K_opt > K_max
• Select K_max
  ─ Gap statistic
  ─ BIC: maximize the BIC value
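
The BIC-based choice of K and the three stopping criteria can be sketched as follows. This assumes the "larger is better" form of BIC, log L - (p/2) log n, which matches the slide's "maximize" wording, and a full-covariance GMM for the parameter count. The log-likelihood values are made-up inputs for illustration, not results from the paper, and all function names are hypothetical.

```python
import math


def bic_score(log_likelihood, n_params, n_samples):
    """BIC in the 'larger is better' convention: log L - (p / 2) * log n."""
    return log_likelihood - 0.5 * n_params * math.log(n_samples)


def gmm_param_count(k, d):
    """Free parameters of a k-component, full-covariance GMM in d dimensions (assumed model)."""
    return (k - 1) + k * d + k * d * (d + 1) // 2


def select_k(log_likelihoods, d, n_samples):
    """log_likelihoods maps k to a fitted log-likelihood; return the k with the largest BIC."""
    return max(
        log_likelihoods,
        key=lambda k: bic_score(log_likelihoods[k], gmm_param_count(k, d), n_samples),
    )


def should_stop(sse_now, sse_first, k_opt, k_max, ratio=0.10):
    """Stop when little structure is left or the chosen K is degenerate."""
    return sse_now < ratio * sse_first or k_opt == 1 or k_opt > k_max


# Illustrative inputs only: log-likelihoods of GMMs fitted with K = 1, 2, 3.
example_log_likelihoods = {1: -520.0, 2: -410.0, 3: -405.0}
k_best = select_k(example_log_likelihoods, d=2, n_samples=200)
stop = should_stop(sse_now=5.0, sse_first=100.0, k_opt=k_best, k_max=10)
```

Note that some libraries report BIC with the opposite sign, in which case the smallest value is selected instead.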

19 Experiments
• Synthetic dataset

20 Experiments
• Face dataset

21 Experiments
• WebKB dataset

22 Conclusions
• The framework discovers varied, interesting, and meaningful clustering solutions.
• Method 2 can be combined with any clustering and dimensionality reduction algorithm.

23 Comments
• Advantages
  ─ Finds multiple nonredundant clustering solutions
• Applications
  ─ Data clustering
