1 Validity index for clusters of different sizes and densities
Intelligent Database Systems Lab, N.Y.U.S.T. I. M.
Presenter: Jun-Yi Wu
Authors: Krista Rizman Zalik, Borut Zalik
Pattern Recognition Letters (PRL), 2011
國立雲林科技大學 National Yunlin University of Science and Technology

2 Outline
• Motivation
• Objective
• Methodology
• Experiments
• Conclusion
• Comments

3 Motivation
Most previous validity indices depend heavily on the number of data objects in clusters, on cluster centroids, and on average values. The most popular validity measures tend to ignore low-density clusters and are not efficient at validating partitions whose clusters differ in size and density.

4 Objective
Two cluster validity indices are proposed for efficient validation of partitions containing clusters that differ widely in size and density. The goal is to design a cluster validity index suited to validating such partitions.
A good partition is judged by its:
• Overlap
• Compactness
• Separation distance

5 Methodology
Review of several popular validity indices:
• Dunn index (D index)
• XiE index
• Davies-Bouldin index (DB index)
• C index
• G index
• G+ index
• Partition coefficient (PC index)
• Classification entropy (CE index)
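As a rough illustration of two of the crisp indices listed above, the sketch below computes the Dunn index and the Davies-Bouldin index from their common textbook definitions, using only the Python standard library. The function names and the toy partitions `tight` and `mixed` are assumptions made for this example and do not come from the slides.

```python
import math
from itertools import combinations


def centroid(cluster):
    """Coordinate-wise mean of a list of points (tuples of equal length)."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def dunn_index(clusters):
    """Smallest between-cluster point distance divided by the largest cluster diameter.

    Assumes at least two clusters and at least one cluster with two or more points.
    Larger values indicate better-separated, more compact clusters.
    """
    min_separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    max_diameter = max(
        math.dist(p, q)
        for c in clusters
        for p, q in combinations(c, 2)
    )
    return min_separation / max_diameter


def davies_bouldin_index(clusters):
    """Average, over clusters, of the worst (scatter_i + scatter_j) / centroid distance.

    Scatter is the mean distance of a cluster's points to its centroid.
    Smaller values indicate a better partition.
    """
    centroids = [centroid(c) for c in clusters]
    scatter = [
        sum(math.dist(p, v) for p in c) / len(c)
        for c, v in zip(clusters, centroids)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Toy data (made up for this example): the same eight points, partitioned two ways.
tight = [
    [(0, 0), (0, 1), (1, 0), (1, 1)],
    [(10, 10), (10, 11), (11, 10), (11, 11)],
]
mixed = [
    [(0, 0), (0, 1), (10, 10)],
    [(1, 0), (1, 1), (10, 11), (11, 10), (11, 11)],
]

print(dunn_index(tight), davies_bouldin_index(tight))  # about 9.0 and 0.1
print(dunn_index(mixed), davies_bouldin_index(mixed))
```

On this toy data the natural partition `tight` scores a higher Dunn index and a lower Davies-Bouldin index than `mixed`, which is the direction each index is meant to reward.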

6 Methodology
Review of several popular validity indices: D index, DB index, G+ index, C index, G index, PC, CE, XiE.
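Of the indices reviewed here, PC and CE apply to fuzzy partitions. The following is a minimal sketch of their usual textbook definitions, assuming a membership matrix `u` in which `u[i][k]` is the membership of object `k` in cluster `i` and each object's memberships sum to 1. The function names and the example matrices are assumptions for illustration only.

```python
import math


def partition_coefficient(u):
    """PC = (1/n) * sum of squared memberships; ranges from 1/c (fuzziest) to 1 (crisp)."""
    n = len(u[0])  # number of data objects
    return sum(m * m for row in u for m in row) / n


def classification_entropy(u):
    """CE = -(1/n) * sum of u * ln(u); 0 for a crisp partition, ln(c) for the fuzziest."""
    n = len(u[0])
    # Terms with u == 0 contribute nothing (the limit of u * ln(u) is 0), so skip them.
    return -sum(m * math.log(m) for row in u for m in row if m > 0) / n


# Example membership matrices (made up): 2 clusters, 4 objects.
crisp = [[1.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 1.0]]
fuzziest = [[0.5, 0.5, 0.5, 0.5],
            [0.5, 0.5, 0.5, 0.5]]

print(partition_coefficient(crisp), classification_entropy(crisp))
print(partition_coefficient(fuzziest), classification_entropy(fuzziest))
```

A higher PC and a lower CE both indicate a less ambiguous assignment of objects to clusters.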

7 Methodology
New clustering validity indices:
• SV-index
• Validation of index SV
• Fuzzification of the SV index
• The proposed index OS exploiting overlap and separation measures
• Overlap measure
• Separation measure and validity index SV
• Validation of index OS

8 Methodology: SV-index
A validity measure for partitions that consist of clusters differing widely in density or size.

9 Methodology: Validation of index SV

10 Methodology: Fuzzification of the SV index
A fuzzy version of the index SV is obtained by integrating the membership values in the variation measure.
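The slide does not preserve the actual formula, so the sketch below only illustrates the general idea of folding membership values into a variation measure. It uses a generic membership-weighted mean distance to the centroid, which is an assumption and not the paper's definition; the function names and the fuzzifier exponent `m` are likewise hypothetical.

```python
import math


def crisp_variation(points, labels, centroid, cluster_id):
    """Mean distance to the centroid over the points assigned to one cluster."""
    distances = [
        math.dist(p, centroid)
        for p, label in zip(points, labels)
        if label == cluster_id
    ]
    return sum(distances) / len(distances)


def fuzzy_variation(points, memberships, centroid, m=2.0):
    """Membership-weighted mean distance to the centroid (generic illustration only).

    memberships[k] is the degree to which points[k] belongs to this cluster;
    m is a hypothetical fuzzifier exponent applied to each membership.
    """
    weights = [u ** m for u in memberships]
    weighted_sum = sum(w * math.dist(p, centroid) for w, p in zip(weights, points))
    return weighted_sum / sum(weights)


# Made-up one-cluster example: two near points and one far point.
points = [(0, 0), (2, 0), (10, 0)]
center = (1, 0)

hard = crisp_variation(points, [0, 0, 1], center, cluster_id=0)
same_as_hard = fuzzy_variation(points, [1.0, 1.0, 0.0], center)
partial = fuzzy_variation(points, [1.0, 1.0, 0.5], center)
print(hard, same_as_hard, partial)
```

With memberships of exactly 0 or 1 the weighted form reduces to the crisp one; giving the far point a partial membership raises the variation in proportion to its weight.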

11 Methodology: The proposed index OS exploiting overlap and separation measures
• Experimental results suggested that inter-cluster separation plays a more important role in cluster validation.
• Existing indices are limited in their ability to compute compactness and separation in partitions that have overlapping clusters and clusters of different sizes, which leads to incorrect validation results.
• In light of these results, a cluster validity index based on overlap and separation measures is proposed.

12 Methodology: Overlap measure

13 Methodology: Separation measure and validity index SV

14 Methodology: Validation of index OS

15 Experiments
The experiments demonstrate the effectiveness of the proposed SV and OS indices for determining the optimal number of clusters on the following data sets:
• Artificial data set A1
• Artificial data set A2
• Artificial data set A3
• Iris data set
• Wine data set
• Glass data set

16 Experiments: Artificial data set A1

17 Experiments: Artificial data set A2

18 Experiments: Artificial data set A3

19 Experiments: Artificial data set A3

20 Experiments: Iris data set

21 Experiments: Wine data set

22 Experiments: Wine data set

23 Conclusion
The experimental results show that the new indices outperform the other indices considered, especially when clusters differ widely in size or density. A good partition is expected to have a low degree of overlap and a larger separation distance and compactness. The maximum value of the SV index and the minimum value of the OS index indicate the optimal partition.
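The selection rule stated in the conclusion can be sketched as follows, assuming the index values have already been computed for each candidate number of clusters. The function names are hypothetical, and the numbers are made-up placeholders for illustration, not results from the paper.

```python
def best_k_by_sv(sv_by_k):
    """Pick the number of clusters with the largest SV value."""
    return max(sv_by_k, key=sv_by_k.get)


def best_k_by_os(os_by_k):
    """Pick the number of clusters with the smallest OS value."""
    return min(os_by_k, key=os_by_k.get)


# Placeholder values only (NOT taken from the paper): index value per candidate k.
sv_by_k = {2: 0.8, 3: 1.9, 4: 1.2, 5: 0.9}
os_by_k = {2: 0.6, 3: 0.1, 4: 0.3, 5: 0.5}

print(best_k_by_sv(sv_by_k))  # 3
print(best_k_by_os(os_by_k))  # 3
```

The two indices are optimized in opposite directions, so a validation script has to treat them differently when scanning candidate partitions.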

24 Comments
Advantage
Drawback
• ….
Application
• Clustering
• Validity index

