
1 Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology
Unsupervised Learning of Prototypes and Attribute Weights
Authors: Hichem Frigui, Olfa Nasraoui
Pattern Recognition, 2004, Pages 567-581
Advisor: Dr. Hsu
Presenter: Yu Cheng Chen

2 Intelligent Database Systems Lab N.Y.U.S.T. I. M.
Outline
• Motivation
• Objective
• Introduction
• Background
• Simultaneous clustering and attribute discrimination
• Application
• Conclusions
• Personal Opinion

3 Motivation
• The selection and weighting of attributes can significantly affect learning algorithms.
• Several methods have been proposed for feature selection and weighting. Their limitations:
─ They assume that feature relevance is invariant.
─ They are only appropriate for binary weighting.
• No existing method assigns different weights to distinct classes of a data set prior to clustering.

4 Objective
• Propose a method that performs clustering and feature weighting simultaneously.
• Each cluster is assigned its own set of feature weights.

5 Introduction
• Illustrates the need for different sets of feature weights for different clusters.
[Figure: example image; recoverable label: Dirt]

6 Background
• Prototype-based clustering: Fuzzy C-Means (FCM)
─ X = {x_j | j = 1,…,N} is a set of N feature vectors.
─ B = (B_1,…,B_C) represents the prototype set of the C clusters.
─ u_ij is the membership of point x_j in cluster B_i.
─ FCM minimizes the objective function in Eq. (2).
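The objective itself did not survive in this transcript. For reference, the Fuzzy C-Means objective is usually written as below, using the symbols defined on this slide. The fuzzifier m > 1 and the distance d_ij between point x_j and prototype B_i are not named on the slide and are assumptions here; whether this matches the slide's Eq. (2) term for term cannot be confirmed from the transcript.

```latex
J(B, U; X) = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^{m} \, d_{ij}^{2},
\qquad \text{subject to } u_{ij} \in [0, 1], \quad
\sum_{i=1}^{C} u_{ij} = 1 \ \text{ for all } j .
```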

7 Background
• Prototype-based clustering: Fuzzy C-Means (continued)
[Equations on this slide were not captured in the transcript]
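The update equations on this slide are missing from the transcript. As a rough illustration only, one standard Fuzzy C-Means iteration with squared Euclidean distance might be sketched as follows. The function name `fcm_step`, the fuzzifier default `m=2.0`, the `eps` guard, and the toy data are all assumptions made for this example, not taken from the slides.

```python
def fcm_step(points, centers, m=2.0, eps=1e-12):
    """One Fuzzy C-Means iteration: update memberships, then centers."""
    n_points, n_clusters, dim = len(points), len(centers), len(points[0])
    # d2[i][j]: squared Euclidean distance from center i to point j
    # (eps avoids division by zero when a point coincides with a center).
    d2 = [[sum((x[k] - c[k]) ** 2 for k in range(dim)) + eps for x in points]
          for c in centers]
    power = 1.0 / (m - 1.0)
    # Membership update: u_ij = 1 / sum_l (d2_ij / d2_lj)^(1/(m-1)).
    u = [[1.0 / sum((d2[i][j] / d2[l][j]) ** power for l in range(n_clusters))
          for j in range(n_points)]
         for i in range(n_clusters)]
    # Center update: mean of the points weighted by u_ij^m.
    new_centers = []
    for i in range(n_clusters):
        w = [u[i][j] ** m for j in range(n_points)]
        total = sum(w)
        new_centers.append(
            [sum(w[j] * points[j][k] for j in range(n_points)) / total
             for k in range(dim)])
    return u, new_centers


# Toy usage: two well-separated groups of 2-D points (made-up data).
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centers = [[0.0, 0.0], [10.0, 10.0]]
for _ in range(20):
    u, centers = fcm_step(points, centers)
```

Note that the number of clusters is fixed by the length of `centers`, which is the limitation the next slide raises.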

8 Background
• Fuzzy C-Means
─ It cannot automatically determine the optimal number of clusters.
─ C has to be specified a priori.
• CA (Competitive Agglomeration)

9 Simultaneous clustering and attribute discrimination
• Search for the optimal prototype parameters, B, and the optimal set of feature weights, V, simultaneously.
• SCAD1 & SCAD2
─ v_ik represents the relevance weight of feature k in cluster i.
─ d_ijk = |x_jk − c_ik|
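The SCAD1 objective is not in the transcript. A reconstruction consistent with the later slides (which refer to J1, to δ_i, and to a "1st term") is sketched below. Here n is the number of features and m the fuzzifier; both symbols, and the exact form of the equation, should be read as assumptions rather than as a copy of the slide.

```latex
J_1(B, U, V; X) =
  \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^{m} \sum_{k=1}^{n} v_{ik} \, d_{ijk}^{2}
  \; + \; \sum_{i=1}^{C} \delta_i \sum_{k=1}^{n} v_{ik}^{2},
\qquad \text{subject to } v_{ik} \in [0, 1], \quad \sum_{k=1}^{n} v_{ik} = 1 .
```

The first term is a feature-weighted fuzzy dispersion; the second penalizes concentrating all the weight on a few features.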

10 Simultaneous clustering and attribute discrimination
• To optimize J1 with respect to V, we use the Lagrange multiplier technique.
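The resulting formula is not in the transcript. Assuming J1 is a weighted dispersion term plus a penalty δ_i Σ_k v_ik², with the weights of each cluster summing to 1, setting the derivative of the Lagrangian to zero gives v_ik = 1/n + (mean_k(a) − a_k) / (2 δ_i), where a_k = Σ_j u_ij^m d_ijk². The sketch below computes that closed form for a single cluster; the function name `scad1_weights` and the toy numbers are hypothetical.

```python
def scad1_weights(u_i, d2_i, delta_i, m=2.0):
    """Closed-form feature weights for one cluster under the assumed J1.

    u_i[j]     membership of point j in the cluster
    d2_i[j][k] squared distance d_ijk**2 of point j along feature k
    delta_i    penalty constant for this cluster (> 0)
    """
    n_features = len(d2_i[0])
    # a[k] = sum_j u_ij^m * d_ijk^2: fuzzy dispersion along feature k.
    a = [sum((u ** m) * row[k] for u, row in zip(u_i, d2_i))
         for k in range(n_features)]
    mean_a = sum(a) / n_features
    # Start from uniform weights 1/n, then shift weight toward features
    # whose dispersion is below average (compact features are more relevant).
    return [1.0 / n_features + (mean_a - a_k) / (2.0 * delta_i) for a_k in a]


# Toy data (made up): feature 0 is compact in the cluster, feature 1 is spread out.
u_i = [1.0, 1.0, 1.0]
d2_i = [[0.1, 4.0], [0.2, 5.0], [0.1, 3.0]]
weights = scad1_weights(u_i, d2_i, delta_i=100.0)
```

The raw closed form always sums to 1, but for small `delta_i` individual values can fall outside [0, 1]; this sketch does not clip them. As `delta_i` grows, the weights approach the uniform value 1/n.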

11 Simultaneous clustering and attribute discrimination
• The choice of δ_i in Eq. (9) is important.
─ If δ_i is too small, the first term dominates: only one feature in cluster i will be maximally relevant and assigned a weight of 1, and the remaining features are assigned weights of 0.

12 Simultaneous clustering and attribute discrimination
• Updated u_ij
• Updated c_ik
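The two update equations are missing from the transcript. If the objective is a feature-weighted FCM-style dispersion (an assumption, as are the fuzzifier m and the shorthand \tilde{D}_{ij}^{2}), the standard necessary conditions would take the following form:

```latex
\tilde{D}_{ij}^{2} = \sum_{k=1}^{n} v_{ik} \, d_{ijk}^{2}, \qquad
u_{ij} = \frac{1}{\sum_{l=1}^{C}
  \left( \tilde{D}_{ij}^{2} / \tilde{D}_{lj}^{2} \right)^{1/(m-1)}}, \qquad
c_{ik} = \frac{\sum_{j=1}^{N} (u_{ij})^{m} \, x_{jk}}{\sum_{j=1}^{N} (u_{ij})^{m}}
\quad \text{(for } v_{ik} > 0 \text{)} .
```

The membership update has the same shape as in Fuzzy C-Means, with the plain distance replaced by the weighted one.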

13 Simultaneous clustering and attribute discrimination
• SCAD2
─ q is referred to as the “discrimination exponent”.
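The SCAD2 objective is not shown in the transcript. Assuming it replaces the weight v_ik by v_ik^q (q > 1) inside the dispersion term and keeps the constraint that the weights of each cluster sum to 1, a Lagrange argument gives v_ik proportional to a_k^(−1/(q−1)), where a_k = Σ_j u_ij^m d_ijk². The sketch below illustrates how q controls the contrast between weights; `scad2_weights`, the `eps` guard, and the data are hypothetical.

```python
def scad2_weights(u_i, d2_i, q=2.0, m=2.0, eps=1e-12):
    """Feature weights for one cluster under the assumed SCAD2 objective.

    u_i[j]     membership of point j in the cluster
    d2_i[j][k] squared distance d_ijk**2 of point j along feature k
    q          discrimination exponent (> 1)
    """
    n_features = len(d2_i[0])
    # a[k] = sum_j u_ij^m * d_ijk^2 (eps keeps the ratios below finite).
    a = [sum((u ** m) * row[k] for u, row in zip(u_i, d2_i)) + eps
         for k in range(n_features)]
    power = 1.0 / (q - 1.0)
    # v_ik = 1 / sum_t (a_k / a_t)^(1/(q-1)): normalized inverse dispersion.
    return [1.0 / sum((a_k / a_t) ** power for a_t in a) for a_k in a]


# Toy data (made up): feature 0 is compact in the cluster, feature 1 is spread out.
u_i = [1.0, 1.0, 1.0]
d2_i = [[0.1, 4.0], [0.2, 5.0], [0.1, 3.0]]
sharp = scad2_weights(u_i, d2_i, q=2.0)    # strong contrast between features
soft = scad2_weights(u_i, d2_i, q=50.0)    # close to uniform weights
```

Under this form the weights always lie in (0, 1) and sum to 1, and larger values of q flatten them toward 1/n.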

14 Simultaneous clustering and attribute discrimination
• SCAD2
─ Updated u_ij

15 Simultaneous clustering and attribute discrimination: unknown number of clusters
• The objective functions in (5) and (21) complement each other and can easily be combined into one objective function.
• This algorithm is called SCAD2-CA.

16 Application 1: color image segmentation
• We illustrate the ability of SCAD2 to segment real color images.
• Feature data extraction
─ Texture features: 3 attributes
─ Color features: 2 attributes
─ Position features: 2 attributes (x and y)

17 Application 1: color image segmentation
[Figure: image regions labeled Dirt and Grass]

18 Application 1: color image segmentation
[Figure not captured in the transcript]

19 Application 2: supervised classification
• We use SCAD2-CA for supervised classification on:
─ the Iris data set
─ the Wisconsin Breast Cancer data set
─ the Pima Indians Diabetes data set
─ the Heart Disease data set

20 Conclusions
• We have proposed a new approach to perform clustering and feature weighting simultaneously.
• SCAD2-CA can determine the “optimal” number of clusters automatically.

21 Personal Opinion
• Advantages
─ Takes into account different feature weights in different clusters.
─ Performs clustering and feature weighting simultaneously.
─ The paper is well written.
• Application
─ The idea should be applied to our own clustering algorithms.
• Limitation
─ Only suitable for numeric data.
• Discussion
─ Clustering techniques are very hard to improve.

