Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A modified version of the K-means algorithm with a distance.

Presentation transcript:

Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology
A modified version of the K-means algorithm with a distance based on cluster symmetry
Advisor: Dr. Hsu
Reporter: Chun Kai Chen
Authors: Mu-Chun Su and Chien-Hsing Chou
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001

Intelligent Database Systems Lab N.Y.U.S.T. I. M.
Outline
• Motivation
• Objective
• Introduction
• The Point Symmetry Distance
• Experimental Results
• Conclusions
• Personal Opinion

Motivation
• Clusters can have arbitrary shapes and sizes, so the Minkowski metrics are a poor choice when no a priori information is available about the geometric characteristics of the data set to be clustered

Objective
• A more flexible measure is therefore needed
─ symmetry is one of the basic features of shapes and objects
• The paper proposes a nonmetric measure based on the concept of point symmetry

K-means Partitional Clustering
(Figure: a loop that alternates between updating the cluster means and reassigning the patterns)

Symmetry-based version of the K-means algorithm
(Figure: two stages, Coarse-Tuning and Fine-Tuning, each alternating between updating the cluster means and reassigning the patterns)

Introduction (1/4)
• Most conventional clustering methods assume that patterns with similar locations or constant density form a single cluster
─ location or density thus becomes the characteristic property of a cluster

Introduction (2/4)
• To identify clusters in a data set mathematically
─ it is usually necessary to first define a measure of similarity or proximity, which establishes a rule for assigning patterns to the domain of a particular cluster center
─ the most popular similarity measure is the Euclidean distance

Introduction (3/4)
• Euclidean distance as a measure of similarity
─ usually detects hyperspherical clusters of equal size
• Mahalanobis distance
─ a popular choice that can handle hyperellipsoidal clusters
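The contrast between the two distances can be illustrated with a minimal sketch. It is restricted to two dimensions so the covariance matrix can be inverted by hand; the function names `euclidean` and `mahalanobis_2d` and the example covariance are assumptions made for illustration, not part of the slides.

```python
import math


def euclidean(x, y):
    """Ordinary Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2-D point x from `mean` under covariance `cov`.

    `cov` is a 2x2 matrix given as ((a, b), (c, d)); it is inverted explicitly,
    which is the step that must be repeated whenever the cluster changes.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mean[0], x[1] - mean[1])
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(q)
```

With a cluster elongated along the first axis, for example `cov = ((4.0, 0.0), (0.0, 1.0))` around the origin, the points `(2.0, 0.0)` and `(0.0, 2.0)` are equally far in the Euclidean sense, but the Mahalanobis distance treats the first as closer because it lies along the direction of larger variance.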

Introduction (4/4)
• The major difficulty in using the Mahalanobis distance
─ the inverse of the sample covariance matrix has to be recomputed every time a pattern changes its cluster domain, which is computationally expensive
─ in fact, the clustering results depend not only on the similarity measure but also on the number of clusters, which cannot always be defined a priori
• In this paper
─ the focus is on the selection of similarity measures

Symmetry
• Symmetry is common both in the abstract and in nature
─ it is reasonable to assume that some kind of symmetry exists in the structures of clusters
─ the immediate problem is how to find a metric to measure symmetry

The Point Symmetry Distance
• The point symmetry distance is defined for N patterns x_i, i = 1, …, N, and a reference vector c (e.g., a cluster centroid)
─ the denominator term is used for normalization
─ if the right-hand side of (2) is minimized when x_i = x_j*, then the pattern x_j* is called the symmetrical pattern relative to x_j with respect to c
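A minimal sketch of the definition above, assuming the form of the distance given in Su and Chou's paper, d_s(x_j, c) = min over i ≠ j of ‖(x_j − c) + (x_i − c)‖ / (‖x_j − c‖ + ‖x_i − c‖). The function name, the returned index of the symmetrical pattern, and the handling of a zero denominator are illustrative assumptions.

```python
import math


def _norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(a * a for a in v))


def point_symmetry_distance(j, patterns, c):
    """Return (d_s, i_star) for pattern j with respect to reference vector c.

    d_s is the minimum over all i != j of
        ||(x_j - c) + (x_i - c)|| / (||x_j - c|| + ||x_i - c||),
    and i_star is the index of the minimizing pattern, i.e. the symmetrical
    pattern relative to x_j with respect to c.
    """
    xj = [a - b for a, b in zip(patterns[j], c)]
    best, best_i = math.inf, None
    for i, p in enumerate(patterns):
        if i == j:
            continue
        xi = [a - b for a, b in zip(p, c)]
        denom = _norm(xj) + _norm(xi)
        if denom == 0.0:
            # Both patterns coincide with c; skipping them is an assumption,
            # since the ratio is undefined in that case.
            continue
        d = _norm([a + b for a, b in zip(xj, xi)]) / denom
        if d < best:
            best, best_i = d, i
    return best, best_i
```

For the patterns `[(1.0, 0.0), (-1.0, 0.0), (0.0, 2.0)]` and `c = (0.0, 0.0)`, the first pattern has an exact mirror image through `c`, so its distance is 0 and its symmetrical pattern is the second one. The third pattern has no mirror image, so its distance is strictly positive.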

Example of the Point Symmetry Distance
(Figure)

Symmetry-based version of the K-means algorithm (1/3)
• Step 1: Initialization
─ randomly choose K data points from the data set to initialize the K cluster centroids c_1, c_2, …, c_K
• Step 2: Coarse-Tuning
─ use the ordinary K-means algorithm with the Euclidean distance to update the K cluster centroids
─ once the K cluster centroids converge or some terminating criterion is satisfied, go to Step 3
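Steps 1 and 2 can be sketched as an ordinary K-means loop. This is a minimal illustration, not the authors' implementation: the function name `coarse_tuning`, the `seed` and `max_iter` parameters, the stopping rule (no label changes), and the choice to keep the old centroid of an empty cluster are all assumptions.

```python
import math
import random


def _euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def _mean(points):
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def coarse_tuning(data, k, max_iter=100, seed=0):
    """Step 1 (random initialization) and Step 2 (ordinary K-means)."""
    rng = random.Random(seed)
    centroids = rng.sample(data, k)  # Step 1: K distinct data points
    labels = None
    for _ in range(max_iter):
        # Reassign every pattern to its nearest centroid (Euclidean).
        new_labels = [
            min(range(k), key=lambda m: _euclidean(x, centroids[m]))
            for x in data
        ]
        if new_labels == labels:  # converged: no pattern changed cluster
            break
        labels = new_labels
        # Update each cluster mean from its current members.
        for m in range(k):
            members = [x for x, lab in zip(data, labels) if lab == m]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[m] = _mean(members)
    return centroids, labels
```

The centroids returned here would serve as the starting point of the fine-tuning stage.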

Symmetry-based version of the K-means algorithm (2/3)
• Step 3: Fine-Tuning
─ for each pattern x, find the cluster centroid nearest to it in the symmetrical sense, i.e. the index k* that minimizes d_s(x, c_k), where d_s(x, c_k) is the point symmetry distance
─ if that point symmetry distance is smaller than a prespecified parameter θ, assign the data point x to the k*-th cluster
─ otherwise, assign x to the cluster whose centroid c_k minimizes d(x, c_k), where d(x, c_k) is the Euclidean distance
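The assignment rule of Step 3 can be sketched as follows. The sketch assumes the point symmetry distance has the form given in Su and Chou's paper and that the minimum runs over every pattern in the data set other than x itself; the names `ps_distance` and `assign` are hypothetical.

```python
import math


def _norm(v):
    return math.sqrt(sum(a * a for a in v))


def _sub(x, c):
    return tuple(a - b for a, b in zip(x, c))


def ps_distance(j, patterns, c):
    """Point symmetry distance of pattern j with respect to c (assumed form)."""
    xj = _sub(patterns[j], c)
    best = math.inf
    for i, p in enumerate(patterns):
        if i == j:
            continue
        xi = _sub(p, c)
        denom = _norm(xj) + _norm(xi)
        if denom > 0.0:  # assumption: skip the undefined 0/0 case
            best = min(best, _norm(tuple(a + b for a, b in zip(xj, xi))) / denom)
    return best


def assign(j, patterns, centroids, theta):
    """Step 3: symmetry-based assignment with a Euclidean fallback."""
    ds = [ps_distance(j, patterns, c) for c in centroids]
    k_star = min(range(len(centroids)), key=ds.__getitem__)
    if ds[k_star] < theta:
        return k_star  # symmetric enough: trust the symmetry-based choice
    x = patterns[j]
    return min(range(len(centroids)), key=lambda k: _norm(_sub(x, centroids[k])))
```

With patterns `[(3.0, 0.0), (-3.0, 0.0)]` and centroids `[(0.0, 0.0), (3.5, 0.0)]`, the first pattern is closer to the second centroid in the Euclidean sense, yet it has an exact mirror image through the first centroid, so any positive θ assigns it to cluster 0. The threshold θ therefore decides when symmetry overrides proximity.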

Symmetry-based version of the K-means algorithm (3/3)
• Step 4: Updating
─ compute the new centroid of each of the K clusters as the mean of its patterns, $c_k(t+1) = \frac{1}{N_k} \sum_{x_i \in S_k(t)} x_i$
─ where S_k(t) is the set of patterns assigned to the k-th cluster at time t and N_k is the number of elements in S_k
• Step 5: Continuation
─ if no patterns change categories or the number of iterations has reached a prespecified maximum, stop; otherwise, go to Step 3
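Steps 4 and 5 amount to a loop around the assignment rule. The sketch below keeps that rule abstract: `assign` is a hypothetical callable `assign(j, data, centroids) -> cluster index` standing in for Step 3, and `nearest` is only a Euclidean stand-in so the loop can be run on its own. Keeping the old centroid of an empty cluster is also an assumption.

```python
def update_centroids(data, labels, centroids):
    """Step 4: each new centroid is the mean of the patterns assigned to it."""
    updated = []
    for k, old in enumerate(centroids):
        members = [x for x, lab in zip(data, labels) if lab == k]
        if members:
            updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            updated.append(tuple(old))  # assumption: empty cluster keeps its centroid
    return updated


def iterate(data, centroids, assign, max_iter=50):
    """Repeat Step 3 (via `assign`) and Step 4 until Step 5 says stop."""
    labels = None
    for _ in range(max_iter):  # Step 5: prespecified maximum number of iterations
        new_labels = [assign(j, data, centroids) for j in range(len(data))]
        if new_labels == labels:  # Step 5: no pattern changed category
            break
        labels = new_labels
        centroids = update_centroids(data, labels, centroids)
    return centroids, labels


def nearest(j, data, centroids):
    """Stand-in assignment rule: nearest centroid by squared Euclidean distance."""
    x = data[j]
    return min(range(len(centroids)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(x, centroids[k])))
```

Passing the assignment rule as a parameter keeps the update and stopping logic independent of which distance is used, so the same loop could serve both the coarse-tuning and the fine-tuning stage.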

Experimental Results
• Four examples are used to compare the SBKM algorithm with the SBCL algorithm
• One further example shows how the point symmetry distance can be used in face detection

Mixture of spherical and ellipsoidal clusters
(Figure panels: ordinary K-means, SBKM, SBCL)

Ring-shaped clusters
(Figure panels: SBKM, SBCL, ordinary K-means)

Linear structures
(Figure panels: SBKM, SBCL, ordinary K-means)

Combination of ring-shaped, compact, and linear clusters
(Figure panels: ordinary K-means, SBKM, SBCL)

Detecting a face in a complex background
(Figure)

Conclusion
• Although both algorithms use the point symmetry distance as the dissimilarity measure, the SBKM algorithm outperformed the SBCL algorithm in many cases
• The proposed SBKM algorithm can group a given data set into a set of clusters of different geometrical structures
• The point symmetry distance can also be applied to detecting human faces; the experimental results are encouraging

Personal Opinion
• Advantage: an innovative idea
• Application: clustering
• Future work: adopt the symmetry distance in SOM