Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology: A novel genetic algorithm for automatic clustering

Presentation transcript:

Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology
A novel genetic algorithm for automatic clustering
Advisor: Dr. Hsu
Graduate: Sheng-Hsuan Wang
Authors: Gautam Garai, B.B. Chaudhuri
Department of Information Management
Pattern Recognition Letters 25 (2004)

Outline
- Motivation
- Objective
- Introduction
- Basic concept of Classical Genetic Algorithm
- Clustering with Genetic Algorithm
- Experimental results
- Discussion and conclusion
- Personal opinions
- Review

Motivation
Some open problems in clustering:
- Automatic clustering.
- One cluster is confined fully or partly within another cluster.
- Clusters are present in noisy data.

Objective
A new genetically guided algorithm for solving the clustering problem, which is a two-phase process.
- Cluster Decomposition Algorithm (CDA).
- Hierarchical Cluster Merging Algorithm (HCMA).
- Adjacent Cluster Checking Algorithm (ACCA).

Introduction
Clustering methods can broadly be classified into two categories:
- Hierarchical: agglomerative, divisive.
- Non-hierarchical: k-means.

Introduction
Some researchers have used a GA based on the split-and-merge method to define clusters, e.g. Tseng and Yang (2001).
Other algorithms:
- DBSCAN
- CURE
- Chameleon

Introduction
The Genetically based Clustering Algorithm (GCA) is basically a two-stage split-and-merge algorithm for finding the clusters.
- Splitting of clusters with CDA.
- Cluster merging with HCMA.
- Adjacency checking between two fragmented clusters with ACCA.

Basic concept of Classical Genetic Algorithm
[Flowchart] Encoding schemas -> Fitness evaluation -> Testing the end of the algorithm.
- YES: Halt.
- NO: Parent selection -> Crossover operators -> Mutation operators, then back to fitness evaluation.
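The flowchart above can be illustrated with a minimal sketch of a classical GA loop. This is not the algorithm from the paper: the toy objective (maximize the number of 1 bits), the binary tournament selection, and all parameter values (`n_bits`, `pop_size`, `p_cross`, `p_mut`) are assumptions chosen only to make each flowchart box visible in code.

```python
import random

def run_ga(n_bits=20, pop_size=30, generations=100,
           p_cross=0.8, p_mut=0.02, seed=0):
    """Toy classical GA: maximize the number of 1 bits in a bit string."""
    rng = random.Random(seed)

    def fitness(chrom):
        # Fitness evaluation (toy objective, an assumption of this sketch).
        return sum(chrom)

    # Encoding: each individual is a fixed-length list of bits.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def select():
        # Parent selection by binary tournament (assumed scheme).
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        # Testing the end of the algorithm: halt once the optimum is found.
        if fitness(best) == n_bits:
            break
        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            c1, c2 = p1[:], p2[:]
            # Single-point crossover with probability p_cross.
            if rng.random() < p_cross:
                cut = rng.randint(1, n_bits - 1)
                c1 = p1[:cut] + p2[cut:]
                c2 = p2[:cut] + p1[cut:]
            # Mutation: flip each bit independently with probability p_mut.
            for child in (c1, c2):
                for i in range(n_bits):
                    if rng.random() < p_mut:
                        child[i] = 1 - child[i]
                children.append(child)
        pop = children[:pop_size]
        gen_best = max(pop, key=fitness)
        if fitness(gen_best) > fitness(best):
            best = gen_best
    return best
```

Calling `run_ga()` returns the best bit string found; because the random generator is seeded locally, repeated calls with the same `seed` give the same result.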

Clustering with Genetic Algorithm
n vectors X = {x_1, x_2, …, x_n} are to be clustered into k groups.
The clustering approach has two steps:
- Cluster Decomposition Algorithm (CDA).
- Hierarchical Cluster Merging Algorithm (HCMA).

Splitting of clusters with CDA
CDA first decomposes the entire data set into m small groups (fragmented clusters).

The progress of the CDA process
Step 1. For each object x_i, find the nearest neighbor x_j.
Step 2. Compute d_av.

The progress of the CDA process
Step 3. Consider x_i as the center of a circular region with radius r.
Step 4. Set p = 1.
Step 5. Extract B_p and modify the data set X such that X = X - B_p.
Step 6. Terminate the algorithm if [condition missing in source]. Otherwise, p = p + 1 and go to step 5.
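The six CDA steps can be sketched roughly as follows. Several details are not preserved in the slides, so this sketch fills them with labeled assumptions rather than the authors' definitions: `d_av` is taken to be the average nearest-neighbor distance, the radius is taken as `r = u * d_av`, a group B_p is taken to be every remaining point inside the circle around the first remaining point, and the loop is assumed to stop when X is empty. The function name `cda_sketch` is hypothetical.

```python
import math

def cda_sketch(points, u=2.0):
    """Split points into small groups B_1..B_m (sketch under assumptions)."""
    points = list(points)
    if len(points) < 2:
        return [points] if points else []

    # Steps 1-2: nearest-neighbor distance of every object, then the average
    # (assumption: d_av is the mean nearest-neighbor distance).
    nn = [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
          for i, p in enumerate(points)]
    d_av = sum(nn) / len(nn)

    # Step 3: radius of the circular region (assumption: r = u * d_av).
    r = u * d_av

    # Steps 4-6: extract B_p and remove it from X, until X is empty
    # (assumed termination condition).
    remaining = points
    groups = []
    while remaining:
        center = remaining[0]
        group = [q for q in remaining if math.dist(center, q) <= r]
        remaining = [q for q in remaining if math.dist(center, q) > r]
        groups.append(group)
    return groups
```

For example, `cda_sketch([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)])` yields two groups of three points each. A larger `u` gives a larger radius and therefore fewer groups, which matches the slide's remark that m is inversely proportional to r.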

Cluster merging with HCMA
The second stage merges the fragmented clusters B_i.
[Figure: diagram with labels P_i, m, u]

Cluster merging with HCMA
The HCMA algorithm consists of all three phases of the CGA.
- P_a and P_b are chosen randomly from the pool of individuals.
- Crossover probability [value missing in source], using the single-point crossover operation.
- Adaptive mutation probability.

Cluster merging with HCMA (example)
[Figure: merging example with labels B_1, m_0, p_i, B_0, m', C_i]
Merge until B_0 is null.

Cluster merging with HCMA
Let the seed of the fragmented cluster B_i be [formula missing in source].
The center S_j of each C_j: [formula missing in source].
The fitness function: [formula missing in source].

Adjacency checking between two fragmented clusters
The ACCA is used along with HCMA if:
- One cluster is confined fully or partly within another cluster.
- Clusters are present in noisy data.
The ACCA uses two thresholds to decide whether a pair of clusters should be merged.
- T_b: the threshold of boundary points.
- T_d: the threshold of data density difference.

The progress of the ACCA process
Step 1. Define a suitable value for the radius r'.
Step 2. Select two fragmented clusters [symbols missing in source] that satisfy the merging condition.
Step 3. Count the number of boundary points of [symbol missing in source] that reside within radius r'. Let it be N_b, and let the object density of [symbol missing in source] be [symbol missing in source].
Step 4. If [condition missing in source], then [symbols missing in source] are adjacent to each other.
Step 5. Terminate the algorithm.
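A rough sketch of the adjacency test follows. The exact condition in Step 4 and the definition of object density are not preserved in the slides, so the sketch assumes the following, none of which is confirmed by the source: a boundary point is a point of one cluster lying within `r_prime` of any point of the other, the densities are supplied by the caller, and two clusters are adjacent when N_b is at least T_b and the relative density difference is at most T_d. The function names are hypothetical; the default thresholds are the values listed in the parameter settings (T_b = 4, T_d = 0.4).

```python
import math

def count_boundary_points(a, b, r_prime):
    """Count points of cluster a within r_prime of some point of cluster b."""
    return sum(1 for p in a if any(math.dist(p, q) <= r_prime for q in b))

def are_adjacent(a, b, r_prime, density_a, density_b, t_b=4, t_d=0.4):
    """Assumed adjacency rule: enough boundary points and similar densities."""
    if density_a <= 0 or density_b <= 0:
        raise ValueError("densities must be positive")
    n_b = count_boundary_points(a, b, r_prime)
    # Relative density difference (assumed form of the T_d comparison).
    diff = abs(density_a - density_b) / max(density_a, density_b)
    return n_b >= t_b and diff <= t_d
```

Passing the densities in as arguments keeps the sketch independent of any particular density definition; a caller could, for instance, use points per unit area of a cluster's bounding region.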

Experimental results
Parameter setting:
- Population size = 50.
- The number of clusters, m, is inversely proportional to the value of r.
- 2 <= u <= 4.
- k is pre-specified by the user.
- Crossover probability: [value missing in source].
- Initial mutation probability: [value missing in source].
- G_max = 100 times in each cycle, 30 runs.
- T_b = 4; T_d = 0.4.

Cluster partitioning in R^2 feature space
[Figure]


Cluster partitioning in R^2 feature space
[Figure]
The noise is represented as the third cluster.

Cluster separation in Iris data
4-D Iris dataset.
[Figure]

Discussion and conclusion
GCA is composed of two algorithms:
- CDA
- HCMA
The process ends after several GA cycles, when k clusters are found.
ACCA helps identify clusters accurately when:
- A cluster is either partly or fully enclosed by another cluster.
- Noise is present.

Personal Opinions
GCA may be applicable to automatic clustering on a 2-D SOM map.

Review
Using GCA for automatic clustering.
- Split: CDA
- Merge: HCMA + ACCA