Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Fast accurate fuzzy clustering through data reduction Advisor.

Presentation transcript:

Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Fast accurate fuzzy clustering through data reduction Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Authors : Steven Eschrich, Jingwei Ke, Lawrence O. Hall, Dmitry B. Goldgof Department of Information Management IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 11, NO. 2, APRIL 2003

Outline: Motivation, Objective, Introduction, Related Work, BRFCM, BRFCM Implementation, Experiments, Conclusion, Personal Opinion, Review.

Motivation: The clustering problem. Fuzzy c-means (FCM).

Objective: As the title says, "Fast Accurate Fuzzy Clustering Through Data Reduction" (brFCM). The goal is to reduce the number of distinct patterns that must be clustered without adversely affecting partition quality. The reduction is done by aggregating similar examples and then using a weighted exemplar in the clustering process.

Introduction: Clustering in images. Some modifications to the fuzzy c-means clustering algorithm. Two experiments test speedup and correspondence with FCM results: infrared images of natural scenes, and magnetic resonance images of the human brain.

Related Work (1/2): For large data sets, FCM requires significant amounts of CPU time. Variants of FCM: AFCM, mrFCM, and a subsampling algorithm. In this paper, similar feature vectors are combined to speed up FCM.

Related Work (2/2): Our work on speeding up fuzzy c-means has some connection to vector quantization, in the sense that our first step can be seen as a quantization of the data.

BRFCM: 2rFCM. Reducing the precision of the data in order to speed up the clustering. The brFCM algorithm consists of two phases: (1) data reduction, and (2) fuzzy clustering using FCM. We attempt to reduce the number of distinct examples to be clustered from n to n_o, for some n_o << n.

BRFCM - Data Reduction Overview: The first step is quantization. Quantization forces different continuous values into the same quantization level, or bin. The second step is aggregation. Aggregation combines identical feature vectors into a single weighted exemplar that represents the quantization bin, e.g., the mean value of all full-precision feature vectors in the bin. When both quantization and aggregation are used, significant data reduction can be obtained.

BRFCM - Example

BRFCM - Data Reduction Overview: Quantization is an optional step in data reduction. brFCM with only aggregation is functionally equivalent to the original FCM. If data redundancy is significant, the dataset can be represented in a more compact form for clustering.

BRFCM - brFCM Details: Data reduction -> brFCM. In more formal terms, let X' be the set of example vectors representing a reduced-precision view of the dataset X. There are n_o such vectors. Each represents the mean of all full-precision members in its quantization bin, and each has an associated weight giving the number of feature vectors aggregated into it.

BRFCM - brFCM Details: The cluster centroids are calculated by (1). The cluster membership values are calculated by (2).
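The equation images from this slide are not part of the transcript. As a hedged reconstruction, assuming the usual weighted form of the FCM updates, the two formulas would read as below. The notation is an assumption: x'_j is an exemplar, w_j its weight, u_ij the membership of exemplar j in cluster i, v_i a centroid, m > 1 the fuzzifier, and c the number of clusters.

```latex
\begin{align}
v_i &= \frac{\sum_{j=1}^{n_o} w_j\, u_{ij}^{m}\, x'_j}
            {\sum_{j=1}^{n_o} w_j\, u_{ij}^{m}},
       \qquad 1 \le i \le c \tag{1}\\
u_{ij} &= \left[\,\sum_{k=1}^{c}
          \left(\frac{\lVert x'_j - v_i \rVert}{\lVert x'_j - v_k \rVert}\right)^{2/(m-1)}
          \right]^{-1} \tag{2}
\end{align}
```

With every w_j = 1 and n_o = n, these are the ordinary FCM updates.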

BRFCM - brFCM Details: Two particular features of this algorithm. When no quantization occurs and the aggregation step doesn't reduce the dataset, every exemplar is an original example with weight 1, and the algorithm reduces to FCM. When the aggregation step is used by itself, the algorithm also reduces to FCM. This formulation can significantly improve the speed of clustering without a loss of accuracy.
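The claim that aggregation alone reproduces FCM can be checked with a small sketch. The code below is not the authors' implementation; it assumes the standard FCM membership update and a weighted centroid update, and the function names (`memberships`, `weighted_centroids`, `brfcm_step`) are hypothetical. It runs one iteration on a dataset with duplicates and on its aggregated, weighted form.

```python
def memberships(points, centroids, m=2.0):
    """u[i][j]: membership of point j in cluster i (standard FCM update, assumed)."""
    u = [[0.0] * len(points) for _ in centroids]
    for j, x in enumerate(points):
        # Euclidean distance from point j to every centroid.
        d = [sum((a - b) ** 2 for a, b in zip(x, v)) ** 0.5 for v in centroids]
        zeros = [i for i, dist in enumerate(d) if dist == 0.0]
        if zeros:
            # Point sits exactly on one or more centroids: share membership among them.
            for i in zeros:
                u[i][j] = 1.0 / len(zeros)
            continue
        for i in range(len(centroids)):
            u[i][j] = 1.0 / sum((d[i] / dk) ** (2.0 / (m - 1.0)) for dk in d)
    return u


def weighted_centroids(points, weights, u, m=2.0):
    """Centroid update in which each exemplar counts `weight` times."""
    dim = len(points[0])
    result = []
    for row in u:
        coef = [w * (uij ** m) for w, uij in zip(weights, row)]
        total = sum(coef)  # assumed non-zero: every cluster has some membership
        result.append(tuple(
            sum(c * x[k] for c, x in zip(coef, points)) / total for k in range(dim)
        ))
    return result


def brfcm_step(points, weights, centroids, m=2.0):
    """One iteration: memberships from the current centroids, then new centroids."""
    u = memberships(points, centroids, m)
    return weighted_centroids(points, weights, u, m)


# Full dataset with duplicates, every example weighted 1 (plain FCM).
full = [(1.0,), (1.0,), (1.0,), (4.0,), (9.0,), (9.0,)]
# Aggregated dataset: one exemplar per distinct value, weight = multiplicity.
reduced = [(1.0,), (4.0,), (9.0,)]
reduced_weights = [3, 1, 2]

start = [(2.0,), (8.0,)]
plain = brfcm_step(full, [1] * len(full), start)
aggregated = brfcm_step(reduced, reduced_weights, start)
print(plain)
print(aggregated)
```

Both calls should print the same centroids up to floating-point rounding, while the aggregated call loops over three exemplars instead of six examples.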

BRFCM - Image Characteristics: A 24-bit RGB image has 2^24 possible values (as many as the pixels in a 4096 * 4096 image). Quantizing RGB space by r = 2 creates a space of size 2^18 (as many as the pixels in a 512 * 512 image).

BRFCM Implementation: For this work, quantization was implemented via bit-masking and aggregation was done using a hashing scheme. A. Formula Implementation. The cluster centroids in (1). The membership values in (2). When i = j.

BRFCM Implementation: B. Quantization. Quantization of a feature space can be done using either fixed-size or variable-size bins. brFCM can be implemented efficiently using fixed-size bins. A more general approach to quantization can be
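As an illustration of fixed-size bins via bit-masking, here is a minimal sketch. It assumes non-negative integer features stored in `bits` bits and drops the r least significant bits of each feature; the helper names `bit_mask` and `quantize` are hypothetical and not taken from the paper.

```python
def bit_mask(r, bits=8):
    """Mask that keeps the top (bits - r) bits of a `bits`-wide integer."""
    return ((1 << bits) - 1) & ~((1 << r) - 1)


def quantize(vector, r, bits=8):
    """Clear the r low-order bits of every feature in an integer feature vector."""
    mask = bit_mask(r, bits)
    return tuple(v & mask for v in vector)


# With r = 2, the values 200..203 all fall into the bin labeled 200.
print(quantize((201, 18), r=2))  # (200, 16)
print(quantize((203, 16), r=2))  # (200, 16)
```

Each bin spans 2^r consecutive values per feature, so larger r means fewer, coarser bins.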

BRFCM Implementation: C. Aggregation Using Hashing. The function is given by
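The hash function from the slide is not in the transcript. As a stand-in, the sketch below uses Python's built-in dict (a hash table) keyed on the bit-masked feature vector; this is an assumption for illustration, not the paper's hashing scheme. The function name `aggregate` is hypothetical. Each bin yields one exemplar (the mean of its full-precision members) and one weight (the member count).

```python
def aggregate(vectors, r=0, bits=8):
    """Group integer feature vectors by quantization bin.

    Returns (exemplars, weights): the full-precision mean of each bin and the
    number of vectors that fell into it, in order of first appearance.
    """
    mask = ((1 << bits) - 1) & ~((1 << r) - 1)
    sums = {}    # bin key -> per-feature running sums
    counts = {}  # bin key -> number of vectors in the bin
    for vec in vectors:
        key = tuple(v & mask for v in vec)  # dict hashing replaces a custom hash
        if key not in sums:
            sums[key] = [0] * len(vec)
            counts[key] = 0
        for k, v in enumerate(vec):
            sums[key][k] += v
        counts[key] += 1
    exemplars = [tuple(s / counts[key] for s in sums[key]) for key in sums]
    weights = [counts[key] for key in sums]
    return exemplars, weights


pixels = [(200, 17), (201, 18), (203, 16), (90, 40)]
exemplars, weights = aggregate(pixels, r=2)
print(exemplars)
print(weights)  # [3, 1]
```

Calling `aggregate(pixels, r=0)` performs aggregation only, merging just the exactly identical vectors.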

Experiments: The experiments cover two image domains: a set of infrared images, and magnetic resonance images of the normal human brain, which are segmented into gray matter, white matter, and cerebrospinal fluid. Evaluated: data reduction, clustering time, and clustering result.

Experiments - Infrared Images: Our 172 ATR images are 8-bit (256-value) infrared images of size pixels. The images were clustered into c = 5 clusters. We use two features: intensity and one Laws' Texture Energy feature. Table 3 shows the remarkable level of reduction seen in these images.

Experiments - Infrared Images

Experiments - Correspondence With FCM: To measure the cluster correspondence between brFCM and FCM clustering results, consider two partitions of X = {x_1, x_2, ..., x_n}. We define the maximal intersection of a cluster in one partition as its largest overlap with any cluster in the other partition. The correspondence mapping can then be defined as the mapping of each cluster to the cluster in the other partition that attains this maximal intersection, for all clusters in the first partition.

Experiments - Correspondence With FCM: The algorithm for calculating the cluster correspondence. 1) Find the correspondence mapping. 2) Correspondence rate Corr1 is the sum of all maximal intersections in the correspondence mapping, divided by the number of examples in X. 3) Repeat for Corr2 (using the mapping in the reverse direction). 4) Correspondence rate CR = max(Corr1, Corr2).
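The steps above can be sketched for hard partitions given as label lists. This is a minimal illustration under the assumption that each example is assigned to exactly one cluster per partition (for fuzzy results, the cluster of highest membership); the function names `corr` and `correspondence_rate` are hypothetical.

```python
from collections import Counter


def corr(labels_a, labels_b):
    """Sum of maximal intersections of A's clusters with B's, divided by n."""
    # Count how many examples share each (cluster in A, cluster in B) pair.
    overlap = Counter(zip(labels_a, labels_b))
    best = {}
    for (a, _b), size in overlap.items():
        best[a] = max(best.get(a, 0), size)  # maximal intersection for cluster a
    return sum(best.values()) / len(labels_a)


def correspondence_rate(labels_a, labels_b):
    """CR = max(Corr1, Corr2), computing the mapping in both directions."""
    return max(corr(labels_a, labels_b), corr(labels_b, labels_a))


fcm_labels = [0, 0, 0, 1, 1, 1]
brfcm_labels = [1, 1, 0, 0, 0, 0]
print(correspondence_rate(fcm_labels, brfcm_labels))
```

Because the measure works on overlaps rather than label values, two partitions that differ only in how clusters are numbered get a rate of 1.0.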

Experiments - Correspondence With FCM: How significant are the brFCM-FCM correspondence rates as r increases? brFCM generally creates partitions very similar to FCM, given the same centroid initializations for this dataset.

Experiments - Magnetic Resonance Images: The set of MR images consisted of 256* bit images. Each pixel consisted of three features (T1, T2, and PD). There are 32 MRI slices. Each MR image has an associated ground truth. The ground-truth images were created by k-nearest neighbors (KNN) with k = 7, where the training data was chosen by a person who could be labeled a radiology technician. There are three classes of interest in the magnetic resonance images: cerebrospinal fluid, gray matter, and white matter.

Experiments - Magnetic Resonance Images

Experiments - Magnetic Resonance Images: 1) Performance Speedups

Experiments - Magnetic Resonance Images: 2) Correspondence With FCM on Ground Truth

Experiments - Discussion: The brFCM algorithm generates a significant speedup over literal FCM on both the infrared image dataset and the MRI dataset. A trade-off exists between FCM correspondence and speedup (Fig. 2).

Conclusion: Speedup versus bit reduction: the higher the value of r, the higher the speedup and the lower the accuracy. This approach to speeding up clustering can be applied equally well to hard c-means and EM clustering, or to optimizations of FCM. For many image clustering problems, brFCM is a fast alternative to traditional FCM.

Personal Opinion: There is a trade-off between accuracy and speedup. Data reduction: numerical data => bit mask; categorical data => concept hierarchy.

Review: Fuzzy c-means (FCM). Data reduction: quantization using a bit mask; aggregation using hashing. Fuzzy clustering using FCM. Two experiments: infrared images; magnetic resonance images of the normal human brain.