Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution
Presented by Jingting Zeng, 11/26/2007

Outline
- Introduction to Feature Selection
- Feature Selection Models
- Fast Correlation-Based Filter (FCBF) Algorithm
- Experiment
- Discussion
- Reference

Introduction to Feature Selection
Definition: a process that chooses an optimal subset of features according to an objective function.
Objectives:
- Reduce dimensionality and remove noise
- Improve mining performance: speed of learning, predictive accuracy, and simplicity and comprehensibility of mined results

An Example of an Optimal Subset
Data set (whole set): five Boolean features, where C = F1 ∨ F2, F3 = ¬F2, and F5 = ¬F4.
Optimal subset: {F1, F2} or {F1, F3} (a sketch verifying this follows).
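The claim is easy to verify by brute force. Below is a minimal Python sketch (an added illustration, not from the original slides) that enumerates the truth table and checks that the class is fully determined by {F1, F2}, and equally by {F1, F3}:

```python
from itertools import product

# Enumerate the truth table for the slide's five Boolean features:
# F3 = NOT F2 and F5 = NOT F4 are redundant copies, and C = F1 OR F2.
for f1, f2, f4 in product([0, 1], repeat=3):
    f3, f5 = 1 - f2, 1 - f4        # redundant: determined by F2 and F4
    c = f1 | f2                    # class depends only on F1 and F2
    assert c == (f1 | (1 - f3))    # equivalently determined by F1 and F3
    print(f1, f2, f3, f4, f5, "->", c)
```

F4 and F5 never influence C, and F3 carries exactly the information in F2, so two features suffice.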

Models of Feature Selection
Filter model:
- Separates feature selection from classifier learning
- Relies on general characteristics of the data (information, distance, dependence, consistency)
- No bias toward any learning algorithm; fast
Wrapper model:
- Relies on a predetermined classification algorithm
- Uses predictive accuracy as the goodness measure
- High accuracy, but computationally expensive
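For concreteness, here is a minimal scikit-learn sketch of the two models (an added illustration, assuming scikit-learn >= 0.24; the dataset, scoring function, classifier, and k = 10 are arbitrary choices). The filter ranks features by a data-intrinsic score with no classifier in the loop; the wrapper scores candidate subsets by the cross-validated accuracy of a fixed classifier:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import (SelectKBest, SequentialFeatureSelector,
                                       mutual_info_classif)
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)   # scale so kNN distances are comparable

# Filter model: rank features by mutual information with the class.
filt = SelectKBest(mutual_info_classif, k=10).fit(X, y)

# Wrapper model: greedily grow a subset, scoring candidates by the
# cross-validated accuracy of a fixed classifier (accurate but expensive).
knn = KNeighborsClassifier(n_neighbors=5)
wrap = SequentialFeatureSelector(knn, n_features_to_select=10, cv=5).fit(X, y)

print("filter picks :", filt.get_support(indices=True))
print("wrapper picks:", wrap.get_support(indices=True))
```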

Filter Model

Wrapper Model

Two Aspects of Feature Selection
- How to decide whether a feature is relevant to the class
- How to decide whether a relevant feature is redundant with respect to the other features

Linear Correlation Coefficient
For a pair of variables (x, y):
r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2} \sqrt{\sum_i (y_i - \bar{y})^2}}
However, it may not be able to capture non-linear correlations.
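A small demonstration of that limitation (an added sketch): y = x^2 is completely determined by x, yet over a symmetric range its linear correlation with x is zero.

```python
import numpy as np

# y = x**2 is a deterministic function of x, but over a symmetric range
# the *linear* correlation between the two is (numerically) zero.
x = np.linspace(-1.0, 1.0, 201)
y = x ** 2
print(np.corrcoef(x, y)[0, 1])   # ~0.0: Pearson's r misses the dependence
```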

Information Measures
Entropy of variable X:
H(X) = -\sum_i P(x_i) \log_2 P(x_i)
Entropy of X after observing Y:
H(X \mid Y) = -\sum_j P(y_j) \sum_i P(x_i \mid y_j) \log_2 P(x_i \mid y_j)
Information gain:
IG(X \mid Y) = H(X) - H(X \mid Y)
Symmetrical uncertainty:
SU(X, Y) = 2 \, \frac{IG(X \mid Y)}{H(X) + H(Y)}
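For discrete variables these measures are simple to estimate from value frequencies. A minimal Python sketch (added here; the function names are our own):

```python
import numpy as np
from collections import Counter

def entropy(x):
    """H(X) = -sum_i P(x_i) * log2 P(x_i), estimated from value frequencies."""
    counts = np.array(list(Counter(x).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def conditional_entropy(x, y):
    """H(X|Y) = sum_j P(y_j) * H(X | Y = y_j)."""
    x, y = np.asarray(x), np.asarray(y)
    return sum((y == v).mean() * entropy(x[y == v]) for v in np.unique(y))

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X|Y) / (H(X) + H(Y)); lies in [0, 1]."""
    ig = entropy(x) - conditional_entropy(x, y)   # information gain
    denom = entropy(x) + entropy(y)
    return 2.0 * ig / denom if denom > 0 else 0.0
```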

Fast Correlation-Based Filter (FCBF) Algorithm
- How to decide whether a feature is relevant to the class C: find a subset S' such that every feature F_i in S' satisfies SU(F_i, C) ≥ δ, where δ is a user-chosen relevance threshold.
- How to decide whether such a relevant feature is redundant: use the correlation between features and the class as a reference.

Definitions
Predominant correlation: the correlation SU(F_i, C) between a feature F_i and the class C is predominant iff SU(F_i, C) ≥ δ and there is no other feature F_j with SU(F_j, F_i) ≥ SU(F_i, C).
Redundant peer (RP): if SU(F_j, F_i) ≥ SU(F_i, C), then F_j is a redundant peer of F_i. Use S_{P_i} to denote the set of redundant peers of F_i.

Three Heuristics
Split S_{P_i} into S_{P_i}^+ = {F_j ∈ S_{P_i} : SU(F_j, C) > SU(F_i, C)} and S_{P_i}^- = {F_j ∈ S_{P_i} : SU(F_j, C) ≤ SU(F_i, C)}.
1. If S_{P_i}^+ is empty, treat F_i as a predominant feature, remove all features in S_{P_i}^-, and skip identifying redundant peers for them.
2. If S_{P_i}^+ is not empty, process all features in S_{P_i}^+ first; if none of them becomes predominant, follow the first heuristic.
3. The feature with the largest SU(F_i, C) value is always a predominant feature and can be a starting point for removing other features.

FCBF Algorithm (Part 1: relevance analysis)
Time complexity: O(N) in the number of features N.

FCBF Algorithm (Part 2: redundancy analysis)
Time complexity: O(N log N) on average.
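Putting the definitions and heuristics together, here is a hedged Python sketch of the two stages (a reconstruction from the description above, reusing the symmetrical_uncertainty helper sketched earlier; it is not the authors' code):

```python
import numpy as np

def fcbf(X, y, delta=0.0):
    """Sketch of FCBF. X: (n_samples, n_features) array of discrete features,
    y: class labels. Returns indices of predominant features, in selection order."""
    X, y = np.asarray(X), np.asarray(y)
    n_features = X.shape[1]
    # Stage 1 (relevance, O(N)): SU between each feature and the class;
    # keep those at or above the threshold delta, sorted by decreasing SU.
    su_c = [symmetrical_uncertainty(X[:, i], y) for i in range(n_features)]
    ordered = sorted((i for i in range(n_features) if su_c[i] >= delta),
                     key=lambda i: -su_c[i])
    # Stage 2 (redundancy, O(N log N) on average): walk the list from the
    # highest-SU feature; each kept feature removes its redundant peers,
    # i.e. the F_j with SU(F_j, F_i) >= SU(F_j, C).
    selected = []
    while ordered:
        i = ordered.pop(0)
        selected.append(i)
        ordered = [j for j in ordered
                   if symmetrical_uncertainty(X[:, j], X[:, i]) < su_c[j]]
    return selected
```

Starting from the feature with the largest SU(F_i, C) is exactly the third heuristic; removing peers as soon as a predominant feature is fixed implements the first two.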

Experiments
FCBF is compared to ReliefF, CorrSF, and ConsSF.
Summary of the 10 data sets

Results

Results (cont.)

Pros and Cons
Advantages:
- Very fast
- Selects fewer features with higher accuracy
Disadvantage:
- Can fail to detect some features: on data with 4 features generated by 4 Gaussian functions plus 4 additional redundant features, FCBF selected only 3 features.

Discussion
FCBF compares only individual features with each other. A possible extension: first use PCA to capture groups of correlated features, then apply FCBF to the result (see the sketch below).
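A rough sketch of that idea (entirely an added illustration: the component count, the discretizer, and the bin count are arbitrary assumptions not prescribed by the slides, and fcbf refers to the sketch above):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import KBinsDiscretizer

def pca_then_fcbf(X, y, n_components=10, n_bins=5, delta=0.0):
    """Project onto principal components to capture groups of correlated
    features, discretize the components (the SU measure needs discrete
    inputs), then run the fcbf sketch above on them."""
    Z = PCA(n_components=n_components).fit_transform(X)
    Zd = KBinsDiscretizer(n_bins=n_bins, encode="ordinal",
                          strategy="uniform").fit_transform(Z)
    return fcbf(Zd, np.asarray(y), delta=delta)
```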

Reference
- L. Yu and H. Liu. Feature selection for high-dimensional data: a fast correlation-based filter solution. In Proceedings of the 12th International Conference on Machine Learning (ICML-03), pages 856–863, 2003.
- J. Biesiada and W. Duch. Feature selection for high-dimensional data: a Kolmogorov-Smirnov correlation-based filter solution. In Proceedings of CORES'05, Advances in Soft Computing, Springer, pages 95–104, 2005.
- www.cse.msu.edu/~ptan/SDM07/Yu-Ye-Liu.pdf
- www1.cs.columbia.edu/~jebara/6772/proj/Keith.ppt

Thank you! Q and A