Published by Angelo Sowl. Modified over 2 years ago.

1
Feature Grouping-Based Fuzzy-Rough Feature Selection
Richard Jensen, Neil Mac Parthaláin, Chris Cornelis

2
Outline
– Motivation/Feature Selection (FS)
– Rough set theory
– Fuzzy-rough feature selection
– Feature grouping
– Experimentation

3
The problem: too much data
The amount of data is growing exponentially
– Staggering 4300% annual growth in global data
Therefore, there is a need for FS and other data reduction methods
– Curse of dimensionality: a problem for machine learning techniques
The complexity of the problem is vast
– e.g. FS searches the powerset of features (2^n candidate subsets for n features)

4
Feature selection
Remove features that are:
– Noisy
– Irrelevant
– Misleading
Task: find a subset that
– Optimises a measure of subset goodness
– Has small/minimal cardinality
In rough set theory, this is a search for reducts
– Much research in this area

5
Rough set theory (RST)
[Figure: for a subset of features P, a set X shown with its lower approximation, its upper approximation, and an equivalence class [x]_P]
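The approximations named on this slide can be sketched in a few lines. This is a minimal illustration of the standard crisp definitions, not code from the presentation: the lower approximation collects equivalence classes that lie wholly inside X, the upper approximation collects those that overlap X. The toy data and all function names are assumptions made for the example.

```python
from collections import defaultdict

def equivalence_classes(data, features):
    """Partition object indices by their values on the chosen features P."""
    classes = defaultdict(set)
    for idx, row in enumerate(data):
        key = tuple(row[f] for f in features)
        classes[key].add(idx)
    return list(classes.values())

def approximations(data, features, concept):
    """Return (lower, upper) approximations of `concept`, a set of object indices."""
    lower, upper = set(), set()
    for eq in equivalence_classes(data, features):
        if eq <= concept:   # class lies entirely inside X
            lower |= eq
        if eq & concept:    # class overlaps X
            upper |= eq
    return lower, upper

# Hypothetical toy data: each row is an object, keys are feature names.
data = [
    {"a": 0, "b": 0},
    {"a": 0, "b": 1},
    {"a": 1, "b": 0},
    {"a": 1, "b": 0},
]
X = {0, 2}
lower, upper = approximations(data, ["a", "b"], X)
print(lower, upper)
```

Objects 2 and 3 are indiscernible on {a, b}, so object 2 cannot be placed in the lower approximation even though it belongs to X.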

6
Rough set feature selection
By considering more features, concepts become easier to define…
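One way to make this concrete is the standard rough-set dependency degree, the fraction of objects whose equivalence class falls entirely within one decision class. The slide does not name this measure, so treating it as the illustration is an assumption, as are the toy data and function names below.

```python
from collections import defaultdict

def partition(rows, features):
    """Group object indices that share the same values on `features`."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[f] for f in features)].add(i)
    return list(blocks.values())

def dependency(rows, features, decision):
    """Dependency degree: |positive region| / |universe|."""
    decision_classes = partition(rows, [decision])
    positive = set()
    for block in partition(rows, features):
        # A block is in the positive region if it fits inside one decision class.
        if any(block <= d for d in decision_classes):
            positive |= block
    return len(positive) / len(rows)

# Hypothetical toy data with decision feature "d".
rows = [
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 1, "d": "no"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 1, "d": "no"},
]
gamma_a = dependency(rows, ["a"], "d")
gamma_ab = dependency(rows, ["a", "b"], "d")
print(gamma_a, gamma_ab)
```

With feature a alone, only half of the objects can be classified with certainty; adding b separates every object, so the decision concept becomes fully definable.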

7
Rough set theory
Problems:
– Rough set methods (usually) require data discretization beforehand
– Extensions require thresholds, e.g. tolerance rough sets
– Also no flexibility in approximations
  E.g. objects either belong fully to the lower (or upper) approximation, or not at all

8
Fuzzy-rough sets
Extends rough set theory
– Use of fuzzy tolerance instead of crisp equivalence
– Approximations are fuzzified
– Collapses to traditional RST when data is crisp
New definitions: fuzzy upper approximation and fuzzy lower approximation [formulas not captured in the transcript]
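The slide's formulas are not in the transcript. The definitions commonly used in the fuzzy-rough literature are lower(x) = inf over y of I(R_P(x, y), X(y)) and upper(x) = sup over y of T(R_P(x, y), X(y)), where R_P is the fuzzy tolerance relation for feature subset P, I is a fuzzy implicator and T is a t-norm. The sketch below assumes those definitions, and the Łukasiewicz implicator and minimum t-norm are choices made for the example rather than anything the slides specify.

```python
def lukasiewicz_implicator(a, b):
    """I(a, b) = min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)

def min_tnorm(a, b):
    """T(a, b) = min(a, b)."""
    return min(a, b)

def fuzzy_approximations(similarity, membership):
    """similarity[x][y]: fuzzy tolerance degree; membership[y]: degree of y in X."""
    n = len(membership)
    lower = [
        min(lukasiewicz_implicator(similarity[x][y], membership[y]) for y in range(n))
        for x in range(n)
    ]
    upper = [
        max(min_tnorm(similarity[x][y], membership[y]) for y in range(n))
        for x in range(n)
    ]
    return lower, upper

# Crisp special case (hypothetical): classes {0}, {1}, {2, 3} and X = {0, 2}.
similarity = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
membership = [1.0, 0.0, 1.0, 0.0]
lower, upper = fuzzy_approximations(similarity, membership)
print(lower, upper)
```

With a crisp relation and crisp memberships the result matches the classical lower approximation {0} and upper approximation {0, 2, 3}, which illustrates the "collapses to traditional RST" point.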

9
Fuzzy-rough feature selection
Search for reducts
– Minimal subsets of features that preserve the fuzzy lower approximations for all decision concepts
Traditional approach
– Greedy hill-climbing algorithm used
– Other search techniques have been applied (e.g. PSO)
Problems
– Complexity is problematic for large data (e.g. over several thousand features)
– No explicit handling of redundancy
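The traditional greedy hill-climbing search can be sketched generically as below. This is an illustrative outline, not the authors' implementation: `evaluate` stands in for whatever subset measure is used (for example a fuzzy-rough dependency), and the coverage-style toy measure is purely hypothetical.

```python
def greedy_hill_climb(features, evaluate):
    """Add the single best feature per step until the score stops improving."""
    selected = []
    best = evaluate(selected)
    while True:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        # Every remaining feature is evaluated in every pass.
        scored = [(evaluate(selected + [f]), f) for f in candidates]
        score, choice = max(scored, key=lambda pair: pair[0])
        if score <= best:
            break
        selected.append(choice)
        best = score
    return selected, best

# Hypothetical measure: fraction of 4 objects "explained" by the subset.
covers = {"f1": {0, 1}, "f2": {1, 2}, "f3": {0, 1, 2}, "f4": {3}}

def coverage(subset):
    covered = set().union(*(covers[f] for f in subset))
    return len(covered) / 4

subset, best_score = greedy_hill_climb(list(covers), coverage)
print(subset, best_score)
```

Each pass scores all remaining features, so the number of evaluations grows roughly quadratically with the number of features, which is the cost the slide flags for data with thousands of features.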

10
Feature grouping
Idea: don’t need to consider all features
– Those that are highly correlated with each other carry the same or similar information
– Therefore, we can group these, and work on a group-by-group basis
This paper: based on greedy hill-climbing
– Group-then-rank approach
Relevancy and redundancy handled by
– Correlation: similar features grouped together
– Internal ranking (correlation with decision feature)

11
Forming groups of features
[Figure: Data → calculate correlations (correlation measure, threshold τ) → feature groups F1, F2, F3, ..., Fn (redundancy) → internally-ranked feature groups, e.g. #1 f3, #2 f12, #3 f1, …, #m fn (relevancy)]
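A rough sketch of the group-forming step follows. It rests on several assumptions not stated on the slide: absolute Pearson correlation as the correlation measure, one group per feature containing every feature whose correlation with it reaches τ, and internal ranking by absolute correlation with the decision feature. Function names and data are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def form_groups(columns, decision, tau):
    """Build one internally-ranked group per feature (assumed scheme)."""
    # Relevancy: correlation of each feature with the decision feature.
    relevance = {f: abs(pearson(v, decision)) for f, v in columns.items()}
    groups = {}
    for f, v in columns.items():
        # Redundancy: features correlated with f at or above the threshold tau.
        members = [g for g, w in columns.items()
                   if g == f or abs(pearson(v, w)) >= tau]
        groups[f] = sorted(members, key=lambda g: relevance[g], reverse=True)
    return groups, relevance

# Hypothetical data: f1 and f2 are nearly collinear, f3 is unrelated.
columns = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.0, 4.0, 6.0, 9.0],
    "f3": [1.0, -1.0, -1.0, 1.0],
}
decision = [0.0, 0.0, 1.0, 1.0]
groups, relevance = form_groups(columns, decision, tau=0.9)
print(groups)
```

Lowering τ merges more features into each group (coarser granularity), while raising it pushes the groups toward singletons.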

12
Selecting features
[Figure: feature subset search and selection: search mechanism, subset evaluation, selected subset(s)]

13
Fuzzy-rough feature grouping

14
Initial experimentation
Setup:
– 10 datasets (9-2557 features)
– 3 classifiers
– Stratified 5 × 10-fold cross-validation
Performance evaluation in terms of
– Subset size
– Classification accuracy
– Execution time
FRFG compared with
– Traditional greedy hill-climber (GHC)
– GA & PSO (200 generations, population size: 40)

15
Results: average subset size

16
Results: classification accuracy
[Charts: JRip; IBk (k=3)]

17
Results: execution times (s)

18
Conclusion
FRFG
– Motivation: reduce computational overhead; improve consideration of redundancy
– Group-then-rank approach
– Parameter τ determines granularity of grouping
– Weka implementation available: http://bit.ly/1oic2xM
Future work
– Automatic determination of parameter τ
– Experimentation using much larger data, other FS methods, etc.
– Clustering of features
– Unsupervised selection?

19
Thank you!

20
Simple example
Dataset of six features
After initialisation, the following groups are formed: F1, F2, F3, F4, etc…
Within each group, rank determines relevance: e.g. f4 more relevant than f3
Ordering of groups: F = {F4, F1, F3, F5, F2, F6}
Greedy hill-climber

21
Simple example...
First group to be considered: F4
– Feature f4 is preferable over others
– So, add this to current (initially empty) subset R
– Evaluate M(R + {f4}):
  If better score than the current best evaluation, store f4
  Current best evaluation = M(R + {f4})
– Set of features which appear in F4: ({f1, f4, f5})
  Add to the set Avoids
Next feature group with elements that do not appear in Avoids: F1
And so on…
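The walk-through above can be sketched as a single pass over the ordered groups. This is one reading of the slide, not the published algorithm: for each group, the highest-ranked feature not yet in Avoids is tried, it is kept only if M improves, and the whole group is then added to Avoids. The evaluation measure and all group contents other than F4 = {f1, f4, f5} are invented for the example.

```python
def select_with_groups(ordered_groups, evaluate):
    """Group-based greedy selection with an Avoids set (assumed reading)."""
    selected, avoids = [], set()
    best = evaluate(selected)
    for group in ordered_groups:
        # Only features not already covered by an earlier group are eligible.
        remaining = [f for f in group if f not in avoids]
        if not remaining:
            continue
        candidate = remaining[0]          # groups are ranked best-first
        score = evaluate(selected + [candidate])
        if score > best:
            selected.append(candidate)
            best = score
        avoids.update(group)              # skip this group's members later
    return selected, best

# Hypothetical internally-ranked groups, in the order they are visited.
ordered_groups = [
    ["f4", "f1", "f5"],   # F4, as on the slide
    ["f1", "f4", "f3"],   # F1 (contents assumed)
    ["f2", "f6"],         # a later group (contents assumed)
]

# Hypothetical measure M: fraction of 4 objects explained by the subset.
covers = {"f1": {0, 1}, "f2": {3}, "f3": {2}, "f4": {0, 1}, "f5": {1}, "f6": {3}}

def measure(subset):
    covered = set().union(*(covers[f] for f in subset))
    return len(covered) / 4

subset, best_score = select_with_groups(ordered_groups, measure)
print(subset, best_score)
```

In this toy run only one candidate per group is evaluated, so three evaluations suffice after the initial one, whereas a plain hill-climber would score every remaining feature in every pass.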
