Fuzzy-Rough Feature Significance for Fuzzy Decision Trees Advanced Reasoning Group Department of Computer Science The University of Wales, Aberystwyth.

1 Fuzzy-Rough Feature Significance for Fuzzy Decision Trees
Richard Jensen, Qiang Shen
Advanced Reasoning Group, Department of Computer Science, The University of Wales, Aberystwyth

2 Outline
- Utility of decision tree induction
- Importance of attribute selection
- Introduction of fuzzy-rough concepts
- Evaluation of the fuzzy-rough metric
- Results of F-ID3 vs FR-ID3
- Conclusions

3 Decision Trees
- Popular classification algorithm in data mining and machine learning
- Fuzzy decision trees (FDTs) follow similar principles to crisp decision trees
- FDTs allow greater flexibility
- Partitioning of the instance space; attributes are selected to derive partitions
- Hence, attribute selection is an important factor in decision tree quality

4 Fuzzy Decision Trees
- Object membership
  - Traditionally, node membership of {0,1}
  - Here, membership is any value in the range [0,1]
  - Calculated from conjunction of membership degrees along path to the node
- Fuzzy tests
  - Carried out within nodes to determine the membership of feature values to fuzzy sets
- Stopping criteria
- Measure of feature significance
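As a rough illustration of the object-membership bullets, the sketch below combines the membership degrees met along a root-to-node path into a single node membership. The slide does not say which conjunction is used; the minimum t-norm (with the product as an alternative), the function name and the example degrees are all assumptions.

```python
from functools import reduce


def path_membership(degrees, t_norm=min):
    """Combine the membership degrees met along a root-to-node path.

    `t_norm` is the fuzzy conjunction; min is assumed here, not taken from the slides.
    """
    return reduce(t_norm, degrees, 1.0)


# Hypothetical degrees for one object passing two fuzzy tests on its way to a node
degrees = [0.8, 0.6]

node_membership = path_membership(degrees)                           # minimum t-norm
product_membership = path_membership(degrees, lambda a, b: a * b)    # product t-norm
```

With a crisp tree every degree is 0 or 1, so either conjunction reduces to the usual "object reaches the node or it does not".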

5 Decision Tree Algorithm
Training set S and (optionally) depth of decision tree l
Start to form decision tree from the top level
Do loop until (1) the depth of the tree gets to l or (2) there is no node to expand
  a) Gauge significance of each attribute of S not already expanded in this branch
  b) Expand the attribute with the most significance
  c) Stop expansion of the leaf node of attribute if maximum significance obtained
End do loop
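The loop on this slide can be sketched in Python as follows. This is a simplified, crisp illustration under stated assumptions, not the authors' implementation: attributes are nominal, each node is split on exact attribute values rather than fuzzy sets, step (c) is interpreted as "stop when every object at the node has the same class", and `purity` is a hypothetical stand-in for whichever significance measure (fuzzy entropy or the fuzzy-rough metric) is plugged in.

```python
from collections import Counter


def purity(rows, labels, attribute):
    """Toy significance: fraction of objects whose attribute value fixes a single class."""
    by_value = {}
    for row, label in zip(rows, labels):
        by_value.setdefault(row[attribute], []).append(label)
    consistent = sum(len(ls) for ls in by_value.values() if len(set(ls)) == 1)
    return consistent / len(rows)


def induce_tree(rows, labels, attributes, significance, max_depth=None, depth=0):
    """Recursively expand the most significant attribute not yet used in this branch."""
    majority = Counter(labels).most_common(1)[0][0]
    depth_reached = max_depth is not None and depth >= max_depth
    # Stop: node is pure, nothing left to expand, or the depth limit l is reached
    if len(set(labels)) == 1 or not attributes or depth_reached:
        return majority
    # a) gauge the significance of each attribute not already expanded in this branch
    scores = {a: significance(rows, labels, a) for a in attributes}
    # b) expand the attribute with the most significance (ties: first in the list)
    best = max(attributes, key=scores.get)
    remaining = [a for a in attributes if a != best]
    children = {}
    for value in sorted({row[best] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        children[value] = induce_tree(
            sub_rows, sub_labels, remaining, significance, max_depth, depth + 1
        )
    return (best, children)


# Hypothetical four-object training set
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
]
labels = ["play", "play", "play", "stay"]

tree = induce_tree(rows, labels, ["outlook", "windy"], purity)
stump = induce_tree(rows, labels, ["outlook", "windy"], purity, max_depth=0)
```

Passing the significance measure as a parameter mirrors the comparison made later in the deck, where two inducers differ only in that measure.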

6 Feature Significance
- Previous FDT inducers use fuzzy entropy
- Little research in the area of alternatives
- Fuzzy-rough feature significance has been used previously in feature selection with much success
- This can also be used to gauge feature importance within FDT construction
- The fuzzy-rough measure extends concepts from crisp rough set theory

7 Crisp Rough Sets
- [x]_B is the set of all points which are indiscernible from point x in terms of feature subset B
[Figure: set X with its lower approximation and upper approximation, built from equivalence classes [x]_B]
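A minimal sketch of the crisp notions on this slide, using the standard rough-set definitions: the lower approximation is the union of equivalence classes wholly contained in X, and the upper approximation is the union of classes that overlap X. The function names and the four-object table are hypothetical.

```python
from collections import defaultdict


def equivalence_classes(objects, features):
    """Group object ids that agree on every feature in `features` (the classes [x]_B)."""
    classes = defaultdict(set)
    for obj_id, values in objects.items():
        classes[tuple(values[f] for f in features)].add(obj_id)
    return list(classes.values())


def approximations(objects, features, target):
    """Return the (lower, upper) approximations of `target` with respect to `features`."""
    lower, upper = set(), set()
    for eq_class in equivalence_classes(objects, features):
        if eq_class <= target:   # class lies entirely inside X
            lower |= eq_class
        if eq_class & target:    # class overlaps X
            upper |= eq_class
    return lower, upper


# Hypothetical data: objects 1 and 2 are indiscernible on features a and b
objects = {
    1: {"a": 0, "b": 1},
    2: {"a": 0, "b": 1},
    3: {"a": 1, "b": 0},
    4: {"a": 1, "b": 1},
}
X = {1, 3}

lower, upper = approximations(objects, ["a", "b"], X)
```

Object 1 belongs to X but cannot be told apart from object 2, so it appears only in the upper approximation.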

8 Fuzzy Equivalence Classes
- At the centre of Fuzzy-Rough Feature Selection
- Incorporate vagueness
- Handle real-valued data
- Cope with noisy data
[Figure: crisp equivalence class vs. fuzzy equivalence class. Image: Rough Fuzzy Hybridization: A New Trend in Decision Making, S. K. Pal and A. Skowron (eds), Springer-Verlag, Singapore, 1999]

9 Fuzzy-Rough Significance
- Deals with real-valued features via fuzzy sets
- Fuzzy lower approximation: [equation not preserved in transcript]
- Fuzzy positive region: [equation not preserved in transcript]
- Evaluation function: [equation not preserved in transcript]
- Feature importance is estimated with this
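For orientation, the three quantities named on this slide are usually defined in the fuzzy-rough feature selection literature as below. The slide's own equations are not available here, so the notation is an assumption: U is the set of objects, P a feature subset, Q the decision feature(s), U/P and U/Q the corresponding fuzzy equivalence classes, F one class in U/P, and X one class in U/Q.

```latex
% Fuzzy lower approximation of a decision class X with respect to P
\mu_{\underline{P}X}(x) = \sup_{F \in U/P} \min\Bigl( \mu_F(x),\ \inf_{y \in U} \max\bigl\{ 1 - \mu_F(y),\ \mu_X(y) \bigr\} \Bigr)

% Fuzzy positive region
\mu_{POS_P(Q)}(x) = \sup_{X \in U/Q} \mu_{\underline{P}X}(x)

% Evaluation (dependency) function
\gamma'_P(Q) = \frac{\sum_{x \in U} \mu_{POS_P(Q)}(x)}{|U|}
```

Under these definitions the evaluation function lies in [0, 1], and in the crisp case it reduces to the fraction of objects that fall in the positive region.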

10 Evaluation
- Is the γ metric a useful gauger of feature significance?
- γ metric compared with leading feature rankers:
  - Information Gain, Gain Ratio, Chi², Relief, OneR
- Applied to test data:
  - 30 random feature values for 400 objects
  - 2 or 3 features used to determine classification
- Task: locate those features that affect the decision

11 Evaluation…
- Results for x*y*z² > 0.125
- Results for (x + y)³ < 0.125
- FR, IG and GR perform best
- FR metric locates the most important features
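A synthetic test set of the kind described for this evaluation (400 objects, 30 random feature values, class decided by x, y and z) could be generated roughly as follows. Several details are assumptions not stated in the slides: values are drawn uniformly from [0, 1), x, y and z are taken to be the first three features, and the function name and seed are illustrative.

```python
import random


def make_dataset(n_objects=400, n_features=30, seed=0):
    """Build random feature vectors and label each with the rule x*y*z^2 > 0.125."""
    rng = random.Random(seed)
    data = [[rng.random() for _ in range(n_features)] for _ in range(n_objects)]
    # Assumption: x, y, z are features 0, 1 and 2; the other 27 are irrelevant noise
    labels = [int(row[0] * row[1] * row[2] ** 2 > 0.125) for row in data]
    return data, labels


data, labels = make_dataset()
```

A feature ranker does well on this task if features 0, 1 and 2 come out on top; swapping the rule for `(row[0] + row[1]) ** 3 < 0.125` gives the second, two-feature test.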

12 FDT Experiments
- Fuzzy ID3 (F-ID3) compared with Fuzzy-Rough ID3 (FR-ID3)
- Only difference between methods is the choice of feature significance measure
- Datasets used taken from the machine learning repository
- Data split into two equal halves: training and testing
- Resulting trees converted to equivalent rulesets

13 Results
- Real-valued data
- Average ruleset size
  - 56.7 for F-ID3
  - 88.6 for FR-ID3
- F-ID3 performs marginally better than FR-ID3

14 Results…
- Crisp data
- Average ruleset size
  - 30.2 for F-ID3
  - 28.8 for FR-ID3
- FR-ID3 performs marginally better than F-ID3

15 Conclusion
- Decision trees are a popular means of classification
- The selection of branching attributes is key to resulting tree quality
- The use of a fuzzy-rough metric for this purpose looks promising
- Future work
  - Further experimental evaluation
  - Fuzzy-rough feature reduction pre-processor

