
1 Chapter 4: Classification: Basic Concepts, Decision Trees & Model Evaluation. Part 1: Classification with Decision Trees

2 Classification: Definition

3 Example of Classification Task

4 General Approach for Building Classification Model

5 Classification Techniques

6 Example of Decision Tree

7 Another Example of Decision Tree

8 Decision Tree Classification Task

9 Apply Model to Test Data
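The slide carries only the title, so here is a minimal sketch of what applying the model means operationally: start at the root, apply each node's test condition to the record's attribute value, follow the matching branch, and stop at a leaf. The node layout and the example attributes (Refund, Taxable Income) are illustrative assumptions, not taken from the deck.

    # Minimal sketch (illustrative, not from the slides): classify one test record
    # by walking the decision tree from the root down to a leaf.
    def classify(node, record):
        while "label" not in node:                 # internal node: apply its test condition
            outcome = node["test"](record[node["attribute"]])
            node = node["children"][outcome]       # follow the branch for that outcome
        return node["label"]                       # leaf node: return its class label

    # Hypothetical tree: test Refund first, then a binary split on Taxable Income.
    tree = {
        "attribute": "Refund",
        "test": lambda v: v,                       # branch directly on the attribute value
        "children": {
            "Yes": {"label": "No"},
            "No": {
                "attribute": "Taxable Income",
                "test": lambda v: v > 80,          # continuous attribute: income > 80K?
                "children": {True: {"label": "Yes"}, False: {"label": "No"}},
            },
        },
    }

    print(classify(tree, {"Refund": "No", "Taxable Income": 95}))   # -> Yes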

10 Decision Tree Classification Task

11 Decision Tree Induction

12 General Structure of Hunt’s Algorithm

13 Hunt’s Algorithm
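Since the slide body is not transcribed, here is a compact sketch of Hunt’s recursive scheme: a pure or empty set of records becomes a leaf, otherwise the records are partitioned by a test condition and the procedure recurses on each subset. The helper choose_test (which would pick the attribute test, e.g. by Gini or information gain) is a placeholder assumption, not something defined in the deck.

    # Sketch of Hunt's algorithm. choose_test(records) is assumed to return
    # (attribute_name, partition_fn) for the selected test condition.
    from collections import Counter

    def hunt(records, parent_majority=None):
        if not records:                                # empty partition: use parent's majority class
            return {"label": parent_majority}
        labels = [r["class"] for r in records]
        if len(set(labels)) == 1:                      # pure node: make it a leaf
            return {"label": labels[0]}
        majority = Counter(labels).most_common(1)[0][0]
        attribute, partition = choose_test(records)    # placeholder: select the best split
        groups = partition(records)                    # dict: outcome -> subset of records
        if len(groups) <= 1:                           # no attribute separates the records
            return {"label": majority}
        return {"attribute": attribute,
                "children": {outcome: hunt(subset, majority)
                             for outcome, subset in groups.items()}}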

14 Design Issues of Decision Tree Induction

15 Methods for Expressing Test Conditions

16 Test Condition for Nominal Attributes

17 Test Condition for Ordinal Attributes

18 Test Condition for Continuous Attributes

19 Splitting Based on Continuous Attributes

20 How to Determine the Best Split / 1

21 How to Determine the Best Split / 2

22 Measures of Node Impurity

23 Finding the Best Split / 1

24 Finding the Best Split / 2

25 Measure of Impurity: GINI

26 Computing GINI Index of a Single Node
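For reference alongside this and the previous slide title, the standard definition for a node t with class proportions p(i | t) is:

    Gini(t) = 1 - \sum_{i} \big[ p(i \mid t) \big]^2

For two classes it is maximal (0.5) when the node is evenly mixed, e.g. class counts (3, 3) give 1 - 0.5^2 - 0.5^2 = 0.5, and it is 0 for a pure node such as counts (0, 6).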

27 Computing GINI Index for a Collection of Nodes
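A candidate split is usually scored by the size-weighted average of its children’s Gini values: with n records at the parent and n_k at child k,

    Gini_{split} = \sum_{k=1}^{K} \frac{n_k}{n} \, Gini(k)

and the split with the smallest Gini_split (equivalently, the largest reduction relative to the parent) is preferred.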

28 Binary Attributes: Computing GINI Index

29 Categorical Attributes: Computing GINI Index

30 Continuous Attributes: Computing GINI Index / 1

31 Continuous Attributes: Computing GINI Index / 2
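The efficient procedure behind these two slides is normally: sort the records on the continuous attribute once, then sweep the candidate thresholds (midpoints between adjacent distinct values) while updating class counts incrementally instead of recounting for every threshold. A small sketch, with illustrative variable names:

    # Sketch: best binary split "value <= threshold" for one continuous attribute.
    from collections import Counter

    def best_gini_split(values, labels):
        def gini(counts, total):
            return 1.0 - sum((c / total) ** 2 for c in counts.values())

        data = sorted(zip(values, labels))             # sort once on the attribute
        left, right = Counter(), Counter(l for _, l in data)
        n, best = len(data), (None, float("inf"))
        for i in range(1, n):
            v_prev, l_prev = data[i - 1]
            left[l_prev] += 1                          # move one record from right to left
            right[l_prev] -= 1
            v_cur = data[i][0]
            if v_cur == v_prev:                        # thresholds only between distinct values
                continue
            threshold = (v_prev + v_cur) / 2.0
            weighted = (i / n) * gini(left, i) + ((n - i) / n) * gini(right, n - i)
            if weighted < best[1]:
                best = (threshold, weighted)
        return best

    # e.g. best_gini_split([60, 70, 75, 85, 90, 95, 100, 120, 125, 220],
    #                      ["No"]*3 + ["Yes"]*3 + ["No"]*4)  ->  (97.5, 0.3)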

32 Measure of Impurity: Entropy

33 Computing Entropy of a Single Node
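Again for reference, the standard definition (with 0 log 0 taken as 0):

    Entropy(t) = -\sum_{i} p(i \mid t) \log_2 p(i \mid t)

It ranges from 0 for a pure node to log2(n_c) when the n_c classes are equally represented; for example, class counts (1, 5) give -(1/6) log2(1/6) - (5/6) log2(5/6) ≈ 0.65.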

34 Computing Information Gain After Splitting
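The gain of a candidate split is the drop in entropy from the parent to the size-weighted children:

    Gain_{split} = Entropy(parent) - \sum_{k=1}^{K} \frac{n_k}{n} \, Entropy(k)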

35 Problems with Information Gain

36 Gain Ratio
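The problem flagged on the previous slide is that information gain favours splits into many small, pure partitions (an ID-like attribute gets maximal gain but generalizes badly). Gain ratio, used by C4.5, normalizes the gain by the entropy of the partitioning itself:

    GainRatio_{split} = \frac{Gain_{split}}{SplitInfo},
    \qquad SplitInfo = -\sum_{k=1}^{K} \frac{n_k}{n} \log_2 \frac{n_k}{n}

so splits into many evenly sized partitions are penalized.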

37 Measure of Impurity: Classification Error

38 Computing Error of a Single Node
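Standard definition:

    Error(t) = 1 - \max_{i} \, p(i \mid t)

For example, class counts (3, 7) give 1 - 7/10 = 0.3; a pure node has error 0 and an evenly mixed two-class node has the maximum error of 0.5.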

39 Comparison among Impurity Measures (for binary, two-class classification problems)

40 Misclassification Error vs Gini Index
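A textbook-style illustration (the numbers are illustrative, not transcribed from the slide) of why Gini is often preferred as a splitting criterion: take a parent with class counts (7, 3), so Error = 0.3 and Gini = 1 - 0.7^2 - 0.3^2 = 0.42, and split it into children with counts (3, 0) and (4, 3). The weighted error is (3/10)(0) + (7/10)(3/7) = 0.3, identical to the parent, so misclassification error sees no improvement; the weighted Gini is (3/10)(0) + (7/10)(1 - (4/7)^2 - (3/7)^2) ≈ 0.343, down from 0.42, so Gini still rewards the split.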

41 Example: C4.5. Simple depth-first construction; uses information gain; sorts continuous attributes at each node. Needs the entire data set to fit in memory, so it is unsuitable for large data sets (that would require out-of-core sorting). You can download the software from:

42 Scalable Decision Tree Induction / 1. How scalable is decision tree induction? The basic algorithm is particularly suited to small data sets. SLIQ (EDBT’96, Mehta et al.) builds an index for each attribute; only the class list and the current attribute list reside in memory.

43 Scalable Decision Tree Induction / 2: SLIQ

Sample data for the class buys_computer:

    RID  Credit_rating  Age  Buys_computer
    1    excellent      38   yes
    2    excellent      26   yes
    3    fair           35   no
    4    excellent      49   no

Disk-resident attribute lists:

    Credit_rating  RID
    excellent      1
    excellent      2
    excellent      4
    fair           3

    Age  RID
    …    …

Memory-resident class list:

    RID  Buys_computer  node
    1    yes            …
    2    yes            …
    3    no             …
    4    no             …

44 Decision Tree Based Classification

Advantages:
- Inexpensive to construct
- Extremely fast at classifying unknown records
- Easy to interpret for small-sized trees
- Accuracy is comparable to other classification techniques for many data sets

Practical issues of classification:
- Underfitting and overfitting
- Missing values
- Costs of classification

