Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 CSCI 3202: Introduction to AI Decision Trees Greg Grudic (Notes borrowed from Thomas G. Dietterich and Tom Mitchell) Intro AIDecision Trees.

Similar presentations


Presentation on theme: "1 CSCI 3202: Introduction to AI Decision Trees Greg Grudic (Notes borrowed from Thomas G. Dietterich and Tom Mitchell) Intro AIDecision Trees."— Presentation transcript:

1 1 CSCI 3202: Introduction to AI Decision Trees Greg Grudic (Notes borrowed from Thomas G. Dietterich and Tom Mitchell) Intro AIDecision Trees

2 2 Outline Decision Tree Representations –ID3 and C4.5 learning algorithms (Quinlan 1986) –CART learning algorithm (Breiman et al. 1985) Entropy, Information Gain Overfitting Intro AIDecision Trees

3 3 Training Data Example: Goal is to Predict When This Player Will Play Tennis? Intro AIDecision Trees

4 4Intro AIDecision Trees

5 5Intro AIDecision Trees

6 6Intro AIDecision Trees

7 7Intro AIDecision Trees

8 8 Learning Algorithm for Decision Trees What happens if features are not binary? What about regression? Intro AIDecision Trees

9 9 Choosing the Best Attribute Intro AIDecision Trees - Many different frameworks for choosing BEST have been proposed! - We will look at Entropy Gain. Number + and – examples before and after a split. A1 and A2 are “attributes” (i.e. features or inputs).

10 10 Entropy Intro AIDecision Trees

11 11Intro AIDecision Trees Entropy is like a measure of purity…

12 12 Entropy Intro AIDecision Trees

13 13 Information Gain Intro AIDecision Trees

14 14 Training Example Intro AIDecision Trees

15 15 Selecting the Next Attribute Intro AIDecision Trees

16 16Intro AIDecision Trees

17 17 Non-Boolean Features Features with multiple discrete values –Multi-way splits –Test for one value versus the rest –Group values into disjoint sets Real-valued features –Use thresholds Regression –Splits based on mean squared error metric Intro AIDecision Trees

18 18 Hypothesis Space Search Intro AIDecision Trees You do not get the globally optimal tree! - Search space is exponential.

19 19 Overfitting Intro AIDecision Trees

20 20 Overfitting in Decision Trees Intro AIDecision Trees

21 21 Validation Data is Used to Control Overfitting Prune tree to reduce error on validation set Intro AIDecision Trees

22 Quiz 7 1.What is overfitting? 2.Are decision trees universal approximators? 3.What do decision tree boundaries look like? 4.Why are decision trees not globally optimal? 5.Do Support Vector Machines give the globally optimal solution (given the SVN optimization criteria and a specific kernel)? 6.Does the K Nearest Neighbor algorithm give the globally optimal solution if K is allowed to be as large as the number of training examples (N), N fold cross validation is used, and the distance metric is fixed? Intro AIDecision Trees22


Download ppt "1 CSCI 3202: Introduction to AI Decision Trees Greg Grudic (Notes borrowed from Thomas G. Dietterich and Tom Mitchell) Intro AIDecision Trees."

Similar presentations


Ads by Google