
1 Lecture 05: Decision Trees
CS480/680: Intro to ML, 09/25/18, Yao-Liang Yu

2 Trees Recalled

3 Example: EnjoySport?
[Figure: a decision tree. The root splits on Outlook (Sunny / Overcast / Rain); the Sunny branch splits on Humidity (High → No, Normal → Yes); Overcast predicts Yes; the Rain branch splits on Wind (Strong → No, Weak → Yes).]
Decision trees can represent any boolean function.

4 Classification And Regression Tree (CART)
[Figure: a tree partitioning the training data.]
Each decision node splits on a variable. Each leaf stores a prediction: for classification, the majority label (or the label probabilities); for regression, the average of the responses.
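The leaf predictions described above can be sketched in a few lines; the function names are illustrative, not from the slides.

```python
from collections import Counter

def classification_leaf(labels):
    """At a classification leaf: store the majority label
    and the empirical label probabilities."""
    counts = Counter(labels)
    n = len(labels)
    majority = counts.most_common(1)[0][0]
    probs = {label: c / n for label, c in counts.items()}
    return majority, probs

def regression_leaf(responses):
    """At a regression leaf: store the average response."""
    return sum(responses) / len(responses)
```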

5 Learning a Decision Tree
Which variable to split on at each stage? What threshold to use? When to stop? What to put at the leaves?
Set up a cost/objective; exact optimization is NP-hard, so grow the tree greedily. Regularize via early stopping or pruning. What goes at the leaves depends on the task: regression, classification, or other.

6 Algorithm
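A minimal sketch of the greedy growing algorithm, under the following assumptions: the tree is a nested dict, and `best_split(X, y)` is a hypothetical user-supplied search that returns `(feature, threshold, left_indices, right_indices)`, or `None` when no split reduces the cost.

```python
def grow_tree(X, y, best_split, depth=0, max_depth=3, min_samples=2):
    """Greedily grow a tree: stop at max depth, small nodes, or pure
    labels; otherwise split and recurse on the two children."""
    if depth >= max_depth or len(y) < min_samples or len(set(y)) == 1:
        return {"leaf": max(set(y), key=y.count)}       # majority label
    split = best_split(X, y)
    if split is None:                                   # no useful split
        return {"leaf": max(set(y), key=y.count)}
    j, t, left, right = split
    return {
        "feature": j,
        "threshold": t,
        "left": grow_tree([X[i] for i in left], [y[i] for i in left],
                          best_split, depth + 1, max_depth, min_samples),
        "right": grow_tree([X[i] for i in right], [y[i] for i in right],
                           best_split, depth + 1, max_depth, min_samples),
    }
```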

7-15 Growing a tree
[Figure sequence over slides 7-15: the (x, y) training data is partitioned by one more greedy split per slide. Splitting is based on x; evaluation can be based on y.]

16 Which and How
Greedily choose a variable to split on, then greedily choose a threshold T_j; each candidate split partitions the training data into left and right, scored by the cost/loss. This assumes features are ordinal; for categorical features, simply try each value. What should T_j be? Any observed value of the feature is a candidate.
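The greedy search above can be sketched as follows, assuming ordinal features; `cost` is any node impurity (e.g. entropy, Gini, or squared error), and the return format `(feature, threshold, left_indices, right_indices)` is an illustrative choice, not prescribed by the slides.

```python
def best_split(X, y, cost):
    """Try every feature j and every observed value as threshold T_j;
    keep the split minimizing the size-weighted child cost. Returns
    None when no split beats leaving the node intact."""
    n, d = len(X), len(X[0])
    best = None
    best_cost = cost(y)                    # cost of not splitting
    for j in range(d):
        for t in sorted({x[j] for x in X}):
            left = [i for i in range(n) if X[i][j] <= t]
            right = [i for i in range(n) if X[i][j] > t]
            if not left or not right:      # degenerate split
                continue
            c = (len(left) * cost([y[i] for i in left]) +
                 len(right) * cost([y[i] for i in right])) / n
            if c < best_cost:
                best, best_cost = (j, t, left, right), c
    return best
```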

17 Stopping criterion
Maximum depth exceeded
Maximum running time exceeded
All children nodes are sufficiently homogeneous
All children nodes have too few training examples
Reduction in cost is small
Cross-validation

18 Regression cost
Least-squares cost of a node D: the sum over i in D of (y_i - ȳ)², where ȳ, the average of the y_i in D, is stored at the leaf. Can of course use a loss other than least-squares, and can also fit any regression model on D.
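The least-squares node cost above is direct to compute; a minimal sketch:

```python
def regression_cost(ys):
    """Least-squares cost of a node: sum of squared deviations from
    the mean, which is the value stored at the leaf."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)
```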

19 Classification cost
With p_k the fraction of class k in the node (prediction: majority vote):
Misclassification error: 1 - max_k p_k
Entropy: -Σ_k p_k log p_k
Gini index: Σ_k p_k (1 - p_k)
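The three impurities take the class proportions of a node and are sketched below (base-2 log for the entropy, with 0 log 0 taken as 0):

```python
from math import log2

def misclassification(ps):
    """1 - max_k p_k"""
    return 1 - max(ps)

def entropy(ps):
    """-sum_k p_k log2 p_k, with 0 log 0 = 0"""
    return -sum(p * log2(p) for p in ps if p > 0)

def gini(ps):
    """sum_k p_k (1 - p_k)"""
    return sum(p * (1 - p) for p in ps)
```

All three vanish on a pure node and peak at the uniform distribution; entropy and Gini are smooth, which is why they are usually preferred for greedy splitting.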

20 Comparison

21 Example

22 Type or Patrons? A better feature to split on leads to children that are nearly all positive or nearly all negative.
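This preference can be quantified as information gain: parent entropy minus the size-weighted entropy of the children. Below is a sketch on restaurant-style counts (12 examples, half positive; the exact counts are illustrative assumptions): splitting on Patrons gives nearly pure children, while splitting on Type leaves every child at 50/50 and gains nothing.

```python
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

def information_gain(labels, groups):
    """Parent entropy minus size-weighted child entropy;
    groups partition the parent's labels."""
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups)

parent = ["+"] * 6 + ["-"] * 6                     # 12 examples, balanced

# Patrons = None / Some / Full: children are nearly pure
patrons = [["-"] * 2, ["+"] * 4, ["+"] * 2 + ["-"] * 4]
# Type: every child stays 50/50
types = [["+", "-"], ["+", "-"],
         ["+", "+", "-", "-"], ["+", "+", "-", "-"]]
```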

23 Result

24 Pruning
Early stopping can be myopic. Instead, grow a full tree and then prune it bottom-up.
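Bottom-up pruning can be sketched with held-out validation data: collapse a subtree into a leaf whenever that does not increase validation error. The nested-dict tree format and helper names are illustrative assumptions, and the candidate leaf label is taken as the majority over the subtree's leaf labels (a simplification; normally the training majority at the node would be stored).

```python
def predict(node, x):
    while "leaf" not in node:
        node = (node["left"] if x[node["feature"]] <= node["threshold"]
                else node["right"])
    return node["leaf"]

def leaf_labels(node):
    if "leaf" in node:
        return [node["leaf"]]
    return leaf_labels(node["left"]) + leaf_labels(node["right"])

def prune(node, X_val, y_val):
    """Reduced-error pruning (sketch): prune children first, then
    replace this subtree by a leaf if validation error does not grow."""
    if "leaf" in node:
        return node
    j, t = node["feature"], node["threshold"]
    left = [i for i in range(len(X_val)) if X_val[i][j] <= t]
    right = [i for i in range(len(X_val)) if X_val[i][j] > t]
    node["left"] = prune(node["left"],
                         [X_val[i] for i in left], [y_val[i] for i in left])
    node["right"] = prune(node["right"],
                          [X_val[i] for i in right], [y_val[i] for i in right])
    labels = leaf_labels(node)
    label = max(sorted(set(labels)), key=labels.count)   # candidate leaf
    leaf_err = sum(label != yv for yv in y_val)
    tree_err = sum(predict(node, xv) != yv for xv, yv in zip(X_val, y_val))
    return {"leaf": label} if leaf_err <= tree_err else node
```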

25 Decision Stump
[Figure: a stump on an eye feature: low → No, high → Yes.]
A binary tree of depth 1: classifies based on a single feature. Easy to train and interpretable, but underfits.
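Training a stump amounts to scanning one feature's observed values for the best threshold and orientation; a minimal sketch with ±1 labels (the function names are illustrative):

```python
def train_stump(xs, ys):
    """Fit a depth-1 tree on one feature: choose the threshold and
    orientation minimizing training errors. Labels are +1 / -1."""
    best = None
    for t in sorted(set(xs)):
        for sign in (+1, -1):
            err = sum((sign if x > t else -sign) != y_
                      for x, y_ in zip(xs, ys))
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best[1], best[2]

def stump_predict(t, sign, x):
    return sign if x > t else -sign
```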

26 Questions?

