

Slide 1: Non-Metric Methods: Decision Trees
Alexandros Potamianos, Dept. of ECE, Technical University of Crete, Fall

Slide 2: Decision Trees — Motivation
- Some discrete features have no obvious notion of similarity or ordering (nominal data), e.g., book type, shape, sound type.
- Taxonomies (i.e., trees with is-a relationships) are the oldest form of classification.

Slide 3: Decision Trees: Definition
- Decision trees are classifiers that classify samples based on a set of questions asked hierarchically (a tree of questions).
- Example questions: Is color red? Is x < 0.5?
- Terminology: root, leaf, node, arc, branch, parent, children, branching factor, depth.

Slide 4: Fruit classifier
[Figure: decision tree for fruit classification. Root question Color? with branches green, yellow, red; lower nodes ask Size? (big / med / small), Shape? (round / thin), and Taste? (sweet / sour).]

Slide 5: Fruit classification
[Figure: the same tree, highlighting the path followed for one sample; answering the questions along the path leads to the leaf CHERRY.]


Slide 9: Fruit classifier
[Figure: the complete tree with labeled leaves; the visible leaf labels are watermelon, grape, grapefruit, and cherry.]
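The tree in the figure can be sketched as a nested-dict structure in Python. This is a minimal illustration, not the slide's exact tree: only the leaf labels are recoverable from the figure, so the assignment of leaves to branches below is an assumption.

```python
# Each internal node is {feature: {answer: subtree}}; leaves are fruit names.
# Leaf placement is assumed for illustration; only the labels (watermelon,
# grape, grapefruit, cherry) are visible on the slide.
FRUIT_TREE = {
    "color": {
        "green":  {"size": {"big": "watermelon", "med": "grape"}},
        "yellow": {"shape": {"round": "grapefruit", "thin": "grape"}},
        "red":    {"size": {"small": {"taste": {"sweet": "cherry",
                                                "sour": "grape"}}}},
    }
}

def classify(tree, sample):
    """Descend the tree, answering each node's question from the sample."""
    while isinstance(tree, dict):
        (feature, branches), = tree.items()   # the question asked at this node
        tree = branches[sample[feature]]      # follow the matching branch
    return tree

print(classify(FRUIT_TREE, {"color": "red", "size": "small", "taste": "sweet"}))
# cherry
```

This mirrors the traversal shown on slides 5–9: each answered question selects one branch until a leaf (class label) is reached.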

Slide 10: Binary Trees
- Binary trees: each parent node has exactly two child nodes (branching factor = 2).
- Any tree can be represented as a binary tree by changing the set of questions and increasing the tree depth.
- e.g., the three-way question Color? (green / yellow / red) becomes a chain of binary questions: Color = green? (Y/N); on N, Color = yellow? (Y/N).
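The conversion on this slide can be sketched as a small transformation; `binarize` below is a hypothetical helper (not part of any standard library) that turns one multiway split into a chain of yes/no nodes.

```python
def binarize(feature, branches):
    """Turn a multiway split {answer: subtree} into a chain of binary nodes
    (question, yes_subtree, no_subtree), one extra level per extra answer."""
    answers = list(branches)               # e.g. ["green", "yellow", "red"]
    node = branches[answers[-1]]           # last answer becomes the final 'no'
    for a in reversed(answers[:-1]):
        node = (f"{feature} = {a}?", branches[a], node)
    return node

# Color? {green, yellow, red} becomes two stacked binary questions:
print(binarize("color", {"green": "G", "yellow": "Y", "red": "R"}))
# ('color = green?', 'G', ('color = yellow?', 'Y', 'R'))
```

A branching factor of k becomes k − 1 stacked binary questions, which is exactly the depth increase the slide mentions.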

Slide 11: Decision Trees: Design Problems
1. List of questions (features): all possible questions are considered.
2. Which question to split on first (best split): the question that splits the data best (reduces impurity the most at each node) is asked first.
3. Stopping criterion (pruning criterion): stop when further splits do not reduce impurity.

Slide 12: Best Split Example
A two-class problem with 100 examples from each of w1 and w2, and three binary questions Q1, Q2, Q3 that split the data (counts of (w1, w2) examples) as follows:
1. Q1 — Node 1: (50, 50), Node 2: (50, 50)
2. Q2 — Node 1: (100, 0), Node 2: (0, 100)
3. Q3 — Node 1: (80, 0), Node 2: (20, 100)

Slide 13: Impurity Measures
Impurity measures the degree of homogeneity of a node; a node is pure if it consists of training examples from a single class.
Impurity measures:
- Entropy impurity: i(N) = -Σ_i P(w_i) log2 P(w_i)
- Variance impurity (two-class): i(N) = P(w_1) P(w_2)
- Gini impurity: i(N) = 1 - Σ_i P(w_i)^2
- Misclassification impurity: i(N) = 1 - max_i P(w_i)
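A minimal sketch of the four measures, each taking the list of class probabilities P(w_i) at a node:

```python
from math import log2

def entropy(probs):
    """Entropy impurity: i(N) = -sum_i P(w_i) log2 P(w_i)."""
    return sum(-p * log2(p) for p in probs if p > 0)

def gini(probs):
    """Gini impurity: i(N) = 1 - sum_i P(w_i)^2."""
    return 1 - sum(p * p for p in probs)

def variance(probs):
    """Variance impurity (two-class only): i(N) = P(w_1) P(w_2)."""
    p1, p2 = probs
    return p1 * p2

def misclassification(probs):
    """Misclassification impurity: i(N) = 1 - max_i P(w_i)."""
    return 1 - max(probs)

# A 50/50 node is maximally impure; a single-class node is pure.
print(entropy([0.5, 0.5]), gini([0.5, 0.5]))   # 1.0 0.5
print(entropy([1.0, 0.0]), gini([1.0, 0.0]))   # 0.0 0.0
```

All four measures are zero for a pure node and maximal for a uniform class distribution, which is what makes them interchangeable candidates for the "best split" criterion of slide 11.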

Slide 14: Total Impurity
- Total impurity at depth 0: i(depth = 0) = i(N)
- Total impurity at depth 1: i(depth = 1) = P(N_L) i(N_L) + P(N_R) i(N_R)
[Figure: node N at depth 0 splits (yes/no) into children N_L and N_R at depth 1.]

Slide 15: Impurity Example
Split Q3: Node 1: (80, 0), Node 2: (20, 100)
- i(node 1) = 0
- i(node 2) = -(20/120) log2(20/120) - (100/120) log2(100/120) ≈ 0.65
- P(node 1) = 80/200 = 0.4; P(node 2) = 120/200 = 0.6
- i(total) = P(node 1) i(node 1) + P(node 2) i(node 2) = 0.4 · 0 + 0.6 · 0.65 ≈ 0.39
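The slide's arithmetic for split Q3 can be checked directly with the entropy impurity:

```python
from math import log2

def entropy(probs):
    # Entropy impurity i(N) = -sum_i P(w_i) log2 P(w_i)
    return sum(-p * log2(p) for p in probs if p > 0)

# Split Q3: Node 1 holds (80, 0) samples, Node 2 holds (20, 100); 200 in total.
i1 = entropy([80/80, 0/80])          # pure node -> 0
i2 = entropy([20/120, 100/120])      # mixed node
p1, p2 = 80/200, 120/200             # fraction of samples reaching each node

total = p1 * i1 + p2 * i2
print(round(i2, 2), round(total, 2))  # 0.65 0.39
```

The same weighted sum applied to Q1 gives 1.0 (no reduction from the root) and to Q2 gives 0, so Q2 is the best split of slide 12, with Q3 second.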

Slide 16: Continuous Example
For continuous features, questions are threshold questions of the type x < θ (e.g., is x < 0.5?).
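One common way to choose such a threshold, sketched here with the entropy criterion; the candidate-midpoint scheme is an assumption for illustration, not stated on the slide.

```python
from math import log2

def entropy(labels):
    """Entropy impurity of a set of class labels."""
    n = len(labels)
    counts = {l: labels.count(l) for l in set(labels)}
    return sum(-(c / n) * log2(c / n) for c in counts.values())

def best_threshold(xs, ys):
    """Try midpoints between consecutive sorted x values; return the
    threshold theta minimizing the weighted impurity of the split x < theta."""
    pairs = sorted(zip(xs, ys))
    best_score, best_theta = float("inf"), None
    for i in range(1, len(pairs)):
        theta = (pairs[i - 1][0] + pairs[i][0]) / 2
        left  = [y for x, y in pairs if x <  theta]
        right = [y for x, y in pairs if x >= theta]
        score = (len(left) * entropy(left) +
                 len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best_score, best_theta = score, theta
    return best_theta

print(best_threshold([0.1, 0.2, 0.8, 0.9], ["a", "a", "b", "b"]))  # 0.5
```

Because each question compares a single feature to a threshold, the resulting splits are parallel to the coordinate axes.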

Slide 17: Summary
- Decision trees are useful classification tools, especially for nominal (non-metric) data.
- CART creates trees that minimize impurity on the training set at each node.
- Decision region shape: single-feature splits produce axis-aligned (rectangular) decision regions.
- CART is a useful tool for feature selection.
