
1 Decision Trees Prof. Carolina Ruiz Dept. of Computer Science WPI

2 Constructing a decision tree
Which attribute to use as the root node? That is, which attribute to check first when making a prediction?
Pick the attribute that brings us closer to a decision, that is, the attribute that splits the data more homogeneously.

3 Which attribute splits the data more homogeneously?
Goal: assign a single number to each attribute that represents how well it "splits" the dataset according to the target attribute (target values: low, moderate, high).
Class counts under each candidate split:
credit history: bad [0,1,3], unknown [2,1,2], good [3,1,1]
debt: low [3,3,2], high [2,1,4]
collateral: none [3,2,6], adequate [2,1,0]
income: 0-15 [0,0,4], 15-35 [0,2,2], >35 [5,1,0]

4 For example … what function f should we use?
f([0,1,3],[2,1,2],[3,1,1]) = a number
Possible f functions:
Gini index: a measure of impurity
Entropy: from information theory
Misclassification error: the metric used by OneR
(Here [0,1,3], [2,1,2], [3,1,1] are the class counts of the credit history branches bad, unknown, good.)
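For concreteness, here is a minimal Python sketch (not part of the original slides) showing how each of the three candidate f functions could be computed for a single branch; the helper names gini, entropy, and misclassification are my own:

```python
from math import log2

def gini(counts):
    """Gini index of a branch: 1 - sum of squared class proportions (0 = pure)."""
    m = sum(counts)
    return 1.0 - sum((c / m) ** 2 for c in counts)

def entropy(counts):
    """Entropy of a branch: -(c/m)*log2(c/m) summed over the non-empty classes (0 = pure)."""
    m = sum(counts)
    return -sum((c / m) * log2(c / m) for c in counts if c > 0)

def misclassification(counts):
    """Misclassification error of a branch: 1 - proportion of the majority class (0 = pure)."""
    m = sum(counts)
    return 1.0 - max(counts) / m

# the "unknown" branch of credit history holds the class counts [2, 1, 2]
for f in (gini, entropy, misclassification):
    print(f.__name__, round(f([2, 1, 2]), 3))
```

All three measures are 0 for a pure branch and largest when the classes are evenly mixed, which is why any of them can serve as f.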

5 Using entropy as the f metric
f([0,1,3],[2,1,2],[3,1,1]) = Entropy([0,1,3],[2,1,2],[3,1,1])
= (4/14)*Entropy([0,1,3]) + (5/14)*Entropy([2,1,2]) + (5/14)*Entropy([3,1,1])
= (4/14)*[ -0 - (1/4)log2(1/4) - (3/4)log2(3/4) ]
+ (5/14)*[ -(2/5)log2(2/5) - (1/5)log2(1/5) - (2/5)log2(2/5) ]
+ (5/14)*[ -(3/5)log2(3/5) - (1/5)log2(1/5) - (1/5)log2(1/5) ]
= 1.265
In general: Entropy([p,q,…,z]) = -(p/m)log2(p/m) - (q/m)log2(q/m) - … - (z/m)log2(z/m), where m = p+q+…+z (a class with count 0 contributes 0).
(The branches [0,1,3], [2,1,2], [3,1,1] are credit history = bad, unknown, good.)
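The slide's calculation can be reproduced with a short sketch (my own illustration, assuming the same branch counts):

```python
from math import log2

def entropy(counts):
    """Entropy([p, q, ..., z]) = -sum (c/m) * log2(c/m), with m = p + q + ... + z."""
    m = sum(counts)
    return -sum((c / m) * log2(c / m) for c in counts if c > 0)

branches = [[0, 1, 3], [2, 1, 2], [3, 1, 1]]          # credit history = bad, unknown, good
n = sum(sum(b) for b in branches)                     # 14 instances in total
f = sum(sum(b) / n * entropy(b) for b in branches)    # weight each branch by its size
print(round(f, 3))                                    # 1.265, as on the slide
```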

6 Which attribute splits the data more homogeneously?
Entropy of each candidate split:
credit history: bad [0,1,3], unknown [2,1,2], good [3,1,1] -> 1.265
debt: low [3,3,2], high [2,1,4] -> 1.467
collateral: none [3,2,6], adequate [2,1,0] -> 1.324
income: 0-15 [0,0,4], 15-35 [0,2,2], >35 [5,1,0] -> 0.564
The attribute with the lowest entropy is chosen: income.
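A hedged sketch of the whole comparison (my own code, not from the lecture). Note that the debt value comes out a few thousandths away from the slide's 1.467, most likely because of how the class counts were transcribed; the other three values match:

```python
from math import log2

def entropy(counts):
    m = sum(counts)
    return -sum((c / m) * log2(c / m) for c in counts if c > 0) if m else 0.0

def split_entropy(branches):
    """Weighted average entropy of the branches produced by splitting on one attribute."""
    n = sum(sum(b) for b in branches)
    return sum(sum(b) / n * entropy(b) for b in branches)

splits = {
    "credit history": [[0, 1, 3], [2, 1, 2], [3, 1, 1]],
    "debt":           [[3, 3, 2], [2, 1, 4]],
    "collateral":     [[3, 2, 6], [2, 1, 0]],
    "income":         [[0, 0, 4], [0, 2, 2], [5, 1, 0]],
}
for name, branches in splits.items():
    print(name, round(split_entropy(branches), 3))

print("root attribute:", min(splits, key=lambda a: split_entropy(splits[a])))   # income
```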

7 Constructing a decision tree
Which attribute to use as the root node? That is, which attribute to check first when making a prediction?
Pick the attribute that brings us closer to a decision, that is, the attribute that splits the data more homogeneously.

8 Constructing a decision tree
Root: income
income = 0-15 -> prediction: high
income = 15-35 -> ?
income = > 35 -> ?

9 Splitting the instances with income = 15-35
Class counts and entropy for each remaining attribute:
credit history: [0,0,1], [0,1,1], [0,1,0] -> entropy 0.5 (the attribute with the lowest entropy; its pure branches become leaves predicting high and moderate)
debt: [0,1,0], [0,1,2] -> entropy 0.688
collateral: [0,2,2], [0,0,0] -> entropy 1
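The same kind of weighted-entropy computation, applied only to the income = 15-35 subset, reproduces these numbers (0.688 appears as 0.689 here, a rounding difference). The code below is my own illustration:

```python
from math import log2

def entropy(counts):
    m = sum(counts)
    return -sum((c / m) * log2(c / m) for c in counts if c > 0) if m else 0.0

def split_entropy(branches):
    n = sum(sum(b) for b in branches)
    return sum(sum(b) / n * entropy(b) for b in branches)

# class counts of the 4 instances with income = 15-35, split by each remaining attribute
subsplits = {
    "credit history": [[0, 0, 1], [0, 1, 1], [0, 1, 0]],
    "debt":           [[0, 1, 0], [0, 1, 2]],
    "collateral":     [[0, 2, 2], [0, 0, 0]],
}
for name, branches in subsplits.items():
    print(name, round(split_entropy(branches), 3))   # 0.5, 0.689, 1.0
```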

10 Constructing a decision tree
Root: income
income = 0-15 -> prediction: high
income = 15-35 -> credit history …
income = > 35 -> …
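Finally, a minimal, hypothetical sketch of the full recursive procedure the slides walk through. Rows are assumed to be dicts mapping attribute names to values, and the names build_tree and split_entropy are mine, not the lecture's:

```python
from math import log2
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values()) if n else 0.0

def split_entropy(rows, labels, attr):
    """Weighted entropy of the partition induced by one attribute."""
    n = len(rows)
    total = 0.0
    for value in set(row[attr] for row in rows):
        branch = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        total += len(branch) / n * entropy(branch)
    return total

def build_tree(rows, labels, attrs):
    """Grow the tree top-down, always splitting on the attribute with the lowest entropy."""
    if len(set(labels)) == 1 or not attrs:              # pure node, or no attributes left
        return Counter(labels).most_common(1)[0][0]     # leaf: predict the majority class
    best = min(attrs, key=lambda a: split_entropy(rows, labels, a))
    subtree = {}
    for value in set(row[best] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        subtree[value] = build_tree([rows[i] for i in idx],
                                    [labels[i] for i in idx],
                                    [a for a in attrs if a != best])
    return (best, subtree)
```

Called on the 14 credit-risk instances with the attributes credit history, debt, collateral, and income, this procedure should place income at the root and credit history under the 15-35 branch, matching slides 6 through 10.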

