
1 Decision Tree Algorithm (C4.5)

2 Training Examples for PlayTennis (Mitchell 1997)

3 Decision Tree for PlayTennis (Mitchell 1997)

4 Decision Tree Algorithm (C4.5) The equations of the C4.5 algorithm are as follows. First, calculate Info(S) (the entropy) of the class distribution in the training set S: Info(S) = - Σ_{i=1..k} (freq(C_i, S)/|S|) log2 (freq(C_i, S)/|S|), where |S| is the number of cases in the training set, C_i is a class (i = 1, 2, ..., k), k is the number of classes, and freq(C_i, S) is the number of cases belonging to C_i. (For PlayTennis: 9 Yes, 5 No.)
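As a quick check of the definition above, here is a minimal Python sketch; the helper name `info` is an illustrative choice, not part of C4.5 itself:

```python
import math

def info(class_counts):
    """Entropy Info(S), given the number of cases in each class."""
    total = sum(class_counts)
    entropy = 0.0
    for n in class_counts:
        if n:  # treat 0 * log2(0) as 0
            p = n / total
            entropy -= p * math.log2(p)
    return entropy

# The PlayTennis training set has 9 Yes and 5 No cases:
print(round(info([9, 5]), 3))  # 0.94
```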

5 Decision Tree Algorithm (C4.5) Calculate the expected information value for feature X used to partition S: Info_X(S) = Σ_{i=1..L} (|S_i|/|S|) Info(S_i), where L is the number of outcomes of feature X, S_i is the subset of S corresponding to the i-th outcome, and |S_i| is the number of cases in subset S_i.

6 Decision Tree Algorithm (C4.5) Calculate the information gained by partitioning on feature X: Gain(X) = Info(S) - Info_X(S). Then calculate the partition information value obtained by splitting S into L subsets: SplitInfo(X) = - Σ_{i=1..L} (|S_i|/|S|) log2 (|S_i|/|S|).

7 Decision Tree Algorithm (C4.5) Calculate the gain ratio: GainRatio(X) = Gain(X) / SplitInfo(X). The gain ratio reduces the bias toward choosing attributes with many values. If two attributes tie for the best gain ratio, either one may be chosen at random.
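Putting slides 5-7 together, the whole score can be sketched in a few lines of Python. This assumes the `info` helper from the earlier sketch; the names `expected_info`, `split_info`, and `gain_ratio` are likewise just illustrative:

```python
def expected_info(subsets):
    """Info_X(S): weighted entropy of the subsets produced by splitting on X.
    Each subset is given as a list of per-class case counts."""
    total = sum(sum(s) for s in subsets)
    return sum(sum(s) / total * info(s) for s in subsets)

def split_info(subsets):
    """SplitInfo(X): entropy of the subset sizes themselves."""
    return info([sum(s) for s in subsets])

def gain_ratio(parent_counts, subsets):
    gain = info(parent_counts) - expected_info(subsets)
    denom = split_info(subsets)
    return gain / denom if denom else 0.0

# Outlook splits the 14 cases into Sunny (2+ 3-), Overcast (4+ 0-), Rain (3+ 2-):
print(round(gain_ratio([9, 5], [[2, 3], [4, 0], [3, 2]]), 3))  # ~0.156 (slide 12 rounds to 0.157)
```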

8 Decision Tree Algorithm (C4.5) Step 1: Decide which attribute to consider first: Outlook, Temperature, Humidity, or Wind? Use the entropy -P(+) log2 P(+) - P(-) log2 P(-). At the start we have 9 Yes and 5 No cases, so the starting entropy is -9/14 log2 9/14 - 5/14 log2 5/14 = 0.94.

9 Entropy (Info)

10 Decision Tree Algorithm (C4.5) Step 2: Compute the gain ratio for each attribute. Information gain = (entropy before the split) - (expected entropy after the split); we want the attribute that maximizes it. Candidate splits: Humidity (High / Normal) and Outlook (Sunny / Overcast / Rain).

11 Decision Tree Algorithm (C4.5) Step 2 (cont'd): Compute the gain ratio for each attribute. Remaining candidate splits: Wind (Strong / Weak) and Temperature (Hot / Mild / Cool).

12 Decision Tree Algorithm (C4.5) Step 2 (cont'd): Compute the gain ratio for Outlook (Sunny / Overcast / Rain).
E1 (Sunny, 2+ 3-) = -2/5 log2 2/5 - 3/5 log2 3/5 = 0.971
E2 (Overcast, 4+ 0-) = -4/4 log2 4/4 - 0 = 0
E3 (Rain, 3+ 2-) = -3/5 log2 3/5 - 2/5 log2 2/5 = 0.971
Info gain = 0.94 - 5/14*0.971 - 4/14*0 - 5/14*0.971 = 0.247
SplitInfo = -5/14 log2 5/14 - 4/14 log2 4/14 - 5/14 log2 5/14 = 1.577
Gain ratio = 0.247/1.577 = 0.157

13 Decision Tree Algorithm (C4.5) Step 2 (cont'd): Compute the gain ratio for Humidity (High / Normal).
E1 (High, 3+ 4-) = -3/7 log2 3/7 - 4/7 log2 4/7 = 0.985
E2 (Normal, 6+ 1-) = -6/7 log2 6/7 - 1/7 log2 1/7 = 0.592
Info gain = 0.94 - 7/14*0.985 - 7/14*0.592 = 0.151
SplitInfo = -7/14 log2 7/14 - 7/14 log2 7/14 = 1
Gain ratio = 0.151/1 = 0.151

14 Decision Tree Algorithm (C4.5) Step 2 (cont'd): Compute the gain ratio for Temperature (Hot / Mild / Cool).
E1 (Hot, 2+ 2-) = -2/4 log2 2/4 - 2/4 log2 2/4 = 1
E2 (Mild, 4+ 2-) = -4/6 log2 4/6 - 2/6 log2 2/6 = 0.918
E3 (Cool, 3+ 1-) = -3/4 log2 3/4 - 1/4 log2 1/4 = 0.811
Info gain = 0.94 - 4/14*1 - 6/14*0.918 - 4/14*0.811 = 0.029
SplitInfo = -4/14 log2 4/14 - 6/14 log2 6/14 - 4/14 log2 4/14 = 1.556
Gain ratio = 0.029/1.556 = 0.019

15 Decision Tree Algorithm (C4.5) Step 2 (cont'd): Compute the gain ratio for Wind (Strong / Weak).
E1 (Strong, 3+ 3-) = -3/6 log2 3/6 - 3/6 log2 3/6 = 1
E2 (Weak, 6+ 2-) = -6/8 log2 6/8 - 2/8 log2 2/8 = 0.811
Info gain = 0.94 - 6/14*1 - 8/14*0.811 = 0.048
SplitInfo = -6/14 log2 6/14 - 8/14 log2 8/14 = 0.985
Gain ratio = 0.048/0.985 = 0.049

16 Decision Tree Algorithm (C4.5) Step 2 (cont'd): Summary of gain ratios.
Gain ratio, Outlook = 0.157
Gain ratio, Humidity = 0.151
Gain ratio, Wind = 0.049
Gain ratio, Temperature = 0.019
So the root node is Outlook.
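Assuming the `gain_ratio` helper sketched earlier, the Step 2 summary can be reproduced directly from the per-value class counts read off slides 12-15:

```python
# (Yes, No) counts for each value of each attribute, over all 14 cases:
partitions = {
    "Outlook":     [[2, 3], [4, 0], [3, 2]],   # Sunny, Overcast, Rain
    "Humidity":    [[3, 4], [6, 1]],           # High, Normal
    "Wind":        [[3, 3], [6, 2]],           # Strong, Weak
    "Temperature": [[2, 2], [4, 2], [3, 1]],   # Hot, Mild, Cool
}
for name, subsets in partitions.items():
    print(name, round(gain_ratio([9, 5], subsets), 3))
# Outlook ~0.156, Humidity ~0.152, Wind ~0.049, Temperature ~0.019 -> Outlook is the root
```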

17 Decision Tree Algorithm (C4.5) Step 3: Decide the next attribute under the root node Outlook (Sunny / Overcast / Rain). We may choose Temperature, Humidity, or Wind under the Sunny and Rain branches. The Overcast branch needs no further split: all of its cases are Yes, so it becomes a Yes leaf.

18 Decision Tree Algorithm (C4.5) Step 3 (cont'd): Look under Outlook = Sunny (2+ 3-). We may choose Temperature, Humidity, or Wind under Outlook = Sunny. E(Outlook = Sunny) = -2/5 log2 2/5 - 3/5 log2 3/5 = 0.97

19 Decision Tree Algorithm (C4.5) Step 3 (cont'd): Humidity (High / Normal) under Outlook = Sunny.
E1 (High, 0+ 3-) = -0/3 log2 0/3 - 3/3 log2 3/3 = 0
E2 (Normal, 2+ 0-) = -2/2 log2 2/2 - 0/2 log2 0/2 = 0
Info gain = 0.971 - 3/5*0 - 2/5*0 = 0.971
SplitInfo = -3/5 log2 3/5 - 2/5 log2 2/5 = 0.971
Gain ratio = 0.971/0.971 = 1

20 Decision Tree Algorithm (C4.5) Step 3 (cont'd): Temperature (Hot / Mild / Cool) under Outlook = Sunny.
E1 (Hot, 0+ 2-) = -0/2 log2 0/2 - 2/2 log2 2/2 = 0
E2 (Mild, 1+ 1-) = -1/2 log2 1/2 - 1/2 log2 1/2 = 1
E3 (Cool, 1+ 0-) = -1/1 log2 1/1 - 0/1 log2 0/1 = 0
Info gain = 0.971 - 2/5*0 - 2/5*1 - 1/5*0 = 0.57
SplitInfo = -2/5 log2 2/5 - 2/5 log2 2/5 - 1/5 log2 1/5 = 1.522
Gain ratio = 0.57/1.522 = 0.375

21 Decision Tree Algorithm (C4.5) Step 3 (cont'd): Wind (Strong / Weak) under Outlook = Sunny.
E1 (Strong, 1+ 1-) = -1/2 log2 1/2 - 1/2 log2 1/2 = 1
E2 (Weak, 1+ 2-) = -1/3 log2 1/3 - 2/3 log2 2/3 = 0.918
Info gain = 0.97 - 2/5*1 - 3/5*0.918 = 0.0192
SplitInfo = -2/5 log2 2/5 - 3/5 log2 3/5 = 0.971
Gain ratio = 0.0192/0.971 = 0.02

22 Decision Tree Algorithm (C4.5) Step 3 (cont'd): Summary of gain ratios under Outlook = Sunny.
Gain ratio, Humidity: 1
Gain ratio, Temperature: 0.375
Gain ratio, Wind: 0.02
So choose Humidity. The partial tree so far: Outlook = Overcast leads to a Yes leaf (4 cases); Outlook = Sunny leads to Humidity, with Humidity = Normal a Yes leaf (2 cases) and Humidity = High a No leaf (3 cases); the Rain branch is still open.
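The same hypothetical `gain_ratio` helper, applied to the five Outlook = Sunny cases (2 Yes, 3 No) with the per-value counts from slides 19-21, reproduces the choice of Humidity:

```python
sunny_partitions = {
    "Humidity":    [[0, 3], [2, 0]],           # High, Normal
    "Temperature": [[0, 2], [1, 1], [1, 0]],   # Hot, Mild, Cool
    "Wind":        [[1, 1], [1, 2]],           # Strong, Weak
}
for name, subsets in sunny_partitions.items():
    print(name, round(gain_ratio([2, 3], subsets), 3))
# Humidity = 1.0, Temperature ~0.375, Wind ~0.02 -> split on Humidity under Sunny
```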

23 Decision Tree Algorithm (C4.5) Step 4: Look under Outlook = Rain (3+ 2-). We may choose Temperature, Humidity, or Wind under Rain. E(Rain) = -3/5 log2 3/5 - 2/5 log2 2/5 = 0.97
Humidity (High / Normal) under Outlook = Rain:
E1 (High, 1+ 1-) = -1/2 log2 1/2 - 1/2 log2 1/2 = 1
E2 (Normal, 2+ 1-) = -2/3 log2 2/3 - 1/3 log2 1/3 = 0.918
Info gain = 0.97 - 2/5*1 - 3/5*0.918 = 0.019
SplitInfo = -2/5 log2 2/5 - 3/5 log2 3/5 = 0.971
Gain ratio = 0.019/0.971 = 0.02

24 Decision Tree Algorithm (C4.5) Step 4 (cont'd): Temperature (Hot / Mild / Cool) under Outlook = Rain.
E1 (Hot, 0 cases) = 0
E2 (Mild, 2+ 1-) = -2/3 log2 2/3 - 1/3 log2 1/3 = 0.918
E3 (Cool, 1+ 1-) = -1/2 log2 1/2 - 1/2 log2 1/2 = 1
Info gain = 0.97 - 0/5*0 - 3/5*0.918 - 2/5*1 = 0.0192
SplitInfo = -3/5 log2 3/5 - 2/5 log2 2/5 = 0.971
Gain ratio = 0.0192/0.971 = 0.02

25 Decision Tree Algorithm (C4.5) Step 4 (cont'd): Wind (Strong / Weak) under Outlook = Rain.
E1 (Strong, 0+ 2-) = -0/2 log2 0/2 - 2/2 log2 2/2 = 0
E2 (Weak, 3+ 0-) = -3/3 log2 3/3 - 0/3 log2 0/3 = 0
Info gain = 0.971 - 2/5*0 - 3/5*0 = 0.971
SplitInfo = -2/5 log2 2/5 - 3/5 log2 3/5 = 0.971
Gain ratio = 0.971/0.971 = 1

26 Decision Tree Algorithm (C4.5) Step 4 (cont'd): Summary of gain ratios under Outlook = Rain.
Gain ratio, Wind = 1
Gain ratio, Humidity = 0.02
Gain ratio, Temperature = 0.02
So choose Wind. The finished tree: Outlook = Sunny leads to Humidity (High gives No, Normal gives Yes); Outlook = Overcast gives Yes; Outlook = Rain leads to Wind (Strong gives No, Weak gives Yes).
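One convenient way to hold the finished tree is a nested dictionary. This is only an illustrative sketch (the `classify` helper is not part of C4.5), but it encodes exactly the tree stated above:

```python
tree = {
    "Outlook": {
        "Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain":     {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}

def classify(node, case):
    """Follow the branches matching the case until a leaf (a plain string) is reached."""
    while isinstance(node, dict):
        attribute, branches = next(iter(node.items()))
        node = branches[case[attribute]]
    return node

print(classify(tree, {"Outlook": "Rain", "Wind": "Strong"}))  # No
```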

27 Decision Tree Algorithm (C4.5) Additional situation: a continuous-valued attribute (Temperature).
Outlook  | Temperature | Humidity | Windy | Play
sunny    | 85          | High     | false | no
sunny    | 80          | High     | true  | no
overcast | 83          | High     | false | yes
rainy    | 70          | High     | false | yes
rainy    | 68          | Normal   | false | yes
rainy    | 65          | Normal   | true  | no
overcast | 64          | Normal   | true  | yes
sunny    | 72          | High     | false | no
sunny    | 69          | Normal   | false | yes
rainy    | 75          | Normal   | false | yes
sunny    | 75          | Normal   | true  | yes
overcast | 72          | High     | true  | yes
overcast | 81          | Normal   | false | yes
rainy    | 71          | High     | true  | no

28 Decision Tree Algorithm (C4.5) Additional situation: a continuous-valued attribute. Step 1: Sort the cases by the continuous value (Temperature) and map each value to its target class: Y N Y Y Y N N Y N Y Y N Y Y

29 Decision Tree Algorithm (C4.5) Step 2: Decide the cut points. Scanning the sorted class sequence Y N Y Y Y N N Y N Y Y N Y Y, a candidate cut point is placed wherever the class changes (Y to N or N to Y), and each cut point splits the cases into two subsets E1 and E2. The best candidate is then compared against the other, non-continuous attributes. Example, cut point at 64.5:
E1 (<= 64.5, 1+ 0-) = -1/1 log2 1/1 - 0/1 log2 0/1 = 0
E2 (> 64.5, 8+ 5-) = -8/13 log2 8/13 - 5/13 log2 5/13 = 0.961
Info gain = 0.94 - 1/14*0 - 13/14*0.961 = 0.048

30 Decision Tree Algorithm (C4.5) Candidate cut points and the splits they produce (a gain ratio is computed for each):
Cut point 64.5: E1 (1+ 0-), E2 (8+ 5-)
Cut point 66.5: E1 (1+ 1-), E2 (8+ 4-)
Cut point 70.5: E1 (4+ 1-), E2 (5+ 4-)
Cut point 71.5: E1 (4+ 2-), E2 (5+ 3-)

31 Decision Tree Algorithm (C4.5) Candidate cut points (continued):
Cut point 73.5: E1 (5+ 3-), E2 (4+ 2-)
Cut point 77.5: E1 (7+ 3-), E2 (2+ 2-)
Cut point 80.5: E1 (7+ 4-), E2 (2+ 1-)

32 Decision Tree Algorithm (C4.5) We then look for the maximum gain ratio among the candidates.
Cut point 84.0: E1 (9+ 4-), E2 (0+ 1-)
The maximum gain ratio is 0.38, obtained at the cut point 84.0.
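The threshold search can be sketched as follows, using the Temperature values and Play labels from the table on slide 27. Note this simplified sketch scores every midpoint between distinct adjacent values rather than only the class-change positions used on the slides, so the exact numbers may differ slightly:

```python
import math

def info(counts):
    total = sum(counts)
    return -sum(n / total * math.log2(n / total) for n in counts if n)

temperature = [85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71]
play        = ["no", "no", "yes", "yes", "yes", "no", "yes",
               "no", "yes", "yes", "yes", "yes", "yes", "no"]

pairs = sorted(zip(temperature, play))
n = len(pairs)
base = info([play.count("yes"), play.count("no")])

best = None
for i in range(n - 1):
    if pairs[i][0] == pairs[i + 1][0]:
        continue  # no threshold fits between identical values
    cut = (pairs[i][0] + pairs[i + 1][0]) / 2
    left  = [label for value, label in pairs if value <= cut]
    right = [label for value, label in pairs if value > cut]
    gain = base \
        - len(left) / n * info([left.count("yes"), left.count("no")]) \
        - len(right) / n * info([right.count("yes"), right.count("no")])
    ratio = gain / info([len(left), len(right)])
    if best is None or ratio > best[0]:
        best = (ratio, cut)

print(best)  # (best gain ratio, corresponding cut point)
```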

33 Decision Tree Algorithm (C4.5) Parameter: minimum cases. How large should the minimum-cases value be? If the training set contains fewer than about 1000 cases, 2 is the recommended value. Changing the minimum-cases value changes the tree structure, the rule length, and the number of rules.

34 Decision Tree Algorithm (C4.5) Parameter: minimum cases. To avoid over-fitting, a split is created only if a specified threshold (e.g., the minimum number of cases required for a split search) is met. This is the so-called minimum-cases parameter.

35 Decision Tree Algorithm (C4.5) If the minimum-cases parameter is set to 6: in the tree grown from the 14 cases, the Humidity node under Outlook = Sunny covers only 5 cases (Normal gives YES, High gives No), which is below the threshold, so that split is not made and the Sunny branch of Outlook ends in a single No leaf instead.

36 Parameter: prune confidence level, U_CF(E, N), where E is the number of errors and N is the number of training instances (e.g., U_0.25(0, 6) = 0.206, the estimated error rate, so the expected number of errors is 6*0.206 = 1.236). The estimated error is used to decide whether the tree built in the growth phase should be pruned at certain nodes. The true probability of error cannot be determined exactly; instead, there is a probability distribution that is generally summarized by a pair of confidence limits (the binomial distribution). C4.5 simply equates the estimated error rate at a leaf with this upper limit, on the argument that the tree has been constructed to minimize the observed error rate.

37 Decision Tree Algorithm (C4.5) Parameter: prune confidence level U_CF(E, N), where E is the number of errors and N is the number of training instances (e.g., U_0.25(0, 6) = 0.206 and the expected number of errors is 6*0.206 = 1.236).
Example: under the root node, node 1 leads down to node 6, whose leaves cover 6, 9, and 1 cases. The estimated errors of those leaves add up to 6*0.206 + 9*0.143 + 1*0.750 = 3.273. If the expected error of node 6 treated as a single leaf is 2.63, that is less than 3.273, so the subtree at node 6 is pruned; the corresponding estimate at node 1 is 4.21.
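U_CF(E, N) can be computed from the binomial distribution: it is the largest error rate p for which observing E or fewer errors among N cases still has probability at least CF. A minimal sketch using bisection follows; the helper names are illustrative, not the actual C4.5 source:

```python
from math import comb

def binom_cdf(e, n, p):
    """P(X <= e) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(e + 1))

def ucf(e, n, cf=0.25):
    """Upper confidence limit on the error rate: largest p with P(X <= e) >= cf."""
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection
        mid = (lo + hi) / 2
        if binom_cdf(e, n, mid) >= cf:
            lo = mid
        else:
            hi = mid
    return lo

print(round(ucf(0, 6), 3))  # ~0.206 -> a 6-case error-free leaf contributes 6 * 0.206 = 1.236 errors
print(round(ucf(0, 9), 3))  # ~0.143
print(round(ucf(0, 1), 3))  # ~0.75
```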

38 Decision Tree Algorithm (C4.5) Future research: multiple cut points for continuous values.

