
1 Iterative Dichotomiser 3 By Christopher Archibald

2 Decision Trees A decision tree is a tree whose branching nodes each represent a choice between two or more alternatives. Decision node: a node at which a choice is made. Leaf node: the result reached at that point of the tree.

3 Decision Trees Will it rain? If it is sunny, it will not rain. If it is cloudy, it will rain. If it is partially cloudy, the answer depends on whether or not it is humid.

4 ID3 Invented by J. Ross Quinlan. Employs a top-down greedy search through the space of possible decision trees: at each step it selects the attribute that is most useful for classifying the examples, i.e. the attribute with the highest information gain.
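A minimal Python sketch of that top-down greedy recursion (my own skeleton, not Quinlan's code; the information_gain helper it calls is sketched after slide 8 below):

```python
# A sketch of ID3's top-down greedy loop. Assumes each example is a dict
# mapping attribute names to values; information_gain is defined later.
def id3(examples, attributes, target):
    labels = {row[target] for row in examples}
    if len(labels) == 1:               # all examples agree: make a leaf
        return labels.pop()
    if not attributes:                 # nothing left to split on: majority leaf
        return max(labels, key=lambda v: sum(r[target] == v for r in examples))
    # Greedy step: split on the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    tree = {best: {}}
    for value in {row[best] for row in examples}:
        subset = [r for r in examples if r[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(subset, remaining, target)
    return tree
```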

5 Entropy Entropy tells us how well an attribute will separate the given examples according to the target classification. Entropy(S) = -P_pos log2(P_pos) - P_neg log2(P_neg), where P_pos is the proportion of positive examples in S and P_neg is the proportion of negative examples.
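A minimal sketch of this formula in Python (the function name and count-based arguments are my own choices, not the deck's):

```python
import math

# Two-class entropy from the slide's formula:
# Entropy(S) = -P_pos*log2(P_pos) - P_neg*log2(P_neg)
def entropy(n_pos, n_neg):
    total = n_pos + n_neg
    result = 0.0
    for count in (n_pos, n_neg):
        p = count / total
        if p > 0:                      # 0 * log2(0) is taken as 0 by convention
            result -= p * math.log2(p)
    return result
```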

6 Entropy Example Example: if S is a collection of 15 examples with 10 YES and 5 NO, then: Entropy(S) = -(10/15) log2(10/15) - (5/15) log2(5/15) = 0.918. On a calculator whose log key is base 10, use the change-of-base formula and enter -((10/15) log(10/15))/log(2) - ((5/15) log(5/15))/log(2), because log is base 10 and you need base 2.
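As a quick sanity check (my own snippet, not from the slides), the base-10 change-of-base form and the direct base-2 form give the same value:

```python
import math

# Change-of-base arithmetic, as entered on a base-10 calculator:
e_base10 = (-((10/15) * math.log10(10/15)) / math.log10(2)
            - ((5/15) * math.log10(5/15)) / math.log10(2))

# Direct base-2 computation:
e_base2 = -(10/15) * math.log2(10/15) - (5/15) * math.log2(5/15)

print(round(e_base10, 3), round(e_base2, 3))  # 0.918 0.918
```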

7 Information Gain Measures the expected reduction in entropy: the higher the information gain, the greater the expected reduction in entropy. The equation for information gain is: Gain(S, A) = Entropy(S) - Σ_(v ∈ Values(A)) (|S_v| / |S|) Entropy(S_v)

8 Information Gain A is an attribute of collection S. S_v = the subset of S for which attribute A has value v. |S_v| = the number of elements in S_v. |S| = the number of elements in S.
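A sketch of this computation in Python, reusing the entropy helper above (the list-of-dicts representation and the "Yes"/"No" class labels are my own choices):

```python
# Gain(S, A) = Entropy(S) - sum over v of (|S_v|/|S|) * Entropy(S_v)
def information_gain(examples, attribute, target):
    def class_counts(rows):
        pos = sum(1 for r in rows if r[target] == "Yes")
        return pos, len(rows) - pos

    total_entropy = entropy(*class_counts(examples))
    remainder = 0.0
    for value in {r[attribute] for r in examples}:
        subset = [r for r in examples if r[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(*class_counts(subset))
    return total_entropy - remainder
```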

9 Example

Video          | Contains Car | Contains Violence | Rally Cars | Races
GTA 4          | Yes          | Yes               | No         | No
Doom           | No           | Yes               | No         | No
GTA 3          | Yes          | Yes               | No         | No
Halo 3         | Yes          | Yes               | No         | No
Need for Speed | Yes          | No                | No         | Yes
Rally Sport    | Yes          | No                | Yes        | No

(The target classification is Contains Violence; the other three columns are the candidate attributes.)
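The same table as Python data, for use with the helpers above (column names follow the table; treating Contains Violence as the target is an inference from the gain computations on the later slides):

```python
videos = [
    {"Video": "GTA 4",          "Contains Car": "Yes", "Contains Violence": "Yes", "Rally Cars": "No",  "Races": "No"},
    {"Video": "Doom",           "Contains Car": "No",  "Contains Violence": "Yes", "Rally Cars": "No",  "Races": "No"},
    {"Video": "GTA 3",          "Contains Car": "Yes", "Contains Violence": "Yes", "Rally Cars": "No",  "Races": "No"},
    {"Video": "Halo 3",         "Contains Car": "Yes", "Contains Violence": "Yes", "Rally Cars": "No",  "Races": "No"},
    {"Video": "Need for Speed", "Contains Car": "Yes", "Contains Violence": "No",  "Rally Cars": "No",  "Races": "Yes"},
    {"Video": "Rally Sport",    "Contains Car": "Yes", "Contains Violence": "No",  "Rally Cars": "Yes", "Races": "No"},
]
```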

10 Example (cont) Entropy(S) = -P_pos log2(P_pos) - P_neg log2(P_neg). Entropy(4Y, 2N) = -(4/6) log2(4/6) - (2/6) log2(2/6) = 0.91829. Now that we know the entropy of S, we're going to use that value to find the information gain of each attribute.

11 Example (the slide repeats the table from slide 9)

12 Example (cont) For attribute Contains Car: S = [4Y, 2N]. S_Yes = [3Y, 2N], E(S_Yes) = 0.97095. S_No = [1Y, 0N], E(S_No) = 0. Gain(S, Contains Car) = 0.91829 - [(5/6) * 0.97095 + (1/6) * 0] = 0.10916
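Checking slide 12's arithmetic with the helpers above (the same call verifies slides 14 and 16 by swapping in "Rally Cars" or "Races"):

```python
gain_cars = information_gain(videos, "Contains Car", "Contains Violence")
print(round(gain_cars, 5))  # 0.10917; the slide's 0.10916 comes from rounding intermediates
```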

13 Example (the slide repeats the table from slide 9)

14 Example (cont) For attribute Rally Cars: S = [4Y, 2N]. S_Yes = [0Y, 1N], E(S_Yes) = 0. S_No = [4Y, 1N], E(S_No) = 0.7219. Gain(S, Rally Cars) = 0.91829 - [(1/6) * 0 + (5/6) * 0.7219] = 0.3167

15 (the slide repeats the table from slide 9)

16 Example (cont) For attribute Races: S = [4Y, 2N]. S_Yes = [0Y, 1N], E(S_Yes) = 0. S_No = [4Y, 1N], E(S_No) = 0.7219. Gain(S, Races) = 0.91829 - [(1/6) * 0 + (5/6) * 0.7219] = 0.3167

17 Example (cont) Gain(S, Contains Car) = 0.10916. Gain(S, Rally Cars) = 0.3167. Gain(S, Races) = 0.3167. Rally Cars and Races tie for the highest information gain, so ID3 would pick one of them as the root attribute.
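Putting the pieces together, a sketch that recomputes all three gains and picks the root attribute (the tie between Rally Cars and Races is broken arbitrarily here; the deck does not say which one to prefer):

```python
attributes = ["Contains Car", "Rally Cars", "Races"]
gains = {a: information_gain(videos, a, "Contains Violence") for a in attributes}
for name, g in gains.items():
    print(f"Gain(S, {name}) = {g:.5f}")
# Gain(S, Contains Car) = 0.10917
# Gain(S, Rally Cars) = 0.31669
# Gain(S, Races) = 0.31669

root = max(gains, key=gains.get)   # ties broken by iteration order
print("Root attribute:", root)     # Rally Cars
```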

18 Sources Dr. Lee's slides, San Jose State University, Spring 2008. http://www.cise.ufl.edu/~ddd/cap6635/Fall-97/Short-papers/2.htm http://decisiontrees.net/node/27

