
1 Iterative Dichotomiser 3 By Christopher Archibald

2 Decision Trees A decision tree is a tree whose branching nodes each represent a choice between two or more alternatives. Decision node: a node at which a choice is made. Leaf node: the result reached at that point of the tree.

3 Decision Trees Will it rain? If it is sunny, it will not rain. If it is cloudy, it will rain. If it is partially cloudy, the answer depends on whether or not it is humid.

4 ID3 Invented by J. Ross Quinlan. Employs a top-down greedy search through the space of possible decision trees: at each step it selects the attribute that is most useful for classifying the examples, i.e. the attribute with the highest information gain.
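A minimal Python sketch of that top-down greedy recursion (my own skeleton, not Quinlan's code; the information_gain helper it calls is sketched after slide 8 below):

```python
# A sketch of ID3's top-down greedy loop. Assumes each example is a dict
# mapping attribute names to values; information_gain is defined later.
def id3(examples, attributes, target):
    labels = {row[target] for row in examples}
    if len(labels) == 1:               # all examples agree: make a leaf
        return labels.pop()
    if not attributes:                 # nothing left to split on: majority leaf
        return max(labels, key=lambda v: sum(r[target] == v for r in examples))
    # Greedy step: split on the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    tree = {best: {}}
    for value in {row[best] for row in examples}:
        subset = [r for r in examples if r[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(subset, remaining, target)
    return tree
```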

5 Entropy Entropy tells us how well an attribute will separate the given examples according to the target classification. Entropy(S) = -P_pos log2(P_pos) - P_neg log2(P_neg), where P_pos is the proportion of positive examples in S and P_neg is the proportion of negative examples.
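A minimal sketch of this formula in Python (the function name and count-based arguments are my own choices, not the deck's):

```python
import math

# Two-class entropy from the slide's formula:
# Entropy(S) = -P_pos*log2(P_pos) - P_neg*log2(P_neg)
def entropy(n_pos, n_neg):
    total = n_pos + n_neg
    result = 0.0
    for count in (n_pos, n_neg):
        p = count / total
        if p > 0:                      # 0 * log2(0) is taken as 0 by convention
            result -= p * math.log2(p)
    return result
```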

6 Entropy Example Example: if S is a collection of 15 examples with 10 YES and 5 NO, then: Entropy(S) = -(10/15) log2(10/15) - (5/15) log2(5/15) = 0.918. On a calculator whose log key is base 10, use the change-of-base formula and enter -((10/15) log(10/15))/log(2) - ((5/15) log(5/15))/log(2), because log is base 10 and you need base 2.
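As a quick sanity check (my own snippet, not from the slides), the base-10 change-of-base form and the direct base-2 form give the same value:

```python
import math

# Change-of-base arithmetic, as entered on a base-10 calculator:
e_base10 = (-((10/15) * math.log10(10/15)) / math.log10(2)
            - ((5/15) * math.log10(5/15)) / math.log10(2))

# Direct base-2 computation:
e_base2 = -(10/15) * math.log2(10/15) - (5/15) * math.log2(5/15)

print(round(e_base10, 3), round(e_base2, 3))  # 0.918 0.918
```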

7 Information Gain Measures the expected reduction in entropy: the higher the information gain, the greater the expected reduction in entropy. The equation for information gain is: Gain(S, A) = Entropy(S) - Σ_(v ∈ Values(A)) (|S_v| / |S|) Entropy(S_v)

8 Information Gain A is an attribute of collection S. S_v = the subset of S for which attribute A has value v. |S_v| = the number of elements in S_v. |S| = the number of elements in S.
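A sketch of this computation in Python, reusing the entropy helper above (the list-of-dicts representation and the "Yes"/"No" class labels are my own choices):

```python
# Gain(S, A) = Entropy(S) - sum over v of (|S_v|/|S|) * Entropy(S_v)
def information_gain(examples, attribute, target):
    def class_counts(rows):
        pos = sum(1 for r in rows if r[target] == "Yes")
        return pos, len(rows) - pos

    total_entropy = entropy(*class_counts(examples))
    remainder = 0.0
    for value in {r[attribute] for r in examples}:
        subset = [r for r in examples if r[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(*class_counts(subset))
    return total_entropy - remainder
```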

9 Example

Video          | Contains Car | Contains Violence | Rally Cars | Races
GTA 4          | Yes          | Yes               | No         | No
Doom           | No           | Yes               | No         | No
GTA 3          | Yes          | Yes               | No         | No
Halo 3         | Yes          | Yes               | No         | No
Need for Speed | Yes          | No                | No         | Yes
Rally Sport    | Yes          | No                | Yes        | No

(The target classification is Contains Violence; the other three columns are the candidate attributes.)
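The same table as Python data, for use with the helpers above (column names follow the table; treating Contains Violence as the target is an inference from the gain computations on the later slides):

```python
videos = [
    {"Video": "GTA 4",          "Contains Car": "Yes", "Contains Violence": "Yes", "Rally Cars": "No",  "Races": "No"},
    {"Video": "Doom",           "Contains Car": "No",  "Contains Violence": "Yes", "Rally Cars": "No",  "Races": "No"},
    {"Video": "GTA 3",          "Contains Car": "Yes", "Contains Violence": "Yes", "Rally Cars": "No",  "Races": "No"},
    {"Video": "Halo 3",         "Contains Car": "Yes", "Contains Violence": "Yes", "Rally Cars": "No",  "Races": "No"},
    {"Video": "Need for Speed", "Contains Car": "Yes", "Contains Violence": "No",  "Rally Cars": "No",  "Races": "Yes"},
    {"Video": "Rally Sport",    "Contains Car": "Yes", "Contains Violence": "No",  "Rally Cars": "Yes", "Races": "No"},
]
```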

10 Example (cont) Entropy(S) = -P_pos log2(P_pos) - P_neg log2(P_neg). Entropy(4Y, 2N) = -(4/6) log2(4/6) - (2/6) log2(2/6) = 0.91829. Now that we know the entropy of S, we're going to use that value to find the information gain of each attribute.

11 Example (the slide repeats the table from slide 9)

12 Example (cont) For attribute Contains Car: S = [4Y, 2N]. S_Yes = [3Y, 2N], E(S_Yes) = 0.97095. S_No = [1Y, 0N], E(S_No) = 0. Gain(S, Contains Car) = 0.91829 - [(5/6) * 0.97095 + (1/6) * 0] = 0.10916
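Checking slide 12's arithmetic with the helpers above (the same call verifies slides 14 and 16 by swapping in "Rally Cars" or "Races"):

```python
gain_cars = information_gain(videos, "Contains Car", "Contains Violence")
print(round(gain_cars, 5))  # 0.10917; the slide's 0.10916 comes from rounding intermediates
```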

13 Example (the slide repeats the table from slide 9)

14 Example (cont) For attribute Rally Cars: S = [4Y, 2N]. S_Yes = [0Y, 1N], E(S_Yes) = 0. S_No = [4Y, 1N], E(S_No) = 0.7219. Gain(S, Rally Cars) = 0.91829 - [(1/6) * 0 + (5/6) * 0.7219] = 0.3167

15 (the slide repeats the table from slide 9)

16 Example (cont) For attribute Races: S = [4Y, 2N]. S_Yes = [0Y, 1N], E(S_Yes) = 0. S_No = [4Y, 1N], E(S_No) = 0.7219. Gain(S, Races) = 0.91829 - [(1/6) * 0 + (5/6) * 0.7219] = 0.3167

17 Example (cont) Gain(S, Contains Car) = 0.10916. Gain(S, Rally Cars) = 0.3167. Gain(S, Races) = 0.3167. Rally Cars and Races tie for the highest information gain, so ID3 would pick one of them as the root attribute.
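Putting the pieces together, a sketch that recomputes all three gains and picks the root attribute (the tie between Rally Cars and Races is broken arbitrarily here; the deck does not say which one to prefer):

```python
attributes = ["Contains Car", "Rally Cars", "Races"]
gains = {a: information_gain(videos, a, "Contains Violence") for a in attributes}
for name, g in gains.items():
    print(f"Gain(S, {name}) = {g:.5f}")
# Gain(S, Contains Car) = 0.10917
# Gain(S, Rally Cars) = 0.31669
# Gain(S, Races) = 0.31669

root = max(gains, key=gains.get)   # ties broken by iteration order
print("Root attribute:", root)     # Rally Cars
```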

18 Sources Dr. Lee's slides, San Jose State University, Spring 2008. http://www.cise.ufl.edu/~ddd/cap6635/Fall-97/Short-papers/2.htm http://decisiontrees.net/node/27

