Iterative Dichotomiser 3 (ID3) Algorithm

Medha Pradhan, CS 157B, Spring 2007

Agenda
- Basics of decision trees
- Introduction to ID3
- Entropy and information gain
- Two worked examples

Basics
What is a decision tree? A tree in which each branching (decision) node represents a choice between two or more alternatives, and every branching node lies on a path to a leaf node.
- Decision node: specifies a test of some attribute.
- Leaf node: indicates the classification of an example.

ID3
Invented by J. Ross Quinlan. ID3 performs a top-down greedy search through the space of possible decision trees: greedy because it never backtracks, committing at each node to the locally best split. At each step it selects the attribute that is most useful for classifying the current examples, i.e., the attribute with the highest information gain.
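As a minimal sketch (not Quinlan's original pseudocode), the recursion can be written in a few lines of Python. It assumes an info_gain helper like the one sketched on the Information Gain slide below; examples is a list of attribute-to-value dicts and target names the decision attribute.

```python
from collections import Counter

def id3(examples, attributes, target):
    """Grow a decision tree top-down, greedily splitting on the
    attribute with the highest information gain at each node."""
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:          # all examples in one class -> leaf
        return labels[0]
    if not attributes:                 # no tests left -> majority-class leaf
        return Counter(labels).most_common(1)[0][0]
    # Greedy choice; ID3 never revisits (backtracks over) this decision.
    best = max(attributes, key=lambda a: info_gain(examples, a, target))
    tree = {best: {}}
    for value in {ex[best] for ex in examples}:
        subset = [ex for ex in examples if ex[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(subset, remaining, target)
    return tree
```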

Entropy
Entropy measures the impurity of an arbitrary collection of examples. For a collection S containing positive and negative examples:

    Entropy(S) = -p+ log2(p+) - p- log2(p-)

where p+ is the proportion of positive examples and p- is the proportion of negative examples. Entropy(S) = 0 if all members of S belong to the same class, and Entropy(S) = 1 (the maximum) when S contains equal numbers of positive and negative examples.
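A minimal sketch of the two-class formula in Python (the name entropy2 is illustrative, not from the slides):

```python
import math

def entropy2(pos, neg):
    """Entropy of a collection with pos positive and neg negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count > 0:                  # 0 * log2(0) is taken to be 0
            p = count / total
            result -= p * math.log2(p)
    return result

print(entropy2(3, 3))  # 1.0 -> evenly split, maximum impurity
print(entropy2(6, 0))  # 0.0 -> all one class, pure
```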

Information Gain
Information gain measures the expected reduction in entropy from partitioning the examples on an attribute; the higher the gain, the greater the expected reduction in entropy:

    Gain(S, A) = Entropy(S) - Σ_{v ∈ Values(A)} (|Sv| / |S|) * Entropy(Sv)

where Values(A) is the set of all possible values for attribute A, and Sv is the subset of S for which attribute A has value v.
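A sketch of Gain(S, A) in Python over lists of example dicts; the entropy helper here is the multi-class generalization of the two-class formula above, and all names are illustrative:

```python
import math
from collections import Counter

def entropy(examples, target):
    """Entropy of a list of example dicts with respect to `target`."""
    counts = Counter(ex[target] for ex in examples)
    total = len(examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def info_gain(examples, attr, target):
    """Expected reduction in entropy from partitioning `examples` on `attr`."""
    total = len(examples)
    remainder = 0.0
    for v in {ex[attr] for ex in examples}:               # Values(A)
        sv = [ex for ex in examples if ex[attr] == v]     # the subset Sv
        remainder += (len(sv) / total) * entropy(sv, target)
    return entropy(examples, target) - remainder
```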

Example 1
Sample training data to determine whether an animal lays eggs. Warm-blooded, Feathers, Fur, and Swims are the independent (condition) attributes; Lays Eggs is the dependent (decision) attribute.

Animal     Warm-blooded  Feathers  Fur  Swims  Lays Eggs
Ostrich    Yes           Yes       No   No     Yes
Crocodile  No            No        No   Yes    Yes
Raven      Yes           Yes       No   No     Yes
Albatross  Yes           Yes       No   No     Yes
Dolphin    Yes           No        No   Yes    No
Koala      Yes           No        Yes  No     No
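For use with the sketches above, the table can be encoded as a list of dicts (a hypothetical representation, not part of the original slides):

```python
animals = [
    {"Animal": "Ostrich",   "Warm-blooded": "Yes", "Feathers": "Yes", "Fur": "No",  "Swims": "No",  "Lays Eggs": "Yes"},
    {"Animal": "Crocodile", "Warm-blooded": "No",  "Feathers": "No",  "Fur": "No",  "Swims": "Yes", "Lays Eggs": "Yes"},
    {"Animal": "Raven",     "Warm-blooded": "Yes", "Feathers": "Yes", "Fur": "No",  "Swims": "No",  "Lays Eggs": "Yes"},
    {"Animal": "Albatross", "Warm-blooded": "Yes", "Feathers": "Yes", "Fur": "No",  "Swims": "No",  "Lays Eggs": "Yes"},
    {"Animal": "Dolphin",   "Warm-blooded": "Yes", "Feathers": "No",  "Fur": "No",  "Swims": "Yes", "Lays Eggs": "No"},
    {"Animal": "Koala",     "Warm-blooded": "Yes", "Feathers": "No",  "Fur": "Yes", "Swims": "No",  "Lays Eggs": "No"},
]
```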

The collection S contains 4 positive and 2 negative examples (4Y, 2N):

    Entropy(S) = -(4/6) log2(4/6) - (2/6) log2(2/6) = 0.91829

Now we compute the information gain for each of the four attributes: Warm-blooded, Feathers, Fur, and Swims.
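Using the entropy helper and the animals list sketched earlier, this value can be reproduced:

```python
print(entropy(animals, "Lays Eggs"))  # about 0.91830
```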

For attribute 'Warm-blooded': Values(Warm-blooded) = [Yes, No], S = [4Y, 2N]
    SYes = [3Y, 2N], E(SYes) = 0.97095
    SNo = [1Y, 0N], E(SNo) = 0 (all members belong to the same class)
    Gain(S, Warm-blooded) = 0.91829 - [(5/6)*0.97095 + (1/6)*0] = 0.10916

For attribute 'Feathers': Values(Feathers) = [Yes, No]
    SYes = [3Y, 0N], E(SYes) = 0
    SNo = [1Y, 2N], E(SNo) = 0.91829
    Gain(S, Feathers) = 0.91829 - [(3/6)*0 + (3/6)*0.91829] = 0.45914

For attribute 'Fur': Values(Fur) = [Yes, No], S = [4Y, 2N]
    SYes = [0Y, 1N], E(SYes) = 0
    SNo = [4Y, 1N], E(SNo) = 0.7219
    Gain(S, Fur) = 0.91829 - [(1/6)*0 + (5/6)*0.7219] = 0.3167

For attribute 'Swims': Values(Swims) = [Yes, No]
    SYes = [1Y, 1N], E(SYes) = 1 (equal members in both classes)
    SNo = [3Y, 1N], E(SNo) = 0.81127
    Gain(S, Swims) = 0.91829 - [(2/6)*1 + (4/6)*0.81127] = 0.04411

Summary of gains:
    Gain(S, Warm-blooded) = 0.10916
    Gain(S, Feathers)     = 0.45914
    Gain(S, Fur)          = 0.31670
    Gain(S, Swims)        = 0.04411

Gain(S, Feathers) is the maximum, so Feathers becomes the root node. Its 'Y' branch contains only positive examples and becomes a leaf node with classification 'Lays Eggs':

    Feathers
     |-- Y: [Ostrich, Raven, Albatross] -> Lays Eggs
     |-- N: [Crocodile, Dolphin, Koala] -> ?
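The four gains can be reproduced with the info_gain sketch and the animals list from the earlier slides:

```python
for attr in ["Warm-blooded", "Feathers", "Fur", "Swims"]:
    print(attr, round(info_gain(animals, attr, "Lays Eggs"), 5))
# Feathers comes out highest (about 0.459), so it is chosen as the root.
```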

We now repeat the procedure on the 'N' branch, S = [Crocodile, Dolphin, Koala], i.e., S = [1Y, 2N]:

Animal     Warm-blooded  Feathers  Fur  Swims  Lays Eggs
Crocodile  No            No        No   Yes    Yes
Dolphin    Yes           No        No   Yes    No
Koala      Yes           No        Yes  No     No

    Entropy(S) = -(1/3) log2(1/3) - (2/3) log2(2/3) = 0.91829

For attribute 'Warm-blooded': Values(Warm-blooded) = [Yes, No], S = [1Y, 2N]
    SYes = [0Y, 2N], E(SYes) = 0
    SNo = [1Y, 0N], E(SNo) = 0
    Gain(S, Warm-blooded) = 0.91829 - [(2/3)*0 + (1/3)*0] = 0.91829

For attribute 'Fur': Values(Fur) = [Yes, No]
    SYes = [0Y, 1N], E(SYes) = 0
    SNo = [1Y, 1N], E(SNo) = 1
    Gain(S, Fur) = 0.91829 - [(1/3)*0 + (2/3)*1] = 0.25162

For attribute 'Swims': Values(Swims) = [Yes, No]
    SYes = [1Y, 1N], E(SYes) = 1
    SNo = [0Y, 1N], E(SNo) = 0
    Gain(S, Swims) = 0.91829 - [(2/3)*1 + (1/3)*0] = 0.25162

Gain(S, Warm-blooded) is the maximum, so Warm-blooded becomes the next decision node.
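The same helpers reproduce the second-level choice on the 'N' branch:

```python
branch = [ex for ex in animals if ex["Feathers"] == "No"]  # Crocodile, Dolphin, Koala
for attr in ["Warm-blooded", "Fur", "Swims"]:
    print(attr, round(info_gain(branch, attr, "Lays Eggs"), 5))
# Warm-blooded wins (gain about 0.918): it splits the branch into pure subsets.
```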

The final decision tree:

    Feathers
     |-- Y: Lays Eggs
     |-- N: Warm-blooded
              |-- Y: Does not lay eggs
              |-- N: Lays Eggs
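As a usage sketch, the finished tree can be stored as nested dicts (the shape the id3 sketch above returns) and walked to classify an example; classify and the hand-written tree below are illustrative:

```python
def classify(tree, example):
    """Follow attribute tests down the nested-dict tree to a leaf label."""
    while isinstance(tree, dict):
        attr = next(iter(tree))            # the attribute tested at this node
        tree = tree[attr][example[attr]]   # follow the matching branch
    return tree

lays_eggs_tree = {
    "Feathers": {
        "Yes": "Lays Eggs",
        "No": {"Warm-blooded": {"Yes": "Does not lay eggs",
                                "No": "Lays Eggs"}},
    }
}

print(classify(lays_eggs_tree, {"Feathers": "No", "Warm-blooded": "No"}))
# -> Lays Eggs (matches the Crocodile row of the training table)
```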

Example 2
Factors affecting sunburn.

Name   Hair    Height   Weight   Lotion  Sunburned
Sarah  Blonde  Average  Light    No      Yes
Dana   Blonde  Tall     Average  Yes     No
Alex   Brown   Short    Average  Yes     No
Annie  Blonde  Short    Average  No      Yes
Emily  Red     Average  Heavy    No      Yes
Pete   Brown   Tall     Heavy    No      No
John   Brown   Average  Heavy    No      No
Katie  Blonde  Short    Light    Yes     No

In this case, the final decision tree is:

    Hair
     |-- Blonde: Lotion
     |            |-- N: Sunburned
     |            |-- Y: Not Sunburned
     |-- Red: Sunburned
     |-- Brown: Not Sunburned

References "Machine Learning", by Tom Mitchell, McGraw-Hill, 1997 "Building Decision Trees with the ID3 Algorithm", by: Andrew Colin, Dr. Dobbs Journal, June 1996 http://www2.cs.uregina.ca/~dbd/cs831/notes/ml/dtrees/dt_prob1.html Professor Sin-Min Lee, SJSU. http://cs.sjsu.edu/~lee/cs157b/cs157b.html