Decision Trees

Decision trees Decision trees are powerful and popular tools for classification and prediction. Much of their attractiveness stems from the fact that decision trees represent rules. Rules can readily be expressed so that humans can understand them, or even be used directly in a database access language such as SQL, so that records falling into a particular category may be retrieved.

In some applications, the accuracy of a classification or prediction is the only thing that matters; in such situations we do not necessarily care how or why the model works. In other situations, the ability to explain the reason for a decision is crucial. In marketing, for example, one has to describe the customer segments to marketing professionals so that they can use this knowledge to launch a successful marketing campaign, and domain experts must recognize and approve the discovered knowledge. There are a variety of algorithms for building decision trees that share the desirable quality of interpretability.

What is a decision tree? A decision tree is a classifier in the form of a tree structure, where each node is either: – a leaf node, which indicates the value of the target attribute (class) for the examples that reach it, or – a decision node, which specifies a test to be carried out on a single attribute value, with one branch and subtree for each possible outcome of the test.
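To make the two node types concrete, here is a minimal Python sketch (not from the slides; the class names and the tiny sunburn tree are illustrative) of how such a tree can be represented and used to classify an example:

```python
# Minimal sketch of a decision tree with leaf nodes and decision nodes.
# Names (Leaf, DecisionNode, classify) are illustrative, not from the slides.

class Leaf:
    """Leaf node: holds the value of the target attribute (class)."""
    def __init__(self, label):
        self.label = label

class DecisionNode:
    """Decision node: tests one attribute; one branch per possible value."""
    def __init__(self, attribute, branches):
        self.attribute = attribute   # name of the attribute to test
        self.branches = branches     # dict: attribute value -> subtree

def classify(node, example):
    """Follow the branch matching the example's attribute value
    at each decision node; return the class label of the leaf reached."""
    while isinstance(node, DecisionNode):
        node = node.branches[example[node.attribute]]
    return node.label

# Illustrative tree predicting sunburn from hair colour and lotion use.
tree = DecisionNode("hair", {
    "blonde": DecisionNode("lotion", {"yes": Leaf("none"),
                                      "no": Leaf("sunburn")}),
    "brown": Leaf("none"),
})
print(classify(tree, {"hair": "blonde", "lotion": "no"}))  # -> sunburn
```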

Example

Which attribute is the best classifier? The central choice in a decision tree algorithm is the selection of the attribute to test at each decision node in the tree. The goal is to select the attribute that is most useful for classifying examples. A good quantitative measure of the worth of an attribute is a statistical property called information gain, which measures how well a given attribute separates the training examples according to their target classification.

Which attribute is the best classifier? Information gain is used to select among the candidate attributes at each step while growing the tree. In order to define information gain precisely, we first need a measure commonly used in information theory, called entropy, which characterizes the (im)purity of an arbitrary collection of examples.
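The slide does not show the formula, but the standard definition used by ID3 (stated here for reference, in terms of the entropy measure defined on the following slides) is:

$$\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$$

where $\mathrm{Values}(A)$ is the set of possible values of attribute $A$ and $S_v$ is the subset of $S$ for which $A$ has value $v$.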

Entropy - a measure of homogeneity of the set of examples Given a set S, containing only positive and negative examples of some target concept (a two-class problem), the entropy of set S relative to this simple, binary classification is defined as: $\mathrm{Entropy}(S) = -p_p \log_2 p_p - p_n \log_2 p_n$, where $p_p$ is the proportion of positive examples in S and $p_n$ is the proportion of negative examples in S. In all calculations involving entropy we define $0 \log 0$ to be 0.
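A direct translation of this definition into Python (a sketch, not from the slides) might look like:

```python
import math

def binary_entropy(n_pos, n_neg):
    """Entropy of a set with n_pos positive and n_neg negative examples:
    Entropy(S) = -p_p*log2(p_p) - p_n*log2(p_n), with 0*log2(0) := 0."""
    total = n_pos + n_neg
    entropy = 0.0
    for count in (n_pos, n_neg):
        p = count / total
        if p > 0:                      # apply the convention 0*log2(0) = 0
            entropy -= p * math.log2(p)
    return entropy

print(binary_entropy(15, 10))  # the [15+, 10-] collection below -> ~0.971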

To illustrate, suppose S is a collection of 25 examples, including 15 positive and 10 negative examples [15+, 10-]. What is the entropy of S relative to this classification?
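Working out the arithmetic (the slide leaves this as an exercise; the answer is not on the slide):

$$\mathrm{Entropy}(S) = -\tfrac{15}{25}\log_2\tfrac{15}{25} - \tfrac{10}{25}\log_2\tfrac{10}{25} \approx 0.442 + 0.529 \approx 0.971$$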

Notice that the entropy is 0 if all members of S belong to the same class. For example, if all members are positive ($p_p = 1$), then $p_n = 0$, and $\mathrm{Entropy}(S) = -1 \cdot \log_2 1 - 0 \cdot \log_2 0 = -1 \cdot 0 - 0 = 0$. Note that the entropy is 1 (at its maximum!) when the collection contains an equal number of positive and negative examples. If the collection contains unequal numbers of positive and negative examples, the entropy is between 0 and 1.

The figure shows the form of the entropy function relative to a binary classification, as $p_p$ varies between 0 and 1.

So far we have discussed entropy in the special case where the target classification is binary. If the target attribute takes on c different values, then the entropy of S relative to this c-wise classification is defined as $\mathrm{Entropy}(S) = \sum_{i=1}^{c} -p_i \log_2 p_i$, where $p_i$ is the proportion of S belonging to class i.
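The same $0 \log 0 = 0$ convention carries over. A sketch of the general c-class version in Python (illustrative, not from the slides):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a collection of examples with arbitrary class labels:
    Entropy(S) = sum_i -p_i * log2(p_i).
    Classes with zero count never appear in the Counter, so the
    0*log2(0) = 0 convention is handled implicitly."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

# The binary case reproduces the [15+, 10-] result (~0.971):
print(entropy(["+"] * 15 + ["-"] * 10))
# A three-class example with proportions 1/2, 1/4, 1/4 gives 1.5 bits:
print(entropy(["a", "a", "b", "c"]))
```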