DECISION TREES


DECISION TREES An internal node represents a test on an attribute. A branch represents an outcome of the test, e.g., Color = red. A leaf node represents a class label or a class label distribution. At each node, one attribute is chosen to split the training examples into classes that are as distinct as possible. A new case is classified by following the matching path to a leaf node.
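As a rough illustration (not part of the original slides), such a tree can be stored as nested dictionaries keyed by the tested attribute and its values; classifying a case is then just following the matching path until a leaf label is reached. The attribute and class names below are taken from the weather example tree shown a few slides later; this is a sketch, not a prescribed representation.

    # A decision tree as nested dicts: {attribute: {value: subtree-or-leaf-label}}.
    weather_tree = {
        "Outlook": {
            "sunny":    {"Humidity": {"high": "N", "normal": "P"}},
            "overcast": "P",
            "rain":     {"Windy": {"true": "N", "false": "P"}},
        }
    }

    def classify(tree, case):
        """Follow the branch matching each tested attribute until a leaf (class label) is reached."""
        while isinstance(tree, dict):
            attribute = next(iter(tree))                 # the attribute tested at this node
            tree = tree[attribute][case[attribute]]      # take the branch matching the case's value
        return tree

    print(classify(weather_tree, {"Outlook": "sunny", "Humidity": "normal", "Windy": "false"}))  # P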

Training Set

Example A decision tree for the weather data: Outlook? sunny -> humidity? (high: N, normal: P); overcast -> P; rain -> windy? (true: N, false: P).

Building Decision Tree Top-down tree construction: at the start, all training examples are at the root; partition the examples recursively by choosing one attribute each time. Bottom-up tree pruning: remove subtrees or branches, in a bottom-up manner, to improve the estimated accuracy on new cases. Use of the decision tree, classifying an unknown sample: test the attribute values of the sample against the decision tree.
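One concrete way to realize the bottom-up pruning step (not spelled out on the slide) is reduced-error pruning: prune the children of a node first, then collapse the node into a majority-class leaf whenever that is at least as accurate on a held-out set of examples. The sketch below assumes the nested-dict tree representation and the classify helper from the sketch above, and that the held-out attribute values all occur in the tree.

    def majority_label(examples):
        """Most common class label among (case, label) pairs."""
        labels = [label for _, label in examples]
        return max(set(labels), key=labels.count)

    def prune(tree, held_out):
        """Bottom-up (reduced-error) pruning against held-out (case, label) pairs."""
        if not isinstance(tree, dict) or not held_out:
            return tree
        attribute = next(iter(tree))
        for value, subtree in tree[attribute].items():
            subset = [(c, l) for c, l in held_out if c[attribute] == value]
            tree[attribute][value] = prune(subtree, subset)          # prune children first
        leaf = majority_label(held_out)
        correct_subtree = sum(classify(tree, c) == l for c, l in held_out)
        correct_leaf = sum(leaf == l for _, l in held_out)
        # Replace the subtree by a leaf if that does not hurt estimated accuracy on new cases.
        return leaf if correct_leaf >= correct_subtree else tree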

Training Dataset

Output: A Decision Tree for “buys_computer” age? <=30 -> student? (no: no, yes: yes); 31..40 -> yes; >40 -> credit rating? (excellent: no, fair: yes).

Algorithm for Decision Tree Induction Basic algorithm (a greedy algorithm): the tree is constructed in a top-down, recursive, divide-and-conquer manner. At the start, all the training examples are at the root; attributes are categorical; examples are partitioned recursively based on selected attributes; test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain). Conditions for stopping the partitioning: all samples for a given node belong to the same class; there are no remaining attributes for further partitioning (majority voting is employed for classifying the leaf); or there are no samples left.
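The greedy divide-and-conquer loop can be sketched as follows. The (case, label) data layout is an assumption, and best_attribute stands in for whichever selection measure is plugged in (e.g., the information gain defined on the following slides).

    from collections import Counter

    def build_tree(examples, attributes, best_attribute):
        """Top-down, recursive, divide-and-conquer induction (ID3-style sketch).
        examples: list of (dict of attribute values, class label) pairs."""
        labels = [label for _, label in examples]
        if len(set(labels)) == 1:                       # all samples belong to the same class
            return labels[0]
        if not attributes:                              # no attributes left: majority voting
            return Counter(labels).most_common(1)[0][0]
        attr = best_attribute(examples, attributes)     # greedy choice of the test attribute
        node = {attr: {}}
        remaining = [a for a in attributes if a != attr]
        for value in {case[attr] for case, _ in examples}:
            subset = [(c, l) for c, l in examples if c[attr] == value]
            node[attr][value] = build_tree(subset, remaining, best_attribute)
        return node

With the gain measure sketched after the Information Gain slides, best_attribute could simply be, for example, lambda ex, attrs: max(attrs, key=lambda a: gain(ex, a)).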

Choosing the Splitting Attribute At each node, the available attributes are evaluated on the basis of how well they separate the classes of the training examples. A goodness function is used for this purpose. Typical goodness functions: information gain (ID3/C4.5), information gain ratio, and the gini index.
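As a rough sketch of one of these alternative goodness functions, the gini index of a node (1 minus the sum of squared class proportions) and the weighted gini index of a candidate split could be computed as follows; the (case, label) example layout is an assumption.

    def gini(labels):
        """Gini index of a set of class labels: 1 - sum of squared class proportions."""
        n = len(labels)
        return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

    def gini_after_split(examples, attribute):
        """Weighted gini index of the partitions produced by splitting on `attribute`."""
        n = len(examples)
        total = 0.0
        for value in {case[attribute] for case, _ in examples}:
            subset = [label for case, label in examples if case[attribute] == value]
            total += len(subset) / n * gini(subset)
        return total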

Which attribute to select?

A criterion for attribute selection Which is the best attribute? The one that will result in the smallest tree. Heuristic: choose the attribute that produces the “purest” nodes. A popular impurity criterion is information gain: information gain increases with the average purity of the subsets that an attribute produces. Strategy: choose the attribute that results in the greatest information gain.

Information Gain (ID3/C4.5) Select the attribute with the highest information gain. Assume there are two classes, P and N, and let the set of examples S contain p elements of class P and n elements of class N. The amount of information needed to decide whether an arbitrary example in S belongs to P or N is defined as I(p, n) = -(p/(p+n)) log2(p/(p+n)) - (n/(p+n)) log2(n/(p+n)).
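A small sketch of this measure in code, using math.log2 and the usual convention that a pure set (p = 0 or n = 0) needs 0 bits:

    from math import log2

    def info(p, n):
        """I(p, n): expected information (in bits) to decide whether an example is in P or N."""
        if p == 0 or n == 0:
            return 0.0                      # a pure set needs no further information
        total = p + n
        return -p / total * log2(p / total) - n / total * log2(n / total)

    print(round(info(9, 5), 3))             # 0.940, matching the buys_computer example below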

Information Gain in Decision Tree Induction Assume that using attribute A a set S will be partitioned into sets {S1, S2, …, Sv}. If Si contains pi examples of P and ni examples of N, the entropy, or the expected information needed to classify objects in all subtrees Si, is E(A) = sum over i = 1..v of ((pi + ni) / (p + n)) * I(pi, ni). The encoding information that would be gained by branching on A is Gain(A) = I(p, n) - E(A).
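Building on the info() sketch above, E(A) and Gain(A) might be computed like this; the (case, label) example layout and the "yes"/"no" class labels are assumptions.

    def expected_info(examples, attribute):
        """E(A): weighted information over the subsets S1..Sv produced by splitting on `attribute`."""
        total = len(examples)
        e = 0.0
        for value in {case[attribute] for case, _ in examples}:
            subset = [label for case, label in examples if case[attribute] == value]
            p, n = subset.count("yes"), subset.count("no")    # class P and class N counts
            e += (p + n) / total * info(p, n)
        return e

    def gain(examples, attribute):
        """Gain(A) = I(p, n) - E(A)."""
        labels = [label for _, label in examples]
        return info(labels.count("yes"), labels.count("no")) - expected_info(examples, attribute)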

Attribute Selection by Information Gain Computation Class P: buys_computer = “yes”. Class N: buys_computer = “no”. I(p, n) = I(9, 5) = 0.940. Compute the entropy for age: the age groups contain (2 yes, 3 no) for age <= 30, (4 yes, 0 no) for age 31..40, and (3 yes, 2 no) for age > 40, so E(age) = 5/14 I(2, 3) + 4/14 I(4, 0) + 5/14 I(3, 2) = 0.694. Hence Gain(age) = I(9, 5) - E(age) = 0.940 - 0.694 = 0.246. Similarly, Gain(income) = 0.029, Gain(student) = 0.151, and Gain(credit_rating) = 0.048, so age is selected as the splitting attribute at the root.
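These numbers can be checked with the info() sketch above:

    e_age = 5/14 * info(2, 3) + 4/14 * info(4, 0) + 5/14 * info(3, 2)
    print(round(e_age, 3))                  # 0.694
    print(round(info(9, 5) - e_age, 3))     # 0.247 at full precision; the slide's 0.246 is 0.940 - 0.694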

Example: attribute “Outlook” “Outlook” = “Sunny”: info([2, 3]) = 0.971 bits. “Outlook” = “Overcast”: info([4, 0]) = 0 bits. “Outlook” = “Rainy”: info([3, 2]) = 0.971 bits. Expected information for the attribute: 5/14 * 0.971 + 4/14 * 0 + 5/14 * 0.971 = 0.693 bits. Note: log2(0) is normally not defined, but a subset containing no examples of one class contributes 0 bits.

Computing the information gain Information gain = information before splitting - information after splitting. Information gain for the attributes from the weather data: gain(Outlook) = 0.940 - 0.693 = 0.247 bits, gain(Temperature) = 0.029 bits, gain(Humidity) = 0.152 bits, gain(Windy) = 0.048 bits, so Outlook is chosen for the first split.
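To make the gains concrete, the sketch below runs the gain() helper defined earlier over the standard 14-example weather data set (as it is usually published); the data layout and the "yes"/"no" play labels are carried over from that sketch.

    weather = [  # (attribute values, play?) - the usual 14 published examples
        ({"Outlook": "sunny",    "Temperature": "hot",  "Humidity": "high",   "Windy": "false"}, "no"),
        ({"Outlook": "sunny",    "Temperature": "hot",  "Humidity": "high",   "Windy": "true"},  "no"),
        ({"Outlook": "overcast", "Temperature": "hot",  "Humidity": "high",   "Windy": "false"}, "yes"),
        ({"Outlook": "rainy",    "Temperature": "mild", "Humidity": "high",   "Windy": "false"}, "yes"),
        ({"Outlook": "rainy",    "Temperature": "cool", "Humidity": "normal", "Windy": "false"}, "yes"),
        ({"Outlook": "rainy",    "Temperature": "cool", "Humidity": "normal", "Windy": "true"},  "no"),
        ({"Outlook": "overcast", "Temperature": "cool", "Humidity": "normal", "Windy": "true"},  "yes"),
        ({"Outlook": "sunny",    "Temperature": "mild", "Humidity": "high",   "Windy": "false"}, "no"),
        ({"Outlook": "sunny",    "Temperature": "cool", "Humidity": "normal", "Windy": "false"}, "yes"),
        ({"Outlook": "rainy",    "Temperature": "mild", "Humidity": "normal", "Windy": "false"}, "yes"),
        ({"Outlook": "sunny",    "Temperature": "mild", "Humidity": "normal", "Windy": "true"},  "yes"),
        ({"Outlook": "overcast", "Temperature": "mild", "Humidity": "high",   "Windy": "true"},  "yes"),
        ({"Outlook": "overcast", "Temperature": "hot",  "Humidity": "normal", "Windy": "false"}, "yes"),
        ({"Outlook": "rainy",    "Temperature": "mild", "Humidity": "high",   "Windy": "true"},  "no"),
    ]

    for attribute in ["Outlook", "Temperature", "Humidity", "Windy"]:
        print(attribute, round(gain(weather, attribute), 3))
    # Outlook 0.247, Temperature 0.029, Humidity 0.152, Windy 0.048 -> split on Outlook first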

Continuing to split

The final decision tree Note: not all leaves need to be pure; sometimes identical instances have different classes. Splitting stops when the data cannot be split any further.