Decision Trees with Numeric Tests

Industrial-strength algorithms
 For an algorithm to be useful in a wide range of real-world applications it must:
  Permit numeric attributes
  Allow missing values
  Be robust in the presence of noise
 Basic schemes need to be extended to fulfill these requirements

C4.5 History
 Predecessors: ID3 (Quinlan, 1986), CHAID (Kass, 1980)
 C4.5 innovations (Quinlan, 1993):
  permits numeric attributes
  deals sensibly with missing values
  prunes to cope with noisy data
 C4.5 is one of the best-known and most widely used learning algorithms
 Last research version: C4.8, implemented in Weka as J4.8 (Java)
 Commercial successor: C5.0 (available from RuleQuest)

Numeric attributes
 Standard method: binary splits, e.g. temperature < 45
 Unlike a nominal attribute, a numeric attribute has many possible split points
 Solution is a straightforward extension:
  Evaluate info gain (or another measure) for every possible split point of the attribute
  Choose the "best" split point
  The info gain of the best split point becomes the info gain of the attribute
 Computationally more demanding; see the sketch below
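A minimal Python sketch of this scan (illustrative only; the names entropy and best_numeric_split are made up for this example, not C4.5's actual code). Sorting once and keeping running class counts is what makes a single pass over all candidate splits possible:

    import math

    def entropy(counts):
        """Entropy in bits of a list of class counts, e.g. [4, 2]."""
        total = sum(counts)
        return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

    def best_numeric_split(values, labels):
        """Return (split point, info gain) of the best binary split on a
        numeric attribute, trying every midpoint between adjacent
        distinct values. Returns (None, 0.0) if no split helps."""
        pairs = sorted(zip(values, labels))
        classes = sorted(set(labels))
        n = len(pairs)
        parent = entropy([labels.count(c) for c in classes])
        below = {c: 0 for c in classes}                # counts left of the split
        above = {c: labels.count(c) for c in classes}  # counts right of the split
        best_point, best_gain = None, 0.0
        for i in range(n - 1):
            v, y = pairs[i]
            below[y] += 1                              # move one instance across
            above[y] -= 1
            if pairs[i + 1][0] == v:
                continue                               # no split between equal values
            point = (v + pairs[i + 1][0]) / 2          # halfway between values
            info = (((i + 1) / n) * entropy(list(below.values()))
                    + ((n - i - 1) / n) * entropy(list(above.values())))
            if parent - info > best_gain:
                best_point, best_gain = point, parent - info
        return best_point, best_gain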

Example
 Split on the temperature attribute; the 14 weather-data instances sorted by temperature:

    temperature: 64  65  68  69  70  71  72  72  75  75  80  81  83  85
    play:        yes no  yes yes yes no  no  yes yes yes no  yes yes no

 E.g. temperature < 71.5: yes/4, no/2; temperature ≥ 71.5: yes/5, no/3
 info([4,2],[5,3]) = 6/14 × info([4,2]) + 8/14 × info([5,3]) = 0.939 bits (verified in the snippet below)
 Place split points halfway between values
 Can evaluate all split points in one pass!
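Plugging the sorted data above into the entropy helper from the previous sketch reproduces the slide's figure for the 71.5 split:

    temps = [64, 65, 68, 69, 70, 71, 72, 72, 75, 75, 80, 81, 83, 85]
    play  = ["yes", "no", "yes", "yes", "yes", "no", "no",
             "yes", "yes", "yes", "no", "yes", "yes", "no"]
    # temperature < 71.5: 4 yes / 2 no; temperature >= 71.5: 5 yes / 3 no
    info = 6/14 * entropy([4, 2]) + 8/14 * entropy([5, 3])
    print(round(info, 3))   # 0.939 bits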

Example (continued)
 Split on the temperature attribute: info gain evaluated at every candidate split point (recomputed in the snippet below)
 (figure: the sorted class sequence yes no yes yes yes no no yes yes yes no yes yes no, with the infoGain curve plotted above it, starting from 0)
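The plotted curve can be reproduced by recording the gain at every candidate midpoint instead of keeping only the best one. A brute-force variant (the name gain_curve is hypothetical; it reuses entropy from the first sketch and temps/play from the check above, and recounts per candidate rather than doing one pass):

    def gain_curve(values, labels):
        """(midpoint, info gain) for every candidate split point."""
        pairs = sorted(zip(values, labels))
        classes = sorted(set(labels))
        points = sorted({(a + b) / 2
                         for (a, _), (b, _) in zip(pairs, pairs[1:]) if a != b})
        parent = entropy([labels.count(c) for c in classes])
        curve = []
        for p in points:
            below = [y for v, y in pairs if v < p]
            above = [y for v, y in pairs if v > p]
            info = (len(below) * entropy([below.count(c) for c in classes])
                    + len(above) * entropy([above.count(c) for c in classes])) / len(pairs)
            curve.append((p, parent - info))
        return curve

    for point, gain in gain_curve(temps, play):
        print(point, round(gain, 3))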

Speeding up
 Entropy only needs to be evaluated between points of different classes (Fayyad & Irani, 1992)
 Breakpoints between values of the same class cannot be optimal (see the snippet below)
 (figure: the sorted values with class labels yes no yes yes yes no no yes yes yes no yes yes no; marks on the value axis show the potential optimal breakpoints, all falling between adjacent values of different classes)
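In code, this optimization simply shrinks the candidate list before scoring: only midpoints between adjacent examples of different classes survive. A sketch (a simplification; tied values with mixed classes, like the two 72s here, need slightly more care in a full implementation):

    pairs = sorted(zip(temps, play))   # temps/play as in the example above
    candidates = [(a + b) / 2
                  for (a, ya), (b, yb) in zip(pairs, pairs[1:])
                  if ya != yb and a != b]
    print(candidates)   # 6 of the 11 distinct midpoints remain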

Application: Computer Vision 1

Application: Computer Vision 2
 feature extraction:
  color (RGB, hue, saturation)
  edge, orientation
  texture
  XY coordinates
  3D information

Application: Computer Vision 3 how grey? below horizon?

Application: Computer Vision 4 prediction

Application: Computer Vision 5 inverse perspective

Application: Computer Vision 6 inverse perspective path planning

Quiz 1
Q: If an attribute A has high info gain at the root, does it always appear in a decision tree?
A: No. If it is highly correlated with another attribute B, and infoGain(B) > infoGain(A), then B will appear in the tree instead, and further splitting on A will not be useful.

Quiz 2
Q: Can an attribute appear more than once in a decision tree?
A: Yes. If a test is not at the root of the tree, it can appear in several different branches.
Q: And on a single path in the tree (from root to leaf)?
A: Yes, but only a numeric attribute: it can be tested more than once, each time with a different numeric condition (e.g. A < 10 near the root, then A < 5 further down)

Quiz 3
Q: If an attribute A has infoGain(A) = 0 at the root, can it ever appear in a decision tree?
A: Yes:
1. All attributes may have zero info gain at the root (the XOR problem: neither A nor B alone says anything about the class; see the check below)
2. Info gain often changes after splitting on another attribute
(figure: the XOR problem over attributes A and B)
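A tiny self-contained check of point 1 (the dataset and helper names are made up for this illustration): on XOR data both attributes have zero gain at the root, yet either one becomes fully informative once the other has been split on.

    import math

    def ent(labels):
        """Entropy in bits of a list of class labels."""
        n = len(labels)
        return -sum(labels.count(c) / n * math.log2(labels.count(c) / n)
                    for c in set(labels))

    def info_gain(data, attr):
        """Gain of splitting (features, label) pairs on feature index attr."""
        labels = [y for _, y in data]
        g = ent(labels)
        for v in set(x[attr] for x, _ in data):
            subset = [y for x, y in data if x[attr] == v]
            g -= len(subset) / len(data) * ent(subset)
        return g

    xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    print(info_gain(xor, 0), info_gain(xor, 1))     # 0.0 0.0 at the root
    branch = [(x, y) for x, y in xor if x[0] == 0]  # after the split A = 0
    print(info_gain(branch, 1))                     # 1.0: B now separates perfectly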