Decision Tree Classifiers
Oliver Schulte, Machine Learning 726

Slide 2: Overview

Child Node \ Parent Node | Discrete                             | Continuous
Discrete                 | maximum likelihood; decision trees   | logit distribution (logistic regression)
Continuous               | conditional Gaussian (not discussed) | linear Gaussian (linear regression)

Slide 3: Decision Tree
A popular type of classifier. Easy to visualize. Especially suited to discrete attribute values, but also applicable to continuous ones. Learning is guided by information theory.

Slide 4: Decision Tree Example

Slide 5: Exercise
Find a decision tree to represent each of: A OR B, A AND B, A XOR B, and (A AND B) OR (C AND (NOT D) AND E).
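For concreteness, a minimal sketch of one of these cases (A XOR B) as nested tests in Python; the function name is my own. Note that XOR needs a tree of depth two, since no single attribute test separates the classes:

```python
def xor_tree(a, b):
    """Decision tree for A XOR B: the root tests A, each branch then tests B."""
    if a:             # A = true branch
        return not b  # leaf: true exactly when B is false
    else:             # A = false branch
        return b      # leaf: true exactly when B is true

# Check against the truth table.
for a in (False, True):
    for b in (False, True):
        print(a, b, xor_tree(a, b))
```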

Slide 6: Decision Tree Learning
Basic loop:
1. A := the "best" decision attribute for the next node.
2. For each value of A, create a new descendant of the node.
3. Assign the training examples to the leaf nodes.
4. If the training examples are perfectly classified, STOP; else iterate over the new leaf nodes.
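A minimal, runnable ID3-style sketch of this loop; the helper names and toy data are hypothetical, and real implementations add stopping criteria and pruning:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_attribute(examples, labels, attributes):
    """Pick the attribute whose split leaves the lowest weighted child entropy
    (equivalently, the one with the highest information gain)."""
    def split_entropy(attr):
        total = 0.0
        for value in set(ex[attr] for ex in examples):
            subset = [lab for ex, lab in zip(examples, labels) if ex[attr] == value]
            total += len(subset) / len(labels) * entropy(subset)
        return total
    return min(attributes, key=split_entropy)

def id3(examples, labels, attributes):
    """Return a nested-dict tree; leaves are class labels."""
    if len(set(labels)) == 1:            # perfectly classified: stop
        return labels[0]
    if not attributes:                   # no attributes left: majority vote
        return Counter(labels).most_common(1)[0][0]
    a = best_attribute(examples, labels, attributes)
    tree = {a: {}}
    for value in set(ex[a] for ex in examples):
        idx = [i for i, ex in enumerate(examples) if ex[a] == value]
        subtree = id3([examples[i] for i in idx],
                      [labels[i] for i in idx],
                      [b for b in attributes if b != a])
        tree[a][value] = subtree
    return tree

# Hypothetical toy data: each example is a dict of attribute values.
X = [{"outlook": "sunny", "windy": False},
     {"outlook": "rain",  "windy": True},
     {"outlook": "sunny", "windy": True},
     {"outlook": "rain",  "windy": False}]
y = ["no", "yes", "no", "yes"]
print(id3(X, y, ["outlook", "windy"]))  # {'outlook': {'sunny': 'no', 'rain': 'yes'}}
```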

Slide 7: Entropy

Slide 8: Uncertainty and Probability
The more "balanced" a probability distribution, the less information it conveys (e.g., about the class label). How can we quantify this? Information theory: entropy measures balance. For a sample S with proportion p+ of positive examples and proportion p- of negative examples:

Entropy(S) = -p+ log2(p+) - p- log2(p-)
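To make the balance intuition concrete, a small Python sketch of the binary entropy formula above (the function name is my own):

```python
import math

def binary_entropy(p_pos):
    """Entropy (in bits) of a sample with proportion p_pos of positive examples."""
    if p_pos in (0.0, 1.0):
        return 0.0  # a pure sample carries no uncertainty
    p_neg = 1.0 - p_pos
    return -p_pos * math.log2(p_pos) - p_neg * math.log2(p_neg)

print(binary_entropy(0.5))  # 1.0: a perfectly balanced sample is maximally uncertain
print(binary_entropy(0.9))  # ~0.47: a skewed sample is much more predictable
```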

Slide 9: Entropy: General Definition
For a discrete random variable X with distribution p, the entropy is H(X) = -sum over x of p(x) log2 p(x). It is an important quantity in coding theory, statistical physics, and machine learning.

Slide 10: Intuition

Slide 11: Entropy

Slide 12: Coding Theory
Coding theory: X is discrete with 8 possible states ("messages"); how many bits are needed to transmit the state of X? Shannon's source coding theorem: an optimal code assigns a length of -log2 p(x) bits to each message X = x. If all states are equally likely, each message needs log2(8) = 3 bits.
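A quick numeric check of these code lengths in Python; the distributions are made-up examples:

```python
import math

def code_length(p):
    """Optimal (Shannon) code length, in bits, for a message of probability p."""
    return -math.log2(p)

# 8 equally likely states: every message costs log2(8) = 3 bits.
uniform = {s: 1 / 8 for s in "abcdefgh"}
print(sum(p * code_length(p) for p in uniform.values()))  # 3.0

# A skewed distribution has lower entropy, hence a shorter expected code.
skewed = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
print(sum(p * code_length(p) for p in skewed.values()))   # 1.75
```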

Slide 13: Another Coding Example

Slide 14: Zipf's Law
General principle: frequent messages get shorter codes (e.g., abbreviations). This is the idea behind information compression.

Slide 15: The Kullback-Leibler Divergence
Measures an information-theoretic "distance" between two distributions p and q:

KL(p || q) = sum over x of p(x) [log2(1/q(x)) - log2(1/p(x))]

The first term is the code length of x under the wrong distribution q; the second is the code length of x under the true distribution p. KL(p || q) is thus the expected number of extra bits paid for coding with q instead of p.
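A direct transcription of this formula into Python; the dict-based representation of the distributions is my own choice:

```python
import math

def kl_divergence(p, q):
    """KL(p || q): expected extra bits paid for coding with q when data follow p.
    p and q map outcomes to probabilities; assumes q(x) > 0 wherever p(x) > 0."""
    return sum(px * (math.log2(px) - math.log2(q[x]))
               for x, px in p.items() if px > 0)

p = {"a": 0.5, "b": 0.5}
q = {"a": 0.9, "b": 0.1}
print(kl_divergence(p, p))  # 0.0: no penalty for using the true distribution
print(kl_divergence(p, q))  # ~0.74: coding with the wrong distribution wastes bits
```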

Slide 16: Information Gain

Slide 17: Splitting Criterion
Knowing the value of an attribute changes the entropy of the class label. Intuitively, we want to split on the attribute that yields the greatest reduction in entropy, averaged over its attribute values. Gain(S, A) is the expected reduction in entropy due to splitting on A:

Gain(S, A) = Entropy(S) - sum over v in Values(A) of (|S_v| / |S|) * Entropy(S_v)

where S_v is the subset of S for which attribute A has value v.
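A sketch of the gain computation on a made-up split; the attribute values and labels here are hypothetical:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain(attr_values, labels):
    """Gain(S, A) = Entropy(S) - sum_v |S_v|/|S| * Entropy(S_v)."""
    n = len(labels)
    remainder = 0.0
    for v in set(attr_values):
        subset = [lab for a, lab in zip(attr_values, labels) if a == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

labels      = ["+", "+", "-", "-", "+", "-"]
attr_values = ["x", "x", "y", "y", "x", "y"]
print(gain(attr_values, labels))  # 1.0 = Entropy(S): this split is perfect
```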

Slide 18: Example

Slide 19: PlayTennis