Multiple Decision Trees ISQS7342


Presented by Xiayi Kuang and Chin Hwa Tan

Outline
1. Multiple decision trees
2. Advantages of multiple decision trees
3. Major multiple decision tree methods
4. Multiple random classification decision trees

Multiple decision trees
The original data set is used to create derived data sets, and each derived data set is used to build an alternative decision tree model. The resulting decision tree models are then combined into a single scoring algorithm.

Comparison

Single decision tree sample result:
IF Age > 20 AND Weight > 180 (lbs.) THEN Gender = Male

Multiple decision trees sample result:
Rule 1: IF Age > 20 AND Weight > 180 (lbs.)
Rule 2: IF Height > 160 (cm.) AND BodyMass < 0.3
THEN Probability of Gender = Female is 0.3

Advantages
- Uses a randomization approach
- Creates an aggregate outcome that summarizes all of the decision trees
- Significantly improves model results
- Because many models are developed and averaged, the results are highly stable

Major multiple decision tree methods
- Cross-validation
- Bootstrapping
- Bagging
- Boosting
- AdaBoost

Cross-Validation
- Used in the CRT decision tree approach
- The decision tree is allowed to grow larger than normal; cross-validation is then used to trim branches back
- The data are partitioned into ten partitions; nine of the partitions are used as cross-validation training data, and the remaining partition is used as an independent test sample
- This procedure is repeated ten times, and the average of the ten error rates is calculated; this average is known as the cross-validation cost
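A minimal sketch of the ten-fold procedure described above, using scikit-learn; the data set and the tree settings are placeholders, not taken from the original slides:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Placeholder data standing in for the training data set.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Grow a deliberately large tree; cross-validation then shows how
# much of that complexity actually generalizes.
tree = DecisionTreeClassifier(max_depth=None, random_state=0)

# Ten partitions: each fold in turn is the independent test sample,
# while the other nine serve as cross-validation training data.
folds = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(tree, X, y, cv=folds)

# The average of the ten error rates is the cross-validation cost.
cv_cost = 1.0 - scores.mean()
print(f"cross-validation cost (average error rate): {cv_cost:.3f}")
```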

Bootstrapping
- The bootstrap process begins with the original (training) data set of observations and forms a bootstrap sample by repeatedly selecting observations from the original data set at random, with replacement
- Bootstrapping can also be used to assess the stability of various predictors, serving as an indicator of how well the predictors will perform on new data sets
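The resampling step is easy to illustrate; in this toy sketch the `data` array is a stand-in for the original observations:

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(10)  # stand-in for the original data set of observations
n = len(data)

# A bootstrap sample: draw n observations at random, with replacement,
# so some observations repeat and others are left out entirely.
boot_sample = data[rng.integers(0, n, size=n)]

print("original :", data)
print("bootstrap:", boot_sample)
print("left out :", np.setdiff1d(data, boot_sample))
```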

Bagging
- Bagging stands for bootstrap aggregation
- Bagging refers to the creation of a pooled estimate of the target
- Bagging can improve the predictive accuracy of unstable models
- Bagging helps smooth out the predictions
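A sketch of bagging written out by hand to make the pooling explicit (scikit-learn also provides a ready-made `BaggingClassifier`); the data set and the number of trees are illustrative choices:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
rng = np.random.default_rng(0)
n, n_trees = len(X), 25

# Fit one deep (unstable) tree per bootstrap sample.
trees = []
for _ in range(n_trees):
    idx = rng.integers(0, n, size=n)  # bootstrap sample of the training data
    trees.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

# Pooled estimate of the target: majority vote across the trees,
# which smooths out the instability of any single tree.
votes = np.stack([t.predict(X) for t in trees])   # shape (n_trees, n)
bagged = (votes.mean(axis=0) >= 0.5).astype(int)  # vote for 0/1 labels
print("bagged training accuracy:", (bagged == y).mean())
```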

Boosting
- Boosting uses varying probabilities when selecting observations to include in the sample
- The goal of boosting is to increase the selection probability of observations that the current model predicts poorly, so that later trees concentrate on the difficult cases
- Boosting builds a series of decision trees, and each decision tree incrementally improves the ensemble's prediction
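The incremental improvement is visible with scikit-learn's AdaBoost implementation, whose `staged_predict` method replays the ensemble's prediction after each successive tree; the data set here is a placeholder:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A series of small trees; each one nudges the ensemble's prediction.
booster = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)

# Accuracy after 1, 10, and 50 trees: it typically improves step by step.
for i, y_hat in enumerate(booster.staged_predict(X_te), start=1):
    if i in (1, 10, 50):
        print(f"after {i:2d} trees: accuracy = {accuracy_score(y_te, y_hat):.3f}")
```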

AdaBoost
- AdaBoost is a form of boosting that builds an initial model from the training data set
- In that model, some records are correctly classified by the decision algorithm and some records are misclassified; the misclassified records are then given larger weights so that the next model focuses on them
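A stripped-down sketch of a single AdaBoost round, showing how misclassified records end up with larger weights; the data, the stump depth, and the variable names are all illustrative:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)
y_pm = np.where(y == 1, 1, -1)     # AdaBoost math uses +/-1 labels
w = np.full(len(X), 1.0 / len(X))  # start with equal record weights

# Initial model built from the training data set.
stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
pred = np.where(stump.predict(X) == 1, 1, -1)

miss = pred != y_pm                    # the misclassified records
err = w[miss].sum()                    # weighted error rate
alpha = 0.5 * np.log((1 - err) / err)  # this model's say in the final vote

# Reweight: misclassified records gain weight, correct ones lose weight.
w *= np.exp(-alpha * y_pm * pred)
w /= w.sum()
print(f"weighted error: {err:.3f}; misclassified weight now: {w[miss].sum():.3f}")
```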

Multiple random classification decision trees
- Random forests
- Each decision tree in the random forest is grown on a bootstrap sample of the training data set
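A minimal random forest sketch in scikit-learn: `bootstrap=True` grows each tree on its own bootstrap sample of the training data, and `max_features="sqrt"` adds the per-split feature randomness that distinguishes random forests from plain bagging; the data set is a placeholder:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Each of the 100 trees sees its own bootstrap sample of the data and
# considers a random subset of the features at every split.
forest = RandomForestClassifier(
    n_estimators=100,
    bootstrap=True,
    max_features="sqrt",
    random_state=0,
)
print("10-fold CV accuracy:", cross_val_score(forest, X, y, cv=10).mean())
```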

Thank You!