Machine Learning Ensemble Learning: Voting, Boosting (AdaBoost)


Machine Learning Ensemble Learning: Voting, Boosting (AdaBoost)
Supervised Learning
- Classification and Regression
- K-Nearest Neighbor Classification
- Fisher's Criteria & Linear Discriminant Analysis
- Perceptron: Linearly Separable
- Multilayer Perceptron & EBP & Deep Learning, RBF Network
- Support Vector Machine
- Ensemble Learning: Voting, Boosting (AdaBoost)
Unsupervised Learning
- Principal Component Analysis
- Independent Component Analysis
- Clustering: K-means
Semi-supervised Learning & Reinforcement Learning

Rationale
Different learners use different:
- Algorithms
- Hyper-parameters
- Representations (modalities)
- Training sets
- Subproblems
No Free Lunch Theorem: there is no single algorithm that is always the most accurate. Each learner assumes a certain model (assumptions, inductive bias). Even if we fine-tune the algorithm, there are instances on which it is not accurate; another algorithm may be accurate on those. The idea of ensemble learning is to generate a group of base learners which, when combined, has higher accuracy.

Voting
Linear combination of the base learners' outputs: y = Σ_{j=1}^{L} w_j d_j, with w_j ≥ 0 and Σ_j w_j = 1.
Classification:
- Simple voting: equal weights (w_j = 1/L)
- Plurality voting: the class with the maximum number of votes is the winner
- Majority voting: the class voted by more than half of the learners wins (two-class problem)
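A minimal sketch of the voting rule above, assuming each base learner outputs a per-class score vector d_j; the function and variable names are illustrative, not from the slides:

```python
import numpy as np

def weighted_vote(scores, weights=None):
    """Combine base-learner outputs by a convex combination.

    scores  : (L, n_classes) array -- row j holds d_j, the outputs of learner j
    weights : (L,) array of non-negative weights summing to 1;
              simple voting (w_j = 1/L) if omitted
    Returns the index of the winning class (plurality vote).
    """
    scores = np.asarray(scores, dtype=float)
    L = scores.shape[0]
    if weights is None:
        weights = np.full(L, 1.0 / L)      # simple voting: equal weights
    combined = weights @ scores            # y_k = sum_j w_j d_{jk}
    return int(np.argmax(combined))

# Three learners, two classes: the second class collects more weight
print(weighted_vote([[0.9, 0.1], [0.3, 0.7], [0.2, 0.8]]))  # -> 1
```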

Voting, Bayesian perspective (Bayesian model combination): the weights w_j can be interpreted as prior probabilities of the individual models. If the outputs d_j are i.i.d. (independent, identically distributed), the bias of the combined output does not change and the variance decreases by a factor of L (see the derivation below). Suppose each base learner = the true discriminant/regression function + random noise; if the noise terms are uncorrelated with zero mean, voting averages over the noise.
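A one-line derivation of the factor-L variance reduction stated above, assuming the d_j are uncorrelated with a common mean and variance:

\[
\mathbb{E}[y] = \mathbb{E}\Big[\tfrac{1}{L}\sum_{j=1}^{L} d_j\Big] = \mathbb{E}[d_j],
\qquad
\operatorname{Var}(y) = \tfrac{1}{L^2}\sum_{j=1}^{L}\operatorname{Var}(d_j) = \tfrac{1}{L}\operatorname{Var}(d_j).
\]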

Stacking
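The body of this slide is not in the transcript. As usually described, stacking trains a combiner (meta-learner) on the outputs of the base learners, ideally using predictions made on data the base learners were not trained on. A minimal sketch with scikit-learn estimators; the dataset, model choices, and split are illustrative:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

# Toy data (illustrative only)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_base, X_meta, y_base, y_meta = train_test_split(X, y, test_size=0.5, random_state=0)

# Level 0: train the base learners on one half of the data
base_learners = [DecisionTreeClassifier(max_depth=3, random_state=0),
                 KNeighborsClassifier(n_neighbors=5)]
for clf in base_learners:
    clf.fit(X_base, y_base)

# Level 1: the meta-learner is trained on base-learner outputs
# computed on data the base learners did not see
Z_meta = np.column_stack([clf.predict_proba(X_meta)[:, 1] for clf in base_learners])
meta = LogisticRegression().fit(Z_meta, y_meta)

def stacked_predict(X_new):
    Z = np.column_stack([clf.predict_proba(X_new)[:, 1] for clf in base_learners])
    return meta.predict(Z)
```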

Bagging (bootstrap aggregating)
- Bootstrapping method: generate L training sets by sampling with replacement from the original training set, train one base learner on each, and combine them (by voting or averaging).
- Why do we want to do this? Averaging reduces variance; the expectation is taken with respect to the distribution over training sets drawn from the input distribution.
- Unstable algorithms, whose output changes significantly with small changes in the training set, profit most from bagging.
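A minimal bagging sketch following the description above: bootstrap resampling plus plurality voting over the resulting trees (the dataset, number of replicates, and tree settings are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

L = 25                                        # number of bootstrap replicates
n = len(X)
learners = []
for _ in range(L):
    idx = rng.integers(0, n, size=n)          # sample n points with replacement
    learners.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

def bagged_predict(X_new):
    votes = np.stack([clf.predict(X_new) for clf in learners])  # shape (L, n_new)
    return (votes.mean(axis=0) >= 0.5).astype(int)              # plurality vote (binary labels)
```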

Boosting

AdaBoost: Algorithm
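The algorithm itself is not reproduced in the transcript. Below is a minimal AdaBoost sketch with decision stumps as the weak learners; the stump choice and all names are illustrative. Labels must be in {-1, +1}.

```python
import numpy as np

def fit_stump(X, y, w):
    """Weighted decision stump: threshold a single feature (exhaustive search)."""
    n, d = X.shape
    best = (np.inf, 0, 0.0, 1)                     # (weighted error, feature, threshold, polarity)
    for j in range(d):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = s * np.where(X[:, j] <= t, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, t, s)
    return best

def adaboost(X, y, M=20):
    """Return a list of (alpha, stump) pairs; y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                        # start from uniform example weights
    ensemble = []
    for _ in range(M):
        err, j, t, s = fit_stump(X, y, w)
        err = max(err, 1e-10)                      # guard against division by zero
        alpha = 0.5 * np.log((1 - err) / err)      # weight of this weak learner
        pred = s * np.where(X[:, j] <= t, 1, -1)
        w *= np.exp(-alpha * y * pred)             # up-weight mistakes, down-weight correct examples
        w /= w.sum()                               # renormalize to a distribution
        ensemble.append((alpha, (j, t, s)))
    return ensemble

def adaboost_predict(ensemble, X):
    """Final classifier: sign of the alpha-weighted sum of weak-learner outputs."""
    F = np.zeros(len(X))
    for alpha, (j, t, s) in ensemble:
        F += alpha * s * np.where(X[:, j] <= t, 1, -1)
    return np.sign(F)
```

Usage on a toy two-class dataset with labels in {-1, +1}: `ensemble = adaboost(X, y, M=10)` followed by `adaboost_predict(ensemble, X)`.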

Toy Example

Round 1

Round 2

Round 3

Final Classifier
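The slide body is not in the transcript; the standard AdaBoost final classifier combines the M weak hypotheses h_m weighted by their coefficients α_m (matching the sketch above):

\[
H(x) = \operatorname{sign}\Big(\sum_{m=1}^{M} \alpha_m h_m(x)\Big),
\qquad
\alpha_m = \tfrac{1}{2}\ln\frac{1-\epsilon_m}{\epsilon_m},
\]

where ε_m is the weighted training error of the m-th weak learner.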

Test Error Behavior

Practical Advantages of AdaBoost
- Fast
- Simple and easy to program
- No parameters to tune (except the number of rounds M)
- Flexible: can be combined with any learning algorithm
- No prior knowledge needed about the weak learner
- Provably effective, provided we can consistently find rough rules of thumb; shift in mindset: the goal now is merely to find classifiers barely better than random guessing
- Versatile: can be used with data that is textual, numeric, discrete, etc.; has been extended to learning problems well beyond binary classification