Machine Learning Ensemble Learning: Voting, Boosting (AdaBoost)


Machine Learning Ensemble Learning: Voting, Boosting (AdaBoost)
Supervised Learning
- Classification and Regression
- K-Nearest Neighbor Classification
- Fisher's Criteria & Linear Discriminant Analysis
- Perceptron: Linearly Separable
- Multilayer Perceptron & EBP & Deep Learning, RBF Network
- Support Vector Machine
- Ensemble Learning: Voting, Boosting (AdaBoost)
Unsupervised Learning
- Principal Component Analysis
- Independent Component Analysis
- Clustering: K-means
Semi-supervised Learning & Reinforcement Learning

Rationale
Different learners use different:
- Algorithms
- Hyper-parameters
- Representations (modalities)
- Training sets
- Subproblems
No Free Lunch Theorem: there is no single algorithm that is always the most accurate. Each learner assumes a certain model (assumptions, inductive bias). Even if we fine-tune the algorithm, there are instances on which it is not accurate; another algorithm may be accurate on those. The idea of ensemble learning is to generate a group of base learners which, when combined, has higher accuracy.

Voting
Linear combination of the base learners' outputs: y = Σ_{j=1}^{L} w_j d_j, with w_j ≥ 0 and Σ_j w_j = 1.
Classification:
- Simple voting: equal weights (w_j = 1/L)
- Plurality voting: the class with the maximum number of votes is the winner
- Majority voting: the class voted by more than half of the learners wins (two-class problem)
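A minimal sketch of the voting rule above, assuming each base learner outputs a per-class score vector d_j; the function and variable names are illustrative, not from the slides:

```python
import numpy as np

def weighted_vote(scores, weights=None):
    """Combine base-learner outputs by a convex combination.

    scores  : (L, n_classes) array -- row j holds d_j, the outputs of learner j
    weights : (L,) array of non-negative weights summing to 1;
              simple voting (w_j = 1/L) if omitted
    Returns the index of the winning class (plurality vote).
    """
    scores = np.asarray(scores, dtype=float)
    L = scores.shape[0]
    if weights is None:
        weights = np.full(L, 1.0 / L)      # simple voting: equal weights
    combined = weights @ scores            # y_k = sum_j w_j d_{jk}
    return int(np.argmax(combined))

# Three learners, two classes: the second class collects more weight
print(weighted_vote([[0.9, 0.1], [0.3, 0.7], [0.2, 0.8]]))  # -> 1
```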

Voting, Bayesian perspective (Bayesian model combination): the weights w_j can be interpreted as prior probabilities of the individual models. If the outputs d_j are i.i.d. (independent, identically distributed), the bias of the combined output does not change and the variance decreases by a factor of L (see the derivation below). Suppose each base learner = the true discriminant/regression function + random noise; if the noise terms are uncorrelated with zero mean, voting averages over the noise.
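A one-line derivation of the factor-L variance reduction stated above, assuming the d_j are uncorrelated with a common mean and variance:

\[
\mathbb{E}[y] = \mathbb{E}\Big[\tfrac{1}{L}\sum_{j=1}^{L} d_j\Big] = \mathbb{E}[d_j],
\qquad
\operatorname{Var}(y) = \tfrac{1}{L^2}\sum_{j=1}^{L}\operatorname{Var}(d_j) = \tfrac{1}{L}\operatorname{Var}(d_j).
\]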

Stacking
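The body of this slide is not in the transcript. As usually described, stacking trains a combiner (meta-learner) on the outputs of the base learners, ideally using predictions made on data the base learners were not trained on. A minimal sketch with scikit-learn estimators; the dataset, model choices, and split are illustrative:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

# Toy data (illustrative only)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_base, X_meta, y_base, y_meta = train_test_split(X, y, test_size=0.5, random_state=0)

# Level 0: train the base learners on one half of the data
base_learners = [DecisionTreeClassifier(max_depth=3, random_state=0),
                 KNeighborsClassifier(n_neighbors=5)]
for clf in base_learners:
    clf.fit(X_base, y_base)

# Level 1: the meta-learner is trained on base-learner outputs
# computed on data the base learners did not see
Z_meta = np.column_stack([clf.predict_proba(X_meta)[:, 1] for clf in base_learners])
meta = LogisticRegression().fit(Z_meta, y_meta)

def stacked_predict(X_new):
    Z = np.column_stack([clf.predict_proba(X_new)[:, 1] for clf in base_learners])
    return meta.predict(Z)
```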

Bagging (bootstrap aggregating)
- Bootstrapping method: generate L training sets by sampling with replacement from the original training set, train one base learner on each, and combine them (by voting or averaging).
- Why do we want to do this? Averaging reduces variance; the expectation is taken with respect to the distribution over training sets drawn from the input distribution.
- Unstable algorithms, whose output changes significantly with small changes in the training set, profit most from bagging.
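A minimal bagging sketch following the description above: bootstrap resampling plus plurality voting over the resulting trees (the dataset, number of replicates, and tree settings are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

L = 25                                        # number of bootstrap replicates
n = len(X)
learners = []
for _ in range(L):
    idx = rng.integers(0, n, size=n)          # sample n points with replacement
    learners.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

def bagged_predict(X_new):
    votes = np.stack([clf.predict(X_new) for clf in learners])  # shape (L, n_new)
    return (votes.mean(axis=0) >= 0.5).astype(int)              # plurality vote (binary labels)
```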

Boosting

AdaBoost: Algorithm
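The algorithm itself is not reproduced in the transcript. Below is a minimal AdaBoost sketch with decision stumps as the weak learners; the stump choice and all names are illustrative. Labels must be in {-1, +1}.

```python
import numpy as np

def fit_stump(X, y, w):
    """Weighted decision stump: threshold a single feature (exhaustive search)."""
    n, d = X.shape
    best = (np.inf, 0, 0.0, 1)                     # (weighted error, feature, threshold, polarity)
    for j in range(d):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = s * np.where(X[:, j] <= t, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, t, s)
    return best

def adaboost(X, y, M=20):
    """Return a list of (alpha, stump) pairs; y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                        # start from uniform example weights
    ensemble = []
    for _ in range(M):
        err, j, t, s = fit_stump(X, y, w)
        err = max(err, 1e-10)                      # guard against division by zero
        alpha = 0.5 * np.log((1 - err) / err)      # weight of this weak learner
        pred = s * np.where(X[:, j] <= t, 1, -1)
        w *= np.exp(-alpha * y * pred)             # up-weight mistakes, down-weight correct examples
        w /= w.sum()                               # renormalize to a distribution
        ensemble.append((alpha, (j, t, s)))
    return ensemble

def adaboost_predict(ensemble, X):
    """Final classifier: sign of the alpha-weighted sum of weak-learner outputs."""
    F = np.zeros(len(X))
    for alpha, (j, t, s) in ensemble:
        F += alpha * s * np.where(X[:, j] <= t, 1, -1)
    return np.sign(F)
```

Usage on a toy two-class dataset with labels in {-1, +1}: `ensemble = adaboost(X, y, M=10)` followed by `adaboost_predict(ensemble, X)`.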

Toy Example

Round 1

Round 2

Round 3

Final Classifier
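The slide body is not in the transcript; the standard AdaBoost final classifier combines the M weak hypotheses h_m weighted by their coefficients α_m (matching the sketch above):

\[
H(x) = \operatorname{sign}\Big(\sum_{m=1}^{M} \alpha_m h_m(x)\Big),
\qquad
\alpha_m = \tfrac{1}{2}\ln\frac{1-\epsilon_m}{\epsilon_m},
\]

where ε_m is the weighted training error of the m-th weak learner.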

Test Error Behavior

Practical Advantages of AdaBoost
- Fast
- Simple and easy to program
- No parameters to tune (except the number of rounds M)
- Flexible: can be combined with any learning algorithm
- No prior knowledge needed about the weak learner
- Provably effective, provided we can consistently find rough rules of thumb; shift in mindset: the goal now is merely to find classifiers barely better than random guessing
- Versatile: can be used with data that is textual, numeric, discrete, etc.; has been extended to learning problems well beyond binary classification