Machine Learning CS 165B Spring 2012

Course outline
- Introduction (Ch. 1)
- Concept learning (Ch. 2)
- Decision trees (Ch. 3)
- Ensemble learning
- Neural Networks (Ch. 4)
- …

Schedule
- Homework 2 on decision trees will be handed out Thursday 4/19; due Wednesday 5/2
- Project choices are due by Friday 4/20
- Topic of discussion section

Projects
- Project proposals are due by Friday 4/20.
- Work in 2-person teams.
- If you want to define your own project, submit a 1-page proposal with references and ideas. It needs to have a significant machine learning component.
- You may do experimental work, theoretical work, a combination of both, or a critical survey of results in some specialized topic.
- Originality is not mandatory but is encouraged. Try to make it interesting!

Rationale for Ensemble Learning
- No Free Lunch theorem: there is no single algorithm that is always the most accurate.
- Instead, generate a group of base learners which, when combined, have higher accuracy.
- Different learners can differ in their:
  - Algorithms
  - Parameters
  - Representations (modalities)
  - Training sets
  - Subproblems

Voting
- Linear combination of the base learners' outputs: y = Σ_j w_j d_j, with w_j ≥ 0 and Σ_j w_j = 1.
- Classification: take a (weighted) vote over the class labels predicted by the base learners and output the class with the most support.
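
A minimal sketch of weighted voting over class predictions (a generic illustration, not code from the course; integer class labels are assumed):

    import numpy as np

    def weighted_vote(predictions, weights):
        """predictions: (L, n) integer class labels from L base learners;
        weights: (L,) non-negative classifier weights summing to 1."""
        n_classes = int(predictions.max()) + 1
        votes = np.zeros((n_classes, predictions.shape[1]))
        for d_j, w_j in zip(predictions, weights):
            for c in range(n_classes):
                votes[c] += w_j * (d_j == c)   # add w_j to each predicted class
        return votes.argmax(axis=0)            # class with the highest total vote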

Bagging (Bootstrap Aggregating)
- Take M bootstrap samples (sampled with replacement) from the training set.
- Train M different classifiers, one on each bootstrap sample.
- For a new query, let all classifiers predict and take the average (or majority vote).
- If the classifiers make independent errors, their ensemble can improve performance. Stated differently: the variance in the prediction is reduced (we don't suffer from the random errors that a single classifier is bound to make).
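
A minimal bagging sketch (an illustration using scikit-learn trees; the helper names are my own, and integer class labels are assumed so the majority vote can use bincount):

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def bagging_fit(X, y, M=25, seed=0):
        """Train M trees, each on a bootstrap sample of (X, y)."""
        rng = np.random.default_rng(seed)
        models = []
        for _ in range(M):
            idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
            models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
        return models

    def bagging_predict(models, X):
        """Majority vote of the M classifiers for each query point."""
        votes = np.stack([m.predict(X) for m in models])       # shape (M, n)
        return np.apply_along_axis(
            lambda col: np.bincount(col).argmax(), 0, votes)   # per-column mode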

Boosting
- Train classifiers (e.g., decision trees) in a sequence.
- Each new classifier focuses on the cases that were incorrectly classified in the last round.
- Combine the classifiers by letting them vote on the final prediction (as in bagging).
- Each classifier is "weak," but the ensemble is "strong."
- AdaBoost is a specific boosting method.

Example (figure): a vertical line is one simple classifier, saying that everything to its left is + and everything to its right is -.

Boosting Intuition
- We adaptively weight each data case: cases that are wrongly classified get high weight, so the algorithm will focus on them.
- Each boosting round learns a new (simple) classifier on the weighted dataset.
- These classifiers are weighted when combined into a single, powerful classifier: classifiers that obtain a low training error rate get a high weight.
- We stop by monitoring performance on a hold-out set (cross-validation).

Boosting in a picture (figure): a grid of boosting rounds versus training cases, highlighting correctly classified cases, an example that gets a large weight in a given round, and a decision tree that gets a strong vote.

And in animation (taken from "A Tutorial on Boosting" by Yoav Freund and Rob Schapire): the original training set starts with equal weights on all training samples.

AdaBoost example (figures): rounds 1, 2, and 3 of boosting, where ε is the (weighted) error rate of each round's classifier and α is that classifier's weight; the final figure shows the combined classifier.
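
A compact AdaBoost sketch along these lines (my own illustration using decision stumps from scikit-learn; labels are assumed to be in {-1, +1} and each round's error to satisfy 0 < ε < 1):

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def adaboost_fit(X, y, T=3):
        """y must be in {-1, +1}; returns T stumps and their weights."""
        n = len(X)
        w = np.full(n, 1.0 / n)        # start with equal weights
        stumps, alphas = [], []
        for _ in range(T):
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X, y, sample_weight=w)
            pred = stump.predict(X)
            eps = w[pred != y].sum()                  # weighted error rate
            alpha = 0.5 * np.log((1 - eps) / eps)     # classifier weight
            w *= np.exp(-alpha * y * pred)            # up-weight the mistakes
            w /= w.sum()
            stumps.append(stump)
            alphas.append(alpha)
        return stumps, np.array(alphas)

    def adaboost_predict(stumps, alphas, X):
        """Sign of the alpha-weighted vote of the stumps."""
        return np.sign(sum(a * s.predict(X) for a, s in zip(alphas, stumps)))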

Mixture of experts
- Voting where the weights are input-dependent (gating): the combined prediction is y = Σ_j w_j(x) d_j(x), with w_j(x) ≥ 0 and Σ_j w_j(x) = 1 for all x.
- Different input regions are covered by different learners (Jacobs et al., 1991).
- Gating decides which expert to use.
- We need to learn the individual experts as well as the gating functions w_j(x).
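
A sketch of the combination step only (illustrative; a softmax over linear gating scores is one common choice, and in practice the gating parameters gate_W would be learned jointly with the experts, which is not shown here):

    import numpy as np

    def softmax(Z):
        Z = Z - Z.max(axis=1, keepdims=True)      # for numerical stability
        E = np.exp(Z)
        return E / E.sum(axis=1, keepdims=True)

    def moe_predict(experts, gate_W, X):
        """experts: list of L callables mapping X of shape (n, d) to outputs (n,);
        gate_W: (d, L) gating parameters."""
        gates = softmax(X @ gate_W)                         # w_j(x); rows sum to 1
        outputs = np.column_stack([e(X) for e in experts])  # d_j(x), shape (n, L)
        return (gates * outputs).sum(axis=1)                # y = sum_j w_j(x) d_j(x)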

Stacking
- The combiner f(·) is itself another learner, trained on the outputs of the base learners (Wolpert, 1992).
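
A minimal stacking sketch (my own illustration with scikit-learn; out-of-fold predictions are used so the combiner is not trained on predictions the base learners made on their own training points):

    import numpy as np
    from sklearn.model_selection import cross_val_predict
    from sklearn.linear_model import LogisticRegression

    def stacking_fit(X, y, base_learners):
        """Train a logistic-regression combiner on out-of-fold predictions."""
        meta_X = np.column_stack(
            [cross_val_predict(m, X, y, cv=5) for m in base_learners])
        combiner = LogisticRegression().fit(meta_X, y)
        fitted = [m.fit(X, y) for m in base_learners]   # refit on all the data
        return fitted, combiner

    def stacking_predict(fitted, combiner, X):
        meta_X = np.column_stack([m.predict(X) for m in fitted])
        return combiner.predict(meta_X)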

Random Forest
- An ensemble consisting of a bagging of unpruned decision tree learners, with a randomized selection of features at each split.
- Grow many trees on datasets sampled from the original dataset with replacement (bootstrap samples):
  - Draw K bootstrap samples of a fixed size.
  - Grow a decision tree on each sample, randomly sampling a few attributes/dimensions to split on at each internal node.
- Average the predictions of the trees for a new query (or take the majority vote).
- Random forests are state-of-the-art classifiers!
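
The scikit-learn implementation of this recipe, shown on synthetic data (n_estimators plays the role of K; max_features controls the random attribute subset considered at each split):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=200, n_features=10, random_state=0)
    forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                    random_state=0)
    forest.fit(X, y)                 # each tree sees its own bootstrap sample
    print(forest.predict(X[:5]))     # majority vote across the trees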