# Boosting Rong Jin.

Inefficiency with Bagging
Bagging draws bootstrap samples D1, D2, ..., Dk from the training set D and trains one classifier (h1, h2, ..., hk) on each.
- Inefficient bootstrap sampling:
  - Every example has an equal chance of being sampled
  - No distinction between “easy” examples and “difficult” examples
- Inefficient model combination:
  - A constant weight for each classifier
  - No distinction between accurate classifiers and inaccurate classifiers
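A minimal sketch of bagging may make the two inefficiencies concrete. It assumes 1-D inputs and a threshold classifier (decision stump) as the base learner; `train_stump`, `bagging`, `predict`, and the toy data are hypothetical names and values chosen for illustration, not taken from the slides.

```python
import random
from collections import Counter

def train_stump(sample):
    """Fit a 1-D threshold classifier (decision stump) by exhaustive search."""
    best = None
    for threshold, _ in sample:
        for sign in (1, -1):
            errors = sum(1 for x, y in sample
                         if (sign if x >= threshold else -sign) != y)
            if best is None or errors < best[0]:
                best = (errors, threshold, sign)
    _, threshold, sign = best
    return lambda x: sign if x >= threshold else -sign

def bagging(data, k, rng):
    """Train k classifiers, each on its own bootstrap sample of the data."""
    classifiers = []
    for _ in range(k):
        # Uniform sampling with replacement: every example has the same
        # chance of being drawn, whether it is easy or difficult.
        sample = [rng.choice(data) for _ in data]
        classifiers.append(train_stump(sample))
    return classifiers

def predict(classifiers, x):
    # Unweighted majority vote: accurate and inaccurate classifiers
    # count equally.
    votes = Counter(h(x) for h in classifiers)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
# Hypothetical toy data: (x, y) pairs with labels in {1, -1}.
data = [(0.1, -1), (0.3, -1), (0.4, -1), (0.6, 1), (0.8, 1), (0.9, 1)]
ensemble = bagging(data, k=5, rng=rng)
predictions = [predict(ensemble, x) for x, _ in data]
```

The two commented lines are the places where boosting departs from bagging: the sampling distribution and the vote weights.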

Improve the Efficiency of Bagging
- Better sampling strategy: focus on the examples that are difficult to classify
- Better combination strategy: accurate models should be assigned larger weights

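Intuition
[Figure: a first classifier is trained on the training examples (X1, Y1), (X2, Y2), (X3, Y3), (X4, Y4) and makes mistakes on (X1, Y1) and (X3, Y3). Classifier2 is trained on those mistakes and still misclassifies (X1, Y1); Classifier3 is trained on that remaining mistake. Adding the classifiers together leaves no training mistakes, but the combination may overfit.]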

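[Figure: the boosting loop on five training examples (x1, y1), ..., (x5, y5).
D0: every example has weight 1/5. A sample is drawn from D0 (shown: (x5, y5), (x3, y3), (x1, y1)) and h1 is trained on it. h1 is then evaluated on all five examples and the weights are updated.
D1: the weights become 2/7 and 1/7. A new sample is drawn from D1 (shown: (x3, y3), (x1, y1)) and h2 is trained on it. h2 is evaluated on all five examples and the weights are updated again.
D2: the weights become 4/9, 2/9, and 1/9. Sampling continues from D2 …]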
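The weights in the figure (1/5, then 2/7 and 1/7, then 4/9, 2/9, and 1/9) are consistent with doubling the weight of every misclassified example and renormalizing. That reading is an assumption, as is the choice of which examples each classifier gets wrong; the sketch below only reproduces the arithmetic under those assumptions.

```python
from fractions import Fraction

def update_weights(weights, misclassified, factor=2):
    """Scale up the weights of misclassified examples, then renormalize."""
    scaled = [w * factor if i in misclassified else w
              for i, w in enumerate(weights)]
    total = sum(scaled)
    return [w / total for w in scaled]

# D0: uniform distribution over five training examples.
d0 = [Fraction(1, 5)] * 5

# Assumption: h1 misclassifies examples 0 and 2 (x1 and x3).
d1 = update_weights(d0, misclassified={0, 2})

# Assumption: h2 misclassifies example 0 (x1) again.
d2 = update_weights(d1, misclassified={0})

print([str(w) for w in d1])  # ['2/7', '1/7', '2/7', '1/7', '1/7']
print([str(w) for w in d2])  # ['4/9', '1/9', '2/9', '1/9', '1/9']
```

The next training sample would then be drawn according to these weights, so an example that keeps being misclassified is picked more and more often.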

How To Choose $\alpha_t$ in AdaBoost?
How to construct the best distribution $D_{t+1}(i)$:
- $D_{t+1}(i)$ should be significantly different from $D_t(i)$
- $D_{t+1}(i)$ should create a situation in which classifier $h_t$ performs poorly
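As a rough illustration of the second requirement, the sketch below applies the standard AdaBoost reweighting step, which the slide text itself does not spell out, so treat it as an assumption: the classifier weight is 0.5 * ln((1 - eps) / eps) for weighted error eps, and each example's weight is multiplied by exp(-alpha * y * h(x)). The function name `adaboost_update` and the toy labels and predictions are hypothetical.

```python
import math

def adaboost_update(dist, labels, preds):
    """One AdaBoost reweighting step; returns (alpha, new_distribution).

    Assumes the weighted error is strictly between 0 and 1.
    """
    # Weighted error of the current classifier under the current distribution.
    eps = sum(d for d, y, p in zip(dist, labels, preds) if y != p)
    alpha = 0.5 * math.log((1 - eps) / eps)
    # Mistakes (y * p == -1) are scaled up, correct predictions scaled down.
    unnorm = [d * math.exp(-alpha * y * p)
              for d, y, p in zip(dist, labels, preds)]
    z = sum(unnorm)
    return alpha, [u / z for u in unnorm]

labels = [1, 1, -1, -1, 1]
preds = [1, -1, -1, -1, 1]   # hypothetical classifier output: one mistake
dist = [0.2] * 5             # current distribution: uniform

alpha, new_dist = adaboost_update(dist, labels, preds)

# Weighted error of the same classifier under the new distribution.
new_error = sum(d for d, y, p in zip(new_dist, labels, preds) if y != p)
```

Under this update the old classifier's weighted error on the new distribution comes out at 1/2, which is chance level: the new distribution is exactly the situation in which that classifier performs poorly.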

Optimization View for Choosing $\alpha_t$
- $h_t(x): x \to \{1, -1\}$, a base (weak) classifier
- $H_T(x) = \sum_{t=1}^{T} \alpha_t h_t(x)$, a linear combination of base classifiers
- Goal: minimize the training error
- Approximate the error with an exponential function
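One common way to write this approximation is sketched below. It assumes $N$ training examples $(x_i, y_i)$ with $y_i \in \{1, -1\}$; the symbol $N$, the name $\mathrm{err}$, and the indicator notation are introduced here for illustration and do not come from the slides.

```latex
\[
\mathrm{err}(H_T)
  = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\!\left[ y_i H_T(x_i) \le 0 \right]
  \le \frac{1}{N} \sum_{i=1}^{N} \exp\!\left( -y_i H_T(x_i) \right)
\]
```

The bound holds term by term: $\exp(-z) \ge 1$ whenever $z \le 0$, and $\exp(-z) > 0$ otherwise. Unlike the 0/1 error, the exponential upper bound is smooth in the classifier weights, which is what makes it convenient to minimize.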

Fix $H_{T-1}(x)$, and solve for $h_T(x)$ and $\alpha_T$
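A sketch of how this step can be worked out under the exponential loss, writing $H_T(x) = H_{T-1}(x) + \alpha_T h_T(x)$. The notation is an assumption added for illustration: $N$ is the number of training examples $(x_i, y_i)$, $D_T(i)$ is the normalized weight proportional to $\exp(-y_i H_{T-1}(x_i))$, and $\epsilon_T$ is the weighted error of $h_T$ under $D_T$.

```latex
\begin{align*}
\sum_{i=1}^{N} \exp\bigl(-y_i H_T(x_i)\bigr)
  &= \sum_{i=1}^{N} \exp\bigl(-y_i H_{T-1}(x_i)\bigr)\,
     \exp\bigl(-\alpha_T y_i h_T(x_i)\bigr) \\
  &\propto \sum_{i=1}^{N} D_T(i)\, \exp\bigl(-\alpha_T y_i h_T(x_i)\bigr)
   = (1-\epsilon_T)\, e^{-\alpha_T} + \epsilon_T\, e^{\alpha_T}, \\
\frac{\partial}{\partial \alpha_T}
  \Bigl[(1-\epsilon_T)\, e^{-\alpha_T} + \epsilon_T\, e^{\alpha_T}\Bigr] = 0
  &\;\Longrightarrow\;
  \alpha_T = \frac{1}{2} \ln \frac{1-\epsilon_T}{\epsilon_T}.
\end{align*}
```

For any fixed $\alpha_T > 0$ the expression grows with $\epsilon_T$, so $h_T$ is the base classifier with the smallest weighted error under $D_T$, and a more accurate $h_T$ receives a larger $\alpha_T$.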