# A Statistician’s Games * : Bootstrap, Bagging and Boosting * Please refer to “Game theory, on-line prediction and boosting” by Y. Freund and R. Schapire,


A Statistician’s Games*: Bootstrap, Bagging and Boosting. (* Please refer to “Game theory, on-line prediction and boosting” by Y. Freund and R. Schapire, Proceedings of the 9th Conference on Computational Learning Theory.) Yaochu Jin, Future Technology Research, Honda R&D Europe (Germany), March 21, 2000.

Bootstrap -- Problem Description. The bootstrap was introduced as a general method for assessing the statistical accuracy of an estimator. Given data: $x = (x_1, \dots, x_n)$. Have an estimator: $\hat{\theta} = s(x)$. How can the accuracy of $\hat{\theta}$ be assessed?

Bootstrap -- the Idea Bootstrap estimate of the standard error:
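The formula on this slide did not survive in the transcript. A commonly used form of the bootstrap standard error, given here as an assumption rather than the slide's exact notation, is $\widehat{se}_B = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}(\hat{\theta}^*_b - \bar{\theta}^*)^2}$, where $\hat{\theta}^*_b = s(x^*_b)$ is the statistic evaluated on the $b$-th bootstrap sample and $\bar{\theta}^*$ is the mean of the $B$ replicates. A minimal Python sketch of that procedure follows; the function name `bootstrap_se` and the toy data are illustrative only.

```python
import random
import statistics

def bootstrap_se(x, s, B=1000, seed=0):
    """Bootstrap estimate of the standard error of the statistic s(x)."""
    rng = random.Random(seed)
    n = len(x)
    replicates = []
    for _ in range(B):
        # One bootstrap sample: n draws from x with replacement.
        sample = [x[rng.randrange(n)] for _ in range(n)]
        replicates.append(s(sample))
    # Sample standard deviation of the B replicates (divisor B - 1).
    return statistics.stdev(replicates)

# Toy data (an assumption for illustration): 100 standard normal draws.
data_rng = random.Random(42)
x = [data_rng.gauss(0.0, 1.0) for _ in range(100)]

se_boot = bootstrap_se(x, statistics.mean)
# For the sample mean, the textbook formula s / sqrt(n) gives a reference value.
se_formula = statistics.stdev(x) / len(x) ** 0.5
print(se_boot, se_formula)
```

For the sample mean the two numbers should be close, which makes this a convenient sanity check. The bootstrap version is useful for statistics such as the median, where no simple closed-form standard error exists.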

Bootstrap -- Pros and Cons. Easy to implement. Needs a large number of independent bootstrap samples (B >= 1000). The uncertainty of the bootstrap estimate itself can be assessed by: 1) Jackknife-after-Bootstrap (JAB); 2) Weighted JAB.

Bagging is Not Related to Begging. Bagging uses bootstrap techniques to improve the estimator. Bagging -- Bootstrap aggregating.

Bagging -- the Idea. The final estimate: $\hat{\theta} = (\hat{\theta}_1 + \hat{\theta}_2 + \dots + \hat{\theta}_B)/B$
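The averaging step can be sketched in Python as follows. Each $\hat{\theta}_b$ is taken to be a model fitted on the $b$-th bootstrap sample of the training data. The base learner here is a one-split regression stump, chosen only because it is a simple unstable procedure. The names `fit_stump` and `bagging` and the toy data are assumptions for illustration, not part of the original slides.

```python
import random

def fit_stump(xs, ys):
    """Least-squares regression stump: one threshold, one constant per side."""
    best = None
    for t in sorted(set(xs))[1:]:  # every split leaves both sides non-empty
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # degenerate sample: all x values identical
        m = sum(ys) / len(ys)
        return lambda x: m
    _, t, ml, mr = best
    return lambda x: ml if x < t else mr

def bagging(xs, ys, fit, B=50, seed=0):
    """Fit one model per bootstrap sample and average their predictions."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit([xs[i] for i in idx], [ys[i] for i in idx]))
    # Final estimate: (model_1(x) + ... + model_B(x)) / B
    return lambda x: sum(m(x) for m in models) / B

# Toy data (illustrative assumption): noisy linear trend on [0, 1].
noise = random.Random(1)
xs = [i / 39 for i in range(40)]
ys = [x + noise.gauss(0.0, 0.1) for x in xs]

bagged = bagging(xs, ys, fit_stump)
print(bagged(0.0), bagged(0.5), bagged(1.0))
```

A single stump predicts only two distinct values. The bagged predictor averages stumps whose thresholds differ from one bootstrap sample to the next, which smooths the step and is one way to see the variance reduction.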

Bagging -- Pros and Cons. The estimator can be significantly improved if the learning algorithm is unstable. Can degrade the performance of stable procedures. Reduces the variance; the bias is unchanged.

Adaptive Bagging. Reduces both variance and bias.

Boosting. To boost a “weak” learning algorithm into a “strong” learning algorithm. A weak learning algorithm can be an inaccurate rule of thumb that is only slightly better than random guessing.

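AdaBoost. Initialize the distribution: $D_1(i) = 1/n$. Calculate the error $\varepsilon_t$. Choose the weight $\alpha_t = \frac{1}{2}\ln\left(\frac{1-\varepsilon_t}{\varepsilon_t}\right)$. Update the distribution. The final estimate: $\hat{\theta} = (\alpha_1\hat{\theta}_1 + \alpha_2\hat{\theta}_2 + \dots + \alpha_B\hat{\theta}_B)/B$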
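These steps can be sketched in Python for binary labels in {-1, +1}. The slide does not spell out the distribution update; the sketch assumes the standard AdaBoost rule $D_{t+1}(i) \propto D_t(i)\exp(-\alpha_t y_i h_t(x_i))$ and takes the sign of the weighted vote as the final classifier (dividing by B, as on the slide, would not change that sign). The threshold-stump weak learner, the function names and the toy data are illustrative assumptions.

```python
import math

def best_stump(xs, ys, D):
    """Weak learner: threshold stump with the smallest weighted error under D."""
    best = None
    for thr in xs:
        for sign in (1, -1):
            # The stump predicts `sign` for x >= thr and `-sign` otherwise.
            err = sum(d for d, x, y in zip(D, xs, ys)
                      if (sign if x >= thr else -sign) != y)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best

def adaboost(xs, ys, T=60):
    n = len(xs)
    D = [1.0 / n] * n                      # D_1(i) = 1/n
    ensemble = []
    for _ in range(T):
        eps, thr, sign = best_stump(xs, ys, D)   # weighted error eps_t
        if eps >= 0.5:                     # no better than random: stop
            break
        eps = max(eps, 1e-12)              # guard against log of infinity
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, thr, sign))
        # Increase the weight of misclassified points, decrease the rest.
        D = [d * math.exp(-alpha * y * (sign if x >= thr else -sign))
             for d, x, y in zip(D, xs, ys)]
        Z = sum(D)
        D = [d / Z for d in D]             # renormalize to a distribution
    return ensemble

def predict(ensemble, x):
    score = sum(a * (s if x >= thr else -s) for a, thr, s in ensemble)
    return 1 if score >= 0 else -1

# Toy data (illustrative assumption): +1 inside an interval, -1 outside.
# No single stump can represent this, but a weighted vote of stumps can.
xs = [i / 20 for i in range(20)]
ys = [1 if 6 <= i <= 14 else -1 for i in range(20)]

ensemble = adaboost(xs, ys)
train_errors = sum(predict(ensemble, x) != y for x, y in zip(xs, ys))
print(len(ensemble), train_errors)
```

The interval target is a convenient test case because the best single stump still misclassifies a sizeable fraction of the points, so any improvement comes from the reweighting and the weighted vote.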

AdaBoost -- Pros and Cons. Reduces both variance and bias. Needs a large number of estimators (B >= 1000). Sensitive to noise. Theoretical guarantee (maximizes the likelihood). Easy to implement (compared to Bayesian methods). Related to Support Vector Machines.

Further Information on B^3 (Bootstrap, Bagging, Boosting): http://www-stat.stanford.edu/~tibs/ ; ftp://ftp.stat.berkeley.edu/pub/users/breiman/ ; http://www.research.att.com/~yoav/ ; http://www.research.att.com/~schapire/

