
A Statistician’s Games * : Bootstrap, Bagging and Boosting * Please refer to “Game theory, on-line prediction and boosting” by Y. Freund and R. Schapire,


1 A Statistician’s Games * : Bootstrap, Bagging and Boosting * Please refer to “Game theory, on-line prediction and boosting” by Y. Freund and R. Schapire, Proceedings of 9th Conference on Computational Learning Theory. Yaochu Jin Future Technology Research Honda R&D Europe (Germany) March 21, 2000

2 Bootstrap -- Problem Description The bootstrap was introduced as a general method for assessing the statistical accuracy of an estimator. Given data: $x = (x_1, \dots, x_n)$. Have an estimator: $\hat{\theta} = s(x)$. How to assess the accuracy of $\hat{\theta}$?

3 Bootstrap -- the Idea Bootstrap estimate of the standard error:
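The usual form of this estimate, assumed here to match the slide, draws $B$ bootstrap samples $x^{*1}, \dots, x^{*B}$ of size $n$ with replacement from $x$, evaluates $\hat{\theta}^*(b) = s(x^{*b})$ on each, and takes the sample standard deviation of the replicates: $\widehat{se}_B = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\bigl(\hat{\theta}^*(b) - \bar{\theta}^*\bigr)^2}$ with $\bar{\theta}^* = \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}^*(b)$. A minimal sketch follows; the function name `bootstrap_se`, the seed, and the data are made up for illustration and are not from the slides.

```python
import random
import statistics


def bootstrap_se(x, s, B=1000, seed=0):
    """Bootstrap estimate of the standard error of the estimator s(x)."""
    rng = random.Random(seed)
    n = len(x)
    # Draw B bootstrap samples (size n, with replacement) and re-apply s to each.
    replicates = [s([x[rng.randrange(n)] for _ in range(n)]) for _ in range(B)]
    # Sample standard deviation of the replicates (divides by B - 1).
    return statistics.stdev(replicates)


# Made-up data; the estimator here is the sample mean.
data = [2.1, 3.4, 1.9, 5.6, 4.4, 3.8, 2.7, 4.9, 3.1, 3.6]
se_mean = bootstrap_se(data, statistics.mean)
```

For the sample mean the result can be checked against the textbook value `statistics.stdev(data) / len(data) ** 0.5`; the two should be close but not identical.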

4 Bootstrap -- Pros and Cons Easy to implement. Needs a large number of independent bootstrap samples (B >= 1000). Uncertainty of the estimate: 1) Jackknife-after-Bootstrap (JAB), 2) Weighted JAB.

5 Bagging is Not Related to Begging Using bootstrap techniques to improve the estimator. Bagging -- Bootstrap aggregating.

6 Bagging -- the Idea The final estimate: $\hat{\theta} = (\hat{\theta}_1 + \hat{\theta}_2 + \dots + \hat{\theta}_B)/B$
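The averaging step can be sketched as follows, assuming each of the B estimators is obtained by refitting the same learner on a bootstrap sample of the training data. The learner (`fit_stump`, a one-split regression stump), the `bagging` helper, and the data are hypothetical choices for illustration, not taken from the slides.

```python
import random


def fit_stump(points):
    """Fit a 1-D regression stump: one threshold, a mean on each side."""
    points = sorted(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    best = None
    for i in range(1, len(points)):
        if xs[i] == xs[i - 1]:
            continue  # cannot split between identical x values
        left, right = ys[:i], ys[i:]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, (xs[i - 1] + xs[i]) / 2, ml, mr)
    if best is None:  # all x identical: fall back to a constant predictor
        m = sum(ys) / len(ys)
        return lambda x: m
    _, thr, ml, mr = best
    return lambda x: ml if x <= thr else mr


def bagging(points, fit, B=50, seed=0):
    """Refit on B bootstrap samples and average the B predictions."""
    rng = random.Random(seed)
    n = len(points)
    models = [fit([points[rng.randrange(n)] for _ in range(n)]) for _ in range(B)]
    return lambda x: sum(m(x) for m in models) / B


# Made-up data: y = x^2 on a grid of 20 points.
points = [(i / 10, (i / 10) ** 2) for i in range(20)]
bagged = bagging(points, fit_stump, B=50)
```

A single stump can only output two values; the bagged predictor averages stumps with different thresholds, so its output varies more smoothly with `x`.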

7 Bagging -- Pros and Cons The estimator can be significantly improved if the learning algorithm is unstable. Can degrade the performance of stable procedures. Reduces the variance; the bias is unchanged.

8 Adaptive Bagging Reduce both variance and bias

9 Boosting To boost a “weak” learning algorithm into a “strong” learning algorithm. A weak learning algorithm can be an inaccurate rule of thumb that is only slightly better than random guessing.

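10 AdaBoost Initialize the distribution $D_1(i) = 1/n$. Calculate the error $\epsilon_t$. Choose the weight $\alpha_t = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}$. Update the distribution. The final estimate: $\hat{\theta} = (\alpha_1\hat{\theta}_1 + \alpha_2\hat{\theta}_2 + \dots + \alpha_B\hat{\theta}_B)/B$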
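A minimal sketch of these steps for binary labels $y_i \in \{-1, +1\}$ follows. The slide does not spell out the update rule; the sketch assumes the standard AdaBoost update $D_{t+1}(i) = D_t(i)\exp(-\alpha_t y_i h_t(x_i))/Z_t$, where $h_t$ is the weak hypothesis of round $t$ and $Z_t$ normalizes the weights. The weak learner (a threshold stump), the function names, and the data are hypothetical choices for illustration.

```python
import math


def stump(x, thr, sign):
    """Weak hypothesis: predicts `sign` above the threshold, `-sign` otherwise."""
    return sign if x > thr else -sign


def train_adaboost(xs, ys, B=3):
    n = len(xs)
    D = [1.0 / n] * n  # D_1(i) = 1/n
    vals = sorted(set(xs))
    # Candidate thresholds: one below all points, then the midpoints.
    cands = [vals[0] - 1.0] + [(a + b) / 2 for a, b in zip(vals, vals[1:])]
    ensemble = []
    for _ in range(B):
        # Weak learner: the stump with the smallest weighted error eps_t.
        best = None
        for thr in cands:
            for sign in (1, -1):
                err = sum(d for x, y, d in zip(xs, ys, D) if stump(x, thr, sign) != y)
                if best is None or err < best[0]:
                    best = (err, thr, sign)
        eps, thr, sign = best
        eps = min(max(eps, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, thr, sign))
        # Assumed standard update: raise weights of misclassified points.
        D = [d * math.exp(-alpha * y * stump(x, thr, sign)) for x, y, d in zip(xs, ys, D)]
        Z = sum(D)
        D = [d / Z for d in D]
    return ensemble


def predict(ensemble, x):
    # Weighted vote; dividing by B as on the slide does not change the sign.
    score = sum(a * stump(x, thr, s) for a, thr, s in ensemble) / len(ensemble)
    return 1 if score >= 0 else -1


# Made-up data that no single stump classifies correctly.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, 1]
model = train_adaboost(xs, ys, B=3)
train_errors = sum(predict(model, x) != y for x, y in zip(xs, ys))
```

On this toy data the best single stump misclassifies three points, while three boosting rounds are enough to fit the training set.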

11 AdaBoost -- Pros and Cons Reduces both variance and bias. Needs a large number of estimators (B >= 1000). Sensitive to noise. Theoretical guarantee (maximizes the likelihood). Easy to implement (compared to Bayesian methods). Relation to Support Vector Machines.

12 Further Information on B^3 http://www-stat.stanford.edu/~tibs/ ftp://ftp.stat.berkeley.edu/pub/users/breiman/ http://www.research.att.com/~yoav/ http://www.research.att.com/~schapire/

