Report #1 By Team: Green Ensemble AusDM 2009 ENSEMBLE Analytical Challenge: Rules, Objectives, and Our Approach.


1 Report #1 By Team: Green Ensemble AusDM 2009 ENSEMBLE Analytical Challenge: Rules, Objectives, and Our Approach

2 What is it all about? The purpose of this challenge is to combine the individual models in some way to give the best overall model performance. We call this Ensembling.

3 What do we mean by Ensembling? Ensembling, Blending, and Committee of Experts are various terms for the process of improving predictive accuracy by combining models built with different algorithms, or with the same algorithm but different parameter settings. It is a technique frequently used to win predictive modelling competitions, but how it is actually achieved in practice may be somewhat arbitrary.

4 Why Ensembling? Remember the NETFLIX prize? Over 1,000 sets of predictions were provided. Taking the mean prediction over all these models is only slightly worse than the best individual model, and the mean of the best 10 models is significantly better than any individual model.

5 That’s why we are after Ensembling!

6 Data (1/2) RMSE Small - 200 sets of predictions for 15,000 ratings. AUC Small - 200 sets of predictions for 15,000 ratings. RMSE Medium - 250 sets of predictions for 20,000 ratings. AUC Medium - 250 sets of predictions for 20,000 ratings. RMSE Large - 1,151 sets of predictions for 50,000 ratings. AUC Large - 1,151 sets of predictions for 50,000 ratings.

7 Data (2/2) The predicted rating values have been converted to integers by rounding to 3 decimal places and multiplying by 1,000, so 1000 < Prediction < 5000. Targets = {1000, 2000, 3000, 4000, 5000} for the RMSE challenge; Targets = {-1, 1} for the AUC challenge. Each data set is split into 2 files, one for Training (Target/Rating provided) and one for Scoring (Target/Rating withheld).
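As a small illustration of the encoding described above (the `encode` helper is our own sketch, not part of the challenge materials):

```python
def encode(prediction):
    # Equivalent to rounding to 3 decimal places and then multiplying by
    # 1,000, done in one step so floating-point error cannot truncate the
    # resulting integer (e.g. 3.457 * 1000 may evaluate to 3456.999...).
    return int(round(prediction * 1000))

print(encode(3.456789))  # 3457
print(encode(1.0))       # 1000, the lower end of the prediction range
```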

8 Our First Approach: Weighted Averaging (1/3) A: a matrix of 200 models' predictions for 15,000 movie ratings. Target: the 15,000 real movie ratings. How do we find good weights? Our approach is to find a vector w such that Aw ≈ Target.

9 Weighted Averaging (2/3) A is not square, so we must use the pseudo-inverse of A. It's easy in MATLAB: w = A\target. w is the least-squares solution of the previous equation. The formal mathematical problem: w = argmin_w ||Aw - target||_2.
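The same least-squares step can be sketched in NumPy (a minimal sketch on synthetic data standing in for the 15,000 x 200 challenge matrix; the sizes and noise level here are our own assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n_ratings, n_models = 100, 20  # small stand-ins for 15,000 ratings x 200 models

# Synthetic targets drawn from the challenge's rating values
target = rng.choice([1000, 2000, 3000, 4000, 5000], size=n_ratings).astype(float)
# Each column is one model: the target plus independent noise
A = target[:, None] + rng.normal(0, 500, size=(n_ratings, n_models))

# Least-squares weights: the NumPy equivalent of MATLAB's w = A\target
w, *_ = np.linalg.lstsq(A, target, rcond=None)

blend = A @ w  # blended prediction on the training set
train_rmse = np.sqrt(np.mean((blend - target) ** 2))
print(train_rmse)
```

Note that, as the next slide points out, this RMSE is measured on the same data the weights were fit on, so it is optimistic.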

10 Weighted Averaging (3/3) The result is above the baselines, which is good for us! RMSE = 882.81593589302. The problem is that it is heavily overfitted to the Train data set. Also, we don't use any information from the Test data set; we just multiply the test matrix by w and get the results.

11 Our Second Approach: Ensemble Selection (1/3) Implemented from this paper: Ensemble Selection from Libraries of Models [R. Caruana, A. Niculescu-Mizil, Proceedings of ICML'04]. The winning team of the KDD Cup (Orange challenge), an IBM team, also used this method. It is just like weighted averaging, but the weights are found by a hill-climbing search for models that improve RMSE.

12 Ensemble Selection (2/3) The ensemble selection procedure:
1. Start with an ensemble initialized with the N best models (N ≈ 5-25).
2. Select a random bag of models from the library. Add to the ensemble the model in the bag that maximizes the ensemble's performance on the error metric, measured on a hillclimb (validation) set.
3. Repeat step 2 for a fixed number of iterations, or until all the models have been used.
4. Return the ensemble from the nested set of ensembles that has maximum performance on the hillclimb (validation) set.
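The procedure above can be sketched as follows (a hedged sketch of Caruana-style greedy selection on synthetic data; the function names, bag size, and iteration count are our own choices, and for brevity the ensemble is scored on the same set it is built on rather than a separate hillclimb set):

```python
import numpy as np

def rmse(pred, target):
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def ensemble_selection(preds, target, n_init=3, n_iter=20, rng=None):
    """preds: (n_models, n_points) array of model predictions.
    Returns (chosen model indices, their averaged-prediction RMSE)."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Step 1: initialize with the N individually best models
    scores = [rmse(p, target) for p in preds]
    ensemble = list(np.argsort(scores)[:n_init])
    best, best_rmse = ensemble.copy(), rmse(preds[ensemble].mean(axis=0), target)
    for _ in range(n_iter):
        # Step 2: random bag; greedily add the bag member that helps most
        bag = rng.choice(len(preds), size=max(1, len(preds) // 2), replace=False)
        cand = min(bag, key=lambda i: rmse(preds[ensemble + [i]].mean(axis=0), target))
        ensemble.append(cand)  # selection with replacement: repeats allowed
        cur = rmse(preds[ensemble].mean(axis=0), target)
        # Step 4: remember the best ensemble seen in the nested sequence
        if cur < best_rmse:
            best, best_rmse = ensemble.copy(), cur
    return best, best_rmse

rng = np.random.default_rng(1)
target = rng.normal(size=200)
preds = target[None, :] + rng.normal(0, 1.0, size=(10, 200))  # 10 noisy models
chosen, score = ensemble_selection(preds, target)
print(len(chosen), score)
```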

13 Ensemble Selection (3/3) It's a fast search, but there are better searches than simple hill climbing! RMSE slightly improved: RMSE = 882.31317521497. The authors argued that their method performs better than many methods such as SVM, ANN, BAG-DT, KNN, and BST-DT. We compared Ensemble Selection and an ANN on our problem. They were right!

14 Some Statistics about Data set

15 How good are these 200 models? Predicting 1000s

16 How good are these 200 models? Predicting 2000s

17 How good are these 200 models? Predicting 3000s

18 How good are these 200 models? Predicting 4000s

19 How good are these 200 models? Predicting 5000s

20 Noise: difference from target

21

22 Some other ideas: Discretize the predictions, then find frequent motifs in each set of movies rated 1000 to 5000. Use a GA to search for better weights. Use estimation theory (the noises are Gaussian or semi-Gaussian). Use some metric on each row of the test data set to determine its distance to selected rows of the training data sets. Metrics could be RMSE, KL divergence, cos(θ), …
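The row-distance idea can be sketched like this (the metric names come from the slide; the helper functions, the treatment of rows as unnormalized distributions for KL, and the toy data are our own assumptions):

```python
import numpy as np

def rmse_dist(u, v):
    return float(np.sqrt(np.mean((u - v) ** 2)))

def cosine_dist(u, v):
    # 1 - cos(theta): 0 when the rows point the same way
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def kl_divergence(u, v, eps=1e-12):
    # Treat the (positive) prediction rows as unnormalized distributions
    p = u / u.sum()
    q = v / v.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def nearest_training_row(test_row, train_rows, metric=rmse_dist):
    # Index of the training row closest to this test row under the metric
    return int(min(range(len(train_rows)),
                   key=lambda i: metric(test_row, train_rows[i])))

train = np.array([[1000.0, 2000.0, 3000.0],
                  [4000.0, 5000.0, 4000.0]])
test_row = np.array([1100.0, 2100.0, 2900.0])
print(nearest_training_row(test_row, train, rmse_dist))  # 0
```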

23 Green Ensemble: E. Khoddam Mohammadi, M. J. Mahzoon, A. Askari, A. Ghaffari Nejad

24 Thanks for your attention.

