1 Ensembles of Nearest Neighbor Forecasts
Dragomir Yankov, Eamonn Keogh – Dept. of Computer Science & Eng., University of California, Riverside
Dennis DeCoste – Yahoo! Research

2 Outline
Problem formulation
NN forecasting framework
Stability of the forecasts
Ensembles of NN forecasts
Experimental evaluation

3 Problem formulation
Predict the number of impressions to be observed for a specific website
Data specifics – many patterns present in the data

4 Forecasting framework – overview

5 Forecasting framework – formalization
Formalization
– Direct forecasts: given a query and its k nearest neighbors, estimate the query continuation (sketched below)
– Other approaches: iterative forecasts, mutually validating forecasts
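A minimal sketch of the direct-forecast step, not the authors' implementation: a query window of length m is matched against all historical windows, and the continuations of its k nearest neighbors are averaged with uniform weights. Plain Euclidean distance is used here for brevity (the standardized distance of the next slide can be substituted); the function and variable names are illustrative.

```python
import numpy as np

def knn_direct_forecast(series, query, h, k=10):
    """Forecast the next h points of `query` by averaging the
    continuations of its k nearest windows found in `series`."""
    m = len(query)
    # Build all (window, continuation) pairs available in the history.
    windows, continuations = [], []
    for start in range(len(series) - m - h + 1):
        windows.append(series[start:start + m])
        continuations.append(series[start + m:start + m + h])
    windows = np.asarray(windows, dtype=float)
    continuations = np.asarray(continuations, dtype=float)
    # Distance from the query to every candidate window.
    dists = np.linalg.norm(windows - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(dists)[:k]
    # Uniform weights over the k nearest continuations.
    return continuations[nearest].mean(axis=0)
```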

6 Forecasting framework – components
Similarity measure
– Standardized Euclidean distance: $d(q, c) = \sqrt{\sum_{i=1}^{m} (q_i - c_i)^2 / s_i^2}$, where $s_i$ is the standard deviation of the i-th coordinate across the training windows
Prediction accuracy
– Prediction root mean square error: $\mathrm{RMSE} = \sqrt{\frac{1}{h} \sum_{t=1}^{h} (\hat{y}_t - y_t)^2}$ over the forecast horizon $h$
Weighting function – uniform weights
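A short sketch of these two components under the standard definitions (not taken from the paper's code); the argument names are illustrative:

```python
import numpy as np

def standardized_euclidean(q, c, s):
    """Distance between query q and candidate window c, with s the
    per-coordinate standard deviations estimated from the training windows."""
    q, c, s = np.asarray(q, float), np.asarray(c, float), np.asarray(s, float)
    return np.sqrt(np.sum(((q - c) / s) ** 2))

def rmse(forecast, actual):
    """Prediction root mean square error over the forecast horizon."""
    forecast, actual = np.asarray(forecast, float), np.asarray(actual, float)
    return np.sqrt(np.mean((forecast - actual) ** 2))
```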

7 Stability of the forecasts
Stability with respect to the training data
– NN is stable in the case of classification and majority voting (Breiman '96)
– Here – extrapolation plus regression; changing one neighbor can change the forecast significantly
Stability with respect to the input parameters
– Parameters: k, weights of different neighbors, query length, prediction horizon
– Different combinations lead to different forecasts

8 Ensembles of NN forecasts
Main idea: rather than tuning the best parameters for the entire dataset, for each query select the model that will predict it best
Issues
– What base models to use
– How to select among them

9 Ensembles of NN forecasts
Base models to use
– We focus on pairs of NN learners, in which the base models differ in the number of neighbors used
– The optimal single predictors and the suitable ensembles are determined on a validation set using an oracle
[Table – columns: k, RMSE (k-NN), (k1, k2), RMSE (Ens); ensemble pairs shown: (1, 20), (2, 40), (6, 1), (10, 1), (100, 1); the optimal single predictor and the optimal ensemble (using the oracle) are highlighted; the numeric values are not preserved in this transcript]
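A sketch of how the oracle can be evaluated on a validation set: for every query it picks whichever of the two base learners is closer to the truth, giving an upper bound on what a learned selector could achieve. Here `forecast_k(query, k)` stands in for a k-NN forecaster such as the one sketched earlier; the names are illustrative.

```python
import numpy as np

def oracle_rmse(queries, actuals, forecast_k, k1, k2):
    """Mean per-query RMSE when an oracle always chooses the better
    of the k1-NN and k2-NN forecasts."""
    errors = []
    for query, actual in zip(queries, actuals):
        actual = np.asarray(actual, float)
        e1 = np.sqrt(np.mean((np.asarray(forecast_k(query, k1), float) - actual) ** 2))
        e2 = np.sqrt(np.mean((np.asarray(forecast_k(query, k2), float) - actual) ** 2))
        errors.append(min(e1, e2))  # the oracle picks the better model per query
    return float(np.mean(errors))
```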

10 Ensembles of NN forecasts
Selecting among the base models:
– Learn a classifier to select the more suitable model for individual queries (SVM with Gaussian kernel; see the sketch below)
Note: the classifier does not need to be perfect. It is important to identify the "bad" cases for each base learner
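A minimal sketch of this selection step, assuming scikit-learn is available. `X` holds one feature vector per training query (features as on the next slide) and the label marks which base learner gave the smaller error on that query; everything here is an illustrative reconstruction, not the authors' code.

```python
import numpy as np
from sklearn.svm import SVC

def fit_selector(X, errors_model1, errors_model2):
    """Train an RBF (Gaussian-kernel) SVM that predicts, per query,
    which of the two base NN learners to trust."""
    # Label 1 if model 2 was better on that query, else 0.
    y = (np.asarray(errors_model2) < np.asarray(errors_model1)).astype(int)
    selector = SVC(kernel="rbf", C=1.0, gamma="scale")
    selector.fit(X, y)
    return selector

# At forecast time: predicted label 0 -> use model 1's forecast,
# predicted label 1 -> use model 2's forecast.
```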

11 Ensembles of NN forecasts
Selecting among the base models:
– Extracted features:
  Statistics from the query and its nearest neighbors: mean, median, variance, amplitude
  Statistics from the models' forecasts: mean, median, variance, amplitude
  Distances between the forecasts of the individual neighbors
  Performance of the models on the query's nearest neighbors
  Step-back forecasts (good for short horizons)
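An illustrative feature extractor covering part of the list above (the simple statistics plus a single distance between the two competing forecasts); the exact feature set and names here are assumptions, not the paper's specification.

```python
import numpy as np

def simple_stats(x):
    """Mean, median, variance and amplitude (max - min) of a series."""
    x = np.asarray(x, dtype=float)
    return [x.mean(), float(np.median(x)), x.var(), float(x.max() - x.min())]

def query_features(query, neighbors, forecast1, forecast2):
    """Feature vector for one query: statistics of the query, of each of its
    nearest neighbors, and of the two base models' forecasts, plus the
    distance between the two competing forecasts."""
    feats = simple_stats(query)
    for nb in neighbors:
        feats += simple_stats(nb)
    feats += simple_stats(forecast1) + simple_stats(forecast2)
    feats.append(float(np.linalg.norm(np.asarray(forecast1, float) -
                                      np.asarray(forecast2, float))))
    return np.asarray(feats)
```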

12 Experimental evaluation Website impressions

13 Experimental evaluation
Website impressions
– Computing the optimal single predictors
– Comparison with the accuracy of the ensemble approach
[Table – columns: Horizon, Predictor, Test RMSE, Std; rows compare the optimal single predictor with the ensemble Ens = {10-NN, 1-NN} at h = 30 (10-NN optimal k), h = 60 (8-NN optimal k) and h = 100 (6-NN optimal k); the numeric values are not preserved in this transcript]

14 Experimental evaluation Website impressions

15 Experimental evaluation
Bias-variance improvement
– We compute the bias² and variance terms in the error decomposition for h = 100 steps ahead
– The statistics are recorded over 50 random subsamples from the original training set
[Table – columns: Predictor, Bias², Variance; rows: 6-NN (optimal k) and Ens = {10-NN, 1-NN}; the numeric values are not preserved in this transcript]
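A sketch of the described estimate under the usual squared-loss decomposition: refit the forecaster on repeated random subsamples of the training set and measure, for a fixed test query, the squared deviation of the average forecast from the truth (bias²) and the spread of the forecasts (variance). `fit_and_forecast` stands in for any of the forecasters above; the names and the 80% subsample fraction are assumptions.

```python
import numpy as np

def bias_variance(train_set, query, actual, fit_and_forecast,
                  n_rounds=50, frac=0.8, seed=0):
    """Estimate bias^2 and variance of a forecaster over random subsamples."""
    rng = np.random.default_rng(seed)
    forecasts = []
    for _ in range(n_rounds):
        idx = rng.choice(len(train_set), int(frac * len(train_set)), replace=False)
        forecasts.append(fit_and_forecast([train_set[i] for i in idx], query))
    forecasts = np.asarray(forecasts, dtype=float)
    mean_forecast = forecasts.mean(axis=0)
    bias_sq = float(np.mean((mean_forecast - np.asarray(actual, float)) ** 2))
    variance = float(np.mean(forecasts.var(axis=0)))
    return bias_sq, variance
```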

16 Conclusions and future directions
The proposed technique significantly improves the prediction accuracy of the single NN forecasting models
It outlines a principled solution to the bias-variance problem of NN forecasts
It is a data-specific rather than a generic approach
Combining more models and varying other parameters would require selecting different features
Thank you!