
Published by Amia Wilkinson. Modified over 2 years ago.

1
Alexis Boukouvalas
Work in collaboration with D. M. Maniyar and D. Cornford
Managing Uncertainty in Complex Models, Aston University

2
Develop methods for dimensionality reduction of the input space, the output space, or both, of models.
To gain an initial understanding, use a toy dataset to compare existing methods.
Later, apply the methods to real-world models.
The goal is to extend the methods to work with a high number of variables, on the order of 10^5.

3
Feature selection
Also known as screening in the statistical literature.
Select the p most relevant of the original k variables.
The meaning of the variables is preserved, so the results are interpretable.

Projective methods
The variables are transformed: X' = F(X).
The transformation can be linear or non-linear.
Interpretation is non-trivial, especially for non-linear mappings.

4
Generate N base vectors x of dimensionality d by sampling a Latin hypercube.
Normalize the data.
Evaluate the generative model g(.).
Corrupt the model output with independent, identically distributed Gaussian noise. Initially the noise variance is set to 0.1*signal variance.
[Screening] Augment with extra noise dimensions: e = Bx + input noise. The input noise is always N(0,I). The B matrix is described on the next slide.
[Projection] Project to a higher dimensional space using x' = W*F(x).
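The generation steps above can be sketched as follows. This is only an illustration under stated assumptions: the slides do not give g(.), so a sum of sines is used as a stand-in, and the function names, default sizes and seed are hypothetical. The Latin hypercube is a minimal NumPy version (one point per stratum in every dimension).

```python
import numpy as np


def latin_hypercube(n, d, rng):
    """Draw n points in [0, 1)^d with exactly one point per stratum in each dimension."""
    # Each column is an independent random permutation of the n strata,
    # jittered uniformly inside its stratum.
    strata = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (strata + rng.random((n, d))) / n


def make_dataset(n=200, d=3, noise_ratio=0.1, seed=0):
    """Return normalized inputs x, noisy outputs y and the noise-free signal."""
    rng = np.random.default_rng(seed)
    x = latin_hypercube(n, d, rng)
    x = (x - x.mean(axis=0)) / x.std(axis=0)   # zero mean, unit variance per column
    signal = np.sin(x).sum(axis=1)             # hypothetical g(.), not from the slides
    noise_var = noise_ratio * signal.var()     # noise variance = 0.1 * signal variance
    y = signal + rng.normal(0.0, np.sqrt(noise_var), size=n)
    return x, y, signal
```

Calling `make_dataset()` returns 200 three-dimensional inputs together with the corrupted and clean outputs; the screening and projection augmentations would then be applied to `x`.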

5
[Screening] The B matrix determines the correlation between the noise variables and the model variables. Three cases are considered:
B=0 constructs noise variables that are uncorrelated with the model variables.
k randomly selected rows have a single non-zero entry, so that the corresponding noise variable is linearly correlated with a single model variable. Currently k=0.5*#noise variables and the coefficient is set to 0.5.
As in the previous case, but two elements of each of the k rows are non-zero, k=0.8, and the coefficients are taken at random from the set {-0.2,-0.5,+0.5,+0.7}.
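A minimal sketch of the three B constructions and of the augmentation e = Bx + input noise. Several details are assumptions rather than facts from the slides: the scheme names are hypothetical, "k=0.8" is read as k=0.8*#noise variables, the model variable(s) coupled to each row are chosen uniformly at random, and coefficients are drawn with replacement.

```python
import numpy as np


def make_B(n_noise, d, scheme, rng):
    """Build the (n_noise x d) matrix B coupling noise variables to model variables."""
    B = np.zeros((n_noise, d))
    if scheme == "uncorrelated":
        return B                                   # B = 0: no correlation
    if scheme == "single":
        k = int(0.5 * n_noise)                     # half of the rows are non-zero
        rows = rng.choice(n_noise, size=k, replace=False)
        cols = rng.integers(0, d, size=k)          # one model variable per chosen row
        B[rows, cols] = 0.5
        return B
    if scheme == "double":
        k = int(0.8 * n_noise)                     # assumed reading of "k=0.8"
        rows = rng.choice(n_noise, size=k, replace=False)
        for r in rows:
            cols = rng.choice(d, size=2, replace=False)   # two distinct model variables
            B[r, cols] = rng.choice([-0.2, -0.5, 0.5, 0.7], size=2)
        return B
    raise ValueError(f"unknown scheme: {scheme}")


def augment(x, B, rng):
    """Append noise dimensions e = Bx + N(0, I) noise to the model inputs x."""
    e = x @ B.T + rng.standard_normal((x.shape[0], B.shape[0]))
    return np.hstack([x, e])
```

For example, `augment(x, make_B(3, 3, "single", rng), rng)` turns 3 model dimensions into 6 total dimensions.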

6
[Projection] Project into a higher dimensional space of dimension q: x' = W*F(x).
W is a q*d weight matrix and F(·) are the basis functions responsible for the projection mapping.
A typical choice for such a projection mapping is Radial Basis Functions (RBF).
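The projection can be illustrated with Gaussian RBFs. This is a sketch, not the setup used in the talk: the slides do not specify the centres, the width, or how W is chosen, so here d centres are picked from the data (which makes W a q*d matrix), the width is a free parameter, and W is drawn at random.

```python
import numpy as np


def rbf_features(x, centres, width):
    """Gaussian RBF activations; returns an (n_points, n_centres) array."""
    sq_dist = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))


def project(x, q, rng, width=1.0):
    """Map d-dimensional inputs to q dimensions via x' = W F(x)."""
    n, d = x.shape
    centres = x[rng.choice(n, size=d, replace=False)]  # assumption: d data points as centres
    F = rbf_features(x, centres, width)                # shape (n, d)
    W = rng.standard_normal((q, d))                    # assumption: random weights
    return F @ W.T                                     # shape (n, q)
```

Although the output has q columns, its rank is at most d, so the intrinsic dimensionality that a projective method should recover is still that of the original inputs.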

7
Different noise models: correlated, multiplicative.
Non-linear interactions of noise variables with model variables.
Mix screening and projection.

8
Variable selection methods are broadly grouped into three categories:
Variable ranking. Input variables are ranked according to the prediction accuracy of each input calculated against the model output.
Wrapper methods. The emulator is used to assess the predictive power of subsets of variables.
Embedded methods. For both variable ranking and wrapper methods, the emulator is treated as a perfect black box. In embedded methods, the variable selection is done as part of the training of the emulator.

9
Forward selection: variables are progressively incorporated into larger and larger subsets.
Backward elimination: proceeds in the opposite direction.
Efroymson's algorithm, also known as stepwise selection: proceed as in forward selection, but after each variable is added, check whether any of the selected variables can be deleted without significantly affecting the RSS.
Exhaustive search: all possible subsets are considered.
Branch and bound: eliminate subset choices as early as possible. E.g. with variables A-Z, if the RSS of the subset {A,B} is 100, then the branch of subsets of C-Z need not be followed if the RSS using all of the variables C-Z is > 100, because no subset of C-Z can have a lower RSS than the full set C-Z.
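As an illustration of the first strategy, forward selection with a linear least-squares model can be sketched as below. RSS is taken here to mean the residual sum of squares of a fit with an intercept; the function names and the fixed subset size p are assumptions made for the example.

```python
import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit on the given columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def forward_selection(X, y, p):
    """Greedily add the variable that lowers the RSS most until p variables are selected."""
    selected = []
    remaining = list(range(X.shape[1]))
    while len(selected) < p and remaining:
        best = min(remaining, key=lambda c: rss(X, y, selected + [c]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each step costs one fit per remaining variable, so selecting p of k variables needs on the order of p*k fits, compared with 2^k for exhaustive search.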

10
An embedded method commonly employed in the context of Gaussian Processes is Automatic Relevance Determination (ARD), where the characteristic length scales l determine the input relevance.
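To make the role of the length scales concrete, here is a sketch of a squared exponential covariance with one length scale per input, a common form for ARD. It is an assumption that the talk used this particular kernel, and the function names are hypothetical. A very large length scale makes the covariance almost insensitive to that input, so inputs can be ranked by increasing length scale.

```python
import numpy as np


def ard_se_kernel(xa, xb, lengthscales, signal_var=1.0):
    """Squared exponential covariance with a separate length scale for each input dimension."""
    a = xa / lengthscales
    b = xb / lengthscales
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return signal_var * np.exp(-0.5 * sq_dist)


def rank_by_relevance(lengthscales):
    """Order inputs from most to least relevant (shortest length scale first)."""
    return np.argsort(lengthscales)
```

In practice the length scales are hyperparameters learned when the GP is trained, which is why ARD counts as an embedded method.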

11
The following algorithms were used in the experiments:
BaseRelevant: Baseline run using the relevant dimensions only. The RMSE was obtained by training a GP on the relevant dimensions. This value can be interpreted as the optimal RMSE value.
BaseAll: Baseline run using all the dimensions, i.e. relevant + extra. Again, the RMSE was obtained by training a GP on this set. The difference BaseAll-BaseRelevant is a measure of the effect of the extra variables on the predictive accuracy of the GP.
CorrCoef: Pearson correlation coefficient. A variable ranking is performed using formula 10, and the top 3 variables are selected and used to train a GP.
LinFS: Forward selection of subsets using a multiple linear regression model. The RMSE is obtained by evaluating the selected subset on a multiple linear regression model.
GPFS: Again forward selection is used to generate subsets, but with a GP rather than a linear model.
ARD: The ARD method is used to rank the input variables, and the top 3 are selected to train a GP model.
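The CorrCoef ranking step can be sketched as follows. The slides refer to a formula that is not reproduced in this transcript, so the standard Pearson correlation is assumed, with variables ranked by its absolute value; the function name is hypothetical and the subsequent GP training is not shown.

```python
import numpy as np


def corrcoef_ranking(X, y, top=3):
    """Rank inputs by |Pearson correlation| with the output and return the top ones."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    order = np.argsort(-np.abs(r))      # strongest linear association first
    return order[:top], r
```

Because each input is scored on its own, a noise variable that is linearly correlated with a model variable can outrank a genuinely relevant input whose effect on the output is non-linear.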

12
200 observations, 3 model dimensions, 6 total

Algorithm      Variables Selected   RMSE   Elapsed time
BaseRelevant   1,2,
BaseAll        1,2,3,4,5,
CorrCoef       1,4,2(,3,5,6)
LinFS          1,4,
GPFS           1,2,
ARD            1,2,

13
200 observations, 3 model dimensions, 6 total

Algorithm      Variables Selected   RMSE   Elapsed time
BaseRelevant   1,2,
BaseAll        1,2,3,4,5,
CorrCoef       1,4,5(,2,6,3)
LinFS          1,4,
GPFS           1,2,
ARD            1,2,

14
Initial results for high-dimensional input: two-correlated case, 100 model inputs, 500 noise dimensions, 500 observations.
[Figure: two plots with axes labelled Length and Input Number]

15
The best performing methods are GPFS and ARD, which usually find the optimal subset. However, GPFS is on average more than three times slower than ARD.
The CorrCoef and LinFS methods are computationally inexpensive but give unsatisfactory results.
Even for simple mapping functions (sin x), ARD breaks down on underdetermined systems, where the number of observations < the number of dimensions.

16
Batch hierarchical screening
Explore the potential of partitioning the input space into groups of inputs, applying screening methods to the groups, and combining the important inputs.
Some work has already been done for linear models (Gabriel and Pan 1979).
Variables are grouped such that if two variables Xi and Xj are in different groups, then their regression sums of squares (RSS) are additive: if Si is the reduction in RSS from including Xi and Sj the reduction from including Xj, then including both Xi and Xj gives Si.j = Si + Sj.
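The additivity condition Si.j = Si + Sj can be checked numerically. The sketch below is an illustration, not the Gabriel and Pan procedure: it measures the reduction in the residual sum of squares of a linear fit with an intercept, and the function names are hypothetical. For centred, mutually orthogonal columns the reductions add exactly; for correlated columns they generally do not.

```python
import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit on the given columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def rss_reduction(X, y, cols):
    """Reduction in RSS obtained by including the given columns, relative to intercept only."""
    return rss(X, y, []) - rss(X, y, cols)


def is_additive(X, y, i, j, rtol=1e-6):
    """Check whether S_i + S_j equals S_ij for columns i and j."""
    s_i = rss_reduction(X, y, [i])
    s_j = rss_reduction(X, y, [j])
    s_ij = rss_reduction(X, y, [i, j])
    return bool(np.isclose(s_i + s_j, s_ij, rtol=rtol))
```

A grouping scheme could use a check of this kind to decide whether two inputs can be screened in separate groups without losing information about their joint effect.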

17
Coupled emulation: separate emulators for different outputs, linked with some model for the covariance.
Connections to sequential methods to handle large datasets. Linked to Sequential Sparse GPs?
Projective methods in conjunction with feature selection.

18
[From Van der Maaten et al 2007]

19
However, [Van der Maaten et al 2007] compared non-linear with linear methods and found the non-linear methods to be no better. The reasons they propose relate to the curse of dimensionality, overfitting of local models, and others.

20
L.J.P. van der Maaten, E.O. Postma, H.J. van den Herik. Dimensionality Reduction: A Comparative Review. 2007.
André Elisseeff, Isabelle Guyon. An Introduction to Variable and Feature Selection. Journal of Machine Learning Research, 3:1157–1182.
