Published by Amia Wilkinson. Modified over 4 years ago.

1
Alexis Boukouvalas
Work in collaboration with D. M. Maniyar and D. Cornford
Managing Uncertainty in Complex Models, Aston University

2
Aim: develop methods for dimensionality reduction of the input and/or output space of models.
To gain an initial understanding, existing methods are first compared on a toy dataset.
The methods are then applied to real-world models.
The goal is to extend the methods to work with a high number of variables, on the order of 10^5.

3
Feature selection
  Also known as screening in the statistical literature.
  Select the p most relevant of the original k variables.
  The meaning of the variables is preserved, so the results are interpretable.
Projective methods
  Variables are transformed: X = F(X).
  Transformations can be linear or non-linear.
  Interpretation is non-trivial, especially for non-linear mappings.

4
Generate N base vectors x of dimensionality d by sampling a Latin hypercube.
Normalize the data.
Evaluate the generative model g(.).
Corrupt the model output with independent, identically distributed Gaussian noise. Initially the noise variance is set to 0.1 * signal variance.
[Screening] Augment with extra noise dimensions e = Bx + input noise. The input noise is always N(0,I). The B matrix is described on the next slide.
[Projection] Project to a higher dimensional space using x = W*F(x).
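The first four steps can be sketched as follows. This is a minimal illustration, not the authors' code: the slides do not specify g(.), so the function `g` below is a hypothetical stand-in, and the helper names `latin_hypercube` and `make_dataset` are invented for this example.

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """Draw n points in [0, 1)^d with exactly one point per stratum in every dimension."""
    # Each column is a random permutation of the n strata, jittered within its stratum.
    strata = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (strata + rng.random((n, d))) / n

def g(x):
    """Hypothetical generative model (assumption: the slides do not define g)."""
    return np.sin(x[:, 0]) + x[:, 1] ** 2 + 0.5 * x[:, 2]

def make_dataset(n=200, d=3, noise_ratio=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = latin_hypercube(n, d, rng)
    # Normalise each input to zero mean and unit variance.
    x = (x - x.mean(axis=0)) / x.std(axis=0)
    signal = g(x)
    # i.i.d. Gaussian noise with variance = noise_ratio * signal variance.
    noise = rng.normal(0.0, np.sqrt(noise_ratio * signal.var()), size=n)
    return x, signal + noise, signal

x, y, signal = make_dataset()
```

The defaults of 200 observations and 3 model dimensions mirror the toy experiments reported later in the slides.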

5
[Screening] The B matrix determines the correlation between the noise variables and the model variables. Three cases are considered:
  B=0: the noise variables are uncorrelated with the model variables.
  k randomly selected rows have a single non-zero entry, so each such noise variable is linearly correlated with a single model variable. Currently k=0.5*#noise variables and the coefficient is set to 0.5.
  Same as the previous case, but two elements of each of the k rows are non-zero, k=0.8, and the coefficients are drawn at random from the set {-0.2,-0.5,+0.5,+0.7}.
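A possible construction of B for the three cases is sketched below. Two points are assumptions: B is taken to have one row per noise variable and one column per model variable, and "k=0.8" in the third case is read as k = 0.8 * #noise variables. The function names `make_B` and `augment` are illustrative only.

```python
import numpy as np

def make_B(n_noise, d, case, rng):
    """Build the (n_noise x d) matrix B for case 0 (B=0), 1 (one entry) or 2 (two entries)."""
    B = np.zeros((n_noise, d))
    if case == 0:
        return B
    # Assumption: k is a fraction of the number of noise variables in both cases.
    frac, n_nonzero = (0.5, 1) if case == 1 else (0.8, 2)
    k = int(round(frac * n_noise))
    rows = rng.choice(n_noise, size=k, replace=False)
    for r in rows:
        cols = rng.choice(d, size=n_nonzero, replace=False)
        if case == 1:
            B[r, cols] = 0.5
        else:
            B[r, cols] = rng.choice([-0.2, -0.5, 0.5, 0.7], size=n_nonzero)
    return B

def augment(x, B, rng):
    """Append noise dimensions e = Bx + N(0, I) to the model inputs x."""
    e = x @ B.T + rng.standard_normal((x.shape[0], B.shape[0]))
    return np.hstack([x, e])
```

For example, `augment(x, make_B(3, 3, 1, rng), rng)` turns a 3-dimensional input into the 6-dimensional input used in the toy experiments.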

6
[Projection] Project into a higher dimensional space of dimension q: x = W*F(x).
W is a q*d weight matrix and F(·) is a set of basis functions that define the projection mapping.
A typical choice for the projection mapping is Radial Basis Functions (RBF).
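A rough sketch of such an RBF projection follows. The slides do not say how the centres, width, or weights are chosen, so everything here is an assumption: Gaussian basis functions, d centres sampled from the data (so that W is q x d as stated), and random Gaussian weights.

```python
import numpy as np

def rbf_features(x, centres, width):
    """Gaussian RBF activations F(x), one column per centre."""
    sq_dist = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * sq_dist / width ** 2)

def project(x, q, width=1.0, seed=0):
    """Map n x d inputs to n x q outputs via W * F(x)."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    # Assumption: d centres picked from the data, so that W has shape (q, d).
    centres = x[rng.choice(n, size=d, replace=False)]
    W = rng.standard_normal((q, d))
    return rbf_features(x, centres, width) @ W.T
```

Although the projected data has q columns, it still lies on a manifold of intrinsic dimension at most d, which is what a projective method would then try to recover.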

7
Different noise models: correlated, multiplicative.
Non-linear interactions of noise variables with model variables.
Mix screening and projection.

8
Variable selection methods fall into three broad categories:
Variable ranking. Input variables are ranked according to the prediction accuracy of each input, calculated against the model output.
Wrapper methods. The emulator is used to assess the predictive power of subsets of variables.
Embedded methods. For both variable ranking and wrapper methods, the emulator is treated as a perfect black box. In embedded methods, variable selection is done as part of training the emulator.

9
Forward selection: variables are progressively incorporated into larger and larger subsets.
Backward elimination: proceeds in the opposite direction.
Efroymson's algorithm, also known as stepwise selection: proceed as in forward selection, but after each variable is added, check whether any of the selected variables can be deleted without significantly affecting the RSS.
Exhaustive search: all possible subsets are considered.
Branch and bound: eliminate subset choices as early as possible. For example, with variables A-Z, if the RSS of the subset {A,B} is 100, then the branch of subsets drawn from C-Z need not be followed if the RSS using all of the variables C-Z is greater than 100, since removing variables can never decrease the RSS.
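Forward selection, the simplest of these strategies, can be sketched with a linear least-squares model as the subset scorer. The function names are illustrative, and the stopping rule (a fixed subset size p) is an assumption; the slides do not state one.

```python
import numpy as np

def rss(X, y, subset):
    """Residual sum of squares of a least-squares fit on the chosen columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in subset])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def forward_selection(X, y, p):
    """Greedily add the variable that lowers the RSS most until p variables are chosen."""
    selected = []
    remaining = list(range(X.shape[1]))
    while len(selected) < p:
        best = min(remaining, key=lambda j: rss(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Swapping `rss` for a scorer based on another model, such as a Gaussian process, keeps the same search loop but makes each evaluation far more expensive.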

10
An embedded method commonly employed with Gaussian Processes is Automatic Relevance Determination (ARD), where the characteristic length scales l determine the relevance of each input.
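The transcript does not include the covariance function used, so the sketch below assumes the common squared-exponential ARD form, with one length scale per input. Under that convention a very long length scale means the covariance barely changes along that input, i.e. the input is irrelevant. Some toolboxes instead parameterise by inverse length scales, which reverses the ordering.

```python
import numpy as np

def ard_sq_exp(X1, X2, length_scales, signal_var=1.0):
    """k(x, x') = signal_var * exp(-0.5 * sum_i ((x_i - x'_i) / l_i)^2)."""
    l = np.asarray(length_scales, dtype=float)
    diff = (X1[:, None, :] - X2[None, :, :]) / l
    return signal_var * np.exp(-0.5 * (diff ** 2).sum(axis=2))

def rank_inputs(length_scales):
    """Rank inputs from most to least relevant (shortest length scale first)."""
    return np.argsort(length_scales).tolist()
```

In practice the length scales are learned by maximising the GP marginal likelihood; only the kernel and the ranking step are shown here.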

11
The following algorithms were used in the experiments:
BaseRelevant: Baseline run using the relevant dimensions only. The RMSE was obtained by training a GP on the relevant dimensions. This value can be interpreted as the optimal RMSE value.
BaseAll: Baseline run using all the dimensions, i.e. relevant + extra. Again the RMSE was obtained by training a GP on this set. The difference BaseAll-BaseRelevant is a measure of the effect of the extra variables on the predictive accuracy of the GP.
CorrCoef: Pearson correlation coefficient. A variable ranking is performed using formula 10, and the top 3 variables are selected and used to train a GP.
LinFS: Forward selection of subsets using a multiple linear regression model. The RMSE is obtained by evaluating the selected subset on a multiple linear regression model.
GPFS: Forward selection again, but using a GP rather than a linear model.
ARD: The ARD method is used to rank the input variables, and the top 3 are selected to train a GP model.
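The CorrCoef ranking can be illustrated as follows. The formula referenced on the slide is not part of this transcript, so the standard Pearson coefficient is assumed, and ranking by absolute value is likewise an assumption.

```python
import numpy as np

def corrcoef_ranking(X, y):
    """Return input indices sorted by |Pearson r| with y (largest first), plus r itself."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.argsort(-np.abs(r)).tolist(), r

def top_k(X, y, k=3):
    """Indices of the k highest-ranked inputs, as used to train the GP."""
    order, _ = corrcoef_ranking(X, y)
    return order[:k]
```

Because the coefficient only measures linear association with the output, an input that matters through a non-linear term can be ranked below a noise variable that is merely correlated with a relevant input.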

12
200 observations, 3 model dimensions, 6 total

Algorithm     Variables selected   RMSE     Elapsed time
BaseRelevant  1,2,3                0.9128   1.44142
BaseAll       1,2,3,4,5,6          1.0473   1.60529
CorrCoef      1,4,2(,3,5,6)        2.1642   1.50487
LinFS         1,4,2                2.7803   0.134283
GPFS          1,2,3                0.9092   18.2017
ARD           1,2,3                0.9134   5.56684

13
200 observations, 3 model dimensions, 6 total

Algorithm     Variables selected   RMSE     Elapsed time
BaseRelevant  1,2,3                0.9111   1.42363
BaseAll       1,2,3,4,5,6          1.0633   1.66093
CorrCoef      1,4,5(,2,6,3)        2.6794   1.31676
LinFS         1,4,6                2.8083   0.143308
GPFS          1,2,3                0.9274   19.0051
ARD           1,2,3                1.0076   5.0611

14
Initial results for high-dimensional input (two-correlated case): 100 model inputs, 500 noise dimensions, 500 observations.

Length     Input number
31.8373    361
18.7081    501
14.2097    296
12.7581    51
12.3160    456
11.8689    496
11.3176    166
10.2424    310
10.2220    420
9.6192     325
9.0732     363
8.6898     53
8.5453     347
7.9338     419
7.8201     294
7.8017     188
7.4327     103
7.3760     13
7.1526     572
7.0997     478
6.9481     393
6.6417     187

15
The best performing methods are GPFS and ARD, which usually find the optimal subset. However, GPFS is on average more than three times slower than ARD.
The CorrCoef and LinFS methods are computationally inexpensive but give unsatisfactory results.
ARD breaks down on underdetermined systems, where the number of observations is smaller than the number of dimensions, even for simple mapping functions (sin x).

16
Batch hierarchical screening
  Explore the potential of partitioning the input space into groups of inputs, applying screening methods to the groups, and combining the important inputs.
  Some work has already been done for linear models (Gabriel and Pan 1979).
  Variables are grouped such that if two variables Xi and Xj are in different groups, their regression sums of squares (RSS) are additive: if Si is the reduction in RSS from including Xi, and Sj that from including Xj, then including both Xi and Xj gives Si.j = Si + Sj.
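The additivity condition can be checked numerically in the simplest case where it is known to hold: two centred, mutually orthogonal regressors. This sketch only illustrates the property Si.j = Si + Sj; it is not the grouping procedure of Gabriel and Pan, and it takes "reduction in RSS" to mean the drop in residual sum of squares relative to an intercept-only fit, which is an assumption.

```python
import numpy as np

def rss(X, y, subset):
    """Residual sum of squares of a least-squares fit on the chosen columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in subset])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def reduction(X, y, subset):
    """Drop in RSS from adding the subset to an intercept-only model."""
    return rss(X, y, []) - rss(X, y, subset)

rng = np.random.default_rng(0)
raw = rng.standard_normal((100, 2))
# QR of the centred data gives two orthonormal columns that are also orthogonal to the intercept.
Q, _ = np.linalg.qr(raw - raw.mean(axis=0))
y = 2.0 * Q[:, 0] - 1.0 * Q[:, 1] + 0.1 * rng.standard_normal(100)

S_i = reduction(Q, y, [0])
S_j = reduction(Q, y, [1])
S_ij = reduction(Q, y, [0, 1])
```

With correlated columns the two sides generally differ, which is why the grouping matters before groups can be screened independently.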

17
Coupled emulation: separate emulators for different outputs, linked by a model for the covariance.
Connections to sequential methods for handling large datasets. Linked to sequential sparse GPs?
Projective methods in conjunction with feature selection.

18
[From Van der Maaten et al 2007]

19
However, [Van der Maaten et al 2007] compared non-linear methods with linear ones and found the non-linear methods to be no better. The reasons they propose include the curse of dimensionality and overfitting of local models, among others.

20
L.J.P. van der Maaten, E.O. Postma and H.J. van den Herik. Dimensionality Reduction: A Comparative Review. 2007.
Isabelle Guyon and André Elisseeff. An Introduction to Variable and Feature Selection. Journal of Machine Learning Research, 3:1157–1182, 2003.
