1 Gaussian Process and Prediction

2 Outline
Gaussian Process and Bayesian Regression
• Bayesian regression
• Weight-space view
• Function-space view
• Spline smoothing
• Neural networks
• Classification problems
Active Data Selection
• Maximizing the expected information gain
• Minimizing the regression error
• Experimental results
Mixtures of Gaussian Processes

3 Gaussian Process and Bayesian Regression (1)
• A distribution of y in Bayesian regression: the predictive distribution $p(y \mid x, D)$ is obtained by averaging the model's predictions over the posterior of its parameters.
• Generalized linear regression: $y(x) = \sum_h w_h \phi_h(x)$ for fixed basis functions $\phi_h$.
• Weight-space view: place a Gaussian prior on the weights $w$ and compute their posterior given the observed data.
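As a concrete sketch of the weight-space view, the following NumPy snippet computes the weight posterior and the predictive distribution for a generalized linear model (the Gaussian basis, prior precision, and noise variance are illustrative assumptions, not values from the slides):

```python
import numpy as np

# Weight-space view: Bayesian linear regression with fixed basis functions.
# Model: y = Phi(x) @ w + noise,  w ~ N(0, alpha^-1 I),  noise ~ N(0, sigma2).
rng = np.random.default_rng(0)

def phi(x, centers, width=0.5):
    """Gaussian basis functions (an illustrative choice)."""
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)

centers = np.linspace(-3, 3, 10)   # basis centers (assumed)
alpha, sigma2 = 1.0, 0.1           # prior precision, noise variance (assumed)

# Synthetic training data
x = rng.uniform(-3, 3, 20)
y = np.sin(x) + rng.normal(0, np.sqrt(sigma2), x.size)

Phi = phi(x, centers)
# Posterior over weights: w | D ~ N(mean_w, S_w)
S_w_inv = alpha * np.eye(len(centers)) + Phi.T @ Phi / sigma2
S_w = np.linalg.inv(S_w_inv)
mean_w = S_w @ Phi.T @ y / sigma2

# Predictive mean and variance at test points
x_star = np.linspace(-3, 3, 5)
Phi_star = phi(x_star, centers)
pred_mean = Phi_star @ mean_w
pred_var = sigma2 + np.einsum('ij,jk,ik->i', Phi_star, S_w, Phi_star)
print(pred_mean, pred_var)
```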

4 Gaussian Process and Bayesian Regression (2)
Function-space view:
• $Y(x)$ is a linear combination of Gaussian random variables, $W \sim N(0, \Sigma_w)$.
• $\{Y(x)\}$ is a Gaussian process with mean function $E[Y(x)] = 0$ and covariance function $C(x, x') = \phi(x)^\top \Sigma_w\, \phi(x')$.
• $Y(x_*)$ at a new input $x_*$ can be predicted from the conditional (Gaussian) distribution given the observed outputs.
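A minimal function-space sketch: form the joint Gaussian of training and test outputs under an assumed squared-exponential covariance, then condition on the observed data (the kernel and noise level are illustrative assumptions):

```python
import numpy as np

# Function-space view: predict at x_star from the joint Gaussian of
# training outputs and the test outputs, via the conditional distribution.
def sq_exp(a, b, ell=1.0, v0=1.0):
    """Squared-exponential covariance (an illustrative kernel choice)."""
    return v0 * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 15)
y = np.sin(x) + rng.normal(0, 0.1, x.size)
sigma2 = 0.01                      # noise variance (assumed)

x_star = np.linspace(-3, 3, 7)
K = sq_exp(x, x) + sigma2 * np.eye(x.size)
K_s = sq_exp(x_star, x)
K_ss = sq_exp(x_star, x_star)

# Conditional (posterior) mean and covariance of Y(x_star) given the data
L = np.linalg.cholesky(K)
alpha_vec = np.linalg.solve(L.T, np.linalg.solve(L, y))
mean = K_s @ alpha_vec
V = np.linalg.solve(L, K_s.T)
cov = K_ss - V.T @ V
print(mean, np.diag(cov))
```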

5 Gaussian Process and Bayesian Regression (3)
• The weight-space view and the function-space view give the same results.
• For a smaller number of basis functions the weight-space view is preferred, while for a larger number of basis functions the function-space (Gaussian process) view is better.
• Cf. the nonparametric kernel estimator for a density $p(y)$: $\hat p(y) = \frac{1}{nh} \sum_{i=1}^{n} K\!\big(\frac{y - y_i}{h}\big)$.

6 Spline Smoothing (1)
Interpolating spline
• An interpolating spline is a cubic polynomial defined piecewise between adjacent knots, with a continuous second derivative (Schoenberg (1964)).
Smoothing spline
• A smoothing spline minimizes $\sum_{i=1}^{n} (t_i - f(x_i))^2 + \lambda \int (f''(x))^2\, dx$.
• As $\lambda \to 0$ it approaches the interpolating spline; as $\lambda \to \infty$ it approaches the least-squares linear fit.
• The smoothing spline is also a cubic spline (Reinsch (1967)).
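A short sketch of the two limits using SciPy's smoothing spline, where the smoothing factor s plays a role analogous to λ above (the data and the value of s are illustrative):

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Cubic smoothing spline: s = 0 recovers the interpolating spline,
# while larger s yields a progressively smoother fit.
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 10, 30))          # knots must be increasing
y = np.sin(x) + rng.normal(0, 0.2, x.size)

interp = UnivariateSpline(x, y, k=3, s=0)    # interpolating cubic spline
smooth = UnivariateSpline(x, y, k=3, s=2.0)  # smoothed fit (s is assumed)

x_grid = np.linspace(0, 10, 5)
print(interp(x_grid), smooth(x_grid))
```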

7 Spline Smoothing (2)
• Linear smoothing property of the smoothing spline: the fitted value is a linear combination of the observations, $\hat f(x) = \sum_{i=1}^{n} w_i(x)\, t_i$.
• If the design is equally spaced, then all n component smoothing splines are identical in shape, and this shape converges to a kernel (Silverman (1984)).
• Cf. nonparametric kernel regression (Nadaraya (1964) and Watson (1964)): $\hat m(x) = \sum_{i=1}^{n} K\!\big(\frac{x - x_i}{h}\big) y_i \,\big/\, \sum_{i=1}^{n} K\!\big(\frac{x - x_i}{h}\big)$.
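A minimal sketch of the Nadaraya-Watson estimator with a Gaussian kernel (the bandwidth h is an illustrative assumption):

```python
import numpy as np

# Nadaraya-Watson kernel regression: a locally weighted average of the
# observed responses, with weights given by a kernel around the query.
def nadaraya_watson(x_query, x, y, h=0.5):
    K = np.exp(-0.5 * ((x_query[:, None] - x[None, :]) / h) ** 2)
    return (K @ y) / K.sum(axis=1)

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 50)
y = np.sin(x) + rng.normal(0, 0.2, x.size)
print(nadaraya_watson(np.linspace(0, 10, 5), x, y))
```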

8 Spline Smoothing (3)
• The spline estimation procedure can be interpreted as a Bayesian MAP estimate: the roughness penalty $\lambda \int (f^{(p)}(x))^2\, dx$ acts as the log of a prior over functions, as written out below.
• When $p = 2$, the result is a cubic spline (a piecewise cubic function with knots at the data points).
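Written out, the identification between the penalized fit and a MAP estimate (the notation for the noise variance σ² is assumed):

```latex
% MAP view of the smoothing spline: posterior \propto likelihood x prior
p(f \mid t) \;\propto\;
  \exp\Big(-\tfrac{1}{2\sigma^2}\sum_{i=1}^{n}\big(t_i - f(x_i)\big)^2\Big)\,
  \exp\Big(-\tfrac{\lambda}{2}\int\big(f^{(p)}(x)\big)^2\,dx\Big)
\quad\Longrightarrow\quad
\hat f_{\mathrm{MAP}}
  = \arg\min_{f}\ \sum_{i=1}^{n}\big(t_i - f(x_i)\big)^2
  + \sigma^2\lambda\int\big(f^{(p)}(x)\big)^2\,dx
```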

9 Spline Smoothing (4)
• Spline priors are Gaussian processes: the exponential of the quadratic roughness penalty defines a Gaussian prior over functions.
• Gaussian process: any finite collection $\{Y(x_1), \ldots, Y(x_n)\}$ has a joint Gaussian distribution.

10 Spline Smoothing (5)
Splines correspond to Gaussian processes with a particular choice of covariance function.

11 Known covariance function for modeling (e.g.)
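One widely used example of such a covariance function (an illustrative choice; the hyperparameters $v_0$, $\ell$, and $\sigma_n^2$ are assumptions):

```latex
% Squared-exponential covariance with an additive noise term (illustrative):
C(x, x') = v_0 \exp\!\Big(-\frac{(x - x')^2}{2\ell^2}\Big)
         + \sigma_n^2\,\delta_{x,x'}
```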

12 Covariance function with unknown parameters
• For a small number of parameters: choose a parametric family of covariance functions and estimate the parameters by maximizing the log likelihood.
• For a larger number of parameters, or when local maxima are a problem: place a prior distribution on the parameters and integrate over them numerically.
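A sketch of the first approach: fit the hyperparameters of an assumed squared-exponential family by maximizing the log marginal likelihood (the family and the initial values are assumptions):

```python
import numpy as np
from scipy.optimize import minimize

# Fit covariance hyperparameters by maximizing the log marginal likelihood
# log p(y | theta) = -1/2 y^T K^-1 y - 1/2 log|K| - n/2 log(2 pi).
def neg_log_marglik(log_theta, x, y):
    v0, ell, sigma2 = np.exp(log_theta)          # positivity via log-parametrization
    K = v0 * np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ell ** 2)
    K += sigma2 * np.eye(x.size)
    L = np.linalg.cholesky(K)
    a = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # log|K| = 2 * sum(log(diag(L))), so half of it is sum(log(diag(L)))
    return 0.5 * y @ a + np.log(np.diag(L)).sum() + 0.5 * x.size * np.log(2 * np.pi)

rng = np.random.default_rng(4)
x = rng.uniform(-3, 3, 25)
y = np.sin(x) + rng.normal(0, 0.1, x.size)
res = minimize(neg_log_marglik, np.zeros(3), args=(x, y), method='L-BFGS-B')
print(np.exp(res.x))   # fitted (v0, ell, sigma2)
```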

13 Multilayer Neural Networks and Gaussian Processes
• The properties of a neural network with one hidden layer converge to those of a Gaussian process as the number of hidden neurons tends to infinity, if standard weight-decay priors are assumed (Neal (1996)).
• The covariance of this Gaussian process depends on the priors on the weights and on the activation functions of the hidden units.

14 Classification Problems
• Estimate the posterior $p(k \mid x)$ for each class k.
• Place a Gaussian process prior on a latent activation $y(x)$ and pass it through a logistic function, $\pi(x) = \sigma(y(x)) = 1 / (1 + e^{-y(x)})$.
• Make a prediction for a test input $x_*$ by integrating over the latent value: $p(t_* \mid D) = \int p(t_* \mid y_*)\, p(y_* \mid D)\, dy_*$. (Apply the appropriate Jacobian to the above for the distribution of $\pi_* = \sigma(y_*)$.)
• When $p(t \mid y)$ is Gaussian: an exact expression exists.
• When $p(t \mid y)$ is the logistic (Bernoulli) likelihood: no exact expression exists (use an analytic approximation or MCMC).
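A Monte Carlo sketch of the prediction step when no exact expression exists: sample the latent activation from its (approximate) Gaussian posterior and average the logistic outputs (the posterior mean and variance are assumed given, e.g. from an analytic approximation):

```python
import numpy as np

# GP classification prediction by Monte Carlo: draw y* from its Gaussian
# (approximate) posterior, squash through the logistic, and average.
def predict_class_prob(mean_y_star, var_y_star, n_samples=10000, seed=0):
    rng = np.random.default_rng(seed)
    y = rng.normal(mean_y_star, np.sqrt(var_y_star), n_samples)
    return np.mean(1.0 / (1.0 + np.exp(-y)))   # E[sigma(y*)] approximates p(t*=1 | D)

print(predict_class_prob(mean_y_star=0.8, var_y_star=0.5))
```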

15 Active Data Selection (1)
• Maximizing the expected information gain criterion (MacKay (1992)): select the data point with maximum predictive variance.
• Minimizing the regression error (Cohn (1996)): select the query that minimizes the overall (average) predictive variance.
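A sketch of the first criterion for a GP: since the predictive variance does not depend on the unobserved targets, the next query can be scored over a candidate pool before labeling (the kernel and noise level are illustrative assumptions):

```python
import numpy as np

# Active selection by maximum predictive variance (MacKay-style criterion).
def sq_exp(a, b, ell=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def next_query(x_train, candidates, sigma2=0.01):
    K = sq_exp(x_train, x_train) + sigma2 * np.eye(x_train.size)
    K_inv = np.linalg.inv(K)
    K_c = sq_exp(candidates, x_train)
    # Predictive variance at each candidate: k(x,x) + noise - k^T K^-1 k
    var = 1.0 + sigma2 - np.einsum('ij,jk,ik->i', K_c, K_inv, K_c)
    return candidates[np.argmax(var)]          # most uncertain candidate

rng = np.random.default_rng(5)
x_train = rng.uniform(-3, 3, 10)
candidates = np.linspace(-3, 3, 200)
print(next_query(x_train, candidates))
```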

16 Active Data Selection (2)
[Figure: (a) a target function drawn from a covariance function; (b) expected change of average variance over x for 100 reference points]

17 Active Data Selection (3)
Experiments:
• The first data point is selected at random.
• 150 data points are selected actively.
• 500 reference points are used for error evaluation.
• The optimum query is selected using 300 random reference points.

18 Active Data Selection (4)
For real data: pumadyn-8nm (puma560 robot arm); 250 data points for active selection, 400 reference points.

