- 1 - Calibration with discrepancy
Major references
–The calibration lecture is not in the book.
–Kennedy, Marc C., and Anthony O'Hagan. "Bayesian calibration of computer models." Journal of the Royal Statistical Society: Series B (2001).
–Campbell, Katherine. "Statistical calibration of computer simulations." Reliability Engineering & System Safety (2006).
–Higdon, Dave, et al. "Combining field data and computer simulations for calibration and prediction." SIAM Journal on Scientific Computing 26.2 (2004).
–Bayarri, Maria J., et al. "A framework for validation of computer models." Technometrics 49.2 (2007).
–Bayarri, M. J., et al. "Predicting vehicle crashworthiness: validation of computer models for functional and hierarchical data." JASA (2009).
–Loeppky, Jason L., Derek Bingham, and William J. Welch. "Computer model calibration or tuning in practice." (2006).
–McFarland, John, et al. "Calibration and uncertainty analysis for computer simulations with multivariate output." AIAA Journal 46.5 (2008).

- 2 - Calibration with discrepancy
Motivation
–A computer model approximates reality but, because of incomplete knowledge, it often falls short; some degree of discrepancy therefore exists between model and reality.
–Accounting for this bias is the central issue in calibration.
–If we ignore it, we get unwantedly large bounds in the prediction.
How do we model the discrepancy?
–Gaussian process regression (GPR) is employed to express the discrepancy in an approximate manner.
–The estimation includes not only the calibration parameters but also the associated GPR parameters.
–The discrepancy term serves two purposes:
1. It closes the gap between the model and reality, further improving the calibration.
2. It validates the model accuracy: if the discrepancy is small, the model is good.

- 3 - Calibration with discrepancy
Formulation
–Computer model with calibration parameters: y^M = m(x | θ)
–Reality = model + discrepancy: y^R = m(x | θ) + b(x)
–Field data = reality + observation error: y^F = y^R + ε
Unknowns to be estimated
–Calibration parameter θ in the computer model y^M = m(x | θ)
–GPR parameters β, σ_b in the discrepancy b(x) ~ N(Fβ, σ_b² Q)
–Standard deviation σ in the observation error ε ~ N(0, σ²)
(Diagram: field data y^F = model y^M = m(x | θ) + bias b(x) + error ε ~ N(0, σ²).)
A small synthetic example of this formulation is sketched below.
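To make the formulation concrete, here is a minimal Python/NumPy sketch that generates synthetic field data as model + discrepancy + error. The model form m(x | θ) = 5 exp(−θx) is taken from a later slide; the "true" θ, the bias shape, and the noise level are illustrative assumptions, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x, theta):
    """Computer model y^M = m(x | theta); the exponential form appears later in the slides."""
    return 5.0 * np.exp(-theta * x)

def true_bias(x):
    """Assumed 'true' discrepancy b(x) between model and reality (illustrative only)."""
    return 0.5 * np.sin(2.0 * np.pi * x)

theta_true = 1.7     # assumed calibration parameter (not a value from the lecture)
sigma_obs = 0.1      # assumed observation-error standard deviation

x_field = np.linspace(0.0, 2.0, 10)                              # field input sites
y_reality = model(x_field, theta_true) + true_bias(x_field)      # reality = model + discrepancy
y_field = y_reality + rng.normal(0.0, sigma_obs, x_field.size)   # field data = reality + error
```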

- 4 - Calibration with discrepancy
Practice the process in increasing order of complexity.
Estimate the bias function only.
–Estimate the GPR parameters β, σ_b in the bias function.
–The calibration parameter θ in the model is assumed known.
–The observation error ε is assumed to be zero.
Estimate the bias function plus error.
–The calibration parameter θ in the model is assumed known.
Estimate everything: bias function, error, and calibration parameter.

- 5 - Estimate bias function only
Problem statement
–Computer model with a single calibration parameter θ, whose value is assumed known.
–Field data are given with one observation at each point, i.e., with no replication.
–Assume there is no observation error ε, i.e., σ = 0.
–The problem is then only to carry out GPR for the bias function:
field data = model + bias, so bias data = field data − model.
–Estimate the GPR parameters using the bias data.

- 6 - Gaussian process regression - Review
–Review from lecture note: 10 Bayesian correlated regression.ppt
Concept of GPR
–A GP for a function b(x) is defined by the multivariate normal distribution b(x) ~ N(Fβ, σ_b² Q), where β, σ_b are the parameters estimated from the data.
–The mean Fβ is responsible for global variation, as in polynomial regression; it is usually taken to be constant by setting F = 1.
–The correlation matrix Q, responsible for the smoothness of the connection between points, is a function of the distance between points, with a parameter h that controls the smoothness of the GPR (see the sketch below).
–If the correlation matrix is replaced by the identity, so that the covariance becomes σ_b² I, the GP reduces to ordinary regression with mean Fβ (a regression fit to the constant β) and iid normal error N(0, σ_b²).
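The exact correlation function is not reproduced in this transcript, so the sketch below assumes a squared-exponential correlation with length-scale h, which matches the behaviour described on the next slide (large h gives long-range correlation, small h gives nearly independent points); F is a column of ones, i.e., a constant mean.

```python
import numpy as np

def corr_matrix(x1, x2, h):
    """Assumed correlation Q_ij = exp(-(x_i - x_j)^2 / (2 h^2)); larger h -> smoother b(x)."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / h) ** 2)

def gp_prior(x, h, beta, sigma_b):
    """GP prior of b at the points x: constant mean F*beta and covariance sigma_b^2 * Q."""
    F = np.ones(x.size)
    return F * beta, sigma_b ** 2 * corr_matrix(x, x, h)

# With a very small h the correlation matrix approaches the identity, so the GP
# degenerates to iid error around the constant mean, as noted on this slide.
x = np.linspace(0.0, 1.0, 5)
print(np.round(corr_matrix(x, x, 0.2), 3))    # strong correlation between neighbours
print(np.round(corr_matrix(x, x, 0.01), 3))   # essentially the identity matrix
```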

- 7 - Gaussian process regression - Review
Correlation matrix
–Consider the correlation between two points x_i and x_j. If h is large, the y's at those points are highly correlated, i.e., y_i and y_j will not differ much. If h is small, y_i and y_j behave almost independently.
–Likewise, at a point close to an observation, the GPR y will not differ much from the observed y, which means the uncertainty there is small. At the observed point itself, the uncertainty goes to zero and the GPR equals the observed y. This is why GPR is an interpolation.
–If h is large, high correlation extends over a large distance and leads to a smooth connection between the observed y.
–If h is small, the correlation quickly dies off and the y's behave like iid normal errors, with the mean close to Fβ (which is just a constant, the mean of the observed data y).

- 8 - Gaussian process regression - Review
Process of GPR (a code sketch follows below)
–Likelihood of y^F
–Posterior distribution of the parameters β, σ_b
–Analytical formulas for the point estimates
–Posterior predictive distribution
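The equations for these steps are not reproduced in the transcript. The sketch below implements the standard constant-mean (kriging-style) point estimates and posterior predictive distribution under the squared-exponential correlation assumed earlier, which is what this procedure amounts to when there is no observation error; applied to the bias data b^F = y^F − y^M, it gives the "bias only" results on the next slide.

```python
import numpy as np

def corr_matrix(x1, x2, h):
    """Assumed squared-exponential correlation with length-scale h."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / h) ** 2)

def gpr_fit(x_obs, b_obs, h):
    """Point estimates of beta and sigma_b^2 for the bias data at x_obs."""
    n = x_obs.size
    F = np.ones((n, 1))
    Qi = np.linalg.inv(corr_matrix(x_obs, x_obs, h))
    beta_hat = float(np.linalg.solve(F.T @ Qi @ F, F.T @ Qi @ b_obs))  # GLS estimate of beta
    r = b_obs - beta_hat
    sigma_b2_hat = float(r @ Qi @ r) / n                               # point estimate of sigma_b^2
    return beta_hat, sigma_b2_hat, Qi, r

def gpr_predict(x_obs, b_obs, x_new, h):
    """Posterior predictive mean and variance of b at x_new (no observation error)."""
    beta_hat, sigma_b2, Qi, r = gpr_fit(x_obs, b_obs, h)
    q = corr_matrix(x_new, x_obs, h)
    mean = beta_hat + q @ Qi @ r                             # interpolates the bias data
    var = sigma_b2 * (1.0 - np.sum((q @ Qi) * q, axis=1))    # shrinks to zero at the data points
    return mean, np.maximum(var, 0.0)
```

Adding y^M(x_new) back to the predictive mean gives the calibrated prediction of the response itself.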

- 9 - Estimate bias function only
Results (treat the bias data b^F = y^F − y^M as the y^F in GPR)
–As expected, the GPR behaves differently with respect to the parameter h.
–With small h = 0.01, the GPR gets close to a constant at the mean of the data.
–With large h = 0.2, the GPR connects smoothly over the data, and the uncertainty bound is decreased.
–At h = 0.2, MCMC is run several times and compared to the point estimates. The results are stable and agree closely with the point-estimated values.
–An arbitrary choice of h can be avoided by using the MLE method to obtain the optimum h (see the sketch below). However, with h > 0.2, singularity occurs.
(Figure: GPR results for each h, with β, σ_b point estimates.)
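The slide mentions choosing h by maximum likelihood rather than by hand. A minimal sketch of that idea, under the same assumed kernel: profile out β and σ_b analytically and minimize the concentrated negative log marginal likelihood over h; a small jitter term guards against the near-singular correlation matrix reported for large h.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def neg_log_marginal_likelihood(h, x_obs, b_obs, jitter=1e-8):
    """Concentrated -log-likelihood of h: beta and sigma_b are profiled out analytically."""
    n = x_obs.size
    F = np.ones((n, 1))
    d = x_obs[:, None] - x_obs[None, :]
    Q = np.exp(-0.5 * (d / h) ** 2) + jitter * np.eye(n)     # assumed squared-exponential kernel
    Qi = np.linalg.inv(Q)
    beta_hat = float(np.linalg.solve(F.T @ Qi @ F, F.T @ Qi @ b_obs))
    r = b_obs - beta_hat
    sigma_b2_hat = float(r @ Qi @ r) / n
    _, logdetQ = np.linalg.slogdet(Q)
    return 0.5 * (n * np.log(sigma_b2_hat) + logdetQ)

# Example usage, with x_obs and b_obs the bias data:
# res = minimize_scalar(neg_log_marginal_likelihood, bounds=(0.01, 0.5),
#                       args=(x_obs, b_obs), method="bounded")
# h_mle = res.x
```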

- 10 - Estimate bias function plus error
Remark
–What if the data include observation error? Then the GPR parameters and the observation error are estimated simultaneously.
–Recall that in classical regression, we estimate the regression model and the error ε simultaneously, each obtained as a distribution:
y = y^R + ε = Σ_i β_i f_i(x) + ε   (regression + error)
–As a result, we get confidence bounds on the regression, and predictive bounds on the regression due to the added error.

- 11 - Estimate bias function plus error
Process of GPR including error (see the sketch below)
–Likelihood of the bias data b^F = y^F − y^M, where y^M = 5 exp(−θx) is the given model function
–Posterior distribution of the bias parameters β, σ_b and the error σ
–Posterior predictive distribution
–Once we obtain the predicted bias b^p, we recover the original response y by adding y^M back.
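A hedged sketch of this step: with observation error, the data covariance becomes σ_b² Q + σ² I, so the GPR no longer interpolates the bias data exactly. The parameter values are held fixed here for illustration; in the lecture they are obtained as posterior distributions together with the bias parameters. The model form and kernel are the same assumptions as in the earlier sketches.

```python
import numpy as np

def model(x, theta):
    """Computer model y^M = 5 exp(-theta x), as given on this slide."""
    return 5.0 * np.exp(-theta * x)

def corr_matrix(x1, x2, h):
    """Assumed squared-exponential correlation with length-scale h."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / h) ** 2)

def predict_with_error(x_obs, y_obs, x_new, theta, h, beta, sigma_b, sigma):
    """Posterior predictive mean/variance of y = y^M + b + error, for fixed parameter values."""
    b_obs = y_obs - model(x_obs, theta)                         # bias data b^F = y^F - y^M
    n = x_obs.size
    K = sigma_b**2 * corr_matrix(x_obs, x_obs, h) + sigma**2 * np.eye(n)
    k = sigma_b**2 * corr_matrix(x_new, x_obs, h)
    Ki = np.linalg.inv(K)
    b_mean = beta + k @ Ki @ (b_obs - beta)                     # posterior mean of b(x_new)
    b_var = sigma_b**2 - np.sum((k @ Ki) * k, axis=1)           # posterior variance of b(x_new)
    y_mean = model(x_new, theta) + b_mean                       # add the model back: y = y^M + b
    y_var = np.maximum(b_var, 0.0) + sigma**2                   # predictive variance incl. error
    return y_mean, y_var
```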

- 12 - Estimate bias function plus error
Results
–GPR results are given for the parameter h = 0.01, 0.1, and 0.2, respectively.
–The first set of figures shows b(x) only; the second set shows y^M(x) + b(x) + error.
–The results are compared with those obtained without error and bias.

- 13 - Calibration with discrepancy
GPR including the discrepancy (a sketch of the sampling step follows below)
–The whole process is the same, except that the calibration parameter θ is now an unknown to be estimated.
–Likelihood of the field data y^F (remember b^F = y^F − y^M)
–Posterior distribution of all the parameters θ, β, σ_b, σ
–Posterior predictive distribution
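A minimal random-walk Metropolis sketch of the full calibration: θ, β, σ_b, and σ are sampled jointly from the posterior of the field data, with b^F = y^F − y^M(θ) recomputed at every proposed θ. The priors, step size, starting point, and fixed h are illustrative assumptions, and the model form and kernel are the same as in the earlier sketches.

```python
import numpy as np

def model(x, theta):
    """Computer model y^M = 5 exp(-theta x) (form from the slides)."""
    return 5.0 * np.exp(-theta * x)

def corr_matrix(x1, x2, h):
    """Assumed squared-exponential correlation with length-scale h."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / h) ** 2)

def log_posterior(params, x_obs, y_obs, h):
    """Unnormalized log posterior of (theta, beta, log sigma_b, log sigma)."""
    theta, beta, log_sb, log_s = params
    sigma_b, sigma = np.exp(log_sb), np.exp(log_s)
    b_obs = y_obs - model(x_obs, theta)                          # b^F = y^F - y^M(theta)
    K = sigma_b**2 * corr_matrix(x_obs, x_obs, h) + sigma**2 * np.eye(x_obs.size)
    r = b_obs - beta
    _, logdetK = np.linalg.slogdet(K)
    loglik = -0.5 * (logdetK + r @ np.linalg.solve(K, r))        # multivariate normal likelihood
    logprior = -0.5 * (theta**2 + beta**2) / 100.0               # vague normal priors (assumed)
    return loglik + logprior

def metropolis(x_obs, y_obs, h=0.2, n_iter=5000, step=0.05, seed=1):
    """Random-walk Metropolis sampler over all four unknowns."""
    rng = np.random.default_rng(seed)
    cur = np.array([1.0, 0.0, np.log(0.5), np.log(0.1)])         # assumed starting point
    cur_lp = log_posterior(cur, x_obs, y_obs, h)
    chain = np.empty((n_iter, 4))
    for i in range(n_iter):
        prop = cur + step * rng.normal(size=4)                   # random-walk proposal
        prop_lp = log_posterior(prop, x_obs, y_obs, h)
        if np.log(rng.random()) < prop_lp - cur_lp:              # Metropolis accept/reject
            cur, cur_lp = prop, prop_lp
        chain[i] = cur
    return chain   # columns: theta, beta, log sigma_b, log sigma
```

The trace plots, histograms, and scatter plots on the next slide can be produced directly from such a chain.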

- 14 - Calibration with discrepancy
Results
–MCMC results are given for h = 0.2. The left and right figures show the trace plots and histograms of θ, β, σ_b, respectively.
–The following left and right figures show scatter plots of (θ, β) and (θ, σ_b), respectively.

- 15 - Calibration with discrepancy
Results
–The two parameters (θ, β) are severely correlated. This means the computer model y^M = m(x, θ) and the bias function are confounded through the relation y^R = y^M + b(x). As a result, they are not statistically identifiable.
–Nevertheless, the prediction will be the same whatever the combination is. In this sense, interpretation of the calibrated parameter is not important.

- 16 - Calibration with discrepancy
Results
–Result of the 1st attempt
–Result of the 2nd attempt

- 17 - Calibration with discrepancy
Results
–Result of the 3rd attempt
–Result comparison