Session 3: Calibration
Using observations of the real process


Outline
- Calibration and its relatives
  - Calibration and inversion
  - History matching
  - Tuning and extrapolation
  - Data assimilation
  - Validation
  - The role of emulators
- Model discrepancy
  - Why we need to acknowledge model discrepancy
  - Modelling model discrepancy
- Case study – history matching the galaxy

Calibration and its relatives

Using observations of the real process
- Simulation models are nearly always intended to represent some real-world process
- The issues addressed in this session all arise when we take that representation seriously and try to relate the simulator to observations of the real process
- Three parts to this session:
  - Describing different ways that observational data can be used
  - Explaining the importance of model discrepancy, the link between model and reality
  - A case study in a serious and challenging model

Terminology
- A simulation model produces output from inputs
- It has two kinds of inputs:
  - Calibration parameters: unknown but fixed
  - Control variables: known parameters of the application context
- Calibration and the other tasks considered in this session have one common feature: using observations of the real process
- But they differ slightly in the way those observations are used, and in their underlying objectives

Notation
- The simulation model has the form y = f(x, θ), where y is the output, θ denotes the calibration parameters and x denotes the control variables
- So the model itself is the function f
- Observations take the form z_i = r(x_i) + ε_i, where ε_i denotes observation error and r(x) denotes reality under conditions x
- Note that reality does not depend on the calibration parameters

Calibration
- Calibration involves using the observational data to learn about the values of the calibration parameters
- The traditional method writes z_i = f(x_i, θ) + ε_i, equating the model output f(x_i, θ) with reality r(x_i)
- Estimate θ, e.g. by minimising the sum of squared residuals (a sketch follows)
- Call the estimate t and predict (extrapolate) the real-world process at a new x value by f(x, t)
- This ignores uncertainty about θ: it treats θ as now known to equal t
- A Total UQ philosophy demands that we quantify posterior uncertainty in θ after using the data to learn about it
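
To make the traditional approach concrete, here is a minimal sketch of least-squares calibration followed by a plug-in prediction. The simulator f, the data and the starting value are invented placeholders rather than anything from the session; the point is only that the estimate t ends up being treated as if it were known exactly.

```python
# Minimal sketch of traditional calibration by least squares (illustrative only;
# the simulator f, the observations z and the control settings x are made up).
import numpy as np
from scipy.optimize import minimize

def f(x, theta):
    # Stand-in simulator: any deterministic function of the control variable x
    # and the calibration parameters theta would do here.
    return theta[0] * x + theta[1] * x**2

x_obs = np.array([0.5, 1.0, 1.5, 2.0, 2.5])        # control settings
z_obs = np.array([0.40, 0.85, 1.40, 2.05, 2.80])   # noisy field observations

def sum_sq_residuals(theta):
    return np.sum((z_obs - f(x_obs, theta))**2)

result = minimize(sum_sq_residuals, x0=np.array([0.5, 0.0]))
t_hat = result.x                                    # the point estimate t
print("estimate t =", t_hat)
print("plug-in prediction at x = 4:", f(4.0, t_hat))
# Note: this plug-in prediction ignores all posterior uncertainty about theta.
```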

Inversion
- Calibration is often referred to in some fields as inversion
- Implicitly, the idea is to take the observations, represented as z = f(x, θ) = f_x(θ), in which z and x are known, and solve for θ = f_x⁻¹(z)
- Inverse problems of this kind are extensively studied
- Since in practice we do not have f_x⁻¹(·), inversion usually boils down to searching the parameter space, just as in calibration
- Note that inversion simply tries to find θ, but strict solutions do not exist because of observation error
- We need to recognise uncertainty in the observations, and then in θ; Bayesian methods are often used for this reason

History matching
- Calibration (or inversion) is referred to by several other names in different fields
- For some communities it is called history matching; however, we will use this term with a slightly different meaning
- In calibration we (explicitly or implicitly) search the θ space to learn about its value from how close f(x, θ) gets to reality
- In history matching we simply try to identify the part of θ space in which the simulator gets close enough to the observations, according to a criterion of plausibility (sketched below)
- History matching is often a useful preliminary to calibration, or just a way to see whether any acceptable matches exist
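
A common way to implement the plausibility criterion is an implausibility measure that compares each candidate θ with the observation, scaled by the emulator (code), observation and discrepancy variances. The sketch below is illustrative only: the numbers, the univariate setting and the conventional cutoff of 3 are assumptions, not values from the case study.

```python
# Illustrative implausibility calculation for history matching (a sketch, not
# the presenters' code). For each candidate theta we compare the emulator mean
# with the observation z, scaled by all recognised variances.
import numpy as np

def implausibility(z, emulator_mean, emulator_var, obs_var, discrepancy_var):
    # Standard univariate implausibility measure I(theta).
    return np.abs(z - emulator_mean) / np.sqrt(emulator_var + obs_var + discrepancy_var)

# Hypothetical numbers: an emulator evaluated over a grid of candidate theta values.
thetas = np.linspace(0.0, 1.0, 201)
emulator_mean = 2.0 * thetas                 # pretend emulator prediction at the observed x
emulator_var = 0.05**2 * np.ones_like(thetas)
z, obs_var, discrepancy_var = 1.3, 0.1**2, 0.15**2

I = implausibility(z, emulator_mean, emulator_var, obs_var, discrepancy_var)
not_ruled_out = thetas[I < 3.0]              # conventional 3-sigma cutoff
print("non-implausible range of theta:", not_ruled_out.min(), "to", not_ruled_out.max())
```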

Tuning
- Tuning is another word that in some communities is synonymous with calibration; however, it often implies a slightly different purpose
- The purpose of calibration is typically to learn about the parameters θ, as a scientific question
- Tuning is typically done in order to predict the real process
- The activity of tuning or calibration (or inversion) is the same: derive a posterior distribution for θ
- But this is then used to predict f(x, θ) at new control inputs x
- When the prediction is for x outside the range of the observations, it becomes extrapolation, which is particularly challenging

Tuning and physical parameters
- Simulator parameters may be physical or just for tuning
- Physical parameters have true values in the real world; we are often genuinely interested in those values
- Tuning parameters do not have true physical values; they often represent crude adjustments for missing physics, and their values are whatever makes the model fit reality best
- In the tuning task we learn about both sets, which together make up the set of calibration parameters θ
- We may hope to learn about physical parameter values as a by-product of tuning

Data assimilation
- Many simulators are dynamic: at each time step the current state vector ξ_t is updated, possibly depending on forcing inputs and other parameters
- In data assimilation, observations of the process become available at different time points
- They are used to tune the state vector sequentially, with the intention of improving the model's tracking over time
- So data assimilation is a form of calibration or tuning
- Typically, uncertainty about ξ_t is accounted for and updated (Kalman filter, ensemble Kalman filter, etc.); a single analysis step is sketched below
- It is not usual to learn about other fixed calibration parameters, but we should do so in the interests of Total UQ
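
For concreteness, below is a minimal numpy sketch of one ensemble Kalman filter analysis step of the kind alluded to above. The state dimension, the observation operator H and the error covariances are invented for the example; real assimilation systems are far larger and also need the forecast step between observation times.

```python
# A minimal sketch of one ensemble Kalman filter analysis step (illustrative;
# dimensions and the observation operator H are made up for the example).
import numpy as np

rng = np.random.default_rng(0)

n_state, n_ens, n_obs = 4, 50, 2
H = np.zeros((n_obs, n_state)); H[0, 0] = 1.0; H[1, 2] = 1.0   # observe components 0 and 2
R = 0.1**2 * np.eye(n_obs)                                      # observation error covariance

xi_forecast = rng.normal(size=(n_state, n_ens))                 # forecast ensemble of state vectors
z = np.array([0.5, -0.2])                                       # observations at this time step

# Sample covariance of the forecast ensemble
X = xi_forecast - xi_forecast.mean(axis=1, keepdims=True)
P = X @ X.T / (n_ens - 1)

# Kalman gain and perturbed-observation update of each ensemble member
K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
z_perturbed = z[:, None] + rng.multivariate_normal(np.zeros(n_obs), R, size=n_ens).T
xi_analysis = xi_forecast + K @ (z_perturbed - H @ xi_forecast)
print("analysis ensemble mean:", xi_analysis.mean(axis=1))
```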

Validation
- The last use of observations is quite different
- Validation is concerned with assessing the validity of the simulator as a representation of reality
- It is part of verification and validation (V&V):
  - Verification asks whether the simulation model has been implemented/coded correctly
  - Validation asks whether it can get sufficiently close to reality, once it has been tuned
- A simple form of validation is offered by history matching: the model can be declared valid if adequate matches exist

The role of emulation
- All of these tasks involve searching through the parameter space, comparing f(x, θ) with z for many values of θ
- In principle they can be performed without emulation, as long as the simulator is fast enough
- But slow simulators and high-dimensional parameter spaces often make emulation essential
- As always, we need to allow for code uncertainty
- The toolkit has pages on calibration and history matching with emulators
- But the second part of this session concentrates on another very important source of uncertainty

Model discrepancy
Relating the simulator to reality

A fundamental error
- When presenting calibration, I said the traditional approach equates the simulator to reality
- The assumption is that for the true value of θ we have r(x) = f(x, θ)
- Unfortunately, all models are wrong
  - "All models are wrong but some are useful" (George E. P. Box, 1979)
- The following simple example explores what happens when we fail to acknowledge this key fact

Example: a simple machine (SM)
- A machine produces an amount of work y which depends on the amount of effort t put into it
- Model: y = f(t, β) = βt
  - Control variable t
  - Calibration parameter β is the rate at which effort is converted to work
- The true value of β is 0.65
- A graph (not reproduced here) shows the observed data: for large enough t the points lie below y = 0.65t, because the model is wrong; there are losses due to friction etc. (a synthetic stand-in for this setup is sketched below)
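
The slides show the SM data only as a graph, so the sketch below generates a synthetic stand-in: the true conversion rate 0.65 is taken from the slide, but the quadratic friction loss and the observation error standard deviation are assumptions chosen purely so that the later sketches have data to work with.

```python
# Synthetic data for the simple machine (SM) example. The true form of the
# friction loss is not given on the slide, so the quadratic loss below is an
# assumed illustration; only the true beta = 0.65 comes from the slides.
import numpy as np

rng = np.random.default_rng(1)
beta_true = 0.65

def simulator(t, beta):
    return beta * t                       # the (wrong) model: work = beta * effort

def reality(t):
    return beta_true * t - 0.02 * t**2    # assumed friction loss, growing with effort

t_obs = np.linspace(0.5, 4.0, 12)
sigma = 0.02                              # assumed observation error s.d.
z_obs = reality(t_obs) + rng.normal(0.0, sigma, size=t_obs.size)

# The model output increasingly exceeds reality as effort grows, so the
# observations fall further and further below the line y = 0.65 t.
print("model minus reality at t_obs:", simulator(t_obs, beta_true) - reality(t_obs))
```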

SM – calibration with no discrepancy
- We wish to calibrate this model to learn about the true value of β, using observations z_i
- With no model discrepancy, this case reduces to a simple linear regression: z_i = βt_i + ε_i
- The posterior distribution of β is found by a simple regression analysis (the posterior mean and standard deviation were shown on the slide); a numerical illustration follows
- The true value 0.65 is well outside this distribution
- More data makes things worse
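
Here is a numerical illustration of that regression analysis, using the synthetic SM data from the sketch above (regenerated so the snippet runs on its own). With a flat prior and a known error standard deviation the posterior for β is available in closed form; the exact numbers depend on the assumed friction term, but the qualitative behaviour, a tight posterior centred on a biased value, is the point.

```python
# Posterior for beta when model discrepancy is ignored: z_i = beta * t_i + eps_i.
# With a flat prior and known error s.d., this is simple regression through the
# origin. Data are generated as in the SM sketch above (assumed friction term).
import numpy as np

rng = np.random.default_rng(1)
beta_true, sigma = 0.65, 0.02
t_obs = np.linspace(0.5, 4.0, 12)
z_obs = beta_true * t_obs - 0.02 * t_obs**2 + rng.normal(0.0, sigma, size=t_obs.size)

post_mean = np.sum(t_obs * z_obs) / np.sum(t_obs**2)
post_sd = sigma / np.sqrt(np.sum(t_obs**2))
print(f"posterior mean {post_mean:.3f}, sd {post_sd:.4f}")     # mean falls below 0.65
print("true value within 3 sd?", abs(beta_true - post_mean) < 3 * post_sd)
# The fitted value sits well below 0.65 while the posterior s.d. is only a few
# thousandths, so the true value is many standard deviations out; adding more
# observations shrinks the s.d. further and makes the problem worse.
```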

SM – calibration, no discrepancy
- With increasing data the posterior becomes more and more concentrated on the wrong (best-fit) value

The problem is completely general
- Calibrating (inverting, tuning, matching) a wrong model gives parameter estimates that are wrong: not equal to their true physical values, i.e. biased
- With more data we become more sure of these wrong values
- The simple machine is a trivial model, but the same conclusions apply to all simulation models; all models are wrong
- In more complex models it is just harder to see what is going wrong
- Even with the SM, it takes a lot of data to see any curvature in reality

Model discrepancy
- The SM example demonstrates that we need to accept that the model does not correctly represent reality, for any values of the calibration parameters
- The simulator outputs deviate systematically from reality; call this model bias or model discrepancy
- There is a difference between the model with its best/true parameter values and reality: r(x) = f(x, θ) + δ(x), where δ(x) represents this discrepancy
- δ(x) will typically itself have uncertain parameters

SM revisited
- Kennedy and O’Hagan (2001) introduced this model discrepancy and modelled it as a zero-mean Gaussian process
- They claimed that it acknowledges additional uncertainty and mitigates over-fitting of θ
- So add this model discrepancy term to the linear model of the simple machine: r(t) = βt + δ(t), with δ(t) modelled as a zero-mean GP
- The posterior distribution of β now behaves quite differently (a sketch follows)
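
Below is a numpy-only sketch of the same calibration with a zero-mean GP discrepancy added. Marginalising δ(t) gives z ~ N(βt, K + σ²I), so with a flat prior the posterior for β is a generalised least squares calculation. The squared-exponential kernel and its hyperparameters are assumptions made for illustration, not the settings used by Kennedy and O’Hagan.

```python
# SM calibration with a zero-mean GP model discrepancy (a numpy-only sketch;
# the squared-exponential kernel hyperparameters below are assumed).
# Marginalising delta gives z ~ N(beta * t, C) with C = K + sigma^2 I, so with
# a flat prior the posterior for beta is a generalised least squares estimate.
import numpy as np

rng = np.random.default_rng(1)
beta_true, sigma = 0.65, 0.02
t_obs = np.linspace(0.5, 4.0, 12)
z_obs = beta_true * t_obs - 0.02 * t_obs**2 + rng.normal(0.0, sigma, size=t_obs.size)

def sq_exp_kernel(a, b, variance=0.3**2, length=2.0):
    return variance * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / length**2)

C = sq_exp_kernel(t_obs, t_obs) + sigma**2 * np.eye(t_obs.size)
C_inv = np.linalg.inv(C)

post_var = 1.0 / (t_obs @ C_inv @ t_obs)
post_mean = post_var * (t_obs @ C_inv @ z_obs)
print(f"posterior mean {post_mean:.3f}, sd {np.sqrt(post_var):.3f}")
# With the discrepancy acknowledged, the posterior s.d. is far larger than in
# the no-discrepancy analysis, so the data no longer pin beta down to a single
# biased value; exactly how wide it is depends on the assumed kernel.
```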

SM – calibration, with discrepancy
- The posterior distribution covers the true value, and does not get worse with increasing data

Extrapolation
- To reinforce the message, look at extrapolation: predicting the real process at control variable values outside the range where we have data
- Implicitly, the data are used to calibrate, so with traditional calibration we know the model fits reality as well as possible in the range of the data
- But without model discrepancy:
  - The parameter estimates will be biased
  - Extrapolation will also be biased, because the best-fitting parameter values are different in different parts of the control variable space
  - With more data we become more sure of these wrong values

SM – extrapolation, no discrepancy
- Even a minor extrapolation (t = 5) is hopelessly wrong, and it gets worse with increasing data

SM – interpolation, no discrepancy
- Even interpolation (t = 1) is hopelessly wrong too, and gets worse with increasing data

SM – extrapolation, with discrepancy
- With model discrepancy, extrapolation is acceptable even for a large sample, and interpolation is very good

SM – big extrapolation, with discrepancy
- Although if we extrapolate far enough we find problems, despite including model discrepancy

Beyond simple model discrepancy
- With a simple GP model discrepancy, the posterior distribution for θ is typically very wide
  - This tends to ensure we cover the true value, but it is not very helpful, and increasing data does not improve the precision
- Similarly, extrapolation with model discrepancy gives wide prediction intervals, and they may still not be wide enough
- How can we do better? Primarily by having better prior information

Nonidentifiability
- The formulation with model discrepancy is not identifiable: for any θ, there is a δ(x) that matches reality perfectly (illustrated below)
  - Reality is r(x) = f(x, θ) + δ(x)
  - Given θ, the model discrepancy is δ(x) = r(x) − f(x, θ)
- Suppose we had an unlimited number of observations
  - We would learn reality's true function r(x) exactly
  - But we would still not learn θ; it could in principle be anything
  - And we would still not be able to extrapolate reliably
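
The nonidentifiability argument can be seen in a few lines of code: for any value of θ whatsoever, defining δ(x) as the gap between reality and the simulator reproduces reality exactly, so observations alone cannot distinguish the candidates. The toy functions below are invented for the illustration.

```python
# Nonidentifiability in one line: whatever value theta takes, defining the
# discrepancy as delta(x) = r(x) - f(x, theta) reproduces reality exactly, so
# the data alone cannot separate theta from delta. (Toy functions assumed.)
import numpy as np

x = np.linspace(0.0, 5.0, 50)
reality = lambda x: 0.65 * x - 0.02 * x**2         # stand-in "true" process
simulator = lambda x, theta: theta * x

for theta in (0.3, 0.65, 1.2):
    delta = reality(x) - simulator(x, theta)       # induced discrepancy
    recon = simulator(x, theta) + delta
    print(theta, np.allclose(recon, reality(x)))   # True for every theta
```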

The joint posterior
- Calibration leads to a joint posterior distribution for θ and δ(x)
- But nonidentifiability means there are many equally good fits (θ, δ(x)) to the data, which induces strong correlation between θ and δ(x)
- This may be compounded by the fact that simulators often have large numbers of parameters
  - (Near-)redundancy means that different θ values produce (almost) identical predictions, sometimes called equifinality
- Within this set, the prior distributions for θ and δ(x) count

The importance of prior information
- The nonparametric GP term allows the model to fit and predict reality accurately given enough data, within the range of the data
- But it does not mean that the physical parameters are correctly estimated
  - The separation between the original model and the discrepancy is unidentified, so the estimates depend on prior information
  - Unless the real model discrepancy is just the kind expected a priori, the physical parameter estimates will still be biased
- To learn about θ in the presence of model discrepancy we need better prior information, and this is also crucial for extrapolation

Better prior information
- For calibration: prior information about θ and/or δ(x)
  - We wish to calibrate because prior information about θ is not strong enough
  - So prior knowledge of model discrepancy is crucial, in the range of the data
  - In the SM, a model for δ(t) that says it is zero at t = 0, with gradient zero, but then increasingly negative, should do better (a sketch of such a prior follows)
  - See the talk on Monday by Jenný Brynjarsdóttir
- For extrapolation: all this plus good prior knowledge of δ(x) outside the range of the calibration data
  - That is seriously challenging!
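
As an illustration of the kind of structured prior suggested for the SM, the sketch below replaces the free-form GP with δ(t) = −γt², which is zero at t = 0, has zero gradient there, and becomes increasingly negative, with a half-normal prior on γ ≥ 0. The quadratic form, the prior scale and the grid-based posterior are all assumptions made for the example, loosely echoing the Brynjarsdóttir and O’Hagan idea rather than reproducing it.

```python
# Sketch of a more informative discrepancy prior for the SM: force delta(0)=0,
# delta'(0)=0 and make it increasingly negative by writing delta(t) = -gamma*t^2
# with gamma >= 0. The quadratic form and the half-normal prior scale are
# assumptions for illustration.
import numpy as np

rng = np.random.default_rng(1)
beta_true, sigma = 0.65, 0.02
t = np.linspace(0.5, 4.0, 12)
z = beta_true * t - 0.02 * t**2 + rng.normal(0.0, sigma, size=t.size)

# Grid posterior over (beta, gamma): flat prior on beta, half-normal prior on gamma.
betas = np.linspace(0.4, 0.9, 401)
gammas = np.linspace(0.0, 0.1, 201)
B, G = np.meshgrid(betas, gammas, indexing="ij")
resid = z[None, None, :] - (B[..., None] * t - G[..., None] * t**2)
log_post = -0.5 * np.sum(resid**2, axis=-1) / sigma**2 - 0.5 * (G / 0.05)**2
post = np.exp(log_post - log_post.max())
post /= post.sum()

beta_marg = post.sum(axis=1)                        # marginalise over gamma
beta_mean = np.sum(betas * beta_marg)
beta_sd = np.sqrt(np.sum((betas - beta_mean)**2 * beta_marg))
print(f"posterior for beta: mean {beta_mean:.3f}, sd {beta_sd:.3f}")
# With the discrepancy structure matching how the model is wrong, the posterior
# recentres near the true value 0.65 instead of the biased best-fit value.
```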

Careful modelling of discrepancy
- In principle, we can learn more if we put in more and better prior information about model discrepancy
- This is an important area of ongoing research
- But some illustrations of the issues that arise may be instructive

Hierarchies of simulators
- Often we have hierarchies of simulators
- Usually the resolution increases from one level to the next, but additional processes could also be added

Hierarchies of simulators
- Rather than emulate each simulator separately, emulate simulator 1 and then emulate the difference between outputs at each level (see the sketch below)
- This requires some runs at common inputs
- It needs only a few runs of the expensive, complex simulators
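
Here is a minimal sketch of that multilevel strategy using scikit-learn Gaussian process emulators: emulate the cheap level-1 simulator from many runs, emulate the level-2 minus level-1 difference from a few runs at shared inputs, and add the two emulators (and their code uncertainties, approximately) to predict level 2. The toy simulators, kernels and run sizes are all invented for the example.

```python
# A minimal sketch of multilevel emulation: emulate the cheap simulator, then
# emulate the difference between levels at shared inputs (toy simulators and
# scikit-learn GPs; none of this is the presenters' code).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def sim_level1(x):                  # cheap, low-resolution simulator
    return np.sin(3 * x)

def sim_level2(x):                  # expensive simulator: level 1 plus extra physics
    return np.sin(3 * x) + 0.3 * x

x_cheap = np.linspace(0, 2, 30)[:, None]       # many cheap runs
x_common = np.linspace(0, 2, 6)[:, None]       # few expensive runs at shared inputs

gp_level1 = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-6).fit(
    x_cheap, sim_level1(x_cheap.ravel()))
gp_diff = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-6).fit(
    x_common, sim_level2(x_common.ravel()) - sim_level1(x_common.ravel()))

x_new = np.array([[0.7], [1.4]])
mean1, sd1 = gp_level1.predict(x_new, return_std=True)
mean_d, sd_d = gp_diff.predict(x_new, return_std=True)
# Emulated level-2 output: level-1 emulator plus emulated difference, with an
# approximate combined code uncertainty from both components.
print("level-2 prediction:", mean1 + mean_d, "+/-", np.sqrt(sd1**2 + sd_d**2))
```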

Reified simulators
- Modelling the relationship between Simulator 1 and reality is complex
- Much of its model discrepancy is linked to the improvements possible with Simulator 2, Simulator 3, ...

Reified simulators
- Linking Simulator 2 to reality is almost as tricky
- And the data cannot be used twice

Reified simulators
- The reified simulator lies at the end of the currently foreseeable models
- Its relationship with reality is simpler
- The other simulators link to reality through the reified simulator

Reified simulators
- Reified simulators are 'imaginary' simulators that we place between our simulators and reality
- They are the 'best' simulator we could envisage at this time
- Model discrepancy is split into two parts:
  1. The discrepancy between the current simulator and the reified simulator
  2. The discrepancy between the reified simulator and reality
- Reification does not reduce the discrepancy, but it might make it easier to elicit
- Reification is one quite formal way to think about model discrepancy

Conclusions …
- Several tasks rely on observational data
- All are deeply compromised if we don't acknowledge and quantify model discrepancy:
  - Calibration/inversion/tuning: parameter estimates wrong, distributions too tight; over-fitting and over-confidence
  - Tuning/prediction/extrapolation: predictions wrong and over-confident
  - Data assimilation: over-reaction to data, and over-confidence again
  - Validation: only through correcting the discrepancy can a model be valid

… and more conclusions
- Total UQ demands that we quantify all uncertainties, or at least try to, and acknowledge those that remain unquantified
- Model discrepancy is an important source of uncertainty
  - Quantifying prior beliefs about discrepancy is hard but important, and an active research area
- Analyses incorporating model discrepancy are more complex, but also more honest and less self-deceptive
- Data assimilation is particularly challenging:
  - Uncertainty about both the state vector and the fixed calibration parameters, which is rarely handled
  - Plus model discrepancy uncertainty
  - Plus code uncertainty when we need to emulate

Another conference
- UCM 2012 is still open for poster abstracts
- Early-bird registration deadline: 30 April