Engineering subprogramme, 7 November 2006 Tony O’Hagan.

Outline
Three parts:
- Turbofan engine vibration model
- Reification
- Predictors and validation

Part 1: The new model

Turbofan vibration model
- Rolls-Royce, Derby, UK: maker of civil aeroplane engines
- Simulator of a fan assembly; our example has 24 blades
- Primary concern is with vibration: if amplitude is too high on any one blade it may break, and in effect this will destroy the engine
- [Image: Rolls-Royce Trent 500 engine]

Model details
- 24 inputs: the vibration resonant frequency of each blade
- 24 outputs: the amplitude of vibration for each blade
- Other factors:
  - Amount of damping: more damping results in more complex behaviour and longer model run times
  - Model resolution: it is possible to run the solver on higher or lower resolution grids
  - Could also vary e.g. the number of blades, operating rpm and temperature

Parameter uncertainty
- It is not possible to manufacture and assemble blades to be all identical and perfectly oriented
- Variation in the resonant frequencies of the blades creates complex variations in their vibration amplitudes
- The uncertainty distribution on each model input is the distribution achieved within manufacturing tolerances
- Question: given an assembly of blades sampled from this distribution, what is the risk of high-amplitude vibrations resulting?
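The risk question lends itself to Monte Carlo: sample blade assemblies from the manufacturing-tolerance distribution, evaluate the simulator (or, in practice, an emulator) on each, and count how often any blade exceeds the amplitude limit. A minimal sketch follows; the amplitude function, the normal tolerance distribution, and the threshold are all made-up stand-ins for the proprietary Rolls-Royce model, purely to show the shape of the calculation:

```python
import numpy as np

rng = np.random.default_rng(42)

N_BLADES = 24          # as in the fan assembly example
N_ASSEMBLIES = 10_000  # Monte Carlo sample size
AMP_LIMIT = 1.5        # hypothetical "too high" amplitude threshold

def toy_amplitude(freqs):
    """Hypothetical stand-in for the vibration simulator: each blade's
    amplitude depends on its own frequency and its neighbours'
    (mimicking the dependence structure described in the talk)."""
    neighbours = 0.5 * (np.roll(freqs, 1, axis=-1) + np.roll(freqs, -1, axis=-1))
    return np.abs(freqs - neighbours) + 1.0

# Manufacturing variation: resonant frequencies within tolerance,
# modelled here (an assumption) as independent normals per blade.
freqs = rng.normal(loc=100.0, scale=0.5, size=(N_ASSEMBLIES, N_BLADES))

amps = toy_amplitude(freqs)
max_amp = amps.max(axis=1)            # worst blade in each assembly
risk = np.mean(max_amp > AMP_LIMIT)   # P(any blade exceeds the limit)
print(f"estimated risk of high-amplitude vibration: {risk:.3f}")
```

Because the real simulator is far too slow for 10,000 runs, the point of the emulation strategy below is precisely to make this loop affordable.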

Emulation
Strategy:
- Emulate a single output: blade 1 amplitude
- 24 inputs: the frequencies of blades 1 to 24
- Because of rotational symmetry, each model run gives up to 24 design points
- Simulate random blade assemblies
Results:
- The output depends most strongly on the blade 1 input
- Also on the neighbouring inputs, 2 and 24, etc.
- But there are high-order dependencies on all inputs
- So far we have failed to emulate accurately, even with very many design points
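The single-output Gaussian-process emulator attempted here can be sketched in a few lines of numpy. The toy below emulates a one-input function from a handful of design runs rather than the 24-input amplitude, and the zero-mean squared-exponential GP with fixed hyperparameters is an illustrative choice, not the emulator actually used:

```python
import numpy as np

def rbf(X1, X2, length=1.0, sig2=1.0):
    """Squared-exponential covariance between rows of X1 and X2."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return sig2 * np.exp(-0.5 * d2 / length**2)

def gp_emulate(X, y, Xnew, length=1.0, sig2=1.0, nugget=1e-8):
    """Zero-mean GP conditioned on design runs (X, y): returns the
    predictive mean and variance at the new inputs Xnew."""
    K = rbf(X, X, length, sig2) + nugget * np.eye(len(X))
    Ks = rbf(Xnew, X, length, sig2)
    Kss = rbf(Xnew, Xnew, length, sig2)
    alpha = np.linalg.solve(K, y)
    mean = Ks @ alpha
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)

# Toy demo: emulate f(x) = sin(x) from 8 design runs on [0, 2*pi].
X = np.linspace(0, 2 * np.pi, 8)[:, None]
y = np.sin(X).ravel()
Xnew = np.array([[1.0], [4.0]])
mean, var = gp_emulate(X, y, Xnew)
```

With smooth one-dimensional output this interpolates well; the talk's point is that the blade-amplitude surface, with its high-order dependencies on all 24 inputs, defeats exactly this kind of construction.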

Challenges
- What is going on here?
- Can we find a way to achieve the original strategy?
- Should we try instead to emulate the maximum amplitude? This may also be badly behaved!

Part 2: Reification

Reification: background
- Kennedy & O'Hagan (2001), "Bayesian calibration of computer models" (KO'H henceforth)
- Goldstein & Rougier (2006), "Reified Bayesian modelling and inference for physical systems" (GR henceforth)
- GR discuss two problems with KO'H:
  1. The meaning of the calibration parameters is unclear
  2. Assuming a stationary model discrepancy, independent of the code, is inconsistent if better models are possible
- Reification is their solution

Meaning of calibration parameters
- The model is wrong
- We need prior distributions for the calibration parameters
- Some may just be tuning parameters with no physical meaning: how can we assign priors to these?
- Even for those that have physical meanings, the model may fit observational data better with wrong values
- What does a prior mean for a parameter in a wrong model?

Example: some kind of machine
- The simulator says output is proportional to input: energy in gives work out
- The proportionality parameter has physical meaning
- Observations are made with error
- Without model discrepancy, this is a simple linear model
- The LS estimate of the slope is not the true parameter value, 0.65
- [Plot of the observations, X against Y]

Model discrepancy
- The red line is the LS fit; the black line is the simulator with the true parameter, 0.65
- The model is wrong: in reality there are energy losses

Case 1
Suppose we have:
- No model discrepancy term
- A weak prior on the slope
Then we'll get:
- Calibration close to the LS value
- Quite good predictive performance in [0, 2+]
- Poor estimation of the physical parameter

Case 2
Suppose we have:
- No model discrepancy term
- An informative prior on the slope, based on knowledge of the physical parameter and centred around 0.65
Then we'll get:
- Calibration between the LS and prior values
- Not so good predictive performance
- Poor estimation of the physical parameter

Without model discrepancy
- Calibration is just nonlinear regression: y = f(x, θ) + e, where f is the computer code
- Quite good predictive performance can be achieved if there is a θ for which the model gets close to reality
- Prior information based on the physical meaning of θ can be misleading: poor calibration, poor prediction
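The machine example makes this concrete. The simulator is f(x, θ) = θx; reality loses energy, so observations fall short of the ideal line (the quadratic loss term below is an assumed form, purely for illustration); and calibration without a discrepancy term is just least squares, which drags the slope estimate well away from the physical value 0.65:

```python
import numpy as np

rng = np.random.default_rng(1)

theta_true = 0.65                # physical proportionality parameter
x = np.linspace(0.1, 2.0, 20)    # inputs over the range [0, 2+]

# Simulator: f(x, theta) = theta * x (no losses).
# Reality (an assumed form, for illustration only): losses grow with x,
# so the output falls short of theta_true * x. Small observation error.
y = theta_true * x - 0.1 * x**2 + rng.normal(0, 0.01, x.size)

# Calibration without a discrepancy term is just least squares:
# choose theta minimising sum (y - theta*x)^2; closed form below.
theta_hat = (x @ y) / (x @ x)
print(f"LS calibration: {theta_hat:.3f} vs physical value {theta_true}")
```

The calibrated slope fits the data well over the observed range (good prediction there) while saying little about the physical parameter: exactly the Case 1 behaviour.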

Case 3
Suppose we have:
- A GP (KO'H) model discrepancy term with constant mean
- A weak prior on the mean
- A weak prior on the slope
Then we'll get:
- Calibration close to the LS value for a regression with non-zero intercept: the GP takes the intercept
- A slope estimate, 0.518, that is now even further from the true physical parameter value, albeit more uncertain
- A discrepancy estimate that 'corrects' generally upwards

Case 4
Suppose we have:
- A GP (KO'H) model discrepancy term with constant mean
- A weak prior on the mean
- An informative prior on the slope, based on knowledge of the physical parameter and centred around 0.65
Then we'll get:
- Something like linear regression with an informative prior on the slope
- A slope estimate that is a compromise and loses physical meaning
- Weakened predictive accuracy

Adding simple discrepancy
- Although the GP discrepancy of KO'H is in principle flexible and nonparametric, it still fits primarily through its mean function
- The prediction looks like the result of fitting the regression model with nonlinear f plus the discrepancy mean
- This process does not give physical meaning to the calibrated parameters, even with informative priors
- The augmented regression model is also wrong

Reification
- GR introduce a new entity, the 'reified' model
- To reify is to attribute the status of reality
- Thus a reified simulator is one that we can treat as real, and in which the calibration parameters should take their physical values
- Hence prior distributions on them can be meaningfully specified and should not distort the analysis
- GR's reified model is a kind of thought experiment: conceptually, it is a model that corrects such (scientific and computational) deficiencies as we can identify in f

The GR reified model is not regarded as perfect
- It still has a simple additive model discrepancy, as in KO'H
- The discrepancy in the model is now made up of two parts:
  - The difference between f and the reified model, for which there is substantive prior information
  - The discrepancy of the reified model, independent of both models

Reification doubts
- Can the reified model's parameters be regarded as having physical meaning? Allowing for model discrepancy between the reified model and reality makes this questionable
- Do we need the reified model? Broadly speaking, the decomposition of the original model's discrepancy is sensible, but it amounts to no more than thinking carefully about model discrepancy and modelling it as informatively as possible

Case 5
Suppose we have:
- A GP model discrepancy term with a mean function that reflects the acknowledged deficiency of the model in ignoring losses to friction
- An informative prior on the slope, based on knowledge of the physical parameter
Then we'll get:
- Something more like the original intention of bringing in the model discrepancy!
- A slope parameter that is not too distorted, a model correction with physical meaning, and good predictive performance
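Case 5 can be illustrated by continuing the machine example. Give the discrepancy a mean of an assumed friction-loss form, -c·x² (illustrative only, not taken from the talk), put an informative prior on the slope, and the conjugate Gaussian linear-model calculation recovers a slope near the physical value while the discrepancy absorbs the losses:

```python
import numpy as np

rng = np.random.default_rng(1)

theta_true, sigma = 0.65, 0.01
x = np.linspace(0.1, 2.0, 20)
# Reality: ideal output theta*x minus an (assumed, illustrative)
# quadratic friction loss, observed with small error.
y = theta_true * x - 0.1 * x**2 + rng.normal(0, sigma, x.size)

# Calibrate y = theta*x + delta(x) + e with discrepancy mean -c*x^2:
# basis [x, -x^2], informative prior theta ~ N(0.65, 0.05^2),
# weak prior c ~ N(0, 1), known noise sd. Conjugate Gaussian posterior:
Phi = np.column_stack([x, -x**2])
mu0 = np.array([0.65, 0.0])                      # prior means
Lam0 = np.diag([1 / 0.05**2, 1 / 1.0**2])        # prior precisions
LamN = Phi.T @ Phi / sigma**2 + Lam0             # posterior precision
muN = np.linalg.solve(LamN, Phi.T @ y / sigma**2 + Lam0 @ mu0)
theta_post, c_post = muN
print(f"posterior slope {theta_post:.3f} (physical value {theta_true}), "
      f"loss coefficient {c_post:.3f}")
```

Unlike the LS calibration of the raw simulator, the slope here retains its physical interpretation because the discrepancy mean encodes what we actually know is wrong with the model.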

Moral
- There is no substitute for thinking
- Model discrepancy should be modelled as informatively as possible
- Inevitably, though, the discrepancy function will to a greater or lesser extent correct for unpredicted deficiencies
- Then the physical interpretations of the calibration parameters can be compromised
- If this is not recognised in their priors, those priors can distort the analysis

Final comments
- There is much more in GR than I have dealt with here; it definitely repays careful reading
- E.g. the relationships between different simulators of the same reality
- Their paper will appear in JSPI with discussion; this presentation is a pilot for my discussion!

Part 3: Validation

Simulators, emulators, predictors
- A simulator is a model, representing some real-world process
- An emulator is a statistical description of a simulator: not just a fast surrogate, but a full probabilistic specification of beliefs
- A predictor is a statistical description of reality: a full probabilistic specification of beliefs, i.e. an emulator plus a representation of the relationship between simulator and reality

Validation
- What can be meaningfully called validation? Validation should have the sense of demonstrating that something is right
- The simulator is inevitably wrong, so there is no meaningful sense in which we can validate it
- What about the emulator? It makes statements like, "We give probability 0.9 to the output f(x) lying in the range [a, b] if the model is run with inputs x."
- This can be right in the sense that (at least) 90% of such intervals turn out to contain the true output

Validating the emulator
- Strictly, we cannot demonstrate that the emulator actually is valid in that sense
- The best we can do is to check that the truth on a number of new runs lies appropriately within the probability bounds, and to apply as many such checks as we feel we need to give reasonable confidence in the emulator's validity
- In practice, check it against as many (well-chosen) new runs as possible
- Do Q-Q plots of standardised residuals and other diagnostic checks
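These checks are easy to automate. A sketch, using synthetic validation data in place of real held-out simulator runs (the "truth" below is generated consistently with the emulator, so the diagnostics should pass), computes standardised residuals and the empirical coverage of 90% credible intervals:

```python
import numpy as np

rng = np.random.default_rng(7)

# Pretend validation data: emulator predictive means and sds on 500
# new runs, plus the corresponding true simulator outputs. Here the
# truth is drawn from the emulator's own distribution, by construction.
n = 500
mean = rng.uniform(0, 10, n)
sd = rng.uniform(0.5, 2.0, n)
truth = rng.normal(mean, sd)

# Standardised residuals: roughly N(0, 1) if the emulator is valid;
# these are what would go into a Q-Q plot.
z = (truth - mean) / sd

# Coverage check: about 90% of 90% credible intervals should contain
# the true output (1.645 is the normal 95th percentile).
lo, hi = mean - 1.645 * sd, mean + 1.645 * sd
coverage = np.mean((truth >= lo) & (truth <= hi))

print(f"mean |z| = {np.mean(np.abs(z)):.2f}, 90% coverage = {coverage:.2f}")
```

With a real emulator, `mean` and `sd` would come from its predictive distribution at well-chosen new design points; systematic over- or under-coverage, or residuals far from N(0, 1), are the signals that the emulator's probability statements cannot be trusted.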

Validating a predictor
- The predictor is also a stochastic entity, so we can validate it in the same way
- Although getting enough observations of reality may be difficult
- We may have to settle for the predictor not yet having been shown to be invalid!

Validity, quality, adequacy
- So a predictor/emulator is valid if the truth lies appropriately within its probability bounds
- It could be conservative: we need severe testing tools for verification
- The quality of a predictor is determined by how tight those bounds are (refinement versus calibration)
- A predictor is adequate for purpose if the bounds are tight enough
- If we are satisfied that the predictor is valid over the relevant range, we can determine adequacy

Conclusion: terminology
- I would like to introduce the word 'predictor', alongside the already accepted 'emulator' and 'simulator'
- I would like the word 'validate' to be used in the sense I have used above
- Not in the sense that Bayarri, Berger, et al. have applied it, which has more to do with fitness for purpose, and hence involves not just validity but quality
- Models can have many purposes, but validity can be assessed independently of purpose