17 May 2007, RSS Kent Local Group. Quantifying uncertainty in the UK carbon flux. Tony O’Hagan, CTCD, Sheffield.

Presentation transcript:

Slide 1: Quantifying uncertainty in the UK carbon flux
Tony O’Hagan, CTCD, Sheffield
17 May 2007, RSS Kent Local Group

Slide 2: Outline
- Introduction
- Gaussian process emulation
- The England and Wales carbon flux 2000

Slide 3: Computer models
- In almost all fields of science, technology, industry and policy making, people use mechanistic models to describe complex real-world processes
  - For understanding, prediction, control
- There is a growing realisation of the importance of uncertainty in model predictions
  - Can we trust them?
  - Without any quantification of output uncertainty, it’s easy to dismiss them

Slide 4: Examples
- Climate prediction
- Molecular dynamics
- Nuclear waste disposal
- Oil fields
- Engineering design
- Hydrology

Slide 5: Uncertainty analysis
- Consider just one source of uncertainty
- We have a computer model that produces output y = f(x) when given input x
- But for a particular application we do not know x precisely
- So X is a random variable, and so therefore is Y = f(X)
- We are interested in the uncertainty distribution of Y
- How can we compute it?

Slide 6: Monte Carlo
- The usual approach is Monte Carlo
  - Sample values of x from its distribution
  - Run the model for all these values to produce sample values y_i = f(x_i)
  - These are a sample from the uncertainty distribution of Y
- Neat but impractical if it takes minutes or hours to run the model
- We can then only make a small number of runs
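The Monte Carlo recipe on this slide can be sketched in a few lines of Python. This is only an illustration: `simulator` is a cheap, hypothetical stand-in for the expensive model f, and the normal input distribution is an assumption made for the example, not something taken from the talk.

```python
import numpy as np

def simulator(x):
    """Hypothetical cheap stand-in for an expensive computer model f(x)."""
    return np.sin(3.0 * x) + 0.5 * x**2

def monte_carlo_uncertainty(f, sample_inputs, n_runs, rng):
    """Propagate input uncertainty through f by plain Monte Carlo."""
    x = sample_inputs(rng, n_runs)  # draw x_i from the distribution of X
    y = f(x)                        # one model run per draw: y_i = f(x_i)
    return y.mean(), y.std(ddof=1), y

def draw_x(rng, n):
    """Assumed input distribution: X ~ N(0.5, 0.2^2)."""
    return rng.normal(loc=0.5, scale=0.2, size=n)

rng = np.random.default_rng(0)
mean_y, sd_y, y = monte_carlo_uncertainty(simulator, draw_x, 10_000, rng)
print(f"mean of Y ~ {mean_y:.3f}, sd of Y ~ {sd_y:.3f}")
```

Ten thousand runs are trivial for this toy function. If each run took an hour, the same sample would need more than a year of computing, which is the practical objection the slide raises.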

Slide 7: GP solution
- Treat f(.) as an unknown function with Gaussian process (GP) prior distribution
- Use available runs as observations without error, to derive posterior distribution (also GP)
- Make inference about the uncertainty distribution
  - E.g. the mean of Y is the integral of f(x) with respect to the distribution of X
  - Its posterior distribution is normal conditional on GP parameters
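The statements on this slide can be written out as follows. The notation is assumed for illustration and is not taken from the slides: the GP parameters (prior mean m, variance σ² and correlation function c) are treated as known, and G denotes the distribution of X.

```latex
% Assumed notation: training inputs x_1, ..., x_n with error-free outputs
% y = (f(x_1), ..., f(x_n))^T; A_{ij} = c(x_i, x_j);
% t(x) = (c(x, x_1), ..., c(x, x_n))^T; m_D = (m(x_1), ..., m(x_n))^T.
\begin{align*}
f(\cdot) &\sim \mathrm{GP}\bigl(m(\cdot),\ \sigma^2 c(\cdot,\cdot)\bigr) \\
m^*(x) &= m(x) + t(x)^{\mathsf T} A^{-1} (y - m_D) \\
v^*(x,x') &= \sigma^2 \bigl[\, c(x,x') - t(x)^{\mathsf T} A^{-1} t(x') \,\bigr] \\
M &= \mathrm{E}[Y] = \int f(x)\, dG(x) \\
M \mid y &\sim \mathrm{N}\!\left( \int m^*(x)\, dG(x),\ \iint v^*(x,x')\, dG(x)\, dG(x') \right)
\end{align*}
```

Because M is a linear functional of f and the posterior of f is a GP, the posterior of M is normal, which is the result the last bullet states.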

Slide 8: Gaussian process emulation
- Principles of emulation
- The GP and how it works

Slide 9: Emulation
- A computer model encodes a function that takes inputs and produces outputs
- An emulator is a statistical approximation of that function
  - Estimates what outputs would be obtained from given inputs
  - With statistical measure of estimation error
  - Given enough training data, estimation error variance can be made small

Slide 10: So what?
- A good emulator
  - estimates the model output accurately
  - with small uncertainty
  - and runs “instantly”
- So we can do uncertainty analysis etc fast and efficiently
- Conceptually, we
  - use model runs to learn about the function
  - then derive any desired properties of the model

Slide 11: Gaussian process
- Simple regression models can be thought of as emulators
  - But error estimates are invalid
- We use Gaussian process emulation
  - Nonparametric, so can fit any function
  - Error measures can be validated
  - Analytically tractable, so can often do uncertainty analysis etc analytically
  - Highly efficient when many inputs
  - Reproduces training data correctly
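A minimal GP emulator for one input can be sketched with NumPy alone. The sketch assumes a zero prior mean, a squared-exponential covariance and fixed hyperparameters. These are simplifications chosen for brevity, not the formulation used in the talk, and `simulator` is again a hypothetical stand-in for the real model.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=0.3, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def fit_emulator(x_train, y_train, jitter=1e-10):
    """Condition a zero-mean GP on error-free training runs."""
    K = sq_exp_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    L = np.linalg.cholesky(K)  # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))  # K^{-1} y
    return {"x": x_train, "L": L, "alpha": alpha}

def predict(em, x_new):
    """Posterior mean and standard deviation at new inputs."""
    k_star = sq_exp_kernel(x_new, em["x"])
    mean = k_star @ em["alpha"]
    v = np.linalg.solve(em["L"], k_star.T)
    var = sq_exp_kernel(x_new, x_new).diagonal() - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def simulator(x):
    """Hypothetical stand-in for the expensive model."""
    return np.sin(3.0 * x) + 0.5 * x**2

x_train = np.linspace(0.0, 1.0, 5)
em = fit_emulator(x_train, simulator(x_train))
mean_train, sd_train = predict(em, x_train)       # at the training inputs
mean_mid, sd_mid = predict(em, np.array([0.125]))  # between two training inputs
```

At the training inputs the posterior mean matches the simulator output and the posterior standard deviation is essentially zero, which is the "reproduces training data correctly" property. Between training inputs the standard deviation is larger.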

Slide 12: 2 code runs
- Consider one input and one output
- Emulator estimate interpolates data
- Emulator uncertainty grows between data points

Slide 13: 3 code runs
- Adding another point changes estimate and reduces uncertainty

Slide 14: 5 code runs
- And so on
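The effect shown on the last three slides can be reproduced numerically. The sketch below assumes a zero-mean GP with unit prior variance and a fixed squared-exponential correlation length. Under those assumptions the posterior standard deviation depends only on where the runs are placed, not on the outputs, so no simulator is needed. The design points are hypothetical.

```python
import numpy as np

def kernel(a, b, length_scale=0.3):
    """Squared-exponential correlation with unit prior variance (assumed)."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length_scale) ** 2)

def posterior_sd(x_train, x_grid, jitter=1e-10):
    """Posterior standard deviation of the emulator over a grid of inputs."""
    K = kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    k_star = kernel(x_grid, x_train)
    # var(x) = c(x, x) - t(x)^T K^{-1} t(x), with c(x, x) = 1
    var = 1.0 - np.sum(k_star * np.linalg.solve(K, k_star.T).T, axis=1)
    return np.sqrt(np.clip(var, 0.0, None))

x_grid = np.linspace(0.0, 1.0, 201)
designs = {
    2: np.array([0.0, 1.0]),
    3: np.array([0.0, 0.5, 1.0]),
    5: np.linspace(0.0, 1.0, 5),
}
max_sd = {n: posterior_sd(x, x_grid).max() for n, x in designs.items()}
for n, s in max_sd.items():
    print(f"{n} code runs: largest posterior sd on [0, 1] = {s:.3f}")
```

The largest posterior standard deviation falls as runs are added, and it is effectively zero at each design point. When the GP variance is estimated from the runs, as is usual in practice, the uncertainty also depends on the observed outputs.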

Slide 15: BACCO
- This has led to a wide-ranging body of tools for inference about all kinds of uncertainties in computer models
- All based on building the GP emulator of the model from a set of training runs
- This area is now known as BACCO
  - Bayesian Analysis of Computer Code Output

Slide 16: BACCO includes
- Uncertainty analysis
- Sensitivity analysis
- Calibration
- Data assimilation
- Model validation
- Optimisation
- Etc…
- All within a single coherent framework

Slide 17: MUCM
- Managing Uncertainty in Complex Models
- Large 4-year research grant
  - Started in June
  - Postdoctoral research assistants
  - 4 PhD studentships
- Based in Sheffield, Durham, Aston, Southampton, LSE
- Objective: to develop BACCO methods into a robust technology that is widely applicable across the spectrum of modelling applications

Slide 18: Example: UK carbon flux in 2000
- Vegetation model predicts carbon exchange from each of 707 pixels over England & Wales
  - Principal output is Net Biosphere Production
- Accounting for uncertainty in inputs
  - Soil properties
  - Properties of different types of vegetation
- Aggregated to England & Wales total
  - Allowing for correlations
  - Estimate 7.55 Mt C
  - Std deviation 0.56 Mt C
- Analysis by Marc Kennedy and John Paul Gosling

Slide 19: SDGVMd outputs for 2000

Slide 20: Outline of analysis
1. Build emulators for each PFT at a sample of sites
2. Identify most important inputs
3. Define distributions to describe uncertainty in important inputs
   - Analysis of soils data
   - Elicitation of uncertainty in PFT parameters
   - Need to consider correlations

Slide 21: Outline of analysis (continued)
4. Carry out uncertainty analysis in each sampled site
5. Interpolate across all sites
   - Mean corrections and standard deviations
6. Aggregate across sites and PFTs
   - Allowing for correlations
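Why correlations matter in the aggregation step can be seen from the variance of a sum, which is the sum of every entry of the covariance matrix: the variances plus all the pairwise covariances. The sketch below uses made-up means, standard deviations and correlations for three components. None of these numbers come from the analysis.

```python
import numpy as np

def aggregate(means, sds, corr):
    """Mean and variance of a sum of correlated uncertain quantities."""
    means = np.asarray(means, dtype=float)
    sds = np.asarray(sds, dtype=float)
    cov = np.outer(sds, sds) * np.asarray(corr, dtype=float)
    total_mean = means.sum()
    total_var = cov.sum()  # 1^T Sigma 1: variances plus all covariances
    return total_mean, total_var

# Hypothetical values for three components (for example, three pixels).
means = [1.0, 2.0, 0.5]
sds = [0.2, 0.3, 0.1]
corr = [[1.0, 0.5, 0.2],
        [0.5, 1.0, 0.3],
        [0.2, 0.3, 1.0]]

total_mean, total_var = aggregate(means, sds, corr)
_, indep_var = aggregate(means, sds, np.eye(3))
print(f"total mean = {total_mean:.2f}")
print(f"sd allowing for correlations = {np.sqrt(total_var):.3f}")
print(f"sd if wrongly treated as independent = {np.sqrt(indep_var):.3f}")
```

With positive correlations, ignoring the covariance terms understates the uncertainty in the total.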

Slide 22: Sensitivity analysis for one pixel/PFT

Slide 23: Elicitation
- Beliefs of expert (developer of SDGVMd) regarding plausible values of PFT parameters
- Important to allow for uncertainty about mix of species in a pixel and role of parameter in the model
- In the case of leaf life span for evergreens, this was more complex

Slide 24: EvNl leaf life span

Slide 25: Correlations
- PFT parameter in one pixel may differ from that in another
  - Because of variation in species mix
  - Common uncertainty about average over all species induces correlation
- Elicit beliefs about average over whole UK
- EvNl joint distributions are mixtures of 25 components, with correlation both between and within years

Slide 26: Mean NBP corrections

Slide 27: NBP standard deviations

Slide 28: Land cover (from LCM2000)

Slide 29: Aggregate across 4 PFTs

Slide 30: Sensitivity analysis
- Map shows proportion of overall uncertainty in each pixel that is due to uncertainty in the parameters of PFTs
  - As opposed to soil parameters
- Contribution of PFT uncertainty largest in grasslands/moorlands
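One common way to express "the proportion of overall uncertainty due to a group of inputs" is the variance-based index Var(E[Y | group]) / Var(Y). The sketch below estimates it by Monte Carlo with a "pick-freeze" estimator on a toy additive model. The model, the input distributions and the estimator are all illustrative assumptions, not the emulator-based method used in the analysis.

```python
import numpy as np

def toy_model(pft, soil):
    """Hypothetical stand-in: output driven by a PFT input and a soil input."""
    return 2.0 * pft + 1.0 * soil

def variance_share(f, n, rng):
    """Estimate Var(E[Y | pft]) / Var(Y) by the pick-freeze method."""
    pft = rng.normal(size=n)     # shared ("frozen") PFT draws
    soil_a = rng.normal(size=n)  # two independent sets of soil draws
    soil_b = rng.normal(size=n)
    y_a = f(pft, soil_a)
    y_b = f(pft, soil_b)
    # y_a and y_b share only the PFT input, so Cov(y_a, y_b) = Var(E[Y | pft])
    cov = np.mean(y_a * y_b) - np.mean(y_a) * np.mean(y_b)
    var = np.var(np.concatenate([y_a, y_b]))
    return cov / var

rng = np.random.default_rng(1)
share = variance_share(toy_model, 200_000, rng)
print(f"estimated share of variance due to the PFT input: {share:.3f}")
```

For this toy model with independent standard normal inputs the exact share is 4 / (4 + 1) = 0.8, so the estimate can be checked directly. With an expensive simulator, the same quantity would be computed from the emulator rather than from direct model runs.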

Slide 31: England & Wales aggregate
Table (numerical values not preserved in the transcript)
Columns: PFT | Plug-in estimate (Mt C) | Mean (Mt C) | Variance (Mt C^2)
Rows: Grass, Crop, Deciduous, Evergreen, Covariances, Total

Slide 32: Conclusions
- Bayesian methods offer a powerful basis for computation of uncertainties in model predictions
- Analysis of E&W aggregate NBP in 2000
  - Good case study for uncertainty and sensitivity analyses
  - But needs to take account of more sources of uncertainty
  - Involved several technical extensions
  - Has important implications for our understanding of C fluxes
  - Policy implications