Project leader’s report MUCM Advisory Panel Meeting, November 2006.

Outline
- Background: uncertainty in models
- MUCM overview
- Putting the structures in place
- Specific progress

Background: Uncertainty in Models

Computer models
- In almost all fields of science, technology, industry and policy making, people use mechanistic models to describe complex real-world processes
  - For understanding, prediction, control
- There is a growing realisation of the importance of uncertainty in model predictions
  - Can we trust them?
  - Without any quantification of output uncertainty, it's easy to dismiss them

Sources of uncertainty
- A computer model takes inputs x and produces outputs y = f(x)
- How might y differ from the true real-world value z that the model is supposed to predict?
  - Error in inputs x: initial values, forcing inputs, model parameters
  - Error in model structure or solution: wrong, inaccurate or incomplete science; bugs, solution errors

Quantifying uncertainty
- The ideal is to provide a probability distribution p(z) for the true real-world value
  - The centre of the distribution is a best estimate
  - Its spread shows how much uncertainty about z is induced by the uncertainties on the last slide
- How do we get this?
  - Input uncertainty: characterise p(x), propagate through to p(y)
  - Structural uncertainty: characterise p(z-y)
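To make the propagation step concrete, here is a minimal Monte Carlo sketch: draw inputs from p(x), run the code, and summarise the resulting p(y). The simulator f and the choice of p(x) are illustrative stand-ins, not any of the models discussed in this report.

```python
# Minimal sketch of propagating input uncertainty by Monte Carlo.
# The simulator f and the input distribution are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    """Hypothetical cheap simulator: two inputs, one output."""
    return np.sin(x[0]) + 0.5 * x[1] ** 2

def sample_x():
    """Draw one input vector from an assumed p(x)."""
    return rng.normal(loc=[1.0, 0.5], scale=[0.2, 0.1])

# Propagation: typically thousands of runs are needed, which is why
# this brute-force approach breaks down for expensive models.
ys = np.array([f(sample_x()) for _ in range(10_000)])
print(f"mean of y: {ys.mean():.3f}, sd of y: {ys.std():.3f}")
```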

Example: UK carbon flux in 2000
- Vegetation model predicts carbon exchange from each of 700 pixels over England & Wales
  - Principal output is Net Biosphere Production
- Accounting for uncertainty in inputs
  - Soil properties
  - Properties of different types of vegetation
- Aggregated to an England & Wales total, allowing for correlations
- Estimate 7.55 Mt C, standard deviation 0.57 Mt C
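The aggregation step ("allowing for correlations") amounts to summing the pixel means and summing the full covariance matrix, not just the per-pixel variances. The sketch below uses invented numbers and an assumed correlation structure, not the CTCD results, purely to show where the correlations enter.

```python
# Sketch of aggregating per-pixel estimates to a national total while
# allowing for correlations. All numbers here are illustrative.
import numpy as np

rng = np.random.default_rng(1)

n_pix = 700
pixel_means = rng.uniform(0.005, 0.015, size=n_pix)   # Mt C per pixel
pixel_sds = np.full(n_pix, 0.003)

# Assumed exchangeable correlation between pixels
corr = 0.2
cov = corr * np.outer(pixel_sds, pixel_sds)
np.fill_diagonal(cov, pixel_sds ** 2)

total_mean = pixel_means.sum()
total_sd = np.sqrt(cov.sum())               # Var(sum) = sum of all covariances
naive_sd = np.sqrt((pixel_sds ** 2).sum())  # what ignoring correlations would give

print(f"total {total_mean:.2f} Mt C, sd {total_sd:.2f} (naive sd {naive_sd:.3f})")
```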

Maps

Sensitivity analysis
- Map shows the proportion of overall uncertainty in each pixel that is due to uncertainty in the vegetation parameters, as opposed to the soil parameters
- The contribution of vegetation uncertainty is largest in grasslands/moorlands
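The quantity mapped here is a variance-based sensitivity measure: the share of output variance explained by one group of inputs on its own. The sketch below estimates such a share for a toy function by a crude double-loop Monte Carlo; the function and the input distributions are assumptions for illustration only.

```python
# Sketch of a variance-based sensitivity measure: the proportion of
# output variance due to one input group (v, "vegetation") as opposed
# to another (s, "soil"). Toy function and distributions are assumed.
import numpy as np

rng = np.random.default_rng(2)

def f(v, s):
    """Hypothetical pixel output."""
    return np.sin(v) + 0.3 * s ** 2

def sample_v(n):
    return rng.normal(0.0, 1.0, size=n)

def sample_s(n):
    return rng.normal(0.0, 0.5, size=n)

# Total output variance
var_total = np.var(f(sample_v(20_000), sample_s(20_000)))

# Variance of the conditional expectation E[f | v], by a crude double loop
cond_means = [f(v, sample_s(500)).mean() for v in sample_v(500)]
share_v = np.var(cond_means) / var_total

print(f"proportion of variance due to v: {share_v:.2f}")
```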

England & Wales aggregate
[Table: plug-in estimate (Mt C), mean (Mt C) and variance (Mt C²) for each PFT (Grass, Crop, Deciduous, Evergreen), plus covariances (0.001 Mt C²) and the total; the numerical values were not preserved in this transcript.]

Reducing uncertainty
- To reduce uncertainty, get more information!
- Informal: more/better science
  - Tighten p(x) through improved understanding
  - Tighten p(z-y) through improved modelling or programming
- Formal: using real-world data
  - Calibration: learn about model parameters
  - Data assimilation: learn about the state variables
  - Learn about the structural error z-y
  - Validation

Example: Nuclear accident
- Radiation was released after an accident at the Tomsk-7 chemical plant in 1993
- Data comprise measurements of the deposition of ruthenium-106 at 695 locations, obtained by aerial survey after the release
- The computer code is a simple Gaussian plume model for atmospheric dispersion
- Two calibration parameters:
  - Total release of ¹⁰⁶Ru (source term)
  - Deposition velocity

Data

Calibration
- A small sample (N = 10 to 25) of the 695 data points was used to calibrate the model
- The remaining observations were then predicted and the RMS prediction error computed
  - On a log scale, an error of 0.7 corresponds to a factor of 2
- [Figure: RMS prediction error against sample size N, for best-fit calibration and Bayesian calibration]
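A rough sketch of the two approaches being compared is given below: a grid search for the single best-fitting pair of parameters versus a Bayesian posterior over the same grid. The "plume" function, noise level and prior are invented for illustration and are not the actual Tomsk-7 analysis.

```python
# Sketch contrasting best-fit and Bayesian calibration of two
# parameters from a small sample of observations. The model, noise
# level and prior are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)

def plume(theta, locs):
    """Hypothetical stand-in for the Gaussian plume code."""
    return theta[0] * np.exp(-theta[1] * locs)

locs = np.linspace(0.1, 5.0, 15)                 # N = 15 calibration points
obs = plume((2.0, 0.8), locs) + rng.normal(0, 0.1, size=locs.size)

# Grid over the two calibration parameters (flat prior on the grid)
t1, t2 = np.meshgrid(np.linspace(0.5, 4.0, 80), np.linspace(0.1, 2.0, 80))
sse = np.array([[np.sum((obs - plume((a, b), locs)) ** 2)
                 for a, b in zip(row_a, row_b)]
                for row_a, row_b in zip(t1, t2)])

# Best-fit calibration: the single parameter pair minimising the misfit
i, j = np.unravel_index(np.argmin(sse), sse.shape)
print("best fit:", t1[i, j], t2[i, j])

# Bayesian calibration: a whole posterior, so predictions carry
# parameter uncertainty rather than relying on one point estimate
post = np.exp(-0.5 * (sse - sse.min()) / 0.1 ** 2)
post /= post.sum()
print("posterior means:", (post * t1).sum(), (post * t2).sum())
```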

So far, so good, but
- In principle, all this is straightforward
- In practice, there are many technical difficulties
  - Formulating uncertainty on inputs: elicitation of expert judgements
  - Propagating input uncertainty
  - Modelling structural error
  - Anything involving observational data! (the last two are intricately linked)
  - And computation

The problem of big models
- Tasks like uncertainty propagation and calibration require us to run the model many times
- Uncertainty propagation
  - Implicitly, we need to run f(x) at all possible x
  - Monte Carlo works by taking a sample of x from p(x)
  - Typically needs thousands of model runs
- Calibration
  - Traditionally done by searching the x space for good fits to the data
  - Impractical if the model takes more than a few seconds to run
- We need a more efficient technique

Gaussian process representation
- A more efficient approach; first work in the early 1980s
- Consider the code as an unknown function: f(.) becomes a random process
  - We represent it as a Gaussian process (GP)
- Training runs
  - Run the model for a sample of x values
  - Condition the GP on the observed data
  - Typically requires many fewer runs than MC, and the x values don't need to be chosen randomly

Emulation
- The analysis is completed by prior distributions for, and posterior estimation of, hyperparameters
- The posterior distribution is known as an emulator of the computer code
  - The posterior mean estimates what the code would produce for any untried x (prediction)
  - The posterior variance gives the uncertainty about that prediction
  - It correctly reproduces the training data
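A minimal sketch of what an emulator looks like in practice: a zero-mean Gaussian process with a squared-exponential covariance, conditioned on a handful of training runs of a toy "simulator". The kernel, the fixed hyperparameter values and the toy function are assumptions for illustration, not the MUCM toolkit implementation.

```python
# Minimal GP emulator sketch: one input, one output, zero prior mean,
# squared-exponential covariance with fixed (assumed) hyperparameters.
import numpy as np

def simulator(x):
    """Toy stand-in for an expensive code."""
    return np.sin(3 * x) + x

def kernel(a, b, variance=1.0, length=0.3):
    """Squared-exponential covariance between input vectors a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length) ** 2)

# Training runs: a small design over the input space
x_train = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y_train = simulator(x_train)

K = kernel(x_train, x_train) + 1e-8 * np.eye(x_train.size)  # jitter
K_inv_y = np.linalg.solve(K, y_train)

def emulate(x_new):
    """Posterior mean and variance of the code output at untried inputs."""
    k_star = kernel(x_new, x_train)
    mean = k_star @ K_inv_y
    cov = kernel(x_new, x_new) - k_star @ np.linalg.solve(K, k_star.T)
    return mean, np.diag(cov)

m, v = emulate(np.linspace(0.0, 1.0, 5))
print(np.round(m, 3), np.round(np.sqrt(np.maximum(v, 0)), 3))
```

At the training points the posterior mean reproduces the runs exactly and the variance collapses towards zero, which is the behaviour pictured in the "2, 3 and 5 code runs" slides that follow.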

2 code runs
- Consider one input and one output
- The emulator estimate interpolates the data
- Emulator uncertainty grows between data points

3 code runs
- Adding another point changes the estimate and reduces uncertainty

5 code runs
- And so on

Then what?
- Given enough training data points we can emulate any model accurately
  - So that the posterior variance is small "everywhere"
  - Typically feasible with orders of magnitude fewer model runs than traditional methods
- Use the emulator to make inferences about other things of interest
  - Uncertainty analysis, sensitivity analysis, calibration, data assimilation, optimisation, …
- Conceptually very straightforward in the Bayesian framework, but of course it can be technically hard
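Continuing the emulator sketch above, uncertainty analysis then amounts to pushing draws from p(x) through the cheap emulate() function rather than the expensive code; the input distribution here is again an assumption made for illustration.

```python
# Sketch of uncertainty analysis via the emulator: reuses emulate()
# from the GP sketch above; the input distribution p(x) is assumed.
import numpy as np

rng = np.random.default_rng(4)

x_samples = rng.normal(0.5, 0.15, size=2_000)      # draws from p(x)
mean, var = emulate(x_samples)

# Output uncertainty combines input uncertainty with the (small, if the
# emulator is well trained) code uncertainty captured by the GP variance
print(f"estimated E[y]: {mean.mean():.3f}")
print(f"estimated Var[y]: {mean.var() + var.mean():.4f}")
```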

Research directions
- Models with heterogeneous local behaviour
  - Regions of input space with rapid response, jumps
- High-dimensional models
  - Many inputs, outputs, data points
- Dynamic models
  - Data assimilation
- Stochastic models
- Relationship between models and reality
  - Model/emulator validation
  - Multiple models
- Design of experiments
  - Sequential design

MUCM Overview

MUCM in a nutshell
- Managing Uncertainty in Complex Models
- Four-year research grant
  - 7 postdoctoral research assistants
  - 4 PhD studentships
- Started in June 2006
- Based in Sheffield and 4 other UK universities
- Objective: to develop Bayesian model uncertainty methods into a robust technology (toolkit, UML specifications) that is widely applicable across the spectrum of modelling applications (case studies)

Theme 1: High Dimensionality
- Tackling problems associated with the dimensionality of inputs, outputs, parameters and data
- WP 1.1 – Screening (PS)
  - Identifying the most important inputs/outputs
- WP 1.2 – Sparsity and Projection (RA)
  - Dimension reduction using modern computational techniques
- WP 1.3 – Multiscale Models (RA)
  - Linking models and data at different resolutions
- Theme leader: Dan Cornford

Theme 2: Using Observational Data
- Tackling problems associated with model structural error in order to link models to field data
- WP 2.1 – Linking Models to Reality (RA)
  - Modelling structural error
- WP 2.2 – Diagnostics and Validation (PS)
  - Criticising our statistical representations
- WP 2.3 – Calibration & Data Assimilation (RA)
  - Extending calibration techniques, particularly to dynamic models
- Theme leader: Michael Goldstein

Theme 3: Realising the Potential
- Turning theory into reliable, widely applicable techniques across a wide range of models
- WP 3.1 – Experimental Design (RA + PS)
  - Designing input sets for running models, and planning observational studies
- WP 3.2 – The Toolkit (RA + PS)
  - Distilling experience with the methods into robust tools, relaxing constraints
- WP 3.3 – Case Studies (RA)
  - Three substantial case studies
- Theme leader: Peter Challenor

Organisation overview

Organisation by theme
- Theme 1 (Cornford): 1.1 Boukouvalas – Cornford (Challenor); 1.2 Maniyar – Cornford (Wynn); 1.3 Cumming – Goldstein (Rougier)
- Theme 2 (Goldstein): 2.1 House – Goldstein (O’Hagan); 2.2 Bastos – O’Hagan (Rougier); 2.3 Bhattacharya – Oakley (Cornford)
- Theme 3 (Challenor): 3.1 Maruri-Aguilar – Wynn (Goldstein) and Youssef – Wynn (Oakley); 3.2 Gattiker – Challenor (O’Hagan, Cornford) and Stephenson – Challenor (Oakley); 3.3 Gosling – O’Hagan (Challenor)
- Overall: O’Hagan

Organisation by committee
- The whole Team meets twice a year
  - Presentations, reports and planning
- The Project Management Board meets four times a year
  - Formal decision making, budgeting, personnel matters
- The Advisory Panel meets with the investigators twice a year
  - Providing external support and advice

The Team
- Investigators: Challenor, Cornford, Goldstein, Oakley, O’Hagan, Rougier, Wynn
- Project manager: Green
- RAs: Bhattacharya, Cumming, Gattiker, Gosling, House, Maniyar, Maruri-Aguilar
- PSs: Bastos, Boukouvalas, Stephenson, Youssef

The Board
- The Project Management Board is the primary project management body
- Members: Tony O’Hagan (Sheffield, Chair), Dan Cornford (Aston), Peter Challenor (Southampton), Michael Goldstein (Durham), Henry Wynn (LSE)
- Non-voting: Jeremy Oakley (Sheffield), Jonty Rougier (Durham), Jo Green (Sheffield)

The Panel
- The Advisory Panel comprises modellers, model users and model uncertainty experts from a wide range of fields
- Industry: Bob Parish, Hilmi Kurt-Elli, Clive Bowman
- Academia: Ron Akehurst, Martin Dove, Keith Beven, Douglas Kell, Ian Woodward
- Research institutions: Richard Haylock, Andrea Saltelli, Andy Hart, David Higdon, Mat Collins

The Mentor
- Peter Green (Bristol), appointed by EPSRC
- Liaises between the project team and EPSRC
- Advises the team

Putting the Structures in Place

General
- All RAs, PSs and the Project Manager recruited
  - Started at various times from 1 June to 1 October
  - Need to replace Bhattacharya
- Website, wiki, lists, logo and templates created
- Reading list and glossary under development
- Monthly reporting established
- RAs have set up a reading club
- Links established with related projects
  - Particularly with the SAMSI programme in the USA

Project planning
- First draft of rolling workplans
  - Descriptions and objectives
  - Detailed plans and milestones for 12 months ahead, with month-by-month detail for 6 months
  - Outline plans and milestones for the remainder of the project
  - Will be updated quarterly
- Milestones and deliverables carefully monitored
- The Panel will receive plans from the previous Board

Financial management
- Handled at quarterly Board meetings
- Phased budget plan created for each institution
- RAs appointed initially for 3 years
  - Fourth-year funds retained in reserve

Contacts with Panel members
- Introductory meetings held with most members
- An RA has been assigned to each
  - To develop understanding of the models and the modelling area
  - To act as a link between other team members and the Panel member
- Beginning to explore the use of models
  - Some models also sourced from other contacts

Specific progress 1
- Emulator fitting
  - Study of methods to estimate roughness parameters
  - Acquisition of existing packages
- Multiscale models
  - Multiscale version of the Daisyworld model created
- Non-homogeneous models
  - Voronoi tessellation method improved
  - Paper in preparation

Specific progress 2
- Design
  - Study of aberration and its relationship to the kernel
  - Paper in preparation
- Dynamic models
  - Basic theory of dynamic emulation developed
  - Toy dynamic model created and emulated
  - Paper in preparation
  - Hydrological model acquired