1
Bayesian tools for analysing and reducing uncertainty
Tony O'Hagan, University of Sheffield
2
Or …
3
Uncertainty, Complexity and Predictive Reliability of (environmental/biological) process models
4
Summary
- Uncertainty
- Complexity
- Predictive Reliability
5
Uncertainty is everywhere …
- Internal parameters
- Initial conditions
- Forcing inputs
- Model structure
- Observational error
- Code uncertainty
6
Uncertainty (2)
All sources of uncertainty must be recognised and quantified. Otherwise we don't know how good model predictions are, or how to use data.
7
Tasks involving uncertainty
Whether or not we have data:
- Sensitivity analysis
- Uncertainty analysis
Interacting with observational data:
- Calibration
- Data assimilation
- Discrepancy estimation
- Validation
8
Complexity
This is already a big task, and it is massively exacerbated by model complexity:
- High dimensionality
- Long model run times
But there are powerful statistical tools available.
9
It's a big task
Quantifying uncertainty is often difficult:
- an unfamiliar task
- needs expert statistical skills: statistical modelling, elicitation
It deserves to be recognised as a task of comparable status to developing the model itself, and EMS is all about respecting each other's expertise.
10
Computational complexity
All the tasks involving uncertainty can be computed by simple Monte Carlo (or MCMC) methods if the model runs quickly enough. Otherwise emulation is needed, which requires orders of magnitude fewer model runs.
11
Emulation
A computer model encodes a function that takes inputs and produces outputs. An emulator is a statistical approximation of that function, but NOT just an approximation: it estimates what outputs would be obtained from given inputs, with a statistically valid measure of uncertainty.
12
Emulators
- Multiple regression models: do not make valid uncertainty statements
- Neural networks: can make valid uncertainty statements, but complex
- Data-based mechanistic models: do not make valid uncertainty statements
- Gaussian processes
13
GPs
Gaussian process emulators are nonparametric:
- they make no assumptions other than smoothness
- they estimate the code accurately, with small uncertainty
- they run instantly
So we can do uncertainty-based tasks fast and efficiently. Conceptually, we use model runs to learn about the function, then derive any desired properties of the model.
14
2 code runs
Consider one input and one output. The emulator estimate interpolates the data.
15
2 code runs Emulator uncertainty grows between data points
16
3 code runs
Adding another point changes the estimate and reduces the uncertainty.
17
5 code runs And so on
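The behaviour on these three slides is easy to reproduce. Below is a minimal, self-contained sketch of a GP emulator in plain NumPy (not the authors' software; the squared-exponential covariance and the lengthscale value are illustrative choices): the posterior mean interpolates the model runs, and the uncertainty between runs shrinks as runs are added.

```python
import numpy as np

def rbf(a, b, lengthscale=0.4, variance=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_predict(x_train, y_train, x_test, lengthscale=0.4, jitter=1e-9):
    """Posterior mean and sd of a zero-mean GP emulator conditioned on
    (noise-free) model runs."""
    K = rbf(x_train, x_train, lengthscale) + jitter * np.eye(len(x_train))
    Ks = rbf(x_test, x_train, lengthscale)
    Kss = rbf(x_test, x_test, lengthscale)
    mean = Ks @ np.linalg.solve(K, y_train)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))

# A toy "simulator": one input, one output, expensive in real life.
simulator = lambda x: np.sin(3 * x)

x_fine = np.linspace(0, 1, 101)
for n in (2, 3, 5):                  # mirror the 2-, 3- and 5-run slides
    x_run = np.linspace(0, 1, n)
    mean, sd = gp_predict(x_run, simulator(x_run), x_fine)
    print(n, "runs: max sd between runs =", sd.max().round(3))
```

At each design point the sd collapses to (numerically) zero, because the runs are treated as exact evaluations of the function.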
18
Smoothness
It is the basic assumption of a (homogeneously) smooth, continuous function that gives the GP its computational advantages. The actual degree of smoothness concerns how rapidly the function wiggles: a rough function responds strongly to quite small changes in inputs. We need many more data points to emulate a rough function accurately over a given range.
19
Effect of Smoothness Smoothness determines how fast the uncertainty increases between data points
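This effect can be seen numerically. In the sketch below (assuming a unit-variance squared-exponential covariance, an illustrative choice), a shorter lengthscale, i.e. a rougher function, gives larger predictive uncertainty between a fixed design of runs. Note that the posterior variance depends only on the design and the covariance, not on the outputs.

```python
import numpy as np

def rbf(a, b, ls):
    """Squared-exponential covariance with lengthscale ls."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def between_point_sd(ls, n_runs=5):
    """Max posterior sd of a unit-variance GP over [0, 1], given n_runs
    equally spaced noise-free design points, for lengthscale ls."""
    x = np.linspace(0, 1, n_runs)
    xs = np.linspace(0, 1, 101)
    K = rbf(x, x, ls) + 1e-9 * np.eye(n_runs)
    ks = rbf(xs, x, ls)
    var = 1.0 - np.einsum('ij,ij->i', ks @ np.linalg.inv(K), ks)
    return np.sqrt(np.clip(var, 0.0, None)).max()

for ls in (0.5, 0.2, 0.05):          # smoother -> rougher
    print(f"lengthscale {ls}: max sd between runs = {between_point_sd(ls):.3f}")
```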
20
Estimating smoothness
We can estimate the smoothness from the data. It is obviously a key Gaussian process parameter to estimate, but a tricky one:
- we need a robust estimate
- validate by predicting left-out data points
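Both steps can be sketched as follows, assuming a zero-mean, unit-variance GP (the function names and the grid-search estimator are illustrative, not the authors' method): estimate the lengthscale by maximising the log marginal likelihood over a grid, then validate with standardised leave-one-out prediction errors, which should mostly lie within roughly ±2 if the emulator is adequate.

```python
import numpy as np

def rbf(a, b, ls):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def log_marginal(x, y, ls, jitter=1e-8):
    """Log marginal likelihood (up to an additive constant) of a
    zero-mean, unit-variance GP with lengthscale ls."""
    K = rbf(x, x, ls) + jitter * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum()

def loo_errors(x, y, ls):
    """Standardised leave-one-out prediction errors."""
    errs = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        K = rbf(x[keep], x[keep], ls) + 1e-8 * np.eye(keep.sum())
        k = rbf(x[i:i + 1], x[keep], ls)
        mean = (k @ np.linalg.solve(K, y[keep]))[0]
        var = 1.0 - (k @ np.linalg.solve(K, k.T))[0, 0]
        errs.append((y[i] - mean) / np.sqrt(max(var, 1e-12)))
    return np.array(errs)

# Model runs from a toy simulator.
x = np.linspace(0, 1, 8)
y = np.sin(3 * x)

grid = np.linspace(0.05, 1.0, 20)
best = grid[np.argmax([log_marginal(x, y, ls) for ls in grid])]
print("estimated lengthscale:", round(float(best), 2))
print("LOO standardised errors:", np.round(loo_errors(x, y, best), 2))
```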
21
Code uncertainty
Emulation, like Monte Carlo, is just a computational device, but a highly efficient one! Like MC, quantities of interest are computed subject to error that is:
- statistically quantifiable and validatable
- reducible if we can do more model runs
This is code uncertainty.
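The analogy with Monte Carlo can be made concrete. Below is a plain MC uncertainty analysis of a toy model (the model and the input distribution are illustrative): the estimate of the output mean carries a standard error that is quantifiable and shrinks as more runs are made.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model with an uncertain input X ~ N(0.5, 0.1^2).
model = lambda x: np.sin(3 * x)

# Plain Monte Carlo uncertainty analysis: the error in the estimated
# mean output is itself quantifiable, and reducible with more runs.
for n in (100, 10_000):
    y = model(rng.normal(0.5, 0.1, size=n))
    se = y.std(ddof=1) / np.sqrt(n)
    print(f"n={n}: mean={y.mean():.4f}, MC standard error={se:.4f}")
```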
22
And finally … Predictive Reliability
23
What can we do with observational data?
- Model validation: check observations against predictive distributions based on current knowledge
- Calibration: learn about the values of uncertain model parameters (possibly including model structure)
- Data assimilation: for dynamic models, learn about the current value of the state vector
- Model correction: learn about the model discrepancy function
- Or do all of these, in one coherent Bayesian system
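As a deliberately crude illustration of why discrepancy matters in calibration (not the GP-discrepancy formulation these groups use; here the discrepancy is simply absorbed into an inflated error variance, and the simulator, data and grid are all hypothetical), consider calibrating m(x, theta) = theta * x against data from a reality the simulator cannot reproduce:

```python
import numpy as np

rng = np.random.default_rng(1)

# Reality = linear trend + a wiggle the simulator cannot reproduce.
x = np.linspace(0, 1, 20)
reality = 2.0 * x + 0.3 * np.sin(6 * x)
y_obs = reality + rng.normal(0, 0.05, size=x.size)

def posterior(theta_grid, sigma_disc):
    """Grid posterior for theta under y ~ N(theta*x, sigma_obs^2 + sigma_disc^2),
    i.e. discrepancy crudely absorbed into an inflated error variance."""
    var = 0.05 ** 2 + sigma_disc ** 2
    loglik = np.array([-0.5 * ((y_obs - t * x) ** 2).sum() / var
                       for t in theta_grid])
    p = np.exp(loglik - loglik.max())
    return p / p.sum()

theta = np.linspace(1.0, 3.0, 201)
for sd in (0.0, 0.2):                # ignore vs acknowledge discrepancy
    p = posterior(theta, sd)
    mean = theta @ p
    spread = np.sqrt(((theta - mean) ** 2) @ p)
    print(f"sigma_disc={sd}: posterior mean={mean:.3f}, sd={spread:.3f}")
```

Ignoring discrepancy (sigma_disc = 0) gives a falsely confident posterior for theta; acknowledging it widens the posterior to something more honest.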
24
Doing it all
It's crucial to model uncertainties carefully:
- to avoid using data twice
- to apportion observation error between parameters, state vector and model discrepancy
- to get appropriate learning about all of these
Data assimilation alone is useful only for short-term prediction.
25
This is challenging
We (Sheffield and Durham) have developed theory and serious case studies, and practical experience is growing. But there is still lots to do, both theoretically and practically; each new model poses new challenges. Our science is as exciting and challenging as any other.
26
Sorry …
We are not yet at the stage where implementation is routine:
- very limited software
- most publications are in the statistics literature
But we're working on it, and we're very willing to interact with modellers/users in any discipline, particularly if you have resources!
27
Who we are
Sheffield:
- Tony O'Hagan (a.ohagan@sheffield.ac.uk), http://shef.ac.uk/~st1ao
- Marc Kennedy, Stefano Conti, Jeremy Oakley
Durham:
- Michael Goldstein (michael.goldstein@durham.ac.uk)
- Peter Craig, Jonathan Rougier, Alan Seheult