Published by Nathaniel Russo. Modified over 2 years ago.

1
Bayesian tools for analysing and reducing uncertainty
Tony O'Hagan, University of Sheffield

2
Or …

3
Uncertainty, Complexity and Predictive Reliability of (environmental/biological) process models

4
Summary
Uncertainty
Complexity
Predictive Reliability

5
Uncertainty is everywhere …
Internal parameters
Initial conditions
Forcing inputs
Model structure
Observational error
Code uncertainty

6
Uncertainty (2)
And all sources of uncertainty must be recognised and quantified.
Otherwise we don't know how good model predictions are, or how to use data.

7
Tasks involving uncertainty
Whether or not we have data: sensitivity analysis, uncertainty analysis
Interacting with observational data: calibration, data assimilation, discrepancy estimation, validation

8
Complexity
This is already a big task.
It is massively exacerbated by model complexity: high dimensionality and long model run times.
But there are powerful statistical tools available.

9
It's a big task
Quantifying uncertainty is often difficult.
It is an unfamiliar task.
It needs expert statistical skills: statistical modelling and elicitation.
It deserves to be recognised as a task of comparable status to developing the model.
And EMS is all about respecting each other's expertise.

10
Computational complexity
All the tasks involving uncertainty can be computed by simple (MC)MC methods if the model runs quickly enough.
Otherwise emulation is needed, which requires orders of magnitude fewer model runs.
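When the model is cheap to run, the simple Monte Carlo route can be sketched as follows. This is only an illustration under stated assumptions: `toy_model`, the two inputs and their normal distributions are hypothetical and not taken from the talk.

```python
import numpy as np

def toy_model(x1, x2):
    """Hypothetical stand-in for a fast-running simulator (not from the talk)."""
    return np.sin(x1) + 0.5 * x2 ** 2

rng = np.random.default_rng(0)
n_runs = 10_000

# Assumed input uncertainty: two independent normal distributions.
x1 = rng.normal(loc=1.0, scale=0.2, size=n_runs)
x2 = rng.normal(loc=0.0, scale=1.0, size=n_runs)

# Uncertainty analysis: push the input samples through the model.
outputs = toy_model(x1, x2)

mean = outputs.mean()
std = outputs.std(ddof=1)
# Monte Carlo standard error of the estimated mean; it shrinks like 1/sqrt(n_runs).
mc_se = std / np.sqrt(n_runs)
lower, upper = np.percentile(outputs, [2.5, 97.5])

print(f"output mean = {mean:.3f} (MC standard error {mc_se:.4f})")
print(f"central 95% interval = [{lower:.3f}, {upper:.3f}]")
```

With 10,000 runs this is only practical for a model that runs in a fraction of a second, which is the situation in which emulation becomes attractive.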

11
Emulation
A computer model encodes a function that takes inputs and produces outputs.
An emulator is a statistical approximation of that function, NOT just an approximation.
It estimates what outputs would be obtained from given inputs, with a statistically valid measure of uncertainty.

12
Emulators
Multiple regression models: do not make valid uncertainty statements
Neural networks: can make valid uncertainty statements, but complex
Data based mechanistic models: do not make valid uncertainty statements
Gaussian processes

13
GPs
Gaussian process emulators are nonparametric: they make no assumptions other than smoothness, estimate the code accurately with small uncertainty, and run instantly.
So we can do uncertainty based tasks fast and efficiently.
Conceptually, we use model runs to learn about the function, then derive any desired properties of the model.
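A minimal sketch of a Gaussian process emulator for one input and one output may help make this concrete. It assumes a zero prior mean, a squared-exponential covariance and hand-fixed hyperparameters; the names `sq_exp_kernel`, `gp_emulate` and `simulator` are illustrative and are not the software used by the authors.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale, variance):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_emulate(x_train, y_train, x_new, length_scale=1.0, variance=1.0, jitter=1e-8):
    """Posterior mean and variance of a zero-mean GP conditioned on code runs.

    Hyperparameters are fixed by hand here; in practice they are estimated.
    """
    K = sq_exp_kernel(x_train, x_train, length_scale, variance)
    K = K + jitter * np.eye(len(x_train))  # numerical stabiliser, not observation noise
    K_s = sq_exp_kernel(x_train, x_new, length_scale, variance)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = variance - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)

def simulator(x):
    """Hypothetical stand-in for an expensive computer model."""
    return x * np.sin(x)

x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])  # five code runs
y_train = simulator(x_train)
x_new = np.array([0.0, 0.5, 2.0, 2.5])         # two design points, two untried inputs

mean, var = gp_emulate(x_train, y_train, x_new)
for x, m, v in zip(x_new, mean, var):
    print(f"x={x:.1f}  emulator mean={m:+.3f}  sd={np.sqrt(v):.4f}")
```

At the inputs that were actually run, the emulator reproduces the code output with essentially zero variance; at untried inputs it returns an estimate together with a variance, which is what distinguishes it from a plain approximation.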

14
2 code runs
Consider one input and one output.
The emulator estimate interpolates the data.

15
2 code runs
Emulator uncertainty grows between data points.

16
3 code runs
Adding another point changes the estimate and reduces uncertainty.

17
5 code runs
And so on.
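The progression from 2 to 3 to 5 code runs can be reproduced numerically. The sketch below assumes a zero-mean, unit-variance GP with a squared-exponential kernel and a fixed length scale of 1; the `simulator` function and the design points are hypothetical, chosen only so that each design contains the previous one.

```python
import numpy as np

def gp_posterior(x_train, y_train, x_new, length_scale=1.0, jitter=1e-8):
    """Posterior mean and sd of a zero-mean, unit-variance GP (assumed form)."""
    def k(a, b):
        return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length_scale) ** 2)
    K = k(x_train, x_train) + jitter * np.eye(len(x_train))
    K_s = k(x_train, x_new)
    w = np.linalg.solve(K, K_s)              # shape (n_train, n_new)
    mean = w.T @ y_train
    var = 1.0 - np.sum(K_s * w, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def simulator(x):
    """Hypothetical one-input, one-output code."""
    return np.sin(x)

# Nested designs: each one adds runs to the previous one.
designs = {
    2: np.array([1.0, 5.0]),
    3: np.array([1.0, 3.0, 5.0]),
    5: np.array([1.0, 2.0, 3.0, 4.0, 5.0]),
}
grid = np.linspace(1.0, 5.0, 81)

max_sd = {}
for n_runs, x_train in designs.items():
    _, sd = gp_posterior(x_train, simulator(x_train), grid)
    max_sd[n_runs] = sd.max()
    print(f"{n_runs} code runs: largest emulator sd on [1, 5] = {max_sd[n_runs]:.3f}")
```

The largest standard deviation occurs between design points and falls as runs are added, which is the behaviour the slides illustrate graphically.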

18
Smoothness
It is the basic assumption of a (homogeneously) smooth, continuous function that gives the GP its computational advantages.
The actual degree of smoothness concerns how rapidly the function wiggles.
A rough function responds strongly to quite small changes in inputs.
We need many more data points to emulate a rough function accurately over a given range.

19
Effect of Smoothness
Smoothness determines how fast the uncertainty increases between data points.
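One way to see this effect is to hold two code runs fixed and vary the smoothness parameter. The sketch assumes a unit-variance squared-exponential kernel, in which smoothness is controlled by a length scale (larger means smoother); the function name `midpoint_sd` and the chosen length scales are illustrative assumptions.

```python
import numpy as np

def midpoint_sd(length_scale, jitter=1e-10):
    """Emulator sd halfway between two code runs at x=0 and x=1.

    Assumes a zero-mean, unit-variance GP with a squared-exponential kernel.
    The posterior sd does not depend on the observed outputs, only on the design.
    """
    x_train = np.array([0.0, 1.0])
    x_mid = np.array([0.5])
    def k(a, b):
        return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length_scale) ** 2)
    K = k(x_train, x_train) + jitter * np.eye(2)
    K_s = k(x_train, x_mid)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return float(np.sqrt(max(var[0], 0.0)))

length_scales = [0.2, 0.5, 2.0]   # rough -> smooth
sds = [midpoint_sd(ls) for ls in length_scales]
for ls, sd in zip(length_scales, sds):
    print(f"length scale {ls:.1f}: sd at midpoint = {sd:.3f}")
```

For the rough function (short length scale) the two runs say almost nothing about the midpoint, so the uncertainty is close to the prior; for the smooth function the same two runs pin the midpoint down tightly.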

20
Estimating smoothness
We can estimate the smoothness from the data.
This is obviously a key Gaussian process parameter to estimate, but it is tricky.
We need a robust estimate.
Validate by predicting left-out data points.
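The following sketch shows one possible way to estimate a smoothness parameter and then validate by predicting left-out points. It is not the authors' procedure: it assumes a squared-exponential correlation, picks the length scale by maximising a profile likelihood over a coarse grid, centres the outputs in place of a proper mean function, and reuses the fitted hyperparameters in every leave-one-out fold. All function names and the toy code output are hypothetical.

```python
import numpy as np

def corr(a, b, ls):
    """Squared-exponential correlation between 1-D input arrays."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def profile_loglik(x, y, ls, jitter=1e-6):
    """Log-likelihood with the GP variance profiled out (up to a constant)."""
    n = len(x)
    C = corr(x, x, ls) + jitter * np.eye(n)
    sigma2 = y @ np.linalg.solve(C, y) / n
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * n * np.log(sigma2) - 0.5 * logdet, sigma2

def predict(x_tr, y_tr, x_new, ls, sigma2, jitter=1e-6):
    """Posterior mean and variance at x_new given the training runs."""
    C = corr(x_tr, x_tr, ls) + jitter * np.eye(len(x_tr))
    c = corr(x_tr, x_new, ls)
    w = np.linalg.solve(C, c)
    mean = w.T @ y_tr
    var = sigma2 * (1.0 - np.sum(c * w, axis=0))
    return mean, np.maximum(var, 1e-12)

# Hypothetical code runs; outputs are centred as a crude substitute for a mean function.
x = np.linspace(0.0, 5.0, 8)
y_raw = np.sin(x) + 0.1 * x
y = y_raw - y_raw.mean()

# Estimate the length scale on a coarse grid.
grid = np.linspace(0.2, 2.0, 19)
scores = [profile_loglik(x, y, ls)[0] for ls in grid]
best_ls = grid[int(np.argmax(scores))]
sigma2 = profile_loglik(x, y, best_ls)[1]

# Validate: predict each run from the others and standardise the error.
std_errors = []
for i in range(len(x)):
    keep = np.arange(len(x)) != i
    m, v = predict(x[keep], y[keep], x[i:i + 1], best_ls, sigma2)
    std_errors.append((y[i] - m[0]) / np.sqrt(v[0]))
std_errors = np.array(std_errors)

print(f"estimated length scale: {best_ls:.2f}")
print("standardised leave-one-out errors:", np.round(std_errors, 2))
```

If the smoothness estimate is reasonable, the standardised errors should look roughly like draws from a standard normal distribution; many values far outside ±2 would suggest the emulator is overconfident, and values all very close to zero would suggest it is too cautious.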

21
Code uncertainty
Emulation, like MC, is just a computational device, but a highly efficient one!
Like MC, quantities of interest are computed subject to error.
That error is statistically quantifiable and validatable, and reducible if we can do more model runs.
This is code uncertainty.

22
And finally … Predictive Reliability

23
What can we do with observational data?
Model validation: check observations against predictive distributions based on current knowledge
Calibration: learn about values of uncertain model parameters (possibly including model structure)
Data assimilation: for dynamic models, learn about the current value of the state vector
Model correction: learn about the model discrepancy function
Do all of these (in one coherent Bayesian system)
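As a deliberately simplified illustration of the calibration task alone, the sketch below computes a posterior for one uncertain parameter on a grid. It assumes a fast hypothetical `simulator`, synthetic observations, a known observation error and a uniform prior, and it omits the model discrepancy term and the emulator that the talk treats as essential parts of a full analysis.

```python
import numpy as np

def simulator(x, theta):
    """Hypothetical fast model: exponential decay with unknown rate theta."""
    return np.exp(-theta * x)

rng = np.random.default_rng(1)
theta_true = 0.7   # used only to generate synthetic observations
obs_sd = 0.05      # observation error sd, assumed known
x_obs = np.linspace(0.5, 5.0, 10)
y_obs = simulator(x_obs, theta_true) + rng.normal(0.0, obs_sd, size=x_obs.size)

# Uniform prior over a grid of candidate parameter values.
theta_grid = np.linspace(0.1, 2.0, 400)
log_prior = np.zeros_like(theta_grid)

# Gaussian log-likelihood of the observations for every candidate theta.
pred = simulator(x_obs[None, :], theta_grid[:, None])   # shape (400, 10)
log_lik = -0.5 * np.sum(((y_obs[None, :] - pred) / obs_sd) ** 2, axis=1)

# Normalise on the grid; subtracting the max avoids underflow.
log_post = log_prior + log_lik
post = np.exp(log_post - log_post.max())
post /= post.sum()

post_mean = np.sum(theta_grid * post)
post_sd = np.sqrt(np.sum((theta_grid - post_mean) ** 2 * post))
print(f"posterior for theta: mean {post_mean:.3f}, sd {post_sd:.3f}")
```

Because discrepancy is ignored here, all mismatch between model and data is attributed to the parameter and to observation error, which is exactly the kind of misattribution that a coherent treatment of all the uncertainties is meant to avoid.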

24
Doing it all
It's crucial to model uncertainties carefully: to avoid using data twice, to apportion observation error between parameters, state vector and model discrepancy, and to get appropriate learning about all these.
Data assimilation alone is useful only for short term prediction.

25
This is challenging
We (Sheffield and Durham) have developed theory and serious case studies.
Growing practical experience.
But still lots to do, both theoretically and practically.
Each new model poses new challenges.
Our science is as exciting and challenging as any other.

26
Sorry …
We are not yet at the stage where implementation is routine.
Very limited software.
Most publications are in the statistics literature.
But we're working on it.
And we're very willing to interact with modellers/users in any discipline.
Particularly if you have resources!

27
Who we are
Sheffield: Tony O'Hagan, Marc Kennedy, Stefano Conti, Jeremy Oakley
Durham: Michael Goldstein, Peter Craig, Jonathan Rougier, Alan Seheult
