Uncertain models and modelling uncertainty

Uncertain models and modelling uncertainty Marian Scott Dept of Statistics, University of Glasgow EMS workshop, Nottingham, April 2004

Outline of presentation. Model building and testing: is the environment special? Statistical models vs physical/process-based models. What is sensitivity/uncertainty analysis? Quantifying and apportioning variation in model and data. General comments: relevance and implementation.

All models are wrong but some are useful (and some are more useful than others). (All data are useful, but some are more varied than others.)

Questions we ask about models. Is the model valid? Are the assumptions reasonable? Does the model make sense based on best scientific knowledge? Is the model credible? Do the model predictions match the observed data? How uncertain are the results? What is a good model? Simple, realistic, efficient, useful, reliable, valid, etc.

Statistical models. Always include an ε term to describe random variation. Empirical. Descriptive and predictive. Model-building goal: the simplest model which is adequate. Used for inference.

Physical/process-based models. Use best scientific knowledge. May not explicitly include ε, or any random variation. Descriptive and predictive. Goal may not be the simplest model. Not used for inference.

Models. Mathematical (deterministic/process-based) models tend to be complex and to ignore important sources of uncertainty. Statistical models tend to be empirical and to ignore much of the biological/physical/chemical knowledge.

Stages in modelling. Design and conceptualisation: visualisation of structure; identification of processes (variable selection); choice of parameterisation. Fitting and assessment: parameter estimation (calibration); goodness of fit.

Model evaluation tools. Graphical procedures. % variation explained in the response. Statistical model comparisons (F-tests, ANOVA, GLRT): well designed for statistical models, but what of the physical, process-driven models? Comparability to measurements.

The story of randomness and uncertainty. Randomness as the source of variability: a source of variation; different animals range over different territory, eat different sources of …. The effect is that we cannot be certain. Uncertainty due to lack of knowledge: conflicting evidence; ignorance; effects of scale; lack of observations. Uncertainty due to variability: natural randomness; behavioural variability.

Effect of uncertainties. Uncertainty in model quantities/parameters/inputs. Uncertainty about model form. Uncertainty about model completeness. Lack of observations contributes to uncertainties in input data and to parameter uncertainties. Conflicting evidence contributes to uncertainty about model form and about the validity of assumptions. All of this makes it difficult to judge how good a model is!

Modelling tools: SA/UA. Sensitivity analysis: determining the amount and kind of change produced in the model predictions by a change in a model parameter. Uncertainty analysis: an assessment/quantification of the uncertainties associated with the parameters, the data and the model structure.

Modellers conduct SA to determine (a) if a model resembles the system or processes under study, (b) the factors that contribute most to the output variability, (c) the model parameters (or parts of the model itself) that are insignificant, (d) if there is some region in the space of input factors for which the model variation is maximum, and (e) if and which (groups of) factors interact with each other.

SA flow chart (Saltelli, Chan and Scott, 2000)

Design of the SA experiment. Simple factorial designs (one at a time). Factorial designs (including potential interaction terms). Fractional factorial designs. Important difference: in the context of computer code experiments, random variation due to variation in experimental units does not exist.
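
The three design types listed on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the design procedure used in the talk: the factor names and levels are hypothetical, and the fractional design shown is only the simplest half fraction in coded (-1/+1) units.

```python
from itertools import product


def one_at_a_time(nominal, alternatives):
    """Baseline run plus one run per alternative level, changing one factor each time."""
    runs = [dict(nominal)]
    for name, levels in alternatives.items():
        for level in levels:
            run = dict(nominal)
            run[name] = level
            runs.append(run)
    return runs


def full_factorial(levels):
    """Every combination of the listed levels, so interactions can be estimated."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*(levels[n] for n in names))]


def half_fraction(names):
    """2^(k-1) fractional factorial in coded units (-1/+1).

    The last factor is set to the product of the others, so its main effect
    is aliased with their highest-order interaction.
    """
    *free, last = names
    runs = []
    for combo in product((-1, 1), repeat=len(free)):
        run = dict(zip(free, combo))
        sign = 1
        for c in combo:
            sign *= c
        run[last] = sign
        runs.append(run)
    return runs


# Hypothetical factors; names and values are illustrative only.
levels = {"rate": [0.1, 0.2], "depth": [1.0, 5.0], "mixing": [0.3, 0.6]}
nominal = {name: values[0] for name, values in levels.items()}
alternatives = {name: values[1:] for name, values in levels.items()}

oat_runs = one_at_a_time(nominal, alternatives)   # 1 + 3 runs
factorial_runs = full_factorial(levels)           # 2**3 runs
fraction_runs = half_fraction(list(levels))       # 2**2 runs in coded units
```

Because a computer code returns the same output for the same input, none of these designs needs replicate runs, which is the "important difference" the slide points to.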

SA techniques. Screening techniques: O(ne) A(t) a T(ime), factorial and fractional factorial designs, used to isolate a set of important factors. Local/differential analysis. Sampling-based (Monte Carlo) methods. Variance-based methods: variance decomposition of the output to compute sensitivity indices.

Screening. Screening experiments can be used to identify, with low computational effort, the parameter subset that controls most of the output variability.

Screening methods. Vary one factor at a time (NOT particularly recommended). Morris OAT design (global): estimate the main effect of a factor by computing a number r of local measures at different points x_1, …, x_r in the input space and then averaging them. Order the input factors.
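
The idea of averaging r local measures can be sketched as below. This is a simplified "radial" variant under stated assumptions, not the full Morris trajectory design: base points are drawn at random rather than on a grid, and the test function `toy_model`, the step `delta` and the bounds are all hypothetical.

```python
import random


def elementary_effects(model, bounds, r=20, delta=0.1, seed=0):
    """Return (mu, mu_star): mean and mean absolute elementary effect per factor.

    Effects are computed on inputs scaled to [0, 1], so they are comparable
    across factors with different units.
    """
    rng = random.Random(seed)
    k = len(bounds)
    total = [0.0] * k
    total_abs = [0.0] * k
    for _ in range(r):
        # Random base point, leaving room for a step of size delta.
        u = [rng.uniform(0.0, 1.0 - delta) for _ in range(k)]
        x = [lo + ui * (hi - lo) for ui, (lo, hi) in zip(u, bounds)]
        y0 = model(x)
        for j, (lo, hi) in enumerate(bounds):
            stepped = list(x)
            stepped[j] = x[j] + delta * (hi - lo)
            effect = (model(stepped) - y0) / delta
            total[j] += effect
            total_abs[j] += abs(effect)
    return [t / r for t in total], [t / r for t in total_abs]


def toy_model(x):
    # Hypothetical test function: strong effect of x[0], weak of x[1], none of x[2].
    return 5.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]


mu, mu_star = elementary_effects(toy_model, bounds=[(0, 1), (0, 1), (0, 1)])
# Order the input factors by mean absolute effect, most important first.
ranking = sorted(range(len(mu_star)), key=lambda j: mu_star[j], reverse=True)
```

Averaging absolute effects (`mu_star`) avoids cancellation when a factor's effect changes sign across the input space; the cost is r × (k + 1) model runs for k factors.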

Local SA Local SA concentrates on the local impact of the factors on the model. Local SA is usually carried out by computing partial derivatives of the output functions with respect to the input variables. The input parameters are varied in a small interval around a nominal value. The interval is usually the same for all of the variables and is not related to the degree of knowledge of the variables.
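
When analytic derivatives are not available, the partial derivatives at the nominal value are often approximated by finite differences. The sketch below assumes a central difference with a small relative step; the function name, the step size and the example model are illustrative choices, not taken from the talk.

```python
def local_sensitivities(model, nominal, rel_step=1e-4):
    """Central-difference estimate of d(output)/d(x_j) at the nominal point."""
    gradients = []
    for j, xj in enumerate(nominal):
        # Same relative interval for every input, as described on the slide.
        h = rel_step * (abs(xj) if xj != 0 else 1.0)
        up = list(nominal)
        down = list(nominal)
        up[j] = xj + h
        down[j] = xj - h
        gradients.append((model(up) - model(down)) / (2.0 * h))
    return gradients


def example_model(x):
    # Hypothetical model: output = x0^2 * x1.
    return x[0] ** 2 * x[1]


gradients = local_sensitivities(example_model, [2.0, 3.0])
```

The result describes the model only near the nominal point, which is why the slide notes that the interval is unrelated to how well each input is actually known.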

Global SA. Global SA apportions the output uncertainty to the uncertainty in the input factors, covering their entire range space. A global method evaluates the effect of x_j while all other x_i, i ≠ j, are varied as well.

How is a sampling-based (global) SA implemented? Step 1: define the model, input factors and outputs. Step 2: assign p.d.f.s to the input parameters/factors and, if necessary, a covariance structure. DIFFICULT. Step 3: simulate realisations from the parameter p.d.f.s to generate a set of model runs giving the set of output values.
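
The three steps can be sketched as follows. Everything specific here is an assumption made for illustration: the decay model, the uniform distributions and their ranges are hypothetical, and the inputs are sampled independently, so no covariance structure is represented.

```python
import math
import random
import statistics


# Step 1: define the model, its input factors (a, k) and its output.
def decay_model(x, t=10.0):
    # Hypothetical process model: amount remaining after time t.
    a, k = x
    return a * math.exp(-k * t)


# Step 2: assign a distribution to each input factor (assumed uniform here).
samplers = [
    lambda rng: rng.uniform(1.0, 2.0),    # a: initial amount
    lambda rng: rng.uniform(0.01, 0.1),   # k: decay rate
]


# Step 3: simulate realisations of the inputs and run the model on each.
def monte_carlo(model, samplers, n=2000, seed=1):
    rng = random.Random(seed)
    inputs = [[draw(rng) for draw in samplers] for _ in range(n)]
    outputs = [model(x) for x in inputs]
    return inputs, outputs


inputs, outputs = monte_carlo(decay_model, samplers)
mean_output = statistics.mean(outputs)
spread = statistics.stdev(outputs)
```

The paired `inputs` and `outputs` are the raw material for the later analysis; the choice of distributions in step 2 is the part the slide flags as difficult.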

Choice of sampling method. S(imple) or Stratified R(andom) S(ampling): each input factor is sampled independently many times from the marginal distributions to create the set of input values (or randomly sampled from the joint distribution). Expensive (relatively) in computational effort if the model has many input factors; may not give good coverage of the entire range space. L(atin) H(ypercube) S(ampling): the range of each input factor is divided into N equal-probability intervals, and one observation of each input factor is made in each interval.
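
A basic Latin hypercube sample can be generated as below. This is a minimal sketch on the unit cube with independent columns; the `scale` helper assumes uniform input distributions, and for other distributions each column would instead be passed through the relevant inverse CDF.

```python
import random


def latin_hypercube(n, k, seed=0):
    """n points in [0, 1)^k with exactly one point per equal-probability interval."""
    rng = random.Random(seed)
    columns = []
    for _ in range(k):
        # One draw inside each of the n intervals [i/n, (i+1)/n) ...
        column = [(i + rng.random()) / n for i in range(n)]
        # ... then shuffle so the intervals are paired at random across factors.
        rng.shuffle(column)
        columns.append(column)
    return [[columns[j][i] for j in range(k)] for i in range(n)]


def scale(sample, bounds):
    """Map unit-cube points to uniform ranges (the inverse CDF of a uniform input)."""
    return [[lo + u * (hi - lo) for u, (lo, hi) in zip(point, bounds)] for point in sample]


design = latin_hypercube(n=10, k=2)
scaled = scale(design, bounds=[(1.0, 2.0), (0.01, 0.1)])  # hypothetical ranges
```

With N runs, every factor is sampled across its whole range, which is the coverage advantage over simple random sampling for the same number of model runs.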

SA analysis. At the end of the computer experiment, the data are of the form (y_ij, x_1i, x_2i, …, x_ni), where x_1, …, x_n are the realisations of the input factors. Analysis includes regression analysis (on raw and ranked values), standard hypothesis tests of distribution (mean and variance) for sub-samples corresponding to given percentiles of x, and Analysis of Variance.
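
The difference between working with raw and ranked values can be shown with a single input factor, where the standardised regression slope equals the correlation coefficient. The sketch below is illustrative only: the data are made up, and ties in the ranking are broken by position, which is adequate for continuous samples.

```python
import math


def pearson(x, y):
    """Correlation coefficient; for one input this is the standardised regression slope."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank 1 for the smallest value; ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_correlation(x, y):
    return pearson(ranks(x), ranks(y))


# Hypothetical monotone but strongly non-linear input/output relationship.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [math.exp(v) for v in x]
raw = pearson(x, y)
ranked = rank_correlation(x, y)
```

Here the ranked measure picks up the perfectly monotone relationship while the raw measure is pulled down by the non-linearity, which is the usual argument for analysing ranks as well as raw values.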

Some ‘new’ methods of analysis. Measures of importance: $\mathrm{Var}_{X_j}(E(Y \mid X_j = x_j)) / \mathrm{Var}(Y)$; $\mathrm{HIM}(X_j) = \sum_i y_i y_i' / N$. Sobol sensitivity indices. Fourier Amplitude Sensitivity Test (FAST).
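
The first importance measure, the variance of the conditional expectation divided by the total variance, can be estimated crudely from a Monte Carlo sample by grouping runs on the value of one input. This binning estimator is an assumption made for illustration; it is not the Sobol or FAST estimator named on the slide. The additive test model is hypothetical, chosen because its true indices are known: for y = 4·x1 + x2 with independent uniform(0, 1) inputs they are 16/17 and 1/17.

```python
import random


def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def first_order_index(xj, y, bins=20):
    """Estimate Var(E(Y | X_j)) / Var(Y) from equal-count bins of x_j."""
    n = len(y)
    order = sorted(range(n), key=lambda i: xj[i])
    grand_mean = sum(y) / n
    size = n // bins
    between = 0.0
    for b in range(bins):
        stop = (b + 1) * size if b < bins - 1 else n
        group = [y[i] for i in order[b * size:stop]]
        group_mean = sum(group) / len(group)
        # Weighted squared deviation of the conditional mean from the overall mean.
        between += len(group) * (group_mean - grand_mean) ** 2
    return between / n / variance(y)


rng = random.Random(42)
x1 = [rng.random() for _ in range(20000)]
x2 = [rng.random() for _ in range(20000)]
y = [4.0 * a + b for a, b in zip(x1, x2)]   # hypothetical additive model
s1 = first_order_index(x1, y)
s2 = first_order_index(x2, y)
```

For an additive model the indices sum to roughly one; a shortfall in that sum on a real model would point to interactions between factors.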

So far so good but how useful are these techniques in some real life problems? Are there other complicating factors? Do statisticians have too simple/complex a view of the world?

Common features of environmental modelling and observations. Knowledge of the processes creating the observational record may be incomplete. The observational records may be incomplete (often observed irregularly in space and time), involve extreme events, and involve quantification of risk.

Issues and purpose of analysis
Global and local pollutant mapping from Chernobyl. Decision making: which areas should be restricted?
Global carbon cycle (greenhouse gases, CO2 levels and global warming). Prediction: what is the trend in temperature? Predict its level in 2050?
Ocean modelling. Decision making: is it safe to eat fish?
Air pollution modelling (local and regional scale). Regulatory: have emission control agreements reduced air pollutants?
Chronologies for past environment studies. Understanding: when did things happen in the past?

Questions we ask about observations Do they result from observational or designed; laboratory or field experiments? What scale are they collected over (time and space)? Are they representative? Are they qualitative or quantitative? How are they connected to processes, how well understood are these connections? How varied are they?

Example 1: are atmospheric SO2 concentrations declining? Measurements made at a monitoring station over a 20-year period. Processes involve meteorology (local and long-range), source distribution and the chemistry of sulphur. A complex statistical model was developed to describe the pattern; the model apportions the variation to ‘trend’, seasonality and residual variation. Main objective

Example 2: discovery of radioactive particles on the foreshore of a nuclear facility since 1983. Is the rate of finds falling off? Are the particle characteristics changing with time? Processes: transport in the marine environment, chemistry of the particles in the sea, interaction with the source. What can we infer about the size of the source and its distribution?

Log activity and trend

Trend in number of finds

Cumulative number of finds

Example 3: how well should models agree? 6 ocean models (process-based: transport, sedimentary processes, numerical solution scheme, grid size) used to predict the dispersal of a pollutant. Results to be used to determine a remediation policy. The models differ in their detail and also in their spatial scale.

Model agreement Three different sites (local, regional and global relative to a source) 6 different models Level of agreement (high values are poor).

Predictions of levels of cobalt-60 Different models, same input data Predictions vary by considerable margins Magnitude of variation a function of spatial distribution of sites

Environmental modelling. Modelling may involve: understanding and handling variation; dealing with unusual observations; dealing with missing observations; evaluating uncertainties.

How well should the model reproduce the data? Anecdotal comments: ‘agreement between model and measurement better than 1 (2) orders of magnitude is acceptable’. But this needs to be moderated by the measurement variation and uncertainties. It also depends on the purpose (model fit for purpose).
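
The anecdotal criterion can be written down as a check on the log10 ratio of prediction to measurement. This is only a sketch of that rule of thumb: the function names and the default tolerance are assumptions, and the check does not account for measurement uncertainty, which the slide says should moderate it.

```python
import math


def orders_of_magnitude_apart(predicted, observed):
    """Absolute log10 ratio; both values must be positive."""
    return abs(math.log10(predicted / observed))


def acceptable(predicted, observed, tolerance=1.0):
    """True if model and measurement agree to within `tolerance` orders of magnitude."""
    return orders_of_magnitude_apart(predicted, observed) <= tolerance


gap = orders_of_magnitude_apart(1000.0, 10.0)  # hypothetical prediction and measurement
```

A tolerance of one order of magnitude accepts a prediction anywhere between a tenth and ten times the measured value, which shows how weak the criterion is for many purposes.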

How can SA/UA help? SA/UA have a role to play in all modelling stages: We learn about model behaviour and ‘robustness’ to change; We can generate an envelope of ‘outcomes’ and see whether the observations fall within the envelope; We can ‘tune’ the model and identify reasons/causes for differences between model and observations
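
One way to build an envelope of outcomes is to take, at each prediction point, a central percentile band across the Monte Carlo model runs and then count how many observations fall inside it. The sketch below assumes a 95% band and a simple order-statistic percentile; the runs and observations are made-up numbers.

```python
def envelope(runs, lower=0.025, upper=0.975):
    """Per-point (low, high) band across runs; each run is a list of predictions."""
    bands = []
    for values in zip(*runs):
        ordered = sorted(values)
        low = ordered[int(lower * (len(ordered) - 1))]
        high = ordered[int(upper * (len(ordered) - 1))]
        bands.append((low, high))
    return bands


def coverage(observations, bands):
    """Fraction of observations that fall inside the envelope."""
    inside = sum(low <= obs <= high for obs, (low, high) in zip(observations, bands))
    return inside / len(observations)


# Hypothetical ensemble: 101 runs, each predicting at two locations.
runs = [[float(i), 2.0 * i] for i in range(101)]
bands = envelope(runs)
fraction_inside = coverage([50.0, 300.0], bands)
```

Observations that fall outside the envelope are candidates for closer inspection: either the input distributions are too narrow or the model structure is missing something.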

On the other hand: uncertainty analysis. Parameter uncertainty: usually quantified in the form of a distribution. Model structural uncertainty: more than one model may be fitted, expressed as a prior on model structure. Scenario uncertainty: uncertainty about future conditions.

Tools for handling uncertainty. Parameter uncertainty: probability distributions and sensitivity analysis. Structural uncertainty: Bayesian framework; one possibility is to define a discrete set of models, another is to use a Gaussian process.
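
For the discrete-set option, the standard Bayesian model averaging identity shows how a prior on model structure enters the prediction. The notation is an assumption introduced here, not taken from the slides: $M_1, \ldots, M_K$ are the candidate models, $D$ the data and $y$ the quantity to be predicted.

```latex
p(y \mid D) = \sum_{k=1}^{K} p(y \mid M_k, D)\, p(M_k \mid D),
\qquad
p(M_k \mid D) = \frac{p(D \mid M_k)\, p(M_k)}{\sum_{l=1}^{K} p(D \mid M_l)\, p(M_l)}
```

The prior $p(M_k)$ is where structural uncertainty is expressed, and each $p(y \mid M_k, D)$ already carries that model's parameter uncertainty.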

Conclusions. The world is rich and varied in its complexity. Modelling is an uncertain activity. Model assessment is a difficult process. SA/UA are important tools in model assessment. Setting the problem in a unified Bayesian framework allows all the sources of uncertainty to be quantified, and so a fuller assessment to be performed.

Challenges. Some challenges: different terminologies in different subject areas; the need for more sophisticated tools to deal with the multivariate nature of the problem; challenges in describing the distribution of input parameters; challenges in dealing with the Bayesian formulation of structural uncertainty for complex models; computational challenges in simulations for large and complex computer models with many factors.