Process-based modelling of vegetations and uncertainty quantification Marcel van Oijen (CEH-Edinburgh) Course Statistics for Environmental Evaluation Glasgow,

Contents
1. Process-based modelling of vegetations
2. The Bayesian approach
3. Bayesian Calibration (BC) of process-based models
4. Bayesian Model Comparison (BMC)
5. Limitations of BC & BMC
6. On the usage of BC & BMC, now and in the future
7. References, Summary, Discussion

1. Process-based modelling of vegetations

1.1 Ecosystem PBMs simulate biogeochemistry
[Diagram: flows of H2O, C and N between atmosphere, tree, soil and subsoil]

1.2 I/O of PBMs
Input:
- Parameters & initial constants, vegetation
- Parameters & initial constants, soil
- Atmospheric drivers
- Management & land use
Model
Output:
- Simulation of time series of plant and soil variables

1.3 I/O of empirical models
Input → Model → Output
Model: Y = P1 * t + P2
Two parameters: P1 = slope, P2 = intercept

1.4 Environmental evaluation: increasing use of PBMs
[Maps: C-sequestration (model output for ) and uncertainty of C-sequestration; Van Oijen & Thomson, 2010]

1.5 Coupled vegetation-climate modelling
[Figure: White et al. (1998)]
Death of Amazonian rain forest?!

1.6 Forest models and uncertainty
[Figure: N dep UE (kg C kg^-1 N) for the models bgc, century and hybrid; Levy et al., 2004]

1.7 There are many models!
Status: 680 models ( )
Search models (by subject). Result of query: Subject: Forestry, 78 models found
- ANIMO: Agricultural NItrogen MOdel
- ACRU: Agricultural Catchments Research Unit Model
- AMORPHYS: A forest model based on tree morphology and physiology
- AREFS: The Automated Regional Ecological Forecast System
- BIOMASS: Forest canopy carbon and water balance model
- BROOK: BROOK, BROOK2 and BROOK90
- BWIN: Program for forest stand analysis and prognosis
- CACTOS: California Conifer Timber Output Simulator
- CALPRO: The growth model for uneven-aged mixed conifer stands in California
- CARDYN: CARbon DYNamics
- CARRY: CARRY - contaminant transport model
- CRYPTOS: CRYPTOS
- CUPID: A comprehensive model of plant-environment interaction
- DENIT: DenNit
- DRYADES: Dryades
- DYNLAYER: Dynamic forest simulator
- EFIMOD: Dynamic Model of the "Mixed Stand/Soil" System in European Boreal Forests
- EFISCEN: European Forest Information Model
- (…)

1.8 Coupled vegetation-climate modelling
[Figure: White et al. (1998)]
Death of Amazonian rain forest?!

1.9 Amazonia revisited: model uncertainties
Galbraith et al. (2010):
1. Change in precipitation and temperature over Amazonia predicted by 20 GCMs
2. Change in rainforest biomass predicted by 3 vegetation models for most extreme scenario (HadCM3 climate, A1FI)
3. Resulting change in rainforest biomass predicted by 3 vegetation models x 20 GCMs x 4 scenarios

1.10 Reality check!
How reliable are these model studies?
- Sufficient data for model parameterization?
- Sufficient data for model input?
- How plausible are the different models?
In every study using systems analysis and simulation:
- Model parameters, inputs and structure are uncertain
How to deal with uncertainties optimally?

2. The Bayesian approach

2.1 Probability Theory
Uncertainties are everywhere: models (environmental inputs, parameters, structure), data
Uncertainties can be expressed as probability distributions (pdfs)
We need methods that:
- Quantify all uncertainties
- Show how to reduce them
- Efficiently transfer information: data → models → model application
Calculating with uncertainties (pdfs) = Probability Theory

2.2 The Bayesian approach: reasoning using probability theory

2.3 The Bayesian approach = using Bayes Theorem

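2.4 Dealing with uncertainty: Medical diagnostics
A flu epidemic occurs: one percent of people are ill
Diagnostic test, 99% reliable
Test result is positive (bad news!)
What is P(diseased|test positive)? (a) 0.50 (b) 0.98 (c) 0.99
Given: P(dis) = 0.01, P(pos|hlth) = 0.01, P(pos|dis) = 0.99
Bayes Theorem:
$$P(\mathrm{dis}\mid\mathrm{pos}) = \frac{P(\mathrm{pos}\mid\mathrm{dis})\,P(\mathrm{dis})}{P(\mathrm{pos})} = \frac{P(\mathrm{pos}\mid\mathrm{dis})\,P(\mathrm{dis})}{P(\mathrm{pos}\mid\mathrm{dis})\,P(\mathrm{dis}) + P(\mathrm{pos}\mid\mathrm{hlth})\,P(\mathrm{hlth})} = \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.01 \times 0.99} = 0.50$$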
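The diagnostic calculation above can be checked numerically. The following is a minimal Python sketch; the function name `posterior_disease` and its argument names are made up for illustration and are not part of the original slides.

```python
def posterior_disease(prior, p_pos_given_dis, p_pos_given_hlth):
    """Return P(dis|pos) using Bayes Theorem and the law of total probability."""
    # Denominator P(pos): positive tests from ill people plus false positives from healthy people
    p_pos = p_pos_given_dis * prior + p_pos_given_hlth * (1.0 - prior)
    return p_pos_given_dis * prior / p_pos


# Numbers from the slide: 1% prevalence, 99% reliable test
p = posterior_disease(prior=0.01, p_pos_given_dis=0.99, p_pos_given_hlth=0.01)
print(round(p, 2))
```

Raising the prior (for example to 0.10) while keeping the test unchanged shows how strongly the answer depends on the prevalence.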

2.5 Proof of Bayes Theorem
Product Rule: P(A&B) = P(B) P(A|B)
Bayes Theorem: P(A|B) = P(A) P(B|A) / P(B)
[Venn diagram: sets A and B]

2.6 Proof of Bayes Theorem
Product Rule: P(A&B) = P(B) P(A|B) = P(A) P(B|A)
Bayes Theorem: P(A|B) = P(A) P(B|A) / P(B)
[Venn diagram: sets A and B] P(A&B) = 1/7, P(A|B) = 1/3, P(B|A) = 1/5

2.7 The denominator in Bayes Theorem
Law of Total Probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
[Venn diagram: sets A and B] 3/7 = 1/5 * 5/7 + 1 * 2/7
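The fractions quoted on the Venn-diagram slides can be verified with exact arithmetic. This is a small Python sketch; the variable names are illustrative, and P(A) = 5/7 is read off the total-probability line rather than stated separately on the slides.

```python
from fractions import Fraction as F

# Probabilities read from the slides' Venn diagram example
p_a = F(5, 7)               # P(A)
p_b_given_a = F(1, 5)       # P(B|A)
p_b_given_not_a = F(1)      # P(B|not A)

# Product rule: P(A&B) = P(A) P(B|A)
p_a_and_b = p_a * p_b_given_a

# Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes Theorem: P(A|B) = P(A) P(B|A) / P(B)
p_a_given_b = p_a * p_b_given_a / p_b

print(p_a_and_b, p_b, p_a_given_b)
```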

2.8 Bayesian updating of probabilities
Bayes Theorem: Prior probability → Posterior prob.
Medical diagnostics: P(disease) → P(disease|test result)
Model parameterization: P(params) → P(params|data)
Model selection: P(models) → P(model|data)
SPAM-killer: P(SPAM) → P(SPAM|header)
Weather forecasting: …
Climate change prediction: …
Oil field discovery: …
GHG-emission estimation: …
Jurisprudence: …

2.10 What and why?
- We want to use data and models to explain and predict ecosystem behaviour
- Data as well as model inputs, parameters and outputs are uncertain
- No prediction is complete without quantifying the uncertainty. No explanation is complete without analysing the uncertainty
- Uncertainties can be expressed as probability density functions (pdfs)
- Probability theory tells us how to work with pdfs: Bayes Theorem (BT) tells us how a pdf changes when new information arrives
BT: Prior pdf → Posterior pdf
BT: Posterior = Prior x Likelihood / Evidence
BT: P(θ|D) = P(θ) P(D|θ) / P(D)
BT: P(θ|D) ∝ P(θ) P(D|θ)
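For a single parameter, "Posterior = Prior x Likelihood / Evidence" can be evaluated directly on a grid of candidate values. The Python sketch below assumes a uniform prior on [0, 10] and one observation with Gaussian measurement error; the observation `y_obs`, its standard deviation `sigma` and the grid itself are hypothetical values chosen only for illustration.

```python
import math


def grid_posterior(thetas, prior, likelihood):
    """Apply Bayes Theorem on a grid: posterior = prior * likelihood / evidence."""
    unnormalised = [p * likelihood(t) for t, p in zip(thetas, prior)]
    evidence = sum(unnormalised)          # discrete stand-in for P(D)
    return [u / evidence for u in unnormalised], evidence


thetas = [i * 0.1 for i in range(101)]    # candidate parameter values 0.0 ... 10.0
prior = [1.0 / len(thetas)] * len(thetas) # uniform prior (assumption)

y_obs, sigma = 6.0, 1.5                   # hypothetical datum and measurement s.d.


def likelihood(theta):
    # Gaussian likelihood up to a constant factor, which cancels in the normalisation
    return math.exp(-0.5 * ((y_obs - theta) / sigma) ** 2)


posterior, evidence = grid_posterior(thetas, prior, likelihood)
theta_map = thetas[max(range(len(thetas)), key=posterior.__getitem__)]
print(theta_map, sum(posterior))
```

Grid evaluation is only practical for one or two parameters, which is one reason sampling methods are used for models with many parameters.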

3. Bayesian Calibration (BC) of process-based models

Bayesian updating of probabilities for process-based models
Bayes Theorem: Prior probability → Posterior prob.
Model parameterization: P(params) → P(params|data)
Model selection: P(models) → P(model|data)

3.1 Process-based forest models
[Diagram: Environmental scenarios, Initial values and Parameters → Model → Height, NPP, Soil C]

3.2 Process-based forest model BASFOR
BASFOR: 40+ parameters, 12+ output variables

3.3 BASFOR: outputs
[Plots: Volume (standing); Carbon in trees (standing + thinned); Carbon in soil]

3.4 BASFOR: parameter uncertainty

3.5 BASFOR: prior output uncertainty
[Plots: Volume (standing); Carbon in trees (standing + thinned); Carbon in soil]

3.6 Data Dodd Wood (R. Matthews, Forest Research)
[Plots: Volume (standing); Carbon in trees (standing + thinned); Carbon in soil]

3.7 Using data in Bayesian calibration of BASFOR
Prior pdf + Data → Bayesian calibration → Posterior pdf

3.8 Bayesian calibration: posterior uncertainty
[Plots: Volume (standing); Carbon in trees (standing + thinned); Carbon in soil]

3.9 Calculating the posterior using MCMC
Goal: a sample of parameter vectors from the posterior distribution P(θ|D) for the parameters
P(θ|D) ∝ P(θ) P(D|f(θ))
1. Start anywhere in parameter-space: p(i=0)
2. Randomly choose p(i+1) = p(i) + δ
3. IF: [ P(p(i+1)) P(D|f(p(i+1))) ] / [ P(p(i)) P(D|f(p(i))) ] > Random[0,1] THEN: accept p(i+1) ELSE: reject p(i+1) (the chain stays at p(i)); i = i+1
4. IF i < 10^4 GOTO 2
Metropolis et al (1953)
[Figure: MCMC trace plots]
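In practice the acceptance test in step 3 is usually evaluated on the log scale, because products of many small likelihood terms underflow. The block below is a sketch of that rewriting, not a quotation from the slides: the symbols r, u, σ_j and the index j are introduced here, and the third line assumes independent Gaussian measurement errors with known standard deviations. The proposal δ is taken to be symmetric, which is why no proposal densities appear in the ratio.

```latex
\[
r = \frac{P(p_{i+1})\,P(D \mid f(p_{i+1}))}{P(p_i)\,P(D \mid f(p_i))},
\qquad \text{accept } p_{i+1} \text{ if } r > u,\quad u \sim \mathrm{Uniform}[0,1]
\]
\[
\Longleftrightarrow\quad
\log u < \bigl[\log P(p_{i+1}) + \log P(D \mid f(p_{i+1}))\bigr]
       - \bigl[\log P(p_i) + \log P(D \mid f(p_i))\bigr]
\]
\[
\log P(D \mid f(p)) = \sum_j \left[ -\tfrac{1}{2}\left(\frac{f_j(p) - D_j}{\sigma_j}\right)^{2} - \log \sigma_j \right] + \text{const}
\]
```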

3.10 BC using MCMC: an example in EXCEL Click here for BC_MCMC1.xls

MCMC in R

    install.packages("mvtnorm")   # only needed once
    require(mvtnorm)

    # Chain length: the value was lost in this transcript; 10^4 is assumed here,
    # matching step 4 of the algorithm on slide 3.9
    chainLength <- 10000

    # Data: one row per observation (time, observed value, measurement s.d.)
    data <- matrix(c(10,  6.09, 1.83,
                     20,  8.81, 2.64,
                     30, 10.66, 3.27), nrow=3, ncol=3, byrow=T)

    # Parameters: one row per parameter (minimum, starting value, maximum)
    param <- matrix(c(0, 5,   10,
                      0, 0.5, 1), nrow=2, ncol=3, byrow=T)
    pMinima <- c(param[1,1], param[2,1])
    pMaxima <- c(param[1,3], param[2,3])
    logli   <- matrix(, nrow=3, ncol=1)

    # Proposal: multivariate normal with s.d. equal to 5% of each prior range
    vcovProposal <- diag( (0.05*(pMaxima-pMinima))^2 )
    pValues   <- c(param[1,2], param[2,2])
    pChain    <- matrix(0, nrow=chainLength, ncol=length(pValues)+1)
    logPrior0 <- sum(log(dunif(pValues, min=pMinima, max=pMaxima)))

    # Model: straight line
    model <- function (times, intercept, slope) {
      y <- intercept + slope*times
      return(y)
    }

    # Gaussian log-likelihood of the starting parameter values
    for (i in 1:3) {
      logli[i] <- -0.5*((model(data[i,1],pValues[1],pValues[2]) - data[i,2])/data[i,3])^2 - log(data[i,3])
    }
    logL0 <- sum(logli)
    pChain[1,] <- c(pValues, logL0)   # Keep first values

    for (iter in (2 : chainLength)) {
      # Propose a candidate parameter vector around the current one
      candidatepValues <- rmvnorm(n=1, mean=pValues, sigma=vcovProposal)
      # Uniform prior: zero outside the permitted ranges
      if (all(candidatepValues>pMinima) && all(candidatepValues<pMaxima)) {
        Prior1 <- prod(dunif(candidatepValues, pMinima, pMaxima))
      } else {
        Prior1 <- 0
      }
      if (Prior1 > 0) {
        for (i in 1:3) {
          logli[i] <- -0.5*((model(data[i,1],candidatepValues[1],candidatepValues[2]) - data[i,2])/data[i,3])^2 - log(data[i,3])
        }
        logL1 <- sum(logli)
        logalpha <- (log(Prior1)+logL1) - (logPrior0+logL0)
        # Metropolis acceptance test, carried out on the log scale
        if ( log(runif(1, min=0, max=1)) < logalpha ) {
          pValues   <- candidatepValues
          logPrior0 <- log(Prior1)
          logL0     <- logL1
        }
      }
      # Store the current state (unchanged if the candidate was rejected)
      pChain[iter,1:2] <- pValues
      pChain[iter,3]   <- logL0
    }

    # Summaries: acceptance rate, posterior means and covariance matrix
    nAccepted  <- length(unique(pChain[,1]))
    acceptance <- paste(nAccepted, "out of ", chainLength, "candidates accepted ( = ",
                        round(100*nAccepted/chainLength), "%)")
    print(acceptance)
    mp <- apply(pChain, 2, mean)
    print(mp)
    pCovMatrix <- cov(pChain)
    print(pCovMatrix)

3.12 Using data in Bayesian calibration of BASFOR
Prior pdf + Data → Bayesian calibration → Posterior pdf

3.13 Parameter correlations 39 parameters

3.14 Continued calibration when new data become available
Prior pdf + New data → Bayesian calibration → Posterior pdf
The posterior pdf then serves as the prior pdf for the next calibration
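The reason a posterior can be reused as the next prior follows from Bayes Theorem applied twice. The short derivation below is a sketch that assumes the two data sets, written here as D_1 and D_2 (labels introduced for illustration), are conditionally independent given the parameters θ.

```latex
\[
P(\theta \mid D_1, D_2)
  \;\propto\; P(\theta)\,P(D_1 \mid \theta)\,P(D_2 \mid \theta)
  \;\propto\; \underbrace{P(\theta \mid D_1)}_{\text{posterior after } D_1 \;=\; \text{new prior}} \, P(D_2 \mid \theta)
\]
```

Under that assumption, calibrating on D_1 and then on D_2 gives the same posterior as calibrating on both data sets at once.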

3.15 Bayesian projects at CEH-Edinburgh
- Selection of forest models (NitroEurope team)
- Data Assimilation forest EC data (David Cameron, Mat Williams)
- Risk of frost damage in grassland (Stig Morten Thorsen, Anne-Grete Roer, MvO)
- Uncertainty in agricultural soil models (Lehuger, Reinds, MvO)
- Uncertainty in UK C-sequestration (MvO, Jonty Rougier, Ron Smith, Tommy Brown, Amanda Thomson)
- Parameterization and uncertainty quantification of 3-PG model of forest growth & C-stock (Genevieve Patenaude, Ronnie Milne, M. v.Oijen)
- Uncertainty in earth system resilience (Clare Britton & David Cameron)
[Figure: [CO2] against time]

3.16 BASFOR: forest C-sequestration
Uncertainty due to model parameters only, NOT uncertainty in inputs / upscaling
[Maps: Soil N-content; C-sequestration; Uncertainty of C-sequestration]

3.18 What kind of measurements would have reduced uncertainty the most ?

3.19 Prior predictive uncertainty & height-data
[Plots: Height, Biomass. Shown: prior pred. uncertainty; height data Skogaby]

3.20 Prior & posterior uncertainty: use of height data
[Plots: Height, Biomass. Shown: prior pred. uncertainty; posterior uncertainty (using height data); height data Skogaby]

3.20 Prior & posterior uncertainty: use of height data
[Plots: Height, Biomass. Shown: prior pred. uncertainty; posterior uncertainty (using height data); height data (hypothet.)]

3.20 Prior & posterior uncertainty: use of height data
[Plots: Height, Biomass. Shown: prior pred. uncertainty; posterior uncertainty (using height data); posterior uncertainty (using precision height data)]

3.22 Summary for BC vs tuning
Model tuning
1. Define parameter ranges (permitted values)
2. Select parameter values that give model output closest (r², RMSE, …) to data
3. Do the model study with the tuned parameters (i.e. no model output uncertainty)
Bayesian calibration
1. Define parameter pdfs
2. Define data pdfs (probable measurement errors)
3. Use Bayes Theorem to calculate posterior parameter pdf
4. Do all future model runs with samples from the parameter pdf (i.e. quantify uncertainty of model results)
BC can use data to reduce parameter uncertainty for any process-based model
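The practical difference between the two workflows shows up in the last step: tuning gives one model run, while BC gives a distribution of runs. The Python sketch below illustrates this with a straight-line model. The "posterior sample" is a hypothetical stand-in drawn from independent normal distributions with made-up means and standard deviations; a real calibration would use the MCMC chain, which typically contains parameter correlations.

```python
import random
import statistics


def model(t, intercept, slope):
    """Straight-line stand-in for a process-based model."""
    return intercept + slope * t


random.seed(1)

# Hypothetical posterior sample of (intercept, slope); values are illustrative only
sample = [(random.gauss(3.8, 0.8), random.gauss(0.23, 0.04)) for _ in range(5000)]

t_new = 40

# Model tuning: a single run with one "best" parameter vector, no output uncertainty
tuned = model(t_new, 3.8, 0.23)

# Bayesian calibration: one run per sampled parameter vector
runs = sorted(model(t_new, a, b) for a, b in sample)
lower = runs[int(0.025 * len(runs))]      # 2.5% quantile
upper = runs[int(0.975 * len(runs))]      # 97.5% quantile
mean_run = statistics.mean(runs)

print(f"tuned: {tuned:.2f}; BC mean: {mean_run:.2f}, 95% interval: {lower:.2f} to {upper:.2f}")
```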

4. Bayesian Model Comparison (BMC)

4.1 Multiple models -> structural uncertainty
[Figure: N dep UE (kg C kg^-1 N) for the models bgc, century and hybrid; Levy et al., 2004]

4.2 Bayesian comparison of two models
Bayes Theorem for model probab.: P(M|D) = P(M) P(D|M) / P(D)
With equal priors, P(M1) = P(M2) = ½: P(M2|D) / P(M1|D) = P(D|M2) / P(D|M1)
The Bayes Factor P(D|M2) / P(D|M1) quantifies how the data D change the odds of M2 over M1
The Integrated likelihood P(D|Mi) can be approximated from the MCMC sample of outputs for model Mi (*)
(*) harmonic mean of likelihoods in MCMC-sample (Kass & Raftery, 1995)
[Figure: Model 1, Model 2]
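The harmonic-mean approximation mentioned in the footnote can be written in a few lines. The Python sketch below works with log-likelihoods to avoid underflow; the function names are hypothetical, and the inputs are assumed to be the log-likelihood values stored along each model's MCMC chain. The harmonic-mean estimator is simple but can be very unstable, so this is an illustration of the idea rather than a recommended implementation.

```python
import math


def log_integrated_likelihood(log_likelihoods):
    """Harmonic-mean estimate of log P(D|M) from log-likelihoods of a posterior sample."""
    n = len(log_likelihoods)
    neg = [-ll for ll in log_likelihoods]
    m = max(neg)
    # log of sum(1 / L_i), computed with the log-sum-exp trick
    log_sum_inverse = m + math.log(sum(math.exp(x - m) for x in neg))
    # harmonic mean = n / sum(1 / L_i)
    return math.log(n) - log_sum_inverse


def bayes_factor(log_likelihoods_m2, log_likelihoods_m1):
    """Bayes Factor P(D|M2) / P(D|M1) from the two models' MCMC log-likelihoods."""
    return math.exp(log_integrated_likelihood(log_likelihoods_m2)
                    - log_integrated_likelihood(log_likelihoods_m1))


# Toy check with three likelihood values: harmonic mean of 0.1, 0.2, 0.4
toy = [math.log(0.1), math.log(0.2), math.log(0.4)]
print(math.exp(log_integrated_likelihood(toy)))
```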

4.3 BMC: Tuomi et al. 2007

4.4 Bayes Factor for two big forest models
MCMC, 5000 steps
Calculation of P(D|BASFOR)
Calculation of P(D|BASFOR+)
Data Rajec: Emil Klimo

4.5 Bayes Factor for two big forest models
MCMC, 5000 steps
Calculation of P(D|BASFOR)
Calculation of P(D|BASFOR+)
Data Rajec: Emil Klimo
P(D|M1) = 7.2e-16
P(D|M2) = 5.8e-15
Bayes Factor = 7.8, so BASFOR+ supported by the data
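A Bayes Factor can be turned into a posterior model probability. The worked step below is a sketch that assumes equal prior probabilities for the two models, as on slide 4.2, takes M_2 to be BASFOR+, and uses the Bayes Factor of 7.8 reported on the slide; the symbol BF is introduced here.

```latex
\[
P(M_2 \mid D) = \frac{P(M_2)\,P(D \mid M_2)}{P(M_1)\,P(D \mid M_1) + P(M_2)\,P(D \mid M_2)}
             = \frac{\mathrm{BF}}{1 + \mathrm{BF}}
             = \frac{7.8}{8.8} \approx 0.89
\]
```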

4.6 Summary of BMC: what do we need, what do we do?
What do we need to carry out a BMC?
1. Multiple models: M1, …, Mn
2. For each model, a list of its parameters: θ1, …, θn
3. Data: D
What do we do with the models, parameters and data?
1. We express our uncertainty about the correctness of models, parameter values and data by means of probability distributions.
2. We apply the rules of probability theory to transfer the information from the data to the probability distributions for models and parameters
3. The result tells us which model is the most plausible, and what its parameter values are likely to be

5. Limitations of BC & BMC

5.1 What do BC & BMC tell us about our models?
- BC tells us about our parameters: what their values probably are
- BMC tells us about the structure of our models: which model is more plausible than others
But …
- BC does not tell us why the most probable parameter values sometimes look strange
- BMC does not tell us whether the most plausible model could be improved, or how

5.2 EXAMPLE: BC giving strange posterior pdfs Red lines: Prior pdf. Black histograms: Posterior pdf after using data from Scots pine in Estonia.

5.3 A three step-procedure

5.4 Forest model comparison (NitroEurope)
System: Spruce forest, Höglwald, Germany
Models: BASFOR, COUP, DAYCENT, Mobile-DNDC
Data: Soil water, Emissions of N2O, NO, CO2
[Figure: Field data; Van Oijen et al. 2011]

5.5 Analysis of model-data mismatch before/after BC: logL [Van Oijen et al. 2011]

5.6 Analysis of model-data mismatch before/after BC: MSE
[Figure: MSE for N2O, prior and posterior (four pairs); Van Oijen et al. 2011]

5.7 Parameters: universal or site-specific?
System: Forest soils, Europe, 182 sites
Model: VSD
Data: pH, [Ca,Mg,K], [NO3], [Al]
- Single-site calibration turns prior uncertainty into spatial variability
- Multi-site calibration removes parameter uncertainty …
- … but NRMSE on validation plots are % higher than using nearest-neighbour single-site calibration
[Figure: Prior, Posterior]

6. On the usage of BC & BMC, now and in the future

6.1 Bayes in other disguises
Linear regression using least squares = BC, except that uncertainty is ignored
- Model: straight line
- Prior: uniform
- Likelihood: Gaussian (iid)
Note: Realising that LS-regression is a special case of BC opens up possibilities to improve on it, e.g. by having more information in the prior or likelihood (Sivia 2005)
All Maximum Likelihood estimation methods can be seen as limited forms of BC where the prior is ignored (uniform) and only the maximum value of the likelihood is identified (ignoring uncertainty)
Hierarchical modelling = BC, e.g. for spatiotemporal stochastic modelling with spatial correlations included in the prior
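The link between least squares and BC can be made explicit in one line. The sketch below assumes a straight-line model with intercept a and slope b, observations y_i at times t_i, and iid Gaussian errors with a common standard deviation σ; all of these symbols are introduced here for illustration.

```latex
\[
\log P(a, b \mid D) = \underbrace{\log P(a, b)}_{\text{constant for a uniform prior}}
  \;-\; \frac{1}{2\sigma^{2}} \sum_{i} \bigl(y_i - a - b\,t_i\bigr)^{2} \;+\; \text{const}
\]
```

Maximising the posterior is therefore the same as minimising the sum of squared residuals; what least squares leaves out is the rest of the posterior, that is, the uncertainty around that maximum.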

6.2 Bayes in other disguises (cont.)
- Inverse modelling (e.g. to estimate emission rates from concentrations)
- Geostatistics, e.g. Bayesian kriging
- Data Assimilation (KF, EnKF etc.)

6.3 Trends
- More use of Bayesian approaches in all areas of environmental science
- Improvements in computational techniques for BC & BMC of slow process-based models
- Increasing use of hierarchical models (to represent complex prior pdfs, or to represent spatial relationships)
- Replacement of informal methods (or methods that only approximate the full probability approach) by BMC

6.4 Improvements in Markov Chain Monte Carlo algorithms

6.5 Hierarchical Bayesian modelling in ecology See also: Ogle, K. and J.J. Barber (2008) "Bayesian data-model integration in plant physiological and ecosystem ecology." Progress in Botany 69:

6.6 Bayesian calibration instead of model spin-up
System: Grassland, Oensingen (Switzerland)
Model: DAYCENT
Data: Soil respiration
[Figure: Data, Prior, Posterior]

6.7 Bayes & space Van Oijen, Thomson & Ewert (2009)

7. Summary, References, Discussion

7.1 Summary of methodology
1. Express all uncertainties probabilistically: assign probability distributions to (1) data, (2) the collection of models, (3) the parameter-set of each individual model
2. Use the rules of probability theory to transfer the information from the data to the probability distributions for models and parameters
Main tool from probability theory to do this: Bayes Theorem
P(α|D) ∝ P(α) P(D|α)
Posterior is proportional to prior times likelihood
α = parameter set → parameterisation (Bayesian Calibration, BC)
α = model set → model evaluation (Bayesian Model Comparison, BMC)

7.2 Bayesian methods: References
- Bayes, T. (1763): Bayes Theorem
- Metropolis, N. (1953): MCMC
- Kass & Raftery (1995): BMC
- Green, E.J. / MacFarlane, D.W. / Valentine, H.T., Strawderman, W.E. (1996, 1998, 1999, 2000): Forest models
- Jansen, M. (1997): Crop models
- Jaynes, E.T. (2003): Probability theory
- Van Oijen et al. (2005, 2008, 2011): Complex process-based models, MCMC

Bayesian Calibration (BC) and Bayesian Model Comparison (BMC) of process-based models: Theory, implementation and guidelines
Freely downloadable from

7.4 Discussion: BC & BMC in practice
Practicalities:
1. When new data arrive: MCMC provides a universal method for calculating posterior pdfs. Easy to implement, difficult to fine-tune
2. Quantifying the prior: Not a key issue in env. sci.: (1) many data, (2) prior is posterior from previous calibration
3. Defining the likelihood: Normal pdf for measurement error usually describes our prior state of knowledge adequately (Jaynes)
4. For model development: BC & BMC are not enough! Still a need for analysis of model-data mismatch
Overall: Uncertainty quantification often shows that our process-based environmental models are not very reliable

Appendix A: How to do BC
The problem: You have: (1) a prior pdf P(θ) for your model's parameters, (2) new data. You also know how to calculate the likelihood P(D|θ). How do you now go about using BT to calculate the posterior P(θ|D)?
Methods of using BT to calculate P(θ|D):
1. Analytical. Only works when the prior and likelihood are conjugate (family-related). For example, if prior and likelihood are normal pdfs, then the posterior is normal too.
2. Numerical. Uses sampling. Three main methods:
   1. MCMC (e.g. Metropolis, Gibbs): sample directly from the posterior. Best for high-dimensional problems
   2. Accept-Reject: sample from the prior, then reject some using the likelihood. Best for low-dimensional problems
   3. Model emulation followed by MCMC or A-R
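The analytical and Accept-Reject routes can be compared on a one-parameter problem where both are available. The Python sketch below assumes a normal prior and a single observation with normal measurement error; the prior mean and s.d., the observation `y` and its s.d. `sigma` are hypothetical numbers chosen for illustration. Candidates drawn from the prior are accepted with probability equal to their likelihood divided by the maximum possible likelihood.

```python
import math
import random

random.seed(42)

prior_mean, prior_sd = 0.0, 2.0    # hypothetical normal prior for theta
y, sigma = 1.5, 1.0                # hypothetical observation and measurement s.d.

# 1. Analytical route: normal prior and normal likelihood give a normal posterior
post_precision = 1 / prior_sd**2 + 1 / sigma**2
post_mean = (prior_mean / prior_sd**2 + y / sigma**2) / post_precision
post_sd = math.sqrt(1 / post_precision)


# 2. Accept-Reject route: sample from the prior, reject some using the likelihood
def likelihood_ratio(theta):
    """Likelihood divided by its maximum value, so the result lies in (0, 1]."""
    return math.exp(-0.5 * ((y - theta) / sigma) ** 2)


accepted = []
for _ in range(100000):
    theta = random.gauss(prior_mean, prior_sd)
    if random.random() < likelihood_ratio(theta):
        accepted.append(theta)

ar_mean = sum(accepted) / len(accepted)
print(f"analytical mean {post_mean:.3f}, accept-reject mean {ar_mean:.3f}, "
      f"accepted {len(accepted)} of 100000")
```

The fraction of accepted candidates falls quickly as the number of parameters grows or the data become more informative, which is consistent with the slide's remark that Accept-Reject suits low-dimensional problems.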

Appendix B: Should we try to measure the sensitive parameters?
Yes, because the sensitive parameters:
- are obviously important for prediction?
No, because model parameters:
- are correlated with each other, which we do not measure
- cannot really be measured at all
So, it may be better to measure output variables, because they:
- are what we are interested in
- are better defined, in models and measurements
- help determine parameter correlations if used in Bayesian calibration
Key question: what data are most informative?

Appendix C: Data have information content, which is additive