Bayesian calibration and uncertainty analysis of dynamic forest models
Marcel van Oijen, CEH-Edinburgh



Input to forest models and output: environmental scenarios, initial values and parameters enter the model; outputs include soil C, NPP and height. The input data are imperfect.

Input to forest models and output Model [Levy et al, 2004]

Input to forest models and output: N-deposition use efficiency, UE (kg C kg⁻¹ N), compared across the bgc, century and hybrid models [Levy et al, 2004]

Simpler models? Goal: robust models, predicting forest growth over 100 years, with low uncertainty. Effects that must be accounted for: N deposition, CO₂, temperature, rain, radiation, soil fertility, management (e.g. thinning), ... Are simple, robust models possible? Typical model size: parameters

Simple (semi-)empirical relationships:
1. Lieth (1972, Miami model): NPP = f(Temperature, Rain)
2. Monteith (1977): NPP = LUE × intercepted light
3. Gifford (1980): NPP = NPP₀ (1 + β log([CO₂]/[CO₂]₀))
4. Gifford (1994): NPP = 0.5 GPP
5. Temperature ~ light intensity
6. Roberts & Zimmermann (1999): LAI_max ~ Rain
7. Beer's law: fractional light interception = 1 − e^(−k·LAI)
8. West, Brown, Enquist, Niklas: Height ~ Mass^(1/4) ~ {f_leaf, f_stem, f_root}
9. Brouwer (1983): root:shoot ratio = f(N)
10. Goudriaan (1990): turnover rates of SOM and litter
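As an illustration of how these relationships compose, the sketch below combines Beer's law (item 7) with Monteith's light-use-efficiency relationship (item 2). The parameter values (k, LUE) are placeholders for illustration, not values from any of the cited papers:

```python
import math

def fractional_interception(lai, k=0.5):
    """Beer's law: fraction of incoming light intercepted by a canopy
    of leaf area index `lai`, with extinction coefficient `k`."""
    return 1.0 - math.exp(-k * lai)

def npp_monteith(par, lai, lue=1.5, k=0.5):
    """Monteith (1977): NPP = LUE * intercepted light.
    par: incident photosynthetically active radiation (MJ m-2),
    lue: light-use efficiency (g C MJ-1). Values are illustrative only."""
    return lue * par * fractional_interception(lai, k)
```

A denser canopy intercepts more light, so NPP increases with LAI but saturates, which is the behaviour the exponential form of Beer's law encodes.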

BASic FORest model (BASFOR): 39 parameters, 24 output variables.

BASFOR: Inputs. Weather and soil data from Skogaby (Sweden); 24 output variables.

Forest data from Skogaby (Sweden) Planted: 1966, (2300 trees ha -1 ) Weather data: Soil data: C, N, Mineralisation rate Tree data: Biomass, NPP, Height, [N], LAI Skogaby


BASFOR: Inputs. Weather and soil data from Skogaby (Sweden); for each parameter pᵢ (i = 1, ..., 39) a prior distribution P(pᵢ) on [pᵢ,min, pᵢ,max]; 24 output variables.

BASFOR: Prior predictive uncertainty. Wood C, height and NPP for Skogaby, not calibrated (mean ± σ).

BASFOR: Predictive uncertainty. High input uncertainty (39 parameters) propagates to high output uncertainty (24 output variables). Data: measurements of the output variables enable calibration of the parameters.

Calibration. Given the model f, the prior P(p) and the data D, calibration = find P(p|D). Bayesian calibration:

P(p|D) = P(p) P(D|p) / P(D) ∝ P(p) L(f(p)|D)

with P(p|D) the posterior distribution, P(p) the prior distribution, and L(f(p)|D) the likelihood given the mismatch between model output and data.

Calibration. Bayesian calibration updates the prior P(p) to the posterior, and correspondingly updates the predictive distribution P(f(p)).

Data from Skogaby (S): wood C, height, NPP.

Calculating the posterior distribution. Bayesian calibration: P(p|D) ∝ P(p) L(f(p)|D). Calculating P(p|D) costs much time:
1. Sample parameter space representatively
2. For each sampled set of parameter values:
   a. Calculate P(p)
   b. Run the model
   c. Calculate the errors (model vs data) and their likelihood
Solutions: for the sampling problem, Markov chain Monte Carlo (MCMC) methods; for the computing problem, computer power and numerical software.

Markov chain Monte Carlo (MCMC): the Metropolis algorithm (~30 lines of code around BASFOR). MCMC is a walk through parameter space such that the set of visited points approaches the posterior parameter distribution P(p|D):
1. Start anywhere in parameter space: p(0), e.g. {SLA = 5, k = 0.4, ...}
2. Randomly propose p(i+1) = p(i) + δ, with (δ₁, ..., δ₃₉) drawn from a multivariate normal distribution
3. IF [P(p(i+1)) L(f(p(i+1)))] / [P(p(i)) L(f(p(i)))] > Random[0,1] THEN accept p(i+1) and set i = i+1, ELSE reject p(i+1)
4. IF i < 10⁴ GOTO 2
At each step, run BASFOR and assume normally distributed errors: L(output_j − data_j) ~ N(0, σ_j), with a different σ_j for each data point.
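The recipe above fits in a few lines of code. The following is a minimal, illustrative sketch with a hypothetical two-parameter toy model standing in for BASFOR; the data value, error standard deviation, prior bounds and starting point are all made up for the example:

```python
import numpy as np

rng = np.random.default_rng(42)

def model(p):
    """Toy stand-in for BASFOR (which has 39 parameters and 24 outputs):
    light-use efficiency times Beer's-law interception at LAI = 3,
    scaled to an NPP-like number."""
    lue, k = p
    return lue * (1.0 - np.exp(-k * 3.0)) * 10.0

data, sigma = 11.0, 1.0                    # one observation and its error s.d.
lo = np.array([0.0, 0.0])                  # uniform prior bounds (illustrative)
hi = np.array([5.0, 2.0])

def log_prior(p):
    """Log of a uniform prior on the box [lo, hi]."""
    return 0.0 if np.all((p >= lo) & (p <= hi)) else -np.inf

def log_lik(p):
    """Normally distributed error: L(output - data) ~ N(0, sigma)."""
    return -0.5 * ((model(p) - data) / sigma) ** 2

def metropolis(n_steps=10_000, step=0.1):
    p = np.array([1.5, 0.5])               # step 1: start anywhere in p-space
    logpost = log_prior(p) + log_lik(p)
    chain = np.empty((n_steps, p.size))
    for i in range(n_steps):
        cand = p + rng.normal(0.0, step, size=p.size)  # step 2: random proposal
        cand_logpost = log_prior(cand) + log_lik(cand)
        # step 3: accept if the posterior ratio exceeds Random[0,1]
        # (done in log space for numerical stability)
        if cand_logpost - logpost > np.log(rng.uniform()):
            p, logpost = cand, cand_logpost
        chain[i] = p                        # record the current point
    return chain
```

The visited points (including repeats after rejections) form the sample from P(p|D); working with log-probabilities avoids underflow when many data points enter the likelihood.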

MCMC parameter trace plots (parameter value vs MCMC step).

Marginal distributions of parameters

Parameter correlations (PCC) among the 39 parameters.

Posterior predictive uncertainty. Wood C, height and NPP for Skogaby, calibrated (mean ± σ).

Posterior predictive uncertainty vs prior. Wood C, height and NPP for Skogaby: calibrated (mean ± σ) vs not calibrated (mean ± σ).

Partial correlations between the 39 parameters and the 24 output variables (shown here for wood C).

Wood C vs parameter-values

Partial correlations of parameters with wood C: allocation to wood, senescence of stem + branches, SOM turnover, maximum leaf N, root N.
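Partial correlation coefficients of this kind can be computed directly from the MCMC samples by regressing out the other parameters. A minimal sketch of linear partial correlation (the function name and the synthetic test data are ours, not from the slides):

```python
import numpy as np

def partial_corr(params, output):
    """Partial correlation of each parameter column with the output,
    controlling for all other parameters (linear, via regression residuals).
    params: array (n_samples, n_params), output: array (n_samples,)."""
    n, k = params.shape
    pcc = np.empty(k)
    for j in range(k):
        others = np.delete(params, j, axis=1)
        X = np.column_stack([np.ones(n), others])   # design matrix with intercept
        # Residuals of parameter j and of the output after removing
        # the linear effect of all other parameters
        beta_p, *_ = np.linalg.lstsq(X, params[:, j], rcond=None)
        beta_y, *_ = np.linalg.lstsq(X, output, rcond=None)
        rp = params[:, j] - X @ beta_p
        ry = output - X @ beta_y
        pcc[j] = np.corrcoef(rp, ry)[0, 1]
    return pcc
```

Applied to the posterior parameter sample and the corresponding sample of wood-C outputs, this ranks parameters by their influence while accounting for parameter correlations.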

Should we measure the sensitive parameters?
Yes, because the sensitive parameters are obviously important for prediction.
No, because model parameters:
- are model-specific
- are correlated with each other, which we do not measure
- cannot really be measured at all
So it may be better to measure output variables, because they:
- are what we are interested in
- are better defined, in models and measurements
- help determine parameter correlations if used in Bayesian calibration

The value of NPP data. Wood C, height and NPP for Skogaby: calibrated on NPP data only (mean ± σ) vs not calibrated (mean ± σ).

Data of height growth: poor quality. Wood C, height and NPP for Skogaby: calibrated on poor height data only (mean ± σ) vs not calibrated (mean ± σ).

Data of height growth: high quality. Wood C, height and NPP for Skogaby: not calibrated, calibrated on poor height data only, and calibrated on good height data only (mean ± σ).

Model application to forest growth in Rajec (Czechia). Rajec (CZ): planted 1903 (6000 trees ha⁻¹); tree data: wood C, height.

Rajec (CZ): uncalibrated and calibrated on Skogaby (S). Wood C, height and NPP for Rajec: Skogaby-calibrated (mean ± σ) vs not calibrated (mean ± σ).


Rajec (CZ): further calibration on Rajec data. Wood C, height and NPP for Rajec: not calibrated, Skogaby-calibrated, and Skogaby- and Rajec-calibrated (mean ± σ).

Summary of procedure. Inputs: data D ± σ, model f, prior P(p). MCMC, with an error function such as N(0, σ), produces 10⁴–10⁵ samples of p from the posterior P(p|D) (the calibrated parameters, with covariances) and 10⁴–10⁵ corresponding samples of f(p), i.e. P(f(p)|D), which gives the uncertainty analysis of model output. PCC on the samples gives the sensitivity analysis of the model parameters.

Model selection. Environmental scenarios, initial values and parameters enter the model; outputs include soil C, NPP and height. Sources of error: imperfect understanding, imperfect input data, imperfect output data.

Model selection. Bayesian calibration: P(p|D) ∝ P(p) L(f(p)|D). Bayesian model selection: P(M|D) ∝ P(M) L(f_M(p_M)|D). Comparison of BASFOR (39 parameters) with an expolinear model (4 parameters): max(log L) = −5.7 vs −6.9, mean(log L) = −6.4 vs −8.7, both obtained as by-products of MCMC.
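The mean log-likelihood over the MCMC sample is indeed a by-product of calibration. A small sketch, assuming independent normal errors as in the calibration above; note that formal Bayes factors require the marginal likelihood P(D|M), for which the mean log-likelihood is only a rough proxy:

```python
import numpy as np

def mean_log_lik(chain_outputs, data, sigma):
    """Mean log-likelihood over MCMC samples, usable as a rough
    model-comparison score as in the slides.
    chain_outputs: array (n_samples, n_data) of model outputs per sample,
    data, sigma: arrays (n_data,) of observations and error s.d.'s."""
    resid = (chain_outputs - data) / sigma
    # Log-density of independent normal errors, summed over data points
    loglik = -0.5 * np.sum(resid**2 + np.log(2 * np.pi * sigma**2), axis=1)
    return loglik.mean()
```

The model with the higher mean log-likelihood fits the data better on average over its posterior, which automatically penalises models whose parameter samples wander into poorly fitting regions.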

Conclusions (1)
Reducing parameter uncertainty:
- reduces predictive uncertainty
- reveals the magnitude of errors in model structure
- benefits little from parameter measurement, because (i) model parameters ≠ what you measure, and (ii) parameter covariances are more important than variances
- requires calibration on measured outputs (eddy fluxes, C inventories, height measurements, ...)
Calibration:
- requires precise data
- central output variables are more useful than peripheral ones (NPP / gas exchange > height)

Conclusions (2)
MCMC calibration:
- works on all models
- is conceptually simple, grounded in probability theory
- is algorithmically simple (Metropolis)
- is not fast (10⁴–10⁵ model runs)
- produces: (1) a sample from the parameter pdf (means, variances and covariances), with likelihoods; (2) the corresponding sample of model outputs (UA); (3) a partial correlation analysis of outputs vs parameters (SA)
Model selection:
- can use the same probabilistic approach as calibration
- can use the mean model log-likelihoods produced by MCMC

Acknowledgements. Göran Ågren (S) & Emil Klimo (CZ); Peter Levy, Renate Wendler, Peter Millard (UK); Ron Smith (UK).

Appendix 1: Calculation times per MCMC-step

MCMC: to do
1. Burn-in
2. Multiple chains
3. Mixing criteria (from characteristics of individual chains and from comparison of multiple chains)
4. Better (dynamic? prior-dependent?) choice of step length for generating candidate points in parameter space
5. Other speed-up tricks?
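For items 2 and 3, a standard mixing criterion based on multiple chains is the Gelman–Rubin statistic (R-hat). This diagnostic is not in the original slides; the sketch below computes it for a single parameter, comparing between-chain and within-chain variance:

```python
import numpy as np

def gelman_rubin(chains):
    """Gelman-Rubin R-hat for one parameter across multiple chains.
    chains: array (n_chains, n_steps), burn-in already discarded.
    Values near 1 suggest the chains have mixed."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # mean within-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled posterior-variance estimate
    return np.sqrt(var_hat / W)
```

If the chains sample the same distribution, B is small relative to W and R-hat is close to 1; chains stuck in different regions of parameter space inflate B and push R-hat well above 1.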