Estimating Uncertainty in ADMB: model parameters and derived quantities (normal approximation, profile likelihood, Bayesian MCMC, bootstrap).


Estimating Uncertainty

Estimating Uncertainty in ADMB: model parameters and derived quantities
– Normal approximation
– Profile likelihood
– Bayesian MCMC
– Bootstrap and Monte Carlo simulation (implement yourself; see previous module)

Standard deviation calculations: sdreport_ parameters
– All estimated model parameters appear in the sd report automatically
– Derived quantities must be declared as sdreport_ objects in the PARAMETER_SECTION
– ADMB uses the delta method to calculate standard deviations for derived quantities
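The delta method that ADMB applies to sdreport_ quantities can be illustrated outside ADMB. A minimal Python sketch (NumPy only; the covariance values and the derived quantity below are hypothetical, not from the course example): the variance of a derived quantity g(θ) is approximated by grad(g)' Σ grad(g), where Σ is the estimated parameter covariance.

```python
import numpy as np

def delta_method_sd(grad, cov):
    """SD of a derived quantity via the delta method:
    var(g(theta_hat)) ~= grad' * cov * grad."""
    grad = np.asarray(grad, dtype=float)
    return float(np.sqrt(grad @ cov @ grad))

# Hypothetical example: Ypred = a + b*x at x = 2,
# with an assumed covariance matrix for (a, b)
cov = np.array([[0.04, -0.01],
                [-0.01, 0.01]])
grad = np.array([1.0, 2.0])   # dYpred/da = 1, dYpred/db = x = 2
sd = delta_method_sd(grad, cov)
```

The same gradient-times-covariance calculation is what produces the std dev column for Ypred in the *.std report below.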

sdreport_ parameters

DATA_SECTION
  init_int Nobs
  init_vector X(1,Nobs)
  init_vector Y(1,Nobs)
  init_number sd
PARAMETER_SECTION
  init_number a
  init_number b
  sdreport_vector Ypred(1,Nobs)
  objective_function_value f
PROCEDURE_SECTION
  Ypred=a+b*X;
  f=0.5*norm2((Y-Ypred)/sd);

*.std report

index  name   value  std dev
1      a      …      …
2      b      …      …
3      Ypred  …      …
…      (one row per element of Ypred)

*.cor report

The logarithm of the determinant of the hessian = …
index  name   value  std dev  …
1      a      …      …
2      b      …      …
3      Ypred  …      …
…      (followed by the correlation matrix)

Profile likelihood calculations: likeprof_ parameters
A likeprof_ variable has to be defined for both model parameters and derived quantities.

likeprof_ parameters

DATA_SECTION
  init_int Nobs
  init_vector X(1,Nobs)
  init_vector Y(1,Nobs)
  init_number sd
PARAMETER_SECTION
  init_number a
  init_number b
  sdreport_vector Ypred(1,Nobs)
  likeprof_number a_prof
  objective_function_value f
PROCEDURE_SECTION
  a_prof=a;
  Ypred=a+b*X;
  f=0.5*norm2((Y-Ypred)/sd);

Executing profile likelihoods: use the command line option -lprof, e.g. simple -lprof

*.plt report (a_prof.plt)

a_prof:
Profile likelihood
…
Minimum width confidence limits:
significance level  lower bound  upper bound
…
One sided confidence limits for the profile likelihood:
…
Normal approximation
…

Controlling the profile likelihood calculations

PRELIMINARY_CALCS_SECTION
  a_prof.set_stepnumber(10);
  a_prof.set_stepsize(0.1);

The step size is in standard deviation units.
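What -lprof does can be sketched by hand: fix the profiled parameter on a grid and minimise the objective over the remaining parameters at each grid point. A minimal Python sketch for the intercept a of the linear model above (data values are made up; for a linear model the inner minimisation over b has a closed form):

```python
import numpy as np

# Illustrative data for y = a + b*x with known sd
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
sd = 0.4

def nll(a, b):
    """Same objective as the TPL: 0.5 * norm2((y - ypred)/sd)."""
    return 0.5 * np.sum(((y - (a + b * x)) / sd) ** 2)

def profile_a(a_grid):
    """Profile likelihood for a: at each fixed a, minimise over b.
    Here the conditional minimiser b_hat(a) is available in closed form."""
    out = []
    for a in a_grid:
        b_hat = np.sum(x * (y - a)) / np.sum(x * x)
        out.append(nll(a, b_hat))
    return np.array(out)

a_grid = np.linspace(-1.0, 1.0, 21)
prof = profile_a(a_grid)
```

Plotting exp(-prof) against a_grid (normalised) gives the curve ADMB writes to the .plt file; set_stepnumber and set_stepsize control the analogous grid in ADMB.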

Exercise: comparing estimates of uncertainty
– Modify the code to make a profile likelihood for the predicted Y at observation 3
– Compare the profile likelihood confidence intervals to those based on the normal approximation using the estimated standard deviation

Why Bayesian analysis?
– Include all information: previous experiments, other populations/species, expert judgment
– Calculate probabilities of alternative hypotheses
– Estimate uncertainty
– Easy to implement

Priors: support from the data vs. support from the prior

Integration

Priors

Simple Fisheries Model
N(1) = R/(1−S)  (equilibrium)
N(t+1) = S·N(t) + R − C(t)

Likelihood (ignoring constants; sd known)
−log L = Σ_obs 0.5·((log CPUE_obs − log(q·N_year)) / sigma)²

Simple Fisheries Model (fish.tpl)

DATA_SECTION
  init_number S
  init_number sigma
  init_int Nyears
  init_vector C(1,Nyears)
  init_int Nobs
  init_matrix CPUE(1,Nobs,1,2)
PARAMETER_SECTION
  init_number R
  init_number q
  vector N(1,Nyears+1)
  objective_function_value f

PROCEDURE_SECTION
  f=0;
  N(1)=R/(1-S);
  for (int year=1;year<=Nyears;year++)
  {
    N(year+1)=S*N(year)+R-C(year);
  }
  for (int obs=1;obs<=Nobs;obs++)
  {
    // indexing N by the year stored in CPUE's first column allows missing years
    f+=0.5*square((log(CPUE(obs,2))-log(N(CPUE(obs,1))*q))/sigma);
  }
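For readers following along outside ADMB, the same dynamics and lognormal CPUE likelihood can be sketched in Python (a minimal translation; the catch and CPUE values below are hypothetical, not the course data file):

```python
import numpy as np

def dynamics(R, S, catches):
    """N(1) = R/(1-S) at equilibrium, then N(t+1) = S*N(t) + R - C(t),
    mirroring the PROCEDURE_SECTION above (0-based indexing here)."""
    Nyears = len(catches)
    N = np.empty(Nyears + 1)
    N[0] = R / (1.0 - S)
    for t in range(Nyears):
        N[t + 1] = S * N[t] + R - catches[t]
    return N

def nll(R, q, S, sigma, catches, cpue):
    """Lognormal CPUE likelihood (constants dropped). cpue is a list of
    (year, value) pairs, so missing years are allowed, as in the TPL."""
    N = dynamics(R, S, catches)
    f = 0.0
    for year, u in cpue:
        f += 0.5 * ((np.log(u) - np.log(q * N[year - 1])) / sigma) ** 2
    return f

# Hypothetical inputs (not the values in the .dat file)
catches = [30.0, 40.0, 35.0]
cpue = [(1, 5.0), (3, 4.0)]   # year 2 missing on purpose
val = nll(R=100.0, q=0.05, S=0.7, sigma=0.4, catches=catches, cpue=cpue)
```

Minimising nll over (R, q) with S and sigma fixed reproduces what ADMB's minimiser does for fish.tpl.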

Data file

#survival
0.7
#sigma
0.4
#Nyears
26
#catch
…
#nobs
26
#CPUE (year index)
…

Results

# Number of parameters = 2
Objective function value = …
Maximum gradient component = …e-06
# R: …
# q: …

Bayesian Version
– Uniform vs log-uniform prior on q
– Normal prior on R, with bounds (ignoring constants)

Bayesian Version

DATA_SECTION
  init_number Rmean
  init_number Rsigma
  init_int qtype
  init_number qlb
  init_number qub
  …
PARAMETER_SECTION
  init_bounded_number R(1,1000)
  init_bounded_number q(qlb,qub)
  number qtemp
  likeprof_number Rtemp
  …

Bayesian Version

PROCEDURE_SECTION
  Rtemp=R;
  …
  if (qtype==0) qtemp=q; else qtemp=exp(q);
  …
  // (likelihood calculation using qtemp)
  …
  f+=0.5*square((R-Rmean)/Rsigma);

Data File (check the sd)

#Rprior
#Rmean
300
#Rsigma
300
#qprior #type 0=uniform, 1=uniform on logscale
0
#lb   //note: need to change depending on type
…
#ub   //note: need to change depending on type
1
#survival   //note: also need to change pin file for q depending on type
…

Profiles and Posteriors
– Likelihood profile (with prior information): run with -lprof; output in the parameter's .plt file
– Bayesian posterior via MCMC: run with -mcmc; output in the .hst file
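The sampler behind -mcmc is a Metropolis-Hastings random walk. A bare-bones Python sketch of the idea (a simplified stand-in, not ADMB's implementation, which also scales proposal steps by the estimated covariance; the target below is a toy standard-normal log-posterior):

```python
import numpy as np

def metropolis(logpost, x0, step, n, seed=1):
    """Random-walk Metropolis: propose x + step*N(0,1), accept with
    probability min(1, exp(logpost(prop) - logpost(x)))."""
    rng = np.random.default_rng(seed)
    x = x0
    lp = logpost(x)
    chain = np.empty(n)
    for i in range(n):
        prop = x + step * rng.normal()
        lp_prop = logpost(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        chain[i] = x
    return chain

# Toy target: standard normal log-posterior
chain = metropolis(lambda x: -0.5 * x * x, x0=0.0, step=1.0, n=20000)
```

A histogram of chain is the analogue of the .hst output; for fish.tpl the log-posterior would be minus the objective function f.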

Rtemp.plt (log q)

Rtemp:
Profile likelihood
…
Minimum width confidence limits:
significance level  lower bound  upper bound
…
One sided confidence limits for the profile likelihood:
The probability is 0.9 that Rtemp is greater than …
The probability is 0.9 that Rtemp is less than …
Normal approximation
…
Minimum width confidence limits:
significance level  lower bound  upper bound
…

bayes.hst (log q)

# samples sizes
…
# step size scaling factor
1.44
# step sizes
…
# means
…
# standard devs
…
# lower bounds
-10
# upper bounds
25
#number of parameters
2
#current parameter values for mcmc restart
…
#random number seed
…
#Rtemp
…

Results

Exercise
– Add a normal prior on survival with mean 0.7 and sd 0.2; estimate survival in phase 2
– Compare results for average recruitment with those from the analysis with fixed S
– Use a uniform-on-the-log-scale prior for q

Results

Decision Analysis
Run -mcmc … -mcsave 100, then run -mceval

Decision Analysis

DATA_SECTION
  init_int Nprojectyears
  init_number projectcatch
  …
PARAMETER_SECTION
  vector N(1,Nyears+1+Nprojectyears)
PROCEDURE_SECTION
  f=0;
  dynamics();
  likelihood();
  if (mceval_phase()) decision();
  Rtemp=R;

Functions

FUNCTION decision
  for (int year=Nyears+1;year<=Nyears+Nprojectyears;year++)
  {
    N(year+1)=S*N(year)+R-projectcatch;
    if (N(year+1)<=0)
    {
      N(year+1)=0;
    }
  }
  ofstream out("posterior.dat",ios::app);
  out<<R<<" "<<q<<" "<<N(1)<<" "<<N(Nyears+1)<<" "
     <<N(Nyears+Nprojectyears+1)<<"\n";
  out.close();
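The decision analysis amounts to projecting each saved posterior draw forward under the candidate policy. A Python sketch of that loop (the posterior draws and parameter values below are made-up placeholders for what posterior.dat would contain):

```python
import numpy as np

def project(N_last, R, S, catch, nproj):
    """Forward-project abundance under a constant future catch,
    flooring at zero as in the decision() function above."""
    N = N_last
    for _ in range(nproj):
        N = max(S * N + R - catch, 0.0)
    return N

# Hypothetical saved posterior draws of (R, N_current)
draws = [(100.0, 250.0), (120.0, 300.0), (90.0, 200.0)]
S, catch, nproj = 0.7, 80.0, 5
Nproj = [project(Ncur, R, S, catch, nproj) for R, Ncur in draws]
ratio = [npj / ncur for npj, (_, ncur) in zip(Nproj, draws)]
```

The distribution of ratio across draws is the Nproj/Ncur quantity summarised in the Results slides.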

Notes
– Delete posterior.dat before starting, or it will contain the old data
– You can change decision() and rerun with -mceval without rerunning -mcmc
– posfun can be used to keep N positive in a differentiable way

Results – Nproj/Ncur

Add random recruitment

DATA_SECTION
  init_number Rsd
  …
  vector Rdev(Nyears+1,Nyears+Nprojectyears+1)
  !!rec_seed=1211;

FUNCTION decision
  Rdev.fill_randn_ni(rec_seed);
  …
  N(year+1)=S*N(year)+R*mfexp(Rdev(year+1)*Rsd-0.5*Rsd*Rsd)-projectcatch;

GLOBALS_SECTION
  long int rec_seed;  // can't declare in DATA_SECTION; with !! it would only be local
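The term exp(Rdev*Rsd − 0.5*Rsd²) is a bias-corrected lognormal deviate: the −0.5*Rsd² correction makes the deviates average one, so recruitment averages R rather than R·exp(Rsd²/2). A quick Python check of that property (n and the seed are arbitrary choices for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(1211)   # analogue of rec_seed in the slide
R, Rsd, n = 100.0, 0.6, 200000
dev = rng.normal(size=n)            # analogue of Rdev.fill_randn_ni
# Bias-corrected lognormal multipliers: mean ~1, so mean(rec) ~ R
rec = R * np.exp(dev * Rsd - 0.5 * Rsd * Rsd)
```

Without the −0.5*Rsd² term, mean recruitment would be inflated by exp(Rsd²/2) (about 20% here).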

Results - Nproj/Ncur

END

Exercise: use harvest rate strategy
– Start with the file decision.tpl
– Use projectcatch as a harvest rate of 0.3, applied only for the projection time period
– Calculate catch (define a vector)
– Report average catch in the decision file (use mean())
– You probably don't need to run the MCMC again, only -mceval

Harvest strategy

PARAMETER_SECTION
  …
  vector Cproj(Nyears+1,Nyears+Nprojectyears)
FUNCTION decision
  …
  //old code
  //N(year+1)=S*N(year)+R*mfexp(Rdev(year+1)*Rsd-0.5*Rsd*Rsd)-projectcatch;
  //new code
  N(year+1)=(S*N(year))*(1-projectcatch)+R*mfexp(Rdev(year+1)*Rsd-0.5*Rsd*Rsd);
  Cproj(year)=(S*N(year))*projectcatch;
  …
  out<<R<<" "<<q<<" "<<N(1)<<" "<<N(Nyears+1)<<" "
     <<N(Nyears+Nprojectyears+1)<<" "<<Rdev(Nyears+Nprojectyears)<<" "
     <<mean(Cproj)<<"\n";
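A deterministic Python sketch of the harvest-rate projection (recruitment deviates omitted for clarity; the starting values are hypothetical): the harvest rate is applied after survival, so catch in each year is h·S·N, matching the Cproj line above.

```python
import numpy as np

def project_harvest(N0, R, S, h, nproj):
    """Project under harvest rate h applied to survivors:
    catch = h*S*N, then N(t+1) = S*N(t)*(1-h) + R."""
    N, catches = N0, []
    for _ in range(nproj):
        surv = S * N
        catches.append(h * surv)
        N = surv * (1.0 - h) + R
    return N, float(np.mean(catches))

# Hypothetical starting abundance and parameters
Nend, avg_catch = project_harvest(N0=250.0, R=100.0, S=0.7, h=0.3, nproj=5)
```

Running this per posterior draw (as in the decision() function) gives the distribution of average catch that mean(Cproj) reports.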

Results (catch after survival and before recruitment)

Example
– Tagging to estimate survival
– A stock assessment model fit to an index of abundance
– Use a prior on survival in the stock assessment model based on the results of the tagging analysis
– Both the tagging data and the index of abundance provide information on survival

Bayes' Theorem
P(H|D) = P(D|H) P(H) / P(D)
Note the different order: P(H|D) is a probability, while P(D|H) is a likelihood.

Calculating the denominator
Discrete hypotheses: P(D) = Σ_i P(D|H_i) P(H_i)
Continuous parameters (replace H with θ): P(D) = ∫ P(D|θ) P(θ) dθ
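The discrete sum is easy to carry out numerically. A Python sketch for a toy binomial problem (the data and grid are illustrative): treat each grid value of p as a hypothesis, sum likelihood times prior to get P(D), and divide to get the posterior.

```python
import numpy as np

# Toy binomial data: k successes in n trials; discrete uniform prior on a grid of p
k, n = 7, 10
p = np.linspace(0.001, 0.999, 999)
prior = np.full_like(p, 1.0 / p.size)      # P(H_i): uniform over the grid
like = p**k * (1.0 - p)**(n - k)           # P(D|H_i), constants dropped
denom = np.sum(like * prior)               # P(D) = sum_i P(D|H_i) P(H_i)
post = like * prior / denom                # posterior over the grid
```

The continuous integral is the limit of this sum as the grid is refined; MCMC avoids computing the denominator at all, since it only needs posterior ratios.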

Bayesian methods for parameters
The posterior is proportional to the likelihood multiplied by the prior:
P(θ|D) ∝ P(D|θ) P(θ)

Two characteristics of Bayesian analysis Prior Integration

Priors
– From expert opinion/experience/theory, e.g. survival is high for albatross
– From other populations/species, e.g. survival for bigeye tuna in the Atlantic used in an assessment of bigeye tuna in the Pacific
– From other experiments/data sets on the same population, e.g. analysis of mark-recapture data for survival as a prior for a Leslie matrix fit to abundance trends

Exercise
– Do a profile likelihood
– Run MCMC with a uniform prior on q
– Run MCMC with a uniform-on-the-log-scale prior on q
– Plot these as well as the normal approximation
– Don't forget to make the bounds and the pin file on the correct scale for q; appropriate values are commented out
– Keep the upper bound on q = 1 for both

Convergence
– Read Punt and Hilborn
– Delete the initial burn-in period
– Thin the chain (i.e. save every xth value) to reduce autocorrelation and limit the number of forward projections
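The steps above can be sketched in a few lines of Python (the chain here is a synthetic autocorrelated AR(1) series standing in for raw MCMC output; burn-in and thinning values are arbitrary illustrations):

```python
import numpy as np

def burn_and_thin(chain, burn, thin):
    """Drop the initial burn-in draws, then keep every thin-th draw
    to reduce autocorrelation in the retained sample."""
    return chain[burn::thin]

# Synthetic autocorrelated chain, as raw MCMC output typically is
rng = np.random.default_rng(0)
raw = np.empty(10000)
raw[0] = 0.0
for i in range(1, raw.size):
    raw[i] = 0.9 * raw[i - 1] + rng.normal()

kept = burn_and_thin(raw, burn=1000, thin=10)
```

With -mcsave 100, ADMB does the thinning step for you; the burn-in still has to be discarded by hand from the saved output.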