Parameter Redundancy and Identifiability in Ecological Models Diana Cole, University of Kent.

2/27 Introduction [Figure legend: species present and detected; species present but not detected; species absent]

3/27 Parameter Redundancy

4/27 Problems with Parameter Redundancy
– There will be a flat ridge in the likelihood of a parameter redundant model (Catchpole and Morgan, 1997), resulting in more than one set of maximum likelihood estimates.
– Numerical methods to find the MLE will not pick up the flat ridge, although it could be picked up by trying multiple starting values and looking at profile log-likelihoods.
– The Fisher information matrix will be singular (Rothenberg, 1971) and therefore the standard errors will be undefined.
– However, the exact Fisher information matrix is rarely known. Standard errors are typically approximated using a Hessian matrix obtained numerically.
– Can parameter redundancy be detected from the standard errors?
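A rough illustration of why a numerically obtained Hessian can reveal parameter redundancy: the sketch below approximates the Hessian of a log-likelihood by finite differences and inspects its singular values. The two toy binomial models, the step size, the threshold of 1e-5 and all function names are assumptions made for this illustration and are not taken from the slides.

```python
import numpy as np

def numerical_hessian(f, theta, h=1e-4):
    """Central finite-difference approximation to the Hessian of f at theta."""
    theta = np.asarray(theta, dtype=float)
    k = theta.size
    H = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            ei = np.zeros(k); ei[i] = h
            ej = np.zeros(k); ej[j] = h
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * h * h)
    return H

def svd_ratio(loglik, theta_hat):
    """Smallest singular value of the Hessian divided by the largest."""
    s = np.linalg.svd(numerical_hessian(loglik, theta_hat), compute_uv=False)
    return s[-1] / s[0]

n, y1, y2 = 100, 60, 30  # hypothetical data

def loglik_redundant(theta):
    # y2 ~ Binomial(n, a*b): only the product a*b can be estimated
    a, b = theta
    return y2 * np.log(a * b) + (n - y2) * np.log(1 - a * b)

def loglik_full_rank(theta):
    # y1 ~ Binomial(n, a) and y2 ~ Binomial(n, a*b): a and b are both estimable
    a, b = theta
    return (y1 * np.log(a) + (n - y1) * np.log(1 - a)
            + y2 * np.log(a * b) + (n - y2) * np.log(1 - a * b))

theta_hat = np.array([0.6, 0.5])  # a maximum of both likelihoods (a*b = 0.3)
threshold = 1e-5                  # illustrative cut-off, not a recommended value

ratio_redundant = svd_ratio(loglik_redundant, theta_hat)
ratio_full_rank = svd_ratio(loglik_full_rank, theta_hat)
print(ratio_redundant < threshold, ratio_full_rank < threshold)
```

In the redundant model the Hessian is singular up to numerical error, so the scaled smallest singular value falls far below the threshold; in the full-rank model it does not. As the slides go on to discuss, the outcome of such a test depends on the threshold chosen.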

5/27 Is example 1 parameter redundant? Parameter, Estimate, Standard Error: 0.39 imaginary; imaginary; 0.18 imaginary

6/27 Is example 2 parameter redundant? Parameter, Estimate, Standard Error

7/27 Is example 3 parameter redundant? Parameter, Estimate, Standard Error

8/27 Simulation Study for Examples 1 and 2
57% have defined standard errors.
Table columns: Parameter; True Value; Average MLE; St. Dev. MLE; SVD threshold; Percentage SVD test correct

9/27 Mark-Recovery Models [Diagram: ringing years 63, 64, 65 and recapture years 63, 64, 65, annotated with φ and λ]

10/27 Mark-Recovery Models

11/27 Symbolic Method (Cole et al, 2010 and Cole et al, 2012) Exhaustive summary – unique representation of the model Parameters

12/27 Symbolic Method

13/27 Estimable Parameter Combinations
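The symbolic method of the preceding slides (exhaustive summary, derivative matrix, rank, estimable parameter combinations) can be sketched with a computer algebra system. The slides use Maple; the sketch below uses SymPy instead, and the three-occasion capture-recapture exhaustive summary and the symbol names are a hypothetical toy example that does not come from the slides.

```python
import sympy as sp

# Hypothetical parameters: two survival and two recapture probabilities
phi1, phi2, p2, p3 = sp.symbols("phi1 phi2 p2 p3", positive=True)
theta = sp.Matrix([phi1, phi2, p2, p3])

# Toy exhaustive summary kappa(theta): recapture probabilities for two cohorts
kappa = sp.Matrix([
    phi1 * p2,                    # released at 1, recaptured at 2
    phi1 * (1 - p2) * phi2 * p3,  # released at 1, first recaptured at 3
    phi2 * p3,                    # released at 2, recaptured at 3
])

# Derivative matrix of kappa with respect to theta (one column per parameter)
J = kappa.jacobian(theta)

rank = J.rank(simplify=True)
deficiency = len(theta) - rank  # deficiency > 0 means parameter redundant

# Directions in parameter space that leave kappa unchanged
null_vectors = J.nullspace(simplify=True)
alpha = null_vectors[0]

print("rank =", rank, "deficiency =", deficiency)
print("alpha =", list(alpha))
```

Zero entries of alpha (here those for phi1 and p2) mark parameters that remain individually estimable. Solving the first-order PDE sum_i alpha_i df/dtheta_i = 0 for this alpha gives f = phi2*p3 as the remaining estimable combination.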

Other uses of symbolic method
Uses of symbolic method:
– Catchpole and Morgan (1997) exponential family models, mostly used in ecological statistics,
– Rothenberg (1971) original general use, econometric examples,
– Goodman (1974) latent class models,
– Shapiro (1986) non-linear regression models,
– Pohjanpalo (1982) first use for compartment models,
– Cole et al (2010) general exhaustive summary framework,
– Cole et al (2012) mark-recovery models.
Finding estimable parameters:
– Catchpole et al (1998) exponential family models,
– Chappell and Gunn (1998) and Evans and Chappell (2000) compartment models,
– Cole et al (2010) general exhaustive summary framework.

15/27 Problem with Symbolic Method
The key to the symbolic method for detecting parameter redundancy is to find a derivative matrix and its rank. Models are getting more complex, so the derivative matrix is structurally more complex and Maple runs out of memory calculating the rank. How do you proceed?
– Numerically: only valid for specific values of the parameters. It cannot find the combinations of parameters that can be estimated, and results cannot be generalised.
– Symbolically: involves extending the theory. It again involves a derivative matrix and its rank, but the derivative matrix is structurally simpler.
– Hybrid symbolic-numeric method.
Examples:
– Wandering Albatross: multi-state models for sea birds, Hunter and Caswell (2009), Cole (2012).
– Striped Sea Bass: tag-return models for fish, Jiang et al (2007), Cole and Morgan (2010).

16/27 Multi-state capture-recapture example
Hunter and Caswell (2009) examine parameter redundancy of multi-state mark-recapture models, but cannot evaluate the symbolic rank of the derivative matrix (they developed a numerical method).
4 state breeding success model for the Wandering Albatross:
– Parameters: survival; breeding given survival; successful breeding; recapture.
– States: 1 = success, 2 = failure, 3 = post-success, 4 = post-failure.

17/27 Extended Symbolic Method (Cole et al, 2010)
1. Choose a reparameterisation, s, that simplifies the model structure.
2. Rewrite the exhaustive summary, κ(θ), in terms of the reparameterisation: κ(s).

Extended Symbolic Method
3. Calculate the derivative matrix D_s.
4. The no. of estimable parameters = rank(D_s). Here rank(D_s) = 12, so no. est. pars = 12 and deficiency = 14 – 12 = 2.
5. If D_s is full rank, s = s_re is a reduced-form exhaustive summary. If D_s is not full rank, solve a set of PDEs to find a reduced-form exhaustive summary, s_re.

Extended Symbolic Method
6. Use s_re as an exhaustive summary.
Deficiency (number of parameters in brackets) by breeding constraint and survival constraint.
Survival constraints, in column order: σ1 = σ2 = σ3 = σ4; σ1 = σ3, σ2 = σ4; σ1 = σ2, σ3 = σ4; σ1, σ2, σ3, σ4.
Breeding constraints, one row each:
– β1 = β2 = β3 = β4: 0 (8), 0 (9), 1 (9), 1 (11)
– β1 = β3, β2 = β4: 0 (9), 0 (10), 2 (12)
– β1 = β2, β3 = β4: 0 (9), 0 (10), 1 (10), 1 (12)
– β1, β2, β3, β4: 0 (11), 0 (12), 2 (14)

20/27 Multi-state mark–recapture models
States:
– State 1: Breeding site 1
– State 2: Breeding site 2
– State 3: Non-breeding (unobservable in state 3)
Parameters:
– φ: survival
– ψ: breeding
– λ: breeding site 1
– 1 – λ: breeding site 2

21/27 Multi-state mark–recapture models – General Model
– The general multi-state model has S states, the last U of which are unobservable, and N years of data.
– Survival probabilities, released in year r and captured in year c:
– Φ_t is an S × S matrix of transition probabilities at time t, with transition probabilities φ_{i,j}(t) = a_{i,j}(t).
– P_t is an S × S diagonal matrix of probabilities of capture p_t, with p_t = 0 for an unobservable state.

22/27 General simpler exhaustive summary, Cole (2012): r = 10N – 17, d = N + 3

23/27 Hybrid Symbolic-Numeric Method Choquet and Cole (2012)
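A minimal sketch in the spirit of a hybrid symbolic-numeric approach, under stated assumptions: the derivative matrix is available in exact closed form (here hand-coded for a hypothetical three-occasion capture-recapture model that is not taken from the slides), it is evaluated at a handful of random parameter points, and the rank is then computed numerically. The number of points, the tolerance and every function name are illustrative choices, not the procedure implemented in E-surge or M-surge.

```python
import numpy as np

def derivative_matrix(theta):
    """Exact Jacobian of the toy exhaustive summary
    kappa = [phi1*p2, phi1*(1-p2)*phi2*p3, phi2*p3]
    with respect to theta = (phi1, phi2, p2, p3)."""
    phi1, phi2, p2, p3 = theta
    return np.array([
        [p2, 0.0, phi1, 0.0],
        [(1 - p2) * phi2 * p3, phi1 * (1 - p2) * p3,
         -phi1 * phi2 * p3, phi1 * (1 - p2) * phi2],
        [0.0, p3, 0.0, phi2],
    ])

def hybrid_check(jacobian, n_params, n_points=5, tol=1e-8, seed=0):
    """Evaluate the exact Jacobian at random points; return the model rank
    and a flag per parameter saying whether it looks individually estimable."""
    rng = np.random.default_rng(seed)
    ranks = []
    estimable = np.ones(n_params, dtype=bool)
    for _ in range(n_points):
        theta = rng.uniform(0.1, 0.9, size=n_params)
        J = jacobian(theta)
        r = np.linalg.matrix_rank(J)
        ranks.append(r)
        # Rows of Vt beyond the rank span the directions that leave kappa unchanged
        _, _, Vt = np.linalg.svd(J)
        null_basis = Vt[r:]
        # A parameter moving along any such direction is not individually estimable
        estimable &= np.all(np.abs(null_basis) < tol, axis=0)
    return max(ranks), estimable

model_rank, estimable = hybrid_check(derivative_matrix, n_params=4)
deficiency = 4 - model_rank
print(model_rank, deficiency, estimable)
```

Taking the maximum rank over several random points guards against landing on a point where the rank happens to drop. Parameters whose null-space entries are numerically zero at every point (phi1 and p2 here) are flagged as individually estimable.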

24/27 Example – multi-site capture-recapture model

25/27 Example – Occupancy models (Hubbard et al, in prep)
Table columns: Model; Rank; Deficiency; No. pars

26/27 Conclusion
Methods compared, in column order: Numeric; Symbolic; Hybrid-Symbolic.
– Accurate / correct answer: Not always; Yes
– General results (e.g. any no. of years): No; Yes; Work in progress
– Easy to use (e.g. for an ecologist): Yes; No, but can develop simpler ex. sum; Yes
– Possible to add to existing computer packages: Yes; No (needs symbolic algebra); Yes (E-surge and M-surge)
– Individually identifiable parameters: No; Yes
– Estimable parameter combinations: No; Yes; In the future?
– Best for intrinsic PR and general results; Best for extrinsic PR and a quick result

27/27 References
– Brownie, C., Hines, J., Nichols, J. et al (1993) Biometrics, 49, p1173.
– Catchpole, E. A. and Morgan, B. J. T. (1997) Biometrika, 84.
– Catchpole, E. A., Morgan, B. J. T. and Freeman, S. N. (1998) Biometrika, 85.
– Chappell, M. J. and Gunn, R. N. (1998) Mathematical Biosciences, 148.
– Choquet, R. and Cole, D. J. (2012) Mathematical Biosciences, 236, p117.
– Cole, D. J. and Morgan, B. J. T. (2010) JABES, 15.
– Cole, D. J., Morgan, B. J. T. and Titterington, D. M. (2010) Mathematical Biosciences, 228, 16–30.
– Cole, D. J. (2012) Journal of Ornithology, 152, S305–S315.
– Cole, D. J., Morgan, B. J. T., Catchpole, E. A. and Hubbard, B. A. (2012) Biometrical Journal, 54.
– Cole, D. J., Morgan, B. J. T., McCrea, R. S., Pradel, R., Gimenez, O. and Choquet, R. (2014) Ecology and Evolution, 4.
– Evans, N. D. and Chappell, M. J. (2000) Mathematical Biosciences, 168.
– Gould, W. R., Patla, D. A., Daley, R., et al (2012) Wetlands, 32, p379.
– Goodman, L. A. (1974) Biometrika, 61.
– Hunter, C. M. and Caswell, H. (2009) Ecological and Environmental Statistics, 3.
– Jiang, H., Pollock, K. H., Brownie, C., et al (2007) JABES, 12.
– Pohjanpalo, H. (1982) Technical Research Centre of Finland Research Report No. 56.
– Rothenberg, T. J. (1971) Econometrica, 39.
– Shapiro, A. (1986) Journal of the American Statistical Association, 81.
– Viallefont, A., Lebreton, J. D., Reboulet, A. M. and Gory, G. (1998) Biometrical Journal, 40.