Model Identification & Model Selection
With focus on Mark/Recapture Studies

Overview
- Basic inference from an evidentialist perspective
- Model selection tools for mark/recapture: AICc & SIC/BIC
- Overdispersed data
- Model set size
- Multimodel inference

DATA
/* 01 */ 1100000000000000 1 1 1.16 27.7 4.19;
/* 04 */ 1011000000000000 1 0 1.16 26.4 4.39;
/* 05 */ 1011000000000000 1 1 1.08 26.7 4.04;
/* 06 */ 1010000000000000 1 0 1.12 26.2 4.27;
/* 07 */ 1010000000000000 1 1 1.14 27.7 4.11;
/* 08 */ 1010110000000000 1 1 1.20 28.3 4.24;
/* 09 */ 1010000000000000 1 1 1.10 26.4 4.17;
/* 10 */ 1010110000000000 1 1 1.42 27.0 5.26;
/* 11 */ 1010000000000000 1 1 1.12 27.2 4.12;
/* 12 */ 1010101100000000 1 1 1.11 27.1 4.10;
/* 13 */ 1010101100000000 1 0 1.07 26.8 3.99;
/* 14 */ 1010101100000000 1 0 0.94 25.2 3.73;
/* 15 */ 1010101100000000 1 0 1.24 27.1 4.58;
/* 16 */ 1010101100000000 1 0 1.12 26.5 4.23;
/* 17 */ 1010101000000000 1 1 1.34 27.5 4.87;
/* 18 */ 1010101011000000 1 0 1.01 27.2 3.71;
/* 19 */ 1010101011000000 1 0 1.04 27.0 3.85;
/* 20 */ 1010101000000000 1 1 1.25 27.6 4.53;
/* 21 */ 1010101011000000 1 0 1.20 27.6 4.35;
/* 22 */ 1010101011000000 1 0 1.28 27.0 4.74;
/* 23 */ 1010101010110000 1 0 1.25 27.2 4.59;
/* 24 */ 1010101010110000 1 0 1.09 27.5 3.96;
/* 25 */ 1010101010110000 1 1 1.05 27.5 3.82;
/* 26 */ 1010101010101100 1 0 1.04 25.5 4.08;
/* 27 */ 1010101010101010 1 0 1.13 26.8 4.22;
/* 28 */ 1010101010101010 1 1 1.32 28.5 4.63;
/* 29 */ 1010101010101010 1 0 1.18 25.9 4.56;
/* 30 */ 1010101010101010 1 0 1.07 26.7 4.01;
/* 31 */ 1010101010101010 1 1 1.26 26.9 4.68;
/* 32 */ 1010101010101010 1 0 1.27 27.6 4.60;
/* 33 */ 1010101010101010 1 0 1.08 26.0 4.15;
/* 34 */ 1010101010101010 1 1 1.11 27.0 4.11;
/* 35 */ 1010101010101010 1 0 1.15 27.1 4.24;
/* 36 */ 1010101010101010 1 0 1.03 26.5 3.89;
/* 37 */ 1010101010101010 1 0 1.16 27.5 4.22;

Models carry the meaning in science
- A model is organized thought
- A parameterized model is organized thought connected to reality

Science is a cyclic process of model reconstruction and model reevaluation
- Comparison of predictions with observations/data
- Relative comparisons are evidence

"All models are false, but some are useful." - George Box

Statistical Inferences
- Quantitative measures of the validity and utility of models
- Social control on the behavior of scientists

Scientific Model Selection Criteria
- Illuminating
- Communicable
- Defensible
- Transferable

Common Information Criteria
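The transcript did not preserve this slide's formulas; for maximized likelihood L, parameter count k, and sample size n, the standard definitions are:

```latex
\mathrm{AIC} = -2\ln L + 2k, \qquad
\mathrm{AIC}_c = -2\ln L + 2k + \frac{2k(k+1)}{n-k-1}, \qquad
\mathrm{SIC/BIC} = -2\ln L + k\ln n
```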

Statistical Methods are Tools
"All statistical methods exist in the mind only, but some are useful." - Mark Taper

Classes of Inference
- Frequentist statistics vs. Bayesian statistics
- Error statistics vs. evidential statistics vs. Bayesian statistics

Two key frequencies in frequentist statistics
- The frequency definition of probability
- The frequency of error in a decision rule

Null hypothesis tests with Fisherian P-values
- Single model only
- P-value = probability of a discrepancy at least as great as that observed, by chance alone
- Not terribly useful for model selection
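In symbols, writing T for the chosen discrepancy statistic and t_obs for its observed value under the single null model:

```latex
p = \Pr\!\left(T \ge t_{\mathrm{obs}} \mid H_0\right)
```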

Neyman-Pearson Tests
- Two models: the null model is tested along a maximally sensitive axis
- Binary response: accept the null or reject the null
- The size of the test (α) describes the frequency of rejecting the null in error; it is not about the data, it is about the test
- You support your decision because you have made it with a reliable procedure
- N-P tests tell you very little about relative support for alternative models

Decisions vs. Conclusions
- Decision-based inference is reasonable within a regulatory framework, but not so appropriate for science
- John Tukey (1960) advocated seeking to reach conclusions, not making decisions
- Accumulate evidence until a conclusion is very strongly supported; treat it as true; revise if new evidence contradicts it

All are tools for aiding scientific thought
- In a conclusions framework, multiple statistical metrics are not "incompatible"; all are tools for aiding scientific thought

Statistical Evidence
A data-based estimate of the relative distance between two models and "truth"

Common Evidence Functions
- Likelihood ratios
- Differences in information criteria
- Others available, e.g., log(jackknife prediction likelihood ratio)

Model Adequacy (Bruce Lindsay)
- The discrepancy of a model from truth, with truth represented by an empirical distribution function
- A model is "adequate" if the estimated discrepancy is less than some arbitrary but meaningful level

Model Adequacy and Goodness of Fit
- Estimation framework rather than testing framework: confidence intervals rather than tests
- Rejection of the "true model formalism"

Model Adequacy, Goodness of Fit, and Evidence
- Adequacy does not explicitly compare models, but the comparison is implicit
- Model adequacy is interpretable as a bound on the strength of evidence for any better model
- This unifies model adequacy and evidence in a common framework

Model adequacy interpreted as a bound on evidence for a possibly better model
[Figure: the empirical distribution ("truth"), Model 1, and a potentially better model, with the model adequacy measure and the evidence measure shown as distances between them.]

"Goodness of fit" is a misnomer
- Badness-of-fit measures & goodness-of-fit tests compare a model to a nonparametric estimate of the true distribution
- Examples: G2 statistic, Hellinger distance, Pearson χ2, Neyman χ2

Points of interest
- Badness of fit is the scope for improvement
- Evidence for one model relative to another is the difference of their badness of fit

ΔIC estimates differences of Kullback-Leibler discrepancies
- ΔIC = log(likelihood ratio) when the numbers of parameters are equal
- The complexity penalty is a bias correction that adjusts for the increase in apparent precision with an increase in the number of parameters
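On the -2 ln L scale of the criteria above, the equal-complexity case is, explicitly (note the factor of 2 that the slide's shorthand absorbs):

```latex
\Delta\mathrm{IC}_{12} = \mathrm{IC}_1 - \mathrm{IC}_2 = -2\ln\frac{L_1}{L_2}
\quad \text{when } k_1 = k_2
```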

Evidence Scales

               L/R        log2      ln        log10
Weak           <8         <3        <2        <1
Strong         8 to <32   3 to <5   2 to <7   1 to <2
Very strong    >32        >5        >7        >2

Note: the cutoffs are arbitrary and vary with scale.

Which Information Criterion? AIC? AICc? SIC/BIC?
- Don't use AIC
- AICc or SIC/BIC? 5.9 of one versus 6.1 of the other

What is the sample size for the complexity penalty?
- Mark/recapture is based on multinomial likelihoods
- An observation is a capture history, not a session

To Q or not to Q?
- IC-based model selection assumes there is a good model in the set
- Over-dispersion is common in mark/recapture data, due to lack of independence of observations, so we don't have a good model in the set
- Parameter estimate bias is generally not influenced, but fit will appear too good!
- Model selection will choose more highly parameterized models than appropriate

Quasi-likelihood approach
- χ2 goodness-of-fit test for the most general model
- If H0 is rejected, estimate the variance inflation factor ĉ = χ2/df
- Correct the fit component of the IC and redo the selection

QICs (quasi-likelihood-corrected information criteria)
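The QIC formulas on this slide did not survive transcription. Below is a minimal sketch of the standard QAICc correction in the Burnham & Anderson form, assuming ĉ has been estimated as on the previous slide; the model names, likelihood values, and degrees of freedom are invented for illustration.

```python
def c_hat(chi2: float, df: int) -> float:
    """Variance inflation factor from the chi-square goodness-of-fit
    test of the most general model in the set."""
    return chi2 / df

def qaicc(log_lik: float, k: int, n: int, c: float) -> float:
    """QAICc: the fit term -2 ln L is deflated by c-hat, and one
    parameter is added to k for estimating c-hat itself."""
    k = k + 1  # count c-hat as an estimated parameter
    return -2.0 * log_lik / c + 2 * k + 2 * k * (k + 1) / (n - k - 1)

# Toy usage: three candidate models fit to n = 120 capture histories.
n = 120
c = c_hat(chi2=61.0, df=40)           # c-hat ~ 1.5 suggests over-dispersion
models = {"phi(.)p(.)": (-310.2, 2),  # name: (log-likelihood, k)
          "phi(t)p(.)": (-305.8, 6),
          "phi(t)p(t)": (-303.1, 11)}
scores = {name: qaicc(ll, k, n, c) for name, (ll, k) in models.items()}
print(sorted(scores.items(), key=lambda kv: kv[1]))  # best model first
```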

Problems with the quasi-likelihood correction
- ĉ is essentially a variance estimate, and variance estimates are unstable without a lot of data
- ln L/ĉ is a ratio statistic, and ratio statistics are highly unstable if the uncertainty in the denominator is not trivial
- Unlike AICc's fixed penalty, this bias correction is estimated, and estimating a bias correction inflates variance!

Fixes
- Explicitly include a random component in the model, then redo model selection
- Bootstrapped median ĉ
- Model selection with jackknifed prediction likelihood

Large or small model sets?
- Problem: model selection bias. When the number of models is large relative to the data size, some models will fit well just by chance.
- Small: Burnham & Anderson strongly advocate small model sets representing well thought out science; large model sets = "data dredging"
- Large: the science may not be mature, and small model sets may risk missing important factors

Model Selection from Many Candidates (Taper 2004)
SIC(X) = -2 ln(L) + (ln(n) + X) k

Performance of SIC(X) with a small data set: N = 50, true covariates = 10, spurious covariates = 30, all models of order ≤ 20; 1.141 × 10^14 candidate models.

Chen & Chen (2009): m = subset size, P = # of possible terms
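Assuming this slide refers to Chen & Chen's extended BIC for large candidate sets, the criterion adds to SIC/BIC a penalty for the number of ways a subset of size m can be drawn from the P possible terms; a reconstruction:

```latex
\mathrm{EBIC}_{\gamma} = -2\ln L + m\ln n + 2\gamma\ln\binom{P}{m},
\qquad 0 \le \gamma \le 1
```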

Explicit Tradeoff
- Small model sets: allow exploration of fine structure and small effects, but risk missing unanticipated large effects
- Large model sets: will catch unknown large effects, but will miss fine structure
- Large or small model sets is a principled choice that data analysts should make based on their background knowledge and needs

Akaike Weights & Model Averaging
Beware, there be dragons here!

Akaike Weights
- "Relative likelihood of model i, given the data and the model set"
- "Weight of evidence that model i is the most appropriate, given the data and the model set"
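The weight formula itself did not survive transcription; the standard definition, with Δi the AIC difference of model i from the best of the R models in the set, is:

```latex
w_i = \frac{\exp(-\Delta_i/2)}{\sum_{r=1}^{R}\exp(-\Delta_r/2)},
\qquad \Delta_i = \mathrm{AIC}_i - \mathrm{AIC}_{\min}
```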

Model Averaging
- "Conditional" variance: conditional on the selected model
- "Unconditional" variance: actually conditional on the entire model set
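For reference, the usual model-averaged estimator and its "unconditional" variance in the Burnham & Anderson form (a reconstruction; the transcript did not preserve the slide's formulas):

```latex
\hat{\bar{\theta}} = \sum_i w_i\,\hat{\theta}_i,
\qquad
\widehat{\operatorname{var}}\bigl(\hat{\bar{\theta}}\bigr)
= \Biggl[\sum_i w_i \sqrt{\widehat{\operatorname{var}}\bigl(\hat{\theta}_i \mid g_i\bigr) + \bigl(\hat{\theta}_i - \hat{\bar{\theta}}\bigr)^2}\Biggr]^2
```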

Good impulse with Huge Problems
- I do not recommend Akaike weights
- I do not recommend model averaging in this fashion
- The importance of good models is diminished by adding bad models
- The location of the average is influenced by adding redundant models

Model Redundancy
- Model space is not filled uniformly; models tend to be developed in highly redundant clusters
- Some points in model space allow few models; some allow many

Redundant models do not add much information
[Figure: two panels plotting model adequacy against model dimension.]

A more reasonable approach
- Bootstrap the data
- Fit the model set & select the best model
- Estimate the derived parameter θ from the best model
- Accumulate θ
- Repeat within time constraints
- Report the mean or median θ with percentile confidence intervals
A sketch of this loop follows.
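A minimal sketch of the procedure above, assuming a generic regression setting: the candidate models (polynomial fits scored by AICc) and the derived parameter (the predicted mean at a hypothetical covariate value x0) are illustrative stand-ins for the mark/recapture models on these slides.

```python
import numpy as np

rng = np.random.default_rng(1)

def aicc(rss: float, n: int, k: int) -> float:
    """AICc for a Gaussian model with residual sum of squares rss
    and k estimated parameters (including the error variance)."""
    return n * np.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)

def best_model(x, y, degrees=(1, 2, 3)):
    """Fit each candidate polynomial, score by AICc, return the best coefficients."""
    n = len(y)
    fits = []
    for d in degrees:
        coef = np.polyfit(x, y, d)
        rss = float(np.sum((y - np.polyval(coef, x)) ** 2))
        fits.append((aicc(rss, n, d + 2), coef))  # d+1 coefficients + sigma^2
    return min(fits, key=lambda f: f[0])[1]

# Toy data standing in for the real data set.
x = np.linspace(0.0, 10.0, 60)
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, x.size)
x0 = 5.0  # derived parameter theta: predicted mean at x0

thetas = []
for _ in range(2000):                      # repeat within time constraints
    idx = rng.integers(0, x.size, x.size)  # bootstrap the data
    coef = best_model(x[idx], y[idx])      # fit model set, select best model
    thetas.append(np.polyval(coef, x0))    # estimate theta and accumulate

lo, med, hi = np.percentile(thetas, [2.5, 50.0, 97.5])
print(f"theta: median = {med:.3f}, 95% percentile CI = ({lo:.3f}, {hi:.3f})")
```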