Published by Ashley Black. Modified over 2 years ago.

1
Practical Model Selection and Multi-model Inference using R
Modified from a presentation by Eric Stolen and Dan Hunt

2
Theory
This is the link with science, which is about understanding how the world works.

3
Indigo Snake Habitat selection
David R. Breininger, M. Rebecca Bolt, Michael L. Legare, John H. Drese, and Eric D. Stolen. Source: Journal of Herpetology, 45(4):484-490. 2011.
– Animal perception
– Evolutionary Biology
– Population Demography
http://www.seaworld.org/animal-info/animal-bytes/spooky-safari/eastern-indigo-snake.htm

4
Hypotheses
To use the information-theoretic toolbox, we must be able to state a hypothesis as a statistical model (or, more precisely, as an equation that allows us to calculate the maximum likelihood of the hypothesis).
http://www.seaworld.org/animal-info/animal-bytes/spooky-safari/eastern-indigo-snake.htm

5
Multiple Working Hypotheses
We operate with a set of multiple alternative hypotheses (models).
The many advantages include safeguarding objectivity and allowing rigorous inference.
Chamberlain (1890)
Platt (1964): Strong Inference
Karl Popper (ca. 1960): Bold Conjectures

6
Deriving the model set
This is the tough part (but also the creative part).
Much thought is needed, so don’t rush.
Collaborate, seek outside advice, read the literature, go to meetings…
"How" and "When" hypotheses are better than "What" hypotheses (strive to predict rather than describe).

7
Models – Indigo Snake example
David R. Breininger, M. Rebecca Bolt, Michael L. Legare, John H. Drese, and Eric D. Stolen. Source: Journal of Herpetology, 45(4):484-490. 2011.
Study of indigo snake habitat use
Response variable: home range size, ln(ha)
SEX
Land cover: 2-3 levels (lc2)
weeks = effort/exposure
Science question: “Is there a seasonal difference in habitat use between sexes?”

8
Models – Indigo Snake example
SEX
land cover type (lc2)
weeks
SEX + lc2
SEX + weeks
lc2 + weeks
SEX + lc2 + weeks
SEX + lc2 + SEX * lc2
SEX + lc2 + weeks + SEX * lc2
http://www.herpnation.com/hn-blog/indigo-snake-survival-demographics/?simple_nav_category=john-c-murphy

9
Models – Indigo Snake example
SEX
land cover type (lc2)
weeks
SEX + lc2
SEX + weeks
lc2 + weeks
SEX + lc2 + weeks
SEX + lc2 + SEX * lc2
SEX + lc2 + weeks + SEX * lc2

10
Modeling
Trade-off between precision and bias.
The aim is to derive knowledge and advance learning, not to “fit the data”.
The sophistication of the model should match the data (quantity and quality).

11
Precision-Bias Trade-off
[Figure: squared bias (Bias²) plotted against model complexity (increasing number of parameters)]

12
Precision-Bias Trade-off
[Figure: squared bias (Bias²) and variance plotted against model complexity (increasing number of parameters)]

13
Precision-Bias Trade-off
[Figure: squared bias (Bias²) and variance plotted against model complexity (increasing number of parameters)]

14
Kullback-Leibler Information
Basic concept from information theory.
The information lost when a model is used to represent full reality.
Can also be thought of as the distance between a model and full reality.

15
Kullback-Leibler Information
[Figure: truth / reality and the candidate models G1 (best model in set), G2, and G3]

16
Kullback-Leibler Information
[Figure: truth / reality and the candidate models G1 (best model in set), G2, and G3]

17
Kullback-Leibler Information
[Figure: truth / reality and the candidate models G1 (best model in set), G2, and G3]

18
Kullback-Leibler Information
[Figure: truth / reality and the candidate models G1 (best model in set), G2, and G3]
The relative difference between models is constant.
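The claim that relative differences between models do not depend on knowing full reality can be checked numerically. The sketch below is only an illustration in Python (the presentation itself uses R): the discrete distributions `truth`, `g1`, `g2`, and `g3` are hypothetical values chosen for the example, not data from the slides. It uses the standard definition I(f, g) = Σ f log(f / g) and shows that the difference between two models needs only the E_f[log g] terms.

```python
import math

def kl_information(f, g):
    """K-L information I(f, g) = sum f * log(f / g) for discrete distributions."""
    return sum(fi * math.log(fi / gi) for fi, gi in zip(f, g) if fi > 0)

def expected_log_prob(f, g):
    """E_f[log g]: the only part of I(f, g) that depends on the model g."""
    return sum(fi * math.log(gi) for fi, gi in zip(f, g) if fi > 0)

# Hypothetical "truth" and three hypothetical approximating models.
truth = [0.50, 0.30, 0.20]
g1 = [0.45, 0.35, 0.20]
g2 = [0.60, 0.20, 0.20]
g3 = [0.34, 0.33, 0.33]

for name, g in [("G1", g1), ("G2", g2), ("G3", g3)]:
    print(name, round(kl_information(truth, g), 4))

# The E_f[log f] term is the same for every model, so it cancels in differences.
diff_kl = kl_information(truth, g2) - kl_information(truth, g1)
diff_rel = expected_log_prob(truth, g1) - expected_log_prob(truth, g2)
print(diff_kl, diff_rel)
```

Because the term involving only the truth cancels, models can be ranked relative to each other even though the absolute distance to reality is never known.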

19
Akaike’s Contributions
Figured out how to estimate the relative Kullback-Leibler distance between models in a set of models.
Figured out how to link maximum likelihood estimation theory with expected K-L information.
An Information Criterion (Akaike’s Information Criterion):
$\mathrm{AIC} = -2\log_e\big(\mathcal{L}(\text{model}_i \mid \text{data})\big) + 2K$

20
$\mathrm{AICc}_i = -2\log_e\big(\mathcal{L}(\text{model}_i \mid \text{data})\big) + 2K\,\frac{n}{n-K-1}$
or, equivalently,
$\mathrm{AICc}_i = \mathrm{AIC}_i + \frac{2K(K+1)}{n-K-1}$
(where $K$ = the number of parameters estimated and $n$ = the sample size)
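The two forms of AICc given here are algebraically identical, which is easy to confirm in code. This is a minimal sketch in Python rather than R; the function names and the values of `log_lik`, `k`, and `n` are hypothetical and chosen only for illustration.

```python
def aic(log_lik, k):
    """AIC = -2 * log-likelihood + 2K."""
    return -2.0 * log_lik + 2.0 * k

def aicc(log_lik, k, n):
    """AICc written with the n / (n - K - 1) multiplier on the penalty."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > K + 1")
    return -2.0 * log_lik + 2.0 * k * (n / (n - k - 1))

def aicc_from_aic(log_lik, k, n):
    """AICc written as AIC plus the small-sample correction term."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > K + 1")
    return aic(log_lik, k) + 2.0 * k * (k + 1) / (n - k - 1)

# Hypothetical maximized log-likelihood, parameter count, and sample size.
log_lik, k, n = -50.0, 4, 40
print(aic(log_lik, k), aicc(log_lik, k, n), aicc_from_aic(log_lik, k, n))
```

As n grows relative to K the correction term shrinks toward zero and AICc approaches AIC.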

21
$\mathrm{AICc}_{\min}$ = AICc for the model with the lowest AICc value
$\Delta_i = \mathrm{AICc}_i - \mathrm{AICc}_{\min}$

22
$w_i = \Pr\{g_i \mid \text{data}\}$ (model probabilities)
Evidence ratio of model $i$ to model $j$ = $w_i / w_j$
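The slide text does not show how the weights are computed. The sketch below assumes the usual Akaike-weight formula, w_i = exp(-Δ_i / 2) / Σ_r exp(-Δ_r / 2), and is written in Python rather than R. The function name `akaike_weights` and the three AICc values are hypothetical.

```python
import math

def akaike_weights(aicc_values):
    """Return (deltas, weights) for a list of AICc values."""
    best = min(aicc_values)
    deltas = [a - best for a in aicc_values]
    # Relative likelihood of each model given the data.
    rel_lik = [math.exp(-0.5 * d) for d in deltas]
    total = sum(rel_lik)
    return deltas, [r / total for r in rel_lik]

# Hypothetical AICc values for three candidate models.
aicc_values = [100.0, 102.0, 110.0]
deltas, weights = akaike_weights(aicc_values)

# Evidence ratio of the best model to the second model.
evidence_ratio = weights[0] / weights[1]
print(deltas, [round(w, 3) for w in weights], round(evidence_ratio, 3))
```

Under this formula the evidence ratio depends only on the difference in Δ between the two models, not on which other models are in the set.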

23
Least Squares Regression
$\mathrm{AICc} = n\log_e(\hat{\sigma}^2) + 2K\,\frac{n}{n-K-1}$
where $\hat{\sigma}^2 = \mathrm{RSS}/n$
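The least-squares form drops a constant from the Gaussian log-likelihood, so its absolute value differs from the likelihood-based AICc while differences between models fitted to the same data are unchanged. The sketch below illustrates this in Python rather than R, assuming normally distributed errors; the RSS values, sample size, and parameter counts are hypothetical.

```python
import math

def aicc_least_squares(rss, n, k):
    """AICc from the residual sum of squares: n * log(RSS / n) + penalty."""
    sigma2_hat = rss / n
    return n * math.log(sigma2_hat) + 2.0 * k * (n / (n - k - 1))

def aicc_gaussian_loglik(rss, n, k):
    """AICc from the full maximized Gaussian log-likelihood."""
    sigma2_hat = rss / n
    log_lik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(sigma2_hat) + 1.0)
    return -2.0 * log_lik + 2.0 * k * (n / (n - k - 1))

# Two hypothetical models fitted to the same n observations.
n = 30
rss_a, k_a = 42.0, 3
rss_b, k_b = 38.5, 4

delta_ls = aicc_least_squares(rss_b, n, k_b) - aicc_least_squares(rss_a, n, k_a)
delta_ll = aicc_gaussian_loglik(rss_b, n, k_b) - aicc_gaussian_loglik(rss_a, n, k_a)
print(delta_ls, delta_ll)
```

The two forms should not be mixed within one model set, because the dropped constant n(log 2π + 1) would then no longer cancel.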

24
Counting Parameters: K = number of parameters estimated
Least Squares Regression: K = number of predictor coefficients + 2 (for the intercept and the residual variance $\sigma^2$)

25
Counting Parameters: K = number of parameters estimated
Logistic Regression: K = number of predictor coefficients + 1 (for the intercept)
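The counting rules can be written as a small helper. This is a sketch in Python rather than R; the function name `count_k` and the model-type labels are hypothetical. The example calls assume that SEX, lc2, and weeks each contribute one coefficient, which agrees with the K values reported for the indigo snake models in the model selection table.

```python
def count_k(n_coefficients, model_type):
    """Count estimated parameters K.

    n_coefficients: number of predictor coefficients, excluding the intercept.
    model_type: "least_squares" adds 2 (intercept and residual variance);
                "logistic" adds 1 (intercept only).
    """
    extras = {"least_squares": 2, "logistic": 1}
    if model_type not in extras:
        raise ValueError(f"unknown model type: {model_type}")
    return n_coefficients + extras[model_type]

# ha.ln ~ SEX: one coefficient (assumed) plus intercept and variance.
k_sex = count_k(1, "least_squares")
# ha.ln ~ SEX + lc2 + weeks: three coefficients (assumed).
k_full = count_k(3, "least_squares")
# The same three predictors in a logistic regression.
k_logistic = count_k(3, "logistic")
print(k_sex, k_full, k_logistic)
```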

26
Comparing Models
Model selection based on AICc:

Model  K    AICc  Delta_AICc  AICcWt  Cum.Wt      LL
mod4   4  112.98        0.00    0.71    0.71  -51.99
mod7   5  114.89        1.91    0.27    0.98  -51.67
mod1   3  121.52        8.54    0.01    0.99  -57.47
mod5   4  122.27        9.29    0.01    1.00  -56.64
mod2   3  125.93       12.95    0.00    1.00  -59.67
mod6   4  128.34       15.36    0.00    1.00  -59.67
mod3   3  141.26       28.28    0.00    1.00  -67.34

Model 1 = "ha.ln ~ SEX"
Model 2 = "ha.ln ~ lc2"
Model 3 = "ha.ln ~ weeks"
Model 4 = "ha.ln ~ SEX + lc2"
Model 5 = "ha.ln ~ SEX + weeks"
Model 6 = "ha.ln ~ lc2 + weeks"
Model 7 = "ha.ln ~ SEX + lc2 + weeks"
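The Delta_AICc, AICcWt, and Cum.Wt columns can be rebuilt from the AICc column alone. The sketch below does this in Python rather than R, using the AICc values reported in the table and assuming the usual Akaike-weight formula; it is not the code that produced the original output.

```python
import math

# AICc values as reported in the model selection table.
aicc = {
    "mod4": 112.98, "mod7": 114.89, "mod1": 121.52, "mod5": 122.27,
    "mod2": 125.93, "mod6": 128.34, "mod3": 141.26,
}

best = min(aicc.values())
ranked = sorted(aicc.items(), key=lambda item: item[1])

# Relative likelihoods exp(-Delta / 2), normalized to sum to 1.
rel_lik = {name: math.exp(-0.5 * (value - best)) for name, value in ranked}
total = sum(rel_lik.values())

rows = []
cumulative = 0.0
for name, value in ranked:
    weight = rel_lik[name] / total
    cumulative += weight
    rows.append((name, round(value - best, 2), round(weight, 2), round(cumulative, 2)))

for row in rows:
    print(row)

# Evidence ratio of the top model (mod4) to the second model (mod7).
evidence_ratio = rel_lik["mod4"] / rel_lik["mod7"]
print(round(evidence_ratio, 2))
```

On these values mod4 has roughly 2.6 times the support of mod7, so the two top models are not sharply separated.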

27
Model Averaging Predictions

28
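Model Averaging Predictions
Model-averaged prediction: $\hat{\bar{y}} = \sum_{i=1}^{R} w_i \hat{y}_i$, summing over the $R$ models in the set

29
Model Averaging Predictions
Prediction from model $i$: $\hat{y}_i$

30
Model Averaging Predictions
Weight of model $i$: $w_i$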
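A model-averaged prediction is a weighted sum of the predictions from each model, with the model weights as coefficients. The sketch below is in Python rather than R; the function name and the weights and predictions are hypothetical values for illustration, not results from the indigo snake analysis.

```python
def model_averaged_prediction(weights, predictions):
    """Weighted average of per-model predictions.

    The weights are renormalized so the result is still a proper average
    if the supplied weights do not sum exactly to 1 (for example after rounding).
    """
    if len(weights) != len(predictions):
        raise ValueError("weights and predictions must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, predictions)) / total

# Hypothetical weights and predictions from three models.
weights = [0.7, 0.2, 0.1]
predictions = [2.0, 2.5, 3.0]
averaged = model_averaged_prediction(weights, predictions)
print(averaged)
```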

31
Model Averaging Parameters
Model-averaged parameter estimate: $\hat{\bar{\theta}} = \sum_{i=1}^{R} w_i \hat{\theta}_i$, summing over the $R$ models in the set

32
Unconditional Variance Estimator
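The estimator itself did not survive in the slide text. The sketch below assumes the form commonly given by Burnham and Anderson, in which the unconditional standard error is Σ w_i · sqrt(var(θ̂_i | g_i) + (θ̂_i − θ̄)²) and the variance is its square. It is written in Python rather than R, and the function names, weights, estimates, and standard errors are hypothetical.

```python
import math

def model_averaged_estimate(weights, estimates):
    """Model-averaged parameter estimate: sum of w_i * theta_i."""
    return sum(w * e for w, e in zip(weights, estimates))

def unconditional_se(weights, estimates, std_errors):
    """Unconditional standard error (assumed Burnham and Anderson form).

    Each model contributes its conditional variance plus the squared
    deviation of its estimate from the model-averaged estimate.
    """
    theta_bar = model_averaged_estimate(weights, estimates)
    return sum(
        w * math.sqrt(se ** 2 + (est - theta_bar) ** 2)
        for w, est, se in zip(weights, estimates, std_errors)
    )

# Hypothetical weights, estimates, and conditional standard errors.
weights = [0.6, 0.4]
estimates = [1.0, 2.0]
std_errors = [0.3, 0.4]

theta_bar = model_averaged_estimate(weights, estimates)
se_uncond = unconditional_se(weights, estimates, std_errors)
var_uncond = se_uncond ** 2
print(theta_bar, se_uncond, var_uncond)
```

When the models disagree about the estimate, the squared-deviation term inflates the standard error beyond the weighted average of the conditional standard errors, which is how model selection uncertainty enters.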
