Hypothesis Testing (presentation transcript)


1 Hypothesis Testing

2 Statistical Inference – dealing with parameter and model uncertainty
 Confidence intervals (credible intervals)
 Hypothesis tests
 Goodness-of-fit
 Model selection (AIC)
 Model averaging
 Bayesian model updating

3 Statistical Testing of Hypotheses
 Objective: determine whether parameters differ from hypothesized values.
 Testing procedure framed as a comparison of null and alternative hypotheses.
 Null hypothesis
 Alternative hypothesis
 Compound (1-sided) alternatives

4 Procedure for Null Hypothesis Testing
 Specify the null and alternative hypotheses
 Compute the test statistic
  a random variable summarizing the sampling distribution expected if the null hypothesis is true (e.g., the difference between sample means for 2 groups when the true means are equal)
 Compare the statistic to the sampled value
 The test is a binary decision at significance level α
 Two types of incorrect decisions:
  rejecting H0 when it is true (Type I error), Pr = α
  not rejecting H0 when it is false (Type II error), Pr = β
 Power of the test = 1 − β
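The steps above can be sketched as a simple two-sample z-test. This is an illustrative sketch, not from the slides: the data and the 1.96 critical value (two-sided α = 0.05) are assumptions chosen for the example.

```python
import math

# Hypothetical two-sample z-test sketch (illustrative data, not from the slides).
# H0: the two group means are equal; Ha: they differ (two-sided).
def two_sample_z(x, y):
    """Test statistic: standardized difference of the sample means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    vx = sum((v - mx) ** 2 for v in x) / (len(x) - 1)
    vy = sum((v - my) ** 2 for v in y) / (len(y) - 1)
    return (mx - my) / math.sqrt(vx / len(x) + vy / len(y))

group_a = [9.1, 10.3, 9.8, 10.6, 9.5, 10.1]
group_b = [11.0, 11.8, 10.9, 12.1, 11.4, 11.6]
z = two_sample_z(group_a, group_b)
# Binary decision at significance level alpha = 0.05 (two-sided):
# reject H0 if |z| exceeds the critical value 1.96.
reject = abs(z) > 1.96
```

The decision is binary: the statistic either falls past the critical value or it does not, which is exactly the Type I / Type II trade-off the slide describes.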

5 P-values
 Probability of obtaining a test statistic at least as extreme as the observed one, given that the null hypothesis is true
 Not Pr(null hypothesis is true)
 A measure of the consistency of the data with the null, not of the strength of evidence for the alternative
 Depends on the null hypothesis (if the null is that the groups differ by 1 rather than 0, the p-value will be different)
 Depends on sample size
 Provides no information on the size or precision of the estimated effect (i.e., not a measure of biological relevance, and not a confidence interval)
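The definition above can be made concrete with a permutation test, a hypothetical sketch not taken from the slides: the p-value is literally the fraction of label-shuffled datasets whose statistic is at least as extreme as the observed one, under the null of no group difference.

```python
import random

# Hypothetical sketch (not from the slides): a permutation p-value for a
# difference in group means.  Pr(statistic >= observed | H0 true) is
# approximated by the fraction of shuffles at least as extreme as observed.
def perm_pvalue(x, y, n_perm=2000, seed=0):
    rng = random.Random(seed)
    obs = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        xs, ys = pooled[:len(x)], pooled[len(x):]
        if abs(sum(xs) / len(xs) - sum(ys) / len(ys)) >= obs:
            hits += 1
    return hits / n_perm
```

Note how the result depends on both the null hypothesis (what counts as "no difference") and the sample size, as the slide warns.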

6 Reality vs. conclusion
 H0 true (Ha false), we don't reject H0: 1 − α (e.g., 0.95). Odds of saying there is no difference when there really is none: 95/100 times when there is no effect, we'll correctly say there is no effect.
 H0 false (Ha true), we don't reject H0: β (e.g., 0.20), Type II error. Odds of saying there is no difference when there really is one: 20/100 times when there is an effect, we'll say there is no effect.
 H0 true (Ha false), we reject H0 and accept Ha: α (e.g., 0.05), Type I error. Odds of saying there is a difference when there is no difference: 5/100 times when there is no effect, we'll say there is one.
 H0 false (Ha true), we reject H0 and accept Ha: 1 − β (e.g., 0.80), power. Odds of saying there is a difference when there is one: 80/100 times when there is an effect, we'll say there is one.

7 Comments
 Lower α → lower power; higher α → higher power
 Lower α: conservative about rejecting the null when it's true (i.e., about saying there's an effect when there really isn't)
 Higher α: increases the chance of a Type I error, decreases the chance of a Type II error, and decreases the rigor of the test
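These error rates can be checked by simulation. The setup below is an illustrative assumption, not from the slides: a one-sample z-test of H0: μ = 0 with known σ = 1, n = 25, two-sided α = 0.05.

```python
import random

# Hypothetical simulation of the Type I error rate and power for a
# one-sample z-test (H0: mu = 0, known sigma = 1, n = 25, alpha = 0.05).
def reject_rate(true_mu, n=25, crit=1.96, trials=4000, seed=0):
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(true_mu, 1.0) for _ in range(n)) / n
        z = xbar / (1.0 / n ** 0.5)      # (xbar - 0) / (sigma / sqrt(n))
        if abs(z) > crit:
            rejections += 1
    return rejections / trials

type_i_rate = reject_rate(0.0)   # H0 true: rejection rate should be near alpha
power = reject_rate(0.5)         # H0 false (effect = 0.5 sigma): 1 - beta
```

Raising the critical value (lower α) in this simulation lowers both the Type I error rate and the power, as the slide states.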

8 Sample Design: Choosing a Sample Size
 Can choose based on a target precision level (e.g., confidence interval width) or on power (hypothesis testing)
 Requires assumptions and tentative parameter values (e.g., effect size)
 Therefore it is an exercise in approximation
 Might identify cases where the minimal sufficient sample size would bust the budget or is logistically impractical to achieve
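As a concrete sketch of this approximation exercise (the formula and the standard normal quantiles 1.96 and 0.84 are textbook values, not from the slides): for a two-sided z-test, the sample size per group is roughly ((z_{α/2} + z_β) · σ / effect)².

```python
import math

# Back-of-envelope sample size per group for a two-sided z-test.
# Assumed quantiles: z_alpha = 1.96 (alpha = 0.05, two-sided) and
# z_beta = 0.84 (power = 0.80).  Tentative effect size and sigma go in.
def sample_size(effect, sigma, z_alpha=1.96, z_beta=0.84):
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)
```

Halving the assumed effect size quadruples the required n, which is exactly how a study design can "bust the budget".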

9 Likelihood Ratio Tests
 Compare the fit of a hypothesized model to another model (generally containing more parameters): the null model vs. an alternative model with additional parameters
 Based on maximum likelihood estimation theory
 Evaluate the MLEs under the restricted and the more general parameterizations
 Calculate the likelihood ratio
 Twice the difference in maximized log-likelihoods is asymptotically chi-square, with degrees of freedom equal to the difference in the number of parameters between the models
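A minimal sketch of the mechanics, using a binomial example: the data (k = 62 successes in n = 100) and the restricted null p = 0.5 are hypothetical, and 3.84 is the χ² critical value for df = 1 at α = 0.05.

```python
import math

# Hypothetical likelihood ratio test: H0 fixes p = 0.5 (restricted model);
# Ha lets p be free (one extra parameter, MLE p-hat = k/n).
def binom_loglik(k, n, p):
    """Binomial log-likelihood up to a constant that cancels in the ratio."""
    return k * math.log(p) + (n - k) * math.log(1 - p)

k, n = 62, 100
ll_null = binom_loglik(k, n, 0.5)     # restricted parameterization
ll_alt = binom_loglik(k, n, k / n)    # more general parameterization
lrt = 2 * (ll_alt - ll_null)          # ~ chi-square with df = 1 under H0
reject = lrt > 3.84                   # chi-square(1) critical value, alpha = 0.05
```

The df = 1 here is the difference in parameter counts between the two models, matching the slide's rule.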

10 Goodness of Fit (GOF): "absolute" fit of the model
 Goal is to determine whether the data are consistent with the statistical model
 Test statistic generated from the probability model using the estimated parameters
 Is there variation in the data that is out of the ordinary and not reflected in our statistical model?

11 Pearson's χ² GOF Test
 Logic: if the model is 'correct', the expected and observed frequencies for each multinomial cell should be similar.
 Imagine we roll a die 1000 times and want to determine whether the model P(x=1) = P(x=2) = … = P(x=6) is a good model
 If the sample size is adequate (expect at least 2 per cell):
  χ² = Σ (observed_i − expected_i)² / expected_i
  df = (# cells) − 1
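The slide's die example, directly in code; the observed counts are hypothetical, and 11.07 is the χ² critical value for df = 5 at α = 0.05.

```python
# Sketch of the die example: 1000 rolls, H0: all six faces equally likely.
observed = [180, 160, 170, 150, 170, 170]   # hypothetical counts, sum = 1000
expected = [1000 / 6] * 6                   # equal expected frequencies

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                      # cells - 1 = 5
fits = chi2 <= 11.07                        # chi-square(5) critical value, alpha = 0.05
```

Here χ² = 3.2 is well below the critical value, so these counts give no evidence against the equal-probability model.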

12 General GOF with Large Samples
 Pearson's χ²
 Direct use of the deviance

13 Bootstrap GOF Test
 Compute ML estimates for the parameters
 Produce an empirical distribution of estimates by simulating capture histories for each released animal:
  assume parameter = MLE
  'flip coins' to determine survival and capture for each period
  repeat for the { R_i } released animals, re-estimate the parameters
  compute the deviance
 Compare the original deviance with the empirical distribution (i.e., at what percentile does it fall?)
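The bootstrap comparison can be sketched generically. This is a deliberately simplified stand-in, not the slides' capture-history simulation: plain coin flips replace the survival/capture simulation, and the statistic is a simple chi-square-like quantity.

```python
import random

# Hypothetical simplified parametric-bootstrap GOF sketch: simulate data
# under the fitted model, recompute the fit statistic each time, and ask
# at what percentile the observed statistic falls.
def bootstrap_percentile(observed_stat, simulate_stat, n_boot=1000, seed=0):
    rng = random.Random(seed)
    sims = [simulate_stat(rng) for _ in range(n_boot)]
    return sum(1 for s in sims if s <= observed_stat) / n_boot

def sim_stat(rng, n=100, p=0.5):
    """Chi-square-like statistic from n 'coin flips' under the fitted p."""
    k = sum(rng.random() < p for _ in range(n))
    return (k - n * p) ** 2 / (n * p * (1 - p))

percentile = bootstrap_percentile(3.5, sim_stat)   # 3.5 is a hypothetical observed stat
```

An observed statistic in the far right tail of the simulated distribution (a percentile near 1) indicates lack of fit.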

14 What Indicates Lack of Fit?
 With a GOF test, the hope and purpose is to accept (fail to reject) the null hypothesis
 This is counter to usual statistical hypothesis testing
 What is a 'significant' P-value?

15 What Might Cause Lack of Fit?
 Inadequate model structure for detection or survival, e.g.:
  age dependence, size dependence, etc.
  trap dependence
  those released earlier survive at a different rate
  non-random temporary emigration
 Lack of independence among animals

16 Solutions
 Inadequate model structure? Improve it.
  Goal: subdivide animals sufficiently that p and S are equal within each group
  Warning: inadequate model structure doesn't always result in lack of fit, e.g.:
   permanent emigration (confounded with S)
   random temporary emigration (confounded with p)
   random ring loss (confounded with S)
 Lack of independence? Correct for overdispersion:
  inflate variances using quasi-likelihood

17 Adjusting Variances for Overdispersion
 Based on quasi-likelihood theory
 c-hat = deviance / df
 adjusted variance = c-hat × (ML variance)
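The slide's inflation rule, directly in code; the deviance, df, and ML variance values in the test are illustrative numbers, not from the slides.

```python
# Quasi-likelihood variance inflation, as on the slide:
# c_hat = deviance / df; adjusted variance = c_hat * (ML variance).
def adjust_variance(ml_variance, deviance, df):
    c_hat = deviance / df
    return c_hat * ml_variance
```

With c-hat > 1 (deviance exceeding its degrees of freedom), the adjusted variances are wider than the ML variances, reflecting the extra-binomial variation.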

18 Bootstrap Adjustment for Overdispersion
 For each simulated sample:
  compute the deviance
  compute c-hat = deviance / df
 Bootstrap c-hat =
  (observed deviance) / (mean deviance), or
  (observed c-hat) / (mean c-hat)
 Note: could replace the deviance with Pearson's χ², or the mean with the median
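The bootstrap c-hat ratio above is a one-liner; the deviance values in the test are illustrative numbers, not from the slides.

```python
from statistics import mean, median

# Bootstrap c-hat as on the slide: observed deviance over the mean (or
# median) of the simulated deviances.
def bootstrap_c_hat(observed_deviance, simulated_deviances, center=mean):
    return observed_deviance / center(simulated_deviances)
```

Passing center=median implements the slide's alternative of replacing the mean with the median.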

