Hypothesis Testing. Statistical Inference – dealing with parameter and model uncertainty: confidence intervals (credible intervals), hypothesis tests.

Presentation transcript:

Hypothesis Testing

Statistical Inference – dealing with parameter and model uncertainty
• Confidence Intervals (credible intervals)
• Hypothesis Tests
• Goodness-of-fit
• Model Selection (AIC)
• Model averaging
• Bayesian Model Updating

Statistical Testing of Hypotheses
• Objective: determine whether parameters differ from hypothesized values.
• The testing procedure is framed as a comparison of null and alternative hypotheses.
  • Null hypothesis
  • Alternative hypothesis
  • Compound (1-sided) alternatives

Procedure for Null Hypothesis Testing
• Specify null and alternative hypotheses
• Compute test statistic
  • Random variable that summarizes the expected sampling distribution given that the null hypothesis is true (e.g., the probability distribution of the difference between sample means for 2 groups if the true mean is the same)
  • Compare to the sampled value
• Test is a binary decision
  • Significance level of the test: α
• Two types of incorrect decisions:
  • Rejecting H0 when it is true (Type I error), Pr = α
  • Not rejecting H0 when it is false (Type II error), Pr = β
• Power of test = 1 − β
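The procedure above can be sketched for the two-group example. This is a minimal illustration, not a method taken from the slides: it assumes samples large enough for a normal approximation, and the function name `two_sample_z_test` and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_z_test(x, y, alpha=0.05):
    """Test H0: the two group means are equal (two-sided alternative).

    Assumes samples are large enough for the normal approximation.
    """
    # Standard error of the difference between the sample means.
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    # Test statistic: observed difference in standard-error units.
    z = (mean(x) - mean(y)) / se
    # Two-sided p-value under the null distribution N(0, 1).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Binary decision at significance level alpha.
    return z, p_value, p_value < alpha


# Hypothetical measurements for two groups (illustrative data only).
group_a = [1, 2, 3, 4, 5] * 10
group_b = [v + 1 for v in group_a]
z, p_value, reject = two_sample_z_test(group_a, group_b)
z_same, p_same, reject_same = two_sample_z_test(group_a, group_a)
```

With these made-up data the shifted group leads to rejection of H0, while comparing a group with itself gives z = 0 and no rejection.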

P-values
• Probability of obtaining a test statistic at least as extreme as the observed one, given that the null hypothesis is true
• Not Pr(null hypothesis is true)
• Degree of consistency of data with the null, not strength of evidence for the alternative
• Dependent on the null hypothesis (if the null is that groups differ by 1 rather than 0, the p-value will be different)
• Dependent on sample size
• Does not provide information on size or precision of the estimated effect (i.e., not a measure of biological relevance or a confidence interval)
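The dependence of the p-value on sample size and on the chosen null can be shown with a small sketch. It assumes a one-sample z-test with known standard deviation; the function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(sample_mean, null_mean, sigma, n):
    """Two-sided p-value for a one-sample z-test with known sigma."""
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same observed mean (0.5) and sigma (2.0); only n or the null value changes.
p_small_n = z_test_p_value(0.5, 0.0, 2.0, 10)      # null: mean = 0, n = 10
p_large_n = z_test_p_value(0.5, 0.0, 2.0, 100)     # null: mean = 0, n = 100
p_other_null = z_test_p_value(0.5, 0.4, 2.0, 100)  # null: mean = 0.4, n = 100
```

The observed effect is identical in all three calls, yet the p-value is small only for the larger sample tested against a null of 0, which is why it cannot serve as a measure of effect size.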

Reality vs. conclusion

| Conclusion ↓ / Reality → | H0 true, Ha false | H0 false, Ha true |
|---|---|---|
| We don't reject H0 (null hypothesis) | 1 − α (e.g., 0.95). Probability of saying there is no difference when there really is none. 95/100 times when there is no effect, we'll correctly say there is no effect. | β (e.g., 0.20). Type II error. Probability of saying there is no difference when there really is one. 20/100 times when there is an effect, we'll say there is no effect. |
| We reject H0, accept Ha (alternative hypothesis) | α (e.g., 0.05). Type I error. Probability of saying there is a difference when there is no difference. 5/100 times when there is no effect, we'll say there is one. | 1 − β (e.g., 0.80). Power. Probability of saying there is a difference when there is one. 80/100 times when there is an effect, we'll say there is one. |
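The four cells of this table can be checked by simulation. The sketch below assumes a one-sample z-test of H0: mean = 0 with known sigma = 1 and n = 25; the true effect of 0.56 is chosen so that power is close to 0.80. All names and settings are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def rejection_rate(true_mean, n=25, sigma=1.0, alpha=0.05, n_sims=2000, seed=1):
    """Fraction of simulated samples in which H0: mean = 0 is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (sum(sample) / n) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims


# H0 true: the rejection rate estimates alpha (Type I error rate).
type_i_rate = rejection_rate(true_mean=0.0)
# H0 false: the rejection rate estimates power, 1 - beta.
power = rejection_rate(true_mean=0.56)
type_ii_rate = 1 - power
```

Rerunning with a smaller `alpha` lowers both the Type I error rate and the power, which is the trade-off noted in the comments that follow.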

Comments:
• Lower α → lower power; higher α → higher power
• Lower α is conservative in terms of rejecting the null when it's true (i.e., saying there's an effect when there really isn't)
• Higher α increases chances of Type I error, decreases chances of making a Type II error, and decreases rigor of the test.

Sample Design: Choosing a sample size
• Can choose based on target precision level (e.g., confidence intervals) or power (hypothesis testing)
• Requires assumptions and tentative parameter (e.g., effect size) values
• Therefore it is an exercise in approximation
• Might identify cases where the minimal sufficient sample size would bust the budget or is logistically impractical to achieve.
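As one illustration of a power-based choice, the sketch below uses the standard normal-approximation formula for comparing two group means, n per group = 2 (z₁₋α/₂ + z₁₋β)² σ² / δ². The formula is a textbook approximation rather than something stated in the slides, and the effect size δ and standard deviation σ are the tentative values the analyst has to assume.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-sample z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sigma / effect_size) ** 2
    # Round up: a fractional animal or plot cannot be sampled.
    return ceil(n)


# Tentative values: detect a difference of 0.5 (or 0.25) when sigma = 1.
n_needed = n_per_group(effect_size=0.5, sigma=1.0)
n_small_effect = n_per_group(effect_size=0.25, sigma=1.0)
```

Halving the assumed effect size roughly quadruples the required sample, which is how such a calculation can reveal a design that is not affordable.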

Likelihood Ratio Tests
• Compare the fit of a hypothesized model to another model (generally containing more parameters): null model vs. alternative model with additional parameters
• Based on maximum likelihood estimation theory
• Evaluate the MLE for the restricted and the more general parameterizations
• Calculate the likelihood ratio
• −2 ln(likelihood ratio) is approximately chi-square distributed, with degrees of freedom equal to the difference in number of parameters between the models
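A minimal sketch of these steps for a binomial example: the restricted model fixes the success probability at a hypothesized value, the general model estimates it, so the models differ by one parameter. The data and function names are hypothetical; the p-value uses the identity that the upper tail of a chi-square with 1 df equals erfc(√(x/2)).

```python
from math import erfc, log, sqrt


def binomial_log_lik(p, successes, n):
    """Binomial log-likelihood, omitting the constant binomial coefficient."""
    ll = 0.0
    if successes > 0:
        ll += successes * log(p)
    if successes < n:
        ll += (n - successes) * log(1 - p)
    return ll


def likelihood_ratio_test(successes, n, p_null):
    """LRT of H0: p = p_null (0 < p_null < 1) against p unrestricted."""
    p_hat = successes / n  # MLE under the general model
    stat = 2 * (binomial_log_lik(p_hat, successes, n)
                - binomial_log_lik(p_null, successes, n))
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    # Upper-tail probability of a chi-square with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Hypothetical data: 60 successes in 100 trials, null value p = 0.5.
lr_stat, lr_p_value = likelihood_ratio_test(60, 100, 0.5)
```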

Goodness of fit (GOF): “Absolute” fit of model
• Goal is to determine if data are reflective of the statistical model
• Test statistic generated based on probability model using estimated parameters
• Is there variation in the data that is out of the ordinary and not reflected in our statistical model?

Pearson’s χ² GOF Test
• Logic: if the model is ‘correct’, expected and observed cell frequencies for each multinomial cell should be similar.
• Imagine we roll a die 1000 times and want to determine if the model P(x=1)=P(x=2)=…=P(x=6) is a good model
• If sample size is adequate (expect at least 2 per cell), $\sum_i (\text{observed}_i - \text{expected}_i)^2 / \text{expected}_i \sim \chi^2$, with df = # cells − 1
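The die example can be worked through directly. The observed counts below are invented for illustration; 11.07 is the upper 5% point of a chi-square distribution with 5 degrees of freedom.

```python
def pearson_chi_square(observed, expected):
    """Sum over cells of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts of faces 1..6 from 1000 rolls.
observed = [160, 170, 165, 168, 172, 165]
n_rolls = sum(observed)
# Fair-die model: every face has probability 1/6.
expected = [n_rolls / 6] * 6

chi_square = pearson_chi_square(observed, expected)
df = len(observed) - 1
CRITICAL_VALUE_05_DF5 = 11.07  # chi-square upper 5% point, 5 df
reject_fair_die = chi_square > CRITICAL_VALUE_05_DF5
```

For these counts the statistic is far below the critical value, so the fair-die model is not rejected.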

General GOF if Large Sample
• Pearson’s χ²
• Direct use of deviance

Bootstrap GOF Test
• Compute ML estimates for parameters
• Produce empirical distribution of estimates:
  • Simulate capture histories for each released animal:
    • assume parameter = MLE,
    • ‘flip coins’ to determine survival and capture for each period
  • Repeat for {R_i} animals, estimate parameters
  • Compute deviance
• Compare original deviance with empirical distribution (i.e., what percentile?)
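The bootstrap logic can be sketched on a deliberately simplified model. Instead of full capture histories, the example below assumes a single survival probability S shared by several release groups of size R_i and only one outcome per animal (survived or not). The data, function names, and the binomial deviance used here are illustrative assumptions, not the capture-recapture model itself.

```python
import random
from math import log


def _term(count, fitted):
    # Contribution count * ln(count / fitted), with 0 * ln(0) taken as 0.
    return 0.0 if count == 0 else count * log(count / fitted)


def fit_and_deviance(survivors, released):
    """MLE of a common survival probability and the binomial deviance."""
    s_hat = sum(survivors) / sum(released)
    dev = 0.0
    for y, r in zip(survivors, released):
        dev += 2 * (_term(y, r * s_hat) + _term(r - y, r * (1 - s_hat)))
    return s_hat, dev


def bootstrap_gof(survivors, released, n_boot=500, seed=1):
    rng = random.Random(seed)
    s_hat, observed_dev = fit_and_deviance(survivors, released)
    simulated = []
    for _ in range(n_boot):
        # 'Flip a coin' for each released animal, assuming parameter = MLE.
        sim = [sum(rng.random() < s_hat for _ in range(r)) for r in released]
        # Re-estimate the parameter and compute the deviance of the simulated data.
        simulated.append(fit_and_deviance(sim, released)[1])
    # Share of simulated deviances at least as large as the observed one.
    p_value = sum(d >= observed_dev for d in simulated) / n_boot
    return observed_dev, simulated, p_value


released = [50, 50, 50, 50]                 # hypothetical R_i
dev_ok, sims_ok, p_ok = bootstrap_gof([30, 28, 33, 29], released)
dev_bad, sims_bad, p_bad = bootstrap_gof([10, 45, 20, 40], released)
```

The first data set sits well inside the simulated deviance distribution; the second has far more between-group variation than a common S can produce, so its observed deviance falls in the extreme upper tail.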

What indicates lack of fit?
• With a GOF test, the hope and purpose is to accept the null hypothesis
• This is counter to the usual statistical hypothesis testing
• What is a ‘significant’ P-value?

What might cause lack of fit?
• Inadequate model structure for detection or survival, e.g.,
  • Age dependence, size dependence, etc.
  • Trap dependence
  • Those released earlier survive at a different rate
  • Non-random temporary emigration
• Lack of independence among animals

Solutions
• Inadequate model structure? Improve it.
  • Goal: subdivide animals sufficiently that there is equal p and S within a group
  • Warning: inadequate model structure doesn’t always result in lack of fit, e.g.,
    • Permanent emigration (confounded with S)
    • Random temporary emigration (confounded with p)
    • Random ring loss (confounded with S)
• Lack of independence? Correct for overdispersion
  • Inflate variances using quasi-likelihood.

Adjusting Variances for Overdispersion
• Based on quasi-likelihood theory
• c-hat = deviance / df
• Adjusted variance = c-hat × (ML variance)
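As a small worked example of this adjustment (the deviance, degrees of freedom, and ML variance below are invented numbers, and the function name is hypothetical):

```python
from math import sqrt


def overdispersion_adjusted_variance(ml_variance, deviance, df):
    """Return (c_hat, adjusted variance) using c_hat = deviance / df."""
    c_hat = deviance / df
    return c_hat, c_hat * ml_variance


# Hypothetical fit: deviance 36 on 24 df, ML variance of an estimate 0.0004.
c_hat, adj_var = overdispersion_adjusted_variance(
    ml_variance=0.0004, deviance=36.0, df=24
)
# The standard error is inflated by sqrt(c_hat).
adj_se = sqrt(adj_var)
```

Here c-hat is 1.5, so the variance grows by 50% and the standard error by a factor of about 1.22.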

Bootstrap adjustment for overdispersion
• For each simulated sample:
  • compute deviance
  • compute c-hat = deviance / df
• Bootstrap c-hat =
  • (observed deviance) / (mean deviance), or
  • (observed c-hat) / (mean c-hat)
• Note: could replace deviance with Pearson χ², or mean with median.
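A minimal sketch of the bootstrap c-hat, assuming the simulated deviances have already been produced by a parametric bootstrap. The numbers are invented and `bootstrap_c_hat` is a hypothetical helper name; the `center` argument switches between the mean and the median variant mentioned in the note.

```python
from statistics import mean, median


def bootstrap_c_hat(observed_deviance, simulated_deviances, center=mean):
    """Observed deviance divided by the centre of the simulated deviances."""
    return observed_deviance / center(simulated_deviances)


# Hypothetical deviances from five simulated samples.
simulated = [18.0, 20.0, 22.0, 20.0, 25.0]
c_hat_mean = bootstrap_c_hat(31.5, simulated)
c_hat_median = bootstrap_c_hat(31.5, simulated, center=median)
```

Because every simulated sample shares the same df, dividing observed c-hat by mean c-hat gives the same value as dividing the deviances directly.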