Hypothesis testing Summer Program Brian Healy

Last class
Study design
  – What is sampling variability?
  – How does our sample affect the questions we can answer?
Basics of probability
Central limit theorem
Sample mean

What are we doing today?
Rare event
p-value
Hypothesis test
t-distribution / sample standard deviation

Big picture
We discussed last week that we could estimate the population mean with the sample mean, and the central limit theorem told us the distribution of the sample mean.
Now, we are going to consider testing whether or not the population mean is equal to a hypothesized value, using our sample mean as the evidence. The statement that the population mean equals this hypothesized value is the null hypothesis. This test allows us to compare our sample to a value in a statistically meaningful way.

Null hypothesis
We set up our null hypothesis so that we can reject it. The test is designed to disprove the null.
Stating the null is the first and most important step in any problem. This part requires knowledge of the problem.
Notation: H0
H0: My mother can run a 5 minute mile.
  – Not: My mother cannot run a 5 minute mile.
H0: The probability of heads on the coin is 0.5.
  – Not: The probability is not 0.5.

Alternative hypothesis
Notation: HA or H1
Has two characteristics
  – Must cover all values not included in the null
  – Must contain the value that we think is going to happen
HA: My mother runs a mile slower than 5 minutes
HA: The probability of heads is not 0.5

Hypothesis test
Definition: A statistical test of a null hypothesis
Completed under the assumption that the null is true (conditional probability)
Always want to disprove the null hypothesis
  – Ex. (one-sided) H0: Mom’s mean time <= 5:00
  – HA: Mom’s mean time > 5:00
  – Alternatively (two-sided) H0: Probability of heads = 0.5
  – HA: Probability of heads != 0.5
The most important step is properly defining the null and alternative hypotheses

How do we test this hypothesis?
Take a sample
As we have discussed, we want to think carefully about how to collect the sample to ensure that we limit bias and confounding and allow the results to be generalized to the proper population.
From this sample, we can find a summary statistic and compare this to the null hypothesis
  – Mean (t-test, linear regression)
  – Median (Wilcoxon tests, quantile regression)

What does this have to do with the CLT?
To test a hypothesis, we take a sample and find the sample mean
  – Ex. Have my mom run a mile 10 times, or flip the coin 20 times
  – Determining the proper sample size is next class
Under the null hypothesis, we know the population mean
We sometimes may know the population variance
The distribution of the sample mean is normal with known mean and variance under these conditions

Distribution of test statistic
Under the null hypothesis, we know that the distribution of $\bar{X}$ is normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
Now, we want to find the probability of observing the sample mean or a value more extreme under the null (the p-value) to judge whether the data are consistent with the null hypothesis.
Have we observed a rare event? Is it rare enough to reconsider the null?
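A quick simulation can make the claim about the sample mean concrete. This is only a sketch: the values of `mu0`, `sigma`, and `n` are borrowed from the pregnant teen example later in the deck, and the number of replicates and the seed are arbitrary choices.

```r
# Simulation sketch: the sample mean of n normal draws has
# standard deviation sigma / sqrt(n).
# mu0, sigma, and n are taken from the later example; 10000 and the seed are arbitrary.
set.seed(1)
mu0 <- 8.5
sigma <- 2.6
n <- 35

# Draw many samples of size n under the null and keep each sample mean
sample_means <- replicate(10000, mean(rnorm(n, mean = mu0, sd = sigma)))

theoretical_sd <- sigma / sqrt(n)

# Compare the simulated centre and spread with the theoretical values
c(simulated_mean = mean(sample_means),
  simulated_sd = sd(sample_means),
  theoretical_sd = theoretical_sd)
```

The simulated standard deviation should land close to `sigma / sqrt(n)`, which is the spread used for the test statistic.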

What is a rare event?
My mom claims that she runs a mile in 5 minutes.
I think she can’t.
How can I test this?
What happens if she ran a mile in
  – 5:15 minutes?
  – 6 minutes?
  – 10 minutes?
What if she ran 5 separate miles at 10 minutes on average?

What is a rare event?
You play a game against a friend. In this game, you win a dollar if the coin is heads and you lose a dollar if the coin is tails.
What is the null hypothesis?
What if the coin landed on tails 2 consecutive times?
What if the coin landed on tails 10 consecutive times?
At what point would you start to get suspicious?
We want to know if the event we observed could have happened simply by chance or if something else is more likely going on.

P-value
Tells you how rare the event is
Definition: Given a null hypothesis, the probability of the observed value or something more extreme
P(event or something more extreme | H0 is true)
Ex. Coin toss problem
  – Null hypothesis: P(tails)=0.5
  – Sample: 9 out of 10 tails
  – P(9 or more tails | H0 is true) = P(9 tails | H0 is true) + P(10 tails | H0 is true) = 0.011
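The coin toss p-value can be checked with the binomial functions in base R. This is a small sketch; the variable names are illustrative only.

```r
# P(9 or more tails in 10 flips) under H0: P(tails) = 0.5
n_flips <- 10
p_null <- 0.5

# Add the probabilities of exactly 9 and exactly 10 tails
p_value <- sum(dbinom(9:10, size = n_flips, prob = p_null))

# Equivalent upper-tail form: 1 - P(8 or fewer tails)
p_value_alt <- 1 - pbinom(8, size = n_flips, prob = p_null)

round(p_value, 3)  # 0.011, the value on the slide
```

Both forms give 11/1024, since there are 10 ways to get nine tails and 1 way to get ten tails out of 1024 equally likely sequences.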

Alpha level-type I error
Definition: probability of rejecting the null hypothesis when the null hypothesis is in fact true (rejection probability).
Usually 0.05 or 0.1, but set by the investigator
Compare the p-value to the alpha level to determine if you have a significant result. This value defines how rare an event needs to be for us to say that the event did not occur by chance.
It is called an error because, when the null hypothesis is true, the conclusion that the result was not due to chance is wrong a proportion $\alpha$ of the time.
One-sided or two-sided
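One way to see what the alpha level means is to simulate many studies in which the null hypothesis really is true and count how often it gets rejected. The sketch below assumes a two-sided normal test with known standard deviation; `mu0`, `sigma`, and `n` are borrowed from the pregnant teen example, and the helper `one_p_value` is a hypothetical name introduced just for this illustration.

```r
# Simulation sketch of the type I error rate (null hypothesis is true by construction)
set.seed(42)
alpha <- 0.05
mu0 <- 8.5
sigma <- 2.6
n <- 35
n_sims <- 10000

# Two-sided z-test p-value for one simulated sample (hypothetical helper)
one_p_value <- function() {
  x <- rnorm(n, mean = mu0, sd = sigma)
  z <- (mean(x) - mu0) / (sigma / sqrt(n))
  2 * pnorm(-abs(z))
}

p_values <- replicate(n_sims, one_p_value())

# Proportion of simulated studies that wrongly reject the null
type1_rate <- mean(p_values < alpha)
type1_rate
```

The rejection proportion should sit near `alpha`, which is the sense in which the conclusion is wrong that fraction of the time.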

Steps for hypothesis testing
1) State null and alternative hypotheses
2) State type of test and alpha level
3) Determine and calculate appropriate test statistic
4) Calculate p-value
5) Decide whether to reject or not reject the null hypothesis
  – NEVER accept null
6) Write conclusion

Example
A study in New Bedford looked at pregnant teens to see how long after becoming pregnant each young woman arrived at the physician’s office for her first visit, and the amount of time between the first visit and the second visit.
Questions: Do teens from a low income area arrive at a clinic later than the average woman? Is there more time between the first and second visit among these teens?

It is known that the average amount of time from conception until a woman first visits her doctor is 8.5 weeks (this number is an estimate because it is difficult to know exactly when conception occurred) and the average amount of time from first visit to second visit is 4.3 weeks.
For the moment, let’s assume that we know the population standard deviations for each of these are 2.6 weeks and 2.2 weeks, respectively.
We have collected a sample of 35 pregnant teens and we would like to know if they take longer to get their first visit than the average woman.

Sample data
As with all of the data sets from now on, the data is on the BIO232 website.
Let’s determine the mean for this sample and compare it to the hypothesized value.
    preg <- read.table("preg.dat", header=TRUE)
    first <- preg[,1]
    mean(first)   # This is the sample mean
    [1] 9.74
So the sample mean is clearly not equal to the population mean (8.5 weeks), but is it sufficiently different to say that these girls are different from the population?

Steps for hypothesis testing
1) Null: $\mu = 8.5$ weeks, Alternative: $\mu \neq 8.5$ weeks
2) One sample hypothesis test, alpha=0.05
3) Test statistic: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{9.74 - 8.5}{2.6/\sqrt{35}} \approx 2.82$
4) Area in upper tail = 0.0024, p-value = 0.0048
5) Reject null
6) Conclusion: There is a difference in the amount of time from conception to the first visit to a physician. The time is longer for the pregnant teens.
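The calculation behind these steps can be sketched in R from the summary numbers alone (sample mean 9.74, null mean 8.5, known standard deviation 2.6, n = 35). The variable names are illustrative, and the sample mean is used as rounded on the slide, so the last digits may differ slightly from a calculation on the raw data.

```r
# Normal (z) test from summary statistics
x_bar <- 9.74   # sample mean from the slide
mu0 <- 8.5      # mean under the null hypothesis
sigma <- 2.6    # assumed known population standard deviation
n <- 35
alpha <- 0.05

se <- sigma / sqrt(n)        # standard deviation of the sample mean
z <- (x_bar - mu0) / se      # test statistic

upper_tail <- pnorm(z, lower.tail = FALSE)   # area above the observed mean
p_two_sided <- 2 * upper_tail                # two-sided p-value

reject_null <- p_two_sided < alpha

round(c(z = z, upper_tail = upper_tail, p_value = p_two_sided), 4)
```

The upper-tail area rounds to the 0.0024 shown on the picture slide, and doubling it gives the two-sided p-value.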

Picture
[Figure: normal distribution of the sample mean under the null hypothesis, centered at 8.5, with shaded tail area = 0.0024.]

Normal hypothesis test in R
To complete a normal hypothesis test in R, you can simply use the pnorm command with the appropriate mean and standard deviation. Remember, pnorm provides the area in the lower tail by default.
For the previous problem, the appropriate standard deviation is that of the sample mean, 2.6/sqrt(35). To get the 2-sided p-value, use
    (1-pnorm(9.74, 8.5, 2.6/sqrt(35)))*2

Another way to look at the test
Given a specific alpha level, you can find the cut-off beyond which the null hypothesis would be rejected.
The region more extreme than the cut-off is called the rejection region.
For our present problem, the upper cut-off for the rejection region would be 9.36.
[Figure: null distribution centered at 8.5, with area = 0.025 in the upper tail beyond the cut-off = 9.36.]
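The cut-offs can be found with `qnorm`, the inverse of `pnorm`. The sketch below assumes the same two-sided test at alpha = 0.05 and the same known standard deviation as the example; the variable names are illustrative.

```r
# Rejection region for a two-sided normal test at level alpha
mu0 <- 8.5
sigma <- 2.6
n <- 35
alpha <- 0.05

se <- sigma / sqrt(n)

# Each tail holds alpha / 2 = 0.025 of the null distribution
upper_cutoff <- qnorm(1 - alpha / 2, mean = mu0, sd = se)
lower_cutoff <- qnorm(alpha / 2, mean = mu0, sd = se)

# Does the observed sample mean fall in the rejection region?
x_bar <- 9.74
in_rejection_region <- x_bar > upper_cutoff || x_bar < lower_cutoff

round(c(lower = lower_cutoff, upper = upper_cutoff), 2)
```

Checking whether the sample mean lies beyond a cut-off gives the same decision as comparing the p-value with alpha.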

Practice
Here are the times my mom ran in the 10 trials. Test the null hypothesis that she can run a 9:00 mile on average.
    mom <- c(9.5, 10, 8.75, 9, 11.2, 8.65, 9.6, 10.2, 8.8, 9.8)
What are the null and alternative hypotheses?
What do you conclude?
What would have happened if we had completed a two-sided test?

Comparison of one-sided and two-sided tests
For a symmetric test statistic, the two-sided p-value is twice the one-sided p-value in the direction of the observed effect.
Two-sided test is more conservative because the rejection region is split between the high and low side. For the one-sided test, the rejection region is only on the side of interest.
Two-sided test is most common in the literature even though people usually know the direction of effect they are interested in detecting.
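The relationship between the one-sided and two-sided p-values can be sketched with the z statistic from the pregnant teen example. The variable names are illustrative; the point is that the factor of two applies to the one-sided test in the direction the data went.

```r
# One-sided versus two-sided p-values for the same z statistic
x_bar <- 9.74
mu0 <- 8.5
sigma <- 2.6
n <- 35
z <- (x_bar - mu0) / (sigma / sqrt(n))

p_upper <- pnorm(z, lower.tail = FALSE)  # HA: mean is greater than 8.5
p_lower <- pnorm(z)                      # HA: mean is less than 8.5
p_two_sided <- 2 * pnorm(-abs(z))        # HA: mean is not equal to 8.5

round(c(upper = p_upper, lower = p_lower, two_sided = p_two_sided), 4)
```

Here the sample mean is above 8.5, so the two-sided p-value is twice `p_upper`, while the one-sided test in the opposite direction gives a p-value near 1.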

Wait a minute
Up to now, we have assumed we know the population variance (is this a good assumption?)
How could we estimate the population variance?
  – Sample variance!!!
  – Is the sample variance exactly equal to population variance?
  – How can we account for the additional uncertainty?
Now, we need to do a little math
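As a small illustration of the sample variance, the sketch below computes it by hand and with R's built-in `var`, using the mile times from the practice slide as example data. The divisor n - 1 is the usual definition of the sample variance; the variable names are illustrative.

```r
# Sample variance and standard deviation for the mile times
mom <- c(9.5, 10, 8.75, 9, 11.2, 8.65, 9.6, 10.2, 8.8, 9.8)
n <- length(mom)

# By hand: squared deviations from the sample mean, divided by n - 1
s2_manual <- sum((mom - mean(mom))^2) / (n - 1)

# Built-in equivalents
s2_builtin <- var(mom)
s <- sd(mom)   # sample standard deviation, the square root of the variance

c(variance = s2_builtin, std_dev = s)
```

A different set of 10 runs would give a different `s`, which is the extra uncertainty that the t-distribution is meant to account for.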

t-distribution
Assume $X_i$ are iid normal
Normal distribution
Chi-square distribution (proof of this is given in Casella and Berger and in Inference I)
t-distribution: ratio of a standard normal (U) to the square root of a chi-square (V) divided by its degrees of freedom
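The formulas on this slide did not survive in the transcript. A standard way to write the construction, assuming $X_1, \dots, X_n$ are iid $N(\mu, \sigma^2)$ with sample mean $\bar{X}$ and sample variance $s^2$, is sketched below; the labels U and V follow the slide, and the rest is the textbook form rather than a reproduction of the original slide.

```latex
% Assumes X_1, ..., X_n iid N(\mu, \sigma^2); U and V are independent for normal samples
\begin{align*}
U &= \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \\
V &= \frac{(n-1)\,s^2}{\sigma^2} \sim \chi^2_{n-1} \\
T &= \frac{U}{\sqrt{V/(n-1)}} = \frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{n-1}
\end{align*}
```

The unknown $\sigma$ cancels in the ratio, so $T$ can be computed from the sample alone, at the cost of the heavier-tailed $t_{n-1}$ reference distribution.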

t-distribution
Heavier tails than normal distribution
  – Accounts for additional variability
  – Tails heavier with fewer degrees of freedom (dof)
As dof goes to infinity, the t distribution approaches the normal distribution
Can use the t-distribution test statistic just as the previous one
Remember the assumption of an underlying normal distribution
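The heavier tails show up directly in the critical values. The sketch below compares the two-sided 5% cut-off of the t-distribution for several degrees of freedom with the normal cut-off; the particular degrees of freedom are arbitrary choices, apart from 34, which matches a sample of 35.

```r
# Two-sided 5% critical values: t-distribution versus normal
dof <- c(2, 5, 10, 34, 100, 1000)

t_crit <- qt(0.975, df = dof)   # upper 2.5% point of each t-distribution
z_crit <- qnorm(0.975)          # upper 2.5% point of the standard normal

data.frame(dof = dof,
           t_crit = round(t_crit, 3),
           z_crit = round(z_crit, 3))
```

The t cut-offs are always larger than the normal one and shrink toward it as the degrees of freedom grow.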

Example
We can use a t-test to test the second null hypothesis about our pregnant teens, namely that the time from the first visit to the second visit is the same as in the general population.
First, we need to ensure that the underlying distribution is approximately normal.

Steps for hypothesis testing
1) Null: $\mu = 4.3$ weeks, Alternative: $\mu \neq 4.3$ weeks
2) One sample hypothesis t-test, alpha=0.05
3) Test statistic: $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ with $n - 1 = 34$ degrees of freedom
4) p-value = [value not shown]
5) Reject null
6) Conclusion: There is a difference in the amount of time from the first visit to the second visit. The time is longer for the pregnant teens.

One sample t-test in R
To complete a t-test in R, use
    > t.test(second, mu=4.3)
The output, headed "One Sample t-test", reports the data used (second), the t statistic, the degrees of freedom (df = 34), the two-sided p-value, the alternative hypothesis (true mean is not equal to the null value), a confidence interval for the mean (95 percent by default), and the sample estimate (mean of x).
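To show what `t.test` computes, the sketch below works through a one-sample t-test by hand and compares it with the built-in function. Because the `preg.dat` values are not reproduced in this transcript, it uses the mile times from the practice slide with a null mean of 9 minutes as stand-in data; the variable names are illustrative.

```r
# One-sample t-test by hand, checked against t.test()
mom <- c(9.5, 10, 8.75, 9, 11.2, 8.65, 9.6, 10.2, 8.8, 9.8)
mu0 <- 9               # null value: a 9:00 mile on average
n <- length(mom)

# Test statistic uses the sample standard deviation in place of sigma
t_manual <- (mean(mom) - mu0) / (sd(mom) / sqrt(n))

# Two-sided p-value from the t-distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

# Built-in version (two-sided by default)
result <- t.test(mom, mu = mu0)

c(t_manual = t_manual, p_manual = p_manual)
result
```

For a one-sided test, `t.test` accepts `alternative = "greater"` or `alternative = "less"`.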

Practice
Using the class data set, test the following hypotheses:
  – The average age of an incoming student to the biostat program is 25. Is the mean age of this year’s class significantly different? Is there anything we need to consider in this analysis?
  – The average height of an incoming student is 71 inches. Is the mean height of this year’s class significantly shorter?

More practice
The TV watching habits of my seventh grade classes are shown in the dataset TV.dat from the course website. The gender and age of the students are given as well. How did my students’ TV watching habits compare to the national average for 7th graders of 4 hours/day? Use an alpha level of 0.01.