Presentation on theme: "Hypothesis Testing: Intervals and Tests"— Presentation transcript:
1 Hypothesis Testing: Intervals and Tests. STAT 101, Dr. Kari Lock Morgan, 10/2/12. Sections 4.3, 4.4, 4.5: Type I and II errors (4.3); more randomization distributions (4.4); connecting intervals and tests (4.5).
2 Proposals: Project 1 proposal comments. Give spreadsheet with data in correct format. Cases and variables.
3 Reminders: Highest scorer on correlation guessing game gets an extra point on Exam 1! Deadline: noon on Thursday, 10/11. First student to get a red card gets an extra point on Exam 1!
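4 Errors: There are four possibilities, from crossing the truth (H0 true or H0 false) with the decision (reject H0 or do not reject H0). Rejecting H0 when H0 is true is a TYPE I ERROR; not rejecting H0 when H0 is false is a TYPE II ERROR. The other two combinations are correct decisions. A Type I Error is rejecting a true null. A Type II Error is not rejecting a false null.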
5 Red Wine and Weight Loss: In the test to see if resveratrol is associated with food intake, the p-value is [value not shown]. If resveratrol is not associated with food intake, a Type I Error would have been made. In the test to see if resveratrol is associated with locomotor activity, the p-value is [value not shown]. If resveratrol is associated with locomotor activity, a Type II Error would have been made.
6 Analogy to Law: A person is innocent (H0) until proven guilty (Ha). Evidence (the p-value from data) must be beyond the shadow of a doubt. Types of mistakes in a verdict? Convicting an innocent person is a Type I error; releasing a guilty person is a Type II error.
7 Probability of Type I Error: When the null is true, the probability of making a Type I error (rejecting a true null) is the significance level, α. α should be chosen depending on how bad it is to make a Type I error.
8 Probability of Type I Error: Distribution of statistics, assuming H0 true. If the null hypothesis is true: 5% of statistics will be in the most extreme 5%; 5% of statistics will give p-values less than 0.05; 5% of statistics will lead to rejecting H0 at α = 0.05. If α = 0.05, there is a 5% chance of a Type I error.
9 Probability of Type I Error: Distribution of statistics, assuming H0 true. If the null hypothesis is true: 1% of statistics will be in the most extreme 1%; 1% of statistics will give p-values less than 0.01; 1% of statistics will lead to rejecting H0 at α = 0.01. If α = 0.01, there is a 1% chance of a Type I error.
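The claim that α is the chance of a Type I error can be checked by simulation. The sketch below is only an illustration under stated assumptions: the population (normal, mean 12, standard deviation 3), the sample size, and the use of a one-sided z-test in place of a randomization test are all choices made here, not taken from the slides.

```python
import random
from statistics import NormalDist, mean

def type_one_error_rate(alpha, n=45, mu0=12.0, sigma=3.0, trials=4000, seed=1):
    """Fraction of simulated tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    std_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population in which H0 (mu = mu0) really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        # One-sided z-test p-value for Ha: mu < mu0 (sigma treated as known).
        p_value = NormalDist().cdf((mean(sample) - mu0) / std_error)
        if p_value < alpha:
            rejections += 1  # rejecting a true null: a Type I error
    return rejections / trials

rate_05 = type_one_error_rate(0.05)
rate_01 = type_one_error_rate(0.01)
print("Type I error rate at alpha = 0.05:", rate_05)
print("Type I error rate at alpha = 0.01:", rate_01)
```

Both printed rates should land close to the α that was used, which is the point of slides 8 and 9.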
10 Probability of Type II Error: The probability of making a Type II Error (not rejecting a false null) depends on: effect size (how far the truth is from the null); sample size; variability; significance level.
11 Choosing α: By default, usually α = 0.05. If a Type I error (rejecting a true null) is much worse than a Type II error, we may choose a smaller α, like α = 0.01. If a Type II error (not rejecting a false null) is much worse than a Type I error, we may choose a larger α, like α = 0.10.
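A small simulation can show how each of the four factors moves the Type II error rate. This is a sketch under assumptions made up for illustration: a normal population, a null value of 12, a one-sided z-test with known standard deviation, and the particular true means, sample sizes, and standard deviations passed in below.

```python
import random
from statistics import NormalDist, mean

def type_two_error_rate(true_mu, n, alpha=0.05, mu0=12.0, sigma=3.0,
                        trials=2000, seed=2):
    """Fraction of simulated tests that fail to reject a false H0: mu = mu0."""
    rng = random.Random(seed)
    std_error = sigma / n ** 0.5
    misses = 0
    for _ in range(trials):
        # The truth is true_mu < mu0, so H0 is false in every simulated sample.
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        p_value = NormalDist().cdf((mean(sample) - mu0) / std_error)
        if p_value >= alpha:
            misses += 1  # not rejecting a false null: a Type II error
    return misses / trials

baseline = type_two_error_rate(true_mu=11.5, n=45)
bigger_effect = type_two_error_rate(true_mu=10.5, n=45)
bigger_sample = type_two_error_rate(true_mu=11.5, n=180)
more_variability = type_two_error_rate(true_mu=11.5, n=45, sigma=6.0)
smaller_alpha = type_two_error_rate(true_mu=11.5, n=45, alpha=0.01)

print("baseline:", baseline)
print("bigger effect:", bigger_effect)
print("bigger sample:", bigger_sample)
print("more variability:", more_variability)
print("smaller alpha:", smaller_alpha)
```

In this setup a bigger effect or a bigger sample should lower the Type II error rate, while more variability or a smaller α should raise it.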
12 Significance Level: Come up with a hypothesis testing situation in which you may want to… use a smaller significance level, like α = 0.01; use a larger significance level, like α = 0.10.
13 Randomization Distributions: p-values can be calculated by randomization distributions: simulate samples, assuming H0 is true; calculate the statistic of interest for each sample; find the p-value as the proportion of simulated statistics as extreme as the observed statistic. Today we’ll see ways to simulate randomization samples for more situations.
14 Randomization Distribution: In a hypothesis test for H0: μ = 12 vs Ha: μ < 12, we have a sample with n = 45 and x̄ = 10.2. What do we require about the method to produce randomization samples? Answer choices: μ = 12; μ < 12; x̄ = 10.2. Answer: μ = 12. We need to generate randomization samples assuming the null hypothesis is true.
15 Randomization Distribution: In a hypothesis test for H0: μ = 12 vs Ha: μ < 12, we have a sample with n = 45 and x̄ = 10.2. Where will the randomization distribution be centered? Answer choices: 10.2; 12; 45; 1.8. Answer: 12. Randomization distributions are always centered around the null hypothesized value.
16 Randomization Distribution Center: A randomization distribution simulates samples assuming the null hypothesis is true, so a randomization distribution is centered at the value of the parameter given in the null hypothesis.
17 Randomization Distribution: In a hypothesis test for H0: μ = 12 vs Ha: μ < 12, we have a sample with n = 45 and x̄ = 10.2. What will we look for on the randomization distribution? Answer choices: how extreme 10.2 is; how extreme 12 is; how extreme 45 is; what the standard error is; how many randomization samples we collected. Answer: how extreme 10.2 is. We want to see how extreme the observed statistic is.
18 Randomization Distribution: In a hypothesis test for H0: μ1 = μ2 vs Ha: μ1 > μ2, we have a sample with x̄1 = 26 and x̄2 = 21. What do we require about the method to produce randomization samples? Answer choices: μ1 = μ2; μ1 > μ2; x̄1 = 26, x̄2 = 21; x̄1 − x̄2 = 5. Answer: μ1 = μ2. We need to generate randomization samples assuming the null hypothesis is true.
19 Randomization Distribution: In a hypothesis test for H0: μ1 = μ2 vs Ha: μ1 > μ2, we have a sample with x̄1 = 26 and x̄2 = 21. Where will the randomization distribution be centered? The randomization distribution is centered around the null hypothesized value, μ1 − μ2 = 0.
20 Randomization Distribution: In a hypothesis test for H0: μ1 = μ2 vs Ha: μ1 > μ2, we have a sample with x̄1 = 26 and x̄2 = 21. What do we look for on the randomization distribution? Answer choices: the standard error; the center point; how extreme 26 is; how extreme 21 is; how extreme 5 is. Answer: how extreme 5 is. We want to see how extreme the observed difference in means is.
21 Randomization Distribution: For a randomization distribution, each simulated sample should… be consistent with the null hypothesis; use the data in the observed sample; reflect the way the data were collected.
22 Randomized Experiments: In randomized experiments the “randomness” is the random allocation to treatment groups. If the null hypothesis is true, the response values would be the same, regardless of treatment group assignment. To simulate what would happen just by random chance, if H0 were true: reallocate cases to treatment groups, keeping the response values the same.
23 Observational Studies: In observational studies, the “randomness” is random sampling from the population. To simulate what would happen, just by random chance, if H0 were true: simulate resampling from a population in which H0 is true. How do we simulate resampling from a population when we only have sample data? Bootstrap! How can we generate randomization samples for observational studies? Make H0 true, then bootstrap!
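Reallocation for a randomized experiment can be sketched in a few lines. The response values below are hypothetical, invented only so that the group means come out to 26 and 21; the function name and the number of repetitions are likewise illustrative choices.

```python
import random
from statistics import mean

def reallocation_p_value(treatment, control, reps=5000, seed=3):
    """Upper-tail p-value for Ha: mu1 > mu2 by reallocating cases to groups."""
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)  # response values stay fixed
    n_treat = len(treatment)
    as_extreme = 0
    for _ in range(reps):
        # Randomly reassign the same responses to the two groups.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treat]) - mean(pooled[n_treat:])
        if diff >= observed:
            as_extreme += 1
    return observed, as_extreme / reps

# Hypothetical responses (not from the slides); means are 26 and 21.
treatment = [27, 31, 22, 29, 24, 30, 25, 20]
control = [21, 18, 25, 19, 23, 17, 24, 21]

observed_diff, p_value = reallocation_p_value(treatment, control)
print("observed difference:", observed_diff)
print("p-value:", p_value)
```

Because only the group labels are shuffled, every simulated sample uses the observed data, is consistent with H0, and mirrors the random allocation used in the experiment.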
24 Body Temperatures: μ = average human body temperature. H0: μ = 98.6, Ha: μ ≠ 98.6. x̄ = 98.26. We can make the null true just by adding 98.6 − 98.26 = 0.34 to each value, to make the mean be 98.6. Bootstrapping from this revised sample lets us simulate samples, assuming H0 is true!
25 Body Temperatures: In StatKey, when we enter the null hypothesis, this shifting is automatically done for us. StatKey p-value = 0.002.
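The "shift, then bootstrap" idea for a single mean might look like the following. The 20 temperatures are hypothetical values chosen so that their mean is 98.26; they are not the course data set, so the resulting p-value will not match the StatKey output.

```python
import random
from statistics import mean

def shifted_bootstrap_p_value(sample, null_mean, reps=5000, seed=4):
    """Two-sided p-value for H0: mu = null_mean via shift-then-bootstrap."""
    rng = random.Random(seed)
    observed = mean(sample)
    shift = null_mean - observed
    # Make H0 true: move every value so the sample mean equals null_mean.
    shifted = [x + shift for x in sample]
    observed_distance = abs(observed - null_mean)
    as_extreme = 0
    for _ in range(reps):
        # Bootstrap: resample with replacement from the shifted sample.
        resample = rng.choices(shifted, k=len(shifted))
        if abs(mean(resample) - null_mean) >= observed_distance:
            as_extreme += 1
    return shift, as_extreme / reps

# Hypothetical body temperatures (assumed for illustration); mean is 98.26.
temps = [98.0, 98.4, 97.9, 98.8, 98.2, 98.6, 97.6, 98.3, 98.1, 98.7,
         98.4, 97.8, 98.5, 98.0, 98.9, 98.2, 98.3, 97.7, 98.6, 98.2]

shift, p_value = shifted_bootstrap_p_value(temps, null_mean=98.6)
print("amount added to each value:", round(shift, 2))
print("two-sided p-value:", p_value)
```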
26 Creating Randomization Samples: Do males exercise more hours per week than females? x̄m − x̄f = 3. Is blood pressure negatively correlated with heart rate? r = −0.057. State null and alternative hypotheses. Devise a way to generate a randomization sample that: uses the observed sample data; makes the null hypothesis true; reflects the way the data were collected. Ask them to share ideas. There are many possible answers. If they have computers in class you can also ask them to use StatKey to create a randomization distribution, find the p-value, and interpret in context.
27 Exercise and Gender: H0: μm = μf, Ha: μm > μf. To make H0 true, we must make the means equal. One way to do this is to add 3 to every female value (there are other ways). Bootstrap from this modified sample. In StatKey, the default randomization method is “reallocate groups”, but “Shift Groups” is also an option, and will do this.
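One way to read "add 3 to every female value, then bootstrap" is sketched below. The exercise hours are hypothetical numbers picked so that the observed difference in means is 3, and the function is an illustration of the idea rather than StatKey's actual "Shift Groups" implementation.

```python
import random
from statistics import mean

def shift_groups_p_value(males, females, reps=5000, seed=5):
    """Upper-tail p-value for Ha: mu_m > mu_f by shifting, then bootstrapping."""
    rng = random.Random(seed)
    observed = mean(males) - mean(females)
    # Make H0 true: add the observed difference to every female value.
    shifted_females = [y + observed for y in females]
    as_extreme = 0
    for _ in range(reps):
        # Resample each group with replacement, keeping the group sizes.
        m = rng.choices(males, k=len(males))
        f = rng.choices(shifted_females, k=len(females))
        if mean(m) - mean(f) >= observed:
            as_extreme += 1
    return observed, as_extreme / reps

# Hypothetical hours of exercise per week (assumed); means are 9 and 6.
males = [12, 5, 9, 14, 3, 10, 7, 16, 8, 6]
females = [6, 2, 9, 4, 11, 5, 7, 3, 8, 5]

observed_diff, p_value = shift_groups_p_value(males, females)
print("observed difference:", observed_diff)
print("p-value:", p_value)
```

Resampling within each group, rather than reallocating, reflects that gender was observed and not assigned.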
29 Exercise and Gender: The p-value is [value not shown]. Using α = 0.05, we conclude… Answer choices: males exercise more than females, on average; males do not exercise more than females, on average; nothing. Answer: nothing. Do not reject the null… we can’t conclude anything.
30 Blood Pressure and Heart Rate: H0: ρ = 0, Ha: ρ < 0. Two variables have correlation 0 if they are not associated. We can “break the association” by randomly permuting/scrambling/shuffling one of the variables. Each time we do this, we get a sample we might observe just by random chance, if there really is no correlation.
31 Blood Pressure and Heart Rate: Even if blood pressure and heart rate are not correlated, we would see correlations this extreme about 22% of the time, just by random chance. p-value = 0.219.
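Shuffling one variable to break the association can be sketched as follows. The ten blood pressure and heart rate pairs are hypothetical, so the correlation and p-value here will differ from the course data; `pearson_r` is a helper written for this example.

```python
import random
from statistics import mean

def pearson_r(xs, ys):
    """Sample correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(xs, ys, reps=5000, seed=6):
    """Lower-tail p-value for Ha: rho < 0 by shuffling one variable."""
    rng = random.Random(seed)
    observed = pearson_r(xs, ys)
    shuffled = list(ys)
    as_extreme = 0
    for _ in range(reps):
        # Shuffling ys breaks any association with xs, so H0 (rho = 0) holds.
        rng.shuffle(shuffled)
        if pearson_r(xs, shuffled) <= observed:
            as_extreme += 1
    return observed, as_extreme / reps

# Hypothetical paired measurements (assumed for illustration only).
blood_pressure = [120, 135, 110, 142, 128, 118, 150, 125, 132, 115]
heart_rate = [72, 80, 75, 70, 85, 68, 74, 78, 66, 82]

r_observed, p_value = permutation_p_value(blood_pressure, heart_rate)
print("observed correlation:", round(r_observed, 3))
print("p-value:", p_value)
```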
32 Randomization Distribution: Paul the Octopus (single proportion): flip a coin 8 times. Cocaine Addiction (randomized experiment): rerandomize cases to treatment groups, keeping response values fixed. Body Temperature (single mean): shift to make H0 true, then bootstrap. Exercise and Gender (observational study): shift to make H0 true, then bootstrap. Blood Pressure and Heart Rate (correlation): randomly permute/scramble/shuffle one variable.
33 Randomization Distributions: Randomization samples should be generated: consistent with the null hypothesis; using the observed data; reflecting the way the data were collected. The specific method varies with the situation, but the general idea is always the same.
34 Generating Randomization Samples: As long as the original data is used and the null hypothesis is true for the randomization samples, most methods usually give similar answers in terms of a p-value. StatKey generates the randomizations for you, so what matters most is not understanding how to generate randomization samples, but understanding why.
35 Bootstrap and Randomization Distributions: Bootstrap distribution: our best guess at the distribution of sample statistics; centered around the observed sample statistic; simulate sampling from the population by resampling from the original sample. Randomization distribution: our best guess at the distribution of sample statistics, if H0 were true; centered around the null hypothesized value; simulate samples assuming H0 were true. Big difference: a randomization distribution assumes H0 is true, while a bootstrap distribution does not.
36 Which Distribution? Let μ be the average amount of sleep college students get per night. Data was collected on a sample of students, and for this sample x̄ = 6.7 hours. A bootstrap distribution is generated to create a confidence interval for μ, and a randomization distribution is generated to see if the data provide evidence that μ > 7. Which distribution below is the bootstrap distribution? Answer: (a), which is centered around the sample statistic, 6.7.
37 Which Distribution? Intro stat students are surveyed, and we find that 152 out of 218 are female. Let p be the proportion of intro stat students at that university who are female. A bootstrap distribution is generated for a confidence interval for p, and a randomization distribution is generated to see if the data provide evidence that p > 1/2. Which distribution is the randomization distribution? Answer: (a), which is centered around the null value, 1/2.
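The difference in centers can be seen by simulating both distributions for the 152-out-of-218 example. This is a sketch: resampling with replacement from the sample is represented here as independent draws with success probability 152/218, which is equivalent for a single proportion, and the number of repetitions and the seed are arbitrary choices.

```python
import random
from statistics import mean

def simulated_proportions(success_prob, n, reps, rng):
    """Sample proportions from `reps` simulated samples of size n."""
    return [sum(rng.random() < success_prob for _ in range(n)) / n
            for _ in range(reps)]

rng = random.Random(7)
n, successes = 218, 152
p_hat = successes / n

# Bootstrap: resample from the observed sample (no assumption about H0).
bootstrap = simulated_proportions(p_hat, n, reps=2000, rng=rng)
# Randomization: simulate samples assuming H0: p = 1/2 is true.
randomization = simulated_proportions(0.5, n, reps=2000, rng=rng)

bootstrap_center = mean(bootstrap)
randomization_center = mean(randomization)
# p-value for Ha: p > 1/2 from the randomization distribution.
p_value = sum(stat >= p_hat for stat in randomization) / len(randomization)

print("bootstrap center:", round(bootstrap_center, 3))
print("randomization center:", round(randomization_center, 3))
print("p-value:", p_value)
```

The bootstrap center should sit near the sample proportion and the randomization center near 1/2, which is how the two plots on the slide can be told apart.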
38 Summary: There are two types of errors: rejecting a true null (Type I) and not rejecting a false null (Type II). Randomization samples should be generated: consistent with the null hypothesis; using the observed data; reflecting the way the data were collected.