Hypothesis Testing: Intervals and Tests


Hypothesis Testing: Intervals and Tests
STAT 101, Dr. Kari Lock Morgan, 10/2/12
Sections 4.3, 4.4, 4.5:
  Type I and II errors (4.3)
  More randomization distributions (4.4)
  Connecting intervals and tests (4.5)

Proposals
Project 1 proposal comments:
  Give a spreadsheet with the data in the correct format
  Cases and variables

Reminders
Highest scorer on the correlation guessing game gets an extra point on Exam 1! Deadline: noon on Thursday, 10/11.
First student to get a red card gets an extra point on Exam 1!

Errors
There are four possibilities:
                     Decision
  Truth              Reject H0         Do not reject H0
  H0 true            TYPE I ERROR      correct
  H0 false           correct           TYPE II ERROR
A Type I Error is rejecting a true null.
A Type II Error is not rejecting a false null.

Red Wine and Weight Loss
In the test to see if resveratrol is associated with food intake, the p-value is 0.035, so H0 is rejected at the usual α = 0.05. If resveratrol is in fact not associated with food intake, a Type I Error would have been made.
In the test to see if resveratrol is associated with locomotor activity, the p-value is 0.980, so H0 is not rejected. If resveratrol is in fact associated with locomotor activity, a Type II Error would have been made.

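Analogy to Law
H0: the person is innocent. Ha: the person is guilty.
A person is innocent until proven guilty, just as H0 is assumed true until the data show otherwise.
Evidence must be beyond the shadow of a doubt; in a test, the evidence is the p-value from the data.
Types of mistakes in a verdict:
  Convict an innocent person: Type I error
  Release a guilty person: Type II error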

Probability of Type I Error
The probability of making a Type I error (rejecting a true null) is the significance level, α.
α should be chosen depending on how bad it is to make a Type I error.

Probability of Type I Error
Distribution of statistics, assuming H0 true. If the null hypothesis is true:
  5% of statistics will be in the most extreme 5%
  5% of statistics will give p-values less than 0.05
  5% of statistics will lead to rejecting H0 at α = 0.05
If α = 0.05, there is a 5% chance of a Type I error when H0 is true.

Probability of Type I Error
Distribution of statistics, assuming H0 true. If the null hypothesis is true:
  1% of statistics will be in the most extreme 1%
  1% of statistics will give p-values less than 0.01
  1% of statistics will lead to rejecting H0 at α = 0.01
If α = 0.01, there is a 1% chance of a Type I error when H0 is true.
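
The claim that the Type I error rate equals α can be checked by simulation. The sketch below is only an illustration under stated assumptions: it swaps the slides' randomization test for a simple two-sided z-test with known σ = 1, and the function name `type_one_error_rate`, the sample size, and the number of simulations are all made up for the example.

```python
import random
from statistics import NormalDist, fmean

def type_one_error_rate(alpha, n=30, n_sims=4000, seed=0):
    """Fraction of simulated samples that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    std_normal = NormalDist()  # standard normal, used for the two-sided p-value
    rejections = 0
    for _ in range(n_sims):
        # H0: mu = 0 is true here, because every sample is drawn from N(0, 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) * n ** 0.5  # z statistic, assuming known sigma = 1
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1  # rejecting a true null: a Type I error
    return rejections / n_sims

rate_05 = type_one_error_rate(0.05)
rate_01 = type_one_error_rate(0.01)
print(f"alpha = 0.05: Type I error rate ~ {rate_05:.3f}")
print(f"alpha = 0.01: Type I error rate ~ {rate_01:.3f}")
```

Both printed rates should land close to the chosen α, since every rejection in this simulation is a rejection of a true null.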

Probability of Type II Error
The probability of making a Type II Error (not rejecting a false null) depends on:
  Effect size (how far the truth is from the null)
  Sample size
  Variability
  Significance level
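
The four factors listed above can be explored with a small simulation. This is a hedged sketch, not the course's method: it assumes a two-sided z-test of H0: μ = 0 with known σ, and the function `type_two_error_rate` and every numeric setting below are invented for illustration.

```python
import random
from statistics import NormalDist, fmean

def type_two_error_rate(effect, n, sigma=1.0, alpha=0.05, n_sims=2000, seed=1):
    """Fraction of simulated samples that fail to reject a false H0: mu = 0."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    misses = 0
    for _ in range(n_sims):
        # The truth is mu = effect, so H0: mu = 0 is false whenever effect != 0.
        sample = [rng.gauss(effect, sigma) for _ in range(n)]
        z = fmean(sample) / (sigma / n ** 0.5)
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value >= alpha:
            misses += 1  # not rejecting a false null: a Type II error
    return misses / n_sims

baseline = type_two_error_rate(effect=0.2, n=20)
bigger_effect = type_two_error_rate(effect=0.8, n=20)
bigger_sample = type_two_error_rate(effect=0.2, n=80)
more_variability = type_two_error_rate(effect=0.2, n=80, sigma=2.0)
stricter_alpha = type_two_error_rate(effect=0.2, n=20, alpha=0.01)
print(baseline, bigger_effect, bigger_sample, more_variability, stricter_alpha)
```

In this setup a larger effect or a larger sample lowers the Type II error rate, while more variability or a smaller α raises it.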

Choosing α
By default, usually α = 0.05.
If a Type I error (rejecting a true null) is much worse than a Type II error, we may choose a smaller α, like α = 0.01.
If a Type II error (not rejecting a false null) is much worse than a Type I error, we may choose a larger α, like α = 0.10.

Significance Level
Come up with a hypothesis testing situation in which you may want to:
  Use a smaller significance level, like α = 0.01
  Use a larger significance level, like α = 0.10

Randomization Distributions
p-values can be calculated by randomization distributions:
  simulate samples, assuming H0 is true
  calculate the statistic of interest for each sample
  find the p-value as the proportion of simulated statistics at least as extreme as the observed statistic
Today we'll see ways to simulate randomization samples for more situations.
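
The three steps above can be sketched in a few lines. The example uses the Paul the Octopus setting that these slides summarize elsewhere as "flip a coin 8 times": 8 correct predictions out of 8, tested against H0: p = 0.5. The helper name `randomization_p_value` and the number of simulations are assumptions made for this illustration.

```python
import random

def randomization_p_value(simulated_stats, observed, tail="right"):
    """Proportion of simulated statistics at least as extreme as the observed one."""
    if tail == "right":
        extreme = [s for s in simulated_stats if s >= observed]
    elif tail == "left":
        extreme = [s for s in simulated_stats if s <= observed]
    else:
        raise ValueError("tail must be 'right' or 'left'")
    return len(extreme) / len(simulated_stats)

rng = random.Random(2)
n_games, observed_correct = 8, 8

# Step 1 and 2: simulate samples assuming H0: p = 0.5 (each guess is a fair
# coin flip) and record the statistic, the number of correct guesses.
simulated = [
    sum(rng.random() < 0.5 for _ in range(n_games))
    for _ in range(20000)
]

# Step 3: proportion of simulated statistics at least as extreme as observed.
p_value = randomization_p_value(simulated, observed_correct)
print(f"p-value ~ {p_value:.4f}")
```

For comparison, the exact probability of 8 correct guesses in 8 fair coin flips is (1/2)^8, about 0.0039, so the simulated p-value should be close to that.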

Randomization Distribution
In a hypothesis test for H0: μ = 12 vs Ha: μ < 12, we have a sample with n = 45 and x̄ = 10.2.
What do we require about the method to produce randomization samples?
  μ = 12
  μ < 12
  x̄ = 10.2
Answer: μ = 12. We need to generate randomization samples assuming the null hypothesis is true.

Randomization Distribution
In a hypothesis test for H0: μ = 12 vs Ha: μ < 12, we have a sample with n = 45 and x̄ = 10.2.
Where will the randomization distribution be centered?
  10.2
  12
  45
  1.8
Answer: 12. Randomization distributions are always centered around the null hypothesized value.

Randomization Distribution Center
A randomization distribution simulates samples assuming the null hypothesis is true, so it is centered at the value of the parameter given in the null hypothesis.

Randomization Distribution
In a hypothesis test for H0: μ = 12 vs Ha: μ < 12, we have a sample with n = 45 and x̄ = 10.2.
What will we look for on the randomization distribution?
  How extreme 10.2 is
  How extreme 12 is
  How extreme 45 is
  What the standard error is
  How many randomization samples we collected
Answer: how extreme 10.2 is. We want to see how extreme the observed statistic is.

Randomization Distribution
In a hypothesis test for H0: μ1 = μ2 vs Ha: μ1 > μ2, we have a sample with x̄1 = 26 and x̄2 = 21.
What do we require about the method to produce randomization samples?
  μ1 = μ2
  μ1 > μ2
  x̄1 = 26, x̄2 = 21
  x̄1 − x̄2 = 5
Answer: μ1 = μ2. We need to generate randomization samples assuming the null hypothesis is true.

Randomization Distribution
In a hypothesis test for H0: μ1 = μ2 vs Ha: μ1 > μ2, we have a sample with x̄1 = 26 and x̄2 = 21.
Where will the randomization distribution be centered?
  0
  1
  21
  26
  5
Answer: 0. The randomization distribution is centered around the null hypothesized value, μ1 − μ2 = 0.

Randomization Distribution
In a hypothesis test for H0: μ1 = μ2 vs Ha: μ1 > μ2, we have a sample with x̄1 = 26 and x̄2 = 21.
What do we look for on the randomization distribution?
  The standard error
  The center point
  How extreme 26 is
  How extreme 21 is
  How extreme 5 is
Answer: how extreme 5 is. We want to see how extreme the observed difference in means is.

Randomization Distribution
For a randomization distribution, each simulated sample should:
  be consistent with the null hypothesis
  use the data in the observed sample
  reflect the way the data were collected

Randomized Experiments
In randomized experiments, the "randomness" is the random allocation to treatment groups.
If the null hypothesis is true, the response values would be the same, regardless of treatment group assignment.
To simulate what would happen just by random chance, if H0 were true: reallocate cases to treatment groups, keeping the response values the same.
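
Reallocation can be sketched as shuffling the pooled response values and splitting them back into groups of the original sizes. The response values below are hypothetical, invented only for illustration, and `reallocate` is a made-up helper name.

```python
import random
from statistics import fmean

def reallocate(responses, n_treatment, rng):
    """Randomly reassign cases to two groups, keeping response values fixed."""
    shuffled = responses[:]
    rng.shuffle(shuffled)
    return shuffled[:n_treatment], shuffled[n_treatment:]

# Hypothetical responses, invented for illustration only.
treatment = [12, 15, 14, 16, 13, 17, 15, 14]
control = [10, 11, 9, 12, 10, 11, 13, 10]
observed_diff = fmean(treatment) - fmean(control)

rng = random.Random(3)
pooled = treatment + control
simulated_diffs = []
for _ in range(5000):
    new_treatment, new_control = reallocate(pooled, len(treatment), rng)
    simulated_diffs.append(fmean(new_treatment) - fmean(new_control))

# Right-tailed p-value for Ha: treatment mean > control mean.
p_value = sum(d >= observed_diff for d in simulated_diffs) / len(simulated_diffs)
print(f"observed difference = {observed_diff:.2f}, p-value ~ {p_value:.4f}")
```

Because only the group labels change, every simulated sample uses exactly the observed response values, which is what the null hypothesis of "no treatment effect" implies.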

Observational Studies
In observational studies, the "randomness" is random sampling from the population.
To simulate what would happen just by random chance, if H0 were true: simulate resampling from a population in which H0 is true.
How do we simulate resampling from a population when we only have sample data? Bootstrap!
How can we generate randomization samples for observational studies? Make H0 true, then bootstrap!

Body Temperatures
μ = average human body temperature
H0: μ = 98.6
Ha: μ ≠ 98.6
x̄ = 98.26
We can make the null true just by adding 98.6 − 98.26 = 0.34 to each value, so that the mean becomes 98.6.
Bootstrapping from this revised sample lets us simulate samples, assuming H0 is true!
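
A minimal sketch of "shift, then bootstrap" for a single mean follows. The ten temperatures are hypothetical, chosen only so that their mean matches the 98.26 quoted above; they are not the actual data, so the resulting p-value is not expected to match any value reported for the real sample.

```python
import random
from statistics import fmean

null_mean = 98.6
# Hypothetical sample, invented so that its mean matches the slide's 98.26;
# these are NOT the values from the actual data set.
sample = [98.0, 98.4, 97.9, 98.6, 98.2, 98.3, 98.1, 98.5, 98.4, 98.2]
observed_mean = fmean(sample)

# Make H0 true: shift every value so the sample mean equals the null value.
shift = null_mean - observed_mean
shifted = [x + shift for x in sample]

rng = random.Random(4)
simulated_means = [
    fmean(rng.choices(shifted, k=len(shifted)))  # resample with replacement
    for _ in range(5000)
]

# Two-sided p-value for Ha: mu != 98.6.
observed_gap = abs(observed_mean - null_mean)
p_value = sum(abs(m - null_mean) >= observed_gap for m in simulated_means) / len(simulated_means)
print(f"shift = {shift:.2f}, p-value ~ {p_value:.4f}")
```

The shift changes only the center of the sample, not its spread, so the simulated means are centered at 98.6 while keeping the variability seen in the data.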

Body Temperatures
In StatKey, when we enter the null hypothesis, this shifting is automatically done for us.
StatKey p-value = 0.002

Creating Randomization Samples
Do males exercise more hours per week than females? (x̄m − x̄f = 3)
Is blood pressure negatively correlated with heart rate? (r = −0.057)
For each question:
  State null and alternative hypotheses
  Devise a way to generate a randomization sample that
    uses the observed sample data
    makes the null hypothesis true
    reflects the way the data were collected
Instructor note: ask students to share ideas; there are many possible answers. If they have computers in class, you can also ask them to use StatKey to create a randomization distribution, find the p-value, and interpret it in context.

Exercise and Gender
H0: μm = μf, Ha: μm > μf
To make H0 true, we must make the means equal. One way to do this is to add 3 to every female value (there are other ways).
Bootstrap from this modified sample.
In StatKey, the default randomization method is "Reallocate Groups", but "Shift Groups" is also an option, and will do this.
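
The "add 3 to every female value, then bootstrap" idea might look like the sketch below. The hours are hypothetical, invented so that the difference in sample means is 3; resampling each group separately is one possible choice made for this illustration, not necessarily what StatKey does internally.

```python
import random
from statistics import fmean

# Hypothetical hours of exercise per week, invented so the difference in means is 3.
males = [12, 8, 15, 10, 5, 14, 9, 7]
females = [6, 9, 4, 11, 7, 3, 10, 6]
observed_diff = fmean(males) - fmean(females)

# Make H0 true by adding the observed difference to every female value.
shifted_females = [x + observed_diff for x in females]

def bootstrap_mean(values, rng):
    """Mean of one bootstrap sample (same size, drawn with replacement)."""
    return fmean(rng.choices(values, k=len(values)))

rng = random.Random(5)
simulated_diffs = [
    bootstrap_mean(males, rng) - bootstrap_mean(shifted_females, rng)
    for _ in range(5000)
]

# Right-tailed p-value for Ha: mu_m > mu_f.
p_value = sum(d >= observed_diff for d in simulated_diffs) / len(simulated_diffs)
print(f"observed difference = {observed_diff:.1f}, p-value ~ {p_value:.4f}")
```

After the shift both groups have the same mean, so the simulated differences are centered at 0, the null value.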

Exercise and Gender p-value = 0.095

Exercise and Gender
The p-value is 0.095. Using α = 0.05, we conclude…
  Males exercise more than females, on average
  Males do not exercise more than females, on average
  Nothing
Answer: nothing. We do not reject the null, so we can't conclude anything.

Blood Pressure and Heart Rate
H0: ρ = 0, Ha: ρ < 0
Two variables have correlation 0 if they are not associated.
We can "break the association" by randomly permuting/scrambling/shuffling one of the variables.
Each time we do this, we get a sample we might observe just by random chance, if there really is no correlation.
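
Shuffling one variable to break the association can be sketched as follows. The paired measurements are hypothetical, invented for illustration, and `pearson_r` is a small helper written for this example rather than a library call.

```python
import random
from statistics import fmean

def pearson_r(xs, ys):
    """Sample correlation between two equal-length lists of numbers."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical paired measurements, invented for illustration only.
blood_pressure = [118, 125, 110, 132, 121, 140, 115, 128, 135, 122]
heart_rate = [72, 80, 76, 70, 85, 74, 78, 82, 71, 77]
observed_r = pearson_r(blood_pressure, heart_rate)

rng = random.Random(6)
simulated_r = []
for _ in range(5000):
    shuffled = heart_rate[:]
    rng.shuffle(shuffled)  # break any association between the two variables
    simulated_r.append(pearson_r(blood_pressure, shuffled))

# Left-tailed p-value for Ha: rho < 0.
p_value = sum(r <= observed_r for r in simulated_r) / len(simulated_r)
print(f"observed r = {observed_r:.3f}, p-value ~ {p_value:.3f}")
```

Each shuffle keeps both sets of observed values intact and changes only how they are paired, so the simulated correlations are centered near 0.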

Blood Pressure and Heart Rate
p-value = 0.219
Even if blood pressure and heart rate are not correlated, we would see correlations this extreme about 22% of the time, just by random chance.

Randomization Distribution
Paul the Octopus (single proportion): Flip a coin 8 times
Cocaine Addiction (randomized experiment): Rerandomize cases to treatment groups, keeping response values fixed
Body Temperature (single mean): Shift to make H0 true, then bootstrap
Exercise and Gender (observational study): Shift to make H0 true, then bootstrap
Blood Pressure and Heart Rate (correlation): Randomly permute/scramble/shuffle one variable

Randomization Distributions
Randomization samples should be generated:
  Consistent with the null hypothesis
  Using the observed data
  Reflecting the way the data were collected
The specific method varies with the situation, but the general idea is always the same.

Generating Randomization Samples
As long as the original data are used and the null hypothesis is true for the randomization samples, most methods give similar p-values.
StatKey generates the randomizations for you, so what matters most is understanding why randomization samples are generated this way, not the mechanics of how.

Bootstrap and Randomization Distributions
Bootstrap distribution:
  Our best guess at the distribution of sample statistics
  Centered around the observed sample statistic
  Simulates sampling from the population by resampling from the original sample
Randomization distribution:
  Our best guess at the distribution of sample statistics, if H0 were true
  Centered around the null hypothesized value
  Simulates samples assuming H0 were true
Big difference: a randomization distribution assumes H0 is true, while a bootstrap distribution does not.

Which Distribution?
Let μ be the average amount of sleep college students get per night. Data were collected on a sample of students, and for this sample x̄ = 6.7 hours.
A bootstrap distribution is generated to create a confidence interval for μ, and a randomization distribution is generated to see if the data provide evidence that μ > 7.
Which distribution below is the bootstrap distribution?
Answer: (a), which is centered around the sample statistic, 6.7.

Which Distribution?
Intro stat students are surveyed, and we find that 152 out of 218 are female. Let p be the proportion of intro stat students at that university who are female.
A bootstrap distribution is generated for a confidence interval for p, and a randomization distribution is generated to see if the data provide evidence that p > 1/2.
Which distribution is the randomization distribution?
Answer: (a), which is centered around the null value, 1/2.
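
The difference in centers can be seen by generating both distributions for the survey numbers above (152 females out of 218). This is a rough sketch: the variable names and the number of simulations are assumptions made for the example, and the randomization samples are drawn here as fair coin flips, which is one way to make H0: p = 1/2 true.

```python
import random
from statistics import fmean

n, females = 218, 152
sample = [1] * females + [0] * (n - females)  # 1 = female, 0 = not female
p_hat = females / n
null_p = 0.5

rng = random.Random(7)
n_sims = 2000

# Bootstrap: resample the observed data with replacement (H0 is not assumed).
bootstrap_props = [fmean(rng.choices(sample, k=n)) for _ in range(n_sims)]

# Randomization: simulate samples from a population where H0: p = 1/2 is true.
randomization_props = [
    sum(rng.random() < null_p for _ in range(n)) / n for _ in range(n_sims)
]

# Right-tailed p-value for Ha: p > 1/2, read off the randomization distribution.
p_value = sum(p >= p_hat for p in randomization_props) / n_sims
print(f"bootstrap center ~ {fmean(bootstrap_props):.3f}")
print(f"randomization center ~ {fmean(randomization_props):.3f}")
print(f"p-value ~ {p_value:.4f}")
```

The bootstrap proportions cluster around the sample proportion 152/218, about 0.697, while the randomization proportions cluster around the null value 0.5.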

Summary
There are two types of errors: rejecting a true null (Type I) and not rejecting a false null (Type II).
Randomization samples should be generated:
  Consistent with the null hypothesis
  Using the observed data
  Reflecting the way the data were collected

To Do
Read Sections 4.4, 4.5
Do Homework 4 (due Thursday, 10/4)