Hypothesis Testing I 2/8/12 More on bootstrapping Random chance

Presentation transcript:

Hypothesis Testing I 2/8/12 More on bootstrapping Random chance Null and alternative hypotheses Randomization distribution p-value Section 4.1, 4.2 Professor Kari Lock Morgan Duke University

Announcements Homework 3 (due Monday) Research question and data for Project 1 (proposal due next Wednesday)

Bootstrap CI Option 1: Estimate the standard error of the statistic by computing the standard deviation of the bootstrap distribution, and then generate a 95% confidence interval by statistic ± 2 × SE. Option 2: Generate a P% confidence interval as the range for the middle P% of bootstrap statistics.
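The two options above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values, the function names, and the choice of the mean as the statistic are hypothetical and do not come from the slides.

```python
import random
import statistics

def bootstrap_statistics(sample, statistic, n_boot=1000, seed=0):
    """Compute the statistic on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_boot)]

def standard_error_interval(sample_statistic, boot_stats):
    """Option 1: statistic +/- 2 * SE, where SE is the SD of the bootstrap statistics."""
    se = statistics.stdev(boot_stats)
    return sample_statistic - 2 * se, sample_statistic + 2 * se

def percentile_interval(boot_stats, level=95):
    """Option 2: the range covering the middle `level` percent of bootstrap statistics."""
    ordered = sorted(boot_stats)
    cut = round(len(ordered) * (100 - level) / 200)  # number trimmed from each tail
    return ordered[cut], ordered[len(ordered) - cut - 1]

# Hypothetical data, used only to make the sketch runnable.
sample = [12, 15, 9, 20, 14, 11, 17, 13]
sample_mean = statistics.mean(sample)
boot_stats = bootstrap_statistics(sample, statistics.mean)

se_low, se_high = standard_error_interval(sample_mean, boot_stats)
pct_low, pct_high = percentile_interval(boot_stats)
print(f"Standard error method: ({se_low:.2f}, {se_high:.2f})")
print(f"Percentile method:     ({pct_low:.2f}, {pct_high:.2f})")
```

When the bootstrap distribution is roughly symmetric and bell-shaped, the two intervals should come out close to each other.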

Suppose we have a random sample of 6 people. [Figure labels: Patti]

Original Sample: Create a “sampling distribution” using this as our simulated population. [Figure labels: Patti]

Bootstrap Sample: Sample with replacement from the original sample, using the same sample size. [Figure labels: Patti, Original Sample, Bootstrap Sample]
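As a minimal sketch of drawing one bootstrap sample, the snippet below resamples six placeholder labels with replacement. The labels "A" through "F" are hypothetical stand-ins for the six people, and the function name is an assumption made for illustration.

```python
import random

def bootstrap_sample(original, rng):
    """Draw len(original) items from the original sample, with replacement."""
    return rng.choices(original, k=len(original))

rng = random.Random(1)
original = ["A", "B", "C", "D", "E", "F"]  # placeholder labels for 6 people
resample = bootstrap_sample(original, rng)

print("Original sample: ", original)
print("Bootstrap sample:", resample)
```

Because sampling is done with replacement, some people will typically appear more than once in a bootstrap sample and others not at all.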

Continuous versus Discrete A continuous distribution can take any value within some range. The distribution will look like a smooth curve, without any gaps. A discrete distribution only takes certain values. The distribution will be spiky, with gaps. [Figure panels: Continuous, Discrete]

Criteria for Bootstrap CI Using the percentile method, bootstrapping for a confidence interval works for any statistic, as long as the bootstrap distribution is (1) approximately symmetric and (2) approximately continuous. Using the standard error method also requires that the bootstrap distribution be (3) approximately bell-shaped. Always look at the bootstrap distribution to make sure these are true!

Criteria for Bootstrap CI

Number of Bootstrap Samples When using bootstrapping, you may get a slightly different confidence interval each time. This is fine! The more bootstrap samples you use, the more precise your answer will be. For the purposes of this class, 1000 bootstrap samples is fine. In real life, you probably want to take 10,000 or even 100,000 bootstrap samples

Paul the Octopus http://www.youtube.com/watch?v=3ESGpRUMj9E

Paul the Octopus Paul the Octopus predicted 8 World Cup games, and predicted them all correctly. Is this evidence that Paul actually has psychic powers? How unusual would this be if he were just randomly guessing (with a 50% chance of guessing correctly)? How could we figure this out?

Coins and Paul Did you get all 8 heads? (a) Yes (b) No

10,000 Simulations www.lock5stat.com/statkey If Paul is just guessing, the chance of him getting all 8 correct is about 4/1000 (exactly (1/2)^8 = 1/256 ≈ 0.0039).
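The coin-flipping simulation can be reproduced outside StatKey. The sketch below is not StatKey's implementation; it simply flips 8 fair coins 10,000 times with Python's standard library, and the function name and seed are assumptions made for illustration.

```python
import random

def simulate_correct_counts(n_games=8, p=0.5, n_sims=10_000, seed=2012):
    """For each simulation, count correct guesses out of n_games random guesses."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n_games)) for _ in range(n_sims)]

counts = simulate_correct_counts()
simulated = sum(c == 8 for c in counts) / len(counts)
exact = 0.5 ** 8  # eight independent 50/50 guesses, all correct

print(f"Simulated chance of 8 out of 8: {simulated:.4f}")
print(f"Exact chance of 8 out of 8:     {exact:.4f}")
```

The simulated proportion varies a little from run to run (or seed to seed), but it should land near the exact value of 1/256.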

Hypotheses Null Hypothesis, H0 : Claim that there is no effect or no difference. Alternative Hypothesis, Ha : Claim that we seek evidence for, usually that there is some effect. Hypotheses are always given in terms of population parameters. Paul the Octopus: H0 : 50% chance of correctly predicting each game. Ha : >50% chance of correctly predicting each game.

Paul the Octopus In the case of Paul the Octopus, what type of parameter are we interested in? (Hint: Are there one or two variables? Are the variable(s) categorical or quantitative?) (a) Proportion (b) Mean (c) Difference in proportions (d) Difference in means (e) Correlation. Paul the Octopus: H0 : 50% chance of correctly predicting each game. Ha : >50% chance of correctly predicting each game.

Paul the Octopus Let p denote the proportion of games that Paul guesses correctly (of all games he may have predicted). H0 : p = 1/2 Ha : p > 1/2

Hypotheses The alternative hypothesis is supported by finding evidence (data) that contradicts the null hypothesis (and supports the alternative hypothesis). Data can only contradict or not contradict the null hypothesis, but can never confirm it.

Hypotheses [Figure labels: ALL POSSIBILITIES, Null Hypothesis, Alternative Hypothesis] Usually the null is a very specific statement, and so it is straightforward to assess evidence against it.

Paul and Hypotheses H0 : p = 1/2 Ha : p > 1/2 What if Paul had gotten 4 out of 8 correct? What would you conclude? (a) H0 is true (b) Ha is true (c) H0 is false (d) Ha is false (e) Nothing

Your Own Hypotheses Come up with a situation where you want to establish a claim based on data. What parameter(s) are you interested in? What would the null and alternative hypotheses be? What type of data would lead you to believe the null hypothesis is probably not true?

Measuring Evidence against H0 To see if a statistic provides evidence against H0, we need to see what kind of sample statistics we would observe, just by random chance, if H0 were true

Paul the Octopus

Randomization Distribution A randomization distribution is the distribution of sample statistics we would observe, just by random chance, if the null hypothesis were true. Simulate many randomizations, assuming H0 is true, calculate the sample statistic each time, and collect these together to form a distribution.
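The recipe above (simulate under H0, compute the statistic, collect the results) can be sketched generically. The function names below are hypothetical; the example plugs in the null hypothesis for Paul, p = 1/2 over 8 games, with the sample proportion of correct predictions as the statistic.

```python
import random
from collections import Counter

def randomization_distribution(simulate_null_sample, statistic, n_sims=10_000, seed=0):
    """Collect the statistic from many samples simulated assuming H0 is true."""
    rng = random.Random(seed)
    return [statistic(simulate_null_sample(rng)) for _ in range(n_sims)]

def paul_null_sample(rng, n_games=8):
    """Under H0: p = 1/2, each prediction is a fair coin flip (True = correct)."""
    return [rng.random() < 0.5 for _ in range(n_games)]

def proportion_correct(sample):
    """Sample statistic: proportion of correct predictions."""
    return sum(sample) / len(sample)

stats = randomization_distribution(paul_null_sample, proportion_correct)

# Crude text histogram: one '#' per 50 simulated samples.
for value, count in sorted(Counter(stats).items()):
    print(f"{value:.3f} {'#' * (count // 50)}")
```

The histogram should be centered near 0.5, the value of p assumed under the null hypothesis.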

Randomization Distribution

p-value The p-value is the probability of getting a statistic as extreme as (or more extreme than) that observed, just by random chance, if the null hypothesis is true. We calculate this from a randomization distribution.

Paul the Octopus p-value

p-value What kinds of statistics would we get, just by random chance, if the null hypothesis were true? (randomization distribution) What proportion of these statistics are as extreme as our original sample statistic? (p-value)
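These two questions translate directly into code: build the randomization distribution, then take the proportion of its statistics that are at least as extreme as the observed one. The sketch below assumes a one-sided alternative; the function name and the `tail` parameter are hypothetical. It uses the two outcomes mentioned in the slides, 8 out of 8 and 4 out of 8 correct, with Ha : p > 1/2.

```python
import random

def p_value(randomization_stats, observed, tail="right"):
    """Proportion of randomization statistics as extreme as (or more extreme than) observed."""
    if tail == "right":   # alternative says the parameter is larger
        extreme = sum(s >= observed for s in randomization_stats)
    elif tail == "left":  # alternative says the parameter is smaller
        extreme = sum(s <= observed for s in randomization_stats)
    else:
        raise ValueError("tail must be 'right' or 'left'")
    return extreme / len(randomization_stats)

# Randomization distribution: number correct out of 8 when H0 (p = 1/2) is true.
rng = random.Random(4)
null_counts = [sum(rng.random() < 0.5 for _ in range(8)) for _ in range(10_000)]

p_all_eight = p_value(null_counts, observed=8)
p_four = p_value(null_counts, observed=4)
print(f"p-value for 8 of 8 correct: {p_all_eight:.4f}")
print(f"p-value for 4 of 8 correct: {p_four:.4f}")
```

A small p-value (as for 8 of 8) means the observed statistic would be unusual if the null hypothesis were true; a large one (as for 4 of 8) means the data do not contradict the null hypothesis.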

Exercise and Pulse Does just 5 seconds of exercise raise your pulse rate? Let’s find out! How can we answer this question?