More Randomization Distributions, Connections

1 More Randomization Distributions, Connections
STAT 250 Dr. Kari Lock Morgan More Randomization Distributions, Connections SECTION 4.4, 4.5 More randomization distributions (4.4) Connecting intervals and tests (4.5)

2 Diet and Sex of Baby Are certain foods in your diet associated with whether you conceive a boy or a girl? To study this, researchers asked women about their eating habits, including whether or not they ate each of 133 different foods regularly. A significant difference was found for breakfast cereal (mothers of boys eat more), prompting the headline “Breakfast Cereal Boosts Chances of Conceiving Boys”.

3 “Breakfast Cereal Boosts Chances of Conceiving Boys”
I used to eat breakfast cereal every morning and have two boys. Do you think this helped to boost my chances of having boys? Yes No Impossible to tell

4 Hypothesis Tests For each of the 133 foods studied, a hypothesis test was conducted for a difference between mothers who conceived boys and girls in the proportion who consume each food. If there are NO differences (all null hypotheses are true), about how many significant differences would be found using α = 0.05? How might you explain the significant difference for breakfast cereal?
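The arithmetic behind this question can be sketched in a few lines. The count of 133 tests and α = 0.05 come from the slide; treating the tests as independent is an assumption made here only to compute the chance of at least one false positive.

```python
# Expected number of "significant" results when every null hypothesis is true.
n_tests = 133   # foods tested in the study (from the slide)
alpha = 0.05    # significance level (from the slide)

# Each test has probability alpha of a false positive, so the expected count is n * alpha.
expected_false_positives = n_tests * alpha

# Chance that at least one test comes out significant.
# ASSUMPTION: the tests are independent; the slides do not say this.
prob_at_least_one = 1 - (1 - alpha) ** n_tests

print(f"Expected significant results by chance: {expected_false_positives:.2f}")
print(f"P(at least one significant result): {prob_at_least_one:.4f}")
```

Under these assumptions, roughly six or seven foods would look significant even if diet had no effect at all, which is one plausible explanation for the breakfast cereal result.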

5 Author: JB Landers

6 Multiple Testing Multiple testing: When doing multiple tests, even if all the nulls are true, about a proportion α of them will be significant just by random chance. Publication bias: Only the significant results get published (also called the file drawer problem). Together, these represent a huge problem! Amid a Sea of False Findings, the NIH Tries Reform (March 16th, 2015) Why most published research findings are false

7 Clinical Trials Preclinical (animal studies)
Phase 0: Study pharmacodynamics and pharmacokinetics Phase 1: Screening for safety Phase 2: Placebo trials to establish efficacy Phase 3: Trials against standard treatment and to confirm efficacy Only then does a drug go to market…

8 Statistical vs Practical Significance
With small sample sizes, even large differences or effects may not be significant With large sample sizes, even a very small difference or effect can be significant A statistically significant result is not always practically significant, especially with large sample sizes

9 Statistical vs Practical Significance
Example: Suppose a weight loss program recruits 10,000 people for a randomized experiment. A difference in average weight loss of only 0.5 lbs could be found to be statistically significant Suppose the experiment lasted for a year. Is a loss of ½ a pound practically significant?
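A rough calculation shows why such a small difference can be statistically significant. This is only a sketch: the even split into two groups of 5,000, the standard deviation of 10 lbs, and the use of a normal approximation are all assumptions made for illustration, not details from the slide.

```python
import math

n_per_group = 5000   # ASSUMPTION: 10,000 participants split evenly into two groups
difference = 0.5     # difference in average weight loss, in lbs (from the slide)
assumed_sd = 10.0    # ASSUMPTION: hypothetical standard deviation of weight loss, in lbs

# Standard error of a difference in two independent sample means.
standard_error = assumed_sd * math.sqrt(2 / n_per_group)

# z-statistic and two-sided p-value under a normal approximation.
z = difference / standard_error
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"standard error = {standard_error:.3f}, z = {z:.2f}, p-value = {p_two_sided:.4f}")
```

With these hypothetical numbers the p-value falls below 0.05, yet half a pound over a year is unlikely to matter to any participant. The large sample shrinks the standard error, not the importance of the effect.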

10 More Randomization Distributions
So far we’ve focused on randomization tests for randomized experiments, in which we re-randomize units to treatment groups. What about a single variable? Single proportion? Single mean? What about observational studies? What about correlation?

11 Paul the Octopus

12 Hypotheses In the 2010 World Cup, Paul the Octopus predicted the outcomes of 8 games, and predicted them all correctly. Is this evidence that Paul’s chance of guessing correctly, p, is really greater than 50%? What are the null and alternative hypotheses? (a) H0: p ≠ 0.5, Ha: p = 0.5 (b) H0: p = 0.5, Ha: p ≠ 0.5 (c) H0: p = 0.5, Ha: p > 0.5 (d) H0: p > 0.5, Ha: p = 0.5

13 Key Question How unusual is it to see a sample statistic as extreme as that observed, if H0 is true? If it is very unusual, we have statistically significant evidence against the null hypothesis Today’s Question: How do we measure how unusual a sample statistic is, if H0 is true?

14 Measuring Evidence against H0
To see if a statistic provides evidence against H0, we need to see what kind of sample statistics we would observe, just by random chance, if H0 were true

15 Simulate many samples of size n = 8 with p = 0.5
Paul the Octopus We need to know what kinds of statistics we would observe just by random chance, if the null hypothesis were true How could we figure this out??? Simulate many samples of size n = 8 with p = 0.5

16 Simulate! Did you get all 8 heads (correct)?
We can simulate this with a coin! Each coin flip = a guess between two teams (Heads = correct, Tails = incorrect) Flip a coin 8 times, count the number of heads, and calculate the sample proportion of heads Did you get all 8 heads (correct)? (a) Yes (b) No How extreme is Paul’s sample proportion of 1?

17 Paul the Octopus Based on your simulation results, for a sample size of n = 8, do you think p̂ = 1 is statistically significant? Yes No

18 Randomization Distribution
A randomization distribution is a collection of statistics from samples simulated assuming the null hypothesis is true The randomization distribution shows what types of statistics would be observed, just by random chance, if the null hypothesis were true

19 www.lock5stat.com/statkey Lots of simulations!
For a better randomization distribution, we need many more simulations!
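The coin-flipping simulation can be sketched in code as a rough stand-in for what StatKey does. The function name, the number of simulations, and the seed are arbitrary choices made here for illustration.

```python
import random

def simulate_proportions(n_guesses=8, p_null=0.5, n_sims=10_000, seed=0):
    """Simulate sample proportions of correct guesses, assuming H0: p = p_null."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_sims):
        # Each guess is correct with probability p_null, like a fair coin landing heads.
        correct = sum(rng.random() < p_null for _ in range(n_guesses))
        proportions.append(correct / n_guesses)
    return proportions

proportions = simulate_proportions()
observed = 1.0  # Paul's sample proportion: 8 correct out of 8

# Proportion of simulated statistics at least as extreme as the observed one.
p_value = sum(p >= observed for p in proportions) / len(proportions)

# For comparison, the exact chance of 8 heads in 8 fair flips.
exact = 0.5 ** 8

print(f"simulated p-value = {p_value:.4f}, exact probability = {exact:.6f}")
```

The simulated p-value varies a little from run to run, which is why more simulations give a better randomization distribution.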

20 Randomization Distribution

21 Paul the Octopus Based on StatKey’s simulation results, for a sample size of n = 8, do you think p̂ = 1 is statistically significant? Yes No

22 Connections Today we’ll make connections between…
Chapter 1: Data collection (random sampling?, random assignment?) Chapter 2: Which statistic is appropriate, based on the variable(s)? Chapter 3: Bootstrapping and confidence intervals Chapter 4: Randomization distributions and hypothesis tests

23 Connections Today we’ll make connections between…
Chapter 1: Data collection (random sampling?, random assignment?) Chapter 2: Which statistic is appropriate, based on the variable(s)? Chapter 3: Bootstrapping and confidence intervals Chapter 4: Randomization distributions and hypothesis tests

24 Randomization Distribution
For a randomization distribution, each simulated sample should… be consistent with the null hypothesis use the data in the observed sample reflect the way the data were collected

25 Randomized Experiments
In randomized experiments the “randomness” is the random allocation to treatment groups If the null hypothesis is true, the response values would be the same, regardless of treatment group assignment To simulate what would happen just by random chance, if H0 were true: reallocate cases to treatment groups, keeping the response values the same
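Reallocation can be sketched as follows: pool the response values, shuffle them, and deal them back into groups of the original sizes. The response values below are made up for illustration and do not come from any study in these slides; the function name and defaults are also hypothetical.

```python
import random

def reallocation_p_value(treatment, control, n_sims=5000, seed=1):
    """Upper-tail randomization test for a difference in means (treatment - control)."""
    rng = random.Random(seed)
    observed = sum(treatment) / len(treatment) - sum(control) / len(control)
    pooled = list(treatment) + list(control)  # response values stay fixed
    n_t = len(treatment)
    count = 0
    for _ in range(n_sims):
        rng.shuffle(pooled)                   # reallocate cases to groups at random
        new_t, new_c = pooled[:n_t], pooled[n_t:]
        diff = sum(new_t) / len(new_t) - sum(new_c) / len(new_c)
        if diff >= observed:
            count += 1
    return observed, count / n_sims

# HYPOTHETICAL response values, invented for this sketch only.
treatment = [12, 15, 14, 10, 18, 16]
control = [9, 11, 8, 13, 10, 7]

observed_diff, p_value = reallocation_p_value(treatment, control)
print(f"observed difference = {observed_diff:.2f}, p-value = {p_value:.4f}")
```

Only the group labels change between simulated samples, which matches the idea that under H0 each response would have been the same regardless of treatment.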

26 Observational Studies
In observational studies, the “randomness” is random sampling from the population To simulate what would happen, just by random chance, if H0 were true: How do we simulate resampling from a population when we only have sample data? How can we generate randomization samples for observational studies?

27 Body Temperatures μ = average human body temperature H0: μ = 98.6
Ha: μ ≠ 98.6 x̄ = 98.26 We can make the null true just by adding 98.6 − 98.26 = 0.34 to each value, to make the mean be 98.6. Bootstrapping from this revised sample lets us simulate samples, assuming H0 is true!
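The shift-then-bootstrap idea can be sketched as below. The ten temperatures are hypothetical values chosen so that their mean is 98.26; they are not the actual body temperature data. The two-tailed p-value here counts simulated means at least as far from 98.6 as the observed mean, which is one reasonable convention and may differ slightly from how StatKey computes it.

```python
import random

def shifted_bootstrap_p_value(sample, null_mean, n_sims=5000, seed=2):
    """Two-tailed randomization test for a single mean via shift + bootstrap."""
    rng = random.Random(seed)
    n = len(sample)
    sample_mean = sum(sample) / n
    shift = null_mean - sample_mean          # amount added to make H0 true
    shifted = [x + shift for x in sample]    # modified sample with mean = null_mean
    observed_distance = abs(sample_mean - null_mean)
    count = 0
    for _ in range(n_sims):
        resample = rng.choices(shifted, k=n)  # sample with replacement
        if abs(sum(resample) / n - null_mean) >= observed_distance:
            count += 1
    return shift, count / n_sims

# HYPOTHETICAL temperatures (mean 98.26), invented for this sketch only.
temps = [98.0, 98.4, 97.8, 98.6, 98.2, 98.8, 97.9, 98.3, 98.1, 98.5]

shift, p_value = shifted_bootstrap_p_value(temps, null_mean=98.6)
print(f"shift = {shift:.2f}, p-value = {p_value:.4f}")
```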

28 Body Temperatures In StatKey, when we enter the null hypothesis, this shifting is automatically done for us StatKey p-value = 0.002

29 Exercise and Gender H0: μm = μf, Ha: μm > μf
How might we make the null true? One way (of many): shift the values in each group so that both groups have the same mean, then bootstrap from this modified sample. In StatKey, the default randomization method is “reallocate groups”, but “Shift Groups” is also an option, and will do this

30 Exercise and Gender p-value = 0.095

31 Exercise and Gender The p-value is 0.095. Using α = 0.05, we conclude… Males exercise more than females, on average Males do not exercise more than females, on average Nothing

32 Blood Pressure and Heart Rate
H0: ρ = 0, Ha: ρ < 0 Two variables have correlation 0 if they are not associated. We can “break the association” by randomly permuting/scrambling/shuffling one of the variables. Each time we do this, we get a sample we might observe just by random chance, if there really is no correlation
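Shuffling one variable to break the association can be sketched as follows. The blood pressure and heart rate values are hypothetical, made up for this illustration, and the helper names are arbitrary. The test is lower-tailed to match Ha: ρ < 0.

```python
import random

def pearson_r(x, y):
    """Sample correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value_lower(x, y, n_sims=5000, seed=3):
    """Lower-tail randomization test for correlation, shuffling y each time."""
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    y_shuffled = list(y)
    count = 0
    for _ in range(n_sims):
        rng.shuffle(y_shuffled)  # break any association between x and y
        if pearson_r(x, y_shuffled) <= observed:
            count += 1
    return observed, count / n_sims

# HYPOTHETICAL measurements, invented for this sketch only.
blood_pressure = [120, 135, 110, 150, 128, 142, 118, 160]
heart_rate = [80, 72, 88, 70, 85, 66, 90, 75]

r, p_value = permutation_p_value_lower(blood_pressure, heart_rate)
print(f"sample correlation = {r:.3f}, p-value = {p_value:.4f}")
```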

33 Blood Pressure and Heart Rate
Even if blood pressure and heart rate are not correlated, we would see correlations this extreme about 22% of the time, just by random chance. p-value = 0.219

34 Randomization Distribution
Paul the Octopus (single proportion): Flip a coin or roll a die. Cocaine Addiction (randomized experiment): Rerandomize cases to treatment groups, keeping response values fixed. Body Temperature (single mean): Shift to make H0 true, then bootstrap. Exercise and Gender (observational study): Shift groups to make H0 true, then bootstrap. Blood Pressure and Heart Rate (correlation): Randomly permute/scramble/shuffle one variable.

35 Connections Today we’ll make connections between…
Chapter 1: Data collection (random sampling?, random assignment?) Chapter 2: Which statistic is appropriate, based on the variable(s)? Chapter 3: Bootstrapping and confidence intervals Chapter 4: Randomization distributions and hypothesis tests

36 Body Temperature We created a bootstrap distribution for average body temperature by resampling with replacement from the original sample (x̄ = 98.26):

37 Body Temperature We also created a randomization distribution to see if average body temperature differs from 98.6°F by adding 0.34 to every value to make the null true, and then resampling with replacement from this modified sample:

38 Body Temperature These two distributions are identical (up to random variation from simulation to simulation) except for the center The bootstrap distribution is centered around the sample statistic, 98.26, while the randomization distribution is centered around the null hypothesized value, 98.6 The randomization distribution is equivalent to the bootstrap distribution, but shifted over

39 Bootstrap and Randomization Distributions
Bootstrap Distribution Randomization Distribution Our best guess at the distribution of sample statistics Our best guess at the distribution of sample statistics, if H0 were true Centered around the observed sample statistic Centered around the null hypothesized value Simulate sampling from the population by resampling from the original sample Simulate samples assuming H0 were true Big difference: a randomization distribution assumes H0 is true, while a bootstrap distribution does not
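The relationship between the two distributions can be checked numerically for the single-mean case. In this sketch the temperatures are hypothetical values with mean 98.26, and the same seed is reused deliberately so that both distributions are built from the same resampled positions, making the shift visible without simulation noise.

```python
import random

def resampled_means(values, n_sims, seed):
    """Means of n_sims samples drawn with replacement from values."""
    rng = random.Random(seed)
    n = len(values)
    return [sum(rng.choices(values, k=n)) / n for _ in range(n_sims)]

# HYPOTHETICAL temperatures (mean 98.26), invented for this sketch only.
sample = [98.0, 98.4, 97.8, 98.6, 98.2, 98.8, 97.9, 98.3, 98.1, 98.5]
null_mean = 98.6

sample_mean = sum(sample) / len(sample)
shifted = [x + (null_mean - sample_mean) for x in sample]  # makes H0 true

bootstrap = resampled_means(sample, 2000, seed=4)       # does not assume H0
randomization = resampled_means(shifted, 2000, seed=4)  # assumes H0: mu = 98.6

boot_center = sum(bootstrap) / len(bootstrap)
rand_center = sum(randomization) / len(randomization)
print(f"bootstrap center = {boot_center:.2f}, randomization center = {rand_center:.2f}")
```

The two centers differ by the shift of 0.34, while the spread and shape are the same, which is the sense in which the randomization distribution is the bootstrap distribution moved over to the null value.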

40 Which Distribution? Let μ be the average amount of sleep college students get per night. Data were collected on a sample of students, and for this sample x̄ = 6.7 hours. A bootstrap distribution is generated to create a confidence interval for μ, and a randomization distribution is generated to see if the data provide evidence that μ > 7. Which distribution below is the bootstrap distribution?

41 Which Distribution? Intro stat students are surveyed, and we find that 152 out of 218 are female. Let p be the proportion of intro stat students at that university who are female. A bootstrap distribution is generated for a confidence interval for p, and a randomization distribution is generated to see if the data provide evidence that p > 1/2. Which distribution is the randomization distribution?

42 Connections Today we’ll make connections between…
Chapter 1: Data collection (random sampling?, random assignment?) Chapter 2: Which statistic is appropriate, based on the variable(s)? Chapter 3: Bootstrapping and confidence intervals Chapter 4: Randomization distributions and hypothesis tests

43 Body Temperature Bootstrap Distribution Randomization Distribution
[Figure: bootstrap distribution centered at 98.26; randomization distribution centered at 98.6] H0: μ = 98.6 Ha: μ ≠ 98.6 Talk about the fact that the null hypothesized value is in the extremes of the bootstrap distribution, so the sample statistic is in the extremes of the randomization distribution

44 Body Temperature Bootstrap Distribution Randomization Distribution
[Figure: bootstrap distribution centered at 98.26; randomization distribution centered at 98.4] H0: μ = 98.4 Ha: μ ≠ 98.4 Talk about the fact that the null hypothesized value is not in the extremes of the bootstrap distribution, so the sample statistic is not in the extremes of the randomization distribution

45 Intervals and Tests A confidence interval represents the range of plausible values for the population parameter If the null hypothesized value IS NOT within the CI, it is not a plausible value and should be rejected If the null hypothesized value IS within the CI, it is a plausible value and should not be rejected

46 Intervals and Tests If a 95% CI contains the parameter in H0, then a two-tailed test should not reject H0 at a 5% significance level. If a 95% CI misses the parameter in H0, then a two-tailed test should reject H0 at a 5% significance level.

47 Body Temperatures Using bootstrapping, we found a 95% confidence interval for the mean body temperature to be (98.05, 98.47). This does not contain 98.6, so at α = 0.05 we would reject H0 for the hypotheses H0: μ = 98.6 Ha: μ ≠ 98.6
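The rule connecting intervals and tests amounts to a simple check. The helper below is a hypothetical function written for illustration; it applies only to a two-tailed test whose significance level matches the confidence level of the interval (here 95% and α = 0.05).

```python
def two_tailed_decision(ci, null_value):
    """Decision for a two-tailed test at the level matching the confidence interval."""
    lower, upper = ci
    if lower <= null_value <= upper:
        return "do not reject H0"  # the null value is plausible
    return "reject H0"             # the null value is not plausible

body_temp_ci = (98.05, 98.47)  # 95% bootstrap CI from the slide

print(two_tailed_decision(body_temp_ci, 98.6))  # reject H0
print(two_tailed_decision(body_temp_ci, 98.4))  # do not reject H0
```

The second call uses the null value 98.4 from the earlier body temperature slide, which falls inside the interval.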

48 Both Father and Mother “Does a child need both a father and a mother to grow up happily?” Let p be the proportion of adults aged in who say yes. A 95% CI for p is (0.487, 0.573). Testing H0: p = 0.5 vs Ha: p ≠ 0.5 with α = 0.05, we Reject H0 Do not reject H0 Reject Ha Do not reject Ha

49 Both Father and Mother “Does a child need both a father and a mother to grow up happily?” Let p be the proportion of adults aged in who say yes. A 95% CI for p is (0.533, 0.607). Testing H0: p = 0.5 vs Ha: p ≠ 0.5 with α = 0.05, we Reject H0 Do not reject H0 Reject Ha Do not reject Ha

50 Intervals and Tests Confidence intervals are most useful when you want to estimate population parameters Hypothesis tests and p-values are most useful when you want to test hypotheses about population parameters Confidence intervals give you a range of plausible values; p-values quantify the strength of evidence against the null hypothesis

51 Interval, Test, or Neither?
Is the following question best assessed using a confidence interval, a hypothesis test, or is statistical inference not relevant? On average, how much more do adults who played sports in high school exercise than adults who did not play sports in high school? Confidence interval Hypothesis test Statistical inference not relevant

52 Interval, Test, or Neither?
Is the following question best assessed using a confidence interval, a hypothesis test, or is statistical inference not relevant? Do a majority of adults take a multivitamin each day? Confidence interval Hypothesis test Statistical inference not relevant

53 Interval, Test, or Neither?
Is the following question best assessed using a confidence interval, a hypothesis test, or is statistical inference not relevant? Did the Penn State football team score more points in 2014 or 2013? Confidence interval Hypothesis test Statistical inference not relevant

54 Summary Using α = 0.05, about 5% of hypothesis tests will reject the null when all of the null hypotheses are true. Randomization samples should be generated: consistent with the null hypothesis, using the observed data, and reflecting the way the data were collected. If a null hypothesized value lies inside a 95% CI, a two-tailed test using α = 0.05 would not reject H0. If a null hypothesized value lies outside a 95% CI, a two-tailed test using α = 0.05 would reject H0.

55 To Do Read Sections 4.4, 4.5 Do HW 4.5 (due Friday, 3/27)

