Essential Synthesis SECTION 4.4, 4.5, ES A, ES B

Presentation transcript:

Essential Synthesis
STAT 101, Dr. Kari Lock Morgan
SECTIONS 4.4, 4.5, ES A, ES B
Connecting bootstrap and randomization (4.4)
Connecting intervals and tests (4.5)
Review (Ch 1-4)

Exam Details
Wednesday, 2/26
Closed to everything except one double-sided page of notes prepared by you (no sharing) and a non-cell-phone calculator
Best ways to prepare:
#1: WORK LOTS OF PROBLEMS!
Make a good page of notes
Read sections you are still confused about
Come to office hours and clarify confusion
Covers chapters 1-4 (except 2.6) and anything covered in lecture

Practice Problems
Practice exam online (under resources)
Solutions to odd essential synthesis and review problems online (under resources)
Solutions to all odd problems in the book on reserve at Perkins

Office Hours and Help
Monday 4–6pm: Stephanie Sun, Old Chem 211A
Tuesday 3:30–5pm (extra): Prof Morgan, Old Chem 216
Tuesday 5–7pm: Wenjing Shi (new TA), Old Chem 211A
Tuesday 7–9pm: Mao Hu, Old Chem 211A
REVIEW SESSION: 5–6pm Tuesday (if we can get a room… I’ll keep you posted)

Review from Last Class
You will all do a hypothesis test for Project 1. If all of you are doing tests for which the nulls are true, about how many of you will get statistically significant results using α = 0.05? (There are 110 students in the class.)
Options: 110, 105, 6
Answer: about 6, since 0.05 × 110 = 5.5

Multiple Testing
When multiple hypothesis tests are conducted, the chance that at least one test incorrectly rejects a true null hypothesis increases with the number of tests.
If the null hypotheses are all true, about a proportion α of the tests will yield statistically significant results just by random chance.
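The 110-student question and the claim about multiple tests can be checked numerically. The sketch below is not from the slides. It assumes the tests are independent and that each p-value is uniform on [0, 1] when its null is true, and the function names are illustrative.

```python
import random


def familywise_error_rate(alpha, num_tests):
    """P(at least one false rejection), assuming independent tests and all nulls true."""
    return 1 - (1 - alpha) ** num_tests


def simulate_false_positives(alpha, num_tests, seed=0):
    """Count how many tests reject at level alpha when every null is true.

    Assumption: under a true null a p-value is uniform on [0, 1],
    so each p-value is drawn with rng.random().
    """
    rng = random.Random(seed)
    p_values = [rng.random() for _ in range(num_tests)]
    return sum(p < alpha for p in p_values)


expected = 0.05 * 110                          # expected number of significant results
at_least_one = familywise_error_rate(0.05, 110)  # chance of one or more false rejections
count = simulate_false_positives(0.05, 110)    # one simulated class of 110 tests
print(expected, round(at_least_one, 4), count)
```

Changing `seed` gives a different simulated class. The count varies from run to run but stays near the expected 5.5.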

www.causeweb.org Author: JB Landers

Multiple Comparisons
Consider a topic that is being investigated by research teams all over the world.
Using α = 0.05, about 5% of teams are going to find something significant, even if the null hypothesis is true.

Multiple Comparisons
Consider a research team/company doing many hypothesis tests.
Using α = 0.05, about 5% of tests are going to be significant, even if the null hypotheses are all true.

Multiple Comparisons
This is a serious problem.
The most important thing is to be aware of this issue, and not to trust claims that are obviously one of many tests (unless they specifically mention an adjustment for multiple testing).
There are ways to account for this (e.g. Bonferroni’s Correction), but these are beyond the scope of this class.

Publication Bias
Publication bias refers to the fact that usually only the significant results get published.
The one study that turns out significant gets published, and no one knows about all the insignificant results.
This, combined with the problem of multiple comparisons, can yield very misleading results.

Jelly Beans Cause Acne!
http://xkcd.com/882/
Consider having your students act this out in class, each reading aloud a different part. It’s very fun!

Connections Today we’ll make connections between… Chapter 1: Data collection (random sampling?, random assignment?) Chapter 2: Which statistic is appropriate, based on the variable(s)? Chapter 3: Bootstrapping and confidence intervals Chapter 4: Randomization distributions and hypothesis tests


Randomization Distribution
For a randomization distribution, each simulated sample should…
be consistent with the null hypothesis
use the data in the observed sample
reflect the way the data were collected

Randomized Experiments
In randomized experiments the “randomness” is the random allocation to treatment groups.
If the null hypothesis is true, the response values would be the same, regardless of treatment group assignment.
To simulate what would happen just by random chance, if H0 were true: reallocate cases to treatment groups, keeping the response values the same.
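The reallocation idea can be sketched in a few lines. This is only an illustration, not the slides' own code: the response values are made up, the statistic is taken to be a difference in means with a right-tailed alternative, and `reallocation_p_value` is a hypothetical helper name.

```python
import random
from statistics import mean


def reallocation_p_value(group_a, group_b, num_sims=5000, seed=0):
    """Right-tail p-value for H0: no treatment effect vs Ha: mean(A) > mean(B)."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)   # response values stay fixed
    n_a = len(group_a)
    extreme = 0
    for _ in range(num_sims):
        rng.shuffle(pooled)                  # reallocate cases to the two groups
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:                 # as extreme as what was observed
            extreme += 1
    return extreme / num_sims


# Made-up response values, for illustration only.
treatment = [12, 15, 14, 18, 16, 17]
control = [10, 11, 13, 9, 12, 14]
p = reallocation_p_value(treatment, control)
print(p)
```

Shuffling the pooled responses mimics re-running the random assignment in a world where treatment makes no difference.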

Observational Studies
In observational studies, the “randomness” is random sampling from the population.
To simulate what would happen, just by random chance, if H0 were true: simulate resampling from a population in which H0 is true.
How do we simulate resampling from a population when we only have sample data? Bootstrap!
How can we generate randomization samples for observational studies? Make H0 true, then bootstrap!

Body Temperatures
μ = average human body temperature
H0: μ = 98.6
Ha: μ ≠ 98.6
x̄ = 98.26
We can make the null true just by adding 98.6 – 98.26 = 0.34 to each value, to make the mean be 98.6.
Bootstrapping from this revised sample lets us simulate samples, assuming H0 is true!
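A minimal sketch of "shift, then bootstrap" for a single mean follows. The temperatures are made up (chosen so the mean is near 98.26) and are not the class data. The two-sided p-value here counts resampled means at least as far from the null value as the observed mean, which is one reasonable convention and may differ from what StatKey does internally.

```python
import random
from statistics import mean


def shifted_bootstrap_p_value(sample, null_mean, num_sims=5000, seed=0):
    """Two-sided p-value for H0: mu = null_mean using shift-then-bootstrap."""
    rng = random.Random(seed)
    observed = mean(sample)
    shift = null_mean - observed
    shifted = [x + shift for x in sample]    # shifted sample has mean null_mean
    observed_gap = abs(observed - null_mean)
    extreme = 0
    for _ in range(num_sims):
        resample = rng.choices(shifted, k=len(shifted))  # sample with replacement
        if abs(mean(resample) - null_mean) >= observed_gap:
            extreme += 1
    return extreme / num_sims


# Hypothetical body temperatures, for illustration only.
temps = [97.8, 98.0, 98.2, 98.3, 98.4, 98.1, 98.6, 98.2, 98.5, 98.0, 98.3, 98.7]
p = shifted_bootstrap_p_value(temps, 98.6)
print(round(mean(temps), 2), p)
```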

Body Temperatures
In StatKey, when we enter the null hypothesis, this shifting is automatically done for us.
StatKey p-value = 0.002

Exercise and Gender
H0: μm = μf, Ha: μm > μf
How might we make the null true?
One way (of many): add 3 to every female’s value.
Bootstrap from this modified sample.
In StatKey, the default randomization method is “Reallocate Groups”, but “Shift Groups” is also an option, and will do this.
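The group-shifting approach can be sketched as below. The exercise hours are made up, and the shift is taken to be the observed difference in sample means, which is one way to make the two sample means agree. This is an illustration under those assumptions and does not reproduce StatKey's implementation.

```python
import random
from statistics import mean


def shift_groups_p_value(group_1, group_2, num_sims=5000, seed=0):
    """Right-tail p-value for H0: mu_1 = mu_2 vs Ha: mu_1 > mu_2."""
    rng = random.Random(seed)
    observed = mean(group_1) - mean(group_2)
    shifted_2 = [x + observed for x in group_2]   # both sample means now agree
    extreme = 0
    for _ in range(num_sims):
        # Bootstrap each group separately, keeping the group sizes.
        boot_1 = rng.choices(group_1, k=len(group_1))
        boot_2 = rng.choices(shifted_2, k=len(shifted_2))
        if mean(boot_1) - mean(boot_2) >= observed:
            extreme += 1
    return extreme / num_sims


# Hypothetical weekly exercise hours, for illustration only.
males = [14, 10, 8, 20, 12, 6, 15, 11]
females = [9, 12, 5, 10, 7, 14, 8, 11]
p = shift_groups_p_value(males, females)
print(mean(males) - mean(females), p)
```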

Exercise and Gender p-value = 0.095

Exercise and Gender
The p-value is 0.095. Using α = 0.05, we conclude…
Males exercise more than females, on average
Males do not exercise more than females, on average
Nothing
Answer: Nothing. Do not reject the null… we can’t conclude anything.

Blood Pressure and Heart Rate
H0: ρ = 0, Ha: ρ < 0
Two variables have correlation 0 if they are not associated.
We can “break the association” by randomly permuting/scrambling/shuffling one of the variables.
Each time we do this, we get a sample we might observe just by random chance, if there really is no correlation.
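Shuffling one variable to break the association might look like the sketch below. The blood pressure and heart rate values are made up, so the resulting p-value will not match the one reported for the real data. The helper names are illustrative.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Sample correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def shuffle_p_value_left(xs, ys, num_sims=5000, seed=0):
    """Left-tail p-value for H0: rho = 0 vs Ha: rho < 0."""
    rng = random.Random(seed)
    observed = pearson_r(xs, ys)
    shuffled = list(ys)
    extreme = 0
    for _ in range(num_sims):
        rng.shuffle(shuffled)                 # break any association with xs
        if pearson_r(xs, shuffled) <= observed:
            extreme += 1
    return extreme / num_sims


# Hypothetical measurements, for illustration only.
blood_pressure = [110, 125, 130, 118, 140, 122, 135, 128]
heart_rate = [88, 80, 76, 85, 70, 84, 79, 74]
r = pearson_r(blood_pressure, heart_rate)
p = shuffle_p_value_left(blood_pressure, heart_rate)
print(round(r, 3), p)
```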

Blood Pressure and Heart Rate Even if blood pressure and heart rate are not correlated, we would see correlations this extreme about 22% of the time, just by random chance. p-value = 0.219

Randomization Distribution
Paul the Octopus or ESP (single proportion): Flip a coin or roll a die
Cocaine Addiction (randomized experiment): Rerandomize cases to treatment groups, keeping response values fixed
Body Temperature (single mean): Shift to make H0 true, then bootstrap
Exercise and Gender (observational study): Shift to make H0 true, then bootstrap
Blood Pressure and Heart Rate (correlation): Randomly permute/scramble/shuffle one variable

Connections Today we’ll make connections between… Chapter 1: Data collection (random sampling?, random assignment?) Chapter 2: Which statistic is appropriate, based on the variable(s)? Chapter 3: Bootstrapping and confidence intervals Chapter 4: Randomization distributions and hypothesis tests

Body Temperature
We created a bootstrap distribution for average body temperature by resampling with replacement from the original sample (x̄ = 98.26):

Body Temperature
We also created a randomization distribution to see if average body temperature differs from 98.6°F by adding 0.34 to every value to make the null true, and then resampling with replacement from this modified sample:

Body Temperature
These two distributions are identical (up to random variation from simulation to simulation) except for the center.
The bootstrap distribution is centered around the sample statistic, 98.26, while the randomization distribution is centered around the null hypothesized value, 98.6.
The randomization distribution is equivalent to the bootstrap distribution, but shifted over.
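The "same shape, different center" relationship can be seen directly in a small simulation. The sketch below uses made-up temperatures and deliberately reuses the same random seed for both distributions, so the two match exactly apart from the shift. With independent simulations they would match only up to random variation.

```python
import random
from statistics import mean, stdev


def resampled_means(values, num_sims=2000, seed=0):
    """Means of num_sims resamples drawn with replacement from values."""
    rng = random.Random(seed)
    n = len(values)
    return [mean(rng.choices(values, k=n)) for _ in range(num_sims)]


# Hypothetical body temperatures, for illustration only.
sample = [97.8, 98.0, 98.2, 98.3, 98.4, 98.1, 98.6, 98.2, 98.5, 98.0, 98.3, 98.7]
null_mean = 98.6
shift = null_mean - mean(sample)

bootstrap_dist = resampled_means(sample)                          # centered near the sample mean
randomization_dist = resampled_means([x + shift for x in sample])  # centered near null_mean

print(round(mean(bootstrap_dist), 3), round(mean(randomization_dist), 3))
print(round(stdev(bootstrap_dist), 4), round(stdev(randomization_dist), 4))
```

The two centers differ by `shift`, while the two standard deviations (the estimated standard errors) agree.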

Bootstrap and Randomization Distributions
Bootstrap distribution: our best guess at the distribution of sample statistics; centered around the observed sample statistic; simulates sampling from the population by resampling from the original sample.
Randomization distribution: our best guess at the distribution of sample statistics, if H0 were true; centered around the null hypothesized value; simulates samples assuming H0 were true.
Big difference: a randomization distribution assumes H0 is true, while a bootstrap distribution does not.

Which Distribution?
Let μ be the average amount of sleep college students get per night. Data was collected on a sample of students, and for this sample x̄ = 6.7 hours. A bootstrap distribution is generated to create a confidence interval for μ, and a randomization distribution is generated to see if the data provide evidence that μ > 7. Which distribution below is the bootstrap distribution?
(a) is centered around the sample statistic, 6.7

Which Distribution?
Intro stat students are surveyed, and we find that 152 out of 218 are female. Let p be the proportion of intro stat students at that university who are female. A bootstrap distribution is generated for a confidence interval for p, and a randomization distribution is generated to see if the data provide evidence that p > 1/2. Which distribution is the randomization distribution?
(a) is centered around the null value, 1/2

Connections Today we’ll make connections between… Chapter 1: Data collection (random sampling?, random assignment?) Chapter 2: Which statistic is appropriate, based on the variable(s)? Chapter 3: Bootstrapping and confidence intervals Chapter 4: Randomization distributions and hypothesis tests

Body Temperature
H0: μ = 98.6, Ha: μ ≠ 98.6
(Figure: bootstrap distribution centered at 98.26; randomization distribution centered at 98.6)
Talk about the fact that the null hypothesized value is in the extremes of the bootstrap distribution, so the sample statistic is in the extremes of the randomization distribution.

Body Temperature
H0: μ = 98.4, Ha: μ ≠ 98.4
(Figure: bootstrap distribution centered at 98.26; randomization distribution centered at 98.4)
Talk about the fact that the null hypothesized value is not in the extremes of the bootstrap distribution, so the sample statistic is not in the extremes of the randomization distribution.

Intervals and Tests
A confidence interval represents the range of plausible values for the population parameter.
If the null hypothesized value IS NOT within the CI, it is not a plausible value and should be rejected.
If the null hypothesized value IS within the CI, it is a plausible value and should not be rejected.

Intervals and Tests If a 95% CI contains the parameter in H0, then a two-tailed test should not reject H0 at a 5% significance level. If a 95% CI misses the parameter in H0, then a two-tailed test should reject H0 at a 5% significance level.
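The interval-test connection amounts to a containment check. The sketch below applies it to the three 95% intervals quoted in these slides. The function name is illustrative, and treating the endpoints as inside the interval is an assumption.

```python
def two_tailed_decision(ci_low, ci_high, null_value):
    """Decision of a two-tailed test at level alpha, given a (1 - alpha) CI."""
    if ci_low <= null_value <= ci_high:
        return "do not reject H0"   # null value is plausible
    return "reject H0"              # null value is not plausible


# Body temperature: 95% CI (98.05, 98.47), H0: mu = 98.6
body_temp = two_tailed_decision(98.05, 98.47, 98.6)
# Both father and mother, 2010: 95% CI (0.487, 0.573), H0: p = 0.5
survey_2010 = two_tailed_decision(0.487, 0.573, 0.5)
# Both father and mother, 1997: 95% CI (0.533, 0.607), H0: p = 0.5
survey_1997 = two_tailed_decision(0.533, 0.607, 0.5)
print(body_temp, survey_2010, survey_1997, sep=" | ")
```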

Body Temperatures
Using bootstrapping, we found a 95% confidence interval for the mean body temperature to be (98.05, 98.47).
This does not contain 98.6, so at α = 0.05 we would reject H0 for the hypotheses
H0: μ = 98.6
Ha: μ ≠ 98.6

Both Father and Mother
“Does a child need both a father and a mother to grow up happily?”
Let p be the proportion of adults aged 18-29 in 2010 who say yes. A 95% CI for p is (0.487, 0.573).
Testing H0: p = 0.5 vs Ha: p ≠ 0.5 with α = 0.05, we
Reject H0
Do not reject H0
Reject Ha
Do not reject Ha
Answer: Do not reject H0. 0.5 is within the CI, so it is a plausible value for p.
http://www.pewsocialtrends.org/2011/03/09/for-millennials-parenthood-trumps-marriage/#fn-7199-1

Both Father and Mother
“Does a child need both a father and a mother to grow up happily?”
Let p be the proportion of adults aged 18-29 in 1997 who say yes. A 95% CI for p is (0.533, 0.607).
Testing H0: p = 0.5 vs Ha: p ≠ 0.5 with α = 0.05, we
Reject H0
Do not reject H0
Reject Ha
Do not reject Ha
Answer: Reject H0. 0.5 is not within the CI, so it is not a plausible value for p.
http://www.pewsocialtrends.org/2011/03/09/for-millennials-parenthood-trumps-marriage/#fn-7199-1

Intervals and Tests
Confidence intervals are most useful when you want to estimate population parameters.
Hypothesis tests and p-values are most useful when you want to test hypotheses about population parameters.
Confidence intervals give you a range of plausible values; p-values quantify the strength of evidence against the null hypothesis.

Interval, Test, or Neither?
Is the following question best assessed using a confidence interval, a hypothesis test, or is statistical inference not relevant?
On average, how much more do adults who played sports in high school exercise than adults who did not play sports in high school?
Options: confidence interval; hypothesis test; statistical inference not relevant

Interval, Test, or Neither?
Is the following question best assessed using a confidence interval, a hypothesis test, or is statistical inference not relevant?
Do a majority of adults riding a bicycle wear a helmet?
Options: confidence interval; hypothesis test; statistical inference not relevant

Interval, Test, or Neither?
Is the following question best assessed using a confidence interval, a hypothesis test, or is statistical inference not relevant?
On average, were the players on the 2014 Canadian Olympic hockey team older than the players on the 2014 US Olympic hockey team?
Options: confidence interval; hypothesis test; statistical inference not relevant

Summary
Using α = 0.05, about 5% of all hypothesis tests will lead to rejecting the null, even if all the null hypotheses are true.
Randomization samples should be generated consistent with the null hypothesis, using the observed data, and reflecting the way the data were collected.
If a null hypothesized value lies inside a 95% CI, a two-tailed test using α = 0.05 would not reject H0.
If a null hypothesized value lies outside a 95% CI, a two-tailed test using α = 0.05 would reject H0.

The Big Picture (diagram)
Population → (sampling) → Sample → descriptive statistics
Sample → (statistical inference) → Population

Cases and Variables
We obtain information about cases or units.
A variable is any characteristic that is recorded for each case.
Generally each case makes up a row in a dataset, and each variable makes up a column.
Variables are either categorical or quantitative.

Sampling
Sampling bias occurs when the method of selecting a sample causes the sample to differ from the population in some relevant way.
If sampling bias exists, we cannot generalize from the sample to the population.
To avoid sampling bias, select a random sample.

Sampling (diagram: Population, Sample, Sample)
GOAL: Select a sample that is similar to the population, only smaller

Observational Studies
A third variable that is associated with both the explanatory variable and the response variable is called a confounding variable.
There are almost always confounding variables in observational studies.
Observational studies can almost never be used to establish causation.

Randomized Experiments
In a randomized experiment the explanatory variable for each unit is determined randomly, before the response variable is measured.
Because the explanatory variable is randomly assigned, it is not associated with any other variables. Confounding variables are eliminated!!!
Randomized experiments make it possible to infer causation!

Randomized Experiments (diagram: Confounding Variable, Explanatory Variable, Response Variable; RANDOMIZED EXPERIMENT)

Chapter 1: Data Collection
Was the sample randomly selected? Yes: possible to generalize to the population. No: should not generalize to the population.
Was the explanatory variable randomly assigned? Yes: possible to make conclusions about causality. No: cannot make conclusions about causality.

Chapter 2: Descriptive Statistics
In order to make sense of data, we need ways to summarize and visualize it.
Summarizing and visualizing variables and relationships between two variables is often known as descriptive statistics (also known as exploratory data analysis).
The type of summary statistics and visualization methods depends on the type of variable(s) being analyzed (categorical or quantitative).

Variable(s): visualization; summary statistics
Categorical: bar chart, pie chart; frequency table, relative frequency table, proportion
Quantitative: dotplot, histogram, boxplot; mean, median, max, min, standard deviation, z-score, range, IQR, five number summary
Categorical vs Categorical: side-by-side bar chart, segmented bar chart; two-way table, difference in proportions
Quantitative vs Categorical: side-by-side boxplots; statistics by group, difference in means
Quantitative vs Quantitative: scatterplot; correlation

Descriptive Statistics Think of a topic or question you would like to use data to help you answer. What would the cases be? What would the variables be? (Limit to one or two variables)

Descriptive Statistics
How would you visualize and summarize the variable or relationship between variables?
bar chart/pie chart, proportions, frequency table/relative frequency table
dotplot/histogram/boxplot, mean/median, sd/range/IQR, five number summary
side-by-side or segmented bar charts, difference in proportions, two-way table
side-by-side boxplot, difference in means
scatterplot, correlation

Statistic vs Parameter
A sample statistic is a number computed from sample data.
A population parameter is a number that describes some aspect of a population.
Statistical inference is the process of drawing conclusions about the entire population based on information in a sample.

Sampling Distribution
A sampling distribution is the distribution of statistics computed for different samples of the same size taken from the same population.
The spread of the sampling distribution helps us to assess the uncertainty in the sample statistic.
In real life, we rarely get to see the sampling distribution – we usually only have one sample.

Bootstrap
A bootstrap sample is a random sample taken with replacement from the original sample, of the same size as the original sample.
A bootstrap statistic is the statistic computed on the bootstrap sample.
A bootstrap distribution is the distribution of many bootstrap statistics.

Bootstrap Distribution (diagram)
Original Sample (Sample Statistic) → many Bootstrap Samples → a Bootstrap Statistic from each → Bootstrap Distribution

Confidence Interval
A confidence interval for a parameter is an interval computed from sample data by a method that will capture the parameter for a specified proportion of all samples.
A 95% confidence interval will contain the true parameter for 95% of all samples.

Confidence Intervals
The parameter is fixed.
The statistic is random (depends on the sample).
The interval is random (depends on the statistic).
95% of 95% confidence intervals will capture the truth.

Margin of Error One common form for a confidence interval is statistic ± margin of error The margin of error is determined by the uncertainty in the sample statistic… which depends on how much the statistic varies from sample to sample… which is measured by the standard error

Standard Error
The standard error (SE) is the standard deviation of the sample statistic.
The SE can be estimated by the standard deviation of the bootstrap distribution.
For symmetric, bell-shaped distributions, a 95% confidence interval is statistic ± 2 × SE.

Confidence Intervals (diagram)
Sample → bootstrap samples → calculate the statistic for each bootstrap sample → bootstrap distribution
Standard Error (SE): standard deviation of the bootstrap distribution
Margin of Error (ME): for a 95% CI, ME = 2×SE
Confidence interval: statistic ± ME

Percentile Method
If the bootstrap distribution is approximately symmetric, a P% confidence interval can be obtained by taking the middle P% of the bootstrap distribution.
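Both interval recipes (statistic ± 2 × SE, and the percentile method) can be computed from one bootstrap distribution. The sketch below uses made-up temperatures and simple index-based percentile cutoffs. Other percentile conventions would give slightly different endpoints, and the function names are illustrative.

```python
import random
from statistics import mean, stdev


def bootstrap_means(sample, num_sims=5000, seed=0):
    """Bootstrap distribution of the sample mean."""
    rng = random.Random(seed)
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(num_sims)]


def se_interval(sample, boot_stats):
    """95% CI as statistic +/- 2 * SE, with SE taken from the bootstrap distribution."""
    se = stdev(boot_stats)
    center = mean(sample)
    return center - 2 * se, center + 2 * se


def percentile_interval(boot_stats, level=0.95):
    """Middle `level` share of the bootstrap statistics (approximate cutoffs)."""
    ordered = sorted(boot_stats)
    n = len(ordered)
    tail = (1 - level) / 2
    low_index = round(tail * n)
    high_index = round((1 - tail) * n) - 1
    return ordered[low_index], ordered[high_index]


# Hypothetical body temperatures, for illustration only.
temps = [97.8, 98.0, 98.2, 98.3, 98.4, 98.1, 98.6, 98.2, 98.5, 98.0, 98.3, 98.7]
boot = bootstrap_means(temps)
se_ci = se_interval(temps, boot)
pct_ci = percentile_interval(boot)
print(se_ci, pct_ci)
```

When the bootstrap distribution is roughly symmetric and bell-shaped, the two intervals should be close to each other.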

Bootstrap Distribution (figure)

Hypothesis Testing
How unusual would it be to get results as extreme (or more extreme) than those observed, if the null hypothesis is true?
If it would be very unusual, then the null hypothesis is probably not true!
If it would not be very unusual, then there is not evidence against the null hypothesis.

p-value
The p-value is the probability of getting a statistic as extreme (or more extreme) as that observed, just by random chance, if the null hypothesis is true.
The p-value measures evidence against the null hypothesis.

Randomization Distribution
A randomization distribution is the distribution of sample statistics we would observe, just by random chance, if the null hypothesis were true.
The p-value is calculated by finding the proportion of statistics in the randomization distribution that fall beyond the observed statistic.
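Computing a p-value as "the proportion beyond the observed statistic" can be sketched as follows. The single-proportion example (8 correct out of 8 under a null of p = 0.5) uses hypothetical numbers chosen for illustration, and the two-tailed rule shown (double the smaller tail, capped at 1) is one common convention rather than the only one.

```python
import random


def randomization_proportions(n, null_p, num_sims=10000, seed=0):
    """Sample proportions from n simulated coin flips with success chance null_p."""
    rng = random.Random(seed)
    return [sum(rng.random() < null_p for _ in range(n)) / n
            for _ in range(num_sims)]


def p_value(randomization_stats, observed, tail="right"):
    """Proportion of randomization statistics at or beyond the observed statistic."""
    total = len(randomization_stats)
    right = sum(s >= observed for s in randomization_stats) / total
    left = sum(s <= observed for s in randomization_stats) / total
    if tail == "right":
        return right
    if tail == "left":
        return left
    if tail == "two":
        return min(1.0, 2 * min(left, right))  # double the smaller tail
    raise ValueError("tail must be 'right', 'left', or 'two'")


# Hypothetical: 8 correct predictions out of 8, H0: p = 0.5, Ha: p > 0.5
stats = randomization_proportions(n=8, null_p=0.5)
p = p_value(stats, observed=8 / 8, tail="right")
print(p)
```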

Hypothesis Testing (figure: randomization distribution with the observed statistic and p-value marked)

Statistical Conclusions
Strength of evidence against H0:
Formal decision of hypothesis test, based on α = 0.05:

Formal Decisions
For a given significance level, α:
p-value < α ⇒ Reject H0
p-value > α ⇒ Do not reject H0
“If the p-value is low, the H0 must go”

Errors
Truth: H0 true. Decision: reject H0 → TYPE I ERROR; do not reject H0 → correct.
Truth: H0 false. Decision: reject H0 → correct; do not reject H0 → TYPE II ERROR.

More on Significance
Statistical significance is closely connected to sample size.
Larger n: easier to get a significant result.
Smaller n: easier to make a Type II error.
Statistical significance and practical significance are not always the same.
Problem of multiple testing: even if all null hypotheses are true, about a proportion α of all tests will find significant results.

Connecting Intervals and Tests
A confidence interval represents the range of plausible values for the population parameter.
If the null hypothesized value IS NOT within the CI, it is not a plausible value and should be rejected.
If the null hypothesized value IS within the CI, it is a plausible value and should not be rejected.

Intervals and Tests
Confidence intervals are most useful when you want to estimate population parameters.
Hypothesis tests and p-values are most useful when you want to test hypotheses about population parameters.
Confidence intervals give you a range of plausible values; p-values quantify the strength of evidence against the null hypothesis.

Let’s put it all together…
You’ve now learned how to successfully collect and analyze data to answer a question!

Tongue Curling
What proportion of people can roll their tongue?
Can you roll your tongue? (a) Yes (b) No
Visualize and summarize the data. What is your point estimate?
Give and interpret a confidence interval.
Tongue rolling has been said to be a dominant trait, in which case theoretically 75% of all people should be able to roll their tongues. Do our data provide evidence otherwise?

Exercise and Pulse
Does just 5 seconds of exercise increase pulse rate?
What are the cases and variables? Are they categorical or quantitative? Identify explanatory and response.
Does the question imply causality? How would you collect data to answer it?
Collect data.
Visualize and summarize your data. Before doing any formal inference, take a guess at answering the question.
Conduct a hypothesis test to answer the question. State your hypotheses, calculate the p-value, make a conclusion in context.
By how much does 5 seconds of exercise increase pulse rate? State the parameter of interest and give and interpret a confidence interval.

To Do
Read Sections 4.4, 4.5, ES A, ES B
Study for Exam 1!