Hypothesis Testing and Confidence Intervals (Part 4): Two-Sample t-Tests and Confidence Intervals Lecture 10 Justin Kern October 24 and 26, 2017.


Confidence Interval for 𝜇 (𝑧 𝑜𝑟 𝑡)
Suppose we are interested in the average high score of the millions of people all over the world who play a very popular computer game. Unfortunately, the server does not keep a record of high scores, so we cannot simply determine the average score of the entire population (true mean $\mu$). We do not know the population standard deviation either. However, all individuals do know their own high score, and we also happen to know that the population high scores are not normally distributed. We take a random sample of 121 players and calculate their mean high score to be 5000 and standard deviation to be 1000. What is the 95% confidence interval for $\mu$?
Population distribution not normal
$s = 1000$ ($\sigma$ unknown)
$n = 121$ ($n \ge 30$)
$\bar{x} = 5000$
$\alpha = 0.05$ (95% confidence)
$df = n - 1 = 121 - 1 = 120$
Because $n \ge 30$, $\dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ has approximately a $t$-distribution, which is in turn well approximated by the standard normal $z$.
95% CI ($t$), with $t_{\alpha/2,\,n-1} = t_{0.025,\,120} = 1.980$:
$$[\mu_L, \mu_U] = \bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} = 5000 \pm 1.980 \cdot \frac{1000}{\sqrt{121}} = 5000 \pm 180 = [4820,\ 5180]$$
95% CI ($z$), with $z_{\alpha/2} = z_{0.025} = 1.960$:
$$[\mu_L, \mu_U] = \bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}} = 5000 \pm 1.960 \cdot \frac{1000}{\sqrt{121}} \approx 5000 \pm 178.2 = [4821.8,\ 5178.2]$$
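The two intervals on this slide differ only in the multiplier, which a short Python sketch can make concrete. The function name `mean_ci` is made up for illustration, and the multipliers 1.980 ($t$ with 120 degrees of freedom) and 1.960 ($z$) are standard table values typed in by hand rather than computed.

```python
import math

def mean_ci(xbar, s, n, multiplier):
    """Return (lower, upper) for xbar +/- multiplier * s / sqrt(n)."""
    margin = multiplier * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Summary statistics from the slide.
xbar, s, n = 5000, 1000, 121

T_CRIT_120 = 1.980  # t_{0.025, 120}, read from a t table (assumed input)
Z_CRIT = 1.960      # z_{0.025}, read from a normal table (assumed input)

t_interval = mean_ci(xbar, s, n, T_CRIT_120)
z_interval = mean_ci(xbar, s, n, Z_CRIT)

print("t interval:", t_interval)
print("z interval:", z_interval)
```

With 120 degrees of freedom the $t$ interval is only slightly wider than the $z$ interval, which is why the two are treated as interchangeable for large $n$.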

Confidence Interval for 𝜇 (𝑡)
Suppose we are interested in the average high score of the millions of people all over the world who play a very popular computer game. Unfortunately, the server does not keep a record of high scores, so we cannot simply determine the average score of the entire population (true mean $\mu$). We do not know the population standard deviation either. However, all individuals do know their own high score, and we also happen to know that the population high scores are normally distributed. We take a random sample of 25 players and calculate their mean high score to be 5000 and standard deviation to be 1000. What is the 99% confidence interval for $\mu$?
Population distribution normal
$s = 1000$ ($\sigma$ unknown)
$n = 25$ ($n < 30$)
$\bar{x} = 5000$
$\alpha = 0.01$ (99% confidence)
$df = n - 1 = 25 - 1 = 24$
$\dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ has a $t$-distribution.
99% CI ($t$), with $t_{\alpha/2,\,n-1} = t_{0.005,\,24} = 2.797$:
$$[\mu_L, \mu_U] = \bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} = 5000 \pm 2.797 \cdot \frac{1000}{\sqrt{25}} = 5000 \pm 559.4 = [4440.6,\ 5559.4]$$

Confidence Interval for 𝜇 (𝑡)
Founded in 1998, Telephia provides a wide variety of information on cellular phone use. In 2006, Telephia reported that United Kingdom (UK) subscribers with 3G phones spent an average of 8.3 hours per month listening to full-track music on their cell phones. Suppose we hypothesize that US subscribers are different from UK subscribers in their phone usage. Say we draw a random sample of size 8 from the US population of 3G subscribers. Further suppose (unrealistically) that the distribution of time usage follows a normal distribution. Suppose we are interested in constructing a 95% confidence interval for the mean usage for US subscribers and using that to test our hypothesis. What would the 95% confidence interval about the population mean time of US subscribers look like? With $\alpha = .05$, can we conclude that US subscribers have a different mean time usage than UK subscribers?
Sample: 5, 6, 0, 4, 11, 9, 2, 3
What are the null and alternative hypotheses?
$H_0: \mu = 8.3$ vs. $H_1: \mu \ne 8.3$
What is $\bar{x}$, and what is $s$?
$\bar{x} = 5$, $s = 3.625$
What is the confidence interval?
$\alpha = 0.05$ (95% confidence), $df = n - 1 = 8 - 1 = 7$, $t_{\alpha/2,\,n-1} = t_{0.025,\,7} = 2.365$
$$95\%\ \text{CI}: \quad 5 \pm 2.365 \cdot \frac{3.625}{\sqrt{8}} = 5 \pm 3.031 = [1.969,\ 8.031]$$
What is the conclusion?
The CI does not contain 8.3, so we reject $H_0$. Substantively, this means that we conclude that US 3G subscribers' mean time usage is statistically significantly different from 8.3 hours per month (UK subscribers' mean time).
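The steps of this example (summarize the sample, build the interval, check whether the hypothesized mean falls inside it) can be sketched in Python using only the standard library. The critical value 2.365 is entered by hand from a $t$ table, and the variable names are illustrative.

```python
import math
import statistics

sample = [5, 6, 0, 4, 11, 9, 2, 3]  # hours per month, from the slide
mu_0 = 8.3                          # UK mean, the value under H0
t_crit = 2.365                      # t_{0.025, 7}, read from a t table (assumed input)

n = len(sample)
xbar = statistics.mean(sample)
s = statistics.stdev(sample)        # sample sd, n - 1 in the denominator

margin = t_crit * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin

# A two-sided test at level alpha rejects H0 exactly when mu_0 lies
# outside the (1 - alpha) confidence interval.
reject_h0 = not (lower <= mu_0 <= upper)

print(f"xbar = {xbar}, s = {s:.3f}")
print(f"95% CI = [{lower:.3f}, {upper:.3f}]")
print("reject H0:", reject_h0)
```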

Two-Sample t-Test: Pooled
Scenario:
We are testing if the means for two sets of observations are equal to each other. The observations in the two sets are independent of each other. We don't know the variances for the populations, so they must be estimated. We further assume that the population variances (though unknown) are equal. There is a technical reason for this. Assume the data are normally distributed or, if $n_1 \ge 30$ and $n_2 \ge 30$, we can use the CLT.
Hypothesis test:
$H_0: \mu_1 - \mu_2 = 0$ vs. $H_1: \mu_1 - \mu_2 \ne 0$
$H_0: \mu_1 - \mu_2 \le 0$ vs. $H_1: \mu_1 - \mu_2 > 0$
$H_0: \mu_1 - \mu_2 \ge 0$ vs. $H_1: \mu_1 - \mu_2 < 0$
Test statistic:
Because the variances of the two populations are assumed equal, we use a pooled variance as an estimate.
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}, \qquad df = n_1 + n_2 - 2$$
$$T = \frac{\bar{X}_1 - \bar{X}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t(df)$$
If the variances were not equal to each other, then the test statistic would not follow an exact t-distribution.
Show on the board that
$$\mathrm{Var}(\bar{X}_1 - \bar{X}_2) = \mathrm{Var}(\bar{X}_1) + \mathrm{Var}(\bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2},$$
but that if the variances were the same, it would be
$$\mathrm{Var}(\bar{X}_1 - \bar{X}_2) = \frac{\sigma_p^2}{n_1} + \frac{\sigma_p^2}{n_2} = \sigma_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right).$$
We then substitute in an estimate for the variance, which is the pooled variance.

Two-Sample t-Test: Pooled (Example)
Suppose we want to determine whether the mean annual salary of psych professors at UC is the same as that of math professors. Assume (unrealistically) that salaries are normally distributed and the two populations have equal variances. We randomly sample 29 psych professors and 33 math professors, and find the means (in thousands) and variances of their salaries: $\bar{x}_1 = 87.2$ and $s_1^2 = 112.4$ in psychology, and $\bar{x}_2 = 91.3$ and $s_2^2 = 117.2$ in math. Let $\alpha = .05$.
What are the null and alternative hypotheses?
$H_0: \mu_1 - \mu_2 = 0$ vs. $H_1: \mu_1 - \mu_2 \ne 0$
Because the population sds are assumed equal, and are not known, a pooled two-sample t-test is used:
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{28(112.4) + 32(117.2)}{60}} \approx 10.72$$
$$t_{obs} = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{87.2 - 91.3}{10.72\sqrt{\dfrac{1}{29} + \dfrac{1}{33}}} \approx -1.50$$
Critical value method:
$df = n_1 + n_2 - 2 = 60$
Two-tailed test with $\alpha = .05$: $t_{crit} = \pm 2.000$. The area beyond the critical values ($-2.000$ and $2.000$) is the critical region where we would reject $H_0$.
Since $-t_{crit} < t_{obs} < t_{crit}$ (that is, $t_{obs}$ is not in the critical region), we do not reject $H_0$. Thus, we can conclude that the mean salaries of psych and math professors at UC are not significantly different.
Show p-value method as well.
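As a check on the arithmetic in this example, the pooled standard deviation and test statistic can be computed in a few lines of Python. The helper name `pooled_t` is made up for illustration, and the critical value 2.000 is taken from the slide rather than computed.

```python
import math

def pooled_t(x1, var1, n1, x2, var2, n2):
    """Pooled two-sample t statistic from summary statistics.

    Takes sample means, sample variances, and sample sizes;
    returns (pooled sd, observed t, degrees of freedom).
    """
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df  # pooled variance
    sp = math.sqrt(sp2)
    t_obs = (x1 - x2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return sp, t_obs, df

# Psychology (group 1) vs. math (group 2), values from the slide.
sp, t_obs, df = pooled_t(87.2, 112.4, 29, 91.3, 117.2, 33)

t_crit = 2.000  # t_{0.025, 60}, from the slide
reject_h0 = abs(t_obs) > t_crit

print(f"s_p = {sp:.2f}, t_obs = {t_obs:.2f}, df = {df}")
print("reject H0:", reject_h0)
```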

Two-Sample t-Test: Pooled (Example)
Suppose we want to determine whether the mean annual salary of psych professors at UC is the same as that of biology professors. Assume (unrealistically) that salaries are normally distributed and the two populations have equal variances. We randomly sample 29 psych professors and 13 biology professors, and find the means (in thousands) and variances of their salaries: $\bar{x}_1 = 87.2$ and $s_1^2 = 112.4$ in psychology, and $\bar{x}_2 = 93.1$ and $s_2^2 = 90.2$ in biology. Let $\alpha = .05$.
What are the null and alternative hypotheses?
$H_0: \mu_1 - \mu_2 = 0$ vs. $H_1: \mu_1 - \mu_2 \ne 0$
Variances are unknown, but assumed equal. Therefore, compute the pooled standard deviation and observed t-statistic.
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{28(112.4) + 12(90.2)}{40}} \approx 10.28$$
$$t_{obs} = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{87.2 - 93.1}{10.28\sqrt{\dfrac{1}{29} + \dfrac{1}{13}}} \approx -1.719$$
Using the critical value method, do we reject or fail to reject the null hypothesis?
$df = n_1 + n_2 - 2 = 40$
Two-tailed test with $\alpha = .05$: $t_{crit} = \pm 2.021$. The area beyond the critical values ($-2.021$ and $2.021$) is the critical region where we would reject $H_0$.
Since $-t_{crit} < t_{obs} < t_{crit}$ (that is, $t_{obs}$ is not in the critical region), we do not reject $H_0$.
What is the substantive conclusion?
Thus, we can conclude that the mean salaries of psych and biology professors at UC are not significantly different.
Show p-value method as well.
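The p-value method mentioned in the pooled examples can be sketched without a statistics package by numerically integrating the $t$ density. This is only an illustration under stated assumptions: `t_pdf` and `two_tailed_p` are made-up helper names, Simpson's rule stands in for a proper $t$ CDF routine, and the inputs are the first pooled example's $t_{obs} = -1.50$ with $df = 60$ plus the two tabled critical values.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_obs, df, steps=2000):
    """P(|T| >= |t_obs|), via Simpson's rule on [0, |t_obs|]. steps must be even."""
    b = abs(t_obs)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3      # P(0 <= T <= |t_obs|)
    return 1 - 2 * central_half       # both tails, by symmetry

p_example = two_tailed_p(-1.50, 60)    # first pooled example
p_at_crit_60 = two_tailed_p(2.000, 60)  # should be close to alpha = .05
p_at_crit_40 = two_tailed_p(2.021, 40)  # should be close to alpha = .05

print(f"p-value for t = -1.50, df = 60: {p_example:.3f}")
print(f"p at t_crit (df = 60): {p_at_crit_60:.4f}")
print(f"p at t_crit (df = 40): {p_at_crit_40:.4f}")
```

The decision rule is the same as with critical values: reject $H_0$ when the p-value is below $\alpha$. A p-value above .05 for $t_{obs} = -1.50$ matches the "do not reject" conclusion reached with the critical value method.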

Confidence Interval for 𝜇 1 − 𝜇 2 (𝑡), Equal Variances
For small sample sizes ($n_1 < 30$, $n_2 < 30$), the sample standard deviations ($s_1$, $s_2$) are not very good estimates of their respective population standard deviations ($\sigma_1$, $\sigma_2$).
With the assumption that the population standard deviations are actually equal, we can “pool” the sample variances to get a better estimate of the population variance:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \;\rightarrow\; t \text{ distribution with } df = n_1 + n_2 - 2$$
Confidence interval:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,(n_1 + n_2 - 2)}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Suppose we are interested in comparing the average high scores of players in the United States ($\mu_1$) and the average high scores of players in Australia ($\mu_2$). We know that the high scores for both populations are normally distributed with equal standard deviations. We take a random sample of 10 players from the US, and their mean and standard deviation are calculated to be 6000 and 1600, respectively. Likewise, we take a random sample of 8 players from Australia, and their mean and standard deviation are calculated to be 4000 and 1800, respectively. What is the 90% confidence interval for the difference in mean high scores between the US and Australia ($\mu_1 - \mu_2$)?
Population distributions normal
$s_1 = 1600$, $s_2 = 1800$ ($\sigma$'s unknown)
$n_1 = 10$, $n_2 = 8$ ($n_1 < 30$, $n_2 < 30$)
$\bar{x}_1 = 6000$, $\bar{x}_2 = 4000$
$\alpha = 0.10$ (90% confidence)
$df = 10 + 8 - 2 = 16$
$t$ distribution: $t_{\alpha/2,\,(n_1 + n_2 - 2)} = t_{0.10/2,\,(10 + 8 - 2)} = t_{0.05,\,16} = 1.746$
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(10 - 1)(1600)^2 + (8 - 1)(1800)^2}{10 + 8 - 2} = 2{,}857{,}500$$
$$90\%\ \text{CI}: \quad (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,(n_1 + n_2 - 2)}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (6000 - 4000) \pm 1.746\sqrt{2{,}857{,}500}\sqrt{\frac{1}{10} + \frac{1}{8}} \approx 2000 \pm 1400 = [600.00,\ 3400.00]$$

Suppose we are interested in whether increasing the amount of calcium in our diets has an effect on blood pressure. In a randomized experiment, one group of 10 people was given a calcium supplement for 12 weeks, and a control group of 11 people was given a placebo during that same time span. The effect on blood pressure was operationalized as the difference in blood pressure before and after the 12 weeks. The mean difference in blood pressure for the calcium group was 5 with a standard deviation of 8.743, while the mean difference in blood pressure for the placebo group was $-0.0273$ with a standard deviation of 5.901. We assume that the population variances in blood pressure differences are equal for the two groups, and that blood pressure scores are normally distributed.
Sample size, sample means, sample standard deviations?
$s_1 = 8.743$, $s_2 = 5.901$ ($\sigma$'s unknown)
$n_1 = 10$, $n_2 = 11$ ($n_1 < 30$, $n_2 < 30$, but the population scores are normal, so the $t$ procedure is valid without relying on the CLT)
$\bar{x}_1 = 5.000$, $\bar{x}_2 = -0.0273$
What is the pooled variance and critical value?
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(10 - 1)(8.743)^2 + (11 - 1)(5.901)^2}{10 + 11 - 2} = 54.536$$
$\alpha = .05$, $df = n_1 + n_2 - 2 = 10 + 11 - 2 = 19$
$t_{\alpha/2,\,(n_1 + n_2 - 2)} = t_{0.05/2,\,(10 + 11 - 2)} = t_{0.025,\,19} = 2.093$
Create a 95% CI for the difference in mean change in blood pressure:
$$95\%\ \text{CI}: \quad (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,(n_1 + n_2 - 2)}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (5.000 - (-0.0273)) \pm 2.093\sqrt{54.536}\sqrt{\frac{1}{10} + \frac{1}{11}} \approx 5.027 \pm 6.753 = [-1.726,\ 11.781]$$
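The equal-variance confidence interval can be reproduced from the summary statistics alone. The sketch below uses a hypothetical helper `pooled_ci` and takes the critical values (2.093 for 19 degrees of freedom, 1.746 for 16) from the slides rather than computing them.

```python
import math

def pooled_ci(x1, s1, n1, x2, s2, n2, t_crit):
    """Equal-variance CI for mu1 - mu2 from summary statistics.

    s1 and s2 are sample standard deviations. Returns the pooled
    variance and the (lower, upper) interval.
    """
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    margin = t_crit * math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return sp2, (diff - margin, diff + margin)

# Calcium (group 1) vs. placebo (group 2), 95% CI.
sp2_bp, ci_bp = pooled_ci(5.000, 8.743, 10, -0.0273, 5.901, 11, t_crit=2.093)

# US (group 1) vs. Australia (group 2) high scores, 90% CI.
sp2_game, ci_game = pooled_ci(6000, 1600, 10, 4000, 1800, 8, t_crit=1.746)

print(f"blood pressure: s_p^2 = {sp2_bp:.3f}, CI = [{ci_bp[0]:.3f}, {ci_bp[1]:.3f}]")
print(f"high scores:    s_p^2 = {sp2_game:.0f}, CI = [{ci_game[0]:.2f}, {ci_game[1]:.2f}]")
```

The blood pressure interval contains 0, so at $\alpha = .05$ these data would not lead us to reject $H_0: \mu_1 - \mu_2 = 0$ in a two-sided test.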

Two-Sample t-Test: Unpooled (Welch’s t-test)
Scenario:
We are testing if the means for two sets of observations are equal to each other. The observations in the two sets are independent of each other. We don't know the variances for the populations, so they must be estimated. We do not assume that the population variances are equal. Assume the data are normally distributed or, if $n_1 \ge 30$ and $n_2 \ge 30$, we can use the CLT.
Hypothesis test:
$H_0: \mu_1 - \mu_2 = 0$ vs. $H_1: \mu_1 - \mu_2 \ne 0$
$H_0: \mu_1 - \mu_2 \le 0$ vs. $H_1: \mu_1 - \mu_2 > 0$
$H_0: \mu_1 - \mu_2 \ge 0$ vs. $H_1: \mu_1 - \mu_2 < 0$
Test statistic:
$$T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Unfortunately, the usual test statistic, while intuitive, doesn't actually follow an exact t-distribution. Simply choosing degrees of freedom as in the pooled test can give actual Type I error rates larger than the chosen $\alpha$-level. The strategy is to pick degrees of freedom so that the test's actual Type I error rate is close to the chosen $\alpha$-level.
The optimal choice of degrees of freedom is complicated, and can give non-integer values:
$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
This is called the Welch-Satterthwaite equation. It gives an approximate number of degrees of freedom of a combination of sample variances when the population variances are not assumed to be equal. It has a lower bound of $\min(n_1 - 1,\ n_2 - 1)$ and a maximum of $n_1 + n_2 - 2$, which it reaches, for example, when $s_1 = s_2$ and $n_1 = n_2$.
A simpler choice is to use the smaller of $n_1 - 1$ and $n_2 - 1$ as degrees of freedom:
$$df = \min(n_1 - 1,\ n_2 - 1)$$
This strategy results in a more conservative test, meaning that actual Type I error rates will tend to be smaller than the chosen $\alpha$-level. For purposes of this class, we will use this lower bound. This way, we reject $H_0$ less often than we optimally could (giving up some power), but we also make fewer Type I errors.
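To see how the Welch-Satterthwaite degrees of freedom behave between their two bounds, here is a small Python sketch. The function names are made up for illustration, and the three sets of inputs are hypothetical values chosen only to show the extremes; none of them come from the lecture.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom.

    s1, s2 are sample standard deviations; n1, n2 are sample sizes.
    """
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def conservative_df(n1, n2):
    """The simpler, conservative choice: min(n1 - 1, n2 - 1)."""
    return min(n1 - 1, n2 - 1)

# Hypothetical inputs (not from the lecture).
# Equal sds and equal sample sizes: df reaches the maximum n1 + n2 - 2.
df_equal = welch_df(2.0, 10, 2.0, 10)

# The small sample has the much larger sd: df falls close to the lower bound.
df_near_lower = welch_df(10.0, 5, 1.0, 50)

# The large sample has the much larger sd: df is close to n2 - 1.
df_near_larger = welch_df(1.0, 5, 10.0, 50)

print(df_equal, df_near_lower, df_near_larger)
print(conservative_df(5, 50))
```

In every case the result stays between $\min(n_1 - 1, n_2 - 1)$ and $n_1 + n_2 - 2$, which is why the lower bound is a safe, if cautious, substitute.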

Two-Sample t-Test: Unpooled (Example)
The Psychopathy Checklist: Screening Version (PCL: SV) is a psychological measure consisting of twelve items. Each item gets a rating of 0 (not present), 1 (may be/partially present), or 2 (present) during a structured interview. The total score is calculated and, supposedly, a score of 13 or higher is indicative of a person who is, or may be, psychopathic. It has also been used to predict violent behavior, where a larger score indicates a person more likely to engage in violent behavior.

From a random sample of 177 patients, PCL: SV scores were recorded. Of the 177 patients, 41 were violent. The mean PCL: SV score for the violent patients was 11.88, with a variance of […]. For non-violent patients, it was 7.70, with a variance of […]. There is reason to believe the variances of scores for the two groups are different. Using an $\alpha = .01$ rejection level, determine whether violent patients have the same mean PCL: SV score as non-violent patients.
What are the null and alternative hypotheses?
$H_0: \mu_1 - \mu_2 = 0$ vs. $H_1: \mu_1 - \mu_2 \ne 0$
The population sds are not known, and not assumed equal, and $n_1 = 41 > 30$ and $n_2 = 177 - 41 = 136 > 30$, so an unpooled t-statistic is used:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{11.88 - 7.70}{\sqrt{\dfrac{s_1^2}{41} + \dfrac{s_2^2}{136}}} \approx 4.209$$
Critical value method:
$df = \min(n_1 - 1,\ n_2 - 1) = \min(40,\ 135) = 40$, $\alpha = .01$
$t_{\alpha/2,\,df} = t_{0.01/2,\,40} = t_{0.005,\,40} = 2.704$
Since $4.209 = t_{obs} > t_{crit} = 2.704$ (i.e., $t_{obs}$ is in the critical region), we reject $H_0$. Thus, we can conclude that violent and nonviolent patients have significantly different mean PCL: SV scores.
Satterthwaite df = […], $t_{crit} = 2.656$

The US Department of Agriculture (USDA) uses sample surveys to produce important economic estimates. One pilot study estimated wheat prices in July and in September using independent samples of wheat producers in the two months. In July, 30 samples were taken with a sample mean of $2.95 and a standard deviation of $0.22. Similarly, in September, 30 samples were taken, with a sample mean of $3.10 and a standard deviation of $0.19. Can we conclude, with an $\alpha$-level of .05, that the national average prices are not the same?
What are the null and alternative hypotheses?
$H_0: \mu_1 - \mu_2 = 0$ vs. $H_1: \mu_1 - \mu_2 \ne 0$
Compute the appropriate t-statistic (group 1 is September, group 2 is July):
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{3.10 - 2.95}{\sqrt{\dfrac{0.19^2}{30} + \dfrac{0.22^2}{30}}} \approx \frac{0.15}{0.0531} \approx 2.826$$
Critical value method:
$df = \min(n_1 - 1,\ n_2 - 1) = \min(29,\ 29) = 29$, $\alpha = .05$
$t_{\alpha/2,\,df} = t_{0.05/2,\,29} = t_{0.025,\,29} = 2.045$
Since $2.826 = t_{obs} > t_{crit} = 2.045$ (i.e., $t_{obs}$ is in the critical region), we reject $H_0$.
What is your substantive conclusion?
We conclude that national average wheat prices are significantly different between July and September.
Satterthwaite df $\approx 56.8$, $t_{crit} = 2.003$
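The wheat-price example can be checked end to end from its summary statistics. This sketch treats September as group 1 (matching the order of the subtraction on the slide), uses the tabled critical value 2.045, and also evaluates the Welch-Satterthwaite formula so the two choices of degrees of freedom can be compared; the variable names are illustrative.

```python
import math

# Summary statistics from the slide (group 1 = September, group 2 = July).
x1, s1, n1 = 3.10, 0.19, 30
x2, s2, n2 = 2.95, 0.22, 30

v1 = s1**2 / n1  # estimated variance of the September sample mean
v2 = s2**2 / n2  # estimated variance of the July sample mean

t_obs = (x1 - x2) / math.sqrt(v1 + v2)

# Conservative degrees of freedom used in this class.
df_conservative = min(n1 - 1, n2 - 1)

# Welch-Satterthwaite degrees of freedom, for comparison.
df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

t_crit = 2.045  # t_{0.025, 29}, from the slide
reject_h0 = abs(t_obs) > t_crit

print(f"t_obs = {t_obs:.3f}")
print(f"conservative df = {df_conservative}, Welch-Satterthwaite df = {df_welch:.1f}")
print("reject H0:", reject_h0)
```

Because the Welch-Satterthwaite degrees of freedom are larger, the corresponding critical value is smaller, so any result that rejects under the conservative rule also rejects under the Satterthwaite rule.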

Confidence Interval for 𝜇 1 − 𝜇 2 (𝑡), No Equal Variances
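Comparing Two Population Means (without being able to make an assumption of equal variances between groups): $\mu_1 - \mu_2 = \ ?$
$$E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2$$
$$\mathrm{Var}(\bar{X}_1 - \bar{X}_2) = \mathrm{Var}(\bar{X}_1) + \mathrm{Var}(\bar{X}_2) \approx \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$$
$$SD(\bar{X}_1 - \bar{X}_2) = \sqrt{\mathrm{Var}(\bar{X}_1 - \bar{X}_2)} \approx \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \;\rightarrow\; t \text{ distribution with } df = \min(n_1 - 1,\ n_2 - 1)$$
Confidence interval, using the same degrees of freedom:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}, \qquad df = \min(n_1 - 1,\ n_2 - 1)$$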

Confidence Interval for 𝜇 1 − 𝜇 2 (𝑡), No Equal Variances
Suppose we are interested in comparing the average high scores of players in the United States ($\mu_1$) and the average high scores of players in Australia ($\mu_2$). We know that the high scores for both populations are normally distributed. We take a random sample of 120 players from the US, and their mean and standard deviation are calculated to be 6000 and 1600, respectively. Likewise, we take a random sample of 61 players from Australia, and their mean and standard deviation are calculated to be 4000 and 1800, respectively. What is the 90% confidence interval for the difference in mean high scores between the US and Australia ($\mu_1 - \mu_2$)?
Population distributions normal
$s_1 = 1600$, $s_2 = 1800$ ($\sigma$'s unknown)
$n_1 = 120$, $n_2 = 61$ ($n_1 \ge 30$, $n_2 \ge 30$)
$\bar{x}_1 = 6000$, $\bar{x}_2 = 4000$
$\alpha = 0.10$ (90% confidence)
$df = \min(n_1 - 1,\ n_2 - 1) = \min(119,\ 60) = 60$
$t$ distribution: $t_{\alpha/2,\,df} = t_{0.10/2,\,60} = t_{0.05,\,60} = 1.671$
$$90\%\ \text{CI}: \quad (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (6000 - 4000) \pm 1.671\sqrt{\frac{1600^2}{120} + \frac{1800^2}{61}} \approx 2000 \pm 455.94 = [1544.06,\ 2455.94]$$
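For the unequal-variance interval, only the standard error changes relative to the pooled case. A minimal Python sketch, using a made-up helper name `unpooled_ci` and the slide's critical value 1.671 (for the conservative $df = 60$), with sample sizes 120 and 61 as given in the problem statement:

```python
import math

def unpooled_ci(x1, s1, n1, x2, s2, n2, t_crit):
    """CI for mu1 - mu2 without assuming equal variances.

    The standard error adds the two estimated variances of the
    sample means instead of pooling the sample variances.
    """
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - t_crit * se, diff + t_crit * se

# US (group 1) vs. Australia (group 2), 90% CI.
n1, n2 = 120, 61
df = min(n1 - 1, n2 - 1)  # conservative degrees of freedom
lower, upper = unpooled_ci(6000, 1600, n1, 4000, 1800, n2, t_crit=1.671)

print(f"df = {df}")
print(f"90% CI = [{lower:.2f}, {upper:.2f}]")
```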

Paired t-Test
Scenario:
We are testing if the means for two sets of observations are equal to each other. The observations in the two sets are dependent on each other.
Example 1: Ratings on pre- and post-tests for students in a class.
Example 2: Scores on a measure of affection for heterosexual couples.
A consequence of the dependence of scores is that the variance of $\bar{X}_1 - \bar{X}_2$ is not simply the sum of the variances. We don't know the variance for the population difference, so it must be estimated. Assume the data are normally distributed or, if $n \ge 30$, we can use the CLT.
Hypothesis test:
$H_0: \mu_1 - \mu_2 = \mu_D = 0$ vs. $H_1: \mu_1 - \mu_2 = \mu_D \ne 0$
$H_0: \mu_1 - \mu_2 = \mu_D \le 0$ vs. $H_1: \mu_1 - \mu_2 = \mu_D > 0$
$H_0: \mu_1 - \mu_2 = \mu_D \ge 0$ vs. $H_1: \mu_1 - \mu_2 = \mu_D < 0$
Test statistic:
We create scores $d_i = x_{1i} - x_{2i}$ and compute the sample mean and variance based on those.
$$T = \frac{\bar{X}_D}{s_D/\sqrt{n}} \sim t(n - 1)$$
Note, $n$ is the number of pairs.
$$\mathrm{Var}(\bar{D}) = \mathrm{Var}(\bar{X}_1) + \mathrm{Var}(\bar{X}_2) - 2\,\mathrm{Cov}(\bar{X}_1, \bar{X}_2)$$
$$= \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(X_{1i}) + \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(X_{2i}) - \frac{2}{n^2}\sum_{i=1}^{n}\mathrm{Cov}(X_{1i}, X_{2i})$$
$$= \frac{n}{n^2}\mathrm{Var}(X_{1i}) + \frac{n}{n^2}\mathrm{Var}(X_{2i}) - \frac{2n}{n^2}\mathrm{Cov}(X_{1i}, X_{2i}) = \frac{\sigma_1^2 + \sigma_2^2 - 2\sigma_{12}}{n}$$
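The variance identity behind the paired test also holds for sample quantities, which makes it easy to check numerically. The scores below are hypothetical numbers made up for illustration (they are not from the lecture), and `sample_cov` is a small helper written here so the sketch does not depend on a particular Python version.

```python
import statistics

# Hypothetical paired scores, for illustration only (not from the lecture).
x1 = [12.0, 15.0, 11.0, 18.0, 14.0]
x2 = [10.0, 14.0, 12.0, 15.0, 11.0]

def sample_cov(a, b):
    """Sample covariance with n - 1 in the denominator."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    cross = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    return cross / (len(a) - 1)

# Difference scores d_i = x_1i - x_2i.
d = [a - b for a, b in zip(x1, x2)]

var_d_direct = statistics.variance(d)
var_d_identity = (statistics.variance(x1) + statistics.variance(x2)
                  - 2 * sample_cov(x1, x2))
var_sum_only = statistics.variance(x1) + statistics.variance(x2)

print(var_d_direct, var_d_identity, var_sum_only)
```

When the paired scores are positively correlated, the variance of the differences is smaller than the plain sum of the two variances, which is what the paired test takes advantage of.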

Paired t-Test (Example)
A bar owner wants to compare the sales performance of two of his bars, the Big Pig and the Little Pig. Sales can vary considerably depending on the day of the week and the season of the year, so he decides to eliminate these effects by making sure to record each bar's sales on the same sample of days. After choosing a random sample of 12 days, he records the sales (in hundreds of dollars) for each bar on these days, as shown in the table. Based on these data, can the owner conclude, at the .10 level, that the mean daily sales of the two bars differ? Assume the population of differences is normally distributed.

| Day | Big Pig | Little Pig | Difference |
|-----|---------|------------|------------|
| 1 | 279 | 350 | -71 |
| 2 | 765 | 796 | -31 |
| 3 | 711 | 754 | -43 |
| 4 | 907 | 753 | 154 |
| 5 | 407 | 400 | 7 |
| 6 | 841 | 830 | 11 |
| 7 | 848 | 879 | -31 |
| 8 | 535 | 360 | 175 |
| 9 | 567 | 375 | 192 |
| 10 | 405 | 257 | 148 |
| 11 | 349 | 358 | -9 |
| 12 | 868 | 864 | 4 |

First, compute the mean and sd of the differences.
$$\bar{x}_D = \frac{1}{n}\sum d_i = 42.17$$
$$s_D^2 = \frac{1}{n - 1}\sum (d_i - \bar{x}_D)^2 \approx 9168.33, \qquad s_D = \sqrt{s_D^2} = \sqrt{9168.33} = 95.75$$

What are the null and alternative hypotheses?
$H_0: \mu_1 - \mu_2 = \mu_D = 0$ vs. $H_1: \mu_1 - \mu_2 = \mu_D \ne 0$
Because the population sd of the difference is not known, a t-statistic is used:
$$t = \frac{\bar{x}_D - 0}{s_D/\sqrt{n}} = \frac{42.17 - 0}{95.75/\sqrt{12}} \approx 1.53$$
Critical value method:
$\alpha = .10$, so since this is two-tailed, 0.05 is in each tail.
$df = n - 1 = 12 - 1 = 11$
$t = 1.53 < 1.796 = t_{.05,\,11}$, so we cannot reject $H_0$. Thus, we can conclude that the sales of the two bars are not significantly different.
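The paired calculation for the two bars can be reproduced directly from the two sales columns in the table. In this sketch the difference scores are computed from the sales figures rather than typed in, the critical value 1.796 is the tabled value from the slide, and the variable names are illustrative.

```python
import math
import statistics

# Daily sales in hundreds of dollars, days 1 through 12, from the table.
big_pig = [279, 765, 711, 907, 407, 841, 848, 535, 567, 405, 349, 868]
little_pig = [350, 796, 754, 753, 400, 830, 879, 360, 375, 257, 358, 864]

# Difference scores d_i = Big Pig - Little Pig for each day.
diffs = [bp - lp for bp, lp in zip(big_pig, little_pig)]

n = len(diffs)                    # number of pairs
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)     # sample sd of the differences

# Paired t statistic for H0: mu_D = 0.
t_obs = d_bar / (s_d / math.sqrt(n))

t_crit = 1.796                    # t_{.05, 11}, from the slide
reject_h0 = abs(t_obs) > t_crit

print(f"d_bar = {d_bar:.2f}, s_D = {s_d:.2f}, t_obs = {t_obs:.2f}")
print("reject H0:", reject_h0)
```

Working from the raw columns avoids transcription errors in the difference column; the paired test then reduces to a one-sample t-test on `diffs`.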

Suppose a statistics tutor wants to assess whether her remedial tutoring has been effective for her five students. She decides to conduct a dependent-samples t-test and records the grades for the students before and after tutoring. Test, with an $\alpha$-level of .05, whether her tutoring was effective or not.
State hypotheses.
Calculate difference scores.
Calculate mean and sd of difference scores.
Calculate t-statistic.
Determine whether it is significant or not.
Give a substantive conclusion.

| Score before | Score after |
|--------------|-------------|
| 2.4 | 3.0 |
| 2.5 | 2.8 |
| 3.0 | 3.5 |
| 2.9 | 3.1 |
| 2.7 | 3.5 |

| Score before | Score after | Diffs (after - before) |
|--------------|-------------|------------------------|
| 2.4 | 3.0 | 0.6 |
| 2.5 | 2.8 | 0.3 |
| 3.0 | 3.5 | 0.5 |
| 2.9 | 3.1 | 0.2 |
| 2.7 | 3.5 | 0.8 |

What are the sample mean and standard deviation of the differences?
$\bar{x}_D = 0.48$, $s_D^2 = 0.057$, $s_D \approx 0.2387$
What are the null and alternative hypotheses?
$H_0: \mu_A - \mu_B = \mu_D = 0$ vs. $H_1: \mu_A - \mu_B = \mu_D > 0$
What is the t-statistic?
$$t = \frac{\bar{x}_D}{s_D/\sqrt{n}} = \frac{0.48}{0.2387/\sqrt{5}} \approx 4.496$$
Do we reject or fail to reject the null hypothesis?
$\alpha = .05$, one-sided, $df = n - 1 = 5 - 1 = 4$
$t = 4.496 > 2.132 = t_{.05,\,4}$, so we can reject $H_0$.
What is the substantive conclusion?
The tutor was successful in significantly increasing student scores.

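Confidence Interval for 𝜇
$z$ or $t$ Distribution?
[Decision flowchart. Branch questions, each answered yes or no: "$\sigma$ known?", "normal?", "$n \ge 30$?". Outcomes: $z$, "$z$ or $t$", $t$, and "unclear".]
One Sample Mean:
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$
Two Sample Means:
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,(n_1 + n_2 - 2)}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$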

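Confidence Interval
One thing to notice is that all of these follow the same general setup. The confidence interval for a (population parameter) is:
$$\text{sample statistic} \pm \text{approp. multiplier} \times \sqrt{\text{variance of sample statistic}}$$
One Sample Mean:
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$
Two Sample Means:
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,(n_1 + n_2 - 2)}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$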
