# STAT E100 Section Week 11 – Hypothesis testing, Paired t-test, Chi-square test for independence.


Course Review

- Project proposals are due Nov. 19th; email your TA.
- Exam 2 is Nov. 26th; practice tests have already been posted.
- Exams are cumulative: about 20% of each future exam will cover old material.
- Email your TA to join the study group!

Key Equations: the 2-proportion z significance test (with pooling) and the 2-proportion z interval.
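The formulas for this slide are not present in the transcript. The standard textbook forms are sketched below on the assumption that they match what the slide showed. The symbols $x_i$ (number of successes), $n_i$ (sample size), $\hat{p}_i = x_i / n_i$ and $z^*$ (critical value) are notation introduced here.

$$
\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}, \qquad
z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
$$

$$
(\hat{p}_1 - \hat{p}_2) \;\pm\; z^* \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}
$$

The pooled proportion $\hat{p}$ is used only in the significance test, where the null hypothesis says the two proportions are equal. The interval uses the unpooled standard error.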

Key Equations: paired t-tests. In SPSS: Analyze → Compare Means → Paired-Samples T Test
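The paired t formula is also missing from the transcript. The standard form is given here as an assumption about what the slide showed, with $\bar{d}$ and $s_d$ denoting the mean and standard deviation of the $n$ within-pair differences (notation introduced here).

$$
t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}, \qquad df = n - 1
$$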

Key Equations: chi-square test of independence, covering how to calculate the expected counts for the contingency table and how to calculate the test statistic. In order for this χ² test to be valid, we need all the expected cell counts to be ≥ 5. df = (#rows – 1) × (#cols – 1).
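The two formulas for this slide did not survive in the transcript. The expected-count rule below is the one used later in Sample Question #3. The test statistic is the standard Pearson form and is assumed to be what the slide showed. $O$ and $E$ denote observed and expected cell counts (notation introduced here).

$$
E = \frac{(\text{row total}) \times (\text{column total})}{n}, \qquad
\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}
$$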

Sample Question #1

In 2008, the Red Sox and Yankees starters' batting averages were:

| Team | C | 1B | 2B | SS | 3B | LF | CF | RF | DH | Mean | St. Dev. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| NYY | 0.236 | 0.294 | 0.271 | 0.300 | 0.282 | 0.298 | 0.259 | 0.281 | 0.247 | 0.272 | 0.0227 |
| Boston | 0.222 | 0.312 | 0.322 | 0.283 | 0.274 | 0.296 | 0.275 | 0.280 | 0.264 | 0.282 | 0.0301 |
| Difference | 0.014 | -0.018 | -0.051 | 0.017 | 0.008 | 0.002 | -0.016 | 0.001 | -0.017 | -0.010 | 0.0225 |

a) Perform a 2-sample t-test for these data. What is your conclusion?

b) Perform a paired t-test for these data. Do your results in parts a) and b) agree? Why or why not?

Sample Question #1, part a)

a) Perform a 2-sample t-test for these data. What is your conclusion?

2-sample t significance test

$H_0: \mu_{\text{BOS}} - \mu_{\text{NYY}} = 0$

$H_a: \mu_{\text{BOS}} - \mu_{\text{NYY}} \neq 0$

Since p > 0.05, we cannot reject the null hypothesis that the Red Sox and Yankees starters have the same mean batting average. We do not have evidence to support the claim that the mean batting averages are statistically significantly different.
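As a rough check on part a), the test statistic can be computed from the nine listed averages per team. This is only a sketch: the unpooled standard error, the conservative df = min(n1, n2) − 1 = 8, and the t-table critical value 2.306 are assumptions made here, and the names `two_sample_t` and `T_CRIT_DF8` are hypothetical. Summary statistics recomputed from the listed values can differ slightly from the Mean and St. Dev. columns of the slide.

```python
import math
from statistics import mean, stdev

# Batting averages by position (C, 1B, 2B, SS, 3B, LF, CF, RF, DH).
nyy = [0.236, 0.294, 0.271, 0.300, 0.282, 0.298, 0.259, 0.281, 0.247]
bos = [0.222, 0.312, 0.322, 0.283, 0.274, 0.296, 0.275, 0.280, 0.264]


def two_sample_t(x, y):
    """Unpooled two-sample t statistic for mean(x) - mean(y)."""
    se = math.sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return (mean(x) - mean(y)) / se


t_stat = two_sample_t(bos, nyy)

# Assumption: conservative df = 8; two-sided 5% critical value from a t table.
T_CRIT_DF8 = 2.306
reject_null = abs(t_stat) > T_CRIT_DF8

print(f"t = {t_stat:.3f}, reject H0 at the 5% level: {reject_null}")
```

A |t| below the critical value corresponds to p > 0.05, in line with the conclusion above.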

Sample Question #1, part b)

b) Perform a paired t-test for these data. Do your results in parts a) and b) agree? Why or why not?

Paired t-test

$H_0: \mu_{\text{Diff}} = 0$

$H_a: \mu_{\text{Diff}} \neq 0$

Since p > 0.05, we cannot reject the null hypothesis. We do not have evidence to support the claim that the batting averages are statistically significantly different.

The two tests agree here, but that is not always the case. A paired t-test also requires matched observations (here, the two teams are matched by position). For example, if n is different for the 2 groups, then a paired t-test cannot be performed in this manner.
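The paired version of the calculation works on the position-by-position differences. The sketch below assumes the differences are NYY minus Boston (matching the sign of the Difference row), uses df = n − 1 = 8, and compares against the t-table critical value 2.306. The variable names are illustrative only.

```python
import math
from statistics import mean, stdev

# Batting averages by position (C, 1B, 2B, SS, 3B, LF, CF, RF, DH).
nyy = [0.236, 0.294, 0.271, 0.300, 0.282, 0.298, 0.259, 0.281, 0.247]
bos = [0.222, 0.312, 0.322, 0.283, 0.274, 0.296, 0.275, 0.280, 0.264]

# Within-pair differences, NYY minus Boston, one per position.
diffs = [a - b for a, b in zip(nyy, bos)]
n = len(diffs)

# Paired t statistic: mean difference over its standard error.
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))

# Two-sided 5% critical value for df = n - 1 = 8, taken from a t table.
T_CRIT_DF8 = 2.306
reject_null = abs(t_stat) > T_CRIT_DF8

print(f"n = {n}, t = {t_stat:.3f}, reject H0 at the 5% level: {reject_null}")
```

Because the test is two-sided, reversing the direction of subtraction changes only the sign of t, not the conclusion.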

Sample Question #2

A study was conducted to determine whether football helmets with newer anti-concussion technology (Riddell's Revolution helmet) actually led to a lower rate of concussions in high school football players compared to standard helmets. In an observational study in western Pennsylvania, 62 of 1173 Revolution helmet wearers suffered a concussion, while 74 of 968 standard helmet wearers suffered a concussion. Does this study provide evidence of a difference in the risk of suffering a concussion between wearing the two types of helmets?

Source: http://journals.lww.com/neurosurgery/Abstract/2006/02000/Examining_Concussion_Rates_and_Return_to_Play_in.9.aspx

Sample Question #2, first solution

There are 2 mathematically equivalent ways of doing this problem, so this is a situation where the answers should agree. Here is the first way: a 2-proportion z significance test, where $p_1$ and $p_2$ are the proportions of players suffering a concussion with each of the two helmet types.

$H_0: p_1 - p_2 = 0$. The proportion of players suffering a concussion is the same for the two types of helmets.

$H_a: p_1 - p_2 \neq 0$. The proportion of players suffering a concussion is not the same for the two types of helmets.

Since p < 0.05, we can reject the null hypothesis. We have sufficient evidence to suggest that there is a difference in the risk of suffering a concussion between wearing the two types of helmets.
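The z statistic and two-sided p-value for the helmet data can be sketched with the pooled standard error. The function name `two_proportion_z` is hypothetical, and treating the Revolution helmet as group 1 is an assumption. The p-value uses the identity 2·P(Z > |z|) = erfc(|z|/√2).

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and two-sided p-value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # 2 * P(Z > |z|)
    return z, p_value


# Group 1: Revolution helmets (62 of 1173); group 2: standard helmets (74 of 968).
z, p_value = two_proportion_z(62, 1173, 74, 968)

print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```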

Sample Question #2, second solution

Here is the second way: a chi-square test for independence.

$H_0$: The risk of suffering a concussion is independent of the helmet type.

$H_a$: The risk of suffering a concussion is not independent of the helmet type.

This χ² statistic has df = (2 – 1) × (2 – 1) = 1.

Since the p-value < 0.05, we reject the null hypothesis. We have enough evidence to suggest that the risk of suffering a concussion is associated with the helmet type.
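To see why the two approaches agree, the sketch below computes the Pearson χ² statistic for the 2×2 helmet table (with no continuity correction, which is an assumption) and compares it with the square of the pooled z statistic. The helper name `chi_square_stat` is hypothetical. For df = 1 the upper-tail probability is erfc(√(χ²/2)).

```python
import math

# Rows: Revolution, standard. Columns: concussion, no concussion.
observed = [[62, 1173 - 62], [74, 968 - 74]]


def chi_square_stat(table):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat


chi2 = chi_square_stat(observed)
p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, df = 1

# Pooled two-proportion z statistic for the same data.
pooled = (62 + 74) / (1173 + 968)
se = math.sqrt(pooled * (1 - pooled) * (1 / 1173 + 1 / 968))
z = (62 / 1173 - 74 / 968) / se

print(f"chi2 = {chi2:.3f}, z^2 = {z ** 2:.3f}, p-value = {p_value:.4f}")
```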

Sample Question #3

Below you will find a contingency table for the breakdown of gender within each class year in this semester's Stat 104 class along with the χ² test output.

| Year \ Gender | F | M | Total |
|---|---|---|---|
| Freshman | 71 | 117 | 188 |
| Junior | 18 | 12 | 30 |
| Senior | 10 | 11 | 21 |
| Sophomore | 59 | 50 | 109 |
| Total | 158 | 190 | 348 |

a) What are the hypotheses for this χ² test in this situation?

b) What is the expected number of female Seniors?

Sample Question #3, parts a) and b)

a) What are the hypotheses for this χ² test in this situation?

$H_0$: The gender breakdown is independent of class year in this semester's Stat 104 class.

$H_a$: The gender breakdown is not independent of class year in this semester's Stat 104 class.

b) What is the expected number of female Seniors?

(Row total × Column total)/n = (21 × 158)/348 ≈ 9.534
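The expected count for every cell follows the same row-total times column-total rule. The sketch below applies it to the whole table, with hypothetical names (`observed`, `expected_counts`), and also checks the usual validity condition that every expected count is at least 5.

```python
# Observed counts per class year as (F, M), taken from the contingency table.
observed = {
    "Freshman": (71, 117),
    "Junior": (18, 12),
    "Senior": (10, 11),
    "Sophomore": (59, 50),
}


def expected_counts(table):
    """Expected count for each cell: row total * column total / n."""
    col_totals = [sum(col) for col in zip(*table.values())]
    n = sum(col_totals)
    return {
        year: tuple(sum(row) * col_total / n for col_total in col_totals)
        for year, row in table.items()
    }


expected = expected_counts(observed)
female_seniors = expected["Senior"][0]
all_at_least_5 = all(e >= 5 for row in expected.values() for e in row)

print(f"Expected female Seniors: {female_seniors:.3f}")
print(f"All expected counts >= 5: {all_at_least_5}")
```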

Sample Question #3 (continued)

c) How many degrees of freedom are in this test?

d) SPSS reports the chi-squared test statistic to be 10.39 for this table. What is the approximate p-value for this test?

e) What is your conclusion?

Sample Question #3, parts c) to e)

c) How many degrees of freedom are in this test?

df = (4 – 1) × (2 – 1) = 3

d) SPSS reports the chi-squared test statistic to be 10.39 for this table. What is the approximate p-value for this test?

0.01 < p < 0.02

e) What is your conclusion?

Since p < 0.05, we reject the null hypothesis. There is evidence to suggest that the gender breakdown is not independent of class year in this semester's Stat 104 class.
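The statistic and p-value range in parts c) and d) can be reproduced from the table. This is a sketch with hypothetical names. The closed-form upper tail used in `chi2_sf_df3` holds only for df = 3, so a statistics library would be the more general choice.

```python
import math

# Rows: Freshman, Junior, Senior, Sophomore. Columns: F, M.
observed = [[71, 117], [18, 12], [10, 11], [59, 50]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square statistic: sum of (O - E)^2 / E over all cells.
chi2 = sum(
    (obs - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i, row in enumerate(observed)
    for j, obs in enumerate(row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_df3(chi2)

print(f"chi2 = {chi2:.2f}, df = {df}, p-value = {p_value:.4f}")
```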
