# Chapter 13: Chi-Square Test

Motivating Example

- Research Question: Among all adults in the U.S. who were in a car accident, is there a relationship between cell phone use and injury severity?
- Sample: 200 randomly selected U.S. adults who were in car accidents
- Results: See Table 1
- Note: This example is entirely fictitious

Table 1: Bivariate Table

| Injuries Sustained | Used Cell? No | Used Cell? Yes | Total |
|---|---|---|---|
| None | 82 (82%) | 66 (66%) | 148 |
| Minor | 12 (12%) | 18 (18%) | 30 |
| Severe | 6 (6%) | 16 (16%) | 22 |
| Total | 100 | 100 | 200 |

Relationship: There is a relationship in the sample; cell phone users are less likely than non-users to sustain no injuries (66% vs. 82%)
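The percentages in Table 1 are column percentages: each count divided by the total for its "Used Cell?" column. A minimal sketch of that arithmetic follows; the names `observed` and `column_percentages` are illustrative and not part of the original slides.

```python
# Observed counts from Table 1, keyed by injury level, then by cell phone use.
observed = {
    "None":   {"No": 82, "Yes": 66},
    "Minor":  {"No": 12, "Yes": 18},
    "Severe": {"No": 6,  "Yes": 16},
}

def column_percentages(table):
    """Return each cell as a percentage of its column total."""
    columns = list(next(iter(table.values())).keys())
    col_totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        level: {c: 100 * row[c] / col_totals[c] for c in columns}
        for level, row in table.items()
    }

percentages = column_percentages(observed)
for level, row in percentages.items():
    # Print each injury level with its percentage among non-users and users.
    print(level, {c: f"{p:.0f}%" for c, p in row.items()})
```

Comparing the "None" row across the two columns (82% vs. 66%) is what suggests a relationship in the sample.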

Table 2: “No Association” Table

| Injuries Sustained | Used Cell? No | Used Cell? Yes | Total |
|---|---|---|---|
| None | 74 (74%) | 74 (74%) | 148 (74%) |
| Minor | 15 (15%) | 15 (15%) | 30 (15%) |
| Severe | 11 (11%) | 11 (11%) | 22 (11%) |
| Total | 100 | 100 | 200 |

No Relationship: Cell phone users are just as likely as non-users to sustain no injuries (74%)

Relationship in Sample Vs. Population

- Sample: We found a relationship in the sample of 200 accident victims
- Population: We want to know whether there is a relationship in the population (all adults in the U.S. who were in car accidents)
- We can use hypothesis testing procedures
- The chi-square test is used to test hypotheses involving bivariate tables

Chi-Square (χ2) Test Procedure

1. State the null and research hypotheses
2. Compute a χ2 statistic
3. Determine the degrees of freedom
4. Find the p-value for the χ2 statistic
5. Decide whether there is evidence to reject the null hypothesis
6. Interpret the results

χ2 Test Assumptions

- Assumption 1: The sample is selected at random from a population
- Assumption 2: The variables are nominal or ordinal
- Note: In this class, you won’t have to determine whether the assumptions have been met

χ2 Test: Hypotheses

- Null Hypothesis (H0): The two variables are not related in the population
- Research Hypothesis (H1): The two variables are related in the population
- Alpha (α): This will usually be given to you in the problem; when it’s not given, assume α = 0.05

χ2 Test: Hypotheses (Cell Phone – Injuries Example)

- Null Hypothesis (H0): Cell phone use and injury severity are not related among all adults in the U.S. who were in a car accident
- Research Hypothesis (H1): Cell phone use and injury severity are related among all adults in the U.S. who were in a car accident
- Alpha (α): Use α = 0.05

χ2 Test: Calculating the χ2 Statistic

Formula:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

The sum runs over every cell in the table.

Two Components:

- Observed Frequencies (fo)
- Expected Frequencies (fe)

Calculating the χ2 Statistic: fo and fe

Observed Frequencies (fo)

- Definition: The actual frequencies in the sample
- Example: In the cell phone – injuries example, these are given in Table 1

Expected Frequencies (fe)

- Definition: The frequencies we would expect if the two variables were independent (in other words, if the null hypothesis were true)
- Example: In the cell phone – injuries example, these are given in Table 2

Calculating the χ2 Statistic: Logic Behind the Formula

- We are comparing the observed and expected frequencies
- That is, we are comparing the results in our sample with what we would expect if the two variables were independent (i.e., assuming H0 is true)
- We are doing this because we are “testing the null hypothesis (H0)”, which assumes that the two variables are independent in the population

Calculating the χ2 Statistic: Size of Difference

Small Difference

- If the differences between the observed and expected frequencies are small, the χ2 statistic will be small
- As a result, we will likely fail to reject H0

Large Difference

- If the differences between the observed and expected frequencies are large, the χ2 statistic will be large
- As a result, we will likely reject H0

What is Small or Large?

- We will use Appendix D to decide what is small or large
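To make "small" and "large" concrete, here is a minimal sketch that computes the statistic for two cases: observed frequencies identical to the expected ones, and the observed frequencies from Table 1. It assumes the usual Pearson form of the statistic (the sum of (fo − fe)² / fe over all cells); the function name `chi_square` is illustrative.

```python
def chi_square(observed, expected):
    """Sum (fo - fe)^2 / fe over all cells of two same-shaped tables."""
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )

# Rows: None, Minor, Severe; columns: No, Yes.
table_1 = [[82, 66], [12, 18], [6, 16]]   # observed frequencies (Table 1)
table_2 = [[74, 74], [15, 15], [11, 11]]  # expected frequencies (Table 2)

no_difference = chi_square(table_2, table_2)  # observed equals expected
sample_result = chi_square(table_1, table_2)  # observed differs from expected

print(no_difference)
print(round(sample_result, 2))
```

When the observed table matches the expected table exactly, the statistic is 0; for the sample in Table 1 it comes out at roughly 7.5. Whether that is large enough to reject H0 depends on the degrees of freedom and α.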

Calculating the χ2 Statistic: Computing fe

Procedure: For each cell, multiply the corresponding column marginal and row marginal, then divide by the sample size (N):

$$f_e = \frac{(\text{column marginal}) \times (\text{row marginal})}{N}$$

Huh?!?!? Let’s do this for the cell phone – injuries example (next several slides)

Calculating the χ2 Statistic: Computing fe

| Injuries Sustained | Used Cell? No | Used Cell? Yes | Total |
|---|---|---|---|
| None | | | 148 |
| Minor | | | 30 |
| Severe | | | 22 |
| Total | 100 | 100 | 200 |

Begin with a table containing only the row and column totals. For each cell, multiply the corresponding row and column total, then divide by the total sample size (200 here). For example, for the None / No cell: (148 × 100) / 200 = 74.

Calculating the χ2 Statistic: Computing fe

| Injuries Sustained | Used Cell? No | Used Cell? Yes | Total |
|---|---|---|---|
| None | 74 | 74 | 148 |
| Minor | 15 | 15 | 30 |
| Severe | 11 | 11 | 22 |
| Total | 100 | 100 | 200 |

This is the complete table of expected frequencies (fe)
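The expected frequencies above can be reproduced from the marginals alone. The following is a minimal sketch of the "row total times column total, divided by sample size" rule; the names `row_totals`, `col_totals`, and `expected_frequencies` are illustrative.

```python
# Marginals from the cell phone and injuries example.
row_totals = {"None": 148, "Minor": 30, "Severe": 22}
col_totals = {"No": 100, "Yes": 100}
n = 200  # total sample size

def expected_frequencies(row_totals, col_totals, n):
    """Expected count for each cell: (row total * column total) / n."""
    return {
        row: {col: r_total * c_total / n for col, c_total in col_totals.items()}
        for row, r_total in row_totals.items()
    }

expected = expected_frequencies(row_totals, col_totals, n)
for row, cells in expected.items():
    print(row, cells)
```

Because both column totals are 100 here, the expected counts are the same in the "No" and "Yes" columns, which is exactly the "no association" pattern.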

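Calculating the χ2 Statistic

Observed Frequencies (fo)

| Injuries Sustained | Used Cell? No | Used Cell? Yes | Total |
|---|---|---|---|
| None | 82 | 66 | 148 |
| Minor | 12 | 18 | 30 |
| Severe | 6 | 16 | 22 |
| Total | 100 | 100 | 200 |

Expected Frequencies (fe)

| Injuries Sustained | Used Cell? No | Used Cell? Yes | Total |
|---|---|---|---|
| None | 74 | 74 | 148 |
| Minor | 15 | 15 | 30 |
| Severe | 11 | 11 | 22 |
| Total | 100 | 100 | 200 |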
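As a worked illustration, plugging the observed and expected frequencies into the χ2 formula (the sum of (fo − fe)² / fe over all six cells) gives the following. This assumes each term is rounded to two decimals before summing, which is what reproduces the value of 7.46 used later in the example.

$$
\chi^2 = \frac{(82-74)^2}{74} + \frac{(66-74)^2}{74} + \frac{(12-15)^2}{15} + \frac{(18-15)^2}{15} + \frac{(6-11)^2}{11} + \frac{(16-11)^2}{11}
$$

$$
\chi^2 \approx 0.86 + 0.86 + 0.60 + 0.60 + 2.27 + 2.27 = 7.46
$$

Without the intermediate rounding the sum is about 7.48. The difference does not change which table values the statistic falls between.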

χ2 Test: Degrees of Freedom (df)

Formula:

$$df = (r - 1)(c - 1)$$

- r = number of rows
- c = number of columns

Interpretation: The number of cells in the table that need to have numbers before we can fill in the remaining cells

Cell Phone – Injury Example: r = 3 and c = 2, so df = (3 − 1)(2 − 1) = 2
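The interpretation can be checked directly. The sketch below computes df for the 3 × 2 example and then shows that, once the marginals and two cells are known, every other cell is determined. It assumes the free cells are the ones in the first r − 1 rows and first c − 1 columns; the function names are illustrative.

```python
def degrees_of_freedom(r, c):
    """df for a table with r rows and c columns."""
    return (r - 1) * (c - 1)

def fill_table(row_totals, col_totals, free_cells):
    """Fill a table from its marginals and the (r-1) x (c-1) top-left cells."""
    rows, cols = len(row_totals), len(col_totals)
    table = [[None] * cols for _ in range(rows)]
    for (i, j), value in free_cells.items():
        table[i][j] = value
    # Last row of each free column: column total minus the cells above it.
    for j in range(cols - 1):
        table[rows - 1][j] = col_totals[j] - sum(table[i][j] for i in range(rows - 1))
    # Last column of every row: row total minus the cells to its left.
    for i in range(rows):
        table[i][cols - 1] = row_totals[i] - sum(table[i][j] for j in range(cols - 1))
    return table

df = degrees_of_freedom(3, 2)

# Two known cells from Table 1: None/No = 82 and Minor/No = 12.
filled = fill_table([148, 30, 22], [100, 100], {(0, 0): 82, (1, 0): 12})
print(df)
print(filled)
```

With only those two cells supplied, the remaining four cells of Table 1 follow from subtraction, which is why df = 2.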

χ2 Test: Determining the P-Value

χ2 Distribution

- The p-value will be based on the χ2 distribution
- The χ2 distribution is positively skewed
  - This means that our hypothesis tests will always be one-tailed
- Values of the χ2 statistic are never negative
  - Minimum = 0 (the observed frequencies match the expected frequencies exactly)
  - Maximum = ∞
- The shape of the χ2 distribution is dictated by its df
  - See figure on next slide

χ2 Test: Determining the P-Value

[Figure: shape of the χ2 distribution for different degrees of freedom]

χ2 Test: Determining the P-Value

Steps

1. Find df in the first column of Appendix D
2. Read across the row until you find the χ2 value you computed
3. Read up to the first row to find the p-value

Cell Phone – Injury Example

- χ2 = 7.46, df = 2
- Reading across the row where df = 2, a value of 7.46 is between 5.991 and 7.824
- Reading up to the top row, the p-value is between 0.05 and 0.02
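The table lookup can be mimicked in code. The sketch below is limited to the df = 2 row, where the table entry for a right-tail probability p has the closed form −2 ln(p). The list of column headings is an assumption about how Appendix D is laid out (the appendix itself is not reproduced in these slides), and the function names are illustrative.

```python
import math

# Assumed column headings of Appendix D (right-tail probabilities).
P_COLUMNS = [0.99, 0.98, 0.95, 0.90, 0.80, 0.70, 0.50,
             0.30, 0.20, 0.10, 0.05, 0.02, 0.01, 0.001]

def critical_value_df2(p):
    """Chi-square value with right-tail area p when df = 2."""
    return -2 * math.log(p)

def bracket_p_value_df2(chi2):
    """Return (lower, upper) bounds on the p-value, as read from the df = 2 row."""
    upper = 1.0
    for p in P_COLUMNS:  # table entries grow as p shrinks
        if chi2 < critical_value_df2(p):
            return (p, upper)
        upper = p
    return (0.0, upper)

print(round(critical_value_df2(0.05), 3), round(critical_value_df2(0.02), 3))
print(bracket_p_value_df2(7.46))
```

For χ2 = 7.46 the function returns the same bracket as the manual lookup: the p-value lies between 0.02 and 0.05.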

χ2 Test: Determining the P-Value

Additional practice finding p-values

- χ2 = 0.446, df = 2: P-value = 0.80
- χ2 = 4.09, df = 1: P-value is between 0.02 and 0.05
- χ2 = 0.01, df = 2: P-value is greater than 0.99
- χ2 = 15.00, df = 4: P-value is between 0.001 and 0.01
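The practice answers can be checked against exact p-values. The following sketch uses only Python's standard library and the closed-form right-tail probability of the χ2 distribution for whole-number df; it is an illustration rather than the method the slides use (which is the printed table), and the function name `chi2_p_value` is hypothetical.

```python
import math

def chi2_p_value(chi2, df):
    """Right-tail probability of a chi-square distribution with integer df >= 1."""
    if chi2 <= 0:
        return 1.0
    half = chi2 / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k = 0 .. df/2 - 1 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus exp(-x/2) times
    # the sum over k = 1 .. (df-1)/2 of (x/2)^(k - 1/2) / Gamma(k + 1/2)
    total = sum(
        half ** (k - 0.5) / math.gamma(k + 0.5)
        for k in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

for chi2, df in [(0.446, 2), (4.09, 1), (0.01, 2), (15.00, 4)]:
    print(chi2, df, round(chi2_p_value(chi2, df), 4))
```

Each exact value falls inside the range read from the table, for example a little over 0.04 for χ2 = 4.09 with df = 1.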

χ2 Test: Evidence to Reject H0?

Decision Rule

- If the p-value is less than α, we have evidence to reject H0 in favor of H1
- If the p-value is greater than α, we do not have evidence to reject H0 in favor of H1

Cell Phone – Injury Example

- The p-value (which is between 0.02 and 0.05) is less than α = 0.05
- We have evidence to reject H0 in favor of H1
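The decision rule reduces to a single comparison. In this minimal sketch the p-value for the example is computed from the df = 2 closed form exp(−χ2 / 2) rather than read from the table, and a p-value exactly equal to α is treated as "fail to reject", a case the slides do not address. The function name `decide` is illustrative.

```python
import math

def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when the p-value is below alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

# Cell phone and injuries example: chi-square = 7.46 with df = 2.
p_value = math.exp(-7.46 / 2)  # right-tail area for df = 2
decision = decide(p_value)

print(round(p_value, 3), decision)
```

The computed p-value is about 0.024, inside the 0.02 to 0.05 range from the table, so the rule rejects H0 at α = 0.05.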

χ2 Test: Interpretation

- If We Reject H0: We have evidence to suggest that the two variables are related in the population
- If We Do Not Reject H0: We do not have evidence to suggest that the two variables are related in the population
- Cell Phone – Injury Example: We have evidence that cell phone use and injury severity are related among all adults in the U.S. who were in a car accident