# Analysis of Categorical Data: Test of Independence

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc.


1 Analysis of Categorical Data: Test of Independence

2 $\chi^2$ Test for Independence. The $\chi^2$ test statistic and procedures can also be used to investigate the association between two categorical variables in a single population. Hypotheses: $H_0$: The two variables are independent. $H_a$: The two variables are not independent.

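3 $\chi^2$ Test for Independence. The expected cell counts are estimated from the sample data (assuming that $H_0$ is true) using the formula

$$\text{expected cell count} = \frac{(\text{row marginal total})(\text{column marginal total})}{\text{grand total}}$$

Test statistic:

$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{observed cell count} - \text{expected cell count})^2}{\text{expected cell count}}$$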
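As a rough illustration of this slide, the sketch below computes expected cell counts as (row total × column total) / grand total and then sums (observed - expected)² / expected over all cells. The function names `expected_counts` and `chi_square_statistic` are hypothetical helpers introduced here, not part of the presentation; the observed counts are the ones shown in the Minitab output on the last slide.

```python
def expected_counts(observed):
    """Expected cell counts under H0: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Observed counts from the Minitab output on the last slide
# (columns: Contacts or Glasses, None).
observed = [[14, 11], [27, 27]]
expected = expected_counts(observed)
statistic = chi_square_statistic(observed)

print([[round(e, 2) for e in row] for row in expected])
print(round(statistic, 3))
```

Under these assumptions the rounded expected counts and the statistic should agree with the values reported in the Minitab output (12.97, 12.03, 28.03, 25.97 and 0.246).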

4 $\chi^2$ Test for Independence. When $H_0$ is true, $\chi^2$ has approximately a chi-square distribution with df = (number of rows - 1)(number of columns - 1). P-value: The P-value associated with the computed test statistic value is the area to the right of $\chi^2$ under the chi-square curve with the appropriate df.
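The df rule and the right-tail area can be sketched in plain Python. This is only one possible way to get the area: it uses the closed form for the upper tail of a chi-square distribution with whole-number df (an `erfc` term for odd df, an exponential term for even df, plus a short recurrence). The names `degrees_of_freedom` and `chi_square_p_value` are hypothetical; in practice a statistics package or a chi-square table would normally be used.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """df = (number of rows - 1)(number of columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def chi_square_p_value(statistic, df):
    """Area to the right of `statistic` under the chi-square curve with `df`.

    Uses Q(a + 1, t) = Q(a, t) + t^a * exp(-t) / Gamma(a + 1) with t = x / 2,
    starting from Q(1/2, t) = erfc(sqrt(t)) or Q(1, t) = exp(-t).
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    t = statistic / 2.0
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-t)
    else:
        a, tail = 0.5, math.erfc(math.sqrt(t))
    while a < df / 2.0:
        tail += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return tail


df = degrees_of_freedom(2, 2)            # a 2 x 2 table
p_value = chi_square_p_value(0.246, df)  # statistic from the example
print(df, round(p_value, 3))
```

For the 2 × 2 example later in the presentation (statistic 0.246), this should give df = 1 and a P-value of about 0.620, matching the Minitab output.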

5 $\chi^2$ Test for Independence. Assumptions:

1. The observed counts are from a random sample.
2. The sample size is large: all expected counts are at least 5. If some expected counts are less than 5, rows or columns of the table may be combined to achieve a table with satisfactory expected counts.
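The large-sample check and the column-combining remedy might look like the following sketch. The counts in `hypothetical` are invented purely for illustration and are not the presentation's data; the helper names are likewise assumptions.

```python
def expected_counts(observed):
    """Expected cell counts under H0: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def all_expected_at_least(observed, minimum=5):
    """True when every expected cell count is at least `minimum`."""
    return all(e >= minimum for row in expected_counts(observed) for e in row)


def combine_columns(observed, i, j):
    """Return a new table with column j merged into column i."""
    return [
        [row[i] + row[j] if k == i else count
         for k, count in enumerate(row) if k != j]
        for row in observed
    ]


# Hypothetical counts (NOT the presentation's data), with three columns.
hypothetical = [[3, 9, 8], [6, 20, 14]]
print(all_expected_at_least(hypothetical))   # first column is too sparse

combined = combine_columns(hypothetical, 0, 1)
print(combined, all_expected_at_least(combined))
```

Combining categories changes the question being asked (here, two categories are treated as one), so the merged categories should make sense together.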

6 Example. Consider the two categorical variables, gender and principal form of vision correction, for the sample of students used earlier in this presentation. We shall now test whether gender and the principal form of vision correction are independent.

7 Example. Hypotheses: $H_0$: Gender and principal method of vision correction are independent. $H_a$: Gender and principal method of vision correction are not independent. Significance level: We have not chosen one, so we shall look at the practical significance level. Test statistic:

$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{observed cell count} - \text{expected cell count})^2}{\text{expected cell count}}$$

8 Example. Assumptions: We are assuming that the sample of students was randomly chosen. All expected cell counts are at least 5, and samples were chosen independently, so the $\chi^2$ test is appropriate.

9 Example. Assumptions: Notice that the expected count is less than 5 in the cell corresponding to Female and Contacts, so we should combine the columns for Contacts and Glasses to get a table with two columns: Contacts or Glasses, and None.

10 Example. Calculations: The computed test statistic is $\chi^2 = 0.246$. The contingency table for this example has 2 rows and 2 columns, so the appropriate df is (2 - 1)(2 - 1) = 1. Since 0.246 < 2.70, the P-value is substantially greater than 0.10. $H_0$ would not be rejected for any reasonable significance level. There is not sufficient evidence to conclude that gender and vision correction are related. (I.e., for all practical purposes, one would find it reasonable to assume that gender and need for vision correction are independent.)

11 Example. Minitab would provide the following output if the frequency table was input as shown.

Chi-Square Test: Contacts or Glasses, None

Expected counts are printed below observed counts (shown here in parentheses).

| Row | Contacts or Glasses | None | Total |
|---|---|---|---|
| 1 | 14 (12.97) | 11 (12.03) | 25 |
| 2 | 27 (28.03) | 27 (25.97) | 54 |
| Total | 41 | 38 | 79 |

Chi-Sq = 0.081 + 0.087 + 0.038 + 0.040 = 0.246

DF = 1, P-Value = 0.620