Statistical Analysis. Professor Lynne Stokes, Department of Statistical Science. Lecture #1: Chi-square Contingency Table Test.


1 Statistical Analysis. Professor Lynne Stokes, Department of Statistical Science. Lecture #1: Chi-square Contingency Table Test

2 Independence: Employment status is independent of age. Note: one population, responses formed by two categorizations.

3 Homogeneity: If nondiscriminatory, promotions are binomially distributed with a common $\pi$ for both gender categories. Note: two populations, common distribution of responses.

4 Cognitive Learning in Rats (Tolman, Ritchie, Kalish, 1946). Prior theory: discrete learning steps (Hull). Candidate theory: cognitive learning (Tolman). [Figure: maze with a goal, a barrier, and paths A, B, C, D]

5 Goodness of Fit. [Table: number of rats by path chosen (A, B, C, D); counts 4, 5, 8, 15; total 32] Evidence of cognitive learning? If selection is random, the counts are multinomial with $\pi_j = 1/4$.
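The goodness-of-fit calculation for this slide can be sketched as follows. This is a minimal illustration, assuming the flattened table reads as counts of 4, 5, 8, and 15 rats with a total of 32. The variable names are hypothetical. Because every path has the same expected count under random selection, the statistic does not depend on which count belongs to which path.

```python
# Pearson goodness-of-fit statistic for the four-path rat counts.
# Assumption: the counts are read from the slide as 4, 5, 8, 15 (total 32).
observed = [4, 5, 8, 15]
null_probs = [0.25] * 4            # random selection: probability 1/4 per path

n = sum(observed)
expected = [n * p for p in null_probs]   # 8 rats per path under the null
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1             # no parameters are estimated

print(f"X^2 = {x2:.2f}, df = {df}")
```

The standard table value for a chi-square critical point at a 0.05 level with 3 degrees of freedom is 7.815, so a statistic above that would count as evidence against random selection.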

6 Compare Incidence of Death Penalty. Are victim's race and sentence independent? Is aggravation level an explanatory factor? Aggravation level ranges from less serious (drunk, lover's quarrel, argument, etc.) to more serious (vicious, cold-blooded, unprovoked, murder, etc.).

7 Chi-Square Tests for Count Data. Independence: the distribution of responses across one categorization is identical for each category of a second categorization. Homogeneity: the distribution of responses is identical across several categories of one categorical variable or across several independent samples. Goodness of fit: responses are consistent with a stated probability distribution (parameters specified, or unknown parameter values).

8 Sampling Schemes

9 Chi-square Tests 1. Tests for independence in contingency tables

10 Contingency Tables (Crosstabs). Two categorizations (rows and columns), each with mutually exclusive categories. Sample of n independent observations. Are the two categorizations statistically independent? E.g., is employment status statistically independent of age? Note: equivalent to the homogeneity test with unspecified p when there are only 2 rows.

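11 Notation for Observed Frequencies. Rows are the row categories $i = 1, \dots, r$; columns are the column categories $j = 1, \dots, c$. $O_{ij}$ is the observed frequency in row $i$ and column $j$, $R_i$ is the row $i$ total, $C_j$ is the column $j$ total, and $n$ is the overall total.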

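12 Chi-square Test for Independence. $H_0$: row and column categories are independent. $H_a$: row and column categories are not independent. If row and column categories are independent, the expected frequency in cell $(i, j)$ is estimated by $E_{ij} = R_i C_j / n$, and the test statistic is $X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} (O_{ij} - E_{ij})^2 / E_{ij}$. Reject $H_0$ if $X^2 > X^2_\alpha$, where $X^2_\alpha$ is the chi-square critical value with $df = (r - 1)(c - 1)$.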

13 Degrees of Freedom for Contingency Tables. Given the row and column totals, $df = (r - 1)(c - 1)$. Row 1: $df = c - 1$; row 2: $df = c - 1$; ...; row $r - 1$: $df = c - 1$. Row $r$: no additional degrees of freedom, because the estimated expected frequencies in column $j$ sum to $C_j$.

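14 Chi-square Contingency Table Test Summary. Reject $H_0$ if $X^2 > X^2_\alpha$, where $X^2_\alpha$ is the chi-square critical value with $df = (r - 1)(c - 1)$. Notational convention: the expected frequencies are written $E_{ij}$ even though they are estimated.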
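The contingency table test summarized above can be sketched in a few lines. This is a minimal illustration, not the lecture's own code: the function name `chi_square_independence` and the 2 x 3 counts are hypothetical. Expected frequencies are estimated as (row total x column total) / n.

```python
def chi_square_independence(table):
    """Return (X^2, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Estimated expected frequency for cell (i, j): R_i * C_j / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    x2 = sum((o - e) ** 2 / e
             for obs_row, exp_row in zip(table, expected)
             for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, df, expected

# Hypothetical 2 x 3 counts, for illustration only (not data from the lecture).
observed = [[20, 30, 50],
            [30, 30, 40]]
x2, df, expected = chi_square_independence(observed)
print(f"X^2 = {x2:.3f}, df = {df}")
```

The expected frequencies reproduce the observed row and column totals. That constraint is why only (r - 1)(c - 1) cells are free to vary.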

15 Employment Discrimination. [Tables: observed frequencies, expected frequencies, chi-square calculation]

16 Employment Discrimination. [Table: age (yrs) by employment status] Are age and employment status related?

17 Employment Discrimination. $H_0$: employment status and age are independent. $H_a$: employment status and age are not independent. Reject $H_0$ if $X^2 > 6.635$ ($\alpha = 0.01$, $df = 1$). $X^2 = 138.67$. Conclusion: there is sufficient evidence ($p < 0.001$), using a significance level of 0.01, to conclude that employment status and age are not statistically independent. Reason: more older employees were terminated than expected under the hypothesis of independence.
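The reported p-value bound can be checked directly from the statistic on this slide. This is a small sketch using only the values quoted above: with 1 degree of freedom, a chi-square variable is the square of a standard normal, so its upper-tail probability can be written with the complementary error function.

```python
import math

x2 = 138.67        # test statistic reported on the slide
critical = 6.635   # critical value quoted on the slide (alpha = 0.01, df = 1)

# For df = 1: P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(x2 / 2))

print(x2 > critical, p_value < 0.001)
```

Both comparisons come out true, which agrees with the slide's decision to reject the hypothesis of independence.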

18 Drug Usage. [Table: campus group by frequency of drug use]

19 Drug Usage. [Tables: observed frequencies, expected frequencies, chi-square calculation]

20 Drug Usage. $H_0$: drug usage and campus group are independent. $H_a$: drug usage and campus group are not independent. Reject $H_0$ if $X^2 > 5.991$ ($\alpha = 0.05$, $df = 2$). $X^2 = 6.87$. Conclusion: using a significance level of 0.05, there is sufficient evidence ($0.025 < p < 0.05$) to conclude that drug usage and campus group are not statistically independent. Reason: more athletes and fewer members of campus organizations reported monthly drug use than expected under the hypothesis of independence.
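The p-value bracket quoted on this slide can be reproduced from the reported statistic alone. This is a brief sketch: with 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-x / 2), so no statistical library is needed.

```python
import math

x2 = 6.87          # test statistic reported on the slide
critical = 5.991   # critical value quoted on the slide (alpha = 0.05, df = 2)

# For df = 2 the chi-square distribution is exponential with mean 2,
# so P(chi-square > x) = exp(-x / 2).
p_value = math.exp(-x2 / 2)

print(f"p = {p_value:.4f}")
```

The result falls between 0.025 and 0.05, matching the range given in the conclusion.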


22 Chi-square Tests 1. Tests for independence in contingency tables 2. Tests for homogeneity

23 Binomial Samples (Product Binomial Sampling). Hypothesis #1: Is $\pi_w = 0.5$? Binomial inference on $\pi$; equivalently, overall goodness of fit (known $\pi$). Hypothesis #2: Are all the $\pi_w$ equal? Test for homogeneity (equal but unknown $\pi$). Hypothesis #3: Is each $\pi_w = 0.5$? Goodness of fit (8 samples, known $\pi$). Genetic theory: $H_0$: $\pi_w = 0.5$ vs. $H_a$: $\pi_w \ne 0.5$. Assumptions: 8 samples, mutually independent counts.

24 Test of Homogeneity of k Binomial Samples, Specified $\pi$. $H_0$: $\pi_1 = \pi_2 = \dots = \pi_8 = 0.5$ vs. $H_a$: $\pi_j \ne 0.5$ for some $j$. $X^2 = 22.96$, $df = 8$, $p = 0.003$. Does not assume homogeneity (see below).
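A test of this form can be sketched as below, writing the common success probability as pi. The sample counts are hypothetical because the lecture's data did not survive in the transcript, and the helper names `chi2_sf` and `specified_pi_test` are illustrative. Since pi is specified rather than estimated, each of the k samples contributes one degree of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square_df > x) for integer df >= 1."""
    z = x / 2.0
    # Start from the closed form for df = 2 (even) or df = 1 (odd),
    # then raise df in steps of 2 with the incomplete-gamma recurrence.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def specified_pi_test(successes, sizes, pi0=0.5):
    """X^2, df, and p-value for H0: every sample has success probability pi0."""
    x2 = sum((y - n * pi0) ** 2 / (n * pi0 * (1 - pi0))
             for y, n in zip(successes, sizes))
    df = len(sizes)                 # nothing is estimated, so df = k
    return x2, df, chi2_sf(x2, df)

# Hypothetical counts for 8 samples of 100 (not the lecture's data).
successes = [55, 48, 60, 52, 45, 58, 50, 41]
sizes = [100] * 8
x2, df, p_value = specified_pi_test(successes, sizes)
print(f"X^2 = {x2:.2f}, df = {df}, p = {p_value:.3f}")

# Check of the slide's reported result: X^2 = 22.96 with df = 8.
print(f"slide p = {chi2_sf(22.96, 8):.3f}")
```

Each term combines the success and failure cells of one sample, which is why the denominator is n * pi0 * (1 - pi0) rather than n * pi0.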

25 Test of Homogeneity of k Binomial Samples: Unspecified $\pi$. $H_0$: $\pi_1 = \pi_2 = \dots = \pi_8$ vs. $H_a$: $\pi_j \ne \pi_k$ for some $(j, k)$.

26 Test of Homogeneity of k Binomial Samples: Unspecified $\pi$. $H_0$: $\pi_1 = \pi_2 = \dots = \pi_8$ vs. $H_a$: $\pi_j \ne \pi_k$ for some $(j, k)$. $X^2 = 20.43$, $df = 7$, $p = 0.005$. Note: only one of each pair of expected values is independently estimated (k = 8, not 16).
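When the common probability is not specified, it is replaced by a pooled estimate, which costs one degree of freedom. The sketch below uses hypothetical counts and illustrative function names. It also checks that the binomial form of the statistic agrees with the Pearson sum over a 2 x k table of successes and failures, which is the sense in which only one cell of each pair carries independent information.

```python
def pooled_homogeneity_test(successes, sizes):
    """X^2 and df for H0: pi_1 = ... = pi_k, with the common pi unspecified."""
    pi_hat = sum(successes) / sum(sizes)      # pooled estimate of the common pi
    x2 = sum((y - n * pi_hat) ** 2 / (n * pi_hat * (1 - pi_hat))
             for y, n in zip(successes, sizes))
    return x2, len(sizes) - 1                 # one df lost to estimating pi

def two_by_k_pearson(successes, sizes):
    """The same test written as a 2 x k contingency table."""
    failures = [n - y for y, n in zip(successes, sizes)]
    total = sum(sizes)
    x2 = 0.0
    for row in (successes, failures):
        row_total = sum(row)
        for o, col_total in zip(row, sizes):
            e = row_total * col_total / total  # estimated expected frequency
            x2 += (o - e) ** 2 / e
    return x2, (2 - 1) * (len(sizes) - 1)

# Hypothetical counts for 8 samples of 100 (not the lecture's data).
successes = [55, 48, 60, 52, 45, 58, 50, 41]
sizes = [100] * 8
x2_binomial, df_binomial = pooled_homogeneity_test(successes, sizes)
x2_table, df_table = two_by_k_pearson(successes, sizes)
print(f"{x2_binomial:.4f} {x2_table:.4f} df = {df_binomial}")
```

Both forms give the same statistic with k - 1 = 7 degrees of freedom for 8 samples.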

