Class 11 Chi-Squared Test of Independence EMBS Section 11.3
Chi-squared GOF test
One row (column) of Observed Counts
One row (column) of Expected Counts, determined based on H0:
– All categories are equally likely (Roulette Wheel, Soccer birth months)
– Categories have specified p's (M&M colors)
– #girls in 4 is binomial(n=4, p=.5) (Denmark Fams)
– Expected bin counts from NORMAL distribution (Lorex)
Calculated chi-squared, dof, chidist, p-value, reject or not.
Supermarket Survey
A random sample of 160 employees of a national supermarket chain were asked about a proposed wage freeze. There were two categorical variables in the resulting 160-element data set.
– JOB (Stacker, Sales, Admin)
– RESPONSE (favorable, unfavorable, no comment)
The data set
ID    Job      Response
1     Stacker  Unfav
2     Stacker  Unfav
3     Admin    NC
...   ...      ...
158   Sales    NC
159   Sales    Fav
160   Stacker  Fav
To examine the relationship between 2 categorical variables, start with a contingency table (rows: Job, columns: Response).
         Fav  Unfav   NC  Total
Stacker    6     30    4     40
Sales     12     24   20     56
Admin     20     10   34     64
Total     38     64   58    160
Are RESPONSE and JOB independent?
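As a rough illustration of how a contingency table is tallied from a record-per-employee data set, the sketch below counts (Job, Response) pairs. It uses only the six rows visible on the data-set slide, because the other 154 records are elided there, so its counts are not the 160-employee table. The names `records`, `pair_counts`, and `table` are illustrative choices, not from the slides.

```python
from collections import Counter

# Only the six rows shown on the data-set slide (IDs 1, 2, 3, 158, 159, 160).
# The remaining 154 records are elided in the slides and are NOT reproduced here.
records = [
    ("Stacker", "Unfav"),
    ("Stacker", "Unfav"),
    ("Admin", "NC"),
    ("Sales", "NC"),
    ("Sales", "Fav"),
    ("Stacker", "Fav"),
]

jobs = ["Stacker", "Sales", "Admin"]
responses = ["Fav", "Unfav", "NC"]

# Count how often each (job, response) pair occurs; missing pairs count as 0.
pair_counts = Counter(records)
table = [[pair_counts[(job, resp)] for resp in responses] for job in jobs]
col_totals = [sum(col) for col in zip(*table)]

# Print the table with row and column totals.
print(f"{'':8}" + "".join(f"{r:>7}" for r in responses) + f"{'Total':>7}")
for job, row in zip(jobs, table):
    print(f"{job:8}" + "".join(f"{c:>7}" for c in row) + f"{sum(row):>7}")
print(f"{'Total':8}" + "".join(f"{c:>7}" for c in col_totals) + f"{sum(col_totals):>7}")
```

Run on all 160 records, the same tally would produce the full Job by Response table.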
H0: Response and Job are independent
         Fav  Unfav   NC  Total
Stacker                      40
Sales                        56
Admin                        64
Total     38     64   58    160
What are the expected counts given H0? (11.9)
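The formula behind the "(11.9)" reference is not reproduced in the transcript. The standard expected-count formula for a test of independence, which is presumably what it points to, follows from estimating P(row i and column j) as P(row i) × P(column j) and multiplying by the sample size n. The symbols E, i, j, and n are notation introduced here.

```latex
E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{n},
\qquad
E_{\text{Stacker},\,\text{Fav}} = \frac{40 \times 38}{160} = 9.5
```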
Calculate the Expected Counts under H0.
          Fav  Unfav    NC  Total
Stacker   9.5   16.0  14.5     40
Sales    13.3   22.4  20.3     56
Admin    15.2   25.6  23.2     64
Total      38     64    58    160
Expected Counts if independent.
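A minimal sketch of this step, assuming the observed counts are held as a list of rows (the variable names are illustrative): each expected count is the row total times the column total divided by the grand total.

```python
jobs = ["Stacker", "Sales", "Admin"]
responses = ["Fav", "Unfav", "NC"]

# Observed counts from the supermarket contingency table.
observed = [
    [6, 30, 4],    # Stacker
    [12, 24, 20],  # Sales
    [20, 10, 34],  # Admin
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for job, row in zip(jobs, expected):
    print(f"{job:8}" + "".join(f"{e:>7.1f}" for e in row))
```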
We know what to do now with our table of Observed and Expected Counts…
The calculated chi-squared statistic: the sum of the distances.
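Written out, with O for an observed count and E for the matching expected count (the subscripts i and j for row and column are notation introduced here), each cell's "distance" is its squared deviation scaled by the expected count, and the statistic adds these over all cells. For the Stacker/Unfav cell this gives (30 − 16)²/16 = 12.25.

```latex
\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
\frac{(30 - 16)^2}{16} = 12.25
```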
Calculate the table of distances.
          Fav  Unfav    NC  Total
Stacker  1.29  12.25  7.60
Sales    0.13   0.11  0.00
Admin    1.52   9.51  5.03
Total                       37.44
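The distance table can be reproduced with a short sketch like the one below, which recomputes the expected counts from the observed table and then sums (O − E)²/E over the nine cells. Variable names are illustrative.

```python
# Observed counts: rows are Stacker, Sales, Admin; columns are Fav, Unfav, NC.
observed = [
    [6, 30, 4],
    [12, 24, 20],
    [20, 10, 34],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Per-cell distance (O - E)^2 / E, then the chi-squared statistic as their sum.
distances = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
chi_squared = sum(sum(row) for row in distances)

for row in distances:
    print("".join(f"{d:>7.2f}" for d in row))
print(f"chi-squared = {chi_squared:.2f}")
```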
Get the p-value
Calculated chi-squared (sum of the distances) = 37.44
Dof = (#rows-1)(#cols-1) = 2*2 = 4
P-value = CHIDIST(37.44, 4) = 1.46E-07
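Outside Excel, the same upper-tail probability can be sketched with the standard library alone. The helper below is a hypothetical stand-in for CHIDIST, not Excel's implementation: it uses the closed form that exists when the degrees of freedom are even (as they are here), and refuses odd values.

```python
import math

chi_squared = 37.44        # calculated statistic from the slide
n_rows, n_cols = 3, 3      # Job has 3 levels, Response has 3 levels
dof = (n_rows - 1) * (n_cols - 1)


def chi2_upper_tail_even_dof(x, dof):
    """P(X > x) for a chi-squared X; valid only for even, positive dof."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form only covers even, positive dof")
    half = x / 2
    term = 1.0
    total = 1.0
    # Sum (x/2)^k / k! for k = 0 .. dof/2 - 1, then scale by exp(-x/2).
    for k in range(1, dof // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


p_value = chi2_upper_tail_even_dof(chi_squared, dof)
print(f"dof = {dof}, p-value = {p_value:.2E}")
```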
=CHITEST will do the last two steps
=CHITEST(range containing the Os, range containing the Es)
Calculates the chi-squared, compares it to the chidist using the appropriate dof, and reports the p-value.
=CHITEST(for our data) = 1.46E-07
So... you just have to calculate the Es.
CHITEST will also work for the GOF test!!
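For readers working outside Excel, here is a rough Python analogue of what CHITEST is described as doing: take the Os and the Es, compute the statistic, pick the dof from the shape of the range, and return the p-value. The function names `chi2_sf` and `chitest` are hypothetical, and the dof rule (one fewer than the number of cells for a single row or column, otherwise (#rows-1)(#cols-1)) is an assumption meant to mirror Excel's documented behaviour.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability P(X > x) for chi-squared with integer dof >= 1."""
    half = x / 2
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-half)               # exact tail for dof = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))    # exact tail for dof = 1
    # Each step raises dof by 2 using Q(a+1, h) = Q(a, h) + h^a e^-h / Gamma(a+1).
    while a < dof / 2:
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return q


def chitest(observed, expected):
    """p-value from tables of observed and expected counts (lists of rows)."""
    n_rows, n_cols = len(observed), len(observed[0])
    if n_rows > 1 and n_cols > 1:
        dof = (n_rows - 1) * (n_cols - 1)
    else:
        dof = max(n_rows, n_cols) - 1             # single row or column (GOF)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return chi2_sf(stat, dof)


# Supermarket data: Os from the contingency table, Es computed from its totals.
observed = [[6, 30, 4], [12, 24, 20], [20, 10, 34]]
rows = [sum(r) for r in observed]
cols = [sum(c) for c in zip(*observed)]
n = sum(rows)
expected = [[r * c / n for c in cols] for r in rows]

p = chitest(observed, expected)
print(f"p-value = {p:.2E}")
```

For a GOF test, the same function would take a single row of counts, for example `chitest([os], [es])`, and use one fewer degree of freedom than the number of categories.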
Statistically Significant?
May 13, 1999. Web posted at: 11:38 a.m. EDT (1538 GMT)
(CNN) -- Young children who sleep with a light on may have a substantially higher risk of developing nearsightedness as they get older, says a new study in the journal Nature.
The collaborative study of 479 children by researchers at the University of Pennsylvania Medical Center and The Children's Hospital of Philadelphia found 55 percent (of the 100) children who slept with a room light on before age 2 had myopia, or nearsightedness, between ages 2 and 16. Of the (112) children who slept with a night-light before age 2, 34 percent were myopic, while just 10 percent of children who slept in darkness were nearsighted.
1. Create the Contingency Table of Observed Counts
             Myopic   Not  Total
Light            55    45    100
Night Light      38    74    112
Dark             27   240    267
Total           120   359    479
Earlier we would have asked P(Light│Myopic) = 55/120
Now we want to test
H0: Sleep Conditions and Subsequent Eyesight are independent
H0: P(M) is equal for all three sleeping conditions.
Expected Counts if Independent
             Myopic     Not
Light         25.05   74.95
Night Light   28.06   83.94
Dark          66.89  200.11
Distances
             Myopic     Not
Light         35.80   11.97
Night Light    3.52    1.18
Dark          23.79    7.95
Total                 84.21
=chidist(84.21,2) = 5.19E-19
Statistically Significant
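The whole calculation for the myopia table fits in a short sketch. With a 3 × 2 table the dof is (3-1)(2-1) = 2, and for 2 degrees of freedom the chi-squared upper tail is simply exp(-x/2), so no statistics library is needed. Variable names are illustrative.

```python
import math

conditions = ["Light", "Night Light", "Dark"]

# Observed counts: columns are Myopic, Not.
observed = [
    [55, 45],
    [38, 74],
    [27, 240],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under independence, then the summed distances.
expected = [[r * c / n for c in col_totals] for r in row_totals]
chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

dof = (len(observed) - 1) * (len(observed[0]) - 1)
assert dof == 2  # the exp(-x/2) shortcut below is only valid for dof = 2
p_value = math.exp(-chi_squared / 2)

for name, row in zip(conditions, expected):
    print(f"{name:12}" + "".join(f"{e:>8.2f}" for e in row))
print(f"chi-squared = {chi_squared:.2f}, dof = {dof}, p-value = {p_value:.2E}")
```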
Suppose we flip the contingency table?
             Myopic   Not  Total
Light            55    45    100
Night Light      38    74    112
Dark             27   240    267
Total           120   359    479
Calculated chi-squared = 84.21
P-value = 5.19E-19

         Light  Night Light  Dark  Total
Myopic      55           38    27    120
Not         45           74   240    359
Total      100          112   267    479
Calculated chi-squared =
P-value =
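One way to explore the question is to compute the statistic for both orientations and compare. The sketch below does that with a small helper, `chi_squared_stat`, a name chosen here for illustration. Because each cell keeps the same observed count, row total, and column total after the flip, its expected count and distance are unchanged, and the dof (2)(1) = (1)(2) is the same as well.

```python
def chi_squared_stat(table):
    """Sum of (O - E)^2 / E over all cells, with E from the table's own totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (o - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(table, row_totals)
        for o, c in zip(row, col_totals)
    )


# Original orientation: rows are Light, Night Light, Dark; columns Myopic, Not.
original = [[55, 45], [38, 74], [27, 240]]

# Flipped orientation: rows are Myopic, Not; columns Light, Night Light, Dark.
flipped = [list(col) for col in zip(*original)]

stat_original = chi_squared_stat(original)
stat_flipped = chi_squared_stat(flipped)
print(f"original: {stat_original:.2f}")
print(f"flipped:  {stat_flipped:.2f}")
```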
Assignment 12
Use the class data to test the independence of ATHLETE and HS STAT.
Use the Denmark Family data to test independence of "Gender Mix of first 3" and "Have 4?"

ID      Child1  Child2  Child3  Child4  Famsize
4       M       M       M               3
6       F       M       M               3
9       F       M       M               3
12      F       M       M       F       4
14      F       M       F               3
25      F       F       M               3
...     ...     ...     ...     ...     ...
700023  F       F       M       M       4
700025  M       F       F               3
700029  M       M       M               3

ID   HS Stat?  Athlete?
1    Yes       No
2
3
...  ...       ...
67   No
68   No
69   Yes