Class 11 Chi-Squared Test of Independence EMBS Section 11.3
Chi-squared GOF test: One row (column) of Observed Counts. One row (column) of Expected Counts, determined based on H0: – All categories are equally likely (Roulette Wheel, Soccer birth months) – Categories have specified p’s (M&M colors) – #girls in 4 is binomial(n=4, p=.5) (Denmark Fams) – Expected bin counts from a NORMAL distribution (Lorex). Then: calculate chi-squared, dof, CHIDIST, p-value, reject or not.
Supermarket Survey: A random sample of 160 employees of a national supermarket chain was asked about a proposed wage freeze. There were two categorical variables in the resulting 160-element data set. – JOB (Stacker, Sales, Admin) – RESPONSE (favorable, unfavorable, no comment)
The data set
ID    Job       Response
1     Stacker   Unfav
2     Stacker   Unfav
3     Admin     NC
      Sales     NC
159   Sales     Fav
160   Stacker   Fav
To examine the relationship between 2 categorical variables, start with a contingency table. Rows (Job): Stacker, Sales, Admin, Total. Columns (Response): Fav, Unfav, NC, Total. Are RESPONSE and JOB independent?
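H0: Response and Job are independent. Rows (Job) with row totals: Stacker 40, Sales 56, Admin 64. Columns (Response): Fav, Unfav, NC, Total. What are the expected counts given H0? EMBS equation (11.9): expected count for a cell = (row total)(column total) / (sample size).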
Calculate the Expected Counts under H0. Rows (Job): Stacker, Sales, Admin, Total. Columns (Response): Fav, Unfav, NC, Total. These are the Expected Counts if Job and Response are independent.
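The expected-count rule can be sketched in a few lines of Python. This is only an illustration: the row totals (40, 56, 64) come from the slide, while the function name `expected_counts` and its dictionary layout are assumptions made here. The column totals did not survive in this transcript, so they are left as a parameter rather than filled in.

```python
# Row totals for JOB as shown on the slide (they sum to the 160 employees).
row_totals = {"Stacker": 40, "Sales": 56, "Admin": 64}
n = sum(row_totals.values())

def expected_counts(row_totals, col_totals):
    """Expected count per cell under independence:
    (row total * column total) / grand total.
    Hypothetical helper; col_totals maps each response to its column total."""
    grand_total = sum(row_totals.values())
    return {
        (row, col): rt * ct / grand_total
        for row, rt in row_totals.items()
        for col, ct in col_totals.items()
    }

# Under H0 every response column is split across jobs in the same proportions.
row_shares = {job: total / n for job, total in row_totals.items()}
print(row_shares)
```

Under H0, then, each response column is expected to split 25% Stacker, 35% Sales, and 40% Admin, whatever that column's total is.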
We know what to do now with our table of Observed and Expected Counts… The calculated chi-squared statistic is the sum of the distances, where each cell's distance is (O − E)²/E.
Calculate the table of distances. Rows (Job): Stacker, Sales, Admin, Total. Columns (Response): Fav, Unfav, NC, Total. The distances total 37.44.
Get the p-value. The table of distances totals 37.44. Dof = (#rows − 1)(#cols − 1) = 2*2 = 4. P-value = CHIDIST(37.44, 4) = 1.46E-07.
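For readers without Excel, the CHIDIST step can be checked with a short Python sketch. The helper name `chi2_sf_even_df` is an assumption made here, and it relies on a closed form for the chi-squared upper tail that holds only when the degrees of freedom are even (as they are here, with dof = 4). It is not how Excel computes CHIDIST internally.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable.
    Closed form valid only for positive even df:
    exp(-x/2) * sum over i < df/2 of (x/2)^i / i!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half**i / math.factorial(i) for i in range(df // 2)
    )

rows, cols = 3, 3                    # 3 jobs by 3 responses
dof = (rows - 1) * (cols - 1)        # (#rows - 1)(#cols - 1)
p_value = chi2_sf_even_df(37.44, dof)
print(dof, p_value)
```

The result agrees with the slide's 1.46E-07 to the precision shown.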
=CHITEST will do the last two steps. =CHITEST(range containing the Os, range containing the Es) calculates the chi-squared statistic, compares it to the chi-squared distribution using the appropriate dof, and reports the p-value. =CHITEST(for our data) = 1.46E-07. So you just have to calculate the Es. CHITEST will also work for the GOF test!!
Excel Demo if time…
Statistically Significant? May 13, 1999 Web posted at: 11:38 a.m. EDT (1538 GMT) (CNN) -- Young children who sleep with a light on may have a substantially higher risk of developing nearsightedness as they get older, says a new study in the journal Nature. The collaborative study of 479 children by researchers at the University of Pennsylvania Medical Center and The Children's Hospital of Philadelphia found 55 percent (of the 100) children who slept with a room light on before age 2 had myopia, or nearsightedness, between ages 2 and 16. Of the (112) children who slept with a night-light before age 2, 34 percent were myopic, while just 10 percent of children who slept in darkness were nearsighted.
1. Create the Contingency Table of Observed Counts. Rows: Light, Night Light, Dark. Columns: Myopic, Not. Earlier we would have asked P(Light│Myopic) = 55/120. Now we want to test H0: Sleep Conditions and Subsequent Eyesight are independent. Equivalently, H0: P(M) is equal for all three sleeping conditions. Calculate the Expected Counts if independent, then the distances. Dof = (3 − 1)(2 − 1) = 2. P-value = CHIDIST(84.21, 2) = 5.19E-19. Statistically Significant.
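The whole test can be reproduced with a minimal Python sketch. The observed counts below are an assumption: they are rebuilt from the article's percentages (55% of 100, 34% of 112, and 10% of the remaining 479 − 100 − 112 = 267 children), rounded to whole children. They give 120 myopic children in total, which matches the 55/120 on the slide. The p-value line uses the fact that the chi-squared upper tail with 2 dof is exp(−x/2).

```python
import math

# Observed counts, reconstructed from the article's percentages (assumption).
# Each row is [Myopic, Not] for one sleeping condition.
observed = [
    [55, 45],    # Light:       55% of 100
    [38, 74],    # Night light: 34% of 112, rounded
    [27, 240],   # Dark:        10% of 267, rounded
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n.
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

# Chi-squared statistic: sum of the distances (O - E)^2 / E over all cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

dof = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = math.exp(-chi2 / 2)   # upper tail; valid because dof == 2
print(round(chi2, 2), dof, p_value)
```

With these reconstructed counts the statistic rounds to the slide's 84.21 and the p-value to 5.19E-19.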
Suppose we flip the contingency table? Original layout (rows: Light, Night Light, Dark; columns: Myopic, Not): calculated chi-squared = 84.21, P-value = 5.19E-19. Flipped layout (rows: Myopic, Not; columns: Light, Night Light, Dark): calculated chi-squared = ?, P-value = ?
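One way to explore the question is to compute the statistic for both layouts. In the sketch below, the counts are an assumption rebuilt from the article's percentages (rounded to whole children), and `chi2_stat` is a hypothetical helper name. Because every cell keeps the same observed count, row total, and column total when the table is transposed, each distance is unchanged, and (#rows − 1)(#cols − 1) is symmetric in rows and columns.

```python
def chi2_stat(table):
    """Sum of (O - E)^2 / E over all cells, with E = row total * col total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (o - rt * ct / n) ** 2 / (rt * ct / n)
        for row, rt in zip(table, row_totals)
        for o, ct in zip(row, col_totals)
    )

def dof(table):
    """Degrees of freedom: (#rows - 1)(#cols - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)

# Rows: Light, Night light, Dark; columns: Myopic, Not (reconstructed counts).
original = [[55, 45], [38, 74], [27, 240]]
# Flipped layout: rows Myopic, Not; columns Light, Night light, Dark.
flipped = [list(col) for col in zip(*original)]

print(chi2_stat(original), dof(original))
print(chi2_stat(flipped), dof(flipped))
```

Both layouts give the same statistic and the same dof, so the p-value does not depend on which variable is placed in the rows.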
Assignment 12
Use the class data to test the independence of ATHLETE and HS STAT.
Use the Denmark Family data to test independence of “Gender Mix of first 3” and “Have 4?”
Denmark Family data (excerpt):
ID   Child1  Child2  Child3  Child4  Famsize
4    M       M       M               3
6    F       M       M               3
9    F       M       M               3
12   F       M       M       F       4
14   F       M       F               3
25   F       F       M
     F       F       M       M
     M       F       F
     M       M       M               3
Class data (excerpt; columns ID, HS Stat?, Athlete?): 1 Yes No; No; 68 No; 69 Yes