Chi-Square Test Section 12.1
Categorical Variables
Based on observations.
Univariate: a single categorical variable. Example: Sample 100 people and ask whether they agree or disagree with a question.
Bivariate: two categorical variables. Example: Sample 100 people and ask whether they are male or female and which political party they support.
One-Way Frequency Table (univariate)
Data: the individual party responses (Democrat, Independent, Republican, ...).
Horizontal One-Way Table:
         Democrat   Republican   Independent
Freq.    4          6            2
Vertical One-Way Table:
              Freq.
Democrat      4
Republican    6
Independent   2
Goodness of Fit Test
Used to measure the extent to which the observed counts differ from the expected counts.
k = number of categories of a categorical variable
d.f. = k - 1
Test Statistic: χ² = Σ (observed - expected)² / expected, summed over all k categories.
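The goodness-of-fit statistic is the sum over categories of (observed - expected)² / expected, with d.f. = k - 1. A minimal Python sketch of that arithmetic follows; the function names and the three-category counts are hypothetical, chosen only to illustrate the calculation.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(observed):
    """d.f. = k - 1, where k is the number of categories."""
    return len(observed) - 1


# Hypothetical counts: 60 observations in three categories,
# with equal expected counts of 20 under the null hypothesis.
observed = [18, 22, 20]
expected = [20, 20, 20]

stat = chi_square_statistic(observed, expected)
df = degrees_of_freedom(observed)
print(stat, df)
```

A larger statistic means the observed counts sit farther from the expected counts, relative to the size of those expected counts.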
Assumptions
Observed values are based on random samples.
Sample size is large: each expected cell count is at least 5.
Hypotheses
H0: State each proportion's hypothesized value.
HA: At least one of the proportions differs from its hypothesized value.
The test uses the Chi-Square distribution (chart).
Positively skewed.
Its shape depends on the d.f.
The P-value can also be found on the calculator!
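For readers without a chart or calculator at hand, the upper-tail area of the chi-square distribution can be approximated directly. The sketch below is one possible approach, not the calculator's own routine: it uses a series expansion of the regularized lower incomplete gamma function, and the function name `chi2_upper_tail` is hypothetical.

```python
import math


def chi2_upper_tail(x, df):
    """Approximate P(chi-square with `df` degrees of freedom >= x).

    Uses the series gamma(a, z) = z^a e^-z * sum_n z^n / (a (a+1) ... (a+n))
    with a = df / 2 and z = x / 2, which suits small to moderate x.
    """
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, 1000):
        term *= z / (a + n)
        total += term
        if term < 1e-15 * total:
            break
    # Divide by Gamma(a) in log space to get the regularized lower tail.
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower


# Upper-tail area beyond 2.24 with 3 degrees of freedom.
print(round(chi2_upper_tail(2.24, 3), 2))
```

Because the distribution is positively skewed, the P-value for a goodness-of-fit test is always this right-hand tail area.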
Is there a preference in type of car?
          Freq.   Expected
SUV       27      25
Truck     25      25
Sedan     29      25
Sports    19      25
p1 = proportion who prefer an SUV
p2 = proportion who prefer a truck
p3 = proportion who prefer a sedan
p4 = proportion who prefer a sports car
Assumptions: Random samples, and all cell counts are at least 5.
Use a Chi-Square goodness of fit test.
χ² = 2.24, d.f. = 3
P-val = χ²cdf(2.24, ∞, 3) = 0.52
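The car example can be checked numerically. The sketch below assumes the null hypothesis is that all four proportions equal 0.25 (the transcript does not state it explicitly, but it is consistent with the reported statistic of 2.24). The helper `upper_tail_df3` is a hypothetical name; it uses the closed form of the chi-square upper tail that holds only for 3 degrees of freedom.

```python
import math

categories = ["SUV", "Truck", "Sedan", "Sports"]
observed = [27, 25, 29, 19]

# Assumed null hypothesis: no preference, so each proportion is 0.25.
n = sum(observed)
expected = [n * 0.25 for _ in categories]

stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(categories) - 1


def upper_tail_df3(x):
    """P(chi-square with 3 d.f. >= x), valid for 3 degrees of freedom only."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


p_value = upper_tail_df3(stat)
print(round(stat, 2), df, round(p_value, 2))
```

The rounded values agree with the slide: a statistic of 2.24 on 3 degrees of freedom and a P-value of about 0.52.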
A researcher believes that the number of homicides in CA is uniformly distributed across the seasons. To test this claim, you randomly select 1200 homicides from a recent year and record the season in which each happened.
Season   Freq
Spring   312
Summer   299
Fall     297
Winter   293
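As a rough check of the uniform claim, the statistic can be compared with a critical value. Two assumptions in the sketch below are not from the slide: a significance level of 0.05 (critical value 7.815 for 3 degrees of freedom), and the use of the total of the listed counts, which is 1201 rather than the stated 1200, so that observed and expected counts sum to the same number.

```python
observed = {"Spring": 312, "Summer": 299, "Fall": 297, "Winter": 293}

# The listed frequencies total 1201, so the expected count per season
# is computed from that total instead of the stated sample size of 1200.
total = sum(observed.values())
expected_each = total / len(observed)

stat = sum((count - expected_each) ** 2 / expected_each for count in observed.values())

# Assumed significance level 0.05: chi-square critical value for 3 d.f.
CRITICAL_VALUE_DF3_ALPHA_05 = 7.815
reject_uniform = stat > CRITICAL_VALUE_DF3_ALPHA_05

print(round(stat, 3), reject_uniform)
```

Under these assumptions the statistic falls well below the critical value, so the sample gives no evidence against a uniform distribution across seasons.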
Results from a previous survey asking people who go to movies at least once a month are shown in the table below. To determine whether this distribution is still the same, you randomly select 1000 people who go to movies at least once a month and record the age of each. Are the distributions the same?
Table columns: Age, Survey (%), Freq. (The table values did not survive in the transcript; the last frequency shown is 207.)