Chi-Square Test
Most of the techniques presented so far have been for NUMERICAL data. So, what do we do if the data are CATEGORICAL? Ex: Information gathered on gender, political party, college major, etc.
Categorical Variables (based on observations)
Univariate: a single categorical variable. Example: Sample 100 people and ask whether they agree or disagree with a question.
Bivariate: two categorical variables. Example: Sample 100 people and ask whether they are male or female and which political party they support.
One-Way Frequency Table (Univariate)
Data: a list of individual responses, each one Democrat, Republican, or Independent, tallied by category.
Horizontal one-way table:
        Democrat   Republican   Independent
Freq.   4          6            2
Vertical one-way table:
              Freq.
Democrat      4
Republican    6
Independent   2
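As a small illustration of how a one-way table is built from raw responses, the sketch below tallies a list of answers with Python's `collections.Counter`. The `responses` list is hypothetical: it is made up only so that its tallies match the counts in the table (4, 6, 2), and it is not the slide's original data.

```python
from collections import Counter

# Hypothetical raw responses, constructed only to match the table's counts.
responses = ["Democrat"] * 4 + ["Republican"] * 6 + ["Independent"] * 2

# Counter tallies how many times each category appears.
counts = Counter(responses)

# Print a vertical one-way table.
print(f"{'Party':<13}Freq.")
for party in ("Democrat", "Republican", "Independent"):
    print(f"{party:<13}{counts[party]}")
```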
Goodness of Fit Test
Used to measure the extent to which the observed counts differ from the expected counts.
k = number of categories of the categorical variable
df = k - 1
Test statistic: χ² = Σ (observed - expected)² / expected, summed over all k categories
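The goodness-of-fit statistic adds up (observed - expected)² / expected over the categories. A minimal sketch of that arithmetic follows; the function name `chi_square_statistic` and the three-category counts are illustrative assumptions, not data from the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for a variable with k = 3 categories.
observed = [18, 22, 20]
expected = [20, 20, 20]

statistic = chi_square_statistic(observed, expected)
df = len(observed) - 1  # df = k - 1

print(f"chi-square = {statistic:.2f}, df = {df}")
```

The larger the statistic, the further the observed counts are from what the hypothesized proportions predict.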
Hypotheses
H0: State each proportion's hypothesized value.
HA: At least one of the proportions differs from its hypothesized value.
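The test uses the chi-square distribution (chi-square chart). The distribution is positively skewed, its shape depends on the d.f., and its probabilities are available on the calculator!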
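For readers without a chart or calculator at hand, the upper-tail area of the chi-square distribution can be sketched with the Python standard library. The helper name `chi_square_sf` is an assumption for illustration. It relies on the identity that the tail area equals the regularized upper incomplete gamma function Q(df/2, x/2), built up from Q(1/2, y) = erfc(√y) or Q(1, y) = e^(-y) with the recurrence Q(a + 1, y) = Q(a, y) + y^a e^(-y) / Γ(a + 1).

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df d.f."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    y = x / 2.0
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-y)              # Q(1, y)
    else:
        a, tail = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y)
    # Step a up by 1 until it reaches df / 2.
    while a < df / 2.0:
        tail += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1.0
    return tail


# Standard table critical values at the 0.05 level should give tails near 0.05.
for df, critical in [(1, 3.841), (2, 5.991), (3, 7.815), (4, 9.488)]:
    print(df, critical, round(chi_square_sf(critical, df), 4))
```

Because the distribution is positively skewed, only the right tail is used: a large statistic gives a small tail area.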
Is there a preference in type of car?
Type     Freq.   Expected
SUV      27      25
Truck    25      25
Sedan    29      25
Sports   19      25
(If there is no preference, each type is expected to get 100 × 0.25 = 25 of the 100 responses.)
p1 = proportion who prefer an SUV
p2 = proportion who prefer a truck
p3 = proportion who prefer a sedan
p4 = proportion who prefer a sports car
Assumptions: random sample and all expected cell counts are at least 5.
Use a chi-square goodness-of-fit test with df = 4 - 1 = 3.
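The car-preference example can be worked through as a short sketch. It assumes the null hypothesis of no preference (each proportion equals 0.25) and a 0.05 significance level, which the slide does not state. The critical value 7.815 is the standard chart entry for df = 3 at that level.

```python
# Observed counts from the slide.
observed = {"SUV": 27, "Truck": 25, "Sedan": 29, "Sports": 19}

n = sum(observed.values())          # total sample size
k = len(observed)                   # number of categories
expected = n / k                    # equal share under "no preference" (assumed H0)

# Goodness-of-fit statistic: sum of (observed - expected)^2 / expected.
statistic = sum((o - expected) ** 2 / expected for o in observed.values())
df = k - 1

CRITICAL_VALUE = 7.815              # chi-square chart, df = 3, alpha = 0.05 (assumed level)
reject_h0 = statistic > CRITICAL_VALUE

print(f"n = {n}, expected per type = {expected}")
print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {reject_h0}")
```

Under these assumptions the statistic is about 2.24, well below 7.815, so the data do not give convincing evidence of a preference in type of car.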
A researcher believes that the number of homicides in CA is uniformly distributed across the seasons. To test this claim, you randomly select 1200 homicides from a recent year and record the season in which each occurred.
Season   Freq.
Spring   312
Summer   298
Fall     297
Winter   293
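A sketch of the homicide example follows, assuming the uniform claim means each season has proportion 0.25, so each expected count is 1200 / 4 = 300. For df = 3 the chi-square tail area has the closed form erfc(√(x/2)) + √(2x/π) · e^(-x/2), which is used here in place of a chart; the variable names are illustrative.

```python
import math

# Observed counts from the slide.
observed = {"Spring": 312, "Summer": 298, "Fall": 297, "Winter": 293}

n = sum(observed.values())
expected = n / len(observed)        # uniform claim: equal count per season

statistic = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1              # 4 seasons, so df = 3

# Upper-tail area of the chi-square distribution, valid for df = 3 only.
p_value = (math.erfc(math.sqrt(statistic / 2))
           + math.sqrt(2 * statistic / math.pi) * math.exp(-statistic / 2))

print(f"expected per season = {expected}")
print(f"chi-square = {statistic:.3f}, df = {df}, p-value = {p_value:.3f}")
```

The statistic comes to roughly 0.69 with a p-value near 0.88, so the sample gives no reason to doubt the claim of a uniform distribution.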
The age distribution from a previous survey of people who go to the movies at least once a month is shown in the table below. To determine whether this distribution is still the same, you randomly select 1000 people who go to the movies at least once a month and record the age of each. Are the distributions the same?
Age       Survey   Freq.
2 - 17    26.7%    240
18 - 24   19.8%    214
25 - 39   19.7%    183
40 - 49   14.0%    156
50+       19.8%    207
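When the hypothesized proportions are not equal, each expected count is the sample size times that category's survey proportion. The sketch below applies this to the movie-goer table. For df = 4 the chi-square tail area reduces to e^(-x/2) · (1 + x/2), used here in place of a chart. The 0.05 level in the closing note is an assumption, not something stated on the slide.

```python
import math

# (age group, survey proportion, observed frequency) from the slide.
table = [
    ("2 - 17", 0.267, 240),
    ("18 - 24", 0.198, 214),
    ("25 - 39", 0.197, 183),
    ("40 - 49", 0.140, 156),
    ("50+", 0.198, 207),
]

n = sum(freq for _, _, freq in table)

statistic = 0.0
for age, proportion, freq in table:
    expected = n * proportion       # expected count under the old distribution
    statistic += (freq - expected) ** 2 / expected
    print(f"{age:<8} observed = {freq:<4} expected = {expected:.0f}")

df = len(table) - 1                 # 5 age groups, so df = 4

# Upper-tail area of the chi-square distribution, valid for df = 4 only.
p_value = math.exp(-statistic / 2) * (1 + statistic / 2)

print(f"chi-square = {statistic:.3f}, df = {df}, p-value = {p_value:.3f}")
```

The statistic is about 7.26 with a p-value near 0.12, so at an assumed 0.05 level there is not enough evidence that the age distribution has changed.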