The Practice of Statistics Third Edition Chapter (13.1) 14.1: Chi-square Test for Goodness of Fit Copyright © 2008 by W. H. Freeman & Company Daniel S.


1 The Practice of Statistics Third Edition Chapter (13.1) 14.1: Chi-square Test for Goodness of Fit Copyright © 2008 by W. H. Freeman & Company Daniel S. Yates

2 When to Use the Chi-square (χ²) Procedure Use it when the dependent variable is categorical or ranked, or when the usual assumptions about the population are not reasonable, for example when the population distribution is non-normal.

3 This Chapter Will Cover Three Tests Based on the Chi-square Distributions. (1) Test whether the observed counts for a categorical variable could come from a certain hypothesized distribution (Goodness of Fit). (2) Test whether a single categorical variable has the same distribution in two or more distinct populations (Inference for Two-Way Tables, Tests for Homogeneity of Populations). (3) Test whether two categorical variables are associated or independent (Inference for Two-Way Tables, Tests of Association/Independence).

4 Required Conditions for the Goodness of Fit Procedure SRS: the data come from a simple random sample. Independence: the observations must be independent, and each observation must fit into one and only one cell or category. Expected counts: all individual expected counts are at least 1, and no more than 20% of the expected counts are less than 5. Please note: we are working with counts, not proportions. There is no mention of normality. Chi-square procedures do not require the population from which the sample is selected to be normally distributed.
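The expected-count condition above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function name `check_expected_counts` and the sample counts are made up for illustration.

```python
def check_expected_counts(expected):
    """Return (ok, smallest, share_below_5) for a list of expected counts.

    ok is True when every expected count is at least 1 and no more than
    20% of the expected counts are below 5.
    """
    smallest = min(expected)
    share_below_5 = sum(1 for e in expected if e < 5) / len(expected)
    ok = smallest >= 1 and share_below_5 <= 0.20
    return ok, smallest, share_below_5


# Hypothetical expected counts, made up purely for illustration.
hypothetical_expected = [30.0, 12.0, 8.5, 6.0, 4.2]
ok, smallest, share = check_expected_counts(hypothetical_expected)
print(ok, smallest, share)
```

The check covers only the expected counts. Whether the sample is an SRS and whether the observations are independent still has to be judged from how the data were collected.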


6 Hypotheses for the Goodness of Fit Test H0: the actual population proportions are equal to the hypothesized proportions. Ha: the actual proportions are different from the hypothesized proportions (at least one differs).

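7 Chi-square Test Statistic χ² = Σ (O − E)² / E, summed over all categories, where O is the observed count and E is the expected count. Degrees of freedom = k − 1, where k is the number of categories. Use the chi-square distribution with that many degrees of freedom to find the critical value of χ² at the chosen α level.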
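As a rough sketch of the statistic described on this slide, the sum of (O − E)²/E over the categories and the k − 1 degrees of freedom could be computed as follows. The function name and the die-roll counts are assumptions made for illustration and do not come from the slides.

```python
def chi_square_statistic(observed, expected):
    """Return (statistic, degrees_of_freedom) for a goodness of fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one entry per category")
    # Sum of (O - E)^2 / E over all categories.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # k - 1, where k is the number of categories.
    degrees_of_freedom = len(observed) - 1
    return statistic, degrees_of_freedom


# Made-up counts for illustration only: 60 rolls of a die, 10 expected per face.
observed_rolls = [8, 12, 9, 11, 10, 10]
expected_rolls = [10.0] * 6
stat, df = chi_square_statistic(observed_rolls, expected_rolls)
print(stat, df)
```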


11 Properties of Chi-square Distributions The total area under each curve is one. For df > 2, each chi-square curve starts at the origin, increases to a peak, and then approaches the x-axis asymptotically from above. Each distribution is skewed to the right. As the number of degrees of freedom increases, the distribution becomes more symmetrical and looks more like a normal curve.
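These properties can be checked numerically. The sketch below is an illustration under stated assumptions rather than anything from the slides: it evaluates the standard chi-square density, approximates the area with a midpoint rule on a grid whose range and step count were chosen arbitrarily, and reports where the curve peaks. The helper names are hypothetical.

```python
import math


def chi2_pdf(x, df):
    """Chi-square density at x > 0 for df degrees of freedom."""
    half = df / 2.0
    log_pdf = ((half - 1.0) * math.log(x) - x / 2.0
               - half * math.log(2.0) - math.lgamma(half))
    return math.exp(log_pdf)


def area_and_peak(df, upper=200.0, steps=100000):
    """Midpoint-rule area on (0, upper) and the grid point with the highest density."""
    width = upper / steps
    area = 0.0
    peak_x, peak_y = 0.0, 0.0
    for i in range(steps):
        x = (i + 0.5) * width  # midpoint of the i-th slice, never exactly 0
        y = chi2_pdf(x, df)
        area += y * width
        if y > peak_y:
            peak_x, peak_y = x, y
    return area, peak_x


for df in (3, 5, 10, 30):
    area, peak = area_and_peak(df)
    print(f"df = {df:>2}: area = {area:.4f}, peak near x = {peak:.2f}")
```

For df ≥ 2 the peak sits at df − 2 while the mean is df, so the peak always lies to the left of the mean, which is one way to see the right skew. The gap stays at 2 while the spread grows with df, so the curve looks more symmetrical as df increases.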

12 Example 1 Consider the problem of determining whether the distribution of car sales in the Eastern United States in the current year for Nissans, Mazdas, Toyotas, and Hondas is the same as the known distribution of the previous year, given below: Nissan 18%, Mazda 10%, Toyota 35%, Honda 37%. From the Motor Vehicle Bureau records, we select a random sample of 1,000 new car purchases of these four types of foreign cars in the current year. The observed frequencies are: Nissan 150, Mazda 65, Toyota 385, Honda 400. Is the current year's sales distribution the same as last year's?

13 Example 1 Continued Step 1 – We want to determine whether this year's sales distribution is different from last year's sales distribution. – Population: this year's sales of Nissans, Mazdas, Toyotas, and Hondas. – Parameter: the proportion of sales for each make. – H0: The current year's sales distribution is the same as the previous year's distribution (Nissan: 18%, Mazda: 10%, Toyota: 35%, and Honda: 37%). – Ha: The current year's sales distribution is not the same as the previous year's.

14 Example 1 Continued Step 2 Conditions – SRS: a random sample was taken from the Motor Vehicle Bureau records. We do not know whether the sample was drawn from all state motor vehicle bureaus in the eastern United States. We will assume we have an SRS. – Expected counts: Nissan: 0.18 × 1000 = 180; Mazda: 0.10 × 1000 = 100; Toyota: 0.35 × 1000 = 350; Honda: 0.37 × 1000 = 370. All expected counts are at least 5. – Independence: the observations are independent.

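15 Example 1 Continued Step 3 Calculations Observed count (O), expected count (E), and (O − E)²/E for each make: Nissan: O = 150, E = 180, (O − E)²/E = 5.00; Mazda: O = 65, E = 100, (O − E)²/E = 12.25; Toyota: O = 385, E = 350, (O − E)²/E = 3.50; Honda: O = 400, E = 370, (O − E)²/E = 2.43. Sum: χ² = 23.18. From Table D using df = 3 and α = 0.05, the critical value is χ²* = 7.81.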
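The table above can be reproduced in a few lines. This is a sketch for checking the arithmetic, with variable names chosen for illustration. The critical value 7.81 is copied from the slide rather than computed.

```python
makes = ["Nissan", "Mazda", "Toyota", "Honda"]
observed = [150, 65, 385, 400]
last_year_share = [0.18, 0.10, 0.35, 0.37]

n = sum(observed)  # sample size, 1000 purchases
expected = [n * p for p in last_year_share]
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)
df = len(observed) - 1
critical_value = 7.81  # Table D value quoted on the slide for df = 3, alpha = 0.05

print(f"{'Make':<8}{'O':>6}{'E':>8}{'(O-E)^2/E':>12}")
for make, o, e, c in zip(makes, observed, expected, contributions):
    print(f"{make:<8}{o:>6}{e:>8.0f}{c:>12.2f}")
print(f"chi-square = {chi_square:.2f}, df = {df}")
print("reject H0" if chi_square > critical_value else "fail to reject H0")
```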

16 Example 1 Continued Step 4 Interpretation Since χ² = 23.18 is to the right of χ²* = 7.81, the P-value is smaller than α = 0.05. The result is statistically significant, so we reject H0. We have evidence that the current sales distribution is not the same as last year's sales distribution. The test only tells you that there is a change; additional analysis may be required. Look at the (O − E)²/E column to find the major contributor to the chi-square statistic. In this problem, Mazda contributes the most (12.25): fewer Mazdas were sold in the current year than expected.
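The slide compares χ² with a table value. As a complementary sketch, the P-value itself can be computed for df = 3 using the closed-form upper-tail area of the chi-square distribution with 3 degrees of freedom. The function name is hypothetical, and the formula applies to df = 3 only.

```python
import math


def chi2_upper_tail_df3(x):
    """P(X >= x) for a chi-square variable with exactly 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


chi_square = 23.18  # statistic from Step 3
alpha = 0.05
p_value = chi2_upper_tail_df3(chi_square)

print(f"P-value = {p_value:.6f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
# Sanity check against the table: the area to the right of 7.81 should be close to 0.05.
print(f"area beyond 7.81 = {chi2_upper_tail_df3(7.81):.3f}")
```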

17 Example 2 Are you more likely to have a motor vehicle collision when using a cell phone? A study of 699 drivers who were using a cell phone when they were involved in a collision examined this question. These drivers made 26,798 cell phone calls during a 14-month study period. Each of the 699 collisions was classified in various ways. Here are the counts for each day of the week: Sun 20, Mon 133, Tue 126, Wed 159, Thu 136, Fri 113, Sat 12 (total 699). Are the accidents equally likely to occur on any day of the week?

18 Example 2 Continued Step 1 – Population: all motor vehicle accidents involving cell phone use. – Parameter: the proportion of such accidents occurring on each day of the week. – H0: Motor vehicle accidents involving cell phone use are equally likely to occur on each day of the week. – Ha: The probabilities of a motor vehicle accident involving cell phone use vary from day to day (they are not all the same).

19 Example 2 Continued Step 2 Conditions – SRS: assume an SRS. – Expected counts: under H0 each day (Sun, Mon, Tue, Wed, Thu, Fri, Sat) has the same expected count, 699 × (1/7) = 99.857. All expected counts are greater than 5. – Independence: the observed counts are independent.

20 Example 2 Continued Step 3 Calculations Use a calculator. L1 = observed counts. L2 = expected counts. L3 = (O − E)²/E = (L1 − L2)²/L2. χ² = sum(L3). P-value: 2nd DISTR χ²cdf(lower bound, upper bound, df), with lower bound = χ², a very large upper bound, and df = 7 − 1 = 6.
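For readers without a graphing calculator, the same steps can be sketched in Python. The list names mirror L1, L2, and L3 from the slide. The helpers `chi2_sf` and `chi2_cdf_between` are hypothetical stand-ins for the calculator's χ²cdf, built on the standard recurrence for the upper-tail area with integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail area P(X >= x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)              # exact tail area for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))   # exact tail area for df = 1
    # Each pass raises the degrees of freedom by 2 until df is reached.
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q


def chi2_cdf_between(lower, upper, df):
    """Area under the chi-square curve between lower and upper."""
    return chi2_sf(lower, df) - chi2_sf(upper, df)


L1 = [20, 133, 126, 159, 136, 113, 12]        # observed counts, Sun..Sat
L2 = [sum(L1) / 7] * 7                        # expected counts under H0
L3 = [(o - e) ** 2 / e for o, e in zip(L1, L2)]
chi_square = sum(L3)
df = len(L1) - 1
p_value = chi2_cdf_between(chi_square, 1e99, df)

print(f"chi-square = {chi_square:.2f}, df = {df}, P-value = {p_value:.3g}")
```

The upper bound of 1e99 plays the role of "effectively infinity", so the result is the area to the right of the observed statistic.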

21 Example 2 Continued Step 4 Interpretation – The P-value is extremely small. At α = 0.05 we reject H0. We have evidence that accidents involving cell phones are not evenly distributed over the days of the week. – Additional analysis: Saturday and Sunday made the largest contributions to the χ² statistic. There were fewer accidents involving cell phones on weekends than expected.
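The follow-up analysis can be sketched by ranking each day's contribution to χ² and noting whether the observed count fell above or below the expected count. The variable names are illustrative.

```python
days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
observed = [20, 133, 126, 159, 136, 113, 12]
expected = sum(observed) / len(observed)  # same expected count for every day under H0

rows = []
for day, o in zip(days, observed):
    contribution = (o - expected) ** 2 / expected
    direction = "below" if o < expected else "above"
    rows.append((contribution, day, direction))

rows.sort(reverse=True)  # largest contribution first
chi_square = sum(c for c, _, _ in rows)
weekend_share = sum(c for c, d, _ in rows if d in ("Sat", "Sun")) / chi_square

for contribution, day, direction in rows:
    print(f"{day}: {contribution:6.2f} ({direction} expected)")
print(f"weekend share of chi-square = {weekend_share:.0%}")
```

Sorting by contribution shows which categories drive the result, and the direction column shows whether each day had more or fewer collisions than the equal-likelihood model predicts.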

