1
The Chi-Square Goodness-of-Fit Test
© 2010 Pearson Prentice Hall. All rights reserved.

5
12-5 Characteristics of the Chi-Square Distribution
1. It is not symmetric.
2. The shape of the chi-square distribution depends on the degrees of freedom, just like Student's t-distribution.
3. As the number of degrees of freedom increases, the chi-square distribution becomes more nearly symmetric.
4. The values of $\chi^2$ are nonnegative, i.e., the values of $\chi^2$ are greater than or equal to 0.
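Characteristic 3 can be illustrated numerically. The sketch below relies on a standard result that is not stated on the slides: the skewness of a chi-square distribution with k degrees of freedom is sqrt(8/k). The function name `chi_square_skewness` is an illustrative choice.

```python
import math

def chi_square_skewness(df):
    """Skewness of the chi-square distribution with df degrees of freedom.

    Uses the standard result skewness = sqrt(8 / df); a value of 0 would
    mean a perfectly symmetric distribution.
    """
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return math.sqrt(8.0 / df)

# Skewness shrinks toward 0 as the degrees of freedom grow.
for df in (1, 5, 10, 30, 100):
    print(df, round(chi_square_skewness(df), 3))
```

The printed values decrease steadily, which matches the claim that the distribution becomes more nearly symmetric as the degrees of freedom increase, while never becoming exactly symmetric.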

7
12-7 A goodness-of-fit test is an inferential procedure used to determine whether a frequency distribution follows a specific distribution.

8
12-8 Expected Counts
Suppose that there are $n$ independent trials of an experiment with $k \ge 3$ mutually exclusive possible outcomes. Let $p_1$ represent the probability of observing the first outcome and $E_1$ represent the expected count of the first outcome; $p_2$ represent the probability of observing the second outcome and $E_2$ represent the expected count of the second outcome; and so on. The expected counts for each possible outcome are given by
$E_i = \mu_i = n p_i$ for $i = 1, 2, \dots, k$

9
12-9 Parallel Example 1: Finding Expected Counts
A sociologist wishes to determine whether the distribution for the number of years care-giving grandparents are responsible for their grandchildren is different today than it was in 2000. According to the United States Census Bureau, in 2000, 22.8% of care-giving grandparents had been responsible for their grandchildren for less than 1 year; 23.9% for 1 or 2 years; 17.6% for 3 or 4 years; and 35.7% for 5 or more years. If the sociologist randomly selects 1,000 care-giving grandparents, compute the expected number within each category, assuming the distribution has not changed from 2000.

10
12-10 Solution
Step 1: The probabilities are the relative frequencies from the 2000 distribution:
$p_{<1\,\mathrm{yr}} = 0.228$
$p_{1\text{-}2\,\mathrm{yr}} = 0.239$
$p_{3\text{-}4\,\mathrm{yr}} = 0.176$
$p_{\ge 5\,\mathrm{yr}} = 0.357$

11
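12-11 Solution
Step 2: There are $n = 1000$ trials of the experiment, so the expected counts are:
$E_{<1\,\mathrm{yr}} = n p_{<1\,\mathrm{yr}} = 1000(0.228) = 228$
$E_{1\text{-}2\,\mathrm{yr}} = n p_{1\text{-}2\,\mathrm{yr}} = 1000(0.239) = 239$
$E_{3\text{-}4\,\mathrm{yr}} = n p_{3\text{-}4\,\mathrm{yr}} = 1000(0.176) = 176$
$E_{\ge 5\,\mathrm{yr}} = n p_{\ge 5\,\mathrm{yr}} = 1000(0.357) = 357$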
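The arithmetic in Steps 1 and 2 can be sketched in Python. The function name `expected_counts` and the category labels are illustrative choices, not part of the slides; the probabilities and sample size are the ones given in the example.

```python
def expected_counts(n, probabilities):
    """Return the expected count n * p_i for every category.

    `probabilities` maps a category label to its hypothesized probability.
    """
    total = sum(probabilities.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("category probabilities must sum to 1")
    return {category: n * p for category, p in probabilities.items()}

# Relative frequencies from the 2000 distribution (Step 1).
p_2000 = {"<1": 0.228, "1-2": 0.239, "3-4": 0.176, ">=5": 0.357}

# n = 1000 care-giving grandparents (Step 2).
expected = expected_counts(1000, p_2000)
for category, count in expected.items():
    print(category, round(count, 1))
```

The check that the probabilities sum to 1 guards against a mistyped category before any counts are computed.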

12
12-12 Test Statistic for Goodness-of-Fit Tests
Let $O_i$ represent the observed count of category $i$, $E_i$ represent the expected count of category $i$, $k$ represent the number of categories, and $n$ represent the number of independent trials of an experiment. Then the statistic
$\chi^2_0 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$
approximately follows the chi-square distribution with $k - 1$ degrees of freedom, provided that
1. all expected frequencies are greater than or equal to 1 (all $E_i \ge 1$) and
2. no more than 20% of the expected frequencies are less than 5.
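As a minimal sketch, the test statistic (the sum of squared differences between observed and expected counts, each divided by the expected count) can be computed as follows. The function name `chi_square_statistic` is illustrative; the counts are those of the grandparent example in this deck.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Category order: <1, 1-2, 3-4, >=5 years.
observed = [252, 255, 162, 331]
expected = [228, 239, 176, 357]

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # k - 1

print(round(statistic, 3), degrees_of_freedom)
```

Each term is weighted by its expected count, so a gap of 24 grandparents matters more in a category where 176 are expected than in one where 357 are expected.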

13
12-13 The Goodness-of-Fit Test
To test hypotheses regarding a distribution, we use the steps that follow.
Step 1: Determine the null and alternative hypotheses.
$H_0$: The random variable follows a certain distribution.
$H_1$: The random variable does not follow the distribution specified in $H_0$.

14
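12-14 The Goodness-of-Fit Test (hypotheses cont.)
Typically the hypotheses can be symbolically represented as:
$H_0: p_1 = p_{1,0},\ p_2 = p_{2,0},\ \dots,\ p_k = p_{k,0}$
vs. the alternative:
$H_1$: at least one $p_i \ne p_{i,0}$
Here $p_{i,0}$ denotes the probability that the null hypothesis assigns to category $i$.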

15
12-15 Step 2: Decide on a level of significance, α, depending on the seriousness of making a Type I error.

16
12-16 Step 3: (a) Calculate the expected counts for each of the $k$ categories. The expected counts are $E_i = n p_i$ for $i = 1, 2, \dots, k$, where $n$ is the number of trials and $p_i$ is the probability of the $i$th category, assuming that the null hypothesis is true.

17
12-17 Step 3: (b) Verify that the requirements for the goodness-of-fit test are satisfied.
1. All expected counts are greater than or equal to 1 (all $E_i \ge 1$).
2. No more than 20% of the expected counts are less than 5.
(c) Compute the test statistic:
$\chi^2_0 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$
Note: $O_i$ is the observed count for the $i$th category.
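The two requirements in Step 3(b) translate directly into a check on the list of expected counts. The sketch below assumes the expected counts are already computed; the function name `requirements_met` is illustrative.

```python
def requirements_met(expected):
    """Check the two goodness-of-fit requirements on expected counts.

    1. Every expected count is at least 1.
    2. No more than 20% of the expected counts are below 5.
    """
    if not expected:
        raise ValueError("at least one expected count is required")
    all_at_least_one = all(e >= 1 for e in expected)
    share_below_five = sum(1 for e in expected if e < 5) / len(expected)
    return all_at_least_one and share_below_five <= 0.20

print(requirements_met([228, 239, 176, 357]))  # all counts are large
print(requirements_met([0.5, 10, 10]))         # one count is below 1
```

Note that the check applies to expected counts, not observed counts: an observed count of 0 does not by itself violate the requirements.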

18
12-18 CAUTION! If the requirements in Step 3(b) are not satisfied, one option is to combine two or more of the low-frequency categories into a single category.
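One way to combine categories is sketched below: every category whose expected count falls under a threshold is pooled into a single new category, with observed and expected counts summed in parallel. This pooling rule, the function name `pool_low_categories`, and the counts in the usage example are all assumptions made for illustration; in practice, which categories to merge is a judgment call that should make sense in context.

```python
def pool_low_categories(observed, expected, threshold=5.0, pooled_label="pooled"):
    """Merge all categories with expected count below `threshold` into one.

    `observed` and `expected` are dicts keyed by the same category labels.
    Returns new (observed, expected) dicts; the inputs are not modified.
    """
    new_observed, new_expected = {}, {}
    low_observed, low_expected, found_low = 0, 0.0, False
    for category, e in expected.items():
        if e < threshold:
            low_observed += observed[category]
            low_expected += e
            found_low = True
        else:
            new_observed[category] = observed[category]
            new_expected[category] = e
    if found_low:
        new_observed[pooled_label] = low_observed
        new_expected[pooled_label] = low_expected
    return new_observed, new_expected

# Made-up counts: categories C and D each have an expected count of 2.5.
obs = {"A": 40, "B": 35, "C": 3, "D": 2}
exp = {"A": 42.0, "B": 33.0, "C": 2.5, "D": 2.5}
pooled_obs, pooled_exp = pool_low_categories(obs, exp)
print(pooled_obs, pooled_exp)
```

After pooling, the number of categories k is smaller, so the degrees of freedom k - 1 must be recomputed, and the requirements should be checked again on the pooled expected counts.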

19
12-19 P-Value Approach
Step 4: Use Table VII to obtain an approximate P-value by determining the area under the chi-square distribution with $k - 1$ degrees of freedom to the right of the test statistic.

20
12-20 P-Value Approach
Step 5: If the P-value < α, reject the null hypothesis. If the P-value ≥ α, fail to reject the null hypothesis.
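Steps 4 and 5 can also be carried out without a printed table. The sketch below is not the Table VII lookup the slides describe: it computes the right-tail area directly, using closed forms for 1 and 2 degrees of freedom and the standard recurrence for the regularized upper incomplete gamma function to step up by 2 degrees of freedom at a time. The function names `chi_square_right_tail` and `decide` are illustrative.

```python
import math

def chi_square_right_tail(x, df):
    """Area to the right of x under the chi-square curve with df degrees of freedom."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        tail, a = math.erfc(math.sqrt(z)), 0.5   # closed form for df = 1
    else:
        tail, a = math.exp(-z), 1.0              # closed form for df = 2
    # Recurrence Q(a + 1, z) = Q(a, z) + z^a * e^(-z) / Gamma(a + 1),
    # where a = df / 2; each pass adds 2 degrees of freedom.
    while a < df / 2.0:
        tail += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return tail

def decide(p_value, alpha):
    """Step 5: reject H0 only when the P-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Test statistic of about 6.605 with 4 - 1 = 3 degrees of freedom.
p_value = chi_square_right_tail(6.605, 3)
print(round(p_value, 3), decide(p_value, 0.05))
```

For the statistic 6.605 with 3 degrees of freedom, the computed area is roughly 0.086, which agrees with a table-based value of about 0.09.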

21
12-21 Step 6: State the conclusion in the context of the problem.
Note: In many cases, when the null hypothesis is rejected, we will also need to explain how the data fail to conform to the hypothesized distribution.

22
12-22 Parallel Example 2: Conducting a Goodness-of-Fit Test
A sociologist wishes to determine whether the distribution for the number of years care-giving grandparents are responsible for their grandchildren is different today than it was in 2000. According to the United States Census Bureau, in 2000, 22.8% of care-giving grandparents had been responsible for their grandchildren for less than 1 year; 23.9% for 1 or 2 years; 17.6% for 3 or 4 years; and 35.7% for 5 or more years. The sociologist randomly selects 1,000 care-giving grandparents and obtains the following data.

Number of Years    Observed Count
<1                 252
1-2                255
3-4                162
≥5                 331

23
12-23 Test the claim that the distribution is different today than it was in 2000 at the α = 0.05 level of significance.

24
12-24 Solution
Step 1: We want to know if the distribution today is different than it was in 2000. The hypotheses are then:
$H_0$: The distribution for the number of years care-giving grandparents are responsible for their grandchildren is the same today as it was in 2000.
$H_1$: The distribution for the number of years care-giving grandparents are responsible for their grandchildren is different today than it was in 2000.

25
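12-25 Or alternatively:
$H_0: p_{<1\,\mathrm{yr}} = 0.228,\ p_{1\text{-}2\,\mathrm{yr}} = 0.239,\ p_{3\text{-}4\,\mathrm{yr}} = 0.176,\ p_{\ge 5\,\mathrm{yr}} = 0.357$
$H_1$: At least one of these proportions is different from its 2000 value.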

26
12-26 Solution
Step 2: The level of significance is α = 0.05.
Step 3: (a) The expected counts were computed in Example 1.

Number of Years    Observed Counts    Expected Counts
<1                 252                228
1-2                255                239
3-4                162                176
≥5                 331                357

27
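12-27 Solution
Step 3: (b) Since all expected counts are greater than or equal to 5, the requirements for the goodness-of-fit test are satisfied.
(c) The test statistic is
$\chi^2_0 = \frac{(252-228)^2}{228} + \frac{(255-239)^2}{239} + \frac{(162-176)^2}{176} + \frac{(331-357)^2}{357} \approx 6.605$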

28
12-28 Solution: P-Value Approach
Step 4: There are k = 4 categories. The P-value is the area under the chi-square distribution with 4 - 1 = 3 degrees of freedom to the right of $\chi^2_0 = 6.605$. Thus, P-value ≈ 0.09.

29
12-29 Solution: P-Value Approach
Step 5: Since the P-value ≈ 0.09 is greater than the level of significance α = 0.05, we fail to reject the null hypothesis.

30
12-30 Solution
Step 6: There is insufficient evidence, at the α = 0.05 level of significance, to conclude that the distribution for the number of years care-giving grandparents are responsible for their grandchildren is different today than it was in 2000.
