Test of Goodness of Fit Lecture 41 Section 14.1 – 14.3 Wed, Nov 14, 2007.

Test of Goodness of Fit Lecture 41 Section 14.1 – 14.3 Wed, Nov 14, 2007

Count Data Count data – Data that counts the number of observations that fall into each of several categories.

Count Data The data may be univariate or bivariate. Univariate example – Observe a person’s opinion on a subject (strongly agree, agree, etc.). Bivariate example – Observe a opinion on a subject and their education level (< high school, high school, etc.)

Univariate Example Observe a person’s opinion on a question. Strongly Agree AgreeNeutralDisagree Strongly Disagree 1001208050

Bivariate Example Observe each person’s opinion and education level. Strongly Agree AgreeNeutralDisagree Strongly Disagree < High School 35253055 High School 304015105 College 204015 20 > College 15 20

The Two Basic Questions For univariate data, do the data fit a specified distribution? For example, could these data have come from a uniform distribution? Strongly Agree AgreeNeutralDisagree Strongly Disagree 1001208050

The Two Basic Questions For bivariate data, for the various values of one of the variables, does the other variable show the same distribution? Could each row have come from the same distribution? Strongly Agree AgreeNeutralDisagree Strongly Disagree < High School 35253055 High School 304015105 College 204015 20 > College 15 20

Observed and Expected Counts Observed counts – The counts that were actually observed in the sample. Expected counts – The counts that would be expected if the null hypothesis were true.

Tests of Goodness of Fit The goodness-of-fit test applies only to univariate data. The null hypothesis specifies a discrete distribution for the population. We want to determine whether a sample from that population supports this hypothesis.

Examples If we rolled a die 60 times, we expect 10 of each number.  If we get frequencies 8, 10, 14, 12, 9, 7, does that indicate that the die is not fair? What is the distribution if the die were fair?

Examples If we toss a fair coin, we should get two heads ¼ of the time, two tails ¼ of the time, and one of each ½ of the time.  Suppose we toss a coin 100 times and get two heads 16 times, two tails 36 times, and one of each 48 times. Is the coin fair?

Examples If we selected 20 people from a group that was 60% male and 40% female, we would expect to get 12 males and 8 females.  If we got 15 males and 5 females, would that indicate that our selection procedure was not random (i.e., discriminatory)?  What if we selected 100 people from the group and got 75 males and 25 females?

Null Hypothesis The null hypothesis specifies the probability (or proportion) for each category. Each probability is the probability that a random observation would fall into that category.

Null Hypothesis To test a die for fairness, the null hypothesis would be H 0 : p 1 = 1/6, p 2 = 1/6, …, p 6 = 1/6. The alternative hypothesis will always be a simple negation of H 0 : H 1 : At least one of the probabilities is not 1/6. or more simply, H 1 : H 0 is false.

Level of Significance Let  = 0.05. The test statistic will involve the expected counts.

Expected Counts To find the expected counts, we apply the hypothetical probabilities to the sample size. For example, if the hypothetical probabilities are 1/6 and the sample size is 60, then the expected counts are (1/6)  60 = 10.

Example The test statistic will be the  2 statistic. Make a chart showing both the observed and expected counts (in parentheses). 123456 8 (10) 10 (10) 14 (10) 12 (10) 9 (10) 7 (10)

The Chi-Square Statistic Denote the observed counts by O and the expected counts by E. Define the chi-square (  2 ) statistic to be

The Chi-Square Statistic Clearly, if all of the deviations O – E are small, then  2 will be small. But if even a few the deviations O – E are large, then  2 will be large.

The Value of the Test Statistic Now calculate  2.

Compute the p-Value To compute the p-value of the test statistic, we need to know more about the distribution of  2.

Chi-Square Degrees of Freedom The chi-square distribution has an associated degrees of freedom, just like the t distribution. Each chi-square distribution has a slightly different shape, depending on the number of degrees of freedom. In this test, df is one less than the number of cells.

Chi-Square Degrees of Freedom

 2 (2)

Chi-Square Degrees of Freedom  2 (2)  2 (5)

Chi-Square Degrees of Freedom  2 (2)  2 (5)  2 (10)

Properties of  2 The chi-square distribution with df degrees of freedom has the following properties.   2  0.  It is unimodal.  It is skewed right (not symmetric!)    2 = df.    2 =  (2df).

Properties of  2 If df is large, then  2 (df) is approximately normal with mean df and standard deviation  (2df).

Chi-Square vs. Normal

 2 (128)

Chi-Square vs. Normal  2 (128) N(128, 16)

TI-83 – Chi-Square Probabilities To find a chi-square probability (p-value) on the TI-83,  Press DISTR.  Select  2 cdf (item #7).  Press ENTER.  Enter the lower endpoint, the upper endpoint, and the degrees of freedom.  Press ENTER.  The probability appears.

Computing the p-value The number of degrees of freedom is 1 less than the number of categories in the table. In this example, df = 5. To find the p-value, use the TI-83 to calculate the probability that  2 (5) would be at least as large as 3.4. p-value =  2 cdf(3.4, E99, 5) = 0.6386. Therefore, p-value = 0.6386 (accept H 0 ).

Test of Goodness of Fit Lecture 41 Section 14.1 – 14.3 Wed, Nov 14, 2007.

Similar presentations

Presentation on theme: "Test of Goodness of Fit Lecture 41 Section 14.1 – 14.3 Wed, Nov 14, 2007."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Test of Goodness of Fit Lecture 41 Section 14.1 – 14.3 Wed, Nov 14, 2007.

Similar presentations

Presentation on theme: "Test of Goodness of Fit Lecture 41 Section 14.1 – 14.3 Wed, Nov 14, 2007."— Presentation transcript:

Similar presentations

About project

Feedback