
1 Chi-squared Tests

2 We want to test the “goodness of fit” of a particular theoretical distribution to an observed distribution. The procedure is:

1. Set up the null and alternative hypotheses and select the significance level.
2. Draw a random sample of observations from a population or process.
3. Derive expected frequencies under the assumption that the null hypothesis is true.
4. Compare the observed frequencies and the expected frequencies.
5. If the discrepancy between the observed and expected frequencies is too great to attribute to chance fluctuations at the selected significance level, reject the null hypothesis.
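The five steps above can be sketched in a few lines of Python. The fair-die data below are a hypothetical illustration (not from the slides), and the table cut-off value still has to be looked up by hand:

```python
def chi_squared_gof(observed, expected, cutoff):
    """Chi-squared goodness-of-fit: return the statistic and the decision."""
    stat = sum((fo - ft) ** 2 / ft for fo, ft in zip(observed, expected))
    return stat, stat > cutoff  # reject H0 when the statistic exceeds the cut-off

# Hypothetical example: 120 rolls of a die, H0 = all faces equally likely,
# so we expect 20 of each face. Cut-off 11.070 is the 5% value for 5 dof.
rolls = [18, 22, 16, 25, 24, 15]
stat, reject = chi_squared_gof(rolls, [20] * 6, cutoff=11.070)
print(round(stat, 2), reject)  # 4.5 False -> no evidence the die is unfair
```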

3 Example 1: Five brands of coffee are taste-tested by 1000 people, with the results below. Test at the 5% level the hypothesis that, in the general population, there is no difference in the proportions preferring each brand (i.e., H0: p_A = p_B = p_C = p_D = p_E versus H1: not all the proportions are the same).

Brand    Observed frequency f_o
A         210
B         312
C         170
D          85
E         223
Total    1000

4 If all the proportions were the same, we’d expect about 200 people in each group, since we have a total of 1000 people.

Brand     f_o     f_t
A         210     200
B         312     200
C         170     200
D          85     200
E         223     200
Total    1000    1000

5 We next compute the differences between the observed and theoretical frequencies.

Brand     f_o     f_t    f_o - f_t
A         210     200        10
B         312     200       112
C         170     200       -30
D          85     200      -115
E         223     200        23
Total    1000    1000

6 Then we square each of those differences.

Brand     f_o     f_t    f_o - f_t    (f_o - f_t)²
A         210     200        10            100
B         312     200       112          12544
C         170     200       -30            900
D          85     200      -115          13225
E         223     200        23            529
Total    1000    1000

7 Then we divide each of the squares by the expected frequency and add the quotients. The resulting statistic has a chi-squared (χ²) distribution.

Brand     f_o     f_t    f_o - f_t    (f_o - f_t)²    (f_o - f_t)²/f_t
A         210     200        10            100             0.500
B         312     200       112          12544            62.720
C         170     200       -30            900             4.500
D          85     200      -115          13225            66.125
E         223     200        23            529             2.645
Total    1000    1000                                 χ² = 136.49
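As a check, the whole slide-by-slide calculation above fits in a few lines of Python:

```python
# Coffee taste test: observed preferences for brands A-E among 1000 people.
observed = [210, 312, 170, 85, 223]
n = sum(observed)                 # 1000 tasters
f_t = n / len(observed)           # 200 expected per brand under H0
chi2 = sum((f_o - f_t) ** 2 / f_t for f_o in observed)
print(round(chi2, 2))             # 136.49
```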

8 The chi-squared (χ²) distribution is skewed to the right. (i.e.: Its density f(χ²) has the bump on the left and the tail on the right.)

9 In these goodness-of-fit problems, the number of degrees of freedom is

dof = (number of categories) − (number of restrictions) − (number of parameters estimated).

In the current problem, we have 5 categories (the 5 brands). We have 1 restriction: when we determined our expected frequencies, we restricted our numbers so that the total would be the same as the total for the observed frequencies (1000). We didn’t estimate any parameters in this particular problem. So dof = 5 – 1 – 0 = 4.

10 Large values of the χ² statistic indicate big discrepancies between the observed and theoretical frequencies. So when the χ² statistic is large, we reject the hypothesis that the theoretical distribution is a good fit. That means the critical region consists of the large values: the right tail of f(χ²).

11 f(  2 ) From the  2 table, we see that for a 5% test with 4 degrees of freedom, the cut-off point is 9.488. In the current problem, our  2 statistic had a value of 136.49. So we reject the null hypothesis and conclude that the proportions preferring each brand were not the same. acceptance region crit. reg. 0.05 9.488136.49

12 Example 2: A diagnostic test of mathematics is given to a group of 1000 students. The administrator analyzing the results wants to know if the scores of this group differ significantly from those of the past. Test at the 10% level.

Grade     Historical rel. freq.    Current obs. freq. f_o
90-100    0.10                       50
80-89     0.20                      100
70-79     0.40                      500
60-69     0.20                      200
<60       0.10                      150
Total     1.00                     1000

14 Based on the historical relative frequencies, we determine the expected absolute frequencies, restricting their total to equal the total for the current observed frequencies.

Grade     Hist. rel. freq.    f_t     f_o
90-100    0.10                100      50
80-89     0.20                200     100
70-79     0.40                400     500
60-69     0.20                200     200
<60       0.10                100     150
Total     1.00               1000    1000

15 We subtract the theoretical frequency from the observed frequency.

Grade     Hist. rel. freq.    f_t     f_o    f_o - f_t
90-100    0.10                100      50       -50
80-89     0.20                200     100      -100
70-79     0.40                400     500       100
60-69     0.20                200     200         0
<60       0.10                100     150        50
Total     1.00               1000    1000

16 We square those differences.

Grade     Hist. rel. freq.    f_t     f_o    f_o - f_t    (f_o - f_t)²
90-100    0.10                100      50       -50           2500
80-89     0.20                200     100      -100         10,000
70-79     0.40                400     500       100         10,000
60-69     0.20                200     200         0              0
<60       0.10                100     150        50           2500
Total     1.00               1000    1000

17 We divide each square by the theoretical frequency and sum up.

Grade     Hist. rel. freq.    f_t     f_o    f_o - f_t    (f_o - f_t)²    (f_o - f_t)²/f_t
90-100    0.10                100      50       -50           2500              25
80-89     0.20                200     100      -100         10,000              50
70-79     0.40                400     500       100         10,000              25
60-69     0.20                200     200         0              0               0
<60       0.10                100     150        50           2500              25
Total     1.00               1000    1000                                 χ² = 125

18 We have 5 categories (the 5 grade groups). We have 1 restriction. We restricted our expected frequencies so that the total would be the same total as for the observed frequencies (1000). We didn’t estimate any parameters in this particular problem. So dof = 5 – 1 – 0 = 4.

19 f(  2 ) From the  2 table, we see that for a 10% test with 4 degrees of freedom, the cut-off point is 7.779. In the current problem, our  2 statistic had a value of 125. So we reject the null hypothesis and conclude that the grade distribution is NOT the same as it was historically. acceptance region crit. reg. 0.10 7.779125

20 Example 3: Test at the 5% level whether the demand for a particular product, as listed below, has a Poisson distribution.

Units demanded per day x    Observed # of days f_o
0                             11
1                             28
2                             43
3                             47
4                             32
5                             28
6                              7
7                              0
8                              2
9                              1
10                             1
Total                        200

21 Multiplying the number of days on which each amount was sold by the amount sold on that day, and then adding those products, we find that the total number of units sold on the 200 days is 600. So the mean number of units sold per day is 600/200 = 3.

x        f_o    x·f_o
0         11       0
1         28      28
2         43      86
3         47     141
4         32     128
5         28     140
6          7      42
7          0       0
8          2      16
9          1       9
10         1      10
Total    200     600

22 We use the 3 as the estimated mean for the Poisson distribution. Then, using the Poisson table, we determine the probability f(x) for each x value.

x        f_o    x·f_o    f(x)
0         11       0     0.050
1         28      28     0.149
2         43      86     0.224
3         47     141     0.224
4         32     128     0.168
5         28     140     0.101
6          7      42     0.050
7          0       0     0.022
8          2      16     0.008
9          1       9     0.003
10         1      10     0.001
Total    200     600     1.000

23 Then we multiply the probabilities by 200 to compute f_t, the expected number of days on which each number of units would be sold. By multiplying by 200, we restrict the f_t total to be the same as the f_o total.

x        f_o    f(x)      f_t
0         11    0.050    10.0
1         28    0.149    29.8
2         43    0.224    44.8
3         47    0.224    44.8
4         32    0.168    33.6
5         28    0.101    20.2
6          7    0.050    10.0
7          0    0.022     4.4
8          2    0.008     1.6
9          1    0.003     0.6
10         1    0.001     0.2
Total    200    1.000   200.0

24 When the f_t values are small (less than 5), the test is not reliable. So we group small f_t values. In this example, we group the last 4 categories into a single “7 or more” cell (f_o = 0 + 2 + 1 + 1 = 4, f_t = 4.4 + 1.6 + 0.6 + 0.2 = 6.8).

x            f_o     f_t
0             11    10.0
1             28    29.8
2             43    44.8
3             47    44.8
4             32    33.6
5             28    20.2
6              7    10.0
7 or more      4     6.8
Total        200   200.0

25 Next we subtract the theoretical frequencies f_t from the observed frequencies f_o.

x            f_o     f_t    f_o - f_t
0             11    10.0       1.0
1             28    29.8      -1.8
2             43    44.8      -1.8
3             47    44.8       2.2
4             32    33.6      -1.6
5             28    20.2       7.8
6              7    10.0      -3.0
7 or more      4     6.8      -2.8
Total        200   200.0

26 Then we square the differences …

x            f_o     f_t    f_o - f_t    (f_o - f_t)²
0             11    10.0       1.0           1.00
1             28    29.8      -1.8           3.24
2             43    44.8      -1.8           3.24
3             47    44.8       2.2           4.84
4             32    33.6      -1.6           2.56
5             28    20.2       7.8          60.84
6              7    10.0      -3.0           9.00
7 or more      4     6.8      -2.8           7.84
Total        200   200.0

27 … divide by the theoretical frequencies, and sum up.

x            f_o     f_t    f_o - f_t    (f_o - f_t)²    (f_o - f_t)²/f_t
0             11    10.0       1.0           1.00              0.10
1             28    29.8      -1.8           3.24              0.11
2             43    44.8      -1.8           3.24              0.07
3             47    44.8       2.2           4.84              0.11
4             32    33.6      -1.6           2.56              0.08
5             28    20.2       7.8          60.84              3.01
6              7    10.0      -3.0           9.00              0.90
7 or more      4     6.8      -2.8           7.84              1.15
Total        200   200.0                                  χ² = 5.53

28 We have 8 categories (after grouping the small ones). We have 1 restriction. We restricted our expected frequencies so that the total would be the same total as for the observed frequencies (200). We estimated 1 parameter, the mean for the Poisson distribution. So dof = 8 – 1 – 1 = 6.

29 f(  2 ) From the  2 table, we see that for a 5% test with 6 degrees of freedom, the cut-off point is 12.592. In the current problem, our  2 statistic had a value of 5.53. So we accept the null hypothesis that the Poisson distribution is a reasonable fit for the product demand. acceptance region crit. reg. 0.05 12.5925.53

30 Example 4: Test at the 10% level whether the following exam grades are from a normal distribution. Note: This is a very long problem.

Grade interval    f_o
[50,60)            14
[60,70)            18
[70,80)            36
[80,90)            18
[90,100]           14
Total             100

31 If the distribution is normal, we need to estimate its mean and standard deviation.

32 To estimate the mean, we first determine the midpoint X of each grade interval.

Grade interval    Midpoint X    f_o
[50,60)            55            14
[60,70)            65            18
[70,80)            75            36
[80,90)            85            18
[90,100]           95            14
Total                           100

33 We then multiply these midpoints by the observed frequencies of the intervals, add the products, and divide the sum by the number of observations. The resulting mean is 7500/100 = 75.

Grade interval     X    f_o    X·f_o
[50,60)           55     14      770
[60,70)           65     18     1170
[70,80)           75     36     2700
[80,90)           85     18     1530
[90,100]          95     14     1330
Total                   100     7500

34 Next we need to calculate the standard deviation. We begin by subtracting the mean of 75 from each midpoint and squaring the differences.

Grade interval     X    f_o    X - 75    (X - 75)²
[50,60)           55     14      -20        400
[60,70)           65     18      -10        100
[70,80)           75     36        0          0
[80,90)           85     18       10        100
[90,100]          95     14       20        400
Total                   100

35 We multiply the squares by the observed frequencies and sum up. Dividing by n – 1 = 99, the sample variance is s² = 14,800/99 = 149.49495. The square root is the sample standard deviation s = 12.2268.

Grade interval     X    f_o    X - 75    (X - 75)²    f_o(X - 75)²
[50,60)           55     14      -20        400           5600
[60,70)           65     18      -10        100           1800
[70,80)           75     36        0          0              0
[80,90)           85     18       10        100           1800
[90,100]          95     14       20        400           5600
Total                   100                             14,800
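The grouped mean and sample standard deviation can be checked in a few lines:

```python
import math

midpoints = [55, 65, 75, 85, 95]     # interval midpoints X
f_o       = [14, 18, 36, 18, 14]     # observed frequencies
n = sum(f_o)                         # 100 grades

mean = sum(x * f for x, f in zip(midpoints, f_o)) / n
ss   = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, f_o))
s    = math.sqrt(ss / (n - 1))       # divide by n - 1 = 99 for the sample variance

print(mean, round(s, 4))             # 75.0 12.2268
```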

36 We will use the 75 and the 12.2268 as the mean μ and the standard deviation σ of our proposed normal distribution. We now need to determine what the expected frequencies would be if the grades were from that normal distribution.

37 Start with our lowest grade category, under 60. The Z value for a grade of 60 is Z = (60 − 75)/12.2268 ≈ −1.23. From the normal table, the area between Z = 0 and Z = 1.23 is 0.3907, so Pr(Z < −1.23) = 0.5 − 0.3907 = 0.1093. We then expect that 10.93% of our 100 observations, or about 11 grades, would be in the lowest grade category. So 11 will be one of our f_t values. We need to do similar calculations for our other grade categories.

38 The next grade category is [60,70). The Z value for a grade of 70 is (70 − 75)/12.2268 ≈ −0.41, and the table area between Z = 0 and Z = 0.41 is 0.1591. The area between Z = −1.23 and Z = −0.41 is therefore 0.3907 − 0.1591 = 0.2316. So 23.16% of our 100 observations, or about 23 grades, are expected to be in that grade category.

39 The next grade category is [70,80), which lies between Z = −0.41 and Z = 0.41. That area is 2 × 0.1591 = 0.3182. So 31.82% of our 100 observations, or about 32 grades, are expected to be in that grade category.

40 The next grade category is [80,90), which lies between Z = 0.41 and Z = 1.23. That area is 0.3907 − 0.1591 = 0.2316. So 23.16% of our 100 observations, or about 23 grades, are expected to be in that grade category.

41 The highest grade category is 90 and over, which lies beyond Z = 1.23. That area is 0.5 − 0.3907 = 0.1093. So 10.93% of our 100 observations, or about 11 grades, are expected to be in that grade category.

42 Now we can finally compute our χ² statistic. We put in the observed frequencies that we were given and the theoretical frequencies that we just calculated.

Grade category    f_o    f_t
under 60           14     11
[60,70)            18     23
[70,80)            36     32
[80,90)            18     23
90 and up          14     11

43 We subtract the theoretical frequencies from the observed frequencies, square the differences, divide by the theoretical frequencies, and sum up. The resulting χ² statistic is 4.3104.

Grade category    f_o    f_t    (f_o - f_t)²/f_t
under 60           14     11        0.8182
[60,70)            18     23        1.0870
[70,80)            36     32        0.5000
[80,90)            18     23        1.0870
90 and up          14     11        0.8182
                               χ² = 4.3104
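The expected frequencies and the statistic can be reproduced with the standard library’s error function. Because this sketch uses the exact normal CDF instead of the rounded table values, the statistic comes out near 4.49 rather than 4.31, but it still falls short of the cut-off:

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mean, s, n = 75.0, 12.2268, 100
cuts = [60, 70, 80, 90]                      # interior grade boundaries
z = [(c - mean) / s for c in cuts]

# Cell probabilities: under 60, [60,70), [70,80), [80,90), 90 and up
p = [phi(z[0])]
p += [phi(b) - phi(a) for a, b in zip(z, z[1:])]
p.append(1 - phi(z[-1]))

f_o = [14, 18, 36, 18, 14]
f_t = [n * q for q in p]                     # expected frequencies
chi2 = sum((o - e) ** 2 / e for o, e in zip(f_o, f_t))
print(round(chi2, 2))
```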

44 We have 5 categories (the 5 grade groups). We have 1 restriction: we restricted our expected frequencies so that the total would be the same as the total for the observed frequencies (100). We estimated two parameters, the mean and the standard deviation. So dof = 5 – 1 – 2 = 2.

45 f(  2 ) From the  2 table, we see that for a 10% test with 2 degrees of freedom, the cut-off point is 4.605. In the current problem, our  2 statistic had a value of 4.31. So we accept the null hypothesis that the normal distribution is a reasonable fit for the grades. acceptance region crit. reg. 0.10 4.6054.31

46 We can also use the χ² statistic to test whether two variables are independent of each other.

47 Example 5: Given the following frequencies for a sample of 10,000 households, test at the 1% level whether the number of phones and the number of cars for a household are independent of each other.

# of phones \ # of cars       0       1       2
0                          1000     900     100
1                          1500    2600     500
2 or more                   500    2500     400
(10,000 households in total)

48 We first compute the row and column totals,

# of phones \ # of cars       0       1       2    Row total
0                          1000     900     100       2000
1                          1500    2600     500       4600
2 or more                   500    2500     400       3400
Column total               3000    6000    1000     10,000

49 and the row and column percentages (marginal probabilities).

# of phones \ # of cars       0       1       2    Row total      %
0                          1000     900     100       2000     0.20
1                          1500    2600     500       4600     0.46
2 or more                   500    2500     400       3400     0.34
Column total               3000    6000    1000     10,000     1.00
%                          0.30    0.60    0.10       1.00

50 Recall that if 2 variables X and Y are independent of each other, then Pr(X=x and Y=y) = Pr(X=x) Pr(Y=y)

51 We can use our row and column percentages as marginal probabilities, and multiply to determine the probabilities and numbers of households we would expect to see in the body of the table if the numbers of phones and cars were independent of each other.

# of phones \ # of cars       0       1       2       %
0                                                   0.20
1                                                   0.46
2 or more                                           0.34
%                          0.30    0.60    0.10     1.00

52 First calculate the expected probability. For example, Pr(0 phones & 0 cars) = Pr(0 phones) · Pr(0 cars) = (0.20)(0.30) = 0.06. So we expect 6% of our 10,000 households, or 600 households, to have 0 phones and 0 cars.

# of phones \ # of cars       0       1       2       %
0                           600                     0.20
1                                                   0.46
2 or more                                           0.34
%                          0.30    0.60    0.10     1.00

53 Pr(0 phones & 1 car) = Pr(0 phones) · Pr(1 car) = (0.20)(0.60) = 0.12. So we expect 12% of our 10,000 households, or 1200 households, to have 0 phones and 1 car.

# of phones \ # of cars       0       1       2       %
0                           600    1200             0.20
1                                                   0.46
2 or more                                           0.34
%                          0.30    0.60    0.10     1.00

54 Pr(0 phones & 2 cars) = Pr(0 phones) · Pr(2 cars) = (0.20)(0.10) = 0.02. So we expect 2% of our 10,000 households, or 200 households, to have 0 phones and 2 cars.

# of phones \ # of cars       0       1       2       %
0                           600    1200     200     0.20
1                                                   0.46
2 or more                                           0.34
%                          0.30    0.60    0.10     1.00

55 Notice that when we add the 3 numbers that we just calculated, we get the same total for the row (2000) that we had observed. The row and column totals should be the same for the observed and expected tables.

# of phones \ # of cars       0       1       2    Row total      %
0                           600    1200     200       2000     0.20
1                                                              0.46
2 or more                                                      0.34
%                          0.30    0.60    0.10                1.00

56 Continuing, we get the following numbers for the 2nd and 3rd rows.

# of phones \ # of cars       0       1       2    Row total      %
0                           600    1200     200       2000     0.20
1                          1380    2760     460       4600     0.46
2 or more                  1020    2040     340       3400     0.34
%                          0.30    0.60    0.10                1.00

57 The column totals are the same as for the observed table.

# of phones \ # of cars       0       1       2    Row total      %
0                           600    1200     200       2000     0.20
1                          1380    2760     460       4600     0.46
2 or more                  1020    2040     340       3400     0.34
Column total               3000    6000    1000     10,000     1.00
%                          0.30    0.60    0.10                1.00

58 Now we set up the same type of table that we did for our earlier χ² goodness-of-fit tests. We put the observed frequencies in the f_o column and the expected frequencies that we calculated in the f_t column.

# of cars    # of phones     f_o     f_t
0            0              1000     600
0            1              1500    1380
0            2 or more       500    1020
1            0               900    1200
1            1              2600    2760
1            2 or more      2500    2040
2            0               100     200
2            1               500     460
2            2 or more       400     340

59 Then we subtract the theoretical frequencies from the observed frequencies, square the differences, divide by the theoretical frequencies, and sum to get our χ² statistic.

# of cars    # of phones     f_o     f_t    (f_o - f_t)²/f_t
0            0              1000     600        266.67
0            1              1500    1380         10.43
0            2 or more       500    1020        265.10
1            0               900    1200         75.00
1            1              2600    2760          9.28
1            2 or more      2500    2040        103.73
2            0               100     200         50.00
2            1               500     460          3.48
2            2 or more       400     340         10.59
                                           χ² = 794.28
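The whole test of independence condenses to a short loop: each expected cell count is the product of its row and column totals divided by the grand total. With unrounded per-cell quotients the statistic comes out as 794.27 rather than the 794.28 obtained by summing rounded quotients.

```python
# Phones (rows) x cars (columns): observed counts for 10,000 households.
observed = [[1000,  900, 100],
            [1500, 2600, 500],
            [ 500, 2500, 400]]
n = sum(sum(row) for row in observed)
row_tot = [sum(row) for row in observed]
col_tot = [sum(col) for col in zip(*observed)]

chi2 = 0.0
for i, row in enumerate(observed):
    for j, f_o in enumerate(row):
        f_t = row_tot[i] * col_tot[j] / n      # expected count under independence
        chi2 += (f_o - f_t) ** 2 / f_t

dof = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi2, 2), dof)                     # 794.27 4
```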

60 In these tests of independence, the number of degrees of freedom is dof = (r – 1)(c – 1), where r is the number of rows and c is the number of columns in the table. In our example, we have 3 rows and 3 columns. So dof = (3 – 1)(3 – 1) = (2)(2) = 4.

61 f(  2 ) From the  2 table, we see that for a 1% test with 4 degrees of freedom, the cut-off point is 13.277. In the current problem, our  2 statistic had a value of 794.28. So we reject the null hypothesis and conclude that the number of phones and the number of cars in a household are not independent. acceptance region crit. reg. 0.01 13.277794.28

62 Yates Correction: In testing for independence in 2×2 tables, the chi-squared statistic has only (r – 1)(c – 1) = 1 degree of freedom. In these cases, it is often recommended that the value of the statistic be “corrected” so that its discrete distribution will be better approximated by the continuous chi-squared distribution.
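A minimal sketch of the correction on a hypothetical 2×2 table (the counts below are illustrative, not from the slides): each absolute deviation |f_o − f_t| is reduced by 0.5 before squaring.

```python
# Hypothetical 2x2 table of observed counts.
observed = [[20, 30],
            [25, 25]]
n = sum(sum(row) for row in observed)
row_tot = [sum(row) for row in observed]
col_tot = [sum(col) for col in zip(*observed)]

chi2_yates = 0.0
for i in range(2):
    for j in range(2):
        f_t = row_tot[i] * col_tot[j] / n
        # Yates: shrink each absolute deviation by 0.5 before squaring.
        chi2_yates += (abs(observed[i][j] - f_t) - 0.5) ** 2 / f_t
print(round(chi2_yates, 3))
```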

63 The Hypothesis Test for the Variance or Standard Deviation

This test is another one that uses the chi-squared distribution.

64 Sometimes it is important to know the variance or standard deviation of a variable. For example, medication often needs to be extremely close to the specified dosage. If the dosage is too low, the medication may be ineffective and a patient may die from inadequate treatment. If the dosage is too high, the patient may die from an overdose. So you may want to make sure that the variance is a very small amount.

65 If the data are normally distributed, the chi-squared test for the variance or standard deviation is appropriate. The statistic is

χ² = (n – 1)s²/σ²,

where n is the sample size, s² is the sample variance, and σ² is the hypothesized population variance. The number of degrees of freedom is n – 1.

66 Example: Suppose you want to test at the 5% level whether the population standard deviation for a particular medication is 0.5 mg. Based on a sample of 25 capsules, you determine the sample standard deviation to be 0.6 mg. Perform the test. The test statistic is χ² = (25 – 1)(0.6)²/(0.5)² = 34.56. Now we need to determine the critical region for the test.

67 Because the chi-squared distribution is not symmetric, you need to look up the two critical values for a two-tailed test separately. For a 5% test with 24 degrees of freedom, they are 12.401 and 39.364: you can find the two numbers either by looking under “Cumulative Probabilities” 0.025 and 1 – 0.025 = 0.975, or under “Upper-Tail Areas” 0.975 and 0.025. Recall that the value of the test statistic was 34.56, which lies in the acceptance region between the two critical values. So we cannot rule out the null hypothesis, and we conclude that the data are consistent with a population standard deviation of 0.5 mg.
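The whole variance test fits in a few lines; the two critical values are the ones read from the table above.

```python
# Chi-squared test for a standard deviation: H0: sigma = 0.5 mg.
n, s, sigma0 = 25, 0.6, 0.5
chi2 = (n - 1) * s ** 2 / sigma0 ** 2
print(round(chi2, 2))            # 34.56

lo, hi = 12.401, 39.364          # 2.5% cut-offs, 24 dof, from the chi-squared table
print(lo < chi2 < hi)            # True -> cannot reject H0
```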

