### Similar presentations

15- 2 Chapter Fifteen Nonparametric Methods: Chi-Square Applications GOALS When you have completed this chapter, you will be able to: ONE List the characteristics of the Chi-square distribution. TWO Conduct a test of hypothesis comparing an observed set of frequencies to an expected set of frequencies. THREE Conduct a hypothesis test to determine whether two classification criteria are related. Goals

15- 3 Characteristics of the Chi-Square Distribution The major characteristics of the chi-square distribution are: It is positively skewed It is non-negative There is a family of chi-square distributions Chi-Square Applications

15- 4 df = 3 df = 5 df = 10   2 distribution

15- 5 Goodness-of-Fit Test: Equal Expected Frequencies Let f 0 and f e be the observed and expected frequencies respectively. H 0 : There is no difference between the observed and expected frequencies. H 1 : There is a difference between the observed and the expected frequencies. The test statistic is: The critical value is a chi-square value with (k-1) degrees of freedom, where k is the number of categories

15- 6 Example 1 continued Day of WeekNumber Absent Monday120 Tuesday45 Wednesday60 Thursday90 Friday130 Total445 The following information shows the number of employees absent by day of the week at a large a manufacturing plant. At the.01 level of significance, is there a difference in the absence rate by day of the week?

15- 7 H 0 : There is no difference between the observed and expected frequencies. H 1 : There is a difference between the observed and the expected frequencies This is given in the problem as.01. It is the chi-square distribution. Example 1 continued Step 1 Step 1: State the null and alternate hypotheses Step 2 Step 2: Select the level of significance. Step 3 Step 3: Select the test statistic.

15- 8 EXAMPLE 1 continued Assume equal expected frequency as given in the problem f e = (120+45+60+90+130)/5=89 The degrees of freedom: (5-1)=4 The critical value of  2 is 13.28. Reject the null and accept the alternate if Computed  2 > 13.28 or p<.01 Step 4 Step 4: Formulate the decision rule.

15- 9 Example 1 continued Day Frequency Expected (f o – f e ) 2 /f e Monday120 89 10.80 Tuesday 45 89 21.75 Wednesday 60 89 9.45 Thursday 90 89 0.01 Friday 130 89 18.89 Total 445445 60.90 Step Five: Step Five: Compute the value of chi-square and make a decision. The p(  2 > 60.9) =.000000000001877 or essentially 0.

15- 10 Example 1 continued We conclude that there is a difference in the number of workers absent by day of the week. Because the computed value of chi-square, 60.90, is greater than the critical value, 13.28, the p of.000000000001877 <.01, H 0 is rejected.

15- 11 Example 2 The U.S. Bureau of the Census indicated that 63.9% of the population is married, 7.7% widowed, 6.9% divorced (and not re- married), and 21.5% single (never been married). A sample of 500 adults from the Philadelphia area showed that 310 were married, 40 widowed, 30 divorced, and 120 single. At the.02 significance level can we conclude that the Philadelphia area is different from the U.S. as a whole? Goodness-of-fit Test Goodness-of-fit Test : Unequal Expected Frequencies

15- 12 Example 2 continued Step 4: H 0 is rejected if  2 >9.837, df=3, or if p  of.02 Step 1: H 0 : The distribution has not changed H 1 : The distribution has changed. Step 2: The significance level given is.02. Step 3: The test statistic is the chi-square.

15- 13 Calculate the expected frequencies Married: (.639)500 = 319.5 Widowed: (.077)500 = 38.5 Divorced: (.069)500 = 34.5 Single: (.215)500 = 107.5 Example 2 continued Calculate chi-square values.

15- 14 Step 5:  2 = 2.3814, p(  2 > 2.3814) =.497. Example 2 continued The null hypothesis is not rejected. The distribution regarding marital status in Philadelphia is not different from the rest of the United States.

15- 15 Contingency Table Analysis contingency table A contingency table is used to investigate whether two traits or characteristics are related. Chi-square can be used to test for a relationship between two nominal scaled variables, where one variable is independent of the other. Contingency Table Analysis

15- 16  Each observation is classified according to two criteria.  We use the usual hypothesis testing procedure. degrees of freedom  The degrees of freedom are equal to: (number of rows-1)(number of columns-1). expected frequency  The expected frequency is computed as: Expected Frequency = (row total)(column total) grand total Contingency Table Analysis Contingency table analysis

15- 17 Is there a relationship between the location of an accident and the gender of the person involved in the accident? A sample of 150 accidents reported to the police were classified by type and gender. At the.05 level of significance, can we conclude that gender and the location of the accident are related? Example 3 Contingency Table Analysis

15- 18 Step 4: The degrees of freedom equal (r-1)(c-1) or 2. The critical  2 at 2 d.f. is 9.21. If computed  2 >9.21, or if p <.01, reject the null and accept the alternate. Step 5: A data table and the following contingency table are constructed. Step 1: H 0 : Gender and location are not related. H 1 : Gender and location are related. Step 2: The level of significance is set at.01. Step 3: the test statistic is the chi-square distribution.

15- 19 Example 3 continued The expected frequency for the work-male intersection is computed as (90)(80)/150=48. Similarly, you can compute the expected frequencies for the other cells. Observed frequencies (f o ) GenderWorkHomeOtherTotal Male60201090 Female20301060 Total805020150

15- 20 Expected frequencies (f e ) GenderWorkHomeOtherTotal Male(80)(90) 150 = 48 (50)(90) 150 =30 (20)(90) 150 =12 90 Female(80)(60) 150 =32 (50)(60) 150 =20 (20)(60) 150 =8 60 Total805020150 Example 3 continued

15- 21  2 : (f o – f e ) 2 / f e GenderWorkHomeOther Total  2 Male(60-48) 2 48 (20-30) 2 30 (10-12) 2 12 6.667 Female(20-32) 2 32 (30-20) 2 20 (12-10) 2 10 10.000 Total16.667 Example 3 continued

15- 22 The p(  2 > 16.667) =.00024. Since the  2 of 16.667 > 9.21, p of.00024 <.01, reject the null and conclude that there is a relationship between the location of an accident and the gender of the person involved. Example 3 concluded