CHAPTER 5 5.1 INTRODUCTORY CHI-SQUARE TEST


1 CHAPTER 5 5.1 INTRODUCTORY CHI-SQUARE TEST
Objectives: to study the methods of analyzing categorical data. Three chi-square tests are covered:
- Goodness-of-fit test: to test the assumption that a variable follows a certain distribution.
- Independence test: to test whether two variables are dependent on one another.
- Homogeneity test: to test whether several populations are homogeneous with respect to a variable.

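2 Definition of Chi Square
A measure of the difference between the observed and expected frequencies is supplied by the statistic chi-square, $\chi^2$:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ is the observed frequency and $E_i$ is the expected frequency of category $i$, and $k$ is the number of categories.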
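
The definition can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `chi_square_statistic` and the three-category counts are hypothetical and chosen only to show the arithmetic.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustrative only).
observed = [18, 22, 20]
expected = [20, 20, 20]

statistic = chi_square_statistic(observed, expected)
print(round(statistic, 3))
```

A statistic of zero would mean the observed counts match the expected counts exactly; larger values indicate a larger discrepancy.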

3 Goodness-of-fit test:
- In the goodness-of-fit test, chi-square analysis is applied:
  i. to examine whether sample data could have been drawn from a population having a specific probability distribution;
  ii. to compare an observed distribution to an expected distribution.

4 - In the goodness-of-fit test, the test procedure is appropriate when the following conditions are met:
  i. The sampling method is simple random sampling.
  ii. The population is at least 10 times as large as the sample.
  iii. The variable under study is categorical.
  iv. The expected value for each level of the variable is at least 5.
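
Conditions ii and iv are numerical and can be checked before running the test. The sketch below assumes hypothetical sample sizes, population sizes and proportions; the function name `check_size_conditions` is illustrative. Conditions i and iii concern the study design and cannot be verified from the counts alone.

```python
def check_size_conditions(sample_size, population_size, proportions):
    """Check the two numerical conditions of the goodness-of-fit test."""
    expected = [sample_size * p for p in proportions]
    return {
        # Condition ii: population at least 10 times the sample.
        "population_at_least_10x_sample": population_size >= 10 * sample_size,
        # Condition iv: every expected frequency is at least 5.
        "all_expected_at_least_5": min(expected) >= 5,
    }


# Hypothetical case that satisfies both conditions.
ok_case = check_size_conditions(100, 5000, [0.5, 0.3, 0.2])

# Hypothetical case where one expected frequency is only 20 * 0.1 = 2.
small_case = check_size_conditions(20, 5000, [0.9, 0.1])

print(ok_case)
print(small_case)
```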

5 - Test procedure to run the goodness-of-fit test:
1. State the null hypothesis $H_0$ and the alternative hypothesis $H_1$.
2. Determine:
   i. the level of significance, $\alpha$;
   ii. the degrees of freedom, $df = k - 1$, where $k$ is the number of categories.

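6 3. Find the critical value $\chi^2_{\alpha, df}$ from the table of the chi-square distribution.
4. Calculate the value of the test statistic
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ is the observed frequency of category $i$ and $E_i = n p_i$ is its expected frequency ($n$ is the sample size and $p_i$ the probability of category $i$ under $H_0$):
Category    1       2       …   k
Frequency   $O_1$   $O_2$   …   $O_k$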

7 5. Determine the rejection region:
   i. critical value approach: reject $H_0$ if $\chi^2 > \chi^2_{\alpha, df}$;
   ii. p-value approach: reject $H_0$ if p-value $< \alpha$.
6. Make a decision.
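
For the p-value approach, the upper-tail probability of the chi-square distribution is needed. The sketch below computes it with the Python standard library only, using a standard recurrence for the regularized upper incomplete gamma function. The names `chi2_sf` and `decide` are illustrative and do not come from the slides; statistical libraries such as SciPy provide an equivalent function (`scipy.stats.chi2.sf`).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom.

    Starts from the closed forms for df = 1 (erfc) or df = 2 (exponential)
    and steps df up by 2 with the recurrence
    Q(df + 2) = Q(df) + (x/2)^(df/2) * exp(-x/2) / Gamma(df/2 + 1).
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(z)), 1
    else:
        q, k = math.exp(-z), 2
    while k < df:
        q += z ** (k / 2.0) * math.exp(-z) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def decide(statistic, df, alpha):
    """Return (p_value, reject_h0) for the p-value approach."""
    p_value = chi2_sf(statistic, df)
    return p_value, p_value < alpha


# Hypothetical test statistic, for illustration only.
p_value, reject_h0 = decide(9.2, 2, 0.05)
print(p_value, reject_h0)
```

The two approaches always agree: the statistic exceeds the critical value exactly when the p-value falls below $\alpha$.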

8 Example 5.1: The authority claims that the proportions of road accidents occurring in this country according to the categories User Attitude (A), Mechanical Fault (M), Insufficient Sign Board (I) and Fate (F) are 60%, 20%, 15% and 5% respectively. A study by an independent body shows the following data:
Category    A     M    I    F   Total
Frequency   130   35   30   5   200
Can we accept the claim at significance level $\alpha$?
Solution:
1. $H_0$: $p_A = 0.60$, $p_M = 0.20$, $p_I = 0.15$, $p_F = 0.05$.
   $H_1$: at least one proportion differs from the claimed value.

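9 2. Level of significance $\alpha$; degrees of freedom $df = k - 1 = 4 - 1 = 3$.
3. Test statistic:
Category                   A       M       I       F       Total
Observed $O_i$             130     35      30      5       200
Expected $E_i = n p_i$     120     40      30      10      200
$(O_i - E_i)^2 / E_i$      0.833   0.625   0.000   2.500   3.958
$$\chi^2 = \sum_{i=1}^{4} \frac{(O_i - E_i)^2}{E_i} = 0.833 + 0.625 + 0.000 + 2.500 = 3.958$$
4. From the chi-square distribution table: critical value $\chi^2_{\alpha, 3}$.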

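10 5. Rejection region: reject $H_0$ if $\chi^2 > \chi^2_{\alpha, 3}$.
6. Since $\chi^2 = 3.958$ is below the critical value (for example, $\chi^2_{0.05, 3} = 7.815$), we fail to reject $H_0$ and conclude that we have no evidence to reject the claim.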
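
The arithmetic of Example 5.1 can be reproduced as follows. The counts and claimed proportions are taken from the example; the significance level is not legible in this transcript, so the sketch assumes $\alpha = 0.05$ (table value 7.815 for 3 degrees of freedom) purely for illustration.

```python
categories = ["A", "M", "I", "F"]
claimed = [0.60, 0.20, 0.15, 0.05]   # proportions claimed by the authority
observed = [130, 35, 30, 5]          # counts from the independent study

n = sum(observed)
expected = [n * p for p in claimed]
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)
df = len(categories) - 1

# Assumption: alpha = 0.05, for which the chi-square table gives 7.815 at df = 3.
CRITICAL_VALUE = 7.815
reject_h0 = statistic > CRITICAL_VALUE

for name, o, e, c in zip(categories, observed, expected, contributions):
    print(f"{name}: observed={o}, expected={e:.1f}, contribution={c:.3f}")
print(f"chi-square = {statistic:.3f}, df = {df}, reject H0: {reject_h0}")
```

Most of the statistic comes from category F, whose expected count (10) is the smallest.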

11 Exercise 5.1: The number of students playing truant in a school over 200 school days is shown below. If X is a random variable representing the number of students playing truant per day, test the hypothesis that X follows the Poisson distribution with mean 3 per day at significance level $\alpha$.
No. of truants   0    1    2    3    4    ≥5
No. of days      12   32   45   50   35   26
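
As a starting point for this exercise, the expected frequencies under a Poisson distribution with mean 3 can be computed as sketched below. One assumption is made here: the table lists six counts, so the last one is treated as the group "5 or more", which makes the probabilities sum to 1. The helper name `poisson_pmf` is illustrative.

```python
import math

MEAN = 3.0
# Counts for 0, 1, 2, 3, 4 truants; the last count is ASSUMED to be "5 or more".
observed_days = [12, 32, 45, 50, 35, 26]
n_days = sum(observed_days)


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


probabilities = [poisson_pmf(k, MEAN) for k in range(5)]
probabilities.append(1.0 - sum(probabilities))  # P(X >= 5), the pooled tail

expected_days = [n_days * p for p in probabilities]
for label, e in zip(["0", "1", "2", "3", "4", "5+"], expected_days):
    print(f"{label}: expected days = {e:.3f}")
```

Because the mean is specified in the hypothesis rather than estimated from the data, the degrees of freedom remain $k - 1$.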

12 Exercise 5.2: The probabilities of blood phenotypes A, B, AB and O in the population of all Caucasians in the US are 0.41, 0.10, 0.04 and 0.45 respectively. To determine whether or not the actual population proportions fit this set of reported probabilities, a random sample of 200 Americans was selected and their phenotypes were recorded. The observed cell counts are given below. Test the goodness of fit of these blood phenotype probabilities at significance level $\alpha$.
Blood phenotype   A    B    AB   O
Observed          89   18   12   81

13 The Chi-Square Test for Homogeneity
- The homogeneity test is used to determine whether several populations are similar (homogeneous) with respect to some characteristic.
- The test is applied to a single categorical variable measured in two or more different populations.

14 - The test procedure is appropriate when the following conditions are satisfied:
  i. For each population, the sampling method is simple random sampling.
  ii. Each population is at least 10 times as large as its sample.
  iii. The variable under study is categorical.
  iv. If the sample data are displayed in a contingency table (populations x category levels), the expected value for each cell of the table is at least 5.

15 Two-dimensional contingency table layout:
                                   Column variable
Row variable      Category B_1   Category B_2   …   Category B_c   Total
Category A_1      $O_{11}$       $O_{12}$       …   $O_{1c}$       $R_1$
Category A_2      $O_{21}$       $O_{22}$       …   $O_{2c}$       $R_2$
…                 …              …              …   …              …
Category A_r      $O_{r1}$       $O_{r2}$       …   $O_{rc}$       $R_r$
Total             $C_1$          $C_2$          …   $C_c$          $n$
- The table above is an r x c contingency table, where r denotes the number of categories of the row variable and c denotes the number of categories of the column variable.
- $O_{ij}$ is the observed frequency in cell (i, j).
- $R_i$ is the total frequency for row category i.
- $C_j$ is the total frequency for column category j.
- $n$ is the grand total frequency over all cells (i, j), where $n = \sum_{i=1}^{r} R_i = \sum_{j=1}^{c} C_j$.

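16 Test procedure to run the chi-square test for homogeneity:
1. State the null hypothesis and the alternative hypothesis. E.g.:
   $H_0$: the proportions in each category are the same for all populations.
   $H_1$: the proportions are not the same for all populations.
2. Determine:
   i. the level of significance, $\alpha$;
   ii. the degrees of freedom, $df = (r - 1)(c - 1)$, where $r$ is the number of rows and $c$ is the number of columns.
3. Find the critical value $\chi^2_{\alpha, df}$ from the table of the chi-square distribution and determine the rejection region:
   i. critical value approach: reject $H_0$ if $\chi^2 > \chi^2_{\alpha, df}$;
   ii. p-value approach: reject $H_0$ if p-value $< \alpha$.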

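17 4. Calculate the value of $\chi^2$ using the formula below:
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{\text{grand total}}$$
where $O_{ij}$ is the observed frequency and $E_{ij}$ is the expected frequency in cell (i, j).
5. Make a decision.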
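
Steps 2 and 4 of this procedure can be sketched as one function that takes the observed table as a list of rows. The function name `chi_square_contingency` and the 2 x 2 counts are hypothetical; the expected frequencies follow the rule (row total x column total) / grand total.

```python
def chi_square_contingency(table):
    """Return (statistic, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected frequency of each cell from the row and column totals.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


# Hypothetical 2 x 2 table, for illustration only.
toy_table = [[10, 20], [20, 10]]
statistic, df, expected = chi_square_contingency(toy_table)
print(statistic, df, expected)
```

The same computation serves the test for independence, since the two tests differ only in how the hypotheses are stated.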

18 Example 5.2: Four machines manufacture cylindrical steel pins. The pins are subjected to a diameter specification. A pin may meet the specification or it may be too thin or too thick. Pins are sampled from each machine and the number of pins in each category is counted. The table below presents the results. Test at significance level $\alpha$ whether the proportions of pins in each category are the same for all machines.
            Too thin   OK    Too thick
Machine 1   10         102   8
Machine 2   34         161   5
Machine 3   12         79    9
Machine 4   10         60    10

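19 Solution: Construct a contingency table:
            Too thin   OK    Too thick   Total
Machine 1   10         102   8           120
Machine 2   34         161   5           200
Machine 3   12         79    9           100
Machine 4   10         60    10          80
Total       66         402   32          500
Calculation of the expected frequency: each expected frequency is (row total x column total) / grand total, e.g. $E_{11} = \dfrac{120 \times 66}{500} = 15.84$.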

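20 Testing procedure:
1. $H_0$: the proportions of pins that are too thin, OK, or too thick are the same for all machines.
   $H_1$: the proportions are not the same for all machines.
2. Level of significance $\alpha$; degrees of freedom $df = (r - 1)(c - 1) = (4 - 1)(3 - 1) = 6$.
3. From the table of chi-square: critical value $\chi^2_{\alpha, 6}$.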

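21 4. Using the observed and expected frequencies in the contingency table, we calculate $\chi^2$ using the formula given:
$$\chi^2 = \sum_{i=1}^{4} \sum_{j=1}^{3} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \approx 15.584$$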
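
The calculation for Example 5.2 can be checked with the short script below, which uses the counts from the example. Since the significance level is not legible in this transcript, the sketch reports a p-value instead of a decision; it relies on the closed form of the chi-square upper tail that holds for even degrees of freedom.

```python
import math

machines = ["Machine 1", "Machine 2", "Machine 3", "Machine 4"]
observed = [
    [10, 102, 8],
    [34, 161, 5],
    [12, 79, 9],
    [10, 60, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

expected = [[r * c / n for c in col_totals] for r in row_totals]
statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# For even df, P(X > x) = exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
z = statistic / 2.0
p_value = math.exp(-z) * sum(z ** j / math.factorial(j) for j in range(df // 2))

for name, exp_row in zip(machines, expected):
    print(name, [round(e, 2) for e in exp_row])
print(f"chi-square = {statistic:.3f}, df = {df}, p-value = {p_value:.4f}")
```

$H_0$ is rejected whenever the chosen $\alpha$ exceeds the printed p-value.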


23 Exercise 5.3: 200 female owners and 200 male owners of Proton cars are selected at random and the colours of their cars are noted. The following data show the results. Use a 1% significance level to test whether the proportions of colour preference are the same for females and males.
                  Car colour
Gender   Black   Dull   Bright
Male     40      110    50
Female   20      80     100

24 Chi-Square Test for Independence
- This test is applied to a single population which has two categorical variables.
- It is used to determine whether there is a significant association between the two variables.
- E.g.: In an election survey, voters might be classified by gender (female and male) and voting preference (Democrat, Republican or Independent). This test is used to determine whether gender is related to voting preference.

25 - The test is appropriate if the following conditions are met:
  i. The sampling method is simple random sampling.
  ii. The population is at least 10 times as large as the sample.
  iii. The variables under study are categorical.
  iv. If the sample data are displayed in a contingency table, the expected value for each cell of the table is at least 5.

26 - Note: The procedure for the chi-square test for independence is the same as for the chi-square test for homogeneity. The only difference between the two tests is in the statement of the null and alternative hypotheses. The rest of the procedure is the same for both tests. The test for independence is used to test the following hypotheses:
   $H_0$: the two variables are independent.
   $H_1$: the two variables are dependent.

27 Example 5.3: Insomnia is a disorder in which a person finds it hard to sleep at night. A study is conducted to determine whether the two attributes, smoking habit and insomnia, are dependent. The following data set was obtained. Use a 5% significance level to conduct the study.
                Insomnia
Habit           Yes   No
Non-smokers     10    70
Ex-smokers      8     32
Smokers         22    38

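28 Solution:
1. $H_0$: smoking habit and insomnia are independent.
   $H_1$: smoking habit and insomnia are dependent.
2. $\alpha = 0.05$; $df = (r - 1)(c - 1) = (3 - 1)(2 - 1) = 2$.
3. From the table of chi-square: $\chi^2_{0.05, 2} = 5.991$.
                Insomnia
Habit           Yes   No    Total
Non-smokers     10    70    80
Ex-smokers      8     32    40
Smokers         22    38    60
Total           40    140   180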

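29 4. Using the observed and expected frequencies in the contingency table, we calculate $\chi^2$ using the formula given:
$$\chi^2 = \sum_{i=1}^{3} \sum_{j=1}^{2} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \approx 11.732$$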

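30 5. Since $\chi^2 = 11.732 > \chi^2_{0.05, 2} = 5.991$, we reject $H_0$ and conclude that smoking habit and insomnia are dependent.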
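
The decision in Example 5.3 can be verified with both approaches. The sketch below uses the counts and the 5% level from the example and the table value 5.991 for 2 degrees of freedom; for $df = 2$ the upper-tail probability reduces to $e^{-\chi^2/2}$, so no statistical library is needed.

```python
import math

ALPHA = 0.05
CRITICAL_VALUE = 5.991  # chi-square table value for alpha = 0.05, df = 2

habits = ["Non-smokers", "Ex-smokers", "Smokers"]
observed = [[10, 70], [8, 32], [22, 38]]  # columns: insomnia yes, no

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

statistic = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # expected frequency of cell (i, j)
        statistic += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = math.exp(-statistic / 2.0)  # exact upper tail when df = 2

reject_by_critical_value = statistic > CRITICAL_VALUE
reject_by_p_value = p_value < ALPHA
print(f"chi-square = {statistic:.3f}, df = {df}, p-value = {p_value:.5f}")
print(reject_by_critical_value, reject_by_p_value)
```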

31 Exercise 5.4: A study is conducted to determine whether students' academic performance is independent of their participation in co-curricular activities. The following data set was obtained. Use a 5% significance level to conduct the study.
                            Academic performance
Co-curricular activities    Low   Fair   Good
Inactive                    40    80     60
Active                      30    90     60

32 Exercise 5.5: A total of n = 309 furniture defects were recorded and the defects were classified into four types: A, B, C, D. At the same time, each piece of furniture was identified by the production shift in which it was manufactured. The counts are presented in the table below. Test at the 5% significance level whether the type of defect and the production shift are independent.
                  Shift
Type of defect    1    2    3
A                 15   26   33
B                 21   31   17
C                 45   34   49
D                 13   5    20

