Contingency Tables For Tests of Independence

Multinomials Over Various Categories
Thus far we have discussed the situation where a single qualitative variable has multiple outcomes, without regard to anything else. Now we ask whether two qualitative variables are related, i.e., are they independent?

EXAMPLES
(1) Can it be concluded that cola preference and gender are dependent?
(2) Can it be concluded that cola preference and age are dependent?

RULE OF 5
$\chi^2$ (chi-squared) is only an approximate distribution for the test statistic. For the approximation to be "valid", ALL $e_i$'s should be $\ge 5$. If the rule of 5 is violated, combine some categories so that the condition is met.

COLA PREFERENCE VS. GENDER
The 1000 cola drinkers were further classified as to whether they were male or female.

COLA            MALE        FEMALE      ROW TOTAL
Coke            ?           ?           r1 = 410
Pepsi           ?           ?           r2 = 350
RC              50          30          r3 = 80
Shasta          35          15          r4 = 50
Jolt            75          35          r5 = 110
COLUMN TOTAL    c1 = 600    c2 = 400    n = 1000

HYPOTHESIS TEST: Can We Conclude Cola Preference and Gender Are Dependent?
$H_0$: (NO) Cola preference and gender are independent
$H_A$: (YES) Cola preference and gender are dependent
$\alpha = .05$
Reject $H_0$ if $\chi^2 > \chi^2_{.05,DF}$
– The correct DF = (r-1)(c-1) = (5-1)(2-1) = (4)(1) = 4, where r = # rows and c = # columns
Reject $H_0$ if $\chi^2 > \chi^2_{.05,4} = 9.488$

HOW DO WE GET THE $e_{ij}$'s?
Let P(A) = probability a respondent favors Coke
Let P(B) = probability a respondent is a male
If $H_0$ is true, the classifications are independent, thus P(A and B) = P(A)P(B)
Best guess for P(A) $\approx$ 410/1000 = .41
Best guess for P(B) $\approx$ 600/1000 = .6
Thus P(A and B) $\approx$ (.41)(.6) = .246
Expected number (Coke and male) = $e_{11}$ = 1000(.246) = 246
This can be obtained directly as $r_1 c_1 / n$ = (410)(600)/1000 = 246

CONTINGENCY TABLES
Contingency tables are a convenient way of expressing the results when there are two classifications
– It is the equivalent of a multinomial table for two classifications
We put the $e_{ij}$'s in parentheses under (or next to) the $f_{ij}$'s in the table; then we calculate:
$$\chi^2 = \sum_{i}\sum_{j} \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$

$e_{ij}$'s for Cola vs. Gender
Coke/Male      $e_{11}$ = (410)(600)/1000 = 246
Coke/Female    $e_{12}$ = (410)(400)/1000 = 164
Pepsi/Male     $e_{21}$ = (350)(600)/1000 = 210
Pepsi/Female   $e_{22}$ = (350)(400)/1000 = 140
RC/Male        $e_{31}$ = (80)(600)/1000 = 48
RC/Female      $e_{32}$ = (80)(400)/1000 = 32
Shasta/Male    $e_{41}$ = (50)(600)/1000 = 30
Shasta/Female  $e_{42}$ = (50)(400)/1000 = 20
Jolt/Male      $e_{51}$ = (110)(600)/1000 = 66
Jolt/Female    $e_{52}$ = (110)(400)/1000 = 44
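The rule $e_{ij} = r_i c_j / n$ can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `expected_counts` is hypothetical, and only the row and column totals from the survey are used.

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence: e_ij = r_i * c_j / n."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must add up to the same n")
    # Do not round: expected counts are allowed to be fractional.
    return [[r * c / n for c in col_totals] for r in row_totals]


# Totals from the cola vs. gender survey (n = 1000).
colas = ["Coke", "Pepsi", "RC", "Shasta", "Jolt"]
row_totals = [410, 350, 80, 50, 110]
col_totals = [600, 400]  # male, female

expected = expected_counts(row_totals, col_totals)
for name, row in zip(colas, expected):
    print(name, row)
```

Only the marginal totals enter the calculation, which is why the expected counts can be filled in before looking at the individual cells.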

Notes on Calculating e's
The column totals may be set in advance or may be random based on the survey.
These $e_{ij}$'s were all whole numbers. If they are not, DO NOT ROUND TO WHOLE NUMBERS.
All these e's are $\ge 5$, but suppose $e_{52}$ were actually = 3
– We might combine the results from Shasta and Jolt colas.
– This would reduce the number of rows and hence the degrees of freedom.
– $e_{52}$ is not less than 5 here, so we do not have to do this.
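A rough sketch of the rule-of-5 check and of combining two categories follows. The helper names are hypothetical, and the merge of Shasta and Jolt is shown purely for illustration, since no expected count in this survey falls below 5. Merging is done on the row totals here, which is enough to recompute the expected counts; in practice the observed counts of the two rows would be added in the same way.

```python
def expected_counts(row_totals, col_totals):
    """e_ij = r_i * c_j / n for every cell."""
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def violates_rule_of_5(expected):
    """True if any expected count is below 5."""
    return any(e < 5 for row in expected for e in row)


def combine_rows(row_totals, keep, drop):
    """Fold category `drop` into category `keep`; returns a new list."""
    merged = list(row_totals)
    merged[keep] += merged[drop]
    del merged[drop]
    return merged


def degrees_of_freedom(n_rows, n_cols):
    return (n_rows - 1) * (n_cols - 1)


row_totals = [410, 350, 80, 50, 110]  # Coke, Pepsi, RC, Shasta, Jolt
col_totals = [600, 400]               # male, female

needs_merge = violates_rule_of_5(expected_counts(row_totals, col_totals))

# Illustration only: merge Jolt (index 4) into Shasta (index 3).
combined = combine_rows(row_totals, keep=3, drop=4)
df_before = degrees_of_freedom(len(row_totals), len(col_totals))
df_after = degrees_of_freedom(len(combined), len(col_totals))
print(needs_merge, combined, df_before, df_after)
```

Combining two rows costs one degree of freedom per column beyond the first, so the critical value for the test changes along with the table.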

CONTINGENCY TABLE FOR COLA vs. GENDER

          Men        Women      Total
Coke      ? (246)    ? (164)     410
Pepsi     ? (210)    ? (140)     350
RC        50 (48)    30 (32)      80
Shasta    35 (30)    15 (20)      50
Jolt      75 (66)    35 (44)     110
Total     600        400        1000

$\chi^2$ for Cola vs. Gender
$\chi^2$ = $(f_{11}-246)^2/246$ + $(f_{12}-164)^2/164$ + $(f_{21}-210)^2/210$ + $(f_{22}-140)^2/140$ + $(50-48)^2/48$ + $(30-32)^2/32$ + $(35-30)^2/30$ + $(15-20)^2/20$ + $(75-66)^2/66$ + $(35-44)^2/44$ = 6.92
$\chi^2 = 6.92 < \chi^2_{.05,4} = 9.488$
There is not enough evidence to conclude gender and cola preference are dependent.
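The statistic can be reproduced with a short Python sketch. One loud assumption: the observed Coke and Pepsi counts are not legible in this transcript, so the values 240/170 and 200/150 used below are ASSUMED. They were chosen only because they agree with the row totals (410, 350), the column totals (600, 400), and the reported $\chi^2$ = 6.92; they are not taken from the slides. The critical value 9.488 is the standard table value for $\alpha$ = .05 with 4 degrees of freedom.

```python
def chi_squared(observed):
    """Sum of (f_ij - e_ij)^2 / e_ij with e_ij = r_i * c_j / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, f in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            stat += (f - e) ** 2 / e
    return stat


observed = [
    [240, 170],  # Coke   (ASSUMED cell counts, see note above)
    [200, 150],  # Pepsi  (ASSUMED cell counts, see note above)
    [50, 30],    # RC
    [35, 15],    # Shasta
    [75, 35],    # Jolt
]

chi2 = chi_squared(observed)
critical = 9.488  # chi-squared table value, alpha = .05, DF = 4
reject_h0 = chi2 > critical
print(round(chi2, 2), reject_h0)
```

With these counts the statistic stays below the critical value, which matches the conclusion that independence is not rejected.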

COLA PREFERENCE vs. AGE
Survey results:

          ?      ?      ?      60     TOTAL
Coke      ?      ?      75     40      410
Pepsi     ?      ?      ?      ?       350
RC        ?      ?      ?      ?        80
Shasta    ?      ?      ?      ?        50
Jolt      ?      ?      ?      ?       110
TOTAL     400    300    200    100    1000

HYPOTHESIS TEST
There are r = 5 rows and c = 4 columns
$H_0$: (NO) Cola preference and age are independent
$H_A$: (YES) Cola preference and age are dependent
$\alpha = .05$
Reject $H_0$ if $\chi^2 > \chi^2_{.05,DF}$
– DF = (r-1)(c-1) = (5-1)(4-1) = (4)(3) = 12
Reject $H_0$ if $\chi^2 > \chi^2_{.05,12} = 21.026$

Sample $e_{ij}$'s
$e_{34}$ = (Row 3 Total)(Column 4 Total)/(Grand Total) = (80)(100)/1000 = 8
$e_{41}$ = (Row 4 Total)(Column 1 Total)/(Grand Total) = (50)(400)/1000 = 20

CONTINGENCY TABLE FOR COLA vs. AGE

          ?          ?          ?          60         Total
Coke      ? (164)    ? (123)    75 (82)    40 (41)     410
Pepsi     ? (140)    ? (105)    ? (70)     ? (35)      350
RC        ? (32)     ? (24)     ? (16)     ? (8)        80
Shasta    ? (20)     ? (15)     ? (10)     ? (5)        50
Jolt      ? (44)     ? (33)     ? (22)     ? (11)      110
Total     400        300        200        100        1000

$\chi^2$ for Cola vs. Age
$\chi^2$ = $(f_{11}-164)^2/164$ + $(f_{12}-123)^2/123$ + $(75-82)^2/82$ + $(40-41)^2/41$ + … + $(f_{51}-44)^2/44$ + $(f_{52}-33)^2/33$ + $(f_{53}-22)^2/22$ + $(f_{54}-11)^2/11$
The computed $\chi^2$ is less than $\chi^2_{.05,12} = 21.026$.
There is not enough evidence to conclude cola preference and age are dependent.

Excel
CHITEST gives the p-value for the test:
=CHITEST(Observed Values, Expected Values)
The expected values, the $e_{ij}$'s, must be calculated first. See next slide for an easy way to calculate these values.
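For readers working outside Excel, the p-value can be sketched in plain Python. This is an illustration under a stated restriction, not a replacement for CHITEST: the closed form below holds only when the degrees of freedom are even, which happens to cover both tests in these slides (DF = 4 and DF = 12). The function name `chi2_p_value_even_df` is hypothetical.

```python
import math


def chi2_p_value_even_df(stat, df):
    """Upper-tail probability P(X > stat) for a chi-squared variable.

    For even df = 2k the tail has the closed form
        exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    Odd df needs the incomplete gamma function and is not handled here.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = stat / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Cola vs. gender: chi-squared = 6.92 with DF = 4.
p_gender = chi2_p_value_even_df(6.92, 4)
print(round(p_gender, 3))
```

A p-value above $\alpha$ = .05 leads to the same decision as comparing the statistic with the critical value: $H_0$ is not rejected.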

Row totals: =SUM(B4:C4) in D4; drag to D5:D8
Column totals: =SUM(B4:B8) in B9; drag to C9:D9
Expected values: =$D4*B$9/$D$9 in B13; drag to C13, then drag B13:C13 to B17:C17
p-value: =CHITEST(B4:C8,B13:C17)

Row totals: =SUM(B4:E4) in F4; drag to F5:F8
Column totals: =SUM(B4:B8) in B9; drag to C9:F9
Expected values: =$F4*B$9/$F$9 in B13; drag to E13, then drag B13:E13 to B17:E17
p-value: =CHITEST(B4:E8,B13:E17)

Review
Contingency tables allow for comparisons to determine if two different classifications are independent
Excel: CHITEST is used to generate the p-values for the chi-squared test
Expected values = (Row Total)(Column Total)/n
By hand: total degrees of freedom = (r-1)(c-1), and the $\chi^2$ statistic is calculated by:
$$\chi^2 = \sum_{i}\sum_{j} \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$