Statistics for Business and Economics

Presentation transcript:

Statistics for Business and Economics Chapter 9 Categorical Data Analysis

Learning Objectives
As a result of this class, you will be able to:
- Explain the χ² test for proportions
- Explain the χ² test of independence
- Solve hypothesis testing problems for more than two population proportions and for independence

Data Types
Data are either quantitative (continuous or discrete) or qualitative.

Qualitative Data
- Qualitative random variables yield responses that classify. Example: gender (male, female)
- Measurement reflects the number in each category
- Nominal or ordinal scale
- Examples: What make of car do you drive? Do you live on-campus or off-campus?

Hypothesis Tests for Qualitative Data (diagram)
Tests shown: Z test and χ² test. Branches shown: proportion (1 population; more than 2 populations) and independence.

Chi-Square (χ²) Test for k Proportions

Multinomial Experiment
- n identical trials
- k outcomes to each trial
- Constant outcome probability, pk
- Independent trials
- Random variable is the count, nk
- Example: ask 100 people (n) which of 3 candidates (k) they will vote for

Chi-Square (χ²) Test for k Proportions
- Tests equality (=) of proportions only. Example: p1 = .2, p2 = .3, p3 = .5
- One variable with several levels
- Uses a one-way contingency table

One-Way Contingency Table
Shows the number of observations in k independent groups (outcomes or variable levels).
Outcomes (k = 3):
Candidate              Tom   Bill   Mary   Total
Number of responses     35     20     45     100

Conditions Required for a Valid Test: One-Way Table
- A multinomial experiment has been conducted
- The sample size n is large: E(ni) is greater than or equal to 5 for every cell

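χ² Test for k Proportions: Hypotheses & Statistic
1. Hypotheses: H0: p1 = p1,0, p2 = p2,0, ..., pk = pk,0 (the pi,0 are the hypothesized probabilities); Ha: At least one pi differs from its hypothesized value
2. Test statistic: $\chi^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$, where $n_i$ is the observed count and $E(n_i) = n p_{i,0}$ is the expected count
3. Degrees of freedom: k - 1, where k is the number of outcomes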

χ² Test Basic Idea
- Compares the observed count to the expected count, assuming the null hypothesis is true
- The closer the observed counts are to the expected counts, the more consistent the data are with H0
- Measured by the squared difference relative to the expected count
- Reject H0 for large values of χ²

Finding Critical Value Example
What is the critical χ² value if k = 3 and α = .05? (If ni = E(ni) for every cell, then χ² = 0 and H0 is not rejected.)
χ² table (portion), columns indexed by upper-tail area:
DF    .995    ...    .95      ...    .05
1     ...     ...    0.004    ...    3.841
2     0.010   ...    0.103    ...    5.991
With α = .05 and df = k - 1 = 2, the critical value is 5.991: reject H0 if χ² > 5.991.
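The table lookup above can be reproduced without a printed table for the two cases this deck uses. The following is a minimal sketch, not a general routine: it relies on two closed forms (a χ² variable with 1 df is a squared standard normal, and with 2 df the upper-tail area is exp(-x/2)). The function name `chi2_critical` is a hypothetical helper introduced here for illustration.

```python
import math
from statistics import NormalDist


def chi2_critical(alpha: float, df: int) -> float:
    """Upper-tail critical value of the chi-square distribution (df = 1 or 2 only)."""
    if df == 1:
        # A chi-square variable with 1 df is the square of a standard normal,
        # so the upper-tail cutoff is the squared two-sided z cutoff.
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return z * z
    if df == 2:
        # With 2 df the upper-tail area is exp(-x / 2), so x = -2 * ln(alpha).
        return -2.0 * math.log(alpha)
    raise ValueError("this sketch only covers df = 1 and df = 2")


print(round(chi2_critical(0.05, 2), 3))  # 5.991
print(round(chi2_critical(0.05, 1), 3))  # 3.841
```

For other degrees of freedom, a statistics library or a full table would be needed; the closed forms here do not generalize.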

χ² Test for k Proportions Example
As personnel director, you want to test the perception of fairness of three methods of performance evaluation. Of 180 employees, 63 rated Method 1 as fair, 45 rated Method 2 as fair, and 72 rated Method 3 as fair. At the .05 level of significance, is there a difference in perceptions?
To check the sample-size condition, compute the expected count under H0 for each method: E(ni) = n·pi,0 = 180·(1/3) = 60, which is at least 5.

χ² Test for k Proportions Solution
- H0: p1 = p2 = p3 = 1/3
- Ha: At least 1 is different
- α = .05
- n1 = 63, n2 = 45, n3 = 72
- Critical value: 5.991 (df = k - 1 = 2); reject H0 if χ² > 5.991
- Test statistic, with E(ni) = 180·(1/3) = 60 for each method: $\chi^2 = \frac{(63-60)^2}{60} + \frac{(45-60)^2}{60} + \frac{(72-60)^2}{60} = 6.3$
- Decision: Reject H0 at α = .05
- Conclusion: There is evidence of a difference in proportions
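The solution above can be checked numerically. This is a small sketch in plain Python; the function name `chi_square_gof` is a hypothetical helper, and the critical value 5.991 is taken from the table portion shown earlier rather than computed.

```python
def chi_square_gof(observed, null_probs):
    """Chi-square statistic for a one-way table against hypothesized proportions."""
    n = sum(observed)
    # Expected count in each cell under H0: E(n_i) = n * p_{i,0}
    expected = [n * p for p in null_probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


# Personnel example: 63, 45 and 72 of 180 employees rated each method as fair.
stat, expected = chi_square_gof([63, 45, 72], [1 / 3, 1 / 3, 1 / 3])

print([round(e, 1) for e in expected])  # [60.0, 60.0, 60.0]
print(round(stat, 2))                   # 6.3
print(stat > 5.991)                     # True, so H0 is rejected at alpha = .05
```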

Contingency Tables
- Useful in situations involving multiple population proportions
- Used to classify sample observations according to two or more characteristics
- Also called a cross-classification table

Contingency Table Example
Left-Handed vs. Gender
- Dominant hand: left vs. right
- Gender: male vs. female
- 2 categories for each variable, so this is called a 2 x 2 table
- Suppose we examine a sample of 300 children

Contingency Table Example (continued)
Sample results organized in a contingency table (columns are hand preference):
Gender    Left   Right   Total
Female      12     108     120
Male        24     156     180
Total       36     264     300
Sample size n = 300: of 120 females, 12 were left-handed; of 180 males, 24 were left-handed.

χ² Test for the Difference Between Two Proportions
- H0: π1 = π2 (the proportion of females who are left-handed is equal to the proportion of males who are left-handed)
- H1: π1 ≠ π2 (the two proportions are not the same; hand preference is not independent of gender)
- If H0 is true, then the proportion of left-handed females should be the same as the proportion of left-handed males
- The two proportions above should be the same as the proportion of left-handed people overall

The Chi-Square Test Statistic
The chi-square test statistic is: $\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$
where:
- fo = observed frequency in a particular cell
- fe = expected frequency in a particular cell if H0 is true
(Assumed: each cell in the contingency table has an expected frequency of at least 5)

Decision Rule
The test statistic approximately follows a chi-squared distribution with one degree of freedom.
Decision Rule: If χ² > χ²α, reject H0; otherwise, do not reject H0. Here χ²α is the upper-tail critical value at significance level α.

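Computing the Average Proportion
The average proportion is: $\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{X}{n}$
Here, of 120 females, 12 were left-handed, and of 180 males, 24 were left-handed, so $\bar{p} = \frac{12 + 24}{120 + 180} = \frac{36}{300} = 0.12$
That is, of all the children, the proportion of left-handers is 0.12, or 12%.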

Finding Expected Frequencies
- To obtain the expected frequency for left-handed females, multiply the average proportion left-handed (p̄) by the total number of females
- To obtain the expected frequency for left-handed males, multiply the average proportion left-handed (p̄) by the total number of males
- If the two proportions are equal, then P(Left-Handed | Female) = P(Left-Handed | Male) = .12
- That is, we would expect (.12)(120) = 14.4 females to be left-handed and (.12)(180) = 21.6 males to be left-handed

Observed vs. Expected Frequencies
Gender    Left                 Right                 Total
Female    Observed = 12        Observed = 108          120
          Expected = 14.4      Expected = 105.6
Male      Observed = 24        Observed = 156          180
          Expected = 21.6      Expected = 158.4
Total     36                   264                     300

The Chi-Square Test Statistic
Using the observed and expected frequencies for the four cells, the test statistic is:
$\chi^2 = \frac{(12 - 14.4)^2}{14.4} + \frac{(108 - 105.6)^2}{105.6} + \frac{(24 - 21.6)^2}{21.6} + \frac{(156 - 158.4)^2}{158.4} = 0.7576$

Decision Rule
Decision Rule: If χ² > 3.841, reject H0; otherwise, do not reject H0.
Here, χ² = 0.7576 < χ²0.05 = 3.841, so we do not reject H0 and conclude that there is not sufficient evidence that the two proportions are different at α = 0.05.
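The left-handedness calculation can be retraced step by step: pool the two samples to get the average proportion, build the expected frequencies from it, then sum the squared deviations. The sketch below follows that route; `two_proportion_chi_square` is a hypothetical name chosen for illustration, and 3.841 is the tabled critical value quoted in the slides.

```python
def two_proportion_chi_square(successes, totals):
    """Chi-square statistic for H0: pi_1 = pi_2, built from the pooled proportion."""
    # Average (pooled) proportion: total successes over total sample size.
    p_bar = sum(successes) / sum(totals)
    stat = 0.0
    for x, n in zip(successes, totals):
        # Each group contributes two cells: "success" and "not success".
        cells = ((x, n * p_bar), (n - x, n * (1 - p_bar)))
        for observed, expected in cells:
            stat += (observed - expected) ** 2 / expected
    return p_bar, stat


# 12 of 120 females and 24 of 180 males were left-handed.
p_bar, stat = two_proportion_chi_square(successes=[12, 24], totals=[120, 180])

print(round(p_bar, 2))  # 0.12
print(round(stat, 4))   # 0.7576
print(stat > 3.841)     # False, so H0 is not rejected at alpha = 0.05
```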

χ² Test for Differences Among More Than Two Proportions
Extend the χ² test to the case with more than two independent populations:
- H0: π1 = π2 = … = πc
- H1: Not all of the πj are equal (j = 1, 2, …, c)

The Chi-Square Test Statistic
The chi-square test statistic is: $\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$
where:
- fo = observed frequency in a particular cell of the 2 x c table
- fe = expected frequency in a particular cell if H0 is true
(Assumed: each cell in the contingency table has an expected frequency of at least 1)

Computing the Overall Proportion
The overall proportion is: $\bar{p} = \frac{X_1 + X_2 + \cdots + X_c}{n_1 + n_2 + \cdots + n_c} = \frac{X}{n}$
Expected cell frequencies for the c categories are calculated as in the 2 x 2 case, and the decision rule is the same.
Decision Rule: If χ² > χ²α, reject H0; otherwise, do not reject H0. Here χ²α is from the chi-squared distribution with c - 1 degrees of freedom.

The Marascuilo Procedure
- Used when the null hypothesis of equal proportions is rejected
- Enables you to make comparisons between all pairs
- Start with the observed differences, pj - pj', for all pairs (j ≠ j'), then compare the absolute difference to a calculated critical range
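The slide does not show how the critical range is calculated. In its usual textbook form it is written as below; treat this as a sketch in this section's notation, assuming χ²α is the upper-tail critical value with c - 1 degrees of freedom and nj is the size of sample j.

```latex
\text{Critical range}_{jj'} = \sqrt{\chi^2_{\alpha}} \;
  \sqrt{\frac{p_j (1 - p_j)}{n_j} + \frac{p_{j'} (1 - p_{j'})}{n_{j'}}}
```

A pair of proportions is declared significantly different when |pj - pj'| exceeds its critical range.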

χ² Test of Independence

χ² Test of Independence
- Shows whether a relationship exists between two qualitative variables
- One sample is drawn
- Does not show causality
- Uses a two-way contingency table

χ² Test of Independence Contingency Table
Shows the number of observations from 1 sample classified jointly by 2 qualitative variables (table layout: levels of variable 1 by levels of variable 2).

Conditions Required for a Valid χ² Test: Independence
- A multinomial experiment has been conducted
- The sample size, n, is large: Eij is greater than or equal to 5 for every cell

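χ² Test of Independence: Hypotheses & Statistic
- H0: Variables are independent
- Ha: Variables are related (dependent)
- Test statistic: $\chi^2 = \sum_{i}\sum_{j} \frac{(n_{ij} - E_{ij})^2}{E_{ij}}$, where $n_{ij}$ is the observed count and $E_{ij}$ is the expected count in row i, column j
- Degrees of freedom: (r - 1)(c - 1), where r is the number of rows and c is the number of columns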

χ² Test of Independence Expected Counts
- Statistical independence means the joint probability equals the product of the marginal probabilities
- Compute the marginal probabilities and multiply them for the joint probability
- The expected count is the sample size times the joint probability: $E_{ij} = \frac{(\text{Row total}_i)(\text{Column total}_j)}{\text{Sample size}}$

Expected Count Example
                Location
House Style     Urban   Rural   Total
Split-Level        63      49     112
Ranch              15      33      48
Total              78      82     160
(cell entries are observed counts)
- Marginal probability of Split-Level = 112/160
- Marginal probability of Urban = 78/160
- Joint probability under independence = (112/160)·(78/160)
- Expected count = 160·(112/160)·(78/160) = 54.6

Expected Count Calculation
                     House Location
                Urban            Rural
House Style     Obs.   Exp.      Obs.   Exp.     Total
Split-Level      63    54.6       49    57.4      112
Ranch            15    23.4       33    24.6       48
Total            78    78         82    82        160
Expected counts: 112·78/160 = 54.6, 112·82/160 = 57.4, 48·78/160 = 23.4, 48·82/160 = 24.6.
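The four expected counts follow one rule: row total times column total, divided by the sample size. A short sketch of that rule for a table of any size follows; `expected_counts` is a hypothetical helper name, and the table is the house style by location data from the slides.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: Split-Level, Ranch. Columns: Urban, Rural.
observed = [[63, 49], [15, 33]]
expected = expected_counts(observed)

for row in expected:
    print([round(e, 1) for e in row])
# [54.6, 57.4]
# [23.4, 24.6]
```

Each row and column of the expected table sums to the same margin as the observed table, which is a quick sanity check on the arithmetic.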

χ² Test of Independence Example
As a realtor, you want to determine if house style and house location are related. At the .05 level of significance, is there evidence of a relationship?

χ² Test of Independence Solution
- H0: No relationship
- Ha: Relationship
- α = .05
- df = (2 - 1)(2 - 1) = 1
- Critical value: 3.841; reject H0 if χ² > 3.841
- Condition check: Eij ≥ 5 in all cells, since 112·78/160 = 54.6, 112·82/160 = 57.4, 48·78/160 = 23.4, and 48·82/160 = 24.6
- Test statistic: $\chi^2 = \frac{(63 - 54.6)^2}{54.6} + \frac{(49 - 57.4)^2}{57.4} + \frac{(15 - 23.4)^2}{23.4} + \frac{(33 - 24.6)^2}{24.6} = 8.41$
- Decision: Reject H0 at α = .05
- Conclusion: There is evidence of a relationship
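The whole solution can be run end to end. The sketch below computes the statistic and degrees of freedom for any r x c table; `chi_square_independence` is a hypothetical helper name. The p-value line is an addition not found in the slides, and it uses a closed form that holds only for 1 degree of freedom.

```python
import math


def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Rows: Split-Level, Ranch. Columns: Urban, Rural.
stat, df = chi_square_independence([[63, 49], [15, 33]])

# For df = 1 only: the upper-tail area of chi-square equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(stat / 2))

print(round(stat, 2), df)  # 8.41 1
print(stat > 3.841)        # True, so H0 is rejected at alpha = .05
print(p_value < 0.05)      # True
```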

χ² Test of Independence Thinking Challenge
You're a marketing research analyst. You ask a random sample of 286 consumers if they purchase Diet Pepsi or Diet Coke. At the .05 level of significance, is there evidence of a relationship?
               Diet Pepsi
Diet Coke     No    Yes   Total
No            84     32     116
Yes           48    122     170
Total        132    154     286

χ² Test of Independence Solution*
- H0: No relationship
- Ha: Relationship
- α = .05
- df = (2 - 1)(2 - 1) = 1
- Critical value: 3.841; reject H0 if χ² > 3.841
- Condition check: Eij ≥ 5 in all cells, since 116·132/286 ≈ 53.5, 116·154/286 ≈ 62.5, 170·132/286 ≈ 78.5, and 170·154/286 ≈ 91.5
- Test statistic: χ² = 54.29 (computed with the expected counts rounded to one decimal)
- Decision: Reject H0 at α = .05
- Conclusion: There is evidence of a relationship
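Recomputing this statistic with unrounded expected counts gives about 54.15 rather than the 54.29 on the slide. The slide's figure appears consistent with rounding each expected count to one decimal before summing, which the sketch below checks. The helper name `chi_square_stat` and its `round_expected_to` parameter are hypothetical, introduced only for this comparison.

```python
def chi_square_stat(table, round_expected_to=None):
    """Chi-square statistic; optionally round expected counts first (hand-calculation style)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if round_expected_to is not None:
                expected = round(expected, round_expected_to)
            stat += (observed - expected) ** 2 / expected
    return stat


# Diet Coke (rows) by Diet Pepsi (columns); the orientation is assumed from the slide layout.
observed = [[84, 32], [48, 122]]

exact = chi_square_stat(observed)
hand = chi_square_stat(observed, round_expected_to=1)

print(round(exact, 2))  # 54.15
print(round(hand, 2))   # 54.29
```

Either way the statistic is far above 3.841, so the decision to reject H0 does not depend on the rounding.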

χ² Test of Independence Thinking Challenge 2
There is a statistically significant relationship between purchasing Diet Coke and Diet Pepsi. So what do you think the relationship is? Aren't they competitors?

You Re-Analyze the Data
High Income
               Diet Pepsi
Diet Coke     No    Yes   Total
No             4     30      34
Yes           40      2      42
Total         44     32      76
Low Income
               Diet Pepsi
Diet Coke     No    Yes   Total
No            80      2      82
Yes            8    120     128
Total         88    122     210
There is a spurious relationship between purchasing Diet Coke and Diet Pepsi. Income is an intervening or control variable and is the true cause. The analysis here uses only descriptive statistics. Low-income consumers are price conscious: they either cannot afford to buy either brand or buy whatever is on sale. High-income consumers buy according to preference, regardless of price.
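The descriptive comparison can be made concrete by computing, within each income group, the share of Diet Pepsi buyers among Diet Coke non-buyers and among Diet Coke buyers. The sketch below assumes rows are Diet Coke and columns are Diet Pepsi (the slide layout suggests this but does not state it); `pepsi_share_by_coke` is a hypothetical helper name.

```python
# Rows: Diet Coke (No, Yes). Columns: Diet Pepsi (No, Yes). Orientation is an assumption.
high_income = [[4, 30], [40, 2]]
low_income = [[80, 2], [8, 120]]

# Adding the two income groups cell by cell recovers the pooled table.
pooled = [[h + l for h, l in zip(h_row, l_row)]
          for h_row, l_row in zip(high_income, low_income)]


def pepsi_share_by_coke(table):
    """Share buying Diet Pepsi among Diet Coke non-buyers, then among Diet Coke buyers."""
    return [row[1] / sum(row) for row in table]


for name, table in [("high income", high_income),
                    ("low income", low_income),
                    ("pooled", pooled)]:
    non_buyers, buyers = pepsi_share_by_coke(table)
    print(f"{name}: {non_buyers:.3f} among Coke non-buyers, {buyers:.3f} among Coke buyers")
```

In the high-income group, Diet Coke buyers are much less likely to buy Diet Pepsi; in the low-income group they are much more likely to. The pooled table shows only the second pattern because the low-income group is larger.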

True Relationships*
Diagram: the apparent relation links Diet Coke and Diet Pepsi; the underlying causal relation runs from the control or intervening variable (the true cause) to each of them.

Moral of the Story*
Numbers don't think - people do!
© 1984-1994 T/Maker Co.

Conclusion
- Explained the χ² test for proportions
- Explained the χ² test of independence
- Solved hypothesis testing problems for more than two population proportions and for independence