Download presentation

Presentation is loading. Please wait.

Published byTristan Egger Modified over 2 years ago

1
Statistics for Business and Economics Chapter 9 Categorical Data Analysis

2
Learning Objectives 1.Explain 2 Test for Proportions 2.Explain 2 Test of Independence 3.Solve Hypothesis Testing Problems More Than Two Population Proportions Independence

3
Data Types Data QuantitativeQualitative ContinuousDiscrete

4
Qualitative Data Qualitative random variables yield responses that classify –Example: gender (male, female) Measurement reflects number in category Nominal or ordinal scale Examples –What make of car do you drive? –Do you live on-campus or off-campus?

5
Hypothesis Tests Qualitative Data Qualitative Data Z Test c 2 Test ProportionIndependence 1 pop. c 2 Test More than 2 pop.

6
Chi-Square ( 2 ) Test for k Proportions

7
Hypothesis Tests Qualitative Data Qualitative Data Z Test c 2 Test ProportionIndependence 1 pop. c 2 Test More than 2 pop.

8
Multinomial Experiment n identical trials k outcomes to each trial Constant outcome probability, p k Independent trials Random variable is count, n k Example: ask 100 people (n) which of 3 candidates (k) they will vote for

9
Chi-Square ( 2 ) Test for k Proportions Tests equality (=) of proportions only –Example: p 1 =.2, p 2 =.3, p 3 =.5 One variable with several levels Uses one-way contingency table

10
One-Way Contingency Table Shows number of observations in k independent groups (outcomes or variable levels) Outcomes (k = 3) Number of responses Candidate TomBillMaryTotal 352045100

11
Conditions Required for a Valid Test: One-way Table 1.A multinomial experiment has been conducted 2.The sample size n is large: E(n i ) is greater than or equal to 5 for every cell

12
2 Test for k Proportions Hypotheses & Statistic 2.Test Statistic Observed count Expected count: E(n i ) = np i,0 3.Degrees of Freedom: k – 1 Number of outcomes Hypothesized probability 1.Hypotheses H 0 : p 1 = p 1,0, p 2 = p 2,0,..., p k = p k,0 H a : At least one p i is different from above

13
2 Test Basic Idea 1.Compares observed count to expected count assuming null hypothesis is true 2.Closer observed count is to expected count, the more likely the H 0 is true Measured by squared difference relative to expected count Reject large values

14
Finding Critical Value Example What is the critical 2 value if k = 3, and =.05? c 2 0 Upper Tail Area DF.995….95….05 1...…0.004…3.841 20.010…0.103…5.991 2 Table (Portion) If n i = E(n i ), 2 = 0. Do not reject H 0 df= k - 1 = 2 5.991 Reject H 0 =.05

15
As personnel director, you want to test the perception of fairness of three methods of performance evaluation. Of 180 employees, 63 rated Method 1 as fair, 45 rated Method 2 as fair, 72 rated Method 3 as fair. At the.05 level of significance, is there a difference in perceptions? 2 Test for k Proportions Example

16
H 0 : H a : = n 1 = n 2 = n 3 = Critical Value(s): Test Statistic: Decision: Conclusion: p 1 = p 2 = p 3 = 1/3 At least 1 is different.05 634572 =.05 c 2 0 Reject H 0 5.991 2 Test for k Proportions Solution

17

18
Test Statistic: Decision: Conclusion: 2 = 6.3 Reject at =.05 There is evidence of a difference in proportions 2 Test for k Proportions Solution H 0 : H a : = n 1 = n 2 = n 3 = Critical Value(s): c 2 0 Reject H 0 p 1 = p 2 = p 3 = 1/3 At least 1 is different.05 634572 5.991 =.05

19
Contingency Tables Useful in situations involving multiple population proportions Used to classify sample observations according to two or more characteristics Also called a cross-classification table.

20
Contingency Table Example Left-Handed vs. Gender Dominant Hand: Left vs. Right Gender: Male vs. Female 2 categories for each variable, so called a 2 x 2 table Suppose we examine a sample of 300 children

21
Contingency Table Example Sample results organized in a contingency table: (continued) Gender Hand Preference LeftRight Female12108120 Male24156180 36264300 120 Females, 12 were left handed 180 Males, 24 were left handed sample size = n = 300:

22
2 Test for the Difference Between Two Proportions If H 0 is true, then the proportion of left-handed females should be the same as the proportion of left-handed males The two proportions above should be the same as the proportion of left-handed people overall H 0 : π 1 = π 2 ( Proportion of females who are left handed is equal to the proportion of males who are left handed ) H 1 : π 1 π 2 ( The two proportions are not the same hand preference is not independent of gender )

23
The Chi-Square Test Statistic where: f o = observed frequency in a particular cell f e = expected frequency in a particular cell if H 0 is true (Assumed: each cell in the contingency table has expected frequency of at least 5) The Chi-square test statistic is:

24
Decision Rule 2 α Decision Rule: If, reject H 0, otherwise, do not reject H 0 The test statistic approximately follows a chi- squared distribution with one degree of freedom 0 Reject H 0 Do not reject H 0

25
Computing the Average Proportion Here: 120 Females, 12 were left handed 180 Males, 24 were left handed i.e., of all the children the proportion of left handers is 0.12, that is, 12% The average proportion is:

26
Finding Expected Frequencies To obtain the expected frequency for left handed females, multiply the average proportion left handed (p) by the total number of females To obtain the expected frequency for left handed males, multiply the average proportion left handed (p) by the total number of males If the two proportions are equal, then P(Left Handed | Female) = P(Left Handed | Male) =.12 i.e., we would expect (.12)(120) = 14.4 females to be left handed (.12)(180) = 21.6 males to be left handed

27
Observed vs. Expected Frequencies Gender Hand Preference LeftRight Female Observed = 12 Expected = 14.4 Observed = 108 Expected = 105.6 120 Male Observed = 24 Expected = 21.6 Observed = 156 Expected = 158.4 180 36264300

28
Gender Hand Preference LeftRight Female Observed = 12 Expected = 14.4 Observed = 108 Expected = 105.6 120 Male Observed = 24 Expected = 21.6 Observed = 156 Expected = 158.4 180 36264300 The Chi-Square Test Statistic The test statistic is:

29
Decision Rule Decision Rule: If > 3.841, reject H 0, otherwise, do not reject H 0 Here, = 0.7576< = 3.841, so we do not reject H 0 and conclude that there is not sufficient evidence that the two proportions are different at = 0.05 2 0.05 = 3.841 0 0.05 Reject H 0 Do not reject H 0

30
Extend the 2 test to the case with more than two independent populations: 2 Test for Differences Among More Than Two Proportions H 0 : π 1 = π 2 = … = π c H 1 : Not all of the π j are equal (j = 1, 2, …, c)

31
The Chi-Square Test Statistic Where: f o = observed frequency in a particular cell of the 2 x c table f e = expected frequency in a particular cell if H 0 is true (Assumed: each cell in the contingency table has expected frequency of at least 1) The Chi-square test statistic is:

32
Computing the Overall Proportion The overall proportion is: Expected cell frequencies for the c categories are calculated as in the 2 x 2 case, and the decision rule is the same: Where is from the chi- squared distribution with c – 1 degrees of freedom Decision Rule: If, reject H 0, otherwise, do not reject H 0

33
The Marascuilo Procedure Used when the null hypothesis of equal proportions is rejected Enables you to make comparisons between all pairs Start with the observed differences, p j – p j, for all pairs (for j j )......then compare the absolute difference to a calculated critical range

34
2 Test of Independence

35
Hypothesis Tests Qualitative Data Qualitative Data Z Test c 2 Test ProportionIndependence 1 pop. c 2 Test More than 2 pop.

36
2 Test of Independence Shows if a relationship exists between two qualitative variables –One sample is drawn –Does not show causality Uses two-way contingency table

37
2 Test of Independence Contingency Table Shows number of observations from 1 sample jointly in 2 qualitative variables Levels of variable 2 Levels of variable 1

38
Conditions Required for a Valid 2 Test: Independence 1.Multinomial experiment has been conducted 2.The sample size, n, is large: E ij is greater than or equal to 5 for every cell

39
2 Test of Independence Hypotheses & Statistic 1.Hypotheses H 0 : Variables are independent H a : Variables are related (dependent) 3.Degrees of Freedom: (r – 1)(c – 1) Rows Columns 2.Test Statistic Observed count Expected count

40
2 Test of Independence Expected Counts 1.Statistical independence means joint probability equals product of marginal probabilities 2.Compute marginal probabilities and multiply for joint probability 3.Expected count is sample size times joint probability

41
112 160 Marginal probability = Expected Count Example Location UrbanRural House Style Obs. Obs.Total Split–Level 63 49 112 Ranch 15 33 48 Total 78 82 160

42
78 160 Marginal probability = Expected Count Example 112 160 Marginal probability = Location UrbanRural House Style Obs. Obs.Total Split–Level 63 49 112 Ranch 15 33 48 Total 78 82 160

43
Expected Count Example 78 160 Marginal probability = 112 160 Marginal probability = Joint probability = 112 160 78 160 Location UrbanRural House Style Obs. Obs.Total Split–Level 63 49 112 Ranch 15 33 48 Total 78 82 160 Expected count = 160· 112 160 78 160 = 54.6

44
Expected Count Calculation House Location Urban Rural House Style Obs. Exp. Obs. Exp. Total Split-Level 63 112·78 160 54.6 49 112·82 160 57.4 112 Ranch 15 48·78 160 23.4 33 48·82 160 24.6 48 Total 78 82 160

45
As a realtor you want to determine if house style and house location are related. At the.05 level of significance, is there evidence of a relationship? 2 Test of Independence Example

46
2 Test of Independence Solution H 0 : H a : = df = Critical Value(s): Test Statistic: Decision: Conclusion: No Relationship Relationship.05 (2 - 1)(2 - 1) = 1 c 2 0 Reject H 0 3.841 =.05

47
E ij 5 in all cells 2 Test of Independence Solution 112·82 160 48·78 160 48·82 160 112·78 160

48
2 Test of Independence Solution

49
Test Statistic: Decision: Conclusion: 2 = 8.41 Reject at =.05 There is evidence of a relationship H 0 : H a : = df = Critical Value(s): c 2 0 Reject H 0 No Relationship Relationship.05 (2 - 1)(2 - 1) = 1 3.841 =.05

50
Youre a marketing research analyst. You ask a random sample of 286 consumers if they purchase Diet Pepsi or Diet Coke. At the.05 level of significance, is there evidence of a relationship? 2 Test of Independence Thinking Challenge Diet Pepsi Diet CokeNoYesTotal No 84 32116 Yes 48122170 Total132154286

51
2 Test of Independence Solution* H 0 : H a : = df = Critical Value(s): Test Statistic: Decision: Conclusion: No Relationship Relationship.05 (2 - 1)(2 - 1) = 1 c 2 0 Reject H 0 3.841 =.05

52
E ij 5 in all cells 170·132 286 170·154 286 116·132 286 154·132 286 2 Test of Independence Solution*

53

54
Test Statistic: Decision: Conclusion: 2 = 54.29 Reject at =.05 There is evidence of a relationship H 0 : H a : = df = Critical Value(s): c 2 0 Reject H 0 No Relationship Relationship.05 (2 - 1)(2 - 1) = 1 3.841 =.05

55
There is a statistically significant relationship between purchasing Diet Coke and Diet Pepsi. So what do you think the relationship is? Arent they competitors? 2 Test of Independence Thinking Challenge 2 Diet Pepsi Diet CokeNoYesTotal No 84 32116 Yes 48122170 Total132154286

56
Low Income You Re-Analyze the Data High Income Diet Pepsi Diet Coke No Yes Total No 4 30 34 Yes 40 2 42 Total 44 32 76 Diet Pepsi Diet Coke No Yes Total No 80 2 82 Yes 8 120 128 Total 88 122 210

57
True Relationships* Apparent relation Underlying causal relation Control or intervening variable (true cause) Diet Coke Diet Pepsi

58
Moral of the Story* © 1984-1994 T/Maker Co. Numbers dont think - People do!

59
Conclusion 1.Explained 2 Test for Proportions 2.Explained 2 Test of Independence 3.Solved Hypothesis Testing Problems More Than Two Population Proportions Independence

Similar presentations

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google