Chapter 14: Preprocessing the Data, and Cross-Tabs © 2005 Thomson/South-Western.

Presentation transcript:

1 Chapter 14: Preprocessing the Data, and Cross-Tabs

2 Figure 1: Histogram and Frequency Polygon of Incomes of Families in Car Ownership Study

3 Figure 2: Cumulative Distribution of Incomes of Families in Car Ownership Study

4 Family Income and Number of Cars Family Owns

                       Number of Cars
Income               1 or None   2 or More   Total
Less than $37,500        48           6        54
More than $37,500        27          19        46
Total                    75          25       100

5 Number of Cars by Family Income

                       Number of Cars
Income               1 or None   2 or More   Total   # of Cases
Less than $37,500       89%         11%      100%        54
More than $37,500       59%         41%      100%        46

6 Family Income by Number of Cars

                       Number of Cars
Income               1 or None   2 or More
Less than $37,500       64%         24%
More than $37,500       36%         76%
Total                  100%        100%
(Number of Cases)      (75)        (25)
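The two percentage tables above come from the same counts; they differ only in which total serves as the base. A minimal Python sketch of both calculations follows. The function names and the dictionary layout are illustrative assumptions, not part of the original slides.

```python
# Counts from the income-by-cars table (rows: income, columns: cars owned).
counts = {
    "Less than $37,500": {"1 or None": 48, "2 or More": 6},
    "More than $37,500": {"1 or None": 27, "2 or More": 19},
}

def row_percentages(table):
    """Each cell as a percentage of its row total (income is the base)."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: round(100 * n / total) for col, n in cells.items()}
    return result

def column_percentages(table):
    """Each cell as a percentage of its column total (cars owned is the base)."""
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {
        row: {col: round(100 * n / col_totals[col]) for col, n in cells.items()}
        for row, cells in table.items()
    }

by_income = row_percentages(counts)    # "Number of Cars by Family Income"
by_cars = column_percentages(counts)   # "Family Income by Number of Cars"
print(by_income)
print(by_cars)
```

Percentaging within income answers "how does car ownership vary with income?", while percentaging within cars owned answers "what is the income mix of each ownership group?".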

7 Number of Cars and Size of Family

                       Number of Cars
Size of Family       1 or None   2 or More   Total
4 or Less                70           8        78
5 or More                 5          17        22
Total                    75          25       100

8 Number of Cars by Size of Family

                       Number of Cars
Size of Family       1 or None   2 or More   Total   # of Cases
4 or Less               90%         10%      100%       (78)
5 or More               23%         77%      100%       (22)

9 Number of Cars by Income and Size of Family

Four Members or Less:
                       Number of Cars
Income               1 or None   2 or More   Total
Less than $37,500        44           2        46
More than $37,500        26           6        32
TOTAL                    70           8        78

Five Members or More:
                       Number of Cars
Income               1 or None   2 or More   Total
Less than $37,500         4           4         8
More than $37,500         1          13        14
TOTAL                     5          17        22

Total:
                       Number of Cars
Income               1 or None   2 or More   Total
Less than $37,500        48           6        54
More than $37,500        27          19        46
TOTAL                    75          25       100

10 Number of Cars by Income and Size of Family

Four Members or Less:
                       Number of Cars
Income               1 or None   2 or More   Total
Less than $37,500       96%          4%      100% (46)
More than $37,500       81%         19%      100% (32)

Five Members or More:
                       Number of Cars
Income               1 or None   2 or More   Total
Less than $37,500       50%         50%      100% (8)
More than $37,500        7%         93%      100% (14)

Total:
                       Number of Cars
Income               1 or None   2 or More   Total
Less than $37,500       89%         11%      100% (54)
More than $37,500       59%         41%      100% (46)

11 Car Ownership for Small, Below Average Income Families

                       Number of Cars
Income               1 or None   2 or More   Total
Less than $37,500       96%          4%      100% (46)

12 Percentage of Families Owning Two or More Cars by Income

                       Size of Family
Income               4 or Less   5 or More   Total
Less than $37,500        4%         50%      11% (6)
More than $37,500       19%         93%      41% (19)
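The summary table above can be rebuilt from the three-way counts by computing, within each income and family-size cell, the share of families owning two or more cars. The sketch below is one way to do it in Python; the data layout and the helper name `pct_two_or_more` are assumptions made for illustration.

```python
# (family size, income) -> (families with 1 or no car, families with 2 or more)
counts = {
    ("4 or Less", "Less than $37,500"): (44, 2),
    ("4 or Less", "More than $37,500"): (26, 6),
    ("5 or More", "Less than $37,500"): (4, 4),
    ("5 or More", "More than $37,500"): (1, 13),
}
sizes = ("4 or Less", "5 or More")
incomes = ("Less than $37,500", "More than $37,500")

def pct_two_or_more(cells):
    """Percentage of families owning 2+ cars across the given cells."""
    few = sum(c[0] for c in cells)
    many = sum(c[1] for c in cells)
    return round(100 * many / (few + many))

summary = {}
for income in incomes:
    row = {size: pct_two_or_more([counts[(size, income)]]) for size in sizes}
    # The "Total" column pools both family sizes within the income group.
    row["Total"] = pct_two_or_more([counts[(size, income)] for size in sizes])
    summary[income] = row

for income, row in summary.items():
    print(income, row)
```

Reading across a row holds income fixed and shows the effect of family size; reading down a column holds family size fixed and shows the effect of income.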

13 Conditions That Can Arise with the Introduction of an Additional Variable into a Cross Tabulation

                                Initial Situation
With the Additional Variable    Some Relationship                     No Relationship
Change Conclusion               I   A. Refine Explanation             III
                                    B. Reveal Spurious Explanation
                                    C. Provide Limiting Conditions
Retain Conclusion               II                                    IV

14 The Researcher’s Dilemma

                            True Situation
Researcher’s Conclusion     No Relationship         Some Relationship
No Relationship             Correct Decision        Spurious Noncorrelation
Some Relationship           Spurious Correlation    Correct Decision if Concluded Relationship is of Proper Form

15 Appendix 14A: Chi-Square Tests

16 Measures of Association for Nominal Data

Measures Appropriate for Nominal Data
* Contingency Table (Chi-Square)
* Contingency Coefficient
* Index of Predictive Association
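The slides name these measures without giving formulas. As a hedged illustration, the sketch below computes two of them for the family size by cars counts from slide 7. It assumes the usual textbook definitions: the contingency coefficient as $C = \sqrt{\chi^2 / (\chi^2 + n)}$, and the index of predictive association as Goodman and Kruskal's lambda. The identification with lambda and all function names are assumptions, not taken from the deck.

```python
import math

# Observed counts; rows: family size (4 or less, 5 or more),
# columns: cars owned (1 or none, 2 or more).
observed = [[70, 8], [5, 17]]

def chi_square(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected under independence
            stat += (o - e) ** 2 / e
    return stat

def contingency_coefficient(table):
    """C = sqrt(chi2 / (chi2 + n)); 0 means no association."""
    n = sum(sum(row) for row in table)
    x2 = chi_square(table)
    return math.sqrt(x2 / (x2 + n))

def lambda_columns_given_rows(table):
    """Goodman-Kruskal lambda: proportional reduction in errors when
    predicting the column category once the row category is known."""
    n = sum(sum(row) for row in table)
    best_without = max(sum(col) for col in zip(*table))  # modal column overall
    best_with = sum(max(row) for row in table)           # modal column per row
    return (best_with - best_without) / (n - best_without)

c = contingency_coefficient(observed)
lam = lambda_columns_given_rows(observed)
print(round(c, 3), round(lam, 2))
```

Under these assumed definitions, knowing family size reduces the errors made in predicting car ownership from 25 to 13 out of 100 families.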

17 Cross Tabulations

Frequencies of Combinations of Row (i) and Column (j)

                 #Cars
Family Size    0 or 1    2+    Total
4 or less        70       8      78
5 or more         5      17      22
Total            75      25     100

18 Cross-Tabs & Chi-Squares

H0: Row variable is independent of column variable; i.e., no association between family size & #cars (analogous to “no correlation”).

                 #Cars
Family Size    0 or 1      2+          Total
4 or less                              78 (78%)
5 or more                              22 (22%)
Total          75 (75%)    25 (25%)    100

19 If Family Size & #Cars are Independent:

We’d EXPECT frequencies to be distributed “randomly”; i.e., in proportion to the margins.

                 #Cars
Family Size    0 or 1      2+          Total
4 or less       58.5       19.5        78 (78%)
5 or more       16.5        5.5        22 (22%)
Total          75 (75%)    25 (25%)    100

20 Using the Statistical Definition of “Independence” to Calculate the Expected Frequencies

If A & B are independent: $P(A_1 B_1) = P(A_1)\,P(B_1)$

$e_{11} = n\,P(A_1 B_1) = 100 \cdot \frac{78}{100} \cdot \frac{75}{100} = \frac{78 \times 75}{100}$
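The same rule, row total times column total divided by n, gives every expected cell, not only e11. A short Python sketch follows; the function name `expected_frequencies` is an illustrative assumption.

```python
# Observed counts; rows: family size (4 or less, 5 or more),
# columns: cars owned (0 or 1, 2+).
observed = [[70, 8], [5, 17]]

def expected_frequencies(table):
    """Expected counts under independence: e_ij = (row_i total * col_j total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

The result matches the expected frequencies shown on the previous slide (58.5, 19.5, 16.5, 5.5), and the expected counts add up to the same margins as the observed ones.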

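21 Chi-Square Formula

* Chi-square measures how much our data differ from what we’d expect (given the hypothesis of independence)
* Are the row and column variables associated?

$\chi^2 = \sum_i \sum_j \frac{(o_{ij} - e_{ij})^2}{e_{ij}}$, where $o_{ij}$ is the observed and $e_{ij}$ the expected frequency in row i, column j.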

22 Chi-Square for Our Data

$\chi^2 = \frac{(70-58.5)^2}{58.5} + \frac{(8-19.5)^2}{19.5} + \frac{(5-16.5)^2}{16.5} + \frac{(17-5.5)^2}{5.5} = 2.261 + 6.782 + 8.015 + 24.045 = 41.103$

Is this large?

df = degrees of freedom = (r-1)(c-1). For our 2x2 table, df = 1.
Critical value for $\chi^2$ with 1 df = 3.84 (at the .05 level).
$\chi^2$ = 41.103 exceeds 3.84, so we reject the hypothesis of independence.
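The calculation on this slide can be checked with a few lines of Python. This is a minimal sketch using only the standard library: the critical value 3.84 is taken from the slide rather than computed, and the p-value line relies on the identity P(chi-square with 1 df > x) = erfc(sqrt(x / 2)), which holds for 1 degree of freedom only. Function and variable names are illustrative.

```python
import math

# Observed counts; rows: family size (4 or less, 5 or more),
# columns: cars owned (0 or 1, 2+).
observed = [[70, 8], [5, 17]]

def chi_square_test(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected under independence
            stat += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

statistic, df = chi_square_test(observed)

CRITICAL_05_1DF = 3.84  # from the slide: .05 level, 1 df
reject_independence = statistic > CRITICAL_05_1DF

# Exact tail probability; valid only because df == 1 here.
p_value = math.erfc(math.sqrt(statistic / 2))

print(round(statistic, 3), df, reject_independence)
```

For tables larger than 2x2 the critical value would have to come from a chi-square table or a statistics library, since the erfc shortcut no longer applies.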

23 Extension Beyond 2-Way Tables

* Three-way table. Example: Family size x #Cars x household income
* Log Linear Models

24 One-Way Chi-Square

* Equation: $\chi^2 = \sum_i \frac{(o_i - e_i)^2}{e_i}$
* Degrees of Freedom: (r-1)
* When would you use this statistic?
  - e.g., to compare sample to population characteristics, or to a previous study’s benchmark
  - to investigate the great M&M caper

25 The Case of the Blue M&M’s:

Table columns: PLAIN (e_i’s, o_i’s) and PEANUT (e_i’s, o_i’s)
Table rows: blue, brown, green, orange, red, yellow

Critical chi-square on 5 df = 11.07
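The counts for this slide did not survive in the transcript, so the sketch below uses made-up numbers purely to show the mechanics of a one-way chi-square with six colors. The observed counts and the equal-share expectation are hypothetical and are not the M&M data from the presentation; only the critical value 11.07 on 5 df comes from the slide.

```python
COLORS = ["blue", "brown", "green", "orange", "red", "yellow"]

# HYPOTHETICAL observed counts for one bag (not from the presentation).
observed = [10, 12, 8, 11, 9, 10]

# HYPOTHETICAL benchmark: every color expected in equal proportion.
expected_share = [1 / len(COLORS)] * len(COLORS)

def one_way_chi_square(obs, shares):
    """Return (statistic, df) comparing observed counts with expected shares."""
    n = sum(obs)
    stat = 0.0
    for o, share in zip(obs, shares):
        e = n * share  # expected count for this category
        stat += (o - e) ** 2 / e
    return stat, len(obs) - 1

statistic, df = one_way_chi_square(observed, expected_share)

CRITICAL_05_5DF = 11.07  # from the slide: critical chi-square on 5 df
differs_from_benchmark = statistic > CRITICAL_05_5DF

print(round(statistic, 3), df, differs_from_benchmark)
```

With real data, `expected_share` would hold the benchmark proportions being tested, such as a published color mix or a previous study's distribution.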

