Association between Variables Measured at the Nominal Level.

Introduction ► The measures of association are more efficient methods of expressing an association than calculating percentages for bivariate tables—they express the relationship in a single number ► However, you always need to look at the bivariate tables (crosstabs), since a single number loses some information

Many Different Measures of Association ► Different ones are used for different levels of measurement (nominal, ordinal, or interval/ratio) ► When selecting a measure of association for assessing the relationship between variables measured at different levels, social scientists generally choose the measure that is appropriate for the lower of the two levels ▪ So if one variable is nominal and the other interval, you would use a measure of association appropriate for the nominal variable

Chi-Square-Based Measures of Association ► These are commonly used because chi square has already been calculated for inferential statistics, and it is simple to transform it into a measure of association ► We can see from the percentages in a bivariate table that two variables are associated, and we know from chi square whether the differences are statistically significant

Phi ► To find the strength of the association, we compute phi ► This statistic is used as a measure of association appropriate for tables with only two rows and two columns ► Formula 14.1 for phi: φ = √(χ² / N)

Phi, cont. ► Phi is the square root of the obtained chi square divided by the sample size ► For a 2 x 2 table, phi ranges in value from 0 (no association) to 1.00 (perfect association) ► A phi of .33 indicates a weak to moderate relationship between the two variables ► This measure does not reveal the pattern of the association, so we still need to look at the table
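A minimal Python sketch (not from the slides) of the phi calculation; the 2 x 2 table is a made-up example, and the chi square is computed by hand so that it matches Formula 14.1 exactly.

```python
import numpy as np

# Hypothetical 2 x 2 table: rows = categories of Y, columns = categories of X
obs = np.array([[30, 15],
                [20, 35]])

n = obs.sum()
# Expected frequencies: (row total * column total) / N
expected = np.outer(obs.sum(axis=1), obs.sum(axis=0)) / n
chi2 = ((obs - expected) ** 2 / expected).sum()

phi = np.sqrt(chi2 / n)   # Formula 14.1: phi = sqrt(chi square / N)
print(f"chi square = {chi2:.2f}, phi = {phi:.2f}")   # about 9.09 and .30 here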

Cramer's V ► For tables with three or more columns or three or more rows, phi has an upper limit that can exceed 1.00 ▪ Cramer's V is used for tables that are larger than 2 x 2, is based on chi square, and is also easy to calculate

Formula for Cramer's V: V = √(χ² / (N × min(r − 1, c − 1))), where r and c are the number of rows and columns in the table

Interpretation of Cramer's V ► It has an upper limit of 1.00 for any size table ► Like phi, it can be interpreted as an index that measures the strength of the association between two variables ► A major problem with phi and Cramer's V is the absence of a direct or meaningful interpretation for values between the extremes of 0.00 and 1.00 ▪ Both indicate the strength of the association ▪ But each is only an index of relative strength
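A similar hedged sketch for Cramer's V on a hypothetical 2 x 3 table; the cell counts are invented purely for illustration.

```python
import numpy as np

# Hypothetical 2 x 3 table: two categories of Y across three categories of X
obs = np.array([[20, 35, 25],
                [30, 15, 25]])

n = obs.sum()
rows, cols = obs.shape
expected = np.outer(obs.sum(axis=1), obs.sum(axis=0)) / n
chi2 = ((obs - expected) ** 2 / expected).sum()

# Cramer's V: chi square divided by N times the lesser of (rows - 1) and (columns - 1)
v = np.sqrt(chi2 / (n * min(rows - 1, cols - 1)))
print(f"chi square = {chi2:.2f}, Cramer's V = {v:.2f}")   # about 9.38 and .25 here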

Proportional Reduction in Error (PRE) ► For nominal-level variables, the logic of PRE involves first attempting to guess or predict the category into which each case will fall on the dependent variable (Y) while ignoring the independent variable (X) ▪ We will be predicting blindly in this case and will make many errors ► The second step is to predict the category of each case on the dependent variable again, this time taking the independent variable into account

PRE, cont. ► If the two variables are associated, the additional information supplied by the independent variable should enable us to reduce our errors of prediction ► The stronger the association between the variables, the more we will reduce our errors ▪ In the case of a perfect association, we would make no errors at all when predicting scores on Y from scores on X ▪ When there is no association between the variables, knowledge of the independent variable will not improve the accuracy of our predictions; we would make just as many errors of prediction either way

Lambda ► Lambda is a PRE measure for nominal-level variables ► We know from the percentages in the table that gender and height are associated ► To measure the strength of this association, a PRE measure called lambda will be calculated ▪ First, we need to find the number of prediction errors made while ignoring the independent variable (gender) ▪ Then we will find the number of prediction errors made while taking gender into account ▪ These two sums will be compared in order to derive the statistic

Example of Height by Gender (Table 12.15) ► We can ignore the information given by the independent variable (gender) by working only with the row marginals ▪ Two different predictions can be made using these marginals ► We can predict either that all subjects are tall or that all subjects are short (these are the only two predictions permitted by lambda) ▪ For the first prediction (all subjects are tall), 48 errors will be made ► For this prediction, all 100 cases would be placed in the first row ► Since only 52 cases belong in this row, this prediction would result in (100 – 52) or 48 errors

Example, cont. ► If we had predicted that all subjects were short, we would have made 52 errors (100 – 48 = 52) ► We will use the lesser of these two numbers and refer to this quantity as E₁, the number of errors made while ignoring the independent variable ▪ So, E₁ = N – (largest row total) = 100 – 52 = 48

Second Step ► The second step in computing lambda is to again predict scores on Y (height), this time taking X (gender) into account ▪ Follow the same procedure as in the first step, but this time move from column to column ► Since each column is a category of X, we take X into account in making our predictions ▪ For the left-hand column (males), we predict that all 50 cases will be tall and make six errors ▪ For the second column (females), our prediction is that all females are short, and eight errors will be made ▪ We have made a total of 14 errors of prediction, a quantity we will label E₂

Logic of Lambda ► The logic of lambda is that, if the variables are associated, fewer errors will be made under the second procedure than under the first (we want E₂ to be less than E₁) ▪ Clearly, gender and height are associated, since we made fewer errors of prediction while considering gender (E₂ = 14) than while ignoring gender (E₁ = 48)

Computing Lambda ► To find the proportional reduction in error, use Formula 12.3: λ = (E₁ – E₂) / E₁ = (48 – 14) / 48 = 34 / 48 = .71

Interpretation of Lambda ► For the above example, lambda equals .71 ► Lambda has a possible range of 0 to 1.00 ▪ A lambda of 0 would indicate that the information given by the independent variable does not improve our ability to predict the dependent variable and, therefore, that there is no association between the variables ▪ A lambda of 1.00 would mean that it is possible to predict Y without error from X
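A short Python sketch of the lambda calculation; the cell counts are reconstructed from the error counts quoted above (44 tall and 6 short males, 8 tall and 42 short females), so the exact table should be treated as an assumption rather than a reproduction of Table 12.15.

```python
import numpy as np

# Height by gender, cells reconstructed from the slides' error counts (assumed):
# rows = height (tall, short), columns = gender (male, female)
obs = np.array([[44,  8],    # tall
                [ 6, 42]])   # short

n = obs.sum()

# E1: predict the modal row category for every case, ignoring the independent variable
e1 = n - obs.sum(axis=1).max()

# E2: within each column, predict that column's modal category; sum the errors
e2 = sum(col.sum() - col.max() for col in obs.T)

lam = (e1 - e2) / e1
print(f"E1 = {e1}, E2 = {e2}, lambda = {lam:.2f}")   # E1 = 48, E2 = 14, lambda = 0.71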

PRE Interpretation ► Additionally, lambda allows a direct and meaningful interpretation of the values in between ▪ When multiplied by 100, the value of lambda directly indicates the proportional reduction in error, which is the strength of the association ▪ So, a lambda of .71 tells us that knowledge of gender improves our ability to predict height by 71% ► Or, we are 71% better off knowing gender when attempting to predict height than we are not knowing gender

Other Examples ► If lambda = .20, this indicates that we are 20% better off knowing the independent variable when attempting to predict a person's score or value on the dependent variable ► If we make 75 errors when predicting Y without knowledge of X, and 60 errors when predicting Y with knowledge of X, then X and Y are associated (lambda = (75 – 60) / 75 = .20) ► If the value of lambda is relatively low, we may conclude that other variables are importantly associated with the dependent variable

Problems with Lambda ► Lambda's value changes if you reverse the independent and dependent variables ▪ We need to follow the convention of putting the independent variable in the columns and computing lambda as done above ▪ You also need to be confident about which variable is the independent one and which is the dependent one

Second Problem ► If one of the row totals is much larger than the others, lambda can take on a value of 0 even when other measures of association are not 0 and calculating percentages for the table indicates some association between the variables ▪ This suggests using great caution when interpreting lambda if the row marginals are very unequal ▪ If the row totals are unequal, you should also use a chi-square-based measure of association (phi or Cramer's V) ▪ If, for the same bivariate table, Cramer's V is .27 while lambda is zero, we can conclude that the variables may be associated even though lambda is zero; we need to disregard lambda when the row marginals are very unequal
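An illustrative sketch of this problem (the table below is hypothetical, not the one the slide refers to): with very unequal row marginals the same row is the mode in every column, so lambda falls to 0 even though Cramer's V and the column percentages show some association.

```python
import numpy as np

# Hypothetical table with very unequal row marginals:
# the top row is the modal category in every column, so lambda collapses to 0
obs = np.array([[40, 35],   # row total 75
                [10, 15]])  # row total 25

n = obs.sum()
expected = np.outer(obs.sum(axis=1), obs.sum(axis=0)) / n
chi2 = ((obs - expected) ** 2 / expected).sum()
v = np.sqrt(chi2 / (n * (min(obs.shape) - 1)))     # Cramer's V (equals phi for 2 x 2)

e1 = n - obs.sum(axis=1).max()
e2 = sum(col.sum() - col.max() for col in obs.T)
lam = (e1 - e2) / e1

print(f"Cramer's V = {v:.2f}, lambda = {lam:.2f}")  # V is nonzero, lambda = 0.00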