
1 Measures of Association for Contingency Tables

2 Measures of Association General measures of association that can be used with any variable types. Measures of association when both X and Y are nominal. Measures of association when both X and Y are ordinal. Measures of association when X and Y are both ordinal or dichotomous nominal.

3 Measures of Association There are two main classes of measures of association: symmetric and asymmetric. Symmetric measures will be the same if the roles of X and Y are reversed. In other words, it does not matter which variable is viewed as the independent variable (X) and which is viewed as the dependent variable (Y).

4 Measures of Association Asymmetric measures will be different if the roles of X and Y are reversed. In other words, which variable is viewed as the independent variable (X) and which is viewed as the dependent variable (Y) matters.

5 Measures of Association
Asymmetric Measure | Ordinal Variables | Nominal Variables
Yes | Somers’ D | Lambda (λ) – asymmetric; Uncertainty Coefficient – asymmetric
No | Gamma (γ); Kendall’s Tau-b; Stuart’s Tau-c | Phi (φ); Yule’s Q (2 x 2 tables); Cramer’s V (r x c tables); Pearson’s Contingency Coefficient (C); Uncertainty Coefficient – symmetric; Lambda (λ) – symmetric
Gamma and Yule’s Q are the same for 2 x 2 tables.

6 Measures of Association
Rule of thumb for interpreting the magnitude (i.e. ignoring the sign/direction) of the various measures of association we will be examining:
.00 to <.10 “no relationship”
.10 to <.30 “weak relationship”
.30 to <.50 “moderate relationship”
.50 to 1.00 “strong relationship”
You could find several other adjective scales; these are NOT set in stone!
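
The rule of thumb above can be sketched as a small lookup. This is only an illustration: the function name `describe_magnitude` is a hypothetical helper, and the cutoffs are the ones listed on the slide, which the slide itself says are not set in stone.

```python
def describe_magnitude(value):
    """Map a measure of association to the slide's adjective scale.

    Only the magnitude is used, so the sign (direction) is ignored.
    """
    magnitude = abs(value)
    if magnitude > 1.0:
        raise ValueError("these measures of association lie between -1 and 1")
    if magnitude < 0.10:
        return "no relationship"
    if magnitude < 0.30:
        return "weak relationship"
    if magnitude < 0.50:
        return "moderate relationship"
    return "strong relationship"


print(describe_magnitude(-0.42))  # magnitude 0.42 falls in the .30 to <.50 band
```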

7 Measures of Association (The table of measures from slide 5 is shown again.) Highlighted group: SYMMETRIC AND CAN BE USED WITH ANY DATA TYPES

8 Measures of Association Between Two Categorical Variables - Phi statistic SYMMETRIC AND CAN BE USED WITH ANY DATA TYPES

9 Measures of Association Between Two Categorical Variables – Phi statistic This can be applied to the cervical cancer case-control study. Using this measure, there is a weak association between risk factor and disease status.
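
The formula on the Phi slides did not survive in this transcript. A common definition is phi = sqrt(chi-square / n), and the sketch below assumes that is the one intended. The cervical cancer counts are not given here, so the example uses the 2 x 2 pain table from slide 22 of this deck; the function names are hypothetical.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def phi(table):
    """Phi as sqrt(chi-square / n); this gives the magnitude only."""
    n = sum(sum(row) for row in table)
    return math.sqrt(chi_square(table) / n)


# Slide 22 table: rows = physical pain (high, no), columns = psych pain (high, no).
pain = [[11, 7], [10, 29]]
print(round(phi(pain), 3))
```

For this table phi comes out at roughly 0.34, which the rule of thumb on slide 6 would call a moderate relationship.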

10 Measures of Association Between Two Categorical Variables – Yule’s Q
Disease | Risk Present | Risk Absent
Yes | a | b
No | c | d

11 Measures of Association Between Two Categorical Variables – Yule’s Q There is a strong association between risk factor (Preg. Age < 25) and case-control status (Cervical Cancer) using this measure.

12 Measures of Association Between Two Categorical Variables – Yule’s Q There is a strong association between risk factor (Preg. Age < 25) and case-control status (Cervical Cancer) using this measure.
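
With the cell labels a, b, c, d from slide 10, Yule's Q is usually written Q = (ad - bc) / (ad + bc). The slide formula itself is missing from this transcript, so treat the sketch below as an assumption about what was shown. The cervical cancer counts are not reproduced here, so the illustration uses the pain table from slide 22; `yules_q` is a hypothetical name.

```python
def yules_q(a, b, c, d):
    """Yule's Q for a 2 x 2 table laid out as [[a, b], [c, d]]."""
    return (a * d - b * c) / (a * d + b * c)


# Slide 22 pain table: a = 11, b = 7, c = 10, d = 29.
q = yules_q(11, 7, 10, 29)
print(round(q, 3))
```

Q is about 0.64 for the pain table, while phi for the same table is about 0.34, so the adjective a table earns depends on which measure is chosen.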

13 Measures of Association Between Two Categorical Variables – Cramer’s V SYMMETRIC AND CAN BE USED WITH ANY DATA TYPES

14 Measures of Association Between Two Categorical Variables – Cramer’s V Is there a relationship between histological type of Hodgkin’s disease and response to treatment? For the Hodgkin’s study, Cramer’s V suggests a weak relationship between histological type and response to treatment.

15 Measures of Association Between Two Categorical Variables – Pearson’s C This can be used for general r x c tables regardless of the data types involved. SYMMETRIC AND CAN BE USED WITH ANY DATA TYPES

16 Measures of Association Between Two Categorical Variables – Pearson’s C This can be used for the Hodgkin’s example. Pearson’s C suggests a moderate relationship between type and response to treatment.
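
The formulas for Cramer's V and Pearson's C are not preserved in this transcript. The standard definitions are V = sqrt(chi-square / (n * min(r - 1, c - 1))) and C = sqrt(chi-square / (chi-square + n)), and the sketch below assumes those. The Hodgkin's counts are not available here, so the table is hypothetical and chosen only to keep the arithmetic simple.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )


def cramers_v(table):
    """Cramer's V for an r x c table."""
    n = sum(sum(row) for row in table)
    k = min(len(table) - 1, len(table[0]) - 1)
    return math.sqrt(chi_square(table) / (n * k))


def pearsons_c(table):
    """Pearson's contingency coefficient C for an r x c table."""
    n = sum(sum(row) for row in table)
    x2 = chi_square(table)
    return math.sqrt(x2 / (x2 + n))


# Hypothetical 2 x 3 table (NOT the Hodgkin's data): chi-square = 20, n = 120.
example = [[10, 20, 30], [30, 20, 10]]
print(round(cramers_v(example), 3), round(pearsons_c(example), 3))
```

The two measures use the same chi-square but scale it differently, which is why they can land in different bands of the rule of thumb, as they do for the Hodgkin's example on the slides.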

17 Measures of Association (The table of measures from slide 5 is shown again.) Highlighted group: ASYMMETRIC AND CAN BE USED WITH NOMINAL X & Y

18 Measures of Association Between Two Categorical Variables – Lambda Lambda (λ) is an asymmetric measure of association suitable for use with nominal variables that looks at predictive ability, i.e. how well one variable predicts the level of the other. It indicates the strength of the association between the independent (X) and dependent (Y) variables. It may range from 0.0 (meaning the extra information provided by the independent variable does not help prediction) to 1.0 (meaning use of the independent variable results in no prediction errors). It is asymmetric, i.e. which variable is viewed as X and which as Y matters!

19 Measures of Association Between Two Categorical Variables – Lambda λ = (E1 − E2) / E1, where E1 is the number of prediction errors made without using X and E2 is the number of prediction errors made when using X.

20 Measures of Association Between Two Categorical Variables – Lambda The best way to see how these formulae work and the rationale behind them is to consider an example.

21 Example: Physical and Psychological Pain of DBM Admits These data come from a study conducted by three master’s nursing students who recently graduated (Kelsey, Woods, & Langhans). One of the questions examined was whether there was a relationship between high physical pain at admission and high psychological pain. The high classification for psych pain meant 5 on a five-point ordinal scale, and high physical pain meant 5+ on the ten-point pain scale.

22 Example: Physical and Psychological Pain of DBM Admits Below is the 2 X 2 table of the results with Physical Pain as Row (Y) and Psych Pain as Column (X).
Physical Pain | High Psych Pain | No | Row Totals
High Phys. Pain | 11 | 7 | 18
No | 10 | 29 | 39
Column Totals | 21 | 36 | n = 57

23 Example: Physical and Psychological Pain of DBM Admits (Table as on slide 22.) In the absence of any information about psychological pain, we predict that a subject will not be suffering from high physical pain, as that is the modal level on the physical pain scale. This prediction is wrong for the 18 subjects who do have high physical pain.

24 Example: Physical and Psychological Pain of DBM Admits (Table as on slide 22.) Using Psych Pain to predict Physical Pain status, we see that if the subject has high Psych Pain the modal response is High Physical Pain, and if the subject does not have high Psych Pain the modal response is not having high Physical Pain. These predictions are wrong for 10 + 7 = 17 subjects.

25 Example: Physical and Psychological Pain of DBM Admits (Table as on slide 22.) Using Psych Pain to predict Physical Pain status, we see that if the subject has high Psych Pain the modal response is High Physical Pain, and if the subject does not have high Psych Pain the modal response is not having high Physical Pain. Knowing Psych Pain therefore reduces the prediction errors from 18 to 17.

26 Example: Physical and Psychological Pain of DBM Admits
Psych Pain | High Physical Pain | No | Row Totals
High Psych Pain | 11 | 10 | 21
No | 7 | 29 | 36
Column Totals | 18 | 39 | n = 57
Using Physical Pain to predict Psychological Pain status, we see that if the subject has high Physical Pain the modal response is High Psych Pain, and if the subject does not have high Physical Pain the modal response is not having High Psych Pain.

27 Example: Physical and Psychological Pain of DBM Admits
Psych Pain | High Physical Pain | No | Row Totals
High Psych Pain | 11 | 10 | 21
No | 7 | 29 | 36
Column Totals | 18 | 39 | n = 57
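
The hand calculation on the preceding slides can be sketched in code. This assumes the usual Goodman-Kruskal definition, lambda = (E1 - E2) / E1, with E1 the errors made by always guessing the modal category of Y and E2 the errors made by guessing the modal category of Y within each level of X. The function names are hypothetical.

```python
def prediction_errors(table):
    """Return (E1, E2) for predicting the ROW variable from the COLUMN variable."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    e1 = n - max(row_totals)  # errors when guessing the modal row for everyone
    e2 = sum(sum(col) - max(col) for col in zip(*table))  # modal row within each column
    return e1, e2


def lambda_asymmetric(table):
    """Lambda with rows as the dependent variable (Y) and columns as X."""
    e1, e2 = prediction_errors(table)
    return (e1 - e2) / e1


def lambda_symmetric(table):
    """Symmetric lambda: pool the error reductions from both directions."""
    transposed = [list(col) for col in zip(*table)]
    e1_rows, e2_rows = prediction_errors(table)
    e1_cols, e2_cols = prediction_errors(transposed)
    return ((e1_rows - e2_rows) + (e1_cols - e2_cols)) / (e1_rows + e1_cols)


# Slide 22 table: rows = physical pain (Y), columns = psych pain (X).
pain = [[11, 7], [10, 29]]
pain_transposed = [list(col) for col in zip(*pain)]  # slide 26 layout

print(lambda_asymmetric(pain))             # physical pain predicted from psych pain
print(lambda_asymmetric(pain_transposed))  # psych pain predicted from physical pain
print(lambda_symmetric(pain))
```

The two asymmetric values differ (1/18 versus 4/21), which illustrates why it matters which variable is treated as X and which as Y.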

28 Example: Physical and Psychological Pain of DBM Admits The Lambda association measures are highlighted. You can see they match those we calculated by hand on the previous slides. The Uncertainty Coefficient is calculated differently, but like Lambda it measures the proportional reduction in error (PRE), so it can be interpreted in a similar fashion.

29 Measures of Association (The table of measures from slide 5 is shown again.) Highlighted group: SYMMETRIC AND ASYMMETRIC MEASURES USED TO MEASURE THE ASSOCIATION BETWEEN ORDINAL VARIABLES.

30 Measures of Association Between Two Ordinal Variables Some of the previously discussed measures can be used. However, for cases where both variables are ordinal, better measures include Gamma, Kendall’s Tau-b, Stuart’s Tau-c and Somers’ D. We will discuss these in a bit. First, though, in some cases we wish to measure the degree of exact agreement between two nominal or ordinal variables measured using the same levels or scales, in which case we generally use Cohen’s Kappa (κ).

31 Medicare Health Outcomes Survey Website for Medicare Health Outcomes Survey: http://www.hosonline.org/Content/Default.aspx

32 Medicare Health Outcomes Survey (HOS) FROM THE MEDICARE HOS SURVEY WEBSITE: The Medicare HOS is the first patient-reported outcomes measure used in Medicare managed care. The goal of the Medicare HOS program is to gather valid and reliable clinically meaningful data that have many uses, such as for targeting quality improvement activities and resources; monitoring health plan performance and rewarding top-performing health plans; helping beneficiaries make informed health care choices; and advancing the science of functional health outcomes measurement. Managed care plans with Medicare Advantage (MA) contracts must participate. Each spring a random sample of Medicare beneficiaries is drawn from each participating Medicare Advantage Organization (MAO), that has a minimum of 500 enrollees and is surveyed (i.e., a survey is administered to a different baseline cohort, or group, each year). Two years later, these same respondents are surveyed again (i.e., follow up measurement). Cohort 1 was surveyed in 1998 and was resurveyed in 2000. Cohort 2 was surveyed in 1999 and was resurveyed in 2001, and so on. During the current HOS administration (2013 Round 16), Cohort 16 is surveyed and Cohort 14 is resurveyed using HOS 2.5. For data collection years 1998-2006, the MAO sample size was one thousand. Effective 2007, the MAO sample size was increased to twelve hundred.

33 Measures of Association Between Two Categorical Variables Cohen’s Kappa (κ) – measures the degree of agreement between two variables on the same scales. HOS Study – General health was measured ordinally at baseline and at the 2-yr. follow-up; how well do they agree? The interpretation scale for κ runs from marginal agreement (κ just above 0) through good agreement to excellent agreement (κ close to 1). There is fairly good agreement between the general assessment of overall health at baseline and at follow-up. However, there appears to be some general trend for improvement as well.
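
As a sketch of how kappa is computed, assuming the standard unweighted definition κ = (p_o − p_e) / (1 − p_e), where p_o is the observed proportion on the diagonal and p_e is the proportion expected on the diagonal by chance. The HOS counts are not reproduced in this transcript, so the 2 x 2 table below is hypothetical, as is the function name.

```python
def cohens_kappa(table):
    """Unweighted Cohen's kappa for a square table of counts.

    Rows and columns must use the same categories in the same order.
    """
    if any(len(row) != len(table) for row in table):
        raise ValueError("kappa needs a square table")
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    observed = sum(table[i][i] for i in range(len(table))) / n
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings (NOT the HOS data): observed agreement 0.7, chance agreement 0.5.
ratings = [[20, 5], [10, 15]]
print(round(cohens_kappa(ratings), 2))
```

Kappa only credits exact agreement; it says nothing about whether the disagreements lean in one direction, which is what Bowker's test on the next slide addresses.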

34 Bowker’s Test of Symmetry Symmetry of Disagreement Bowker’s test suggests the differences are asymmetric (p <.0001). Examining the percentages suggests a majority of patients either stayed the same or improved in each group based on baseline score. Therefore it is reasonable to state that we have evidence that, in general, subjects’ health stayed the same or, if it did change, it generally changed for the better (p <.0001).

35 Kruskal’s Gamma (γ) Before computing Gamma we need to introduce the concept of discordant and concordant paired observations. Paired observations – Observations compared in terms of their relative rankings on the independent (X) and dependent variable (Y).

36 Kruskal’s Gamma (γ) Same order pair (Ns) – Paired observations that show a positive association; the member of the pair ranked higher on the independent variable is also ranked higher on the dependent variable. Inverse order pair (Nd) – Paired observations that show a negative association; the member of the pair ranked higher on the independent variable is ranked lower on the dependent variable.

37 Kruskal’s Gamma (γ) Gamma is a symmetrical measure of association suitable for use with ordinal variables or with dichotomous nominal variables. For dichotomous nominal variables it is the same as Yule’s Q for 2 X 2 tables. It can vary from 0.0 (meaning the extra information provided by the independent variable does not help prediction) to ±1.0 (meaning use of the independent variable results in no prediction errors) and provides us with an indication of the strength and direction of the association between the variables. When there are more Ns pairs, gamma will be positive; when there are more Nd pairs, gamma will be negative.

38 Example 1: Job Security & Satisfaction
Job Satisfaction | High | Medium | Low
High | 16 | 8 | 14
Medium | 19 | 17 | 60
Low | 9 | 11 | 56
Job Security

39 Example 1: Job Security & Satisfaction (Table as on slide 38.)

40 Example 1: Job Security & Satisfaction (Table as on slide 38.)

41 Example 1: Job Security & Satisfaction (Table as on slide 38.)
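
The counting of same-order (Ns) and inverse-order (Nd) pairs can be sketched as follows, assuming the usual definition gamma = (Ns − Nd) / (Ns + Nd). The job table counts are an assumption: they are read from the run-together digits in this transcript as 16, 8, 14 / 19, 17, 60 / 9, 11, 56, with both variables ordered High, Medium, Low. The function names are hypothetical.

```python
def concordant_discordant(table):
    """Count same-order (Ns) and inverse-order (Nd) pairs in an ordered r x c table."""
    ns = nd = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # only look at rows below cell (i, j)
                for m in range(n_cols):
                    if m > j:                   # below and to the right: same order
                        ns += table[i][j] * table[k][m]
                    elif m < j:                 # below and to the left: inverse order
                        nd += table[i][j] * table[k][m]
    return ns, nd


def gamma(table):
    """Gamma as (Ns - Nd) / (Ns + Nd); tied pairs are ignored."""
    ns, nd = concordant_discordant(table)
    return (ns - nd) / (ns + nd)


# Assumed reading of the slide 38 table (both variables ordered High, Medium, Low).
job = [[16, 8, 14], [19, 17, 60], [9, 11, 56]]
print(concordant_discordant(job), round(gamma(job), 3))

# For a 2 x 2 table gamma reduces to Yule's Q: (ad - bc) / (ad + bc).
pain = [[11, 7], [10, 29]]
print(round(gamma(pain), 3))
```

Under that reading of the table, Ns exceeds Nd, so gamma is positive: higher job security goes with higher job satisfaction.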

42 Example 2: Medicare Survey – General Health: Baseline vs. Follow-up Each highlighted measure suggests a strong relationship between general health at baseline and general health at follow-up as all measures exceed 0.50. The association is also positive indicating if health was good at baseline it also tends to be good at follow-up.

43 Summary We have considered the following measures of association for contingency tables. Depending on the variable types and the goals of our analysis, we generally choose from among these measures.

44 Other Measures for Ordinal Variables There are other measures that can be used when both X and Y are ordinal in nature. These are more akin to the traditional correlation measure for continuous X and Y, which is Pearson’s Product Moment Correlation (r). Spearman’s Rank Correlation (a.k.a. Spearman’s Rho), Kendall’s Tau (τ), and Hoeffding’s D are all available in JMP; they are obtained through Analyze > Multivariate Methods and are found under the Nonparametric Correlations option.

45 Example: NHANES Survey The National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews and physical examinations.

46 Example: NHANES Dermatology Survey This link will take you to a description of the NHANES dermatology survey module conducted in 2005-2006. http://www.cdc.gov/nchs/nhanes/nhanes2005-2006/DEQ_D.htm

47 Example: NHANES Dermatology Survey Here we are examining ordinal measures on several variables pertaining to sun protective measures. The higher the score, the more frequently the respondent said they used the preventative measure. As these are ALL ordinal variables, the use of Pearson’s Product Moment Correlation is NOT appropriate!

48 Example: NHANES Dermatology Survey The nonparametric correlations we might consider using are found in the Nonparametric Correlations pull-out menu. Spearman’s Rho is a good choice when X and Y are continuous but neither variable is normally distributed or if there are noticeable outliers. It can also be used with ordinal variables like we have here. Kendall’s Tau is also a valid choice for ordinal variables. Hoeffding’s D is good when the relationship between X and Y is nonlinear which would rarely, if ever, be the case for ordinal X and Y.
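
Spearman's Rho can be sketched as Pearson's correlation applied to ranks, with tied observations given the average of the ranks they occupy. The slides use JMP for this; the code below is only an illustration, the function names are hypothetical, and the two score lists are made-up ordinal responses rather than NHANES data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson's product moment correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's Rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up 1-5 frequency scores for two sun protection habits (NOT NHANES data).
shade = [1, 2, 2, 3, 4, 5, 5]
sunscreen = [1, 1, 3, 3, 4, 4, 5]
print(round(spearman_rho(shade, sunscreen), 3))
```

Because only the ranks enter the calculation, the result does not depend on the spacing between score values, which is what makes it usable for ordinal scales.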

49 Example: NHANES Dermatology Survey Summary: As one would expect, all correlations are positive, as someone who is cautious in one aspect of sun protection probably tends to be cautious in others as well. Spearman’s ρ and Kendall’s τ yield similar results. Hoeffding’s D should not be used for these data!

50 Summary If X and Y are ordinal but not on the same scale, or if they are on the same scale but agreement is not of primary interest, then there are several choices: Gamma, Kendall’s Tau-b, Stuart’s Tau-c and Somers’ D. Try them all and pick the one you think is “best”. For non-ordinal associations you again have several choices: Phi, Cramer’s V (Yule’s Q), Lambda, Uncertainty Coefficient, etc. Again, try them all, think about what you are trying to show, and choose the one you think is “best”.

51 Summary If X and Y are ordinal and on exactly the same scale, we can examine Cohen’s Kappa (κ) to measure the degree of exact agreement. To test for any asymmetries (i.e. a tendency for X > Y or X < Y) we can use Bowker’s Test for r x r tables or McNemar’s Test for 2 X 2 tables.

52 Summary If you clearly have an independent variable (X) and a dependent variable (Y) then you might consider the asymmetric options. If X and Y are interchangeable and you simply want to measure or quantify the degree of association then a symmetric measure would be preferable.

