
1 Measuring Association
The contents in this chapter are from Chapter 19 of the textbook. The crimjust.sav data will be used.
cjsrate: RATE JOB DONE: CJ SYSTEM OVERALL
judgrate: RATE JOB DONE: VT'S JUDGES OVERALL
proscrat: RATE JOB DONE: VT'S PROSECUTORS

2 Measures of association
A survey asked people to rate the criminal justice system in Vermont. Only 3.2% of the 601 respondents rated the system as Excellent.

3 Measures of association
Of the people who rated judges as Excellent, 27% also rated the system as Excellent.

4 Measures of association
Almost 69% of the people who rated prosecutors as Good also rated the system as Good.

5 The word "related" can have many different meanings. A perfect relationship is one in which all people gave the same rating to the overall system and to a particular component. Imperfect relationships can be quantified in many different ways, so we need measures of association. In general, their absolute values range from 0 to 1.

6 Measures of association-Lambda
Let $E_1$ be the number of cases misclassified in situation 1, and let $E_2$ be the number misclassified in situation 2. The measure of association lambda is defined by
$$\lambda = \frac{E_1 - E_2}{E_1}$$
Let us see an example in the crimjust.sav data.

7 Measures of association-Lambda
For the table on p. 3 we consider:
Situation 1: If we predict Good for everyone, the number misclassified is 19 + 229 + 62 = 310.
Situation 2: Use the rule "for each category of the independent variable, predict the category of the dependent variable that occurs most frequently." Under this rule the misclassified counts are
Excellent: 10 + 7 = 17
Good: 6 + 88 + 10 = 104
Only fair: 3 + 54 + 21 = 78
Poor: 4 + 17 = 21
The total misclassified is 17 + 104 + 78 + 21 = 220, so lambda = (310 - 220)/310 ≈ 0.29.
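The two prediction situations can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure: the function name `goodman_kruskal_lambda` and the small `toy_table` are assumptions made for this example, since the full crimjust.sav crosstabulation is not reproduced in the transcript. Only the error counts 310 and 220 come from the slide.

```python
def goodman_kruskal_lambda(table):
    """Lambda for predicting the column variable from the row variable.

    table[i][j] is the count of cases in row category i (independent
    variable) and column category j (dependent variable).
    """
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    # Situation 1: predict the overall most frequent column category for everyone.
    errors_1 = n - max(col_totals)
    # Situation 2: within each row, predict that row's most frequent column category.
    errors_2 = sum(sum(row) - max(row) for row in table)
    return (errors_1 - errors_2) / errors_1


# Hypothetical 2 x 2 table, for illustration only (not the crimjust.sav counts).
toy_table = [[10, 2],
             [3, 15]]
toy_lambda = goodman_kruskal_lambda(toy_table)

# Error counts quoted on the slide: 310 in situation 1, 220 in situation 2.
slide_lambda = (310 - 220) / 310

print(round(toy_lambda, 3), round(slide_lambda, 3))
```

The function divides by the situation 1 error count, so it assumes the dependent variable takes more than one value in the data.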

8 Measures of association-Lambda
The value of $\lambda$ ranges from 0 to 1. $\lambda = 1$ indicates that you make no prediction errors. $\lambda = 0$ means that the independent variable is of no help in prediction.
Two different lambdas: lambda is not a symmetric measure. Its value depends on which variable you predict from which.

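9 Measures of association-Lambda
Two different lambdas
We have calculated the lambda for predicting judgrate. For predicting cjsrate, lambda is (284 - 230)/284 = 0.19. The symmetric lambda is defined by
$$\lambda_{\text{sym}} = \frac{(E_1^{J} - E_2^{J}) + (E_1^{C} - E_2^{C})}{E_1^{J} + E_1^{C}}$$
where $E_1$ and $E_2$ are the numbers misclassified in situations 1 and 2, and the superscripts J and C mark whether judgrate or cjsrate is being predicted.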
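As a rough check on how the two directional lambdas combine, the sketch below pools the error reductions from both directions. The slide's own formula did not survive in the transcript, so the pooled form used here (sum of the two error reductions over the sum of the two situation 1 error counts) is an assumption based on the usual definition of symmetric lambda. The helper name `symmetric_lambda` is hypothetical; the four error counts are the ones quoted on the slides.

```python
def symmetric_lambda(errors_1_a, errors_2_a, errors_1_b, errors_2_b):
    """Pooled lambda from the error counts of both prediction directions."""
    reduction = (errors_1_a - errors_2_a) + (errors_1_b - errors_2_b)
    return reduction / (errors_1_a + errors_1_b)


# Predicting judgrate: 310 errors without, 220 with the independent variable.
lambda_judgrate = (310 - 220) / 310
# Predicting cjsrate: 284 errors without, 230 with the independent variable.
lambda_cjsrate = (284 - 230) / 284
lambda_symmetric = symmetric_lambda(310, 220, 284, 230)

print(round(lambda_judgrate, 2), round(lambda_cjsrate, 2), round(lambda_symmetric, 2))
```

Under this assumed definition the symmetric value always lies between the two directional lambdas.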

10 Measures of association-Lambda

11 Measures of association for ordinal variables
In the discussion so far we did not use the order information: Excellent > Good > Only fair > Poor. If judges' ratings increase as overall ratings increase, the two variables have a positive relationship. A negative relationship is defined similarly: one set of ratings decreases as the other increases.

12 Measures of association for ordinal variables
Concordant and discordant pairs
A pair of cases is concordant if the case with the larger value on one variable also has the larger value on the other variable. A pair is discordant if the direction is reversed for the second variable. A pair tied on either variable is neither.
         cjsrate  judgrate
Case 1   1        2
Case 2   2        3
Case 3   3        2
Here cases 1 and 2 are concordant, cases 2 and 3 are discordant, and cases 1 and 3 are tied on judgrate.

13 Measures of association for ordinal variables
Concordant and discordant pairs
Let P be the number of concordant pairs and Q the number of discordant pairs among all distinct pairs of observations. Goodman and Kruskal's gamma is defined by
$$\gamma = \frac{P - Q}{P + Q}$$
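Counting P and Q by hand gets tedious quickly, so here is a small sketch that does it for the three cases listed on slide 12 and then computes gamma. The function names are hypothetical, and the code assumes that a pair tied on either variable counts as neither concordant nor discordant.

```python
from itertools import combinations


def count_pairs(cases):
    """Return (P, Q): the numbers of concordant and discordant pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(cases, 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables order the pair the same way
        elif product < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # product == 0: tied on at least one variable, counted in neither
    return concordant, discordant


def gamma(cases):
    """Goodman and Kruskal's gamma, (P - Q) / (P + Q)."""
    p, q = count_pairs(cases)
    return (p - q) / (p + q)


# (cjsrate, judgrate) codes for cases 1 to 3 on slide 12.
cases = [(1, 2), (2, 3), (3, 2)]
p, q = count_pairs(cases)
g = gamma(cases)
print(p, q, g)
```

With one concordant and one discordant pair the two cancel, which is why gamma for these three cases comes out as zero.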

14 Measures of association for ordinal variables
A positive gamma tells you that there are more like (concordant) pairs of cases than unlike (discordant) pairs, so there is a positive relationship between the two sets of ratings: as judges' ratings increase, so do ratings of the overall system. If two variables are independent, the value of gamma is 0. However, a gamma of 0, like a lambda of 0, does not necessarily mean independence.

15 Measures of association for ordinal variables
Kendall's tau-b
Tau-b is a measure that attempts to normalize P - Q by considering ties on each variable in a pair separately. It is defined by
$$\tau_b = \frac{P - Q}{\sqrt{(P + Q + T_X)(P + Q + T_Y)}}$$
where $T_X$ is the number of ties involving only the first variable and $T_Y$ is the number of ties involving only the second variable. Tau-b can attain the values 1 and -1 only for tables that have the same number of rows and columns.
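The tie handling in tau-b is easier to see in code. The sketch below assumes the usual form of tau-b, (P - Q) divided by the square root of (P + Q + T_X)(P + Q + T_Y), because the slide's formula is missing from the transcript. The function name and the five `toy_cases` are made up for illustration and are not taken from crimjust.sav.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(cases):
    """Tau-b from raw (x, y) cases, counting ties on each variable separately."""
    p = q = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(cases, 2):
        dx, dy = x1 - x2, y1 - y2
        if dx * dy > 0:
            p += 1
        elif dx * dy < 0:
            q += 1
        elif dx == 0 and dy != 0:
            ties_x_only += 1   # tied on the first variable only
        elif dy == 0 and dx != 0:
            ties_y_only += 1   # tied on the second variable only
        # pairs tied on both variables do not enter the formula
    return (p - q) / sqrt((p + q + ties_x_only) * (p + q + ties_y_only))


# Hypothetical ordinal codes, for illustration only.
toy_cases = [(1, 1), (1, 2), (2, 2), (3, 3), (2, 3)]
tau_b = kendall_tau_b(toy_cases)
print(tau_b)
```

In this toy data there are six concordant pairs, no discordant pairs, and two ties on each variable alone, so the ties pull the value below 1 even though no pair is discordant.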

16 Measures of association for ordinal variables
Kendall's tau-c
Tau-c is another measure that attempts to normalize P - Q. It is defined by
$$\tau_c = \frac{2m(P - Q)}{N^2 (m - 1)}$$
where m is the smaller of the number of rows and columns and N is the number of cases. Unfortunately, there is no simple proportional-reduction-in-error interpretation of tau-c either.
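A matching sketch for tau-c follows, assuming the usual form 2m(P - Q) / (N²(m - 1)), since the slide's formula is missing from the transcript. The function name and data are hypothetical. The code also assumes every row and column category appears at least once in the data, so that m can be read off the observed values.

```python
from itertools import combinations


def kendall_tau_c(cases):
    """Tau-c from raw (x, y) cases."""
    p = q = 0
    for (x1, y1), (x2, y2) in combinations(cases, 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            p += 1
        elif product < 0:
            q += 1
    n = len(cases)
    n_rows = len({x for x, _ in cases})   # distinct values of the first variable
    n_cols = len({y for _, y in cases})   # distinct values of the second variable
    m = min(n_rows, n_cols)
    return 2 * m * (p - q) / (n ** 2 * (m - 1))


# Hypothetical ordinal codes, for illustration only.
toy_cases = [(1, 1), (1, 2), (2, 2), (3, 3), (2, 3)]
tau_c = kendall_tau_c(toy_cases)
print(tau_c)
```

Unlike tau-b, the denominator here depends only on the sample size and the table dimensions, not on the observed ties.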

17 Measures of association for ordinal variables
The following are the tau-b and tau-c results for cjsrate and judgrate. There is no simple interpretation for the values. Tau-b is a commonly used measure of association.

18 Measures of association for ordinal variables
There are more measures of association:
Somers' d, on p. 428
Cohen's kappa, on pp. 429-430
How can you decide what measure of association to use? No single measure of association is best for all situations.

19 Correlation-based Measures

21 Correlation-based Measures
Coefficient of correlation
When two variables are numerical, the Pearson correlation coefficient is widely used. It measures the strength of the linear relationship between two numerical variables.
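The emphasis on a linear relationship matters, as a short sketch can show. The code below uses the standard product-moment formula. The function name `pearson_r` and both data sets are made up for illustration. One data set lies exactly on a line, and the other lies exactly on a parabola.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [-2, -1, 0, 1, 2]
ys_line = [2 * x + 1 for x in xs]     # exact linear relationship
ys_parabola = [x * x for x in xs]     # exact but nonlinear relationship

r_line = pearson_r(xs, ys_line)
r_parabola = pearson_r(xs, ys_parabola)
print(r_line, r_parabola)
```

The parabola is a perfect functional relationship, yet its correlation is zero because the relationship has no linear component.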

22 Correlation-based Measures
Definition of Spearman rank correlation

23 An example of rank correlation

24 Correlation-based Measures
The Spearman correlation coefficient is a nonparametric measure of association.
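The definition and example slides did not survive in the transcript, so the following is only a sketch under the common definition: the Spearman coefficient is the Pearson correlation computed on the ranks of the data, with tied values sharing their average rank. All function names and data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]   # increasing but not linear in xs
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

Because only the ordering of the values is used, any strictly increasing relationship gives a rank correlation of 1, while the Pearson coefficient on the raw values stays below 1.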

25 Measures based on the chi-square statistic
Since the chi-square test of independence is often used when analyzing crosstabulations, there are a variety of measures of association that are based on the chi-square statistic. The chi-square statistic itself is not a good measure of association, because its value depends on the sample size and the table dimensions. Several modifications have been proposed.

26 Contingency Table Example
Left-Handed vs. Gender
Dominant Hand: Left vs. Right
Gender: Male vs. Female
- 2 categories for each variable, so this is called a 2 x 2 table
- Suppose we examine a sample of size 300

27 Contingency Table Example (continued)
Sample results organized in a contingency table:
          Hand Preference
Gender    Left   Right   Total
Female      12     108     120
Male        24     156     180
Total       36     264     300
120 females, 12 were left handed
180 males, 24 were left handed
Sample size = n = 300

28 χ² Test for the Difference Between Two Proportions
H0: π1 = π2 (The proportion of females who are left handed is equal to the proportion of males who are left handed)
H1: π1 ≠ π2 (The two proportions are not the same; hand preference is not independent of gender)
If H0 is true, then the proportion of left-handed females should be the same as the proportion of left-handed males. Both proportions should then equal the proportion of left-handed people overall.

29 The Chi-Square Test Statistic
The chi-square test statistic is:
$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$
where:
f_o = observed frequency in a particular cell
f_e = expected frequency in a particular cell if H0 is true
χ² for the K x L case has (K-1)(L-1) degrees of freedom
(Assumed: each cell in the contingency table has an expected frequency of at least 5)

30 Decision Rule
For the 2 x 2 table, the χ² test statistic approximately follows a chi-squared distribution with one degree of freedom.
Decision Rule: If χ² > χ²_U, reject H0; otherwise, do not reject H0.
[Figure: chi-square distribution starting at 0, with the upper-tail area α beyond χ²_U marked "Reject H0" and the region below χ²_U marked "Do not reject H0"]
http://www.statsoft.com/textbook/stathome.html?sttable.html&1

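31 Computing the Average Proportion
The average proportion is:
$$p = \frac{X_1 + X_2}{n_1 + n_2}$$
where $X_1$ and $X_2$ are the numbers of left-handed people in the two groups and $n_1$ and $n_2$ are the group sizes.
Here: 120 females, 12 were left handed; 180 males, 24 were left handed, so
$$p = \frac{12 + 24}{120 + 180} = \frac{36}{300} = 0.12$$
i.e., the proportion of left handers overall is 0.12, that is, 12%.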

32 Finding Expected Frequencies
To obtain the expected frequency for left-handed females, multiply the average proportion left handed (p) by the total number of females.
To obtain the expected frequency for left-handed males, multiply the average proportion left handed (p) by the total number of males.
If the two proportions are equal, then P(Left Handed | Female) = P(Left Handed | Male) = .12, i.e., we would expect
(.12)(120) = 14.4 females to be left handed
(.12)(180) = 21.6 males to be left handed
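The same arithmetic can be written as a short script. The counts are the ones from the slides; the variable names and the dictionary layout are choices made for this sketch.

```python
# Observed counts from the contingency table on slide 27.
observed = {
    "Female": {"Left": 12, "Right": 108},
    "Male": {"Left": 24, "Right": 156},
}

n = sum(sum(row.values()) for row in observed.values())
total_left = sum(row["Left"] for row in observed.values())

# Average (pooled) proportion of left-handed people.
p_left = total_left / n

# Expected counts under H0: the same proportion applies to both genders.
expected_left = {g: p_left * sum(row.values()) for g, row in observed.items()}
expected_right = {g: (1 - p_left) * sum(row.values()) for g, row in observed.items()}

for gender in observed:
    print(gender, round(expected_left[gender], 1), round(expected_right[gender], 1))
```

The expected counts for right-handed people follow from the complementary proportion 1 - p, which reproduces the 105.6 and 158.4 shown on the next slide.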

33 Observed vs. Expected Frequencies
          Hand Preference
Gender    Left               Right               Total
Female    Observed = 12      Observed = 108      120
          Expected = 14.4    Expected = 105.6
Male      Observed = 24      Observed = 156      180
          Expected = 21.6    Expected = 158.4
Total     36                 264                 300

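34 The Chi-Square Test Statistic
          Hand Preference
Gender    Left               Right               Total
Female    Observed = 12      Observed = 108      120
          Expected = 14.4    Expected = 105.6
Male      Observed = 24      Observed = 156      180
          Expected = 21.6    Expected = 158.4
Total     36                 264                 300
The test statistic is:
$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = \frac{(12 - 14.4)^2}{14.4} + \frac{(108 - 105.6)^2}{105.6} + \frac{(24 - 21.6)^2}{21.6} + \frac{(156 - 158.4)^2}{158.4} = 0.7576$$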

35 Decision Rule
Decision Rule: If χ² > 3.841, reject H0; otherwise, do not reject H0.
Here, χ² = 0.7576 < χ²_U = 3.841, so we do not reject H0 and conclude that there is not sufficient evidence that the two proportions are different at α = 0.05.
[Figure: chi-square distribution starting at 0, with the upper-tail area α beyond χ²_U = 3.841 marked "Reject H0" and the region below it marked "Do not reject H0"]
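The whole test can be reproduced in a few lines. This is a minimal sketch in plain Python with a hypothetical function name; the observed counts and the critical value 3.841 are taken from the slides rather than computed from a distribution table.

```python
def chi_square_statistic(observed):
    """Sum of (f_o - f_e)^2 / f_e over all cells of a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            # Expected count under independence: row total * column total / n.
            f_e = row_totals[i] * col_totals[j] / n
            statistic += (f_o - f_e) ** 2 / f_e
    return statistic


# Rows: Female, Male. Columns: Left, Right.
observed = [[12, 108],
            [24, 156]]
chi2 = chi_square_statistic(observed)

critical_value = 3.841   # upper critical value quoted on the slide for alpha = 0.05
reject_h0 = chi2 > critical_value
print(round(chi2, 4), reject_h0)
```

The row total times column total over n form of the expected count gives the same values as multiplying the average proportion by the group size, so the result agrees with the hand calculation.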

36 Measures based on the chi-square statistic
Since the chi-square test of independence is often used when analyzing crosstabulations, there are a variety of measures of association that are based on the chi-square statistic. The chi-square statistic itself is not a good measure of association, because its value depends on the sample size and the table dimensions. Several modifications have been proposed.
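The slide does not say which modifications it has in mind. One widely used example is Cramér's V, which rescales chi-square by the sample size and the table dimensions; choosing it here is an assumption, not something the transcript confirms. The sketch applies it to the hand preference example, using the chi-square value 0.7576 and n = 300 from the earlier slides.

```python
from math import sqrt


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V: sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    return sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))


# Values from the left-handed vs. gender example (2 x 2 table).
v = cramers_v(0.7576, 300, 2, 2)
print(round(v, 3))
```

For a 2 x 2 table this reduces to the square root of chi-square divided by n, and the small value is in line with the test's failure to reject independence.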

