INFO 515 Lecture #9: Action Research, More Crosstab Measures. INFO 515, Glenn Booker.

Presentation transcript:

INFO 515 Lecture #9, slide 1: Action Research, More Crosstab Measures. INFO 515, Glenn Booker.

INFO 515 Lecture #9, slide 2: Nominal Crosstab Tests. Four more measures could apply to nominal data in a crosstab: Eta, Lambda, Goodman and Kruskal’s tau, and the Uncertainty coefficient.

INFO 515 Lecture #9, slide 3: Eta Coefficient. Used when the dependent variable uses an interval or ratio scale, and the independent variable is nominal or ordinal. Eta (η) squared is the proportion of the dependent variable’s variance which is explained by the independent variable. Eta squared ranges from 0 to 1 and is directional: its value depends on which variable is treated as dependent. This is the same eta from the end of lecture 6.
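The variance-explained reading of eta squared can be sketched in a few lines. This is a minimal illustration, not SPSS's implementation; the function name `eta_squared` and the sample data are hypothetical, and the calculation assumes the usual definition (between-group sum of squares divided by total sum of squares).

```python
def eta_squared(groups):
    """Eta squared for a numeric dependent variable split by a categorical one.

    groups: dict mapping each category of the independent variable to the
    list of dependent-variable values observed in that category.
    """
    values = [v for g in groups.values() for v in g]
    grand_mean = sum(values) / len(values)
    # Total variation of the dependent variable around its overall mean.
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    # Variation explained by differences between the category means.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return ss_between / ss_total

# Hypothetical data: years of education grouped by a two-category variable.
sample = {"group_a": [1, 2, 3], "group_b": [4, 5, 6]}
print(round(eta_squared(sample), 3))
```

Swapping which variable plays the dependent role generally changes the result, which is why SPSS reports two eta values.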

INFO 515 Lecture #9, slide 4: Directional vs Symmetric. Directional measures give a different answer depending on whether A is dependent on B, or B is dependent on A. Symmetric measures don’t care which variable is dependent or independent. Tests indicate whether there is a statistically significant relationship; the measures here describe the strength of association.

INFO 515 Lecture #9, slide 5: Directional Measures. Directional measures help determine how much the dependent variable is affected by the independent variable. Directional measures for nominal data: Lambda (recommended), Goodman and Kruskal’s tau, and the Uncertainty coefficient.

INFO 515 Lecture #9, slide 6: Directional Measures. Directional measures generally range from 0 to 1. A value of 0 means the independent variable doesn’t help predict the dependent variable. A value of 1 means the independent variable perfectly predicts the resulting dependent variable.

INFO 515 Lecture #9, slide 7: Directional Measures. In this context, either variable can be considered dependent or independent: does A predict B, or does B predict A? A “symmetric” value is the weighted average of the two possible selections (A predicts B, or B predicts A).

INFO 515 Lecture #9, slide 8: Proportional Reduction in Error. Proportional Reduction in Error (PRE) measures find the fractional reduction in errors due to some factor (such as an independent variable): PRE = (Error without X - Error with X) / Error without X. Two we’ll look at are Lambda, and Goodman and Kruskal’s Tau.
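As a quick arithmetic check of the PRE idea, the sketch below divides the reduction in errors by the errors made without the predictor. The function name `proportional_reduction_in_error` and the error counts are hypothetical.

```python
def proportional_reduction_in_error(errors_without, errors_with):
    """Fraction of prediction errors removed by knowing the predictor X."""
    if errors_without == 0:
        raise ValueError("no errors to reduce; PRE is undefined")
    return (errors_without - errors_with) / errors_without

# Hypothetical counts: 40 wrong guesses without X, 30 wrong guesses with X.
print(proportional_reduction_in_error(40, 30))  # prints 0.25
```

A result of 0 means X removes no errors, and 1 means X removes all of them, matching the 0 to 1 range described for directional measures.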

INFO 515 Lecture #9, slide 9: Lambda Coefficient. Lambda has a symmetric option for output. Its Value is the proportional reduction in errors when the independent variable is used to predict the dependent one. The Asymptotic Std. Error allows a 95% confidence interval to be made. “Approx. T” is the Value divided by the Std. Error computed as if the parameter were zero (not the usual definition!).

INFO 515 Lecture #9, slide 10: Goodman and Kruskal’s Tau. SPSS note: Goodman and Kruskal’s Tau is not directly selected; it appears only when Lambda is checked! It does not have a Symmetric option. It does not report an approximate T. Its significance is based on chi square. Otherwise it is similar to Lambda for interpretation.

INFO 515 Lecture #9, slide 11: Uncertainty Coefficient. It does have a symmetric option. It does have a T approximation. Its significance is also based on chi square. Goodman and Kruskal’s tau and the Uncertainty Coefficient may give results that conflict with Lambda, so use them cautiously!

INFO 515 Lecture #9, slide 12: Nominal Example. Use the “GSS91 political.sav” data set. Use Analyze / Descriptive Statistics / Crosstabs… Select “region” for Row(s), and “relig” for Column(s). Under “Statistics…” select Lambda, and Uncertainty Coefficient.

INFO 515 Lecture #9, slide 13: Nominal Example.

INFO 515 Lecture #9, slide 14: Nominal Example - Lambda. Focus on the Lambda (λ) output first. Lambda measures the percent of error reduction when using the independent variable to predict the dependent variable. Calculation based on any desired outcome contributing to lambda. Lambda ranges from 0 to 1.
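Lambda's error-reduction logic can be sketched from a table of counts. This is an illustrative calculation under the standard modal-category definition, not a reproduction of the SPSS output; the function name `lambda_coefficient` and the counts are hypothetical, and the column variable is assumed to be the dependent one.

```python
def lambda_coefficient(table):
    """Lambda with the column variable treated as dependent.

    table: list of rows of counts (rows = independent variable categories).
    """
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    # Without the predictor, always guess the most common column category.
    errors_without = n - max(col_totals)
    # With the predictor, guess the most common column within each row.
    errors_with = n - sum(max(row) for row in table)
    return (errors_without - errors_with) / errors_without

counts = [[20, 10, 5],
          [5, 10, 50]]                      # hypothetical 2x3 crosstab
transposed = [list(col) for col in zip(*counts)]
print(round(lambda_coefficient(counts), 3))      # columns dependent
print(round(lambda_coefficient(transposed), 3))  # rows dependent
```

The two printed values differ, which illustrates why lambda is reported separately for each choice of dependent variable. The function divides by zero when every case falls in one column category, so a real implementation would need to guard that case.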

INFO 515 Lecture #9, slide 15: Nominal Example. As usual, we want Sig. < 0.050 for the meaning of lambda to be statistically significant. If Region is dependent, then we see that religious preference is a significant (sig. = 0.000) predictor: “relig” contributes (Value) 4.8% +/- (Std Error) 1.2% of the variability of a person’s region.

INFO 515 Lecture #9, slide 16: Lambda Example. The 95% confidence interval of that contribution is (not shown) 4.8 - 2*1.2 = 2.4% to 4.8 + 2*1.2 = 7.2%. But “region” is not a significant predictor of “relig” (sig. = 0.099). Ignore the value of lambda if it isn’t significant. The symmetric value is significant, and its Value is between the other two lambda values.
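The confidence interval arithmetic on this slide is the Value plus or minus roughly two standard errors. A minimal sketch, with the hypothetical helper name `approx_95_ci` and the multiplier of 2 taken from the slide:

```python
def approx_95_ci(value, std_error, multiplier=2.0):
    """Return (lower, upper) as value -/+ multiplier * std_error."""
    half_width = multiplier * std_error
    return value - half_width, value + half_width

low, high = approx_95_ci(4.8, 1.2)   # lambda Value and Std Error, in percent
print(f"{low:.1f}% to {high:.1f}%")  # prints 2.4% to 7.2%
```

Using 1.96 instead of 2 as the multiplier would give a slightly narrower interval; the slide's rounder figure is kept here.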

INFO 515 Lecture #9, slide 17: G and K Tau Example. Goodman and Kruskal’s tau (τ) is similar to lambda, but is based on predictions in the same proportion as the marginal totals (individual row or column subtotals). No symmetric value is given; it’s only directional. Same method for interpretation, but notice it finds that both variables can be significant as dependent, and ‘relig’ is much stronger! (Still from slide 13.)
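The "predictions in proportion to the marginal totals" idea can be sketched as follows. This assumes the standard definition of Goodman and Kruskal's tau, treats the column variable as dependent, and uses the hypothetical name `goodman_kruskal_tau` with made-up counts.

```python
def goodman_kruskal_tau(table):
    """Goodman and Kruskal's tau with the column variable as dependent."""
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    # Expected error rate when guessing columns in proportion to the
    # overall column marginals.
    error_without = 1 - sum((c / n) ** 2 for c in col_totals)
    # Expected error rate when guessing in proportion to the column
    # distribution within each row (empty rows are skipped).
    error_with = 1 - sum(
        cell ** 2 / sum(row) for row in table if sum(row) > 0 for cell in row
    ) / n
    return (error_without - error_with) / error_without

counts = [[30, 10],
          [10, 50]]  # hypothetical 2x2 crosstab
print(round(goodman_kruskal_tau(counts), 3))
```

Unlike lambda, which only looks at the modal category in each row, this measure uses every cell, so it can be non-zero even when lambda is zero.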

INFO 515 Lecture #9, slide 18: Uncertainty Coefficient Example. It is a measure of association that indicates the proportional reduction in error when values of one variable are used to predict values of the other variable. The program calculates both symmetric and directional versions of it. Here, it gives results similar to G and K Tau.
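A rough sketch of the uncertainty coefficient, assuming the usual entropy-based definition (reduction in entropy of one variable from knowing the other). The function names and the dictionary keys are hypothetical; the counts are invented.

```python
import math

def entropy(counts):
    """Shannon entropy (natural log) of a list of counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def uncertainty_coefficients(table):
    """Directional and symmetric uncertainty coefficients for a crosstab."""
    h_rows = entropy([sum(row) for row in table])
    h_cols = entropy([sum(col) for col in zip(*table)])
    h_joint = entropy([cell for row in table for cell in row])
    shared = h_rows + h_cols - h_joint  # information the variables share
    return {
        "rows_dependent": shared / h_rows,
        "columns_dependent": shared / h_cols,
        "symmetric": 2 * shared / (h_rows + h_cols),
    }

for name, value in uncertainty_coefficients([[30, 10], [10, 50]]).items():
    print(name, round(value, 3))
```

The symmetric version is a weighted combination of the two directional ones, in line with the earlier description of symmetric values. A variable with only one occupied category has zero entropy, which would cause a division by zero here.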

INFO 515 Lecture #9, slide 19: Tests for 2x2 Tables. Many special measures can be applied to a 2x2 table, including relative risk and the odds ratio. Look at these in the context of answering questions like: “Are people who approve of women working more likely to vote for a woman President?”

INFO 515 Lecture #9, slide 20: Tests for 2x2 Tables. Use the “GSS91 social.sav” data set. Variables are “should women work” (fework) and “vote for woman president” (fepres). Isolate the cases using Data / Select Cases. Use the If condition (fepres=1 | fepres=2) & (fework=1 | fework=2), where ‘|’ means ‘or’ and ‘&’ means ‘and’.

INFO 515 Lecture #9, slide 21: Tests for 2x2 Tables. Use Analyze / Descriptive Statistics / Crosstabs… Select “fework” for Row(s), and “fepres” for Column(s). For Statistics select Risk. For Cells select Row percentages. This gives 947 valid cases.

INFO 515 Lecture #9, slide 22: Tests for 2x2 Tables.

INFO 515 Lecture #9, slide 23: Tests for 2x2 Tables. ‘cohort’ = subset.

INFO 515 Lecture #9, slide 24: Relative Risk. The relative risk is a ratio of percentages. It is very directional. With “fework” in the rows and row percentages shown, those who approve of women working are about 1.18 times as likely to say they would vote for a woman president as those who do not approve, based on 93.4%/79.3% = 1.178. Note the 95% confidence intervals for each ratio are given; roughly 1.09 to 1.27 for this example.

INFO 515 Lecture #9, slide 25: Relative Risk. Conversely, those who approve of women working are 0.317 times as likely to say they would not vote for a woman president (6.6/20.7, reported as 0.317), with a broader confidence interval of 0.22 to 0.47.
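The two relative risks are just ratios of the row percentages quoted on the slides. A small sketch, with the hypothetical helper name `relative_risk`:

```python
def relative_risk(percent_in_first_row, percent_in_second_row):
    """Ratio of two row percentages for the same column category."""
    return percent_in_first_row / percent_in_second_row

# Row percentages quoted on the slides (already rounded to one decimal).
rr_first_column = relative_risk(93.4, 79.3)
rr_second_column = relative_risk(6.6, 20.7)
print(round(rr_first_column, 3), round(rr_second_column, 3))
```

From the rounded percentages the second ratio comes out near 0.319 rather than the 0.317 quoted on the slide; presumably SPSS works from the unrounded counts, which are not available in this transcript.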

INFO 515 Lecture #9, slide 26: Odds Ratio. The odds of an event are the ratio of (the probability that the event occurs) to (the probability that the event does not occur); the odds ratio is the ratio of two such odds. The odds ratio relating (would vote for a woman president) to (approves of women working) has two terms. One is the odds of voting for a woman president among those who approve of women working (93.4/6.6 = 14.152)...

INFO 515 Lecture #9, slide 27: Odds Ratio. ...divided by the odds of voting for a woman president among those who do NOT approve of women working (79.3/20.7 = 3.831). Hence the odds ratio is 14.152/3.831 = 3.694, or (93.4*20.7)/(6.6*79.3). Round-off error, probably in the 6.6 value, kept us from getting the stated odds ratio (first row of output on slide 23).
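The odds ratio arithmetic from these two slides can be reproduced directly. The function names `odds` and `odds_ratio` are hypothetical; the inputs are the rounded row percentages quoted on the slides.

```python
def odds(percent_yes, percent_no):
    """Odds of an outcome: chance it happens over chance it does not."""
    return percent_yes / percent_no

def odds_ratio(a, b, c, d):
    """Cross-product form of the odds ratio for a 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

first_row_odds = odds(93.4, 6.6)
second_row_odds = odds(79.3, 20.7)
print(round(first_row_odds, 3), round(second_row_odds, 3))
print(round(first_row_odds / second_row_odds, 3))
print(round(odds_ratio(93.4, 6.6, 79.3, 20.7), 3))
```

Both routes give the same ratio because dividing the two odds and taking the cross-product are algebraically identical.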

INFO 515 Lecture #9, slide 28: Square Tables (RxR). Tables with the same number of rows as columns (RxR tables) also have special measures, such as Cohen’s Kappa (κ), which measures the strength of agreement (did two people’s measurements match well?). It applies for R values of one nominal variable.

INFO 515 Lecture #9, slide 29: Kappa. Kappa is used only when the rows and columns have the same categories, for example a set of possible diagnoses achieved by two different doctors, or two sets of outcomes which are believed to be dependent on each other. Kappa is one when the diagonal has the only non-zero values and zero when agreement is no better than chance; it can fall below zero when agreement is worse than chance.

INFO 515 Lecture #9, slide 30: Kappa Example. The example here is the educational level of one’s parents (maeduc and paeduc; as in ‘ma and pa education’). Use the “GSS91 social.sav” data set. Define new variables madeg and padeg, which are derived from maeduc and paeduc (convert years of education into rough levels of achievement).

INFO 515 Lecture #9, slide 31: Kappa Example. The new scale for madeg and padeg is: Education <12 is code 1, “LT High School”; Education 12-15 is code 2, “High School”; Education 16 is code 3, “Bachelor degree”; Education 17+ is code 4, “Graduate”. Use Analyze / Descriptive Statistics / Crosstabs…
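The recoding of years of education into degree levels might look like the sketch below. The function name `degree_code` is hypothetical, and the 12 to 15 range for code 2 is an assumption inferred from the neighboring codes rather than something this transcript states outright.

```python
DEGREE_LABELS = {
    1: "LT High School",
    2: "High School",
    3: "Bachelor degree",
    4: "Graduate",
}

def degree_code(years_of_education):
    """Map years of education to the rough achievement codes on the slide."""
    if years_of_education < 12:
        return 1
    if years_of_education < 16:   # assumed range for code 2: 12 to 15 years
        return 2
    if years_of_education == 16:
        return 3
    return 4                      # 17 or more years

for years in (8, 12, 16, 20):
    code = degree_code(years)
    print(years, code, DEGREE_LABELS[code])
```

Applying the same function to both maeduc and paeduc guarantees the rows and columns of the resulting table share one set of categories, which is what kappa requires.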

INFO 515 Lecture #9, slide 32: Kappa Example. Select “padeg” for Row(s), and “madeg” for Column(s). For Statistics select Kappa. The basic crosstab just shows the data counts (next slide). Then we get the Kappa measure (slide after next). As usual, check to make sure the result is significant before going any further.

INFO 515 Lecture #9, slide 33: Kappa Example.

INFO 515 Lecture #9, slide 34: Kappa Example.

INFO 515 Lecture #9, slide 35: Kappa Example. Here the significance is 0.000, very clearly significant (< 0.050). This is confirmed by the approximate T of over 20; as before, this T is based on the null hypothesis. The output also reports the actual value of kappa and its standard error. What does this mean?

INFO 515 Lecture #9, slide 36: Kappa. Kappa is judged on a fairly fixed scale (from J.L. Fleiss, 1981): Kappa below 0.40 indicates poor agreement beyond chance; Kappa from 0.40 to 0.75 is fair to good agreement; Kappa above 0.75 is strong agreement. So in this case we are confident there is poor agreement between parents’ education.
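For reference, kappa and the Fleiss scale can be sketched together. This assumes the standard definition of Cohen's kappa (observed agreement corrected for chance agreement); the function names and the counts are hypothetical and are not the GSS91 results.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table of counts with matching categories."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Share of cases on the diagonal (the two ratings agree).
    observed = sum(table[i][i] for i in range(len(table))) / n
    # Agreement expected by chance from the marginal totals alone.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)

def fleiss_label(kappa):
    """Verbal label using the cutoffs quoted on the slide."""
    if kappa < 0.40:
        return "poor"
    if kappa <= 0.75:
        return "fair to good"
    return "strong"

ratings = [[40, 5],
           [5, 50]]  # hypothetical agreement table
k = cohens_kappa(ratings)
print(round(k, 3), fleiss_label(k))
```

A table with counts only on the diagonal gives kappa of exactly 1, while a table with counts only off the diagonal gives a negative value.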

INFO 515 Lecture #9, slide 37: Ordinal Crosstab Measures. Several association measures can be used for a table with R rows and C columns which contain ordinal data (and presumably R ≠ C): Kendall’s tau-b, Kendall’s tau-c, (Goodman and Kruskal’s) Gamma (preferred), Somers’ d, and Spearman’s Correlation Coefficient.

INFO 515 Lecture #9, slide 38: General RxC Table Measures. Many are based on comparing pairs of cases on the two variables. If the case with the higher value of A also has the higher value of B, the pair is concordant. If the case with the higher value of A has the lower value of B, the pair is discordant. If the two cases have the same value of A or of B, the pair is tied.

INFO 515 Lecture #9, slide 39: General RxC Table Measures. The number of concordant pairs is “P”. The number of discordant pairs is “Q”. The number of ties on X is “Tx”. The number of ties on Y is “Ty”. The smaller of the number of rows R and columns C is called “m”: m = min(R,C). Given this vocabulary, we can define many measures.

INFO 515 Lecture #9, slide 40: General RxC Table Measures. Kendall’s tau-b is tau-b = (P-Q) / sqrt[ (P+Q+Tx)*(P+Q+Ty) ]. Kendall’s tau-c is tau-c = 2m*(P-Q) / [N^2*(m-1)], where N is the total number of cases. Gamma (γ) is Gamma = (P-Q) / (P+Q). Somers’ d is dy = (P-Q) / (P+Q+Ty) or dx = (P-Q) / (P+Q+Tx).
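The formulas on this slide can be tried out on a small table. The sketch below assumes rows are the ordered categories of X and columns those of Y, counts Tx as pairs tied on X only and Ty as pairs tied on Y only, and takes N as the total number of cases; the function names and the counts are hypothetical.

```python
import math
from itertools import combinations

def pair_counts(table):
    """Return (P, Q, Tx, Ty) for a table with ordered rows (X) and columns (Y)."""
    cells = [(i, j, n) for i, row in enumerate(table) for j, n in enumerate(row)]
    P = Q = Tx = Ty = 0
    for (i, j, a), (k, l, b) in combinations(cells, 2):
        pairs = a * b                 # case pairs formed by these two cells
        if i == k:                    # same row: tied on X only
            Tx += pairs
        elif j == l:                  # same column: tied on Y only
            Ty += pairs
        elif (i - k) * (j - l) > 0:   # both variables move the same way
            P += pairs
        else:                         # variables move in opposite directions
            Q += pairs
    return P, Q, Tx, Ty

def ordinal_measures(table):
    """Evaluate the slide's formulas for tau-b, tau-c, gamma and Somers' d."""
    P, Q, Tx, Ty = pair_counts(table)
    N = sum(sum(row) for row in table)
    m = min(len(table), len(table[0]))
    return {
        "tau_b": (P - Q) / math.sqrt((P + Q + Tx) * (P + Q + Ty)),
        "tau_c": 2 * m * (P - Q) / (N ** 2 * (m - 1)),
        "gamma": (P - Q) / (P + Q),
        "somers_d_y": (P - Q) / (P + Q + Ty),
        "somers_d_x": (P - Q) / (P + Q + Tx),
    }

counts = [[10, 5],
          [5, 10]]  # hypothetical ordered 2x2 table
print(pair_counts(counts))
print({name: round(v, 3) for name, v in ordinal_measures(counts).items()})
```

Gamma ignores ties altogether, so it is never smaller in magnitude than tau-b or Somers' d computed from the same table.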

INFO 515 Lecture #9, slide 41: General RxC Table Measures. All of the RxC measures are symmetric except Somers’ d, which has both symmetric and directional values given. All are evaluated by their significance, which also has an approximate T score. All are expressed by a Value +/- its Std Error.

INFO 515 Lecture #9, slide 42: RxC Measures Example. Use the “GSS91 social.sav” data set. Use Analyze / Descriptive Statistics / Crosstabs… Select “paeduc” for Row(s), and “maeduc” for Column(s). Under “Statistics…” select Eta, Correlations, Gamma, Somers’ d, Kendall’s tau-b and tau-c.

INFO 515 Lecture #9, slide 43: RxC Measures Example. This compares the number of years of education of one’s mother and father to see how strongly they are associated. The crosstab data table is very large, since it ranges from 0 to 20 for each category, with irregular gaps (we’re not using the simplified categories from the Kappa example). Hence we’re not showing it here!

INFO 515 Lecture #9, slide 44: RxC Measures Example. Both measures show the mother’s education is a slightly better predictor.

INFO 515 Lecture #9, slide 45: RxC Measures Example. Directional measures: Somers’ d is significant. It shows that there are about 55% +/- 2% more concordant pairs than discordant ones, excluding ties on the independent variable. The Eta measure shows that around 69% of the variability of one parent’s education is shared with the other’s.

INFO 515 Lecture #9, slide 46: RxC Measures Example.

INFO 515 Lecture #9, slide 47: RxC Measures Example. All of the symmetric measures are statistically significant, as their approximate t values show. The Kendall tau-b and tau-c measures disagree a little on the magnitude of the agreement. Gamma and Spearman give fairly strong positive correlations.

INFO 515 Lecture #9, slide 48: RxC Measures Example. Spearman, like ‘r’, ranges from -1 to +1, and does not require a normal distribution. It is based on the rank order of the categories, not their values. Even ‘r’ can be calculated for this case, and it gives results similar to Gamma and Spearman.

INFO 515 Lecture #9, slide 49: Yule’s Q. A special case of gamma for a 2x2 table is called Yule’s Q. It is appropriate for ordinal data in 2x2 tables, so values for each variable are Low/High, Yes/No, or similar. Define Yule’s Q = (a*d - b*c) / (a*d + b*c). See PDF page 59 of the Action Research handout for the definition of a, b, c, and d (cell labels).

INFO 515 Lecture #9, slide 50: Yule’s Q. It measures the strength and direction of association from -1 (perfect negative association) to 0 (no association) to +1 (perfect positive association). Judge the results for Yule’s Q by the table on page 59 of the Action Research handout, and see the handout for other related discussion.
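A short sketch of the Yule's Q formula follows. The handout's cell labels are not reproduced in this transcript, so the layout here is an assumption: a and b are taken as the first row and c and d as the second row of the 2x2 table. The function name `yules_q` and the counts are hypothetical.

```python
def yules_q(a, b, c, d):
    """Yule's Q for a 2x2 table assumed to be laid out as [[a, b], [c, d]]."""
    return (a * d - b * c) / (a * d + b * c)

print(yules_q(10, 5, 5, 10))   # moderate positive association
print(yules_q(5, 0, 0, 5))     # all cases on the main diagonal
print(yules_q(0, 5, 5, 0))     # all cases off the main diagonal
```

With this layout, a*d is the number of concordant pairs and b*c the number of discordant pairs, which is why Q matches gamma for a 2x2 table.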