2 Parametric vs. Nonparametric Measures of Association
- Parametric measures of association are for continuous variables measured on an interval or ratio scale.
- Bivariate correlation (Pearson correlation) is a typical parametric measure of association.
- The correlation coefficient does not distinguish between independent and dependent variables.
- Nonparametric measures of association are for nominal or ordinal data.
3 Bivariate Correlation Analysis
- Pearson correlation coefficient
  - r symbolizes the coefficient's estimate of linear association based on sample data
  - Correlation coefficients reveal the magnitude and direction of relationships
  - The coefficient's sign (+ or -) signifies the direction of the relationship
- Assumptions of r
  - Linearity
  - Bivariate normal distribution
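As a minimal sketch of how r is obtained from sample data, the function below computes the Pearson coefficient from deviations about the means. The function name `pearson_r` and the data values are made up for illustration and are not part of the slides.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient for paired observations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustrative data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)                                # positive sign: Y tends to rise with X
r_flipped = pearson_r(x, [-value for value in y])  # same magnitude, opposite sign
```

The sign of the result carries the direction of the relationship and its absolute value the magnitude; swapping the two arguments gives the same value, which reflects that r does not distinguish independent from dependent variables.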
4 Bivariate Correlation Analysis
- Scatterplots
  - Provide a means for visual inspection of data
    - the direction of a relationship
    - the shape of a relationship
    - the magnitude of a relationship (with practice)
5 Interpretation of Coefficients
- A relationship does not imply causation
  - Y could cause X
  - X could cause Y
  - X and Y could influence each other
  - X and Y could be affected by a third variable
- Statistical significance is tested with a t-value
- Statistical significance does not imply that a relationship is practically meaningful
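The t-value mentioned above is commonly computed from r and the sample size. The expression below is the standard test statistic for the null hypothesis of zero correlation under the bivariate normal assumption; the symbol n (number of paired observations) is introduced here and does not appear on the slide.

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}, \qquad \mathrm{df} = n - 2
```

As a made-up example, r = 0.50 with n = 27 gives t = 0.50 × 5 / √0.75 ≈ 2.89 on 25 degrees of freedom.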
6 Interpretation of Coefficients
- Be careful about artifact correlations
- Coefficient of determination (r²)
  - The proportion of variance that X and Y have in common
  - An F-test is used for goodness of fit
- Correlation matrix
  - Used to display coefficients for more than two variables
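A correlation matrix and the coefficient of determination can be sketched in a few lines. The variable names `x`, `y`, `z` and their values are hypothetical, and `pearson_r` is a helper written for this example rather than a library function.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient for paired observations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for three variables (not from the slides).
data = {
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 5, 4, 5],
    "z": [5, 4, 3, 2, 1],
}
names = list(data)

# Correlation matrix: one row and one column per variable.
matrix = [[pearson_r(data[a], data[b]) for b in names] for a in names]

# Coefficient of determination for the (x, y) pair: the squared correlation.
r_squared_xy = matrix[0][1] ** 2

for name, row in zip(names, matrix):
    print(name, [round(value, 3) for value in row])
```

The matrix is symmetric with ones on the diagonal, so only the upper or lower triangle is usually reported.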
7 Bivariate Linear Regression
- Establish a linear relationship between an independent variable (X) and a dependent variable (Y).
- Use the observed value of X to estimate or predict the corresponding Y value.
- Regression coefficients
  - Slope: β1 = ΔY / ΔX
  - Intercept: β0 = Ȳ − β1 X̄
8 Bivariate Linear Regression
- Error term: the deviation of the ith observation from the regression line β0 + β1 Xi, i.e., εi = Yi − β0 − β1 Xi
- Method of least squares
  - The regression line is the line of best fit for the data.
  - To find the best fit, the method of least squares minimizes Σ εi² (the total squared errors of estimate).
  - Technically, calculus (differentiation) is used to solve for β1 and β0.
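Setting the derivatives of Σ εi² with respect to β0 and β1 to zero yields closed-form estimates: the slope is the sum of cross-products divided by the sum of squared X deviations, and the intercept is Ȳ − β1 X̄. The sketch below applies those formulas; the function name `least_squares_line` and the data are assumptions made for illustration.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) that minimize the sum of squared errors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx                      # beta_1
    intercept = mean_y - slope * mean_x    # beta_0 = Y bar - beta_1 * X bar
    return intercept, slope

# Hypothetical illustrative data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares_line(x, y)

# Estimated error terms: observed Y minus the fitted line.
errors = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in errors)  # the quantity least squares minimizes
```

With an intercept in the model, the estimated errors always sum to zero, which is a quick sanity check on any hand calculation.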
9 Interpreting Linear Regression
- Goodness of fit
  - t test for individual coefficients
    - Zero slope (β1 = 0) can mean any of the following:
      - Y is completely unrelated to X and no systematic pattern is evident
      - Y takes constant values for every value of X
      - the data are related, but by a nonlinear function
  - F test for the model
    - The F value is related to r² (coefficient of determination)
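The link between the slope's t statistic, the model's F statistic, and r² can be checked numerically. This is a sketch on made-up data using the standard bivariate formulas; it computes the test statistics only and leaves p-values to a statistical table or library.

```python
import math

# Hypothetical illustrative data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)           # total sum of squares
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # error sum of squares
ssr = syy - sse                                   # regression sum of squares
mse = sse / (n - 2)                               # error mean square, n - 2 df

# t test for the slope: estimate divided by its standard error.
se_b1 = math.sqrt(mse / sxx)
t_slope = b1 / se_b1

# F test for the model: regression mean square (1 df) over error mean square.
f_model = (ssr / 1) / mse

# The same F value expressed through the coefficient of determination.
r_squared = ssr / syy
f_from_r2 = r_squared * (n - 2) / (1 - r_squared)
```

In the bivariate case the F value equals the square of the slope's t value, so the two tests lead to the same conclusion.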
10 Interpreting Linear Regression
- Residuals
  - What remains after the line is fitted
  - Estimated error terms
  - Standardized residuals are comparable to Z scores, with a mean of 0 and a standard deviation of 1.
- Confidence band vs. prediction band
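One simple way to standardize residuals is to divide each by the standard error of estimate, sqrt(SSE / (n − 2)). That choice is an assumption of this sketch; some texts and packages additionally adjust each residual for its leverage. The data are made up.

```python
import math

# Hypothetical illustrative data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((a - mean_x) ** 2 for a in x)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Residuals: what remains after the fitted line is subtracted.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Standard error of estimate, using n - 2 degrees of freedom.
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Standardized residuals (assumed definition: residual / s, no leverage adjustment).
standardized = [e / s for e in residuals]
```

Values well beyond about ±2 are often inspected as possible outliers, in the same way as large Z scores.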
11 Measures for Nominal Data
- When there is no relationship at all, the coefficient is 0.
- When there is complete dependency, the coefficient is 1 (unity).
- The following measures are used for nominal data (next slide).
12 Measures for Nominal Data
- Chi-square-based measures
  - Phi
  - Cramer's V
  - Contingency coefficient C
- Proportional reduction in error (PRE)
  - Lambda
  - Tau (Goodman and Kruskal's tau)
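The chi-square-based measures and lambda can be sketched directly from a contingency table of counts. The function names and the 2×2 table are hypothetical, and the lambda shown is the asymmetric version that predicts the column variable from the row variable.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and total count for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def chi_square_measures(table):
    """Phi, Cramer's V, and the contingency coefficient C."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0])) - 1  # smaller of (rows - 1, columns - 1)
    return {
        "phi": math.sqrt(chi2 / n),              # usually reported for 2x2 tables
        "cramers_v": math.sqrt(chi2 / (n * k)),
        "contingency_c": math.sqrt(chi2 / (chi2 + n)),
    }

def lambda_columns_given_rows(table):
    """PRE lambda: reduction in errors predicting the column from the row."""
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    correct_without_rows = max(col_totals)            # always guess the modal column
    correct_with_rows = sum(max(row) for row in table)  # guess the modal column per row
    return (correct_with_rows - correct_without_rows) / (n - correct_without_rows)

# Hypothetical 2x2 table of counts (not from the slides).
table = [[30, 10],
         [10, 30]]
measures = chi_square_measures(table)
lam = lambda_columns_given_rows(table)
```

For a 2×2 table phi and Cramer's V coincide, while C cannot reach 1 even under complete dependency, which is one reason V is often preferred for comparing tables of different sizes.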
13 Characteristics of Ordinal Data
- Concordant pair: the subject who ranks higher on one variable also ranks higher on the other variable
- Discordant pair: the subject who ranks higher on one variable ranks lower on the other variable
14 Measures for Ordinal Data
- No assumption of a bivariate normal distribution
- Most are based on concordant/discordant pairs
- Values range from -1.0 to +1.0
15 Measures for Ordinal Data
- The following test statistics are used:
  - Gamma
  - Somers' d
  - Spearman's rho
  - Kendall's tau b
  - Kendall's tau c
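Counting concordant and discordant pairs is the common step behind several of these statistics. The sketch below counts the pairs by brute force and computes gamma as (C − D) / (C + D), which ignores tied pairs; the function names and the rankings are made up for illustration.

```python
def count_pairs(x, y):
    """Count concordant, discordant, and tied pairs of subjects."""
    concordant = discordant = tied = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            # Same sign of the two differences means the pair is ordered alike.
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
            else:
                tied += 1
    return concordant, discordant, tied

def gamma(x, y):
    """Gamma: (C - D) / (C + D), with tied pairs left out."""
    c, d, _ = count_pairs(x, y)
    return (c - d) / (c + d)

# Hypothetical ranks of five subjects on two ordinal variables.
rank_a = [1, 2, 3, 4, 5]
rank_b = [1, 3, 2, 5, 4]
pairs = count_pairs(rank_a, rank_b)
g = gamma(rank_a, rank_b)
```

Somers' d and Kendall's tau b and tau c start from the same counts and differ mainly in how tied pairs enter the denominator.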