Overview
- Assumptions for Linear Regression
- Evaluating a Regression Model

Assumptions for Bivariate Linear Regression
- Quantitative (or dichotomous) data
- Independent observations
- Predict only for the same population that was sampled

Assumptions for Bivariate Linear Regression (continued)
- Linear relationship: examine the scatterplot
- Homoscedasticity (equal spread of residuals at different values of the predictor): examine the ZRESID vs. ZPRED plot

Assumptions for Bivariate Linear Regression (continued)
- Independent errors: the Durbin-Watson statistic should be close to 2
- Normality of errors: examine the frequency distribution of the residuals
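
The Durbin-Watson check above can be illustrated with a small calculation. This is a minimal sketch in Python, not the output of any statistics package; the residual values are hypothetical and are assumed to be listed in the order the cases were collected.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals (assumption for illustration only)
residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]
dw = durbin_watson(residuals)
print(round(dw, 3))  # 2.017
```

The statistic ranges from 0 to 4. Values near 2 suggest that adjacent errors are uncorrelated, while values near 0 or 4 point to positive or negative autocorrelation.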

Influential Cases
- Influential cases have a greater impact on the slope and y-intercept than other cases
- Select casewise diagnostics and look for cases with large residuals
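
Looking for cases with large residuals can be sketched as follows. The data, the helper name `standardized_residuals`, and the cutoff of 2.5 are all assumptions made for illustration. Each residual is divided by the standard error of the estimate, which is one common way to standardize it.

```python
from math import sqrt
from statistics import mean

def standardized_residuals(x, y):
    """Fit a least-squares line, then divide each residual by the standard error of the estimate."""
    mean_x, mean_y = mean(x), mean(y)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    see = sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))
    return [e / see for e in residuals]

# Hypothetical data: y equals x except for one unusual case (the fifth)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [1.0, 2.0, 3.0, 4.0, 20.0, 6.0, 7.0, 8.0, 9.0, 10.0]

CUTOFF = 2.5  # assumed cutoff; the analyst chooses this value
z = standardized_residuals(x, y)
flagged = [i for i, zi in enumerate(z) if abs(zi) > CUTOFF]
print(flagged)  # [4]  (zero-based index of the fifth case)
```

A large residual marks a case the line predicts poorly. Whether that case also shifts the slope and intercept depends on where it sits along X, so flagged cases deserve a closer look rather than automatic removal.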

Standard Error of the Estimate
- Index of how far off predictions are expected to be
- A larger |r| means a smaller standard error
- Standard deviation of Y scores around the predicted Y scores
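
The link between r and the standard error can be seen from the usual formula for bivariate regression. The notation here is an assumption of this sketch and does not come from the slides: s_est is the standard error of the estimate, Ŷ is a predicted score, s_Y is the standard deviation of Y, and n is the number of cases.

```latex
s_{est} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}}
        = s_Y \sqrt{(1 - r^2)\,\frac{n - 1}{n - 2}}
```

Because the second form contains 1 - r^2, a correlation closer to +1 or -1 shrinks the standard error toward zero. With r = 0 the standard error is roughly the standard deviation of Y.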

Sums of Squares
- Total SS: sum of squared differences of Y scores from the mean of Y
- Model SS: sum of squared differences of predicted Y scores from the mean of Y
- Residual SS: sum of squared differences of Y scores from predicted Y scores
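
The three sums of squares can be computed directly from their definitions. This is a minimal sketch with hypothetical data chosen to keep the arithmetic simple; the variable names are illustrative.

```python
from statistics import mean

# Hypothetical data (assumption for illustration only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

mean_x, mean_y = mean(x), mean(y)

# Least-squares slope and intercept
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]

ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_model = sum((pi - mean_y) ** 2 for pi in predicted)
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

print(round(ss_total, 2), round(ss_model, 2), round(ss_residual, 2))  # 6.0 3.6 2.4
```

For a least-squares line, Model SS and Residual SS add up to Total SS (here 3.6 + 2.4 = 6.0).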

Coefficient of Determination
- r² is the proportion of variance in Y explained by X
- Adjusted r² corrects for the tendency of r² to overestimate the strength of the relationship in the population
- The correction is larger, so adjusted r² is lower, when there are fewer subjects
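
One commonly used adjustment formula shows why the sample size matters. It is offered as a sketch, and a given package may use a different formula. Here n is the number of cases and k the number of predictors (k = 1 in bivariate regression); both symbols are assumptions of this example.

```latex
r^2_{adj} = 1 - (1 - r^2)\,\frac{n - 1}{n - k - 1}
```

With a hypothetical r^2 of .50 and one predictor, n = 10 gives an adjusted value of about .44, while n = 30 gives about .48. The fraction (n - 1)/(n - k - 1) approaches 1 as n grows, so the adjustment fades with larger samples.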

Goodness of Fit
- Dividing the Model SS by the Total SS produces r²
- The ANOVA F test determines whether the regression equation accounts for a significant proportion of variance in Y
- F is the Model Mean Square divided by the Residual Mean Square
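
The F ratio and r² can both be derived from the sums of squares. In this minimal sketch the function name `anova_f` and the input values are hypothetical. Each mean square is a sum of squares divided by its degrees of freedom: k for the model and n - k - 1 for the residual, with k = 1 predictor.

```python
def anova_f(ss_model, ss_residual, n, k=1):
    """Return F and its degrees of freedom for a regression with k predictors."""
    df_model = k
    df_residual = n - k - 1
    ms_model = ss_model / df_model
    ms_residual = ss_residual / df_residual
    return ms_model / ms_residual, df_model, df_residual

# Hypothetical sums of squares from a sample of 5 cases
ss_model, ss_residual, n = 3.6, 2.4, 5
f_value, df1, df2 = anova_f(ss_model, ss_residual, n)
r_squared = ss_model / (ss_model + ss_residual)

print(f"F({df1}, {df2}) = {f_value:.2f}, r^2 = {r_squared:.2f}")  # F(1, 3) = 4.50, r^2 = 0.60
```

The p value would then come from the F distribution with those degrees of freedom. That step is left out here because it needs a statistics library or a table.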

Coefficients
- The B for the Constant under “unstandardized” is the y-intercept, b0
- The B listed for the X variable is the slope, b1
- The t test is the coefficient divided by its standard error
- In bivariate regression, the standardized slope (beta) is the same as the correlation r
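
The quantities in the coefficients table can be reproduced by hand. This sketch uses hypothetical data and the standard bivariate formulas; the variable names are illustrative and do not come from any package's output.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical data (assumption for illustration only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
mean_x, mean_y = mean(x), mean(y)

sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b1 = sxy / sxx               # slope
b0 = mean_y - b1 * mean_x    # y-intercept

# Standard error of the slope, then its t test (df = n - 2)
ss_residual = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
se_estimate = sqrt(ss_residual / (n - 2))
se_b1 = se_estimate / sqrt(sxx)
t_b1 = b1 / se_b1

# Standardized slope versus the Pearson correlation
beta = b1 * stdev(x) / stdev(y)
r = sxy / sqrt(sxx * syy)

print(round(b0, 2), round(b1, 2), round(t_b1, 2), round(beta, 3), round(r, 3))
# 2.2 0.6 2.12 0.775 0.775
```

With a single predictor, the squared t for the slope equals the ANOVA F for the model (about 4.5 for these data), so the two tests give the same p value.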

Example of Reporting a Regression Analysis
- The linear regression for predicting quiz enjoyment from level of statistics anxiety did not account for a significant proportion of variance, F(1, 24) = 1.75, p = .20, r² = .07.
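
The statistics in a report like this are tied together, which gives a quick consistency check. The sketch below recovers r² from F and its degrees of freedom, using the identity r² = (F × df_model) / (F × df_model + df_residual). The function name is hypothetical.

```python
def r_squared_from_f(f_value, df_model, df_residual):
    """Recover the proportion of variance explained from an F ratio and its df."""
    return (f_value * df_model) / (f_value * df_model + df_residual)

# Values taken from the reported result: F(1, 24) = 1.75
r2 = r_squared_from_f(1.75, 1, 24)
print(round(r2, 2))  # 0.07
```

The result matches the reported r² of .07. The residual degrees of freedom of 24 also imply a sample of 26 cases, since df = n - 2 with one predictor.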

Take-Home Points
- The validity of a regression procedure depends on multiple assumptions.
- A regression model can be evaluated based on whether and how well it predicts an outcome variable.