# Linear Regression. PSYC 6130, PROF. J. ELDER

Linear Regression

Correlation vs Regression: What’s the Difference?

Correlation measures how strongly 2 variables are related. Regression provides a means of predicting the value of one variable from the value of a related variable. The underlying mathematics is the same. Here we are dealing only with linear correlation and linear regression.

Optimal Prediction using z Scores

Consider 2 variables X and Y that may be related in some way, e.g.:

- X = midterm score, Y = final exam score
- X = reaction time, Y = error rate

Suppose you know X for a particular case (e.g., individual, trial). What is your best guess at Y? The answer turns out to be pretty simple:

$$\hat{z}_Y = r\, z_X$$
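A minimal sketch of z-score prediction, assuming the standard rule that the predicted z-score of Y is Pearson's r times the z-score of X. The scores are hypothetical values made up for illustration, and the helper names `z_scores` and `pearson_r` are not from the slides.

```python
from math import sqrt

def z_scores(values):
    """Standardize values using the population standard deviation (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pearson_r(x, y):
    """Pearson's r computed as the mean product of paired z-scores."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

# Hypothetical midterm (X) and final exam (Y) scores, for illustration only.
midterm = [62, 70, 75, 81, 90]
final = [65, 68, 80, 78, 92]

r = pearson_r(midterm, final)
z_midterm = z_scores(midterm)

# Best linear guess at each student's final-exam z-score: r times the midterm z-score.
predicted_z_final = [r * z for z in z_midterm]
```

Because |r| is at most 1, each predicted z-score is no farther from zero than the z-score it was predicted from, which is the familiar regression toward the mean.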

Example: 6130A 2005-06 Assignment marks

Graphical Representation

[Figure: scatter plot of Assignment 1 z-Score and Assignment 2 z-Score, PSYC 6130A 2005-06, with the regression line.]

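The Raw-Score Regression Formula

In terms of population parameters:

$$\hat{Y} = \mu_Y + \rho\,\frac{\sigma_Y}{\sigma_X}\,(X - \mu_X)$$

or

$$\hat{Y} = b_{YX} X + a_{YX}, \quad \text{where } b_{YX} = \rho\,\frac{\sigma_Y}{\sigma_X} \text{ and } a_{YX} = \mu_Y - b_{YX}\,\mu_X$$

In terms of sample statistics:

$$\hat{Y} = \bar{Y} + r\,\frac{s_Y}{s_X}\,(X - \bar{X})$$

or

$$\hat{Y} = b_{YX} X + a_{YX}, \quad \text{where } b_{YX} = r\,\frac{s_Y}{s_X} \text{ and } a_{YX} = \bar{Y} - b_{YX}\,\bar{X}$$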
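The sample-statistics form of the raw-score formula can be sketched as follows, assuming the usual definitions slope = r · s_Y / s_X and intercept = mean(Y) − slope · mean(X). The function name `raw_score_regression` and the grades are hypothetical, not the course data.

```python
from math import sqrt

def raw_score_regression(x, y):
    """Return (slope, intercept) for predicting y from x.

    Uses slope = r * s_y / s_x and intercept = mean_y - slope * mean_x.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sp_xy / sqrt(ss_x * ss_y)
    s_x, s_y = sqrt(ss_x / (n - 1)), sqrt(ss_y / (n - 1))
    slope = r * s_y / s_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical assignment grades in percent (not the course data).
assignment1 = [82.0, 88.0, 90.0, 93.0, 97.0]
assignment2 = [80.0, 87.0, 86.0, 92.0, 95.0]

slope, intercept = raw_score_regression(assignment1, assignment2)

# Predicted Assignment 2 grade for a student who scored 90 on Assignment 1.
predicted = slope * 90.0 + intercept
```

Here 90 happens to be the mean of the hypothetical Assignment 1 grades, so the prediction equals the mean Assignment 2 grade: the regression line always passes through the point of means.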

Example: 6130A 2005-06 Assignment marks

Graphical Representation

[Figure: scatter plot of Assignment 1 Grade and Assignment 2 Grade, PSYC 6130 Section A 2005-2006, with regression line y = 0.867x + 10.5%.]

Residuals

The deviations of the actual Y values from the Y values predicted by the regression line are called residuals. The regression line minimizes the sum of squared residuals (and hence is called a least-squares fit; equivalently, it minimizes the mean squared error).

[Figure: scatter plot of Assignment 1 Grade and Assignment 2 Grade, PSYC 6130 Section A 2005-2006.]
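As a small check of the minimization claim, the sketch below fits a least-squares line to hypothetical grades and compares its sum of squared residuals with that of slightly perturbed lines. The data and helper names are assumptions made for illustration.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    return slope, mean_y - slope * mean_x

def residuals(x, y, slope, intercept):
    """Actual Y minus predicted Y for each case."""
    return [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]

def sum_of_squares(values):
    return sum(v * v for v in values)

# Hypothetical grades in percent (not the course data).
x = [82.0, 88.0, 90.0, 93.0, 97.0]
y = [80.0, 87.0, 86.0, 92.0, 95.0]

slope, intercept = least_squares_line(x, y)
best_sse = sum_of_squares(residuals(x, y, slope, intercept))

# Nudge the slope or intercept away from the fitted values and recompute the error.
perturbations = [(0.1, 0.0), (-0.1, 0.0), (0.0, 1.0), (0.0, -1.0)]
perturbed_sse = [
    sum_of_squares(residuals(x, y, slope + ds, intercept + di))
    for ds, di in perturbations
]
```

Every entry of `perturbed_sse` exceeds `best_sse`, and the residuals of the fitted line sum to zero up to rounding.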

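Variance of the Estimate

Total prediction error is expressed as the variance of the estimate (or mean-squared error).

In terms of population parameters:

$$\sigma^2_{\text{est}\,Y} = \frac{\sum (Y - \hat{Y})^2}{N} = \sigma_Y^2\,(1 - \rho^2)$$

In terms of sample statistics:

$$s^2_{\text{est}\,Y} = \frac{\sum (Y - \hat{Y})^2}{N - 2} = \frac{N - 1}{N - 2}\, s_Y^2\,(1 - r^2)$$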
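The sketch below computes the variance of the estimate two ways on hypothetical data: directly from the squared residuals, and through the shortcut in terms of the variance of Y and r. It assumes the common conventions of dividing by N for the mean-squared error and by N − 2 for the sample estimate.

```python
from math import sqrt

# Hypothetical grades in percent (not the course data).
x = [82.0, 88.0, 90.0, 93.0, 97.0]
y = [80.0, 87.0, 86.0, 92.0, 95.0]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

slope = sp_xy / ss_x
intercept = mean_y - slope * mean_x
r = sp_xy / sqrt(ss_x * ss_y)

# Sum of squared residuals around the regression line.
sse = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))

mse = sse / n            # mean-squared error (divide by N)
s2_est = sse / (n - 2)   # sample estimate (divide by N - 2)

# Shortcut forms that avoid computing residuals explicitly.
mse_shortcut = (ss_y / n) * (1 - r ** 2)
s2_est_shortcut = (n - 1) / (n - 2) * (ss_y / (n - 1)) * (1 - r ** 2)
```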

Explained and Unexplained Variance

[Figure: scatter plot of Assignment 1 Grade and Assignment 2 Grade, PSYC 6130 Section A 2005-2006, with the explained and unexplained deviations labelled.]

Summary of Variances

It can be shown that:

$$\frac{\sum (Y - \bar{Y})^2}{N} = \frac{\sum (\hat{Y} - \bar{Y})^2}{N} + \frac{\sum (Y - \hat{Y})^2}{N}$$

i.e., the total variance is equal to the sum of the explained and unexplained variances.
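The decomposition of the total variance into explained and unexplained parts can be checked numerically. The sketch below does so with sums of squares on hypothetical data; dividing every term by N gives the variance form.

```python
# Hypothetical grades in percent (not the course data).
x = [82.0, 88.0, 90.0, 93.0, 97.0]
y = [80.0, 87.0, 86.0, 92.0, 95.0]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
predicted = [slope * xi + intercept for xi in x]

# Total: actual Y around its mean.
ss_total = sum((yi - mean_y) ** 2 for yi in y)
# Explained: predicted Y around the mean of Y.
ss_explained = sum((pi - mean_y) ** 2 for pi in predicted)
# Unexplained: actual Y around predicted Y (squared residuals).
ss_unexplained = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
```

The identity holds only for the least-squares line; for any other line a cross-product term remains.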

Coefficient of Determination

The fraction of the total variance explained by the regression line is called the coefficient of determination. It can be shown that this is just the square of the Pearson coefficient r:

Population: $\dfrac{\text{explained variance}}{\text{total variance}} = \rho^2$

Sample: $\dfrac{\sum (\hat{Y} - \bar{Y})^2}{\sum (Y - \bar{Y})^2} = r^2$

Coefficient of Nondetermination

The fraction of the total variance that remains unexplained by the regression line is called the coefficient of nondetermination. It can be shown that this is just $1 - r^2$:

Population: $\dfrac{\text{unexplained variance}}{\text{total variance}} = 1 - \rho^2$

Sample: $\dfrac{\sum (Y - \hat{Y})^2}{\sum (Y - \bar{Y})^2} = 1 - r^2$
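A short sketch, on hypothetical data, of both coefficients computed two ways: from Pearson's r, and as fractions of the total sum of squares. The function name `determination_coefficients` is an assumption for illustration.

```python
from math import sqrt

def determination_coefficients(x, y):
    """Return (r squared, 1 - r squared) computed from Pearson's r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sp_xy / sqrt(ss_x * ss_y)
    return r ** 2, 1 - r ** 2

# Hypothetical grades in percent (not the course data).
x = [82.0, 88.0, 90.0, 93.0, 97.0]
y = [80.0, 87.0, 86.0, 92.0, 95.0]

determination, nondetermination = determination_coefficients(x, y)

# The same quantities as fractions of the total sum of squares.
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
predicted = [mean_y + slope * (xi - mean_x) for xi in x]
ss_total = sum((yi - mean_y) ** 2 for yi in y)
fraction_explained = sum((pi - mean_y) ** 2 for pi in predicted) / ss_total
fraction_unexplained = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted)) / ss_total
```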

Summary of Coefficients

Components of Variance: SPSS Output

ANOVA (b)

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 861347.2 | 1 | 861347.186 | 7465.139 | .000 (a) |
| Residual | 1325861 | 11491 | 115.383 | | |
| Total | 2187209 | 11492 | | | |

a. Predictors: (Constant), How tall are you without your shoes on (in cm.)

b. Dependent Variable: How much do you weigh (in kilograms)
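The arithmetic behind the SPSS output can be retraced from its sums of squares and degrees of freedom. The sketch below assumes the run-together digits in the transcript parse as regression SS = 861347.2 (df 1) and residual SS = 1325861 (df 11491), with reported mean square 115.383 and F = 7465.139; small discrepancies come from rounding in the printed table.

```python
# Values as read from the SPSS ANOVA output (parsing of the digits is an assumption).
ss_regression = 861347.2
ss_residual = 1325861.0
df_regression = 1
df_residual = 11491

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F is the ratio of explained to unexplained mean squares.
f_ratio = ms_regression / ms_residual

# Coefficient of determination: explained share of the total sum of squares.
r_squared = ss_regression / (ss_regression + ss_residual)
```

On this reading, height accounts for roughly 39% of the variance in weight in this sample.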

Estimating the Variance of the Estimate

Uncertainty in predictions can be estimated using the assumption of homoscedasticity.

- Etymology: hom- + Greek skedastikos, able to disperse, from skedannynai, to disperse.
- Thought question: does this also explain the origin of the verb skedaddle?
- In other words, homogeneity of variance in Y over the range of X.

Confidence Intervals for Predictions
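The slide's own formula is not preserved in the transcript. A commonly used interval for an individual prediction under homoscedasticity is Ŷ ± t_crit · s_est · √(1 + 1/N + (X − X̄)² / SS_X), and the sketch below assumes that form. The function name `prediction_interval` and the data are hypothetical, and the critical t value (N − 2 degrees of freedom) is supplied by the caller, e.g. from a t table.

```python
from math import sqrt

def prediction_interval(x, y, x_new, t_crit):
    """Return (lower, upper) bounds for a new Y observed at x_new.

    t_crit is the two-tailed critical t value for N - 2 degrees of freedom.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / ss_x
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    s_est = sqrt(sse / (n - 2))  # standard error of the estimate
    y_hat = slope * x_new + intercept
    half_width = t_crit * s_est * sqrt(1 + 1 / n + (x_new - mean_x) ** 2 / ss_x)
    return y_hat - half_width, y_hat + half_width

# Hypothetical grades in percent (not the course data).
x = [82.0, 88.0, 90.0, 93.0, 97.0]
y = [80.0, 87.0, 86.0, 92.0, 95.0]

# 3.182 is the two-tailed .05 critical t for 3 degrees of freedom (N - 2 = 3).
low_at_mean, high_at_mean = prediction_interval(x, y, 90.0, 3.182)
low_far, high_far = prediction_interval(x, y, 100.0, 3.182)
```

The interval is narrowest at the mean of X and widens as the new X moves away from it.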

Example: 6130A 2005-06 Assignment marks

Underlying Assumptions

- Independent random sampling
- Linearity
- Normal distribution
- Homoscedasticity

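Regressing X on Y

Simply reverse the formulae, e.g., in terms of sample statistics:

$$\hat{X} = \bar{X} + r\,\frac{s_X}{s_Y}\,(Y - \bar{Y})$$

or

$$\hat{X} = b_{XY} Y + a_{XY}, \quad \text{where } b_{XY} = r\,\frac{s_X}{s_Y} \text{ and } a_{XY} = \bar{X} - b_{XY}\,\bar{Y}$$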
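Reversing the roles of X and Y gives a different line, not the algebraic inverse of the first one. The sketch below, on hypothetical data, computes both slopes and shows that their product is r squared, so the two lines coincide only when |r| = 1.

```python
from math import sqrt

def slopes_both_directions(x, y):
    """Return (b_yx, b_xy, r): slope for Y on X, slope for X on Y, and Pearson's r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sp_xy / sqrt(ss_x * ss_y)
    b_yx = sp_xy / ss_x  # equals r * s_y / s_x
    b_xy = sp_xy / ss_y  # equals r * s_x / s_y
    return b_yx, b_xy, r

# Hypothetical grades in percent (not the course data).
x = [82.0, 88.0, 90.0, 93.0, 97.0]
y = [80.0, 87.0, 86.0, 92.0, 95.0]

b_yx, b_xy, r = slopes_both_directions(x, y)
slope_product = b_yx * b_xy  # equals r squared
```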

When to Use Linear Regression

- Prediction
- Statistical control
  - Adjust for effects of a confounding variable.
  - Also known as partialing out the effect of the confounding variable.
- Experimental psychology: modeling the effect of a continuous independent variable on a continuous dependent variable.
  - e.g., reaction time vs set size in visual search.

Statistical Control Example: Mental Health

Women report more bad mental health days than men, t(8176) = -7.1, p < .001, 2-tailed.

Statistical Control Example: Physical Health

Correlation

Pearson’s r = 0.31

After Partialing Out Physical Health

Result of Partialing Out Physical Health

Controlling for physical health, women report more bad mental health days than men, t(8176) = -5.7, p < .001, 2-tailed.
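One way to partial out a confounding variable is to regress the dependent variable on it and keep the residuals, then compare groups on those residuals. The sketch below assumes that approach; the slides do not show the exact procedure used. The function name `residualize` and all values are hypothetical, not the survey data.

```python
def residualize(y, covariate):
    """Remove the part of y that is linearly predictable from covariate."""
    n = len(y)
    mean_c, mean_y = sum(covariate) / n, sum(y) / n
    slope = (sum((ci - mean_c) * (yi - mean_y) for ci, yi in zip(covariate, y))
             / sum((ci - mean_c) ** 2 for ci in covariate))
    intercept = mean_y - slope * mean_c
    return [yi - (slope * ci + intercept) for ci, yi in zip(covariate, y)]

# Hypothetical bad-health days per month for eight respondents (not survey data).
physical = [0.0, 2.0, 5.0, 1.0, 10.0, 3.0, 7.0, 0.0]
mental = [1.0, 3.0, 6.0, 0.0, 12.0, 5.0, 6.0, 2.0]
is_female = [True, False, True, False, True, True, False, False]

# Mental health days with the linear effect of physical health removed.
adjusted = residualize(mental, physical)

women = [a for a, f in zip(adjusted, is_female) if f]
men = [a for a, f in zip(adjusted, is_female) if not f]

# Group difference that remains after controlling for physical health.
adjusted_gap = sum(women) / len(women) - sum(men) / len(men)
```

By construction the adjusted scores are uncorrelated with the covariate, so any remaining group difference cannot be attributed to a linear effect of physical health.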
