Correlation and Regression PS397 Testing and Measurement January 16, 2007 Thanh-Thanh Tieu.


1 Correlation and Regression PS397 Testing and Measurement January 16, 2007 Thanh-Thanh Tieu

2 Canoe.ca Title of article implies that when women are depressed, they tend to drink more Correlational relationship between drinking and depression in women http://lifewise.canoe.ca/Living/2007/01/05/3176991-cp.html

3 Scatter Diagram  Visual display of relationship between variables Bivariate distribution: two scores for each individual Where an individual scores on both x and y E.g., relationship between high school average and university average Participant 11 – 3.2 high school GPA, 3.3 university GPA

4 Correlation  What does one variable tell us about the other?  Looks at how the two variables covary Changes in one correspond to changes in other  Correlation coefficient tells us the direction and magnitude of the relationship i.e., how variables are related (+/-) and the strength of the relationship

5 Correlation Coefficient Positive Correlation Negative Correlation No Correlation

6 Correlation Coefficient  Correlation coefficient varies from -1.0 (perfect negative relationship) to +1.0 (perfect positive relationship)  Accounts for the individual’s deviation above and below the group mean on each variable Above the mean on both variables = 2 positive standard scores Below the mean on both variables = 2 negative standard scores

7 Correlation Coefficient  Pearson correlation coefficient is the mean of the products of each individual's two standard scores: $r = \frac{\sum Z_X Z_Y}{N}$  Positive Value: standard scores have the same sign and are of approximately equal size  Negative Value: standard score is above the mean on one variable and below the mean on the other (cross product is negative)  No Correlation: some products are positive and some are negative, so they tend to cancel out
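The "mean of products of standard scores" idea can be sketched in a few lines of Python. This is a minimal illustration, not the course's own material: the function names and the GPA values are made up, and the standard scores use the population standard deviation (dividing by N), which is an assumption.

```python
from statistics import fmean, pstdev

def z_scores(values):
    """Convert raw scores to standard scores (population SD, an assumption)."""
    m = fmean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]

def pearson_r(x, y):
    """Pearson r as the mean of the products of paired standard scores."""
    zx, zy = z_scores(x), z_scores(y)
    return fmean([a * b for a, b in zip(zx, zy)])

# Hypothetical data: high school and university GPA for five students.
high_school = [2.5, 3.0, 3.2, 3.6, 3.9]
university = [2.4, 2.9, 3.3, 3.4, 3.8]
r = pearson_r(high_school, university)
print(round(r, 3))
```

Students above the mean on both variables, or below on both, contribute positive products, so these data give a strongly positive r.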

8 Regression  If you had no other information, what is the best prediction for a person’s grade in a course? Often we have other information (e.g., grades on other courses, midterm grades, etc.)  If variables are correlated with variable of interest, this information can help us improve our prediction  Process called regression

9 Regression  Regression line: best fitting straight line through a set of points in a scatter diagram  Principle of Least Squares Minimum squared deviation from regression line

10 Regression Line Y’ = a + bX  Y’ = predicted score  a = intercept, the value of Y’ when X is 0 (the point where the regression line crosses the Y-axis)  b = regression coefficient, the slope of the regression line  X = known score
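As a rough sketch of how a and b could be obtained under the least-squares principle, the slope is the sum of cross-products of deviations divided by the sum of squared deviations of X, and the intercept makes the line pass through the two means. The function name `fit_line` and the data are hypothetical.

```python
from statistics import fmean

def fit_line(x, y):
    """Least-squares intercept a and slope b for Y' = a + bX."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx          # slope
    a = my - b * mx        # intercept: the line passes through (mean X, mean Y)
    return a, b

# Hypothetical scores for five participants.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
a, b = fit_line(x, y)
print(round(a, 2), round(b, 2))
```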

11 Regression Line

12 Y’ = a + bX Y’ = 20 + .1X Where Y’ = predicted grade for course, X = SAT score, slope = .1, intercept = 20
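To see the slide's equation at work, the sketch below plugs a few SAT scores into Y’ = 20 + .1X. The function name and the particular SAT values are illustrative assumptions; only the intercept and slope come from the slide.

```python
def predict_grade(sat_score, intercept=20.0, slope=0.1):
    """Predicted course grade Y' = a + bX for a given SAT score X."""
    return intercept + slope * sat_score

# Hypothetical SAT scores; each extra 100 SAT points adds 10 predicted grade points.
for sat in (400, 500, 600):
    print(sat, round(predict_grade(sat), 1))
```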

13 Regression  What if there were no correlation between X and Y? What would regression line look like?

14 Regression  The larger the absolute value of b (for a given scaling of X and Y), the more information we have about Y by knowing X

15 Regression  What happens if both variables are in terms of standard scores? Y’ = a + bX  a = 0  b = r, correlation between X and Y  Regression equation would be: $Z_{Y'} = r Z_X$  Correlation: special case of regression where both variables are in standard scores
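The claim that a = 0 and b = r for standard scores can be checked numerically. The sketch below is only an illustration with made-up data and hypothetical helper names; it standardizes both variables (population SD assumed), fits a least-squares line, and compares the slope with r.

```python
from statistics import fmean, pstdev

def z_scores(values):
    """Standard scores using the population SD (an assumption)."""
    m, sd = fmean(values), pstdev(values)
    return [(v - m) / sd for v in values]

def fit_line(x, y):
    """Least-squares intercept and slope for Y' = a + bX."""
    mx, my = fmean(x), fmean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

# Hypothetical raw scores.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

zx, zy = z_scores(x), z_scores(y)
a, b = fit_line(zx, zy)                          # regression on standard scores
r = fmean([p * q for p, q in zip(zx, zy)])       # mean of z-score products
print(round(a, 6), round(b, 6), round(r, 6))
```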

16 Regression Problems  Break into groups of 3 people and complete the problems on the handout

17 Terms Used in Correlation & Regression  Residual: difference between observed and predicted values, Y – Y’; Σ residuals = 0  Standard Error of the Estimate: standard deviation of residuals, kind of an average of residuals A measure of accuracy of prediction Smaller = more accurate predictions because differences between Y and Y’ are small
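A small sketch of residuals and the standard error of the estimate follows. The data and function names are hypothetical, and dividing the squared residuals by N - 2 is one common convention that the slide does not specify; dividing by N is also seen.

```python
import math
from statistics import fmean

def fit_line(x, y):
    """Least-squares intercept and slope for Y' = a + bX."""
    mx, my = fmean(x), fmean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def residuals(x, y):
    """Observed minus predicted values, Y - Y'."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def standard_error_of_estimate(x, y):
    """sqrt(sum of squared residuals / (N - 2)); the N - 2 divisor is an assumption."""
    res = residuals(x, y)
    return math.sqrt(sum(e ** 2 for e in res) / (len(res) - 2))

# Hypothetical scores.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
res = residuals(x, y)
see = standard_error_of_estimate(x, y)
print([round(e, 2) for e in res], round(sum(res), 6), round(see, 3))
```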

18 Terms Used in Correlation & Regression  Coefficient of Determination (r²): % of total variation in one set of scores that is accounted for by information about the other set  Cross Validation: calculate the standard error of estimate in a group of participants other than the one used to get the equation  Restricted Range: when restrictions on the sample inhibit variability, the observed correlation will likely be deflated
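One way to make the coefficient of determination concrete is to compute it as the share of total variation in Y that the regression line accounts for, 1 - SS_residual / SS_total. The sketch below uses made-up data and hypothetical function names; for a single predictor this value equals the squared Pearson correlation.

```python
from statistics import fmean

def fit_line(x, y):
    """Least-squares intercept and slope for Y' = a + bX."""
    mx, my = fmean(x), fmean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def r_squared(x, y):
    """Proportion of total variation in Y accounted for by X."""
    a, b = fit_line(x, y)
    my = fmean(y)
    ss_total = sum((yi - my) ** 2 for yi in y)
    ss_resid = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_resid / ss_total

# Hypothetical scores; r for these data is 0.8.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
print(round(r_squared(x, y), 2))
```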

19 Terms Used in Correlation & Regression  Correlation – Causation Problem: correlation between two variables does not necessarily mean that one causes the other E.g., aggression and TV viewing  Third Variable Explanation: the possibility that a third variable that hasn’t been measured causes both E.g., poor social adjustment may underlie both aggression and TV viewing

20 Multiple Regression  Looks at relationship among three or more variables E.g., predicting course grade from SAT scores and average from previous year $Y' = a + b_1 X_1 + b_2 X_2 + \dots + b_k X_k$ Where k = # of predictor variables  Example: predicting law school GPA from undergrad GPA, professors’ ratings, age Law school GPA = .8 (Z score of Undergrad GPA) + .24 (Z score of profs’ ratings) + .03 (Z score of age)
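The law school example can be sketched as a weighted sum of standard scores. The weights below come from the slide; the dictionary keys, the function name, and the applicant's Z scores are hypothetical, and the result is read here as a predicted Z score since the predictors are standardized.

```python
# Standardized weights from the slide's law school example.
WEIGHTS = {"undergrad_gpa": 0.80, "prof_ratings": 0.24, "age": 0.03}

def predict_law_gpa_z(z):
    """Weighted sum of predictor Z scores (one weight per predictor)."""
    return sum(WEIGHTS[name] * z[name] for name in WEIGHTS)

# Hypothetical applicant: 1 SD above the mean on undergrad GPA,
# half an SD above on ratings, 1 SD below the mean age.
applicant = {"undergrad_gpa": 1.0, "prof_ratings": 0.5, "age": -1.0}
predicted = predict_law_gpa_z(applicant)
print(round(predicted, 2))
```

With these weights, undergrad GPA dominates the prediction, while age contributes almost nothing.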

21 Multiple Regression  When variables are expressed in Z-units, weights are standardized regression coefficients Also called β’s or betas  If variables are not in Z-units, the weights are raw regression coefficients Also called b’s  Need to be careful when predictor variables are highly correlated  Best when predictor variables are uncorrelated

22 Teaching Evaluation For: Thanh-Thanh Tieu Date: January 16, 2007 Class: Correlation & Regression, PS397 Strengths of the Lecture Suggestions for Improvement Additional Comments

