
1 Copyright © 2010 Pearson Education, Inc. 17-1 Chapter Seventeen Correlation and Regression

2 Chapter Outline
1) Overview
2) Product-Moment Correlation
3) Regression Analysis
4) Bivariate Regression
5) Multiple Regression
6) Multicollinearity

3 Variances are similar; t-tests are appropriate.

4 Variances are not similar; t-tests could be misleading. Correlation and regression analysis can help.

5 Product Moment Correlation
The product moment correlation, r, summarizes the strength of association between two metric (interval- or ratio-scaled) variables, say X and Y. In other words, you can compute a correlation coefficient for Likert-scale items, but not for dichotomous items. It is an index used to determine whether a linear (straight-line) relationship exists between X and Y. Because it was originally proposed by Karl Pearson, it is also known as the Pearson correlation coefficient. It is also referred to as simple correlation, bivariate correlation, or merely the correlation coefficient.

6 Linear relationships

7 Product Moment Correlation
From a sample of n paired observations of X and Y, the product moment correlation, r, can be calculated as:

r = Σ (Xi − X̄)(Yi − Ȳ) / √[ Σ (Xi − X̄)² × Σ (Yi − Ȳ)² ]    (sums over i = 1, …, n)

where X̄ = average of all x's and Ȳ = average of all y's. Don't worry, we can do this in SPSS…

8 Product Moment Correlation
r varies between −1.0 and +1.0. The correlation coefficient between two variables will be the same regardless of their underlying units of measurement. For example, comparing a 5-point scale to a 7-point scale is okay.
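A quick way to see the unit-invariance point is to rescale the variables and recompute r. Below is a minimal Python sketch using NumPy with made-up ratings (not data from the chapter); the coefficient is identical before and after rescaling.

```python
import numpy as np

# Made-up ratings standing in for two survey items on different scales
x = np.array([1, 2, 3, 4, 5, 4, 3, 2, 1, 5], dtype=float)  # e.g., a 5-point item
y = np.array([2, 3, 4, 5, 6, 6, 4, 3, 2, 7], dtype=float)  # e.g., a 7-point item

r_original = np.corrcoef(x, y)[0, 1]
r_rescaled = np.corrcoef(10 * x + 3, 0.5 * y - 1)[0, 1]     # change the "units"

print(round(r_original, 4), round(r_rescaled, 4))           # same value twice
```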

9 Explaining Attitude Toward the City of Residence (Example: Table 17.1)

10 Product Moment Correlation
The correlation coefficient may be calculated as follows:

STEP 1: GET THE AVERAGES OF X AND Y
X̄ = (10 + 12 + 12 + 4 + 12 + 6 + 8 + 2 + 18 + 9 + 17 + 2)/12 = 9.333 (average duration of residence, X)
Ȳ = (6 + 9 + 8 + 3 + 10 + 4 + 5 + 2 + 11 + 9 + 10 + 2)/12 = 6.583 (average attitude toward the city, Y)

STEP 2: GET THE NUMERATOR
For each respondent, subtract the average of x from their x and the average of y from their y, then multiply, then sum all values:
Σ (Xi − X̄)(Yi − Ȳ)
= (10 − 9.33)(6 − 6.58) + (12 − 9.33)(9 − 6.58) + (12 − 9.33)(8 − 6.58) + (4 − 9.33)(3 − 6.58)
+ (12 − 9.33)(10 − 6.58) + (6 − 9.33)(4 − 6.58) + (8 − 9.33)(5 − 6.58) + (2 − 9.33)(2 − 6.58)
+ (18 − 9.33)(11 − 6.58) + (9 − 9.33)(9 − 6.58) + (17 − 9.33)(10 − 6.58) + (2 − 9.33)(2 − 6.58)
= −0.3886 + 6.4614 + 3.7914 + 19.0814 + 9.1314 + 8.5914 + 2.1014 + 33.5714 + 38.3214 − 0.7986 + 26.2314 + 33.5714
= 179.6668

11 Product Moment Correlation
STEP 3: GET THE DENOMINATOR
Σ (Xi − X̄)²
= (10 − 9.33)² + (12 − 9.33)² + (12 − 9.33)² + (4 − 9.33)² + (12 − 9.33)² + (6 − 9.33)² + (8 − 9.33)² + (2 − 9.33)² + (18 − 9.33)² + (9 − 9.33)² + (17 − 9.33)² + (2 − 9.33)²
= 0.4489 + 7.1289 + 7.1289 + 28.4089 + 7.1289 + 11.0889 + 1.7689 + 53.7289 + 75.1689 + 0.1089 + 58.8289 + 53.7289
= 304.6668

Σ (Yi − Ȳ)²
= (6 − 6.58)² + (9 − 6.58)² + (8 − 6.58)² + (3 − 6.58)² + (10 − 6.58)² + (4 − 6.58)² + (5 − 6.58)² + (2 − 6.58)² + (11 − 6.58)² + (9 − 6.58)² + (10 − 6.58)² + (2 − 6.58)²
= 0.3364 + 5.8564 + 2.0164 + 12.8164 + 11.6964 + 6.6564 + 2.4964 + 20.9764 + 19.5364 + 5.8564 + 11.6964 + 20.9764
= 120.9168

STEP 4: COMPLETE THE FORMULA
r = 179.6668 / √(304.6668 × 120.9168) = 0.9361
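As a sanity check on the hand calculation, here is a short NumPy sketch that recomputes r from the same twelve (duration, attitude) pairs; it should print approximately 0.9361.

```python
import numpy as np

# Duration of residence (X) and attitude toward the city (Y) from the worked example
x = np.array([10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2], dtype=float)
y = np.array([6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2], dtype=float)

numerator = np.sum((x - x.mean()) * (y - y.mean()))                           # Step 2
denominator = np.sqrt(np.sum((x - x.mean())**2) * np.sum((y - y.mean())**2))  # Step 3
r = numerator / denominator                                                   # Step 4

print(round(r, 4))               # ~0.9361
print(np.corrcoef(x, y)[0, 1])   # same value from NumPy's built-in
```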

12 Interpretation of the Correlation Coefficient The correlation coefficient ranges from −1 to 1. A value of 1 implies that a linear equation describes the relationship between X and Y perfectly, with all data points lying on a line for which Y increases as X increases. A value of −1 implies that all data points lie on a line for which Y decreases as X increases. A value of 0 implies that there is no linear correlation between the variables.

13 Positive and Negative Correlation

14 Interpretation of the Correlation Coefficient
As a rule of thumb, correlation values can be interpreted in the following manner:

Correlation   Negative          Positive
None          −0.09 to 0.0      0.0 to 0.09
Small         −0.3 to −0.1      0.1 to 0.3
Medium        −0.5 to −0.3      0.3 to 0.5
Strong        −1.0 to −0.5      0.5 to 1.0
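If you want the rule of thumb in code, a tiny helper like the one below (one possible reading of the cutoffs; boundary cases such as exactly 0.3 are a judgment call) can label a coefficient:

```python
def interpret_correlation(r: float) -> str:
    """Label a correlation using the rule-of-thumb cutoffs in the table above."""
    magnitude = abs(r)
    if magnitude < 0.1:
        size = "none"
    elif magnitude <= 0.3:
        size = "small"
    elif magnitude <= 0.5:
        size = "medium"
    else:
        size = "strong"
    direction = "negative" if r < 0 else "positive"
    return f"{size} ({direction})"

print(interpret_correlation(0.9361))   # strong (positive)
print(interpret_correlation(-0.22))    # small (negative)
```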

15 SPSS Windows: Correlations
1. Select ANALYZE from the SPSS menu bar.
2. Click CORRELATE and then BIVARIATE.
3. Move "variable x" into the VARIABLES box. Then move "variable y" into the VARIABLES box.
4. Check PEARSON under CORRELATION COEFFICIENTS.
5. Check ONE-TAILED under TEST OF SIGNIFICANCE.
6. Check FLAG SIGNIFICANT CORRELATIONS.
7. Click OK.
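Outside SPSS, the same correlation and its significance test can be obtained with SciPy. The sketch below reuses the Table 17.1 data; pearsonr returns a two-tailed p-value, so the one-tailed value (SPSS's ONE-TAILED option) is obtained by halving it when the observed sign matches the hypothesized direction.

```python
import numpy as np
from scipy import stats

# Table 17.1 data: duration of residence (x) and attitude toward the city (y)
x = np.array([10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2], dtype=float)
y = np.array([6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2], dtype=float)

r, p_two_tailed = stats.pearsonr(x, y)   # Pearson r and two-tailed p-value
p_one_tailed = p_two_tailed / 2          # one-tailed p, given a directional hypothesis

print(f"r = {r:.4f}, one-tailed p = {p_one_tailed:.4f}")
```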

16 SPSS Example: Correlation

Correlations
                                       Age    InternetUsage   InternetShopping
Age               Pearson Correlation    1        -.740            -.622
                  Sig. (1-tailed)                  .000             .002
                  N                     20          20               20
InternetUsage     Pearson Correlation  -.740         1              .767
                  Sig. (1-tailed)       .000                         .000
                  N                     20          20               20
InternetShopping  Pearson Correlation  -.622       .767                1
                  Sig. (1-tailed)       .002       .000
                  N                     20          20               20

17 Regression Analysis
Regression analysis examines associative relationships between a metric dependent variable and one or more independent variables in the following ways:
Determine whether the independent variables explain a significant variation in the dependent variable: whether a relationship exists.
Determine how much of the variation in the dependent variable can be explained by the independent variables: the strength of the relationship.
Determine the structure or form of the relationship: the mathematical equation relating the independent and dependent variables.
For example, does a change in age predict a change in Internet usage? Regression can answer this.

18 Statistics Associated with Bivariate Regression Analysis
Bivariate regression model. The basic regression equation is Yi = β0 + β1 Xi + ei, where Y = dependent or criterion variable, X = independent or predictor variable, β0 = intercept of the line, β1 = slope of the line, and ei is the error term associated with the i-th observation.
Coefficient of determination. The strength of association is measured by the coefficient of determination, r². It varies between 0 and 1 and signifies the proportion of the total variation in Y that is accounted for by the variation in X. Note: this is the correlation coefficient squared. Above .5 is good.
Estimated or predicted value. The estimated or predicted value of Yi is Ŷi = a + b Xi, where Ŷi is the predicted value of Yi, and a and b are estimators of β0 and β1, respectively.

19 Statistics Associated with Bivariate Regression Analysis
Regression coefficient. The estimated parameter b is usually referred to as the non-standardized regression coefficient.
Scattergram. A scatter diagram, or scattergram, is a plot of the values of two variables for all the cases or observations.
Standard error of estimate. This statistic, SEE, is the standard deviation of the actual Y values from the predicted Ŷ values.
Standard error. The standard deviation of b, SEb, is called the standard error.

20 Conducting Bivariate Regression Analysis (Fig. 17.2)
Plot the Scatter Diagram
Formulate the General Model
Estimate Standardized Regression Coefficients (b)
Test for Significance (p-value)
Determine the Strength of Association (r-square)
Check Prediction Accuracy

21 Conducting Bivariate Regression Analysis
The Bivariate Regression Model
In the bivariate regression model, the general form of a straight line is:
Y = β0 + β1 X
where Y = dependent variable, X = independent (predictor) variable, β0 = intercept of the line, and β1 = slope of the line.
The regression procedure adds an error term:
Yi = β0 + β1 Xi + ei
where ei is the error term associated with the i-th observation.

22 Plot of Attitude with Duration
Actual responses: Attitude Toward City vs. Duration of Residence. Is there a pattern? And which line is most accurate?
[Scatterplot: x-axis = Duration of Residence, y-axis = Attitude]

23 Plot of Attitude with Duration
In order to determine the correct line, we use the least-squares procedure (also called OLS, ordinary least squares, regression). Essentially, this finds the line that comes closest to all the points: least squares minimizes the sum of the squared vertical distances of the points from the line. Once we find the line, a formula can be derived:
Attitude = 1.0793 + 0.5897 (duration of residence)
This means that attitude toward the city can be predicted from duration of residence.
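To reproduce this fit in Python, here is a minimal sketch using statsmodels (assuming it is installed) with the twelve observations from Table 17.1; the estimated intercept and slope should come out near 1.0793 and 0.5897, matching the equation above.

```python
import numpy as np
import statsmodels.api as sm

# Table 17.1 data: duration of residence (X) and attitude toward the city (Y)
duration = np.array([10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2], dtype=float)
attitude = np.array([6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2], dtype=float)

X = sm.add_constant(duration)        # adds the intercept column
ols = sm.OLS(attitude, X).fit()      # least-squares (OLS) fit

print(ols.params)                    # ~[1.0793, 0.5897]  (intercept, slope)
print(round(ols.rsquared, 3))        # ~0.876, i.e., 0.9361 squared
```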

24 SPSS Windows: Bivariate Regression
1. Select ANALYZE from the SPSS menu bar.
2. Click REGRESSION and then LINEAR.
3. Move "Variable y" into the DEPENDENT box.
4. Move "Variable x" into the INDEPENDENT(S) box.
5. Select ENTER in the METHOD box.
6. Click on STATISTICS and check ESTIMATES under REGRESSION COEFFICIENTS.
7. Check MODEL FIT.
8. Click CONTINUE.
9. Click OK.

25 SPSS Example: Bivariate Regression

Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .767   .588       .565                .93902

Coefficients
Model               Unstandardized B   Std. Error   Standardized Beta   t       Sig.
1   (Constant)      .122               .577                             .211    .835
    InternetUsage   .853               .168          .767               5.071   .000

26 Multiple Regression
The general form of the multiple regression model is as follows:
Y = β0 + β1 X1 + β2 X2 + β3 X3 + ... + βk Xk + e
We will want to run multiple regression if we believe that multiple IVs predict one DV. Perhaps this is a more appropriate formula:
Attitude = 0.33732 + 0.48108 (Duration of residence) + 0.28865 (Importance of weather)
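A Python sketch of the same idea, assuming a pandas DataFrame with columns named duration, weather_importance, and attitude (the weather ratings below are invented for illustration; only the duration and attitude values come from Table 17.1):

```python
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    "duration":           [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2],   # Table 17.1
    "weather_importance": [ 3,  4,  4, 2,  5, 2, 3, 1,  5, 4,  5, 1],   # invented values
    "attitude":           [ 6,  9,  8, 3, 10, 4, 5, 2, 11, 9, 10, 2],   # Table 17.1
})

X = sm.add_constant(df[["duration", "weather_importance"]])  # two IVs plus an intercept
model = sm.OLS(df["attitude"], X).fit()

print(model.params)     # intercept and one coefficient per predictor
print(model.summary())  # R-square, coefficients, t-values, Sig., analogous to SPSS output
```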

27 SPSS Windows: Multiple Regression
1. Select ANALYZE from the SPSS menu bar.
2. Click REGRESSION and then LINEAR.
3. Move "Variable y" into the DEPENDENT box.
4. Move "Variable x1, x2, x3 …" into the INDEPENDENT(S) box.
5. Select ENTER in the METHOD box.
6. Click on STATISTICS and check ESTIMATES under REGRESSION COEFFICIENTS.
7. Check MODEL FIT.
8. Click CONTINUE.
9. Click OK.

28 SPSS Example: Multiple Regression

Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .771   .595       .547                .95866

Coefficients
Model               Unstandardized B   Std. Error   Standardized Beta   t       Sig.
1   (Constant)      .862               1.542                            .559    .583
    InternetUsage   .754               .255          .679               2.957   .009
    Age             -.011              .021          -.119              -.520   .610

29 Multicollinearity
Multicollinearity arises when correlations among the predictors are very high. Multicollinearity can result in several problems, including:
The regression coefficients may not be estimated precisely.
The magnitudes, as well as the signs, of the partial regression coefficients may change.
It becomes difficult to assess the relative importance of the independent variables in explaining the variation in the dependent variable.
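A simple diagnostic consistent with this slide is to inspect the correlations among the predictors before interpreting the coefficients. The sketch below (invented data and column names) flags any pair of IVs whose correlation exceeds a chosen cutoff:

```python
import pandas as pd

# Invented predictor data; in the chapter's example it is the high
# Age-InternetUsage correlation (-.740) that signals multicollinearity.
predictors = pd.DataFrame({
    "age":            [18, 22, 25, 30, 35, 40, 45, 50, 55, 60],
    "internet_usage": [12, 11, 10,  9,  8,  6,  5,  4,  3,  2],
})

corr = predictors.corr()
threshold = 0.7   # a common rule-of-thumb cutoff; adjust to taste

for i, a in enumerate(corr.columns):
    for b in corr.columns[i + 1:]:
        if abs(corr.loc[a, b]) > threshold:
            print(f"Possible multicollinearity: corr({a}, {b}) = {corr.loc[a, b]:.3f}")
```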

30 Multicollinearity
In our example, age is a significant predictor when it is the only IV (see the coefficients below), yet it was not significant in the multiple regression. That, plus the high correlations among the predictors, indicates that multicollinearity exists. Since age also has a very low coefficient value, we can feel safe dropping it from our final model.

Coefficients
Model            Unstandardized B   Std. Error   Standardized Beta   t        Sig.
1   (Constant)   5.073              .709                             7.160    .000
    Age          -.056              .017          -.622              -3.366   .003

31 Thank you! Questions?

