Covariation in Research
[Diagram: independent variables X1, X2, and X3 related to the dependent variable Y]
The Concept of Correlation
Correlation is the association or relationship between two variables, X and Y. Correlated variables covary: they go together (co-relate).
Patterns of Covariation
Variables that covary go together. X and Y can show one of three patterns: positive correlation, negative correlation, or zero (no) correlation.
Scatter Plots
Scatter plots allow us to visualize relationships. The chief purpose of the scatter diagram is to study the nature of the relationship between two variables:
Linear or curvilinear relationship
Direction of the relationship
Magnitude (size) of the relationship
Scatter Plot A: an illustration of a perfect positive correlation (Variable X against Variable Y, each running from low to high). Each point represents both the X and the Y score, and a given X corresponds to an exact value of Y.
Scatter Plot B: an illustration of a positive correlation. A given X yields only an estimated Y value.
Scatter Plot C: an illustration of a perfect negative correlation. A given X corresponds to an exact value of Y.
Scatter Plot D: an illustration of a negative correlation. A given X yields only an estimated Y value.
Scatter Plot E: an illustration of a zero correlation.
Scatter Plot F: an illustration of a curvilinear relationship.
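The patterns in these plots can also be checked numerically. The sketch below is only an illustration: the three small data sets are made up for the purpose, and `pearson_r` is a hypothetical helper name. It uses the deviation-score form of the coefficient that the following slides develop.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r computed from deviation scores (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return ss_xy / sqrt(ss_x * ss_y)

# Made-up scores chosen to mimic Scatter Plots A, C, and F
xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2 * x + 1 for x in xs])    # points on a rising line
r_negative = pearson_r(xs, [10 - 2 * x for x in xs])   # points on a falling line
r_curved = pearson_r(xs, [(x - 3) ** 2 for x in xs])   # symmetric U shape

print(r_positive, r_negative, r_curved)
```

For these data the coefficient is +1 for the rising line, -1 for the falling line, and 0 for the U-shaped pattern, even though Y is completely determined by X in all three cases. A linear coefficient does not detect a curvilinear relationship, which is one reason to inspect the scatter plot first.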
The Measurement of Correlation
The degree of correlation between two variables can be described by such terms as strong, low, positive, or moderate, but these terms are not very precise. If a correlation coefficient is computed between two sets of scores, the relationship can be described more accurately.
The Correlation Coefficient
A statistical summary of the degree and direction of the relationship or association between two variables can be computed.
Pearson's Product-Moment Correlation Coefficient (r)
Scale of r: -1.00, -.50, 0, +.50, +1.00. Values below 0 indicate a negative correlation, 0 indicates no relationship, and values above 0 indicate a positive correlation.
Direction of relationship: sign (+ or -)
Magnitude: 0 through +1 or 0 through -1
The Pearson Product-Moment Correlation Coefficient
Recall the formula for a variance, in which each deviation of X from its mean is squared, that is, multiplied by itself. If the second X deviation in that product is replaced with the deviation of a second variable, Y, the result is called a covariance, and it is an index of the relationship between X and Y.
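Written out, the substitution looks like this. The sketch assumes the sample form with N - 1 in the denominator; dividing by N instead would lead to the same correlation coefficient, because the denominator cancels when r is formed.

```latex
s_X^2 = \frac{\sum (X - \bar{X})(X - \bar{X})}{N - 1}
\qquad\longrightarrow\qquad
\mathrm{cov}_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}
```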
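Conceptual Formula for Pearson r
Conceptually, Pearson r is the covariance of X and Y divided by the product of their standard deviations. This formula may be rewritten to reflect the actual method of calculation.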
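A sketch of that rewriting, in the usual textbook notation (the symbols cov_XY, s_X, and s_Y and the N - 1 denominator are assumptions about the notation, not taken from the slide): the N - 1 terms cancel, leaving only sums of deviation products.

```latex
r = \frac{\mathrm{cov}_{XY}}{s_X \, s_Y}
  = \frac{\dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}}
         {\sqrt{\dfrac{\sum (X - \bar{X})^2}{N - 1}} \,
          \sqrt{\dfrac{\sum (Y - \bar{Y})^2}{N - 1}}}
  = \frac{\sum (X - \bar{X})(Y - \bar{Y})}
         {\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}
```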
Calculation of Pearson r
$r = \dfrac{SS_{XY}}{\sqrt{SS_X \, SS_Y}}$
Notice that this formula is merely the sum of squares for the covariance divided by the square root of the product of the sums of squares for X and Y.
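Formulae for Sums of Squares
$SS_X = \sum X^2 - \dfrac{(\sum X)^2}{N}$
$SS_Y = \sum Y^2 - \dfrac{(\sum Y)^2}{N}$
$SS_{XY} = \sum XY - \dfrac{(\sum X)(\sum Y)}{N}$
Therefore, the formula for calculating r may be rewritten as:
$r = \dfrac{\sum XY - \dfrac{(\sum X)(\sum Y)}{N}}{\sqrt{\left[\sum X^2 - \dfrac{(\sum X)^2}{N}\right]\left[\sum Y^2 - \dfrac{(\sum Y)^2}{N}\right]}}$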
An Example
Suppose that a college statistics professor is interested in how the number of hours a student spends studying is related to the number of errors the student makes on the mid-term examination. To determine the relationship, the professor collects the following data:
The Stats Professor's Data
Student   Hours Studied (X)   Errors (Y)   X^2   Y^2   XY
1         4                   15           16    225   60
2         4                   12           16    144   48
3         5                   9            25    81    45
4         6                   10           36    100   60
5         7                   8            49    64    56
6         7                   4            49    16    28
7         7                   6            49    36    42
8         9                   2            81    4     18
9         9                   4            81    16    36
10        12                  3            144   9     36
Total     70                  73           546   695   429
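The derived columns and totals in this table can be reproduced with a few lines of code. This is a minimal sketch; the variable names are chosen here for illustration only.

```python
# Raw scores from the professor's table
hours = [4, 4, 5, 6, 7, 7, 7, 9, 9, 12]     # X: hours studied
errors = [15, 12, 9, 10, 8, 4, 6, 2, 4, 3]  # Y: errors on the mid-term

# One row per student: X, Y, X^2, Y^2, XY
rows = [(x, y, x * x, y * y, x * y) for x, y in zip(hours, errors)]

# Column totals: sum of X, sum of Y, sum of X^2, sum of Y^2, sum of XY
totals = tuple(sum(column) for column in zip(*rows))
print(totals)  # (70, 73, 546, 695, 429)
```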
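The Data Needed to Calculate the Sums of Squares
Totals (N = 10): $\sum X = 70$, $\sum Y = 73$, $\sum X^2 = 546$, $\sum Y^2 = 695$, $\sum XY = 429$
$SS_X = \sum X^2 - (\sum X)^2 / N = 546 - 70^2/10 = 546 - 490 = 56$
$SS_Y = \sum Y^2 - (\sum Y)^2 / N = 695 - 73^2/10 = 695 - 532.9 = 162.1$
$SS_{XY} = \sum XY - (\sum X)(\sum Y) / N = 429 - (70)(73)/10 = 429 - 511 = -82$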
Calculating the Correlation Coefficient
$r = \dfrac{SS_{XY}}{\sqrt{SS_X \, SS_Y}} = \dfrac{-82}{\sqrt{(56)(162.1)}} = \dfrac{-82}{95.28} = -0.86$
Thus, the correlation between hours studied and errors made on the mid-term examination is -0.86, indicating that more time spent studying is related to fewer errors on the mid-term examination. Hopefully an obvious, but now a statistical, conclusion!
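The same arithmetic can be checked in code. The sketch below starts from the column totals of the professor's table and applies the raw-score formulas for the sums of squares; the variable names are illustrative.

```python
from math import sqrt

# Column totals from the professor's data (N = 10 students)
n = 10
sum_x, sum_y = 70, 73
sum_x2, sum_y2, sum_xy = 546, 695, 429

# Raw-score formulas for the sums of squares
ss_x = sum_x2 - sum_x ** 2 / n       # 546 - 490   = 56
ss_y = sum_y2 - sum_y ** 2 / n       # 695 - 532.9 = 162.1
ss_xy = sum_xy - sum_x * sum_y / n   # 429 - 511   = -82

# Pearson r: note the square root in the denominator
r = ss_xy / sqrt(ss_x * ss_y)
print(round(r, 2))  # -0.86
```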
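Pearson Product-Moment Correlation Coefficient (r)
Scale of r: -1 is a perfect negative correlation, 0 is a zero correlation, and +1 is a perfect positive correlation. Values between -1 and 0 are negative correlations; values between 0 and +1 are positive correlations.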
The Pearson r and Marginal Distribution
The marginal distribution of X is simply the frequency distribution of the Xs; the marginal distribution of Y is the frequency distribution of the Ys.
[Figure: bivariate normal distribution showing the bivariate relationship between X and Y]
The marginal distributions of X and Y are precisely the same shape.
[Figure: marginal distributions of the X variable and the Y variable]
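To make the idea of a marginal distribution concrete, the sketch below tabulates the two marginal frequency distributions for the study-hours example from the earlier slides. It uses `collections.Counter` from the Python standard library; the variable names are illustrative.

```python
from collections import Counter

# Paired scores from the study-hours example
hours = [4, 4, 5, 6, 7, 7, 7, 9, 9, 12]     # X
errors = [15, 12, 9, 10, 8, 4, 6, 2, 4, 3]  # Y

# Each marginal distribution ignores the other variable entirely
marginal_x = Counter(hours)
marginal_y = Counter(errors)

for value in sorted(marginal_x):
    print("X =", value, "frequency =", marginal_x[value])
for value in sorted(marginal_y):
    print("Y =", value, "frequency =", marginal_y[value])
```

Each marginal distribution accounts for all ten students, but neither one says anything about how X and Y are paired. That pairing is what the scatter plot and r describe.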
Interpreting r, the Correlation Coefficient
Recall that r includes two types of information:
The direction of the relationship (+ or -)
The magnitude of the relationship (0 to 1)
However, there is a more precise way to use the correlation coefficient, r, to interpret the magnitude of a relationship: the square of the correlation coefficient, r^2. The square of r tells us what proportion of the variance of Y can be explained by X, or vice versa.
How does correlation explain variance?
Suppose you wish to estimate Y for a given value of X.
[Figure: Variable X against Variable Y, each running from low to high, illustrating how the squared correlation accounts for variance; r = .7, r^2 = .49]
With r = .7, 49% of the variance is explained; the remaining 51% is free to vary.
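One way to see what "explained" means is to estimate Y from X with a straight line and compare the leftover variation with the total variation in Y. The sketch below reuses the study-hours data from the earlier example, not the r = .7 illustration. It assumes the estimate comes from a least-squares line, which the slides do not state explicitly.

```python
# Study-hours data from the earlier example
hours = [4, 4, 5, 6, 7, 7, 7, 9, 9, 12]     # X
errors = [15, 12, 9, 10, 8, 4, 6, 2, 4, 3]  # Y

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(errors) / n

# Sums of squares from deviation scores
ss_x = sum((x - mean_x) ** 2 for x in hours)
ss_y = sum((y - mean_y) ** 2 for y in errors)
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, errors))

r = ss_xy / (ss_x * ss_y) ** 0.5

# Least-squares line used to estimate Y from X (an assumption of this sketch)
slope = ss_xy / ss_x
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in hours]

# Variation in Y left over after estimating it from X
ss_residual = sum((y - p) ** 2 for y, p in zip(errors, predicted))
explained = 1 - ss_residual / ss_y

print(round(r ** 2, 2), round(explained, 2))  # 0.74 0.74
```

For these data r = -0.86, so r^2 is about .74: roughly 74% of the variance in errors is accounted for by hours studied, and about 26% remains free to vary.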
Now, let's look at some correlation coefficients and their corresponding scatter plots.
What is your estimate of r?
[Scatter plot] r = .87, r^2 = .76 = 76%