Correlation Correlation : the mathematical extent to which two variables are related to each other –Correlation refers to both a type of research design and a descriptive statistical procedure. –Generally performed between two scores obtained from the same source
Correlation Coefficient Correlation Coefficient : number between +1 and -1 that represents the strength and direction of the relationship between two variables. Correlations that are closer to +1 or -1 are stronger and are better able to accurately predict one variable from the other.
Types of Correlation Coefficients Pearson r : both variables are measured at an interval/ratio level Spearman rho : used when the measurement of at least one variable is ordinal (scores on the other variable must be converted to ranks)
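The rank conversion behind Spearman rho can be sketched in a few lines. This is a minimal illustration, assuming rho is obtained by converting both sets of scores to ranks (ties share the mean rank) and then applying the Pearson formula to the ranks. The function names and the data are hypothetical and are not part of the original slides.

```python
def pearson_r(xs, ys):
    """Pearson r computed from raw scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / (ss_x * ss_y) ** 0.5


def to_ranks(scores):
    """Convert scores to ranks (1 = lowest); tied scores share the mean rank."""
    ordered = sorted(scores)
    ranks = []
    for s in scores:
        first = ordered.index(s) + 1   # rank of the first occurrence
        count = ordered.count(s)       # number of tied scores
        ranks.append(first + (count - 1) / 2)
    return ranks


def spearman_rho(xs, ys):
    """Spearman rho as the Pearson r of the ranked scores."""
    return pearson_r(to_ranks(xs), to_ranks(ys))


# Hypothetical data: hours of practice (interval) and finishing place (ordinal)
hours = [2, 4, 5, 7, 9]
placing = [5, 4, 2, 3, 1]
rho = spearman_rho(hours, placing)
print(f"Spearman rho = {rho:.2f}")
```

With no tied ranks, this gives the same value as the shortcut formula 1 - 6Σd² / (N(N² - 1)).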
Positive Correlations Positive Correlation : a correlation coefficient whose value is between 0 and +1 Indicates that high scores on one variable are associated with high scores on another variable The values of the variables increase and decrease together.
Negative Correlations Negative Correlation : a correlation coefficient whose value is between 0 and -1 Indicates that there is an inverse relationship between the two sets of scores A high score on X is related to a low score on Y, and vice versa
Linear Relationships Linear Relationship : a condition wherein the relationship between two variables can be best described by a straight line (the regression line or the line of best fit)
Scatterplots Scatterplot : provides a visual representation of the relationship between variables Each point represents paired measurements on two variables for a specific individual
Understanding the Pearson Product Moment Correlation Coefficient Pearson r : represents the extent to which individuals occupy the same relative position in two distributions Definitional Equation: $r = \frac{\sum z_X z_Y}{N}$ Important Reminder: –$\sum z^2 = N$
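A short sketch may make the z-score view of Pearson r concrete. It assumes the definitional form r = Σ(z_X z_Y) / N with z-scores based on the population standard deviation (dividing by N), which is the convention under which Σz² = N holds. The data and function names are hypothetical.

```python
def z_scores(scores):
    """Convert raw scores to z-scores using the population SD (divide by N)."""
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((s - mean) ** 2 for s in scores) / n) ** 0.5
    return [(s - mean) / sd for s in scores]


def pearson_r_from_z(xs, ys):
    """Pearson r as the mean cross-product of paired z-scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


# Hypothetical paired scores for five individuals
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r_from_z(x, y)
sum_z_squared = sum(z ** 2 for z in z_scores(x))

print(f"r = {r:.3f}")
print(f"sum of squared z-scores = {sum_z_squared:.1f} (N = {len(x)})")
```

If every individual held exactly the same relative position in both distributions, each z_X would equal its z_Y, the numerator would reduce to Σz² = N, and r would equal +1.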
Interpreting the Correlation Coefficient Coefficient of Determination ($r^2$): the proportion of variance in one variable that can be described or explained by the other variable Coefficient of Nondetermination ($1 - r^2$): the proportion of variance in one variable that cannot be described or explained by the other variable
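As a quick arithmetic illustration, the two coefficients can be computed directly from r. The function name and the r values below are hypothetical examples chosen for illustration.

```python
def variance_explained(r):
    """Return (coefficient of determination, coefficient of nondetermination)."""
    r_squared = r ** 2
    return r_squared, 1 - r_squared


# Hypothetical correlation coefficients
for r in (0.30, -0.50, 0.90):
    determination, nondetermination = variance_explained(r)
    print(f"r = {r:+.2f}: explained = {determination:.0%}, "
          f"unexplained = {nondetermination:.0%}")
```

Because r is squared, the sign drops out: r = -0.50 and r = +0.50 both account for 25% of the variance.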
Correlation Matrices Tables of correlations are generated when more than two variables are involved. A Correlation Matrix is a table in which each variable is listed both at the top and at the left side, and the correlation of all possible pairs of variables is shown inside the table An asterisk identifies significant correlations.
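The layout of a correlation matrix can be sketched as follows. This is only an illustration using Pearson r on hypothetical variables; it does not compute significance, so no asterisks are shown.

```python
def pearson_r(xs, ys):
    """Pearson r computed from raw scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / (ss_x * ss_y) ** 0.5


def correlation_matrix(data):
    """Correlate every variable with every other variable (and itself)."""
    names = list(data)
    return {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}


# Hypothetical scores for five individuals on three variables
data = {
    "anxiety": [3, 5, 2, 8, 6],
    "sleep": [8, 6, 9, 4, 5],
    "caffeine": [1, 3, 1, 4, 2],
}
matrix = correlation_matrix(data)

# Print with each variable listed both at the top and at the left side
names = list(data)
print(" " * 10 + "".join(f"{name:>10}" for name in names))
for row in names:
    cells = "".join(f"{matrix[row][col]:>10.2f}" for col in names)
    print(f"{row:<10}{cells}")
```

The diagonal is always 1.00 (each variable correlated with itself), and the table is symmetric, so only the cells above or below the diagonal carry distinct information.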
Caution: Spurious Correlations Spurious Correlation : a correlation coefficient that is artificially high or low because of the nature of the data or the method for collecting the data Common Causes of Spurious Correlations: –A nonlinear relationship –Truncated range –Sample Size –Outliers –Multiple Populations –Extreme Scores
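One of the listed causes, a single outlier, is easy to demonstrate. The sketch below uses hypothetical data in which five scores are only weakly related, then adds one extreme pair and recomputes r.

```python
def pearson_r(xs, ys):
    """Pearson r computed from raw scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / (ss_x * ss_y) ** 0.5


# Hypothetical scores with a weak relationship
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 3]
r_without = pearson_r(x, y)

# The same scores plus one extreme individual at (20, 20)
r_with = pearson_r(x + [20], y + [20])

print(f"r without the outlier = {r_without:.2f}")
print(f"r with the outlier    = {r_with:.2f}")
```

In this made-up data the one added point moves r from roughly 0.14 to roughly 0.97, which is why a scatterplot should be inspected before a coefficient is interpreted.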
Caution: No Causality Correlations only tell us that two variables are related; they do not determine causality Four Possible Explanations: 1. X → Y (Temporal Directionality) 2. Y → X (Temporal Directionality) 3. X ↔ Y (Bidirectional Causation) 4. Z → X and Y (Third Variable Problem)
Computing the Correlation Coefficient Using SPSS Analyze > Correlate > Bivariate. Select the variables to be correlated in the left side of the Bivariate Correlations window and move them to the right side. Select the appropriate correlation coefficient. Check Two-tailed and Flag significant correlations. Click OK.
Interpreting the Output
Creating a Scatterplot Graphs > Scatter. Click Simple. Click Define. Move the criterion variable to the Y axis box. Move the predictor variable to the X axis box. Click OK. Double-click on the chart to edit it. Click Fit Line at Total. Click OK.
Linear Regression An important use of the correlation coefficient is the ability to predict one set of scores from another. If we know the score on one variable, we can use that score to predict someone’s score on the correlated variable.
The Regression Line Line of Best Fit : the line that minimizes the sum of the squared vertical distances between each individual point and the regression line
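The idea of a best-fitting line can be checked numerically. The sketch below assumes the usual least-squares criterion, in which "distance" means the squared vertical distance between each observed Y and the Y predicted by the line. The data and function names are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sp / ss_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_error(xs, ys, intercept, slope):
    """Sum of squared vertical distances from the points to a given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical paired scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares_line(x, y)
best_sse = sum_squared_error(x, y, a, b)

# Any other line, such as one with a steeper slope or a lower intercept,
# leaves a larger sum of squared distances
steeper_sse = sum_squared_error(x, y, a, b + 0.1)
lower_sse = sum_squared_error(x, y, a - 0.2, b)

print(f"best-fit line: SSE = {best_sse:.2f}")
print(f"steeper line:  SSE = {steeper_sse:.2f}")
print(f"lower line:    SSE = {lower_sse:.2f}")
```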
The Regression Equation Equation: $Y' = a_Y + b_Y(X)$ Where $Y'$ = the predicted score of Y based on a known value of X; $a_Y$ = the intercept of the regression line; $b_Y$ = the slope of the line; X = the score being used as the predictor
In English Please… Slope : how much variable Y changes as the values of variable X change one unit Intercept : the value of variable Y when X = 0 Predictor Variable : the variable X which is used to predict the score on variable Y (antecedent or independent variable) Criterion Variable : the variable that is predicted (dependent variable)
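The slope, intercept, and prediction can be sketched together. This example assumes the standard least-squares formulas (slope = sum of cross-products divided by the sum of squares of X; intercept = mean of Y minus slope times mean of X). The predictor and criterion scores are hypothetical.

```python
def regression_coefficients(predictor, criterion):
    """Return (intercept, slope) for predicting the criterion from the predictor."""
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(criterion) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, criterion))
    ss_x = sum((x - mean_x) ** 2 for x in predictor)
    slope = sp / ss_x                      # change in Y per one-unit change in X
    intercept = mean_y - slope * mean_x    # predicted Y when X = 0
    return intercept, slope


# Hypothetical predictor (X) and criterion (Y) scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = regression_coefficients(x, y)


def predict(score):
    """Predicted criterion score Y' for a known predictor score X."""
    return a + b * score


print(f"Y' = {a:.1f} + {b:.1f}(X)")
print(f"Predicted Y for X = 6: {predict(6):.1f}")
```

For this data the slope is 0.6 and the intercept is 2.2, so each one-unit increase in X raises the predicted Y by 0.6, and a score of X = 6 gives a predicted Y of 5.8.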
Linear Regression Using SPSS Analyze > Regression > Linear. Click on the criterion variable and move it to the Dependent box. Click on the predictor variable and move it to the Independent(s) box. Click Statistics, check Descriptives, and make sure that Estimates and Model fit are also selected. Click Continue. Click OK.
Interpreting the Output The F value in the ANOVA box indicates whether the predictor variable was a significant predictor of the criterion variable. The unstandardized coefficient for the constant reflects the Y intercept of the regression equation. The unstandardized coefficient for the predictor variable reflects the slope of the line. The regression equation for this example would be Y’ = X