Correlation Chapter 6
Assumptions for Pearson r: X and Y should be interval or ratio. X and Y should be normally distributed. Each X should be independent of other X’s. Each Y should be independent of other Y’s. Scores on Y should be linearly related to X. X and Y scores should be bivariate normal.
In a bivariate normal distribution, when you look at a single X value (for example, 0), most of the Y points cluster around the blue regression line, and the farther you move from the blue line, the fewer Y data points there are at that X.
If there is a relationship between two variables, then as one variable deviates from its mean, the other variable deviates from its mean in the same way or in the directly opposite way. Covariance is the averaged sum of combined deviations (the cross-products of the two variables' deviations). When each variable's deviations from its mean are about the same magnitude, the covariance is high.
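The covariance described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's own computation: the data values are hypothetical, and dividing by n − 1 (the sample covariance) is an assumption, since the slide only says "averaged".

```python
def covariance(x, y):
    """Sample covariance: summed cross-products of deviations, divided by n - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Each term is (deviation of X from its mean) * (deviation of Y from its mean).
    cross_products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
    return sum(cross_products) / (n - 1)  # n - 1 is an assumption (sample formula)


# Hypothetical scores, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(covariance(x, y))                   # positive: deviations go the same way
print(covariance(x, [-v for v in y]))     # negative: deviations go opposite ways
```

A positive result means the two variables tend to deviate from their means in the same direction; a negative result means they deviate in opposite directions.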
Correlation: The correlation (r) is computed by dividing the covariance by the product of the standard deviations of X and Y. The value of r ranges from −1 to +1. The correlation coefficient is a standard measure of effect size: ±.1 is a small effect, ±.3 is a medium effect, ±.5 is a large effect. Correlation and Cause? There is a high r between shoe size and math performance for grade school children, but bigger feet do not cause better math scores; both increase with age.
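As a rough sketch of that definition, the function below divides the covariance by the product of the two standard deviations and then labels the result with the effect-size benchmarks from the slide. The data are hypothetical, the n − 1 denominators are an assumption, and the "negligible" label for values below .1 is not from the slides.

```python
import math


def pearson_r(x, y):
    """Pearson r = covariance / (SD of x * SD of y), using n - 1 denominators."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    sd_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    return cov / (sd_x * sd_y)


def effect_size_label(r):
    """Benchmarks from the slide: .1 small, .3 medium, .5 large."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # label assumed here, not given on the slide


# Hypothetical scores, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3), effect_size_label(r))
```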
Two Types of Correlation: There are two types of correlation: bivariate and partial. Bivariate correlation is the correlation between two variables. Partial correlation is the correlation between two variables when controlling for the effect of one or more additional variables.
Pearson’s Product Moment Correlation: Correlation measures the association between two variables. Correlation quantifies the extent to which the mean, variation and direction of one variable are related to another variable. r ranges from +1 to −1. Correlation can be used for prediction. Correlation does not indicate the cause of a relationship.
Scatter Plot: A scatter plot gives a visual description of the relationship between two variables. The line of best fit is defined as the line that minimizes the sum of the squared vertical deviations from each data point up or down to the line.
Line of Best Fit Minimizes Squared Deviations from a Data Point to the Line
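To make the least-squares idea concrete, here is a small sketch that fits a line with the standard closed-form slope and intercept and checks that nudging the line increases the sum of squared deviations. The data points are hypothetical and the function names are chosen for this example only.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_deviations(x, y, slope, intercept):
    """Sum of squared vertical distances from each point to the line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept = fit_line(x, y)
best = sum_squared_deviations(x, y, slope, intercept)
shifted = sum_squared_deviations(x, y, slope + 0.1, intercept)
print(slope, intercept, best, shifted)  # the shifted line has a larger sum
```

Any other slope or intercept gives a larger sum of squared deviations than the fitted line, which is what "line of best fit" means here.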
Always do a Scatter Plot to Check the Shape of the Relationship
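Will a Linear Fit Work? (Scatter plot with a linear fit; the fitted equation and R² value were not preserved.)
6th Order Fit? (Scatter plot with a 6th-order polynomial fit; the coefficients and R² value were not preserved.)
Will a Linear Fit Work? (Scatter plot with a linear fit; the fitted equation and R² value were not preserved.)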
Evaluating the Strength of a Correlation: For predictions, an absolute value of r below .7 may produce unacceptably large errors, especially if the SDs of either or both X and Y are large. As a general rule:
– An absolute value of r greater than or equal to .9 is good
– An absolute value of r equal to [range not preserved] is moderate
– An absolute value of r equal to [range not preserved] is low
– Values of r below .5 give R² below .25 (25%); they are poor and thus not useful for predicting.
Significant Correlation? If N is large (N = 90), then a correlation of .205 is significant. ALWAYS THINK ABOUT R²: how much variance in Y is X accounting for? With r = .205, R² = .042, so X accounts for only 4.2% of the variance in Y. This will lead to poor predictions. A 95% confidence interval will also show how poor the prediction is.
The Venn diagram shows R², the amount of variance in Y that is explained by X. Here R² = .64, so 64% of the variance in Y is explained by X, and the unexplained variance in Y is 1 − R² = .36 (36%).
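The R² arithmetic on these two slides is easy to check. The sketch below squares r to get the explained share of variance and subtracts from 1 for the unexplained share; the function name is made up for this example, and r = .8 is simply the value whose square is the .64 shown in the Venn diagram.

```python
def variance_explained(r):
    """Return (R squared, 1 - R squared) for a correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    r_squared = r ** 2
    return r_squared, 1.0 - r_squared


# r = .205 from the significance slide, r = .8 implied by R squared = .64.
for r in (0.205, 0.8):
    explained, unexplained = variance_explained(r)
    print(f"r = {r}: explained {explained:.1%}, unexplained {unexplained:.1%}")
```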
A partial correlation is used to remove the effects of Revision Time on both Exam Anxiety and Exam Performance. Then the unique contribution of Exam Anxiety on Exam Performance can be analyzed.
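One common way to compute a partial correlation with a single control variable is the first-order formula built from the three pairwise correlations. The sketch below uses that formula; the three correlation values are hypothetical placeholders, not results from the exam anxiety data.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("Z is perfectly correlated with X or Y")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical correlations, for illustration only:
# X = exam anxiety, Y = exam performance, Z = revision time.
r_anxiety_performance = -0.44
r_anxiety_revision = -0.71
r_performance_revision = 0.40

print(round(partial_r(r_anxiety_performance,
                      r_anxiety_revision,
                      r_performance_revision), 3))
```

With these made-up inputs the partial correlation is closer to zero than the bivariate correlation, which is the pattern to expect when the control variable accounts for part of the relationship.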
A coach ranked athletes based on jumping ability and sprinting ability. The ranks for each athlete are presented in the table below. Compute the correlation between the ratings. Table columns: Athlete, Jump Rating, Sprint Rating.
Level of Measurement is ordinal
Kendall’s tau is better if the data have several ties; if not, Spearman’s correlation is fine.
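For ranked data like the coach's ratings, both coefficients can be computed by hand. The sketch below assumes there are no tied ranks: it uses the shortcut Spearman formula and the simple concordant-minus-discordant form of Kendall's tau (tau-a), neither of which corrects for ties. The ranks are hypothetical, since the original table values are not available.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman correlation for untied ranks: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(rank_x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def kendall_tau_a(rank_x, rank_y):
    """Kendall tau-a for untied ranks: (concordant - discordant) / number of pairs."""
    n = len(rank_x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (rank_x[i] - rank_x[j]) * (rank_y[i] - rank_y[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical ranks for five athletes, for illustration only.
jump_rank = [1, 2, 3, 4, 5]
sprint_rank = [2, 1, 4, 3, 5]
print(spearman_rho(jump_rank, sprint_rank))
print(kendall_tau_a(jump_rank, sprint_rank))
```

When ties are present, a tie-corrected version (such as tau-b in a statistics package) would be the safer choice.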
Test-Retest Reliability (ICC) and Day to Day Variation
Consistency of Measurements: Reliability refers to the consistency of a test or measurement. A test cannot be considered valid if it is not reliable. You should know the day-to-day variation in your dependent variable. How much does 1RM bench press change from day to day? How much does VO2 max change from day to day?
Test-Retest Reliability: The 1RM bench press strength was measured on 8 subjects on two separate days. In this experiment we want to measure the reliability of our measurements. Here is the day-to-day variation in 1RM bench press strength. Table columns: Subject, Trial 1, Trial 2, Difference (with Mean and SD rows). In this experiment, we plan to average trials, so we will use the Averaged Measures ICC.
Enter the between day variables.
Click OK to run
There is no difference between trials, F(1,7) = .530, p = .49. If there is a difference between trials, you may have a learning effect or a fatigue effect, and you must modify your methods to control for learning and/or fatigue.
ICC = .973 [High Degree of Reliability] Using Mixed Model, Type Consistency, Averaged Measures.
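For readers without SPSS, the averaged-measures consistency ICC can be sketched from a two-way ANOVA decomposition, using the commonly cited form (MS subjects − MS error) / MS subjects. This is an illustrative sketch rather than a reproduction of the SPSS procedure, and the scores are hypothetical, so it will not return the .973 reported on the slide. The function also returns the F ratio for trials, which corresponds to the between-trials test reported above.

```python
def icc_consistency_average(scores):
    """Averaged-measures consistency ICC and the F ratio for trials.

    scores: one row per subject, one column per trial.
    """
    n = len(scores)       # subjects
    k = len(scores[0])    # trials
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 subjects and 2 trials, no missing cells")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_trials = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_trials

    ms_subjects = ss_subjects / (n - 1)
    ms_trials = ss_trials / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    if ms_subjects == 0:
        raise ValueError("subjects do not differ; ICC is undefined")

    icc = (ms_subjects - ms_error) / ms_subjects
    f_trials = ms_trials / ms_error if ms_error > 0 else float("inf")
    return icc, f_trials


# Hypothetical 1RM scores (kg) for 8 subjects on 2 days, for illustration only.
scores = [[100, 102], [85, 83], [120, 125], [95, 96],
          [110, 108], [70, 74], [130, 128], [90, 93]]
icc, f_trials = icc_consistency_average(scores)
print(round(icc, 3), round(f_trials, 3))
```

With 8 subjects and 2 trials the F ratio has (1, 7) degrees of freedom, matching the F(1,7) form reported for the trials effect.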
Day to Day Variation: If you train subjects for 8 weeks and they improve their strength by 2.8 kg, have you actually done anything? Is the program effective? Compare the gain with the day-to-day variation in 1RM bench press strength. Table columns: Subject, Trial 1, Trial 2, Difference (with Mean and SD rows).
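The comparison this slide asks for can be sketched by computing the mean and SD of the trial-to-trial differences and setting a training gain next to them. The scores are hypothetical, and taking the difference as Trial 2 minus Trial 1 is an assumption about how the original table was built.

```python
import statistics


def day_to_day_variation(trial_1, trial_2):
    """Mean and sample SD of the differences (Trial 2 minus Trial 1)."""
    if len(trial_1) != len(trial_2) or len(trial_1) < 2:
        raise ValueError("need paired scores for at least 2 subjects")
    differences = [b - a for a, b in zip(trial_1, trial_2)]
    return statistics.mean(differences), statistics.stdev(differences)


# Hypothetical 1RM bench press scores (kg), for illustration only.
trial_1 = [100, 85, 120, 95, 110, 70, 130, 90]
trial_2 = [102, 83, 125, 96, 108, 74, 128, 93]
mean_diff, sd_diff = day_to_day_variation(trial_1, trial_2)

training_gain = 2.8  # kg, the improvement mentioned on the slide
print(f"day-to-day difference: {mean_diff:.2f} ± {sd_diff:.2f} kg")
print("gain exceeds one SD of day-to-day variation:", training_gain > sd_diff)
```

If the gain is no larger than the ordinary day-to-day spread, it is hard to attribute it to the training program.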
Manuscript Methods & Results Sections
METHODS: Measurement of Reliability. Prior to initiating the study, 8 subjects (not in the experiment) participated in a test-retest assessment of measurement reliability. One RM bench press strength was measured on two separate days for subjects in the reliability study. Reproducibility of bench press strength was analyzed using SPSS (18.0 for Windows) to compute the intraclass correlation coefficient (ICC) using a two-factor mixed-effects model and type consistency (McGraw and Wong, 1996; Shrout and Fleiss, 1979).
RESULTS: A high degree of reliability was found between bench press strength measurements. The average measure ICC was .973 with a 95% confidence interval from [bounds not preserved]. The mean between-day variation for 1RM strength was 2.75 ± 10.7 kg.
REFERENCES:
McGraw KO and Wong SP. Forming inferences about some intraclass correlation coefficients. Psychological Methods 1: 30-46, 1996.
Shrout PE and Fleiss JL. Intraclass correlations: Uses in assessing reliability. Psychol Bull 86: , 1979.