# Correlation: Minium, Clarke & Coladarci, Chapter 7


Association

- Univariate vs. bivariate: one variable vs. two variables.
- When we have two variables, we can ask about the degree to which they co-vary: is there any relationship between an individual's score on one variable and his or her score on a second variable?
  - Number of beers consumed and reaction time (RT)
  - Number of hours of studying and score on an exam
  - Years of education and salary
  - Parents' anxiety (or depression) and child anxiety (or depression)
- The correlation coefficient: a bivariate statistic that measures the degree of linear association between two quantitative variables.
  - The Pearson product-moment correlation coefficient

Bivariate Distributions and Scatterplots

- Scatter diagram: a graph that shows the degree and pattern of the relationship between two variables.
- Horizontal axis: usually the variable that does the predicting (this is somewhat arbitrary)
  - e.g., price, studying, income, caffeine intake
- Vertical axis: usually the variable that is predicted
  - e.g., quality, grades, happiness, alertness

Bivariate Distributions and Scatterplots

Steps for making a scatter diagram:

1. Draw the axes and assign variables to them.
2. Determine the range of values for each variable and mark the axes.
3. Mark a dot for each person's pair of scores.
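The three steps can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `ascii_scatter` is hypothetical, the plot is drawn with text characters, and the sketch assumes small non-negative integer scores. The data are the five (X, Y) pairs from the positive-correlation covariance example in this deck.

```python
def ascii_scatter(xs, ys):
    """Draw a text scatter diagram for small non-negative integer scores."""
    # Step 2: determine the range of values for each variable.
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)

    # Step 3: one dot ("*") for each person's pair of scores.
    points = set(zip(xs, ys))

    lines = []
    # Step 1: Y on the vertical axis (top row is the largest Y value).
    for y in range(y_hi, y_lo - 1, -1):
        cells = ["*" if (x, y) in points else "." for x in range(x_lo, x_hi + 1)]
        lines.append(f"{y:>3} | " + " ".join(cells))
    # Step 1, continued: X on the horizontal axis, labelled along the bottom.
    lines.append(" " * 6 + " ".join(str(x) for x in range(x_lo, x_hi + 1)))
    return "\n".join(lines)


xs = [9, 7, 5, 3, 1]    # X scores for persons A to E
ys = [13, 9, 7, 11, 5]  # Y scores for persons A to E
plot = ascii_scatter(xs, ys)
print(plot)
```

Each row of the output is one Y value and each column one X value, so the dots drift upward to the right when high X scores go with high Y scores.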

Bivariate Distributions and Scatterplots

- Linear correlation
  - The pattern on a scatter diagram is a straight line.
  - Example: upper figure on the slide (not reproduced here).
- Curvilinear correlation
  - A more complex relationship between the variables.
  - The pattern in a scatter diagram is not a straight line.
  - Example: lower figure on the slide (not reproduced here).

Bivariate Distributions and Scatterplots

- Positive linear correlation
  - High scores on one variable are matched by high scores on the other.
  - The line slants up to the right.
- Negative linear correlation
  - High scores on one variable are matched by low scores on the other.
  - The line slants down to the right.

Bivariate Distributions and Scatterplots

- Zero correlation
  - No line, straight or otherwise, can be fit to the relationship between the two variables.
  - The two variables are said to be uncorrelated.

Bivariate Distributions and Scatterplots

Figure panels (scatterplots not reproduced): (a) negative linear correlation, (b) curvilinear correlation, (c) positive linear correlation, (d) no correlation.

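The Covariance

- Covariance is a number that reflects the degree and direction of association between two variables.
- Definition: $\mathrm{Cov} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n}$, where $\bar{X}$ and $\bar{Y}$ are the means of X and Y (written Xm and Ym in the worked examples).
- Note its similarity to the definition of variance ($S^2$).
- The logic of the covariance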
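To make the similarity to the variance concrete, the two definitions can be written side by side. This sketch assumes the divide-by-n form that the worked examples in this deck use (for instance Cov = 28/5); the symbols $\bar{X}$, $\bar{Y}$ for the means and $S_X^2$ for the variance of X are notation chosen here, not taken from the slides.

$$
S_X^2 = \frac{\sum (X - \bar{X})(X - \bar{X})}{n}
\qquad\qquad
\mathrm{Cov} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n}
$$

Replacing the second X deviation with the Y deviation turns the variance into the covariance. Each cross-product is positive when a person's two scores fall on the same side of their respective means and negative when they fall on opposite sides, so the sign of the sum gives the direction of the association.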

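The Covariance

Example (Positive Correlation) (see p. 109)

| Person | X | Y | X - Xm | Y - Ym | (X - Xm)(Y - Ym) |
|--------|---|----|--------|--------|------------------|
| A | 9 | 13 | 4 | 4 | 16 |
| B | 7 | 9 | 2 | 0 | 0 |
| C | 5 | 7 | 0 | -2 | 0 |
| D | 3 | 11 | -2 | 2 | -4 |
| E | 1 | 5 | -4 | -4 | 16 |

n = 5, Xm = 5, Ym = 9, sum = 28

Cov = 28/5 = 5.6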

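The Covariance

Example (Negative Correlation) (see p. 109)

| Person | X | Y | X - Xm | Y - Ym | (X - Xm)(Y - Ym) |
|--------|---|----|--------|--------|------------------|
| A | 9 | 5 | 4 | -4 | -16 |
| B | 7 | 11 | 2 | 2 | 4 |
| C | 5 | 7 | 0 | -2 | 0 |
| D | 3 | 9 | -2 | 0 | 0 |
| E | 1 | 13 | -4 | 4 | -16 |

n = 5, Xm = 5, Ym = 9, sum = -28

Cov = -28/5 = -5.6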

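The Covariance

Example (Zero Correlation) (see p. 109)

| Person | X | Y | X - Xm | Y - Ym | (X - Xm)(Y - Ym) |
|--------|---|----|--------|--------|------------------|
| A | 9 | 13 | 4 | 2.8 | 11.2 |
| B | 7 | 9 | 2 | -1.2 | -2.4 |
| C | 5 | 7 | 0 | -3.2 | 0.0 |
| D | 3 | 9 | -2 | -1.2 | 2.4 |
| E | 1 | 13 | -4 | 2.8 | -11.2 |

n = 5, Xm = 5, Ym = 10.2, sum = 0

Cov = 0/5 = 0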
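The three worked examples can be checked with a short Python sketch. The function name `covariance` is hypothetical; it follows the divide-by-n calculation shown in the examples (sum of cross-products of deviation scores, divided by the number of persons).

```python
def covariance(xs, ys):
    """Mean cross-product of deviation scores (divides by n, as in the examples)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    cross_products = [(x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)]
    return sum(cross_products) / n


xs = [9, 7, 5, 3, 1]              # X scores for persons A to E (same in all three examples)
ys_positive = [13, 9, 7, 11, 5]   # positive-correlation example
ys_negative = [5, 11, 7, 9, 13]   # negative-correlation example
ys_zero = [13, 9, 7, 9, 13]       # zero-correlation example

cov_positive = covariance(xs, ys_positive)
cov_negative = covariance(xs, ys_negative)
cov_zero = covariance(xs, ys_zero)

print(round(cov_positive, 2), round(cov_negative, 2), round(cov_zero, 2))
```

Because the Y mean in the third example is 10.2, the floating-point result may differ from zero by a tiny rounding error, which is why the values are rounded before printing.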

The Pearson r: the Pearson product-moment coefficient of correlation

- The correlation coefficient, r, indicates the precise degree of linear correlation between two variables.
- It can vary from -1 (perfect negative correlation) through 0 (no correlation) to +1 (perfect positive correlation).
- r is more useful than Cov because it is independent of the underlying scales of the two variables.
  - If two variables produce an r of .5, for example, r will still equal .5 after any linear transformation of the two variables (provided the multiplier is positive; multiplying one variable by a negative constant reverses the sign of r).
  - Linear transformation: adding, subtracting, dividing, or multiplying by a constant
    - e.g., converting Celsius to Fahrenheit: F = 32 + 1.8C
    - e.g., converting Fahrenheit to Celsius: C = (F - 32) / 1.8
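The scale independence of r can be illustrated with a small Python sketch. The slides do not show a computing formula for r, so this assumes the standard definition, the covariance divided by the product of the two standard deviations. The function names are hypothetical, and treating the X scores as Celsius temperatures is purely for illustration.

```python
from math import sqrt


def covariance(xs, ys):
    """Mean cross-product of deviation scores (divides by n)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations (assumed standard form)."""
    # covariance(v, v) is the variance of v, so its square root is the standard deviation.
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


celsius = [9, 7, 5, 3, 1]          # X scores, treated here as degrees Celsius
ys = [13, 9, 7, 11, 5]             # Y scores from the positive-correlation example
fahrenheit = [32 + 1.8 * c for c in celsius]

r_celsius = pearson_r(celsius, ys)
r_fahrenheit = pearson_r(fahrenheit, ys)
r_negated = pearson_r([-c for c in celsius], ys)  # multiplier of -1 flips the sign

print(round(r_celsius, 3), round(r_fahrenheit, 3), round(r_negated, 3))
```

The covariance itself changes when X is rescaled (it is multiplied by 1.8 here), but dividing by the standard deviations cancels that factor, which is why r is unchanged.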

The Pearson r: the Pearson product-moment coefficient of correlation

Figure panels (scatterplots not reproduced): r = .81, r = .46, r = .16, r = -.75, r = -.42, r = -.18.

Correlation and Causality

When two variables are correlated, there are three possible directions of causality:

- The 1st variable causes the 2nd.
- The 2nd variable causes the 1st.
- Some 3rd variable causes both the 1st and the 2nd.

There is inherent ambiguity in correlations.


Factors influencing the Pearson r

- Linearity: To the extent that a bivariate distribution departs from linearity, r will underestimate that relationship (p. 121).
- Outliers: Discrepant data points, or outliers, affect the magnitude of r and the direction of the effect, depending on the outlier's location in the scatterplot (p. 122).
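Both effects can be seen in a small Python sketch. The data sets below are invented for illustration and do not come from the textbook, and `pearson_r` is a hypothetical helper that assumes the standard definition of r (covariance divided by the product of the standard deviations).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from deviation scores (assumed standard definition)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sum_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_mean) ** 2 for x in xs)
    sum_yy = sum((y - y_mean) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Linearity: Y is completely determined by X (Y = X squared), yet the
# relationship is curvilinear, so r badly underestimates it.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [x ** 2 for x in curve_x]
r_curvilinear = pearson_r(curve_x, curve_y)

# Outliers: five points on a perfect straight line, then one discrepant point.
line_x = [1, 2, 3, 4, 5]
line_y = [2, 4, 6, 8, 10]
r_without_outlier = pearson_r(line_x, line_y)
r_with_outlier = pearson_r(line_x + [6], line_y + [0])

print(round(r_curvilinear, 3), round(r_without_outlier, 3), round(r_with_outlier, 3))
```

In this made-up case a single discrepant point in the lower right of the scatterplot pulls a perfect correlation down to a weak one; an outlier placed elsewhere could instead inflate r.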

Factors influencing the Pearson r

- Restriction of range: Other things being equal, restricted variation in either X or Y will result in a lower Pearson r than would be obtained were variation greater (p. 122).
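A hedged Python sketch of restriction of range follows. The data are invented for illustration (X runs from 1 to 20 and Y equals X plus an alternating error of +3 or -3), and `pearson_r` is a hypothetical helper assuming the standard definition of r. The same relationship is then measured using only the cases with X between 9 and 12.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from deviation scores (assumed standard definition)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sum_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_mean) ** 2 for x in xs)
    sum_yy = sum((y - y_mean) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Full range: X = 1..20, Y = X plus an alternating error of +3, -3, +3, ...
full_x = list(range(1, 21))
full_y = [x + (3 if i % 2 == 0 else -3) for i, x in enumerate(full_x)]

# Restricted range: keep only the cases whose X score lies between 9 and 12.
restricted = [(x, y) for x, y in zip(full_x, full_y) if 9 <= x <= 12]
restricted_x = [x for x, _ in restricted]
restricted_y = [y for _, y in restricted]

r_full = pearson_r(full_x, full_y)
r_restricted = pearson_r(restricted_x, restricted_y)

print(round(r_full, 3), round(r_restricted, 3))
```

The error around the line is the same size in both samples; what changes is how much of the spread in X is left for the trend to show through.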

Factors influencing the Pearson r

- Context: Because of the many factors that influence r, there is no such thing as *the* correlation between two variables. Rather, the obtained r must be interpreted in full view of the factors that affect it and the particular conditions under which it was obtained (p. 124).

Judging the Strength of Association

- $r^2$: proportion of common variance
- The coefficient of determination, $r^2$, is the proportion of common variance shared by two variables.
- We will talk more about this when we discuss Chapter 8.
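As a quick illustration, squaring a few of the r values shown in this deck's scatterplot panels gives the corresponding proportions of common variance. The function name `coefficient_of_determination` is hypothetical.

```python
def coefficient_of_determination(r):
    """Proportion of common variance shared by two variables: r squared."""
    return r ** 2


# r values taken from the scatterplot panels earlier in the deck.
for r in (0.81, 0.46, 0.16, -0.75):
    share = coefficient_of_determination(r)
    print(f"r = {r:+.2f}  ->  r^2 = {share:.2f}")
```

Squaring removes the sign, so correlations of equal size in opposite directions share the same proportion of variance, and small correlations shrink sharply when squared.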