Correlation: A statistic to describe the relationship between variables


1 Correlation: A statistic to describe the relationship between variables [Figure: three plots with axes labeled Hours Worked and Pay]

2 Univariate vs. Bivariate Statistics
* Univariate analyses/graphical representations: frequency histograms; measures of central tendency and variability; Z-scores
* Bivariate analyses/graphical representations: scatterplots; correlation
How can we define correlation?
* A linear pattern of relationship between one variable (x) and another variable (y)
* An association between two variables
* The relative position of one variable correlates with the relative distribution of another variable

3 Correlations allow us to look for evidence of a relationship between variables.

4 Correlations can vary in strength

5 Correlations can vary in direction. [Figure: plot with an axis labeled "flu shots given"] So, how do we QUANTIFY a correlation? We need to come up with a NUMBER that reflects both the strength and direction of the correlation.

6 Correlation finds the strength and direction of the best fitting line to the data.
$$ r = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{n}\right]}} $$
The number we calculate in Statistics is called the correlation coefficient. Developed by Karl Pearson, it is also sometimes referred to as Pearson's r.
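
As a side note, the computational formula on this slide can be read as a shortcut for a formula written in terms of deviations from the means. The sketch below uses \bar{X} and \bar{Y} for the sample means, a notation the slides themselves do not introduce.

```latex
% Each bracketed term in the computational formula is a sum of deviations:
\sum XY - \frac{(\sum X)(\sum Y)}{n} = \sum (X - \bar{X})(Y - \bar{Y}),
\qquad
\sum X^2 - \frac{(\sum X)^2}{n} = \sum (X - \bar{X})^2,
\qquad
\sum Y^2 - \frac{(\sum Y)^2}{n} = \sum (Y - \bar{Y})^2

% Substituting these into the formula for r gives the deviation-score form:
r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}
         {\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}
```

Both forms give the same value; the computational form only avoids calculating the means first.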

7 Example Calculation: the following data represent the number of emergency room visits per year (x) and cigarettes smoked a day (y) by three individuals recruited from New York Methodist Hospital.
      x     y     x^2   y^2   xy
      2     4     4     16    8
      3     5     9     25    15
      7     6     49    36    42
Sum   12    15    62    77    65
$$ r = \frac{65 - \frac{(12)(15)}{3}}{\sqrt{\left[62 - \frac{(12)^2}{3}\right]\left[77 - \frac{(15)^2}{3}\right]}} = \frac{5}{\sqrt{(14)(2)}} = 0.94 $$
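
The calculation on this slide can be checked with a few lines of Python. This is a minimal sketch of the computational formula, not code from the lecture; the function name `pearson_r` and the variable names are chosen here for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r using the computational (raw-score) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt(
        (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
    )
    return numerator / denominator


visits = [2, 3, 7]        # x: emergency room visits per year
cigarettes = [4, 5, 6]    # y: cigarettes smoked a day

r = pearson_r(visits, cigarettes)
print(round(r, 2))  # 0.94
```

The intermediate sums match the table: 65 for xy, 62 for x squared, and 77 for y squared.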

8 Another way to think of the correlation: the product of the Z-scores for each pair of scores, summed and divided by n - 1.
$$ r = \frac{\sum Z_x Z_y}{n - 1} $$
x     y     Zx      Zy
2     4     -.76    -1
3     5     -.38     0
7     6     1.13     1
If x = 2, Zx = (2 - 4)/2.65 = -.76 … If y = 4, Zy = (4 - 5)/1 = -1

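9 Another way to think of the correlation: the product of the Z-scores for each pair of scores, summed and divided by n - 1.
$$ r = \frac{\sum Z_x Z_y}{n - 1} $$
x     y     Zx      Zy     ZxZy
2     4     -.76    -1     .76
3     5     -.38     0     0
7     6     1.13     1     1.13
                    Sum    1.89
r = 1.89 / 2 = .945, or about .95 (slide 7 gives 0.94; the small difference comes from rounding the Z-scores before multiplying).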
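
The Z-score route can be sketched in Python as well. The helper names `z_scores` and `pearson_r_from_z` are made up for this example, and the standard deviation is assumed to be the sample version (dividing by n - 1), which is what reproduces the 2.65 used on slide 8.

```python
import math


def z_scores(values):
    """Z-scores using the sample standard deviation (n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pearson_r_from_z(xs, ys):
    """Pearson's r as the sum of Z-score products divided by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)


x = [2, 3, 7]
y = [4, 5, 6]

print([round(z, 2) for z in z_scores(x)])  # [-0.76, -0.38, 1.13]
print(z_scores(y))                         # [-1.0, 0.0, 1.0]
print(round(pearson_r_from_z(x, y), 4))    # 0.9449
```

Without rounding the Z-scores along the way, the result is about 0.9449, the same value the computational formula gives.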

10 Interpreting the Pearson r
* Range of values: -1.0 to +1.0
* Direction from the sign
  negative => anticorrelated: as one variable goes up, the other goes down in value.
  positive => correlated: as one variable goes up, so does the other.
* Strength from the magnitude
  | r | = 1.0: perfect relationship
  | r | = 0.0: no evidence of relationship
  0.0 < | r | < 1.0: intermediate strength relationship
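
The reading rules on this slide can be written out as a small helper. This is only an illustrative sketch; the function `describe_r` and its return strings are invented here and simply echo the slide's wording.

```python
def describe_r(r):
    """Return (direction, strength) labels for a Pearson r value."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.0 and +1.0")

    # Direction comes from the sign of r.
    if r > 0:
        direction = "positive (correlated)"
    elif r < 0:
        direction = "negative (anticorrelated)"
    else:
        direction = "none"

    # Strength comes from the magnitude |r|.
    magnitude = abs(r)
    if magnitude == 1.0:
        strength = "perfect relationship"
    elif magnitude == 0.0:
        strength = "no evidence of relationship"
    else:
        strength = "intermediate strength relationship"

    return direction, strength


print(describe_r(0.94))
print(describe_r(-1.0))
```

For the example from slide 7, r = 0.94 is labeled as a positive, intermediate strength relationship.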

12 When NOT to use a correlation:
* Extreme scores (pictured example: r = .97)
* Non-linear relationships (pictured example: r = .20)
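
Both problems can be illustrated with small made-up data sets. The numbers below are hypothetical and are not the data behind the slide's plots; they only sketch how a single extreme score can inflate r, and how a strong non-linear relationship can produce an r near zero.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r using the computational (raw-score) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    numerator = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    ss_x = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_y = sum(y * y for y in ys) - sum_y ** 2 / n
    return numerator / math.sqrt(ss_x * ss_y)


# Hypothetical scores with no linear trend at all.
x = [1, 2, 3, 4, 5]
y = [2, 3, 1, 3, 2]
r_base = pearson_r(x, y)

# The same scores plus one extreme pair at (20, 20).
r_outlier = pearson_r(x + [20], y + [20])

# A perfect but non-linear relationship: y = x squared.
x_curve = [-2, -1, 0, 1, 2]
y_curve = [v * v for v in x_curve]
r_curve = pearson_r(x_curve, y_curve)

print(round(r_base, 2), round(r_outlier, 2), round(r_curve, 2))  # 0.0 0.97 0.0
```

In this sketch one extreme score moves r from 0 to about 0.97, while the quadratic data give r = 0 even though y is completely determined by x. Looking at a scatterplot first helps catch both situations.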

13 Some Issues with Correlation
* NO CAUSATION!
* Spurious correlation

18 Number of people who drowned in a swimming pool & number of Nicolas Cage films in a given year: r = .67
Per capita consumption of cheese & number of deaths by becoming tangled in bed sheets: r = .95
Divorce rate in Maine & consumption of margarine in the United States: r = .99

19 Preview of Next Lecture: Regression (finding the best-fitting line to a data set).

