Presentation is loading. Please wait.

Presentation is loading. Please wait.

Correlation and Covariance R. F. Riesenfeld (Based on web slides by James H. Steiger)

Similar presentations


Presentation on theme: "Correlation and Covariance R. F. Riesenfeld (Based on web slides by James H. Steiger)"— Presentation transcript:

1 Correlation and Covariance R. F. Riesenfeld (Based on web slides by James H. Steiger)

2 Goals Introduce concepts of Introduce concepts of Covariance Covariance Correlation Correlation Develop computational formulas Develop computational formulas 2R F Riesenfeld Sp 2010CS5961 Comp Stat

3 Covariance Variables may change in relation to each other Variables may change in relation to each other measures how much the movement in one variable predicts the movement in a corresponding variable Covariance measures how much the movement in one variable predicts the movement in a corresponding variable 3R F Riesenfeld Sp 2010CS5961 Comp Stat

4 Smoking and Lung Capacity Example: investigate relationship between and Example: investigate relationship between cigarette smoking and lung capacity Data: sample group response data on smoking habits, measured lung capacities, respectively Data: sample group response data on smoking habits, and measured lung capacities, respectively R F Riesenfeld Sp 2010CS5961 Comp Stat4

5 Smoking v Lung Capacity Data 5 N Cigarettes ( X )Lung Capacity ( Y ) 1045 2542 31033 41531 52029 R F Riesenfeld Sp 2010CS5961 Comp Stat

6 Smoking and Lung Capacity 6

7 Smoking v Lung Capacity Observe that as smoking exposure goes up, corresponding lung capacity goes down Observe that as smoking exposure goes up, corresponding lung capacity goes down Variables inversely Variables covary inversely and quantify relationship Covariance and Correlation quantify relationship 7R F Riesenfeld Sp 2010CS5961 Comp Stat

8 Covariance Variables that inversely, like smoking and lung capacity, tend to appear on opposite sides of the group means Variables that covary inversely, like smoking and lung capacity, tend to appear on opposite sides of the group means When smoking is above its group mean, lung capacity tends to be below its group mean. When smoking is above its group mean, lung capacity tends to be below its group mean. Average measures extent to which variables covary, the degree of linkage between them Average product of deviation measures extent to which variables covary, the degree of linkage between them 8R F Riesenfeld Sp 2010CS5961 Comp Stat

9 The Sample Covariance Similar to variance, for theoretical reasons, average is typically computed using ( N -1 ), not N. Thus, Similar to variance, for theoretical reasons, average is typically computed using ( N -1 ), not N. Thus, 9R F Riesenfeld Sp 2010CS5961 Comp Stat

10 Cigs ( X )Lung Cap ( Y ) 045 542 1033 1531 2029 1036 Calculating Covariance R F Riesenfeld Sp 2010CS5961 Comp Stat10

11 Calculating Covariance Cigs (X ) Cap (Y ) 0-10-90945 5-5-30642 1000-333 155-25-531 2010-70-729 = - 215 R F Riesenfeld Sp 2010CS5961 Comp Stat11

12 Evaluation yields, Evaluation yields, 12 Covariance Calculation (2) R F Riesenfeld Sp 2010CS5961 Comp Stat

13 Covariance under Affine Transformation 13R F Riesenfeld Sp 2010CS5961 Comp Stat

14 14 CovarianceunderAffineTransf Covariance under Affine Transf (2) R F Riesenfeld Sp 2010CS5961 Comp Stat

15 (Pearson) Correlation Coefficient r xy Like covariance, but uses Z-values instead of deviations. Hence, invariant under linear transformation of the raw data. Like covariance, but uses Z-values instead of deviations. Hence, invariant under linear transformation of the raw data. 15R F Riesenfeld Sp 2010CS5961 Comp Stat

16 Alternative (common) Expression 16R F Riesenfeld Sp 2010CS5961 Comp Stat

17 Computational Formula 1 17R F Riesenfeld Sp 2010CS5961 Comp Stat

18 Computational Formula 2 18R F Riesenfeld Sp 2010CS5961 Comp Stat

19 Table for Calculating r xy 19 Cigs ( X ) X 2 XYY 2 Cap (Y ) 0 0 0202545 5 25210176442 10100330108933 1522546596131 2040058084129 = 5075015856680180 R F Riesenfeld Sp 2010CS5961 Comp Stat

20 Computing from Table Computing r xy from Table 20R F Riesenfeld Sp 2010CS5961 Comp Stat

21 Computing Correlation 21R F Riesenfeld Sp 2010CS5961 Comp Stat

22 Conclusion r xy = -0.96 implies almost certainty smoker will have diminish lung capacity r xy = -0.96 implies almost certainty smoker will have diminish lung capacity Greater smoking exposure implies greater likelihood of lung damage Greater smoking exposure implies greater likelihood of lung damage 22R F Riesenfeld Sp 2010CS5961 Comp Stat

23 End Covariance & Correlation 23 Notes R F Riesenfeld Sp 2010CS5961 Comp Stat


Download ppt "Correlation and Covariance R. F. Riesenfeld (Based on web slides by James H. Steiger)"

Similar presentations


Ads by Google