Correlation Hal Whitehead BIOL4062/5062. The correlation coefficient Tests Non-parametric correlations Partial correlation Multiple correlation Autocorrelation.

1 Correlation Hal Whitehead BIOL4062/5062

2
–The correlation coefficient
–Tests
–Non-parametric correlations
–Partial correlation
–Multiple correlation
–Autocorrelation
–Many correlation coefficients

3 The correlation coefficient

4 Linked observations: $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$
Mean: $\bar{x} = \sum x_i / n$, $\bar{y} = \sum y_i / n$
Variance: $S^2(x) = \sum (x_i - \bar{x})^2 / (n-1)$, $S^2(y) = \sum (y_i - \bar{y})^2 / (n-1)$
Standard Deviation: $S(x)$, $S(y)$
Covariance: $S^2(x,y) = \sum (x_i - \bar{x})(y_i - \bar{y}) / (n-1)$

5 Correlation coefficient (“Pearson” or “product-moment”):
$r = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y}) / (n-1)}{S(x)\, S(y)}$
$r = \dfrac{S^2(x,y)}{S(x)\, S(y)}$

6 The correlation coefficient:
$r = \dfrac{S^2(x,y)}{S(x)\, S(y)}$
$-1 \le r \le +1$
If no linear relationship: r = 0
$r^2$: proportion of variance accounted for by linear regression
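The definitions on slides 4 to 6 can be traced step by step in code. The following is a minimal Python sketch, not part of the original slides; the function name `pearson_r` and the sample data are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation built from the slide definitions (n-1 denominators)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two linked samples of equal length, n >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Variances S^2(x), S^2(y) and covariance S^2(x,y)
    var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    var_y = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    # r = S^2(x,y) / (S(x) * S(y))
    return cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))


# Hypothetical linked observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

Because the (n-1) factors cancel between numerator and denominator, r does not depend on whether n or n-1 is used, as long as the same choice is made throughout.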

7-24 [Figures: a series of example plots, each shown first without and then with its correlation coefficient: r = -0.01, 0.38, -0.31, 0.95, 0.04, 0.64, -0.46, 0.99, -0.0]

25 Tests on Correlation Coefficients

28 Tests on Correlation Coefficients
Assume:
–Independence
–Bivariate Normality
Then:
$z = \tfrac{1}{2} \ln[(1+r)/(1-r)]$ is approximately normally distributed with variance $1/(n-3)$
And, if $\rho$ (true population value of r) = 0:
$r \sqrt{n-2} / \sqrt{1-r^2}$ is distributed as Student's t with n-2 degrees of freedom
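As a rough illustration of the two results on this slide, the sketch below computes a confidence interval through the z transformation and the t statistic for testing a zero population correlation. It uses only the Python standard library. The function names are assumptions made for this example, and the sample size n = 20 is hypothetical (the slides do not give n for any of their examples).

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Confidence interval for r via z = ln[(1+r)/(1-r)]/2, variance 1/(n-3)."""
    z = math.atanh(r)                      # identical to ln[(1+r)/(1-r)]/2
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    # Back-transform the interval limits from the z scale to the r scale
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


def t_statistic(r, n):
    """t = r*sqrt(n-2)/sqrt(1-r^2), to be compared with Student's t on n-2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


r, n = 0.75, 20                            # n = 20 is a hypothetical sample size
lo, hi = fisher_ci(r, n)
print(f"95% CI: {lo:.2f} to {hi:.2f}; t = {t_statistic(r, n):.2f} on {n - 2} df")
```

The interval is asymmetric about r because it is symmetric on the z scale and then back-transformed. Turning the t statistic into a P-value needs the t distribution, which the standard library does not provide, so that step is left out here.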

29 We can test:
a) r ≠ 0
b) r > 0 or r < 0
c) r = constant
d) r(x,y) = r(z,w)
Also confidence intervals for r

30 Are Whales Battering Rams? (Carrier et al. J. Exp. Biol. 2002)

31 r = 0.75 (SE = 0.15) (95% C.I. 0.47-0.89)
Tests:
r ≠ 0 : P = 0.0001
r > 0 : P = 0.00005
More sexually dimorphic species have relatively larger melons

32 Why do Large Animals have Large Brains? (Schoenemann Brain Behav. Evol. 2004)
Correlations among mammals
–Log brain size with
Log muscle mass r=0.984
Log fat mass r=0.942
Are these significantly different?
t=5.50; df=36; P<0.01 (Hotelling-Williams test)
Brain mass is more closely related to muscle than fat

33 Non-Parametric Correlation

34 If one variable normally distributed
–can test r=0 as before.
If neither normally distributed:
–Spearman's $r_S$ rank correlation coefficient (replace values by ranks)
or:
–Kendall's τ correlation coefficient
Use Spearman's when there is less certainty about the close rankings
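The two rank-based coefficients named above can be sketched in a few lines. This is an illustrative Python example rather than anything taken from the slides: the function names and data are assumptions, Spearman's coefficient is computed as the Pearson correlation of average ranks, and the Kendall version shown is the simple tau-a form, which makes no correction for ties.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rs(x, y):
    """Spearman's r_S: replace values by ranks, then correlate the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs / all pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)   # tied pairs contribute 0
    return s / (n * (n - 1) / 2)


# Hypothetical data: a monotonic but non-linear relationship.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(pearson_r(x, y), spearman_rs(x, y), kendall_tau_a(x, y))
```

For a perfectly monotonic relationship both rank coefficients equal 1 even though the Pearson coefficient is below 1, which is the sense in which they do not depend on linearity or normality.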

35 Are Whales Battering Rams? (Carrier et al. J. Exp. Biol. 2002)
r = 0.75
$r_S$ = 0.62
τ = 0.47

36 Partial Correlation

37 Correlation between X and Y controlling for Z:
$r(X,Y|Z) = \dfrac{r(X,Y) - r(X,Z)\, r(Y,Z)}{\sqrt{(1 - r(X,Z)^2)(1 - r(Y,Z)^2)}}$
Correlation between X and Y controlling for W, Z:
$r(X,Y|W,Z) = \dfrac{r(X,Y|W) - r(X,Z|W)\, r(Y,Z|W)}{\sqrt{(1 - r(X,Z|W)^2)(1 - r(Y,Z|W)^2)}}$
n-2-c degrees of freedom (c is number of control variables)
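The recursive structure of these formulas is perhaps easiest to see in code: the second-order partial correlation applies the first-order formula to three first-order partials. The sketch below is a minimal Python illustration; the function names and the input correlations are assumptions, not values from the slides.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation r(X,Y|Z) from three pairwise correlations."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_r_two_controls(r_xy, r_xz, r_xw, r_yz, r_yw, r_zw):
    """Second-order partial r(X,Y|W,Z): the same formula applied to partials given W."""
    r_xy_w = partial_r(r_xy, r_xw, r_yw)
    r_xz_w = partial_r(r_xz, r_xw, r_zw)
    r_yz_w = partial_r(r_yz, r_yw, r_zw)
    return partial_r(r_xy_w, r_xz_w, r_yz_w)


def partial_df(n, c):
    """Degrees of freedom for testing a partial correlation with c control variables."""
    return n - 2 - c


# Hypothetical pairwise correlations, for illustration only.
print(partial_r(0.5, 0.3, 0.4))
print(partial_r_two_controls(0.5, 0.3, 0.2, 0.4, 0.1, 0.3))
print(partial_df(40, 2))
```

A quick sanity check on the recursion: if W is uncorrelated with X, Y and Z, controlling for it changes nothing and the second-order value reduces to r(X,Y|Z).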

38 Why do Large Animals have Large Brains? (Schoenemann Brain Behav. Evol. 2004)
Correlations among mammals
–Log brain size with
Log muscle mass, controlling for Log body mass: r=0.466
Log fat mass, controlling for Log body mass: r=-0.299
Fatter species have relatively smaller brains and more muscular species relatively larger brains

39 Semi-partial Correlation Coefficient
Correlation between X & Y controlling Y for Z:
$r(X,(Y|Z)) = \dfrac{r(X,Y) - r(X,Z)\, r(Y,Z)}{\sqrt{1 - r(Y,Z)^2}}$
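The semi-partial coefficient shares its numerator with the partial coefficient and differs only in the denominator, since Z is removed from Y but not from X. A small Python sketch makes the comparison concrete; the function names and input values are assumptions chosen for illustration.

```python
import math


def semipartial_r(r_xy, r_xz, r_yz):
    """Semi-partial correlation r(X,(Y|Z)): Z is removed from Y only."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_yz ** 2)


def partial_r(r_xy, r_xz, r_yz):
    """Partial correlation r(X,Y|Z): Z is removed from both X and Y."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical pairwise correlations, for illustration only.
r_xy, r_xz, r_yz = 0.5, 0.3, 0.4
sp = semipartial_r(r_xy, r_xz, r_yz)
p = partial_r(r_xy, r_xz, r_yz)
print(f"semi-partial = {sp:.3f}, partial = {p:.3f}")
```

Dividing the semi-partial value by $\sqrt{1 - r(X,Z)^2}$ gives the partial value, so the semi-partial coefficient is never larger in magnitude than the corresponding partial coefficient.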

40 Are Whales Battering Rams? (Carrier et al. J. Exp. Biol. 2002)
Correlation: r = 0.75
Partial Correlation: r(SSD,MA|L) = 0.73
Semi-partial Correlations: r(SSD,(MA|L)) = 0.69; r((SSD|L),MA) = 0.71

41 Multiple Correlation

42 Multiple Correlation Coefficient
Correlation between one dependent variable and its best estimate from a regression on several independent variables: $r(Y \cdot X_1, X_2, X_3, \ldots)$
Square of multiple correlation coefficient is:
–proportion of variance accounted for by multiple regression
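For the special case of exactly two independent variables, the multiple correlation can be written directly in terms of the three pairwise correlations. That closed form is a standard result but is not given on the slides, so the sketch below should be read as a supplementary illustration under that assumption; the function name and input values are hypothetical.

```python
import math


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of Y on X1 and X2 from pairwise correlations.

    Uses R^2 = (r_y1^2 + r_y2^2 - 2*r_y1*r_y2*r_12) / (1 - r_12^2),
    which holds only for two predictors that are not perfectly correlated.
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


# Hypothetical pairwise correlations, for illustration only.
R = multiple_r_two_predictors(0.5, 0.4, 0.3)
print(f"R = {R:.3f}, R^2 = {R * R:.3f}")
```

When the two predictors are uncorrelated, R² is simply the sum of the two squared correlations; with more than two predictors the usual route is to fit the regression and correlate Y with its fitted values.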

43 Multiple Partial Correlation Coefficient !

44 Autocorrelation

45 Purposes –Examine time series –Look at (serial) independence

46 Data (e.g. feeding rate on consecutive days, plankton biomass at each station on a transect):
1.5 1.7 4.3 5.4 5.7 6.2 3.9 4.4 5.2 4.8 3.9 3.7 3.6
Autocorrelation of lag=1 is correlation between:
1.5 1.7 4.3 5.4 5.7 6.2 3.9 4.4 5.2 4.8 3.9 3.7
1.7 4.3 5.4 5.7 6.2 3.9 4.4 5.2 4.8 3.9 3.7 3.6
r = 0.508
Autocorrelation of lag=2 is correlation between:
1.5 1.7 4.3 5.4 5.7 6.2 3.9 4.4 5.2 4.8 3.9
4.3 5.4 5.7 6.2 3.9 4.4 5.2 4.8 3.9 3.7 3.6
r = -0.053
…….
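The lagged-copy idea can be sketched in Python using the series from this slide. Two estimators are shown: a plain Pearson correlation between the series and its shifted copy, and the conventional sample autocorrelation, which uses the overall mean and the full-series sum of squares. Working through the numbers, the values quoted on the slide (0.508 and -0.053) appear to match the second estimator; that reading is an inference from the arithmetic, not something the slides state. The function names are assumptions.

```python
import math

# Series taken from the slide.
series = [1.5, 1.7, 4.3, 5.4, 5.7, 6.2, 3.9, 4.4, 5.2, 4.8, 3.9, 3.7, 3.6]


def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_pearson(x, lag):
    """Pearson correlation between the series and a copy shifted by `lag`."""
    return pearson_r(x[:-lag], x[lag:])


def autocorrelation(x, lag):
    """Conventional sample autocorrelation: overall mean, full sum of squares."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


for lag in (1, 2):
    print(lag, round(autocorrelation(series, lag), 3), round(lagged_pearson(series, lag), 3))
```

The two estimators differ because the shifted subseries each have their own mean and variance; with short series such as this one the gap can be noticeable (about 0.62 versus 0.508 at lag 1), while for long series the two converge.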

47 Autocorrelation Plot (Correlogram)

48 Many Correlation Coefficients

49 Many Correlation Coefficients: [Behaviour of Sperm Whale Groups]
         NGR25L  SST     SHITR   LSPEED  APROP   SOCV    SHR2    LFMECS  LAERR
NGR25L   1.00
SST      0.12    1.00
SHITR   -0.21   -0.33*   1.00
LSPEED   0.10   -0.28+   0.06    1.00
APROP   -0.15   -0.34*   0.07    0.18    1.00
SOCV    -0.05    0.08   -0.16   -0.01   -0.33*   1.00
SHR2    -0.18   -0.12    0.01   -0.20    0.19   -0.03    1.00
LFMECS   0.08    0.14   -0.13   -0.12   -0.22    0.29+  -0.18    1.00
LAERR   -0.10    0.03   -0.21   -0.24   -0.02    0.24   -0.08    0.23    1.00
Listwise deletion, n=40; + P<0.10; * P<0.05; uncorrected
Expected no. with P<0.10 = 3.6; with P<0.05 = 1.8

50 Many Correlation Coefficients: [Behaviour of Sperm Whale Groups]
         NGR25L  SST     SHITR   LSPEED  APROP   SOCV    SHR2    LFMECS  LAERR
NGR25L   1.00
SST      0.12    1.00
SHITR   -0.21   -0.33    1.00
LSPEED   0.10   -0.28    0.06    1.00
APROP   -0.15   -0.34    0.07    0.18    1.00
SOCV    -0.05    0.08   -0.16   -0.01   -0.33    1.00
SHR2    -0.18   -0.12    0.01   -0.20    0.19   -0.03    1.00
LFMECS   0.08    0.14   -0.13   -0.12   -0.22    0.29   -0.18    1.00
LAERR   -0.10    0.03   -0.21   -0.24   -0.02    0.24   -0.08    0.23    1.00
Listwise deletion, n=40; Bonferroni corrected: P=1.0 for all coefficients (none reach P<0.10 or P<0.05)

51 Many Correlation Coefficients: [Behaviour of Sperm Whale Groups]
Listwise deletion, n=40; + P<0.10; * P<0.05; uncorrected
         NGR25L  SST     SHITR   LSPEED  APROP   SOCV    SHR2    LFMECS  LAERR
NGR25L   1.00
SST      0.12    1.00
SHITR   -0.21   -0.33*   1.00
LSPEED   0.10   -0.28+   0.06    1.00
APROP   -0.15   -0.34*   0.07    0.18    1.00
SOCV    -0.05    0.08   -0.16   -0.01   -0.33*   1.00
SHR2    -0.18   -0.12    0.01   -0.20    0.19   -0.03    1.00
LFMECS   0.08    0.14   -0.13   -0.12   -0.22    0.29+  -0.18    1.00
LAERR   -0.10    0.03   -0.21   -0.24   -0.02    0.24   -0.08    0.23    1.00
Pairwise deletion, n=59-118; + P<0.10; * P<0.05; uncorrected
         NGR25L  SST     SHITR   LSPEED  APROP   SOCV    SHR2    LFMECS  LAERR
NGR25L   1.00
SST      0.11    1.00
SHITR   -0.17+  -0.46*   1.00
LSPEED   0.05   -0.17    0.05    1.00
APROP   -0.05   -0.20+   0.04    0.31*   1.00
SOCV    -0.00   -0.05   -0.06   -0.02   -0.25*   1.00
SHR2    -0.15   -0.13    0.07   -0.14    0.05    0.01    1.00
LFMECS   0.01    0.07   -0.02   -0.14   -0.25*   0.43*  -0.26+   1.00
LAERR   -0.06    0.06    0.09   -0.27*  -0.20+   0.06   -0.06    0.21+   1.00

52 Many Correlation Coefficients
Missing values:
–Listwise deletion (comparability), or
–Pairwise deletion (power)
P-values:
–Uncorrected: type 1 errors
–Bonferroni, etc.: type 2 errors
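The trade-off between uncorrected and Bonferroni-corrected P-values comes down to simple arithmetic on the number of coefficients tested. The Python sketch below reproduces the expected counts quoted for the nine-variable sperm whale matrix (36 distinct coefficients); the function names are assumptions made for this illustration.

```python
def n_pairs(k):
    """Number of distinct correlation coefficients among k variables."""
    return k * (k - 1) // 2


def expected_false_positives(k, alpha):
    """Expected number of coefficients with P < alpha if all true correlations are 0."""
    return n_pairs(k) * alpha


def bonferroni_adjust(p, m):
    """Bonferroni-corrected P-value for one of m tests, capped at 1.0."""
    return min(1.0, p * m)


k = 9                                   # variables in the sperm whale matrix
m = n_pairs(k)
print(m)                                # number of coefficients tested
print(expected_false_positives(k, 0.10), expected_false_positives(k, 0.05))
print(0.05 / m)                         # uncorrected P needed for corrected P < 0.05
print(bonferroni_adjust(0.03, m))       # hypothetical uncorrected P-value of 0.03
```

With 36 tests, any uncorrected P-value above 1/36 (about 0.028) is pushed to 1.0 by the correction, which illustrates how guarding against type 1 errors in this way costs power.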

53 Beware! Correlation and Causation
[Diagrams: a correlation between Y1 and Y2, set against a series of alternative causal structures linking Y1 and Y2 through additional variables Y3 to Y6]

