Xuhua Xia Slide 1 Correlation Simple correlation –between two variables Multiple and Partial correlations –between one variable and a set of other variables.


Xuhua Xia Slide 1 Correlation
– Simple correlation: between two variables
– Multiple and partial correlations: between one variable and a set of other variables
– Canonical correlation: between two sets of variables, each containing more than one variable
Simple and multiple correlations are special cases of canonical correlation.
Multiple: x1 on x2 and x3
Partial: between X and Y, with Z controlled for
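
The equations on this slide did not survive in the transcript. As a hedged stand-in, the standard textbook forms are given here; the notation is an assumption and may differ from the original slide. The first is the first-order partial correlation between X and Y controlling for Z, the second is the squared multiple correlation of x1 on x2 and x3:

```latex
r_{XY\cdot Z} = \frac{r_{XY} - r_{XZ}\, r_{YZ}}{\sqrt{\left(1 - r_{XZ}^{2}\right)\left(1 - r_{YZ}^{2}\right)}}
\qquad
R^{2}_{1\cdot 23} = \frac{r_{12}^{2} + r_{13}^{2} - 2\, r_{12}\, r_{13}\, r_{23}}{1 - r_{23}^{2}}
```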

Xuhua Xia Slide 2 Review of correlation
    X  Z  Y
    1  4  14.0000
    1  5  17.9087
    1  6  16.3255
    2  3  14.4441
    2  4  15.2952
    2  5  19.1587
    2  6  16.0299
    2  5  17.0000
    3  3  14.7556
    3  4  17.6823
    3  5  20.5301
    3  6  21.6408
    4  3  15.0903
    4  4  18.1603
    4  5  22.2471
    5  2  14.4450
    5  3  16.5554
    5  4  21.0047
    5  5  22.0000
    6  1  19.0000
    6  2  18.0000
    6  3  18.1863
    6  4  21.0000
Compute Pearson correlation coefficients between X and Z, X and Y, and Z and Y.
Compute the partial correlation coefficient between X and Y, controlling for Z (i.e., the correlation coefficient between X and Y when Z is held constant), by using the equation in the previous slide.
Run SAS to verify your calculation:
    proc corr pearson;
      var X Y;
      partial Z;
    run;
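
As a cross-check on the hand calculation, here is a minimal Python sketch. It is not part of the original slides. It assumes the data columns are X, Z, Y in that order and uses the standard first-order partial correlation formula, r_XY.Z = (r_XY - r_XZ r_YZ) / sqrt((1 - r_XZ^2)(1 - r_YZ^2)). The function names `pearson` and `partial_corr` are illustrative.

```python
from math import sqrt

# (X, Z, Y) triples transcribed from the slide's data table (assumed column order).
DATA = [
    (1, 4, 14.0000), (1, 5, 17.9087), (1, 6, 16.3255), (2, 3, 14.4441),
    (2, 4, 15.2952), (2, 5, 19.1587), (2, 6, 16.0299), (2, 5, 17.0000),
    (3, 3, 14.7556), (3, 4, 17.6823), (3, 5, 20.5301), (3, 6, 21.6408),
    (4, 3, 15.0903), (4, 4, 18.1603), (4, 5, 22.2471), (5, 2, 14.4450),
    (5, 3, 16.5554), (5, 4, 21.0047), (5, 5, 22.0000), (6, 1, 19.0000),
    (6, 2, 18.0000), (6, 3, 18.1863), (6, 4, 21.0000),
]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / sqrt(s_aa * s_bb)


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


X = [row[0] for row in DATA]
Z = [row[1] for row in DATA]
Y = [row[2] for row in DATA]

r_xz, r_xy, r_zy = pearson(X, Z), pearson(X, Y), pearson(Z, Y)
r_xy_z = partial_corr(r_xy, r_xz, r_zy)

print(f"r(X,Z) = {r_xz:.4f}  r(X,Y) = {r_xy:.4f}  r(Z,Y) = {r_zy:.4f}")
print(f"partial r(X,Y | Z) = {r_xy_z:.4f}")
```

The printed partial correlation is the value to compare with the PROC CORR output.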

Xuhua Xia Slide 3 Many Possible Correlations
With multiple DVs and IVs, there could be many correlation patterns:
– Variable A in the DV set could be correlated with variables a, b, c in the IV set
– Variable B in the DV set could be correlated with variables c, d in the IV set
– Variable C in the DV set could be correlated with variables a, c, e in the IV set
With this plethora of possible correlations, what is the best way of summarizing them?

Xuhua Xia Slide 4 Dealing with Two Sets of Variables
The simple correlation approach:
– For N DVs and M IVs, calculate the simple correlation coefficient between each of the N DVs and each of the M IVs, yielding a total of N*M correlation coefficients
The multiple correlation approach:
– For N DVs and M IVs, calculate multiple or partial correlation coefficients between each of the N DVs and the set of M IVs, yielding a total of N correlation coefficients
The canonical correlation approach
Note: All of these deal with linear correlations.
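
To make the bookkeeping concrete, the sketch below counts the coefficients each approach produces. The N*M and N counts come from the slide. The count for canonical correlation, min(N, M), is a standard result that the slide does not state, so treat it as an added assumption. The function name `coefficient_counts` is illustrative.

```python
def coefficient_counts(n_dv, m_iv):
    """Number of correlation coefficients produced by each approach."""
    return {
        "simple": n_dv * m_iv,          # one per (DV, IV) pair
        "multiple": n_dv,               # one per DV against the whole IV set
        "canonical": min(n_dv, m_iv),   # assumption: standard result, not on the slide
    }


# Hypothetical set sizes, chosen only for illustration.
print(coefficient_counts(3, 4))
```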

Xuhua Xia Slide 5 Fitness Data
    /* First three variables: physical
       Last three variables: exercise
       Middle-aged men */
    data fit;
      input weight waist pulse chins situps jumps @@;
      cards;
    191 36 50  5 162  60
    189 37 52  2 130  60
    193 38 58 12 101 101
    162 35 62 12 145  37
    189 35 46 13 145  58
    182 36 56  4 141  42
    211 38 56  8 151  38
    167 34 60  6 155  40
    176 31 74 15 200  40
    154 30 56 17 251 250
    169 34 50 17 120  38
    166 33 52 13 210 115
    154 34 64 14 215 105
    247 46 50  1  50  50
    193 36 46  6 170  31
    202 37 62 12 120 120
    176 37 54  4 160  25
    157 32 52 11 230  80
    156 33 54 15 215  73
    138 33 68  2 150  43
    ;

Xuhua Xia Slide 6 SAS Program
    proc cancorr data=fit vdep wdep smc stb t probt
                 vprefix=PHYS vname='Physical Measurements'
                 wprefix=EXER wname='Exercises';
      var weight waist pulse;
      with chins situps jumps;
      title2 'Middle-aged Men in a Health Fitness Club';
      title3 'Data Courtesy of Dr. A. C. Linnerud, NC State Univ.';
    run;
What do these cryptic terms mean? See the next slide.

Xuhua Xia Slide 7 SAS Program
    proc cancorr data=fit short vdep wdep smc stb t probt
– SHORT: suppresses all default output except the tables of canonical correlations and multivariate statistics.
– VDEP: requests multiple regression analyses with the VAR variables as dependent variables and the WITH variables as regressors. WDEP does the opposite.
– SMC: prints squared multiple correlations and F tests for the regression analyses.
– STB: requests standardized regression coefficients.
– T and PROBT: print t statistics for the regression coefficients and the probability levels of those t statistics.
– VPREFIX: specifies a prefix for naming the canonical variables of the VAR set, instead of the default V1, V2, and so on. WPREFIX does the same for the WITH set.

Xuhua Xia Slide 8 Multiple Correlations
DV: the Physical Measurements; IV: Exercises
Squared Multiple Correlations and F Tests (3 numerator df, 16 denominator df)
                                    95% CI for R²
              R²        Adj. R²     Lower    Upper    F       Pr > F
    weight    0.517798   0.427385   0.065    0.736     5.73   0.0074
    waist     0.752679   0.706306   0.380    0.877    16.23   <.0001
    pulse     0.037362  -0.143132   0.000    0.177     0.21   0.8901
Weight and waist are significantly associated with the exercise variables.
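
The adjusted R² and F columns can be recovered from R² alone. The sketch below is an added cross-check, not part of the slides. It assumes n = 20 observations, which is consistent with 16 denominator df = n - p - 1 for p = 3 regressors, and uses the usual formulas adj R² = 1 - (1 - R²)(n - 1)/(n - p - 1) and F = (R²/p) / ((1 - R²)/(n - p - 1)).

```python
N_OBS = 20    # assumption: inferred from 16 denominator df with 3 regressors
N_PRED = 3    # chins, situps, jumps

# Squared multiple correlations as reported on the slide.
R2 = {"weight": 0.517798, "waist": 0.752679, "pulse": 0.037362}


def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p regressors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2, n, p):
    """F statistic for H0: all p regression slopes are zero."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))


for name, r2 in R2.items():
    adj = adjusted_r2(r2, N_OBS, N_PRED)
    f_val = f_statistic(r2, N_OBS, N_PRED)
    print(f"{name:6s}  adj R2 = {adj:9.6f}  F = {f_val:6.2f}")
```

Small differences in the last decimal place are expected because the inputs are already rounded.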

Xuhua Xia Slide 9 Regression of Phys. on Exer.
Standardized Regression Coefficients
              weight    waist     pulse
    chins    -0.1059   -0.2791    0.1281
    situps   -0.7273   -0.7640    0.1351
    jumps     0.1619    0.1465   -0.0909
t Values for the Regression Coefficients
              weight    waist     pulse
    chins    -0.4957   -1.8243    0.4244
    situps   -3.4776   -5.1007    0.4571
    jumps     0.7768    0.9809   -0.3087
Prob > |t| for the Regression Coefficients
              weight    waist     pulse
    chins     0.6268    0.0868    0.6769
    situps    0.0031    0.0001    0.6537
    jumps     0.4486    0.3412    0.7615

Xuhua Xia Slide 10 Multiple Correlations
DV: Exercises; IV: the Physical Measurements
Squared Multiple Correlations and F Tests (3 numerator df, 16 denominator df)
                                    95% CI for R²
              R²        Adj. R²     Lower    Upper    F       Pr > F
    chins     0.408377   0.297448   0.000    0.657     3.68   0.0344
    situps    0.716127   0.662901   0.316    0.857    13.45   0.0001
    jumps     0.144544  -0.015853   0.000    0.395     0.90   0.4622

Xuhua Xia Slide 11 Regression of Exer. on Phys.
Standardized Regression Coefficients
              chins     situps    jumps
    weight    0.4994    0.0468    0.2802
    waist    -1.0261   -0.9209   -0.6102
    pulse    -0.0085   -0.1324   -0.0658
t Values for the Regression Coefficients
              chins     situps    jumps
    weight    1.2653    0.1710    0.5904
    waist    -2.6335   -3.4120   -1.3024
    pulse    -0.0411   -0.9249   -0.2649
Prob > |t| for the Regression Coefficients
              chins     situps    jumps
    weight    0.2239    0.8664    0.5632
    waist     0.0181    0.0036    0.2112
    pulse     0.9678    0.3688    0.7945

Xuhua Xia Slide 12 Canonical Correlation
         Canonical     Adjusted Canonical   Approx Standard   Squared Canonical
         Correlation   Correlation          Error             Correlation
    1    0.878578      0.856195             0.052330          0.771899
    2    0.264992      0.080853             0.213306          0.070221
    3    0.062661      .                    0.228515          0.003926

         Eigenvalue   Difference   Proportion   Cumulative
    1    3.3840       3.3085       0.9771       0.9771
    2    0.0755       0.0716       0.0218       0.9989
    3    0.0039                    0.0011       1.0000

Significance test:
         Likelihood Ratio   Approximate F Value   Num DF   Den DF   Pr > F
    1    0.21125051         3.40                  9        34.223   0.0044
    2    0.92612863         0.29                  4        30       0.8799
    3    0.99607358         0.06                  1        16       0.8049
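
The eigenvalue, proportion and likelihood-ratio columns all follow from the three canonical correlations. The sketch below is an added cross-check, not part of the slides. It assumes the usual relations: eigenvalue = r²/(1 - r²), proportion = eigenvalue / sum of eigenvalues, and likelihood ratio (Wilks' lambda) for row k = product of (1 - r²) over the k-th and all smaller canonical correlations.

```python
from math import prod

# Canonical correlations as reported on the slide.
CANON_R = [0.878578, 0.264992, 0.062661]

squared = [r * r for r in CANON_R]

# Eigenvalues of the underlying matrix: r^2 / (1 - r^2).
eigenvalues = [q / (1 - q) for q in squared]
total = sum(eigenvalues)
proportions = [ev / total for ev in eigenvalues]

# Row k tests H0: canonical correlations k, k+1, ... are all zero.
likelihood_ratios = [prod(1 - q for q in squared[k:]) for k in range(len(squared))]

for k, (ev, pr, lr) in enumerate(zip(eigenvalues, proportions, likelihood_ratios), start=1):
    print(f"{k}  eigenvalue = {ev:.4f}  proportion = {pr:.4f}  likelihood ratio = {lr:.8f}")
```

The values agree with the slide's tables up to rounding of the inputs. Only the first likelihood ratio is significant, so only the first pair of canonical variables is worth interpreting.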

Xuhua Xia Slide 13 Standardized Canonical Coefficients
For the physical measurements:
              PHYS1     PHYS2     PHYS3
    weight   -0.1899    2.0261    0.2691
    waist     1.1929   -1.5800   -0.4314
    pulse     0.1218    0.3245   -1.0176
For the exercises:
              EXER1     EXER2     EXER3
    chins    -0.3383    1.0114   -0.6139
    situps   -0.8614   -0.8403   -0.0579
    jumps     0.1512    0.2536    1.1640
Because the variables are not measured in the same units, the standardized coefficients rather than the raw coefficients should be interpreted.
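
A standardized canonical coefficient is the weight applied to a z-scored variable when forming the canonical variable. The sketch below, which is not part of the slides, builds the first pair of canonical variables (PHYS1 and EXER1) from the fitness data as transcribed on Slide 5 and correlates them. It assumes z-scores use the sample standard deviation (n - 1 denominator). If the transcribed data match those behind the SAS output, the printed correlation should land near the first canonical correlation; that is not guaranteed here.

```python
from math import sqrt
from statistics import mean, stdev

# weight, waist, pulse, chins, situps, jumps (as transcribed on Slide 5).
FIT = [
    (191, 36, 50, 5, 162, 60), (189, 37, 52, 2, 130, 60),
    (193, 38, 58, 12, 101, 101), (162, 35, 62, 12, 145, 37),
    (189, 35, 46, 13, 145, 58), (182, 36, 56, 4, 141, 42),
    (211, 38, 56, 8, 151, 38), (167, 34, 60, 6, 155, 40),
    (176, 31, 74, 15, 200, 40), (154, 30, 56, 17, 251, 250),
    (169, 34, 50, 17, 120, 38), (166, 33, 52, 13, 210, 115),
    (154, 34, 64, 14, 215, 105), (247, 46, 50, 1, 50, 50),
    (193, 36, 46, 6, 170, 31), (202, 37, 62, 12, 120, 120),
    (176, 37, 54, 4, 160, 25), (157, 32, 52, 11, 230, 80),
    (156, 33, 54, 15, 215, 73), (138, 33, 68, 2, 150, 43),
]

# First-column standardized canonical coefficients from the slide.
PHYS1_COEF = (-0.1899, 1.1929, 0.1218)   # weight, waist, pulse
EXER1_COEF = (-0.3383, -0.8614, 0.1512)  # chins, situps, jumps


def zscores(values):
    """Standardize to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def canonical_scores(columns, coefs):
    """Weighted sum of z-scored columns, one score per observation."""
    z_cols = [zscores(col) for col in columns]
    return [sum(c * z[i] for c, z in zip(coefs, z_cols)) for i in range(len(z_cols[0]))]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    s_ab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return s_ab / sqrt(sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b))


cols = list(zip(*FIT))
phys1 = canonical_scores(cols[:3], PHYS1_COEF)
exer1 = canonical_scores(cols[3:], EXER1_COEF)
print(f"corr(PHYS1, EXER1) = {pearson(phys1, exer1):.4f}")
```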

Xuhua Xia Slide 14 Canonical Structure: correlations
Between Phys. and their canonical var.:
              PHYS1     PHYS2     PHYS3
    weight    0.8028    0.5335    0.2662
    waist     0.9872    0.0737    0.1416
    pulse    -0.2061    0.1098   -0.9723
Between Exer. and their canonical var.:
              EXER1     EXER2     EXER3
    chins    -0.6945    0.7165   -0.0658
    situps   -0.9609   -0.2169    0.1721
    jumps    -0.4141    0.3671    0.8329
Between Phys. and the canonical var. of Exer.:
              EXER1     EXER2     EXER3
    weight    0.7054    0.1414    0.0167
    waist     0.8673    0.0195    0.0089
    pulse    -0.1811    0.0291   -0.0609
Between Exer. and the canonical var. of Phys.:
              PHYS1     PHYS2     PHYS3
    chins    -0.6102    0.1899   -0.0041
    situps   -0.8442   -0.0575    0.0108
    jumps    -0.3638    0.0973    0.0522
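
The last two tables are not independent of the first two. A variable's correlation with the opposite set's k-th canonical variable equals its correlation with its own k-th canonical variable multiplied by the k-th canonical correlation. This is a standard property that the slide does not state, and the sketch below checks it against the reported values. The canonical correlations are taken from Slide 12, and the helper names are illustrative.

```python
# Canonical correlations from Slide 12.
CANON_R = (0.878578, 0.264992, 0.062661)

# Correlations of each variable with its own set's canonical variables.
PHYS_OWN = {
    "weight": (0.8028, 0.5335, 0.2662),
    "waist": (0.9872, 0.0737, 0.1416),
    "pulse": (-0.2061, 0.1098, -0.9723),
}
EXER_OWN = {
    "chins": (-0.6945, 0.7165, -0.0658),
    "situps": (-0.9609, -0.2169, 0.1721),
    "jumps": (-0.4141, 0.3671, 0.8329),
}

# Correlations with the opposite set's canonical variables, as reported.
PHYS_CROSS = {
    "weight": (0.7054, 0.1414, 0.0167),
    "waist": (0.8673, 0.0195, 0.0089),
    "pulse": (-0.1811, 0.0291, -0.0609),
}
EXER_CROSS = {
    "chins": (-0.6102, 0.1899, -0.0041),
    "situps": (-0.8442, -0.0575, 0.0108),
    "jumps": (-0.3638, 0.0973, 0.0522),
}


def predicted_cross(own):
    """Own-set loading times the matching canonical correlation."""
    return {name: tuple(l * r for l, r in zip(loads, CANON_R)) for name, loads in own.items()}


def max_abs_diff(predicted, reported):
    """Largest absolute discrepancy across all variables and columns."""
    return max(abs(p - q) for name in predicted for p, q in zip(predicted[name], reported[name]))


print(max_abs_diff(predicted_cross(PHYS_OWN), PHYS_CROSS))
print(max_abs_diff(predicted_cross(EXER_OWN), EXER_CROSS))
```

Both discrepancies are at the level of rounding error in the four-decimal inputs.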

Xuhua Xia Slide 15 Ecology data
    data candata;
      input Sp1 Sp2 Sp3 Sp4 Chem1 Chem2 Chem3 Chem4;
      cards;
    21.09 21.90  9.19  9.18 20.96 21.52  7.46  7.41
    14.69 14.85 14.06 14.07 14.80 14.63 13.71 13.69
     2.11  2.17  3.13  3.06  3.17  2.43  2.10  1.96
     9.58  9.47  8.14  8.06  9.54  9.71  9.36  9.43
    10.02 10.71  9.02  9.06 11.16 10.59 10.91 11.10
    14.65 14.32 15.10 15.15 14.59 14.61 13.55 13.55
    24.42 24.12  6.00  6.12 24.36 24.50  4.30  4.34
    22.20 22.10  4.14  4.04 23.37 22.74  4.90  5.06
     8.34  8.88  9.16  9.06  8.75  8.19  7.59  7.58
    10.49 10.12 11.08 11.13 10.09 10.73  9.55  9.56
    25.72 25.91  1.12  1.16 25.94 26.01  1.98  1.99
     4.16  4.44  3.05  3.09  3.97  4.89  4.53  4.53
    12.07 12.31 11.09 11.15 12.68 12.89 12.62 12.78
    19.13 19.36 11.13 11.05 18.69 19.05  9.01  9.16
     5.80  5.15  4.11  4.18  6.07  6.33  5.10  4.96
     1.27  1.15  2.10  2.17  1.27  1.80  0.73  0.75
    22.15 22.52  8.01  8.04 22.08 22.53  7.43  7.31
    26.53 26.27  0.14  0.11 26.33 26.88  0.55  0.57
    17.25 17.68 11.12 11.18 17.39 17.76  9.51  9.55
     7.94  7.46  6.13  6.03  7.53  7.67  7.51  7.47
     4.12  4.45  3.08  3.14  5.21  4.65  3.92  4.00
    17.59 17.53 11.19 11.04 16.97 16.70 12.30 12.26
    15.41 15.16 13.12 13.03 15.79 16.01 12.00 11.83
    12.90 12.93 11.12 11.12 12.80 12.04 11.52 11.52
    19.14 19.11  7.16  7.14 19.88 19.84  8.86  8.90
    25.11 25.50  3.13  3.20 25.28 25.44  4.26  4.23
    ;

Xuhua Xia Slide 16 SAS Program (cont.)
    proc cancorr vdep wdep smc stb t probt
                 vprefix=BIO vname='Species'
                 wprefix=ENV wname='Environment';
      var Sp1 Sp2 Sp3 Sp4;
      with Chem1 Chem2 Chem3 Chem4;
    run;
Run the program and explain the output.
