Lecture 18: CORRELATIONS: TESTING RELATIONSHIPS BETWEEN TWO METRIC VARIABLES
Agenda 2
Reminder about Lab 3
Brief Update on Data for Final
Correlations
Probability Revisited 3
To make a reasonable decision, we must know:
  Probability Distribution: What would the distribution look like if the result were due only to chance?
  Decision Rule: What criteria do we need in order to determine whether an observation is due to chance or not?
Quick Recap of An Earlier Issue: Why N-1? 4
If we have a randomly distributed variable in a population, extreme cases (i.e., the tails) are less likely to be selected than common cases (i.e., within 1 SD of the mean).
One result of this: the sample variance tends to be lower than the actual population variance. Deviations are also measured from the sample mean, which sits closer to the sample's own values than the population mean does.
Dividing by n-1 instead of n corrects this bias when calculating sample statistics.
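The bias described above can be checked with a small simulation. This is only a sketch: the sample size, number of trials, seed, and the helper name `variance` are assumptions chosen for illustration. It draws many small samples from a population whose variance is 1 and averages the two versions of the sample variance.

```python
import random

def variance(values, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)

rng = random.Random(18)      # fixed seed so the run is repeatable
n, trials = 5, 20000         # small samples make the bias easy to see
biased_total = 0.0
unbiased_total = 0.0
for _ in range(trials):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]   # population variance = 1
    biased_total += variance(sample, ddof=0)           # divide by n
    unbiased_total += variance(sample, ddof=1)         # divide by n - 1

mean_biased = biased_total / trials
mean_unbiased = unbiased_total / trials
print(f"average when dividing by n:     {mean_biased:.3f}")
print(f"average when dividing by n - 1: {mean_unbiased:.3f}")
```

With n = 5, dividing by n should average out near (n-1)/n = 0.8 of the true variance, while dividing by n-1 should average out near 1.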
Checking for simple linear relationships 5
Pearson’s correlation coefficient
  Measures the extent to which two metric or interval-type variables are linearly related
  The statistic is Pearson r, also called the linear or product-moment correlation
  Equivalently, the correlation coefficient is the average of the cross products of the corresponding z-scores.
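The z-score definition can be written out directly. The sketch below assumes the sample convention (standard deviations and the final average both use n-1); using n in both places gives the same r. The data and the names `hours` and `score` are hypothetical.

```python
import math

def z_scores(values):
    """Standardize values using the sample SD (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def pearson_r(x, y):
    """Sum of the z-score cross products, divided by n - 1."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

hours = [1, 2, 3, 4, 5]      # hypothetical data
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
print(round(r, 3))           # 0.775
```

Here the cross products of the deviations sum to 6 and the sums of squares are 10 and 6, so r = 6 / sqrt(60).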
Correlations 6
Ranges from -1 to 1, where -1 or 1 = perfect linear relationship between the two variables and 0 = no linear relationship.
Negative relations: r < 0
Positive relations: r > 0
Remember: correlation ONLY measures linear relationships, not all relationships!
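A short sketch of the two points on this slide: Pearson r reaches 1 or -1 for perfectly linear data, and it can be 0 even when y is completely determined by x in a non-linear way. The data are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = list(range(-5, 6))                           # -5, -4, ..., 5

r_positive = pearson_r(x, [2 * v + 1 for v in x])    # perfect positive line
r_negative = pearson_r(x, [-3 * v + 4 for v in x])   # perfect negative line
r_curved = pearson_r(x, [v ** 2 for v in x])         # perfect parabola

print(r_positive, r_negative, r_curved)
```

The parabola gives r = 0 because the positive and negative cross products cancel exactly, even though y is fully predictable from x.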
Interpretation 7
Recall that correlation is a precondition for causality, but by itself it is not sufficient to show causality (why?)
Correlation is a proportional measure; it does not depend on the units of measurement
Correlation interpretation:
  Direction (+/-)
  Magnitude of Effect (-1 to 1); shown as r
  Statistical Significance (p<.05, p<.01, p<.001)
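One way to see that r does not depend on the measurement scale is to convert a variable to different units and recompute. The heights and weights below are hypothetical values, not course data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

height_in = [60, 62, 65, 68, 70, 72]             # hypothetical heights, inches
weight_lb = [115, 120, 140, 155, 160, 180]       # hypothetical weights, pounds
height_cm = [2.54 * v for v in height_in]        # same heights, new units

r_inches = pearson_r(height_in, weight_lb)
r_cm = pearson_r(height_cm, weight_lb)
r_reversed = pearson_r([-v for v in height_in], weight_lb)  # flip the scale

print(f"{r_inches:.4f} {r_cm:.4f} {r_reversed:.4f}")
```

Changing units leaves r unchanged; reversing the direction of one scale flips only the sign, not the magnitude.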
Correlation: Null and Alt Hypotheses 8
Null versus Alternative Hypothesis
  H0
  H1, H2, etc.
Test Statistics and Significance Level
  Test statistic
    Calculated from the data
    Has a known probability distribution
  Significance level
    Usually reported as a p-value (the probability that a result at least this extreme would occur if the null hypothesis were true).
[Table: correlation matrix for price and mpg]
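The slide does not say which test is used, so the following is a sketch under assumptions. For H0 "no linear relationship", a common test statistic is t = r * sqrt((n-2) / (1-r^2)) with n-2 degrees of freedom. Because the Python standard library has no t-distribution table, the p-value here comes from a permutation test instead: shuffle one variable to see what r looks like when only chance is at work. The price and mpg values are invented for illustration and are not the lecture's dataset.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for illustration only.
mpg = [12, 14, 17, 18, 21, 22, 25, 28, 30, 35]
price = [15900, 14500, 11500, 13000, 9700, 10400, 6200, 5900, 4800, 4100]

n = len(mpg)
r_obs = pearson_r(mpg, price)

# Test statistic: follows a t distribution with n - 2 df if H0 is true.
t_stat = r_obs * math.sqrt((n - 2) / (1 - r_obs ** 2))

# Permutation p-value: shuffling price breaks any link with mpg, so the
# shuffled correlations show what chance alone produces.
rng = random.Random(18)          # fixed seed so the run is repeatable
reps = 2000
shuffled = price[:]
extreme = 0
for _ in range(reps):
    rng.shuffle(shuffled)
    if abs(pearson_r(mpg, shuffled)) >= abs(r_obs):
        extreme += 1
p_value = (extreme + 1) / (reps + 1)   # add-one correction avoids p = 0

print(f"r = {r_obs:.3f}, t = {t_stat:.2f}, permutation p = {p_value:.4f}")
```

The decision rule is then a comparison of the p-value with the chosen significance level (for example .05).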
Factors which limit the correlation coefficient 9
Homogeneity of sample group
Non-linear relationships
Censored or limited scales
Unreliable measurement instrument
Outliers
Homogeneous Groups 10
Homogeneous Groups: Adding Groups 11
Homogeneous Groups: Adding More Groups 12
Separate Groups (non-homogeneous) 13
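The effect of a homogeneous sample can be sketched with simulated data. Everything here is an assumption made for illustration: the variables are standard normal, y is x plus noise, and a "homogeneous group" is mimicked by keeping only cases whose x lies within 0.5 of the mean.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(18)                          # fixed seed for repeatability
x = [rng.gauss(0, 1) for _ in range(5000)]
y = [v + rng.gauss(0, 1) for v in x]             # linear relation plus noise

r_full = pearson_r(x, y)                         # whole, varied sample

# Keep only cases near the middle of x to mimic a homogeneous group.
narrow = [(a, b) for a, b in zip(x, y) if abs(a) < 0.5]
r_narrow = pearson_r([a for a, _ in narrow], [b for _, b in narrow])

print(f"full range: {r_full:.2f}, restricted range: {r_narrow:.2f}")
```

The underlying relationship is identical in both cases; only the spread of x changes, and the narrower spread yields a noticeably smaller r.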
Non-Linear Relationships 14
Censored or Limited Scales… 15
Censored or Limited Scales 16
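A ceiling effect can be sketched the same way. In this simulated example (all values and the name `ceiling` are assumptions), scores above the top of the scale are recorded as the top value, which flattens part of the scatter.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(18)                          # fixed seed for repeatability
x = [rng.gauss(0, 1) for _ in range(5000)]
y = [v + rng.gauss(0, 0.5) for v in x]           # strong linear relation

ceiling = 0.0                                    # hypothetical top of the scale
y_capped = [min(v, ceiling) for v in y]          # scores above it are censored

r_raw = pearson_r(x, y)
r_capped = pearson_r(x, y_capped)

print(f"uncensored: {r_raw:.2f}, censored at ceiling: {r_capped:.2f}")
```

About half of the cases hit the ceiling here, so the censored variable no longer tracks x in the upper range and r drops.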
Unreliable Instrument 17
Unreliable Instrument 18
Unreliable Instrument 19
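Measurement error can be sketched by adding random noise to both variables before correlating them. The noise levels and variable names below are assumptions for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(18)                          # fixed seed for repeatability
true_x = [rng.gauss(0, 1) for _ in range(5000)]
true_y = [v + rng.gauss(0, 0.5) for v in true_x]

# An unreliable instrument adds random error to every reading.
seen_x = [v + rng.gauss(0, 1) for v in true_x]
seen_y = [v + rng.gauss(0, 1) for v in true_y]

r_true = pearson_r(true_x, true_y)
r_seen = pearson_r(seen_x, seen_y)

print(f"true scores: {r_true:.2f}, noisy measurements: {r_seen:.2f}")
```

The relationship between the true scores is unchanged; the observed r is smaller only because the instrument adds variance unrelated to the other variable.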
Outliers 20
Outliers 21 Outlier
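A single extreme case can dominate r. In the made-up data below, eight points show no linear relationship at all, and one added outlier produces a correlation close to 1.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5, 6, 7, 8]                     # hypothetical data
y = [5, 3, 6, 4, 5, 3, 6, 4]

r_without = pearson_r(x, y)                      # cross products cancel: r = 0
r_with = pearson_r(x + [30], y + [30])           # add one outlier at (30, 30)

print(f"without outlier: {r_without:.2f}, with outlier: {r_with:.2f}")
```

An outlier can also work in the other direction and pull a strong correlation toward 0, which is why a scatterplot should be checked before r is interpreted.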
Examples with Real Data… 22