Correlation and Regression
Paired Data
Is there a relationship?
- Do the numbers appear to increase or decrease together?
- Does one set increase as the other decreases?
- How consistent is the pattern?
If so, can we...
- Quantify it?
- Model it with an equation?
- Use the equation for prediction?
Data table columns: x = Plastic (lb), y = Household
The Linear Correlation Coefficient
- Measures the strength and direction of the linear relationship between paired x and y values in a sample.
- ρ (rho) is the population's linear correlation coefficient.
- r is the sample's linear correlation coefficient.
Linear Correlation Coefficient scale: r ranges from -1 to 1. Values near -1 indicate negative correlation, 0 indicates no correlation, and values near 1 indicate positive correlation.
Example/Homework
Estimate r for the following relationships:
1. Household size and amount of trash
2. Car weight and gas mileage
3. Car length and braking distance
4. Height and shoe size
5. Facebook friends and time spent online
6. Car cost and number of cup holders
7. Time watching television and SAT scores
8. Outside temperature and student absences
9. Number of pages for the term paper and its grade
10. Number of accidents and car insurance premiums
Calculation
- The value of r is not affected by the units of measurement or by which variable is assigned to x and which to y.
- Round r to three decimal places.
Requirements:
- The sample of paired (x, y) data is a random sample.
- The pairs of (x, y) data have a bivariate normal distribution:
  1. For every x, the paired y values are normally distributed.
  2. For every y, the paired x values are normally distributed.
Formula:
$$ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}} $$
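The formula for r on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `linear_correlation` and the five data pairs are hypothetical and do not come from the slides. The sketch also checks the two properties stated above, that swapping x and y or changing the units leaves r unchanged.

```python
from math import sqrt


def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r, built from the sums in the formula."""
    if len(xs) != len(ys):
        raise ValueError("x and y must contain the same number of paired values")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Hypothetical illustration data (not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = linear_correlation(x, y)
r_swapped = linear_correlation(y, x)                        # roles of x and y exchanged
r_rescaled = linear_correlation([v * 1000 for v in x], y)   # x in different units

print(round(r, 3))           # 0.775
print(round(r_swapped, 3))   # 0.775
print(round(r_rescaled, 3))  # 0.775
```

For these made-up pairs the sums are n = 5, Σx = 15, Σy = 20, Σxy = 66, Σx² = 55, and Σy² = 86, so r = 30 / √1500 ≈ 0.775.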
Calculating r
Worksheet table columns: X, Y, XY, X², Y²
Calculating r
Excel
- Use the CORREL function.
Calculator
- Enter the data into two lists.
- STAT -> TESTS -> E: LinRegTTest
- Enter the two lists.
- Highlight Calculate and press ENTER.
- Find r (and t and p).
Data table columns: Budget, Gross
Formal Hypothesis Test
Test whether the linear correlation is significant.
Hypotheses:
- H_0: ρ = 0 (no significant linear correlation)
- H_1: ρ ≠ 0 (significant linear correlation)
This is a two-tailed test, and it still needs a significance level.
There are two methods for calculating the test statistic and critical value.
Test Statistic and Critical Value
Test statistic:
$$ t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} $$
Critical values:
- Use the t-table.
- Use the two-tailed alpha heading.
- Degrees of freedom = n - 2
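As a rough sketch of this method, the test statistic t = r / sqrt((1 - r²) / (n - 2)) can be computed and compared with a critical value read from a t-table. The function names and the inputs r = 0.775 and n = 5 are hypothetical. The critical value 3.182 is the standard two-tailed t-table entry for α = .05 with 3 degrees of freedom, and it has to be supplied by the reader.

```python
from math import sqrt


def t_statistic(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / sqrt((1 - r ** 2) / (n - 2))


def is_significant(r, n, t_critical):
    """Two-tailed decision: reject H0 when |t| exceeds the table's critical value."""
    return abs(t_statistic(r, n)) > t_critical


# Hypothetical inputs (not from the slides): r = 0.775 from n = 5 pairs.
r = 0.775
n = 5
t_critical = 3.182  # two-tailed, alpha = .05, df = n - 2 = 3 (from a t-table)

t = t_statistic(r, n)
print(round(t, 2))                        # about 2.12
print(is_significant(r, n, t_critical))   # False: fail to reject H0
```

With so few pairs, even a fairly large r does not reach significance, which is why the degrees of freedom matter when reading the table.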
Test Statistic and Critical Value
Test statistic: r
Critical value: use Table A
Table columns: n, α = .05, α = .01
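The two methods are linked: solving t = r / sqrt((1 - r²) / (n - 2)) for r gives r = t / sqrt(t² + n - 2), so a critical value of r can be derived from a critical value of t. The slides do not state this relation; it is an algebraic consequence of the t formula, and the function name `r_critical` is hypothetical. A minimal sketch:

```python
from math import sqrt


def r_critical(t_critical, n):
    """Critical value of r implied by a two-tailed critical t with n - 2 degrees of freedom."""
    df = n - 2
    return t_critical / sqrt(t_critical ** 2 + df)


# Hypothetical example: n = 5 pairs, alpha = .05 two-tailed, so df = 3 and t = 3.182.
n = 5
t_crit = 3.182
r_crit = r_critical(t_crit, n)

print(round(r_crit, 3))  # 0.878

# Plugging r_crit back into the t formula recovers the critical t.
t_back = r_crit / sqrt((1 - r_crit ** 2) / (n - 2))
print(round(t_back, 3))  # 3.182
```

A sample r whose absolute value exceeds this critical r leads to the same decision as comparing |t| with the critical t.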
For Example
Is there a correlation between engine size and mileage? If so, is it significant?
r =
Data table columns: Size, Mileage
Common Errors Involving Correlation
1. Causation: It is wrong to conclude that correlation implies causality. Even if x and y are strongly correlated, we cannot assume that x causes y:
   - y might cause x.
   - Both might be caused by a third variable z.
2. Averages: Averages suppress individual variation and may inflate the correlation coefficient.
3. There may be some relationship between x and y even when there is no significant linear correlation.
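The third error can be illustrated with a small sketch. The data here are hypothetical: y is exactly x², so the relationship is perfect but not linear, and the linear correlation coefficient comes out as zero. The helper `linear_correlation` is an illustrative name, not something defined in the slides.

```python
from math import sqrt


def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r from the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Hypothetical data: y depends entirely on x, but along a parabola.
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]  # [4, 1, 0, 1, 4]

r = linear_correlation(x, y)
print(r)  # 0.0
```

Because Σx = 0 and Σxy = 0 for these values, the numerator of r is zero even though y is completely determined by x. A scatterplot would reveal the pattern that r misses.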
Homework
For each of the following pairs of data, find the linear correlation coefficient and determine whether the correlation is significant.
Data table columns: Math, Critical Reading
Data table columns: Length, Braking Distance
Homework
For each of the following pairs of data, find the linear correlation coefficient and determine whether the correlation is significant.
Data table columns: Depth (ft), Velocity (ft/sec)
Data table columns: Altitude (km), Temp (C)