ASSESSING THE STRENGTH OF THE REGRESSION MODEL
Assessing the Model’s Strength. Although the best straight line through a set of points may have been found and the assumptions for ε may appear valid, is the resulting regression line useful in predicting y? [Figure: example panels labeled NO, NO, and YES]
STEP 4: HOW GOOD IS THE MODEL?
Can we conclude that there is a linear relation between y and x?
–This is a hypothesis test (t-test)
What proportion of the overall variability in y (from its mean) can be explained by changes in x?
–This is a performance measure called the coefficient of determination (denoted by r²)
Can we conclude a linear relation exists between y and x? We are hypothesizing that y changes linearly with x: y = β₀ + β₁x. That is, if x goes up by 1, y will change by β₁. But if no linear relation exists, then if x goes up by 1, y will not change, i.e. β₁ = 0.
The Hypothesis Test
To test whether or not a linear relation exists:
H₀: β₁ = 0 (no linear relation exists)
H_A: β₁ ≠ 0 (a linear relation does exist)
α = the significance level
Reject H₀ (accept H_A) if t > t_{α/2} or if t < −t_{α/2}, with degrees of freedom = n − (# betas) = n − 2
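The test above can be sketched in code. This is a minimal illustration, not the presentation's own procedure: it assumes the ten (x, y) pairs from the hand-calculation slide in this deck, computes the test statistic as t = b₁ / s_b1 (with s_b1 the standard error of the slope), and takes the critical value 2.306 for α = .05 and 8 degrees of freedom from a standard t table. All variable names are illustrative.

```python
import math

# Data assumed from the hand-calculation slide: x_i and y_i for 10 observations.
x = [1200, 800, 1000, 1300, 700, 800, 1000, 600, 900, 1100]
y = [101000, 92000, 110000, 120000, 90000, 82000, 93000, 75000, 91000, 105000]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = s_xy / s_xx          # least-squares slope
b0 = y_bar - b1 * x_bar   # least-squares intercept

# Standard error of the slope: s_b1 = sqrt(MSE / S_xx), where MSE = SSE / (n - 2).
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_b1 = math.sqrt(sse / (n - 2) / s_xx)

t_stat = b1 / s_b1
T_CRIT = 2.306  # assumption: t_{.025} with n - 2 = 8 degrees of freedom, from a t table

# Two-tailed rejection rule: reject H0 if t > t_crit or t < -t_crit.
reject_h0 = t_stat > T_CRIT or t_stat < -T_CRIT
print(f"t = {t_stat:.3f}, critical value = {T_CRIT}, reject H0: {reject_h0}")
```

For a different sample size or significance level, the hard-coded critical value would need to be looked up again.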
Coefficient of Determination, r². The proportion of the total change in y that can be explained by changes in the x values is called the coefficient of determination, denoted r².
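Written out, this proportion is usually expressed with the three sums of squares that the deck labels SSR, SSE, and SST. The block below is a standard formulation given for orientation; the index range and the symbols ȳ (mean of y) and ŷᵢ (fitted value) follow common textbook notation rather than anything stated explicitly on this slide.

```latex
\begin{align*}
SST &= \sum_{i=1}^{n} (y_i - \bar{y})^2, \\
SSR &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \\
SSE &= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \\
SST &= SSR + SSE, \qquad r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}.
\end{align*}
```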
Hand Calculation of SSR, SSE, SST

  i    x_i     y_i        ŷ_i      (ŷ_i − ȳ)²    (y_i − ŷ_i)²    (y_i − ȳ)²
  1   1200  101000  109567.57    186802403.21     73403214.02      26010000
  2    800   92000   88540.54     54161643.54     11967859.75      15210000
  3   1000  110000   99054.05      9948056.98    119813732.7      198810000
  4   1300  120000  114824.32    358130051.13     26787618.75     580810000
  5    700   90000   83283.78    159168911.61     45107560.26      34810000
  6    800   82000   88540.54     54161643.54     42778670.56     193210000
  7   1000   93000   99054.05      9948056.98     36651570.49       8410000
  8    600   75000   78027.03    319443162.89      9162892.622    436810000
  9    900   91000   93797.30      4421358.66      7824872.169     24010000
 10   1100  105000  104310.81     70741738.50       474981.7385    82810000
SUM                             1226927027.03    373972972.97    1600900000
                                     (SSR)            (SSE)          (SST)
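The column sums in this hand calculation can be reproduced with a short script. This is a sketch using only the x and y values from the table; the fitted values are recomputed from a least-squares fit rather than copied, so individual entries may differ from the slide in the last displayed digit. Variable names are illustrative.

```python
# x_i and y_i columns from the hand-calculation table.
x = [1200, 800, 1000, 1300, 700, 800, 1000, 600, 900, 1100]
y = [101000, 92000, 110000, 120000, 90000, 82000, 93000, 75000, 91000, 105000]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# The three sums of squares, one per table column.
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained by the line
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # left unexplained
sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation about the mean

r_squared = ssr / sst

print(f"SSR = {ssr:.2f}")
print(f"SSE = {sse:.2f}")
print(f"SST = {sst:.2f}")
print(f"r^2 = {r_squared:.6f}")
```

Because SST = SSR + SSE, computing any two of the sums is enough to recover the third.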
Interpretation of r². r² = 1: perfect (positive or negative) relation, i.e. the points fit exactly along the regression line. r² close to 0: very little relation. The higher the value of r², the better the model fits the data.
Pearson Correlation Coefficient, r. r = ±√(r²), which can also be calculated as cov(x,y)/(s_x s_y), is called the Pearson correlation coefficient. This is also used to measure the strength of the relation between y and x. r = −1 means perfect negative correlation (i.e. all points fit exactly on a line with negative slope). r = +1 means perfect positive correlation (i.e. all points fit exactly on a line with positive slope). r = 0 means no correlation. Other values give relative strength, but have no exact meaning like r², so we usually use r². When we take the square root of r² to get r, the sign of r is the sign of b₁ (positive or negative slope).
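Both routes to r can be checked against each other. The sketch below assumes the ten (x, y) pairs from the hand-calculation slide and uses sample (n − 1) versions of the covariance and standard deviations; the names are illustrative.

```python
import math

# Data assumed from the hand-calculation slide.
x = [1200, 800, 1000, 1300, 700, 800, 1000, 600, 900, 1100]
y = [101000, 92000, 110000, 120000, 90000, 82000, 93000, 75000, 91000, 105000]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Route 1: r = cov(x, y) / (s_x * s_y), using sample statistics.
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
r_from_cov = cov_xy / (s_x * s_y)

# Route 2: r = (sign of b1) * sqrt(r^2), with r^2 = SSR / SST.
b1 = cov_xy / s_x ** 2
b0 = y_bar - b1 * x_bar
ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
sst = sum((yi - y_bar) ** 2 for yi in y)
r_from_r2 = math.copysign(math.sqrt(ssr / sst), b1)

print(f"r from covariance: {r_from_cov:.5f}")
print(f"r from r^2:        {r_from_r2:.5f}")
```

The `copysign` call is what carries the sign of the slope over to r, since the square root alone is always non-negative.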
EXCEL [Figure: regression output with these items labeled] r²; r (“+” if β₁ > 0; “−” if β₁ < 0); SSR, SSE, SST; s_b1; t statistic for the β₁ test; p-value for the β₁ test; 95% confidence interval for β₁
Steps Using Excel
1. Determine the regression equation. Equation: ŷ = 46486.49 + 52.56757x
2. Can you conclude a linear relation exists between y and x? The p-value for the test is .000904 < α = .05, so YES.
3. What proportion of the overall variation in y is explained by changes in x? This is r² = .766398, a high r².
CONCLUSION: Overall a good model!
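The regression equation reported in step 1 can be cross-checked outside Excel. This is a minimal sketch assuming the ten (x, y) pairs from the hand-calculation slide; `predict` is a hypothetical helper introduced only for illustration.

```python
# Data assumed from the hand-calculation slide.
x = [1200, 800, 1000, 1300, 700, 800, 1000, 600, 900, 1100]
y = [101000, 92000, 110000, 120000, 90000, 82000, 93000, 75000, 91000, 105000]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates: b1 = S_xy / S_xx, b0 = y_bar - b1 * x_bar.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar


def predict(x_value):
    """Hypothetical helper: fitted value y-hat for a given x."""
    return b0 + b1 * x_value


print(f"y-hat = {b0:.2f} + {b1:.5f} x")
print(f"fitted value at x = 1000: {predict(1000):.2f}")
```

The fitted value at x = 1000 can be compared with the ŷ entry for observations 3 and 7 in the hand-calculation table.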
Review
Can we conclude a linear relation exists?
–Two-tailed t-test of H_A: β₁ ≠ 0
–Look at the p-value for the x-variable in Excel
Computation of a confidence interval for the amount y will change per unit increase in x (i.e. for β₁)
–By hand
–Printed on Excel output
What proportion of the overall variation in y is explained by changes in x? r²
–By hand
–Printed on Excel output
Pearson correlation coefficient, r
–Square root of r²
–Sign is the same as b₁
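The "by hand" confidence interval for β₁ mentioned in the review can be sketched as b₁ ± t_{α/2} · s_b1. The formula is the standard one and is not spelled out in this transcript; the data are assumed from the hand-calculation slide, and the critical value 2.306 (95% confidence, 8 degrees of freedom) is taken from a t table.

```python
import math

# Data assumed from the hand-calculation slide.
x = [1200, 800, 1000, 1300, 700, 800, 1000, 600, 900, 1100]
y = [101000, 92000, 110000, 120000, 90000, 82000, 93000, 75000, 91000, 105000]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of the slope from the residual sum of squares.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_b1 = math.sqrt(sse / (n - 2) / s_xx)

T_CRIT = 2.306  # assumption: t_{.025} with 8 degrees of freedom, from a t table
margin = T_CRIT * s_b1
ci_lower, ci_upper = b1 - margin, b1 + margin

print(f"95% CI for beta_1: ({ci_lower:.3f}, {ci_upper:.3f})")
```

An interval that excludes 0 agrees with rejecting H₀: β₁ = 0 in the two-tailed test at the same α.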