Conclusion to Bivariate Linear Regression Economics 224 – Notes for November 19, 2008
Reporting regression results
Equation format OR table format. For each of these:
– Make sure you define x and y, with the units for each also provided. In your report, make this accessible.
– Report the sample size and units of observation.
– Report the standard errors or t-statistics associated with each of the regression coefficients.
– Report the coefficient of determination, along with its statistical significance. ANOVA table could be provided for a fuller report.
– Each step involves reorganizing the results from Excel or other statistical programs to one of these conventions.
– Don’t report too many or too few decimals!
Equation format
Income and alcohol example. x is mean family income per capita in dollars, 1986, and y is alcohol consumption in litres of alcohol per capita for those aged 15 or over, 1985-86. n = 10 observations, one for each of the ten provinces of Canada.
The regression equation is reported with standard errors in brackets. R² for this equation is 0.625, with a P-value of 0.0065.
Alternatively, the t-statistics could be reported in the brackets. Make sure you indicate whether it is the standard errors or the t-statistics that are reported in the brackets.
Table format
Dependent variable is wages and salaries

Variable        Estimated Coefficient   Standard Error   Probability Value
Constant        -13,493                 23,211           0.568
Yrs schooling   4,181                   1,606            0.017

R² = 0.253, P = 0.017 (+ other equation test-statistics)
The Standard Error column could be t-statistics instead.
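The remark that the table could report t-statistics instead of standard errors comes down to simple arithmetic: each t-statistic is the estimated coefficient divided by its standard error. The Python sketch below works this out from the coefficients and standard errors in the table above, and also prints the same results in equation format. The variable names and the exact layout of the printed equation are illustrative assumptions, not a format prescribed in these notes.

```python
# Coefficients and standard errors copied from the wages and salaries table.
results = {
    "Constant": {"coef": -13493, "se": 23211},
    "Yrs schooling": {"coef": 4181, "se": 1606},
}

# t-statistic = estimated coefficient / standard error.
t_stats = {name: r["coef"] / r["se"] for name, r in results.items()}

for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}")

# Equation format with standard errors in brackets underneath
# (the layout here is an assumption made for illustration).
b0, b1 = results["Constant"], results["Yrs schooling"]
equation = (
    f"y-hat = {b0['coef']:,} + {b1['coef']:,} x\n"
    f"        ({b0['se']:,})   ({b1['se']:,})"
)
print(equation)
```

Whichever of the two is reported, the other can be recovered by the reader, which is why the notes only ask that the choice be stated.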
Presenting Multiple Results
Dependent variable is wages and salaries

Variable        Equation I          Equation II   Equation III
Constant        -13,493 (23,211)    …             …
Yrs schooling   4,181 (1,606) †     …             …
R²              0.253               …             …
Significance    0.017               …             …
(+ other equation test-statistics)

Note: Standard errors in brackets. * – significant at the 1% level, † – significant at the 5% level, ‡ – significant at the 10% level
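The significance markers in the note can be assigned mechanically from the P-values. The following Python sketch shows one way to do this, using the coefficients, standard errors and P-values from the wages and salaries example. The function names are hypothetical, and treating each level as a strict "less than" cutoff is an assumption; the notes do not say how a P-value exactly equal to 0.05 should be marked.

```python
def significance_marker(p_value):
    """Return the footnote marker for a P-value, following the note's legend."""
    if p_value < 0.01:
        return "*"   # significant at the 1% level
    if p_value < 0.05:
        return "†"   # significant at the 5% level
    if p_value < 0.10:
        return "‡"   # significant at the 10% level
    return ""        # not significant at the 10% level


def format_cell(coef, se, p_value):
    """Format one table cell as 'coefficient (standard error) marker'."""
    return f"{coef:,} ({se:,}) {significance_marker(p_value)}".rstrip()


# Values from the wages and salaries example.
schooling_cell = format_cell(4181, 1606, 0.017)
constant_cell = format_cell(-13493, 23211, 0.568)
print(schooling_cell)
print(constant_cell)
```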
Residual analysis
The t-test and F-test theoretically only work if the assumptions about the error term are met:
– E(ε) = 0.
– Variance(ε) = σ² is constant for each x.
– Values of ε are independent of each other.
– ε is normally distributed.
If these assumptions are not met:
– Must correct how our model is constructed.
– Or, must come up with a new estimator other than OLS and work on correcting the problems → Econ 324 and up.
Can’t see the true ε’s → must look at our estimates:
– Estimated residuals are e_i = y_i – ŷ_i.
– Best way: plot them versus x_i or ŷ_i using Excel or another program.
Residuals (e) for years of schooling (x) and wages and salaries regression

x       e
17      4914.407
12      -21180.1
12      30819.88
11      -22999
15      -11223.4
15      -13223.4
19      4052.216
15      -2223.4
20      9871.121
16      -25404.5
18      3233.312
11      15500.98
14      27457.69
12      -3680.12
14.5    -41132.9
13.5    19548.24
15      28276.6
13      1138.789
10      7682.075
12.5    -17770.7
15      -8223.4
12.3    14565.56
Previous slide: example of the regression of wages and salaries on years of schooling. The residuals appear to satisfy the assumptions, although there may be heteroskedasticity. That is, the variance of the residuals may not be equal for all values of x.
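A few quick numerical checks can accompany the visual inspection of the residuals. The Python sketch below uses the (x, e) pairs from the residual table for the wages and salaries regression. It confirms that the residuals sum to roughly zero and are unrelated to x, as OLS with an intercept guarantees, and then compares the spread of the residuals for lower and higher years of schooling. Splitting the sample at the median of x is an assumption made for illustration; it is an informal look at heteroskedasticity, not a formal test.

```python
from statistics import median, stdev

# (x, e) pairs copied from the residual table:
# years of schooling and estimated residual.
data = [
    (17, 4914.407), (12, -21180.1), (12, 30819.88), (11, -22999.0),
    (15, -11223.4), (15, -13223.4), (19, 4052.216), (15, -2223.4),
    (20, 9871.121), (16, -25404.5), (18, 3233.312), (11, 15500.98),
    (14, 27457.69), (12, -3680.12), (14.5, -41132.9), (13.5, 19548.24),
    (15, 28276.6), (13, 1138.789), (10, 7682.075), (12.5, -17770.7),
    (15, -8223.4), (12.3, 14565.56),
]
x = [xi for xi, _ in data]
e = [ei for _, ei in data]

# With an intercept in the model, OLS residuals sum to zero and are
# uncorrelated with x, apart from rounding in the reported values.
sum_e = sum(e)
sum_xe = sum(xi * ei for xi, ei in data)

# Informal look at heteroskedasticity: compare the spread of residuals
# below and above the median of x (the split rule is an assumption).
cut = median(x)
low = [ei for xi, ei in data if xi < cut]
high = [ei for xi, ei in data if xi > cut]
spread_ratio = stdev(high) / stdev(low)

print(f"sum of residuals: {sum_e:.2f}")
print(f"sum of x * e:     {sum_xe:.2f}")
print(f"spread ratio (high x / low x): {spread_ratio:.2f}")
```

A spread ratio far from 1 would point toward unequal variance. With only 22 observations, such a comparison is suggestive at best, and a plot of the residuals remains the main tool.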
Regression of alcohol consumption on income, provinces of Canada, 1986. Residuals e_i = actual – predicted alcohol consumption, plotted against x (income). The residuals appear to have a reasonable scatter, with no obvious violation of assumptions.
Consumption function. Example of serial correlation (autocorrelation) in the residuals.
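Serial correlation means that successive residuals are related to one another, so the residuals tend to come in runs of the same sign rather than scattering randomly. One rough way to put a number on this is the correlation between each residual and the one before it. The Python sketch below illustrates the idea. Both residual series are hypothetical values made up for illustration, not the consumption function data, and the function name is an assumption.

```python
def lag1_autocorrelation(residuals):
    """Lag-1 autocorrelation of residuals (assumed to average about zero)."""
    numerator = sum(a * b for a, b in zip(residuals[1:], residuals[:-1]))
    denominator = sum(r * r for r in residuals)
    return numerator / denominator


# Hypothetical residuals, listed in time order.
runs = [3, 2, 1, -1, -2, -3, -3, -2, -1, 1, 2, 3]   # long runs of one sign
alternating = [1, -1, 1, -1, 1, -1]                  # sign flips every period

r_runs = lag1_autocorrelation(runs)
r_alternating = lag1_autocorrelation(alternating)
print(f"runs:        {r_runs:.2f}")
print(f"alternating: {r_alternating:.2f}")
```

A value well above zero indicates positive serial correlation, a value well below zero indicates negative serial correlation, and a value near zero is what independent errors would tend to produce.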
Solutions
If the assumptions are violated, tests and results are suspect. Violations of assumptions may affect some estimates more than others.
Solutions:
– May mean we have missing explanatory variables or the wrong equation format.
– May mean that ε does not meet assumptions.
– Use different estimators. Examine in detail in courses in Econometrics.
Transformations
Relationship may not be linear. There are many different possibilities here. Two examples are provided in other documents:
– Population growth – exponential growth.
– Earnings and age – parabolic relationship.
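For the exponential growth case, the usual transformation is to take logarithms: if y = a·exp(b·t), then ln(y) = ln(a) + b·t, which is linear in t and can be estimated with bivariate OLS. The Python sketch below illustrates this with a hypothetical population series generated to grow at exactly 2% per period. The data, the function name `ols` and the variable names are assumptions for illustration; they are not the population example from the other documents.

```python
import math


def ols(x, y):
    """Bivariate OLS: return (intercept, slope) for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical population growing at a continuous rate of 2% per period.
t = list(range(10))
population = [1000 * math.exp(0.02 * ti) for ti in t]

# Regress ln(population) on t; the slope estimates the growth rate b
# and exp(intercept) estimates the starting level a.
ln_population = [math.log(p) for p in population]
b0, b1 = ols(t, ln_population)

print(f"estimated growth rate b = {b1:.4f}")
print(f"estimated starting level a = {math.exp(b0):.1f}")
```

Because the series here has no noise, the regression recovers the growth rate exactly; with real data the same steps apply, but the estimates come with standard errors and residuals that should be examined as described earlier.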
Some Cautions
If we conclude β_1 ≠ 0 → this doesn’t imply that x causes y. The relationship could still be due to chance. Need some theoretical argument too.
If we conclude β_1 is statistically different from 0 → this doesn’t mean a linear relationship exists for sure.
– Must watch out for non-linear relationships.
Next day
Begin multiple regression (ASW, Ch. 13). Assignment 6 has now been posted. Remember that this assignment is optional.