Presentation on theme: "6.4 Prediction" — Presentation transcript:

1 6.4 Prediction -We have already seen how to make predictions about our dependent variable using our OLS estimates and values for our independent variables -However, these point predictions are subject to sampling variation -a confidence interval is therefore often preferable to a point estimate

2 6.4 Prediction -Assume we want to obtain the expected value of y (denoted by θ) given certain independent variable values x_j = c_j:
$\theta = \beta_0 + \beta_1 c_1 + \beta_2 c_2 + \cdots + \beta_k c_k$
-since we don't have the actual values of the $\beta_j$, our estimator of θ becomes:
$\hat{\theta} = \hat{\beta}_0 + \hat{\beta}_1 c_1 + \hat{\beta}_2 c_2 + \cdots + \hat{\beta}_k c_k$
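As a concrete illustration, here is a minimal Python sketch of this point prediction; the data, variable names, and use of statsmodels are my assumptions, not part of the original slides.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: y generated from two explanatory variables.
rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=100), rng.normal(size=100)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=100)

# Fit OLS with an intercept.
X = sm.add_constant(np.column_stack([x1, x2]))
results = sm.OLS(y, X).fit()

# Point prediction theta-hat at chosen values c_j.
c = np.array([1.0, 2.0])
theta_hat = results.params[0] + results.params[1:] @ c
print(theta_hat)
```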

3 6.4 Prediction -However, due to sampling uncertainty, we may prefer a confidence interval estimate for θ:
$\hat{\theta} \pm t_{\alpha/2} \cdot se(\hat{\theta})$
-Since we don't directly know the above standard error, similar to F tests we construct a new regression, using the calculation:
$\beta_0 = \theta - \beta_1 c_1 - \beta_2 c_2 - \cdots - \beta_k c_k$

4 6.4 Prediction -substituting this calculation into the population regression function gives us:
$y = \theta + \beta_1 (x_1 - c_1) + \beta_2 (x_2 - c_2) + \cdots + \beta_k (x_k - c_k) + u$
-In this regression of y on the $(x_j - c_j)$, the intercept estimate is $\hat{\theta}$ and its reported standard error is $se(\hat{\theta})$, which together give us our confidence interval -note that all our slope coefficients and their standard errors must be the same as in the original regression
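Continuing the illustrative sketch above, this reparameterization amounts to shifting each regressor by its c_j and re-running OLS; the intercept then carries the prediction and its standard error.

```python
# Regress y on (x_j - c_j): the intercept is theta-hat and its
# standard error is se(theta-hat).
Xc = sm.add_constant(np.column_stack([x1 - c[0], x2 - c[1]]))
shifted = sm.OLS(y, Xc).fit()

theta_hat = shifted.params[0]
se_theta = shifted.bse[0]
ci_lower, ci_upper = shifted.conf_int()[0]   # 95% CI for theta
print(theta_hat, se_theta, ci_lower, ci_upper)
```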

5 6.4 Prediction Notes 1) Note that the variance of this prediction is smallest at the mean values of the x_j -This is because we have the most confidence in our OLS estimates in the middle of our data 2) Note also that this confidence interval applies to the AVERAGE outcome with the given x values, and does not apply to any PARTICULAR observation

6 6.4 Individual Prediction -When obtaining a confidence interval for an INDIVIDUAL (often one not in the sample), it is referred to as a PREDICTION INTERVAL, and the actual outcome is:
$y^0 = \beta_0 + \beta_1 x_1^0 + \cdots + \beta_k x_k^0 + u^0$
-$x^0$ denotes the individual's x values (which could be observed) and $u^0$ is the unobserved error -Our best estimate of $y^0$, from OLS, is:
$\hat{y}^0 = \hat{\beta}_0 + \hat{\beta}_1 x_1^0 + \cdots + \hat{\beta}_k x_k^0$

7 6.4 Individual Prediction -given our OLS estimation, our PREDICTION ERROR is:
$\hat{e}^0 = y^0 - \hat{y}^0$
-We know that $E(\hat{\beta}_j x_j^0) = \beta_j x_j^0$ (since OLS is unbiased) and $E(u^0) = 0$, therefore the expected prediction error is zero:
$E(\hat{e}^0) = 0$

8 6.4 Individual Prediction -Note that $u^0$ is uncorrelated with the $\hat{\beta}_j$, since the error comes from the population while the OLS estimates come from the sample (and the actual error is uncorrelated with the estimated error) -Furthermore, $u^0$ is uncorrelated with $\hat{y}^0$ -Therefore, the VARIANCE OF THE PREDICTION ERROR simplifies to:
$Var(\hat{e}^0) = Var(\hat{y}^0) + Var(u^0) = Var(\hat{y}^0) + \sigma^2$

9 6.4 Individual Prediction -The above variance comes from two sources: 1) Variation in $\hat{y}^0$. Since $\hat{y}^0$ comes from estimates of the $\beta_j$'s, which have variances proportional to 1/n, $Var(\hat{y}^0)$ is also proportional to 1/n and is very small for large samples 2) $\sigma^2$. Since (1) is often small, the variance of the error is often the dominant term

10 6.4 Individual Prediction -From our CLM assumptions, the $\hat{\beta}_j$ and $u^0$ are normally distributed, so $\hat{e}^0$ is also normally distributed -Furthermore, we can rearrange our variance formulas to see that:
$se(\hat{e}^0) = \left[ se(\hat{y}^0)^2 + \hat{\sigma}^2 \right]^{1/2}$
-due to the $\hat{\sigma}^2$ term, individual prediction CI's are wider than average prediction CI's, and are calculated as:
$\hat{y}^0 \pm t_{\alpha/2} \cdot se(\hat{e}^0)$
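Continuing the sketch above, statsmodels reports both interval types directly; the column names below come from its summary_frame output, and the data remain hypothetical.

```python
# Both intervals at the point x0 = (1, c1, c2): the mean CI from the
# earlier slides and the wider individual prediction interval.
x0 = np.concatenate([[1.0], c]).reshape(1, -1)
frame = results.get_prediction(x0).summary_frame(alpha=0.05)

print(frame[["mean_ci_lower", "mean_ci_upper",   # CI for the average outcome
             "obs_ci_lower", "obs_ci_upper"]])   # individual prediction interval
```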

11 6.4 Residual Analysis -RESIDUAL ANALYSIS involves examining individual observations to determine whether the predicted value is above or below the true value -a negative residual can indicate an observation is undervalued or has an unmeasured characteristic that lowers the dependent variable -e.g.: a car with a negative residual is either a good deal or has something wrong with it (e.g.: it's not a Ford)

12 6.4 Residual Analysis -a positive residual can indicate an observation is overvalued or has an unmeasured characteristic that increases the dependent variable -e.g.: a hockey team with a positive residual is either playing very well or has an unmeasured positive factor (e.g.: it's from Edmonton, city of champions)
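A minimal residual-analysis sketch, continuing the hypothetical example: rank observations by residual to flag candidates for under- or over-valuation.

```python
# Residual = actual - fitted; very negative values flag observations that
# fall below what their x's predict, very positive ones the reverse.
residuals = y - results.fittedvalues
order = np.argsort(residuals)

print("most undervalued observations:", order[:3])
print("most overvalued observations:", order[-3:])
```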

13 6.4 Predicting with Logs -It is very common to express the dependent variable of a regression in natural logs, producing the true and estimated equations:
$\log(y) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + u$
$\widehat{\log(y)} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_k x_k$
-where the $x_j$ can also be in log form -It may seem natural to predict y as:
$\hat{y} = \exp(\widehat{\log(y)})$
-But this systematically UNDERESTIMATES the expected value of y

14 6.4 Predicting with Logs -From our 6 CLM assumptions, it can be shown that:
$E(y|x) = \exp(\sigma^2/2) \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)$
-Which gives us the simple prediction adjustment:
$\hat{y} = \exp(\hat{\sigma}^2/2) \exp(\widehat{\log(y)})$
-This adjustment is consistent, but not unbiased, and depends heavily on the error term having a normal distribution (MLR.6)
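A sketch of this adjustment on assumed data: the positive outcome below is hypothetical, reusing the earlier design, and in statsmodels the `scale` attribute of a fitted OLS model is the estimate of σ².

```python
# Hypothetical log-model data: a positive outcome driven by x1 and x2.
ypos = np.exp(1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=100))
logfit = sm.OLS(np.log(ypos), X).fit()

# Naive prediction exp(logy-hat) vs. the normality-based adjustment.
logy_hat = logfit.fittedvalues
naive = np.exp(logy_hat)
adjusted = np.exp(logfit.scale / 2) * naive   # logfit.scale = sigma-hat^2
```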

15 6.4 Predicting with Logs -Since in large samples normality isn't required, forsaking MLR.6 we have:
$E(y|x) = \alpha_0 \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)$
-where $\alpha_0 = E(e^u)$ -if we can estimate $\alpha_0$, we can predict y as:
$\hat{y} = \hat{\alpha}_0 \exp(\widehat{\log(y)})$

16 6.4 Predicting with Logs -In order to estimate $\hat{\alpha}_0$: 1) Regress log(y) on all the x's and obtain the fitted values $\widehat{\log(y)}_i$ 2) For each observation, create $\hat{m}_i = \exp(\widehat{\log(y)}_i)$ 3) Regress y on $\hat{m}_i$ without an intercept 4) The coefficient on $\hat{m}_i$ is $\hat{\alpha}_0$, which is used to predict y
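Continuing the log-model sketch above, the four steps translate directly:

```python
# Steps 1-2: exponentiate the fitted values of the log(y) regression.
m_hat = np.exp(logy_hat)

# Step 3: regress y on m-hat with no intercept; passing a single column
# and no constant suppresses the intercept, so the lone coefficient
# is alpha-hat-0.
alpha0_hat = sm.OLS(ypos, m_hat).fit().params[0]

# Step 4: predict y.
y_pred = alpha0_hat * m_hat
```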

17 6.4 Comparing R²'s -By definition, the R² from the regression of y on $x_1, \ldots, x_k$ is simply the squared correlation between the $y_i$ and the $\hat{y}_i$ -This can be compared to the square of the sample correlation between the $y_i$ and the level predictions obtained from the log regression, in order to compare R²'s between a linear and a log model
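A final sketch comparing the two fits on the same hypothetical data:

```python
# Linear model: R^2 equals the squared correlation of y with its fitted values.
linfit = sm.OLS(ypos, X).fit()
r2_linear = np.corrcoef(ypos, linfit.fittedvalues)[0, 1] ** 2

# Log model: square the sample correlation of y with the level predictions
# derived from the log regression (y_pred from the previous sketch).
r2_log = np.corrcoef(ypos, y_pred)[0, 1] ** 2
print(r2_linear, r2_log)
```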

