HETEROSKEDASTICITY Chapter 8.

1 HETEROSKEDASTICITY Chapter 8

2 I. Introduction Previously we have assumed homoskedasticity.
The variance of the unobservable error, u, conditional on the explanatory variables, is constant: Var(u|x1, x2, …, xk) = Var(u) = σ². Heteroskedasticity occurs if the variance of u changes across different segments of the population (i.e. different values of x): Var(u|x1, x2, …, xk) ≠ σ². Example: return to education, where the variance in ability differs by educational attainment.

3 I. Introduction Failure of homoskedasticity
It does not bias the coefficient estimates. The interpretation of R² and adjusted R² is unaffected (because they depend on the unconditional variance). It does bias the usual estimate of the variance of the coefficient estimates (and consequently the standard errors), so the standard t-statistics, p-values, confidence intervals, and F-statistics lead to wrong inference; i.e. the t-statistics no longer follow a t distribution. OLS is no longer BLUE (i.e. no longer the most efficient linear unbiased estimator).

4 Example of Heteroskedasticity
[Figure: conditional densities f(y|x) drawn at x1, x2, x3 along the regression line E(y|x) = β0 + β1x.] Economics 20 - Prof. Anderson

5 II. Heteroskedasticity-Robust Inference
We can adjust standard errors so that they are valid in the presence of heteroskedasticity of unknown form. This is convenient because we get valid standard errors regardless of the kind of heteroskedasticity. Outline: variance with heteroskedasticity; estimating heteroskedasticity-robust standard errors; detecting heteroskedasticity.

6 II. Variance with Heteroskedasticity

7 II. Variance with Heteroskedasticity
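
The variance expressions for these two slides are not reproduced in the transcript. As a hedged reconstruction, the standard textbook result for the simple regression y = β0 + β1x + u with Var(u_i|x_i) = σ_i² is sketched below; the symbols SST_x, r̂_ij, and SSR_j are notation assumed here, not taken from the slides.

```latex
% Variance of the OLS slope when Var(u_i | x_i) = \sigma_i^2
\operatorname{Var}(\hat\beta_1)
  = \frac{\sum_{i=1}^{n} (x_i - \bar x)^2 \sigma_i^2}{SST_x^2},
\qquad SST_x = \sum_{i=1}^{n} (x_i - \bar x)^2 .

% With \sigma_i^2 = \sigma^2 for all i this collapses to the usual \sigma^2 / SST_x.

% A valid estimator replaces \sigma_i^2 by the squared OLS residual:
\widehat{\operatorname{Var}}(\hat\beta_1)
  = \frac{\sum_{i=1}^{n} (x_i - \bar x)^2 \hat u_i^2}{SST_x^2}.

% Multiple regression: \hat r_{ij} is the i-th residual from regressing x_j on the
% other regressors, and SSR_j is the sum of squared residuals from that regression.
\widehat{\operatorname{Var}}(\hat\beta_j)
  = \frac{\sum_{i=1}^{n} \hat r_{ij}^2 \hat u_i^2}{SSR_j^2}.
```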

8 III. Estimation with Heteroskedasticity
We now have a consistent estimate of the variance under heteroskedasticity; its square root can be used as a standard error for inference. These are called heteroskedasticity-robust standard errors, or Huber/White/Eicker standard errors, and they are easily computable. Sometimes the variance estimate is corrected for degrees of freedom by multiplying by n/(n − k − 1) before taking the square root. As n → ∞ the two versions are equivalent.
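
A minimal sketch of the robust standard error of the slope, assuming a simple regression (k = 1) and made-up data; the function names and the data are illustrative only and do not come from the slides. It computes the usual s.e., the robust s.e., and the robust s.e. with the n/(n − k − 1) correction.

```python
import math

def ols_simple(x, y):
    """OLS of y on a constant and x; returns (b0, b1, residuals)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sst_x = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
    b0 = ybar - b1 * xbar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    return b0, b1, resid

def slope_standard_errors(x, y, df_correct=False):
    """Return (usual s.e., heteroskedasticity-robust s.e.) of the slope."""
    n, k = len(x), 1
    xbar = sum(x) / n
    sst_x = sum((xi - xbar) ** 2 for xi in x)
    _, _, resid = ols_simple(x, y)
    # Usual formula sigma^2-hat / SST_x, valid only under homoskedasticity.
    sigma2_hat = sum(u ** 2 for u in resid) / (n - k - 1)
    se_usual = math.sqrt(sigma2_hat / sst_x)
    # Robust formula: sum of (x_i - xbar)^2 * uhat_i^2, divided by SST_x^2.
    var_robust = sum((xi - xbar) ** 2 * u ** 2
                     for xi, u in zip(x, resid)) / sst_x ** 2
    if df_correct:
        var_robust *= n / (n - k - 1)  # degrees-of-freedom correction
    return se_usual, math.sqrt(var_robust)

# Made-up data in which the spread of y grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.5, 7.2, 11.0, 10.1, 16.0, 13.5]
se_usual, se_robust = slope_standard_errors(x, y)
_, se_robust_df = slope_standard_errors(x, y, df_correct=True)
print(se_usual, se_robust, se_robust_df)
```

The corrected and uncorrected robust s.e. differ only by the factor sqrt(n/(n − k − 1)), which goes to 1 as n grows.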

9 III. Estimation with Heteroskedasticity
Why don’t we always calculate robust s.e.? Robust standard errors only have asymptotic justification, as n → ∞. With small sample sizes, the t statistics formed with robust standard errors will not have a distribution close to the t distribution, and inferences will not be correct. In Stata, robust standard errors are easily obtained with the robust option of the regression command: reg y x1 x2, robust

10 II. Heteroskedasticity-Robust Inference Estimation
In large samples, people often report only robust s.e.; in small samples they report both. t-statistic: t = (estimate − hypothesized value) / (s.e.). F-statistic: there is no easy analytical formula, so the standard formula cannot be used; use Stata to compute the F-statistic that goes with robust s.e.

11 III. Estimation with Heteroskedasticity
Example: log wage equation. Robust s.e. can be either larger or smaller than the usual s.e. Here (by chance) the estimates are significant whether the wrong or the right s.e. are used; this is not always the case.

12 IV. Testing for Heteroskedasticity
Homoskedasticity implies Var(u|x1, x2, …, xk) = σ². To test for heteroskedasticity, test H0: Var(u|x1, x2, …, xk) = σ², which is equivalent to H0: E(u²|x1, x2, …, xk) = E(u²) = σ², since we assume E(u|x1, x2, …, xk) = 0. If we assume the relationship between u² and the xj is linear, we can test this as a linear restriction: for u² = δ0 + δ1x1 + … + δkxk + v, this means testing H0: δ1 = δ2 = … = δk = 0 using an F-test.

13 IV. Testing for Heteroskedasticity The Breusch-Pagan Test
We don’t observe the error u (or u²), but we can estimate it with the residuals from the OLS regression and then estimate û² = δ0 + δ1x1 + … + δkxk + e. Denote the R-squared from this regression by R²_û²; it can be used to construct an F statistic. The usual F-statistic formula applies if we assume (which we do) that the error in this regression is homoskedastic, i.e. Var(e|x1, x2, …, xk) is constant. The F statistic is just the reported F statistic for overall significance of the regression: F = [R²_û²/k] / [(1 − R²_û²)/(n − k − 1)], which is distributed F(k, n − k − 1) under H0.
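
The Breusch-Pagan steps can be sketched as follows, assuming a single regressor (k = 1) and made-up data; all names are illustrative. The sketch stops at the F statistic, because comparing it with an F(k, n − k − 1) critical value needs a distribution table or a statistics package.

```python
def ols_simple(x, y):
    """OLS of y on a constant and x; returns (b0, b1, residuals)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sst_x = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
    b0 = ybar - b1 * xbar
    return b0, b1, [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

def r_squared(x, y):
    """R-squared from the simple regression of y on x."""
    _, _, resid = ols_simple(x, y)
    ybar = sum(y) / len(y)
    sst = sum((yi - ybar) ** 2 for yi in y)
    ssr = sum(u ** 2 for u in resid)
    return 1 - ssr / sst

def breusch_pagan_f(r2, n, k):
    """F = [R2/k] / [(1 - R2)/(n - k - 1)] from the u-hat-squared regression."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# Made-up data in which the spread of y grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.1, 2.3, 2.8, 4.6, 4.1, 7.5, 5.2, 10.0, 6.1, 13.0]

_, _, u_hat = ols_simple(x, y)           # step 1: OLS residuals
u_hat_sq = [u ** 2 for u in u_hat]       # step 2: square them
r2_u = r_squared(x, u_hat_sq)            # step 3: regress u-hat^2 on x
f_stat = breusch_pagan_f(r2_u, n=len(x), k=1)
print(r2_u, f_stat)
```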

14 IV. Testing for Heteroskedasticity The Breusch-Pagan Test
Ex: Heteroskedasticity in Housing Price Equations

15 IV. Testing for Heteroskedasticity The White Test
The Breusch-Pagan test detects linear forms of heteroskedasticity. The White test allows for nonlinearities by using squares and cross-products of all the x’s; with two regressors: û² = δ0 + δ1x1 + δ2x2 + δ3x1² + δ4x2² + δ5x1x2 + e. Again use the R-squared from this regression to form an F-statistic, assuming this equation satisfies homoskedasticity, and test whether all the xj, xj², and xjxh are jointly significant.

16 IV. Testing for Heteroskedasticity Alternative form of The White Test
The White test can get unwieldy pretty quickly. Alternative: the fitted values from OLS, ŷ, are a function of all the x’s. Thus ŷ² is a function of the squares and cross-products, so ŷ and ŷ² can proxy for all of the xj, xj², and xjxh. Regress û² = α0 + α1ŷ + α2ŷ² + error and use the R-squared from this regression to form an F statistic. Note that only 2 restrictions are being tested now (H0: α1 = α2 = 0).
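
A sketch of this alternative form of the White test, assuming two regressors and made-up data. The helper functions `solve` and `ols` are hypothetical names written for this illustration (plain Gaussian elimination on the normal equations), not part of any library.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z

def ols(columns, y):
    """OLS of y on a constant plus the given regressor columns.

    Returns (coefficients, fitted values, R-squared).
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    ybar = sum(y) / n
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return beta, fitted, 1 - ssr / sst

# Made-up data with two regressors.
x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x2 = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
y = [4.0, 4.2, 8.1, 6.0, 12.5, 17.0, 9.0, 18.5, 14.0, 17.5]
n = len(y)

_, y_hat, _ = ols([x1, x2], y)                       # original regression
u_hat_sq = [(yi - fi) ** 2 for yi, fi in zip(y, y_hat)]
y_hat_sq = [fi ** 2 for fi in y_hat]
_, _, r2_white = ols([y_hat, y_hat_sq], u_hat_sq)    # u-hat^2 on y-hat, y-hat^2
f_white = (r2_white / 2) / ((1 - r2_white) / (n - 3))  # 2 restrictions
print(r2_white, f_white)
```

Whatever the number of original regressors, the auxiliary regression has only two slope coefficients, so the F statistic always has 2 numerator degrees of freedom.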

17 IV. Testing for Heteroskedasticity Notes
Recall that transforming the dependent variable into logs can reduce heteroskedasticity.

18 V. Generalized Least Squares & Weighted Least Squares
While it’s always possible to estimate robust standard errors for OLS estimates, under heteroskedasticity OLS is not the most efficient estimator. If we can correctly specify the form of the heteroskedasticity, we can obtain more efficient estimates. Basic idea: transform the model into one that has homoskedastic errors; this is called weighted least squares.

19 V. Weighted Least Squares Heteroskedasticity is known up to a Multiplicative Constant
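Suppose the heteroskedasticity can be modeled as Var(ui|xi) = σi² = σ²h(xi), where h(x) > 0 is a known function of the explanatory variables. The error variance is then proportional to h(x), for example to the level of x itself when h(x) = x. E.g., as income increases, the variability in savings increases.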

20 V. Weighted Least Squares Heteroskedasticity is known up to a Multiplicative Constant
Suppose Var(ui|x) = σ²h(x) = σ²hi. Since hi is just a function of x, E(ui/√hi | x) = 0, because E(ui|x) = 0. Moreover, Var(ui|x) = E(ui²|x) − [E(ui|x)]² = E(ui²|x). Then Var(ui/√hi | x) = E((ui/√hi)² | x) = (1/hi) E(ui²|x) = (1/hi) Var(ui|x) = (1/hi) σ²hi = σ². So, if we divide the whole equation by √hi, we obtain a model in which the error is homoskedastic.

21 V. Weighted Least Squares Heteroskedasticity is known up to a Multiplicative Constant
Example: Transformed equation satisfies all G-M assumptions.

22 V. Weighted Least Squares Generalized Least Squares Estimator
Estimating the transformed equation by OLS is an example of generalized least squares (GLS), a class of estimators. GLS will be BLUE in this case, providing more efficient estimates than OLS on the untransformed equation. We can use the resulting s.e. for t-statistics, p-values, and CI, and the resulting R² for F-statistics. However, this R² is not a very good goodness-of-fit measure: it tells us how much of the variation in the transformed y is explained by the transformed x’s.

23 V. Weighted Least Squares
The GLS estimator for correcting heteroskedasticity is called the WLS estimator. It minimizes the weighted sum of squared residuals, where each squared residual is weighted by 1/hi, which is proportional to the inverse of the error variance. Less weight is given to observations with a higher error variance; in contrast, OLS gives the same weight to all observations, because that is best when the error variance is identical for all partitions of the population.

24 V. Weighted Least Squares
Minimization problem: choose the coefficients to minimize ∑ ûi²/hi. WLS is easy to perform using the “weight” option in Stata, which produces s.e. that can be used for t-statistics, etc. Some regression packages even include an option to calculate robust s.e. after weighting, in case the form of heteroskedasticity is specified incorrectly. Ex: explain financial wealth in terms of income, age, gender, and 401(k) eligibility. We suspect heteroskedasticity, so use WLS with weight 1/inc, i.e. assume Var(finan|inc) = σ²·inc.
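
A small sketch of the WLS minimization, assuming a single regressor and h_i = inc_i as in the example above. The numbers are made up for illustration (they are not the wealth data from the slides), and `wls_simple` and `weighted_ssr` are hypothetical helper names.

```python
def wls_simple(x, y, h):
    """Choose (b0, b1) to minimize sum((y_i - b0 - b1*x_i)^2 / h_i)."""
    w = [1.0 / hi for hi in h]          # weights are 1/h_i
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted means
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def weighted_ssr(b0, b1, x, y, h):
    """The WLS objective: squared residuals weighted by 1/h_i."""
    return sum((yi - b0 - b1 * xi) ** 2 / hi for xi, yi, hi in zip(x, y, h))

# Made-up income and financial-wealth figures; the spread grows with income.
inc = [10, 20, 30, 40, 50, 60, 70, 80]
finan = [5.0, 9.0, 20.0, 12.0, 35.0, 18.0, 52.0, 25.0]

ols_b0, ols_b1 = wls_simple(inc, finan, [1.0] * len(inc))  # equal weights = OLS
wls_b0, wls_b1 = wls_simple(inc, finan, inc)               # h_i = inc_i
print((ols_b0, ols_b1), (wls_b0, wls_b1))
print(weighted_ssr(ols_b0, ols_b1, inc, finan, inc),
      weighted_ssr(wls_b0, wls_b1, inc, finan, inc))
```

Setting every h_i to 1 reproduces OLS, which is why the same function serves for both estimators here. The WLS estimates give the smaller weighted sum of squared residuals by construction.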

26 V. Weighted Least Squares
WLS is great if we know what Var(ui|xi) looks like, but in most cases we won’t know the form of the heteroskedasticity. One case where we do know the form is when the data are aggregated across some group or geographic region instead of being at the individual level. Example: the relationship between the amount a worker contributes to a 401(k) plan and the plan’s generosity.

27 V. Weighted Least Squares
What if we only have averages for each firm? mcontribi = β0 + β1mearnsi + β2magei + β3mmaratei + mui (the m prefix marks firm averages). If the individual-level regression satisfies all G-M assumptions, it can be shown that the aggregated regression has Var(mui|x) = σ²/mi, where mi is the number of employees at firm i. The variance of the error term decreases with firm size, and the weight is then 1/hi = mi. A similar issue arises with per capita data at the city, county, state, or country level: the error in the aggregate equation has variance proportional to 1/(size of the population in that area).
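
The variance of the firm-average error can be worked out in one line. This is a sketch assuming that the individual errors u_{i,e} (employee e at firm i) are uncorrelated across employees and share the variance σ²; the notation \bar u_i for the firm-average error and the index e are assumptions made here.

```latex
\operatorname{Var}(\bar u_i \mid x)
  = \operatorname{Var}\!\left( \frac{1}{m_i} \sum_{e=1}^{m_i} u_{i,e} \,\middle|\, x \right)
  = \frac{1}{m_i^2} \sum_{e=1}^{m_i} \operatorname{Var}(u_{i,e} \mid x)
  = \frac{1}{m_i^2} \, m_i \sigma^2
  = \frac{\sigma^2}{m_i},
\qquad\text{so } h_i = \frac{1}{m_i} \text{ and the WLS weight is } \frac{1}{h_i} = m_i .
```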

28 VI. Heteroskedasticity must be estimated: Feasible GLS
More typically, we don’t know the form of the heteroskedasticity. In this case, we need to estimate h(xi). Because we estimate h(xi) and use the estimate to transform the equation, the procedure is called feasible GLS (FGLS). Typically, we start by assuming a fairly flexible model, such as Var(u|x) = σ²exp(δ0 + δ1x1 + … + δkxk), where h(xi) = exp(δ0 + δ1x1 + … + δkxk). Since we don’t know the δ’s, we must estimate them.

29 VI. Heteroskedasticity must be estimated: Feasible GLS
We can write the above as u² = σ²exp(δ0 + δ1x1 + … + δkxk)v, where v is an error term with E(v|x) = 1 (and hence E(v) = 1). Taking natural logs of both sides: ln(u²) = α0 + δ1x1 + … + δkxk + e, where α0 now contains the original intercept δ0 and ln(σ²). Assume E(e) = 0 and that e is independent of x. We don’t observe u, so we replace it with its estimate, û. We can now estimate this equation by OLS to get an estimate of h(xi).

30 VI. Heteroskedasticity must be estimated: Feasible GLS
An estimate of h is then obtained as ĥ = exp(ĝ), where ĝ is the fitted value from the regression of ln(û²) on the x’s, and we can use WLS with weights 1/ĥ. Implementation: (1) Run the original OLS model, save the residuals û, square them, and take the log. (2) Regress ln(û²) on all of the independent variables and get the fitted values, ĝ. (3) Do WLS using 1/exp(ĝ) as the weight.
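
The implementation steps can be sketched as follows, assuming a single regressor and made-up data; `weighted_fit` and `fgls_simple` are hypothetical helper names. The sketch assumes no OLS residual is exactly zero, since the log of û² would then be undefined.

```python
import math

def weighted_fit(x, y, w):
    """Minimize sum(w_i * (y_i - b0 - b1*x_i)^2); equal weights give OLS."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def fgls_simple(x, y):
    """Feasible GLS with h(x) = exp(d0 + d1*x); returns ((b0, b1), h_hat)."""
    ones = [1.0] * len(x)
    # Step 1: OLS, then log of the squared residuals.
    b0, b1 = weighted_fit(x, y, ones)
    log_u2 = [math.log((yi - b0 - b1 * xi) ** 2) for xi, yi in zip(x, y)]
    # Step 2: regress ln(u-hat^2) on x and form the fitted values g-hat.
    a0, d1 = weighted_fit(x, log_u2, ones)
    g_hat = [a0 + d1 * xi for xi in x]
    # Step 3: WLS with weights 1/exp(g-hat).
    h_hat = [math.exp(g) for g in g_hat]
    return weighted_fit(x, y, [1.0 / h for h in h_hat]), h_hat

# Made-up data in which the spread of y grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.2, 2.9, 3.4, 6.1, 4.8, 8.9, 5.5, 11.8, 6.9, 14.5]
(fgls_b0, fgls_b1), h_hat = fgls_simple(x, y)
print(fgls_b0, fgls_b1)
```

The exponential form guarantees that every estimated ĥ is positive, so the weights 1/ĥ are always well defined.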

31 VI. Heteroskedasticity must be estimated: Feasible GLS
FGLS is not BLUE, because it uses ĥ instead of h, but it is consistent and asymptotically more efficient than OLS when the variance model is correctly specified. As we saw with the White test, we could save valuable degrees of freedom by regressing ln(û²) on ŷ and ŷ² instead of on all of the independent variables. When doing F tests with WLS, make sure to use the same weights in the restricted and unrestricted models: form the weights from the unrestricted model and use those weights to do WLS on the restricted model as well as the unrestricted model.

32 VI. Heteroskedasticity must be estimated: Feasible GLS
Example: OLS (without heteroskedasticity-robust s.e.) and WLS. Under the B-P test, the R-squared from the û² regression is .040, and the resulting F-stat rejects the null, so we have evidence of heteroskedasticity. Using the feasible GLS procedure gives a second set of estimates. The signs and the story are similar, but the magnitudes differ.

