Chapter 5: Panel Estimation Under Heteroskedasticity and Serial Correlation
This chapter introduces various panel estimation methods that take into account the possible existence of heteroskedasticity and/or serial correlation. The discussions draw on Cameron & Trivedi, Microeconometrics Using Stata. Main methods include:
- Cluster-robust estimation for short panels
  - Pooled OLS estimator or population-averaged estimator
  - Pooled FGLS estimator or population-averaged estimator
  - Fixed effects estimators: FE, within, LSDV, first-difference
  - Random effects estimators: RE, BE
- Robust estimation for long panels
  - Heteroskedasticity and serial correlation
  - Unit roots and cointegration
- Robust estimation for large panels

Zhenlin Yang

5.1. Introduction
Consider the general panel data model that has been studied:
$$y_{it} = \alpha + X_{it}'\beta + u_{it}, \qquad u_{it} = \mu_i + \lambda_t + v_{it}, \tag{5.1}$$
with cross-sections $i = 1, \dots, N$ and time periods $t = 1, \dots, T$.
Heteroskedasticity means that the variance of $u_{it}$ (or of $v_{it}$ in the case of FE models) changes over i, over t, or over both. Variation over i is particularly common, because the cross-sectional units may be of varying size.
Serial correlation means that the $u_{it}$ are correlated over time beyond the equicorrelation induced by the random effects. This arises because an unobserved shock in one period often affects the behavioral relationship for at least the next few periods.
Recall: a short panel has large N and small T; a long panel has small N and large T; a large panel has both N and T large. The methods for handling these two issues differ with the type of panel on which the estimation is based.

5.2. Cluster-Robust Estimation for Short Panels
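For short panels, as T is small, it is common to let the time effects $\lambda_t$ be fixed effects. Then Model (5.1) reduces to the one-way model
$$y_{it} = \alpha + \mathbf{X}_{it}'\boldsymbol{\beta} + \mu_i + v_{it}, \qquad i = 1, \dots, N,\ t = 1, \dots, T, \tag{5.2}$$
if the regressors $\mathbf{X}_{it}$ include a set of time dummies (with one time dummy dropped to avoid the dummy variable trap). Under this parameterization, $\mathbf{X}_{it}$ is $(K+T-1)\times 1$, and the $\mu_i$ are subject to $\sum_{i=1}^{N}\mu_i = 0$.
Writing (5.2) in vector form for each i, or the ith cluster,
$$y_i = \mathbf{X}_i\boldsymbol{\beta} + \iota_T\mu_i + v_i, \qquad i = 1, 2, \dots, N,$$
and applying the transformation $Q_T = I_T - \frac{1}{T}\iota_T\iota_T'$ gives
$$y_i^* = \mathbf{X}_i^*\boldsymbol{\beta} + v_i^*, \qquad i = 1, 2, \dots, N, \tag{5.3}$$
where $y_i^* = Q_T y_i$, $\mathbf{X}_i^* = Q_T\mathbf{X}_i$ and $v_i^* = Q_T v_i$; the term $\iota_T\mu_i$ drops out because $Q_T\iota_T = 0$. The cluster-robust VC matrix of the within estimator $\hat{\boldsymbol{\beta}}$ is given next.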

Assume in (5.3) that (i) the $v_i^*$ are independent over i, and (ii) $\mathrm{Var}(v_i^*) = \Omega_i$, a general $T\times T$ covariance matrix. A robust estimator of the variance-covariance (VC) matrix of the within estimator $\hat{\boldsymbol{\beta}}$ is
$$\widehat{\mathrm{Var}}(\hat{\boldsymbol{\beta}}) = (\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1}\Big(\sum_{i=1}^{N}\mathbf{X}_i^{*\prime}\hat{v}_i^*\hat{v}_i^{*\prime}\mathbf{X}_i^*\Big)(\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1},$$
where $\hat{v}_i^* = y_i^* - \mathbf{X}_i^*\hat{\boldsymbol{\beta}}$, and $\mathbf{X}^*$ is the $NT\times(K+T-1)$ matrix that stacks the $\mathbf{X}_i^*$.
This is the result given in (3.12), and it is valid only for short panels, i.e., the case of large N and small T. It allows arbitrary correlation among the elements of $v_i^*$ for each i, but requires independence of $v_i^*$ over i.
Most importantly, the result applies not only to Model (5.3), obtained from the one-way FE model after the within transformation, but to any model of that form.

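Generalization. Assume that in a model of the form
$$y_i^* = \mathbf{X}_i^*\boldsymbol{\beta} + v_i^*, \qquad i = 1, 2, \dots, N,$$
(i) the $v_i^*$ are independent over i, (ii) $\mathrm{Var}(v_i^*) = \Omega_i$, and (iii) $E(v_i^* \mid \mathbf{X}_i^*) = 0$. The OLS estimator of $\boldsymbol{\beta}$ is $\hat{\boldsymbol{\beta}} = (\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1}\mathbf{X}^{*\prime}y^*$, and a robust estimator of the VC matrix of $\hat{\boldsymbol{\beta}}$ is
$$\widehat{\mathrm{Var}}(\hat{\boldsymbol{\beta}}) = (\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1}\Big(\sum_{i=1}^{N}\mathbf{X}_i^{*\prime}\hat{v}_i^*\hat{v}_i^{*\prime}\mathbf{X}_i^*\Big)(\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1}, \tag{5.4}$$
where $\hat{v}_i^* = y_i^* - \mathbf{X}_i^*\hat{\boldsymbol{\beta}}$, $y^*$ is the stacked $y_i^*$, and $\mathbf{X}^*$ is the stacked $\mathbf{X}_i^*$.
It can be shown that the OLS estimator $\hat{\boldsymbol{\beta}}$ is robust against serial correlation and cross-sectional heteroskedasticity of unknown form. The VC matrix estimate $\widehat{\mathrm{Var}}(\hat{\boldsymbol{\beta}})$ is, by construction, robust against the same two features.
Therefore, $\hat{\boldsymbol{\beta}}$ and $\widehat{\mathrm{Var}}(\hat{\boldsymbol{\beta}})$ together provide a set of inference methods that are robust against unknown serial correlation and cross-sectional heteroskedasticity. Various applications of (5.4) are presented.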
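To make the sandwich form in (5.4) concrete, here is a minimal sketch for the special case of a single regressor and no intercept, written in plain Python rather than Stata. The function name and the toy data are hypothetical, and no finite-sample adjustment is applied, so the numbers need not match what a statistical package reports.

```python
def cluster_robust_ols(clusters):
    """clusters: list of (x_list, y_list) pairs, one pair per individual i.

    Returns (beta_hat, var_hat) for y = x * beta + v with scalar x,
    i.e. the one-regressor case of the sandwich formula (5.4).
    No finite-sample correction is applied (an assumption of this sketch).
    """
    # "Bread": X'X pooled over all individuals and periods
    sxx = sum(x * x for xs, _ in clusters for x in xs)
    sxy = sum(x * y for xs, ys in clusters for x, y in zip(xs, ys))
    beta_hat = sxy / sxx

    # "Meat": residual scores are summed WITHIN each cluster, then squared
    meat = 0.0
    for xs, ys in clusters:
        score_i = sum(x * (y - x * beta_hat) for x, y in zip(xs, ys))
        meat += score_i ** 2

    return beta_hat, meat / (sxx ** 2)


# Hypothetical toy data: the same two observations, grouped in two ways
one_cluster = [([1.0, 1.0], [1.0, 3.0])]            # both periods from one individual
two_clusters = [([1.0], [1.0]), ([1.0], [3.0])]     # two independent individuals

print(cluster_robust_ols(one_cluster))    # (2.0, 0.0)
print(cluster_robust_ols(two_clusters))   # (2.0, 0.5)
```

The point estimate is the same in both groupings; only the variance estimate changes, because the residuals are aggregated within a cluster before being squared.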

Various applications of the result (5.4) are presented. The key Stata command/option for implementing (5.4) is vce(cluster id). The applications are:
- Pooled OLS or population-averaged estimators;
- Pooled FGLS or population-averaged estimators;
- Within estimation;
- Within estimation, allowing time dummies;
- Least-squares dummy-variables regression;
- First-difference estimation, allowing time dummies;
- One-way individual RE estimation, allowing time dummies;
- Between estimation;
- Comparison of panel estimators based on short panels.

Pooled OLS or Population-Averaged Estimators
Pooled OLS estimators simply regress $y_{it}$ on $\mathbf{X}_{it}$, using both between (cross-section) and within (time-series) variation in the data, and assuming the disturbances $u_{it}$ are iid.
The resulting OLS estimator $\hat{\boldsymbol{\beta}}$ of the coefficients $\boldsymbol{\beta}$ of $X_{it}$ is consistent if $X_{it}$ is uncorrelated with $u_{it}$, and inconsistent otherwise.
Even when $\hat{\boldsymbol{\beta}}$ is consistent, the VC matrix of $\hat{\boldsymbol{\beta}}$ obtained from an OLS regression may be incorrect because the $u_{it}$ may not be iid, which leads to misleading inferences.
To further motivate the OLS estimator and the need for a cluster-robust estimator of its VC matrix, consider Model (5.2): $y_{it} = \alpha + \mathbf{X}_{it}'\boldsymbol{\beta} + \mu_i + v_{it}$.
Consistency of OLS requires that the error term $\mu_i + v_{it}$ be uncorrelated with $\mathbf{X}_{it}$. So the pooled OLS $\hat{\boldsymbol{\beta}}$ is consistent if the $\mu_i$ are RE but inconsistent if the $\mu_i$ are FE. In either case, $\mathrm{Var}(\hat{\boldsymbol{\beta}})$ is not of the form $\sigma^2(\mathbf{X}'\mathbf{X})^{-1}$.
In the statistics literature, the pooled estimators are called population-averaged (pa) estimators.

We use the “Returns to Schooling Data” to demonstrate pooled OLS (or PA) with cluster-robust standard errors (CRSD).
. * Pooled OLS with cluster-robust standard errors
. regress lwage exp expsq wks ed, vce(cluster id)
Output: Linear regression, Number of obs = 4,165, F(4, 594), standard errors adjusted for 595 clusters in id. The coefficient table reports robust standard errors for exp, expsq, wks, ed and _cons (numeric entries not preserved in this transcript).
The coefficient estimates are identical to those from xtreg, pa, to be given later. The standard errors are almost the same as well.

Wages increase with experience until a peak at 31 years [= 0.0447/(2 × |coefficient of expsq|)]; wages increase by 0.6% with each additional week worked; and wages increase by 7.6% with each additional year of education.
The default standard errors assume that the regression errors are iid:
. * Pooled OLS with incorrect default standard errors
. regress lwage exp expsq wks ed
Output: standard regress ANOVA header (Model, Residual, Total), Number of obs = 4,165, F(4, 4160), followed by the coefficient table for exp, expsq, wks, ed and _cons with default standard errors (numeric entries not preserved in this transcript).

These default standard errors are misleadingly small compared with the corresponding cluster-robust standard errors (CRSD). Therefore, it is essential that the OLS standard errors be corrected for clustering on individuals.
The pooled OLS estimator can also be obtained using the (xtreg, pa) command, with the options corr(independent) and vce(robust) nolog:
. * Pooled OLS with CRSD using the general xtreg, pa procedure.
. xtreg lwage exp expsq wks ed, pa corr(independent) vce(robust) nolog
Output: … standard errors adjusted for clustering on id; coefficient table with robust standard errors and z statistics for exp, expsq, wks, ed and _cons (numeric entries not preserved in this transcript).
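The experience peak of 31 years quoted in the pooled OLS discussion comes from setting the derivative of the quadratic experience profile to zero. The symbols $\beta_{\text{exp}}$ and $\beta_{\text{expsq}}$ below are labels introduced here for the coefficients of exp and expsq.

```latex
\frac{\partial\,\text{lwage}}{\partial\,\text{exp}}
  = \beta_{\text{exp}} + 2\,\beta_{\text{expsq}}\,\text{exp} = 0
\quad\Longrightarrow\quad
\text{exp}^{*} = -\frac{\beta_{\text{exp}}}{2\,\beta_{\text{expsq}}}
```

With $\hat\beta_{\text{exp}} = 0.0447$ and a stated peak of 31 years, the implied coefficient of expsq is roughly $-0.0447/(2 \times 31) \approx -0.00072$. This value is back-calculated from the two quoted figures and is not read from the regression output.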

Pooled FGLS or Population-Averaged Estimators
Pooled FGLS estimation can lead to estimators of the parameters of the pooled model $y_{it} = \alpha + \mathbf{X}_{it}'\boldsymbol{\beta} + u_{it}$ that are more efficient than OLS estimators.
This is achieved by modelling the $T\times T$ error correlation matrix of $(u_{i1}, \dots, u_{iT})$, assumed constant over i, under the assumptions that the observations are independent over i and $N\to\infty$.
The pooled estimator, or PA estimator, is obtained using the (xtreg, pa) command, with two key additional options:
- corr( ): places different restrictions on the error correlation;
- vce(robust): gives cluster-robust standard errors that are valid even if corr( ) does not specify the correct correlation model.
Let $\rho_{ts} = \mathrm{Cor}(u_{it}, u_{is})$ be the correlation of the errors at time periods t and s for individual i. Note the restriction that $\rho_{ts}$ does not vary with i. Also, all corr( ) options set $\rho_{tt} = 1$.
There are potentially $T(T-1)$ unique off-diagonal values in the $T\times T$ error correlation matrix because it need not be that $\rho_{ts} = \rho_{st}$.

Typical options for corr( ) include:
- corr(independence): sets $\rho_{ts} = 0$ for $s \neq t$. Then the PA estimator equals the pooled OLS estimator;
- corr(exchangeable): sets $\rho_{ts} = \rho$ for $s \neq t$. Then the errors are equicorrelated and (xtreg, pa) is asymptotically equivalent to (xtreg, re);
- corr(ar k): specifies an autoregressive process of order k, or AR(k), for $u_{it}$;
- corr(stationary g): specifies a moving average process, or MA(g), for $u_{it}$;
- corr(unstructured): places no restrictions on $\rho_{ts}$. For small T, this may be the best model for correlations over time, but it can fail for larger T.
The nolog option prevents the display of an iteration log.
In the statistics literature, the PA estimator is also called the generalized estimating equations (GEE) estimator. The (xtreg, pa) command is a special case of xtgee with the family(gaussian) option.
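The restrictions that these options place on $\rho_{ts}$ can be visualised by writing out the implied correlation matrices. The sketch below, in plain Python, covers the independent and exchangeable structures and, as a simplifying assumption, only the AR(1) special case of corr(ar k), where $\rho_{ts} = \rho^{|t-s|}$. The function name and the value rho = 0.5 are hypothetical.

```python
def working_corr(T, structure, rho=0.0):
    """Return the T x T working correlation matrix as a list of rows."""
    def entry(t, s):
        if t == s:
            return 1.0                    # every structure sets rho_tt = 1
        if structure == "independent":
            return 0.0                    # no correlation across periods
        if structure == "exchangeable":
            return rho                    # same correlation for every pair (t, s)
        if structure == "ar1":
            return rho ** abs(t - s)      # depends only on the lag |t - s|
        raise ValueError(f"unknown structure: {structure}")
    return [[entry(t, s) for s in range(T)] for t in range(T)]


for name in ("independent", "exchangeable", "ar1"):
    print(name)
    for row in working_corr(3, name, rho=0.5):
        print("  ", row)
```

An unstructured matrix has no such generating rule: each off-diagonal entry is a separate parameter, which is why that choice becomes demanding as T grows.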

We demonstrate the applications of the (xtreg, pa) command using the “Returns to Schooling Data”.
. * PA or pooled FGLS estimation with AR(2) and cluster-robust standard errors
. xtreg lwage exp expsq wks ed, pa corr(ar 2) vce(robust) nolog
Output: GEE population-averaged model; Group and time vars: id year; Link: identity; Family: Gaussian; Correlation: AR(2); Number of obs = 4,165; Wald chi2(4); standard errors adjusted for clustering on id. The coefficient table reports robust standard errors and z statistics for exp, expsq, wks, ed and _cons (numeric entries not preserved in this transcript).

Compared with the results from pooled OLS, the coefficients change considerably, due to the use of the AR(2) model. The cluster-robust standard errors are smaller than those from pooled OLS for all regressors except ed, showing the efficiency gain.
The estimated error correlation matrix is stored in e(R):
. * Estimated error correlation matrix after xtreg, pa
. matrix list e(R)
Output: symmetric e(R)[7,7], with columns c1 to c7 and rows r1 to r7 (numeric entries not preserved in this transcript).
$\rho_{ts}$ changes only with the value of $|t-s|$, as an AR model is used.

If an unstructured error correlation matrix is specified, we have
. xtreg lwage exp expsq wks south, pa corr(unstructured) vce(robust) nolog
Output: ... standard errors adjusted for clustering on id; coefficient table with robust standard errors and z statistics for exp, expsq, wks, south and _cons (numeric entries not preserved in this transcript).
. matrix list e(R)
Output: symmetric e(R)[7,7], with columns c1 to c7 and rows r1 to r7 (numeric entries not preserved in this transcript).
$\rho_{ts}$ now varies freely with the values of t and s, as an unstructured error correlation is specified.

. * PA or pooled FGLS estimation with MA(6) and cluster-robust standard errors
. xtreg lwage exp expsq wks ed, pa corr(stationary 6) vce(robust) nolog
Output: GEE population-averaged model; Group and time vars: id year; Link: identity; Family: Gaussian; Correlation: stationary(6); Number of obs = 4,165; Wald chi2(4); standard errors adjusted for clustering on id. The coefficient table reports robust standard errors and z statistics for exp, expsq, wks, ed and _cons (numeric entries not preserved in this transcript).
The results are similar to those based on AR(2).

Time Series Autocorrelation for Panel Data
Some Stata commands are useful in analyzing the correlation of errors over time. First, set both the panel and time identifiers with xtset. The time-series operators are then available:
- L1.lwage or L.lwage: lwage lagged once;
- L2.lwage: lwage lagged twice;
- D.lwage: the first difference in lwage (equals lwage - L.lwage);
- LD.lwage: the difference lagged once;
- L2D.lwage: the difference lagged twice.
. correlate lwage L1.lwage L2.lwage L3.lwage L4.lwage L5.lwage L6.lwage
Output: (obs=595), a correlation matrix with rows and columns lwage, L.lwage, L2.lwage, L3.lwage, L4.lwage, L5.lwage and L6.lwage (numeric entries not preserved in this transcript).
The sample correlation $r_{ts}$ is free to vary with the values of both t and s.

Within Estimator with Cluster-Robust SE
The within (or FE) estimator of a one-way FE model is obtained by running an OLS regression on the within-transformed model (5.3), or equivalently an OLS regression of the within equation:
$$y_{it} - \bar{y}_{i\cdot} = (X_{it} - \bar{X}_{i\cdot})'\beta + (v_{it} - \bar{v}_{i\cdot}).$$
The (xtreg, fe) command computes this estimator assuming the $v_{it}$ are iid.
The vce(robust) option relaxes this assumption and provides cluster-robust standard errors, provided that the observations are independent over i and $N\to\infty$.
The FE or within estimator controls for the fixed effects $\mu_i$ by using the within-i differences, so that the $\mu_i$ are differenced out. However:
- the within estimation method is unable to estimate the coefficients of time-invariant regressors;
- the within estimator will be relatively imprecise for time-varying regressors that vary little over time;
- the within estimation will be relatively less efficient, as the transformation effectively loses one period of data for each individual.
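A small numerical sketch can make the within transformation tangible. The plain-Python code below uses hypothetical noise-free data with one regressor and shows two things: the individual effects do not affect the within slope, and a time-invariant regressor is wiped out by demeaning. All names and numbers are illustrative assumptions.

```python
def within_transform(values):
    """Subtract the individual's time mean from each of its T observations."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def within_slope(panel):
    """panel: list of (x_list, y_list) per individual; one regressor.

    OLS of demeaned y on demeaned x, pooled over individuals.
    """
    num = den = 0.0
    for xs, ys in panel:
        xd, yd = within_transform(xs), within_transform(ys)
        num += sum(a * b for a, b in zip(xd, yd))
        den += sum(a * a for a in xd)
    return num / den


# Hypothetical data: y_it = 2 * x_it + mu_i, with mu_1 = 10 and mu_2 = -5
beta, mus = 2.0, [10.0, -5.0]
xs_all = [[1.0, 2.0, 4.0], [0.0, 3.0, 5.0]]
panel = [(xs, [beta * x + mu for x in xs]) for xs, mu in zip(xs_all, mus)]

print(within_slope(panel))                   # close to 2, whatever the mu_i are
print(within_transform([12.0, 12.0, 12.0]))  # time-invariant regressor: all zeros
```

The second line of output is the reason a regressor such as ed cannot be kept in the within regression: after demeaning it has no variation left.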

. xtreg lwage exp expsq wks, fe vce(cluster id)
Output: Fixed-effects (within) regression; Group variable: id; Number of obs = 4,165; F(3,594); standard errors adjusted for 595 clusters in id. The output also reports the within, between and overall R-sq, corr(u_i, Xb), the coefficient table with robust standard errors for exp, expsq, wks and _cons, and sigma_u, sigma_e and rho (fraction of variance due to u_i). Numeric entries are not preserved in this transcript.
Compared with pooled OLS, the standard errors have increased. The ed variable cannot be included because it is time-invariant.

Within Estimator with CRSE and Time Dummies
. xtreg lwage exp expsq wks i.year, fe vce(cluster id)
note: 7.year omitted because of collinearity
Output: Fixed-effects (within) regression; Group variable: id; Number of obs = 4,165; F(8,594); standard errors adjusted for 595 clusters in id. The coefficient table reports robust standard errors for exp, expsq, wks, the year dummies 2 to 6 (year 7 is omitted) and _cons, followed by sigma_u, sigma_e and rho (fraction of variance due to u_i). Numeric entries are not preserved in this transcript.

Least-Squares Dummy-Variables Regression
The within estimator of $\beta$ can be shown to equal the estimator obtained from a direct OLS estimation of $\mu_1, \dots, \mu_N$ and $\beta$ in the individual effects model $y_{it} = X_{it}'\beta + \mu_i + v_{it}$, using the command areg:
. areg lwage exp expsq wks, absorb(id) vce(cluster id)
Output: Linear regression, absorbing indicators; Number of obs = 4,165; standard errors adjusted for 595 clusters in id; coefficient table with robust standard errors for exp, expsq, wks and _cons; id absorbed (595 categories). Numeric entries are not preserved in this transcript.
The coefficient estimates are the same as those from xtreg, fe. The robust standard errors differ and are invalid here, as areg is designed for long panels.

First-Difference Estimator
Consistent estimation of $\beta$ in the one-way FE model requires elimination of $\mu_1, \dots, \mu_N$, which is achieved by the within transformation to give the within estimator. An orthogonal transformation method was introduced in Chapter 2.
Another way to do so is through the first difference:
$$y_{it} - y_{i,t-1} = (X_{it} - X_{i,t-1})'\beta + (v_{it} - v_{i,t-1}),$$
where the time-invariant $\mu_i$ are eliminated through differencing. An OLS estimation of this model yields consistent estimates of $\beta$.
The FD estimator is not provided as an option to xtreg. Instead, it can be computed using regress together with the Stata time-series operator D. to compute the first differences.
As with the within estimator, time dummies (fixed time effects) can be added to the model.
The robust standard errors can be calculated using vce(cluster id), valid when observations are independent over i and $N\to\infty$.
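The first-difference idea can be checked on a tiny example. The sketch below, in plain Python with hypothetical noise-free data and a single regressor, differences each individual's series and runs OLS without a constant on the differences. The function names are illustrative only.

```python
def first_difference(values):
    """Return [v_2 - v_1, ..., v_T - v_{T-1}]; one observation is lost."""
    return [b - a for a, b in zip(values, values[1:])]


def fd_slope(panel):
    """panel: list of (x_list, y_list) per individual; OLS on differences, no constant."""
    num = den = 0.0
    for xs, ys in panel:
        dx, dy = first_difference(xs), first_difference(ys)
        num += sum(a * b for a, b in zip(dx, dy))
        den += sum(a * a for a in dx)
    return num / den


# Hypothetical data: y_it = 2 * x_it + mu_i, with mu_1 = 10 and mu_2 = -5
panel = [
    ([1.0, 2.0, 4.0], [12.0, 14.0, 18.0]),
    ([0.0, 3.0, 5.0], [-5.0, 1.0, 5.0]),
]

print(fd_slope(panel))                        # 2.0: the mu_i have dropped out
print(first_difference([12.0, 12.0, 12.0]))   # [0.0, 0.0]: time-invariant regressor
print(595 * (7 - 1))                          # 3570 differenced observations for N=595, T=7
```

Each individual contributes T - 1 differenced observations, which is why a balanced panel with 595 individuals and 7 periods yields 3,570 observations in the differenced regression.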

. regress D.(lwage exp expsq wks ed), vce(cluster id) noconstant
note: D.ed omitted because of collinearity
Output: Linear regression; Number of obs = 3,570; F(3, 594); standard errors adjusted for 595 clusters in id. The coefficient table for D.lwage reports robust standard errors for D1.exp, D1.expsq and D1.wks, with D1.ed omitted (numeric entries not preserved in this transcript).

One-Way Random Effects Estimator with CRSE
Recall the one-way random effects model given in Ch. 2:
$$y_{it} = \alpha + X_{it}'\beta + \mu_i + v_{it}, \qquad i = 1, \dots, N,\ t = 1, \dots, T.$$
By default, the (xtreg, re) command returns the RE estimator of this model under $\mu_i \sim \mathrm{IID}(0, \sigma_\mu^2)$ and $v_{it} \sim \mathrm{IID}(0, \sigma_v^2)$, independent of each other, with $X_{it}$ independent of $\mu_i$ and $v_{it}$ for all i and t.
For the disturbances $u_{it} = \mu_i + v_{it}$, it is easy to see that
$$\rho_{ts} = \mathrm{Cor}(u_{it}, u_{is}) = \frac{\sigma_\mu^2}{\sigma_\mu^2 + \sigma_v^2} = \rho, \qquad \text{for all } s \neq t.$$
The RE model thus has equicorrelated (exchangeable) errors, which is what the Stata command xtreg with the option re implements.
The options mle or pa corr(exchangeable) give asymptotically equivalent estimators, with different estimators of $\sigma_\mu^2$ and $\sigma_v^2$.
The robust standard errors can be calculated using vce(cluster id), valid when observations are independent over i and $N\to\infty$.
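The equicorrelation result stated above follows in two lines from the independence assumptions on $\mu_i$ and $v_{it}$. A short derivation, using only the symbols already defined for the RE model:

```latex
\begin{aligned}
\mathrm{Var}(u_{it}) &= \mathrm{Var}(\mu_i) + \mathrm{Var}(v_{it})
                      = \sigma_\mu^2 + \sigma_v^2, \\
\mathrm{Cov}(u_{it}, u_{is}) &= E\big[(\mu_i + v_{it})(\mu_i + v_{is})\big]
                      = E(\mu_i^2) = \sigma_\mu^2 \quad (s \neq t), \\
\mathrm{Cor}(u_{it}, u_{is}) &= \frac{\sigma_\mu^2}{\sigma_\mu^2 + \sigma_v^2}.
\end{aligned}
```

The cross terms vanish because $v_{it}$ and $v_{is}$ are independent of each other and of $\mu_i$, so the correlation is the same for every pair of distinct periods.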

Between Estimator with CRSE
The between estimator is the OLS estimator of the between model:
$$\bar{y}_{i\cdot} = \bar{X}_{i\cdot}'\beta + (\mu_i + \bar{v}_{i\cdot}).$$
Consistency of the OLS estimator $\hat\beta$ requires that the ‘disturbance’ term $(\mu_i + \bar{v}_{i\cdot})$ be uncorrelated with $X_{it}$. This is the case if $\mu_i$ is a random effect but not if $\mu_i$ is a fixed effect.
The between estimator is obtained by specifying the be option of the xtreg command.
This is essentially a cross-section regression. Therefore, cross-sectional heteroskedasticity is the issue of concern. There is no explicit option for heteroskedasticity-robust standard errors, except the vce(bootstrap) option.
The between estimator is based on averages over t, i.e., on the between-i variation only. Hence it is less efficient than other estimators such as RE and MLE.

Comparison of Panel Estimators based on Short Panels
Recall from Chap 3 the three $R^2$ measures reported in Stata:
- Within $R^2$: $\rho^2\{(y_{it} - \bar{y}_{i\cdot}),\ (X_{it} - \bar{X}_{i\cdot})'\hat\beta\}$
- Between $R^2$: $\rho^2(\bar{y}_{i\cdot},\ \bar{X}_{i\cdot}'\hat\beta)$
- Overall $R^2$: $\rho^2(y_{it},\ X_{it}'\hat\beta)$
where $\rho^2(x, y)$ denotes the squared correlation between x and y, and $\hat\beta$ is obtained from one of the xtreg options (be, fe, or re).
Also, Stata reports:
- sigma_u: the standard deviation of the individual effects $\mu_i$;
- sigma_e: the standard deviation of the idiosyncratic error $v_{it}$;
- rho: the fraction of variance due to $\mu_i$, i.e., $\rho = \sigma_\mu^2/(\sigma_\mu^2 + \sigma_v^2)$.
In RE estimation, there is a theta option, which reports
$$\theta = 1 - \sqrt{\frac{\sigma_v^2}{T\sigma_\mu^2 + \sigma_v^2}}.$$
For pooled OLS estimation, $\hat\theta = 0$; for within estimation, $\hat\theta = 1$; for RE, $\hat\theta \to 1$ as T or $\sigma_\mu^2$ gets large.
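The quantities rho and theta can be computed directly from sigma_u and sigma_e. The plain-Python sketch below assumes that rho is the share of variance due to $\mu_i$, that is $\sigma_\mu^2/(\sigma_\mu^2+\sigma_v^2)$, and that theta takes the usual RE quasi-demeaning form with a square root, $1-\sqrt{\sigma_v^2/(T\sigma_\mu^2+\sigma_v^2)}$. The function names and input values are hypothetical.

```python
import math


def rho(sigma_u, sigma_e):
    """Fraction of the total error variance due to the individual effect."""
    return sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)


def theta(sigma_u, sigma_e, T):
    """RE quasi-demeaning weight (square-root form assumed in this sketch)."""
    return 1.0 - math.sqrt(sigma_e ** 2 / (T * sigma_u ** 2 + sigma_e ** 2))


print(rho(1.0, 1.0))            # 0.5: equal variance components
print(theta(0.0, 1.0, 7))       # 0.0: no individual effect, RE collapses to pooled OLS
print(theta(1.0, 1.0, 3))       # 0.5
print(theta(1.0, 1.0, 10**6))   # approaches 1 (the within estimator) as T grows
```

The two limiting cases reproduce the statement in the text: theta equal to 0 corresponds to pooled OLS and theta close to 1 corresponds to within estimation.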

We compare some of the panel estimators and the associated standard errors, variance components estimates, and $R^2$. Note: pooled OLS is the same as the xtreg command with the corr(independence) and pa options. The Stata commands are:
. * Compare OLS, BE, FE, RE estimators, and methods to compare standard errors
. global xlist exp expsq wks ed
. quietly regress lwage $xlist, vce(cluster id)
. estimates store OLS_rob
. quietly xtreg lwage $xlist, be
. estimates store BE
. quietly xtreg lwage $xlist, fe
. estimates store FE
. quietly xtreg lwage $xlist, fe vce(robust)
. estimates store FE_rob
. quietly xtreg lwage $xlist, re
. estimates store RE
. quietly xtreg lwage $xlist, re vce(robust)
. estimates store RE_rob
. estimates table OLS_rob BE FE FE_rob RE RE_rob,
> b se stats(N r2 r2_o r2_b r2_w sigma_u sigma_e rho) b(%7.4f)

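Output: a table with columns OLS_rob, BE, FE, FE_rob, RE and RE_rob. The rows give coefficients and standard errors for exp, expsq, wks, ed (omitted in the FE and FE_rob columns) and _cons, followed by the statistics N, r2, r2_o, r2_b, r2_w, sigma_u, sigma_e and rho. Numeric entries are not preserved in this transcript.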

5.3. Robust Estimation for Long Panels
The methods considered up to now have focused on short panels. Now we consider long panels, with many time periods for relatively few individuals (N is small and T is large).
The individual fixed effects, if desired, can be easily handled by including dummy variables for each individual as regressors.
With long panels ($T\to\infty$), there is an issue of stationarity. Here we consider only methods for stationary errors, with the cases of unit roots and cointegration being briefly mentioned.
When T is large, one cannot rely on cluster-robust standard errors as in the short-panel case. Instead, it is necessary to specify a model for serial correlation in the error.
Typical Stata commands for analyzing long panels include xtregar, xtpcse, xtgls and xtscc, with their respective options.
We will use the well-known cigarette demand data for illustrations.

Cigarette Demand Data
Recall the cigarette demand data introduced in Chap. 1: a panel of 46 states in the United States over 30 years (1963 to 1992), given on the Wiley website for Baltagi (2005). Variables in the data file Cigar.txt are:
(1) STATE = State abbreviation.
(2) YR = Year.
(3) Price per pack of cigarettes.
(4) Population.
(5) Population above the age of 16.
(6) CPI = Consumer price index (1983 = 100).
(7) NDI = Per capita disposable income.
(8) C = Cigarette sales in packs per capita.
(9) PIMIN = Minimum price in adjoining states per pack of cigarettes.
Several time dummies corresponding to the major policy interventions in 1965, 1968 and 1971 can be added to the model.
To reflect the long-panel nature, we use only the first 10 states.

Serial Correlation & Heteroskedasticity in Long Panels
Consider the one-way effects model:
$$y_{it} = \alpha + X_{it}'\beta + \mu_i + v_{it}, \qquad i = 1, \dots, N,\ t = 1, \dots, T.$$
As N is now small, the individual effects can be merged into $X_{it}$ in the form of dummies, so that the model reduces to:
$$y_{it} = \mathbf{X}_{it}'\boldsymbol{\beta} + v_{it}, \qquad i = 1, \dots, N,\ t = 1, \dots, T, \tag{5.5}$$
where the regressors $\mathbf{X}_{it}$ include an intercept, and may also include individual dummies, time and possibly time squared, giving a model like a regular multiple linear regression model.
The focal point for a long panel is the serial correlation of $v_{it}$ over t. A model for it has to be specified because T is large.
As N is small, one can be more flexible about the cross-sectional relations: heteroskedasticity and cross-section correlation.

A simple way to model serial correlation is to allow for first-order autoregressive disturbances, i.e., AR(1), in (5.5):
$$v_{it} = \rho_i v_{i,t-1} + \epsilon_{it}, \qquad i = 1, \dots, N,\ t = 1, \dots, T,$$
where the autoregressive parameter may vary with i, with $|\rho_i| < 1$.
Also, the remainder errors $\epsilon_{it}$ are assumed to be normal with mean zero and a general VC matrix that allows for possible heteroskedasticity and cross-sectional correlation:
$$E(\epsilon\epsilon') = \Sigma \otimes I_T, \qquad \epsilon' = (\epsilon_{11}, \dots, \epsilon_{1T}, \dots, \epsilon_{N1}, \dots, \epsilon_{NT}),$$
where $\Sigma$ is $N\times N$ with elements $\sigma_{ij}^2$. Two special cases are of interest:
- $\Sigma$ is diagonal (i.e., $\sigma_{ij}^2 = 0$ for $i \neq j$): only heteroskedasticity;
- $\Sigma$ is diagonal and, further, all of the $\rho_i$ are equal to $\rho$.
Under exogeneity of $\mathbf{X}_{it}$ in (5.5), OLS is unbiased and consistent.

. xtgls LnC LnP LnNDI LnPmin Year, panels(correlated) corr(psar1)
Output: Cross-sectional time-series FGLS regression; Coefficients: generalized least squares; Panels: heteroskedastic with cross-sectional correlation; Correlation: panel-specific AR(1); Wald chi2(4). The header also lists the numbers of estimated covariances, autocorrelations and coefficients, and the coefficient table reports standard errors and z statistics for LnP, LnNDI, LnPmin, Year and _cons. Numeric entries are not preserved in this transcript.

The xtset command is executed first, before running xtgls:
. xtset State Year
panel variable: State (strongly balanced)
time variable: Year, 63 to 92
delta: 1 unit
All regressors have the expected effects:
- the estimated price elasticity of demand for cigarettes is $-0.3583$;
- the income elasticity is 0.5521;
- demand declines by 2.7% per year (the coefficient of Year is a semielasticity because the dependent variable is in logs);
- the minimum price in the adjoining states does not have a significant effect on the demand in the current state.
There are 10 states, so there are $10\times 11/2 = 55$ unique entries in the $10\times 10$ contemporaneous error covariance matrix $\Sigma$, and 10 autocorrelation parameters $\rho_i$ are estimated!
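The parameter count quoted above can be reproduced with a few lines of arithmetic. The plain-Python sketch below counts the unique entries of a symmetric N x N covariance matrix and adds one AR(1) parameter per state; the function name is hypothetical, and the sample size of 10 states times 30 years follows from the data description.

```python
def nuisance_parameter_count(n_panels, panel_specific_ar1=True):
    """Unique entries of a symmetric Sigma plus the AR(1) parameters."""
    sigma_entries = n_panels * (n_panels + 1) // 2   # diagonal + upper triangle
    ar_params = n_panels if panel_specific_ar1 else 1
    return sigma_entries, ar_params


sigma_entries, ar_params = nuisance_parameter_count(10)
n_obs = 10 * 30                                      # 10 states, 30 years

print(sigma_entries)                  # 55
print(ar_params)                      # 10
print(sigma_entries + ar_params)      # 65 error-structure parameters
print(n_obs)                          # 300 observations
```

Sixty-five error-structure parameters against 300 observations explains the exclamation mark in the text: this flexible specification is only feasible because N is small relative to T.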

The xtgls command performs pooled OLS or FGLS estimation when the data are from a long panel. It allows the errors $v_{it}$ in the model to be correlated over i, allows the use of AR(1) models for $v_{it}$ over t, and allows $v_{it}$ to be heteroskedastic.
An alternative Stata command, xtpcse, yields (long) panel-corrected standard errors (pcse) for the pooled OLS estimator, as well as for the pooled least-squares estimator with an AR(1) model for $v_{it}$.
A third choice, xtscc, generalizes xtpcse by allowing AR(m) errors. It gives Driscoll and Kraay (1998) standard errors for coefficients estimated by pooled OLS/WLS or fixed-effects (within) regression.
Note: xtscc is not installed automatically with Stata. It can be found and installed by following these steps:
- go to help > search and type xtscc;
- in the pop-up window, click the xtscc link;
- then click on “click here to install”.

Options for xtpcse: correlation( ), with choices:
- hetonly: $v_{it}$ are independent but heteroskedastic over i;
- independence: $v_{it}$ are iid;
- ar1: constant $\rho$;
- psar1: different $\rho_i$.
In all cases, panel-corrected standard errors that allow heteroskedasticity and correlation over i are reported.
Options for xtgls:
- panels( ): specifies the error correlation across individuals:
  - iid: $v_{it}$ are iid;
  - heteroskedastic: $v_{it}$ are independent over i, with variance $\sigma_i^2$;
  - correlated: additionally allows correlation over individuals, with independence over time for a given individual.
- corr( ): specifies the serial correlation of errors for each individual:
  - ar1: constant $\rho$;
  - psar1: different $\rho_i$.

Options for xtscc:
- lag(#): set the maximum lag order of autocorrelation; the default is m(T) = floor[4(T/100)^(2/9)];
- fe: perform fixed effects (within) regression;
- re: perform GLS random effects regression;
- pooled: perform pooled OLS/WLS regression; this is the default;
- noconstant: suppress the regression constant in pooled OLS/WLS regressions;
- ase: return (asymptotic) Driscoll-Kraay SE without small-sample adjustment.
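The default lag rule stated for lag(#) is easy to evaluate by hand. The plain-Python sketch below simply codes the formula m(T) = floor[4(T/100)^(2/9)] as given in the text; the function name is hypothetical and the code does not call Stata.

```python
import math


def default_lag(T):
    """Default maximum lag m(T) = floor[4 * (T/100)^(2/9)] from the text."""
    return math.floor(4 * (T / 100) ** (2 / 9))


print(default_lag(30))    # 3 for the 30-year cigarette panel
print(default_lag(100))   # 4
```

For T = 30 the rule gives 3, so the lag(4) used in the illustrations is a slightly more generous choice than the default.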

We now use xtpcse, xtgls and the user-written xtscc (which needs separate installation) to obtain the following pooled estimators and the associated standard errors, with $\rho_i = \rho$ throughout:
1) pooled OLS with iid errors;
2) pooled OLS with standard errors assuming correlation over states;
3) pooled OLS assuming general serial correlation in the error (4 lags) and correlation over states;
4) pooled OLS that assumes an AR(1) error and gets standard errors that additionally permit correlation over states;
5) pooled FGLS with standard errors assuming an AR(1) error; and
6) pooled FGLS assuming an AR(1) error and correlation across states.
. * Comparison of various pooled OLS and GLS estimators
. quietly xtpcse LnC LnP LnNDI LnPmin Year, corr(ind) independent nmk
. estimates store OLS_iid
. quietly xtpcse LnC LnP LnNDI LnPmin Year, corr(ind)
. estimates store OLS_cor
. quietly xtscc LnC LnP LnNDI LnPmin Year, lag(4)
. estimates store OLS_DK
. quietly xtpcse LnC LnP LnNDI LnPmin Year, corr(ar1)
. estimates store AR1_cor
. quietly xtgls LnC LnP LnNDI LnPmin Year, corr(ar1) panels(iid)
. estimates store FGLSAR1
. quietly xtgls LnC LnP LnNDI LnPmin Year, corr(ar1) panels(correlated)
. estimates store FGLSCAR

. estimates table OLS_iid OLS_cor OLS_DK AR1_cor FGLSAR1 FGLSCAR, b(%7.3f) se
Output: a table with columns OLS_iid, OLS_cor, OLS_DK, AR1_cor, FGLSAR1 and FGLSCAR, giving coefficients and standard errors (legend: b/se) for LnP, LnNDI, LnPmin, Year and _cons. Numeric entries are not preserved in this transcript.
For pooled OLS with iid errors, the nmk option normalizes the VCE by $N-k$ rather than N, so that the output is exactly the same as that from regress with default standard errors.
The same could be obtained by using xtgls with the corr(ind) panels(iid) nmk options.

A final illustration is xtscc with the fe option. The default is pooled.
. xtscc LnC LnP LnNDI LnPmin Year, fe lag(4)
Output: Regression with Driscoll-Kraay standard errors; Method: Fixed-effects regression; Group variable (i): State; maximum lag: 4; F(4, 29); within R-squared. The coefficient table reports Drisc/Kraay standard errors and t statistics for LnP, LnNDI, LnPmin, Year and _cons. Numeric entries are not preserved in this transcript.
Compared with the results from xtscc LnC LnP LnNDI LnPmin Year, lag(4), we see that LnPmin becomes significant.

Unit Roots and Cointegration
The methods for long panels considered so far depend on the stationarity of the time series, i.e., $|\rho_i| < 1$, $i = 1, \dots, N$.
The literature on panel methods for unit roots and cointegration is large, and it remains an active area of research.
In standard applications of long panel methods, it is of interest to test for the existence of unit roots and cointegration.
Panel unit-root tests: the Stata command xtunitroot provides tests appropriate for all types of panel data: short, long, or large panels. A detailed treatment of these tests is beyond the scope of the course.
Panel cointegration tests: the Stata command xtcointtest implements a variety of tests for panel data with large N and large T. This seems to be an added feature in Stata 15. Again, a detailed treatment of this topic is beyond the scope of the course.

5.4. Robust Estimation for Large Panels
Consider the one-way effects model:
$$y_{it} = \alpha + X_{it}'\beta + \mu_i + v_{it}, \qquad i = 1, \dots, N,\ t = 1, \dots, T,$$
where both N and T are ‘large’.
The xtreg command works for large panels under iid assumptions on $v_{it}$. An alternative and better procedure, xtregar, allows an AR(1) error $v_{it} = \rho v_{i,t-1} + \epsilon_{it}$ with a common $\rho$.
. * Comparison of various RE and FE estimators with full cigarette demand data
. quietly xtscc LnC LnP LnNDI LnPmin, lag(4)
. estimates store OLS_DK
. quietly xtreg LnC LnP LnNDI LnPmin, fe
. estimates store FE_REG
. quietly xtreg LnC LnP LnNDI LnPmin, re
. estimates store RE_REG
. quietly xtregar LnC LnP LnNDI LnPmin, fe
. estimates store FE_REGAR
. quietly xtregar LnC LnP LnNDI LnPmin, re
. estimates store RE_REGAR
. quietly xtscc LnC LnP LnNDI LnPmin, fe lag(4)
. estimates store FE_DK

. estimates table OLS_DK FE_REG RE_REG FE_REGAR RE_REGAR FE_DK, b(%7.3f) se
Output: a table with columns OLS_DK, FE_REG, RE_REG, FE_RE~R, RE_RE~R and FE_DK, giving coefficients and standard errors (legend: b/se) for LnP, LnNDI, LnPmin and _cons. Numeric entries are not preserved in this transcript.
Indeed, xtregar gives more efficient estimators than xtreg does.
The last set of results, from “xtscc LnC LnP LnNDI LnPmin, fe lag(4)”, are the standard within estimates, but with standard errors that are robust to both spatial and temporal correlation of the error.
However, the standard errors produced by xtscc are much larger, … .
