# Lecture 8 (Ch14) Advanced Panel Data Method

Research Method Lecture 8 (Ch14) Advanced Panel Data Method

Fixed effects estimation
Fixed effects estimation is another method for eliminating the time-invariant unobserved effect. Consider the following model:

$$y_{it} = \beta_0 + \beta_1 x_{it1} + \beta_2 x_{it2} + \dots + \beta_k x_{itk} + a_i + u_{it} \qquad (1)$$

If the fixed effect $a_i$ is correlated with the explanatory variables, the estimated coefficients will be biased.

Thus, we need to eliminate ai from the estimation
First differencing is one method. Another method is the following. First, compute the sample average of each variable for each individual. (That is, for the i-th individual, you compute the time-series sample average of each variable, for example $\bar{y}_i = T^{-1}\sum_{t=1}^{T} y_{it}$.) Then, you have the following:

$$\bar{y}_i = \beta_0 + \beta_1 \bar{x}_{i1} + \beta_2 \bar{x}_{i2} + \dots + \beta_k \bar{x}_{ik} + a_i + \bar{u}_i \qquad (2)$$

Since $a_i$ is constant over time, the $a_i$ term in equation (2) does not have the over-bar.

Now, subtract (2) from (1). Then, you get the following equation.
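$$y_{it} - \bar{y}_i = \beta_1 (x_{it1} - \bar{x}_{i1}) + \beta_2 (x_{it2} - \bar{x}_{i2}) + \dots + \beta_k (x_{itk} - \bar{x}_{ik}) + (u_{it} - \bar{u}_i)$$

Notice that this transformation eliminates the fixed effect $a_i$. This transformation is called the within transformation. Note also that it eliminates the constant $\beta_0$ as well. Now, we simplify the notation by writing the above equation as:

$$\ddot{y}_{it} = \beta_1 \ddot{x}_{it1} + \beta_2 \ddot{x}_{it2} + \dots + \beta_k \ddot{x}_{itk} + \ddot{u}_{it} \qquad (3)$$

where $\ddot{y}_{it} = y_{it} - \bar{y}_i$. This is called the time-demeaned data on y. The same notation is used for the x-variables and u.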

Finally, estimate the demeaned equation (3) using OLS
This is called fixed effect estimation. To repeat, you simply run OLS on the following equation:

$$\ddot{y}_{it} = \beta_1 \ddot{x}_{it1} + \beta_2 \ddot{x}_{it2} + \dots + \beta_k \ddot{x}_{itk} + \ddot{u}_{it}$$

Note that this model has no intercept.

The standard error for the fixed effect estimator
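Now, define the fixed effects residual, using the time-demeaned variables $\ddot{y}_{it} = y_{it} - \bar{y}_i$ (and likewise for each x), as

$$\hat{u}_{it} = \ddot{y}_{it} - \hat{\beta}_1 \ddot{x}_{it1} - \dots - \hat{\beta}_k \ddot{x}_{itk}$$

Then, an unbiased estimator of the error variance $\sigma_u^2$ is given by

$$\hat{\sigma}_u^2 = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T} \hat{u}_{it}^2}{NT - N - k}$$

where
NT = total number of observations (T is the number of periods and N is the number of cross-sectional units),
k = number of parameters excluding the intercept,
N = number of cross-sectional units (individuals, firms, etc.).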

After computing the estimated error variance $\hat{\sigma}_u^2$, you can compute the standard errors for the parameters by applying the formula given in Handout 2. Notice that, if you manually create the time-demeaned variables and apply OLS, the usual statistical software will compute the degrees of freedom as NT-k instead of NT-N-k. This will understate the standard errors. In this case, you have to correct the standard errors by multiplying each of them by

$$\sqrt{\frac{NT - k}{NT - N - k}}$$

Fortunately, STATA has a command that estimates the fixed effect model automatically with correct standard errors.

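Estimating ai
Sometimes (though not often), $a_i$ itself is of interest. It can easily be estimated as:

$$\hat{a}_i = \bar{y}_i - \hat{\beta}_1 \bar{x}_{i1} - \dots - \hat{\beta}_k \bar{x}_{ik}, \qquad i = 1, \dots, N$$

When you estimate a fixed effect model using STATA, STATA reports an "intercept". Remember that the fixed effect model does not have an intercept. What STATA is reporting is the average value of $\hat{a}_i$ across $i$.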
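As a minimal sketch of how the estimated firm effects could be recovered in STATA, the commands below use the JTRAIN variable names that appear in the do file later in this lecture (`lscrap`, `grant`, `d88`, `d89`, `fcode`). The time variable name `year` and the new variable names `nu_i` and `a_hat` are assumptions made for illustration. After `xtreg, fe`, `predict ..., u` returns each unit's effect measured as a deviation from the reported intercept, so adding the intercept back gives the estimate in the notation used here.

```stata
* Sketch only: assumes the panel is identified by fcode and a time
* variable called year (the name "year" is an assumption).
xtset fcode year
xtreg lscrap grant d88 d89, fe

* predict, u gives the unit effect as a deviation from _cons
predict double nu_i, u

* Add the reported intercept back to obtain a_hat in the lecture's notation
gen double a_hat = _b[_cons] + nu_i

* The average of nu_i over the estimation sample should be (about) zero
summarize nu_i a_hat if e(sample)
```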

Example
JTRAIN.dta is a three-year panel data set. In the first-differenced model, we used only the first two years. Now use all three years and estimate the following model.

$$\log(\text{scrap})_{it} = \beta_0 + \beta_1 (\text{grant})_{it} + \beta_2 \log(\text{sales})_{it} + \beta_3 \log(\text{\#employees})_{it} + \beta_4 (\text{year88})_{it} + \beta_5 (\text{year89})_{it} + a_i + u_{it}$$

Ex1. Estimate the model using OLS, ignoring the presence of the fixed effect.
Ex2. Estimate the model using the fixed effect model.

Ex1. OLS result

Fixed effect model

Ex3. The fixed effect model above did not show statistically significant effects of the grant. This is probably because it takes some time for the effect of grants to appear. To capture this possibility, include the lag of grant. That is, estimate the following model.

$$\log(\text{scrap})_{it} = \beta_0 + \beta_1 (\text{grant})_{it} + \beta_2 (\text{grant})_{i,t-1} + \beta_3 \log(\text{sales})_{it} + \beta_4 \log(\text{\#employees})_{it} + \beta_5 (\text{year88})_{it} + \beta_6 (\text{year89})_{it} + a_i + u_{it}$$

Here $(\text{grant})_{i,t-1}$ is the one-year lag of the grant. This is called the distributed lag model. The lag of the grant captures the effect of receiving a grant last year on this year's scrap rate.
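The distributed lag specification could be estimated along the following lines. This is a sketch under assumptions: it supposes that the data set already contains a lagged grant variable named `grant_1` and log variables named `lsales` and `lemploy`, and that the time variable is called `year`. None of these names is confirmed by the lecture, so check them against your copy of JTRAIN.dta before running.

```stata
* Sketch only: grant_1, lsales, lemploy and year are assumed variable names.
xtset fcode year

* Fixed effect model with the current grant and its one-year lag
xtreg lscrap grant grant_1 lsales lemploy d88 d89, fe
```

If no lagged variable is available, STATA's lag operator (`L.grant`, valid after `xtset`) creates one, but the first year of each firm then has a missing lag and drops out of the estimation sample.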

Fixed effect model with one year lag of the grant
The lag of grant has a greater effect than the current grant. This indicates that it takes time for the effect to appear.

Ex4. Finally, estimate the following fixed effect model by manually creating the time-demeaned variables. This is a good exercise for understanding the exact procedure of fixed effect estimation.

$$\log(\text{scrap})_{it} = \beta_0 + \beta_1 (\text{grant})_{it} + \beta_2 (\text{year88})_{it} + \beta_3 (\text{year89})_{it} + a_i + u_{it}$$

Fixed effect estimated automatically
Fixed effect estimated by manually creating time-demeaned variables. Note the standard errors are wrong, so you have to correct them.

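The do file

```stata
*****************************
* Manually estimating the   *
* fixed effect model        *
*****************************
sort fcode
by fcode: egen meanlscrap=mean(lscrap)
gen dmlscrap=lscrap-meanlscrap
by fcode: egen meangrant=mean(grant)
gen dmgrant=grant-meangrant
by fcode: egen meand88=mean(d88)
gen dmd88=d88-meand88
by fcode: egen meand89=mean(d89)
gen dmd89=d89-meand89
*******************
* Estimate the model *
*******************
* OLS on the demeaned data (standard errors need correcting)
reg dmlscrap dmgrant dmd88 dmd89
* Built-in fixed effect estimator; xtreg needs the panel declared first
* (the time variable name "year" is assumed here)
xtset fcode year
xtreg lscrap grant d88 d89, fe
```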
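The standard error correction for the manual regression can be sketched as follows. The code assumes the demeaned variables created in the do file above already exist. The scalar and variable names (`df_ols`, `NT`, `k`, `Nfirms`, `adj`, `firm_tag`) are made up for illustration. Rather than hard-coding NT-k, it reads the residual degrees of freedom that `reg` actually used from `e(df_r)`, so the adjustment stays valid whether or not a constant was included.

```stata
* Sketch only: run after the do file above has created the dm* variables.
quietly reg dmlscrap dmgrant dmd88 dmd89
scalar df_ols = e(df_r)   // residual degrees of freedom used by reg
scalar NT     = e(N)      // number of observations in the estimation sample
scalar k      = 3         // slope parameters: grant, d88, d89

* Count the firms (N) that actually enter the regression
egen byte firm_tag = tag(fcode) if e(sample)
quietly count if firm_tag == 1
scalar Nfirms = r(N)

* Multiply each OLS standard error by this factor
scalar adj = sqrt(df_ols / (NT - Nfirms - k))
display "adjustment factor    = " adj
display "corrected se(grant)  = " _se[dmgrant] * adj
display "corrected se(year88) = " _se[dmd88] * adj
display "corrected se(year89) = " _se[dmd89] * adj
```

The corrected values should agree with the standard errors reported by `xtreg lscrap grant d88 d89, fe`.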

Note
When you estimate the fixed effect model, it is a good idea to tell your audience what the potential fixed effect would be and whether it is correlated with the explanatory variables. Of course, one can never tell exactly what the fixed effect is, since it is the aggregate of all the unobserved effects. However, if you explain what is contained in the fixed effect, your audience can understand the potential direction of the bias and why you need to use the fixed effect model.

The dummy variable regression
Consider again the following model.

$$\log(\text{scrap})_{it} = \beta_0 + \beta_1 (\text{grant})_{it} + \beta_2 (\text{year88})_{it} + \beta_3 (\text{year89})_{it} + a_i + u_{it}$$

We learned that the fixed effect model can correct for the biases arising from the correlation between $a_i$ and the explanatory variables.

Now, consider instead that you include all the firm dummy variables in the model and estimate it using the usual OLS. It is known that the slope coefficients and their standard errors obtained from this procedure are exactly the same as those obtained from the fixed effect estimation. The coefficients on the dummy variables will be the same as the fixed effect estimates of $a_i$ (provided a dummy is included for every firm and the common intercept is dropped; otherwise they are measured relative to the omitted firm).

However, note that the coefficients on the dummy variables are not consistent when the number of periods (T) is fixed and the number of firms (N) gets large. This is because, as N gets large, the number of $a_i$ to be estimated increases as well, so no additional information accumulates on each $a_i$.
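The equivalence between the dummy variable regression and the fixed effect estimation can be checked with a short sketch like the one below. It uses STATA's factor-variable notation `i.fcode` to generate the firm dummies and again assumes the time variable is called `year`. With `i.fcode`, STATA keeps the intercept and omits one base firm, so the dummy coefficients are measured relative to that firm, while the slope coefficients and their standard errors are the ones to compare.

```stata
* Sketch only: compare the two outputs line by line.
xtset fcode year

* (a) OLS with a set of firm dummies (one base firm is omitted)
reg lscrap grant d88 d89 i.fcode

* (b) Fixed effect estimation
xtreg lscrap grant d88 d89, fe
```

The coefficients and standard errors on `grant`, `d88` and `d89` should be identical in (a) and (b).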

The Random Effect Estimation
Consider the following unobserved effect model.

$$y_{it} = \beta_0 + \beta_1 x_{it1} + \beta_2 x_{it2} + \dots + \beta_k x_{itk} + a_i + u_{it} \qquad (1)$$

Previously, we applied the fixed effect estimation because we suspected that $a_i$ is correlated with some of the explanatory variables. But if we can assume that $a_i$ is not correlated with any of the explanatory variables, we can estimate the model more efficiently (i.e., get smaller standard errors).

When ai are not correlated with any of the explanatory variables, pooled OLS will be consistent.
But the problem now is serial correlation. Write the composite error term as $v_{it} = a_i + u_{it}$. For a given person i, the composite error term of this period and those of other periods are correlated, because each of them contains $a_i$.

To be more precise, assume the following.
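$$\mathrm{Cov}(x_{itj}, a_i) = 0 \quad \text{for } t = 1, 2, \dots, T \text{ and } j = 1, 2, \dots, k$$

That is, $a_i$ is uncorrelated with all the explanatory variables in all the periods. In addition, we assume that $a_i$ and the idiosyncratic errors in all the periods are uncorrelated. Then we can show the following for the composite error $v_{it} = a_i + u_{it}$:

$$\mathrm{Corr}(v_{it}, v_{is}) = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2}, \qquad t \neq s$$

where $\sigma_a^2 = \mathrm{Var}(a_i)$ and $\sigma_u^2 = \mathrm{Var}(u_{it})$. Proof: See the front board.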
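For readers without access to the board, the argument can be sketched as follows. The sketch writes the composite error as $v_{it} = a_i + u_{it}$ and additionally assumes that the idiosyncratic errors $u_{it}$ are serially uncorrelated with constant variance, which is the usual random effect assumption but is not stated explicitly above.

$$\mathrm{Var}(v_{it}) = \mathrm{Var}(a_i) + \mathrm{Var}(u_{it}) + 2\,\mathrm{Cov}(a_i, u_{it}) = \sigma_a^2 + \sigma_u^2$$

$$\mathrm{Cov}(v_{it}, v_{is}) = \mathrm{Var}(a_i) + \mathrm{Cov}(a_i, u_{is}) + \mathrm{Cov}(u_{it}, a_i) + \mathrm{Cov}(u_{it}, u_{is}) = \sigma_a^2, \qquad t \neq s$$

$$\mathrm{Corr}(v_{it}, v_{is}) = \frac{\mathrm{Cov}(v_{it}, v_{is})}{\sqrt{\mathrm{Var}(v_{it})\,\mathrm{Var}(v_{is})}} = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2}$$

The correlation is positive whenever $\sigma_a^2 > 0$ and does not die out as the two periods get further apart, because the same $a_i$ appears in every period.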

Here is a way to eliminate the serial correlation.
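Consider the following, where $v_{it} = a_i + u_{it}$ is the composite error term and $\bar{v}_i$ is its time average.

$$\lambda = 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T\sigma_a^2}}$$

Then, the terms $v_{it} - \lambda \bar{v}_i$ are not serially correlated. Thus, first consider the time-averaged equation multiplied by $\lambda$:

$$\lambda \bar{y}_i = \lambda \beta_0 + \beta_1 \lambda \bar{x}_{i1} + \dots + \beta_k \lambda \bar{x}_{ik} + \lambda \bar{v}_i \qquad (2)$$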

Then, subtract (2) from (1) to get,
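$$y_{it} - \lambda \bar{y}_i = \beta_0 (1 - \lambda) + \beta_1 (x_{it1} - \lambda \bar{x}_{i1}) + \dots + \beta_k (x_{itk} - \lambda \bar{x}_{ik}) + (v_{it} - \lambda \bar{v}_i) \qquad (3)$$

As can be seen, the composite error term is $v_{it} - \lambda \bar{v}_i$, and we know that this error term has no serial correlation. The transformed data are called the quasi-demeaned data. Therefore, if we apply OLS to (3), we get the correct standard errors. One problem is that $\lambda$ is an unknown parameter, so it has to be estimated. The procedure for estimating $\lambda$ is the following.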

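1. Estimate (1) using pooled OLS and obtain the residuals $\hat{v}_{it}$. Then estimate $\sigma_a^2$, $\sigma_v^2$ and $\sigma_u^2$ as:

$$\hat{\sigma}_a^2 = \frac{1}{NT(T-1)/2 - (k+1)} \sum_{i=1}^{N} \sum_{t=1}^{T-1} \sum_{s=t+1}^{T} \hat{v}_{it} \hat{v}_{is}$$

$$\hat{\sigma}_v^2 = \frac{1}{NT - (k+1)} \sum_{i=1}^{N} \sum_{t=1}^{T} \hat{v}_{it}^2, \qquad \hat{\sigma}_u^2 = \hat{\sigma}_v^2 - \hat{\sigma}_a^2$$

Here $\hat{\sigma}_v^2$ is just the estimate of sigma-squared from the pooled OLS of (1).
2. Then estimate $\lambda$ as:

$$\hat{\lambda} = 1 - \sqrt{\frac{\hat{\sigma}_u^2}{\hat{\sigma}_u^2 + T\hat{\sigma}_a^2}}$$

3. Finally, replace $\lambda$ in equation (3) with $\hat{\lambda}$ and estimate the equation using OLS. This is called the Random Effect Estimation.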
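To get a feel for the size of the quasi-demeaning parameter, here is a small worked calculation. It uses $\lambda = 1 - \sqrt{\sigma_u^2 / (\sigma_u^2 + T\sigma_a^2)}$, the standard random effect weight, and the numbers are purely hypothetical, chosen for illustration and not taken from any data set: $\hat{\sigma}_u^2 = 0.12$, $\hat{\sigma}_a^2 = 0.10$ and $T = 8$.

$$\hat{\lambda} = 1 - \sqrt{\frac{0.12}{0.12 + 8 \times 0.10}} = 1 - \sqrt{\frac{0.12}{0.92}} \approx 1 - 0.361 = 0.639$$

Two limiting cases help with interpretation. If $\sigma_a^2 = 0$, then $\lambda = 0$ and the quasi-demeaned equation is just pooled OLS. As $T\sigma_a^2 / \sigma_u^2$ grows, $\lambda$ approaches 1 and the quasi-demeaned equation approaches the fully time-demeaned (fixed effect) equation. The random effect estimator therefore sits between pooled OLS and fixed effect.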

Example
Estimate a log wage equation using WAGEPAN.dta. Include in the model education, black, hispan, exper, exper squared, married, union, and a full set of year dummies.
First, estimate the model using OLS.
Next, estimate the model using the random effect.
Finally, estimate the model using the fixed effect model. Why does STATA drop some of the variables?
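The three estimations could be run roughly as follows. This is a sketch under assumptions: the variable names (`nr` for the person identifier, `year`, `lwage`, `educ`, `black`, `hisp`, `exper`, `expersq`, `married`, `union`) are guesses at how the data set labels these variables and should be checked against your copy of WAGEPAN.dta. The year dummies are generated with the factor-variable notation `i.year`.

```stata
* Sketch only: all variable names below are assumptions.
xtset nr year

* Pooled OLS
reg lwage educ black hisp exper expersq married union i.year

* Random effect; the theta option asks STATA to report its estimate of
* the quasi-demeaning parameter (called lambda in this lecture)
xtreg lwage educ black hisp exper expersq married union i.year, re theta

* Fixed effect
xtreg lwage educ black hisp exper expersq married union i.year, fe
```

STATA's built-in random effect routine may estimate the variance components differently from the pooled-OLS-based steps described above, so its reported value need not match a hand calculation exactly.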

OLS

Random Effect

Fixed effect

Fixed effect or random effect
Fixed effect estimation allows arbitrary correlation between ai and explanatory variables. Random effect is valid only if ai are uncorrelated with any of the explanatory variables. When you conduct a policy analysis, correlation should be considered as the rule rather than the exception. Thus fixed effect is almost always more convincing than the random effect.

But if the policy variable is set experimentally, then you might apply random effect. For example, suppose that you want to know the effect of class size on students' achievement. If students are randomly assigned to classes of different sizes, then random effect can be applied. However, again, this kind of situation is rare. So, the usual recommendation is to use the fixed effect method.