Lecture 8 (Ch14) Advanced Panel Data Method

Research Method Lecture 8 (Ch14) Advanced Panel Data Method

Fixed effects estimation
Fixed effects estimation is another method for eliminating the time-invariant unobserved effect. Consider the following model:
$$y_{it} = \beta_0 + \beta_1 x_{it1} + \beta_2 x_{it2} + \dots + \beta_k x_{itk} + a_i + u_{it} \qquad (1)$$
If the fixed effect $a_i$ is correlated with the explanatory variables, OLS that ignores $a_i$ gives biased coefficient estimates.

Thus, we need to eliminate $a_i$ from the estimation. First differencing is one method. Another method is the following. First, compute the sample average of each variable for each individual (that is, for the $i$th individual, compute the time-series average of each variable). Then you have the following:
$$\bar y_i = \beta_0 + \beta_1 \bar x_{i1} + \beta_2 \bar x_{i2} + \dots + \beta_k \bar x_{ik} + a_i + \bar u_i \qquad (2)$$
where $\bar y_i = T^{-1}\sum_{t=1}^{T} y_{it}$, and similarly for the x-variables and u. Since $a_i$ is constant over time, the $a_i$ term in equation (2) does not have the over-bar.

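Now, subtract (2) from (1). Then you get the following equation:
$$y_{it} - \bar y_i = \beta_1 (x_{it1} - \bar x_{i1}) + \dots + \beta_k (x_{itk} - \bar x_{ik}) + (u_{it} - \bar u_i)$$
Notice that this transformation eliminates the fixed effect $a_i$. It is called the within transformation. Note also that it eliminates the constant $\beta_0$ as well. Now, simplify the notation by writing the above equation as:
$$\ddot y_{it} = \beta_1 \ddot x_{it1} + \dots + \beta_k \ddot x_{itk} + \ddot u_{it} \qquad (3)$$
where $\ddot y_{it} = y_{it} - \bar y_i$. This is called the time-demeaned data on y. The same notation is used for the x-variables and u.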

Finally, estimate the demeaned equation (3) using OLS. This is called fixed effects estimation. To repeat, you simply run OLS on the following equation:
$$\ddot y_{it} = \beta_1 \ddot x_{it1} + \dots + \beta_k \ddot x_{itk} + \ddot u_{it}$$
Note that this model has no intercept.
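As a rough illustration of the within transformation followed by OLS, here is a minimal Python sketch (the lecture itself uses STATA). The toy panel, the variable names and the helper functions are all hypothetical; the data are built so that the true slope is 2 and the unit effect $a_i$ is larger for units with larger x.

```python
# Toy balanced panel (hypothetical data): 3 units observed for 3 periods.
# y_it = 2 * x_it + a_i, and a_i is positively related to x.
unit = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
a = {1: 0.0, 2: 10.0, 3: 20.0}
y = [2.0 * xi + a[i] for i, xi in zip(unit, x)]


def unit_means(ids, values):
    """Time-series average of `values` for each cross-sectional unit."""
    totals, counts = {}, {}
    for i, v in zip(ids, values):
        totals[i] = totals.get(i, 0.0) + v
        counts[i] = counts.get(i, 0) + 1
    return {i: totals[i] / counts[i] for i in totals}


def time_demean(ids, values):
    """Within transformation: subtract each unit's own time average."""
    means = unit_means(ids, values)
    return [v - means[i] for i, v in zip(ids, values)]


def slope_with_intercept(xs, ys):
    """Simple-regression OLS slope when an intercept is included."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    return sxy / sxx


def slope_no_intercept(xs, ys):
    """OLS slope through the origin, as in the demeaned equation."""
    return sum(xi * yi for xi, yi in zip(xs, ys)) / sum(xi * xi for xi in xs)


pooled_slope = slope_with_intercept(x, y)  # pooled OLS, ignores a_i
fe_slope = slope_no_intercept(time_demean(unit, x), time_demean(unit, y))

print("pooled OLS slope:", pooled_slope)
print("fixed effects slope:", fe_slope)
```

On this toy panel the pooled OLS slope is 5, while the fixed effects slope recovers the true value 2, because demeaning removes $a_i$ before the regression is run.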

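The standard error for the fixed effect estimator
Now, define the fixed effects residual as
$$\hat{\ddot u}_{it} = \ddot y_{it} - \hat\beta_1 \ddot x_{it1} - \dots - \hat\beta_k \ddot x_{itk}$$
Then the unbiased estimator of the error variance $\sigma_u^2$ is given by
$$\hat\sigma_u^2 = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T} \hat{\ddot u}_{it}^2}{NT - N - k}$$
where
NT = total # of observations (T is the # of periods, and N is the # of cross-sectional units),
k = # of parameters excluding the intercept,
N = # of cross-sectional units (# of individuals, firms, etc.).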

After computing the estimated variance $\hat\sigma_u^2$, you can compute the standard errors for the parameters by applying the formula given in Handout 2. Notice that, if you manually create the time-demeaned variables and apply OLS, the usual statistical software will compute the degrees of freedom as NT-k (or NT-k-1 if the regression includes an intercept). This will understate the standard errors. In this case, you have to correct the standard errors by multiplying each one by $\sqrt{(NT-k)/(NT-N-k)}$. Fortunately, STATA has a command that estimates the fixed effect model automatically with correct standard errors.
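The degrees-of-freedom correction can be sketched in a few lines of Python. The function name, its arguments and the sample sizes below are hypothetical and not taken from the lecture's data; the only thing assumed from the text is that the software divides by NT-k (or NT-k-1 with an intercept) while the correct divisor is NT-N-k.

```python
import math


def fe_se_correction(n_units, n_periods, n_regressors, intercept_included=False):
    """Factor by which to multiply standard errors from OLS on manually
    demeaned data so that they use NT - N - k degrees of freedom."""
    nt = n_units * n_periods
    # Degrees of freedom the software uses without knowing about demeaning.
    naive_df = nt - n_regressors - (1 if intercept_included else 0)
    # Degrees of freedom appropriate for the fixed effects estimator.
    correct_df = nt - n_units - n_regressors
    if correct_df <= 0:
        raise ValueError("need NT - N - k > 0")
    return math.sqrt(naive_df / correct_df)


# Hypothetical panel: N = 100 units, T = 3 periods, k = 2 regressors.
factor = fe_se_correction(n_units=100, n_periods=3, n_regressors=2)
naive_se = 0.10  # hypothetical standard error reported by the software
corrected_se = naive_se * factor
print("correction factor:", factor)
print("corrected standard error:", corrected_se)
```

The factor is always greater than 1, and it is largest when T is small, since then N is a large share of NT.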

Estimating ai
Sometimes (though not often), $a_i$ itself is of interest. It can easily be estimated as:
$$\hat a_i = \bar y_i - \hat\beta_1 \bar x_{i1} - \dots - \hat\beta_k \bar x_{ik}$$
When you estimate a fixed effect model using STATA, STATA reports an "intercept". Remember that the fixed effect model does not have an intercept. What STATA is reporting is the average value of $\hat a_i$.
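A minimal Python sketch of this step, for a single regressor, is given below. The toy data and the function name are hypothetical, and the slope estimate of 2 is simply assumed to be the fixed effects estimate for this toy panel.

```python
def estimate_unit_effects(ids, xs, ys, beta_hat):
    """a_i hat = (time average of y_i) - beta_hat * (time average of x_i)."""
    sums = {}
    for i, xi, yi in zip(ids, xs, ys):
        sx, sy, n = sums.get(i, (0.0, 0.0, 0))
        sums[i] = (sx + xi, sy + yi, n + 1)
    return {i: sy / n - beta_hat * sx / n for i, (sx, sy, n) in sums.items()}


# Hypothetical balanced panel with true model y_it = 2 * x_it + a_i.
unit = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
y = [2.0, 4.0, 6.0, 18.0, 20.0, 22.0, 34.0, 36.0, 38.0]

a_hat = estimate_unit_effects(unit, x, y, beta_hat=2.0)
# The "intercept" reported alongside fixed effects output corresponds to
# the average of the estimated unit effects.
reported_intercept = sum(a_hat.values()) / len(a_hat)
print(a_hat)
print("average of a_i hat:", reported_intercept)
```

Here the estimated unit effects are 0, 10 and 20, and their average, 10, plays the role of the reported intercept.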

Example
JTRAIN.dta is a three-year panel data set. In the first-differenced model, we used only the first two years. Now use all three years and estimate the following model:
$$\log(\text{scrap})_{it} = \beta_0 + \beta_1 \text{grant}_{it} + \beta_2 \log(\text{sales})_{it} + \beta_3 \log(\#\text{employees})_{it} + \beta_4 \text{year88}_{it} + \beta_5 \text{year89}_{it} + a_i + u_{it}$$
Ex1. Estimate the model using OLS, ignoring the presence of the fixed effect.
Ex2. Estimate the model using the fixed effect model.

Ex1. OLS result

Fixed effect model

Ex3. The fixed effect model above did not show a statistically significant effect of the grant. This is probably because it takes some time for the effect of a grant to appear. To capture this possibility, include the one-year lag of grant. That is, estimate the following model:
$$\log(\text{scrap})_{it} = \beta_0 + \beta_1 \text{grant}_{it} + \beta_2 \text{grant}_{i,t-1} + \beta_3 \log(\text{sales})_{it} + \beta_4 \log(\#\text{employees})_{it} + \beta_5 \text{year88}_{it} + \beta_6 \text{year89}_{it} + a_i + u_{it}$$
Here $\text{grant}_{i,t-1}$ is the one-year lag of the grant. This is called the distributed lag model. The lag of the grant captures the effect of receiving a grant last year on this year's scrap rate.
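Building a lagged variable in a panel requires that the lag be taken within each firm, never across firms. The following Python sketch shows the idea on hypothetical rows; the field names (`firm`, `year`, `grant`, `grant_lag1`) and the helper function are illustrative assumptions, not the variable names in JTRAIN.dta.

```python
def add_lag(rows, unit_key, time_key, var, lag_name):
    """Return copies of `rows` with the previous period's value of `var`
    for the same unit stored under `lag_name` (None if unavailable)."""
    lookup = {(r[unit_key], r[time_key]): r[var] for r in rows}
    lagged_rows = []
    for r in rows:
        new_row = dict(r)
        # Look up the same unit one period earlier; missing gives None.
        new_row[lag_name] = lookup.get((r[unit_key], r[time_key] - 1))
        lagged_rows.append(new_row)
    return lagged_rows


# Hypothetical panel of two firms over three years.
rows = [
    {"firm": "A", "year": 1987, "grant": 0},
    {"firm": "A", "year": 1988, "grant": 1},
    {"firm": "A", "year": 1989, "grant": 0},
    {"firm": "B", "year": 1987, "grant": 0},
    {"firm": "B", "year": 1988, "grant": 0},
    {"firm": "B", "year": 1989, "grant": 1},
]

lagged = add_lag(rows, "firm", "year", "grant", "grant_lag1")
for r in lagged:
    print(r["firm"], r["year"], r["grant"], r["grant_lag1"])
```

The first year of each firm has no lag available in this sketch. How to treat that year (drop it, or fill in a known value) is a modelling choice that the sketch leaves open.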

Fixed effect model with one year lag of the grant
The lag of grant has a greater effect than the current grant. This indicates that it takes time for the effect to appear.

Ex4. Finally, estimate the following fixed effect model by manually creating the time-demeaned variables. This is a good exercise for understanding the exact procedure of fixed effect estimation.
$$\log(\text{scrap})_{it} = \beta_0 + \beta_1 \text{grant}_{it} + \beta_2 \text{year88}_{it} + \beta_3 \text{year89}_{it} + a_i + u_{it}$$

Fixed effect estimated automatically
Fixed effect estimated by manually creating time-demeaned variables
Note that the standard errors in the manual version are wrong, so you have to correct them.

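The do file
*****************************
* Manually estimating the   *
* fixed effect model        *
*****************************
* Time average of each variable within firm, then the demeaned variable
sort fcode
by fcode: egen meanlscrap=mean(lscrap)
gen dmlscrap=lscrap-meanlscrap
by fcode: egen meangrant=mean(grant)
gen dmgrant=grant-meangrant
by fcode: egen meand88=mean(d88)
gen dmd88=d88-meand88
by fcode: egen meand89=mean(d89)
gen dmd89=d89-meand89
*******************
* Estimate the model
*******************
* OLS on the demeaned data (standard errors need the correction)
reg dmlscrap dmgrant dmd88 dmd89
* Built-in fixed effect estimator (assumes the panel has already been declared, e.g. with xtset)
xtreg lscrap grant d88 d89, fe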

Note: when you estimate the fixed effect model, it is a good idea to tell your audience what the potential fixed effect would be and whether it is likely to be correlated with the explanatory variables. Of course, one can never tell exactly what the fixed effect is, since it is the aggregate of all the unobserved time-invariant effects. However, if you explain what is likely to be contained in the fixed effect, your audience can understand the potential direction of the bias and why you need to use the fixed effect model.

The dummy variable regression
Consider again the following model:
$$\log(\text{scrap})_{it} = \beta_0 + \beta_1 \text{grant}_{it} + \beta_2 \text{year88}_{it} + \beta_3 \text{year89}_{it} + a_i + u_{it}$$
We learned that the fixed effect model can correct for the biases arising from the correlation between $a_i$ and the explanatory variables.

Now, consider instead that you include a dummy variable for every firm in the model (dropping the common intercept to avoid perfect collinearity) and estimate the model using the usual OLS. It is known that the slope coefficients and their standard errors obtained from this procedure are exactly the same as those obtained from fixed effect estimation. The coefficients on the dummy variables will be the same as the fixed effect estimates of $a_i$.
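The equivalence between the dummy variable regression and the within estimator can be checked numerically. The Python sketch below uses a hypothetical toy panel with one regressor and a small hand-written least squares solver (standard library only); none of the names or numbers come from the lecture.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[p] * row[q] for row in X) for q in range(k)] for p in range(k)]
    Xty = [sum(row[p] * yi for row, yi in zip(X, y)) for p in range(k)]
    return solve(XtX, Xty)


# Hypothetical balanced panel: 3 units, 3 periods, one regressor.
unit = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0, 7.0, 8.0, 12.0]
y = [3.1, 4.9, 9.2, 16.0, 20.3, 21.9, 34.5, 35.8, 44.1]
units = sorted(set(unit))

# Dummy variable regression: x plus one dummy per unit, no common intercept.
X_lsdv = [[xi] + [1.0 if i == u else 0.0 for u in units] for i, xi in zip(unit, x)]
coef = ols(X_lsdv, y)
lsdv_slope = coef[0]
dummy_coefs = dict(zip(units, coef[1:]))

# Within estimator: demean by unit, then OLS without an intercept.
x_bar = {u: sum(xi for i, xi in zip(unit, x) if i == u) / unit.count(u) for u in units}
y_bar = {u: sum(yi for i, yi in zip(unit, y) if i == u) / unit.count(u) for u in units}
xd = [xi - x_bar[i] for i, xi in zip(unit, x)]
yd = [yi - y_bar[i] for i, yi in zip(unit, y)]
within_slope = sum(p * q for p, q in zip(xd, yd)) / sum(p * p for p in xd)
a_hat = {u: y_bar[u] - within_slope * x_bar[u] for u in units}

print("dummy variable slope:", lsdv_slope, "within slope:", within_slope)
print("dummy coefficients:", dummy_coefs)
print("estimated unit effects:", a_hat)
```

The two slopes agree up to rounding error, and each dummy coefficient matches the corresponding estimate of $a_i$. The sketch does not compute standard errors.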

However, note that the coefficients on the dummy variables are not consistent when the number of periods (T) is fixed and the number of firms (N) gets large. This is because, as N gets large, the number of $a_i$ to be estimated also increases, so no additional information accumulates on each $a_i$. The slope estimates are not affected by this problem.

The Random Effect Estimation
Consider the following unobserved effect model:
$$y_{it} = \beta_0 + \beta_1 x_{it1} + \dots + \beta_k x_{itk} + a_i + u_{it}$$
Previously, we applied fixed effect estimation because we suspected that $a_i$ is correlated with some of the explanatory variables. But if we can assume that $a_i$ is not correlated with any of the explanatory variables, we can estimate the model more efficiently (i.e., get smaller standard errors).

When $a_i$ is not correlated with any of the explanatory variables, pooled OLS is consistent. But the problem now is serial correlation. That is, for a given person i, the composite error term $v_{it} = a_i + u_{it}$ in this period is correlated with the composite error in other periods, because both contain the same $a_i$.

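To be more precise, assume the following:
$$\operatorname{Cov}(x_{itj}, a_i) = 0 \quad \text{for } t = 1, 2, \dots, T \text{ and } j = 1, 2, \dots, k$$
That is, $a_i$ is uncorrelated with all the explanatory variables in all the periods. In addition, we assume that $a_i$ and the idiosyncratic errors in all the periods are uncorrelated. Then we can show the following:
$$\operatorname{Corr}(v_{it}, v_{is}) = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2} \quad \text{for } t \neq s$$
where $\sigma_a^2 = \operatorname{Var}(a_i)$ and $\sigma_u^2 = \operatorname{Var}(u_{it})$. Proof: See the front board.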
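The slide leaves the proof to the board. A sketch of the standard argument is given below; it additionally assumes (not stated on the slide) that the idiosyncratic errors $u_{it}$ are serially uncorrelated with constant variance $\sigma_u^2$, and it writes $v_{it} = a_i + u_{it}$ for the composite error.

```latex
\begin{align*}
\operatorname{Cov}(v_{it}, v_{is})
  &= \operatorname{Cov}(a_i + u_{it},\; a_i + u_{is}) \\
  &= \operatorname{Var}(a_i) + \operatorname{Cov}(a_i, u_{is})
     + \operatorname{Cov}(u_{it}, a_i) + \operatorname{Cov}(u_{it}, u_{is}) \\
  &= \sigma_a^2 + 0 + 0 + 0 = \sigma_a^2 \qquad (t \neq s), \\[4pt]
\operatorname{Var}(v_{it})
  &= \operatorname{Var}(a_i) + 2\operatorname{Cov}(a_i, u_{it}) + \operatorname{Var}(u_{it})
   = \sigma_a^2 + \sigma_u^2, \\[4pt]
\operatorname{Corr}(v_{it}, v_{is})
  &= \frac{\operatorname{Cov}(v_{it}, v_{is})}
          {\sqrt{\operatorname{Var}(v_{it})\operatorname{Var}(v_{is})}}
   = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2} \qquad (t \neq s).
\end{align*}
```

The correlation is positive whenever $\sigma_a^2 > 0$ and does not die out as the gap between t and s grows, since the same $a_i$ appears in every period.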

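Here is a way to eliminate the serial correlation. Consider the following:
$$\lambda = 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T\sigma_a^2}}$$
Then the terms $v_{it} - \lambda \bar v_i$, where $v_{it} = a_i + u_{it}$, are not serially correlated. Thus, first consider the following:
$$y_{it} = \beta_0 + \beta_1 x_{it1} + \dots + \beta_k x_{itk} + v_{it} \qquad (1)$$
$$\lambda \bar y_i = \lambda\beta_0 + \lambda\beta_1 \bar x_{i1} + \dots + \lambda\beta_k \bar x_{ik} + \lambda \bar v_i \qquad (2)$$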

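Then, subtract (2) from (1) to get
$$y_{it} - \lambda \bar y_i = \beta_0 (1 - \lambda) + \beta_1 (x_{it1} - \lambda \bar x_{i1}) + \dots + \beta_k (x_{itk} - \lambda \bar x_{ik}) + (v_{it} - \lambda \bar v_i) \qquad (3)$$
As can be seen, the composite error term is $v_{it} - \lambda \bar v_i$, and we know that this error term has no serial correlation. The transformed data are called the quasi-demeaned data. Therefore, if we apply OLS to (3), we get the correct standard errors. One problem is that λ is an unknown parameter, so it has to be estimated. The procedure to estimate λ is the following.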
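The claim that quasi-demeaning removes the serial correlation can be checked numerically. The Python sketch below assumes the standard random effects value $\lambda = 1 - \sqrt{\sigma_u^2 / (\sigma_u^2 + T\sigma_a^2)}$ and uses the facts that, for $t \neq s$, $\operatorname{Cov}(v_{it}, v_{is}) = \sigma_a^2$ and $\operatorname{Cov}(v_{it}, \bar v_i) = \operatorname{Var}(\bar v_i) = \sigma_a^2 + \sigma_u^2 / T$. The function names and the variance values are hypothetical.

```python
import math


def re_lambda(sigma_a2, sigma_u2, n_periods):
    """Quasi-demeaning parameter for the random effects transformation."""
    return 1.0 - math.sqrt(sigma_u2 / (sigma_u2 + n_periods * sigma_a2))


def cov_quasi_demeaned(sigma_a2, sigma_u2, n_periods, lam):
    """Cov(v_it - lam * vbar_i, v_is - lam * vbar_i) for t != s."""
    c = sigma_a2 + sigma_u2 / n_periods  # Cov(v_it, vbar_i) = Var(vbar_i)
    return sigma_a2 - 2.0 * lam * c + lam ** 2 * c


# Hypothetical variance components and panel length.
sigma_a2, sigma_u2, T = 1.0, 1.0, 3
lam = re_lambda(sigma_a2, sigma_u2, T)

print("lambda:", lam)
print("covariance with lambda = 0 (pooled OLS):",
      cov_quasi_demeaned(sigma_a2, sigma_u2, T, 0.0))
print("covariance with the random effects lambda:",
      cov_quasi_demeaned(sigma_a2, sigma_u2, T, lam))
print("covariance with lambda = 1 (full demeaning):",
      cov_quasi_demeaned(sigma_a2, sigma_u2, T, 1.0))
```

With these values λ is 0.5 and the covariance of the quasi-demeaned errors is zero up to rounding. Setting λ to 0 gives back pooled OLS, and setting it to 1 gives the fixed effects (fully demeaned) transformation, so random effects sits between the two.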

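1. Estimate (1) using pooled OLS and save the residuals $\hat v_{it}$. Then estimate $\sigma_a^2$, $\sigma_u^2$ and $\sigma_v^2$ as:
$$\hat\sigma_a^2 = \frac{1}{NT(T-1)/2 - (k+1)} \sum_{i=1}^{N} \sum_{t=1}^{T-1} \sum_{s=t+1}^{T} \hat v_{it} \hat v_{is}$$
$$\hat\sigma_u^2 = \hat\sigma_v^2 - \hat\sigma_a^2$$
Here $\hat\sigma_v^2$ is just the estimate of sigma-squared from the pooled OLS of (1).
2. Then estimate λ as:
$$\hat\lambda = 1 - \sqrt{\frac{\hat\sigma_u^2}{\hat\sigma_u^2 + T\hat\sigma_a^2}}$$
3. Finally, replace λ in equation (3) with $\hat\lambda$ and estimate the equation using OLS. This is called random effects estimation.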
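Steps 1 and 2 can be sketched in Python as follows. This assumes a balanced panel and the textbook estimators $\hat\sigma_a^2 = [NT(T-1)/2 - (k+1)]^{-1}\sum_i \sum_{t<s} \hat v_{it}\hat v_{is}$ and $\hat\sigma_u^2 = \hat\sigma_v^2 - \hat\sigma_a^2$; the residuals below are made-up numbers rather than output from a real regression, and the function names are hypothetical. Packaged routines may use different variance-component estimators, so their numbers need not match.

```python
import math


def variance_components(residuals_by_unit, n_regressors):
    """Estimate (sigma_a^2, sigma_u^2, sigma_v^2) from pooled OLS residuals.

    residuals_by_unit maps each unit to its list of T residuals."""
    n_units = len(residuals_by_unit)
    n_periods = len(next(iter(residuals_by_unit.values())))
    nt = n_units * n_periods
    k = n_regressors
    cross_sum = 0.0  # sum over units of v_it * v_is for all pairs t < s
    ssr = 0.0        # sum of squared residuals
    for v in residuals_by_unit.values():
        ssr += sum(e * e for e in v)
        for t in range(n_periods - 1):
            for s in range(t + 1, n_periods):
                cross_sum += v[t] * v[s]
    sigma_a2 = cross_sum / (nt * (n_periods - 1) / 2 - (k + 1))
    sigma_v2 = ssr / (nt - (k + 1))
    sigma_u2 = sigma_v2 - sigma_a2
    return sigma_a2, sigma_u2, sigma_v2


def lambda_hat(sigma_a2, sigma_u2, n_periods):
    """Estimated quasi-demeaning parameter; needs non-negative inputs."""
    return 1.0 - math.sqrt(sigma_u2 / (sigma_u2 + n_periods * sigma_a2))


# Made-up residuals for N = 3 units and T = 3 periods, with k = 1 regressor.
residuals = {
    "A": [1.5, 0.5, 1.0],
    "B": [-1.0, -1.5, -0.5],
    "C": [0.5, -0.5, 0.0],
}
s_a2, s_u2, s_v2 = variance_components(residuals, n_regressors=1)
lam_hat = lambda_hat(s_a2, s_u2, n_periods=3)
print("sigma_a^2:", s_a2, "sigma_u^2:", s_u2, "sigma_v^2:", s_v2)
print("lambda hat:", lam_hat)
```

In small samples the estimate of $\sigma_a^2$ can turn out negative, in which case the square root is not usable as written; the sketch does not handle that case.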

Example
Estimate a log wage equation using WAGEPAN.dta. Include in the model education, black, hispan, exper, exper squared, married, union, and a full set of year dummies.
First, estimate the model using OLS.
Next, estimate the model using the random effect model.
Finally, estimate the model using the fixed effect model. Why does STATA drop some of the variables?

OLS

Random Effect

Fixed effect

Fixed effect or random effect
Fixed effect estimation allows arbitrary correlation between $a_i$ and the explanatory variables. Random effect estimation is valid only if $a_i$ is uncorrelated with all of the explanatory variables. When you conduct a policy analysis, such correlation should be considered the rule rather than the exception. Thus fixed effect is almost always more convincing than random effect.

But if the policy variable is set experimentally, then you might apply random effect. For example, suppose that you want to know the effect of class size on students' achievement. If students are randomly assigned to classes of different sizes, then random effect can be applied. However, this kind of situation is rare, so the usual recommendation is to use the fixed effect method.