Chapter 15 Panel Data Models Walter R. Paczkowski Rutgers University.


Chapter Contents
15.1 A Microeconomic Panel
15.2 A Pooled Model
15.3 The Fixed Effects Model
15.4 The Random Effects Model
15.5 Comparing Fixed and Random Effects Estimators
15.6 The Hausman-Taylor Estimator
15.7 Sets of Regression Equations

A panel of data consists of a group of cross-sectional units (people, households, firms, states, countries) that are observed over time. Denote the number of cross-sectional units (individuals) by N and the number of time periods in which we observe them by T.

Different ways of describing panel data sets:
Long and narrow: "long" describes the time dimension and "narrow" implies a relatively small number of cross-sectional units
Short and wide: many individuals are observed over a relatively short period of time
Long and wide: both N and T are relatively large

It is possible to have data that combine cross-sectional and time-series observations but do not constitute a panel. We may collect a sample of data on individuals from a population at several points in time, but the individuals are not the same in each time period. Such data can be used to analyze a "natural experiment."

15.1 A Microeconomic Panel

15.1 A Microeconomic Panel In microeconomic panels, the individuals are not always interviewed the same number of times, leading to an unbalanced panel in which the number of time-series observations differs across individuals. In a balanced panel, each individual has the same number of observations.

Table 15.1 Representative Observations from NLS Panel Data

15.2 A Pooled Model

15.2 A Pooled Model A pooled model is one in which the data on different individuals are simply pooled together, with no provision for individual differences that might lead to different coefficients. Notice that the coefficients (β1, β2, β3) do not have i or t subscripts. Eq. 15.1

The least squares estimator, when applied to a pooled model, is referred to as pooled least squares: the data for different individuals are pooled together, and the equation is estimated using least squares.
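As a sketch of what pooled least squares does, the stacked-data estimation can be carried out directly with NumPy. The data, coefficient values, and panel dimensions below are simulated assumptions, not the NLS panel used in the chapter:

```python
# Pooled least squares on a stacked panel (simulated data; a sketch of Eq. 15.1)
import numpy as np

rng = np.random.default_rng(0)
N, T = 50, 5                       # cross-sectional units, time periods
NT = N * T

# Stack all individuals and periods into one long sample
x2 = rng.normal(size=NT)
x3 = rng.normal(size=NT)
e = rng.normal(size=NT)
y = 1.0 + 0.5 * x2 - 0.3 * x3 + e  # hypothetical true coefficients

X = np.column_stack([np.ones(NT), x2, x3])
b, *_ = np.linalg.lstsq(X, y, rcond=None)  # pooled least squares estimates
```

The key point is that the i and t structure plays no role: the NT observations are treated as one undifferentiated sample.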

It is useful to write explicitly the error assumptions required for pooled least squares to be consistent and for the t and F statistics to be valid when computed using the usual least squares variance estimates and standard errors:
E(eit) = 0 (Eq. 15.2)
var(eit) = σ²e (Eq. 15.3)
cov(eit, eis) = 0 for t ≠ s (Eq. 15.4)
cov(eit, ejs) = 0 for i ≠ j (Eq. 15.5)

15.2.1 Cluster-Robust Standard Errors Applying pooled least squares in a way that ignores the panel nature of the data is restrictive in a number of ways. The first unrealistic assumption that we consider is the lack of correlation between errors corresponding to the same individual.

To relax the assumption of zero error correlation over time for the same individual, we write Eq. 15.6. This also relaxes the assumption of homoskedasticity. We continue to assume that the errors for different individuals are uncorrelated.

What are the consequences of using pooled least squares in the presence of this heteroskedasticity and correlation?
The least squares estimator is still consistent
Its standard errors are incorrect, which implies that hypothesis tests and interval estimates based on these standard errors will be invalid
Typically, the standard errors will be too small, overstating the reliability of the least squares estimator

Standard errors that are valid for the pooled least squares estimator under the assumption in Eq. 15.6 can be computed. Various names for them are panel-robust standard errors and cluster-robust standard errors; the time-series observations on each individual are the clusters.
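A minimal sketch of how cluster-robust standard errors are computed, clustering on the individual. The data-generating process (an individual error component that induces within-cluster correlation) and all parameter values are simulated assumptions:

```python
# Cluster-robust standard errors by hand: sandwich formula with one
# "meat" term per individual (cluster). Simulated data.
import numpy as np

rng = np.random.default_rng(1)
N, T = 100, 5
u = np.repeat(rng.normal(size=N), T)   # individual component -> within-cluster correlation
x = rng.normal(size=N * T)
y = 1.0 + 0.5 * x + u + rng.normal(size=N * T)

X = np.column_stack([np.ones(N * T), x])
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
resid = y - X @ b

# "Meat": sum over clusters of X_i' e_i e_i' X_i
meat = np.zeros((2, 2))
for i in range(N):
    sl = slice(i * T, (i + 1) * T)
    Xe = X[sl].T @ resid[sl]
    meat += np.outer(Xe, Xe)

V_cluster = XtX_inv @ meat @ XtX_inv
se_cluster = np.sqrt(np.diag(V_cluster))

# Conventional OLS standard errors for comparison
s2 = resid @ resid / (N * T - 2)
se_ols = np.sqrt(np.diag(s2 * XtX_inv))
```

With a common individual component in the errors, the cluster-robust intercept standard error exceeds the conventional one, illustrating why the usual standard errors are "too small" in this setting.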

15.2.2 Pooled Least Squares Estimates of Wage Equation Table 15.2 Pooled Least Squares Estimates of Wage Equation

15.3 The Fixed Effects Model

We can extend the model in Eq. 15.1 to relax the assumption that all individuals have the same coefficients (Eq. 15.7). An i subscript has been added to each of the coefficients, implying that (β1, β2, β3) can be different for each individual.

A popular simplification is one where the intercepts β1i are different for different individuals but the slope coefficients β2 and β3 are assumed to be constant for all individuals: Eq. 15.8

All behavioral differences between individuals, referred to as individual heterogeneity, are assumed to be captured by the intercept. Individual intercepts are included to "control" for individual-specific, time-invariant characteristics. A model with these features is called a fixed effects model; the intercepts are called fixed effects.

We consider two methods for estimating Eq. 15.8: the least squares dummy variable estimator and the fixed effects estimator.

15.3.1 The Least Squares Dummy Variable Estimator for Small N One way to estimate the model in Eq. 15.8 is to include an intercept dummy variable (indicator variable) for each individual. If we have 10 individuals, we define 10 such dummies. Now we can write: Eq. 15.9

If the error terms eit are uncorrelated with mean zero and constant variance σ²e for all observations, then the best linear unbiased estimator of Eq. 15.9 is the least squares estimator. In a panel data context, it is called the least squares dummy variable estimator.
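The least squares dummy variable estimator can be sketched as follows. The dummy matrix has one column per individual, as in Eq. 15.9; all data and coefficient values are simulated assumptions:

```python
# Least squares dummy variable (LSDV) estimator for small N (simulated data)
import numpy as np

rng = np.random.default_rng(2)
N, T = 10, 8
beta1 = rng.normal(size=N)                  # individual fixed intercepts
x = rng.normal(size=(N, T))
y = beta1[:, None] + 0.5 * x + 0.1 * rng.normal(size=(N, T))

# NT x N block of intercept dummies: column i is 1 for individual i's rows
D = np.kron(np.eye(N), np.ones((T, 1)))
X = np.column_stack([D, x.ravel()])
b, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
intercepts, slope = b[:N], b[N]             # N intercepts plus the common slope
```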

Table 15.3 Dummy Variable Estimation of Wage Equation for N = 10

Table 15.4 Pooled Least Squares Estimates of Wage Equation for N = 10

We can test the estimates of the intercepts: Eq. 15.10

These N − 1 = 9 joint null hypotheses are tested using the usual F-test statistic. In the restricted model all the intercept parameters are equal; if we call their common value β1, then the restricted model is the pooled model.

The F-statistic is F = [(SSE_R − SSE_U)/J] / [SSE_U/(NT − K)], where SSE_R and SSE_U are the sums of squared errors from the restricted and unrestricted models and J is the number of restrictions.

The value of the test statistic F = 4.134 yields a p-value of 0.0011. We reject the null hypothesis that the intercept parameters for all individuals are equal; we conclude that there are differences in individual intercepts and that the data should not be pooled into a single model with a common intercept parameter.

15.3.2 The Fixed Effects Estimator Using the dummy variable approach is not feasible when N is large; another approach is necessary.

Take the data on individual i, and average the data across time: Eq. 15.11

15.3 The Fixed Effects Model 15.3.2 The Fixed Effects Estimator Using the fact that the parameters do not change over time, we can simplify this as: Eq. 15.12

Now subtract Eq. 15.12 from Eq. 15.11, term by term, to obtain Eq. 15.13, or Eq. 15.14.
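The within transformation behind Eq. 15.13-15.14 can be verified numerically: least squares on the demeaned data reproduces the dummy variable slope exactly, even when the regressor is correlated with the fixed effects. A sketch on simulated data:

```python
# Fixed effects via the within (demeaning) transformation; simulated data.
import numpy as np

rng = np.random.default_rng(3)
N, T = 30, 6
a = rng.normal(size=N)                        # fixed effects
x = rng.normal(size=(N, T)) + a[:, None]      # regressor correlated with the effects
y = a[:, None] + 0.5 * x + 0.1 * rng.normal(size=(N, T))

# Deviations from individual means (Eq. 15.13): the intercepts drop out
y_dev = (y - y.mean(axis=1, keepdims=True)).ravel()
x_dev = (x - x.mean(axis=1, keepdims=True)).ravel()
b_within = (x_dev @ y_dev) / (x_dev @ x_dev)  # no-intercept least squares

# LSDV estimate of the same slope, for comparison
D = np.kron(np.eye(N), np.ones((T, 1)))
X = np.column_stack([D, x.ravel()])
b_lsdv = np.linalg.lstsq(X, y.ravel(), rcond=None)[0][-1]
```

The numerical equality of the two slopes is the Frisch-Waugh-Lovell result that makes the fixed effects estimator feasible for large N.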

Table 15.5 Data in Deviation from Individual Mean Form

15.3.2a The Fixed Effects Estimates of Wage Equation for N = 10 Table 15.6 Fixed Effects Estimation of Wage Equation for N = 10

If we multiply the standard errors from estimating Eq. 15.14 by the correction factor, the resulting standard errors are identical to those in Table 15.3.

Usually we are most interested in the coefficients of the explanatory variables and not the individual intercept parameters. The intercepts can nevertheless be "recovered" by using the fact that the least squares fitted regression passes through the point of the means, so that the fixed effects are: Eq. 15.15
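Recovering the fixed effects via Eq. 15.15 is then one line per individual: each intercept is that individual's mean of y minus the slope times its mean of x. A sketch on simulated data:

```python
# Recovering fixed effects from the within estimator (Eq. 15.15); simulated data.
import numpy as np

rng = np.random.default_rng(4)
N, T = 20, 10
a = rng.normal(size=N)                        # true individual intercepts
x = rng.normal(size=(N, T))
y = a[:, None] + 0.5 * x + 0.05 * rng.normal(size=(N, T))

# Within (demeaned) slope estimate
y_dev = (y - y.mean(axis=1, keepdims=True)).ravel()
x_dev = (x - x.mean(axis=1, keepdims=True)).ravel()
b2 = (x_dev @ y_dev) / (x_dev @ x_dev)

# Eq. 15.15: b1_i = ybar_i - b2 * xbar_i
b1 = y.mean(axis=1) - b2 * x.mean(axis=1)
```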

15.3.3 Fixed Effects Estimates of Wage Equation from Complete Panel Table 15.7 Fixed Effects Estimates of Wage Equation for N = 716

Table 15.8 Percentage Marginal Effects on Wages

15.4 The Random Effects Model

In the random effects model we assume that all individual differences are captured by the intercept parameters, but we also recognize that the individuals in our sample were randomly selected; thus we treat the individual differences as random rather than fixed, as we did in the fixed effects dummy variable model.

Random individual differences can be included in our model by specifying the intercept parameters to consist of a fixed part that represents the population average and random individual differences from the population average: Eq. 15.16. The random individual differences ui are called random effects and have the properties in Eq. 15.17.

Substituting, we get Eq. 15.18; rearranging gives Eq. 15.19.

The combined error term is given in Eq. 15.20. The random effects error has two components, one for the individual and one for the regression, so the random effects model is often called an error components model.

15.4.1 Error Term Assumptions The combined error term has zero mean and a constant, homoskedastic variance: Eq. 15.21

There are several correlations that can be considered. The first is the correlation between two individuals, i and j, at the same point in time, t.

The second is the correlation between errors on the same individual (i) at different points in time, t and s: Eq. 15.22

The third is the correlation between errors for different individuals in different time periods.

The errors vit = ui + eit are correlated over time for a given individual, but are otherwise uncorrelated. The correlation is caused by the component ui that is common to all time periods. It is constant over time and, in contrast to the AR(1) error model, it does not decline as the observations get further apart in time: Eq. 15.23
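The constancy of this correlation is easy to illustrate by simulation: for error components vit = ui + eit with σu = σe = 1, the correlation between vit and vis is σ²u/(σ²u + σ²e) = 0.5 at every separation t − s:

```python
# Error components correlation (Eq. 15.23): constant at every lag; simulation.
import numpy as np

rng = np.random.default_rng(10)
N, T = 100_000, 5
sigma_u, sigma_e = 1.0, 1.0
# v_it = u_i + e_it: one draw of u per individual, broadcast across T periods
v = rng.normal(size=(N, 1)) * sigma_u + rng.normal(size=(N, T)) * sigma_e

rho = sigma_u**2 / (sigma_u**2 + sigma_e**2)      # theoretical value: 0.5
corr_adjacent = np.corrcoef(v[:, 0], v[:, 1])[0, 1]  # one period apart
corr_far = np.corrcoef(v[:, 0], v[:, 4])[0, 1]       # four periods apart
```

Both sample correlations sit near 0.5: unlike an AR(1), the correlation does not decay with the time gap.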

15.4 The Random Effects Model 15.4.1 Error Term Assumptions In terms of the notation introduced to explain the assumptions that motivate the use of cluster-robust standard errors:

Summary of the error term assumptions of the random effects model: Eqs. 15.24-15.29

15.4 The Random Effects Model 15.4.2 Testing for Random Effects We can test for the presence of heterogeneity by testing the null hypothesis H0: σ2u = 0 against the alternative hypothesis H1: σ2u > 0 If the null hypothesis is rejected, then we conclude that there are random individual differences among sample members, and that the random effects model is appropriate If we fail to reject the null hypothesis, then we have no evidence to conclude that random effects are present

15.4 The Random Effects Model 15.4.2 Testing for Random Effects The Lagrange multiplier (LM) principle for test construction is very convenient in this case If the null hypothesis is true, then ui = 0 and the random effects model in Eq. 15.19 reduces to:

The test statistic is based on the least squares residuals. The test statistic for balanced panels is: Eq. 15.30

If the null hypothesis H0: σ²u = 0 is true, then LM ~ N(0, 1) in large samples. Thus, we reject H0 at significance level α and accept the alternative H1: σ²u > 0 if LM > z(1−α), where z(1−α) is the 100(1−α) percentile of the standard normal distribution. This critical value is 1.645 if α = 0.05 and 2.326 if α = 0.01. Rejecting the null hypothesis leads us to conclude that random effects are present.
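A sketch of the balanced-panel LM statistic of Eq. 15.30 on simulated data where random effects are truly present, so the test should reject. The form of the statistic used here, LM = sqrt(NT/(2(T−1))) · (Σi(Σt êit)²/Σê² − 1), is the standard-normal version described in the text:

```python
# LM test for random effects on a balanced panel; simulated data with sigma_u = 1.
import numpy as np

rng = np.random.default_rng(5)
N, T = 100, 5
u = np.repeat(rng.normal(size=N), T)       # random effects truly present
x = rng.normal(size=N * T)
y = 1.0 + 0.5 * x + u + rng.normal(size=N * T)

# Pooled least squares residuals
X = np.column_stack([np.ones(N * T), x])
ehat = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

# LM = sqrt(NT/(2(T-1))) * (sum_i (sum_t e_it)^2 / sum e^2 - 1) ~ N(0,1) under H0
e_it = ehat.reshape(N, T)
LM = np.sqrt(N * T / (2 * (T - 1))) * ((e_it.sum(axis=1) ** 2).sum() / (ehat @ ehat) - 1)
reject = LM > 1.645                        # 5% one-sided critical value
```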

15.4.3 Estimation of the Random Effects Model We can obtain the generalized least squares estimator in the random effects model by applying least squares to a transformed model (Eq. 15.31), where the transformed variables are given in Eq. 15.32 and α is defined in Eq. 15.33.

15.4 The Random Effects Model 15.4.3 Estimation of the Random Effects Model For α = 1, the random effects estimator is identical to the fixed effects estimator For α < 1, it can be shown that the random effects estimator is a ‘‘matrix-weighted average’’ of the fixed effects estimator that utilizes only within individual variation and a ‘‘between estimator’’ which utilizes variation between individuals
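The quasi-demeaning transformation of Eqs. 15.31-15.33 can be sketched with the variance components treated as known (in practice they must be estimated). The formula α = 1 − σe/√(Tσ²u + σ²e) used below is the standard error-components weight; the data are simulated assumptions:

```python
# Random effects GLS via quasi-demeaning; variance components assumed known.
import numpy as np

rng = np.random.default_rng(6)
N, T = 200, 5
sigma_u, sigma_e = 1.0, 1.0
u = np.repeat(rng.normal(size=N), T) * sigma_u
x = rng.normal(size=N * T)
y = 1.0 + 0.5 * x + u + sigma_e * rng.normal(size=N * T)

# Transformation parameter: alpha -> 1 recovers the fixed effects estimator
alpha = 1 - sigma_e / np.sqrt(T * sigma_u**2 + sigma_e**2)

def quasi_demean(v):
    # subtract alpha times the individual mean (Eq. 15.32)
    bar = v.reshape(N, T).mean(axis=1).repeat(T)
    return v - alpha * bar

ys = quasi_demean(y)
Xs = np.column_stack([quasi_demean(np.ones(N * T)), quasi_demean(x)])
b_re = np.linalg.lstsq(Xs, ys, rcond=None)[0]   # random effects (GLS) estimates
```

Because 0 < α < 1, some between-individual variation is retained, which is why time-invariant regressors remain estimable, unlike under full demeaning.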

15.4.4 Random Effects Estimation of the Wage Equation Table 15.9 Random Effects Estimates of Wage Equation The estimate of the transformation parameter α is:

15.5 Comparing Fixed and Random Effects Estimators

If random effects are present, then the random effects estimator is preferred for several reasons:
It takes into account the random sampling process by which the data were obtained
It permits us to estimate the effects of variables that are individually time-invariant
It is a generalized least squares estimation procedure, whereas the fixed effects estimator is a least squares estimator

15.5.1 Endogeneity in the Random Effects Model If the random error vit = ui + eit is correlated with any of the right-hand-side explanatory variables in a random effects model, then the least squares and GLS estimators of the parameters are biased and inconsistent. The problem of endogenous regressors was considered earlier; it is common in random effects models because the individual-specific error component ui may well be correlated with some of the explanatory variables.

15.5.2 The Fixed Effects Estimator in a Random Effects Model The panel data regression Eq. 15.19, including the error component ui, is Eq. 15.34. Average the observations for each individual over time to obtain Eq. 15.35.

Subtract to obtain Eq. 15.36.

15.5.3 The Hausman Test To check for any correlation between the error component ui and the regressors in a random effects model, we can use a Hausman test. The Hausman test can be carried out for specific coefficients, using a t-test, or jointly, using an F-test or a chi-square test.

Let the parameter of interest be βk. Denote the fixed effects estimate as bFE,k and the random effects estimate as bRE,k. The t-statistic for testing that there is no difference between the estimators is: Eq. 15.37

We expect to find var(bFE,k) > var(bRE,k). Also, var(bFE,k − bRE,k) = var(bFE,k) − var(bRE,k), because Hausman proved that cov(bRE,k, bFE,k − bRE,k) = 0.

Applying the t-test to the coefficient of SOUTH, and using the standard 5% large-sample critical value of 1.96, we reject the hypothesis that the estimators yield identical results. Our conclusion is that the random effects estimator is inconsistent, and that we should use the fixed effects estimator, or should attempt to improve the model specification.
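The arithmetic of Eq. 15.37 is simple: the difference of the two estimates divided by the square root of the difference of their variances. The numbers below are illustrative values in the spirit of the SOUTH example, not figures taken from the text's tables:

```python
# Hausman t-test for one coefficient (Eq. 15.37); illustrative numbers only.
import math

b_fe, se_fe = -0.0163, 0.0361   # hypothetical fixed effects estimate and std. error
b_re, se_re = -0.0818, 0.0224   # hypothetical random effects estimate and std. error

# var(bFE - bRE) = var(bFE) - var(bRE) under the Hausman result
t = (b_fe - b_re) / math.sqrt(se_fe**2 - se_re**2)
reject = abs(t) > 1.96           # 5% large-sample critical value
```

Note that the denominator requires se_fe > se_re; when the estimated variance difference is negative, the statistic is not computable in this form.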

The form of the Hausman test in Eq. 15.37 and its χ² equivalent are not valid with cluster-robust standard errors, because under these more general assumptions it is no longer true that var(bFE,k − bRE,k) = var(bFE,k) − var(bRE,k).

15.6 The Hausman-Taylor Estimator

The Hausman-Taylor estimator is an instrumental variables estimator applied to the random effects model to overcome the problem of inconsistency caused by correlation between the random effects and some of the explanatory variables.

Consider the regression model in Eq. 15.38, with:
xit,exog: exogenous variables that vary over time and across individuals
xit,endog: endogenous variables that vary over time and across individuals
wi,exog: time-invariant exogenous variables
wi,endog: time-invariant endogenous variables

A slightly modified set of instruments is applied to the transformed generalized least squares model from Eq. 15.31: Eq. 15.39

Table 15.10 Hausman-Taylor Estimates of Wage Equation

15.7 Sets of Regression Equations

Consider procedures for a panel that is long and narrow, i.e., T is large relative to N. If the number of time-series observations is sufficiently large and N is small, we can estimate separate equations for each individual. These separate equations can be specified as Eq. 15.40.

15.7 Sets of Regression Equations 15.7.1 Grunfeld’s Investment Data An economic model for describing gross firm investment for the ith firm in the tth time period, denoted INVit, may be expressed as: Eq. 15.41

15.7 Sets of Regression Equations 15.7.1 Grunfeld’s Investment Data We specify the following two equations for General Electric and Westinghouse: Eq. 15.42

The choice of estimator depends on what assumptions we make about the coefficients and the error terms:
Are the GE coefficients equal to the WE coefficients?
Do the equation errors eGE,t and eWE,t have the same variance?
Are the equation errors eGE,t and eWE,t correlated?

15.7 Sets of Regression Equations 15.7.2 Estimation: Equal Coefficients, Equal Error Variances The assumption that both firms have the same coefficients and the same error variances can be written as: Eq. 15.43

15.7.3 Estimation: Different Coefficients, Equal Error Variances Let Di be an indicator variable equal to one for the Westinghouse observations and zero for the General Electric observations. Specify a model with slope and intercept indicator variables: Eq. 15.44

Table 15.12 Least Squares Estimates from the Dummy Variable Model

Using the Chow test, we get Eq. 15.45, where NT − NK is the total number of degrees of freedom in the unrestricted model. The p-value for an F(3,34) distribution is 0.328, implying that the null hypothesis of equal coefficients cannot be rejected.
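The Chow-test arithmetic can be sketched on simulated data for two hypothetical firms whose coefficients genuinely differ, so the F-value should be large. This is not Grunfeld's data, and J = 2 restrictions here because each simulated equation has only two coefficients:

```python
# Chow test via restricted (pooled) vs. unrestricted (dummy variable) SSE.
import numpy as np

rng = np.random.default_rng(7)
T = 20
x1, x2 = rng.normal(size=T), rng.normal(size=T)
y1 = 1.0 + 1.0 * x1 + 0.1 * rng.normal(size=T)   # firm 1
y2 = 3.0 + 3.0 * x2 + 0.1 * rng.normal(size=T)   # firm 2: different coefficients

x = np.concatenate([x1, x2])
y = np.concatenate([y1, y2])
D = np.concatenate([np.zeros(T), np.ones(T)])     # firm-2 indicator

def sse(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

ones = np.ones(2 * T)
sse_r = sse(np.column_stack([ones, x]), y)              # restricted: common coefficients
sse_u = sse(np.column_stack([ones, D, x, D * x]), y)    # unrestricted: dummy variable model

J, df = 2, 2 * T - 4               # J restrictions; NT - NK degrees of freedom
F = ((sse_r - sse_u) / J) / (sse_u / df)
```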

15.7 Sets of Regression Equations 15.7.4 Estimation: Different Coefficients, Different Error Variances When both the coefficients and the error variances of the two equations differ, and in the absence of contemporaneous correlation that we introduce in the next section, there is no connection between the two equations, and the best we can do is apply least squares to each equation separately

Table 15.13 Least Squares Estimates of Separate Investment Equations

15.7.5 Seemingly Unrelated Regressions Consider the following assumption: the error terms in the two equations, at the same point in time, are correlated (Eq. 15.46). This kind of correlation is called contemporaneous correlation.

15.7 Sets of Regression Equations 15.7.5 Seemingly Unrelated Regressions The dummy-variable model Eq. 15.44 represents a way to ‘‘stack’’ the 40 observations for the GE and WE equations into one regression To improve the precision of the dummy variable model estimates, we use seemingly unrelated regressions (SUR) estimation, which is a generalized least squares estimation procedure It estimates the two investment equations jointly, accounting for the fact that the variances of the error terms are different for the two equations and accounting for the contemporaneous correlation between the errors of the GE and WE equations

Three stages in the SUR estimation procedure:
1. Estimate the equations separately using OLS
2. Use the OLS residuals from stage 1 to estimate σ²GE, σ²WE and σGE,WE (the estimated covariance is the sample analogue computed from the cross-products of the two residual series)
3. Use the estimates from stage 2 to estimate the two equations jointly within a generalized least squares framework
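The three stages can be sketched for two equations with different regressors and contemporaneously correlated, heteroskedastic errors. The data are simulated, not the GE/WE equations, and the error covariance is estimated here with divisor T, one common convention:

```python
# Three-stage SUR (feasible GLS) for two equations; simulated data.
import numpy as np

rng = np.random.default_rng(8)
T = 200
cov = np.array([[1.0, 0.7], [0.7, 2.0]])    # contemporaneous error covariance
e = rng.multivariate_normal([0, 0], cov, size=T)
x1, x2 = rng.normal(size=T), rng.normal(size=T)
y1 = 1.0 + 0.5 * x1 + e[:, 0]
y2 = 2.0 - 0.8 * x2 + e[:, 1]

X1 = np.column_stack([np.ones(T), x1])
X2 = np.column_stack([np.ones(T), x2])

# Stage 1: separate OLS
b1 = np.linalg.lstsq(X1, y1, rcond=None)[0]
b2 = np.linalg.lstsq(X2, y2, rcond=None)[0]
r1, r2 = y1 - X1 @ b1, y2 - X2 @ b2

# Stage 2: estimate the error covariance matrix from the OLS residuals
S = np.array([[r1 @ r1, r1 @ r2],
              [r1 @ r2, r2 @ r2]]) / T

# Stage 3: joint GLS on the stacked system with Omega = S kron I_T
X = np.block([[X1, np.zeros_like(X2)],
              [np.zeros_like(X1), X2]])
y = np.concatenate([y1, y2])
Om_inv = np.kron(np.linalg.inv(S), np.eye(T))
b_sur = np.linalg.solve(X.T @ Om_inv @ X, X.T @ Om_inv @ y)
```

Because the regressors differ across the two equations, these GLS estimates are generally more precise than equation-by-equation OLS.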

15.7 Sets of Regression Equations 15.7.5 Seemingly Unrelated Regressions The SUR estimation procedure is optimal under the contemporaneous correlation assumption, so no standard error adjustment is necessary

Table 15.14 SUR Estimates of Investment Equations

15.7.5a Separate or Joint Estimation There are two situations in which separate least squares estimation is just as good as the SUR technique:
The equation errors are not contemporaneously correlated: in that case there is nothing linking the two equations, and separate estimation cannot be improved upon
The same explanatory variables appear in each equation: least squares and SUR then give identical estimates

If the explanatory variables in each equation are different, then a test of whether the correlation between the errors is significantly different from zero is of interest. Compute the squared correlation:

To check the statistical significance of r²GE,WE, test the null hypothesis H0: σGE,WE = 0. If σGE,WE = 0, then LM = T × r²GE,WE is a Lagrange multiplier test statistic that is distributed as a χ²(1) random variable in large samples. The 5% critical value of a χ² distribution with one degree of freedom is 3.841, and the value of the test statistic is LM = 10.628, so we reject the null hypothesis of no correlation.
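The LM = T × r² computation can be sketched on simulated data with truly correlated errors (not the GE/WE residuals, so the statistic's value differs from 10.628):

```python
# LM test for contemporaneous correlation between two equations' errors.
import numpy as np

rng = np.random.default_rng(9)
T = 100
e = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=T)
x1, x2 = rng.normal(size=T), rng.normal(size=T)
y1 = 1.0 + 0.5 * x1 + e[:, 0]
y2 = 2.0 + 0.5 * x2 + e[:, 1]

def ols_resid(x, y):
    X = np.column_stack([np.ones(T), x])
    return y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

r1, r2 = ols_resid(x1, y1), ols_resid(x2, y2)
# Squared correlation between the two residual series
r2_stat = (r1 @ r2) ** 2 / ((r1 @ r1) * (r2 @ r2))
LM = T * r2_stat
reject = LM > 3.841                 # 5% chi-square(1) critical value
```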

If we are testing for the existence of correlated errors for more than two equations, the relevant test statistic is equal to T times the sum of squares of all the pairwise residual correlations. The probability distribution under H0 is a χ²-distribution with degrees of freedom equal to the number of correlations.

With three equations, denoted by subscripts 1, 2, and 3, the null hypothesis is H0: σ12 = σ13 = σ23 = 0, and the χ²(3) test statistic is:

LM = T(r²_12 + r²_13 + r²_23)

With M equations, LM is T times the sum of all the squared pairwise residual correlations, with M(M − 1)/2 degrees of freedom.
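The M-equation version can be sketched as a small function; the residual matrix below is simulated for illustration:

```python
import numpy as np

def lm_contemporaneous(resids):
    """LM test for contemporaneous correlation among M equations:
    T times the sum of all M(M-1)/2 squared pairwise residual
    correlations; chi-squared with M(M-1)/2 df under the null.
    resids: T x M array, one column of residuals per equation.
    (OLS residuals from equations with intercepts have zero mean,
    so the demeaning in np.corrcoef matches the textbook r^2.)"""
    T, M = resids.shape
    R = np.corrcoef(resids, rowvar=False)  # M x M correlation matrix
    iu = np.triu_indices(M, k=1)           # the M(M-1)/2 distinct pairs
    return T * np.sum(R[iu] ** 2), M * (M - 1) // 2

# Illustrative use with three equations' residuals
rng = np.random.default_rng(2)
resids = rng.normal(size=(100, 3))
lm, df = lm_contemporaneous(resids)
print(lm, df)  # df = 3 for M = 3 equations
```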

15.7.5b Testing Cross-Equation Hypotheses

We previously used the dummy variable model and the Chow test to test whether the two equations had identical coefficients:

H0: β1,GE = β1,WE, β2,GE = β2,WE, β3,GE = β3,WE    (Eq. 15.47)

It is also possible to test hypotheses such as Eq. 15.47 when the more general error assumptions of the SUR model are relevant. Because of the complicated nature of the model, the test statistic can no longer be calculated simply as an F-test statistic based on residuals from restricted and unrestricted models.

Most econometric software will perform an F-test and/or a Wald χ²-test in a multi-equation framework such as we have here. In the context of SUR equations, both tests are large-sample approximate tests.

The equality of coefficients is not the only cross-equation hypothesis that can be tested; any restrictions on parameters in different equations can be tested.
- Tests of hypotheses involving coefficients within each equation are valid whether done on each equation separately or using the SUR framework
- Tests involving cross-equation hypotheses, however, need to be carried out within an SUR framework if contemporaneous correlation exists
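Given the stacked SUR coefficient vector and its estimated covariance matrix, a cross-equation Wald test is a short calculation. A sketch under the assumption that the restrictions are linear; the coefficient and covariance values below are made up for illustration, not estimates from the text:

```python
import numpy as np

def wald_cross_equation(beta, V, R, r=None):
    """Wald test of linear cross-equation restrictions R @ beta = r,
    where beta stacks the coefficients of all equations and V is the
    estimated covariance matrix of the stacked coefficients from the
    SUR (FGLS) fit. Returns the chi-squared statistic and its df."""
    R = np.atleast_2d(np.asarray(R, dtype=float))
    if r is None:
        r = np.zeros(R.shape[0])
    d = R @ beta - r
    W = d @ np.linalg.solve(R @ V @ R.T, d)
    return W, R.shape[0]

# Equality of the K = 3 coefficients across two equations: beta_GE - beta_WE = 0
K = 3
R_eq = np.hstack([np.eye(K), -np.eye(K)])
# Illustrative (made-up) FGLS output:
beta_hat = np.array([1.0, 0.5, 0.2, 0.9, 0.45, 0.25])
V_hat = 0.01 * np.eye(2 * K)
W, df = wald_cross_equation(beta_hat, V_hat, R_eq)
print(W, df)
```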

Keywords

Balanced panel; Cluster-robust standard errors; Contemporaneous correlation; Cross-equation hypotheses; Deviations from individual means; Endogeneity; Error components model; Fixed effects estimator; Fixed effects model; Hausman test; Hausman-Taylor estimator; Heterogeneity; Instrumental variables; Least squares dummy variable model; LM test; Panel corrected standard errors; Pooled least squares; Pooled model; Random effects estimator; Random effects model; Seemingly unrelated regressions; Time-invariant variables; Time-varying variables; Unbalanced panel

Appendices

15A Cluster-Robust Standard Errors: Some Details

Consider a simple regression model for cross-sectional data:

y_i = β1 + β2 x_i + e_i

The variance of b2, in the presence of heteroskedasticity, is given by:

var(b2) = Σ_i (x_i − x̄)² σ²_i / [ Σ_i (x_i − x̄)² ]²

Now suppose we have a panel simple regression model:

y_it = β1 + β2 x_it + e_it    (Eq. 15A.1)

with the assumptions:

E(e_it) = 0, var(e_it) = σ²_it, cov(e_it, e_is) = ψ_its for t ≠ s,
and cov(e_it, e_js) = 0 for i ≠ j (errors uncorrelated across individuals)

The pooled least squares estimator for β2 is:

b2 = β2 + Σ_i Σ_t w_it e_it    (Eq. 15A.2)

where w_it = x̃_it / Σ_i Σ_t x̃²_it, with x̃_it = x_it − x̄

The variance of the pooled least squares estimator b2 is given by:

var(b2) = var( Σ_i Σ_t w_it e_it ) = var( Σ_i g_i )    (Eq. 15A.3)

with g_i = Σ_t w_it e_it

Because the errors are uncorrelated across individuals, we can now write:

var(b2) = Σ_i var(g_i)    (Eq. 15A.4)

To find var(g_i), suppose for the moment that T = 2; then:

var(g_i) = var(w_i1 e_i1 + w_i2 e_i2) = w²_i1 σ²_i1 + w²_i2 σ²_i2 + 2 w_i1 w_i2 ψ_i12

For T > 2, var(g_i) = Σ_t w²_it σ²_it + 2 Σ_{t<s} w_it w_is ψ_its. Substituting into Eq. 15A.4:

var(b2) = Σ_i [ Σ_t w²_it σ²_it + 2 Σ_{t<s} w_it w_is ψ_its ]    (Eq. 15A.5)

A cluster-robust standard error for b2 is given by the square root of:

var̂(b2) = Σ_i ( Σ_t x̃_it ê_it )² / ( Σ_i Σ_t x̃²_it )²    (Eq. 15A.6)

where the ê_it are the pooled least squares residuals
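This estimator can be sketched directly: square the within-cluster sums of x̃ê and divide by the squared total sum of x̃². The function and simulated panel below are illustrative, not the text's data:

```python
import numpy as np

def cluster_robust_se(x, y, ids):
    """Cluster-robust standard error for the slope b2 in the pooled
    simple regression y = b1 + b2*x, clustering on ids: the sum over
    individuals of the squared within-cluster sums of (x - xbar)*residual,
    divided by the squared total sum of (x - xbar)^2."""
    xt = x - x.mean()
    b2 = (xt @ y) / (xt @ xt)
    b1 = y.mean() - b2 * x.mean()
    e = y - b1 - b2 * x
    num = 0.0
    for i in np.unique(ids):
        m = ids == i
        num += (xt[m] @ e[m]) ** 2  # square the within-cluster sum
    return b2, np.sqrt(num / (xt @ xt) ** 2)

# Simulated panel with a common within-individual error component
rng = np.random.default_rng(3)
N, T = 50, 5
ids = np.repeat(np.arange(N), T)
x = rng.normal(size=N * T)
u = np.repeat(rng.normal(size=N), T)  # cluster-level effect
y = 1.0 + 0.5 * x + u + rng.normal(size=N * T)
b2, se = cluster_robust_se(x, y, ids)
print(b2, se)
```

With every observation in its own cluster, the formula collapses to the heteroskedasticity-robust (White) variance for the simple regression.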

15B Estimation of Error Components

The random effects model is:

y_it = β1 + β2 x_it + (u_i + e_it)    (Eq. 15B.1)

We transform the panel data regression into "deviations about the individual means" form:

y_it − ȳ_i = β2(x_it − x̄_i) + (e_it − ē_i)    (Eq. 15B.2)

A consistent estimator of σ²_e is:

σ̂²_e = Σ_i Σ_t [ (y_it − ȳ_i) − b2(x_it − x̄_i) ]² / (NT − N − 1)    (Eq. 15B.3)

where the divisor reflects the N estimated individual means and the one estimated slope

The estimator of σ²_u requires a bit more work. Write:

ȳ_i = β1 + β2 x̄_i + (u_i + ē_i)    (Eq. 15B.4)

- Least squares applied to this regression on the individual means is called the between estimator
- It uses variation between individuals as a basis for estimating the regression parameters
- This estimator is unbiased and consistent, but not minimum variance under the error assumptions of the random effects model

The error term u_i + ē_i has homoskedastic variance:

var(u_i + ē_i) = σ²_u + σ²_e / T    (Eq. 15B.5)

An estimate of this variance is obtained from the between-regression residuals:

σ̂²_{u+ē} = Σ_i (ȳ_i − b1 − b2 x̄_i)² / (N − 2)    (Eq. 15B.6)

Therefore:

σ̂²_u = σ̂²_{u+ē} − σ̂²_e / T    (Eq. 15B.7)
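The within and between regressions can be combined into one sketch of the variance-component estimates. The degrees-of-freedom divisors below follow one common convention for a single regressor, and the simulated balanced panel is illustrative:

```python
import numpy as np

def variance_components(y, x, ids):
    """Estimate sigma^2_e and sigma^2_u for the one-regressor random
    effects model from a balanced panel: sigma^2_e from the within
    (deviations-about-individual-means) regression, sigma^2_u from the
    between regression on the individual means."""
    uniq = np.unique(ids)
    N = len(uniq)
    T = len(y) // N  # balanced panel assumed
    ybar = np.array([y[ids == i].mean() for i in uniq])
    xbar = np.array([x[ids == i].mean() for i in uniq])
    # Within regression -> sigma^2_e
    yt = y - ybar[np.searchsorted(uniq, ids)]
    xt = x - xbar[np.searchsorted(uniq, ids)]
    b_w = (xt @ yt) / (xt @ xt)
    e_w = yt - b_w * xt
    s2_e = (e_w @ e_w) / (N * T - N - 1)  # N means and one slope estimated
    # Between regression -> estimate of sigma^2_u + sigma^2_e / T
    xb = xbar - xbar.mean()
    b_b = (xb @ ybar) / (xb @ xb)
    e_b = (ybar - ybar.mean()) - b_b * xb
    s2_between = (e_b @ e_b) / (N - 2)
    s2_u = s2_between - s2_e / T
    return s2_e, s2_u

# Simulated balanced panel with sigma^2_e = sigma^2_u = 1 (illustrative)
rng = np.random.default_rng(4)
N, T = 200, 5
ids = np.repeat(np.arange(N), T)
x = rng.normal(size=N * T)
u = np.repeat(rng.normal(size=N), T)
y = 1.0 + 0.5 * x + u + rng.normal(size=N * T)
s2_e, s2_u = variance_components(y, x, ids)
print(s2_e, s2_u)
```

Note that σ̂²_u can come out negative in small samples, since it is a difference of two estimates; software typically truncates it at zero.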