Introduction
Describe what panel data is and the reasons for using it in this format.
Assess the importance of fixed and random effects.
Examine the Hausman test, which determines whether fixed or random effects should be used.
Evaluate some panel data models.

Panel Data
Panel data models combine cross-section and time-series data. In panel data the same cross-sectional unit (industry, firm, country) is surveyed over time, so the data are pooled over space as well as time.

Reasons for using Panel Data
1. Panel data can take explicit account of individual-specific heterogeneity ("individual" here refers to the micro-unit).
2. By combining data in two dimensions, panel data gives more data variation, less collinearity and more degrees of freedom.
3. Panel data is better suited than cross-sectional data for studying the dynamics of change. For example, it is well suited to understanding transition behaviour such as company bankruptcy or merger.

4. Panel data is better at detecting and measuring effects that cannot be observed in either cross-section or time-series data.
5. Panel data enables the study of more complex behavioural models, for example the effects of technological change or economic cycles.
6. Panel data can minimise the effects of aggregation bias that arises from aggregating firms into broad groups.

If all the cross-sectional units have the same number of time series observations the panel is balanced; if not, it is unbalanced. The observations on a variable y in a balanced panel can be arranged as a matrix, with the N cross-sectional units along one dimension and the T time series observations along the other.
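The balanced/unbalanced distinction can be checked mechanically. The sketch below is only an illustration: the function name `is_balanced` and the firm and year labels are made up for the example, and the data are assumed to arrive as one (unit, period) pair per observation.

```python
from collections import Counter


def is_balanced(observations):
    """Return True if every unit has the same number of time observations.

    `observations` is an iterable of (unit, period) pairs, one per row.
    """
    # Count how many distinct periods are observed for each unit.
    counts = Counter(unit for unit, _period in set(observations))
    return len(set(counts.values())) <= 1


# Hypothetical panel: 3 firms observed in 3 years.
balanced = [(firm, year) for firm in ("A", "B", "C") for year in (2001, 2002, 2003)]
# Dropping one firm-year makes the panel unbalanced.
unbalanced = [row for row in balanced if row != ("B", 2002)]

print(is_balanced(balanced))    # True
print(is_balanced(unbalanced))  # False
```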

Suppose y is investment and x is a measure of profit. We have i = 1…N companies and t = 1…T time periods. Suppose we specify a simple econometric model which says that investment depends on profit:
$y_{it} = a_0 + a_1 x_{it} + u_{it}$ (1)
Here $u_{it}$ is a random error term with $u_{it} \sim N(0, \sigma^2)$. Estimation of (1) depends on the assumptions that we make about the intercept ($a_0$), the slope coefficient ($a_1$) and the error term ($u_{it}$).

Several possible assumptions can be made in order to estimate (1):
1. The intercept and slope coefficients are constant across time and firms, and the error term captures differences over time and over firms.
2. The slope coefficient is constant but the intercept varies over firms.
3. The slope coefficient is constant but the intercept varies over firms and over time.
4. All coefficients (intercept and slope) vary over firms.
5. The intercept as well as the slope vary over firms and time.

Pooled regression by OLS
This is estimation option 1 on the list. But pooled regression may result in heterogeneity bias.
[Figure: scatter of y against x showing the true model for each of Firms 1 to 4 and the pooled regression $y_{it} = a_0 + a_1 x_{it} + u_{it}$.]
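Heterogeneity bias can be reproduced with a small constructed data set. The numbers below are invented for illustration and are not taken from the slides: four firms share a true slope of 1, but firms with larger x have lower intercepts, and the error term is left out so that the arithmetic is exact.

```python
import numpy as np

n_firms, n_periods = 4, 5
firm = np.repeat(np.arange(n_firms), n_periods)   # 0,0,0,0,0,1,1,...
period = np.tile(np.arange(n_periods), n_firms)   # 0,1,2,3,4,0,1,...

# Assumed data-generating process: common slope, firm-specific intercepts.
true_slope = 1.0
firm_intercepts = np.array([100.0, 80.0, 60.0, 40.0])
x = 10.0 * firm + period
y = firm_intercepts[firm] + true_slope * x


def ols(X, y):
    """Least-squares coefficients for design matrix X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Pooled regression: one intercept and one slope for all firms.
pooled_slope = ols(np.column_stack([np.ones_like(x), x]), y)[1]

# Separate regression for each firm recovers the true slope.
firm_slopes = [
    ols(np.column_stack([np.ones(n_periods), x[firm == i]]), y[firm == i])[1]
    for i in range(n_firms)
]

print(round(pooled_slope, 3))              # -0.969
print([round(s, 3) for s in firm_slopes])  # [1.0, 1.0, 1.0, 1.0]
```

The pooled slope even has the wrong sign, because the differences in intercepts between firms dominate the variation in x.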

Fixed Effects Estimation
The previous slide suggests that a better way to model the data would be to allow each group (firm) to have its own intercept:
$y_{it} = a_{0i} + a_1 x_{it} + u_{it}$
This is known as the (One Way) Fixed Effects Model. How do we estimate it? The simplest way to allow each firm to have its own intercept is to create a set of dummy (binary) variables, one for each firm, and include them as regressors. Consequently, this form of estimation is also known as Least Squares Dummy Variables (LSDV). (Note that there is no constant in this regression.)

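However, if there are a lot of groups (firms) it becomes very tedious to create all the dummy variables needed. Some econometric software (e.g. Limdep) is able to automate this. The method used is called the covariance estimator and works by "differencing" out the fixed effect by expressing variables as deviations from their group means, $\bar{y}_i$ and $\bar{x}_i$:
$y_{it} - \bar{y}_i = a_1 (x_{it} - \bar{x}_i) + (u_{it} - \bar{u}_i)$
So the firm-specific intercept drops out and $a_1$ can be estimated by OLS on the transformed variables. A further extension is to allow the intercept to vary across the different time periods as well (Two Way Fixed Effects):
$y_{it} = a_{0i} + \lambda_t + a_1 x_{it} + u_{it}$
where $a_{0i}$ is the firm-specific intercept and $\lambda_t$ is a period-specific shift in the intercept.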
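That the dummy-variable (LSDV) regression and the covariance estimator give the same slope can be checked numerically. The sketch below uses simulated data with made-up parameter values (four firms, 25 periods, true slope 1); none of these numbers come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_firms, n_periods = 4, 25
firm = np.repeat(np.arange(n_firms), n_periods)

# Assumed data-generating process with firm-specific intercepts.
firm_intercepts = np.array([100.0, 80.0, 60.0, 40.0])
x = 10.0 * firm + rng.normal(size=firm.size)
y = firm_intercepts[firm] + 1.0 * x + rng.normal(scale=0.5, size=firm.size)

# LSDV: one dummy per firm plus x, and no constant.
dummies = (firm[:, None] == np.arange(n_firms)).astype(float)
lsdv_coef, *_ = np.linalg.lstsq(np.column_stack([dummies, x]), y, rcond=None)
lsdv_slope = lsdv_coef[-1]


def demean_by_group(values, groups, n_groups):
    """Express values as deviations from their group means."""
    sums = np.bincount(groups, weights=values, minlength=n_groups)
    sizes = np.bincount(groups, minlength=n_groups)
    return values - (sums / sizes)[groups]


# Covariance (within) estimator: OLS on deviations from group means.
x_w = demean_by_group(x, firm, n_firms)
y_w = demean_by_group(y, firm, n_firms)
within_slope = (x_w @ y_w) / (x_w @ x_w)

print(lsdv_slope, within_slope)  # identical up to rounding error
```

The demeaning route avoids building one dummy column per firm, which matters once the number of firms is large.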

The time dummy coefficients allow the regression function to shift over time to capture changes in technology, government regulation, tax policy, external influences (wars…) etc.
Allowing intercept and slope coefficients to vary across groups
If we have a sufficiently long time dimension to the panel, we could of course just estimate a separate OLS regression for each group (firm). If the number of firms (cross-sectional dimension) is small, then we could estimate a single regression with interactions between x and the group dummy variables D.

Random Effects Estimation
The fixed effects model assumes that each group (firm) has a non-stochastic group-specific component to y. Including dummy variables is a way of controlling for unobservable effects on y. But these unobservable effects may be stochastic (i.e. random). The Random Effects Model attempts to deal with this:
$y_{it} = a_0 + a_1 x_{it} + v_i + \varepsilon_{it}$
Here the unobservable component, $v_i$, is treated as a component of the random error term. $v_i$ is the element of the error which varies between groups but not within groups. $\varepsilon_{it}$ is the element of the error which varies over group and time.

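We assume that $v_i$ and $\varepsilon_{it}$ each have zero mean and constant variance ($\sigma_v^2$ and $\sigma_\varepsilon^2$ respectively), and that they are uncorrelated with each other and with x. (We could also introduce an error component which varies across time periods but not across groups: two way random effects.) OLS is not appropriate for the random effects model, because the composite error $v_i + \varepsilon_{it}$ is correlated within each group, which makes OLS inefficient and its usual standard errors invalid. Instead a technique known as generalised least squares (GLS) is used.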
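One common way to carry out GLS for the random effects model is to run OLS on "quasi-demeaned" data, subtracting a fraction θ of each group mean. The sketch below is illustrative only: the data are simulated, the parameter values are invented, and the variance components are treated as known, whereas in practice they are estimated first (feasible GLS).

```python
import numpy as np

rng = np.random.default_rng(1)
n_groups, n_periods = 200, 5
sigma_v, sigma_eps = 2.0, 1.0          # assumed known for this sketch
group = np.repeat(np.arange(n_groups), n_periods)

# Simulated random effects model: v_i is uncorrelated with x.
v = rng.normal(scale=sigma_v, size=n_groups)
x = rng.normal(size=group.size)
y = 3.0 + 1.5 * x + v[group] + rng.normal(scale=sigma_eps, size=group.size)

# Share of the group mean removed by the GLS transformation.
theta = 1.0 - np.sqrt(sigma_eps**2 / (sigma_eps**2 + n_periods * sigma_v**2))


def group_mean(values):
    """Group mean of `values`, repeated for every observation in the group."""
    return (np.bincount(group, weights=values) / n_periods)[group]


# Quasi-demean y, x and the constant, then run OLS on the transformed data.
y_star = y - theta * group_mean(y)
x_star = x - theta * group_mean(x)
const_star = np.full_like(x, 1.0 - theta)
coef, *_ = np.linalg.lstsq(np.column_stack([const_star, x_star]), y_star, rcond=None)
re_intercept, re_slope = coef

print(round(theta, 3))  # 0.782
print(re_intercept, re_slope)
```

With θ = 0 the transformation reduces to pooled OLS, and as θ approaches 1 it approaches the fixed effects (within) transformation.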

Choosing between Fixed Effects (FE) and Random Effects (RE)
1. With large T and small N there is likely to be little difference, so FE is preferable as it is easier to compute.
2. With large N and small T, estimates can differ significantly. If the cross-sectional groups are a random sample of the population, RE is preferable. If not, FE is preferable.
3. If the error component, $v_i$, is correlated with x then RE is biased, but FE is not.
4. For large N and small T, and if the assumptions behind RE hold, RE is more efficient than FE.

Hausman test
Tests for the statistical significance of the difference between the coefficient estimates obtained by FE and by RE. Under the null hypothesis both estimators are consistent, with RE efficient and FE inefficient; under the alternative only FE is consistent. The test has a Wald test form, and is usually reported in Chi2 form with k-1 degrees of freedom (k is the number of regressors, counting the constant, so there is one degree of freedom for each slope coefficient compared). If W < critical value then random effects is the preferred estimator; otherwise fixed effects is preferred.
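For a single slope coefficient the Hausman statistic reduces to the squared difference between the FE and RE estimates divided by the difference in their variances. The sketch below is an illustration under stated assumptions: the data are simulated so that $v_i$ is correlated with x, the variance components used for the RE transformation are treated as known, and 3.841 is the 5% critical value of the chi-squared distribution with 1 degree of freedom.

```python
import numpy as np

rng = np.random.default_rng(2)
n_groups, n_periods = 200, 5
sigma_v, sigma_eps = 2.0, 1.0          # assumed known for this sketch
group = np.repeat(np.arange(n_groups), n_periods)

# Simulated data in which the group effect v_i is correlated with x.
v = rng.normal(scale=sigma_v, size=n_groups)
x = 0.5 * v[group] + rng.normal(size=group.size)
y = 3.0 + 1.5 * x + v[group] + rng.normal(scale=sigma_eps, size=group.size)


def group_mean(values):
    """Group mean of `values`, repeated for every observation in the group."""
    return (np.bincount(group, weights=values) / n_periods)[group]


# Fixed effects (within) estimate and its variance.
x_w, y_w = x - group_mean(x), y - group_mean(y)
b_fe = (x_w @ y_w) / (x_w @ x_w)
resid_fe = y_w - b_fe * x_w
s2_eps = (resid_fe @ resid_fe) / (group.size - n_groups - 1)
var_fe = s2_eps / (x_w @ x_w)

# Random effects (GLS by quasi-demeaning) estimate and its variance.
theta = 1.0 - np.sqrt(sigma_eps**2 / (sigma_eps**2 + n_periods * sigma_v**2))
X_star = np.column_stack([np.full_like(x, 1.0 - theta), x - theta * group_mean(x)])
y_star = y - theta * group_mean(y)
b_re_all, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
b_re = b_re_all[1]
var_re = s2_eps * np.linalg.inv(X_star.T @ X_star)[1, 1]

# Hausman statistic for one slope coefficient (1 degree of freedom).
hausman = (b_fe - b_re) ** 2 / (var_fe - var_re)
CHI2_1DF_5PCT = 3.841
prefer_fixed_effects = hausman > CHI2_1DF_5PCT

print(b_fe, b_re, hausman, prefer_fixed_effects)
```

Because x was built to be correlated with $v_i$, the RE estimate is pulled away from the FE estimate and the statistic exceeds the critical value, pointing to fixed effects.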

Autocorrelation
Although autocorrelation in panel data differs from autocorrelation in the usual OLS models, a version of the Durbin-Watson test can be used in the usual way (EViews reports this). To remedy autocorrelation we can use the usual methods, such as the Error Correction Model. 'Dynamic Models' are also often used, which basically involves adding a lagged dependent variable. Recently, adjusting the standard errors has become popular; the most common method is the 'Newey-West' adjustment.
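A Durbin-Watson-type statistic for panel residuals can be sketched by differencing residuals only within each unit, so that the last observation of one unit is never compared with the first observation of the next. The function name and the toy residuals below are made up for illustration, and the rows are assumed to be sorted by unit and then by time; this is not necessarily the exact statistic any particular package reports.

```python
import numpy as np


def panel_durbin_watson(residuals, unit):
    """Durbin-Watson-type statistic using within-unit differences only.

    Assumes rows are sorted by unit and, within each unit, by time.
    """
    residuals = np.asarray(residuals, dtype=float)
    unit = np.asarray(unit)
    diffs = np.diff(residuals)
    same_unit = unit[1:] == unit[:-1]   # drop differences that cross units
    return np.sum(diffs[same_unit] ** 2) / np.sum(residuals ** 2)


unit = np.repeat([0, 1], 6)
# Positively autocorrelated residuals: long runs of the same sign.
persistent = np.tile([1.0, 1.0, 1.0, -1.0, -1.0, -1.0], 2)
# Negatively autocorrelated residuals: the sign flips every period.
alternating = np.tile([1.0, -1.0, 1.0, -1.0, 1.0, -1.0], 2)

print(round(panel_durbin_watson(persistent, unit), 3))   # 0.667
print(round(panel_durbin_watson(alternating, unit), 3))  # 3.333
```

As with the ordinary Durbin-Watson statistic, values well below 2 suggest positive autocorrelation and values well above 2 suggest negative autocorrelation.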

Heteroskedasticity
Given that there is a cross-section component to panel data, there will always be a potential for heteroskedasticity. Although there are various tests for heteroskedasticity, as with autocorrelation there is a tendency to use adjusted standard errors automatically. These leave the coefficient estimates unchanged but keep inference valid when heteroskedasticity is present. With heteroskedasticity, it is usually White's adjusted standard errors that are used.
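White's adjustment replaces the usual OLS variance formula with a "sandwich" built from the squared residuals. The sketch below shows the basic (HC0) form on simulated data; the function name, sample size and parameter values are invented for illustration, and it is written for a plain OLS regression rather than for any particular panel estimator.

```python
import numpy as np


def ols_with_white_se(X, y):
    """OLS coefficients with classical and White (HC0) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    n, k = X.shape
    # Classical standard errors assume a constant error variance.
    classical_se = np.sqrt(np.diag(XtX_inv) * (resid @ resid) / (n - k))
    # White: weight each observation's outer product by its squared residual.
    meat = X.T @ (X * resid[:, None] ** 2)
    white_se = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, classical_se, white_se


rng = np.random.default_rng(3)
n_obs = 2000
x = rng.uniform(-1.0, 1.0, size=n_obs)
# Heteroskedastic errors: the error spread grows with |x|.
y = 1.0 + 2.0 * x + np.abs(x) * rng.normal(size=n_obs)

X = np.column_stack([np.ones(n_obs), x])
beta, classical_se, white_se = ols_with_white_se(X, y)

print(beta)
print(classical_se)
print(white_se)
```

The coefficient estimates are the same under either formula; only the standard errors change. In this simulation the White standard error for the slope is larger than the classical one, because the error variance is highest where x is far from its mean.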

Example
The data consist of 20 countries over 10 years of annual data, giving 200 observations in all (N = 20, T = 10, NT = 200). In the regression reported on this slide, stock prices are regressed on expenditure on research (r).

Example
The results are interpreted in the usual way; however, you would need to decide whether to use fixed or random effects in this model. This requires estimating the model with random effects and then using the Hausman test to determine whether random effects is appropriate. If the null hypothesis is rejected, then the fixed effects model is more appropriate. This is an example of a basic OLS panel data model. You can apply all the other models to panel data too, although this usually requires separate tests (e.g. the test for cointegration).

Conclusion
Panel data methods estimate models on data that are both time series and cross-sectional.
They have advantages but also disadvantages relative to OLS estimation.
They can be combined with many different techniques, such as tests for stationarity.