PANEL REGRESSION (T≥2) (plm package)

By Mike Jadoo

Purpose
Raise awareness of panel regression.
Enable individuals to build their analysis properly.
Select the most appropriate model(s).

Special Thanks
MLK Library Digital Commons lab

Lecture Structure
Slides, code, and datasets are in the group's Meetup files section.
The lecture uses R i386 3.2.3 or RStudio.
You can code along with the lecture or run the code afterwards.
There will be some web exercises.

R i386 3.2.3 (32 bit) / RStudio: packages and their tasks
plm: panel regression models
lmtest: additional statistical tests
XLConnect: read in Excel files
fBasics: Jarque-Bera test
tseries: ADF test
stargazer: nicely formatted reports
psych: describe() for descriptive statistics
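A minimal sketch of how the packages listed above might be checked and loaded before the lecture. The object names `pkgs` and `installed` are assumptions made for illustration; the package names are spelled as they appear on CRAN, which is case-sensitive.

```r
# Packages used in the lecture (CRAN names are case-sensitive)
pkgs <- c("plm", "lmtest", "XLConnect", "fBasics", "tseries",
          "stargazer", "psych")

# Which of them are already installed on this machine?
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Install anything that is missing (uncomment to run; needs internet access)
# install.packages(pkgs[!installed])

# Attach the packages that are available for this session
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```

Packages reported as `FALSE` need to be installed before the matching part of the lecture code will run.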

Data series
SNAP benefits data from USDA
Civilian population estimates from Census
Food store employment data from BEA

Orientation (TYPES OF DATA)
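Cross-sectional data: many subjects observed at one point in time (the usual setting for an OLS model).
Pooled cross-sectional data: cross sections collected at two or more points in time, where the subjects sampled may differ from period to period.
Time series: one variable observed over multiple periods of time.
Panel data: multiple variables observed for the same subjects over multiple periods of time.
These distinctions are worth going over.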

Orientation (Terminology Clarification)
Terms: longitudinal data, pooled cross-sectional data, time series, panel data.
Longitudinal data: a collection of sampled observations about particular individuals, firms, or places over multiple periods of time. It combines linear regression and time series analysis: each subject (category) is observed over time but, unlike time series data, there are many subjects, each observed more than once.
Terminology clarification: the panel data model is often called the longitudinal data model (regression), as it is in biostatistics. **Be aware of this whenever researching or searching online for the subject.**

History of Panel Data Regression
1861: Sir George Biddell Airy's analysis of astronomical data.
1925: R. A. Fisher explained more fully the concepts and methods of both fixed effects and random effects.
1977: at the First Paris Conference, experts started to convene and share ideas.
Sir George Biddell Airy was the first person to use this method, in his 1861 analysis of astronomical data; he used a random effects model. R. A. Fisher is noted for explaining more fully the concepts and methods of both fixed effects and random effects in 1925. After many papers and attempts to modernize the model, the first panel data conference, called the "First Paris Conference", was held in 1977 to bring together experts to share ideas and new discoveries in panel data methodologies.

Why use panel regression model?
1. Gives more observations to analyze.
2. The same individual units are observed over time, so more complicated dynamics and behavioral hypotheses can be tested than with a linear regression model on one-dimensional data.
3. Provides a means of analyzing more fully the nature of the unobserved errors and the individual (idiosyncratic) errors, that is, the disturbance terms. Disturbance terms are supposed to measure the effects of all kinds of other external factors.

Statistical Modeling
1. Review the topic's theory or use past experience
2. Formulate an initial model
3. Find the data
4. Check the data
5. Estimate the model
6. Reformulate the model
7. Check the parameter estimates
8. Interpret your results

Statistical modeling process review
Create the hypothesis: what are you trying to analyze or predict?
Go over the topic's relevant theory: this may involve extensive reading, but it is a good first start!!

Finding the data sources
Government sources are a good first start!!
Can't find the data you're looking for? Staff are there to help.
There are more providers of data; some charge a fee.
(Data source list presented) longitudinal:

Panel data sets sources
Web exercise
Please see the R code text file for a list of resources.

Methodology
Review the data series methodology (the document that tells you how the data is made). Is this acceptable?
Web exercise: bls.gov/mfp

Data structure An example

Data structure
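As a sketch of the layout, the snippet below builds a tiny long-format panel: one row per state per year, with the identifier and time columns first. The numbers are invented purely to show the structure and are not the lecture's data; the object names `panel`, `obs_per_state`, and `is_balanced` are likewise assumptions for illustration.

```r
# Long-format panel: each row is one state observed in one year.
# All values below are made up purely to show the layout.
panel <- data.frame(
  STATE     = rep(c("AL", "AK", "AZ"), each = 3),
  YRS       = rep(2008:2010, times = 3),
  FOODEMPLY = c(41000, 40500, 40800, 6100, 6200, 6150, 52000, 51000, 51500),
  SNAPB     = c(700000, 900000, 1100000, 90000, 120000, 150000,
                800000, 1200000, 1500000)
)

print(panel)

# A balanced panel has the same number of time periods for every subject
obs_per_state <- table(panel$STATE)
print(obs_per_state)

is_balanced <- length(unique(as.vector(obs_per_state))) == 1
print(is_balanced)
```

Keeping the data in this long format, rather than one column per year, is what the panel model functions expect.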

Panel Regression
Model types: pooled (OLS), fixed effects, random effects, first differencing.
The R demonstration follows the lectures of Ani Katchova (Econometrics Academy) and Sayed Hossain (Hossain Academy).
Pooled: put all observations together and run an OLS regression. This takes out the cross-section and time-series features of the data table. By combining observations we deny any heterogeneity or individual effects that may exist; all categories are assumed to be the same.
Fixed effects: allows for heterogeneity or individuality among categories (individuals) by giving each its own intercept term. Categories are not the same.
Random effects: category intercepts are treated as random draws around a common mean value (a common intercept).
First difference: each variable is differenced across adjacent time periods, which removes anything that is time invariant (the same value across time), including the unobserved individual effect.
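To make the difference between these estimators concrete, here is a minimal base-R sketch on simulated data. All object names and numbers are assumptions for illustration, not the lecture's data. It fits a pooled regression, then obtains the fixed effects slope two ways (a dummy per category, and subtracting each category's mean), and finally a first-difference slope.

```r
set.seed(1)

n_id <- 10   # number of categories (e.g. states)
n_t  <- 5    # number of time periods

id <- rep(seq_len(n_id), each = n_t)
a  <- rnorm(n_id, sd = 2)[id]                      # unobserved effect, constant within id
x  <- 0.8 * a + rnorm(n_id * n_t)                  # regressor correlated with the effect
y  <- 1 + 2 * x + a + rnorm(n_id * n_t, sd = 0.5)  # slope used in the simulation is 2

# Pooled OLS: one common intercept, heterogeneity ignored
pooled <- lm(y ~ x)

# Fixed effects via dummy variables: one intercept per category
lsdv <- lm(y ~ x + factor(id))

# Fixed effects via the within transformation: demean by category
x_dm <- x - ave(x, id)
y_dm <- y - ave(y, id)
within_fit <- lm(y_dm ~ x_dm - 1)

# First differences within each category also remove the constant effect
dx <- unlist(tapply(x, id, diff))
dy <- unlist(tapply(y, id, diff))
fd <- lm(dy ~ dx - 1)

round(c(pooled = unname(coef(pooled)["x"]),
        lsdv   = unname(coef(lsdv)["x"]),
        within = unname(coef(within_fit)["x_dm"]),
        fd     = unname(coef(fd)["dx"])), 3)
```

The dummy-variable and demeaned regressions give the same slope, which is why the fixed effects model is also called the within model.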

Model Assumptions: Fixed Effects
1. The model has parameters to be estimated and an unobserved effect ai.
2. The data come from a random sample of the cross section.
3. Each X variable changes over time, and no perfect linear relationships exist among the X's.
4. For each period, the expected value of the idiosyncratic error, given all X's and the ai, is 0.
5. The variance of the idiosyncratic errors, given all X's and the ai, is constant.
6. The idiosyncratic errors are uncorrelated across time periods.
7. The idiosyncratic errors are independent and normally distributed.
Source: Wooldridge, Introductory Econometrics.
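Written out, the model behind assumptions 1, 4, and 5 can be sketched as follows. The notation is an assumption based on the slide's wording: y_it is the outcome for category i in period t, x_itj are the explanatory variables, a_i is the unobserved effect, u_it is the idiosyncratic error, and X_i collects all explanatory variables for category i across periods.

```latex
\begin{aligned}
y_{it} &= \beta_1 x_{it1} + \dots + \beta_k x_{itk} + a_i + u_{it},
  \qquad t = 1, \dots, T \\
\mathrm{E}(u_{it} \mid X_i, a_i) &= 0 \\
\mathrm{Var}(u_{it} \mid X_i, a_i) &= \sigma_u^2
\end{aligned}
```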

Model Assumptions: Random Effects
1. The model has parameters to be estimated and there is an unobserved effect ai.
2. The data come from a random sample of the cross section.
3. No perfect linear relationships exist among the X's.
4. For each period, the expected value of the idiosyncratic error, given all X's and the ai, is 0. Also, the expected value of ai, given all X's, equals the constant term.
5. The variance of the idiosyncratic errors, given all X's and the ai, is constant. Also, the variance of ai, given all X's, is constant.
6. The idiosyncratic errors are uncorrelated across time periods.

Fixed vs Random Effects
Fixed effects: assumes that the individual effects are correlated with the other X's; used to study the causes of changes within a person [or entity].
Random effects: assumes that the individual effects are uncorrelated with the other X's.

Demonstration: Hypothesis and Data
Hypothesis: "Do food stamp benefits affect grocery store employment? If so, by how much?"
Data:
FOODEMPLY: food store employment (BEA)
SNAPP: average annual participation in SNAP
SNAPB: SNAP benefits distributed, in thousands of dollars
CIVPOP: estimated civilian population
STATE: state identifier for all 50 states
YRS: time variable, years from 2008 to 2012
You can also experiment with your own data set.

Creating the Model
Create a scatter plot and a histogram (R exercise).
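A minimal sketch of the two plots using base R graphics. The data frame `demo` is simulated and only stands in for the lecture data; the column names LSNAPB and LFoodEmply are borrowed from the model code used in the demonstration.

```r
set.seed(42)

# Simulated stand-in for the lecture data (values are not real)
LSNAPB <- rnorm(100, mean = 13, sd = 1)
demo <- data.frame(LSNAPB = LSNAPB,
                   LFoodEmply = 2 + 0.6 * LSNAPB + rnorm(100, sd = 0.4))

# Scatter plot: dependent variable against one explanatory variable
plot(demo$LSNAPB, demo$LFoodEmply,
     xlab = "log SNAP benefits", ylab = "log food store employment",
     main = "Food store employment vs SNAP benefits")
abline(lm(LFoodEmply ~ LSNAPB, data = demo), lty = 2)  # fitted line as a guide

# Histogram: shape of the dependent variable's distribution
h <- hist(demo$LFoodEmply,
          xlab = "log food store employment",
          main = "Distribution of log food store employment")
```

The scatter plot gives a first look at the direction and rough linearity of the relationship; the histogram shows whether the series is skewed or has outliers.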

Creating the Model
Examine the data and create the summary statistics.
These statistical measures and illustrations should be included in your report.
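One possible way to produce the summary statistics, sketched on simulated data. The data frame `demo` and the table `stats` are assumptions for illustration; the column names come from the model code used in the demonstration.

```r
set.seed(42)

# Simulated stand-in for two of the lecture variables (values are not real)
demo <- data.frame(
  LSNAPB  = rnorm(100, mean = 13, sd = 1),
  LCivPop = rnorm(100, mean = 15, sd = 1)
)

# Minimum, quartiles, median, mean, and maximum for every column
summary(demo)

# A compact table with the measures usually reported
stats <- sapply(demo, function(v) {
  c(n = length(v), mean = mean(v), sd = sd(v),
    median = median(v), min = min(v), max = max(v))
})
round(stats, 3)

# psych::describe(demo) gives a similar table, plus skew and kurtosis,
# if the psych package is installed
```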

Test the series for normality
Jarque-Bera (JB) test
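To show what the test measures, the sketch below computes the Jarque-Bera statistic by hand in base R on a simulated series. The variable names are assumptions for illustration. In practice a packaged version such as `tseries::jarque.bera.test()` would normally be used.

```r
set.seed(7)
x <- rnorm(200)   # simulated series; replace with your own variable

n  <- length(x)
m  <- mean(x)
s2 <- mean((x - m)^2)                 # variance with divisor n
skew <- mean((x - m)^3) / s2^(3 / 2)  # sample skewness (0 for a normal)
kurt <- mean((x - m)^4) / s2^2        # sample kurtosis (3 for a normal)

# Jarque-Bera statistic; under normality it is approximately chi-squared(2)
jb <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
p_value <- pchisq(jb, df = 2, lower.tail = FALSE)

c(JB = jb, p.value = p_value)

# Ho: the series is normally distributed.
# A small p-value (e.g. below 0.05) is evidence against normality.
```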

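Checking for Perfect Collinearity
Correlation matrix of all variables:
x <- newdata[3:6]
y <- newdata[3:6]
cor(x, y)
Output: a 4 x 4 correlation matrix with rows and columns LSNAPP, LSNAPB, LFoodEmply, LCivPop.
Correlation matrix of just the explanatory variables (selected by name, so that LFoodEmply stays in newdata for the models):
x <- newdata[c("LSNAPP", "LSNAPB", "LCivPop")]
y <- newdata[c("LSNAPP", "LSNAPB", "LCivPop")]
cor(x, y)
Output: a 3 x 3 correlation matrix with rows and columns LSNAPP, LSNAPB, LCivPop.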

Panel Regression
Set the panel regression
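One way to set up the panel structure is `pdata.frame()` from the plm package, sketched here on simulated data. The data frame below is a made-up stand-in for `newdata`; the index column names "id" and "t" match those used in the lecture's model calls, and `pnl` is a name chosen for illustration.

```r
library(plm)

set.seed(3)
# Simulated stand-in: 4 subjects ("id") observed over 5 periods ("t")
newdata <- data.frame(
  id     = rep(1:4, each = 5),
  t      = rep(2008:2012, times = 4),
  LSNAPB = rnorm(20)
)
newdata$LFoodEmply <- 1 + 0.5 * newdata$LSNAPB + rnorm(20)

# Declare which columns identify the subject and the time period
pnl <- pdata.frame(newdata, index = c("id", "t"))

pdim(pnl)           # panel dimensions: subjects, periods, total rows
is.pbalanced(pnl)   # TRUE when every subject has every period
```

Passing `index = c("id", "t")` directly to `plm()` is an alternative to creating the panel data frame first.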

Panel Regression
Create and save the results for the different types of panel models, then use the LM test to find the best one. (R exercise)
library(plm)
# Pooled OLS estimator
ols <- plm(LFoodEmply ~ LSNAPP + LSNAPB + LCivPop, data = newdata, index = c("id", "t"), model = "pooling")
# First difference
firstdiff <- plm(LFoodEmply ~ LSNAPP + LSNAPB + LCivPop, data = newdata, index = c("id", "t"), model = "fd")
# Fixed effects (within)
fixed <- plm(LFoodEmply ~ LSNAPP + LSNAPB + LCivPop, data = newdata, index = c("id", "t"), model = "within")
# Random effects
random <- plm(LFoodEmply ~ LSNAPP + LSNAPB + LCivPop, data = newdata, index = c("id", "t"), model = "random")

Panel Regression
To choose between fixed and random effects, use the Hausman test.
# Hausman test for fixed versus random effects model
phtest(random, fixed)
Ho: the random effects model is appropriate
Ha: the fixed effects model is appropriate
Output:
Hausman Test
data: LFoodEmply ~ LSNAPP + LSNAPB + LCivPop
chisq = , df = 3, p-value = 3.803e-05
alternative hypothesis: one model is inconsistent
Because the p-value is below 0.05, Ho is rejected and the fixed effects model is the appropriate choice.

Test your model
If your panel data has a long time period, then:
Check for serial correlation: pbgtest(fixed)
Check for cross-sectional dependence (Baltagi): pcdtest(fixed)

Statistics of Fit
R2 and adjusted R2 (some say R2 doesn't matter)
Residual sum of squares (RSS) or mean squared error (MSE)
F-statistic: p < 0.05
Parameter estimates
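As an illustration of where these numbers come from, the sketch below extracts them from an ordinary `lm()` fit on simulated data. The object names are assumptions for illustration. Calling `summary()` on a plm model prints comparable figures.

```r
set.seed(11)
x <- rnorm(60)
y <- 1 + 0.7 * x + rnorm(60)   # simulated data, not the lecture's

fit <- lm(y ~ x)
s <- summary(fit)

rss <- sum(residuals(fit)^2)    # residual sum of squares
mse <- rss / df.residual(fit)   # mean squared error (RSS / residual df)

f   <- s$fstatistic             # F value, numerator df, denominator df
f_p <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)

c(R2 = s$r.squared, adjR2 = s$adj.r.squared, RSS = rss, MSE = mse,
  F = unname(f["value"]), F.p = unname(f_p))

coef(s)   # parameter estimates with standard errors, t values, p-values
```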

Statistics of Fit

Interpretation of model
How you say it counts!!
Cases to distinguish: logs, levels, and levels with a logged dependent variable.
Random effects: when X changes by one unit, on average across time and between states, Y changes by _______.
Fixed effects: Y changes by _____ over time, on average per state, when X increases by one unit.
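How a coefficient is read depends on whether the variables are in logs or levels. The sketch below works through the three cases with a hypothetical coefficient `b`. The value 0.25 is made up for illustration and is not a result from the lecture.

```r
b <- 0.25   # hypothetical coefficient, for illustration only

# Levels on both sides: Y ~ X.
# A one-unit increase in X goes with a change of b units in Y.
level_units <- b

# Logs on both sides: log(Y) ~ log(X). b is an elasticity:
# a 1% increase in X goes with roughly a b% change in Y.
loglog_pct   <- b * 1                  # approximate % change in Y
loglog_exact <- (1.01^b - 1) * 100     # exact % change in Y

# Logged dependent variable, X in levels: log(Y) ~ X.
# A one-unit increase in X goes with roughly a 100*b % change in Y.
loglevel_pct   <- 100 * b              # approximation, best for small b
loglevel_exact <- (exp(b) - 1) * 100   # exact % change in Y

round(c(level_units    = level_units,
        loglog_pct     = loglog_pct,
        loglog_exact   = loglog_exact,
        loglevel_pct   = loglevel_pct,
        loglevel_exact = loglevel_exact), 4)
```

The approximate and exact figures are close for small coefficients and drift apart as the coefficient grows, so the exact form is the safer one to report for large estimates.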

Report findings
stargazer: collects the essential parameter estimates in a nice format.

Summary
Orientation
History
Statistical modeling process
Data sources
Data structure of the panel data model
Panel data model in R
Interpretation of the model's parameter estimates
Reporting

MORE TO EXPLORE!!!

Announcements

Special Thanks
Ani Katchova: Econometrics Academy
Sayed Hossain: Hossain Academy
