Announcements
- Assignment 5 due
- Assignment 6 handed out
Panel Data
- Panel data typically refers to a particular type of multilevel data
- Measurement over time (T1, T2, T3…) is nested within persons (or firms, or countries)
- Each time point is referred to as a "wave"
- [diagram: Persons 1-4, each observed at waves T1-T5]
Panel Data
- Panel data involves combining:
  - Information about multiple cases: a "cross-sectional" component
  - Information about cases over time: a "longitudinal" or "time series" component
- Panel datasets are described in terms of:
  - N, the number of individual cases
  - T, the number of waves
- If N is "large" compared to T, the dataset is called "cross-sectionally dominant"
- If T is "large" compared to N, the dataset is called "time-series dominant"
- "Time-series cross-section" (TSCS) data: small set of units (usually 10-30), moderate T
Panel Terminology
- Issue: "panel" means two things:
  1. An umbrella term for all data with time-series and cross-section components
  2. Specifically, datasets with large N and small T (cross-sectionally dominant)
     - Example: a survey of 1000 people at 3 points in time
- Another distinction: balanced vs. unbalanced
  - Panel data are said to be "balanced" if information on each person is available for all T
  - If data are missing for some cases at some points in time, the data are "unbalanced"
  - This is common for many datasets on countries or firms
Panel Data Example
[table: example panel dataset with columns ID, Wave, Grade, Math Test, Family Income, Class Size, Gender]
Benefits of Panel Data
1. Pooling (across cases or across time) provides richer information
   - More observations = better
2. Panel data are longitudinal
   - You can follow individual cases over time
   - Allows us to study dynamic processes
   - Baltagi: "dynamics of adjustment"
   - Provides opportunities to better tease out causal relationships
3. Panel models allow us to control for individual heterogeneity
Benefits of Panel Data
4. Panel data allow investigation of issues that are obscured in cross-sectional data
   - Example: women's participation in the labor force
   - Suppose a cross-sectional dataset shows that 50% of women are in the labor force. What's going on?
     - Are 50% of women pursuing work and 50% staying at home?
     - Are 100% of women seeking employment, but many experiencing unemployment at any given time?
     - Or something in between?
   - Without longitudinal information, we can't develop a clear picture
Problems with Panel Data
1. Violates the OLS independence assumption
   - Clustering by cases
   - Clustering by time
   - Other sources? Ex: spatial correlation
2. For small N, large T (TSCS): poolability
   - Is it appropriate to combine very different cases?
3. Serial correlation
   - Temporally adjacent cases may have correlated error
4. Non-stationarity (for "larger T" data)
5. Other issues: heteroskedasticity, etc.
Panel Data Strategies
- Traditional strategy: fixed vs. random effects
  - Use the Hausman test to choose
- More recent strategies:
  1. Distinction between tools for "panel" vs. "TSCS" data
     - Different problems, different solutions
  2. Many more options
     - New solutions to existing problems
  3. More attention to "dynamic" models
     - Models that include lagged values (lags of Y)
- Issue: overall, not much consensus about what constitutes "best practices"
The Econometric Tradition
- Main focus of econometric studies: large N, small T panels
- Concern with "the omitted variables problem"
  - Unobserved heterogeneity; "individual" heterogeneity; unobserved effects
- Example: individual wages
  - Often modeled as a function of education, experience...
  - But individuals differ in ways that are difficult to control
- Strategy: use a model that "gets rid of" individual variability
  - Such as "fixed effects"
The Econometric Tradition
- Basic "static" panel model with unobserved effects:
  y_it = b·x_it + m_i + n_it
- Subscripts i, t refer to cases and time periods
- b·x_it = covariates
- m_i = unobserved unit-specific error
- n_it = idiosyncratic error
- Look familiar? Basically the same as the random and fixed effects models discussed previously
  - Other texts use a, u, or zeta instead of "mu"
  - Other texts use e for the error, instead of "nu"
- Issue: whether to treat m_i as "fixed" versus "random"
  - But there are other choices as well
Unobserved Effects
- Basic "static" panel model with unobserved effects:
  y_it = b·x_it + m_i + n_it
- Issue: m_i is the problem
  - The "unobserved" time-invariant features of an individual that may cause their wages to be especially high or low across time
  - Ex: too much TV as a child; or lots of parental pressure to make $; or genes or IQ or whatever
- Common strategies:
  1. Wipe it out: first differences
  2. Build it into the model: fixed effects
     - Or random effects, if we think m_i is not correlated with the Xs
  3. Control for it in some other way, such as with a lagged DV
     - Which already might reflect the impact of m_i
Unobserved Effects: First Differences
- First differences (2 waves of data):
  y_i2 − y_i1 = b·(x_i2 − x_i1) + (n_i2 − n_i1)
- Which can be expressed generally as:
  Δy_it = b·Δx_it + Δn_it
- Result: m_i goes away!
- Differencing allows us to estimate coefficients, purged of unit-specific (time-invariant) unobserved effects
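The cancellation of m_i can be checked with a small simulation (a Python/numpy sketch, not part of the slides; all names and numbers are illustrative). Pooled OLS is biased when the unobserved effect is correlated with X; the first-difference slope recovers the true coefficient:

```python
import numpy as np

rng = np.random.default_rng(0)
N, b = 200, 1.5                                # 200 cases, 2 waves, true slope 1.5

mu = rng.normal(scale=3.0, size=(N, 1))        # unit-specific unobserved effect m_i
x = rng.normal(size=(N, 2)) + 0.8 * mu         # X correlated with m_i
y = b * x + mu + rng.normal(scale=0.1, size=(N, 2))

# Pooled OLS on levels: biased upward, because m_i sits in the error
# and is correlated with X
b_pooled = (x.ravel() @ y.ravel()) / (x.ravel() @ x.ravel())

# First differences: m_i cancels, leaving delta-y = b * delta-x + delta-n
dy = y[:, 1] - y[:, 0]
dx = x[:, 1] - x[:, 0]
b_fd = (dx @ dy) / (dx @ dx)                   # close to the true b
```

Here `b_fd` lands near 1.5 while `b_pooled` is pushed well above it by the mu-X correlation.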
Unobserved Effects: Fixed Effects
- Fixed effects: start with the same basic model:
  y_it = b·x_it + m_i + n_it
- A second approach: the "within" transformation
  - Center everything around the unit-specific mean ("time demeaning")
- Which wipes out mu:
  (y_it − ybar_i) = b·(x_it − xbar_i) + (n_it − nbar_i)
- Equivalent to putting in dummy variables for every case
  - Estimating m as a "fixed" effect
- In Stata: xtreg wage educ, fe
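The within transformation can be illustrated the same way (a Python/numpy sketch, not from the slides; sizes and the true slope are made up). Because m_i is constant within a case, it drops out once each case is centered on its own mean:

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, b = 100, 5, 2.0                          # illustrative sizes, true slope 2.0

mu = rng.normal(scale=2.0, size=(N, 1))        # unit effect m_i
x = rng.normal(size=(N, T)) + 0.5 * mu         # X correlated with m_i
y = b * x + mu + rng.normal(scale=0.2, size=(N, T))

# "Within" transformation: subtract each case's own mean (time demeaning);
# m_i is constant within a case, so it vanishes from the demeaned equation
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)

b_fe = (x_w.ravel() @ y_w.ravel()) / (x_w.ravel() @ x_w.ravel())
```

`b_fe` recovers the true slope even though X is correlated with the unit effect.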
Fixed Effects vs. First Differencing
- FE and FD are very similar approaches
  - In fact, for 2-wave datasets, results are identical
  - For 3+ waves, results can differ
- Which is better for large N, small T?
  - FE is more efficient if there is no serial correlation
  - FD is better if there is lots of serial correlation
    - Ex: concerns about nonstationarity
- Which is better for small N, larger T (TSCS)?
  - More likely issues with nonstationarity and the spurious regression problem: FE = problematic, FD = better
  - FD can turn an I(1) process into a weakly dependent one. Fixed!
Fixed Effects vs. First Differencing
- General remarks (Wooldridge 2009):
  - Both FE and FD are biased if variables aren't strictly exogenous
    - X variables should be uncorrelated with present, past, and future error
    - This rules out including a lagged dependent variable
  - BUT: bias in FE declines with large T; bias in FD does not
- Wooldridge 2009, Ch 14 (p. 587): "Generally, it is difficult to choose between FE and FD when they give substantially different results. It makes sense to report both sets of results and try to determine why they differ."
Fixed Effects vs. First Differencing
- More remarks:
  1. FE and FD cannot estimate effects of variables that do not change over time
     - And can have problems if variables change rarely
  2. Neither FE nor FD is as efficient as models that include "between case" variability
     - A trade-off between efficiency and potential bias (if unobserved effects are correlated with the Xs)
  3. Both FE and FD are very sensitive to measurement error
     - Within-case variability is often small; it may be swamped by error/noise
"Two-way" Error Components
- Unobserved effects may occur over time
  - Example: a common effect of year (e.g., a recession) on wages
  - A "crossed" multilevel model
  - What Baltagi (2008) calls two-way error components
    - Versus basic FE/RE, which has one error component
- Strategies:
  - Use dummies for time in combination with FE/FD
  - Specify a "crossed" multilevel model in Stata
    - See Rabe-Hesketh and Skrondal
    - Basically, create a 3-level model, nesting groups underneath
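In the balanced linear case, the "time dummies + FE" strategy is equivalent to demeaning in both dimensions. A Python/numpy sketch (illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(6)
N, T, b = 200, 8, 1.0

mu = rng.normal(scale=2.0, size=(N, 1))        # unit effect
lam = rng.normal(scale=2.0, size=(1, T))       # wave effect (e.g., a recession year)
x = rng.normal(size=(N, T)) + 0.5 * mu + 0.5 * lam
y = b * x + mu + lam + rng.normal(scale=0.3, size=(N, T))

def demean2(a):
    """Two-way within transformation: remove unit means and time means."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

x_w, y_w = demean2(x), demean2(y)
b_2w = (x_w.ravel() @ y_w.ravel()) / (x_w.ravel() @ x_w.ravel())  # recovers b
```

Both the unit effect and the wave effect are wiped out, so `b_2w` lands on the true slope.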
Random Effects
- Random effects: additional efficiency at the cost of additional assumptions
- Key assumption: the unit-specific unobserved effect is not correlated with the X variables
  - If assumptions aren't met, results are biased
  - Omitted X variables often induce correlation between other X variables and the unobserved effect
  - If the main purpose of panel analysis is to avoid unobserved effects, fixed effects is a safer choice
- In Stata:
  xtreg wage educ, re     (GLS estimator)
  xtreg wage educ, mle    (ML estimator)
Random Effects
- The random effects model:
  y_it = b·x_it + m_i + n_it
- Then, instead of the "within" transformation, we take out only part of the within-case variation
- GLS estimator: "quasi time-demeaned data":
  y_it − θ·ybar_i = b·(x_it − θ·xbar_i) + [(1 − θ)·m_i + (n_it − θ·nbar_i)]
- Where:
  θ = 1 − sqrt( σ²_n / (σ²_n + T·σ²_m) )
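Quasi-demeaning is easy to see in a simulation (a Python/numpy sketch, not from the slides; the variance components are made up and, for simplicity, treated as known rather than estimated):

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, b = 500, 4, 1.0
sigma_mu, sigma_nu = 1.5, 0.5                  # illustrative variance components

mu = rng.normal(scale=sigma_mu, size=(N, 1))   # NOT correlated with X (RE assumption)
x = rng.normal(size=(N, T))
y = b * x + mu + rng.normal(scale=sigma_nu, size=(N, T))

# GLS weight theta; in practice the variances are estimated from the data
theta = 1 - np.sqrt(sigma_nu**2 / (sigma_nu**2 + T * sigma_mu**2))

# Quasi time-demeaning: subtract only a fraction theta of the case mean
x_q = x - theta * x.mean(axis=1, keepdims=True)
y_q = y - theta * y.mean(axis=1, keepdims=True)
b_re = (x_q.ravel() @ y_q.ravel()) / (x_q.ravel() @ x_q.ravel())
```

With theta between 0 and 1, the estimator sits between OLS (theta = 0) and FE (theta = 1), matching the "hybrid" description on the next slide.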
Random Effects
- The random effects model is a hybrid: an intermediate model between OLS and FE
  - If T is large, it becomes more like FE
  - If the unobserved effect has small variance (isn't important), RE becomes more like OLS
  - If the unobserved effect is big (cases hugely differ), results will be more like FE
- The GLS random effects estimator is for large N
  - Its properties for small N, large T aren't well studied
  - Beck and Katz advise against it
  - And, if you must use random effects, they recommend the ML estimator
Random Effects
- When to use random effects?
  1. If you are confident that the unit unobserved effect isn't correlated with the Xs
     - Either for theoretical reasons, or because you have lots of good controls
     - Ex: lots of regional dummies for countries
  2. Your main focus is time-constant variables
     - FE isn't an option. Hopefully #1 applies.
  3. When your focus is between-case variability
     - And/or there is hardly any within-case variability (#1!)
  4. When a Hausman test indicates that FE and RE yield similar results
Hausman Specification Test
- A tool to help evaluate the fit of fixed vs. random effects
- Logic: both fixed and random effects models are consistent if the models are properly specified
  - However, some model violations cause random effects models to be inconsistent
  - Ex: if X variables are correlated with the random error
- In short: the models should give the same results; if not, random effects may be biased
  - If results are similar, use the most efficient model: random effects
  - If results diverge, odds are that the random effects model is biased; in that case, use fixed effects
Hausman Specification Test
- Strategy: estimate both fixed and random effects models, saving the estimates each time; finally, invoke the Hausman test
- Ex:
  xtreg var1 var2 var3, i(groupid) fe
  estimates store fixed
  xtreg var1 var2 var3, i(groupid) re
  estimates store random
  hausman fixed random
Hausman Specification Test
- Example: environmental attitudes, FE vs. RE
  . hausman fixed random
  [output: coefficient comparison table for age, male, dmar, demp, educ, incomerel, ses, with columns (b) fixed, (B) random, (b-B) Difference, and sqrt(diag(V_b-V_B)) S.E.]
  - b = consistent under Ho and Ha; obtained from xtreg
  - B = inconsistent under Ha, efficient under Ho; obtained from xtreg
  - Test: Ho: difference in coefficients not systematic
    chi2(7) = (b-B)'[(V_b-V_B)^(-1)](b-B)
- Direct comparison of coefficients
- A non-significant p-value indicates that the models yield similar results
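The chi-squared statistic in that output is simple to compute directly. A Python sketch with made-up coefficient vectors and covariance matrices (the numbers are hypothetical, purely to show the formula):

```python
import numpy as np
from scipy import stats

# Hypothetical estimates: b from FE (consistent), B from RE (efficient under Ho)
b = np.array([0.20, -0.05, 0.11])
B = np.array([0.18, -0.04, 0.10])
V_b = np.diag([0.004, 0.003, 0.002])           # covariance matrix of FE estimates
V_B = np.diag([0.002, 0.002, 0.001])           # covariance matrix of RE estimates

d = b - B
H = d @ np.linalg.inv(V_b - V_B) @ d           # (b-B)'[(V_b-V_B)^-1](b-B)
p = stats.chi2.sf(H, len(d))                   # non-significant p: models agree
```

With these hypothetical numbers the differences are tiny, H is small, and the large p-value would point toward using the efficient RE model.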
Hausman Specification Test
- Issues with Hausman tests (Wooldridge 2009):
  1. Failure to reject means either:
     - FE and RE are similar... you're good!
     - Or the FE estimates are VERY imprecise, so large differences from RE are nevertheless insignificant
       - That can happen if the data are awful/noisy. Watch out!
  2. Watch for the difference between "statistical significance" and "practical significance"
     - With a huge sample, the Hausman test may reject even though RE is nearly the same as FE
     - If the differences are tiny, you could argue that RE is appropriate
Allison's Hybrid Approach
- Allison (2009) suggests a "hybrid" approach that provides the benefits of FE and RE
  - Also discussed in Gelman & Hill
- Builds on the idea of decomposing X variables into a mean and deviations from it:
  1. Compute case-specific mean variables:
     egen meanvar1 = mean(var1), by(groupid)
  2. Transform X variables into deviations (subtract the case-specific mean):
     gen withinvar1 = var1 - meanvar1
  3. Do not transform the dependent variable Y
  4. Include both the X-deviation and X-mean variables
  5. Estimate with an RE model
Allison's Hybrid Approach
- Benefits of the hybrid approach:
  1. Effects of the "X-deviation" variables are equivalent to results from a fixed effects model
     - All time-constant factors are controlled
  2. Allows inclusion of time-constant X variables
  3. You can build a general multilevel model
     - Random slope coefficients; models with more than 2 levels
  4. You can directly test FE vs. RE; no Hausman test needed
     - The X-mean and X-deviation coefficients should be equal
     - Conduct a Wald test for equality of coefficients
     - Also, differing X-mean and X-deviation coefficients are informative
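The decomposition logic can be sketched in Python/numpy (illustrative; a real analysis would use an RE or multilevel estimator rather than the pooled OLS used here for simplicity). When X is correlated with the unit effect, the deviation coefficient matches the within (FE) effect while the mean coefficient diverges, exactly the informative discrepancy noted above:

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, b = 300, 4, 1.0

mu = rng.normal(scale=2.0, size=(N, 1))
x = rng.normal(size=(N, T)) + 0.7 * mu         # X correlated with the unit effect
y = b * x + mu + rng.normal(scale=0.3, size=(N, T))

# Decompose X into its case-specific mean and the within-case deviation
x_mean = np.repeat(x.mean(axis=1, keepdims=True), T, axis=1)
x_dev = x - x_mean

# Regress Y on both components (pooled OLS for simplicity; Allison uses RE)
X = np.column_stack([np.ones(N * T), x_dev.ravel(), x_mean.ravel()])
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
b_within, b_between = coef[1], coef[2]         # deviation vs. mean coefficient
```

`b_within` recovers the true slope (the FE result); `b_between` is pushed away from it by the mu-X correlation, so their inequality flags the RE assumption violation.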
SEM Approaches
- Both FE and RE can be estimated using SEM
  - See Allison 2009; Bollen & Brand, forthcoming (SF)
- Software: LISREL, EQS, AMOS, Mplus
  - Some limited models in Stata using GLLAMM
- Benefits:
  - FE and RE are nested: you can directly compare them with fit statistics (BIC, IFI, RMSEA) -- a HUGE advantage
  - Allows tremendous flexibility:
    - Lagged DVs
    - Can relax assumptions that covariate effects or variances are constant across waves of data
    - Flexibility in the correlation between the unobserved effect and the Xs
    - And more!
SEM Approaches
- RE in SEM (from Allison 2009)
- [path diagram: unobserved effect correlated with Y across waves; assumes no correlation with X]
SEM Approaches
- FE in SEM (from Allison 2009)
- [path diagram: unobserved effect correlated with Y AND X at all waves]
Serial Correlation
- Issue: what about correlated error in nearby waves of data?
  - So far we've focused on correlated error due to the unobserved effect (within each case)
- Strategies:
  - Use a model that accounts for serial correlation
    - Stata: xtregar -- FE and RE with "AR(1) disturbance"
    - Many other options: xtgee, etc.
  - Develop a "dynamic model"
    - Actually model the patterns of correlation across Y
    - Include the lagged dependent variable (Y)... or many lags!
    - This causes bias in FE, RE -- so other models are needed
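A quick way to see serial correlation is to regress pooled-OLS residuals on their own within-case lag (a Python/numpy sketch with simulated AR(1) errors, not from the slides; all parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
N, T, rho_e = 300, 10, 0.6                     # true AR(1) error correlation 0.6

x = rng.normal(size=(N, T))
e = np.zeros((N, T))
e[:, 0] = rng.normal(size=N)
for t in range(1, T):                          # AR(1) disturbance within each case
    e[:, t] = rho_e * e[:, t - 1] + rng.normal(size=N)
y = 1.0 * x + e

# Pooled OLS, then check first-order autocorrelation of the residuals
b_hat = (x.ravel() @ y.ravel()) / (x.ravel() @ x.ravel())
r = y - b_hat * x
r_lag, r_cur = r[:, :-1].ravel(), r[:, 1:].ravel()
rho_hat = (r_lag @ r_cur) / (r_lag @ r_lag)    # close to the true rho_e
```

A clearly nonzero `rho_hat` is the signal that an AR(1)-aware model (like xtregar) or a dynamic specification is needed.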
Dynamic Panel Models
- What if we have a dynamic process? Examples from Baltagi 2008:
  - Cigarette consumption (in US states): lots of inertia
  - Democracy (in countries)
- We might consider a model like:
  y_it = ρ·y_i,t-1 + b·x_it + m_i + n_it
  - Y from the prior period is included as an independent variable
- Issue: the FE and RE estimators are biased
  - The time-demeaned (or quasi-demeaned) lag of Y is correlated with the error
  - FE is biased for small T; it gets better as T gets bigger (like 30)
  - RE is also biased
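The bias from combining a lagged Y with the within transformation (Nickell bias) shows up clearly in simulation (a Python/numpy sketch, not from the slides; the parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
N, T, rho = 2000, 5, 0.5                       # true autoregressive parameter 0.5

mu = rng.normal(size=N)
y = np.zeros((N, T + 1))
y[:, 0] = mu / (1 - rho)                       # start each unit at its long-run mean
for t in range(1, T + 1):
    y[:, t] = rho * y[:, t - 1] + mu + rng.normal(size=N)

# FE (within) estimate of rho: time-demean, then regress y_t on y_{t-1}
y_lag, y_cur = y[:, :-1], y[:, 1:]
yl = y_lag - y_lag.mean(axis=1, keepdims=True)
yc = y_cur - y_cur.mean(axis=1, keepdims=True)
rho_fe = (yl.ravel() @ yc.ravel()) / (yl.ravel() @ yl.ravel())
# rho_fe lands well below the true 0.5: the demeaned lag is correlated
# with the demeaned error, and with T = 5 the bias is large
```

With T this small the within estimate is badly attenuated even with N = 2000; the bias shrinks only as T grows, matching the remark above.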
Dynamic Panel Models
- One solution: use FD and instrumental variables
- Strategy: if there's a problem between the error and lag Y, find a way to calculate a NEW version of lag Y that doesn't pose a problem
- Idea: further lags of Y aren't an issue in a FD model
  - Use them as "instrumental variables": a proxy for lag Y
- Arellano-Bond GMM estimator
  - A FD estimator
  - Lags of levels as instruments for differenced Y
- Arellano-Bover/Blundell-Bond "System GMM" estimator
  - Expands on this by using lags of differences and levels as instruments
- Generalized Method of Moments (GMM) estimation
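The instrumental-variables idea can be sketched in its simplest form (the Anderson-Hsiao estimator that Arellano-Bond generalizes): first-difference the model, then instrument the differenced lag with the second lag of Y in levels. A Python/numpy illustration, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(5)
N, T, rho = 20000, 6, 0.5

mu = rng.normal(size=N)
y = np.zeros((N, T + 1))
y[:, 0] = mu / (1 - rho)
for t in range(1, T + 1):
    y[:, t] = rho * y[:, t - 1] + mu + rng.normal(size=N)

dy = np.diff(y, axis=1)                        # first differences wipe out mu
dy_cur = dy[:, 1:].ravel()                     # delta-y_t, for t = 2..T
dy_lag = dy[:, :-1].ravel()                    # delta-y_{t-1}: endogenous regressor
z = y[:, :-2].ravel()                          # y_{t-2} in levels: the instrument

# Just-identified IV: y_{t-2} predicts delta-y_{t-1} but is uncorrelated
# with the differenced error, so the estimate is consistent
rho_iv = (z @ dy_cur) / (z @ dy_lag)
```

Unlike the FE estimate on the same process, `rho_iv` centers on the true rho; Arellano-Bond improves efficiency by stacking all available deeper lags as instruments in a GMM framework.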
Dynamic Panel Models
- Stata:
  xtabond -- basic Arellano-Bond GMM model
  xtdpdsys -- System GMM estimator
  xtdpd -- a flexible command to build system GMM models
    - Lots of control over the lag structure
    - Can run non-dynamic models
- Key assumptions / issues:
  - Serial correlation of the differenced errors is limited to 1 lag
  - Overidentifying restrictions must hold (checked with the "Sargan test")
  - How many instruments?
- Criticisms:
  - Angrist & Pischke 2009: the assumptions aren't always plausible
  - Allison 2009
  - Bollen and Brand, forthcoming: hard to compare models
Dynamic Panel Models
- General remarks: it is important to think carefully about dynamic processes
  - How long does it take things to unfold?
  - What lags does it make sense to include?
  - With huge datasets, we can just throw lots in
  - With smaller datasets, it is important to think things through
Methods: IV Panel Models
- Traditional instrumental variable panel estimator:
  y_it = b·x_it + g·z_it + m_i + n_it
  - b·x_it = exogenous covariates
  - g·z_it = endogenous covariates (may be related to n_it)
  - m_i = unobserved unit-specific error
  - n_it = idiosyncratic error
- Treat m_i as random or fixed, or use differencing to wipe it out
- Use contemporaneous or lagged X and (appropriate) lags of Z as instruments in two-stage estimation of y_it
  - Works if lagged Z is plausibly exogenous
TSCS Data
- Time-series cross-section data
- Example: economic variables for industrialized countries
  - Units are often countries
  - Often ~30-40 years of data
- Beck (2001):
  - No specific minimum, but be suspicious of T < 10
  - Large N isn't required (though not harmful)
TSCS Data: OLS PCSE
- Beck & Katz (2001)
- "Old" view: use FGLS to deal with heteroskedasticity and correlated errors
  - Problem: this underestimates the standard errors
- New view: use OLS regression
  - With "panel corrected" standard errors, to address panel heteroskedasticity
  - With FE, to deal with unit heterogeneity
  - With a lagged dependent variable in the model, to address serial correlation
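Panel-corrected standard errors are OLS standard errors with a "sandwich" covariance built from the estimated contemporaneous cross-unit covariance matrix. A numpy sketch under simple assumptions (balanced panel, panel heteroskedasticity only; this is an illustration, not Beck & Katz's own code):

```python
import numpy as np

rng = np.random.default_rng(8)
N, T, k = 10, 40, 2                            # TSCS shape: few units, longer T

X = rng.normal(size=(N, T, k))
beta = np.array([1.0, -0.5])
sig = np.linspace(0.5, 2.0, N)[:, None]        # each unit has its own error variance
y = X @ beta + sig * rng.normal(size=(N, T))

Xf, yf = X.reshape(-1, k), y.ravel()
b_ols = np.linalg.solve(Xf.T @ Xf, Xf.T @ yf)  # plain OLS point estimates
e = (yf - Xf @ b_ols).reshape(N, T)

Sigma = (e @ e.T) / T                          # N x N contemporaneous covariance
meat = np.zeros((k, k))
for t in range(T):
    Xt = X[:, t, :]                            # the cross-section at time t
    meat += Xt.T @ Sigma @ Xt
bread = np.linalg.inv(Xf.T @ Xf)
pcse = np.sqrt(np.diag(bread @ meat @ bread))  # panel-corrected standard errors
```

The point estimates are ordinary OLS; only the standard errors change, which is the core of the Beck & Katz recommendation.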
TSCS Data: Dynamics
- Beck & Katz (2009) examine dynamic models
  - OLS PCSE with lagged Y and FE: still appropriate
  - Better than some IV estimators
  - But they didn't compare it to System GMM
- Plumper, Troeger, Manow (2005):
  - FE isn't theoretically justified and absorbs theoretically important variance
  - Lagged Y absorbs theoretically important temporal variation
  - Theory must guide model choices
TSCS Data: Nonstationary Data
- Issue: analysis of longitudinal (time-series) data is going through big changes
  - Realization that strongly trending data cause problems
  - Random walk / unit root processes / integrated of order 1 / non-stationary data
  - Converse: stationary data, integrated of order zero
  - The "spurious regression" problem
- Strategies:
  - Tests for a "unit root" in time-series and panel data
  - Differencing as a solution
    - A reason to try FD models
Panel Data Remarks
1. Panel data strategies are taught as "fixes"
   - How do I "fix" unobserved effects?
   - How do I "fix" dynamics/serial correlation?
   - But the fixes really change what you are modeling
     - A FE (within) model is a very different look at your data, compared to OLS
   - Goal: learn the "fixes"... but get past that... start to think about interpretation
2. Much strife in the literature
   - People arguing over which "fix" is best
   - Don't be afraid of criticism, but expect that people will weigh in with different views
Panel Data Remarks
3. MOST IMPORTANT THING: try a wide range of models
   - If your findings are robust, you're golden
   - If not, the differences will help you figure out what is going on
   - Either way, you don't get "surprised" when your results go away after following the suggestion of a reviewer!
Reading Discussion
- Schofer, Evan and Wesley Longhofer. "The Structural Sources of Associational Life." Working Paper.