Longitudinal and Multilevel Methods for Models with Discrete Outcomes with Parametric and Non-Parametric Corrections for Unobserved Heterogeneity
David K. Guilkey
Focus of this talk:
Binary dependent variables
Unordered categorical dependent variables
Models will be logit based. Probit, Poisson, and negative binomial models are not discussed, although Stata has methods for these estimators as well.
Empirical example uses data from the Indonesian Family Life Survey (IFLS), with two outcomes:
1. Binary indicator for whether the respondent uses contraception
2. Unordered categorical variable for method choice
Data Set Overview
Four waves of data: 1993, 1997, 2000, and 2007
Individual level information on fertility, education, and migration
Community and facility level data on health and family planning providers
Data from 321 enumeration areas, which we will treat as communities

IFLS Longitudinal Sample Size (rows: initial participation cohort; columns: survey year)

Cohort                 1993   1997   2000   2007  Total
Wave 1 Cohort          3520   2873   2684   1498  10575
Wave 2 Cohort                 2207   1742   1152   5101
Wave 3 Cohort                        1466    933   2399
Wave 4 Cohort                               2287   2287
Total observations                                20362
Basic Model for Longitudinal Logit:
Where:
$Y_{ti}$: observed binary variable (respondent i in time period t)
$X_{ti}$: time varying explanatory variables (age and education level)
$P_{ti}$: time varying program variable (posyandus)
$Z_i$: time invariant regressors (Muslim)
i = 1, 2, …, N (individuals)
t = 1, 2, …, $T_i$ (observations per individual; unbalanced panel)
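A sketch of the latent-variable form that these definitions suggest. The coefficient symbols $\beta$, $\delta$, $\gamma$ and the error term $\varepsilon_{ti}$ are assumed notation for illustration, not the presenter's own; $\mu_i$ denotes the time invariant individual heterogeneity.

```latex
Y^{*}_{ti} = X_{ti}\beta + P_{ti}\delta + Z_{i}\gamma + \mu_i + \varepsilon_{ti},
\qquad
Y_{ti} = \mathbf{1}\!\left[\, Y^{*}_{ti} > 0 \,\right]

% If \varepsilon_{ti} is logistic, the response probability conditional on \mu_i is
\Pr(Y_{ti} = 1 \mid \mu_i)
  = \frac{\exp(X_{ti}\beta + P_{ti}\delta + Z_{i}\gamma + \mu_i)}
         {1 + \exp(X_{ti}\beta + P_{ti}\delta + Z_{i}\gamma + \mu_i)}
```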
Assumptions:
for the parametric logit in Stata (xtlogit, melogit, and one variant of GLLAMM)
and:
Note that observations for the same individual will be correlated because of the time invariant error, sometimes referred to as unobserved heterogeneity.
Given the assumptions, estimation options are:
1. Simple logit yields consistent point estimates but incorrect SEs
2. Simple logit with cluster option corrects SEs
3. Parametric or semi-parametric maximum likelihood
The likelihood function for this model is derived as follows:
This is the probability that individual i at time t is using contraception, conditional on the time invariant heterogeneity.
For individual i, we observe $T_i$ binary responses that we can write as, for example, $Y_i = (1,0,0,1)$ for a woman who is observed for 4 time periods and used contraception at times 1 and 4.
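As a rough illustration of the conditional probability of a response sequence such as (1,0,0,1): given the heterogeneity term, the periods are independent, so the sequence probability is a product of logit probabilities. The function names and the `index` values (standing in for the linear combination of regressors) are hypothetical.

```python
import math

def logistic(v):
    """Logit response probability for linear index v."""
    return 1.0 / (1.0 + math.exp(-v))

def conditional_sequence_prob(y, index, mu):
    """P(Y_i = y | mu): product over periods of p_t^y_t * (1 - p_t)^(1 - y_t),
    where p_t = logistic(index_t + mu)."""
    prob = 1.0
    for y_t, v_t in zip(y, index):
        p_t = logistic(v_t + mu)
        prob *= p_t if y_t == 1 else 1.0 - p_t
    return prob

# Woman observed for 4 periods, using contraception at times 1 and 4.
y_i = (1, 0, 0, 1)
index_i = (0.0, 0.0, 0.0, 0.0)  # hypothetical index values, not estimates
prob = conditional_sequence_prob(y_i, index_i, mu=0.0)
```

With every index and the heterogeneity term at zero, each period has probability 0.5, so the sequence probability is 0.5 to the fourth power.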
Let $Y_i$ be the set of observed outcomes for individual i, then:
The joint probability must be approximated; this amounts to approximating the area under a curve.
With the assumption of normality, the approximation method is Gaussian quadrature (Hermite integration).
Points:
1. More accurate with more Hermite points, but execution time is longer.
2. You need more points as $T_i$ gets larger.
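The quadrature step can be sketched as follows, assuming the heterogeneity term is normal with mean zero and standard deviation `sigma`. The function names and inputs are hypothetical; `hermgauss` is NumPy's Gauss-Hermite rule for the weight function exp(-x^2), so the nodes are rescaled by sqrt(2)*sigma and the weights divided by sqrt(pi).

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def conditional_sequence_prob(y, index, mu):
    """P(Y_i = y | mu) for a logit model with linear index values `index`."""
    p = 1.0 / (1.0 + np.exp(-(np.asarray(index, dtype=float) + mu)))
    return float(np.prod(np.where(np.asarray(y) == 1, p, 1.0 - p)))

def unconditional_prob(y, index, sigma, n_points):
    """Approximate the integral of P(y | mu) over mu ~ N(0, sigma^2)."""
    nodes, weights = hermgauss(n_points)
    mass_points = np.sqrt(2.0) * sigma * nodes   # change of variables
    w = weights / np.sqrt(np.pi)                 # rescaled weights sum to one
    return sum(w_m * conditional_sequence_prob(y, index, mu_m)
               for w_m, mu_m in zip(w, mass_points))

y_i = (1, 0, 0, 1)
index_i = (0.2, -0.1, 0.0, 0.3)   # hypothetical index values
coarse = unconditional_prob(y_i, index_i, sigma=1.0, n_points=8)
fine = unconditional_prob(y_i, index_i, sigma=1.0, n_points=40)
```

Comparing `coarse` and `fine` is one way to check whether the number of points is adequate for a given panel length.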
Hermite integration replaces the integral with a sum:
where the weights ($w_m$’s) and the mass points ($\mu_m$’s) are known because of the assumption of normality.
Alternative: The discrete factor approximation searches over weights and mass points along with the other parameters of the model.
Must impose a normalization:
1. Weights sum to one
2. Either set one mass point to zero (Fortran program) or set the mean of the distribution to zero (GLLAMM)
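A minimal sketch of how the discrete factor normalization could be imposed when weights and mass points are estimated. The softmax mapping for the weights is an assumption chosen for illustration (the slides do not say how the Fortran program or GLLAMM parameterize them); the mean-zero centering follows the second normalization listed above. All names and numbers are hypothetical.

```python
import numpy as np

def discrete_factor_distribution(raw_weights, raw_points):
    """Map unrestricted parameters to weights that sum to one and
    mass points whose weighted mean is zero."""
    raw_weights = np.asarray(raw_weights, dtype=float)
    expw = np.exp(raw_weights - raw_weights.max())   # stable softmax (assumed mapping)
    weights = expw / expw.sum()
    points = np.asarray(raw_points, dtype=float)
    points = points - weights @ points               # center the distribution at zero
    return weights, points

def discrete_factor_prob(y, index, weights, points):
    """Unconditional sequence probability: sum over mass points of w_m * P(y | mu_m)."""
    y = np.asarray(y)
    index = np.asarray(index, dtype=float)
    total = 0.0
    for w_m, mu_m in zip(weights, points):
        p = 1.0 / (1.0 + np.exp(-(index + mu_m)))
        total += w_m * float(np.prod(np.where(y == 1, p, 1.0 - p)))
    return total

weights, points = discrete_factor_distribution([0.0, 0.5, -0.5], [-1.0, 0.0, 2.0])
prob = discrete_factor_prob((1, 0, 0, 1), (0.0, 0.0, 0.0, 0.0), weights, points)
```

Because softmax is unchanged when a constant is added to every raw weight, an estimation routine would typically hold one raw weight fixed.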
Multilevel Panel Models
Basic form of the model:
where
j = 1, 2, …, J (communities)
i = 1, 2, …, $N_j$ (individuals from community j)
t = 1, 2, …, $T_{ij}$ (observations for person i from community j)
$X_{tij}$: individual level variables (some could be fixed through time)
$P_{tij}$: time varying program variable
$Z_j$: time invariant community level variables
$\mu_{ij}$: time invariant individual level unobserved heterogeneity
$\lambda_j$: time invariant community level unobserved heterogeneity
This model allows observations on the same individual to be correlated and observations from the same community to be correlated.
Assumptions:
Given the assumptions, estimation options are:
1. Simple logit yields consistent point estimates but incorrect SEs
2. Simple logit with cluster option corrects SEs (at community level)
3. Parametric or semi-parametric maximum likelihood
The maximum likelihood estimator is a straightforward extension of the longitudinal data model:
You need the unconditional joint probability of the observed set of outcomes for the set of individuals in each community:
Conditional on the unobservables at the community level, the probability of the set of observed outcomes for person i from community j is:
The unconditional joint probability of the set of observed outcomes for all individuals in community j is then:
We then use either Hermite integration or the discrete factor method to approximate the integral.
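The two-step integration described here can be sketched with nested Gauss-Hermite rules: integrate out the individual term for each person at a given community value, multiply across people, then integrate over the community term. This assumes both terms are independent mean-zero normals; the function names, standard deviations, and data are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def normal_rule(sigma, n_points):
    """Mass points and weights approximating a N(0, sigma^2) distribution."""
    nodes, weights = hermgauss(n_points)
    return np.sqrt(2.0) * sigma * nodes, weights / np.sqrt(np.pi)

def sequence_prob(y, index, shift):
    """P(y | mu + lambda = shift) for one person's response sequence."""
    p = 1.0 / (1.0 + np.exp(-(np.asarray(index, dtype=float) + shift)))
    return float(np.prod(np.where(np.asarray(y) == 1, p, 1.0 - p)))

def community_prob(people, sigma_mu, sigma_lam, n_points=12):
    """Unconditional joint probability for all people in one community.
    `people` is a list of (y, index) pairs."""
    mu_pts, mu_w = normal_rule(sigma_mu, n_points)
    lam_pts, lam_w = normal_rule(sigma_lam, n_points)
    total = 0.0
    for lam, w_lam in zip(lam_pts, lam_w):
        joint = 1.0
        for y, index in people:
            # Inner integral: individual heterogeneity, given the community value.
            joint *= sum(w * sequence_prob(y, index, mu + lam)
                         for mu, w in zip(mu_pts, mu_w))
        total += w_lam * joint
    return total

people = [((1, 0), (0.0, 0.0)), ((0, 1, 1), (0.1, -0.2, 0.3))]  # hypothetical data
prob = community_prob(people, sigma_mu=1.0, sigma_lam=0.5)
```

The cost grows with the product of the two point counts, which is one reason execution time rises quickly in multilevel models.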
Testing for Program Targeting
Programs may target high need areas or areas where they feel residents would be receptive to family planning.
For example, family planning programs may concentrate on high fertility areas.
The result is that simple methods may understate or overstate program impact.
Statistical implication of program targeting:
Solutions:
1. Explicitly model program placement and estimate placement simultaneously with program impact equations (Angeles, Guilkey, and Mroz, 1998)
2. Treat as fixed effects and include dummies for communities or use some other fixed effects method (Gertler and Molyneaux, 1994)
Angeles, Guilkey, and Mroz show that the joint modeling approach yields smaller standard errors in Tanzania, but the two methods gave similar results.
Example (fixed effects) plus Hausman test for endogenous placement:
Efficient estimator under the null of no endogeneity (random effects):
Consistent estimator under the alternative (fixed effects):
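A minimal sketch of the usual Hausman statistic that compares the two estimators: the difference in coefficients weighted by the inverse of the difference in their covariance matrices. The numbers below are made up purely for illustration and are not IFLS estimates; the function name is hypothetical.

```python
import numpy as np

def hausman_statistic(b_fe, v_fe, b_re, v_re):
    """H = d' (V_FE - V_RE)^{-1} d with d = b_FE - b_RE.
    Under the null, H is asymptotically chi-square with len(d) degrees of freedom."""
    d = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    v_diff = np.asarray(v_fe, dtype=float) - np.asarray(v_re, dtype=float)
    return float(d @ np.linalg.solve(v_diff, d))

# Illustrative (made-up) coefficients and covariance matrices for two regressors.
b_fe = [0.30, -0.10]
b_re = [0.20, -0.05]
v_fe = np.diag([0.010, 0.004])
v_re = np.diag([0.006, 0.003])

H = hausman_statistic(b_fe, v_fe, b_re, v_re)   # compare with a chi-square(2) critical value
```

In finite samples the covariance difference may fail to be positive definite, in which case the statistic is not well behaved and needs separate handling.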
State Dependence and Unobserved Heterogeneity
Consider the simple model:
Note:
Implies:
Unless (no time invariant unobserved heterogeneity)
Now consider:
Now:
It is very difficult to distinguish between the two models.
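A small simulation can illustrate why the two explanations are hard to tell apart. In the sketch below there is no true state dependence: each woman's outcome in both periods depends only on her own time invariant term. Even so, the lagged outcome appears to predict the current one whenever that term varies across women. The setup (two periods, normal heterogeneity, the function name) is an assumption made for illustration.

```python
import numpy as np

def lag_gap(sigma, n_people=20000, seed=0):
    """P(Y2 = 1 | Y1 = 1) - P(Y2 = 1 | Y1 = 0) when Y1 and Y2 are independent
    given a person-specific term mu ~ N(0, sigma^2) and there is no lag effect."""
    rng = np.random.default_rng(seed)
    mu = sigma * rng.standard_normal(n_people)
    p = 1.0 / (1.0 + np.exp(-mu))          # same probability in both periods
    y1 = rng.random(n_people) < p
    y2 = rng.random(n_people) < p
    return float(y2[y1].mean() - y2[~y1].mean())

gap_with_heterogeneity = lag_gap(sigma=2.0)   # clearly positive
gap_without = lag_gap(sigma=0.0)              # close to zero
```

A model that includes the lagged outcome but ignores the heterogeneity would attribute the first gap to state dependence.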
The same problem would exist if the unobserved heterogeneity were at the community level.
The solution is to estimate a comprehensive model:
Initial conditions problem: Must either be able to set the initial value or jointly estimate the equation of interest with an equation of the form:
Often it is reasonable to set the initial value:
for example, when observations start at the beginning of the woman’s childbearing years.
In this example it is not, since women enter the year one data set at different ages.
Joint estimation is basically a simultaneous equations problem subject to standard identification issues. However, time varying exogenous variables provide identification (age and education in this case).
Example follows:
Estimation with no controls for unobserved heterogeneity and initial conditions:
Basic Model Longitudinal Multinomial Logit with 3 Choices:
Individual i at time t makes choice 3 (for example) if:
If we assume that the ε’s follow independent extreme value distributions and impose the restriction that:
so that the probabilities sum to one, then:
for k = 2, 3.
The discrete factor model allows a more general pattern of correlation:
for m = 1, 2, …, M and a common set of weights:
allows for correlation in the μ’s
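A sketch of the three-choice probabilities with a discrete factor. It assumes choice 1 is the base category with its index normalized to zero, and that each mass point m carries one heterogeneity value for choice 2 and one for choice 3 under a shared weight, which is what lets the two terms be correlated. Function names and all numbers are hypothetical.

```python
import numpy as np

def choice_probs(v, mu_m):
    """Probabilities of choices 1, 2, 3 given indices v = (v2, v3) and
    heterogeneity values mu_m = (mu_2m, mu_3m); choice 1 has index 0."""
    u = np.concatenate(([0.0], np.asarray(v, dtype=float) + np.asarray(mu_m, dtype=float)))
    e = np.exp(u - u.max())                 # subtract max for numerical stability
    return e / e.sum()

def sequence_prob(choices, v_by_period, weights, mass_points):
    """Unconditional probability of a sequence of choices (coded 1, 2, 3):
    sum over mass points of w_m times the product over periods."""
    total = 0.0
    for w_m, mu_m in zip(weights, mass_points):
        cond = 1.0
        for k, v in zip(choices, v_by_period):
            cond *= choice_probs(v, mu_m)[k - 1]
        total += w_m * cond
    return total

weights = [0.6, 0.4]                          # common weights, sum to one
mass_points = [(-0.5, 0.2), (0.75, -0.3)]     # (mu_2m, mu_3m) pairs
prob = sequence_prob((1, 3, 3), [(0.1, 0.4), (0.0, 0.5), (0.2, 0.6)], weights, mass_points)
```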
Unfortunately, GLLAMM estimates a needlessly restrictive version of the model:
Parametric:
If there are more than 3 choices, all ρ’s are restricted
Non-parametric:
for all m.
Extension to Multilevel Panel Model:
Parametric:
Semi-parametric:
The empirical example estimates a model with four choices:
1 = Non-use
2 = Temporary methods (pill, condom, injection)
3 = Long lasting methods (IUD, sterilization)
4 = Traditional methods
We show the complete results for the most general model and then report partial results for other models: