DURATION ANALYSIS Eva Hromádková, 9.12.2010 Applied Econometrics JEM007, IES Lecture 9.


1 DURATION ANALYSIS Eva Hromádková, 9.12.2010 Applied Econometrics JEM007, IES Lecture 9

2 Duration analysis Introduction
- Model the length of time spent in a given state before transition to another state (spell duration)
- Examples: unemployment, life, insurance status
- Nice application of panel data
- Issues:
  1. Distributional assumptions on the probability of transition (increasing/decreasing with time, etc.)
  2. Sampling schemes
     - Flow sampling: e.g. sample of those who enter unemployment in a given period
     - Stock sampling: e.g. people unemployed in a given period
     - Population sampling: e.g. all people regardless of employment status

3 Duration analysis Introduction
- Issues (continued):
  3. Information about duration can be censored: in some cases we do not observe the "end"
  4. Possibility of multiple states (unemployed, employed, out of LF)
  5. Possibility of multiple spells
  6. Different applications have a different focus:
     - biostatistics: survival time
     - physics: failure time
     - economics: recidivism, length of match (employment, marriage, etc.)

4 Duration analysis Basic concepts
- Duration of spell T: nonnegative random variable
- Cumulative distribution function: $F(t)$ (picture)
- Density function: $f(t) = dF(t)/dt$ (picture)
- Prob(duration of spell is less than t) $= F(t) = P(T < t)$
- Survivor function: Prob(duration of spell is more than t), $S(t) = P(T > t) = 1 - F(t)$
- Hazard function $\lambda(t)$: instantaneous probability of leaving a state conditional on survival to time t
- Cumulative hazard function
- Different for discrete data (years, weeks)
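For continuous T, the concepts on this slide are linked by standard textbook identities. The symbol $\Lambda(t)$ for the cumulative hazard is notation assumed here; it is not taken from the slide.

```latex
\begin{aligned}
f(t) &= \frac{dF(t)}{dt}, \qquad S(t) = 1 - F(t), \\
\lambda(t) &= \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
            = \frac{f(t)}{S(t)} = -\frac{d \ln S(t)}{dt}, \\
\Lambda(t) &= \int_0^t \lambda(s)\, ds = -\ln S(t), \qquad S(t) = \exp\{-\Lambda(t)\}.
\end{aligned}
```

Any one of $F$, $f$, $S$, $\lambda$, $\Lambda$ therefore determines the other four.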

5 Duration analysis Censoring I
- Types:
  - Right: duration of spell is above a certain value, but we do not know by how much (e.g. we don't observe the end)
  - Left: duration of spell is below a certain value, but we do not know by how much (e.g. we don't observe the beginning)
  - Interval: duration of spell falls in a time interval
- Type I censoring: all durations are censored above a fixed time $t_c$, e.g. 5000 x testing for light bulbs
- Type II censoring: study ceases after the kth failure => only k complete spells, all others censored

6 Duration analysis Censoring II

7 Duration analysis Censoring III
- Random censoring:
  - Each individual has a (potential) spell duration $T_i^*$ and a censoring time $C_i^*$
  - Observed: $T_i = \min(T_i^*, C_i^*)$
  - Indicator for non-censoring: $\delta_i = 1[T_i^* < C_i^*]$
- Independent: determinants of censoring aren't informative about duration
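The random-censoring scheme can be illustrated with a small simulation. This is a sketch only: the exponential distributions for $T_i^*$ and $C_i^*$, the rates, and the function name `simulate_censored` are assumptions chosen for illustration, not part of the lecture.

```python
import random

def simulate_censored(n, rate_t=0.5, rate_c=0.2, seed=0):
    """Draw latent durations T* and censoring times C* independently
    (both exponential, an arbitrary illustrative choice) and return
    the observed pairs (T_i, delta_i)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        t_star = rng.expovariate(rate_t)      # latent spell duration T*_i
        c_star = rng.expovariate(rate_c)      # latent censoring time C*_i
        t_obs = min(t_star, c_star)           # observed T_i = min(T*_i, C*_i)
        delta = 1 if t_star < c_star else 0   # 1 = complete (uncensored) spell
        sample.append((t_obs, delta))
    return sample

sample = simulate_censored(1000)
# Share of complete spells; in this set-up it should be close to
# rate_t / (rate_t + rate_c).
share_complete = sum(d for _, d in sample) / len(sample)
```

Because the two draws are independent, the censoring indicator carries no information about the latent duration, which is the independence assumption stated on the slide.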

8 Duration analysis Non-parametric estimates I.
Useful for descriptive statistics:
- Without censoring: survival function $S(t)$ = (# spells of duration > t) / N
- With censoring:
  - $t_1 < t_2 < \dots < t_j < \dots < t_k$: failure times
  - $d_j$: # spells ending at time $t_j$
  - $m_j$: # spells right-censored in $[t_j, t_{j+1})$
  - $r_j$: # spells at risk at time $t_j$
    $r_j = (d_j + m_j) + \dots + (d_k + m_k) = \sum_{l \ge j} (d_l + m_l)$

9 Duration analysis Non-parametric estimates II.
- Hazard function: $\lambda_j = d_j / r_j$
  - # spells ending at time $t_j$ out of all that were at risk at $t_j$
- Survival function, Kaplan-Meier estimator:
  $S(t) = \prod_{j:\, t_j \le t} (1 - \lambda_j) = \prod_{j:\, t_j \le t} \frac{r_j - d_j}{r_j}$
- Notes: adjustment for grouping of data (censoring can occur progressively over the interval)
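A minimal sketch of the Kaplan-Meier calculation, using only the quantities $r_j$ and $d_j$ defined on these slides. The function name `kaplan_meier` and the eight-spell data set are made up for illustration. Spells censored at a failure time are counted as still at risk at that time, which is the usual convention.

```python
def kaplan_meier(durations, completed):
    """Return a list of (t_j, r_j, d_j, S(t_j)) for each distinct failure time."""
    failure_times = sorted({t for t, c in zip(durations, completed) if c})
    table, surv = [], 1.0
    for t_j in failure_times:
        # r_j: spells with observed duration >= t_j are still at risk at t_j
        r_j = sum(1 for t in durations if t >= t_j)
        # d_j: spells that end (uncensored) exactly at t_j
        d_j = sum(1 for t, c in zip(durations, completed) if c and t == t_j)
        surv *= (r_j - d_j) / r_j          # multiply in (1 - lambda_j)
        table.append((t_j, r_j, d_j, surv))
    return table

# Hypothetical data: completed = 1 for a finished spell, 0 for right-censored.
durations = [2, 3, 3, 5, 6, 8, 8, 9]
completed = [1, 1, 0, 1, 0, 1, 1, 0]
km = kaplan_meier(durations, completed)
for t_j, r_j, d_j, s in km:
    print(t_j, r_j, d_j, round(s, 3))
```

With these data the estimated survival steps down through 0.875, 0.75, 0.6 and 0.2 at the failure times 2, 3, 5 and 8.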

10 Duration analysis Parametric estimates I.
- Exponential distribution:
  - $h(t) = \lambda$: constant prob. of leaving the state, memoryless property
  - Survival function: $S(t) = e^{-\lambda t}$
- Weibull distribution: more general
  - $h(t) = \lambda \alpha t^{\alpha - 1}$; $S(t) = e^{-\lambda t^{\alpha}}$
  - $\alpha > 1$: $h(t)$ is increasing; $\alpha < 1$: $h(t)$ is decreasing
- Other: Gompertz (biostatistics); log-normal & log-logistic (hazard first increases and then decreases with t); gamma
- Regressors are introduced by letting $\lambda = e^{x\beta}$ with $\alpha$ left as a constant
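The Weibull formulas on this slide translate directly into code. The sketch below uses hypothetical regressor and coefficient values; the function names are illustrative.

```python
import math

def weibull_hazard(t, lam, alpha):
    """h(t) = lam * alpha * t**(alpha - 1)."""
    return lam * alpha * t ** (alpha - 1)

def weibull_survival(t, lam, alpha):
    """S(t) = exp(-lam * t**alpha)."""
    return math.exp(-lam * t ** alpha)

def lam_from_regressors(x, beta):
    """lam = exp(x'beta); alpha is left as a constant."""
    return math.exp(sum(xi * bi for xi, bi in zip(x, beta)))

# Hypothetical regressor vector and coefficients, for illustration only.
lam = lam_from_regressors([1.0, 0.5], [-1.0, 0.4])

times = (0.5, 1.0, 2.0)
# alpha = 1 is the exponential case: constant hazard equal to lam.
h_exp = [weibull_hazard(t, lam, 1.0) for t in times]
# alpha > 1: hazard rises with t; alpha < 1: hazard falls with t.
h_up = [weibull_hazard(t, lam, 1.5) for t in times]
h_down = [weibull_hazard(t, lam, 0.5) for t in times]
```

Setting `alpha = 1.0` recovers the exponential model, so the exponential is a testable restriction of the Weibull.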

11 Duration analysis Maximum Likelihood Estimation
Assumption: time-invariant regressors X (vary over individuals, not over the length of the spell)
- Uncensored observations contribute the density $f(t|x,\theta)$
- Censored observations contribute the probability $P(T > t) = S(t|x,\theta)$
- Thus the contribution of each observation is $f(t_i|x_i,\theta)^{\delta_i} \times S(t_i|x_i,\theta)^{1-\delta_i}$, where $\delta_i = 1$ if there is no censoring
- We look for the $\theta$ that maximizes the product of these contributions, i.e. the likelihood of the sample we actually observed; equivalently, it maximizes the log-likelihood
  $\ln L(\theta) = \sum_{i=1}^{N} \left[ \delta_i \ln f(t_i|x_i,\theta) + (1-\delta_i) \ln S(t_i|x_i,\theta) \right]$
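As a worked instance of this log-likelihood, take the exponential model without regressors (a simplification assumed here to keep the example short). Then $\ln f(t) = \ln\lambda - \lambda t$ and $\ln S(t) = -\lambda t$, so $\ln L(\lambda) = \sum_i \delta_i \ln\lambda - \lambda \sum_i t_i$ and the maximizer is $\hat\lambda = \sum_i \delta_i / \sum_i t_i$. The data below are made up.

```python
import math

def exp_loglik(lam, durations, delta):
    """Censored log-likelihood for the exponential model (no regressors)."""
    ll = 0.0
    for t, d in zip(durations, delta):
        log_f = math.log(lam) - lam * t   # ln f(t): complete spell
        log_s = -lam * t                  # ln S(t): right-censored spell
        ll += d * log_f + (1 - d) * log_s
    return ll

def exp_mle(durations, delta):
    """Closed-form maximizer: number of complete spells / total time at risk."""
    return sum(delta) / sum(durations)

# Hypothetical data: delta = 1 for a complete spell, 0 for right-censored.
durations = [2.0, 3.0, 3.0, 5.0, 6.0, 8.0, 8.0, 9.0]
delta = [1, 1, 0, 1, 0, 1, 1, 0]
lam_hat = exp_mle(durations, delta)
```

Censored spells add nothing to the numerator but still add their observed time to the denominator, so dropping them would overstate the exit rate.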

12 Duration analysis Maximum Likelihood Estimation II
Components of Likelihood
- Each type of observation contributes to the likelihood:
  - Complete durations: $f(t)$
  - Left truncated at $t_L$ ($t > t_L$): $f(t)/S(t_L)$
  - Left censored at $t_{CL}$: $1 - S(t_{CL})$
  - Right censored at $t_{CR}$: $S(t_{CR})$
  - Right truncated at $t_R$ ($t < t_R$): $f(t)/[1 - S(t_R)]$
  - Interval censored at $t_{CL}, t_{CR}$: $S(t_{CL}) - S(t_{CR})$

13 Duration analysis Cox model I
Proportional hazard model: $\lambda(t|x,\beta) = \lambda_0(t)\, \Phi(x,\beta)$
- Semi-parametric model:
  - Functional form for the baseline hazard $\lambda_0(t)$ is unspecified
  - Functional form for $\Phi(x,\beta)$ is fully specified, usually the exponential form $\exp(x\beta)$
- Interpretation of coefficients: change in $x_j$
  - Discrete ($x_j$ increases by one unit): $\lambda(t|x_{new},\beta) = \lambda_0(t) \exp(x\beta + \beta_j) = \exp(\beta_j)\, \lambda(t|x,\beta)$
  - Continuous: $\partial\lambda(t|x,\beta)/\partial x_j = \lambda_0(t) \exp(x\beta)\, \beta_j = \beta_j\, \lambda(t|x,\beta)$

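14 Duration analysis Cox model II
Set-up:
- $t_1 < t_2 < \dots < t_j < \dots < t_k$: k discrete failure times
- $R(t_j) = \{l : t_l \ge t_j\}$: set of spells at risk at $t_j$
- $D(t_j) = \{l : t_l = t_j\}$: set of spells completed at $t_j$
- $d_j = \sum_l 1(t_l = t_j)$: # of spells completed at $t_j$
Probability of a particular spell at risk ending at time $t_j$:
  $\Pr[T_j = t_j \mid R(t_j)] = \dfrac{\lambda_0(t_j)\, \Phi(x_j,\beta)}{\sum_{l \in R(t_j)} \lambda_0(t_j)\, \Phi(x_l,\beta)} = \dfrac{\Phi(x_j,\beta)}{\sum_{l \in R(t_j)} \Phi(x_l,\beta)}$
Baseline hazard dropped out!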

15 Duration analysis Cox model III
- We must control for tied durations (i.e. more than 1 failure at a given time)
  - Product of individual probabilities within $R(t_j)$
- Partial likelihood function: joint product of failure probabilities over failure times
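The partial likelihood can be sketched in code. This transcript does not carry the slide's own formula, so the block assumes the common Breslow approximation for ties, $\ln L_p(\beta) = \sum_j \left[ \sum_{m \in D(t_j)} x_m\beta - d_j \ln \sum_{l \in R(t_j)} \exp(x_l\beta) \right]$, with $\Phi(x,\beta) = \exp(x\beta)$ and a single regressor. The function name and data are hypothetical.

```python
import math

def cox_partial_loglik(beta, durations, completed, x):
    """Log partial likelihood with Phi(x, beta) = exp(x * beta), one regressor,
    and Breslow-style handling of tied failure times (an assumption)."""
    failure_times = sorted({t for t, c in zip(durations, completed) if c})
    ll = 0.0
    for t_j in failure_times:
        # R(t_j): spells still at risk at t_j
        risk = [i for i, t in enumerate(durations) if t >= t_j]
        # D(t_j): spells that complete exactly at t_j
        done = [i for i in risk if completed[i] and durations[i] == t_j]
        denom = sum(math.exp(x[l] * beta) for l in risk)
        ll += sum(x[m] * beta for m in done) - len(done) * math.log(denom)
    return ll

# Hypothetical data with one binary regressor.
durations = [2, 3, 3, 5, 6, 8, 8, 9]
completed = [1, 1, 0, 1, 0, 1, 1, 0]
x = [1, 0, 1, 0, 1, 0, 0, 1]
ll_at_zero = cox_partial_loglik(0.0, durations, completed, x)
```

Only the ordering of durations and the composition of each risk set enter the function; the baseline hazard $\lambda_0(t)$ never appears, which is why it can be left unspecified. At $\beta = 0$ the value reduces to $-\sum_j d_j \ln r_j$.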

16 Duration analysis Time varying coefficients
Problem:
- If the time variation is endogenous: feedback between duration of unemployment and job search strategy
- Basic case: external time variation
Solution:
- Very rough: replace the variable by its average value
- Within the Cox approach, what matters at each failure time $t_j$ is the value of the regressor $x(t_j)$ for those observations in the risk set $R(t_j)$ => multiple observations for each subject
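One way to obtain "multiple observations for each subject" is to split every spell at the failure times and record the regressor value in force at each one. The sketch below is an illustration under assumptions: the helper `split_spell`, the row layout `(start, stop, event, x)`, and the example regressor path are all hypothetical.

```python
def split_spell(duration, completed, cut_points, x_path):
    """Split one spell into (start, stop, event, x) rows at the cut points.

    x_path(t) is a hypothetical helper returning the regressor value in
    force at time t; it is evaluated at the end of each sub-interval,
    i.e. at the failure time where the subject is in the risk set."""
    cuts = sorted(c for c in set(cut_points) if 0 < c < duration) + [duration]
    rows, start = [], 0
    for stop in cuts:
        # Only the last row can carry the event, and only if the spell completed.
        event = 1 if (completed and stop == duration) else 0
        rows.append((start, stop, event, x_path(stop)))
        start = stop
    return rows

# Hypothetical subject: spell of length 8 that ends in a transition, with a
# regressor that switches from 0 to 1 after period 4.
rows = split_spell(8, 1, [2, 3, 5, 8], lambda t: 1 if t > 4 else 0)
for row in rows:
    print(row)
```

Each row then enters the risk set of exactly one failure time with the appropriate $x(t_j)$. This treats the time variation as external, matching the basic case on the slide.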

