DURATION ANALYSIS Eva Hromádková, 9.12.2010 Applied Econometrics JEM007, IES Lecture 9.

Duration analysis: Introduction
- Goal: model the length of time spent in a given state before transition to another state (the spell duration)
- Examples: unemployment, life, insurance status
- A nice application of panel data
- Issues:
  1. Distributional assumptions on the probability of transition (increasing or decreasing with time, etc.)
  2. Sampling schemes
     - Flow sampling: e.g. a sample of those who enter unemployment in a given period
     - Stock sampling: e.g. people unemployed in a given period
     - Population sampling: e.g. all people regardless of employment status

Duration analysis: Introduction (continued)
- Issues (continued):
  3. Information about duration can be censored: in some cases we do not observe the end of the spell
  4. Possibility of multiple states (unemployed, employed, out of the labor force)
  5. Possibility of multiple spells
  6. Different applications have a different focus: biostatistics (survival time), physics (failure time), economics (recidivism, length of a match such as employment or marriage)

Duration analysis: Basic concepts
- Duration of spell $T$: a nonnegative random variable
- Cumulative distribution function: $F(t) = P(T < t)$, the probability that the spell lasts less than $t$
- Density function: $f(t) = dF(t)/dt$
- Survivor function: $S(t) = P(T > t) = 1 - F(t)$, the probability that the spell lasts more than $t$
- Hazard function $\lambda(t)$: instantaneous rate (probability per unit of time) of leaving a state, conditional on survival to time $t$
- Cumulative hazard function
- The definitions differ for discrete data (years, weeks)
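The concepts on this slide are linked by standard identities. The block below is a sketch for a continuous duration $T$ with density $f(t) = dF(t)/dt$; the symbol $\Lambda(t)$ for the cumulative hazard is an assumption, since the slide does not name it.

```latex
\begin{align*}
\lambda(t) &= \lim_{h \to 0^+} \frac{P(t \le T < t + h \mid T \ge t)}{h}
            = \frac{f(t)}{S(t)} = -\frac{d \ln S(t)}{dt}, \\
\Lambda(t) &= \int_0^t \lambda(s)\, ds = -\ln S(t),
\qquad S(t) = \exp\{-\Lambda(t)\}.
\end{align*}
```

Any one of $F$, $f$, $S$, $\lambda$ or $\Lambda$ therefore determines the other four.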

Duration analysis: Censoring I
- Types:
  - Right: the duration of the spell is above a certain value, but we do not know by how much (e.g. we do not observe the end)
  - Left: the duration of the spell is below a certain value, but we do not know by how much (e.g. we do not observe the beginning)
  - Interval: the duration of the spell falls in a known time interval
- Type I censoring: all durations are censored above a fixed time $t_c$, e.g. testing of light bulbs
- Type II censoring: the study ceases after the $k$th failure, so there are only $k$ complete spells and all others are censored

Duration analysis: Censoring II [figure only]

Duration analysis: Censoring III
- Random censoring:
  - Each individual has a (potential) spell duration $T_i^*$ and a censoring time $C_i^*$
  - Observed duration: $T_i = \min(T_i^*, C_i^*)$
  - Indicator for non-censoring: $\delta_i = 1[T_i^* < C_i^*]$
- Independent censoring: the determinants of censoring are not informative about the duration
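A minimal Python sketch of the observation rule under random censoring. The exponential distributions and their rates (0.5 for the latent durations, 0.2 for the censoring times) are assumptions chosen only for illustration, as are the names `observe` and `spells`.

```python
import random

def observe(latent_duration, censoring_time):
    """Return (T_i, delta_i) for one spell under random right censoring."""
    t_obs = min(latent_duration, censoring_time)           # T_i = min(T*_i, C*_i)
    delta = 1 if latent_duration < censoring_time else 0   # delta_i = 1[T*_i < C*_i]
    return t_obs, delta

rng = random.Random(0)
# Hypothetical data: latent durations and censoring times drawn independently,
# which is the independent-censoring case described on the slide.
spells = [observe(rng.expovariate(0.5), rng.expovariate(0.2)) for _ in range(1000)]

# Share of spells whose end is actually observed
share_complete = sum(delta for _, delta in spells) / len(spells)
```

Only the pairs $(T_i, \delta_i)$ are available to the analyst; the latent $T_i^*$ of a censored spell is never seen.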

Duration analysis: Non-parametric estimates I
- Useful for descriptive statistics
- Without censoring: survival function $S(t) = (\#\ \text{spells of duration} > t)/N$
- With censoring:
  - $t_1 < t_2 < \dots < t_j < \dots < t_k$: the failure times
  - $d_j$: number of spells ending at time $t_j$
  - $m_j$: number of spells right-censored in $[t_j, t_{j+1})$
  - $r_j$: number of spells at risk at time $t_j$, $r_j = (d_j + m_j) + \dots + (d_k + m_k) = \sum_{l \ge j} (d_l + m_l)$

Duration analysis: Non-parametric estimates II
- Hazard function: $\lambda_j = d_j / r_j$, the number of spells ending at time $t_j$ out of all spells at risk at $t_j$
- Survival function (Kaplan-Meier estimator): $S(t) = \prod_{j \mid t_j \le t} (1 - \lambda_j) = \prod_{j \mid t_j \le t} (r_j - d_j)/r_j$
- Note: grouped data require an adjustment, because censoring can occur progressively over the interval
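The estimator can be sketched in a few lines of plain Python. The function name `kaplan_meier` and the seven spells below are hypothetical; a spell censored at the same time as a failure is counted as still at risk at that time.

```python
def kaplan_meier(durations, completed):
    """Return [(t_j, d_j, r_j, S(t_j))] for each distinct failure time.

    durations: observed T_i
    completed: delta_i (1 = spell ended, 0 = right-censored)
    """
    failure_times = sorted({t for t, d in zip(durations, completed) if d == 1})
    survival = 1.0
    table = []
    for t_j in failure_times:
        # d_j: spells ending at t_j
        d_j = sum(1 for t, d in zip(durations, completed) if t == t_j and d == 1)
        # r_j: spells still at risk at t_j (ended or censored at t_j or later)
        r_j = sum(1 for t in durations if t >= t_j)
        survival *= (r_j - d_j) / r_j   # multiply by (1 - lambda_j)
        table.append((t_j, d_j, r_j, survival))
    return table

# Hypothetical spells measured in weeks; 0 marks a right-censored spell
durations = [2, 3, 3, 5, 6, 6, 8]
completed = [1, 1, 0, 1, 1, 0, 1]
km = kaplan_meier(durations, completed)
```

Censored spells never trigger a step in the survival curve, but they stay in the risk set $r_j$ until their censoring time, which is how they enter the estimate.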

Duration analysis: Parametric estimates I
- Exponential distribution:
  - $h(t) = \lambda$: constant hazard of leaving the state (memoryless property)
  - Survival function: $S(t) = e^{-\lambda t}$
- Weibull distribution (more general):
  - $h(t) = \lambda \alpha t^{\alpha - 1}$; $S(t) = e^{-\lambda t^{\alpha}}$
  - $\alpha > 1$: $h(t)$ is increasing; $\alpha < 1$: $h(t)$ is decreasing
- Other: Gompertz (biostatistics); log-normal and log-logistic (hazard first increases and then decreases with $t$); gamma
- Regressors are introduced by letting $\lambda = e^{x\beta}$, with $\alpha$ left constant
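The Weibull formulas are easy to check numerically. The following is a small sketch with hypothetical function names; setting `alpha = 1` recovers the exponential case with constant hazard `lam`.

```python
import math

def weibull_hazard(t, lam, alpha):
    """h(t) = lam * alpha * t**(alpha - 1)."""
    return lam * alpha * t ** (alpha - 1)

def weibull_survival(t, lam, alpha):
    """S(t) = exp(-lam * t**alpha)."""
    return math.exp(-lam * t ** alpha)

def rate(x, beta):
    """lam = exp(x'beta): keeps the rate positive for any regressor values."""
    return math.exp(sum(xi * bi for xi, bi in zip(x, beta)))

# Hazard at t = 1 and t = 2 for three illustrative shape parameters
for alpha in (0.5, 1.0, 1.5):
    print(alpha, weibull_hazard(1.0, 0.3, alpha), weibull_hazard(2.0, 0.3, alpha))
```

The printed pairs decrease with $t$ for $\alpha < 1$, stay constant for $\alpha = 1$ and increase for $\alpha > 1$, matching the slide.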

Duration analysis: Maximum Likelihood Estimation
- Assumption: time-invariant regressors $x$ (they vary over individuals, not over the length of the spell)
- Uncensored observations contribute the density $f(t \mid x, \theta)$
- Censored observations contribute the probability $P(T > t) = S(t \mid x, \theta)$
- Thus the contribution of each observation is $f(t_i \mid x_i, \theta)^{\delta_i}\, S(t_i \mid x_i, \theta)^{1 - \delta_i}$, where $\delta_i = 1$ if the spell is not censored
- We look for the $\theta$ that maximizes the product of these contributions, i.e. the likelihood of the observed realization, or equivalently the log-likelihood
  $\ln L(\theta) = \sum_{i=1}^{N} \left[ \delta_i \ln f(t_i \mid x_i, \theta) + (1 - \delta_i) \ln S(t_i \mid x_i, \theta) \right]$
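As an illustration, here is the censored log-likelihood for the simplest case: an exponential model without regressors, so that $\theta = \lambda$, $\ln f(t) = \ln\lambda - \lambda t$ and $\ln S(t) = -\lambda t$. The data and function names are hypothetical. In this special case the maximizer has the closed form $\hat\lambda = \sum_i \delta_i / \sum_i t_i$.

```python
import math

def exp_loglik(lam, durations, completed):
    """ln L(lam) = sum_i [delta_i * ln f(t_i) + (1 - delta_i) * ln S(t_i)]."""
    total = 0.0
    for t, delta in zip(durations, completed):
        log_f = math.log(lam) - lam * t   # density of a completed spell
        log_s = -lam * t                  # survival probability of a censored spell
        total += delta * log_f + (1 - delta) * log_s
    return total

def exp_mle(durations, completed):
    """Closed-form maximizer: completed spells divided by total time at risk."""
    return sum(completed) / sum(durations)

# Hypothetical spells in weeks; 0 marks a right-censored spell
durations = [2, 3, 3, 5, 6, 6, 8]
completed = [1, 1, 0, 1, 1, 0, 1]
lam_hat = exp_mle(durations, completed)
```

Dropping the censored spells, or treating them as completed, would both bias $\hat\lambda$ upward; the $(1 - \delta_i) \ln S$ term is what lets them contribute correctly.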

Duration analysis: Maximum Likelihood Estimation II
Components of the likelihood: each type of observation contributes differently
- Complete durations: $f(t)$
- Left truncated at $t_L$ ($t > t_L$): $f(t)/S(t_L)$
- Left censored at $t_{CL}$: $1 - S(t_{CL})$
- Right censored at $t_{CR}$: $S(t_{CR})$
- Right truncated at $t_R$ ($t < t_R$): $f(t)/[1 - S(t_R)]$
- Interval censored at $t_{CL}, t_{CR}$: $S(t_{CL}) - S(t_{CR})$

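Duration analysis: Cox model I
- Proportional hazard model: $\lambda(t \mid x, \beta) = \lambda_0(t)\, \Phi(x, \beta)$
- Semi-parametric model:
  - The functional form of the baseline hazard $\lambda_0(t)$ is unspecified
  - The functional form of $\Phi(x, \beta)$ is fully specified, usually the exponential form $\exp(x\beta)$
- Interpretation of coefficients (effect of a change in $x_j$):
  - Discrete ($x_j$ increases by one unit): $\lambda(t \mid x_{new}, \beta) = \lambda_0(t) \exp(x\beta + \beta_j) = \exp(\beta_j)\, \lambda(t \mid x, \beta)$
  - Continuous: $\partial \lambda(t \mid x, \beta) / \partial x_j = \lambda_0(t) \exp(x\beta)\, \beta_j = \beta_j\, \lambda(t \mid x, \beta)$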
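A short numerical sketch of the discrete-change interpretation. The coefficient values, regressor values and baseline levels are hypothetical; the point is that the hazard ratio equals $\exp(\beta_j)$ whatever the baseline hazard is.

```python
import math

def hazard(baseline_hazard, x, beta):
    """lambda(t | x, beta) = lambda_0(t) * exp(x'beta), at one fixed t."""
    return baseline_hazard * math.exp(sum(xi * bi for xi, bi in zip(x, beta)))

beta = [0.4, -0.2]    # hypothetical coefficients
x = [1.0, 3.0]        # hypothetical regressor values
x_new = [2.0, 3.0]    # first regressor increased by one unit

# The ratio does not depend on the level of the baseline hazard
ratios = [hazard(lam0, x_new, beta) / hazard(lam0, x, beta) for lam0 in (0.05, 0.5, 2.0)]
```

Every entry of `ratios` equals `math.exp(0.4)`, about 1.49, so a one-unit increase in the first regressor raises the hazard by roughly 49 percent at every duration.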

Duration analysis: Cox model II
- Set-up:
  - $t_1 < t_2 < \dots < t_j < \dots < t_k$: $k$ discrete failure times
  - $R(t_j) = \{l : t_l \ge t_j\}$: set of spells at risk at $t_j$
  - $D(t_j) = \{l : t_l = t_j\}$: set of spells completed at $t_j$
  - $d_j = \sum_l 1(t_l = t_j)$: number of spells completed at $t_j$
- Probability that a particular spell at risk ends at time $t_j$: the ratio of its hazard to the sum of the hazards over the risk set $R(t_j)$. The baseline hazard drops out!
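Under the proportional hazard specification $\lambda(t \mid x, \beta) = \lambda_0(t)\,\Phi(x, \beta)$, the standard form of this conditional probability can be written as below. This is a reconstruction offered as a sketch rather than the slide's own formula, and it assumes a single failure at $t_j$.

```latex
\begin{align*}
\Pr\left[\text{spell } j \text{ ends at } t_j \mid \text{one spell in } R(t_j) \text{ ends at } t_j\right]
  &= \frac{\lambda(t_j \mid x_j, \beta)}{\sum_{l \in R(t_j)} \lambda(t_j \mid x_l, \beta)} \\
  &= \frac{\lambda_0(t_j)\, \Phi(x_j, \beta)}{\sum_{l \in R(t_j)} \lambda_0(t_j)\, \Phi(x_l, \beta)}
   = \frac{\Phi(x_j, \beta)}{\sum_{l \in R(t_j)} \Phi(x_l, \beta)}.
\end{align*}
```

Because $\lambda_0(t_j)$ is common to the numerator and to every term of the denominator, it cancels, which is why $\beta$ can be estimated without specifying the baseline hazard.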

Duration analysis: Cox model III
- We must control for tied durations (more than one failure at a given time): use the product of the individual probabilities of the spells in $D(t_j)$, each computed on the risk set $R(t_j)$
- Partial likelihood function: the joint product of the failure probabilities over all failure times
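A minimal sketch of the log partial likelihood with $\Phi(x, \beta) = \exp(x\beta)$. Ties are handled by multiplying the individual probabilities of the spells failing at $t_j$, which corresponds to the Breslow approximation; the function name and data layout are assumptions for illustration.

```python
import math

def cox_partial_loglik(beta, durations, completed, X):
    """Sum over failure times of
    sum_{m in D(t_j)} x_m'beta - d_j * ln( sum_{l in R(t_j)} exp(x_l'beta) ).

    X is a list of regressor vectors, one per spell.
    """
    score = [sum(b * xi for b, xi in zip(beta, x)) for x in X]
    failure_times = sorted({t for t, d in zip(durations, completed) if d == 1})
    loglik = 0.0
    for t_j in failure_times:
        # R(t_j): spells still at risk at t_j
        risk_set = [l for l, t in enumerate(durations) if t >= t_j]
        # D(t_j): spells completed at t_j
        failed = [l for l, (t, d) in enumerate(zip(durations, completed))
                  if t == t_j and d == 1]
        log_denom = math.log(sum(math.exp(score[l]) for l in risk_set))
        loglik += sum(score[l] for l in failed) - len(failed) * log_denom
    return loglik

# Hypothetical spells with a single binary regressor
durations = [2, 3, 3, 5, 6, 6, 8]
completed = [1, 1, 0, 1, 1, 0, 1]
X = [[1], [0], [1], [0], [1], [0], [0]]
ll_at_zero = cox_partial_loglik([0.0], durations, completed, X)
```

At $\beta = 0$ every spell in the risk set is equally likely to fail, so the value reduces to $-\sum_j d_j \ln r_j$. Maximizing this function over `beta` with any numerical optimizer gives the Cox estimate.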

Duration analysis: Time varying coefficients
- Problem:
  - The time variation may be endogenous: there is feedback between the duration of unemployment and the job search strategy
  - Basic case: external time variation
- Solution:
  - Very rough: replace the variable by its average value
  - Within the Cox approach, what matters at each failure time $t_j$ is the value of the regressor $x(t_j)$ for the observations in the risk set $R(t_j)$; this requires multiple observations for each subject
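One common way to obtain multiple observations per subject is to split each spell into episodes at the failure times, so that every episode carries the regressor value that applies at its end point. The sketch below assumes external time variation; the function `split_spell` and the regressor path `x_of_t` are hypothetical.

```python
def split_spell(spell_id, duration, completed, failure_times, x_of_t):
    """Split one spell into (id, start, stop, event, x) rows.

    A cut is made at every failure time before the spell's own end, so the row
    that ends at t_j records x(t_j), the value needed for the risk set R(t_j).
    """
    cuts = sorted(t for t in set(failure_times) if t < duration) + [duration]
    rows, start = [], 0
    for stop in cuts:
        # Only the final episode of a completed spell ends in a failure
        event = 1 if (stop == duration and completed == 1) else 0
        rows.append((spell_id, start, stop, event, x_of_t(stop)))
        start = stop
    return rows

# Hypothetical regressor: an indicator that switches on after week 4
rows = split_spell("A", 6, 1, [2, 3, 5, 6, 8], lambda t: 1 if t > 4 else 0)
```

For this spell the result is four episodes, (0, 2], (2, 3], (3, 5] and (5, 6], with the indicator equal to 0 in the first two and 1 in the last two, and the failure recorded only in the final episode.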