EPI 5344: Survival Analysis in Epidemiology
Hazard
March 3, 2015
Dr. N. Birkett, School of Epidemiology, Public Health & Preventive Medicine, University of Ottawa
01/2015
Hazard (1)
h(t)
– Instantaneous hazard
– Rate of the event occurring at time ‘t’, conditional on having survived event-free until time ‘t’
H(t)
– Cumulative hazard
– The ‘sum’ of all hazards from time ‘0’ to time ‘t’
– Area under the h(t) curve from ‘0’ to ‘t’
Hazard (2)
Simplest survival model assumes a constant hazard
– Yields an exponential survival curve
– Leads to basic epidemiology formulae for incidence, etc.
– More next week
Can extend it using the piecewise model
– Fits a different constant hazard for each follow-up time interval.
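As a minimal sketch of the constant-hazard model: with a constant hazard λ, survival is S(t) = exp(−λt) and cumulative incidence is 1 − S(t). The function name and the rate of 0.02 per person-year are hypothetical, chosen only for illustration.

```python
import math

def exponential_survival(hazard_rate, t):
    """Survival probability at time t under a constant hazard."""
    return math.exp(-hazard_rate * t)

# Hypothetical rate: 0.02 events per person-year, followed for 5 years
rate = 0.02
surv_5y = exponential_survival(rate, 5.0)
cum_incidence_5y = 1.0 - surv_5y  # cumulative incidence over 5 years

print(f"S(5) = {surv_5y:.4f}, cumulative incidence = {cum_incidence_5y:.4f}")
```

For small λt the cumulative incidence is close to λt itself, which is the usual link to the basic incidence formulae.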
Hazard estimation (1)
If hazard is not constant, how does it vary over time?
Hazard estimation (2)
How can we estimate the hazard?
– Parametric methods (not discussed today)
– Non-parametric methods
We can estimate:
– h(t)
– H(t)
Hazard estimation (3)
Preference is to estimate H(t)
– Nelson-Aalen method is the main approach.
Let’s look at direct estimation of h(t)
– Works from a piece-wise constant hazard model
Start by dividing follow-up time into intervals
– Actuarial has pre-defined intervals
– KM uses time between events as intervals.
Hazard estimation (4)
Direct hazard estimation has issues
– h(t) shows much random variation
– Unstable estimates due to small event numbers in time intervals
– Works ‘best’ for actuarial method since intervals are pre-defined
– Length is generally the same for each interval ($u_i$).
h(t) direct estimation
Hazard estimation (5)
Actuarial method to estimate h(t)
– Length is generally the same for each interval ($u_i$).
Standard ID formula from Epi
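The formula itself did not survive on the slide. A sketch under the assumption that it is the usual life-table form, h = I / (u × (N − W/2 − I/2)), where N is the number at the start of the interval, W the number lost, I the number of events and u the interval length. The function name and the input numbers are hypothetical.

```python
def actuarial_hazard(n_start, n_lost, n_events, interval_length=1.0):
    """Actuarial (life-table) hazard estimate for one interval.

    Assumed formula: events divided by approximate person-time, where
    people lost or having the event contribute half an interval each.
    """
    person_time = interval_length * (n_start - n_lost / 2.0 - n_events / 2.0)
    return n_events / person_time

# Hypothetical interval: 1000 at start, 100 lost, 50 events, 1-year interval
h_example = actuarial_hazard(n_start=1000, n_lost=100, n_events=50)
print(f"h = {h_example:.4f} per year")
```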
Columns (A–I): Year | # people still alive | # lost | # people dying in this year | Effective # at risk | Prob die in year | Prob survive this year | Cum. Prob of surviving to this year | Cum. Prob of dying by this year
,0005,0001,5007, ,5001, , ,
Last week, we used this data to illustrate the actuarial method
Let’s use it to estimate h(t)
Column labels: $N_t$ (# people still alive), $W_t$ (# lost), $I_t$ (# people dying in this year), h(t)
Columns: Year | # people still alive | # lost | # people dying in this year | Effective # at risk | Prob die in year
,0005,0001,5007, ,5001, , ,
Hazard estimation (6)
Person-time variant
– Divide follow-up time into fixed intervals
– Compute actual person-time in each interval (rather than using an approximation).
– Gives a slightly smoother curve
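A minimal sketch of the person-time variant, assuming individual follow-up times are available: each subject contributes the time actually spent in each interval, and the hazard is events divided by that person-time. The function name and the four-subject data set are hypothetical.

```python
def person_time_hazard(times, events, cuts):
    """Hazard per interval = events in interval / person-time in interval.

    times  : follow-up time for each subject
    events : 1 if follow-up ended with the event, 0 if censored
    cuts   : interval boundaries, e.g. [0, 1, 2] gives (0,1] and (1,2]
    """
    hazards = []
    for start, end in zip(cuts[:-1], cuts[1:]):
        # Time each subject actually spent inside this interval
        person_time = sum(min(t, end) - start for t in times if t > start)
        n_events = sum(
            1 for t, e in zip(times, events) if e == 1 and start < t <= end
        )
        hazards.append(n_events / person_time if person_time > 0 else float("nan"))
    return hazards

# Hypothetical data: 4 subjects followed for up to 2 years
times = [0.5, 1.5, 2.0, 2.0]
events = [1, 1, 0, 1]
pt_hazards = person_time_hazard(times, events, cuts=[0, 1, 2])
print(pt_hazards)
```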
Hazard estimation (7)
Kaplan-Meier method to estimate h(t)
– ‘Interval’ is the time between death events
  Varies irregularly
– Formula has the same structure as the person-time estimate given above:
$h(t_i) = \frac{d_i}{n_i \, u_i}$
$d_i$ = # with event
$u_i = t_{i+1} - t_i$
$n_i$ = size of risk set at ‘$t_i$’
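A short sketch of this estimate, taking the hazard in the interval starting at each event time as d_i / (n_i × u_i). The function name and the three event times are hypothetical; the last event time is skipped because no later event defines its interval width.

```python
def km_interval_hazard(event_times, n_at_risk, n_events):
    """Hazard between successive event times: d_i / (n_i * u_i)."""
    hazards = []
    for i in range(len(event_times) - 1):
        u_i = event_times[i + 1] - event_times[i]  # width of interval i
        hazards.append(n_events[i] / (n_at_risk[i] * u_i))
    return hazards

# Hypothetical event times with risk-set sizes and event counts
event_times = [2.0, 5.0, 6.0]
n_at_risk = [10, 8, 5]
n_events = [1, 2, 1]
km_hazards = km_interval_hazard(event_times, n_at_risk, n_events)
print(km_hazards)
```

The two estimates differ by a factor of about 7 even though only one extra event is involved, which illustrates how noisy these interval-by-interval values can be.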
Hazard estimation (8)
Issues with using KM method to estimate h(t)
– Normally, only have 1 or 2 events in the numerator
– Makes estimates ‘unstable’: liable to considerable random variation and noise
– Do not usually estimate h(t) from KM methods
– Use a kernel smoothing approach to improve estimates
Hazard estimation (9)
Estimating cumulative hazard: H(t)
– Measures the area under the h(t) curve.
Tends to be more stable since it is based on the number of events from ‘0’ to ‘t’ rather than the number in the last interval
Hazard estimation (10)
Simple approach
– Estimate h(t) assuming a piece-wise constant model
– H(t) is the sum of the pieces.
– For each ‘piece’ before time ‘t’, multiply the estimated ‘$h_i$’ for the interval by the length of the interval it is based on.
– Add these up across all ‘pieces’ before time ‘t’.
  Width of the last ‘piece’ is up to ‘t’ only
– Relates to the density method from epi
H(t) estimation based on piecewise estimation of h(t)
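The ‘add up the pieces’ idea can be sketched as follows: H(t) is the sum of hazard × width over the intervals before ‘t’, with the last piece truncated at ‘t’. The function name, interval boundaries and hazard values are hypothetical.

```python
def cumulative_hazard_piecewise(t, boundaries, hazards):
    """H(t) for a piecewise-constant hazard.

    boundaries : [b0, b1, ..., bk] with b0 = 0
    hazards    : hazards[i] applies between boundaries[i] and boundaries[i+1]
    """
    total = 0.0
    for start, end, h in zip(boundaries[:-1], boundaries[1:], hazards):
        if t <= start:
            break
        # Last piece only counts up to t
        total += h * (min(t, end) - start)
    return total

# Hypothetical yearly hazards over three years
boundaries = [0.0, 1.0, 2.0, 3.0]
hazards = [0.10, 0.20, 0.15]
H_at_2_5 = cumulative_hazard_piecewise(2.5, boundaries, hazards)
print(f"H(2.5) = {H_at_2_5:.3f}")
```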
Hazard estimation (11)
Four ways you can do this:
1. Actuarial using ‘epi’ formula
2. Actuarial using person-time method
3. Kaplan-Meier approach using Nelson-Aalen estimator
4. Kaplan-Meier approach using –log(S(t))
We’ve discussed methods 1 and 2.
– Generate h(t)
– Just add things up
Let’s talk about 3 and 4
Hazard estimation (12)
Nelson-Aalen estimator for H(t)
– Apply the above approach, defining intervals by using the time points for events
– Most commonly used approach to estimate H(t)
– Related to Kaplan-Meier method
Compute H(t) at each time when an event happens:
$\hat{H}(t) = \sum_{t_i \le t} \frac{d_i}{n_i}$
$d_i$ = # with event at ‘$t_i$’
$n_i$ = size of risk set at ‘$t_i$’
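A minimal sketch of the Nelson-Aalen calculation from raw follow-up data: at each event time, add d_i / n_i to the running total. The function name and the five-subject data set are hypothetical; subjects censored at an event time are counted as still at risk at that time.

```python
def nelson_aalen(times, events):
    """Return [(t_i, H(t_i))] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event occurred at that time, 0 if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    cum_hazard = 0.0
    steps = []
    for t_i in event_times:
        n_i = sum(1 for t in times if t >= t_i)  # size of risk set
        d_i = sum(1 for t, e in zip(times, events) if t == t_i and e == 1)
        cum_hazard += d_i / n_i
        steps.append((t_i, cum_hazard))
    return steps

# Hypothetical data: 5 subjects, 3 events, 2 censored
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
na_steps = nelson_aalen(times, events)
for t_i, H in na_steps:
    print(f"t = {t_i}: H(t) = {H:.2f}")
```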
Hazard estimation (13)
Another approach to estimate H(t)
Use –log(S(t))
– From our basic formulae, we have: $H(t) = -\log S(t)$
– Estimate S(t) and convert using this formula
Hazard estimation (14)
For those who care, methods 3 and 4 are very similar
From KM, the estimate of S(t) is:
$\hat{S}(t) = \prod_{t_i \le t} \left(1 - \frac{d_i}{n_i}\right)$
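Hazard estimation (15)
Hence, we have:
$-\log \hat{S}(t) = -\sum_{t_i \le t} \log\left(1 - \frac{d_i}{n_i}\right)$
But, for small values of x, we have:
$\log(1 - x) \approx -x$
So, we get:
$-\log \hat{S}(t) \approx \sum_{t_i \le t} \frac{d_i}{n_i}$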
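A quick numerical check of how close the two estimates are when d_i / n_i is small. The event counts and risk-set sizes are hypothetical.

```python
import math

# Hypothetical: one event at each of three event times, large risk sets
d = [1, 1, 1]
n = [100, 99, 98]

# Method 3: Nelson-Aalen sum of d_i / n_i
nelson_aalen_H = sum(d_i / n_i for d_i, n_i in zip(d, n))

# Method 4: -log of the Kaplan-Meier product
km_S = math.prod(1 - d_i / n_i for d_i, n_i in zip(d, n))
neg_log_km = -math.log(km_S)

print(f"Nelson-Aalen H = {nelson_aalen_H:.5f}")
print(f"-log(KM S)     = {neg_log_km:.5f}")
```

Because −log(1 − x) is always at least x, the −log(S(t)) version is never smaller than the Nelson-Aalen value; the gap grows as risk sets shrink.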
Numerical example
Columns: ID | Time (months) | Censored
114XXXXX XXXXX 545XXXXX XXXXX 992XXXXX 10111XXXXX
Very coarse: 10 events in 10 years
Actuarial Method for h(t)
Columns: Year | # people under follow-up | # lost | # people dying in this year | h(t) | H(t)
Nelson-Aalen estimate of H(t)
Columns: Interval | Computation | H(t) from actuarial method
0-22H(t) = H(t) = H(t) = H(t) = H(t) =
A new example
Hazard estimation (10)
H(t) has many uses, largely based on:
$S(t) = e^{-H(t)}$
Nelson-Aalen (2)
Estimating H(t) gives another way to estimate S(t).
Uses formula: $S(t) = e^{-H(t)}$
Columns: Interval | H(t) | S(t) | Cum Incid(t)
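A small sketch of the conversion, assuming the relationship S(t) = exp(−H(t)) and cumulative incidence = 1 − S(t). The H(t) values are hypothetical and are not the ones from the slide's table.

```python
import math

# Hypothetical cumulative hazard estimates at three time points
cum_hazards = [0.10, 0.25, 0.50]

rows = []
for H in cum_hazards:
    S = math.exp(-H)        # survival from cumulative hazard
    cum_incid = 1.0 - S     # cumulative incidence
    rows.append((H, S, cum_incid))

for H, S, ci in rows:
    print(f"H(t) = {H:.2f}  S(t) = {S:.4f}  Cum Incid(t) = {ci:.4f}")
```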
Key for testing proportional hazards assumption (later)
Suppose the hazard is a constant (λ), then we have:
$S(t) = e^{-\lambda t}$, so $\ln S(t) = -\lambda t$
Plot ‘ln(S(t))’ against ‘t’. A straight line indicates a constant hazard.
Approach can be used to test other models (e.g. Weibull).
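A sketch of the straight-line check, using survival values generated from a known constant hazard rather than real data. If ln(S(t)) is linear in ‘t’ through the origin, minus the slope recovers λ. The helper name and the value λ = 0.3 are hypothetical.

```python
import math

def slope_through_origin(xs, ys):
    """Least-squares slope for a line forced through (0, 0)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

true_lambda = 0.3                      # hypothetical constant hazard
ts = [1.0, 2.0, 3.0, 4.0]
surv = [math.exp(-true_lambda * t) for t in ts]
log_surv = [math.log(s) for s in surv]

estimated_lambda = -slope_through_origin(ts, log_surv)
print(f"estimated lambda = {estimated_lambda:.3f}")
```

With estimated survival curves the points will scatter around the line; systematic curvature is what suggests a non-constant hazard.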
Smoothing & hazard estimation
Using KM to give a direct estimate of h(t) is very unstable
– Only 1 event per time point
Instead, apply a smoothing method to generate an estimate of h(t)
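The slides do not say which smoother is used. One common choice, sketched here as an assumption, is a kernel-smoothed hazard: a weighted sum of the Nelson-Aalen increments d_i / n_i near ‘t’, using an Epanechnikov kernel and a bandwidth b. The function names, increments and bandwidth are hypothetical, and no boundary correction is applied.

```python
def epanechnikov(x):
    """Epanechnikov kernel: 0.75 * (1 - x^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1 - x * x) if abs(x) <= 1 else 0.0

def smoothed_hazard(t, event_times, increments, bandwidth):
    """Kernel-smoothed h(t) = (1/b) * sum K((t - t_i)/b) * (d_i / n_i)."""
    total = sum(
        epanechnikov((t - t_i) / bandwidth) * dH
        for t_i, dH in zip(event_times, increments)
    )
    return total / bandwidth

# Hypothetical Nelson-Aalen increments d_i / n_i at three event times
event_times = [2.0, 3.0, 5.0]
increments = [0.20, 0.25, 0.50]
h_at_3 = smoothed_hazard(3.0, event_times, increments, bandwidth=2.0)
print(f"smoothed h(3) = {h_at_3:.3f}")
```

A wider bandwidth gives a smoother but more biased curve; the choice of bandwidth matters more than the choice of kernel.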
Example (from Allison)
Recidivism data set
– 432 male inmates released from prison
– Followed for 52 weeks
– Dates of re-arrests were recorded
– Study designed to examine the impact of a financial support programme on reducing re-arrest
Simple hazard estimates using actuarial method
Adjusted hazard estimates using actuarial method: last interval ends at 53 weeks, not 60 weeks
Proportional Hazards
Proportional Hazards (1)
Suppose we have two groups followed over time (say treatment groups in an RCT).
How will the hazards in the two groups relate?
– There need be no specific relationship
– They could even go in opposite directions
Hazard functions for 2 hypothetical groups in an RCT
Proportional Hazards (2)
Often, it is reasonable to place restrictions on how the hazards relate
Consider a situation where the hazard is constant over time:
– Experimental: $\lambda_e$
– Control: $\lambda_c$
A simple example of Proportional Hazards
[Figure: constant hazards $\lambda_c$ and $\lambda_e$ over follow-up time]
The ratio of the hazard in one group to the other is constant for all follow-up time.
Proportional Hazards (3)
What if the hazard is not constant over time?
– Relationship between curves can be complex
– It is common to make the assumption that the hazard curves are proportional over all follow-up time
Proportional Hazards (PH)
[Figure: hazard curves $\lambda_c$ and $\lambda_e$ over follow-up time]
The hazard in the experimental group is a constant multiple of that in the control group for all follow-up time.
Proportional Hazards (4)
PH is easier to see if we look at the logarithm of the hazards.
The difference in the log-hazards is constant over time.
– Means that the curves are a fixed distance apart
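A short sketch of why the log-hazard curves sit a fixed distance apart: if the experimental hazard is a constant multiple HR of the control hazard, the difference in log-hazards equals log(HR) at every time point. The control hazard function and HR = 0.6 are hypothetical.

```python
import math

def control_hazard(t):
    """Hypothetical control-group hazard that rises over time."""
    return 0.05 + 0.01 * t

hazard_ratio = 0.6  # hypothetical: experimental hazard is 60% of control

ts = [1.0, 2.0, 5.0, 10.0]
log_diffs = [
    math.log(hazard_ratio * control_hazard(t)) - math.log(control_hazard(t))
    for t in ts
]

for t, diff in zip(ts, log_diffs):
    print(f"t = {t:>4}: log h_e(t) - log h_c(t) = {diff:.4f}")
```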
[Figure: curves labelled $\lambda_c$ and $\lambda_e$]
Proportional Hazards (5)
If PH is true, then we frequently designate one group as the reference group (0).
$\frac{h_1(t)}{h_0(t)} = HR$
Re-write this to get:
$h_1(t) = HR \times h_0(t)$
Proportional Hazards (6)
In above equation, HR can be affected by patient characteristics
– Age
– Sex
– Residence
– Baseline disease severity
Can model this as:
Proportional Hazards (7)
Most common form for this model is:
$h(t \mid x_1, \ldots, x_k) = h_0(t) \, \exp(\beta_1 x_1 + \cdots + \beta_k x_k)$
Model underlies the Cox regression approach.
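A minimal sketch of what the log-linear form implies, assuming the usual Cox model h(t | x) = h0(t) × exp(β·x): the hazard ratio between two covariate profiles depends only on the coefficients and the covariate differences, not on h0(t). The coefficients and covariate values are hypothetical, not fitted estimates.

```python
import math

def cox_hazard_ratio(betas, x, x_ref):
    """HR comparing profile x to profile x_ref under h0(t) * exp(beta . x)."""
    linear_diff = sum(b * (xi - ri) for b, xi, ri in zip(betas, x, x_ref))
    return math.exp(linear_diff)

# Hypothetical coefficients: age (per year) and sex (1 = male, 0 = female)
betas = [0.03, 0.5]
hr_example = cox_hazard_ratio(betas, x=[60, 1], x_ref=[50, 0])
print(f"HR = {hr_example:.3f}")
```

The baseline hazard h0(t) cancels in the ratio, which is why Cox regression can estimate the coefficients without specifying it.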
Reminder & Warning
Proportional Hazards is an ASSUMPTION
It need not be true
Not all probability models for survival curves lead to PH
PH is less likely to be true when the follow-up time gets very long