1 EPI 5344: Survival Analysis in Epidemiology
Introduction to concepts and basic methods
February 24, 2015
Dr. N. Birkett, School of Epidemiology, Public Health & Preventive Medicine, University of Ottawa
01/2015

2 Survival concepts (1)
Cohort studies
  Follow up a pre-defined group of people for a period of time, which can be:
    The same time for everyone
    Different times for different people
  Determine which people achieve the specified outcome.

3 Survival concepts (2)
Cohort studies
  Outcomes could be many different things, such as:
    Death
      Any cause
      Cause-specific
    Onset of new disease
    Resumption of smoking in someone who had quit
    Recidivism for drug use or criminal activity
    Change in a numerical measure such as blood pressure
      Longitudinal data analysis

4 Survival concepts (3)
Cohort studies
  Traditional approach to cohorts assumes everyone is followed for the same time
    Incidence proportion
    Logistic regression modeling
  If follow-up time varies, what do you do with subjects who don’t make it to the end of the study?
    Censoring

5 Survival concepts (4)
Cohort studies
  Cohort studies can provide more information than presence/absence of outcome.
    Time when outcome occurred
    Type of outcome (competing outcomes)
  Can look at rate or speed of development of outcome
    Incidence rate
    Person-time

6 Survival concepts (5)
Time to event analysis
  Survival Analysis (general term)
    Life tables
    Kaplan-Meier curves
    Actuarial methods
    Log-rank test
    Cox modeling (proportional hazards)
  Strong link to engineering
    Failure time studies

7 Survival concepts (6)
Common epidemiological approach to the analysis of cohort studies
  Most common outcome measure is:
    Incidence proportion
    Cumulative incidence
  Select a point in time as the end of follow-up.
  Compare groups using t-test
  Logistic regression is commonly used
    Produces a CIR (RR)

8 Survival concepts (7)
Issues with this approach include:
  What point in time to use?
  What if not all subjects remain under follow-up that long?
  Ignores information from subjects who don’t get outcome or reach the time point
  What is incidence proportion for the outcome ‘death’ if we set the follow-up time to 200 years?
    Will always be 100%

9 Survival concepts (8)
Alternate method uses incidence rate (density)
  Based on person-time of follow-up
  Can include information on drop-outs, etc.
  Closely linked to survival analysis methods

10 Survival concepts (9)
Cumulative incidence
  The probability of becoming ill over a pre-defined period of time.
  No units
  Range 0-1
Incidence density (rate)
  The rate at which people get ill during person-time of follow-up
  Units: 1/time or cases/person-time
  Range 0 to +∞
  Very closely related to hazard rate.
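
As a rough illustration of the two measures, the definitions can be written out and applied to made-up numbers. The cohort below (1,000 people, 10 cases, 1,950 person-years of follow-up) is purely hypothetical and is not taken from the lecture.

```latex
\[
\text{CI} = \frac{\text{new cases during the period}}{\text{people at risk at the start of the period}},
\qquad
\text{ID} = \frac{\text{new cases}}{\text{total person-time at risk}}
\]
% Hypothetical cohort: 1000 people, 10 cases, 1950 person-years observed
\[
\text{CI} = \frac{10}{1000} = 0.01 \ \text{(no units)},
\qquad
\text{ID} = \frac{10}{1950\ \text{person-years}} \approx 0.0051\ \text{per person-year}
\]
```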

11 Measuring Time (1)
Units to use to measure time
  Normally, years/months/days
  Time of events is usually measured using dates on a calendar
  Other measures are possible (e.g. hours)
‘Scale’ to be used
  Time on study
  Age
  Calendar date
Time ‘0’ (‘origin of time’)
  The point when time starts

12 Time Scale (1)
Time of events is usually measured using ‘calendar dates’
Can be represented in graphic display by ‘time lines’
  The conceptual idea used in analyses
  Patient #1 enters on Feb 15, 2000 & dies on Nov 8, 2000
  Patient #2 enters on July 2, 2000 & is lost (censored) on April 23, 2001
  Patient #3 enters on June 5, 2001 & is still alive (censored) at the end of the follow-up period
  Patient #4 enters on July 13, 2001 & dies on December 12, 2002

13 [Figure: time lines for the four patients; D = died, C = censored]

14 Time Scale (2)
In RCTs, focus is commonly on ‘study time’
  How long after a patient starts follow-up do their events occur?
  Need to define a ‘time 0’, the point when study time starts accumulating for each patient.
Frequently used as the ‘default’ in observational research
Most epidemiologists recommend using ‘age’ as the time scale for etiological studies
  More in Session 6
For now, focus on ‘study time’ as the time scale

15 Origin of Time (1)
Choice of time ‘0’ affects analysis
  Can produce very different regression coefficients and model fit
Preferred origin is often unavailable
More than one origin may make sense
  No clear criterion to choose which to use

16 Time ‘0’ (2)
No best time ‘0’ for all situations
  Depends on study objectives and design
RCT of Rx
  ‘0’ = date of randomization
Prognostic study
  ‘0’ = date of disease onset
  Inception cohort
  Often use: date of disease diagnosis

17 Time ‘0’ (3)
‘Point source’ exposure
  Use date of event
    Hiroshima atomic bomb
    Dioxin spill (Seveso, Italy)

18 Time ‘0’ (4)
Chronic exposure
  Date of study entry
  Date of first exposure
  For age as time scale, time ‘0’ is date of birth
Issues to consider
  There often is no first exposure (or no clear date of first exposure)
  Recruitment long after first exposure
    Immortal person-time
    Lack of info on early events.

19 Time ‘0’ (5)
Here is our sample time line data
  Convert for analysis by defining a time ‘0’
  Patient #1 enters on Feb 15, 2000 & dies on Nov 8, 2000
  Patient #2 enters on July 2, 2000 & is lost (censored) on April 23, 2001
  Patient #3 enters on June 5, 2001 & is still alive (censored) at the end of the follow-up period
  Patient #4 enters on July 13, 2001 & dies on December 12, 2002
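
A minimal SAS sketch of this conversion, assuming each patient's entry date is used as time ‘0’. The data set and variable names are made up for illustration, and the end of follow-up for Patient #3 is not given on the slide, so Dec 31, 2002 is used here only as a placeholder.

```sas
data timeline;
  /* the colon modifier reads each date with the date9. informat */
  input ptid entry :date9. exit :date9. died;
  /* SAS dates are day counts, so subtraction gives days on study */
  studydays = exit - entry;
  format entry exit date9.;
  datalines;
1 15FEB2000 08NOV2000 1
2 02JUL2000 23APR2001 0
3 05JUN2001 31DEC2002 0
4 13JUL2001 12DEC2002 1
;
run;

proc print data=timeline noobs;
run;
```

Here `died = 1` marks a death and `died = 0` a censored observation; `exit` holds the date of death, loss, or end of follow-up, whichever applies.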

20 Time ‘0’ (6)
Calendar time can be very important
  Uses the actual date of the event
  Studies of incidence/mortality trends
  Normally uses Poisson or similar models
In survival analysis, focus is on ‘study time’
  When, after a patient starts follow-up, do their events occur?
  Need to change time lines to reflect new time scale

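21 [Figure: time lines for the four patients; D = died, C = censored]

22 [Figure: time lines for the four patients; D = died, C = censored]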

23 Study course for patients in cohort
[Figure: time axis marked 2001, 2003, 2013]

25 Time ‘0’ (7)
Can be interested in more than one ‘event’
  More than one ‘time to event’
An example:
  Patients treated for malignant melanoma
  Treated with drug ‘A’ or ‘B’
  Expected to influence both:
    Time to relapse
    Time to death (survival)

26 Time ‘0’ (8)
Surgical treatment for breast cancer
Four time points:
  Date of surgery
  Relapse
  Death
  Last follow-up (if still alive without relapse)
SAS code to compute time-to-event.

27 Time ‘0’ (9)
Time ‘0’: Date of surgery
Event #1: Relapse
  Earliest of relapse/death/end
Event #2: Death
  Earliest of death/end

28 How do we compute the ‘time on study’ for each of these events?
Convert to days (or weeks, months, years) from time ‘0’ for each person
Let’s talk some SAS

29 Dates in SAS (1)
Multiple ways to get date data into SAS
  I commonly use three variables for each date:
    Day
    Month
    Year
  Facilitates data entry and editing
  Requires more complicated manipulation later
Stored as SAS date variables
  Multiple formats available for data entry
  Always stored as # days since Jan 1, 1960.
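
One way the ‘more complicated manipulation’ might look is sketched below, using the built-in MDY function to combine the three pieces into a SAS date. The data set and variable names (surgery, surg_day, surg_month, surg_year) are hypothetical.

```sas
data surgery;
  input ptid surg_day surg_month surg_year;
  /* MDY takes month, day, year (in that order) and returns a SAS date, */
  /* i.e. the number of days since Jan 1, 1960                          */
  surgdate = mdy(surg_month, surg_day, surg_year);
  format surgdate date9.;
  datalines;
13725 5 10 1995
;
run;

proc print data=surgery noobs;
run;
```

If any of the three parts is missing or invalid, MDY returns a missing value, which fits the convention that an event which did not happen has a missing date.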

30 Dates in SAS (2)
data dates;
  input ptid surgdate mmddyy8.;
  datalines;
13725 10/5/95
25422 3/7/97
34721 9/6/94
11111 6/6/55
;
run;
proc print data=dates;
run;

31 Dates in SAS (3)
Obs   ptid    surgdate
1     13725   13061
2     25422   13580
3     34721   12667
4     11111   -1670

32 Dates in SAS (4)
data dates;
  input ptid surgdate mmddyy8.;
  datalines;
13725 10/5/95
25422 3/7/97
34721 9/6/94
11111 6/6/55
;
run;
proc print data=dates;
  format surgdate date9.;
run;

33 Dates in SAS (5)
Obs   ptid    surgdate
1     13725   05OCT1995
2     25422   07MAR1997
3     34721   06SEP1994
4     11111   06JUN1955

34 Time ‘0’
Read the date data using a ‘date format’
If the event didn’t happen, then the date is ‘missing’

36 SAS code to create event variables
data melanoma;
  set melanoma;
  /* surv: survevent = 0 if alive at the end of follow-up, 1 if died */
  if (date_of_death = .) then survevent = 0;
  else survevent = 1;
run;

37 SAS code to create event variables
data melanoma;
  set melanoma;
  /* surv: survevent = 0 if alive at the end of follow-up, 1 if died */
  /* a logical expression evaluates to 1 (true) or 0 (false) */
  survevent = (date_of_death ne .);
run;

38 SAS code to create event variables
data melanoma;
  set melanoma;
  /* surv: survevent = 0 if alive at the end of follow-up, 1 if died */
  survevent = (date_of_death ne .);
  /* dividing days by 30.4 gives approximate months */
  if (survevent = 0) then survtime = (date_of_last - date_of_surg)/30.4;
  else survtime = (date_of_death - date_of_surg)/30.4;
run;

39 SAS code to create event variables
data melanoma;
  set melanoma;
  /* surv: survevent = 0 if alive at the end of follow-up, 1 if died */
  survevent = (date_of_death ne .);
  if (survevent = 0) then survtime = (date_of_last - date_of_surg)/30.4;
  else survtime = (date_of_death - date_of_surg)/30.4;
  /* dfs: dfsevent = 1 if died or relapsed, 0 otherwise */
  if ((date_of_relapse = .) and (date_of_death = .)) then dfsevent = 0;
  else dfsevent = 1;
run;

40 SAS code to create event variables
data melanoma;
  set melanoma;
  /* surv: survevent = 0 if alive at the end of follow-up, 1 if died */
  survevent = (date_of_death ne .);
  if (survevent = 0) then survtime = (date_of_last - date_of_surg)/30.4;
  else survtime = (date_of_death - date_of_surg)/30.4;
  /* dfs: dfsevent = 1 if died or relapsed, 0 otherwise */
  dfsevent = 1 - (date_of_relapse = .)*(date_of_death = .);
run;

41 SAS code to create event variables
data melanoma;
  set melanoma;
  /* surv: survevent = 0 if alive at the end of follow-up, 1 if died */
  survevent = (date_of_death ne .);
  if (survevent = 0) then survtime = (date_of_last - date_of_surg)/30.4;
  else survtime = (date_of_death - date_of_surg)/30.4;
  /* dfs: dfsevent = 1 if died or relapsed, 0 otherwise */
  dfsevent = 1 - (date_of_relapse = .)*(date_of_death = .);
  /* disease-free time runs to the earliest of relapse, death, or last follow-up */
  if (dfsevent = 0) then dfstime = (date_of_last - date_of_surg)/30.4;
  else if (date_of_relapse ne .) then dfstime = (date_of_relapse - date_of_surg)/30.4;
  else if (date_of_relapse = . and date_of_death ne .) then dfstime = (date_of_death - date_of_surg)/30.4;
  else dfstime = .E;  /* special missing value flags an unexpected combination */
run;
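
Once the time and event variables exist, they could be passed to a survival procedure. The following is only a sketch of one possible next step, using PROC LIFETEST to produce Kaplan-Meier estimates (the method listed earlier in the lecture); the lecture itself does not show this call.

```sas
/* Overall survival: survevent(0) tells SAS that 0 marks a censored observation */
proc lifetest data=melanoma;
  time survtime*survevent(0);
run;

/* Disease-free survival, using the second pair of variables */
proc lifetest data=melanoma;
  time dfstime*dfsevent(0);
run;
```

Observations where the time variable is missing (including the special missing value .E) are left out of the analysis by the procedure.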

43 Survival curve (1)
What can we do with data which includes time-to-event?
Might be nice to see a picture of the number of people surviving from the start to the end of follow-up.

44 Sample Data: Mortality, no losses
Year   # still alive   # dying in the year
2000   10,000          2,000
2001    8,000          1,600
2002    6,400          1,280
2003    5,120          1,024
2004    4,096            820

45 Not the right axis for a survival curve
[Figure: number of people still alive, by year]

46 Survival curve (2)
Previous graph has a problem
  What if some people were lost to follow-up?
  Plotting the number of people still alive would effectively say that the lost people had all died.

47 Sample Data: Mortality, with losses
Year   # still alive   # dying in the year   Lost to follow-up
2000   10,000          2,000                 1,000
2001    7,000          1,400                   800
2002    4,800            960                   500
2003    3,340            670                   400
2004    2,270            460                   260

49 Survival curve (2)
Previous graph has a problem
  What if some people were lost to follow-up?
  Plotting the number of people still alive would effectively say that the lost people had all died.
Instead
  True survival curve plots the probability of surviving.
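
One possible way to turn the sample data with losses into survival probabilities is sketched below. It assumes the actuarial convention that people lost during a year were at risk for half of that year; the slides do not state which adjustment was used for the plotted curve, so this choice, along with the data set and variable names, is an assumption.

```sas
data lifetable;
  input year alive died lost;
  retain surv 1;              /* cumulative survival probability, starts at 1 */
  atrisk = alive - lost/2;    /* assumption: losses count as at risk for half the year */
  q = died / atrisk;          /* conditional probability of dying during the year */
  surv = surv * (1 - q);      /* probability of surviving to the end of the year */
  datalines;
2000 10000 2000 1000
2001 7000 1400 800
2002 4800 960 500
2003 3340 670 400
2004 2270 460 260
;
run;

proc print data=lifetable noobs;
  var year alive died lost q surv;
run;
```

Plotting `surv` against `year` gives a curve on the probability scale, so people lost to follow-up are no longer treated as if they had died.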

52 Survival Curves (1)
Primary outcome is ‘an event happened’
You need to know:
  Type of event
  Time to event
Person   Type    Time
1        Death   100
2        Alive   200
3        Lost    150
4                 65
And so on

53 Survival Curves (2)
Censoring (censored outcome)
  People who do not have the targeted outcome (e.g. death)
  For now, assume no censoring
How do we represent the ‘time’ data in a statistical method?
  Histogram of death times - f(t)
  Survival curve - S(t)
  Hazard curve - h(t)
To know one is to know them all

54 Histogram of death time
Skewed to right
pdf or f(t)
CDF or F(t)
  Area under ‘pdf’ from ‘0’ to ‘t’
[Figure: pdf of death times with the area F(t) marked from 0 to t]

55 Survival curves (3)
Plot % of group still alive (or % dead)
S(t) = survival curve
     = % still surviving at time ‘t’
     = P(survive to time ‘t’)
Cumulative mortality = 1 - S(t) = F(t) = cumulative incidence

56 [Figure: survival curve S(t) and deaths curve CI(t) = 1 - S(t), plotted against t]

57 ‘Rate’ of dying
Consider these 2 survival curves
  Which has the better survival profile?
  Both have S(3) = 0

59 Survival curves (4)
Most people would prefer to be in group ‘A’ than group ‘B’.
  Death rate is lower in first two years.
  Will live longer than in pop ‘B’
Concept is called:
  Hazard: Survival analysis/stats
  Force of mortality: Demography
  Incidence rate/density: Epidemiology

60 Survival curves (5)
DEFINITION of hazard
  h(t) = rate of dying at time ‘t’ GIVEN that you have survived to time ‘t’
  Similar to asking the speed of your car given that you are two hours into a five-hour trip from Ottawa to Toronto
Slight detour and then back to main theme

61 Survival Curves (5)
Conditional probability
  h(t0) = rate of failing at ‘t0’ conditional on surviving to t0
  Requires the ‘conditional survival curve’, S*(t)
  Essentially, you are re-scaling S(t) so that S*(t0) = 1.0

62 [Figure: survival curve with S(t0) marked at time t0]

63 S*(t) = survival curve conditional on surviving to ‘t0’
CI*(t) = failure/death/cumulative incidence at ‘t’ conditional on surviving to ‘t0’
Hazard at t0 is defined as: ‘the slope of CI*(t) at t0’
  Hazard (instantaneous)
  Force of mortality
  Incidence rate
  Incidence density
  Range: 0 to ∞
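
The re-scaling and the slope definition can be written out as follows. This is a sketch using the standard definitions, with f(t) the density of event times as on the earlier slides; the formula images from the original slides are not available in this transcript.

```latex
\[
S^{*}(t) = \frac{S(t)}{S(t_0)} \quad (t \ge t_0), \qquad S^{*}(t_0) = 1
\]
\[
CI^{*}(t) = 1 - S^{*}(t)
\]
\[
h(t_0) = \left.\frac{d}{dt}\, CI^{*}(t)\right|_{t = t_0}
       = -\frac{S'(t_0)}{S(t_0)}
       = \frac{f(t_0)}{S(t_0)}
\]
```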

64 Some relationships
If the rate of disease is small: CI(t) ≈ H(t)
If we assume h(t) is constant (= ID): CI(t) ≈ ID*t
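
These approximations follow from the usual link between hazard and survival. The derivation below is a sketch that assumes H(t) denotes the cumulative hazard; the slide does not define it in the surviving text.

```latex
\[
H(t) = \int_0^t h(u)\,du, \qquad S(t) = \exp\{-H(t)\}
\]
\[
CI(t) = 1 - S(t) = 1 - \exp\{-H(t)\} \approx H(t)
\quad \text{when } H(t) \text{ is small, since } 1 - e^{-x} \approx x \text{ for small } x
\]
\[
h(t) = ID \ \text{(constant)} \;\Rightarrow\; H(t) = ID \cdot t
\;\Rightarrow\; CI(t) \approx ID \cdot t
\]
```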

65 Some survival functions (1)
Exponential
  h(t) = λ
  S(t) = exp(-λt)
Underlies most of the ‘standard’ epidemiological formulae.
Assumes that the hazard is constant over time
  Big assumption which is not usually true
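
To see how the exponential model behaves, a short data step can tabulate S(t), the exact cumulative incidence, and the approximation λt side by side. The value λ = 0.05 per year and the data set name are arbitrary choices for illustration.

```sas
data expsurv;
  lambda = 0.05;              /* hypothetical constant hazard, per year */
  do t = 0 to 10;
    surv   = exp(-lambda*t);  /* S(t) = exp(-lambda*t) */
    ci     = 1 - surv;        /* exact cumulative incidence */
    approx = lambda*t;        /* approximation CI(t) = ID*t */
    output;                   /* write one row per value of t */
  end;
run;

proc print data=expsurv noobs;
run;
```

Comparing the `ci` and `approx` columns shows the approximation drifting away from the exact value as t (and hence the cumulative hazard) grows.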

67 Some survival functions (2)
Weibull
  h(t) = λ γ t^{γ-1}
  S(t) = exp(-λ t^{γ})
Allows fitting a broader range of hazard functions
Assumes hazard is monotonic
  Always increasing (or decreasing)
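
The monotonic behaviour can be checked directly from the hazard formula. The short derivation below is a sketch assuming λ > 0 and γ > 0.

```latex
\[
h(t) = \lambda \gamma\, t^{\gamma - 1}
\quad\Rightarrow\quad
\frac{dh}{dt} = \lambda \gamma (\gamma - 1)\, t^{\gamma - 2}
\]
\[
\gamma > 1:\ \text{hazard always increasing}, \qquad
\gamma < 1:\ \text{hazard always decreasing}, \qquad
\gamma = 1:\ h(t) = \lambda \ \text{(exponential)}
\]
```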

69 Hazard curves (2)
[Figure]

70 Hazard curves (3)
[Figure]

71 Some survival functions (3)
All these functions assume that everyone eventually gets the outcome event.
That might not be true:
  Cures occur
  Immunity
Mixture models

72 Some survival functions (4)
Piece-wise exponential
  Divide follow-up into intervals
  The hazard is constant within interval but can differ across intervals (e.g. ‘0’ for cure)
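
A sketch of the survival function this implies, assuming cut-points 0 = τ_0 < τ_1 < τ_2 < ... and a constant hazard λ_j on the j-th interval (this notation is introduced here for illustration and does not appear on the slides):

```latex
\[
h(t) = \lambda_j \quad \text{for } \tau_{j-1} \le t < \tau_j
\]
\[
H(t) = \sum_{j < k} \lambda_j (\tau_j - \tau_{j-1}) + \lambda_k (t - \tau_{k-1})
\quad \text{for } \tau_{k-1} \le t < \tau_k,
\qquad
S(t) = \exp\{-H(t)\}
\]
```

If the hazard is 0 from some interval onward (the ‘cure’ case), H(t) stops growing and S(t) levels off at a non-zero value.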

74 Some survival functions (5)
Piece-wise exponential
  Divide follow-up into intervals
  The hazard is constant within interval but can differ across intervals (e.g. ‘0’ for cure)
Gompertz model
  Uses a functional form for S(t) which goes to a fixed, non-zero value after a finite time

75 Censoring (1)
So much for theory
In real world, we run into practical issues:
  May know that subject was disease-free up to time ‘t’ but then you lost track of them
  May only know subject got disease before time ‘t’
  May only know subject got disease between two exam dates.
  May know subject must have been outcome-free for the first ‘x’ years of follow-up (immortal person-time)
  Can’t measure time to infinite precision
    Often only know year of event
  Exact time of event might not even exist in theory

76 Censoring (2)
Three main kinds of censoring
Right censoring
  The time of the event is known to be later than some time
  Subject moves to Australia after three years of follow-up
    We only know that they died some time after 3 years.
Left censoring
  The time of the event is known to be before some time
  Looking at age of menarche, starting with a group of 12-year-old girls.
    Some girls are already menstruating

77 Censoring (3)
Three main kinds of censoring
Interval censoring
  The event is known to have occurred between two known times
  Annual HIV test
    Negative on Jan 1, 2012
    Positive on Jan 1, 2013

78 [Figure: time lines with events marked D]

79 Censoring (4)
Right censoring is most commonly considered
Type 1 censoring
  The censoring time is ‘fixed’ (under control of investigator)
  Singly censored
    Everyone has the same censoring time
    Commonly due to the study ending on a specific date

80 Censoring (5)
Right censoring is most commonly considered
Type 2 censoring
  Terminate study after a fixed number of events has happened
  Most common in lab studies
Random censoring
  Observation terminated for reason not under investigator’s control
  Varying reasons for drop-out
  Varying entry times

81 Censoring (6)
Right censoring is most commonly assumed
  At the end of their follow-up, subject has not had event.
Administrative censoring
Loss-to-follow-up
  A patient moves away or is lost without having experienced event of interest
Drop-out
  Patient dropped from study due to protocol violation, etc.
Competing risks
  Death occurs due to a competing event
We know something about these patients.
  Discarding them would ‘waste’ information

82 Study course for patients in cohort
[Figure: time axis marked 2001, 2003, 2013]

83 Censoring (7)
Standard analysis ignores method used to generate censoring.
  Type 1/2 methods work fine
  ‘Random’ censoring can be a problem.
Informative vs. uninformative censoring
  Standard analyses require ‘uninformative’ censoring
  The development of the outcome in subjects who are censored must be the same as in the subjects who remained in follow-up

84 Censoring (8)
Informative vs. uninformative censoring
RCT of new therapy with serious side effects.
  Patients on this Rx can tolerate side effects until near death.
  Then, they drop out.
  Mortality rate in this group will be 0 (/100,000)
Control therapy has no side-effects
  Patients do not drop out near death.
Strong bias
