EPI 5344: Survival Analysis in Epidemiology
Time varying covariates
March 24, 2015
Dr. N. Birkett, School of Epidemiology, Public Health & Preventive Medicine, University of Ottawa

Objectives
Introduce time varying covariates
Methods of inclusion into Cox models
SAS (computer) issues

Introduction (1)
Does heart transplantation improve survival?
  – Epidemiological study with incidence density (ID) measures
  – Observational study (not an RCT)

Introduction (2)
Assume that transplant has no effect on survival
  – Incidence density ratio (IDR) = 1.0
800 candidates for transplant
2 year follow-up
No losses
50% of people get a transplant
  – Always occurs on their first anniversary of entering study
25% of group die in first year
25% of first year survivors die in second year

Introduction (3)
Ignore transplant status

Introduction (4)
Stratify by transplant status: Transplant Done

Introduction (5)
Stratify by transplant status: NO Transplant Done

Introduction (6)
What is the observed IDR under this method of analysis?
Transplant ID = 0.133/yr
No transplant ID = 0.526/yr
IDR = 0.253
Correct IDR = 1.0
STRONG BIAS
Doing an RCT does NOT fix this issue as long as transplant is not done at time ‘0’
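The biased rates above can be reproduced with a few lines of arithmetic. This is only an illustrative sketch: it assumes that all 400 transplants go to people who survived the first year, and that each death contributes half a year of person-time to the year in which it occurs. Neither assumption is stated on the slides; they are used here because they reproduce the quoted figures.

```python
# Naive analysis: classify people by whether they EVER receive a transplant.
# Assumptions (not stated on the slides): deaths occur on average at mid-year,
# and every transplant recipient survived year 1 (transplant is done at 1 year).

def incidence_density(deaths, person_years):
    """Events divided by accumulated person-time."""
    return deaths / person_years

# "Transplant" group: 400 people, none of whom can die in year 1 by construction.
tx_deaths = 0.25 * 400                          # 100 deaths, all in year 2
tx_py = 400 * 1.0 + (300 * 1.0 + 100 * 0.5)     # year 1 + year 2 = 750

# "No transplant" group: 400 people, including all 200 first-year deaths.
no_deaths = 200 + 0.25 * 200                    # 200 in year 1, 50 in year 2
no_py = (200 * 1.0 + 200 * 0.5) + (150 * 1.0 + 50 * 0.5)   # 300 + 175 = 475

naive_tx_id = incidence_density(tx_deaths, tx_py)
naive_no_id = incidence_density(no_deaths, no_py)
naive_idr = naive_tx_id / naive_no_id

print(round(naive_tx_id, 3), round(naive_no_id, 3), round(naive_idr, 3))
```

The first year of follow-up in the transplant group is person-time during which death was impossible, which is what drives the ratio away from 1.0.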

Introduction (7)
How do we fix this?
  – No-one is at risk of dying with a transplant until the transplant has taken place
Solution using epi methods:
  – People who never have a transplant
  – People who have a transplant
      Accumulate person-time (PT) and events to the non-transplant group until the transplant occurs
      Accumulate PT and events to the transplant group only after the transplant occurs

Introduction (8)
CORRECT WAY: No Transplant Done

Introduction (9)
CORRECT WAY: Transplant Done

Introduction (10)
What is the observed IDR under this method of analysis?
Transplant ID = 0.286/yr
No transplant ID = 0.286/yr
IDR = 1.0
Correct IDR = 1.0
TIME VARYING COVARIATE: Transplant status
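As a check on these figures, the sketch below allocates person-time by current transplant status. It is an illustration only and assumes, as the slides do not state explicitly, that deaths occur on average at mid-year and that all 400 transplants are done at exactly one year among first-year survivors.

```python
# Correct analysis: person-time follows CURRENT transplant status.
# Assumptions (not stated on the slides): deaths at mid-year on average,
# transplants done at exactly 1 year among the 600 first-year survivors.

# Untransplanted person-time: all 800 people in year 1,
# plus the 200 survivors who never get a transplant in year 2.
no_deaths = 200 + 50
no_py = (600 * 1.0 + 200 * 0.5) + (150 * 1.0 + 50 * 0.5)   # 700 + 175 = 875

# Transplanted person-time: starts only at the transplant (year 2).
tx_deaths = 100
tx_py = 300 * 1.0 + 100 * 0.5                               # 350

correct_no_id = no_deaths / no_py
correct_tx_id = tx_deaths / tx_py
correct_idr = correct_tx_id / correct_no_id

print(round(correct_tx_id, 3), round(correct_no_id, 3), round(correct_idr, 3))
```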

Time Varying Covariates (1)
Exposures can change during follow-up
  – People stop/start smoking
  – BP increases
  – Air pollution varies from year to year
Hazard often depends more strongly on recent values than original exposure
  – Not always true
  – Can depend on cumulative exposure
Lagged exposure

Time Varying Covariates (2)
Produces non-proportional hazards
  – Change in exposure level causes hazard to change in one group
Still proportional conditional on value of time varying exposure.

Time Varying Covariates (3)
Before t*, HR = 1.0
After t*, HR* < 1.0
NOT PH over all time
If we ignore the time of exposure and just treat these as two groups with PH, we get a biased estimate of the hazard ratio
  – A type of average of 1.0 and HR*, so the estimate is larger than HR*

Time Varying Covariates (4)
BUT:
  before t*, hazards are proportional
  after t*, hazards are proportional
The true impact of the exposure is HR* and only occurs after t*
Need an analysis approach to reflect this

Time Varying Covariates (5)
Is this hard to do?
  – YES and NO
Consider a situation where all subjects start off as ‘unexposed’ but at some time in the future, some people become exposed

Time Varying Covariates (6)
Standard Cox Model
Time Varying Cox Model
Only change: the covariate x becomes a function of time, x(t)
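The two model formulas on this slide were not captured in the transcript. For reference, the usual textbook forms are given below; the notation (baseline hazard h_0(t), coefficient β, a single covariate x) is an assumption and may differ from the symbols used on the original slide.

```latex
\begin{align*}
\text{Standard Cox model:} \quad & h(t \mid x) = h_0(t)\,\exp(\beta x) \\
\text{Time varying Cox model:} \quad & h(t \mid x(t)) = h_0(t)\,\exp\bigl(\beta\, x(t)\bigr)
\end{align*}
```

The coefficient β and the baseline hazard have the same form in both lines; only the covariate changes from a fixed value to its value at time t.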

Time Varying Covariates (7)
The theory really is this simple!
WHY?
RISK SETS

Time Varying Covariates (8)
Likelihood function for Cox model is computed at each time point when an event occurs
  – Depends only on subjects “at risk” at the event time
  – RISK SET
x_ij is the value of ‘x’ AT THE TIME of this event

Time Varying Covariates (9)
Fixed covariates:
  x_ij is the same at all times
Time varying covariates:
  Use the x_ij which corresponds to the event time of this risk set
  Keep doing this over all risk sets
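The likelihood formula itself was not captured in the transcript. A standard way to write the Cox partial likelihood that matches the description above is sketched here. The indexing is an assumption: i runs over the k distinct event times t_1 < ... < t_k, R_i is the risk set at t_i, d(i) is the subject who has the event at t_i, and x_{ij} is the covariate value of subject j at time t_i. No tied event times are assumed.

```latex
\[
L(\beta) \;=\; \prod_{i=1}^{k}
  \frac{\exp\bigl(\beta\, x_{i,d(i)}\bigr)}
       {\sum_{j \in R_i} \exp\bigl(\beta\, x_{ij}\bigr)},
\qquad
x_{ij} =
\begin{cases}
  x_j & \text{fixed covariate} \\
  x_j(t_i) & \text{time varying covariate}
\end{cases}
\]
```

Each factor uses only the covariate values in force at that event time, which is why nothing else in the likelihood has to change.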

Time Varying Covariates (10)
So why isn’t it simple to do this?
Practical issues intrude!
To fit a time varying covariate, SAS needs to know the value of the covariate for every risk set.
  – Need to compute a value of the covariate at the time of every event.
Interpretation is also tricky (later)

Time Varying Covariates (11)
Example
  – 4 subjects
  – 2 get transplant at t = 15 & t = 25
  – Want to include a time-varying covariate for transplant status.

ID  Outcome  Time of event  Transplant  Time of transplant
1   dead     10             N           .
2   dead     20             Y           15
3   dead     30             N           .
4   dead     40             Y           25

4 risk sets at t = 10, 20, 30, & 40

Time Varying Covariates (12)

Risk set  IDs         X_trans
10        1, 2, 3, 4  0, 0, 0, 0
20        2, 3, 4     1, 0, 0
30        3, 4        0, 1
40        4           1
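The risk-set table for the four-subject example can be generated mechanically. The sketch below is an illustration in Python rather than SAS; the function name `risk_sets` and the tuple layout are hypothetical choices made for this example.

```python
# Four-subject example: (id, event_time, transplant_time); None = no transplant.
subjects = [(1, 10, None), (2, 20, 15), (3, 30, None), (4, 40, 25)]

def risk_sets(subjects):
    """Return {event_time: [(id, x_trans), ...]} for every event time.

    x_trans is 1 if the subject's transplant happened on or before the
    risk set time, otherwise 0. All four subjects die, so every follow-up
    time is an event time.
    """
    result = {}
    for t in sorted(event_time for _, event_time, _ in subjects):
        members = []
        for sid, event_time, tx_time in subjects:
            if event_time >= t:                      # still at risk at time t
                x_trans = int(tx_time is not None and tx_time <= t)
                members.append((sid, x_trans))
        result[t] = members
    return result

for t, members in risk_sets(subjects).items():
    print(t, members)
```

Subject 2 is coded 1 from the risk set at t = 20 onward, while subject 4 is still coded 0 at t = 20 and switches to 1 at t = 30.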

Time Varying Covariates (13)
Two ways to do this in SAS:
  – Use programming statements in ‘Proc Phreg’.
  – Re-structure the data set and use a different method of describing the model to SAS
      Counting Process Input.
Other programmes have similar options and choices

Time Varying Covariates (14)
We’ll look at both ways.
  – Some things can only be done in the Phreg programming approach
  – Counting Process input has some strong benefits.
  – Counting process approach can be tricky to use with age as the time scale

Proc Phreg programming (1)
SAS lets you include programme statements within PROC PHREG:

    proc phreg data=njb1;
      model surv*vs(0)=age sex x1;
      /* x1 is re-computed for each subject at every risk set */
      if (surv > 20) then x1 = 2; else x1 = 1;
    run;

Proc Phreg programming (2)
This code is processed once for each risk set
‘surv’ is the time when the risk set occurs
  – It is NOT the survival time for the subject
‘x1’ is the value of the variable in the subject at the time of the specific risk set under consideration.
  – Here, it is ‘1’ if the risk set occurs at or before time 20 but ‘2’ otherwise
File can get VERY BIG
Hard to de-bug your code
  – But, SAS 9.4 allows ‘out’ statements to be used

Stanford Heart Transplant Study

Standard phreg analysis.
Defines the ‘transplant’ status in the ‘data step’ using code like this:

    data njb1;
      set stanford;
      /* dot = date of transplant; missing means never transplanted */
      if (dot = .) then trans = 0; else trans = 1;
    run;

    proc phreg data=njb1;
      model time*cens(0)=trans;
    run;

Trans=1 means the subject:
  a) Had a transplant
  b) Lived long enough to have a transplant

Hazard curves look something like this (curves: Transplant, No Transplant; marked point: Transplant time).
In the interval before the transplant time, HR = 0, so the overall HR is biased

Stanford Heart Transplant Study: with time varying effect

ID  Surv1  Dead  Wait
1   49     1     .
2   5      1     .
3   15     1     0
4   38     1     35
5   17     1     .
6   2      1     .
7   674    1     50

For each event time, we need to define the transplant variable for every subject still in the risk set
plant = 0  no transplant by risk set time
      = 1  transplant done on or before risk set time

Risk set time  IDs                  Wait times             plant
2              1, 2, 3, 4, 5, 6, 7  ., ., 0, 35, ., ., 50  0, 0, 1, 0, 0, 0, 0
5              1, 2, 3, 4, 5, 7     ., ., 0, 35, ., 50     0, 0, 1, 0, 0, 0
15             1, 3, 4, 5, 7        ., 0, 35, ., 50        0, 1, 0, 0, 0
17             1, 4, 5, 7           ., 35, ., 50           0, 0, 0, 0
38             1, 4, 7              ., 35, 50              0, 1, 0
49             1, 7                 ., 50                  0, 0
674            7                    50                     1

SAS Code to create ‘plant’ and run analysis

    proc phreg data=stan;
      model surv1*dead(0)=plant surg ageaccpt / ties=exact;
      /* surv1 here is the risk set time; plant = 1 once the transplant has happened */
      if (wait > surv1 or wait = .) then plant = 0; else plant = 1;
    run;

Counting Process Input (1)
Counting processes are a different way to look at survival
  – mathematically more powerful
  – essentially, each subject follows a ‘process’
      ‘count up’ the events they experience
      can handle recurrent events
      enhances modeling of exposure.
Don’t need to know all this to use SAS counting process style input.

Counting Process Input (2)
Data set needs to be restructured.
To date
  – one record per subject
  – To code covariate changes, need multiple variables
      value at baseline (v1)
      time of first change (t1) and new value (v2)
      and so on
  – Need to use ‘phreg’ programming to define value at risk set.

Counting Process Input (3)
New approach
  – Similar to piece-wise exponential model
  – Split data for each subject into multiple records
      Define intervals where every covariate is constant
        – [t1, t2)
      Each interval has one line (record) of data
  – Intervals continue until:
      Subject censored
      Subject has outcome event.

Counting Process Input (4)
Need to re-structure data file
Each interval needs a record in the data set
Need to code
  Start of this interval
  End of this interval
  Outcome status at end of interval
  Value of time varying covariate(s) during the interval
  Values of fixed covariates, etc.

Counting Process Input (5)
Let’s use data from the Stanford Heart Transplant Study
  the same data as before.
But, we only include transplant status
  Ignore other variables for now.
Only have one time varying covariate.

Original data

ID  Surv1  Dead  Wait
1   49     1     .
2   5      1     .
3   15     1     0
4   38     1     35
5   17     1     .
6   2      1     .
7   674    1     50

Re-structured data

ID  Start  Stop  Status  plant
1   0      49    1       0
2   0      5     1       0
3   0      .1    0       0
3   .1     15    1       1
4   0      35    0       0
4   35     38    1       1
5   0      17    1       0
6   0      2     1       0
7   0      50    0       0
7   50     674   1       1

SAS Code to re-structure data

Basic version:

    DATA stanlong;
      SET allison.stan;
      plant=0; start=0;
      IF (trans=0) THEN DO;
        /* never transplanted: one record covering all follow-up */
        dead2=dead; stop=surv1;
        OUTPUT;
      END;
      ELSE DO;
        /* record 1: waiting period, no transplant, no event */
        stop=wait; dead2=0;
        OUTPUT;
        /* record 2: from transplant to end of follow-up */
        plant=1; start=wait; stop=surv1; dead2=dead;
        OUTPUT;
      END;
    RUN;

Version that replaces a stop time of 0 (zero-length interval) with 0.1:

    DATA stanlong;
      SET allison.stan;
      plant=0; start=0;
      IF (trans=0) THEN DO;
        dead2=dead; stop=surv1;
        IF (stop=0) THEN stop=.1;
        OUTPUT;
      END;
      ELSE DO;
        stop=wait;
        IF (stop=0) THEN stop=.1;
        dead2=0;
        OUTPUT;
        plant=1; start=wait;
        IF (stop=.1) THEN start=.1;
        stop=surv1; dead2=dead;
        OUTPUT;
      END;
    RUN;
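To see what the restructuring does to the seven example subjects, the same logic can be traced in a few lines of Python. This is a sketch for illustration, not a replacement for the SAS data step; the function name `to_counting_process` and the `eps` parameter (standing in for the 0.1 adjustment) are hypothetical.

```python
# Seven example subjects: (id, surv1, dead, wait); wait=None means no transplant.
original = [
    (1, 49, 1, None), (2, 5, 1, None), (3, 15, 1, 0), (4, 38, 1, 35),
    (5, 17, 1, None), (6, 2, 1, None), (7, 674, 1, 50),
]

def to_counting_process(rows, eps=0.1):
    """Split each subject into (id, start, stop, status, plant) records.

    A stop time of 0 is replaced by eps so that no interval has zero length.
    """
    records = []
    for sid, surv1, dead, wait in rows:
        if wait is None:
            # Never transplanted: a single record covering all follow-up.
            stop = surv1 if surv1 > 0 else eps
            records.append((sid, 0, stop, dead, 0))
        else:
            # Waiting period (plant=0, no event), then post-transplant period.
            split = wait if wait > 0 else eps
            records.append((sid, 0, split, 0, 0))
            records.append((sid, split, surv1, dead, 1))
    return records

records = to_counting_process(original)
for rec in records:
    print(rec)
```

Subjects 3, 4 and 7 each produce two records, giving ten records in total; subject 3, transplanted at time 0, gets a short first interval from 0 to 0.1.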

SAS Code for counting-process input analysis

    PROC PHREG DATA=stanlong;
      MODEL (start,stop)*dead2(0)=plant surg ageaccpt / TIES=EFRON;
    RUN;

Identical to previous time-varying analysis

Time Varying Covariates (15)
Types of time varying covariates
Internal (endogenous)
  – Change in the covariate is related to the behaviour of the subject.
  – Measurement requires subject to be under periodic examination
      Blood pressure
      Cholesterol
      Smoking
  – More challenging for analysis
      Often part of causal pathway

Time Varying Covariates (16)
External (exogenous)
  – Variables which vary independently of the subject’s normal biological processes.
  – The values do not depend on subject-specific information
  – Measurement does not require subject monitoring
      Hourly pollen count

Time Varying Covariates (17)
Some pattern types
  – Non-reversible dichotomy
      Transplant
  – Reversible dichotomy
      Smoking
      Drug use
  – Continuous variable
      Cholesterol

Time Varying Covariates (18)
Some issues
  – Need for valid measures for all subjects at all follow-up times
      Missing data
      ‘coarse’ measurement intervals
      Imputation
      Interpolation
  – Computationally intense
Reverse causation effects
Intermediate variables in the causal pathway

Time Varying Covariates (19)
Some logical fallacies
Cannot use the future to predict the future!
Example #1
  – Recruit a cohort of neonates
      Age at entry = 0 for all subjects
  – Not useful as a predictor
  – Suggestion is made to use average age during follow-up to predict outcome
  – INVALID
      Average age during follow-up depends on ‘future’ information
      High average age is due to long survival

Time Varying Covariates (20)
Intermediaries (Internal covariates)
RCT of anti-hypertensive treatment
Outcome: time to stroke
Main Q: Does the drug reduce the rate of stroke?
Model 1: ln(HR) = β1 (drug)
BUT, we measured BP on all subjects during follow-up.
  – Why not include this as a time-varying covariate?

Time Varying Covariates (21)
Intermediaries (cont)
Model 1: ln(HR) = β1 (drug)
Model 2: ln(HR) = β1* (drug) + β2 BP(t)
Results
  Model 1 β1: p < 0.001
  Model 2 β1*: p = 0.6
WHY?

Time Varying Covariates (22)
Drug → drop in BP → drop in stroke risk
Effect of drug on stroke is already accounted for in the BP term
Estimate from model of ‘drug’ effect is the effect of the drug after adjusting for changes in BP
That is, after adjusting for the drug effect itself.

SAS examples (1)
Study of prisoners released from jail
  – One year follow-up
  – Monitor every week
      If subject was re-arrested, record the week of the arrest
      Recidivated
  – Key question
      Does financial security post-release reduce risk of recidivism?

SAS examples (2)
Study also collected information about employment status for every week of follow-up after release
Time varying covariate
Hypothesis
  – Being in full-time employment reduces the risk of recidivism.

Data layout for employment information

ID  EMP1  EMP2  EMP3  …  EMP52
1   1     1     0     …  0
2   0     0     0     …  1
3   1     0     0     …  0
… and so on

    PROC PHREG DATA=allison.recid;
      MODEL week*arrest(0)=fin age race wexp mar paro prio employed / TIES=EFRON;
      ARRAY emp(*) emp1-emp52;
      /* week is the risk set time: pick that week's employment status */
      employed=emp[week];
    RUN;

BUT: if you get arrested in week 10, you can’t work full-time in week 10
REVERSE CAUSATION
Lagged exposure

    title 'Single week lag';
    PROC PHREG data=allison.recid;
      WHERE week>1;   /* week 1 has no previous week to look up */
      MODEL week*arrest(0)=fin age race wexp mar paro prio employed / TIES=EFRON;
      ARRAY emp(*) emp1-emp52;
      employed=emp[week-1];
    RUN;
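The difference between the same-week and the lagged lookup is easiest to see on a single subject. The sketch below uses Python for illustration; the function `employed_at` and the employment history are hypothetical and are not taken from the recidivism data.

```python
def employed_at(emp, week, lag=0):
    """Employment status used for a risk set occurring in `week`.

    emp[0] holds week 1 (EMP1) and emp[51] holds week 52 (EMP52).
    lag=0 uses the same week; lag=1 uses the week before.
    """
    source_week = week - lag
    if not 1 <= source_week <= len(emp):
        raise ValueError("no employment record for week %d" % source_week)
    return emp[source_week - 1]

# Hypothetical subject: employed in weeks 1-9, not employed from week 10 on
# (for example, because of an arrest in week 10).
emp = [1] * 9 + [0] * 43

same_week = employed_at(emp, 10)         # status recorded for week 10
lagged = employed_at(emp, 10, lag=1)     # status recorded for week 9
print(same_week, lagged)
```

For a risk set in week 10, the same-week value reflects the arrest itself, while the lagged value describes the subject's status before it.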

SAS examples (3)
Allison looks at some other models
  – Other lag intervals
  – cumulative work experience
Worth reviewing for code examples and interpretation

SAS examples (4)
Albumin and death
  – Question: Does a falling serum albumin predict an increased likelihood of death?

SAS examples (5)
Albumin measured on the first day of each month
  – Ad-hoc measurement
  – Not available on every day of the month
Cannot use ‘average’ albumin around death date
  – No post-death value
Use ‘closest’ value before risk set date

    DATA bloodcount;
      INFILE 'c:\blood.dat';
      INPUT deathday status alb1-alb12;
      ARRAY alb(*) alb1-alb12;
      status2=0;
      deathmon=CEIL(deathday/30.4);      /* month in which follow-up ends */
      DO j=1 TO deathmon;
        start=(j-1)*30.4;
        stop=start+30.4;
        albumin=alb(j);
        IF (j=deathmon) THEN DO;
          status2=status;
          stop=deathday;                 /* last interval ends on the death day, so stop > start */
        END;
        OUTPUT;
      END;
    RUN;

    PROC PHREG DATA=bloodcount;
      MODEL (start,stop)*status2(0)=albumin;
    RUN;

Uses counting process style input
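The monthly splitting can be traced on one subject to check that every interval is well formed. The Python sketch below is illustrative only; the function name `monthly_records` and the albumin values are hypothetical. It ends the final interval on the death day itself so that start < stop holds for every record, which counting-process input requires.

```python
import math

def monthly_records(deathday, status, albumin, month_len=30.4):
    """Split follow-up into monthly (start, stop, status2, albumin) records.

    albumin[0] is the value measured at the start of month 1, and so on.
    The last record stops at deathday and carries the subject's status.
    """
    deathmon = math.ceil(deathday / month_len)   # month in which follow-up ends
    records = []
    for j in range(1, deathmon + 1):
        start = (j - 1) * month_len
        stop = start + month_len
        status2 = 0
        if j == deathmon:
            stop = deathday
            status2 = status
        records.append((start, stop, status2, albumin[j - 1]))
    return records

# Hypothetical subject who dies on day 70 with three monthly albumin values.
records = monthly_records(70, 1, [4.0, 3.5, 3.1])
for rec in records:
    print(rec)
```

Day 70 falls in the third month, so the subject contributes three records and only the last one carries the event.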

SAS examples (6)
Alcoholic cirrhosis and survival
  – Prothrombin time (a measure of blood clotting) is hypothesized as a predictor of survival
  – Cohort of men were followed up
  – Lab measures were taken at ‘clinically relevant’ times
      No pattern to the times
      Varied for each subject

    DATA alcocount;
      SET allison.alco;
      time1=0;       /* first measurement is at baseline */
      time11=.;      /* sentinel so that t(j+1) exists when j=10 */
      ARRAY t(*) time1-time11;
      ARRAY p(*) pt1-pt10;
      dead2=0;
      DO j=1 TO 10 WHILE (t(j) NE .);
        start=t(j);
        pt=p(j);
        stop=t(j+1);
        IF (t(j+1)=.) THEN DO;
          /* last measurement: interval runs to the end of follow-up */
          stop=surv;
          dead2=dead;
        END;
        OUTPUT;
      END;
    RUN;

    PROC PHREG DATA=alcocount;
      MODEL (start,stop)*dead2(0)=pt;
    RUN;

Uses counting process style input
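With irregular measurement times, each measurement opens an interval that lasts until the next measurement or the end of follow-up. The Python sketch below illustrates that rule for one subject; the function name `irregular_records` and the measurement values are hypothetical.

```python
def irregular_records(times, values, surv, dead):
    """Build (start, stop, dead2, pt) records from irregular measurements.

    times[k] is when values[k] was measured, with times[0] == 0 (baseline).
    Each value applies from its own measurement time until the next one;
    the last value applies until the end of follow-up (surv).
    """
    records = []
    for k, (start, value) in enumerate(zip(times, values)):
        is_last = (k == len(times) - 1)
        stop = surv if is_last else times[k + 1]
        dead2 = dead if is_last else 0
        records.append((start, stop, dead2, value))
    return records

# Hypothetical subject: prothrombin time measured at days 0, 90 and 200,
# death at day 260.
records = irregular_records([0, 90, 200], [12.0, 14.5, 17.0], surv=260, dead=1)
for rec in records:
    print(rec)
```

Only the final record can carry the event; earlier records end without an event at the moment the next measurement is taken.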
