The analysis of survival data in nephrology. Basic concepts and methods of Cox regression
Paul C. van Dijk 1,2, Kitty J. Jager 1, Aeilko H. Zwinderman 2, Carmine Zoccali 3, Friedo W. Dekker 4
1 ERA–EDTA Registry, Department of Medical Informatics, Academic Medical Center, Amsterdam, The Netherlands
2 Department of Clinical Epidemiology, Biostatistics and Bio-informatics, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
3 CNR–IBIM Clinical Epidemiology and Pathophysiology of Renal Diseases and Hypertension, Renal and Transplantation Unit, Ospedali Riuniti, Reggio Cal., Italy
4 Department of Clinical Epidemiology, Leiden University Medical Centre, Leiden, The Netherlands
Kidney International: ABC on epidemiology
Survival Analysis: Comparing groups
Question: How much does the survival of one group differ from the survival of another group, and which techniques can be used to answer this question?
Answer: The Kaplan-Meier (KM) method can be used to give a crude description of the survival of both groups. Differences in survival can be reported by comparing survival probabilities or median survival times, and a p-value for the difference can be estimated with the log rank test.
Question: But can we obtain an effect size (a relative risk, RR) for the survival difference? And is it possible to account for confounders?
Answer: Not with the KM technique. To estimate the effect size and to account for confounders, a regression technique for survival data is needed. The Cox proportional hazards model (or Cox regression) is a very popular regression technique for the analysis of survival data.
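The crude Kaplan-Meier description mentioned on this slide can be illustrated with a few lines of Python. This is only a minimal sketch: the function name `kaplan_meier` and the follow-up data are hypothetical and are not taken from the presentation, and the log rank test is not included.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time per patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up data (years) for two small groups.
group_a = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
group_b = kaplan_meier([4, 6, 7, 9, 10], [1, 0, 1, 0, 0])
for label, curve in (("group A", group_a), ("group B", group_b)):
    print(label, [(t, round(s, 2)) for t, s in curve])
```

Comparing the two printed curves gives survival probabilities per group, but no single effect size, which is the limitation the slide points out.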
The Regression Family
For the analysis of data with a continuous outcome variable we use linear regression. With a dichotomous (binary) outcome variable we use logistic regression. Where the time to an event is the outcome of interest, Cox regression is the most popular regression technique.
The Incidence Rate
The quantity that is being modelled in Cox regression is the incidence rate (synonym: hazard rate) of an event. The incidence rate is the number of subjects developing the disease (or other health outcome) divided by the total time at risk for the disease. Taken over a very short time interval, it is the instantaneous risk of experiencing the event of interest. The incidence rate of an event expresses the speed at which the event occurs. The incidence rate of death is sometimes called the force of mortality.
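As a small worked example of this definition, the sketch below divides the number of events by the total time at risk. The cohort data and the function name `incidence_rate` are hypothetical and serve only as an illustration.

```python
def incidence_rate(n_events, time_at_risk):
    """Events per unit of time at risk (e.g. deaths per person-year)."""
    if time_at_risk <= 0:
        raise ValueError("time at risk must be positive")
    return n_events / time_at_risk


# Hypothetical cohort: (years at risk, whether the patient died).
follow_up = [(2.0, True), (5.0, False), (1.5, True), (4.0, False), (3.5, True)]

deaths = sum(1 for _, died in follow_up if died)
person_years = sum(years for years, _ in follow_up)
rate = incidence_rate(deaths, person_years)
print(f"{deaths} deaths / {person_years} person-years = {rate} per person-year")
```

Note that censored patients contribute time at risk to the denominator even though they add no events to the numerator.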
The Cox Regression Equation
The equation for a basic Cox regression model is: $\ln \text{incidence}_i(t) = \beta_0(t) + \beta_1 x_i$
Exponentiating both sides shows that the incidence rate for individual i at time t is the product of two quantities: $e^{\beta_0(t)}$, the baseline hazard function (where $\beta_0(t)$ can be interpreted as a sort of time-dependent intercept), and $e^{\beta_1 x_i}$, the exponentiated linear function of the covariate(s).
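The relation between the additive log form and the multiplicative form of the model can be checked numerically. In the sketch below the baseline function `baseline_log_hazard` and the coefficient value are made-up assumptions chosen purely for illustration; in a real Cox analysis the baseline is left unspecified.

```python
import math


def baseline_log_hazard(t):
    """Hypothetical beta_0(t): log baseline incidence rate, declining over time."""
    return math.log(0.20) - 0.10 * t


def cox_hazard(t, x, beta_1):
    """Incidence rate at time t: baseline hazard times exp(beta_1 * x)."""
    return math.exp(baseline_log_hazard(t)) * math.exp(beta_1 * x)


beta_1 = 0.5  # hypothetical coefficient, not an estimate from the presentation
for t in (0.0, 1.0, 2.0):
    log_form = baseline_log_hazard(t) + beta_1 * 1  # ln incidence for x = 1
    product_form = cox_hazard(t, 1, beta_1)
    # Both forms describe the same incidence rate.
    assert math.isclose(math.log(product_form), log_form)
    print(t, round(product_form, 4))
```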
Example: Comparing the survival of patients with and patients without diabetes
Consider this example, in which Cox regression estimates the difference in the incidence rate of death between diabetics and non-diabetics.
Example 1: Renal replacement therapy (RRT) for diabetic end-stage renal disease. Incident dialysis patients in the ERA-EDTA Registry were included in an analysis of patient survival on dialysis by diabetic status [8]. As in most survival studies, patients were recruited over a period of time (the inclusion period) and were observed up to a specific date (31 December, the end of the follow-up period). The death of the patient was the event studied. Patients who received a transplant or recovered renal function were treated as censored observations. For all patients the covariates age and diabetes were collected at the start of RRT.
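To make the data layout of such a study concrete, the sketch below fits a Cox model with a single binary covariate by maximising the partial likelihood with Newton-Raphson steps. It is a toy illustration under stated assumptions: the eight patients are invented, there are no tied event times, and the function names `partial_log_likelihood` and `fit_cox` are hypothetical. It is not the analysis performed on the registry data, for which established statistical software would be used.

```python
import math


def partial_log_likelihood(beta, times, events, x):
    """Cox partial log-likelihood for one covariate (assumes no tied event times)."""
    total = 0.0
    for t_i, d_i, x_i in zip(times, events, x):
        if d_i:
            # Risk set: everyone still under observation at time t_i.
            risk_sum = sum(math.exp(beta * x_j)
                           for t_j, x_j in zip(times, x) if t_j >= t_i)
            total += beta * x_i - math.log(risk_sum)
    return total


def fit_cox(times, events, x, max_iter=50, tol=1e-10):
    """Estimate beta by Newton-Raphson on the partial likelihood."""
    beta = 0.0
    for _ in range(max_iter):
        score = 0.0
        information = 0.0
        for t_i, d_i, x_i in zip(times, events, x):
            if not d_i:
                continue  # censored patients only contribute to risk sets
            s0 = s1 = s2 = 0.0
            for t_j, x_j in zip(times, x):
                if t_j >= t_i:
                    w = math.exp(beta * x_j)
                    s0 += w
                    s1 += w * x_j
                    s2 += w * x_j * x_j
            mean_x = s1 / s0
            score += x_i - mean_x
            information += s2 / s0 - mean_x ** 2
        step = score / information
        beta += step
        if abs(step) < tol:
            break
    return beta


# Invented data: years on dialysis, death (1) or censored (0), diabetes (1/0).
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 1, 0, 1, 0]
diabetes = [1, 1, 0, 1, 0, 1, 0, 0]

beta_hat = fit_cox(times, events, diabetes)
print("beta:", round(beta_hat, 3), "hazard ratio:", round(math.exp(beta_hat), 2))
```

Censored patients (event = 0), such as those who received a transplant, stay in the risk sets until their censoring time but never contribute an event term.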
Example: Comparing the survival of patients with and patients without diabetes
[Figures: comparison of the cumulative survival curves; comparison of the ln incidence rates]
Example: Comparing the survival of patients with and patients without diabetes
[Figure: time-varying hazard, with curves labelled $\beta_0(t)$ and $\beta_0(t) + \beta_1 x$ and the vertical distance between them labelled $\beta_1$]
$\ln \text{incidence}(t) = \beta_0(t) + \beta_1 x$
The graph shows that the incidence rate varies over time. If we compare the logarithm of the incidence rate of diabetics and non-diabetics, the vertical distance ($\beta_1$) is the additional risk for patients with diabetes: $\ln[\text{inc}_{\text{diab}}] - \ln[\text{inc}_{\text{nondiab}}] = \beta_1$
How can we convert this vertical distance $\beta_1$ into an RR type of ratio? See next slide.
From Beta to Hazard Ratio
If both sides of the equation $\ln[\text{inc}_1] - \ln[\text{inc}_0] = \beta_1$ (where 1 = diab, 0 = nondiab) are exponentiated, $e^{\ln[\text{inc}_1] - \ln[\text{inc}_0]} = e^{\beta_1}$, we obtain the ratio of the incidence rates of both groups: $[\text{inc}_1] / [\text{inc}_0] = e^{\beta_1}$, which is the hazard ratio (HR).
The hazard ratio (HR) quantifies the impact of diabetes on outcome and can be interpreted as a relative risk type of ratio.
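The conversion from beta to hazard ratio is a single exponentiation, so the direction of an effect can be read directly from the sign of beta. The short sketch below uses made-up beta values and a hypothetical helper `hazard_ratio` to illustrate this.

```python
import math


def hazard_ratio(beta):
    """Hazard ratio corresponding to a Cox regression coefficient."""
    return math.exp(beta)


# Illustrative beta values only, not estimates from the presentation.
for beta in (-0.5, 0.0, 0.5, math.log(2)):
    hr = hazard_ratio(beta)
    if hr > 1:
        direction = "higher incidence rate in group 1"
    elif hr < 1:
        direction = "lower incidence rate in group 1"
    else:
        direction = "no difference"
    print(f"beta = {beta:+.3f}  ->  HR = {hr:.2f}  ({direction})")
```

A beta of zero corresponds to an HR of 1, and a beta of ln 2 corresponds to a doubling of the incidence rate.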
The Proportionality Assumption
Cox regression assumes that the excess risk (the vertical distance between the log incidence rates of the two groups) is constant throughout follow-up time.
Question: How do you know whether non-proportionality is a problem?
Answer: Crossing survival curves are often a strong indication of non-proportionality. In addition, check whether the interaction of the covariate of interest with time is statistically significant.
Question: What if incidence rates are not proportional over time?
Answer: Then the hazard ratio cannot be estimated with the standard Cox proportional hazards model. It is better to consult a statistician to find a suitable solution.
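One informal way to get a feeling for proportionality is to compare the log rate ratio between the groups in separate follow-up intervals. The sketch below does this with invented event counts and person-years; the helper `log_rate_ratio` is hypothetical. It is a crude descriptive check, not the formal test of a covariate-by-time interaction that the slide refers to.

```python
import math


def log_rate_ratio(events_1, time_1, events_0, time_0):
    """Log of the incidence rate ratio (group 1 versus group 0)."""
    return math.log((events_1 / time_1) / (events_0 / time_0))


# Invented data per interval: (deaths, person-years) for each group.
intervals = {
    "year 0-1": {"diab": (30, 200.0), "nondiab": (40, 450.0)},
    "year 1-3": {"diab": (45, 300.0), "nondiab": (70, 800.0)},
}

log_ratios = {}
for label, groups in intervals.items():
    d1, t1 = groups["diab"]
    d0, t0 = groups["nondiab"]
    log_ratios[label] = log_rate_ratio(d1, t1, d0, t0)
    print(label, "log rate ratio:", round(log_ratios[label], 3))

# Under proportional hazards the vertical distance should be roughly constant.
spread = max(log_ratios.values()) - min(log_ratios.values())
print("difference between intervals:", round(spread, 3))
```

A large difference between the intervals would suggest that the vertical distance is not constant, in which case the advice on the slide applies.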
Typical Output of a Cox Regression Model: The Diabetes Example
Estimates
Variable | Beta | Standard Error | P value | Hazard Ratio exp(Beta) | 95% confidence interval
Diabetes (yes/no) | 0.54 | | < | 1.71 |
Diabetes was coded as a binary variable (1 = yes, 0 = no). In this case the beta is positive (0.54), which means that the log of the incidence rate of death is higher in diabetic than in non-diabetic patients. Thus, the hazard ratio (HR) for diabetics compared to non-diabetics is $e^{0.54} \approx 1.71$. The 95% confidence interval for the hazard ratio is [ ].
Conclusion: the mortality of patients with diabetes is higher than that of patients without diabetes. Or should we account for known confounders (e.g. age) before drawing any conclusions?
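The hazard ratio and its confidence interval in this kind of output follow from beta and its standard error. The sketch below uses the beta of 0.54 from the slide, but the standard error is a HYPOTHETICAL placeholder because the actual value is not available in this transcript, so the printed interval is for illustration only. The usual Wald interval, exp(beta ± 1.96 × SE), is assumed.

```python
import math


def hr_with_ci(beta, se, z=1.96):
    """Hazard ratio with a Wald-type 95% confidence interval."""
    hr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return hr, lower, upper


beta = 0.54  # value reported on the slide
se = 0.05    # HYPOTHETICAL standard error, not taken from the presentation

hr, lower, upper = hr_with_ci(beta, se)
print(f"HR = {hr:.3f}, 95% CI [{lower:.2f}, {upper:.2f}] (interval is illustrative)")
```

Exponentiating 0.54 gives roughly 1.716, which agrees with the reported hazard ratio of 1.71 up to rounding of beta. If the interval excludes 1, the difference between the groups is statistically significant at the 5% level.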
Typical Output of a Cox Regression Model: The Diabetes Example, Adjustment for Age
Cox regression is the tool to account for confounding effects when performing survival analysis. The output of this multiple Cox regression model shows an effect of both age and diabetes.
Estimates
Variable | Beta | Standard Error | P value | Hazard Ratio exp(Beta) | 95% confidence interval
Age (continuous) | | | < | |
Diabetes (yes/no) | | | < | |
The hazard ratio for diabetes increased from [ ] (see previous slide) to [ ]. This change shows that, after accounting for the confounding effect of age, the impact of diabetes on survival is even stronger. Our previous conclusion, that the mortality of patients with diabetes is higher than that of patients without diabetes, is still valid. Moreover, age is indeed an important confounder in the diabetes-mortality relationship.
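In a model with several covariates, each beta is interpreted with the other covariates held constant, and hazard ratios for combined differences multiply. The sketch below illustrates this with HYPOTHETICAL coefficients for age and diabetes; they are not the estimates from the presentation, which are not available in this transcript.

```python
import math


def hazard_ratio(betas, covariate_differences):
    """HR between two patients whose covariates differ by the given amounts."""
    linear_predictor = sum(b * d for b, d in zip(betas, covariate_differences))
    return math.exp(linear_predictor)


beta_age = 0.05   # HYPOTHETICAL: per year of age
beta_diab = 0.60  # HYPOTHETICAL: diabetes yes versus no
betas = (beta_age, beta_diab)

hr_diab = hazard_ratio(betas, (0, 1))    # same age, diabetes versus none
hr_age10 = hazard_ratio(betas, (10, 0))  # 10 years older, same diabetic status
hr_both = hazard_ratio(betas, (10, 1))   # 10 years older and diabetic

print("diabetes, age held constant:", round(hr_diab, 2))
print("10 years older, same diabetic status:", round(hr_age10, 2))
print("both differences combined:", round(hr_both, 2))
```

The combined hazard ratio equals the product of the two separate hazard ratios because the model is additive on the log scale.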
In Summary
Cox regression is a powerful and popular regression technique to obtain an effect estimate and to study the impact of several risk factors on survival at the same time. This presentation described some basic properties and applications of the Cox regression model in the context of aetiological studies. In Cox regression, the proportionality assumption is important. If the survival curves cross, the Cox regression model with proportional hazards is inappropriate and extensions to the Cox model are needed. Further extensions of the Cox regression technique, for example the modelling of time-dependent effects, will be described in detail in the next issue of the ABC on epidemiology series.