On ranking in survival analysis: Bounds on the concordance index


On ranking in survival analysis: Bounds on the concordance index. Vikas C. Raykar, Harald Steck, Balaji Krishnapuram (CAD & Knowledge Solutions (IKM CKS), Siemens Medical Solutions USA, Inc., Malvern, USA); Cary Dehing-Oberije, Philippe Lambin (Maastro clinic, University Hospital Maastricht, University Maastricht-GROW, The Netherlands). NIPS 2007.

Organization: Motivation. Brief review of survival analysis. Concordance index. Our proposed ranking approach. Connections to survival analysis. Results.

Motivation: Personalized medicine. Predict the survival time of lung cancer patients from the kind of treatment (chemo/radiotherapy dosage) and from patient characteristics (age, gender, health). The dataset is available from MAASTRO hospital, our collaborator.

Why not use regression? The problem is not amenable to standard statistical or machine learning methods because the data are censored. It is well studied in statistics as survival analysis.

Review: Survival Analysis. Branch of statistics that deals with the time until the occurrence of an event: When did a patient die? When did the disease manifest? When did the machine fail? Widely used in medical statistics, epidemiology, reliability engineering, economics, sociology, marketing, insurance, etc.

What is censored data? Data are collected from the start of the study to the end of the study. Some patients die during the study period. At the end of the study a lot of patients may still survive. A patient may also move to a different town and thus be no longer available for follow-up. For such censored cases the exact survival time may be longer than the observation period. [Figure: patient timelines on a time axis labelled 2001 and 2005, running from the start to the end of the study, with markers for a death (Patient 1) and for a patient unavailable for follow-up.]

Censoring provides only partial information. Typically a large portion of the data is censored. [Figure: survival time, observed data versus censored data.]
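The censoring mechanism described on these slides can be sketched in a few lines. This is only an illustration under assumptions: the `Observation` class, the `observe` helper, and all numbers are hypothetical and do not come from the presentation. Each subject is stored as an observed time plus a flag saying whether the event (death) was actually seen.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    time: float   # observed time in years since the start of the study
    event: bool   # True if death was observed, False if the record is censored


def observe(true_time: float, dropout_time: Optional[float], study_end: float) -> Observation:
    """Turn a (hypothetical) true survival time into what the study records.

    Follow-up stops at the end of the study or when the patient becomes
    unavailable, whichever comes first.
    """
    cutoff = study_end if dropout_time is None else min(dropout_time, study_end)
    return Observation(time=min(true_time, cutoff), event=true_time <= cutoff)


# Hypothetical patients in a four-year study.
died_in_study = observe(true_time=1.5, dropout_time=None, study_end=4.0)
alive_at_end = observe(true_time=6.0, dropout_time=None, study_end=4.0)
moved_away = observe(true_time=3.0, dropout_time=2.0, study_end=4.0)
```

For the two censored patients the stored time is only a lower limit on the true survival time, which is the partial information the slide refers to.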

Notation: Survival analysis

Proportional Hazard (PH) Model: Has become a standard model for studying the effect of covariates on survival time distributions. The hazard is the product of a baseline hazard function and a relative hazard function of the covariates, which carries the unknown regression parameters. Parameter estimates for the PH model are obtained by maximizing Cox's partial likelihood.
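The slide's equation is not preserved in this transcript. For orientation, the standard textbook form of the PH model is written out below. The symbols are assumptions chosen here, not necessarily the authors' notation: $x$ is the covariate vector, $w$ the unknown regression parameters, $h_0$ the baseline hazard function.

```latex
h(t \mid x) \;=\; \underbrace{h_0(t)}_{\text{baseline hazard}} \;
\underbrace{\exp\!\left(w^{\top} x\right)}_{\text{relative hazard}}
```

Because $h_0(t)$ is shared by all subjects, the ratio of the hazards of two subjects does not depend on $t$, which is where the name "proportional hazard" comes from.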

Concordance Index or c-index: Standard performance measure for model assessment in survival analysis. Generalization of the area under the ROC curve to regression problems and censored data. It is the fraction of all pairs of subjects whose survival times can be ordered such that the subject with the higher predicted survival is the one who actually survived longer.

Concordance Index, no censoring. [Figure: survival time versus covariate for five subjects.] C = 1: perfect prediction accuracy. C = 0.5: as good as a random predictor.

Concordance Index, with censoring. [Figure: survival times of five subjects, some of them censored, with arrows between pairs that can be ordered.] No arrow can go above a censored point.
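A minimal sketch of how the c-index could be computed with and without censoring. It assumes the common convention that a pair (i, j) can be ordered only when subject i's event was observed and happened before subject j's observed time. The function name, argument names, and the half-credit rule for tied predictions are assumptions of this sketch, not taken from the slides.

```python
from itertools import permutations


def concordance_index(times, events, scores):
    """Fraction of orderable pairs that the predictions rank correctly.

    times:  observed times (event time or censoring time)
    events: True where the event was observed, False where censored
    scores: predicted survival, higher meaning longer predicted survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in permutations(range(len(times)), 2):
        # A censored subject i gives no evidence that j outlived i,
        # so such pairs are skipped.
        if events[i] and times[i] < times[j]:
            comparable += 1
            if scores[i] < scores[j]:
                concordant += 1.0
            elif scores[i] == scores[j]:
                concordant += 0.5  # assumed convention: ties count half
    if comparable == 0:
        raise ValueError("no pair of subjects can be ordered")
    return concordant / comparable


# No censoring: predictions in the same order as the survival times.
c_perfect = concordance_index([1, 2, 3, 4, 5], [True] * 5, [1, 2, 3, 4, 5])

# With censoring: the second subject is censored, so only the pairs that
# start at the first subject can be ordered.
c_censored = concordance_index([1, 2, 3], [True, False, True], [2, 1, 3])
```

With no censoring every pair with distinct times is orderable, so the function reduces to the plain pairwise ranking accuracy.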

Proposed approach: Maximize CI directly. While the CI is widely used to evaluate a learnt model, it is not generally used as an objective function for training. The CI is invariant to monotone transformations of the survival times. Hence the model learnt by maximizing the CI is a ranking function (an N-partite ranking problem).

Lower bounds on the CI: Maximizing the CI directly is a discrete optimization problem. Instead, use a differentiable concave lower bound, one of which is related to the PH model.
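The exact bounds shown on the slide are not preserved in this transcript. The sketch below checks numerically two differentiable concave functions that lower-bound the indicator 1[z > 0] appearing in each term of the CI: a log-sigmoid bound and an exponential bound. Treat these specific forms and the function names as assumptions made for illustration.

```python
import math


def indicator(z):
    """The 0/1 term that the CI sums over orderable pairs."""
    return 1.0 if z > 0 else 0.0


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    return -(max(-z, 0.0) + math.log1p(math.exp(-abs(z))))


def log_sigmoid_bound(z):
    """Assumed bound: 1 + log(sigmoid(z)) / log(2), equal to 0 at z = 0."""
    return 1.0 + log_sigmoid(z) / math.log(2.0)


def exponential_bound(z):
    """Assumed bound: 1 - exp(-z), equal to 0 at z = 0."""
    return 1.0 - math.exp(-z)


# Check both bounds against the indicator on a grid of margins z.
grid = [-50.0, -5.0, -1.0, -0.1, 0.0, 0.1, 1.0, 5.0, 50.0]
bounds_hold = all(
    log_sigmoid_bound(z) <= indicator(z) + 1e-12
    and exponential_bound(z) <= indicator(z) + 1e-12
    for z in grid
)
```

Both functions stay below 1 for positive z and at or below 0 for non-positive z, so averaging either one over the orderable pairs gives a smooth quantity that never exceeds the CI.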

Maximize lower bounds on the CI: Use linear ranking functions, add a regularization term, and maximize the resulting objective with gradient based methods.
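A rough sketch of what gradient-based maximization of such an objective might look like. It assumes a linear score w·x in which a higher score means longer predicted survival, a log-sigmoid surrogate averaged over the orderable pairs, and an L2 penalty. The plain fixed-step gradient ascent, the function names, and the hyperparameters are illustrative choices, not the authors' implementation.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def comparable_pairs(times, events):
    """Pairs (i, j) where i's event was observed before j's observed time."""
    n = len(times)
    return [(i, j) for i in range(n) for j in range(n)
            if events[i] and times[i] < times[j]]


def fit_linear_ranker(X, times, events, lam=0.01, lr=0.5, steps=500):
    """Gradient ascent on mean log(sigmoid(w.(x_j - x_i))) - (lam / 2) * |w|^2."""
    pairs = comparable_pairs(times, events)
    if not pairs:
        raise ValueError("no pair of subjects can be ordered")
    d = len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        grad = [-lam * wk for wk in w]          # gradient of the L2 penalty
        for i, j in pairs:
            diff = [X[j][k] - X[i][k] for k in range(d)]
            z = sum(wk * dk for wk, dk in zip(w, diff))
            coeff = (1.0 - sigmoid(z)) / len(pairs)  # d/dz of log(sigmoid(z))
            for k in range(d):
                grad[k] += coeff * diff[k]
        w = [wk + lr * gk for wk, gk in zip(w, grad)]
    return w


# Hypothetical data: one covariate that grows with survival time,
# with the third subject censored.
X = [[0.0], [1.0], [2.0], [3.0]]
times = [1.0, 2.0, 3.0, 4.0]
events = [True, True, False, True]
w = fit_linear_ranker(X, times, events)
```

Because the surrogate is concave in w, a simple first-order method is enough for a toy problem like this; a practical implementation would likely use a more robust optimizer.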

Connection to the PH model: Log-likelihood for correct ranking. For a proportional hazard model we can show that the probability of correctly ranking a pair of subjects takes a form that is commonly assumed in the ranking literature. Under PH models this assumption holds exactly.
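The slide's equation is missing from the transcript. The short derivation below is a reconstruction of the kind of result the slide appears to state, under assumptions chosen here: two independent subjects $i$ and $j$ with continuous survival times, hazards $h_0(t)\,e^{a_i}$ and $h_0(t)\,e^{a_j}$ with $a = w^{\top}x$, and a cumulative baseline hazard $H_0(t)$ that grows without limit.

```latex
\begin{aligned}
\Pr[T_i < T_j]
  &= \int_0^{\infty} f_i(t)\, S_j(t)\, dt
   = \int_0^{\infty} h_0(t)\, e^{a_i}
     \exp\!\big(-H_0(t)\,(e^{a_i} + e^{a_j})\big)\, dt \\
  &= \frac{e^{a_i}}{e^{a_i} + e^{a_j}}
   = \sigma(a_i - a_j),
  \qquad \sigma(z) = \frac{1}{1 + e^{-z}} .
\end{aligned}
```

Here $S_j(t) = \exp(-H_0(t)\,e^{a_j})$ is the survival function and $f_i(t) = h_0(t)\,e^{a_i} S_i(t)$ the density. The pairwise probability is a logistic sigmoid of the score difference, which is the pairwise model often assumed in ranking methods.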

Penalized log-likelihood: Compare this with the objective function obtained using the lower bound approach.

Cox partial likelihood: Our proposed method explicitly maximizes a lower bound on the CI. Cox's method maximizes the partial likelihood. Experimental results indicate that both do well. Conjecture: Is Cox's partial likelihood also a lower bound on the CI?
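For reference, a small sketch of the quantity that Cox's method maximizes, in its standard textbook form: each observed event contributes its risk score minus the log of the summed exponentiated risk scores of everyone still at risk at that time. The function name and data are hypothetical. In this convention a higher w·x means a higher hazard, that is, shorter expected survival.

```python
import math


def cox_log_partial_likelihood(w, X, times, events):
    """Log partial likelihood of a linear PH model.

    w:      regression parameters
    X:      list of covariate vectors
    times:  observed times (event or censoring)
    events: True where the event was observed
    """
    risk = [sum(wk * xk for wk, xk in zip(w, x)) for x in X]
    total = 0.0
    for i, observed in enumerate(events):
        if not observed:
            continue  # censored subjects appear only in risk sets
        at_risk = [risk[j] for j in range(len(times)) if times[j] >= times[i]]
        m = max(at_risk)  # log-sum-exp shift for numerical stability
        log_denom = m + math.log(sum(math.exp(r - m) for r in at_risk))
        total += risk[i] - log_denom
    return total


# Hypothetical data: a larger covariate goes with an earlier death.
X = [[2.0], [1.0], [0.0]]
times = [1.0, 2.0, 3.0]
events = [True, True, True]
ll_good = cox_log_partial_likelihood([1.0], X, times, events)
ll_bad = cox_log_partial_likelihood([-1.0], X, times, events)
```

Parameters that assign higher risk to the subjects who die earlier give a larger value, which hints at why maximizing this likelihood also tends to produce good rankings.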

Cox partial likelihood (cont.)

Results: The proposed method is slightly better than Cox-PH. However, the differences are not significant.

Thank you! Questions?