A comparative study of survival models for breast cancer prognostication based on microarray data: a single gene beat them all? B. Haibe-Kains, C. Desmedt,

Slides:



Advertisements
Similar presentations
Regulation of Consumer Tests in California AAAS Meeting June 1-2, 2009 Beatrice OKeefe Acting Chief, Laboratory Field Services California Department of.
Advertisements

Most Random Gene Expression Signatures are Significantly Associated with Breast Cancer Outcome Venet, et al. PLoS Computational Biology, 2011 Molly Carroll.
1 Statistical Modeling  To develop predictive Models by using sophisticated statistical techniques on large databases.
Clinical Trial Designs for the Evaluation of Prognostic & Predictive Classifiers Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer.
Yue Han and Lei Yu Binghamton University.
The 70-Gene Profile and Chemotherapy Benefit in 1,600 Breast Cancer Patients Bender RA et al. ASCO 2009; Abstract 512. (Oral Presentation)
Departments - Surgery - Gerontology and Geriatrics Department of SurgeryDepartment of Gerontology & Geriatrics Prof. dr. C.J.H. van de VeldeProf. dr. R.G.J.
Expression profiles for prognosis and prediction Laura J. Van ‘t Veer The Netherlands Cancer Institute, Amsterdam.
Model and Variable Selections for Personalized Medicine Lu Tian (Northwestern University) Hajime Uno (Kitasato University) Tianxi Cai, Els Goetghebeur,
Supervised classification performance (prediction) assessment Dr. Huiru Zheng Dr. Franscisco Azuaje School of Computing and Mathematics Faculty of Engineering.
4 th NETTAB Workshop Camerino, 5 th -7 th September 2004 Alberto Bertoni, Raffaella Folgieri, Giorgio Valentini
New Methods in Ecology Complex statistical tests, and why we should be cautious!
1 Masterseminar „A statistical framework for the diagnostic of meningioma cancer“ Chair for Bioinformatics, Saarland University Andreas Keller Supervised.
Using Random Peptide Phage Display Libraries for early Breast cancer detection Ekaterina Nenastyeva.
MammaPrint, the story of the 70-gene profile
Thoughts on Biomarker Discovery and Validation Karla Ballman, Ph.D. Division of Biostatistics October 29, 2007.
References 1.Salazar R, Roepman P, Capella G et al. Gene expression signature to improve prognosis prediction of stage II and III colorectal cancer. J.
A comparative study of survival models for breast cancer prognostication based on microarray data: a single gene beat them all? B. Haibe-Kains, C. Desmedt,
Expression profiling of peripheral blood cells for early detection of breast cancer Introduction Early detection of breast cancer is a key to successful.
Prognostic Modelling and Profiling of Breast Cancer Patients after Surgery Ian Jarman School of Computer and Mathematical Sciences Liverpool John Moores.
Introduction MarkerPolymorphisms in CYP2C19 (*2 and *17) assessed by Taqman allelic discrimination ObjectivesTo evaluate the predictive capacity of CYP2C19.
Estimating cancer survival and clinical outcome based on genetic tumor progression scores Jörg Rahnenführer 1,*, Niko Beerenwinkel 1,, Wolfgang A. Schulz.
JM - 1 Introduction to Bioinformatics: Lecture VIII Classification and Supervised Learning Jarek Meller Jarek Meller Division.
Taiwan 2000 PETACC 3 ASCO 2009 Molecular markers in colon cancer have a stage specific prognostic value. Results of the translational study on the PETACC.
Tang G et al. Proc SABCS 2010;Abstract S4-9.
The 70 gene Mammaprint ™ signature: a comparison with traditional clinico-pathological parameters. Patrizia Querzoli 1, Massimo Pedriali 1, Gardenia Munerato.
A 14-gene prognosis signature predicts metastasis risk in node-negative, estrogen receptor-positive, Tamoxifen-treated breast cancer in different ethnogeographic.
Analysis of Molecular and Clinical Data at PolyomX Adrian Driga 1, Kathryn Graham 1, 2, Sambasivarao Damaraju 1, 2, Jennifer Listgarten 3, Russ Greiner.
Molecular Diagnosis Florian Markowetz & Rainer Spang Courses in Practical DNA Microarray Analysis.
University of Washington Institute of Technology Tacoma, WA, USA Ecole des Hautes Etudes en Santé Publique Département Infobiostat Rennes, France Isabelle.
Sgroi DC et al. Proc SABCS 2012;Abstract S1-9.
Metabolic Syndrome and Recurrence within the 21-Gene Recurrence Score Assay Risk Categories in Lymph Node Negative Breast Cancer Lakhani A et al. Proc.
Chapter 16 The Elaboration Model Key Terms. Descriptive statistics Statistical computations describing either the characteristics of a sample or the relationship.
D:/rg/folien/ms/ms-USA ppt F 1 Assessment of prediction error of risk prediction models Thomas Gerds and Martin Schumacher Institute of Medical.
Selection of Patient Samples and Genes for Disease Prognosis Limsoon Wong Institute for Infocomm Research Joint work with Jinyan Li & Huiqing Liu.
Computational biology of cancer cell pathways Modelling of cancer cell function and response to therapy.
INCREASED EXPRESSION OF PROTEIN KINASE CK2  SUBUNIT IN HUMAN GASTRIC CARCINOMA Kai-Yuan Lin 1 and Yih-Huei Uen 1,2,3 1 Department of Medical Research,
Sage / DREAM Breast Cancer Prognosis Challenge The goal of the breast cancer prognosis challenge is to assess the accuracy of computational models designed.
Descriptive Statistics vs. Factor Analysis Descriptive statistics will inform on the prevalence of a phenomenon, among a given population, captured by.
Ranjit Ganta, Raj Acharya, Shruthi Prabhakara Department of Computer Science and Engineering, Penn State University DATA WAREHOUSE FOR BIO-GEO HEALTH CARE.
Effect of 21-Gene Reverse- Transcriptase Polymerase Chain Reaction Assay on Treatment Recommendations in Patients with Lymph Node-Positive and Estrogen.
Risk assessment for VTE Dr Roopen Arya King’s College Hospital.
Journal Club Meeting Sept 13, 2010 Tejaswini Narayanan.
Class 23, 2001 CBCl/AI MIT Bioinformatics Applications and Feature Selection for SVMs S. Mukherjee.
Modeling Ultra-high Dimensional Feature Selection as a Slow Intelligence System Wang Yingze CS 2650 Project.
Computational Approaches for Biomarker Discovery SubbaLakshmiswetha Patchamatla.
Jin MENG Shen FU (DPD 08) Biology 2 - Head/Neck and CNS Tumors
Pan-cancer analysis of prognostic genes Jordan Anaya Omnes Res, In this study I have used publicly available clinical and.
Scott Kopetz, MD, PhD Department of Gastrointestinal Medical Oncology
Eigengenes as biological signatures Dr. Habil Zare, PhD PI of Oncinfo Lab Assistant Professor, Department of Computer Science Texas State University 5.
A B C Supplementary Figure S1. Time-dependent assessment of grade, GGI and PAM50 in untreated patients Landmark analyses of the Kaplan-Meier estimates.
Network applications Sushmita Roy BMI/CS 576 Dec 9 th, 2014.
Date of download: 5/29/2016 Copyright © 2016 American Medical Association. All rights reserved. From: Gene Expression Signatures, Clinicopathological Features,
Analyzing circadian expression data by harmonic regression based on autoregressive spectral estimation Rendong Yang and Zhen Su Division of Bioinformatics,
Eigengenes as biological signatures Dr. Habil Zare, PhD PI of Oncinfo Lab Assistant Professor, Department of Computer Science Texas State University 3.
Classifiers!!! BCH364C/391L Systems Biology / Bioinformatics – Spring 2015 Edward Marcotte, Univ of Texas at Austin.
IMPACT OF STAGE MIGRATION ON NODE POSITIVE PROSTATE CANCER RATE AND FEATURES: A 20-YEAR, SINGLE INSTITUTION ANALYSIS IN MEN TREATED WITH EXTENDED PELVIC.
IDENTIFYING CANCER SUBTYPES BASED ON SOMATIC MUTATION PROFILE BIOINFORMATICS SEMINAR 2016 SPRING YOUJIN SHIN.
Mamounas EP et al. Proc SABCS 2012;Abstract S1-10.
PhD fellowship in bioinformatics
KnowEnG: A SCALABLE KNOWLEDGE ENGINE FOR LARGE SCALE GENOMIC DATA
Principal Component Analysis (PCA)
A Long Noncoding RNA Signature That Predicts Pathological Complete Remission Rate Sensitively in Neoadjuvant Treatment of Breast Cancer  Gen Wang, Xiaosong.
Descriptive Statistics vs. Factor Analysis
Volume 5, Issue 6, Pages e3 (December 2017)
Recurrence-Associated Long Non-coding RNA Signature for Determining the Risk of Recurrence in Patients with Colon Cancer  Meng Zhou, Long Hu, Zicheng.
Florian T. Merkle, Kevin Eggan  Cell Stem Cell 
Disease-specific survival
Figure 1. Identification of three tumour molecular subtypes in CIT and TCGA cohorts. We used CIT multi-omics data ( Figure 1. Identification of.
by Wei-Yi Cheng, Tai-Hsien Ou Yang, and Dimitris Anastassiou
Presentation transcript:

A comparative study of survival models for breast cancer prognostication based on microarray data: a single gene beat them all? B. Haibe-Kains, C. Desmedt, C. Sotiriou and G. Bontempi BIOINFORMATICS Vol. 24 no , pages

outline microarray Introduction –Motivation –Results –Purpose Risk prediction methods

Microarray mics/chip/chip.html

Motivation Survival prediction of breast cancer (BC) patients, independently of treatment, also known as prognostication, is a complex task. –Clinically similar breast tumors and molecularly heterogeneous –Several clinical and pathological indicators such as histological grade, tumor size and lymph node –Although BC prognostication has been the object of intense research, a still open challenge is how to detect patient who needs adjuvant systemic therapy.

Motivation In recently years, the analysis of gene expression profiles by means of sophisticated data mining tools emerged as a promising technology to bring additional insight into BC biology and to improve the quality of prognostication. –array-based technology –The sequencing of the human genome

Motivation Several research teams conducted comprehensive genome-wide assessments of gene expression profiling and identified prognostic gene expression signatures. With respect to clinical guidelines, these signatures were shown to correctly identify a larger group of low-risk patient not requiring treatment.

Motivation In fact, clinicians encounter problems when confronted with patient with intermediate-grade tumors (Grade 2). These tumors, which represent 30-60% of cases, are a major source of inter- observer discrepancy and may display intermediate phenotype and survival, making treatment decisions for the patients a great challenge, with subsequent under- or over-treatment.

Results Complex prediction methods are highly exposed to performance degradation despite the use of cross-validation techniques for the following reasons: –The large number of variables –The reduced amount of samples –The high degree of noise Our analysis shows that the most methods are not significantly better than the simplest one, a univariate model relying on a single proliferation gene.

Results This result suggests the proliferation might be the most relevant biological process for BC prognostication. The loss of interpretability deriving from the use of overcomplex methods may be not sufficiently counterbalabed by an improvement of the quality of prediction.

purpose Set up a common and independent assessment framework to compare the performance fo existing gene signatures and several state-of- the-art risk prediction method Compare the prediction accuracy of these methods in several BC microarray prognostication tasks to elucidate the key characteristics of a successful risk prediction method and to bring additional insights into BC biology.

Risk prediction methods In this work, we compare the performance of 13 risk prediction methods on more than 1000 patients. All the risk prediction models considered in this study return a risk score denoted by R f

The first risk prediction method is also the simplest one and defines the risk score as the expression of a single proliferation gene (AURKA)

–Genotype: It can be the expression of –Single proliferation gene (AURKA) –Biologically driven selection of genes of interest (BD) –The whole genome (GW) –Dimension reduction strategy: A simple univariate ranking (RANK) of the k most relevant features A selection of the first k principal components (PCA)

–Structure of the model: Multivariate (MULTIV) model a linear combination of univariate modes(COMBUNIV) –Learning algorithm: The linear combination of gene expressions weighted by the significance computed from the Wilconxon rank sum test (WILCOSON) The multivariate linear regression model (LM) The linear combination of gene expressions weighted by the significance computed from the univariate Cox’s proportional hazards model (COX)

Performance assessment In order to assess the performance of the risk prediction methods, we used five accuracy measures: –Time-dependent ROC Curve

Discussion and conclusions The loss of interpretability deriving from the use of overcomplex methods in survival analysis of BC microarray data might be not sufficiently counterbalanced by an improvement in the quality of prediction.