Claudio Lottaz and Rainer Spang

Presentation transcript:

Decomposing Complex Clinical Phenotypes by Biologically Structured Microarray Analysis. Claudio Lottaz and Rainer Spang. Berlin Center for Genome Based Bioinformatics, Berlin (Germany); Computational Diagnostics, Max Planck Institute for Molecular Genetics, Berlin (Germany).

Overview: Introduction; Using functional annotation for semi-supervised classification; Heterogeneity vs. performance; Evaluation on cancer related data; Conclusions. 22-Oct-19

Introduction: Tumor Classification. Setting: the data are gene expression profiles; the goal is prediction/classification of outcome or sub-type. More formally: many expression levels are measured, samples are labelled as disease (D) and control (C), and a classifier is trained. [Figure labels: Patients, Genes, D, C.]

Introduction: State-of-the-Art. Various powerful methods: support vector machines, shrunken centroids, ... Regularization to fight overfitting: feature selection, large margins, ... Common hypothesis: generate a single molecular signature.

Introduction: Complex Phenotypes. A single clinical phenotype may be caused by different molecular mechanisms. Our approach: discover several sub-classes in the disease group. Each sub-class has a homogeneous molecular signature.

Introduction: Molecular Symptoms. Classical signatures are globally optimal. They have no biological focus. Genes are co-regulated and thus correlated, so in a global signature genes can be replaced with little loss. Molecular symptom: a functionally focused signature to identify a disease sub-class. High specificity, sub-optimal sensitivity.

Introduction: Molecular Patient Stratification. Patterns of molecular symptoms define a molecular patient stratification. [Figure labels: Control, Subclass, Molecular Symptom, Another Molecular Symptom, Diagnostic signature.]

Using Functional Annotations: A Priori vs. A Posteriori. Common procedure (a posteriori): functional annotations are consulted after the statistical analysis of the data. Our suggestion (a priori): functional annotations enter the statistical analysis together with the data.

Using Functional Annotations: Gene Ontology. Biological terms in a directed graph. Genes are annotated to terms. Levels represent the specificity of terms.
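
The graph structure described on this slide can be illustrated with a minimal sketch. The term names, gene names and helper functions below are hypothetical toy values chosen for illustration, not taken from the Gene Ontology or from the talk; the level of a term is assumed here to be its shortest distance from the root.

```python
from collections import deque

# Hypothetical toy ontology: parent term -> child terms (a directed acyclic graph).
CHILDREN = {
    "biological_process": ["metabolism", "cell_cycle"],
    "metabolism": ["lipid_metabolism"],
    "cell_cycle": ["mitosis"],
    "lipid_metabolism": [],
    "mitosis": [],
}

# Hypothetical gene-to-term annotations.
ANNOTATIONS = {
    "lipid_metabolism": {"geneA", "geneB"},
    "mitosis": {"geneC"},
    "cell_cycle": {"geneD"},
}


def term_levels(children, root):
    """Breadth-first search; a term's level is its shortest distance from the root."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        term = queue.popleft()
        for child in children[term]:
            if child not in levels:
                levels[child] = levels[term] + 1
                queue.append(child)
    return levels


def leaf_terms(children):
    """Terms without children, i.e. the most specific terms."""
    return sorted(term for term, kids in children.items() if not kids)


def genes_below(children, annotations, term):
    """All genes annotated to a term or to any of its descendants."""
    genes = set(annotations.get(term, set()))
    for child in children[term]:
        genes |= genes_below(children, annotations, child)
    return genes


levels = term_levels(CHILDREN, "biological_process")
leaves = leaf_terms(CHILDREN)
cell_cycle_genes = genes_below(CHILDREN, ANNOTATIONS, "cell_cycle")
```

Deeper levels correspond to more specific terms, and an inner node gives access to every gene annotated anywhere beneath it.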

Using Functional Annotations: Structured Analysis of Microarrays. Classification in leaf nodes: regularized multivariate classifier; local signatures. Diagnosis propagation: combine child diagnoses in inner nodes; generate more general diagnoses. Regularization: shrink the classifier graph; remove uninformative branches.

Using Functional Annotations: Leaf Node Classification. Shrunken centroid classification (Tibshirani et al. 2002). Classification according to distance to centroids. Regularization via gene shrinkage. Determine probability-like values as classification results.
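
A minimal pure-Python sketch of nearest shrunken centroid classification may help make the leaf-node step concrete. It follows the general recipe of Tibshirani et al. (2002) in simplified form; the function names, the choice of the median as offset `s0`, and the toy data are assumptions for illustration and not the authors' implementation.

```python
import math
from statistics import median


def soft_threshold(value, delta):
    """Move value towards zero by delta; values within delta of zero become zero."""
    return math.copysign(max(abs(value) - delta, 0.0), value)


def fit_shrunken_centroids(samples, labels, delta):
    """Fit class centroids shrunken towards the overall centroid, gene by gene."""
    classes = sorted(set(labels))
    n, p = len(samples), len(samples[0])
    overall = [sum(x[i] for x in samples) / n for i in range(p)]
    by_class = {k: [x for x, y in zip(samples, labels) if y == k] for k in classes}
    centroids = {
        k: [sum(x[i] for x in xs) / len(xs) for i in range(p)]
        for k, xs in by_class.items()
    }
    # Pooled within-class standard deviation per gene.
    s = []
    for i in range(p):
        ss = sum((x[i] - centroids[y][i]) ** 2 for x, y in zip(samples, labels))
        s.append(math.sqrt(ss / (n - len(classes))))
    s0 = median(s)  # assumed offset guarding against genes with tiny variance
    shrunken = {}
    for k in classes:
        m_k = math.sqrt(1.0 / len(by_class[k]) - 1.0 / n)
        shrunken[k] = []
        for i in range(p):
            scale = m_k * (s[i] + s0)
            d = (centroids[k][i] - overall[i]) / scale
            shrunken[k].append(overall[i] + scale * soft_threshold(d, delta))
    priors = {k: len(by_class[k]) / n for k in classes}
    return {"centroids": shrunken, "scale": [si + s0 for si in s], "priors": priors}


def predict_proba(model, x):
    """Probability-like class scores from standardized distances to the centroids."""
    scores = {}
    for k, c in model["centroids"].items():
        dist = sum(((xi - ci) / sc) ** 2 for xi, ci, sc in zip(x, c, model["scale"]))
        scores[k] = -0.5 * dist + math.log(model["priors"][k])
    top = max(scores.values())
    weights = {k: math.exp(v - top) for k, v in scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Toy data: gene 0 separates the classes, gene 1 is noise.
X = [[2.0, 0.1], [2.2, -0.1], [1.8, 0.0], [0.0, 0.0], [0.2, 0.1], [-0.2, -0.1]]
y = ["disease", "disease", "disease", "control", "control", "control"]
model = fit_shrunken_centroids(X, y, delta=0.5)
probs = predict_proba(model, [2.0, 0.0])
```

With a very large `delta` every class centroid collapses onto the overall centroid, so the classifier carries no information; this is the regularization effect of gene shrinkage.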

Using Functional Annotations: Propagation of Classification. Weighted averages. Weight according to child performance. Weights are normalized per inner node. [Figure: parent node Pa with children C1, C2, C3 and weights w1, w2, w3.]
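
The propagation step can be sketched as a normalized weighted average. The function name and the numbers are hypothetical; how the weights are derived from child performance is not specified on the slide, so they are simply passed in here.

```python
def propagate(child_probs, child_weights):
    """Combine child diagnoses in an inner node as a normalized weighted average.

    Returns None when all weights are zero, i.e. no child is informative.
    """
    total = sum(child_weights)
    if total == 0:
        return None
    return sum(w * p for w, p in zip(child_weights, child_probs)) / total


# Hypothetical parent Pa with children C1, C2, C3 and weights w1, w2, w3.
parent_prob = propagate([0.9, 0.6, 0.3], [2.0, 1.0, 1.0])
```

Dividing by the sum of the weights is what normalizes them per inner node, so the parent's value stays on the same probability-like scale as its children.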

Using Functional Annotations: Graph Shrinkage. Weights of nodes are shrunken by a constant. Negative weights are set to zero, so uninformative branches vanish. Best shrinkage level chosen in cross-validation.
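
A minimal sketch of the shrinkage rule, under stated assumptions: node names, weights and cross-validation errors are made-up values, and breaking ties in favour of the larger shrinkage level (the sparser graph) is an assumption, not something stated in the talk.

```python
def shrink_weights(weights, delta):
    """Subtract a constant from every node weight; negative results become zero."""
    return {node: max(w - delta, 0.0) for node, w in weights.items()}


def surviving_nodes(weights):
    """Nodes whose weight is still positive after shrinkage."""
    return sorted(node for node, w in weights.items() if w > 0)


def choose_level(cv_errors):
    """Pick the shrinkage level with the lowest cross-validation error.

    Ties are broken in favour of the larger level (assumption).
    """
    return min(cv_errors, key=lambda level: (cv_errors[level], -level))


weights = {"a": 0.75, "b": 0.125, "c": 0.5}      # hypothetical node weights
shrunken = shrink_weights(weights, 0.25)
kept = surviving_nodes(shrunken)
best = choose_level({0.0: 0.10, 0.25: 0.08, 0.5: 0.08})  # hypothetical CV errors
```

A node with zero weight no longer contributes to its parent's weighted average, which is how an uninformative branch drops out of the classifier graph.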

Heterogeneity vs. Performance: Biased Classifier Evaluation. Calibration of Sensitivity and Specificity. Shrinkage Parameter. Worst Performance in Leaf Node. Cj = DCi ( j Dj )-1

Heterogeneity vs. Performance: Classifier Heterogeneity. Difference between two classifiers: measures inconsistency of classifications. Node's redundancy. Graph's redundancy (K nodes of the shrunken graph).
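
The slide's formulas did not survive in this transcript. As a purely illustrative stand-in, one simple way to quantify the inconsistency between two classifiers is the fraction of samples on which their calls differ; this measure and the function name are assumptions, not the definition used in the talk.

```python
def disagreement(calls_a, calls_b):
    """Fraction of samples classified differently by two classifiers (assumed measure)."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both classifiers must be applied to the same samples")
    differing = sum(1 for a, b in zip(calls_a, calls_b) if a != b)
    return differing / len(calls_a)


# Hypothetical calls of two node classifiers on four samples (D = disease, C = control).
d = disagreement(["D", "C", "D", "D"], ["D", "D", "D", "C"])
```

Under this stand-in, two nodes that always agree have a disagreement of zero, which would mark one of them as redundant with respect to the other.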

Heterogeneity vs. Performance: Calibration. Sensitivity vs. Specificity: Best classifiers: set to control prevalence. More molecular symptoms: set  higher than control prevalence. Heterogeneity vs. Performance:  Molecular symptoms are heterogeneous. Thus high  eliminates them.

Evaluation on Cancer Related Data: Leukemia Data Set. Data set by Yeoh et al. 2002: acute lymphocytic leukemia; 327 patients of 7 clinical sub-types; expression profiles by HG-U95Av2. Task for illustration: detect the MLL sub-type; 20 MLL samples; 109 test set / 218 training set.

Evaluation on Cancer Related Data: Functional Annotations. Focus on GO's Biological Process branch (8,173 terms). 12,625 probesets on the chip. 8,679 genes (68.7% of probesets) in 1,359 leaf nodes. 845 inner nodes (total 2,204 nodes).
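
The figures on this slide are internally consistent, as a short check shows. The variable names are illustrative; the numbers are those quoted on the slide.

```python
probesets = 12625        # probesets on the chip
annotated_genes = 8679   # genes annotated in the Biological Process branch
leaf_nodes = 1359
inner_nodes = 845

# Share of probesets covered by annotated genes, in percent with one decimal.
coverage_percent = round(100 * annotated_genes / probesets, 1)
total_nodes = leaf_nodes + inner_nodes

print(coverage_percent, total_nodes)
```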

Evaluation on Cancer Related Data: MLL Classifier. 2,796 genes accessible through 32 nodes.

Evaluation on Cancer Related Data: MLL Stratification.

Conclusions. Semi-supervised classification: detect sub-classes in labelled disease groups. Functional annotation: use in an a priori fashion to find biologically focused signatures (molecular symptoms). Resolve complex clinical phenotypes (stratification through molecular symptoms).