Molecular Classification of Cancer

Presentation transcript:

Molecular Classification of Cancer
Christopher Davis, Mark Fleharty
9/18/2018 Class discovery and class prediction by gene expression

Introduction
- Clinical applications of computational molecular biology
- Class prediction
- Class discovery

Topics of Discussion
- Acute Leukemia
  - AML
  - ALL
- DNA Microarrays
- Data mining methods

Acute Leukemia
- Different types
  - Acute myeloid leukemia (AML)
  - Acute lymphoblastic leukemia (ALL)
- Importance of correct diagnosis
  - Maximize efficacy
  - Minimize toxicity
- Morphological vs. molecular characteristics

DNA Microarrays
- Hybridization of mRNAs onto chips carrying complementary strands of DNA
- What they tell us
  - How much a gene is expressed
  - When genes are expressed
  - Where genes are expressed
  - Under what conditions they are expressed

Gene Expression Example
- mRNAs are the indicator
- Yeast (wine): anaerobic, alcohol
- Yeast (bread): aerobic, CO2

Data Mining
- Correlation weighting methods
- Self-organizing maps
- K-means
- PCA (Principal Component Analysis)

Correlated Weighting Methods
- The magnitude of each gene's vote depends on its expression level in the new sample and on how strongly the gene correlates with the class distinction

Pearson’s “r” Correlation
- Takes continuous values in the interval from -1 to 1
- +1 if two genes are perfectly positively correlated
- -1 if two genes are perfectly negatively correlated
- 0 if there is no linear correlation
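As a minimal sketch of the coefficient described on this slide, the function below computes Pearson's r for two equal-length sequences using only the standard library. The expression values and the 0/1 "idealized" pattern are made-up illustrations, not data from the presentation.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical expression levels of one gene across six samples, and a
# hypothetical idealized pattern: 1 for AML samples, 0 for ALL samples.
gene = [9.1, 8.7, 9.4, 2.0, 2.5, 1.8]
idealized = [1, 1, 1, 0, 0, 0]
r = pearson_r(gene, idealized)  # close to +1: the gene tracks the class distinction
```

A gene whose profile gives r near +1 or -1 against the idealized pattern separates the two classes well; values near 0 carry little class information.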

Example
r = 0.8

Idealized AML/ALL Gene

High Correlation With Idealized Gene

Allow genes to “vote”
- Sort the genes by strength of correlation (this list is often informative in itself)
- Genes cast weighted votes based on their correlation with the idealized gene and on how strongly they are expressed in the patient
- Votes are summed and, based on a predetermined threshold, the patient is classified as AML, ALL, or inconclusive
- Prediction strength
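The voting steps above can be sketched as follows. The slide does not give the exact weight formula, so this example assumes a signal-to-noise style weight (difference of class means divided by the sum of class standard deviations), a midpoint decision boundary per gene, and a prediction strength defined as the margin between the two vote totals divided by their sum. The gene names, expression values, and the 0.3 threshold are illustrative assumptions.

```python
from statistics import mean, stdev

def train_voters(expr, labels, n_genes):
    """Pick the n_genes with the largest absolute class-separation weight.

    expr:   dict mapping gene name -> list of expression values, one per sample
    labels: list of class labels ("AML" or "ALL"), one per sample
    Returns a list of (gene, weight, midpoint) tuples, strongest first.
    """
    voters = []
    for gene, values in expr.items():
        aml = [v for v, c in zip(values, labels) if c == "AML"]
        all_ = [v for v, c in zip(values, labels) if c == "ALL"]
        spread = stdev(aml) + stdev(all_)
        if spread == 0:
            continue  # skipped in this sketch: no within-class variation to scale by
        # Assumed weight: positive when the gene is higher in AML than in ALL.
        weight = (mean(aml) - mean(all_)) / spread
        midpoint = (mean(aml) + mean(all_)) / 2
        voters.append((gene, weight, midpoint))
    voters.sort(key=lambda t: abs(t[1]), reverse=True)
    return voters[:n_genes]

def predict(voters, sample, threshold=0.3):
    """Sum the weighted votes and return (call, prediction_strength)."""
    v_aml = v_all = 0.0
    for gene, weight, midpoint in voters:
        vote = weight * (sample[gene] - midpoint)
        if vote > 0:
            v_aml += vote   # positive votes favor AML
        else:
            v_all += -vote  # negative votes favor ALL
    total = v_aml + v_all
    if total == 0:
        return "Inconclusive", 0.0
    strength = abs(v_aml - v_all) / total
    if strength < threshold:
        return "Inconclusive", strength
    return ("AML" if v_aml > v_all else "ALL"), strength

# Hypothetical training data: three AML samples followed by three ALL samples.
expr = {
    "geneA": [9.0, 8.5, 9.5, 2.0, 2.5, 1.5],
    "geneB": [1.0, 1.5, 0.5, 7.0, 6.5, 7.5],
    "geneC": [5.0, 4.0, 6.0, 5.5, 4.5, 5.0],
}
labels = ["AML"] * 3 + ["ALL"] * 3
voters = train_voters(expr, labels, n_genes=2)
call, strength = predict(voters, {"geneA": 8.8, "geneB": 1.2, "geneC": 5.0})
```

When the selected genes disagree (for example, geneA looks like AML while geneB looks like ALL), the two vote totals nearly cancel, the strength falls below the threshold, and the sample is reported as inconclusive rather than forced into a class.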

Self Organizing Maps
- Method for unsupervised learning that reduces high-dimensional data to low-dimensional data
- Based on a grid of artificial neurons
- Each grid location has a weight vector

Self Organizing Maps Continued
- The node whose weight vector is closest to the input vector is chosen, and its weights are adjusted closer to the input vector
- This node’s neighbors are also adjusted closer to the input vector, according to some decay function
- Process all vectors and repeat until stable
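The update rule above can be sketched for a one-dimensional grid of nodes. This is a minimal illustration, not the implementation used in the presentation: the Gaussian neighborhood function, the linear decay of learning rate and radius, the fixed number of epochs in place of a stability check, and the sample data are all assumptions.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: sq_dist(weights[j], x))

def train_som(data, n_nodes, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map and return its weight vectors."""
    rng = random.Random(seed)
    # Initialize each node from a randomly chosen sample (copied, not aliased).
    weights = [list(rng.choice(data)) for _ in range(n_nodes)]
    radius0 = max(n_nodes / 2, 1.0)
    for epoch in range(epochs):
        # Learning rate and neighborhood radius shrink as training proceeds.
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.5)
        for x in rng.sample(data, len(data)):  # visit samples in random order
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Nodes near the winner on the grid move more than distant ones.
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
    return weights

# Hypothetical two-dimensional samples forming two loose groups.
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
som = train_som(samples, n_nodes=3)
node_for_sample = [best_matching_unit(som, s) for s in samples]
```

After training, samples that map to the same node (or to neighboring nodes) are candidates for membership in the same class.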

Use SOM to discover classes
- SOM is used to find the class members that train the predictors
- Predictors are tested on a new set of samples with known classification
- If the cross-validation succeeds and the prediction strength is good, the cluster discovery and prediction are considered good
- Iterate if you want to find finer classes

K-Means
- The dataset is randomly partitioned into K clusters
- For each data point, calculate the distance from the point to each cluster's centroid; if the point is closest to its current cluster, leave it there, otherwise move it to the closest cluster
- Repeat until stable
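A minimal sketch of the procedure on this slide, assuming Euclidean distance to each cluster's centroid. Re-seeding an empty cluster from a random point and capping the number of iterations are choices made for this example, and the points are made up.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, seed=0):
    """Return (assignments, centroids) for a simple k-means run."""
    rng = random.Random(seed)
    # Step 1: partition the dataset into k clusters at random.
    assign = [rng.randrange(k) for _ in points]
    centroids = []
    for _ in range(max_iter):
        # Recompute each cluster's centroid from its current members.
        centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if not members:
                members = [rng.choice(points)]  # assumption: re-seed an empty cluster
            centroids.append([sum(col) / len(members) for col in zip(*members)])
        # Step 2: move every point to the cluster with the closest centroid.
        new_assign = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Step 3: stop once no point changes cluster.
        if new_assign == assign:
            break
        assign = new_assign
    return assign, centroids

# Hypothetical points forming two well-separated groups.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
assign, centroids = kmeans(points, k=2)
```

The cluster numbers themselves are arbitrary; what matters is which points share a number.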

Principal Components Analysis
- A transform that chooses a new coordinate system for the data set such that the greatest variance lies along the first axis (the first principal component), the second greatest variance along the second axis, and so on
- Can be used to reduce dimensionality by discarding the later principal components
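To illustrate the idea, the sketch below finds only the first principal component, using power iteration on the sample covariance matrix. Power iteration is a choice made for this example so that it needs no external libraries; it can fail if the starting vector happens to be orthogonal to the leading component. The input points are made up.

```python
from math import sqrt

def covariance_matrix(rows):
    """Return (sample covariance matrix, mean-centered rows)."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dim)]
    centered = [[r[d] - means[d] for d in range(dim)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]
    return cov, centered

def first_principal_component(rows, n_iter=200):
    """Return (unit direction, variance along it, 1-D scores of each row)."""
    cov, centered = covariance_matrix(rows)
    dim = len(cov)
    v = [1.0 / sqrt(dim)] * dim  # arbitrary unit-length starting vector
    for _ in range(n_iter):
        # Apply the covariance matrix, then renormalize to unit length.
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(dim) for j in range(dim))
    # Projecting onto v reduces each row to a single coordinate.
    scores = [sum(c[d] * v[d] for d in range(dim)) for c in centered]
    return v, variance, scores

# Hypothetical points lying exactly on the line y = 2x.
points = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, variance, scores = first_principal_component(points)
```

Because these points lie on a single line, the first component points along that line and accounts for all of the variance, so the one-dimensional scores lose no information.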

What This Means
- Diagnostic tools
- Use in diagnosis of other diseases
- Look for toxins in environment
- Decoding regulatory networks
- Use of time sensitive data
- Use of stress data
- Drug discovery
- New classifications of disease

Future Work
- Algorithm research
- How do we gear experiments to maximize the amount of information we get?