Metabolomics Bob Ward German Lab Food Science and Technology.


Genome: all the DNA
Transcriptome: all the mRNA
Proteome: all the proteins
Metabolome: all the metabolites
“Metabolomics is a post genomic technology which seeks to provide a comprehensive profile to all the metabolites present in a biological sample.” (Taylor et al., 2002)

Limitations of “omics” technologies
- Genomics: static picture; expensive; not for individuals
- Transcriptomics: needs genome (annotations); correlated with proteome?; sampling issues; splicing; no info on modifications
- Proteomics: technologically challenging; needs genome?

Metabolome
- Same metabolites for all organisms
- ~1k per organism vs 10k (genes) or 100k (proteins)
- Technology exists and is not too expensive
- Carbohydrate and lipid info

Goal: Discrimination between related genotypes of Arabidopsis
- between Col0 and C24 (parent strains)
- between Col0 x C24 and progeny (F1)
- between (Col0 x C24) and (C24 x Col0): maternal line donates both mitochondria and chloroplast
- Clear-cut realization of effectiveness
- Potential to uncover biologically relevant info

Instrumental and Informatic Tools
- GC/MS: separation/identification of polar metabolites in a 1200-second run time
- AMDIS deconvolution software
- MassLab to choose target ions
- R for statistics
- WEKA (standard neural network approach)
- Euclidean distance
- Principal Component Analysis

Data Work-Up
- Selection of reference chromatogram (F1)
- 8 individual samples for each genotype (no replicates)
- Selection of target peaks/analytes (433)
  - normalized (mg analyte/wt sample) to internal standard (ribitol)
  - allows for simple 2-D matrix
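The work-up above can be sketched as follows. This is a minimal illustration, assuming NumPy and assuming that normalization means dividing each target-peak area by the sample's internal-standard (ribitol) area and by the sample weight; the slides do not give the exact formula. The function name and the numbers are hypothetical.

```python
import numpy as np

def normalize_to_internal_standard(peak_areas, standard_areas, sample_weights):
    """Scale raw peak areas by internal-standard area and sample weight.

    peak_areas:     (n_samples, n_analytes) raw target-peak areas
    standard_areas: (n_samples,) area of the internal standard (e.g. ribitol)
    sample_weights: (n_samples,) sample weight (units are an assumption)
    """
    peak_areas = np.asarray(peak_areas, dtype=float)
    standard_areas = np.asarray(standard_areas, dtype=float)
    sample_weights = np.asarray(sample_weights, dtype=float)
    # Divide every row by that sample's standard area times its weight.
    return peak_areas / (standard_areas * sample_weights)[:, None]

# Made-up example: 2 samples x 2 analytes.
raw = np.array([[200.0, 50.0], [400.0, 100.0]])
ribitol = np.array([100.0, 200.0])
weight = np.array([2.0, 2.0])
matrix = normalize_to_internal_standard(raw, ribitol, weight)
print(matrix)
```

The result is the samples-by-analytes 2-D matrix the slide refers to; in the study it would have one row per plant sample and 433 columns.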

201 metabolites identified in some detail (92 as molecular type and 109 by chemical property)
High variance in low numbers corresponds to core metabolites

[Figure: panels for Col0, C24, Col0 x C24 and C24 x Col0]

Neural Network Analysis
- P = 0.27
- Lack of samples precluded use of a separate training subset
- Leave-one-out cross-validation used for training
- Model judged by its ability to classify the held-out object (repeated for all objects)
- Allows maximal use of data for validation when n is low
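The leave-one-out procedure can be sketched as below. The slides used a WEKA neural network; this sketch swaps in a simple nearest-centroid classifier as a stand-in so that it stays short and needs only NumPy. All names and data are hypothetical.

```python
import numpy as np

def nearest_centroid_predict(train_x, train_y, test_x):
    """Predict the label whose class centroid is closest to test_x."""
    labels = sorted(set(train_y))
    centroids = np.array([train_x[train_y == lab].mean(axis=0) for lab in labels])
    dists = np.linalg.norm(centroids - test_x, axis=1)
    return labels[int(np.argmin(dists))]

def leave_one_out_accuracy(x, y):
    """Hold out each sample once, train on the rest, score the held-out one."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    correct = 0
    for i in range(len(x)):
        mask = np.arange(len(x)) != i          # everything except sample i
        pred = nearest_centroid_predict(x[mask], y[mask], x[i])
        correct += int(pred == y[i])
    return correct / len(x)

# Synthetic, well-separated two-class data (not from the study).
x = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
y = ["A", "A", "A", "B", "B", "B"]
accuracy = leave_one_out_accuracy(x, y)
print(accuracy)
```

Every sample is used for validation exactly once, which is why the approach suits small n such as 8 samples per genotype.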

Clustering by Euclidean distance
[Figure: clustering of Col0, C24, Col0 x C24 and C24 x Col0 samples]
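As a rough illustration of the distance measure behind such a clustering, the sketch below computes all pairwise Euclidean distances between sample profiles and finds each sample's nearest neighbour. It assumes NumPy; the profiles are made up, and the slides do not say which clustering algorithm was applied on top of the distances.

```python
import numpy as np

def pairwise_euclidean(x):
    """Return the (n_samples, n_samples) matrix of Euclidean distances."""
    x = np.asarray(x, dtype=float)
    diff = x[:, None, :] - x[None, :, :]       # all pairwise differences
    return np.sqrt((diff ** 2).sum(axis=-1))

# Three hypothetical metabolite profiles with two analytes each.
profiles = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]])
d = pairwise_euclidean(profiles)

# Nearest neighbour of each sample, ignoring the zero self-distance.
masked = d + np.diag(np.full(len(d), np.inf))
nearest = masked.argmin(axis=1)
print(d)
print(nearest)
```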

Principal Component Analysis
- Used to tease out the role of individual metabolites in discrimination
- Unsupervised multivariate analysis applied to functions of many attributes
- Transformation of a large set of related values to a smaller set of uncorrelated variables
- Attempts to express maximum variance in the data
- PCs are axes in multidimensional space
- Object characterized by distance to axis
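A minimal sketch of that transformation, assuming NumPy and using the singular value decomposition of the mean-centered data. This is one common way to compute PCA, not necessarily the routine used for the slides. The data are synthetic.

```python
import numpy as np

def pca_scores(x, n_components):
    """Project mean-centered data onto its first principal components."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    return scores, components

# Synthetic data: the second column is strongly correlated with the first.
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 1))
data = np.hstack([base,
                  2 * base + 0.1 * rng.normal(size=(20, 1)),
                  rng.normal(size=(20, 1))])
scores, components = pca_scores(data, 2)

# The new variables (score columns) are uncorrelated with each other.
print(np.cov(scores.T))
```

The off-diagonal entries of the printed covariance matrix are zero up to rounding, which is the "uncorrelated variables" property in the list above.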

PCA algorithm from MATLAB
- 78% of the variation in the data comes from the first 3 PCs
- Variance of data explained by first few principal components
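The share of variance carried by each PC can be computed from the singular values of the centered data. The sketch below assumes NumPy and a tiny made-up matrix; it illustrates the calculation only and does not reproduce the 78% figure.

```python
import numpy as np

def explained_variance_ratio(x):
    """Fraction of total variance explained by each principal component."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2           # proportional to PC variances
    return variances / variances.sum()

# Made-up data: most of the spread lies along the first column.
x = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 0.5], [0.0, -0.5]])
ratio = explained_variance_ratio(x)
cumulative = np.cumsum(ratio)
print(ratio)
print(cumulative)
```

Reading off the cumulative value at the third component is how a statement such as "the first 3 PCs explain 78% of the variation" is obtained.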

Principal Component Analysis
- Col0 and C24 differentiated, except for one outlier
- F1 genotypes cluster together

Contribution of each variable to first PC
Malate and citrate: metabolites of the TCA cycle
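One way to read off each variable's contribution is to rank the variables by the absolute value of their loading on the first PC. The sketch below assumes NumPy; the analyte names and values are hypothetical placeholders, not data from the study.

```python
import numpy as np

def rank_variables_by_pc1(x, names):
    """Rank variables by the absolute size of their loading on PC1."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[0]                            # first principal axis
    order = np.argsort(-np.abs(loadings))       # largest contribution first
    # The sign of a loading is arbitrary; only its magnitude is ranked.
    return [(names[i], float(loadings[i])) for i in order]

# Hypothetical matrix: 4 samples x 3 analytes.
x = np.array([[2.0, 1.0, 0.0],
              [-2.0, -1.0, 0.0],
              [2.0, 1.0, 0.1],
              [-2.0, -1.0, -0.1]])
names = ["analyte_1", "analyte_2", "analyte_3"]
ranking = rank_variables_by_pc1(x, names)
for name, loading in ranking:
    print(name, round(abs(loading), 3))
```

In the study, this kind of ranking is what singles out malate and citrate as the main contributors to the first PC.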

Relative peak area for metabolites malate and citrate
Col0 contains an outlier, which may explain the misclassification

Other significant results
- Parental genotypes removed from PCA, and F1s discriminated by glucose and fructose
- Inference that the first PC differentiates the parental lines, and the 2nd and 3rd differentiate the F1s
- Malate and citrate from TCA, glucose and fructose from chloroplasts

Conclusions
- Advances in technology will improve detection limits and will allow characterization of metabolites
- Formalized ontology needed to link chemical structure with pathways
- Metabolite profiling is an exciting new field which complements other non-hypothesis-driven global analysis technologies
- Large amounts of informatic support are needed to develop the field and to correlate data from genomics, microarrays, and proteomics