A blind search for patterns: Unravelling low replicate data

ExSpec Pipeline

Data: Structure and variability
- Structure
  - Between ,000+ features
  - Each feature has an associated ion count for each aligned sample.
  - Data is not normally distributed.
- Variability
  - Up to 30% technical variability
  - Each feature is affected differently
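One common way to quantify per-feature technical variability is the coefficient of variation (CV) across technical replicates. The following is a minimal sketch under that assumption; the feature names, ion counts, and the use of 30% as a flagging threshold are hypothetical illustrations and are not taken from the ExSpec pipeline.

```python
from statistics import mean, stdev


def coefficient_of_variation(counts):
    """Return sample standard deviation divided by the mean, as a fraction."""
    m = mean(counts)
    if m == 0:
        return float("nan")
    return stdev(counts) / m


# Hypothetical ion counts for three features across four technical replicates.
replicates = {
    "feature_a": [1000.0, 1050.0, 980.0, 1020.0],
    "feature_b": [200.0, 320.0, 150.0, 260.0],
    "feature_c": [5000.0, 5100.0, 4900.0, 5000.0],
}

cv_by_feature = {
    name: coefficient_of_variation(counts) for name, counts in replicates.items()
}

# Flag features whose technical CV exceeds 30% (illustrative threshold).
noisy = sorted(name for name, cv in cv_by_feature.items() if cv > 0.30)
print(noisy)
```

Because each feature is affected differently, the CV is computed per feature rather than pooled over the whole data set.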

Data Structure and variability

Data: Structure and variability
The majority of features that are detected are singletons.

Low Replicate data
- "Suck it and see"
  - One-off projects
  - Pump priming projects
- Medical samples
  - Biopsy
  - Difficult to access
- Ecological data
  - Resampling is difficult

Methods
- Fingerprinting
- PCA
- Basic scoring
- PDE model
- Gradient search
- Differential analysis

PCA
- Very simple
- Can be highly informative
- Depends on the data
- Used in pipeline
- Data quality
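To make the PCA step concrete, here is a minimal sketch that extracts the first principal component by power iteration on the covariance matrix, using only the Python standard library. It is not the pipeline's implementation; the function name `first_principal_component`, the sample values, and the two-group layout are hypothetical and chosen purely for illustration.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loading vector, per-sample scores) for the first principal component.

    rows: list of samples, each a list of feature values (at least two samples).
    Uses power iteration on the covariance matrix of the mean-centred data.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [
        [sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(p)]
        for a in range(p)
    ]
    # Start from the axis of the highest-variance feature.
    start = max(range(p), key=lambda j: cov[j][j])
    vec = [1.0 if j == start else 0.0 for j in range(p)]
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break  # no variance in the data
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centred]
    return vec, scores


# Hypothetical log-scaled intensities: three samples per group, three features.
samples = [
    [10.0, 5.0, 3.0], [10.2, 5.1, 2.9], [9.9, 4.9, 3.1],   # group 1
    [12.0, 5.0, 1.0], [12.1, 5.2, 0.9], [11.9, 4.8, 1.1],  # group 2
]
loadings, scores = first_principal_component(samples)
print([round(s, 2) for s in scores])
```

The sign of a principal component is arbitrary, so only the separation between groups along the component is meaningful, not which group scores positive.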

PCA Analysis: Brno Project
- Samples:
  - Human biopsy
  - Replication: biopsy cut into equal parts

PCA Analysis
- N group: non-cancer biopsy
- T group: cancer biopsy
Using PCA clustering, we are able to distinguish between healthy and sick patients.

PCA Analysis
PCA revealed profile similarity that correlated with biological evidence.

Human Urine project
- 22 patients sampled: 11 healthy and 11 sick
- Sample labels dropped

PCA Analysis: Ecological Data
Large number of samples without clear replication.

PCA Analysis
Cluster pattern: find the features which hold the cluster pattern.

PCA Analysis
Using PCA and profile similarity analysis, a subset of features of interest was found.

Basic Scoring
- Use Z-score to sort data
- Use this to pull out important features.
- Control – Exp
- With a two-class problem we can use PDE modelling.
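One possible reading of the Z-score step is sketched below: take the control minus experiment difference per feature, standardise those differences across all features, and sort by absolute Z-score. Working on log2 counts is an assumption made here because the data are not normally distributed; the feature names and ion counts are hypothetical.

```python
import math
from statistics import mean, stdev


def z_scores(values):
    """Standardise values to zero mean and unit sample standard deviation."""
    m, s = mean(values), stdev(values)  # assumes the values are not all identical
    return [(v - m) / s for v in values]


# Hypothetical ion counts: feature -> (control, experiment).
counts = {
    "f1": (1000.0, 1100.0),
    "f2": (500.0, 480.0),
    "f3": (200.0, 3200.0),
    "f4": (800.0, 790.0),
    "f5": (1500.0, 1600.0),
}

names = list(counts)
# Control minus experiment on the log2 scale (an assumed transformation).
diffs = [math.log2(ctl) - math.log2(exp) for ctl, exp in counts.values()]
z = z_scores(diffs)

# Sort so the features that deviate most from the bulk come first.
ranked = sorted(zip(names, z), key=lambda item: abs(item[1]), reverse=True)
print(ranked[0][0])
```

With a single control and a single experimental value per feature, the score reflects how unusual a feature's change is relative to all other features, not a per-feature significance test.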

Basic Scoring: PDE modelling
- Multi-class problem
- Plants
  - Wild type
  - act ko mutant
- Treatments
  - Normal light
  - High light

Gradient Analysis
- Use rate of change of abundance to
  - Mine data for specific trends
  - Find features of interest
- Use PDE modelling of rates
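The rate-of-change idea can be illustrated with finite differences between consecutive time points. This is a minimal sketch only; the time points, feature names, and abundances are hypothetical, and the PDE modelling of rates mentioned on the slide is not reproduced here.

```python
def gradients(times, abundances):
    """Finite-difference rate of change between consecutive time points."""
    return [
        (abundances[i + 1] - abundances[i]) / (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    ]


# Hypothetical sampling times (hours) and abundances per feature.
times = [0.0, 1.0, 2.0, 4.0]
series = {
    "f1": [100.0, 105.0, 103.0, 108.0],
    "f2": [100.0, 400.0, 900.0, 1000.0],
    "f3": [500.0, 450.0, 300.0, 100.0],
}

# Largest rate of increase seen for each feature.
max_rate = {name: max(gradients(times, y)) for name, y in series.items()}

# Feature showing the most rapid increase.
fastest = max(max_rate, key=max_rate.get)
print(fastest, max_rate[fastest])
```

Dividing by the time step keeps rates comparable when sampling intervals are uneven.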

Gradient Analysis
Mining for features which showed a rapid increase due to a specific treatment.

Data Provided by:
- Brno
  - Ted Hupp
  - Rob O’Neill
- Urine study
  - Steve Michell
  - John Mcgrath
- Ecological data
  - Dave Hodgson
  - Nicole Goody
- Gradient analysis
  - John Love
- Data scoring
  - Nicholas Smirnoff
  - Mike Page

Metabolomics and Proteomics Mass Spectrometry
The University of Exeter
- Nick Smirnoff (Director of Mass Spectrometry)
- Hannah Florance (MS Facility Manager)
- Venura Perera (Bioinformatics and Mathematical Support)

About me
- Background
  - Applied Maths
  - Untargeted metabolite profiling
- Research interests
  - Data driven modelling
  - Small molecule profiling
  - Gene regulatory network modelling
  - Application of mathematical methods
  - Metabolite identification using LC-MS/MS