A blind search for patterns: Unravelling low replicate data
ExSpec Pipeline
Data: Structure and variability. Structure: Between ,000+ features. Each feature has an associated ion count for each aligned sample. Data is not normally distributed. Variability: Up to 30% technical variability. Each feature is affected differently.
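One way to put a number on per-feature technical variability is the coefficient of variation (standard deviation divided by mean) across technical replicates. The sketch below assumes that definition; the slides do not say how the 30% figure was measured, and the function names and ion counts are hypothetical.

```python
import statistics

def coefficient_of_variation(ion_counts):
    """Sample standard deviation divided by the mean (assumed definition)."""
    mean = statistics.mean(ion_counts)
    if mean == 0:
        return float("nan")  # undefined for an all-zero feature
    return statistics.stdev(ion_counts) / mean

def technical_cv_by_feature(replicates):
    """replicates: dict of feature name -> ion counts from technical replicates."""
    return {name: coefficient_of_variation(counts)
            for name, counts in replicates.items()}

# Hypothetical ion counts for two features measured three times each.
example = {"feature_A": [100, 130, 70], "feature_B": [500, 510, 490]}
cv = technical_cv_by_feature(example)
```

Because each feature is affected differently, the result is kept per feature rather than averaged into one figure.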
Data Structure and variability
Data: Structure and variability. The majority of detected features are singletons.
Low replicate data. “Suck it and see”: one-off projects, pump-priming projects. Medical samples: biopsy, difficult to access. Ecological data: resampling is difficult.
Methods: Fingerprinting, PCA, Basic scoring, PDE model, Gradient search, Differential analysis.
PCA: Very simple. Can be highly informative. Depends on the data. Used in pipeline. Data quality.
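A minimal sketch of PCA on a samples-by-features table, to show how little is needed. It finds only the first principal axis, by power iteration, and uses plain Python; this is an illustration under those assumptions, not the pipeline's own implementation. All function names and ion counts are hypothetical.

```python
import math

def center_columns(rows):
    """Subtract each feature's (column's) mean across samples."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def first_component(centered, iterations=200):
    """Leading principal axis via power iteration on X^T X."""
    n_features = len(centered[0])
    # Deterministic, non-uniform start vector, normalised to unit length.
    v = [float(j + 1) for j in range(n_features)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        new_v = [sum(s * row[j] for s, row in zip(scores, centered))
                 for j in range(n_features)]
        norm = math.sqrt(sum(x * x for x in new_v))
        if norm == 0.0:
            break  # no variance left along this direction
        v = [x / norm for x in new_v]
    return v

def project(centered, axis):
    """Score of each sample along the given axis."""
    return [sum(x * w for x, w in zip(row, axis)) for row in centered]

# Hypothetical ion counts: 4 samples (rows) x 3 features (columns).
samples = [[10, 200, 5], [12, 210, 6], [50, 40, 5], [52, 38, 7]]
centered = center_columns(samples)
pc1 = first_component(centered)
pc1_scores = project(centered, pc1)
```

Samples with similar profiles get similar scores on the first axis, which is what the cluster plots on the following slides rely on. The sign of the axis is arbitrary.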
PCA Analysis: Brno Project. Samples: human biopsy. Replication: biopsy cut into equal parts.
PCA Analysis. N group: non-cancer biopsy. T group: cancer biopsy. Using PCA clustering we are able to distinguish between healthy and sick patients.
PCA Analysis. PCA revealed profile similarity which correlated with biological evidence.
Human Urine Project: 22 patients sampled (11 healthy and 11 sick). Sample labels dropped.
PCA Analysis: Ecological data. Large number of samples without clear replication.
PCA Analysis. Cluster pattern: find the features which hold the cluster pattern.
PCA Analysis. Using PCA and profile similarity analysis, a subset of features of interest was found.
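Once PCA suggests a grouping of samples, one simple way to find features that hold that cluster pattern is to correlate each feature's abundance profile with the cluster membership. The sketch below uses Pearson correlation as a stand-in for profile similarity; the slides do not name the similarity measure, so the choice, the 0.9 threshold and all names are assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either profile is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0.0 or sd_y == 0.0:
        return 0.0
    return cov / (sd_x * sd_y)

def features_matching_pattern(profiles, pattern, threshold=0.9):
    """Features whose profile follows (or mirrors) the pattern, strongest first.

    profiles: dict of feature name -> abundance per sample
    pattern:  one number per sample, e.g. 0/1 cluster membership from PCA
    """
    scored = [(name, pearson(values, pattern)) for name, values in profiles.items()]
    hits = [item for item in scored if abs(item[1]) >= threshold]
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical abundances for 4 samples; samples 3 and 4 form one cluster.
profiles = {"f1": [1, 2, 10, 11], "f2": [5, 5, 5, 5], "f3": [9, 8, 1, 2]}
cluster_pattern = [0, 0, 1, 1]
matches = features_matching_pattern(profiles, cluster_pattern)
```

Features with a strong negative correlation are kept as well, since a feature that drops in one cluster separates the groups just as well as one that rises.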
Basic Scoring. Use Z-score to sort data. Use this to pull out important features. Control – Exp. With a two-class problem we can use PDE modelling.
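A sketch of the Z-score step, assuming that "Control – Exp" means the per-feature difference between one control and one experimental sample, and that the Z-score is taken over those differences across all features. Both readings are assumptions, and the names and values are hypothetical. PDE modelling is not covered here.

```python
import statistics

def zscore_differences(control, experiment):
    """Z-score of (control - experiment) for every feature present in both."""
    names = [name for name in control if name in experiment]
    diffs = [control[name] - experiment[name] for name in names]
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0:
        return {name: 0.0 for name in names}  # no spread, nothing stands out
    return {name: (d - mean) / sd for name, d in zip(names, diffs)}

def rank_features(zscores):
    """Feature names sorted by absolute Z-score, largest first."""
    return sorted(zscores, key=lambda name: abs(zscores[name]), reverse=True)

# Hypothetical ion counts for five features.
control = {"a": 100, "b": 100, "c": 100, "d": 100, "e": 100}
experiment = {"a": 101, "b": 99, "c": 100, "d": 102, "e": 10}
z = zscore_differences(control, experiment)
ranking = rank_features(z)
```

Sorting by absolute value pulls out features that change in either direction; a single extreme feature inflates the standard deviation, so in practice a robust spread estimate may be preferable.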
Basic Scoring: PDE modelling. Multi-class problem. Plants: wild type, act ko mutant. Treatments: normal light, high light.
Gradient Analysis. Use rate of change of abundance to mine data for specific trends and find features of interest. Use PDE modelling of rates.
Gradient Analysis. Mining for features which showed a rapid increase due to a specific treatment.
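A sketch of mining for rapid increases, assuming each feature has an abundance measured at a series of time points after treatment and that the rate of change is estimated by finite differences between consecutive points. The time points, rate threshold and names are hypothetical.

```python
def gradients(times, abundances):
    """Finite-difference rate of change between consecutive time points."""
    return [(abundances[i + 1] - abundances[i]) / (times[i + 1] - times[i])
            for i in range(len(times) - 1)]

def rapid_risers(series, times, min_rate):
    """Features whose steepest increase reaches min_rate, steepest first.

    series: dict of feature name -> abundance at each time point
    """
    hits = {}
    for name, values in series.items():
        peak = max(gradients(times, values))
        if peak >= min_rate:
            hits[name] = peak
    return sorted(hits.items(), key=lambda item: item[1], reverse=True)

# Hypothetical time course (arbitrary units) for three features.
times = [0, 1, 2, 4]
series = {
    "flat": [10, 10, 11, 11],
    "spike": [10, 12, 40, 44],
    "slow": [10, 12, 14, 18],
}
risers = rapid_risers(series, times, min_rate=5)
```

Dividing by the time step matters when sampling is uneven: "slow" gains 4 units over the last interval, but at a rate of only 2 per unit time.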
Data provided by: Brno: Ted Hupp, Rob O’Neill. Urine study: Steve Michell, John Mcgrath. Ecological data: Dave Hodgson, Nicole Goody. Gradient analysis: John Love. Data scoring: Nicholas Smirnoff, Mike Page.
Metabolomics and Proteomics Mass Spectrometry, The University of Exeter. Nick Smirnoff (Director of Mass Spectrometry), Hannah Florance (MS Facility Manager), Venura Perera (Bioinformatics and Mathematical Support).
About me. Background: Applied Maths, Untargeted metabolite profiling. Research interests: Data driven modelling, Small molecule profiling, Gene regulatory network modelling, Application of mathematical methods, Metabolite identification using LC-MS/MS.