Computational Characterization of Short Environmental DNA Fragments Jens Stoye 1, Lutz Krause 1, Robert A. Edwards 2, Forest Rohwer 2, Naryttza N. Diaz.


1 Computational Characterization of Short Environmental DNA Fragments
Jens Stoye (1), Lutz Krause (1), Robert A. Edwards (2), Forest Rohwer (2), Naryttza N. Diaz (1), Alexander Goesmann (1), Scott Kelley (2), Alfred Pühler (1)
(1) Bielefeld University, (2) San Diego State University

3 CeBiTec, Bielefeld University Jens Stoye Metagenomics 454 Pyrosequencing

4 CARMA - A Pipeline for Characterizing Short-Read Metagenomes
Quantitative analysis of metagenomes:
Which microbes live in an environment?
What are they doing?
What are the differences between communities from different environments?

5 (A) Functional Analysis
Reads directly analyzed without prior assembly
Protein family fragments used as Environmental Gene Tags (EGTs) for quantitative analysis of gene content
GO-term profiles characterize genetic diversity and potential metabolism of underlying communities

6 Heat Map Comparing GO-Term Frequencies
Comparative analysis reveals genetic and metabolic trends
Significantly overrepresented GO-terms identified with G-test
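The slide names the G-test but gives no details of how it is applied. As a minimal sketch, assuming each GO-term is tested on a 2x2 table of EGT counts (with / without the term, in two samples), the statistic and its p-value could be computed as follows. The function name `g_test_2x2` and the table layout are illustrative assumptions, not part of CARMA.

```python
import math


def g_test_2x2(a, b, c, d):
    """G-test of independence on the 2x2 count table [[a, b], [c, d]].

    Assumed layout (hypothetical, for illustration):
      a = EGTs with the GO-term in sample 1, b = EGTs without it in sample 1,
      c = EGTs with the GO-term in sample 2, d = EGTs without it in sample 2.
    Returns (G, p_value) with 1 degree of freedom.
    """
    observed = [[a, b], [c, d]]
    total = a + b + c + d
    if total == 0:
        raise ValueError("table must contain at least one observation")
    row_sums = [a + b, c + d]
    col_sums = [a + c, b + d]

    g = 0.0
    for i in range(2):
        for j in range(2):
            if observed[i][j] > 0:  # a zero cell contributes nothing (0 * ln 0 := 0)
                expected = row_sums[i] * col_sums[j] / total
                g += observed[i][j] * math.log(observed[i][j] / expected)
    g = max(2.0 * g, 0.0)  # guard against tiny negative rounding errors

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value
```

With many GO-terms tested at once, a multiple-testing correction would normally be applied on top of the per-term p-values; the slides do not say which one, if any, was used.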

7 (B) Analyzing the Community Structure
EGTs assigned to taxonomic groups based on a phylogenetic analysis
Taxonomic profiles characterize the composition of the underlying communities

8 Taxonomic Classification of Short Environmental Gene Tags (Krause et al., submitted)
Phylogenetic tree reconstructed for each matching Pfam family
Multiple alignment of known family members (downloaded from Pfam web site)

9 Taxonomic Classification of Short Environmental Gene Tags (Krause et al., submitted)
Phylogenetic tree reconstructed for each matching Pfam family
Identified EGTs matching family added to full multiple alignment
PF1-PF7: Known family members
EGT1-EGT3: Environmental Gene Tags matching family

10 Taxonomic Classification of Short Environmental Gene Tags
Phylogenetic tree reconstructed for each matching Pfam family
Multiple alignment used to calculate distance matrix
Pairwise distance: sequence identity in aligned region
Missing values determined with additive estimation (Landry et al., 1996)
        PF1    PF2    …    EGT3
PF1     0      0.3    …    0.6
PF2     0.3    0      …    0.2
…       …      …      …    …
EGT3    0.6    0.2    …    0
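The pairwise distance can be illustrated with a short sketch. It assumes that the distance is one minus the fraction of identical residues over the columns where both aligned sequences have a residue (consistent with the zero diagonal in the example matrix), and that `-` and `.` mark gaps. Pairs with no overlapping columns return `None`; these are the missing values the slide says are filled in by additive estimation, which is not implemented here. The function names are hypothetical.

```python
def identity_distance(row_a, row_b, gap_chars="-."):
    """Distance between two rows of a multiple alignment.

    Assumption: distance = 1 - (identical columns / columns where both rows
    have a residue). Returns None when the rows share no residue column,
    i.e. when the distance is a missing value.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    overlap = 0
    matches = 0
    for x, y in zip(row_a, row_b):
        if x in gap_chars or y in gap_chars:
            continue  # column is outside the region aligned in both rows
        overlap += 1
        if x.upper() == y.upper():
            matches += 1
    if overlap == 0:
        return None
    return 1.0 - matches / overlap


def distance_matrix(alignment):
    """Build a symmetric dict-of-dicts distance matrix from {name: aligned_row}."""
    names = list(alignment)
    matrix = {name: {} for name in names}
    for i, a in enumerate(names):
        matrix[a][a] = 0.0
        for b in names[i + 1:]:
            dist = identity_distance(alignment[a], alignment[b])
            matrix[a][b] = dist
            matrix[b][a] = dist
    return matrix
```

Short EGTs that map to different parts of a protein family often do not overlap at all, which is why missing entries are to be expected between pairs of EGTs.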

11 Taxonomic Classification of Short Environmental Gene Tags (Krause et al., submitted)
Distance matrix used to reconstruct phylogenetic tree (with Neighbor Joining)
EGTs classified based on their location in tree
        PF1    PF2    …    EGT3
PF1     0      0.3    …    0.6
PF2     0.3    0      …    0.2
…       …      …      …    …
EGT3    0.6    0.2    …    0
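For readers unfamiliar with Neighbor Joining, the following is a compact sketch of the textbook algorithm operating on a complete distance matrix. It is not CARMA's implementation; the function name, the `nodeN` labels for internal nodes, and the return format are choices made here for illustration. The step that classifies an EGT from its position in the tree is not shown, since the slides do not describe the rule.

```python
def neighbor_joining(names, dist):
    """Textbook Neighbor Joining.

    names: list of leaf labels.
    dist:  symmetric dict-of-dicts with dist[a][b] for every pair a != b.
    Returns (joins, final_edge) where each join is
    (child_a, child_b, new_internal_node, branch_len_a, branch_len_b)
    and final_edge is (node_x, node_y, length) connecting the last two nodes.
    """
    nodes = list(names)
    d = {a: dict(dist[a]) for a in names}  # working copy
    joins = []
    next_id = 0

    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other active nodes.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}

        # Pick the pair minimising Q(a, b) = (n - 2) * d(a, b) - r(a) - r(b).
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best

        # Branch lengths from the joined pair to the new internal node.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a

        u = f"node{next_id}"
        next_id += 1
        joins.append((a, b, u, len_a, len_b))

        # Distances from the new node to every remaining node.
        d[u] = {}
        for c in nodes:
            if c == a or c == b:
                continue
            d_uc = 0.5 * (d[a][c] + d[b][c] - d[a][b])
            d[u][c] = d_uc
            d[c][u] = d_uc
        nodes = [c for c in nodes if c != a and c != b] + [u]

    x, y = nodes
    return joins, (x, y, d[x][y])
```

The function expects every pairwise distance to be present, so missing entries (for example between non-overlapping EGTs) would have to be estimated before calling it.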

12 Performance Evaluation: Creating Standard of Truth
Test set: 77 complete genomes
2 Superkingdoms (Archaea and Bacteria)
10 Phyla
29 Classes
62 Genera
77 Species
Test set excluded from reference set (Pfam members from any of the 77 species omitted from full multiple alignments)

13 Performance Evaluation: Creating Standard of Truth
77 genomes fragmented with ReadSim (Schmid et al., submitted)
Simulates sequencing using 454 pyrosequencing
Fragments randomly sampled (2x)
Fragment length: 80-120 bp, mean 100 bp
Simulates sequencing errors at homopolymers
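The sampling step can be approximated with a few lines of Python. This is a simplified stand-in, not ReadSim itself: it assumes that "2x" means twofold coverage, draws fragment lengths uniformly from 80 to 120 bp (the slide gives only the range and mean, not the distribution), and omits the homopolymer error model entirely. The function name and parameters are hypothetical.

```python
import random


def sample_fragments(genome, coverage=2.0, min_len=80, max_len=120, seed=0):
    """Sample random fragments from a genome string.

    Assumptions (not taken from ReadSim): uniform start positions, uniform
    fragment lengths in [min_len, max_len], no sequencing errors.
    """
    if len(genome) < max_len:
        raise ValueError("genome is shorter than the maximum fragment length")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    mean_len = (min_len + max_len) / 2
    n_reads = int(coverage * len(genome) / mean_len)

    reads = []
    for _ in range(n_reads):
        length = rng.randint(min_len, max_len)            # inclusive bounds
        start = rng.randint(0, len(genome) - length)      # fragment fits inside the genome
        reads.append(genome[start:start + length])
    return reads
```

Because the true origin of every simulated fragment is known, the taxonomic label of the source genome serves as the standard of truth for the evaluation.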

14 Classification Accuracy for Short Environmental Gene Tags
Sens: Sensitivity, fraction of correctly classified EGTs
Spec: Specificity, reliability of predictions
FNrate: False negative rate, proportion of wrongly classified EGTs
Urate: Unknown rate, proportion of EGTs not assigned to any taxonomic group
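The four measures are defined only in words on the slide. One plausible reading, sketched below, takes Sens, FNrate and Urate as fractions of all EGTs (so they sum to 1) and Spec as the fraction of assigned EGTs that are correct. These formulas are an interpretation rather than the authors' stated definitions, and the function name is hypothetical.

```python
def classification_rates(true_taxa, predicted_taxa):
    """Compute Sens, Spec, FNrate and Urate for one taxonomic rank.

    true_taxa:      list of true taxon labels, one per EGT.
    predicted_taxa: list of predicted labels; None means "not assigned".
    The formulas are an assumed reading of the slide's verbal definitions.
    """
    if len(true_taxa) != len(predicted_taxa):
        raise ValueError("label lists must have equal length")
    total = len(true_taxa)
    if total == 0:
        raise ValueError("at least one EGT is required")

    correct = wrong = unknown = 0
    for truth, predicted in zip(true_taxa, predicted_taxa):
        if predicted is None:
            unknown += 1
        elif predicted == truth:
            correct += 1
        else:
            wrong += 1

    assigned = correct + wrong
    return {
        "Sens": correct / total,
        # Reliability of predictions: undefined if nothing was assigned.
        "Spec": correct / assigned if assigned else float("nan"),
        "FNrate": wrong / total,
        "Urate": unknown / total,
    }
```

Under this reading a classifier can trade a higher Urate for a higher Spec by declining to assign uncertain EGTs.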

15 Application Example: Comparative Analysis of Four Microbial Coral Reef Communities
In cooperation with Rob Edwards and Forest Rohwer (San Diego State University, California)
Dinsdale et al., submitted

16 Influence of Human Activities on Coral Reef Microbial Communities
[Figure: Northern Line Islands (Kiritimati, Kingman, Palmyra, Tabuaeran) with a human disturbance scale: little, intermediate, high]

17 GO-Term Profiles Indicate Transition in Metabolic Activities
Color indicates abundance of GO-terms in each sample
Significantly different (p < 0.01)

18 Community Structure

19 Taxonomic Profiles Indicate Transition from Prochlorococcus to Synechococcus (most abundant marine Cyanobacteria)

20 Application Example: Comparative Analysis of Three Aquatic Microbial Communities
L. Krause, N. N. Diaz, A. Goesmann, F. Rohwer, S. Kelley, R. A. Edwards and J. Stoye. Taxonomic classification of short environmental DNA fragments. Submitted.

21 Sampling Locations
Rios Mesquites stromatolites, Mexico
San Diego solar salterns, USA
Kingman coral reef, Northern Line Islands
Sample data provided by Forest Rohwer and Robert Edwards

22 Community Structure
pEGTs: prokaryotic fraction of EGTs

23 Community Structure: Genus
pEGTs: prokaryotic fraction of EGTs

24 Taxonomic Diversity
H': Diversity, including richness and evenness (Shannon index)
J: Evenness, relative commonness and rarity of organisms
Sample           Phylum         Class          Order          Genus
                 H'     J       H'     J       H'     J       H'     J
Coral reef       1.2    0.46    1.7    0.55    3.9    0.81    4.2    0.83
Stromatolite     1.2    0.42    1.16   0.37    2.7    0.55    3.6    0.70
Solar Saltern    0.8    0.31    1.0    0.32    1.4    0.28    2.6    0.45
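Both indices can be computed directly from a list of taxonomic assignments. The sketch below assumes the standard Shannon index with natural logarithms and Pielou's evenness (H' divided by the logarithm of the number of observed taxa); the slides do not state the logarithm base, and the function name is hypothetical.

```python
import math
from collections import Counter


def shannon_diversity(assignments):
    """Return (H', J) for a list of taxon labels, one label per classified EGT.

    Assumptions: natural logarithm; J = H' / ln(S) with S the number of
    distinct taxa observed. J is undefined (nan) when only one taxon occurs.
    """
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one assignment is required")

    # H' = -sum_i p_i * ln(p_i), with p_i the proportion of EGTs in taxon i.
    h = -sum((c / total) * math.log(c / total) for c in counts.values())

    n_taxa = len(counts)
    j = h / math.log(n_taxa) if n_taxa > 1 else float("nan")
    return h, j
```

Evenness near 1 means the EGTs are spread almost uniformly over the observed taxa, while values near 0 indicate a community dominated by a few taxa.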

25 Further Applications of CARMA
Diversity of coral reef viruses (in cooperation with Stuart Sandin, Scripps Institution of Oceanography, San Diego, USA)
Wastewater treatment plant plasmid sample (in cooperation with Andreas Schlüter, Bielefeld University)

26 Conclusions
Gene fragments identified using Pfam profile hidden Markov models
Fragments can be assigned to functional role and taxonomic origin
Profiling allows detection of trends in species composition, metabolism, and genetic potential
Pyrosequencing combined with profiling techniques enables rapid and cost-effective assay of microbial communities

27 Acknowledgements
Co-authors: Lutz Krause (1), Robert A. Edwards (2), Forest Rohwer (2), Naryttza N. Diaz (1), Alexander Goesmann (1), Scott Kelley (2), and Alfred Pühler (1)
Also many thanks to: Andreas Schlüter (1), Elisabeth Dinsdale (2), Scott Kelley (2), Beltran Rodriguez-Brito (2), and Christelle Desnues (2)
(1) Bielefeld University, (2) San Diego State University

28 Thank you for your attention!!!!

29 Taxonomic Diversity
Diversity: H'
Evenness: J
p_i: proportion of EGTs classified into i-th taxonomic group
H_max: total number of taxa found
Sample           Phylum         Class          Order          Genus
                 H'     J       H'     J       H'     J       H'     J
Coral reef       1.2    0.46    1.7    0.55    3.9    0.81    4.2    0.83
Stromatolite     1.2    0.42    1.16   0.37    2.7    0.55    3.6    0.70
Solar Saltern    0.8    0.31    1.0    0.32    1.4    0.28    2.6    0.45
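The formulas for diversity and evenness are not part of this transcript. Assuming the standard definitions of the Shannon index and Pielou's evenness, they would read as below. The natural logarithm and the symbol S (number of taxa found) are assumptions introduced here. The slide glosses H_max as the total number of taxa found, whereas in the standard definition assumed here H_max is the logarithm of that number.

```latex
H' = -\sum_{i=1}^{S} p_i \ln p_i ,
\qquad
J = \frac{H'}{H_{\max}} ,
\qquad
H_{\max} = \ln S
```

Under these definitions J ranges from 0 to 1 and equals 1 when all S taxa are equally abundant, since then every p_i = 1/S and H' = ln S.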

