The Sorcerer II Global Ocean Sampling Expedition. Katrine Lekang. Global Ocean Sampling project (GOS), CAMERA, METAREP.
Background Microorganisms in the world’s oceans: what do we know? –They play an important role in the marine ecosystem and in global biogeochemical cycles. –10 million species? How can second-generation sequencing techniques contribute?
Global Ocean Sampling The expedition’s goal is to evaluate microbial diversity in the world’s oceans using the tools and techniques developed to sequence the human and other organisms’ genomes. The aim is to increase knowledge of microbial diversity, which is expected to help explain how ecosystems function and to reveal new genes of ecological and evolutionary importance.
Sampling 200-400 liters of water every 200 miles Filtering
Methods Total DNA was extracted from the 0.1-0.8 µm size fraction. Random-insert clone libraries. End-sequencing of 44 000-420 000 clones per sample (Sanger sequencing).
Development of new tools Fragment recruitment analyses for performing and visualizing comparative genomic analysis when a reference sequence is available. New assembly techniques that use metadata to produce assemblies for uncultivated microbial taxa. A whole-metagenome comparison tool to compare entire samples at an arbitrary degree of genetic divergence.
Assembly Primary assembly: Celera Assembler –Pairs of mated reads were tested for overlap; overlapping mates were merged into a single pseudo-read –Overlap cut-off of 98 % to construct unitigs –Result: fragmented Second assembly: 94 % cut-off. Series of assemblies at various stringencies for subsets of the GOS data.
Fragment recruitment GOS dataset compared with genomes of sequenced microbes (NCBI): 584 reference genomes. BLAST, alignments down to 55 % identity: 70 % of the reads aligned to one or more genomes. –Many with large gaps and low identity. Recruited reads under stringent criteria: 30 % of the reads.
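The recruitment step can be pictured as a filter over read-to-reference alignments. The following is a minimal Python sketch under stated assumptions, not the GOS pipeline: the `Hit` record, its field names, the `min_coverage` parameter and the example values are all hypothetical, and only the 55 % identity floor is taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One read-to-reference alignment (hypothetical record, e.g. parsed from tabular BLAST output)."""
    read_id: str
    genome: str
    ref_start: int     # start coordinate on the reference genome
    identity: float    # percent identity of the alignment
    aligned_len: int   # number of read bases covered by the alignment
    read_len: int      # full length of the read


def recruit(hits, min_identity=55.0, min_coverage=0.0):
    """Return the best passing hit per read, keyed by read id."""
    best = {}
    for h in hits:
        coverage = h.aligned_len / h.read_len
        if h.identity < min_identity or coverage < min_coverage:
            continue
        # Keep only the highest-identity hit for each read.
        if h.read_id not in best or h.identity > best[h.read_id].identity:
            best[h.read_id] = h
    return best


def plot_points(recruited, genome):
    """(reference position, percent identity) pairs for one genome's recruitment plot."""
    return sorted((h.ref_start, h.identity)
                  for h in recruited.values() if h.genome == genome)


# Made-up example hits, for illustration only.
hits = [
    Hit("r1", "genomeA", 100, 98.0, 800, 800),
    Hit("r1", "genomeB", 5000, 70.0, 400, 800),
    Hit("r2", "genomeA", 2500, 60.0, 300, 900),
    Hit("r3", "genomeA", 9000, 50.0, 850, 850),
]
loose = recruit(hits)                                        # lenient: identity floor only
strict = recruit(hits, min_identity=90.0, min_coverage=0.9)  # assumed "stringent" settings
```

Plotting the pairs returned by `plot_points` with reference position on the x-axis and identity on the y-axis gives the basic layout of a recruitment plot. The gap between `loose` and `strict` mirrors the drop from aligned reads to recruited reads reported on the slide.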
Identification of structural variation with metagenomic data Variations in genome structure (rearrangements, duplications, insertions, deletions) can be explored by fragment recruitment. Mated sequencing reads: assessing structural differences between the reference and the environmental sequence. Determines the orientation of, and distance between, two mated sequencing reads. Relative location and orientation of mated reads → metadata that can be used to color-code a fragment recruitment plot.
Fragment recruitment All genome structure variations that are large enough to prevent recruitment can be detected → will be associated with missing mates. Depending on the type of rearrangement present, other recruitment metadata categories will be present near the rearrangements’ endpoint → possible to distinguish among deletions, translocations, inversions and inverted translocations from the recruitment plots.
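The mate-based metadata described above can be sketched as a small classifier. This is an illustrative Python sketch only: the `Placement` record, the category labels and the insert-size bounds (`min_insert`, `max_insert`) are assumptions chosen for the example, not values from the presentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Placement:
    """Where one read recruited on the reference (hypothetical record)."""
    start: int    # leftmost reference coordinate of the alignment
    end: int      # rightmost reference coordinate of the alignment
    strand: str   # "+" or "-"


def classify_mates(a: Optional[Placement], b: Optional[Placement],
                   min_insert: int = 2000, max_insert: int = 6000) -> str:
    """Assign a mate pair to a recruitment category (labels are illustrative)."""
    if a is None and b is None:
        return "unrecruited"
    if a is None or b is None:
        # One read recruited, its partner did not: possible structural difference nearby.
        return "missing_mate"
    if a.strand == b.strand:
        return "same_orientation"
    fwd, rev = (a, b) if a.strand == "+" else (b, a)
    if fwd.start > rev.start:
        # The two reads point away from each other instead of towards each other.
        return "outward_facing"
    span = rev.end - fwd.start
    if span < min_insert:
        return "too_close"
    if span > max_insert:
        return "too_far"
    return "good"


# Made-up placements, for illustration only.
left = Placement(1000, 1800, "+")
right = Placement(3500, 4300, "-")
category = classify_mates(left, right)
```

Mapping each label to a color and drawing the recruited reads in that color gives the kind of color-coded plot the slides describe; clusters of non-"good" categories then mark candidate rearrangement endpoints.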
Extreme assembly of uncultivated populations Assemblies for abundant, uncultivated microbial genera. Assembly approach that aggressively resolves conflicts: ”extreme assembly”. Does not use mate-pairing data – yields contigs –Assembly artefacts. Alternative to an unguided assembly: start from seed fragments that can be identified as belonging to a particular taxonomic group.
Fragment recruitment plots Investigate variation within a group of related organisms. Repeatedly seeding extreme assembly with fragments mated to a SAR11-like 16S sequence.
Sample comparisons A method that assesses the genetic similarity between two samples and can potentially make use of all portions of the genome, not just the 16S rRNA region. Assembly-independent. Estimates the fraction of sequence from one sample that could be considered present in the other sample. Whole-metagenome similarities were computed for all pairs of samples.
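The idea of "fraction of one sample present in the other" can be made concrete with a short Python sketch. It is not the GOS implementation: the input format (best percent identity per read against the other sample), the function names, and the choice to average the two directions into one score are assumptions made for illustration.

```python
def shared_fraction(best_identity, n_reads, min_identity):
    """Fraction of a sample's reads that match the other sample at or above min_identity.

    best_identity maps read id -> best percent identity against the other sample;
    reads with no match at all are simply absent from the mapping.
    """
    if n_reads == 0:
        return 0.0
    matched = sum(1 for ident in best_identity.values() if ident >= min_identity)
    return matched / n_reads


def similarity(a_vs_b, n_a, b_vs_a, n_b, min_identity):
    """Symmetric score: mean of the two directional fractions (an assumed convention)."""
    return 0.5 * (shared_fraction(a_vs_b, n_a, min_identity)
                  + shared_fraction(b_vs_a, n_b, min_identity))


# Made-up numbers: sample A has 4 reads (one without any match), sample B has 2.
a_vs_b = {"a1": 99.0, "a2": 92.0, "a3": 71.0}
b_vs_a = {"b1": 98.5, "b2": 90.0}
n_a, n_b = 4, 2

# The identity cut-off sets the degree of genetic divergence being compared.
scores = {cutoff: similarity(a_vs_b, n_a, b_vs_a, n_b, cutoff)
          for cutoff in (70.0, 90.0, 98.0)}
```

Because the score works directly on reads, no assembly is needed, and raising the cut-off moves the comparison from distantly related to nearly identical sequence.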
Variations in gene abundance Differences in gene content between samples –Can identify functions that reflect the lifestyles of the community in the context of its local environment. Binning of genes into functional categories – TIGRFAM hidden Markov models. Genes predominantly found in a single sample. Differences between temperate and tropical samples. Differences between samples with very similar taxonomy.
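Once genes are binned into functional categories, comparing two samples comes down to comparing per-category counts. The Python sketch below uses a simple log2 ratio of relative abundances with a pseudocount; this is a stand-in chosen for illustration, not the statistical method used in the study, and the family names, counts and threshold are hypothetical.

```python
import math


def relative_abundance(counts):
    """Convert raw per-family counts into fractions of the sample total."""
    total = sum(counts.values())
    return {family: n / total for family, n in counts.items()}


def differential_families(counts_a, counts_b, min_log2_ratio=1.0, pseudocount=1):
    """Families whose relative abundance differs by at least min_log2_ratio (log2 scale).

    Positive values mean enriched in sample A, negative values enriched in sample B.
    The pseudocount avoids division by zero for families seen in only one sample.
    """
    families = set(counts_a) | set(counts_b)
    a = relative_abundance({f: counts_a.get(f, 0) + pseudocount for f in families})
    b = relative_abundance({f: counts_b.get(f, 0) + pseudocount for f in families})
    result = {}
    for f in families:
        ratio = math.log2(a[f] / b[f])
        if abs(ratio) >= min_log2_ratio:
            result[f] = ratio
    return result


# Made-up counts of genes assigned to two hypothetical families.
sample_a = {"family_A": 39, "family_B": 59}
sample_b = {"family_A": 9, "family_B": 89}
enriched = differential_families(sample_a, sample_b)
```

A real analysis would add a significance test and a correction for multiple comparisons before calling a family sample-specific.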
CAMERA Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis http://camera.calit2.net/ A need for a systematic way to explore the structure and function of ocean ecosystems, and their impact on global carbon processing and climate. – Bridge the gap between the rates of collecting data and interpreting it. Monitoring microbial communities in the ocean and their response to environmental changes.
Metadata CAMERA will integrate sequence data with all available metadata. This allows researchers to derive correlations between ecology and the environmental conditions that may favour one community structure or another. Future: metadata from satellites and weather stations can be used to help interpret how these factors affect microbial processes as well as community composition.
New-generation bioinformatics tools Combine bioinformatics tools with large-scale compute resources.
METAREP JCVI Metagenomics Report http://www.jcvi.org/metarep/# Analyze and compare annotated metagenomics datasets Solr/Lucene search engine SQL-like query syntax- filter and refine datasets Functional classification, GO, NCBI taxonomy Statistical tests Analyze function in the context of phylogeny