Published by Beryl Beasley. Modified over 5 years ago.
1
Environmental Genome Shotgun Sequencing of the Sargasso Sea
J. Craig Venter,1* Karin Remington,1 John F. Heidelberg,3 Aaron L. Halpern,2 Doug Rusch,2 Jonathan A. Eisen,3 Dongying Wu,3 Ian Paulsen,3 Karen E. Nelson,3 William Nelson,3 Derrick E. Fouts,3 Samuel Levy,2 Anthony H. Knap,6 Michael W. Lomas,6 Ken Nealson,5 Owen White,3 Jeremy Peterson,3 Jeff Hoffman,1 Rachel Parsons,6 Holly Baden-Tillson,1 Cynthia Pfannkoch, Yu-Hui Rogers,4 Hamilton O. Smith1
Therese Pham, D145 Presentation, January 29, 2019
2
Why the Sargasso Sea?
The study site is the northwest Sargasso Sea, at the Bermuda Atlantic Time-series Study (BATS) site.
It is one of the best-studied regions of the global ocean.
It has been intensively studied as part of a 50-year time series study on ocean physics and biogeochemistry.
These circumstances create the opportunity to interpret genomic data from an oceanic environment.
3
Environmental and Microbial characteristics of the Sargasso Sea
Nutrient-poor, open-ocean environment.
Bounded in the north and west by the Gulf Stream, which keeps the nutrient-poor waters of the Sargasso Sea separate from the more nutrient-rich waters of the U.S. continental shelf.
In the winter, cold fronts travel across the region, eroding the seasonal thermoclines and causing convective mixing. This results in subtropical mode waters … m in depth, with a more nutrient-rich profile at the surface level.
The mixing causes phytoplankton blooms, with Synechococcus and Prochlorococcus contributing the most numerically to overall photosynthetic biomass.
Mode water: a water mass that is nearly vertically homogeneous ("subtropical" refers to the region).
Thermocline: the transition layer between the warmer upper layer of water (warmed by the sun, with the warmth then mixed and distributed by waves and wind) and the cooler waters at the deeper levels of the ocean.
4
Collection of samples
… liters of surface water were collected from each of three sites off the Bermuda coast in February 2003 aboard the RV Weatherbird II.
Later, additional samples were collected in May 2003 aboard the SV Sorcerer II.
Fig. 1
5
DNA extraction
Genomic DNA was extracted by filtering seawater through a series of micron filters.
The filters were then quartered and incubated in TE buffer containing lysozyme.
Sodium dodecyl sulfate (SDS) was added, and the samples were then put through three freeze/thaw cycles.
The lysate was then treated with Proteinase K to digest proteins.
TE buffer: Tris/EDTA buffer, pH 8.
Lysozyme: an enzyme that catalyzes the destruction of the cell walls of certain bacteria, occurring notably in tears and egg white.
This is a fairly standard DNA extraction, according to outside resources.
6
Whole Genome Shotgun Sequencing (WGS)
First, the DNA was randomly sheared.
It was then blunted with consecutive BAL31 nuclease and T4 DNA polymerase treatments.
This was followed by size selection by electrophoresis on 1% low-melting-point agarose.
It was then purified by three rounds of gel electrophoresis to remove excess adapters.
Inserts of 2-6 kb were ligated into BstXI-linearized plasmid vector and cloned.
Blunting: a process by which the single-stranded overhang created by a restriction digest is either "filled in", by adding nucleotides on the complementary strand using the overhang as a template for polymerization, or "chewed back", using an exonuclease activity. Vectors and inserts are often blunted to allow non-compatible ends to be joined. Sequence information is lost or distorted by doing this.
BAL-31 nuclease: an exonuclease that degrades both 3' and 5' termini of duplex DNA without generating internal scissions. The enzyme is also a highly specific single-stranded endonuclease that cleaves at nicks, gaps and single-stranded regions of duplex DNA and RNA (1,2).
T4 DNA polymerase: a template-dependent DNA polymerase that catalyzes 5'-3' synthesis from primed single-stranded DNA. The enzyme has a 3'-5' exonuclease activity but lacks 5'-3' exonuclease activity.
7
WGS cont’d
The prepared plasmid clones were sequenced on ABI 3730XL DNA sequencers, which are 96-capillary sequencers.
The process involves incorporating fluorescent ddNTPs; the DNA is then separated by size through thin capillaries.
The plasmid clones were sequenced from both ends to provide paired-end reads.
These sequences were then assembled computationally by the Celera Assembler.
The paper doesn’t specify whether fluorescent primers or terminators were used.
1. Trace files (dye signals) are analyzed and bases are called to create chromatograms.
2. Chromatograms from opposite strands are reconciled with software to create double-stranded sequence data.
This was high throughput for the time.
8
Figure 5A
Fig 5A: Global structure of a scaffold with respect to assembly. Blue segments, contigs; green segments, fragments; yellow segments, stages of the assembly of fragments into the resulting contigs.
9
Figure 5B
Fig 5B: A sample of the multiple sequence alignment.
10
Results of the sequencing and assembly
WGS of the Weatherbird II samples produced 1.66 million reads (1.36 Gbp).
An additional 325,561 sequences were generated from the Sorcerer II samples (265 Mbp).
Assembly of the Weatherbird II sequences generated 64,398 scaffolds (826 bp to 2.1 Mbp) and 215,015 unassembled paired ends, termed mini-scaffolds. There were also 215,038 unassembled singleton reads.
The Sorcerer II sequences provided almost no assembly: only 153,458 mini-scaffolds and 18,692 singleton reads.
Scaffold: in genomic mapping, a series of contigs that are in the right order but not necessarily connected in one continuous stretch of sequence.
The lack of overlapping reads within the unassembled set indicates that the lack of additional assembly was due not to algorithmic limitations but to the relatively limited depth of sequencing coverage given the level of diversity within the sample.
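As a quick sanity check on the read counts and base totals quoted on this slide, the mean read length can be worked out by simple division. The sketch below is only illustrative arithmetic on the rounded slide figures; the function and variable names are made up for this example and do not come from the paper.

```python
# Read counts and total bases as quoted on the slide (rounded figures).
datasets = {
    "Weatherbird II": {"reads": 1_660_000, "bases": 1.36e9},
    "Sorcerer II": {"reads": 325_561, "bases": 265e6},
}


def mean_read_length(reads: int, bases: float) -> float:
    """Average number of bases per read."""
    return bases / reads


for name, d in datasets.items():
    length = mean_read_length(d["reads"], d["bases"])
    print(f"{name}: about {length:.0f} bp per read")
```

Both data sets come out at a little over 800 bp per read (roughly 819 bp and 814 bp), which suggests the two sampling efforts produced reads of similar length.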
11
Analysis of mapped scaffolds; photobiology
The sequenced and assembled DNA was compared to published sequences.
Notable organisms identified include Shewanella, a group of scaffolds similar to Burkholderia, and of course Prochlorococcus, which had already been established as abundantly present.
There was, however, a lack of scaffolds relating to Synechococcus within the pooled data, but this is likely due to Synechococcus being larger and less distributed within the waters sampled.
Scaffold taxonomy: the spans of the scaffold were associated with the appropriate NCBI taxonomies through the informative BLAST hits. The global percentage of each taxon present on the scaffold was determined. If the most prevalent taxon covered 20%+ more sequence or was 4 times more common than the second most prevalent taxon, then the scaffold was assigned to that taxon at the given taxonomic level. If these conditions were not met, each taxon was transformed to a higher, more general taxonomic level and the termination conditions were retested.
The recent discovery of a homolog of bacteriorhodopsin in an uncultured γ-Proteobacteria from Monterey Bay revealed the basis of a form of phototrophy in marine systems (38) that was observed previously by oceanographers (39, 40).
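The scaffold taxonomy rule on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the input layout (a list of lineage tuples with covered base pairs), and the reading of "20%+ more sequence" as a gap of 20 percentage points of the covered span are all assumptions made for this example.

```python
from collections import Counter


def assign_taxon(hits, margin=0.20, ratio=4.0):
    """Assign a scaffold to a taxon, backing off to more general ranks.

    hits: list of (lineage, covered_bp) pairs, where lineage is a tuple
    ordered from most general to most specific rank (hypothetical layout).
    Returns (level_index, taxon), or None if no level satisfies the rule.
    """
    if not hits:
        return None
    total = sum(bp for _, bp in hits)
    if total <= 0:
        return None
    # Only ranks present in every lineage can be compared.
    depth = min(len(lineage) for lineage, _ in hits)
    # Start at the most specific shared rank and move toward the root.
    for level in range(depth - 1, -1, -1):
        coverage = Counter()
        for lineage, bp in hits:
            coverage[lineage[level]] += bp
        ranked = coverage.most_common(2)
        top_taxon, top_bp = ranked[0]
        second_bp = ranked[1][1] if len(ranked) > 1 else 0
        # Assumed reading of the rule: a lead of at least 20 percentage
        # points, or at least four times the runner-up's coverage.
        if (top_bp - second_bp) / total >= margin or top_bp >= ratio * second_bp:
            return level, top_taxon
    return None


example_hits = [
    (("Bacteria", "Proteobacteria", "Shewanella"), 500),
    (("Bacteria", "Proteobacteria", "Burkholderia"), 450),
    (("Bacteria", "Cyanobacteria", "Prochlorococcus"), 50),
]
print(assign_taxon(example_hits))
```

In this toy input neither genus dominates, so the rule backs off one level and assigns the scaffold to Proteobacteria. The coverage numbers are invented purely to exercise the back-off step.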
12
Prochlorococcus
13
Figure 3
14
Bacteriorhodopsin; a closer look
Bacteriorhodopsin allows for the coupling of light energy harvesting and carbon cycling in the ocean through a non-chlorophyll-based pathway.
Environmental culture-independent gene surveys with PCR have revealed about 67 additional closely related proteorhodopsin homologs.
More than 650 rhodopsin homologs were identified within the Weatherbird II samples, and an additional 132 were identified in the Sorcerer II samples.
Homologous recombination is used in horizontal gene transfer to exchange genetic material between different strains and species of bacteria and viruses.
15
Phylogenetic tree of rhodopsin-like genes
Fig 7. (i) All homologs of halorhodopsin were identified in the predicted proteins from the Sargasso Sea assemblies using BLASTp searches, with representatives of previously identified halorhodopsin-like protein families as query sequences. (ii) All sequences greater than 75 amino acids in length were aligned to each other using CLUSTALW (a multiple sequence alignment tool), and a neighbor-joining phylogenetic tree was inferred using the protdist and neighbor programs of PHYLIP.
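The length cutoff in the Fig 7 procedure (keeping only sequences longer than 75 amino acids before alignment) is easy to picture as a small filter over FASTA records. The sketch below covers only that filtering step; the BLASTp search, the CLUSTALW alignment, and the PHYLIP tree inference are not reproduced, and the function names and toy sequences are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {record_id: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Use the first word of the header as the record identifier.
            name = header[0] if header else ""
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def keep_long(records, min_len=75):
    """Keep only sequences strictly longer than min_len residues."""
    return {n: s for n, s in records.items() if len(s) > min_len}


# Toy input: one 80-residue and one 75-residue placeholder sequence.
toy_fasta = (
    ">candidate_a toy record\n" + "M" * 40 + "\n" + "M" * 40 + "\n"
    ">candidate_b toy record\n" + "M" * 75 + "\n"
)
print(sorted(keep_long(parse_fasta(toy_fasta))))
```

Only the first toy record survives, because the cutoff is read here as strictly greater than 75 residues; the surviving set is what would be handed to the alignment step.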
16
Table 1
Other genes were also identified, some of which had not been archived in the curated SwissProt database.
UniProtKB/Swiss-Prot is the manually annotated and reviewed section of the UniProt Knowledgebase (UniProtKB). It is a high-quality, annotated, non-redundant protein sequence database that brings together experimental results, computed features and scientific conclusions.
17
Analysis of mapped scaffolds; plasmids
Megaplasmids are extrachromosomal genetic elements in the size range of 100 kbp and larger.
Evidence was found for six putative plasmids larger than 100 kbp in length, two plasmids of 70 to 80 kbp, and two plasmids under 10 kbp.
UmuCD is a DNA damage-induced DNA polymerase of Escherichia coli, and these genes could play a role in ultraviolet (UV) resistance.
Fig. 4
18
Analysis of mapped scaffolds; Bacteriophages
Only double-stranded DNA bacteriophages were observable.
71 scaffolds greater than 10 kb in length contained identifiable clusters of phage genes.
Burkholderia- and Shewanella-associated scaffolds accounted for ~33% of these.
19
PCR based methods vs. WGS
PCR-based methods have two major limitations:
They undersample the total number of genotypes.
They access only a very small subsample of the genomes.
WGS provides greater depth of knowledge on cell function.
However, WGS also has drawbacks:
There are gaps due to limitations in sequencing resolution.
The size of the genome that needs to be assembled proves a sizable roadblock as well.
20
Figure 6
Fig 6: The relative contribution of organisms from different major phylogenetic groups (phylotypes) was measured using multiple phylogenetic markers that have been used previously in phylogenetic studies of prokaryotes: 16S rRNA, RecA, EF-Tu, EF-G, HSP70, and RNA polymerase B (RpoB).
21
Critiques of the study
“Shotgun Sequencing in the Sea: A Blast from the Past?” by Paul G. Falkowski and Colomban de Vargas offers some critiques of the study:
Despite the large-scale sequencing effort, Venter et al. were able to reconstruct only two almost-complete genomes, even with the help of fully sequenced templates available in the microbial genome database.
What about sequencing larger eukaryotic microbes such as diatoms and dinoflagellates?
PCR-based methods plus 18S rDNA analysis provided a greater number of new and divergent phylotypes.
Many labs at the time did not have access to the same technologies that Venter et al. did.
22
A good additional figure to consider
23
Additional Reading
“The sequence of the human genome”
Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., ... & Gocayne, J. D. (2001). The sequence of the human genome. Science, 291(5507).
“Ocean time-series reveals recurring seasonal patterns of virioplankton dynamics in the northwestern Sargasso Sea”
Parsons, R. J., Breitbart, M., Lomas, M. W., & Carlson, C. A. (2012). Ocean time-series reveals recurring seasonal patterns of virioplankton dynamics in the northwestern Sargasso Sea. The ISME Journal, 6(2), 273.
“The Genome Warrior”
Preston, R. (2000, June). The Genome Warrior. The New Yorker, p. 66.
24
Bibliography
Falkowski, P. G., & de Vargas, C. (2004). Shotgun sequencing in the sea: a blast from the past? Science, 304(5667).
Venter, J. C., Remington, K., Heidelberg, J. F., Halpern, A. L., Rusch, D., Eisen, J. A., ... & Fouts, D. E. (2004). Environmental genome shotgun sequencing of the Sargasso Sea. Science, 304(5667).