Environmental Genome Shotgun Sequencing of the Sargasso Sea


1 Environmental Genome Shotgun Sequencing of the Sargasso Sea
J. Craig Venter, Karin Remington, John F. Heidelberg, Aaron L. Halpern, Doug Rusch, Jonathan A. Eisen, Dongying Wu, Ian Paulsen, Karen E. Nelson, William Nelson, Derrick E. Fouts, Samuel Levy, Anthony H. Knap, Michael W. Lomas, Ken Nealson, Owen White, Jeremy Peterson, Jeff Hoffman, Rachel Parsons, Holly Baden-Tillson, Cynthia Pfannkoch, Yu-Hui Rogers, Hamilton O. Smith
By: Madona Masoud and Daria Nahidipour

2 Background In the 1990s, microbial diversity was studied using PCR amplification of DNA. Messing came up with the first description of shotgun sequencing, and the current shotgun sequencing approach was developed by the Sanger lab. Venter used this method to sequence the human genome; he was NOT affiliated with the Human Genome Project. This study was the first time whole-genome shotgun sequencing was used to study ocean microbial diversity at such a large scale.

3 What’s in the deep blue Sargasso Sea?
Located in the middle of the North Atlantic Ocean. It has no land boundaries and is over 1000 miles wide and 3000 miles long. One of the best-studied regions of the global ocean. Nutrient-limited. Fig. 1. Sampling sites are shown in this figure.

4 Hypothesis The genetic diversity of oceanic microorganisms in the Sargasso Sea would be limited, because it is a nutrient-limited region.

5 Collection The fleet: RV Weatherbird II and SV Sorcerer II
A total of ~1500 L of surface seawater was collected. Filters were sized to isolate microbes and to exclude eukaryotic cells (too large) and viruses (too small). Genomic DNA was extracted from the filters and prepared for sequencing.

6 Whole-Genome Shotgun [WGS] Sequencing

7 Genomic libraries were constructed (inserts 2-6 kb). DNA is isolated from the organisms, then fragmented. Fragmentation uses double-stranded DNA, which allows paired-end sequencing [sequencing of each fragment from both ends of the clone]. The DNA fragments are incorporated into cloning vectors, so each cloning vector carries a random fragment. The sequenced fragments are assembled into contigs by finding reads that overlap one another; with enough supporting reads, the contigs can be joined into the complete sequence. The complete sequence is then annotated to give the final sequence.
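The library step on this slide can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names, the toy genome, and the insert and read lengths are hypothetical and far smaller than the 2-6 kb inserts used in the study. It picks random inserts from a genome and reads a fixed number of bases inward from each end of every insert, which is what paired-end sequencing of a clone provides.

```python
import random

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))

def shotgun_paired_reads(genome, n_clones, insert_min, insert_max, read_len, seed=0):
    """Simulate clones with random inserts and read inward from both ends.

    Returns a list of (forward_read, reverse_read, insert_size) tuples.
    All parameters are illustrative, not values from the study.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_clones):
        size = rng.randint(insert_min, insert_max)
        start = rng.randint(0, len(genome) - size)
        insert = genome[start:start + size]
        forward = insert[:read_len]                      # read from one end
        reverse = reverse_complement(insert)[:read_len]  # read from the other end, opposite strand
        pairs.append((forward, reverse, size))
    return pairs

if __name__ == "__main__":
    rng = random.Random(42)
    toy_genome = "".join(rng.choice("ACGT") for _ in range(300))
    for fwd, rev, size in shotgun_paired_reads(toy_genome, 3, 40, 60, 12):
        print(size, fwd, rev)
```

Because both reads come from the same insert, an assembler knows roughly how far apart they should lie, which helps order and orient contigs.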

8 Summary of WGS Sequencing
Advantages: Improved speed and cost. Labor-saving, because sequencing reactions and assembly are fully automated. Enhanced coverage of the genome. Detection of large numbers of DNA polymorphisms. Approximation of the abundance of each species. Disadvantages: DNA sequences may be misplaced when DNA fragments are linked incorrectly. Repetitive sequences cause gaps in the DNA assembly, because the size of a repetitive sequence is not known unless a map was made before the one being constructed.

9 After cloning, a cycle begins: the fragments are sequenced, and de novo assembly of the reads takes the coverage of the data into account. The cycle ends with a new genetic map being constructed, or with the new genes being integrated into an existing map, by assembling the contigs together.

10 Celera Assembler Celera Assembler is a de novo whole-genome shotgun (WGS) DNA sequence assembler. It reconstructs long sequences of DNA from the short, fragmented data produced by WGS sequencing, without the use of a reference genome. De novo assemblers are most commonly used in bioinformatics studies to assemble genomes. More info: assembler.sourceforge.net/wiki/index.php?title=Main_Page
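To make the idea of de novo assembly concrete, here is a toy greedy overlap-merge assembler. It is not the Celera Assembler's algorithm, only a minimal sketch of the general principle of joining reads by their overlaps without a reference genome. The function names and the example reads are hypothetical, and it ignores sequencing errors, reverse-strand reads, and repeats.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def drop_contained(seqs):
    """Remove duplicates and any sequence that lies entirely inside another."""
    unique = list(dict.fromkeys(seqs))
    return [s for s in unique if not any(s != o and s in o for o in unique)]

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two sequences with the largest overlap."""
    contigs = drop_contained(reads)
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        rest = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs = drop_contained(rest + [merged])

if __name__ == "__main__":
    reads = ["AAGCTTG", "TTGACCG", "ACCGTAT"]
    print(greedy_assemble(reads))
```

A greedy merge like this breaks down on repetitive sequences, which is one reason production assemblers use more elaborate strategies and paired-end information.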

11 Old technique - PCR based rRNA Sequencing
Some rRNA genes are not amplified by their designated primers. Estimates of abundance, diversity, and taxonomic relations based on rRNA gene numbers are inaccurate. Example of PCR-based rRNA sequencing, from a study of skin microbes: the microbes were collected, their DNA was isolated, and the 16S rRNA genes were amplified by PCR. The amplified genes were then sequenced. sample -> PCR amplification -> sequence read. Many phylogenetic studies have used PCR-based rRNA sequencing, but the method has problems. The main one is primer bias: primers fail to amplify certain rRNA genes, so some rRNA genes go undetected, whereas shotgun sequencing identified new, distinct small subunit rRNA genes. Using 97% and 99% sequence-similarity cutoffs [not completely accurate], the shotgun data showed novel phylotypes when compared with the ones in the RDP II database. A second issue is that every cell needs rRNA genes, but the number of copies per genome differs between species, so rRNA gene counts do not translate directly into cell abundance.
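The similarity cutoffs mentioned on this slide can be sketched in code. This is a simplified illustration, not the procedure used in the study: it assumes the rRNA sequences are already aligned to the same length, measures identity as the percentage of matching positions, and groups sequences greedily around the first member of each group. The function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)

def cluster_phylotypes(sequences, cutoff=97.0):
    """Greedy clustering of sequences into phylotypes.

    A sequence joins the first cluster whose representative (its first
    member) is at least `cutoff` percent identical; otherwise it starts
    a new cluster.
    """
    clusters = []
    for seq in sequences:
        for cluster in clusters:
            if percent_identity(seq, cluster[0]) >= cutoff:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters

if __name__ == "__main__":
    base = "ACGT" * 25                # 100 positions
    one_diff = "T" + base[1:]         # differs from base at 1 position
    four_diff = "TTTA" + base[4:]     # differs from base at 4 positions
    for cutoff in (97.0, 99.5):
        groups = cluster_phylotypes([base, one_diff, four_diff], cutoff)
        print(cutoff, len(groups))
```

Raising the cutoff splits the same sequences into more phylotypes, which is why any count of phylotypes has to be reported together with the cutoff used.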

12 Gene Conservation of Prochlorococcus
Outermost circle: the completed sequence of the Prochlorococcus marinus genome. This species is one of the most abundant photosynthetic organisms on the planet and is responsible for a significant fraction of photosynthesis in the oceans. Inner circles: fragments obtained from environmental sequencing. Black regions are not conserved. Differently colored regions indicate that chromosomal rearrangements took place. Takeaway: genomic differences are present even among closely related organisms in the same genus.

13 Comparing sequences Scaffolds from WGS sequencing of the Sargasso Sea were compared to a known sequence using tBLASTx, which identifies nucleotide sequences similar to the query based on their coding potential. The Sargasso Sea scaffolds are compared to clone 4B7, which came from the North Pacific, a distant and very different environment. Each sequence being compared is an individual scaffold obtained from WGS sequencing of the Sargasso Sea. The red dots represent the similarity of the scaffold to the known sequence. The figure is mostly red, indicating a lot of similarity to the known sequence.
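Comparing sequences "based on their coding potential" means comparing them at the protein level: tblastx translates both nucleotide sequences in all six reading frames before searching for similarity. The sketch below shows only that translation step, using the standard genetic code. It is not BLAST itself, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(seq):
    """Reverse complement; any base other than A, C, G, T becomes N."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs.get(base, "N") for base in reversed(seq))

def translate(seq):
    """Translate complete codons from the start of seq; unknown codons become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))

def six_frame_translations(seq):
    """Return the protein translation of all six reading frames of seq."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames

if __name__ == "__main__":
    for name, protein in six_frame_translations("ATGGCCTAA").items():
        print(name, protein)
```

Working at the protein level lets distantly related sequences match even when their DNA has diverged, for example through synonymous codon changes.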

14 9 megaplasmids Genes encoded in the forward and reverse direction
Paired-end reads of the prepared plasmid clones. Different colors show the gene function: magenta = fatty acid and phospholipid metabolism; white = genes with no known homology; grey = genes of unknown function.

15 Sequence diversity of rhodopsin-like genes
Phylogenetic tree of rhodopsin-like genes in the Sargasso Sea, along with homologs of the genes. Proteorhodopsin is a membrane protein found in marine bacteria that provides a novel mechanism to harness light energy. Phylogenetic analysis reveals that the additional sequences identified in this study represent a great increase in the sequence diversity of these genes compared to previous findings. Red indicates sequences from uncultured species collected from the Sargasso Sea, meaning species that have not yet been grown in a laboratory because not enough is known about their biology.

16 Results Discovered 148 novel bacterial phylotypes (97% sequence similarity cutoff). Identified over 1.2 million previously unknown genes, including new rhodopsin-like photoreceptors. The large variation in the species present suggests high oceanic microbial diversity. The genes found are published online and available to the public.

17 Conclusion The researchers found far more genetic diversity in this nutrient-poor region of the ocean than hypothesized. WGS sequencing has a new use as a tool for environmental sequencing to find new species. We may discover new microbes that can create alternative sources of energy.

18 Limitations Abundant species have great coverage, but less common species are represented by only a few sequences. Sample collection: the filtering technique was not fully effective, because eukaryotic DNA was found (18S rRNA). Over-collapsing occurs when well-conserved regions assemble across species and cause scaffolds to break, giving a suboptimal assembly.
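The coverage limitation can be made concrete with a back-of-the-envelope estimate. Assuming reads are drawn at random in proportion to how much DNA each species contributes, the expected coverage of a genome is (number of reads x read length) / genome size, and under the Lander-Waterman model the fraction of a genome left unsequenced at coverage c is about e^(-c). The community composition and read numbers below are hypothetical, not values from the study.

```python
import math

def expected_coverage(n_reads, read_len, genome_size):
    """Average number of times each base of a genome is sequenced."""
    return n_reads * read_len / genome_size

def fraction_unsequenced(coverage):
    """Lander-Waterman estimate of the genome fraction not covered by any read."""
    return math.exp(-coverage)

def community_coverage(total_reads, read_len, community):
    """Expected coverage per species in a mixed sample.

    `community` maps a species name to (fraction of sample DNA, genome size in bp).
    Reads are assumed to be split in proportion to the DNA fraction.
    """
    return {
        name: expected_coverage(total_reads * fraction, read_len, size)
        for name, (fraction, size) in community.items()
    }

if __name__ == "__main__":
    # Hypothetical two-species community with equal genome sizes.
    community = {"abundant": (0.5, 2_000_000), "rare": (0.001, 2_000_000)}
    for name, cov in community_coverage(100_000, 800, community).items():
        print(f"{name}: coverage {cov:.2f}, unsequenced {fraction_unsequenced(cov):.1%}")
```

In this toy case the abundant species is sequenced many times over, while almost all of the rare species' genome is never read, which matches the limitation described on the slide.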

19 Supplemental Readings
1. Rusch et al. The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific. PLOS Biology. 5:
2. Moran et al. Genome sequence of Silicibacter pomeroyi reveals adaptations to the marine environment. Nature. 432:910–913.
3. Jensen et al. Application of whole genome shotgun sequencing for detection and characterization of genetically modified organisms and derived products. Bioanal Chem. 408:
4. McDermott et al. Whole-Genome Sequencing for Detecting Antimicrobial Resistance in Nontyphoidal Salmonella. Antimicrob Agents Chemother. 60:

