Viral Genomics Allie Evans Colin Lappala Chelsea Layes Sheena Scroggins.


The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific. Rusch DB, Halpern AL, Sutton G, Heidelberg KB, Williamson S, et al. PLoS Biology Vol. 5, No. 3, e77 doi: /journal.pbio
The Sorcerer II Global Ocean Sampling Expedition: Expanding the Universe of Protein Families. Yooseph S, Sutton G, Rusch DB, Halpern AL, Williamson SJ, et al. PLoS Biology Vol. 5, No. 3, e16 doi: /journal.pbio
The Sorcerer II Global Ocean Sampling Expedition: Metagenomic Characterization of Viruses within Aquatic Microbial Samples. Shannon J. Williamson, Douglas B. Rusch, Shibu Yooseph, Aaron L. Halpern, Karla B. Heidelberg, John I. Glass, Cynthia Andrews-Pfannkoch, Douglas Fadrosh, Christopher S. Miller, Granger Sutton, Marvin Frazier, J. Craig Venter

Baltimore Classification of Viruses: dsDNA, ssDNA, dsRNA, +ssRNA, -ssRNA, ssRNA-RT, dsDNA-RT

Bacteriophages Viruses that infect bacteria. Numerically dominant type of virus in the oceans.

Cyanophages Viruses that infect cyanobacteria such as Prochlorococcus. These viruses have acquired and retained photosynthesis genes.

Phage Cycles

Lateral gene transfer

Metagenomics Contribution of viral genomes to microbial environmental processes studied through metagenomic techniques. Metagenomics enables us to study microorganisms by examining DNA that is extracted directly from communities of environmental microorganisms

Metagenomic Challenges Inefficiencies in sampling DNA extraction methods Construction of libraries Inadequacies in data analysis and visualization tools Low abundance species overlooked Lack of reference genomes Sequencing complex environments cost prohibitive Standardizing metadata

Methods First: Cruise the world. Collect L of seawater from each of 37 different stations. Record pH, salinity, temperature, etc. of the water.

Methods Pass water through 2.0, 0.8, and 0.1 µm filters, then use tangential flow filtration (TFF) down to 50 kDa for the viral concentrate. Store at -20°C until shipment from next port.

Sequencing Preparation Extract DNA Nebulize DNA –Average of kb fragments Gel electrophoresis extraction –purify and determine lengths Subclone into E. coli Colonies selected for inserts Shotgun sequence inserts

Sequencing End sequence each insert –Average of 822 bp sequenced per end

Metagenomic Assembly Same procedure as in humans, Drosophila, dogs, etc. (Venter et al., 2001): unitigs using 98% or 94% homology for overlap, then scaffolding, then consensus sequence.

Metagenomic Assembly New uses for shotgun sequencing and assembly: multiple organisms at once, likely novel organisms. Problems? Mate-pair data relied on more heavily, since overlap coverage is low or unknown. Need verification of assembly somehow.

Metagenomic Assembly Created multiple distinct assemblies: –98% homology unitigs –94% homology unitigs –non-preassembled end-pairs at various stringencies for multiple sequence alignments. Multiple assemblies allowed cross-referencing and quality assurance.
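
A rough illustration of what an overlap threshold such as 98% or 94% means when joining reads. This is only a sketch: it scores ungapped suffix/prefix overlaps by percent identity, whereas the actual assembler used in the study is not described in these slides and would also handle gaps and quality values. The function names and the `min_len` parameter are hypothetical.

```python
def overlap_identity(read_a, read_b, min_len=20):
    """Best ungapped overlap between the end of read_a and the start of read_b.

    Returns (overlap_length, identity), or None if the reads are shorter
    than min_len. Ties in identity are resolved in favor of the longer overlap.
    """
    best = None
    for length in range(min(len(read_a), len(read_b)), min_len - 1, -1):
        suffix, prefix = read_a[-length:], read_b[:length]
        matches = sum(x == y for x, y in zip(suffix, prefix))
        identity = matches / length
        if best is None or identity > best[1]:
            best = (length, identity)
    return best


def overlaps_at(read_a, read_b, threshold, min_len=20):
    """True if the best overlap reaches the identity threshold (e.g. 0.98 or 0.94)."""
    result = overlap_identity(read_a, read_b, min_len)
    return result is not None and result[1] >= threshold


# Hypothetical reads sharing a 50-base region.
CORE = "ACGTTGCAAGCTAGCTTAGGCTAACGATCGATTACGGCTAAGCTTGCAAT"
READ_A = "G" * 10 + CORE
READ_B = CORE + "C" * 10
# Same as READ_B but with one substitution inside the shared region.
READ_B_MUT = CORE[:25] + "A" + CORE[26:] + "C" * 10
```

Running the assembly at both a strict and a relaxed threshold, as the slide describes, would amount to calling `overlaps_at` with two different `threshold` values and comparing which reads join.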

Taxonomic Assignment Protein-ORF based strategy. 5.6 million sequences from GOS. All ORFs in the same sequence scaffold compared to the NCBI protein database using BLAST. Votes tallied from each ORF into pools for the scaffold: Archaea, Bacteria, Eukaryota, Viral. 5.0 million sequences assigned using this method.
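
The vote tallying can be pictured with a short sketch. It assumes each ORF contributes one vote for the kingdom of its best BLAST hit and that the scaffold goes to the pool with the most votes. The tie handling (leave the scaffold unassigned) and the function name are assumptions for illustration, not details given in the slides.

```python
from collections import Counter

POOLS = ("Archaea", "Bacteria", "Eukaryota", "Viral")


def assign_scaffold(orf_votes):
    """Assign a scaffold to a pool from per-ORF votes.

    orf_votes: one entry per ORF on the scaffold, either a pool name
    (the kingdom of the ORF's best hit) or None when the ORF had no hit.
    Returns the winning pool, or None if there are no votes or a tie.
    """
    tally = Counter(vote for vote in orf_votes if vote in POOLS)
    if not tally:
        return None
    ranked = tally.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # assumed rule: a tie leaves the scaffold unassigned
    return ranked[0][0]
```

For example, a scaffold whose ORFs vote `["Viral", "Viral", "Bacteria", None]` would land in the Viral pool under this rule.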

Quantitative PCR How many copies of the studied genes exist: from station to station? versus one another?

Quantitative PCR Level of fluorescence checked after each PCR cycle. Initial amount can be inferred using a standard curve. Multiple dilutions allow comparison. Outcome reported only if: ten-fold above no-template negative control AND dilution results in 3-30 more than dilution
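
A minimal sketch of how a standard curve turns a threshold cycle (Ct) into a starting copy number, assuming the usual linear relation between Ct and log10 of the initial quantity. The standards below are invented round numbers chosen for illustration only and are not data from the study. The function names are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Invented dilution series of a known standard (copies per reaction, Ct).
STANDARD_COPIES = [1e3, 1e4, 1e5, 1e6]
STANDARD_CT = [29.5, 26.0, 22.5, 19.0]

slope, intercept = fit_standard_curve(STANDARD_COPIES, STANDARD_CT)
estimate = copies_from_ct(24.25, slope, intercept)
```

A lower Ct means the fluorescence threshold was crossed earlier, so the fitted slope is negative and samples with more starting template map to larger estimates.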

Clustering and Phylogeny Proteins clustered and compared to NCBI –Sequence alignments, not just domains –Gene families bolstered with new genes Phylogenetic trees generated –Multiple sequence alignments with CLUSTALW –Used only long, fairly homologous samples PHYLIP used to build trees –Based on distance matrix

Results 37 marine surface water samples collected 7.7 million sequencing reads were produced Identified 154,662 viral peptide sequences

Identification of Viral Sequences Data from microbial fraction of water samples was examined Looked for viral sequences by comparison to the NCBI non-redundant protein database 154,662 viral peptide sequences were identified Approximately 3% of predicted proteins were identified as viral sequences Number of viral sequences thought to be largely underestimated

Classification through Protein Clustering Of 154,662 viral peptide sequences, 117,123 (76%) fell within 380 protein clusters containing at least 20 proteins. Remaining sequences fell within clusters containing fewer than 20 proteins. Average cluster contained 258 peptide sequences.

Neighbor Functional Linkage Analysis Used to verify that the host-derived genes were on viral genomes rather than on proviral regions of bacterial genomes. Proportion of viral same-scaffold ORFs ranged from 32% to 92% for the metabolic gene families studied. Occurrence of viral neighbors on the same scaffolds as host-derived viral genes supports the hypothesis that the sources of the sequences are viral rather than bacterial.
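
One way to read the "proportion of viral same-scaffold ORFs" is sketched below: for every scaffold that carries a member of the gene family of interest, count the other ORFs on that scaffold and report the fraction classified as viral. The exact definition used in the study is not given in the slides, so this is an assumed interpretation. The data layout, function name, and example scaffolds are all hypothetical.

```python
def viral_neighbor_fraction(scaffolds, family):
    """Fraction of neighboring ORFs that are viral, for one gene family.

    scaffolds: dict mapping scaffold id -> list of (gene_family, taxon) per ORF.
    Only scaffolds containing the family are considered, and the family's
    own ORFs are excluded from the count. Returns None if no neighbors exist.
    """
    viral = 0
    total = 0
    for orfs in scaffolds.values():
        if not any(name == family for name, _ in orfs):
            continue
        for name, taxon in orfs:
            if name == family:
                continue
            total += 1
            if taxon == "Viral":
                viral += 1
    return viral / total if total else None


# Hypothetical scaffolds, invented for illustration.
EXAMPLE_SCAFFOLDS = {
    "scaffold_1": [("psbD", "Viral"), ("capsid", "Viral"),
                   ("tail_fiber", "Viral"), ("unknown", "Bacteria")],
    "scaffold_2": [("psbD", "Bacteria"), ("rpoB", "Bacteria")],
    "scaffold_3": [("terminase", "Viral")],
}
```

A high fraction for a family such as psbD would mean its copies mostly sit next to viral genes, which is the pattern the slide cites as support for a viral origin.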

Quantitative PCR qPCR used on DNA collected from 5 sampling locations. Yields were initially too low, so samples were pooled. Viral gene families psbD, petE, speD, talC, pstS, and phoH were included. Results indicate that host-derived viral genes are viral in nature. Viral genes encoding environmentally significant host-specific functions are prevalent in aquatic samples.

Phylogenetic Analyses Figure 2. Phylogenetic trees of all GOS and publicly available psbA (A) and psbD (B) sequences. BS indicates bootstrap values. GOS and public viral sequences are colored aqua and pink, respectively. GOS and public prokaryotic sequences are navy blue and lime green, respectively. doi: /journal.pone g002

Figure 3. Phylogenetic trees of all GOS and publicly available pstS (A) and talC (B) sequences. BS indicates bootstrap values. GOS and public viral sequences are colored aqua and pink, respectively. GOS and public prokaryotic sequences are navy blue and lime green, respectively. GOS eukaryotic sequences are colored yellow. doi: /journal.pone g003

All viral gene families were positively correlated with water temperature Some viral gene families were correlated with salinity, water depth, and calculated trophic status indices Different environmental pressures may influence acquisition of these genes by viruses Table S7 shows the correlations between viral gene families and environmental parameters
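
To make "positively correlated with water temperature" concrete, here is a small sketch that computes a Pearson correlation coefficient between a gene family's relative abundance and temperature across stations. The slides do not say which correlation statistic the study used, so Pearson is an assumption. The numbers are invented purely for illustration and are not values from Table S7.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented per-station values: water temperature (°C) and the relative
# abundance of one hypothetical viral gene family (arbitrary units).
TEMPERATURE_C = [12.0, 18.0, 22.0, 26.0, 29.0]
FAMILY_ABUNDANCE = [0.8, 1.1, 1.9, 2.4, 3.0]

r = pearson_r(TEMPERATURE_C, FAMILY_ABUNDANCE)
```

A value of `r` close to +1 would indicate that stations with warmer water tend to show more copies of the family. The same function could be applied to salinity, depth, or trophic status indices.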

Discussion Most studies have focused on the filtered viral fraction of the data This is the first study to focus on the viral components in the microbial fraction of the data Strong evidence for abundance and distribution of environmentally important host-derived viral gene families Distribution patterns of host-derived viral families over environmental gradients Evidence of interactions between bacteriophage and host organisms

Detection of Viruses in Microbial Data Large viruses (0.1 µm–0.22 µm) get caught in the filters because of their size and geometric shape. Small free-living phages flow through the filter, but viruses physically interacting with microbes are caught along with them. When filtering large volumes, biomass accumulates on the filter and viruses get caught. Most viruses found within the aquatic microbial communities studied seemed to be in the lytic infection cycle and were therefore actively replicating their DNA.

Viruses with Metabolic Genes Through lateral gene transfer, metabolic genes can be acquired from the host. Acquisition, retention, and expression of metabolic genes may increase fitness. Keeping key metabolic processes and pathways running during infection allows maximum replication. Previous studies on host-derived metabolic viral genes have focused on the photosynthesis genes psbA and psbD of a cyanophage. Previous studies did not focus on abundance or distribution of these genes in the oceans.

Host-Derived Metabolic Gene Families In the aquatic viral communities sampled, host-derived genes were found widely distributed in significant proportions. Quantitative PCR of these genes confirmed high abundance. Not known if these genes were expressed at the time of sampling. Unlikely to see these genes in high abundance if they: –Were not expressed –Did not confer a fitness advantage

“Suggests that viruses may play a more substantial role in environmentally relevant metabolic processes than previously recognized such as the conversion of light to energy, photoadaptation, phosphate acquisition, and carbon metabolism”

Potential Evolutionary Viral-Host Relationships The study of the cyanophage found that the host-derived genes undergo higher mutation rates than their cyanobacterial counterparts. After phage acquisition, the genes could diversify. Mutated viral genes could form gene reservoirs for the host. Through horizontal gene transfer, viruses could promote diversity and distribution.

Prochlorococcus – P-SSM4-like Phage Prochlorococcus is one of the most widespread picophytoplankton in the ocean P-SSM4-like phage may influence the abundance, diversity, and distribution of Prochlorococcus Statistically significant relationship between the Prochlorococcus and the P-SSM4-like phage

Metagenomic Viral-Microbial Interactions This study of viral-microbial association between communities was coincidental Horizontal transfer of metabolic genes More studies necessary on the viral-microbial diversity and genetic complement –Community relationships –Evolutionary relationships

Any Questions?