C A M E R A A Metagenomics Resource for Marine Microbial Ecology July 27, 2007 Paul Gilna UCSD/Calit2 Saul A. Kravitz J. Craig Venter Institute.



Acknowledgements
- UCSD/Calit2
  - Larry Smarr, PI; Paul Gilna, Executive Director
  - Phil Papadopoulos, Technical Lead
  - Weizhong Li
- JCVI
  - Marv Frazier, co-PI
  - Leonid Kagan, Architect; Jennifer Wortman, Bioinformatics
  - Rekha Seshadri, Outreach and Training
  - Doug Rusch, Shibu Yooseph, Aaron Halpern, Granger Sutton
- UC Davis
  - Jonathan Eisen, co-investigator
- Gordon and Betty Moore Foundation
  - David Kingsbury and Mary Maxon

Outline
- New Discipline of Metagenomics
- Global Ocean Sampling Expedition
- Challenges of Metagenomic Data
- CAMERA Features
- CAMERA Usage to Date
- Cyberinfrastructure

Genomics vs Metagenomics
- Genomics ('Old School')
  - Study of an organism's genome
  - Genome sequence determined using shotgun sequencing and assembly
  - ~1300 microbes sequenced, first in […]
  - DNA usually obtained from pure cultures
- Metagenomics
  - Application of genome sequencing methods to environmental samples (no culturing)
  - Environmental shotgun sequencing is the most widely used approach

Metagenomic Questions
- Within an environment
  - What biological functions are present (absent)?
  - What organisms are present (absent)?
- Compare data from (dis)similar environments
  - What are the fundamental rules of microbial ecology?
- Search for novel proteins and protein families

Metagenomics Applications
- Marine Ecology and Microbiology
- Alternative Energy and Industrial
  - Hypersaline ponds, Oceans
  - Termite Metabolism
- Medical Applications
  - Microbial Ecology of Human body cavities and fluids
- Agricultural
  - Disease Vector Metabolism (Glassy-winged Sharpshooter)
  - Soil Ecology
- Environmental Remediation
  - DOE: Acid Mine Drainage, Chemical and Radioactive Waste

Metadata
- Metagenomics - Genomics + Metadata
- Environmental Metadata
  - Time and location (lat, long, depth) of sample collection
  - Correlate w/remote sensing data
  - Physico-chemical properties (e.g. temperature, salinity)
Figure: MODIS-Aqua satellite image of ocean chlorophyll in the Sargasso Sea grid about the BATS site from 22 February 2003
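The environmental metadata fields listed on this slide can be pictured as one record per sample. The sketch below is only an illustration: the class name, field names, units, and all sample values are assumptions made for the example, not CAMERA's actual schema or GOS data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SampleMetadata:
    """Hypothetical per-sample record; field names and units are assumptions."""
    sample_id: str
    collected_at: datetime   # time of sample collection
    latitude: float          # decimal degrees
    longitude: float         # decimal degrees
    depth_m: float           # sampling depth in metres
    temperature_c: float     # water temperature in degrees Celsius
    salinity_psu: float      # salinity in practical salinity units


def select_samples(samples, max_depth_m, min_temp_c):
    """Return samples no deeper than max_depth_m and at least min_temp_c warm."""
    return [
        s for s in samples
        if s.depth_m <= max_depth_m and s.temperature_c >= min_temp_c
    ]


# Invented demonstration values, not real measurements.
samples = [
    SampleMetadata("demo-01", datetime(2003, 2, 22, 10, 0), 31.67, -64.17, 5.0, 20.0, 36.6),
    SampleMetadata("demo-02", datetime(2003, 5, 1, 9, 30), 32.17, -64.50, 40.0, 19.5, 36.5),
    SampleMetadata("demo-03", datetime(2004, 1, 15, 14, 0), 10.00, -80.00, 2.0, 28.0, 33.0),
]

warm_surface = select_samples(samples, max_depth_m=10.0, min_temp_c=25.0)
print([s.sample_id for s in warm_surface])
```

Keeping time, position, and physico-chemical values in one typed record is what makes it straightforward to subset sequence data by environment or to join it with remote sensing data by date and location.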

JCVI Global Ocean Sampling Expedition
- Largest Metagenomic Study to Date

Global Ocean Sampling (GOS)
- 178 Total Sampling Locations
- Phase 1: 41 samples, 7.7M reads, >6M proteins
- Diverse Environments
  - Open ocean, estuary, embayment, upwelling, fringing reef, atoll, warm seep, mangrove, fresh water, biofilms, sediments, soils

GOS Protein Analysis
Yooseph et al. (PLoS Biology 2007)
- Novel clustering process
  - Sequence similarity based
  - Predict proteins and group into related clusters
  - Include GOS and all known proteins
- Findings
  - GOS proteins cover ~all existing prokaryotic families
  - GOS expands diversity of known protein families
  - 1700 large novel clusters with no homology to known protein families
  - Higher than expected proportion of novel clusters are viral
  - No saturation in the rate of novel protein family discovery
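To make "sequence similarity based" grouping concrete, here is a minimal sketch of greedy clustering. It is not the procedure of Yooseph et al.: the similarity measure (Jaccard overlap of 3-mers), the threshold, and the toy sequences are all assumptions chosen so the example stays self-contained.

```python
def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def greedy_cluster(named_seqs, threshold=0.5, k=3):
    """Assign each sequence to the first cluster whose representative is
    similar enough; otherwise start a new cluster with it as representative."""
    clusters = []  # list of (representative k-mer set, member names)
    for name, seq in named_seqs:
        profile = kmers(seq, k)
        for rep_profile, members in clusters:
            if jaccard(profile, rep_profile) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((profile, [name]))
    return [members for _, members in clusters]


# Toy sequences, invented for illustration.
proteins = [
    ("p1", "MKTAYIAKQR"),
    ("p2", "MKTAYIAKQK"),  # differs from p1 only in the last residue
    ("p3", "GGSWLPPHHE"),  # shares no 3-mers with p1 or p2
]
clusters = greedy_cluster(proteins)
print(clusters)
```

A sequence that matches no existing representative founds a new cluster, which is the toy analogue of a novel cluster with no homology to known families.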

Added Diversity
Figure: phylogenies of UVDE homologs and Rubisco homologs. Labeled reference organisms: H. marismortui, B. halodurans, T. thermophilus, B. anthracis, D. psychrophila, D. radiodurans. Legend: GOS prokaryotes, GOS eukaryotes, GOS viral, known prokaryotes, known eukaryotes, known viral.

Rate of Protein Discovery

Fragment Recruitment Viewer
Rusch et al, PLoS 3/2007
Figure: recruitment plots of percent identity (100% down to 55% or 50%) against reference genome coordinates. Annotations:
- Ribosomal operon
- "core" genome, ~75% identical
- Sequence absent from most strains (phage or other lateral transfer?)
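The plot described above places each metagenomic read at its position on the reference genome (x axis) and at its percent identity to the reference (y axis). The following sketch shows how alignment hits could be turned into such points. The `Hit` fields, the 50% identity floor, and the example hits are assumptions for illustration, not the viewer's actual input format.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical alignment of one read against a reference genome."""
    read_id: str
    ref_start: int       # first aligned reference coordinate
    ref_end: int         # last aligned reference coordinate
    matches: int         # identical positions in the alignment
    aligned_length: int  # total alignment length


def recruitment_points(hits, min_identity=50.0):
    """Return (reference midpoint, percent identity, read id) tuples for hits
    at or above min_identity, sorted by reference position."""
    points = []
    for h in hits:
        identity = 100.0 * h.matches / h.aligned_length
        if identity >= min_identity:
            midpoint = (h.ref_start + h.ref_end) / 2
            points.append((midpoint, identity, h.read_id))
    return sorted(points)


# Invented hits for demonstration.
hits = [
    Hit("r2", 5000, 5700, 525, 700),  # 75% identity
    Hit("r1", 100, 900, 760, 800),    # 95% identity
    Hit("r3", 2000, 2600, 240, 600),  # 40% identity, below the floor
]
points = recruitment_points(hits)
print(points)
```

In a plot of these points, a stretch of reference coordinates with few or no reads is the kind of gap the slide attributes to sequence absent from most strains.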

Why CAMERA?
- Public repositories not focused on environmental metagenomics
  - Sargasso Sea data underutilized by community
- M$ invested in sequencing and analysis but only accessible to bioinformatics elite
- Release of GOS dataset in March 2007
- Comply with Convention on Biological Diversity

CAMERA
- “Convenient acronym for cumbersome name…” (Henry Nicholls, PLoS Biology)
- Mission: Enable Research in Marine Microbiology
- CAMERA Partners:

Challenges
- Enormous datasets with high gene density
  - large compute resources required
  - 2 orders of magnitude jump
- Fragmentary data
  - inadequate bioinformatics tools for assembly, annotation, analysis, visualization
- Metadata standards non-existent
  - metadata absent from databases
  - Lack of standards impedes collection of datasets
- Diversity of User Sophistication and Needs

CAMERA Services
- Maintain searchable sequence collections
  - ALL metagenomic sequence reads, assemblies
  - Non-identical amino acid collection (extended NRAA)
  - Viral, Fungal, pico-Eukaryotes, Microbial
  - CAMERA protein clusters
- Metagenomics data easily downloadable
- Interactive and Batch Search Facility
  - Scalable parallel implementations of BLAST
  - Integrated with associated metadata

Distinctive Features Set in Progress
- Graphical Tools for Visualizing Diversity
  - Based on Rusch et al
  - Fragment recruitment viewer
- CAMERA Protein Clusters
  - Based on Yooseph et al
  - Incremental version implemented in 2007
- Annotation
  - Break through quadratic complexity via clusters
  - Phyletic Classification
- Overviews of sequence collections
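One way to read "break through quadratic complexity via clusters" is as a count of sequence comparisons: comparing every sequence against every other grows quadratically, while comparing each new sequence only against one representative per existing cluster grows with the number of clusters. The sketch below illustrates that arithmetic only. This reading, the function names, and the example numbers are assumptions, not a description of CAMERA's implementation.

```python
def all_vs_all_comparisons(n):
    """Number of unordered pairs among n sequences: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def incremental_comparisons(new_sequences, representatives):
    """Comparisons needed when each new sequence is checked against one
    representative per existing cluster."""
    return new_sequences * representatives


# Hypothetical sizes chosen for illustration.
existing = 6_000_000      # order of magnitude of the Phase 1 protein count
new = 100_000             # assumed size of an incoming batch
representatives = 50_000  # assumed number of cluster representatives

full = all_vs_all_comparisons(existing + new)
incremental = incremental_comparisons(new, representatives)
print(f"all-vs-all: {full:,} comparisons")
print(f"incremental: {incremental:,} comparisons")
```

Under these assumed numbers the incremental route needs several thousand times fewer comparisons, and the gap widens as the collection grows.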

Fragment Recruitment Viewer
- Metagenomic Sequence vs Reference Sequence
- Highlight and Select with Associated Metadata
- View large datasets
- AJAX interface
- Based on Doug Rusch's Viewer