High Throughput Computational Sequence Analysis Rob Edwards Argonne National Laboratory San Diego State University.

Presentation transcript:

High Throughput Computational Sequence Analysis
Rob Edwards
Argonne National Laboratory
San Diego State University

How much has been sequenced?
[Figure: number of known sequences by year, marking the first bacterial genome, 100 bacterial genomes, 1,000 bacterial genomes, and environmental sequencing]

How much will be sequenced
[Figure labels: 100 people; everybody in San Diego; everybody in USA; all cultured Bacteria; one genome from every species; most major microbial environments]

High Performance Computing

TeraGrid

The TeraGrid National Resource

Life Sciences Gateway to TeraGrid

Subsystems

Subsystems make up metabolism
[Figure: metabolism overview, credited to Wikipedia]

Subsystems are not just metabolism
– Enzyme complex
– Cell machinery
– Cell processes

Growth in generation of subsystems

Microbial Genomics Annotation Platform
Goal 1: Automate the generation of high quality annotations by leveraging the information contained in SubSystems and FIGfams.
Goal 2: Minimize turnaround time. Initial target: 48 hours.

Automated process consisting of:
– Gene calling
– Initial annotation of function
– Initial metabolic reconstruction
Process takes 1-7 hours depending on size and complexity of the genome
~20 genomes per day
Password protected, secure, private
Release to public databases if required
Freely available annotation service
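To make the first step above concrete, here is a minimal sketch of what "gene calling" produces: start and end coordinates for candidate genes. It is a toy open reading frame (ORF) scan on the forward strand only, written for illustration. It is not the gene caller the annotation service uses, and the function name `find_orfs` and its `min_codons` parameter are assumptions introduced here.

```python
# Toy gene caller: report open reading frames (ORFs) on the forward strand.
# Assumption: this is NOT the annotation service's gene caller; it only
# illustrates the kind of output gene calling yields (coordinates per gene).

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(sequence, min_codons=3):
    """Return (start, end) pairs, 1-based inclusive, for forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon and must
    contain at least `min_codons` codons, counting the start and the stop.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None  # 0-based position of the currently open ATG, if any
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == START_CODON:
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start + 1, pos + 3))
                start = None  # close this ORF and look for the next ATG
    return sorted(orfs)


if __name__ == "__main__":
    # ATG AAA TTT TAG sits in the third reading frame of this sequence.
    print(find_orfs("CCATGAAATTTTAGGG"))
```

A real gene caller also scans the reverse strand, handles alternative start codons, and scores candidates statistically; the later steps (function assignment, metabolic reconstruction) then work from the coordinates a step like this emits.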

Some estimate of annotation quality

Evaluation / Viewing

Download results
We provide a number of export formats:
– GenBank, FASTA, GFF3, Excel
– can easily be extended to all formats supported by BioPerl
Genomes can be deleted by the user at any time (we keep them for max. 120 days)
Genomes can be directly imported into the SEED if the user wishes
All genomes are password protected
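As a rough illustration of two of the export formats listed above, the sketch below renders called genes as GFF3 and a contig as FASTA. The `Feature` record, its field names, and the `"sketch"` source label are hypothetical and introduced only for this example; the service's own exporters (which the slide ties to BioPerl) are not shown in the source.

```python
# Sketch of two export formats named above: GFF3 and FASTA.
# Assumptions: the Feature record, its field names, and the "sketch" source
# label are hypothetical; this is not the service's actual export code.
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class Feature:
    seqid: str       # contig identifier
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    strand: str      # "+" or "-"
    feature_id: str  # unique ID for the feature
    product: str     # assigned function


def to_gff3(features, source="sketch"):
    """Render CDS features as GFF3 text (nine tab-separated columns)."""
    lines = ["##gff-version 3"]
    for f in features:
        # Percent-encode characters such as ';' and '=' in the product name,
        # since they are reserved in the GFF3 attributes column.
        product = quote(f.product, safe=" ")
        attributes = f"ID={f.feature_id};product={product}"
        lines.append("\t".join([
            f.seqid, source, "CDS", str(f.start), str(f.end),
            ".", f.strand, "0", attributes,
        ]))
    return "\n".join(lines) + "\n"


def to_fasta(name, sequence, width=60):
    """Render one sequence as FASTA, wrapping lines at `width` characters."""
    chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{name}\n" + "\n".join(chunks) + "\n"


if __name__ == "__main__":
    genes = [Feature("contig1", 3, 14, "+", "cds1", "hypothetical protein")]
    print(to_gff3(genes), end="")
    print(to_fasta("contig1", "CCATGAAATTTTAGGG"), end="")
```

Keeping the feature record separate from the renderers is one way to make adding another output format a matter of writing one more function over the same records.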

Metagenomics SEED

Metagenome Metabolic Reconstruction

Starch utilization in cow rumens

Metabolic potential in environments

Too much will be sequenced
[Figure labels: 100 people; everybody in San Diego; everybody in USA; all cultured Bacteria; one genome from every species; most major microbial environments]

Acknowledgements
Argonne National Laboratory: Rick Stevens, Bob Olson, Folker Meyer
San Diego State University: Forest Rohwer
Fellowship for Interpretation of Genomes: Ross Overbeek, Veronika Vonstein
The Annotators