
1 How We Annotated Genomes for Free: Fast and Accurate Functional Analysis Using Subsystems Technology
Rob Edwards
Depts of Computer Science And Biology, San Diego State University
Mathematics and Computer Sciences Division, Argonne National Laboratory
ASM Philadelphia, May 2009
http://rast.nmpdr.org/?page=Conference

2 Pigeons If it’s good enough for Google – it’s good enough for me

3 Annotation Servers
Metagenomes – http://metagenomics.theseed.org
Complete genomes – http://rast.nmpdr.org

4 How much has been sequenced?
[Figure: number of known sequences by year, with markers for the first bacterial genome, 100 bacterial genomes, 1,000 bacterial genomes, and environmental sequencing]

5 How much will be sequenced?
[Figure labels: 100 people; everybody at an ASM meeting; everybody in USA; all cultured Bacteria; one genome from every species; most major microbial environments]

6 The SEED Family

7 Subsystem Spreadsheet
Genome                    Chaperone  Subunit  Usher  Adhesin
S. enterica Enteritidis   2389       2388     2387   2386
E. coli HS                3068       3067     3066   3065
B. cenocepacia J2315      2604       2603     2602   2601
S. maltophilia            1085       1088     1087   1086
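The spreadsheet on this slide can be read as a small lookup structure: one row per genome, one column per functional role, one gene number per cell. The sketch below is only an illustration of that layout, assuming the four-digit values are gene identifiers; the names `ROLES`, `SPREADSHEET`, `gene_for`, and `is_contiguous` are hypothetical and are not part of the SEED or RAST software.

```python
# Hypothetical in-memory form of the spreadsheet shown on the slide.
# Assumption: each four-digit value is the gene number filling that role.
ROLES = ["Chaperone", "Subunit", "Usher", "Adhesin"]

SPREADSHEET = {
    "S. enterica Enteritidis": [2389, 2388, 2387, 2386],
    "E. coli HS": [3068, 3067, 3066, 3065],
    "B. cenocepacia J2315": [2604, 2603, 2602, 2601],
    "S. maltophilia": [1085, 1088, 1087, 1086],
}


def gene_for(genome, role):
    """Return the gene number that fills `role` in `genome`."""
    return SPREADSHEET[genome][ROLES.index(role)]


def is_contiguous(genome):
    """True if the row's gene numbers form an unbroken run (adjacent genes)."""
    genes = sorted(SPREADSHEET[genome])
    return all(b - a == 1 for a, b in zip(genes, genes[1:]))
```

Checking `is_contiguous` for each row shows why the layout is useful: in every genome on the slide the four roles are filled by consecutively numbered genes.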

8 Over 1,000 Subsystems
Three level “hierarchy”
Amino Acids and Derivatives
  – Alanine, serine, and glycine
      Serine Biosynthesis
Amino Acids and Derivatives
  – Lysine, threonine, methionine, and cysteine
      Methionine Biosynthesis
Make your own subsystems!

9
Class                                  # SS   Class                           # SS
Amino Acids and Derivatives            56     Nucleosides and Nucleotides     14
Carbohydrates                          97     Phosphorus Metabolism           6
Cell Division and Cell Cycle           10     Photosynthesis                  9
Cell Wall and Capsule                  50     Potassium metabolism            3
Clustering-based subsystems            193    Protein Metabolism              52
Cofactors, Vitamins, Pigments          43     RNA Metabolism                  39
DNA Metabolism                         30     Regulation and Cell signaling   23
Fatty Acids, Lipids, and Isoprenoids   22     Respiration                     44
Membrane Transport                     41     Secondary Metabolism            24
Metabolism of Aromatic Compounds       30     Stress Response                 37
Motility and Chemotaxis                8      Sulfur Metabolism               12
Nitrogen Metabolism                    11     Virulence                       116

10 The Annotation Process
Find the phylogenetic neighborhood of your genome
Look for proteins that related organisms have
  – Core proteins
  – Subset of all subsystems
Use those calls as a training set for critica/glimmer
  – Intrinsic training set!
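The three steps on this slide can be sketched as a toy pipeline. This is not the RAST implementation: the similarity scores, family names, and all function names below are invented for illustration, and the real step of training critica/glimmer is reduced to returning the list of genes that would be handed to the gene caller.

```python
def nearest_neighbors(similarity, k):
    """Step 1 (toy): pick the k reference genomes with the highest score."""
    return sorted(similarity, key=similarity.get, reverse=True)[:k]


def core_families(neighbors, families_by_genome):
    """Step 2 (toy): protein families present in every neighbor genome."""
    sets = [families_by_genome[g] for g in neighbors]
    return set.intersection(*sets) if sets else set()


def training_set(candidate_calls, core):
    """Step 3 (toy): keep candidate gene calls whose family is in the core."""
    return [gene for gene, family in candidate_calls if family in core]


# Toy data, invented for illustration only.
similarity = {"ref_A": 0.97, "ref_B": 0.91, "ref_C": 0.40}
families_by_genome = {
    "ref_A": {"dnaA", "recA", "gyrB", "famX"},
    "ref_B": {"dnaA", "recA", "gyrB"},
    "ref_C": {"dnaA", "famY"},
}
candidate_calls = [("gene1", "dnaA"), ("gene2", "famX"), ("gene3", "recA")]

neighbors = nearest_neighbors(similarity, k=2)
core = core_families(neighbors, families_by_genome)
training = training_set(candidate_calls, core)
```

The point of the sketch is the "intrinsic" part: the training genes come from the new genome itself, selected because close relatives all carry the same families.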

11 This one’s for Gary

12 Automatic Metabolic Reconstruction
Subsystem, GO, and KEGG connections
  – KEGG EC numbers
  – KEGG reaction numbers
  – SEED reaction numbers (Chris Henry)
Metabolic flux models
  – Automatically generate FBA matrices (Aaron Best/Matt DeJongh; Hope College)
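For readers unfamiliar with FBA matrices, the following minimal sketch shows the general idea: a stoichiometric matrix with one row per metabolite and one column per reaction, plus the steady-state check that every metabolite's production and consumption balance. The three reactions and all function names are hypothetical, and this is not how the SEED generates its models; a full flux balance analysis would also add flux bounds and optimize an objective, which is omitted here.

```python
def stoichiometric_matrix(reactions):
    """Build S: rows are metabolites, columns are reactions.

    `reactions` maps a reaction name to {metabolite: coefficient}, with
    negative coefficients for consumed and positive for produced metabolites.
    """
    metabolites = sorted({m for stoich in reactions.values() for m in stoich})
    names = list(reactions)
    matrix = [[reactions[r].get(m, 0) for r in names] for m in metabolites]
    return metabolites, names, matrix


def is_steady_state(matrix, fluxes):
    """True if S . v == 0, i.e. no metabolite accumulates or is depleted."""
    return all(sum(c * v for c, v in zip(row, fluxes)) == 0 for row in matrix)


# Toy network, invented for illustration: A is taken up, converted to B,
# and B is exported.
reactions = {
    "uptake": {"A": 1},
    "convert": {"A": -1, "B": 1},
    "export": {"B": -1},
}

metabolites, names, S = stoichiometric_matrix(reactions)
```

With this toy network, equal flux through all three reactions satisfies the steady-state condition, while any imbalance between uptake and conversion does not.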

13

14 The Populated Subsystem

15 Automatically Compare Metabolic Reconstructions
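One simple way to think about comparing two reconstructions is as set arithmetic over the functional roles each genome carries. The sketch below assumes that framing; the function name and the role names in the example are hypothetical and do not describe the server's actual comparison tool.

```python
def compare_reconstructions(roles_a, roles_b):
    """Split two sets of functional roles into shared and genome-specific."""
    return {
        "shared": roles_a & roles_b,
        "only_a": roles_a - roles_b,
        "only_b": roles_b - roles_a,
    }


# Toy role sets, invented for illustration only.
genome_a = {"Serine Biosynthesis", "Methionine Biosynthesis", "Respiration"}
genome_b = {"Serine Biosynthesis", "Photosynthesis"}

result = compare_reconstructions(genome_a, genome_b)
```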

16 Find And Suggest Candidate Functions
Rapidly correct missing annotations
Add more members to subsystems
Improves future genome annotations! (especially with new subsystems)
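A rough sketch of how missing annotations might be flagged: if a genome already has most of the roles in a subsystem, the empty cells are the places to look for candidate genes. The threshold, the data, and the function names below are assumptions made for illustration, not the tool's actual logic.

```python
def missing_roles(row):
    """Roles in a subsystem row that have no gene assigned (value is None)."""
    return [role for role, gene in row.items() if gene is None]


def suggest_gaps(spreadsheet, min_fraction_present=0.5):
    """Report gaps only for genomes that already have most of the subsystem.

    `min_fraction_present` is a hypothetical cutoff: a genome with almost
    none of the roles probably lacks the subsystem rather than an annotation.
    """
    suggestions = {}
    for genome, row in spreadsheet.items():
        gaps = missing_roles(row)
        present = len(row) - len(gaps)
        if gaps and present / len(row) >= min_fraction_present:
            suggestions[genome] = gaps
    return suggestions


# Toy spreadsheet, invented for illustration only.
spreadsheet = {
    "genome_1": {"Chaperone": 101, "Subunit": 102, "Usher": None, "Adhesin": 104},
    "genome_2": {"Chaperone": None, "Subunit": None, "Usher": None, "Adhesin": 7},
    "genome_3": {"Chaperone": 11, "Subunit": 12, "Usher": 13, "Adhesin": 14},
}

suggestions = suggest_gaps(spreadsheet)
```

Here only `genome_1` is reported, with `Usher` as the role to search for; filling that cell is what adds a new member to the subsystem.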

17 The Real Live Test
10 genomes submitted on Thursday at 6 pm
First annotation complete before 8 am Friday
Remaining annotations completed Friday before noon (there were others in the pipeline too!)

18 Subsystems Coverage
Genome                      Percent of Proteins in Subsystems
Haloferax denitrificans     20%
Haloferax mediterranei      19%
Haloferax sulfurifontis     19%
Haloferax volcanii DS2      19%
Haloarcula sp 33800         19%
Haloarcula sp 33799         18%

19 Prophages
PHANTOME: Mya Breitbart, Matt Sullivan, Jeff Elhai, Rob Edwards (NSF)
[Figure: Haloferax sulfurifontis prophage]

20 Metagenome Comparisons
Metagenomics RAST has 300 public metagenomes
Compared using tblastx

21 Human Poop

22 High Salinity Salterns, San Diego, July 2004
Thanks: Beltran Rodriguez-Mueller, Mya Breitbart, & Forest Rohwer

23 [Figure panels: low salinity salterns and high salinity salterns; July 2004 and Nov 2005]

24 Free workshops on NMPDR, RAST, mg-RAST, SEED
Contact Leslie McNeil (lkmcneil@ncsa.uiuc.edu) or visit http://www.nmpdr.org/

25 Acknowledgements
Environmental Genomics: Forest Rohwer, Beltran Rodriguez-Mueller
Annotation Servers: Rick Stevens, Ross Overbeek, Folker Meyer, Bob Olson, Daniel Paarman, Mark D'Souza, Jared Wilkening, Andreas Wilke
FIG: Ross Overbeek, Veronika Vonstein, Annotators
Artist: Paula Morris

