The (IMG) Systems for Comparative Analysis of Microbial Genomes & Metagenomes: N America: 1,180 Europe: 386 Asia: 235 Africa: 6 Oceania: 81 S America:


1 The (IMG) Systems for Comparative Analysis of Microbial Genomes & Metagenomes
Users by region: N America: 1,180; Europe: 386; Asia: 235; Africa: 6; Oceania: 81; S America: 96
Total: 1,979

2 Outline: User Community; Data Transformation; Data Flow; What's Next

3 MGM Workshop Attendees (April 20, 2012): 545 attendees / 48 countries
North America: 356 (Canada 33, Mexico 5, USA 318)
South America: 21 (Argentina 4, Brazil 7, Chile 1, Colombia 5, Ecuador 2, Peru 1, Uruguay 1)
Africa: 7 (Algeria 1, Egypt 5, Ethiopia 1)
Oceania: 12 (Australia 10, New Zealand 2)
Europe: 79 (Belgium 2, Czech Rep 1, Denmark 16, Estonia 1, Finland 6, France 1, Germany 13, Greece 4, Ireland 4, Italy 1, Hungary 1, Netherlands 4, Norway 1, Russia 4, Portugal 1, Poland 1, Spain 3, Sweden 4, Switzerland 1, UK 10)
Asia: 70 (China 20, Hong Kong 5, India 18, Israel 4, Japan 3, Korea 4, Malaysia 3, Philippines 1, Saudi Arabia 4, Singapore 2, Taiwan 3, Thailand 2, Turkey 1)

4 IMG ER & IMG/M ER Users (May 14, 2012): 1,979 / 54 countries
North America: 1,180 (Canada 72, Mexico 9, USA 1,099)
South America: 96 (Argentina 6, Brazil 38, Chile 10, Colombia 5, Costa Rica 32, Peru 2, Uruguay 3)
Africa: 6 (Egypt 3, South Africa 2, Tanzania 1)
Oceania: 81 (Australia 72, New Zealand 9)
Europe: 386 (Austria 16, Belgium 6, Czech Rep 4, Denmark 23, Estonia 2, Finland 16, France 45, Germany 100, Greece 7, Iceland 1, Ireland 9, Italy 5, Lithuania 1, Netherlands 15, Norway 7, Poland 1, Portugal 3, Russia 7, Serbia 1, Slovenia 1, Spain 26, Sweden 22, Switzerland 9, UK 59)
Asia: 235 (China 68, Hong Kong 10, India 55, Israel 28, Iran 1, Japan 16, Korea 27, Malaysia 4, Philippines 1, Qatar 1, Saudi Arabia 5, Singapore 3, Taiwan 11, Thailand 4, Turkey 1)

5 Data interpretation for individual genomes
Transformation: Sequencing (raw reads); Sequencing (qualified reads); Assembly (assembled reads); Structural annotation (predicted genes); Functional annotation (annotated proteins); Functional annotation* (pathways)
Characterization
Questions:
Data: structure & semantics; logical (objects, correlations); physical (files, formats, size)
Processing: methods, tools
Implementation: data management; computing infrastructure
Genome exploration: browse & search genome
Browse genome sequence: gene coordinates, features
Search genome for presence of specific genes, functions
Sequence browser; Chromosome map
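The two genome exploration operations named on this slide (browsing genes by coordinates and searching a genome for specific functions) can be illustrated with a minimal Python sketch. The `Gene` record, its fields, and both function names are hypothetical and chosen only for illustration; the slide does not describe how IMG implements these operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record: coordinates plus one functional annotation."""
    gene_id: str
    start: int      # 1-based start coordinate on the sequence
    end: int        # inclusive end coordinate
    strand: str     # "+" or "-"
    function: str   # free-text functional annotation


def genes_in_region(genes, lo, hi):
    """Browse: return genes that overlap the coordinate window [lo, hi]."""
    return [g for g in genes if g.start <= hi and g.end >= lo]


def search_by_function(genes, keyword):
    """Search: return genes whose annotation contains the keyword (case-insensitive)."""
    key = keyword.lower()
    return [g for g in genes if key in g.function.lower()]
```

Both functions return genes in input order, so a list sorted by start coordinate yields results in genome order.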

6 Data interpretation across genomes
Genome integration: genes; functions; pathways
Genome fusion: pangenomes
"OMICS" integration: gene expression from proteomics, transcriptomics
Questions:
Data: structure & semantics; logical (data model); physical (database system)
Integration: methods, tools
Implementation: analysis operations; flow (composition); performance
[Diagram: Genome 1, Genome k, and Genome n with genes g1 to g4; labels: Gene correlations, Conserved genes, Function Profile]
Comparative Analysis: review, revise, improve quality of annotations
Explore/compare gene & functional content of genomes & metagenomes
Detect/correct annotation gaps & inconsistencies
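As a rough illustration of the "Function Profile" and "Conserved genes" labels on this slide, the Python sketch below counts genes per function in each genome and finds the functions present in every genome. The input layout (a dict mapping a genome name to a list of function labels, one per gene) and the function names are assumptions made for this example, not IMG's data model.

```python
from collections import Counter


def function_profile(genomes):
    """Return {function: {genome: gene count}} across all genomes.

    `genomes` maps a genome name to a list of function labels, one per gene
    (an assumed layout for this sketch).
    """
    counts = {name: Counter(labels) for name, labels in genomes.items()}
    functions = sorted({f for labels in genomes.values() for f in labels})
    # A Counter returns 0 for a missing key, so absent functions show as 0.
    return {f: {name: counts[name][f] for name in genomes} for f in functions}


def conserved_functions(genomes):
    """Return the set of functions found in every genome."""
    sets = [set(labels) for labels in genomes.values()]
    return set.intersection(*sets) if sets else set()
```

A zero in a profile row marks a function that is absent from that genome, which is the kind of gap the comparative analysis on this slide is meant to surface.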

7 Biological data interpretation process
Structural annotation: scaffolds; genes
Questions: gene prediction accuracy; need re-annotation of all microbial genomes
Functional annotation: genes; functions
Questions: multiple resources, methods; potential conflicts, errors; missing annotations; requires integrated context (IMG ER) + tools for review/curation
Data integration: Σ genes; functional catalogue; genomes
Questions: completeness & consistency of functional catalogue for genomes; consistency: IMG terms & pathways; completeness: IMG metabolic reconstruction; expert curation in IMG ER
Phenotype prediction: phenotype rules; phenotypes
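The step from functional catalogue to phenotypes via phenotype rules can be sketched as follows. The slide does not say what form a phenotype rule takes; this Python example assumes the simplest case, where a rule is a set of functions that must all be present in a genome's functional catalogue. The function names are hypothetical.

```python
def predict_phenotypes(catalogue, rules):
    """Return the phenotypes whose required functions are all in the catalogue.

    `catalogue` is the set of functions annotated in one genome.
    `rules` maps a phenotype name to its set of required functions
    (an assumed rule format for this sketch).
    """
    present = set(catalogue)
    return sorted(p for p, required in rules.items() if set(required) <= present)


def missing_functions(catalogue, rules):
    """For each phenotype that is not predicted, list the functions still missing."""
    present = set(catalogue)
    gaps = {p: sorted(set(required) - present) for p, required in rules.items()}
    return {p: missing for p, missing in gaps.items() if missing}
```

Reporting the missing functions alongside the prediction ties back to the completeness question on this slide: a phenotype that fails by a single function may point to a missing annotation rather than a real absence.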

8 IMG systems data flow: up to Dec 2011
[Diagram labels: every 4 months; monthly; on demand; Instructor & Student Tools]

9 IMG systems data flow: May 2012
9,991 public genomes (22.5 million genes); 1,293 private genomes (6.1 million genes)
7,989 genomes (12.6 million genes)
+ 1,077 samples (>120 studies), + 2.5 billion genes
357 samples (>95 studies), + 140 million genes
[Diagram labels: every 2-3 weeks; on demand; monthly; biweekly; Instructor & Student Tools]

10 IMG development focus
Large metagenome datasets in IMG/M ER: extended underlying datastore; revision of metagenome analysis tools
New User Workspace for handling sets of genomes, functions, genes
Long-running operations transitioned to background execution mode
Content update process: new genomes added to IMG ER & IMG/M ER at the same time
Data distribution
Documentation
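The general idea of moving long-running operations to background execution can be sketched in Python with the standard library's thread pool: the caller submits a job, gets an identifier back immediately, and checks on it later. This is a generic illustration only; the `BackgroundJobs` class and its methods are hypothetical and say nothing about how IMG implements background execution.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class BackgroundJobs:
    """Hypothetical job runner: submit now, check status or fetch the result later."""

    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def submit(self, func, *args):
        """Queue func(*args) in the pool and return a job id without waiting."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(func, *args)
        return job_id

    def status(self, job_id):
        """Return "running" (queued or executing), "done", or "failed"."""
        future = self._jobs[job_id]
        if not future.done():
            return "running"
        return "failed" if future.exception() is not None else "done"

    def result(self, job_id, timeout=None):
        """Block until the job finishes and return its value (re-raises its error)."""
        return self._jobs[job_id].result(timeout=timeout)

    def shutdown(self):
        """Wait for queued jobs to finish and release the worker threads."""
        self._pool.shutdown(wait=True)
```

A user interface built on this pattern would call `submit` when an analysis starts and poll `status` on later requests instead of holding the request open until the work completes.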

11 IMG data distribution: genome.jgi.doe.gov

12 IMG documentation

