Published by Ariel Webb. Modified over 9 years ago.
1
The Integrated Microbial Genomes (IMG) Systems for Comparative Analysis of Microbial Genomes & Metagenomes
Users by region: North America 1,180; Europe 386; Asia 235; Africa 6; Oceania 81; South America 96
Total: 1,979
2
Outline
User Community; Data Transformation; Data Flow; What's Next
3
MGM Workshop Attendees (April 20, 2012): 545 / 48 countries
North America: 356 (Canada 33, Mexico 5, USA 318)
South America: 21 (Argentina 4, Brazil 7, Chile 1, Colombia 5, Ecuador 2, Peru 1, Uruguay 1)
Africa: 7 (Algeria 1, Egypt 5, Ethiopia 1)
Oceania: 12 (Australia 10, New Zealand 2)
Europe: 79 (Belgium 2, Czech Rep 1, Denmark 16, Estonia 1, Finland 6, France 1, Germany 13, Greece 4, Ireland 4, Italy 1, Hungary 1, Netherlands 4, Norway 1, Russia 4, Portugal 1, Poland 1, Spain 3, Sweden 4, Switzerland 1, UK 10)
Asia: 70 (China 20, Hong Kong 5, India 18, Israel 4, Japan 3, Korea 4, Malaysia 3, Philippines 1, Saudi Arabia 4, Singapore 2, Taiwan 3, Thailand 2, Turkey 1)
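The regional subtotals and the overall figures on this slide can be cross-checked with a few lines of Python. This is only a sketch: the counts are copied from the slide, while the dictionary layout and variable names are illustrative choices, not part of the presentation.

```python
# Attendee counts per country, copied from the slide above.
attendees = {
    "North America": {"Canada": 33, "Mexico": 5, "USA": 318},
    "South America": {"Argentina": 4, "Brazil": 7, "Chile": 1, "Colombia": 5,
                      "Ecuador": 2, "Peru": 1, "Uruguay": 1},
    "Africa": {"Algeria": 1, "Egypt": 5, "Ethiopia": 1},
    "Oceania": {"Australia": 10, "New Zealand": 2},
    "Europe": {"Belgium": 2, "Czech Rep": 1, "Denmark": 16, "Estonia": 1,
               "Finland": 6, "France": 1, "Germany": 13, "Greece": 4,
               "Ireland": 4, "Italy": 1, "Hungary": 1, "Netherlands": 4,
               "Norway": 1, "Russia": 4, "Portugal": 1, "Poland": 1,
               "Spain": 3, "Sweden": 4, "Switzerland": 1, "UK": 10},
    "Asia": {"China": 20, "Hong Kong": 5, "India": 18, "Israel": 4,
             "Japan": 3, "Korea": 4, "Malaysia": 3, "Philippines": 1,
             "Saudi Arabia": 4, "Singapore": 2, "Taiwan": 3, "Thailand": 2,
             "Turkey": 1},
}

# Subtotal per region, then the overall attendee and country counts.
region_totals = {region: sum(counts.values()) for region, counts in attendees.items()}
total_attendees = sum(region_totals.values())
total_countries = sum(len(counts) for counts in attendees.values())

print(region_totals)
print(total_attendees, total_countries)
```

Keeping the per-country counts as the only stored data and deriving every subtotal from them avoids the subtotals drifting out of sync when a country is added.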
4
IMG ER & IMG/M ER Users (May 14, 2012): 1,979 / 54 countries
North America: 1,180 (Canada 72, Mexico 9, USA 1,099)
South America: 96 (Argentina 6, Brazil 38, Chile 10, Colombia 5, Costa Rica 32, Peru 2, Uruguay 3)
Africa: 6 (Egypt 3, South Africa 2, Tanzania 1)
Oceania: 81 (Australia 72, New Zealand 9)
Europe: 386 (Austria 16, Belgium 6, Czech Rep 4, Denmark 23, Estonia 2, Finland 16, France 45, Germany 100, Greece 7, Iceland 1, Ireland 9, Italy 5, Lithuania 1, Netherlands 15, Norway 7, Poland 1, Portugal 3, Russia 7, Serbia 1, Slovenia 1, Spain 26, Sweden 22, Switzerland 9, UK 59)
Asia: 235 (China 68, Hong Kong 10, India 55, Israel 28, Iran 1, Japan 16, Korea 27, Malaysia 4, Philippines 1, Qatar 1, Saudi Arabia 5, Singapore 3, Taiwan 11, Thailand 4, Turkey 1)
5
Data interpretation for individual genomes
Transformation:
  Sequencing: raw reads
  Sequencing: qualified reads
  Assembly: assembled reads
  Structural annotation: predicted genes
  Functional annotation: annotated proteins
  Functional annotation*: pathways
Characterization
Questions:
  Data structure & semantics: logical (objects, correlations); physical (files, formats, size)
  Processing: methods, tools
  Implementation: data management, computing infrastructure
Genome exploration: browse & search genome
  Browse genome sequence: gene coordinates, features (sequence browser, chromosome map)
  Search genome for presence of specific genes, functions
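The two exploration operations named on this slide (browsing genes by coordinates and searching a genome for specific functions) can be sketched as follows. Everything here is hypothetical: the `Gene` record, its field names, and the toy genes are assumptions made for illustration and do not reflect IMG's actual data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record: coordinates plus one functional annotation."""
    gene_id: str
    scaffold: str
    start: int
    end: int
    strand: str
    function: str


# Toy genome, invented for this example.
genome = [
    Gene("g1", "scaffold_1", 100, 950, "+", "DNA polymerase III subunit"),
    Gene("g2", "scaffold_1", 1200, 2100, "-", "ABC transporter permease"),
    Gene("g3", "scaffold_2", 50, 800, "+", "hypothetical protein"),
]


def search_by_function(genes, keyword):
    """Return genes whose annotation contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [g for g in genes if keyword in g.function.lower()]


def genes_in_region(genes, scaffold, start, end):
    """Return genes on a scaffold that overlap the window [start, end]."""
    return [g for g in genes
            if g.scaffold == scaffold and g.start <= end and g.end >= start]


print([g.gene_id for g in search_by_function(genome, "transporter")])
print([g.gene_id for g in genes_in_region(genome, "scaffold_1", 900, 1300)])
```

A sequence browser or chromosome map would be a graphical view over the same kind of coordinate query as `genes_in_region`.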
6
Data interpretation across genomes
Integration:
  Genome integration: genes, functions, pathways
  Genome fusion: pangenomes
  "OMICS" integration: gene expression from proteomics, transcriptomics
Questions:
  Data structure & semantics: logical (data model); physical (database system)
  Integration: methods, tools
  Implementation: analysis operations, flow (composition), performance
Diagram: Genome 1 ... Genome k ... Genome n, each with genes g1, g2, g3, g4; gene correlations; conserved genes; function profile
Comparative analysis:
  Review, revise, improve quality of annotations
  Explore/compare gene & functional content of genomes & metagenomes
  Detect/correct annotation gaps & inconsistencies
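To make "function profile" and "conserved genes" concrete, here is a minimal sketch that treats each genome as a list of function identifiers, one per annotated gene. The genome names, function IDs, and helper functions are invented for illustration; the slide does not specify how IMG computes these.

```python
from collections import Counter

# Hypothetical input: for each genome, the function assigned to each gene.
genomes = {
    "genome_1": ["F1", "F2", "F2", "F3"],
    "genome_k": ["F1", "F3", "F4"],
    "genome_n": ["F1", "F2", "F3", "F3"],
}


def function_profile(genomes):
    """Map each function to its gene count in every genome (0 if absent)."""
    functions = sorted({f for fs in genomes.values() for f in fs})
    counts = {name: Counter(fs) for name, fs in genomes.items()}
    return {f: {name: counts[name][f] for name in genomes} for f in functions}


def conserved_functions(genomes):
    """Functions present in every genome of the set."""
    return set.intersection(*(set(fs) for fs in genomes.values()))


profile = function_profile(genomes)
for function, row in profile.items():
    print(function, row)
print(sorted(conserved_functions(genomes)))
```

Rows with a zero in only some genomes are the ones worth reviewing: they may be real biological differences or annotation gaps of the kind the slide says comparative analysis helps detect.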
7
Biological data interpretation process
Stages:
  Structural annotation: scaffolds, genes
  Functional annotation: functions
  Data integration: Σ genes, functional catalogue, genomes
  Phenotype prediction: phenotype rules, phenotypes
Questions (structural annotation):
  Gene prediction accuracy
  Need for re-annotation of all microbial genomes
Questions (functional annotation):
  Multiple resources, methods
  Potential conflicts, errors
  Missing annotations
  Requires integrated context (IMG ER) + tools for review/curation
Questions (data integration):
  Completeness & consistency of functional catalogue for genomes
  Consistency: IMG terms & pathways
  Completeness: IMG metabolic reconstruction
  Expert curation in IMG ER
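One way to picture phenotype prediction from a functional catalogue is a rule that lists functions a genome must have and functions it must lack. The sketch below assumes that rule shape; the rule format, phenotype names, and function IDs are all hypothetical and are not taken from IMG.

```python
# Hypothetical rules: a phenotype is predicted when every required function
# is in the genome's functional catalogue and no forbidden function is.
phenotype_rules = {
    "example_phenotype_A": {"required": {"F1", "F2"}, "forbidden": set()},
    "example_phenotype_B": {"required": {"F3"}, "forbidden": {"F4"}},
}


def predict_phenotypes(catalogue, rules):
    """Return the sorted names of phenotypes whose rules the catalogue satisfies."""
    predicted = []
    for phenotype, rule in rules.items():
        has_required = rule["required"] <= catalogue
        has_forbidden = bool(rule["forbidden"] & catalogue)
        if has_required and not has_forbidden:
            predicted.append(phenotype)
    return sorted(predicted)


def missing_functions(catalogue, rule):
    """Required functions absent from the catalogue (candidate annotation gaps)."""
    return sorted(rule["required"] - catalogue)


catalogue = {"F1", "F3"}  # functions annotated in a toy genome
print(predict_phenotypes(catalogue, phenotype_rules))
print(missing_functions(catalogue, phenotype_rules["example_phenotype_A"]))
```

Under this assumption a prediction is only as good as the catalogue: a missing annotation silently removes a phenotype, which is why completeness and consistency appear as open questions on the slide.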
8
IMG systems data flow: up to Dec 2011
Update frequencies shown: every 4 months; monthly; on demand
Instructor & Student Tools
9
IMG systems data flow: May 2012
9,991 public genomes (22.5 mil genes); 1,293 private genomes (6.1 mil genes)
7,989 genomes (12.6 mil genes)
+ 1,077 samples (>120 studies), + 2.5 bil genes
357 samples (>95 studies), + 140 mil genes
Update frequencies shown: every 2-3 weeks; on demand; monthly; biweekly
Instructor & Student Tools
10
IMG development focus
Large metagenome datasets in IMG/M ER:
  Extended underlying datastore
  Revision of metagenome analysis tools
New User Workspace for handling sets of genomes, functions, genes:
  Long-running operations transitioned to background execution mode
Content update process:
  New genomes added to IMG ER & IMG/M ER at the same time
Data distribution
Documentation
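Moving long-running operations to background execution generally means the user submits a job, gets an identifier back immediately, and checks on it later. The sketch below shows that pattern with Python's standard `concurrent.futures`; it is a generic illustration, not a description of how the IMG workspace is implemented, and `submit_job`, `job_status`, `job_result`, and `count_shared_genes` are hypothetical names.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
jobs = {}                      # job id -> Future
_job_ids = itertools.count(1)  # simple incrementing job identifiers


def submit_job(func, *args):
    """Start func(*args) in the background and return a job id right away."""
    job_id = next(_job_ids)
    jobs[job_id] = executor.submit(func, *args)
    return job_id


def job_status(job_id):
    """Report 'pending', 'failed', or 'done' without blocking."""
    future = jobs[job_id]
    if not future.done():
        return "pending"
    return "failed" if future.exception() is not None else "done"


def job_result(job_id, timeout=None):
    """Block until the job finishes (or the timeout expires) and return its result."""
    return jobs[job_id].result(timeout=timeout)


def count_shared_genes(set_a, set_b):
    """Stand-in for a long-running comparison over two gene sets."""
    return len(set(set_a) & set(set_b))


job = submit_job(count_shared_genes, ["g1", "g2", "g3"], ["g2", "g3", "g4"])
shared = job_result(job)
print(job_status(job), shared)
```

In a web application the job table would normally live in a database or queue rather than in process memory, so that results survive restarts; the in-memory dictionary here only keeps the example short.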
11
IMG data distribution: genome.jgi.doe.gov
12
IMG documentation