Generic model/many/my organism database Oct/Nov 2007 Don Gilbert Genome Informatics Lab, Biology Dept., Indiana University GMOD.


1 generic model/many/my organism database Oct/Nov 2007 Don Gilbert Genome Informatics Lab, Biology Dept., Indiana University gilbertd@indiana.edu GMOD

2 Indiana GMOD Potpourri
  http://eugenes.org/gmod/docs/gmod-intro-07oct.pdf
  Recent Updates for GMOD-CSHL-0711
  Genome Grid
  GMODTools update
  Gene Summary Pages in XML

3 Genome Grid
  Middleware to easily use TeraGrid (& other Grid) for genome analyses
  Give me your genomes to Gridalyze
  Collaborators wanted!
  Apply BioMart, Ergatis, LuceGene, Galaxy
  Science gateway to use TeraGrid for genome analyses
  Blast: proteome x non-redundant; organisms x genome
  gene finders, interproscan, others
  gmod.org/Genome_grid

4 GMODTools update
  Update: config for new genome chado dbs (sea urchin, paramecium) loaded via GMOD gff2chado
  New: GO gene-association output
  Please publish your Chado DB: gmod.org/Public_Chado_Databases
  each project chado has variations
  Cleans database contents for public use
  Todo: add gene page xml, others?
  gmod.org/GMODTools

5 Gene Summary Pages
  Simple, readable XML summarizes gene info.
  In use at the Daphnia base (wFleaBase.org): wfleabase.org/lucegene/lookup?id=NCBI_GNO_149114
  Created from Chado DB or overloaded GFF
  Software is simple Perl lib, XML DTD
  eugenes.org/gmod/gene-report-examples/

6 Gene Page XML
  Gene Summary 2007-Sep-02
  NCBI_GNO_200214
  Daphnia pulex
  C:integral to membrane
  F:rhodopsin-like receptor activity
  P:G-protein coupled receptor protein signalin...
  P:phototransduction
  Rh3-PA Drosophila virilis UniProt:Q8I138
  Bacterial infection
  Pfam:PF00001 7tm_1
  WFes0143594
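The XML markup of this slide did not survive in the transcript; only the text values remain. As a rough illustration of what a "simple, readable XML" gene summary could look like, here is a minimal Python sketch that builds one from the values shown above. The element and attribute names (GeneSummary, Organism, GOTerm, Dbxref) are hypothetical and are not taken from the GMOD gene-report DTD; the truncated GO term is left out rather than guessed.

```python
import xml.etree.ElementTree as ET

# Values copied from the slide; the dictionary layout itself is an assumption.
EXAMPLE_GENE = {
    "id": "NCBI_GNO_200214",
    "date": "2007-Sep-02",
    "organism": "Daphnia pulex",
    "go_terms": [
        "C:integral to membrane",
        "F:rhodopsin-like receptor activity",
        "P:phototransduction",
    ],
    "xrefs": [("UniProt", "Q8I138"), ("Pfam", "PF00001")],
}


def gene_summary_xml(gene):
    """Serialize one gene record to an XML string (hypothetical element names)."""
    root = ET.Element("GeneSummary", id=gene["id"], date=gene["date"])
    ET.SubElement(root, "Organism").text = gene["organism"]
    for term in gene["go_terms"]:
        ET.SubElement(root, "GOTerm").text = term
    for db, accession in gene["xrefs"]:
        ET.SubElement(root, "Dbxref", db=db, accession=accession)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(gene_summary_xml(EXAMPLE_GENE))
```

The actual format is defined by the DTD and examples at eugenes.org/gmod/gene-report-examples/.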

7 ... on to Introduction to GMOD ...

8 GMOD Introduction
  Generic Model Organism Database
  Built by and for many contributing projects
  Loosely coupled tool kit
    Work as separate parts and together
  Complex and simple
    No more complex than necessary; complexity is part of this territory.

9 Your project needs?
  New Genome? Draft assembly in parts; many computed annotations; little literature
  Known Genome? Large literature base; rich and complex biology knowledge
  Lab integration? Support and integrate with focused lab research project

10 Getting Started w/ GMOD
  gmod.org/Getting Started
  Documentation is now rich and improving
  Installation options:
    distribution tar-ball
    Virtual Machine-Ware for demo
    YUM Unix packages

11 GMOD Components
  Chado – database schema and middleware
  GBrowse – Web-based genome annotation viewing
  Apollo – Desktop-based genome annotation editing
  CMap – Web-based comparative map viewing
  BioMart – Genome data mining from Ensembl/GMOD

12 Chado Database How-To
  Chado - Getting Started
    gmod.org/Chado_Manual
    modules, conventions, design principles
  Worked examples @ gmod.org
    Load_RefSeq_Into_Chado
    Load_BLAST_Into_Chado
    Sample_Chado_SQL
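To give a feel for the kind of query the Sample_Chado_SQL page covers, here is a minimal Python sketch that joins features to their type and organism. It assumes a heavily simplified subset of three Chado tables (feature, cvterm, organism) and runs on an in-memory SQLite database for portability, whereas Chado itself is deployed on PostgreSQL. The sample rows and the helper names build_demo_db and features_of_type are invented for illustration.

```python
import sqlite3

# Simplified subset of Chado tables. The real schema has more columns,
# constraints, and modules; this is only enough to show the join pattern.
SCHEMA = """
CREATE TABLE organism (organism_id INTEGER PRIMARY KEY, genus TEXT, species TEXT);
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY,
    organism_id INTEGER REFERENCES organism(organism_id),
    type_id INTEGER REFERENCES cvterm(cvterm_id),
    uniquename TEXT,
    name TEXT
);
"""


def build_demo_db():
    """Create an in-memory database with a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO organism VALUES (1, 'Daphnia', 'pulex')")
    conn.executemany("INSERT INTO cvterm VALUES (?, ?)", [(1, "gene"), (2, "mRNA")])
    conn.executemany(
        "INSERT INTO feature VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, 1, "demo_gene_0001", "geneA"),
            (2, 1, 2, "demo_mrna_0001", "geneA-RA"),
            (3, 1, 1, "demo_gene_0002", "geneB"),
        ],
    )
    return conn


def features_of_type(conn, type_name):
    """Return (uniquename, name, organism) for features of one cvterm type."""
    rows = conn.execute(
        """
        SELECT f.uniquename, f.name, o.genus || ' ' || o.species
        FROM feature f
        JOIN cvterm t ON t.cvterm_id = f.type_id
        JOIN organism o ON o.organism_id = f.organism_id
        WHERE t.name = ?
        ORDER BY f.uniquename
        """,
        (type_name,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    for row in features_of_type(build_demo_db(), "gene"):
        print(row)
```

The feature type is looked up through cvterm rather than stored as a string on the feature row, which reflects the ontology-driven design described on the following slides.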

13 Chado Design
  Modularity: inherent Chado schema, core module, biology groupings, with common structure.
  Ontologies: standard biology vocabularies a core of Chado design.
  Associated software: Perl and Java middleware, stand-alone programs with Chado adaptors.

14 Chado Design [2]
  Complexity and Detail: inherent in genome data, Chado embraces with room to grow, plus long-term stability.
  Data Integration: key component of Chado, public and lab data sets can be combined.
  Support: shared responsibility among the GMOD community.

15 Chado Schema: Core
  CV: Controlled vocabularies and ontologies
  Sequence: Biological sequences and objects which can be localized on them
  Companalysis: Adjunct to sequence module for in-silico analysis
  Map: Adjunct to sequence module for non-sequence localization
  Organism: Taxonomy / species information
  Pub: Publication / Biblio. / Reference information
  General: General information / database cross-references

16 Chado Schema: More
  Expression: Transcript and protein expression events
  Mage: for microarray data
  Genetics: Genetic/phenotypic interactions in genotypic/environmental context
  Phenotype: for phenotypic data
  Library: for descriptions of molecular libraries
  Phylogeny: for organisms and phylogenetic trees
  Stock: for specimens and biological collections
  Contact: for people, groups, and organizations

17 Chado Middleware
  GFF to Chado data loader, with BioPerl extensions (GenBank2GFF -> Chado, …)
  GMODTools - Output Bulk genome data
  XORT - Chado XML input and output
  Modware - OO-Perl Chado access package (in/out)
  Java middleware (Hibernate; others)
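One step any GFF-to-Chado loader has to handle is the coordinate change: GFF3 uses 1-based, inclusive start/end positions, while Chado's featureloc table stores interbase coordinates (fmin, fmax) and a numeric strand. The following Python sketch shows only that conversion for a single GFF3 line. It is not the GMOD loader; the function names and the sample line are invented for illustration.

```python
from urllib.parse import unquote

# GFF3 strand symbols mapped to the numeric strand stored in featureloc.
STRAND = {"+": 1, "-": -1}


def parse_gff3_line(line):
    """Split one GFF3 feature line into its nine tab-separated columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated GFF3 columns, got %d" % len(cols))
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    attributes = {}
    for pair in attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key] = unquote(value)  # GFF3 percent-encodes reserved characters
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


def to_featureloc(record):
    """Convert 1-based inclusive GFF3 coordinates to interbase fmin/fmax."""
    return {
        "srcfeature": record["seqid"],
        "fmin": record["start"] - 1,  # interbase: shift the start down by one
        "fmax": record["end"],        # the end position is unchanged
        "strand": STRAND.get(record["strand"], 0),
    }


if __name__ == "__main__":
    demo = "scaffold_1\tdemo\tgene\t101\t500\t.\t-\t.\tID=gene0001;Name=geneA"
    print(to_featureloc(parse_gff3_line(demo)))
```

With interbase coordinates the feature length is simply fmax - fmin, which is one reason to convert at load time rather than at query time.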


19 GMOD Components [2]
  Sybil – Web-based synteny viewing at gene & chromosome level
  Turnkey – “Skinnable” Chado-based web site
  Pathway Tools – metabolic pathways
  PubFetch – Literature management
  Textpresso – Automatic paper classification
  LuceGene - Genome object/text/web search system

20 GMOD Components [3]
  Wikipedia Community Annotation (in development; EcoliWiki ++)
  Comparative visualization - SynBrowse & SynView
  Genome grid - Teragrid methods for genome computations (in dev.)

21 WikiGenomes (ecoliwiki.net)

22 GMOD Components [4]
  Database Frameworks:
    VMWare: virtual machine package with basic GMOD components for demo
    YUM distribution package
    ARGOS: replication framework for genome databases

23 Putting GMOD together
  Core: PostgreSQL database; Chado Schema; Sequence & OBO Ontologies
  System: Apache web server; Unix; BioPerl; …
  Load data: GFF to Chado
  View: GBrowse (Chado; MySQL; ..)
  Edit/Update: Apollo, Wiki (coming), bulk-file updates
  Output: BulkFiles; BioMart

24 Example new MOD

25 Recap: Your project needs?
  New Genome? Known? Lab integration?
  Assess your customer needs
  Full database/toolset is overkill for some
  Loosely coupled tools; complex and simple
  Pick the parts you need
  Learn tools with examples first

26 Chado-centric Genome
  Genome Annotations
    Proteome annotations, EST/cDNA, gene predictions, RNA, transposon, promoter, etc.
    Database cross-refs: UniProt, Gene Ontology, KEGG, KOG, etc.
  Web-Database
    GBrowse maps, Blast server with Chado output, Gene detail reports, BioMart data mining; Wikipedia community editing

27 Contributing to GMOD
  Current components
    Need adopters to share effort
    Re-use rather than re-invent
    Describe: GMOD.org Wiki needs more examples
  New components
    Discuss with other projects: common need?
    Shared specifications, use cases
    GMOD recommended practices

28 Active GMOD Mailing Lists
  https://lists.sourceforge.net/lists/listinfo/
    gmod-announce
    gmod-schema: All Chado schema issues
    gmod-gbrowse: GBrowse mailing list
    gmod-devel: General development
  Related: Ontologies (SO, OBO); BioPerl; Apollo; BioMart


