The GMOD Project Lincoln Stein Cold Spring Harbor Laboratory.

1 The GMOD Project Lincoln Stein Cold Spring Harbor Laboratory

2 Test Subject: Michael Caudy
o Drosophila neurobiologist
o Proneural differentiation
o notch pathway
o HLH transcriptional activators/repressors
o achaete/scute complex
o No computer science training
o Took my “bioinformatics for biologists” course

3 “Simple” Problem
o Discover the transcription factor binding site code controlling proneural differentiation.

4 Regular Expression Search
o Using achaete promoter as exemplar, search for combinations of known binding sites in particular architectures
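
A minimal sketch of this kind of search, under stated assumptions. The slides do not give the actual motifs or architectures, so the two patterns below are illustrative only: CANNTG is the generic E-box consensus bound by HLH proteins, and CACNAG (the N-box) stands in for a second site. The function name, the spacing limit, and the test sequence are hypothetical.

```python
import re

# Assumption: illustrative motifs only; the slides do not list the real ones.
E_BOX = "CA[ACGT]{2}TG"   # generic E-box consensus, CANNTG
N_BOX = "CAC[ACGT]AG"     # N-box consensus, CACNAG, used as a stand-in

def find_clusters(seq, max_gap=20):
    """Return (start, end) spans where an E-box is followed by an N-box
    with at most max_gap bases between them."""
    # The lookahead makes every match zero-width, so overlapping
    # clusters starting at different positions are all reported.
    pattern = re.compile(
        f"(?=({E_BOX}[ACGT]{{0,{max_gap}}}?{N_BOX}))"
    )
    return [(m.start(1), m.end(1)) for m in pattern.finditer(seq.upper())]

# Hypothetical promoter fragment, not real achaete sequence.
promoter = "ttgaCAGCTGaaatttCACGAGcc"
print(find_clusters(promoter))
```

Encoding the "architecture" as one pattern (site, bounded gap, site) keeps the order and spacing constraints explicit, which is what separates a cluster search from a plain single-motif scan.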

5 Mike’s Got Lots of Data
o 90-11,000 TF binding site clusters
o 100s-1000s of genes
o millions of interactions
o Which genes are involved in neural differentiation?
o Which have interactions with the pathway?
o Which have suggestive mutant phenotypes?

6 Mike Needs a Database
o Database management system for proneural differentiation genes.
o Visualization/exploration tools for relationship of genes to putative TF clusters.
o Literature citations
o Link out to FlyBase, Genbank & other DBs.
o Add notes and other annotations.

7 Try to do it with Filemaker
o “Cluster-centric” vs “gene-centric”?
o Data import from FlyBase?
o Storing images?
o Maintaining relationships between genes & clusters?
o Updates?

8 Mike Needs a MOD
o Model Organism Database
o Repository for reagents
  o Stocks, vectors, clones
o Genetic & physical maps
o Large-scale data sets
  o Genome
  o EST sets, microarray results, 2-cell hybrid interactions
o Literature
o Ontologies & Nomenclature
o Meetings, announcements

9 Example MOD: WormBase

10 Looking for Sex

11 An Author Entry

12 Bibliography

13 Citation

14 Gene

15 Genome

16 Proteome

17 Comparative Genomics

18 Functional Genomics

19 Anatomy

20 How WormBase Works
Diagram components: ACeDB; Images, Movies; Database access library; Web server; Perl scripts; You; MySQL; Genomic Data

21 Can Mike reuse WormBase to manage his data? No!

22 Sorry Mike
o WormBase website difficult to install
o Data model nematode-centric
o Data entry tools very process-specific
o Customization difficult
o Software documentation uneven
o Standard operating procedure documentation uneven

23 MOD Redux
o SGD, MGD, FlyBase, TAIR, RGD…
o The same basic idea as WormBase
o Implementation entirely different
o Wheel reinvented many times
o Little software sharing
o This madness must stop!

24 The GMOD Project
o Portable, open source software to support model organism databases
o Multiple MODs involved
  o Worm, fly, yeast, mouse, Arabidopsis, rat, monocot, [fugu], [E. coli]
o Funded by NIH as of June 2002
o Programmers, coordinator, quarterly meetings

25 GMOD Home Page

26 The GMOD Pyramid
Layers: Open Source DBMS & Middleware; Modular Schema; Modular Applications

27 A MOD Construction Set
Layers: Database Layer; Middleware Layer; Application Layer
Diagram labels: genome, genetic maps, literature, genomes, maps, citations, genome browser, genome editor, map browser, map editor, citation browser, citation editor, Bioperl, BioJava, BioPython, annotation pipeline

28 Chado – Modular Schema
o Common schema for use by FlyBase and WormBase
o Ontology Driven
o Small number of generic tables e.g. “feature”
o Controlled vocabulary names object types and relationships among them:
  o “achaete protein is an HLH activator”
  o “m8 protein inhibits achaete transcription”
o Evidence-Savvy
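
A rough sketch of the "few generic tables plus controlled vocabulary" idea, using an in-memory SQLite database. The tables are heavily simplified and are not the actual Chado schema. The helper functions and the term name `inhibits_transcription_of` are hypothetical, chosen only to encode the second example statement above.

```python
import sqlite3

# Simplified, illustrative tables only; the real schema has many more
# columns and modules than shown here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY,
    name TEXT,
    type_id INTEGER REFERENCES cvterm(cvterm_id)
);
CREATE TABLE feature_relationship (
    subject_id INTEGER REFERENCES feature(feature_id),
    type_id INTEGER REFERENCES cvterm(cvterm_id),
    object_id INTEGER REFERENCES feature(feature_id)
);
""")

def term(name):
    """Return the id of a vocabulary term, creating it if needed."""
    conn.execute("INSERT OR IGNORE INTO cvterm (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT cvterm_id FROM cvterm WHERE name = ?", (name,)
    ).fetchone()
    return row[0]

def feature(name, type_name):
    """Insert a feature whose type is a vocabulary term; return its id."""
    cur = conn.execute(
        "INSERT INTO feature (name, type_id) VALUES (?, ?)",
        (name, term(type_name)),
    )
    return cur.lastrowid

# "m8 protein inhibits achaete transcription" as subject / type / object.
m8 = feature("m8", "protein")
achaete = feature("achaete", "gene")
conn.execute(
    "INSERT INTO feature_relationship VALUES (?, ?, ?)",
    (m8, term("inhibits_transcription_of"), achaete),
)

rows = conn.execute("""
    SELECT s.name, t.name, o.name
    FROM feature_relationship r
    JOIN feature s ON s.feature_id = r.subject_id
    JOIN cvterm  t ON t.cvterm_id  = r.type_id
    JOIN feature o ON o.feature_id = r.object_id
""").fetchall()
print(rows)
```

Because both object types and relationship types are rows in `cvterm`, adding a new kind of object or relationship is a data change rather than a schema change.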

29 GMOD Applications
o Apollo genome annotation editor
o GBrowse generic genome browser
o PubSearch literature curation editor
o CMap comparative map browser
o IMD insertional mutagenesis database management system

30 Apollo – BDGP & Sanger Center

31 Apollo Data adapters
o Parser -> data models -> display
o Existing data adapters
  o GAME XML
  o GFF
  o Ensembl CGI server
  o DAS
o Write your own data adapter!
  o Extend AbstractDataAdapter class
o Display options defined in config file

32 Who is Using Apollo?
o BDGP
  o Reannotated Drosophila genome
o Bristol-Myers Squibb
  o Launching Apollo from web browser via MIME types
o GNF
  o JDBC adapter layer over BioSQL
o Biogen
  o View human genome alignment between public and Biogen internal database
  o Connected BLAT pipeline to Apollo
o HGMP-RC Fugu Genomics group
  o Displaying annotations on fugu scaffolds

33 PubSearch – TAIR & RatDB

34 PubSearch – Gene Association

35 IMD – Insertional Mutagenesis Db

36 CMap – Gramene

37 CMap – Detailed View

38 GBrowse – WormBase

39 GBrowse – Zoomed in

40 GBrowse – Zoomed Way In

41 GBrowse – Zoomed Way Way In

42 GBrowse – Keyword Search

43 GBrowse – Third Party Annotations

44 Sequence dumps & other reports

45 Extensively Customizable
o End-user
  o Turn tracks on and off, change order, change packing & labeling attributes (stored in cookie)
o Data provider
  o Change fonts, colors, text.
  o Change overview – genetic map, contigs, coverage, karyotype.
  o Define new tracks using simple config file.
  o Tinker with track appearance to heart’s content.

46 Adding a New Track
(a) Create a GFF file named “deletions.gff”
    Chr1 targeted deletion 1293224 1294901 . . . Deletion d101k2
    Chr1 targeted deletion 8239811 8241116 . . . Deletion d680k2
    Chr2 targeted deletion 5866382 5866500 . . . Deletion d007k2
(b) Run the script
    > -d example_database deletions.gff
    Loading features…
    Done. 3 features loaded.
(c) Add a new track “stanza” to the gbrowse configuration file
    [Knockout]
    feature = deletion
    glyph = span
    fgcolor = red
    key = Knockouts
    link =$name
    citation = These are deletion knockouts produced by the example knockout consortium (
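
To make the connection between the GFF rows and the track stanza concrete, here is a small sketch that parses the three example rows and selects the ones a `feature = deletion` stanza would display. It assumes the usual nine-column, tab-separated GFF layout, with "." placeholders for score, strand and phase. The `Feature` type and `parse_gff_line` are hypothetical names and not part of GBrowse or BioPerl.

```python
from typing import NamedTuple

class Feature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int
    end: int
    group: str

def parse_gff_line(line):
    """Parse one GFF row. Assumes nine tab-separated columns; columns
    6-8 (score, strand, phase) are ignored in this sketch."""
    cols = line.rstrip("\n").split("\t")
    seqid, source, ftype, start, end = cols[:5]
    return Feature(seqid, source, ftype, int(start), int(end), cols[8])

# The three rows from the slide, with "." assumed for the empty columns.
GFF_TEXT = "\n".join([
    "Chr1\ttargeted\tdeletion\t1293224\t1294901\t.\t.\t.\tDeletion d101k2",
    "Chr1\ttargeted\tdeletion\t8239811\t8241116\t.\t.\t.\tDeletion d680k2",
    "Chr2\ttargeted\tdeletion\t5866382\t5866500\t.\t.\t.\tDeletion d007k2",
])

features = [
    parse_gff_line(line)
    for line in GFF_TEXT.splitlines()
    if line and not line.startswith("#")
]

# The stanza's "feature = deletion" selects rows by the type column.
knockouts = [f for f in features if f.type == "deletion"]
for f in knockouts:
    # GFF coordinates are 1-based and inclusive, hence the +1.
    print(f.group, f.seqid, f.end - f.start + 1)
```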

47 Extensively Extensible
Diagram components: Apache Web Server; gbrowse CGI script; BioPerl library; Plugins; Bio::Graphics library; Glyphs; Bio::DB::GFF adaptor; Chado adaptor; Oracle adaptor; Flat File adaptor; MySQL/Postgres; Oracle; Flat Files
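
The adaptor layer in this diagram follows a common pattern: the browser code talks to one interface, and each storage backend supplies its own implementation. The sketch below illustrates that pattern only. All class and function names are hypothetical and do not correspond to the BioPerl or GBrowse APIs.

```python
import io
from abc import ABC, abstractmethod

class FeatureAdaptor(ABC):
    """Hypothetical interface: a backend that returns features in a region."""

    @abstractmethod
    def features(self, seqid, start, end):
        """Return (seqid, start, end, name) tuples overlapping the region."""

def overlaps(row, seqid, start, end):
    # Closed-interval overlap test on the same sequence.
    return row[0] == seqid and row[1] <= end and row[2] >= start

class InMemoryAdaptor(FeatureAdaptor):
    """Stands in for a database-backed adaptor."""

    def __init__(self, rows):
        self.rows = list(rows)

    def features(self, seqid, start, end):
        return [r for r in self.rows if overlaps(r, seqid, start, end)]

class FlatFileAdaptor(FeatureAdaptor):
    """Reads whitespace-separated 'seqid start end name' lines."""

    def __init__(self, handle):
        self.rows = []
        for line in handle:
            seqid, start, end, name = line.split()
            self.rows.append((seqid, int(start), int(end), name))

    def features(self, seqid, start, end):
        return [r for r in self.rows if overlaps(r, seqid, start, end)]

def track_labels(adaptor, seqid, start, end):
    """Display-side code: depends only on the FeatureAdaptor interface."""
    return [name for _, _, _, name in adaptor.features(seqid, start, end)]

rows = [("Chr1", 100, 200, "geneA"), ("Chr1", 500, 600, "geneB")]
flat = io.StringIO("Chr1 100 200 geneA\nChr1 500 600 geneB\n")
for adaptor in (InMemoryAdaptor(rows), FlatFileAdaptor(flat)):
    print(type(adaptor).__name__, track_labels(adaptor, "Chr1", 150, 550))
```

Supporting a new data source then means writing one more adaptor class; the display code stays unchanged.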

48 GBrowse on GenBank? GBrowse on GenBank!
Diagram components: Apache Web Server; gbrowse CGI script; BioPerl library; Plugins; Bio::Graphics library; Glyphs; GenBank Proxy Adaptor; GenBank; Bio::DB::GFF adaptor; MySQL

49 B. burgdorferi via GenBank proxy

50 Who is Using GBrowse?
o GMOD Members
  o WormBase, FlyBase, RatDB
o HGMP-RC Fugu genomics group
o KEGG (multiple microorganisms)
o Ingenium AG (mouse)
o Bristol-Myers Squibb (Drosophila)
o Texas A&M University (Salmonella)
o McGill University (human chr7)
o Institute of Systems Biology (human)

51 Genome Knowledgebase (GK)

52 “Constellation View” (in dev)
Diagram labels: TCA Cycle; Oxidative Decarboxylation; Amino Acid Biosynthesis; Ethanol Catabolism; Glucose Metabolism; RNA Splicing; DNA Replication

53 “Constellation View” (in dev)
Diagram labels: TCA Cycle; Oxidative Decarboxylation; Amino Acid Biosynthesis; Ethanol Catabolism; Glucose Metabolism; RNA Splicing; DNA Replication

54 Can Mike use GMOD to manage his data? Almost

55 Mike’s very own flybase

56 Uploaded Annotations

57 Details

58 Essential Pieces in Progress
o Generic MOD web site
o Strain & phenotype curation tools
o Pathway tools and browsers
o Tree (e.g. phylogenetic) tools & browsers
o Biopipe – genome annotation pipeline

59 Find out more about GMOD
o Go to
o Examine software matrix
o Find a project you’re interested in
o Contact project leader
o Or contact Scott Cain:
o Or mail

60 Credits
CSHL: Adrian Arva, Shuly Avraham, Scott Cain, Ken Clark, Allen Day, Xiaokang Pan
BDGP: Nomi Harris, Suzanna Lewis, Chris Mungall, John Richter, ShengQiang Shu, Colin Weil
EBI: Michele Clamp, Stephen Searle
Carnegie Institute: Sue Rhee, Danny Yoo
Harvard: David Emmert, Stan Letovsky
Cornell Medical School: Michael Caudy
