Curation of the EcoCyc Database: The EcoCyc Update Project Martha Arnaud Scientific Database Curator Bioinformatics Research Group SRI International


1 Curation of the EcoCyc Database: The EcoCyc Update Project Martha Arnaud Scientific Database Curator Bioinformatics Research Group SRI International http://www.ecocyc.org http://www.biocyc.org

2 SRI International Bioinformatics

3 SRI International Bioinformatics EcoCyc Organization
EcoCyc collects information about multiple types of database objects:
- Pathway *
- Reaction *
- Compound *
- Protein
- Gene *
- Transcription Unit *
* hierarchies
(Diagram labels: Proteins, Compounds, Genes, Pathway, Reactions)

4 SRI International Bioinformatics EcoCyc Statistics
- 176 pathways
- 992 enzymes
- 1006 enzymatic reactions
- 169 transporters
- 828 transcription units
- 1929 proteins have a comment (598 of these comments exceed 300 characters)

5 SRI International Bioinformatics EcoCyc Pathway Information http://biocyc.org:1555/ECOLI/new-image?type=PATHWAY&object=ALANINE-VALINESYN-PWY&detail-level=2

6 SRI International Bioinformatics EcoCyc Pathway Information http://biocyc.org:1555/ECOLI/new-image?type=PATHWAY&object=ALANINE-VALINESYN-PWY&detail-level=2
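The pathway page address on the two slides above follows a simple query-string pattern. A minimal sketch of how such a URL can be assembled, assuming the `type`, `object`, and `detail-level` parameters behave as they appear in the slide's URL (the helper name `pathway_url` is hypothetical, and the code only builds the string; it does not contact the server):

```python
from urllib.parse import urlencode

# Base address copied from the slide; whether this server still responds is not assumed.
BASE = "http://biocyc.org:1555/ECOLI/new-image"

def pathway_url(object_id, detail_level=2, base=BASE):
    """Build a pathway display URL in the same shape as the one on the slide."""
    query = urlencode({
        "type": "PATHWAY",
        "object": object_id,
        "detail-level": detail_level,
    })
    return f"{base}?{query}"

url = pathway_url("ALANINE-VALINESYN-PWY")
print(url)
```

With the pathway identifier from the slide, this reproduces the slide's URL exactly.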

7 SRI International Bioinformatics …viewed with “More Detail”

8 SRI International Bioinformatics EcoCyc Protein Information comment citations reaction

9 SRI International Bioinformatics EcoCyc Gene Information

10 SRI International Bioinformatics EcoCyc Metabolic Overview http://biocyc.org/ov-expr.shtml Static or animated views of expression data

11 SRI International Bioinformatics EcoCyc Curation
- names and synonyms
- gene classes
- subunit composition of protein complexes
- location of gene product
- protein or complex molecular weight
- enzyme activity name
- enzyme properties (activators, inhibitors, cofactors)
- comment fields
- evidence
- citations
- reactions catalyzed
- pathway information

12 SRI International Bioinformatics Build a new MOD or add a “Pathway Module”!
Pathway Tools Software
- Takes annotated genome
- Generates database, including pathway predictions
Freely available (academics/non-profits) http://bioinformatics.ai.sri.com/ptools/
Pathway Tools: software environment for creation, curation, analysis, and Web publishing of MODs
ptools-info@ai.sri.com
Current Pathway Tools Users
- Saccharomyces cerevisiae: SGD, Stanford University
- Arabidopsis thaliana: Carnegie Institution of Washington
- Plasmodium falciparum: Stanford University
- Mycobacterium tuberculosis: Stanford University
- Synechocystis: Carnegie Institution of Washington
- Methanococcus jannaschii: EBI

13 SRI International Bioinformatics EcoCyc Strengths Metabolism Transport Transcription regulation

14 SRI International Bioinformatics EcoCyc into the Future: “EcoCyc is not just metabolism anymore!” …an integrated, review-level information resource on E. coli genomics and biochemistry…

15 SRI International Bioinformatics The EcoCyc Update Project:
- What do we need to do? Goals
- Can we possibly get it done? Quantification
- Where do we start? Priorities
- How is it going? Progress

16 SRI International Bioinformatics EcoCyc Update: Curation Goals
- Expand database scope beyond metabolism, transporters, and transcription
- Curate associated reactions and pathways
- Stay current with the latest papers
- Curate every gene product:
  - literature-based descriptions
  - comprehensive reference lists

17 SRI International Bioinformatics EcoCyc Update: Quantification
Genes to curate:
- 4405 genes
- minus 175 transcription factors
- minus 168 transporters
- = 4062 genes to curate
Curator time:
- Full-time curator: 4 days/week on curation
- Part-time curator (70%), years 2-4
- Year 1: 1600 hours
- Year 2: 3000 hours
- Year 3: 3000 hours
- Year 4: 3000 hours
- Total: 10,600 hours / 4062 genes = 2.6 hours per gene
Curation of abstracts
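The arithmetic on this slide can be checked in a few lines. This is only a sketch that restates the slide's own figures; the variable names are illustrative and not part of any EcoCyc tooling.

```python
# Figures taken directly from the Quantification slide.
total_genes = 4405
transcription_factors = 175
transporters = 168

# Genes left after excluding the two groups handled separately.
genes_to_curate = total_genes - transcription_factors - transporters

# Planned curation hours per project year.
hours_by_year = {1: 1600, 2: 3000, 3: 3000, 4: 3000}
total_hours = sum(hours_by_year.values())

# Average time budget available for each gene.
hours_per_gene = total_hours / genes_to_curate

print(genes_to_curate)           # 4062
print(total_hours)               # 10600
print(round(hours_per_gene, 1))  # 2.6
```

The budget of roughly 2.6 hours per gene is consistent with the slide's closing note, "Curation of abstracts".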

18 SRI International Bioinformatics EcoCyc Update: Priorities
1. Problems raised by users and advisors
2. Gene products that have new characterizations published in the literature
3. Gene products that have not yet been thoroughly curated
4. Gene products that have been curated, but have not been updated lately
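The four priorities above amount to a tiered ordering of the curation queue. One way to picture it, purely as an illustration: the record fields (`user_reported_problem`, `has_new_literature`, `thoroughly_curated`) and the gene names are hypothetical placeholders, not fields or entries from EcoCyc.

```python
def priority(gene_product):
    """Return the priority tier (1 = most urgent) for a gene product record."""
    if gene_product.get("user_reported_problem", False):
        return 1  # problems raised by users and advisors
    if gene_product.get("has_new_literature", False):
        return 2  # new characterizations published
    if not gene_product.get("thoroughly_curated", False):
        return 3  # not yet thoroughly curated
    return 4      # curated, but due for an update

# Hypothetical queue entries used only to exercise the ordering.
queue = [
    {"name": "geneA", "thoroughly_curated": True},
    {"name": "geneB"},
    {"name": "geneC", "has_new_literature": True, "thoroughly_curated": True},
    {"name": "geneD", "user_reported_problem": True},
]

# sorted() is stable, so entries within a tier keep their original order.
ordered = sorted(queue, key=priority)
names = [gp["name"] for gp in ordered]
print(names)  # ['geneD', 'geneC', 'geneB', 'geneA']
```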

19 SRI International Bioinformatics Where are we now?
- 807 gene products curated
- 807/4062 = 19.9% of the total (excluding transport and transcription factors)
- 4-year plan: Curate 615 genes in Year 1
- We are meeting our goal!
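The progress figures can be reproduced as follows. The slide does not say how the Year 1 target of 615 genes was derived; the sketch assumes it came from dividing the 1600 Year 1 hours by the rounded 2.6 hours per gene, which happens to give the same number.

```python
# Figures from the slides.
curated = 807
genes_to_curate = 4062
year1_hours = 1600
hours_per_gene = 2.6  # rounded value quoted in the Quantification slide

# Share of the curation workload completed so far.
percent_done = 100 * curated / genes_to_curate

# Assumed derivation of the Year 1 target (not stated in the source).
year1_target = int(year1_hours / hours_per_gene)

print(round(percent_done, 1))   # 19.9
print(year1_target)             # 615
print(curated >= year1_target)  # True
```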

20 SRI International Bioinformatics The EcoCyc Collaboration
SRI
- Peter Karp, PI
- Suzanne Paley, Software Engineer
- John Pick, Software Engineer
- Martha Arnaud, Curator
UCD
- John Ingraham, Project Leader
MBL
- Monica Riley, Editor Emerita
UNAM
- Julio Collado-Vides, Project Leader
- Socorro Gama-Castro, Curator
- Martin Peralta, Curator
TIGR
- Ian Paulsen, Project Leader
- Mark Hance, Curator
UCSD
- Milton Saier, Project Leader
- Can Tran, Curator
Funding: NIH National Center for Research Resources

22 SRI International Bioinformatics Pathway/Genome DBs Created by External Users
- Saccharomyces cerevisiae, Stanford University
  - pathway.yeastgenome.org/biocyc/
- Plasmodium falciparum, Stanford University
  - plasmocyc.stanford.edu
- Mycobacterium tuberculosis, Stanford University
  - BioCyc.org
- Arabidopsis thaliana and Synechocystis, Carnegie Institution of Washington
  - Arabidopsis.org:1555
- Methanococcus jannaschii, EBI
  - Maine.ebi.ac.uk:1555
Other PGDBs in progress by 40 other users
Software freely available
Each PGDB owned by its creator

