Overview of the Pathway Tools Software and Pathway/Genome Databases.


SRI International Bioinformatics

Introductions
- BRG Staff
  - Peter Karp
  - Pallavi Kaipa
  - Mario Latendresse
  - Suzanne Paley
  - Markus Krummenacker
  - Ingrid Keseler
  - Ron Caspi
  - Alex Shearer
  - Carol Fulcher
- Attendees
  - Where from, what genome?
  - What do you hope to get out of the tutorial?

SRI International
- Private nonprofit research institute
- No permanent funding sources
- 1300 staff in Menlo Park
  - Founded in 1946 as Stanford Research Institute
  - Separated from Stanford University in 1970
  - Name changed to SRI International in 1977
  - David Sarnoff Research Center acquired in 1987

SRI Organization
- Information and Computing Sciences
- Engineering Systems And Sciences
- Physical Sciences
- Biopharmaceuticals And Pharmaceutical Discovery
- Education and Policy
- Bioinformatics Research Group

Research in the SRI Bioinformatics Research Group
- EcoCyc
- MetaCyc
- Pathway Tools
- Pathway Holes
- BioWarehouse
- Enzyme Genomics

Outline for Tutorial
- Monday
  - Introduction
  - Pathway/Genome Navigator
  - PathoLogic tutorial and demo
  - PathoLogic lab session
    - Make genome input files parsable
- Tuesday
  - PathoLogic tutorial
  - PathoLogic lab session
    - Build initial version of PGDB
- Wednesday
  - Pathway hole filler, operon predictor, transport inference parser
- Thursday
  - Editors
  - Feedback session

Outline for Tutorial
- Wednesday
  - Introduction
  - Pathway/Genome Navigator
  - PathoLogic tutorial and demo
  - PathoLogic lab session: build initial version of PGDB
  - PathoLogic lab session and Pathway Tools Schema
- Thursday
  - Editing tools: tutorial/lab sessions
  - Curation strategy
- Friday
  - Tutorial: writing queries to PGDBs
  - Lisp, Java, and Perl APIs
  - How to send us bug reports, auto-patch
  - Feedback session

Tutorial Goals
- General familiarity with Pathway Tools goals and functionality
- Ability to create, edit, and navigate a new PGDB
- Create new PGDB for genome(s) you brought with you
- Familiarity with information resources available about Pathway Tools to continue your work

SRI’s Support for Pathway Tools
- NIH grant finances software development and user support
- Additional grants finance other software development
- Send us bug reports, suggestions, questions
- Comprehensive bug reports are required for us to fix the problem you reported
- Keep us posted regarding your progress

Administrative Details
- Please wear badges at all times
- Escort required outside this room/hallway
- Let us know when you are leaving
- Use E-Bldg Entrance
- Phone numbers to call from entrance
- Meals
- Wednesday outing possible
- Restrooms

Tutorial Format
- Questions welcome during presentations
- Lab sessions will take different amounts of time for different people
  - Refine your PGDB
  - Read Pathway Tools manuals
- Buddy system for some computers
- Computer logins
- Internet connectivity

Pathway/Genome Database: Integrating Genomic and Biochemical Data
[Diagram of a cell (labeled CELL) with the data types a PGDB integrates: Chromosomes, Plasmids; Genes; Proteins; Reactions; Pathways; Compounds; Operons, Promoters, DNA Binding Sites]

Terminology
- Model Organism Database (MOD): DB describing genome and other information about an organism
- Pathway/Genome Database (PGDB): MOD that combines information about
  - Pathways, reactions, substrates
  - Enzymes, transporters
  - Genes, replicons
  - Transcription factors, promoters, operons, DNA binding sites
- BioCyc: Collection of 205 PGDBs at BioCyc.org
  - EcoCyc, AgroCyc, HumanCyc

BioCyc Collection of Pathway/Genome Databases
- Pathway/Genome Database (PGDB): combines information about
  - Pathways, reactions, substrates
  - Enzymes, transporters
  - Genes, replicons
  - Transcription factors/sites, promoters, operons
- Tier 1: Literature-Derived PGDBs
  - MetaCyc
  - EcoCyc -- Escherichia coli K-12
  - BioCyc Open Chemical Database
- Tier 2: Computationally-derived DBs, Some Curation PGDBs
  - HumanCyc
  - Mycobacterium tuberculosis
- Tier 3: Computationally-derived DBs, No Curation DBs

Terminology: Pathway Tools Software
- PathoLogic
  - Predicts operons, metabolic network, pathway hole fillers, from genome
  - Computational creation of new Pathway/Genome Databases
- Pathway/Genome Editors
  - Distributed curation of PGDBs
  - Distributed object database system, interactive editing tools
- Pathway/Genome Navigator
  - WWW publishing of PGDBs
  - Querying, visualization of pathways, chromosomes, operons
  - Analysis operations
    - Pathway visualization of gene-expression data
    - Global comparisons of metabolic networks
Bioinformatics 18:S

Pathway/Genome DBs Created by External Users
- 600+ licensees: groups applying software to 100+ organisms
- Software freely available to academics
- Each PGDB owned by its creator
- Saccharomyces cerevisiae, SGD project, Stanford University
  - pathway.yeastgenome.org/biocyc/
- TAIR, Carnegie Institution of Washington
  - Arabidopsis.org:1555
- dictyBase, Northwestern University
- GrameneDB, Cold Spring Harbor Laboratory
- Planned:
  - CGD (Candida albicans), Stanford University
  - MGD (Mouse), Jackson Laboratory
  - RGD (Rat), Medical College of Wisconsin
  - WormBase (C. elegans), Caltech
- Large scale users:
  - C. Medigue, Genoscope, 67 PGDBs
  - G. Burger, U Montreal, 20 PGDBs
- DOE GTL contractors:
  - G. Church, Harvard, Prochlorococcus marinus MED4
  - Larimer/Uberbacher, ORNL, Shewanella oneidensis
  - J. Keasling, UC Berkeley, Desulfovibrio vulgaris
- Fiona Brinkman, Simon Fraser Univ, Pseudomonas aeruginosa

Terminology
- “Database” = “DB” = “Knowledge Base” = “KB” = “Pathway/Genome Database” = “PGDB”

Why Create PGDBs?
- Extract more information from your genome
- Create an up-to-date computable information repository about an organism
- Perform analyses on the genome and pathway complement of the organism, e.g., analyses of omics data
- Perform comparative analyses with other organisms
- Generate a genome poster and metabolic wall chart

Sequence Project Workflow
[Workflow diagram; box labels: Raw Sequence; Phred; Phrap; BLAST, BLOCKS; GeneMark/Glimmer; PathoLogic; P/G Navigator; P/G Editors; WWW Publishing; Analyses; Pathway Tools]

MetaCyc: Metabolic Encyclopedia
- Nonredundant metabolic pathway database
- Describes a representative sample of every experimentally determined metabolic pathway
- Literature-based DB with extensive references and commentary
- Pathways, reactions, enzymes, substrates
- Jointly developed by SRI and Carnegie Institution
Nucleic Acids Research 34:D511-D

MetaCyc Curation
- DB updates by 5 staff curators
  - Information gathered from biomedical literature
  - Emphasis on microbial and plant pathways
  - More prevalent pathways given higher priority
- Review-level database
- Four releases per year
- Quality assurance of data and software:
  - Evaluate database consistency constraints
  - Perform element balancing of reactions
  - Run other checking programs
  - Display every DB object
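The element-balancing check listed under quality assurance can be illustrated with a minimal sketch. This is not the Pathway Tools checker; the function names and the dictionary representation of chemical formulas are assumptions made for illustration only.

```python
from collections import Counter


def element_totals(side):
    """Sum atoms per element over one side of a reaction.

    `side` is a list of (coefficient, formula) pairs, where `formula` maps
    an element symbol to its count in one molecule (assumed representation).
    """
    totals = Counter()
    for coefficient, formula in side:
        for element, count in formula.items():
            totals[element] += coefficient * count
    return totals


def is_balanced(left, right):
    """A reaction is element-balanced when both sides carry the same atoms."""
    return element_totals(left) == element_totals(right)


# Example: 2 H2 + O2 -> 2 H2O
H2 = {"H": 2}
O2 = {"O": 2}
H2O = {"H": 2, "O": 1}

balanced = is_balanced([(2, H2), (1, O2)], [(2, H2O)])
unbalanced = is_balanced([(1, H2), (1, O2)], [(1, H2O)])
```

A real checker would also need to handle charge, generic substrates, and polymer formulas, which this sketch ignores.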

MetaCyc Curation
- Ontologies guide querying
  - Pathways (recently revised), compounds, enzymatic reactions
  - Example: Coenzyme M biosynthesis
- Extensive citations and commentary
- Evidence codes
  - Controlled vocabulary of evidence types
  - Attach to pathways and enzymes:
    - Code : Citation : Curator : date
- Release notes explain recent updates
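The "Code : Citation : Curator : date" layout of an evidence annotation can be pictured as a small record type. The sketch below is only an assumed representation, not the Pathway Tools storage format; the class name, field names, and the example values (including the citation and curator strings) are placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EvidenceAnnotation:
    """One evidence annotation attached to a pathway or enzyme (hypothetical type)."""
    code: str       # term from the controlled vocabulary of evidence types
    citation: str   # literature reference supporting the assertion
    curator: str    # who recorded the evidence
    when: date      # when it was recorded

    def render(self) -> str:
        # Mirror the "Code : Citation : Curator : date" layout from the slide.
        return f"{self.code} : {self.citation} : {self.curator} : {self.when.isoformat()}"


def parse_annotation(text: str) -> EvidenceAnnotation:
    """Parse the rendered form back into a record."""
    code, citation, curator, when = (part.strip() for part in text.split(" : "))
    return EvidenceAnnotation(code, citation, curator, date.fromisoformat(when))


# Placeholder values for illustration only.
example = EvidenceAnnotation("EV-EXP", "PMID:0000000", "curator1", date(2006, 1, 15))
```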

MetaCyc Data

MetaCyc Pathway Variants
- Pathways that accomplish similar biochemical functions using different biochemical routes
  - Alanine biosynthesis I: E. coli
  - Alanine biosynthesis II: H. sapiens
- Pathways that accomplish similar biochemical functions using similar sets of reactions
  - Several variants of TCA Cycle

MetaCyc Super-Pathways
- Groups of pathways linked by common substrates
- Example: Super-pathway containing
  - Chorismate biosynthesis
  - Tryptophan biosynthesis
  - Phenylalanine biosynthesis
  - Tyrosine biosynthesis
- Super-pathways defined by listing their component pathways
- Multiple levels of super-pathways can be defined
- Pathway layout algorithms accommodate super-pathways
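Defining a super-pathway by listing its components, with several levels of nesting allowed, can be sketched as a mapping from names to component lists. The registry layout, the two super-pathway names, and the `base_pathways` helper are all assumptions for illustration; MetaCyc itself stores this information in frames.

```python
# Hypothetical registry: a super-pathway maps to the list of its component
# pathways; any name absent from the registry is treated as a base pathway.
SUPER_PATHWAYS = {
    "example aromatic super-pathway": [
        "chorismate biosynthesis",
        "tryptophan biosynthesis",
        "phenylalanine biosynthesis",
        "tyrosine biosynthesis",
    ],
    "example higher-level super-pathway": [
        "example aromatic super-pathway",   # a super-pathway nested in another
        "alanine biosynthesis I",
    ],
}


def base_pathways(name, registry, _active=()):
    """Expand a (possibly nested) super-pathway into its base pathways."""
    if name in _active:
        raise ValueError(f"cyclic super-pathway definition at {name!r}")
    components = registry.get(name)
    if components is None:
        return [name]                       # already a base pathway
    result = []
    for component in components:
        for base in base_pathways(component, registry, _active + (name,)):
            if base not in result:          # keep first occurrence, preserve order
                result.append(base)
    return result


expanded = base_pathways("example higher-level super-pathway", SUPER_PATHWAYS)
```

Because a super-pathway holds only references to its components, editing a component pathway is reflected in every super-pathway that lists it.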

Family of Pathway/Genome Databases
- MetaCyc
- EcoCyc
- CauloCyc
- AraCyc
- MtbRvCyc
- HumanCyc

Comparison of BioCyc to KEGG
- KEGG approach: Static collection of pathway diagrams that are color-coded to produce organism-specific views
- KEGG vs MetaCyc: Resource on literature-derived pathways
  - KEGG pathway maps are composites of pathways in many organisms; they do not identify which specific pathways were elucidated in which organisms
  - KEGG has no literature citations, no comments, less enzyme detail
- KEGG vs BioCyc organism-specific PGDBs
  - KEGG re-annotates entire genome for each organism
  - KEGG does not curate or customize pathway networks for each organism
- Software tools
  - KEGG has no algorithmic visualization tools
  - KEGG has no queryable metabolic-map overview diagram
  - KEGG has no interactive editing tools

Omics Viewer
- Import gene expression, proteomics, metabolomics data
- Obtain pathway based visualizations of omics data
  - Numerical spectrum of expression values mapped to a color spectrum
  - Steps of overview painted with color corresponding to expression level(s) of genes that encode enzyme(s) for that step
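The mapping from a numerical spectrum of expression values to a color spectrum can be sketched as a linear interpolation between two colors. This is an illustrative assumption, not the Omics Viewer's actual color scheme: the blue-to-red endpoints, the function names, and the gene and step identifiers are all hypothetical.

```python
def value_to_color(value, lo, hi, low_rgb=(0, 0, 255), high_rgb=(255, 0, 0)):
    """Map a value in [lo, hi] linearly onto a color between low_rgb and high_rgb.

    Values outside the range are clamped to the nearest end of the spectrum.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    t = (min(max(value, lo), hi) - lo) / (hi - lo)
    return tuple(round(a + t * (b - a)) for a, b in zip(low_rgb, high_rgb))


def color_steps(step_genes, expression, lo, hi):
    """For each overview step, return one color per measured gene encoding its enzyme(s)."""
    return {
        step: [value_to_color(expression[g], lo, hi) for g in genes if g in expression]
        for step, genes in step_genes.items()
    }


# Hypothetical identifiers and log-ratio values.
steps = {"step-1": ["geneA", "geneB"], "step-2": ["geneC"]}
levels = {"geneA": -2.0, "geneB": 5.0}          # geneC was not measured
painted = color_steps(steps, levels, lo=-2.0, hi=2.0)
```

Steps whose genes have no measurement receive an empty list, which a display layer could leave unpainted.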

Environment for Computational Exploration of Genomes
- Powerful ontology opens many facets of the biology to computational exploration
- Global characterization of metabolic network
- Analysis of interface between transport and metabolism
- Nutrient analysis of metabolic network

Pathway Tools Implementation Details
- Allegro Common Lisp
- Sun, Linux, Windows platforms
- Ocelot object database
- 300,000+ lines of code
- Lisp-based WWW server at BioCyc.org
  - Manages 205 PGDBs

Pathway Tools Architecture
[Architecture diagram; component labels: Object DBMS; GFP API; Pathway Genome Navigator; WWW Server; X-Windows Graphics; Object Editor; Pathway Editor; Reaction Editor; Oracle]

Ocelot Knowledge Server Architecture
- Frame data model
  - Classes, instances, inheritance
  - Frames have slots that define their properties, attributes, relationships
  - A slot has one or more values
    - Datatypes include numbers, strings, etc.
- Transaction logging facility
- Slot units define metadata about slots:
  - Domain, range, inverse
  - Collection type, number of values, value constraints
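The frame data model above (classes, instances, inheritance, multi-valued slots, and slot units carrying domain and inverse metadata) can be illustrated with a toy model. Ocelot is implemented in Common Lisp and this is not its API; every class, function, frame, and slot name below is a hypothetical stand-in.

```python
from dataclasses import dataclass
from typing import Optional


class Frame:
    """A class or instance frame with named, multi-valued slots."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)   # class frames this frame inherits from
        self.slots = {}                # slot name -> list of values

    def put(self, slot, *values):
        self.slots.setdefault(slot, []).extend(values)

    def get(self, slot):
        # Local values win; otherwise inherit from the first parent that has any.
        if slot in self.slots:
            return list(self.slots[slot])
        for parent in self.parents:
            inherited = parent.get(slot)
            if inherited:
                return inherited
        return []

    def is_a(self, cls):
        return cls is self or any(p.is_a(cls) for p in self.parents)


@dataclass
class SlotUnit:
    """Metadata about a slot: which frames may carry it and its inverse slot."""
    name: str
    domain: Frame
    inverse: Optional[str] = None


def put_checked(frame, unit, value):
    """Store a value after checking the slot's domain; maintain the inverse slot."""
    if not frame.is_a(unit.domain):
        raise TypeError(f"{frame.name} is outside the domain of slot {unit.name}")
    frame.put(unit.name, value)
    if unit.inverse is not None and isinstance(value, Frame):
        value.put(unit.inverse, frame)


proteins = Frame("Proteins")
reactions = Frame("Reactions")
proteins.put("kind", "polypeptide")                     # value inherited by instances
enzyme = Frame("example-enzyme", parents=[proteins])
reaction = Frame("example-reaction", parents=[reactions])
catalyzes = SlotUnit("catalyzes", domain=proteins, inverse="catalyzed-by")
put_checked(enzyme, catalyzes, reaction)
```

Range, collection type, and value-count constraints from the slide would be further fields on the slot unit, checked in the same place as the domain.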

Ocelot Storage System Architecture
- Persistent storage via disk files, Oracle DBMS
  - Concurrent development: Oracle
  - Single-user development: disk files
  - Read-only delivery: bundle data into binary program
- Oracle storage
  - Oracle is submerged within Ocelot, invisible to users
  - Frames transferred from DBMS to Ocelot
    - On demand
    - By background prefetcher
    - Memory cache
    - Persistent disk cache to speed performance via Internet
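On-demand transfer of frames behind a memory cache can be sketched as follows. This is a simplified illustration, not Ocelot's implementation: the `FrameStore` class is hypothetical, a plain dictionary stands in for the Oracle DBMS, the prefetcher runs synchronously rather than in the background, and the persistent disk cache is omitted.

```python
class FrameStore:
    """Fetch frames from a backing store on first access and cache them in memory."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: frame id -> frame data (stand-in for the DBMS)
        self._cache = {}         # memory cache of frames already transferred
        self.fetch_count = 0     # number of round trips to the backing store

    def get(self, frame_id):
        # On demand: only touch the backing store on a cache miss.
        if frame_id not in self._cache:
            self._cache[frame_id] = self._fetch(frame_id)
            self.fetch_count += 1
        return self._cache[frame_id]

    def prefetch(self, frame_ids):
        # A real prefetcher would run in the background; this one is synchronous.
        for frame_id in frame_ids:
            self.get(frame_id)


# Hypothetical backing data standing in for frames held in the DBMS.
backing = {"frame-1": {"name": "example gene"}, "frame-2": {"name": "example protein"}}
store = FrameStore(backing.__getitem__)

store.get("frame-1")
store.get("frame-1")                     # served from the memory cache
store.prefetch(["frame-1", "frame-2"])   # only frame-2 needs a fetch
```

Because callers only see `get`, the storage backend stays invisible to them, which matches the slide's point that Oracle is hidden within Ocelot.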

The Common Lisp Programming Environment
- Gat studied Lisp and Java implementations of 16 programs by 14 programmers (Intelligence 11: )

Peter Norvig’s Solution
“I wrote my version in Lisp. It took me about 2 hours (compared to a range of hours for the other Lisp programmers in the study, 3-25 for C/C++ and 4-63 for Java) and I ended up with 45 non-comment non-blank lines (compared with a range of for Lisp, and for the other languages). (That means that some Java programmer was spending 13 lines and 84 minutes to provide the functionality of each line of my Lisp program.)”
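The "84 minutes" figure in the quote follows from two numbers that survive in the text: the slowest Java time (63 hours) and the length of the Lisp program (45 lines). A short check, with variable names chosen here for illustration:

```python
java_max_hours = 63    # upper end of the Java range quoted above
lisp_lines = 45        # non-comment, non-blank lines in the Lisp version

# Minutes the slowest Java programmer spent per line of the Lisp program.
minutes_per_lisp_line = java_max_hours * 60 / lisp_lines
```

The "13 lines" figure depends on the line-count range that is missing from the transcript, so it is not recomputed here.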

Survey
- Please complete survey at end of each day

PGDB(s) That You Build
- Before you leave
  - Tar up your PGDB directory and FTP it home, or copy it to flash disk
  - We will create a backup copy of your PGDB directory if the directory is still there at the end of the tutorial
  - Delete the PGDB directory if you don’t want us to back it up
  - We will not give the backed up data to anyone else
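Tarring up a PGDB directory can be done with the standard `tar` command or, as a minimal sketch, with Python's standard library. The function name and the example paths are hypothetical; the actual location of a PGDB directory depends on the installation.

```python
import tarfile
from pathlib import Path


def archive_pgdb(pgdb_dir, archive_path):
    """Write a gzip-compressed tar archive of one PGDB directory.

    The archive path should lie outside `pgdb_dir` so the archive does not
    try to include itself.
    """
    pgdb_dir = Path(pgdb_dir)
    if not pgdb_dir.is_dir():
        raise NotADirectoryError(str(pgdb_dir))
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname keeps only the directory's own name inside the archive,
        # so it unpacks as a single top-level folder.
        tar.add(str(pgdb_dir), arcname=pgdb_dir.name)
    return Path(archive_path)
```

For example, `archive_pgdb("/path/to/mycyc", "/tmp/mycyc.tar.gz")` would produce an archive that can then be transferred home or copied to a flash disk; both paths here are placeholders.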

Summary
- Pathway Tools and Pathway/Genome Databases
  - Not just for pathways!
  - Computational inferences
    - Operons, metabolic pathways, pathway hole fillers
  - Editing tools
  - Analysis tools: Omics data on pathways
  - Web publishing of PGDBs
- Main classes of users:
  - Develop PGDB to extract more information from genome for genome paper
  - Develop a model-organism DB for the organism that is updated regularly and published on the web

Information Sources
- Pathway Tools User’s Guide
  - /root/aic-export/ecocyc/genopath/released/doc/manuals/userguide1.pdf
    - Pathway/Genome Navigator
    - Appendix A: Guide to the Pathway Tools Schema
  - /root/aic-export/ecocyc/genopath/released/doc/manuals/userguide2.pdf
    - PathoLogic, Editing Tools
  - NOTE: Location of the aic-export directory can vary across different computers
- Pathway Tools Web Site
  - Publications, programming examples, etc.
- Slides from this tutorial