Integration of E. Coli Data (E. coli Pathway and Genomic Data from BioCyc) Jesse Walsh.

Outline
Description of BioCyc data
– Format
– Key Classes
How I am retrieving and storing the data
– SPDB schema
– Key tables
Recent Developments

BioCyc Data Format
Frames are made of slots
– Slots are made of facets
– Slot values can have annotations
Diagram example:
– Frame: Reaction X
– Slots: Common Name, EC #, Reactants
– Annotations: Coefficient, Compartment
– Facets: :VALUE-TYPE, :DOCUMENTATION
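The frame/slot/facet/annotation nesting described on this slide can be pictured with a small data structure. This is only an illustrative sketch: the class names `Frame`, `Slot`, and `SlotValue` are hypothetical, the concrete values are placeholders, and nothing here is the Pathway Tools API.

```python
from dataclasses import dataclass, field


@dataclass
class SlotValue:
    """One value stored in a slot, plus per-value annotations."""
    value: str
    annotations: dict = field(default_factory=dict)


@dataclass
class Slot:
    """A named slot holding values, with facets describing the slot itself."""
    name: str
    values: list = field(default_factory=list)
    facets: dict = field(default_factory=dict)


@dataclass
class Frame:
    """A frame (object) identified by an ID and made of slots."""
    frame_id: str
    slots: dict = field(default_factory=dict)

    def add_slot(self, slot):
        self.slots[slot.name] = slot


# Placeholder data loosely following the slide's "Reaction X" diagram.
reaction_x = Frame("Reaction X")
reaction_x.add_slot(
    Slot("Common Name", [SlotValue("example reaction")],
         facets={":VALUE-TYPE": "string"})
)
reaction_x.add_slot(
    Slot("Reactants",
         [SlotValue("ATP", {"Coefficient": 6, "Compartment": "cytosol"})],
         facets={":DOCUMENTATION": "Substrates consumed by the reaction"})
)
```

Facets hang off the slot, while annotations hang off an individual value, which is why a coefficient can differ per reactant.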

BioCyc Class Hierarchy…. Complicated

Key Classes in BioCyc
Genes
Proteins
– Polypeptides (a subclass of Proteins)
– Protein-Complexes (a subclass of Proteins)
Pathways
Reactions
Compounds-And-Elements
Enzymatic-Reactions
Transcription-Units
Promoters

Why not just use BioCyc?
Advantages:
– Fast access to individual objects
– Logic based assertions
Disadvantages:
– Hard to query
– Difficult to understand the structures
– Difficult to know all of what is in the database
– Difficult to integrate other types of data
Solution:
– Create a relational database

SPDB Schema
Simple Pathway DataBase

Pathway
“Central” table
Allows organization of major pathways
Easy to retrieve a pathway, or all reactions that share a pathway with a specified reaction
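The "reactions that share a pathway with a specified reaction" lookup might look roughly like the query below. This is a sketch under stated assumptions: the table and column names (`pathway`, `reaction`, `pathway_id`) are hypothetical and not taken from the actual SPDB schema, each reaction is assumed to belong to a single pathway for simplicity, and an in-memory SQLite database stands in for MySQL so the example is self-contained.

```python
import sqlite3

# Hypothetical, simplified tables; the real SPDB layout is not shown in the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pathway (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE reaction (id INTEGER PRIMARY KEY, name TEXT,
                       pathway_id INTEGER REFERENCES pathway(id));
INSERT INTO pathway VALUES (1, 'pathway A'), (2, 'pathway B');
INSERT INTO reaction VALUES (1, 'rxn 1', 1), (2, 'rxn 2', 1), (3, 'rxn 3', 2);
""")


def reactions_sharing_pathway(conn, reaction_id):
    """Return names of the other reactions in the same pathway as reaction_id."""
    rows = conn.execute(
        """SELECT other.name
           FROM reaction AS r
           JOIN reaction AS other ON other.pathway_id = r.pathway_id
           WHERE r.id = ? AND other.id != r.id
           ORDER BY other.id""",
        (reaction_id,),
    )
    return [name for (name,) in rows]
```

Keeping the pathway as the central table means this lookup is a single self-join on the reaction table.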

Reaction
Reaction types include:
– Catalysis, Spontaneous, Transcription, Translation, Promoter, Transcription Factor
Transcription, Translation, Promoter, and TF reactions are all inferred reactions
Reactions are the “nodes” of networks in SPDB

Entity
Entities include:
– Compound, Protein (Complex/Monomer), Gene, Transcription Unit, Promoter
Entities with multiple types are represented with the most specific type in their hierarchy
– e.g. a protein that is also a complex will be listed as “Complex”, not “Protein”
– “Enzyme” status is stored as a participation type
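One way to pick "the most specific type" is to compare how deep each candidate type sits in the type hierarchy. The sketch below assumes a small hypothetical parent map (`PARENT`) based only on the entity types named on this slide; the real importer may resolve types differently.

```python
# Hypothetical type hierarchy: child type -> parent type (None marks a root).
PARENT = {
    "Compound": None,
    "Protein": None,
    "Complex": "Protein",
    "Monomer": "Protein",
    "Gene": None,
    "Transcription Unit": None,
    "Promoter": None,
}


def depth(type_name):
    """Number of ancestors between a type and its root."""
    steps = 0
    while PARENT[type_name] is not None:
        type_name = PARENT[type_name]
        steps += 1
    return steps


def most_specific_type(types):
    """Return the deepest type among those an entity carries."""
    return max(types, key=depth)
```

For example, `most_specific_type(["Protein", "Complex"])` returns `"Complex"`, matching the rule stated on the slide.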

Participation in Reactions
Entities participate in reactions
Information includes Km data
Unsure if condition data exists, and unsure how to access evidence data

Data Links in BioCyc
Diagram labels: Pathway, Reaction, Reactants/Products, Enzymes/Cofactors, Genes, Transcriptional Unit, Promoter, Transcription Factor, Sigma Factor, Translation Reaction, Transcription Reaction, Promoter Relation, Activation/Repression, Specificity Relation

Data Retrieval Strategy
Diagram labels: Pathway, Reaction, Reactants/Products, Enzymes/Cofactors, Genes, Transcriptional Unit, Promoter, Transcription Factor, Sigma Factor, Translation Reaction, Transcription Reaction, Promoter Relation, Activation/Repression, Specificity Relation
The diagram is additionally marked with the numbers 1, 2, 3

Improvements to SPDB
Explicitly organize pathway networks and reaction networks
Allow recursive tracing of pathway elements

Old Organization of Reaction Data
Diagram labels: Pathway, Rxn

Better Way
Explicitly link reactions in the context of individual pathways
Diagram labels: Rxn, Pathway
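Linking reactions "in the context of individual pathways" can be read as storing each reaction-to-reaction edge together with the pathway it belongs to. The following is a minimal sketch of that idea; the class name `PathwayNetwork` and its methods are hypothetical and do not come from SPDB.

```python
from collections import defaultdict


class PathwayNetwork:
    """Stores reaction-to-reaction links separately for each pathway."""

    def __init__(self):
        # pathway name -> set of (upstream reaction, downstream reaction)
        self._links = defaultdict(set)

    def link(self, pathway, upstream, downstream):
        """Record that downstream follows upstream within one pathway."""
        self._links[pathway].add((upstream, downstream))

    def successors(self, pathway, reaction):
        """Reactions that follow `reaction` within the given pathway only."""
        links = self._links.get(pathway, set())
        return sorted(down for up, down in links if up == reaction)


network = PathwayNetwork()
network.link("pathway A", "rxn 1", "rxn 2")
network.link("pathway B", "rxn 1", "rxn 3")
```

Because every edge carries its pathway, the same reaction can have different neighbours depending on which pathway is being viewed.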

Recursively Tracing the Data
Diagram labels: Pathway, Reaction, Reactants/Products, Enzymes/Cofactors, Genes, Transcriptional Unit, Promoter, Transcription Factor, Sigma Factor, Translation Reaction, Transcription Reaction, Promoter Relation, Activation/Repression, Specificity Relation, Genes of TFs

Coefficient Data for Reactions
6 ATP + 3 L-serine + 3 2,3-dihydroxybenzoate → 6 diphosphate + 6 AMP + enterobactin + 9 H+
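To show what coefficient data amounts to, the reaction on this slide can be split into per-compound coefficients. This is an illustrative parser only: it assumes terms are separated by " + ", that an ASCII "->" stands in for the reaction arrow, and that a missing leading number means a coefficient of 1. It is not how BioCyc stores or exports coefficients.

```python
def parse_side(side):
    """Map each compound on one side of a reaction to its coefficient."""
    coefficients = {}
    for term in side.split(" + "):
        term = term.strip()
        head, _, rest = term.partition(" ")
        if head.isdigit() and rest:
            coefficients[rest] = int(head)
        else:
            # No leading number: the coefficient is implicitly 1.
            coefficients[term] = 1
    return coefficients


def parse_reaction(equation, arrow="->"):
    """Split an equation at the arrow and parse both sides."""
    left, right = equation.split(arrow)
    return parse_side(left), parse_side(right)


equation = ("6 ATP + 3 L-serine + 3 2,3-dihydroxybenzoate"
            " -> 6 diphosphate + 6 AMP + enterobactin + 9 H+")
reactants, products = parse_reaction(equation)
```

Note that "3 2,3-dihydroxybenzoate" parses correctly because only the text before the first space is treated as the coefficient.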

To Do
MIAME experimental conditions
Explore other data in BioCyc

Flow of Data (The Big Picture)
Data is imported from BioCyc (EcoCyc + MetaCyc)
Changes can be made to BioCyc via Cell Designer, which will then be propagated to SPDB
BioMart is one option to directly view data in SPDB
Diagram labels: BioCyc PGDB, SPDB, JavaCycConnection, BioCycImporter, Lisp Based DB, MySQL, Object Oriented DB, API based on JavaCyc, Cell Designer, BioMart, Researcher

Data in BioCyc / SPDB
Pathways: 242 (excludes superpathways)
Reactions: (1751 not inferred, 4373 ‘orphaned’)
Enzymes
Transporters
Gene product summaries
Genes
Transcription Units
Citations: 18,469 / 17,842 / --

SPDB Networks

BioCyc Updates
2006: March 13, May 19, September 8
2007: January 10, March 16, May 25, August 15, December 5
2008: April 1, June 27, October 15
2009: March 9, June 19
Update history shows from 1 to 5 updates per year (~3 times a year on avg)
Will have to manually check for updates and import new data into our database
“Actual curation of the data occurs within BioCyc, and the information is periodically propagated to RegulonDB.”
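The per-year update rate can be tallied directly from the dates listed on this slide. The snippet below is a small sketch of that arithmetic; the variable and function names are my own, and it counts only the dates shown here (it assumes an English locale for month names).

```python
from collections import Counter
from datetime import datetime

# Update dates as listed on the slide.
UPDATE_DATES = [
    "March 13, 2006", "May 19, 2006", "September 8, 2006",
    "January 10, 2007", "March 16, 2007", "May 25, 2007",
    "August 15, 2007", "December 5, 2007",
    "April 1, 2008", "June 27, 2008", "October 15, 2008",
    "March 9, 2009", "June 19, 2009",
]


def updates_per_year(dates):
    """Count how many listed updates fall in each calendar year."""
    return Counter(datetime.strptime(d, "%B %d, %Y").year for d in dates)


counts = updates_per_year(UPDATE_DATES)
# Average over the years that appear in the list.
average = sum(counts.values()) / len(counts)
```

A check like this could also be rerun whenever a new release date is added, to decide how often the manual import needs to happen.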

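SPDB Schema
Simple Pathway DataBase
Diagram labels, grouped as on the Entity and Reaction slides:
– Entity types: Compound, Complex, Gene, TranscriptionUnit, Promoter, Monomer, Frame
– Participation types: Reactant, Product, Modifier, Cofactor, Activator, Repressor, Promoter
– Reaction types: Catalysis, Spontaneous, Transcription, Translation, Promoter, Transcription Factor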