BIND: the Biomolecular Interaction Network Database. Gary D. Bader, Doron Betel and Christopher W. V. Hogue. Seminar in Bioinformatics, Elinor Heller.

Abstract 1. What is BIND? 2. Why do we need a tool like BIND? 3. How does BIND work?

What is BIND? BIND is a database that holds information about: – Biomolecular interactions – Reactions – Complexes – Pathways

Why do we need Protein Interaction Info? Motivation…

Why do we need Protein Interaction Info? - cont Learning protein functions: – If two proteins interact, their functions are very likely to be related as well. – Cellular operations are largely carried out by interactions among proteins. – From protein pathways to understanding cells, tissues, … to life and evolution.

Protein interaction -example

Why do we need BIND? Until 2001: – This type of data was stored in journal publications, where it is difficult to mine.

Why do we need BIND?-cont The genome era has taught us that it is important to have effective tools for storing and managing data in place before the data set becomes too large. Preparing for the future: A concerted effort by the biological community is required now to prepare for the interaction information of the near future.

BIND - Goals: – Provide a standard, comprehensive and integrated interaction resource to the scientific community – Define protein function and mechanisms – Recover and integrate biomolecular interaction knowledge – Discover new knowledge through data mining

BIND data specification: The problem: Storing different kinds of interactions, each with a different data structure, in a generic way. Solution: Using ASN.1 – Main concept: ASN.1 is a formal notation for describing data transmitted by telecommunications protocols, independent of the implementation language and of the physical representation of the data, whether the application is complex or very simple.

What is ASN.1? ASN.1 = Abstract Syntax Notation 1 – Internationally standardized data specification language used to build complex data types in a hierarchical manner (origins are at Xerox) – Used in telephone systems, air traffic, building and machine control, toll highways, smart cards, security and more – Used by NCBI to store GenBank, PubMed, MMDB and more For more info -
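
To make the split between abstract description and physical representation concrete, here is a minimal sketch of one standard ASN.1 encoding (DER) for a two-field record. It is an illustration only: it is not the BIND specification or NCBI's toolkit, it handles only non-negative integers and contents shorter than 128 bytes, and the function names and the sample record are assumptions made for this example.

```python
def encode_length(n: int) -> bytes:
    # Short-form length only (values below 128), which is enough for this sketch.
    if not 0 <= n < 128:
        raise ValueError("sketch supports only short-form lengths")
    return bytes([n])


def encode_integer(value: int) -> bytes:
    # INTEGER has tag 0x02; the content is big-endian two's complement.
    if value < 0:
        raise ValueError("sketch supports only non-negative integers")
    content = value.to_bytes(value.bit_length() // 8 + 1, "big", signed=True)
    return b"\x02" + encode_length(len(content)) + content


def encode_utf8_string(text: str) -> bytes:
    # UTF8String has tag 0x0C; the content is the UTF-8 bytes of the text.
    content = text.encode("utf-8")
    return b"\x0c" + encode_length(len(content)) + content


def encode_sequence(*items: bytes) -> bytes:
    # SEQUENCE has tag 0x30; the content is the concatenated encoded members.
    content = b"".join(items)
    return b"\x30" + encode_length(len(content)) + content


# Hypothetical record: an identifier and a molecule name.
record = encode_sequence(encode_integer(5), encode_utf8_string("Fyn"))
print(record.hex())
```

The same abstract description (a sequence of an integer and a string) could be written out with different encoding rules without changing the specification, which is the property the slide highlights.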

What kind of information does BIND store? BIND stores information about interactions, molecular complexes and pathways (these are the high-level data types). (Diagram: objects and data types.)

Interactions: An interaction record stores a description of the binding event between two objects, A and B, which are generally molecules.

Molecular complex: A generally stable aggregate of molecules that have a function when linked together and are usually described as having sub-units. Example: the ribosome.

Pathways: A pathway is defined as a group of molecules that are generally free from each other, but form a network of interactions usually to mediate some cellular function.
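
One simple way to picture a pathway as a network of interactions is sketched below. This assumes that pairwise interactions are given as (A, B) tuples and treats everything reachable from a starting molecule as one network; real BIND pathways are curated records, so this is an analogy rather than how BIND defines them. All names are placeholders.

```python
from collections import defaultdict

# Hypothetical interaction pairs (A, B); placeholders, not BIND records.
interactions = [("A", "B"), ("B", "C"), ("C", "D"), ("X", "Y")]


def build_network(pairs):
    """Build an undirected adjacency map from pairwise interactions."""
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)
    return network


def connected_molecules(network, start):
    """Return every molecule reachable from start through chains of interactions."""
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbour in network[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


network = build_network(interactions)
pathway_members = connected_molecules(network, "A")
print(sorted(pathway_members))
```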

BIND Objects: An object in BIND is basically a molecule. It can be: DNA, RNA, protein, photon, or a small/complex molecule.

BIND Objects-cont The object record holds: – its name + a list of name synonyms – its origin (whether natural or not) – where it occurs in the cell – the cell stages in which it occurs – a reference to a sequence database, or a full instantiation of the biological sequence and 3D structure.

BIND Objects-cont Most of the biological information in BIND is stored in an interaction record. An interaction also stores: – the two interacting objects, A and B – a text description – the cellular place of interaction – the experimental conditions used to observe binding – the binding sites on A and B and how they are connected – the chemical action, including kinetic and thermodynamic data and the chemical state of the molecules – a comment on evolutionarily conserved biological sequence
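
The object and interaction records described above could be sketched as plain data classes. The class and field names below are assumptions chosen to mirror the slide bullets; they are not the identifiers used in the actual BIND ASN.1 specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BindObject:
    # Hypothetical field names mirroring the "object record" bullets.
    name: str
    kind: str  # e.g. "protein", "DNA", "RNA", "photon"
    synonyms: List[str] = field(default_factory=list)
    natural_origin: bool = True
    cellular_location: Optional[str] = None
    cell_stages: List[str] = field(default_factory=list)
    sequence_reference: Optional[str] = None


@dataclass
class Interaction:
    # Hypothetical field names mirroring the "interaction record" bullets.
    a: BindObject
    b: BindObject
    description: str = ""
    cellular_place: Optional[str] = None
    experimental_conditions: List[str] = field(default_factory=list)
    binding_sites: List[str] = field(default_factory=list)
    chemical_action: Optional[str] = None
    conservation_comment: Optional[str] = None


# Example record for the mouse Fyn-Vav interaction mentioned later in the talk.
fyn = BindObject(name="Fyn", kind="protein")
vav = BindObject(name="Vav", kind="protein")
record = Interaction(a=fyn, b=vav, description="Fyn binds Vav (mouse)")
print(record.a.name, "-", record.b.name)
```

Keeping the two molecules as full object records, rather than bare names, lets an interaction carry the synonyms and sequence references of both partners.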

DATA SUBMISSION Data is entered into BIND by either manual or automatic methods. Who enters the data? – Experts on the BIND team enter high-quality records on a continuing basis. – Users are encouraged to enter records into the database through the web-based system, or to contact the BIND staff if they have large data sets they want to process.

DATA SUBMISSION-cont How is a record submitted? First stage: entering contact information. Second stage: entering the PubMed identifier and the two interacting molecules. Every record entered in this way is validated by BIND indexers and by at least one other expert before it is made available in any public data release.
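
The release rule described above (indexer validation plus at least one other expert) can be sketched as a small check. The class, field and function names are hypothetical, and the PubMed identifier is a placeholder; this is not BIND's actual submission system.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Submission:
    # Second-stage inputs from the slide: a PubMed identifier and two molecules.
    pubmed_id: str
    molecule_a: str
    molecule_b: str
    indexer_approved: bool = False
    expert_approvals: Set[str] = field(default_factory=set)


def is_releasable(sub: Submission) -> bool:
    """A record is public only after indexer validation and one expert approval."""
    return sub.indexer_approved and len(sub.expert_approvals) >= 1


sub = Submission(pubmed_id="PMID-PLACEHOLDER", molecule_a="Fyn", molecule_b="Vav")
before = is_releasable(sub)  # nothing validated yet

sub.indexer_approved = True
sub.expert_approvals.add("expert-1")
after = is_releasable(sub)  # both checks satisfied

print(before, after)
```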

DATA SUBMISSION-cont –Submitters cannot limit the intended use of submitted BIND data –Submitters have the right to edit/alter their records over time –Suggestions made by a third party will be forwarded by us to the submitters to seek approval for any changes or corrections

How Much Data ?

BIND growth: The first version of BIND (June 1999): – Contained over 1000 interaction records – Pathways: 6 – Complexes: 40 The last version of BIND: – Interactions: – Pathways: 8 – Complexes: 3388

Browsing BIND

Visually Navigating BIND

FAST = “parallel” RPS BLAST – Used to spot domain similarities in a protein interaction cluster – Server-generated scalable Flash graphics: zoomable, printable – Follow up by zooming in on FASTA-formatted sequences to see domain superposition and links to SMART/Pfam

Nucleic Acids Res Jan 1;31(1):248-50

More uses for BIND-1 Helping direct future interaction studies. Example: the human and mouse variants of the protein tyrosine kinase Fyn: – each have 9 recorded interactions in BIND – share 6 similar interactions – the mouse variant is known to interact with the protein Vav – the human variant has no record of an interaction with the Vav homologue.

Example - continued Using BIND in combination with other tools, it was recently found that human homologues with a domain architecture similar to that of the mouse Fyn interaction partners can be identified.

More uses for BIND-2 Comparing organisms with different numbers of genes. Example: Drosophila vs. C. elegans

Example - continued Which has the higher gene number? Which has the greater protein interaction complexity?
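
The Fyn comparison amounts to set arithmetic over the recorded interaction partners. In the sketch below only the counts (9 interactions each, 6 shared) and the mouse Fyn-Vav interaction come from the slides; every other partner name is a placeholder, and matching partners by identical name stands in for the homologue mapping a real analysis would need.

```python
# Placeholder partner names; only the counts and "Vav" come from the slides.
shared = {f"shared_{i}" for i in range(1, 7)}
human_fyn = shared | {"human_only_1", "human_only_2", "human_only_3"}
mouse_fyn = shared | {"Vav", "mouse_only_1", "mouse_only_2"}

# Interactions recorded for both variants.
common = human_fyn & mouse_fyn

# Recorded for mouse Fyn but not for human Fyn: candidates for future study.
candidates_for_human = mouse_fyn - human_fyn

print(len(common), sorted(candidates_for_human))
```

Partners that appear only on the mouse side, such as Vav, are the ones that suggest which human interactions to test next.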

References:
– Bader G.D., Betel D. and Hogue C.W.V. BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res., January 1; 31(1): 248–250.
– Bader G.D., Donaldson I., Wolting C., Ouellette B.F., Pawson T. and Hogue C.W. (2001) BIND—the biomolecular interaction network database. Nucleic Acids Res., 29, 242–245.
– Bader G.D. and Hogue C.W. (2000) BIND—a data specification for storing and describing biomolecular interactions, molecular complexes and pathways. Bioinformatics, 16, 465–477.
– Betel D., Hogue C.W.V. et al. (2005) The BIND and related tools 2005 update. Nucleic Acids Res., 33, D418–D422.

THE END