

Other biological databases

Biological systems
- Taxonomic data
- Literature
- Protein folding and 3D structure
- Small molecules
- Pathways and networks
- Protein families and domains
- Whole genome data
- Sequence data
- Ontologies (GO)

Other Biological Databases
- Transcription factor binding sites: TRANSFAC
- Protein structure databases: PDB, SCOP, CATH
- Protein family databases: Pfam, PRINTS, PROSITE etc.
- Chemicals and small molecules: ChEBI
- Gene expression databases: GEO, ArrayExpress
- Metabolic pathways: Reactome, KEGG
- Genome databases: Ensembl, FlyBase, WormBase etc.
- Human genetics-related databases: HapMap, dbSNP

Transcription factor binding sites
- TRANSFAC: database of eukaryotic transcription factors: regulation.com/pub/databases.html#transfac
- TESS (Transcription Element Search System): predicts transcription factor binding sites, uses TRANSFAC
- TFsearch: searches for transcription factor binding sites

Protein structure databases
- Main resource is the Protein Data Bank (PDB)
- Contains the spatial coordinates of the atoms of macromolecules whose 3D structure has been determined by X-ray crystallography or NMR
- Proteins represent more than 90% of available structures (others are DNA, RNA, sugars, viruses, protein/DNA complexes…)
- Can search by PDB code
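
To make "spatial coordinates of atoms" concrete, here is a minimal sketch of reading one ATOM record from a PDB-format file. The column positions follow the fixed-width PDB text format. The function name and the sample values are hypothetical and chosen only for illustration; they are not taken from a real entry.

```python
def parse_atom_record(line):
    """Parse one fixed-width ATOM/HETATM record from a PDB-format file."""
    return {
        "record": line[0:6].strip(),        # columns 1-6: record name
        "serial": int(line[6:11]),          # columns 7-11: atom serial number
        "atom_name": line[12:16].strip(),   # columns 13-16: atom name
        "residue_name": line[17:20].strip(),  # columns 18-20: residue name
        "chain_id": line[21],               # column 22: chain identifier
        "residue_number": int(line[22:26]),  # columns 23-26: residue sequence number
        "x": float(line[30:38]),            # columns 31-38: x coordinate (angstroms)
        "y": float(line[38:46]),            # columns 39-46: y coordinate
        "z": float(line[46:54]),            # columns 47-54: z coordinate
    }


# Hypothetical sample record, assembled field by field so the
# fixed-width columns are guaranteed to line up.
sample_line = (
    "ATOM  "              # record name
    + "1".rjust(5)        # serial
    + " "
    + " N  "              # atom name
    + " "                 # alternate location indicator
    + "MET"               # residue name
    + " "
    + "A"                 # chain
    + "1".rjust(4)        # residue number
    + "    "              # insertion code plus padding
    + "27.340".rjust(8)   # x
    + "24.430".rjust(8)   # y
    + "2.614".rjust(8)    # z
)

atom = parse_atom_record(sample_line)
print(atom["residue_name"], atom["chain_id"], atom["x"], atom["y"], atom["z"])
```

For real work, an established parser (for example the one in Biopython) is a safer choice than hand-written slicing; the sketch only shows what a coordinate record contains.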

Searching MSD: search by PDB code

Protein structure-related databases
- Structural family databases based on PDB: SCOP and CATH
- Predicted structures in SWISS-MODEL ( MODEL.html)

Protein family databases
- Databases that produce signatures for identifying protein families or domains
- Used for functional classification of proteins
- E.g. Pfam, PROSITE, PRINTS, SMART, TIGRFAMs etc.
- Integrated into a single resource, InterPro
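
One kind of signature is a sequence pattern. As a rough sketch of how such a pattern can be matched against a protein sequence, the code below translates a simplified PROSITE-style pattern into a Python regular expression. It handles only the basic syntax (`x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repeats, `<` and `>` for sequence ends). The function name and the test sequence are hypothetical. This is not how InterProScan or the member databases are implemented.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simplified PROSITE-style pattern into a Python regex string."""
    regex_parts = []
    for element in pattern.rstrip(".").split("-"):
        # '<' and '>' anchor the pattern to the start or end of the sequence.
        prefix = "^" if element.startswith("<") else ""
        suffix = "$" if element.endswith(">") else ""
        element = element.strip("<>")
        # Split the element into its core and an optional repeat such as (3) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # excluded residues
        if low and high:
            core += "{" + low + "," + high + "}"
        elif low:
            core += "{" + low + "}"
        regex_parts.append(prefix + core + suffix)
    return "".join(regex_parts)


# N-glycosylation-style pattern: N, then anything but P, then S or T, then anything but P.
glyco_regex = prosite_to_regex("N-{P}-[ST]-{P}")
hit = re.search(glyco_regex, "MKNGTAL")  # made-up sequence for illustration
print(glyco_regex, hit.group(0), hit.start())
```

Short patterns like this one match by chance in many sequences, which is one reason the family databases also use profiles and hidden Markov models.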

InterProScan sequence search
- Stand-alone version available

InterPro text search
- Search by keyword, protein accession or InterPro accession

Results for protein acc

Example InterPro entry

Chemicals and small molecules
- Chemical abstracts
- ChEBI
- KEGG: part of it includes chemicals
- ChemIDplus: chemicals cited in NLM databases dlite.jsp
- MSD-Chem: ligands and chemicals in MSD

ChEBI example entry

Hierarchy for chemicals

Gene expression databases
- NCBI Gene Expression Omnibus (GEO)
- ArrayExpress
- Stanford Microarray Database www5.stanford.edu/
- Can usually search for experiments or particular expression profiles

GEO search page

Profiles search results

Specific entry and experiment info

ArrayExpress search results

What does the data look like?
- Info on experiment, array used, etc.
- Raw or processed tab-delimited file containing spots and their intensities (cy3/cy5 ratios) across different samples
- Files with metadata, e.g. sample info, annotation and coordinates of each spot on the array
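
As an illustration of working with such a tab-delimited file, the sketch below reads per-spot channel intensities and computes a log2 ratio for each spot. The column names (`spot_id`, `cy3`, `cy5`) and the values are hypothetical; real GEO or ArrayExpress files use their own headers. Dividing Cy5 by Cy3 is an assumption here, since the numerator depends on how the experiment was labelled.

```python
import csv
import io
import math


def log2_ratios(tab_text):
    """Return {spot_id: log2(cy5 / cy3)} from tab-delimited text with a header row."""
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    ratios = {}
    for row in reader:
        cy3 = float(row["cy3"])
        cy5 = float(row["cy5"])
        # Assumed convention: Cy5 in the numerator, Cy3 in the denominator.
        ratios[row["spot_id"]] = math.log2(cy5 / cy3)
    return ratios


# Hypothetical two-spot example.
example_text = "spot_id\tcy3\tcy5\nspot_1\t100\t400\nspot_2\t200\t100\n"
ratios = log2_ratios(example_text)
print(ratios)
```

A positive value means the spot is brighter in the Cy5 channel, and a negative value means it is brighter in the Cy3 channel.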

Proteomics: SWISS-2DPAGE

Enzymes and metabolic pathways
- Contain information describing enzymes, biochemical reactions and metabolic pathways
- ENZYME and BRENDA: nomenclature databases that store information on enzyme names and reactions
- IntEnz: Integrated relational Enzyme database

Enzyme nomenclature
- E.C. (Enzyme Commission) numbers are assigned to enzymes based on the reactions they catalyze
- Hierarchy, high-level groups:
  - EC 1: Oxidoreductases
  - EC 2: Transferases
  - EC 3: Hydrolases
  - EC 4: Lyases
  - EC 5: Isomerases
  - EC 6: Ligases
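
The hierarchy can be read directly from an EC number: the first of its four dot-separated fields is the high-level group. A minimal sketch, using only the six groups listed above (the function name is hypothetical):

```python
# Top-level EC groups as listed on the slide.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_top_level_class(ec_number):
    """Return the high-level group name for an EC number such as 'EC 1.1.1.1'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()       # drop an optional 'EC' prefix
    first_field = text.split(".")[0]  # the first field is the top-level class
    return EC_CLASSES[int(first_field)]


print(ec_top_level_class("EC 1.1.1.1"))  # alcohol dehydrogenase
print(ec_top_level_class("3.4.21.4"))    # trypsin
```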

EC example

Metabolic pathway databases
- Pathguide: lists more than 200 pathway resources
- KEGG (Kyoto Encyclopedia of Genes and Genomes):
  - Database of chemicals, genes and networks (metabolic, regulatory etc.)
  - Well-curated and quite specific
- EcoCyc (Encyclopedia of E. coli K12 genes and metabolism): curation of the entire genome
- Reactome: curated biological pathways
- GenMAPP: pathways contributed by users

Different pathways in different species -> comparison

Pathway in Reactome

Example of a pathway in BioCyc

Protein-protein interaction databases
- Store pairwise interactions or complexes
- Can get from 1 to more than 20,000 interactions per publication
- IntAct
- DIP (Database of Interacting Proteins) mbi.ucla.edu/
- BIND (Biomolecular Interaction Network Database)
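
Pairwise interactions map naturally onto an undirected graph. The sketch below builds a partner lookup from a list of pairs, the kind of structure one might assemble after downloading interaction records. The protein identifiers are placeholders rather than real accessions, and the function name is hypothetical.

```python
from collections import defaultdict


def build_interaction_map(pairs):
    """Build {protein: set of partners} from an iterable of (protein_a, protein_b) pairs."""
    partners = defaultdict(set)
    for protein_a, protein_b in pairs:
        # Interactions are treated as undirected, so record both directions.
        partners[protein_a].add(protein_b)
        partners[protein_b].add(protein_a)
    return partners


# Placeholder identifiers for illustration only.
example_pairs = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]
interaction_map = build_interaction_map(example_pairs)

for protein in sorted(interaction_map):
    print(protein, sorted(interaction_map[protein]))
```

Using sets means an interaction reported by several publications is stored once per pair; keeping per-publication evidence would need a richer structure.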

Protein-protein interactions in IntAct

Integrated functional interactions in STRING

Genome browsers
- Integrate sequence and functional data for a genome
- Ensembl: genome browser for major eukaryotic genomes, e.g. human, mouse etc.
- UCSC browser
- FlyBase: Drosophila genome database
- WormBase: C. elegans
- PlasmoDB: Plasmodium (malaria)
- Etc.

Ensembl genome browser

Ensembl gene view 1

Ensembl gene view 2

Gene within context on chromosome

Human genetics databases
- GeneCards
- HapMap
- OMIM
- HGDP (Human Genome Diversity Project)

Mutation/polymorphism databases
- Most of these databases are disease- or gene-centric, e.g. p53

dbSNP
- Repository of all known mutations (human and other organisms)

Where to find the databases
- Table of addresses for major databases and tools
- Nucleic Acids Research Database issue, January each year
- Nucleic Acids Research Software issue (new)
- Expasy list of tools

Large scale data retrieval
- Programmatic access to many databases
- MySQL access to some
- BioMart access: public and private
- FTP sites: large data downloads

Other tutorials ex.html