Tutorial: Bioinformatics Resources (http://pir. georgetown


1 Tutorial: Bioinformatics Resources (http://pir. georgetown
Bio-Trac 25 (Proteomics: Principles and Methods)
March 25, 2005
Zhang-Zhi Hu, M.D.
Senior Bioinformatics Scientist
Protein Information Resource
National Biomedical Research Foundation, GUMC

2 What is Bioinformatics?
computer (information) + mouse (biology) = bioinformatics
NIH Biomedical Information Science and Technology Initiative (BISTI) Working Definition (2000): Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.

3 Molecular Biology Database Collection (http://nar. oupjournals
Key databases in 14 categories

4 Database Collection in Nucleic Acids Res.

5

6 Overview Database Contents, Search and Retrieval
Text search / Information retrieval
Sequence & genomics databases
Protein family databases
Databases of protein functions
Databases of protein structures
Proteomics databases

7 Entrez Text Searches
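Entrez text searches can also be issued programmatically. The sketch below only builds a query URL for NCBI's E-utilities ESearch endpoint and sends nothing over the network. The helper name `build_esearch_url` and the example query are assumptions made for illustration; they are not part of the original slides.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities ESearch endpoint (programmatic Entrez search).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL for a text query against one Entrez database.

    Hypothetical helper: it only assembles the URL and does not fetch it.
    """
    params = {"db": db, "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


# Illustrative query: crystallin proteins restricted to rabbit.
url = build_esearch_url("protein", "crystallin AND rabbit[Organism]")
print(url)
```

The same function works for other Entrez databases by changing `db`, for example `"pubmed"` for literature or `"gene"` for Entrez Gene.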

8 PubMed Literature Database (http://www. ncbi. nlm. nih

9 UniProt Text Search

10 PIR Text Search (I)
What is the difference between CRAA_RABIT and CYRBAA? How about the search: Crystallin and SuperFamily?

11 PIR Text Search (II)
Using PIR text search, can you find which crystallin has a determined 3D structure?

12 I. Sequence & Genomics Databases
GenBank: An annotated collection of all publicly available nucleotide and protein sequences.
RefSeq: NCBI non-redundant set of reference sequences, including genomic DNA, transcript (RNA), and protein products.
UniProt Consortium Database: Universal protein knowledgebase, a central resource of protein sequence and function from Swiss-Prot, TrEMBL and PIR.
Entrez Gene: Gene-centered information at NCBI.
UniGene: Unified clusters of ESTs and full-length mRNA sequences.
OMIM: Online Mendelian Inheritance in Man: a catalog of human genetic and genomic disorders.
Model Organism Genome Databases: MGD, RGD, SGD, Flybase…
GeneCards: Integrated database of human genes, maps, proteins and diseases.
SNP Consortium Database

13 UniProt Consortium Database
UniProtKB (knowledgebase)
UniRef (100, 90, 50)
UniParc (archive)

14 UniProt Sequence Report (I)

15 UniProt Sequence Report (II)

16 Entrez Gene

17 OMIM: Online Mendelian inheritance in man

18 II. Protein Family Databases
Whole Proteins:
  PIRSF: A Network Classification System of Protein Families
  COG (Clusters of Orthologous Groups) of Complete Genomes
  ProtoNet: Automated Hierarchical Classification of Proteins
Protein Domains:
  Pfam: Alignments and HMM Models of Protein Domains
  SMART: Protein Domain Families
  CDD: Conserved Domain Database
Protein Motifs:
  PROSITE: Protein Patterns and Profiles
  BLOCKS: Protein Sequence Motifs and Alignments
  PRINTS: Protein Sequence Motifs and Signatures
Integrated Family Databases:
  iProClass: Superfamilies/Families, Domains, Motifs, Rich Links
  InterPro: Integrates Pfam, PRINTS, PROSITE, ProDom, SMART, PIRSF, SuperFamily

19 Protein Clustering: COGs

20 KOGs: Eukaryotic Clusters

21 Domain Classification

22 Pfam Domain

23 Integrated Family Classification
InterPro: An integrated resource unifying PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, and PIRSF.

24 PIRSF: Full Length Classification iProClass Family Report

25 Protein Motifs
PROSITE is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles.
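To make the idea of a PROSITE pattern concrete, here is a minimal sketch that translates the basic PROSITE pattern syntax into a Python regular expression and scans a sequence with it. The function name `prosite_to_regex` and the test sequence are assumptions for illustration. The translator handles only the core syntax (`x`, `[...]`, `{...}`, repeat counts, and `<` / `>` anchors at the ends of the pattern), so it is a simplification rather than a full PROSITE parser.

```python
import re


def prosite_to_regex(pattern):
    """Translate a basic PROSITE pattern string into a Python regex string."""
    pattern = pattern.rstrip(".")            # PROSITE patterns end with a period
    parts = []
    for element in pattern.split("-"):       # elements are separated by hyphens
        prefix = suffix = ""
        if element.startswith("<"):          # pattern anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # pattern anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = match.group(1), match.group(2)
        if core == "x":                      # x means any amino acid
            core = "."
        elif core.startswith("{"):           # {P} means any residue except P
            core = "[^" + core[1:-1] + "]"
        # Sets such as [ST] and single residues are already valid regex.
        if repeat:
            core += "{" + repeat + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


# N-glycosylation site pattern (PROSITE PS00001).
regex = prosite_to_regex("N-{P}-[ST]-{P}.")
print(regex)  # N[^P][ST][^P]

# Hypothetical toy sequence, used only to show a match.
hit = re.search(regex, "MKNGSAA")
print(hit.start(), hit.group())  # 2 NGSA
```

Profiles, the other kind of PROSITE entry, are weight matrices and cannot be expressed as regular expressions this way.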

26 III. Databases of Protein Functions
Metabolic Pathways, Enzymes, and Compounds:
  Enzyme Classification: Classification and Nomenclature of Enzyme-Catalysed Reactions (EC-IUBMB)
  KEGG (Kyoto Encyclopedia of Genes and Genomes): Metabolic Pathways
  LIGAND (at KEGG): Chemical Compounds, Reactions and Enzymes
  EcoCyc: Encyclopedia of E. coli Genes and Metabolism
  MetaCyc: Metabolic Encyclopedia (Metabolic Pathways)
  WIT: Functional Curation and Metabolic Models
  BRENDA: Enzyme Database
  UM-BBD: Microbial Biocatalytic Reactions and Biodegradation Pathways
Cellular Regulation and Gene Networks:
  EpoDB: Genes Expressed during Human Erythropoiesis
  BIND: Descriptions of interactions, molecular complexes and pathways
  DIP: Catalogs experimentally determined interactions between proteins
  BioCarta: Biological pathways of human and mouse
  GO: Gene Ontology Consortium Database
MetaCyc is a metabolic-pathway database. The database describes pathways, reactions, and enzymes of a variety of organisms, with a microbial focus.
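The EC-IUBMB enzyme classification listed above uses four-level numbers such as EC 4.3.2.1. A minimal sketch of reading such a number is shown below; the function name `parse_ec` and the returned dictionary layout are assumptions for illustration. The first field selects the top-level enzyme class.

```python
# Top-level EC classes; class 7 (Translocases) was added after these slides were written.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def parse_ec(ec):
    """Split an EC number into its four levels and name its top-level class.

    Hypothetical helper for illustration; accepts "EC 4.3.2.1" or "4.3.2.1",
    and partial numbers such as "4.3.-.-".
    """
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    levels = text.split(".")
    if len(levels) != 4:
        raise ValueError("an EC number has four dot-separated fields: " + ec)
    top = int(levels[0])
    return {"class": EC_CLASSES.get(top, "unknown"), "levels": levels}


# Argininosuccinate lyase is EC 4.3.2.1.
info = parse_ec("EC 4.3.2.1")
print(info["class"])  # Lyases
```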

27 KEGG Metabolic & Regulatory Pathways
KEGG is a suite of databases and associated software, integrating our current knowledge on molecular interaction networks, the information of genes and proteins, and of chemical compounds and reactions.

28 BioCyc (EcoCyc/MetaCyc Metabolic Pathways)
The BioCyc Knowledge Library is a collection of Pathway/Genome Databases.

29 BioCarta Cellular Pathways

30 Protein-Protein Interaction: BIND

31 Gene Ontology (http://www.geneontology.org/)
Three GOs: Molecular Function, Biological Process, Cellular Component
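GO annotation files conventionally mark the three ontologies with the one-letter aspect codes F, P and C. The sketch below groups a few GO terms by ontology using those codes. The function name `group_by_aspect` and the choice of sample terms are assumptions for illustration; the list is a small hand-picked set of terms, not the annotation of any particular protein.

```python
from collections import defaultdict

# One-letter aspect codes used in GO annotation files.
ASPECT_NAMES = {
    "F": "Molecular Function",
    "P": "Biological Process",
    "C": "Cellular Component",
}


def group_by_aspect(terms):
    """Group (term_id, term_name, aspect_code) tuples by ontology name."""
    grouped = defaultdict(list)
    for term_id, name, aspect in terms:
        grouped[ASPECT_NAMES[aspect]].append((term_id, name))
    return dict(grouped)


# Illustrative terms only, one from each ontology.
sample_terms = [
    ("GO:0005212", "structural constituent of eye lens", "F"),
    ("GO:0007601", "visual perception", "P"),
    ("GO:0005737", "cytoplasm", "C"),
]

grouped = group_by_aspect(sample_terms)
for ontology, members in grouped.items():
    print(ontology, members)
```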

32 IV. Databases of Protein Structures
PDB: Structure Determined by X-ray Crystallography and NMR
PDBsum: Summaries and analyses of PDB structures
MMDB: NCBI’s database of 3D structures, part of NCBI Entrez
SWISS-MODEL Repository: Database of annotated protein 3D models
ModBase: Annotated comparative protein structure models
Structure Classification:
  CATH: Hierarchical Classification of Protein Domain Structures
  SCOP: Familial and Structural Protein Relationships
  FSSP: Protein Fold Classification Based on Structure-Structure Alignment

33 PDB 3D Structure
Rat gamma-crystallin, chains A and B.
Can you do a text search at PIR to find this?

34 PDBsum: Summary and Analysis

35 Protein Structural Classification (1)
CATH: Hierarchical domain classification of protein structures ( ucl.ac.uk/bsm/cath_new/)

36 Protein Structural Classification (2)
SCOP: Comprehensive description of structural and evolutionary relationships between all proteins whose structure is known.

37 SWISS-MODEL Repository
A database of annotated three-dimensional comparative protein structure models

38 V. Proteomic Resources
GELBANK: 2D-gel patterns from completed genomes
SWISS-2DPAGE
PEP: Predictions for Entire Proteomes ( pep/): Summarized analyses of protein sequences
Proteome BioKnowledge Library: Detailed information on human, mouse and rat proteomes
Proteome Analysis Database: Online application of InterPro and CluSTr for the functional classification of proteins in whole genomes
Expression Profiling databases: GNF (human and mouse transcriptome), SMD (Stanford microarray data analysis), EBI Microarray Informatics ( index.html , managing, storing and analyzing microarray data)

39 2D-Gel Image Databases (1)

40 2D-Gel Image Databases (2)

41 Expression Profiling (http://genome-www.stanford.edu/serum/)
Human and Mouse Transcriptome

42 Lab
Choose additional protein IDs to browse the variety of molecular biology databases each sequence report links to.
Delta crystallin II (Argininosuccinate lyase) (UniProt: CRD2_ANAPL)
Alpha crystallin (UniProt: CRAA_RABIT)
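The UniProt IDs in this lab follow the entry-name convention of a protein mnemonic and a species mnemonic joined by an underscore. A minimal sketch for splitting such IDs before looking them up is given below; the function name `split_entry_name` and the length limits in the regular expression are assumptions for illustration, not an official validator.

```python
import re

# Assumed shape: protein part, underscore, species part of at most five characters.
ENTRY_NAME = re.compile(r"^([A-Z0-9]{1,10})_([A-Z0-9]{1,5})$")


def split_entry_name(entry_name):
    """Return (protein_mnemonic, species_mnemonic) for a UniProt entry name."""
    match = ENTRY_NAME.match(entry_name)
    if match is None:
        raise ValueError("not a UniProt-style entry name: " + entry_name)
    return match.group(1), match.group(2)


for entry in ("CRD2_ANAPL", "CRAA_RABIT"):
    protein, species = split_entry_name(entry)
    print(entry, "->", protein, species)
```

Entries that share a protein mnemonic but differ in the species part are a convenient starting point for comparing the same protein across organisms.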

