Today’s menu: -SwissProt/TrEMBL -PROSITE -Pfam -Gene Onltology Protein and Function Databases Tutorial 7.

Slides:



Advertisements
Similar presentations
A Comparative mapping resource ONTOLOGY DEVELOPMENT AND INTEGRATION IN GRAMENE Pankaj Jaiswal Cornell University.
Advertisements

Applications of GO. Goals of Gene Ontology Project.
GO : the Gene Ontology “because you know sometimes words have two meanings” Amelia Ireland GO Curator EBI, Cambridge, UK.
Annotating Gene Products to the GO Harold J Drabkin Senior Scientific Curator The Jackson Laboratory Mouse.
Slide-1 ONTOLOGY DEVELOPMENT AND INTEGRATION Tutorial exercise: A preview.
1 Welcome to the Protein Database Tutorial This tutorial will describe how to navigate the section of Gramene that provides collective information on proteins.
Gene Ontology John Pinney
EBI is an Outstation of the European Molecular Biology Laboratory. Alex Mitchell InterPro team Using InterPro for functional analysis.
©CMBI 2005 Exploring Protein Sequences - Part 2 Part 1: Patterns and Motifs Profiles Hydropathy Plots Transmembrane helices Antigenic Prediction Signal.
Introduction to Functional Analysis J.L. Mosquera and Alex Sanchez.
Intro to Bioinformatics Summary. What did we learn Pairwise alignment – Local and Global Alignments When? How ? Tools : for local blast2seq, for global.
COG and GO tutorial.
Genome analysis and annotation Part II. THE INSTITUTE FOR GENOMIC RESEARCH TIGRTIGR Evidence View S.mansoni PASA assemblies S. japonicum EST alignments.
Biology 224 Dr. Tom Peavy Sept 28 & 30
Today’s menu: -UniProt - SwissProt/TrEMBL -PROSITE -Pfam -Gene Onltology Protein and Function Databases Tutorial 7.
Tutorial 5 Motif discovery.
Protein analysis and proteomics Friday, 27 January 2006 Introduction to Bioinformatics DA McClellan
Biology 224 Dr. Tom Peavy Sept 27 & 29 Protein Structure & Analysis- part 2.
Pattern databases in protein analysis Arthur Gruber Instituto de Ciências Biomédicas Universidade de São Paulo AG-ICB-USP.
Today’s menu: -UniProt - SwissProt/TrEMBL -PROSITE -Pfam -Gene Onltology Protein and Function Databases Tutorial 7.
1 Gene Ontology and Semantic Similarity Measures.
Protein and Function Databases
Introduction to Bioinformatics - Tutorial no. 8 Protein Prediction: - PROSITE - Pfam - SCOP - TOPITS - genThreader.
Today’s menu: -UniProt - SwissProt/TrEMBL -PROSITE -Pfam -Gene Onltology Protein and Function Databases Tutorial 7.
Predicting Function (& location & post-tln modifications) from Protein Sequences June 15, 2015.
BTN323: INTRODUCTION TO BIOLOGICAL DATABASES Day2: Specialized Databases Lecturer: Junaid Gamieldien, PhD
Methods for Creating GO Annotations Emily Dimmer European Bioinformatics Institute Wellcome Trust Genome Campus Cambridge UK.
Protein analysis and proteomics (Part 1 of 2). Many of the images in this powerpoint presentation are from Bioinformatics and Functional Genomics by Jonathan.
PAT project Advanced bioinformatics tools for analyzing the Arabidopsis genome Proteins of Arabidopsis thaliana (PAT) & Gene Ontology (GO) Hongyu Zhang,
A Common Language for Annotation of Genes from Yeast, Flies and Mice The Gene Ontologies …and Plants and Worms …and Humans …and anything else!
Automatic methods for functional annotation of sequences Petri Törönen.
Wellcome Trust Workshop Working with Pathogen Genomes Module 3 Sequence and Protein Analysis (Using web-based tools)
GO : the Gene Ontology “because you know sometimes words have two meanings” Amelia Ireland GO Curator EBI, Cambridge, UK.
Slide-1 DEVELOPMENT AND INTEGRATION OF ONTOLOGIES IN GRAMENE Scientific Advisory Board Meeting January 2005.
Annotating Gene Products to the GO Harold J Drabkin Senior Scientific Curator The Jackson Laboratory Mouse.
Biology 224 Instructor: Tom Peavy Feb 21 & 26, Protein Structure & Analysis.
Ontologies, data standards and controlled vocabularies.
Sequence analysis: Macromolecular motif recognition Sylvia Nagl.
GENE ONTOLOGY FOR THE NEWBIES Suparna Mundodi, PhD The Arabidopsis Information Resources, Stanford, CA.
Multiple Alignment and Phylogenetic Trees Csc 487/687 Computing for Bioinformatics.
Gene expression analysis
Lecture Four: GO: The Gene Ontology ----Infrastructure for Systems Biology.
BIOINFORMATIK I UEBUNG 2 mRNA processing.
BLOCKS Multiply aligned ungapped segments corresponding to most highly conserved regions of proteins- represented in profile.
Monday, November 8, 2:30:07 PM  Ontology is the philosophical study of the nature of being, existence or reality as such, as well as the basic categories.
Alastair Kerr, Ph.D. WTCCB Bioinformatics Core An introduction to DNA and Protein Sequence Databases.
Tutorial 7 Gene expression analysis 1. Expression data –GEO –UCSC –ArrayExpress General clustering methods –Unsupervised Clustering Hierarchical clustering.
Protein and RNA Families
Mining Biological Data. Protein Enzymatic ProteinsTransport ProteinsRegulatory Proteins Storage ProteinsHormonal ProteinsReceptor Proteins.
Functional Annotation and Functional Enrichment. Annotation Structural Annotation – defining the boundaries of features of interest (coding regions, regulatory.
Motif discovery and Protein Databases Tutorial 5.
Rice Proteins Data acquisition Curation Resources Development and integration of controlled vocabulary Gene Ontology Trait Ontology Plant Ontology
Protein Domain Database
PROTEIN PATTERN DATABASES. PROTEIN SEQUENCES SUPERFAMILY FAMILY DOMAIN MOTIF SITE RESIDUE.
Sequence Based Analysis Tutorial March 26, 2004 NIH Proteomics Workshop Lai-Su L. Yeh, Ph.D. Protein Science Team Lead Protein Information Resource at.
March 28, 2002 NIH Proteomics Workshop Bethesda, MD Lai-Su Yeh, Ph.D. Protein Scientist, National Biomedical Research Foundation Demo: Protein Information.
1 Annotation EPP 245/298 Statistical Analysis of Laboratory Data.
S. pombe Unicellular archiascomycete Diverged from S. cerevisiae Ma Size ~14 Mb, 3 chromosomes No synteny Data stored in GeneDB.
InterPro Sandra Orchard.
Welcome to the Protein Database Tutorial. This tutorial will describe how to navigate the section of Gramene that provides collective information on proteins.
The Biologist’s Wishlist A complete and accurate set of all genes and their genomic positions A set of all the transcripts produced by each gene The location.
Protein families, domains and motifs in functional prediction May 31, 2016.
Sequence: PFAM Used example: Database of protein domain families. It is based on manually curated alignments.
Protein families, domains and motifs in functional prediction
Protein Families, Motifs & Domains.
Sequence based searches:
Genome Annotation Continued
There are four levels of structure in proteins
Sequence Based Analysis Tutorial
Sequence Based Analysis Tutorial
Presentation transcript:

Today’s menu: -SwissProt/TrEMBL -PROSITE -Pfam -Gene Onltology Protein and Function Databases Tutorial 7

Characterized proteins Hypothetical proteins

Pfam Pfam is a database of multiple alignments of protein domains or conserved protein regions.

ls mode: a hit is reported if it globally aligns to the seed fs mode: a hit is reported if it locally aligns to the seed

Description Structure info Gene Ontology Links

What kind of domains can we find in Pfam? Trusted Domains Repeats and Motifs Fragment Domains Nested Domains Disulfide bonds Important residues (e.g active sites) Trans membrane domains

What kind of domains can we find in Pfam? Low complexity regions Coiled Coils: (two or three alpha helices that wind around each other) Pfam-B Context domains: are those that despite not scoring above the family threshold are expected to be real based on the other domains found in the protein Signal peptides: (indicate a protein that will be secreted)

ProSite is a database of protein domains and motifs that can be searched by either regular expression patterns or sequence profiles.

Search Results Domains architecture

PRATT Make a pattern from FASTA format sequences

PRATT

Greed, Overlap and Include Search A-x(1,3)-A on ABACADAEAFA

Gene Ontology (GO) It is a database of biological processes, molecular functions and cellular components. GO does not contain sequence information nor gene or protein description. GO is linked to gene and protein databases. The GO database is structured as a tree

Three principal branches

GO structure is a Directed Acyclic Graph

Important: note what is the source of the GO entry

GO sources ISSInferred from Sequence/Structural Similarity IDAInferred from Direct Assay IPIInferred from Physical Interaction TASTraceable Author Statement NASNon-traceable Author Statement IMPInferred from Mutant Phenotype IGIInferred from Genetic Interaction IEPInferred from Expression Pattern ICInferred by Curator NDNo Data available IEAInferred from electronic annotation