Methods for Creating GO Annotations Emily Dimmer European Bioinformatics Institute Wellcome Trust Genome Campus Cambridge UK.

Presentation transcript:


The core information needed for a GO annotation
1. Database object (protein), e.g. Q9ARH1
2. GO term ID, e.g. GO:
3. Reference ID, e.g. a PubMed ID or GOA:InterPro
4. Evidence code, e.g. TAS
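The four core fields can be pictured as one small record. The following is only a sketch: the class name `GOAnnotation`, its field names, and the `is_complete` helper are illustrative assumptions, and the GO and PubMed identifiers are placeholders rather than real IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GOAnnotation:
    """The four core fields of a GO annotation (hypothetical field names)."""
    db_object_id: str   # database object, e.g. the protein accession Q9ARH1
    go_id: str          # GO term ID
    reference: str      # e.g. a PubMed ID or GOA:InterPro
    evidence_code: str  # e.g. TAS

    def is_complete(self) -> bool:
        """True when none of the four core fields is empty."""
        return all((self.db_object_id, self.go_id,
                    self.reference, self.evidence_code))


# GO:0000000 and PMID:0000000 are placeholders, not real identifiers.
example = GOAnnotation("Q9ARH1", "GO:0000000", "PMID:0000000", "TAS")
print(example.is_complete())
```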


GO Evidence Codes
Every GO annotation includes an Evidence Code that gives information about the evidence from which the annotation has been made.

Code  Definition
IEA   Inferred from Electronic Annotation
IDA   Inferred from Direct Assay
IEP   Inferred from Expression Pattern
IGI   Inferred from Genetic Interaction
IMP   Inferred from Mutant Phenotype
IPI   Inferred from Physical Interaction
ISS   Inferred from Sequence Similarity
TAS   Traceable Author Statement
NAS   Non-traceable Author Statement
RCA   Reviewed Computational Analysis
IC    Inferred by Curator
ND    No Data

Manually annotated: all codes except IEA.
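The code list lends itself to a simple lookup. This is a minimal sketch, assuming that IEA is the only code assigned without a curator; the names `EVIDENCE_CODES` and `is_manual` are illustrative and not part of any GO tool.

```python
# Evidence codes from the slide, keyed by code.
EVIDENCE_CODES = {
    "IEA": "Inferred from Electronic Annotation",
    "IDA": "Inferred from Direct Assay",
    "IEP": "Inferred from Expression Pattern",
    "IGI": "Inferred from Genetic Interaction",
    "IMP": "Inferred from Mutant Phenotype",
    "IPI": "Inferred from Physical Interaction",
    "ISS": "Inferred from Sequence Similarity",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "RCA": "Reviewed Computational Analysis",
    "IC": "Inferred by Curator",
    "ND": "No Data",
}


def is_manual(code: str) -> bool:
    """Return True for curator-assigned codes, False for IEA."""
    if code not in EVIDENCE_CODES:
        raise ValueError(f"unknown evidence code: {code}")
    return code != "IEA"


for code in ("IEA", "TAS"):
    kind = "manual" if is_manual(code) else "electronic"
    print(code, EVIDENCE_CODES[code], kind)
```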

Additional fields can be used to further clarify an annotation:
- Qualifiers (NOT, contributes_to, colocalizes_with)
- ‘with’ data, to provide users with more information on the method/experiment applied

Annotations using the ‘NOT’ qualifier

Protein  Qualifier  GO term          GO ID  Evidence
hSNF2H              ATPase activity  GO:    IDA
Rsf-1    NOT        ATPase activity  GO:    IDA

Reference: Loyola et al. Mol Cell Biol Oct;23(19):

Annotations using the ‘contributes_to’ qualifier
A protein which is part of a complex can be annotated to Molecular Function terms that describe:
1. its individual action
2. the action of the whole complex
To differentiate between these two types of annotation, the contributes_to qualifier is added when the protein does not possess the activity itself.

Annotations using the ‘contributes_to’ qualifier

Protein  GO term                            Evidence  Qualifier
Bmi-1    ubiquitin-protein ligase activity  IDA       contributes_to
Ring1A   ubiquitin-protein ligase activity  IDA       contributes_to
Pc3      ubiquitin-protein ligase activity  IDA       contributes_to
Ring1B   ubiquitin-protein ligase activity  IDA

Reference: Cao et al. Mol Cell Dec 22;20(6):

Annotations using the ‘colocalizes_with’ qualifier
Used with cellular component terms, to describe proteins that are transiently or peripherally associated with an organelle or complex.

Protein  GO term                           Evidence  Qualifier
CENP-E   condensed chromosome kinetochore  IDA       colocalizes_with

Reference: Meyer et al. J Cell Biol Feb 24;136(4):
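The pairing of qualifiers with ontology aspects described on the qualifier slides can be expressed as a small check. This is a sketch only: the function name and aspect labels are hypothetical, and it assumes that NOT may be used with any aspect, which the slides do not state explicitly.

```python
# Aspect each qualifier is used with, per the slides.
QUALIFIER_ASPECT = {
    "contributes_to": "molecular_function",
    "colocalizes_with": "cellular_component",
}


def qualifier_allowed(qualifier: str, aspect: str) -> bool:
    """Check whether a qualifier fits the aspect of the annotated GO term."""
    if qualifier in ("", "NOT"):
        # Assumption: no qualifier, or NOT, is valid for every aspect.
        return True
    # Unknown qualifiers map to None and are therefore rejected.
    return QUALIFIER_ASPECT.get(qualifier) == aspect


print(qualifier_allowed("contributes_to", "molecular_function"))
print(qualifier_allowed("colocalizes_with", "molecular_function"))
```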

Annotations using additional identifiers in the ‘with’ column
Provides further information to support the evidence code used in an annotation.

For protein binding annotations…
Protein  GO term  Evidence  Reference  With

When transferring annotations based on sequence similarity…
Protein  GO term  Evidence  Reference  With
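A curation pipeline might flag annotations that should carry a ‘with’ identifier but do not. The sketch below assumes that protein binding annotations use the IPI code and sequence-similarity transfers use ISS; the record layout, the function name, and the protein names (PROT_A and so on) are invented for illustration.

```python
# Assumption: these evidence codes are expected to come with a 'with' value.
WITH_EXPECTED = {"IPI", "ISS"}


def missing_with(annotations):
    """Return annotations that need a 'with' identifier but have none."""
    return [a for a in annotations
            if a["evidence"] in WITH_EXPECTED and not a.get("with")]


# Hypothetical records; protein names are placeholders.
annotations = [
    {"protein": "PROT_A", "term": "protein binding", "evidence": "IPI", "with": "PROT_B"},
    {"protein": "PROT_C", "term": "protein binding", "evidence": "IPI", "with": ""},
    {"protein": "PROT_D", "term": "ATPase activity", "evidence": "IDA", "with": ""},
]

for a in missing_with(annotations):
    print(a["protein"], a["evidence"], "has no 'with' identifier")
```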

There are two main types of GO annotation:
- Electronic Annotation
- Manual Annotation
Both methods have their advantages. They can be easily distinguished by the ‘evidence code’ used.

Electronic Annotation

Fatty acid biosynthesis (Swiss-Prot keyword) -> GO:Fatty acid biosynthesis (GO: )
EC: (EC number) -> GO:acetyl-CoA carboxylase activity (GO: )
IPR000438: Acetyl-CoA carboxylase carboxyl transferase beta subunit (InterPro entry) -> GO:acetyl-CoA carboxylase activity (GO: )
MF_00527: Putative 3-methyladenine DNA glycosylase (HAMAP) -> GO:DNA repair (GO: )

Very high quality. However, these annotations often use high-level GO terms and provide little detail.
Camon et al. BMC Bioinformatics. 2005; 6 Suppl 1:S17

Mappings of external concepts to GO
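Electronic annotation through such mappings amounts to a table lookup: each external concept attached to a protein is translated into the GO terms it maps to, and the result is tagged IEA. A minimal sketch follows, using GO term names rather than IDs; the key prefixes, the function name, and the protein ID `EXAMPLE1` are assumptions made for illustration, and the three mappings are the ones recoverable from the electronic annotation slide.

```python
# External concept -> GO term names (illustrative subset).
MAPPINGS = {
    "SP_KW:Fatty acid biosynthesis": ["fatty acid biosynthesis"],
    "InterPro:IPR000438": ["acetyl-CoA carboxylase activity"],
    "HAMAP:MF_00527": ["DNA repair"],
}


def electronic_annotations(protein_id, cross_refs):
    """Turn a protein's cross-references into (protein, term, code, source) tuples."""
    seen = set()
    result = []
    for ref in cross_refs:
        for term in MAPPINGS.get(ref, []):  # unmapped concepts yield nothing
            if term not in seen:            # keep one annotation per GO term
                seen.add(term)
                result.append((protein_id, term, "IEA", ref))
    return result


# EXAMPLE1 is a hypothetical protein identifier.
for row in electronic_annotations("EXAMPLE1", ["InterPro:IPR000438", "Pfam:unmapped"]):
    print(row)
```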

InterProScan

Output from InterProScan…

Manual Annotation
High-quality, specific annotations made using:
- peer-reviewed papers
- a range of evidence codes to categorize the types of evidence found in a paper
Very time consuming, and requires trained biologists.

Finding GO terms…
…for chicken TaxREB107 protein (Q8UWG7)

Text in the paper                                        GO term
cytoplasmic                                              Component: cytoplasm GO:
nucleoli                                                 Component: nucleolus GO:
increased troponin I reporter gene activity              Process: positive regulation of transcription GO:
positive modulator of skeletal muscle gene expression    Process: positive regulation of skeletal muscle development GO:

Aids for GO manual annotation
Many are on the GO Consortium tools page:

GoPubMed
GoPubMed gives an overview of literature abstracts taken from PubMed and categorizes them with Gene Ontology terms.

GoPubMed

Whatizit
GO terms, UniProt ACs

Searching for GO terms
…and more varieties of browsers are available on the GO Tools page:

Exact match

GO annotation editors
- enhanced spreadsheets (e.g. Excel)
- Protein2GO (GOA)
The GO Consortium is aware there is a need for a light-weight, generic GO annotation tool.

Enhanced Spreadsheets
Quick and cheap to start with; however, it is difficult to maintain and update a reasonably sized set of annotations.

Protein2GO

How users can view GO annotations
- Download and parse an entire gene association file…
- …or look at annotations for a protein using one of the GO browsers (such as QuickGO) or a database that integrates GO annotations.
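Parsing a gene association file is mostly a matter of splitting tab-separated lines and skipping the `!` comment lines. The sketch below reads only the first nine columns of the GAF layout; the function name, the dictionary keys, and the two sample records (including their GO and PubMed identifiers) are made up for illustration and do not come from a real file.

```python
import csv
import io
from collections import defaultdict

# Names for the first nine tab-separated columns of a gene association file.
GAF_COLUMNS = ["db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
               "reference", "evidence", "with_from", "aspect"]


def parse_gaf(handle):
    """Yield one dict per annotation line, skipping comments and blank lines."""
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row or row[0].startswith("!"):
            continue
        yield dict(zip(GAF_COLUMNS, row))


# Invented sample data; identifiers are placeholders.
SAMPLE = (
    "!gaf-version: 2.0\n"
    "UniProtKB\tQ9ARH1\tEX1\t\tGO:0000000\tPMID:0000000\tTAS\t\tF\n"
    "UniProtKB\tQ9ARH1\tEX1\tNOT\tGO:0000001\tPMID:0000001\tIDA\t\tF\n"
)

by_protein = defaultdict(list)
for record in parse_gaf(io.StringIO(SAMPLE)):
    by_protein[record["db_object_id"]].append(record)

for protein, records in by_protein.items():
    print(protein, [(r["go_id"], r["evidence"], r["qualifier"]) for r in records])
```

For a real file, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle.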

Acknowledgements
Nicky Mulder: Head of InterPro
Evelyn Camon: GOA Coordinator
Daniel Barrell: GOA Programmer
Rachael Huntley: GOA Curator
David Binns & John Maslen: QuickGO, Protein2GO tools
Achuthanunni C. Balakrishnan: Text-2-GO
Jorge Duarte: IPI sets
Midori Harris: GO Editor
Jane Lomax: GO Curator
Amelia Ireland: GO Curator
Jennifer Clarke: GO Curator
Rolf Apweiler: Head of Sequence Database Group

The Gene Ontology Consortium and 1.5 members of GOA are currently supported by a P41 grant from the National Human Genome Research Institute (NHGRI) [grant HG002273]. GOA is also supported by core EMBL funding.