Methods for Creating GO Annotations Emily Dimmer European Bioinformatics Institute Wellcome Trust Genome Campus Cambridge UK.

1 Methods for Creating GO Annotations Emily Dimmer European Bioinformatics Institute Wellcome Trust Genome Campus Cambridge UK

2 The core information needed for a GO annotation
1. Database object (protein), e.g. Q9ARH1
2. GO term ID, e.g. GO:0004674
3. Reference ID, e.g. PubMed ID: 12374299 or GOA:InterPro
4. Evidence code, e.g. TAS
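
As a rough illustration of the four core fields, the sketch below models one annotation as a small Python record. The class name, the field names and the `PMID:` prefix on the reference are assumptions made for this example; they are not taken from the slides.

```python
import re
from dataclasses import dataclass

# GO term IDs have the form "GO:" followed by seven digits.
GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")


@dataclass(frozen=True)
class GOAnnotation:
    """The four core fields of a GO annotation (field names are illustrative)."""

    db_object_id: str   # database object, e.g. the protein accession "Q9ARH1"
    go_id: str          # GO term ID, e.g. "GO:0004674"
    reference: str      # reference ID, e.g. "PMID:12374299" or "GOA:InterPro"
    evidence_code: str  # evidence code, e.g. "TAS"

    def __post_init__(self):
        # Reject anything that does not look like a GO term ID.
        if not GO_ID_PATTERN.match(self.go_id):
            raise ValueError(f"not a GO term ID: {self.go_id!r}")


# The example values shown on the slide.
example = GOAnnotation("Q9ARH1", "GO:0004674", "PMID:12374299", "TAS")
print(example)
```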

6 GO Evidence Codes
Every GO annotation includes an Evidence Code that gives information about the evidence from which the annotation has been made.
Code  Definition
IEA   Inferred from Electronic Annotation
IDA   Inferred from Direct Assay
IEP   Inferred from Expression Pattern
IGI   Inferred from Genetic Interaction
IMP   Inferred from Mutant Phenotype
IPI   Inferred from Physical Interaction
ISS   Inferred from Sequence Similarity
TAS   Traceable Author Statement
NAS   Non-traceable Author Statement
RCA   Reviewed Computational Analysis
IC    Inferred from Curator
ND    No Data
Manually annotated: all codes except IEA.
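
The table above can be held as a simple lookup. This is a minimal sketch: the helper name `is_manual` is hypothetical, and it assumes that IEA is the only code assigned without a curator, which is how the "Manually annotated" label on the slide is read here.

```python
# Evidence codes and definitions exactly as listed on the slide.
EVIDENCE_CODES = {
    "IEA": "Inferred from Electronic Annotation",
    "IDA": "Inferred from Direct Assay",
    "IEP": "Inferred from Expression Pattern",
    "IGI": "Inferred from Genetic Interaction",
    "IMP": "Inferred from Mutant Phenotype",
    "IPI": "Inferred from Physical Interaction",
    "ISS": "Inferred from Sequence Similarity",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "RCA": "Reviewed Computational Analysis",
    "IC": "Inferred from Curator",
    "ND": "No Data",
}


def is_manual(code: str) -> bool:
    """Return True for manually assigned codes (assumption: everything but IEA)."""
    if code not in EVIDENCE_CODES:
        raise KeyError(f"unknown evidence code: {code!r}")
    return code != "IEA"


for code in ("IEA", "TAS"):
    kind = "manual" if is_manual(code) else "electronic"
    print(code, EVIDENCE_CODES[code], kind)
```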

7 Additional fields can be used to further clarify an annotation:
Qualifiers (NOT, contributes_to, colocalizes_with)
‘with’ data, to provide users with more information on the method/experiment applied.

8 Annotations using the ‘NOT’ qualifier
hSNF2H: ATPase activity (GO:0016887), IDA
Rsf-1: NOT ATPase activity (GO:0016887), IDA
Loyola et al. Mol Cell Biol. 2003 Oct;23(19):6759-68.

9 Annotations using the ‘contributes_to’ qualifier
A protein which is part of a complex can be annotated to Molecular Function terms that describe:
1. its individual action
2. the action of the whole complex
To differentiate between these two types of annotation, if a protein does not possess the activity itself, the annotation has the contributes_to qualifier added.

10 Annotations using the ‘contributes_to’ qualifier
Protein  GO term                            Evidence  Qualifier
Bmi-1    ubiquitin-protein ligase activity  IDA       contributes_to
Ring1A   ubiquitin-protein ligase activity  IDA       contributes_to
Pc3      ubiquitin-protein ligase activity  IDA       contributes_to
Ring1B   ubiquitin-protein ligase activity  IDA
Cao et al. Mol Cell. 2005 Dec 22;20(6):845-54.

11 Annotations using the ‘colocalizes_with’ qualifier
Used with cellular component terms, to describe proteins that are transiently or peripherally associated with an organelle or complex.
CENP-E: condensed chromosome kinetochore, IDA, colocalizes_with
Meyer et al. J Cell Biol. 1997 Feb 24;136(4):775-88.
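
The qualifier rules from the last few slides can be sketched as a small check. The single-letter aspect codes (F for molecular function, P for biological process, C for cellular component) and the function name are assumptions for this example, as is allowing NOT with any aspect; the slides only state that contributes_to applies to Molecular Function terms and colocalizes_with to cellular component terms.

```python
# Which ontology aspects each qualifier may be combined with.
# F = molecular function, P = biological process, C = cellular component.
QUALIFIER_ASPECTS = {
    "NOT": {"F", "P", "C"},       # assumption: NOT is not tied to one aspect
    "contributes_to": {"F"},      # Molecular Function terms only
    "colocalizes_with": {"C"},    # cellular component terms only
}


def qualifier_is_valid(qualifier: str, aspect: str) -> bool:
    """Return True if the qualifier (or no qualifier) fits the term's aspect."""
    if qualifier == "":
        return True  # an annotation without a qualifier is always acceptable
    allowed = QUALIFIER_ASPECTS.get(qualifier)
    return allowed is not None and aspect in allowed


# Bmi-1 contributes to the ligase activity of its complex (a function term).
print(qualifier_is_valid("contributes_to", "F"))
# colocalizes_with does not make sense on a function term.
print(qualifier_is_valid("colocalizes_with", "F"))
```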

12 Annotations using additional identifiers in the ‘with’ column
Provides further information to support the evidence code used in an annotation.
For protein binding annotations… (example table with columns: Protein, GO term, Evidence, Reference, With)
When transferring annotations based on sequence similarity… (example table with columns: Protein, GO term, Evidence, Reference, With)

13 There are two main types of GO annotation: Electronic Annotation and Manual Annotation. Both these methods have their advantages. They can be easily distinguished by the ‘evidence code’ used.

14 Electronic Annotation
Fatty acid biosynthesis (Swiss-Prot keyword) maps to GO:fatty acid biosynthesis (GO:0006633)
EC:6.4.1.2 (EC number) maps to GO:acetyl-CoA carboxylase activity (GO:0003989)
IPR000438: Acetyl-CoA carboxylase carboxyl transferase beta subunit (InterPro entry) maps to GO:acetyl-CoA carboxylase activity (GO:0003989)
MF_00527: Putative 3-methyladenine DNA glycosylase (HAMAP) maps to GO:DNA repair (GO:0006281)
Very high quality. However, these annotations often use high-level GO terms and provide little detail.
Camon et al. BMC Bioinformatics. 2005; 6 Suppl 1:S17
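
One way to picture electronic annotation is as a table lookup: each external identifier already attached to a protein is translated into GO terms, and every resulting annotation carries the IEA evidence code. The sketch below uses the four pairings as read from the slide. The key prefixes (`SP_KW:`, `HAMAP:`), the function name and the accession `P00000` are hypothetical, chosen only for this example.

```python
# External concept -> GO term IDs, using the four examples from the slide.
EXTERNAL_TO_GO = {
    "SP_KW:Fatty acid biosynthesis": ["GO:0006633"],  # Swiss-Prot keyword
    "EC:6.4.1.2": ["GO:0003989"],                     # EC number
    "InterPro:IPR000438": ["GO:0003989"],             # InterPro entry
    "HAMAP:MF_00527": ["GO:0006281"],                 # HAMAP family rule
}


def electronic_annotations(protein_id, cross_refs):
    """Translate a protein's cross-references into unique IEA annotations."""
    seen = set()
    annotations = []
    for ref in cross_refs:
        for go_id in EXTERNAL_TO_GO.get(ref, []):
            if go_id not in seen:  # two sources may imply the same GO term
                seen.add(go_id)
                annotations.append((protein_id, go_id, "IEA"))
    return annotations


# "P00000" is a placeholder accession, not a real entry from the talk.
refs = ["EC:6.4.1.2", "InterPro:IPR000438", "SP_KW:Fatty acid biosynthesis"]
for row in electronic_annotations("P00000", refs):
    print(row)
```

Here the EC number and the InterPro entry both imply GO:0003989, so the sketch keeps a single annotation for that term.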

15 http://www.geneontology.org/GO.indices.shtml Mappings of external concepts to GO
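
The mapping files themselves are plain text and simple to read. The sketch below assumes the common layout `external_id optional name > GO:term name ; GO:id`, with comment lines starting with `!`. That layout and the function name are assumptions, so check them against the file you actually download.

```python
import re

# A GO term ID is "GO:" followed by seven digits; term names do not match this.
GO_ID = re.compile(r"GO:\d{7}")


def parse_mapping_lines(lines):
    """Build {external_id: [GO IDs]} from mapping-file lines (assumed layout)."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and comment/header lines
        left, sep, right = line.partition(" > ")
        if not sep:
            continue  # not a mapping line
        external_id = left.split()[0]  # drop the optional descriptive name
        mapping.setdefault(external_id, []).extend(GO_ID.findall(right))
    return mapping


sample = [
    "! illustrative header line",
    "EC:6.4.1.2 > GO:acetyl-CoA carboxylase activity ; GO:0003989",
    "InterPro:IPR000438 Acetyl-CoA carboxylase carboxyl transferase beta subunit"
    " > GO:acetyl-CoA carboxylase activity ; GO:0003989",
]
print(parse_mapping_lines(sample))
```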

16 InterProScan http://www.ebi.ac.uk/InterProScan

17 Output from InterProScan…

18 Manual Annotation
High-quality, specific annotations made using:
peer-reviewed papers
a range of evidence codes to categorize the types of evidence found in a paper
Very time-consuming, and requires trained biologists.

19 Finding GO terms… …for chicken TaxREB107 protein (Q8UWG7)
"cytoplasmic": Component: cytoplasm GO:0005737
"nucleoli": Component: nucleolus GO:0005730
"increased troponin I reporter gene activity": Process: positive regulation of transcription GO:0045941
"positive modulator of skeletal muscle gene expression": Process: positive regulation of skeletal muscle development GO:0048643

20 http://www.geneontology.org/GO.annotation.shtml

21 Aids for GO manual annotation Many are on the GO Consortium tools page: http://www.geneontology.org/GO.tools.shtml

22 GoPubMed gives an overview of literature abstracts taken from PubMed and categorizes them with Gene Ontology terms: http://gopubmed.org

23 GoPubMed http://gopubmed.org

24 http://www.ebi.ac.uk/Rebholz-srv/whatizit

25 Whatizit http://www.ebi.ac.uk/Rebholz-srv/whatizit (GO terms; UniProt ACs)

31 Searching for GO terms
http://www.ebi.ac.uk/ego
http://www.godatabase.org
…and more varieties of browsers are available on the GO Tools page: http://www.geneontology.org/GO.tools.html

32 http://www.ebi.ac.uk/ego

33 Exact match

36 GO annotation editors
Enhanced spreadsheets (e.g. Excel)
Protein2GO (GOA)
The GO Consortium is aware there is a need for a light-weight, generic GO annotation tool.

37 Enhanced spreadsheets: quick and cheap to start with; however, it is difficult to maintain and update a reasonably sized set of annotations.

38 Protein2GO

45 How users can view GO annotations
Download and parse an entire gene association file…
…or look at annotations for a protein using one of the GO browsers (e.g. QuickGO: http://www.ebi.ac.uk/ego) or a database that integrates GO annotations.
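
Parsing a gene association file can be sketched in a few lines. This assumes the tab-separated layout whose first nine columns are database, object ID, symbol, qualifier, GO ID, reference, evidence code, with/from and aspect, and in which comment lines start with `!`. The dictionary keys, the function name and the sample record (built from the example values earlier in the talk, with a made-up symbol) are illustrative only.

```python
# Names for the first nine tab-separated columns (key names are illustrative).
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "reference", "evidence_code", "with_from", "aspect",
]


def parse_gene_association(lines):
    """Yield one dict per annotation line, skipping comments and short lines."""
    for raw in lines:
        if raw.startswith("!") or not raw.strip():
            continue  # header/comment lines begin with "!"
        fields = raw.rstrip("\n").split("\t")
        if len(fields) < len(GAF_COLUMNS):
            continue  # too few columns to be an annotation line
        # zip stops at the ninth column; any later columns are ignored here.
        yield dict(zip(GAF_COLUMNS, fields))


# A made-up record for illustration; the symbol is hypothetical.
sample_line = "\t".join([
    "UniProt", "Q9ARH1", "EXAMPLE_SYMBOL", "", "GO:0004674",
    "PMID:12374299", "TAS", "", "F",
])
records = list(parse_gene_association(["!comment line\n", sample_line + "\n"]))
print(records[0]["db_object_id"], records[0]["go_id"], records[0]["evidence_code"])
```

Because the function is a generator, a full file can be streamed line by line rather than loaded into memory, which matters for the larger annotation sets.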

46 http://www.geneontology.org/GO.current.annotations.shtml

47 http://www.ebi.ac.uk/goa

48 Acknowledgements
Nicky Mulder, Head of InterPro
Evelyn Camon, GOA Coordinator
Daniel Barrell, GOA Programmer
Rachael Huntley, GOA Curator
David Binns & John Maslen, QuickGO and Protein2GO tools
Achuthanunni C. Balakrishnan, Text-2-GO
Jorge Duarte, IPI sets
Midori Harris, GO Editor
Jane Lomax, GO Curator
Amelia Ireland, GO Curator
Jennifer Clarke, GO Curator
Rolf Apweiler, Head of Sequence Database Group
The Gene Ontology Consortium and 1.5 members of GOA are currently supported by a P41 grant from the National Human Genome Research Institute (NHGRI) [grant HG002273]. GOA is also supported by core EMBL funding.

