The Gene Ontologies: A Common Language for Annotation of Genes from Yeast, Flies and Mice …and Plants and Worms …and Humans …and anything else!

Presentation transcript:

Gene Ontology Objectives
GO represents categories used to classify specific parts of our biological knowledge:
–Biological Process
–Molecular Function
–Cellular Component
GO develops a common language applicable to any organism.
GO terms can be used to annotate gene products from any species, allowing comparison of information across species.

Expansion of Sequence Info

Entering the Genome Sequencing Era
Eukaryotic Genome Sequences
Genome | Year | Size (Mb) | # Genes
Yeast (S. cerevisiae) ,000
Worm (C. elegans) ,100
Fly (D. melanogaster) ,600
Plant (A. thaliana) ,500
Human (H. sapiens, 1st Draft) | 2001 | ~3000 | ~35,000

Baldauf et al. (2000) Science 290:972

Comparison of sequences from 4 organisms
MCM3, MCM2, CDC46/MCM5, CDC47/MCM7, CDC54/MCM4, MCM6
These proteins form a hexamer in the species that have been examined.

Outline of Topics
–Introduction to the Gene Ontologies (GO)
–Annotations to GO terms
–GO Tools
–Applications of GO

What is an Ontology? (from OED)
1721 BAILEY, Ontology, an Account of being in the Abstract.
(title) A Brief Scheme of Ontology or the Science of Being in General.
a1832 BENTHAM Fragm. Ontol. Wks VIII. 195 The field of ontology, or as it may otherwise be termed, the field of supremely abstract entities, is a yet untrodden labyrinth.
BOSANQUET tr. Lotze's Metaph. 22 Ontology..as a doctrine of the being and relations of all reality, had precedence given to it over Cosmology and Psychology, the two branches of enquiry which follow the reality into its opposite distinctive forms.

What is Ontology?
Dictionary: A branch of metaphysics concerned with the nature and relations of being.
Barry Smith: The science of what is, of the kinds and structures of objects, properties, events, processes and relations in every area of reality.

So what does that mean?
From a practical view, an ontology is the representation of something we know about. “Ontologies” consist of a representation of things that are detectable or directly observable, and the relationships between those things.

Srinija Srinivasan, Chief Ontologist, Yahoo!
The ontology. Dividing human knowledge into a clean set of categories is a lot like trying to figure out where to find that suspenseful black comedy at your corner video store. Questions inevitably come up, like are Movies part of Art or Entertainment? (Yahoo! lists them under the latter.)
-Wired Magazine, May 1996

The 3 Gene Ontologies
Molecular Function = elemental activity/task
–the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity
Biological Process = biological goal or objective
–broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions
Cellular Component = location or complex
–subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme

Example: Gene Product = hammer
Function (what)            | Process (why)
Drive nail (into wood)     | Carpentry
Drive stake (into soil)    | Gardening
Smash roach                | Pest Control
Clown’s juggling object    | Entertainment

Biological Examples
[Figure: examples of Molecular Function, Biological Process and Cellular Component]

Terms, Definitions, IDs
term: MAPKKK cascade (mating sensu Saccharomyces)
goid: GO:
definition: OBSOLETE. MAPKKK cascade involved in transduction of mating pheromone signal, as described in Saccharomyces.
definition_reference: PMID:
comment: This term was made obsolete because it is a gene product specific term. To update annotations, use the biological process term 'signal transduction during conjugation with cellular fusion ; GO: '.
definition: MAPKKK cascade involved in transduction of mating pheromone signal, as described in Saccharomyces
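The term record above can be sketched as a small data structure. This is only an illustration, not the GO file format: the class name `GOTerm`, the `is_obsolete` property and the placeholder ID `GO:XXXXXXX` are assumptions made here, since the real ID does not survive in the transcript.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GOTerm:
    """One ontology term: a name, an ID, a definition and optional notes."""
    name: str
    goid: str
    definition: str
    definition_reference: Optional[str] = None
    comment: Optional[str] = None

    @property
    def is_obsolete(self) -> bool:
        # The slide flags obsolete terms by starting the definition with "OBSOLETE."
        return self.definition.startswith("OBSOLETE")


# Hypothetical placeholder: the actual GO ID is missing from the slide text.
PLACEHOLDER_ID = "GO:XXXXXXX"

term = GOTerm(
    name="MAPKKK cascade (mating sensu Saccharomyces)",
    goid=PLACEHOLDER_ID,
    definition=(
        "OBSOLETE. MAPKKK cascade involved in transduction of mating "
        "pheromone signal, as described in Saccharomyces."
    ),
    comment=(
        "This term was made obsolete because it is a gene product "
        "specific term."
    ),
)
```

Checking `term.is_obsolete` before annotating to a term is one way a tool could steer curators toward the replacement term named in the comment.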

Ontology Includes:
1. A vocabulary of terms (names for concepts)
2. Definitions
3. Defined logical relationships between the terms

[Figure: graph of the terms organelle, nucleus, chromosome and nuclear chromosome, with [other organelles] and [other types of chromosomes]]

Ontology Structure
Ontologies can be represented as graphs, where the nodes are connected by edges.
Nodes = terms in the ontology
Edges = relationships between the concepts

Parent-Child Relationships
A child is a subset or an instance of a parent’s elements.
[Figure: Chromosome and the more specific terms Cytoplasmic chromosome, Mitochondrial chromosome, Plastid chromosome and Nuclear chromosome]

Ontology Structure
The Gene Ontology is structured as a hierarchical directed acyclic graph (DAG).
Terms can have more than one parent and zero, one or more children.
Terms are linked by two relationships:
–is-a (is_a)
–part-of (part_of)

Directed Acyclic Graph (DAG)
[Figure: the terms organelle, nucleus, chromosome and nuclear chromosome, with [other organelles] and [other types of chromosomes], linked by is-a and part-of edges]
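A DAG of this kind can be sketched as a list of typed edges plus an upward traversal. The transcript does not say which relationship joins which pair of terms, so the edge assignments below are an assumption based on the usual reading of this example (nuclear chromosome is a chromosome and is part of the nucleus; nucleus is an organelle). The function names are likewise illustrative.

```python
from collections import defaultdict

# (child, relationship, parent). These edge assignments are assumptions;
# the slide text lists the terms but not which edge is is_a or part_of.
EDGES = [
    ("nuclear chromosome", "is_a", "chromosome"),
    ("nuclear chromosome", "part_of", "nucleus"),
    ("nucleus", "is_a", "organelle"),
]


def build_parents(edges):
    """Index edges by child so each term maps to its (relationship, parent) pairs."""
    parents = defaultdict(list)
    for child, relationship, parent in edges:
        parents[child].append((relationship, parent))
    return parents


def ancestors(term, parents):
    """Return every term reachable from `term` by following parent edges upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


PARENTS = build_parents(EDGES)
```

Because a term may have more than one parent, `ancestors("nuclear chromosome", PARENTS)` collects terms along both the is_a and the part_of paths, which is what distinguishes a DAG from a simple tree.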

Evidence Codes for GO Annotations

Evidence codes
Indicate the type of evidence in the cited source* that supports the association between the gene product and the GO term
*capturing information

Types of evidence codes
–Experimental codes - IDA, IMP, IGI, IPI, IEP
–Computational codes - ISS, IEA, RCA, IGC
–Author statement - TAS, NAS
–Other codes - IC, ND
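The grouping above can be written as a lookup table. This is a minimal sketch using only the codes and category names listed on the slide; the names `EVIDENCE_CATEGORIES`, `CODE_TO_CATEGORY` and `category_of` are chosen here for illustration.

```python
# Evidence codes grouped exactly as on the slide.
EVIDENCE_CATEGORIES = {
    "Experimental": ["IDA", "IMP", "IGI", "IPI", "IEP"],
    "Computational": ["ISS", "IEA", "RCA", "IGC"],
    "Author statement": ["TAS", "NAS"],
    "Other": ["IC", "ND"],
}

# Invert the grouping so a single code can be looked up directly.
CODE_TO_CATEGORY = {
    code: category
    for category, codes in EVIDENCE_CATEGORIES.items()
    for code in codes
}


def category_of(code):
    """Return the category for an evidence code; raises KeyError if unknown."""
    return CODE_TO_CATEGORY[code.upper()]
```

A lookup like `category_of("IEA")` makes it easy, for example, to separate experimentally supported annotations from computational ones before an analysis.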

IDA Inferred from Direct Assay
Direct assay for the function, process, or component indicated by the GO term
–Enzyme assays
–In vitro reconstitution (e.g. transcription)
–Immunofluorescence (for cellular component)
–Cell fractionation (for cellular component)

IMP Inferred from Mutant Phenotype
Variations or changes such as mutations or abnormal levels of a single gene product
–Gene/protein mutation
–Deletion mutant
–RNAi experiments
–Specific protein inhibitors
–Allelic variation

IGI Inferred from Genetic Interaction
Any combination of alterations in the sequence or expression of more than one gene or gene product
–Traditional genetic screens - Suppressors, synthetic lethals
–Functional complementation
–Rescue experiments
An entry in the ‘with’ column is recommended

IPI Inferred from Physical Interaction
Any physical interaction between a gene product and another molecule, ion, or complex
–2-hybrid interactions
–Co-purification
–Co-immunoprecipitation
–Protein binding experiments
An entry in the ‘with’ column is recommended

IEP Inferred from Expression Pattern
Timing or location of expression of a gene
–Transcript levels: Northerns, microarray
Exercise caution when interpreting expression results

ISS Inferred from Sequence or structural Similarity
Sequence alignment, structure comparison, or evaluation of sequence features such as composition
–Sequence similarity
–Recognized domains/overall architecture of protein
An entry in the ‘with’ column is recommended

RCA Inferred from Reviewed Computational Analysis
Non-sequence-based computational method
–large-scale experiments: genome-wide two-hybrid, genome-wide synthetic interactions
–integration of large-scale datasets of several types
–text-based computation (text mining)

IGC Inferred from Genomic Context
Chromosomal position
Most often used for Bacteria - operons
–Direct evidence for a gene being involved in a process is minimal, but for surrounding genes in the operon, the evidence is well-established

IEA Inferred from Electronic Annotation
Annotations that depend directly on computation or automated transfer of annotations from a database
–Hits from BLAST searches
–InterPro2GO mappings
No manual checking
Entry in ‘with’ column is allowed (e.g. sequence ID)
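The ‘with’ column guidance on the IGI, IPI, ISS and IEA slides can be sketched as a simple check on an annotation record. This is an illustration under stated assumptions: the `Annotation` class, its field names and the gene product names used with it are hypothetical and do not reflect any actual GO annotation file layout.

```python
from dataclasses import dataclass
from typing import Optional

# Codes for which the slides recommend an entry in the 'with' column.
WITH_RECOMMENDED = {"IGI", "IPI", "ISS"}
# Code for which the slides say an entry is allowed but not required.
WITH_ALLOWED = {"IEA"}


@dataclass
class Annotation:
    """Hypothetical record linking a gene product to a GO term."""
    gene_product: str
    go_term: str
    evidence_code: str
    with_entry: Optional[str] = None


def with_column_warning(annotation):
    """Return a warning string if a recommended 'with' entry is missing, else None."""
    code = annotation.evidence_code
    if code in WITH_RECOMMENDED and not annotation.with_entry:
        return f"{code}: an entry in the 'with' column is recommended"
    return None
```

For example, `with_column_warning(Annotation("geneA", "protein binding", "IPI"))` returns a warning, while the same record with `with_entry="geneB"` returns `None`; an IEA record passes either way.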

TAS Traceable Author Statement
The publication used to support an annotation doesn't show the evidence
–Review article
It would be better to track down the cited reference and use an experimental code

NAS Non-traceable Author Statement
Statements in a paper that cannot be traced to another publication

ND No biological Data available
Can find no information supporting an annotation to any term
Indicates that a curator has looked for info but found nothing
–Place holder
–Date

IC Inferred by Curator
Annotation is not supported by direct evidence, but can be reasonably inferred from other GO annotations for which evidence is available
–e.g. evidence = transcription factor (function)
–IC = nucleus (component)

Choosing the correct evidence code
Ask yourself: What is the experiment that was done?
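As a rough aid to that question, the examples given on the evidence code slides can be collected into a lookup from experiment type to suggested code. This is only a sketch: the dictionary keys are paraphrases of the slide examples, the function name is invented here, and a real curation decision still depends on reading the paper.

```python
# Experiment types paraphrased from the evidence code slides, mapped to the
# code under which each example was listed. Not an exhaustive or official table.
EXPERIMENT_TO_CODE = {
    "enzyme assay": "IDA",
    "immunofluorescence": "IDA",
    "cell fractionation": "IDA",
    "deletion mutant": "IMP",
    "rnai experiment": "IMP",
    "functional complementation": "IGI",
    "rescue experiment": "IGI",
    "co-immunoprecipitation": "IPI",
    "co-purification": "IPI",
    "northern blot": "IEP",
    "microarray": "IEP",
    "sequence alignment": "ISS",
    "blast hit without manual checking": "IEA",
    "text mining": "RCA",
}


def suggest_evidence_code(experiment):
    """Return the suggested code for an experiment type, or None if not listed."""
    return EXPERIMENT_TO_CODE.get(experiment.strip().lower())
```

Returning `None` for unlisted experiments is deliberate: it signals that a curator needs to decide, rather than guessing a code.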