Gene Ontology (GO) Project

Presentation transcript:

Gene Ontology (GO) Project http://www.geneontology.org/ Jane Lomax Hi, my name is Jennifer Clark and I work at the Gene Ontology Consortium editorial office at the European Bioinformatics Institute in Cambridge in the UK. Today I’m going to give you an introduction to the work of the Gene Ontology Consortium.

There is a lot of biological research output

You’re interested in which genes control mesoderm development…

You get 6752 results! How will you ever find what you want?

How will you spot the patterns? Microarray data shows changed expression of thousands of genes. Figure labels: attacked, control, time; Puparial adhesion, Molting cycle, hemocyanin, Defense response, Immune response, Response to stimulus, Toll regulated genes, JAK-STAT regulated genes, Amino acid catabolism, Lipid metabolism, Peptidase activity, Protein catabolism. Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.

Scientists work hard

There are lots of papers to read http://www.teamtechnology.co.uk/f-scientist.jpg

More papers… http://www.teamtechnology.co.uk/f-scientist.jpg

more and more and more… http://www.teamtechnology.co.uk/f-scientist.jpg

more and more and more! Help! http://www.teamtechnology.co.uk/f-scientist.jpg

The Gene Ontology provides a way to capture and represent all this biological knowledge in a computable form

The Gene Ontology

GO browser

Search on ‘mesoderm development’

Definition of mesoderm development; gene products involved in mesoderm development

GO can be used to help analyse microarray data

Microarray process: treat samples; collect mRNA; label; hybridize; scan; normalize; select differentially regulated genes; understand the biological phenomena involved. So this is a typical process that you’d use to collect data from your microarray. Just like that - makes it look very easy! So for a certain set of conditions, you have a set of differentially regulated genes, and what you really want to know about is the underlying biological processes involved.

Traditional analysis: gene-by-gene basis; requires literature searching; time-consuming. So this gene-by-gene approach has the major disadvantage that you have to delve into the literature yourself, which is obviously very time-consuming.

Traditional analysis
Gene 1: Apoptosis, Cell-cell signaling, Protein phosphorylation, Mitosis, …
Gene 2: Growth control, Oncogenesis
Gene 3: Growth control, Mitosis, Oncogenesis, Protein phosphorylation, …
Gene 4: Nervous system, Pregnancy
Gene 100: Positive ctrl. of cell prolif., Glucose transport
Typically, this is the way the analysis would have been done. Taking your differentially regulated genes, you’d analyse them one by one, researching what is known about each gene and what processes it is involved in.

Using GO annotations GO:0006915 : apoptosis But by using GO annotations, this work has already been done for you - someone has already sat down and associated a particular gene with a particular process…

Grouping by process
Mitosis: Gene 2, Gene 5, Gene 45, Gene 7, Gene 35, …
Glucose transport: Gene 7, Gene 3, Gene 6, …
Apoptosis: Gene 1, Gene 53
Positive ctrl. of cell prolif.: Gene 7, Gene 3, Gene 12, …
Growth: Gene 5, Gene 2, Gene 6, …
So you have the ability to group your differentially regulated genes by process…
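As a rough illustration of grouping by process, the sketch below inverts a gene-to-process table into a process-to-gene table. The gene names and process labels are a small made-up subset loosely based on the slides, and `group_by_process` is a hypothetical helper, not part of any GO tool.

```python
from collections import defaultdict

# Hypothetical annotations: each gene maps to the processes it is involved in.
annotations = {
    "Gene 1": ["apoptosis", "mitosis"],
    "Gene 2": ["growth control", "oncogenesis"],
    "Gene 3": ["growth control", "mitosis", "oncogenesis"],
}


def group_by_process(gene_to_processes):
    """Invert a gene -> processes mapping into process -> sorted list of genes."""
    groups = defaultdict(list)
    for gene, processes in gene_to_processes.items():
        for process in processes:
            groups[process].append(gene)
    return {process: sorted(genes) for process, genes in groups.items()}


groups = group_by_process(annotations)
for process in sorted(groups):
    print(process, "->", ", ".join(groups[process]))
```

Once the annotations exist, the grouping itself is a simple table inversion; the expensive part, reading the literature for each gene, is what the annotators have already done.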

GO for microarray analysis Annotations give a ‘function’ label to genes. Ask meaningful questions of microarray data, e.g. do genes involved in the same process show the same or different expression patterns? GO and its annotations are useful in microarrays mainly for comparing gene expression patterns to gene function, allowing more meaningful interpretation of microarray data.

How does the Gene Ontology work?

GO structure GO isn’t just a flat list of biological terms; terms are related within a hierarchy. Two arrangements for DNA replication

GO structure gene A

GO structure This means genes can be grouped according to user-defined levels. This allows a broad overview of a gene set or genome.
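Grouping at a user-defined level can be sketched as follows: a gene annotated to a specific term also counts towards every broader term above it. The term names, the parent links and the gene annotations here are all invented for illustration and are not taken from the real GO graph.

```python
# Hypothetical child -> parents links; the real GO graph is far larger.
PARENTS = {
    "DNA metabolism": [],
    "DNA replication": ["DNA metabolism"],
    "DNA replication initiation": ["DNA replication"],
    "DNA repair": ["DNA metabolism"],
}

# Hypothetical direct annotations of genes to specific terms.
ANNOTATIONS = {
    "gene A": "DNA replication initiation",
    "gene B": "DNA replication",
    "gene C": "DNA repair",
}


def term_and_ancestors(term):
    """Return the term itself plus every term above it in the graph."""
    found = {term}
    for parent in PARENTS[term]:
        found |= term_and_ancestors(parent)
    return found


def genes_under(broad_term):
    """List genes annotated to broad_term or to any term below it."""
    return sorted(
        gene
        for gene, term in ANNOTATIONS.items()
        if broad_term in term_and_ancestors(term)
    )


print(genes_under("DNA replication"))
print(genes_under("DNA metabolism"))
```

Choosing a broader term gives a coarser overview of the gene set; choosing a more specific one gives finer groups.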

How does GO work? What information might we want to capture about a gene product? What does the gene product do? Where and when does it act? Why does it perform these activities?

GO structure GO terms are divided into three parts: cellular component, molecular function, biological process

Cellular Component where a gene product acts

Cellular Component

Cellular Component

Cellular Component Enzyme complexes in the component ontology refer to places, not activities.

Molecular Function: activities or “jobs” of a gene product. Example: glucose-6-phosphate isomerase activity

Molecular Function Examples: insulin binding, insulin receptor activity

Molecular Function Example: drug transporter activity

Molecular Function A gene product may have several functions; a function term refers to a reaction or activity, not a gene product. Sets of functions make up a biological process.

Biological Process a commonly recognized series of events cell division

Biological Process transcription

Biological Process regulation of gluconeogenesis

Biological Process limb development

Biological Process courtship behavior

Ontology Structure Terms are linked by two relationships: is-a and part-of

Ontology Structure Diagram: the terms cell, membrane, chloroplast, mitochondrial membrane and chloroplast membrane, linked by is-a and part-of relationships

Ontology Structure Ontologies are structured as a hierarchical directed acyclic graph (DAG). Terms can have more than one parent and zero, one or more children.

Ontology Structure Directed Acyclic Graph (DAG) - multiple parentage allowed. Diagram: the terms cell, membrane, chloroplast, mitochondrial membrane and chloroplast membrane
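A minimal sketch of such a graph, using the five terms from the diagram. The specific is-a and part-of edges below are assumptions made for illustration (the slide image is not reproduced here), and `ancestors` is a hypothetical helper. The point is only that one term can have more than one parent.

```python
# Each term maps to a list of (relationship, parent) pairs.
# The edges are assumed for illustration; they are not copied from GO.
PARENTS = {
    "cell": [],
    "membrane": [("part_of", "cell")],
    "chloroplast": [("part_of", "cell")],
    "mitochondrial membrane": [("is_a", "membrane")],
    "chloroplast membrane": [("is_a", "membrane"), ("part_of", "chloroplast")],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(ancestors("chloroplast membrane")))
```

Because "chloroplast membrane" has two parents here, it is reached both from "membrane" and from "chloroplast", which a strict tree could not express.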

Anatomy of a GO term
id: GO:0006094 (unique GO ID)
name: gluconeogenesis (term name)
namespace: process (ontology)
def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html] (definition)
exact_synonym: glucose biosynthesis (synonym)
xref_analog: MetaCyc:GLUCONEO-PWY (database ref)
is_a: GO:0006006 (parentage)
is_a: GO:0006092 (parentage)
17800 terms in three ontologies; 94% of terms defined
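To show how a record like this could be read by a program, here is a simplified sketch that splits "tag: value" lines into a dictionary. It is not a full parser for the GO flat-file format; `parse_stanza` is a hypothetical helper, and the definition line is left out to keep the example short.

```python
from collections import defaultdict

# The gluconeogenesis record from the slide, without the definition line.
STANZA = """\
id: GO:0006094
name: gluconeogenesis
namespace: process
exact_synonym: glucose biosynthesis
xref_analog: MetaCyc:GLUCONEO-PWY
is_a: GO:0006006
is_a: GO:0006092
"""


def parse_stanza(text):
    """Parse 'tag: value' lines into a dict of tag -> list of values."""
    term = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first ': ' only, so values such as 'GO:0006094' stay whole.
        tag, _, value = line.partition(": ")
        term[tag].append(value.strip())
    return dict(term)


term = parse_stanza(STANZA)
print(term["id"][0], term["name"][0], term["is_a"])
```

Values are kept as lists because a tag such as `is_a` can appear more than once, which is how multiple parentage shows up in the record.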

GO can also be useful for resolving language conflicts amongst scientific communities

In biology… Tactition? Taction? Tactile sense? In developing the ontologies we are solving a number of problems for biologists. Currently in biology there are many ambiguities in language. Groups of researchers may use the same words to mean different things, or they may use several different words to refer to the same thing. This causes problems for scientists trying to access research carried out by groups outside their immediate field. It also makes it very difficult to process biological information using a computer. For example, three groups of biologists studying different model organisms may all be studying the perception of touch. Scientists in different groups might talk about this single process as ‘tactition’, ‘tactile sense’ or ‘taction’. This differing use of language means that they will have more trouble finding and reading each other’s papers. It will also be harder for them to use a computer to find and interpret biological data on this subject, since the computer has no way to know that these words mean the same thing.

Tactition, Taction, Tactile sense: perception of touch ; GO:0050975 The GO provides a solution to this problem since we take biological concepts like the perception of touch and we make them a single GO item in the ontology. We add all the relevant synonyms, and give a unique numerical identifier to the concept.
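The synonym idea can be sketched in a few lines: every name a community uses for a concept resolves to the same identifier. The ID, name and synonyms come from the slide; the data layout and the `resolve` helper are assumptions made for illustration.

```python
# One concept, one ID, several synonyms (values taken from the slide).
TERMS = {
    "GO:0050975": {
        "name": "perception of touch",
        "synonyms": ["tactition", "taction", "tactile sense"],
    },
}


def build_lookup(terms):
    """Map every lower-cased name and synonym to its GO ID."""
    lookup = {}
    for go_id, term in terms.items():
        for label in [term["name"], *term["synonyms"]]:
            lookup[label.lower()] = go_id
    return lookup


LOOKUP = build_lookup(TERMS)


def resolve(label):
    """Return the GO ID for a name or synonym, or None if it is unknown."""
    return LOOKUP.get(label.strip().lower())


for word in ["Tactition", "taction", "Tactile sense"]:
    print(word, "->", resolve(word))
```

With a lookup like this, a search for any of the three words finds data annotated to the same term.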

Bud initiation? The GO also provides a solution to the opposite problem in which several groups of scientists use the same words to refer to different things. For example the phrase ‘bud initiation’ could refer to the initiation of a tooth bud, a yeast reproductive bud, or a bud on a tree. However, these three types of bud are initiated in quite different ways, and scientists would like to be able to distinguish between them.

Slide labels: tooth bud initiation; cellular bud initiation; flower bud initiation. To solve this problem the GO differentiates between differing concepts by adding a ‘sensu ending’. So according to this example, we would have ‘bud initiation sensu Metazoa’ to mean the kind of bud initiation that gives rise to a tooth in mammals. We would have ‘bud initiation sensu Saccharomyces’ for the initiation of a reproductive bud in yeast, and we would have ‘bud initiation sensu Viridiplantae’ for the initiation of a tree bud. This means that gene products involved in bud initiation can be categorised along with only those other gene products involved in the same kind of bud initiation.

Categorization of gene products using GO is called annotation. So how does that happen?

P05147 Take a gene or protein

P05147 Find papers about it PMID: 2976880

PMID: 2976880 Find the GO term describing its function, process or location of action. GO:0047519

P05147 PMID: 2976880 What evidence do they show? IDA GO:0047519

Record these: P05147 GO:0047519 IDA PMID:2976880
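The four things recorded in the steps above can be sketched as one small record. The field names and the tab-separated `as_row` output are assumptions for illustration and are not the Consortium's actual submission format; the values are the ones shown on the slides. IDA is the evidence code for "Inferred from Direct Assay".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One gene product linked to one GO term, with its supporting evidence."""

    gene_product: str  # database identifier of the gene or protein
    go_id: str         # GO term describing function, process or location
    evidence: str      # evidence code, e.g. IDA
    reference: str     # paper that supports the annotation

    def as_row(self):
        """Render the record as a single tab-separated line."""
        return "\t".join(
            [self.gene_product, self.go_id, self.evidence, self.reference]
        )


annotation = Annotation("P05147", "GO:0047519", "IDA", "PMID:2976880")
print(annotation.as_row())
```

Keeping the evidence code and the reference alongside the term is what lets a later user judge how well supported an annotation is.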

Submit to the GO Consortium

Annotation appears in GO database

Many species groups annotate We see the research of one function across all species (Clark et al., 2005)

Adding terms to the GO

Developing GO GO is under constant development. International group of developers; central editorial office at EBI - 4 members. Developed in consultation with domain experts. Term suggestions are handled through an online tracking system.

2006 Consortium Meeting, St. Croix, U.S. Virgin Islands, March 30 - April 3, 2006

Contributors: dictyBase, FlyBase, GeneDB, Gramene, Reactome, WormBase, The GO Editorial Office, Berkeley Bioinformatics and Ontology Project (BBOP), Gene Ontology Annotation @ EBI (GOA), Mouse Genome Database (MGD) and Gene Expression Database (GXD), Rat Genome Database (RGD), Saccharomyces Genome Database (SGD), The Arabidopsis Information Resource (TAIR), The Institute for Genomic Research (TIGR), Zebrafish Information Network (ZFIN). Finally, this is the current list of groups in the consortium. The Editorial Office, where the ontologies are developed, is in Cambridge in the UK, and the rest of these groups contribute annotations. We are keen to include more groups in the annotation process so that more species will be manually annotated to the GO. Harold Drabkin is now going to talk about the process of annotation and about how you can contribute manual annotations of your own gene products.