Gene Ontology (GO) Project


1 Gene Ontology (GO) Project
Jane Lomax
Hi, my name is Jennifer Clark and I work at the Gene Ontology Consortium editorial office at the European Bioinformatics Institute in Cambridge in the UK. Today I’m going to give you an introduction to the work of the Gene Ontology Consortium.

2 There is a lot of biological research output

3 You’re interested in which genes control mesoderm development…

4 You get 6752 results! How will you ever find what you want?

5 How will you spot the patterns?
[Figure labels: attacked, time, control; puparial adhesion, molting cycle, hemocyanin; defense response, immune response, response to stimulus; Toll regulated genes, JAK-STAT regulated genes; amino acid catabolism, lipid metabolism, peptidase activity, protein catabolism]
Microarray data shows changed expression of thousands of genes. How will you spot the patterns?
Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.

6 Scientists work hard

7 There are lots of papers to read

8 More papers…

9 more and more and more…

10 more and more and more! Help!

11 The Gene Ontology provides a way to capture and represent all this biological knowledge in a computable form

12 The Gene Ontology

13 GO browser

14 Search on ‘mesoderm development’

15

16 Definition of mesoderm development Gene products involved in

17 GO can be used to help analyse microarray data

18 Microarray process
Treat samples
Collect mRNA
Label
Hybridize
Scan
Normalize
Select differentially regulated genes
Understand the biological phenomena involved
So this is a typical process that you’d use to collect data from your microarray. Just like that - makes it look very easy! So for a certain set of conditions, you have a set of differentially regulated genes, and what you really want to know about is the underlying biological processes involved.

19 Traditional analysis
gene by gene basis
requires literature searching
time-consuming
So this gene-by-gene approach has the major disadvantage that you have to delve into the literature yourself, which is obviously very time-consuming.

20 Traditional analysis
Gene 1: Apoptosis, Cell-cell signaling, Protein phosphorylation, Mitosis
Gene 2: Growth control, Oncogenesis
Gene 3: Growth control, Mitosis, Oncogenesis, Protein phosphorylation
Gene 4: Nervous system, Pregnancy
Gene 100: Positive ctrl. of cell prolif, Glucose transport
Typically, this is the way the analysis would have been done. Taking your differentially regulated genes, you’d analyse them one by one, researching what is known about each gene and what processes it is involved in.

21 Using GO annotations
But by using GO annotations, this work has already been done
GO: : apoptosis
But by using GO annotations, this work has already been done for you - someone has already sat down and associated a particular gene with a particular process…

22 Grouping by process
Mitosis: Gene 2, Gene 5, Gene 45, Gene 7, Gene 35 …
Glucose transport: Gene 7, Gene 3, Gene 6
Apoptosis: Gene 1, Gene 53
Positive ctrl. of cell prolif.: Gene 7, Gene 3, Gene 12
Growth: Gene 5, Gene 2, Gene 6
So you have the ability to group your differentially regulated genes by process…
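The grouping step can be sketched in a few lines of Python. This is only an illustration: the gene names and the gene-to-process assignments below are made up for the example (loosely echoing the slide labels), and group_by_process is a hypothetical helper, not part of any GO tool.

```python
from collections import defaultdict

# Hypothetical gene -> process annotations (illustrative only, not real data).
annotations = {
    "Gene 1": ["apoptosis"],
    "Gene 2": ["mitosis", "growth"],
    "Gene 5": ["mitosis", "growth"],
    "Gene 7": ["mitosis", "glucose transport"],
}

def group_by_process(gene_to_terms):
    """Invert a gene -> terms mapping into a term -> sorted genes mapping."""
    groups = defaultdict(list)
    for gene, terms in gene_to_terms.items():
        for term in terms:
            groups[term].append(gene)
    return {term: sorted(genes) for term, genes in groups.items()}

groups = group_by_process(annotations)
for term in sorted(groups):
    print(term, "->", ", ".join(groups[term]))
```

Because the annotations already exist, the per-gene literature search is replaced by a simple lookup and inversion.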

23 GO for microarray analysis
Annotations give ‘function’ label to genes
Ask meaningful questions of microarray data, e.g. genes involved in the same process, same/different expression patterns?
GO and its annotations are useful in microarrays mainly for comparing gene expression patterns to gene function, allowing more meaningful interpretation of microarray data.

24 How does the Gene Ontology work?

25 GO structure GO isn’t just a flat list of biological terms
terms are related within a hierarchy Two arrangements for DNA replication

26 GO structure gene A

27 GO structure
This means genes can be grouped according to user-defined levels
Allows broad overview of gene set or genome

28 How does GO work? What information might we want to capture about a gene product?

29 How does GO work? What information might we want to capture about a gene product? What does the gene product do?

30 How does GO work? What information might we want to capture about a gene product? What does the gene product do? Where and when does it act?

31 How does GO work? What information might we want to capture about a gene product? What does the gene product do? Where and when does it act? Why does it perform these activities?

32 GO structure
GO terms divided into three parts:
cellular component
molecular function
biological process

33 Cellular Component where a gene product acts

34 Cellular Component

35 Cellular Component

36 Cellular Component Enzyme complexes in the component ontology refer to places, not activities.

37 Molecular Function
activities or “jobs” of a gene product
glucose-6-phosphate isomerase activity

38 Molecular Function
insulin binding
insulin receptor activity

39 Molecular Function
drug transporter activity

40 Molecular Function
A gene product may have several functions; a function term refers to a reaction or activity, not a gene product
Sets of functions make up a biological process

41 Biological Process a commonly recognized series of events
cell division

42 Biological Process transcription

43 Biological Process
regulation of gluconeogenesis

44 Biological Process limb development

45 Biological Process courtship behavior

46 Ontology Structure
Terms are linked by two relationships: is-a and part-of

47 Ontology Structure
[Diagram: cell, membrane, chloroplast, mitochondrial membrane, chloroplast membrane, with links labelled is-a and part-of]

48 Ontology Structure Ontologies are structured as a hierarchical directed acyclic graph (DAG) Terms can have more than one parent and zero, one or more children

49 Ontology Structure
[Diagram: cell, membrane, chloroplast, mitochondrial membrane, chloroplast membrane]
Directed Acyclic Graph (DAG) - multiple parentage allowed
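A small Python sketch may help show what multiple parentage means in practice. The edges below are an assumed reading of the diagram on the slide, not a copy of the real ontology, and ancestors is a hypothetical helper written for this example.

```python
# Each term maps to a list of (relationship, parent) pairs.
# ASSUMPTION: these edges are a plausible reading of the slide's diagram,
# not official GO data.
PARENTS = {
    "cell": [],
    "membrane": [("part_of", "cell")],
    "chloroplast": [("part_of", "cell")],
    "mitochondrial membrane": [("is_a", "membrane")],
    "chloroplast membrane": [("is_a", "membrane"), ("part_of", "chloroplast")],
}

def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

print(sorted(ancestors("chloroplast membrane")))
```

Here "chloroplast membrane" has two parents, one through is-a and one through part-of, which a strict tree could not express. Walking upward from any term collects all of its broader terms, which is what makes grouping at a chosen level possible.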

50 Anatomy of a GO term
id: GO:0006094 (unique GO ID)
name: gluconeogenesis (term name)
namespace: process (ontology)
def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [ (definition)
exact_synonym: glucose biosynthesis (synonym)
xref_analog: MetaCyc:GLUCONEO-PWY (database ref)
is_a: GO: (parentage)
is_a: GO:
17800 terms in three ontologies
94% of terms defined
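Since a term is stored as a series of tag and value lines, it is straightforward to read by program. The sketch below is a minimal reader for lines shaped like those on the slide; it is not a full parser for the ontology file format. The parse_stanza helper is hypothetical, and the definition and is_a lines are left out because they are truncated on the slide.

```python
# Tag-value lines copied from the slide; the truncated def and is_a
# lines are deliberately omitted rather than guessed.
STANZA = """\
id: GO:0006094
name: gluconeogenesis
namespace: process
exact_synonym: glucose biosynthesis
xref_analog: MetaCyc:GLUCONEO-PWY
"""

def parse_stanza(text):
    """Parse 'tag: value' lines into a dict of tag -> list of values."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first ': ' only, so values may contain colons.
        tag, _, value = line.partition(": ")
        record.setdefault(tag, []).append(value)
    return record

term = parse_stanza(STANZA)
print(term["id"][0], term["name"][0])
```

Values are kept in lists because some tags, such as the parentage lines, can appear more than once for a single term.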

51 GO can also be useful for resolving language conflicts amongst scientific communities

52 In biology… ?
Tactition, Taction, Tactile sense
In developing the ontologies we are solving a number of problems for biologists. Currently in biology there are many ambiguities in language. Groups of researchers may use the same words to mean different things, or they may use several different words to refer to the same thing. This causes problems for scientists trying to access research carried out by groups outside their immediate field. It also makes it very difficult to process biological information using a computer.
For example, three groups of biologists studying different model organisms may all be studying the perception of touch. Scientists in different groups might talk about this single process as ‘tactition’, ‘tactile sense’ or ‘taction’. This differing use of language means that they will have more trouble finding and reading each other’s papers. It will also be harder for them to use a computer to find and interpret biological data on this subject, since the computer has no way to know that these words mean the same thing.

53 Tactition, Taction, Tactile sense
perception of touch ; GO:0050975
The GO provides a solution to this problem since we take biological concepts like the perception of touch and we make them a single GO item in the ontology. We add all the relevant synonyms, and give a unique numerical identifier to the concept.
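To make this concrete, here is a minimal Python sketch of how synonyms can resolve to one identifier. The term name, synonyms and ID come from the slide; the data layout and the build_index and lookup helpers are assumptions made for this illustration.

```python
# One concept, one identifier, several synonyms (values from the slide).
TERMS = {
    "GO:0050975": {
        "name": "perception of touch",
        "synonyms": ["tactition", "taction", "tactile sense"],
    },
}

def build_index(terms):
    """Map every lower-cased name and synonym to its GO identifier."""
    index = {}
    for go_id, term in terms.items():
        for label in [term["name"], *term["synonyms"]]:
            index[label.lower()] = go_id
    return index

INDEX = build_index(TERMS)

def lookup(phrase):
    """Return the GO identifier for a phrase, or None if it is unknown."""
    return INDEX.get(phrase.strip().lower())

for phrase in ["Tactition", "Taction", "Tactile sense"]:
    print(phrase, "->", lookup(phrase))
```

Whichever word a research group prefers, the computer ends up at the same identifier, so data described with different vocabulary can be pooled.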

54 Bud initiation?
The GO also provides a solution to the opposite problem in which several groups of scientists use the same words to refer to different things. For example the phrase ‘bud initiation’ could refer to the initiation of a tooth bud, a yeast reproductive bud, or a bud on a tree. However, these three types of bud are initiated in quite different ways, and scientists would like to be able to distinguish between them.

55
= tooth bud initiation
= cellular bud initiation
= flower bud initiation
To solve this problem the GO differentiates between differing concepts by adding a ‘sensu ending’. So according to this example, we would have ‘bud initiation sensu Metazoa’ to mean the kind of bud initiation that gives rise to a tooth in mammals. We would have ‘bud initiation sensu Saccharomyces’ for the initiation of a reproductive bud in yeast, and we would have ‘bud initiation sensu Viridiplantae’ for the initiation of a tree bud. This means that gene products involved in bud initiation can be categorised along with only those other gene products involved in the same kind of bud initiation.

56 Categorization of gene products
using GO is called annotation. So how does that happen?

57 P05147 Take a gene or protein

58 P05147 Find papers about it PMID:

59 Find the GO term describing its function, process or location of action.
PMID: GO:

60 P05147 PMID: What evidence do they show? IDA GO:

61 Record these:
P05147 GO:0047519 IDA PMID:2976880
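The four recorded items can be thought of as one annotation record. The Python sketch below shows one way to hold them; the values are the ones on the slide, while the Annotation class, its field names and the tab-separated layout are assumptions for illustration and not the consortium's actual submission format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    """One gene product linked to one GO term (hypothetical field names)."""
    gene_product: str   # accession of the gene or protein
    go_id: str          # GO term identifier
    evidence_code: str  # e.g. IDA (inferred from direct assay)
    reference: str      # the paper supporting the annotation

def to_row(annotation):
    """Serialise the record as a tab-separated line (illustrative layout)."""
    return "\t".join([
        annotation.gene_product,
        annotation.go_id,
        annotation.evidence_code,
        annotation.reference,
    ])

record = Annotation("P05147", "GO:0047519", "IDA", "PMID:2976880")
print(to_row(record))
```

Keeping the evidence code and the reference alongside the term means anyone using the annotation can trace it back to the experiment that supports it.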

62 Submit to the GO Consortium

63 Annotation appears in GO database

64 Many species groups annotate
We see the research of one function across all species
Clark et al., 2005

65 Adding terms to the GO

66 Developing GO
GO under constant development
International group of developers
central editorial office at EBI - 4 members
Developed in consultation with domain experts
Term suggestions handled through online tracking system

67

68

69

70

71 2006 Consortium Meeting, St. Croix, U.S. Virgin Islands, March 30 - April 3, 2006

72 Contributors
dictyBase
FlyBase
GeneDB
Gramene
Reactome
WormBase
The GO Editorial Office
Berkeley Bioinformatics and Ontology Project (BBOP)
Gene Ontology EBI (GOA)
Mouse Genome Database (MGD) and Gene Expression Database (GXD)
Rat Genome Database (RGD)
Saccharomyces Genome Database (SGD)
The Arabidopsis Information Resource (TAIR)
The Institute for Genomic Research (TIGR)
Zebrafish Information Network (ZFIN)
Finally, this is the current list of groups in the consortium. The Editorial office where the ontologies are developed is in Cambridge in the UK, and the rest of these groups contribute annotations. We are keen to include more groups in the annotation process so that more species will be manually annotated to the GO. Harold Drabkin is now going to talk about the process of annotation and about how you can contribute manual annotations of your own gene products.

