Gene Ontology John Pinney

Gene annotation. Goal: transfer knowledge about the function of gene products from model organisms to other genomes

Gene annotation. Problem: keyword systems are different between research communities

Gene annotation. Solution: controlled vocabulary

Ontology: structured controlled vocabulary

Ontology: a collection of terms and their definitions and the logical relationships between them

Gene Ontology (GO): a collection of terms and their definitions and the logical relationships between them describing gene products

nucleus “A membrane-bounded organelle of eukaryotic cells in which chromosomes are housed and replicated. In most cells, the nucleus contains all of the cell's chromosomes except the organellar chromosomes, and is the site of RNA synthesis and processing. In some species, or in specialized cell types, RNA metabolism or DNA replication may be absent.” GO:

[Diagram: “part of” relationships among cell, nucleus, nuclear membrane, nucleoplasm and nucleolus]

[Diagram: “is a” relationships among intracellular organelle, membrane-bounded organelle, intracellular membrane-bounded organelle, nucleus and pronucleus]

A term may have more than one parent term and more than one child term. => The gene ontology is not a tree

The gene ontology has a structure known as a Directed Acyclic Graph (DAG). Directed: relationships are not symmetrical. Acyclic: there are no directed loops. Graph: mathematical term for a network.
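A minimal sketch of how such a DAG might be held in memory, assuming the “is a” and “part of” links suggested by the nucleus slides above. The edge list and the helper name `ancestors` are illustrative and are not loaded from the GO files.

```python
# Illustrative parent links: child -> list of (relationship, parent).
# These edges are an assumption based on the nucleus slides, not real GO data.
PARENTS = {
    "pronucleus": [("is_a", "nucleus")],
    "nucleus": [
        ("is_a", "intracellular membrane-bounded organelle"),
        ("part_of", "cell"),
    ],
    "intracellular membrane-bounded organelle": [
        ("is_a", "intracellular organelle"),
        ("is_a", "membrane-bounded organelle"),
    ],
    "nucleolus": [("part_of", "nucleus")],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in parents.get(current, []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(ancestors("pronucleus")))
```

Because a term can list several parents, the structure is a graph and not a tree. The traversal terminates because the links are directed and contain no loops.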

GO is actually made up of 3 different ontologies: cellular component, molecular function and biological process

cellular component “The part of a cell or its extracellular environment in which a gene product is located. A gene product may be located in one or more parts of a cell.”

cellular component examples: cohesin core heterodimer; extracellular region; laminin-1 complex; replication fork; transcription factor complex

molecular function “Elemental activities, such as catalysis or binding, describing the actions of a gene product at the molecular level. A given gene product may exhibit one or more molecular functions.”

molecular function examples: transcription factor binding; enzyme activator activity; 3'-nucleotidase activity; metallopeptidase activity; hexokinase activity

biological process “Those processes specifically pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms. A process is a collection of molecular events with a defined beginning and end.”

biological process examples: para-aminobenzoic acid biosynthetic process; protein localization; establishment of blood-nerve barrier; circadian rhythm; posterior midgut development

geneontology.org

geneontology.org: search and browse the ontologies

geneontology.org: download ontologies

geneontology.org: download mappings from other databases: enzyme functions (EC, KEGG, MetaCyc); protein domains (Pfam, SMART, PRINTS,…); other controlled vocabularies of functions (E. coli functions, MIPS FunCat)

geneontology.org: download annotations for various genomes

geneontology.org: download annotations for various genomes. Example annotation line: NCBI_NP NP_ lolD GO: ISS "ABC transporter, nucleotide binding/ATPase protein (lipoprotein)" taxon: PAMGO_GAT. Fields highlighted: database; gene product ID; gene symbol; GO term ID; evidence code.

evidence codes: allow curators to indicate the type of evidence for each gene-term annotation. Experimental, e.g. IMP (Inferred from Mutant Phenotype), IDA (Inferred from Direct Assay). Computational, e.g. ISS (Inferred from Sequence Similarity), IGC (Inferred from Genomic Context). Author statement, e.g. TAS (Traceable Author Statement).
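As a small illustration of how these categories might be used, the sketch below groups the five codes named on the slide and keeps only annotations with experimental support. The function name `keep_experimental` and the example tuples are hypothetical. GO defines more evidence codes than the ones listed here.

```python
# Evidence codes named on the slide, grouped by the slide's three categories.
EVIDENCE_CATEGORY = {
    "IMP": "experimental",
    "IDA": "experimental",
    "ISS": "computational",
    "IGC": "computational",
    "TAS": "author statement",
}


def keep_experimental(annotations):
    """Keep only (gene, term, code) tuples backed by experimental evidence."""
    return [a for a in annotations if EVIDENCE_CATEGORY.get(a[2]) == "experimental"]


# Hypothetical annotations: gene symbols and term names are placeholders.
example = [
    ("geneA", "term1", "IDA"),
    ("geneB", "term1", "ISS"),
    ("geneC", "term2", "TAS"),
]
print(keep_experimental(example))
```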

geneontology.org: download annotations for various genomes. Example annotation line: NCBI_NP NP_ lolD GO: ISS "ABC transporter, nucleotide binding/ATPase protein (lipoprotein)" taxon: PAMGO_GAT. Fields highlighted: database; gene product ID; gene symbol; GO term ID; evidence code; description; organism (taxon) ID; date; annotation project ID.
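A hedged sketch of reading one such annotation line, assuming the tab-separated GO annotation file (GAF) layout. The column order below is an assumption that should be checked against the current GAF specification. Every identifier in the sample line is a made-up placeholder, because the identifiers on the slide are truncated.

```python
# Column names in assumed GAF order (verify against the GAF specification).
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "db_reference", "evidence_code", "with_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type",
    "taxon", "date", "assigned_by",
]


def parse_gaf_line(line):
    """Split one tab-separated annotation line into a dict; skip '!' comments."""
    if line.startswith("!"):
        return None
    fields = line.rstrip("\n").split("\t")
    return dict(zip(GAF_COLUMNS, fields))


# Hypothetical line: the IDs, reference, aspect, taxon and date are placeholders.
sample = "\t".join([
    "NCBI_NP", "NP_0000000", "lolD", "", "GO:0000000", "PMID:0000000", "ISS", "",
    "F", "ABC transporter, nucleotide binding/ATPase protein (lipoprotein)",
    "", "protein", "taxon:0000", "20000101", "PAMGO_GAT",
])
record = parse_gaf_line(sample)
print(record["db_object_symbol"], record["go_id"], record["evidence_code"])
```

Returning a dict keyed by column name keeps later filtering code readable, for example when selecting annotations by evidence code.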

geneontology.org: repository of analysis tools that use GO: search, edit and browse ontologies / annotations; software libraries; statistical analysis; text mining; protein interactions; enrichment analysis

Enrichment analysis

A gene set can come from: significant expression change in a microarray experiment; a cluster from a protein interaction network; some other experiment / analysis. The gene set is compared against the whole (annotated) genome: which GO terms occur significantly more often than expected in this gene set? Tools: BiNGO, GOstat, ArrayTrack.
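One common way to answer the question on this slide is a one-sided hypergeometric test, sketched below for a single GO term. This is an illustrative assumption: the slides do not say which test the named tools use, and real tools typically add a multiple-testing correction across terms. The function name and the counts are hypothetical.

```python
from math import comb


def enrichment_p_value(genome_size, genome_with_term, set_size, set_with_term):
    """P(X >= set_with_term) when set_size genes are drawn without replacement
    from a genome in which genome_with_term genes carry the GO term."""
    total = comb(genome_size, set_size)
    upper = min(set_size, genome_with_term)
    tail = sum(
        comb(genome_with_term, k)
        * comb(genome_size - genome_with_term, set_size - k)
        for k in range(set_with_term, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 annotated genes, 5 carry the term,
# and all 5 genes in the gene set carry it.
print(enrichment_p_value(10, 5, 5, 5))
```

A small p-value suggests the term appears in the gene set more often than random sampling from the annotated genome would predict.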

Advantages of GO: A single set of terms describes the function of gene products from all organisms. The DAG structure provides a logical framework to represent knowledge at whatever level of detail is available. GO is continually revised to reflect the state of current knowledge. The strength of relationships between terms can be quantified (semantic similarity). Many statistical analysis tools are available.

Limitations of GO: GO is limited in scope. It does not cover: processes that are not normal functions of gene products (e.g. oncogenesis); sequence attributes (e.g. introns/exons); protein structures or interactions; evolution; gene expression.

Summary (1) The gene ontology (GO) is a structured, controlled vocabulary to describe the function of gene products. Terms in GO have logical relationships (“is a”, “part of”) with one another. Together these form a structure called a Directed Acyclic Graph (DAG). GO is formed of 3 separate ontologies describing different aspects of gene function: cellular component, molecular function and biological process.

Summary (2) geneontology.org is the central resource for downloading ontology, annotation and mapping files. Evidence codes are used in annotations to show the experimental, computational or literature support for each function.

Summary (3) Many software tools are available to support GO analysis of experimental data, including enrichment analysis by ArrayTrack (microarray expression data), BiNGO (protein interaction clusters) and GOstat (any data in the form of gene sets).