Tools in Bioinformatics: Ontologies and pathways



Why are ontologies needed?
- Free text is the best way to describe what a protein does to a human reader
- However, it is a lousy way to tell that to a computer
- When are we interested in a computer-interpretable annotation?
  - We want all the proteins associated with a certain disease
  - We want all the proteins localized to the lysosome
  - We found a cluster of “interesting” genes and we want to know what they are involved in
  - We want to measure the similarity between gene pairs

Simple solution
- The simplest solution is to use a set of keywords for every protein
- Why is this a bad solution?

What’s in a name?
- Glucose synthesis
- Glucose biosynthesis
- Glucose formation
- Glucose anabolism
- Gluconeogenesis
All refer to the process of making glucose from simpler components

What’s in a name?
The problem:
- Same name for different concepts
- Different names for the same concept
- Vast amounts of biological data from different sources
- As a result, cross-species or cross-database comparison is difficult
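The "different names for the same concept" problem can be sketched in a few lines of Python. This is only an illustration of what a controlled vocabulary buys us: the identifier EX:0000001 and the lookup table are made up for the example and are not taken from GO.

```python
# Hypothetical identifier and synonym table, for illustration only.
SYNONYMS = {
    "glucose synthesis": "EX:0000001",
    "glucose biosynthesis": "EX:0000001",
    "glucose formation": "EX:0000001",
    "glucose anabolism": "EX:0000001",
    "gluconeogenesis": "EX:0000001",
}


def normalize(label):
    """Return the controlled-vocabulary ID for a free-text label, or None."""
    return SYNONYMS.get(label.strip().lower())


labels = ["Glucose synthesis", "gluconeogenesis ", "glucose import"]
ids = [normalize(label) for label in labels]
```

Once every database stores the identifier instead of the free-text name, two records can be compared by simple equality, whatever wording the original curator preferred.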

What is the Gene Ontology?
A (part of the) solution:
- The Gene Ontology: “a controlled vocabulary that can be applied to all organisms even as knowledge of gene and protein roles in cells is accumulating and changing”
- A controlled vocabulary to describe gene products (proteins and RNA) in any organism

What is GO?
- One of the Open Biological Ontologies
- Standard, species-neutral way of representing biology
- Three structured networks of defined terms to describe gene product attributes
- More like a phrase book than a biology text book

How does GO work?
What information might we want to capture about a gene product?
- What does the gene product do? Molecular function
- Where and when does it act? Cellular component
- What is the purpose of these activities? Biological process

Molecular Function
Activities or “jobs” of a gene product
- insulin binding
- insulin receptor activity

Cellular Component
Where a gene product acts

Cellular Component

Enzyme complexes in the component ontology refer to places, not activities.

Biological Process
A commonly recognized series of events
- cell division

Biological Process
- transcription

Ontology Structure
- Ontologies are structured as a hierarchical directed acyclic graph
- Terms can have more than one parent and zero, one or more children
- Terms are linked by two relationships: is-a and part-of

Ontology Structure
[Diagram: example terms (cell, membrane, chloroplast, mitochondrial, chloroplast membrane) linked by is-a and part-of relationships]
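A directed acyclic graph of terms can be sketched as a mapping from each term to its parents. The snippet below is a minimal illustration, not the way GO is actually stored: the specific edges are assumed for the example (the diagram's exact links are not recoverable from the transcript), and the names PARENTS and ancestors are hypothetical.

```python
# Each term maps to a list of (relationship, parent) pairs.
# The edges are assumed for illustration; they are not read from GO itself.
PARENTS = {
    "cell": [],
    "membrane": [("part-of", "cell")],
    "chloroplast": [("part-of", "cell")],
    # A term may have more than one parent, via different relationships.
    "chloroplast membrane": [("is-a", "membrane"), ("part-of", "chloroplast")],
}


def ancestors(term, parents=PARENTS):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relation, parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


membrane_ancestors = ancestors("chloroplast membrane")
```

Because a term can have several parents, the walk upward has to track which terms were already visited; a plain tree traversal would reach "cell" twice.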

True Path Rule
- The path from a child term all the way up to its top-level parent(s) must always be true
  cell
    nucleus
      chromosome
- But what about bacteria? (They have chromosomes but no nucleus.)

True Path Rule
Resolved component ontology structure:
  cell
    cytoplasm
    chromosome
      nuclear chromosome
    nucleus
      nuclear chromosome
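The effect of the true path rule can be checked with a small sketch. It assumes that annotating a gene product to a term implies every ancestor of that term, and it encodes the two structures from the slides as plain parent lists; the function name implied_terms is hypothetical.

```python
def implied_terms(term, parents):
    """Return the term plus all of its ancestors in the given structure."""
    result = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


# First structure: chromosome sits under nucleus.
ORIGINAL = {"nucleus": ["cell"], "chromosome": ["nucleus"]}

# Resolved structure: nuclear chromosome has two parents.
RESOLVED = {
    "cytoplasm": ["cell"],
    "chromosome": ["cell"],
    "nucleus": ["cell"],
    "nuclear chromosome": ["chromosome", "nucleus"],
}

# A bacterial gene product annotated to "chromosome".
before = implied_terms("chromosome", ORIGINAL)
after = implied_terms("chromosome", RESOLVED)
```

In the first structure the bacterial annotation implies "nucleus", which is false for bacteria; in the resolved structure it implies only "chromosome" and "cell", so the path stays true.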

GO Annotation
- Using GO terms to represent the activities and localizations of a gene product
- Annotations contributed by members of the GO Consortium
  - model organism databases
  - cross-species databases, e.g. UniProt
- Annotations freely available from the GO website

GO Annotation
- Electronic annotation
  - From mapping files, e.g. UniProt keyword2go
  - High quantity but low quality
  - Annotations to low level terms
  - Not checked by curators
- Manual annotation
  - From literature curation
  - Time consuming but high quality
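Since electronic and manual annotations differ in quality, it is often useful to separate them before an analysis. The sketch below assumes each annotation carries an evidence code and that "IEA" (inferred from electronic annotation) marks the electronic ones; treating every other code as manual is a simplification, and the records themselves are made up.

```python
from collections import namedtuple

Annotation = namedtuple("Annotation", ["gene_product", "term", "evidence"])

# Made-up records for illustration; gene names are hypothetical.
ANNOTATIONS = [
    Annotation("geneA", "insulin binding", "IEA"),
    Annotation("geneA", "insulin receptor activity", "IDA"),
    Annotation("geneB", "transcription", "IEA"),
    Annotation("geneC", "cell division", "IMP"),
]


def split_by_curation(annotations):
    """Split annotations into (electronic, manual) lists by evidence code."""
    electronic = [a for a in annotations if a.evidence == "IEA"]
    manual = [a for a in annotations if a.evidence != "IEA"]
    return electronic, manual


electronic, manual = split_by_curation(ANNOTATIONS)
```

An analysis that needs high confidence would then use only the manual list, at the cost of covering fewer gene products.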

Where do we see GO annotations?
- Entrez Gene / GeneCards / SwissProt
- Organism-specific databases
- amigo.geneontology.org/

Pathways – beyond terms
- Saying that a gene participates in gluconeogenesis and binds pyruvate in the nucleus does not give us all the information
- Pathway databases specify the place of a specific gene/protein with respect to other genes doing similar jobs

KEGG – Kyoto Encyclopedia of Genes and Genomes
- Manually annotated “Reference maps” linked to hundreds of genomes
- Focus on metabolic pathways
- Can be used to answer questions:
  - Give me all the genes involved in pathway X!
  - Given a set of genes, is there a pathway that has a lot of genes in our set?
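The second question, whether a pathway holds "a lot" of the genes in our set, is commonly answered with a hypergeometric test. The slides do not name a method, so this is one possible approach rather than what KEGG itself does; the function names are hypothetical, and both gene sets are assumed to be subsets of the background.

```python
from math import comb


def enrichment_p_value(population, pathway_size, sample_size, overlap):
    """Probability of seeing at least `overlap` pathway genes in a random
    sample of `sample_size` genes drawn from `population` genes."""
    upper = min(pathway_size, sample_size)
    total = comb(population, sample_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def pathway_enrichment(gene_set, pathway_genes, background):
    """Enrichment p-value for one pathway, given sets of gene identifiers."""
    overlap = len(gene_set & pathway_genes)
    return enrichment_p_value(
        len(background), len(pathway_genes), len(gene_set), overlap
    )


# Toy example with made-up gene identifiers.
background = {f"g{i}" for i in range(20)}
pathway = {"g0", "g1", "g2", "g3", "g4"}
our_genes = {"g0", "g1", "g2", "g3", "g4"}
p = pathway_enrichment(our_genes, pathway, background)
```

A small p-value says the overlap is larger than expected by chance. When many pathways are tested at once, the p-values should also be corrected for multiple testing.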

KEGG

BioCarta
- Focus on human signaling pathways

MSigDB
- So far we saw curated databases
  - Focus on the established knowledge
  - Always lagging behind
- MSigDB combines “established” knowledge with gene sets that came up in some experiment
  - Up regulated after UV exposure
  - Down in colorectal cancers
  - Predicted targets of some transcription factor
- Frequently more useful than GO/KEGG