Annotation: linking literature to gene products

How to find existing annotation for gene products and identify literature containing unannotated data.

Functional Annotation: linking literature to genes. Genomic Annotation and Functional Modeling Workshop, Maxwell H. Gluck Equine Research Center, 15-16 November 2011.

Identifying functional literature.
GO browsers and databases display links to papers where these were used to make the annotation (direct, experimental evidence codes).
However, annotation lags publication, so how do we find functional papers?
- UniProtKB links
- Entrez Gene links & GeneRIFs
- PubMed & GOPubMed

Note: for many unreviewed or unannotated UniProt records, the references section is incomplete and is likely to contain references relating to structural annotation (no functional data).

PubMed link: links out to PubMed records. Note that the links are to PubMed records that have been curated to this gene, so they:
- may be incomplete
- may be incorrect (text mining, e.g. ovalbumin used as a reagent)

GeneRIFs: “Gene Reference Into Function”. References are linked to this page by researchers!
- simple, easy to use
- may be incomplete

Finding unlinked publications.
Targeted PubMed searches:
- Gene name(s), gene symbol, protein name(s)
- Limit by species (using species name OR MeSH terms)
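The targeted search described above can be sketched in Python. This is a minimal illustration, not part of the original workshop material: the function names are hypothetical, the ovalbumin names are taken from the examples in this talk, and chicken ("Gallus gallus", MeSH heading "Chickens") is an assumed species chosen only for illustration. It uses the standard PubMed field tags `[tiab]` (title/abstract) and `[mh]` (MeSH terms) and builds, but does not send, an NCBI E-utilities ESearch URL.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(names, species_name, species_mesh):
    """Combine gene/protein names with a species limit (hypothetical helper)."""
    # Any of the gene names, symbols or protein names, in title or abstract.
    name_clause = " OR ".join(f'"{n}"[tiab]' for n in names)
    # Species limit: free-text species name OR the MeSH heading.
    species_clause = f'"{species_name}"[tiab] OR "{species_mesh}"[mh]'
    return f"({name_clause}) AND ({species_clause})"


def esearch_url(query, retmax=20):
    """Return an ESearch URL for the query; nothing is fetched here."""
    params = {"db": "pubmed", "term": query, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


# Illustrative inputs only: names from the OVA example, species assumed.
query = build_pubmed_query(
    ["ovalbumin", "OVA", "SERPINB14"], "Gallus gallus", "Chickens"
)
print(query)
print(esearch_url(query))
```

The same query string can be pasted directly into the PubMed search box. Short symbols such as OVA are ambiguous, so the results still need to be checked by hand.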

Now what?

Functional Literature Triage
PubMed searches:
- Cellular component: e.g. “subcellular localization”
- Molecular function: e.g. activity, binding, assay, interaction
- Biological process: e.g. pathway, developmental process
Sequence-based papers: limited use for function.
Tissue expression: limited only to the active molecule (protein, noncoding RNA; not mRNA in situ hybridization).
Functional information is not always found in the abstract!
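The keyword triage above could be roughed out as a first-pass filter over titles and abstracts. The sketch below is an assumption-laden illustration rather than a curation tool: the keyword lists are just the example terms from this slide, the `triage` function is hypothetical, and the sample abstract is invented for the demonstration.

```python
import re

# Example search terms from the slide, grouped by GO aspect.
ASPECT_KEYWORDS = {
    "cellular_component": ["subcellular localization"],
    "molecular_function": ["activity", "binding", "assay", "interaction"],
    "biological_process": ["pathway", "developmental process"],
}


def triage(text):
    """Return the GO aspects whose example keywords occur in the text."""
    lowered = text.lower()
    hits = {}
    for aspect, keywords in ASPECT_KEYWORDS.items():
        # Match at a word start so plurals such as "interactions" still count.
        found = [k for k in keywords if re.search(r"\b" + re.escape(k), lowered)]
        if found:
            hits[aspect] = found
    return hits


# Invented abstract text, used only to show the output shape.
abstract = (
    "We measured the protease inhibitor activity of the protein "
    "and its subcellular localization in oviduct cells."
)
print(triage(abstract))
```

An empty result does not mean a paper lacks functional data, since functional information is not always in the abstract. A hit only flags a paper for reading.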

OVA used as a reagent (to test T cell development): only incidental to OVA function.

OVA mentioned as part of introduction to serpins. OVA = SERPINB14 but paper is about SERPINB11.

OVA regulated by steroid hormones – biological process? OVA secreted – cellular component?

GOPubMed (http://www.gopubmed.org): uses the PubMed database (note that the number of papers is the same for the same search). Incorporates MeSH and GO term text mining to try to identify the most relevant papers. Note: errors in text mining!

Use links on the side to identify papers relating to - specific organisms - techniques (e.g. proteomics) - GO term names