Semantic Similarity over the Gene Ontology: Family Correlation and Selecting Disjunctive Ancestors. F. M. Couto, M. J. Silva, P. M. Coutinho

Presentation transcript:

Semantic Similarity over the Gene Ontology: Family Correlation and Selecting Disjunctive Ancestors
F. M. Couto, M. J. Silva, P. M. Coutinho
LASIGE - XLDB
CIKM 2005

Family Correlation
Family similarity is also a structural similarity
– Like sequence similarity, but it overcomes some of the limitations of sequence similarity, since it is based on experimental results about protein domains
Validate GO semantic similarity with family similarity
Pfam similarity
– Pfam is a database of protein families
– Pfam families are assigned to UniProt proteins
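The validation idea on this slide could be sketched as follows. This is only an illustration under stated assumptions: the slide does not say how family similarity is computed, so the Jaccard overlap in `family_similarity` is a stand-in, the Pfam accessions are used purely as placeholders, and the semantic similarity scores are made-up numbers. The sketch scores each protein pair both ways and measures how well the two scores agree with a Pearson correlation.

```python
import math


def family_similarity(families_a, families_b):
    """Jaccard overlap of two sets of Pfam families (an assumed definition)."""
    union = families_a | families_b
    return len(families_a & families_b) / len(union) if union else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical protein pairs: (families of protein 1, families of protein 2,
# GO semantic similarity of the pair). All values are illustrative only.
pairs = [
    ({"PF00001", "PF00002"}, {"PF00001", "PF00002"}, 0.9),
    ({"PF00001", "PF00002"}, {"PF00001"}, 0.6),
    ({"PF00001"}, {"PF00003"}, 0.1),
]

family_scores = [family_similarity(a, b) for a, b, _ in pairs]
semantic_scores = [score for _, _, score in pairs]
r = pearson(family_scores, semantic_scores)
print(f"correlation between family and semantic similarity: {r:.3f}")
```

A semantic similarity measure whose scores correlate more strongly with family similarity would rank higher under this kind of validation.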

GraSM (Graph-based Similarity Measure)
Motivation:
– Semantic similarity measures treat the Gene Ontology as a hierarchical tree, although a GO term can have more than one parent
Approach:
– Select multiple ancestors of two GO terms
– Each selected ancestor describes a different interpretation
Results:

Disjunctive Ancestors
(t1,t2) ∈ DisjAnc(t5)
(t0,t1) ∉ DisjAnc(t5)
Since (t1,t2) ∈ DisjAnc(t5), then CommonDisjAnc(t4,t5) = {t2,t1}
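A minimal sketch of how disjunctive ancestors might be selected is given below. It rests on several assumptions that the slide text does not confirm: the toy graph in `PARENTS` and the probabilities in `PROB` are hypothetical (the slide's figure is not available here), a pair of ancestors is taken to be disjunctive for a term when each has a path to the term that avoids the other, and a common ancestor is kept only if it is disjunctive with every other common ancestor that is at least as informative. The final score averages the information content of the kept ancestors, and `mica_share` is shown for comparison as the variant that uses only the most informative common ancestor.

```python
import math

# Hypothetical toy DAG (child -> parents); t5 has two parents.
PARENTS = {
    "t0": [],
    "t1": ["t0"],
    "t2": ["t1"],
    "t3": ["t1"],
    "t4": ["t2"],
    "t5": ["t2", "t3"],
}

# Hypothetical term probabilities; information content is IC(t) = -log p(t).
PROB = {"t0": 1.0, "t1": 0.8, "t2": 0.4, "t3": 0.5, "t4": 0.1, "t5": 0.2}
IC = {term: -math.log(p) for term, p in PROB.items()}


def paths(a, c):
    """All paths between ancestor a and term c, each as a list of terms."""
    if a == c:
        return [[c]]
    return [[c] + rest for parent in PARENTS[c] for rest in paths(a, parent)]


def ancestors(c):
    """Proper ancestors of c (c itself is excluded)."""
    found = set()
    for parent in PARENTS[c]:
        found |= {parent} | ancestors(parent)
    return found


def is_disjunctive(a1, a2, c):
    """True if some path a1..c avoids a2 and some path a2..c avoids a1."""
    return (any(a2 not in p for p in paths(a1, c))
            and any(a1 not in p for p in paths(a2, c)))


def common_disj_anc(c1, c2):
    """Common ancestors that are disjunctive with every one at least as informative."""
    common = ancestors(c1) & ancestors(c2)
    selected = set()
    for a1 in common:
        rivals = [a2 for a2 in common if a2 != a1 and IC[a2] >= IC[a1]]
        if all(is_disjunctive(a1, a2, c1) or is_disjunctive(a1, a2, c2)
               for a2 in rivals):
            selected.add(a1)
    return selected


def share_grasm(c1, c2):
    """Average IC of the common disjunctive ancestors."""
    selected = common_disj_anc(c1, c2)
    return sum(IC[a] for a in selected) / len(selected)


def mica_share(c1, c2):
    """IC of the single most informative common ancestor, for comparison."""
    return max(IC[a] for a in ancestors(c1) & ancestors(c2))


print(sorted(common_disj_anc("t4", "t5")))
print(share_grasm("t4", "t5"), mica_share("t4", "t5"))
```

In this toy graph t1 reaches t5 through t3 without passing through t2, so t1 and t2 are both kept, while every path from t0 to t5 passes through t1, so t0 is dropped.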

[Figure: results for the three GO aspects: Biological Process, Cellular Component, Molecular Function]

Conclusions
Family correlation provided a ranking of the measures that is:
– uniform over all the different aspects of GO
– consistent with previous studies using different corpora
We have demonstrated the higher effectiveness of GraSM for calculating semantic similarities between GO terms
– Higher correlation using disjunctive common ancestors than using only the most informative common ancestor
All the measures mentioned in this document were implemented in FuSSiMeG (Functional Semantic Similarity Measure between Gene-Products)
– FuSSiMeG is available on the Web at: