EBI is an Outstation of the European Molecular Biology Laboratory. Anatomy ontology ArrayExpress Helen Parkinson, PhD

Presentation transcript:

EBI is an Outstation of the European Molecular Biology Laboratory. Anatomy ontology ArrayExpress Helen Parkinson, PhD

Content ArrayExpress use cases Fuzzy matching of ontology terms Data driven ontology building Wish list

ArrayExpress: Overview Submit Hybs Experiment queries Public/Private ATLAS Summarize Public Only Re-annotate Gene queries Genes Cross expt/ species queries

Fuzzy matching of ontology terms – why? Clean up ArrayExpress OE and synonym tables OE based integration Constrain OEs on data entry/validation Improved searches in repository/DW web interface Data integration across species, experiments and experimental designs Automated mapping of free text to ontology terms for data import

Phonetic Matching Precompute phonetic encodings of all terms in the ontology Match each target term by comparing these encodings Soundex: Robert Russell and Margaret Odell (1918), famously described by Donald Knuth Double Metaphone: Lawrence Philips (2000) Metaphone: Lawrence Philips Most matches are single Highest success rate
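The two steps on this slide (precompute encodings, then compare them) can be sketched as follows. This is a minimal illustration using a plain American Soundex, not the ArrayExpress implementation; the function names and the sample terms are assumptions made for the example, and multi-word terms are encoded word by word as a simplification.

```python
from collections import defaultdict

# Standard Soundex consonant groups; vowels, y, h and w carry no digit.
_CODES = {}
for _digit, _letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                         "4": "l", "5": "mn", "6": "r"}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Return the 4-character Soundex code of a single word ('' if no letters)."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (encoded + "000")[:4]


def phonetic_key(term: str) -> str:
    """Encode a (possibly multi-word) term as space-joined Soundex codes."""
    codes = (soundex(w) for w in term.split())
    return " ".join(c for c in codes if c)


def build_index(ontology_terms):
    """Precompute the phonetic key of every ontology term (hypothetical helper)."""
    index = defaultdict(list)
    for term in ontology_terms:
        index[phonetic_key(term)].append(term)
    return index


def match(target: str, index):
    """Return all ontology terms whose phonetic key equals the target's."""
    return index.get(phonetic_key(target), [])


# Illustrative terms only, not taken from any specific ontology release.
index = build_index(["hypothalamus", "cerebral cortex", "liver"])
candidates = match("hypothalmus", index)  # misspelled free-text annotation
```

Because the index is built once, each incoming annotation costs one encoding and one dictionary lookup. A key that maps to several terms would need a second ranking step, which this sketch leaves out.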

Algorithm comparisons

Percent matches using automated mapping

Failures to match Species (or Kingdom)-specific terms (e.g. plant anatomy) Conflated terms (e.g. diseased cell types) Compound terms (e.g. "cerebral cortex and hypothalamus") Genuinely missing terms Esoteric terms less of a priority Most trivial misspellings, however, were matched Dirty input data

Implications Need more terms in some commonly-used ontologies Synonyms are important: they generate less noise and give better coverage Choice of ontology can limit expressivity - this will be frustrating to biologists
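The point about synonyms and coverage can be illustrated with a small sketch that measures the percentage of free-text annotations matched with and without synonyms. The term identifiers, labels, synonyms and annotations below are invented for the example, and the helper names are assumptions rather than part of any ArrayExpress tool.

```python
def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivially different strings compare equal."""
    return " ".join(text.lower().split())


def build_lookup(terms, use_synonyms=True):
    """Map every normalised label (and optionally synonym) to its term identifier."""
    lookup = {}
    for term_id, (label, synonyms) in terms.items():
        names = (label, *synonyms) if use_synonyms else (label,)
        for name in names:
            lookup[normalise(name)] = term_id
    return lookup


def coverage(annotations, lookup) -> float:
    """Percentage of annotations that resolve to some ontology term."""
    if not annotations:
        return 0.0
    matched = sum(1 for a in annotations if normalise(a) in lookup)
    return 100.0 * matched / len(annotations)


# Hypothetical ontology fragment: id -> (label, synonyms).
terms = {
    "EX:0001": ("kidney", ["renal tissue"]),
    "EX:0002": ("liver", ["hepatic tissue"]),
}
# Hypothetical free-text annotations as a submitter might type them.
annotations = ["Kidney", "hepatic  tissue", "leaf"]

labels_only = coverage(annotations, build_lookup(terms, use_synonyms=False))
with_synonyms = coverage(annotations, build_lookup(terms, use_synonyms=True))
```

In this toy data the unmatched annotation ("leaf") plays the role of a species-specific term missing from the chosen ontology. Synonyms cannot recover that kind of failure; only adding terms or changing ontology can.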

Why? Clean up ArrayExpress OE and synonym tables Add accessions/DB links to these tables Constrain OEs on data entry/validation Improved searches in repository/DW web interface Generate suggestions for new OE terms Evaluate domain coverage by a given ontology

ArrayExpress Ontology Development and Future Directions Developing the Ontology Define Scope: ArrayExpress already has some useful structure given the current database plus rich source of use cases and competency questions. Build: Ontology Capture: Identify key concepts and relationships within our domain and give explicit definitions to these features: Middle-out approach – specify core of basic terms then specialise and generalise as required Mappings – text mining approach to do initial semi-automated mappings to external resources for rapid coverage Manual mapping for data warehouse data, and selected data sets

ArrayExpress Ontology Development and Future Directions Capture to Code: Definitions and Hierarchy

ArrayExpress Ontology Development and Future Directions Semantic Roadmap Position of the ArrayExpress Experimental Factor Ontology in the ‘bigger picture’ AE Ontology Disease Ontology Common Anatomy Reference Ontology Cell Type Ontology Chemical Entities of Biological Interest (ChEBI) NCI Various Species Anatomy Ontologies Key is orthogonal coverage, reuse of existing resources and shared frameworks

Wish list NOT to build our own anatomy ontology CARO extension CARO evaluation Mapping CARO to relevant multi-species ontologies Application of CARO to ArrayExpress data Use of CARO in ArrayExpress tools

Acknowledgments Anna Farne Ele Holloway James Malone Margus Lukk ArrayExpress Production Team Helen Parkinson Tim Rayner Faisal Rezwan Eleanor Williams Mengyao Zhao Holly Zheng