

BioOntologies SIG, ISMB/ECCB 2007
Daniel Schober, EMBL-EBI

Slide 1: Naming conventions for ontology engineering
Daniel Schober, PhD
The European Bioinformatics Institute (EBI)
NET Project – Postdoctoral Ontologist

Slide 2: Collaborative Efforts – Scenario
Metabolomics Standards Initiative (MSI)
– Describe metabolomics laboratory workflows
    Minimal requirements, augmenting exchange formats
– Ontology working group under OBI…
Ontology for Biomedical Investigations (OBI)
– Larger collaborative, multi-domain effort
    Brings together various omics and biomedical communities
– Describe general laboratory workflow
    Experimental design, protocols, data analysis etc.
– Developed under OBO Foundry…
Open Biomedical Ontologies (OBO) Foundry
– Provides best practices for ontology engineering
– Creates a complete suite of orthogonal and interoperable ontologies
    Over 60 ontologies and ~10 core foundry

Slide 3: Collaborative Efforts – Challenges
Create networked orthogonal ontologies
– Integrating MSI ontology with OBI
– Integrating OBI with BFO and other OBO Foundry ontologies, e.g. PATO (qualities), ChEBI (chemicals), …
Integrate modular developments
– Parallel branch development
– OWL-import, referencing
Improve the communication among developers
– Database developers and biologists
– Semantic web and text miners
-> We need common naming conventions
– To harmonize the appearance and design of modules

Slide 4: Common Naming Conventions – Why?
Representational artefacts built according to different
– Engineering methodologies
    METHONTOLOGY, TOVE, Enterprise, …
– Engineering tools
    Protégé, OBO-Edit, OntoEdit, …
– Representation languages and semantics
    OBO, OWL and CLIPS-Frames, …
– Engineering schools and philosophies
    GO, Semantic Web, AI (Protégé Frames), …
    Manchester, Saarbruecken, Stanford, Trento, Karlsruhe, …
    Realists, Conceptualists, …
The naming conventions applied are as diverse as these backgrounds!
– Diverse ad hoc ways to name what is represented

Slide 5
– Separator: space vs. underscore vs. nil
– Case: UpperCamelCase vs. underscore
– Namespace prefix
– Acronyms
– Synonyms
– Administrative helper classes
– Compound name
– Singular vs. plural, xref
– Instance convention
– ID convention: uppercase prefix, underscore, number vs. lowercase prefix, colon, string, or no name, just ID string
– Omissions

Slide 6: Existing Naming Conventions – Status
Semantic Web best practices and deployment group web
– Format specific: OWL
– Limited visibility: information dispersed and embedded into many documents
BioPAX manual
– Limited visibility: naming conventions only implicitly dealt with in general documentation
– Implementation specific: naming conventions discussed at implementation level (Protégé/OWL)
– Limited coverage: IDs addressed marginally (page 53, Technical Notes RDF:ID), no conventions on relations
GO developers style guide
– Format specific: mainly OBO; has its own definition of namespace, which differs from the one in OWL/Semantic Web
– Limited visibility: naming conventions dispersed throughout websites, e.g. GO namespace, term names and identifiers are explained in different documents

Slide 7: Existing Naming Conventions – Status
ISO standards
– Information overflow: about 40 documents that contain closely related guidelines
– Limited access: commercial
ANSI/ISO Z
– Semantics specific: controlled vocabulary, e.g. about terms, not classes
– Limited coverage: no term ID handling or versioning addressed
Law and order - Assessing and enforcing compliance with ontological modeling principles in the Foundational Model of Anatomy (FMA), S Zhang, O Bodenreider, Computers in Biology and Medicine 36 (2006)
– Scientific domain dependent: anatomy
– Hardly visible: paper access
Acceptance and visibility are limited to the specific target community
We need universally applicable conventions

Slide 8: Our Goals
Overcome diversity and fragmentation
– Collect existing naming conventions
    Make them accessible via repository
– Review and compare
Create a single common document
– Distil universally valid aspects for OWL and OBO
– Ensure visibility for target domains
– Move towards a common resource for the OBO Foundry groups
Provide best practice guidelines
– Provide robust names for ontology classes
– Not a knowledge representation language for names, as e.g. HUGO provides for gene symbols (awg Tg(GBtslenv)832Pkw)
Engage in discussion with other groups
– A two-phase approach …

Slide 9: Towards Common Naming Conventions
Phase 1: Straw man document
– Working towards naming conventions for use in controlled vocabulary and ontology engineering
    See Bio-Ontologies SIG Proceedings, p
– Created for MSI Ontology WG, targeting the larger OBI group
– Implementation and format independent
Phase 2: Survey OBO Foundry groups
– Questionnaire (work in progress)
    Ontology and engineering process
    Current practice in naming entities
    Envisioned benefits of common conventions
    In-depth questions on particular conventions
– Results to be posted under OBO Foundry wiki

Slide 10: Naming Convention Straw Man - Examples
Explicit and concise names
– Avoid omissions and ellipses
    Plant Ontology (PO) used 'cell' for 'plant cell'
– Avoid negative names like non-separation device
– Avoid ambiguous words
    30 meanings of set; e.g. plurality protocol set or action parameter set
– Brand name convention: use [company name+brand name+superclass]
    US 2 becomes Bruker US 2 NMR magnet
To ensure shared understanding of intended meaning
Typographical issues
– Use lowercase as in natural language
    most flexible, e.g. pH, DNA_hybridisation (no acronym border problems)
– Avoid punctuation, sub/superscripts
– Resolve special characters consistently, e.g. α -> alpha
To ensure readability, reduce diversity in appearance
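
Several of the rules on this slide (negative names, punctuation, special characters) could be checked automatically. The following is a minimal Python sketch, not part of the straw man document: the function names check_name and resolve_special_characters, the rule patterns, the small Greek-letter table and the test name α_helix are all illustrative assumptions. Rules that need domain knowledge (omissions, ambiguous words, brand names) are left out.

```python
import re

# Illustrative assumptions, not taken from the straw man document:
# which prefixes count as "negative" and which characters count as punctuation.
NEGATIVE_PREFIX = re.compile(r"^(non|not)[-_ ]", re.IGNORECASE)
PUNCTUATION = re.compile(r"[^\w ]")  # anything except letters, digits, underscore, space
SPELLED_OUT = {"α": "alpha", "β": "beta", "γ": "gamma"}  # hypothetical mapping table


def check_name(name: str) -> list[str]:
    """Return the convention issues found in a candidate class name."""
    issues = []
    if NEGATIVE_PREFIX.match(name):
        issues.append("negative name")
    if PUNCTUATION.search(name):
        issues.append("punctuation")
    if any(ch in SPELLED_OUT for ch in name):
        issues.append("special character")
    return issues


def resolve_special_characters(name: str) -> str:
    """Spell out special characters the same way every time."""
    return "".join(SPELLED_OUT.get(ch, ch) for ch in name)


for candidate in ["non-separation device", "DNA_hybridisation", "pH", "α_helix"]:
    print(candidate, check_name(candidate))
```

Names from the slide such as pH and DNA_hybridisation pass these checks unchanged, while non-separation device is flagged both as a negative name and for its hyphen.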

Slide 11: Naming Convention Straw Man - Examples
Lexical issues
– Reuse words and avoid synonyms within compound names
    x_part_of_process, y_part_of_process and z_part_of_process instead of x_component_of_process, y_portion_of_process, z_part_of_process
    To decrease the learning and search burden on the user side, to ease text mining by reducing string variability
– Use underscore or space separator (instead of CamelCase)
    prevents distortions like CapNMRProbe and pHValue, yet allows brand names like SampleJet
    To ease text mining and readability (demarcated word borders)
– Use singular nominal word form
    Avoid inconsistencies like biphenyl (CHEBI:17097) under an IUPAC-required biphenyls (CHEBI:22888)
    To harmonize appearance, to avoid redundancy, to ease ontology cross-referencing and import
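
The word-border argument against CamelCase can be made concrete with a short Python sketch. It is an illustration only: the splitter is deliberately naive (it breaks wherever a lowercase letter is followed by an uppercase one), and the function names and the underscore variants cap_NMR_probe and pH_value are assumed for the example rather than taken from any ontology.

```python
import re

# Zero-width position between a lowercase letter and a following uppercase letter.
_LOWER_UPPER = re.compile(r"(?<=[a-z])(?=[A-Z])")


def split_camel_case(name: str) -> list[str]:
    """Naive CamelCase tokenizer, as a text-mining tool might apply it."""
    return _LOWER_UPPER.sub(" ", name).split(" ")


def split_separated(name: str) -> list[str]:
    """Tokenizer for names that use an explicit underscore or space separator."""
    return re.split(r"[_ ]", name)


# CamelCase: acronym and word borders are lost or misplaced.
print(split_camel_case("CapNMRProbe"))   # ['Cap', 'NMRProbe']
print(split_camel_case("pHValue"))       # ['p', 'HValue']

# Explicit separators: borders are unambiguous, and a brand name stays whole.
print(split_separated("cap_NMR_probe"))  # ['cap', 'NMR', 'probe']
print(split_separated("pH_value"))       # ['pH', 'value']
print(split_separated("SampleJet"))      # ['SampleJet']
```

A cleverer splitter can recover NMR from CapNMRProbe, but it still cannot know that pH is one word, which is the case for marking word borders explicitly.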

Slide 12: Common Naming Convention – Open Issues
Syntactic issues
– Qualifier order: put the qualifier term before the part being qualified?
    NMR_instrument in place of instrument_for_NMR
– Helper strings in class names: establish general ones?
    E.g. sensu postfix in GO to indicate species specificity, fruiting body development (sensu Bacteria) (GO: )
Semantic issues
– Administrative helper classes: how to name these metadata bins?
    unclassified (OBI_200067), ChEBI_objects (OBI_336), toBeDiscussed, _collected_relations
– Identifiers and namespace: are conventions useful?
    OBI uses [group prefix+underscore+unique number], e.g. OBI_334
    BFO uses [meaningful string], e.g. IndependentContinuant
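
The identifier conventions mentioned in the slides can be told apart mechanically, which is one reason a shared convention helps tools. The Python sketch below is a hedged illustration: the ParsedId type, the convention labels and the regular expressions are assumptions made for this example and are not an official OBO, OBI or BFO specification.

```python
import re
from typing import NamedTuple, Optional


class ParsedId(NamedTuple):
    convention: str          # label chosen for this sketch
    prefix: Optional[str]    # group or namespace prefix, if any
    local: str               # number or string that follows the prefix


# Assumed patterns for the shapes seen in the slides.
PREFIX_UNDERSCORE_NUMBER = re.compile(r"([A-Za-z]+)_(\d+)")   # e.g. OBI_334
PREFIX_COLON_LOCAL = re.compile(r"([A-Za-z]+):(\S+)")         # e.g. CHEBI:17097


def parse_id(identifier: str) -> ParsedId:
    """Classify an identifier by the convention it appears to follow."""
    match = PREFIX_UNDERSCORE_NUMBER.fullmatch(identifier)
    if match:
        return ParsedId("prefix_underscore_number", match.group(1), match.group(2))
    match = PREFIX_COLON_LOCAL.fullmatch(identifier)
    if match:
        return ParsedId("prefix_colon_local", match.group(1), match.group(2))
    # Anything else is treated as a meaningful string used directly as the ID.
    return ParsedId("meaningful_string", None, identifier)


for identifier in ["OBI_334", "CHEBI:17097", "IndependentContinuant"]:
    print(parse_id(identifier))
```

Note that a helper class such as ChEBI_objects falls into the meaningful_string group here, because its part after the underscore is not a number.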

Slide 13: Common Naming Convention - Benefits
Communication has improved …
– In geographically distributed, collaborative efforts
– Between developers from different domains and backgrounds
Appearance of what we represent has been normalized
– Not just a matter of aesthetics
– Manoeuvring within the hierarchy became faster
… we further envision …
Facilitated access to ontologies through meta-tools
– Reducing the diversity that ontology libraries and tools have to cope with, e.g. OLS, BioPortal, PROMPT and text mining tools
Facilitating ontology integration and cross-referencing
– Comparison, alignment (OWL-import) and mapping
Serving as guideline for new communities

Slide 14: Acknowledgements and Resources
Authors and those contributing to the discussion
– Susanna-Assunta Sansone, Philippe Rocca-Serra, Suzi Lewis, Waclaw Kusnierczyk, Barry Smith, Chris Mungall, Jane Lomax, Robert Stevens, Frank Gibson, Luisa Montecchi-Palazzi, Dietrich Rebholz
Members of MSI, PSI, OBI groups and OBO Foundry coordinators
Further info
– Working towards naming conventions for use in controlled vocabulary and ontology engineering, Bio-Ontologies SIG Proceedings, p
Funding sources (supporting my work)
– UK BBSRC e-Science BB/D524283/1 and BB/E025080/1
– Semantic Mining NoE (visits to IFOMIS and Manchester)