1 Pax Terminologica Barry Smith Institute for Formal Ontology and Medical Information Science, Saarland University, Saarbrücken.

Presentation transcript:

1 Pax Terminologica Barry Smith Institute for Formal Ontology and Medical Information Science, Saarland University, Saarbrücken

2 Overview: systems for semantic annotation; linguistics vs. science; semantic annotation in biomedical informatics; improving systems for semantic annotation; conclusions

3 The Penn Treebank Project annotates naturally occurring text for linguistic structure, producing skeletal parses showing syntactic and semantic information in tree form

4 Automatic Content Extraction Program (ACE) develops text corpora in English, Chinese and Arabic annotated for entities, the relations among them and the events in which they participate.

5 High Accuracy Retrieval from Documents (HARD) creates corpora and annotations including topics, metadata and relevance judgements

6 Annotation Graph Toolkit (AGTK): a formal framework for representing linguistic annotations of time series data.

7 TimeML: a robust specification language for markup of natural language to support: time stamping of events (identifying and anchoring them in time); ordering events with respect to one another; reasoning about persistence

8 SpaceML provides facilities for annotating: category attributions to spatial regions (self-connected, bounded, regular, etc.); ascription to regions of topological, distance, morphological and orientation relations; the definition of a region in terms of its boundary.

9 WordNet annotates English nouns, verbs, adjectives and adverbs to synonym sets, each representing one underlying lexical concept.

10 FrameNet documents the range of semantic and syntactic combinatory possibilities (valences) of each word in each of its senses

11 is there order in this chaos?

12 ISO/TC 37/SC 4 N 076: Ide, N., Romary, L., de la Clergerie, E. (2003). International Standard for a Linguistic Annotation Framework. HLT-NAACL 2003 (Edmonton)

13 OntoGloss (influenced by the ISO Linguistic Annotation Framework): an ontology-based annotation tool that uses predefined terms in an ontology to mark up a document. No standard portal for semantic annotation tools/projects (?)

14 Purposes of semantic annotation: information retrieval (incl. semantic indexing = answering queries that use words not used in the text, including words from other languages); automatic translation; disambiguation; topic extraction and text summarization; information integration; reasoning

15 for linguistics: fiction no less important than fact; English has no privileged status; regimentation not allowed; annotation frameworks may be competitive; cross-framework consistency is not important

16 for science: factual discourse alone important; English is language par excellence; regimentation is allowed; goal of truth: to create a single computer-processable map of reality; truth is one → must strive for consistency of annotations and additivity of annotation frameworks

17 for science: must end the terminology wars. Plant Ontology (PO): cell = def. structural and physiological unit of a plant. What should PO do when it needs to study bacteria in plants? Answer: all shall use the word ‘cell’ to mean the same thing! (all = in biology)

18 the ideal (of additivity): WordNet for single word forms; FrameNet for valencies/combination forms; SpaceNet for spatial structures; TimeNet for temporal structures; ChemNet for chemical structures; CellNet for cellular structures; etc.

19 a scientific problem: huge swarms of biomedical data at different granularities, from molecule to clinic; methods for data integration needed to enable reasoning across data at multiple granularities (genomic medicine...)

20 orthodox solutions to this problem: dumb statistical number-crunching; or: Semantic Web, Unified Medical Language System (UMLS), Moby, etc.: let a million flowers bloom and rely on mappings between already existing controlled vocabularies/annotation systems

21 an alternative solution: use the peer-reviewed biomedical literature, which contains both textual descriptions of biological functions (incl. diseases) and references to entities represented in the biochemical databases; use high-quality semantic annotations of the former to integrate across the latter → the Gene Ontology

24 The methodology of annotations: model organism databases employ scientific curators, who use the experimental observations reported in the biomedical literature to link gene products (such as proteins) with GO terms in annotations.

25 The process of annotation leads to improvements and extensions of the ontology, which in turn lead to better annotations: a virtuous cycle of improvement in the quality and reach of both future annotations and the ontology itself, yielding a slowly growing computer-interpretable map of biological reality within which major databases are automatically integrated in semantically searchable form
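
A minimal sketch of how shared ontology terms can integrate separate databases, as slides 24 and 25 describe. It is an illustration only, not the GO Consortium's actual data model: the `Annotation` record, the database names, the gene product names, the term labels and the reference strings are all made-up placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One curated link between a gene product and an ontology term."""
    database: str      # source database (hypothetical names below)
    gene_product: str  # identifier of the protein or other gene product
    term: str          # ontology term assigned by the curator
    evidence: str      # literature reference supporting the link


def index_by_term(annotations):
    """Group gene products from all databases under the terms they share."""
    index = defaultdict(set)
    for a in annotations:
        index[a.term].add((a.database, a.gene_product))
    return dict(index)


# Placeholder records: every name and identifier here is invented.
ANNOTATIONS = [
    Annotation("MouseDB", "geneA", "TERM:osteoblast_differentiation", "REF:placeholder-1"),
    Annotation("FishDB", "geneB", "TERM:osteoblast_differentiation", "REF:placeholder-2"),
    Annotation("FishDB", "geneC", "TERM:cell_adhesion", "REF:placeholder-3"),
]

INDEX = index_by_term(ANNOTATIONS)
```

Looking up a single term in `INDEX` returns gene products from both databases, which is the sense in which annotation to a common ontology makes independently curated databases searchable together.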

26 need to extend GO by means of other ontologies, e.g. Cell Ontology, via integrated definitions
id: CL:
name: osteoblast
def: "A bone-forming cell which secretes an extracellular matrix. Hydroxyapatite crystals are then deposited into the matrix to form bone." [MESH:A ]
is_a: CL:
relationship: develops_from CL:
relationship: develops_from CL:
GO + Cell type = New Definition
Osteoblast differentiation: Processes whereby an osteoprogenitor cell or a cranial neural crest cell acquires the specialized features of an osteoblast, a bone-forming cell which secretes extracellular matrix.
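
The composition on slide 26 can be sketched roughly as follows. This is an assumption-laden illustration, not the tooling actually used for GO: the stanza parser handles only simple `tag: value` lines, the `EX:` identifiers and the `NAMES` lookup table are hypothetical stand-ins (the real identifiers are not given in the slide), and the sentence template is inferred from the one example shown.

```python
def parse_stanza(text):
    """Parse simple 'tag: value' lines into a dict of tag -> list of values."""
    stanza = {}
    for line in text.strip().splitlines():
        tag, _, value = line.partition(":")
        stanza.setdefault(tag.strip(), []).append(value.strip())
    return stanza


def quoted_text(definition):
    """Return the text between the first pair of double quotes."""
    start = definition.index('"') + 1
    return definition[start:definition.index('"', start)]


def article(phrase):
    """Pick 'a' or 'an' from the first letter (good enough for this sketch)."""
    return "an" if phrase[0].lower() in "aeiou" else "a"


def differentiation_definition(stanza, names):
    """Compose a process definition from a cell type's name, def and precursors."""
    name = stanza["name"][0]
    first_sentence = quoted_text(stanza["def"][0]).split(". ")[0].rstrip(".")
    described = first_sentence[0].lower() + first_sentence[1:]
    precursors = [
        names[value.split()[1]]
        for value in stanza.get("relationship", [])
        if value.split()[0] == "develops_from"
    ]
    sources = " or ".join(f"{article(p)} {p}" for p in precursors)
    return (
        f"{name.capitalize()} differentiation: Processes whereby {sources} "
        f"acquires the specialized features of {article(name)} {name}, {described}."
    )


# Hypothetical identifiers; only the name and definition text come from the slide.
STANZA = parse_stanza('''
id: EX:0000001
name: osteoblast
def: "A bone-forming cell which secretes an extracellular matrix. Hydroxyapatite crystals are then deposited into the matrix to form bone." [EX:xref]
relationship: develops_from EX:0000002
relationship: develops_from EX:0000003
''')
NAMES = {"EX:0000002": "osteoprogenitor cell", "EX:0000003": "cranial neural crest cell"}

DEFINITION = differentiation_definition(STANZA, NAMES)
```

The point of the design is that the process definition is derived from the cell type's own definition rather than written independently, so the two ontologies cannot drift apart on what an osteoblast is.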

27 need to extend GO also to semantic annotation of clinical literature; unfortunately, available (UMLS) clinical vocabularies are of variable quality and low mutual consistency

28 → need for prospective standards to assure consistency and high quality: create rules for high-quality controlled vocabularies for the annotation of scientific literature; make everyone follow these rules; regimentation!

29 first step: a shared portal for (so far) 58 ontologies (low regimentation)

31 Second step: The OBO Foundry

32 The OBO Foundry: scientific standards and principles-based coordination of systems for semantic annotation of biomedical literature, to create a single interoperable family of gold standard reference ontologies

33 A subset of OBO ontologies whose developers have agreed in advance to accept a common set of principles designed to ensure: formal robustness; stability; compatibility; interoperability; support for logic-based reasoning

34 Custodians: Michael Ashburner (Cambridge); Suzanna Lewis (Berkeley); Barry Smith (Buffalo/Saarbrücken)

35 A prospective standard designed to guarantee interoperability of ontologies from the very start. Established March 2006; already 13 OBO ontologies have joined the Foundry and are being correspondingly reformed; three new ontologies are being constructed ab initio in its terms

36 Initial Candidate Members: GO Gene Ontology; CL Cell Ontology; SO Sequence Ontology; ChEBI Chemical Entities of Biological Interest; PATO Phenotype (Quality) Ontology; FuGO Functional Genomics Investigation Ontology; FMA Foundational Model of Anatomy; RO Relation Ontology; CARO Common Anatomy Reference Ontology; PrO Protein Ontology; RnaO RNA Ontology

37 Under development: Disease Ontology; Mammalian Phenotype Ontology; OBO-UBO / Ontology of Biomedical Reality; Organism (Species) Ontology; Plant Trait Ontology; Environment Ontology; Behavior Ontology; Biomedical Image Ontology; Clinical Trial Ontology

39 CRITERIA: The ontology is open and available to be used by all. The ontology is in, or can be instantiated in, a common formal language. The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap.

40 CRITERIA: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement. They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single controlled vocabulary.

41 CRITERIA: The ontology possesses a unique identifier space within OBO. The ontology provider has procedures for identifying distinct successive versions. The ontology includes textual definitions for all terms.

42 CRITERIA: The ontology has a clearly specified and clearly delineated content. The ontology is well-documented. The ontology has a plurality of independent users.

43 CRITERIA: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology (Genome Biology 2005, 6:R46).
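
Two of the criteria above (a unique identifier space and textual definitions for all terms) are mechanical enough to check automatically. The following is a rough sketch under stated assumptions, not an official Foundry validator: the `check_ontology` function, the dictionary layout and the `EX:`/`OTHER:` identifiers are all hypothetical.

```python
def check_ontology(prefix, terms):
    """Return a list of problems; an empty list means both checks pass.

    prefix: the identifier space the ontology claims, e.g. "EX"
    terms:  dict mapping term id -> dict with at least a "def" entry
    """
    problems = []
    for term_id, term in terms.items():
        # Criterion: every term lives in the ontology's own identifier space.
        if not term_id.startswith(prefix + ":"):
            problems.append(f"{term_id}: outside identifier space {prefix}")
        # Criterion: every term carries a non-empty textual definition.
        if not term.get("def", "").strip():
            problems.append(f"{term_id}: missing textual definition")
    return problems


# Invented example terms; only the osteoblast wording echoes slide 26.
TERMS = {
    "EX:0000001": {"name": "osteoblast",
                   "def": "A bone-forming cell which secretes an extracellular matrix."},
    "EX:0000002": {"name": "osteoprogenitor cell", "def": ""},
    "OTHER:0000001": {"name": "stray term",
                      "def": "A term filed under the wrong prefix."},
}

PROBLEMS = check_ontology("EX", TERMS)
```

The remaining criteria (openness, collaboration, a plurality of users) concern the behaviour of developers and communities and are not captured by a check of this kind.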

44 OBO Relation Ontology. Foundational: is_a, part_of. Spatial: located_in, contained_in, adjacent_to. Temporal: transformation_of, derives_from, preceded_by. Participation: has_participant, has_agent.
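
One reason for fixing a small set of formally defined relations is that software can then chain assertions. The sketch below illustrates this with the two foundational relations, treating both is_a and part_of as transitive. The `closure` function and the example edges are hypothetical and chosen only for illustration; they are not taken from any Foundry ontology.

```python
def closure(edges, relation):
    """Return all (a, c) pairs reachable by chaining edges of one relation."""
    direct = {}
    for a, rel, b in edges:
        if rel == relation:
            direct.setdefault(a, set()).add(b)

    result = set()
    for start in direct:
        stack = list(direct[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            result.add((start, node))
            # Follow further edges of the same relation from this node.
            stack.extend(direct.get(node, ()))
    return result


# Illustrative assertions only.
EDGES = [
    ("osteoblast", "is_a", "bone cell"),
    ("bone cell", "is_a", "cell"),
    ("nucleolus", "part_of", "nucleus"),
    ("nucleus", "part_of", "cell"),
]

IS_A = closure(EDGES, "is_a")
PART_OF = closure(EDGES, "part_of")
```

Here `IS_A` contains the inferred pair ("osteoblast", "cell") and `PART_OF` the inferred pair ("nucleolus", "cell"), although neither was asserted directly. Keeping the two relations in separate closures avoids conflating "is a kind of" with "is a part of".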

45 analogy with FrameNet: the constituent ontologies in the OBO Foundry are focused overwhelmingly on single nouns; the OBO Relation Ontology is designed to ensure a common structure of relations shared by all Foundry ontologies (comparable to SpaceML, TimeML...); need something like (Bio)FrameNet to pull the different levels of granularity together

46 CRITERIA: Further criteria will be added over time in order to bring about a gradual improvement in the quality of the ontologies in the Foundry

47 GOALS: semantic alignment of OBO Foundry ontologies through a common system of formally defined relations, to enable reasoning both within and across ontologies, and thus also within and between the literature annotated in its terms, and thus also to support reasoning across associated data

48 GOALS: to promote re-usability of data: if data-schemas are formulated using a single well-integrated framework for semantic annotation in widespread use, then this data will to this degree itself become more widely accessible and usable

49 GOALS: to help in creating better mappings, e.g. between human and model organism phenotypes: S Zhang, O Bodenreider, “Alignment of Multiple Ontologies of Anatomy: Deriving Indirect Mappings from Direct Mappings to a Reference Ontology”, AMIA 2005

50 GOALS: to introduce the scientific method into the development of semantic annotation frameworks; to introduce some of the features of scientific peer review into biomedical ontology development

51 GOALS: to aid literature search: to subvert the current policy of ad hoc creation of new annotation schemas by each clinical research group by providing a common shared framework

52 GOALS: to use the Foundry ontologies as a benchmark for improving existing terminologies; to create controlled vocabularies for semantic annotation of clinical trial records, scientific journal articles,...

53 GOALS: to create an evolving map-like computable representation of the entire domain of biomedical reality; to create the conditions for a step-by-step evolution towards high quality ontologies in the biomedical domain which will serve as stable attractors for clinical and biomedical researchers in the future

54 GOALS: to end the terminology wars; and to advance regimentation of clinical and other vocabularies in a scientific spirit

55 Conclusion 1: existing linguistic resources for semantic annotation are scattered to the four winds → need for something like the OBO Library to ensure that the different available tools are available for comparison and alignment

56 Conclusion 2: linguists developing tools for semantic annotation with scientific purposes need something like the Foundry to ensure a complete set of interoperable tools which allow for additivity of annotations

57 the ideal: BioWordNet for single word forms; SpaceNet for spatial structures; TimeNet for temporal structures; ChemNet for chemical structures; CellNet for cellular structures; BioFrameNet for valencies/combination forms
