Why do we need upper ontologies? What are their purported benefits?
Barry Smith and Alan Ruttenberg
University at Buffalo
IAOA Summer Institute on Upper Ontologies
Toronto, August 9, 2017
Why do we need upper-level ontologies? Is the distinction between continuant and occurrent just a security blanket for philosophers? Poisoning
Wed Session 1: Why do we need upper ontologies?
1. Why do we need top-level ontologies?
2. Why do we need mid-level ontologies (for space, time, information, …)?
Q: Where do we draw the line between the two?
Why do we need top-level ontologies?
Why do we need a good top-level ontology?
Q: How do we pick out the good one(s)?
A: Aggressive testing in real-world contexts
What does aggressive testing in real-world contexts tell us about what a good TLO is useful for?
The problem facing model organism researchers with the completion of the Human Genome Project
Human, mouse, rat, fly, fish, yeast, …
Different vocabularies for each model organism
What do you call (human:) cleft palate in zebrafish?
One approach: pair off laterally, create mappings between human and mouse, between human and rat, between human and fly, …
GO approach: move up one level, create a vocabulary for talking about attributes of gene products that is species-neutral (future-proof because the approach will still apply as data pertaining to new model organisms need to be incorporated)
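The difference between the two approaches can be made concrete by counting the mappings each one needs. The following is a minimal sketch, assuming one mapping per unordered pair of vocabularies for the lateral approach and one alignment per vocabulary for the move-up-one-level approach. The function names and the hub label are illustrative and do not come from the talk.

```python
from itertools import combinations


def lateral_mappings(vocabularies):
    """Every unordered pair of vocabularies needs its own mapping."""
    return list(combinations(vocabularies, 2))


def hub_mappings(vocabularies, hub="species-neutral vocabulary"):
    """Each vocabulary is aligned once with the shared upper level."""
    return [(v, hub) for v in vocabularies]


organisms = ["human", "mouse", "rat", "fly", "fish", "yeast"]
print(len(lateral_mappings(organisms)))  # 15
print(len(hub_mappings(organisms)))      # 6
```

Under these assumptions the lateral count grows quadratically with the number of vocabularies, while the hub count grows by one for each new model organism.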
What does aggressive testing in real-world contexts tell us about what a good TLO is useful for?
Original OBO (Open Biomedical Ontologies) Foundry (Gene Ontology in yellow)
Columns: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT). Rows: GRANULARITY.
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL); Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function
  Occurrent: Molecular Process
The problem facing model organism researchers with the introduction of the GO?
Gene product attributes = cellular component, molecular function, biological process
What about diseases, anatomy, proteins, cell types, …?
Different vocabularies for each life science domain
One approach: pair off laterally, create mappings between molecular function and diabetes, molecular function and influenza, molecular function and cancer, dendritic cell and hip replacement surgery, …
TLO approach: move up one level, create a vocabulary for talking about the phenomena of the life sciences that is domain-neutral (future-proof because the approach will still apply as data pertaining to new sorts of entities need to be incorporated)
Wed Session 1: Why do we need upper ontologies?
1. Why do we need top-level ontologies?
2. Why do we need mid-level ontologies (for space, time, information, …)?
Q: Where do we draw the line between the two?
A: TLOs are domain-neutral
Build new ontologies to conform to the upper-level architecture of BFO
Columns: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT). Rows: GRANULARITY.
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL); Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function
  Occurrent: Molecular Process
BFO provides a common starting point for definitions + an evolving common set of best practices
What you get
Safer division of labor – clarifies what sorts of ontologies (of processes, of objects, of qualities, …) are needed and how they relate to the ontologies we already have
Safer distribution of labor – there are benefits to restricting choices to only what is made available in a well worked-out scheme
Audit trail to reality – require all classes to be defined in terms of lower-level classes in such a way that we always know what the instances are that we need to check to test if an assertion is true
Simple examples
1. Old chemistry ontologies: oxide and aluminium oxide are both instances, but then how can you assert the relation between them? New chemistry ontology: they are both classes.
2. NeuronDB: a database of proteins classified into functions (inhibitor, dopamine receptor, …). Some time later they have to link their data to data classified using PRO and the GO function ontology. They cannot do this because they used ‘protein’ to mean ‘function of a protein’. BFO would have told them to make two ontologies (for function and for protein) from the very start.
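The first example can be illustrated by analogy in Python. If oxide and aluminium oxide are two bare individuals, nothing links them. If they are classes, the relation is an ordinary subclass link that every instance inherits. Python classes stand in for ontology classes here. This is an analogy under that assumption, not a rendering of any actual chemistry ontology.

```python
class Oxide:
    """Stands in for the ontology class 'oxide'."""


class AluminiumOxide(Oxide):
    """Stands in for 'aluminium oxide'; every instance is also an oxide."""


# Old style: two bare individuals with no statable subsumption between them.
oxide_individual = object()
aluminium_oxide_individual = object()

# New style: the class-level relation carries over to every instance.
sample = AluminiumOxide()
print(issubclass(AluminiumOxide, Oxide))  # True
print(isinstance(sample, Oxide))          # True
```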
https://senselab.med.yale.edu/NeuronDB/NeuronalReceptors
Thursday Session 2: Relationships among Upper Ontologies
Q: How to integrate biological and clinical data
within and across domains
across different species
across levels of granularity (organ, organism, cell, molecule)
across different perspectives (physical, biological, clinical)
A: by tagging with ontologies
What could go wrong?
379 Ontologies
http://bioportal.bioontology.org/search?q=obesity
Linked Open Data
divided we fail
LOD: we can save the day with mappings
LOL: we can save the day with mappings
Mappings are fragile – since both sides of the mapping will change independently
Mappings are expensive to maintain
LOL: we can save the day with mappings between terminologies?
Mappings are fragile – since both sides of the mapping will change independently
Mappings are expensive to maintain
The goal should be to minimize the need for mappings
By finding out how to create a good, robust ontology, and by creating one ontology module for each domain
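The fragility point can be sketched as a bookkeeping problem. A mapping is only known to be valid for the versions of the two vocabularies it was written against, so a release on either side puts it back in the review queue. The sketch below assumes integer version numbers. The term identifiers are hypothetical placeholders, not real vocabulary entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source_term: str
    target_term: str
    source_version: int  # version of the source vocabulary when mapped
    target_version: int  # version of the target vocabulary when mapped


def is_stale(mapping, current_source_version, current_target_version):
    """A mapping needs re-review if either side has moved on."""
    return (mapping.source_version != current_source_version
            or mapping.target_version != current_target_version)


# Hypothetical identifiers, for illustration only.
m = Mapping("vocabA:obesity", "vocabB:obesity", source_version=1, target_version=1)
print(is_stale(m, 1, 1))  # False
print(is_stale(m, 2, 1))  # True
print(is_stale(m, 1, 2))  # True
```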
Original OBO Foundry ontologies (Gene Ontology in yellow)
Columns: RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT). Rows: GRANULARITY.
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL); Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function
  Occurrent: Molecular Process
http://www.onto-med.de/Archiv/ontomed2002/en/theories/gfo/part1/node65.html
Mapping GFO to DOLCE http://www.onto-med.de/Archiv/ontomed2002/en/theories/gfo/part1/node65.html
Mapping of DOLCE to BFO
Principles for BFO 2.0 to DOLCE mapping
whenever possible, map to the most specific BFO 2.0 representational unit with an equivalence relation;
if there is no equivalence, map to the most specific superordinate BFO 2.0 representational unit with a subclass relation
if more than one mapping is possible, map to the union of BFO 2.0 lowest level representational units
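The three principles can be read as a small decision procedure. The sketch below is one possible reading, not the authors' implementation. It assumes the caller has already worked out, for a given category, which BFO 2.0 units are candidate equivalents and which are the most specific superordinates. The unit names passed in are hypothetical placeholders.

```python
def choose_mapping(equivalents, superordinates):
    """Pick a mapping following the three principles.

    equivalents:    BFO units judged equivalent to the category
    superordinates: most specific BFO units that subsume it
    Returns (relation, target) or None when nothing applies.
    """
    # Prefer equivalence; fall back to a subclass relation.
    if equivalents:
        relation, candidates = "equivalentTo", equivalents
    elif superordinates:
        relation, candidates = "subClassOf", superordinates
    else:
        return None
    # One candidate: map to it directly.
    if len(candidates) == 1:
        return (relation, candidates[0])
    # More than one possible mapping: map to the union.
    return (relation, ("unionOf", tuple(sorted(candidates))))


# Placeholder unit names, for illustration only.
print(choose_mapping(["BFO:a"], ["BFO:b"]))
print(choose_mapping([], ["BFO:c", "BFO:b"]))
```

Applying the union rule to both the equivalence and the subclass case is a design choice of this sketch. The slide states it only in general terms.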
BFO: Continuant Ontology
Thursday Session 4: Relationships among Upper Ontologies: Upper Ontologies in Different Logics
If an upper ontology is axiomatized in Common Logic and there exists another axiomatization in OWL, in what sense are they really the same ontology?
ontology =def. a collection of terms and relational expressions, together with definitions and axioms expressed in a computer-interpretable language
NOTE 3: The term ‘ontology’ is sometimes used in a narrow sense to refer to specific formal representations. In this standard, however, an ontology is conceived as an artefact created by humans in time, comparable in this respect to a scientific theory or to a lexicon. Thus an ontology may exist in different versions at different times, for example as a result of the fact that errors are corrected or new terms added.
What is a credit card number?
– not a mathematical object
– not a contingent object with physical properties, taking part in causal relations
– but a historical object, with a very special provenance
– stands in relations analogous to those of ownership
– exists only within a system of working financial institutions of specific kinds
Information vs. Information Artifact
‘information’ – mass noun (Shannon and Weaver)
‘information artifact’ – count noun (Information Artifact Ontology)
Information Artifacts in Science: protocol, database, theory, ontology, gene list, publication, result, ...
A credit card number is a generically dependent continuant
It requires some bearer, but it can migrate from one bearer to another (can exist in many different places at one and the same time)
It is a historical object, with a very special provenance
It stands in relations analogous to those of ownership
It exists only within a system of working financial institutions of specific kinds
what is a credit card account?
What is an ontology?
a generically dependent continuant
bearers: hard drives, paper documents, people’s brains …
a historical object, with a very special provenance
It stands in relations such as being used, being amended, being understood, being reasoned with
It exists only within a system of working socio-linguistico-computational institutions of specific kinds
Goal for BFO-OWL
Everything in the OWL should be interpretable in terms of the FOL
Sound with respect to BFO-FOL
Maximize useful entailments
Axiomatize BFO in FOL based on:
  universals in domain of discourse
  time-indexed instantiation
  exists_at
Goal for BFO-OWL (contd.)
Define OWL-specific relations, e.g. temporalized binary relations
Show consistency of axiomatization
Prove each OWL assertion on background of BFO FOL
Extend, as needed, Lisp Semantic Web (LSW) Toolkit
  Common Lisp-based, incremental development, interactive evaluation
  Simple, but extensible, language for writing FOL – similar to Common Logic, with macros
  OWL construction, reasoning. OWL->FOL translation.
  Provers: Prover9, Z3, Vampire, HermiT, FaCT++, Pellet, ELK
  Translators: Prover9 syntax, SMT-LIB, Common Logic, LaTeX
  Checks: satisfiability, proof, non-entailment, model construction/checking
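To illustrate what a temporalized binary relation might look like, the sketch below derives two binary relations from time-indexed (ternary) facts: one holding at some time, one holding at all times at which the first relatum exists. This reading of "at all times", the relation being modelled (part_of), and all entity names are assumptions made for illustration. The code is plain Python and is not taken from BFO-FOL or the LSW toolkit.

```python
def at_some_time(facts, x, y):
    """Binary relation: R(x, y, t) holds for at least one t."""
    return any(a == x and b == y for a, b, _t in facts)


def at_all_times(facts, exists_at, x, y):
    """Binary relation: R(x, y, t) holds at every t at which x exists."""
    times = exists_at.get(x, set())
    held = {t for a, b, t in facts if a == x and b == y}
    # Require x to exist at some time so the relation is not vacuous.
    return bool(times) and times <= held


# Hypothetical time-indexed part_of facts: (part, whole, time).
facts = {("tail", "cat", 1), ("tail", "cat", 2), ("tooth", "cat", 1)}
exists_at = {"tail": {1, 2}, "tooth": {1, 2}}

print(at_all_times(facts, exists_at, "tail", "cat"))   # True
print(at_all_times(facts, exists_at, "tooth", "cat"))  # False
print(at_some_time(facts, "tooth", "cat"))             # True
```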
Working with LSW
Basic checks
Definitions of OWL relations, theories, expectations of their properties
Proofs
3.13 top-level ontology (TLO)
an ontology that is created to represent the categories that are shared across a maximally broad range of domains