Ontology and the Future of Biomedical Research Barry Smith

Institute for Formal Ontology and Medical Information Science Saarland University

From chromosome to disease

Problem: how to reason with data deriving from different sources, each of which uses its own system of classification?

Solution: Ontology!

Examples of current needs for ontologies in biomedicine
– to enforce semantic consistency within a database
– to enable data sharing and re-use
– to enable data integration (bridging across data at multiple granularities)
– to allow querying

What is needed
– strong general purpose classification hierarchies created by domain specialists
– clear, rigorous definitions
– thoroughly tested in real use cases
– updated in light of scientific advance

The actuality (too often): myriad special purpose ‘light’ ontologies, prepared by ontology engineers and deposited in internet ‘repositories’ or ‘registries’

ontologies for ‘agent’

General trend on the part of NIH, FDA and other bodies to consolidate ontology-based standards for the communication and processing of biomedical data.

Responses to this trend
Old: UMLS (Unified Medical Language System) – rooted in faithfulness to the ways language is used by different medical communities

SNOMED DEMONS (UMLS)

UMLS
– congenital absent nipple is_a nipple
– cancer documentation is_a cancer
– disease prevention is_a disease
– repair and maintenance of wheelchair is_a disease
– water is_a nursing phenomenon
– part-whole =def. a nursing phenomenon with topology part-whole

MeSH MeSH Descriptors Index Medicus Descriptor Anthropology, Education, Sociology and Social Phenomena (MeSH Category) Social Sciences Political Systems National Socialism

MeSH
– National Socialism is_a Political Systems
– National Socialism is_a Anthropology...
– National Socialism is_a Social Sciences
– National Socialism is_a MeSH Descriptors

New: Semantic Web deposits
– Pet Profile Ontology
– Review Vocabulary
– Band Description Vocabulary
– Musical Baton Vocabulary
– MusicBrainz Metadata Vocabulary
– Kissology

Beer Ontology: all instances of hops that have ever existed are necessarily ingredients of beer.

OWL-based ontologies: some nice computational resources, but low expressivity and few genuinely scientific demonstration cases

OWL’s syntactic regimentation is not enough to ensure high-quality ontologies – the use of a common syntax and logical machinery and the careful separating out of ontologies into namespaces does not solve the problem of ontology integration

Both UMLS- and OWL-type responses involve ad hoc creation of new terminologies by each community. Many of these terminologies remain as torsos, gather dust, poison the wells,...

How to do better? How to create the conditions for a step-by-step evolution towards high quality ontologies in the biomedical domain which will serve as stable attractors for clinical and biomedical researchers in the future?

A basic distinction
– type vs. instance
– science text vs. clinical document
– dog vs. Fido

Instances are not represented in an ontology built for scientific purposes. It is the generalizations that are important (but instances must still be taken into account).

Catalog vs. inventory
A  515287  DC3300 Dust Collector Fan
B  521683  Gilmer Belt
C  521682  Motor Drive Belt

Ontology Types Instances

Ontology = A Representation of Types

Each node of an ontology consists of:
– preferred term (aka term)
– term identifier (TUI, aka CUI)
– synonyms
– definition, glosses, comments
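The four components of a node listed above map naturally onto a small record type. The following is a minimal sketch, not the format of any particular ontology tool; the class name `OntologyNode`, its field names, and the example identifier, synonym and definition text are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class OntologyNode:
    """One ontology node with the four components listed above."""
    preferred_term: str                 # singular noun naming exactly one type
    identifier: str                     # stable term identifier
    synonyms: Tuple[str, ...] = ()      # alternative labels for the same type
    definition: str = ""                # definition, glosses, comments


# Hypothetical node: identifier, synonym and definition are placeholders.
cell_division = OntologyNode(
    preferred_term="cell division",
    identifier="EX:0000001",
    synonyms=("division of a cell",),
    definition="Illustrative placeholder definition.",
)

print(cell_division.preferred_term, cell_division.identifier)
```

Making the record immutable (`frozen=True`) reflects the idea that an identifier should keep pointing at the same type; revisions would be issued as new versions rather than edited in place.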

Each term in an ontology represents exactly one type; hence ontology terms should be singular nouns (contrast: National Socialism is_a Political Systems)

An ontology is a representation of types. We learn about types in reality from looking at the results of scientific experiments in the form of scientific theories, which describe not what is particular in reality but rather what is general. Ontologies need to exploit the evolutionary path to convergence created by science.

High quality shared ontologies build communities. NIH, FDA trend to consolidate ontology-based standards for the communication and processing of biomedical data: caBIG / NECTAR / BIRN / BRIDG...

The Methodology of Annotations
GO employs scientific curators, who use experimental observations reported in the biomedical literature to link gene products with GO terms in annotations: this gene product exercises this function, in this part of the cell, leading to these biological processes.
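An annotation in this sense can be pictured as a record linking a gene product to a term in one of GO's three ontologies, with a pointer to the supporting literature. The sketch below is a simplification for illustration only: the `Annotation` class, its fields, and all identifiers are hypothetical placeholders and do not reproduce GO's actual annotation file format.

```python
from dataclasses import dataclass

# The three GO ontologies named in this talk.
ASPECTS = ("molecular function", "cellular component", "biological process")


@dataclass(frozen=True)
class Annotation:
    gene_product: str   # what is being annotated
    go_term: str        # placeholder term identifier
    aspect: str         # which of the three ontologies the term belongs to
    source: str         # literature reference reporting the observation

    def __post_init__(self):
        if self.aspect not in ASPECTS:
            raise ValueError(f"unknown aspect: {self.aspect}")


# Hypothetical annotations for one gene product; every value is a placeholder.
annotations = [
    Annotation("gene product X", "GO:placeholder-1", "molecular function", "paper A"),
    Annotation("gene product X", "GO:placeholder-2", "cellular component", "paper A"),
    Annotation("gene product X", "GO:placeholder-3", "biological process", "paper B"),
]


def terms_for(gene_product, annotations):
    """Collect, per aspect, the term a gene product has been linked to."""
    return {a.aspect: a.go_term for a in annotations if a.gene_product == gene_product}


print(terms_for("gene product X", annotations))
```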

The Methodology of Annotations
This process of annotating literature leads to improvements and extensions of the ontology, which in turn lead to better annotations. This institutes a virtuous cycle of improvement in the quality and reach of both future annotations and the ontology itself. Annotations + ontology taken together yield a slowly growing computer-interpretable map of biological reality.

The OBO Foundry

A subset of OBO ontologies, whose developers have agreed in advance to accept a common set of principles designed to ensure
– intelligibility to biologists (curators, annotators, users)
– formal robustness
– stability
– compatibility
– interoperability
– support for logic-based reasoning

Custodians
– Michael Ashburner (Cambridge)
– Suzanna Lewis (Berkeley)
– Barry Smith (Buffalo/Saarbrücken)

A collaborative experiment: participants have agreed in advance to a growing set of principles specifying best practices in ontology development, designed to guarantee interoperability of ontologies from the very start

The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement. They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single reference ontology.

Initial Candidate Members of the OBO Foundry
– GO Gene Ontology
– CL Cell Ontology
– SO Sequence Ontology
– ChEBI Chemical Ontology
– PATO Phenotype Ontology
– FuGO Functional Genomics Investigation Ontology
– FMA Foundational Model of Anatomy
– RO Relation Ontology

Under development
– Disease Ontology
– NCI Thesaurus
– Mammalian Phenotype Ontology
– OBO-UBO / Ontology of Biomedical Reality
– Organism (Species) Ontology
– Plant Trait Ontology
– Protein Ontology
– RnaO RNA Ontology

Considered for development
– Environment Ontology
– Behavior Ontology
– Biomedical Image Ontology
– Clinical Trial Ontology

CRITERIA
– The ontology is open and available to be used by all.
– The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap.
– The ontology is in, or can be instantiated in, a common formal language.

CRITERIA
– The ontology possesses a unique identifier space within OBO.
– The ontology provider has procedures for identifying distinct successive versions.
– The ontology includes textual definitions for all terms.

CRITERIA
– The ontology has a clearly specified and clearly delineated content.
– The ontology is well-documented.
– The ontology has a plurality of independent users.

CRITERIA
– The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.*
*Genome Biology 2005, 6:R46

CRITERIA
Further criteria will be added over time in order to bring about a gradual improvement in the quality of the ontologies in the Foundry.

A reference ontology is analogous to a scientific theory; it seeks to optimize representational adequacy to its subject matter to the maximal degree that is compatible with the constraints of computational usefulness.

An application ontology is comparable to an engineering artifact such as a software tool. It is constructed for a specific practical purpose.
Examples:
– National Cancer Institute Thesaurus
– FuGO Functional Genomics Investigation Ontology

Reference Ontology vs. Application Ontology
Currently, application ontologies are often built afresh for each new task, commonly introducing not only idiosyncrasies of format or logic, but also simplifications or distortions of their subject-matters. To solve this problem, application ontology development should always take place against the background of a formally robust reference ontology framework.

Advantages of the methodology of shared coherently defined ontologies
– promotes quality assurance (better coding)
– guarantees automatic reasoning across ontologies and across data at different granularities
– yields direct connection to temporally indexed instance data

Advantages of the methodology of shared coherently defined ontologies
We know that high-quality ontologies can help in creating better mappings, e.g. between human and model organism phenotypes.
S Zhang, O Bodenreider, “Alignment of Multiple Ontologies of Anatomy: Deriving Indirect Mappings from Direct Mappings to a Reference Ontology”, AMIA 2005

Advantages of the methodology of shared coherently defined ontologies
Once the interoperable gold standard reference ontologies are there, it will make sense to reformulate parts of existing incompatible terminologies (e.g. in UMLS) in terms of the standard ontologies in order to achieve greater domain coverage and alignment of different but veridical views. Thus not everything that was done in the past turns out to be a waste.

Goal: to create a family of gold standard reference ontologies upon which terminologies developed for specific applications can draw

Goal: to introduce the scientific method into ontology development:
– all Foundry ontologies must be constantly updated in light of scientific advance
– all Foundry ontology developers must work with all other Foundry ontology developers in a spirit of scientific collaboration

Goal: to replace the current policy of ad hoc creation of new database schemas by each clinical research group by providing reference ontologies in terms of which database schemas can be defined

Goal: to introduce some of the features of scientific peer review into biomedical ontology development

Goal: to create controlled vocabularies for use by clinical trial banks, clinical guidelines bodies, scientific journals,...

Goal: to create an evolving map-like representation of the entire domain of biological reality

GO’s three ontologies:
– molecular function
– cellular component
– biological process

[Diagram, built up over six slides. Normal (functionings): cell (types); molecular function (GO); species; molecular process; cellular anatomy (GO); anatomy (fly, fish, human...); cellular physiology; organism-level physiology; ChEBI, Sequence, RNA... Pathological (malfunctionings): pathophysiology (disease); pathoanatomy (fly, fish, human...). Added in later builds: phenotype; investigation (FuGO).]

End

First step: alignment of OBO Foundry ontologies through a common system of formally defined relations in the OBO Relation Ontology. See “Relations in Biomedical Ontologies”, Genome Biology, Apr. 2005

Judith Blake: “The use of bio-ontologies … ensures consistency of data curation, supports extensive data integration, and enables robust exchange of information between heterogeneous informatics systems... ontologies … formally define relationships between the concepts.”

"Gene Ontology: Tool for the Unification of Biology" an ontology "comprises a set of well-defined terms with well-defined relationships" (Ashburner et al., 2000, p. 27)

is_a (sensu UMLS)
A is_a B =def ‘A’ is narrower in meaning than ‘B’
grows out of the heritage of dictionaries (which ignore the basic distinction between types and instances)

is_a
– congenital absent nipple is_a nipple
– cancer documentation is_a cancer
– disease prevention is_a disease
– Nazism is_a social science

is_a (sensu logic)
A is_a B =def For all x, if x instance_of A then x instance_of B
– cell division is_a biological process
– adult is_a child ???

Two kinds of entities
– occurrents (processes, events, happenings): cell division, ovulation, death
– continuants (objects, qualities,...): cell, ovum, organism, temperature of organism,...

is_a (for occurrents)
A is_a B =def For all x, if x instance_of A then x instance_of B
– cell division is_a biological process

is_a (for continuants)
A is_a B =def For all x, t if x instance_of A at t then x instance_of B at t
– abnormal cell is_a cell
– adult human is_a human
– but not: adult is_a child
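The need for the time index can be checked on a toy data set. The sketch below is an illustration under stated assumptions: the individual `mary`, the integer time points, and both function names are invented here. It compares the untimed definition with the time-indexed one on the slide's own adult/child example.

```python
# Time-indexed instance_of facts: (individual, type, time).
# All individuals and time points are invented for illustration.
FACTS = {
    ("mary", "child", 1),
    ("mary", "human", 1),
    ("mary", "adult", 2),
    ("mary", "human", 2),
}


def is_a_untimed(a, b, facts):
    """Untimed reading: every individual that is ever an A is also, at some time, a B."""
    def members(typ):
        return {x for (x, ty, _t) in facts if ty == typ}
    return members(a) <= members(b)


def is_a_continuant(a, b, facts):
    """Time-indexed reading: whenever x is an A at t, x is also a B at that same t."""
    return all((x, b, t) in facts for (x, typ, t) in facts if typ == a)


print(is_a_untimed("adult", "child", FACTS))     # True: the unwanted result
print(is_a_continuant("adult", "child", FACTS))  # False
print(is_a_continuant("adult", "human", FACTS))  # True
```

Both functions are vacuously true for a type with no recorded instances, so a check of this kind is only as informative as the instance data behind it.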

Part_of as a relation between types is more problematic than is standardly supposed
– heart part_of human being?
– human heart part_of human being?
– human being has_part human testis?
– human testis part_of human being?

two kinds of parthood
1. between instances:
   Mary’s heart part_of Mary
   this nucleus part_of this cell
2. between types:
   human heart part_of human
   cell nucleus part_of cell

Definition of part_of as a relation between types
A part_of B =Def all instances of A are instance-level parts of some instance of B
ALL–SOME STRUCTURE

part_of (for occurrents)
A part_of B =Def For all x, if x instance_of A then there is some y, y instance_of B and x part_of y
where ‘part_of’ is the instance-level part relation

part_of (for continuants)
A part_of B =Def For all x, t if x instance_of A at t then there is some y, y instance_of B at t and x part_of y
where ‘part_of’ is the instance-level part relation
ALL-SOME STRUCTURE
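The all-some structure can be tested mechanically against instance data. The following is a minimal sketch assuming a toy data set: the individuals, the single time point, and the function names `type_part_of` and `type_has_part` are invented for illustration. It shows why human testis part_of human can hold while human has_part human testis fails.

```python
# instance_of facts: (individual, type, time); part_of facts: (part, whole, time).
# All individuals are invented for illustration.
INSTANCE_OF = {
    ("heart1", "human heart", 1), ("mary", "human", 1),
    ("heart2", "human heart", 1), ("john", "human", 1),
    ("testis1", "human testis", 1),
}
PART_OF = {("heart1", "mary", 1), ("heart2", "john", 1), ("testis1", "john", 1)}


def type_part_of(a, b, inst, parts):
    """A part_of B: every instance of A at t is part of some instance of B at t."""
    return all(
        any(p == x and pt == t and (whole, b, t) in inst for (p, whole, pt) in parts)
        for (x, typ, t) in inst if typ == a
    )


def type_has_part(b, a, inst, parts):
    """B has_part A: every instance of B at t has some instance of A as part at t."""
    return all(
        any(whole == y and pt == t and (p, a, t) in inst for (p, whole, pt) in parts)
        for (y, typ, t) in inst if typ == b
    )


print(type_part_of("human testis", "human", INSTANCE_OF, PART_OF))   # True
print(type_has_part("human", "human testis", INSTANCE_OF, PART_OF))  # False: mary has none
print(type_has_part("human", "human heart", INSTANCE_OF, PART_OF))   # True
```

The two functions differ only in which side carries the "all" and which the "some", which is why the type-level relations are not simple inverses of each other.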

How to use the OBO Relation Ontology
Ontologies are representations of types and of the relations between types. The definitions of these relations involve reference to times and instances, but these references are washed out when we get to the assertions (edges) in the ontology. But curators should still be aware of the underlying definitions when formulating such assertions.

A part_of B, B part_of C...
The all-some structure of such definitions allows cascading of inferences (true path rule)
(i) within ontologies
(ii) between ontologies
(iii) between ontologies and repositories of instance-data
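At the level of asserted edges, cascading amounts to chaining part_of assertions transitively. A minimal sketch follows, assuming a toy edge list in which only "cell nucleus part_of cell" comes from these slides and the "nucleolus" edge is added purely for illustration; the function name `transitive_closure` is likewise an assumption.

```python
# Type-level part_of edges (A, B) read as "A part_of B".
# "cell nucleus part_of cell" appears in the slides; the nucleolus edge is illustrative.
EDGES = {
    ("nucleolus", "cell nucleus"),
    ("cell nucleus", "cell"),
}


def transitive_closure(edges):
    """Repeatedly chain A part_of B and B part_of C into A part_of C until nothing new appears."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


inferred = transitive_closure(EDGES) - EDGES
print(inferred)  # {('nucleolus', 'cell')}
```

The chaining is sound only because each edge has the all-some form: every A is part of some B, and that B is in turn part of some C.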

Strengthened true path rule
Whichever A you choose, the instance of B of which it is a part will be included in some C, which will include as part also the A with which you began.
The same principle applies to the other relations in the OBO-RO: located_at, transformation_of, derived_from, adjacent_to, etc.

Kinds of relations
– Between types: is_a, part_of,...
– Between an instance and a type: this explosion instance_of the type explosion
– Between instances: Mary’s heart part_of Mary

In every ontology some terms and some relations are primitive, i.e. they cannot be defined (on pain of infinite regress)
Examples of primitive relations:
– identity
– instantiation
– (instance-level) part_of
– (instance-level) continuous_with

Fiat and bona fide boundaries

Continuity Attachment Adjacency

everything here is an independent continuant

structures vs. formations = bona fide vs. fiat boundaries

Modes of Connection
The body is a highly connected entity. Exceptions: cells floating free in blood.

Modes of connection:
– attached_to (muscle to bone)
– synapsed_with (nerve to nerve, nerve to muscle)
– continuous_with (= share a fiat boundary)

Attachment, location, containment [Figure labels: articular eminence; articular (glenoid) fossa; ANTERIOR]

Containment involves relation to a hole or cavity
1: cavity
2: tunnel, conduit (artery)
3: mouth; a snail’s shell

Fiat vs. Bona Fide Boundaries

Double Hole Structure
– Medium (filling the environing hole)
– Tenant (occupying the central hole)
– Retainer (a boundary of some surrounding structure)

The temporomandibular joint [Figure labels: head of condyle; neck of condyle; fossa; fiat boundary]

continuous_with (a relation between instances which share a fiat boundary) is always symmetric: if x continuous_with y, then y continuous_with x

continuous_with (relation between types)
A continuous_with B =Def. for all x, if x instance_of A then there is some y such that y instance_of B and x continuous_with y

continuous_with (as a relation between types) is not always symmetric
Consider lymph node and lymphatic vessel: each lymph node is continuous with some lymphatic vessel, but there are lymphatic vessels (e.g. lymphs and lymphatic trunks) which are not continuous with any lymph nodes
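The contrast between instance-level symmetry and type-level asymmetry can be reproduced on a toy model. This is a sketch under invented assumptions: one lymph node, two lymphatic vessels, and the names `node1`, `vessel1`, `vessel2` and `type_continuous_with` are all hypothetical.

```python
# Which type each individual instantiates (all individuals invented for illustration).
INSTANCE_OF = {
    "node1": "lymph node",
    "vessel1": "lymphatic vessel",
    "vessel2": "lymphatic vessel",   # a vessel that touches no lymph node
}

# Instance-level continuity is stored symmetrically: each pair is added in both directions.
_pairs = {("node1", "vessel1")}
CONTINUOUS = _pairs | {(y, x) for (x, y) in _pairs}


def type_continuous_with(a, b):
    """A continuous_with B: every instance of A is continuous with some instance of B."""
    xs = [x for x, typ in INSTANCE_OF.items() if typ == a]
    return all(
        any((x, y) in CONTINUOUS and INSTANCE_OF[y] == b for y in INSTANCE_OF)
        for x in xs
    )


print(type_continuous_with("lymph node", "lymphatic vessel"))   # True
print(type_continuous_with("lymphatic vessel", "lymph node"))   # False: vessel2 has no node
```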

Adjacent_to as a relation between types is not symmetric
Consider: seminal vesicle adjacent_to urinary bladder
Not: urinary bladder adjacent_to seminal vesicle

instance level:
this nucleus is adjacent to this cytoplasm
implies: this cytoplasm is adjacent to this nucleus
type level:
nucleus adjacent_to cytoplasm
Not: cytoplasm adjacent_to nucleus

Applications
Expectations of symmetry, e.g. for protein-protein interactions, may hold only at the instance level
– if A interacts with B, it does not follow that B interacts with A
– if A is expressed simultaneously with B, it does not follow that B is expressed simultaneously with A

[Diagram: transformation_of. The same instance c is shown at times t and t1 along a time axis, instantiating types C and C1. Labelled examples: pre-RNA, mature RNA; adult, child.]

transformation_of
A transformation_of B =Def. Every instance of A was at some earlier time an instance of B
– adult transformation_of child
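The definition can again be checked against time-indexed instance data. The sketch below assumes invented individuals and integer time points; `transformation_of` follows the definition above, while `transforms_into` is a hypothetical converse check added only to show that the forward-looking claim need not hold universally.

```python
# Time-indexed instance_of facts: (individual, type, time). All data is invented.
FACTS = {
    ("mary", "child", 1), ("mary", "adult", 2),
    ("john", "child", 1), ("john", "adult", 3),
    ("tom", "child", 2),   # tom is never recorded as an adult
}


def transformation_of(a, b, facts):
    """Every instance of A was at some earlier time an instance of B."""
    return all(
        any(x2 == x and ty == b and t2 < t for (x2, ty, t2) in facts)
        for (x, typ, t) in facts if typ == a
    )


def transforms_into(b, a, facts):
    """Converse claim: every instance of B is at some later time an instance of A."""
    return all(
        any(x2 == x and ty == a and t2 > t for (x2, ty, t2) in facts)
        for (x, typ, t) in facts if typ == b
    )


print(transformation_of("adult", "child", FACTS))  # True
print(transformation_of("child", "adult", FACTS))  # False
print(transforms_into("child", "adult", FACTS))    # False: tom never becomes an adult
```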

[Diagram: tumor development. Labels: C, c at t; c at t1, C1]

[Diagram: derives_from, with axes for time and instances. Labels: C, c at t; C1, c1 at t1; C', c' at t. Example: zygote derives_from ovum; zygote derives_from sperm]

Fusion: two continuants fuse to form a new continuant [Diagram labels: C, c at t; C1, c1 at t1; C', c' at t]

Fission: one initial continuant is replaced by two successor continuants [Diagram labels: C, c at t; C1, c1 at t1; C2, c1 at t1]

Budding: one continuant detaches itself from an initial continuant, which itself continues to exist [Diagram labels: C, c at t; c at t1; C1, c1 at t]

Capture: one continuant absorbs a second continuant while itself continuing to exist [Diagram labels: C, c at t; c at t1; C', c' at t]

A suite of defined relations between types
Foundational: is_a, part_of
Spatial: located_in, contained_in, adjacent_to
Temporal: transformation_of, derives_from, preceded_by
Participation: has_participant, has_agent

To be added to the Relation Ontology
– lacks (between an instance and a type, e.g. this fly lacks wings)
– dependent_on (between a dependent entity and its carrier or bearer)
– quality_of (between a dependent and an independent continuant)
– functioning_of (between a process and an independent continuant)

Low Hanging Fruit
Ontologies should include only those relational assertions which hold universally (= have the ALL-SOME form).
Often, order will matter here. We can include
adult transformation_of child
but not
child transforms_into adult

The Gene Ontology

GO’s three ontologies:
– molecular functions
– cellular components
– biological processes

When a gene is identified, three types of questions need to be addressed:
1. Where is it located in the cell?
2. What functions does it have on the molecular level?
3. To what biological processes do these functions contribute?

Three granularities:
– Cellular (for components)
– Molecular (for functions)
– Organ + organism (for processes)

GO has cells, but it does not include terms for molecules or organisms within any of its three ontologies, except e.g. GO: host =Def. Any organism in which another organism spends part or all of its life cycle

Are the relations between functions and processes a matter of granularity?
Molecular activities are the ‘building blocks’ of biological processes? But they are not allowed to be represented in GO as parts of biological processes.

What does “function” mean?
An entity has a biological function if and only if it is part of an organism and has a disposition to act reliably in such a way as to contribute to the organism’s survival. The function is this disposition.

Improved version
An entity has a biological function if and only if it is part of an organism and has a disposition to act reliably in such a way as to contribute to the organism’s realization of the canonical life plan for an organism of that type.

This canonical life plan might include
– canonical embryological development
– canonical growth
– canonical reproduction
– canonical aging
– canonical death

The function of the heart is to pump blood.
Not every activity (process) in an organism is the exercise of a function. There are:
– malfunctionings
– side-effects (heart beating)
– accidents (external interference)
– background stochastic activity

Kidney

Nephron

Functional Segments

Functions

This is a screwdriver. This is a good screwdriver. This is a broken screwdriver.
This is a heart. This is a healthy heart. This is an unhealthy heart.

Functions are associated with certain characteristic process shapes
– Screwdriver: rotates and simultaneously moves forward, simultaneously transferring torque from hand and arm to screw
– Heart: performs a contracting movement inwards and an expanding movement outwards

Not functioning at all leads to death, modulo
internal factors:
– plasticity
– redundancy (2 kidneys)
– criticality of the system involved
external factors:
– prosthesis (dialysis machines, oxygen tent)
– special environments
– assistance from other organisms

What clinical medicine is for: to eliminate malfunctioning by fixing broken body parts (or to prevent the appearance of malfunctioning by intervening e.g. at the molecular level)

Hypothesis: there are no ‘bad’ functions
It is not the function of an oncogene to cause cancer. Oncogenes were in every case proto-oncogenes with functions of their own. They become oncogenes because of bad (non-prototypical) environments.

Is there an exception for molecular functions?
Does this apply only to functions on biological levels of granularity (= levels of granularity coarser than the molecule)? If pathology is the deviation from (normal) functioning, does it make sense to talk of a pathological molecule? (Pathologically functioning molecule vs. pathologically structured molecule)

Is there an exception for molecular functions?
A molecular function is a propensity of a gene product instance to perform actions on the molecular level of granularity.
Hypothesis 1: these actions must be reliably such as to contribute to biological processes.
Hypothesis 2: these actions must be reliably such as to contribute to the organism’s realization of the canonical life plan for an organism of that type.

The Gene Ontology is a canonical ontology – it represents only what is normal in the realm of molecular functioning

The GO is a canonical representation: “The Gene Ontology is a computational representation of the ways in which gene products normally function in the biological realm” (Nucl. Acids Res. 2006: 34)

The FMA is a canonical representation. It is a computational representation of types and relations between types deduced from the qualitative observations of the normal human body, which have been refined and sanctioned by successive generations of anatomists and presented in textbooks and atlases of structural anatomy.

The importance of pathways (successive causality)
Each stage in the history of a disease presupposes the earlier stages.
Therefore need to reason across time, tracking the order of events in time, using relations such as derives_from, transformation_of...
Need pathway ontologies on every level of granularity.

The importance of granularity (simultaneous causality)
Networks are continuants.
At any given time there are networks existing in the organism at different levels of granularity.
Changes in one cause simultaneous changes in all the others. (Compare the gas laws: at fixed volume, a rise in temperature causes a simultaneous increase in pressure.)

The Granularity Gulf
– most existing data-sources are of fixed, single granularity
– many (all?) clinical phenomena cross granularities
Therefore need to reason across time, tracking the order of events in time.

Good ontologies require:
– consistent use of terms, supported by logically coherent (non-circular) definitions, in equivalent human-readable and computable formats
– coherent shared treatment of relations to allow cascading inference both within and between ontologies

Three fundamental dichotomies
– continuants vs. occurrents
– dependent vs. independent
– types vs. instances

ONTOLOGIES ARE REPRESENTATIONS OF TYPES aka kinds, universals, categories, species, genera,...

Continuants (aka endurants)
– have continuous existence in time
– preserve their identity through change
– exist in toto whenever they exist at all
Occurrents (aka processes)
– have temporal parts
– unfold themselves in successive phases
– exist only in their phases

You are a continuant. Your life is an occurrent.
You are 3-dimensional. Your life is 4-dimensional.

Dependent entities require independent continuants as their bearers.
There is no run without a runner.
There is no grin without a cat.

Dependent vs. independent continuants
– Independent continuants (organisms, cells, molecules, environments)
– Dependent continuants (qualities, shapes, roles, propensities, functions)

All occurrents are dependent entities. They are dependent on those independent continuants which are their participants (agents, patients, media...).

Top-Level Ontology Continuant Occurrent (always dependent on one or more independent continuants) Independent Continuant Dependent Continuant
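
The dependence relations in this top-level scheme can be made concrete in a few lines. This is only a sketch: Python classes stand in loosely for types and Python objects for instances, and the constructor checks are one assumed way of encoding "no run without a runner", not a rendering of any published implementation.

```python
class Continuant:
    """Exists in full at every time at which it exists at all."""
    def __init__(self, name):
        self.name = name


class IndependentContinuant(Continuant):
    """E.g. an organism, a cell, a molecule."""


class DependentContinuant(Continuant):
    """E.g. a quality, role, or function; requires a bearer."""
    def __init__(self, name, bearer):
        if not isinstance(bearer, IndependentContinuant):
            raise TypeError("a dependent continuant needs an independent continuant as bearer")
        super().__init__(name)
        self.bearer = bearer


class Occurrent:
    """A process; depends on the independent continuants that participate in it."""
    def __init__(self, name, participants):
        participants = list(participants)
        if not participants or not all(
            isinstance(p, IndependentContinuant) for p in participants
        ):
            raise TypeError("an occurrent needs at least one independent continuant as participant")
        self.name = name
        self.participants = participants


cat = IndependentContinuant("cat")
grin = DependentContinuant("grin", bearer=cat)      # no grin without a cat
runner = IndependentContinuant("runner")
run = Occurrent("run", participants=[runner])       # no run without a runner
```

Attempting `DependentContinuant("grin", bearer=None)` raises `TypeError`, which is the point of the sketch: the dependence is enforced at construction time rather than left to convention.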

= A representation of top-level types Continuant Occurrent Independent Continuant Dependent Continuant cell component biological process molecular function

Top-Level Ontology Continuant Occurrent Independent Continuant Dependent Continuant Functioning Side-Effect, Stochastic Process,... Function

Top-Level Ontology Continuant Occurrent Independent Continuant Dependent Continuant Quality Function Spatial Region Functioning Side-Effect, Stochastic Process,... instances (in space and time)

Smith B, Ceusters W, Kumar A, Rosse C. On Carcinomas and Other Pathological Entities, Comp Functional Genomics, Apr. 2006

everything here is an independent continuant

Functions, etc. Some dependent continuants are realizable expression of a gene application of a therapy course of a disease execution of an algorithm realization of a protocol

Functions vs Functionings the function of your heart = to pump blood in your body this function is realized in processes of pumping blood not all functions are realized (consider the function of this sperm...)

Concepts Biomedical ontology integration will never be achieved through integration of meanings or concepts The problem is precisely that different user communities use different concepts Concepts are in your head and will change as your understanding changes

Concepts Ontologies represent types: not concepts, meanings, ideas... Types exist, with their instances, in objective reality – including types of image, of imaging process, of brain region, of clinical procedure, etc.

Rules on types Don’t confuse types with words Don’t confuse types with concepts Don’t confuse types with ways of getting to know types Don’t confuse types with ways of talking about types Don’t confuse types with data about types

Some other simple rules for high quality ontologies

Univocity Terms should have the same meanings on every occasion of use. They should refer to the same kinds of entities in reality Basic ontological relations such as is_a and part_of should be used in the same way by all ontologies

Positivity Complements of types are not themselves types. Hence terms such as non-mammal non-membrane other metalworker in New Zealand do not designate types in reality

Ontology of types ≠ logic of terms There are no conjunctive and disjunctive types: anatomic structure, system, or substance musculoskeletal and connective tissue disorder rheumatism, excluding the back

Objectivity Which types exist in reality is not a function of our knowledge. Terms such as unknown unclassified unlocalized arthropathies not otherwise specified do not designate types in reality.
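
Rules such as Positivity, the ban on conjunctive and disjunctive types, and Objectivity lend themselves to a rough automated check on term labels. The sketch below is a heuristic only: the patterns are assumptions drawn from the example terms on these slides, they are not exhaustive, and a match means "ask a curator", not "this term is wrong".

```python
import re

# Illustrative patterns, keyed by the rule a label may violate.
RULES = {
    "positivity": re.compile(r"^(non-|other\b)", re.IGNORECASE),
    "no conjunctive/disjunctive types": re.compile(
        r"\b(and|or|excluding)\b", re.IGNORECASE
    ),
    "objectivity": re.compile(
        r"\b(unknown|unclassified|unlocalized|not otherwise specified)\b",
        re.IGNORECASE,
    ),
}


def lint_label(label):
    """Return the names of the rules a term label appears to violate."""
    text = label.strip()
    return [rule for rule, pattern in RULES.items() if pattern.search(text)]


for term in [
    "non-mammal",
    "other metalworker in New Zealand",
    "rheumatism, excluding the back",
    "arthropathies not otherwise specified",
    "lytic vacuole",
]:
    print(term, "->", lint_label(term))
```

Word boundaries keep the check from firing on substrings, so "metalworker" and "Zealand" do not count as containing "or" or "and".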

Keep Epistemology Separate from Ontology If you want to say that We do not know where A’s are located do not invent a new class of A’s with unknown locations (A well-constructed ontology should grow linearly; it should not need to delete classes or relations because of increases in knowledge)

Syntactic Separateness Do not confuse sentences with terms If you want to say I surmise that this is a case of pneumonia do not invent a new class of surmised pneumonias

Single Inheritance No kind in a classificatory hierarchy should have more than one is_a parent on the immediate higher level

Multiple Inheritance thing car blue thing blue car is_a

Multiple Inheritance is a source of errors encourages laziness serves as obstacle to integration with neighboring ontologies hampers use of Aristotelian methodology for defining terms

Multiple Inheritance thing car blue thing blue car is_a 1 is_a 2
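
Whether a hierarchy respects single inheritance is easy to test mechanically. A minimal sketch, assuming the hierarchy is available as (child, parent) pairs of is_a assertions; the function name `multiple_parents` is illustrative.

```python
from collections import defaultdict


def multiple_parents(is_a_edges):
    """Map each term with more than one direct is_a parent to its parents."""
    parents = defaultdict(set)
    for child, parent in is_a_edges:
        parents[child].add(parent)
    return {child: sorted(ps) for child, ps in parents.items() if len(ps) > 1}


# The blue-car example from the slide.
edges = [
    ("car", "thing"),
    ("blue thing", "thing"),
    ("blue car", "car"),
    ("blue car", "blue thing"),
]
print(multiple_parents(edges))  # {'blue car': ['blue thing', 'car']}
```

An empty result means every term has at most one immediate is_a parent.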

is_a Overloading The success of ontology alignment demands that ontological relations (is_a, part_of,...) have the same meanings in the different ontologies to be aligned.

Example: is_a is pressed into service by the GO to express location is-located-at and similar relations are expressed by creating special compound terms using: site of … … within … … in … extrinsic to … yielding associated errors

e.g. errors with ‘within’ lytic vacuole within a protein storage vacuole lytic vacuole within a protein storage vacuole is_a protein storage vacuole Compare: embryo within a uterus is_a uterus

similar problems with part_of extrinsic to membrane part_of membrane
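
Assertions of this kind follow a recognizable pattern: the subject is a compound term that ends in a locative phrase naming the very term it is asserted to be a kind of, or a part of. The sketch below flags such cases; the list of locative phrases is taken from the slide above, while the function name and the triple format are assumptions made for illustration.

```python
import re

# Locative phrases used to build compound terms (from the slide above).
LOCATIVE = r"(?:within|in|extrinsic to|site of)"


def suspicious_assertions(assertions):
    """Flag (subject, relation, object) triples whose subject label ends in
    '<locative> [a|an|the] <object>', i.e. location disguised as is_a/part_of."""
    flagged = []
    for subject, relation, obj in assertions:
        pattern = rf"\b{LOCATIVE}\s+(?:an?\s+|the\s+)?{re.escape(obj)}$"
        if re.search(pattern, subject, re.IGNORECASE):
            flagged.append((subject, relation, obj))
    return flagged


triples = [
    ("lytic vacuole within a protein storage vacuole", "is_a", "protein storage vacuole"),
    ("extrinsic to membrane", "part_of", "membrane"),
    ("protein storage vacuole", "is_a", "vacuole"),
]
for triple in suspicious_assertions(triples):
    print(triple)
```

The first two triples are flagged and the third is not. A check like this only surfaces candidates; deciding which relation was really intended remains a curation task.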

Compositionality The meanings of compound terms should be determined 1. by the meanings of component terms together with 2. the rules governing syntax

Why do we need rules/standards for good ontology? Ontologies must be intelligible both to humans (for annotation and curation) and to machines (for reasoning and error-checking): the lack of rules for classification leads to human error and blocks automatic reasoning and error-checking Intuitive rules facilitate training of curators and annotators Common rules allow alignment with other ontologies

When we annotate the record of an experiment we use terms representing types to capture what we learn about: –this experiment (instance), performed here and now, in this laboratory –the instances experimented upon These instances are typical = they are representatives of types –of experiment (described in FuGO) –of gene product molecules, molecular functions, cellular components, biological processes (described in GO)

Experimental records document a variety of instances (particular real-world examples or cases), ranging from instances of gene products (including individual molecules) to instances of biochemical processes, molecular functions, and cellular locations

Experimental records provide evidence that gene products of given types have molecular functions of given types by documenting occurrences in the real world that involve corresponding instances of functioning. They document the existence of real-world molecules that have the potential to execute (carry out, realize, perform) the types of molecular functions that are involved in these occurrences.

Motivation: To capture reality Inferences and decisions we make are based upon what we know of reality. An ontology is a computable representation of biological reality, which is designed to enable a computer to reason over the data we collect about this reality in (some of) the ways that we do.