1 Formal Ontology and Information Systems Barry Smith


1 1 Formal Ontology and Information Systems Barry Smith http://ifomis.de

2 2 Institute for Formal Ontology and Medical Information Science (IFOMIS) Faculty of Medicine University of Leipzig http://ifomis.de

3 3 The Idea Computational medical research will transform the discipline of medicine … but only if communication problems can be solved

4 4 Database standardization is desperately needed in medicine to enable the huge amounts of data resulting from trials by different groups to be fused together

5 5 How make one system out of all of this? How resolve incompatibilities? “ONTOLOGY” = the solution of first resort (compare: kicking a television set) But what does ‘ontology’ mean? Current answer: a collection of terms and definitions satisfying constraints of description logic = application ontology

6 6 Description logic a decidable logic (thus much weaker than first-order predicate logic) for manipulating hierarchies of concepts/general terms

7 7 Enterprise Ontology A Sale is an agreement between two Legal- Entities for the exchange of a Product for a Sale-Price. A Strategy is a Plan to Achieve a high-level Purpose. A Market is all Sales and Potential Sales within a scope of interest.

8 8 Gene Ontology Molecular Function Ontology: tasks performed by individual gene products; examples: transcription factor, DNA helicase Biological Process Ontology: broad biological goals accomplished by ordered assemblies of molecular functions; examples: mitosis, purine metabolism Cellular Component Ontology: subcellular structures, locations, and macromolecular complexes; examples: nucleus, telomere

9 9 Example from Molecular Function Ontology hormone ; GO:0005179 %digestive hormone ; GO:0046659 %peptide hormone ; GO:0005180 %adrenocorticotropin ; GO:0017043 %glycopeptide hormone ; GO:0005181 %follicle-stimulating hormone ; GO:0016913 % = IS A

10 10 as tree (joined by is a links): hormone digestive hormone peptide hormone adrenocorticotropin glycopeptide hormone follicle-stimulating hormone
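The step from the flat listing on slide 9 to the tree on slide 10 can be sketched in a few lines of Python. This is only an illustration: the transcript has lost the original indentation, so the nesting used below is an assumed reading, and the names `LISTING`, `parse_is_a` and `ancestors` are hypothetical helpers, not part of any Gene Ontology tool.

```python
# Assumed nesting: the slide transcript is flattened, so the indentation
# below is a guess at the original layout, not a quotation of it.
LISTING = """\
hormone ; GO:0005179
 %digestive hormone ; GO:0046659
 %peptide hormone ; GO:0005180
  %adrenocorticotropin ; GO:0017043
  %glycopeptide hormone ; GO:0005181
   %follicle-stimulating hormone ; GO:0016913
"""


def parse_is_a(listing):
    """Return (names, parents): id -> term name, and child id -> parent id."""
    names = {}
    parents = {}
    stack = []  # (depth, go_id) pairs along the current path from the root
    for raw in listing.splitlines():
        if not raw.strip():
            continue
        depth = len(raw) - len(raw.lstrip(" "))
        # '%' marks an IS A link to the enclosing term; strip it before splitting.
        name, go_id = (part.strip() for part in raw.strip().lstrip("%").split(";"))
        names[go_id] = name
        # Drop entries that are not ancestors of the current line.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        if stack:
            parents[go_id] = stack[-1][1]
        stack.append((depth, go_id))
    return names, parents


def ancestors(go_id, parents):
    """Follow IS A links upward and return the chain of ancestor ids."""
    chain = []
    while go_id in parents:
        go_id = parents[go_id]
        chain.append(go_id)
    return chain


names, parents = parse_is_a(LISTING)
print([names[a] for a in ancestors("GO:0016913", parents)])
```

Under the assumed nesting, walking upward from follicle-stimulating hormone passes through glycopeptide hormone and peptide hormone before reaching hormone.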

11 11 Problem: There exist multiple databases genomic cellular structural phenotypic … and even for each specific type of information, e.g. DNA sequence data, there exist several databases of different scope and organisation

12 12 What is a gene? GDB: a gene is a DNA fragment that can be transcribed and translated into a protein Genbank: a gene is a DNA region of biological interest with a name and that carries a genetic trait or phenotype (from Schulze-Kremer)

13 13 What is blood? Unified Medical Language System (UMLS): blood is a tissue Systematized Nomenclature of Medicine (SNOMED): blood is a fluid

14 14 Another Example: Statements of Accounts Company Financial statements may be prepared under either the (US) GAAP or the (European) IASC standards These allocate cost items to different categories depending on the laws of the countries involved.

15 15 Job: to develop an algorithm for the automatic conversion of income statements and balance sheets between the two systems. Not even this relatively simple problem has been satisfactorily resolved … why not?

16 16 The World Wide Web Vast amount of heterogeneous data sources Needs: dramatically better support at the level of metadata The ability to query and integrate across different conceptual systems: The currently preferred answer is The Semantic Web, based on description logic will not work: How tag blood?

17 17 Application ontology cannot solve the problems of database integration There can be no mechanical solution to the problems of data fusion in a domain like medicine

18 18 Applications ontology: … grew out of work in AI and in knowledge representation Ontologies are applications running in real time

19 19 Applications ontology: ontologies are inside the computer thus subject to severe constraints on expressive power (effectively the expressive power of description logic)

20 20 Applications ontology cannot solve the data-fusion problem because of its roots in knowledge mining

21 21 different conceptual systems

22 22 need not interconnect at all

23 23 because of the limits of knowledge mining

24 24 we cannot make incompatible concept-systems interconnect just by looking at concepts, or knowledge – we need some tertium quid

25 25 Applications ontology has its philosophical roots in Quine’s doctrine of ontological commitment and in the ‘internal metaphysics’ of Carnap/Putnam Roughly, for an applications ontology the world and the semantic model are one and the same What exists = what the system says exists

26 26 The Problem for the Quinean If an ontology is the set of ontological commitments of a theory, how can we cope with questions pertaining to the relations between the objects to which different theories are committed?

27 27 theories, semantic models, need not interconnect at all

28 28 What is needed is some sort of wider common framework which is sufficiently rich and nuanced to allow concept systems deriving from different sources to be hand-calibrated

29 29 What is needed is not an applications ontology but a reference ontology (something like old-fashioned metaphysics)

30 30 Reference Ontology … grew out of logic and analytic metaphysics An ontology is a theory of the relevant domain of entities Ontology is outside the computer seeks maximal expressiveness and adequacy to reality willing to sacrifice computational tractability for the sake of representational adequacy

31 31 Belnap “it is a good thing logicians were around before computer scientists; if computer scientists had got there first, then we wouldn’t have numbers because arithmetic is undecidable”

32 32 It is a good thing Aristotelian metaphysics was around before description logic, because otherwise we would have only hierarchies of concepts/universals/classes and no individual instances …

33 33 Reference Ontology a theory of the tertium quid – called reality – needed to hand-calibrate database/terminology systems

34 34 Methodology Get ontology right first (realism; descriptive adequacy; rather powerful logic); solve tractability problems later

35 35 The Reference Ontology Community IFOMIS (Leipzig) Laboratory for Applied Ontology (Trento/Rome, Turin) Foundational Ontology Project (Leeds) Ontology Works (Baltimore) Ontek Corporation (Buffalo/Leeds) LandC (Belgium/Philadelphia) (CYC?)

36 36 Domains of Current Work in Reference Ontology IFOMIS Leipzig: Medicine Laboratory for Applied Ontology Trento/Rome: Ontology of Cognition/Language Turin: Law Foundational Ontology Project (Leeds): Space, Physics Ontology Works (Baltimore): Genetics, Molecular Biology Ontek Corporation (Buffalo/Leeds): Biological Systematics LandC (Belgium/Philadelphia): Medical NLP (? CYC : Everything ?)

37 37 Some Historical Background on Reference Ontology

38 38 Recall: GDB: a gene is a DNA fragment that can be transcribed and translated into a protein Genbank: a gene is a DNA region of biological interest with a name and that carries a genetic trait or phenotype (from Schulze-Kremer)

39 39 Ontology Note that terms like ‘fragment’, ‘region’, ‘name’, ‘carry’, ‘trait’, ‘type’ … along with terms like ‘part’, ‘whole’, ‘function’, ‘substance’, ‘inhere’ … are ontological terms in the sense of traditional (philosophical) ontology

40 40 Aristotle First ontologist

41 41 First ontology ( from Porphyry’s Commentary on Aristotle’s Categories)

42 42 Linnaean Ontology

43 43 Formal Ontology term coined by Edmund Husserl = the theory of those ontological structures such as part-whole, universal-particular which apply to all domains whatsoever

44 44 Edmund Husserl

45 45 Husserl outlines a new method of constituent ontology to study a domain ontologically is to establish the parts of the domain and the interrelations between them especially the dependence relations

46 46 Logical Investigations, 1900/01 Aristotelian theory of universals and particulars theory of part and whole theory of ontological dependence the theory of boundaries and fusion

47 47 Formal Ontology contrasted with material or regional ontologies (compare relation between pure and applied mathematics) Husserl’s idea: If we can build a good formal ontology, this should save time and effort in building reference ontologies for each successive domain

48 48 Basic Formal Ontology BFO The Vampire Slayer

49 49 Basic Formal Ontology Aristotelian theory of universals and instances theory of part and whole theory of ontological dependence theory of boundary, continuity and contact theory of states, powers, qualities, roles (SPQR- entities) theory of processes theory of environments/niches/contexts and spatial and spatio-temporal regions

50 50 BFO not just a system of categories but a formal theory with definitions, axioms, theorems designed to provide the resources for reference ontologies for specific domains the latter should be of sufficient richness that terminological incompatibilities can be resolved intelligently rather than by brute force

51 51 Three types of reference ontology 1) formal ontology = framework for rigorous definition of the highly general concepts – such as object, event, whole, part – employed in every domain 2) domain ontology, a top-level system with a few highly general concepts, applies formal ontology to a particular domain, such as genetics or medicine 3) terminology-based ontology, a very large system embracing many concepts and inter- concept relations

52 52 MedO = medical domain ontology including sub-ontologies: cell ontology drug ontology protein ontology gene ontology

53 53 other sub-ontologies anatomical ontology epidemiological ontology disease ontology therapy ontology pathology ontology the whole designed to give structure to the medical domain (currently medical education comparable to stamp-collecting)

54 54 MedO and its various sub-ontologies will inherit the definitions and axioms of BFO but will add new definitions and axioms of their own
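The inheritance relation described on this slide can be pictured with a small Python sketch. It is an analogy only: the `Ontology` class and the axiom strings are hypothetical placeholders invented for illustration, not BFO's or MedO's actual definitions or axioms.

```python
class Ontology:
    """A named bundle of definitions and axioms that may extend a parent ontology."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.definitions = {}  # term -> definition text added at this level
        self.axioms = []       # axioms added at this level

    def all_definitions(self):
        """Inherited definitions plus those added at this level."""
        merged = self.parent.all_definitions() if self.parent else {}
        merged.update(self.definitions)
        return merged

    def all_axioms(self):
        """Inherited axioms first, followed by those added at this level."""
        inherited = self.parent.all_axioms() if self.parent else []
        return inherited + self.axioms


# Placeholder content, invented for illustration only.
bfo = Ontology("BFO")
bfo.definitions["part"] = "placeholder definition of parthood"
bfo.axioms.append("placeholder: parthood is transitive")

medo = Ontology("MedO", parent=bfo)
medo.definitions["cell"] = "placeholder definition of cell"
medo.axioms.append("placeholder: a drug transport system uses a circulatory system")

cell_ontology = Ontology("cell ontology", parent=medo)

print(cell_ontology.all_axioms())
```

A sub-ontology such as the cell ontology then sees everything declared in BFO and MedO without restating it, which is the point of building the formal layer once.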

55 55 Granularity cell ontology drug ontology protein ontology gene ontology imply that we need also a theory of granularity

56 56 Ontology like cartography must work with maps at different scales How fit these maps (conceptual grids) together into a single system? IFOMIS is developing a theory of granular partitions designed to provide a framework within which different maps/views of the same reality can be combined together

57 57

58 58 Part Two Reference Ontology and Situated Computing

59 59 Shimon Edelman’s Riddle of Representation two humans, a monkey, and a robot are looking at a piece of cheese; what is common to the representational processes in their visual systems?

60 60 Answer: The cheese, of course

61 61 Rodney Brooks “Intelligence without Representation” The world itself is our model opposition between the Engineering view and the SMPA View

62 62 SMPA model Sense Model Plan Act the agent first senses its environment through sensors then uses this data to build a model of the world then produces a plan to achieve goals then acts on this plan
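A minimal sketch of the sense, model, plan, act cycle, assuming a toy one-dimensional world in which the agent only needs to reach a target position. The class `SMPAAgent`, the function `run` and the world dictionary are all hypothetical names chosen for illustration; they are not taken from Brooks or from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class SMPAAgent:
    goal: int
    model: dict = field(default_factory=dict)  # the agent's internal world model

    def sense(self, world):
        """Sense: read the environment through a (trivial) sensor."""
        return {"position": world["position"]}

    def update_model(self, reading):
        """Model: fold the sensor reading into the internal representation."""
        self.model.update(reading)

    def plan(self):
        """Plan: decide the next step using only the internal model."""
        position = self.model["position"]
        if position < self.goal:
            return 1
        if position > self.goal:
            return -1
        return 0

    def act(self, world, step):
        """Act: carry out the planned step in the world."""
        world["position"] += step


def run(agent, world, max_cycles=20):
    """Repeat the four phases in order; return the number of acting cycles used."""
    for cycle in range(max_cycles):
        agent.update_model(agent.sense(world))
        step = agent.plan()
        if step == 0:
            return cycle
        agent.act(world, step)
    return max_cycles


world = {"position": 0}
agent = SMPAAgent(goal=3)
cycles = run(agent, world)
print(cycles, world["position"])
```

Note that `plan` never touches `world` directly: every decision goes through the internal model, which is the feature the following slides contrast with Brooks' engineering approach.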

63 63 Proposal SMPA belongs to the same methodological universe as Applications Ontology If we want to build an intelligent agent within this framework, there need to be representations of the domain within which the agent acts which are inside the computer

64 64 Engineering Approach The system embodies a number of distinct layers of activity (compare: faculties of the mind) These layers operate independently and connect directly to the environment outside the system Each layer operates as a complete system that copes in real time with a changing environment Layers evolve through interaction with the environment (artificial insects/vehicles …)

65 65 Brooks’ Engineering Approach lends very little weight to the role of representations or models At the same time it insists that AI should use the world in all its complexity in producing systems that react directly to the world An ontology appropriate for this approach would have to include within its purview both the world and the system, thus be essentially richer than the system alone

66 66 An intelligent system must be situated it is situatedness which gives the processes within each layer meaning meaning exists precisely in the relation to the world, the world serves also to unify the different layers together and to make them compatible

67 67 Organisms, especially humans, fix their beliefs not only in their heads but in their worlds, as they attune themselves differently to different parts of the world as a result of their experience. And they pull the same trick with their memories, not only by rearranging their parsing of the world (their understanding of what they see), but by marking it. They place traces out there which changes what they will be confronted with the next time it comes around. Thus they don't have to carry their memories with them. “Intelligence without Representation”

68 68 Andy Clark, Being There humans can accomplish much without building detailed, internal models; they rely on external scaffolding = maps, models, tools, landmarks, buildings, language, culture … we act so as to simplify cognitive tasks by "leaning on" the structures in our environment.

69 69 Compare the Ecological Psychology of J. J. Gibson To understand human cognition we should study the moving, acting human person as it exists in its real-world environment and taking account of how it has evolved into this real-world environment

70 70 For Gibson we are like (multi-layered) tuning forks – tuned to the environment which surrounds us, and for us human beings this is a social environment which includes traces of prior actions in the form of records and representations

71 71 Gibsonian Ecological View of Information Systems To understand information systems we should study the hardware as it exists embedded in its real-world environment and taking account of the environment for which it was designed and built Information systems are like tuning forks – they resonate in tune to their surrounding environments e.g. through their biological and chemical sensors

72 72 So what is the ontology of blood?

73 73 We cannot solve this problem just by looking at concepts (by engaging in further acts of knowledge mining)

74 74 concept systems may be simply incommensurable

75 75 the problem can only be solved in Brooksian/Gibsonian fashion by taking the world itself into account

76 76 By looking not at concepts, representations, and their semantic models but rather at organisms acting in the world and standing at different levels in a range of different sorts of relations to the world

77 77 We then recognize that the same object can be apprehended at different levels of granularity: at the perceptual level blood is a liquid (?) at the cellular level blood is a tissue
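One way to picture this resolution is to record, for each classification, the level of granularity at which it is made, and to count two classifications as clashing only when they disagree at the same level. The sketch below assumes that the UMLS claim (tissue) belongs to the cellular level and the SNOMED claim (fluid) to the perceptual level; that assignment is an interpretation of slides 13 and 77, and the `CLAIMS` table and `conflicts` function are hypothetical.

```python
from collections import defaultdict

# (source, term, category, granularity level); the level assignments are assumed.
CLAIMS = [
    ("UMLS", "blood", "tissue", "cellular"),
    ("SNOMED", "blood", "fluid", "perceptual"),
]


def conflicts(claims):
    """Return the (term, level) keys at which sources assign different categories."""
    by_key = defaultdict(list)
    for source, term, category, level in claims:
        by_key[(term, level)].append((source, category))
    clashes = []
    for key, entries in by_key.items():
        categories = {category for _, category in entries}
        if len(categories) > 1:
            clashes.append((key, sorted(entries)))
    return clashes


# With levels recorded, the two claims about blood do not collide.
print(conflicts(CLAIMS))

# Without levels, the same claims look like a flat contradiction.
flat = [(s, t, c, "unspecified") for s, t, c, _ in CLAIMS]
print(conflicts(flat))
```

The code does nothing to decide which level a claim belongs to; on the view argued here that step requires looking at the world, and is exactly what cannot be mechanized.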

78 78 This implies a view of ontology not as a theory of concepts but as a theory of reality But how is this possible? How can we get beyond our concepts? answer: ontology must be maximally opportunistic it must relate not to beliefs, concepts, syntactic strings but to the world itself

79 79 “Maximally opportunistic” means: look at concepts and beliefs critically and always in the context of a wider view which includes independent ways to access the objects themselves at different levels of granularity and taking account of tacit knowledge of those features of reality of which the domain experts are not consciously aware

80 80 “Maximally opportunistic” means: look not at what the expert says but at what the expert does Experts have expertise = knowing how Ontologists can have windows on reality, by focusing on categories, and can extract some form of knowing that Gibsonianism: experts don’t know what the ontologist knows

81 81 Ontology must be maximally opportunistic This means: don’t just look at beliefs look at the objects themselves from every possible direction, formal and informal scientific and non-scientific …

82 82 Maximally opportunistic means: look at the same objects at different levels of granularity:

83 83 Second step: select out the good conceptualizations these have a reasonable chance of being integrated together into a single ontological system based on tested principles robust conform to natural science

84 84 Ontology like cartography must work with maps at different scales

85 85 Medical ontologies at different levels of granularity: cell ontology drug ontology protein ontology gene ontology anatomical ontology epidemiological ontology Rigidly hierarchical, modular organization – with many things which can go wrong

86 86 There are many compatible map-like partitions many maps at different scales, all transparent to the reality beyond

87 87 Partitions should be cuts through reality a good medical ontology should NOT be compatible with the conceptualization of disease as: caused by evil spirits and demons and cured by golems

88 88 Three main sorts of partitions 1. substances and their parts 2. qualities/functions/roles 3. processes in addition: spatial regions/niches spatio-temporal regions AS UNIVERSALS, AS PARTICULARS

89 89 1. Substances and their parts Patterned parts (carved out by fiat) chess board football pitch Broca’s Region nervous system

90 90 2. Functions function of a screwdriver tied to processes = generalized four-dimensional shapes (carved out by fiat) contextual dependence function of the heart function of the circulatory system

91 91 Generalized 4-dimensional shapes as UNIVERSALS as PARTICULARS

92 92 Once we understand functions we can also understand malfunctions: broken screwdriver defective heart

93 93 Application to Bodily Systems Immune system, digestive system … are complex substances paradigm: skeleton carved out by fiat from the whole organism in terms of their functions engaging in specific types of processes

94 94 Multi-layered systems How one system can use another system to exercise its function Drug transport system uses circulatory system (Layered Mereotopology of substances of processes)

95 95 Part 3

96 96 Testing the BFO/MedO approach within a software environment for NLP of unstructured patient records collaborating with Language and Computing nv (www.landc.be)

97 97 L&C LinKBase®: world’s largest terminology-based ontology incorporating UMLS, SNOMED, etc. + LinKFactory®: suite for developing and managing large terminology-based ontologies

98 98 LinKBase BFO and MedO designed to add depth, and so also reasoning capacity by tagging LinKBase terms with corresponding BFO/MedO categories ???

