BioHealth Informatics Group Ontology tutorial, © 2005 Univ. of Manchester. Formal Modelling. Alan Rector.


1 Formal Modelling, Alan Rector

2 The problem
Knowledge is Big, Fractal & Changeable!
But we want to
►Build ontologies cooperatively with different groups
►Extend ontologies smoothly
►Re-use pieces of ontologies
►Build new ontologies on top of old
►Quit starting from scratch

3 Assertion:
►Let the ontology authors
  ►create discrete modules
  ►describe the links between modules
►Let the logic reasoner
  ►organise the result
The arrival of logic-based ontologies/OWL gives new opportunities to make ontologies more manageable and modular

4 When to use a classifier
1. At author time: As a compiler – Pre-coordination
  ►Ontologies will be delivered as “pre-coordinated” ontologies to be used without a reasoner
  ►To make extensions and additions quick, easy, and responsive, distribute developments, empower users to make changes
  ►Part of an ontology life cycle
2. At delivery time: As a service – Post-coordination
  ►Many fixed ontologies are too big and too small
    ►Too big to find things; too small to contain what you need
  ►Create them on the fly
  ►Part of an ontology service
3. At application time: As a reasoner – Post-coordination & inference
  ►Decision support, query optimisation, schema integration, …, …, …
  ►Part of a reasoning service

5 When to use a classifier 1: Pre-coordinated delivery (classifier as compiler)
►The life cycle
  ►Gather requirements, sketch, experiment
  ►Establish patterns – design a “language”
    ►Criteria for success: What a subject domain expert can learn in a few days
  ►Bulk authoring
  ►Classification
  ►Quality assurance
  ►Commit classifier results to a pre-coordinated ontology & deliver
    ►Polyhierarchies (Protégé, DAG-Edit, OWL-Lite, RDF(S), Topic Maps, …)
  ►Query and use with your favourite tool
[Diagram label: Development & evolution]

6 Commit Results to a Pre-Coordinated Ontology
Assert (“Commit”) changes inferred by classifier

7 Precoordinated Ontologies
►Author
►Classify
►Export to OWL-Lite or RDF
►Query with RDQL, SPARQL, or your favourite RDF/RDF(S) query tool
►Use as a vocabulary in your database

8 Content
[Diagram. Classes: Patient, Disease, Surgical Procedure. Properties: has diagnosis, has treatment, has complication.]

9 Content
[Diagram. Classes: Disease, Surgical Procedure, Excision, Melanoma, Infection. Individuals: Mrs Smith, aMelanoma, anExcision, anInfection. Properties: has diagnosis, has treatment, has complication.]

10 When to use a classifier 2: Post-Coordination
►When the ontology is too big – “Lazy classification” on demand
Big on the outside: small kernel on the inside
[Diagram labels: kernel model; Externally available resource; API]

11 Often combined with other services: Example - the GALEN Server
[Diagram labels: Service API; Multilingual Lexicons; Multilingual Module; Ontologies & Classifier; Ontology Module; Resource References; External Resources & “Coding” Module; Client Applications; Users; Ontology Services Client; Run time classifier]

12 Logic-based Ontologies: Conceptual Lego
[Diagram of building-block terms: hand, extremity, body, acute, chronic, abnormal, normal, ischaemic, deletion, bacterial, polymorphism, cell, protein, gene, infection, inflammation, Lung, expression]

13 Logic-based Ontologies: Conceptual Lego
“SNPolymorphism of CFTRGene causing Defect in MembraneTransport of Chloride Ion causing Increase in Viscosity of Mucus in CysticFibrosis …”
“Hand which is anatomically normal”

14 Logical Constructs build complex concepts from modularised primitives
Primitive modules: Genes, Species, Protein, Function, Disease
Composed concepts:
►Gene in humans
►Protein coded by gene in humans
►Function of Protein coded by gene in humans
►Disease caused by abnormality in Function of Protein coded by gene in humans

15 Normalising (untangling) Ontologies
[Diagram labels, shown twice: Structure, Function, Part-whole]

16 Rationale for Normalisation
►Explicit distinctions between modules
►Primitives are opaque to the reasoner
►Information implicit in primitive names cannot contribute to modularisation
►Primitives are indivisible to both human and reasoner
►Each primitive should represent a single notion
►Therefore, each primitive must belong to exactly one module
►If a primitive belongs to two modules, they are not modular.
►If a primitive belongs to two modules, it probably conflates two notions
►Therefore concentrate on the “primitive skeleton” of the domain ontology

17 Normalisation and Untangling
Let the reasoner do multiple classification
►Tree
  ►Everything has just one parent
  ►A ‘strict hierarchy’
►Directed Acyclic Graph (DAG)
  ►Things can have multiple parents
  ►A ‘Polyhierarchy’
►Normalisation
  ►Separate primitives into disjoint trees
  ►Link the trees with restrictions
  ►Fill in the values
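The three normalisation steps on this slide can be sketched in a few lines of Python. This is a toy illustration, not a description logic reasoner: the class names, the single `eats` property and the `inferred_parents` helper are all assumptions made for the example, and each restriction is simplified to a single filler.

```python
# Toy sketch of normalisation: disjoint primitive trees, linked by
# restrictions, with a naive "classifier" that infers the extra parents.
# All names here are hypothetical.

# Step 1: primitive skeleton. Each primitive has exactly one asserted parent.
PARENT = {
    "Plant": None,
    "Grass": "Plant",
    "Animal": None,
    "Cow": "Animal",
    "Lion": "Animal",
}

# Step 2: restrictions linking the trees (class -> filler of "eats").
EATS = {"Cow": "Grass", "Lion": "Cow"}

# Step 3: defined classes as (primitive parent, required kind of food).
DEFINED = {
    "Herbivore": ("Animal", "Plant"),
    "Carnivore": ("Animal", "Animal"),
}


def ancestors(cls):
    """Return cls followed by all of its asserted primitive ancestors."""
    out = []
    while cls is not None:
        out.append(cls)
        cls = PARENT[cls]
    return out


def inferred_parents(cls):
    """Asserted parent plus every defined class whose definition cls meets."""
    parents = set()
    if PARENT[cls] is not None:
        parents.add(PARENT[cls])
    for name, (base, food) in DEFINED.items():
        if base in ancestors(cls) and cls in EATS and food in ancestors(EATS[cls]):
            parents.add(name)
    return parents
```

Here `Cow` is asserted only under `Animal`, yet `inferred_parents("Cow")` also places it under `Herbivore`. The asserted structure stays a tree and the polyhierarchy is computed.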

18 Tables are easier to manage than DAGs / Polyhierarchies
…and get the benefit of inference: Grass and Leafy_plants are both kinds of Plant

19 Remember to add any closure axioms
[Screenshot label: Closure Axiom]
Then let the reasoner do the work

20 Normalisation: From Trees to DAGs
Before classification: a tree
After classification: a DAG (Directed Acyclic Graph)

21 Normalisation: Criterion 1
The skeleton should consist of disjoint trees
►Every primitive concept should have exactly one primitive parent
►All multiple hierarchies are the result of inference by the reasoner
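Criterion 1 lends itself to a mechanical check. The sketch below assumes the asserted subclass links between primitives are available as `(child, parent)` pairs; the function name `multi_parent_primitives` and the sample data are hypothetical.

```python
from collections import defaultdict


def multi_parent_primitives(edges):
    """Return primitives that have more than one asserted primitive parent.

    edges: iterable of (child, parent) pairs asserted between primitives.
    An empty result means this part of Criterion 1 holds for the skeleton.
    """
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
    return sorted(child for child, ps in parents.items() if len(ps) > 1)


tree_edges = [("Grass", "Plant"), ("Cow", "Animal")]
tangled_edges = tree_edges + [("Cow", "Pet")]
```

`multi_parent_primitives(tree_edges)` is empty, while the tangled variant reports `Cow`, whose second parent should instead be inferred from a definition.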

22 Normalisation Criterion 2: No hidden changes of meaning
►Each branch should be homogeneous and logical (“Aristotelian”)
  ►Hierarchical principle should be subsumption
  ►Otherwise we are “lying to the logic”
►The criteria for differentiation should follow consistent principles in each branch, e.g. structure XOR function XOR cause

23 A Non-homogeneous taxonomy
“On those remote pages it is written that animals are divided into:
a. those that belong to the Emperor
b. embalmed ones
c. those that are trained
d. suckling pigs
e. mermaids
f. fabulous ones
g. stray dogs
h. those that are included in this classification
i. those that tremble as if they were mad
j. innumerable ones
k. those drawn with a very fine camel's hair brush
l. others
m. those that have just broken a flower vase
n. those that resemble flies from a distance”
From The Celestial Emporium of Benevolent Knowledge, Borges

24 Normalisation Criterion 3: Distinguish “Self-standing” and “Refining” Concepts
►Self-standing concepts
  ►Roughly Welty & Guarino’s “sortals”
  ►person, idea, plant, committee, belief,…
►Refining concepts – depend on self-standing concepts
  ►mild|moderate|severe, hot|cold, left|right,…
  ►Roughly Welty & Guarino’s non-sortals
  ►Closely related to Smith’s “fiat partitions”
  ►Usefully thought of as Value Types by engineers
►For us an engineering distinction…

25 Normalisation Criterion 3a: Self-standing primitives should be globally disjoint & open
►Primitives are disjoint
  ►If primitives overlap, the overlap conceals implicit information
►A list of self-standing primitives can never be guaranteed complete
  ►How many kinds of person? of plant? of committee? of belief?
  ►Can’t infer: Parent & ¬sub_1 & … & ¬sub_{n-1} → sub_n
►Heuristic:
  ►Diagnosis by exclusion about self-standing concepts should NOT be part of ‘standard’ ontological reasoning

26 Normalisation Criterion 3b: Refining primitives should be locally disjoint & closed
►Individual values must be disjoint
  ►but can be hierarchical
  ►e.g. “very hot”, “moderately severe”
►Each list can be guaranteed to be complete
  ►Can infer: Parent & ¬sub_1 & … & ¬sub_{n-1} → sub_n
►“Value partitions” themselves need not be disjoint
  ►“being hot” is not disjoint from “being severe”
  ►Allowing Valuetypes to overlap is a useful trick, e.g.
  ►restriction has_state someValuesFrom (severe and hot)
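The contrast between Criteria 3a and 3b comes down to whether inference by exclusion is licensed. A minimal sketch follows, assuming a list of known subclasses and a flag saying whether that list is closed; `infer_by_exclusion` is a hypothetical helper, not part of any reasoner API.

```python
def infer_by_exclusion(known_subclasses, ruled_out, closed):
    """Return the one remaining subclass if exclusion is licensed, else None.

    closed=True models a value partition whose list is complete (3b).
    closed=False models self-standing primitives, where an unlisted
    subclass may always exist, so nothing can be concluded (3a).
    """
    remaining = set(known_subclasses) - set(ruled_out)
    if closed and len(remaining) == 1:
        return next(iter(remaining))
    return None


severity = ["mild", "moderate", "severe"]
plants = ["Grass", "Tree", "Shrub"]
```

Ruling out `mild` and `moderate` from the closed `severity` list yields `severe`. Ruling out `Grass` and `Tree` from the open `plants` list yields nothing, because other kinds of plant may exist.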

27 Normalisation Criterion 4: Axioms
►No axiom should denormalise the ontology
  ►No axiom should imply that a primitive is part of more than one branch of primitive skeleton
  ►If all primitives are disjoint, any such axioms will make that primitive unsatisfiable
►A partial test for normalisation:
  ►Create random conjunctions of primitives which do not subsume each other.
  ►If any are satisfiable, the ontology is not normalised
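The partial test can be approximated without a reasoner on a toy skeleton. The sketch below makes two loud assumptions: it checks every pair exhaustively rather than sampling random conjunctions, and it treats a conjunction as satisfiable whenever no asserted disjointness axiom between the two classes or their ancestors rules it out. A real test would ask a classifier. All names are hypothetical.

```python
import itertools


def ancestors(parent, cls):
    """Return cls followed by all of its asserted ancestors."""
    out = []
    while cls is not None:
        out.append(cls)
        cls = parent[cls]
    return out


def provably_disjoint(parent, disjoint_pairs, a, b):
    """True if some ancestor of a is asserted disjoint from some ancestor of b."""
    return any(
        frozenset((x, y)) in disjoint_pairs
        for x in ancestors(parent, a)
        for y in ancestors(parent, b)
    )


def unnormalised_pairs(parent, disjoint_pairs):
    """Pairs that neither subsume each other nor are provably disjoint."""
    suspects = []
    for a, b in itertools.combinations(sorted(parent), 2):
        related = a in ancestors(parent, b) or b in ancestors(parent, a)
        if not related and not provably_disjoint(parent, disjoint_pairs, a, b):
            suspects.append((a, b))
    return suspects


SKELETON = {"Plant": None, "Grass": "Plant", "Animal": None, "Cow": "Animal"}
DISJOINT = {frozenset(("Plant", "Animal"))}
```

With this skeleton the function finds no suspect pairs. Adding a new root such as `Pet` without a disjointness axiom makes every conjunction involving `Pet` a suspect, which is the signal the slide describes.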

28 Consequences 1
►All self-standing primitives are disjoint
►All multiple classification is inferred
►For any two primitive self-standing classes, either one subsumes the other or they are disjoint
►Every self-standing concept is part of exactly one primitive branch of the skeleton
►Every self-standing concept has exactly one most specific primitive ancestor

29 Consequences 2
►Primitives introduced by a conjunction of one class and a boolean combination of zero or more restrictions
  ►Tree subclass-of Plant and restriction isMadeOf someValuesFrom Wood
  ►Resort subclass-of Accommodation and restriction isIntendedFor someValuesFrom Holidays

30 Consequences 3
►Use of axioms limited (outside of skeleton construction)
The following are a safe but not exhaustive guide:
►The right side of subclass axioms limited to restrictions
►Both sides of disjointness axioms limited to restrictions
►No equivalence axioms with primitives on either side

31 A real example: Build a simple tree, easy to maintain

32 Let the classifier organise it

33 If you want more abstractions, just add new definitions (re-use existing data)
“Diseases linked to abnormal proteins”

34 And let the classifier work again

35 And again – even for a quite different category
“Diseases linked to genes described in the mouse”

36 And let the classifier check consistency
(My first try wasn’t consistent)

37 Summary: Why use a Classifier?
►To compose concepts
  ►Allow conceptual lego
►To manage polyhierarchies
  ►Adding abstractions (“axes”) as needed
  ►Normalisation
    ►Untangling
    ►labelling of “kinds of is-a”
►To avoid combinatorial explosions
  ►Keep bicycles from exploding
►To manage context
  ►Cross species, Cross disciplines, Cross studies
►To check consistency and help users find errors

38 Limitations on OWL (1)
►Ontologies are not databases
  ►About classes / schemas
  ►Individuals only to define classes
    ►And that poorly supported by classifiers
  ►Open World
  ►Negation as impossibility
►Databases are not ontologies
  ►About individuals
  ►Classes only as part of schemas
  ►Closed world
  ►Negation as failure
►Variants
  ►Instance store
  ►RDF Triple Store
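The two readings of negation on this slide can be contrasted in a small sketch. The function names and the fact sets are hypothetical; `impossible` stands in for whatever a reasoner could prove to be contradictory.

```python
def closed_world_not(facts, statement):
    """Negation as failure: anything not recorded is taken to be false."""
    return statement not in facts


def open_world_not(facts, impossible, statement):
    """Negation as impossibility: True only if the statement is ruled out.

    Returns True (provably false), False (known true) or None (unknown).
    """
    if statement in impossible:
        return True
    if statement in facts:
        return False
    return None


facts = {("MrsSmith", "has_diagnosis", "Melanoma")}
impossible = {("MrsSmith", "has_diagnosis", "Excision")}
query = ("MrsSmith", "has_diagnosis", "Infection")
```

For the unrecorded `query`, the database-style function answers that the negation holds, while the ontology-style function answers `None`: nothing says Mrs Smith has an infection, and nothing rules it out.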

39 Limitations on OWL
►Expressivity
  ►Binary relational
  ►only two variables
  ►no way of saying equal
  ►No way to express “A person who teaches themselves” or “A play directed by its producer”
►Basic patterns
  ►Supported
    ►All – Some, All – Only
  ►Some support
    ►at least n; at most n
    ►Qualified cardinality restrictions
  ►No support
    ►Some-All, Some-Some
    ►All-All*
    ►“Generally, but subject to exceptions”

40 Limitations on OWL
►Metadata and Annotations
  ►OWL statements are always about “all members of class C…”
  ►Statements about Class C itself or the implementation of Class C are metadata or higher order data
  ►Confused and not well standardised
  ►Annotation properties in OWL DL are arbitrarily restricted
  ►Choice between annotations and OWL-Full around an OWL DL core is unclear
►Experience and use cases needed
  ►Need experience leading to standards – at least for UK community
  ►One function potentially growing out of this group

41 Classifier limitations
►Numbers and strings
  ►Simple concrete data types in spec
  ►User defined XML data types enmeshed in standards disputes
  ►No standard classifier deals with numeric ranges
    ►Although several experimental ones do
►is-part-of and has-part
  ►Totally doubly-linked structures scale horridly
►Handling of individuals
  ►Variable with different classifiers
  ►oneOf works badly with all classifiers at the moment

42 Axiom Schemas and Tools
►Many things that would need extra variables can be expressed as axiom schemas
  ►All right body parts can only be part of right or midline body parts
  ►All left body parts can only be part of left or midline body parts
  ►All X-sided body parts can only be part of X-sided or midline body parts.
►For small numbers can do by hand
►For large numbers need tool support
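The tool support mentioned on this slide can be as simple as template expansion. In the sketch below, the schema variable X is instantiated once per side. The function name and the plain-English output format are assumptions; a real tool would emit OWL axioms rather than sentences.

```python
def expand_sidedness_schema(sides=("left", "right")):
    """Instantiate the X-sided axiom schema once for each value of X."""
    template = "All {x} body parts can only be part of {x} or midline body parts"
    return [template.format(x=side) for side in sides]


axioms = expand_sidedness_schema()
```

Each entry in `axioms` corresponds to one hand-written axiom from the slide, so extending the list of sides extends the axiom set without new authoring.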

