© University of Manchester 1 Ontology Normalisation, Pre- and Post- Coordination Alan Rector & CO-ODE/NIBHI University of Manchester


1 © University of Manchester 1 Ontology Normalisation, Pre- and Post- Coordination Alan Rector & CO-ODE/NIBHI University of Manchester Rector@cs.man.ac.uk OpenGALEN BioHealth Informatics Group © University of Manchester

2 2 When to use a classifier 1. At author time: As a compiler – Pre-coordination ►Ontologies will be delivered as “pre-coordinated” ontologies to be used without a reasoner ►To make extensions and additions quick, easy, and responsive, distribute developments, empower users to make changes ►Part of an ontology life cycle 2. At delivery time: As a service – Post-coordination ►Many fixed ontologies are too big and too small ►Too big to find things; too small to contain what you need ►Create them on the fly ►Part of an ontology service 3. At application time: As a reasoner – Post-coordination & inference ►Decision support, query optimisation, schema integration, … ►Part of a reasoning service

3 © University of Manchester 3 When to use a classifier 1: Pre-coordinated delivery classifier as compiler ►The life cycle ►Gather requirements, sketch, experiment ►Establish patterns – design a “language” ►Criteria for success: What a subject domain expert can learn in a few days ►Bulk authoring ►Classification ►Quality assurance ►Commit classifier results to a pre-coordinated ontology & deliver ►Polyhierarchies (Protégé, DAG-Edit, OWL-Lite, RDF(S), Topic Maps, …) ►Query and use with your favourite tool Development & evolution

4 © University of Manchester 4 Commit Results to a Pre-Coordinated Ontology Assert (“Commit”) changes inferred by classifier

5 © University of Manchester 5 Pre-coordinated Ontologies ►Author ►Classify ►Export to OWL-Lite or RDF ►Query with RDQL, SPARQL, or your favourite RDF/RDF(S) query tool ►Use as a vocabulary in your database

6 © University of Manchester 6 The problem But we want to ►Build ontologies cooperatively with different groups ►Extend ontologies smoothly ►Re-use pieces of ontologies ►Build new ontologies on top of old ►Quit starting from scratch Knowledge is Big, Fractal & Changeable!

7 © University of Manchester 7 Assertion: ►Let the ontology authors ►create discrete modules ►describe the links between modules ►Let the logic reasoner ►Organise the result The arrival of logic-based ontologies/OWL gives new opportunities to make ontologies more manageable and modular

8 © University of Manchester 8 Logic-based Ontologies: Conceptual Lego hand extremity body acute chronic abnormal normal ischaemic deletion bacterial polymorphism cell protein gene infection inflammation Lung expression

9 © University of Manchester 9 Logic-based Ontologies: Conceptual Lego “ SNPolymorphism of CFTRGene causing Defect in MembraneTransport of Chloride Ion causing Increase in Viscosity of Mucus in CysticFibrosis …” “Hand which is anatomically normal”

10 © University of Manchester 10 Logical Constructs build complex concepts from modularised primitives Genes Species Protein Function Disease Protein coded by gene in humans Function of Protein coded by gene in humans Disease caused by abnormality in Function of Protein coded by gene in humans Gene in humans

11 © University of Manchester 11 Normalising (untangling) Ontologies Structure Function Part-whole Structure Function Part-whole

12 © University of Manchester 12 Rationale for Normalisation ►Explicit distinctions between modules ►Primitives are opaque to the reasoner ►Information implicit in primitive names cannot contribute to modularisation ►Primitives are indivisible to both human and reasoner ►Each primitive should represent a single notion ►Therefore, each primitive must belong to exactly one module ►If a primitive belongs to two modules, they are not modular. ►If a primitive belongs to two modules, it probably conflates two notions ►Therefore concentrate on the “primitive skeleton” of the domain ontology

13 © University of Manchester 13 Normalisation and Untangling Let the reasoner do multiple classification ►Tree ►Everything has just one parent ►A ‘strict hierarchy’ ►Directed Acyclic Graph (DAG) ►Things can have multiple parents ►A ‘Polyhierarchy’ ►Normalisation ►Separate primitives into disjoint trees ►Link the trees with restrictions ►Fill in the values

14 © University of Manchester 14 Developing a normalised ontology from scratch ►The goal for the example ontology is to build an ontology of animals to index a children’s book of animals

15 © University of Manchester 15 Steps in developing an Ontology 1. Establish the purpose ►Without purpose, no scope, requirements, or evaluation 2. Informal/Semiformal knowledge elicitation ►Collect the terms ►Organise terms informally ►Paraphrase and clarify terms to produce informal concept definitions ►Diagram informally 3. Refine requirements & tests

16 © University of Manchester 16 Steps in implementing an Ontology 4.Implementation ►Paraphrase and comment at each stage before implementing ►Develop normalised schema and skeleton ►Implement prototype recording the intention as a paraphrase ►Keep track of what you meant to do so you can compare with what happens ►Implementing logic-based ontologies is programming ►Scale up a bit ►Check performance ►Populate ►Possibly with help of text mining and language technology 5.Evaluate & quality assure ►Against goals ►Include tests for evolution and change management ►Design regression tests and “probes” 6.Monitor use and evolve ►Process not product!

17 © University of Manchester 17 Purpose & scope of the animals ontology ►To provide an ontology for an index of a children’s book of animals including ►Where they live ►What they eat ►Carnivores, herbivores and omnivores ►How dangerous they are ►How big they are ►A bit of basic anatomy ►numbers of legs, wings, toes, etc.

18 © University of Manchester 18 Collect the concepts ►Card sorting is often the best way ►Write down each concept/idea on a card ►Organise them into piles ►Link the piles together ►Do it again, and again ►Works best in a small group ►In the lab we will provide you with some pre-printed cards and many spare cards ►Work in pairs or triples

19 © University of Manchester 19 Example: Animals & Plants ►Dog ►Cat ►Cow ►Person ►Tree ►Grass ►Herbivore ►Male ►Female ►Dangerous ►Pet ►Domestic Animal ►Farm animal ►Draft animal ►Food animal ►Fish ►Carp ►Goldfish ►Carnivore ►Plant ►Animal ►Fur ►Child ►Parent ►Mother ►Father

20 © University of Manchester 20 Organise the concepts Example: Animals & Plants ►Dog ►Cat ►Cow ►Person ►Tree ►Grass ►Herbivore ►Male ►Female ►Healthy ►Pet ►Domestic Animal ►Farm animal ►Draft animal ►Food animal ►Fish ►Carp ►Goldfish ►Carnivore ►Plant ►Animal ►Fur ►Child ►Parent ►Mother ►Father

21 © University of Manchester 21 Extend the concepts “Laddering” ►Take a group of things and ask what they have in common ►Then what other ‘siblings’ there might be ►e.g. ►Plant, Animal → Living Thing ►Might add Bacteria and Fungi but not now ►Cat, Dog, Cow, Person → Mammal ►Others might be Goat, Sheep, Horse, Rabbit, … ►Cow, Goat, Sheep, Horse → Hoofed animal (“Ungulate”) ►What others are there? Do they divide amongst themselves? ►Wild, Domestic → Domestication ►What other states – “Feral” (domestic returned to wild)

22 © University of Manchester 22 Choose some main axes ►Add abstractions where needed ►e.g. “Living thing” ►identify relations ►e.g. “eats”, “owns”, “parent of” ► Identify definable things ►e.g. “child”, “parent”, “Mother”, “Father” ►Things where you can say clearly what it means ►Try to define a dog precisely – very difficult ►A “natural kind” ► make names explicit

23 © University of Manchester 23 Choose some main axes Add abstractions where needed; identify relations; identify definable things; make names explicit ►Living Thing ►Animal ►Mammal ►Cat ►Dog ►Cow ►Person ►Fish ►Carp ►Goldfish ►Plant ►Tree ►Grass ►Fruit ►Modifiers ►domestic ►pet ►Farmed ►Draft ►Food ►Wild ►Health ►healthy ►sick ►Sex ►Male ►Female ►Age ►Adult ►Child Definable Carnivore Herbivore Child Parent Mother Father Food Animal Draft Animal Relations eats owns parent-of …

24 © University of Manchester 24 Self_standing_entities ►Things that can exist on their own ►People, animals, houses, actions, processes, … ►Roughly nouns ►Modifiers ►Things that modify (“inhere in”) other things ►Roughly adjectives and adverbs

25 © University of Manchester 25 Reorganise everything but “definable” things into pure trees – these will be the “primitives” ►Self_standing ►Living Thing ►Animal ►Mammal ►Cat ►Dog ►Cow ►Person ►Pig ►Fish ►Carp ►Goldfish ►Plant ►Tree ►Grass ►Fruit ►Modifiers ►Domestication ►Domestic ►Wild ►Use ►Draft ►Food ►pet ►Risk ►Dangerous ►Safe ►Sex ►Male ►Female ►Age ►Adult ►Child Definables Carnivore Herbivore Child Parent Mother Father Food Animal Draft Animal Relations eats owns parent-of …

26 © University of Manchester 26 If anything needs clarifying, add a text comment ►Self_standing ►Living Thing ►Animal ►Mammal ►Cat ►Dog ►Cow ►Person ►Pig ►Fish ►Carp ►Goldfish ►Plant ►Tree ►Grass ►Fruit Comment on Living Thing: “The abstract ancestor concept including all living things – restrict to plants and animals for now”

27 © University of Manchester 27 Identify the domain and range constraints for properties ►Animal eats Living_thing ►eats domain: Animal; range: Living_thing ►Person owns Living_thing except person ►owns domain: Person; range: Living_thing & not Person ►Living_thing parent_of Living_thing ►parent_of domain: Animal; range: Animal
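
The domain and range statements on this slide can be sketched as simple checks. This is a minimal illustration in Python, not OWL: the `PARENT` table and the helpers `is_a` and `fits` are hypothetical names, and the small hierarchy is assumed for the example. An OWL reasoner treats domain and range as axioms to infer from rather than as constraints to validate, so the checking behaviour here is a simplification.

```python
# Assumed primitive hierarchy for illustration: child -> single parent.
PARENT = {
    "Animal": "Living_thing", "Plant": "Living_thing",
    "Person": "Animal", "Cow": "Animal", "Dog": "Animal",
    "Grass": "Plant",
}

def is_a(cls, ancestor):
    """True if cls equals ancestor or lies below it in the tree."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT.get(cls)
    return False

# Each property maps to a (domain test, range test) pair taken from the slide.
PROPERTIES = {
    "eats": (lambda c: is_a(c, "Animal"),
             lambda c: is_a(c, "Living_thing")),
    "owns": (lambda c: is_a(c, "Person"),
             lambda c: is_a(c, "Living_thing") and not is_a(c, "Person")),
    "parent_of": (lambda c: is_a(c, "Animal"),
                  lambda c: is_a(c, "Animal")),
}

def fits(subject, prop, obj):
    """Check a class-level statement against the property's domain and range."""
    domain_ok, range_ok = PROPERTIES[prop]
    return domain_ok(subject) and range_ok(obj)
```

For example, `fits("Person", "owns", "Dog")` passes, while `fits("Person", "owns", "Person")` is rejected by the "& not Person" part of the range.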

28 © University of Manchester 28 If anything is used in a special way, add a text comment ►Animal eats Living_thing ►eats domain: Animal; range: Living_thing – ignore the difference between living things, parts of living things, and things derived from living things

29 © University of Manchester 29 For definable things ►Paraphrase and formalise the definitions in terms of the primitives, relations and other definables. ►Note any assumptions to be represented elsewhere. ►Add as comments when implementing ►“A ‘Parent’ is an animal that is the parent of some other animal” (Ignore plants for now) ►Parent = Animal and parent_of some Animal ►“A ‘Herbivore’ is an animal that eats only plants” (NB All animals eat some living thing) ►Herbivore= Animal and eats only Plant ►“An ‘omnivore’ is an animal that eats both plants and animals” ► Omnivore= Animal and eats some Animal and eats some Plant
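
To see how “some” and “only” behave in these definitions, they can be read as `any` and `all` over a small set of individuals. The sketch below is an assumption-laden illustration: the individuals, the `KIND`, `EATS` and `PARENT_OF` tables, and the helper names are all invented for the example. It evaluates the definitions against one fixed set of facts, which is not the same as the open-world reasoning a classifier performs.

```python
# Hypothetical individuals and their primitive kind (assumed data).
KIND = {
    "daisy": "Animal", "rex": "Animal", "porky": "Animal",
    "bunny": "Animal", "calf": "Animal",
    "grass1": "Plant",
}
EATS = {
    "daisy": {"grass1"},
    "rex": {"bunny"},
    "porky": {"grass1", "bunny"},
    "bunny": {"grass1"},
}
PARENT_OF = {"daisy": {"calf"}}

def some(x, relation, cls):
    """'relation some cls': at least one filler of x belongs to cls."""
    return any(KIND[y] == cls for y in relation.get(x, ()))

def only(x, relation, cls):
    """'relation only cls': every filler of x belongs to cls (true if there are none)."""
    return all(KIND[y] == cls for y in relation.get(x, ()))

def is_parent(x):
    # Parent = Animal and parent_of some Animal
    return KIND[x] == "Animal" and some(x, PARENT_OF, "Animal")

def is_herbivore(x):
    # Herbivore = Animal and eats only Plant
    return KIND[x] == "Animal" and only(x, EATS, "Plant")

def is_omnivore(x):
    # Omnivore = Animal and eats some Animal and eats some Plant
    return (KIND[x] == "Animal"
            and some(x, EATS, "Animal")
            and some(x, EATS, "Plant"))
```

Note that `calf`, which has no recorded food at all, satisfies “eats only Plant” vacuously. This is why the slide adds the note that all animals eat some living thing.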

30 © University of Manchester 30 Paraphrases and Comments ►Write down the paraphrases and put them in the comment space. ►We can show you how to make the comment space bigger to make it easier. ►Without a paraphrase, we cannot tell if we disagree on ►What you meant to represent ►How you represented it ►Without a paraphrase we will mark down by at least half and give no partial credit ►We will try to understand what you are doing, but we cannot read your minds.

31 © University of Manchester 31 Which properties can be filled in at the class level now? ►What can we say about all members of a class? ►eats ►All cows eat some plants ►All cats eat some animals ►All pigs eat some animals & eat some plants

32 © University of Manchester 32 Fill in the details (can use property matrix wizard)

33 © University of Manchester 33 Check with classifier ►Cows should be Herbivores ►Are they? why not? ►What have we said? ►Cows are animals and, amongst other things, eat some grass and eat some leafy_plants ►What do we need to say: Closure axiom ►Cows are animals and, amongst other things, eat some plants and eat only plants ►(See “Vegetarian Pizzas” in OWL tutorial)

34 © University of Manchester 34 Closure Axiom ►Cows are animals and, amongst other things, eat some plants and eat only plants Closure Axiom
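
The effect of the closure axiom can be sketched with a very small structural check. This is a hedged illustration, not a description-logic reasoner: the dictionary layout (`parent`, `eats_some`, `eats_only`) and the function names are assumptions made for the example.

```python
# Assumed primitive tree: child -> single parent.
PARENT = {
    "Plant": "Living_thing", "Animal": "Living_thing",
    "Grass": "Plant", "Leafy_plant": "Plant",
    "Cow": "Animal",
}

def is_a(cls, ancestor):
    """True if cls equals ancestor or lies below it in the tree."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT.get(cls)
    return False

def is_herbivore(description):
    """Herbivore = Animal and eats only Plant.

    The 'some' restrictions are deliberately not consulted: knowing what a
    class eats says nothing about what else it might eat.
    """
    closure = description["eats_only"]
    return (is_a(description["parent"], "Animal")
            and closure is not None
            and is_a(closure, "Plant"))

# What was said first: only existential ("some") restrictions.
cow_without_closure = {
    "parent": "Animal",
    "eats_some": ["Grass", "Leafy_plant"],
    "eats_only": None,
}
# What needs to be said: the same, plus the closure axiom "eats only Plant".
cow_with_closure = dict(cow_without_closure, eats_only="Plant")
```

Only `cow_with_closure` is recognised as a herbivore. Without the closure axiom the check cannot rule out that cows also eat something that is not a plant.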

35 © University of Manchester 35 Open vs Closed World reasoning ►Open world reasoning ►Negation as contradiction ►Anything might be true unless it can be proven false ►Reasoning about any world consistent with this one ►Closed world reasoning ►Negation as failure ►Anything that cannot be found is false ►Reasoning about this world ►Ontologies are not databases
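
The difference between the two styles of negation can be illustrated with a toy fact store. The sketch assumes facts are stored as triples and that some facts have been explicitly proven false; the names `KNOWN_TRUE`, `KNOWN_FALSE`, `closed_world` and `open_world` are hypothetical.

```python
# Assumed toy knowledge: triples known to hold and triples proven not to hold.
KNOWN_TRUE = {("Cow", "eats", "Grass")}
KNOWN_FALSE = {("Cow", "eats", "Animal")}

def closed_world(fact):
    """Negation as failure: anything that cannot be found is false."""
    return fact in KNOWN_TRUE

def open_world(fact):
    """Negation as contradiction: False only if proven false, else unknown (None)."""
    if fact in KNOWN_TRUE:
        return True
    if fact in KNOWN_FALSE:
        return False
    return None
```

Asking about `("Cow", "eats", "Leafy_plant")` gives `False` under the closed-world reading but `None` (unknown) under the open-world reading, which is the behaviour to expect from an ontology reasoner as opposed to a database query.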

36 © University of Manchester 36 Tables are easier to manage than DAGs / Polyhierarchies …and get the benefit of inference: Grass and Leafy_plants are both kinds of Plant

37 © University of Manchester 37 Remember to add any closure axioms Closure Axiom Then let the reasoner do the work

38 © University of Manchester 38 Normalisation: From Trees to DAGs Before classification A tree After classification A DAG Directed Acyclic Graph

39 © University of Manchester 39 Untangling and Enrichment Using a classifier to make life easier

Tangled hierarchy:
Substance
- Protein
- - ProteinHormone
- - - Insulin
- Steroid
- - SteroidHormone
- - - Cortisol
- Hormone
- - ProteinHormone
- - - Insulin
- - SteroidHormone
- - - Cortisol
- Catalyst
- - Enzyme
- - - ATPase

Untangled primitive trees:
- PhysiologicRole
- - HormoneRole
- - CatalystRole
- Substance
- - Protein
- - - Insulin
- - - ATPase
- - Steroid
- - - Cortisol

Linking definitions:
Hormone = Substance & playsRole someValuesFrom HormoneRole
ProteinHormone = Protein & playsRole someValuesFrom HormoneRole
SteroidHormone = Steroid & playsRole someValuesFrom HormoneRole
Catalyst = Substance & playsRole someValuesFrom CatalystRole
Enzyme = Protein & playsRole someValuesFrom CatalystRole
Insulin subclass-of playsRole someValuesFrom HormoneRole
Cortisol subclass-of playsRole someValuesFrom HormoneRole
ATPase subclass-of playsRole someValuesFrom CatalystRole

Result after classification:
Substance
- Protein
- - ProteinHormone
- - - Insulin
- - Enzyme
- - - ATPase
- Steroid
- - SteroidHormone^
- - - Cortisol
- Hormone
- - ProteinHormone^
- - - Insulin^
- - SteroidHormone^
- - - Cortisol^
- Catalyst
- - Enzyme^
- - - ATPase^
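
What the classifier does with the hormone example can be approximated in a few lines. The sketch below is a simplified stand-in for a real reasoner: it handles only definitions of the form “base class & playsRole someValuesFrom role”, and the table and function names are assumptions for the illustration.

```python
# Untangled primitive trees (substances and roles): child -> single parent.
PRIMITIVE_PARENT = {
    "Protein": "Substance", "Steroid": "Substance",
    "Insulin": "Protein", "ATPase": "Protein", "Cortisol": "Steroid",
    "HormoneRole": "PhysiologicRole", "CatalystRole": "PhysiologicRole",
}
# Primitive axioms: X subclass-of playsRole someValuesFrom Role.
PLAYS_ROLE = {
    "Insulin": "HormoneRole",
    "Cortisol": "HormoneRole",
    "ATPase": "CatalystRole",
}
# Defined classes: name -> (base class, role that must be played).
DEFINED = {
    "Hormone": ("Substance", "HormoneRole"),
    "ProteinHormone": ("Protein", "HormoneRole"),
    "SteroidHormone": ("Steroid", "HormoneRole"),
    "Catalyst": ("Substance", "CatalystRole"),
    "Enzyme": ("Protein", "CatalystRole"),
}

def is_a(cls, ancestor):
    """True if cls equals ancestor or lies below it in its primitive tree."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PRIMITIVE_PARENT.get(cls)
    return False

def inferred_kinds(primitive):
    """Defined classes that a primitive falls under, given the role it plays."""
    role = PLAYS_ROLE.get(primitive)
    if role is None:
        return []
    return sorted(name for name, (base, needed) in DEFINED.items()
                  if is_a(primitive, base) and is_a(role, needed))

def defined_subsumed_by(sub, sup):
    """True if defined class `sub` is a kind of defined class `sup`."""
    (sub_base, sub_role), (sup_base, sup_role) = DEFINED[sub], DEFINED[sup]
    return is_a(sub_base, sup_base) and is_a(sub_role, sup_role)
```

Here `inferred_kinds("Insulin")` yields both `Hormone` and `ProteinHormone`, so the multiple parents of the tangled hierarchy are recovered by inference rather than asserted by hand.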

40 © University of Manchester 40 The original tangled ontology

41 © University of Manchester 41 Modularised into structure and function ontologies (all primitive)

42 © University of Manchester 42 Unified by an ontology that imports both structure and function ontologies and adds linking definitions Before classification - primitives in yellow - defined in orange; prefixes indicate imported ontology

43 © University of Manchester 43 Definition of Hormone

44 © University of Manchester 44 Unified ontology after classification

45 © University of Manchester 45 Normalisation: Criterion 1 The skeleton should consist of disjoint trees ►Every primitive concept should have exactly one primitive parent ►All multiple hierarchies the result of inference by reasoner
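
Criterion 1 lends itself to a mechanical check: list every primitive asserted under more than one primitive parent. The sketch below assumes the asserted skeleton is available as a mapping from each primitive to its list of primitive parents; the function name and the two sample skeletons (based on the hormone example from the earlier slide) are illustrative.

```python
def tangled_primitives(primitive_parents):
    """Primitives asserted under more than one primitive parent, sorted by name."""
    return sorted(cls for cls, parents in primitive_parents.items()
                  if len(parents) > 1)

# Tangled version: hormone classes are asserted under two parents by hand.
tangled = {
    "Protein": ["Substance"], "Steroid": ["Substance"], "Hormone": ["Substance"],
    "ProteinHormone": ["Protein", "Hormone"],
    "SteroidHormone": ["Steroid", "Hormone"],
    "Insulin": ["ProteinHormone"], "Cortisol": ["SteroidHormone"],
}
# Normalised version: a pure tree; the hormone classes become definitions.
normalised = {
    "Protein": ["Substance"], "Steroid": ["Substance"],
    "Insulin": ["Protein"], "Cortisol": ["Steroid"],
}
```

An empty result means the skeleton satisfies the single-parent rule. Root classes simply do not appear as keys, so they are not reported.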

46 © University of Manchester 46 Normalisation Criterion 2: No hidden changes of meaning ►Each branch should be homogeneous and logical (“Aristotelian”) ►Hierarchical principle should be subsumption ►Otherwise we are “lying to the logic” ►The criteria for differentiation should follow consistent principles in each branch eg. structure XOR function XOR cause

47 © University of Manchester 47 A Non-homogeneous taxonomy “ On those remote pages it is written that animals are divided into: a. those that belong to the Emperor b. embalmed ones c. those that are trained d. suckling pigs e. mermaids f. fabulous ones g. stray dogs h. those that are included in this classification i. those that tremble as if they were mad j. innumerable ones k. those drawn with a very fine camel's hair brush l. others m. those that have just broken a flower vase n. those that resemble flies from a distance" From The Celestial Emporium of Benevolent Knowledge, Borges

48 © University of Manchester 48 Normalisation Criterion 3 Distinguish “Self-standing” and “Refining” Concepts ►Self-standing concepts ►Roughly Welty & Guarino’s “sortals” ►person, idea, plant, committee, belief,… ►Refining concepts – depend on self-standing concepts ►mild|moderate|severe, hot|cold, left|right,… ►Roughly Welty & Guarino’s non-sortals ►Closely related to Smith’s “fiat partitions” ►Usefully thought of as Value Types by engineers ►For us an engineering distinction…

49 © University of Manchester 49 Normalisation Criterion 3a Self-standing primitives should be globally disjoint & open ►Primitives are atomic ►If primitives overlap, the overlap conceals implicit information ►A list of self-standing primitives can never be guaranteed complete ►How many kinds of person? of plant? of committee? of belief? ►Can’t infer: Parent & ¬sub_1 & … & ¬sub_(n-1) → sub_n ►Heuristic: ►Diagnosis by exclusion about self-standing concepts should NOT be part of ‘standard’ ontological reasoning

50 © University of Manchester 50 Normalisation Criterion 3b Refining primitives should be locally disjoint & closed ►Individual values must be disjoint ►but can be hierarchical ►e.g. “very hot”, “moderately severe” ►Each list can be guaranteed to be complete ►Can infer: Parent & ¬sub_1 & … & ¬sub_(n-1) → sub_n ►Value types themselves need not be disjoint ►“being hot” is not disjoint from “being severe” ►Allowing value types to overlap is a useful trick, e.g. ►restriction has_state someValuesFrom (severe and hot)

51 © University of Manchester 51 Normalisation Criterion 4 Axioms ►No axiom should denormalise the ontology: no axiom should imply that a primitive is part of more than one branch of the primitive skeleton ►If all primitives are disjoint, any such axioms will make that primitive unsatisfiable ►A partial test for normalisation: ►Create random conjunctions of primitives which do not subsume each other. ►If any are satisfiable, the ontology is not normalised
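
The partial test can be sketched for a skeleton that contains only subclass and disjointness axioms. Two simplifications are assumed here: pairs are enumerated exhaustively instead of sampled at random, and `jointly_satisfiable` is a crude substitute for a reasoner that only looks for a disjointness axiom between the two lineages. All names and the sample hierarchy are hypothetical.

```python
from itertools import combinations

# Assumed primitive skeleton: child -> single parent.
PARENT = {
    "Animal": "Living_thing", "Plant": "Living_thing",
    "Mammal": "Animal", "Fish": "Animal",
    "Cow": "Mammal", "Carp": "Fish",
}

def lineage(cls):
    """The class itself followed by all of its ancestors."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = PARENT.get(cls)
    return chain

def jointly_satisfiable(a, b, disjoint):
    """True unless some ancestor-or-self of a is declared disjoint from one of b."""
    return not any(frozenset((x, y)) in disjoint
                   for x in lineage(a) for y in lineage(b))

def failed_probes(disjoint):
    """Pairs of primitives that do not subsume each other yet can overlap."""
    classes = sorted(set(PARENT) | set(PARENT.values()))
    return [(a, b) for a, b in combinations(classes, 2)
            if a not in lineage(b) and b not in lineage(a)
            and jointly_satisfiable(a, b, disjoint)]

# Disjointness asserted between every pair of siblings.
complete = {frozenset(("Animal", "Plant")), frozenset(("Mammal", "Fish"))}
# Disjointness between Mammal and Fish forgotten.
incomplete = {frozenset(("Animal", "Plant"))}
```

With `complete` no probe succeeds, so the skeleton passes the test. With `incomplete` the pairs involving the Mammal and Fish branches are reported, showing where a disjointness axiom is missing.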

52 © University of Manchester 52 Consequences 1 ►All self-standing primitives are disjoint ►All multiple classification is inferred ►For any two primitive self-standing classes, either one subsumes the other or they are disjoint ►Every self standing concept is part of exactly one primitive branch of the skeleton ►Every self-standing concept has exactly one most specific primitive ancestor

53 © University of Manchester 53 Consequences 2 ►Primitives introduced by a conjunction of one class and a boolean combination of zero or more restrictions ►Tree subclass-of Plant and restriction isMadeOf someValuesFrom Wood ►Resort subclass-of Accommodation and restriction isIntendedFor someValuesFrom Holidays

54 © University of Manchester 54 Consequences 3 ►Use of axioms limited (outside of skeleton construction) The following are a safe but not exhaustive guide: ►The right side of subclass axioms limited to restrictions ►Both sides of disjointness axioms limited to restrictions ►No equivalence axioms with primitives on either side

55 © University of Manchester 55 A real example: Build a simple tree, easy to maintain

56 © University of Manchester 56 Let the classifier organise it

57 © University of Manchester 57 If you want more abstractions, just add new definitions (re-use existing data) “Diseases linked to abnormal proteins”

58 © University of Manchester 58 And let the classifier work again

59 © University of Manchester 59 And again – even for a quite different category “Diseases linked to genes described in the mouse”

60 © University of Manchester 60 And let classifier check consistency (My first try wasn’t)

61 © University of Manchester 61 Summary: Why Normalise? Why use a Classifier? ►To compose concepts ►Allow conceptual lego ►To manage polyhierarchies ►Adding abstractions (“axes”) as needed ►Normalisation ►Untangling ►labelling of “kinds of is-a” ►To avoid combinatorial explosions ►Keep bicycles from exploding ►To manage context ►Cross species, Cross disciplines, Cross studies ►To check consistency and help users find errors

62 © University of Manchester 62 But remember the ‘ontology’ is just one part of an information resource ►The ontology ►Naming - Definitions & necessary conditions ►Classification ►Indexing ►Knowledge bases ►What we know about those entities – what is true in general ►Databases ►What we know about individuals ►Instance stores – specialised databases that link to ontologies ►Plus ►Lexicons ►Metadata ►Mappings

