
1 www.landc.be 1 LinkSuite™: formally robust ontology-based data and information integration
Werner Ceusters (a), Barry Smith (b), James Matthew Fielding (b)
(a) Language & Computing nv (L&C)
(b) Institute for Formal Ontology and Medical Information Science

2 www.landc.be 2

3 www.landc.be 3 The problem
A (simple?) question...
–What genes are involved in juvenile diabetes?
... may lead to many more questions:
–Where is the answer to be found?
  knowledge sources: textbooks, scientific papers, ...
  information sources: physician reports, medical records, ...
  data sources: clinical laboratory databases, ...
–Is there a known correct answer?
–How should the question be phrased for machine processing?
–...

4 www.landc.be 4 Partial solutions are available. Same question – different answers

5 www.landc.be 5 How to solve this?
By developing a framework for data-, information- and ontology-integration
–across all levels of generalisation
–including information in both structured and unstructured forms,
which requires three tasks to be dealt with properly:
1. identifying the basic ontological foundations of a framework expressive enough to describe life science data at all levels;
2. carrying out the research in information engineering needed to create technology able to exploit this ontological framework in a way that can support the integration of massively heterogeneous structured and semi-structured life science databases;
3. developing the tools for natural language understanding in the domain of the life sciences needed to extract structured data from free-text documents.
[slide callouts:] our approach to “ontology”; L&C’s LinkSuite

6 www.landc.be 6 “Ontology” N. Guarino, P. Giaretta, "Ontologies and Knowledge Bases: Towards a Terminological Clarification". In Towards Very Large Knowledge Bases: Knowledge Building and Knowledge Sharing, N. Mars (ed.), pp 25-32. IOS Press, Amsterdam, 1995.

7 www.landc.be 7 From buzz-word to the “O-word”
–“An ontology is a classification methodology for formalizing a subject's knowledge or belief system in a structured way. Dictionaries and encyclopedias are examples of ontologies.” (X1)
–“A terminology (or classification) is a kind of ontology by definition and it should preserve (and "understand") the relationships between the 1,000s of terms in it or else it would become a mere dictionary (or at best a thesaurus).” (X2)
–“Ontologies are Web pages that contain a mystical unifying force that gives differing labels common meaning.” (X3)

8 www.landc.be 8 If, later, you can remember just one thing from this presentation, then make sure it is this one: If you use the word “ontology”, ALWAYS be specific about what you understand by it.

9 www.landc.be 9 My understanding of an ontology
a computer-understandable representation of some pre-existing domain of REALITY, reflecting the properties of the objects within its domain in such a way that there obtain substantial and systematic correlations between reality and the ontology itself (modified from Barry Smith)
–to be used by software (agents) in a machine, and NOT by humans
–does not rely on what people know or think, hence no “concepts”
–instance driven, although it accepts universals that are not instantiated
–does not “create” or “constrain” reality
–the T-Box has no meaning without the A-Box

10 www.landc.be 10 Ontological theories = theories between reality and “the ontology” (“ontology” as a representation)
–Granular Partition Theory (T. Bittner & B. Smith)
–Logic of Classes (B. Smith)

11 www.landc.be 11 Theory of granular partitions (B. Smith) Think of it as Alberti’s grid

12 www.landc.be 12 Granular partitions: main principles
–a partition is the drawing of a (typically complex) fiat boundary over a certain domain
–a partition typically comes with labels and/or an address system
–partitions are artefacts of our cognition
–a partition is transparent (veridical)
–bona fide objects exist independently of our partitions; fiat objects are determined by partitions
–different partitions may represent cuts through the same reality which are skew to each other
–entities (existing in reality) located in the same cell of a partition share common characteristics
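The last two principles can be illustrated with a toy sketch. It is only an illustration under stated assumptions: the entities, their features and the `project` helper are hypothetical and are not taken from the slides or from Granular Partition Theory itself. A partition is modelled as a function that assigns each entity to a labelled cell.

```python
from collections import defaultdict

def project(entities, cell_of):
    """Group entities by the labelled cell that a partition assigns them to."""
    cells = defaultdict(list)
    for name, features in entities.items():
        cells[cell_of(features)].append(name)
    return dict(cells)

# Hypothetical entities; the feature names are assumptions for illustration.
ENTITIES = {
    "dolphin": {"class": "mammal", "habitat": "sea"},
    "shark": {"class": "fish", "habitat": "sea"},
    "cow": {"class": "mammal", "habitat": "land"},
}

# Two partitions that cut through the same reality along different lines.
by_class = project(ENTITIES, lambda f: f["class"])
by_habitat = project(ENTITIES, lambda f: f["habitat"])

print(by_class)    # cells labelled by biological class
print(by_habitat)  # cells labelled by habitat
```

The dolphin shares a cell with the cow in one partition and with the shark in the other, which is one way to picture partitions that are skew to each other while entities within one cell still share a common characteristic.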

13 www.landc.be 13 Logic of classes
primitive:
–entities: particulars versus universals
–relation inst such that:
  all classes are universals; all instances are particulars
  some universals are not classes, hence have no instances: pet, adult, physician
  some particulars are not instances, e.g. some mereological sums
subsumption defined resorting to instances: [formula not preserved in the transcript]
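The slide's own formula is not available in this transcript. One common way to define subsumption in terms of instances, given here as an assumption rather than as the slide's exact wording, states that A is subsumed by B when every instance of A is also an instance of B:

```latex
A \;\mathit{is\_a}\; B \;=_{\mathrm{def}}\; \forall x \,\bigl(\mathrm{inst}(x, A) \rightarrow \mathrm{inst}(x, B)\bigr)
```

Here inst is the primitive relation named on the slide; time-indexed variants of the same definition add a time argument to inst.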

14 www.landc.be 14 Basic Formal Ontology
Basic Formal Ontology consists in a series of sub-ontologies (most properly conceived as a series of perspectives on reality), the most important of which are:
–SnapBFO, a series of snapshot ontologies (O_ti), indexed by times
–SpanBFO, a single videoscopic ontology (O_v).
Each O_ti is an inventory of all entities existing at a time. O_v is an inventory (processory) of all processes unfolding through time.

15 www.landc.be 15

16 www.landc.be 16 UMLS Semantic Types
[diagram of the semantic type hierarchy; node labels:] Entity; Event; Language; Organisation; Group Attribute; Idea or Concept; Finding; Organism Attribute; Intellectual Product; Occupation or Discipline; Group; Substance; Organism; Anatomical Structure; Manufactured Object; Behaviour; Daily or Recreational Activity; Occupational Activity; Machine Activity; Laboratory Procedure; Diagnostic Procedure; Therapeutic Procedure; Individual Behaviour; Social Behaviour; Health Care Activity; Research Activity; Educational Activity; Governmental or Regulatory Activity; Injury or Poisoning; Natural Phenomenon or Process; Human-caused Phenomenon or Process; Environmental Effect of Humans; Physical Object; Conceptual Entity; Phenomenon or Process; Activity; Biologic Function; Physiologic Function; Pathologic Function; Organ or Tissue Function; Organism Function; Mental Process; Cell Function; Molecular Function; Genetic Function; Disease or Syndrome; Mental or Behavioural Dysfunction; Neoplastic Process; Cell or Molecular Dysfunction; Experimental Model of Disease

17 www.landc.be 17 L&C’s LinkSuite™

18 www.landc.be 18 Technology overview
[architecture diagram; component labels:] structured; text; LinKFactory Server; MaDBoKS; TeSSI indexer; Information Extraction System; LinKFactory Client

19 www.landc.be 19 LinKBase
[diagram; component labels:] Formal Domain Ontology; Lexicon; Grammar; Language A; Lexicon; Grammar; Language B; Cassandra Linguistic Ontology; MEDDRA; ICD; SNOMED; ICPC; Others...; Proprietary Terminologies

20 www.landc.be 20 Based on formal ontology
[diagram of spatial relations; link labels:] HAS-PARTIAL-SPATIAL-OVERLAP; IS-TOPO-INSIDE-OF; IS-GEO-INSIDE-OF; IS-INSIDE-CONVEX-HULL-OF; IS-PARTLY-IN-CONVEX-HULL-OF; IS-OUTSIDE-CONVEX-HULL-OF; HAS-DISCONNECTED-REGION; HAS-EXTERNAL-CONNECTING-REGION; HAS-DISCRETED-REGION; HAS-TANG.-SPAT.-PART; HAS-NON-TANG.-SPAT.-PART; IS-SPAT.-EQUIV.-OF; IS-TANG.-SPAT.-PART-OF; IS-NON-TANG.-SPAT.-PART-OF; HAS-PROPER-SPATIAL-PART; IS-PROPER-SPAT.-PART-OF; HAS-SPATIAL-PART; IS-SPATIAL-PART-OF; HAS-OVERLAPPING-REGION; HAS-CONNECTING-REGION; HAS-SPATIAL-POINT-REFERENCE

21 www.landc.be 21 Linking external ontologies
[diagram; node labels:] MESH-2001: “Seizures”; MESH-2001: “Convulsions”; Snomed-RT: “Convulsion”; Snomed-RT: “Seizure”; L&C: Convulsion; L&C: Seizure; L&C: Health crisis; L&C: Epileptic convulsion
[link labels:] IS-A; IS-narrower-than; ISA; Has-CCC

22 www.landc.be 22 Managing different views
[diagram; labels:] External ontology; Internal ontology; Criteria; Mappings; Definitions; Terms

23 www.landc.be 23 Ontological theory inside LinKBase
–if you know that a real-world entity satisfies the Full Definition of a domain-entity-type, then you may infer that that object is an instance of that type;
–if a real-world entity is an instance of a domain-entity, all that is said about the domain-entity applies to the instance;
–the statement “A-Link-B” says something about all instances of A, but nothing about instances of B unless the Link is declared to have an inverse.
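The three rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not LinKBase's actual data model or API: the `EntityType` class, the representation of a Full Definition as a set of (link, target) pairs, and the `INVERSES` table are all hypothetical. The link names are borrowed from the slides, but pairing HAS-SPATIAL-PART with IS-SPATIAL-PART-OF as inverses is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class EntityType:
    name: str
    # Full Definition (assumed form): criteria that are sufficient for membership.
    full_definition: frozenset = frozenset()
    # Statements "A-Link-B" asserted about every instance of this type.
    statements: set = field(default_factory=set)

# Hypothetical declaration of which links have an inverse.
INVERSES = {"HAS-SPATIAL-PART": "IS-SPATIAL-PART-OF"}

def is_instance(facts, etype):
    """Rule 1: an entity whose facts satisfy the Full Definition is an instance."""
    return bool(etype.full_definition) and etype.full_definition <= facts

def inherited_statements(etype):
    """Rule 2: everything said about the type applies to each of its instances."""
    return set(etype.statements)

def statements_about_target(source, link, target):
    """Rule 3: A-Link-B says something about B only if Link has a declared inverse."""
    inverse = INVERSES.get(link)
    return {(target, inverse, source)} if inverse else set()

cancer_patient = EntityType(
    "Cancer patient",
    frozenset({("IS-A", "Patient"),
               ("Has-Healthcare-phenomenon", "Malignant neoplasm")}),
    {("Is-possessor-of", "Malignant neoplasm")},
)
mr_smith = {("IS-A", "Patient"),
            ("Has-Healthcare-phenomenon", "Malignant neoplasm")}

print(is_instance(mr_smith, cancer_patient))
print(statements_about_target("Heart", "HAS-SPATIAL-PART", "Ventricle"))
```

Keeping the inverse table separate from the type definitions mirrors the slide's point that nothing follows for instances of B by default; an inverse has to be declared explicitly.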

24 www.landc.be 24 Ontology based parsing
Example sentence: Mr. Smith has a pulmonary carcinoma
Steps: 1. Parsing; 2. Relating; 3. Inferring
[ontology diagram; node labels:] Patient; Cancer patient; Having a healthcare phenomenon; Healthcare phenomenon; Malignant neoplasm; Generalised Possession; Human; lung carcinoma
[link labels:] IS-A; Is-possessor-of; Has-Healthcare-phenomenon; Has-possessor; Has-possessed
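The three steps named on this slide (parsing, relating, inferring) can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the tiny IS-A table, the lexicon, the naive string-based "parser" and the conclusion that the possessor is a cancer patient are one reading of the diagram labels, not the L&C parser or LinKBase itself.

```python
# Toy IS-A hierarchy loosely following the slide's labels (assumed arrangement).
IS_A = {
    "Lung carcinoma": "Malignant neoplasm",
    "Malignant neoplasm": "Healthcare phenomenon",
    "Cancer patient": "Patient",
}

# Hypothetical lexicon mapping surface phrases to ontology entries.
LEXICON = {
    "has": "Generalised Possession",
    "pulmonary carcinoma": "Lung carcinoma",
}

def parse(sentence):
    """Step 1 (parsing): split a 'SUBJECT has a OBJECT' sentence into three parts."""
    subject, _, obj = sentence.partition(" has a ")
    return subject, "has", obj

def relate(parsed):
    """Step 2 (relating): map the verb and object phrases to ontology entries."""
    subject, verb, obj = parsed
    return subject, LEXICON[verb], LEXICON[obj]

def subsumed_by(concept, ancestor):
    """Follow IS-A links upward from concept until ancestor is found or links run out."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False

def infer(related):
    """Step 3 (inferring): a possessor of a malignant neoplasm is classed as a cancer patient."""
    subject, relation, obj = related
    if relation == "Generalised Possession" and subsumed_by(obj, "Malignant neoplasm"):
        return subject, "instance-of", "Cancer patient"
    return None

result = infer(relate(parse("Mr. Smith has a pulmonary carcinoma")))
print(result)
```

A real parser would of course need a grammar and a far larger lexicon; the sketch only shows how the ontology can drive the step from a parsed sentence to an inferred classification.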

25 www.landc.be 25 L&C Parser output

26 www.landc.be 26 Information Extraction

27 www.landc.be 27 Semantic indexing

28 www.landc.be 28 Conclusions
–There is a huge need for life science data integration technology able to deal with both structured and unstructured data formats.
–To keep the data manageable, the technology should be able to understand the data.
–The proper sort of ontology is a means to accomplish this.
–Based on several proofs of concept (POCs), L&C’s LinKSuite can be claimed to be a successful attempt to exploit these insights.
–But humble as we are, we understand that it is still far from where it should be.

