Using ontology in query answering systems: scenarios, requirements and challenges. Werner Ceusters, Barry Smith, Maarten Van Mol (www.landc.be)

Presentation transcript:

1 Using ontology in query answering systems: scenarios, requirements and challenges
Werner Ceusters (a), Barry Smith (b), Maarten Van Mol (a)
(a) Language & Computing nv (L&C)
(b) Institute for Formal Ontology and Medical Information Science

2

3 L&C’s LinkSuite™

4 Presentation overview
1. Clinical Question Answering (QA)
   a) kinds of questions
   b) question analysis resources
   c) possible answer sources
2. Role of Ontology
3. State of the Art in QA at L&C

5 Clinicians’ Questions
– What is the dose of metformin?
– What is the proper treatment of gastro-oesophageal reflux disease (GERD)?
– What should I use for atopic dermatitis?
– How common is depression after infectious mononucleosis?
– What is the name of that rash that diabetics get on their legs?
– Is it ethical for me to take care of my own file clerk, who has back pain and wants a work excuse?
(Jerome Ely et al., BMJ 2002)

6 OHSUMED
– 60 year old menopausal woman without hormone replacement therapy: Are there adverse effects on lipids when progesterone is given with estrogen replacement therapy?
– 60 yo male with disseminated intravascular coagulation: pathophysiology and treatment of disseminated intravascular coagulation?
– 88 yo with subdural: reviews on subdurals in elderly?
– 58 yo with cancer and hypercalcemia: effectiveness of etidronate in treating hypercalcemia of malignancy?
– 55 yo female, postmenopausal: does estrogen replacement therapy cause breast cancer?
W. R. Hersh, C. Buckley, T. J. Leone, and D. H. Hickam. OHSUMED: an interactive retrieval evaluation and new large test collection for research. In: Proceedings of the 17th Annual International ACM Special Interest Group in Information Retrieval. Dublin, Ireland: Springer-Verlag, 1994.

7 Clinical questions & answers Physicians spend less than 2 minutes on average seeking an answer to a question. (Ely et al., BMJ 1999) Most clinical questions remain unanswered. (Alper et al., J Fam Pract 2001) Doctors are overwhelmed by the amount of information available, yet they often cannot answer their questions about specific clinical problems. (Ely et al., BMJ 2002)

8 Main obstacles to answering doctors’ questions about patient care
– The excessive time required to find information
– Difficulty modifying the original question
– Difficulty selecting an optimal strategy to search for information: deciding which resources will be most helpful, in which order to search them, which articles to read thoroughly, and how thoroughly
– Failure of a seemingly appropriate resource to cover the topic
– Uncertainty about how to know when all the relevant evidence has been found so that the search can stop
– Inadequate synthesis of multiple bits of evidence into a clinically useful statement (e.g. conflicting evidence)
(Ely et al., BMJ 2002)

9 Are such questions important?
– 0.42 unanswered questions are generated per patient-physician contact, the majority of which are sufficiently important that an answer might lead to different advice or treatment
  » Mark H. Ebell and Linda White. What is the best way to gather clinical questions from physicians? J Med Libr Assoc July; 91 (3).
– in the United States between 44,000 and 98,000 people are killed every year by medical errors
– the total cost of preventable medical mistakes, including lost wages and extra health costs, is estimated to lie between $17 billion and $29 billion a year
  » Institute of Medicine. "To Err Is Human: Building A Safer Health System". December 1999.

10 Can they be answered by a QA system?
Influencing factors:
– how easy/difficult is it to analyse the questions?
  question patterns
  availability of analysis resources
– is there a right answer?
– can the right answer be found?
  availability of answer resources
  how easy/difficult is it to analyse these resources?

11 What is the proper treatment of gastro-oesophageal reflux disease?
[Which-X]-(properly treats)-[GERD]
– is this easy pattern general?
– is clinical terminology a problem for QA systems?
– does the adverb “properly” make the question hard to answer, since it is a request for a judgement?
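To make the [Which-X]-(properly treats)-[GERD] notation concrete, here is a minimal sketch of matching such a pattern against a store of subject-relation-object facts. The fact store, the placeholder therapy names and the `match` helper are all hypothetical illustrations, not part of any L&C system, and the sketch deliberately ignores the "properly" judgement.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: Optional[str]
    relation: Optional[str]
    obj: Optional[str]


# Hypothetical toy facts with placeholder names (not clinical content).
FACTS = [
    Triple("therapy A", "treats", "GERD"),
    Triple("therapy B", "treats", "GERD"),
    Triple("therapy C", "treats", "atopic dermatitis"),
]


def match(pattern: Triple, facts: List[Triple] = FACTS) -> List[Triple]:
    """Return the facts matching the pattern; None in a slot plays the role of Which-X."""
    def slot_ok(wanted: Optional[str], actual: Optional[str]) -> bool:
        return wanted is None or wanted == actual

    return [
        f for f in facts
        if slot_ok(pattern.subject, f.subject)
        and slot_ok(pattern.relation, f.relation)
        and slot_ok(pattern.obj, f.obj)
    ]


# [Which-X]-(treats)-[GERD]
answers = [f.subject for f in match(Triple(None, "treats", "GERD"))]
```

The sketch only retrieves candidates; deciding which candidate is the "proper" treatment would still need a source of judgement such as a clinical guideline.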

12 Clinicians’ Questions: templates
Jerome Ely et al. A taxonomy of generic clinical questions: classification study. BMJ 2000.

13 Oral pathology questions from medical students
– Quels signes caractérisent un lichen plan? (Which signs characterize a lichen planus?)
– Quelle mesure diagnostique une adénopathie? (Which test diagnoses an adenopathy?)
– Doit-on traiter les atteintes buccales du lichen plan? (Must one treat oral lesions of lichen planus?)
P. Jacquemart and P. Zweigenbaum. Towards a medical question-answering system: a feasibility study. In Pierre Le Beux and Robert Baud, editors, Proceedings Medical Informatics Europe, Amsterdam. IOS Press, 2003.

14 Syntactic-semantic patterns
Model  Semantic model                             Nr of synsem patterns  Nr of questions
1      [which X]-(r)-[B]                          24                     39
       [A]-(r)-[which Y]                          17                     29
2      does [A]-(r)-[B]                           12                     17
3      why [A]-(r)-[B]                            4                      5
4      [which X, Y]-(r)-[B]                       1                      1
5      [which X]-(r)-[B, C]                       3                      3
6      duration [A]-(precedes)-[B]                2                      2
7      define [A]                                 1                      2
8      which specific precaution if [A]-(r)-[B]   1                      1
total                                             66                     100
P. Jacquemart and P. Zweigenbaum, Table 1.
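As a rough illustration of how a question might be routed to one of the semantic models in the table, the sketch below uses surface cues at the start of an English question. The cue list and the `classify` function are assumptions made for this example; the cited study derived its models from syntactic-semantic analysis, not from regular expressions.

```python
import re
from typing import Optional

# Hypothetical surface cues (assumption), tried in order; the labels are the
# semantic models named in the table.
MODEL_CUES = [
    (re.compile(r"^why\b", re.IGNORECASE), "why [A]-(r)-[B]"),
    (re.compile(r"^(does|do|must|should|is|are)\b", re.IGNORECASE), "does [A]-(r)-[B]"),
    (re.compile(r"^define\b", re.IGNORECASE), "define [A]"),
    (re.compile(r"^(which|what)\b", re.IGNORECASE), "[which X]-(r)-[B]"),
]


def classify(question: str) -> Optional[str]:
    """Return the first semantic model whose cue matches, or None if no cue applies."""
    text = question.strip()
    for cue, model in MODEL_CUES:
        if cue.search(text):
            return model
    return None
```

For example, `classify("Must one treat oral lesions of lichen planus?")` yields the yes/no model, while a question that starts with none of the cues returns `None` and would need deeper analysis.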

15 Possible answer sources

16

17 Ear: The seed-bearing spike of a cereal plant, such as corn

18 Judgement questions → clinical guidelines

19 CG for ingrown toenails

20 A wealth of clinical terminology resources
UMLS Knowledge Sources:
– UMLS Metathesaurus® (includes SNOMED-CT in 2004, US restricted license)
– SPECIALIST Lexicon
– UMLS Semantic Network

21 UMLS Semantic Types (hierarchy diagram; node labels)
Entity; Event; Language; Organisation; Group Attribute; Idea or Concept; Finding; Organism Attribute; Intellectual Product; Occupation or Discipline; Group; Substance; Organism; Anatomical Structure; Manufactured Object; Behaviour; Daily or Recreational Activity; Occupational Activity; Machine Activity; Laboratory Procedure; Diagnostic Procedure; Therapeutic Procedure; Individual Behaviour; Social Behaviour; Health Care Activity; Research Activity; Educational Activity; Governmental or Regulatory Activity; Injury or Poisoning; Natural Phenomenon or Process; Human-caused Phenomenon or Process; Environmental Effect of Humans; Physical Object; Conceptual Entity; Phenomenon or Process; Activity; Biologic Function; Physiologic Function; Pathologic Function; Organ or Tissue Function; Organism Function; Mental Process; Cell Function; Molecular Function; Genetic Function; Disease or Syndrome; Mental or Behavioural Dysfunction; Neoplastic Process; Cell or Molecular Dysfunction; Experimental Model of Disease

22 Semantic Net: 54 Links
– Spatially RelatedTo: Has_location, Adjacent_to, Surrounded_by, Traversed_by
– Physically RelatedTo: Has_part, Constitutes, Contained_in, Connected_to, Interconnected_by, Has_branch, Has_tributary, Has_ingredient
– Temporally RelatedTo: follows, co-occurs_with
– Functionally RelatedTo: brought_about_by, has_manifestation, indicated_by, has_result, affected_by, managed_by, treated_by, disrupted_by, complicated_by, interacted_with, prevented_by, used_by, produced_by, caused_by, performed_by, carried_out_by, exhibited_by, practiced_by, has_occurrence, has_process
– Conceptually RelatedTo: has_degree, diagnosed_by, has_property, has_derivative, has_developmental_form, has_measurement, measured_by, has_evaluation, has_method, has_conceptual_part, has_issue, analyzed_by, Assessed_for_effect_by

23 The MetaThesaurus
For each unique concept…
– All the string variants that share that meaning
  But NB: more synset than true synonymy
  And where that string came from
– A text definition of the concept (if one exists)
– The semantic type
– Relationships with other concepts in sources
  Parent-of, child-of, broader-than, can-qualify etc.
– A ragbag of other properties
  E.g. is an ICD entry term
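The per-concept contents listed above can be pictured as a single record. The sketch below is only an illustrative data structure under assumed field names; it is not the actual Metathesaurus file layout, and the identifier, semantic type and property values in the example are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ConceptRecord:
    concept_id: str                                  # hypothetical identifier
    strings: List[Tuple[str, str]]                   # (string variant, source vocabulary)
    semantic_type: str
    definition: Optional[str] = None                 # present only if a definition exists
    relations: List[Tuple[str, str]] = field(default_factory=list)  # (relation, other concept id)
    properties: Dict[str, object] = field(default_factory=dict)     # the "ragbag"

    def variants(self) -> List[str]:
        """All distinct string variants recorded for this concept."""
        return sorted({text for text, _ in self.strings})

    def sources_of(self, text: str) -> List[str]:
        """The source vocabularies a given string came from."""
        return sorted({src for t, src in self.strings if t == text})


# Illustrative record; every value here is an assumption for the example.
record = ConceptRecord(
    concept_id="EX-0001",
    strings=[("Convulsions", "MESH-2001"), ("Convulsion", "Snomed-RT")],
    semantic_type="Finding",
    relations=[("child-of", "EX-0002")],
    properties={"is_icd_entry_term": False},
)
```

Keeping the source next to each string is what lets a consumer judge whether two variants are true synonyms or merely grouped together.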

24 Foundational Model of Anatomy Rosse, C. and Shapiro, L. G. and Brinkley, J. F. (1998) The Digital Anatomist Foundational Model: Principles for Defining and Structuring Its Concept Domain.

25 Characteristics of clinical “terminologies”
– large collections of term lists
– synonyms, preferred terms
– semantic neighbourhood
– taxonomy
– parthood
– machine interpretable
– formal
– “ontologically” sound

26 Current expectations on clinical question analysis
– patterns may be considered to be relatively simple
– pragmatic education can keep them simple: ask questions simply instead of ask simple questions
– clinical guidelines are available but difficult to analyse
– required terminologies are available, but a good understanding of the kind of knowledge they contain is required

27 Ontology

28 “Ontology” in the QA Roadmap Document (1)
Study models of question processing based on ontologies and knowledge bases. (p8 + 5x)
[ontologies]-(which R)-[knowledge bases]
Typical in this collocation:
– ontology as (constraint) model of KB
– KB contains facts (about instances) or rules

29 “Ontology” in the QA Roadmap Document (2)
p13: Very problematic “=” sign!
– true: operating system “IS A” software product
– false: drivers “IS A” operating system
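The true/false pair above can be checked mechanically once IS-A is kept separate from other relations. The sketch below assumes, purely for illustration, that the driver example is better modelled as a part-of link; the slide itself only states that the IS-A reading is false.

```python
from typing import Dict, Set

# IS-A links taken from the "true" example on the slide.
IS_A: Dict[str, Set[str]] = {"operating system": {"software product"}}

# Assumption for illustration only: drivers relate to an operating system by
# parthood rather than subsumption. The slide does not name the relation.
PART_OF: Dict[str, Set[str]] = {"driver": {"operating system"}}


def is_a(child: str, ancestor: str, taxonomy: Dict[str, Set[str]] = IS_A) -> bool:
    """True if `ancestor` is reachable from `child` by following IS-A links."""
    seen: Set[str] = set()
    stack = [child]
    while stack:
        node = stack.pop()
        for parent in taxonomy.get(node, ()):
            if parent == ancestor:
                return True
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return False
```

Storing the two relations in separate maps means a generic “=” can never silently stand in for either of them.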

30 “Ontology” in the QA Roadmap Document (3)
p13: 6/ Integration of contextual knowledge into and from world knowledge and special purpose ontologies as well as axiomatic knowledge.
p20: 4/ Develop knowledge bases and ontologies that contain concepts that are language-independent (interlingua type of hierarchies, linked to concepts or words in other languages than English).

31 The O-word N. Guarino, P. Giaretta, "Ontologies and Knowledge Bases: Towards a Terminological Clarification". In Towards Very Large Knowledge Bases: Knowledge Building and Knowledge Sharing, N. Mars (ed.), pp IOS Press, Amsterdam, 1995.

32 If, later, you can remember just one thing from this presentation, then make sure it is this one: if you use the word “ontology”, ALWAYS be specific about what you understand by it.

33 L&C’s understanding of an ontology
A computer-understandable representation of some pre-existing domain of REALITY, reflecting the properties of the objects within its domain in such a way that there obtain substantial and systematic correlations between reality and the ontology itself. (modified from Barry Smith)
– to be used by software (agents) in a machine, and NOT by humans
– does not rely on what people know or think, hence no “concepts”
– instance driven, although it accepts universals that are not instantiated
– does not “create” or “constrain” reality
– the T-Box has no meaning without the A-Box

34 BFO/MedO
Basic Formal Ontology consists of a series of sub-ontologies (most properly conceived as a series of perspectives on reality), the most important of which are:
– SnapBFO, a series of snapshot ontologies (O_ti), indexed by times
– SpanBFO, a single videoscopic ontology (O_v)
Each O_ti is an inventory of all entities existing at a time. O_v is an inventory (processory) of all processes unfolding through time.
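One way to picture the two perspectives is a store holding time-indexed inventories of entities (the O_ti) alongside a single inventory of processes with temporal extent (O_v). This is a loose sketch under assumed names and integer time points, not an implementation of BFO; the example entities and process are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Process:
    name: str
    start: int  # first time point at which the process is unfolding
    end: int    # last time point at which the process is unfolding


@dataclass
class SnapSpanStore:
    # Snapshot view: one inventory of existing entities per time index (the O_ti).
    snapshots: Dict[int, Set[str]] = field(default_factory=dict)
    # Span view: a single inventory of processes unfolding through time (O_v).
    processes: List[Process] = field(default_factory=list)

    def entities_at(self, t: int) -> Set[str]:
        """Entities recorded as existing at time t."""
        return self.snapshots.get(t, set())

    def processes_during(self, t: int) -> List[str]:
        """Names of processes whose interval covers time t."""
        return [p.name for p in self.processes if p.start <= t <= p.end]


# Invented example data, for illustration only.
store = SnapSpanStore(
    snapshots={1: {"patient", "tumour"}, 2: {"patient"}},
    processes=[Process("tumour resection", start=1, end=2)],
)
```

The split mirrors the slide's point that entities are inventoried per moment, whereas processes are listed once with their extent in time.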

35

36

37 Question Answering Systems at L&C

38 Technology overview (diagram labels): structured text; LinKFactory Server; MaDBoKS; TeSSI indexer; Information Extraction System; LinKFactory Client

39 LinKBase (diagram labels): Formal Domain Ontology; Lexicon and Grammar, Language A; Lexicon and Grammar, Language B; Cassandra Linguistic Ontology; MEDDRA; ICD; SNOMED; ICPC; Others...; Proprietary Terminologies

40 Based on formal ontology
HAS-PARTIAL-SPATIAL-OVERLAP; IS-TOPO-INSIDE-OF; IS-GEO-INSIDE-OF; IS-INSIDE-CONVEX-HULL-OF; IS-PARTLY-IN-CONVEX-HULL-OF; IS-OUTSIDE-CONVEX-HULL-OF; HAS-DISCONNECTED-REGION; HAS-EXTERNAL-CONNECTING-REGION; HAS-DISCRETED-REGION; HAS-TANG.-SPAT.-PART; HAS-NON-TANG.-SPAT.-PART; IS-SPAT.-EQUIV.-OF; IS-TANG.-SPAT.-PART-OF; IS-NON-TANG.-SPAT.-PART-OF; HAS-PROPER-SPATIAL-PART; IS-PROPER-SPAT.-PART-OF; HAS-SPATIAL-PART; IS-SPATIAL-PART-OF; HAS-OVERLAPPING-REGION; HAS-CONNECTING-REGION; HAS-SPATIAL-POINT-REFERENCE

41 Example: joint anatomy
joint HAS-HOLE joint space
joint capsule IS-OUTER-LAYER-OF joint
meniscus
– IS-INCOMPLETE-FILLER-OF joint space
– IS-TOPO-INSIDE joint capsule
– IS-NON-TANGENTIAL-MATERIAL-PART-OF joint
joint
– IS-CONNECTOR-OF bone X
– IS-CONNECTOR-OF bone Y
synovia
– IS-INCOMPLETE-FILLER-OF joint space
synovial membrane IS-BONAFIDE-BOUNDARY-OF joint space
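The joint anatomy statements above are subject-relation-object assertions, so they can be written down as triples and queried directly. The sketch below is a minimal illustration; the list layout and the helper names `subjects` and `objects` are assumptions for the example and say nothing about how LinKBase stores its data.

```python
from typing import List, Tuple

Fact = Tuple[str, str, str]

# The assertions from the joint anatomy slide, one triple per statement.
JOINT_FACTS: List[Fact] = [
    ("joint", "HAS-HOLE", "joint space"),
    ("joint capsule", "IS-OUTER-LAYER-OF", "joint"),
    ("meniscus", "IS-INCOMPLETE-FILLER-OF", "joint space"),
    ("meniscus", "IS-TOPO-INSIDE", "joint capsule"),
    ("meniscus", "IS-NON-TANGENTIAL-MATERIAL-PART-OF", "joint"),
    ("joint", "IS-CONNECTOR-OF", "bone X"),
    ("joint", "IS-CONNECTOR-OF", "bone Y"),
    ("synovia", "IS-INCOMPLETE-FILLER-OF", "joint space"),
    ("synovial membrane", "IS-BONAFIDE-BOUNDARY-OF", "joint space"),
]


def subjects(relation: str, obj: str, facts: List[Fact] = JOINT_FACTS) -> List[str]:
    """Everything that stands in `relation` to `obj`, in the order asserted."""
    return [s for s, r, o in facts if r == relation and o == obj]


def objects(subject: str, relation: str, facts: List[Fact] = JOINT_FACTS) -> List[str]:
    """Everything that `subject` stands in `relation` to, in the order asserted."""
    return [o for s, r, o in facts if s == subject and r == relation]
```

For instance, `subjects("IS-INCOMPLETE-FILLER-OF", "joint space")` answers the question "what partly fills the joint space?" from the assertions alone.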

42 Linking external ontologies (diagram labels)
External terms: MESH-2001: “Seizures”; MESH-2001: “Convulsions”; Snomed-RT: “Convulsion”; Snomed-RT: “Seizure”
L&C concepts: L&C: Convulsion; L&C: Seizure; L&C: Health crisis; L&C: Epileptic convulsion
Link types: IS-A; IS-narrower-than; Has-CCC

43 Linguistic and domain ontologies (diagram labels)
Example sentence: Mr. Kovács has a pulmonary carcinoma
Nodes: Having a healthcare phenomenon; Generalised Possession; Healthcare phenomenon; Human; Patient; Cancer patient; Malignant neoplasm; lung carcinoma
Link types: IS-A; Has-possessor; Has-possessed; Is-possessor-of; Has-Healthcare-phenomenon

44 TeSSI Indexing

45 Information Extraction

46

47

48 Web-based ontology refinement
1. concept investigated
2. information used for relevance assessment
3. new terms found
4. context of new terms
5. relevant documents

49 Concluding remarks