
Query Languages for SNOMED: Use Cases and Issues for Binding to Health Records and to ICD & background for comments on DRAFT SNOMED Query language spec.




1 Query Languages for SNOMED: Use Cases and Issues for Binding to Health Records and to ICD & background for comments on DRAFT SNOMED Query language spec (locally http://www.cs.man.ac.uk/~rector/temp/SNOMED_TQL_for_comment)
Alan Rector, BioHealth Informatics Group, University of Manchester
rector@cs.manchester.ac.uk
http://www.cs.manchester.ac.uk/~rector
Copyright University of Manchester 2012. Licensed under Creative Commons Attribution Non-commercial Licence v3

2 Background
►Use cases for terminology query languages
►Binding of ontologies to health records: HL7 & EN 13606/Archetypes
  Specifying “value sets” for fields
  Expanding SQL queries to include subsumed concepts
►Use of a common ontology in ICD
►Questions
►Theoretical
  Query expansion for querying databases rather than DL queries on A-Boxes
    ‣ Negation: “not necessarily” vs “necessarily not”
    ‣ Natural level of incompleteness – The frame problem and Grice
  Coping with representations in subsets of EL++ without disjointness
►Practical for ICD
  Are there flaws in SNOMED’s proposed query language?
  Are there alternatives?
    ‣ “build or borrow” – relation to standards
  Establishing a “reference representation” – what it should have been
    ‣ and a migration path
►Major issues in Query Language Spec
►Pragmatic requirements for ICD
  “Arbitrary selection of classes”
  Negation – exclusions, residual classes (“other”), with/without
  Using queries to cope with known errors in SNOMED
  Comprehensible rules for assigning cases to codes

3 Necessary background:
►SNOMED CT
►Binding to EHR
►Separation of Domain Ontology from Data schema
►HL7 and Archetypes
►Three component architecture for ICD11
►Requirements and status of SNOMED Terminology Query Language (& its Ocean Informatics predecessor)

4 SNOMED CT (SCT)
►Large terminology formulated in an old description logic
►Roughly EL++ without disjointness
  Logical content available in OWL syntax
  OWL version classifies with ELK or SNOROCKET in a few seconds
►~300K active classes; ~1.2M axioms
  Convenient to extract modules for experiments
    ‣ Most tools get bogged down in bulk
►Role Group Translation into OWL not identical to KRSS original
►Idiosyncratic schema & many errors
  See papers on my website.
►Canonical form mechanism that is often used in lieu of classification
  A good topic for a separate discussion – not for today

5 Role Groups
►Purpose: group qualifiers (restrictions) together to distinguish
►Cancer originating in breast and metastatic to bone*
  Cancer & RoleGroup some (has_status some Primary & has_site some Breast) & RoleGroup some (has_status some Metastases & has_site some Bone)
►Cancer originating in bone and metastatic to breast
  Cancer & RoleGroup some (has_status some Metastases & has_site some Breast) & RoleGroup some (has_status some Primary & has_site some Bone)
►OWL translation pragmatic
►Role groups inserted everywhere for consistency.
  Native syntax omits them when not required
* Easy to understand example. Not literally correct for SNOMED
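The effect of grouping can be sketched in a few lines of Python. This is a toy illustration only: the attribute and value names follow the slide's simplified example, the data structure (a set of groups, each a set of attribute/value pairs) is an assumption made for the sketch, and none of it is real SNOMED CT content.

```python
# Toy illustration: names follow the slide's simplified example, not SNOMED CT.
# A definition is modelled (by assumption) as a set of role groups,
# each group being a set of (attribute, value) pairs.

def flat(groups):
    """Discard the grouping: pool every (attribute, value) pair into one set."""
    return frozenset(pair for group in groups for pair in group)

breast_primary_bone_mets = frozenset({
    frozenset({("has_status", "Primary"), ("has_site", "Breast")}),
    frozenset({("has_status", "Metastases"), ("has_site", "Bone")}),
})
bone_primary_breast_mets = frozenset({
    frozenset({("has_status", "Metastases"), ("has_site", "Breast")}),
    frozenset({("has_status", "Primary"), ("has_site", "Bone")}),
})

# Without groups the two cancers carry exactly the same four pairs.
same_when_flat = flat(breast_primary_bone_mets) == flat(bone_primary_breast_mets)
# With groups, status and site stay tied together, so the definitions differ.
same_when_grouped = breast_primary_bone_mets == bone_primary_breast_mets
```

Here `same_when_flat` is true and `same_when_grouped` is false, which is the distinction the role groups exist to preserve.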

6 Major issue: What should a code represent? The “Condition” vs “Situation” debate (now largely resolved in favour of “situations”)
►Does a code represent
►A “disorder”? “Condition” interpretation
►“having a disorder”? “Situation” interpretation
  ‣ “Situation of having a disorder” /
  ‣ “Patient having the disorder (at a given place and time as observed by a given clinician)”

7 Example: Fracture of Radius & Ulna (Forearm) – a single code in ICD and SNOMED
►Nothing can be both a “fracture of radius” and “fracture of ulna”
►“Condition interpretation”
►A patient can simultaneously have both a “fracture of radius” and “fracture of ulna”
►“Situation interpretation”

8 The evidence
►Should responses to queries / rules for patients with “Fracture of Radius” include patients with “Fracture of the radius & ulna”?
►Most doctors say “yes”
►Hierarchies of SNOMED and ICD imply “yes”, i.e. “Fracture of Radius and Ulna” is a kind of “Fracture of Radius”
►Which is safer?

9 Implications in OWL

10 …but
►For the foreseeable future:
►The hierarchies behave as if the codes represented situations
►Separate entities for the condition and the situation will not be created
  It is up to software and users to disambiguate or to manage as best they can
    ‣ One of the many legacy idiosyncrasies

11 Most common use case: eHealth records
[Diagram: Data schema and Ontology, linked by dotted arrows]
Are the dotted arrows: Class expressions? Queries? Other?

12 Most common use case: eHealth records
[Diagram: Ontology and Data base]
To determine what is legal for entries in the database

13 Consider retrieval from a database
►I want to retrieve all situations with hypertension during pregnancy…
►Pregnancy only recorded if kind of hypertension does not necessarily involve pregnancy, so we need the union of:
  All situations with kinds of hypertension necessarily involving pregnancy
    - e.g. SELECT ?situation, ?diagnosis FROM DiagnosticTable WHERE ?diagnosis IN {SubclassesOf Hypertension_necessarily_involves_pregnancy}
  All situations involving kinds of hypertension not necessarily involving pregnancy but with pregnancy recorded separately.
    - e.g. SELECT ?situation, ?diagnosis1 FROM DiagnosticTable WHERE ?diagnosis1 IN {SubclassesOf Hypertension_not_necessarily_involves_pregnancy} & EXISTS ?situation, ?diagnosis2 WHERE ?diagnosis2 IN {SubclassesOf Pregnancy}
►In the terminology query language we need a query for:
  “Kinds of hypertension not necessarily involving X”
  “Kinds of hypertension necessarily involving X”
    ‣ (but that’s simple: “Subclasses of X” usually abbreviated “X”)
  “Kinds of hypertension necessarily not involving X”
    ‣ Straightforward if we had negation and disjointness, which we don’t
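The union described above can be sketched over an in-memory table. Everything here is hypothetical: the toy hierarchy, the concept names, and the records are invented for illustration, and "not necessarily involving pregnancy" is approximated as a plain set difference over the asserted hierarchy, which is an assumption of the sketch rather than a solution to the negation problem the slide raises.

```python
# Hypothetical toy hierarchy (child -> parents); not real SNOMED CT content.
PARENTS = {
    "Hypertension": [],
    "Hypertension_in_pregnancy": ["Hypertension"],
    "Pre_eclampsia": ["Hypertension_in_pregnancy"],
    "Essential_hypertension": ["Hypertension"],
    "Pregnancy": [],
    "Twin_pregnancy": ["Pregnancy"],
}

def subclasses_of(concept):
    """Return the concept plus all of its descendants."""
    result = {concept}
    changed = True
    while changed:
        changed = False
        for child, parents in PARENTS.items():
            if child not in result and any(p in result for p in parents):
                result.add(child)
                changed = True
    return result

# Hypothetical diagnostic table: (situation, diagnosis) rows.
RECORDS = [
    ("s1", "Pre_eclampsia"),
    ("s2", "Essential_hypertension"),
    ("s2", "Twin_pregnancy"),
    ("s3", "Essential_hypertension"),
    ("s4", "Pregnancy"),
]

def hypertension_during_pregnancy(records):
    necessarily = subclasses_of("Hypertension_in_pregnancy")
    # Assumption: approximate "not necessarily" by set difference.
    not_necessarily = subclasses_of("Hypertension") - necessarily
    pregnancy = subclasses_of("Pregnancy")
    direct = {s for s, d in records if d in necessarily}
    pregnant = {s for s, d in records if d in pregnancy}
    combined = {s for s, d in records if d in not_necessarily and s in pregnant}
    return direct | combined

found = hypertension_during_pregnancy(RECORDS)
```

With these rows, `found` contains `s1` (pregnancy implied by the code) and `s2` (pregnancy recorded separately), but not `s3` or `s4`.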

14 Consider specification of “value sets”
►Main cases
►Simple value sets not used elsewhere
  severity in {mild | moderate | severe}
►Complete hierarchies – all descendants
  diagnosis in {SubclassesOf Disorder}
►Ordered hierarchies and defaults, with specialisation
  “Reason for admission” in {Chest pain, Major trauma, Hypothermia,…}
►Arbitrary lists of one or more specific classes
  “Radiation of chest pain” in {left arm, shoulder, neck, axilla, abdomen}
    ‣ Exist elsewhere and used for many other purposes
►Union, intersection & difference of all of the above
►Other issues
►Declarative specification updating with changes in terminology; changes in data schema.
►Addition or removal of values by context (discussion for another day)
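One way to picture a declarative value-set specification is as a small expression tree that is re-evaluated whenever the terminology changes. The sketch below is an assumption-laden illustration: the tuple encoding, the operator names (`list`, `subclasses_of`, `union`, `intersection`, `difference`), and the toy hierarchy are all invented here and do not come from any standard.

```python
def descendants_and_self(children, concept):
    """Collect a concept and everything below it in a parent -> children map."""
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, []))
    return seen

SET_OPS = {
    "union": set.union,
    "intersection": set.intersection,
    "difference": set.difference,
}

def evaluate(spec, children):
    """Evaluate a (hypothetical) value-set spec against the current hierarchy."""
    op, *args = spec
    if op == "list":            # arbitrary list of specific classes
        return set(args)
    if op == "subclasses_of":   # complete hierarchy, all descendants
        return descendants_and_self(children, args[0])
    if op in SET_OPS:           # set operation on two sub-specifications
        left, right = (evaluate(arg, children) for arg in args)
        return SET_OPS[op](left, right)
    raise ValueError(f"unknown operator: {op}")

# Toy terminology, invented for the example.
children = {"Disorder": ["Fracture", "Infection"], "Fracture": ["Fracture_of_radius"]}
diagnosis_values = ("difference",
                    ("subclasses_of", "Disorder"),
                    ("subclasses_of", "Fracture"))

before = evaluate(diagnosis_values, children)
children["Infection"] = ["Viral_infection"]   # the terminology is updated
after = evaluate(diagnosis_values, children)  # same spec, new expansion
```

The specification itself never changes; only its expansion does, which is the sense in which a declarative specification "updates with changes in terminology".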

15 ICD and ICD-11 (“International Classification of Diseases”)
►ICD is a classification NOT an ontology
►Used for national and international statistical returns
►Also for billing in many jurisdictions (including an extra layer of “Clinical Modifications” for each country)
►Lots of legacy idiosyncrasies
  Designed to be printed in books & manuals
►Basic rule: Everything must add up to 100% at each level: therefore…
►Each code has only one parent
►Children of every code mutually exclusive and exhaustive
►Therefore…
  If a code fits logically in two places it must be “excluded” from all but one.
  Residual categories “other” & “not elsewhere classified” are required to make siblings exhaustive
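The "adds up to 100%" rule can be illustrated with sets. This is a minimal sketch under stated assumptions: overlapping siblings are resolved by a simple priority order (in ICD, exclusions are editorial decisions, not an ordering rule), and the category names and members are invented.

```python
def linearize(parent_members, named_children):
    """Split parent_members into mutually exclusive, exhaustive children.

    named_children is a list of (name, members) pairs in priority order
    (an assumption of this sketch). A member that fits two children is kept
    in the first and excluded from the rest; leftovers go to "Other".
    """
    assigned = set()
    partition = {}
    for name, members in named_children:
        kept = (members & parent_members) - assigned
        partition[name] = kept
        assigned |= kept
    partition["Other"] = parent_members - assigned   # residual category
    return partition

# Invented example: item "b" logically fits both named categories.
parent = {"a", "b", "c", "d", "e"}
children = [("Infectious", {"a", "b"}), ("Respiratory", {"b", "c"})]
partition = linearize(parent, children)

# Every member is counted exactly once across the siblings.
total = sum(len(members) for members in partition.values())
```

Here `b` stays under "Infectious" and is excluded from "Respiratory", while `d` and `e` fall into the residual "Other", so the sibling counts sum to the size of the parent.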

16 ICD 11 Revision use case: Multi-layer system
[Diagram: SNOMED CT; Common Ontology Subset; Foundation Component (signs, symptoms, causes, …); Ontology Component (kinds); Linearizations: Mortality, Morbidity, Primary Care, …]

17 ICD 11 Revision
►Aims to provide a persistent structure for computer access
►Foundation component
  An “ontological core” shared with SNOMED
  A “Content model” of other information that folk want
    ‣ signs, symptoms, effects, relation to disability, …
►“Linearizations” that look like the legacy system
  But can be generated from the Foundation Component and its annotations
    ‣ Coherent with Foundation Model (except for flagged legacy issues)
    ‣ A single tree of mutually exclusive and exhaustive subclasses at each level
      - Therefore must have
      - “Exclusions”
      - “Residual categories” – “other”, “not elsewhere classified”

18 Assumptions
►SNOMED disorder codes to be treated as “situations”
►Conjunctions and negation “wrapped” in code
►Hierarchies consistent with “situation” interpretation
►Queries will be against either the asserted or inferred form of the ontology, but no reasoner will be used
►To be used with separate data schemas
►For lists of potential values
►For expanding queries for retrieval
►To be used with ICD “Linearizations”
►Specify meaning of each item in a linearization in terms of the ontology

19 Requirements listed for SNOMED Terminology Query Language (locally http://www.cs.man.ac.uk/~rector/temp/SNOMED_TQL_for_comment)
►Support
►Select class itself only, children, and/or descendants
►Set operations on results – union, intersection, difference
►Differentiate primitive and fully defined concepts; leaf concepts from others
  C SubclassOf … vs C EquivalentTo …; no subclasses vs has subclasses;
    ‣ And possibly other syntactic selection/filtering
►Concepts asserted related to another given concept
  And possibly the reciprocals (‘used in’)
►String matching
►Use results of previous queries in nested queries and subsequent queries?
►Other
►Functional & all functions returning a set of concepts
►Easy to use, understand, and implement
►Questions
►What’s missing? How best to satisfy the requirements?

20 Examples
►/* This query expression returns concepts in the Clinical finding sub-hierarchy */
►DescendantsAndSelf(404684003|Clinical finding|)
►/* This query expression returns all fully defined concepts in the Clinical finding sub-hierarchy */
►FilterOnFullyDefined(DescendantsAndSelf(404684003|Clinical finding|))
►/* This query expression returns the first three levels of the Clinical findings hierarchy. */
►ChildrenAndSelf(ChildrenAndSelf(ChildrenAndSelf(404684003|Clinical finding|)))
►/* This query expression returns all concepts in the ‘Immune hypersensitivity reaction’ hierarchy that have an explicit ungrouped ‘Causative agent’ relationship defined to any target concept. */
►Intersection(DescendantsAndSelf(418925002|Immune hypersensitivity reaction|), HasDirectRel(246075003|Causative agent|, All))
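The functional style of these expressions, where every function returns a set of concepts that can feed the next, can be mimicked over a toy hierarchy. The function names echo the draft's `ChildrenAndSelf`, `DescendantsAndSelf`, `FilterOnFullyDefined` and `Intersection`, but the semantics below are a guess for illustration, and the concept names and the fully-defined flags are invented.

```python
# Invented toy hierarchy (parent -> children) and definition status.
CHILDREN = {
    "finding": ["disease", "symptom"],
    "disease": ["infection"],
    "infection": ["pneumonia"],
    "pneumonia": ["viral_pneumonia"],
}
FULLY_DEFINED = {"infection", "pneumonia", "viral_pneumonia"}

def _as_set(concepts):
    """Accept a single concept name or any collection of names."""
    return {concepts} if isinstance(concepts, str) else set(concepts)

def children_and_self(concepts):
    concepts = _as_set(concepts)
    return concepts | {child for parent in concepts
                       for child in CHILDREN.get(parent, [])}

def descendants_and_self(concepts):
    """Apply children_and_self repeatedly until nothing new is added."""
    result = _as_set(concepts)
    while True:
        expanded = children_and_self(result)
        if expanded == result:
            return result
        result = expanded

def filter_on_fully_defined(concepts):
    return {c for c in _as_set(concepts) if c in FULLY_DEFINED}

def intersection(*concept_sets):
    return set.intersection(*(_as_set(s) for s in concept_sets))

# Nesting works because every function takes and returns a set of concepts.
three_steps = children_and_self(children_and_self(children_and_self("finding")))
defined = filter_on_fully_defined(descendants_and_self("finding"))
both = intersection(three_steps, defined)
```

In this toy data the triple nesting stops short of `viral_pneumonia`, whereas `descendants_and_self` reaches it, which shows the difference between a fixed number of levels and the full closure.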

21 Inferred & asserted Use of Role Groups
►/* When run against the inferred view, this query expression returns all concepts that contain a first group with a ‘Finding site’ of ‘Inguinal canal structure’ and an ‘Associated morphology’ of ‘Hernial opening’, and a second group with a ‘Finding site’ of ‘Abdominal cavity structure’ and an ‘Associated morphology’ of ‘Hernia’. Concepts with inherited grouped relationships are also returned. */
►Intersection(
  HasGroupedRels(363698007|Finding site|, 90785001|Inguinal canal structure|, 116676008|Associated morphology|, 414402003|Hernial opening|),
  HasGroupedRels(363698007|Finding site|, 52731004|Abdominal cavity structure|, 116676008|Associated morphology|, 414403008|Hernia|))
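A grouped-relationship test of this kind can be sketched as "some single group contains both pairs". The code below is an illustration under assumptions: the two concept names are invented, values are matched exactly (no subsumption), and inherited relationships are ignored, so it does not reproduce the inferred-view behaviour described in the comment above.

```python
# Invented definitions: concept -> list of role groups,
# each group a set of (attribute, value) pairs.
DEFINITIONS = {
    "hernia_concept_A": [
        {("Finding site", "Inguinal canal structure"),
         ("Associated morphology", "Hernial opening")},
        {("Finding site", "Abdominal cavity structure"),
         ("Associated morphology", "Hernia")},
    ],
    # Same four pairs, but site and morphology are paired the other way round.
    "hernia_concept_B": [
        {("Finding site", "Inguinal canal structure"),
         ("Associated morphology", "Hernia")},
        {("Finding site", "Abdominal cavity structure"),
         ("Associated morphology", "Hernial opening")},
    ],
}

def has_grouped_rels(attr1, value1, attr2, value2):
    """Concepts with at least one group holding both pairs together."""
    wanted = {(attr1, value1), (attr2, value2)}
    return {concept for concept, groups in DEFINITIONS.items()
            if any(wanted <= group for group in groups)}

result = (
    has_grouped_rels("Finding site", "Inguinal canal structure",
                     "Associated morphology", "Hernial opening")
    & has_grouped_rels("Finding site", "Abdominal cavity structure",
                       "Associated morphology", "Hernia")
)
```

Only `hernia_concept_A` survives the intersection, even though `hernia_concept_B` mentions the same sites and morphologies, because the pairs must co-occur within one group.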

22 Example using descendants and has rel without role groups
►/* this query expression returns concepts describing infectious arthritis */
►Intersection(
  Descendants(404684003|Clinical finding|),
  HasRel(116676008|Associated morphology|, DescendantsAndSelf(23583003|Inflammation|)),
  HasRel(363698007|Finding site|, DescendantsAndSelf(39352004|Joint structure|)),
  HasRel(246075003|Causative agent|, DescendantsAndSelf(410607006|Organism|)))
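The same pattern, ignoring role groups, can be sketched with a relationship test whose target is itself a set of concepts. The hierarchy, the two arthritis concepts and their relationships below are invented toy data, and `has_rel` is a guess at the intended semantics of `HasRel`, not the draft's definition.

```python
# Invented toy hierarchy (parent -> children).
CHILDREN = {
    "Clinical finding": ["Arthritis"],
    "Arthritis": ["Bacterial arthritis", "Gouty arthritis"],
    "Joint structure": ["Knee joint"],
    "Organism": ["Bacterium"],
}
# Invented ungrouped relationships: concept -> set of (attribute, value).
RELS = {
    "Bacterial arthritis": {("Associated morphology", "Inflammation"),
                            ("Finding site", "Knee joint"),
                            ("Causative agent", "Bacterium")},
    "Gouty arthritis": {("Associated morphology", "Inflammation"),
                        ("Finding site", "Knee joint")},
}

def descendants_and_self(concept):
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(CHILDREN.get(current, []))
    return seen

def descendants(concept):
    return descendants_and_self(concept) - {concept}

def has_rel(attribute, targets):
    """Concepts with the attribute pointing at any concept in targets."""
    return {concept for concept, rels in RELS.items()
            if any(a == attribute and v in targets for a, v in rels)}

infectious_arthritis = (
    descendants("Clinical finding")
    & has_rel("Associated morphology", descendants_and_self("Inflammation"))
    & has_rel("Finding site", descendants_and_self("Joint structure"))
    & has_rel("Causative agent", descendants_and_self("Organism"))
)
```

Passing `descendants_and_self(...)` as the target is what lets a relationship to `Knee joint` or `Bacterium` satisfy a constraint stated at the level of `Joint structure` or `Organism`.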

