Detection of underspecifications in SNOMED CT concept definitions using language processing 1 Federal Technical University of Paraná (UTFPR), Curitiba,

Slides:



Advertisements
Similar presentations
Semantic Interoperability in Health Informatics: Lessons Learned 10 January 2008Semantic Interoperability in Health Informatics: Lessons Learned 1 Medical.
Advertisements

Complex Occurrents in Clinical Terminologies Stefan Schulz, Kornél Markó Department of Medical Informatics, Freiburg University Hospital, Germany Boontawee.
Automatic Mapping of Clinical Documentation to SNOMED CT Holger Stenzhorn Saarland University Hospital, Homburg, Germany Edson Pacheco Percy Nohama Stefan.
Consistent and standardized common model to support large-scale vocabulary use and adoption Robust, scalable, and common API to reduce variation in clinical.
1 Institute of Medical Biometry und Medical Informatics, University Medical Center Freiburg, Germany 2 AVERBIS GmbH, Freiburg, Germany 3 Paediatric Hematology.
Copyright © 2001 College of American Pathologists Pseudophakic corneal edema Sample Hierarchy for Corneal Edema Idiopathic corneal edema Phakic corneal.
W. Ceusters a, I. Desimpel a, B. Smith b, S. Schulz c a Language and Computing nv., Zonnegem, Belgium b IFOMIS, Leipzig, Germany c Dept. of.
ECO R European Centre for Ontological Research Ontology-based Error Detection in SNOMED-CT ® Werner Ceusters European Centre for Ontological Research Universität.
Catalina Martínez-Costa, Stefan Schulz: Ontology-based reinterpretation of the SNOMED CT context model Ontology-based reinterpretation of the SNOMED CT.
A Description Logics Approach to Clinical Guidelines and Protocols Stefan Schulz Department of Med. Informatics Freiburg University Hospital Germany Udo.
SNOMED CT’s Ontological Commitment Stefan Schulz University Medical Center, Freiburg, Germany Ronald Cornet Academic Medical Center, Amsterdam, The Netherlands.
HISA ltd. Biography proforma MEDINFO Lygon Street, Brunswick East 3057 Australia Presenter Name: Stefan Schulz Country:1. Germany, 2. Brazil Qualification(s):
Andrade et al. Corpus-based Error Detection in a Multilingual Medical Thesaurus HISA ltd. Biography proforma MEDINFO Lygon Street, Brunswick East.
Biomedical Informatics Some Observations on Clinical Data Representation in EHRs Christopher G. Chute, MD DrPH, Mayo Clinic Chair, ICD11 Revision, World.
A First Attempt towards a Logical Model for the PBMS PANDA Meeting, Milano, 18 April 2002 National Technical University of Athens Patterns for Next-Generation.
Copyright © 2001 College of American Pathologists Sample Hierarchy for Acute MI Acute MI of lateral wall Acute MI of apical-lateral wall Acute Myocardial.
Copyright © 2001 College of American Pathologists Sample Hierarchy for Hepatitis Chronic viral hepatitis Chronic viral hepatitis C About SNOMED Relationships.
Stefan Schulz, Thorsten Seddig, Susanne Hanser, Albrecht Zaiß, Philipp Daumke Checking coding completeness by mining discharge summaries.
The IHTSDO Workbench A Terminology Management Tool John Gutai, IHTSDO May 2011 For OHT.
Ontology-based Convergence of Medical Terminologies: SNOMED CT and ICD-11 Institute for Medical Informatics, Statistics and Documentation, Medical University.
Developing an OWL-DL Ontology for Research and Care of Intracranial Aneurysms – Challenges and Limitations Holger Stenzhorn, Martin Boeker, Stefan Schulz,
The Foundational Model of Anatomy and its Ontological Commitment(s) Stefan Schulz University Medical Center, Freiburg, Germany FMA in OWL meeting November.
A Case Study of ICD-11 Anatomy Value Set Extraction from SNOMED CT Guoqian Jiang, PhD ©2011 MFMER | slide-1 Division of Biomedical Statistics & Informatics,
SNOMED CT – Distributed Content Management Stefan Schulz Content Committee April 2, 2009.
Knowledge Representation and Indexing Using the Unified Medical Language System Kenneth Baclawski* Joseph “Jay” Cigna* Mieczyslaw M. Kokar* Peter Major.
Terminology and HL7 Dr Colin Price HL7 UK 11 th December 2003.
Overview, Benefits and how to approach Implementing - Robyn Richards (NEHTA)
IntroductionMethods & MaterialsResults Conclusions The Office of Standards and Interoperability (OFTSI) of the Foundation TicSalut, is working on the need.
Copyright © 2001 College of American Pathologists Sample Hierarchy for Tularemia Disorder Zoonotic bacterial disease Enteric tularemia Glandular tularemia.
A School of Information Science, Federal University of Minas Gerais, Brazil b Medical University of Graz, Austria, c University Medical Center Freiburg,
WHO – IHTSDO: SNOMED CT – ICD-11 coordination: Conditions vs. Situations Stefan Schulz IHTSDO conference Copenhagen, Apr 24, 2012.
Stefan Schulz, Kornél Markó, Philipp Daumke, Udo Hahn, Susanne Hanser, Percy Nohama, Roosewelt Leite de Andrade, Edson Pacheco, Martin Romacker Semantic.
Semantic Clarification of the Representation of Procedures and Diseases in SNOMED CT Stefan Schulz Medical Informatics, Freiburg University Hospital (Germany)
Ontological Analysis, Ontological Commitment, and Epistemic Contexts Stefan Schulz WHO – IHTSDO Joint Advisory Group First Face-to-Face Meeting Heathrow,
The ICPS: A taxonomy, a classification, an ontology or an information model? Stefan SCHULZ IMBI, University Medical Center, Freiburg, Germany.
What do SNOMED CT Concepts Represent? Stefan Schulz University Medical Center, Freiburg, Germany 31 July & 1 August 2009 Internationales Begegnungszentrum.
Layered MorphoSaurus Lexicon Extension. Problem Confuse and arbitrary synonym classes of non-medical concepts High ambiguity of general (non- terminological)
© University of Manchester Creative Commons Attribution-NonCommercial 3.0 unported 3.0 license Lexically Suggest, Logically Define: QA of Qualifiers &
Higgs bosons, mars missions, and unicorn delusions: How to deal with terms of dubious reference in scientific ontologies Stefan Schulz a,b Mathias Brochhausen.
SNOMED CT A Technologist’s Perspective Gaur Sunder Principal Technical Officer & Incharge, National Release Center VC&BA, C-DAC, Pune.
Nursing Occupations Proposed by: Cynthia Lundberg, BSN Judith Warren, PhD, RN.
Editorial Advisory Group IHTSDO Business Meeting London April 2016 James T. Case Head of Terminology.
Mapping International Classification for Nursing Practice (ICNP) and SNOMED CT Submitted by the ICNP Programme Team to IHTSDO Nursing Special Interest.
© University of Manchester Creative Commons Attribution-NonCommercial 3.0 unported 3.0 license Quality Assurance, Ontology Engineering, and Semantic Interoperability.
Assessing SNOMED CT for Large Scale eHealth Deployments in the EU Workpackage 2- Building new Evidence Daniel Karlsson, Linköping University Stefan Schulz,
UNIFIED MEDICAL LANGUAGE SYSTEMS (UMLS)
Representation of Hypersensitivity, Allergy and adverse reactions in SNOMED CT Bruce Goldberg, MD, PhD.
The “Common Ontology” between ICD 11 and SNOMED CT
Using language technology for SNOMED CT localization
Stefan Schulz Medical Informatics Research Group
SNOMED CT’s RF2: Werner CEUSTERS1 , MD
Solving the Health IT Interoperability Puzzle with SNOMED CT
Scope The scope of this test is to make a preliminary assessment of the fitness of the Draft Observables and Investigation Model (Observables model)
SNOMED CT, Synoptic Observables and Clinical Genomics
February 17, 2017 Bruce Goldberg, MD, PhD
Ontology authoring and assessment
Drug and Substance Project Update
Drug and Substance Project Update
Terminology Quality Evaluation S60 Rashmie Abeysinghe Joint work with
Efficient Remediation of Terms Inactivated by Dictionary Updates
Kin Wah Fung, MD, MS, MA National Library of Medicine, NIH
Multilingual Biomedical Dictionary
Device complications.
A Description Logics Approach to Clinical Guidelines and Protocols
Controlled Vocabularies for Capturing Clinical Encounters
Harmonizing SNOMED CT with BioTopLite
Lexical ambiguity in SNOMED CT
Morphoogle - A Multilingual Interface to a Web Search Engine
Stefan SCHULZ IMBI, University Medical Center, Freiburg, Germany
A Description Logics Approach to Clinical Guidelines and Protocols
Presentation transcript:

Detection of underspecifications in SNOMED CT concept definitions using language processing 1 Federal Technical University of Paraná (UTFPR), Curitiba, Brazil; 2 Pontifical Catholic University of Paraná (PUCPR), Curitiba, Brazil; 3 Institute of Medical Biometry und Medical Informatics, University Medical Center Freiburg; 4 AVERBIS GmbH, Freiburg, Germany Detection of underspecifications in SNOMED CT concept definitions using language processing Edson Pacheco 1,2, Holger Stenzhorn 3, Percy Nohama 1, Jan Paetzold 3,4, Stefan Schulz 2,3,4

Detection of underspecifications in SNOMED CT concept definitions using language processing SNOMED CT “Standardized Nomenclature of Medicine - Clinical Terms” Comprehensive clinical terminology ( > 300,000 representational units) Concepts are arranged in extensive taxonomic (is-a) hierarchies Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Taxonomic Structure of SNOMED CT Background Methods Evaluation Results Conclusions Relation: is-a

Detection of underspecifications in SNOMED CT concept definitions using language processing SNOMED CT “Standardized Nomenclature of Medicine - Clinical Terms” Comprehensive clinical terminology ( > 300,000 representational units) Concepts are arranged in extensive taxonomic (is-a) hierarchies Cross-reference between concepts from several branches via semantic relations obeying description logics semantics Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Cross-reference between SNOMED CT concepts Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing SNOMED CT semantics in a nutshell B C D A r S A subClassOf C and r some B and s some D A – isA – C A – r – B A – s – D EL+ description logics A: concept C: taxonomic parent B, D: attributes Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing SNOMED CT “Standardized Nomenclature of Medicine - Clinical Terms” Comprehensive clinical terminology ( > 300,000 representational units) Concepts are arranged in extensive taxonomic (is-a) hierarchies Cross-reference between concepts from several branches via semantic relations obeying description logics semantics Burden of terminology content maintenance and quality assurance Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing SNOMED CT “Standardized Nomenclature of Medicine - Clinical Terms” Comprehensive clinical terminology ( > 300,000 representational units) Concepts are arranged in extensive taxonomic (is-a) hierarchies Cross-reference between concepts from several branches via semantic relations obeying description logics semantics Burden of terminology content maintenance and quality assurance To be supported by automated approaches Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Looking for underspecifications of cross-linkage Nearly half (45.2%) of the SNOMED CT concepts (132,125) have no attributes. Textual descriptions suggest composed meanings Examples: –Cerebral function only related to its parent Nervous system function expected relation with Brain structure missing –Hepatitis notification only related to its parent Disease notification expected relation with Inflammatory disease of liver missing Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Automatic suggestion of attributes Source: 01/2009 release of SNOMED CT Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Automatic suggestion of attributes Source: 01/2009 release of SNOMED CT Algorithm: 1.Identify non-attributed concepts Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Select non-attributed SNOMED CT concepts Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Select non-attributed SNOMED CT concepts Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Automatic suggestion of attributes Source: 01/2009 release of SNOMED CT Algorithm: 1.Identify non-attributed concepts 2.Extract concept names Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Extract names of non-attributed concepts “hepatitis notification”“notification of AIDS” “notification of disease” Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Automatic suggestion of attributes Source: 01/2009 release of SNOMED CT Algorithm: 1.Identify non-attributed concepts 2.Extract concept names 3.Perform semantic abstraction from word to sets of morphosemantic identifiers (MIDs) Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Perform morphosemantic abstraction “hepatitis notification”“notification of AIDS” “notification of disease” #report #disease #liver #inflamm #report #report #aids Using MorphoSaurus* morphosemantic indexing *Markó K, Schulz S, Hahn U: MorphoSaurus - Design and Evaluation of an Interlingua-based, Cross-language Document Retrieval Engine for the Medical Domain. Meth Inf Med 4/2005(44): Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Automatic suggestion of attributes Source: 01/2009 release of SNOMED CT Algorithm: 1.Identify non-attributed concepts 2.Extract concept names 3.Perform semantic abstraction from word to sets of morphosemantic identifiers (MIDs) 4.Compare MID sets between children and parents and reduce child sets Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Automatic suggestion of attributes Source: 01/2009 release of SNOMED CT Algorithm: 1.Identify non-attributed concepts 2.Extract concept names 3.Perform semantic abstraction from word to sets of morphosemantic identifiers (MIDs) 4.Compare MID sets between children and parents and reduce child sets 5.Match reduced child set against MID representations of all SNOMED descriptions Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Matching heuristics For the FSN MID set of every non-attributed concept: –remove MID that occurs in any of this concept’s parents –check whether the remainder set coincides with the MID representation of some other SNOMED CT concept, considering all descriptions (FSNs, PTs, synonyms) –consider this concept a refinement candidate #report #disease #liver #inflamm #report “notification of disease” “hepatitis notification” SNOMED description MID set report FSN: “inflammatory disease of liver” = “hepatitis” #inflamm #disease #liver #inflamm FSN: SYN: Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Automatic suggestion of attributes Source: 01/2009 release of SNOMED CT Algorithm: 1.Identify non-attributed concepts 2.Extract concept names 3.Perform semantic abstraction from word to sets of morphosemantic identifiers (MIDs) 4.Compare MID sets between children and parents and reduce child sets 5.Match reduced child set against MID representations of all SNOMED descriptions 6.Suggest candidates for refining attributes Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Addition of refinement candidate “hepatitis notification” “notification of disease” #report #disease #liver #inflamm #report *Markó K, Schulz S, Hahn U: MorphoSaurus - Design and Evaluation of an Interlingua-based, Cross-language Document Retrieval Engine for the Medical Domain. Meth Inf Med 4/2005(44): “inflammatory disease of liver” = “hepatitis” #liver #inflamm Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Evaluation Methodology For each of 14 SNOMED subhierarchies: random sample of 20 underspecified concepts, compared to attribute refinement candidate proposed by the system For each of the sample concept verify 1.whether this concept should be refined 2.whether one of the suggested refinement candidates can be plausibly used for refinement. Performed by two domain experts. Double rating for interrater agreement measurement: 25% Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Results of retrieval experiments Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Results of retrieval experiments Non-attributed Concepts Refinement candidates Analysis of samples(n=20) Sample based estimation SNOMED hierarchies Active Concepts n%n%refine- ment justified correct suggest ion refinable concepts with correct suggest- ions Organism % 00 Substance %35% body structure %0%8000 qualifier value % 00 observable entity %50% Finding %75% physical object %80%1100 morphologic abnormality %60% Occupation %10% Product %60% Event %45% Disorder %60% Procedure %65% Others %60% TOTAL Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Results Interrater agreement (Kohen’s kappa): –A concept should be refined: 0.55 (low !) –There is a proposed refinement candidate: 0.74 Estimation: approximately 18,000 SNOMED CT concepts can be refined. Problematic suggestions: –Macaroni for Macaroni maker –Canada for Salmonella canada –Acyl carnitine for Acylcarnitine hydrolase –First for Female first cousin (already fully defined by the intersection of First cousin and Female cousin) Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Conclusions Many SNOMED CT concepts are underdefined and can / should be refined The proposed methodology was useful to detect underspecifications Large difference between SNOMED hierarchies re harvesting and approval of refinement candidates “Grey areas” –many proposed refinements are debatable –only part of refinement candidates not retrieved due to restrictions of the methodology Should be considered for future SNOMED CT editing policies Background Methods Evaluation Results Conclusions

Detection of underspecifications in SNOMED CT concept definitions using language processing Thank You! Contact: Stefan Schulz Acknowledgements: CNPq, Brazil: / BMBF-IB, Germany: BRA05/022