L & C Dr. W. Ceusters Language & Computing nv 1 L&C’s LinkBase: a multi-lingual Hub to medical terminologies Dr. W. Ceusters Dir R&D Language & Computing nv
Presentation overview
  Short history of L&C
  L&C’s integrated approach to medical natural language understanding
    –Focus on medical terminology management
  Position in the international market
  Relevant demonstrations
    –LinkFactory
    –Ontology Browser
Goal of Language & Computing nv
  To provide users and developers of systems for knowledge management with tools and services for efficient and accurate data-entry and retrieval by exploiting the full power of automated (medical) natural language understanding.
  We hereby declare...
Language Engineering
  [Diagram labels: speech recognition, TTS, natural language understanding, text generation, speech, text, semantic representations, language models, semantic models, dialogue models, speech models, information processing]
The three pillars of Healthcare IT
  Pillars: EHCRS, Language, Terminology
  Individual patient care; Seamless care; Historical overview...
  Comparability of data; Cross-border care; Decision support; Abstraction / grouping...
  Faithful data recording; Sufficient level of detail...
  Domain of discourse: healthcare
History of R&D in L&C
  [Chart labels: Anthem, Multi-Tale, Dome, GIU, Select, C-Care, Liquid, Mobidev; R/D ratio]
L&C’s integrated approach
The L&C integrated solution
  Data structure and function library for language understanding
  Medical and linguistic knowledge required for language understanding
  NLU-enabling tools for knowledge-supported data entry and retrieval
The L&C Linguistic Concept Factory
  Linguistic-semantic Function Library
  Storage Functions
    C-DEFINE(c-meningitis, c-inflammation HAS-LOC c-meninges)
    T-DEFINE(“méningite”, french, c-meningitis)
  Retrieval Functions
    GET-TERMS(c-meningitis, {french, dutch}) returns “méningite”, “hersenvliesontsteking”
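The storage and retrieval functions on this slide can be pictured with a minimal Python sketch. It is not the L&C function library: the class name ConceptStore, the method signatures and the in-memory dictionaries are assumptions made for illustration, and the Dutch T-DEFINE call is added only so that the retrieval reproduces the result shown on the slide.

```python
from collections import defaultdict


class ConceptStore:
    """Hypothetical in-memory store that keeps concepts and terms separate."""

    def __init__(self):
        # concept -> (genus concept, link type, target concept)
        self.definitions = {}
        # concept -> language -> list of terms
        self.terms = defaultdict(lambda: defaultdict(list))

    def c_define(self, concept, genus, link, target):
        """Store a formal definition, e.g. meningitis = inflammation HAS-LOC meninges."""
        self.definitions[concept] = (genus, link, target)

    def t_define(self, term, language, concept):
        """Attach a term in one language to an already defined concept."""
        if concept not in self.definitions:
            raise KeyError(f"unknown concept: {concept}")
        self.terms[concept][language].append(term)

    def get_terms(self, concept, languages):
        """Return the terms for a concept in the requested languages, in order."""
        by_language = self.terms.get(concept, {})
        return [t for lang in languages for t in by_language.get(lang, [])]


store = ConceptStore()
store.c_define("c-meningitis", "c-inflammation", "HAS-LOC", "c-meninges")
store.t_define("méningite", "french", "c-meningitis")
# Assumed call: the slide only shows the French T-DEFINE.
store.t_define("hersenvliesontsteking", "dutch", "c-meningitis")
terms = store.get_terms("c-meningitis", ["french", "dutch"])
```

The languages are passed as a list rather than a set so that the order of the returned terms is predictable.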
Architectural overview
Client Graphical Objects
Built-in quality control
  Knowledge entered is immediately used to check validity of subsequent entries
  Version management
  User management with:
    –Allowed actions based on experience
    –Personal audit trail
  Clear and formal separation from 3rd party systems to avoid copying mistakes such as:
    –UMLS’ cyclical ISA relationships
    –SNOMED-RT’s “very usual = always” modelling
    –Most systems’ overloaded hierarchical relations
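One way to picture "knowledge entered is immediately used to check validity of subsequent entries" is a check that refuses any new IS-A link that would close a cycle, the kind of error the slide attributes to UMLS. The sketch below is an assumption about how such a check could look, not the LinkFactory implementation; the class IsAHierarchy and the concept c-disease are hypothetical.

```python
from collections import defaultdict


class IsAHierarchy:
    """Hypothetical IS-A store that rejects links creating a cycle."""

    def __init__(self):
        self.parents = defaultdict(set)  # child -> direct IS-A parents

    def _is_ancestor(self, candidate, concept):
        """True if `candidate` is reachable from `concept` by following IS-A links."""
        stack, seen = [concept], set()
        while stack:
            for parent in self.parents.get(stack.pop(), ()):
                if parent == candidate:
                    return True
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return False

    def add_is_a(self, child, parent):
        """Add `child IS-A parent` unless the link would make the hierarchy cyclic."""
        if child == parent or self._is_ancestor(child, parent):
            raise ValueError(f"{child} IS-A {parent} would create a cycle")
        self.parents[child].add(parent)


hierarchy = IsAHierarchy()
hierarchy.add_is_a("c-meningitis", "c-inflammation")
hierarchy.add_is_a("c-inflammation", "c-disease")  # c-disease is a made-up concept
try:
    hierarchy.add_is_a("c-disease", "c-meningitis")
    rejected = False
except ValueError:
    rejected = True
```

Because the check runs at entry time, each accepted link becomes part of the knowledge used to validate the next one.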
The L&C Linguistic Concept Database
  [Diagram labels: Formal Domain Ontology; Cassandra Linguistic Ontology; Language A (Lexicon, Grammar); Language B (Lexicon, Grammar); MedDRA; ICD; SNOMED; ICPC; Others...; Proprietary Terminologies]
A formal terminology
  Separation of terms and concepts
  To be used by machines, not people
  All information is explicit in the structure, not implicit in the terms
  Clean subsumption hierarchies
  Formal, “computable” definitions of concepts
  Internal, automated quality control
Example: Joint anatomy
  joint HAS-HOLE joint space
  joint capsule IS-OUTER-LAYER-OF joint
  meniscus
    –IS-INCOMPLETE-FILLER-OF joint space
    –IS-TOPO-INSIDE joint capsule
    –IS-NON-TANGENTIAL-MATERIAL-PART-OF joint
  joint
    –IS-CONNECTOR-OF bone X
    –IS-CONNECTOR-OF bone Y
  synovia
    –IS-INCOMPLETE-FILLER-OF joint space
  synovial membrane IS-BONAFIDE-BOUNDARY-OF joint space
Example: Relative spatial localisation
  [Diagram of link types:]
  IS-TOPO-INSIDE-OF
  IS-GEO-INSIDE-OF
  IS-INSIDE-CONVEX-HULL-OF
  IS-PARTLY-IN-CONVEX-HULL-OF
  IS-OUTSIDE-CONVEX-HULL-OF
  HAS-DISCONNECTED-REGION
  HAS-EXTERNAL-CONNECTING-REGION
  HAS-DISCRETED-REGION
  HAS-TANG.-SPAT.-PART
  HAS-NON-TANG.-SPAT.-PART
  IS-SPAT.-EQUIV.-OF
  IS-TANG.-SPAT.-PART-OF
  IS-NON-TANG.-SPAT.-PART-OF
  HAS-PARTIAL-SPATIAL-OVERLAP
  HAS-PROPER-SPATIAL-PART
  IS-PROPER-SPAT.-PART-OF
  HAS-SPATIAL-PART
  IS-SPATIAL-PART-OF
  HAS-OVERLAPPING-REGION
  HAS-CONNECTING-REGION
  HAS-SPATIAL-POINT-REFERENCE
Example: Patient at risk (risk patient)
  [Diagram concepts: Generalised Possession; Having a healthcare phenomenon; Healthcare phenomenon; Human; Patient; Patient at risk; Risk Factor; Patient at risk for osteoporosis; Risk factor for osteoporosis; Osteoporosis]
  [Diagram link types: IS-A; Has-possessor; Has-possessed; Is-possessor-of; Has-Healthcare-phenomenon; Is-Risk-Factor-Of]
LinkBase size per ( )
  concepts
  terms
  320 link-types
  link instances
  links to 3rd party systems
  But:
    –Never finished!
    –Quality sufficient for current applications
L&C Linguistic components
  [Diagram labels: Text; Processor; Result; Domain representation; Goal representation; Linguistic Knowledge; Task Knowledge; Formal domain ontology]
L&C application servers
  Coding tools: FastCode
  Semantic indexers: Tessi
  Spell checkers and type-ahead: FastType
  Semi-controlled language parsers in restricted domains: FreePharma
  Ontology browser
  Stochastic dependency-based indexer: C-Link
  (Ir)relevant document classifier for very low prevalence data sets
Integrated coding approach
  [Diagram labels: LinC-Factory; LinCBase; Domain+Linguistic ontology; Formal representation of classification system; Mapping data; FastCode Generator; FastCode server; FastCode client; Coding data]
Benefits of formal multi-lingual terminology management
Semi-automatic mapping (ICPC-ICD10)
  Zenker’s diverticulum (D84)
  Acquired diverticulum of esophagus (K22.5)
  [Diagram concepts: diverticulum; esophagus; pressure; intraluminal; Acquired]
  [Diagram link types: HAS-LOC; HAS-CAUSE; HAS-ORIG; HAS-AcqMode]
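A rough sketch of how formal definitions could drive semi-automatic mapping: score each pair of entries from the two classifications by the overlap of their defining criteria and hand the ranked proposals to a human reviewer. The exact links in the slide's diagram cannot be recovered, so the flattened definitions below, the IS-A link to diverticulum, the Jaccard score and the function name propose_mappings are all illustrative assumptions, not L&C's algorithm.

```python
from typing import Dict, FrozenSet, List, Tuple

Criterion = Tuple[str, str]  # (link type, target concept)
Definitions = Dict[str, FrozenSet[Criterion]]


def propose_mappings(source: Definitions, target: Definitions,
                     threshold: float = 0.5) -> List[Tuple[str, str, float]]:
    """Rank (source code, target code) pairs by Jaccard overlap of their criteria."""
    proposals = []
    for s_code, s_def in source.items():
        for t_code, t_def in target.items():
            union = s_def | t_def
            score = len(s_def & t_def) / len(union) if union else 0.0
            if score >= threshold:
                proposals.append((s_code, t_code, score))
    # Best candidates first; a human reviewer accepts or rejects each proposal.
    return sorted(proposals, key=lambda p: p[2], reverse=True)


# Assumed, simplified definitions built from the labels on the slide.
icpc = {
    "D84": frozenset({("IS-A", "diverticulum"), ("HAS-LOC", "esophagus"),
                      ("HAS-AcqMode", "acquired"), ("HAS-CAUSE", "pressure")}),
}
icd10 = {
    "K22.5": frozenset({("IS-A", "diverticulum"), ("HAS-LOC", "esophagus"),
                        ("HAS-AcqMode", "acquired")}),
}
proposals = propose_mappings(icpc, icd10)
```

In this toy input the two entries share three of four distinct criteria, so the pair is proposed with a score of 0.75 and left for a reviewer to confirm.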
Reclassify: FOOT EXARTICULATION
  Definitions given by domain expert:
    –( (FOOT EXARTICULATION) { [IS_A] (EXARTICULATION) } { [HAS_THEME] (FOOT) } )
    –( (AMPUTATION OF FOOT) { [IS_A] (AMPUTATION) } { [HAS_THEME] (FOOT) } )
    –( (EXARTICULATION) { [IS_A] (AMPUTATION) } { [HAS_SOURCE] (JOINT) } )
  Redefinition by automatic classifier:
    –( (FOOT EXARTICULATION) { [IS_A] (AMPUTATION OF FOOT) } { [IS_A] (EXARTICULATION) } )
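The reclassification on this slide can be reproduced with a small structural-subsumption sketch. It assumes that each expert definition is both necessary and sufficient and that undefined concepts such as AMPUTATION are primitive; the function names are hypothetical and the code is not the L&C classifier.

```python
from typing import Dict, Set, Tuple

Link = Tuple[str, str]

# concept -> (stated IS_A parents, other defining links), as given on the slide
DEFINITIONS: Dict[str, Tuple[Set[str], Set[Link]]] = {
    "FOOT EXARTICULATION": ({"EXARTICULATION"}, {("HAS_THEME", "FOOT")}),
    "AMPUTATION OF FOOT": ({"AMPUTATION"}, {("HAS_THEME", "FOOT")}),
    "EXARTICULATION": ({"AMPUTATION"}, {("HAS_SOURCE", "JOINT")}),
}


def ancestors(concept: str) -> Set[str]:
    """All concepts reachable through stated IS_A links."""
    seen: Set[str] = set()
    stack = [concept]
    while stack:
        parents, _ = DEFINITIONS.get(stack.pop(), (set(), set()))
        for parent in parents - seen:
            seen.add(parent)
            stack.append(parent)
    return seen


def inherited_links(concept: str) -> Set[Link]:
    """Defining links of the concept plus those of all its ancestors."""
    links: Set[Link] = set()
    for c in {concept} | ancestors(concept):
        links |= DEFINITIONS.get(c, (set(), set()))[1]
    return links


def subsumes(general: str, specific: str) -> bool:
    """True if `specific` meets every criterion in the definition of `general`."""
    if general == specific or general not in DEFINITIONS:
        return False  # primitive concepts subsume only through stated IS_A links
    parents, _ = DEFINITIONS[general]
    return (parents <= ancestors(specific)
            and inherited_links(general) <= inherited_links(specific))


def reclassify(concept: str) -> Set[str]:
    """Most specific subsumers of `concept`, stated or inferred."""
    subsumers = ancestors(concept) | {g for g in DEFINITIONS if subsumes(g, concept)}
    return {s for s in subsumers if not any(s in ancestors(o) for o in subsumers)}


new_parents = reclassify("FOOT EXARTICULATION")
```

FOOT EXARTICULATION inherits AMPUTATION through EXARTICULATION and has HAS_THEME FOOT, so it satisfies the whole definition of AMPUTATION OF FOOT; the sketch therefore returns both AMPUTATION OF FOOT and EXARTICULATION as direct parents, matching the redefinition on the slide.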
Detection of missing terms
Resolving conflicting views
  [Diagram nodes:]
  MESH-2001: “Seizures”
  MESH-2001: “Convulsions”
  Snomed-RT: “Convulsion”
  Snomed-RT: “Seizure”
  L&C: Convulsion
  L&C: Seizure
  L&C: Health crisis
  L&C: Epileptic convulsion
  [Diagram link types: IS-A; IS-narrower-than; Has-CCC]
Position in the market
Main business model
  Software developers
  Integrators
  Hospitals
  Internet Service Providers
  Pharmaceutical companies
  Research Organisations
  Medical Publishers
  Government
  Healthcare Insurance Companies
  MCO
Project-based product development
  [Diagram labels: Service Component; Product Component; Project Definition; Corpus analysis; Set up service; Product development; Workbench development; Teach and deliver]
Current major partners/clients
  Coding tools
    –Several hospitals using ICD-9-CM FastCode
  Terminology management services + NLU-based data entry
    –IDEWE: largest Belgian occupational medicine services provider
    –First Databank UK
    –Belgian military medical service
  Semantic indexing
    –Belgian Professional Association of Pharmaceutical industry
Academic Competitors/Colleagues
  Main characteristics:
    –Prototypes with very small coverage
    –No professional support
  Relevant examples:
    –OpenGalen (VUMAN): very small “LinkBase”; “toy” link to language (language ignored as medium)
    –Protégé (Stanford): ontology editor
    –Several DL-systems (FaCT, Cyclop, LOOM, ...): tested only with very small ontologies; more powerful reasoning mechanisms than LinkFactory, but intractable on ontologies of more than a few thousand distinct concept classes
Commercial competitors/colleagues
  Health Language Inc.
  Apelon Inc.
    –Ontyx
    –Lexical Technologies
L&C’s strong position
  Multi-lingual and multi-cultural approach
  Modelling independent of specific languages, but not of language as a communication medium
  Proven scalability of our approach
  Support at all levels
    –Services to migrate existing client dictionaries
    –Large tool set for terminology development, maintenance, and/or use
  Only company with in-house expertise in medicine, computational linguistics in many languages, formal ontologies and informatics