Mapping the NCI Thesaurus and the Collaborative Inter-Lingual Index Amanda Hicks University of Florida HealthInsight Workshop, Oslo, Norway.


1 Mapping the NCI Thesaurus and the Collaborative Inter-Lingual Index
Amanda Hicks, University of Florida, aehicks@ufl.edu
HealthInsight Workshop, Oslo, Norway, 20 May 2016

2 Overview
Ontologies versus wordnets
Inter-Lingual Index
Future work to map the National Cancer Institute Thesaurus to the Collaborative Inter-Lingual Index
In collaboration with
– Francis Bond, Nanyang Technological University, Singapore
– Selja Seppälä, University of Florida, USA

3 ONTOLOGIES VERSUS WORDNETS

4 Semantic Networks
Words, concepts, or classes arranged in a network
Provide a framework of machine-readable meaning and context for what would otherwise be uninterpreted syntax
Simultaneously logical objects and mathematical objects, subject to inference and graph-theoretical analysis
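
The point that a semantic network is both a graph and a basis for inference can be sketched in a few lines of Python. This is a minimal illustration, not part of the talk: the "patella is a sesamoid bone" edge is taken from the WordNet example later in the deck, while the remaining edges and all function names are assumptions made for the example.

```python
from collections import deque

# Toy network: each node points to its direct hypernym(s).
# Only "patella" -> "sesamoid bone" comes from the slides; the rest is hypothetical.
HYPERNYMS = {
    "patella": ["sesamoid bone"],
    "sesamoid bone": ["bone"],
    "bone": ["anatomical structure"],
    "anatomical structure": [],
}

def ancestors(node, graph):
    """Return every node reachable by following hypernym edges (breadth-first)."""
    seen = []
    queue = deque(graph.get(node, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.append(current)
            queue.extend(graph.get(current, []))
    return seen

def is_a(node, candidate, graph):
    """Infer an is-a relation that is only implicit in the stored edges."""
    return candidate in ancestors(node, graph)

print(ancestors("patella", HYPERNYMS))
print(is_a("patella", "bone", HYPERNYMS))
```

The graph traversal is the "mathematical object" view; reading the result as "a patella is a bone" is the "logical object" view.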

5 Wordnets versus Ontologies
Wordnets are semantic networks that represent how we use language.
– Meanings are stated in natural-language definitions and relatively sparse semantic relations.
– The word ‘cat’ in context
Ontologies are semantic networks that represent properties of things in the world.
– Meanings are encoded in logical form.
– What it is to be a cat

6 Comparative Strengths
Wordnets
– NLP applications that require word sense discrimination
– Cross-lingual comparison of lexical categories
– Distance or related measures of concepts
Ontologies
– Provide a coherent, stable, and unified frame of reference for the interpretation of concepts and specification of classes
– May support interoperability of data sets
– Support deductive reasoning over structured data
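
One common kind of distance measure over a wordnet is the length of the shortest path between two concepts in the hypernym graph. The sketch below assumes that measure and a hand-made toy graph; the slides do not say which measure the authors have in mind, and the edges and function names here are illustrative only.

```python
from collections import deque

# Hypothetical hypernym edges (child, parent), for illustration only.
EDGES = [
    ("patella", "sesamoid bone"),
    ("sesamoid bone", "bone"),
    ("femur", "long bone"),
    ("long bone", "bone"),
]

def build_undirected(edges):
    """Turn child/parent pairs into an undirected adjacency map."""
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph

def path_distance(a, b, graph):
    """Number of edges on the shortest path between a and b, or None if unconnected."""
    if a not in graph or b not in graph:
        return None
    queue = deque([(a, 0)])
    visited = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in graph[node]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def path_similarity(a, b, graph):
    """Map a distance d to a score in (0, 1] as 1 / (1 + d)."""
    d = path_distance(a, b, graph)
    return None if d is None else 1.0 / (1.0 + d)

GRAPH = build_undirected(EDGES)
print(path_distance("patella", "femur", GRAPH))
print(path_similarity("patella", "femur", GRAPH))
```

In this toy graph "patella" and "femur" are four edges apart (through "bone"), so the similarity score is 0.2.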

7 INTER-LINGUAL INDEX

8 Current State of Mapping Wordnets
Wordnets exist for many languages.
– 33 open wordnets in the Global Wordnet Grid: http://globalwordnet.org/global-wordnet-grid/
Mapping often occurs through English WordNet.
– English-centric
– English does not have a word for every concept.
Some wordnets are mapped to each other directly.

9 Mapping wordnets to each other directly gets messy.
(Vossen, Global WordNet Conference, 2016)

10 The Collaborative Inter-Lingual Index (CILI)
[Figure: diagram with wordnets labeled en, es, no, pt]
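
Why direct mapping "gets messy" while a shared index does not can be shown with a simple count. The sketch assumes that a direct mapping is counted once per unordered pair of wordnets and that an index needs exactly one mapping per wordnet; both function names are made up for the example.

```python
def pairwise_mappings(n_wordnets):
    """Mappings needed if every pair of wordnets is linked directly (one per unordered pair)."""
    return n_wordnets * (n_wordnets - 1) // 2

def hub_mappings(n_wordnets):
    """Mappings needed if every wordnet is linked once to a shared index."""
    return n_wordnets

for n in (4, 33):
    print(n, pairwise_mappings(n), hub_mappings(n))
```

For the four wordnets in the diagram this gives 6 direct mappings versus 4 index mappings; for the 33 open wordnets mentioned earlier it gives 528 versus 33.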

11 CILI
Flat list of concepts, each with a persistent, Semantic Web-compliant IRI
Synsets from wordnets are mapped directly to the ILI IRI
English WordNet 3.0 and 3.1 and Open Dutch WordNet are currently mapped
A unique English definition is associated with each ILI to support mapping (but no English words or labels)
Not imposed on linked wordnets
Open: anyone can contribute at https://github.com/globalwordnet/ili
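
The structure described on this slide can be sketched as a flat table of index entries plus one synset-to-index mapping per wordnet. Everything concrete below is an assumption for illustration: the IRI is a placeholder rather than a real CILI identifier, the synset IDs are invented, and the Spanish and Norwegian lemmas are supplied only as example data. Following the slide, an index entry carries an IRI and an English definition but no words or labels.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IliConcept:
    iri: str         # persistent identifier (placeholder IRI, not a real CILI IRI)
    definition: str  # English definition used to support mapping

# Flat list of index entries, keyed by a hypothetical short ID.
ILI = {
    "i-demo-1": IliConcept(
        iri="https://example.org/ili/i-demo-1",
        definition="A small flat triangular bone in front of the knee "
                   "that protects the knee joint",
    ),
}

# Each wordnet maps its own synset IDs to an ILI key; IDs and lemmas are illustrative.
WORDNETS = {
    "en": {"en-syn-1": ("i-demo-1", ["patella", "kneecap", "kneepan"])},
    "es": {"es-syn-7": ("i-demo-1", ["rótula"])},
    "no": {"no-syn-3": ("i-demo-1", ["kneskål"])},
}

def lemmas_for(ili_key, lang):
    """Collect the lemmas of every synset in `lang` linked to the given ILI key."""
    found = []
    for linked_key, lemmas in WORDNETS.get(lang, {}).values():
        if linked_key == ili_key:
            found.extend(lemmas)
    return found

def translate(lemma, source, target):
    """Go from a source-language lemma to target-language lemmas via the index."""
    results = []
    for ili_key, lemmas in WORDNETS.get(source, {}).values():
        if lemma in lemmas:
            results.extend(lemmas_for(ili_key, target))
    return results

print(translate("kneecap", "en", "es"))
print(translate("rótula", "es", "no"))
```

Note that the Spanish-to-Norwegian lookup never touches the English wordnet: the index entry is the only shared reference point.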

12 NCI THESAURUS AND CILI

13 National Cancer Institute Thesaurus (NCIt)
An English medical reference terminology
Definitions crafted by teams of medical experts and terminologists
Covers vocabulary for clinical care, translational and basic research, and public information and administrative activities
Widely used in biomedical and health informatics in the USA

14 Why map NCIt to CILI?
Specialized terminology in CILI should be defined by subject matter experts, not linguists.
There is currently no prototype for mapping specialized vocabulary to CILI.
More resources may lead to improved formal semantics to be integrated with CILI.
To support integration of health knowledge extracted from linguistically heterogeneous sources
– Multilingual
– Layperson/specialized vocabulary

15 “Patella” in WN and NCIt
WordNet
– Lemmas: patella, kneecap, kneepan
– Definition: A small flat triangular bone in front of the knee that protects the knee joint
– Part_holonym: knee
– Hypernym: Sesamoid bone
NCIt
– Name: BONE, PATELLA
– Definition: A small flat triangular bone in front of the knee that articulates with the femur and protects the knee joint.
– subClassOf: Bone of the Lower Extremity; Short Bone
– Semantic Type: Anatomical Structure
Callouts
– Additional synonyms
– Additional definitional knowledge: potentially useful for formalizing semantics
– Additional formal semantic information

16 Semantic Modeling
One of the goals of CILI is to have ontologies that provide formal semantics for the indexed concepts.
Different semantic resources encode different semantic information.
NCIt can be used to enrich the common semantic model.

17 Our Planned Approach to the Project
Map NCI Thesaurus to CILI
– Convert NCI Thesaurus to Lexical Markup Framework
– Partially automate mapping NCIt to CILI using string matching on WordNet synsets and NCIt names and similarity measures on definitions
– There will still be false negatives that will need to be identified by hand.
Formalize the semantics of NCIt-related CILI concepts with other ontologies.
– KYOTO
– BFO
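
The partially automated mapping step could look roughly like the sketch below. It is an assumption-laden illustration, not the authors' pipeline: it uses exact matching on normalized names and Jaccard token overlap as the definition similarity measure, the 0.5 threshold is arbitrary, the IDs are invented, and the NCIt synonym "Patella" and the second synset are hypothetical test data. The two patella definitions are the ones quoted on the “Patella” slide.

```python
import re

def normalize(name):
    """Lowercase, tokenize, and sort so that 'BONE, PATELLA' equals 'patella bone'."""
    return tuple(sorted(re.findall(r"[a-z0-9]+", name.lower())))

def tokens(text):
    """Set of lowercase alphanumeric tokens in a definition."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Token overlap between two definitions: |A & B| / |A | B|."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def candidate_mappings(ncit_entries, synsets, threshold=0.5):
    """Pair entries and synsets that share a normalized name and have similar definitions."""
    candidates = []
    for entry in ncit_entries:
        entry_names = {normalize(n) for n in entry["names"]}
        for synset in synsets:
            synset_names = {normalize(l) for l in synset["lemmas"]}
            if not entry_names & synset_names:
                continue  # no shared name: a possible false negative for manual review
            score = jaccard(entry["definition"], synset["definition"])
            if score >= threshold:
                candidates.append((entry["id"], synset["id"], round(score, 3)))
    return sorted(candidates, key=lambda c: -c[2])

# Illustrative data; IDs are invented and "Patella" is an assumed synonym.
NCIT = [{
    "id": "ncit-demo-1",
    "names": ["BONE, PATELLA", "Patella"],
    "definition": "A small flat triangular bone in front of the knee that "
                  "articulates with the femur and protects the knee joint.",
}]
SYNSETS = [
    {"id": "wn-demo-1", "lemmas": ["patella", "kneecap", "kneepan"],
     "definition": "A small flat triangular bone in front of the knee that "
                   "protects the knee joint"},
    {"id": "wn-demo-2", "lemmas": ["knee", "knee joint"],
     "definition": "the joint between the thigh and the lower leg"},
]

print(candidate_mappings(NCIT, SYNSETS))
```

Requiring a shared name keeps precision high but is exactly what produces false negatives: without the assumed synonym "Patella", the name "BONE, PATELLA" alone would not match any WordNet lemma, and the pair would have to be found by hand.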

18 Use for Knowledge Integration
[Figure: the text “… smokes hubbly-bubbly on weekends …” and the data field “Smoking status”, shown with an ILI concept]

19 THANK YOU

