Integrating ontological and linguistic knowledge for Conceptual Information Extraction Roberto Basili, Michele Vindigni, Fabio Massimo Zanzotto Università.

1 Integrating ontological and linguistic knowledge for Conceptual Information Extraction Roberto Basili, Michele Vindigni, Fabio Massimo Zanzotto Università di Roma “Tor Vergata” Italy

2 WI 2003 Motivation
professor(“XYZ”)
course(“Database Theory”)
teacherOf(professor(“XYZ”), course(“Database Theory”))
[Figure labels: professor, course, teacherOf]

3 Motivation
- Linguistic interfaces help end users deal with standardised conceptualisations (or ontologies)
- Information Extraction systems may be used for this purpose. These require conceptualisations with:
  - high domain specificity
  - high coverage
- High-coverage, domain-specific conceptualisations are difficult to build, so reuse of pre-existing knowledge is a must.

4 Reusing conceptualisations...
Let us then take a Domain Concept Hierarchy, e.g. Medical Subject Headings (MeSH), and a concept: Dendrite
- Dendrite → Neuron → Nervous System
- Dendrite → Cell Surface Extension → Cellular Structure → Cell
- Dendrite → Neuron → Cell
"None of the dendrites were cut."
"...but the dendrites and axons are often cut."
"Researchers don't know why, but for some reason, the ends of dendrites tangle and knot."
[Figure labels: dendrite, fiber, nerve fiber]

5 Target Problem
- Integration of linguistic information (Lexical Knowledge Base, LKB) with domain knowledge (Domain Concept Hierarchy, DCH)
- Need to harmonise linguistic processing (i.e. feature detection in text) with the available resource (DCH)
- Need to annotate texts with semantic information, that is, to build a linguistic interface to the DCH

6 Target problem
[Figure: Lexical Knowledge Base and Domain Concept Hierarchy]

7 Inspiring Principles
(P1) Extensional Nature of Domain Concept Hierarchy
Subsumption in the DCH has an extensional interpretation in the LKB
[Figure: Lexical Knowledge Base (nodes α1, α2, α3, α4, α5) and Domain Concept Hierarchy]

8 Inspiring Principles
(P2) Intensional Strength in Lexical Knowledge Base
Given a set of words W whose senses are subsumed by α in LKB, the intensional strength measures the trade-off between
- the generalization required to model all the words
- the capability of separating individual word senses in W

9 Intensional Strength in LKB: Conceptual Density
[Figure: tree with nodes numbered 1 to 15 and the words Word1, Word2, Word3]
CD = (Mean Tree Area of n Words) / (Actual Tree Area)
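The ratio on this slide can be sketched in a few lines. The slide gives only the ratio itself; estimating the mean tree area as a geometric sum over an average branching factor `nhyp` is an assumption here, loosely modelled on the conceptual density of Agirre & Rigau, and may differ from what the authors actually used. The function names and the example values (3 words, 15 nodes, `nhyp = 2.0`) are illustrative.

```python
def mean_tree_area(n_words: int, nhyp: float) -> float:
    """Expected size of a subtree holding n_words senses.

    Assumption: each node has nhyp hyponyms on average, so the expected
    area is the geometric sum nhyp**0 + nhyp**1 + ... + nhyp**(n_words - 1).
    """
    return sum(nhyp ** i for i in range(n_words))


def conceptual_density(n_words: int, nhyp: float, actual_area: int) -> float:
    """CD = mean tree area of n words / actual tree area."""
    if actual_area <= 0:
        raise ValueError("actual_area must be positive")
    return mean_tree_area(n_words, nhyp) / actual_area


# Illustrative values: 3 words under a subtree of 15 nodes.
cd = conceptual_density(n_words=3, nhyp=2.0, actual_area=15)
print(round(cd, 3))
```

Under this reading, a subtree that is small relative to the number of words it covers scores high, which is the trade-off between generalization and sense separation described for intensional strength.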

10 Mapping Algorithm: preliminary definitions
- Extension of C: ext(C) = {t_C' in DCH | C subsumes C' in DCH}
- Linguistic Generalisation of C: lgen(C) = {α in LKB | ∃ t in ext(C) and t is subsumed by α in LKB}
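The two definitions can be read as plain set computations over two graphs. The sketch below assumes that ext(C) contains the strict descendants of C, and that a term is "subsumed by" a sense when that sense is one of the term's own senses or an ancestor of one. The toy hierarchies, sense labels such as `dendrite#1`, and all function names are hypothetical, not real MeSH or WordNet content.

```python
from typing import Dict, List, Set

# Toy data (hypothetical). DCH maps a concept to its children;
# LKB_PARENTS maps a sense to its hypernyms; SENSES maps a term to its senses.
DCH_CHILDREN: Dict[str, List[str]] = {"Neuron": ["Dendrite", "Axon"]}
SENSES: Dict[str, List[str]] = {"Dendrite": ["dendrite#1"], "Axon": ["axon#1"]}
LKB_PARENTS: Dict[str, List[str]] = {
    "dendrite#1": ["nerve_fiber#1"],
    "axon#1": ["nerve_fiber#1"],
    "nerve_fiber#1": ["fiber#1"],
    "fiber#1": [],
}


def reachable(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """All nodes reachable from start (start itself excluded)."""
    seen: Set[str] = set()
    stack = list(graph.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def ext(concept: str) -> Set[str]:
    """ext(C): terms of all concepts that C subsumes in the DCH."""
    return reachable(DCH_CHILDREN, concept)


def lgen(concept: str) -> Set[str]:
    """lgen(C): LKB senses that subsume at least one term in ext(C)."""
    result: Set[str] = set()
    for term in ext(concept):
        for sense in SENSES.get(term, []):
            result.add(sense)                        # the sense itself
            result |= reachable(LKB_PARENTS, sense)  # and its hypernyms
    return result


print(sorted(ext("Neuron")))
print(sorted(lgen("Neuron")))
```

With this toy data, lgen("Neuron") collects both leaf senses plus their shared hypernyms, which is the candidate set a later selection step would choose from.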

11 Mapping Algorithm
merge(DCH, LKB, T): ∀ C ∈ T
- Step 1: Determine the linguistic generalisations lgen(C) of the extension of C in the DCH, made of all descendants of C
- Step 2: Compute the optimal mapping G(C) ⊆ lgen(C) by a greedy selection maximizing the conceptual density
- Step 3: Attach t_C to the senses in G(C)
- Step 4: ∀ t ∈ ext(C), attach t to α ∈ LKB iff α is a sense for t in LKB and ∃ α' ∈ G(C) | α' subsumes α in LKB
Principles applied: P1 (Extensional Nature of DCH), P2 (Intensional Strength of LKB)
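The four steps can be sketched for a single concept as follows. This is a sketch under stated assumptions, not the authors' implementation: the density score (a geometric-sum estimate of the mean area divided by the subtree size), the assumed branching factor `NHYP`, the stopping rule (stop once every term with a sense is covered), and all names and toy data are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[str, List[str]]  # parent -> children

NHYP = 3.0  # assumed mean number of hyponyms per LKB node (illustrative)

# Toy data (hypothetical, not real MeSH or WordNet content).
DCH: Graph = {"Neuron": ["Dendrite", "Axon"]}
LKB: Graph = {
    "entity#1": ["fiber#1", "crystal#1"],
    "fiber#1": ["nerve_fiber#1", "textile_fiber#1"],
    "nerve_fiber#1": ["dendrite#1", "axon#1"],
    "crystal#1": ["dendrite#2"],
}
SENSES = {"Dendrite": ["dendrite#1", "dendrite#2"], "Axon": ["axon#1"]}


def below(graph: Graph, node: str) -> Set[str]:
    """Strict descendants of node."""
    seen: Set[str] = set()
    stack = list(graph.get(node, []))
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(graph.get(cur, []))
    return seen


def merge_concept(concept: str, dch: Graph, lkb: Graph,
                  senses: Dict[str, List[str]],
                  nhyp: float = NHYP) -> Tuple[List[str], Dict[str, Set[str]]]:
    terms = below(dch, concept)  # ext(C)
    nodes = set(lkb) | {c for cs in lkb.values() for c in cs}
    subtree = {a: below(lkb, a) | {a} for a in nodes}
    # Terms of ext(C) that each LKB node subsumes through one of their senses.
    covers = {a: {t for t in terms
                  if any(s in subtree[a] for s in senses.get(t, []))}
              for a in nodes}
    candidates = sorted(a for a in nodes if covers[a])  # Step 1: lgen(C)

    def density(a: str, remaining: Set[str]) -> float:
        m = len(covers[a] & remaining)
        return sum(nhyp ** i for i in range(m)) / len(subtree[a])

    chosen: List[str] = []  # Step 2: greedy choice of G(C)
    remaining = {t for t in terms if senses.get(t)}
    while remaining and candidates:
        best = max(candidates, key=lambda a: (density(a, remaining), a))
        gain = covers[best] & remaining
        if not gain:
            break
        chosen.append(best)
        remaining -= gain

    # Step 4: keep only the senses of t that lie under some node of G(C).
    attach = {t: {s for s in senses.get(t, [])
                  if any(s in subtree[g] for g in chosen)}
              for t in terms}
    attach[concept] = set(chosen)  # Step 3: attach t_C to G(C)
    return chosen, attach


G, attachments = merge_concept("Neuron", DCH, LKB, SENSES)
print(G)
print(attachments)
```

In this toy run the selection prefers the compact node that covers both terms, so the second, unrelated sense of "Dendrite" is filtered out in Step 4 because no selected node subsumes it.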

12 Mapping Algorithm (Step 1, Step 2)
[Figure: Lexical Knowledge Base (nodes α1, α2, α3, α4, α5, α6) and Domain Concept Hierarchy (t_C = t with descendants t1, t2, t3, t4)]
Step 1: ext(t) = {t1, t2, t3, t4}, lgen(t) = {...}
Step 2: G(t) = {α5, α4}, selected by Conceptual Density

13 Mapping Algorithm (Step 3, Step 4)
[Figure: Lexical Knowledge Base (nodes α1, α2, α3, α4, α5, α6) and Domain Concept Hierarchy (t with descendants t1, t2, t3, t4)]
Step 3: Attach t_C to senses in G(C)
Step 4: ∀ t ∈ ext(C), attach t to α ∈ LKB

14 A case study: mapping MeSH in WordNet
- Medical Subject Headings (MeSH) as Domain Concept Hierarchy
- WordNet as Lexical Knowledge Base

15 Mapping results: an excerpt

16 Summary
- Target Problem: Mapping a DCH in a LKB for Information Extraction
- Solution:
  - Inspiring Principles:
    - Extensional Nature of DCH
    - Intensional Strength in LKB
  - The notion of conceptual density (Agirre & Rigau, 1996)
  - A novel mapping algorithm between DCH and LKB
- A case study: MeSH in WordNet

17 Conclusions and future work
- If a Domain Concept Hierarchy is available, the presented method is a viable solution to integrate it in WordNet.
- But what if it is not available? How can taxonomical relations be learned from text collections?
- Moreover, how can different kinds of relations between concepts be induced?

