1
Guessing Hierarchies and Symbols for Word Meanings through Hyperonyms and Conceptual Vectors Mathieu Lafourcade LIRMM - France http://www.lirmm.fr/~lafourcade
2
Overview & Objectives: lexical semantic representations; conceptual vector model (CVM); autonomous learning by the system from a given « semantic space » (ontology); constructing taxonomies: hierarchical (finding hyperonyms), multiple inheritance (views); ambiguity as noise; towards self-contained WSD annotations: « I made a deposit at the bank »
3
Conceptual vectors: vector space. An idea = a combination of concepts = a vector. Idea space = vector space. A concept = an idea = a vector V; with augmentation: V + neighborhood. Meaning space = vector space + {v}*
4
2D view of « meaning space » [figure: points labelled “cat” and “product”]
5
Conceptual vectors: thesaurus. H: thesaurus hierarchy with K concepts (Thesaurus Larousse = 873 concepts). V(C_i): a_j = 1 / 2^D_um(H, i, j). [figure: hierarchy tree; components 1, 1/4, 1/16, 1/64 for distances 0, 2, 4, 6]
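The component formula a_j = 1 / 2^D_um(H, i, j) can be illustrated with a small sketch. It assumes that D_um can be approximated by the tree path length between two leaf concepts; the slide does not define D_um, so this reading, the function names and the four-leaf toy hierarchy are all hypothetical.

```python
def leaf_distance(path_i, path_j):
    """Path length between two leaves, each given as a root-to-leaf tuple.

    Assumption: this stands in for D_um(H, i, j), which the slide leaves undefined.
    """
    common = 0
    for a, b in zip(path_i, path_j):
        if a != b:
            break
        common += 1
    # Edges going up from leaf i to the common ancestor, then down to leaf j.
    return (len(path_i) - common) + (len(path_j) - common)


def concept_vector(i, paths):
    """Vector of concept i: component j is 1 / 2 ** distance(i, j)."""
    return [1.0 / 2 ** leaf_distance(paths[i], paths[j]) for j in range(len(paths))]


# Hypothetical hierarchy: four leaf concepts, all at depth 3 below the root.
paths = [
    ("A", "A1", "c0"),
    ("A", "A1", "c1"),  # sibling of c0: distance 2
    ("A", "A2", "c2"),  # cousin of c0: distance 4
    ("B", "B1", "c3"),  # different top-level branch: distance 6
]

v0 = concept_vector(0, paths)
print(v0)  # [1.0, 0.25, 0.0625, 0.015625]
```

With this choice of distance, a concept contributes fully to its own component and its influence halves with every edge that separates it from another concept.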
6
Conceptual vectors: concept c4:peace [figure: vector plot; labels: peace, hierarchical relations, conflict relations, The world, manhood, society]
7
Conceptual vectors: term “peace” [figure: vector plot; label: c4:peace]
8
[figure: vector plot; labels: finance, profit, exchange]
9
Angular distance: D_A(x, y) = angle(x, y), with 0 ≤ D_A(x, y) ≤ π. If 0, then x and y are colinear: same idea. If π/2, then nothing in common. If π, then D_A(x, -x), with -x the anti-idea of x. [figure: vectors x, x’, y]
10
Angular distance: D_A(x, y) = acos(sim(x, y)) = acos(x·y / (|x| |y|)); D_A(x, x) = 0; D_A(x, y) = D_A(y, x); D_A(x, y) + D_A(y, z) ≥ D_A(x, z); D_A(0, 0) = 0 and D_A(x, 0) = π/2 by definition; D_A(λx, μy) = D_A(x, y) with λμ > 0; D_A(λx, μy) = π - D_A(x, y) with λμ < 0; D_A(x + x, x + y) = D_A(x, x + y) ≤ D_A(x, y)
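The definition and the zero-vector conventions above translate directly into a few lines of code. This is a minimal sketch using plain Python lists; the function name `angular_distance` is an assumption, not something named in the slides.

```python
import math


def angular_distance(x, y):
    """D_A(x, y) = acos(x.y / (|x| |y|)), in radians, between 0 and pi."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Conventions from the slide: D_A(0, 0) = 0 and D_A(x, 0) = pi/2.
    if norm_x == 0 and norm_y == 0:
        return 0.0
    if norm_x == 0 or norm_y == 0:
        return math.pi / 2
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, cosine)))


print(angular_distance([1, 0], [1, 0]))   # same idea
print(angular_distance([1, 0], [0, 1]))   # nothing in common
print(angular_distance([1, 0], [-1, 0]))  # anti-idea
```

The three calls correspond to the three reference cases of the previous slide: colinear vectors, orthogonal vectors, and a vector with its opposite.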
11
Thematic distance, examples: D_A(tit, tit) = 0; D_A(tit, passerine) = 0.4; D_A(tit, bird) = 0.7; D_A(tit, train) = 1.14; D_A(tit, insect) = 0.62; tit = insectivorous passerine bird …
12
Some vector operations. Addition: Z = X ⊕ Y, z_i = x_i + y_i, vector Z is normalized. Term-to-term multiplication: Z = X ⊗ Y, z_i = (x_i * y_i)^(1/2), vector Z is not normalized. Weak contextualization: Z = X ⊕ (X ⊗ Y) = γ(X, Y): “Z is X augmented by its mutual information with Y”
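The three operations of this slide can be sketched as follows. The function names are hypothetical, and the sketch assumes non-negative components (as produced by the 1 / 2^D construction), since the term-to-term product takes a square root.

```python
import math


def normalized_sum(x, y):
    """Addition: z_i = x_i + y_i, then Z is scaled to unit length."""
    z = [a + b for a, b in zip(x, y)]
    norm = math.sqrt(sum(c * c for c in z))
    return [c / norm for c in z] if norm else z


def term_product(x, y):
    """Term-to-term multiplication: z_i = sqrt(x_i * y_i); not normalized."""
    return [math.sqrt(a * b) for a, b in zip(x, y)]


def weak_contextualization(x, y):
    """X added to its term-to-term product with Y, then normalized."""
    return normalized_sum(x, term_product(x, y))


x = [0.6, 0.8]
y = [1.0, 0.0]
z = weak_contextualization(x, y)
print(z)  # x pulled toward the component it shares with y
```

In this toy case only the first component is shared by X and Y, so the result keeps the direction of X but gives more weight to that first component.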
13
2D view of weak contextualization [figure: vectors X and Y, their combination X ⊗ Y, and the contextualized vectors Y ⊕ (X ⊗ Y) and X ⊕ (X ⊗ Y)]
14
Autonomous learning 1/2: set of known words K, set of unknown words U. Revise a word w of K OR (try to) learn a word w of U. From the web, for w ask for a definition D: specific sites (dictionaries, synonym lists, etc.) for definition analysis; general sites (Google, etc.) for corpus analysis. For each word wd of D: if wd is not in K, then add wd to U AND add VO to V*; otherwise get the vector of wd AND add V(wd) to V*. Compute the new vector of w from def(D) and V*. 98870 words for 400000 senses (vectors) learned in 3 years (French). « Ever » looping process
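One pass of the loop above might look like the sketch below. Several points are assumptions, not statements from the slides: the definition is supplied as a local word list instead of being fetched from the web, "VO" is read as a zero placeholder vector, and the new vector is a plain normalized sum of V* (the slide only says it is computed "from def(D) and V*"). All names are hypothetical.

```python
import math


def normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v] if norm else list(v)


def learn_word(w, definition, known, unknown, dim):
    """Revise or learn word w from the words of its definition.

    known: dict word -> vector (the set K); unknown: set of words (the set U).
    """
    placeholder = [0.0] * dim  # assumption: stands in for "VO" on the slide
    v_star = []
    for wd in definition:
        if wd not in known:
            unknown.add(wd)            # wd will be learned in a later pass
            v_star.append(placeholder)
        else:
            v_star.append(known[wd])
    total = [sum(v[i] for v in v_star) for i in range(dim)]
    if any(total):
        known[w] = normalize(total)    # assumption: normalized sum of V*
        unknown.discard(w)
    else:
        unknown.add(w)                 # nothing usable yet; retry later
    return known.get(w)


# Hypothetical 2-dimensional vectors for two already known words.
known = {"bird": [1.0, 0.0], "passerine": [0.8, 0.6]}
unknown = {"tit"}
vector = learn_word("tit", ["insectivorous", "passerine", "bird"], known, unknown, 2)
print(vector, unknown)
```

After this pass "tit" has a vector built from "passerine" and "bird", while "insectivorous" has joined the unknown set and would be picked up by a later iteration, which is what makes the process loop.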
15
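Autonomous learning 2/2 [figure: the definition « insectivorous passerine bird … » is analysed into a tree (node labels ADJ, N, GOV, PH, TXT); each node carries a vector V, and vectors are combined upward with γ(X, Y) and a weighted sum]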
17
Hyperonym identification. Extraction: trying all terms is too costly and unproductive; extract potential candidates from definitions, co-occurrence lists, etc. Ex: Cand(emerald) = precious stone, stone, beryl, gem, … Evaluation of cand(m) against meaning(m): contextualize, γ(c, m) = c ⊕ (c ⊗ m); retain the c such that γ(c, m) is the closest to m. Loop: extracting hyperonyms helps identify meanings
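The evaluation step can be sketched as below: each candidate is weakly contextualized by the meaning, and the candidate whose contextualized vector lies closest to the meaning is retained. "Closest" is taken here to mean the smallest angular distance, which is an assumption, and the 3-dimensional vectors are invented for illustration only.

```python
import math


def angular_distance(x, y):
    """Angle between x and y in radians (pi/2 if exactly one is the zero vector)."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0 and ny == 0:
        return 0.0
    if nx == 0 or ny == 0:
        return math.pi / 2
    cosine = sum(a * b for a, b in zip(x, y)) / (nx * ny)
    return math.acos(max(-1.0, min(1.0, cosine)))


def weak_contextualization(x, y):
    """Normalized sum of x and the term-to-term product sqrt(x_i * y_i)."""
    z = [a + math.sqrt(a * b) for a, b in zip(x, y)]
    norm = math.sqrt(sum(c * c for c in z))
    return [c / norm for c in z] if norm else z


def best_hyperonym(meaning, candidates):
    """Return the candidate name whose contextualized vector is closest to meaning."""
    return min(
        candidates,
        key=lambda name: angular_distance(
            weak_contextualization(candidates[name], meaning), meaning
        ),
    )


# Made-up vectors: one meaning of "emerald" and three candidate hyperonyms.
emerald = [0.7, 0.7, 0.1]
candidates = {
    "precious stone": [0.8, 0.6, 0.0],
    "stone": [0.9, 0.1, 0.4],
    "gem": [0.0, 0.2, 0.98],
}
print(best_hyperonym(emerald, candidates))  # precious stone
```

Only the extracted candidates are scored, which reflects the point made on the slide that trying every term would be too costly.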
18
[figure: sense vectors (v) for Émeraude/pierre précieuse (emerald/precious stone), Émeraude/béryl (emerald/beryl), Émeraude/gemme (emerald/gem), …; hyperonym nodes béryl (beryl), Pierre précieuse (precious stone), Gemme/pierre précieuse (gem/precious stone), Gemme/bourgeon (gem/bud), Gemme/résine (gem/resin); label: closest vector]
19
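[figure: nodes Émeraude/pierre précieuse (emerald/precious stone), Émeraude/béryl (emerald/beryl), béryl (beryl), Pierre précieuse (precious stone), Gemme/pierre précieuse (gem/precious stone), Gemme/bourgeon (gem/bud), Gemme/résine (gem/resin), with values 0.81, 0.9, 0.7, 0.85; retained nodes Émeraude/béryl, béryl, Pierre précieuse, Gemme/pierre précieuse, with values 0.81, 0.9, 0.85; further nodes Émeraude/vert (emerald/green), Émeraude/couleur (emerald/colour), Vert/couleur des signaux (green/signal colour), Couleur/matière (colour/material), Couleur/sensation (colour/sensation), Vert/couleur (green/colour), …]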
20
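[figure: hyperonym graph; nodes: Voiture/wagon (car/railway carriage), wagon, Moyen de transport (means of transport), véhicule/Moyen de transport (vehicle/means of transport), véhicule/vecteur (vehicle/vector), automobile, Voiture/automobile (car/automobile), Cheval/moyen de transport (horse/means of transport), Cheval/mammifère (horse/mammal), mammifère (mammal), Cheval/viande (horse/meat), Viande/nourriture (meat/food), aliment (foodstuff), nourriture (food), artefact, Cheval/unité de puissance (horse/unit of power), animal; edge label: hypo]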
21
Last words. Switching of representation: from subsymbolic to symbolic … and vice versa; readability of symbols … of words. Global and local test functions for vector quality assessment; decision making about the number of meanings … or views; detectors when combined with lexical functions (antonymy, etc.); the basis for self-adjustment toward a vector space of constant density. WSD as a reduction of noise (in context or out of context). Unification of ontologies. Self-emergent structuring of terminology