Automatically Populating Acception Lexical Database through Bilingual Dictionaries and Conceptual Vectors PAPILLON 2002 Mathieu Lafourcade LIRMM - France.

Overview & Objectives. Lexical soup: what? Bilingual dictionaries & conceptual vectors: which heuristics? For what? Linking decision and quality assessment.

Conceptual vectors: vector space. An idea = a combination of concepts = a vector. Idea space = vector space. A concept = an idea = a vector V. With augmentation: V + neighbourhood. Meaning space = vector space + {v}*

Conceptual vectors: thesaurus. H: thesaurus hierarchy, K concepts (Thesaurus Larousse = 873 concepts). V(C_i): $a_j = 1 / 2^{D_{um}(H, i, j)}$. [Figure: example hierarchy; surviving labels: 1/4, 1, 1/16, 1/64 and 2, 6, 4]
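The component formula on this slide can be illustrated with a minimal sketch. It assumes that D_um(H, i, j) is the number of edges between concepts i and j through their lowest common ancestor in H; the slide does not spell this out. The toy hierarchy and the names `parent`, `leaves`, `tree_distance` and `concept_vector` are hypothetical, not taken from the Thesaurus Larousse.

```python
def ancestors(parent, node):
    """Return the path from node up to the root, node first."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def tree_distance(parent, a, b):
    """Edges between a and b through their lowest common ancestor.

    Assumption: this stands in for D_um(H, i, j) on the slide.
    """
    up_a = ancestors(parent, a)
    depth_in_a = {n: d for d, n in enumerate(up_a)}
    for depth_b, n in enumerate(ancestors(parent, b)):
        if n in depth_in_a:
            return depth_in_a[n] + depth_b
    raise ValueError("nodes are not in the same tree")


def concept_vector(parent, leaves, i):
    """V(C_i): component j is 1 / 2 ** distance(C_i, C_j)."""
    return [1.0 / 2 ** tree_distance(parent, leaves[i], c) for c in leaves]


# Hypothetical toy hierarchy: child -> parent.
parent = {
    "peace": "society", "conflict": "society",
    "bird": "nature", "insect": "nature",
    "society": "root", "nature": "root",
}
leaves = ["peace", "conflict", "bird", "insect"]

v_peace = concept_vector(parent, leaves, 0)
print(v_peace)
```

With this toy tree, the component for the concept itself is 1, a sibling concept gets 1/4, and concepts in the other branch get 1/16.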

Conceptual vectors: concept c4:peace. [Figure: vector of the concept c4:peace; labelled regions: peace, hierarchical relations, conflict relations, the world, manhood, society]

Conceptual vectors: term "peace". [Figure: vector of the term "peace"; label: c4:peace]

Angular distance. $D_A(x, y) = \mathrm{angle}(x, y)$, with $0 \leq D_A(x, y) \leq \pi$. If $0$, then x and y are colinear: same idea. If $\pi/2$, then nothing in common. If $\pi$, then $D_A(x, -x)$, with $-x$ the anti-idea of x. [Figure: vectors x and y]

Angular distance. $D_A(x, y) = \arccos(\mathrm{sim}(x, y)) = \arccos\left(\frac{x \cdot y}{|x|\,|y|}\right)$; $D_A(x, x) = 0$; $D_A(x, y) = D_A(y, x)$; $D_A(x, y) + D_A(y, z) \geq D_A(x, z)$; $D_A(0, 0) = 0$ and $D_A(x, 0) = \pi/2$ by definition; $D_A(\lambda x, \lambda y) = D_A(x, y)$ with $\lambda \neq 0$; $D_A(\lambda x, y) = \pi - D_A(x, y)$ with $\lambda < 0$; $D_A(x + x, x + y) = D_A(x, x + y) \leq D_A(x, y)$
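As a rough illustration, the angular distance and its zero-vector conventions can be sketched in plain Python. The function names `norm` and `angular_distance` are illustrative choices, and clamping the cosine is an added safeguard against floating-point rounding, not something stated on the slide.

```python
import math


def norm(x):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(a * a for a in x))


def angular_distance(x, y):
    """D_A(x, y) = acos(x.y / (|x||y|)), with the slide's zero-vector rules."""
    nx, ny = norm(x), norm(y)
    if nx == 0 and ny == 0:
        return 0.0              # D_A(0, 0) = 0 by definition
    if nx == 0 or ny == 0:
        return math.pi / 2      # D_A(x, 0) = pi/2 by definition
    cos = sum(a * b for a, b in zip(x, y)) / (nx * ny)
    cos = max(-1.0, min(1.0, cos))  # guard against rounding outside [-1, 1]
    return math.acos(cos)


print(angular_distance([1.0, 0.0], [0.0, 1.0]))   # orthogonal vectors
print(angular_distance([1.0, 0.0], [-2.0, 0.0]))  # idea versus anti-idea
```

Scaling either vector by a positive factor leaves the result unchanged, which is why only the direction of a conceptual vector matters for this distance.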

Thematic distance: examples. $D_A(\text{tit}, \text{tit}) = 0$; $D_A(\text{tit}, \text{passerine}) = 0.4$; $D_A(\text{tit}, \text{bird}) = 0.7$; $D_A(\text{tit}, \text{train}) = 1.14$; $D_A(\text{tit}, \text{insect}) = 0.62$. tit = insectivorous passerine bird …

[Figure: an acception base connected to a conceptual vector base and to Malay, French, English and Japanese entry bases. Surviving labels: acceptions acc x, acc y, acc z; French rivière, fleuve, carte.1, carte.2; English river, map, card.1; Malay sungai; vectors v]

[Figure: conceptual vector base and French entry base (rivière, fleuve, carte.1, carte.2, with vectors v), shown once with acceptions acc x, acc y, acc z and once with an empty acception base]

[Figure: associations between definitions, vectors, glosses and equivalents for the English word "demand". The English vectorized monolingual dictionary lists senses demand.1 to demand.4, each with a definition and a vector v; the English-French bilingual dictionary lists demand.1 to demand.5, each with glosses and equivalents. Other labels: vector space association, left-over meaning]

[Figure: linking a source-language sense W-SL_i with equiv = {W-TL} to the target-language senses of W-TL, whose equivalent sets are {W-SL, …} or {…}. Columns: source language, target language, acceptions. Labels: vectors are close; already existing link; created link; left-over acceptions; warning if not close]
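One possible reading of the labels on this slide (vectors are close, created link, warning if not close) is sketched below. It is an interpretation, not the author's algorithm: the candidate filter, the `threshold` value, the return shape and the toy French senses with their vectors are all assumptions made for illustration.

```python
import math


def angle(x, y):
    """Angular distance between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return math.acos(max(-1.0, min(1.0, dot / (nx * ny))))


def link_sense(source_word, source_vec, target_senses, threshold=math.pi / 4):
    """Pick the target sense to link a source sense to.

    target_senses: list of (sense_id, equivalents, vector) tuples.
    Returns (sense_id, is_close); is_close False means "warn".
    The threshold is a hypothetical value, not taken from the slides.
    """
    # Keep only target senses that list the source word as an equivalent.
    candidates = [s for s in target_senses if source_word in s[1]]
    if not candidates:
        return None, False
    # Among the candidates, take the one with the closest vector.
    best = min(candidates, key=lambda s: angle(source_vec, s[2]))
    return best[0], angle(source_vec, best[2]) <= threshold


# Hypothetical toy data: three senses of French "carte".
senses = [
    ("carte.1", {"map"}, [1.0, 0.0, 0.0]),
    ("carte.2", {"card"}, [0.0, 1.0, 0.0]),
    ("carte.3", {"card", "menu"}, [0.0, 0.0, 1.0]),
]

print(link_sense("card", [0.1, 0.9, 0.2], senses))
```

A sense with no candidate stays unlinked here, which loosely corresponds to the left-over acceptions in the figure.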

[Figure: a source-language sense W-SL_i with several equivalents {W-TL1, W-TL2, …}. Each target word W-TL1, W-TL2, … has several senses with vectors v and equivalent sets {W-SL, …} or {…}. Columns: source language, target language, acceptions]

[Figure: several source-language senses W-SL_i with equiv = W-TL. Labels: refinement links; created acception; closest vector]

[Figure: source-language senses W-SL_i and W-SL_j, both with equiv = W-TL, shown in steps 1 and 2. Label: closest vector]

Conclusion. System in continuous learning: evolving results, hopefully converging. Assisting and being assisted by vectorized lexical functions and human annotators. Toward a community of lexical agents: lexical knowledge negotiation.
