Download presentation

Presentation is loading. Please wait.

Published byAlexa Powell Modified over 2 years ago

1
Conceptual vectors for NLP Lexical functions MMA 2001 Mathieu Lafourcade LIRMM - France

2
Semantic Analysis Word Sense Disambiguation Text Indexing in IR Lexical Transfer in MT Conceptual vector Reminiscent of Vector Models (Salton, Sowa, LSI) Applied on pre-selected concepts (not terms) Concepts are not independent Propagation on morpho-syntactic tree (no surface analysis) Objectives

3
Conceptual vectors An idea = a combination of concepts = a vector The Idea space = vector space A concept = an idea = a vector = combination of itself + neighborhood Sense space = vector space + vector set

4
Conceptual vectors Annotations Helps building vectors Can take the form of vectors Set of k basic concepts example Thesaurus Larousse = 873 concepts A vector = a 873 uple Encoding for each dimension C = 2 15

5
Vector construction Concept vectors H : Thesaurus hierarchy V(C i ) : a j = 1/ (2 ** D um (H, i)) 1/41 1/16 1/64 264

6
Vector construction Concept vectors C : mammals L4 : zoologie, mammals, birds, fish, … L3 : animals, plants, living beings L2 : …, time, movement, matter, life, …, L1 : the society, the mankind, the world

7
Vector construction Concept vectors mammals zoology, birds, fishes, … living beings, plants the world the mankind, the society

8
Vector construction Term vectors Example : cat Kernel c:mammal, c:stroke mammal + stroke Augmented with weights c:mammal, c:stroke, 0.75*c:zoology, 0.75*c:love … zoology + mammal stroke love … Iteration for neighborhood augmentation

9
Vector construction Term vectors Cat mammals stroke

10
Vector space Basic concepts are not independent Sense space = Generator Space of a real k vector space (unknown) = Dim k k Relative position of points

11
Conceptual vector distance Angular Distance D A (x, y) = angle (x, y) 0 D A (x, y) if 0 then colinear - same idea if /2 then nothing in common if then D A (x, -x) with -x as anti-idea of x x y x

12
Conceptual vector distance Distance = acos(similarity) D A (x, y) = acos(x.y/|x||y|)) D A (x, x) = 0 D A (x, y) = D A (y, x) D A (x, y) + D A (y, z) D A (x, z) D A (0, 0) = 0 and D A (x, 0) = /2 by definition D A ( x, y) = D A (x, y) with 0 D A ( x, y) = - D A (x, y) with < 0 D A (x+x, x+y) = D A (x, x+y) D A (x, y)

13
Conceptual vector distance Example D A (tit, tit) = 0 D A (tit, passerine) = 0.4 D A (tit, bird) = 0.7 D A (tit, train) = 1.14 D A (tit, insect) = 0.62 tit = kind of insectivorous passerine …

14
Conceptual lexicon Set of (word, vector) = (w, )* Monosemy word 1 meaning 1 vector (w, ) tit

15
Conceptual lexicon Polyseme building Polysemy word n meanings n vectors {(w, ), (w.1, 1 ) … (w.n, n ) } bank

16
Squash isolated meanings against numerous close meanings Conceptual lexicon Polyseme building (w) = (w.i) =.i bank : bank.1: Mound bank.3: River border, … bank.2: Money institution bank.3: Organ keyboard bank.4: … bank

17
Conceptual lexicon Polyseme building (w) = classification(w.i) bank 1:D A (3,4) & (3+2) 2: (bank 4 ) 7: (bank 2 )6: (bank 1 ) 4: (bank 3 ) 5: D A (6,7)& (6+7) 3: D A (4,5) & (4+5)

18
Lexical scope LS(w) = LS t ( (w)) LS t ( (w)) = 1 if is a leaf LS t ( (w)) = (LS( 1 ) + LS( 2 )) /(2-sin(D( (w))) otherwise (w) = t ( (w)) t ( (w)) = (w) if is a leaf t ( (w)) = LS( 1 ) t ( 1 ) + LS( 2 ) t ( 2 ) otherwise 1:D(3,4), (3+2) 2: 4 7: 2 6: 1 4: 3 5:D(6,7), (6+7) 3:D(4,5), (4+5) Can handle duplicated definitions (w) =

19
Vector Statistics Norm( ) [0, 1] * C (2 15 =32768) Intensity( ) Norm / C Usually = 1 Standard deviation (SD) SD 2 = variance variance = 1/n * (x i - ) 2 with as the arith mean

20
Vector Statistics Variation coefficient (CV) CV = SD / mean No unity - Norm independent Pseudo Conceptual strength If A Hyperonym B CV(A) > CV(B) (we dont have ) vector « fruit juice » (N) MEAN = 527, SD = 973 CV = 1.88 vector « drink » (N) MEAN = 443, SD = 1014 CV = 2.28

21
Vector operations Sum V = X + Y v i = x i + y i Neutral element : 0 Generalized to n terms : V = V i Normalization of sum : v i /|V|* c Kind of mean

22
Vector operations Term to term product V = X Y v i = x i * y i Neutral element : 1 Generalized to n terms V = V i Kind of intersection

23
Vector operations Amplification V = X ^ n v i = sg(v i ) * |v i |^ n V = V ^ 1/2and n V = V ^ 1/n V V = V ^ 2if v i 0 Normalization of ttm product to n terms V = n V i

24
Vector operations Product + sum V = X Y = ( X Y ) + X + Y Generalized n terms : V = n V i + V i Simplest request vector computation in IR

25
Vector operations Subtraction V = X Y v i = x i y i Dot subtraction V = X Y v i = max (x i y i, 0) Complementary V = C(X) v i = (1 x i c) * c etc. Set operations

26
Intensity Distance Intensity of normalized ttm product 0 ( (X Y)) 1 if |x| = |y| = 1 D I (X, Y) = acos( ( X Y)) D I (X, X) = 0and D I (X, 0) = /2 D I (tit, tit) = 0(D A = 0) D I (tit, passerine) = 0.25(D A = 0.4) D I (tit, bird) = 0.58(D A = 0.7) D I (tit, train) = 0.89 (D A = 1.14) D I (tit, insect) = 0.50(D A = 0.62)

27
Relative synonymy Syn R (A, B, C) C as reference feature Syn R (A, B, C) = D A (A C, B C) D A (coal,night) = 0.9 Syn R (coal, night, color) = 0.4 Syn R (coal, night, black) = 0.35

28
Relative synonymy Syn R (A, B, C) = Syn R (B, A, C) Syn R (A, A, C) = D (A C, A C) = 0 Syn R (A, B, 0) = D (0, 0) = 0 Syn R (A, 0, C) = /2 Syn A (A, B) = Syn R (A, B, 1) = D (A 1, B 1) = D (A, B)

29
Subjective synonymy Syn S (A, B, C) C as point of view Syn S (A, B, C) = D(C-A, C-B) 0 Syn S (A, B, C) normalization: 0 asin(sin(Syn S (A, B, C))) /2 B A C C C

30
Subjective synonymy When |C| then Syn S (A, B, C) 0 Syn S (A, B, 0) = D(-B, -A) = D(A, B) Syn S (A, A, C) = D(C-A, C-A) = 0 Syn S (A, B, B) = Syn S (A, B, A) = 0 Syn S (tit, swallow, animal) = 0.3 Syn S (tit, swallow, bird) = 0.4 Syn S (tit, swallow, passerine) = 1

31
68 Semantic analysis Vectors propagate on syntactic tree Lesrapidement P GV GVA GNP termites attaquent lesfermes GN dutoit The white ants strike rapidly the trusses of the roof

32
Semantic analysis Lesrapidement P GV GVA GNP attaquent fermes termites les GN toit GN du farm business farm building truss roof anatomy above to start to attack to Critisize

33
Semantic analysis Initialization - attach vectors to nodes Lesrapidement P GV GVA GNP termites attaquent lesfermes GN dutoit

34
Semantic analysis Propagation (up) Lesrapidement P GV GVA GNP termites attaquent lesfermes GN dutoit

35
Semantic analysis Back propagation (down) (N i j ) = ( (N i j ) (N i )) + (N i j ) Lesrapidement P GV GVA GNP termites attaquent lesfermes GN dutoit

36
Semantic analysis Sense selection or sorting Lesrapidement P GV GVA GNP termites attaquent les fermes GN du toit farm business farm building truss roof anatomy above to start to attack to Critisize

37
Sense selection 1:D(3,4) & (3+2) 2: (bank 4 ) 7: (bank 2 )6: (bank 1 ) 4: (bank 3 ) 5:D(6,7)& (6+7) 3:D(4,5) & (4+5) Recursive descent on t(w) as decision tree D A (, i ) Stop on a leaf Stop on an internal node

38
Vector syntactic schemas S: NP(ART,N) (NP) = V(N) S: NP1(NP2,N) (NP1) = (NP1)+ (N)0< <1 (sail boat) = (sail) + 1/2 (boat) (boat sail) = 1/2 (boat) + (sail)

39
Vector syntactic schemas Not necessary linear S: GA(GADV(ADV),ADJ) (GA) = (ADJ)^p(ADV) p(very) = 2 p(mildly) = 1/2 (very happy) = (happy)^2 (mildly happy) = (happy)^1/2

40
Iteration & convergence Iteration with convergence Local D( i, i+1 ) for top Global D( i, i+1 ) for all Good results but costly

41
Lexicon construction Manual kernel Automatic definition analysis Global infinite loop = learning Manual adjustments iterations synonyms Ukn word in definitions kernel

42
Application machine translation Lexical transfer source target Knn search that minimizes D A ( source, target ) Submeaning selection Direct Transformation matrix chat cat

43
Application Information Retrieval on Texts Textual document indexation Language dependant Retrieval Language independent - Multilingual Domain representation horse equitation Granularity Document, paragraphs, etc.

44
Application Information Retrieval on Texts Index = Lexicon = (d i, i )* Knn search that minimizes D A ( (r), (d i )) doc 1 doc n doc 2 request

45
Search engine Distances adjustments Min D A ( (r), (d i )) may pose problems Especially with small documents Correlation between CV & conceptual richness Pathological cases « plane » and « plane plane plane plane … » « inundation » « blood » D = 0.85 (liquid)

46
Search engine Distances adjustments Correction with relative intensity Request vs retrieved doc ( r and d ) D = (D A ( r, d ) * D I ( r, d )) 0 ( r, d ) 1 0 D I ( r, d ) /2

47
Conclusion Approach statistical (but not probabilistic) thema (and rhema ?) Combination of Symbolic methods (IA) Transformational systems Similarity Neural nets With large Dim (> ?)

48
Conclusion Self evaluation Vector quality Tests against corpora Unknown words Proper nouns of person, products, etc. Lionel Jospin, Danone, Air France Automatic learning Badly handled phenomena? Negation & Lexical functions (Meltchuk)

49
End 1. extremity 2. death 3. aim …

Similar presentations

© 2016 SlidePlayer.com Inc.

All rights reserved.

Ads by Google