Conceptual vectors for NLP MMA 2001 Mathieu Lafourcade LIRMM - France


Conceptual vectors for NLP MMA 2001 Mathieu Lafourcade LIRMM - France www.lirmm.fr/~lafourca 1

Objectives
Semantic analysis: conceptual vector propagation
Word sense disambiguation
Text indexing in IR
Lexical transfer in MT
Conceptual vectors:
Reminiscent of vector models (Salton et al.) and Sowa
Applied to preselected concepts (not terms)
Concepts are not independent
Propagation on the morpho-syntactic tree (no surface analysis) 2

Conceptual vectors
An idea = a vector
The idea space = a vector space
A concept = an idea = a vector = a combination of itself + its neighborhood
A sense = a combination of concepts = a vector
Sense space = vector space + vector set 27

Conceptual vectors — annotations
Annotations help building vectors and can themselves take the form of vectors
Set of k basic concepts — example: Larousse thesaurus = 873 concepts
A vector = an 873-tuple
Encoding for each dimension: C = 2^15 28

Conceptual vectors — example: cat
Kernel: c:mammal, c:stroke → <… mammal … stroke …> = <… 0.8 … 0.8 …>
Augmented: c:mammal, c:stroke, c:zoology, c:love … → <… zoology … mammal … stroke … love …> = <… 0.5 … 0.75 … 0.75 … 0.5 …>
Iterating the neighborhood augmentation yields finer vectors 29

Vector space
Basic concepts are not independent
Sense space = generator space of a real vector space of (unknown) dimension k', with k' <= k
Relative position of points 34

Conceptual vector distance
Angular distance: DA(x, y) = angle(x, y), with 0 <= DA(x, y) <= π
if 0 then colinear: same idea
if π/2 then nothing in common
if π then DA(x, -x), with -x the anti-idea of x 36

Conceptual vector distance
Distance = acos(similarity): DA(x, y) = acos((x1·y1 + … + xn·yn) / (|x||y|))
DA(x, x) = 0
DA(x, y) = DA(y, x)
DA(x, y) + DA(y, z) >= DA(x, z)
DA(0, 0) = 0 and DA(x, 0) = π/2 by definition
DA(λx, y) = DA(x, y) if λ > 0
DA(λx, y) = π - DA(x, y) if λ < 0
DA(x + x, x + y) = DA(x, x + y) <= DA(x, y) 37

Conceptual vector distance — example
DA(tit, tit) = 0
DA(tit, passerine) = 0.4
DA(tit, bird) = 0.7
DA(tit, train) = 1.14
DA(tit, insect) = 0.62
tit = a kind of insectivorous passerine … 43

Conceptual lexicon
Set of (word, vector) pairs = (w, v(w))*
Monosemic word → 1 meaning → 1 vector (w, v(w))
Example: tit 45

Conceptual lexicon — polyseme building
Polysemic word → n meanings → n vectors
{(w, v(w)), (w.1, v(w.1)) … (w.n, v(w.n))}
Example: bank 46

Conceptual lexicon Polyseme building (w) =  (w.i)  =  .i bank = 1. Mound, 3. river border, … 2. Money institution 3. Organ keyboard 4. … bank Squash isolated meanings against numerous close meanings 49

Conceptual lexicon Polyseme building (w) = classification(w.i) bank 1:DA(3,4) & (3+2) 2:(bank4) 7:(bank2) 6:(bank1) 4:(bank3) 5: DA(6,7)& (6+7) 3: DA(4,5) & (4+5) 51

Lexical scope
t(w) = the classification tree of w; δ ranges over its nodes, with children δ1 and δ2
LS(δ) = 1 if δ is a leaf
LS(δ) = (LS(δ1) + LS(δ2)) / (2 - sin(DA(δ1, δ2))) otherwise
v(δ) = v(w.i) if δ is a leaf
v(δ) = LS(δ1)·v(δ1) + LS(δ2)·v(δ2) otherwise
Can handle duplicated definitions 51

Vector statistics
Norm: |V|
Intensity: μ = |V| / c; usually μ = 1
Standard deviation: SD² = variance, with variance = 1/n · Σ(xi - m)² and m the arithmetic mean 52

Vector statistics
Variation coefficient: CV = SD / mean
No unit, norm independent
Pseudo conceptual strength: if A is a hyperonym of B then CV(A) > CV(B) (the converse does not hold)
vector « fruit juice » (N): mean = 527, SD = 973, CV = 1.88
vector « drink » (N): mean = 443, SD = 1014, CV = 2.28 54

Vector operations
Sum: V = X + Y ⇔ vi = xi + yi
Neutral element: 0
Generalized to n terms: V = Σ Vi
Normalization of the sum: vi ← vi / |V| · c
A kind of mean 59

Vector operations
Term-to-term product: V = X ⊗ Y ⇔ vi = xi · yi
Neutral element: 1
Generalized to n terms: V = ⊗ Vi
A kind of intersection 60

Vector operations
Amplification: V = X ^ n ⇔ vi = sg(xi) · |xi| ^ n
√V = V ^ 1/2 and ⁿ√V = V ^ 1/n
V ⊗ V = V ^ 2 if ∀i, vi ≥ 0
Normalization of the term-to-term product of n terms: V = ⁿ√(⊗ Vi) 62
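The element-wise operations above (sum with normalization, term-to-term product, sign-preserving amplification) can be sketched as follows; the function names are mine, not from the slides:

```python
import math

def v_sum(*vectors):
    # Sum, generalized to n terms: vi = sum of the i-th components
    return [sum(cs) for cs in zip(*vectors)]

def normalize(v, c=1.0):
    # Normalization of the sum: vi <- vi / |V| * c
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm * c for a in v] if norm else list(v)

def ttm_product(x, y):
    # Term-to-term product, a kind of intersection: vi = xi * yi
    return [a * b for a, b in zip(x, y)]

def amplify(x, n):
    # Amplification, keeping the sign of each component: vi = sg(xi) * |xi|**n
    return [math.copysign(abs(a) ** n, a) for a in x]

v = ttm_product([1, 2, 0], [3, 0, 5])
print(v)               # [3, 0, 0]: only shared dimensions survive
print(amplify(v, 2))   # [9.0, 0.0, 0.0]
```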

Vector operations
Product + sum: V = (X ⊗ Y) + X + Y
Generalized to n terms: V = ⁿ√(⊗ Vi) + Σ Vi
The simplest request-vector computation in IR 63

Vector operations
Subtraction: V = X - Y ⇔ vi = xi - yi
Dot subtraction: V = X ⊖ Y ⇔ vi = max(xi - yi, 0)
Complementary: V = C(X) ⇔ vi = (1 - xi / c) · c
etc. Set-like operations 65

Intensity distance
Intensity of the normalized term-to-term product: DI(X, Y) = acos(μ(√(X ⊗ Y)))
0 ≤ μ(√(X ⊗ Y)) ≤ 1 if |X| = |Y| = 1
DI(X, X) = 0 and DI(X, 0) = π/2
DI(tit, tit) = 0 (DA = 0)
DI(tit, passerine) = 0.25 (DA = 0.4)
DI(tit, bird) = 0.58 (DA = 0.7)
DI(tit, train) = 0.89 (DA = 1.14)
DI(tit, insect) = 0.50 (DA = 0.62) 65

Relative synonymy
SynR(A, B, C), with C as the reference feature
SynR(A, B, C) = DA(A ⊗ C, B ⊗ C)
DA(coal, night) = 0.9
SynR(coal, night, color) = 0.4
SynR(coal, night, black) = 0.35 66

Relative synonymy
SynR(A, B, C) = SynR(B, A, C)
SynR(A, A, C) = DA(A ⊗ C, A ⊗ C) = 0
SynR(A, B, 0) = DA(0, 0) = 0
SynR(A, 0, C) = π/2
Absolute synonymy: SynA(A, B) = SynR(A, B, 1) = DA(A ⊗ 1, B ⊗ 1) = DA(A, B) 67
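A hedged sketch of relative synonymy (helper names are mine): SynR is the angular distance between the term-to-term products of A and B with the reference feature C, so two otherwise distant vectors can come out close when C emphasizes their shared dimensions, as in the coal/night/black example:

```python
import math

def ttm_product(x, y):
    # Term-to-term product: vi = xi * yi
    return [a * b for a, b in zip(x, y)]

def angular_distance(x, y):
    # DA(x, y) = acos of the cosine similarity, with DA(0,0)=0, DA(x,0)=pi/2
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0 or ny == 0:
        return 0.0 if nx == ny else math.pi / 2
    return math.acos(max(-1.0, min(1.0, dot / (nx * ny))))

def syn_r(a, b, c):
    # SynR(A, B, C) = DA(A (x) C, B (x) C), with C as the reference feature
    return angular_distance(ttm_product(a, c), ttm_product(b, c))
```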

Subjective synonymy
SynS(A, B, C), with C as the point of view
SynS(A, B, C) = DA(C - A, C - B)
0 ≤ SynS(A, B, C) ≤ π
Normalization: 0 ≤ asin(sin(SynS(A, B, C))) ≤ π/2 66

Subjective synonymy
When |C| → ∞ then SynS(A, B, C) → 0
SynS(A, B, 0) = DA(-B, -A) = DA(A, B)
SynS(A, A, C) = DA(C - A, C - A) = 0
SynS(A, B, B) = SynS(A, B, A) = 0
SynS(tit, swallow, animal) = 0.3
SynS(tit, swallow, bird) = 0.4
SynS(tit, swallow, passerine) = 1 66

Semantic analysis
Vectors propagate on the syntactic tree
Les termites attaquent rapidement les fermes du toit
(The white ants rapidly strike the trusses of the roof) 68

Semantic analysis — lexical ambiguities
attaquer: to start / to attack / to criticize
ferme: farm business / farm building / truss
toit: roof / anatomy / above 69

Semantic analysis
Initialization: attach a sense vector (1 … 5) to each leaf node termites, attaquent, rapidement, fermes, toit 70

Semantic analysis
Propagation (up): each internal node of the tree receives a vector computed from its children 71

Semantic analysis
Back propagation (down): v′(Ni,j) = (v(Ni,j) ⊗ v(Ni)) + v(Ni,j)
The leaf vectors are updated to 1′ … 5′ 76

Semantic analysis
Sense selection or sorting:
attaquer: to start / → to attack / to criticize
ferme: farm business / farm building / → truss
toit: → roof / anatomy / above 77

Sense selection
Recursive descent on t(w) used as a decision tree, comparing DA(v′, v(δi)) at each node
Stop on a leaf, or stop on an internal node (a coarser sense cluster) 78

Vector syntactic schemas
S: NP(ART, N) → v(NP) = v(N)
S: NP1(NP2, N) → v(NP1) = α·v(NP2) + v(N), with 0 < α < 1
v(sail boat) = 1/2·v(sail) + v(boat)
v(boat sail) = 1/2·v(boat) + v(sail) 78

Vector syntactic schemas
Not necessarily linear
S: GA(GADV(ADV), ADJ) → v(GA) = v(ADJ) ^ p(ADV)
p(very) = 2, p(mildly) = 1/2
v(very happy) = v(happy) ^ 2
v(mildly happy) = v(happy) ^ 1/2 79
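The two schemas can be sketched as follows (function names and the default α = 1/2 are my assumptions; the powers for "very"/"mildly" follow the slides):

```python
import math

def np_vector(head, modifier, alpha=0.5):
    # v(NP1) = alpha * v(NP2) + v(N): head at full weight, modifier down-weighted
    return [h + alpha * m for h, m in zip(head, modifier)]

def ga_vector(adj, power):
    # v(GA) = v(ADJ) ^ p(ADV): component-wise, sign-preserving amplification
    return [math.copysign(abs(a) ** power, a) for a in adj]

happy = [0.25, 0.64, 0.0]
print(ga_vector(happy, 2))    # "very happy": contrasts are sharpened
print(ga_vector(happy, 0.5))  # "mildly happy": contrasts are flattened
```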

Iteration & convergence
Iterate the propagation until convergence
Local: DA(vi, vi+1) ≤ ε for the top node only
Global: DA(vi, vi+1) ≤ ε for all nodes
Good results but costly 87

Lexicon construction
Manual kernel
Automatic definition analysis
Global infinite loop = learning: unknown words found in definitions feed back through the iterations into the kernel
Manual adjustments (synonyms) 93

Application: machine translation
Lexical transfer: v(source) → v(target)
k-NN search minimizing DA(source, target)
Submeaning selection
Direct, or via a transformation matrix
Example: chat → cat 97

Application: information retrieval on texts
Textual document indexation: language dependent
Retrieval: language independent, multilingual
Domain representation: horse → equitation
Granularity: document, paragraphs, etc. 98

Application: information retrieval on texts
Index = lexicon = (di, v(di))*
k-NN search over documents d1 … dn minimizing DA(v(r), v(di)) for a request r 101

Search engine — distance adjustments
Minimizing DA(v(r), v(di)) may pose problems, especially with small documents
Correlation between CV and conceptual richness
Pathological cases: « plane » vs « plane plane plane plane … »; « inundation » vs « blood »: D = 0.85 (shared concept: liquid) 103

Search engine — distance adjustments
Correction with relative intensity, between the request r and the retrieved document d:
D = √(DA(r, d) · DI(r, d))
0 ≤ μ(r, d) ≤ 1 ⇒ 0 ≤ DI(r, d) ≤ π/2 105
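A sketch of the corrected score as reconstructed above (the function name is mine): the geometric mean of the angular and intensity distances, so a document must be close in both angle and intensity to rank high:

```python
import math

def combined_distance(da, di):
    # D = sqrt(DA(r, d) * DI(r, d)); both arguments in radians
    return math.sqrt(da * di)

# Values from the tit examples earlier in the deck:
print(combined_distance(0.4, 0.25))   # (tit, passerine)
print(combined_distance(1.14, 0.89))  # (tit, train)
```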

Conclusion
Approach: similarity-based, statistical (but not probabilistic), thematic (and rhematic?)
A combination of symbolic methods (AI), transformational systems, and neural nets
With a large dimension (> 50000?) 107

Conclusion — self evaluation
Vector quality: tests against corpora
Unknown words: proper nouns of persons, products, etc. (Lionel Jospin, Danone, Air France) → automatic learning
Badly handled phenomena: negation & lexical functions (Mel'čuk) 109

End 1. extremity 2. death 3. aim … 110