
1 Learning a Compositional Semantic Parser using an Existing Syntactic Parser
Ruifang Ge and Raymond J. Mooney
Machine Learning Group, Department of Computer Sciences, University of Texas at Austin
August 4, 2009

2 Semantic Parsing
Semantic parsing maps a natural-language sentence into a completely formal meaning representation (MR) in a meaning representation language (MRL).
Applications:
- CLang (Chen et al., 2003)
- Geoquery (Zelle & Mooney, 1996)

3 Example: CLang (RoboCup Coach Language)
In the RoboCup Coach competition, teams compete to coach simulated soccer players. The coaching instructions are given in a formal language called CLang.
NL (coach): "If our player 2 has the ball, then position our player 5 in the midfield."
CLang MR: ((bowner (player our {2})) (do (player our {5}) (pos (midfield))))

4 GeoQuery: A Database Query Application
Query application for a U.S. geography database [Zelle & Mooney, 1996].
NL (user): "How many states does the Mississippi run through?"
Query MR: answer(A, count(B, (state(B), C=riverid(mississippi), traverse(C,B)), A))
Database answer: 10

5 Outline
- Prior work on learning semantic parsers
- Learning a compositional semantic parser using an existing syntactic parser
- Experimental results
- Conclusions

6 Prior Work: SCISSOR (Ge & Mooney, 2005)
- A syntax-driven semantic parser.
- Uses an integrated syntactic-semantic parsing model to generate a semantically augmented parse tree (SAPT) for an NL sentence.
- Gold-standard SAPTs are required for training.

7 Meaning Representation Language Grammar (MRLG)
- Assumes the MRL is defined by an unambiguous context-free grammar.
- Each production rule introduces a single predicate in the MRL.
- The parse of an MR gives its predicate-argument structure.

Production                      Predicate
CONDITION → (bowner PLAYER)     P_BOWNER
PLAYER → (player TEAM {UNUM})   P_PLAYER
UNUM → 2                        P_UNUM
TEAM → our                      P_OUR
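To make the idea concrete, here is a minimal sketch (hypothetical helper code, not the talk's system) that tokenizes a CLang MR and recovers its predicate-argument structure under the four productions above; the predicate names mirror the table.

```python
# Sketch: recover the predicate-argument structure of a CLang MR
# under the unambiguous MRL grammar above. Hypothetical illustration.

def tokenize(mr):
    # Split on parentheses/braces, keeping them as tokens.
    for ch in "(){}":
        mr = mr.replace(ch, f" {ch} ")
    return mr.split()

# Map each MRL function symbol / terminal to its predicate (from the table).
PREDICATES = {"bowner": "P_BOWNER", "player": "P_PLAYER",
              "our": "P_OUR", "2": "P_UNUM"}

def parse(tokens):
    """Return a (predicate, [arguments]) tree for one MR expression."""
    tok = tokens.pop(0)
    if tok in "({":                      # open a bracketed expression
        head = parse(tokens)
        args = []
        while tokens[0] not in ")}":
            args.append(parse(tokens))
        tokens.pop(0)                    # discard the closing bracket
        return (head[0], args) if args else head
    return (PREDICATES[tok], [])

tree = parse(tokenize("(bowner (player our {2}))"))
print(tree)
# ('P_BOWNER', [('P_PLAYER', [('P_OUR', []), ('P_UNUM', [])])])
```

Because the grammar is unambiguous, each MR has exactly one such parse, which is what lets the learner read off predicate-argument structure deterministically.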

8 Example: SCISSOR (SAPT)
SAPT for "our player 2 has the ball"; each node carries a syntactic label and a semantic label:

(S: P_BOWNER
  (NP: P_PLAYER
    (PRP$: P_OUR  our)
    (NN: λa1 λa2 P_PLAYER  player)
    (CD: P_UNUM  2))
  (VP: λa1 P_BOWNER
    (VB: λa1 P_BOWNER  has)
    (NP: NULL
      (DET: NULL  the)
      (NN: NULL  ball))))

9 Example: SCISSOR (Meaning Composition)
Meanings are composed bottom-up over the same tree:
- NN "player" (λa1 λa2 P_PLAYER) + CD "2" (P_UNUM) + PRP$ "our" (P_OUR) → NP: player(our {2})
- VB "has" (λa1 P_BOWNER) + NP "the ball" (NULL) → VP: λa1 bowner(a1)
- NP: player(our {2}) + VP: λa1 bowner(a1) → S: bowner(player(our {2}))
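The composition on this slide can be mimicked with ordinary functions: treat each λ-annotated node as a function awaiting its arguments. This is a hypothetical sketch over string MRs, not the system's implementation.

```python
# Sketch: slide 9's bottom-up meaning composition, with each
# lambda-annotated node modeled as a Python function.

p_our = "our"
p_unum = "2"
p_player = lambda a1, a2: f"player({a1} {{{a2}}})"   # λa1 λa2 P_PLAYER
p_bowner = lambda a1: f"bowner({a1})"                # λa1 P_BOWNER

np_meaning = p_player(p_our, p_unum)   # NP "our player 2"
s_meaning = p_bowner(np_meaning)       # S  "our player 2 has the ball"
print(s_meaning)                       # bowner(player(our {2}))
```

The NULL-labeled words ("the", "ball") contribute no function, which is why the VP meaning is just the VB meaning passed upward.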

10 Problems with SCISSOR
- Requires extensive SAPT annotation for training, including both syntactic parse trees and semantic labels.
- Must learn both syntax and semantics from the same limited training corpus.
- High-performance syntactic parsers trained on existing large corpora are already available (Collins, 1997; Charniak & Johnson, 2005).

11 Outline
- Prior work on learning semantic parsers
- Learning a compositional semantic parser using an existing syntactic parser
- Experimental results
- Conclusions

12 System Overview
- Before training and testing: an existing syntactic parser maps each training/test sentence s to a syntactic parse tree t.
- Training: given the unambiguous CFG of the MRL and a training set {(s, t, m)}, where m is the sentence's MR, semantic knowledge acquisition produces a semantic lexicon and composition rules; parameter estimation then produces a probabilistic parsing model.
- Testing: semantic parsing maps an input sentence s to an output MR m.

13 Outline
- Prior work on learning semantic parsers
- Learning a compositional semantic parser using an existing syntactic parser
  - Semantic knowledge acquisition
  - Parameter estimation
- Experimental results
- Conclusions

14 Learning a Semantic Lexicon
- Use a statistical MT approach based on Wong and Mooney (2006) to construct word alignments between NL sentences and their MRs.
- Train IBM Model 5 word alignment (GIZA++) to generate the top 5 word/predicate alignments for each training example.
- Assume each word alignment defines a possible mapping from words to predicates for composing the correct MR.

15 Example: Word Alignment
For "our player 2 has the ball":
our → P_OUR, player → P_PLAYER, 2 → P_UNUM, has → P_BOWNER, the/ball → (unaligned)
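One way to picture how the top alignments yield lexicon candidates is to count word/predicate pairs across them and keep the most frequent predicate per word. This is a hypothetical illustration only; the actual system obtains its alignments from GIZA++ (IBM Model 5).

```python
# Sketch: turn word alignments into candidate lexicon entries by
# counting word/predicate pairs across the top alignments of a
# training example. Alignments here are made-up illustrations.
from collections import Counter

# Two (hypothetical) alignments for "our player 2 has the ball".
alignments = [
    {"our": "P_OUR", "player": "P_PLAYER", "2": "P_UNUM", "has": "P_BOWNER"},
    {"our": "P_OUR", "player": "P_PLAYER", "2": "P_UNUM"},
]

counts = Counter((w, p) for a in alignments for w, p in a.items())
lexicon = {}
for (w, p), c in counts.most_common():
    lexicon.setdefault(w, p)   # keep the highest-count predicate per word

print(lexicon["player"])   # P_PLAYER
```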

16 Learning Semantic Composition Rules
- Assume each word alignment and syntactic parse defines a possible SAPT for composing the correct MR.
- A bottom-up, left-to-right process composes the MRs of internal nodes, guided by the sentence's correct MR.
- Semantic composition rules are extracted directly during this process.

17–22 Learning Semantic Composition Rules: Worked Example
For "our player 2 has the ball", with syntactic parse (S (NP (PRP$ our) (NN player) (CD 2)) (VP (VB has) (NP (DET the) (NN ball)))) and word alignments our → P_OUR, player → λa1 λa2 P_PLAYER, 2 → P_UNUM, has → λa1 P_BOWNER, the/ball → NULL, the following rules are extracted bottom-up, left-to-right:
- λa1 λa2 P_PLAYER + P_UNUM → {λa1 P_PLAYER, a2=c2} ("player" combines with "2"; the unum fills argument a2)
- P_OUR + λa1 P_PLAYER → {P_PLAYER, a1=c1} ("our" fills a1, completing the NP meaning player(our {2}))
- λa1 P_BOWNER + NULL → λa1 P_BOWNER (the VB meaning passes up through the VP unchanged)
- P_PLAYER + λa1 P_BOWNER → {P_BOWNER, a1=c1} (the subject NP fills a1, yielding bowner(player(our {2})))
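The extracted rules can be pictured as a lookup table applied bottom-up over the parse. This is a minimal sketch with a hypothetical rule encoding; argument filling (a1=c1 etc.) is elided to keep it short.

```python
# Sketch: apply extracted composition rules bottom-up, encoding each
# rule as (left child label, right child label) -> parent label.
# Hypothetical encoding; "la1" stands in for the λa1 notation.

RULES = {
    ("la1 la2 P_PLAYER", "P_UNUM"): "la1 P_PLAYER",   # a2 = c2
    ("P_OUR", "la1 P_PLAYER"): "P_PLAYER",            # a1 = c1
    ("la1 P_BOWNER", "NULL"): "la1 P_BOWNER",         # NULL child ignored
    ("P_PLAYER", "la1 P_BOWNER"): "P_BOWNER",         # a1 = c1
}

def compose(left, right):
    return RULES[(left, right)]

# Bottom-up over the SAPT for "our player 2 has the ball":
inner = compose("la1 la2 P_PLAYER", "P_UNUM")   # NN "player" + CD "2"
np = compose("P_OUR", inner)                    # PRP$ "our" + the above
vp = compose("la1 P_BOWNER", "NULL")            # VB "has" + NP "the ball"
s = compose(np, vp)                             # subject NP + VP
print(s)                                        # P_BOWNER
```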

23 Ensuring Meaning Composition
- The syntactic structure of the NL sentence sometimes differs from the syntactic structure of its MR.
- Introduce macro-predicates that combine multiple predicates.
- These ensure that each MR in the training set can be composed using the syntactic parse of its corresponding NL sentence, given its word alignments.

24–28 Ensuring Meaning Composition: Worked Example
For "then position our player 5 in the midfield", the VP parse attaches (VB position) to (NP our player 5) before (PP in the midfield), but in the MR (do (player our {5}) (pos (midfield))) the player is an argument of P_DO, not of P_POS, so λa1 P_POS + P_PLAYER has no direct composition rule.
- Introduce the macro-predicate P_DO_POS (a1: PLAYER, a2: REGION), combining P_DO and P_POS:
  λa1 P_POS + P_PLAYER → {λp1 λa2 P_DO_POS, a1=c2}
- The PP then fills the region argument, completing the macro:
  λp1 λa2 P_DO_POS + P_MIDFIELD → {λp1 P_DO_POS, a2=c2}, where λp1 P_DO_POS = λp1 P_DO
- Finally the directive combines with the completed macro, copying its arguments:
  λa1 λa2 P_DO + λp1 P_DO → {P_DO, a1=(c2,a1), a2=(c2,a2)}
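A macro-predicate can be pictured as delayed composition: the player argument is absorbed early, and the region and outer directive are supplied later. A hypothetical sketch of the derivation above, not code from the system:

```python
# Sketch: a macro-predicate as delayed composition of P_DO and P_POS,
# mirroring the slide's derivation over string MRs.

p_pos = lambda region: f"(pos {region})"
p_do = lambda player, action: f"(do {player} {action})"

# λp1 λa2 P_DO_POS: has absorbed the PLAYER from "our player 5",
# still awaits the REGION (a2); the outer P_DO is folded in here.
player_mr = "(player our {5})"
do_pos = lambda region: p_do(player_mr, p_pos(region))

mr = do_pos("(midfield)")
print(mr)   # (do (player our {5}) (pos (midfield)))
```

The point of the macro is exactly this reordering: it lets the PLAYER combine with P_POS's subtree first, even though in the MR it belongs to P_DO.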

29 Outline
- Prior work on learning semantic parsers
- Learning a compositional semantic parser using an existing syntactic parser
  - Semantic knowledge acquisition
  - Parameter estimation
- Experimental results
- Conclusions

30 Parameter Estimation
- Use a maximum-entropy model similar to that of Zettlemoyer & Collins (2006) and Wong & Mooney (2006), where s is the sentence, t its syntactic parse, and s_a an SAPT: P(s_a | s, t) ∝ exp(θ · f(s_a)), with feature vector f and parameters θ.
- Training finds a parameter vector θ that (approximately) maximizes the sum of the conditional log-likelihood of the training set, where for each example with MR m the likelihood sums over all SAPTs yielding m.
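The conditional probability of an SAPT under such a model is a softmax of feature scores over competing SAPTs. A minimal sketch on toy data; the feature names and vectors are invented for illustration, and the real model sums over all SAPTs of (s, t) yielding the correct MR:

```python
# Sketch: max-ent probability of an SAPT, with hypothetical sparse
# feature vectors. Not the actual training code.
import math

def p_sapt(theta, feats, all_feats):
    """P(sapt | s, t) = exp(theta . f) / sum over competing SAPTs."""
    score = lambda f: math.exp(sum(theta[k] * v for k, v in f.items()))
    return score(feats) / sum(score(f) for f in all_feats)

theta = {"word:has->P_BOWNER": 1.2, "rule:NP+VP": 0.5}
correct = {"word:has->P_BOWNER": 1, "rule:NP+VP": 1}
wrong = {"rule:NP+VP": 1}

# Log-likelihood contribution of one training example.
ll = math.log(p_sapt(theta, correct, [correct, wrong]))
print(round(ll, 3))
```

Training would adjust θ (e.g. by gradient ascent) to increase this log-likelihood summed over the training set.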

31 Features
- Lexical features:
  - Unigram features: the number of times a word is assigned a particular predicate.
  - Bigram features: the number of times a word is assigned a particular predicate given its previous/subsequent word.
- Rule features: the number of times a particular composition rule is applied in a derivation.

32 Outline
- Prior work on learning semantic parsers
- Learning a compositional semantic parser using an existing syntactic parser
- Experimental results
- Conclusions

33 Experimental Corpora
- CLang:
  - 300 randomly selected rules from the log files of the 2003 RoboCup Coach Competition
  - CLang advice is annotated with NL sentences
  - 22.52 words per sentence
- GeoQuery [Zelle & Mooney, 1996]:
  - 880 queries for a U.S. geography database
  - Prolog logical forms
  - 7.48 words per sentence

34 Experimental Methodology
- Evaluated using standard 10-fold cross validation.
- Correctness:
  - CLang: the output exactly matches the correct representation.
  - Geoquery: the query retrieves the correct answer, reflecting the quality of the final result returned to the user.

35 Experimental Methodology: Metrics
- Precision: the percentage of returned MRs that are correct.
- Recall: the percentage of test sentences for which a correct MR is returned.
- F-measure: the harmonic mean of precision and recall.
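Since the parser may decline to return an MR for some sentences, precision and recall have different denominators. A small sketch with hypothetical counts:

```python
# Sketch: the evaluation metrics from raw counts (made-up numbers).

def metrics(correct, produced, total):
    precision = correct / produced          # correct / MRs returned
    recall = correct / total                # correct / test sentences
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# e.g. 74 correct MRs out of 87 returned, over 100 test sentences.
p, r, f = metrics(correct=74, produced=87, total=100)
print(round(p, 4), round(r, 2), round(f, 4))
```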

36 Compared Systems
- SCISSOR (Ge & Mooney, 2005): an integrated syntactic-semantic parser; training requires SAPTs.
- KRISP (Kate & Mooney, 2006): an SVM-based parser using string kernels.
- Lu (Lu et al., 2008): a generative model with discriminative reranking.
- WASP (Wong & Mooney, 2006; Wong & Mooney, 2007): a system based on synchronous grammars.
- Z&C (Zettlemoyer & Collins, 2007): a probabilistic parser based on relaxed CCG grammars, requiring a hand-built, ambiguous CCG grammar template.
- Our system: requires a syntactic parser.

37 Syntactic Parses Utilized
- GoldSyn: gold-standard syntactic parses.
- Syn20 / Syn40: Collins's head-driven syntactic parsing model II trained on the WSJ plus a small number of in-domain training sentences (CLang: Syn20, 88.21%; Geoquery: Syn40, 91.46%).
- Syn0: Collins's parser trained only on the WSJ (CLang: 82.15%; Geoquery: 76.44%).

38 Performance on CLang

System     Precision  Recall  F-measure
GOLDSYN    84.73      74.00   79.00
SYN20      85.37      70.00   76.92
SYN0       87.01      67.00   75.71
WASP       88.85      61.93   72.99
KRISP      85.20      61.85   71.67
SCISSOR    89.50      73.70   80.80
Lu         82.50      67.70   74.40

39 Performance on Geoquery

System     Precision  Recall  F-measure
GOLDSYN    91.94      88.18   90.02
SYN20      90.21      86.93   88.54
SYN0       81.76      78.98   80.35
WASP       91.95      86.59   89.19
Z&C        91.63      86.07   88.76
SCISSOR    95.50      77.20   85.38
KRISP      93.34      71.70   81.10
Lu         89.30      81.50   85.20

40 Experiments with Less Training Data
- CLang40: 40 randomly selected examples from the CLang training set.
- Geo250 (Zelle & Mooney, 1996): an existing 250-example subset.

41 Performance on CLang40

System     Precision  Recall  F-measure
GOLDSYN    61.14      35.67   45.05
SYN20      57.75      31.00   40.35
SYN0       53.54      22.67   31.85
WASP       88.00      14.37   24.71
KRISP      68.35      20.00   30.95
SCISSOR    85.00      23.00   36.20

42 Performance on Geo250

System     Precision  Recall  F-measure
GOLDSYN    95.73      89.60   92.56
SYN20      93.19      87.60   90.31
SYN0       91.81      85.20   88.38
WASP       91.76      75.60   82.90
SCISSOR    98.50      74.40   84.77
KRISP      84.43      71.60   77.49
Lu         91.46      72.80   81.07

43 Conclusion
- Presented a new approach to learning a semantic parser that exploits an existing syntactic parser to produce disambiguated parse trees.
- Experiments showed that the resulting system produces improved results on standard corpora.
- The improvements with less training data demonstrate the advantage of utilizing an existing syntactic parser: syntactic structure is learned from large open-domain treebanks instead of relying only on the training data.

44 Thank You! Questions?

