
1 What is the Jeopardy Model? A Quasi-Synchronous Grammar for Question Answering. Mengqiu Wang, Noah A. Smith and Teruko Mitamura, Language Technology Institute, Carnegie Mellon University

2 The task: high-efficiency document retrieval followed by high-precision answer ranking. Q: "Who is the leader of France?" [Figure: two candidate rankings: (a) 1. Bush later met with French president Jacques Chirac. 2. Henri Hadjenberg, who is the leader of France's Jewish community, … 3. …; (b) 1. Henri Hadjenberg, who is the leader of France's Jewish community, … 2. Bush later met with French president Jacques Chirac. (as of May 16 2007) 3. …]

3 Challenges: high-efficiency document retrieval and high-precision answer ranking. Q: "Who is the leader of France?" Candidate sentences: 1. Bush later met with French president Jacques Chirac. 2. Henri Hadjenberg, who is the leader of France's Jewish community, … 3. …

4 Semantic Transformations. Q: "Who is the leader of France?" A: Bush later met with French president Jacques Chirac.

5 Syntactic Transformations. [Figure: dependency trees of Q "Who is the leader of France?" and A "Bush met with French president Jacques Chirac", with a modifier (mod) edge highlighted.]

6 Syntactic Variations. [Figure: dependency trees of Q "Who is the leader of France?" and A "Henri Hadjenberg, who is the leader of France's Jewish community", with a modifier (mod) edge highlighted.]

7 Two key phenomena in QA, mapping from Q to A: semantic transformation (leader → president) and syntactic transformation ("leader of France" → "French president").

8 Existing work in QA. Semantics: use WordNet as a thesaurus for expansion. Syntax: use dependency parse trees, but merely transform the feature space into a dependency-parse feature space; no fundamental changes in the algorithms (edit distance, classifiers, similarity measures).

9 Where else have we seen these transformations? Machine translation (especially syntax-based MT), paraphrasing, sentence compression, textual entailment.

10 Noisy-channel view. Machine Translation: source S, target E, scored by a language model and a translation model. Question Answering: question Q, answer A, scored by a retrieval model and a Jeopardy model.
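As a hedged sketch of the analogy on this slide (notation mine, not taken from the paper): in MT the best target E maximizes a language model times a translation model, and in QA the best answer A maximizes a retrieval model times the Jeopardy model.

```latex
% Sketch of the noisy-channel analogy; S/E are the MT source/target,
% Q/A the question and candidate answer.
\hat{E} = \arg\max_{E} \; p(E)\, p(S \mid E)
\qquad\longleftrightarrow\qquad
\hat{A} = \arg\max_{A} \; p(A)\, p(Q \mid A)
```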

11 What is Jeopardy!? From wikipedia.org: Jeopardy! is a popular international television quiz game show (#2 of the 50 Greatest Game Shows of All Time). Three contestants select clues in the form of an answer, to which they must supply correct responses in the form of a question. The concept of "questioning answers" is original to Jeopardy!.

12 Jeopardy Model. We make use of a formalism called quasi-synchronous grammar [D. Smith & Eisner '06], originally developed for MT.

13 Quasi-Synchronous Grammars. Based on key observations in MT: translated sentences often have some isomorphic syntactic structure, but usually not in its entirety; the strictness of the isomorphism may vary across words or syntactic rules. Key idea: unlike stricter, more rigid synchronous grammars (e.g. SCFG), QG defines a monolingual grammar for the target tree, "inspired" by the source tree.

14 Quasi-Synchronous Grammars. In other words, we model the generation of the target tree as influenced by the source tree (and their alignment). QA can be thought of as extremely free translation within the same language. The linkage between question and answer trees in QA is looser than in MT, which gives QG a bigger edge.

15 Jeopardy Model. Works on labeled dependency parse trees. Learns the hidden structure (the alignment between the Q and A trees) by summing out ALL possible alignments. One particular alignment tells us both the syntactic configurations and the word-to-word semantic correspondences. An example follows (answer parse tree, question parse tree, and an alignment).

16 [Figure: labeled dependency parse trees. Q: "Who is the leader of France?" A: "Bush met with French president Jacques Chirac." Nodes carry POS tags (NNP, VBD, JJ, NN, VB, WP, DT) and semantic/named-entity labels (person, location, qword); edges carry dependency labels (root, subj, obj, det, of, with, nmod).]

17 [Figure: the same Q and A parse trees as the previous slide, starting the step-by-step alignment walkthrough.]

18 [Figure: the answer parse tree with the question node "is" shown as the first step of the alignment.] Our model makes local Markov assumptions to allow efficient computation via Dynamic Programming (details in paper): given its parent, a word is independent of all other words (including siblings).

19 [Figure: the walkthrough continues; the question nodes "who" and "is" are now shown against the answer tree.]

20 [Figure: the walkthrough adds the question node "leader".]

21 [Figure: the walkthrough adds the question node "the".]

22 [Figure: the completed alignment between the full question and answer parse trees.]
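A minimal sketch (not the authors' code) of the dynamic program that the local Markov assumption on slide 18 makes possible: given its parent's alignment, each question node is independent of everything else, so p(Q | A) can be computed bottom-up over the question tree while summing over all alignments. The `score` function and the tree encoding below are placeholders standing in for the model's actual per-node probabilities.

```python
from functools import lru_cache

def prob_question_given_answer(q_root, q_children, a_nodes, score):
    """q_root: root word of the question tree.
    q_children: dict mapping each question node to a tuple of its children.
    a_nodes: tuple of answer-tree nodes a question word may align to.
    score(q_node, a_node, a_parent_node): placeholder local probability
    combining the syntactic configuration and the lexical-semantic term."""

    @lru_cache(maxsize=None)
    def inside(q_node, a_parent_node):
        # Sum over this question node's own alignment; given that alignment,
        # its children are conditionally independent of everything else.
        total = 0.0
        for a_node in a_nodes:
            local = score(q_node, a_node, a_parent_node)
            for q_child in q_children.get(q_node, ()):
                local *= inside(q_child, a_node)
            total += local
        return total

    return inside(q_root, "$root")  # "$root" stands for the answer tree's dummy root

```

Replacing the inner sum over `a_nodes` with a max would give the single-best-alignment variant compared later on the "Summing vs. Max" slide.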

23 6 types of syntactic configurations: Parent-child.

24 [Figure: the Q/A parse trees, highlighting an example of the parent-child configuration.]

25 Parent-child configuration

26 6 types of syntactic configurations: Parent-child, Same-word.

27 [Figure: the Q/A parse trees, highlighting an example of the same-word configuration.]

28 Same-word configuration

29 6 types of syntactic configurations: Parent-child, Same-word, Grandparent-child.

30 [Figure: the Q/A parse trees, highlighting an example of the grandparent-child configuration.]

31 Grandparent-child configuration

32 6 types of syntactic configurations: Parent-child, Same-word, Grandparent-child, Child-parent, Siblings, C-command (same as [D. Smith & Eisner '06]).
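Below is a rough, hedged sketch of how such a configuration label could be computed for a question parent/child pair from the answer-tree nodes they are aligned to; the precise definitions (notably c-command) are those of [D. Smith & Eisner '06] and the paper, and `parent_of` is an assumed child-to-parent map over the answer tree.

```python
def configuration(a_of_q_parent, a_of_q_child, parent_of):
    """a_of_q_parent / a_of_q_child: answer-tree nodes aligned to a question
    parent and its child; parent_of: dict from answer node to its parent."""
    if a_of_q_parent == a_of_q_child:
        return "same-word"
    if parent_of.get(a_of_q_child) == a_of_q_parent:
        return "parent-child"
    if parent_of.get(a_of_q_parent) == a_of_q_child:
        return "child-parent"
    grandparent = parent_of.get(parent_of.get(a_of_q_child))
    if grandparent is not None and grandparent == a_of_q_parent:
        return "grandparent-child"
    p1, p2 = parent_of.get(a_of_q_parent), parent_of.get(a_of_q_child)
    if p1 is not None and p1 == p2:
        return "siblings"
    # Catch-all; a faithful c-command test would inspect domination relations.
    return "c-command / other"
```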

33 [Figure slide; no transcript text survives.]

34 Modeling alignment: Base model.

35 [Figure: the Q/A parse trees, illustrating the base alignment model.]

36 [Figure: the Q/A parse trees again, continuing the base-model illustration.]

37 Modeling alignment (cont.): Base model; Log-linear model with lexical-semantic features from WordNet (identity, hypernym, synonym, entailment, etc.); Mixture model.
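One natural way to write the lexical-alignment mixture sketched on this slide (a hedged reconstruction, not the paper's exact parameterization): a coefficient lambda interpolates the base multinomial with a log-linear model over the WordNet-derived features f(q, a).

```latex
% Hedged reconstruction of the mixture over word-to-word correspondences.
p_{\mathrm{mix}}(q \mid a) \;=\; \lambda\, p_{\mathrm{base}}(q \mid a)
  \;+\; (1 - \lambda)\,
  \frac{\exp\!\big(\theta^{\top} \mathbf{f}(q, a)\big)}
       {\sum_{q'} \exp\!\big(\theta^{\top} \mathbf{f}(q', a)\big)}
```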

38 Parameter estimation. Things to be learnt: the multinomial distributions in the base model, the log-linear model's feature weights, and the mixture coefficient. Training involves summing out hidden structures, so the objective is non-convex. Solved using conditional Expectation-Maximization.

39 Experiments. TREC 8-12 data set for training; TREC 13 questions for development and testing.

40 Candidate answer generation. For each question, we take all documents from the TREC document pool and extract sentences that contain at least one non-stopword keyword from the question. For computational reasons (parsing speed, etc.), we kept only answer sentences of at most 40 words.
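A small sketch of this filter, under assumptions not in the slide (whitespace tokenization and a toy stopword list); the actual system works over the TREC document pool with its own preprocessing.

```python
# Keep any sentence that shares at least one non-stopword keyword with the
# question and is at most 40 tokens long.
STOPWORDS = {"the", "a", "an", "of", "is", "who", "what", "where", "to", "in"}

def candidate_sentences(question, sentences, max_len=40):
    keywords = {w.lower() for w in question.split() if w.lower() not in STOPWORDS}
    for sent in sentences:
        tokens = sent.split()
        if len(tokens) <= max_len and keywords & {t.lower() for t in tokens}:
            yield sent
```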

41 Dataset statistics. Manually labeled 100 questions for training (348 positive Q/A pairs in total). 84 questions for development (1415 Q/A pairs in total; 3.1+, 17.1-). 100 questions for testing (1703 Q/A pairs in total; 3.6+, 20.0-). Automatically labeled another 2193 questions to create a noisy training set, for evaluating model robustness.

42 Experiments (cont.). Each question and answer sentence is tokenized, POS-tagged (MXPOST), parsed (MSTParser), and labeled with named-entity tags (IdentiFinder).

43 Baseline systems (replications). [Cui et al. SIGIR '05]: the algorithm behind one of the best-performing systems in TREC evaluations; it uses a mutual-information-inspired score computed over dependency trees and a single fixed alignment between them. [Punyakanok et al. NLE '04]: measures the similarity between Q and A by computing tree edit distance. Both baselines are high-performing, syntax-based, and the most straightforward to replicate. We further enhanced the algorithms by augmenting them with WordNet.

44 Results. [Table: Mean Average Precision and Mean Reciprocal Rank of Top 1; values shown include 28.2%, 23.9%, 41.2%, 30.3%; note: statistically significantly better than the 2nd-best score in each column.]
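For reference, the two measures named in the table, in their textbook form (a sketch; the official TREC scoring scripts differ in details). Each ranking is a list of 0/1 relevance labels for one question, ordered by the system's score.

```python
def average_precision(labels):
    # Precision at each relevant position, averaged over relevant items.
    hits, total = 0, 0.0
    for i, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0

def mean_average_precision(rankings):
    return sum(average_precision(r) for r in rankings) / len(rankings)

def mean_reciprocal_rank(rankings):
    # Reciprocal rank of the first relevant answer, averaged over questions.
    def rr(labels):
        for i, rel in enumerate(labels, start=1):
            if rel:
                return 1.0 / i
        return 0.0
    return sum(rr(r) for r in rankings) / len(rankings)
```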

45 Summing vs. Max. [Figure: comparison of summing over all alignments versus taking only the single best (max) alignment.]

46 Conclusion. We developed a probabilistic model for QA based on quasi-synchronous grammar. Experimental results showed that our model is more accurate and robust than state-of-the-art syntax-based QA models. The mixture model is shown to be powerful, and the log-linear model allows us to use arbitrary features. The approach provides a general framework for many other NLP applications (compression, textual entailment, paraphrasing, etc.).

47 Future Work. Higher-order Markovization, both horizontally and vertically, would let us look at more context, at the expense of higher computational cost. More features from external resources, e.g. a paraphrase database. Extending the model to cross-lingual QA: avoid the paradigm of translation as pre- or post-processing; we can naturally incorporate a lexical or phrase translation probability table into the model so that translation is handled inherently. Taking parsing uncertainty into account.

48 Thank you! Questions?

