Enhanced Answer Type Inference from Questions using Sequential Models
Vijay Krishnan, Sujatha Das, Soumen Chakrabarti (IIT Bombay)


1 Enhanced Answer Type Inference from Questions using Sequential Models
Vijay Krishnan, Sujatha Das, Soumen Chakrabarti
IIT Bombay

2 Types in question answering
 Factoid questions: What country's president won a Fields medal? What is Bill Clinton's wife's profession? How much does a rhino weigh?
 Nontrivial to anticipate the answer type: broad clues like how much can be misleading (How much does a rhino cost?), and possessives must be handled carefully
 Motivation: initial passage scoring and screening via a semi-structured query, e.g. atype={weight#n#1, hasDigits} NEAR "rhino"
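To make the screening idea concrete, here is a minimal sketch (not the paper's IR system) that keeps a passage only when a digit-bearing token falls near the anchor word, approximating atype={hasDigits} NEAR "rhino"; the function name and window parameter are illustrative.

```python
import re

def screen_passage(tokens, anchor="rhino", window=5):
    """Keep a passage only if a token matching the surface atype
    (here: hasDigits) occurs within `window` tokens of the anchor."""
    anchor_pos = [i for i, t in enumerate(tokens) if t.lower() == anchor]
    digit_pos = [i for i, t in enumerate(tokens) if re.search(r"\d", t)]
    return any(abs(i - j) <= window for i in anchor_pos for j in digit_pos)

# "An adult rhino weighs about 2300 kg" survives the screen:
print(screen_passage("An adult rhino weighs about 2300 kg".split()))  # True
```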

3 Answer types and informer spans
 Space of atypes: any type membership you can recognize reliably in the corpus, i.e. WordNet synsets, named (typed) entities, orthography and other surface patterns
 How tall is Tom Cruise? maps to NUMBER:distance (UIUC atype taxonomy), hasDigits (surface), linear_measure#n#1 (WordNet)
 A single dominant informer span is almost always enough on standard benchmarks: Name the largest producer of wheat; Which country is the largest producer of wheat?
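These membership tests run directly over corpus tokens. A minimal sketch, assuming NLTK with the WordNet corpus installed; the ancestor synset is a parameter because the right one (e.g. linear_measure#n#1) depends on the question:

```python
import re
from nltk.corpus import wordnet as wn

def has_digits(token):
    # Surface atype: orthographic pattern.
    return bool(re.search(r"\d", token))

def is_a(word, ancestor):
    # WordNet atype: is some noun sense of `word` a hyponym of `ancestor`?
    target = wn.synset(ancestor)
    return any(target in s.closure(lambda x: x.hypernyms())
               for s in wn.synsets(word, pos=wn.NOUN))

print(has_digits("1.70m"))            # True
print(is_a("actor", "person.n.01"))   # True: actor IS-A person in WordNet
```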

4 Exploiting atypes in QA
This talk covers the two boxed components of the pipeline:
1. Informer span tagger (CRF): How tall is the Eiffel Tower yields the informer how tall
2. Atype classifier (SVM): maps the informer to NUMBER:distance / hasDigits
Filtering out informers and stopwords leaves Eiffel Tower, which goes to an IR system supporting type tags and proximity and returns short-listed candidate passages
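In code form the two-stage flow might look like the sketch below; crf_tagger, atype_svm, and index are stand-ins for trained components, not real APIs:

```python
def answer_passages(question, crf_tagger, atype_svm, index, stopwords):
    tokens = question.split()
    # Stage 1: the CRF marks the informer span, e.g. ["How", "tall"].
    informer = crf_tagger.tag(tokens)
    # Stage 2: the SVM maps question + informer to an atype,
    # e.g. NUMBER:distance / hasDigits.
    atype = atype_svm.classify(tokens, informer)
    # Dropping informer tokens and stopwords leaves the proximity
    # keywords, e.g. ["Eiffel", "Tower"].
    keywords = [t for t in tokens
                if t not in informer and t.lower() not in stopwords]
    # Type- and proximity-aware retrieval shortlists candidate passages.
    return index.search(atype=atype, near=keywords)
```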

5 Part 1: Sequence-tagging of informers
 Parse the question
 Extract features from the parse tree at many levels of detail
 Use a CRF to learn discriminative feature weights, with either a 2-state or a 3-state label generator
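As a toy illustration of the sequential model (using sklearn-crfsuite rather than the paper's CRF implementation), here is a 3-state tagger where 1 = before the informer, 2 = informer, 3 = after; the features are simplified stand-ins for the multi-resolution features described next:

```python
import sklearn_crfsuite  # pip install sklearn-crfsuite

def token_features(tokens, pos_tags, i):
    return {
        "pos": pos_tags[i],
        "prev_pos": pos_tags[i - 1] if i > 0 else "BOS",
        "next_pos": pos_tags[i + 1] if i + 1 < len(tokens) else "EOS",
        "is_wh": tokens[i].lower() in {"what", "which", "who", "how", "name"},
    }

# One toy question; "country" is the informer (state 2).
tokens = ["Which", "country", "is", "the", "largest", "producer", "of", "wheat"]
pos    = ["WDT", "NN", "VBZ", "DT", "JJS", "NN", "IN", "NN"]
labels = ["1", "2", "3", "3", "3", "3", "3", "3"]

X, y = [[token_features(tokens, pos, i) for i in range(len(tokens))]], [labels]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, y)
print(crf.predict(X)[0])
```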

6 Example parse tree and feature ideas
 Being an informer token is correlated with part of speech: "IsTag" features
 ...and with whether the token is in the first chunk of its kind, or the second: "IsNum" features, plus neighborhood info
 E.g. the tags for "capital" across parse levels are NN, NP, null, NP, SQ, SBARQ; "Japan" is part of the second NP at level 2

7 A multi-resolution feature table
 Training data is too sparse to lexicalize
 Offset i fires boolean feature IsTag(y, t, ℓ) iff the label at offset i is y and the tag covering token i at parse level ℓ is t
 E.g. position 4 fires IsTag(1, NP, 2)
 Offset i fires boolean feature IsNum(y, n, ℓ) iff the label at offset i is y and token i lies in the n-th chunk of its tag type at level ℓ
 Lots of multi-resolution features, great for CRF!
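One plausible reading of these definitions in code, assuming the parse has already been flattened into one tag per token per level (as in the slide 6 table); the y component is handled by the CRF's label pairing, so only the (t, ℓ) and (n, ℓ) parts are materialized here:

```python
def multires_features(level_tags, i):
    """level_tags[l][i] = tag of token i at parse level l, or None.
    Returns the IsTag/IsNum feature names firing at offset i."""
    feats = []
    for l, tags in enumerate(level_tags):
        tag = tags[i]
        if tag is None:
            continue
        feats.append(f"IsTag(t={tag},l={l})")
        # n = which chunk of this tag type (at this level) contains i;
        # a chunk starts wherever the tag differs from its left neighbor.
        n = sum(1 for j in range(i + 1)
                if tags[j] == tag and (j == 0 or tags[j - 1] != tag))
        feats.append(f"IsNum(n={n},l={l})")
    return feats

levels = [["WP", "VBZ", "DT", "NN", "NN"],   # toy POS level
          ["NP", None, "NP", "NP", "NP"]]    # toy chunk level
print(multires_features(levels, 4))
# ['IsTag(t=NN,l=0)', 'IsNum(n=1,l=0)', 'IsTag(t=NP,l=1)', 'IsNum(n=2,l=1)']
```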

8 Experimental setup
 5500 training and 500 test questions from UIUC (Li and Roth): 6 coarse atypes, 50 fine atypes
 We tagged informer spans by hand, with almost perfect inter-annotator agreement
 Accuracy measures for informer tagging:
Exact match score: predicted informer token set exactly equals the true set
Jaccard score = |X ∩ Y| / |X ∪ Y|, where X is the predicted informer token set and Y the true set
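The two scores transcribe directly into code:

```python
def exact_match(pred, gold):
    # Predicted informer token set exactly equals the true set.
    return set(pred) == set(gold)

def jaccard(pred, gold):
    # |X ∩ Y| / |X ∪ Y| over predicted (X) and true (Y) token sets.
    X, Y = set(pred), set(gold)
    return len(X & Y) / len(X | Y) if X | Y else 1.0

print(exact_match(["producer"], ["producer"]))         # True
print(jaccard(["largest", "producer"], ["producer"]))  # 0.5
```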

9 Contributions of CRF features
 IsNum gives a soft bias toward earlier NPs
 Neighbor tags tune to VBZ, IN, POS(sessives)
 IsEdge makes a big difference: modeling the Markov state transition is essential
(Ablation table: multilevel POS tags, offset within chunk type, neighbor tags, Markov edges; results with informer spans predicted vs. known.)

10 Heuristic baseline
Effective mapping heuristics often used in QA systems:
 For What, Which and Name questions, use the head of the NP adjoining the wh-word
 For How questions, tag how and the subsequent word
 For other questions (When, Where, Who, etc.), choose the wh-word
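The rules are easy to state as a toy function; a real system would find the NP head in the parse tree, which is crudely approximated here by the first non-determiner token after the wh-word:

```python
def heuristic_informer(tokens):
    first = tokens[0].lower()
    if first == "how" and len(tokens) > 1:
        return tokens[:2]                    # "how much", "how tall", ...
    if first in {"what", "which", "name"}:
        # Head of the NP adjoining the wh-word (crude approximation).
        for t in tokens[1:]:
            if t.lower() not in {"the", "a", "an"}:
                return [t]
        return []
    return [tokens[0]]                       # when / where / who / ...

print(heuristic_informer("Which country is the largest producer of wheat".split()))
# ['country']
```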

11 Breakup by question cue
(Chart: Jaccard score (%) by question cue.)
 Major improvement for "what" and "who" questions, which have diverse atypes
 Heuristic informers are not nearly as good
 3-state is much better than 2-state

12 Robustness to wrong parses
 Our learning approach is much more robust to slightly incorrect sentence parses. For example, the parser outputs
(X (X (WP What)) (NP (NP (NN passage)) (SBAR (S (VP (VBZ has) (NP (DT the) (CD Ten) (NNS Commandments)))))))
where the parse should instead be
((WHNP (WH What) (NN Passage)) ... )

13 Part 2: Mapping informers to atypes
Same pipeline as before, now focusing on the second stage: the CRF informer span identifier hands how tall to the SVM atype classifier, which outputs NUMBER:distance / hasDigits; filtering out informers and stopwords leaves Eiffel Tower for the IR system supporting type tags and proximity, which returns shortlisted candidate passages

14 Learning SVMs with informer features
 Choose q-grams from each "field", "named apart" (informer and non-informer q-grams get distinct feature names)
 Also add WordNet hypernyms of informers: map scientist/president/CEO/... to the feature person#n#1
 The target HUMAN:individual is coarse-grained, so it is better correlated with generalizations of the informer
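A sketch of the hypernym expansion, again assuming NLTK's WordNet; each hypernym synset of a noun sense of the informer becomes a feature, "named apart" by a prefix, so scientist, president, and CEO all contribute person.n.01:

```python
from nltk.corpus import wordnet as wn

def informer_hypernym_features(informer):
    feats = set()
    for sense in wn.synsets(informer, pos=wn.NOUN):
        for hyper in sense.closure(lambda s: s.hypernyms()):
            feats.add("informer_hyper=" + hyper.name())
    return feats

print("informer_hyper=person.n.01" in informer_hypernym_features("scientist"))
# True
```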

15 SVM meta-learning results
 A bigram linear SVM is close to the best results so far
 Small gains from parse tree kernels
 If human-annotated informers are used by the SVM, we beat all existing numbers by a large margin
 Even with some errors committed by the CRF, we retain most of the benefit

16 Linear SVM feature ablation
 Bigrams beat all other q-grams
 Both informer bigrams and informer hypernyms help
 Good to hedge bets by retaining ordinary bigrams
 Hypernyms of all question tokens do not help at all

17 Atype accuracy (%) by question cue
 What and which questions show the largest improvement
 Heuristic informers are less effective

18 Observations
 We retain most of the benefit of "perfect" informers
 Significantly more accurate atype classification
 Heuristic informers yield relatively small gains
 We frequently fix errors in what/which questions, whose answer types are harder to infer
 Hypernyms of informer tokens help, but hypernyms of all question tokens don't
 Therefore, the notion of a minimal informer span is important and non-trivial

19 Summary and ongoing work
 Informers and atypes are important and non-trivial aspects of factoid questions
 Simple, clean model for exploiting question syntax and sequential dependencies
 CRF + SVM meta-learner gives high-accuracy informer and atype prediction
 Can map informers directly to the WordNet noun hierarchy, improving precision further
 Can supplement WordNet with KnowItAll-style compilations, improving recall further

