
1 Natural Language Inference — Bill MacCartney, NLP Group, Stanford University, 8 May 2009

2 Natural language inference (NLI)
Also known as recognizing textual entailment (RTE): does premise P justify an inference to hypothesis H?
- An informal, intuitive notion of inference: not strict logic
- Emphasis on variability of linguistic expression
- Necessary to the goal of natural language understanding (NLU); many more immediate applications as well
P: Several airlines polled saw costs grow more than expected, even after adjusting for inflation.
H: Some of the companies in the poll reported cost increases. → yes

3 Applications of NLI: semantic search, question answering, summarization, MT evaluation, …
- Semantic search: the query "Georgia's gas bill doubled" should match passages like "…two-fold increase in gas price…", "…price of gas will be doubled…", "…double Georgia's gas bill…" (Economist.com)
- Question answering: Q: How much did Georgia's gas price increase? Correct answer: "In 2006, Gazprom doubled Georgia's gas bill." Distractors: "Georgia's main imports are natural gas, machinery, …"; "Tbilisi is the capital and largest city of Georgia."; "Natural gas is a gas consisting primarily of methane."
- Machine translation evaluation: does the output paraphrase the target? input: "Gazprom va doubler le prix du gaz pour la Géorgie." output: "Gazprom will double the price of gas for Georgia." target: "Gazprom will double Georgia's gas bill."
[Padó et al. 09] [Harabagiu & Hickl 06] [Tatar et al. 08] [King et al. 07]

4 NLI problem sets
RTE (Recognizing Textual Entailment)
- 4 years, each with dev & test sets, each 800 NLI problems
- Longish premises taken from (e.g.) newswire; short hypotheses
- Balanced 2-way classification: entailment vs. non-entailment
P: As leaders gather in Argentina ahead of this weekend's regional talks, Hugo Chávez, Venezuela's populist president, is using an energy windfall to win friends and promote his vision of 21st-century socialism.
H: Hugo Chávez acts as Venezuela's president. → yes
P: Smith wrote a report in two hours.
H: Smith spent more than two hours writing the report. → no
FraCaS test suite
- 346 NLI problems, constructed by semanticists in the mid-90s
- 55% have a single premise; the remainder have 2 or more premises
- 3-way classification: entailment, contradiction, compatibility

5 NLI: a spectrum of approaches
From robust but shallow to deep but brittle:
- lexical/semantic overlap [Jijkoun & de Rijke 2005]
- patterned relation extraction [Romano et al. 2006]
- semantic graph matching [MacCartney et al. 2006; Hickl et al. 2006]
- FOL & theorem proving [Bos & Markert 2006]
Problem with the shallow end: imprecise, easily confounded by negation, quantifiers, conditionals, factive & implicative verbs, etc.
Problem with the deep end: hard to translate NL to FOL: idioms, anaphora, ellipsis, intensionality, tense, aspect, vagueness, modals, indexicals, reciprocals, propositional attitudes, scope ambiguities, anaphoric adjectives, non-intersective adjectives, temporal & causal relations, unselective quantifiers, adverbs of quantification, donkey sentences, generic determiners, comparatives, phrasal verbs, …
Solution? Natural logic (this work).

6 Shallow approaches to NLI
Example: the bag-of-words approach [Glickman et al. 2005]
Measures approximate lexical similarity of H to (part of) P:
P: Several airlines polled saw costs grow more than expected, even after adjusting for inflation.
H: Some of the companies in the poll reported cost increases.
(figure: each word of H matched to its most similar word in P, with per-word similarity scores such as 0.9, 0.6, 0.9, 0.4, 0.9, 0.8)
- Robust, and surprisingly effective for many NLI problems
- But imprecise, and hence easily confounded
- Ignores predicate-argument structure (this can be remedied)
- Struggles with antonymy, negation, verb-frame alternation
- Crucially, depends on an assumption of upward monotonicity, yet non-upward-monotone constructions are rife [Danescu et al. 2009]: not, all, most, few, rarely, if, tallest, without, doubt, avoid, regardless, unable, …
A sketch of the scoring idea appears below.
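To make the idea concrete, here is a minimal Python sketch of a Glickman-style bag-of-words scorer. It is an illustration only: the string-overlap similarity is a stand-in for the lexical similarity model a real system would use.

```python
from difflib import SequenceMatcher

def word_sim(w1: str, w2: str) -> float:
    """Stand-in lexical similarity: string overlap. A real system would
    use WordNet relations or distributional similarity instead."""
    return SequenceMatcher(None, w1.lower(), w2.lower()).ratio()

def bow_entailment_score(premise: str, hypothesis: str) -> float:
    """Score each H word by its best match in P, then combine by product
    (so a single unmatched H word can sink the whole score)."""
    p_words = premise.split()
    score = 1.0
    for h in hypothesis.split():
        score *= max(word_sim(h, p) for p in p_words)
    return score

P = "Several airlines polled saw costs grow more than expected"
H = "Some of the companies in the poll reported cost increases"
print(bow_entailment_score(P, H))  # threshold the score to predict yes/no
```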

7 The formal approach to NLI
Translate to a formal representation & apply an automated reasoner. Relies on full semantic interpretation of P & H:
P: Several airlines polled saw costs grow more than expected, even after adjusting for inflation.
(exists p (and (poll-event p)
  (several x (and (airline x) (obj p x)
    (exists c (and (cost c) (has x c)
      (exists g (and (grow-event g) (subj g c)
        (greater-than (magnitude g) …)))))))))
- Can succeed in restricted domains, but not in open-domain NLI!
- Need background axioms to complete proofs, but from where?
- Besides, the NLI task is based on an informal definition of inferability
- Bos & Markert 06 found an FOL proof for just 4% of RTE problems

8 Solution? Natural logic! (≠ natural deduction)
Characterizes valid patterns of inference via surface forms: precise, yet sidesteps the difficulties of translating to FOL.
A long history:
- traditional logic: Aristotle's syllogisms, scholastics, Leibniz, …
- modern natural logic begins with Lakoff (1970)
- van Benthem & Sánchez Valencia (1986-91): monotonicity calculus
- Nairn et al. (2006): an account of implicatives & factives
We introduce a new theory of natural logic that extends the monotonicity calculus to account for negation & exclusion, and incorporates elements of Nairn et al.'s model of implicatives; and we implement & evaluate a computational model of it.

9 Outline
- Introduction
- Alignment for NLI
- A theory of entailment relations
- A theory of compositional entailment
- The NatLog system
- Conclusions
[Not covered today: the bag-of-words model, the Stanford RTE system]

10 Alignment for NLI
Linking corresponding words & phrases in two sentences:
P: Gazprom today confirmed a two-fold increase in its gas price for Georgia, beginning next Monday.
H: Gazprom will double Georgia's gas bill. → yes
- The alignment problem is familiar in machine translation (MT)
- Most approaches to NLI depend on a facility for alignment

11 Alignment example
(figure: word alignment between P (premise) and H (hypothesis), showing unaligned content ("deletions" from P), an approximate match (price ~ bill), and a phrase alignment (two-fold increase ~ double))

12 Approaches to NLI alignment
Alignment is addressed variously by current NLI systems.
In some approaches to NLI, alignments are implicit:
- NLI via lexical overlap [Glickman et al. 05; Jijkoun & de Rijke 05]
- NLI as proof search [Tatu & Moldovan 07; Bar-Haim et al. 07]
Other NLI systems make the alignment step explicit:
- Align first, then determine inferential validity [Marsi & Krahmer 05; MacCartney et al. 06]
What about using an MT aligner? Alignment is familiar in MT, with an extensive literature [Brown et al. 93; Vogel et al. 96; Och & Ney 03; Marcu & Wong 02; DeNero et al. 06; Birch et al. 06; DeNero & Klein 08].
Can the tools & techniques of MT alignment transfer to NLI? The dissertation argues: not very well.

13 The MANLI aligner
A model of alignment for NLI consisting of four components:
1. Phrase-based representation
2. Feature-based scoring function
3. Decoding using simulated annealing
4. Perceptron learning

14 Phrase-based alignment representation
Represent alignments by a sequence of phrase edits: EQ, SUB, DEL, INS (subscripts are token indices):
EQ(Gazprom₁, Gazprom₁), INS(will₂), DEL(today₂), DEL(confirmed₃), DEL(a₄), SUB(two-fold₅ increase₆, double₃), DEL(in₇), DEL(its₈), …
- One-to-one at the phrase level (but many-to-many at the token level)
- Avoids arbitrary alignment choices; can use phrase-based resources
A sketch of the representation follows.
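A hypothetical rendering of this edit-sequence representation (my own data structure, not the system's actual code):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Edit:
    op: str                  # one of "EQ", "SUB", "DEL", "INS"
    p_phrase: Optional[str]  # premise phrase (None for INS)
    h_phrase: Optional[str]  # hypothesis phrase (None for DEL)

# The alignment from this slide, as a sequence of phrase edits:
alignment = [
    Edit("EQ",  "Gazprom", "Gazprom"),
    Edit("INS", None, "will"),
    Edit("DEL", "today", None),
    Edit("DEL", "confirmed", None),
    Edit("DEL", "a", None),
    Edit("SUB", "two-fold increase", "double"),
    Edit("DEL", "in", None),
    Edit("DEL", "its", None),
]
```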

15 A feature-based scoring function
Score edits as a linear combination of features, then sum over edits:
- Edit type features: EQ, SUB, DEL, INS
- Phrase features: phrase sizes, non-constituents
- Lexical similarity feature: max over similarity scores
  - WordNet: synonymy, hyponymy, antonymy, Jiang-Conrath
  - Distributional similarity à la Dekang Lin
  - Various measures of string/lemma similarity
- Contextual features: distortion, matching neighbors
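A toy version of this linear scoring function, reusing the Edit class and word_sim from the sketches above; the feature names are illustrative, not the system's actual feature set:

```python
def edit_features(edit: Edit) -> dict:
    """Illustrative features only; the real model also draws on WordNet,
    distributional similarity, and contextual features."""
    feats = {"type=" + edit.op: 1.0}
    if edit.op in ("EQ", "SUB"):
        feats["lex_sim"] = word_sim(edit.p_phrase, edit.h_phrase)
        feats["h_size"] = float(len(edit.h_phrase.split()))
    return feats

def score_alignment(alignment: list, w: dict) -> float:
    """Linear model: sum over edits of w · features(edit)."""
    return sum(w.get(f, 0.0) * v
               for e in alignment
               for f, v in edit_features(e).items())
```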

16 Decoding using simulated annealing
1. Start with an initial alignment
2. Generate successors
3. Score
4. Smooth/sharpen: P(A) := P(A)^(1/T)
5. Sample
6. Lower the temperature: T := 0.9 · T
7. Repeat (100 times)
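A self-contained sketch of the annealing loop over token-level links (a simplification of the phrase-level search; the toy scorer reuses word_sim from the bag-of-words sketch, and the exp(score/T) sampling distribution stands in for the slide's P(A)^(1/T) smoothing):

```python
import math
import random

def successors(links, n_p, n_h):
    """Step 2: neighbors of the current alignment, each obtained by
    toggling a single (premise, hypothesis) token link."""
    out = []
    for i in range(n_p):
        for j in range(n_h):
            s = set(links)
            s ^= {(i, j)}          # add the link if absent, drop it if present
            out.append(frozenset(s))
    return out

def toy_score(links, p_words, h_words):
    """Step 3: reward similar linked words, penalize unlinked H words."""
    linked_h = {j for _, j in links}
    return (sum(word_sim(p_words[i], h_words[j]) for i, j in links)
            - 0.5 * (len(h_words) - len(linked_h)))

def anneal_decode(p_words, h_words, iters=100, t0=10.0, cool=0.9):
    links, T = frozenset(), t0                            # step 1: start
    for _ in range(iters):                                # step 7: repeat
        cands = successors(links, len(p_words), len(h_words))
        weights = [math.exp(toy_score(c, p_words, h_words) / T)
                   for c in cands]                        # step 4: sharpen by 1/T
        links = random.choices(cands, weights=weights)[0] # step 5: sample
        T *= cool                                         # step 6: cool down
    return links
```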

17 Perceptron learning of feature weights
We use a variant of averaged perceptron [Collins 2002]:
- Initialize weight vector w = 0, learning rate R₀ = 1
- For training epoch i = 1 to 50:
  - For each problem ⟨Pⱼ, Hⱼ⟩ with gold alignment Eⱼ:
    - Set Êⱼ = ALIGN(Pⱼ, Hⱼ, w)
    - Set w = w + Rᵢ · (Φ(Eⱼ) − Φ(Êⱼ))
  - Set w = w / ‖w‖₂ (L2 normalization)
  - Set w[i] = w (store the weight vector for this epoch)
  - Set Rᵢ = 0.8 · Rᵢ₋₁ (reduce the learning rate)
- Throw away the weight vectors from the first 20% of epochs
- Return the average of the remaining weight vectors
Training runs require about 20 hours (on 800 RTE problems).
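The same loop rendered as a Python sketch; align(P, H, w) and phi(E) (the feature vector of an alignment) are assumed interfaces, not real APIs:

```python
import math

def train_perceptron(problems, align, phi, epochs=50):
    """Averaged perceptron as on the slide: per-example updates, per-epoch
    L2 normalization and learning-rate decay, then averaging."""
    w, history, rate = {}, [], 1.0
    for _ in range(epochs):
        for P, H, gold in problems:
            guess = align(P, H, w)
            for f, v in phi(gold).items():           # w += rate * phi(gold)
                w[f] = w.get(f, 0.0) + rate * v
            for f, v in phi(guess).items():          # w -= rate * phi(guess)
                w[f] = w.get(f, 0.0) - rate * v
        norm = math.sqrt(sum(v * v for v in w.values())) or 1.0
        w = {f: v / norm for f, v in w.items()}      # L2-normalize
        history.append(dict(w))                      # store epoch weights
        rate *= 0.8                                  # decay the learning rate
    history = history[len(history) // 5:]            # drop first 20% of epochs
    keys = {f for wv in history for f in wv}
    return {f: sum(wv.get(f, 0.0) for wv in history) / len(history)
            for f in keys}
```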

18 The MSR RTE2 alignment data
Previously, little supervised data. Now, MSR gold alignments for RTE2 [Brockett 2007]:
- dev & test sets, 800 problems each
- Token-based, but many-to-many: allows implicit alignment of phrases
- 3 independent annotators, merged using majority rule
  - 3 of 3 agreed on 70% of proposed links
  - 2 of 3 agreed on 99.7% of proposed links

19 Evaluation on MSR data
We evaluate several alignment models on the MSR data:
- Baseline: a simple bag-of-words aligner, matching each token in H to the most string-similar token in P
- Two well-known MT aligners, GIZA++ & Cross-EM: supplemented with a lexicon; various symmetrization heuristics tried
- A representative NLI aligner, the Stanford RTE aligner: can't do phrase alignments, but can exploit syntactic features
- The MANLI aligner just presented
How well do they recover the gold-standard alignments? Assess per-link precision, recall, and F₁, and the exact match rate (sketch below).
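The per-link metrics are standard; a sketch, treating an alignment as a set of links:

```python
def link_prf(gold: set, guess: set):
    """Per-link precision, recall, F1, and whether the alignment matches
    the gold standard exactly (which feeds the exact-match rate E)."""
    tp = len(gold & guess)
    p = tp / len(guess) if guess else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1, gold == guess
```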

20 Aligner evaluation results

                    RTE2 dev                      RTE2 test
System           P %   R %   F1 %   E %        P %   R %   F1 %   E %
Bag-of-words     57.8  81.2  67.5    3.5       62.1  82.6  70.9    5.3
GIZA++           83.0  66.4  72.1    9.4       85.1  69.1  74.8   11.3
Cross-EM         67.6  80.1  72.1    1.3       70.3  81.0  74.1    0.8
Stanford RTE     81.1  75.8  78.4    0.5       82.7  75.8  79.1    0.3
MANLI            83.4  85.5  84.4   21.7       85.4  85.3  85.3   21.3

- Bag-of-words aligner: good recall, but poor precision
- MT aligners fail to learn word-word correspondences
- Stanford RTE aligner struggles with function words
- MANLI outperforms all others on every measure
  - F1: 10.5% higher than GIZA++, 6.2% higher than Stanford
  - Good balance of precision & recall; matched >20% exactly

21 MANLI results: discussion
Three factors contribute to success:
1. Lexical resources: jail ~ prison, prevent ~ stop, injured ~ wounded
2. Contextual features enable matching function words
3. Phrases: death penalty ~ capital punishment, abdicate ~ give up
But phrases help less than expected! If we set max phrase size = 1, we lose just 0.2% in F1.
Recall errors: room to improve
- 40% need better lexical resources: conservation ~ protecting, organization ~ agencies, bone fragility ~ osteoporosis
Precision errors: harder to reduce
- equal function words (49%), forms of be (21%), punctuation (7%)

22 Alignment for NLI: conclusions
MT aligners are not directly applicable to NLI:
- They rely on unsupervised learning from massive amounts of bitext
- They assume semantic equivalence of P & H
MANLI succeeds by:
- Exploiting (manually & automatically constructed) lexical resources
- Accommodating frequent unaligned phrases
- Using contextual features to align function words
The phrase-based representation shows potential, but is not yet proven: we need better phrase-based lexical resources.

23 Outline
- Introduction
- Alignment for NLI
- A theory of entailment relations
- A theory of compositional entailment
- The NatLog system
- Conclusion

24 Entailment relations in past work
- 2-way (RTE1, 2, 3): yes (entailment) vs. no (non-entailment)
- 3-way (FraCaS, PARC, RTE4): yes (entailment), no (contradiction), unknown (compatibility)
- Containment (Sánchez Valencia): P = Q (equivalence), P < Q (forward entailment), P > Q (reverse entailment), P # Q (non-entailment)
Example pairs: X is a couch / X is a sofa; X is a crow / X is a bird; X is a fish / X is a carp; X is a man / X is a woman; X is a hippo / X is hungry

25 16 elementary set relations
Assign a pair of sets ⟨x, y⟩ to one of 16 relations, depending on the emptiness or non-emptiness of each of the four partitions: x ∩ y, x ∩ y̅, x̅ ∩ y, x̅ ∩ y̅.
(figure: diagram of the four partitions, each marked empty or non-empty)

26 16 elementary set relations
But 9 of the 16 are degenerate: either x or y is either empty or universal. I.e., they correspond to semantically vacuous expressions, which are rare outside logic textbooks. We therefore focus on the remaining seven relations: x ≡ y, x ⊏ y, x ⊐ y, x ^ y, x | y, x ‿ y, x # y.
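The classification is easy to compute for finite sets; a sketch mapping the seven non-degenerate emptiness patterns to their names:

```python
def signature(x: set, y: set, universe: set):
    """Emptiness pattern of the four partitions of the universe."""
    xc, yc = universe - x, universe - y
    return (bool(x & y), bool(x & yc), bool(xc & y), bool(xc & yc))

NAMES = {
    (True, False, False, True): "equivalence (x ≡ y)",
    (True, False, True, True):  "forward entailment (x ⊏ y)",
    (True, True, False, True):  "reverse entailment (x ⊐ y)",
    (False, True, True, False): "negation (x ^ y)",
    (False, True, True, True):  "alternation (x | y)",
    (True, True, True, False):  "cover (x ‿ y)",
    (True, True, True, True):   "independence (x # y)",
}  # the 9 remaining patterns are the degenerate cases

U = set(range(10))
crow, bird = {1, 2}, {1, 2, 3, 4}
print(NAMES[signature(crow, bird, U)])  # forward entailment (x ⊏ y)
```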

27 The set of basic entailment relations

symbol   name                                     example
x ≡ y    equivalence                              couch ≡ sofa
x ⊏ y    forward entailment (strict)              crow ⊏ bird
x ⊐ y    reverse entailment (strict)              European ⊐ French
x ^ y    negation (exhaustive exclusion)          human ^ nonhuman
x | y    alternation (non-exhaustive exclusion)   cat | dog
x ‿ y    cover (exhaustive non-exclusion)         animal ‿ nonhuman
x # y    independence                             hungry # hippo

Relations are defined for all semantic types: tiny ⊏ small, hover ⊏ fly, kick ⊏ strike, this morning ⊏ today, in Beijing ⊏ in China, everyone ⊏ someone, all ⊏ most ⊏ some

28 Joining entailment relations
If x R y and y S z, what entailment relation holds between x and z? Call it the join, R ⋈ S.
Example: fish | human and human ^ nonhuman, so fish ⊏ nonhuman; i.e., | ⋈ ^ = ⊏.
Some joins are straightforward: ≡ ⋈ ≡ = ≡, ⊏ ⋈ ⊏ = ⊏, ⊐ ⋈ ⊐ = ⊐, ^ ⋈ ^ = ≡, R ⋈ ≡ = R, ≡ ⋈ R = R.

29 Some joins yield unions of relations!

x R y            y S z            x ? z
couch | table    table | sofa     couch ≡ sofa
pistol | knife   knife | gun      pistol ⊏ gun
dog | cat        cat | terrier    dog ⊐ terrier
rose | orchid    orchid | daisy   rose | daisy
woman | frog     frog | Eskimo    woman # Eskimo

What is | ⋈ |? A union: | ⋈ | = {≡, ⊏, ⊐, |, #}.

30 The complete join table
- Of the 49 join pairs, 32 yield basic relations; 17 yield unions
- Larger unions convey less information, which limits the power of inference
- In practice, any union which contains # can be approximated by #, so we can avoid the complexity of unions
The table can also be recovered empirically (sketch below).
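The join table can be computed by brute force over a small universe; a sketch (the universe size here is arbitrary, chosen just large enough to realize the cases for these examples):

```python
def classify(x, y, U):
    """One of the 7 basic relations, or None for degenerate pairs."""
    sig = (bool(x & y), bool(x - y), bool(y - x), bool(U - (x | y)))
    return {
        (True, False, False, True): "≡", (True, False, True, True): "⊏",
        (True, True, False, True): "⊐", (False, True, True, False): "^",
        (False, True, True, True): "|", (True, True, True, False): "‿",
        (True, True, True, True): "#",
    }.get(sig)

U = frozenset(range(5))
subsets = [frozenset(i for i in U if mask >> i & 1)
           for mask in range(1 << len(U))]

# For every triple of sets, record which relation between x and z can
# follow each pair of relations (x R y, y S z).
join = {}
for x in subsets:
    for y in subsets:
        r = classify(x, y, U)
        if r is None:
            continue
        for z in subsets:
            s, t = classify(y, z, U), classify(x, z, U)
            if s and t:
                join.setdefault((r, s), set()).add(t)

print(join[("|", "^")])   # {'⊏'}: the fish / human / nonhuman example
print(join[("|", "|")])   # the 5-way union {≡, ⊏, ⊐, |, #}
```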

31 Outline
- Introduction
- Alignment for NLI
- A theory of entailment relations
- A theory of compositional entailment
- The NatLog system
- Conclusion

32 Lexical entailment relations
Setting: an atomic edit e (DEL, INS, SUB) applied to a compound expression x yields e(x). What entailment relation β(x, e(x)) holds between them? It will depend on:
1. the lexical entailment relation generated by e: β(e)
2. other properties of the context x in which e is applied
Example: suppose x is "red car".
- If e is SUB(car, convertible), then β(e) is ⊐
- If e is DEL(red), then β(e) is ⊏
Crucially, β(e) depends solely on the lexical items in e, independent of the context x. But how are lexical entailment relations determined?

33 Lexical entailment relations: SUBs
β(SUB(x, y)) = β(x, y)
For open-class terms, use a lexical resource (e.g. WordNet):
- ≡ for synonyms: sofa ≡ couch, forbid ≡ prohibit
- ⊏ for hypo-/hypernyms: crow ⊏ bird, frigid ⊏ cold, soar ⊏ rise
- | for antonyms and coordinate terms: hot | cold, cat | dog
- ≡ or | for proper nouns: USA ≡ United States, JFK | FDR
- # for most other pairs: hungry # hippo
Closed-class terms may require special handling:
- Quantifiers: all ⊏ some, some ^ no, no | all, at least 4 ‿ at most 6
- See the dissertation for discussion of pronouns, prepositions, …

34 Lexical entailment relations: DEL & INS
Generic (default) case: β(DEL(·)) = ⊏, β(INS(·)) = ⊐
- Examples: red car ⊏ car, sing ⊐ sing off-key
- Holds even for quite long phrases: car parked outside since last week ⊏ car
- Applies to intersective modifiers, conjuncts, independent clauses, …
- This heuristic underlies most approaches to RTE! Does P subsume H? Deletions OK; insertions penalized.
Special cases:
- Negation: didn't sleep ^ did sleep
- Implicatives & factives (e.g. refuse to, admit that): discussed later
- Non-intersective adjectives: former spy | spy, alleged spy # spy
- Auxiliaries etc.: is sleeping ≡ sleeps, did sleep ≡ slept
A toy lookup along these lines is sketched below.
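A toy lookup implementing these defaults (reusing the Edit class from the alignment sketch; the SUB table is a tiny stand-in for WordNet, and none of the special cases are handled):

```python
SUB_TABLE = {("sofa", "couch"): "≡", ("crow", "bird"): "⊏",
             ("car", "convertible"): "⊐", ("hot", "cold"): "|"}

def lex_relation(edit: Edit) -> str:
    """Lexical entailment relation generated by an atomic edit, using
    the generic defaults from this slide."""
    if edit.op == "EQ":
        return "≡"
    if edit.op == "DEL":
        return "⊏"   # red car ⊏ car
    if edit.op == "INS":
        return "⊐"   # sing ⊐ sing off-key
    return SUB_TABLE.get((edit.p_phrase, edit.h_phrase), "#")
```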

35 The impact of semantic composition
How are entailment relations affected by semantic composition? Given f(x) and f(y), what is β(f(x), f(y)) in terms of β(x, y)?
The monotonicity calculus provides a partial answer:

If f has monotonicity…   then β(x, y) is projected as:
UP                       ≡ ↦ ≡, ⊏ ↦ ⊏, ⊐ ↦ ⊐, # ↦ #
DOWN                     ≡ ↦ ≡, ⊏ ↦ ⊐, ⊐ ↦ ⊏, # ↦ #
NON                      ≡ ↦ ≡, ⊏ ↦ #, ⊐ ↦ #, # ↦ #

But how are the other relations (|, ^, ‿) projected?

36 A typology of projectivity
Projectivity signatures: a generalization of monotonicity classes. Each projectivity signature is a map from relations to relations. In principle there are 7⁷ possible signatures, but few are actually realized.
Negation:
≡ ↦ ≡    not happy ≡ not glad
⊏ ↦ ⊐    didn't kiss ⊐ didn't touch
⊐ ↦ ⊏    not ill ⊏ not seasick
^ ↦ ^    not human ^ not nonhuman
| ↦ ‿    not French ‿ not German
‿ ↦ |    not more than 4 | not less than 6
# ↦ #    isn't swimming # isn't hungry

37 A typology of projectivity
Intersective modification:
≡ ↦ ≡
⊏ ↦ ⊏
⊐ ↦ ⊐
^ ↦ |    live human | live nonhuman
| ↦ |    French wine | Spanish wine
‿ ↦ #    metallic pipe # nonferrous pipe
# ↦ #
See the dissertation for the projectivity of connectives, quantifiers, and verbs.
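Written out as explicit maps, the two signatures from these slides look like this:

```python
# Projectivity signatures as maps over the seven relations.
NEGATION     = {"≡": "≡", "⊏": "⊐", "⊐": "⊏", "^": "^",
                "|": "‿", "‿": "|", "#": "#"}
INTERSECTIVE = {"≡": "≡", "⊏": "⊏", "⊐": "⊐", "^": "|",
                "|": "|", "‿": "#", "#": "#"}

print(NEGATION["⊏"])       # ⊐ : didn't kiss ⊐ didn't touch
print(INTERSECTIVE["^"])   # | : live human | live nonhuman
```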

38 Projecting through multiple levels
Propagate the entailment relation between atoms upward, according to the projectivity class of each node on the path to the root.
Example: a shirt ⊏ clothes; "without" (downward) projects this to ⊐; "enter" and "can" (upward) preserve ⊐; "nobody" (downward) flips it back to ⊏. Hence:
nobody can enter without a shirt ⊏ nobody can enter without clothes
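A sketch of the propagation: compose the projectivity maps along the path from the edited leaf to the root, innermost first (simplified here to containment-only maps for the upward and downward nodes):

```python
UP   = {"≡": "≡", "⊏": "⊏", "⊐": "⊐", "#": "#"}   # e.g. "enter", "can"
DOWN = {"≡": "≡", "⊏": "⊐", "⊐": "⊏", "#": "#"}   # e.g. "without", "nobody"

def project_up(rel: str, path: list) -> str:
    """Apply each node's projectivity map on the way to the root."""
    for sig in path:
        rel = sig[rel]
    return rel

# a shirt ⊏ clothes, projected through without / enter / can / nobody:
print(project_up("⊏", [DOWN, UP, UP, DOWN]))  # ⊏
```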

39 Implicatives & factives [Nairn et al. 06]
9 signatures, per the implications (+, −, or o) in positive and negative contexts:

signature   example
implicatives
+/−         he managed to escape
+/o         he was forced to sell
o/−         he was permitted to live
implicatives
−/+         he forgot to pay
−/o         he refused to fight
o/+         he hesitated to ask
factives
+/+         he admitted that he knew
−/−         he pretended he was sick
o/o         he wanted to fly

40 Implicatives & factives
We can specify the relation generated by DEL or INS for each signature:

signature   example                    relation           β(DEL)   β(INS)
+/−         he managed to escape       ≡ he escaped       ≡        ≡
+/o         he was forced to sell      ⊏ he sold          ⊏        ⊐
o/−         he was permitted to live   ⊐ he lived         ⊐        ⊏
−/+         he forgot to pay           ^ he paid          ^        ^
−/o         he refused to fight        | he fought        |        |
o/+         he hesitated to ask        ‿ he asked         ‿        ‿
o/o         he wanted to fly           # he flew          #        #

(the first six signatures are implicatives; o/o is nonfactive)
- Room for variation w.r.t. infinitives, complementizers, passivization, etc.
- Some are more intuitive when negated: he didn't hesitate to ask | he didn't ask
- Doesn't cover factives, which involve presuppositions; see the dissertation.
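The DEL column of this table as a lookup, with the INS column derived as its converse (a small consistency check on the slide's values):

```python
IMPLICATIVE_DEL = {
    ("+", "-"): "≡",   # he managed to escape ≡ he escaped
    ("+", "o"): "⊏",   # he was forced to sell ⊏ he sold
    ("o", "-"): "⊐",   # he was permitted to live ⊐ he lived
    ("-", "+"): "^",   # he forgot to pay ^ he paid
    ("-", "o"): "|",   # he refused to fight | he fought
    ("o", "+"): "‿",   # he hesitated to ask ‿ he asked
    ("o", "o"): "#",   # he wanted to fly # he flew
}
CONVERSE = {"≡": "≡", "⊏": "⊐", "⊐": "⊏", "^": "^",
            "|": "|", "‿": "‿", "#": "#"}
IMPLICATIVE_INS = {sig: CONVERSE[r] for sig, r in IMPLICATIVE_DEL.items()}
```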

41 Putting it all together
1. Find a sequence of edits ⟨e₁, …, eₙ⟩ which transforms p into h. Define x₀ = p, xₙ = h, and xᵢ = eᵢ(xᵢ₋₁) for i ∈ [1, n].
2. For each atomic edit eᵢ:
   a. Determine the lexical entailment relation β(eᵢ).
   b. Project β(eᵢ) upward through the semantic composition tree of expression xᵢ₋₁ to find the atomic entailment relation β(xᵢ₋₁, xᵢ).
3. Join the atomic entailment relations across the sequence of edits: β(p, h) = β(x₀, xₙ) = β(x₀, x₁) ⋈ … ⋈ β(xᵢ₋₁, xᵢ) ⋈ … ⋈ β(xₙ₋₁, xₙ)
Limitations: need to find an appropriate edit sequence connecting p and h; the ⋈ operation tends toward less-informative entailment relations; no general mechanism for combining multiple premises. Less deductive power than FOL: can't handle e.g. de Morgan's laws.
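An end-to-end sketch of this recipe, combining the earlier pieces (lex_relation, project_up). JOIN is assumed to be a precomputed join table in which any union containing # has been collapsed to #, as suggested on slide 30; context_path_for is a hypothetical hook returning the projectivity maps above each edit site:

```python
def infer(edits, context_path_for, JOIN):
    """Steps 2-3: per-edit lexical relation, projection through the
    context, then joining across the edit sequence."""
    rel = "≡"                                          # identity for join
    for e in edits:
        lex = lex_relation(e)                          # step 2a
        atomic = project_up(lex, context_path_for(e))  # step 2b
        rel = JOIN[(rel, atomic)]                      # step 3
    return rel        # ≡ or ⊏ means the inference goes through ("yes")
```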

42 An example
P: The doctor didn't hesitate to recommend Prozac.
H: The doctor recommended medication. → yes

i   eᵢ                        xᵢ                                                lex   atom   join
0                             The doctor didn't hesitate to recommend Prozac.
1   DEL(hesitate to)          The doctor didn't recommend Prozac.               ‿     |      |
2   DEL(didn't)               The doctor recommended Prozac.                    ^     ^      ⊏
3   SUB(Prozac, medication)   The doctor recommended medication.                ⊏     ⊏      ⊏

Final join: ⊏, so the answer is yes.

43 Different edit orders?
Intermediate steps may vary; the final result is typically (though not necessarily) the same. The same problem under all six edit orders (each edit shown as lex / atom / running join):
1. DEL(hesitate to) ‿/|/|;  DEL(didn't) ^/^/⊏;  SUB(Prozac, medication) ⊏/⊏/⊏
2. DEL(didn't) ^/^/^;  DEL(hesitate to) ‿/‿/⊏;  SUB(Prozac, medication) ⊏/⊏/⊏
3. SUB(Prozac, medication) ⊏/⊏/⊏;  DEL(hesitate to) ‿/|/|;  DEL(didn't) ^/^/⊏
4. DEL(hesitate to) ‿/|/|;  SUB(Prozac, medication) ⊏/⊐/|;  DEL(didn't) ^/^/⊏
5. DEL(didn't) ^/^/^;  SUB(Prozac, medication) ⊏/⊐/|;  DEL(hesitate to) ‿/‿/⊏
6. SUB(Prozac, medication) ⊏/⊏/⊏;  DEL(didn't) ^/^/|;  DEL(hesitate to) ‿/‿/⊏

44 Outline
- Introduction
- Alignment for NLI
- A theory of entailment relations
- A theory of compositional entailment
- The NatLog system
- Conclusion

45 The NatLog system
NLI problem → 1. linguistic analysis → 2. alignment → 3. lexical entailment classification → 4. entailment projection → 5. entailment joining → prediction
(stage callouts from the slide, as best recoverable: stage 1 is covered on the next slide; stage 2 comes from outside sources; stage 3 is the core of the system, covered shortly; stages 4 and 5 are straightforward and not covered further)

46 Stage 1: Linguistic analysis
- Tokenize & parse input sentences (future: & NER & coref & …)
- Identify items with special projectivity & determine their scope
- Problem: a PTB-style parse tree ≠ semantic structure!
(figure: parse of "Jimmy Dean refused to move without blue jeans", with monotonicity marks on the nodes)
- Solution: specify scope in PTB trees using Tregex [Levy & Andrew 06]
Example entry:
  category: −/o implicatives
  examples: refuse, forbid, prohibit, …
  scope: S complement
  pattern: __ > (/VB.*/ > VP $. S=arg)
  projectivity: {≡:≡, ⊏:⊐, ⊐:⊏, ^:|, |:#, ‿:#, #:#}

47 Stage 3: Lexical entailment classification
Goal: predict the entailment relation for each edit, based solely on lexical features, independent of context.
Approach: use lexical resources & machine learning.
Feature representation:
- WordNet features: synonymy (≡), hyponymy (⊏/⊐), antonymy (|)
- Other relatedness features: Jiang-Conrath (WN-based), NomBank
- Fallback: string similarity (based on Levenshtein edit distance)
- Also lexical category, quantifier category, implication signature
Decision tree classifier, trained on 2,449 hand-annotated lexical entailment problems. E.g., SUB(gun, weapon): ⊏; SUB(big, small): |; DEL(often): ⊏

48 The FraCaS test suite
FraCaS: a project in computational semantics [Cooper et al. 96]
- 346 "textbook" examples of NLI problems
- 3 possible answers: yes, no, unknown (not balanced!)
- 55% single-premise, 45% multi-premise (excluded here)
P: At most ten commissioners spend time at home.
H: At most ten commissioners spend a lot of time at home. → yes
P: Dumbo is a large animal.
H: Dumbo is a small animal. → no
P: Smith believed that ITEL had won the contract in 1992.
H: ITEL won the contract in 1992. → unknown

49 Results on FraCaS

System                    #     prec %   rec %   acc %
most common class         183   55.7     100.0   55.7
MacCartney & Manning 07   183   68.9     60.8    59.6
this work                 183   89.3     65.7    70.5

27% error reduction over MacCartney & Manning 07.

50 Results on FraCaS, by section

§   Category       #     prec %   rec %   acc %
1   Quantifiers    44    95.2     100.0   97.7
2   Plurals        24    90.0     64.3    75.0
3   Anaphora       6     100.0    60.0    50.0
4   Ellipsis       25    100.0    5.3     24.0
5   Adjectives     15    71.4     83.3    80.0
6   Comparatives   16    88.9     -       81.3
7   Temporal       36    85.7     70.6    58.3
8   Verbs          8     80.0     66.7    62.5
9   Attitudes      9     100.0    83.3    88.9
1, 2, 5, 6, 9      108   90.4     85.5    87.0

- In the largest category (Quantifiers), all but one correct
- High accuracy in the sections most amenable to natural logic (1, 2, 5, 6, 9)
- High precision even outside the areas of expertise

51 The RTE3 test suite
P: As leaders gather in Argentina ahead of this weekend's regional talks, Hugo Chávez, Venezuela's populist president, is using an energy windfall to win friends and promote his vision of 21st-century socialism.
H: Hugo Chávez acts as Venezuela's president. → yes
P: Democrat members of the Ways and Means Committee, where tax bills are written and advanced, do not have strong small business voting records.
H: Democrat members had strong small business voting records. → no
Somewhat more "natural", but not ideal for NatLog:
- Many kinds of inference not addressed by NatLog: paraphrase, temporal reasoning, relation extraction, …
- Big edit distance → propagation of errors from the atomic model

52 Results on RTE3: NatLog

System         Data   % Yes   Prec %   Rec %   Acc %
Stanford RTE   dev    50.2    68.7     67.0    67.2
               test   50.0    61.8     60.2    60.5
NatLog         dev    22.5    73.9     32.4    59.2
               test   26.4    70.1     36.1    59.4

(each data set contains 800 problems)
- Accuracy is unimpressive, but precision is relatively high
- Strategy: hybridize with the Stanford RTE system, as in Bos & Markert 2006; but NatLog makes a positive prediction far more often (~25% vs. 4%)

53 Results on RTE3: hybrid system

System         Data   % Yes   Prec %   Rec %   Acc %
Stanford RTE   dev    50.2    68.7     67.0    67.2
               test   50.0    61.8     60.2    60.5
NatLog         dev    22.5    73.9     32.4    59.2
               test   26.4    70.1     36.1    59.4
Hybrid         dev    56.0    69.2     75.2    70.0
               test   54.5    64.4     68.5    64.5

(each data set contains 800 problems)
The hybrid gains 4% in accuracy over Stanford RTE (significant, p < 0.05).

54 Outline
- Introduction
- Alignment for NLI
- A theory of entailment relations
- A theory of compositional entailment
- The NatLog system
- Conclusion

55 55 What natural logic can’t do Not a universal solution for NLI Many types of inference not amenable to natural logic Paraphrase: Eve was let go  Eve lost her job Verb/frame alternation: he drained the oil ⊏ the oil drained Relation extraction: Aho, a trader at UBS… ⊏ Aho works for UBS Common-sense reasoning: the sink overflowed ⊏ the floor got wet etc. Also, has a weaker proof theory than FOL Can’t explain, e.g., de Morgan’s laws for quantifiers: Not all birds fly  Some birds don’t fly Introduction Alignment for NLI Entailment relations Compositional entailment The NatLog system Conclusion

56 What natural logic can do
Enables precise reasoning about semantic containment…
- hypernymy & hyponymy in nouns, verbs, adjectives, adverbs
- containment between temporal & locative expressions
- quantifier containment
- adding & dropping of intersective modifiers, adjuncts
… and semantic exclusion…
- antonyms & coordinate terms: mutually exclusive nouns, adjectives
- mutually exclusive temporal & locative expressions
- negation, negative & restrictive quantifiers, verbs, adverbs, nouns
… and implicatives and nonfactives.
Sidesteps myriad difficulties of full semantic interpretation.

57 Contributions of this dissertation
Undertook the first systematic study of alignment for NLI:
- Examined the relation between alignment in NLI and MT
- Evaluated bag-of-words, MT, and NLI aligners for NLI alignment
- Proposed a new model of alignment for NLI: MANLI
Extended natural logic to incorporate semantic exclusion:
- Defined an expressive set of entailment relations (& a join algebra)
- Introduced projectivity signatures: a generalization of monotonicity
- Unified account of implicativity under the same framework
Implemented a robust system for natural logic inference:
- Demonstrated practical value on the FraCaS & RTE test suites

58 The future of NLI
No silver bullet for NLI: the problems are too diverse. A full solution will need to combine disparate reasoners:
- simple lexical similarity (e.g., bag-of-words)
- relation extraction
- natural logic & related forms of "semantic" reasoning
- temporal, spatial, & simple mathematical reasoning
- commonsense reasoning
Key question: how can they best be combined? Apply them in parallel, then combine predictions? How? Fine-grained "interleaving"? Collaborative proof search?

59 Thanks! Questions? :-)
My heartfelt appreciation to…
- My committee: Profs. Genesereth, Jurafsky, Manning, Peters, and van Benthem
- My collaborators: Marie-Catherine de Marneffe, Michel Galley, Teg Grenager, and many others
- My advisor: Prof. Chris Manning
- My girlfriend: Destiny Man Li Zhao

60 Backup slides follow

61 NLI alignment vs. MT alignment
Can MT aligners be used for NLI? Doubtful: NLI alignment differs in several respects:
1. Monolingual: can exploit resources like WordNet
2. Asymmetric: P is often longer & has content unrelated to H
3. Cannot assume semantic equivalence: an NLI aligner must accommodate frequent unaligned content
4. Little training data available: MT aligners use unsupervised training on huge amounts of bitext, while NLI aligners must rely on supervised training & much less data

62 Projectivity of connectives

     negation (not)   conjunction (and) / intersective modification
≡    ≡                ≡
⊏    ⊐                ⊏
⊐    ⊏                ⊐
^    ^                |
|    ‿                |
‿    |                #
#    #                #

63 Projectivity of connectives

     negation   conjunction   disjunction (or)
≡    ≡          ≡             ≡
⊏    ⊐          ⊏             ⊏
⊐    ⊏          ⊐             ⊐
^    ^          |             ‿
|    ‿          |             #
‿    |          #             ‿
#    #          #             #

waltzed or sang ⊏ danced or sang
human or equine ‿ nonhuman or equine
red or yellow # blue or yellow

64 Projectivity of connectives

     negation   conjunction   disjunction   conditional (if), antecedent
≡    ≡          ≡             ≡             ≡
⊏    ⊐          ⊏             ⊏             ⊐
⊐    ⊏          ⊐             ⊐             ⊏
^    ^          |             ‿             #
|    ‿          |             #             #
‿    |          #             ‿             #
#    #          #             #             #

If he drinks tequila, he feels nauseous ⊐ If he drinks liquor, he feels nauseous
If it's sunny, we surf # If it's not sunny, we surf
If it's sunny, we surf # If it's rainy, we surf

65 Projectivity of connectives

     negation   conjunction   disjunction   if (antecedent)   if (consequent)
≡    ≡          ≡             ≡             ≡                 ≡
⊏    ⊐          ⊏             ⊏             ⊐                 ⊏
⊐    ⊏          ⊐             ⊐             ⊏                 ⊐
^    ^          |             ‿             #                 |
|    ‿          |             #             #                 |
‿    |          #             ‿             #                 #
#    #          #             #             #                 #

If he drinks tequila, he feels nauseous ⊏ If he drinks tequila, he feels sick
If it's sunny, we surf | If it's sunny, we don't surf
If it's sunny, we surf | If it's sunny, we ski

66 Projectivity of connectives

     negation   conjunction   disjunction   if (antecedent)   if (consequent)   biconditional (if and only if)
≡    ≡          ≡             ≡             ≡                 ≡                 ≡
⊏    ⊐          ⊏             ⊏             ⊐                 ⊏                 #
⊐    ⊏          ⊐             ⊐             ⊏                 ⊐                 #
^    ^          |             ‿             #                 |                 ^
|    ‿          |             #             #                 |                 #
‿    |          #             ‿             #                 #                 #
#    #          #             #             #                 #                 #

67 Projectivity of quantifiers
(figure: table of projectivity signatures for quantifiers)

