Natural Language Inference Bill MacCartney NLP Group Stanford University 8 May 2009

2  Natural language inference (NLI)
A.k.a. recognizing textual entailment (RTE): does premise P justify an inference to hypothesis H?
An informal, intuitive notion of inference: not strict logic
Emphasis on variability of linguistic expression
Necessary to the goal of natural language understanding (NLU); many more immediate applications as well
  P: Several airlines polled saw costs grow more than expected, even after adjusting for inflation.
  H: Some of the companies in the poll reported cost increases.  [yes]

3  Applications of NLI
Semantic search [King et al. 07]: given the query "Georgia's gas bill doubled", rank candidate answers by whether they entail it:
  Q: How much did Georgia's gas price increase?
  A: In 2006, Gazprom doubled Georgia's gas bill.  ✓
  A: Georgia's main imports are natural gas, machinery, ...  ✗
  A: Tbilisi is the capital and largest city of Georgia.  ✗
  A: Natural gas is a gas consisting primarily of methane.  ✗
Question answering [Harabagiu & Hickl 06]
Summarization [Tatar et al. 08]
Machine translation evaluation [Pado et al. 09]: does the output paraphrase the target?
  input: Gazprom va doubler le prix du gaz pour la Géorgie.
  output: Gazprom will double the price of gas for Georgia.
  target: Gazprom will double Georgia's gas bill.
  (near-paraphrases from Economist.com: "...double Georgia's gas bill...", "...two-fold increase in gas price...", "...price of gas will be doubled...")

4  NLI problem sets
RTE (Recognizing Textual Entailment):
  4 years, each with dev & test sets of 800 NLI problems each
  Longish premises taken from (e.g.) newswire; short hypotheses
  Balanced 2-way classification: entailment vs. non-entailment
  P: As leaders gather in Argentina ahead of this weekend's regional talks, Hugo Chávez, Venezuela's populist president, is using an energy windfall to win friends and promote his vision of 21st-century socialism.
  H: Hugo Chávez acts as Venezuela's president.  [yes]
  P: Smith wrote a report in two hours.
  H: Smith spent more than two hours writing the report.  [no]
FraCaS test suite:
  346 NLI problems, constructed by semanticists in the mid-90s
  55% have a single premise; the remainder have 2 or more premises
  3-way classification: entailment, contradiction, compatibility

5  NLI: a spectrum of approaches
From robust but shallow, to deep but brittle:
  lexical/semantic overlap [Jijkoun & de Rijke 2005]
  patterned relation extraction [Romano et al. 2006]
  semantic graph matching [MacCartney et al. 2006; Hickl et al. 2006]
  FOL & theorem proving [Bos & Markert 2006]
Problem with shallow approaches: imprecise, easily confounded by negation, quantifiers, conditionals, factive & implicative verbs, etc.
Problem with the formal approach: hard to translate NL to FOL — idioms, anaphora, ellipsis, intensionality, tense, aspect, vagueness, modals, indexicals, reciprocals, propositional attitudes, scope ambiguities, anaphoric adjectives, non-intersective adjectives, temporal & causal relations, unselective quantifiers, adverbs of quantification, donkey sentences, generic determiners, comparatives, phrasal verbs, ...
Solution? Natural logic (this work) occupies the middle ground.

6  Shallow approaches to NLI
Example: the bag-of-words approach [Glickman et al. 2005]
Measures approximate lexical similarity of H to (part of) P:
  P: Several airlines polled saw costs grow more than expected, even after adjusting for inflation.
  H: Some of the companies in the poll reported cost increases.
Robust, and surprisingly effective for many NLI problems
But imprecise, and hence easily confounded:
  Ignores predicate-argument structure — though this can be remedied
  Struggles with antonymy, negation, verb-frame alternation
  Crucially, depends on an assumption of upward monotonicity
Non-upward-monotone constructions are rife! [Danescu et al. 2009]
  not, all, most, few, rarely, if, tallest, without, doubt, avoid, regardless, unable, ...
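To make the bag-of-words idea concrete, here is a minimal sketch (in the spirit of Glickman et al. 2005, not their actual code): each hypothesis token is matched to its most similar premise token, and the per-token scores are multiplied. The `word_sim` helper is a hypothetical stand-in for the lexical similarity measures a real system would use.

```python
from difflib import SequenceMatcher

def word_sim(h_word: str, p_word: str) -> float:
    """Crude stand-in for lexical similarity; a real system would use
    WordNet relations or distributional similarity instead."""
    return SequenceMatcher(None, h_word.lower(), p_word.lower()).ratio()

def bow_entailment_score(premise: str, hypothesis: str) -> float:
    p_words = premise.split()
    score = 1.0
    for h_word in hypothesis.split():
        # Match each hypothesis token to its best premise token.
        score *= max(word_sim(h_word, p) for p in p_words)
    return score

P = "Several airlines polled saw costs grow more than expected"
H = "Some of the companies in the poll reported cost increases"
# Predict entailment when the score clears a tuned threshold.
print(bow_entailment_score(P, H))
```

Note how nothing in this score is sensitive to negation or quantification — exactly the imprecision the slide describes.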

7  The formal approach to NLI
Translate to a formal representation & apply an automated reasoner
Relies on full semantic interpretation of P & H:
  P: Several airlines polled saw costs grow more than expected, even after adjusting for inflation.
  (exists p (and (poll-event p) (several x (and (airline x) (obj p x) (exists c (and (cost c) (has x c) (exists g (and (grow-event g) (subj g c) (greater-than (magnitude g) ...
Can succeed in restricted domains, but not in open-domain NLI!
Need background axioms to complete proofs — but from where?
Besides, the NLI task is based on an informal definition of inferability
Bos & Markert 06 found a FOL proof for just 4% of RTE problems

8  Solution? Natural logic! (≠ natural deduction)
Characterizes valid patterns of inference via surface forms: precise, yet sidesteps the difficulties of translating to FOL
A long history:
  traditional logic: Aristotle's syllogisms, scholastics, Leibniz, ...
  modern natural logic begins with Lakoff (1970)
  van Benthem & Sánchez Valencia (1986–91): monotonicity calculus
  Nairn et al. (2006): an account of implicatives & factives
We introduce a new theory of natural logic...
  extends the monotonicity calculus to account for negation & exclusion
  incorporates elements of Nairn et al.'s model of implicatives
...and implement & evaluate a computational model of it

9  Outline
Introduction
Alignment for NLI
A theory of entailment relations
A theory of compositional entailment
The NatLog system
Conclusions
[Not covered today: the bag-of-words model, the Stanford RTE system]

10  Alignment for NLI
Linking corresponding words & phrases in two sentences
The alignment problem is familiar in machine translation (MT)
Most approaches to NLI depend on a facility for alignment:
  P: Gazprom today confirmed a two-fold increase in its gas price for Georgia, beginning next Monday.
  H: Gazprom will double Georgia's gas bill.  [yes]

11  Alignment example
[Figure: word-alignment grid between P (premise) and H (hypothesis), illustrating unaligned content ("deletions" from P), the approximate match price ~ bill, and the phrase alignment two-fold increase ~ double.]

12  Approaches to NLI alignment
Alignment is addressed variously by current NLI systems
In some approaches to NLI, alignments are implicit:
  NLI via lexical overlap [Glickman et al. 05; Jijkoun & de Rijke 05]
  NLI as proof search [Tatu & Moldovan 07; Bar-Haim et al. 07]
Other NLI systems make the alignment step explicit:
  Align first, then determine inferential validity [Marsi & Kramer 05; MacCartney et al. 06]
What about using an MT aligner? Alignment is familiar in MT, with an extensive literature [Brown et al. 93; Vogel et al. 96; Och & Ney 03; Marcu & Wong 02; DeNero et al. 06; Birch et al. 06; DeNero & Klein 08]
Can the tools & techniques of MT alignment transfer to NLI? The dissertation argues: not very well

13  The MANLI aligner
A model of alignment for NLI consisting of four components:
1. Phrase-based representation
2. Feature-based scoring function
3. Decoding using simulated annealing
4. Perceptron learning

14  Phrase-based alignment representation
Represent alignments by a sequence of phrase edits: EQ, SUB, DEL, INS
  EQ(Gazprom₁, Gazprom₁)
  INS(will₂)
  DEL(today₂)
  DEL(confirmed₃)
  DEL(a₄)
  SUB(two-fold₅ increase₆, double₃)
  DEL(in₇)
  DEL(its₈)
  ...
One-to-one at the phrase level (but many-to-many at the token level)
Avoids arbitrary alignment choices; can use phrase-based resources

15  A feature-based scoring function
Score each edit as a linear combination of features, then sum over the edits of the alignment: score(A) = Σₑ w · Φ(e)
Edit type features: EQ, SUB, DEL, INS
Phrase features: phrase sizes, non-constituents
Lexical similarity feature: max over similarity scores
  WordNet: synonymy, hyponymy, antonymy, Jiang-Conrath
  Distributional similarity à la Dekang Lin
  Various measures of string/lemma similarity
Contextual features: distortion, matching neighbors
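The sketch below illustrates both the phrase-edit representation from the previous slide and the linear scoring function; the feature names and weights are invented for illustration and are far simpler than MANLI's actual feature set.

```python
from dataclasses import dataclass

@dataclass
class Edit:
    kind: str             # one of 'EQ', 'SUB', 'DEL', 'INS'
    p_phrase: tuple = ()  # premise-side tokens (empty for INS)
    h_phrase: tuple = ()  # hypothesis-side tokens (empty for DEL)

def features(e: Edit) -> dict:
    f = {'type=' + e.kind: 1.0,
         'phrase_size': float(max(len(e.p_phrase), len(e.h_phrase)))}
    if e.kind in ('EQ', 'SUB'):
        # Stand-in for the max over WordNet / distributional / string scores.
        f['lex_sim'] = 1.0 if e.p_phrase == e.h_phrase else 0.5
    return f

def score_alignment(edits, w: dict) -> float:
    # Alignment score = sum over edits of w . phi(edit).
    return sum(w.get(k, 0.0) * v for e in edits for k, v in features(e).items())

edits = [Edit('EQ', ('Gazprom',), ('Gazprom',)),
         Edit('INS', h_phrase=('will',)),
         Edit('SUB', ('two-fold', 'increase'), ('double',))]
w = {'type=EQ': 1.0, 'type=INS': -0.5, 'type=SUB': 0.2, 'lex_sim': 2.0}
print(score_alignment(edits, w))
```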

16  Decoding using simulated annealing
1. Start from an initial alignment
2. Generate successors
3. Score
4. Smooth/sharpen: P(A) ← P(A)^(1/T)
5. Sample
6. Lower the temperature: T ← 0.9 · T
7. Repeat from step 2, 100 times
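A sketch of this annealed decoding loop, assuming hypothetical helpers `successors` (alignments one edit away) and `score` (the linear scoring function above); this is an illustration of the procedure on the slide, not MANLI's actual code.

```python
import math, random

def decode(alignment, successors, score, steps=100, temp=1.0, cool=0.9):
    for _ in range(steps):
        cands = successors(alignment)                 # 2. generate successors
        probs = [math.exp(score(c)) for c in cands]   # 3. score (unnormalized P(A))
        probs = [p ** (1.0 / temp) for p in probs]    # 4. smooth/sharpen: P(A)^(1/T)
        total = sum(probs)
        alignment = random.choices(                   # 5. sample from the
            cands, [p / total for p in probs])[0]     #    smoothed distribution
        temp *= cool                                  # 6. lower temperature T
    return alignment                                  # 7. after 100 iterations
```

As T falls, the sampling distribution sharpens toward the highest-scoring successor, so the search behaves first like exploration and later like greedy hill-climbing.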

17  Perceptron learning of feature weights
We use a variant of the averaged perceptron [Collins 2002]:
  Initialize weight vector w = 0, learning rate R₀ = 1
  For training epoch i = 1 to 50:
    For each problem ⟨Pⱼ, Hⱼ⟩ with gold alignment Eⱼ:
      Set Êⱼ = ALIGN(Pⱼ, Hⱼ, w)
      Set w = w + Rᵢ · (Φ(Eⱼ) − Φ(Êⱼ))
    Set w = w / ‖w‖₂  (L2 normalization)
    Set w⁽ⁱ⁾ = w  (store the weight vector for this epoch)
    Set Rᵢ = 0.8 · Rᵢ₋₁  (reduce the learning rate)
  Throw away the weight vectors from the first 20% of epochs
  Return the average weight vector
Training runs require about 20 hours (on 800 RTE problems)
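The same training loop as runnable Python, a sketch in which `align` (the decoder) and `phi` (the feature-vector extractor) are assumed to exist, and weight vectors are plain dicts keyed by feature name rather than MANLI's actual data structures.

```python
import math

def train(problems, align, phi, epochs=50, r0=1.0, decay=0.8, burn_in=0.2):
    w, history, rate = {}, [], r0
    for _ in range(epochs):
        for p, h, gold in problems:
            guess = align(p, h, w)                   # E_hat = ALIGN(P, H, w)
            for f, v in phi(gold).items():           # w += R_i * (phi(E) - phi(E_hat))
                w[f] = w.get(f, 0.0) + rate * v
            for f, v in phi(guess).items():
                w[f] = w.get(f, 0.0) - rate * v
        norm = math.sqrt(sum(v * v for v in w.values())) or 1.0
        w = {f: v / norm for f, v in w.items()}      # L2 normalization
        history.append(dict(w))                      # store this epoch's weights
        rate *= decay                                # reduce the learning rate
    kept = history[int(burn_in * epochs):]           # drop the first 20% of epochs
    feats = {f for wv in kept for f in wv}
    return {f: sum(wv.get(f, 0.0) for wv in kept) / len(kept) for f in feats}
```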

18  The MSR RTE2 alignment data
Previously, little supervised data was available
Now: MSR gold alignments for RTE2 [Brockett 2007]
  dev & test sets, 800 problems each
  Token-based, but many-to-many: allows implicit alignment of phrases
  3 independent annotators:
    3 of 3 agreed on 70% of proposed links
    2 of 3 agreed on 99.7% of proposed links
    merged using majority rule

19  Evaluation on MSR data
We evaluate several alignment models on the MSR data:
  Baseline: a simple bag-of-words aligner — matches each token in H to the most string-similar token in P
  Two well-known MT aligners: GIZA++ & Cross-EM — supplemented with a lexicon; tried various symmetrization heuristics
  A representative NLI aligner: the Stanford RTE aligner — can't do phrase alignments, but can exploit syntactic features
  The MANLI aligner just presented
How well do they recover the gold-standard alignments? Assess per-link precision, recall, and F₁, plus the exact match rate.
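For concreteness, here is how the per-link metrics can be computed if an alignment is represented as a set of token-index link pairs — a sketch, not the evaluation script actually used.

```python
def link_scores(gold: set, guess: set):
    """Per-link precision, recall, and F1 over sets of token-index pairs."""
    tp = len(gold & guess)
    p = tp / len(guess) if guess else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

def exact_match_rate(problems):
    """Fraction of problems whose predicted links equal the gold links."""
    return sum(gold == guess for gold, guess in problems) / len(problems)

gold = {(0, 0), (5, 2), (6, 2)}    # (premise index, hypothesis index) links
guess = {(0, 0), (5, 2)}
print(link_scores(gold, guess))    # (1.0, 0.667, 0.8)
```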

20  Aligner evaluation results
[Table: per-link precision (P), recall (R), F₁, and exact-match rate (E) on RTE2 dev and test for the bag-of-words, GIZA++, Cross-EM, Stanford RTE, and MANLI aligners.]
Bag-of-words aligner: good recall, but poor precision
MT aligners fail to learn word-word correspondences
Stanford RTE aligner struggles with function words
MANLI outperforms all others on every measure:
  F₁: 10.5% higher than GIZA++, 6.2% higher than Stanford
  Good balance of precision & recall; matched >20% exactly

21  MANLI results: discussion
Three factors contribute to its success:
1. Lexical resources: jail ~ prison, prevent ~ stop, injured ~ wounded
2. Contextual features enable matching function words
3. Phrases: death penalty ~ capital punishment, abdicate ~ give up
But phrases help less than expected! If we set max phrase size = 1, we lose just 0.2% in F₁.
Recall errors: room to improve — 40% need better lexical resources: conservation ~ protecting, organization ~ agencies, bone fragility ~ osteoporosis
Precision errors are harder to reduce: equal function words (49%), forms of be (21%), punctuation (7%)

22  Alignment for NLI: conclusions
MT aligners are not directly applicable to NLI:
  They rely on unsupervised learning from massive amounts of bitext
  They assume semantic equivalence of P & H
MANLI succeeds by:
  Exploiting (manually & automatically constructed) lexical resources
  Accommodating frequent unaligned phrases
  Using contextual features to align function words
The phrase-based representation shows potential, but it is not yet proven: we need better phrase-based lexical resources.

23  Outline
Introduction
Alignment for NLI
A theory of entailment relations
A theory of compositional entailment
The NatLog system
Conclusion

24  Entailment relations in past work
2-way (RTE1, 2, 3): yes (entailment) vs. no (non-entailment)
3-way (FraCaS, PARC, RTE4): yes (entailment), no (contradiction), unknown (compatibility)
Containment (Sánchez Valencia): P = Q (equivalence), P < Q (forward entailment), P > Q (reverse entailment), P # Q (non-entailment)
[Figure: example pairs such as "X is a couch" / "X is a sofa", "X is a crow" / "X is a bird", "X is a fish" / "X is a carp", "X is a man" / "X is a woman", "X is hungry" / "X is a hippo", illustrating the distinctions each scheme can and cannot draw.]

25  16 elementary set relations
Assign a pair of sets ⟨x, y⟩ to one of 16 relations, depending on the emptiness or non-emptiness of each of the four partitions of the universe: x∩y, x∩y̅, x̅∩y, and x̅∩y̅.
[Figure: Venn-diagram schema showing the four partitions, each marked empty or non-empty.]

26  16 elementary set relations
The 16 relations include x ≡ y, x ⊏ y, x ⊐ y, x ^ y, x | y, x ⌣ y, and x # y.
But 9 of the 16 are degenerate: either x or y is either empty or universal. That is, they correspond to semantically vacuous expressions, which are rare outside logic textbooks. We therefore focus on the remaining seven relations.

27  The set of basic entailment relations
  symbol   name                                     example
  x ≡ y    equivalence                              couch ≡ sofa
  x ⊏ y    forward entailment (strict)              crow ⊏ bird
  x ⊐ y    reverse entailment (strict)              European ⊐ French
  x ^ y    negation (exhaustive exclusion)          human ^ nonhuman
  x | y    alternation (non-exhaustive exclusion)   cat | dog
  x ⌣ y    cover (exhaustive non-exclusion)         animal ⌣ nonhuman
  x # y    independence                             hungry # hippo
Relations are defined for all semantic types: tiny ⊏ small, hover ⊏ fly, kick ⊏ strike, this morning ⊏ today, in Beijing ⊏ in China, everyone ⊏ someone, all ⊏ most ⊏ some

28  Joining entailment relations
If x R y and y S z, what relation (R ⋈ S) holds between x and z?
Example: fish | human and human ^ nonhuman give fish ⊏ nonhuman, i.e. | ⋈ ^ = ⊏
Some joins are straightforward: ⊏ ⋈ ⊏ = ⊏, ⊐ ⋈ ⊐ = ⊐, ^ ⋈ ^ = ≡, R ⋈ ≡ = R, ≡ ⋈ R = R

29  Some joins yield unions of relations!
What is | ⋈ | ?
  x | y           y | z           x ? z
  couch | table   table | sofa    couch ≡ sofa
  pistol | knife  knife | gun     pistol ⊏ gun
  dog | cat       cat | terrier   dog ⊐ terrier
  rose | orchid   orchid | daisy  rose | daisy
  woman | frog    frog | Eskimo   woman # Eskimo
So | ⋈ | = {≡, ⊏, ⊐, |, #}

30  The complete join table
Of the 49 join pairs, 32 yield relations in the basic set; 17 yield unions
Larger unions convey less information — this limits the power of inference
In practice, any union which contains # can be approximated by # — so, in practice, we can avoid the complexity of unions
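In code, the join operation reduces to a lookup table plus the #-approximation for unions. The sketch below covers only the joins used in this talk's examples, not the full 49-entry table; `'='` stands for ≡.

```python
# Unions are represented as frozensets; per the slide, any union
# containing '#' is approximated by '#'.
JOIN = {
    ('⊏', '⊏'): '⊏', ('⊐', '⊐'): '⊐', ('^', '^'): '=',
    ('|', '^'): '⊏', ('^', '⊐'): '|', ('⊏', '^'): '|', ('|', '⌣'): '⊏',
    ('|', '|'): frozenset({'=', '⊏', '⊐', '|', '#'}),  # a union
}

def join(r: str, s: str):
    if r == '=':                       # ≡ is the identity element of join
        return s
    if s == '=':
        return r
    out = JOIN.get((r, s), '#')        # unlisted pairs degrade to independence
    if isinstance(out, frozenset) and '#' in out:
        return '#'                     # approximate unions containing '#'
    return out

print(join('|', '^'))   # fish | human, human ^ nonhuman  =>  fish ⊏ nonhuman
```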

31  Outline
Introduction
Alignment for NLI
A theory of entailment relations
A theory of compositional entailment
The NatLog system
Conclusion

32  Lexical entailment relations
An atomic edit e (DEL, INS, or SUB) maps a compound expression x to e(x). The entailment relation between x and e(x) will depend on:
1. the lexical entailment relation generated by e: β(e)
2. other properties of the context x in which e is applied
Example: suppose x is red car
  If e is SUB(car, convertible), then β(e) is ⊐
  If e is DEL(red), then β(e) is ⊏
Crucially, β(e) depends solely on the lexical items in e, independent of the context x
But how are lexical entailment relations determined?

33  Lexical entailment relations: SUBs
β(SUB(x, y)) = β(x, y)
For open-class terms, use a lexical resource (e.g. WordNet):
  ≡ for synonyms: sofa ≡ couch, forbid ≡ prohibit
  ⊏ for hypo-/hypernyms: crow ⊏ bird, frigid ⊏ cold, soar ⊏ rise
  | for antonyms and coordinate terms: hot | cold, cat | dog
  ≡ or | for proper nouns: USA ≡ United States, JFK | FDR
  # for most other pairs: hungry # hippo
Closed-class terms may require special handling:
  Quantifiers: all ⊏ some, some ^ no, no | all, at least 4 ⌣ at most 6
  See the dissertation for discussion of pronouns, prepositions, ...

34  Lexical entailment relations: DEL & INS
Generic (default) case: β(DEL(·)) = ⊏, β(INS(·)) = ⊐
  Examples: red car ⊏ car, sing ⊐ sing off-key
  Even quite long phrases: car parked outside since last week ⊏ car
  Applies to intersective modifiers, conjuncts, independent clauses, ...
  This heuristic underlies most approaches to RTE! Does P subsume H? Deletions OK; insertions penalized.
Special cases:
  Negation: didn't sleep ^ did sleep
  Implicatives & factives (e.g. refuse to, admit that): discussed later
  Non-intersective adjectives: former spy | spy, alleged spy # spy
  Auxiliaries etc.: is sleeping ≡ sleeps, did sleep ≡ slept

35  The impact of semantic composition
How are entailment relations affected by semantic composition? Given the relation between x and y, what is the relation between f(x) and f(y)?
The monotonicity calculus provides a partial answer. If f has monotonicity:
  UP:    ⊏ ↦ ⊏, ⊐ ↦ ⊐, # ↦ #
  DOWN:  ⊏ ↦ ⊐, ⊐ ↦ ⊏, # ↦ #
  NON:   ⊏ ↦ #, ⊐ ↦ #, # ↦ #
But how are the other relations (|, ^, ⌣) projected?
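As a lookup table, the monotonicity calculus' partial answer looks like this (a sketch; the missing entries for |, ^, and ⌣ are exactly the gap the projectivity signatures on the next slides fill):

```python
PROJECT = {
    'UP':   {'=': '=', '⊏': '⊏', '⊐': '⊐', '#': '#'},
    'DOWN': {'=': '=', '⊏': '⊐', '⊐': '⊏', '#': '#'},
    'NON':  {'=': '=', '⊏': '#', '⊐': '#', '#': '#'},
}

def project(monotonicity: str, relation: str) -> str:
    """How a context of the given monotonicity class projects a relation."""
    return PROJECT[monotonicity][relation]

print(project('DOWN', '⊏'))   # e.g. shirt ⊏ clothes under a downward context => ⊐
```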

36  A typology of projectivity
Projectivity signatures: a generalization of monotonicity classes
Each projectivity signature is a map from relations to relations. In principle, there are 7⁷ possible signatures, but few are actually realized.
Negation (not): ≡ ↦ ≡, ⊏ ↦ ⊐, ⊐ ↦ ⊏, ^ ↦ ^, | ↦ ⌣, ⌣ ↦ |, # ↦ #
  not happy ≡ not glad
  not ill ⊏ not seasick
  didn't kiss ⊐ didn't touch
  not human ^ not nonhuman
  not more than 4 | not less than 6
  not French ⌣ not German
  isn't swimming # isn't hungry

37  A typology of projectivity
Intersective modification: ≡ ↦ ≡, ⊏ ↦ ⊏, ⊐ ↦ ⊐, ^ ↦ |, | ↦ |, ⌣ ↦ #, # ↦ #
  live human | live nonhuman  (from human ^ nonhuman)
  French wine | Spanish wine  (from French | Spanish)
  metallic pipe # nonferrous pipe  (from metallic ⌣ nonferrous)
See the dissertation for the projectivity of connectives, quantifiers, and verbs

38  Projecting through multiple levels
Propagate the entailment relation between atoms upward, according to the projectivity class of each node on the path to the root:
  nobody can enter without a shirt ⊏ nobody can enter without clothes
  shirt ⊏ clothes; the relation flips to ⊐ under downward-monotone without, then flips back to ⊏ under nobody.
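A sketch of this bottom-up propagation, folding the projectivity signature of each node on the path from the edit site to the root; treating both without and nobody as negation-like downward contexts is an illustrative simplification.

```python
import functools

NEGATION = {'=': '=', '⊏': '⊐', '⊐': '⊏', '^': '^', '|': '⌣', '⌣': '|', '#': '#'}
INTERSECTIVE = {'=': '=', '⊏': '⊏', '⊐': '⊐', '^': '|', '|': '|', '⌣': '#', '#': '#'}

def propagate(relation: str, path_to_root: list) -> str:
    # Fold each node's projectivity signature over the relation, bottom-up.
    return functools.reduce(lambda r, sig: sig[r], path_to_root, relation)

# shirt ⊏ clothes; 'without' reverses it, and 'nobody' reverses it back:
print(propagate('⊏', [NEGATION, NEGATION]))   # => ⊏
```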

39  Implicatives & factives [Nairn et al. 06]
9 signatures, according to the implications (+, –, or o) in positive and negative contexts:
  signature   example
implicatives:
  + / –       he managed to escape
  + / o       he was forced to sell
  o / –       he was permitted to live
  – / +       he forgot to pay
  – / o       he refused to fight
  o / +       he hesitated to ask
factives:
  + / +       he admitted that he knew
  – / –       he pretended he was sick
  o / o       he wanted to fly

40  Implicatives & factives
We can specify the relation generated by DEL or INS for each signature:
  signature   example                                β(DEL)   β(INS)
  + / –       he managed to escape ≡ he escaped      ≡        ≡
  + / o       he was forced to sell ⊏ he sold        ⊏        ⊐
  o / –       he was permitted to live ⊐ he lived    ⊐        ⊏
  – / +       he forgot to pay ^ he paid             ^        ^
  – / o       he refused to fight | he fought        |        |
  o / +       he hesitated to ask ⌣ he asked         ⌣        ⌣
  o / o       he wanted to fly # he flew (nonfactive) #       #
There is room for variation w.r.t. infinitives, complementizers, passivization, etc.
Some are more intuitive when negated: he didn't hesitate to ask | he didn't ask
This doesn't cover factives, which involve presuppositions — see the dissertation
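The table above translates directly into data; a sketch (the signature strings and function names are illustrative):

```python
# Signature notation: '<implication in positive context>/<in negative context>'.
IMPLICATIVES = {              # signature: (relation for DEL, relation for INS)
    '+/-': ('=', '='),        # he managed to escape     =  he escaped
    '+/o': ('⊏', '⊐'),        # he was forced to sell    ⊏  he sold
    'o/-': ('⊐', '⊏'),        # he was permitted to live ⊐  he lived
    '-/+': ('^', '^'),        # he forgot to pay         ^  he paid
    '-/o': ('|', '|'),        # he refused to fight      |  he fought
    'o/+': ('⌣', '⌣'),        # he hesitated to ask      ⌣  he asked
    'o/o': ('#', '#'),        # he wanted to fly         #  he flew (nonfactive)
}

def lexical_relation(edit_kind: str, signature: str) -> str:
    rel_del, rel_ins = IMPLICATIVES[signature]
    return rel_del if edit_kind == 'DEL' else rel_ins

print(lexical_relation('DEL', '-/o'))   # deleting 'refused to' generates '|'
```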

41  Putting it all together
1. Find a sequence of edits ⟨e₁, ..., eₙ⟩ which transforms p into h. Define x₀ = p, xₙ = h, and xᵢ = eᵢ(xᵢ₋₁) for i ∈ [1, n].
2. For each atomic edit eᵢ:
   a. Determine the lexical entailment relation β(eᵢ).
   b. Project β(eᵢ) upward through the semantic composition tree of expression xᵢ₋₁ to find the atomic entailment relation β(xᵢ₋₁, xᵢ).
3. Join the atomic entailment relations across the sequence of edits:
   β(p, h) = β(x₀, xₙ) = β(x₀, x₁) ⋈ ... ⋈ β(xᵢ₋₁, xᵢ) ⋈ ... ⋈ β(xₙ₋₁, xₙ)
Limitations: we need to find an appropriate edit sequence connecting p and h; the ⋈ operation tends toward less-informative entailment relations; there is no general mechanism for combining multiple premises.
Less deductive power than FOL: can't handle e.g. de Morgan's laws.
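Put together, the whole procedure is a fold over the edit sequence. The sketch below assumes the components sketched earlier; `lex`, `project`, and `join` are hypothetical stand-ins for the lexical classifier, the projection step, and the join operation.

```python
def entailment_relation(premise, hypothesis, edits, lex, project, join):
    """Fold lexical relations, projection, and joining over an edit sequence."""
    x = premise                          # x_0 = p
    overall = '='                        # '=' (i.e. ≡) is the identity for join
    for e in edits:
        lexical = lex(e)                 # step 2a: lexical entailment relation
        atomic = project(x, lexical)     # step 2b: project through x's tree
        overall = join(overall, atomic)  # step 3: join across the sequence
        x = e(x)                         # apply the edit: x_i = e_i(x_{i-1})
    assert x == hypothesis               # x_n = h
    return overall                       # '⊏'/'=' => yes; '^'/'|' => no; else unknown
```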

42  An example
P: The doctor didn't hesitate to recommend Prozac.
H: The doctor recommended medication.  [yes]
  i   eᵢ                        xᵢ                                                lex   atom   join
  —   —                         The doctor didn't hesitate to recommend Prozac.
  1   DEL(hesitate to)          The doctor didn't recommend Prozac.               ⌣     |      |
  2   DEL(didn't)               The doctor recommended Prozac.                    ^     ^      ⊏
  3   SUB(Prozac, medication)   The doctor recommended medication.                ⊏     ⊏      ⊏
Final join: ⊏, so the answer is yes.

43  Different edit orders?
Intermediate steps may vary; the final result is typically (though not necessarily) the same. The six orderings of the same three edits (lex / atom / join at each step):
  1: DEL(hesitate to) ⌣/|/|;  DEL(didn't) ^/^/⊏;  SUB(Prozac, medication) ⊏/⊏/⊏
  2: DEL(didn't) ^/^/^;  DEL(hesitate to) ⌣/⌣/⊏;  SUB(Prozac, medication) ⊏/⊏/⊏
  3: SUB(Prozac, medication) ⊏/⊏/⊏;  DEL(hesitate to) ⌣/|/|;  DEL(didn't) ^/^/⊏
  4: DEL(hesitate to) ⌣/|/|;  SUB(Prozac, medication) ⊏/⊐/|;  DEL(didn't) ^/^/⊏
  5: DEL(didn't) ^/^/^;  SUB(Prozac, medication) ⊏/⊐/|;  DEL(hesitate to) ⌣/⌣/⊏
  6: SUB(Prozac, medication) ⊏/⊏/⊏;  DEL(didn't) ^/^/|;  DEL(hesitate to) ⌣/⌣/⊏
All six orderings end in ⊏.

44  Outline
Introduction
Alignment for NLI
A theory of entailment relations
A theory of compositional entailment
The NatLog system
Conclusion

45  The NatLog system
Pipeline: NLI problem → 1. linguistic analysis → 2. alignment → 3. lexical entailment classification → 4. entailment projection → 5. entailment joining → prediction
  1. linguistic analysis: covered on the next slide
  2. alignment: taken from outside sources
  3. lexical entailment classification: the core of the system, covered shortly
  4.–5. entailment projection & joining: straightforward, not covered further

46  Stage 1: Linguistic analysis
Tokenize & parse input sentences (future: & NER & coref & ...)
Identify items with special projectivity & determine their scope
Problem: a PTB-style parse tree ≠ semantic structure!
  Example: Jimmy Dean refused to move without blue jeans
Solution: specify scope in PTB trees using Tregex [Levy & Andrew 06]
  category: –/o implicatives
  examples: refuse, forbid, prohibit, ...
  scope: S complement
  pattern: __ > (/VB.*/ > VP $. S=arg)
  projectivity: {≡:≡, ⊏:⊐, ⊐:⊏, ^:|, |:#, ⌣:#, #:#}

47  Stage 3: Lexical entailment classification
Goal: predict the entailment relation for each edit, based solely on lexical features, independent of context
Approach: use lexical resources & machine learning
Feature representation:
  WordNet features: synonymy (≡), hyponymy (⊏/⊐), antonymy (|)
  Other relatedness features: Jiang-Conrath (WordNet-based), NomBank
  Fallback: string similarity (based on Levenshtein edit distance)
  Also lexical category, quantifier category, implication signature
Decision tree classifier:
  Trained on 2,449 hand-annotated lexical entailment problems
  E.g., SUB(gun, weapon): ⊏, SUB(big, small): |, DEL(often): ⊏
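A sketch of this kind of feature extraction, using NLTK's WordNet interface (the real system's feature set is richer; `lexical_features` here is illustrative, and the WordNet corpus must be downloaded first):

```python
from difflib import SequenceMatcher
from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')

def lexical_features(x: str, y: str) -> dict:
    """Features for classifying the relation generated by SUB(x, y)."""
    sx, sy = wn.synsets(x), wn.synsets(y)
    hypernyms_of_x = {h for s in sx for h in s.closure(lambda t: t.hypernyms())}
    return {
        'wn_synonym': any(s in sy for s in sx),              # evidence for ≡
        'wn_hyponym': any(s in hypernyms_of_x for s in sy),  # evidence for ⊏
        'wn_antonym': any(a in s2.lemmas()                   # evidence for |
                          for s in sx for l in s.lemmas()
                          for a in l.antonyms() for s2 in sy),
        'string_sim': SequenceMatcher(None, x, y).ratio(),   # fallback feature
    }

print(lexical_features('gun', 'weapon'))   # wn_hyponym should fire => ⊏
```

Feature vectors like these feed the decision tree classifier; the tree, not the features alone, decides the output relation.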

48  The FraCaS test suite
FraCaS: a project in computational semantics [Cooper et al. 96]
346 "textbook" examples of NLI problems
3 possible answers: yes, no, unknown (not balanced!)
55% single-premise, 45% multi-premise (the latter excluded here)
  P: At most ten commissioners spend time at home.
  H: At most ten commissioners spend a lot of time at home.  [yes]
  P: Dumbo is a large animal.
  H: Dumbo is a small animal.  [no]
  P: Smith believed that ITEL had won the contract in 1992.
  H: ITEL won the contract in 1992.  [unknown]

49  Results on FraCaS
[Table: number of problems (#), precision, recall, and accuracy for the most-common-class baseline, MacCartney & Manning 2007, and this work.]
27% error reduction over the previous system.

50  Results on FraCaS, by section
[Table: number of problems, precision, recall, and accuracy per FraCaS section — 1 Quantifiers, 2 Plurals, 3 Anaphora, 4 Ellipsis, 5 Adjectives, 6 Comparatives, 7 Temporal, 8 Verbs, 9 Attitudes — plus the subset 1, 2, 5, 6, 9.]
High accuracy in the sections most amenable to natural logic (1, 2, 5, 6, 9)
27% error reduction; in the largest category (Quantifiers), all but one problem correct
High precision even outside the areas of expertise

51  The RTE3 test suite
  P: As leaders gather in Argentina ahead of this weekend's regional talks, Hugo Chávez, Venezuela's populist president, is using an energy windfall to win friends and promote his vision of 21st-century socialism.
  H: Hugo Chávez acts as Venezuela's president.  [yes]
  P: Democrat members of the Ways and Means Committee, where tax bills are written and advanced, do not have strong small business voting records.
  H: Democrat members had strong small business voting records.  [no]
Somewhat more "natural", but not ideal for NatLog:
  Many kinds of inference are not addressed by NatLog: paraphrase, temporal reasoning, relation extraction, ...
  Big edit distance ⇒ propagation of errors from the atomic model

52  Results on RTE3: NatLog
[Table: % yes predictions, precision, recall, and accuracy on RTE3 dev & test (800 problems each) for the Stanford RTE system and NatLog.]
Accuracy is unimpressive, but precision is relatively high
Strategy: hybridize with the Stanford RTE system, as in Bos & Markert 2006 — but NatLog makes a positive prediction far more often (~25% vs. 4%)

53  Results on RTE3: hybrid system
[Table: % yes predictions, precision, recall, and accuracy on RTE3 dev & test (800 problems each) for Stanford RTE, NatLog, and the hybrid system.]
The hybrid gains 4% in accuracy over the Stanford system (significant, p < 0.05).

54  Outline
Introduction
Alignment for NLI
A theory of entailment relations
A theory of compositional entailment
The NatLog system
Conclusion

55 What natural logic can’t do Not a universal solution for NLI Many types of inference not amenable to natural logic Paraphrase: Eve was let go  Eve lost her job Verb/frame alternation: he drained the oil ⊏ the oil drained Relation extraction: Aho, a trader at UBS… ⊏ Aho works for UBS Common-sense reasoning: the sink overflowed ⊏ the floor got wet etc. Also, has a weaker proof theory than FOL Can’t explain, e.g., de Morgan’s laws for quantifiers: Not all birds fly  Some birds don’t fly Introduction Alignment for NLI Entailment relations Compositional entailment The NatLog system Conclusion

56  What natural logic can do
Enables precise reasoning about semantic containment:
  hypernymy & hyponymy in nouns, verbs, adjectives, adverbs
  containment between temporal & locative expressions
  quantifier containment
  adding & dropping of intersective modifiers, adjuncts
... and semantic exclusion:
  antonyms & coordinate terms: mutually exclusive nouns, adjectives
  mutually exclusive temporal & locative expressions
  negation; negative & restrictive quantifiers, verbs, adverbs, nouns
... and implicatives and nonfactives
Sidesteps the myriad difficulties of full semantic interpretation

57  Contributions of this dissertation
Undertook the first systematic study of alignment for NLI:
  Examined the relation between alignment in NLI and MT
  Evaluated bag-of-words, MT, and NLI aligners for NLI alignment
  Proposed a new model of alignment for NLI: MANLI
Extended natural logic to incorporate semantic exclusion:
  Defined an expressive set of entailment relations (& their join algebra)
  Introduced projectivity signatures: a generalization of monotonicity
  Gave a unified account of implicativity in the same framework
Implemented a robust system for natural logic inference:
  Demonstrated practical value on the FraCaS & RTE test suites

58  The future of NLI
No silver bullet for NLI — the problems are too diverse
A full solution will need to combine disparate reasoners:
  simple lexical similarity (e.g., bag-of-words)
  relation extraction
  natural logic & related forms of "semantic" reasoning
  temporal, spatial, & simple mathematical reasoning
  commonsense reasoning
Key question: how can they best be combined?
  Apply them in parallel, then combine predictions? How?
  Fine-grained "interleaving"? Collaborative proof search?

59  Thanks! Questions? :-)
My heartfelt appreciation to...
  My committee: Profs. Genesereth, Jurafsky, Manning, Peters, and van Benthem
  My collaborators: Marie-Catherine de Marneffe, Michel Galley, Teg Grenager, and many others
  My advisor: Prof. Chris Manning
  My girlfriend: Destiny Man Li Zhao

60  Backup slides follow

61  NLI alignment vs. MT alignment
Can MT aligners be applied directly to NLI? Doubtful — NLI alignment differs in several respects:
1. Monolingual: can exploit resources like WordNet
2. Asymmetric: P is often longer & has content unrelated to H
3. Cannot assume semantic equivalence: an NLI aligner must accommodate frequent unaligned content
4. Little training data available: MT aligners use unsupervised training on huge amounts of bitext; NLI aligners must rely on supervised training & much less data

62  Projectivity of connectives
Negation (not): ≡ ↦ ≡, ⊏ ↦ ⊐, ⊐ ↦ ⊏, ^ ↦ ^, | ↦ ⌣, ⌣ ↦ |, # ↦ #
Conjunction (and) / intersective modification: ≡ ↦ ≡, ⊏ ↦ ⊏, ⊐ ↦ ⊐, ^ ↦ |, | ↦ |, ⌣ ↦ #, # ↦ #

63  Projectivity of connectives
Disjunction (or): ≡ ↦ ≡, ⊏ ↦ ⊏, ⊐ ↦ ⊐, ^ ↦ ⌣, | ↦ #, ⌣ ↦ ⌣, # ↦ #
  waltzed or sang ⊏ danced or sang
  human or equine ⌣ nonhuman or equine
  red or yellow # blue or yellow

64  Projectivity of connectives
Conditional (if), antecedent: ≡ ↦ ≡, ⊏ ↦ ⊐, ⊐ ↦ ⊏, ^ ↦ #, | ↦ #, ⌣ ↦ #, # ↦ #
  If he drinks tequila, he feels nauseous ⊐ If he drinks liquor, he feels nauseous
  If it's sunny, we surf # If it's not sunny, we surf
  If it's sunny, we surf # If it's rainy, we surf

65  Projectivity of connectives
Conditional (if), consequent: ≡ ↦ ≡, ⊏ ↦ ⊏, ⊐ ↦ ⊐, ^ ↦ |, | ↦ |, ⌣ ↦ #, # ↦ #
  If he drinks tequila, he feels nauseous ⊏ If he drinks tequila, he feels sick
  If it's sunny, we surf | If it's sunny, we don't surf
  If it's sunny, we surf | If it's sunny, we ski

66  Projectivity of connectives
Biconditional (if and only if): ≡ ↦ ≡, ⊏ ↦ #, ⊐ ↦ #, ^ ↦ ^, | ↦ #, ⌣ ↦ #, # ↦ #
The complete table (input relation ↦ output, by connective):
        not   and   or   if (ant.)   if (cons.)   iff
  ≡     ≡     ≡     ≡    ≡           ≡            ≡
  ⊏     ⊐     ⊏     ⊏    ⊐           ⊏            #
  ⊐     ⊏     ⊐     ⊐    ⊏           ⊐            #
  ^     ^     |     ⌣    #           |            ^
  |     ⌣     |     #    #           |            #
  ⌣     |     #     ⌣    #           #            #
  #     #     #     #    #           #            #

67  Projectivity of quantifiers
[Table of quantifier projectivity signatures.]