
1 Knowledge Representation and Inference Models for Textual Entailment. Dan Roth, University of Illinois, Urbana-Champaign, with Rodrigo Braz, Roxana Girju, Vasin Punyakanok, Mark Sammons

2 Page 2 Fundamental Task. By “textually entailed” we mean: most people would agree that one sentence implies the other (more later). S: WalMart defended itself in court today against claims that its female employees were kept out of jobs in management because they are women. T: WalMart was sued for sexual discrimination. S entails T; T is subsumed by S.

3 Page 3 Why Textual Entailment? A fundamental task that can be used as a building block in multiple NLP and information extraction applications. There is always a risk in solving a separate “fundamental” task rather than the task one really wants to solve; some of the examples here are very direct, though. Has multiple direct applications.

4 Page 4 Question Answering. Given Q: Who acquired Overture? Determine whether a candidate answer A supports it (and distinguish it from other candidates). A: Eyeing the huge market potential, currently led by Google, Yahoo took over search company Overture Services Inc last year. The candidate sentence entails the hypothesis “Yahoo acquired Overture”; the hypothesis is subsumed by it.

5 Page 5 Story Comprehension. A process that maintains and updates a collection of propositions about the state of affairs. Viewed this way, a fundamental task to consider is that of textual entailment: given a snippet of text S, does it entail a proposition T? (ENGLAND, June, 1989) - Christopher Robin is alive and well. He lives in England. He is the same person that you read about in the book Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book. He made up a fairy tale land where Chris lived. His friends were animals. There was a bear called Winnie the Pooh. There was also an owl and a young pig, called a piglet. All the animals were stuffed toys that Chris owned. Mr. Robin made them come to life with his words. The places in the story were all near Cotchfield Farm. Winnie the Pooh was written in … Children still love to read about Christopher Robin and his animal friends. Most people don't know he is a real person. He has written books of his own that tell what it is like to be famous. [REMEDIA] 1. Christopher Robin was born in England. 2. Winnie the Pooh is a title of a book. 3. Christopher Robin’s dad was a magician. 4. Christopher Robin must be at least 65 now.

6 Page 6 More Examples. A key problem in natural language understanding is to abstract over the inherent syntactic and semantic variability in natural language. Multiple tasks attempt to do just that. Relation Extraction: "Dole's wife, Elizabeth, is a native of Salisbury, N.C." → "Elizabeth Dole was born in Salisbury, N.C." (You may disagree with the truth of this statement; you may also infer that the presidential candidate's wife was born in N.C.) Information Integration (databases): different database schemas represent the same information under different titles. Information Retrieval: multiple issues, from variability in the query and target text, to relations. Summarization: multiple techniques can be applied; all are entailment problems.

7 Page 7 Direct Application: Semantic Verification. Given: a long contract that you need to ACCEPT. Determine: does it satisfy the 3 conditions that you really care about? (And distinguish from other candidates.) ACCEPT?

8 Page 8 Why Study Textual Entailment? A fundamental task for language comprehension. Builds on a lot of research (and tools) done in the last few years in Learning and Inference in Natural Language. Opens up a large collection of questions, both from the natural language perspective and from the machine learning, knowledge representation and inference perspectives.

9 Page 9 This Talk: A brief perspective & technical motivation; An Approach to Textual Entailment; The CCG Inference model for textual entailment; Inference as optimization; Some examples; Knowledge modules; Conclusions.

10 Page 10 Two Extremes in Representation and Inference. Statistics: use relatively simple statistical techniques for BOW and/or paraphrases. Multiple problems may not be addressed just from the data, e.g., entailment vs. correlation [Geffet & Dagan ’04, ’05]. An important component, but: how to put together/chain/weigh paraphrases? Inference model: inference in NL requires mapping sentences to logical forms and using general-purpose theorem proving. Extensions include various relaxations in the way the representation is generated and in the type of information incorporated in a KB to support the theorem prover, as well as non-logical, probabilistic paradigms. Key problems include the realization that underspecificity of language is a feature, rather than a bug: a representation, but not a canonical representation.

11 Page 11 New (Better?) View on Problems. Access to information requires tolerating “loose speak” [Porter et al. ’04]. This refers to the imprecise way queries/questions are formed with respect to the representation of the information source. Metonymy: referring to an entity or event by one of its attributes. Causal factor: referring to a result by one of its causes. Aggregate: referring to an aggregate by one of its members. Generic: referring to a specific concept by the generic class to which it belongs [The potato was cultivated first in SA]. Noun compounds: referring to a relation between nouns by using just the noun phrase consisting of the two nouns [wooden table]. Many other kinds of ambiguities, some language related and some knowledge related.

12 Page 12 New (Better?) View on Problems. Example: Colin Powell addressed the General Assembly yesterday → Colin Powell gave a speech at the UN / The Secretary of State gave a speech at the UN. Does this require resolving the sense ambiguity in “addressed”? Or is a weaker, “existential”, yes/no answer with respect to “gave a speech” sufficient? [Ido Dagan; Senseval ’04] How about Colin Powell? In many disambiguation problems, the view taken when studying entailment is that keeping the underspecificity of language is possible, and perhaps the right thing to do.

13 Page 13 Task-based Refinement

14 Page 14 Lesson: Reflection from the Past. Learning in order to Reason [’94-’97]: a unified framework for studying Learning, Knowledge Representation and Reasoning. A series of theoretical results on the advantages of a unified framework for L, KR & R in situations where: the goal is to Reason (deduction; abduction, i.e., best explanation); the starting point for Reasoning is not a static Knowledge Base but rather a representation of knowledge learned via interaction with the world; and the quality of the learned representation is determined by the reasoning stage. The intermediate representation is important, but only to the extent that it is learnable and that it facilitates reasoning. There may not be a need (or even a possibility) to learn an exact intermediate representation, but only to the extent that it supports Reasoning. [Khardon & Roth JACM’97, AAAI’94; Roth’95, Roth’96; Khardon & Roth’99; Learning to Plan: Khardon’99]

15 Page 15 This Talk: A brief perspective & technical motivation; An Approach to Textual Entailment; The CCG Inference model for textual entailment; Inference as optimization; Some examples; Knowledge modules; Conclusions.

16 Page 16 Defining Textual Entailment. Mapping text to a canonical representation is often not the right approach (or: not possible). This is not a computational issue; rather, the representation might depend on the task, in our case on the hypothesis sentence. This suggests a definition for textual entailment: Let s, t be text snippets with representations r_s, r_t ∈ R. We say that s textually entails t if there is a representation r ∈ R of s for which we can prove that r ⊆ r_t.
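As a minimal illustration of this definition, here is a self-contained Python sketch that uses a toy representation language: a representation is a set of atomic propositions read conjunctively, so u is subsumed by v exactly when u contains all of v's atoms. The helper names, the atoms and the one-rule KB are invented for this example and are not part of the system described in the talk.

```python
# Toy representations: sets of atomic propositions (read as conjunctions).
def subsumes(u, v):
    # u is "subsumed by" v model-theoretically (M(u) ⊆ M(v)) iff u asserts every atom of v.
    return v <= u

def re_representations(r_s, kb):
    # Faithful augmentations of r_s: apply every rule (l, r) whose left side is
    # already entailed by r_s, producing r_s ∧ r (see the faithfulness claim later).
    yield r_s
    for l, r in kb:
        if subsumes(r_s, l):
            yield r_s | r

def textually_entails(r_s, r_t, kb):
    # s entails t if SOME faithful re-representation of s is subsumed by r_t.
    return any(subsumes(r, r_t) for r in re_representations(r_s, kb))

# "Yahoo took over Overture" entails "Yahoo acquired Overture" given one rewrite rule.
r_s = {("take_over", "Yahoo", "Overture")}
r_t = {("acquire", "Yahoo", "Overture")}
kb = [({("take_over", "Yahoo", "Overture")}, {("acquire", "Yahoo", "Overture")})]
print(textually_entails(r_s, r_t, kb))   # True
```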

17 Page 17 Defining Semantic Entailment. R is a knowledge representation language, with a well defined syntax and semantics over a domain D. For text snippets s, t: r_s, r_t are their representations in R, and M(r_s), M(r_t) their model theoretic representations. There is a well defined notion of subsumption in R, defined model theoretically: for u, v ∈ R, u is subsumed by v when M(u) ⊆ M(v). This is not an algorithm; we need a proof theory.

18 Page 18 Defining Semantic Entailment (2). The proof theory is weak; it will show r_s ⊆ r_t only when they are relatively “similar”. r ∈ R is faithful to s if M(r_s) = M(r). Definition: Let s, t be text snippets with representations r_s, r_t ∈ R. We say that s textually entails t if there is a representation r ∈ R that is faithful to s, for which we can prove that r ⊆ r_t. Given r_s, one needs to generate many equivalent representations r′_s and test r′_s ⊆ r_t. This cannot be done exhaustively; how should alternative representations be generated?

19 Page 19 The Role of Knowledge: Refining Representations. A rewrite rule (l, r) is a pair of expressions in R such that l ⊆ r. Given a representation r_s of s and a rule (l, r) for which r_s ⊆ l, the augmentation of r_s via (l, r) is r′_s = r_s ∧ r. Claim: r′_s is faithful to s. Proof: in general, since r′_s = r_s ∧ r, we have M(r′_s) = M(r_s) ∩ M(r). However, since r_s ⊆ l ⊆ r, we have M(r_s) ⊆ M(r). Consequently, M(r′_s) = M(r_s), and the augmented representation is faithful to s.

20 Page 20 Comments. The claim suggests an algorithm for generating alternative (equivalent) representations, and for textual entailment. The resulting algorithm is sound, but not complete; completeness depends on the quality of the KB of rules. The power of this re-representation algorithm lies in the rule KB and in an inference procedure that incorporates the rules. Choosing appropriate refinements depends on the target sentence and is an optimization procedure.

21 Page 21 Cartoon: General Strategy. Given a sentence S (answer), a sentence T (question), and a KB of semantic, structural and pragmatic transformations (rules): induce an abstract representation of S (a concept graph); re-represent S; find the optimal set of transformations that maps one sentence to the target sentence.

22 Page 22 The One-Slide Approach Summary. Inducing an abstract representation of text: multiple learning steps, centered around a semantic parse (predicate-argument representation) of a sentence, augmented by additional information; the final representation is a hierarchical concept graph (DL inspired). Refining the representation using an existing KB: rewrite rules at multiple levels; application depends on the target. [Features] Modeling entailment as constrained optimization: entailment is a mapping between sentence representations; find an optimal mapping [minimal cost proof; abduction] that respects the hierarchy, the transformations (rules) applied to nodes/edges/sub-graphs, and the confidence in the induced information, all modeled as (soft) constraints. This provides robustness against the inherent variability in natural language, inevitable noise in learning processes and missing information.

23 Page 23 Components. Learning, Representing and Reasoning take place at several levels of the process. A unified knowledge representation of the text, which provides a hierarchical encoding of the structural, relational and semantic properties of the given text, is integrated with learning mechanisms that can be used to induce such information from newly observed raw text, and is equipped with an inferential mechanism that can be used to support inferences with respect to such representations. An Inference Model for Semantic Entailment [AAAI’05]; Experiments with a Semantic Entailment System [IJCAI’05-WS].

24 Page 24 An Example. s: Lung cancer put an end to the life of Jazz singer Marion Montgomery on Monday. t: Singer dies of carcinoma. s is re-represented in several ways; one of these is shown to be subsumed by t. s′_1: Lung cancer killed Jazz singer Marion Montgomery on Monday. s′_2: Jazz singer Marion Montgomery died of lung cancer on Monday.

25 Page 25 Representation. Hierarchical; multiple types of information; all hanging on the sentence itself. Formally, represented using Description Logic expressions; rewrite rules have the same representation.

26 Page 26 Representation (2). The representation is formal, but not to be confused with a logical/canonical representation: the attempt is to represent the text, and to augment/refine the representation as part of the inference process. The skeleton of the representation is a predicate-argument representation learned based on PropBank (the semantic role labelling task). Resources used to augment the representation: segmentation; tokenization; lemmatizer; POS tagger; shallow parser; syntactic parser (Collins; Charniak); named entity tagger; entity identification (co-reference). Resources used to rewrite/refine and for subsumption: WordNet; DIRT paraphrase rules (Lin); word clusters (Lin); ad hoc modules (later). In-house machine learning based tools: [http://L2R.cs.uiuc.edu/~cogcomp]

27 Page 27 Predicate-Argument Representation. For each predicate in a sentence [currently: verbs], represent all constituents that fill a semantic role: core arguments, e.g., Agent, Patient or Instrument, and their adjuncts, e.g., Locative, Temporal or Manner.
I left my pearls to my daughter-in-law in my will. [A0: leaver; A1: thing left; A2: benefactor; AM-LOC]
The pearls which I left to my daughter-in-law are fake. [A0: leaver; A1: thing left; A2: benefactor; R-A1]
The pearls, I said, were left to my daughter-in-law. [A0: sayer; A1: utterance; C-A1: utterance]
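One way to picture such a frame in code, purely as an illustration (the slide does not specify a data structure, and the class and field names below are hypothetical):

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    predicate: str
    args: dict = field(default_factory=dict)      # core arguments A0..A5
    adjuncts: dict = field(default_factory=dict)  # AM-* adjuncts

# "I left my pearls to my daughter-in-law in my will."
frame = Frame(
    predicate="leave",
    args={"A0": "I", "A1": "my pearls", "A2": "my daughter-in-law"},
    adjuncts={"AM-LOC": "in my will"},
)
print(frame)
```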

28 Page 28 Semantic Role Labelling. [Screen shot from a CCG demo] This problem is itself modelled as a constrained optimization problem over the output of a large number of classifiers, with multiple constraints. Solution: formulate it as an integer linear program and solve. Top system in the CoNLL shared task; presentation later today.

29 Page 29 Rewrite Rules (KB). Goal: acquire transformations that preserve meaning. Basic linguistic processing levels: keyword matching; grammatical; semantic; (discourse, pragmatic, …). The mechanism supports chaining. Rules may contain variables; the augmentation mechanism supports inheritance. Some examples later. Rules are also used to avoid semantic parsing problems: managed to enter → entered; failed to enter → did not enter.
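A toy illustration of chaining such rewrites on surface strings (the real KB operates over the structured representation, not raw text; the two rules are just the examples from the slide):

```python
# String-level stand-in for KB rewrite rules; chaining = apply until a fixed point.
RULES = {
    "managed to enter": "entered",
    "failed to enter": "did not enter",
}

def apply_rules(text, rules, max_steps=5):
    for _ in range(max_steps):
        rewritten = text
        for lhs, rhs in rules.items():
            rewritten = rewritten.replace(lhs, rhs)
        if rewritten == text:      # fixed point reached: no rule fired
            return rewritten
        text = rewritten
    return text

print(apply_rules("The suspect managed to enter the building.", RULES))
# The suspect entered the building.
```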

30 Page 30 The Inference Problem. 1. Optimizing over the transformations applied to the initial representation. 2. Optimizing over the transformations applied to determine final subsumption. Even after the refinement of the representation, requiring exact subsumption (embedding of the target graph in the source graph) is unrealistic: words can be replaced by synonyms, modifiers can be dropped, etc. We develop a notion of functional subsumption: say “yes” when nodes & edges unify modulo some allowed transformations. [Why do we separate into two stages?]

31 Page 31 Modeling Inference as Optimization. 1. Incrementally augment the original representation and generate faithful re-representations of it. 2. Compute whether the target representation subsumes the augmented concept graph via an extended subsumption algorithm. Uncertainty is encoded by optimizing a linear cost function; the cost can be learned in a straightforward way via an EM-like algorithm. The inference model seeks the optimal re-representation S′* such that S′* = argmin_{S′} C(S, S′) + D(S′, T), over the space of all possible re-representations of S given the KB (subject to multiple constraints on order and structure). C returns the cost of augmenting S to S′ and D returns the cost of performing extended subsumption from S′ to T.
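The following self-contained sketch shows the shape of that optimization as a best-first search, with toy stand-ins for the augmentation step and the subsumption distance; the cost functions, the KB format and the atom names are all invented for illustration.

```python
import heapq, itertools

def subsumption_cost(S, T):
    # Toy stand-in for D(S', T): number of atoms of T not yet covered by S'.
    return len(T - S)

def expand(S, kb):
    # Toy stand-in for KB-driven augmentation: each applicable rule (l, r, cost)
    # yields one faithful re-representation S ∪ r at the given cost.
    for l, r, cost in kb:
        if l <= S:
            yield S | r, cost

def best_rerepresentation(S, T, kb, max_expansions=100):
    # Best-first search for S'* = argmin_{S'} C(S, S') + D(S', T).
    tie = itertools.count()                    # tie-breaker so the heap never compares sets
    frontier = [(subsumption_cost(S, T), next(tie), 0.0, S)]
    best_total, best = frontier[0][0], S
    seen, expansions = set(), 0
    while frontier and expansions < max_expansions:
        total, _, c, cand = heapq.heappop(frontier)
        key = frozenset(cand)
        if key in seen:
            continue
        seen.add(key)
        expansions += 1
        if total < best_total:
            best_total, best = total, cand
        for nxt, step in expand(cand, kb):
            c2 = c + step                      # accumulated augmentation cost C(S, S')
            heapq.heappush(frontier, (c2 + subsumption_cost(nxt, T), next(tie), c2, nxt))
    return best, best_total

S = {"lung_cancer_killed_singer"}
T = {"singer_died_of_carcinoma"}
kb = [({"lung_cancer_killed_singer"}, {"singer_died_of_lung_cancer"}, 0.2),
      ({"singer_died_of_lung_cancer"}, {"singer_died_of_carcinoma"}, 0.2)]
print(best_rerepresentation(S, T, kb))   # chained rewrites reach D = 0 at total cost 0.4
```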

32 Page 32 Inference: Key Points. Hierarchical subsumption behaves as a decision list: if it succeeds at a level, go on to the next; otherwise, fail. The levels are the predicate-argument level, the phrase level and the word level. Match both attributes and edges (relational information); the match may not be perfect. Inference (unification) as optimization: the optimal unification U* is the one minimizing ∑_{H_i} ∑_{(X,Y) ∈ U, X ∈ H_i} λ_i Γ(X, Y), where X and Y are, respectively, substructures of S and T, and λ_i is a fixed constant that ensures the hierarchical behavior is that of a decision list (λ_i makes sure that changes in H_0 dominate changes in H_1). An Integer Linear Programming formulation is used for unification.
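A compact sketch of the decision-list behavior, with toy per-level matchers over nested dictionaries; the level names, cost scheme and data format are illustrative, and the real system unifies graph substructures and weights levels so that higher levels dominate.

```python
def match_level(s_node, t_node, key, cost_per_extra=1.0):
    # Succeed only if every item T asserts at this level is present in S;
    # charge a small cost for each extra detail S carries that T lacks.
    s_vals, t_vals = set(s_node.get(key, [])), set(t_node.get(key, []))
    if not t_vals <= s_vals:
        return False, 0.0
    return True, cost_per_extra * len(s_vals - t_vals)

def hierarchical_subsumption(s_node, t_node, levels=("args", "phrases", "words")):
    total = 0.0
    for level in levels:                  # decision list: fail at any level => fail overall
        ok, cost = match_level(s_node, t_node, level)
        if not ok:
            return False, total
        total += cost
    return True, total

S = {"args": ["A0:singer", "A1:lung cancer"], "phrases": ["jazz singer"], "words": ["died"]}
T = {"args": ["A0:singer"], "phrases": [], "words": ["died"]}
print(hierarchical_subsumption(S, T))     # (True, cost charged for S's extra details)
```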

33 Page 33 Summary. KR [learning & inference]: a description-logic-inspired hierarchical KR into which we re-represent the surface-level text, augmented with multiple abstractions. KB [acquisition & inference]: a knowledge base consisting of syntactic and semantic rewrite rules, written at several levels of abstraction. Inference [modeled as optimization: flexibility & error tolerance]: an extended subsumption algorithm which determines subsumption between representations. An Inference Model for Semantic Entailment [AAAI’05]; Experiments with a Semantic Entailment System [IJCAI’05-WS]. Evaluation: SRL (CoNLL shared task); PASCAL; ablation study on the PARC collection.

34 Page 34 This Talk: A brief perspective & technical motivation; An Approach to Textual Entailment; The CCG Inference model for textual entailment; Inference as optimization; Some examples; Knowledge modules; Conclusions.

35 Page 35 Ablation Study on the PARC Data. 76 pairs of Q-A sentences; questions converted manually; the label “unknown” is treated as “false”. Designed to test linguistic (lexical and constructional) entailment. Out of 76 pairs, 64 pairs got perfect SRL labelling. System versions vary two dimensions: Structure (add more parsing capabilities) and Semantics (add more semantic resources, some of which use parse structure).

36 Page 36 System Versions. A suite of tests, incrementally adding system components. LLM: uses BOW++ to match entire sentences. SRL + LLM: uses SRL tagging (as a filter) and BOW on verb arguments. SRL + Deep Structure: the system parses the arguments of verbs, using full parse and shallow parse tagging to identify argument structure. The Knowledge Base (of rewrite rules) can be active or inactive.

37 Page 37 Testing the Entailment System. Entailment (Knowledge Base) modules (can only be activated when the appropriate parse structure is present): Verb Phrase Compression: rewrite verb constructions (modal, VERB-to-VERB, tense). Discourse Analysis: detect embedded predicates; annotate the effect of the embedding predicate on the embedded predicate. Qualifier Reasoning: detect qualifiers and their scope (some, no, all, any, etc.); determine entailment of qualified arguments. Not shown: Functional Subsumption, rules (e.g., synonyms) used to allow other rules to fire.

38 Page 38 Results for Different Entailment Systems. Perfect corpus, with applicable entailment modules, with Knowledge Base. Active components:
System               | Base  | Base + VP | Base + VP + DA | Base + VP + DA + Qual
LLM                  | 60.94 | N/A       | N/A            | N/A
SRL + LLM            |       |           |                | N/A
SRL + Deep Structure |       |           |                |

39 Page 39 Results for Different Entailment Systems. Full corpus, with applicable entailment modules, with Knowledge Base. Active components:
System               | Base  | Base + VP | Base + VP + DA | Base + VP + DA + Qual
LLM                  | 63.15 | N/A       | N/A            | N/A
SRL + LLM            |       |           |                | N/A
SRL + Deep Structure |       |           |                |

40 Page 40 Baseline Entailment System (1). The baseline system is Lexical Level Matching (LLM): it ignores many “stopwords”, including “be” verbs, prepositions and determiners, and lemmatizes words before matching. Requiring structure may hurt: LLM allows entailment where SRL-based subsumption would require a rewrite rule. S: [The diplomat]/ARG1 visited [Iraq]/ARG1 [in September]/AM_TMP. T: [The diplomat]/ARG1 was in [Iraq]/ARG2. For LLM, the only words of T that register are “diplomat” and “Iraq”; as these are present in S, LLM returns “true”.
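A rough, self-contained re-implementation of this baseline in a few lines; the stopword list is illustrative, and a crude token normalizer stands in for real lemmatization:

```python
STOPWORDS = {"the", "a", "an", "in", "of", "to", "was", "is", "be", "were"}

def content_words(sentence):
    # Poor man's lemmatization/normalization: lowercase and strip punctuation.
    return {w.strip(".,").lower() for w in sentence.split()} - STOPWORDS

def llm_entails(s, t):
    # LLM: "true" when every remaining word of T also occurs in S.
    return content_words(t) <= content_words(s)

print(llm_entails("The diplomat visited Iraq in September",
                  "The diplomat was in Iraq"))        # True, as the slide notes
```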

41 Page 41 Baseline System (1.1). But LLM is insensitive to small changes in wording. S: [Legally]/AM_ADV, [John]/ARG0 [could]/AM_MOD drive. T: [John]/ARG0 drove. LLM ignores the modal “could”, so it returns the incorrect answer “true”.

42 Page 42 SRL + LLM (2). The SRL + LLM system uses Semantic Role Labeler tagging. First, it tries to match verb and argument types in the two sentences; if successful, the system uses LLM to determine entailment of arguments. Advantage over LLM when an argument or modifier is attached to a different verb in T than in S. S: [The president]/ARG0 said [[the diplomat]/ARG0 left [Iraq]/ARG1]/ARG1. T: [The diplomat]/ARG0 said [[the president]/ARG0 left [Iraq]/ARG1]/ARG1. The words are identical, so LLM incorrectly labels the example “true”; SRL + LLM returns “false” because the arguments of “said” and “left” don’t match.
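A sketch of that two-step test, with frames written as (verb, {label: text}) pairs; the bag-of-words check mirrors the LLM sketch above, and the frame format is an assumption made for illustration:

```python
STOP = {"the", "a", "an", "in", "of", "to", "was", "is"}

def bow(text):
    return {w.strip(".,").lower() for w in text.split()} - STOP

def srl_llm_entails(s_frames, t_frames):
    # A T frame matches an S frame only when the verbs agree and every T argument
    # label is present in S with lexically matching text.
    for t_verb, t_args in t_frames:
        ok = any(s_verb == t_verb and
                 all(lbl in s_args and bow(txt) <= bow(s_args[lbl])
                     for lbl, txt in t_args.items())
                 for s_verb, s_args in s_frames)
        if not ok:
            return False              # some frame of T has no matching frame in S
    return True

s_frames = [("say", {"ARG0": "The president", "ARG1": "the diplomat left Iraq"}),
            ("leave", {"ARG0": "the diplomat", "ARG1": "Iraq"})]
t_frames = [("say", {"ARG0": "The diplomat", "ARG1": "the president left Iraq"}),
            ("leave", {"ARG0": "the president", "ARG1": "Iraq"})]
print(srl_llm_entails(s_frames, t_frames))   # False: the arguments of "say" don't match
```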

43 Page 43 SRL + LLM (2.1). Disadvantage of SRL + LLM compared to LLM: SRL generates predicate frames for verbs that LLM ignores as stopwords. Example: “went” in the following sentence pair. S: [The president]/ARG0 visited [Iraq]/ARG1 [in September]/AM_TMP. T: [The president]/ARG0 went to [Iraq]/ARG1. LLM ignores “went” and returns the correct label “true”; SRL generates a verb frame for “went”, and subsumption fails as there is no match for this verb in S. In this data set there are more instances like the second case than like the first, so the result is a drop in performance. However, SRL forms a crucial backbone for other functionality.

44 Page 44 SRL + LLM with Verb Processing (3.0). The Verb Processing (VP) module rewrites certain verb phrases as a single verb with additional attributes. It uses word order and part-of-speech information to identify candidate patterns, and presently recognizes modal and tense constructions, and simple verb compounds of the form “VERB to VERB” (such as “manage to enter”). The verb phrase is replaced by a single predicate (verb) node with additional attributes: modality (“CONFIDENCE”) and tense. Requires POS and word order information; the default CONFIDENCE is “FACTUAL”.
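A toy version of the VP rewrite; the modal table and POS tags are illustrative, and only the FACTUAL default and the POTENTIAL value used on the next slide come from the deck:

```python
MODALS = {"could": "POTENTIAL", "may": "POTENTIAL", "might": "POTENTIAL"}

def compress_verb(tokens):
    # tokens: (word, POS) pairs of a verb group, e.g. [("could", "MD"), ("drive", "VB")].
    # Collapse to a single predicate node; modality becomes a CONFIDENCE attribute.
    confidence = "FACTUAL"                      # default, as on the slide
    head = tokens[-1][0]
    for word, pos in tokens[:-1]:
        if pos == "MD" and word.lower() in MODALS:
            confidence = MODALS[word.lower()]
    return {"predicate": head, "CONFIDENCE": confidence}

print(compress_verb([("could", "MD"), ("drive", "VB")]))
# {'predicate': 'drive', 'CONFIDENCE': 'POTENTIAL'}
```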

45 Page 45 SRL + LLM with Verb Processing (3.1). Example where the Verb Processing (VP) module helps. S: [Legally]/AM_ADV, [John]/ARG0 [could]/AM_MOD drive. T: [John]/ARG0 drove. Subsumption in the LLM and SRL + LLM systems succeeds, as the argument and verb lemma in T match those in S. The VP module rewrites “could drive” as “drive” and adds the attribute “CONFIDENCE: POTENTIAL” to the “drive” predicate node. In SRL + LLM + VP, subsumption fails at the verb level, as the CONFIDENCE attributes don’t match.

46 Page 46 SRL + LLM with Verb Processing (3.2). S: Bush said that Khan sold centrifuges to North Korea. T: Centrifuges sold to North Korea. The VP module rewrites the auxiliary construction in T as a single verb with tense and modality attributes attached. Now SRL generates only a single predicate frame for “sold”; this matches its counterpart in S, and subsumption succeeds. The qualifying effect of the verb “said” in S cannot be recognized without the deeper parse structure and the Discourse Analysis module.

47 Page 47 SRL + Deep Structure (4.0). The SRL + Deep Structure entailment system identifies substructure in SRL predicate arguments. It uses full and shallow parses, named entity and part-of-speech information; identifies the key entity in each argument; and identifies modifiers of the key entity such as adjectives, titles, and quantities. This enables further semantic modules, such as the Qualifier module, for reasoning about entailment of qualified arguments.

48 Page 48 SRL + Deep Structure (4.1). S: No US congressman visited Iraq until the war. T: Some US congressmen visited Iraq before the war. “Some” and “no” are stopwords (i.e., ignored by LLM), so LLM and SRL + LLM incorrectly label this example “true”. SRL + Deep Structure gives the correct label, “false”, because “no” and “some” are identified as key entity modifiers of the matching argument, and they don’t match.

49 Page 49 SRL + Deep Structure (4.2). Handling modifiers when there are no rules for modifiers. S: The room was full of women. T: The room was full of intelligent women. The LLM and SRL + LLM systems find no match for “intelligent” in S, and so return the correct answer, “false”. The SRL + Deep Structure system allows unbalanced adjective modifiers in T (assumption: S must be more general than T) and returns “true”. Context-sensitive handling of modifiers?

50 Page 50 SRL + Deep Structure + Discourse Analysis (5.0). Detecting the effects of an embedding predicate on the embedded predicate. Presently, the module supports the distinction between “FACTUAL” (the default assumption) and a set of values that distinguish various types of uncertainty, such as “REPORTED”. S: The New York Times reported that Hanssen sold FBI secrets to the Russians and could face the death penalty. T: Hanssen sold FBI secrets to the Russians. All systems lacking the Discourse Analysis (DA) module label this sentence pair “true”, because T is a literal fragment of S; the actual truth value depends on the interpretation of “reported”. Other embedding constructions DA can handle: adjectival (“It is unlikely that Hanssen sold secrets…”) and nominal (“There was a suspicion that Hanssen sold secrets…”).
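A toy illustration of this annotation step; the verb classes and the nested-frame format are invented for the example, and only the FACTUAL/REPORTED distinction comes from the slide:

```python
EMBEDDING_EFFECT = {"report": "REPORTED", "say": "REPORTED", "suspect": "SUSPECTED"}

def annotate_embedded(frame):
    # frame: {"predicate": ..., optional "embedded": sub-frame}.
    embedded = frame.get("embedded")
    if embedded is not None:
        embedded.setdefault("CONFIDENCE", "FACTUAL")        # default assumption
        effect = EMBEDDING_EFFECT.get(frame["predicate"])
        if effect:
            embedded["CONFIDENCE"] = effect                 # e.g. REPORTED under "reported that ..."
        annotate_embedded(embedded)                         # handle nested embeddings
    return frame

s = {"predicate": "report",
     "embedded": {"predicate": "sell"}}   # "The NYT reported that Hanssen sold ..."
print(annotate_embedded(s)["embedded"]["CONFIDENCE"])       # REPORTED, not FACTUAL
```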

51 Page 51 SRL + Deep Structure + DA + Qualifier (6.0). The Qualifier module allows comparison of qualifiers such as all, some, many, no, etc. In the following example it is used to identify that “all soldiers” entails “many soldiers”. S: All soldiers were killed in the ambush. T: Many soldiers were killed in the ambush.
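A minimal quantifier-entailment table in the spirit of this module; the table itself is illustrative, not the system's actual rule set:

```python
ENTAILS = {
    "all":  {"all", "many", "some"},   # "all X" entails "many X" and "some X"
    "many": {"many", "some"},
    "some": {"some"},
    "no":   {"no"},
}

def qualifier_entails(q_source, q_target):
    return q_target in ENTAILS.get(q_source, set())

print(qualifier_entails("all", "many"))   # True: "All soldiers ..." entails "Many soldiers ..."
print(qualifier_entails("some", "all"))   # False
```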

52 Page 52 Results for Different Entailment Systems. Perfect corpus, with applicable entailment modules, with Knowledge Base. Active components:
System               | Base  | Base + VP | Base + VP + DA | Base + VP + DA + Qual
LLM                  | 60.94 | N/A       | N/A            | N/A
SRL + LLM            |       |           |                | N/A
SRL + Deep Structure |       |           |                |

53 Page 53 Results for Different Entailment Systems. Full corpus, with applicable entailment modules, with Knowledge Base. Active components:
System               | Base  | Base + VP | Base + VP + DA | Base + VP + DA + Qual
LLM                  | 63.15 | N/A       | N/A            | N/A
SRL + LLM            |       |           |                | N/A
SRL + Deep Structure |       |           |                |

54 Page 54 Experiment: Conclusions. Monotonic improvement as additional analysis resources are added; best performance for the system with the most structural information (which supports the most semantic analysis modules). Non-monotonic improvement relative to LLM, because LLM is robust to certain errors due to stopwords, SRL matching is stricter (fewer false positives, more false negatives), and the corpus distribution favors LLM. Consistent behavior for the “imperfect” corpus (which includes SRL errors). The hierarchical representational approach shows strong promise.

55 Page 55 Summary. Progress in natural language understanding requires the ability to learn, represent and reason with respect to structured and relational data. The task of textual entailment provides a general setting within which to study and develop these theories; at the same time, it supports some immediate applications. We argued for an approach that attempts to refine a learned representation using a collection of knowledge modules, thus maintaining some of the underspecificity of language as far as possible, and that models inference as an optimization problem that attempts to find the minimal cost solution. No surprise: the key issues in this approach lie in knowledge acquisition.

56 Page 56 Semantic Role Labeling (1/2). For each verb in a sentence: 1. Identify all constituents that fill a semantic role. 2. Determine their roles: core arguments, e.g., Agent, Patient or Instrument, and their adjuncts, e.g., Locative, Temporal or Manner.
I left my pearls to my daughter-in-law in my will. [A0: leaver; A1: thing left; A2: benefactor; AM-LOC]
The pearls which I left to my daughter-in-law are fake. [A0: leaver; A1: thing left; A2: benefactor; R-A1]
The pearls, I said, were left to my daughter-in-law. [A0: sayer; A1: utterance; C-A1: utterance]

57 Page 57 Semantic Role Labeling (2/2). PropBank [Palmer et al. ’05] provides a large human-annotated corpus of semantic verb-argument relations. It adds a layer of generic semantic labels to Penn TreeBank II; (almost) all the labels are on constituents of the parse trees. Core arguments: A0-A5 and AA, with different semantics for each verb, specified in the PropBank frame files. There are 13 types of adjuncts, labeled AM-arg, where arg specifies the adjunct type.

58 Page 58 Our Approach. Identify argument candidates: pruning [Xue & Palmer, EMNLP’04]; argument identifier: binary classification (SNoW). Classify argument candidates: argument classifier: multi-class classification (SNoW). Inference: use the estimated probability distribution given by the argument classifier; use structural and linguistic constraints; infer the optimal global output. [Figure: candidate argument bracketings of “I left my nice pearls to her”]

59 Page 59 Inference. Maximize the expected number of correct labels: T* = argmax_T ∑_i P(a_i = t_i), subject to some constraints, structural and linguistic (e.g., R-A1 ⇒ A1). Solved with Integer Linear Programming. [Figure: costs of candidate labelings of “I left my nice pearls to her” under independent max, non-overlapping, and additional constraints]

60 Page 60 Constraints.
No duplicate argument classes: ∑_{a ∈ POTARG} x_{a = A0} ≤ 1.
R-ARG: for each a2 ∈ POTARG, ∑_{a ∈ POTARG} x_{a = A0} ≥ x_{a2 = R-A0} (if there is an R-ARG phrase, there is an ARG phrase).
C-ARG: for each a2 ∈ POTARG, ∑_{a ∈ POTARG, a before a2} x_{a = A0} ≥ x_{a2 = C-A0} (if there is a C-ARG phrase, there is an ARG phrase before it).
Many other possible constraints: unique labels; no overlapping or embedding; relations between the number of arguments; if the verb is of type A, no argument of type B. Any Boolean rule can be encoded as a linear constraint. Joint inference can also be used to combine different SRL systems.
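A small, self-contained sketch of this kind of encoding using the PuLP library (an assumption; the slides do not name a toolkit, and the original system had its own ILP formulation). It encodes "one label per candidate", "no duplicate A0" and "an R-A0 requires some A0"; the candidates and scores are made up.

```python
from pulp import LpProblem, LpMaximize, LpVariable, lpSum

candidates = ["c1", "c2", "c3"]
labels = ["A0", "A1", "R-A0", "NONE"]
score = {(c, l): 0.1 for c in candidates for l in labels}        # placeholder P(a_i = t_i)
score[("c1", "A0")], score[("c2", "A1")], score[("c3", "R-A0")] = 0.9, 0.8, 0.6

prob = LpProblem("srl_inference", LpMaximize)
x = {(c, l): LpVariable(f"x_{c}_{l}".replace("-", "_"), cat="Binary")
     for c in candidates for l in labels}

prob += lpSum(score[c, l] * x[c, l] for c in candidates for l in labels)   # expected #correct
for c in candidates:
    prob += lpSum(x[c, l] for l in labels) == 1                            # one label per candidate
prob += lpSum(x[c, "A0"] for c in candidates) <= 1                         # no duplicate A0
for c2 in candidates:                                                      # R-A0 => some A0
    prob += x[c2, "R-A0"] <= lpSum(x[c, "A0"] for c in candidates)

prob.solve()
print({c: next(l for l in labels if x[c, l].value() > 0.5) for c in candidates})
# e.g. {'c1': 'A0', 'c2': 'A1', 'c3': 'R-A0'}
```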

