
Slide 1: Global Inference and Learning Towards Natural Language Understanding
Dan Roth, Department of Computer Science, University of Illinois at Urbana-Champaign
CRI-06 Workshop on Machine Learning in Natural Language Processing

Slide 2: Nice to Meet You

Slide 3: Learning and Inference
- Global decisions in which several local decisions play a role, but there are mutual dependencies on their outcome.
- (Learned) classifiers for different sub-problems.
- Incorporate the classifiers' information, along with constraints, in making coherent decisions: decisions that respect the local classifiers as well as domain- and context-specific constraints.
- Global inference for the best assignment to all variables of interest.

Slide 4: Comprehension
(ENGLAND, June, 1989) - Christopher Robin is alive and well. He lives in England. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book. He made up a fairy tale land where Chris lived. His friends were animals. There was a bear called Winnie the Pooh. There was also an owl and a young pig, called a piglet. All the animals were stuffed toys that Chris owned. Mr. Robin made them come to life with his words. The places in the story were all near Cotchfield Farm. Winnie the Pooh was written in 1925. Children still love to read about Christopher Robin and his animal friends. Most people don't know he is a real person who is grown now. He has written two books of his own. They tell what it is like to be famous.
1. Christopher Robin was born in England.
2. Winnie the Pooh is a title of a book.
3. Christopher Robin's dad was a magician.
4. Christopher Robin must be at least 65 now.
Comprehension: a process that maintains and updates a collection of propositions about the state of affairs.

Slide 5: How to Address Comprehension? (cartoon)
It's an Inference Problem: map text into a well-defined language and use standard reasoning tools. This raises a huge number of problems:
- Variability of language: multiple levels of ambiguity. Can we make language precise? Is there a canonical mapping? Underspecificity calls for a "purposeful", goal-specific mapping.
- Knowledge acquisition.
- Reasoning patterns.
Statistics ("know a word by its neighbors"): counting; machine learning - clustering, probabilistic models, structured models, classification.

Slide 6: What We Know: Stand-Alone Ambiguity Resolution
Learn a function f: X → Y that maps observations in a domain to one of several categories (or an ordering, "<"), with broad coverage. Examples of the ambiguities to resolve:
- Illinois' bored of education [board].
- Nissan Car and truck plant is ... / ... divide life into plant and animal kingdom.
- (This Art) (can N) (will MD) (rust V): the correct POS tags are V, N, N.
- The dog bit the kid. He was taken to a veterinarian / a hospital.

Slide 7: Classification is Well Understood
- Theoretically: generalization bounds. How many examples does one need to see in order to guarantee good behavior on previously unobserved examples?
- Algorithmically: good learning algorithms for linear representations. They can deal with very high dimensionality (10^6 features) and are very efficient in terms of computation and number of examples; on-line.
- Key issues remaining:
  - Learning protocols: how to minimize interaction (supervision); how to map domain/task information to supervision; semi-supervised learning; active learning; ranking; sequences.
  - What are the features? No good theoretical understanding here.
  - Programming systems that have multiple classifiers.
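
As one concrete instance of the generalization bounds mentioned here, a standard PAC-style sample-complexity result (a textbook fact, not from the talk) for a finite hypothesis class H:

```latex
% With probability at least 1 - \delta, any hypothesis in H that is
% consistent with m i.i.d. training examples has true error at most
% \epsilon, provided
m \;\ge\; \frac{1}{\epsilon}\left(\ln|H| \;+\; \ln\frac{1}{\delta}\right)
```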

Slide 8: Comprehension (the same reading-comprehension example as Slide 4).

Slide 9: This Talk (Outline)
- Integrating Learning and Inference: historical perspective; role of learning.
- Global Inference over Classifiers: semantic parsing.
- Multiple levels of processing.
- Textual Entailment.
- Summary and Future Directions.

Slide 10: Learning, Knowledge Representation, Reasoning
There has been a lot of work on these three topics in AI, but mostly work on Learning alone, and work on Knowledge Representation and Reasoning alone. Very little has been done on an integrative framework:
- Inductive Logic Programming.
- Some work from the perspective of probabilistic AI.
- Some work from the perspective of Machine Learning (EBL).
- Recent: Valiant's Robust Logic.
- Learning to Reason.

Slide 11: Learning to Reason ['94-'97]: Key Insights
A unified framework to study Learning, Knowledge Representation and Reasoning:
- The goal is to Reason (deduction; abduction, i.e., best explanation).
- Reasoning is not done from a static knowledge base, but with knowledge that is learned via interaction with the world.
- The intermediate representation is important, but only to the extent that it is learnable and facilitates reasoning.
- Feedback to learning is given by the reasoning stage. There may not be a need (or even a possibility) to learn the intermediate representation exactly, but only to the extent that it supports reasoning.
[Khardon & Roth JACM'97, AAAI'94; Roth'95, Roth'96; Khardon & Roth'99; Learning to Plan: Khardon'99]

Slide 12: Learning to Reason: Interaction with the World
Knowledge representation is:
- chosen to facilitate inference, and
- learned by interaction with the world.
Performance is measured with respect to the world, on a reasoning task.
[Diagram: the world W feeds a Learning component, which produces a knowledge representation KB; Reasoning operates over KB to carry out the task.]

Slides 13-18: L2R: Computational Advantages (progressive build)
Deduction [Khardon & Roth '94, '97]: given f and a query q, decide whether f entails q (f |= q), where f ∈ CNF and q ∈ MonCNF (monotone CNF).
- Learning to Reason is easy when Reasoning is hard: instead of reasoning from a knowledge base KB(f), the L2R agent learns from interaction with the world W and answers queries q with what it has learned.
- Learning to Reason is easy when Learning is hard.
- No magic:
  - Learn a different representation for f, one that allows efficient reasoning.
  - Learn a different function, which only approximates f, but is sufficient for exact reasoning.
- Other studies on non-monotonic reasoning as learning, etc.

Slide 19: Learning to Reason ['94-'97]: Relevant? How?
(The framework summary from Slide 11, repeated: reasoning with learned knowledge; intermediate representations matter only insofar as they are learnable and support reasoning; feedback comes from the reasoning stage.)

Slide 20: Comprehension (the same reading-comprehension example as Slide 4).

Slide 21: Today's Research Issues: What is the Role of Learning?
It's an Inference Problem.
- Learning serves to support abstraction: a context-sensitive operation done at multiple levels, from names (Mr. Robin, Christopher Robin) to relations (wrote, author) and concepts.
- Learning serves to generate the vocabulary over which reasoning is possible (part-of-speech; a subject-of, ...).
- Knowledge acquisition; tuning; memory, ...
- Learning in the context of a reasoning system:
  - training with an inference mechanism;
  - feedback is given at the inference level, not at the single-classifier level;
  - labeling using inference.

Slide 22: This Talk (the outline from Slide 9, repeated).

Slide 23: What I Will Not Talk About: Knowledge Representation
A unified representation that is used as an input to learning processes, and as an output of learning processes. Specifically, we use an abstract representation centered around a semantic parse (predicate-argument representation), augmented by additional information, and formalized as a hierarchical concept graph (Description Logic inspired): Feature Description Logic [Cumby & Roth '00, '02, '03].
[Diagram: a concept graph for "Mohammed Atta met with an Iraqi intelligence agent in Prague in April 2001", with nodes such as person (name, affiliation, nationality), meeting (participant, location, time), city name(Prague), country name(Iraq), date month(April) year(2001).]

Slide 24: Inference and Learning
- Global decisions in which several local decisions play a role, but there are mutual dependencies on their outcome.
- Learned classifiers for different sub-problems.
- Incorporate the classifiers' information, along with constraints, in making coherent decisions: decisions that respect the local classifiers as well as domain- and context-specific constraints.
- Global inference for the best assignment to all variables of interest.
Questions addressed next:
- How to induce a predicate-argument representation of a sentence.
- How to use inference methods over learned outcomes.
- How to use declarative information over/along with learned information.

Slide 25: Semantic Role Labeling
I left my pearls to my daughter in my will.
[I]_A0 left [my pearls]_A1 [to my daughter]_A2 [in my will]_AM-LOC
- A0: leaver
- A1: things left
- A2: benefactor
- AM-LOC: location
Constraints illustrated: no overlapping arguments; if A2 is present, A1 must also be present.
Special case (structured output problem): here, all the data is available at one time; in general, classifiers might be learned from different sources, at different times, in different contexts. This has implications for training paradigms.

Slide 26: Problem Setting
- Random variables Y = y_1, ..., y_8.
- Conditional distributions P (learned by classifiers).
- Constraints C: any Boolean function defined on partial assignments, e.g. C(y_1, y_4), C(y_2, y_3, y_6, y_7, y_8); possibly with weights W.
- Goal: find the "best" assignment, the one that achieves the highest global accuracy:
  Y* = argmax_Y P(Y), subject to the constraints C (optionally with a weighted term W · C).
This is an Integer Programming problem.
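
A LaTeX rendering of the objective on this slide, reconstructed from the garbled transcript; the exact form of the soft, weighted variant is an assumption:

```latex
% Hard constraints:
\mathbf{y}^{*} \;=\; \operatorname*{argmax}_{\mathbf{y}} \sum_i \log P(y_i \mid \mathbf{x})
\quad \text{subject to } C_j(\mathbf{y}) = 1 \;\; \forall j
% Soft (weighted) constraints, as an assumed penalty form:
\mathbf{y}^{*} \;=\; \operatorname*{argmax}_{\mathbf{y}} \sum_i \log P(y_i \mid \mathbf{x})
\;-\; \sum_j w_j \, \mathbb{1}\!\left[C_j(\mathbf{y}) \text{ violated}\right]
```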

Slide 27: A General Inference Setting: Inference as Optimization
[Yih & Roth, CoNLL'04] [Punyakanok et al., COLING'04] [Punyakanok et al., IJCAI'04]
Related formulations: Markov Random Fields; [standard] optimization problems (e.g., metric labeling problems [Chekuri et al. '01]); linear programming problems.
An integer linear programming (ILP) formulation is:
- General: works on non-sequential constraint structures.
- Expressive: can represent many types of declarative constraints.
- Optimal: finds the optimal solution.
- Fast: commercial packages are able to quickly solve very large problems (hundreds of variables and constraints; sparsity is important).

Slide 28: Semantic Role Labeling (1/2)
Who did what to whom, when, where, why, ...
For each verb in a sentence:
1. Identify all constituents that fill a semantic role.
2. Determine their roles: core arguments (e.g., Agent, Patient or Instrument) and their adjuncts (e.g., Locative, Temporal or Manner).
Examples:
- I left my pearls to my daughter-in-law in my will. (A0: leaver; A1: thing left; A2: benefactor; AM-LOC)
- The pearls which I left to my daughter-in-law are fake. (A0: leaver; A1: thing left; A2: benefactor; R-A1: reference to A1)
- The pearls, I said, were left to my daughter-in-law. (A0: sayer; A1: utterance; C-A1: continued utterance)

Slide 29: Semantic Role Labeling (2/2)
PropBank [Palmer et al. '05] provides a large human-annotated corpus of semantic verb-argument relations:
- It adds a layer of generic semantic labels to Penn TreeBank II.
- (Almost) all the labels are on the constituents of the parse trees.
Core arguments: A0-A5 and AA, with different semantics for each verb, specified in the PropBank frame files. There are 13 types of adjuncts, labeled AM-arg, where arg specifies the adjunct type.

Slide 30: Algorithmic Approach
(Running example: "I left my nice pearls to her", with bracketed candidate arguments.)
1. Identify argument candidates (identify the vocabulary):
   - pruning [Xue & Palmer, EMNLP'04];
   - argument identifier: binary classification (SNoW).
2. Classify argument candidates:
   - argument classifier: multi-class classification (SNoW).
3. Inference over the (old and new) vocabulary of candidate arguments:
   - use the estimated probability distribution given by the argument classifier;
   - use structural and linguistic constraints;
   - infer the optimal global output.

Slide 31: Argument Identification & Classification
Both the argument identifier and the argument classifier are trained phrase-based classifiers.
- Features (some examples): voice, phrase type, head word, path, chunk, chunk pattern, etc. [Some make use of a full syntactic parse.]
- Learning algorithm: SNoW, a sparse network of linear functions whose weights are learned by the regularized Winnow multiplicative update rule.
- Probability conversion is done via softmax: p_i = exp{act_i} / Σ_j exp{act_j}.
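
A minimal sketch of the softmax conversion described above, in plain Python (variable names are illustrative, not from the system):

```python
import math

def softmax(activations):
    """Convert raw classifier activations act_i into probabilities
    p_i = exp(act_i) / sum_j exp(act_j), as on the slide.
    Subtracting the max is a standard numerical-stability trick."""
    m = max(activations)
    exps = [math.exp(a - m) for a in activations]
    z = sum(exps)
    return [e / z for e in exps]

# Example: scores for candidate labels (say A0, A1, A2, AM-LOC) of one argument.
print(softmax([2.0, 1.0, 0.5, -1.0]))
```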

Slide 32: Inference
The output of the argument classifier often violates some constraints, especially when the sentence is long. Finding the best legitimate output is formalized as an optimization problem and solved via Integer Linear Programming [Punyakanok et al. '04; Roth & Yih '04].
Input:
- the probability estimates (from the argument classifier);
- structural and linguistic constraints.
This allows incorporating expressive (non-sequential) constraints on the variables (the argument types).

Slide 33: Integer Linear Programming (ILP)
In the standard form: maximize a linear objective c · x, subject to linear constraints A x ≤ b, with x restricted to integer (here, 0/1 indicator) values.

Slide 34: Integer Linear Programming Inference
For each argument a_i, set up a Boolean variable a_{i,t} indicating whether a_i is classified as type t. The goal is to maximize Σ_i Σ_t score(a_i = t) · a_{i,t}, subject to the (linear) constraints. Any Boolean constraint can be encoded as linear constraint(s). If score(a_i = t) = P(a_i = t), the objective is to find the assignment that maximizes the expected number of correct arguments while satisfying the constraints.
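
A hedged sketch of this ILP using PuLP, an open-source Python modeler; the scores, candidates, and label set are made up for illustration, and this is not the paper's actual implementation (which used a commercial solver):

```python
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, value

# Hypothetical classifier scores: score[i][t] = P(argument i has type t).
labels = ["A0", "A1", "A2", "null"]
score = {
    0: {"A0": 0.7, "A1": 0.1, "A2": 0.1, "null": 0.1},
    1: {"A0": 0.2, "A1": 0.6, "A2": 0.1, "null": 0.1},
    2: {"A0": 0.1, "A1": 0.2, "A2": 0.5, "null": 0.2},
}

prob = LpProblem("srl_inference", LpMaximize)

# Boolean indicator x[i, t]: candidate i is assigned type t.
x = {(i, t): LpVariable(f"x_{i}_{t}", cat="Binary")
     for i in score for t in labels}

# Objective: expected number of correctly labeled arguments.
prob += lpSum(score[i][t] * x[i, t] for i in score for t in labels)

# Each candidate gets exactly one label.
for i in score:
    prob += lpSum(x[i, t] for t in labels) == 1

# Example declarative constraint: no duplicate A0 in the sentence.
prob += lpSum(x[i, "A0"] for i in score) <= 1

prob.solve()
print([(i, t) for (i, t) in x if value(x[i, t]) == 1])
```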

Slide 35: Inference
Maximize the expected number of correct arguments, T* = argmax_T Σ_i P(a_i = t_i), subject to some constraints, structural and linguistic (e.g., R-A1 ⇒ A1). Solved with Integer Linear Programming.
Example ("I left my nice pearls to her"; each row gives one candidate argument's scores over four labels):
  0.3 0.2 0.2 0.3
  0.6 0.0 0.0 0.4
  0.1 0.3 0.5 0.1
  0.1 0.2 0.3 0.4
- Independent max (no constraints): cost = 0.3 + 0.6 + 0.5 + 0.4 = 1.8.
- Non-overlapping: cost = 0.3 + 0.4 + 0.5 + 0.4 = 1.6.
- Blue ⇒ Red (slide's colors) together with non-overlapping: cost = 0.3 + 0.4 + 0.3 + 0.4 = 1.4.
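
A small check of the arithmetic above, assuming (as the row maxima suggest) that each row holds one candidate's scores:

```python
# Score table from the slide: rows = candidate arguments, cols = labels.
scores = [
    [0.3, 0.2, 0.2, 0.3],
    [0.6, 0.0, 0.0, 0.4],
    [0.1, 0.3, 0.5, 0.1],
    [0.1, 0.2, 0.3, 0.4],
]

# Unconstrained ("independent max") inference: pick each candidate's best
# label separately.  Reproduces the slide's 0.3 + 0.6 + 0.5 + 0.4 = 1.8.
print(round(sum(max(row) for row in scores), 1))  # 1.8
```

Adding constraints can only lower this optimum, which is why the constrained solutions score 1.6 and 1.4 on the slide.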

Slide 36: Constraints
Universally quantified rules; any Boolean rule can be encoded as a linear constraint.
- No duplicate argument classes: Σ_{a ∈ PotArg} x{a = A0} ≤ 1.
- R-ARG (if there is an R-ARG phrase, there is an ARG phrase): ∀ a2 ∈ PotArg, Σ_{a ∈ PotArg} x{a = A0} ≥ x{a2 = R-A0}.
- C-ARG (if there is a C-ARG phrase, there is an ARG phrase before it): ∀ a2 ∈ PotArg, Σ_{a ∈ PotArg, a before a2} x{a = A0} ≥ x{a2 = C-A0}.
Many other possible constraints:
- unique labels;
- no overlapping or embedding;
- relations between the numbers of arguments;
- if the verb is of type A, no argument of type B.
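
For concreteness, the standard encodings behind "any Boolean rule can be encoded as a linear constraint", over 0/1 variables (textbook facts, not from the slides):

```latex
\begin{aligned}
\neg a \quad &\leadsto\quad 1 - a \\
a \Rightarrow b \quad &\leadsto\quad a \le b \\
a \lor b \lor c \quad &\leadsto\quad a + b + c \ge 1 \\
(a \land b) \Rightarrow c \quad &\leadsto\quad a + b - c \le 1 \\
\text{at most one of } a_1,\dots,a_n \quad &\leadsto\quad \textstyle\sum_i a_i \le 1
\end{aligned}
```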

Slide 37: Semantic Parsing: Summary I
- This approach produces a very good semantic parser: top-ranked system in the CoNLL'05 shared task. The key difference is the inference.
- Easy and fast: ~1 sentence/second (using Xpress-MP).
- A lot of room for improvement (additional constraints).
- Demo available at http://L2R.cs.uiuc.edu/~cogcomp
- Significant also in enabling knowledge acquisition.
- So far, we have shown the use of only declarative (deterministic) constraints; in fact, this approach can be used with both statistical and declarative constraints.

Slide 38: ILP as a Unified Algorithmic Scheme
Consider a common model for sequential inference: HMM/CRF. Inference in this model is done via the Viterbi algorithm.
- Viterbi is a special case of linear-programming-based inference: it is a shortest-path problem, which is an LP with a canonical constraint matrix that is totally unimodular. Therefore, you get integrality for free.
- One can now incorporate non-sequential/expressive/declarative constraints by modifying this canonical matrix.
- The extension reduces to a polynomial scheme under some conditions (e.g., when constraints are sequential, or when the solution space does not change).
- It does not necessarily increase complexity, and is very efficient in practice. [Roth & Yih, ICML'05]
[Diagram: a trellis for a sequence y_1...y_5 over inputs x_1...x_5, with states A, B, C at each position, between source s and sink t.]
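
A minimal Viterbi sketch (generic dynamic programming, not the paper's code) to make the shortest-path reading concrete: each max below picks the best incoming edge in the trellis described above.

```python
def viterbi(obs, states, log_start, log_trans, log_emit):
    """Most likely state sequence for an HMM, computed in log space.
    log_start[s], log_trans[s][t], log_emit[s][o] are log-probabilities;
    maximizing their sum is a shortest-path computation on the trellis
    with negated edge weights."""
    V = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        prev, col, ptr = V[-1], {}, {}
        for t in states:
            best = max(states, key=lambda s: prev[s] + log_trans[s][t])
            col[t] = prev[best] + log_trans[best][t] + log_emit[t][o]
            ptr[t] = best
        V.append(col)
        back.append(ptr)
    # Trace the best path backwards through the stored pointers.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```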

Slide 39: Integer Linear Programming Inference: Summary
An inference method for the "best explanation", used here to induce a semantic representation of a sentence; a general information-integration framework.
- Allows expressive constraints: any Boolean rule can be represented by a set of linear (in)equalities.
- Combines acquired (statistical) constraints with declarative constraints: start with the shortest-path matrix and constraints, and add new constraints to the basic integer linear program.
- Solved using off-the-shelf packages: if the additional constraints don't change the solution, LP is enough; otherwise, the computational time depends on sparsity, and is fast in practice.
Demo available at http://L2R.cs.uiuc.edu/~cogcomp

Slide 40: This Talk (the outline from Slide 9, repeated).

Slide 41: Inference and Learning: It's Turtles All the Way Down...
- Global decisions in which several local decisions play a role, but there are mutual dependencies on their outcome.
- So far, this was a single-stage process: learn (acquire a new vocabulary), then run inference over it to guarantee the coherency of the outcome.
- Is that it? Of course this isn't sufficient: the process of learning and inference needs to be done in phases.

Slide 42: Pipeline
Raw data → POS tagging → phrases → semantic entities → relations; likewise parsing, WSD, semantic role labeling. The vocabulary is generated in phases; left-to-right processing of sentences is also a pipeline process.
- Pipelining is a crude approximation: interactions occur across levels, and downstream decisions often interact with previous decisions.
- It leads to propagation of errors: occasionally, later-stage problems are easier, but upstream mistakes will not be corrected.
- There are good reasons for pipelining decisions.
- Global inference over the outcomes of different levels can be used to break away from this paradigm [between a pipeline and fully global inference]. It allows a flexible way to incorporate linguistic and structural constraints.

Slide 43: Entities and Relations: Information Integration
"J.V. Oswald was murdered at JFK after his assassin, K. F. Johns..."
- Identify named entities: JFK → location; K. F. Johns → person.
- Identify relations between entities: Kill(X, Y).
- Exploit the mutual dependencies between named entities and relations to yield a coherent global detection. [Roth & Yih, COLING'02; CoNLL'04]
- Some knowledge (classifiers) may be known in advance; some constraints may be available only at decision time.

Slide 44: This Talk (the outline from Slide 9, repeated).

Slide 45: Comprehension (the same reading-comprehension example as Slide 4).

Slide 46: Textual Entailment
Given Q: "Who acquired Overture?", determine whether A answers it: "Eyeing the huge market potential, currently led by Google, Yahoo took over search company Overture Services Inc last year."
The text "Eyeing the huge market potential, currently led by Google, Yahoo took over search company Overture Services Inc. last year" entails "Yahoo acquired Overture" (the hypothesis is subsumed by the text), given background knowledge such as: Overture is a search company; Google is a search company; ...; Google owns Overture.
By "semantically entailed" we mean: most people would agree that one sentence implies the other. Simply, making plausible inferences.
Component technologies: phrasal-verb paraphrasing [Connor & Roth '06]; entity matching [Li et al., AAAI'04, NAACL'04]; semantic role labeling.

Slide 47: Discussing Textual Entailment
Requires an inference process that makes use of a large number of learned (and knowledge-based) operators.
- A sound approach for determining whether a statement of interest holds in a given sentence. [Braz et al., AAAI'05]
- A pair (sentence, hypothesis) is transformed into a simpler pair, in an entailment-preserving manner.
- A constrained-optimization formulation, over a large number of learned operators, aimed at the best (simplest) mapping between predicate-argument representations.
- Inference is purposeful: there is no canonical representation; rather, reasoning is done on sentences, and the transformation depends on the hypothesis.
What is shown next is a proof. At any stage, a large number of operators are entertained; some do not fire, some lead nowhere. This is one path through the optimization process that leads to a justifiable (and explainable) answer.
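
A rough sketch of the transformation loop described here; the operator names echo Slides 49-55, but the greedy control flow is an assumption (the paper uses a constrained-optimization formulation, not this loop):

```python
def entails(sentence, hypothesis, operators, matches, max_steps=50):
    """Repeatedly apply entailment-preserving rewrites (phrasal-verb
    replacement, coreference resolution, focus of attention, nominalization
    promotion, embedding resolution) until the hypothesis's predicate-argument
    structure is matched, or no operator makes progress.

    operators: functions (s, h) -> (s, h), or None when not applicable.
    matches:   final predicate/argument matching test (Slide 55's operator).
    """
    pair = (sentence, hypothesis)
    for _ in range(max_steps):
        if matches(*pair):
            return True
        for op in operators:
            new_pair = op(*pair)
            if new_pair is not None and new_pair != pair:
                pair = new_pair
                break
        else:
            return False  # no operator fired; entailment not shown
    return False
```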

Slide 48: Sample Entailment Pair
S: Hurricane Katrina petroleum-supply outlook improved somewhat, yesterday, as U.S. and European governments finally reached a consensus. They finally made up their minds to release 2 million barrels a day, of oil and refined products, from their reserves.
T: Offers by individual European governments involved supplies of crude or refined oil products.
Does T follow from S?

Slide 49: Operator 1: Phrasal Verb
Replace phrasal verbs with an equivalent single-word verb.
S (before): ... They finally made up their minds to release 2 million barrels a day, of oil and refined products, from their reserves.
S (after): ... They finally decided to release 2 million barrels a day, of oil and refined products, from their reserves.
T is unchanged.

Slide 50: Operator 2: Coreference Resolution
Replace pronouns/possessive pronouns with the entity to which they refer.
S (before): ... They finally decided to release 2 million barrels a day, of oil and refined products, from their reserves.
S (after): ... U.S. and European governments finally decided to release 2 million barrels a day, of oil and refined products, from their reserves.
T is unchanged.

Slide 51: Operator 3: Focus of Attention
Remove segments of a sentence that do not appear to be necessary; this may allow more accurate annotation of the remaining words.
S (before): Hurricane Katrina petroleum-supply outlook improved somewhat, yesterday, as U.S. and European governments finally reached a consensus. U.S. and European governments finally decided to release 2 million barrels a day, of oil and refined products, from their reserves.
S (after): U.S. and European governments finally decided to release 2 million barrels a day, of oil and refined products, from their reserves.
T is unchanged.

Slide 52: Operator 4: Nominalization Promotion
Replace a verb that does not express a useful/meaningful relationship with a nominalization in one of its arguments. Requires semantic role labeling (for noun predicates).
T (before): Offers by individual European governments involved supplies of crude or refined oil products. ("involved", with "Offers by individual European governments" as an argument)
T (after): Individual European governments offered supplies of crude or refined oil products.
S is unchanged.

Slide 53: Operator 4: Nominalization Promotion (applied again)
T (before): Individual European governments offered supplies of crude or refined oil products. ("offered", with "supplies of crude or refined oil products" as an argument)
T (after): Individual European governments supplied crude or refined oil products.
S is unchanged.

Slide 54: Operator 5: Predicate Embedding Resolution
Replace a verb compound, where the first verb may indicate modality or negation, with a single verb marked with a negation/modality attribute. Requires semantic role labeling (for noun predicates).
S (before): U.S. and European governments finally decided to release 2 million barrels a day, of oil and refined products, from their reserves.
S (after): U.S. and European governments finally released 2 million barrels a day, of oil and refined products, from their reserves.
"decided" (almost) does not change the meaning of the embedded verb. But what if the embedding verb had been "refused"? Then "refused to release" would become "released" with a negation attribute, and the entailment should not succeed.
T is unchanged.

Slide 55: Operator 6: Predicate Matching
The system matches predicates and their arguments, accounting for monotonicity, modality, negation, and quantifiers. Requires lexical abstraction.
S: U.S. and European governments finally released 2 million barrels a day, of oil and refined products, from their reserves.
T: Individual European governments supplied crude or refined oil products.
ENTAILMENT SUCCEEDS.

Slide 56: Conclusions
Discussed a general paradigm for learning and inference in the context of natural language understanding tasks. Not discussed: knowledge representation; how to train.
- Key insight: what to learn is driven by global decisions. Luckily, the number of components is much smaller than the number of decisions.
- Emphasis should be on:
  - learning locally and making use globally (via global inference) [Punyakanok et al., IJCAI'05];
  - the ability to make use of domain knowledge & constraints to drive supervision [Klementiev & Roth, ACL'06].

Slide 57: Conclusions (2)
Discussed a general paradigm for learning and inference in the context of natural language understanding tasks: incorporate classifiers' information, along with expressive constraints, within an inference framework for the best explanation.
- We can now incorporate many good old ideas.
- Learning allows us to develop the right vocabulary, and supports appropriate abstractions, so that we can study natural language understanding as a problem of reasoning.
- There is room for new research on reasoning patterns in NLP.

Slide 58: Acknowledgements
Many of my students contributed significantly to this line of work: Vasin Punyakanok, Scott Yih, Mark Sammons, Xin Li, Dav Zimak, Rodrigo de Salvo Braz, Chad Cumby, Yair Even Zohar, Michael Connor, Kevin Small, Alex Klementiev.
Funding:
- ARDA, under the AQUAINT program
- NSF: ITR IIS-0085836, ITR IIS-0428472, ITR IIS-0085980
- A DOI grant under the Reflex program
- DASH Optimization

Slide 59: Questions? Thank you.

