CPSC 503 Computational Linguistics, Lecture 8. Giuseppe Carenini. CPSC503 Winter 2009 (slides dated 10/3/2015).

Presentation transcript:

Today 1/10
Finish POS tagging
Start Syntax / Parsing (Chp 12!)

Evaluating Taggers
Accuracy: percent correct (most current taggers reach 96-97%). Test on unseen data!
Human ceiling: agreement rate of humans on the classification (96-97%)
Unigram baseline: assign each token to the class it occurred in most frequently in the training set (race -> NN). This reaches about 91%.
What is causing the errors? Build a confusion matrix…
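
The unigram baseline and the accuracy measure can be sketched in a few lines. This is only an illustration: the toy training data, the function names, and the "NN" default for unseen words are assumptions, not part of the lecture.

```python
from collections import Counter, defaultdict

def train_unigram_tagger(tagged_tokens):
    """Map each word to the tag it carried most often in the training data."""
    counts = defaultdict(Counter)
    for word, tag in tagged_tokens:
        counts[word][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

def tag_words(model, words, default="NN"):
    """Tag each word with its most frequent training tag (assumed default: NN)."""
    return [(word, model.get(word, default)) for word in words]

def accuracy(predicted, gold):
    """Fraction of tokens whose predicted (word, tag) pair matches the gold pair."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

# Toy training set: "race" is a noun twice and a verb once.
train = [("the", "DT"), ("race", "NN"), ("the", "DT"),
         ("race", "NN"), ("to", "TO"), ("race", "VB")]
model = train_unigram_tagger(train)

# Unseen test data where "race" is a verb: the baseline gets it wrong.
gold = [("to", "TO"), ("race", "VB")]
predicted = tag_words(model, [word for word, _ in gold])
print(predicted, accuracy(predicted, gold))
```

The baseline ignores context entirely, which is why it mislabels "race" after "to" here.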

Confusion matrix
Precision? Recall?

Error Analysis (textbook)
Look at a confusion matrix
See what errors are causing problems
–Noun (NN) vs ProperNoun (NNP) vs Adj (JJ)
–Past tense (VBD) vs Past Participle (VBN)
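
A confusion matrix, and per-tag precision and recall read off it, might be computed as below. The gold and predicted tag sequences are invented for illustration, and the function names are hypothetical.

```python
from collections import Counter

def confusion_matrix(gold_tags, predicted_tags):
    """Count how often each gold tag was labelled as each predicted tag."""
    return Counter(zip(gold_tags, predicted_tags))

def precision(matrix, tag):
    """Of the tokens labelled `tag`, the fraction that really are `tag`."""
    labelled = sum(n for (_, pred), n in matrix.items() if pred == tag)
    return matrix[(tag, tag)] / labelled if labelled else 0.0

def recall(matrix, tag):
    """Of the tokens that really are `tag`, the fraction labelled `tag`."""
    actual = sum(n for (gold, _), n in matrix.items() if gold == tag)
    return matrix[(tag, tag)] / actual if actual else 0.0

# Made-up example with the VBD/VBN and NN/JJ confusions named above.
gold = ["VBD", "VBN", "VBD", "NN", "VBN", "NN"]
pred = ["VBD", "VBD", "VBD", "NN", "VBN", "JJ"]
cm = confusion_matrix(gold, pred)

for (g, p), n in sorted(cm.items()):
    print(f"gold={g:4} predicted={p:4} count={n}")
print("VBD precision:", precision(cm, "VBD"), "recall:", recall(cm, "VBD"))
```

Off-diagonal cells such as (VBN, VBD) are the ones to inspect when asking what is causing the errors.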

Knowledge-Formalisms Map (next three lectures)
Formalisms:
–State Machines (and prob. versions) (Finite State Automata, Finite State Transducers, Markov Models)
–Rule systems (and prob. versions) (e.g., (Prob.) Context-Free Grammars)
–Logical formalisms (First-Order Logics)
–AI planners
Linguistic levels: Morphology, Syntax, Semantics, Pragmatics, Discourse and Dialogue

Today 1/10
Finish POS tagging
English Syntax
Context-Free Grammar for English
–Rules
–Trees
–Recursion
–Problems
Start Parsing

Syntax
Def. The study of how sentences are formed by grouping and ordering words
Example:
Ming and Sue prefer morning flights
* Ming Sue flights morning and prefer
Groups behave as a single unit with respect to substitution, movement, and coordination

Syntax: Useful tasks
Why should you care?
–Grammar checkers
–Basis for semantic interpretation (question answering, information extraction, summarization)
–Machine translation
–…

Key Constituents – with heads (English)
General pattern: (Specifier) X (Complement)
Noun phrases: (Det) N (PP)
Verb phrases: (Qual) V (NP)
Prepositional phrases: (Deg) P (NP)
Adjective phrases: (Deg) A (PP)
Sentences: (NP) (I) (VP)
Some simple specifiers
Category      Typical function       Examples
Determiner    specifier of N         the, a, this, no..
Qualifier     specifier of V         never, often..
Degree word   specifier of A or P    very, almost..
Complements?

Key Constituents: Examples
Noun phrases: (Det) N (PP), e.g., the cat on the table
Verb phrases: (Qual) V (NP), e.g., never eat a cat
Prepositional phrases: (Deg) P (NP), e.g., almost in the net
Adjective phrases: (Deg) A (PP), e.g., very happy about it
Sentences: (NP) (I) (VP), e.g., a mouse -- ate it

Context Free Grammar (Example)
S -> NP VP
NP -> Det NOMINAL
NOMINAL -> Noun
VP -> Verb
Det -> a
Noun -> flight
Verb -> left
Terminals: a, flight, left
Non-terminals: S, NP, NOMINAL, VP, Det, Noun, Verb
Start symbol: S
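
As a rough sketch, the example grammar can be stored as a dictionary and used to print a leftmost derivation. The dictionary layout and the function name are assumptions made for illustration; because every non-terminal here has exactly one rule, always picking the first rule is enough.

```python
# Each non-terminal maps to a list of right-hand sides (lists of symbols).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "NOMINAL"]],
    "NOMINAL": [["Noun"]],
    "VP": [["Verb"]],
    "Det": [["a"]],
    "Noun": [["flight"]],
    "Verb": [["left"]],
}

def leftmost_derivation(grammar, start="S"):
    """Yield each sentential form, rewriting the leftmost non-terminal
    with its first rule until only terminals remain."""
    form = [start]
    yield list(form)
    while True:
        index = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if index is None:  # only terminals left: derivation is complete
            return
        form = form[:index] + grammar[form[index]][0] + form[index + 1:]
        yield list(form)

steps = list(leftmost_derivation(GRAMMAR))
for step in steps:
    print(" ".join(step))
```

The printed sequence runs from S down to the terminal string "a flight left", one rule application per line.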

CFG: more complex example
[Figure: lexicon, and grammar with example phrases]

CFGs
Define a Formal Language (un/grammatical sentences)
Generative Formalism
–Generate strings in the language
–Reject strings not in the language
–Impose structures (trees) on strings in the language

CFG: Formal Definitions
4-tuple (non-terminals, terminals, productions, start): $(N, \Sigma, P, S)$
P is a set of rules $A \rightarrow \alpha$, with $A \in N$ and $\alpha \in (\Sigma \cup N)^*$
A derivation is the process of rewriting $\alpha_1$ into $\alpha_m$ (both strings in $(\Sigma \cup N)^*$) by applying a sequence of rules: $\alpha_1 \Rightarrow^* \alpha_m$
$L_G = \{ w \mid w \in \Sigma^* \text{ and } S \Rightarrow^* w \}$

Derivations as Trees
[Figure: derivation tree; recoverable node labels: Nominal, flight]
Context Free?

CFG Parsing (Chpt. 13)
[Figure: a parser applied to the sentence “I prefer a morning flight”; recoverable tree labels: Nominal, flight]
It is completely analogous to running a finite-state transducer with a tape
–It’s just more powerful

Other Options
Regular languages (FSA): $A \rightarrow xB$ or $A \rightarrow x$
–Too weak (e.g., cannot deal with recursion in a general way – no center-embedding)
CFGs: $A \rightarrow \alpha$ (also produce more understandable and “useful” structure)
Context-sensitive: $\alpha A \beta \rightarrow \alpha \gamma \beta$; $\gamma \neq \epsilon$
–Can be computationally intractable
Turing equiv.: $\alpha \rightarrow \beta$; $\alpha \neq \epsilon$
–Too powerful / Computationally intractable
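
The center-embedding point can be illustrated with the standard textbook language a^n b^n, which the context-free rules S -> a S b | a b generate but no finite-state automaton recognizes. The recognizer below is a minimal sketch; the choice of language and the function name are assumptions, not taken from the slides.

```python
def matches_anbn(tokens):
    """Recognize a^n b^n (n >= 1), following S -> a S b | a b.

    Each recursive call strips one matching outer pair, mirroring one
    level of center-embedding.
    """
    if len(tokens) < 2 or tokens[0] != "a" or tokens[-1] != "b":
        return False
    inner = tokens[1:-1]
    return len(inner) == 0 or matches_anbn(inner)

for text in ["ab", "aabb", "aaabbb", "aab", "abab"]:
    print(text, matches_anbn(list(text)))
```

The recursion depth grows with n, which is exactly the unbounded memory a finite-state device lacks.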

Common Sentence-Types
Declaratives: A plane left. S -> NP VP
Imperatives: Leave! S -> VP
Yes-No Questions: Did the plane leave? S -> Aux NP VP
WH Questions: Which flights serve breakfast? S -> WH NP VP
WH Questions: When did the plane leave? S -> WH Aux NP VP

NP: more details
NP -> Specifiers N Complements
NP -> (Predet)(Det)(Card)(Ord)(Quant)(AP) Nom, e.g., all the other cheap cars
Nom -> Nom PP (PP) (PP), e.g., reservation on BA456 from NY to YVR
Nom -> Nom GerundVP, e.g., flight arriving on Monday
Nom -> Nom RelClause
RelClause -> (who | that) VP, e.g., flight that arrives in the evening

Conjunctive Constructions
S -> S and S
–John went to NY and Mary followed him
NP -> NP and NP
–John went to NY and Boston
VP -> VP and VP
–John went to NY and visited MOMA
…
In fact the right rule for English is X -> X and X

Problems with CFGs
Agreement
Subcategorization

Agreement
In English,
–Determiners and nouns have to agree in number
–Subjects and verbs have to agree in person and number
Many languages have agreement systems that are far more complex than this (e.g., gender).

Agreement
Grammatical: This dog; Those dogs; This dog eats; Those dogs eat; You have it
Ungrammatical: *This dogs; *Those dog; *This dog eat; *Those dogs eats; *You has it

Possible CFG Solution (Sg = singular, Pl = plural)
OLD Grammar:
S -> NP VP
NP -> Det Nom
VP -> V NP
…
NEW Grammar:
SgS -> SgNP SgVP
PlS -> PlNP PlVP
SgNP -> SgDet SgNom
PlNP -> PlDet PlNom
PlVP -> PlV NP
SgVP3p -> SgV3p NP
…

CFG Solution for Agreement
It works and stays within the power of CFGs
But it doesn’t scale all that well (explosion in the number of rules)
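
The rule explosion can be made concrete with a small sketch that copies one rule per feature value. The helper name, the way agreeing symbols are passed in, and the prefix naming (for example "Sg3pS") are illustrative assumptions and differ slightly from the slide's own labels.

```python
from itertools import product

def specialize(lhs, rhs, agreeing, feature_values):
    """Return one copy of the rule lhs -> rhs per feature value,
    prefixing the left-hand side and every symbol listed in `agreeing`."""
    rules = []
    for value in feature_values:
        new_rhs = [value + sym if sym in agreeing else sym for sym in rhs]
        rules.append((value + lhs, new_rhs))
    return rules

NUMBER = ["Sg", "Pl"]
PERSON = ["1p", "2p", "3p"]

# Number agreement only: each rule is doubled.
s_rules = specialize("S", ["NP", "VP"], {"NP", "VP"}, NUMBER)
# The object NP does not agree with the verb, so it is left unmarked.
vp_rules = specialize("VP", ["V", "NP"], {"V"}, NUMBER)

# Number and person together: each rule now has six copies.
number_person = [n + p for n, p in product(NUMBER, PERSON)]
s_rules_full = specialize("S", ["NP", "VP"], {"NP", "VP"}, number_person)

for lhs, rhs in s_rules + vp_rules:
    print(lhs, "->", " ".join(rhs))
print("copies of S -> NP VP with number and person:", len(s_rules_full))
```

Every additional feature multiplies the number of copies again, which is the scaling problem noted above.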

Subcategorization
*John sneezed the book
*I prefer United has a flight
*Give with a flight
Def. It expresses constraints that a predicate (verb here) places on the number and type of its arguments (see first table)

Subcategorization
Sneeze: John sneezed
Find: Please find [a flight to NY] NP
Give: Give [me] NP [a cheaper fare] NP
Help: Can you help [me] NP [with a flight] PP
Prefer: I prefer [to leave earlier] TO-VP
Told: I was told [United has a flight] S
…

So?
So the various rules for VPs overgenerate.
–They allow strings containing verbs and arguments that don’t go together
–For example:
VP -> V NP, therefore *sneezed the book
VP -> V S, therefore *go she will go there
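
Overgeneration can be shown by contrasting plain VP rules, which ignore the verb, with a lexicon of subcategorization frames. The frame table below is loosely based on the examples above; the dictionary layout, the verb forms used as keys, and the function names are assumptions.

```python
# Argument patterns allowed by generic VP rules (VP -> V, VP -> V NP, ...).
VP_PATTERNS = {(), ("NP",), ("NP", "NP"), ("NP", "PP"), ("TO-VP",), ("S",)}

# Hypothetical subcategorization lexicon: the frames each verb accepts.
SUBCAT = {
    "sneezed": {()},
    "find": {("NP",)},
    "give": {("NP", "NP")},
    "help": {("NP", "PP")},
    "prefer": {("TO-VP",)},
    "told": {("S",)},
}

def plain_cfg_accepts(verb, args):
    """Generic VP rules only check the shape of the arguments, not the verb."""
    return tuple(args) in VP_PATTERNS

def subcat_accepts(verb, args):
    """Accept only argument patterns licensed by this particular verb."""
    return tuple(args) in SUBCAT.get(verb, set())

# "*sneezed the book": accepted by VP -> V NP, rejected by the lexicon.
print(plain_cfg_accepts("sneezed", ["NP"]), subcat_accepts("sneezed", ["NP"]))
# "find a flight to NY": accepted by both.
print(plain_cfg_accepts("find", ["NP"]), subcat_accepts("find", ["NP"]))
```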

Possible CFG Solution
OLD Grammar:
VP -> V
VP -> V NP
VP -> V NP PP
…
NEW Grammar:
VP -> IntransV
VP -> TransV NP
VP -> TransPPto NP PPto
…
TransPPto -> hand, give, ..
This solution has the same problem as the one for agreement

CFG for NLP: summary
CFGs cover most syntactic structure in English.
But there are problems (overgeneration)
–That can be dealt with adequately, although not elegantly, by staying within the CFG framework.
There are simpler, more elegant, solutions that take us out of the CFG framework: LFG, XTAGS…
Chpt 15 “Features and Unification”

Dependency Grammars
Syntactic structure: binary relations between words
Links: grammatical function or very general semantic relation
Abstract away from word-order variations (simpler grammars)
Useful features in many NLP applications (for classification, summarization and NLG)

Today 2/10
English Syntax
Context-Free Grammar for English
–Rules
–Trees
–Recursion
–Problems
Start Parsing (if time left)

Parsing with CFGs
Assign valid trees: a valid tree covers all and only the elements of the input and has an S at the top
[Figure: a parser takes a CFG and a sequence of words (e.g., “I prefer a morning flight”) and produces valid parse trees]

Parsing as Search
S -> NP VP
S -> Aux NP VP
NP -> Det Noun
VP -> Verb
Det -> a
Noun -> flight
Verb -> left, arrive
Aux -> do, does
The CFG defines the search space of possible parse trees
Parsing: find all trees that cover all and only the words in the input

Constraints on Search
[Figure: a parser takes a CFG (search space) and a sequence of words (e.g., “I prefer a morning flight”) and produces valid parse trees]
Search Strategies:
–Top-down or goal-directed
–Bottom-up or data-directed

Top-Down Parsing
Since we’re trying to find trees rooted with an S (Sentences), start with the rules that give us an S.
Then work your way down from there to the words.
[Figure: input sentence and first expansions of S]

Next step: Top Down Space
[Figure: next level of the top-down search space]
When POS categories are reached, reject trees whose leaves fail to match all words in the input
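
A top-down, depth-first search over the small "Parsing as Search" grammar can be sketched as a backtracking recursive-descent parser. The tree representation (a pair of label and child list) and the function names are assumptions; the approach works here only because this grammar has no left-recursive rules.

```python
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb"]],
    "Det": [["a"]],
    "Noun": [["flight"]],
    "Verb": [["left"], ["arrive"]],
    "Aux": [["do"], ["does"]],
}

def parse(symbol, words, start):
    """Yield (tree, end) for every way `symbol` can cover words[start:end]."""
    if symbol not in GRAMMAR:  # terminal: must match the next input word
        if start < len(words) and words[start] == symbol:
            yield symbol, start + 1
        return
    for rhs in GRAMMAR[symbol]:  # try rules in textual order
        yield from expand(symbol, rhs, words, start, [])

def expand(lhs, rhs, words, pos, children):
    """Expand the symbols of `rhs` left to right, backtracking on failure."""
    if not rhs:
        yield (lhs, children), pos
        return
    for child, next_pos in parse(rhs[0], words, pos):
        yield from expand(lhs, rhs[1:], words, next_pos, children + [child])

def parse_sentence(words):
    """Keep only trees rooted in S that cover all of the input."""
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]

print(parse_sentence("a flight left".split()))
print(parse_sentence("does a flight arrive".split()))
print(parse_sentence("flight a left".split()))
```

For "does a flight arrive" the parser first tries S -> NP VP, fails at the first word, and only then backtracks to S -> Aux NP VP, which is the wasted work that bottom-up filtering aims to avoid.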

Bottom-Up Parsing
Of course, we also want trees that cover the input words. So start with trees that link up with the words in the right way.
Then work your way up from there.

Two more steps: Bottom-Up Space
[Figure: two further levels of the bottom-up search space]
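
Bottom-up search can be sketched as repeatedly replacing a substring that matches the right-hand side of a rule with its left-hand side, until only S remains. This naive recognizer is an illustration under assumed names; it uses the "Parsing as Search" grammar and explores every reduction rather than following any particular textbook algorithm.

```python
RULES = [
    ("S", ("NP", "VP")),
    ("S", ("Aux", "NP", "VP")),
    ("NP", ("Det", "Noun")),
    ("VP", ("Verb",)),
    ("Det", ("a",)),
    ("Noun", ("flight",)),
    ("Verb", ("left",)),
    ("Verb", ("arrive",)),
    ("Aux", ("do",)),
    ("Aux", ("does",)),
]

def reductions(form):
    """Yield every form obtained by replacing one RHS occurrence with its LHS."""
    for lhs, rhs in RULES:
        size = len(rhs)
        for i in range(len(form) - size + 1):
            if form[i:i + size] == rhs:
                yield form[:i] + (lhs,) + form[i + size:]

def recognizes(words):
    """Depth-first search from the input words up to the single symbol S."""
    stack, seen = [tuple(words)], set()
    while stack:
        form = stack.pop()
        if form == ("S",):
            return True
        if form in seen:
            continue
        seen.add(form)
        stack.extend(reductions(form))
    return False

print(recognizes("a flight left".split()))
print(recognizes("does a flight arrive".split()))
print(recognizes("flight a left".split()))
```

On "does a flight arrive" the search also builds the dead-end form Aux S, a locally well-formed reduction that cannot be part of any complete parse.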

Top-Down vs. Bottom-Up
Top-down
–Only searches for trees that can be answers
–But suggests trees that are not consistent with the words
Bottom-up
–Only forms trees consistent with the words
–But suggests trees that make no sense globally

So Combine Them
Top-down: control strategy to generate trees
Bottom-up: to filter out inappropriate parses
Top-down control strategy:
–Depth vs. breadth first
–Which node to try to expand next (left-most)
–Which grammar rule to use to expand a node (textual order)

Top-Down, Depth-First, Left-to-Right Search
Sample sentence: “Does this flight include a meal?”

Example: “Does this flight include a meal?”
[Figures: successive steps of the top-down, depth-first search]

Adding Bottom-up Filtering
The following sequence was a waste of time because an NP cannot generate a parse tree starting with an Aux
[Figure: the wasted expansion of S -> NP VP over an input that begins with an Aux]

Bottom-Up Filtering
Category   Left Corners
S          Det, Proper-Noun, Aux, Verb
NP         Det, Proper-Noun
Nominal    Noun
VP         Verb
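
A left-corner table like this one can be computed from a grammar by following the first symbol of each rule until a part-of-speech category is reached. The grammar below is an assumption, chosen because it yields the same entries as the table; the slides do not list the grammar at this point, and the function name is hypothetical.

```python
# Assumed grammar; symbols that never appear as keys are POS categories.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["Proper-Noun"]],
    "Nominal": [["Noun"], ["Noun", "Nominal"]],
    "VP": [["Verb"], ["Verb", "NP"]],
}

def left_corners(grammar):
    """For each category, collect the POS categories that can start it."""
    table = {category: set() for category in grammar}
    changed = True
    while changed:  # iterate until no entry grows any further
        changed = False
        for category, expansions in grammar.items():
            for rhs in expansions:
                first = rhs[0]
                new = table[first] if first in grammar else {first}
                if not new <= table[category]:
                    table[category] |= new
                    changed = True
    return table

table = left_corners(GRAMMAR)
for category, corners in table.items():
    print(category, sorted(corners))

# Filtering: do not expand a category whose left corners exclude the next POS.
print("NP can start with Aux:", "Aux" in table["NP"])
```

With this table a top-down parser can skip S -> NP VP as soon as it sees that the next word is an Aux.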

Problems with TD-BU-filtering
Ambiguity
Repeated Parsing
SOLUTION: Earley Algorithm (once again dynamic programming!)

For Next Time
Read Chapter 13 (Parsing)
Optional: Read Chapter 16 (Features and Unification) – skip algorithms and implementation

Grammars and Constituency
Of course, there’s nothing easy or obvious about how we come up with the right set of constituents and the rules that govern how they combine...
That’s why there are so many different theories of grammar and competing analyses of the same data.
The approach to grammar, and the analyses, adopted here are very generic (and don’t correspond to any modern linguistic theory of grammar).

Syntactic Notions so far...
N-grams: the probability distribution for the next word can be effectively approximated knowing the previous n words
POS categories are based on:
–distributional properties (what other words can occur nearby)
–morphological properties (affixes they take)