Parsing I: Earley Parser CMSC 35100 Natural Language Processing May 1, 2003.

Parsing I: Earley Parser CMSC 35100 Natural Language Processing May 1, 2003

Roadmap
Parsing:
- Accepting & analyzing
- Combining top-down & bottom-up constraints
Efficiency:
- Earley parsers
Probabilistic CFGs:
- Handling ambiguity: preferring more likely analyses
- Adding probabilities to the grammar
- Parsing: probabilistic CYK
- Learning probabilities: treebanks & Inside-Outside
- Issues with probabilities

Representation: Context-free Grammars
A CFG is a 4-tuple:
- A set of terminal symbols: Σ
- A set of non-terminal symbols: N
- A set of productions P of the form A -> α, where A is a non-terminal and α is in (Σ ∪ N)*
- A designated start symbol S
The language is L = {w | w in Σ* and S =>* w}, where S =>* w means S derives w by some sequence of rule applications.

Representation: Context-free Grammars
Partial example:
- Σ: the, cat, dog, bit, bites, man
- N: NP, VP, AdjP, Nominal
- P: S -> NP VP; NP -> Det Nom; Nom -> N Nom | N
- Start symbol: S
Example parse of "The dog bit the man":
[S [NP [Det The] [Nom [N dog]]] [VP [V bit] [NP [Det the] [Nom [N man]]]]]
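
To make the representation concrete, here is a minimal sketch of the partial grammar above as plain Python data. The dictionary layout is my own, and the VP -> V NP rule is assumed from the example tree rather than listed in P.

```python
# Minimal sketch: the partial example grammar as plain Python data.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "Nom"]],
    "Nom": [["N", "Nom"], ["N"]],
    "VP":  [["V", "NP"]],          # assumed from the tree for "The dog bit the man"
}
LEXICON = {                        # terminals in Sigma mapped to their POS tags
    "the": "Det", "cat": "N", "dog": "N", "man": "N",
    "bit": "V", "bites": "V",
}
```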

Parsing Goals
Accepting:
- Is the string a legal string of the language?
- Formally a rigid yes/no decision; practically there are degrees of acceptability
Analysis:
- What structure produced the string?
- Produce one (or all) parse trees for the string

Parsing Search Strategies
Top-down constraints:
- All analyses must start with the start symbol S
- Successively expand non-terminals into their RHSs
- Must ultimately match the surface string
Bottom-up constraints:
- Analyses start from the surface string
- Identify parts of speech
- Match substrings of the current ply against rule RHSs to build the corresponding LHS
- Must ultimately reach S

Integrating Strategies
Left-corner parsing:
- Top-down parsing with bottom-up constraints
- Begin at the start symbol
- Apply a depth-first search strategy: expand the leftmost non-terminal
- The parser cannot consider a rule if the current input word cannot be the first word on the left edge of some derivation from that rule
- Tabulate all left-corners for each non-terminal
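
Since the slide ends by suggesting that left-corners be tabulated, here is a small sketch of computing that table for a toy grammar; the rule set, dictionary shape, and function name are my own.

```python
from collections import defaultdict

# Toy grammar: each non-terminal maps to a list of possible RHS expansions.
RULES = {
    "S":   [["NP", "VP"], ["VP"]],
    "NP":  [["Det", "Nom"], ["Proper-Noun"]],
    "Nom": [["Noun"], ["Noun", "Nom"]],
    "VP":  [["Verb"], ["Verb", "NP"]],
}

def left_corners(rules):
    """Transitive closure of the 'first symbol of some RHS' relation."""
    lc = defaultdict(set)
    for lhs, expansions in rules.items():
        for rhs in expansions:
            lc[lhs].add(rhs[0])
    changed = True
    while changed:                       # propagate until no new left-corners appear
        changed = False
        for lhs in list(lc):
            for sym in list(lc[lhs]):
                for deeper in lc.get(sym, ()):
                    if deeper not in lc[lhs]:
                        lc[lhs].add(deeper)
                        changed = True
    return dict(lc)

print(left_corners(RULES))   # e.g. "S" maps to {NP, VP, Det, Proper-Noun, Verb}
```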

Issues
Left recursion:
- If the first non-terminal of an RHS is the rule's own LHS, top-down expansion follows an infinite path without reaching a terminal node
- Could rewrite the grammar to remove left recursion
Ambiguity: pervasive (and costly)
- Lexical (POS) & structural: attachment, coordination, NP bracketing
Repeated subtree parsing:
- Duplicate subtrees are re-parsed when other parts of an analysis fail

Earley Parsing
Avoids the repeated-work and left-recursion problems:
- Dynamic programming: store partial parses in a "chart"
- Compactly encodes ambiguity, in O(N^3)
Chart entries record:
- A subtree for a single grammar rule
- Progress made in completing that subtree
- Position of the subtree with respect to the input

Earley Algorithm
Uses dynamic programming to do parallel top-down search in (worst case) O(N^3) time.
A single left-to-right pass fills out a chart with N+1 state sets:
- Think of chart entries as sitting between words in the input string, keeping track of the states of the parse at these positions
- For each word position, the chart contains the set of states representing all partial parse trees generated to date; e.g. chart[0] contains all partial parse trees generated at the beginning of the sentence

Chart Entries
Represent three types of constituents:
- predicted constituents
- in-progress constituents
- completed constituents

Progress in Parse Represented by Dotted Rules
The position of the dot (·) indicates the type of constituent.
Input: 0 Book 1 that 2 flight 3
- S → · VP, [0,0] (predicted)
- NP → Det · Nom, [1,2] (in progress)
- VP → V NP ·, [0,3] (completed)
[x,y] tells us what portion of the input is spanned so far by this rule.
Each state s_i: <dotted rule>, [<start position>, <end position>]
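
One way to represent such a state in code, as a sketch; the class, field, and method names are my own, not from the slides.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Sketch of a chart state: a dotted rule plus the span [start, end] it covers.
@dataclass(frozen=True)
class EarleyState:
    lhs: str                 # e.g. "NP"
    rhs: Tuple[str, ...]     # e.g. ("Det", "Nom")
    dot: int                 # position of the dot within rhs
    start: int               # input position where the constituent begins
    end: int                 # input position the parse has reached so far

    def next_symbol(self) -> Optional[str]:
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def is_complete(self) -> bool:
        return self.dot == len(self.rhs)

# NP -> Det · Nom, [1,2] from the slide: Det parsed, Nom expected next.
in_progress = EarleyState("NP", ("Det", "Nom"), dot=1, start=1, end=2)
print(in_progress.next_symbol(), in_progress.is_complete())   # Nom False
```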

S → · VP, [0,0]
- The first 0 means the S constituent begins at the start of the input
- The second 0 means the dot is there too
- So this is a top-down prediction
NP → Det · Nom, [1,2]
- The NP begins at position 1
- The dot is at position 2
- So Det has been successfully parsed, and Nom is predicted next
(Input: 0 Book 1 that 2 flight 3)

0 Book 1 that 2 flight 3 (continued)
VP → V NP ·, [0,3]
- Successful VP parse of the entire input

Successful Parse
The final answer is found by looking at the last entry in the chart.
If that state set contains an entry of the form S → α ·, [nil,N], then the input was parsed successfully.
The chart will also contain a record of all possible parses of the input string, given the grammar.

Parsing Procedure for the Earley Algorithm
Move through each set of states in order, applying one of three operators to each state:
- Predictor: add predictions to the chart
- Scanner: read input and add the corresponding state to the chart
- Completer: move the dot to the right when a new constituent is found
Results (new states) are added to the current or next set of states in the chart.
No backtracking and no states removed: the chart keeps the complete history of the parse.

States and State Sets
A dotted rule s_i is represented as <dotted rule>, [<start position>, <end position>].
A state set S_j is the collection of states s_i with the same end position j.

Earley Algorithm (simpler!)
1. Add Start → · S, [0,0] to state set 0. Let i = 1.
2. Predict all states you can, adding new predictions to state set 0.
3. Scan input word i: add all matched states to state set S_i. Add all new states produced by Complete to state set S_i. Add all new states produced by Predict to state set S_i. Let i = i + 1. Unless i = n, repeat step 3.
4. At the end, see if state set n contains Start → S ·, [nil,n].

3 Main Sub-Routines of the Earley Algorithm
- Predictor: adds predictions into the chart.
- Completer: moves the dot to the right when new constituents are found.
- Scanner: reads the input words and enters states representing those words into the chart.

Predictor
Intuition: create a new state for the top-down prediction of a new phrase.
Applied when a non-part-of-speech non-terminal is to the right of the dot: S → · VP, [0,0]
Adds new states to the current chart:
- One new state for each expansion of that non-terminal in the grammar, e.g. VP → · V, [0,0] and VP → · V NP, [0,0]
Formally: given S_j: A → α · B β, [i,j], add S_j: B → · γ, [j,j]

Scanner
Intuition: create new states for rules matching the part of speech of the next word.
Applicable when a part of speech is to the right of the dot: VP → · V NP, [0,0] with input 'Book…'
Looks at the current word in the input; if it matches, adds state(s) to the next chart: VP → V · NP, [0,1]
Formally: given S_j: A → α · B β, [i,j], where B is a part of speech matching word j, add S_j+1: A → α B · β, [i,j+1]

Completer
Intuition: the parser has finished a new phrase, so it must find and advance all the states that were waiting for it.
Applied when the dot has reached the right end of a rule: NP → Det Nom ·, [1,3]
Find all states with the dot at 1 that are expecting an NP: VP → V · NP, [0,1]
Adds new (completed) state(s) to the current chart: VP → V NP ·, [0,3]
Formally: given S_k: B → δ ·, [j,k] and S_j: A → α · B β, [i,j], add S_k: A → α B · β, [i,k]
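
Putting the three operators and the driver loop together, here is a compact recognizer sketch run on "Book that flight". The grammar, lexicon, and every name in it are my own; it uses plain tuples for states rather than the dataclass sketched earlier, and it only accepts or rejects (no backpointers yet).

```python
# Toy grammar and lexicon for "Book that flight"; all values are illustrative.
GRAMMAR = {
    "S":   [("VP",), ("NP", "VP")],
    "NP":  [("Det", "Nom"), ("Proper-Noun",)],
    "Nom": [("Noun",), ("Noun", "Nom")],
    "VP":  [("Verb",), ("Verb", "NP")],
}
POS = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}
PARTS_OF_SPEECH = {"Det", "Noun", "Verb", "Proper-Noun"}

# A state is (lhs, rhs, dot, start); its end position is the index of the
# state set it sits in, so it is not stored inside the tuple.

def earley(words):
    chart = [set() for _ in range(len(words) + 1)]
    chart[0].add(("Start", ("S",), 0, 0))              # Start -> . S, [0,0]
    for j in range(len(words) + 1):
        agenda = list(chart[j])
        while agenda:
            lhs, rhs, dot, start = agenda.pop()
            nxt = rhs[dot] if dot < len(rhs) else None
            if nxt is None:                             # COMPLETER
                for (l2, r2, d2, s2) in list(chart[start]):
                    if d2 < len(r2) and r2[d2] == lhs:  # state waiting for lhs
                        new = (l2, r2, d2 + 1, s2)
                        if new not in chart[j]:
                            chart[j].add(new)
                            agenda.append(new)
            elif nxt in PARTS_OF_SPEECH:                # SCANNER
                if j < len(words) and nxt in POS.get(words[j], set()):
                    chart[j + 1].add((lhs, rhs, dot + 1, start))
            else:                                       # PREDICTOR
                for expansion in GRAMMAR.get(nxt, ()):
                    new = (nxt, expansion, 0, j)
                    if new not in chart[j]:
                        chart[j].add(new)
                        agenda.append(new)
    return ("Start", ("S",), 1, 0) in chart[len(words)]  # Start -> S ., [nil,n]

print(earley("book that flight".split()))   # True
```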

Example: State Set S_0 for Parsing “Book that flight” using Grammar G_0

Example: State Set S_1 for Parsing “Book that flight”
- VP -> Verb ·, [0,1] (Scanner)
- S -> VP ·, [0,1] (Completer)
- VP -> Verb · NP, [0,1] (Scanner)
- NP -> · Det Nom, [1,1] (Predictor)
- NP -> · Proper-Noun, [1,1] (Predictor)

Prediction of Next Rule
When VP → Verb · is itself processed by the Completer, S → VP · is added to Chart[1], since VP is a left corner of S.
The last 2 rules in Chart[1] are added by the Predictor when VP → Verb · NP is processed.
And so on….

Last Two State Sets
Chart[2]:
- NP -> Det · Nominal, [1,2] (Scanner)
- Nom -> · Noun, [2,2] (Predictor)
- Nom -> · Noun Nom, [2,2] (Predictor)
Chart[3]:
- Nom -> Noun ·, [2,3] (Scanner)
- Nom -> Noun · Nom, [2,3] (Scanner)
- NP -> Det Nom ·, [1,3] (Completer)
- VP -> Verb NP ·, [0,3] (Completer)
- S -> VP ·, [0,3] (Completer)
- Nom -> · Noun, [3,3] (Predictor)
- Nom -> · Noun Nom, [3,3] (Predictor)

How do we retrieve the parses at the end?
Augment the Completer to add pointers to the prior states it advances, as a field in the current state:
- i.e. which state did we advance here?
- Read the pointers back from the final state
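
A small sketch of the backpointer idea: each state records the completed child states that advanced its dot, and a tree is read off the final state. The hand-built states and all names here are my own.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Each state carries the completed child states the Completer used to advance it.
@dataclass
class State:
    lhs: str
    rhs: Tuple[str, ...]
    dot: int
    start: int
    end: int
    backpointers: List["State"] = field(default_factory=list)

def tree(state):
    """Rebuild a bracketed tree by following backpointers from a completed state."""
    if not state.backpointers:
        return state.lhs
    return (state.lhs, [tree(child) for child in state.backpointers])

# Hand-built fragment for "that flight":
det = State("Det", ("that",), 1, 1, 2)
nom = State("Nom", ("Noun",), 1, 2, 3)
np = State("NP", ("Det", "Nom"), 2, 1, 3, backpointers=[det, nom])
print(tree(np))   # ('NP', ['Det', 'Nom'])
```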

Probabilistic CFGs

Handling Syntactic Ambiguity
Natural language syntax is varied, has DEGREES of acceptability, and is ambiguous.
Probability gives a framework for preferences:
- Augment the original context-free rules with probabilities: PCFG
- The probabilities of all rules sharing a left-hand side sum to 1
Example rules:
  NP -> N; NP -> Det N; NP -> Det Adj N; NP -> NP PP
  VP -> V; VP -> V NP; VP -> V NP PP
  S -> NP VP; S -> S conj S
  PP -> P NP  1.0
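
As a concrete sketch, the rules above as a Python dictionary keyed by (LHS, RHS). Only the 1.0 on PP -> P NP comes from the slide; every other probability below is a placeholder I chose so that each left-hand side sums to 1.

```python
# Sketch of a PCFG as a rule-to-probability map; non-PP probabilities are placeholders.
PCFG = {
    ("NP", ("N",)):              0.3,   # placeholder
    ("NP", ("Det", "N")):        0.4,   # placeholder
    ("NP", ("Det", "Adj", "N")): 0.1,   # placeholder
    ("NP", ("NP", "PP")):        0.2,   # placeholder
    ("VP", ("V",)):              0.3,   # placeholder
    ("VP", ("V", "NP")):         0.5,   # placeholder
    ("VP", ("V", "NP", "PP")):   0.2,   # placeholder
    ("S",  ("NP", "VP")):        0.9,   # placeholder
    ("S",  ("S", "conj", "S")):  0.1,   # placeholder
    ("PP", ("P", "NP")):         1.0,   # from the slide
}

# Sanity check: probabilities for each left-hand side must sum to 1.
for sym in {"NP", "VP", "S", "PP"}:
    assert abs(sum(p for (lhs, _), p in PCFG.items() if lhs == sym) - 1.0) < 1e-9
```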

PCFGs
Learning probabilities:
- Strategy 1: write the CFG manually, then use a treebank (a collection of parse trees) to estimate the rule probabilities
Parsing with PCFGs:
- Rank parse trees based on probability
- Provides graceful degradation: some parse can be found even for unusual constructions, just with a low probability value
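
A minimal sketch of Strategy 1's counting step: relative-frequency estimation of rule probabilities from a tiny hand-made treebank. The tree encoding and all names are my own.

```python
from collections import Counter

# A tree is (label, [children]); leaves are plain strings.
treebank = [
    ("S", [("NP", [("N", ["I"])]),
           ("VP", [("V", ["saw"]),
                   ("NP", [("Det", ["the"]), ("N", ["man"])])])]),
]

rule_counts, lhs_counts = Counter(), Counter()

def count_rules(node):
    label, children = node
    if all(isinstance(c, str) for c in children):
        return                                  # preterminal: skip lexical rules here
    rhs = tuple(child[0] for child in children)
    rule_counts[(label, rhs)] += 1
    lhs_counts[label] += 1
    for child in children:
        count_rules(child)

for tree in treebank:
    count_rules(tree)

# P(A -> beta) = count(A -> beta) / count(A)
probs = {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}
print(probs)
```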

Parse Ambiguity
Two parse trees for "I saw the man with the duck":
- PP attached to the VP: [S [NP [N I]] [VP [V saw] [NP [Det the] [N man]] [PP [P with] [NP [Det the] [N duck]]]]]
- PP attached to the NP: [S [NP [N I]] [VP [V saw] [NP [NP [Det the] [N man]] [PP [P with] [NP [Det the] [N duck]]]]]]

Parse Probabilities
P(T,S) = product over the nodes n in the tree T of p(r(n)), where T is the tree, S the sentence, n a node, and r(n) the rule applied at n.
- T1 = 0.85*0.2*0.1*0.65*1*0.65 ≈ 0.0072
- T2 = 0.85*0.2*0.45*0.05*0.65*1*0.65 ≈ 0.0016
Select T1.
Best systems achieve 92-93% accuracy.
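
As a quick check of the two products above, a one-off sketch in Python:

```python
from math import prod

# Factor lists copied from the slide; the products confirm T1 is the more probable tree.
t1 = prod([0.85, 0.2, 0.1, 0.65, 1, 0.65])         # ≈ 0.00718
t2 = prod([0.85, 0.2, 0.45, 0.05, 0.65, 1, 0.65])  # ≈ 0.00162
print(t1, t2, "select T1" if t1 > t2 else "select T2")
```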

Probabilistic CYK Parsing
Augmentation of Cocke-Younger-Kasami (bottom-up parsing).
Inputs:
- A PCFG in Chomsky Normal Form, G = {N, Σ, P, S, D}, with the non-terminals in N indexed and D assigning probabilities to the rules
- n words w1…wn
Data structure: dynamic programming array π[i,j,a], holding the maximum probability of the non-terminal with index a spanning words i through j
Output: the parse π[1,n,1], i.e. the start symbol S spanning w1…wn

Probabilistic CYK Parsing
Base case: input substrings of length 1
- In CNF, the probability must come from a lexical rule A => wi
Recursive case: for substrings of length > 1, A =>* wij iff there is a rule A -> B C and some k, 1 <= k < j, such that B derives the first k symbols and C derives the last j-k.
- Since these substrings are shorter than wij, their probabilities are already in the table
- Multiply the probabilities of the subparts; compute the max over all splits and rules
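
A minimal sketch of the base and recursive cases above for a PCFG in CNF. The toy grammar, its probabilities, and all names are my own; spans are 0-based Python indices rather than the 1-based indices on the slide, and only the best probability of an S over the whole input is returned (no back-tracing of the parse).

```python
from collections import defaultdict

# Toy CNF grammar: binary rules A -> B C and lexical rules A -> w, with probabilities.
BINARY = {
    ("S",  ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")):  1.0,
    ("NP", ("Det", "N")): 0.6,
    ("NP", ("NP", "PP")): 0.4,
    ("PP", ("P", "NP")):  1.0,
}
LEXICAL = {("Det", "the"): 1.0, ("N", "dog"): 0.5, ("N", "man"): 0.5, ("V", "bit"): 1.0}

def prob_cyk(words):
    n = len(words)
    pi = defaultdict(float)        # pi[(i, j, A)] = max probability of A over words[i:j]
    # Base case: spans of length 1 come from lexical rules A -> w_i.
    for i, w in enumerate(words):
        for (a, word), p in LEXICAL.items():
            if word == w:
                pi[(i, i + 1, a)] = max(pi[(i, i + 1, a)], p)
    # Recursive case: a longer span gets A via a rule A -> B C, a split point k,
    # and the already-computed best probabilities of B and C over the two halves.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, (b, c)), p in BINARY.items():
                    cand = p * pi[(i, k, b)] * pi[(k, j, c)]
                    pi[(i, j, a)] = max(pi[(i, j, a)], cand)
    return pi[(0, n, "S")]

print(prob_cyk("the dog bit the man".split()))   # 0.09 with this toy grammar
```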

Inside-Outside Algorithm
An EM approach, similar to Forward-Backward training of an HMM:
- Estimate the number of times each production is used, based on parses of the sentences
- Issue: ambiguity; distribute counts across the possible rules
- Iterate to convergence

Issues with PCFGs
Non-local dependencies:
- Rules are context-free; language isn't
- Example: subject vs. non-subject NPs. Subjects are about 90% pronouns (Switchboard), but the probabilities of NP -> Pron vs. NP -> Det Nom don't know whether the NP is a subject
Lexical context:
- Verb subcategorization: "send NP PP" vs. "saw NP PP"
- One approach: lexicalization