Giorgio Satta University of Padua Parsing Techniques for Lexicalized Context-Free Grammars* * Joint work with : Jason Eisner, Mark-Jan Nederhof.

Slides:



Advertisements
Similar presentations
Computational language: week 10 Lexical Knowledge Representation concluded Syntax-based computational language Sentence structure: syntax Context free.
Advertisements

Chapter 5: Languages and Grammar 1 Compiler Designs and Constructions ( Page ) Chapter 5: Languages and Grammar Objectives: Definition of Languages.
CPSC Compiler Tutorial 4 Midterm Review. Deterministic Finite Automata (DFA) Q: finite set of states Σ: finite set of “letters” (input alphabet)
C O N T E X T - F R E E LANGUAGES ( use a grammar to describe a language) 1.
Albert Gatt Corpora and Statistical Methods Lecture 11.
Probabilistic and Lexicalized Parsing CS Probabilistic CFGs: PCFGs Weighted CFGs –Attach weights to rules of CFG –Compute weights of derivations.
Grammars, constituency and order A grammar describes the legal strings of a language in terms of constituency and order. For example, a grammar for a fragment.
PARSING WITH CONTEXT-FREE GRAMMARS
Dependency Parsing Joakim Nivre. Dependency Grammar Old tradition in descriptive grammar Modern theroretical developments: –Structural syntax (Tesnière)
GRAMMAR & PARSING (Syntactic Analysis) NLP- WEEK 4.
CKY Parsing Ling 571 Deep Processing Techniques for NLP January 12, 2011.
Introduction to Computability Theory
PCFG Parsing, Evaluation, & Improvements Ling 571 Deep Processing Techniques for NLP January 24, 2011.
6/9/2015CPSC503 Winter CPSC 503 Computational Linguistics Lecture 11 Giuseppe Carenini.
Amirkabir University of Technology Computer Engineering Faculty AILAB Efficient Parsing Ahmad Abdollahzadeh Barfouroush Aban 1381 Natural Language Processing.
Parsing with PCFG Ling 571 Fei Xia Week 3: 10/11-10/13/05.
CS Master – Introduction to the Theory of Computation Jan Maluszynski - HT Lecture 4 Context-free grammars Jan Maluszynski, IDA, 2007
Validating Streaming XML Documents Luc Segoufin & Victor Vianu Presented by Harel Paz.
1/13 Parsing III Probabilistic Parsing and Conclusions.
1/17 Probabilistic Parsing … and some other approaches.
Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)
Syntactic Parsing with CFGs CMSC 723: Computational Linguistics I ― Session #7 Jimmy Lin The iSchool University of Maryland Wednesday, October 14, 2009.
Parsing SLP Chapter 13. 7/2/2015 Speech and Language Processing - Jurafsky and Martin 2 Outline  Parsing with CFGs  Bottom-up, top-down  CKY parsing.
Probabilistic Parsing Ling 571 Fei Xia Week 5: 10/25-10/27/05.
(2.1) Grammars  Definitions  Grammars  Backus-Naur Form  Derivation – terminology – trees  Grammars and ambiguity  Simple example  Grammar hierarchies.
11 CS 388: Natural Language Processing: Syntactic Parsing Raymond J. Mooney University of Texas at Austin.
Efficient Parsing for Bilexical CF Grammars Head Automaton Grammars Jason Eisner U. of Pennsylvania Giorgio Satta U. of Padova, Italy U. of Rochester.
1 Introduction to Parsing Lecture 5. 2 Outline Regular languages revisited Parser overview Context-free grammars (CFG’s) Derivations.
BİL711 Natural Language Processing1 Statistical Parse Disambiguation Problem: –How do we disambiguate among a set of parses of a given sentence? –We want.
Probabilistic Context Free Grammars for Representing Action Song Mao November 14, 2000.
Tree-adjoining grammar (TAG) is a grammar formalism defined by Aravind Joshi and introduced in Tree-adjoining grammars are somewhat similar to context-free.
1 Statistical Parsing Chapter 14 October 2012 Lecture #9.
1 Natural Language Processing Lecture 11 Efficient Parsing Reading: James Allen NLU (Chapter 6)
LINGUISTICA GENERALE E COMPUTAZIONALE ANALISI SINTATTICA (PARSING)
Copyright © by Curt Hill Grammar Types The Chomsky Hierarchy BNF and Derivation Trees.
Parsing I: Earley Parser CMSC Natural Language Processing May 1, 2003.
11 Chapter 14 Part 1 Statistical Parsing Based on slides by Ray Mooney.
Lecture 1, 7/21/2005Natural Language Processing1 CS60057 Speech &Natural Language Processing Autumn 2007 Lecture August 2007.
Parsing Introduction Syntactic Analysis I. Parsing Introduction 2 The Role of the Parser The Syntactic Analyzer, or Parser, is the heart of the front.
Introduction to Parsing
Basic Parsing Algorithms: Earley Parser and Left Corner Parsing
PARSING 2 David Kauchak CS159 – Spring 2011 some slides adapted from Ray Mooney.
Introduction to Compiling
© 2005 Hans Uszkoreit FLST WS 05/06 FLST Grammars and Parsing Hans Uszkoreit.
1Computer Sciences Department. Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER Reference 3Computer Sciences Department.
11 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 7 School of Innovation, Design and Engineering Mälardalen University 2012.
Syntax and Semantics Form and Meaning of Programming Languages Copyright © by Curt Hill.
CSC312 Automata Theory Lecture # 26 Chapter # 12 by Cohen Context Free Grammars.
Formal grammars A formal grammar is a system for defining the syntax of a language by specifying sequences of symbols or sentences that are considered.
Dependency Parsing Niranjan Balasubramanian March 24 th 2016 Credits: Many slides from: Michael Collins, Mausam, Chris Manning, COLNG 2014 Dependency Parsing.
Natural Language Processing : Probabilistic Context Free Grammars Updated 8/07.
2016/7/9Page 1 Lecture 11: Semester Review COMP3100 Dept. Computer Science and Technology United International College.
Roadmap Probabilistic CFGs –Handling ambiguity – more likely analyses –Adding probabilities Grammar Parsing: probabilistic CYK Learning probabilities:
Natural Language Processing Vasile Rus
CSC 594 Topics in AI – Natural Language Processing
Formal Language & Automata Theory
G. Pullaiah College of Engineering and Technology
Programming Languages Translator
Basic Parsing with Context Free Grammars Chapter 13
Table-driven parsing Parsing performed by a finite state machine.
Compiler Lecture 1 CS510.
Probabilistic and Lexicalized Parsing
Introduction CI612 Compiler Design CI612 Compiler Design.
CS 388: Natural Language Processing: Syntactic Parsing
Jaya Krishna, M.Tech, Assistant Professor
CSCI 5832 Natural Language Processing
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Parsing I: CFGs & the Earley Parser
Probabilistic Parsing
Lec00-outline May 18, 2019 Compiler Design CS416 Compiler Design.
Presentation transcript:

Giorgio Satta University of Padua Parsing Techniques for Lexicalized Context-Free Grammars* * Joint work with : Jason Eisner, Mark-Jan Nederhof

Summary Part I: Lexicalized Context-Free Grammars –motivations and definition –relation with other formalisms Part II: standard parsing –TD techniques –BU techniques Part III: novel algorithms –BU enhanced –TD enhanced

Lexicalized grammars each rule specialized for one or more lexical items advantages over non-lexicalized formalisms: –express syntactic preferences that are sensitive to lexical words –control word selection

Syntactic preferences adjuncts Workers [ dumped sacks ] into a bin *Workers dumped [ sacks into a bin ] N-N compound [ hydrogen ion ] exchange *hydrogen [ ion exchange ]

Word selection lexical Nora convened the meeting ?Nora convened the party semantics Peggy solved two puzzles ?Peggy solved two goats world knowledge Mary shelved some books ?Mary shelved some cooks

Lexicalized CFG Motivations : study computational properties common to generative formalisms used in state-of-the-art real-world parsers develop parsing algorithm that can be directly applied to these formalisms

Lexicalized CFG dumped sacks into a bin VP[dump][sack] P[into] N[bin]Det[a] NP[bin] PP[into][bin] V[dump] VP[dump][sack] NP[sack] N[sack]

Lexicalized CFG Context-free grammars with : alphabet V T : –dumped, sacks, into,... delexicalized nonterminals V D : –NP, VP,... nonterminals V N : –NP[sack], VP[dump][sack],...

Lexicalized CFG Delexicalized nonterminals encode : word sense N, V,... grammatical features number, tense,... structural information bar level, subcategorization state,... other constraints distribution, contextual features,...

Lexicalized CFG productions have two forms : –V[dump] dumped –VP[dump][sack] VP[dump][sack] PP[into][bin] lexical elements in lhs inherited from rhs

Lexicalized CFG production is k-lexical : k occurrences of lexical elements in rhs –NP[bin] Det[a] N[bin] is 2-lexical –VP[dump][sack] VP[dump][sack] PP[into][bin] is 4-lexical

LCFG at work 2-lexical CFG –Alshawi 1996 : Head Automata –Eisner 1996: Dependency Grammars –Charniak 1997 : CFG –Collins 1997 : generative model

LCFG at work Probabilistic LCFG G is strongly equivalent to probabilistic grammar G iff mapping between derivations each direction is a homomorphism derivation probabilities are preserved

LCFG at work From Charniak 1997 to 2-lex CFG : NP S [profits] N NP [profits]ADJ NP [corporate] Pr 1 (corporate | ADJ, NP, profits) Pr 1 (profits | N, NP, profits) Pr 2 ( NP ADJ N | NP, S, profits)

LCFG at work From Collins 1997 (Model #2) to 2-lex CFG : Pr left (NP, IBM | VP, S, bought, left, {NP-C}) N, [IBM] VP S, {NP-C}, left, S [bought] VP S, {}, left, S [bought]

LCFG at work Major Limitation : Cannot capture relations involving lexical items outside actual constituent (cfr. history based models) V[d 0 ] d1d1 NP[d 1 ] d2d2 PP[d 2 ][d 3 ] d0d0 cannot look at d 0 when computing PP attachment NP[d 1 ][d 0 ]

LCFG at work lexicalized context-free parsers that are not LCFG : –Magerman 1995: Shift-Reduce+ –Ratnaparkhi 1997: Shift-Reduce+ –Chelba & Jelinek 1998: Shift-Reduce+ –Hermjakob & Mooney 1997: LR

Related work Other frameworks for the study of lexicalized grammars : Carroll & Weir 1997 : Stochastic Lexicalized Grammars; emphasis on expressiveness Goodman 1997 : Probabilistic Feature Grammars; emphasis on parameter estimation

Summary Part I: Lexicalized Context-Free Grammars –motivations and definition –relation with other formalisms Part II: standard parsing –TD techniques –BU techniques Part III: novel algorithms –BU enhanced –TD enhanced

Standard Parsing standard parsing algorithms (CKY, Earley, LC,...) run on LCFG in time O ( |G | |w | 3 ) for 2-lex CFG (simplest case) |G | grows with |V D | 3 |V T | 2 !! Goal : Get rid of |V T | factors

Standard Parsing: TD Result (to be refined) : Algorithms satisfying the correct-prefix property are unlikely to run on LCFG in time independent of V T

Correct-prefix property Earley, Left-Corner, GLR,... : w S left-to-right reading position

On-line parsing No grammar precompilation (Earley) : Parser G w Output

Standard Parsing: TD Result : On-line parsers with correct-prefix property cannot run in time O ( f(|V D |, |w |) ), for any function f

Off-line parsing Grammar is precompiled (Left-Corner, LR) : Parser G PreComp C(G ) w Output Parser

Standard Parsing: TD Fact : We can simulate a nondeterministic FA M on w in time O ( |M | |w | ) Conjecture : Fix a polynomial p. We cannot simulate M on w in time p( |w | ) unless we spend exponential time in precompiling M

Standard Parsing: TD Assume our conjecture holds true Result : Off-line parsers with correct-prefix property cannot run in time O ( p(|V D |, |w |) ), for any polynomial p, unless we spend exponential time in precompiling G

Standard Parsing: BU Common practice in lexicalized grammar parsing : select productions that are lexically grounded in w parse BU with selected subset of G Problem : Algorithm removes |V T | factors but introduces new |w | factors !!

Standard Parsing: BU Running time is O ( |V D | 3 |w | 5 ) !! ikj B[d 1 ]C[d 2 ] A[d 2 ] d1d1 d2d2 Time charged : i, k, j |w | 3 A, B, C |V D | 3 d 1, d 2 |w | 2

Standard BU : Exhaustive

Standard BU : Pruning

Summary Part I: Lexicalized Context-Free Grammars –motivations and definition –relation with other formalisms Part II: standard parsing –TD techniques –BU techniques Part III: novel algorithms –BU enhanced –TD enhanced

BU enhanced Result : Parsing with 2-lex CFG in time O ( |V D | 3 |w | 4 ) Remark : Result transfers to models in Alshawi 1996, Eisner 1996, Charniak 1997, Collins 1997 Remark : Technique extends to improve parsing of Lexicalized-Tree Adjoining Grammars

Algorithm #1 Basic step in naive BU : j C[d 2 ] d2d2 ik B[d 1 ] d1d1 Idea : Indices d 1 and j can be processed independently A[d 2 ]

Algorithm #1 Step 1 Step 2 i k B[d 1 ] d1d1 A[d 2 ] C[d 2 ] d2d2 A[d 2 ] i d2d2 j i k A[d 2 ] C[d 2 ] d2d2 j i A[d 2 ] C[d 2 ] d2d2 k

BU enhanced Upper bound provided by Algorithm #1 : O (|w | 4 ) Goal : Can we go down to O (|w | 3 ) ?

Spine last week AdvP[week] NP[IBM] NP[Lotus] Lotus IBM S[buy] VP[buy] V[buy] bought The spine of a parse tree is the path from the root to the roots head S[buy] VP[buy] V[buy] bought

The spine projection is the yield of the sub- tree composed by the spine and all its sibling nodes Spine projection last week AdvP[week] S[buy] NP[IBM] S[buy] VP[buy] V[buy]NP[Lotus] Lotusbought IBM AdvP[week] NP[IBM] NP[Lotus] NP[IBM] bought NP[Lotus] AdvP[week]

Split Grammars Split spine projections at head : Problem : how much information do we need to store in order to construct new grammatical spine projections from splits ? ??

Split Grammars Fact : Set of spine projections is a linear context- free language Definition : 2-lex CFG is split if set of spine projections is a regular language Remark : For split grammars, we can recombine splits using finite information

Split Grammars Non-split grammar : unbounded # of dependencies between left and right dependents of head S[d] AdvP[a] S[d]AdvP[b] S 1 [d] linguistically unattested and unlikely AdvP[a] S[d]AdvP[b] S 1 [d]

Split Grammars Split grammar : finite # of dependencies between left and right dependents of lexical head S[d] AdvP[a] S[d]AdvP[b] S[d]

Split Grammars Precompile grammar such that splits are derived separately AdvP[week] NP[IBM] NP[Lotus] S[buy] VP[buy] V[buy] bought AdvP[week] NP[IBM] NP[Lotus] S[buy] r 3 [buy] r 2 [buy] r 1 [buy] bought r 3 [buy] is a split symbol

Split Grammars t : max # of states per spine automaton g : max # of split symbols per spine automaton (g < t ) m : # of delexicalized nonterminals thare are maximal projections

BU enhanced Result : Parsing with split 2-lexical CFG in time O (t 2 g 2 m 2 |w | 3 ) Remark: Models in Alshawi 1996, Charniak 1997 and Collins 1997 are not split

Algorithm #2 Idea : recognize left and right splits separately collect head dependents one split at a time B[d] d d s s d

NP[IBM] bought NP[Lotus] AdvP[week] Algorithm #2

Step 1 B[d 1 ] d1d1 s1s1 k r1r1 d2d2 s2s2 r2r2 Step 2 B[d 1 ] d1d1 s1s1 d2d2 s2s2 r2r2 d1d1 s1s1 d2d2 s2s2 r2r2 i r2r2 d2d2 i s2s2

Algorithm #2 : Exhaustive

Algorithm #2 : Pruning

Related work Cubic time algorithms for lexicalized grammars : Sleator & Temperley 1991 : Link Grammars Eisner 1997 : Bilexical Grammars (improved by transfer of Algorithm #2)

TD enhanced Goal : Introduce TD prediction for 2-lexical CFG parsing, without |V T | factors Remark : Must relax left-to-right parsing (because of previous results)

TD enhanced Result : TD parsing with 2-lex CFG in time O ( |V D | 3 |w | 4 ) Open : O ( |w | 3 ) extension to split grammars

TD enhanced Strongest version of correct-prefix property : S w reading position

Data Structures Prods with lhs A[d] : A[d] X 1 [d 1 ] X 2 [d 2 ] A[d] Y 1 [d 3 ] Y 2 [d 2 ] A[d] Z 1 [d 2 ] Z 2 [d 1 ] d3d3 d2d2 d1d1 Trie for A[d] : d1d1 d2d2

Data Structures Rightmost subsequence recognition by precompiling input w into a deterministic FA : a b a c c b a b b a a c

Algorithm #3 Item representation : i, j indicate extension of A[d] partial analysis A[d] i j k S k indicates rightmost possible position for completion of A[d] analysis

Algorithm #3 : Prediction Step 1 : find rightmost subsequence before k for some A[d 2 ] production Step 2 : make Earley prediction C[d 2 ] B[d 1 ] k i A[d 2 ] k d2d2 d1d1

Conclusions standard parsing techniques are not suitable for processing lexicalized grammars novel algorithms have been introduced using enhanced dynamic programming work to be done : extension to history-based models

The End Many thanks for helpful discussion to : Jason Eisner, Mark-Jan Nederhof