Giorgio Satta University of Padua Parsing Techniques for Lexicalized Context-Free Grammars* * Joint work with : Jason Eisner, Mark-Jan Nederhof.

Giorgio Satta University of Padua Parsing Techniques for Lexicalized Context-Free Grammars* * Joint work with : Jason Eisner, Mark-Jan Nederhof

Summary Part I: Lexicalized Context-Free Grammars –motivations and definition –relation with other formalisms Part II: standard parsing –TD techniques –BU techniques Part III: novel algorithms –BU enhanced –TD enhanced

Lexicalized grammars each rule specialized for one or more lexical items advantages over non-lexicalized formalisms: –express syntactic preferences that are sensitive to lexical words –control word selection

Syntactic preferences adjuncts Workers [ dumped sacks ] into a bin *Workers dumped [ sacks into a bin ] N-N compound [ hydrogen ion ] exchange *hydrogen [ ion exchange ]

Word selection lexical Nora convened the meeting ?Nora convened the party semantics Peggy solved two puzzles ?Peggy solved two goats world knowledge Mary shelved some books ?Mary shelved some cooks

Lexicalized CFG Motivations : study computational properties common to generative formalisms used in state-of-the-art real-world parsers develop parsing algorithm that can be directly applied to these formalisms

Lexicalized CFG dumped sacks into a bin VP[dump][sack] P[into] N[bin]Det[a] NP[bin] PP[into][bin] V[dump] VP[dump][sack] NP[sack] N[sack]

Lexicalized CFG Context-free grammars with : alphabet V T : –dumped, sacks, into,... delexicalized nonterminals V D : –NP, VP,... nonterminals V N : –NP[sack], VP[dump][sack],...

Lexicalized CFG Delexicalized nonterminals encode : word sense N, V,... grammatical features number, tense,... structural information bar level, subcategorization state,... other constraints distribution, contextual features,...

Lexicalized CFG productions have two forms : –V[dump] dumped –VP[dump][sack] VP[dump][sack] PP[into][bin] lexical elements in lhs inherited from rhs

Lexicalized CFG production is k-lexical : k occurrences of lexical elements in rhs –NP[bin] Det[a] N[bin] is 2-lexical –VP[dump][sack] VP[dump][sack] PP[into][bin] is 4-lexical

LCFG at work 2-lexical CFG –Alshawi 1996 : Head Automata –Eisner 1996: Dependency Grammars –Charniak 1997 : CFG –Collins 1997 : generative model

LCFG at work Probabilistic LCFG G is strongly equivalent to probabilistic grammar G iff 1-2-1 mapping between derivations each direction is a homomorphism derivation probabilities are preserved

LCFG at work From Charniak 1997 to 2-lex CFG : NP S [profits] N NP [profits]ADJ NP [corporate] Pr 1 (corporate | ADJ, NP, profits) Pr 1 (profits | N, NP, profits) Pr 2 ( NP ADJ N | NP, S, profits)

LCFG at work From Collins 1997 (Model #2) to 2-lex CFG : Pr left (NP, IBM | VP, S, bought, left, {NP-C}) N, [IBM] VP S, {NP-C}, left, S [bought] VP S, {}, left, S [bought]

LCFG at work Major Limitation : Cannot capture relations involving lexical items outside actual constituent (cfr. history based models) V[d 0 ] d1d1 NP[d 1 ] d2d2 PP[d 2 ][d 3 ] d0d0 cannot look at d 0 when computing PP attachment NP[d 1 ][d 0 ]

LCFG at work lexicalized context-free parsers that are not LCFG : –Magerman 1995: Shift-Reduce+ –Ratnaparkhi 1997: Shift-Reduce+ –Chelba & Jelinek 1998: Shift-Reduce+ –Hermjakob & Mooney 1997: LR

Related work Other frameworks for the study of lexicalized grammars : Carroll & Weir 1997 : Stochastic Lexicalized Grammars; emphasis on expressiveness Goodman 1997 : Probabilistic Feature Grammars; emphasis on parameter estimation

Standard Parsing standard parsing algorithms (CKY, Earley, LC,...) run on LCFG in time O ( |G | |w | 3 ) for 2-lex CFG (simplest case) |G | grows with |V D | 3 |V T | 2 !! Goal : Get rid of |V T | factors

Standard Parsing: TD Result (to be refined) : Algorithms satisfying the correct-prefix property are unlikely to run on LCFG in time independent of V T

Correct-prefix property Earley, Left-Corner, GLR,... : w S left-to-right reading position

On-line parsing No grammar precompilation (Earley) : Parser G w Output

Standard Parsing: TD Result : On-line parsers with correct-prefix property cannot run in time O ( f(|V D |, |w |) ), for any function f

Off-line parsing Grammar is precompiled (Left-Corner, LR) : Parser G PreComp C(G ) w Output Parser

Standard Parsing: TD Fact : We can simulate a nondeterministic FA M on w in time O ( |M | |w | ) Conjecture : Fix a polynomial p. We cannot simulate M on w in time p( |w | ) unless we spend exponential time in precompiling M

Standard Parsing: TD Assume our conjecture holds true Result : Off-line parsers with correct-prefix property cannot run in time O ( p(|V D |, |w |) ), for any polynomial p, unless we spend exponential time in precompiling G

Standard Parsing: BU Common practice in lexicalized grammar parsing : select productions that are lexically grounded in w parse BU with selected subset of G Problem : Algorithm removes |V T | factors but introduces new |w | factors !!

Standard Parsing: BU Running time is O ( |V D | 3 |w | 5 ) !! ikj B[d 1 ]C[d 2 ] A[d 2 ] d1d1 d2d2 Time charged : i, k, j |w | 3 A, B, C |V D | 3 d 1, d 2 |w | 2

Standard BU : Exhaustive

Standard BU : Pruning

BU enhanced Result : Parsing with 2-lex CFG in time O ( |V D | 3 |w | 4 ) Remark : Result transfers to models in Alshawi 1996, Eisner 1996, Charniak 1997, Collins 1997 Remark : Technique extends to improve parsing of Lexicalized-Tree Adjoining Grammars

Algorithm #1 Basic step in naive BU : j C[d 2 ] d2d2 ik B[d 1 ] d1d1 Idea : Indices d 1 and j can be processed independently A[d 2 ]

Algorithm #1 Step 1 Step 2 i k B[d 1 ] d1d1 A[d 2 ] C[d 2 ] d2d2 A[d 2 ] i d2d2 j i k A[d 2 ] C[d 2 ] d2d2 j i A[d 2 ] C[d 2 ] d2d2 k

BU enhanced Upper bound provided by Algorithm #1 : O (|w | 4 ) Goal : Can we go down to O (|w | 3 ) ?

Spine last week AdvP[week] NP[IBM] NP[Lotus] Lotus IBM S[buy] VP[buy] V[buy] bought The spine of a parse tree is the path from the root to the roots head S[buy] VP[buy] V[buy] bought

The spine projection is the yield of the sub- tree composed by the spine and all its sibling nodes Spine projection last week AdvP[week] S[buy] NP[IBM] S[buy] VP[buy] V[buy]NP[Lotus] Lotusbought IBM AdvP[week] NP[IBM] NP[Lotus] NP[IBM] bought NP[Lotus] AdvP[week]

Split Grammars Split spine projections at head : Problem : how much information do we need to store in order to construct new grammatical spine projections from splits ? ??

Split Grammars Fact : Set of spine projections is a linear context- free language Definition : 2-lex CFG is split if set of spine projections is a regular language Remark : For split grammars, we can recombine splits using finite information

Split Grammars Non-split grammar : unbounded # of dependencies between left and right dependents of head S[d] AdvP[a] S[d]AdvP[b] S 1 [d] linguistically unattested and unlikely AdvP[a] S[d]AdvP[b] S 1 [d]

Split Grammars Split grammar : finite # of dependencies between left and right dependents of lexical head S[d] AdvP[a] S[d]AdvP[b] S[d]

Split Grammars Precompile grammar such that splits are derived separately AdvP[week] NP[IBM] NP[Lotus] S[buy] VP[buy] V[buy] bought AdvP[week] NP[IBM] NP[Lotus] S[buy] r 3 [buy] r 2 [buy] r 1 [buy] bought r 3 [buy] is a split symbol

Split Grammars t : max # of states per spine automaton g : max # of split symbols per spine automaton (g < t ) m : # of delexicalized nonterminals thare are maximal projections

BU enhanced Result : Parsing with split 2-lexical CFG in time O (t 2 g 2 m 2 |w | 3 ) Remark: Models in Alshawi 1996, Charniak 1997 and Collins 1997 are not split

Algorithm #2 Idea : recognize left and right splits separately collect head dependents one split at a time B[d] d d s s d

NP[IBM] bought NP[Lotus] AdvP[week] Algorithm #2

Step 1 B[d 1 ] d1d1 s1s1 k r1r1 d2d2 s2s2 r2r2 Step 2 B[d 1 ] d1d1 s1s1 d2d2 s2s2 r2r2 d1d1 s1s1 d2d2 s2s2 r2r2 i r2r2 d2d2 i s2s2

Algorithm #2 : Exhaustive

Algorithm #2 : Pruning

Related work Cubic time algorithms for lexicalized grammars : Sleator & Temperley 1991 : Link Grammars Eisner 1997 : Bilexical Grammars (improved by transfer of Algorithm #2)

TD enhanced Goal : Introduce TD prediction for 2-lexical CFG parsing, without |V T | factors Remark : Must relax left-to-right parsing (because of previous results)

TD enhanced Result : TD parsing with 2-lex CFG in time O ( |V D | 3 |w | 4 ) Open : O ( |w | 3 ) extension to split grammars

TD enhanced Strongest version of correct-prefix property : S w reading position

Data Structures Prods with lhs A[d] : A[d] X 1 [d 1 ] X 2 [d 2 ] A[d] Y 1 [d 3 ] Y 2 [d 2 ] A[d] Z 1 [d 2 ] Z 2 [d 1 ] d3d3 d2d2 d1d1 Trie for A[d] : d1d1 d2d2

Data Structures Rightmost subsequence recognition by precompiling input w into a deterministic FA : a b a c c b a b b a a c

Algorithm #3 Item representation : i, j indicate extension of A[d] partial analysis A[d] i j k S k indicates rightmost possible position for completion of A[d] analysis

Algorithm #3 : Prediction Step 1 : find rightmost subsequence before k for some A[d 2 ] production Step 2 : make Earley prediction C[d 2 ] B[d 1 ] k i A[d 2 ] k d2d2 d1d1

Conclusions standard parsing techniques are not suitable for processing lexicalized grammars novel algorithms have been introduced using enhanced dynamic programming work to be done : extension to history-based models

The End Many thanks for helpful discussion to : Jason Eisner, Mark-Jan Nederhof

Giorgio Satta University of Padua Parsing Techniques for Lexicalized Context-Free Grammars* * Joint work with : Jason Eisner, Mark-Jan Nederhof.

Similar presentations

Presentation on theme: "Giorgio Satta University of Padua Parsing Techniques for Lexicalized Context-Free Grammars* * Joint work with : Jason Eisner, Mark-Jan Nederhof."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Giorgio Satta University of Padua Parsing Techniques for Lexicalized Context-Free Grammars* * Joint work with : Jason Eisner, Mark-Jan Nederhof.

Similar presentations

Presentation on theme: "Giorgio Satta University of Padua Parsing Techniques for Lexicalized Context-Free Grammars* * Joint work with : Jason Eisner, Mark-Jan Nederhof."— Presentation transcript:

Similar presentations

About project

Feedback