Published by Brandon Douglas. Modified over 2 years ago.

1
Parsing Techniques for Lexicalized Context-Free Grammars*
Giorgio Satta, University of Padua
* Joint work with Jason Eisner and Mark-Jan Nederhof

2
Summary
Part I: Lexicalized Context-Free Grammars
– motivations and definition
– relation with other formalisms
Part II: standard parsing
– TD techniques
– BU techniques
Part III: novel algorithms
– BU enhanced
– TD enhanced

3
Lexicalized grammars
Each rule is specialized for one or more lexical items.
Advantages over non-lexicalized formalisms:
– express syntactic preferences that are sensitive to lexical words
– control word selection

4
Syntactic preferences
Adjuncts:
– Workers [ dumped sacks ] into a bin
– *Workers dumped [ sacks into a bin ]
N-N compounds:
– [ hydrogen ion ] exchange
– *hydrogen [ ion exchange ]

5
Word selection
Lexical:
– Nora convened the meeting
– ?Nora convened the party
Semantics:
– Peggy solved two puzzles
– ?Peggy solved two goats
World knowledge:
– Mary shelved some books
– ?Mary shelved some cooks

6
Lexicalized CFG
Motivations:
– study computational properties common to generative formalisms used in state-of-the-art real-world parsers
– develop parsing algorithms that can be directly applied to these formalisms

7
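Lexicalized CFG
[Figure: parse tree for "dumped sacks into a bin". The root VP[dump][sack] expands to VP[dump][sack] and PP[into][bin]; the inner VP[dump][sack] expands to V[dump] and NP[sack], with NP[sack] over N[sack]; PP[into][bin] expands to P[into] and NP[bin], with NP[bin] over Det[a] and N[bin].]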

8
Lexicalized CFG
Context-free grammars with:
– alphabet $V_T$: dumped, sacks, into, ...
– delexicalized nonterminals $V_D$: NP, VP, ...
– nonterminals $V_N$: NP[sack], VP[dump][sack], ...

9
Lexicalized CFG
Delexicalized nonterminals encode:
– word sense: N, V, ...
– grammatical features: number, tense, ...
– structural information: bar level, subcategorization state, ...
– other constraints: distribution, contextual features, ...

10
Lexicalized CFG
Productions have two forms:
– V[dump] → dumped
– VP[dump][sack] → VP[dump][sack] PP[into][bin]
Lexical elements in the lhs are inherited from the rhs.

11
Lexicalized CFG
A production is k-lexical if it has k occurrences of lexical elements in its rhs:
– NP[bin] → Det[a] N[bin] is 2-lexical
– VP[dump][sack] → VP[dump][sack] PP[into][bin] is 4-lexical
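The k-lexical definition above can be checked mechanically. The following is a minimal sketch, assuming productions are written as plain strings with `->` as the arrow and lexical elements in square brackets; the function name `lexical_degree` is hypothetical and not part of the talk. How a bare terminal on the rhs (as in V[dump] → dumped) should be counted is left open here.

```python
import re


def lexical_degree(production: str) -> int:
    """Count bracketed lexical elements on the right-hand side of 'LHS -> RHS'.

    Assumption: lexical elements are exactly the substrings of the form [word].
    """
    _, rhs = production.split("->")
    return len(re.findall(r"\[[^\]]+\]", rhs))


if __name__ == "__main__":
    print(lexical_degree("NP[bin] -> Det[a] N[bin]"))
    print(lexical_degree("VP[dump][sack] -> VP[dump][sack] PP[into][bin]"))
```

The two calls reproduce the 2-lexical and 4-lexical examples from the slide.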

12
LCFG at work
2-lexical CFG:
– Alshawi 1996: Head Automata
– Eisner 1996: Dependency Grammars
– Charniak 1997: CFG
– Collins 1997: generative model

13
LCFG at work
A probabilistic LCFG G is strongly equivalent to a probabilistic grammar G' iff:
– there is a mapping between derivations
– each direction is a homomorphism
– derivation probabilities are preserved

14
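LCFG at work
From Charniak 1997 to 2-lex CFG:
[Figure: tree with root $NP_S[\text{profits}]$ and children $ADJ_{NP}[\text{corporate}]$ and $N_{NP}[\text{profits}]$]
– $\Pr_1(\text{corporate} \mid ADJ, NP, \text{profits})$
– $\Pr_1(\text{profits} \mid N, NP, \text{profits})$
– $\Pr_2(NP \to ADJ\ N \mid NP, S, \text{profits})$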

15
LCFG at work
From Collins 1997 (Model #2) to 2-lex CFG:
$\Pr_{left}(NP, \text{IBM} \mid VP, S, \text{bought}, left, \{NP\text{-}C\})$
[Figure: tree fragment with lexicalized nodes labeled "N, [IBM]", "VP S, {NP-C}, left, S [bought]" and "VP S, {}, left, S [bought]"]

16
LCFG at work
Major limitation: cannot capture relations involving lexical items outside the actual constituent (cf. history-based models).
[Figure: tree over the words $d_0$, $d_1$, $d_2$ with nodes $V[d_0]$, $NP[d_1]$, $PP[d_2][d_3]$ and $NP[d_1][d_0]$; annotation: cannot look at $d_0$ when computing PP attachment]

17
LCFG at work
Lexicalized context-free parsers that are not LCFG:
– Magerman 1995: Shift-Reduce+
– Ratnaparkhi 1997: Shift-Reduce+
– Chelba & Jelinek 1998: Shift-Reduce+
– Hermjakob & Mooney 1997: LR

18
Related work
Other frameworks for the study of lexicalized grammars:
– Carroll & Weir 1997: Stochastic Lexicalized Grammars; emphasis on expressiveness
– Goodman 1997: Probabilistic Feature Grammars; emphasis on parameter estimation

19
Summary
Part I: Lexicalized Context-Free Grammars
– motivations and definition
– relation with other formalisms
Part II: standard parsing
– TD techniques
– BU techniques
Part III: novel algorithms
– BU enhanced
– TD enhanced

20
Standard Parsing
Standard parsing algorithms (CKY, Earley, LC, ...) run on LCFG in time $O(|G| \cdot |w|^3)$.
For 2-lex CFG (simplest case), $|G|$ grows with $|V_D|^3 \cdot |V_T|^2$ !!
Goal: get rid of the $|V_T|$ factors.
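To see where the $|V_D|^3 \cdot |V_T|^2$ growth comes from, one can count the binary productions of a 2-lex CFG. This is a rough counting sketch, assuming every binary production has the form $A[d_2] \to B[d_1]\,C[d_2]$ or $A[d_1] \to B[d_1]\,C[d_2]$ and ignoring lexical productions:

```latex
\begin{align*}
\#\text{productions} &\le 2 \cdot |V_D|^3 \cdot |V_T|^2
  && \text{(choices of } A, B, C \in V_D \text{ and } d_1, d_2 \in V_T\text{)} \\
|G| &= O\bigl(|V_D|^3 \cdot |V_T|^2\bigr)
  && \text{(each production has constant size)} \\
O\bigl(|G| \cdot |w|^3\bigr) &= O\bigl(|V_D|^3 \cdot |V_T|^2 \cdot |w|^3\bigr)
\end{align*}
```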

21
Standard Parsing: TD
Result (to be refined): algorithms satisfying the correct-prefix property are unlikely to run on LCFG in time independent of $V_T$.

22
Correct-prefix property
Earley, Left-Corner, GLR, ...:
[Figure: partial derivation from S over the input w, with the left-to-right reading position marked]

23
On-line parsing
No grammar precompilation (Earley):
[Figure: G and w are fed to the Parser, which produces the Output]

24
Standard Parsing: TD
Result: on-line parsers with the correct-prefix property cannot run in time $O(f(|V_D|, |w|))$, for any function $f$.

25
Off-line parsing
Grammar is precompiled (Left-Corner, LR):
[Figure: G is fed to PreComp, which produces C(G); C(G) and w are fed to the Parser, which produces the Output]

26
Standard Parsing: TD
Fact: we can simulate a nondeterministic FA M on w in time $O(|M| \cdot |w|)$.
Conjecture: fix a polynomial $p$. We cannot simulate M on w in time $p(|w|)$ unless we spend exponential time in precompiling M.
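The fact about simulating a nondeterministic FA can be illustrated with the usual state-set simulation. This is a minimal sketch, assuming an automaton without epsilon transitions given as a dictionary from (state, symbol) pairs to sets of states; the name `nfa_accepts` and the toy automaton are illustrative only. Each input symbol touches every transition at most once, which gives the $O(|M| \cdot |w|)$ bound.

```python
def nfa_accepts(transitions, start, finals, w):
    """Simulate an NFA (no epsilon moves) on w by tracking the set of reachable states."""
    current = {start}
    for symbol in w:
        # Follow every transition on `symbol` leaving a currently reachable state.
        current = {
            target
            for state in current
            for target in transitions.get((state, symbol), ())
        }
        if not current:
            return False  # no live state left, w cannot be accepted
    return bool(current & finals)


# Toy automaton (assumed for illustration): accepts strings over {a, b} ending in "ab".
ENDS_IN_AB = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}

if __name__ == "__main__":
    print(nfa_accepts(ENDS_IN_AB, 0, {2}, "aab"))
    print(nfa_accepts(ENDS_IN_AB, 0, {2}, "aba"))
```

No precompilation of M is involved here, so the running time depends on $|M|$; the conjecture on the slide concerns removing that dependence.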

27
Standard Parsing: TD
Assume our conjecture holds true.
Result: off-line parsers with the correct-prefix property cannot run in time $O(p(|V_D|, |w|))$, for any polynomial $p$, unless we spend exponential time in precompiling G.

28
Standard Parsing: BU
Common practice in lexicalized grammar parsing:
– select productions that are lexically grounded in w
– parse BU with the selected subset of G
Problem: the algorithm removes $|V_T|$ factors but introduces new $|w|$ factors !!

29
Standard Parsing: BU
Running time is $O(|V_D|^3 \cdot |w|^5)$ !!
[Figure: $B[d_1]$ over span i..k and $C[d_2]$ over span k..j combine into $A[d_2]$ over span i..j]
Time charged:
– i, k, j: $|w|^3$
– A, B, C: $|V_D|^3$
– $d_1$, $d_2$: $|w|^2$
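The cost accounting above can be made concrete with a naive bottom-up recognizer. This is a simplified sketch rather than the parser from the talk: it assumes one lexical head per constituent, no unary productions, and a hypothetical encoding of binary productions as tuples `(A, B, C, side, head_word, dep_word)`, where `side` says whether the head comes from the left ("L") or right ("R") child. The toy grammar is loosely based on the "dumped sacks into a bin" example. The nested loops over i, k, j, the two head positions and the three nonterminals mirror the $|w|^5 \cdot |V_D|^3$ count.

```python
from itertools import product

NONTERMINALS = ["V", "NP", "P", "Det", "N", "PP", "VP"]
LEXICAL = {("V", "dumped"), ("NP", "sacks"), ("P", "into"), ("Det", "a"), ("N", "bin")}
BINARY = {
    ("VP", "V", "NP", "L", "dumped", "sacks"),
    ("NP", "Det", "N", "R", "bin", "a"),
    ("PP", "P", "NP", "L", "into", "bin"),
    ("VP", "VP", "PP", "L", "dumped", "into"),
}


def naive_bu_recognize(words, lexical, binary, nonterminals, goal):
    """Naive CKY-style recognition; chart items are (A, i, j, h) with head position h."""
    n = len(words)
    chart = set()
    for i, word in enumerate(words):
        for A in nonterminals:
            if (A, word) in lexical:
                chart.add((A, i, i + 1, i))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):            # split point
                for h1 in range(i, k):           # head of the left child
                    for h2 in range(k, j):       # head of the right child
                        for A, B, C in product(nonterminals, repeat=3):
                            if (B, i, k, h1) not in chart or (C, k, j, h2) not in chart:
                                continue
                            if (A, B, C, "L", words[h1], words[h2]) in binary:
                                chart.add((A, i, j, h1))
                            if (A, B, C, "R", words[h2], words[h1]) in binary:
                                chart.add((A, i, j, h2))
    return any((goal, 0, n, h) in chart for h in range(n))


if __name__ == "__main__":
    sentence = "dumped sacks into a bin".split()
    print(naive_bu_recognize(sentence, LEXICAL, BINARY, NONTERMINALS, "VP"))
```

Because productions are looked up by the words actually present in the input, no $|V_T|$ factor appears, but both head positions are free indices of the innermost combination step.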

30
Standard BU : Exhaustive

31
Standard BU : Pruning

32
Summary
Part I: Lexicalized Context-Free Grammars
– motivations and definition
– relation with other formalisms
Part II: standard parsing
– TD techniques
– BU techniques
Part III: novel algorithms
– BU enhanced
– TD enhanced

33
BU enhanced
Result: parsing with 2-lex CFG in time $O(|V_D|^3 \cdot |w|^4)$.
Remark: the result transfers to the models in Alshawi 1996, Eisner 1996, Charniak 1997, Collins 1997.
Remark: the technique extends to improve parsing of Lexicalized Tree-Adjoining Grammars.

34
Algorithm #1
Basic step in naive BU:
[Figure: $B[d_1]$ over span i..k and $C[d_2]$ over span k..j combine into $A[d_2]$ over span i..j]
Idea: indices $d_1$ and j can be processed independently.

35
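Algorithm #1
[Figure: Step 1 combines $B[d_1]$ over span i..k with a production for $A[d_2]$ and $C[d_2]$, giving an intermediate item over i..k that records $A[d_2]$, $C[d_2]$ and $d_2$ but no longer $d_1$. Step 2 combines that intermediate item with $C[d_2]$ over span k..j, giving $A[d_2]$ over span i..j.]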
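The two-step idea can be sketched as follows. This is an illustrative reconstruction under simplifying assumptions (one lexical head per constituent, no unary productions, binary productions encoded as hypothetical tuples `(A, B, C, side, head_word, dep_word)`), not the authors' implementation. Step 1 attaches a finished dependent to every possible head position and forgets the dependent's own head; Step 2 then combines the resulting half-item with the head child. Neither step ranges over more than four input positions at once.

```python
from itertools import product

NONTERMINALS = ["V", "NP", "P", "Det", "N", "PP", "VP"]
LEXICAL = {("V", "dumped"), ("NP", "sacks"), ("P", "into"), ("Det", "a"), ("N", "bin")}
BINARY = {
    ("VP", "V", "NP", "L", "dumped", "sacks"),
    ("NP", "Det", "N", "R", "bin", "a"),
    ("PP", "P", "NP", "L", "into", "bin"),
    ("VP", "VP", "PP", "L", "dumped", "into"),
}


def recognize_n4(words, lexical, binary, nonterminals, goal):
    n = len(words)
    chart = set()       # (A, i, j, h): A spans i..j with head position h
    right_half = set()  # (A, C, h2, i, k): a dependent over i..k awaits head child C[h2] on its right
    left_half = set()   # (A, B, h1, k, j): a dependent over k..j awaits head child B[h1] on its left

    def step1(i, j):
        """Attach each finished item over i..j as a dependent; its own head h is dropped."""
        for h in range(i, j):
            for X in nonterminals:
                if (X, i, j, h) not in chart:
                    continue
                for A, Y in product(nonterminals, repeat=2):
                    for h2 in range(j, n):   # X is the left child, head lies to the right
                        if (A, X, Y, "R", words[h2], words[h]) in binary:
                            right_half.add((A, Y, h2, i, j))
                    for h1 in range(0, i):   # X is the right child, head lies to the left
                        if (A, Y, X, "L", words[h1], words[h]) in binary:
                            left_half.add((A, Y, h1, i, j))

    for width in range(1, n + 1):
        for i in range(n - width + 1):
            j = i + width
            if width == 1:
                for A in nonterminals:
                    if (A, words[i]) in lexical:
                        chart.add((A, i, j, i))
            # Step 2: join a half-item with the head child on the other side of k.
            for k in range(i + 1, j):
                for A, X in product(nonterminals, repeat=2):
                    for h2 in range(k, j):
                        if (A, X, h2, i, k) in right_half and (X, k, j, h2) in chart:
                            chart.add((A, i, j, h2))
                    for h1 in range(i, k):
                        if (A, X, h1, k, j) in left_half and (X, i, k, h1) in chart:
                            chart.add((A, i, j, h1))
            step1(i, j)  # span i..j is now complete, so its half-items can be built
    return any((goal, 0, n, h) in chart for h in range(n))


if __name__ == "__main__":
    print(recognize_n4("dumped sacks into a bin".split(), LEXICAL, BINARY, NONTERMINALS, "VP"))
```

In this sketch Step 1 costs $|w|^4 \cdot |V_D|^3$ overall (two span ends, two head positions, three nonterminals) and Step 2 costs $|w|^4 \cdot |V_D|^2$, in line with the $O(|V_D|^3 \cdot |w|^4)$ bound stated for Algorithm #1.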

36
BU enhanced
Upper bound provided by Algorithm #1: $O(|w|^4)$.
Goal: can we go down to $O(|w|^3)$?

37
Spine
The spine of a parse tree is the path from the root to the root's head.
[Figure: parse tree of "IBM bought Lotus last week" with nodes S[buy], NP[IBM], VP[buy], V[buy], NP[Lotus] and AdvP[week]; its spine is S[buy], VP[buy], V[buy], bought]

38
Spine projection
The spine projection is the yield of the subtree composed of the spine and all its sibling nodes.
[Figure: parse tree of "IBM bought Lotus last week" with nodes S[buy], S[buy], NP[IBM], VP[buy], V[buy], NP[Lotus] and AdvP[week]; its spine projection is NP[IBM] bought NP[Lotus] AdvP[week]]
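Both definitions are easy to compute on an explicit tree. The sketch below assumes a hypothetical tree encoding in which an internal node is a tuple `(label, children, head_index)` and a leaf is a plain string; the binarized shape of the example tree is also an assumption, chosen so that the result matches the projection shown on the slide.

```python
def label_of(node):
    """Label of an internal node, or the word itself for a leaf."""
    return node if isinstance(node, str) else node[0]


def spine(tree):
    """Labels on the path from the root down to the root's head word."""
    path = []
    node = tree
    while not isinstance(node, str):
        label, children, head = node
        path.append(label)
        node = children[head]
    path.append(node)
    return path


def spine_projection(tree):
    """Left-to-right sequence made of the head word and the siblings of spine nodes."""
    if isinstance(tree, str):
        return [tree]
    _, children, head = tree
    left = [label_of(child) for child in children[:head]]
    right = [label_of(child) for child in children[head + 1:]]
    return left + spine_projection(children[head]) + right


# Assumed binarized tree for "IBM bought Lotus last week".
V = ("V[buy]", ["bought"], 0)
VP = ("VP[buy]", [V, ("NP[Lotus]", ["Lotus"], 0)], 0)
S_INNER = ("S[buy]", [("NP[IBM]", ["IBM"], 0), VP], 1)
TREE = ("S[buy]", [S_INNER, ("AdvP[week]", ["last", "week"], 1)], 0)

if __name__ == "__main__":
    print(spine(TREE))
    print(spine_projection(TREE))
```

Dependents to the left of the head end up before the head word and dependents to the right after it, which is what makes splitting the projection at the head meaningful.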

39
Split Grammars
Split spine projections at the head.
Problem: how much information do we need to store in order to construct new grammatical spine projections from splits?

40
Split Grammars
Fact: the set of spine projections is a linear context-free language.
Definition: a 2-lex CFG is split if its set of spine projections is a regular language.
Remark: for split grammars, we can recombine splits using finite information.

41
Split Grammars
Non-split grammar: unbounded number of dependencies between left and right dependents of the head; linguistically unattested and unlikely.
[Figure: spine with nodes $S[d]$ and $S_1[d]$, taking dependents AdvP[a] and AdvP[b]]

42
Split Grammars
Split grammar: finite number of dependencies between left and right dependents of the lexical head.
[Figure: spine of $S[d]$ nodes, taking dependents AdvP[a] and AdvP[b]]

43
Split Grammars
Precompile the grammar so that splits are derived separately.
[Figure: the tree with spine S[buy], VP[buy], V[buy], bought and dependents AdvP[week], NP[IBM], NP[Lotus] is transformed into a tree with spine S[buy], $r_3$[buy], $r_2$[buy], $r_1$[buy], bought and the same dependents]
$r_3$[buy] is a split symbol.

44
Split Grammars
– t: max number of states per spine automaton
– g: max number of split symbols per spine automaton (g < t)
– m: number of delexicalized nonterminals that are maximal projections

45
BU enhanced
Result: parsing with split 2-lexical CFG in time $O(t^2 g^2 m^2 |w|^3)$.
Remark: the models in Alshawi 1996, Charniak 1997 and Collins 1997 are not split.

46
Algorithm #2
Idea:
– recognize left and right splits separately
– collect head dependents one split at a time
[Figure: items for $B[d]$ with head d and split symbol s]

47
Algorithm #2
[Figure: spine projection NP[IBM] bought NP[Lotus] AdvP[week]]

48
[Figure: Step 1 and Step 2 of Algorithm #2, drawn over items for $B[d_1]$ with heads $d_1$, $d_2$, split symbols $s_1$, $s_2$, $r_1$, $r_2$ and positions i, k]

49
Algorithm #2 : Exhaustive

50
Algorithm #2 : Pruning

51
Related work
Cubic-time algorithms for lexicalized grammars:
– Sleator & Temperley 1991: Link Grammars
– Eisner 1997: Bilexical Grammars (improved by transfer of Algorithm #2)

52
TD enhanced
Goal: introduce TD prediction for 2-lexical CFG parsing, without $|V_T|$ factors.
Remark: must relax left-to-right parsing (because of the previous results).

53
TD enhanced
Result: TD parsing with 2-lex CFG in time $O(|V_D|^3 \cdot |w|^4)$.
Open: $O(|w|^3)$ extension to split grammars.

54
TD enhanced
Strongest version of the correct-prefix property:
[Figure: partial derivation from S over the input w, with the reading position marked]

55
Data Structures
Productions with lhs $A[d]$:
– $A[d] \to X_1[d_1]\ X_2[d_2]$
– $A[d] \to Y_1[d_3]\ Y_2[d_2]$
– $A[d] \to Z_1[d_2]\ Z_2[d_1]$
[Figure: trie for $A[d]$, with arcs labeled $d_1$, $d_2$, $d_3$]

56
Data Structures
Rightmost subsequence recognition by precompiling the input w into a deterministic FA.
[Figure: deterministic FA built from the input a b a c c b a b b a a c]
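One way to read the precompilation step is as a table that records, for every position k and every symbol c, the rightmost occurrence of c strictly before k; this behaves like a deterministic automaton that reads a pattern right to left. The sketch below follows that reading, which is an assumption about the slide's construction, and the function names are hypothetical. After an $O(|w| \cdot |\Sigma|)$ precompilation, the rightmost embedding of a pattern x as a subsequence of w before k is found in $O(|x|)$ steps.

```python
def build_prev_table(w):
    """table[k][c] = largest position p < k with w[p] == c (the key is absent if none)."""
    table = [{}]
    for symbol in w:
        row = dict(table[-1])          # copy the row for the previous position
        row[symbol] = len(table) - 1   # position of the symbol just read
        table.append(row)
    return table


def rightmost_subsequence(table, x, k):
    """Positions of the rightmost embedding of x as a subsequence of w[0:k], or None."""
    positions = []
    pos = k
    for symbol in reversed(x):
        pos = table[pos].get(symbol, -1)
        if pos < 0:
            return None
        positions.append(pos)
    return positions[::-1]


if __name__ == "__main__":
    w = "abaccbabbaac"  # the input string shown on the slide
    table = build_prev_table(w)
    print(rightmost_subsequence(table, "ab", len(w)))
    print(rightmost_subsequence(table, "ca", 3))
```

Matching greedily from the right is what makes the embedding rightmost: each symbol is placed as late as the symbols after it allow.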

57
Algorithm #3
Item representation:
– i, j indicate the extension of the partial analysis of $A[d]$
– k indicates the rightmost possible position for completion of the $A[d]$ analysis
[Figure: $A[d]$ spanning i..j below S, with position k to its right]

58
Algorithm #3: Prediction
Step 1: find the rightmost subsequence before k for some $A[d_2]$ production.
Step 2: make the Earley prediction.
[Figure: $A[d_2]$ with children $B[d_1]$ and $C[d_2]$, positions i and k, heads $d_1$ and $d_2$]

59
Conclusions
– standard parsing techniques are not suitable for processing lexicalized grammars
– novel algorithms have been introduced using enhanced dynamic programming
– work to be done: extension to history-based models

60
The End
Many thanks for helpful discussion to: Jason Eisner, Mark-Jan Nederhof
