1
Statistical NLP Winter 2009
Lecture 10: Parsing I
Roger Levy
Thanks to Jason Eisner & Dan Klein for slides
2
Why is natural language parsing hard?
As language structure gets more abstract, computing it gets harder
- Document classification
  - finite number of classes
  - fast computation at test time
- Part-of-speech tagging (recovering label sequences)
  - Exponentially many possible tag sequences
  - But exact computation possible in O(n)
- Parsing (recovering labeled trees)
  - Exponentially many, or even infinite, possible trees
  - Exact inference worse than tagging, but still within reach
3
Why is parsing harder than tagging?
- How many trees are there for a given string?
- Imagine a rule VP → VP
  - …∞!
- This is not a problem for inferring availability of structures (why?)
- Nor is this a problem for inferring the most probable structure in a PCFG (why?)
4
Why parsing is harder than tagging II
- Ingredient 1: syntactic category ambiguity
  - Exponentially many category sequences, like tagging
- Ingredient 2: attachment ambiguity
  - Classic case: prepositional-phrase (PP) attachment
  - 1 PP: no ambiguity
  - 2 PPs: some ambiguity
5
Why parsing is harder than tagging III
- 3 PPs: much more attachment ambiguity!
- 5 PPs: 14 trees, 6 PPs: 42 trees, 7 PPs: 132 trees…
6
Why parsing is harder than tagging IV
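- Tree-structure ambiguity grows like the Catalan numbers (Knuth, 1975; Church & Patil, 1982)
- The Catalan numbers are defined with factorials, C_n = (2n)! / ((n+1)! n!), but they grow exponentially, roughly as 4^n / n^(3/2)
- This growth comes on top of the exponential growth associated with sequence label ambiguity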
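As a quick check on the numbers, here is a minimal Python sketch that computes Catalan numbers from the closed form and, independently, counts binary bracketings by brute force. The function names are illustrative and not from the lecture. The values 14, 42, 132 quoted on the previous slide appear as consecutive Catalan numbers; how the index lines up with the number of PPs depends on which constituents are counted, so the sketch does not fix that alignment.

```python
from math import comb


def catalan(n: int) -> int:
    """n-th Catalan number via the closed form C_n = (2n choose n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def count_binary_trees(n_leaves: int) -> int:
    """Count binary bracketings of n_leaves items by trying every split point."""
    if n_leaves == 1:
        return 1
    return sum(
        count_binary_trees(k) * count_binary_trees(n_leaves - k)
        for k in range(1, n_leaves)
    )


# The brute-force count over n leaves agrees with the closed form at n - 1.
for n_leaves in range(1, 9):
    assert count_binary_trees(n_leaves) == catalan(n_leaves - 1)

print([catalan(n) for n in range(1, 8)])
```

The brute-force version recomputes the same sub-counts many times, which previews the point made on the following slides: the many trees share a small set of common subparts.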
7
Why parsing is still tractable
- This all makes parsing look really bad
- But there’s still hope
- Those exponentially many parses are different combinations of common subparts
8
How to parse tractably
- Recall that we did HMM part-of-speech tagging by storing partial results in a trellis
- An HMM is a special type of grammar with essentially two types of rules:
  1. “Category Y can follow category X (with cost π)”
  2. “Category X can be realized as word w (with cost η)”
- The trellis is a graph whose structure reflects its rules
  - Edges between all sequentially adjacent category pairs
9
How to parse tractably II
- But a (weighted) CFG has more complicated rules:
  1. “Category X can rewrite as categories α (with cost π)”
  2. “Preterminal X can be realized as word w (with cost η)”
  - (2 is really a special case of 1)
- A graph is not rich enough to reflect CFG/tree structure
  - Phrases need to be stored as partial results
  - We also need rule combination structure
- We’ll do this with hypergraphs
10
How to parse tractably III
- Hypergraphs are like graphs, but have hyper-edges instead of edges
  - “We observe a DT as word 1 and an NN as word 2.”
  - “Together, these let us infer an NP spanning words 1-2.”
[Figure annotations: “start state allows us to infer each of these” (the DT and the NN); “both of these are needed to infer this” (the NP)]
11
How to parse tractably IV
Hypergraph for Bird shot flies (only partial)
[Figure: hypergraph with a Goal node above layers of nodes spanning words 1-3, words 1-2, and words 2-3]
Grammar:
S → NP VP
VP → V NP
VP → V
NP → N
NP → N N
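One way to make the hypergraph concrete is a minimal Python sketch in which each node is a (category, start, end) triple and each hyper-edge records one head node and the tail nodes that jointly license it. Several things here are assumptions for illustration only: the lexicon (Bird as N; shot and flies as N or V), the `Hyperedge` type, the `START` and `GOAL` markers, and the use of 0-based fencepost positions instead of the word numbers used on the slide. Unary rules are applied only to single words, which is enough for this grammar.

```python
from collections import namedtuple

# A hyper-edge: one head node inferred from all of its tail nodes together.
Hyperedge = namedtuple("Hyperedge", ["head", "tails"])

BINARY = [("S", "NP", "VP"), ("VP", "V", "NP"), ("NP", "N", "N")]
UNARY = [("VP", "V"), ("NP", "N")]
# Assumed lexicon (not given on the slide).
LEXICON = {"Bird": ["N"], "shot": ["N", "V"], "flies": ["N", "V"]}


def build_hypergraph(words):
    n = len(words)
    nodes, edges = set(), []

    def add(head, tails):
        nodes.add(head)
        edges.append(Hyperedge(head, tuple(tails)))

    # Width-1 spans: word observations, then unary rules over preterminals.
    for i, word in enumerate(words):
        for tag in LEXICON[word]:
            add((tag, i, i + 1), ["START"])
        for parent, child in UNARY:
            if (child, i, i + 1) in nodes:
                add((parent, i, i + 1), [(child, i, i + 1)])

    # Wider spans: every binary rule at every split point adds one hyper-edge.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY:
                    if (left, i, k) in nodes and (right, k, j) in nodes:
                        add((parent, i, j), [(left, i, k), (right, k, j)])

    if ("S", 0, n) in nodes:
        add("GOAL", [("S", 0, n)])
    return nodes, edges


nodes, edges = build_hypergraph("Bird shot flies".split())
for edge in edges:
    if edge.head == ("S", 0, 3):
        print(edge.tails, "=>", edge.head)
```

Under these assumptions the S node over the whole sentence has two incoming hyper-edges, one per reading: "Bird" as subject of "shot flies", and "Bird shot" as subject of "flies".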
12
How to parse tractably V
- The nodes in the hypergraph can be thought of as being arranged in a triangle
- For a sentence of length N, this is the upper right triangle of an N×N matrix
- This matrix is called the parse chart
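The triangle can be enumerated directly. The sketch below is an illustration, not code from the lecture: it lists the chart cells for a five-word sentence as fencepost spans (i, j) with i < j, shortest spans first, which is the order a bottom-up parser would fill them in. The function name `chart_cells` is hypothetical.

```python
def chart_cells(n):
    """Yield the spans (i, j), 0 <= i < j <= n, in order of increasing width."""
    for width in range(1, n + 1):
        for i in range(n - width + 1):
            yield (i, i + width)


words = "time flies like an arrow".split()
cells = list(chart_cells(len(words)))

# An N-word sentence has N * (N + 1) / 2 cells: the upper triangle plus diagonal.
assert len(cells) == len(words) * (len(words) + 1) // 2

for i, j in cells:
    print((i, j), " ".join(words[i:j]))
```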
13
How to parse tractably VI
- Before we study examples of parsing, let’s linger on the hypergraph for a moment
- The goal of parsing is to fully interconnect all the evidence (words) and the goal
- This could be done from the bottom up…
- …or from the top down & left to right
- These correspond to different parse strategies
- Today: bottom-up (later: top-down)
14
Bottom-up (CKY) parsing
- Bottom-up is the most straightforward efficient parsing algorithm to implement
- Known as the Cocke-Kasami-Younger (CKY) algorithm
- We’ll illustrate it for the weighted CFG instance
  - Each rule has a weight (negative log-prob) associated with it
  - We’re looking for the “lightest” (lowest-weight or, equivalently, highest-probability) tree T for sentence S
- Implicitly this is Bayes’ rule!
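One way to spell out the Bayes' rule remark is sketched below. It assumes that each rule weight is the negative log of the rule's probability, written here as w(r) = -log P(r), and that a tree's probability is the product of its rule probabilities, as in a PCFG; the notation is introduced for illustration.

```latex
\begin{align*}
T^{*} &= \arg\max_{T} P(T \mid S)
       = \arg\max_{T} \frac{P(S \mid T)\, P(T)}{P(S)}
       = \arg\max_{T:\ \mathrm{yield}(T) = S} P(T) \\
      &= \arg\max_{T:\ \mathrm{yield}(T) = S} \prod_{r \in T} P(r)
       = \arg\min_{T:\ \mathrm{yield}(T) = S} \sum_{r \in T} w(r),
       \qquad w(r) = -\log P(r)
\end{align*}
```

P(S) does not depend on T, and P(S | T) is 1 when the tree's yield is S and 0 otherwise, so maximizing the posterior reduces to finding the lightest tree whose leaves spell out the sentence.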
15
CKY parsing II
Here’s the (partial) grammar we’ll use (weight, then rule):
1 S → NP VP
6 S → Vst NP (imperative verb: “Do the dishes!”)
2 S → S PP
1 VP → V NP
2 VP → VP PP
1 NP → Det N
2 NP → NP PP
3 NP → NP NP
0 PP → P NP
3 NP → time
4 NP → flies
4 VP → flies
3 Vst → time
2 P → like
5 V → like
1 Det → an
8 N → arrow
The sentence we’ll parse (see the ambiguity?):
Time flies like an arrow
16
[Slides 16-38: animation that fills in the CKY chart for “time flies like an arrow” one entry at a time, bottom-up, with the grammar shown alongside. Word boundaries are numbered 0 to 5. The completed chart, listed by span with the category and weight of each entry:]
time (0-1): NP 3, Vst 3
flies (1-2): NP 4, VP 4
like (2-3): P 2, V 5
an (3-4): Det 1
arrow (4-5): N 8
time flies (0-2): NP 10, S 8, S 13
an arrow (3-5): NP 10
like an arrow (2-5): PP 12, VP 16
flies like an arrow (1-5): NP 18, S 21, VP 18
time flies like an arrow (0-5): NP 24, S 22, S 27
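The chart-filling procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lecture's own code: it uses the grammar and sentence from the "CKY parsing II" slide, treats every weight as a cost to be added, and keeps only the lightest weight per category in each cell (so heavier duplicates such as S 13 or S 27 never get stored). The names `BINARY`, `LEXICON`, and `cky` are illustrative.

```python
import math
from collections import defaultdict

# (weight, parent, left child, right child)
BINARY = [
    (1, "S", "NP", "VP"), (6, "S", "Vst", "NP"), (2, "S", "S", "PP"),
    (1, "VP", "V", "NP"), (2, "VP", "VP", "PP"),
    (1, "NP", "Det", "N"), (2, "NP", "NP", "PP"), (3, "NP", "NP", "NP"),
    (0, "PP", "P", "NP"),
]
# word -> list of (weight, category)
LEXICON = {
    "time": [(3, "NP"), (3, "Vst")],
    "flies": [(4, "NP"), (4, "VP")],
    "like": [(2, "P"), (5, "V")],
    "an": [(1, "Det")],
    "arrow": [(8, "N")],
}


def cky(words):
    """Return chart[(i, j)][category] = weight of the lightest such constituent."""
    n = len(words)
    chart = defaultdict(dict)

    # Width-1 spans come straight from the lexicon.
    for i, word in enumerate(words):
        cell = chart[(i, i + 1)]
        for weight, cat in LEXICON[word]:
            cell[cat] = min(weight, cell.get(cat, math.inf))

    # Wider spans: combine two smaller spans at every split point k.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = chart[(i, j)]
            for k in range(i + 1, j):
                left, right = chart[(i, k)], chart[(k, j)]
                for weight, parent, b, c in BINARY:
                    if b in left and c in right:
                        total = weight + left[b] + right[c]
                        if total < cell.get(parent, math.inf):
                            cell[parent] = total
    return chart


chart = cky("time flies like an arrow".split())
print(chart[(0, 5)])
```

For example, the S over the whole sentence gets weight 22 from S → NP VP as 1 + 3 (NP over "time") + 18 (VP over "flies like an arrow").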
39
Follow backpointers …
[Slides 39-43: starting from the S entry over the whole sentence, the tree is recovered top-down by following backpointers, adding one rule per slide:]
S → NP VP
VP → VP PP
PP → P NP
NP → Det N
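Following backpointers amounts to a short recursion. The sketch below is illustrative only: the backpointer table `BACK` is written out by hand from the rule sequence on these slides (S → NP VP, VP → VP PP, PP → P NP, NP → Det N) instead of being produced by a parser, and spans use fencepost positions 0 to 5. Entries without a backpointer are assumed to be single-word entries.

```python
WORDS = "time flies like an arrow".split()

# (category, start, end) -> the two child entries it was built from.
BACK = {
    ("S", 0, 5): (("NP", 0, 1), ("VP", 1, 5)),
    ("VP", 1, 5): (("VP", 1, 2), ("PP", 2, 5)),
    ("PP", 2, 5): (("P", 2, 3), ("NP", 3, 5)),
    ("NP", 3, 5): (("Det", 3, 4), ("N", 4, 5)),
}


def build_tree(node):
    """Recover a bracketed tree by following backpointers from node."""
    cat, i, j = node
    if node not in BACK:  # single-word entry: realized directly as the word
        return f"({cat} {WORDS[i]})"
    left, right = BACK[node]
    return f"({cat} {build_tree(left)} {build_tree(right)})"


tree = build_tree(("S", 0, 5))
print(tree)
```

In a full parser, the table would be filled in at the same moment a cell's best weight is updated, so each entry remembers which rule and split point produced it.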
44
Which entries do we need?
[Slides 44-45: the completed chart for “time flies like an arrow”, shown again]
46
Not worth keeping …
- S 13 over “time flies”: the same span already holds a lighter S (S 8)
47
… since it just breeds worse options
- Extending S 13 with S → S PP gives 2 + 13 + 12 = 27 for the whole sentence; extending S 8 the same way gives 2 + 8 + 12 = 22
48
Keep only best-in-class!
- An entry beaten by a lighter entry of the same category over the same span is “inferior stock”
49
Keep only best-in-class!
(and backpointers so you can recover parse)
The pruned chart:
time (0-1): NP 3, Vst 3
flies (1-2): NP 4, VP 4
like (2-3): P 2, V 5
an (3-4): Det 1
arrow (4-5): N 8
time flies (0-2): NP 10, S 8
an arrow (3-5): NP 10
like an arrow (2-5): PP 12, VP 16
flies like an arrow (1-5): NP 18, S 21, VP 18
time flies like an arrow (0-5): NP 24, S 22
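The "best-in-class" rule can be sketched as a small Python function over the entries of one cell. This is an illustration under assumptions: entries are written as (category, weight, backpointer) triples, the backpointer is reduced to a rule string for readability, and the function name `best_in_class` is hypothetical. The example cell is the span "time flies" from the chart.

```python
def best_in_class(entries):
    """Keep, for each category, only the lightest (weight, backpointer) pair."""
    best = {}
    for cat, weight, back in entries:
        if cat not in best or weight < best[cat][0]:
            best[cat] = (weight, back)
    return best


# Entries over "time flies" before pruning.
cell_0_2 = [
    ("NP", 10, "NP -> NP NP"),
    ("S", 8, "S -> NP VP"),
    ("S", 13, "S -> Vst NP"),
]

pruned = best_in_class(cell_0_2)
print(pruned)
```

Pruning is safe for finding the lightest tree because weights only add up: any tree built on the heavier S could be made lighter by swapping in the lighter S over the same span.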
50
Computational complexity of parsing
- This approach has good space complexity: O(GN^2), where G is the # categories in the grammar
- What is the time complexity of the algorithm?
  - It’s cubic in N…why?
  - What about time complexity in G?
- First, a clarification is in order
  - CFG rules can have right-hand sides of arbitrary length: X → α
  - But CKY works only w/ right-hand sides of max length 2
  - So we need to convert the CFG for use with CKY
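A small counting sketch suggests where the cubic dependence on N comes from. It assumes the usual bottom-up loop structure (span width, start position, split point), which the slides do not spell out, and counts how many (start, split, end) combinations those loops visit; the function name is illustrative.

```python
from math import comb


def count_split_points(n):
    """Count the (i, k, j) triples with 0 <= i < k < j <= n visited by CKY's loops."""
    count = 0
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                count += 1
    return count


# Choosing i < k < j among n + 1 boundaries: (n + 1 choose 3), which grows as n^3 / 6.
for n in range(1, 30):
    assert count_split_points(n) == comb(n + 1, 3)

print(count_split_points(5))
```

Each visited triple then costs one pass over the grammar's binary rules, which is where the dependence on the grammar enters.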
51
Computational complexity II
- Any CFG can be transformed into a new CFG whose rules are at most binary-branching (|α| ≤ 2)
  - (Look up Chomsky normal form in the book for an example)
- This transformation is reversible with no loss of information
- It’s also possible to similarly transform weighted CFGs
- This makes CKY possible, and its running time is cubic in G
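A minimal sketch of one such transformation, right-binarization of weighted rules, is given below. It is an illustration and not necessarily the construction in the book: the intermediate symbol naming scheme (`@LHS|remaining_children`), the function name `binarize`, and the example rule VP → V NP PP PP with weight 4 are all made up for the example. The original rule's weight goes on the first new rule and the intermediate rules get weight 0, so the total weight of any derivation is unchanged.

```python
def binarize(rules):
    """Right-binarize weighted rules given as (weight, lhs, rhs_tuple)."""
    out = []
    for weight, lhs, rhs in rules:
        parent, w = lhs, weight
        while len(rhs) > 2:
            # The intermediate symbol records which children are still to come,
            # so the original rule can be read back off (reversibility).
            new_sym = f"@{lhs}|{'_'.join(rhs[1:])}"
            out.append((w, parent, (rhs[0], new_sym)))
            parent, rhs, w = new_sym, rhs[1:], 0
        out.append((w, parent, tuple(rhs)))
    return out


# Hypothetical four-child rule, not part of the lecture's grammar.
binary_rules = binarize([(4, "VP", ("V", "NP", "PP", "PP"))])
for rule in binary_rules:
    print(rule)
```

Rules with one or two children pass through unchanged; handling unary rules inside CKY itself is a separate matter that this sketch does not address.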