
1 Natural Language Processing Lecture 11 Efficient Parsing Reading: James Allen NLU (Chapter 6)



2 Human Preference in Parsing
The parsing techniques seen so far depend on a complete search, but humans seem to parse more deterministically.
However, they may fall into a garden path: The raft floated down the river sank

3 Human Preference in Parsing
Some of the principles that people appear to use to choose the correct interpretation are:
– Minimal Attachment
– Right Association
– Lexical Preferences

4 Minimal Attachment

5 The M.A. Principle May Cause Misparsing
1. We painted all the walls with cracks: the PP tends to attach to the VP rather than to the NP
2. The horse [that was] raced past the barn fell: the reduced relative clause introduces more nodes, so "raced" is taken as the main verb, but this analysis is rejected when "fell" is seen

6 Right Association (or Late Closure)
George said that Henry left in his car
I thought it would rain yesterday

7 Right Association (or Late Closure)

8 Lexical Preference
The M.A. and R.A. principles may conflict:
The man kept the dog in the house
– R.A. suggests attaching the PP to the most recent constituent, the NP "the dog"
– M.A. suggests attaching the PP to the VP, since that adds fewer nodes
Should M.A. be given more priority?

9 Lexical Preference
1. I wanted the dog in the house (the PP preferentially attaches to the NP)
2. I kept the dog in the house (the PP preferentially attaches to the VP)
3. I put the dog in the house (the verb requires the PP, attached to the VP)
So lexical preference (L.P.) overrides M.A. and R.A.

10 Uncertainty in Shift-Reduce Parsers
– Postponing decisions (⇒ breadth-first search), or
– Coding all possibilities into a parse table
The grammar needs to be unambiguous; no unambiguous grammar exists for natural language, but the technique can be extended.
Consider the following grammar:
S → NP VP
NP → ART N
VP → AUX V NP
VP → V NP

11 Transition Graph

12 Parse Table (Oracle)

13 Shift-Reduce Parsing
A class of parsers with the following principles:
– Parsing is done bottom-up, reducing the input to the grammar's start symbol
– The parser builds a rightmost derivation of the input, in reverse
– The parsing algorithm simulates the operation of a PDA
– A prefix of the sentential form is kept on the stack
– Two types of operation:
– Shift: push the next input symbol onto the stack
– Reduce: pop the RHS of a grammar rule off the stack and push the corresponding LHS non-terminal
– The parser is usually deterministic, with no backtracking
– Extremely efficient, operating in linear time, O(n)
– But it can be constructed only for a limited class of CFGs
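The principles above can be sketched in a few lines of Python. This is a minimal illustration on an assumed toy grammar, not the parser from the slides; it reduces greedily whenever the top of the stack matches a rule's RHS, which is exactly the blind determinism that leads to the conflicts discussed later.

```python
# A minimal deterministic shift-reduce parser (sketch, toy grammar assumed).
GRAMMAR = [
    ("S",  ("NP", "VP")),
    ("NP", ("art", "n")),
    ("VP", ("v", "NP")),
]

def shift_reduce(tokens):
    stack = []
    buffer = list(tokens)
    while True:
        # Reduce: if the top of the stack matches some RHS, replace it by the LHS.
        for lhs, rhs in GRAMMAR:
            if tuple(stack[-len(rhs):]) == rhs:
                del stack[-len(rhs):]      # pop the RHS ...
                stack.append(lhs)          # ... and push the LHS non-terminal
                break
        else:
            if not buffer:
                break
            stack.append(buffer.pop(0))    # Shift the next input symbol
    return stack

# "the dog chased a cat" as POS tags:
print(shift_reduce(["art", "n", "v", "art", "n"]))   # -> ['S']
```

A successful parse reduces the whole input to `['S']`; greedy reduction like this misparses as soon as the grammar allows a shift/reduce conflict.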

14 LR Parsing
General principles:
– Use sets of "dotted" grammar rules to reflect the state of the parser: which constituents we have constructed so far, and which constituents we predict next
– Pre-compile the grammar into a finite collection of sets of "dotted" rules
– Use these sets to capture the state of the parser during parsing
The parser is a deterministic shift-reduce parser.
Developed by Knuth in 1965 as a framework for compiling programming languages.

15 LR Parsing Algorithm
Performs shift and reduce parsing actions on the stack, and changes state with each operation.
Driven by a pre-compiled parsing table that has two parts:
– The action table specifies the next shift or reduce parsing operation
– The goto table specifies which state to transfer to after a reduction
The stack stores a string of the form S0 X1 S1 X2 … Xm Sm, where the Si are parser states and the Xi are grammar symbols.
At each step the parser performs one of the following operations:
– Shift s: push the current input symbol Xi onto the stack, followed by the new state s
– Reduce i: reduce the stack according to rule i of the grammar
– Reject: reject the input as ungrammatical and signal an error
– Accept: accept the input as grammatical and halt
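The driver loop above can be sketched as follows. The tiny grammar S → x S | y, its state numbers, and the table entries are a hand-worked example of my own, not the lecture's grammar; they exist only to make the loop runnable.

```python
# Table-driven LR driver loop (sketch; tiny hand-built table assumed).
RULES = {1: ("S", 2), 2: ("S", 1)}    # rule no. -> (LHS, length of RHS)

ACTION = {                            # (state, lookahead) -> operation
    (0, "x"): ("shift", 1), (0, "y"): ("shift", 2),
    (1, "x"): ("shift", 1), (1, "y"): ("shift", 2),
    (2, "$"): ("reduce", 2),          # S -> y
    (4, "$"): ("reduce", 1),          # S -> x S
    (3, "$"): ("accept", None),
}
GOTO = {(0, "S"): 3, (1, "S"): 4}     # (state, LHS) -> state after a reduction

def lr_parse(tokens):
    stack = [0]                       # stack holds S0 X1 S1 ... Xm Sm
    toks = list(tokens) + ["$"]
    while True:
        op, arg = ACTION.get((stack[-1], toks[0]), ("reject", None))
        if op == "shift":
            stack += [toks.pop(0), arg]          # push symbol, then new state
        elif op == "reduce":
            lhs, n = RULES[arg]
            del stack[-2 * n:]                   # pop n (symbol, state) pairs
            stack += [lhs, GOTO[(stack[-1], lhs)]]
        else:
            return op == "accept"                # accept or reject

print(lr_parse(["x", "x", "y"]))   # True
print(lr_parse(["x", "x"]))        # False
```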

16 LR Parsing - Example
The grammar:
(1) S → NP VP
(2) NP → art adj n
(3) NP → art n
(4) NP → adj n
(5) VP → aux VP
(6) VP → v NP
The original input: x = "The large can can hold the water"
The POS-assigned input: x = "art adj n aux v art n"
The parser input: x = "art adj n aux v art n $"

17 Parse Table

18 The input: x = "art adj n aux v art n $"
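Since the parse table and trace on these slides are images, here is a sketch that compiles the example grammar and runs the parser on the POS-tagged input. This grammar happens to be LR(0) (every item set containing a completed item contains nothing else), so the table can be built without lookahead; the helper names are my own, not from the slides.

```python
# Compile the slides' example grammar into LR(0) item sets and parse with them.
RULES = [
    ("S'", ("S",)),                 # augmented start rule
    ("S",  ("NP", "VP")),           # (1)
    ("NP", ("art", "adj", "n")),    # (2)
    ("NP", ("art", "n")),           # (3)
    ("NP", ("adj", "n")),           # (4)
    ("VP", ("aux", "VP")),          # (5)
    ("VP", ("v", "NP")),            # (6)
]
NONTERMS = {lhs for lhs, _ in RULES}

def closure(items):                 # items are (rule index, dot position) pairs
    items = set(items)
    added = True
    while added:
        added = False
        for r, dot in list(items):
            lhs, rhs = RULES[r]
            if dot < len(rhs) and rhs[dot] in NONTERMS:
                for i, (l, _) in enumerate(RULES):
                    if l == rhs[dot] and (i, 0) not in items:
                        items.add((i, 0)); added = True
    return frozenset(items)

def goto(items, sym):
    return closure({(r, d + 1) for r, d in items
                    if d < len(RULES[r][1]) and RULES[r][1][d] == sym})

# Build the collection of item sets and the transition function.
states, trans = [closure({(0, 0)})], {}
for s in states:                    # the list grows while we iterate
    for sym in {RULES[r][1][d] for r, d in s if d < len(RULES[r][1])}:
        t = goto(s, sym)
        if t not in states:
            states.append(t)
        trans[(states.index(s), sym)] = states.index(t)

def parse(tokens):
    stack, toks = [0], list(tokens) + ["$"]
    while True:
        state, tok = stack[-1], toks[0]
        done = [r for r, d in states[state] if d == len(RULES[r][1])]
        if done:                                  # a pure reduce state
            if done[0] == 0:                      # S' -> S .  => accept
                return tok == "$"
            lhs, rhs = RULES[done[0]]
            del stack[-2 * len(rhs):]             # pop the RHS
            stack += [lhs, trans[(stack[-1], lhs)]]
        elif (state, tok) in trans:
            stack += [toks.pop(0), trans[(state, tok)]]   # shift
        else:
            return False                          # reject

# "The large can can hold the water" as POS tags:
print(parse(["art", "adj", "n", "aux", "v", "art", "n"]))   # True
```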

19 Constructing an SLR Parsing Table
An LR(0) item is a "dotted" grammar rule [A → α • β].
We construct a deterministic FSA that recognizes prefixes of rightmost sentential forms of the grammar G; the states of the FSA are sets of LR(0) items.
We augment the grammar with a new start rule S' → S.
We define the closure operation on a set S of LR(0) items:
1. Every item in S is also in closure(S)
2. If [A → α • B β] ∈ closure(S) and B → γ is a rule in G, then add [B → • γ] to closure(S)
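The closure definition above translates almost line for line into code. A sketch on a small assumed grammar, with items represented as (LHS, RHS, dot) triples:

```python
# Closure of a set of LR(0) items (sketch; small grammar assumed).
GRAMMAR = [
    ("S'", ("S",)),
    ("S",  ("NP", "VP")),
    ("NP", ("art", "n")),
    ("VP", ("v", "NP")),
]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Add [B -> . gamma] for every non-terminal B right after a dot."""
    result = set(items)
    worklist = list(items)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMS:
            for l, r in GRAMMAR:
                if l == rhs[dot] and (l, r, 0) not in result:
                    result.add((l, r, 0))
                    worklist.append((l, r, 0))
    return result

s0 = closure({("S'", ("S",), 0)})
for lhs, rhs, dot in sorted(s0):
    print(lhs, "->", " ".join(rhs[:dot]) + " . " + " ".join(rhs[dot:]))
```

For this grammar, closure({[S' → • S]}) contains [S' → • S], [S → • NP VP], and [NP → • art n]; the VP rule is not added because no dot stands before VP yet.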

20 Constructing an SLR Parsing Table
We define the Goto operation for an item set S and a grammar symbol X:
Goto(S, X) is the closure of the set of all items [A → α X • β] such that [A → α • X β] ∈ S.
Example: S0 = {[S → • NP VP]}
Goto(S0, NP) = closure({[S → NP • VP]})
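Goto is just "advance the dot over X, then take the closure". A sketch in the same (LHS, RHS, dot) representation, on the same small assumed grammar:

```python
# Goto(S, X): advance the dot over X in every item expecting X, then close.
GRAMMAR = [
    ("S'", ("S",)),
    ("S",  ("NP", "VP")),
    ("NP", ("art", "n")),
    ("VP", ("v", "NP")),
]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    items = set(items)
    added = True
    while added:
        added = False
        for lhs, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in NONTERMS:
                for l, r in GRAMMAR:
                    if l == rhs[dot] and (l, r, 0) not in items:
                        items.add((l, r, 0)); added = True
    return items

def goto(items, x):
    moved = {(lhs, rhs, dot + 1) for lhs, rhs, dot in items
             if dot < len(rhs) and rhs[dot] == x}
    return closure(moved)

s0 = closure({("S'", ("S",), 0)})
# Goto(S0, NP) = closure({[S -> NP . VP]}) = {[S -> NP . VP], [VP -> . v NP]}
print(goto(s0, "NP"))
```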

21 Constructing an SLR Parsing Table
We construct the collection of sets of LR(0) items for the augmented grammar G.
We start with the item set S0 = closure({[S' → • S]}).
The algorithm:
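The construction algorithm (the slide's figure is an image) is a standard worklist loop: start from closure({[S' → • S]}) and keep applying Goto until no new item set appears. A self-contained sketch on a small assumed grammar:

```python
# Build the full collection of LR(0) item sets with a worklist.
GRAMMAR = [
    ("S'", ("S",)),
    ("S",  ("NP", "VP")),
    ("NP", ("art", "n")),
    ("VP", ("v", "NP")),
]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    items = set(items)
    added = True
    while added:
        added = False
        for lhs, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in NONTERMS:
                for l, r in GRAMMAR:
                    if l == rhs[dot] and (l, r, 0) not in items:
                        items.add((l, r, 0)); added = True
    return frozenset(items)            # hashable, so item sets can be set members

def goto(items, x):
    return closure({(lhs, rhs, dot + 1) for lhs, rhs, dot in items
                    if dot < len(rhs) and rhs[dot] == x})

def item_sets():
    start = closure({("S'", ("S",), 0)})
    states, worklist = {start}, [start]
    while worklist:
        s = worklist.pop()
        for x in {rhs[dot] for _, rhs, dot in s if dot < len(rhs)}:
            t = goto(s, x)
            if t not in states:
                states.add(t); worklist.append(t)
    return states

print(len(item_sets()))   # 8 item sets for this small grammar
```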

22 Constructing an SLR Parsing Table - Example

23 Constructing an SLR Parsing Table

24 The constructed FSA for the example grammar:

25 Parsing with an LR Parser
The pointers that form the parse tree can be created while performing reduce actions:
– A parse node is created for each constituent that is pushed onto the stack
– When we reduce, we create a new parse node for the LHS non-terminal and link it to the parse nodes of the popped RHS constituents
– At the end, the S constituent on the stack points to the root of the parse tree
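The reduce-time tree building described above can be sketched by keeping (symbol, node) pairs on the stack; each reduce links the popped children under a new node for the LHS. The toy grammar and the tuple representation of nodes are assumptions for illustration.

```python
# Build the parse tree during reductions (sketch; toy grammar assumed).
GRAMMAR = [
    ("S",  ("NP", "VP")),
    ("NP", ("art", "n")),
    ("VP", ("v", "NP")),
]

def shift_reduce_tree(tagged):
    """tagged: list of (pos, word) pairs; returns the root node or None."""
    stack = []                       # entries: (symbol, tree node)
    buffer = list(tagged)
    while True:
        for lhs, rhs in GRAMMAR:
            if tuple(sym for sym, _ in stack[-len(rhs):]) == rhs:
                children = [node for _, node in stack[-len(rhs):]]
                del stack[-len(rhs):]
                stack.append((lhs, (lhs, children)))   # new node for the LHS
                break
        else:
            if not buffer:
                break
            pos, word = buffer.pop(0)
            stack.append((pos, word))                  # leaves are the words
    return stack[0][1] if len(stack) == 1 and stack[0][0] == "S" else None

tree = shift_reduce_tree([("art", "the"), ("n", "dog"), ("v", "chased"),
                          ("art", "a"), ("n", "cat")])
print(tree)
# -> ('S', [('NP', ['the', 'dog']), ('VP', ['chased', ('NP', ['a', 'cat'])])])
```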

26 The input: x = "art adj n aux v art n $"

27 Shift-Reduce Parsers and Ambiguity
1. NP → ART N REL-PRO VP
2. NP → ART N PP
NP1: NP → • ART N REL-PRO VP
  NP → • ART N PP
NP2: NP → ART • N REL-PRO VP
  NP → ART • N PP
NP3: NP → ART N • REL-PRO VP
  NP → ART N • PP
  PP → • P NP
NP4: NP → ART N REL-PRO • VP
  VP → • V NP

28 Lexical Ambiguity
Ambiguous words are pushed onto the stack, adding some extra states.
"Can" is both V and AUX, so we add S3_4, the union of S3 and S4:
VP → AUX • V NP
VP → V • NP
NP → • ART N
The next input will resolve the ambiguity:
– If it is a V, go to S5
– If it is an ART, go to S1
– If it is an NP, go to S3'

29 Ambiguous Parse States
3. NP → ART N
4. NP → ART N PP
NP → ART • N
NP → ART • N PP
NP5: NP → ART N •
  NP → ART N • PP
  PP → • P NP
Now, what if the next input is a P? There is a shift/reduce conflict.

30 Ambiguous Parse States (Cont.)
Solutions:
1. Maintain determinism and lose some interpretations (as a human may do):
– Choose shift in shift/reduce conflicts (⇒ R.A.)
– Choose the longer rule in reduce/reduce conflicts (⇒ M.A.)
2. Use search again (DFS or BFS):
– DFS combined with general preference principles
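The deterministic solution (option 1) can be sketched as a tie-breaking function over conflicting table entries. The representation of actions is my own, not a real parser API:

```python
# Resolve table conflicts deterministically (sketch; action format assumed).
def resolve(actions):
    """actions: list of ('shift', state) or ('reduce', rhs_length) entries."""
    shifts = [a for a in actions if a[0] == "shift"]
    if shifts:                       # shift/reduce conflict: prefer shift
        return shifts[0]             # ~ Right Association / Late Closure
    # reduce/reduce conflict: prefer the longer rule (~ Minimal Attachment)
    return max(actions, key=lambda a: a[1])

print(resolve([("shift", 7), ("reduce", 2)]))    # ('shift', 7)
print(resolve([("reduce", 2), ("reduce", 4)]))   # ('reduce', 4)
```

This keeps the parser deterministic at the cost of losing the other interpretations, mirroring the human garden-path behavior described earlier.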

