1
**PARSING WITH CONTEXT-FREE GRAMMARS**

cc437

2
**PARSING**

Parsing is the process of recognizing a string and assigning STRUCTURE to it.

Parsing a string with a CFG:
- Finding a derivation of the string consistent with the grammar
- The derivation gives us a PARSE TREE
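As a minimal sketch of what "a derivation consistent with the grammar" means, the block below stores a toy CFG as a dictionary and a parse tree as nested tuples, then checks that every local subtree is licensed by a rule. The grammar, the tree encoding, and the names `GRAMMAR`, `TREE`, `tree_yield`, and `is_licensed` are illustrative assumptions, not part of the slides.

```python
# Hypothetical toy grammar: each nonterminal maps to a list of right-hand sides.
GRAMMAR = {
    "S": [["VP"]],
    "VP": [["Verb", "NP"]],
    "NP": [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "Verb": [["book"]],
    "Det": [["that"]],
    "Noun": [["flight"]],
}

# A parse tree as nested tuples: (label, child, child, ...); leaves are words.
TREE = ("S",
        ("VP",
         ("Verb", "book"),
         ("NP", ("Det", "that"), ("Nominal", ("Noun", "flight")))))


def tree_yield(tree):
    """Return the words at the leaves of the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(tree_yield(child))
    return words


def is_licensed(tree, grammar):
    """Check that every local subtree corresponds to a rule of the grammar."""
    if isinstance(tree, str):
        return True
    label, children = tree[0], tree[1:]
    rhs = [c if isinstance(c, str) else c[0] for c in children]
    return rhs in grammar.get(label, []) and all(
        is_licensed(c, grammar) for c in children
    )
```

A tree is a parse of a string exactly when `is_licensed` holds, the root is the start symbol, and `tree_yield` returns the string's words.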

3
**EXAMPLE (CFR LAST WEEK)**

4
**PARSING AS SEARCH**

- Just as in the case of non-deterministic recognition for regular expressions, the main problem with parsing is the existence of CHOICE POINTS
- There is a need for a SEARCH STRATEGY determining the order in which alternatives are considered

5
**TOP-DOWN AND BOTTOM-UP SEARCH STRATEGIES**

The search has to be guided by the INPUT and the GRAMMAR
- TOP-DOWN search: the parse tree has to be rooted in the start symbol S (EXPECTATION-DRIVEN parsing)
- BOTTOM-UP search: the parse tree must be an analysis of the input (DATA-DRIVEN parsing)

6
**AN EXAMPLE OF TOP-DOWN SEARCH (IN PARALLEL)**

7
**AN EXAMPLE OF BOTTOM-UP SEARCH**

8
**NON-PARALLEL SEARCH**

If it is not possible to examine all alternatives in parallel, it is necessary to make further decisions:
- Which node in the current search space to expand first (breadth-first or depth-first)
- Which of the applicable grammar rules to expand first
- Which leaf node in a parse tree to expand next (e.g., leftmost)
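The three decisions can be made concrete with an agenda-based recognizer. This is a sketch under stated assumptions: the grammar is a hypothetical toy grammar without empty rules, search states are sentential forms, and the function name `search` is invented for illustration. Popping from the end of the agenda gives depth-first search, popping from the front gives breadth-first search; rules are tried in listed order and the leftmost nonterminal is always expanded.

```python
from collections import deque

# Hypothetical toy grammar (no empty right-hand sides).
GRAMMAR = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb", "NP"], ["Verb"]],
    "Det": [["that"]],
    "Noun": [["flight"]],
    "Verb": [["book"]],
}


def search(words, grammar, start="S", depth_first=True):
    """Recognize `words` by expanding the leftmost nonterminal of each form."""
    words = tuple(words)
    agenda = deque([(start,)])
    while agenda:
        # Decision 1: which node of the search space to expand first.
        form = agenda.pop() if depth_first else agenda.popleft()
        # Decision 3: always pick the leftmost nonterminal of the form.
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            if form == words:
                return True
            continue
        # Prune forms that are too long (valid because no rule is empty)
        # or whose terminal prefix already disagrees with the input.
        if len(form) > len(words) or form[:idx] != words[:idx]:
            continue
        # Decision 2: try rules in listed order. For a stack we push them
        # reversed so that the first rule is popped first.
        rules = grammar[form[idx]]
        ordered = list(reversed(rules)) if depth_first else rules
        for rhs in ordered:
            agenda.append(form[:idx] + tuple(rhs) + form[idx + 1:])
    return False
```

Both strategies accept the same strings; they differ only in the order in which hypotheses are explored and in how much of the agenda is held in memory.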

9
**TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT**

10
**TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT (II)**

11
**TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT (III)**

12
**TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT (IV)**

13
**A T-D, D-F, L-R PARSER**

(Compare with ND-recognize)
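The slide's own algorithm is not reproduced in this transcript, so the following is only a sketch of one way to write a top-down, depth-first, left-to-right recognizer with backtracking. The grammar and the names `parse` and `recognize` are assumptions. The sketch assumes a grammar with no left-recursive rules; with such rules the recursion would not terminate.

```python
# Hypothetical toy grammar without left-recursive rules.
GRAMMAR = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb", "NP"], ["Verb"]],
    "Det": [["that"], ["the"]],
    "Noun": [["flight"]],
    "Verb": [["book"]],
}


def parse(symbols, words, grammar):
    """Yield every remainder of `words` left after deriving `symbols`.

    Depth-first: the leftmost symbol is fully expanded before its siblings.
    Left-to-right: rules are tried in the order in which they are listed.
    Backtracking happens implicitly when a generator is exhausted.
    """
    if not symbols:
        yield words
        return
    first, rest = symbols[0], symbols[1:]
    if first in grammar:
        # Nonterminal: replace it by each right-hand side in turn.
        for rhs in grammar[first]:
            yield from parse(rhs + rest, words, grammar)
    elif words and words[0] == first:
        # Terminal: it must match the next input word.
        yield from parse(rest, words[1:], grammar)


def recognize(words, grammar, start="S"):
    """True if some derivation from `start` consumes all of `words`."""
    return any(rem == [] for rem in parse([start], list(words), grammar))
```

For example, `recognize("book that flight".split(), GRAMMAR)` first tries S → NP VP, fails on the first word, backtracks, and succeeds with S → VP.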

14
**TOP-DOWN vs BOTTOM-UP**

TOP-DOWN:
- Only searches among grammatical answers
- BUT: suggests hypotheses that may not be consistent with the data
- Problem: left-recursion

BOTTOM-UP:
- Only forms hypotheses consistent with the data
- BUT: may suggest hypotheses that make no sense globally

15
**LEFT-RECURSION**

A LEFT-RECURSIVE grammar may cause a T-D, D-F, L-R parser to never return

Examples of left-recursive rules:
- NP → NP PP
- S → S and S

But also (indirect left-recursion):
- NP → Det Nom
- Det → NP 's
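Both the direct and the indirect case can be detected by following the first symbol of each rule transitively. The sketch below assumes a grammar without empty rules; the function name `left_recursive_nonterminals` and the lexical entries added to make the grammar complete are illustrative assumptions.

```python
def left_recursive_nonterminals(grammar):
    """Return every nonterminal A that can derive a string starting with A."""
    # direct[A] = nonterminals occurring first on some right-hand side of A
    direct = {
        a: {rhs[0] for rhs in rhss if rhs and rhs[0] in grammar}
        for a, rhss in grammar.items()
    }
    result = set()
    for a in grammar:
        # Depth-first walk over the "can start with" relation from A.
        seen, stack = set(), list(direct[a])
        while stack:
            b = stack.pop()
            if b in seen:
                continue
            seen.add(b)
            stack.extend(direct[b])
        if a in seen:
            result.add(a)
    return result


# The rules from the slide plus hypothetical lexical rules.
GRAMMAR = {
    "S": [["S", "and", "S"], ["NP", "VP"]],
    "NP": [["NP", "PP"], ["Det", "Nom"]],
    "Det": [["NP", "'s"], ["the"]],
    "Nom": [["Noun"]],
    "VP": [["Verb"]],
    "PP": [["Prep", "NP"]],
    "Noun": [["dog"]],
    "Verb": [["barks"]],
    "Prep": [["with"]],
}
```

On this grammar the function reports S and NP (directly left-recursive) and also Det, which is left-recursive only through NP.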

16
**THE PROBLEM WITH LEFT-RECURSION**

17
**LEFT-RECURSION: POOR SOLUTIONS**

- Rewrite the grammar to a weakly equivalent one. Problem: may not get the correct parse tree
- Limit the depth during search. Problem: the limit is arbitrary
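The first option can be illustrated with the textbook transformation for immediate left recursion: rules A → A α | β become A → β A' and A' → α A' | ε. The sketch below handles only immediate (direct) left recursion; the function name and the `_tail` suffix used for the fresh nonterminal are assumptions made for this example.

```python
def remove_immediate_left_recursion(grammar):
    """Rewrite A -> A alpha | beta as A -> beta A_tail, A_tail -> alpha A_tail | (empty)."""
    new_grammar = {}
    for a, rhss in grammar.items():
        # alpha parts of the left-recursive rules A -> A alpha
        recursive = [rhs[1:] for rhs in rhss if rhs and rhs[0] == a]
        # beta parts: the rules that do not start with A
        others = [rhs for rhs in rhss if not rhs or rhs[0] != a]
        if not recursive:
            new_grammar[a] = [list(rhs) for rhs in rhss]
            continue
        tail = a + "_tail"  # hypothetical name for the fresh nonterminal
        new_grammar[a] = [rhs + [tail] for rhs in others]
        # The empty list stands for the empty right-hand side (epsilon).
        new_grammar[tail] = [alpha + [tail] for alpha in recursive] + [[]]
    return new_grammar
```

The rewritten grammar generates the same strings, but a phrase that the original grammar analyses as a left-branching NP (NP → NP PP) now receives a right-branching structure through `NP_tail`, which is why the result is only weakly equivalent.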

18
**LEFT-CORNER PARSING**

A hybrid of top-down and bottom-up parsing

Strategy: don't consider any expansion unless the current input can serve as the LEFT-CORNER of that expansion
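One way to apply this strategy is to precompute a left-corner table and use it to filter expansions. The sketch below is an assumption-laden illustration: the grammar is a hypothetical toy grammar without empty rules, words are included in the table alongside categories, and the names `left_corners` and `viable_expansions` are invented here.

```python
# Hypothetical toy grammar.
GRAMMAR = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb", "NP"]],
    "Det": [["that"]],
    "Noun": [["flight"]],
    "Verb": [["book"]],
}


def left_corners(grammar):
    """Map each nonterminal to every symbol that can begin one of its derivations."""
    table = {a: {rhs[0] for rhs in rhss if rhs} for a, rhss in grammar.items()}
    changed = True
    while changed:  # transitive closure by repeated propagation
        changed = False
        for a in table:
            for b in list(table[a]):
                new = table.get(b, set()) - table[a]
                if new:
                    table[a] |= new
                    changed = True
    return table


def viable_expansions(category, word, grammar, table):
    """Keep only the rules of `category` whose first symbol can start with `word`."""
    return [
        rhs
        for rhs in grammar[category]
        if rhs and (rhs[0] == word or word in table.get(rhs[0], set()))
    ]
```

With the input word "book", only S → VP survives the filter, so the parser never explores S → NP VP.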

19
**FURTHER PROBLEMS IN PARSING**

- Ambiguity. Church and Patil (1982): the number of attachment ambiguities grows like the Catalan numbers: C(2) = 2, C(3) = 5, C(4) = 14, C(5) = 42, C(6) = 132, C(7) = 429, C(8) = 1430
- Avoiding reparsing
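The growth rate can be checked with the standard closed form for the Catalan numbers, C(n) = (2n choose n) / (n + 1). The function name `catalan` is chosen here for illustration; `math.comb` requires Python 3.8 or later.

```python
from math import comb


def catalan(n):
    """n-th Catalan number, using the closed form C(n) = (2n choose n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


# Values for n = 2 .. 8
values = [catalan(n) for n in range(2, 9)]
# values == [2, 5, 14, 42, 132, 429, 1430]
```

The sequence roughly quadruples at each step, so enumerating every attachment quickly becomes impractical as more PPs are added.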

20
**COMMON STRUCTURAL AMBIGUITIES**

COORDINATION ambiguity:
- OLD (MEN AND WOMEN) vs (OLD MEN) AND WOMEN

ATTACHMENT ambiguity:
- Gerundive VP attachment ambiguity: I saw the Eiffel Tower flying to Paris
- PP attachment ambiguity: I shot an elephant in my pajamas

21
**PP ATTACHMENT AMBIGUITY**

22
**AMBIGUITY: SOLUTIONS**

- Use a PROBABILISTIC GRAMMAR (not covered in this module)
- Use semantics

23
**AVOID RECOMPUTING INVARIANTS**

Consider parsing the following NP with a top-down parser: A flight from Indianapolis to Houston on TWA

With the grammar rules:
- NP → Det Nominal
- NP → NP PP
- NP → ProperNoun

24
**INVARIANTS AND TOP-DOWN PARSING**

25
**THE EARLEY ALGORITHM**

26
**DYNAMIC PROGRAMMING**

- A standard T-D parser would reanalyze A FLIGHT 4 times, always in the same way
- A DYNAMIC PROGRAMMING algorithm uses a table (the CHART) to avoid repeating work
- The Earley algorithm also:
  - Does not suffer from the left-recursion problem
  - Solves an exponential problem in O(n^3)

27
**THE CHART**

- The Earley algorithm uses a table (the CHART) of size N+1, where N is the length of the input
- Table entries sit in the "gaps" between words
- Each entry in the chart is a list of:
  - Completed constituents
  - In-progress constituents
  - Predicted constituents
- All three types of objects are represented in the same way, as STATES

28
**THE CHART: GRAPHICAL REPRESENTATION**

29
**STATES**

A state encodes two types of information:
- How much of a certain rule has been encountered in the input
- Which positions are covered

A → α • β, [X,Y]

DOTTED RULES:
- VP → V NP •
- NP → Det • Nominal
- S → • VP
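A state can be sketched as a small immutable record holding a rule, a dot position, and the span covered so far. The class name `State` and its field and method names are assumptions chosen for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """A dotted rule together with the input span it covers."""
    lhs: str      # left-hand side of the rule
    rhs: tuple    # right-hand side symbols
    dot: int      # number of right-hand side symbols already recognized
    start: int    # chart position where the constituent begins
    end: int      # chart position reached so far

    def is_complete(self):
        """True when the dot has reached the end of the rule."""
        return self.dot == len(self.rhs)

    def next_symbol(self):
        """The symbol right after the dot, or None for a completed state."""
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self):
        symbols = list(self.rhs[:self.dot]) + ["•"] + list(self.rhs[self.dot:])
        return f"{self.lhs} → {' '.join(symbols)}, [{self.start},{self.end}]"
```

A predicted constituent has `dot == 0` and `start == end`, an in-progress constituent has the dot in the middle of the rule, and a completed constituent has the dot at the end. Making the class frozen keeps states hashable, so duplicates in a chart entry are easy to detect.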

30
**EXAMPLES**

31
**SUCCESS**

The parser has succeeded if entry N+1 of the chart (the last entry) contains the state S → α •, [0,N]

32
**THE ALGORITHM**

The algorithm loops through the input without backtracking, at each step performing three operations:
- PREDICTOR: add predictions to the chart
- COMPLETER: move the dot to the right when a looked-for constituent is found
- SCANNER: read in the next input word
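The three operations can be combined into a compact recognizer. This is a sketch under several assumptions, not the slides' own pseudocode: states are plain tuples `(lhs, rhs, dot, origin)` whose end position is the index of the chart entry holding them, parts of speech are kept in a separate lexicon, the dummy start symbol `GAMMA` is invented here, and the grammar has no empty rules. The toy grammar deliberately contains the left-recursive rule NP → NP PP.

```python
def earley_recognize(words, grammar, lexicon, start="S"):
    """Return True if `words` can be derived from `start`."""
    n = len(words)
    chart = [[] for _ in range(n + 1)]       # one entry per gap between words
    pos_tags = set().union(*lexicon.values())

    def enqueue(state, i):
        if state not in chart[i]:            # never add the same state twice
            chart[i].append(state)

    enqueue(("GAMMA", (start,), 0, 0), 0)    # dummy start state
    for i in range(n + 1):
        for state in chart[i]:               # chart[i] may grow while we iterate
            lhs, rhs, dot, origin = state
            if dot < len(rhs) and rhs[dot] not in pos_tags:
                # PREDICTOR: expect every expansion of the next category.
                for expansion in grammar.get(rhs[dot], []):
                    enqueue((rhs[dot], tuple(expansion), 0, i), i)
            elif dot < len(rhs):
                # SCANNER: the next category is a part of speech; read a word.
                if i < n and rhs[dot] in lexicon.get(words[i], set()):
                    enqueue((rhs[dot], (words[i],), 1, i), i + 1)
            else:
                # COMPLETER: advance every state that was waiting for `lhs`.
                for l2, r2, d2, o2 in chart[origin]:
                    if d2 < len(r2) and r2[d2] == lhs:
                        enqueue((l2, r2, d2 + 1, o2), i)
    return ("GAMMA", (start,), 1, 0) in chart[n]


# Hypothetical toy grammar and lexicon.
GRAMMAR = {
    "S": [("NP", "VP"), ("VP",)],
    "NP": [("Det", "Nominal"), ("NP", "PP")],
    "Nominal": [("Noun",), ("Noun", "Nominal")],
    "VP": [("Verb",), ("Verb", "NP")],
    "PP": [("Prep", "NP")],
}
LEXICON = {
    "book": {"Verb", "Noun"},
    "that": {"Det"},
    "flight": {"Noun"},
    "from": {"Prep"},
}
```

The duplicate check in `enqueue` is what keeps the left-recursive rule from looping: predicting NP a second time at the same position adds nothing new to the chart.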

33
**THE ALGORITHM: CENTRAL LOOP**

34
**EARLEY ALGORITHM: THE THREE OPERATORS**

35
**EXAMPLE, AGAIN**

36
**EXAMPLE: BOOK THAT FLIGHT**

37
**EXAMPLE: BOOK THAT FLIGHT (II)**

38
**EXAMPLE: BOOK THAT FLIGHT (III)**

39
**EXAMPLE: BOOK THAT FLIGHT (IV)**

40
**READINGS**

Jurafsky and Martin, chapter
