Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Chapter 4: Top-Down Parsing. 2 Objectives of Top-Down Parsing an attempt to find a leftmost derivation for an input string. an attempt to construct.

Similar presentations


Presentation on theme: "1 Chapter 4: Top-Down Parsing. 2 Objectives of Top-Down Parsing an attempt to find a leftmost derivation for an input string. an attempt to construct."— Presentation transcript:

1 1 Chapter 4: Top-Down Parsing

2 2 Objectives of Top-Down Parsing an attempt to find a leftmost derivation for an input string. an attempt to construct a parse tree for the input string starting from the root and creating the nodes of the parse tree in preorder.

3 Input String : >>> lm

4 4 1. with backtracking (making repeated scans of the input, a general form of top-down parsing) Methods: To create a procedure for each nonterminal. Approaches of Top-Down Parsing

5 e.g. S -> cAd A -> ab | a S( ) { if input symbol == ‘c’ A( ) { isave= input-pointer; { Advance(); if input-symbol == ‘a’ if A() { Advance(); if input-symbol == ‘d’ if input-symbol == ‘b’ { Advance(); { Advance(); return true; return true; } } return false; input-pointer = isave; } if input-symbol == ‘a’ { Advance(); return true; } else return false; } L = { cabd, cad } c ad

6 Problems for top-down parsing with backtracking : (1) left-recursion (can cause a top-down parser to go into an infinite loop) Def. A grammar is said to be left-recursive if it has a nonterminal A s.t. there is a derivation A => A  for some . (2) backtracking - undo not only the movement but also the semantics entering in symbol table. (3) the order the alternatives are tried (For the grammar shown above, try w = cabd where A -> a is applied first) +

7 7 With immediate left recursion: A -> A  |  ==> transform into A ->  A' A' ->  A' |  A A  A . A A .  ===> A  A'  ..  Elimination of Left-Recursion   …  

8 8 e.g. E -> E + T | T T -> T * F | F F -> (E) | id After transformation: E -> TE ' E' -> +TE' |  T -> FT ' T ' -> *FT' |  F -> (E) | id

9 9 General form (with left recursion): A -> A  1 | A  2 |... | A  n |  1 |  2 |... |  m After transformation: ==> A ->  1 A' |  2 A' |... |  m A' A' ->  1 A' |  2 A' |... |  n A' | 

10 10 How about left recursion occurred for derivation with more than two steps? e.g., S -> Aa | b A -> Ac | Sd | e where S => Aa => Sda

11 Algorithm: Eliminating left recursion Input Context-free Grammar G with no cycles (i.e., A => A ) or  -production Methods: 1. Arrange the nonterminals in some order A1, A2,..., An 2. for i = 1 to n do { for j = 1 to i -1 do replace each production of the form Ai -> Aj  by the production Ai ->  1  |  2  |... |  k , where Aj ->  1 |  2 |... |  k are all current Aj-production; eliminate the immediate left-recursion among the Ai- production; } +

12 12 An Example e.g. S -> Aa | b A -> Ac | Sd | e Step 1: ==> S -> Aa | b Step 2: ==> A -> Ac | Aad | bd | e Step 3: ==> A -> bdA' |eA' A' -> cA' |adA' | 

13 2. Non-backtracking (recursive-descent) parsing recursive descent : use a collection of mutually recursive routines to perform the syntax analysis. Left Factoring : A ->  1 |   2 ==> A ->  A' A' ->  1 |  2 Methods: 1. For each nonterminal A find the longest prefix  common to two or more of its alternatives. If    replace all the A productions A ->   1 |   2 |... |   n | others by A ->  A‘ | others A' ->  1 |  2 |... |  n 2. Repeat the transformation until no more found e.g. S -> iCtS | iCtSeS | a C -> b ==> S -> iCtSS' | a S' -> eS |  C -> b

14 14 Predicative Parsing Features: - maintains a stack rather than recursive calls - table-driven Components: 1. An input buffer with end marker ($) 2. A stack with endmarker ($) on the bottom 3. A parsing table, a two-dimensional array M[A,a], where ‘A’ is a nonterminal symbol and ‘a’ is the current input symbol (terminal/token).

15 15 Parsing Table M[A,a] S ( S  ( S ) S ) S  ε $

16 16 Algorithm: Input: An input string w and a parsing table M for grammar G. Output: A leftmost derivation of w or an error indication.

17 Initially w$ is in input buffer and S$ is in the stack. Method: do { Let a of w be the next input symbol and X be the top stack symbol; if X is a terminal { if X == a then pop X from stack and remove a from input; else ERROR();} else { if M[X, a] = X -> Y1Y2...Yn then 1. pop X from the stack; 2. push YnYn-1...Y1 onto the stack with Y1 on top; else ERROR(); } } while (X ≠ $) if (X == $) and (the next input symbol == $) then accept else error(); Starting Symbol of the grammar

18

19 19 An Example

20

21

22 22 Construction of the parsing table for predictive parser First and Follow Def. First(  ) /*  denotes grammar symbol*/ is the set of terminals that begin the string derived from . If  => , then  is also in First(  ). Def. Follow(A), A is a nonterminal, is the set of terminals a that can appear immediately to the right of A in some sentential form, that is, the set of terminals 'a' s.t. there exists a derivation of the form S =>*  A a  for some  and . If A can be the rightmost symbol in some sentential form, then  is in Follow(A). *

23 23 Compute First(X) for all grammar symbols X: 1. If X is terminal, then First(X) = {X}. 2. If X ->  is a production then  is in First(X). 3. If X is nonterminal and X -> Y1Y2...Yk is a production, then place 'a' in First(X) if for some i, a is in First(Yi), and  is in all of First(Y1),..., First(Yi-1); that is Y1... Yi-1 => . If  is in First(Yj) for all j = 1,2,...,k, then add  in First(X). *

24 24 An Example E -> TE' E' -> +TE'|  T -> FT' T' -> *FT‘ |  F -> (E) | id First(E) = First(T) = First(F) = {(, id} First(E') = {+,  } First(T') = {*,  }

25 25

26 26 Compute Follow(A) for all nonterminals A 1. Place $ in Follow(S), where S is the start symbol and $ is the input buffer endmarker. 2. If there is a production A ->  B , then everything in First(  ) except for  is placed in Follow(B). 3. If there is a production A ->  B, or a production A ->  B  where First(  ) contains , then everything in Follow(A) is in Follow(B).

27 27 An Example E -> TE' E' -> +TE'|  T -> FT' T' -> *FT' |  F -> (E) | id /* E is the start symbol */ Follow(E) = { $,) } // rules 1 & 2 Follow(E') = { $,) } // rule 3 Follow(T) = { +,$,) } // rules 2 & 3 Follow(T') = { +,$,) } // rule 3 Follow(F) = { *,+,$,) } // rules 2 & 3

28 28 E -> TE' E' -> +TE'|  T -> FT' T' -> *FT‘ |  F -> (E) | id First(E) = First(T) = First(F) = {(, id} First(E') = {+,  } First(T') = {*,  }

29 29 Construct a Predicative Parsing Table 1. For each production A ->  of the grammar, do steps 2 and 3. 2. For each terminal a in First(  ), add A ->  to M[A, a]. 3. If  is in First(  ), add A ->  to M[A, b] for each terminal b in Follow(A). If  is in First(  ) and $ is in Follow(A), add A ->  to M[A, $]. 4. Make each undefined entry of M be error.

30 LL(1) grammar A grammar whose parsing table has no multiply-defined entries is said to be LL(1). First 'L' : scan the input from left to right. Second 'L': produce a leftmost derivation. '1' : use one input symbol to determine parsing action. * No ambiguous or left-recursive grammar can be LL(1).

31 31 Properties of LL(1) grammar A grammar G is LL(1) iff whenever A ->  |  are two distinct productions of G, the following conditions hold: (1) For no terminal a do both  and  derive strings beginning with a. (based on method 2)  First(  ) ∩ First(  ) = ψ (2) At most one of  and  can derive the empty string  (based on method 3). (3) if  =>  then  does not derive any string beginning with a terminal in Follow (A) (based on methods 2 and 3).  First(  ) ∩ Follow(A) = ψ (i.e. If First(A) contains  then First(A) ∩ Follow(A) = ψ) *

32 32 Def. for Multiply-defined entry If G is left-recursive or ambiguous, then M will have at least one multiply-defined entry. e.g. S -> iCtSS'| a S' -> eS |  C -> b generates: M[S',e] = { S' -> , S' -> eS} with multiply- defined entry.

33 33 Parsing table with multiply-defined entry abeit$ SS-> aS -> iCtSS' S’ S’->  S' -> eS S’->  CC->b

34 34 Difficulty in predictive parsing - Left recursion elimination and left factoring make the resulting grammar hard to read and difficult to use for translation purpose. Thus: * Use predictive parser for control constructs * Use operator precedence for expressions.

35 35 Assignment #3b Do exercises 4.3, 4.10, 4.13, 4.15


Download ppt "1 Chapter 4: Top-Down Parsing. 2 Objectives of Top-Down Parsing an attempt to find a leftmost derivation for an input string. an attempt to construct."

Similar presentations


Ads by Google