
Slide 1: Bottom-Up Parsing

Slide 2: Bottom-Up Parsing
- Bottom-up parsing is more general than top-down parsing
  - And just as efficient
  - Builds on ideas in top-down parsing
  - Preferred method in practice
- Also called LR parsing
  - L means that tokens are read left to right
  - R means that it constructs a rightmost derivation in reverse!

Slide 3: An Introductory Example
- LR parsers don't need left-factored grammars and can also handle left-recursive grammars
- Consider the following grammar: E → E + ( E ) | int
  - Why is this not LL(1)?
- Consider the string: int + ( int ) + ( int )

Slide 4: The Idea
LR parsing reduces a string to the start symbol by inverting productions:
    str ← input string of terminals
    repeat
      - Identify β in str such that A → β is a production (i.e., str = αβγ)
      - Replace β by A in str (i.e., str ← αAγ)
    until str = S
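A minimal Python sketch of this loop (an editor's illustration, not the slides' algorithm: GRAMMAR, reduce_to_start, and the greedy leftmost-match choice are assumptions of this sketch; a real LR parser picks the unique handle deterministically):

    # Hypothetical helper: repeatedly replace some production rhs by its lhs.
    # The greedy "first match wins" choice happens to work for this grammar;
    # in general, picking the wrong occurrence can dead-end the parse.
    GRAMMAR = [("E", ["E", "+", "(", "E", ")"]), ("E", ["int"])]

    def reduce_to_start(tokens, start="E"):
        s = list(tokens)
        while s != [start]:
            for lhs, rhs in GRAMMAR:
                for i in range(len(s) - len(rhs) + 1):
                    if s[i:i + len(rhs)] == rhs:
                        s[i:i + len(rhs)] = [lhs]   # replace beta by A
                        break
                else:
                    continue        # this production matched nowhere; try next
                break               # reduced; restart the search
            else:
                raise SyntaxError("stuck: no reduction applies")
        return s

    print(reduce_to_start("int + ( int ) + ( int )".split()))   # ['E']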

Slides 5-10: A Bottom-up Parse in Detail
(figure: the parse tree for int + (int) + (int) built bottom-up, one reduction per slide)
    int + (int) + (int)
    E + (int) + (int)
    E + (E) + (int)
    E + (int)
    E + (E)
    E
A rightmost derivation in reverse

Slide 11: Important Fact #1
Important Fact #1 about bottom-up parsing:
An LR parser traces a rightmost derivation in reverse

Slide 12: Where Do Reductions Happen
Important Fact #1 has an interesting consequence:
  - Let αβω be a step of a bottom-up parse
  - Assume the next reduction is by A → β
  - Then ω is a string of terminals
Why? Because αAω → αβω is a step in a rightmost derivation

Slide 13: Notation
- Idea: Split the string into two substrings
  - Right substring is as yet unexamined by parsing (a string of terminals)
  - Left substring has terminals and non-terminals
- The dividing point is marked by a |
  - The | is not part of the string
- Initially, all input is unexamined: | x1 x2 ... xn

Slide 14: Shift-Reduce Parsing
Bottom-up parsing uses only two kinds of actions: Shift and Reduce

Slide 15: Shift
Shift: Move | one place to the right
  - Shifts a terminal to the left string
    E + ( | int )  ⇒  E + ( int | )

Slide 16: Reduce
Reduce: Apply an inverse production at the right end of the left string
  - If E → E + ( E ) is a production, then
    E + ( E + ( E ) | )  ⇒  E + ( E | )

Slides 17-27: Shift-Reduce Example
(figure: the parse tree built up alongside the trace, one step per slide)
    | int + (int) + (int)$     shift
    int | + (int) + (int)$     red. E → int
    E | + (int) + (int)$       shift 3 times
    E + (int | ) + (int)$      red. E → int
    E + (E | ) + (int)$        shift
    E + (E) | + (int)$         red. E → E + (E)
    E | + (int)$               shift 3 times
    E + (int | )$              red. E → int
    E + (E | )$                shift
    E + (E) | $                red. E → E + (E)
    E | $                      accept

Slide 28: The Stack
- The left string can be implemented by a stack
  - Top of the stack is the |
- Shift pushes a terminal on the stack
- Reduce pops 0 or more symbols off the stack (production rhs) and pushes a non-terminal on the stack (production lhs)

Slide 29: Key Issue: When to Shift or Reduce?
- Idea: use a finite automaton (FA) to decide when to shift or reduce
  - The input is the stack
  - The language consists of terminals and non-terminals
- We run the FA on the stack and examine the resulting state X and the token tok after the |
  - If X has a transition labeled tok, then shift
  - If X is labeled with "A → β on tok", then reduce

Slide 30: LR(1) Parsing. An Example
(figure: DFA with states 0-11; state 1: E → int on $, +; state 2: accept on $; state 5: E → int on ), +; state 7: E → E + (E) on $, +; state 11: E → E + (E) on ), +)
    0 | int + (int) + (int)$            shift
    0 int 1 | + (int) + (int)$          E → int
    0 E 2 | + (int) + (int)$            shift 3 times
    0 E 2 + 3 ( 4 int 5 | ) + (int)$    E → int
    0 E 2 + 3 ( 4 E 6 | ) + (int)$      shift
    0 E 2 + 3 ( 4 E 6 ) 7 | + (int)$    E → E + (E)
    0 E 2 | + (int)$                    shift 3 times
    0 E 2 + 3 ( 4 int 5 | )$            E → int
    0 E 2 + 3 ( 4 E 6 | )$              shift
    0 E 2 + 3 ( 4 E 6 ) 7 | $           E → E + (E)
    0 E 2 | $                           accept

Slide 31: Representing the FA
- Parsers represent the deterministic FA as a 2D table
  - Recall table-driven lexical analysis
- Rows correspond to DFA states
- Columns correspond to terminals and non-terminals
- Typically the columns are split into:
  - Those for terminals: the action table
  - Those for non-terminals: the goto table

Slide 32: Representing the DFA. Example
The table for a fragment of our DFA (states 3-7, with transitions 3 --(--> 4 --int--> 5, 4 --E--> 6, 6 --+--> 8, 6 --)--> 7):

    state | int  +              (    )              $              | E
      3   |                     s4                                 |
      4   | s5                                                     | g6
      5   |      r(E → int)          r(E → int)                    |
      6   |      s8                  s7                            |
      7   |      r(E → E+(E))                       r(E → E+(E))   |

Slide 33: The LR Parsing Algorithm
- If after a shift or reduce action we rerun the DFA on the entire stack, this is wasteful, since most of the work is repeated
- Instead, remember for each stack element the state it brings the DFA to
- The LR parser maintains a stack ⟨sym1, state1⟩ ... ⟨symn, staten⟩
  - statek is the final state of the DFA on sym1 ... symk

Slide 34: The LR Parsing Algorithm
    Let w be the initial input
    Let j = 0
    Let DFA state 0 be the start state
    Let stack = ⟨dummy, 0⟩
    repeat
      case action[topState(stack), w[j]] of
        shift k:       push ⟨w[j++], k⟩
        reduce X → α:  pop |α| pairs,
                       push ⟨X, goto[topState(stack), X]⟩
        accept:        halt normally
        error:         halt and report error
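The same algorithm as runnable Python (a sketch: the action and goto tables are plain dicts you would fill from the parsing table; the tuple encodings below are this sketch's conventions, not a library API):

    def lr_parse(tokens, action, goto_tbl):
        """Generic LR driver. action[(state, terminal)] is ("shift", k),
        ("reduce", lhs, rhs_len), or ("accept",); goto_tbl[(state, nt)]
        is the state entered after reducing to nonterminal nt."""
        stack = [("<dummy>", 0)]            # pairs <symbol, state>
        j = 0
        while True:
            state = stack[-1][1]
            act = action.get((state, tokens[j]))
            if act is None:
                raise SyntaxError(f"unexpected {tokens[j]!r} in state {state}")
            if act[0] == "shift":
                stack.append((tokens[j], act[1]))
                j += 1
            elif act[0] == "reduce":
                _, lhs, rhs_len = act
                for _ in range(rhs_len):    # pop |alpha| pairs
                    stack.pop()
                stack.append((lhs, goto_tbl[(stack[-1][1], lhs)]))
            else:                           # accept
                return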

Slide 35: LR Parsing Notes
- Can be used to parse more grammars than LL
- Most programming language grammars are LR
- Can be described as a simple table
- There are tools for building the table
- How is the table constructed?

Slide 36: LR Parsing and Parser Generators

Slide 37: Outline
- Implementing a shift-reduce parser
  - Synthesize a DFA
    - Captures all the possible states the parser can be in, with state transitions on terminals and non-terminals
    - DFA for an LR(0) parser
    - DFA for an SLR parser
    - DFA for an LR(1) parser
    - DFA for an LALR(1) parser
  - Use the DFA to create a parse table
- Using parser generators

Slide 38: Key Issue: How is the DFA Constructed?
- The stack describes the context of the parse
  - What non-terminal we are looking for
  - What production rhs we are looking for
  - What we have seen so far from the rhs
  - ⇒ LR(0) item
- Each DFA state describes several such contexts
  - E.g., when we are looking for non-terminal E, we might be looking either for an int or an E + (E) rhs
  - ⇒ LR(0) item set

Slide 39: LR Example
The grammar:
    S → A $      (1)
    A → ( A )    (2)
    A → ( )      (3)

Slide 40: DFA States Based on LR(0) Items
We need to capture how much of a given production we have scanned so far:
    A → ( • A )   (are we here?)
    A → ( A • )   (or here?)

Slide 41: LR(0) Items
- We need to capture how much of a given production we have scanned so far
- The production A → ( A ) generates 4 items:
    A → • ( A )
    A → ( • A )
    A → ( A • )
    A → ( A ) •

Slide 42: Example of LR(0) Items
The grammar:
    S → A $
    A → ( A )
    A → ( )
The items:
    S → • A $      S → A • $
    A → • ( A )    A → ( • A )    A → ( A • )    A → ( A ) •
    A → • ( )      A → ( • )      A → ( ) •

Slide 43: Key Idea Behind LR(0) Items
- If the "current state" contains the item A → α • c β and the current symbol in the input buffer is c:
  - the state prompts the parser to perform a shift action
  - the next state will contain A → α c • β
- If the "state" contains the item A → α •:
  - the state prompts the parser to perform a reduce action
- If the "state" contains the item S → α • $ and the input buffer is empty:
  - the state prompts the parser to accept
- But how about A → α • B β, where B is a nonterminal?

Slide 44: The NFA for LR(0) Items
The transitions between LR(0) items can be represented by an NFA, in which:
  1. each LR(0) item is a state,
  2. there is a transition from item A → α • X β to item A → α X • β with label X,
  3. there is an ε-transition from item A → α • B β to B → • γ,
  4. S → • A $ is the start state,
  5. A → α • is a final state.

Slide 45: Example NFA for Items
LR(0) items:
    S → • A $      S → A • $
    A → • ( A )    A → ( • A )    A → ( A • )    A → ( A ) •
    A → • ( )      A → ( • )      A → ( ) •
(figure: the NFA over these items, with transitions labeled A, (, ), and ε)

Slide 46: The DFA from LR(0) Items
- After the NFA for LR(0) items is constructed, the DFA for LR(0) parsing can be obtained by the usual NFA-to-DFA construction
- We thus require:
  - ε-closure(I)
  - move(S, a)

Slide 47: Closure() of a Set of Items
- Closure finds all the items in the same "state"
- Fixed-point algorithm for Closure(I):
  - Every item in I is also an item in Closure(I)
  - If A → α • B β is in Closure(I) and B → γ is a production, then add B → • γ to Closure(I)
  - Repeat until no more new items can be added to Closure(I)
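As Python (a direct rendering of the fixed-point algorithm; the item encoding (lhs, rhs, dot) with rhs a tuple is this sketch's convention):

    def closure(items, grammar):
        """LR(0) closure of a set of items (lhs, rhs, dot)."""
        result = set(items)
        changed = True
        while changed:                      # fixed point
            changed = False
            for lhs, rhs, dot in list(result):
                if dot < len(rhs):          # dot sits before some symbol B
                    b = rhs[dot]
                    for plhs, prhs in grammar:
                        if plhs == b:       # add B -> . gamma
                            item = (plhs, prhs, 0)
                            if item not in result:
                                result.add(item)
                                changed = True
        return frozenset(result)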

Slide 48: Example of Closure
For the grammar S → A $, A → ( A ), A → ( ):
    Closure({ A → ( • A ) }) = { A → ( • A ), A → • ( A ), A → • ( ) }

Slide 49: Another Example
For the grammar S → A $, A → ( A ), A → ( ):
    Closure({ S → • A $ }) = { S → • A $, A → • ( A ), A → • ( ) }

Slide 50: Goto() of a Set of Items
- Goto finds the new state after consuming a grammar symbol while at the current state
- Algorithm for Goto(I, X), where I is a set of items and X is a grammar symbol:
    Goto(I, X) = Closure( { A → α X • β | A → α • X β in I } )
- Goto is the new set obtained by "moving the dot" over X
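In Python, building on the closure sketch above:

    def goto(items, x, grammar):
        """Move the dot over X in every item with X right after the dot,
        then close the result."""
        moved = {(lhs, rhs, dot + 1)
                 for lhs, rhs, dot in items
                 if dot < len(rhs) and rhs[dot] == x}
        return closure(moved, grammar)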

Slide 51: Example of Goto
For the grammar S → A $, A → ( A ), A → ( ):
    Goto({ A → ( • A ) }, A) = { A → ( A • ) }

Slide 52: Example of Goto
    Goto({ A → ( • A ), A → • ( A ), A → • ( ) }, ( )
      = Closure({ A → ( • A ), A → ( • ) })
      = { A → ( • A ), A → ( • ), A → • ( A ), A → • ( ) }

Slide 53: Building the DFA States
- Essentially the usual NFA-to-DFA construction!
- Let A be the start symbol and S a new start symbol
- Create a new rule S → A $
- Create the first state to be Closure({ S → • A $ })
- Pick a state I
  - for each item A → α • X β in I:
    - find Goto(I, X)
    - if Goto(I, X) is not already a state, make one
    - add an edge X from state I to the Goto(I, X) state
- Repeat until no more additions are possible
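A worklist rendering in Python, reusing the closure and goto sketches above (the tuple grammar encoding is this sketch's assumption):

    def build_dfa(grammar, start_item):
        """Collect all LR(0) states (frozensets of items) and the
        transition map edges[(state, symbol)] -> state."""
        start = closure({start_item}, grammar)
        states, edges = {start}, {}
        worklist = [start]
        while worklist:                     # repeat until no more additions
            state = worklist.pop()
            symbols = {rhs[dot] for _, rhs, dot in state if dot < len(rhs)}
            for x in symbols:
                nxt = goto(state, x, grammar)
                if nxt not in states:
                    states.add(nxt)
                    worklist.append(nxt)
                edges[(state, x)] = nxt
        return states, edges

    # The slide-39 grammar; rhs as tuples so items are hashable.
    G = [("S", ("A", "$")), ("A", ("(", "A", ")")), ("A", ("(", ")"))]
    states, edges = build_dfa(G, ("S", ("A", "$"), 0))
    # Yields the states of slide 54 (plus one more where $ has been shifted).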

Slide 54: DFA Example
Grammar: S → A $, A → ( A ), A → ( )
    s0: S → • A $, A → • ( A ), A → • ( )
    s0 --A--> s1: S → A • $
    s0 --(--> s2: A → ( • A ), A → ( • ), A → • ( A ), A → • ( )
    s2 --(--> s2
    s2 --A--> s3: A → ( A • )
    s2 --)--> s5: A → ( ) •
    s3 --)--> s4: A → ( A ) •

Slide 55: Constructing an LR(0) Parse Engine
- Build a DFA
  - DONE
- Construct a parse table using the DFA

Slide 56: Creating the Parse Tables
For each state:
- A transition to another state on a terminal symbol is a shift to that state (shift sn)
- A transition to another state on a non-terminal is a goto to that state (goto sn)
- If there is an item A → α • in the state, do a reduction with that production for all terminals (reduce k)
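A sketch of this LR(0) table construction over the DFA built above (the state numbering and the conflict handling, which silently keeps an existing shift, are this sketch's simplifications; a real generator reports such conflicts):

    def make_tables(states, edges, grammar, terminals):
        """Shift on terminal edges, goto on nonterminal edges; for a state
        holding a completed item A -> alpha ., reduce on every terminal
        (the LR(0) rule from this slide). Accept handling is elided."""
        action, goto_tbl = {}, {}
        ids = {s: i for i, s in enumerate(states)}     # number the states
        for (state, x), nxt in edges.items():
            if x in terminals:
                action[(ids[state], x)] = ("shift", ids[nxt])
            else:
                goto_tbl[(ids[state], x)] = ids[nxt]
        for state in states:
            for lhs, rhs, dot in state:
                if dot == len(rhs):                    # A -> alpha .
                    for t in terminals:
                        action.setdefault((ids[state], t),
                                          ("reduce", lhs, len(rhs)))
        return action, goto_tbl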

Slide 57: Building Parse Table Example
(figure: the same DFA as on slide 54)
Applying the rules above gives:
    s0: shift s2 on (; goto s1 on A
    s1: accept on $
    s2: shift s2 on (; shift s5 on ); goto s3 on A
    s3: shift s4 on )
    s4: reduce (2) A → ( A ) on every terminal
    s5: reduce (3) A → ( ) on every terminal

Slide 58: Problem With LR(0) Parsing
- No lookahead
- Vulnerable to unnecessary conflicts
  - Shift/reduce conflicts (may reduce too soon in some cases)
  - Reduce/reduce conflicts
- Solutions:
  - SLR(1) parsing: reduce only when the next symbol can occur after the nonterminal of the production
  - LR(1) parsing: systematic lookahead

Slide 59: SLR(1) Parsing
- If a state contains A → β •, reduce by A → β only if the next input symbol can follow A in some derivation
- Example grammar:
    S → A $
    A → a
    A → a b

Slide 60: LR(0) Parser
Grammar: S → A $, A → a, A → a b
    s0: S → • A $, A → • a, A → • a b
    s0 --A--> s3: S → A • $
    s0 --a--> s1: A → a •, A → a • b
    s1 --b--> s2: A → a b •
In s1 the LR(0) parser has a shift/reduce conflict: shift b, or reduce A → a.

Slide 61: Creating SLR Parse Tables
For each state:
- A transition to another state on a terminal symbol is a shift to that state (shift sn) (same as LR(0))
- A transition to another state on a non-terminal is a goto to that state (goto sn) (same as LR(0))
- If there is an item A → α • in the state, do a reduction with that production for all terminals that can follow A

Slide 62: Follow() Sets
For each non-terminal A, Follow(A) is the set of terminals that can come after A in some derivation

Slide 63: Constraints for Follow()
1. $ ∈ Follow(S), where S is the start symbol
2. If A → α B β is a production, then First(β) ⊆ Follow(B)
3. If A → α B is a production, then Follow(A) ⊆ Follow(B)
4. If A → α B β is a production and β derives ε, then Follow(A) ⊆ Follow(B)
Note: 3 is a special case of 4

Slide 64: Algorithm for Follow
    for all nonterminals NT:
        Follow(NT) = {}
    Follow(S) = {}    // {} rather than {$}, since we add the rule S' → S $
    for all productions A → α B β with a nonterminal B on the rhs:
        Follow(B) = Follow(B) ∪ First(β)
    while Follow sets keep changing:
        for all productions A → α B β with a nonterminal B:
            if β derives ε:
                Follow(B) = Follow(B) ∪ Follow(A)
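The same fixed point in Python (a sketch that assumes the First sets and the nullable set are already computed, with terminals' First sets being {themselves}):

    def follow_sets(grammar, first, nullable):
        """Iterate the Follow constraints of slide 63 to a fixed point."""
        follow = {lhs: set() for lhs, _ in grammar}   # {} even for S: $ is in the grammar
        changed = True
        while changed:
            changed = False
            for lhs, rhs in grammar:
                for i, b in enumerate(rhs):
                    if b not in follow:               # terminal: skip
                        continue
                    before = len(follow[b])
                    for x in rhs[i + 1:]:             # First(beta) into Follow(B)
                        follow[b] |= first[x]
                        if x not in nullable:
                            break
                    else:                             # beta derives epsilon
                        follow[b] |= follow[lhs]
                    changed |= len(follow[b]) != before
        return follow

    G = [("S", ("A", "$")), ("A", ("a",)), ("A", ("a", "b"))]
    FIRST = {"a": {"a"}, "b": {"b"}, "$": {"$"}, "A": {"a"}, "S": {"a"}}
    print(follow_sets(G, FIRST, nullable=set()))   # {'S': set(), 'A': {'$'}}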

Slide 65: Augmenting Example with Follow
Example grammar for Follow:
    S → A $
    A → a
    A → a b
    Follow(S) = { }
    Follow(A) = { $ }

Slide 66: SLR Eliminates the Shift/Reduce Conflict
(same DFA as slide 60)
In s1 = { A → a •, A → a • b }: since b ∉ Follow(A) = { $ }, on input b we shift; the reduce applies only on $. The conflict is gone.

Slide 67: Basic Idea Behind LR(1)
- Split states in the LR(0) DFA based on lookahead
- Reduce based on item and lookahead

Slide 68: LR(1) Items
- An LR(1) item is a pair: [A → α • β, a]
  - A → αβ is a production
  - a is a terminal (the lookahead terminal)
  - LR(1) means 1 lookahead terminal
- [A → α • β, a] describes a context of the parser:
  - We are trying to find an A followed by an a, and
  - We have α already on top of the stack
  - Thus we need to see next a prefix derived from βa

Slide 69: Note
- The symbol | was used before to separate the stack from the rest of the input
  - α | γ, where α is the stack and γ is the remaining string of terminals
- In items, • is used to mark a prefix of a production rhs: [A → α • β, a]
  - Here β might contain non-terminals as well
- In both cases the stack is on the left

Slide 70: Convention
- We add to our grammar a fresh new start symbol S and a production S → E
  - where E is the old start symbol
- The initial parsing context contains: [S → • E, $]
  - Trying to find an S as a string derived from E$
  - The final stack contains just E (we accept on $)

Slide 71: LR(1) Items (Cont.)
- In a context containing [E → E + • ( E ), +]:
  - If ( follows, then we can perform a shift to a context containing [E → E + ( • E ), +]
- In a context containing [E → E + ( E ) •, +]:
  - We can perform a reduction with E → E + ( E )
  - But only if a + follows

Slide 72: LR(1) Items (Cont.)
- Consider the item [E → E + ( • E ), +]
- We expect a string derived from E ) +
- There are two productions for E:
    E → int   and   E → E + ( E )
- We describe this by extending the context with two more items:
    [E → • int, )]
    [E → • E + ( E ), )]

Slide 73: The Closure Operation
The operation of extending the context with items is called the closure operation:
    Closure(Items) =
      repeat
        for each [A → α • B β, a] in Items
          for each production B → γ
            for each b ∈ First(βa)
              add [B → • γ, b] to Items
      until Items is unchanged
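In Python (a sketch: items are (lhs, rhs, dot, lookahead) tuples, and First/nullable are assumed precomputed as in the Follow sketch earlier):

    def closure1(items, grammar, first, nullable):
        """LR(1) closure: for [A -> alpha . B beta, a], add [B -> . gamma, b]
        for every production B -> gamma and every b in First(beta a)."""
        def first_of(seq):          # (First(seq), can seq derive epsilon?)
            out = set()
            for x in seq:
                out |= first[x]
                if x not in nullable:
                    return out, False
            return out, True

        result = set(items)
        work = list(items)
        while work:
            lhs, rhs, dot, la = work.pop()
            if dot >= len(rhs):
                continue
            b = rhs[dot]
            firsts, eps = first_of(rhs[dot + 1:])
            lookaheads = firsts | ({la} if eps else set())   # First(beta a)
            for plhs, prhs in grammar:
                if plhs == b:
                    for t in lookaheads:
                        item = (plhs, prhs, 0, t)
                        if item not in result:
                            result.add(item)
                            work.append(item)
        return frozenset(result)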

Slide 74: Constructing the Parsing DFA (1)
Construct the start context: Closure({ [S → • E, $] })
    S → • E, $
    E → • E + (E), $
    E → • int, $
    E → • E + (E), +
    E → • int, +
We abbreviate this as:
    S → • E, $
    E → • E + (E), $/+
    E → • int, $/+

Slide 75: Constructing the Parsing DFA (2)
- A DFA state is a closed set of LR(1) items
- The start state is Closure({ [S → • E, $] })
- A state that contains [A → α •, b] is labeled with "reduce with A → α on b"
- And now the transitions ...

Slide 76: The DFA Transitions
- A state s that contains [A → α • y β, b] has a transition labeled y to a state Transition(s, y) that contains the item [A → α y • β, b]
  - y can be a terminal or a non-terminal
    Transition(s, y):
      Items ← {}
      for each [A → α • y β, b] ∈ s
        add [A → α y • β, b] to Items
      return Closure(Items)
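Rendered in Python on top of the LR(1) closure sketch above:

    def transition1(state, y, grammar, first, nullable):
        """Move the dot over y in every item [A -> alpha . y beta, b],
        then close the result."""
        moved = {(lhs, rhs, dot + 1, la)
                 for lhs, rhs, dot, la in state
                 if dot < len(rhs) and rhs[dot] == y}
        return closure1(moved, grammar, first, nullable)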

Slide 77: Constructing the Parsing DFA. Example
    state 0:   S → • E, $      E → • E + (E), $/+    E → • int, $/+
    0 --int--> 1:  E → int •, $/+                    [E → int on $, +]
    0 --E--> 2:    S → E •, $  E → E • + (E), $/+    [accept on $]
    2 --+--> 3:    E → E + • (E), $/+
    3 --(--> 4:    E → E + ( • E), $/+    E → • E + (E), )/+    E → • int, )/+
    4 --int--> 5:  E → int •, )/+                    [E → int on ), +]
    4 --E--> 6:    E → E + (E • ), $/+    E → E • + (E), )/+
    and so on ...

Slide 78: LR Parsing Tables. Notes
- Parsing tables (i.e., the DFA) can be constructed automatically for a CFG
- But we still need to understand the construction to work with parser generators
  - E.g., they report errors in terms of sets of items
- What kinds of errors can we expect?

Slide 79: Shift/Reduce Conflicts
- If a DFA state contains both [A → α • a β, b] and [B → γ •, a]
- Then on input "a" we could either:
  - Shift into the state containing [A → α a • β, b], or
  - Reduce with B → γ
- This is called a shift/reduce conflict

Slide 80: Shift/Reduce Conflicts
- Typically due to ambiguities in the grammar
- Classic example: the dangling else
    S → if E then S | if E then S else S | OTHER
- Will have a DFA state containing:
    [S → if E then S •, else]
    [S → if E then S • else S, x]
- If else follows, then we can shift or reduce
- Default (bison, CUP, etc.) is to shift
  - The default behavior is as needed in this case

Slide 81: More Shift/Reduce Conflicts
- Consider the ambiguous grammar
    E → E + E | E * E | int
- We will have states containing
    [E → E * • E, +]          [E → E * E •, +]
    [E → • E + E, +]  --E-->  [E → E • + E, +]  ...
- Again we have a shift/reduce conflict on input +
  - We need to reduce (* binds more tightly than +)
  - Recall the solution: declare the precedence of * and +

Slide 82: More Shift/Reduce Conflicts
- In bison, declare precedence and associativity:
    %left '+'
    %left '*'
- In CUP:
    precedence left ADD, SUB;
    precedence left MULT, DIV;
- Precedence of a rule = that of its last terminal
  - See the CUP/bison manual for ways to override this default
- Resolve a shift/reduce conflict with a shift if:
  - no precedence is declared for either the rule or the terminal, or
  - the input terminal has higher precedence than the rule, or
  - the precedences are the same and the terminal is right-associative

Slide 83: Using Precedence to Solve S/R Conflicts
- Back to our example:
    [E → E * • E, +]          [E → E * E •, +]
    [E → • E + E, +]  --E-->  [E → E • + E, +]  ...
- We will choose reduce because the precedence of the rule E → E * E is higher than that of the terminal +

Slide 84: Using Precedence to Solve S/R Conflicts
- Same grammar as before:
    E → E + E | E * E | int
- We will also have the states
    [E → E + • E, *]          [E → E + E •, *]
    [E → • E * E, *]  --E-->  [E → E • * E, *]  ...
- Now we also have a shift/reduce conflict on input *
  - We choose shift because * has higher precedence than the rule E → E + E

Slide 85: Using Precedence to Solve S/R Conflicts
- The grammar:
    E → E + E | E - E | E * E | int
- We will also have the states
    [E → E + • E, -]          [E → E + E •, -]
    [E → • E - E, -]  --E-->  [E → E • - E, -]  ...
- Now we also have a shift/reduce conflict on input -
  - We choose reduce because E → E + E and - have the same precedence and +/- is left-associative

Slide 86: Using Precedence to Solve S/R Conflicts
- Back to our dangling else example:
    [S → if E then S •, else]
    [S → if E then S • else S, x]
- Can eliminate the conflict by declaring else with higher precedence than then
  - Or just rely on the default shift action
- But this starts to look like "hacking the parser"
- Best to avoid overuse of precedence declarations, or you'll end up with unexpected parse trees

Slide 87: Reduce/Reduce Conflicts
- If a DFA state contains both [A → α •, a] and [B → β •, a]
  - Then on input "a" we don't know which production to reduce with
- This is called a reduce/reduce conflict

Slide 88: Reduce/Reduce Conflicts
- Usually due to gross ambiguity in the grammar
- Example: a sequence of identifiers
    S → ε | id | id S
- There are two parse trees for the string id:
    S → id
    S → id S → id
- How does this confuse the parser?

Slide 89: More on Reduce/Reduce Conflicts
- Consider the states
    [S' → • S, $]              [S → id •, $]
    [S → •, $]                 [S → id • S, $]
    [S → • id, $]    --id-->   [S → •, $]
    [S → • id S, $]            [S → • id, $]
                               [S → • id S, $]
- Reduce/reduce conflict on input $:
    S' → S → id
    S' → S → id S → id
- Better to rewrite the grammar: S → ε | id S

Slide 90: Using Parser Generators
- Parser generators construct the parsing DFA given a CFG
  - They use precedence declarations and default conventions to resolve conflicts
  - The parser algorithm is the same for all grammars (and is provided as a library function)
- But most parser generators do not construct the DFA as described before
  - Because the LR(1) parsing DFA has thousands of states even for a simple language

Slide 91: LR(1) Parsing Tables are Big
- But many states are similar, e.g.,
    state 1:  E → int •, $/+    [E → int on $, +]
    state 5:  E → int •, )/+    [E → int on ), +]
- Idea: merge the DFA states whose items differ only in the lookahead tokens
  - We say that such states have the same core
- We obtain
    state 1': E → int •, $/+/)  [E → int on $, +, )]

Slide 92: The Core of a Set of LR Items
- Definition: the core of a set of LR items is the set of first components
  - Without the lookahead terminals
- Example: the core of { [X → α • β, b], [Y → γ • δ, d] } is { X → α • β, Y → γ • δ }

Slide 93: LALR States
- Consider for example the LR(1) states
    { [A → α •, a], [B → β •, c] }
    { [A → α •, b], [B → β •, d] }
- They have the same core and can be merged
- The merged state contains:
    { [A → α •, a/b], [B → β •, c/d] }
- These are called LALR(1) states
  - LALR stands for LookAhead LR
  - Typically 10 times fewer LALR(1) states than LR(1)
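A Python sketch of the merge itself (the edge redirection is described on the next slide and omitted here):

    def core(state):
        """First components only: drop the lookaheads."""
        return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)

    def merge_by_core(states):
        """Union the items of all LR(1) states that share a core,
        which unions their lookahead sets item by item."""
        merged = {}
        for state in states:
            merged.setdefault(core(state), set()).update(state)
        return [frozenset(s) for s in merged.values()]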

Slide 94: An LALR(1) DFA
- Repeat until all states have distinct cores:
  - Choose two distinct states with the same core
  - Merge the states by creating a new one with the union of all the items
  - Point edges from predecessors to the new state
  - The new state points to all the previous successors
(figure: a DFA with states A-F before merging, and A, B/E, C, D, F after)

Slide 95: Conversion of LR(1) to LALR(1). Example
(figure: the LR(1) DFA of slide 30, states 0-11, with reduce labels "E → int on $, +" (state 1), "E → int on ), +" (state 5), "E → E + (E) on $, +" (state 7), "E → E + (E) on ), +" (state 11), and "accept on $"; merging same-core states yields an LALR(1) DFA with states 0, 1/5, 2, 3/8, 4/9, 6/10, 7/11 and labels "E → int on $, +, )", "E → E + (E) on $, +, )", and "accept on $")

Slide 96: The LALR Parser Can Have Conflicts
- Consider for example the LR(1) states
    { [A → α •, a], [B → β •, b] }
    { [A → α •, b], [B → β •, c] }
- The merged LALR(1) state
    { [A → α •, a/b], [B → β •, b/c] }
  has a new reduce/reduce conflict on b
- In practice such cases are rare

Slide 97: Practical Method to Construct the LALR(1) Machine
- An alternative method to construct the LALR(1) finite state machine
  - rather than constructing the LR(1) machine and then merging states
  - adopted by many LR parser generators
- Steps:
  1. Construct the cores (i.e., a DFA for LR(0)/SLR(1))
  2. Compute lookaheads:
     2.1. create propagation links between LR(1) items
     2.2. determine spontaneous lookaheads
     2.3. propagate until convergence
- Inter-state link: from [A → α • X β, L1] to the successor kernel item [A → α X • β, L2]: L1 ⊆ L2
- Intra-state: from [B → α • A β, L] to a closure item [A → • γ]:
    the lookahead is First(β) ∪ L if β ⇒* ε (propagation), and First(β) otherwise (spontaneous)

Slides 98-103: Computing LALR(1) Lookaheads. Example
- Input: the LR(0) DFA for the grammar S → A, A → (A), A → ( ) (the DFA of slide 54, without $)
- Inter-state propagation links run from each item [A → α • X β] to the kernel item [A → α X • β] of the successor state; intra-state links run from a kernel item [B → α • A β, L] to its closure items [A → • γ] only when β derives ε
- Spontaneous lookaheads: in s2, the closure items A → • (A) and A → • ( ) receive First( ")" ) = { ) }
- Propagation: $ starts at [S → • A, $] in s0, flows into s0's closure items and into s1; the ) and $ lookaheads then propagate through s2, s3, s4, and s5 until convergence
- Result: S → A • has lookahead { $ }; every A-item in s2 through s5 ends with lookahead set { ), $ }

Slide 104: LALR vs. LR Parsing
- LALR languages are not natural
  - They are an efficiency hack on LR languages
- Any reasonable programming language has an LALR(1) grammar
- LALR(1) has become a standard for programming languages and for parser generators

Slide 105: A Hierarchy of Grammar Classes
(figure from Andrew Appel, "Modern Compiler Implementation in Java")

Slide 106: Notes on Parsing
Parsing:
- A solid foundation: context-free grammars
- A simple parser: LL(1)
- A more powerful parser: LR(1)
- An efficiency hack: LALR(1)
- LALR(1) parser generators

Slide 107: Supplement to LR Parsing
Strange reduce/reduce conflicts due to LALR conversion (from the bison manual)

Slide 108: Strange Reduce/Reduce Conflicts (skipped)
- Consider the grammar:
    S → P R ,
    NL → N | N , NL
    P → T | NL : T
    R → T | N : T
    N → id
    T → id
- P: parameter specification
  - P is id, or a comma-separated list of ids followed by : id
- R: result specification
  - R is id, or id : id
- N: a parameter or result name
- T: a type name
- NL: a list of names

Slide 109: Strange Reduce/Reduce Conflicts (skipped)
- In P an id is:
  - an N when followed by , or :
  - a T when followed by id
- In R an id is:
  - an N when followed by :
  - a T when followed by ,
- Ex: id(N1) , id(N2) : id(T3)   id(N4) : id(T5) ,
- This is an LR(1) grammar
- But it is not LALR(1). Why?
  - For obscure reasons

Slide 110: A Few LR(1) States
    state 1:                      state 2:
      P → • T, id                   R → • T, ,
      P → • NL : T, id              R → • N : T, ,
      NL → • N, :                   T → • id, ,
      NL → • N , NL, :              N → • id, :
      N → • id, :
      N → • id, ,
      T → • id, id
    state 1 --id--> state 3:      state 2 --id--> state 4:
      T → id •, id                  T → id •, ,
      N → id •, :                   N → id •, :
      N → id •, ,
    LALR merge of states 3 and 4:
      T → id •, id/,
      N → id •, :/,
    ⇒ LALR reduce/reduce conflict on ","

Slide 111: What Happened?
- Two distinct states were confused because they have the same core
- Fix: add dummy productions to distinguish the two confused states
- E.g., add R → id bogus
  - bogus is a terminal not used by the lexer
  - This production will never be used during parsing
  - But it distinguishes R from P

Slide 112: A Few LR(1) States After the Fix
    state 1:                      state 2:
      P → • T, id                   R → • T, ,
      P → • NL : T, id              R → • N : T, ,
      NL → • N, :                   R → • id bogus, ,
      NL → • N , NL, :              T → • id, ,
      N → • id, :                   N → • id, :
      N → • id, ,
      T → • id, id
    state 1 --id--> state 3:      state 2 --id--> state 4:
      T → id •, id                  T → id •, ,
      N → id •, :                   N → id •, :
      N → id •, ,                   R → id • bogus, ,
    Different cores ⇒ no LALR merging

