Presentation is loading. Please wait.

Presentation is loading. Please wait.

Bottom-Up Syntax Analysis Mooly Sagiv html://www.math.tau.ac.il/~msagiv/courses/wcc02.html Textbook:Modern Compiler Implementation in C Chapter 3.

Similar presentations


Presentation on theme: "Bottom-Up Syntax Analysis Mooly Sagiv html://www.math.tau.ac.il/~msagiv/courses/wcc02.html Textbook:Modern Compiler Implementation in C Chapter 3."— Presentation transcript:

1 Bottom-Up Syntax Analysis Mooly Sagiv html://www.math.tau.ac.il/~msagiv/courses/wcc02.html Textbook:Modern Compiler Implementation in C Chapter 3

2 Efficient Parsers Pushdown automata Deterministic Report an error as soon as the input is not a prefix of a valid program Not usable for all context free grammars bison context free grammar tokens parser “Ambiguity errors” parse tree

3 Kinds of Parsers Top-Down (Predictive Parsing) LL –Construct parse tree in a top-down matter –Find the leftmost derivation –For every non-terminal and token predict the next production Bottom-Up LR –Construct parse tree in a bottom-up manner –Find the rightmost derivation in a reverse order – For every potential right hand side and token decide when a production is found

4 Bottom-Up Syntax Analysis Input –A context free grammar –A stream of tokens Output –A syntax tree or error Method –Construct parse tree in a bottom-up manner –Find the rightmost derivation in (reversed order) – For every potential right hand side and token decide when a production is found –Report an error as soon as the input is not a prefix of valid program

5 Plan Pushdown automata Bottom-up parsing (given a parser table) Constructing the parser table Interesting non LR grammars

6 Pushdown Automaton control parser-table input stack $ $utw V

7 Bottom-Up Parser Actions reduce A   –Pop |  | symbols and |  | states from the stack –Apply the associated action –Push a symbol goto[top, A] on the stack shift X –Push X onto the stack –Advance the input accept –Parsing is complete error –Report an error

8 A Parser Table for S  a S b|  statedescriptionab$S 0initials1 r S   4 1parsed as1 r S   2 2parsed Ss3 3parsed a S b r S  a S b 4finalacc

9 stateab$S 0s1 r S   4 1s1 r S   2 2s3 3 r S  a S b 4acc stackinputaction 0($)aabb$s1 0($) 1(a)abb$s1 0($) 1(a)1(a)bb$ r S   0($) 1(a)1(a)2(S)bb$s3 0($) 1(a)1(a)2(S)3(b)b$ r S  a S b 0($) 1(a)2(S)b$s3 0 ($)1(a)2(S)3(b)$ r S  a S b 0($) 4(S)$acc

10 stateab$S 0s1 r S   4 1s1 r S   2 2s3 3 r S  a S b 4acc stackinputaction 0($)abb$s1 0($) 1(a)bb$ r S   0($) 1(a)2(S)bb$s3 0($) 1(a)2(S)3(b)b$ r S  a S b 0($) 4(S)b$err

11 A Parser Table for S  AB | A A  a B  a statedescriptiona$ABS 0initials125 1parsed a from Ar A  a 2parsed As3r S  A 4 3parsed a from Br B  a 4parsed ABr S  AB 5finalacc

12 stackinputaction 0($)aa$s1 0 ($)1(a)a$ r A  a 0 ($)2(A)a$s3 0 ($)2(A)3(a)$ r B  a 0 ($)2(A)4(B)$ r S  AB 0($)5(S)$acc statea$ABS 0s125 1 r A  a 2s3 r S  A 4 3 r B  a 4 r S  AB 5acc

13 A Parser Table for E  E + E| id statedescriptionid+$E 0initials12 1parsed id r E  id 2parsed Es3acc 3parsed E+s14 4parsed E+Es3 r E  E + E

14 The Challenge How to construct a parser-table from a given grammar LR(1) grammars –Left to right scanning –Rightmost derivations (reverse) –1 token Different solutions –Operator precedence –SLR(1) Simple LR(1) –CLR(1) Canonic LR(1) –LALR(1) Look Ahead LR(1) Yacc, Bison, CUP

15 Grammar Hierarchy Non-ambiguous CFG CLR(1) LALR(1) SLR(1) LL(1)

16 Constructing an SLR parsing table Add a production S’  S$ Construct a finite automaton accepting “valid stack symbols” –The states of the automaton becomes the states of parsing-table –Determine shift operations –Determine goto operations Construct reduce entries by analyzing the grammar

17 A finite Automaton for S’  S$ S  a S b|  statedescriptionab$S 0initials14 1parsed as12 2parsed Ss3 3parsed a S b 4final 012 3 4 a b S S a

18 Constructing a Finite Automaton NFA –For X  X 1 X 2 … X n [X  X 1 X 2 …X i  X i+1 … X n ] –“prefixes of rhs (handles)” –X 1 X 2 … X i is at the top of the stack and we expect X i+1 … X n The initial state [S’ . S$ ] –  ([X  X 1 …X i  X i+1 … X n ], X i+1 ) = [X  X 1 …X i X i+1  … X n ] –For every production X i+1    ([[X  X 1 X 2 …X i  X i+1 … X n ],  ) = [X i+1    ] Convert into DFA

19 NFA S’  S$ S  a S b|  [S’ .S$] [S’  S.$] [S .aSb] [S .] [S  a.Sb] [S  aSb.] [S  aS.b] S S b a    

20 [S’ .S$] [S’  S.$] [S .aSb] [S .] [S  a.Sb] [S  aSb.] [S  aS.b] S S b a     [S’ .S$] [S .aSb] [S .] [S  a.Sb] [S .aSb] [S .] a [S’  S.$] S [S  aS.b] S [S  aSb.] b DFA a

21 stateitemsab$S 0 [S .S$, S .aSb, S .] s14 1 [S  a.Sb$, S .aSb, S .] s12 2 [S  aS.b] s3 3 [S  aSb.] 4 [S’  S.$] acc [S’ .S$] [S .aSb] [S .] [S  a.Sb] [S .aSb] [S .] a [S’  S.$] S [S  aS.b] S [S  aSb.] b a

22 Filling reduce entries For an item [A .] we need to know the tokens that can follow A in a derivation from S’ Follow(A) = {t | S’  *  At  } See the textbook for an algorithm for constructing Follow from a given grammar

23 stateitemsab$S 0 [S .S$, S .aSb, S .] s14 1 [S  a.Sb$, S .aSb, S .] s12 2 [S  aS.b] s3 3 [S  aSb.] 4 [S’  S.$] acc [S’ .S$] [S .aSb] [S .] [S  a.Sb] [S .aSb] [S .] a [S’  S.$] S [S  aS.b] S [S  aSb.] b a Follow(S) = {b, $} r S   r S  a S b

24 Example E  E + E| E* E | id [S’ .E$] [E .E + E] [E .E *E] [E .id] id [E  id.] E [S’  E.$] [E  E. + E] [E  E. *E] [E  E +. E] [E .E +E] [E .E*E] [E .id] + [E  E *. E] [E .E +E] [E .E*E] [E .id] * id E E [E  E + E.] [E  E.+E] [E  E.*E] [E  E *E.] [E  E.+E] [E  E.*E] E E 1 2 3 4 5 6 7 + * * +

25 Interesting Non SLR(1) Grammar S’  S$ S  L = R | R L  *R | id R  L [S’ .S$] [S .L=R] [S .R] [L .*R] [L .id] [R  L] [S  L.=R] [R  L.] L = Partial DFA [S  L=.R] [R .L] [L .*R] [L .id] Follow(R)= {$, =}

26 LR(1) Parser Item [A . , t] –  is at the top of the stack and we are expecting  t LR(1) State –Sets of items LALR(1) State –Merge items with the same look-ahead

27 Interesting Non LR(1) Grammars Ambiguous –Arithmetic expressions –Dangling-else Common derived prefix –A  B 1 a b | B 2 a c –B 1   –B 2   Optional non-terminals –St  OptLab Ass –OptLab  id : |  –Ass  id := Exp

28 A motivating example Create a desk calculator Challenges –Non trivial syntax –Recursive expressions (semantics) Operator precedence

29 Solution (lexical analysis) /* desk.l */ % [0-9]+ { yylval = atoi(yytext); return NUM ;} “+” { return PLUS ;} “-” { return MINUS ;} “/” { return DIV ;} “*” { return MUL ;} “(“ { return LPAR ;} “)” { return RPAR ;} “//”.*[\n] ; /* comment */ [\t\n ] ; /* whitespace */. { error (“illegal symbol”, yytext[0]); }

30 Solution (syntax analysis) /* desk.y */ %token NUM %left PLUS, MINUS %left MUL, DIV %token LPAR, RPAR %start P % P : E {printf(“%d\n”, $1) ;} ; E : NUM { $$ = $1 ;} | LPAR e RPAR { $$ = $2; } | e PLUS e { $$ = $1 + $3 ; } | e MINUS e { $$ = $1 - $3 ; } | e MUL e { $$ = $1 * $3; } | e DIV e {$$ = $1 / $3; } ; % #include “lex.yy.c” flex desk.l bison desk.y cc y.tab.c –ll -ly

31 Summary LR is a powerful technique Generates efficient parsers Generation tools exit –Bison, yacc, CUP But some grammars need to be tuned –Shift/Reduce conflicts –Reduce/Reduce conflicts –Efficiency of the generated parser


Download ppt "Bottom-Up Syntax Analysis Mooly Sagiv html://www.math.tau.ac.il/~msagiv/courses/wcc02.html Textbook:Modern Compiler Implementation in C Chapter 3."

Similar presentations


Ads by Google