Bottom-Up Syntax Analysis Mooly Sagiv html://www.math.tau.ac.il/~msagiv/courses/wcc02.html Textbook:Modern Compiler Implementation in C Chapter 3.

Slides:



Advertisements
Similar presentations
A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.
Advertisements

Lesson 8 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Compiler Designs and Constructions
Compilation (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav.
Bottom up Parsing Bottom up parsing trys to transform the input string into the start symbol. Moves through a sequence of sentential forms (sequence of.
Chapter 3 Syntax Analysis
Joey Paquet, 2000, 2002, 2008, Lecture 7 Bottom-Up Parsing II.
Pushdown Automata Consists of –Pushdown stack (can have terminals and nonterminals) –Finite state automaton control Can do one of three actions (based.
Recap Roman Manevich Mooly Sagiv. Outline Subjects Studied Questions & Answers.
CSE 5317/4305 L4: Parsing #21 Parsing #2 Leonidas Fegaras.
Mooly Sagiv and Roman Manevich School of Computer Science
Predictive Parsing l Find derivation for an input string, l Build a abstract syntax tree (AST) –a representation of the parsed program l Build a symbol.
6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)
1 Chapter 5: Bottom-Up Parsing (Shift-Reduce). 2 - attempts to construct a parse tree for an input string beginning at the leaves (the bottom) and working.
1 Bottom Up Parsing. 2 Bottom-Up Parsing l Bottom-up parsing is more general than top-down parsing »And just as efficient »Builds on ideas in top-down.
Recap Mooly Sagiv. Outline Subjects Studied Questions & Answers.
By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.
Bottom-Up Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter (modified)
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter
CS Summer 2005 Top-down and Bottom-up Parsing - a whirlwind tour June 20, 2005 Slide acknowledgment: Radu Rugina, CS 412.
Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Implementation in C Chapter 3.
Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter 2.2 (Partial) Hashlama 11:00-14:00.
Bottom-Up Syntax Analysis Mooly Sagiv & Greta Yorsh Textbook:Modern Compiler Design Chapter (modified)
Bottom Up Parsing.
Chapter 4-2 Chang Chi-Chung Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR Other.
Prof. Fateman CS 164 Lecture 91 Bottom-Up Parsing Lecture 9.
Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter 2.2 (Partial)
Bottom-Up Syntax Analysis Mooly Sagiv & Greta Yorsh Textbook:Modern Compiler Design Chapter (modified)
Table-driven parsing Parsing performed by a finite state machine. Parsing algorithm is language-independent. FSM driven by table (s) generated automatically.
1 Bottom-up parsing Goal of parser : build a derivation –top-down parser : build a derivation by working from the start symbol towards the input. builds.
Bottom-up parsing Goal of parser : build a derivation
Syntax and Semantics Structure of programming languages.
LR Parsing Compiler Baojian Hua
10/13/2015IT 3271 Tow kinds of predictive parsers: Bottom-Up: The syntax tree is built up from the leaves Example: LR(1) parser Top-Down The syntax tree.
Parsing G Programming Languages May 24, 2012 New York University Chanseok Oh
Review 1.Lexical Analysis 2.Syntax Analysis 3.Semantic Analysis 4.Code Generation 5.Code Optimization.
Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
Chapter 3-3 Chang Chi-Chung Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR 
Syntax and Semantics Structure of programming languages.
Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter 2.2 (Partial) 1.
Chapter 5: Bottom-Up Parsing (Shift-Reduce)
Prof. Necula CS 164 Lecture 8-91 Bottom-Up Parsing LR Parsing. Parser Generators. Lecture 6.
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2005.
Bernd Fischer RW713: Compiler and Software Language Engineering.
Bottom Up Parsing CS 671 January 31, CS 671 – Spring Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 6: LR grammars and automatic parser generators.
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007.
Lecture 5: LR Parsing CS 540 George Mason University.
Compilers: Bottom-up/6 1 Compiler Structures Objective – –describe bottom-up (LR) parsing using shift- reduce and parse tables – –explain how LR.
Bottom-up parsing. Bottom-up parsing builds a parse tree from the leaves (terminals) to the start symbol int E T * TE+ T (4) (2) (3) (5) (1) int*+ E 
COMP 3438 – Part II - Lecture 4 Syntax Analysis I Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
CMSC 330: Organization of Programming Languages Pushdown Automata Parsing.
Chapter 8. LR Syntactic Analysis Sung-Dong Kim, Dept. of Computer Engineering, Hansung University.
COMPILER CONSTRUCTION
Syntax and Semantics Structure of programming languages.
Programming Languages Translator
50/50 rule You need to get 50% from tests, AND
Compiler Baojian Hua LR Parsing Compiler Baojian Hua
Unit-3 Bottom-Up-Parsing.
Textbook:Modern Compiler Design
Table-driven parsing Parsing performed by a finite state machine.
Chapter 4 Syntax Analysis.
Context-free Languages
Compiler Construction
Fall Compiler Principles Lecture 4: Parsing part 3
Bottom-Up Syntax Analysis
Syntax Analysis Part II
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Kanat Bolazar February 16, 2010
Compiler design Bottom-up parsing: Canonical LR and LALR
Presentation transcript:

Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Implementation in C Chapter 3

Efficient Parsers Pushdown automata Deterministic Report an error as soon as the input is not a prefix of a valid program Not usable for all context free grammars bison context free grammar tokens parser “Ambiguity errors” parse tree

Kinds of Parsers Top-Down (Predictive Parsing) LL –Construct parse tree in a top-down matter –Find the leftmost derivation –For every non-terminal and token predict the next production Bottom-Up LR –Construct parse tree in a bottom-up manner –Find the rightmost derivation in a reverse order – For every potential right hand side and token decide when a production is found

Bottom-Up Syntax Analysis Input –A context free grammar –A stream of tokens Output –A syntax tree or error Method –Construct parse tree in a bottom-up manner –Find the rightmost derivation in (reversed order) – For every potential right hand side and token decide when a production is found –Report an error as soon as the input is not a prefix of valid program

Plan Pushdown automata Bottom-up parsing (given a parser table) Constructing the parser table Interesting non LR grammars

Pushdown Automaton control parser-table input stack $ $utw V

Bottom-Up Parser Actions reduce A   –Pop |  | symbols and |  | states from the stack –Apply the associated action –Push a symbol goto[top, A] on the stack shift X –Push X onto the stack –Advance the input accept –Parsing is complete error –Report an error

A Parser Table for S  a S b|  statedescriptionab$S 0initials1 r S   4 1parsed as1 r S   2 2parsed Ss3 3parsed a S b r S  a S b 4finalacc

stateab$S 0s1 r S   4 1s1 r S   2 2s3 3 r S  a S b 4acc stackinputaction 0($)aabb$s1 0($) 1(a)abb$s1 0($) 1(a)1(a)bb$ r S   0($) 1(a)1(a)2(S)bb$s3 0($) 1(a)1(a)2(S)3(b)b$ r S  a S b 0($) 1(a)2(S)b$s3 0 ($)1(a)2(S)3(b)$ r S  a S b 0($) 4(S)$acc

stateab$S 0s1 r S   4 1s1 r S   2 2s3 3 r S  a S b 4acc stackinputaction 0($)abb$s1 0($) 1(a)bb$ r S   0($) 1(a)2(S)bb$s3 0($) 1(a)2(S)3(b)b$ r S  a S b 0($) 4(S)b$err

A Parser Table for S  AB | A A  a B  a statedescriptiona$ABS 0initials125 1parsed a from Ar A  a 2parsed As3r S  A 4 3parsed a from Br B  a 4parsed ABr S  AB 5finalacc

stackinputaction 0($)aa$s1 0 ($)1(a)a$ r A  a 0 ($)2(A)a$s3 0 ($)2(A)3(a)$ r B  a 0 ($)2(A)4(B)$ r S  AB 0($)5(S)$acc statea$ABS 0s125 1 r A  a 2s3 r S  A 4 3 r B  a 4 r S  AB 5acc

A Parser Table for E  E + E| id statedescriptionid+$E 0initials12 1parsed id r E  id 2parsed Es3acc 3parsed E+s14 4parsed E+Es3 r E  E + E

The Challenge How to construct a parser-table from a given grammar LR(1) grammars –Left to right scanning –Rightmost derivations (reverse) –1 token Different solutions –Operator precedence –SLR(1) Simple LR(1) –CLR(1) Canonic LR(1) –LALR(1) Look Ahead LR(1) Yacc, Bison, CUP

Grammar Hierarchy Non-ambiguous CFG CLR(1) LALR(1) SLR(1) LL(1)

Constructing an SLR parsing table Add a production S’  S$ Construct a finite automaton accepting “valid stack symbols” –The states of the automaton becomes the states of parsing-table –Determine shift operations –Determine goto operations Construct reduce entries by analyzing the grammar

A finite Automaton for S’  S$ S  a S b|  statedescriptionab$S 0initials14 1parsed as12 2parsed Ss3 3parsed a S b 4final a b S S a

Constructing a Finite Automaton NFA –For X  X 1 X 2 … X n [X  X 1 X 2 …X i  X i+1 … X n ] –“prefixes of rhs (handles)” –X 1 X 2 … X i is at the top of the stack and we expect X i+1 … X n The initial state [S’ . S$ ] –  ([X  X 1 …X i  X i+1 … X n ], X i+1 ) = [X  X 1 …X i X i+1  … X n ] –For every production X i+1    ([[X  X 1 X 2 …X i  X i+1 … X n ],  ) = [X i+1    ] Convert into DFA

NFA S’  S$ S  a S b|  [S’ .S$] [S’  S.$] [S .aSb] [S .] [S  a.Sb] [S  aSb.] [S  aS.b] S S b a    

[S’ .S$] [S’  S.$] [S .aSb] [S .] [S  a.Sb] [S  aSb.] [S  aS.b] S S b a     [S’ .S$] [S .aSb] [S .] [S  a.Sb] [S .aSb] [S .] a [S’  S.$] S [S  aS.b] S [S  aSb.] b DFA a

stateitemsab$S 0 [S .S$, S .aSb, S .] s14 1 [S  a.Sb$, S .aSb, S .] s12 2 [S  aS.b] s3 3 [S  aSb.] 4 [S’  S.$] acc [S’ .S$] [S .aSb] [S .] [S  a.Sb] [S .aSb] [S .] a [S’  S.$] S [S  aS.b] S [S  aSb.] b a

Filling reduce entries For an item [A .] we need to know the tokens that can follow A in a derivation from S’ Follow(A) = {t | S’  *  At  } See the textbook for an algorithm for constructing Follow from a given grammar

stateitemsab$S 0 [S .S$, S .aSb, S .] s14 1 [S  a.Sb$, S .aSb, S .] s12 2 [S  aS.b] s3 3 [S  aSb.] 4 [S’  S.$] acc [S’ .S$] [S .aSb] [S .] [S  a.Sb] [S .aSb] [S .] a [S’  S.$] S [S  aS.b] S [S  aSb.] b a Follow(S) = {b, $} r S   r S  a S b

Example E  E + E| E* E | id [S’ .E$] [E .E + E] [E .E *E] [E .id] id [E  id.] E [S’  E.$] [E  E. + E] [E  E. *E] [E  E +. E] [E .E +E] [E .E*E] [E .id] + [E  E *. E] [E .E +E] [E .E*E] [E .id] * id E E [E  E + E.] [E  E.+E] [E  E.*E] [E  E *E.] [E  E.+E] [E  E.*E] E E * * +

Interesting Non SLR(1) Grammar S’  S$ S  L = R | R L  *R | id R  L [S’ .S$] [S .L=R] [S .R] [L .*R] [L .id] [R  L] [S  L.=R] [R  L.] L = Partial DFA [S  L=.R] [R .L] [L .*R] [L .id] Follow(R)= {$, =}

LR(1) Parser Item [A . , t] –  is at the top of the stack and we are expecting  t LR(1) State –Sets of items LALR(1) State –Merge items with the same look-ahead

Interesting Non LR(1) Grammars Ambiguous –Arithmetic expressions –Dangling-else Common derived prefix –A  B 1 a b | B 2 a c –B 1   –B 2   Optional non-terminals –St  OptLab Ass –OptLab  id : |  –Ass  id := Exp

A motivating example Create a desk calculator Challenges –Non trivial syntax –Recursive expressions (semantics) Operator precedence

Solution (lexical analysis) /* desk.l */ % [0-9]+ { yylval = atoi(yytext); return NUM ;} “+” { return PLUS ;} “-” { return MINUS ;} “/” { return DIV ;} “*” { return MUL ;} “(“ { return LPAR ;} “)” { return RPAR ;} “//”.*[\n] ; /* comment */ [\t\n ] ; /* whitespace */. { error (“illegal symbol”, yytext[0]); }

Solution (syntax analysis) /* desk.y */ %token NUM %left PLUS, MINUS %left MUL, DIV %token LPAR, RPAR %start P % P : E {printf(“%d\n”, $1) ;} ; E : NUM { $$ = $1 ;} | LPAR e RPAR { $$ = $2; } | e PLUS e { $$ = $1 + $3 ; } | e MINUS e { $$ = $1 - $3 ; } | e MUL e { $$ = $1 * $3; } | e DIV e {$$ = $1 / $3; } ; % #include “lex.yy.c” flex desk.l bison desk.y cc y.tab.c –ll -ly

Summary LR is a powerful technique Generates efficient parsers Generation tools exit –Bison, yacc, CUP But some grammars need to be tuned –Shift/Reduce conflicts –Reduce/Reduce conflicts –Efficiency of the generated parser