Bottom-Up Syntax Analysis Mooly Sagiv & Greta Yorsh Textbook:Modern Compiler Design Chapter 2.2.5 (modified)

Slides:



Advertisements
Similar presentations
A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.
Advertisements

Lesson 8 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Compiler construction in4020 – lecture 4 Koen Langendoen Delft University of Technology The Netherlands.
Compilation (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav.
Compiler Principles Fall Compiler Principles Lecture 4: Parsing part 3 Roman Manevich Ben-Gurion University.
Chapter 3 Syntax Analysis
Joey Paquet, 2000, 2002, 2008, Lecture 7 Bottom-Up Parsing II.
Pushdown Automata Consists of –Pushdown stack (can have terminals and nonterminals) –Finite state automaton control Can do one of three actions (based.
Recap Roman Manevich Mooly Sagiv. Outline Subjects Studied Questions & Answers.
CSE 5317/4305 L4: Parsing #21 Parsing #2 Leonidas Fegaras.
Mooly Sagiv and Roman Manevich School of Computer Science
Theory of Compilation Erez Petrank Lecture 3: Syntax Analysis: Bottom-up Parsing 1.
6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)
1 Chapter 5: Bottom-Up Parsing (Shift-Reduce). 2 - attempts to construct a parse tree for an input string beginning at the leaves (the bottom) and working.
Recap Mooly Sagiv. Outline Subjects Studied Questions & Answers.
By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.
Bottom-Up Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter (modified)
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter
Context-Free Grammars Lecture 7
Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Implementation in C Chapter 3.
Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter 2.2 (Partial) Hashlama 11:00-14:00.
Bottom Up Parsing.
Prof. Fateman CS 164 Lecture 91 Bottom-Up Parsing Lecture 9.
Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter 2.2 (Partial)
Bottom-Up Syntax Analysis Mooly Sagiv & Greta Yorsh Textbook:Modern Compiler Design Chapter (modified)
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Implementation in C Chapter 3.
Table-driven parsing Parsing performed by a finite state machine. Parsing algorithm is language-independent. FSM driven by table (s) generated automatically.
1 Bottom-up parsing Goal of parser : build a derivation –top-down parser : build a derivation by working from the start symbol towards the input. builds.
Bottom-up parsing Goal of parser : build a derivation
Compilers: Yacc/7 1 Compiler Structures Objective – –describe yacc (actually bison) – –give simple examples of its use , Semester 1,
Syntax and Semantics Structure of programming languages.
LEX and YACC work as a team
LR Parsing Compiler Baojian Hua
10/13/2015IT 3271 Tow kinds of predictive parsers: Bottom-Up: The syntax tree is built up from the leaves Example: LR(1) parser Top-Down The syntax tree.
Review 1.Lexical Analysis 2.Syntax Analysis 3.Semantic Analysis 4.Code Generation 5.Code Optimization.
Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
Syntax and Semantics Structure of programming languages.
Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter 2.2 (Partial) 1.
Chapter 5: Bottom-Up Parsing (Shift-Reduce)
Prof. Necula CS 164 Lecture 8-91 Bottom-Up Parsing LR Parsing. Parser Generators. Lecture 6.
Compiler Principles Fall Compiler Principles Lecture 6: Parsing part 5 Roman Manevich Ben-Gurion University.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
PL&C Lab, DongGuk University Compiler Lecture Note, MiscellaneousPage 1 Yet Another Compiler-Compiler Stephen C. Johnson July 31, 1978 YACC.
Compiler Principles Fall Compiler Principles Lecture 5: Parsing part 4 Roman Manevich Ben-Gurion University.
4. Bottom-up Parsing Chih-Hung Wang
Bernd Fischer RW713: Compiler and Software Language Engineering.
Bottom Up Parsing CS 671 January 31, CS 671 – Spring Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis.
COMP 3438 – Part II-Lecture 6 Syntax Analysis III Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 6: LR grammars and automatic parser generators.
Compilers: Bottom-up/6 1 Compiler Structures Objective – –describe bottom-up (LR) parsing using shift- reduce and parse tables – –explain how LR.
Bottom-up parsing. Bottom-up parsing builds a parse tree from the leaves (terminals) to the start symbol int E T * TE+ T (4) (2) (3) (5) (1) int*+ E 
Conflicts in Simple LR parsers A SLR Parser does not use any lookahead The SLR parsing method fails if knowing the stack’s top state and next input token.
CMSC 330: Organization of Programming Languages Pushdown Automata Parsing.
Chapter 8. LR Syntactic Analysis Sung-Dong Kim, Dept. of Computer Engineering, Hansung University.
COMPILER CONSTRUCTION
Syntax and Semantics Structure of programming languages.
Programming Languages Translator
Compiler Baojian Hua LR Parsing Compiler Baojian Hua
Unit-3 Bottom-Up-Parsing.
Textbook:Modern Compiler Design
Table-driven parsing Parsing performed by a finite state machine.
Chapter 4 Syntax Analysis.
Context-free Languages
Fall Compiler Principles Lecture 4: Parsing part 3
Bottom-Up Syntax Analysis
Top-Down Parsing CS 671 January 29, 2008.
Lecture (From slides by G. Necula & R. Bodik)
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Kanat Bolazar February 16, 2010
Compiler Construction
Presentation transcript:

Bottom-Up Syntax Analysis Mooly Sagiv & Greta Yorsh Textbook:Modern Compiler Design Chapter (modified)

Efficient Parsers Pushdown automata Deterministic Report an error as soon as the input is not a prefix of a valid program Not usable for all context free grammars bison context free grammar tokens parser “Ambiguity errors” parse tree

Kinds of Parsers Top-Down (Predictive Parsing) LL –Construct parse tree in a top-down matter –Find the leftmost derivation –For every non-terminal and token predict the next production Bottom-Up LR –Construct parse tree in a bottom-up manner –Find the rightmost derivation in a reverse order – For every potential right hand side and token decide when a production is found

Bottom-Up Syntax Analysis Input –A context free grammar –A stream of tokens Output –A syntax tree or error Method –Construct parse tree in a bottom-up manner –Find the rightmost derivation in (reversed order) –For every potential right hand side and token decide when a production is found –Report an error as soon as the input is not a prefix of valid program

Plan Pushdown automata Bottom-up parsing (informal) Non-deterministic bottom-up parsing Deterministic bottom-up parsing Interesting non LR grammars

Pushdown Automaton control parser-table input stack $ $utw V

Informal Example(1) S  E$ E  T | E + T T  i | ( E ) input + i $ i$i$ stacktree i i + i $ input $ stacktree shift

Informal Example(2) S  E$ E  T | E + T T  i | ( E ) input + i $i$i$ stacktree i reduce T  i input + i $T$T$ stacktree i T

Informal Example(3) S  E$ E  T | E + T T  i | ( E ) input + i $T$T$ stacktree reduce E  T i T input + i $E$E$ stacktree i T E

Informal Example(4) S  E$ E  T | E + T T  i | ( E ) shift input + i $E$E$ stacktree i T E input i $+E$+E$ stacktree i T E +

Informal Example(5) shift input i $+E$+E$ stacktree i T E + input $i+ E $ stacktree i T E + i S  E$ E  T | E + T T  i | ( E )

Informal Example(6) reduce T  i input $i+ E $ stacktree i T E +i input $T+E$T+E$ stacktree i T E + i T S  E$ E  T | E + T T  i | ( E )

Informal Example(7) reduce E  E + T input $ T+E$T+E$ stacktree i T E + i T input $ E$E$ stacktree i T E + T E i S  E$ E  T | E + T T  i | ( E )

Informal Example(8) input $ E$E$ stacktree i T E + T E shift i input $E$$E$ stacktree i T E + T E i S  E$ E  T | E + T T  i | ( E )

Informal Example(9) input $ $E$$E$ stacktree i T E + T E reduce S  E $ i S  E$ E  T | E + T T  i | ( E )

Informal Example reduce E  E + T reduce T  i reduce E  T reduce T  i reduce S  E $

The Problem Deciding between shift and reduce

Informal Example(7) S  E$ E  T | E + T T  i | ( E ) reduce E  E + T input $ T+E$T+E$ stacktree i T E + i T input $ E$E$ stacktree i T E + T E i

Informal Example(7’) S  E$ E  T | E + T T  i | ( E ) reduce E  T input $ T+E$T+E$ stacktree i T E +i T input $ E+E$E+E$ stacktree i T E + E E 

Dotted LR(0) Items

LR(0) items()i+$TE  1: S   E$ 24, 6 2: S  E  $ s3 3: S  E $  r 4: E   T 510, 12 5: E  T  r 6: E   E + T 74, 6 7: E  E  + T s8 8: E  E +  T 910, 12 9: E  E + T  r 10: T   i s11 11: T  i  r 12: T   (E) s13 13: T  (  E) 144, 6 14: T  (E  ) s15 15: T  (E)  r S  E$ E  T E  E + T T  i T  ( E )

Formal Example(1) S  E$ E  T | E + T T  i | ( E ) i + i $ input 1: S   E$ stack  -move 6 i + i $ input 6: E   E+T 1: S   E$ stack

Formal Example(2) S  E$ E  T | E + T T  i | ( E )  -move 4 i + i $ input 4: E   T 6: E   E+T 1: S   E$ stack i + i $ input 6: E   E+T 1: S   E$ stack

Formal Example(3) S  E$ E  T | E + T T  i | ( E )  -move 10 i + i $ input 10: T   i 4: E   T 6: E   E+T 1: S   E$ stack i + i $ input 4: E   T 6: E   E+T 1: S   E$ stack

Formal Example(4) S  E$ E  T | E + T T  i | ( E ) shift 11 input + i $ 11: T  i  10: T   i 4: E   T 6: E   E+T 1: S   E$ stack i + i $ input 10: T   i 4: E   T 6: E   E+T 1: S   E$ stack

Formal Example(5) S  E$ E  T | E + T T  i | ( E ) reduce T  i + i $ input 5: E  T  4: E   T 6: E   E+T 1: S   E$ stack + i $ inputstack 11: T  i  10: T   i 4: E   T 6: E   E+T 1: S   E$

Formal Example(6) S  E$ E  T | E + T T  i | ( E ) reduce E  T + i $ input 7: E  E  +T 6: E   E+T 1: S   E$ stack + i $ input 5: E  T  4: E   T 6: E   E+T 1: S   E$ stack

Formal Example(7) S  E$ E  T | E + T T  i | ( E ) shift 8 i $ input 8: E  E +  T 7: E  E  +T 6: E   E+T 1: S   E$ stack + i $ input 7: E  E  +T 6: E   E+T 1: S   E$ stack

Formal Example(8) S  E$ E  T | E + T T  i | ( E )  -move 10 i $ input 10: T   i 8: E  E +  T 7: E  E  +T 6: E   E+T 1: S   E$ stack i $ input 8: E  E +  T 7: E  E  +T 6: E   E+T 1: S   E$ stack

Formal Example(9) S  E$ E  T | E + T T  i | ( E ) shift 11 $ input 11: T  i  10: T   i 8: E  E +  T 7: E  E  +T 6: E   E+T 1: S   E$ stack i $ input 10: T   i 8: E  E +  T 7: E  E  +T 6: E   E+T 1: S   E $ stack

Formal Example(10) S  E$ E  T | E + T T  i | ( E ) reduce T  i $ input stack 11: T  i  10: T   i 8: E  E +  T 7: E  E  +T 6: E   E+T 1: S   E$ $ inputstack 9: E  E + T  8: E  E +  T 7: E  E  +T 6: E   E+T 1: S   E$

Formal Example(11) S  E$ E  T | E + T T  i | ( E ) reduce E  E + T $ inputstack 9: E  E + T  8: E  E +  T 7: E  E  +T 6: E   E+T 1: S   E$ $ inputstack 2: S  E  $ 1: S   E$

Formal Example(12) S  E$ E  T | E + T T  i | ( E ) shift 3 $ inputstack 2: S  E  $ 1: S   E$ inputstack 3: S  E $  2: S  E  $ 1: S   E$

Formal Example(13) S  E$ E  T | E + T T  i | ( E ) reduce S  E$ inputstack 3: S  E $  2: S  E  $ 1: S   E$

But how can this be done efficiently? Deterministic Pushdown Automaton

Handles Identify the leftmost node (nonterminal) that has not been constructed but all whose children have been constructed –A  

Identifying Handles Create a finite state automaton over grammar symbols –Sets of LR(0) items –Accepting states identify handles Use automaton to build parser tables –reduce For items A    on every token –shift For items A    t  on token t When conflicts occur the grammar is not LR(0)

()i+$TE  1: S   E$ 24, 6 2: S  E  $ s3 3: S  E $  r 4: E   T 510, 12 5: E  T  r 6: E   E + T 74, 6 7: E  E  + T s8 8: E  E +  T 910, 12 9: E  E + T  r 10: T   i s11 11: T  i  r 12: T   (E) s13 13: T  (  E) 144, 6 14: T  (E  ) s15 15: T  (E)  r S  E$ E  T E  E + T T  i T  ( E )

1: S   E$ 4: E   T 6: E   E + T 10: T   i 12: T   (E) 5: E  T  T 11: T  i  i 2: S  E  $ 7: E  E  + T E 13: T  (  E) 4: E   T 6: E   E + T 10: T   i 12: T   (E) ( ( 15: T  (E)  ) 14: T  (E  ) 7: E  E  + T E 7: E  E +  T 10: T   i 12: T   (E) + + 8: E  E + T  T 2: S  E $  $ i i

Example Control Table i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

i + i $ input 0($) shift 5 stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

+ i $ input 5 (i) 0 ($) reduce T  i stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

+ i $ input 6 (T) 0 ($) reduce E  T stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

+ i $ input 1(E) 0 ($) shift 3 stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

i $ input 3 (+) 1(E) 0 ($) shift 5 stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

$ input 5 (i) 3 (+) 1(E) 0($) reduce T  i i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E) stack

$ input 4 (T) 3 (+) 1(E) 0($) reduce E  E + T stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

$ input 1 (E) 0 ($) shift 2 stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

input 2 ($) 1 (E) 0 ($) accept stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

((i) $ input 0($) shift 7 stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

(i) $ input 7(() 0($) shift 7 stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

i) $ input 7 (() 0($) shift 5 stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

) $ input 5 (i) 7 (() 0($) reduce T  i stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

) $ input 6 (T) 7 (() 0($) reduce E  T stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

) $ input 8 (E) 7 (() 0($) shift 9 stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

$ input 9 ()) 8 (E) 7 (() 0($) reduce T  ( E ) stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

$ input 6 (T) 7(() 0($) reduce E  T stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

$ input 8 (E) 7(() 0($) err stack i+()$ET 0s5errs7err 16 1 s3err s2 2acc 3s5errs7err 4 4 reduce E  E+T 5 reduce T  i 6 reduce E  T 7s5errs7err 86 8 s3errs9err 9 reduce T  (E)

Constructing LR(0) parsing table Add a production S’  S$ Construct a finite automaton accepting “valid stack symbols” States are set of items A   –The states of the automaton becomes the states of parsing-table –Determine shift operations –Determine goto operations –Determine reduce operations

Filling Parsing Table A state s i reduce A  –A    s i Shift on t –A   t   s i Goto(s i, X) = s j –A   X   s i –  (s i, X) = s j When conflicts occurs the grammar is not LR(0)

Grammar Hierarchy Non-ambiguous CFG CLR(1) LALR(1) SLR(1) LL(1) LR(0)

Example Non LR(0) S  E $ E  E + E | i S  E$ E   E+E E   i 0 SE$EE+ESE$EE+E 2 E E  E+  E E   E+E E   i 4 + E  E+ E  E  E  +E 5 E + E  i  i 1 S  E $  $ 3 i

i+$E 0s1err 2 1 red E  i 2errs4s3 3accept 4s15 5 red E  E +E s4 red E  E +E S  E$ E  E+E E   i 0 SE$EE+ESE$EE+E 2 E E  E+  E E   E+E E   i 4 + E  E+ E  E  E  +E 5 E + E  i  i 1 S  E $  $ 3 i

Non-Ambiguous Non LR(0) S  E $ E  E + T | T T  T * F | F F  i S   E $ E   E + T E   T T   T * F T   F F   i E  T  T  T  * F T *+i 0 ?? T  T *  F F   i *

LR(1) Parser LR(1) Items A , t –  is at the top of the stack and we are expecting  t LR(1) State –Sets of items LALR(1) State –Merge items with the same look-ahead

Interesting Non LR(1) Grammars Ambiguous –Arithmetic expressions –Dangling-else Common derived prefix –A  B 1 a b | B 2 a c –B 1   –B 2   Optional non-terminals –St  OptLab Ass –OptLab  id : |  –Ass  id := Exp

A motivating example Create a desk calculator Challenges –Non trivial syntax –Recursive expressions (semantics) Operator precedence

Solution (lexical analysis) /* desk.l */ % [0-9]+ { yylval = atoi(yytext); return NUM ;} “+” { return PLUS ;} “-” { return MINUS ;} “/” { return DIV ;} “*” { return MUL ;} “(“ { return LPAR ;} “)” { return RPAR ;} “//”.*[\n] ; /* comment */ [\t\n ] ; /* whitespace */. { error (“illegal symbol”, yytext[0]); }

Solution (syntax analysis) /* desk.y */ %token NUM %left PLUS, MINUS %left MUL, DIV %token LPAR, RPAR %start P % P : E {printf(“%d\n”, $1) ;} ; E : NUM { $$ = $1 ;} | LPAR e RPAR { $$ = $2; } | e PLUS e { $$ = $1 + $3 ; } | e MINUS e { $$ = $1 - $3 ; } | e MUL e { $$ = $1 * $3; } | e DIV e {$$ = $1 / $3; } ; % #include “lex.yy.c” flex desk.l bison desk.y cc y.tab.c –ll -ly

Summary LR is a powerful technique Generates efficient parsers Generation tools exit LALR(1) –Bison, yacc, CUP But some grammars need to be tuned –Shift/Reduce conflicts –Reduce/Reduce conflicts –Efficiency of the generated parser