Joey Paquet, 2000, 2002, 2008, 20121 Lecture 7 Bottom-Up Parsing II.

Slides:



Advertisements
Similar presentations
A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.
Advertisements

Compiler Construction Sohail Aslam Lecture StackInput ¤0¤0 id – id  id $ s4 ¤0 id 4 – id  id $ r6 F → id ¤0F3¤0F3 – id  id $ r5 T → F ¤0T2¤0T2.
CH4.1 CSE244 More on LR Parsing Aggelos Kiayias Computer Science & Engineering Department The University of Connecticut 191 Auditorium Road, Box U-155.
Lesson 8 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Compiler Designs and Constructions
Compiler construction in4020 – lecture 4 Koen Langendoen Delft University of Technology The Netherlands.
Compilation (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav.
Compiler Principles Fall Compiler Principles Lecture 4: Parsing part 3 Roman Manevich Ben-Gurion University.
Bottom up Parsing Bottom up parsing trys to transform the input string into the start symbol. Moves through a sequence of sentential forms (sequence of.
CS 31003: Compilers  Difference between SLR and LR(1)  Construction of LR(1) parsing table  LALR parser Bandi Sumanth 11CS30006 Date : 9/10/2013.
Mooly Sagiv and Roman Manevich School of Computer Science
LR Parsing Table Costruction
6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter
Chapter 4-2 Chang Chi-Chung Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR Other.
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Implementation in C Chapter 3.
Table-driven parsing Parsing performed by a finite state machine. Parsing algorithm is language-independent. FSM driven by table (s) generated automatically.
1 Bottom-up parsing Goal of parser : build a derivation –top-down parser : build a derivation by working from the start symbol towards the input. builds.
Bottom-up parsing Goal of parser : build a derivation
LALR Parsing Canonical sets of LR(1) items
Parsing. Goals of Parsing Check the input for syntactic accuracy Return appropriate error messages Recover if possible Produce, or at least traverse,
Joey Paquet, 2000, 2002, 2012, Lecture 6 Bottom-Up Parsing.
LR Parsing Compiler Baojian Hua
Joey Paquet, Lecture 12 Review. Joey Paquet, Course Review Compiler architecture –Lexical analysis, syntactic analysis, semantic.
Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
11 Outline  6.0 Introduction  6.1 Shift-Reduce Parsers  6.2 LR Parsers  6.3 LR(1) Parsing  6.4 SLR(1)Parsing  6.5 LALR(1)  6.6 Calling Semantic.
CSI 3120, Syntactic analysis, page 1 Syntactic Analysis and Parsing Based on A. V. Aho, R. Sethi and J. D. Ullman Compilers: Principles, Techniques and.
Joey Paquet, 2000, Lecture 5 Error Recovery Techniques in Top-Down Predictive Syntactic Analysis.
Recursive Descent Parsers Lecture 6 Mon, Feb 2, 2004.
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2005.
4. Bottom-up Parsing Chih-Hung Wang
Joey Paquet, 2000, 2002, 2008, 2012, Lecture 5 Error Recovery Techniques in Top-Down Predictive Syntactic Analysis.
Bottom Up Parsing CS 671 January 31, CS 671 – Spring Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis.
COMP 3438 – Part II-Lecture 6 Syntax Analysis III Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
Three kinds of bottom-up LR parser SLR “Simple LR” –most restrictions on eligible grammars –built quite directly from items as just shown LR “Canonical.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 6: LR grammars and automatic parser generators.
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007.
Lecture 5: LR Parsing CS 540 George Mason University.
Compilers: Bottom-up/6 1 Compiler Structures Objective – –describe bottom-up (LR) parsing using shift- reduce and parse tables – –explain how LR.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture Ahmed Ezzat.
Eliminating Left-Recursion Where some of a nonterminal’s productions are left-recursive, top-down parsing is not possible “Immediate” left-recursion can.
COMPILER CONSTRUCTION
Compiler design Bottom-up parsing Concepts
Compiler Baojian Hua LR Parsing Compiler Baojian Hua
Unit-3 Bottom-Up-Parsing.
Table-driven parsing Parsing performed by a finite state machine.
Chapter 4 Syntax Analysis.
Compiler Construction
Compiler design Bottom-up parsing: Canonical LR and LALR
Fall Compiler Principles Lecture 4: Parsing part 3
LALR Parsing Canonical sets of LR(1) items
Bottom-Up Syntax Analysis
Syntax Analysis Part II
Parsing #2 Leonidas Fegaras.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Compiler Design 7. Top-Down Table-Driven Parsing
Bottom Up Parsing.
LALR Parsing Adapted from Notes by Profs Aiken and Necula (UCB) and
Ambiguity in Grammar, Error Recovery
Compiler Construction
Compiler SLR Parser.
Parsing #2 Leonidas Fegaras.
5. Bottom-Up Parsing Chih-Hung Wang
Kanat Bolazar February 16, 2010
Parsing Bottom-Up LR Table Construction.
Parsing Bottom-Up LR Table Construction.
Lecture 11 LR Parse Table Construction
COP 4620 / 5625 Programming Language Translation / Compiler Writing Fall 2003 Lecture 7, 10/09/2003 Prof. Roy Levow.
Compiler design Bottom-up parsing: Canonical LR and LALR
Parsing CSCI 432 Computer Science Theory
Presentation transcript:

Joey Paquet, 2000, 2002, 2008, Lecture 7 Bottom-Up Parsing II

Joey Paquet, 2000, 2002, 2008, Part I Canonical LR and LALR Parsing Tables

Joey Paquet, 2000, 2002, 2008, Problems with SLR Parsers The SLR parser breaks down when –Shift-reduce conflict: it cannot decide whether to shift or reduce –Reduce-reduce conflict: A given configuration implies more than one possible reduction, e.g. State 3 :S  id  FOLLOW(S) = {$} V[id] E  id  FOLLOW(E) = {$,+,=} Z  S S  E = E S  id E  E + id E  id S  if E then S | if E then S else S

Joey Paquet, 2000, 2002, 2008, Canonical LR Parsers Problem: the FOLLOW set is not discriminating enough For example, the previous example problem is not without solution: –if the next token is either + or =, reduce id to E –if the next token is $, reduce id to S Solution: use lookahead (FIRST) sets in the items generation process to eliminate ambiguities

Joey Paquet, 2000, 2002, 2008, Canonical LR Parsers State 0 : [Z   S:{$}]: [Z   S:{$}] : V[S] State 1 V[  ] [S   id:{$}] [S   E=E:{$}] : V[E] State 2 [S   E=E:{$}] [E   E+id:{=,+}]: V[E]State 2 [E   E+id:{=}] [E   id]:{=,+}] : V[id]State 3 [E   id:{=}] [S   id:{$}] : V[id]State 3 [E   E+id:{+}] [E   id:{+}] State 1 :[Z  S  :{$}]: [Z  S  :{$}] : acceptFinal State V[S] State 2 : [S  E  = E:{$}]: [S  E  =E:{$}]: V[E=]State 4 V[E][E  E  +id:{=,+}] [E  E  +id:{=,+}]: V[E+]State 5 State 3 : [E  id  ]:{=,+}]: [E  id  :{=,+}] : handle (r4) V[id][S  id  :{$}] [S  id  :{$}] : handle (r2) State 4 :[S  E=  E:{$}]: [S  E=  E:{$}]: V[E=E]State 6 V[E=][E   E+id:{$}] [E   E+id:{+,$}] : V[E=E]State 6 [E   id:{$}] [E   id:{+,$}]: V[E=id]State 7 [E   E+id:{+}] [E   id:{+}]

Joey Paquet, 2000, 2002, 2008, Canonical LR Parsers State 5 : [E  E+  id:{=,+}] : [E  E+  id:{=,+}] : V[E+id]State 8 V[E+] State 6 : [S  E=E  :{$}]: [S  E=E  :{$}] : handle (r1) V[E=E] [E  E  +id:{+,$}] [E  E  +id:{+,$}] : V[E=E+]State 9 State 7 : [E  id  :{+,$}]: [E  id  :{+,$}]: handle (r4) V[E=id] State 8 : [E  E+id  :{=,+}] : [E  E+id  :{=,+}] : handle (r3) V[E+id] State 9 : [E  E+  id:{+,$}] : [E  E+  id:{+,$}] : V[E=E+id]State 10 V[E=E+] State 10 : [E  E+id  :{+,$}]: [E  E+id  :{+,$}]:handle (r3) V[E=E+id]

Joey Paquet, 2000, 2002, 2008, CLR Parsing Table state actiongoto id=+$SE 0s312 1acc 2s4s5 3r4 r2 4s76 5s8 6s9r1 7r4 8r3 9s10 10r3 0 Z  S 1 S  E = E 2 S  id 3 E  E + id 4 E  id

Joey Paquet, 2000, 2002, 2008, LALR Problem: CLR generates more states than SLR, e.g. for a typical programming language, the SLR table has hundreds of states whereas a CLR table has thousands of states Solution: merge similar states

Joey Paquet, 2000, 2002, 2008, LALR A. State 0: {[Z  S:{$}],[S  id:{$}],[S  E=E:{$}], [E  E+id:{=,+}],[E  id]:{=,+}]} B. State 1: {[Z  S:{$}]} C. State 2: {[S  E=E:{$}],[E  E+id:{=,+}]} D. State 3: {[E  id:{=,+}],[S  id:{$}]} E.State 4: {[S  E=E:{$}],[E  E+id:{+,$}],[E  id:{+,$}]} F. State 5: {[E  E+id:{=,+}]} State 9: {[E  E+id:{+,$}]} G. State 6: {[S  E=E:{$}],[E  E+id:{+,$}]} H. State 7: {[E  id:{+,$}]} I.State 8: {[E  E+id:{=,+}]} State 10: {[E  E+id:{+,$}]}

Joey Paquet, 2000, 2002, 2008, LALR State 0:{[Z  S:{$}],[S  id:{$}],[S  E=E:{$}], [E  E+id:{=,+}],[E  id]:{=,+}]} State 1: {[Z  S:{$}]} State 2:{[S  E=E:{$}],[E  E+id:{=,+}]} State 3:{[E  id:{=,+}],[S  id:{$}]} State 4:{[S  E=E:{$}],[E  E+id:{+,$}],[E  id:{+,$}]} State 5:{[E  E+id:{=,+,$}]} State 6:{[S  E=E:{$}],[E  E+id:{+,$}]} State 7:{[E  id:{+,$}]} State 8: {[E  E+id:{=,+,$}]}

Joey Paquet, 2000, 2002, 2008, LALR Parsing Table stat e actiongoto id=+$SE 0s312 1acc 2s4s5 3r4 r2 4s76 5s8 6s5r1 7r4 8r3 1 Z  S 2 S  E = E 3 S  id 4 E  E + id 5 E  id

Joey Paquet, 2000, 2002, 2008, Part II Error Recovery in LR Parsing

Joey Paquet, 2000, 2002, 2008, Error Recovery in LR Parsers An error is detected when the parser consults the action table and finds an empty entry Each empty entry potentially represents a different and specific syntax error If we come onto an empty entry on the goto table, it means that there is an error in the table itself

Joey Paquet, 2000, 2002, 2008, onError() 1. pop() until a state s is on top and s has at least one entry in the goto table under non-terminal N 2. S is the set of states in the goto table for s 3. T is the set of terminals for which the elements of S have a shift or accept entry in the action table 4. lookahead = nextToken() until lookahead x is in T 5. push the N corresponding to x and its corresponding state in S 6. resume parse Error Recovery in LR Parsers Example: in many PLs, statements end with a “;” –if an error is found, the stack is popped until we get V[ ] on top –nextToken() until the next “;” is found –resume parse

Joey Paquet, 2000, 2002, 2008, Error Recovery in LR Parsers state actiongoto id+*()$ETF 0s5e1 s4e1e0123 1e2s6eee2e3acc 2e2r2s7e2r2 3e2r4 e2r4 4s5e1 s4e1e0823 5e2r6 e2r6 6s5e4 s4e4e093 7s5e5 s4e5e010 8e2r1s7e2s11e0 9e2r1s7e2r1 10e2r3 e2r3 11e2r5 e2r5 1 E  E + T 2 E  T 3 T  T * F 4 T  F 5 F  (E) 6 F  id

Joey Paquet, 2000, 2002, 2008, Error Recovery in LR Parsers stackinputaction 10id)id*id)$shift 5 20id5)id*id)$ reduce (F  id) 30F3)id*id)$ reduce (T  F) 40T2)id*id)$ reduce (E  T) 50E1)id*id)$e3 60T2*id)$shift 7 70T2*7id)$shift 5 80T2*7id5)$ reduce (F  id) 90T2*7F10)$ reduce (T  T * F) 100T2)$ reduce (E  T) 110E1)$e3 120E1$accept e0unexpected end of program e1missing operand e2missing operator e3mismatched parenthesis e4missing term e5factor expected e6+ or ) expected eeparser error 1 E  E + T 2 E  T 3 T  T * F 4 T  F 5 F  (E) 6 F  id

Joey Paquet, 2000, 2002, 2008, Part III General Comments

Joey Paquet, 2000, 2002, 2008, Compiler-Compilers For real-life programming languages, construction of the table is extremely laborious and error-prone (several thousands of states) Table construction follows strictly defined rules that can be implemented as a program called a parser generator Yacc (Yet Another Compiler-Compiler) generates a LALR(1) parser code and table Grammar is given in input in (near) BNF Detects conflicts and resolves some conflicts automatically Requires a minimal number of changes to the grammar Each right hand side is associated with a custom semantic action to generate the symbol table and code

Joey Paquet, 2000, 2002, 2008, Which Parser to Use? We have seen –recursive descent predictive –table-driven predictive –SLR, CLR, LALR –Compiler-compiler For real-life languages, the recursive descent parser lacks maintainability The need for changing the grammar is a disadvantage Code generation and error detection is more difficult and less accurate in top-down parsers Most compilers are now implemented using the LR method, using compiler-compilers Other more recent ones are generating top-down parsing methods (e.g. JavaCC)