Joey Paquet, 2000, 2002, 2008, 20121 Lecture 7 Bottom-Up Parsing II.

Joey Paquet, 2000, 2002, 2008, 20121 Lecture 7 Bottom-Up Parsing II

Joey Paquet, 2000, 2002, 2008, 20122 Part I Canonical LR and LALR Parsing Tables

Joey Paquet, 2000, 2002, 2008, 20123 Problems with SLR Parsers The SLR parser breaks down when –Shift-reduce conflict: it cannot decide whether to shift or reduce –Reduce-reduce conflict: A given configuration implies more than one possible reduction, e.g. State 3 :S  id  FOLLOW(S) = {$} V[id] E  id  FOLLOW(E) = {$,+,=} Z  S S  E = E S  id E  E + id E  id S  if E then S | if E then S else S

Joey Paquet, 2000, 2002, 2008, 20124 Canonical LR Parsers Problem: the FOLLOW set is not discriminating enough For example, the previous example problem is not without solution: –if the next token is either + or =, reduce id to E –if the next token is $, reduce id to S Solution: use lookahead (FIRST) sets in the items generation process to eliminate ambiguities

Joey Paquet, 2000, 2002, 2008, 20125 Canonical LR Parsers State 0 : [Z   S:{$}]: [Z   S:{$}] : V[S] State 1 V[  ] [S   id:{$}] [S   E=E:{$}] : V[E] State 2 [S   E=E:{$}] [E   E+id:{=,+}]: V[E]State 2 [E   E+id:{=}] [E   id]:{=,+}] : V[id]State 3 [E   id:{=}] [S   id:{$}] : V[id]State 3 [E   E+id:{+}] [E   id:{+}] State 1 :[Z  S  :{$}]: [Z  S  :{$}] : acceptFinal State V[S] State 2 : [S  E  = E:{$}]: [S  E  =E:{$}]: V[E=]State 4 V[E][E  E  +id:{=,+}] [E  E  +id:{=,+}]: V[E+]State 5 State 3 : [E  id  ]:{=,+}]: [E  id  :{=,+}] : handle (r4) V[id][S  id  :{$}] [S  id  :{$}] : handle (r2) State 4 :[S  E=  E:{$}]: [S  E=  E:{$}]: V[E=E]State 6 V[E=][E   E+id:{$}] [E   E+id:{+,$}] : V[E=E]State 6 [E   id:{$}] [E   id:{+,$}]: V[E=id]State 7 [E   E+id:{+}] [E   id:{+}]

Joey Paquet, 2000, 2002, 2008, 20126 Canonical LR Parsers State 5 : [E  E+  id:{=,+}] : [E  E+  id:{=,+}] : V[E+id]State 8 V[E+] State 6 : [S  E=E  :{$}]: [S  E=E  :{$}] : handle (r1) V[E=E] [E  E  +id:{+,$}] [E  E  +id:{+,$}] : V[E=E+]State 9 State 7 : [E  id  :{+,$}]: [E  id  :{+,$}]: handle (r4) V[E=id] State 8 : [E  E+id  :{=,+}] : [E  E+id  :{=,+}] : handle (r3) V[E+id] State 9 : [E  E+  id:{+,$}] : [E  E+  id:{+,$}] : V[E=E+id]State 10 V[E=E+] State 10 : [E  E+id  :{+,$}]: [E  E+id  :{+,$}]:handle (r3) V[E=E+id]

Joey Paquet, 2000, 2002, 2008, 20127 CLR Parsing Table state actiongoto id=+$SE 0s312 1acc 2s4s5 3r4 r2 4s76 5s8 6s9r1 7r4 8r3 9s10 10r3 0 Z  S 1 S  E = E 2 S  id 3 E  E + id 4 E  id

Joey Paquet, 2000, 2002, 2008, 20128 LALR Problem: CLR generates more states than SLR, e.g. for a typical programming language, the SLR table has hundreds of states whereas a CLR table has thousands of states Solution: merge similar states

Joey Paquet, 2000, 2002, 2008, 20129 LALR A. State 0: {[Z  S:{$}],[S  id:{$}],[S  E=E:{$}], [E  E+id:{=,+}],[E  id]:{=,+}]} B. State 1: {[Z  S:{$}]} C. State 2: {[S  E=E:{$}],[E  E+id:{=,+}]} D. State 3: {[E  id:{=,+}],[S  id:{$}]} E.State 4: {[S  E=E:{$}],[E  E+id:{+,$}],[E  id:{+,$}]} F. State 5: {[E  E+id:{=,+}]} State 9: {[E  E+id:{+,$}]} G. State 6: {[S  E=E:{$}],[E  E+id:{+,$}]} H. State 7: {[E  id:{+,$}]} I.State 8: {[E  E+id:{=,+}]} State 10: {[E  E+id:{+,$}]}

Joey Paquet, 2000, 2002, 2008, 201210 LALR State 0:{[Z  S:{$}],[S  id:{$}],[S  E=E:{$}], [E  E+id:{=,+}],[E  id]:{=,+}]} State 1: {[Z  S:{$}]} State 2:{[S  E=E:{$}],[E  E+id:{=,+}]} State 3:{[E  id:{=,+}],[S  id:{$}]} State 4:{[S  E=E:{$}],[E  E+id:{+,$}],[E  id:{+,$}]} State 5:{[E  E+id:{=,+,$}]} State 6:{[S  E=E:{$}],[E  E+id:{+,$}]} State 7:{[E  id:{+,$}]} State 8: {[E  E+id:{=,+,$}]}

Joey Paquet, 2000, 2002, 2008, 201211 LALR Parsing Table stat e actiongoto id=+$SE 0s312 1acc 2s4s5 3r4 r2 4s76 5s8 6s5r1 7r4 8r3 1 Z  S 2 S  E = E 3 S  id 4 E  E + id 5 E  id

Joey Paquet, 2000, 2002, 2008, 201212 Part II Error Recovery in LR Parsing

Joey Paquet, 2000, 2002, 2008, 201213 Error Recovery in LR Parsers An error is detected when the parser consults the action table and finds an empty entry Each empty entry potentially represents a different and specific syntax error If we come onto an empty entry on the goto table, it means that there is an error in the table itself

Joey Paquet, 2000, 2002, 2008, 201214 onError() 1. pop() until a state s is on top and s has at least one entry in the goto table under non-terminal N 2. S is the set of states in the goto table for s 3. T is the set of terminals for which the elements of S have a shift or accept entry in the action table 4. lookahead = nextToken() until lookahead x is in T 5. push the N corresponding to x and its corresponding state in S 6. resume parse Error Recovery in LR Parsers Example: in many PLs, statements end with a “;” –if an error is found, the stack is popped until we get V[ ] on top –nextToken() until the next “;” is found –resume parse

Joey Paquet, 2000, 2002, 2008, 201215 Error Recovery in LR Parsers state actiongoto id+*()$ETF 0s5e1 s4e1e0123 1e2s6eee2e3acc 2e2r2s7e2r2 3e2r4 e2r4 4s5e1 s4e1e0823 5e2r6 e2r6 6s5e4 s4e4e093 7s5e5 s4e5e010 8e2r1s7e2s11e0 9e2r1s7e2r1 10e2r3 e2r3 11e2r5 e2r5 1 E  E + T 2 E  T 3 T  T * F 4 T  F 5 F  (E) 6 F  id

Joey Paquet, 2000, 2002, 2008, 201216 Error Recovery in LR Parsers stackinputaction 10id)id*id)$shift 5 20id5)id*id)$ reduce (F  id) 30F3)id*id)$ reduce (T  F) 40T2)id*id)$ reduce (E  T) 50E1)id*id)$e3 60T2*id)$shift 7 70T2*7id)$shift 5 80T2*7id5)$ reduce (F  id) 90T2*7F10)$ reduce (T  T * F) 100T2)$ reduce (E  T) 110E1)$e3 120E1$accept e0unexpected end of program e1missing operand e2missing operator e3mismatched parenthesis e4missing term e5factor expected e6+ or ) expected eeparser error 1 E  E + T 2 E  T 3 T  T * F 4 T  F 5 F  (E) 6 F  id

Joey Paquet, 2000, 2002, 2008, 201217 Part III General Comments

Joey Paquet, 2000, 2002, 2008, 201218 Compiler-Compilers For real-life programming languages, construction of the table is extremely laborious and error-prone (several thousands of states) Table construction follows strictly defined rules that can be implemented as a program called a parser generator Yacc (Yet Another Compiler-Compiler) generates a LALR(1) parser code and table Grammar is given in input in (near) BNF Detects conflicts and resolves some conflicts automatically Requires a minimal number of changes to the grammar Each right hand side is associated with a custom semantic action to generate the symbol table and code

Joey Paquet, 2000, 2002, 2008, 201219 Which Parser to Use? We have seen –recursive descent predictive –table-driven predictive –SLR, CLR, LALR –Compiler-compiler For real-life languages, the recursive descent parser lacks maintainability The need for changing the grammar is a disadvantage Code generation and error detection is more difficult and less accurate in top-down parsers Most compilers are now implemented using the LR method, using compiler-compilers Other more recent ones are generating top-down parsing methods (e.g. JavaCC)

Joey Paquet, 2000, 2002, 2008, 20121 Lecture 7 Bottom-Up Parsing II.

Similar presentations

Presentation on theme: "Joey Paquet, 2000, 2002, 2008, 20121 Lecture 7 Bottom-Up Parsing II."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Joey Paquet, 2000, 2002, 2008, 20121 Lecture 7 Bottom-Up Parsing II.

Similar presentations

Presentation on theme: "Joey Paquet, 2000, 2002, 2008, 20121 Lecture 7 Bottom-Up Parsing II."— Presentation transcript:

Similar presentations

About project

Feedback