Compiler construction in4020 – lecture 4 Koen Langendoen Delft University of Technology The Netherlands.

Slides:



Advertisements
Similar presentations
Compiler construction in4020 – lecture 2 Koen Langendoen Delft University of Technology The Netherlands.
Advertisements

Lesson 8 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Compiler Designs and Constructions
Compilation (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav.
Compiler Principles Fall Compiler Principles Lecture 4: Parsing part 3 Roman Manevich Ben-Gurion University.
Bottom up Parsing Bottom up parsing trys to transform the input string into the start symbol. Moves through a sequence of sentential forms (sequence of.
Chapter 3 Syntax Analysis
Joey Paquet, 2000, 2002, 2008, Lecture 7 Bottom-Up Parsing II.
Pushdown Automata Consists of –Pushdown stack (can have terminals and nonterminals) –Finite state automaton control Can do one of three actions (based.
CS 31003: Compilers  Difference between SLR and LR(1)  Construction of LR(1) parsing table  LALR parser Bandi Sumanth 11CS30006 Date : 9/10/2013.
Mooly Sagiv and Roman Manevich School of Computer Science
Cse321, Programming Languages and Compilers 1 6/12/2015 Lecture #10, Feb. 14, 2007 Modified sets of item construction Rules for building LR parse tables.
6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)
By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.
Bottom-Up Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter (modified)
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter
Formal Aspects Term 2, Week4 LECTURE: LR “Shift-Reduce” Parsers: The JavaCup Parser-Generator CREATES LR “Shift-Reduce” Parsers, they are very commonly.
Lecture #8, Feb. 7, 2007 Shift-reduce parsing,
Bottom-Up Syntax Analysis Mooly Sagiv & Greta Yorsh Textbook:Modern Compiler Design Chapter (modified)
Compiler construction 2002
Parsing V Introduction to LR(1) Parsers. from Cooper & Torczon2 LR(1) Parsers LR(1) parsers are table-driven, shift-reduce parsers that use a limited.
Bottom-Up Syntax Analysis Mooly Sagiv & Greta Yorsh Textbook:Modern Compiler Design Chapter (modified)
Table-driven parsing Parsing performed by a finite state machine. Parsing algorithm is language-independent. FSM driven by table (s) generated automatically.
1 Bottom-up parsing Goal of parser : build a derivation –top-down parser : build a derivation by working from the start symbol towards the input. builds.
– 1 – CSCE 531 Spring 2006 Lecture 8 Bottom Up Parsing Topics Overview Bottom-Up Parsing Handles Shift-reduce parsing Operator precedence parsing Readings:
Compiler construction in4020 – lecture 3 Koen Langendoen Delft University of Technology The Netherlands.
Bottom-up parsing Goal of parser : build a derivation
Review 1.Lexical Analysis 2.Syntax Analysis 3.Semantic Analysis 4.Code Generation 5.Code Optimization.
Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
Compiler Principles Fall Compiler Principles Lecture 6: Parsing part 5 Roman Manevich Ben-Gurion University.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
Compiler Principles Fall Compiler Principles Lecture 5: Parsing part 4 Roman Manevich Ben-Gurion University.
4. Bottom-up Parsing Chih-Hung Wang
Bernd Fischer RW713: Compiler and Software Language Engineering.
Bottom Up Parsing CS 671 January 31, CS 671 – Spring Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 6: LR grammars and automatic parser generators.
Compilers: Bottom-up/6 1 Compiler Structures Objective – –describe bottom-up (LR) parsing using shift- reduce and parse tables – –explain how LR.
1 Chapter 6 Bottom-Up Parsing. 2 Bottom-up Parsing A bottom-up parsing corresponds to the construction of a parse tree for an input tokens beginning at.
COMPILER CONSTRUCTION
Announcements/Reading
Parsing Bottom Up CMPS 450 J. Moloney CMPS 450.
Programming Languages Translator
Compiler Baojian Hua LR Parsing Compiler Baojian Hua
Unit-3 Bottom-Up-Parsing.
CS 488 Spring 2012 Lecture 4 Bapa Rao Cal State L.A.
Parsing IV Bottom-up Parsing
Table-driven parsing Parsing performed by a finite state machine.
COP4620 – Programming Language Translators Dr. Manuel E. Bermudez
Compiler design Bottom-up parsing: Canonical LR and LALR
Fall Compiler Principles Lecture 4: Parsing part 3
Bottom-Up Syntax Analysis
Regular Grammar - Finite Automaton
Lexical and Syntax Analysis
Top-Down Parsing CS 671 January 29, 2008.
Lecture 4: Syntax Analysis: Botom-Up Parsing Noam Rinetzky
Lecture (From slides by G. Necula & R. Bodik)
Compiler Design 7. Top-Down Table-Driven Parsing
Bottom Up Parsing.
Compiler SLR Parser.
LR Parsing. Parser Generators.
Fall Compiler Principles Lecture 4: Parsing part 3
5. Bottom-Up Parsing Chih-Hung Wang
Syntax Analysis - 3 Chapter 4.
Kanat Bolazar February 16, 2010
Announcements HW2 due on Tuesday Fall 18 CSCI 4430, A Milanova.
Compiler Construction
Parsing Bottom-Up LR Table Construction.
Parsing Bottom-Up LR Table Construction.
COP 4620 / 5625 Programming Language Translation / Compiler Writing Fall 2003 Lecture 7, 10/09/2003 Prof. Roy Levow.
Compiler design Bottom-up parsing: Canonical LR and LALR
Presentation transcript:

Compiler construction in4020 – lecture 4 Koen Langendoen Delft University of Technology The Netherlands

syntax analysis: tokens  AST top-down parsing recursive descent push-down automaton making grammars LL(1) Summary of lecture 3 program text lexical analysis syntax analysis context handling annotated AST tokens AST parser generator language grammar

Quiz 2.31 Can you create a top-down parser for the following grammars? (a) S  ‘(‘ S ‘)’ | ‘)’ (b) S  ‘(‘ S ‘)’ |  (c) S  ‘(‘ S ‘)’ | ‘)’ | 

syntax analysis: tokens  AST bottom-up parsing push-down automaton ACTION/GOTO tables LR(0), SLR(1), LR(1), LALR(1) Overview program text lexical analysis syntax analysis context handling annotated AST tokens AST parser generator language grammar

Bottom-up (LR) parsing Left-to-right parse, Rightmost-derivation create a node when all children are present handle: nodes representing the right-hand side of a production IDENT rest_expression IDENT expression rest_exprterm IDENT aap + ( noot + mies ) 

LR(0) parsing running example: expression grammar input  expression EOF expression  expression ‘+’ term | term term  IDENTIFIER | ‘(’ expression ‘)’ short-hand notation Z  E $ E  E ‘+’ T | T T  i | ‘(’ E ‘)

LR(0) parsing running example: expression grammar input  expression EOF expression  expression ‘+’ term | term term  IDENTIFIER | ‘(’ expression ‘)’ short-hand notation Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’

LR(0) parsing keep track of progress inside potential handles when consuming input tokens LR items: N     initial set  -closure: expand dots in front of non-terminals Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’ Z   E $ S0

LR(0) parsing keep track of progress inside potential handles when consuming input tokens LR items: N     initial set  -closure: expand dots in front of non-terminals Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’ Z   E $ E   E ‘+’ T E   T T   i T   ‘(’ E ‘)’ S0

LR(0) parsing S0 stack i + i $ input Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’ shift input token (i) onto the stack compute new state

LR(0) parsing S0 i S1 stack + i $ input Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’ reduce handle on top of the stack compute new state

LR(0) parsing reduce handle on top of the stack compute new state S0 T S2 stack + i $ input Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’ i

LR(0) parsing shift input token on top of the stack compute new state S0 E S3 stack + i $ input Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’ T i

LR(0) parsing shift input token on top of the stack compute new state S0 E S3 + S4 stack i $ input Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’ T i

LR(0) parsing reduce handle on top of the stack compute new state S0 E S3 + S4 i S1 stack $ input Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’ T i

LR(0) parsing reduce handle on top of the stack compute new state S0 E S3 + S4 T S5 stack $ input Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’ T i i

LR(0) parsing shift input token on top of the stack compute new state S0 E S3 stack $ input Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’ E+T iT i

LR(0) parsing reduce handle on top of the stack compute new state S0 E S3 $ S6 stackinput Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’ E+T iT i

LR(0) parsing accept! S0 Z stackinput Z  E $ E  E ‘+’ T E  T T  i T  ‘(’ E ‘)’ E+T iT i E$

Transition diagram Z   E $ E   E ‘+’ T E   T T   i T   ‘(’ E ‘)’ S0 Z  E  $ E  E  ‘+’ T S3 Z  E $  S6 T i E  E + T  S5 E  E ‘+’  T T   i T   ‘(’ E ‘)’ S4 E  T  S2 T  i  S1 T i ‘+’ E $

Exercise (8 min.) complete the transition diagram for the LR(0) automaton can you think of a single input expression that causes all states to be used? If yes, give an example. If no, explain.

Answers

Answers (fig 2.89) Z   E $ E   E ‘+’ T E   T T   i T   ‘(’ E ‘)’ S0 Z  E  $ E  E  ‘+’ T S3 Z  E $  S6 T i E  E + T  S5 E  E ‘+’  T T   i T   ‘(’ E ‘)’ S4 E  T  S2 T  i  S1 T i ‘+’ E $ T  ‘(’  E ‘)’ E   E ‘+’ T E   T T   i T   ‘(’ E ‘)’ S7 ‘(’ T i T  ‘(‘ E  ‘)’ E  E  ‘+’ T S8 T  ‘(‘ E ‘)’  S9 ‘)’ E ‘(’ ‘+’

Answers The following expression exercises all states ( i ) + i

The LR tables state GOTO table ACTION table i+()$ET 01732shift 1 T  i 2 E  T 346shift E  E + T 6 Z  E $ 71782shift T  ( E )

LR(0) parsing concise notation stackinput action S0i + i $ shift S0 i S1 + i $ reduce by T  i S0 T S2 + i $ reduce by E  T S0 E S3 + i $ shift S0 E S3 + S4 i $ shift S0 E S3 + S4 i S1 $ reduce by T  i S0 E S3 + S4 T S5$ reduce by E  E + T S0 E S3$ shift S0 E S3 $ S6 reduce by Z  E $ S0 Z accept

The LR push-down automaton SWITCH action_table[top of stack]: CASE “shift”: see book; CASE (“reduce”, N   ): POP the symbols of  FROM the stack; SET state TO top of stack; PUSH N ON the stack; SET new state TO goto_table[state,N]; PUSH new state ON the stack; CASE empty: ERROR;

Break

LR(0) conflicts shift-reduce conflict array indexing: T  i [ E ] T  i  [ E ](shift) T  i  (reduce)  -rule: RestExpr   Expr  Term  RestExpr(shift) RestExpr   (reduce)

LR(0) conflicts reduce-reduce conflict assignment statement: Z  V := E $ V  i  (reduce) T  i  (reduce) typical LR(0) table contains many conflicts

Handling LR(0) conflicts solution: use a one-token look-ahead two-dimensional ACTION table [state,token] different construction of ACTION table SLR(1) – Simple LR LR(1) LALR(1) – Look-Ahead LR

SLR(1) parsing solves (some) shift-reduce conflicts reduce N   iff token  FOLLOW(N) FOLLOW(T) = { ‘+’, ‘)’, $ } FOLLOW(E) = { ‘+’, ‘)’, $ } FOLLOW(Z) = { $ }

state look-ahead token i+()$ 0shift 1 T  i 2 E  T 3shift 4 5 E  E + T 6 Z  E $ 7shift 8 9 T  ( E ) SLR(1) ACTION table

SLR(1) ACTION/GOTO table state stack symbol / look-ahead token i+()$ET 0s1s1s7s7s3s3s2s2 1r4 2r2 3s4s4s6s6 4s1s1s7s7s5s5 5r3 6r1 7s1s1s7s7s8s8s2s2 8s4s4s9s9 9r5 1: Z  E $ 2: E  E ‘+’ T 3: E  T 4: T  i 5: T  ‘(’ E ‘)’ sn – shift to state n rn – reduce rule n

SLR(1) ACTION/GOTO table 1: Z  E $ 2: E  E ‘+’ T 3: E  T 4: T  i 5: T  ‘(’ E ‘)’ 6: T  i ‘[‘ E ‘]’ state stack symbol / look-ahead token i+()$ET 0s1s1s7s7s3s3s2s2 1r4 2r2 3s4s4s6s6 4s1s1s7s7s5s5 5r3 6r1 7s1s1s7s7s8s8s2s2 8s4s4s9s9 9r5 state stack symbol / look-ahead token i+()[]$ET 0s1s1s7s7s3s3s2s2 1r4 2r2 3s4s4s6s6 4s1s1s7s7s5s5 5r3 6r1 7s1s1s7s7s8s8s2s2 8s4s4s9s9 9r5 sn – shift to state n rn – reduce rule n

SLR(1) ACTION/GOTO table 1: Z  E $ 2: E  E ‘+’ T 3: E  T 4: T  i 5: T  ‘(’ E ‘)’ 6: T  i ‘[‘ E ‘]’ state stack symbol / look-ahead token i+()$ET 0s1s1s7s7s3s3s2s2 1r4 2r2 3s4s4s6s6 4s1s1s7s7s5s5 5r3 6r1 7s1s1s7s7s8s8s2s2 8s4s4s9s9 9r5 state stack symbol / look-ahead token i+()[]$ET 0s1s1s7s7s3s3s2s2 1r4 2r2 3s4s4s6s6 4s1s1s7s7s5s5 5r3 6r1 7s1s1s7s7s8s8s2s2 8s4s4s9s9 9r5 sn – shift to state n rn – reduce rule n

SLR(1) ACTION/GOTO table 1: Z  E $ 2: E  E ‘+’ T 3: E  T 4: T  i 5: T  ‘(’ E ‘)’ 6: T  i ‘[‘ E ‘]’ state stack symbol / look-ahead token i+()$ET 0s1s1s7s7s3s3s2s2 1r4 2r2 3s4s4s6s6 4s1s1s7s7s5s5 5r3 6r1 7s1s1s7s7s8s8s2s2 8s4s4s9s9 9r5 state stack symbol / look-ahead token i+()[]$ET 0s1s1s7s7s3s3s2s2 1r4 s10r4 2r2 3s4s4s6s6 4s1s1s7s7s5s5 5r3 6r1 7s1s1s7s7s8s8s2s2 8s4s4s9s9 9r5 sn – shift to state n rn – reduce rule n

Unfortunately... SLR(1) leaves many shift-reduce conflicts unsolved problem: FOLLOW(N) set is a union of all contexts in which N may occur example S  A | x b A  a A b | x

LR(0) automaton S   A S   x b A   a A b A   x S0 A  a  A b A   a A b A   x S  x  b A  x  S1 S  x b  S2 x b S6 A  a A b  S7 A b A  a A  b a S  A  S3 S4 A A  x  S5 x a 1: S  A 2: S  x b 3: A  a A b 4: A  x

Exercise (6 min.) derive the SLR(1) ACTION/GOTO table (with shift-reduce conflict) for the following grammar: S  A | x b A  a A b | x

Answers

state stack symbol / look-ahead token abx$A 0s4s1s3 1s2/r4r4 2r2 3r1 4s4s5s6 5r4 6s7 7r3 1: S  A 2: S  x b 3: A  a A b 4: A  x FOLLOW(S) = {$} FOLLOW(A) = {$,b}

LR(1) parsing maintain follow set per item LR(1) item:N     {  }  - closure for LR(1) item sets: if set S contains an item P    N  {  } then foreach production rule N   S must contain the item N    {  } where  = FIRST(  {  } )

A  a  A b {b} A   a A b {b} A   x {b} S9 A  a A b  {b} S10 A b S8 A x A  a A  b {b} LR(1) automaton S   A {$} S   x b {$} A   a A b {$} A   x {$} S0 A  a  A b {$} A   a A b {b} A   x {b} S  x  b {$} A  x  {$} S1 S  x b  {$} S2 x b S6 A  a A b  {$} S7 A b A  a A  b {$} a S  A  {$} S3 S4 A A  x  {b} S5 x a

LALR(1) parsing LR tables are big combine “equal” sets by merging look- ahead sets

A  a  A b {b} A   a A b {b} A   x {b} S9 A  a A b  {b} S10 A b S8 A x A  a A  b {b} LALR(1) automaton S   A {$} S   x b {$} A   a A b {$} A   x {$} S0 A  a  A b {$} A   a A b {b} A   x {b} S  x  b {$} A  x  {$} S1 S  x b  {$} S2 x b S6 A  a A b  {$} S7 A b A  a A  b {$} a S  A  {$} S3 S4 A A  x  {b} S5 x a

A  a A b  {b,$} LALR(1) automaton S   A {$} S   x b {$} A   a A b {b,$} A   x {b,$} S0 A  a  A b {b,$} A   a A b {b,$} A   x {b,$} S  x  b {$} A  x  {b,$} S1 S  x b  {$} S2 x b S6 S7 A b A  a A  b {b,$} a S  A  {$} S3 S4 A S5 x S   A {$} S   x b {$} A   a A b {$} A   x {$} A  a  A b {b,$} A   a A b {b,$} A   x {b,$} S  x  b {$} A  x  {$} A a A  a A  b {b,$} A  x  {b}

LALR(1) ACTION/GOTO table state stack symbol / look-ahead token abx$A 0s4s1s3 1s2r4 2r2 3r1 4s4s5s6 5r4 6s7 7r3 1: S  A 2: S  x b 3: A  a A b 4: A  x

Making grammars LR(1) – or not? grammars are often ambiguous E  E ‘+’ E | E ‘*’ E | i handle shift-reduce conflicts (default) longest match: shift precedence directives input: i * i + i E  E  ‘+’ E E  E ‘*’ E  EE *E + EE +E * reduceshift

syntax analysis: tokens  AST bottom-up parsing push-down automaton ACTION/GOTO tables LR(0)NO look-ahead SLR(1)one-token look-ahead, FOLLOW sets to solve shift-reduce conflicts LR(1)SLR(1), but FOLLOW set per item LALR(1)LR(1), but “equal” states are merged Summary

Homework study sections: 1.10 closure algorithm error handling in LR(1) parsers print handout for next week [blackboard] find a partner for the “practicum” register your group send to