Bottom Up Parsing CS 671 January 31, 2008. CS 671 – Spring 2008 1 Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis.

Slides:



Advertisements
Similar presentations
A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.
Advertisements

Lesson 8 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Pushdown Automata Consists of –Pushdown stack (can have terminals and nonterminals) –Finite state automaton control Can do one of three actions (based.
CSE 5317/4305 L4: Parsing #21 Parsing #2 Leonidas Fegaras.
Mooly Sagiv and Roman Manevich School of Computer Science
6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)
1 Chapter 5: Bottom-Up Parsing (Shift-Reduce). 2 - attempts to construct a parse tree for an input string beginning at the leaves (the bottom) and working.
1 Bottom Up Parsing. 2 Bottom-Up Parsing l Bottom-up parsing is more general than top-down parsing »And just as efficient »Builds on ideas in top-down.
By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter
CS Summer 2005 Top-down and Bottom-up Parsing - a whirlwind tour June 20, 2005 Slide acknowledgment: Radu Rugina, CS 412.
Lecture #8, Feb. 7, 2007 Shift-reduce parsing,
CS 536 Spring Introduction to Bottom-Up Parsing Lecture 11.
CS 536 Spring Bottom-Up Parsing: Algorithms, part 1 LR(0), SLR Lecture 12.
Bottom Up Parsing.
Chapter 4-2 Chang Chi-Chung Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR Other.
Prof. Fateman CS 164 Lecture 91 Bottom-Up Parsing Lecture 9.
1 LR parsing techniques SLR (not in the book) –Simple LR parsing –Easy to implement, not strong enough –Uses LR(0) items Canonical LR –Larger parser but.
LR(k) Grammar David Rodriguez-Velazquez CS6800-Summer I, 2009 Dr. Elise De Doncker.
LR(1) Languages An Introduction Professor Yihjia Tsai Tamkang University.
Table-driven parsing Parsing performed by a finite state machine. Parsing algorithm is language-independent. FSM driven by table (s) generated automatically.
1 CIS 461 Compiler Design & Construction Fall 2012 slides derived from Tevfik Bultan, Keith Cooper, and Linda Torczon Lecture-Module #12 Parsing 4.
1 Bottom-up parsing Goal of parser : build a derivation –top-down parser : build a derivation by working from the start symbol towards the input. builds.
Bottom-up parsing Goal of parser : build a derivation
Syntax and Semantics Structure of programming languages.
410/510 1 of 21 Week 2 – Lecture 1 Bottom Up (Shift reduce, LR parsing) SLR, LR(0) parsing SLR parsing table Compiler Construction.
LR Parsing Compiler Baojian Hua
Parsing Jaruloj Chongstitvatana Department of Mathematics and Computer Science Chulalongkorn University.
1 Compiler Construction Syntax Analysis Top-down parsing.
Review 1.Lexical Analysis 2.Syntax Analysis 3.Semantic Analysis 4.Code Generation 5.Code Optimization.
Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
Chapter 3-3 Chang Chi-Chung Bottom-Up Parsing LR methods (Left-to-right, Rightmost derivation)  LR(0), SLR, Canonical LR = LR(1), LALR 
Syntax and Semantics Structure of programming languages.
Chapter 5: Bottom-Up Parsing (Shift-Reduce)
Prof. Necula CS 164 Lecture 8-91 Bottom-Up Parsing LR Parsing. Parser Generators. Lecture 6.
111 Chapter 6 LR Parsing Techniques Prof Chung. 1.
Announcements/Reading
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2005.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
10/10/2002© 2002 Hal Perkins & UW CSED-1 CSE 582 – Compilers LR Parsing Hal Perkins Autumn 2002.
GRAMMARS & PARSING. Parser Construction Most of the work involved in constructing a parser is carried out automatically by a program, referred to as a.
Bernd Fischer RW713: Compiler and Software Language Engineering.
Bottom-Up Parsing Algorithms LR(k) parsing L: scan input Left to right R: produce Rightmost derivation k tokens of lookahead LR(0) zero tokens of look-ahead.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 6: LR grammars and automatic parser generators.
1 Syntax Analysis Part II Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007.
Lecture 5: LR Parsing CS 540 George Mason University.
Compilers: Bottom-up/6 1 Compiler Structures Objective – –describe bottom-up (LR) parsing using shift- reduce and parse tables – –explain how LR.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture Ahmed Ezzat.
Bottom-up parsing. Bottom-up parsing builds a parse tree from the leaves (terminals) to the start symbol int E T * TE+ T (4) (2) (3) (5) (1) int*+ E 
1 Chapter 6 Bottom-Up Parsing. 2 Bottom-up Parsing A bottom-up parsing corresponds to the construction of a parse tree for an input tokens beginning at.
Conflicts in Simple LR parsers A SLR Parser does not use any lookahead The SLR parsing method fails if knowing the stack’s top state and next input token.
Chapter 8. LR Syntactic Analysis Sung-Dong Kim, Dept. of Computer Engineering, Hansung University.
Parser Generation Tools (Yacc and Bison) CS 471 September 24, 2007.
COMPILER CONSTRUCTION
Syntax and Semantics Structure of programming languages.
Announcements/Reading
Programming Languages Translator
UNIT - 3 SYNTAX ANALYSIS - II
Bottom-Up Syntax Analysis
Syntax Analysis Part II
Subject Name:COMPILER DESIGN Subject Code:10CS63
Parsing #2 Leonidas Fegaras.
Lecture (From slides by G. Necula & R. Bodik)
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
LALR Parsing Adapted from Notes by Profs Aiken and Necula (UCB) and
Compiler Construction
LR Parsing. Parser Generators.
Parsing #2 Leonidas Fegaras.
Kanat Bolazar February 16, 2010
Compiler design Bottom-up parsing: Canonical LR and LALR
Presentation transcript:

Bottom Up Parsing CS 671 January 31, 2008

CS 671 – Spring Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis Syntactic Analysis Semantic Analysis

CS 671 – Spring Last Time … Building a LL Parser Have a complete recipe for building a parser Language Grammar LL(1) Grammar Predictive Parse Table Recursive-Descent Parser Recursive-Descent Parser w/AST Gen

CS 671 – Spring Bottom-Up Parsing More general than top-down parsing And just as efficient Builds on ideas in top-down parsing Preferred method in practice Also called LR parsing L means that tokens are read left to right R means that it constructs a rightmost derivation

CS 671 – Spring The Idea LR parsing reduces a string to the start symbol by inverting productions: str = input string of terminals repeat –Identify  in str such that A   is a production (i.e., str =   ) –Replace  by A in str (i.e., str becomes  A ) until str = G

CS 671 – Spring A Bottom-up Parse in Detail (1) int++ () int + (int) + (int) () E  int E  E + (E)

CS 671 – Spring A Bottom-up Parse in Detail (2) E int++ () int + (int) + (int) E + (int) + (int) () E  int E  E + (E)

CS 671 – Spring A Bottom-up Parse in Detail (3) E int++ () int + (int) + (int) E + (int) + (int) E + (E) + (int) () E E  int E  E + (E)

CS 671 – Spring A Bottom-up Parse in Detail (4) E int++ () int + (int) + (int) E + (int) + (int) E + (E) + (int) E + (int) E () E E  int E  E + (E)

CS 671 – Spring A Bottom-up Parse in Detail (5) E int++ () int + (int) + (int) E + (int) + (int) E + (E) + (int) E + (int) E + (E) E () E E E  int E  E + (E)

CS 671 – Spring A Bottom-up Parse in Detail (6) E E int++ () int + (int) + (int) E + (int) + (int) E + (E) + (int) E + (int) E + (E) E E () E E E  int E  E + (E)

CS 671 – Spring Notation Split input into two substrings Right substring (a string of terminals) is as yet unexamined by parser Left substring has terminals and non-terminals The dividing point is marked by a  The  is not part of the string Initially, all input is unexamined:  x 1 x 2... x n

CS 671 – Spring Shift-Reduce Parsing Bottom-up parsing uses only two kinds of actions: Shift and Reduce Shift: Move  one place to the right Shifts a terminal to the left string E + ( int )  E + (int  ) Reduce: Apply an inverse production at the right end of the left string If E  E + ( E ) is a production, then E + (E + ( E )  )  E +(E  )

CS 671 – Spring Shift-Reduce Example  int + (int) + (int)$ shift int++ ()() E  int E  E + (E)

CS 671 – Spring Shift-Reduce Example  int + (int) + (int)$ shift int  + (int) + (int)$ red. E  int int++ ()() E  int E  E + (E)

CS 671 – Spring Shift-Reduce Example  int + (int) + (int)$ shift int  + (int) + (int)$ red. E  int E  + (int) + (int)$ shift 3 times E int++ ()() E  int E  E + (E)

CS 671 – Spring Shift-Reduce Example  int + (int) + (int)$ shift int  + (int) + (int)$ red. E  int E  + (int) + (int)$ shift 3 times E + (int  ) + (int)$ red. E  int E int++ ()() E  int E  E + (E)

CS 671 – Spring Shift-Reduce Example  int + (int) + (int)$ shift int  + (int) + (int)$ red. E  int E  + (int) + (int)$ shift 3 times E + (int  ) + (int)$ red. E  int E + (E  ) + (int)$ shift E int++ ()() E E  int E  E + (E)

CS 671 – Spring Shift-Reduce Example  int + (int) + (int)$ shift int  + (int) + (int)$ red. E  int E  + (int) + (int)$ shift 3 times E + (int  ) + (int)$ red. E  int E + (E  ) + (int)$ shift E + (E)  + (int)$ red. E  E + (E) E int++ ()() E E  int E  E + (E)

CS 671 – Spring Shift-Reduce Example  int + (int) + (int)$ shift int  + (int) + (int)$ red. E  int E  + (int) + (int)$ shift 3 times E + (int  ) + (int)$ red. E  int E + (E  ) + (int)$ shift E + (E)  + (int)$ red. E  E + (E) E  + (int)$ shift 3 times E int++ () E () E E  int E  E + (E)

CS 671 – Spring Shift-Reduce Example  int + (int) + (int)$ shift int  + (int) + (int)$ red. E  int E  + (int) + (int)$ shift 3 times E + (int  ) + (int)$ red. E  int E + (E  ) + (int)$ shift E + (E)  + (int)$ red. E  E + (E) E  + (int)$ shift 3 times E + (int  )$ red. E  int E int++ () E () E E  int E  E + (E)

CS 671 – Spring Shift-Reduce Example  int + (int) + (int)$ shift int  + (int) + (int)$ red. E  int E  + (int) + (int)$ shift 3 times E + (int  ) + (int)$ red. E  int E + (E  ) + (int)$ shift E + (E)  + (int)$ red. E  E + (E) E  + (int)$ shift 3 times E + (int  )$ red. E  int E + (E  )$ shift E int++ () E () E E E  int E  E + (E)

CS 671 – Spring Shift-Reduce Example E int++ () E () E E  int + (int) + (int)$ shift int  + (int) + (int)$ red. E  int E  + (int) + (int)$ shift 3 times E + (int  ) + (int)$ red. E  int E + (E  ) + (int)$ shift E + (E)  + (int)$ red. E  E + (E) E  + (int)$ shift 3 times E + (int  )$ red. E  int E + (E  )$ shift E + (E)  $ red. E  E + (E) E  int E  E + (E)

CS 671 – Spring Shift-Reduce Example E E int++ () E () E E  int + (int) + (int)$ shift int  + (int) + (int)$ red. E  int E  + (int) + (int)$ shift 3 times E + (int  ) + (int)$ red. E  int E + (E  ) + (int)$ shift E + (E)  + (int)$ red. E  E + (E) E  + (int)$ shift 3 times E + (int  )$ red. E  int E + (E  )$ shift E + (E)  $ red. E  E + (E) E  $ accept E  int E  E + (E)

CS 671 – Spring How do we keep track? Left part string implemented as a stack Top of the stack is the  Shift: –Pushes a terminal on the stack Reduce: –Pops 0 or more symbols off of the stack –Symbols are right-hand side of a production –Pushes a non-terminal on the stack (production LHS)

CS 671 – Spring LR(0) Parser Left-to-right scanning, Right-most derivation, “zero” look-ahead characters Too weak to handle most language grammars (e.g., “sum” grammar) But will help us lay the groundwork for more sophisticated parsers

CS 671 – Spring LR(0) States A state is a set of items keeping track of progress on possible upcoming reductions An LR(0) item is a production from the language with a separator “.” somewhere in the RHS of the production Stuff before “.” is already on stack (beginnings of possible γ’s to be reduced) Stuff after “.” : what we might see next The prefixes  represented by state itself E → num ● E → (● S ) state item

CS 671 – Spring Start State & Closure Constructing a DFA to read stack: First step: augment grammar with prod’n S’ → S $ Start state of DFA: empty stack = S’ →. S $ Closure of a state adds items for all productions whose LHS occurs in an item in the state, just after “.” –Set of possible productions to be reduced next –Added items have the “.” located at the beginning: no symbols for these items on the stack yet S’ →. S $ S →. ( L ) S →. id S → ( L ) | id L → S | L, S closure

CS 671 – Spring Applying Terminal Symbols In new state, include all items that have appropriate input symbol just after dot, advance dot in those items, and take closure. S ’ →. S $ S →. ( L ) S →. id S → id. S → (. L ) L →. S L →. L, S S →. ( L ) S →. id S → ( L ) | id L → S | L, S ( id (

CS 671 – Spring Applying Nonterminal Symbols Non-terminals on stack treated just like terminals (except added by reductions) S ’ →. S $ S →. ( L ) S →. id S → id. S → (. L ) L →. S L →. L, S S →. ( L ) S →. id ( id ( L → S. S → ( L. ) L → L., S L S S → ( L ) | id L → S | L, S

CS 671 – Spring Applying Reduce Actions Pop RHS off stack, replace with LHS X (X → γ) S ’ →. S $ S →. ( L ) S →. id S → id. S → (. L ) L →. S L →. L, S S →. ( L ) S →. id ( id ( L → S. S → ( L. ) L → L., S L S States causing reductions S → ( L ) | id L → S | L, S

CS 671 – Spring Full DFA reduce-only state: reduce if shift transition for look-ahead: shift otherwise: syntax error current state: push stack through DFA S ’ →. S $ S →. ( L ) S →. id S ’ → S. $ final state S → (. L ) L →. S L →. L, S S →. ( L ) S →. id S → id. L → L,. S S →. ( L ) S →. id S → ( L. ) L → L., S L → S.S → ( L ). L → L, S. S $ ( id S L ), S ( S → ( L ) | id L → S | L, S (

CS 671 – Spring Parsing Example: ((x),y) derivation stack input action ((x),y) ← 1 ((x),y) shift, goto 3 ((x),y) ← 1 ( 3 (x),y) shift, goto 3 ((x),y) ← 1 ( 3 ( 3 x),y) shift, goto 2 ((x),y) ← 1 ( 3 ( 3 x 2 ),y)reduce Sid ((S),y) ← 1 ( 3 ( 3 S 7 ),y) reduce LS ((L),y) ← 1 ( 3 ( 3 L 5 ),y) shift, goto 6 ((L),y) ← 1 ( 3 ( 3 L 5 ) 6,y) reduce S(L) (S,y) ← 1 ( 3 S 7,y) reduce LS (L,y) ← 1 ( 3 L 5,y) shift, goto 8 (L,y) ← 1 ( 3 L 5, 8 y) shift, goto 9 (L,y) ← 1 ( 3 L 5, 8 y 2 ) reduce Sid (L,S) ← 1 ( 3 L 5, 8 S 9 ) reduce LL,S (L) ← 1 ( 3 L 5 ) shift, goto 6 (L) ← 1 ( 3 L 5 ) 6 reduce S(L) S 1 S 4 $ done S → ( L ) | id L → S | L, S

CS 671 – Spring Implementation: LR Parsing Table next action next state input (terminal) symbolsnon-terminal symbols state Action table Used at every step to decide whether to shift or reduce Goto table Used only when reducing, to determine next state a X X   ▪

CS 671 – Spring Shift-Reduce Parsing Table Action table 1. shift and goto state n 2. reduce using X → γ –pop symbols γ off stack –using state label of top (end) of stack, look up X in goto table and goto that state DFA + stack = push-down automaton (PDA) next actions next state on red’n state terminal symbolsnon-terminal symbols

CS 671 – Spring List Grammar Parsing Table ()id,$SL 1s3s2g4 2S → id 3s3s2g7g5 4accept 5s6s8 6S → (L) 7L→SL→SL→SL→SL→SL→SL→SL→SL→SL→S 8s3s2g9 9L → L,S

CS 671 – Spring LR(0) Limitations An LR(0) machine only works if states with reduce actions have a single reduce action – in those states, always reduce ignoring lookahead With more complex grammar, construction gives states with shift/reduce or reduce/reduce conflicts Need to use look-ahead to choose L → L, S. S → S., L L → L, S. L → S. okshift/reducereduce/reduce

CS 671 – Spring LR(1) Parsing As much power as possible out of 1 lookahead symbol parsing table LR(1) grammar = recognizable by a shift/reduce parser with 1 look-ahead. LR(1) item = LR(0) item + look-ahead symbols possibly following production LR(0): S →. S + E LR(1): S →. S + E +

CS 671 – Spring LR(1) State LR(1) state = set of LR(1) items LR(1) item = LR(0) item + set of lookahead symbols No two items in state have same production + dot configuration S → S. + E + S → S. + E $ S → S +. E num S → S. + E +,$ S → S +. E num

CS 671 – Spring LALR Grammars Problem with LR(1): too many states LALR(1) (Look-Ahead LR) –Merge any two LR(1) states whose items are identical except look-ahead –Results in smaller parser tables—works extremely well in practice –Usual technology for automatic parser generators S → id. + S → E. $ S → id. $ S → E. + += ?

CS 671 – Spring The LALR(1) DFA Algorithm: repeat Choose two states with same core Merge the states by combining the items Point edges from predecessors to new state New state points to all the previous successors until all states have distinct core A ED CB F A BE D C F

CS 671 – Spring int E  int on $, + E  int on ), + E  E + (E) on $, + E  E + (E) on ), + ( + E int E + ) ( int E ) accept on $ int E  int on $, +, ) E  E + (E) on $, +, ) ( E int 01,5 23,84,9 6,107, ) E accept on $ Conversion LR(1) to LALR(1)

CS 671 – Spring Notes on Parsing Parsing A solid foundation: context-free grammars A simple parser: LL(1) A more powerful parser: LR(1) An efficiency hack: LALR(1) Issues with LR parsers LALR(1) parser generators

CS 671 – Spring Issues with LR Parsers What happens if a state contains: [X   a, b] and [Y  , a] Then on input “a” we could either Shift into state [X  a , b], or Reduce with Y   This is called a shift-reduce conflict Typically due to ambiguity

CS 671 – Spring Shift/Reduce Conflicts Classic example: the dangling else S  if E then S | if E then S else S | OTHER Will have DFA state containing [S  if E then S, else] [S  if E then S else S, x] Practical solutions: Modify grammar to reflect the precedence of else Many LR parsers default to “shift” Often have a precedence declaration

CS 671 – Spring Another Example Consider the ambiguous grammar E E + E | E * E | int Part of the DFA: We have a shift/reduce on input + What do we want to happen? Consider: x * y + z We need to reduce (* binds more tightly than +) Default action is shift [E  E * E, +] [E  E + E, +] … [E  E * E, +] [E  E + E, +] … E

CS 671 – Spring Declare relative precedence Explicitly resolve conflict Tell parser: we prefer the action involving * over + In practice: Parser generators support a precedence declaration for operators Precedence [E  E * E, +] [E  E + E, +] [E  E * E, +] [E  E + E, +] E

CS 671 – Spring Other Problems If a DFA state contains both [X  , a] and [Y  , a] What’s the problem here? Two reductions to choose from when next token is a This is called a reduce/reduce conflict Usually a serious ambiguity in the grammar Must be fixed in order to generate parser

CS 671 – Spring Reduce/Reduce Conflicts Example: a sequence of identifiers S   | id | id S There are two parse trees for the string id S  id S  id S  id How does this confuse the parser?

CS 671 – Spring Reduce/Reduce Conflicts Consider the DFA states: Reduce/reduce conflict on input $ G  S  id G  S  id S id Instead rewrite the grammar: S   | id S [G  S, $] [S , $] [S  id, $] [S  id S, $] [S  id, $] [S  id S, $] [S , $] [S  id, $] [S  id S, $] id

CS 671 – Spring LR Parsing LR Parsing Engine a 1 a 2 … a i … a n $ Input: Scanner s m X m s m-1 X m-1 … s 0 Stack: Parser Generator ActionGoto Grammar Compiler construction LR tables