Prof. Bodik CS 164 Lecture 91 Bottom-up parsing CS164 3:30-5:00 TT 10 Evans.

Slides:



Advertisements
Similar presentations
Parsing V: Bottom-up Parsing
Advertisements

A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.
Lesson 8 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Pushdown Automata Consists of –Pushdown stack (can have terminals and nonterminals) –Finite state automaton control Can do one of three actions (based.
Bhaskar Bagchi (11CS10058) Lecture Slides( 9 th Sept. 2013)
Predictive Parsing l Find derivation for an input string, l Build a abstract syntax tree (AST) –a representation of the parsed program l Build a symbol.
6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)
Top-Down Parsing.
CS Summer 2005 Top-down and Bottom-up Parsing - a whirlwind tour June 20, 2005 Slide acknowledgment: Radu Rugina, CS 412.
Context-Free Grammars Lecture 7
Prof. Bodik CS 164 Lecture 81 Grammars and ambiguity CS164 3:30-5:00 TT 10 Evans.
Prof. Bodik CS 164 Lecture 61 Building a Parser II CS164 3:30-5:00 TT 10 Evans.
CS 536 Spring Introduction to Bottom-Up Parsing Lecture 11.
CS 536 Spring Bottom-Up Parsing: Algorithms, part 1 LR(0), SLR Lecture 12.
Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)
Prof. Fateman CS 164 Lecture 91 Bottom-Up Parsing Lecture 9.
CS 330 Programming Languages 09 / 23 / 2008 Instructor: Michael Eckmann.
Professor Yihjia Tsai Tamkang University
LR(1) Languages An Introduction Professor Yihjia Tsai Tamkang University.
Table-driven parsing Parsing performed by a finite state machine. Parsing algorithm is language-independent. FSM driven by table (s) generated automatically.
1 Bottom-up parsing Goal of parser : build a derivation –top-down parser : build a derivation by working from the start symbol towards the input. builds.
– 1 – CSCE 531 Spring 2006 Lecture 8 Bottom Up Parsing Topics Overview Bottom-Up Parsing Handles Shift-reduce parsing Operator precedence parsing Readings:
9/27/2006Prof. Hilfinger, Lecture 141 Parsing Odds and Ends Lecture 14 (P. N. Hilfinger, plus slides adapted from R. Bodik)
Bottom-up parsing Goal of parser : build a derivation
1 Introduction to Parsing Lecture 5. 2 Outline Regular languages revisited Parser overview Context-free grammars (CFG’s) Derivations.
Parsing IV Bottom-up Parsing Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Syntax and Semantics Structure of programming languages.
BOTTOM-UP PARSING Sathu Hareesh Babu 11CS10039 G-8.
Top-Down Parsing - recursive descent - predictive parsing
4 4 (c) parsing. Parsing A grammar describes the strings of tokens that are syntactically legal in a PL A recogniser simply accepts or rejects strings.
Parsing Jaruloj Chongstitvatana Department of Mathematics and Computer Science Chulalongkorn University.
CS 321 Programming Languages and Compilers Bottom Up Parsing.
Profs. Necula CS 164 Lecture Top-Down Parsing ICOM 4036 Lecture 5.
Review 1.Lexical Analysis 2.Syntax Analysis 3.Semantic Analysis 4.Code Generation 5.Code Optimization.
CMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 1 Chapter 4 Chapter 4 Bottom Up Parsing.
Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.
Syntax and Semantics Structure of programming languages.
Prof. Necula CS 164 Lecture 8-91 Bottom-Up Parsing LR Parsing. Parser Generators. Lecture 6.
Bottom-Up Parsing David Woolbright. The Parsing Problem Produce a parse tree starting at the leaves The order will be that of a rightmost derivation The.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
1 Nonrecursive Predictive Parsing  It is possible to build a nonrecursive predictive parser  This is done by maintaining an explicit stack.
Top-down Parsing. 2 Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves Pick a production.
Top-Down Parsing.
CS 330 Programming Languages 09 / 25 / 2007 Instructor: Michael Eckmann.
UMBC  CSEE   1 Chapter 4 Chapter 4 (b) parsing.
Bottom Up Parsing CS 671 January 31, CS 671 – Spring Where Are We? Finished Top-Down Parsing Starting Bottom-Up Parsing Lexical Analysis.
Bottom-Up Parsing Algorithms LR(k) parsing L: scan input Left to right R: produce Rightmost derivation k tokens of lookahead LR(0) zero tokens of look-ahead.
CPSC46001 Bottom-up Parsing Reading Sections 4.5 and 4.7 from ASU.
1 Chapter 6 Bottom-Up Parsing. 2 Bottom-up Parsing A bottom-up parsing corresponds to the construction of a parse tree for an input tokens beginning at.
CS 154 Formal Languages and Computability March 22 Class Meeting Department of Computer Science San Jose State University Spring 2016 Instructor: Ron Mak.
Syntax and Semantics Structure of programming languages.
Parsing Bottom Up CMPS 450 J. Moloney CMPS 450.
Programming Languages Translator
Bottom-up parsing Goal of parser : build a derivation
CS510 Compiler Lecture 4.
Unit-3 Bottom-Up-Parsing.
Parsing IV Bottom-up Parsing
CS 404 Introduction to Compiler Design
Bottom-Up Syntax Analysis
4d Bottom Up Parsing.
Lecture (From slides by G. Necula & R. Bodik)
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Parsing IV Bottom-up Parsing
LR Parsing. Parser Generators.
4d Bottom Up Parsing.
Kanat Bolazar February 16, 2010
4d Bottom Up Parsing.
Compiler Construction
4d Bottom Up Parsing.
4d Bottom Up Parsing.
Presentation transcript:

Prof. Bodik CS 164 Lecture 91 Bottom-up parsing CS164 3:30-5:00 TT 10 Evans

Prof. Bodik CS 164 Lecture 9 2 Welcome to the running example we’ll build a parser for this grammar: E  E + T | E – T | T T  T * int | int see, the grammar is –left-recursive –not left-factored … and our parser won’t mind! –we can make the grammar ambiguous, too

Prof. Bodik CS 164 Lecture 9 3 Example input, parse tree input: int + int * int its parse tree: int+* T E T T E

Prof. Bodik CS 164 Lecture 9 4 Chaotic bottom-up parsing Key idea: build the derivation in reverse int+* E E T TT E E + T T + T T + T * int int + T * int int + int * int

Prof. Bodik CS 164 Lecture 9 5 Chaotic bottom-up parsing The algorithm: 1.stare at the input string s feel free to look anywhere in the string 2.find in s a right-hand side r of a production N  r ex.: found int for a production T  int 3.reduce the found string r into its non-terminal N –ex.: replace int with T 4.if string reduced to start non-terminal –we’re done, string is parsed, we got a parse tree 5.otherwise continue in step 1

Prof. Bodik CS 164 Lecture 9 6 Don’t celebrate yet! not guaranteed to parse a correct string –is this surprising? example: int+* E T and we are stuck int + E * int int + T * int int + int * int

Prof. Bodik CS 164 Lecture 9 7 Lesson from chaotic parser Lesson: –if you’re lucky in selecting the string to reduce next, then you will successfully parse the string How to “beat the odds”? –that is, how to find a lucky sequence of reductions that gives us a derivation of the input string? –use non-determinism!

Prof. Bodik CS 164 Lecture 9 8 What’s this non-determinism, again? You took cs164, then became a stock broker: –want 16 celebrities sign you as their private broker –here’s how: send free advice to 1024 celebrities to half of them: “MSFT will go up tomorrow, buy now” guess what’s your advice for the other 512 folks –send free advice to 512 who got the correct advice to half of them: “AAPL will go down tomorrow, sell now” –… –then apply for a broker job with the 16 who got six correct predictions in a row that’s sorta how we’ll parse the string

Prof. Bodik CS 164 Lecture 9 9 Non-deterministic chaotic parser The algorithm: 1.find in input all strings that can be reduced –assume there are k of them 2.create k copies of the (partially reduced) input –it’s like spawning k identical instances of the parser 3.in each instance, perform one of k reductions –and then go to step 1, advancing and further spawning all parser instances 4.stop when at least one parser instance reduced the string to start non-terminal

Prof. Bodik CS 164 Lecture 9 10 Properties of the n.d. chaotic parser Claim: –the input will be parsed by (at least) one parser instance But: –exponential blowup: k*k*k*…*k parser copies –(how many k’s are there?) Also: –Multiple (usually many) instances of the parser produce the correct parse tree. This is wasteful.

Prof. Bodik CS 164 Lecture 9 11 Overview Chaotic bottom-up parser –it will give us the parse tree, but only if it’s lucky Non-deterministic bottom-up parser –creates many parser instances to make sure at least one builds the parse trees for the string –an instance either builds the parse tree or gets stuck Non-deterministic LR parser (next) –restrict where a reduction can be made –as a result, fewer instances necessary

Prof. Bodik CS 164 Lecture 9 12 Non-deterministic LR parser What we want: –create multiple parser instances to find the lucky sequence of reductions –but the parse tree is found by at most one instance zero if the input has syntax error

Prof. Bodik CS 164 Lecture 9 13 Two simple rules to restrict # of instances 1.split the input in two parts: right: unexamined by parser left: in the parser (we’ll do the reductions here) int  + int * int after reduction: T  + int * int 2.reductions allowed only on right part next to split allowed: T + int  * int after reduction: T + T  * int not allowed: int + int  * int after reduction: T + int  * int  hence, left part of string can be kept on the stack

Prof. Bodik CS 164 Lecture 9 14 Wait a minute! Aren’t these restrictions fatally severe? –the doubt: no instance succeeds to parse the input No. recall: one parse tree  multiple derivations –in n.d. chaotic parser, the instances that build the same parse tree each follow a different derivation

Prof. Bodik CS 164 Lecture 9 15 Wait a minute! (cont) recall: two interesting derivations –left-most derivation, right-most derivation LR parser builds right-most derivation –but does so in reverse: first step of derivation is the last reduction (the reduction to start nonterminal) –example coming in two slides hence the name: –L: scan input left to right –R: right-most derivation so, if there is a parse tree, LR parser will build it! –this is the key theorem

Prof. Bodik CS 164 Lecture 9 16 LR parser actions The left part of the string will be on the stack –the  symbol is the top of stack Two simple actions –reduce: like in chaotic parser, but must replace a string on top of stack –shift: shifts  to the right, which moves a new token from input onto stack, potentially enabling more reductions These actions will be chosen non-deterministically

Prof. Bodik CS 164 Lecture 9 17 Example of a correct LR parser sequence A “lucky” sequence of shift/reduce actions (string parsed!): int+* E E T TT EE E + T  E + T * int  E + T *  int E + T  * int E + int  * int E +  int * int E  + int * int T  + int * int int  + int * int  int + int * int

Prof. Bodik CS 164 Lecture 9 18 Example of an incorrect LR parser sequence Where did the parser instance make the mistake? int+* T TT stuck! why can’t we reduce to E + T ? T + T  T + T * int  T + T *  int T + T  * int T + int  * int T +  int * int T  + int * int int  + int * int  int + int * int

Prof. Bodik CS 164 Lecture 9 19 Non-deterministic LR parser The algorithm: (compare with chaotic n.d. parser) 1.find all reductions allowed on top of stack –assume there are k of them 2.create k new identical instances of the parser 3.in each instance, perform one of the k reductions; in original instance, do no reduction, shift instead –and go to step 1 4.stop when a parser instance reduced the string to start non-terminal

Prof. Bodik CS 164 Lecture 9 20 Overview Chaotic bottom-up parser –tries one derivation (in reverse) Non-deterministic bottom-up parser –tries all ways to build the parse tree Non-deterministic LR parser –restricts where a reduction can be made –as a result, only one instance succeeds (on an unambiguous grammar) all others get stuck Generalized LR parser (next) –idea: kill off instances that are going to get stuck ASAP

Prof. Bodik CS 164 Lecture 9 21 Revisit the incorrect LR parser sequence Key question: What was the earliest stack configuration where we could tell this instance was doomed to get stuck? int+* T TT T + T  T + T * int  T + T *  int T + T  * int T + int  * int T +  int * int T  + int * int int  + int * int  int + int * int

Prof. Bodik CS 164 Lecture 9 22 Doomed stack configurations The parser made a mistake to shift to T +  int * int rather than reducing to E  + int * int The first configuration is doomed –because the T will never appear on top of stack so that it can be reduced to E –hence this instance of the parser can be killed (it will never produce a parse tree)

Prof. Bodik CS 164 Lecture 9 23 How to find doomed parser instances? Look at their stack! How to tell if a stack is doomed: –list all legal (non yet doomed) stack configurations –if a stack is not legal, kill the instance Listing legal stack configurations –list prefixes of all right-most derivations until you see a pattern –describe the pattern as a DFA –if the stack configuration is not from the DFA, it’s doomed

Prof. Bodik CS 164 Lecture 9 24 The stack-checking DFA note: all states are accepting states T * int T T T T - + T * T

Prof. Bodik CS 164 Lecture 9 25 Constructing the stack-checking DFA E  E+ T E  E- T E  T T  T*i nt T  int T  int  E  T  T  T  *int T  T*  int T  T*int  EE+TEE-TEE+TEE-T E  E+  T T  T*i nt T  int E  E-  T T  T*i nt T  int E  E+T  E  E- T  T  T  *int note: this is knows as SLR construction T * int T T T T - + T * T