CS 2130 Lecture 18 Bottom-Up Parsing or Shift-Reduce Parsing Warning: The precedence table given for the Wff grammar is in error.


Parsing -- Syntax/Semantic Analysis
Top-down Parsing: 1. Root node → leaves 2. Abstract → concrete 3. Uses grammar left → right 4. Works by "guessing"
Bottom-up Parsing: 1. Leaves → root node 2. Concrete → abstract 3. Uses grammar right → left 4. Works by "pattern matching"

Introduction Bottom-up parsing –Scan across the string to be parsed –Attempt to find patterns that match the right hand side of a rule –Reduce them to the left hand side of the rule –If the eventual result is reduction to the start symbol then the parse is successful

Imagine... We are parsing … * 3, or in token form num + num * num. We need some way to make sure that we don't turn the num + num into <expr> + <term> and reduce it to <expr>. Can num + num be reduced to <expr>? Why is it a problem?

Problem... We cannot reduce <expr> * num. What we need is a way of recognizing that we must reduce num * num first in num + num * num

Recall our expression grammar
<expr> ::= <expr> + <term> | <term>
<term> ::= <term> * <factor> | <factor>
<factor> ::= '(' <expr> ')' | num | id
+ It would suggest that what follows a + must be a term.
* It would also suggest that if a num is followed by a * then we will somehow need to find a factor to perform <term> ::= <term> * <factor>
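To make the problem concrete, here is a minimal sketch of a hand-written shift-reduce recognizer for the expression grammar. It is not the precedence-table method developed in the rest of the lecture: the function name `parse_expr`, the end marker `$`, and the one-token lookahead test are all assumptions made for illustration. The point is the single check that refuses to reduce a term to an expression while a * is still waiting in the input.

```python
# Hypothetical sketch (not from the slides) for the grammar
#   <expr>   ::= <expr> + <term> | <term>
#   <term>   ::= <term> * <factor> | <factor>
#   <factor> ::= '(' <expr> ')' | num | id

def parse_expr(tokens):
    """Return the reductions performed, in order; raise SyntaxError on failure."""
    stack, reductions = [], []
    tokens = list(tokens) + ["$"]  # "$" is an assumed end-of-input marker
    pos = 0

    def reduce(count, lhs):
        # Replace the top `count` stack symbols (the handle) with `lhs`.
        rhs = " ".join(stack[-count:])
        del stack[-count:]
        stack.append(lhs)
        reductions.append(f"{lhs} ::= {rhs}")

    while True:
        look = tokens[pos]
        top = stack[-1] if stack else None
        if top in ("num", "id"):
            reduce(1, "<factor>")
        elif top == ")" and stack[-3:-1] == ["(", "<expr>"]:
            reduce(3, "<factor>")
        elif top == "<factor>":
            reduce(3 if stack[-3:-1] == ["<term>", "*"] else 1, "<term>")
        elif top == "<term>" and look != "*":
            # Only when no * follows is it safe to fold the term into an expression.
            reduce(3 if stack[-3:-1] == ["<expr>", "+"] else 1, "<expr>")
        elif look == "$":
            break
        else:
            stack.append(look)  # shift the next token
            pos += 1

    if stack != ["<expr>"]:
        raise SyntaxError(f"cannot parse; stack is {stack}")
    return reductions
```

Calling `parse_expr(["num", "+", "num", "*", "num"])` records the reduction by `<term> ::= <term> * <factor>` before the reduction by `<expr> ::= <expr> + <term>`, which is the ordering the slides say we need.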

Bottom Up Parsing Bottom up parsing tries to group tokens into things it can reduce (based on a rule in the grammar) in the correct sequence. This group of symbols is known as a handle. Handles are indicated using special symbols known as Wirth-Weber operators: < = > These symbols function like parentheses, which can be used to indicate precedence: 1 + (2 * 3). We will determine where to put these symbols by examining the grammar and developing additional information to assist us

Wirth-Weber Operators
x < y: y has higher precedence than x (We expect y will be involved in a reduction before x)
x = y: x and y have equal precedence (We expect x and y will be involved in a reduction together)
x > y: x has higher precedence than y (We expect x will be involved in a reduction before y)

Bottom Up Parsing Two Things must be understood: –Given the ability to determine precedence between symbols how can we use this to parse a string? –How do we determine this precedence between symbols/tokens? We deliberately choose to explain in this order and we'll use a very simple grammar to explain

Recall Well Formed Formulae
<wff> ::= p | q | r | s
<wff> ::= N <wff>
<wff> ::= ( C | A | K | E ) <wff> <wff>
Suppose we wish to parse CANpqp

Bottom Up Parsing C A N p q p

Bottom Up Parsing < C A N p q p We can assume that the string has a leading less than precedence operator

Bottom Up Parsing < C < A N p q p We move from left to right (in reality we would normally proceed by asking a lexical scanner for the next token). As we get to each token or symbol we get its precedence from a precedence table that we'll present later

Bottom Up Parsing < C < A < N p q p We continue in this fashion as long as we place the < and = operators

Bottom Up Parsing < C < A < N < p q p We continue in this fashion as long as we place the < and = operators

Bottom Up Parsing < C < A < N < p q p We continue in this fashion as long as we place < and = operators We are postponing the discussion on the precedence table because this part of the algorithm must be clear to be able to understand where the precedence table comes from!

Bottom Up Parsing < C < A < N < p > q p When we place a > operator we have found a handle, or something that we should be able to reduce. We examine the rules of the grammar to see if there is a rule to match this handle

Bottom Up Parsing < C < A < N < p > q p We find <wff> ::= p Note: If no rule is found we have a parse error

Bottom Up Parsing < C < A < N <wff> q p Note that we have removed the entire handle and replaced it with the appropriate symbol from the grammar. We "backup" to examine the relationship between N and <wff>

Bottom Up Parsing < C < A < N = <wff> q p We continue

Bottom Up Parsing < C < A < N = <wff> > q p We continue, again, until we find a handle

Bottom Up Parsing < C < A <wff> q p We reduce the handle N = <wff> using the rule <wff> ::= N <wff>

Bottom Up Parsing < C < A = <wff> q p We continue

Bottom Up Parsing < C < A = <wff> < q p We continue

Bottom Up Parsing < C < A = <wff> < q > p We can reduce this one also

Bottom Up Parsing < C < A = <wff> <wff> p Once again backtracking

Bottom Up Parsing < C < A = <wff> = <wff> p Once again backtracking

Bottom Up Parsing < C < A = <wff> = <wff> > p Continuing

Bottom Up Parsing < C <wff> p Continuing

Bottom Up Parsing < C = <wff> p Continuing

Bottom Up Parsing < C = <wff> < p Continuing

Bottom Up Parsing < C = <wff> < p > A greater than precedence symbol is assumed after the last symbol in the input.

Bottom Up Parsing < C = <wff> <wff> Continuing

Bottom Up Parsing < C = <wff> = <wff> Continuing

Bottom Up Parsing < C = <wff> = <wff> > Again a trailing greater than can be added

Bottom Up Parsing <wff> Since <wff> is our start symbol (and we have nothing left over): Successful Parse!

Bottom Up Parsing What kind of algorithm? –Stack based –Known as semantic stack or shift/reduce algorithm We won't code this algorithm but understanding this parsing technique will make some concepts found in yacc clearer

Example Our stream of tokens C A N p q p

Example Our stream of tokens C A N p q p Stack

Example Our stream of tokens C A N p q p Stack Color Commentary Welcome to Monday Night Parsing

Example Our stream of tokens A N p q p < C Stack We will place the Wirth-Weber operator and following token on the stack. Encountering the end of a handle > will initiate additional processing

Example Our stream of tokens N p q p < A < C Stack Working

Example Our stream of tokens p q p < N < A < C Stack Working

Example Our stream of tokens q p < p < N < A < C Stack Working

Example Our stream of tokens q p < p < N < A < C Stack Now, between the next token in the stream (q) and the symbol on top of the stack, we find a greater than precedence > indicating we have the end of a handle. We must now go down the stack and search for the beginning

Example Our stream of tokens q p <wff> < N < A < C Stack We can remove the p and, looking at the grammar, determine it can be reduced to a <wff>. We then examine the <wff> in relation to the top of the stack

Example Our stream of tokens q p = <wff> < N < A < C Stack The <wff> goes onto the stack with equal precedence to the N beneath it

Example Our stream of tokens q p = <wff> < N < A < C Stack Looking at the <wff> followed by the q we again find a greater than precedence relationship. We find that we can reduce the N <wff> to a <wff>.

Example Our stream of tokens q p <wff> < A < C Stack Now have <wff> from previous reduction. Compare it with A

Example Our stream of tokens q p = <wff> < A < C Stack Now have <wff> from previous reduction. Compare it with A

Example Our stream of tokens p < q = <wff> < A < C Stack Working

Example Our stream of tokens p <wff> = <wff> < A < C Stack q followed by p yields greater than, allowing us to reduce the q to a <wff>

Example Our stream of tokens p <wff> = <wff> < A < C Stack <wff> followed by p yields greater than > so we reduce the A <wff> <wff> to a <wff>

Example Our stream of tokens p = <wff> < C Stack C followed by a <wff> yields equal precedence

Example Our stream of tokens < p = <wff> < C Stack Working

Example Our stream of tokens EOS < p = <wff> < C Stack End of input stream (EOS) allows us to place greater than precedence operator

Example Our stream of tokens EOS <wff> = <wff> < C Stack End of input stream allows us to reduce C <wff> <wff>

Example Our stream of tokens <wff> Stack End of input stream allows us to place greater than precedence operator, allowing reduction to final <wff>. Since <wff> is our start symbol: Successful Parse
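The stack discipline in this example can be sketched in a few lines. This is a simplified, hypothetical version: instead of consulting a precedence table to find the end of a handle, it greedily reduces whenever the top of the stack matches the right-hand side of a rule, which happens to be enough for the prefix-notation Wff grammar. The names `reduce_once` and `parse_wff` are made up for illustration.

```python
# Hypothetical greedy shift-reduce recognizer for
#   <wff> ::= p | q | r | s
#   <wff> ::= N <wff>
#   <wff> ::= ( C | A | K | E ) <wff> <wff>
# Assumption: handles are found by matching rule right-hand sides on top of
# the stack, not by Wirth-Weber operators as in the slides.

WFF = "<wff>"
ATOMS = set("pqrs")
BINARY = set("CAKE")

def reduce_once(stack):
    """Reduce the top of the stack by one rule if possible; report success."""
    if stack and stack[-1] in ATOMS:
        stack[-1] = WFF                      # <wff> ::= p | q | r | s
        return True
    if len(stack) >= 2 and stack[-2] == "N" and stack[-1] == WFF:
        del stack[-2:]                       # <wff> ::= N <wff>
        stack.append(WFF)
        return True
    if len(stack) >= 3 and stack[-3] in BINARY and stack[-2:] == [WFF, WFF]:
        del stack[-3:]                       # <wff> ::= C <wff> <wff>, etc.
        stack.append(WFF)
        return True
    return False

def parse_wff(text):
    """Return True if every character of `text` is consumed into one <wff>."""
    stack = []
    for token in text:
        stack.append(token)                  # shift
        while reduce_once(stack):            # reduce as long as a rule matches
            pass
    return stack == [WFF]
```

For the string used in the slides, `parse_wff("CANpqp")` performs the same reductions in the same order as the walk-through (p, then N <wff>, then q, then A <wff> <wff>, then p, then C <wff> <wff>) and ends with a single <wff> on the stack.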

Questions?

Constructing the Precedence Table: a table which, when given two successive symbols, returns the correct interstitial Wirth-Weber operator

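Precedence Table (rows: Left Hand Symbol; columns: Right Hand Symbol)
          CAKE†   pqrs†   N   <wff>
CAKE†       .       .     .     .
pqrs†       .       .     .     .
N           .       .     .     .
<wff>       .       .     .     .

The Grammar
<wff> ::= p | q | r | s
<wff> ::= N <wff>
<wff> ::= C <wff> <wff>
<wff> ::= A <wff> <wff>
<wff> ::= K <wff> <wff>
<wff> ::= E <wff> <wff>
Consider the previous slides. As we move through a string we want to capture as a handle any occurrence of the rules above

The Grammar
<wff> ::= p | q | r | s
<wff> ::= N = <wff>
<wff> ::= C = <wff> = <wff>
<wff> ::= A = <wff> = <wff>
<wff> ::= K = <wff> = <wff>
<wff> ::= E = <wff> = <wff>
Does this seem logical???

Precedence Table (rows: Left Hand Symbol; columns: Right Hand Symbol)
          CAKE†   pqrs†   N   <wff>
CAKE†       .       .     .     =
pqrs†       .       .     .     .
N           .       .     .     =
<wff>       .       .     .     =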

Now consider Whenever we come across a p, q, r or s we will want to follow this sequence: A p, then A < p, then A < p >. So we might reason that any of the terminals C, A, K, E or N followed by a p, q, r or s will be <, and a p, q, r or s will always be followed by a >

Precedence Table (rows: Left Hand Symbol; columns: Right Hand Symbol)
          CAKE†   pqrs†   N   <wff>
CAKE†       .       <     .     =
pqrs†       >       >     >     >
N           .       <     .     =
<wff>       .       <     .     =

We continue to use this reasoning. Note that anything followed by a C, A, K, E or N should have < precedence to allow a proper WFF to be formed first, i.e. C ??? Note also that the exception, which we have already taken care of, is p, q, r or s followed by C, A, K, E or N

Precedence Table (rows: Left Hand Symbol; columns: Right Hand Symbol)
          CAKE†   pqrs†   N   <wff>
CAKE†       <       <     <     =
pqrs†       >       >     >     >
N           <       <     <     =
<wff>       <       <     <     =
Note: The exception which we have already taken care of is p, q, r or s followed by C, A, K, E or N
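One way to picture "a table which, given two successive symbols, returns the interstitial operator" is as a small lookup function. The sketch below encodes only the rules stated in words on these slides (the atoms are always followed by >, anything else followed by C, A, K, E or N gets <, an operator followed by an atom gets <, and the = pairs come from the grammar). The names `relation` and `annotate` are hypothetical, and pairs the slides do not spell out in words return None rather than a guess.

```python
# Hypothetical lookup built from the rules stated in the slide text.
WFF = "<wff>"
OPERATORS = set("CAKEN")   # C, A, K, E and N
ATOMS = set("pqrs")

def relation(left, right):
    """Return '<', '=' or '>' for two successive symbols, or None if unstated."""
    if left in ATOMS:
        return ">"                      # p, q, r, s are always followed by >
    if right in OPERATORS:
        return "<"                      # a new WFF must be formed first
    if left in OPERATORS and right in ATOMS:
        return "<"
    if right == WFF:
        return "="                      # N = <wff>, C = <wff>, <wff> = <wff>
    return None

def annotate(symbols):
    """Insert operators between successive symbols of a non-empty sequence."""
    out = ["<", symbols[0]]             # a leading < is assumed, as in the slides
    for left, right in zip(symbols, symbols[1:]):
        out += [relation(left, right) or "?", right]
    return " ".join(out)
```

For example, `annotate(list("CANpq"))` gives `< C < A < N < p > q`, matching the point in the walk-through where the first handle is found.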

So It appears that this technique is quite simple We –Construct a grammar –Examine it to produce a precedence table –Write a program to execute our stack based algorithm Not so fast! There are two issues to deal with –Simple precedence –Size

Simple Precedence The technique we have been using is known as Bottom-Up Parsing or Shift-Reduce Parsing. The action we take during operation is based on the precedence relationship found: x < y means shift, x = y means shift, x > y means reduce. What happens if there is no relationship in the table? What happens if there is more than one relationship in the table???

More than one relationship! Gadzooks! Actually we could deal with a pair holding both < and = using lookahead (we'll see that in a moment). However, rules that allowed both > and =, or both > and <, would be known as a shift-reduce error. Speaking of errors, finding two rules that match is known as a reduce-reduce error. Not finding a rule that matches is a syntax error
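To see how a pair of symbols can end up with more than one relationship, the relations can be computed mechanically. The sketch below uses the usual textbook definitions of the Wirth-Weber relations, which are an assumption here since the slides derive the table by informal reasoning: X = Y when X and Y are adjacent in a rule; X < Y when X is adjacent to a nonterminal that can begin with Y; X > Y when X can end a nonterminal that is adjacent to something that is, or can begin with, Y. All function names are hypothetical, and the grammar is assumed to have no empty rules.

```python
from collections import defaultdict

# The Wff grammar: one nonterminal, nine rules.
GRAMMAR = {
    "<wff>": [["p"], ["q"], ["r"], ["s"], ["N", "<wff>"]]
             + [[op, "<wff>", "<wff>"] for op in "CAKE"],
}

def closure_sets(grammar, index):
    """Symbols that can begin (index 0) or end (index -1) each nonterminal."""
    sets = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, rules in grammar.items():
            for rhs in rules:
                sym = rhs[index]
                new = {sym} | sets.get(sym, set())
                if not new <= sets[nt]:
                    sets[nt] |= new
                    changed = True
    return sets

def precedence_relations(grammar):
    """Map each (left, right) pair to the set of relations that hold for it."""
    head = closure_sets(grammar, 0)
    tail = closure_sets(grammar, -1)
    table = defaultdict(set)
    for rules in grammar.values():
        for rhs in rules:
            for x, y in zip(rhs, rhs[1:]):
                table[(x, y)].add("=")
                for h in head.get(y, ()):
                    table[(x, h)].add("<")
                for t in tail.get(x, ()):
                    for f in {y} | head.get(y, set()):
                        table[(t, f)].add(">")
    return table

def conflicts(table):
    """Pairs with more than one relationship."""
    return {pair: rels for pair, rels in table.items() if len(rels) > 1}
```

Under these definitions, `conflicts(precedence_relations(GRAMMAR))` is not empty for the Wff grammar as written: for instance, <wff> followed by p receives both < and >, so the pair would need lookahead or a grammar rewrite to be usable.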

But how can we have multiple precedence relationships?

Recall our expression grammar
<expr> ::= <expr> + <term> | <term>
<term> ::= <term> * <factor> | <factor>
<factor> ::= '(' <expr> ')' | num | id

Precedence Table [figure: an empty table for this grammar; left-hand symbols (L) label the rows, right-hand symbols (R) label the columns]

Some things are impossible + () )(

We know From our WFF example we note that certain items must be reduced immediately (e.g. p, q, r and s). In a similar fashion we have <factor> ::= num | id. So, anything followed by a num or an id will have <, and a num or an id followed by anything will have >

Precedence Table [figure: the table updated with the < and > entries for num and id]

Precedence Between symbols there must be ?
<expr> ::= <expr> ? + ? <term> | <term>
<term> ::= <term> ? * ? <factor> | <factor>
<factor> ::= '(' ? <expr> ? ')' | num | id

Precedence Between symbols there must be =
<expr> ::= <expr> = + = <term> | <term>
<term> ::= <term> = * = <factor> | <factor>
<factor> ::= '(' = <expr> = ')' | num | id

Precedence Table [figure: the table updated with the = entries]

Precedence To determine "end points" we must look at multiple rules to see how they interact... In <wff> ::= N = <wff>, to determine what goes at an end point, we look at the other rules. (Do not be alarmed: we are returning to the wff example just for a moment.)

Precedence * ? ( What is the relationship between * and ( ?

Precedence * ? ( = <expr> = ) What is the relationship between * and ( ? If we have parentheses it must be this form

Precedence <term> = * = <factor>, so * < ( = <expr> = ) We go up the parse tree. Since ( <expr> ) will be a factor, and a factor will need to be reduced as part of <term> * <factor>, we conclude that we will need to reduce the ( <expr> ) first

Precedence + ? ( What is the relationship between + and ( ?

Precedence + ? ( = <expr> = ) Again the grammar reveals that ( must come from ( <expr> )

Precedence <expr> = + = <term>, so + < ( = <expr> = ) We examine the parse tree, noting that a + can only be followed by a <term>, which in turn begins with a <factor>

Precedence Table [figure: the table updated with * < ( and + < (]

Continuing to analyze in this way...

Precedence Table [figure: the table filled in for the whole grammar; cell contents not recoverable]

Now for the complex part Consider ( followed by <expr>. Is it ( <expr> ) giving =, or ( <expr> + giving < ?

Or Consider + followed by <term>. Is it + <term> + giving =, + <term> ) giving =, or + <term> * giving < ?

Precedence Table [figure: the table with two cells holding both = and <, namely '(' followed by <expr> ('(' = <expr> ')' versus '(' < <expr> +) and + followed by <term>]

Resolving Ambiguity + = <term> versus + < <term> * Solve by lookahead: + = <term> when the <term> is followed by + or ); + < <term> when it is followed by *. Ambiguity can be resolved by increasing k in LR(k) but that's not the only way: we could rewrite the grammar

Original Grammar
<expr> ::= <expr> + <term> | <term>
<term> ::= <term> * <factor> | <factor>
<factor> ::= '(' <expr> ')' | num | id

Sources of Ambiguity: '(' followed by <expr> in the <factor> rule, and + followed by <term> in the <expr> rule

Add 2 New Rules, then Modify the existing rules to use them [the new and modified rules did not survive extraction]

Original Grammar versus Rewritten Grammar [the rewritten grammar did not survive extraction]

Bottom-Up Parsing No issues regarding left-recursive versus right-recursive grammars such as those found with top-down parsing. Note: There are grammars that will break a bottom-up parser.

So It appears that this technique is quite simple We –Construct a grammar –Examine it to produce a precedence table –Write a program to execute our stack based algorithm Not so fast! There are two issues to deal with –Simple precedence –Size

Performance Size of table is O(n^2), where n is the number of grammar symbols. For a "real" language this can be a problem. One possibility: Use operator precedence –Only uses terminals –Thus the table size is not affected by adding non-terminals We will not go into details of Operator Precedence Tables. You should be aware that they exist

Question Where do precedence relationships come from? –Make a table by hand –Write a program to make table How such a program works or how to write it are topics beyond the scope of this course.

Example

… * 3 Tokenized: num + num * num

[Animation frames stepping through the placement of Wirth-Weber operators and the reductions for num + num * num; frame contents not recoverable.]
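The "Tokenized" form in this example is what a lexical scanner would hand to the parser. A minimal sketch of such a scanner follows; the function name `tokenize`, the token pattern, and the sample input `1 + 2 * 3` are assumptions for illustration, since the original expression on the slide is not fully legible.

```python
import re

# Hypothetical scanner: numbers become "num", names become "id",
# and + * ( ) stand for themselves.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|([+*()]))")

def tokenize(text):
    """Turn source text into the token names used by the expression grammar."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        number, name, punct = match.groups()
        tokens.append("num" if number else "id" if name else punct)
        pos = match.end()
    return tokens
```

With the assumed input, `tokenize("1 + 2 * 3")` yields the token stream num + num * num used throughout this example.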

Questions?