Exercise 1 A ::= B EOF B ::=  | B B | (B) Tokens: EOF, (, ) Generate constraints and compute nullable and first for this grammar. Check whether first.

Slides:



Advertisements
Similar presentations
Parsing 4 Dr William Harrison Fall 2008
Advertisements

Grammar vs Recursive Descent Parser
Exercise: Balanced Parentheses
Exercise 1: Balanced Parentheses Show that the following balanced parentheses grammar is ambiguous (by finding two parse trees for some input sequence)
Chap. 5, Top-Down Parsing J. H. Wang Mar. 29, 2011.
YANGYANG 1 Chap 5 LL(1) Parsing LL(1) left-to-right scanning leftmost derivation 1-token lookahead parser generator: Parsing becomes the easiest! Modifying.
Top-Down Parsing.
CS 310 – Fall 2006 Pacific University CS310 Parsing with Context Free Grammars Today’s reference: Compilers: Principles, Techniques, and Tools by: Aho,
1 Contents Introduction A Simple Compiler Scanning – Theory and Practice Grammars and Parsing LL(1) Parsing LR Parsing Lex and yacc Semantic Processing.
By Neng-Fa Zhou Syntax Analysis lexical analyzer syntax analyzer semantic analyzer source program tokens parse tree parser tree.
Parsing Compiler Baojian Hua Front End source code abstract syntax tree lexical analyzer parser tokens IR semantic analyzer.
COS 320 Compilers David Walker. last time context free grammars (Appel 3.1) –terminals, non-terminals, rules –derivations & parse trees –ambiguous grammars.
ISBN Chapter 4 Lexical and Syntax Analysis The Parsing Problem Recursive-Descent Parsing.
1 Predictive parsing Recall the main idea of top-down parsing: Start at the root, grow towards leaves Pick a production and try to match input May need.
CS 330 Programming Languages 09 / 23 / 2008 Instructor: Michael Eckmann.
1 The Parser Its job: –Check and verify syntax based on specified syntax rules –Report errors –Build IR Good news –the process can be automated.
1 Chapter 4: Top-Down Parsing. 2 Objectives of Top-Down Parsing an attempt to find a leftmost derivation for an input string. an attempt to construct.
Professor Yihjia Tsai Tamkang University
COS 320 Compilers David Walker. last time context free grammars (Appel 3.1) –terminals, non-terminals, rules –derivations & parse trees –ambiguous grammars.
Top-Down Parsing.
Chapter 3 Chang Chi-Chung Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.
CPSC 388 – Compiler Design and Construction
CSC3315 (Spring 2009)1 CSC 3315 Lexical and Syntax Analysis Hamid Harroud School of Science and Engineering, Akhawayn University
COP4020 Programming Languages Computing LL(1) parsing table Prof. Xin Yuan.
Syntax Analysis – Part II Quick Look at Using Bison Top-Down Parsers EECS 483 – Lecture 5 University of Michigan Wednesday, September 20, 2006.
Syntax and Semantics Structure of programming languages.
Parsing Chapter 4 Parsing2 Outline Top-down v.s. Bottom-up Top-down parsing Recursive-descent parsing LL(1) parsing LL(1) parsing algorithm First.
Top-Down Parsing - recursive descent - predictive parsing
1 Chapter 5 LL (1) Grammars and Parsers. 2 Naming of parsing techniques The way to parse token sequence L: Leftmost R: Righmost Top-down  LL Bottom-up.
Chapter 5 Top-Down Parsing.
Parsing III (Top-down parsing: recursive descent & LL(1) )
Parsing Jaruloj Chongstitvatana Department of Mathematics and Computer Science Chulalongkorn University.
Profs. Necula CS 164 Lecture Top-Down Parsing ICOM 4036 Lecture 5.
Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Lesson 5 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Syntax and Semantics Structure of programming languages.
Compiler Principles Fall Compiler Principles Lecture 3: Parsing part 2 Roman Manevich Ben-Gurion University.
Compiler verification for fun and profit a week from now, 8:55am in Salle Polyvalente Xavier LeroyInria Paris–Rocquencourt …Compilers are, however, vulnerable.
LL(k) Parsing Compiler Baojian Hua
Parsing Top-Down.
Parsing — Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Top-down Parsing Recursive Descent & LL(1) Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
Bottom-Up Parsing David Woolbright. The Parsing Problem Produce a parse tree starting at the leaves The order will be that of a rightmost derivation The.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
Lecture 3: Parsing CS 540 George Mason University.
1 Context free grammars  Terminals  Nonterminals  Start symbol  productions E --> E + T E --> E – T E --> T T --> T * F T --> T / F T --> F F --> (F)
Top-down Parsing. 2 Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves Pick a production.
Computing if a token can follow first(B 1... B p ) = {a  | B 1...B p ...  aw } follow(X) = {a  | S ... ...Xa... } There exists a derivation from.
Top-Down Parsing.
CSE 5317/4305 L3: Parsing #11 Parsing #1 Leonidas Fegaras.
Top-Down Predictive Parsing We will look at two different ways to implement a non- backtracking top-down parser called a predictive parser. A predictive.
Parsing methods: –Top-down parsing –Bottom-up parsing –Universal.
CS 330 Programming Languages 09 / 25 / 2007 Instructor: Michael Eckmann.
1 Topic #4: Syntactic Analysis (Parsing) CSC 338 – Compiler Design and implementation Dr. Mohamed Ben Othman ( )
Bernd Fischer RW713: Compiler and Software Language Engineering.
Parsing III (Top-down parsing: recursive descent & LL(1) )
COMP 3438 – Part II-Lecture 6 Syntax Analysis III Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
Compilers: Bottom-up/6 1 Compiler Structures Objective – –describe bottom-up (LR) parsing using shift- reduce and parse tables – –explain how LR.
Bottom-up parsing. Bottom-up parsing builds a parse tree from the leaves (terminals) to the start symbol int E T * TE+ T (4) (2) (3) (5) (1) int*+ E 
9/30/2014IT 3271 How to construct an LL(1) parsing table ? 1.S  A S b 2.S  C 3.A  a 4.C  c C 5.C  abc$ S1222 A3 C545 LL(1) Parsing Table What is the.
Parsing #1 Leonidas Fegaras.
Chapter 4 - Parsing CSCE 343.
Context free grammars Terminals Nonterminals Start symbol productions
Top-down parsing cannot be performed on left recursive grammars.
FIRST and FOLLOW Lecture 8 Mon, Feb 7, 2005.
Top-Down Parsing CS 671 January 29, 2008.
Ambiguity, Precedence, Associativity & Top-Down Parsing
Predictive Parsing Lecture 9 Wed, Feb 9, 2005.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Predictive Parsing Program
Presentation transcript:

Exercise 1 A ::= B EOF B ::=  | B B | (B) Tokens: EOF, (, ) Generate constraints and compute nullable and first for this grammar. Check whether first sets for different alternatives are disjoint.

Exercise 2 S ::= B EOF B ::=  | (B) B Tokens: EOF, (, ) Generate constraints and compute nullable and first for this grammar. Check whether first sets for different alternatives are disjoint.

Exercise Introducing Follow Sets Compute nullable, first for this grammar: stmtList ::=  | stmt stmtList stmt ::= assign | block assign ::= ID = ID ; block ::= beginof ID stmtList ID ends Describe a parser for this grammar and explain how it behaves on this input: beginof myPrettyCode x = u; y = v; myPrettyCode ends

How does a recursive descent parser look like? def stmtList = if (???) {}what should the condition be? else { stmat; stmtList } def stmt = if (lex.token == ID) assign else if (lex.token == beginof) block else error(“Syntax error: expected ID or beginonf”) … def block = { skip(beginof); skip(ID); stmtList; skip(ID); skip(ends) }

Problem Identified stmtList ::=  | stmt stmtList stmt ::= assign | block assign ::= ID = ID ; block ::= beginof ID stmtList ID ends Problem parsing stmtList: –ID could start alternative stmt stmtList –ID could follow stmt, so we may wish to parse  that is, do nothing and return For nullable non-terminals, we must also compute what follows them

General Idea when parsing nullable(A) A ::= B 1... B p | C 1... C q | D 1... D r def A = if (token  T1) { B 1... B p else if (token  (T2 U T F )) { C 1... C q } else if (token  T3) { D 1... D r } // no else error, just return where: T1 = first(B 1... B p ) T2 = first(C 1... C q ) T3 = first(D 1... D r ) T F = follow(A) Only one of the alternatives can be nullable (here: 2nd) T1, T2, T3, T F should be pairwise disjoint sets of tokens.

LL(1) Grammar - good for building recursive descent parsers Grammar is LL(1) if for each nonterminal X –first sets of different alternatives of X are disjoint –if nullable(X), first(X) must be disjoint from follow(X) and only one alternative of X may be nullable For each LL(1) grammar we can build recursive-descent parser Each LL(1) grammar is unambiguous If a grammar is not LL(1), we can sometimes transform it into equivalent LL(1) grammar

Computing if a token can follow first(B 1... B p ) = {a  | B 1...B p ...  aw } follow(X) = {a  | S ... ...Xa... } There exists a derivation from the start symbol that produces a sequence of terminals and nonterminals of the form...Xa... (the token a follows the non-terminal X)

Rule for Computing Follow Given X ::= YZ(for reachable X) then first(Z)  follow(Y) and follow(X)  follow(Z) now take care of nullable ones as well: For each ruleX ::= Y 1... Y p... Y q... Y r follow(Y p ) should contain: first(Y p+1 Y p+2...Y r ) also follow(X) if nullable(Y p+1 Y p+2 Y r )

Compute nullable, first, follow stmtList ::=  | stmt stmtList stmt ::= assign | block assign ::= ID = ID ; block ::= beginof ID stmtList ID ends Is this grammar LL(1)?

Conclusion of the Solution The grammar is not LL(1) because we have nullable(stmtList) first(stmt)  follow(stmtList) = {ID} If a recursive-descent parser sees ID, it does not know if it should –finish parsing stmtList or –parse another stmt

Table for LL(1) Parser: Example S ::= B EOF (1) B ::=  | B (B) (1) (2) EOF() S{1} {} B{1}{1,2}{1} nullable: B first(S) = { ( } follow(S) = {} first(B) = { ( } follow(B) = { ), (, EOF } Parsing table: parse conflict - choice ambiguity: grammar not LL(1) empty entry: when parsing S, if we see ), report error 1 is in entry because ( is in follow(B) 2 is in entry because ( is in first(B(B))

Table for LL(1) Parsing Tells which alternative to take, given current token: choice : Nonterminal x Token -> Set[Int] A ::= (1) B 1... B p | (2) C 1... C q | (3) D 1... D r For example, when parsing A and seeing token t choice(A,t) = {2} means: parse alternative 2 (C 1... C q ) choice(A,t) = {3} means: parse alternative 3 (D 1... D r ) choice(A,t) = {} means: report syntax error choice(A,t) = {2,3} : not LL(1) grammar if t  first(C 1... C q ) add 2 to choice(A,t) if t  follow(A) add K to choice(A,t) where K is nullable alternative

Transform Grammar for LL(1) S ::= B EOF B ::=  | B (B) (1) (2) EOF() S{1} {} B{1}{1,2}{1} Transform the grammar so that parsing table has no conflicts. Old parsing table: conflict - choice ambiguity: grammar not LL(1) 1 is in entry because ( is in follow(B) 2 is in entry because ( is in first(B(B)) EOF() S B S ::= B EOF B ::=  | (B) B (1) (2) Left recursion is bad for LL(1) choice(A,t)

Parse Table is Code for Generic Parser var stack : Stack[GrammarSymbol] // terminal or non-terminal stack.push(EOF); stack.push(StartNonterminal); var lex = new Lexer(inputFile) while (true) { X = stack.pop t = lex.curent if (isTerminal(X)) if (t==X) if (X==EOF) return success else lex.next // eat token t else parseError("Expected " + X) else { // non-terminal cs = choice(X)(t) // look up parsing table cs match { // result is a set case {i} => { // exactly one choice rhs = p(X,i) // choose correct right-hand side stack.pushRev(rhs) } // pushes symbols in rhs so leftmost becomes top of stack case {} => parseError("Parser expected an element of " + unionOfAll(choice(X))) case _ => crash(“parse table with conflicts - grammar was not LL(1)") } }

Exercise: check if this grammar is LL(1) S::=  |A S A ::= id := id A ::= if id then A A ::= if id then A' else A A' ::= id := id A' ::= if id then A' else A‘ No, because first(if id then A) and first(if id then A' else A) overlap.

What if we cannot transform the grammar into LL(1)? 1) Redesign your language 2) Use a more powerful parsing technique