Chap. 5, Top-Down Parsing J. H. Wang Mar. 29, 2011.



Outline
- Overview
- LL(k) Grammars
- Recursive-Descent LL(1) Parsers
- Table-Driven LL(1) Parsers
- Obtaining LL(1) Grammars
- A Non-LL(1) Language
- Properties of LL(1) Parsers
- Parse Table Representation
- Syntactic Error Recovery and Repair

Overview
- Two forms of top-down parsers
  - Recursive-descent parsers
  - Table-driven LL parsers: LL(k), to be explained later
- Compiler compilers (or parser generators)
  - With a CFG as the language's definition, parsers can be constructed automatically
  - A language revision, update, or extension is handled by regenerating the parser from the revised grammar
  - The grammar is proved unambiguous if parser construction succeeds

Top-Down Parsing
- Top-down: grows the parse tree from the root to the leaves
- Predictive: must predict which production rule is to be applied
- LL(k): scans the input left to right, produces a leftmost derivation, uses k symbols of lookahead
- Recursive descent: can be implemented by a set of mutually recursive procedures

LL(k) Grammars
- Recall from Chap. 2
  - There is a parsing procedure for each nonterminal A
  - The procedure is responsible for accomplishing one step of derivation for the corresponding production
  - A production is chosen by inspecting the next k tokens
- The Predict Set for production A → α is the set of tokens that trigger the production
  - The Predict Set is determined by the right-hand side (RHS) α

We need a strategy for choosing productions
- Predict_k(p): the set of length-k token strings that predict the application of rule p
- Input string: αaβ ∈ Σ*
- Derivation so far: S ⇒*lm αAy_1…y_n
- P = {p ∈ ProductionsFor(A) | a ∈ Predict(p)}
  - P is the empty set -> syntax error
  - P contains more than one production -> nondeterminism
  - P contains exactly one production -> that production is applied

How to Compute Predict(p)
- To predict production p: A → X_1…X_m, m ≥ 0
  - The set of terminal symbols that are first produced in some derivation from X_1…X_m
  - Those terminal symbols that can follow A, included when X_1…X_m can derive λ
  - (Fig. 5.1)
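
The two ingredients above (terminals that begin the RHS, plus terminals that follow A when the RHS can vanish) can be sketched as a fixed-point computation. This is a minimal sketch, not the algorithm of Fig. 5.1: the toy grammar, the name `Etail`, and the helper names `first_of` and `analyze` are assumptions made for illustration.

```python
# Hypothetical toy grammar (not from the slides).
# Each production is (lhs, rhs); the empty tuple stands for λ.
PRODUCTIONS = [
    ("S", ("E", "$")),
    ("E", ("T", "Etail")),
    ("Etail", ("+", "T", "Etail")),
    ("Etail", ()),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in PRODUCTIONS}


def first_of(symbols, first, nullable):
    """Return (First set of the symbol string, whether the string derives λ)."""
    out = set()
    for sym in symbols:
        if sym not in NONTERMINALS:      # a terminal ends the scan
            out.add(sym)
            return out, False
        out |= first[sym]
        if sym not in nullable:
            return out, False
    return out, True


def analyze(productions):
    """Iterate to a fixed point, then derive Predict for every production."""
    nullable = set()
    first = {a: set() for a in NONTERMINALS}
    follow = {a: set() for a in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            f, eps = first_of(rhs, first, nullable)
            if not f <= first[lhs]:
                first[lhs] |= f
                changed = True
            if eps and lhs not in nullable:
                nullable.add(lhs)
                changed = True
            for i, sym in enumerate(rhs):
                if sym in NONTERMINALS:
                    f, eps = first_of(rhs[i + 1:], first, nullable)
                    new = f | (follow[lhs] if eps else set())
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    predict = {}
    for lhs, rhs in productions:
        f, eps = first_of(rhs, first, nullable)
        # Follow(lhs) contributes only when the whole RHS can derive λ.
        predict[(lhs, rhs)] = f | (follow[lhs] if eps else set())
    return first, follow, predict


first, follow, predict = analyze(PRODUCTIONS)
for (lhs, rhs), tokens in predict.items():
    print(lhs, "->", " ".join(rhs) or "λ", ":", sorted(tokens))
```

In this toy grammar the λ-production for `Etail` is the only one whose predict set comes from Follow rather than First.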

- For an LL(1) grammar, the productions for each nonterminal A must have disjoint predict sets
- Not all CFGs are LL(1)
  - More lookahead may be needed: LL(k), k > 1
  - A more powerful parsing method may be required (Chap. 6)
  - The grammar may be ambiguous
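
The disjointness requirement can be checked mechanically once predict sets are known. The sketch below assumes the predict sets are given as a dictionary keyed by (nonterminal, RHS); the function name `ll1_conflicts` and both example grammars are hypothetical.

```python
from itertools import combinations


def ll1_conflicts(predict):
    """List (nonterminal, rhs1, rhs2, shared tokens) for overlapping pairs."""
    by_lhs = {}
    for (lhs, rhs), tokens in predict.items():
        by_lhs.setdefault(lhs, []).append((rhs, tokens))
    conflicts = []
    for lhs, alternatives in by_lhs.items():
        for (r1, t1), (r2, t2) in combinations(alternatives, 2):
            if t1 & t2:                  # a shared lookahead token
                conflicts.append((lhs, r1, r2, t1 & t2))
    return conflicts


# Assumed predict sets for a toy grammar whose alternatives never overlap.
GOOD = {
    ("T", ("id",)): {"id"},
    ("T", ("(", "E", ")")): {"("},
}
# Assumed predict sets for two alternatives that share the prefix "if".
BAD = {
    ("Stmt", ("if", "Expr", "then", "Stmt")): {"if"},
    ("Stmt", ("if", "Expr", "then", "Stmt", "else", "Stmt")): {"if"},
}

print(ll1_conflicts(GOOD))
print(ll1_conflicts(BAD))
```

An empty result means every nonterminal has pairwise disjoint predict sets, which is the LL(1) condition stated above.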

[Figure: pseudocode for the procedure S, using MATCH, PEEK, ADVANCE, and ERROR]

Recursive-Descent LL(1) Parsers
- Input: token stream ts
  - PEEK(): examines the next input token without advancing the input
  - ADVANCE(): advances the input by one token
- To construct a recursive-descent parser
  - We write a separate procedure for each nonterminal A
  - For each production p_i, we check each symbol in the RHS X_1…X_m
    - Terminal symbol: MATCH(ts, X_i)
    - Nonterminal symbol: call X_i(ts)

[Figures: pseudocode using PEEK and MATCH]
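
The construction recipe above (one procedure per nonterminal, MATCH for terminals, a call for nonterminals) might look as follows for a small grammar. This is a sketch under assumptions: the grammar S → E $, E → T Etail, Etail → + T Etail | λ, T → id | ( E ) is a hypothetical example, and the predict sets are written into the procedures by hand.

```python
class TokenStream:
    """Token stream with the PEEK and ADVANCE operations."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        self.pos += 1


def match(ts, expected):
    """MATCH: consume the expected terminal or report a syntax error."""
    if ts.peek() != expected:
        raise SyntaxError(f"expected {expected!r}, found {ts.peek()!r}")
    ts.advance()


def S(ts):
    if ts.peek() in {"id", "("}:             # Predict(S -> E $)
        E(ts)
        match(ts, "$")
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r}")


def E(ts):
    if ts.peek() in {"id", "("}:             # Predict(E -> T Etail)
        T(ts)
        Etail(ts)
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r}")


def Etail(ts):
    if ts.peek() == "+":                     # Predict(Etail -> + T Etail)
        match(ts, "+")
        T(ts)
        Etail(ts)
    elif ts.peek() in {")", "$"}:            # Predict(Etail -> λ)
        pass
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r}")


def T(ts):
    if ts.peek() == "id":                    # Predict(T -> id)
        match(ts, "id")
    elif ts.peek() == "(":                   # Predict(T -> ( E ))
        match(ts, "(")
        E(ts)
        match(ts, ")")
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r}")


def accepts(tokens):
    try:
        S(TokenStream(tokens))
        return True
    except SyntaxError:
        return False


print(accepts(["id", "+", "(", "id", ")"]))
print(accepts(["id", "+"]))
```

Each procedure only ever peeks at one token before committing to a production, which is what makes the parser LL(1).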

Table-Driven LL(1) Parsers
- Creating recursive-descent parsers can be automated, but there are drawbacks
  - Size of the parser code
  - Inefficiency: overhead of method calls and returns
- To create table-driven parsers, we use a stack to simulate the actions of MATCH() and the calls to the nonterminals' procedures
  - Terminal symbol: MATCH
  - Nonterminal symbol: table lookup
  - (Fig. 5.8)

[Figure: table-driven parser pseudocode using PUSH, POP, MATCH, APPLY, PEEK, and ERROR]
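
A stack-based driver in the spirit of the slide could be sketched as below. It is not a transcription of Fig. 5.8: the parse table is hand-written for a hypothetical toy grammar, and the names `TABLE` and `ll1_parse` are assumptions.

```python
# Hand-written LL(1) table for an assumed toy grammar:
#   S -> E $    E -> T Etail    Etail -> + T Etail | λ    T -> id | ( E )
# Keys are (nonterminal, lookahead); values are the RHS to apply.
TABLE = {
    ("S", "id"): ("E", "$"),
    ("S", "("): ("E", "$"),
    ("E", "id"): ("T", "Etail"),
    ("E", "("): ("T", "Etail"),
    ("Etail", "+"): ("+", "T", "Etail"),
    ("Etail", ")"): (),
    ("Etail", "$"): (),
    ("T", "id"): ("id",),
    ("T", "("): ("(", "E", ")"),
}
NONTERMINALS = {"S", "E", "Etail", "T"}


def ll1_parse(tokens, start="S"):
    """Return the productions applied, in leftmost-derivation order."""
    tokens = list(tokens) + ["$"]
    pos = 0
    stack = [start]
    applied = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos] if pos < len(tokens) else None
        if top in NONTERMINALS:
            rhs = TABLE.get((top, lookahead))        # table lookup
            if rhs is None:
                raise SyntaxError(f"no entry for ({top}, {lookahead})")
            applied.append((top, rhs))
            stack.extend(reversed(rhs))              # leftmost symbol on top
        else:
            if top != lookahead:                     # MATCH a terminal
                raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
            pos += 1
    if pos != len(tokens):
        raise SyntaxError("input remains after the parse finished")
    return applied


for lhs, rhs in ll1_parse(["id", "+", "id"]):
    print(lhs, "->", " ".join(rhs) or "λ")
```

Pushing the RHS in reverse keeps its leftmost symbol on top of the stack, so the nonterminals are expanded in the same order a recursive-descent parser would call their procedures.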

How to Build the LL(1) Parse Table
- The table is indexed by the top-of-stack (TOS) symbol and the next input token
  - Row: nonterminal symbol
  - Column: next input token
  - (Fig. 5.9)

[Figure: FILLTABLE pseudocode]
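
Filling the table amounts to writing each production into the cells selected by its predict set. The sketch below assumes the predict sets are already available; the `PREDICT` dictionary is hand-written for a hypothetical toy grammar and `fill_table` is an illustrative name, not the routine of Fig. 5.9.

```python
# Assumed predict sets, keyed by (nonterminal, RHS); () stands for λ.
PREDICT = {
    ("S", ("E", "$")): {"id", "("},
    ("E", ("T", "Etail")): {"id", "("},
    ("Etail", ("+", "T", "Etail")): {"+"},
    ("Etail", ()): {")", "$"},
    ("T", ("id",)): {"id"},
    ("T", ("(", "E", ")")): {"("},
}


def fill_table(predict):
    """Build table[(nonterminal, token)] = RHS; fail on a doubly filled cell."""
    table = {}
    for (lhs, rhs), tokens in predict.items():
        for token in tokens:
            if (lhs, token) in table:
                raise ValueError(f"not LL(1): two productions for ({lhs}, {token})")
            table[(lhs, token)] = rhs
    return table


table = fill_table(PREDICT)
for (row, column), rhs in sorted(table.items()):
    print(f"[{row}, {column}] = {' '.join(rhs) or 'λ'}")
```

Cells that stay empty correspond to syntax errors: no production of the row's nonterminal is predicted by the column's token.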

Obtaining LL(1) Grammars
- It is easy to violate the requirement of a unique prediction for each combination of nonterminal and lookahead symbol
  - Common prefixes
  - Left recursion

Common Prefixes
- Two productions for the same nonterminal begin with the same string of grammar symbols
  - Ex. (Fig. 5.12): not LL(k)
- Factoring transformation
  - Fig
  - Ex. (Fig. 5.14)

[Figure: FACTOR pseudocode]
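
One way to sketch the factoring transformation: find the longest prefix shared by two alternatives of a nonterminal, keep it once, and move the differing suffixes into a fresh nonterminal. The grammar below and the generated name `Stmt_tail1` are assumptions for illustration, and the sketch assumes the generated names do not collide with existing nonterminals.

```python
def common_prefix(a, b):
    """Longest common prefix of two RHS tuples."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def factor(grammar):
    """grammar maps a nonterminal to a list of RHS tuples; returns a factored copy."""
    result = {a: list(alts) for a, alts in grammar.items()}
    work = list(result)
    counter = 0
    while work:
        a = work.pop()
        alts = result[a]
        best = ()
        for i in range(len(alts)):
            for j in range(i + 1, len(alts)):
                prefix = common_prefix(alts[i], alts[j])
                if len(prefix) > len(best):
                    best = prefix
        if not best:
            continue                       # nothing to factor for this nonterminal
        counter += 1
        new = f"{a}_tail{counter}"         # fresh nonterminal (assumed unused)
        sharing = [r for r in alts if r[:len(best)] == best]
        others = [r for r in alts if r[:len(best)] != best]
        result[a] = others + [best + (new,)]
        result[new] = [r[len(best):] for r in sharing]
        work += [a, new]                   # both may need further factoring
    return result


# Hypothetical grammar with a common prefix "if Expr then StmtList".
GRAMMAR = {
    "Stmt": [
        ("if", "Expr", "then", "StmtList", "endif"),
        ("if", "Expr", "then", "StmtList", "else", "StmtList", "endif"),
        ("other",),
    ],
}

for lhs, alts in factor(GRAMMAR).items():
    for rhs in alts:
        print(lhs, "->", " ".join(rhs) or "λ")
```

After factoring, the choice between `endif` and `else` is postponed until the parser has consumed the shared prefix, so a single lookahead token suffices.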

[Figure: ELIMINATELEFTRECURSION pseudocode]

Left Recursion
- A production is left recursive if its LHS symbol is also the first symbol of its RHS
  - E.g. StmtList → StmtList ; Stmt
  - A → A α | β
  - (Fig & Fig. 5.16)
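
For immediate left recursion of the form A → A α | β, a common textbook rewrite is A → β A_tail with A_tail → α A_tail | λ. The sketch below applies that rewrite; it may differ in detail from the figures cited on the slide, it does not handle indirect left recursion, and the `_tail` naming is an assumption.

```python
def eliminate_left_recursion(grammar):
    """Rewrite A -> A α | β as A -> β A_tail, A_tail -> α A_tail | λ."""
    result = {}
    for a, alts in grammar.items():
        alphas = [r[1:] for r in alts if r and r[0] == a]     # left-recursive tails
        betas = [r for r in alts if not r or r[0] != a]       # other alternatives
        if not alphas:
            result[a] = list(alts)
            continue
        tail = a + "_tail"                                    # fresh name (assumed unused)
        result[a] = [beta + (tail,) for beta in betas]
        result[tail] = [alpha + (tail,) for alpha in alphas] + [()]
    return result


GRAMMAR = {"StmtList": [("StmtList", ";", "Stmt"), ("Stmt",)]}

for lhs, alts in eliminate_left_recursion(GRAMMAR).items():
    for rhs in alts:
        print(lhs, "->", " ".join(rhs) or "λ")
```

The rewritten grammar generates the same strings (Stmt followed by any number of "; Stmt"), but every alternative now begins by consuming input or is λ, so a top-down parser no longer loops on StmtList.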

A Non-LL(1) Language
- Almost all common programming language constructs can be specified by LL(1) grammars
  - One exception: if-then-else (the dangling else problem)
  - It can be resolved by mandating that each else is matched to its closest unmatched then
  - (Fig. 5.17)

- Ambiguous (Chap. 6)
  - E.g. if expr then if expr then other else other
    - if expr then { if expr then other else other }
    - if expr then { if expr then other } else other
    - -> at least two distinct parses
- Dangling bracket language (DBL)
  - DBL = { [^i ]^j | i ≥ j ≥ 0 }
  - if expr then Stmt -> [ (opening bracket)
  - else Stmt -> ] (optional closing bracket)

- Fig. 5.18(a)
  - S → [ S CL | λ
  - CL → ] | λ
  - E.g. [[]
- Fig. 5.18(b)
  - S → [ S | T
  - T → [ T ] | λ
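
The difference between the two grammars can be checked on the example string [[] by counting leftmost derivations. The brute-force counter below is only a sketch: the function name `count_parses` is an assumption, and the recursion terminates for these two grammars because neither is left recursive.

```python
# The two grammars from the slide; () stands for λ.
GRAMMAR_A = {
    "S": [("[", "S", "CL"), ()],
    "CL": [("]",), ()],
}
GRAMMAR_B = {
    "S": [("[", "S"), ("T",)],
    "T": [("[", "T", "]"), ()],
}


def count_parses(grammar, symbols, text):
    """Number of leftmost derivations of text from the symbol sequence."""
    if not symbols:
        return 1 if text == "" else 0
    head, rest = symbols[0], symbols[1:]
    if head not in grammar:                       # terminal: must match the input
        if text.startswith(head):
            return count_parses(grammar, rest, text[len(head):])
        return 0
    # Nonterminal: try every alternative in place of the leftmost symbol.
    return sum(count_parses(grammar, tuple(rhs) + rest, text)
               for rhs in grammar[head])


print(count_parses(GRAMMAR_A, ("S",), "[[]"))
print(count_parses(GRAMMAR_B, ("S",), "[[]"))
```

In grammar (a) the single closing bracket can be produced by either CL, which gives two parses; grammar (b) forces unmatched opening brackets to come first, leaving one parse.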

The grammar of Fig. 5.18(b) is not LL(k) for any k
- [ ∈ Predict(S → [ S) and [ ∈ Predict(S → T)
- [[ ∈ Predict_2(S → [ S) and [[ ∈ Predict_2(S → T)
- …
- [^k ∈ Predict_k(S → [ S) and [^k ∈ Predict_k(S → T)

Properties of LL(1) Parsers
- A correct, leftmost parse is constructed
- All grammars in LL(1) are unambiguous
- All table-driven LL(1) parsers operate in linear time and space with respect to the length of the parsed input

Thanks for Your Attention!