Top-down parsing cannot be performed on left recursive grammars.


Chap 2. TOP-DOWN PARSING We introduce the basic ideas behind top-down parsing and show how to construct an efficient non-backtracking form of top-down parser, called a predictive parser. We define the class of LL(1) grammars, from which predictive parsers can be constructed automatically. Top-down parsing cannot be performed on left-recursive grammars.

Recursive-Descent Parsing Top-down parsing can be viewed as an attempt to find a leftmost derivation for an input string. We can also view it as an attempt to construct a parse tree for the input starting from the root and creating the nodes of the parse tree in preorder. Let's consider a general form of top-down parsing, called recursive descent, that may involve backtracking, that is, making repeated scans of the input. However, backtracking parsers are not seen frequently, since backtracking is rarely needed to parse programming language constructs.

Example Consider the grammar S → cAd, A → ab | a and the input string w = cad. (The slide shows two parse trees: a first attempt in which A is expanded to ab, which fails to match the remaining d, and the final tree in which A is expanded to a.)
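To make the backtracking concrete, here is a minimal Python sketch (not part of the original slides; the function names are illustrative) of a backtracking recursive-descent parser for this grammar. On input cad it first tries A → ab, fails to match the following d, backtracks, and succeeds with A → a.

def alternatives_A(w, pos):
    """Yield the input position after each A-alternative that matches at pos."""
    if w[pos:pos + 2] == "ab":   # A -> a b
        yield pos + 2
    if w[pos:pos + 1] == "a":    # A -> a
        yield pos + 1

def parse_S(w):
    """S -> c A d; backtrack over the A-alternatives if a later match fails."""
    if not w.startswith("c"):
        return False
    for pos in alternatives_A(w, 1):
        if w[pos:] == "d":       # the rest of the input must be exactly d
            return True          # success with this choice of A-production
    return False                 # all alternatives exhausted: report failure

print(parse_S("cad"))            # True: A -> ab fails, the parser backtracks to A -> a
print(parse_S("cabd"))           # True: A -> ab succeeds
print(parse_S("cd"))             # False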

Elimination of Left Recursion A grammar is immediately left recursive if it has a nonterminal A such that there is a production A → Aα for some string α. In order to be able to parse in the top-down way, left recursion must be eliminated. An immediate left recursion can be eliminated very easily. Let's take a production E → E + T. Suppose the procedure for E decides to apply this production. The right side begins with E, so the procedure for E is called recursively, and the parser loops forever. A left recursive production can be eliminated by rewriting the offending production.

Consider a nonterminal A with two productions A → Aα | β, where α and β are sequences of terminals and nonterminals that do not start with A. For example, in E → E + T | T we have A = E, α = +T, and β = T. Repeated application of the production A → Aα builds up a sequence of α's to the right of A: A ⇒ Aα ⇒ Aαα ⇒ … ⇒ Aαα…α.

When A is finally replaced by β, we have a β followed by a sequence of zero or more α's. The same effect can be achieved by rewriting the productions for A in the following manner: A → βR, R → αR | ε. (The slide shows the corresponding right-growing tree: a β followed by a chain of R nodes, each producing an α, with the last R producing ε.)

Here R is a new nonterminal. The production R → αR is right recursive, because this production for R has R itself as the last symbol on its right side. Right-recursive productions lead to trees that grow down towards the right.

Example Consider the following grammar for arithmetic expressions: E → E + T | T, T → T * F | F, F → (E) | id. Eliminating the immediate left recursion from the productions for E and then for T, we obtain E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε.

No matter how many A-productions there are, we can eliminate immediate left recursion from them. Let A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn, where no βi begins with an A. Then we replace the A-productions by A → β1A' | β2A' | … | βnA' and A' → α1A' | α2A' | … | αmA' | ε. The nonterminal A generates the same strings as before but is no longer left recursive.
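The rewrite above is mechanical, so it is easy to automate. Below is a small Python sketch (the dict-of-productions representation and the ε marker are assumptions for illustration, not from the slides) that applies it to one nonterminal.

EPS = "ε"

def eliminate_immediate_left_recursion(nt, rhss):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn as
       A -> b1 A' | ... | bn A',  A' -> a1 A' | ... | am A' | ε."""
    alphas = [rhs[1:] for rhs in rhss if rhs and rhs[0] == nt]    # A -> A alpha
    betas  = [rhs for rhs in rhss if not rhs or rhs[0] != nt]     # A -> beta
    if not alphas:
        return {nt: rhss}                 # no immediate left recursion: unchanged
    new_nt = nt + "'"
    return {
        nt:     [tuple(beta) + (new_nt,) for beta in betas],
        new_nt: [tuple(alpha) + (new_nt,) for alpha in alphas] + [(EPS,)],
    }

# E -> E + T | T   becomes   E -> T E',  E' -> + T E' | ε
print(eliminate_immediate_left_recursion("E", [("E", "+", "T"), ("T",)]))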

Nonrecursive Predictive Parsing It is possible to build a nonrecursive predictive parser by maintaining a stack explicitly, rather than implicitly via recursive calls. The key problem during predictive parsing is that of determining the production to be applied for a nonterminal. The nonrecursive parser looks up the production to be applied in a parsing table. In what follows, we shall see how the table can be constructed directly from certain grammars.

[Figure: model of a table-driven predictive parser. An input buffer holding a + b $ feeds the predictive parsing program, which consults a stack (X Y Z $) and a parsing table M, and writes to the output.]

A table-driven predictive parser has an input buffer, a stack, a parsing table, and an output stream. The input buffer contains the string to be parsed, followed by $, a symbol used as a right endmarker to indicate the end of the input string. The stack contains a sequence of grammar symbols with $ on the bottom, indicating the bottom of the stack. Initially, the stack contains the start symbol of the grammar on top of $. The parsing table is a two-dimensional array M[A, a], where A is a nonterminal and a is a terminal or the symbol $.

The parser is controlled by a program that behaves as follows. The program considers X, the symbol on top of the stack, and a, the current input symbol.
1. If X = a = $, the parser halts and announces successful completion of parsing.
2. If X = a ≠ $, the parser pops X off the stack and advances the input pointer to the next input symbol.
3. If X is a nonterminal, the program consults entry M[X, a] of the parsing table M. This entry will be either an X-production of the grammar or an error entry. If, for example, M[X, a] = {X → UVW}, the parser replaces X on top of the stack by WVU (with U on top). As output, we shall assume that the parser just prints the production used; any other code could be executed here. If M[X, a] = error, the parser calls an error recovery routine.

Algorithm. Nonrecursive predictive parsing. Input. A string w and a parsing table M for grammar G. Output. If w is in L(G), a leftmost derivation of w; otherwise, an error indication. Method. Initially, the parser is in a configuration in which it has $S on the stack, with S, the start symbol of G, on top, and w$ in the input buffer.

set ip to point to the first symbol of w$;
repeat
    let X be the top stack symbol and a the symbol pointed to by ip;
    if X is a terminal or $ then
        if X = a then
            pop X from the stack and advance ip
        else error()
    else /* X is a nonterminal */
        if M[X, a] = X → Y1 Y2 … Yk then begin
            pop X from the stack;
            push Yk, Yk–1, …, Y1 onto the stack, with Y1 on top;
            output the production X → Y1 Y2 … Yk
        end
        else error()
until X = $ /* stack is empty */

Example Consider again the grammar for arithmetic expressions; its predictive parsing table is shown below.

Predictive parsing table M (rows are nonterminals, columns are input symbols; - marks an error entry):

Nonterminal   id          +            *            (           )          $
E             E → TE'     -            -            E → TE'     -          -
E'            -           E' → +TE'    -            -           E' → ε     E' → ε
T             T → FT'     -            -            T → FT'     -          -
T'            -           T' → ε       T' → *FT'    -           T' → ε     T' → ε
F             F → id      -            -            F → (E)     -          -

With input id + id * id, the predictive parser makes a sequence of moves; at each move the input pointer points to the leftmost unmatched symbol of the input. If we observe the actions of this parser carefully, we see that it is tracing out a leftmost derivation for the input; that is, the productions output are those of a leftmost derivation.
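As an illustration only (the table encoding and token list are assumptions, not from the slides), here is a runnable Python sketch of the driver above using the parsing table for the expression grammar; on id + id * id it prints the productions of a leftmost derivation.

# Parsing table: (nonterminal, input symbol) -> right-hand side; [] encodes ε.
TABLE = {
    ("E",  "id"): ["T", "E'"],  ("E",  "("): ["T", "E'"],
    ("E'", "+"):  ["+", "T", "E'"],
    ("E'", ")"):  [],           ("E'", "$"): [],                  # E' -> ε
    ("T",  "id"): ["F", "T'"],  ("T",  "("): ["F", "T'"],
    ("T'", "+"):  [],           ("T'", ")"): [],  ("T'", "$"): [],  # T' -> ε
    ("T'", "*"):  ["*", "F", "T'"],
    ("F",  "id"): ["id"],       ("F",  "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def parse(tokens, start="E"):
    stack = ["$", start]                       # start symbol on top of $
    tokens = tokens + ["$"]
    ip = 0
    while True:
        X, a = stack[-1], tokens[ip]
        if X == "$" and a == "$":
            return True                        # successful completion
        if X not in NONTERMINALS:              # X is a terminal
            if X != a:
                return False                   # mismatch: error
            stack.pop()
            ip += 1
        else:
            rhs = TABLE.get((X, a))
            if rhs is None:
                return False                   # error entry
            stack.pop()
            stack.extend(reversed(rhs))        # push Yk ... Y1, with Y1 on top
            print(X, "->", " ".join(rhs) or "ε")   # output the production used

print(parse(["id", "+", "id", "*", "id"]))     # prints the derivation, then True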

FIRST and FOLLOW The construction of a predictive parser is aided by two functions associated with a grammar G. These functions, FIRST and FOLLOW, allow us to fill in the entries of a predictive parsing table for G, whenever possible. If α is any string of grammar symbols, let FIRST(α) be the set of terminals that begin the strings derived from α. If α ⇒* ε, then ε is also in FIRST(α). Define FOLLOW(A), for nonterminal A, to be the set of terminals a that can appear immediately to the right of A in some sentential form, that is, the set of terminals a such that there exists a derivation of the form S ⇒* αAaβ for some α and β.

To compute FIRST(X) for all grammar symbols X, apply the following rules until no more terminals or ε can be added to any FIRST set.
1. If X is a terminal, then FIRST(X) is {X}.
2. If X → ε is a production, then add ε to FIRST(X).
3. If X is a nonterminal and X → Y1 Y2 … Yk is a production, then place a in FIRST(X) if, for some i, a is in FIRST(Yi) and ε is in all of FIRST(Y1), …, FIRST(Yi–1); that is, Y1 … Yi–1 ⇒* ε. If ε is in FIRST(Yj) for all j = 1, 2, …, k, then add ε to FIRST(X). For example, everything in FIRST(Y1) is surely in FIRST(X). If Y1 does not derive ε, then we add nothing more to FIRST(X), but if Y1 ⇒* ε, then we add FIRST(Y2), and so on.

Now we can compute FIRST for any string X1 X2 … Xn as follows. Add to FIRST(X1 X2 … Xn) all the non-ε symbols of FIRST(X1). Also add the non-ε symbols of FIRST(X2) if ε is in FIRST(X1), the non-ε symbols of FIRST(X3) if ε is in both FIRST(X1) and FIRST(X2), and so on. Finally, add ε to FIRST(X1 X2 … Xn) if, for all i, FIRST(Xi) contains ε.
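A compact way to state these rules is as a fixed-point computation. The Python sketch below (the grammar encoding and the ε marker are assumptions for illustration) computes FIRST for every nonterminal of the expression grammar and for arbitrary symbol strings.

EPS = "ε"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_of_string(symbols, first):
    """FIRST(X1 X2 ... Xn): add non-ε symbols of FIRST(Xi) while X1 ... Xi-1 derive ε."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})     # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                           # every symbol of the string can derive ε
    return result

def compute_first(grammar):
    """Iterate the rules until no FIRST set grows any further."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, rhss in grammar.items():
            for rhs in rhss:
                before = len(first[nt])
                first[nt] |= first_of_string(rhs, first)
                changed |= len(first[nt]) != before
    return first

print(compute_first(GRAMMAR))
# FIRST(E) = FIRST(T) = FIRST(F) = {'(', 'id'}, FIRST(E') = {'+', 'ε'}, FIRST(T') = {'*', 'ε'}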

To compute FOLLOW(A) for all nonterminals A, apply the following rules until nothing can be added to any FOLLOW set.
1. Place $ in FOLLOW(S), where S is the start symbol and $ is the input right endmarker.
2. If there is a production A → αBβ, then everything in FIRST(β) except for ε is placed in FOLLOW(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε (i.e., β ⇒* ε), then everything in FOLLOW(A) is in FOLLOW(B).

Example E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id. Then FIRST(E) = FIRST(T) = FIRST(F) = {(, id}, FIRST(E') = {+, ε}, FIRST(T') = {*, ε}, FOLLOW(E) = FOLLOW(E') = {), $}, FOLLOW(T) = FOLLOW(T') = {+, ), $}, and FOLLOW(F) = {+, *, ), $}.
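Continuing the sketch style above (the representation is assumed, not from the slides), FOLLOW can be computed with the same fixed-point idea; the printed sets match the ones just listed.

EPS = "ε"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {"E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
         "E'": {"+", EPS}, "T'": {"*", EPS}}

def first_of_string(symbols):
    result = set()
    for sym in symbols:
        f = FIRST.get(sym, {sym})              # a terminal's FIRST set is itself
        result |= f - {EPS}
        if EPS not in f:
            return result
    result.add(EPS)                            # the whole string can derive ε
    return result

def compute_follow(grammar, start="E"):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                     # rule 1: $ goes into FOLLOW(start)
    changed = True
    while changed:
        changed = False
        for lhs, rhss in grammar.items():
            for rhs in rhss:
                for i, b in enumerate(rhs):
                    if b not in grammar:
                        continue               # FOLLOW is defined for nonterminals only
                    beta_first = first_of_string(rhs[i + 1:])
                    before = len(follow[b])
                    follow[b] |= beta_first - {EPS}       # rule 2
                    if EPS in beta_first:                 # rule 3: β is empty or β ⇒* ε
                        follow[b] |= follow[lhs]
                    changed |= len(follow[b]) != before
    return follow

print(compute_follow(GRAMMAR))
# FOLLOW(E) = FOLLOW(E') = {')', '$'}, FOLLOW(T) = FOLLOW(T') = {'+', ')', '$'},
# FOLLOW(F) = {'+', '*', ')', '$'}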

Construction of Predictive Parsing Tables Algorithm. Construction of a predictive parsing table. Input. Grammar G. Output. Parsing table M. Method.
1. For each production A → α of the grammar, do steps 2 and 3.
2. For each terminal a in FIRST(α), add A → α to M[A, a].
3. If ε is in FIRST(α), then for each a in FOLLOW(A), add A → α to M[A, a].
4. Make each undefined entry of M be error.

Example Let's apply the algorithm above to the arithmetic expression grammar. Since FIRST(TE') = FIRST(T) = {(, id}, production E → TE' causes M[E, (] and M[E, id] to acquire the entry E → TE'. Production E' → ε causes M[E', )] and M[E', $] to acquire the entry E' → ε, since FOLLOW(E') = {), $}. We obtain the table that we have already seen.
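The algorithm translates almost line for line into code. The sketch below (grammar, FIRST, and FOLLOW written out as in the earlier sketches; the representation is an assumption for illustration) builds the table and confirms that every entry is single-valued, i.e. the expression grammar is LL(1).

EPS = "ε"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {"E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
         "E'": {"+", EPS}, "T'": {"*", EPS}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}

def first_of_string(symbols):
    result = set()
    for sym in symbols:
        f = FIRST.get(sym, {sym})
        result |= f - {EPS}
        if EPS not in f:
            return result
    result.add(EPS)
    return result

def build_table(grammar):
    """M[(A, a)] is a list of productions; a list longer than 1 is a conflict."""
    table = {}
    for nt, rhss in grammar.items():
        for rhs in rhss:
            f = first_of_string(rhs)                 # step 2: terminals in FIRST(α)
            lookaheads = f - {EPS}
            if EPS in f:                             # step 3: use FOLLOW(A) when α ⇒* ε
                lookaheads |= FOLLOW[nt]
            for a in lookaheads:
                table.setdefault((nt, a), []).append(rhs)
    return table

table = build_table(GRAMMAR)
print(table[("E", "id")])                            # [['T', "E'"]]
print(table[("E'", "$")])                            # [['ε']]  (E' -> ε)
print(all(len(entries) == 1 for entries in table.values()))   # True: the grammar is LL(1)

Applied to the dangling-else grammar of the next example, the same single-entry check would report two productions in the cell for (S', e), which is exactly the multiply-defined entry discussed below.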

LL(1) Grammars If G is left recursive or ambiguous, then M will have at least one multiply-defined entry. Example Let's consider the grammar S → iEtSS' | a, S' → eS | ε, E → b. FIRST(S) = {i, a}, FIRST(S') = {e, ε}, FIRST(E) = {b}, FOLLOW(S) = FOLLOW(S') = {e, $}, FOLLOW(E) = {t}.

Parsing table for this grammar (rows are nonterminals, columns are input symbols; - marks an error entry):

Nonterminal   a         b         e                 i            t     $
S             S → a     -         -                 S → iEtSS'   -     -
S'            -         -         S' → ε, S' → eS   -            -     S' → ε
E             -         E → b     -                 -            -     -

The entry M[S', e] contains both S' → ε and S' → eS, since e is in FOLLOW(S') = {e, $}. The grammar is ambiguous, and the ambiguity is manifested by a choice of which production to use when an e (else) is seen.

A grammar whose parsing table has no multiply-defined entries is said to be LL(1). The first "L" is for scanning the input from left to right, the second "L" is for producing a leftmost derivation, and the "1" is for using one input symbol of lookahead at each step to make parsing action decisions.