

Recursive Descent Parsers
- Read and recognize the input (in order to translate it or evaluate it)
- Implicitly construct the derivation tree
- Design is driven by the CFG of the language they recognize

Processing Expressions: A First Example
- Using the CFG for Boolean expressions, let's determine the value of expressions such as (T AND NOT (T))
- Assume expressions are represented as a string of characters with no syntax errors

Rewrite Rules for Boolean Expressions
    <bool_exp> → F | T | NOT ( <bool_exp> ) | ( <bool_exp> AND <bool_exp> ) | ( <bool_exp> OR <bool_exp> )

First, Tokens!
The input expression "( T AND NOT ( T ) )" consists of:

    Token Text    Token Kind
    "("           LEFT_PAREN
    " "           WHITE_SPACE
    "T"           TRUE_VALUE
    " "           WHITE_SPACE
    "AND"         AND_OPRTR
    " "           WHITE_SPACE
    "NOT"         NOT_OPRTR
    " "           WHITE_SPACE
    "("           LEFT_PAREN
    " "           WHITE_SPACE
    "T"           TRUE_VALUE
    " "           WHITE_SPACE
    ")"           RIGHT_PAREN
    " "           WHITE_SPACE
    ")"           RIGHT_PAREN
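The token table above can be reproduced with a small tokenizer. The following is a minimal sketch in standard C++ rather than the Resolve/C++ dialect of the slides; the names `TokenKind`, `Token`, and `tokenize` are assumptions made for illustration, and only the token kind names come from the slide.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Token kinds as named on the slide.
enum class TokenKind {
    LEFT_PAREN, RIGHT_PAREN, TRUE_VALUE, FALSE_VALUE,
    AND_OPRTR, OR_OPRTR, NOT_OPRTR, WHITE_SPACE
};

// A token is a (text, kind) pair.
struct Token {
    std::string text;
    TokenKind kind;
};

// Splits input into tokens, keeping WHITE_SPACE tokens as the slide does.
// Throws std::invalid_argument on a symbol outside the Boolean-expression CFG.
std::vector<Token> tokenize(const std::string& input) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < input.size()) {
        const std::size_t start = i;
        const unsigned char c = static_cast<unsigned char>(input[i]);
        TokenKind kind;
        if (std::isspace(c)) {
            // A maximal run of white space forms one token.
            while (i < input.size() &&
                   std::isspace(static_cast<unsigned char>(input[i]))) ++i;
            kind = TokenKind::WHITE_SPACE;
        } else if (c == '(') {
            ++i;
            kind = TokenKind::LEFT_PAREN;
        } else if (c == ')') {
            ++i;
            kind = TokenKind::RIGHT_PAREN;
        } else {
            // A maximal run of letters forms one word: T, F, AND, OR, NOT.
            while (i < input.size() &&
                   std::isalpha(static_cast<unsigned char>(input[i]))) ++i;
            const std::string word = input.substr(start, i - start);
            if (word == "T") kind = TokenKind::TRUE_VALUE;
            else if (word == "F") kind = TokenKind::FALSE_VALUE;
            else if (word == "AND") kind = TokenKind::AND_OPRTR;
            else if (word == "OR") kind = TokenKind::OR_OPRTR;
            else if (word == "NOT") kind = TokenKind::NOT_OPRTR;
            else throw std::invalid_argument("unexpected symbol in input");
        }
        assert(i > start);  // every token consumes at least one character
        tokens.push_back({input.substr(start, i - start), kind});
    }
    return tokens;
}
```

Calling `tokenize("( T AND NOT ( T ) )")` yields the 15 tokens listed in the table, white space included.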

First Token?
- Ignore WHITE_SPACE tokens
- First non-white-space token can only be one of: T, F, NOT, (

What If First Token is T?
- What rewrite rule will be used to derive the Boolean expression?
    <bool_exp> → T
- Draw the derivation tree: the root <bool_exp> has the single child T

First Token is T Continued…
- What is the entire expression? T
- What is its value? true

What If First Token is NOT?
- What will be the first rewrite rule used to derive the expression?
    <bool_exp> → NOT ( <bool_exp> )
- Draw the top 2 levels of the derivation tree: the root <bool_exp> has the children NOT, (, <bool_exp>, )
- How do we proceed?

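What If First Token is (?
- What will be the first rewrite rule used to derive the expression?
    <bool_exp> → ( <bool_exp> OR <bool_exp> ) | ( <bool_exp> AND <bool_exp> )
- Draw the top 2 levels of the derivation tree: the root <bool_exp> has the children (, <bool_exp>, OR/AND, <bool_exp>, )
- How do we proceed?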

Summing Up…

    procedure Evaluate_Bool_Exp (
            alters Text& input,
            produces Boolean& result
        )
    {
        object Token t, oprtr;
        object Boolean left, right;

        GetNextNonWSToken (input, t);               // read first token
        case_select (t.Kind ())
        {
            case FALSE_VALUE:
            {
                result = false;
            }
            break;
            case TRUE_VALUE:
            {
                result = true;
            }
            break;
            case NOT_OPRTR:
            {
                GetNextNonWSToken (input, t);       // read (
                Evaluate_Bool_Exp (input, result);
                GetNextNonWSToken (input, t);       // read )
                result = not result;
            }
            break;
            case LEFT_PAREN:
            {
                Evaluate_Bool_Exp (input, left);
                GetNextNonWSToken (input, oprtr);   // read operator
                Evaluate_Bool_Exp (input, right);
                GetNextNonWSToken (input, t);       // read )
                if (oprtr.Kind () == AND_OPRTR)
                {
                    result = left and right;
                }
                else // oprtr.Kind () == OR_OPRTR
                {
                    result = left or right;
                }
            }
            break;
        }
    }

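For readers who want to run the Boolean evaluator, here is a sketch of the same procedure in standard C++. It is not the Resolve/C++ code from the slides: `get_next_non_ws_token`, `expect`, and `evaluate_bool_exp` are illustrative names, the tokenizer is a simplified assumption, and, as on the slides, the input is assumed to contain no syntax errors (violations trip an `assert`).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum class TokenKind {
    LEFT_PAREN, RIGHT_PAREN, TRUE_VALUE, FALSE_VALUE,
    AND_OPRTR, OR_OPRTR, NOT_OPRTR
};

// Removes leading white space and the next token from input; returns its kind.
TokenKind get_next_non_ws_token(std::string& input) {
    std::size_t i = 0;
    while (i < input.size() &&
           std::isspace(static_cast<unsigned char>(input[i]))) ++i;
    assert(i < input.size());  // a token must be present
    const std::size_t start = i;
    if (input[i] == '(' || input[i] == ')') {
        ++i;
    } else {
        while (i < input.size() &&
               std::isalpha(static_cast<unsigned char>(input[i]))) ++i;
    }
    const std::string text = input.substr(start, i - start);
    input.erase(0, i);
    if (text == "(") return TokenKind::LEFT_PAREN;
    if (text == ")") return TokenKind::RIGHT_PAREN;
    if (text == "T") return TokenKind::TRUE_VALUE;
    if (text == "F") return TokenKind::FALSE_VALUE;
    if (text == "AND") return TokenKind::AND_OPRTR;
    if (text == "OR") return TokenKind::OR_OPRTR;
    assert(text == "NOT");
    return TokenKind::NOT_OPRTR;
}

// Consumes one token and checks that it has the expected kind.
void expect(std::string& input, TokenKind kind) {
    const TokenKind actual = get_next_non_ws_token(input);
    assert(actual == kind);
    (void)actual;  // silence the unused warning when asserts are disabled
}

// One case per alternative of the Boolean-expression rewrite rule.
bool evaluate_bool_exp(std::string& input) {
    switch (get_next_non_ws_token(input)) {
    case TokenKind::FALSE_VALUE:
        return false;
    case TokenKind::TRUE_VALUE:
        return true;
    case TokenKind::NOT_OPRTR: {
        expect(input, TokenKind::LEFT_PAREN);       // read (
        const bool operand = evaluate_bool_exp(input);
        expect(input, TokenKind::RIGHT_PAREN);      // read )
        return !operand;
    }
    case TokenKind::LEFT_PAREN: {
        const bool left = evaluate_bool_exp(input);
        const TokenKind oprtr = get_next_non_ws_token(input);  // read operator
        const bool right = evaluate_bool_exp(input);
        expect(input, TokenKind::RIGHT_PAREN);      // read )
        return oprtr == TokenKind::AND_OPRTR ? (left && right)
                                             : (left || right);
    }
    default:
        assert(false && "expression cannot start with this token");
        return false;
    }
}
```

With `std::string s = "( T AND NOT ( T ) )";`, the call `evaluate_bool_exp(s)` returns `false` and consumes the whole string.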

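Processing Expressions: A Second Example
    <expression> → <expression> <add_op> <term> | <term>
    <term>       → <term> <mult_op> <factor> | <factor>
    <factor>     → ( <expression> ) | <digit_seq>
    <add_op>     → + | -
    <mult_op>    → * | DIV | MOD
    <digit_seq>  → <digit> <digit_seq> | <digit>
    <digit>      → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9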

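First, Left-Recursive Rules
What's the problem? An operation written directly from a left-recursive rule would call itself before consuming any input, so it would never terminate.
Replace the following rewrite rules:
    <expression> → <expression> <add_op> <term> | <term>
    <term>       → <term> <mult_op> <factor> | <factor>
with the following rewrite rules:
    <expression> → <term> { <add_op> <term> }
    <term>       → <factor> { <mult_op> <factor> }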
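To make the effect of this replacement concrete, the sketch below codes the iterative form of the expression rule as a loop. It is an illustration under simplifying assumptions, not code from the slides: a term is a single decimal digit, there is no white space, and the function names `term` and `expression` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A direct transcription of the left-recursive rule
//     expression -> expression add_op term | term
// would start by calling itself before consuming any input:
//
//     int expression(...) { int left = expression(...); ... }  // never returns
//
// The iterative rule  expression -> term { add_op term }  becomes a loop.

// Simplifying assumption: a term is one decimal digit.
int term(const std::string& s, std::size_t& pos) {
    assert(pos < s.size() && s[pos] >= '0' && s[pos] <= '9');
    return s[pos++] - '0';
}

// Parses term { add_op term } starting at pos and returns its value.
int expression(const std::string& s, std::size_t& pos) {
    int value = term(s, pos);
    while (pos < s.size() && (s[pos] == '+' || s[pos] == '-')) {
        const char op = s[pos++];          // consume the operator
        const int right = term(s, pos);
        // Folding into value as we go keeps + and - left-associative.
        value = (op == '+') ? value + right : value - right;
    }
    return value;
}
```

Because the loop accumulates from the left, "5-3-2" evaluates as (5-3)-2 = 0, which is the grouping the original left-recursive rule describes.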

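Revised CFG for Expressions
    <expression> → <term> { <add_op> <term> }
    <term>       → <factor> { <mult_op> <factor> }
    <factor>     → ( <expression> ) | <digit_seq>
    <add_op>     → + | -
    <mult_op>    → * | DIV | MOD
    <digit_seq>  → <digit> <digit_seq> | <digit>
    <digit>      → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9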

Evaluating Expressions
- Recursive descent parser
- One operation per nonterminal symbol (for <expression>, <term>, <factor>)
- Tokenizer breaks up input into tokens: (Text, Integer) pairs
- Tokenizer also handles other nonterminal symbols (<add_op>, <mult_op>, <digit_seq>, and <digit>)

Evaluation Operation for Nonterminal <expression>
How does the following operation work? (Check out the next slide)

    global_procedure Evaluate_Expression (
            alters Character_IStream& ins,
            alters Text& token_text,        // lookahead token
            alters Integer& token_kind,     // lookahead token
            produces Integer& value
        );

Picture Specs for Evaluate_Expression
[Figure: ins.content = "5 + 3 - 2 plus some more text", shown as the token texts "5", "+", "3", "-", "2", ... Labels: prefix of ins.content representing an expression; tokens still in #ins.content; lookahead token (#token_text); token_text.]

Evaluation Operation for Nonterminal <term>
How does the following operation work? (Check out the next slide)

    global_procedure Evaluate_Term (
            alters Character_IStream& ins,
            alters Text& token_text,        // lookahead token
            alters Integer& token_kind,     // lookahead token
            produces Integer& value
        );

Picture Specs for Evaluate_Term
[Figure labels: prefix of ins.content representing a term; lookahead token (#token_text); token_text.]

Evaluation Operation for Nonterminal <factor>
How does the following operation work? (Check out the next slide)

    global_procedure Evaluate_Factor (
            alters Character_IStream& ins,
            alters Text& token_text,        // lookahead token
            alters Integer& token_kind,     // lookahead token
            produces Integer& value
        );

Picture Specs for Evaluate_Factor
[Figure labels: prefix of ins.content representing a factor; lookahead token (#token_text); token_text.]

What About the Other Nonterminal Symbols?
- <add_op>, <mult_op>, <digit_seq>, and <digit> can be handled by the tokenizer
- However, in the warm-up for closed lab:
    - no tokenizer
    - just deal with characters one at a time
    - use a lookahead character
    - let the CFG drive the implementation of the operations
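The tokenizer-free variant can be sketched as follows, with `std::istream::peek()` playing the role of the lookahead character. This is an assumption-laden illustration in standard C++, not the lab solution: the function names are hypothetical, blanks are skipped by a helper, and the input is assumed to be syntactically valid with no division by zero.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

int evaluate_expression(std::istream& ins);

// Skips blanks so that ins.peek() is the next significant character.
void skip_blanks(std::istream& ins) {
    while (ins.peek() == ' ') ins.get();
}

// Digit sequence: one or more digits, read one character at a time.
int evaluate_digit_seq(std::istream& ins) {
    assert(std::isdigit(ins.peek()));
    int value = 0;
    while (std::isdigit(ins.peek())) value = value * 10 + (ins.get() - '0');
    return value;
}

// Factor: a parenthesized expression or a digit sequence.
int evaluate_factor(std::istream& ins) {
    skip_blanks(ins);
    if (ins.peek() == '(') {
        ins.get();                                  // consume '('
        const int value = evaluate_expression(ins);
        skip_blanks(ins);
        const int close = ins.get();                // consume ')'
        assert(close == ')');
        (void)close;
        return value;
    }
    return evaluate_digit_seq(ins);
}

// Term: a factor followed by any number of (*, DIV, MOD) factor pairs.
int evaluate_term(std::istream& ins) {
    int value = evaluate_factor(ins);
    skip_blanks(ins);
    while (ins.peek() == '*' || ins.peek() == 'D' || ins.peek() == 'M') {
        const char op = static_cast<char>(ins.get());
        if (op != '*') { ins.get(); ins.get(); }    // rest of "DIV" or "MOD"
        const int right = evaluate_factor(ins);
        if (op == '*') value *= right;
        else if (op == 'D') value /= right;         // assumes right != 0
        else value %= right;                        // assumes right != 0
        skip_blanks(ins);
    }
    return value;
}

// Expression: a term followed by any number of (+, -) term pairs.
int evaluate_expression(std::istream& ins) {
    int value = evaluate_term(ins);
    skip_blanks(ins);
    while (ins.peek() == '+' || ins.peek() == '-') {
        const char op = static_cast<char>(ins.get());
        const int right = evaluate_term(ins);
        value = (op == '+') ? value + right : value - right;
        skip_blanks(ins);
    }
    return value;
}
```

Given a `std::istringstream` holding "5 + 3 - 2 plus some more text", `evaluate_expression` returns 6 and leaves "plus some more text" unread in the stream, mirroring the picture spec in which only a prefix of the input represents the expression.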

How To Write Recursive Descent Parsers
1. One operation per nonterminal (except single-token nonterminals if using tokenizer)
2. nonterminal in rewrite rule → call operation to parse nonterminal
3. terminal (or single-token nonterminal) in rewrite rule → advance input (get next token)
4. | in rewrite rules → if-else-if in parser
5. {} in rewrite rules → loop in parser

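Revised CFG for Expressions
    <expression> → <term> { <add_op> <term> }
    <term>       → <factor> { <mult_op> <factor> }
    <factor>     → ( <expression> ) | <digit_seq>
    <add_op>     → + | -
    <mult_op>    → * | DIV | MOD
    <digit_seq>  → <digit> <digit_seq> | <digit>
    <digit>      → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9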

1. One operation per nonterminal (except single-token nonterminals)
    <expression> → Evaluate_Expression
    <term> → Evaluate_Term
    <factor> → Evaluate_Factor
    <add_op>, <mult_op>, <digit_seq> → single-token nonterminals

2. nonterminal in rewrite rule → call operation to parse nonterminal
    <expression> → <term> { <add_op> <term> }

    procedure_body Evaluate_Expression (…)
    {
        Evaluate_Term (…)
        …
    }

2. nonterminal in rewrite rule → call operation to parse nonterminal
    <term> → <factor> { <mult_op> <factor> }

    procedure_body Evaluate_Term (…)
    {
        Evaluate_Factor (…)
        …
    }

3. terminal (or single-token nonterminal) in rewrite rule → advance input (get token)
    <factor> → ( <expression> ) | <digit_seq>

    procedure_body Evaluate_Factor (…)
    {
        …
        GetNextNonWSToken (…)
        Evaluate_Expression (…)
        …
    }

4. | in rewrite rules → if-else-if in parser
    <factor> → ( <expression> ) | <digit_seq>

    procedure_body Evaluate_Factor (…)
    {
        if (tk == LEFT_PAREN)
        {
            GetNextNonWSToken (…)
            Evaluate_Expression (…)
            GetNextNonWSToken (…)
        }
        else
        {
            …
            GetNextNonWSToken (…)
        }
    }

5. {} in rewrite rules → loop in parser
    <expression> → <term> { <add_op> <term> }

    procedure_body Evaluate_Expression (…)
    {
        Evaluate_Term (…)
        while ((tk == PLUS) or (tk == MINUS))
        {
            GetNextNonWSToken (…)
            Evaluate_Term (…)
            …
        }
    }

5. {} in rewrite rules → loop in parser
    <term> → <factor> { <mult_op> <factor> }

    procedure_body Evaluate_Term (…)
    {
        Evaluate_Factor (…)
        while ((tk == STAR) or (tk == DIV) or (tk == MOD))
        {
            GetNextNonWSToken (…)
            Evaluate_Factor (…)
            …
        }
    }
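Putting the five guidelines together, the skeletons above might be filled in as follows. This is a sketch in standard C++ under stated assumptions, not the course's Resolve/C++ solution: `Kind`, `Token`, `get_next_non_ws_token`, and the `evaluate` wrapper are illustrative names, the `OTHER` and `END` kinds are additions, and the input is assumed to be syntactically valid with no division by zero. Each operation expects the lookahead token to be the first token of its construct on entry and leaves it at the first token after the construct on exit.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Token kinds; OTHER and END are assumptions added for this sketch.
enum class Kind { NUMBER, PLUS, MINUS, STAR, DIV, MOD,
                  LEFT_PAREN, RIGHT_PAREN, OTHER, END };

// The lookahead token: a (text, kind) pair produced by the tokenizer.
struct Token {
    std::string text;
    Kind kind = Kind::END;
};

// Replaces tk with the next non-white-space token read from ins.
void get_next_non_ws_token(std::istream& ins, Token& tk) {
    while (std::isspace(ins.peek())) ins.get();
    tk.text.clear();
    if (ins.peek() == std::istream::traits_type::eof()) {
        tk.kind = Kind::END;
    } else if (std::isdigit(ins.peek())) {
        while (std::isdigit(ins.peek())) tk.text += static_cast<char>(ins.get());
        tk.kind = Kind::NUMBER;
    } else if (std::isalpha(ins.peek())) {
        while (std::isalpha(ins.peek())) tk.text += static_cast<char>(ins.get());
        tk.kind = tk.text == "DIV" ? Kind::DIV
                : tk.text == "MOD" ? Kind::MOD : Kind::OTHER;
    } else {
        tk.text += static_cast<char>(ins.get());
        switch (tk.text[0]) {
            case '+': tk.kind = Kind::PLUS; break;
            case '-': tk.kind = Kind::MINUS; break;
            case '*': tk.kind = Kind::STAR; break;
            case '(': tk.kind = Kind::LEFT_PAREN; break;
            case ')': tk.kind = Kind::RIGHT_PAREN; break;
            default:  tk.kind = Kind::OTHER; break;
        }
    }
}

int evaluate_expression(std::istream& ins, Token& tk);

// Factor: guidelines 3 (advance past terminals) and 4 (| becomes if-else).
int evaluate_factor(std::istream& ins, Token& tk) {
    int value = 0;
    if (tk.kind == Kind::LEFT_PAREN) {
        get_next_non_ws_token(ins, tk);             // advance past "("
        value = evaluate_expression(ins, tk);
        assert(tk.kind == Kind::RIGHT_PAREN);
        get_next_non_ws_token(ins, tk);             // advance past ")"
    } else {
        assert(tk.kind == Kind::NUMBER);
        value = std::stoi(tk.text);
        get_next_non_ws_token(ins, tk);             // advance past the number
    }
    return value;
}

// Term: guidelines 2 (call the operation) and 5 ({} becomes a loop).
int evaluate_term(std::istream& ins, Token& tk) {
    int value = evaluate_factor(ins, tk);
    while (tk.kind == Kind::STAR || tk.kind == Kind::DIV || tk.kind == Kind::MOD) {
        const Kind op = tk.kind;
        get_next_non_ws_token(ins, tk);
        const int right = evaluate_factor(ins, tk);
        if (op == Kind::STAR) value *= right;
        else if (op == Kind::DIV) value /= right;   // assumes right != 0
        else value %= right;                        // assumes right != 0
    }
    return value;
}

// Expression: same shape as evaluate_term, one level up in the CFG.
int evaluate_expression(std::istream& ins, Token& tk) {
    int value = evaluate_term(ins, tk);
    while (tk.kind == Kind::PLUS || tk.kind == Kind::MINUS) {
        const Kind op = tk.kind;
        get_next_non_ws_token(ins, tk);
        const int right = evaluate_term(ins, tk);
        value = (op == Kind::PLUS) ? value + right : value - right;
    }
    return value;
}

// Hypothetical convenience wrapper: primes the lookahead, then parses.
int evaluate(const std::string& text) {
    std::istringstream ins(text);
    Token tk;
    get_next_non_ws_token(ins, tk);
    return evaluate_expression(ins, tk);
}
```

For example, `evaluate("2 * (3 + 4) MOD 5")` returns 4, and `evaluate("5 + 3 - 2 plus some more text")` returns 6 because parsing stops at the first token that cannot continue the expression.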