Syntax Analysis (Chapter 4)

Presentation transcript:

Course Overview

PART I: overview material
    1 Introduction
    2 Language processors (tombstone diagrams, bootstrapping)
    3 Architecture of a compiler
PART II: inside a compiler
    4 Syntax analysis
    5 Contextual analysis
    6 Runtime organization
    7 Code generation
PART III: conclusion
    8 Interpretation
    9 Review

Nullable, First sets (starter sets), and Follow sets

A non-terminal is nullable if it derives the empty string.
First(N), also written starters(N), is the set of all terminals that can begin a sentence derived from N.
Follow(N) is the set of terminals that can follow N in some sentential form.

Next we will see algorithms to compute each of these.

Algorithm for computing Nullable

For each terminal t
    Nullable(t) = false
For each non-terminal N
    Nullable(N) = true if there is a production N ::= ε, otherwise false
Repeat
    For each production N ::= x_1 x_2 x_3 … x_n
        If Nullable(x_i) for all i from 1 through n, then set Nullable(N) to true
Until nothing new becomes Nullable

Generalizing the definition of Nullable

Define Nullable(x_1 x_2 x_3 … x_n) as:
    if n == 0 then true
    else if !Nullable(x_1) then false
    else Nullable(x_2 x_3 … x_n)
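The fixed-point loop and the generalized definition can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not the course's own code: symbols are plain strings, a symbol counts as a non-terminal exactly when it is a key of the grammar map, an empty right-hand side stands for an ε-production, and the names NullableSketch, nullableSeq, computeNullable and exampleGrammar are made up for illustration. The grammar is the one from the worked example later in this chapter.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch: fixed-point computation of Nullable.
final class NullableSketch {

    // Generalized definition: a sequence is nullable iff every symbol in it is.
    // Terminals never enter the nullable set, so they make a sequence non-nullable.
    static boolean nullableSeq(List<String> symbols, Set<String> nullable) {
        for (String x : symbols) {
            if (!nullable.contains(x)) return false;
        }
        return true; // also covers the empty sequence (n == 0)
    }

    static Set<String> computeNullable(Map<String, List<List<String>>> grammar) {
        Set<String> nullable = new HashSet<>();
        boolean changed = true;
        while (changed) { // "Repeat ... Until nothing new becomes Nullable"
            changed = false;
            for (Map.Entry<String, List<List<String>>> rule : grammar.entrySet()) {
                for (List<String> rhs : rule.getValue()) {
                    // Set.add returns true only if N was not nullable before.
                    if (nullableSeq(rhs, nullable) && nullable.add(rule.getKey())) {
                        changed = true;
                    }
                }
            }
        }
        return nullable;
    }

    // S ::= TUVW | WVUT, T ::= aT | e, U ::= Ub | f, V ::= cV | ε, W ::= Wd | ε
    static Map<String, List<List<String>>> exampleGrammar() {
        return Map.of(
            "S", List.of(List.of("T", "U", "V", "W"), List.of("W", "V", "U", "T")),
            "T", List.of(List.of("a", "T"), List.of("e")),
            "U", List.of(List.of("U", "b"), List.of("f")),
            "V", List.of(List.of("c", "V"), List.<String>of()),
            "W", List.of(List.of("W", "d"), List.<String>of()));
    }

    public static void main(String[] args) {
        System.out.println("Nullable non-terminals: " + computeNullable(exampleGrammar()));
    }
}
```

Initializing the set as empty and letting the first pass pick up the ε-productions (an empty right-hand side is trivially nullable) has the same effect as the separate initialization step in the algorithm.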

Algorithm for computing First sets

For each terminal t
    First(t) = { t }
For each non-terminal N
    First(N) = { }
Repeat
    For each production N ::= x_1 x_2 x_3 … x_n
        First(N) = First(N) ∪ First(x_1)
        For each i from 2 through n
            If Nullable(x_1 … x_{i-1}), then First(N) = First(N) ∪ First(x_i)
Until no First set changes

Generalizing the definition of First sets

Define First(x_1 x_2 x_3 … x_n) as:
    if !Nullable(x_1) then First(x_1)
    else First(x_1) ∪ First(x_2 x_3 … x_n)

The recursion ends at the empty sequence, whose First set is { }.

Note: some textbooks add ε (empty string) to First(N) whenever N is nullable, so that First(N) is never { } (empty set).

Algorithm for computing Follow sets

Follow(S) = { $ }    // S is the start symbol, $ the end-of-file symbol
For each non-terminal N other than S
    Follow(N) = { }
Repeat
    For each production N ::= x_1 x_2 x_3 … x_n
        For each i from 1 through n-1
            if x_i is a non-terminal
                then Follow(x_i) = Follow(x_i) ∪ First(x_{i+1} … x_n)
        For each i from n downto 1
            if x_i is a non-terminal and Nullable(x_{i+1} … x_n)
                then Follow(x_i) = Follow(x_i) ∪ Follow(N)
Until no Follow set changes

Example of computing Nullable, First, Follow

S ::= TUVW | WVUT
T ::= aT | e
U ::= Ub | f
V ::= cV | ε
W ::= Wd | ε

Here e is an ordinary terminal; only ε denotes the empty string.

     Nullable?   First               Follow
S    false       {a, e, d, c, f}     {$}
T    false       {a, e}              {f, $}
U    false       {f}                 {c, d, $, a, e, b}
V    true        {c} or {c, ε}       {d, $, f}
W    true        {d} or {d, ε}       {$, c, f, d}
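The table can be checked mechanically. The following Java sketch computes all three properties for the example grammar. It rests on a few assumptions that are not from the slides: symbols are strings, non-terminals are the keys of the grammar map, an empty right-hand side stands for ε, and the three separate Repeat loops are folded into a single loop (this gives the same result because every step only ever adds elements). The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch: Nullable, First and Follow by fixed-point iteration.
final class FirstFollowSketch {
    final Map<String, List<List<String>>> grammar;
    final Set<String> nullable = new HashSet<>();
    final Map<String, Set<String>> first = new HashMap<>();
    final Map<String, Set<String>> follow = new HashMap<>();

    FirstFollowSketch(Map<String, List<List<String>>> grammar, String start) {
        this.grammar = grammar;
        for (String n : grammar.keySet()) {
            first.put(n, new HashSet<>());
            follow.put(n, new HashSet<>());
        }
        follow.get(start).add("$"); // end-of-file symbol
        boolean changed = true;
        while (changed) { // repeat until no set changes
            changed = false;
            for (Map.Entry<String, List<List<String>>> rule : grammar.entrySet()) {
                String n = rule.getKey();
                for (List<String> rhs : rule.getValue()) {
                    if (nullableSeq(rhs) && nullable.add(n)) changed = true;
                    if (first.get(n).addAll(firstSeq(rhs))) changed = true;
                    for (int i = 0; i < rhs.size(); i++) {
                        String x = rhs.get(i);
                        if (!grammar.containsKey(x)) continue; // terminals have no Follow set
                        List<String> rest = rhs.subList(i + 1, rhs.size());
                        if (follow.get(x).addAll(firstSeq(rest))) changed = true;
                        if (nullableSeq(rest) && follow.get(x).addAll(follow.get(n))) changed = true;
                    }
                }
            }
        }
    }

    // Generalized Nullable: true iff every symbol is a nullable non-terminal.
    boolean nullableSeq(List<String> symbols) {
        return nullable.containsAll(symbols);
    }

    // Generalized First: stop after the first symbol that is not nullable.
    Set<String> firstSeq(List<String> symbols) {
        Set<String> result = new HashSet<>();
        for (String x : symbols) {
            if (grammar.containsKey(x)) result.addAll(first.get(x));
            else result.add(x); // First(t) = { t } for a terminal t
            if (!nullable.contains(x)) break;
        }
        return result;
    }

    static Map<String, List<List<String>>> exampleGrammar() {
        return Map.of(
            "S", List.of(List.of("T", "U", "V", "W"), List.of("W", "V", "U", "T")),
            "T", List.of(List.of("a", "T"), List.of("e")),
            "U", List.of(List.of("U", "b"), List.of("f")),
            "V", List.of(List.of("c", "V"), List.<String>of()),
            "W", List.of(List.of("W", "d"), List.<String>of()));
    }

    public static void main(String[] args) {
        FirstFollowSketch g = new FirstFollowSketch(exampleGrammar(), "S");
        for (String n : List.of("S", "T", "U", "V", "W")) {
            System.out.println(n + ": nullable=" + g.nullable.contains(n)
                + " first=" + g.first.get(n) + " follow=" + g.follow.get(n));
        }
    }
}
```

This sketch follows the convention in which ε is not added to First sets, so it reports {c} and {d} for V and W rather than {c, ε} and {d, ε}.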

Parsing

We will now look at parsing. Topics:
– Some terminology
– Different types of parsing strategies
    bottom up
    top down
– Recursive descent parsing
    What it is
    How to implement a parser given an EBNF specification

Parsing: Some Terminology

Recognition
    Answering the question “does the input conform to the syntax of the language?”
Parsing
    Recognition, plus determining the structure of the program (for example by creating an AST data structure)
Unambiguous grammar
    A grammar is unambiguous if there is at most one way to parse any input (i.e. for a syntactically correct program there is precisely one parse tree).

Different kinds of Parsing Algorithms

Two big groups of algorithms can be distinguished:
– bottom up strategies
– top down strategies

Example: parsing of “Micro-English”

Sentence ::= Subject Verb Object .
Subject  ::= I | A Noun | The Noun
Object   ::= me | a Noun | the Noun
Noun     ::= cat | bat | rat
Verb     ::= like | is | see | sees

Example sentences:
The cat sees the rat.
The rat sees me.
I like a cat.
The rat like me.
I see the rat.
I sees a rat.

Bottom up parsing

The parse tree “grows” from the bottom (leaves) up to the top (root).

[Figure: parse tree for “The cat sees a rat.” built bottom up. cat is reduced to Noun, The Noun to Subject, sees to Verb, rat to Noun, a Noun to Object, and finally Subject Verb Object . to Sentence.]

Top-down parsing

The parse tree is constructed starting at the top (root).

[Figure: parse tree for “The cat sees a rat.” built top down. Sentence is expanded to Subject Verb Object ., Subject to The Noun, Noun to cat, Verb to sees, Object to a Noun, and Noun to rat.]

Quick review

Syntactic analysis
– Lexical analysis
    Group letters into words (or group characters into tokens)
    Use regular expressions and deterministic FSMs
– Grammar transformations
    Left-factoring
    Left-recursion removal
    Substitution
– Parsing = structural analysis of program
    Group words into sentences, paragraphs, and documents (or tokens into expressions, commands, and programs)
    Top-Down and Bottom-Up

Recursive Descent Parsing

Recursive descent parsing is a straightforward top-down parsing algorithm.
We will now look at how to develop a recursive descent parser from an EBNF specification.

Idea: the parse tree structure corresponds to the recursive calling structure of parsing functions that call each other.

Recursive Descent Parsing

Sentence ::= Subject Verb Object .
Subject  ::= I | A Noun | The Noun
Object   ::= me | a Noun | the Noun
Noun     ::= cat | bat | rat
Verb     ::= like | is | see | sees

Define a procedure parseN for each non-terminal N:

    private void parseSentence();
    private void parseSubject();
    private void parseObject();
    private void parseNoun();
    private void parseVerb();

Recursive Descent Parsing

    public class MicroEnglishParser {
        private TerminalSymbol currentTerminal;

        // Auxiliary methods will go here
        ...
        // Parsing methods will go here
        ...
    }

Recursive Descent Parsing: Auxiliary Methods

    public class MicroEnglishParser {
        private TerminalSymbol currentTerminal;

        private void accept(TerminalSymbol expected) {
            if (currentTerminal matches expected)
                currentTerminal = next input terminal;
            else
                report a syntax error
        }
        ...
    }

Recursive Descent Parsing: Parsing Methods

Sentence ::= Subject Verb Object .

    private void parseSentence() {
        parseSubject();
        parseVerb();
        parseObject();
        accept('.');
    }

Recursive Descent Parsing: Parsing Methods

Subject ::= I | A Noun | The Noun

    private void parseSubject() {
        if (currentTerminal matches 'I')
            accept('I');
        else if (currentTerminal matches 'A') {
            accept('A');
            parseNoun();
        }
        else if (currentTerminal matches 'The') {
            accept('The');
            parseNoun();
        }
        else
            report a syntax error
    }

Recursive Descent Parsing: Parsing Methods

Noun ::= cat | bat | rat

    private void parseNoun() {
        if (currentTerminal matches 'cat')
            accept('cat');
        else if (currentTerminal matches 'bat')
            accept('bat');
        else if (currentTerminal matches 'rat')
            accept('rat');
        else
            report a syntax error
    }

Recursive Descent Parsing: Parsing Methods

Object ::= me | a Noun | the Noun
Verb   ::= like | is | see | sees

    private void parseObject() {
        ?
    }

    private void parseVerb() {
        ?
    }

Test yourself: Can you complete parseObject() and parseVerb()?
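One possible completion, shown as a runnable recognizer for the whole Micro-English grammar. Several details here are assumptions rather than part of the slides: tokens are plain strings instead of the TerminalSymbol type (which the slides do not define), the input is split on whitespace with the period as its own token, syntax errors are reported by throwing IllegalStateException, and the names MicroEnglishRecognizer, acceptOneOf, syntaxError, recognizes and the end marker are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: recursive descent recognizer for Micro-English.
final class MicroEnglishRecognizer {
    private static final String EOT = "<end>"; // hypothetical end-of-text marker
    private final List<String> tokens;
    private int position = 0;
    private String currentTerminal;

    MicroEnglishRecognizer(String text) {
        // Split on whitespace, treating each period as a separate token.
        tokens = new ArrayList<>(Arrays.asList(text.replace(".", " . ").trim().split("\\s+")));
        tokens.add(EOT);
        currentTerminal = tokens.get(0);
    }

    private IllegalStateException syntaxError(String expected) {
        return new IllegalStateException("expected " + expected + " but found '" + currentTerminal + "'");
    }

    private void accept(String expected) {
        if (!currentTerminal.equals(expected)) throw syntaxError("'" + expected + "'");
        currentTerminal = tokens.get(++position); // never runs past EOT: EOT is never accepted
    }

    // Accepts whichever alternative matches the current terminal.
    private void acceptOneOf(String description, String... alternatives) {
        for (String alternative : alternatives) {
            if (currentTerminal.equals(alternative)) { accept(alternative); return; }
        }
        throw syntaxError(description);
    }

    // Sentence ::= Subject Verb Object .
    private void parseSentence() { parseSubject(); parseVerb(); parseObject(); accept("."); }

    // Subject ::= I | A Noun | The Noun
    private void parseSubject() {
        if (currentTerminal.equals("I")) accept("I");
        else if (currentTerminal.equals("A")) { accept("A"); parseNoun(); }
        else if (currentTerminal.equals("The")) { accept("The"); parseNoun(); }
        else throw syntaxError("a subject");
    }

    // Object ::= me | a Noun | the Noun
    private void parseObject() {
        if (currentTerminal.equals("me")) accept("me");
        else if (currentTerminal.equals("a")) { accept("a"); parseNoun(); }
        else if (currentTerminal.equals("the")) { accept("the"); parseNoun(); }
        else throw syntaxError("an object");
    }

    // Noun ::= cat | bat | rat
    private void parseNoun() { acceptOneOf("a noun", "cat", "bat", "rat"); }

    // Verb ::= like | is | see | sees
    private void parseVerb() { acceptOneOf("a verb", "like", "is", "see", "sees"); }

    // Parses one sentence and checks that the whole input was consumed.
    void parse() {
        parseSentence();
        if (!currentTerminal.equals(EOT)) throw syntaxError("end of input");
    }

    static boolean recognizes(String text) {
        try {
            new MicroEnglishRecognizer(text).parse();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(recognizes("The cat sees the rat.")); // true
        System.out.println(recognizes("I sees a rat."));         // true
        System.out.println(recognizes("cat the sees."));         // false
    }
}
```

The grammar does not encode subject-verb agreement, so “I sees a rat.” and “The rat like me.” are both accepted; a recognizer only answers whether the input conforms to the grammar.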

Systematic Development of Rec. Descent Parser

(1) Express grammar in EBNF
(2) Grammar transformations: left factorization and left recursion elimination
(3) Create a parser class with
    – private variable currentToken
    – methods to call the scanner: accept and acceptIt
(4) Implement a public method for the main function to call:
    – public parse method that
        fetches the first token from the scanner
        calls parseS (where S is the start symbol of the grammar)
        verifies that the scanner next produces the end-of-file token
(5) Implement private parsing methods:
    – add a private parseN method for each non-terminal N
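Step (2) matters because a parsing method written directly from a left-recursive rule would call itself before consuming any input. As a minimal sketch, take the rule U ::= Ub | f from the earlier example: it derives an f followed by any number of b's, so in EBNF it can be written U ::= f b*, and the repetition becomes a loop. The assumptions here are that terminals are single characters read straight from a string, and that the class name LeftRecursionSketch and its methods are illustrative only.

```java
// Illustrative sketch: U ::= U b | f rewritten as U ::= f b* and parsed with a loop.
final class LeftRecursionSketch {
    private final String input;
    private int position = 0;

    LeftRecursionSketch(String input) {
        this.input = input;
    }

    // Current character, or '\0' once the input is exhausted.
    private char current() {
        return position < input.length() ? input.charAt(position) : '\0';
    }

    private void accept(char expected) {
        if (current() != expected) {
            throw new IllegalStateException("expected '" + expected + "' at position " + position);
        }
        position++;
    }

    // U ::= f b*
    private void parseU() {
        accept('f');
        while (current() == 'b') { // each iteration consumes one b, so the loop terminates
            accept('b');
        }
    }

    // True iff the whole input is derived from U.
    static boolean recognizes(String input) {
        LeftRecursionSketch parser = new LeftRecursionSketch(input);
        try {
            parser.parseU();
        } catch (IllegalStateException e) {
            return false;
        }
        return parser.position == input.length();
    }

    public static void main(String[] args) {
        System.out.println(recognizes("fbbb")); // true
        System.out.println(recognizes("bf"));   // false
    }
}
```

The final position check plays the role of step (4): after parsing the start symbol, the parser verifies that nothing but the end of input remains.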