Syntax Analysis By Noor Dhia 2014 2015.



Syntax analysis:- Syntax analysis, or parsing, is one of the most important phases of a compiler. The syntax analyzer checks whether the sequence of tokens forms valid constructs of the programming language. Where lexical analysis splits the input into tokens, the purpose of syntax analysis is to recombine these tokens, not back into a list of characters, but into something that reflects the structure of the text. We expect the parser to report any syntax errors in an intelligible fashion. It should also recover from commonly occurring errors so that it can continue processing the remainder of its input.

Syntax analysis:- There are a number of parsing algorithms, which are classified into top-down and bottom-up strategies. A top-down parser attempts to derive the input program starting from a special start symbol: it constructs the syntax tree from the root towards the leaves, which corresponds to a leftmost derivation. A bottom-up parser reduces a valid input program to the start symbol: it constructs the syntax tree from the leaves towards the root, which corresponds to a (reversed) rightmost derivation. The notation we use for human manipulation is context-free grammars.

Syntax analysis tasks:- There are a number of tasks that might be conducted during parsing:
• Finding a derivation sequence in grammar G for the input token stream (or reporting that none exists).
• Collecting information about various tokens into the symbol table.
• Performing type checking and other kinds of semantic analysis.
• Generating intermediate code.

Top-down parsing: In top-down parsing, you start with the start symbol and apply productions until you arrive at the desired string. This type of parsing can be viewed as finding a leftmost derivation for an input string. Ex:
S → AB
A → aA | ε
B → b | bB

Here is a top-down parse of aaab. We begin with the start symbol and, at each step, expand one of the remaining nonterminals by replacing it with the right side of one of its productions. We repeat until only terminals remain.

Sentential form    Production applied
S                  (start symbol)
AB                 S → AB
aAB                A → aA
aaAB               A → aA
aaaAB              A → aA
aaaεB = aaaB       A → ε
aaab               B → b
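The parse of aaab can be reproduced mechanically. The following is a minimal sketch, not a production parser: it assumes single-character terminals, and the names GRAMMAR and leftmost_derivation are illustrative. It always expands the leftmost nonterminal and backtracks when the terminals produced so far stop matching the input.

```python
# Grammar from the example: S → AB, A → aA | ε, B → b | bB.
# Each alternative is a list of symbols; the empty list stands for ε.
GRAMMAR = {
    "S": [["A", "B"]],
    "A": [["a", "A"], []],
    "B": [["b"], ["b", "B"]],
}


def leftmost_derivation(target, start="S"):
    """Return the sentential forms of a leftmost derivation of target, or None."""

    def search(form, steps):
        # Index of the leftmost nonterminal, if any.
        i = next((k for k, sym in enumerate(form) if sym in GRAMMAR), None)
        if i is None:
            return steps if "".join(form) == target else None
        # Prune: the terminal prefix must match the input, and the form
        # must not already contain more terminals than the input has.
        prefix = "".join(form[:i])
        terminal_count = sum(1 for sym in form if sym not in GRAMMAR)
        if not target.startswith(prefix) or terminal_count > len(target):
            return None
        for rhs in GRAMMAR[form[i]]:
            new_form = form[:i] + rhs + form[i + 1:]
            result = search(new_form, steps + ["".join(new_form)])
            if result is not None:
                return result
        return None

    return search([start], [start])


print(" → ".join(leftmost_derivation("aaab")))
```

The search terminates for this grammar because it has no left recursion: every expansion either adds a terminal, which is bounded by the input length, or removes a nonterminal.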

Grammar:- A context-free grammar (CFG) G = (N, T, P, S) consists of:
1. N, a set of nonterminal symbols.
2. T, a set of terminal symbols, or the alphabet.
3. S, a start symbol, S ∈ N.
4. P, a set of productions or rewrite rules; each production is of the form X → α, where X ∈ N is a nonterminal and α ∈ (N ∪ T)* is a string of terminals and nonterminals.
Example: G = ({S}, {a, b}, P, S), where S → ab and S → aSb are the only productions in P. Derivations look like this:
S → ab
S → aSb → aabb
S → aSb → aaSbb → aaabbb
L(G), the language generated by G, is {a^n b^n | n > 0}.
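As a small illustration of the example grammar, the sketch below replays the derivation of a^n b^n and checks membership in L(G) directly. The function names derive_anbn and in_language are hypothetical helpers introduced only for this example.

```python
def derive_anbn(n):
    """Sentential forms of the derivation of a^n b^n using S → aSb and S → ab."""
    if n < 1:
        raise ValueError("the language requires n > 0")
    form = "S"
    forms = [form]
    for _ in range(n - 1):
        form = form.replace("S", "aSb")  # apply S → aSb
        forms.append(form)
    form = form.replace("S", "ab")       # finish with S → ab
    forms.append(form)
    return forms


def in_language(word):
    """True if word has the shape a^n b^n with n > 0."""
    n = len(word) // 2
    return n > 0 and word == "a" * n + "b" * n


print(" → ".join(derive_anbn(3)))
```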

Parse Trees:- A parse tree is a graphical representation of a derivation of a sentential form. Tree nodes represent symbols of the grammar (nonterminals or terminals) and tree edges represent derivation steps. Example: Given the following grammar
E → E+E | E-E | E*E | E/E | -E | (E) | id
is the string -(id+id) a sentence in this grammar? Yes, because there is the following derivation:
E → -E → -(E) → -(E+E) → -(id+E) → -(id+id)
Let's examine this derivation by generating the parse trees below:

Parse Trees

Note:
1. The symbol → reads "derives in one step".
2. This is a top-down derivation because we start building the parse tree at the top.
3. A parse tree ignores variations in the order in which symbols in sentential forms are replaced.
4. These variations in the order in which productions are applied can also be eliminated by considering only leftmost or only rightmost derivations.
5. Every parse tree has associated with it a unique leftmost and a unique rightmost derivation.

Leftmost & rightmost derivations: Example: According to the following grammar:
E → E + E | E * E | id
find the derivations of the string id + id * id.
LMD: E → E + E → id + E → id + E * E → id + id * E → id + id * id
RMD: E → E + E → E + E * E → E + E * id → E + id * id → id + id * id
LMD: E → E * E → E + E * E → id + E * E → id + id * E → id + id * id
RMD: E → E * E → E * id → E + E * id → E + id * id → id + id * id
There are two parse trees (shown in example-1 above) for the same string under the same grammar; therefore this grammar is ambiguous.
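The ambiguity can also be confirmed by counting parse trees. The sketch below is specific to the grammar E → E + E | E * E | id and is not a general CFG algorithm; the function name count_parse_trees is illustrative. It treats each operator in turn as the root of the tree and multiplies the counts for the two sides.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Number of parse trees for tokens under E → E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Trees deriving tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                # tokens[k] is the root operator; combine left and right subtrees.
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens)) if tokens else 0


print(count_parse_trees("id + id * id".split()))
```

A count greater than 1 for any string means the grammar is ambiguous; for id + id * id the two trees correspond to the two pairs of derivations listed in the example.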

Example: S → aS | Sa | a, W = aa. The string aa has two parse trees, corresponding to the derivations S → aS → aa and S → Sa → aa.
Ex: S → aSbS | bSaS | ε, W = abab
Ex: R → R + R | RR | R* | a | b | c, W = a+bc

Leftmost & rightmost derivations: Example: G = ({S}, {a, b}, P, S), where
S → SS | aSb | ε
The string abaabb:
LMD: S → SS → aSbS → abS → abaSb → abaaSbb → abaabb
RMD: S → SS → SaSb → SaaSbb → Saabb → aSbaabb → abaabb
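Both derivations of abaabb can be replayed step by step. This is a minimal sketch assuming single-character symbols; the helper derive and its parameters are hypothetical. It applies each production to the leftmost or rightmost nonterminal and raises an error if the production does not fit.

```python
def derive(start, productions, nonterminals, rightmost=False):
    """Apply (lhs, rhs) productions to the leftmost or rightmost nonterminal."""
    form = start
    forms = [form]
    for lhs, rhs in productions:
        positions = [i for i, ch in enumerate(form) if ch in nonterminals]
        if not positions:
            raise ValueError("no nonterminal left to expand")
        i = positions[-1] if rightmost else positions[0]
        if form[i] != lhs:
            raise ValueError(f"expected {form[i]} but production rewrites {lhs}")
        form = form[:i] + rhs + form[i + 1:]
        forms.append(form)
    return forms


# S → SS | aSb | ε, with "" standing for ε.
LMD = [("S", "SS"), ("S", "aSb"), ("S", ""), ("S", "aSb"), ("S", "aSb"), ("S", "")]
RMD = [("S", "SS"), ("S", "aSb"), ("S", "aSb"), ("S", ""), ("S", "aSb"), ("S", "")]

print(" → ".join(derive("S", LMD, {"S"})))
print(" → ".join(derive("S", RMD, {"S"}, rightmost=True)))
```

Both runs use the same multiset of productions and end in the same string; only the order of application differs.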

Parse Tree:

Example: Given the following grammar
E → E+E | E*E | (E) | -E | id
find the derivation for the expression id + id * id. Which derivation tree is correct?

Example note: Which derivation tree is correct?
• According to the grammar, both are correct.
• Regular expressions (REs) are most useful for describing the structure of lexical constructs such as identifiers, constants, keywords, etc. Grammars, on the other hand, are most useful for describing nested structures such as balanced parentheses, matching begin-end's, and corresponding if-then-else's. These nested structures cannot be described by REs.
• A grammar that produces more than one parse tree for some input sentence is said to be an ambiguous grammar. An ambiguous grammar can have more than one leftmost derivation and more than one rightmost derivation for the same sentence, as discussed below.
• For certain types of parsers, it is desirable that the grammar be made unambiguous, for if it is not, we cannot uniquely determine which parse tree to select for a sentence.
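To illustrate the point about nested structures, here is a small recognizer for balanced parentheses. The grammar S → ( S ) S | ε is an assumption chosen for this illustration and does not appear in the slides; the function name balanced is also illustrative. The recursion in the code mirrors the recursion in the grammar, which is what lets it track arbitrarily deep nesting.

```python
def balanced(text):
    """Recognize balanced parentheses with the assumed grammar S → ( S ) S | ε."""

    def parse_s(i):
        # Returns the index just after the S that starts at i, or None on error.
        if i < len(text) and text[i] == "(":
            j = parse_s(i + 1)                    # inner S
            if j is None or j >= len(text) or text[j] != ")":
                return None                        # missing closing parenthesis
            return parse_s(j + 1)                  # trailing S
        return i                                   # S → ε

    return parse_s(0) == len(text)


print(balanced("(()())"), balanced("(()"))
```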

• Which derivation tree is correct?
• If there is an identifier between two operators, which operator is applied first: (id + id) * id or id + (id * id)?
To answer these questions: according to the priority of operations, the first tree (a) is correct. Check id + id + id.
• A compiler must convert the ambiguous grammar to an unambiguous grammar; this is done by using left recursion.

Left Recursion: A grammar that has at least one production of the form A → Aα is a left-recursive grammar. Ex: The ambiguity of the following grammar
E → E+E | E*E | (E) | id
can be eliminated by rewriting it as
E → E+T | T
T → T*F | F
F → (E) | id
Note: if the recursive nonterminal appears to the left of the operator, the production is left recursive; if it appears to the right of the operator, it is right recursive.

Left Recursion: Left recursion causes difficulty when designing a parser. A top-down parser might loop forever when parsing an expression using this grammar. Left recursion can be eliminated by rewriting the grammar with a new nonterminal symbol: replace
A → Aα | β
with
A → βA'
A' → αA' | ε

Elimination of Left Recursion: The productions A → Aα | β generate the strings β(α)*. The same strings are generated without left recursion by
A → βA'
A' → αA' | ε
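The rewriting rule can be applied mechanically. The sketch below handles immediate left recursion only. It assumes a hypothetical grammar encoding (a dict from nonterminal to a list of alternatives, each a list of symbols, with the empty list standing for ε) and assumes that the primed name is not already used in the grammar.

```python
def eliminate_immediate_left_recursion(grammar):
    """Rewrite A → Aα | β as A → βA', A' → αA' | ε for every nonterminal A."""
    result = {}
    for head, bodies in grammar.items():
        alphas = [body[1:] for body in bodies if body and body[0] == head]
        betas = [body for body in bodies if not body or body[0] != head]
        if not alphas:
            result[head] = [list(body) for body in bodies]  # nothing to rewrite
            continue
        new_head = head + "'"  # assumed to be unused elsewhere in the grammar
        result[head] = [beta + [new_head] for beta in betas]
        result[new_head] = [alpha + [new_head] for alpha in alphas] + [[]]
    return result


EXPR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

for head, bodies in eliminate_immediate_left_recursion(EXPR).items():
    print(head, "→", " | ".join(" ".join(body) or "ε" for body in bodies))
```

Indirect left recursion (for example A → Bα, B → Aβ) is outside the scope of this sketch.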

Ex: The following left-recursive grammar
E → E+T | T
T → T*F | F
F → (E) | id
can be rewritten to eliminate the immediate left recursion:
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
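Once left recursion is removed, the grammar can be parsed top-down with one function per nonterminal. The following is a minimal recognizer sketch, assuming tokens are separated by whitespace and using "$" as an end marker; the class and function names are illustrative. It only accepts or rejects and does not build a tree.

```python
class Parser:
    """Recursive-descent recognizer for E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def E(self):
        self.T()
        self.E_prime()

    def E_prime(self):
        if self.peek() == "+":      # E' → +TE'
            self.expect("+")
            self.T()
            self.E_prime()
        # otherwise E' → ε

    def T(self):
        self.F()
        self.T_prime()

    def T_prime(self):
        if self.peek() == "*":      # T' → *FT'
            self.expect("*")
            self.F()
            self.T_prime()
        # otherwise T' → ε

    def F(self):
        if self.peek() == "(":      # F → (E)
            self.expect("(")
            self.E()
            self.expect(")")
        else:                       # F → id
            self.expect("id")


def accepts(text):
    """True if the whitespace-separated token string is derivable from E."""
    parser = Parser(text.split())
    try:
        parser.E()
        parser.expect("$")
        return True
    except SyntaxError:
        return False


print(accepts("id + id * id"), accepts("id + * id"))
```

With the original left-recursive production E → E+T, the function E would call itself before consuming any token and never terminate, which is the looping problem described above.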