Syntax Analysis By Noor Dhia 2014 2015


Syntax analysis:- Syntax analysis, or parsing, is one of the most important phases of a compiler. The syntax analyzer checks whether the sequence of tokens forms valid constructs of the programming language. Where lexical analysis splits the input into tokens, the purpose of syntax analysis is to recombine these tokens, not back into a list of characters, but into something that reflects the structure of the text. We expect the parser to report any syntax errors in an intelligible fashion. It should also recover from commonly occurring errors so that it can continue processing the remainder of its input.

Syntax analysis:- There are a number of parsing algorithms, which are classified into top-down and bottom-up strategies. A top-down parser attempts to derive the input program starting from a special start symbol: it constructs the syntax tree from the root towards the leaves, which corresponds to a leftmost derivation. A bottom-up parser reduces a valid input program to the start symbol: it constructs the syntax tree from the leaves towards the root, which corresponds to a (reversed) rightmost derivation. The notation we use to describe syntax in a form suited to human manipulation is context-free grammars.

Syntax analysis tasks:- There are a number of tasks that might be conducted during parsing:
- Finding a derivation sequence in grammar G for the input token stream (or reporting that none exists).
- Collecting information about various tokens into the symbol table.
- Performing type checking and other kinds of semantic analysis.
- Generating intermediate code.

Top-down parsing: In top-down parsing, you start with the start symbol and apply productions until you arrive at the desired string. This type of parsing can be viewed as finding a leftmost derivation for an input string. Ex:
S → AB
A → aA | ε
B → b | bB

Here is a top-down parse of aaab. We begin with the start symbol and at each step expand one of the remaining nonterminals by replacing it with the right side of one of its productions. We repeat until only terminals remain.

Sentential form    Production applied
S
AB                 S → AB
aAB                A → aA
aaAB               A → aA
aaaAB              A → aA
aaaεB              A → ε
aaab               B → b
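The parse of aaab can be sketched as a small recursive-descent recognizer. This is only an illustration, not part of the original slides: the function name `parse` and the idea of returning the list of productions used are assumptions made for the example. It has one function per nonterminal and decides which production to apply by looking at the next input character.

```python
def parse(text):
    """Top-down parse for S → AB, A → aA | ε, B → b | bB.

    Returns the productions used, in leftmost-derivation order,
    or None if the text is not in the language.
    """
    pos = 0
    steps = []

    def peek():
        # Next unread character, or None at end of input.
        return text[pos] if pos < len(text) else None

    def parse_A():
        nonlocal pos
        if peek() == "a":
            steps.append("A → aA")
            pos += 1
            parse_A()
        else:
            steps.append("A → ε")

    def parse_B():
        nonlocal pos
        if peek() != "b":
            return False          # B must start with b
        pos += 1
        if peek() == "b":
            steps.append("B → bB")
            return parse_B()
        steps.append("B → b")
        return True

    steps.append("S → AB")
    parse_A()
    ok = parse_B() and pos == len(text)
    return steps if ok else None


print(parse("aaab"))
```

For aaab the returned list matches the "Production applied" column of the parse above; for a string such as aaa, which has no b, the function returns None.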

Grammar:- A context-free grammar (CFG) G = (N, T, P, S) consists of
1. N, a set of nonterminal symbols.
2. T, a set of terminal symbols, or the alphabet.
3. S, a start symbol, S ∈ N.
4. P, a set of productions or rewrite rules; each production is of the form X → α, where
   - X ∈ N is a nonterminal and
   - α ∈ (N ∪ T)* is a string of terminals and nonterminals.

Example: G = ({S}, {a, b}, P, S), where
S → ab
S → aSb
are the only productions in P. Derivations look like this:
S → ab
S → aSb → aabb
S → aSb → aaSbb → aaabbb
L(G), the language generated by G, is {a^n b^n | n > 0}.
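A minimal sketch of how the two productions generate a^n b^n: apply S → aSb n - 1 times, then finish with S → ab. The function names `derive_anbn` and `in_language` are made up for this illustration and do not come from the slides.

```python
def derive_anbn(n):
    """Return the sentential forms of the derivation of a^n b^n (n > 0)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    forms = ["S"]
    for _ in range(n - 1):
        # Apply S → aSb to the single S in the current form.
        forms.append(forms[-1].replace("S", "aSb"))
    # Finish with S → ab, which removes the last nonterminal.
    forms.append(forms[-1].replace("S", "ab"))
    return forms


def in_language(w):
    """Direct membership test for {a^n b^n | n > 0}."""
    n = len(w) // 2
    return n > 0 and w == "a" * n + "b" * n


print(" → ".join(derive_anbn(3)))
```

With n = 3 this reproduces the third derivation listed above, S → aSb → aaSbb → aaabbb.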

Parse Trees:- A parse tree is a graphical representation of the derivation of a sentential form. Tree nodes represent symbols of the grammar (nonterminals or terminals) and tree edges represent derivation steps.
Example: Given the following grammar
E → E+E | E-E | E*E | E/E | -E | (E) | id
is the string -(id+id) a sentence in this grammar? Yes, because there is the following derivation:
E → -E → -(E) → -(E + E) → -(id + E) → -(id + id)
Let's examine this derivation by generating parse trees below:

Parse Trees [figure: step-by-step parse trees for the derivation of -(id + id)]

Note:
1. The symbol → reads "derives in one step".
2. This is a top-down derivation because we start building the parse tree at the top.
3. A parse tree ignores variations in the order in which symbols in sentential forms are replaced.
4. These variations in the order in which productions are applied can also be eliminated by considering only leftmost or only rightmost derivations.
5. It is not hard to see that every parse tree has associated with it a unique leftmost and a unique rightmost derivation.

Leftmost & rightmost derivations: Example: according to the following grammar:
E → E + E | E * E | id
find the derivations of the string id + id * id.

Starting with E → E + E:
LMD: E → E + E → id + E → id + E * E → id + id * E → id + id * id
RMD: E → E + E → E + E * E → E + E * id → E + id * id → id + id * id

Starting with E → E * E:
LMD: E → E * E → E + E * E → id + E * E → id + id * E → id + id * id
RMD: E → E * E → E * id → E + E * id → E + id * id → id + id * id

There are two parse trees (shown in example-1 above) for the same string under the same grammar, therefore this grammar is ambiguous.
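The ambiguity can also be checked mechanically. The sketch below is an illustration only (the name `count_parse_trees` is an assumption, not from the slides): it counts the parse trees a token list has under E → E + E | E * E | id by trying every operator as the root of the tree.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of tokens under E → E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways E can derive tokens[lo:hi].
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        for i in range(lo + 1, hi - 1):
            if tokens[i] in ("+", "*"):
                # tokens[i] is the operator at the root: E → E op E.
                total += count(lo, i) * count(i + 1, hi)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "*", "id"]))  # prints 2
```

A count greater than 1 for any sentence means the grammar is ambiguous; here the two trees correspond to rooting the tree at + and rooting it at *.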

Example: S → aS | Sa | a, W = aa. The string aa has two parse trees, one for the derivation S → aS → aa and one for S → Sa → aa.
Ex: S → aSbS | bSaS | ε, W = abab
Ex: R → R + R | RR | R* | a | b | c, W = a+bc

Leftmost & rightmost derivations: Example: G = ({S}, {a, b}, P, S), where
S → SS | aSb | ε
The string abaabb:
LMD: S → SS → aSbS → abS → abaSb → abaaSbb → abaabb
RMD: S → SS → SaSb → SaaSbb → Saabb → aSbaabb → abaabb
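A derivation like the ones for abaabb can be replayed step by step to check it. The sketch below is illustrative: the helper name `replay` and the convention that uppercase letters are nonterminals and the empty string stands for ε are assumptions made for this example.

```python
def replay(start, productions, leftmost=True):
    """Apply productions to the leftmost (or rightmost) nonterminal.

    productions is a list of (lhs, rhs) pairs; rhs == "" stands for ε.
    Returns the list of sentential forms, starting with `start`.
    """
    form = start
    forms = [form]
    for lhs, rhs in productions:
        positions = [i for i, ch in enumerate(form) if ch.isupper()]
        if not positions:
            raise ValueError("no nonterminal left to expand")
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"cannot apply {lhs} → {rhs} to {form}")
        form = form[:i] + rhs + form[i + 1:]
        forms.append(form)
    return forms


LMD = [("S", "SS"), ("S", "aSb"), ("S", ""), ("S", "aSb"), ("S", "aSb"), ("S", "")]
RMD = [("S", "SS"), ("S", "aSb"), ("S", "aSb"), ("S", ""), ("S", "aSb"), ("S", "")]

print(" → ".join(replay("S", LMD, leftmost=True)))
print(" → ".join(replay("S", RMD, leftmost=False)))
```

Both runs end in abaabb; they use the same multiset of productions and differ only in the order in which the nonterminals are expanded.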

Parse Tree:

Example: Given the following grammar
E → E+E | E*E | (E) | -E | id
find the derivation for the expression id + id * id. Which derivation tree is correct?

Example note: Which derivation tree is correct?
- According to the grammar, both are correct.
- Regular expressions (REs) are most useful for describing the structure of lexical constructs such as identifiers, constants, keywords, etc. Grammars, on the other hand, are most useful for describing nested structures such as balanced parentheses, matching begin-end pairs, and corresponding if-then-else constructs. These nested structures cannot be described by REs.
- A grammar that produces more than one parse tree for some input sentence is said to be an ambiguous grammar. In an ambiguous grammar a sentence can have more than one leftmost derivation and more than one rightmost derivation.
- For certain types of parsers, it is desirable that the grammar be made unambiguous, for if it is not, we cannot uniquely determine which parse tree to select for a sentence.

Which derivation tree is correct?
- If there is an identifier between two operators, which operator is done first: (id + id) * id or id + (id * id)?
- To answer this question: according to the priority of operations, * is done before +, so the tree that groups the expression as id + (id * id) is correct; this is the first tree (a). Exercise: check id + id + id.
- A compiler needs an unambiguous grammar, so the ambiguous grammar must be converted into an unambiguous one. This is done by rewriting it with left-recursive productions.

Left Recursion: A grammar that has at least one production of the form A → Aα is a left-recursive grammar.
Ex: Given the following ambiguous grammar
E → E+E | E*E | (E) | id
the ambiguity can be eliminated by rewriting it as
E → E+T | T
T → T*F | F
F → (E) | id
Note: if the recursive nonterminal is on the left of the operator, the production is left recursive, and if it is on the right of the operator, the production is right recursive.
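To see that the E / T / F grammar fixes both priority and grouping, here is a small sketch of a parser for it that returns the tree as nested tuples. It is an illustration under stated assumptions: the name `parse_expr` and the tuple representation are made up for this example, and the left-recursive productions E → E+T and T → T*F are implemented with loops, which is a common hand-written substitute rather than a direct transcription of the grammar.

```python
import re


def parse_expr(text):
    """Parse text with E → E+T | T, T → T*F | F, F → (E) | id."""
    # Simplification: characters that are not tokens are silently skipped.
    tokens = re.findall(r"id|[+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {peek()!r}")
        pos += 1

    def factor():                      # F → ( E ) | id
        if peek() == "(":
            expect("(")
            node = expr()
            expect(")")
            return node
        expect("id")
        return "id"

    def term():                        # T → T * F | F, as a loop
        node = factor()
        while peek() == "*":
            expect("*")
            node = ("*", node, factor())
        return node

    def expr():                        # E → E + T | T, as a loop
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse_expr("id + id * id"))   # ('+', 'id', ('*', 'id', 'id'))
print(parse_expr("id + id + id"))   # ('+', ('+', 'id', 'id'), 'id')
```

Each input now has exactly one tree: * binds tighter than + because T sits below E in the grammar, and repeated + groups to the left because E → E+T is left recursive.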

Left Recursion: Left recursion causes difficulty when designing a parser. A top-down parser might loop forever when parsing an expression using a left-recursive grammar. Left recursion can be eliminated by rewriting the grammar and introducing a new nonterminal symbol: replace
A → Aα | β
with
A → βA'
A' → αA' | ε

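Elimination of Left Recursion: The productions A → Aα | β generate the strings β(α)*, that is, a β followed by zero or more α's. The right-recursive productions A → βA', A' → αA' | ε generate exactly the same strings, so they can replace the left-recursive ones.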
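The rewriting rule can be sketched as a short function. This is an illustration only: the name `eliminate_left_recursion`, the representation of each alternative as a list of symbols (with the empty list standing for ε), and the restriction to immediate left recursion with non-empty α are all assumptions made for the example.

```python
def eliminate_left_recursion(nonterminal, alternatives):
    """Rewrite A → Aα1 | ... | β1 | ... as A → βA', A' → αA' | ε.

    alternatives is a list of right-hand sides, each a list of symbols.
    Returns a dict mapping each nonterminal to its new alternatives.
    """
    new = nonterminal + "'"
    # Left-recursive alternatives A → Aα contribute their tail α.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # All other alternatives are the β's.
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: list(alternatives)}   # nothing to do
    return {
        nonterminal: [beta + [new] for beta in betas],
        new: [alpha + [new] for alpha in alphas] + [[]],   # [] is ε
    }


print(eliminate_left_recursion("E", [["E", "+", "T"], ["T"]]))
print(eliminate_left_recursion("T", [["T", "*", "F"], ["F"]]))
```

Applied to E → E+T | T, the result reads E → T E' and E' → + T E' | ε; applied to T → T*F | F, it reads T → F T' and T' → * F T' | ε.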
