Unit-3 Parsing Theory (Syntax Analyzer) PREPARED BY: PROF. HARISH I RATHOD COMPUTER ENGINEERING DEPARTMENT GUJARAT POWER ENGINEERING & RESEARCH INSTITUTE.


1 Unit-3 Parsing Theory (Syntax Analyzer) PREPARED BY: PROF. HARISH I RATHOD COMPUTER ENGINEERING DEPARTMENT GUJARAT POWER ENGINEERING & RESEARCH INSTITUTE COMPILER DESIGN (170701)

2 Introduction Syntax analysis is the second phase of a compiler, following lexical analysis. It checks that the program conforms to the syntax of the language: it takes the tokens produced by the lexical analyzer and groups them so that programming structures can be recognized. GPERI – CD - UNIT-3

3 Introduction If the grouped tokens do not match any syntactic structure, a syntax error is reported. The syntax analyzer is a major component of the front end of a compiler. The syntax of a programming language is specified using a notation called a context-free grammar.

4 Role of the parser
    It obtains a string of tokens from the lexical analyzer.
    It groups the tokens to identify larger structures in the program.
    It should report any syntax error in the program.
    It should recover from errors so that it can continue to process the rest of the input.

5 Role of the parser [Figure: block diagram with the boxes Source Program, Lexical analyzer, Parser, and Symbol Table, and the edge labels Token, getNextToken, Parse Tree, and Syntax Error.]

6 Context-Free Grammar A grammar involves four quantities: terminals, non-terminals, a start symbol, and productions. One non-terminal is selected as the start symbol. Each production consists of a non-terminal, followed by an arrow (->) or (::=), followed by a string of non-terminals and terminals.

7 Context-Free Grammar A context-free grammar (CFG) is defined as a 4-tuple (V_N, ∑, P, S), where:
    V_N = set of non-terminals
    ∑ = set of terminals
    S = the start symbol
    P = set of production rules, each of the form: one non-terminal -> finite string of terminals and/or non-terminals

8 Context-Free Grammar Example:
    stmt -> if ( expr ) stmt else stmt
 Where:
    Non-terminals: stmt, expr
    Terminals: if, (, ), else
    Start symbol: stmt

9 Context-Free Grammar Example:
    expression -> expression + term
    expression -> expression - term
    expression -> term
    term -> term * factor
    term -> term / factor
    term -> factor

10 Context-Free Grammar Example (continued):
    factor -> ( expression )
    factor -> id
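
The expression grammar from slides 9 and 10 can be written down as a data structure. The following is a minimal sketch, assuming a Python dictionary that maps each non-terminal to its list of production bodies; the names GRAMMAR, START, nonterminals, and terminals are illustrative and do not come from the slides.

```python
# Productions of the expression grammar, keyed by the non-terminal on the left side.
# Each body is a list of grammar symbols.
GRAMMAR = {
    "expression": [["expression", "+", "term"],
                   ["expression", "-", "term"],
                   ["term"]],
    "term": [["term", "*", "factor"],
             ["term", "/", "factor"],
             ["factor"]],
    "factor": [["(", "expression", ")"],
               ["id"]],
}
START = "expression"


def nonterminals(grammar):
    """Non-terminals are exactly the symbols that have productions."""
    return set(grammar)


def terminals(grammar):
    """Terminals are the body symbols that have no productions of their own."""
    symbols = {sym for bodies in grammar.values() for body in bodies for sym in body}
    return symbols - set(grammar)


print(sorted(nonterminals(GRAMMAR)))
print(sorted(terminals(GRAMMAR)))
```

With this representation the four components of the 4-tuple are all recoverable: the dictionary keys give the non-terminals, the remaining body symbols give the terminals, the dictionary itself is the production set, and START names the start symbol.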

11 Context-Free Grammar Notational conventions. Terminal symbols:
    Lowercase letters such as a, b, c.
    Operator symbols such as +, *, -, /.
    Punctuation symbols such as parentheses and commas.
    The digits 0, 1, ..., 9.
    Boldface strings such as id or if, each of which represents a single terminal symbol.

12 Context-Free Grammar Notational conventions. Non-terminal symbols:
    Uppercase letters such as A, B, C.
    The letter S, which, when it appears, is usually the start symbol.
    Lowercase italic names such as expr or stmt.

13 Derivation The construction of a parse tree can be made precise by taking a derivational view, in which productions are treated as rewriting rules. Beginning with the start symbol, each rewriting step replaces a non-terminal by the body of one of its productions. Example grammar:
    E -> E + E | E * E | - E | ( E ) | id

14 Derivation Example grammar:
    list -> list + digit
    list -> list - digit
    list -> digit
    digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

15 Derivation
    list => list + digit
         => list - digit + digit
         => digit - digit + digit
         => 9 - digit + digit
         => 9 - 5 + digit
         => 9 - 5 + 2

16 Derivation This is an example of a leftmost derivation, because the leftmost nonterminal is replaced in each step. Likewise, a rightmost derivation replaces the rightmost nonterminal in each step.
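
The leftmost derivation of 9 - 5 + 2 from slide 15 can be replayed mechanically. The sketch below is one possible way to do it, assuming the list/digit grammar from slide 14 is stored as a dictionary; the helper names leftmost_step and derive are hypothetical.

```python
# The list/digit grammar: each non-terminal maps to its production bodies.
GRAMMAR = {
    "list": [["list", "+", "digit"], ["list", "-", "digit"], ["digit"]],
    "digit": [[str(d)] for d in range(10)],
}


def leftmost_step(form, body):
    """Replace the leftmost non-terminal of the sentential form by `body`."""
    for i, sym in enumerate(form):
        if sym in GRAMMAR:
            if body not in GRAMMAR[sym]:
                raise ValueError(f"{sym} -> {' '.join(body)} is not a production")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no non-terminal left to replace")


def derive(start, bodies):
    """Apply the given production bodies in order, always at the leftmost non-terminal."""
    forms = [[start]]
    for body in bodies:
        forms.append(leftmost_step(forms[-1], body))
    return [" ".join(form) for form in forms]


steps = derive("list", [
    ["list", "+", "digit"],
    ["list", "-", "digit"],
    ["digit"],
    ["9"],
    ["5"],
    ["2"],
])
print(" => ".join(steps))
```

A rightmost derivation would differ only in scanning the sentential form from the right end when choosing which non-terminal to replace.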

17 Derivation Construct a CFG for the language L = {w c w^R : w ∈ (a, b)*}, where w^R denotes the reverse of w. Sol: G = (V_N, ∑, P, S). Here V_N = {S}, ∑ = {a, b, c}, and the production rules P are defined as:
    S -> a S a
    S -> b S b
    S -> c
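
The three productions S -> a S a, S -> b S b, and S -> c translate directly into a recursive membership test. This is a minimal sketch with a hypothetical function name, derives; it peels one matching letter off each end per recursive call, which mirrors how each production adds the same letter on both sides of S, so the part after c is always the mirror image of the part before it.

```python
def derives(s):
    """Return True if S derives the string s using S -> a S a | b S b | c."""
    if s == "c":                                   # S -> c
        return True
    if len(s) >= 3 and s[0] == s[-1] and s[0] in "ab":
        return derives(s[1:-1])                    # S -> a S a  or  S -> b S b
    return False


print(derives("abcba"))
print(derives("abcab"))
```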

18 Parse Tree A string generated by a context-free grammar can be represented by a hierarchical structure called a tree. Trees that represent derivations are called derivation trees, parse trees, or syntax trees.

19 Parse Tree Characteristics of a parse tree:
    The root of the tree is labeled by the start symbol.
    Each leaf of the tree is labeled by a terminal (a token) or by ε.
    Each interior node is labeled by a nonterminal.
    If a node labeled A is expanded by the production A -> X1 X2 ... Xn, its immediate children are X1, X2, ..., Xn, where each Xi is a terminal, a nonterminal, or ε (ε denotes the empty string).

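20 Parse Tree - Example [Figure: parse tree for 9 - 5 + 2. The root list has the children list, +, and digit (2). The inner list has the children list, -, and digit (5). The innermost list has the single child digit (9).]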

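21 Exercise Write a CFG that generates strings having an equal number of a's and b's. Sol: CFG G = (V_N, ∑, P, S), where V_N = {S}, ∑ = {a, b}, and P is defined as:
    S -> aSb
    S -> bSa
    S -> ε
 Every string derived by these productions has equally many a's and b's, but some such strings (for example, abba) are not derivable from them; adding the production S -> SS covers all of them.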

22 Exercise Construct a CFG for the language L = {a^n b^n : n >= 1}. Sol: CFG G = (V_N, ∑, P, S), where V_N = {S}, ∑ = {a, b}, and P is defined as:
    S -> aSb
    S -> ab
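
To see why these two productions yield exactly n a's followed by n b's, it helps to list the sentential forms. The sketch below, with the hypothetical helper derive_anbn, applies S -> aSb a total of n - 1 times and then finishes with S -> ab.

```python
def derive_anbn(n):
    """Return the sentential forms of the derivation of a^n b^n (n >= 1)."""
    if n < 1:
        raise ValueError("the grammar only derives a^n b^n for n >= 1")
    forms = ["S"]
    for _ in range(n - 1):
        forms.append(forms[-1].replace("S", "aSb"))   # apply S -> aSb
    forms.append(forms[-1].replace("S", "ab"))        # finish with S -> ab
    return forms


print(" => ".join(derive_anbn(3)))
```

Each application of S -> aSb adds one a on the left and one b on the right, so the counts stay equal and the a's always precede the b's.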

23 Exercise Write a CFG that generates strings of balanced parentheses. Sol: the grammar accepts strings in which left and right parentheses are balanced, e.g. (), ((())). CFG G = (V_N, ∑, P, S), where V_N = {S}, ∑ = { (, ) }, and P is given by:
    S -> SS
    S -> (S)
    S -> ε

24 Exercise A CFG is given by the productions:
    S -> a | a A S
    A -> b S
 Obtain the derivation tree of the word a b a a b a a.

25 Exercise Given the grammar G = (V_N, ∑, P, S), where V_N = {E}, S = E, ∑ = {id, +, *, (, )}, and P consists of
    E -> E + E | E * E | (E) | id
 obtain the derivation trees for id * id + id and (id + id) * id.

26 Ambiguity A grammar is said to be ambiguous if there exists more than one parse tree for the same sentence. Example:
    S -> aSbS | bSaS | ε
 The string "abab" has two different parse trees.
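
The claim about "abab" can be checked by counting parse trees. The following sketch is specific to the grammar S -> aSbS | bSaS | ε and uses a hypothetical function name, count_trees. For a string that starts with a, the root must use S -> aSbS, so the function tries every later b as the matching symbol and multiplies the counts for the two inner S subtrees; the case of a leading b is symmetric.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_trees(s):
    """Number of distinct parse trees for s under S -> a S b S | b S a S | ε."""
    if s == "":
        return 1                       # only S -> ε derives the empty string
    first = s[0]
    if first not in "ab":
        return 0
    closer = "b" if first == "a" else "a"
    total = 0
    # Root production is S -> first S closer S: try each position of the closer.
    for i in range(1, len(s)):
        if s[i] == closer:
            total += count_trees(s[1:i]) * count_trees(s[i + 1:])
    return total


print(count_trees("abab"))
```

A count greater than one for any string is enough to show that the grammar is ambiguous.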

27 Ambiguity A classic example of an ambiguous grammar is the if-then-else construct of many programming languages. Most languages have both if-then and if-then-else versions of the statement. The grammar rules for it are as follows:
    stmt -> if condition then stmt else stmt
          | if condition then stmt

28 Ambiguity Consider the following code segment:
    if a>b then if c>d then x=y else x=z

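29 Ambiguity Leftmost derivation [Figure: parse tree in which the root stmt expands to if condition then stmt else stmt, with condition a>b and else branch x=z; the inner stmt expands to if condition then stmt, with condition c>d and body x=y. The else is attached to the outer if.]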

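30 Ambiguity Rightmost derivation [Figure: parse tree in which the root stmt expands to if condition then stmt, with condition a>b; the inner stmt expands to if condition then stmt else stmt, with condition c>d, body x=y, and else branch x=z. The else is attached to the inner if.]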

31 Eliminating Ambiguity Ambiguities may be eliminated by rewriting the grammar. The if-then-else grammar may be rewritten in terms of matched (m_stmt) and unmatched (unm_stmt) statements:
    stmt -> m_stmt | unm_stmt
    m_stmt -> if condition then m_stmt else m_stmt | other_stmt
    unm_stmt -> if condition then stmt
              | if condition then m_stmt else unm_stmt

32 Eliminating Ambiguity Another technique is to modify the language slightly. Many languages require that every if have a matching endif. The grammar is then modified as:
    stmt -> if condition then stmt else stmt endif
          | if condition then stmt endif

33 Eliminating Ambiguity Example grammar:
    E -> I
    E -> E + E
    E -> E * E
    E -> (E)
    I -> a | b | c
 This grammar is ambiguous for two reasons:
    1. The precedence of the operators is not respected.
    2. A sequence of identical operators can group either from the left or from the right.
 If the grammar is rewritten so that precedence and grouping are fixed, the ambiguity is removed.

34 Eliminating Ambiguity The unambiguous grammar:
    E -> T | E + T
    T -> F | T * F
    F -> I | (E)
    I -> a | b | c

35-41 Eliminating Ambiguity The parse tree for a + b * c, built up step by step across slides 35 to 41. [Figure: the root E expands to E + T. The left E derives a through E -> T -> F -> I -> a. The right T expands to T * F, where the inner T derives b through T -> F -> I -> b and the F derives c through F -> I -> c.]

42 Left Recursion A grammar is left recursive if it has a nonterminal A such that A derives Aα, for some string α, in one or more steps. Left recursion creates difficulties when designing parsers; in particular, a top-down parser can loop forever on a left-recursive grammar. Types of left recursion:
    Immediate left recursion
    General left recursion

43 Left Recursion Immediate left recursion: it occurs when a nonterminal A has a production rule of the form A -> Aα. In other words, a production is immediately left recursive if the leftmost symbol on its right side is the same as the non-terminal on its left side.

44 Left Recursion Immediate left recursion (continued): it can be eliminated by introducing a new nonterminal symbol, say A'. The rule A -> Aα | β, where β does not begin with A, is replaced by:
    A -> βA'
    A' -> αA' | ε

45 Left Recursion Immediate left recursion (continued): in general, the rule
    A -> Aα1 | Aα2 | ... | Aαm | β1 | β2 | ... | βn
 is replaced by
    A -> β1A' | β2A' | ... | βnA'
    A' -> α1A' | α2A' | ... | αmA' | ε
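
The rewriting rule for immediate left recursion can be sketched as a small function. This is an illustrative version, not taken from the slides: production bodies are assumed to be non-empty lists of symbols, the empty string is assumed to be written as the marker "ε", and the new non-terminal is named by appending an apostrophe. The names EPSILON and eliminate_immediate are hypothetical.

```python
EPSILON = "ε"  # assumed marker for the empty string


def eliminate_immediate(nt, bodies):
    """Rewrite A -> A α1 | ... | A αm | β1 | ... | βn without immediate left recursion.

    Returns a dict holding the productions for A and, if needed, for A'.
    """
    alphas = [body[1:] for body in bodies if body[0] == nt]   # the α parts
    betas = [body for body in bodies if body[0] != nt]        # the β parts
    if not alphas:                                            # nothing to rewrite
        return {nt: bodies}
    new_nt = nt + "'"
    return {
        # A -> β A'   (an ε body contributes just A')
        nt: [[s for s in beta if s != EPSILON] + [new_nt] for beta in betas],
        # A' -> α A' | ε
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[EPSILON]],
    }


for head, bodies in eliminate_immediate("E", [["E", "+", "T"], ["T"]]).items():
    print(head, "->", " | ".join(" ".join(body) for body in bodies))
```

Applied to E -> E + T | T, the function yields E -> T E' together with E' -> + T E' | ε.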

46 Left Recursion Immediate left recursion (continued): example. The grammar
    E -> E + T | T
    T -> T * F | F
    F -> (E) | id
 becomes
    E -> TE'
    E' -> +TE' | ε
    T -> FT'
    T' -> *FT' | ε
    F -> (E) | id
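
One way to see the benefit of the transformed grammar is that each non-terminal can become a plain recursive function that always consumes a token before recursing. The sketch below is a hypothetical recursive-descent recognizer for the grammar E -> TE', E' -> +TE' | ε, T -> FT', T' -> *FT' | ε, F -> (E) | id. It assumes the input is already a list of token strings, and the function name parse is illustrative. Writing E the same way for the original rule E -> E + T would make E call itself immediately and never terminate.

```python
def parse(tokens):
    """Return True if the token list is a sentence of the transformed grammar."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at position {pos}, found {peek()!r}")
        pos += 1

    def E():                 # E -> T E'
        T()
        E_prime()

    def E_prime():           # E' -> + T E' | ε
        if peek() == "+":
            expect("+")
            T()
            E_prime()

    def T():                 # T -> F T'
        F()
        T_prime()

    def T_prime():           # T' -> * F T' | ε
        if peek() == "*":
            expect("*")
            F()
            T_prime()

    def F():                 # F -> ( E ) | id
        if peek() == "(":
            expect("(")
            E()
            expect(")")
        else:
            expect("id")

    try:
        E()
        return pos == len(tokens)    # all input must be consumed
    except SyntaxError:
        return False


print(parse("id + id * id".split()))
print(parse("id +".split()))
```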

47 Left Recursion General left recursion: even when there is no immediate left recursion, several production rules may act together to give left recursion. For example:
    S -> Aa
    A -> Sb | c
 Here S is left recursive, because S => Aa => Sba.

48 Left Recursion Algorithm to eliminate left recursion:
    1. Arrange the non-terminals in some order, say A1, A2, ..., Am.
    2. For i = 1 to m do:
         for j = 1 to i-1 do:
           for each production Ai -> Aj γ, where Aj -> δ1 | δ2 | ... | δk are the current Aj productions,
             replace Ai -> Aj γ by Ai -> δ1γ | δ2γ | ... | δkγ
         eliminate immediate left recursion among the Ai productions.

49 Left Recursion Example:
    S -> Aa
    A -> Sb | c
 Take the order of non-terminals to be S, A.
 For i = 1, the rule S -> Aa has no immediate left recursion.
 For i = 2, A -> Sb | c is modified to A -> Aab | c, which has immediate left recursion. It is eliminated by rewriting the rule as:
    A -> cA'
    A' -> abA' | ε
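
The worked example can be reproduced with a short program. This is a sketch under stated assumptions rather than a definitive implementation: the grammar is a dictionary from non-terminals to lists of bodies, every body is a non-empty list of symbols, the input grammar has no ε-productions and no cycles, and the names EPSILON, eliminate_immediate, and eliminate_left_recursion are hypothetical. Immediate left recursion is removed for each non-terminal right after its substitutions, matching the order of steps in the example.

```python
EPSILON = "ε"  # assumed marker for the empty string


def eliminate_immediate(nt, bodies):
    """Replace A -> Aα | β by A -> βA' and A' -> αA' | ε."""
    alphas = [body[1:] for body in bodies if body[0] == nt]
    betas = [body for body in bodies if body[0] != nt]
    if not alphas:
        return {nt: bodies}
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[EPSILON]],
    }


def eliminate_left_recursion(grammar, order):
    """Remove left recursion, processing the non-terminals in the given order."""
    result = {nt: [list(body) for body in bodies] for nt, bodies in grammar.items()}
    for i, a_i in enumerate(order):
        for a_j in order[:i]:
            expanded = []
            for body in result[a_i]:
                if body[0] == a_j:
                    # Replace A_i -> A_j γ by A_i -> δ γ for every A_j -> δ.
                    expanded.extend(delta + body[1:] for delta in result[a_j])
                else:
                    expanded.append(body)
            result[a_i] = expanded
        # Remove immediate left recursion among the A_i productions.
        result.update(eliminate_immediate(a_i, result[a_i]))
    return result


grammar = {"S": [["A", "a"]], "A": [["S", "b"], ["c"]]}
for head, bodies in eliminate_left_recursion(grammar, ["S", "A"]).items():
    print(head, "->", " | ".join(" ".join(body) for body in bodies))
```

For the order S, A this produces S -> A a, A -> c A', and A' -> a b A' | ε, in agreement with the result derived by hand.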

