
1 – 3 – Syntax Analysis ( Parsing )

2 Objectives
To understand Context-Free Grammars & writing a grammar
Parsing – top-down, bottom-up & operator precedence
The role of a parser
LR parsers & parser generators
Semantic analysis

3 Syntax Analysis & Parsing
Just to recapitulate the definitions:
Syntax: the way in which words are strung together to form phrases, clauses and sentences
Syntax analysis: the task concerned with fitting a sequence of tokens into a specified syntax
Parsing: to break a sentence down into its component parts of speech, with an explanation of the form, function, and syntactic relationship of each part

4 Syntax Analysis
Step two of the front end in compilation
Every programming language has RULES that prescribe the syntactic structure (i.e. the way in which words and phrases can be organized to form sentences)
These rules can be formally specified using CONTEXT-FREE GRAMMARS (or GRAMMAR for short), a.k.a. (also known as) BACKUS–NAUR FORM (BNF)

5 Intro to Context Free Grammar (CFG)
An essential formalism for describing the structure of programs in a programming language
A recipe for constructing elements of a set of strings of symbols (or tokens)
Consists of a set of production rules and a start symbol
Each production rule defines a named syntactic construct using two parts:
an LHS (naming a syntactic construct) and
an RHS (showing a possible form of the construct),
connected by an arrow symbol (meaning "is defined as")
Example: sentence → subject predicate '.'

6 Importance of the Grammar
Grammar offers significant advantages to both language designers and compiler writers:
Gives a precise, yet easy-to-understand, syntactic specification of programming languages
Helps construct efficient parsers automatically (not in all cases, of course)
Provides a good structure to the language, which in turn helps generate correct object code
Makes the language open-ended: easily amenable to adding additional constructs at a later date

7 Context Free Grammar – Formal Definition
A context-free grammar G = (T, N, S, P) consists of:
1. T, a set of terminals (scanner tokens – symbols that may not appear on the left side of a rule)
2. N, a set of nonterminals (syntactic variables generated by productions – symbols on the left or right of a rule)
3. S, a designated start nonterminal
4. P, a set of productions (rules); each production has the form A ::= α, where A is a nonterminal and α is a sentential form, i.e. a string of zero or more grammar symbols (terminals / nonterminals)
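The four-tuple above can be written down directly as data. A minimal Python sketch (the grammar chosen for illustration is the expression grammar used on the following slides; the dictionary layout is our own convention, not from the slides):

```python
# A context-free grammar G = (T, N, S, P) as plain Python data.
# P maps each nonterminal to a list of alternatives; each alternative
# is a tuple of grammar symbols (terminals and/or nonterminals).
grammar = {
    "T": {"id", "num", "(", ")", "+", "-", "*", "/"},   # terminals
    "N": {"expr", "op"},                                # nonterminals
    "S": "expr",                                        # start symbol
    "P": {
        "expr": [("expr", "op", "expr"), ("(", "expr", ")"),
                 ("-", "expr"), ("id",), ("num",)],
        "op":   [("+",), ("-",), ("*",), ("/",)],
    },
}

# Sanity checks matching the formal definition: every LHS is a
# nonterminal, and every RHS symbol is a terminal or a nonterminal.
assert all(a in grammar["N"] for a in grammar["P"])
assert all(sym in grammar["T"] | grammar["N"]
           for alts in grammar["P"].values()
           for alt in alts for sym in alt)
```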

8 CFG – An Example
A grammar derives strings by
beginning with the start symbol and
repeatedly replacing a nonterminal by the right side of a production rule for that nonterminal
Example:
expr → expr op expr
expr → ( expr )
expr → – expr
expr → id | num
op → + | – | * | / | ↑
The terminal symbols are id, num, (, ), +, –, *, /, ↑
The nonterminal symbols are expr, op
The start symbol is expr

9 CFG – Another Example
What are the token classes of this language? It can be shown that the syntax conforms to the grammar. (The example itself appeared as a slide figure.)

10 Notational Conventions
These symbols are terminals:
Lowercase letters early in the alphabet (like a, b, c)
Operator symbols such as +, – etc.
Punctuation symbols like , (comma), (, ), etc.
The digits 0, 1, …, 9
Boldface strings such as if, id etc.
These symbols are nonterminals:
Uppercase letters early in the alphabet (like A, B, C)
The letter S, when it appears, is usually the start symbol
Lowercase letters late in the alphabet, chiefly u, v, …, z, represent strings of terminals
………. Contd.

11 Notational Conventions
Uppercase letters late in the alphabet, such as X, Y, Z, represent grammar symbols, i.e. either terminals or nonterminals
Lowercase Greek letters α, β, γ represent strings of grammar symbols
If A → α1, A → α2, …, A → αn are all productions with A on the LHS (known as A-productions), then we may write A → α1 | α2 | … | αn (α1, α2, …, αn are the alternatives of A)
Unless otherwise stated, the LHS of the first production is the start symbol S

12 Use of Conventional Notation – Example
Remember the previous example:
expr → expr op expr
expr → ( expr )
expr → – expr
expr → id | num
op → + | – | * | / | ↑
Using the notations described so far, this grammar can be written as:
E → E A E | ( E ) | – E | id | num
A → + | – | * | / | ↑

13 Grammar for a Tiny Language
program ::= statement | program statement
statement ::= assignStmt | ifStmt
assignStmt ::= id = expr ;
ifStmt ::= if ( expr ) stmt
expr ::= id | int | expr + expr
id ::= a | b | c | i | j | k | n | x | y | z
int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

14 Regular Expressions and Grammars
EVERY construct that can be described by an RE can also be described by a grammar
Example: the RE ( a | b )* abb can be described as:
A0 → aA0 | bA0 | aA1
A1 → bA2
A2 → bA3
A3 → ε
Then why use REs for lexing & CFGs for syntax analysis?
Lexical rules are frequently quite simple & we do not need powerful grammars to describe them
REs are more concise and easier to understand for tokens
REs provide for automatic construction of efficient lexical analyzers
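The claim can be checked mechanically: the sketch below recognizes strings with the right-linear grammar above (walking its productions nondeterministically) and compares the verdict with Python's `re` module on all short strings. Names are illustrative.

```python
import re
from itertools import product

# Right-linear grammar for (a|b)*abb, as on the slide.
PROD = {
    "A0": [("a", "A0"), ("b", "A0"), ("a", "A1")],
    "A1": [("b", "A2")],
    "A2": [("b", "A3")],
    "A3": [()],                    # A3 -> epsilon
}

def derives(nt, s):
    """True if nonterminal nt can derive exactly the string s."""
    for alt in PROD[nt]:
        if alt == ():              # epsilon production
            if s == "":
                return True
        else:
            term, nxt = alt        # one terminal, then one nonterminal
            if s.startswith(term) and derives(nxt, s[1:]):
                return True
    return False

# Grammar and regular expression agree on every string up to length 6.
for n in range(7):
    for tup in product("ab", repeat=n):
        s = "".join(tup)
        assert derives("A0", s) == bool(re.fullmatch(r"(a|b)*abb", s))
```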

15 Derivation
A way of showing how an input sentence is recognized with a grammar
Can also be thought of as a type of algebraic substitution using the grammar
Remember the grammar:
expr → expr op expr
expr → ( expr )
expr → – expr
expr → id | num
op → + | – | * | / | ↑
Now let us derive a structure for ctr + ( a – 5 ) / 2 :
expr
⇒ expr op expr
⇒ expr + expr
⇒ expr + expr op expr
⇒ expr + expr / expr
⇒ expr + expr / num
⇒ expr + ( expr ) / num
⇒ expr + ( expr – expr ) / num
⇒ … ⇒ id + ( id – num ) / num
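The substitution view of derivation can be made mechanical: each step rewrites one nonterminal by the RHS of a chosen production. A hedged Python sketch of leftmost rewriting (the production numbering is our own, not from the slides):

```python
# The slide's expression grammar; each nonterminal maps to a list of
# alternatives, and an alternative is a list of grammar symbols.
PRODUCTIONS = {
    "expr": [["expr", "op", "expr"], ["(", "expr", ")"],
             ["-", "expr"], ["id"], ["num"]],
    "op":   [["+"], ["-"], ["*"], ["/"]],
}

def rewrite_leftmost(form, choice):
    """One derivation step: replace the leftmost nonterminal in `form`
    with alternative number `choice` of its productions."""
    for i, sym in enumerate(form):
        if sym in PRODUCTIONS:
            return form[:i] + PRODUCTIONS[sym][choice] + form[i + 1:]
    raise ValueError("no nonterminal left to rewrite")

# Derive the sentence id + id from the start symbol expr:
form = ["expr"]
form = rewrite_leftmost(form, 0)   # expr => expr op expr
form = rewrite_leftmost(form, 3)   # => id op expr
form = rewrite_leftmost(form, 0)   # => id + expr   (op => +)
form = rewrite_leftmost(form, 3)   # => id + id
assert form == ["id", "+", "id"]
```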

16 Derivations – Some definitions
Uses productions to derive more complex expressions
The symbol ⇒ denotes "derives in one step"
⇒* denotes "derives in zero or more steps"
⇒+ denotes "derives in one or more steps"
Given a grammar G with start symbol S, a string ω is part of the language L(G) if and only if S ⇒+ ω
Further, if S ⇒* α and α contains
only terminals, then α is a sentence of G
nonterminals too, then α is a sentential form of G

17 Derivation – Another Example
Consider the following grammar for a simple ';'-terminated expression consisting of interspersed numbers and +:
1: stmt → expr ';'
2: expr → factor '+' factor
3:      | factor
4: factor → number
Let us try to come up with a derivation to prove that an input sentence (like number + number ;) can be recognized from this grammar:
stmt ⇒ expr ; ⇒ factor + factor ; ⇒ number + factor ; ⇒ number + number ;
The process is called a Leftmost Derivation because the leftmost nonterminal is always replaced

18 Leftmost Derivation
Start with the topmost symbol in the grammar
Replace the leftmost nonterminal in the partially parsed sentence with the equivalent production's RHS in each step
The operator ⇒L signifies leftmost derivation: X ⇒L Y means X derives Y by leftmost derivation
Each of the steps (partial derivations) is called a Viable Prefix
The portion of the viable prefix that is replaced at each step is called the Handle

19 Leftmost Derivation Example


21 Rightmost Derivation
Replace the rightmost nonterminal in the partially parsed sentence
The operator ⇒R signifies rightmost derivation: X ⇒R Y means X derives Y by rightmost derivation
It is possible to start with the start symbol and do rightmost derivation

22 Rightmost Derivation – Another Example


24 Parse Tree
WHAT? A graphical representation of a derivation sequence, showing how to derive a string of the language from the grammar, starting from the start symbol
WHY? Each stage of the compiler has two purposes:
Detect and filter out some class of errors
Compute some new information or translate the representation of the program to make things easier for later stages
The recursive structure of the tree suits the recursive structure of the language definition

25 Parse Tree (Contd.)
HOW? Tree nodes represent symbols of the grammar (nonterminals or terminals) and tree edges represent derivation steps
Example: given the production expr → expr op expr, the corresponding parse tree has expr at the root with children expr, op, expr
Its properties are:
The root is labeled by the start symbol
Each leaf is labeled by a token or by ε
Each interior node is labeled by a nonterminal

26 Parse Tree Example
Remember the grammar:
expr → expr op expr
expr → ( expr )
expr → – expr
expr → id | num
op → + | – | * | / | ↑
and the string derived from it: ctr + ( a – 5 ) / 2
Using the derivation:
expr ⇒ expr + expr
⇒ expr + expr / expr
⇒ expr + ( expr ) / expr
⇒ expr + ( expr – expr ) / expr
⇒ … ⇒ id + ( id – num ) / num
the corresponding parse tree has expr at the root, with subtrees for id (ctr), +, the parenthesized difference of id (a) and num (5), /, and num (2)

27 Parse Tree Example (Continued)
The Yield of the tree:
The string derived or generated from the nonterminal at the root of the tree
Obtained by reading the leaves of the parse tree from left to right
In this example the yield is ctr + ( a – 5 ) / 2
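Reading the leaves left to right is a short tree walk. A sketch in Python (the nested-tuple encoding of the parse tree is an assumption for illustration):

```python
# A parse tree as (label, child, child, ...); a leaf is a bare token string.
tree = ("expr",
        ("expr", "ctr"),
        ("op", "+"),
        ("expr",
         ("expr", "(", ("expr",
                        ("expr", "a"), ("op", "-"), ("expr", "5")), ")"),
         ("op", "/"),
         ("expr", "2")))

def tree_yield(node):
    """Concatenate the leaves of the parse tree, left to right."""
    if isinstance(node, str):          # a leaf: a terminal / token
        return [node]
    label, *children = node            # interior node: a nonterminal
    return [tok for child in children for tok in tree_yield(child)]

assert tree_yield(tree) == ["ctr", "+", "(", "a", "-", "5", ")", "/", "2"]
```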

28 Parse Tree – Another Example

29 Another Derivation Example
Given the grammar: E → E + E | E * E | ( E ) | – E | num
Find a derivation for the expression: 6 + 2 * 4
One derivation tree groups the expression as 6 + ( 2 * 4 ) = 14; another groups it as ( 6 + 2 ) * 4 = 32
Which derivation tree is correct?
According to the above grammar, BOTH are correct!!!

30 Another Derivation Example (Contd.)
A grammar that produces more than one parse tree for some input sentence is said to be an ambiguous grammar.
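The two readings of 6 + 2 * 4 are easy to exhibit in code: build both parse trees by hand and evaluate them (illustrative Python; nothing in the grammar itself picks either tree):

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def evaluate(node):
    """Evaluate a parse tree: a leaf is a number, an inner node is
    (op, left_subtree, right_subtree)."""
    if isinstance(node, int):
        return node
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))

# Two parse trees for the same sentence 6 + 2 * 4:
tree_a = ("+", 6, ("*", 2, 4))     # groups as 6 + (2 * 4)
tree_b = ("*", ("+", 6, 2), 4)     # groups as (6 + 2) * 4

assert evaluate(tree_a) == 14
assert evaluate(tree_b) == 32
```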

31 Ambiguity in Grammar
A grammar that produces more than one parse tree for some input sentence – equivalently, more than one leftmost (or rightmost) derivation for it
Can happen because of the way the grammar is defined
Classic example – the if-then-else problem:
Stmt → if Expr then Stmt
     | if Expr then Stmt else Stmt
     | … other stmts …

32 Ambiguous IF
Stmt → if Expr then Stmt
     | if Expr then Stmt else Stmt
     | … other stmts
This sentential form has two derivations:
if Expr1 then if Expr2 then Stmt1 else Stmt2


34 Removing the ambiguity
Add precedence; for example, rewrite the grammar with one level per precedence (the rewritten grammar appeared as a slide figure). This grammar is slightly larger:
Takes more rewriting to reach some of the terminal symbols
Encodes the expected precedence
Produces the same parse tree under leftmost & rightmost derivations
Let's see how it parses x - 2 * y
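The rewritten precedence-encoding grammar itself is not visible in this transcript (it was on the slide figure); the conventional layered form is assumed below, together with a small parser sketch showing that x - 2 * y groups the multiplication first:

```python
# Assumed layered grammar (the standard precedence-encoding form):
#   Expr   -> Expr + Term | Expr - Term | Term
#   Term   -> Term * Factor | Term / Factor | Factor
#   Factor -> ( Expr ) | id | num
# Left recursion is realized below as iteration, one function per level.

def parse_expr(toks):
    tree, toks = parse_term(toks)
    while toks and toks[0] in "+-":
        op, toks = toks[0], toks[1:]
        right, toks = parse_term(toks)
        tree = (op, tree, right)       # left-associative grouping
    return tree, toks

def parse_term(toks):
    tree, toks = parse_factor(toks)
    while toks and toks[0] in "*/":
        op, toks = toks[0], toks[1:]
        right, toks = parse_factor(toks)
        tree = (op, tree, right)
    return tree, toks

def parse_factor(toks):
    if toks[0] == "(":
        tree, toks = parse_expr(toks[1:])
        return tree, toks[1:]          # skip the closing ')'
    return toks[0], toks[1:]           # id or num leaf

tree, rest = parse_expr(["x", "-", "2", "*", "y"])
assert rest == []
assert tree == ("-", "x", ("*", "2", "y"))   # * binds tighter than -
```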

35 Removing the ambiguity (Contd.)

36 Removing the ambiguity (Contd.)
Both the leftmost and rightmost derivations give the same expression, because the grammar directly and explicitly encodes the desired precedence

37 Removing the ambiguity (Contd.)
Rewrite the grammar to avoid generating the problem
Example: the case of if-then-if-then-else:
Match each else to the innermost unmatched if (common-sense rule)

38 Removing the ambiguity (Contd.)

39 Removing the ambiguity (Contd.)
Rewriting a grammar to eliminate multiple productions starting with the same token is called left factoring.

40 Ambiguity – A Summary
Ambiguity arises from two distinct sources:
Confusion in the context-free syntax (if-then-else)
Confusion that requires context to resolve (overloading)
Resolving ambiguity:
To remove context-free ambiguity, rewrite the grammar
To handle context-sensitive ambiguity, use precedence; this is a language design problem
Sometimes the compiler writer accepts an ambiguous grammar; parsing techniques must then "do the right thing" (i.e. always select the same derivation)

41 Recursion in Grammars
Very common in grammars
Generally 3 types of recursion are possible:
A ::= Ab (left recursion)
B ::= cB (right recursion)
C ::= dCf (middle recursion or self-embedding)
Facts:
If a grammar contains no middle recursion, then the language it generates is regular
If there is no recursion at all in a grammar, then the language is finite and therefore regular

42 Left Recursion
Consider the grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
A top-down parser might loop forever when parsing an expression using this grammar:
E ⇒ E + T ⇒ E + T + T ⇒ E + T + T + T ⇒ …
A grammar that has at least one production of the form A ⇒+ Aα is a left-recursive grammar
Top-down parsing cannot handle left-recursive grammars

43 Eliminating left Recursion
First group the A-productions as:
A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn
where no βi begins with an A. Then replace the A-productions by:
A → β1A' | β2A' | … | βnA'
A' → α1A' | α2A' | … | αmA' | ε
This eliminates all immediate left recursion from the A and A' productions, BUT it does not eliminate left recursion involving derivations of two or more steps
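The rule above is mechanical enough to code directly. A Python sketch for immediate left recursion only (the representation is our own; the empty tuple () stands for ε):

```python
def eliminate_left_recursion(nt, alternatives):
    """Split nt's alternatives into left-recursive ones (A -> A alpha_i)
    and the rest (A -> beta_j), then build
        A  -> beta_j A'
        A' -> alpha_i A' | epsilon
    Alternatives are tuples of symbols; () stands for epsilon."""
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nt]
    betas = [alt for alt in alternatives if not alt or alt[0] != nt]
    if not alphas:
        return {nt: alternatives}            # nothing to do
    prime = nt + "'"
    return {
        nt:    [beta + (prime,) for beta in betas],
        prime: [alpha + (prime,) for alpha in alphas] + [()],
    }

# E -> E + T | T   becomes   E -> T E'   and   E' -> + T E' | epsilon
new = eliminate_left_recursion("E", [("E", "+", "T"), ("T",)])
assert new["E"] == [("T", "E'")]
assert new["E'"] == [("+", "T", "E'"), ()]
```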

44 Left Recursion ( Contd.)
Left recursion can often be eliminated by rewriting the grammar.
This left-recursive grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
can be rewritten to eliminate the immediate left recursion:
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id

45 Properties of Grammars – A Summary
Epsilon productions: the grammar has at least one production of the form E → ε
Right (/left) linear grammar: the RHS of every production has at most one nonterminal, and that nonterminal appears rightmost (/leftmost), e.g. E → xE' or E → x
Left-recursive grammar: a grammar with a production such as E → E + X (for example); similarly, right-recursive grammars
Ambiguous grammar: more than one parse tree is possible for a specific sentence

46 Limitations of CFG
CFGs cannot express context conditions, for example:
Every name must be declared before it is used
The operands of an expression must have compatible types
Possible solutions:
Use context-sensitive grammars – too complicated
Check context conditions during semantic analysis

47 Syntax Analysis
Problem statement:
To find a derivation sequence in a grammar G for the input token stream, or
To conclude that none exists
Solution: build a parse tree for the string (of tokens) based on grammar rules
The process is known as PARSING

48 PARSING
Constructs the syntax tree for a given sequence of tokens using grammar rules and productions, such that:
The top node is labeled with the start symbol of the grammar
Leaf nodes are labeled with terminals, and inner nodes with nonterminals
The children of an inner node labeled N correspond to the members of an alternative of N, in the same order as they occur in that alternative
The terminals labeling the leaf nodes correspond to the sequence of tokens, in the same order as they occur in the input
………. Contd.

49 The Role of a Parser
Lexical analysis → PARSER → rest of the front end, all sharing the Symbol Table
The parser continues the process of translation and validation started by the lexical analyzer:
Analyzing the phrase structure of the program,
Adding hierarchical (tree) structure to the flat stream of tokens produced by the lexical analyzer,
Outputting a data structure conveying the program's syntax to subsequent compiler phases, AND
Reporting errors when the program does not match the syntactic structure of the source language

50 Parsing Methods
Three types of parsing methods are in vogue:
Universal parsing methods
Too inefficient to use in any practical compilers (hence not discussed any further)
Ex: Cocke–Younger–Kasami (CYK) algorithm
Top-down parsing
Can be generated automatically or written manually
Ex: left-to-right, top-down parser (a.k.a. LL parser)
Bottom-up parsing
Can only be generated automatically
Ex: left-to-right, bottom-up parser (a.k.a. LR parser)

51 Top – Down & Bottom – Up Methods
While they are efficient, they can work only on restricted classes of grammars (such as LL and LR grammars), BUT these are expressive enough to describe most syntactic constructs in programming languages. SO care must be taken while defining the grammar for the language, to make sure that the grammar is unambiguous.

52 TOP-DOWN Parsing
An attempt to construct a parse tree for an input string from the root, creating the nodes in PRE-ORDER (i.e. the top of the tree is constructed before the nodes below it)
Can also be viewed as an attempt to find a leftmost derivation for an input string
The way it works:
The process starts at the root node, say N
Repeat until the fringe (edge) of the parse tree matches the input string:
Determine the correct alternative for N – the key step
The parser then proceeds to construct the leftmost child of N
The process of determining the correct alternative for the leftmost child continues till the leftmost child is a terminal

53 Top Down parsing example
Consider the grammar:
type → simple | ↑ id | array [ simple ] of type
simple → integer | char | num dotdot num
String to be parsed: array [ num dotdot num ] of integer
The token dotdot ".." is used to stress that the character sequence is treated as a unit
Top-down, type expands to array [ simple ] of type; simple expands to num dotdot num; and the inner type expands via simple to integer

54 Top Down Parsing – Another Case
Consider the grammar:
S → cAd
A → ab | a
String to be parsed: w = cad
Parse tree construction would be:
Start with a single node S
Use the first production to expand S (children c, A, d)
Expand the nonterminal A using the first alternative from rule 2 (A → ab), as we have a match for input token a
But b does not match d; we have to BACKTRACK to step 3 and try the second alternative from rule 2 (A → a)
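The backtracking behaviour walked through above can be sketched in a few lines of Python (the recognizer tries alternatives in order and retreats on failure):

```python
# The slide's grammar: S -> cAd,  A -> ab | a
PROD = {"S": [("c", "A", "d")], "A": [("a", "b"), ("a",)]}

def parse(symbols, s):
    """Can the sequence of grammar symbols derive exactly the string s?
    Tries alternatives in order and backtracks on failure."""
    if not symbols:
        return s == ""
    head, rest = symbols[0], symbols[1:]
    if head in PROD:                  # nonterminal: try each alternative
        return any(parse(alt + rest, s) for alt in PROD[head])
    # terminal: must match the next input character
    return s.startswith(head) and parse(rest, s[1:])

assert parse(("S",), "cad")      # succeeds only after backtracking A -> a
assert parse(("S",), "cabd")     # first alternative A -> ab succeeds
assert not parse(("S",), "cd")
```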

55 Top-Down Parsing – Observations
In general:
The selection of a production for a nonterminal may involve trial and error
We may have to try a production and backtrack to try another production if the first is found to be unsuitable (a production is unsuitable if, after using the production, we cannot complete the tree to match the input string)
If we can make the correct choice by looking at just the next input symbol, we can build a Predictive Parser that can perform a top-down parse without backtracking
Programming language grammars are often suitable for predictive parsing

57 Predictive Parsing
The most common technique used in parsers (particularly TRUE in the case of parsers written manually)
Uses grammars that are carefully written to eliminate left recursion
Further, if many alternatives exist for a nonterminal, the grammar is so defined that the input token will uniquely determine one and only one alternative, which will lead to correct parsing or an error
No question of backtracking and trying another alternative

58 Predictive Parsing Example
stmt  if expr then stmt else stmt | while expr do stmt | begin stmt_list end Consider the grammar: A parser for this grammar can be written with the following simple structure: switch(gettoken()) { case if: …. break; case while: case begin: default: reject input; } Based only on the first token, the parser knows which rule to use to derive a statement, because all the three outcomes are unique even at the first letter. Therefore this is called a Predictive Parser. | Website for Students | VTU - Notes - Question Papers

59 Parsing Table
Represents a grammar as a two-dimensional array M[A, α], where A is a nonterminal and α is the input symbol
Used in the parser as the reference table to decide the possible values (and thereby the choice of production to be used)
Grammar:
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
(The filled parsing table appeared as a slide figure.)

60 A Table-Driven Predictive Parser
The program uses:
An input buffer (containing the string to be parsed, with $ at the end)
A stack (containing a sequence of grammar symbols, with $ at the bottom)
A parsing table, and
An output stream
Initial configuration: INPUT: id + id * id $; STACK (top first): E $; OUTPUT: E → T E'

61–66 The Working of the Table-Driven Predictive Parser
INPUT: id + id * id $; the stack starts (top first) as E $
The parser repeatedly compares the top of the stack with the current input symbol:
E is replaced using E → T E', then T using T → F T', then F using F → id
Action when Top(Stack) = input ≠ $: pop the stack, advance the input (the id on top of the stack matches the input id)
Action when Top(Stack) = ε: pop the stack (here T' → ε, since the next input is +)
E' is replaced using E' → + T E'; the + on top of the stack matches the input +, so pop and advance; parsing continues in the same fashion on the remaining input id * id $

67 A Predictive Parser (Contd.)
The predictive parser proceeds in this fashion, emitting the following productions:
E → T E', T → F T', F → id, T' → ε, E' → + T E', T → F T', F → id, T' → * F T', F → id, T' → ε, E' → ε
When Top(Stack) = input = $, the parser halts & accepts the input string ( id + id * id )

68 Algorithm for the Predictive Parser
The program execution is controlled by TWO inputs: the input symbol a and the symbol on the top of the stack X
There are THREE possibilities for the parser:
If X = a = $: halt & announce the successful completion of parsing
If X = a ≠ $: pop X off the stack and advance the input pointer to the next input symbol
If X is a nonterminal: consult entry M[X, a] in the parsing table M and replace X on top of the stack with the RHS of the production; if M[X, a] is error, call the error routine
It parses from left to right, and does a leftmost derivation
This parser is also known as an LL(1) parser, since it looks ahead 1 symbol to choose its next action
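The three-way case analysis above is a short loop. A Python sketch driving the expression grammar's table (the table literal mirrors the one built on the later FIRST/FOLLOW slides; names are illustrative):

```python
# LL(1) table for: E->TE'  E'->+TE'|e  T->FT'  T'->*FT'|e  F->(E)|id
TABLE = {
    ("E", "id"): ["T", "E'"],  ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],  ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"],       ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens):
    """Return the list of productions used, or raise on a syntax error."""
    stack, toks, output = ["$", "E"], tokens + ["$"], []
    while stack:
        x, a = stack[-1], toks[0]
        if x == a == "$":
            return output                      # accept
        if x == a:                             # terminal match: pop, advance
            stack.pop(); toks = toks[1:]
        elif x in NONTERMINALS and (x, a) in TABLE:
            rhs = TABLE[(x, a)]                # expand via M[X, a]
            output.append((x, rhs))
            stack.pop(); stack.extend(reversed(rhs))
        else:
            raise SyntaxError(f"no rule for ({x}, {a})")

used = ll1_parse(["id", "+", "id", "*", "id"])
assert used[0] == ("E", ["T", "E'"])           # first production applied
```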

70 FIRST & FOLLOW
The two functions represent sets of terminal symbols that aid us in constructing a predictive parser, by helping us fill the parsing table with valid entries
The set of terminal symbols yielded by the FOLLOW function can also be used in error recovery
FIRST – if α is a string of grammar symbols, then FIRST(α) is the set of terminals that begin the strings derived from α (if α ⇒* ε, then ε is also in FIRST(α))
FOLLOW(A) – the set of terminals a that can appear immediately to the right of A, i.e. there exists a derivation of the form S ⇒* αAaβ for some α and β

71 FIRST & FOLLOW (Contd.)
The set of terminal symbols (including ε) that can appear at the far left of any parse tree derived from a particular nonterminal is the FIRST set of that nonterminal
The set of terminal symbols (possibly including the end marker $, but never ε) that can follow a nonterminal in some derivation or other is called the FOLLOW set of that nonterminal
A set of rules is followed to compute the FIRST and FOLLOW sets
These sets will be used in creating the parsing table

72 Rules to Create FIRST
GRAMMAR:
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
FIRST rules:
1. If X is a terminal, FIRST(X) = {X}
2. If X → ε, then ε ∈ FIRST(X)
3. If X → aABC, then a ∈ FIRST(X)
4. If X → ABCD, then FIRST(A) ⊆ FIRST(X)
4a. Further, if A ⇒* ε in the above production, then FIRST(B) ⊆ FIRST(X) [& so on, recursively]
SETS:
FIRST(id) = {id}   FIRST(+) = {+}   FIRST(*) = {*}   FIRST(() = {(}   FIRST()) = {)}
FIRST(E') = {+, ε}
FIRST(T') = {*, ε}
FIRST(F) = {(, id}
FIRST(T) = FIRST(F) = {(, id}
FIRST(E) = FIRST(T) = {(, id}
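Rules 1–4a amount to a fixed-point computation. A Python sketch over the same grammar (ε is represented as an explicit symbol; the layout is our own):

```python
EPS = "ε"
PROD = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_sets(prod):
    """Fixed-point computation of FIRST for every nonterminal,
    following rules 1-4a above."""
    first = {nt: set() for nt in prod}
    changed = True
    while changed:
        changed = False
        for nt, alts in prod.items():
            for alt in alts:
                old = len(first[nt])
                for sym in alt:
                    f = first[sym] if sym in prod else {sym}  # rule 1
                    first[nt] |= f - {EPS}
                    if EPS not in f:
                        break            # this symbol cannot vanish
                else:
                    first[nt].add(EPS)   # every symbol could derive ε
                if len(first[nt]) != old:
                    changed = True
    return first

first = first_sets(PROD)
assert first["E"] == {"(", "id"} and first["T"] == {"(", "id"}
assert first["E'"] == {"+", EPS} and first["T'"] == {"*", EPS}
```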

73 Rules to Create FOLLOW
GRAMMAR:
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
(A and B are nonterminals; α and β are strings of grammar symbols)
FOLLOW rules:
1. If S is the start symbol, then $ ∈ FOLLOW(S)
2. If there is a production A → αBβ, then EVERYTHING in FIRST(β) except ε is placed in FOLLOW(B)
3. If there is a production A → αB, then EVERYTHING in FOLLOW(A) is in FOLLOW(B)
3a. If there is a production A → αBβ where FIRST(β) contains ε (i.e. β ⇒* ε), then EVERYTHING in FOLLOW(A) is in FOLLOW(B)
SETS (using FIRST(E) = FIRST(T) = FIRST(F) = {(, id}, FIRST(E') = {+, ε}, FIRST(T') = {*, ε}):
FOLLOW(E) = {), $}
FOLLOW(E') = {), $}
FOLLOW(T) = {+, ), $}
FOLLOW(T') = {+, ), $}
FOLLOW(F) = {+, *, ), $}

74 Rules to Build the Parse Table
GRAMMAR:
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
FIRST SETS: FIRST(E) = FIRST(T) = FIRST(F) = {(, id}; FIRST(E') = {+, ε}; FIRST(T') = {*, ε}
FOLLOW SETS: FOLLOW(E) = FOLLOW(E') = {), $}; FOLLOW(T) = FOLLOW(T') = {+, ), $}; FOLLOW(F) = {+, *, ), $}
Rules:
1. For each production A → α: if a ∈ FIRST(α), add A → α to M[A, a]
2. For each production A → α: if ε ∈ FIRST(α), add A → α to M[A, b] for each terminal b ∈ FOLLOW(A)
3. For each production A → α: if ε ∈ FIRST(α) and $ ∈ FOLLOW(A), add A → α to M[A, $]
(The resulting parsing table appeared as a slide figure.)
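Putting the pieces together: starting from the FIRST sets listed above, the FOLLOW rules and the three table-building rules can be run mechanically. A hedged Python sketch (representation is our own):

```python
EPS = "ε"
PROD = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}
# FIRST sets exactly as listed on the slide:
FIRST = {"E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
         "E'": {"+", EPS}, "T'": {"*", EPS}}

def first_of_string(beta):
    """FIRST of a sequence of grammar symbols."""
    out = set()
    for sym in beta:
        f = FIRST.get(sym, {sym})
        out |= f - {EPS}
        if EPS not in f:
            return out
    return out | {EPS}           # every symbol could derive ε

def follow_sets(start):
    follow = {nt: set() for nt in PROD}
    follow[start].add("$")                        # rule 1
    changed = True
    while changed:
        changed = False
        for a, alts in PROD.items():
            for alt in alts:
                for i, b in enumerate(alt):
                    if b not in PROD:
                        continue
                    f = first_of_string(alt[i + 1:])            # rule 2
                    add = (f - {EPS}) | (follow[a] if EPS in f else set())
                    if not add <= follow[b]:                    # rules 3, 3a
                        follow[b] |= add
                        changed = True
    return follow

FOLLOW = follow_sets("E")
assert FOLLOW["E"] == {")", "$"}
assert FOLLOW["F"] == {"+", "*", ")", "$"}

# Table construction per the three rules:
table = {}
for a, alts in PROD.items():
    for alt in alts:
        f = first_of_string(alt)
        for t in f - {EPS}:
            table[(a, t)] = alt                  # rule 1
        if EPS in f:
            for t in FOLLOW[a]:                  # rules 2 & 3
                table[(a, t)] = alt

assert table[("E", "id")] == ["T", "E'"]
assert table[("E'", "$")] == [EPS]
```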

75 Predictive Parsing
A top-down method of parsing (recursive descent parsing) in which a set of procedures is executed recursively to process the input
The speciality of predictive parsing is that the lookahead symbol unambiguously determines the procedure selected for each nonterminal (no prediction, in fact!!!)
Hence the sequence of procedures called implicitly defines a parse tree for the input

76 Algorithm for Predictive parsing
procedure match (c : token);
{ if (lookahead == c) then lookahead := nexttoken;
  else error; }
procedure type;
{ if (lookahead is in [integer, char, num]) then simple;
  else if (lookahead == '↑') { match ('↑'); match (id); }
  else if (lookahead == array) then
    { match (array); match ('['); simple; match (']'); match (of); type; }
  else error; }
………. Contd.

77 Algorithm for Predictive parsing
procedure simple;
{ if (lookahead == integer) match (integer);
  else if (lookahead == char) match (char);
  else if (lookahead == num)
    { match (num); match (dotdot); match (num); }
  else error; }
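The two procedures above can be transcribed into runnable form. A Python sketch (tokens are plain strings in a list; the class layout is our own, not from the slides):

```python
# The pseudocode above in Python. `lookahead` is the current token;
# match() consumes it or reports a syntax error.
class Parser:
    def __init__(self, tokens):
        self.toks = tokens + ["$"]
        self.pos = 0

    @property
    def lookahead(self):
        return self.toks[self.pos]

    def match(self, c):
        if self.lookahead == c:
            self.pos += 1
        else:
            raise SyntaxError(f"expected {c}, got {self.lookahead}")

    def type(self):
        if self.lookahead in ("integer", "char", "num"):
            self.simple()
        elif self.lookahead == "↑":
            self.match("↑"); self.match("id")
        elif self.lookahead == "array":
            self.match("array"); self.match("[")
            self.simple()
            self.match("]"); self.match("of")
            self.type()                      # recursive call, as in the slide
        else:
            raise SyntaxError("bad type")

    def simple(self):
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num"); self.match("dotdot"); self.match("num")
        else:
            raise SyntaxError("bad simple type")

p = Parser(["array", "[", "num", "dotdot", "num", "]", "of", "integer"])
p.type()
assert p.lookahead == "$"      # the whole input was consumed
```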

78 Error recovery in Predictive parsing
An error is encountered by a predictive parser when: The terminal on top of the stack does not match the next input symbol, or A nonterminal A is on top of the stack, a is the next input symbol, and the parsing table entry M[A, a] is empty Two methods used for recovery: Panic-mode error recovery Skip symbols on the input until a token in a selected set of synchronizing tokens appears Effectiveness depends on the set of synchronizing tokens chosen Phrase-level recovery Fill the empty slots of the parsing table with pointers to error routines

79 LL (1) Parsing Grammars whose predictive parsing tables contain no multiply-defined entries are called LL(1). LL(1): ● Left-to-right scan of the input string ● Leftmost derivation of the grammar ● 1 symbol of lookahead (we know only what the next token is) LL(k): we know what the next k tokens are Every LL(1) grammar is also an LL(2) grammar, and so on
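An LL(1) parser can also be driven directly by the table instead of by recursive procedures. A sketch follows, with the expression-grammar table hard-coded and ε-productions encoded as empty right-hand sides:

```python
# Sketch: a table-driven LL(1) parse using the predictive parsing table
# of the expression grammar. An empty list encodes an epsilon production.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens):
    """Return the list of (nonterminal, rhs) productions used, in order."""
    tokens = tokens + ["$"]
    stack = ["$", "E"]                  # start symbol on top
    i, used = 0, []
    while stack:
        top = stack.pop()
        if top == "$" and tokens[i] == "$":
            return used                 # accept
        if top in NONTERMS:
            rhs = TABLE.get((top, tokens[i]))
            if rhs is None:
                raise SyntaxError(f"no rule for ({top}, {tokens[i]})")
            used.append((top, rhs))
            stack.extend(reversed(rhs)) # push RHS, leftmost symbol on top
        elif top == tokens[i]:
            i += 1                      # terminal matched
        else:
            raise SyntaxError(f"expected {top}, got {tokens[i]}")
    raise SyntaxError("input left over")
```

`ll1_parse("id + id * id".split())` succeeds and returns the leftmost derivation's productions, starting with E → TE'.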

80 Bottom Up Parsing Constructs the nodes of the syntax tree in post order, i.e. bottom up: beginning at the leaves and working up to the root Suitable for automatic parser generation Less restrictive, in the sense that it can postpone the decision of which production to use until it has seen all of the input derived from that production's right-hand side Hence it can handle a larger class of grammars (LR grammars) than LL parsers But a naive implementation is inefficient

81 Working of Bottom Up Parsing
Repeatedly matches a right-sentential form from the language against the tree's upper frontier. At each match, it applies a reduction to build on the frontier: Each reduction matches the upper frontier of the partially built tree to the RHS of some production Each reduction adds a node on top of the frontier The final result is a rightmost derivation, in reverse.

82 Bottom Up Parsing
The main task is to find the leftmost node that has not yet been constructed, BUT all of whose children have been constructed. Such a sequence of symbols whose parent needs to be constructed is called the handle. Creating a parent node N for the handle based on the grammar rules is called reducing the handle to N. Hence the problem can be stated as: To find the proper handle, and To locate the right grammar production to reduce the handle to a parent node

83 We want to parse the input string abbcde.
Bottom Up Parsing S  aABe A  Abc | b B  d Consider the Grammar: We want to parse the input string abbcde. INPUT: a b b c d e $ Production S  aABe Bottom-Up Parsing Program A  Abc A  b B  d | Website for Students | VTU - Notes - Question Papers

84 We want to parse the input string abbcde.
Bottom Up Parsing S  aABe A  Abc | b B  d Consider the Grammar: We want to parse the input string abbcde. OUTPUT: INPUT: a A b b c d e $ Production c A b S  aABe Bottom-Up Parsing Program A  Abc A b A  b B  d We are not reducing here in this example. A parser would reduce, get stuck and then backtrack! | Website for Students | VTU - Notes - Question Papers

85 We want to parse the input string abbcde.
Bottom Up Parsing S  aABe A  Abc | b B  d Consider the Grammar: We want to parse the input string abbcde. OUTPUT: INPUT: a A B d S e $ a S e Production B d c A b S  aABe Bottom-Up Parsing Program A  Abc A b A  b B  d This parser is known as an LR Parser because it scans the input from Left to right, and it constructs a Rightmost derivation in reverse order | Website for Students | VTU - Notes - Question Papers

86 Bottom Up Parsing A naive version is very inefficient because of: Too much scanning of productions to match handles in the input string, and Backtracking at times Bottom-up parsing is also known as shift-reduce parsing It uses a stack and two operations, SHIFT and REDUCE: SHIFT: Move the current input symbol onto the stack and advance the input REDUCE: If the top few items on the stack form the RHS of a production, pop these items and replace them with the LHS of that production

87 Shift Reduce Parsing – An Illustration
Grammar:
E → E + E | E * E | ( E ) | id

STACK            INPUT              ACTION
$                id1 + id2 * id3 $  SHIFT
$ id1            + id2 * id3 $      REDUCE by E → id
$ E              + id2 * id3 $      SHIFT
$ E +            id2 * id3 $        SHIFT
$ E + id2        * id3 $            REDUCE by E → id
$ E + E          * id3 $            SHIFT
$ E + E *        id3 $              SHIFT
$ E + E * id3    $                  REDUCE by E → id
$ E + E * E      $                  REDUCE by E → E * E
$ E + E          $                  REDUCE by E → E + E
$ E              $                  ACCEPT
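A minimal shift-reduce engine for this ambiguous grammar can be sketched in Python. One hedge: the slide's parser consults a table, whereas here the shift-or-reduce decision is made with an explicit operator-precedence tie-break; that tie-break is an assumption of this sketch, not the slide's method.

```python
# Sketch of shift-reduce parsing for the ambiguous grammar
#   E -> E + E | E * E | ( E ) | id
# The shift-or-reduce choice uses a precedence tie-break (an assumption).
PREC = {"+": 1, "*": 2}

def shift_reduce(tokens):
    """Return the sequence of reductions performed, in order."""
    tokens = tokens + ["$"]
    stack, i, steps = [], 0, []
    while True:
        # REDUCE E -> id as soon as id is on top of the stack
        if stack[-1:] == ["id"]:
            stack[-1] = "E"
            steps.append("E->id")
            continue
        # REDUCE E -> E op E, but only when the next token does not
        # bind tighter than the operator already on the stack
        if (len(stack) >= 3 and stack[-3] == "E" and stack[-1] == "E"
                and stack[-2] in PREC
                and PREC.get(tokens[i], 0) <= PREC[stack[-2]]):
            op = stack[-2]
            del stack[-3:]
            stack.append("E")
            steps.append(f"E->E{op}E")
            continue
        # REDUCE E -> ( E )
        if stack[-3:] == ["(", "E", ")"]:
            del stack[-3:]
            stack.append("E")
            steps.append("E->(E)")
            continue
        if tokens[i] == "$":
            if stack == ["E"]:
                return steps            # ACCEPT
            raise SyntaxError("parser stuck")
        stack.append(tokens[i])         # SHIFT
        i += 1
```

On `id + id * id` the engine performs exactly the reduction sequence of the table above: three E → id reductions, then E → E * E, then E → E + E.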

88 Conflict Situations
Two possible conflict scenarios:
Shift-Reduce Conflict: One rule implies shift while another implies reduce. E.g.:
Grammar productions: S → if E then S | if E then S else S
On stack: if E then S; Input: else
Should we shift (trying for the 2nd rule) or reduce by the first rule?
Reduce-Reduce Conflict: Two grammar rules with the same RHS. E.g.:
Grammar productions: A → α and B → α
On stack: α
Which production should we choose for the reduction, A → α or B → α?

89 Conflict Situations (contd.)
SOLUTION: Change the grammar, as such grammars are unsuitable for this kind of parsing

90 Operator Precedence Parsing
A simplified and efficient version of the bottom-up parsing technique that can be applied only to a small but important class of grammars: operator grammars
Operator grammars require:
No right-hand side of a rule is empty (ε)
No right-hand side has 2 adjacent nonterminals
The following grammars represent the same language:
E → E A E | ( E ) | id,  A → + | – | *
E → E + E | E – E | E * E | ( E ) | id   ← Operator Grammar

91 Operator Precedence Parsing
Works on the basis of operator precedence. It uses three precedence relations:
a <· b   a yields precedence to b
a ·> b   a takes precedence over b
a ≐ b    a has the same precedence as b
and finds the handle as a <· … ·> pattern at the top of the stack
Drawbacks:
Only a small class of grammars qualifies
Overloaded operators are hard (unary minus)
Parser correctness is hard to prove
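The relation-driven loop can be sketched over terminals only. In this simplification, nonterminals are dropped from the stack, so each reported handle is just its terminal skeleton: "id" stands for E → id, "*" for E → E * E, "()" for E → ( E ), and so on.

```python
# Sketch: operator-precedence parsing over terminals only, using the
# standard relation table for E -> E+E | E*E | (E) | id.
REL = {  # REL[(a, b)]: "<" yields, ">" takes, "=" same precedence
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<", ("$", "("): "<",
    ("id", "+"): ">", ("id", "*"): ">", ("id", ")"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "("): "<", ("+", "+"): ">",
    ("+", "*"): "<", ("+", ")"): ">", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "("): "<", ("*", "+"): ">",
    ("*", "*"): ">", ("*", ")"): ">", ("*", "$"): ">",
    ("(", "id"): "<", ("(", "("): "<", ("(", "+"): "<",
    ("(", "*"): "<", ("(", ")"): "=",
    (")", "+"): ">", (")", "*"): ">", (")", ")"): ">", (")", "$"): ">",
}

def op_precedence_parse(tokens):
    """Return the handles reduced, in order; raise on error."""
    tokens = tokens + ["$"]
    stack, i, handles = ["$"], 0, []
    while True:
        a, b = stack[-1], tokens[i]
        if a == "$" and b == "$":
            return handles                  # accept
        rel = REL.get((a, b))
        if rel in ("<", "="):               # shift
            stack.append(b)
            i += 1
        elif rel == ">":                    # pop the handle: back past "<"
            handle = [stack.pop()]
            while REL.get((stack[-1], handle[0])) != "<":
                handle.insert(0, stack.pop())
            handles.append("".join(handle))
        else:
            raise SyntaxError(f"no precedence relation between {a!r} and {b!r}")
```

On `id + id * id` the <· … ·> pattern picks out the three id handles first, then E * E, then E + E, mirroring the relation table.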

92 LR PARSERS An efficient bottom-up parsing technique that can be used to parse a large class of context-free grammars An attractive option because it: Can be used with virtually any kind of grammar Is the most general non-backtracking shift-reduce parsing technique known Can be implemented as efficiently as any other technique Can detect a syntax error as soon as it is possible to do so on a left-to-right scan of the input However, an LR parser is hard to construct manually; tools are available to construct one automatically.

93 LR Parser
An LR parser consists of:
A driver program
An input
An output
A stack, and
A parsing table
The stack stores a string of the form s0 X1 s1 X2 s2 … Xm sm (with sm on top), where each:
Xi is a grammar symbol, and
si is a symbol called a state, summarizing the information in the stack below it
The parsing table has 2 parts:
A parsing action function, action, and
A goto function, goto

94 LR Parser Functioning
The parsing program:
Reads input characters a from the input buffer one at a time
Determines the current state s on top of the stack
Consults the parsing table entry action[s, a], which can be one of:
Shift to state s'
Reduce by a grammar production
Accept
Error
If action[s, a] is accept, parsing is complete
The function goto takes a state and a grammar symbol as arguments and produces a new state

95 LR Parsing Example
Let us parse the input string id * id + id
using the grammar and the parsing table:
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → ( E )
(6) F → id

96–117 LR Parser Example – Step by Step
GRAMMAR:
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → ( E )
(6) F → id
The slides step through the parse of id * id + id one move at a time. With each state written after the stack symbol it covers, the full trace is:

STACK              INPUT            ACTION
0                  id * id + id $   shift 5
0 id 5             * id + id $      reduce by (6) F → id
0 F 3              * id + id $      reduce by (4) T → F
0 T 2              * id + id $      shift 7
0 T 2 * 7          id + id $        shift 5
0 T 2 * 7 id 5     + id $           reduce by (6) F → id
0 T 2 * 7 F 10     + id $           reduce by (3) T → T * F
0 T 2              + id $           reduce by (2) E → T
0 E 1              + id $           shift 6
0 E 1 + 6          id $             shift 5
0 E 1 + 6 id 5     $                reduce by (6) F → id
0 E 1 + 6 F 3      $                reduce by (4) T → F
0 E 1 + 6 T 9      $                reduce by (1) E → E + T
0 E 1              $                accept

The output built along the way is the parse tree for id * id + id, grown bottom-up: an F and a T node over each id, T over T * F, E over T, and finally E over E + T.
118 LR Parsing Observations
All LR parsers use the same parsing program, independent of the table and hence of the grammar. What differentiates LR parsers are the action and goto tables. These tables can be constructed using: Simple LR (SLR): succeeds for the fewest grammars, but is the easiest to implement Canonical LR: succeeds for the most grammars, but is the hardest to implement. It splits states when necessary to prevent reductions that would get the parser stuck Lookahead LR (LALR): succeeds for most common syntactic constructions used in programming languages, but produces LR tables much smaller than canonical LR

119 SLR Parsing table Construction
Parsing table construction is based on the key idea: A handle should NOT be reduced to a nonterminal N if the next (lookahead) token cannot follow N (as that would result in a dead end and backtracking) So ensure that: Reduction by an item N → α● is done ONLY if the lookahead token is in FOLLOW(N). In order to achieve this, we define a concept called the dotted ITEM, or ITEM for short

120 DOTTED ITEM (a.k.a. ITEM) A concept used to summarize the state of our search; sets of items represent sets of hypotheses about the next token Definition: Consider a production N → αβ An LR dotted item N → α●β (note the ● in between) means that we maintain the hypothesis that: αβ is a possible handle, and We have already seen the α part When the ● reaches the rightmost point, as in N → αβ●, we have identified the handle An item with the ● rightmost is called a REDUCE ITEM. All others are SHIFT ITEMS

121 Dotted ITEM – An Example
Consider the production A → XYZ
This production yields four items:
A → ● XYZ   (shift item)
A → X ● YZ  (shift item)
A → XY ● Z  (shift item)
A → XYZ ●   (reduce item)
The production A → ε yields only one item: A → ●
An item can be represented by a pair of integers:
One representing the production number in the grammar, and
The other indicating the position of the dot in the production

122 Dotted ITEM – An Example
Intuitively, an item indicates how much of a production we have seen at a given point in the parsing process

123 Closure Operation The operation of extending the context with items is called the closure operation Useful in identifying all the ITEMS belonging to a state Example: If a state includes A → a●Bb, include all items that start B, such as B → ●XYZ Informally: if I expect to see B next, I expect to start anything that B can start with, so if X can begin with aHI, include X → ●aHI as well States are thus built by closure from individual items.

124 Parsing table construction
The central idea: To construct from the grammar a DFA that recognizes "viable prefixes" The method: Use the grammar to build a model of the DFA Group items together into sets, which give rise to the states of the parser Use two functions, goto(s, X) and closure(s): goto() is analogous to move() in the subset construction closure() adds information to round out a state Build up the states and transition functions of the DFA Use this info to fill in the ACTION and GOTO tables

125 Creating States in Parsing Table Gen
Start with the initial state Keep moving the DOT across the symbols and check whether the result is a new state; if yes, mark it as new, otherwise mark the transition to the existing state Compute the closures, if any, in that state and include them Keep moving the dot and marking the states until all viable states are identified and the transitions between them are marked The generated DFA is then used to construct the parsing table with its action and goto entries

126 Parsing Table Construction

127 Constructing Parsing Table from DFA
Use the following rules: An arc between two states labeled with a terminal is a shift action. An arc between two states labeled with a nonterminal is a goto action. If a state contains an item A → a● (a reduce item), the action is to reduce by this production, for all terminals in FOLLOW(A). If there are shift-reduce conflicts or reduce-reduce conflicts, more elaborate techniques are needed

128 Transition Diagram for Expression Grammar
GRAMMAR:
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → ( E )
(6) F → id
The transition diagram (DFA over item sets), written out as a transition list:
State 0:  E → 1, T → 2, F → 3, ( → 4, id → 5
State 1:  + → 6 (otherwise accept)
State 2:  * → 7 (otherwise reduce 2)
State 3:  reduce 4
State 4:  E → 8, T → 2, F → 3, ( → 4, id → 5
State 5:  reduce 6
State 6:  T → 9, F → 3, ( → 4, id → 5
State 7:  F → 10, ( → 4, id → 5
State 8:  ) → 11, + → 6
State 9:  * → 7 (otherwise reduce 1)
State 10: reduce 3
State 11: reduce 5

129 LR Parsing - A Summary The most general parser; can parse a wide variety of grammars The parsing program is fixed; only the table changes Takes longer to execute Table generation is tough if done manually; use of automated tools is a necessity Error recovery is difficult

130 Comparison of LL & LR Parsers
LL(k) parsers:
Left-to-right scan of input, k lookahead symbols
Leftmost derivation
Grammar must have no multiply-defined entries in the parsing table (restrictive)
Predictive parser
Must recognize the use of a production seeing only the first k symbols of what its right side derives

LR(k) parsers:
Left-to-right scan of input, k lookahead symbols
Rightmost derivation in reverse
A much wider range of grammars is acceptable (less restrictive)
Shift-reduce parser (shift, reduce, accept, goto)
Must recognize the occurrence of the right side of a production, having seen all of what is derived from the right side, with k input symbols of lookahead

131 Comparison of Parsers
PARSER TYPE: LL(1) parsing (recursive descent, top-down)
ADVANTAGES: Fast; Simple; Good locality; Good error detection
DISADVANTAGES: Hand-coded; High maintenance; Right associativity
PARSER TYPE: LR(1) parsing (bottom-up)
ADVANTAGES: Fast; Handles deterministic languages; Automatable
DISADVANTAGES: Large working sets; Poor error messages; Large tables

132 Parser Generators
Yet Another Compiler-Compiler (YACC):
Yacc specification (translate.y) → Yacc compiler → y.tab.c
y.tab.c → C compiler → a.out
input → a.out → output

133 In Summary We use CFGs and Backus-Naur form to describe context-free languages Grammars can be ambiguous Parsing techniques use these grammars The general CYK algorithm tests membership in a context-free language: Works for any context-free grammar BUT its running time is O(n³), much too slow for practical use (n is the length of the input) The practical techniques are the LL (recursive descent) and LR (shift-reduce) techniques ………. Contd.

134 In Summary (contd.)
LL Parsing (a.k.a. recursive descent parsing)
LL stands for Left-to-right scan of input and Leftmost derivation
Start by expanding the start symbol, then expand out nonterminals based on the input, building the parse tree from the top down
Works only on grammars with some restrictions (LL grammars, with no left recursion, where the first token uniquely identifies the production)
Can easily be built manually
………. Contd.

135 In Summary (contd.)
LR Parsing (a.k.a. shift-reduce parsing)
LR stands for Left-to-right scan of input and Rightmost derivation
A bottom-up parsing technique that starts from the leaf nodes and builds the tree upwards by reducing tokens to the LHS of productions (recognize an occurrence of β, the handle, having seen all of what is derived from β)
Works on a wide variety of grammars (LR grammars)
Too complex to build manually; tools are available to build them automatically

136 E  . E E  . E + T E  .T T  .T * F T  .F F  .( E ) F  . id
E  E . E  E . + T 1 E  E + . T T  .T * F T  .F F  .( E ) F  . id 6 E  T . T  T . * F 2 T  T * .F F  .( E ) F  . id 7 T  F . 3 F  id . 5 4 F  (. E ) E  . E + T E  .T T  .T * F T  .F F  .( E ) F  . id E  E + T . T  T . * F 9 8 F (E . ) E  E . + T 10 T  T * F . F ( E ). 11 | Website for Students | VTU - Notes - Question Papers

137 Thank you Any Questions ????

138 Augmented Grammar & its Functions
As a pre-requisite "to construct a DFA to recognize viable prefixes from the grammar", we define: The augmented grammar, and Two functions on it, namely: The closure operation, closure(I), and The goto operation, goto(I, X) Augmented Grammar: If G is a grammar with start symbol S, then the augmented grammar for G is G' with new start symbol S' and the added production S' → S Used to indicate to the parser when to stop parsing

139 1. The Closure Operation
If I is a set of items for a grammar G, then closure(I) is the set of items constructed from the following 2 rules:
1. Initially, every item in I is added to closure(I)
2. If A → α●Bβ is in closure(I) and B → γ is a production, then add the item B → ●γ if it is not already included
(Rule 2 is applied repeatedly until no more new items can be added to closure(I))
GRAMMAR:
E' → E
E → E + T | T
T → T * F | F
F → ( E ) | id
If I is the set of one item { E' → ●E }, then the closure(I) set would contain:
E' → ●E
E → ●E + T
E → ●T
T → ●T * F
T → ●F
F → ●( E )
F → ●id
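closure(I) is itself a small fixed-point computation. A sketch in Python follows; encoding items as (lhs, rhs-tuple, dot-position) triples is an assumption of this example.

```python
# Sketch: closure(I) for the augmented expression grammar, with items
# encoded as (lhs, rhs_tuple, dot_position) triples.
GRAMMAR = {
    "E'": [("E",)],
    "E":  [("E", "+", "T"), ("T",)],
    "T":  [("T", "*", "F"), ("F",)],
    "F":  [("(", "E", ")"), ("id",)],
}

def closure(items):
    out = set(items)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot in list(out):
            # if the dot sits before a nonterminal B, add every B -> .gamma
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                for prod in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], prod, 0)
                    if item not in out:
                        out.add(item)
                        changed = True
    return out
```

`closure({("E'", ("E",), 0)})` yields exactly the seven items listed on the slide.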

140 2. The goto Operation
goto(I, X) (where I is a set of items and X is a grammar symbol) is defined as:
The closure of the set of all items [A → αX●β] such that [A → α●Xβ] is in I.
Intuitively, if I is the set of items that are valid for some viable prefix γ, then goto(I, X) is the set of items that are valid for the viable prefix γX
If I is the set of 2 items { E' → E● and E → E ● + T }, then goto(I, +) contains:
E → E + ● T
T → ●T * F
T → ●F
F → ●( E )
F → ●id
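goto(I, X) simply advances the dot over X and closes the result. A self-contained sketch (it re-includes the closure helper and the same item encoding, so it runs on its own):

```python
# Sketch: goto(I, X) = closure of all items with the dot moved over X.
GRAMMAR = {
    "E'": [("E",)],
    "E":  [("E", "+", "T"), ("T",)],
    "T":  [("T", "*", "F"), ("F",)],
    "F":  [("(", "E", ")"), ("id",)],
}

def closure(items):
    out = set(items)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot in list(out):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                for prod in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], prod, 0)
                    if item not in out:
                        out.add(item)
                        changed = True
    return out

def goto(items, x):
    # advance the dot over x in every item where x follows the dot
    moved = {(lhs, rhs, dot + 1)
             for lhs, rhs, dot in items
             if dot < len(rhs) and rhs[dot] == x}
    return closure(moved)
```

With I = { E' → E●, E → E●+T }, `goto(I, "+")` gives the five items listed on the slide.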

