
Chapter 2 Chang Chi-Chung 2008.03 rev.1. A Simple Syntax-Directed Translator This chapter contains introductory material to Chapters 3 to 8  To create.


1 Chapter 2 Chang Chi-Chung 2008.03 rev.1

2 A Simple Syntax-Directed Translator
This chapter contains introductory material for Chapters 3 to 8.
- Goal: create a syntax-directed translator that maps infix arithmetic expressions into postfix expressions.
Building a simple compiler involves:
- Defining the syntax of a programming language
- Developing a source code parser (for our compiler we will use predictive parsing)
- Implementing syntax-directed translation to generate intermediate code

3 A Code Fragment To Be Translated
Source fragment:
    {
        int i; int j; float[100] a; float v; float x;
        while ( true ) {
            do i = i + 1; while ( a[i] < v );
            do j = j - 1; while ( a[j] > v );
            if ( i >= j ) break;
            x = a[i]; a[i] = a[j]; a[j] = x;
        }
    }
The syntax-directed translator is then extended to map such code fragments into three-address code (see Appendix A):
     1: i = i + 1
     2: t1 = a [ i ]
     3: if t1 < v goto 1
     4: j = j - 1
     5: t2 = a [ j ]
     6: if t2 > v goto 4
     7: ifFalse i >= j goto 9
     8: goto 14
     9: x = a [ i ]
    10: t3 = a [ j ]
    11: a [ i ] = t3
    12: a [ j ] = x
    13: goto 1
    14:

4 A Model of a Compiler Front End
[Figure: Source program (character stream) → Lexical analyzer → token stream → Parser → syntax tree → Intermediate code generator → three-address code. A symbol table sits alongside these phases.]

5 Two Forms of Intermediate Code
For the statement do i = i + 1; while ( a[i] < v );
- Abstract syntax tree:
    do-while
        body: assign( i, +( i, 1 ) )
        condition: <( [ ]( a, i ), v )
- Three-address instructions:
    1: i = i + 1
    2: t1 = a [ i ]
    3: if t1 < v goto 1

6 Syntax Definition
Syntax is defined using a context-free grammar (CFG), written in BNF (Backus-Naur Form).
A context-free grammar has four components:
- A set of tokens (terminal symbols)
- A set of nonterminals
- A set of productions
- A designated start symbol

7 Example of CFG
G = <T, N, P, S>
- T = { +, -, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
- N = { list, digit }
- P =
    list → list + digit
    list → list - digit
    list → digit
    digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
- S = list

8 Derivations
The language of a CFG is the set of all strings (sequences of tokens) that the grammar generates by derivation:
- Begin with the start symbol
- Repeatedly replace a nonterminal symbol in the current sentential form with one of the right-hand sides of a production for that nonterminal

9 Example of the Derivations
- Leftmost derivation: replaces the leftmost nonterminal in each step.
- Rightmost derivation: replaces the rightmost nonterminal in each step.
Productions:
    list → list + digit
    list → list - digit
    list → digit
    digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Leftmost derivation of 9 - 5 + 2:
    list ⇒ list + digit
         ⇒ list - digit + digit
         ⇒ digit - digit + digit
         ⇒ 9 - digit + digit
         ⇒ 9 - 5 + digit
         ⇒ 9 - 5 + 2
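
The leftmost derivation above can be replayed mechanically. The following is a minimal sketch, not part of the original slides: the class name LeftmostDerivation and its helper methods are hypothetical, and the grammar symbols are hard-coded for grammar G.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that replays the leftmost derivation of 9 - 5 + 2.
class LeftmostDerivation {
    // Nonterminals of grammar G from the slides (hard-coded assumption).
    static boolean isNonterminal(String symbol) {
        return symbol.equals("list") || symbol.equals("digit");
    }

    // Replace the leftmost nonterminal of the sentential form with `body`.
    // `head` must be that leftmost nonterminal, otherwise the step is invalid.
    static List<String> step(List<String> form, String head, String body) {
        for (int i = 0; i < form.size(); i++) {
            if (isNonterminal(form.get(i))) {
                if (!form.get(i).equals(head)) {
                    throw new IllegalArgumentException(
                        "leftmost nonterminal is " + form.get(i));
                }
                List<String> next = new ArrayList<>(form.subList(0, i));
                next.addAll(Arrays.asList(body.split(" ")));
                next.addAll(form.subList(i + 1, form.size()));
                return next;
            }
        }
        throw new IllegalStateException("no nonterminal left to replace");
    }

    // Applies the six production choices used on the slide, in order,
    // and returns every sentential form, starting with the start symbol.
    static List<String> derive() {
        String[][] choices = {
            {"list", "list + digit"},
            {"list", "list - digit"},
            {"list", "digit"},
            {"digit", "9"},
            {"digit", "5"},
            {"digit", "2"},
        };
        List<String> form = Arrays.asList("list");
        List<String> forms = new ArrayList<>();
        forms.add(String.join(" ", form));
        for (String[] c : choices) {
            form = step(form, c[0], c[1]);
            forms.add(String.join(" ", form));
        }
        return forms;
    }

    public static void main(String[] args) {
        System.out.println(String.join("\n  => ", derive()));
    }
}
```

Running main prints the seven sentential forms of the derivation, from list down to 9 - 5 + 2. A rightmost derivation could be sketched the same way by scanning the form from the right instead.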

10 Parse Trees
Given a CFG, a parse tree according to the grammar is a tree with the following properties:
- The root of the tree is labeled by the start symbol
- Each leaf of the tree is labeled by a terminal (= token) or by ε
- Each interior node is labeled by a nonterminal
- If A → X1 X2 … Xn is a production, then node A has immediate children X1, X2, …, Xn, where each Xi is a terminal, a nonterminal, or ε (ε denotes the empty string)
Example: for the production A → X Y Z, node A has the three children X, Y, Z (left to right).

11 Example of the Parse Tree
Parse tree of the string 9-5+2 using grammar G:
    list
        list
            list
                digit
                    9
            -
            digit
                5
        +
        digit
            2
The sequence of leaves, read from left to right, is called the yield of the parse tree.

12 Ambiguity
Consider the following context-free grammar G = <T, N, P, S> with T = { +, -, 0, 1, …, 9 }, N = { string }, S = string, and
    P = string → string + string | string - string | 0 | 1 | … | 9
This grammar is ambiguous, because more than one parse tree represents the string 9-5+2.

13 Ambiguity (Cont'd)
[Figure: two parse trees for the string 9-5+2. One groups the string as (9-5)+2, the other as 9-(5+2).]

14 Associativity of Operators
Left-associative
- If an operand has an operator on both sides of it, it belongs to the operator to its left.
- The string a+b+c has the same meaning as (a+b)+c.
- Left-associative operators have left-recursive productions:
    left → left + term | term
Right-associative
- If an operand has an operator on both sides of it, it belongs to the operator to its right.
- The string a=b=c has the same meaning as a=(b=c).
- Right-associative operators have right-recursive productions:
    right → term = right | term

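15 Associativity of Operators (cont'd)
[Figure: two parse trees. Left-associative: the tree for a+b+c, built from list and digit nodes, grows down to the left. Right-associative: the tree for a=b=c, built from right and letter nodes, grows down to the right.]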

16 Precedence of Operators
The string 9+5*2 has the same meaning as 9+(5*2), because * has higher precedence than +.
Construct a grammar for arithmetic expressions that encodes operator precedence, with one nonterminal per precedence level:
- left-associative: + - (expr)
- left-associative: * / (term)
Step 1: factor → digit | ( expr )
Step 2: term → term * factor | term / factor | factor
Step 3: expr → expr + term | expr - term | term
Step 4 (the complete grammar):
    expr → expr + term | expr - term | term
    term → term * factor | term / factor | factor
    factor → digit | ( expr )
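
To see how the three-level grammar fixes precedence and associativity, the sketch below evaluates expressions with one method per nonterminal. It is an illustration only, not taken from the slides: the class name PrecedenceDemo is hypothetical, operands are single digits as in the grammar, and the left-recursive productions are realized as loops (an assumption that anticipates the left-recursion elimination discussed later in the deck).

```java
// Hypothetical evaluator with one method per nonterminal of the grammar:
//   expr   -> expr + term | expr - term | term
//   term   -> term * factor | term / factor | factor
//   factor -> digit | ( expr )
class PrecedenceDemo {
    private final String src;
    private int pos = 0;

    PrecedenceDemo(String src) {
        this.src = src.replace(" ", "");   // ignore blanks
    }

    // Evaluate a whole expression; reject trailing input.
    static int eval(String text) {
        PrecedenceDemo p = new PrecedenceDemo(text);
        int value = p.expr();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("syntax error at " + p.pos);
        }
        return value;
    }

    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    // Lowest precedence level: + and -, applied left to right.
    private int expr() {
        int value = term();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            int right = term();
            value = (op == '+') ? value + right : value - right;
        }
        return value;
    }

    // Higher precedence level: * and /, applied left to right.
    private int term() {
        int value = factor();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            int right = factor();
            value = (op == '*') ? value * right : value / right;
        }
        return value;
    }

    // Operands: a single digit or a parenthesized expression.
    private int factor() {
        char c = peek();
        if (Character.isDigit(c)) {
            pos++;
            return c - '0';
        }
        if (c == '(') {
            pos++;
            int value = expr();
            if (peek() != ')') {
                throw new IllegalArgumentException("expected ) at " + pos);
            }
            pos++;
            return value;
        }
        throw new IllegalArgumentException("syntax error at " + pos);
    }

    public static void main(String[] args) {
        System.out.println(eval("9+5*2"));   // 19, because * binds tighter
        System.out.println(eval("9-5+2"));   // 6, because - and + associate left
    }
}
```

Because term is called from inside expr, every * or / is grouped before the surrounding + or - is applied, which is the effect the layered grammar is designed to achieve.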

17 An Example: Syntax of Statements
The grammar is a subset of Java statements.
    stmt → id = expression ;
         | if ( expression ) stmt
         | if ( expression ) stmt else stmt
         | while ( expression ) stmt
         | do stmt while ( expression ) ;
         | { stmts }
    stmts → stmts stmt
          | ε
A semicolon appears at the end of every production body that does not end in stmt. This approach prevents the build-up of semicolons after statements such as if- and while-, which end with nested substatements.

18 Syntax-Directed Translation
Syntax-directed translation is done by attaching rules or program fragments to productions in a grammar.
In this chapter: translate infix expressions into postfix notation.
- Infix: 9 - 5 + 2
- Postfix: 9 5 - 2 +
An example
- Production: expr → expr1 + term
- Pseudocode of the translation:
    translate expr1 ;
    translate term ;
    handle + ;

19 Syntax-Directed Translation (Cont'd)
Two concepts (approaches) related to syntax-directed translation:
- Synthesized attributes (syntax-directed definition): build up a translation by attaching strings, computed by semantic rules, as attributes to the nodes in the parse tree.
- Translation schemes (syntax-directed translation): build up a translation with program fragments, called semantic actions, that are embedded within production bodies.

20 Syntax-directed definition
A syntax-directed definition associates
- with each grammar symbol (terminals and nonterminals), a set of attributes;
- with each production, a set of semantic rules for computing the values of the attributes associated with the symbols appearing in the production.
An attribute is said to be
- synthesized if its value at a parse-tree node is determined from attribute values at its children and at the node itself;
- inherited if its value at a parse-tree node is determined from attribute values at the node itself, its parent, and its siblings in the parse tree.

21 An Example: Synthesized Attributes
An annotated parse tree
- Suppose a node N in a parse tree is labeled by grammar symbol X.
- Then X.a denotes the value of attribute a of X at node N.
Annotated parse tree for 9-5+2:
    expr.t = "95-2+"
        expr.t = "95-"
            expr.t = "9"
                term.t = "9"
                    9
            -
            term.t = "5"
                5
        +
        term.t = "2"
            2

22 Semantic Rules
    Production               Semantic rule
    expr → expr1 + term      expr.t = expr1.t || term.t || '+'
    expr → expr1 - term      expr.t = expr1.t || term.t || '-'
    expr → term              expr.t = term.t
    term → 0                 term.t = '0'
    term → 1                 term.t = '1'
    …                        …
    term → 9                 term.t = '9'
|| is the operator for string concatenation in the semantic rules.

23 Depth-First Traversals
Tree traversals
- Breadth-first
- Depth-first
    Preorder: N L R
    Inorder: L N R
    Postorder: L R N
Depth-first traversal used here: postorder, visiting children from left to right.
    procedure visit(node N) {
        for ( each child C of N, from left to right ) {
            visit(C);
        }
        evaluate semantic rules at node N;
    }

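24 Example: Depth-First Traversals
[Figure: the annotated parse tree for 9-5+2, evaluated in postorder. The attributes are computed bottom-up: term.t = 9, expr.t = 9, term.t = 5, expr.t = 95-, term.t = 2, expr.t = 95-2+.]
Note: all attributes are of the synthesized type.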
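
The postorder evaluation of the synthesized attribute t can be sketched in a few lines. This is an illustrative sketch, not the presenter's code: the class AnnotatedTree, its Node type, and the helpers n and sample are hypothetical names, and the parse tree for 9-5+2 is built by hand rather than by a parser.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: evaluate the synthesized attribute t by a postorder walk.
class AnnotatedTree {
    static class Node {
        final String label;          // nonterminal name or token text
        final List<Node> children;   // empty for leaves (tokens)
        String t = "";               // synthesized attribute, filled in by visit()

        Node(String label, List<Node> children) {
            this.label = label;
            this.children = children;
        }
    }

    // Convenience constructor for hand-built trees.
    static Node n(String label, Node... children) {
        return new Node(label, Arrays.asList(children));
    }

    // Postorder, left to right: children first, then the rule at this node.
    static void visit(Node node) {
        for (Node child : node.children) {
            visit(child);
        }
        if (node.label.equals("term")) {
            // term -> digit : term.t = digit
            node.t = node.children.get(0).label;
        } else if (node.label.equals("expr") && node.children.size() == 1) {
            // expr -> term : expr.t = term.t
            node.t = node.children.get(0).t;
        } else if (node.label.equals("expr")) {
            // expr -> expr1 op term : expr.t = expr1.t || term.t || op
            node.t = node.children.get(0).t
                   + node.children.get(2).t
                   + node.children.get(1).label;
        }
    }

    // Parse tree for 9-5+2 under the grammar of the semantic-rules slide.
    static Node sample() {
        Node nine = n("expr", n("term", n("9")));
        Node nineMinusFive = n("expr", nine, n("-"), n("term", n("5")));
        return n("expr", nineMinusFive, n("+"), n("term", n("2")));
    }

    public static void main(String[] args) {
        Node root = sample();
        visit(root);
        System.out.println(root.t);   // 95-2+
    }
}
```

Since every rule reads only the attributes of a node's children, a single postorder pass is enough; no attribute is needed before its children have been visited.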

25 Translation Schemes
A translation scheme is a CFG with semantic actions embedded in its production bodies.
Example
    rest → + term { print('+') } rest
[Figure: parse-tree node rest with children +, term, the embedded semantic action { print('+') }, and rest, in that order.]

26 An Example: Translation Scheme
    expr → expr + term { print('+') }
    expr → expr - term { print('-') }
    expr → term
    term → 0 { print('0') }
    term → 1 { print('1') }
    …
    term → 9 { print('9') }
[Figure: parse tree for 9 - 5 + 2 with the actions attached as extra leaves. Visited left to right, the actions run in the order print('9'), print('5'), print('-'), print('2'), print('+').]

27 Parsing
Parsing is the process of determining whether a string of terminals (tokens) can be generated by a grammar.
Time complexity:
- For any CFG there is a parser that takes at most O(n^3) time to parse a string of n terminals.
- Linear algorithms suffice to parse essentially all languages that arise in practice.
Two kinds of methods:
- Top-down: constructs a parse tree from root to leaves
- Bottom-up: constructs a parse tree from leaves to root

28 Top-Down Parsing
Recursive descent parsing is a top-down method of syntax analysis in which a set of recursive procedures is used to process the input.
- One procedure is associated with each nonterminal of a grammar.
- If a nonterminal has multiple productions, each production is implemented in a branch of a selection statement based on input lookahead information.
Predictive parsing
- A special form of recursive descent parsing
- The lookahead symbol unambiguously determines the flow of control through the procedure body for each nonterminal.

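29 An Example: Top-Down Parsing
    stmt → expr ;
         | if ( expr ) stmt
         | for ( optexpr ; optexpr ; optexpr ) stmt
         | other
    optexpr → ε
            | expr
[Figure: parse tree for the input for ( ; expr ; expr ) other. The root stmt has children for, (, optexpr, ;, optexpr, ;, optexpr, ), stmt; the first optexpr derives ε, the other two derive expr, and the inner stmt derives other.]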

30 Pseudocode for a Predictive Parser
    stmt → expr ;
         | if ( expr ) stmt
         | for ( optexpr ; optexpr ; optexpr ) stmt
         | other
    optexpr → ε
            | expr

    void stmt() {
        switch ( lookahead ) {
        case expr:
            match(expr); match(';'); break;
        case if:
            match(if); match('('); match(expr); match(')'); stmt();
            break;
        case for:
            match(for); match('(');
            optexpr(); match(';'); optexpr(); match(';'); optexpr();
            match(')'); stmt(); break;
        case other:
            match(other); break;
        default:
            report("syntax error");
        }
    }

    // Use of the ε-production: if the lookahead is not expr,
    // optexpr() matches nothing and simply returns.
    void optexpr() {
        if ( lookahead == expr ) match(expr);
    }

    void match(terminal t) {
        if ( lookahead == t ) lookahead = nextTerminal;
        else report("syntax error");
    }

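31 Example: Predictive Parsing
Input: for ( ; expr ; expr ) other
[Figure: the parse tree for stmt is built top-down while the LL(1) lookahead moves across the input. Starting with lookahead for, the calls are match(for), match('('), optexpr(), match(';'), optexpr(), match(';'), optexpr(), match(')'), stmt().]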

32 FIRST
FIRST(α) is the set of terminals that appear as the first symbols of one or more strings generated from α, where α is a sentential form.
    stmt → expr ;
         | if ( expr ) stmt
         | for ( optexpr ; optexpr ; optexpr ) stmt
         | other
Example
- FIRST( stmt ) = { expr, if, for, other }
- FIRST( expr ; ) = { expr }

33 Examples: First
    type → simple
         | ^ id
         | array [ simple ] of type
    simple → integer
           | char
           | num dotdot num
- FIRST(simple) = { integer, char, num }
- FIRST(^ id) = { ^ }
- FIRST(type) = { integer, char, num, ^, array }
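
The FIRST sets above can be computed by iterating until nothing changes. The sketch below is one common way to do it, offered as an assumption rather than as the method used in the slides: the class FirstSets and its methods are hypothetical, a grammar is represented as a map from each nonterminal to its list of production bodies, any symbol without productions is treated as a terminal, and an empty body stands for an epsilon-production.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical fixed-point computation of FIRST for every nonterminal.
class FirstSets {
    static final String EPS = "epsilon";   // marker for the empty string

    static Map<String, Set<String>> first(Map<String, List<List<String>>> g) {
        Map<String, Set<String>> first = new HashMap<>();
        for (String nonterminal : g.keySet()) {
            first.put(nonterminal, new TreeSet<>());
        }
        boolean changed = true;
        while (changed) {                       // repeat until no set grows
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : g.entrySet()) {
                Set<String> target = first.get(e.getKey());
                for (List<String> body : e.getValue()) {
                    boolean allNullable = true;
                    for (String sym : body) {
                        if (!g.containsKey(sym)) {          // terminal
                            changed |= target.add(sym);
                            allNullable = false;
                            break;
                        }
                        // Nonterminal: copy its FIRST set, minus epsilon.
                        Set<String> symFirst = first.get(sym);
                        for (String t : new ArrayList<>(symFirst)) {
                            if (!t.equals(EPS)) {
                                changed |= target.add(t);
                            }
                        }
                        if (!symFirst.contains(EPS)) {
                            allNullable = false;
                            break;
                        }
                    }
                    if (allNullable) {          // whole body can vanish
                        changed |= target.add(EPS);
                    }
                }
            }
        }
        return first;
    }

    // The type/simple grammar from the slide.
    static Map<String, List<List<String>>> typeGrammar() {
        Map<String, List<List<String>>> g = new LinkedHashMap<>();
        g.put("type", Arrays.asList(
            Arrays.asList("simple"),
            Arrays.asList("^", "id"),
            Arrays.asList("array", "[", "simple", "]", "of", "type")));
        g.put("simple", Arrays.asList(
            Arrays.asList("integer"),
            Arrays.asList("char"),
            Arrays.asList("num", "dotdot", "num")));
        return g;
    }

    public static void main(String[] args) {
        System.out.println(first(typeGrammar()));
    }
}
```

For the type grammar this yields the same sets as the slide: FIRST(simple) contains integer, char, and num, and FIRST(type) adds ^ and array. A predictive parser can choose between the productions of type because the FIRST sets of the three bodies are disjoint.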

34 Designing a Predictive Parser
A predictive parser is a program consisting of a procedure for every nonterminal.
The procedure for nonterminal A
- decides which A-production to use by examining the lookahead symbol. Issues to handle:
    left factoring
    left recursion
    ε-productions
- mimics the body of the chosen production.
Applying a translation scheme
- Construct a predictive parser, ignoring the actions.
- Copy the actions from the translation scheme into the parser.

35 Left Factor
- The problem: more than one production for nonterminal A starts with the same symbols, so the lookahead cannot choose between them.
Example:
    stmt → if ( expr ) stmt
         | if ( expr ) stmt else stmt
Use left factoring to fix it:
    stmt → if ( expr ) stmt rest
    rest → else stmt | ε

36 Left Recursion
Left-recursive
- A production for nonterminal A starts with a reference to A itself:
    A → A α | β
An example:
    expr → expr + term | term
Rewrite the left-recursive productions as right-recursive ones by using the following rules:
    A → β R
    R → α R | ε

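37 Example: Left and Right Recursive
[Figure: two parse trees for the string β α α … α. Left-recursive (A → A α | β): a chain of A nodes growing down to the left, with β at the bottom. Right-recursive (A → β R, R → α R | ε): β first, then a chain of R nodes growing down to the right, ending in ε.]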
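
The rewriting rule for immediate left recursion is mechanical enough to sketch in code. The following is an illustration under stated assumptions, not part of the slides: the class LeftRecursion and its method eliminate are hypothetical, production bodies are lists of symbols, the fresh nonterminal R is named by appending a prime to A, and only immediate left recursion (A → A α) is handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: rewrite A -> A a1 | A a2 | b  into
//   A -> b R      R -> a1 R | a2 R | epsilon
class LeftRecursion {
    static List<String> eliminate(String a, List<List<String>> bodies) {
        String r = a + "'";                       // assumed name for the new nonterminal R
        List<String> aProductions = new ArrayList<>();
        List<String> rProductions = new ArrayList<>();
        for (List<String> body : bodies) {
            if (!body.isEmpty() && body.get(0).equals(a)) {
                // A -> A alpha  becomes  R -> alpha R
                List<String> alpha = new ArrayList<>(body.subList(1, body.size()));
                alpha.add(r);
                rProductions.add(r + " -> " + String.join(" ", alpha));
            } else {
                // A -> beta  becomes  A -> beta R
                List<String> beta = new ArrayList<>(body);
                beta.add(r);
                aProductions.add(a + " -> " + String.join(" ", beta));
            }
        }
        if (rProductions.isEmpty()) {
            // Nothing was left-recursive: return the grammar unchanged.
            List<String> unchanged = new ArrayList<>();
            for (List<String> body : bodies) {
                unchanged.add(a + " -> " + String.join(" ", body));
            }
            return unchanged;
        }
        rProductions.add(r + " -> epsilon");      // R can also end the chain
        List<String> result = new ArrayList<>(aProductions);
        result.addAll(rProductions);
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> expr = Arrays.asList(
            Arrays.asList("expr", "+", "term"),
            Arrays.asList("expr", "-", "term"),
            Arrays.asList("term"));
        for (String production : eliminate("expr", expr)) {
            System.out.println(production);
        }
    }
}
```

For the expr grammar the sketch produces expr -> term expr', expr' -> + term expr', expr' -> - term expr', and expr' -> epsilon, which matches the expr/rest form used later in the deck once expr' is renamed to rest. Semantic actions embedded in a body are carried along as ordinary symbols of α in this representation.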

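38 Abstract and Concrete Syntax
[Figure: two trees for 9-5+2. Abstract syntax tree: + at the root, with children - (over 9 and 5) and 2. Concrete syntax (parse) tree: the same string derived through expr and term nodes, which act only as helpers.]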

39 Conclusion: Parsing and Translation Scheme
Given a CFG G with semantic actions for translating into postfix notation:
    expr → expr + term { print('+') }
    expr → expr - term { print('-') }
    expr → term
    term → 0 { print('0') }
    term → 1 { print('1') }
    …
    term → 9 { print('9') }

40 Conclusion: Parsing and Translation Scheme
Step 1
- Eliminate left recursion.
- Technique: rewrite
    A → A α | A β | γ
  into
    A → γ R
    R → α R | β R | ε
Use this rule to transform G.

41 Conclusion: Parsing and Translation Scheme
Result of left-recursion elimination:
    expr → term rest
    rest → + term { print('+') } rest
         | - term { print('-') } rest
         | ε
    term → 0 { print('0') }
    term → 1 { print('1') }
    …
    term → 9 { print('9') }

42 An Example: Left-Recursion-elimination
    expr → term rest
    rest → + term { print('+') } rest
         | - term { print('-') } rest
         | ε
    term → 0 { print('0') } | 1 { print('1') } | … | 9 { print('9') }
[Figure: parse tree for 9-5+2 under this grammar. expr has children term (9, { print('9') }) and rest; that rest has children -, term (5, { print('5') }), { print('-') }, and rest; the next rest has children +, term (2, { print('2') }), { print('+') }, and a final rest deriving ε.]

43 Conclusion: Parsing and Translation Scheme
Step 2
- Procedures for nonterminals.
    void expr() {
        term(); rest();
    }

    void rest() {
        if ( lookahead == '+' ) {
            match('+'); term(); print('+'); rest();
        }
        else if ( lookahead == '-' ) {
            match('-'); term(); print('-'); rest();
        }
        else { }    // do nothing with the input
    }

    void term() {
        if ( lookahead is a digit ) {
            t = lookahead; match(lookahead); print(t);
        }
        else report("syntax error");
    }

44 Conclusion: Parsing and Translation Scheme
Step 3
- Simplifying the translator: the recursive calls to rest() are tail calls, so they can be replaced by a loop.
Before (tail-recursive):
    void rest() {
        if ( lookahead == '+' ) {
            match('+'); term(); print('+'); rest();
        }
        else if ( lookahead == '-' ) {
            match('-'); term(); print('-'); rest();
        }
        else { }
    }
After (iterative):
    void rest() {
        while ( true ) {
            if ( lookahead == '+' ) {
                match('+'); term(); print('+'); continue;
            }
            else if ( lookahead == '-' ) {
                match('-'); term(); print('-'); continue;
            }
            break;
        }
    }

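45 Conclusion: Parsing and Translation Scheme
Complete program:
    import java.io.*;

    class Parser {
        static int lookahead;

        // Read the first input character as the initial lookahead.
        public Parser() throws IOException {
            lookahead = System.in.read();
        }

        // expr -> term rest, with rest() folded into a loop.
        void expr() throws IOException {
            term();
            while ( true ) {
                if ( lookahead == '+' ) {
                    match('+'); term(); System.out.write('+'); continue;
                }
                else if ( lookahead == '-' ) {
                    match('-'); term(); System.out.write('-'); continue;
                }
                else return;
            }
        }

        // term -> digit { print(digit) }
        void term() throws IOException {
            if ( Character.isDigit((char) lookahead) ) {
                System.out.write((char) lookahead);
                match(lookahead);
            }
            else throw new Error("syntax error");
        }

        // Consume the lookahead if it equals t, then read the next character.
        void match(int t) throws IOException {
            if ( lookahead == t ) lookahead = System.in.read();
            else throw new Error("syntax error");
        }
    }

    // Driver: not shown on the slide; a minimal one is assumed here so the program runs.
    class Postfix {
        public static void main(String[] args) throws IOException {
            Parser parse = new Parser();
            parse.expr();
            System.out.write('\n');
            System.out.flush();
        }
    }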

