
Chapter 2. Design of a Simple Compiler J. H. Wang Sep. 21, 2015.


1 Chapter 2. Design of a Simple Compiler J. H. Wang Sep. 21, 2015

2 Outline –An Informal Definition of the ac Language –Formal Definition of ac –Phases of a Simple Compiler –Scanning –Parsing –Abstract Syntax Trees –Semantic Analysis –Code Generation

3 Introduction An overview of the compilation process by considering a simple language A quick overview of a compiler's phases and their associated data structures

4 An Informal Definition of the ac Language ac : adding calculator Types –integer –float: allows 5 fractional digits after the decimal point –Automatic type conversion from integer to float Keywords –f: float –i: integer –p: print Variables –23 names from lowercase Roman alphabet except the three reserved keywords f, i, and p Target of translation: dc (desk calculator) –Reverse Polish notation (RPN)

5 An Example ac Program Example ac program: –f b i a a = 5 b = a + 3.2 p b $ Corresponding dc code –5 sa la 3.2 + sb lb p

6 Formal Definition of ac Syntax specification: context-free grammar (CFG) –(Chap. 4) Token specification: regular expressions –(Sec. 3.2)

7 Syntax Specification

8 CFG: –A set of productions or rewriting rules –E.g.: Stmt → id assign Val Expr | print id –Two kinds of symbols Terminals: cannot be rewritten –E.g.: id, assign, print –Empty or null string: λ –End of input stream or file: $ Nonterminals: –Start symbol: Prog –E.g.: Val, Expr –Left-hand side (LHS) –Right-hand side (RHS)

9 Starting from the start symbol Choosing some nonterminal symbol and finding a production for it Replacing it with the string of symbols on the RHS Any string of terminals that can be produced: syntactically valid Otherwise: syntax error

10

11 Token Specification

12

13 Phases of a Simple Compiler Scanner: source ac program -> tokens –Chap. 3 Parser: tokens -> abstract syntax tree (AST) –Chap. 5 & 6 Symbol table: created from AST –Chap. 8 Semantic analysis: AST decoration Translation: by traversing AST

14 Scanning To translate a stream of characters into a stream of tokens –Automatic construction of scanners: Chap.3 –Token: Type: membership in the terminal alphabet Semantic value: additional information –For most programming languages, the scanner’s job is not so easy +, ++ //, “, \” Variable-length tokens

15 [Figure: pseudocode for SCANNER, which uses PEEK, ADVANCE, SCANDIGITS, and LEXICALERROR]

16 [Figure: pseudocode for SCANDIGITS, which uses PEEK and ADVANCE]
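The scanning step can be sketched in Python as below. This is a minimal illustration, not the pseudocode from the slides: the token names `floatdcl`, `intdcl`, `inum`, `fnum`, `plus`, and `minus` are assumptions (the token specification figure is not reproduced in this transcript), and the sketch does not enforce the limit of 5 fractional digits.

```python
from dataclasses import dataclass

DIGITS = "0123456789"
# Assumed terminal names for the reserved letters and the operators.
KEYWORDS = {"f": "floatdcl", "i": "intdcl", "p": "print"}
OPERATORS = {"=": "assign", "+": "plus", "-": "minus"}


@dataclass
class Token:
    type: str        # membership in the terminal alphabet
    value: str = ""  # semantic value, e.g. the digits of a number


def scan_digits(text, pos):
    """Scan an integer or float literal that starts at text[pos]."""
    start = pos
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1
    if pos < len(text) and text[pos] == ".":
        pos += 1
        frac_start = pos
        while pos < len(text) and text[pos] in DIGITS:
            pos += 1
        if pos == frac_start:
            raise ValueError(f"lexical error: digit expected at position {pos}")
        return Token("fnum", text[start:pos]), pos
    return Token("inum", text[start:pos]), pos


def scan(text):
    """Turn a stream of characters into a list of tokens ending with $."""
    tokens = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1
        elif ch == "$":                 # explicit end-of-input marker
            break
        elif ch in DIGITS:
            token, pos = scan_digits(text, pos)
            tokens.append(token)
        elif ch in KEYWORDS:
            tokens.append(Token(KEYWORDS[ch]))
            pos += 1
        elif ch in OPERATORS:
            tokens.append(Token(OPERATORS[ch]))
            pos += 1
        elif "a" <= ch <= "z":          # any other lowercase letter is an id
            tokens.append(Token("id", ch))
            pos += 1
        else:
            raise ValueError(f"lexical error: unexpected {ch!r} at position {pos}")
    tokens.append(Token("$"))
    return tokens
```

For example, `scan("b = a + 3.2")` yields tokens of type `id`, `assign`, `id`, `plus`, `fnum`, followed by the end marker `$`.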

17 Parsing To determine if the stream of tokens conforms to the language’s grammar (Chap. 4, 5, 6) –e.g.: Are these valid statements? b = a + 3.2 p b –For ac, a simple parsing technique called recursive descent is used “Mutually recursive parsing routines that descend through a derivation tree” Each nonterminal has an associated parsing procedure for determining if the token stream contains a sequence of tokens derivable from that nonterminal

18 Predicting a Parsing Procedure Examine the next input token to predict which production should be applied –E.g.: Stmt → id assign Val Expr Stmt → print id –Predict set {id} [1] {print} [6]

19 [Figure: pseudocode for the STMT parsing procedure, which uses PEEK, MATCH, VAL, EXPR, and ERROR]

20 Consider the productions for Stmts –Stmts → Stmt Stmts –Stmts → λ The predict sets –{id, print} [8] –{$} [11]

21 [Figure: pseudocode for the STMTS parsing procedure, which uses PEEK, STMT, STMTS, and ERROR]

22 Implementing the Production When a terminal is encountered, a call to MATCH() is placed For each nonterminal, the corresponding procedure will be called For the symbol λ, no code is executed
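The rules above (MATCH for terminals, a procedure call for each nonterminal, no code for λ) can be sketched as a small recursive-descent recognizer in Python. The productions for Stmt and Stmts come from the slides; the productions used for Val and Expr, and the token names `inum`, `fnum`, `plus`, and `minus`, are assumptions made for this illustration.

```python
class ParseError(Exception):
    pass


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)  # terminal names, expected to end with "$"
        self.pos = 0

    def peek(self):
        # Treat running off the end as the end-of-input terminal.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else "$"

    def match(self, expected):
        if self.peek() != expected:
            raise ParseError(f"expected {expected}, found {self.peek()}")
        self.pos += 1

    def stmts(self):
        if self.peek() in ("id", "print"):   # Stmts -> Stmt Stmts
            self.stmt()
            self.stmts()
        elif self.peek() == "$":             # Stmts -> λ: no code is executed
            pass
        else:
            raise ParseError(f"unexpected {self.peek()} at start of statement")

    def stmt(self):
        if self.peek() == "id":              # Stmt -> id assign Val Expr
            self.match("id")
            self.match("assign")
            self.val()
            self.expr()
        elif self.peek() == "print":         # Stmt -> print id
            self.match("print")
            self.match("id")
        else:
            raise ParseError(f"unexpected {self.peek()} in statement")

    def val(self):
        # Assumed production: Val -> id | inum | fnum
        if self.peek() in ("id", "inum", "fnum"):
            self.match(self.peek())
        else:
            raise ParseError(f"value expected, found {self.peek()}")

    def expr(self):
        # Assumed productions: Expr -> plus Val Expr | minus Val Expr | λ
        if self.peek() in ("plus", "minus"):
            self.match(self.peek())
            self.val()
            self.expr()
        # Otherwise Expr -> λ; a stray token is reported later by stmts().


def parse_stmts(tokens):
    """Return True if tokens form a valid statement list followed by $."""
    parser = Parser(tokens)
    parser.stmts()
    parser.match("$")
    return True
```

Each procedure inspects only the next token, which mirrors how the predict sets select a production.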

23 Abstract Syntax Trees Some aspects of compilation can be difficult to perform during syntax analysis –Some aspects of a language cannot be specified in a CFG E.g.: symbol usage consistency with type declaration –Context sensitive In Java: x.y.z –Package x, class y, static field z –Variable x, field y, another field z Operator overloading –+: numerical addition or appending of strings –Separation into phases makes the compiler much easier to write and maintain

24 Parse trees are large and unnecessarily detailed (Fig. 2.4) –Abstract syntax tree (AST) (Fig. 2.9) Inessential punctuation and delimiters are not included –A common intermediate representation for all phases after syntax analysis Declarations need not be in source form Order of executable statements explicitly represented Assignment statement must retain identifier and expression Nodes representing computation: operation and operands Print statement must retain name of identifier

25

26 Semantic Analysis Examples of processing include: –Declarations and name scopes are processed to construct a symbol table –Type consistency –Make type-dependent behavior explicit

27 Symbol Tables To record all identifiers and their types –23 entries for 23 distinct identifiers in ac (Fig. 2.11) Type info.: integer, float, unused (null) Attributes: scope, storage class, protection properties –Symbol table construction (Fig. 2.10) Symbol declaration nodes call VISIT(SymDeclaring n) ENTERSYMBOL checks that the given symbol has not been previously declared

28

29 [Figure: pseudocode for symbol table construction, with VISIT, GETTYPE, GETID, ENTERSYMBOL, LOOKUPSYMBOL, and ERROR]
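A symbol table with this behavior can be sketched in Python as follows. The class and method names are illustrative assumptions that loosely mirror ENTERSYMBOL and LOOKUPSYMBOL; an undeclared name is reported as `None`, corresponding to the "unused (null)" entry.

```python
class SymbolTable:
    """Maps each ac identifier to its declared type."""

    def __init__(self):
        # name -> "integer" or "float"; a missing key means "unused".
        self.types = {}

    def enter_symbol(self, name, type_):
        # A symbol may be declared only once.
        if name in self.types:
            raise ValueError(f"duplicate declaration of {name}")
        self.types[name] = type_

    def lookup_symbol(self, name):
        # Returns None for an identifier that was never declared.
        return self.types.get(name)
```

For the example program, the declarations `f b` and `i a` would lead to `enter_symbol("b", "float")` and `enter_symbol("a", "integer")`.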

30 Type Checking Only two types in ac –Integer –Float Type hierarchy –Float wider than integer –Automatic widening (or casting) integer -> float

31 Type Analysis [Figure: pseudocode for the type-analysis VISIT procedures, which use CONSISTENT, CONVERT, and LOOKUPSYMBOL]

32 [Figure: pseudocode for CONSISTENT, GENERALIZE, and CONVERT, which use ERROR]

33 Type checking –Constants and symbol references: simply set the node’s type based on the node’s contents –Computation nodes: CONSISTENT(n.c1, n.c2) –Assignment operation: CONVERT(n.c2, n.c1.type) CONSISTENT() –GENERALIZE(): determines the least general (simplest) type that encompasses both operand types –CONVERT(): checks whether conversion is necessary
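The three helpers can be sketched in Python as below. This is an adaptation under stated assumptions rather than the slides' pseudocode: the `Node` class and the `int2float` node kind are hypothetical, and `convert` returns a (possibly wrapped) node instead of rewriting the tree in place.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    kind: str                       # e.g. "id", "inum", "fnum", "int2float"
    type: Optional[str] = None      # "integer" or "float"
    child: Optional["Node"] = None  # operand of a conversion node


def generalize(t1, t2):
    """Least general type that encompasses both types."""
    return "float" if "float" in (t1, t2) else "integer"


def convert(node, target):
    """Wrap node in a conversion if its type must be widened to target."""
    if node.type == "float" and target == "integer":
        raise TypeError("illegal type conversion: float to integer")
    if node.type == "integer" and target == "float":
        return Node("int2float", "float", child=node)
    return node  # types already agree, no conversion needed


def consistent(c1, c2):
    """Bring both operands to a common type; return them with that type."""
    common = generalize(c1.type, c2.type)
    return convert(c1, common), convert(c2, common), common
```

For `a + 3.2` with `a` declared integer, `consistent` wraps the reference to `a` in an `int2float` node and reports `float` as the type of the computation.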

34

35 Code Generation The formulation of target-machine instructions that faithfully represent the semantics of the source program –Chap. 11 & 13 –dc: stack machine model –Code generation proceeds by traversing the AST, starting at its root VISIT (Computing n): +, - VISIT (Assigning n): = VISIT (SymReferencing n) VISIT (Printing n) VISIT (Converting n)

36 [Figure: pseudocode for the code-generation VISIT procedures, which use CODEGEN and EMIT]
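Code generation by AST traversal can be sketched in Python as follows. The node classes are illustrative assumptions named after the VISIT cases listed earlier (`Const` is a hypothetical node for literals), and Converting nodes are left out to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class Const:
    value: str


@dataclass
class SymReferencing:
    name: str


@dataclass
class Computing:
    op: str          # "+" or "-"
    left: object
    right: object


@dataclass
class Assigning:
    name: str
    expr: object


@dataclass
class Printing:
    name: str


def codegen(node, out):
    """Append dc instructions for node to the list out (postorder walk)."""
    if isinstance(node, Const):
        out.append(node.value)           # push the literal
    elif isinstance(node, SymReferencing):
        out.append("l" + node.name)      # load register onto the stack
    elif isinstance(node, Computing):
        codegen(node.left, out)          # operands first, then the operator
        codegen(node.right, out)
        out.append(node.op)
    elif isinstance(node, Assigning):
        codegen(node.expr, out)
        out.append("s" + node.name)      # pop the stack into the register
    elif isinstance(node, Printing):
        out.append("l" + node.name)
        out.append("p")                  # print the top of the stack
    else:
        raise TypeError(f"unknown node: {node!r}")


def generate(program):
    out = []
    for statement in program:
        codegen(statement, out)
    return " ".join(out)
```

Applied to the statements of the example program (`a = 5`, `b = a + 3.2`, `p b`), `generate` produces `5 sa la 3.2 + sb lb p`, matching the dc code shown on slide 5.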

37

38 End of Chapter 2

