1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.



2 Outline
1. Intermediate Code Generation
2. Variants of Syntax Trees
   1. Directed Acyclic Graph for Expressions
   2. The Value-Number Method for Constructing DAGs
3. Three-Address Code
   1. Addresses and Instructions
   2. Quadruples
   3. Triples
4. Static Single-Assignment Form
5. Summary

3 Intermediate-Code Generation
Lecture: 19-20

4 Where Are We Now?
Source code -> Scanner -> Tokens -> Parser -> Syntax Tree -> Semantic Analyzer -> Annotated Tree -> Intermediate Code Generator -> Intermediate code

5 Intermediate-Code Generation
- In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target code.
- Ideally, details of the source language are confined to the front end, and details of the target machine to the back end.
- With a suitably defined intermediate representation, a compiler for language i and machine j can then be built by combining the front end for language i with the back end for machine j.

6 Intermediate-Code Generation (Continued)
- The following figure shows the front-end model of a compiler.
  [Figure: front-end model of a compiler]
- Static checking includes type checking, which ensures that operators are applied to compatible operands.
- Static checking also includes any syntactic checks that remain after parsing.
  - For example, a break statement in C must be enclosed within a while, for, or switch statement.
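The break rule above can be checked with a single walk over the syntax tree. The following is a minimal sketch, not a real compiler pass: the nested-tuple tree shape and the name `misplaced_breaks` are assumptions made for illustration.

```python
# Hypothetical mini-AST: each node is a tuple (kind, child, child, ...).
ENCLOSING = {"while", "for", "switch"}

def misplaced_breaks(node, inside=False):
    """Count break nodes that are not nested in a while, for, or switch."""
    kind, *children = node
    if kind == "break":
        return 0 if inside else 1
    # Once we enter an enclosing construct, every descendant is "inside".
    inside = inside or kind in ENCLOSING
    return sum(misplaced_breaks(child, inside) for child in children)

ok_program = ("block", ("while", ("block", ("break",))))
bad_program = ("block", ("break",), ("for", ("break",)))

print(misplaced_breaks(ok_program))   # 0
print(misplaced_breaks(bad_program))  # 1
```

The parser accepts both programs because break is a syntactically valid statement; only the static checker, which knows the nesting context, can reject the second one.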

7 Intermediate-Code Generation (Continued)
- While translating a program, a compiler may construct a sequence of intermediate representations.
  Source Program -> High-Level Intermediate Representation -> Low-Level Intermediate Representation -> Target Code
- High-level representations are close to the source language and low-level representations are close to the target machine.
- Abstract syntax trees are a high-level intermediate representation.
  - They depict the natural hierarchical structure of the source program.

8 Intermediate-Code Generation (Continued)
- A low-level representation is suitable for machine-dependent tasks like register allocation and instruction selection.
- Three-address code can range from high-level to low-level, depending upon the choice of operators.
- For expressions, the differences between syntax trees and three-address code are superficial.
  - For looping statements, by contrast, a syntax tree represents the components of a statement, whereas three-address code contains labels and jump instructions to represent the flow of control, as in machine language.

9 Intermediate-Code Generation (Continued)
- The choice or design of an intermediate representation varies from compiler to compiler.
- An intermediate representation may either be an actual language or consist of internal data structures that are shared by the phases of the compiler.
- C is a programming language, yet it is often used as an intermediate form.
  - C is flexible, it compiles into efficient machine code, and its compilers are widely available.
  - The original C++ compiler consisted of a front end that generated C, treating a C compiler as a back end.

10 Quiz #3
Time Allowed: 10 Minutes

11 Variants of Syntax Trees
- Nodes in a syntax tree represent constructs in the source program.
  - The children of a node represent the meaningful components of the construct.
- A directed acyclic graph (DAG) for an expression identifies the common subexpressions of the expression.

12 Directed Acyclic Graphs for Expressions
- A directed acyclic graph (DAG) is a directed graph with no directed cycles.
- Like the syntax tree for an expression, a DAG has leaves corresponding to atomic operands and interior nodes corresponding to operators.
- A node N in a DAG has more than one parent if N represents a common subexpression.
- A DAG not only represents expressions more succinctly, it also gives the compiler important clues regarding the generation of efficient code to evaluate the expression.

13 Directed Acyclic Graphs for Expressions (Continued)
- Create syntax trees and DAGs for the following expressions:
  - a = a + 10
  - a + b + (a + b)
  - a + b + a + b
  - a + a * (b - c) + (b - c) * d

14 The Value-Number Method for Constructing DAGs
- Often, the nodes of a syntax tree or DAG are stored in an array of records.
- Each row of the array represents one record, and therefore one node.
- Consider the figure on the next slide, which shows a DAG along with an array for the expression i = i + 10.

15 The Value-Number Method for Constructing DAGs (Continued)
- In the following figure, each record starts with an operation code. Leaves have one additional field, which holds the lexical value, and interior nodes have two additional fields indicating the left and right children.
  [Figure: DAG and array of records for i = i + 10]

16 The Value-Number Method for Constructing DAGs (Continued)
- In the array, we refer to nodes by giving the integer index of the record for that node within the array.
- This integer is called the value number for the node or for the expression represented by the node.
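The value-number method can be sketched in a few lines. This is a minimal illustration under stated assumptions: the names `make_dag_builder` and `node` are hypothetical, value numbers start at 1, and a dictionary stands in for whatever lookup structure a real compiler would use to find an existing record.

```python
def make_dag_builder():
    nodes = []   # array of records; position + 1 is the value number
    index = {}   # (op, left, right) signature -> value number (assumed lookup table)

    def node(op, left, right=None):
        """Return the value number of the node, creating it only if it is new."""
        signature = (op, left, right)
        if signature not in index:
            nodes.append(signature)
            index[signature] = len(nodes)
        return index[signature]

    return nodes, node

# Build the DAG for i = i + 10.
nodes, node = make_dag_builder()
i = node("id", "i")
ten = node("num", 10)
plus = node("+", i, ten)
assign = node("=", node("id", "i"), plus)   # the second "i" reuses record 1

for number, record in enumerate(nodes, 1):
    print(number, record)
# 1 ('id', 'i', None)
# 2 ('num', 10, None)
# 3 ('+', 1, 2)
# 4 ('=', 1, 3)
```

Because a node is created only when its signature has not been seen before, a repeated subexpression such as the second a + b in a + b + (a + b) maps to the value number of the first one, which is exactly what turns the syntax tree into a DAG.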

17 Three-Address Code
- In three-address code, there is at most one operation on the right side of an instruction.
- An expression like x+y*z might be translated into the sequence of three-address instructions
    t1 = y*z
    t2 = x+t1
- t1 and t2 are compiler-generated temporary names.
- The use of names for intermediate values computed by a program allows three-address code to be rearranged easily.
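The translation of x+y*z can be reproduced with a post-order walk of the expression tree. This is a sketch only: the nested-tuple tree format and the function name `gen_tac` are assumptions for illustration, and only binary operators are handled.

```python
import itertools

def gen_tac(expr):
    """Return (instructions, address) for a nested-tuple expression tree."""
    code = []
    counter = itertools.count(1)

    def walk(e):
        if isinstance(e, str):        # a name is already an address
            return e
        op, left, right = e           # assumed shape: (op, left, right)
        left_addr = walk(left)
        right_addr = walk(right)
        temp = f"t{next(counter)}"    # fresh temporary for this operation
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    result = walk(expr)
    return code, result

code, result = gen_tac(("+", "x", ("*", "y", "z")))
print("\n".join(code))
# t1 = y * z
# t2 = x + t1
```

Each interior node of the tree produces exactly one instruction, so every instruction has at most one operator on its right side.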

18 Three-Address Code (Continued)
- Exercise
  - Represent the following DAG as a sequence of three-address instructions.

19 Addresses and Instructions
- Three-address code is built from two concepts: addresses and instructions.
- In object-oriented terms, these concepts correspond to classes, and the various kinds of addresses and instructions correspond to appropriate subclasses.
- Alternatively, three-address code can be implemented using records with fields for the addresses.
- These records are called quadruples and triples.

20 Addresses and Instructions (Continued)
- In the three-address code scheme, an address can be one of the following:
  - A name: The names that appear in the source program. In an implementation, a source name is replaced by a pointer to its symbol-table entry, where all the information about the name is kept.
  - A constant: In practice, a compiler must deal with many different types of constants and variables.
  - A compiler-generated temporary: It is useful, especially in optimizing compilers, to create a distinct name each time a temporary is needed.

21 Addresses and Instructions (Continued)
- A few examples of three-address instructions are listed below:
  - Assignment instructions of the form x = y op z
  - Assignments of the form x = op y
  - Copy instructions of the form x = y
  - An unconditional jump goto L
  - Conditional jumps of the form if x goto L
  - Indexed copy instructions of the form x = y[z] or y[z] = x
  - etc.

22 Addresses and Instructions (Continued)
- Consider the following statement and its three-address code in the figures:
    do i = i+1; while( a[i]<v );
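One plausible translation of this loop, together with a tiny interpreter to run it, is sketched below. The tuple encoding, the operation names (`add`, `copy`, `index`, `if_lt`), and the function `run` are all assumptions for illustration, not the slide's figure. The sketch also assumes a relational conditional jump of the form if x < y goto L, and it indexes the array by element rather than scaling by the element width.

```python
# Each instruction is a tuple; "label" entries mark jump targets.
program = [
    ("label", "L"),
    ("add", "t1", "i", 1),       # t1 = i + 1
    ("copy", "i", "t1"),         # i = t1
    ("index", "t2", "a", "i"),   # t2 = a[i]
    ("if_lt", "t2", "v", "L"),   # if t2 < v goto L
]

def run(program, env):
    """Execute the instruction list, updating and returning env."""
    labels = {ins[1]: pc for pc, ins in enumerate(program) if ins[0] == "label"}

    def value(x):
        # Strings are names looked up in env; anything else is a constant.
        return env[x] if isinstance(x, str) else x

    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "add":
            env[args[0]] = value(args[1]) + value(args[2])
        elif op == "copy":
            env[args[0]] = value(args[1])
        elif op == "index":
            env[args[0]] = env[args[1]][value(args[2])]
        elif op == "if_lt" and value(args[0]) < value(args[1]):
            pc = labels[args[2]]   # jump back to the label
    return env

env = run(program, {"i": 0, "v": 5, "a": [9, 1, 3, 7, 2]})
print(env["i"])   # 3
```

The loop structure of the source statement has disappeared: all that remains is a label and a conditional jump, which is how three-address code represents the flow of control.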

23 Quadruples & Triples
- The description of three-address instructions specifies the components of each type of instruction, but it does not specify the representation of these instructions in a data structure.
- In a compiler, these instructions can be implemented as objects or as records with fields for the operator and the operands.
- Three such representations are called "quadruples", "triples", and "indirect triples".

24 Quadruples
- A quadruple, or just "quad", has four fields, which we call op, arg1, arg2, and result.
  - In x = y + z, '+' is op, y and z are arg1 and arg2, and x is result.
- The following are some exceptions to this rule:
  - Instructions with unary operators like x = minus y, and copy instructions like x = y, do not use arg2.
  - Operators like param use neither arg2 nor result.
  - Conditional and unconditional jumps put the target label in result.

25 Quadruples (Continued)
- Example: Three-address code for the assignment a = b*-c+b*-c is shown below.
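The quadruples for this assignment can be generated with a short sketch. The nested-tuple tree format, the `Quad` record, and the function `to_quads` are assumptions made for illustration; the unary minus is written as the operator `minus`, and unused fields are left as None.

```python
from collections import namedtuple
import itertools

Quad = namedtuple("Quad", "op arg1 arg2 result")

def to_quads(target, expr):
    """Translate `target = expr` into a list of quadruples."""
    quads = []
    counter = itertools.count(1)

    def walk(e):
        if isinstance(e, str):
            return e
        op, *operands = e
        args = [walk(x) for x in operands]
        args += [None] * (2 - len(args))   # unary operators leave arg2 empty
        temp = f"t{next(counter)}"
        quads.append(Quad(op, args[0], args[1], temp))
        return temp

    value = walk(expr)
    quads.append(Quad("=", value, None, target))   # copy: arg2 is unused
    return quads

neg_c = ("minus", "c")
quads = to_quads("a", ("+", ("*", "b", neg_c), ("*", "b", neg_c)))
for q in quads:
    print(q.op, q.arg1, q.arg2, q.result)
# minus c None t1
# * b t1 t2
# minus c None t3
# * b t3 t4
# + t2 t4 t5
# = t5 None a
```

This walk works on a tree, so b*-c is computed twice; building a DAG first would let both occurrences share the same temporaries.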

26 Triples
- A triple has only three fields, which we call op, arg1, and arg2.
- In the earlier example, we saw that the result field is used primarily for temporary names.
- Using triples, we refer to the result of an operation x op y by its position rather than by an explicit temporary name.
- Consider the figure on the next slide for details.

27 Triples (Continued)
- Example: Three-address code using triples
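As a sketch of the triple representation, the assignment a = b*-c+b*-c can be translated as follows. The tuple tree format and the function `to_triples` are assumptions for illustration. In the output, a string argument is a name and an integer argument is a reference to the position of an earlier triple.

```python
def to_triples(target, expr):
    """Translate `target = expr` into a list of (op, arg1, arg2) triples."""
    triples = []

    def walk(e):
        if isinstance(e, str):
            return e                        # a name refers to itself
        op, *operands = e
        args = [walk(x) for x in operands]
        args += [None] * (2 - len(args))    # unary operators leave arg2 empty
        triples.append((op, args[0], args[1]))
        return len(triples) - 1             # the result is named by position

    value = walk(expr)
    triples.append(("=", target, value))
    return triples

neg_c = ("minus", "c")
triples = to_triples("a", ("+", ("*", "b", neg_c), ("*", "b", neg_c)))
for position, triple in enumerate(triples):
    print(position, *triple)
# 0 minus c None
# 1 * b 0
# 2 minus c None
# 3 * b 2
# 4 + 1 3
# 5 = a 4
```

No temporary names appear at all. The cost is that moving an instruction changes its position, so every triple that refers to it would have to be updated; indirect triples address this by listing pointers to triples instead of the triples themselves.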

28 Static Single-Assignment Form
- Static single-assignment form (SSA) is an intermediate representation that facilitates certain code optimizations.
- Two aspects distinguish SSA from three-address code:
  - All assignments in SSA are to variables with distinct names.
  - SSA uses a notational convention called the Φ-function to combine two definitions of the same variable.
- Original program:
    if( flag ) x = -1; else x = 1;
    y = x + a;
- SSA form:
    if( flag ) x1 = -1; else x2 = 1;
    x3 = Φ( x1, x2 );
    y = x3 + a;
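The renaming for a single if/else can be sketched as below. This is a deliberately narrow illustration, not a general SSA construction algorithm: the function `ssa_if_else` is hypothetical, it handles only constant assignments, and it assumes every variable is assigned in both branches.

```python
from collections import defaultdict

def ssa_if_else(then_assigns, else_assigns):
    """Rename assignments in two branches and emit phi-functions at the join.

    Each branch is a list of (variable, constant) pairs. Every variable is
    assumed to be assigned in both branches.
    """
    version = defaultdict(int)   # variable -> latest version number used

    def rename(assigns):
        code, last = [], {}
        for var, value in assigns:
            version[var] += 1
            name = f"{var}{version[var]}"   # distinct name per assignment
            code.append(f"{name} = {value}")
            last[var] = name                # definition that reaches the join
        return code, last

    then_code, then_last = rename(then_assigns)
    else_code, else_last = rename(else_assigns)

    phis = []
    for var in then_last:
        version[var] += 1
        phis.append(
            f"{var}{version[var]} = phi({then_last[var]}, {else_last[var]})"
        )
    return then_code, else_code, phis

then_code, else_code, phis = ssa_if_else([("x", -1)], [("x", 1)])
print(then_code, else_code, phis)
# ['x1 = -1'] ['x2 = 1'] ['x3 = phi(x1, x2)']
```

The phi-function yields x1 if control reached the join through the true branch and x2 if it came through the false branch, so any later use of x can refer to the single name x3.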

29 Summary
Any Questions?

