Compiler, Chapter 5: Intermediate code generation

Intermediate Code Generation: In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target code. Intermediate code is machine independent, but it is close to machine instructions. The intermediate code generator converts the given source-language program into an equivalent program in an intermediate language. Many different languages can serve as the intermediate language, and the designer of the compiler decides which one to use.

Intermediate Code Generation (continued): Syntax trees can be used as an intermediate language. Postfix notation can be used as an intermediate language. Three-address code (quadruples) can be used as an intermediate language. We will use quadruples to discuss intermediate code generation. Quadruples are close to machine instructions, but they are not actual machine instructions. Some programming languages have well-defined intermediate languages, for example Java, whose intermediate language is the byte-code of the Java virtual machine (JVM). In fact, there are byte-code interpreters that execute instructions in these intermediate languages.

Variants of syntax trees: Nodes in a syntax tree represent constructs in the source program. The children of a node represent the meaningful components of a construct. A directed acyclic graph (DAG) for an expression identifies the common sub-expressions (sub-expressions that occur more than once) of the expression. DAGs can be constructed by using the same techniques that construct syntax trees.

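Directed acyclic graph for the expression a + a * ( b - c ) + ( b - c ) * d
[Figure: DAG for the expression. The root + has a second + node and a * node as children. The leaf a and the - node (with children b and c) each have two parents.]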

Directed acyclic graphs for expressions: Like the syntax tree for an expression, a DAG has leaves corresponding to atomic operands and interior nodes corresponding to operators. The difference is that a node N in a DAG has more than one parent if N represents a common sub-expression. In a syntax tree, the tree for the common sub-expression would be replicated as many times as the sub-expression appears in the original expression. Thus a DAG not only represents expressions more succinctly, it also gives the compiler important clues regarding the generation of efficient code to evaluate the expressions.

Directed acyclic graphs for expressions (continued): In the DAG for a + a * (b - c) + (b - c) * d, the leaf for a has two parents, because a appears twice in the expression. The two occurrences of the common sub-expression b - c are represented by one node, the node labeled -. That node has two parents, representing its two uses in the sub-expressions a * (b - c) and (b - c) * d.

Three-address code: In three-address code, there is at most one operator on the right side of an instruction. Thus a source-language expression like x + y * z might be translated into the sequence of three-address instructions
    t1 = y * z
    t2 = x + t1
where t1 and t2 are compiler-generated temporary names.
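The translation of x + y * z can be sketched as a post-order walk over the expression that emits one instruction per operator. This is a minimal sketch, assuming expressions are given as nested tuples (op, left, right) with plain strings as operands; the function name gen_tac and this input format are assumptions made for the example.

```python
import itertools


def gen_tac(expr):
    """Return (address holding the result, list of three-address instructions).

    expr is either a string (a name or constant) or a tuple (op, left, right).
    This representation is an assumption made for this sketch.
    """
    counter = itertools.count(1)
    code = []

    def walk(e):
        if isinstance(e, str):
            return e                      # names and constants are addresses
        op, left, right = e
        left_addr = walk(left)            # translate operands first
        right_addr = walk(right)
        temp = f"t{next(counter)}"        # fresh compiler-generated temporary
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    return walk(expr), code


result, code = gen_tac(("+", "x", ("*", "y", "z")))
for line in code:
    print(line)
# t1 = y * z
# t2 = x + t1
```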

Addresses and instructions: Three-address code is built from two concepts: addresses and instructions. Three-address code can be implemented using records with fields for the addresses. Such records are called quadruples and triples. An address can be one of the following. A name: for convenience, we allow source-program names to appear as addresses in three-address code. A constant: in practice, a compiler must deal with many different types of constants. A compiler-generated temporary: it is useful, especially in optimizing compilers, to create a distinct name each time a temporary is needed.

Addresses and instructions (continued): We now consider the common three-address instructions. Assignment instructions of the form x = y op z, where op is a binary arithmetic or logical operation, and x, y, and z are addresses. Assignments of the form x = op y, where op is a unary operation. Copy instructions of the form x = y, where x is assigned the value of y.

Quadruples: The description of three-address instructions specifies the components of each type of instruction. In a compiler, these instructions can be implemented as objects or as records with fields for the operator and the operands. Three such representations are called "quadruples", "triples", and "indirect triples". A quadruple has four fields, which we call op, arg1, arg2, and result. The op field contains an internal code for the operator. For instance, the three-address instruction x = y + z is represented by placing + in op, y in arg1, z in arg2, and x in result.

Quadruples (continued): The following are some exceptions to this rule. Instructions with unary operators, like x = minus y, and copy instructions, like x = y, do not use arg2. Conditional and unconditional jumps put the target label in result.

Three-address code for the assignment a = b * -c + b * -c: The special operator minus is used to distinguish the unary minus operator, as in -c, from the binary minus operator, as in b - c. Note that the unary-minus "three-address" statement has only two addresses, as does the copy statement a = t5. The quadruples in figure (b) implement the three-address code in (a) as follows:

Three-address code for the assignment a = b * -c + b * -c (continued):

(a) Three-address code
    t1 = minus c
    t2 = b * t1
    t3 = minus c
    t4 = b * t3
    t5 = t2 + t4
    a = t5

(b) Quadruples
         op      arg1   arg2   result
    0    minus   c             t1
    1    *       b      t1     t2
    2    minus   c             t3
    3    *       b      t3     t4
    4    +       t2     t4     t5
    5    =       t5            a
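As a rough illustration of quadruples as records, the sketch below stores the six quadruples for a = b * -c + b * -c and prints them back as three-address instructions. The record type Quad, the use of None for an unused arg2, and the helper format_quad are assumptions made for this example.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One quadruple; arg2 is None when the instruction does not use it."""
    op: str
    arg1: str
    arg2: Optional[str]
    result: str


quads = [
    Quad("minus", "c", None, "t1"),
    Quad("*", "b", "t1", "t2"),
    Quad("minus", "c", None, "t3"),
    Quad("*", "b", "t3", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad("=", "t5", None, "a"),
]


def format_quad(q):
    """Render a quadruple as a three-address instruction."""
    if q.op == "=":                       # copy instruction: result = arg1
        return f"{q.result} = {q.arg1}"
    if q.arg2 is None:                    # unary operator: result = op arg1
        return f"{q.result} = {q.op} {q.arg1}"
    return f"{q.result} = {q.arg1} {q.op} {q.arg2}"


for q in quads:
    print(format_quad(q))
```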

Triples: A triple has only three fields, which we call op, arg1, and arg2. Note that the result field in the quadruples of the previous figure is used primarily for temporary names. Using triples, we refer to the result of an operation x op y by its position, rather than by an explicit temporary name. Thus, instead of the temporary t1 in the quadruples, a triple representation would refer to position (0). Parenthesized numbers represent pointers into the triple structure itself. Positions, or pointers to positions, are also called value numbers.

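Representations of a = b * -c + b * -c

(a) Syntax tree
[Figure: syntax tree for the assignment, with two * nodes, each having the children b and minus, and each minus having the child c.]

(b) Triples
         op      arg1   arg2
    0    minus   c
    1    *       b      (0)
    2    minus   c
    3    *       b      (2)
    4    +       (1)    (3)
    5    =       a      (4)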
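The relationship between the two representations can be sketched as a conversion that replaces each temporary name by the position of the triple that computes it. This is only an illustration: the function name quads_to_triples is hypothetical, positions are represented as Python integers, and the sketch assumes that every non-copy instruction writes to a compiler-generated temporary that is assigned exactly once.

```python
def quads_to_triples(quads):
    """Convert (op, arg1, arg2, result) tuples into (op, arg1, arg2) triples.

    Integer arguments in the output stand for parenthesized positions such
    as (0). Assumes each non-copy result is a temporary assigned only once.
    """
    position = {}   # temporary name -> index of the triple that computes it
    triples = []
    for op, arg1, arg2, result in quads:
        a1 = position.get(arg1, arg1)     # replace temporaries by positions
        a2 = position.get(arg2, arg2)
        if op == "=":
            # A copy keeps its target name in arg1 and the value in arg2.
            triples.append((op, result, a1))
        else:
            position[result] = len(triples)
            triples.append((op, a1, a2))
    return triples


quads = [
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]

for i, triple in enumerate(quads_to_triples(quads)):
    print(i, triple)
```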

Types and declarations: The applications of types can be grouped under checking and translation. Type checking uses logical rules to reason about the behavior of a program at run time. Specifically, it ensures that the types of the operands match the type expected by an operator. For example, the && operator in Java expects its two operands to be Boolean, and the result is also of type Boolean. Translation: from the type of a name, a compiler can determine the storage that will be needed for that name at run time. Type information is also needed to calculate the address denoted by an array reference.

Translation of expressions: We begin in this section with the translation of expressions into three-address code. An expression with more than one operator, like a + b * c, will translate into instructions with at most one operator per instruction. An array reference A[i][j] will expand into a sequence of three-address instructions that calculate an address for the reference.

Addressing array elements: Array elements can be accessed quickly if they are stored in a block of consecutive locations. In C and Java, array elements are numbered 0, 1, 2, ..., n-1 for an array with n elements. If the width of each array element is w, then the ith element of array A begins at location base + i × w, where base is the relative address of A[0].
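The address calculation can be checked with a few lines of arithmetic. The first function below is the formula base + i × w from the slide. The second function is an assumption that goes beyond the slide: it extends the formula to a reference such as A[i][j] under a row-major layout, where n2 is the number of elements per row. Both function names are hypothetical.

```python
def element_address(base, i, w):
    """Relative address of A[i] for elements of width w, with A[0] at base."""
    return base + i * w


def element_address_2d(base, i, j, n2, w):
    """Relative address of A[i][j], ASSUMING row-major storage with n2
    elements per row. This layout is not stated in the slides."""
    return base + (i * n2 + j) * w


# An array of 4-byte integers whose first element is at relative address 100.
print(element_address(100, 3, 4))           # address of A[3]
print(element_address_2d(100, 1, 2, 3, 4))  # address of A[1][2], 3 columns
```

With these inputs, A[3] begins at 100 + 3 × 4 = 112, and A[1][2] begins at 100 + (1 × 3 + 2) × 4 = 120.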

Type checking: To do type checking, a compiler needs to assign a type expression to each component of the source program. The compiler must then determine that these type expressions conform to a collection of logical rules that is called the type system for the source language. Type checking has the potential for catching errors in programs.

Type conversions: Consider expressions like x + i, where x is of type float and i is of type integer. Since the representations of integers and floating-point numbers are different within a computer, and different machine instructions are used for operations on integers and floats, the compiler may need to convert one of the operands of + to ensure that both operands are of the same type when the addition occurs. Suppose that integers are converted to floats when necessary, using a unary operator (float). For example, the integer 2 is converted to a float in the code for the expression 2 * 3.14:
    t1 = (float) 2
    t2 = t1 * 3.14
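The conversion rule can be sketched as a small code generator that inserts a (float) instruction whenever an integer operand meets a float operand. This is a simplified sketch with only the two types int and float; the function name gen_binary and its parameters are assumptions made for this example.

```python
import itertools


def gen_binary(op, left, left_type, right, right_type):
    """Emit three-address code for 'left op right', widening int to float.

    Returns (result address, result type, list of instructions).
    Only the types "int" and "float" are handled in this sketch.
    """
    temps = itertools.count(1)
    code = []

    def coerce(addr, ty, target):
        if ty == target:
            return addr                          # no conversion needed
        temp = f"t{next(temps)}"
        code.append(f"{temp} = ({target}) {addr}")
        return temp

    # The result is float as soon as one operand is float.
    result_type = "float" if "float" in (left_type, right_type) else "int"
    left_addr = coerce(left, left_type, result_type)
    right_addr = coerce(right, right_type, result_type)
    temp = f"t{next(temps)}"
    code.append(f"{temp} = {left_addr} {op} {right_addr}")
    return temp, result_type, code


addr, ty, code = gen_binary("*", "2", "int", "3.14", "float")
for line in code:
    print(line)
# t1 = (float) 2
# t2 = t1 * 3.14
```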

Control flow: The translation of statements such as if-else statements and while statements is tied to the translation of Boolean expressions. In programming languages, Boolean expressions are often used for two purposes. Altering the flow of control: Boolean expressions are used as conditional expressions in statements that alter the flow of control. Computing logical values: a Boolean expression can represent true or false as values.

Boolean expressions: Boolean expressions are composed of the Boolean operators (which we denote &&, ||, and !, using the C convention for the operators AND, OR, and NOT, respectively) applied to elements that are Boolean variables or relational expressions. Relational expressions are of the form E1 rel E2, where E1 and E2 are arithmetic expressions. We consider Boolean expressions generated by the following grammar:
    B → B || B | B && B | !B | (B) | E rel E | true | false
We use the attribute rel.op to indicate which of the six comparison operators <, <=, ==, !=, >, >= is represented by rel.

Boolean expressions (continued): Given the expression B1 || B2, if we determine that B1 is true, then we can conclude that the entire expression is true without having to evaluate B2. Similarly, given B1 && B2, if B1 is false, then the entire expression is false.
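This short-circuit behavior can be expressed as jumping code, in which each sub-expression jumps to a "true" target or a "false" target and B2 is simply skipped when B1 already decides the result. The sketch below is one possible illustration, not an optimized translation: it leaves some redundant gotos in place. The tuple encoding of expressions, the function name jumping_code, and the label names are assumptions made for this example.

```python
import itertools


def jumping_code(b, true_label, false_label):
    """Emit short-circuit jumping code for a Boolean expression.

    b is "true", "false", ("rel", op, e1, e2), ("!", b1),
    ("||", b1, b2) or ("&&", b1, b2). This encoding is hypothetical.
    """
    labels = itertools.count(1)
    code = []

    def gen(b, t, f):
        if b == "true":
            code.append(f"goto {t}")
        elif b == "false":
            code.append(f"goto {f}")
        elif b[0] == "rel":
            _, op, e1, e2 = b
            code.append(f"if {e1} {op} {e2} goto {t}")
            code.append(f"goto {f}")
        elif b[0] == "!":
            gen(b[1], f, t)                # swap the two targets
        elif b[0] == "||":
            mid = f"L{next(labels)}"
            gen(b[1], t, mid)              # B1 true: whole expression is true
            code.append(f"{mid}:")
            gen(b[2], t, f)
        elif b[0] == "&&":
            mid = f"L{next(labels)}"
            gen(b[1], mid, f)              # B1 false: whole expression is false
            code.append(f"{mid}:")
            gen(b[2], t, f)
        else:
            raise ValueError(f"unknown Boolean expression: {b!r}")

    gen(b, true_label, false_label)
    return code


# x < 100 || x > 200 && x != y
expr = ("||", ("rel", "<", "x", "100"),
              ("&&", ("rel", ">", "x", "200"), ("rel", "!=", "x", "y")))
for line in jumping_code(expr, "Ltrue", "Lfalse"):
    print(line)
```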

Switch statements: The "switch" or "case" statement is available in a variety of languages. Our switch-statement syntax is given below:
    switch ( E ) {
        case V1 : S1
        case V2 : S2
        ...
        case Vn-1 : Sn-1
        default : Sn
    }
There is a selector expression E, which is to be evaluated, followed by n constant values V1, V2, ..., Vn that the expression might take, perhaps including a default "value", which always matches the expression if no other value does.

Translation of switch-statements: The intended translation of a switch is code to: (1) Evaluate the expression E. (2) Find the value Vj in the list of cases that is the same as the value of the expression. (3) Execute the statement Sj associated with the value found.
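The three steps can be sketched as a function that emits three-address-style code: one instruction to evaluate E, a sequence of tests to find the matching value, and the labeled statements. This is a sketch under several assumptions: the function name translate_switch, the label names, and the temporary t are hypothetical, and a jump to Lnext is emitted after every case, so control does not fall through from one case into the next.

```python
def translate_switch(selector, cases, default):
    """Emit code for a switch with the given selector expression.

    cases is a list of (value, statement) pairs and default is the default
    statement. Statements are plain strings in this sketch.
    """
    code = [f"t = {selector}"]                      # step 1: evaluate E
    for j, (value, _) in enumerate(cases, start=1):
        code.append(f"if t == {value} goto L{j}")   # step 2: find Vj
    code.append("goto Ldefault")                    # no value matched
    for j, (_, stmt) in enumerate(cases, start=1):
        code.append(f"L{j}: {stmt}")                # step 3: execute Sj
        code.append("goto Lnext")                   # assumed: no fall-through
    code.append(f"Ldefault: {default}")
    code.append("Lnext:")
    return code


switch_code = translate_switch("x + 1", [("1", "a = 10"), ("2", "a = 20")], "a = 0")
for line in switch_code:
    print(line)
```

A sequence of tests like this is simple but linear in the number of cases. A compiler could instead use a table of values and labels when there are many cases.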

The end