COMPILERS Intermediate Code hussein suleman uct csc3003s 2007.


IR Trees
- An Intermediate Representation (IR) is a machine-independent representation of the instructions that must be generated.
- We translate ASTs into IR trees using a set of rules for each kind of node.
- Why use IR?
  - Optimisations are easier to apply to IR.
  - IR is simpler than real machine code.
  - It separates the front-end from the back-end.

IR Trees – Expressions 1/2
- CONST i: integer constant i
- NAME n: symbolic constant n
- TEMP t: temporary t (a register)
- MEM m: contents of a word of memory starting at address m

IR Trees – Expressions 2/2
- BINOP(op, e1, e2): binary operator, e1 op e2. Evaluate e1, then e2, then apply op to the two results.
- CALL(f, [e1, ..., en]): procedure call. Evaluate f, then the arguments in order, then call f.
- ESEQ(s, e): evaluate s for its side effects, then e for the result.

IR Trees – Statements 1/2
- MOVE(TEMP t, e): evaluate e, then move the result to temporary t.
- MOVE(MEM(e1), e2): evaluate e1 giving address a, then evaluate e2 and move the result to address a.
- EXP(e): evaluate e, then discard the result.

IR Trees – Statements 2/2
- JUMP(e, [l1, ..., ln]): transfer control to address e; the optional labels l1..ln are the possible values of e.
- CJUMP(op, e1, e2, t, f): evaluate e1, then e2; compare the results using relational operator op; jump to t if true, f if false.
- SEQ(s1, s2): statement s1 followed by statement s2.
- LABEL(n): define the constant value of name n as the current code address; NAME(n) can then be used as the target of jumps, calls, etc.

Expression Classes 1/2
- Expression classes are an abstraction that supports conversion between kinds of translated code (expressions, statements, etc.).
- Each construct is represented in its natural form and then "cast" to the form needed where it is used.
- Expression classes are not necessary in a compiler, but they make this conversion easier when generating code.

Expression Classes 2/2
- Ex(exp): expressions that compute a value
- Nx(stm): statements that compute no value, but may have side-effects
- RelCx(op, l, r): conditionals that encode conditional expressions (jump to true and false destinations)

Casting Expressions
- Conversion operators allow one form to be used in the context of another:
  - unEx: convert to a tree expression that computes the value of the inner tree.
  - unNx: convert to a tree statement that computes the inner tree but returns no value.
  - unCx(t, f): convert to a statement that evaluates the inner tree and branches to the true destination t if the value is non-zero, and to the false destination f otherwise.
- Trivially, unEx(Ex(e)) = e
- Trivially, unNx(Nx(s)) = s
- But unNx(Ex(e)) = MOVE(TEMP t, e), where t is a fresh temporary whose value is never used

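Translation: Simple Variables
- A simple variable v in the current procedure's stack frame translates to:
    MEM(BINOP(PLUS, TEMP fp, CONST k))
- This could be abbreviated to:
    MEM(+(TEMP fp, CONST k))
- Here fp is the frame pointer and k is the offset of v within the frame.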

Expression Example
- Consider the statement: A = (B + 23) * 4;
- This would get translated into:
    Nx(MOVE(MEM(+(TEMP fp, CONST k_A)),
            *(+(MEM(+(TEMP fp, CONST k_B)),
                CONST 23),
              CONST 4)))
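To make the evaluation rules concrete, the sketch below runs the tree for A = (B + 23) * 4 on a toy interpreter. The `Machine` class, the `var` helper, the frame offsets 0 and 4, and the frame pointer value 1000 are all assumptions made for illustration; only the node names come from the slides.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class CONST:
    value: int


@dataclass(frozen=True)
class TEMP:
    name: str


@dataclass(frozen=True)
class MEM:
    addr: object


@dataclass(frozen=True)
class BINOP:
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class MOVE:
    dst: object
    src: object


OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


class Machine:
    """Toy machine (hypothetical): temporaries plus a dict from address to word."""

    def __init__(self, temps, memory):
        self.temps = dict(temps)
        self.memory = dict(memory)

    def eval(self, e):
        if isinstance(e, CONST):
            return e.value
        if isinstance(e, TEMP):
            return self.temps[e.name]
        if isinstance(e, MEM):
            return self.memory[self.eval(e.addr)]
        if isinstance(e, BINOP):
            left = self.eval(e.left)    # e1 is evaluated first,
            right = self.eval(e.right)  # then e2, then op is applied
            return OPS[e.op](left, right)
        raise TypeError(f"not an expression: {e!r}")

    def exec(self, s):
        if not isinstance(s, MOVE):
            raise TypeError(f"unsupported statement: {s!r}")
        if isinstance(s.dst, TEMP):
            self.temps[s.dst.name] = self.eval(s.src)
        elif isinstance(s.dst, MEM):
            addr = self.eval(s.dst.addr)          # address first,
            self.memory[addr] = self.eval(s.src)  # then the value to store
        else:
            raise TypeError(f"bad MOVE destination: {s.dst!r}")


def var(k):
    """Simple variable at offset k in the current frame."""
    return MEM(BINOP("+", TEMP("fp"), CONST(k)))


K_A, K_B = 0, 4  # assumed frame offsets of A and B
stmt = MOVE(var(K_A), BINOP("*", BINOP("+", var(K_B), CONST(23)), CONST(4)))

m = Machine(temps={"fp": 1000}, memory={1004: 7})  # B holds 7
m.exec(stmt)
print(m.memory[1000])  # (7 + 23) * 4 = 120
```

The interpreter covers only the node kinds this one statement needs; CALL, ESEQ, JUMP and the rest are left out to keep the sketch short.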

Simple Array Variables
- C-like arrays are pointers to the array base, so fetch the pointer with a MEM like any other variable:
    Ex(MEM(+(TEMP fp, CONST k)))
- Thus, for e[i]:
    Ex(MEM(+(e.unEx, *(i.unEx, CONST w))))
  where i is the index expression and w is the word size; all values are word-sized (scalar).
- Note: must first check that the array index satisfies i < size(e); the runtime can put the size in the word preceding the array base.

Array Creation
- t[e1] of e2:
    Ex(externalCall("initArray", [e1.unEx, e2.unEx]))

General 1-Dimensional Arrays
- var a : ARRAY [2..5] of integer;
- a[e] translates to:
    MEM(+(TEMP fp, +(CONST (k - 2w), *(CONST w, e.unEx))))
  where k is the offset of the static array from fp and w is the word size. The constant 2w accounts for the lower bound 2, so a[2] is at offset k.
- In Pascal, multidimensional arrays are treated as arrays of arrays, so A[i,j] is equivalent to A[i][j] and can be translated as above.

Multidimensional Arrays 1/3
- Array layout, contiguous:
  - Row major
    - Rightmost subscript varies most quickly: A[1,1], A[1,2], ..., A[2,1], A[2,2], ...
    - Used in PL/1, Algol, Pascal, C, Ada, Modula-3
  - Column major
    - Leftmost subscript varies most quickly: A[1,1], A[2,1], ..., A[1,2], A[2,2], ...
    - Used in FORTRAN
- Array layout, by vectors:
  - Contiguous vector of pointers to (non-contiguous) subarrays

Multidimensional Arrays 2/3
- array [1..N, 1..M] of T is equivalent to array [1..N] of array [1..M] of T
- Number of elements in dimension j: $D_j = U_j - L_j + 1$
- Memory address of $A[i_1, \ldots, i_n]$:
  $$\mathrm{addr}(A[L_1, \ldots, L_n]) + \mathrm{sizeof}(T) \cdot \bigl[ (i_n - L_n) + (i_{n-1} - L_{n-1}) D_n + (i_{n-2} - L_{n-2}) D_n D_{n-1} + \cdots + (i_1 - L_1) D_n D_{n-1} \cdots D_2 \bigr]$$

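Multidimensional Arrays 3/3
- The bracketed offset in the address formula can be rewritten as a variable part minus a constant part:
  - Variable part: $i_1 D_2 \cdots D_n + i_2 D_3 \cdots D_n + \cdots + i_{n-1} D_n + i_n$
  - Constant part: $L_1 D_2 \cdots D_n + L_2 D_3 \cdots D_n + \cdots + L_{n-1} D_n + L_n$
- Address of $A[i_1, \ldots, i_n]$: address(A) + ((variable part - constant part) * element size), where address(A) is the address of the first element $A[L_1, \ldots, L_n]$.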
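The variable part and constant part can each be accumulated in a single pass by multiplying the running sum by D_j before adding the next term. The function below is a minimal sketch of that row-major calculation; the name `element_address` and its parameters are assumptions for illustration, and `base` stands for the address of A[L_1, ..., L_n].

```python
def element_address(base, bounds, index, elem_size):
    """Row-major address of A[index].

    base      -- address of the first element A[L_1, ..., L_n]
    bounds    -- list of (L_j, U_j) pairs, one per dimension
    index     -- tuple (i_1, ..., i_n)
    elem_size -- sizeof(T)
    """
    variable = 0  # i_1*D_2*...*D_n + ... + i_{n-1}*D_n + i_n
    constant = 0  # L_1*D_2*...*D_n + ... + L_{n-1}*D_n + L_n
    for (low, high), i in zip(bounds, index):
        d = high - low + 1  # D_j = U_j - L_j + 1
        variable = variable * d + i
        constant = constant * d + low
    return base + (variable - constant) * elem_size


# array [1..3, 1..4] of a 4-byte type, first element at address 1000
print(element_address(1000, [(1, 3), (1, 4)], (2, 3), 4))  # 1024
```

When the bounds are known statically, the constant part depends only on them, so a compiler can fold it into a single constant and compute just the variable part at run time.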

Record Variables
- Records are pointers to the record base, so fetch them like other variables. For e.f:
    Ex(MEM(+(e.unEx, CONST o)))
  where o is the byte offset of the field in the record.
- Note: must check that the record pointer is non-nil (i.e., non-zero).

Record Creation
- t{f1=e1; f2=e2; ...; fn=en}: in the (preferably GC'd) heap, first allocate the space, then initialize it:
    Ex(ESEQ(SEQ(MOVE(TEMP r, externalCall("allocRecord", [CONST n])),
                SEQ(MOVE(MEM(TEMP r), e1.unEx),
                    SEQ(...,
                        MOVE(MEM(+(TEMP r, CONST (n-1)w)), en.unEx)))),
            TEMP r))
  where w is the word size.

String Literals
- Statically allocated, so just use the string's label:
    Ex(NAME(label))
- The literal will be emitted as:
            .word 11
    label:  .ascii "hello world"

Comparisons
- Translate a op b as:
    RelCx(op, a.unEx, b.unEx)
- When used as a conditional, unCx(t, f) yields:
    CJUMP(op, a.unEx, b.unEx, t, f)
  where t and f are labels.
- When used as a value, unEx yields:
    ESEQ(SEQ(MOVE(TEMP r, CONST 1),
             SEQ(unCx(t, f),
                 SEQ(LABEL f,
                     SEQ(MOVE(TEMP r, CONST 0),
                         LABEL t)))),
         TEMP r)
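The three expression classes and their casts can be sketched as small Python classes that build IR trees as nested tuples. The tuple encoding, the helper names `new_temp`, `new_label` and `seq`, and the choice to store operands as expression-class objects are assumptions made for illustration; the shapes of the trees follow the slides.

```python
import itertools

_ids = itertools.count()


def new_temp():
    return ("TEMP", f"r{next(_ids)}")


def new_label():
    return f"L{next(_ids)}"


def seq(*stms):
    """Nest statements to the right: SEQ(s1, SEQ(s2, ... sn))."""
    tree = stms[-1]
    for s in reversed(stms[:-1]):
        tree = ("SEQ", s, tree)
    return tree


class Ex:
    """An expression that computes a value."""

    def __init__(self, exp):
        self.exp = exp

    def un_ex(self):
        return self.exp

    def un_nx(self):
        # Compute the value into a fresh temporary that is never read.
        return ("MOVE", new_temp(), self.exp)

    def un_cx(self, t, f):
        # Branch to t if the value is non-zero, to f otherwise.
        return ("CJUMP", "NE", self.exp, ("CONST", 0), t, f)


class Nx:
    """A statement that computes no value."""

    def __init__(self, stm):
        self.stm = stm

    def un_nx(self):
        return self.stm


class RelCx:
    """A comparison 'left op right' of two expression-class objects."""

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def un_cx(self, t, f):
        return ("CJUMP", self.op, self.left.un_ex(), self.right.un_ex(), t, f)

    def un_ex(self):
        r, t, f = new_temp(), new_label(), new_label()
        return ("ESEQ",
                seq(("MOVE", r, ("CONST", 1)),
                    self.un_cx(t, f),
                    ("LABEL", f),
                    ("MOVE", r, ("CONST", 0)),
                    ("LABEL", t)),
                r)


a, b = ("TEMP", "a"), ("TEMP", "b")
cmp_ab = RelCx("LT", Ex(a), Ex(b))
print(cmp_ab.un_cx("then", "else"))
print(cmp_ab.un_ex())
```

The same comparison therefore produces a single CJUMP when it controls a branch, and only pays for the temporary and the two labels when its 0/1 value is actually needed.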

If Expressions 1/3 [not for exams]
- If statements used as expressions are best considered as a special expression class to avoid spaghetti JUMPs.
- Translate if e1 then e2 else e3 into:
    IfThenElseExp(e1, e2, e3)
- When used as a value, unEx yields:
    ESEQ(SEQ(SEQ(e1.unCx(t, f),
                 SEQ(SEQ(LABEL t,
                         SEQ(MOVE(TEMP r, e2.unEx),
                             JUMP join)),
                     SEQ(LABEL f,
                         SEQ(MOVE(TEMP r, e3.unEx),
                             JUMP join)))),
             LABEL join),
         TEMP r)

If Expressions 2/3
- As a conditional, unCx(t, f) yields:
    SEQ(e1.unCx(tt, ff),
        SEQ(SEQ(LABEL tt, e2.unCx(t, f)),
            SEQ(LABEL ff, e3.unCx(t, f))))

If Expressions 3/3
- Applying unCx(t, f) to "if x < 5 then a > b else 0":
    SEQ(CJUMP(LT, x.unEx, CONST 5, tt, ff),
        SEQ(SEQ(LABEL tt, CJUMP(GT, a.unEx, b.unEx, t, f)),
            SEQ(LABEL ff, JUMP f)))
- Or, more optimally:
    SEQ(CJUMP(LT, x.unEx, CONST 5, tt, f),
        SEQ(LABEL tt, CJUMP(GT, a.unEx, b.unEx, t, f)))

While Loops 1/2
- while c do s:
  - evaluate c
  - if false, jump to the next statement after the loop
  - if true, fall into the loop body
  - branch to the top of the loop
- e.g.,
    test:
        if not(c) jump done
        s
        jump test
    done:

While Loops 2/2
- The tree produced is:
    Nx(SEQ(SEQ(SEQ(LABEL test, c.unCx(body, done)),
               SEQ(SEQ(LABEL body, s.unNx),
                   JUMP(NAME test))),
           LABEL done))
- repeat e1 until e2 is the same, with the evaluate/compare/branch at the bottom of the loop.
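A minimal sketch of building that tree, again with IR nodes as nested tuples. The function names `translate_while` and `flatten`, the label generator, and the convention of passing the condition as a callable that takes the two target labels are assumptions for illustration.

```python
import itertools

_labels = itertools.count()


def new_label(prefix):
    return f"{prefix}{next(_labels)}"


def translate_while(cond_un_cx, body_stm):
    """Build the while-loop tree.

    cond_un_cx(t, f) returns a statement that jumps to label t when the
    condition holds and to label f otherwise; body_stm is an IR statement.
    """
    test, body, done = new_label("test"), new_label("body"), new_label("done")
    return ("SEQ",
            ("SEQ",
             ("SEQ", ("LABEL", test), cond_un_cx(body, done)),
             ("SEQ",
              ("SEQ", ("LABEL", body), body_stm),
              ("JUMP", ("NAME", test)))),
            ("LABEL", done))


def flatten(stm):
    """List the statements of a SEQ tree in execution order."""
    if stm[0] == "SEQ":
        return flatten(stm[1]) + flatten(stm[2])
    return [stm]


# while i < 10 do i := i + 1
cond = lambda t, f: ("CJUMP", "LT", ("TEMP", "i"), ("CONST", 10), t, f)
body = ("MOVE", ("TEMP", "i"), ("BINOP", "PLUS", ("TEMP", "i"), ("CONST", 1)))

tree = translate_while(cond, body)
for s in flatten(tree):
    print(s)
```

Flattening the tree gives the same order as the test/body/done outline: label, conditional jump, body label, body, jump back, done label.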

For Loops 1/2
- for i := e1 to e2 do s
  - evaluate the lower bound into the index variable
  - evaluate the upper bound into the limit variable
  - if index > limit, jump to the next statement after the loop
  - fall through to the loop body
  - increment index
  - if index <= limit, jump to the top of the loop body

For Loops 2/2
        t1 <- e1
        t2 <- e2
        if t1 > t2 jump done
    body:
        s
        t1 <- t1 + 1
        if t1 <= t2 jump body
    done:
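The lowered form can be checked by transcribing it step by step. The sketch below mirrors each line of the outline in Python; the bottom-of-loop test includes equality so that the body also runs for the value equal to the limit. The function name `run_for` and the callback-style body are assumptions for illustration.

```python
def run_for(e1, e2, body):
    """Execute 'for i := e1 to e2 do body(i)' in its lowered form."""
    t1 = e1                # t1 <- e1
    t2 = e2                # t2 <- e2
    if t1 > t2:            # if t1 > t2 jump done
        return
    while True:            # body:
        body(t1)           #     s
        t1 = t1 + 1        #     t1 <- t1 + 1
        if not t1 <= t2:   #     if t1 <= t2 jump body
            break
                           # done:


visited = []
run_for(1, 3, visited.append)
print(visited)  # [1, 2, 3]
```

The up-front test handles an empty range, so the body never runs when the lower bound exceeds the upper bound.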

Break Statements
- When translating a loop, push the done label on a stack.
- break simply jumps to the label on top of the stack.
- When done translating the loop and its body, pop the label.
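The push/jump/pop discipline can be sketched with a tiny translator over a made-up statement form. The `LoopTranslator` class, the tuple-based input (`"while"`, `"break"`, `"stmt"`), and the textual output are all assumptions for illustration, not part of the course compiler.

```python
import itertools


class LoopTranslator:
    """Translates nested while/break statements into labelled pseudo-code."""

    def __init__(self):
        self._ids = itertools.count()
        self._done_stack = []  # done labels of the enclosing loops

    def _new_label(self, prefix):
        return f"{prefix}{next(self._ids)}"

    def translate(self, node):
        kind = node[0]
        if kind == "while":
            _, cond, body = node
            test, done = self._new_label("test"), self._new_label("done")
            self._done_stack.append(done)      # push when entering the loop
            try:
                inner = [line for s in body for line in self.translate(s)]
            finally:
                self._done_stack.pop()         # pop when the loop is finished
            return [f"{test}:", f"if not({cond}) jump {done}",
                    *inner, f"jump {test}", f"{done}:"]
        if kind == "break":
            if not self._done_stack:
                raise SyntaxError("break outside of a loop")
            return [f"jump {self._done_stack[-1]}"]  # innermost loop's done
        if kind == "stmt":
            return [node[1]]
        raise ValueError(f"unknown node kind: {kind!r}")


program = ("while", "a", [("while", "b", [("break",)]), ("break",)])
code = LoopTranslator().translate(program)
print("\n".join(code))
```

Because the stack is popped as soon as the inner loop has been translated, the second break jumps to the outer loop's done label rather than the inner one.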

Case Statement 1/3
- case E of V1: S1 ... Vn: Sn end
  - evaluate the expression
  - find the value in the list equal to the value of the expression
  - execute the statement associated with the value found
  - jump to the next statement after the case
- Key issue: finding the right case
  - sequence of conditional jumps (small case set): O(|cases|)
  - binary search of an ordered jump table (sparse case set): O(log2 |cases|)
  - hash table (dense case set): O(1)

Case Statement 2/3
- case E of V1: S1 ... Vn: Sn end
- One translation approach:
            t := expr
            jump test
    L1:     code for S1; jump next
    L2:     code for S2; jump next
            ...
    Ln:     code for Sn; jump next
    test:   if t = V1 jump L1
            if t = V2 jump L2
            ...
            if t = Vn jump Ln
            code to raise run-time exception
    next:

Case Statement 3/3
- Another translation approach:
            t := expr
            check t in bounds of 0..n-1; if not, code to raise run-time exception
            jump jtable + t
    L1:     code for S1; jump next
    L2:     code for S2; jump next
            ...
    Ln:     code for Sn; jump next
    jtable: jump L1
            jump L2
            ...
            jump Ln
    next:
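The dispatch strategies differ only in how the matching case is found. The sketch below models each case body as a Python callable and shows a chain of tests, a binary search over sorted case values, and a bounds-checked jump table. All function names are assumptions for illustration.

```python
import bisect


def dispatch_linear(t, cases):
    """Sequence of conditional jumps: one comparison per case."""
    for value, action in cases:
        if t == value:
            return action()
    raise ValueError(f"no case matches {t!r}")


def dispatch_binary_search(t, sorted_values, actions):
    """Binary search over case values kept in ascending order."""
    i = bisect.bisect_left(sorted_values, t)
    if i < len(sorted_values) and sorted_values[i] == t:
        return actions[i]()
    raise ValueError(f"no case matches {t!r}")


def dispatch_jump_table(t, jtable):
    """Jump table: bounds check, then a single indexed jump."""
    if not 0 <= t < len(jtable):
        raise ValueError(f"case value {t!r} out of bounds")
    return jtable[t]()


actions = [lambda: "first", lambda: "second", lambda: "third"]

print(dispatch_linear(2, list(zip([0, 1, 2], actions))))
print(dispatch_binary_search(40, [10, 40, 900], actions))
print(dispatch_jump_table(1, actions))
```

The jump table needs case values that map onto 0..n-1, which is why it suits dense case sets; the other two work on arbitrary values at the cost of more comparisons.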

Function Calls
- f(e1, ..., en):
    Ex(CALL(NAME label_f, [sl, e1, ..., en]))
  where sl is the static link for the callee f.
- Non-local references can be found by following m static links from the caller, m being the difference between the levels of the caller and the callee.
- In OO languages, you can also explicitly pass "this".
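Following static links can be sketched with frames that each point to the frame of their lexically enclosing procedure. The `Frame` class and the `lookup` function are assumptions for illustration; here a variable declared at nesting level d and accessed from a frame at level n is reached by following n - d links.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Frame:
    level: int                       # lexical nesting depth of the procedure
    static_link: Optional["Frame"]   # frame of the enclosing procedure
    locals: dict = field(default_factory=dict)


def lookup(frame, decl_level, name):
    """Read a variable declared at nesting level decl_level."""
    hops = frame.level - decl_level  # number of static links to follow
    for _ in range(hops):
        frame = frame.static_link
    return frame.locals[name]


# main (level 0) encloses f (level 1), which encloses g (level 2)
main = Frame(level=0, static_link=None, locals={"x": 10})
f = Frame(level=1, static_link=main, locals={"y": 20})
g = Frame(level=2, static_link=f, locals={"z": 30})

print(lookup(g, 0, "x"))  # 10, two links up from g
print(lookup(g, 1, "y"))  # 20, one link up from g
print(lookup(g, 2, "z"))  # 30, found in g itself
```

In generated IR each hop corresponds to one MEM fetch of the static-link slot of the current frame, so deeper non-local accesses cost more loads.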