Intermediate Representation

Slides:



Advertisements
Similar presentations
1 Lecture 3: MIPS Instruction Set Today’s topic:  More MIPS instructions  Procedure call/return Reminder: Assignment 1 is on the class web-page (due.
Advertisements

Intermediate Code Generation
The University of Adelaide, School of Computer Science
CSI 3120, Implementing subprograms, page 1 Implementing subprograms The environment in block-structured languages The structure of the activation stack.
1 Lecture 4: Procedure Calls Today’s topics:  Procedure calls  Large constants  The compilation process Reminder: Assignment 1 is due on Thursday.
1 Compiler Construction Intermediate Code Generation.
Computer Architecture CSCE 350
CPS3340 COMPUTER ARCHITECTURE Fall Semester, /17/2013 Lecture 12: Procedures Instructor: Ashraf Yaseen DEPARTMENT OF MATH & COMPUTER SCIENCE CENTRAL.
Ch. 8 Functions.
The University of Adelaide, School of Computer Science
CSE 5317/4305 L7: Run-Time Storage Organization1 Run-Time Storage Organization Leonidas Fegaras.
Prof. Necula CS 164 Lecture 141 Run-time Environments Lecture 8.
1 Storage Registers vs. memory Access to registers is much faster than access to memory Goal: store as much data as possible in registers Limitations/considerations:
1 Chapter 7: Runtime Environments. int * larger (int a, int b) { if (a > b) return &a; //wrong else return &b; //wrong } int * larger (int *a, int *b)
CS 536 Spring Run-time organization Lecture 19.
3/17/2008Prof. Hilfinger CS 164 Lecture 231 Run-time organization Lecture 23.
CS 536 Spring Code generation I Lecture 20.
CSE 5317/4305 L8: Intermediate Representation1 Intermediate Representation Leonidas Fegaras.
Intro to Computer Architecture
1 Pertemuan 20 Run-Time Environment Matakuliah: T0174 / Teknik Kompilasi Tahun: 2005 Versi: 1/6.
Run time vs. Compile time
Run-time Environment and Program Organization
7/13/20151 Topic 3: Run-Time Environment Memory Model Activation Record Call Convention Storage Allocation Runtime Stack and Heap Garbage Collection.
Chapter 7: Runtime Environment –Run time memory organization. We need to use memory to store: –code –static data (global variables) –dynamic data objects.
Runtime Environments What is in the memory? Runtime Environment2 Outline Memory organization during program execution Static runtime environments.
13/02/2009CA&O Lecture 04 by Engr. Umbreen Sabir Computer Architecture & Organization Instructions: Language of Computer Engr. Umbreen Sabir Computer Engineering.
Runtime Environments Compiler Construction Chapter 7.
Compiler Construction
CPSC 388 – Compiler Design and Construction Runtime Environments.
Lesson 13 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
COP4020 Programming Languages Subroutines and Parameter Passing Prof. Xin Yuan.
RUN-Time Organization Compiler phase— Before writing a code generator, we must decide how to marshal the resources of the target machine (instructions,
COMPILERS Intermediate Code hussein suleman uct csc3003s 2007.
CSC 8505 Compiler Construction Runtime Environments.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 11: Functions and stack frames.
Run-Time Environments Presented By: Seema Gupta 09MCA102.
CS 404 Introduction to Compiler Design
STACKS & QUEUES for CLASS XII ( C++).
Storage Classes There are three places in memory where data may be placed: In Data section declared with .data in assembly language in C - Static) On the.
Run-Time Environments Chapter 7
Computer Science 210 Computer Organization
Names and Attributes Names are a key programming language feature
CS2100 Computer Organisation
Compiler Construction (CS-636)
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
COMPILERS Activation Records
Lecture 4: MIPS Instruction Set
Run-time organization
RISC Concepts, MIPS ISA Logic Design Tutorial 8.
Introduction to Compilers Tim Teitelbaum
Procedures (Functions)
Functions and Procedures
Instructions - Type and Format
Lecture 4: MIPS Instruction Set
Programming Languages (CS 550) Mini Language Compiler
CSCE Fall 2013 Prof. Jennifer L. Welch.
Chapter 6 Intermediate-Code Generation
MIPS Instructions.
The University of Adelaide, School of Computer Science
Understanding Program Address Space
Lecture 5: Procedure Calls
CSCE Fall 2012 Prof. Jennifer L. Welch.
UNIT V Run Time Environments.
Languages and Compilers (SProg og Oversættere)
Run Time Environments 薛智文
Runtime Environments What is in the memory?.
Computer Architecture
Programming Languages (CS 360) Mini Language Compiler
SPL – PS2 C++ Memory Handling.
Presentation transcript:

Intermediate Representation Leonidas Fegaras

Intermediate Representation (IR) The semantic phase of a compiler translates parse trees into an intermediate representation (IR), which is independent of the underlying computer architecture generates machine code from the IRs This makes the task of retargeting the compiler to another computer architecture easier to handle The IR data model includes raw memory (a vector of words/bytes), infinite size registers (unlimited number) data addresses The IR programs are trees that represent instructions in a universal machine architecture

IR (cont.) Some IR specs are actually machine-dependent: 32bit, instead of 64bit addresses some registers have a special meaning (sp, fp, gp, ra) Most IR specs are left unspecified and must be designed: frame layout variable allocation in the static section, in a frame, as a register, etc data layout eg, strings can be designed to be null-terminated (as in C) or with an extra length (as in Java)

IR Example Represents the IR: which evaluates the program: MOVE MEM + + MEM CONST TEMP CONST + 10 fp -16 TEMP CONST fp -20 Represents the IR: MOVE(MEM(+(TEMP(fp),CONST(-16))), +(MEM(+(TEMP(fp),CONST(-20))), CONST(10))) which evaluates the program: M[fp-16] := M[fp-20]+10

Expression IRs CONST(i): the integer constant i MEM(e): if e is an expression that calculates a memory address, then this is the contents of the memory at address e (one word) NAME(n): the address that corresponds to the label n eg. MEM(NAME(x)) returns the value stored at the location X TEMP(t): if t is a temporary register, return the value of the register, eg. MEM(BINOP(PLUS,TEMP(fp),CONST(24))) fetches a word from the stack located 24 bytes above the frame pointer BINOP(op,e1,e2): evaluate e1, evaluate e2, and perform the binary operation op over the results of the evaluations of e1 and e2 op can be PLUS, AND, etc we abbreviate BINOP(PLUS,e1,e2) by +(e1,e2) CALL(f,[e1,e2,...,en]): evaluate the expressions e1, e2, etc (in that order), and at the end call the function f over these n parameters eg. CALL(NAME(g),ExpList(MEM(NAME(a)),ExpList(CONST(1),NULL))) represents the function call g(a,1) ESEQ(s,e): execute statement s and then evaluate and return the value of the expression e

Statement IRs MOVE(TEMP(t),e): store the value of the expression e into the register t MOVE(MEM(e1),e2): evaluate e1 to get an address, then evaluate e2, and then store the value of e2 in the address calculated from e1 eg, MOVE(MEM(+(NAME(x),CONST(16))),CONST(1)) computes x[4] := 1 (since 4*4 bytes = 16 bytes). EXP(e): evaluate e and discard the result JUMP(L): Jump to the address L L must be defined in the program by some LABEL(L) CJUMP(o,e1,e2,t,f): evaluate e1 & e2. If the values of e1 and e2 are related by o, then jump to the address calculated by t, else jump the one for f the binary relational operator o must be EQ, NE, LT etc SEQ(s1,s2,...,sn): perform statement s1, s2, ... sn is sequence LABEL(n): define the name n to be the address of this statement you can retrieve this address using NAME(n)

Local Variables Local variables located in the stack are retrieved using an expression represented by the IR MEM(+(TEMP(fp),CONST(offset))) If a variable is located in an outer static scope k levels higher than the current scope, we follow the static chain k times, and then we retrieve the variable using the offset of the variable eg, if k=3: MEM(+(MEM(+(MEM(+(MEM(+(TEMP(fp),CONST(static))), CONST(static))), CONST(offset))) where static is the offset of the static link (for our frame layout, static = -8)

L-values An l-value is the result of an expression that can occur on the left of an assignment statement eg, x[f(a,6)].y is an l-value It denotes a location where we can store a value It is basically constructed by deriving the IR of the value and then dropping the outermost MEM call For example, if the value is MEM(+(TEMP(fp),CONST(offset))) then the l-value is: +(TEMP(fp),CONST(offset))

Data Layout: Vectors Usually stored in the heap Fixed-size vectors are usually mapped to n consecutive elements otherwise, the vector length is also stored before the elements In Tiger, vectors start from index 0 and each vector element is 4 bytes long (one word), which may represent an integer or a pointer to some value To retrieve the ith element of an array a, we use MEM(+(A,*(I,CONST(4)))) where A is the address of a and I is the value of i But this is not sufficient. The IR should check whether I<size(a): ESEQ(SEQ(CJUMP(gt,I,CONST(size_of_A), NAME(next),NAME(error_label)), LABEL(next)), MEM(+(A,*(I,CONST(4))))) 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9

Records For records, we need to know the byte offset of each field (record attribute) in the base record Since every value is 4 bytes long, the ith field of a structure a can be retrieved using MEM(+(A,CONST(i*4))), where A is the address of a here i is always a constant since we know the field name

Records (cont.) For example, suppose that i is located in the local frame with offset -24 and a is located in the immediate outer scope and has offset -40. Then, the statement a[i+1].first := a[i].second+2 is translated into the IR: MOVE(MEM(MEM(+(A,*(+(I,CONST(1)),CONST(4))))), +(MEM(+(MEM(+(A,*(I,CONST(4)))),CONST(4))), CONST(2))) where I = MEM(+(TEMP(fp),CONST(-24))) and A = MEM(+(MEM(+(TEMP(fp),CONST(-8))),CONST(-40))) since the offset of first is 0 and the offset of second is 4

Strings In Tiger, strings of size n are allocated in the heap in n+4 consecutive bytes, where the first 4 bytes contain the size of the string The string is simply a pointer to the first byte String literals are statically allocated Other languages, such as C, store a string of size n into the heap in n+1 consecutive bytes the last byte has a null value to indicate the end of string Then, you can allocate a string with address A of size n in the heap by adding n+1 to the global pointer (gp): MOVE(A,ESEQ(MOVE(TEMP(gp), +(TEMP(gp),CONST(n+1))), TEMP(gp)))

Control Statements The while loop is evaluated in the following way: while c do body; is evaluated in the following way: loop: if c goto cont else goto done cont: body goto loop done: which corresponds to the following IR: SEQ(LABEL(loop), CJUMP(EQ,c,0,NAME(done),NAME(cont)), LABEL(cont), body, JUMP(NAME(loop)), LABEL(done))

For-Loops The for statement is evaluated in the following way: for i:=lo to hi do body is evaluated in the following way: i := lo j := hi if i>j goto done loop: body i := i+1 if i<=j goto loop done:

Other Control Statements The break statement is translated into a JUMP The compiler keeps track which label to JUMP to on a “break” statement by maintaining a stack of labels that holds the “done:” labels of the for- or while-loop When it compiles a loop, it pushes the label in the stack, and when it exits a loop, it pops the stack The break statement is thus translated into a JUMP to the label at the top of the stack. A function call f(a1,...,an) is translated into the IR CALL(NAME(L),[sl,e1,...,en]) where L is the label of the first statement of the f code, sl is the static link, and ei is the IR for ai For example, if the difference between the static levels of the caller and callee is one, then sl is MEM(+(TEMP(fp),CONST(-8)))

Example Suppose that records and vectors are implemented as pointers (i.e. memory addresses) to dynamically allocated data in the heap. Consider the following declarations: struct { X: int, Y: int, Z: int } S; /* a record */ int i; int V[10][10]; /* a vector of vectors */ where the variables S, i, and V are stored in the current frame with offsets -16, -20, and -24 respectively We will the following abbreviations: S = MEM(+(TEMP(fp),CONST(-16))) I = MEM(+(TEMP(fp),CONST(-20))) V = MEM(+(TEMP(fp),CONST(-24)))

Example (cont.) S.Z+S.X if (i<10) then S.Y := i else i := i-1 +(MEM(+(S,CONST(8))),MEM(S)) if (i<10) then S.Y := i else i := i-1 SEQ(CJUMP(LT,I,CONST(10),trueL,falseL), LABEL(trueL), MOVE(MEM(+(S,CONST(4))),I), JUMP(exit), LABEL(falseL), MOVE(I,-(I,CONST(1))), LABEL(exit))

Example (cont.) V[i][i+1] := V[i][i]+1 for i:=0 to 9 do V[0][i] := i MOVE(MEM(+(MEM(+(V,*(I,CONST(4)))), *(+(I,CONST(1)),CONST(4)))), +(MEM(+(MEM(+(V,*(I,CONST(4)))),*(I,CONST(4)))), CONST(1))) for i:=0 to 9 do V[0][i] := i SEQ(MOVE(I,CONST(0)), MOVE(TEMP(t1),CONST(9)), CJUMP(GT,I,TEMP(t1),done,loop), LABEL(loop), MOVE(MEM(+(MEM(V),*(I,CONST(4)))),I), MOVE(I,+(I,CONST(1))), CJUMP(LEQ,I,TEMP(t1),loop,done), LABEL(done))