CSE 5317/4305 L8: Intermediate Representation
Leonidas Fegaras



Intermediate Representation (IR)

The semantic phase of a compiler
1) translates parse trees into an intermediate representation (IR), which is independent of the underlying computer architecture
2) generates machine code from the IRs

This makes the task of retargeting the compiler to another computer architecture easier to handle.

The IR data model includes
– raw memory (a vector of words/bytes) of infinite size
– registers (an unlimited number)
– data addresses

The IR programs are trees that represent instructions in a universal machine architecture.

IR (cont.)

Some IR specs are actually machine-dependent:
– 32-bit instead of 64-bit addresses
– some registers have a special meaning (sp, fp, gp, ra)

Most IR specs are left unspecified and must be designed:
– frame layout
– variable allocation in the static section, in a frame, as a register, etc.
– data layout, e.g. strings can be designed to be null-terminated (as in C) or to carry an extra length (as in Java)

IR Example

[Figure: IR tree rooted at MOVE, with leaves fp, -16, fp, -20, and 10]

The tree represents the IR:
  MOVE(MEM(+(TEMP(fp),CONST(-16))),
       +(MEM(+(TEMP(fp),CONST(-20))),
         CONST(10)))
which evaluates the program:
– M[fp-16] := M[fp-20]+10

Expression IRs

CONST(i): the integer constant i
MEM(e): if e is an expression that calculates a memory address, then this is the contents of the memory at address e (one word)
NAME(n): the address that corresponds to the label n
– e.g. MEM(NAME(x)) returns the value stored at the location x
TEMP(t): if t is a temporary register, return the value of the register
– e.g. MEM(BINOP(PLUS,TEMP(fp),CONST(24))) fetches a word from the stack located 24 bytes above the frame pointer
BINOP(op,e1,e2): evaluate e1, evaluate e2, and perform the binary operation op over the results of the evaluations of e1 and e2
– op can be PLUS, AND, etc.
– we abbreviate BINOP(PLUS,e1,e2) by +(e1,e2)
CALL(f,[e1,e2,...,en]): evaluate the expressions e1, e2, etc. (in that order), and at the end call the function f over these n parameters
– e.g. CALL(NAME(g),ExpList(MEM(NAME(a)),ExpList(CONST(1),NULL))) represents the function call g(a,1)
ESEQ(s,e): execute statement s and then evaluate and return the value of the expression e
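
The expression forms above map naturally onto small tree-node classes. The sketch below is one possible encoding, not the course's implementation: the class names, the `show` printer, and the operator names MINUS and TIMES are assumptions (the slides name only PLUS and AND). The printer applies the +(e1,e2) abbreviation.

```python
from dataclasses import dataclass

# Hypothetical node classes for a few expression IRs.
@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class Name:
    label: str

@dataclass(frozen=True)
class Temp:
    reg: str

@dataclass(frozen=True)
class Mem:
    addr: object

@dataclass(frozen=True)
class Binop:
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Call:
    func: object
    args: tuple

# Assumed operator names; only PLUS appears in the slides.
SYMBOLS = {"PLUS": "+", "MINUS": "-", "TIMES": "*"}

def show(e):
    """Render a node in the slides' notation, abbreviating arithmetic BINOPs."""
    if isinstance(e, Const):
        return f"CONST({e.value})"
    if isinstance(e, Name):
        return f"NAME({e.label})"
    if isinstance(e, Temp):
        return f"TEMP({e.reg})"
    if isinstance(e, Mem):
        return f"MEM({show(e.addr)})"
    if isinstance(e, Binop):
        head = SYMBOLS.get(e.op)
        if head is None:
            return f"BINOP({e.op},{show(e.left)},{show(e.right)})"
        return f"{head}({show(e.left)},{show(e.right)})"
    if isinstance(e, Call):
        args = ",".join(show(a) for a in e.args)
        return f"CALL({show(e.func)},[{args}])"
    raise TypeError(f"not an expression IR: {e!r}")

stack_word = Mem(Binop("PLUS", Temp("fp"), Const(24)))
call_g = Call(Name("g"), (Mem(Name("a")), Const(1)))
print(show(stack_word))  # MEM(+(TEMP(fp),CONST(24)))
print(show(call_g))      # CALL(NAME(g),[MEM(NAME(a)),CONST(1)])
```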

Statement IRs

MOVE(TEMP(t),e): store the value of the expression e into the register t
MOVE(MEM(e1),e2): evaluate e1 to get an address, then evaluate e2, and then store the value of e2 at the address calculated from e1
– e.g. MOVE(MEM(+(NAME(x),CONST(16))),CONST(1)) computes x[4] := 1 (since 4*4 bytes = 16 bytes)
EXP(e): evaluate e and discard the result
JUMP(L): jump to the address L
– L must be defined in the program by some LABEL(L)
CJUMP(o,e1,e2,t,f): evaluate e1 and e2. If the values of e1 and e2 are related by o, then jump to the address calculated by t, else jump to the one for f
– the binary relational operator o must be EQ, NE, LT, etc.
SEQ(s1,s2,...,sn): perform statements s1, s2, ..., sn in sequence
LABEL(n): define the name n to be the address of this statement
– you can retrieve this address using NAME(n)
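
How LABEL, JUMP, and CJUMP interact is easiest to see by running a flattened statement list. The following is a small sketch under assumptions: statements are tuples, labels are plain strings rather than NAME nodes, and only MOVE into a TEMP is supported. The names `run`, `ev`, and `RELOPS` are hypothetical.

```python
import operator

# Relational operators for CJUMP; the slides list EQ, NE, LT "etc".
RELOPS = {"EQ": operator.eq, "NE": operator.ne,
          "LT": operator.lt, "GT": operator.gt}

def ev(e, temps):
    """Evaluate a CONST, TEMP, or + expression."""
    tag = e[0]
    if tag == "CONST":
        return e[1]
    if tag == "TEMP":
        return temps[e[1]]
    if tag == "+":
        return ev(e[1], temps) + ev(e[2], temps)
    raise ValueError(f"unknown expression node: {tag}")

def run(stmts, temps):
    """Execute a flat list of statement IRs and return the registers."""
    # LABEL(n) names the position of the statement in the list.
    labels = {s[1]: i for i, s in enumerate(stmts) if s[0] == "LABEL"}
    pc = 0
    while pc < len(stmts):
        s = stmts[pc]
        pc += 1
        if s[0] == "MOVE":            # only MOVE(TEMP(t), e) in this sketch
            temps[s[1][1]] = ev(s[2], temps)
        elif s[0] == "JUMP":
            pc = labels[s[1]]
        elif s[0] == "CJUMP":
            _, op, e1, e2, t, f = s
            taken = RELOPS[op](ev(e1, temps), ev(e2, temps))
            pc = labels[t] if taken else labels[f]
        elif s[0] != "LABEL":
            raise ValueError(f"unknown statement: {s[0]}")
    return temps

t = ("TEMP", "t")
prog = [
    ("MOVE", t, ("CONST", 0)),
    ("LABEL", "loop"),
    ("CJUMP", "LT", t, ("CONST", 3), "body", "done"),
    ("LABEL", "body"),
    ("MOVE", t, ("+", t, ("CONST", 1))),
    ("JUMP", "loop"),
    ("LABEL", "done"),
]
result = run(prog, {})
print(result["t"])  # 3
```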

Local Variables

Local variables located in the stack are retrieved using an expression represented by the IR
  MEM(+(TEMP(fp),CONST(offset)))
If a variable is located in an outer static scope k levels higher than the current scope, we follow the static chain k times, and then we retrieve the variable using the offset of the variable
– e.g., if k=3:
  MEM(+(MEM(+(MEM(+(MEM(+(TEMP(fp),CONST(static))),
                    CONST(static))),
              CONST(static))),
        CONST(offset)))
  where static is the offset of the static link (for our frame layout, static = -8)
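
The rule "follow the static chain k times, then add the variable's offset" can be sketched as a function that builds the IR text. The function name `var_access` is hypothetical; the static-link offset of -8 is taken from the frame layout stated above.

```python
STATIC = -8  # offset of the static link in the slides' frame layout

def var_access(k, offset, static=STATIC):
    """Build the IR text for a variable k static levels above the current scope."""
    frame = "TEMP(fp)"
    for _ in range(k):
        # Each hop loads the enclosing frame's address through the static link.
        frame = f"MEM(+({frame},CONST({static})))"
    return f"MEM(+({frame},CONST({offset})))"

print(var_access(0, -24))  # MEM(+(TEMP(fp),CONST(-24)))
print(var_access(1, -40))  # MEM(+(MEM(+(TEMP(fp),CONST(-8))),CONST(-40)))
```

With k=0 the result is the ordinary local-variable access; each extra level wraps the frame address in one more MEM.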

L-values

An l-value is the result of an expression that can occur on the left of an assignment statement
– e.g., x[f(a,6)].y is an l-value
It denotes a location where we can store a value.
It is basically constructed by deriving the IR of the value and then dropping the outermost MEM call.
For example, if the value is
  MEM(+(TEMP(fp),CONST(offset)))
then the l-value is:
  +(TEMP(fp),CONST(offset))

Data Layout: Vectors

Vectors are usually stored in the heap.
Fixed-size vectors are usually mapped to n consecutive elements; otherwise, the vector length is also stored before the elements.
In Tiger, vectors start from index 0 and each vector element is 4 bytes long (one word), which may represent an integer or a pointer to some value.
To retrieve the ith element of an array a, we use
  MEM(+(A,*(I,CONST(4))))
where A is the address of a and I is the value of i.
But this is not sufficient. The IR should check whether I<size(a), continuing at next when the check succeeds and jumping to error_label otherwise:
  ESEQ(SEQ(CJUMP(LT,I,CONST(size_of_A),NAME(next),NAME(error_label)),
           LABEL(next)),
       MEM(+(A,*(I,CONST(4)))))
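
The layout with the length word stored before the elements can be simulated directly. This is a sketch under assumptions: memory is a Python dict from byte address to word, the helper names `alloc_vector` and `load_elem` are hypothetical, and the lower-bound test (i >= 0) is an addition that the IR above does not perform.

```python
WORD = 4  # each vector element is one 4-byte word

def alloc_vector(mem, base, values):
    """Store the length at base, then the elements; return the element address."""
    mem[base] = len(values)
    a = base + WORD
    for i, v in enumerate(values):
        mem[a + i * WORD] = v
    return a

def load_elem(mem, a, i):
    """Bounds-checked equivalent of MEM(+(A,*(I,CONST(4))))."""
    size = mem[a - WORD]          # length word sits just before element 0
    if not (0 <= i < size):       # the i >= 0 check is an assumption of this sketch
        raise IndexError(f"index {i} out of bounds for vector of size {size}")
    return mem[a + i * WORD]

mem = {}
a = alloc_vector(mem, 100, [7, 8, 9])
print(a)                     # 104
print(load_elem(mem, a, 2))  # 9
```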

Records

For records, we need to know the byte offset of each field (record attribute) in the base record.
Since every value is 4 bytes long, the ith field of a structure a can be retrieved using MEM(+(A,CONST(i*4))), where A is the address of a
– here i is always a constant since we know the field name

Records (cont.)

For example, suppose that i is located in the local frame with offset -24 and a is located in the immediate outer scope and has offset -40. Then, the statement
  a[i+1].first := a[i].second+2
is translated into the IR:
  MOVE(MEM(MEM(+(A,*(+(I,CONST(1)),CONST(4))))),
       +(MEM(+(MEM(+(A,*(I,CONST(4)))),CONST(4))),
         CONST(2)))
where
  I = MEM(+(TEMP(fp),CONST(-24)))
  A = MEM(+(MEM(+(TEMP(fp),CONST(-8))),CONST(-40)))
since the offset of first is 0 and the offset of second is 4.
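
The translation above composes two rules: element access for vectors and field access for records. A sketch that rebuilds the same IR text from small helpers follows. All helper names are hypothetical, and omitting the +(..,CONST(0)) for the first field mirrors what the slide does rather than a stated rule.

```python
def mem(e):
    return f"MEM({e})"

def plus(a, b):
    return f"+({a},{b})"

def times(a, b):
    return f"*({a},{b})"

def const(n):
    return f"CONST({n})"

def elem(a, i):
    """a[i] for 4-byte elements: MEM(+(A,*(I,CONST(4))))."""
    return mem(plus(a, times(i, const(4))))

def field(r, k):
    """k-th field (0-based) of the record that r points to."""
    # Offset 0 needs no addition, matching the slide's MEM(MEM(...)) form.
    return mem(r) if k == 0 else mem(plus(r, const(4 * k)))

FIRST, SECOND = 0, 1  # field positions assumed from the stated offsets 0 and 4

dst = field(elem("A", plus("I", const(1))), FIRST)
src = plus(field(elem("A", "I"), SECOND), const(2))
ir = f"MOVE({dst},{src})"
print(ir)
```

Here A and I are left as the abbreviations defined on the slide; substituting their definitions gives the fully expanded tree.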

Strings

In Tiger, strings of size n are allocated in the heap in n+4 consecutive bytes, where the first 4 bytes contain the size of the string.
The string is simply a pointer to the first byte.
String literals are statically allocated.
Other languages, such as C, store a string of size n into the heap in n+1 consecutive bytes
– the last byte has a null value to indicate the end of string
Then, you can allocate a string with address A of size n in the heap by adding n+1 to the global pointer (gp):
  MOVE(A,ESEQ(MOVE(TEMP(gp),+(TEMP(gp),CONST(n+1))),
              TEMP(gp)))
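
The two layouts differ only in where the length information lives, which a few lines of byte manipulation can show. This is an illustrative sketch: the function names are hypothetical, and little-endian byte order for the 4-byte size is an assumption, since the slides do not specify it.

```python
import struct

def tiger_string(s: bytes) -> bytes:
    """Length-prefixed layout: a 4-byte size followed by the n bytes."""
    return struct.pack("<I", len(s)) + s   # "<I" = little-endian, assumed

def c_string(s: bytes) -> bytes:
    """Null-terminated layout: the n bytes followed by one zero byte."""
    return s + b"\x00"

text = b"hello"
print(len(tiger_string(text)))  # 9, i.e. n+4
print(len(c_string(text)))      # 6, i.e. n+1
```

The length-prefixed form gives the size in constant time, while the null-terminated form must be scanned to find its end.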

Control Statements

The while loop
  while c do body;
is evaluated in the following way:
  loop: if c goto cont else goto done
  cont: body
        goto loop
  done:
which corresponds to the following IR:
  SEQ(LABEL(loop),
      CJUMP(EQ,c,CONST(0),NAME(done),NAME(cont)),
      LABEL(cont),
      body,
      JUMP(NAME(loop)),
      LABEL(done))
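
Because loops can be nested, a translator has to generate distinct label names for each loop. The sketch below shows one way to do this; the function `translate_while`, the counter, and the label-naming scheme (loop_0, cont_0, done_0) are assumptions, and the condition and body are passed in as already-translated IR text.

```python
import itertools

_loop_ids = itertools.count()  # hypothetical source of unique loop numbers

def translate_while(cond, body):
    """Build the IR text for 'while cond do body' with fresh labels."""
    n = next(_loop_ids)
    loop, cont, done = f"loop_{n}", f"cont_{n}", f"done_{n}"
    return (f"SEQ(LABEL({loop}),"
            f"CJUMP(EQ,{cond},CONST(0),NAME({done}),NAME({cont})),"
            f"LABEL({cont}),"
            f"{body},"
            f"JUMP(NAME({loop})),"
            f"LABEL({done}))")

ir = translate_while("c", "body")
print(ir)
```

The test against CONST(0) treats a zero condition as false, so the loop exits to the done label exactly when the condition fails.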

For-Loops

The for statement
  for i:=lo to hi do body
is evaluated in the following way:
        i := lo
        j := hi
        if i>j goto done
  loop: body
        i := i+1
        if i<=j goto loop
  done:
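
The lowered control flow above has two properties worth checking: the upper bound is evaluated once, and the body never runs when lo > hi. A direct Python transcription makes both visible. The function `run_for` and the callback-style body are assumptions made for illustration.

```python
def run_for(lo, hi, body):
    """Mirror the lowered for-loop: guard, body, increment, test at the bottom."""
    i = lo
    j = hi              # the upper bound is copied once, before the loop
    if i > j:
        return          # 'goto done' without executing the body
    while True:
        body(i)         # loop: body
        i = i + 1
        if not (i <= j):
            break       # fall through to done

seen = []
run_for(0, 9, seen.append)
print(seen)             # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

empty = []
run_for(5, 4, empty.append)
print(empty)            # []
```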

Other Control Statements

The break statement is translated into a JUMP
– The compiler keeps track of which label to JUMP to on a "break" statement by maintaining a stack of labels that holds the "done:" labels of the enclosing for- or while-loops
– When it compiles a loop, it pushes the label onto the stack, and when it exits a loop, it pops the stack
– The break statement is thus translated into a JUMP to the label at the top of the stack

A function call f(a1,...,an) is translated into the IR
  CALL(NAME(L),[sl,e1,...,en])
where L is the label of the first statement of the f code, sl is the static link, and ei is the IR for ai.
For example, if the difference between the static levels of the caller and callee is one, then sl is MEM(+(TEMP(fp),CONST(-8)))
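
The label stack used for break can be sketched as a small helper object. The class name `LoopContext`, its method names, and the choice to raise `SyntaxError` for a break outside any loop are assumptions; the slides only describe the push, pop, and top-of-stack behavior.

```python
class LoopContext:
    """Tracks the 'done' labels of the loops currently being compiled."""

    def __init__(self):
        self.done_labels = []

    def enter_loop(self, done_label):
        self.done_labels.append(done_label)   # push when compiling a loop

    def exit_loop(self):
        self.done_labels.pop()                # pop when leaving the loop

    def translate_break(self):
        if not self.done_labels:
            raise SyntaxError("break outside of a loop")
        # Jump to the done label of the innermost enclosing loop.
        return f"JUMP(NAME({self.done_labels[-1]}))"

ctx = LoopContext()
ctx.enter_loop("done_outer")
ctx.enter_loop("done_inner")
inner_break = ctx.translate_break()
ctx.exit_loop()
outer_break = ctx.translate_break()
ctx.exit_loop()
print(inner_break)  # JUMP(NAME(done_inner))
print(outer_break)  # JUMP(NAME(done_outer))
```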

Example

Suppose that records and vectors are implemented as pointers (i.e. memory addresses) to dynamically allocated data in the heap. Consider the following declarations:
  struct { X: int, Y: int, Z: int } S;   /* a record */
  int i;
  int V[10][10];                         /* a vector of vectors */
where the variables S, i, and V are stored in the current frame with offsets -16, -20, and -24 respectively.
We will use the following abbreviations:
  S = MEM(+(TEMP(fp),CONST(-16)))
  I = MEM(+(TEMP(fp),CONST(-20)))
  V = MEM(+(TEMP(fp),CONST(-24)))

Example (cont.)

S.Z+S.X
  +(MEM(+(S,CONST(8))),MEM(S))

if (i<10) then S.Y := i else i := i-1
  SEQ(CJUMP(LT,I,CONST(10),trueL,falseL),
      LABEL(trueL),
      MOVE(MEM(+(S,CONST(4))),I),
      JUMP(exit),
      LABEL(falseL),
      MOVE(I,-(I,CONST(1))),
      JUMP(exit),
      LABEL(exit))

Example (cont.)

V[i][i+1] := V[i][i]+1
  MOVE(MEM(+(MEM(+(V,*(I,CONST(4)))),
             *(+(I,CONST(1)),CONST(4)))),
       +(MEM(+(MEM(+(V,*(I,CONST(4)))),*(I,CONST(4)))),
         CONST(1)))

for i:=0 to 9 do V[0][i] := i
  SEQ(MOVE(I,CONST(0)),
      MOVE(TEMP(t1),CONST(9)),
      CJUMP(GT,I,TEMP(t1),done,loop),
      LABEL(loop),
      MOVE(MEM(+(MEM(V),*(I,CONST(4)))),I),
      MOVE(I,+(I,CONST(1))),
      CJUMP(LEQ,I,TEMP(t1),loop,done),
      LABEL(done))