Intermediate Representation I High-Level to Low-Level IR Translation.

Intermediate Representation I High-Level to Low-Level IR Translation EECS 483 – Lecture 17 University of Michigan Monday, November 6, 2006.

- 1 - Where We Are... Source code (character stream) -> Lexical Analysis (regular expressions) -> token stream -> Syntax Analysis (grammars) -> abstract syntax tree -> Semantic Analysis (static semantics) -> abstract syntax tree + symbol tables, types -> Intermediate Code Gen -> intermediate code

- 2 - Intermediate Representation (aka IR) v The compiler's internal representation »Is language-independent and machine-independent »Enables machine-independent and machine-dependent optimizations (Diagram: AST -> IR -> Pentium / Java bytecode / Itanium / TI C5x / ARM, with optimization performed on the IR.)

- 3 - What Makes a Good IR? v Captures high-level language constructs »Easy to translate from AST »Supports high-level optimizations v Captures low-level machine features »Easy to translate to assembly »Supports machine-dependent optimizations v Narrow interface: small number of node types (instructions) »Easy to optimize »Easy to retarget

- 4 - Multiple IRs v Most compilers use 2 IRs: »High-level IR (HIR): Language independent but closer to the language »Low-level IR (LIR): Machine independent but closer to the machine »A significant part of the compiler is both language and machine independent! (Diagram: C++ / C / Fortran -> AST -> HIR -> optimize -> LIR -> optimize -> Pentium / Java bytecode / Itanium / TI C5x / ARM.)

- 5 - High-Level IR v HIR is essentially the AST »Must be expressive for all input languages v Preserves high-level language constructs »Structured control flow: if, while, for, switch »Variables, expressions, statements, functions v Allows high-level optimizations based on properties of source language »Function inlining, memory dependence analysis, loop transformations

- 6 - Low-Level IR v A set of instructions which emulates an abstract machine (typically RISC) v Has low-level constructs »Unstructured jumps, registers, memory locations v Types of instructions »Arithmetic/logic (a = b OP c), unary operations, data movement (move, load, store), function call/return, branches

- 7 - Alternatives for LIR v 3 general alternatives »Three-address code or quadruples  a = b OP c  Advantage: Makes compiler analysis/optimization easier »Tree representation  Was popular for CISC architectures  Advantage: Easier to generate machine code »Stack machine  Like Java bytecode  Advantage: Easier to generate from AST

- 8 - Three-Address Code v a = b OP c »Originally, because instructions had at most 3 addresses or operands  This is not enforced today, e.g., multiply-accumulate (MAC): a = b * c + d »May have fewer operands v Also called quadruples: (a,b,c,OP) v Example a = (b+c) * (-e) t1 = b + c t2 = -e a = t1 * t2 (t1 and t2 are compiler-generated temporary variables)

- 9 - IR Instructions v Assignment instructions »a = b OP c (binary op)  arithmetic: ADD, SUB, MUL, DIV, MOD  logic: AND, OR, XOR  comparisons: EQ, NEQ, LT, GT, LEQ, GEQ »a = OP b (unary op)  arithmetic MINUS, logical NEG »a = b : copy instruction »a = [b] : load instruction »[a] = b : store instruction »a = addr b: symbolic address v Flow of control »label L: label instruction »jump L: unconditional jump »cjump a L : conditional jump v Function call »call f(a1,..., an) »a = call f(a1,..., an) v IR describes the instruction set of an abstract machine

IR Operands v The operands in 3-address code can be: »Program variables »Constants or literals »Temporary variables v Temporary variables = new locations »Used to store intermediate values »Needed because 3-address code is not as expressive as high-level languages

Class Problem Convert the following code segment to assembly code: n = 0; while (n < 10) { n = n+1; }

Translating High IR to Low IR v May have nested language constructs »E.g., while nested within an if statement v Need an algorithmic way to translate »Strategy for each high IR construct »High IR construct  sequence of low IR instructions v Solution »Start from the high IR (AST like) representation »Define translation for each node in high IR »Recursively translate nodes

Notation v Use the following notation: »[[e]] = the low IR representation of high IR construct e v [[e]] is a sequence of low IR instructions v If e is an expression (or statement expression), it represents a value »Denoted as: t = [[e]] »Low IR representation of e whose result value is stored in t v For variable v: t = [[v]] is the copy instruction »t = v

Translating Expressions v Binary operations: t = [[e1 OP e2]] »(arithmetic, logical operations and comparisons) t1 = [[e1]] t2 = [[e2]] t = t1 OP t2 v Unary operations: t = [[OP e1]] t1 = [[e1]] t = OP t1
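The recursive [[e]] scheme above can be sketched in executable form. This is a minimal illustration, not the lecture's actual code: the tuple AST shape and the `new_temp`/`translate` names are invented for this sketch.

```python
# Hypothetical AST: a leaf is a variable name (string); a binary node is
# ('op', e1, e2); a unary node is ('op', e1).

_counter = 0

def new_temp():
    """Return a fresh compiler-generated temporary name (t1, t2, ...)."""
    global _counter
    _counter += 1
    return "t%d" % _counter

def translate(e):
    """Return (instructions, result_name) for expression e."""
    if isinstance(e, str):              # variable leaf: t = v copy
        t = new_temp()
        return (["%s = %s" % (t, e)], t)
    if len(e) == 3:                     # binary: t = [[e1 OP e2]]
        op, e1, e2 = e
        i1, t1 = translate(e1)
        i2, t2 = translate(e2)
        t = new_temp()
        return (i1 + i2 + ["%s = %s %s %s" % (t, t1, op, t2)], t)
    op, e1 = e                          # unary: t = [[OP e1]]
    i1, t1 = translate(e1)
    t = new_temp()
    return (i1 + ["%s = %s %s" % (t, op, t1)], t)

# Example: a = (b + c) * (-e)
instrs, result = translate(('*', ('+', 'b', 'c'), ('-', 'e')))
instrs.append("a = %s" % result)
```

Run on a = (b+c) * (-e), this yields a copy-heavy version of the earlier three-address example (every leaf gets its own copy instruction), which is exactly the inefficiency the later "Issues" slide points out.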

Translating Array Accesses v Array access: t = [[ v[e] ]] »(type of v is array [T] and S = size of T) t1 = addr v t2 = [[e]] t3 = t2 * S t4 = t1 + t3 t = [t4] /* i.e., load */
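The same recipe as a small helper, sketched for illustration (the function name and calling convention are invented, and the index expression is assumed to be already lowered into a temporary, so [[e]] collapses into a given name):

```python
def lower_array_load(dest, var, index_temp, elem_size, fresh):
    """Emit the load sequence for dest = var[index]."""
    t1, t2, t3 = fresh(), fresh(), fresh()
    return [
        "%s = addr %s" % (t1, var),                    # base address of array
        "%s = %s * %d" % (t2, index_temp, elem_size),  # byte offset = index * S
        "%s = %s + %s" % (t3, t1, t2),                 # address of element
        "%s = [%s]" % (dest, t3),                      # load
    ]

names = iter("t%d" % i for i in range(1, 100))
code = lower_array_load("t", "v", "ti", 4, lambda: next(names))
```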

Translating Structure Accesses v Structure access: t = [[ v.f ]] »(v is of type T, S = offset of f in T) t1 = addr v t2 = t1 + S t = [t2] /* i.e., load */

Translating Short-Circuit OR v Short-circuit OR: t = [[e1 SC-OR e2]] »e.g., || operator in C/C++ t = [[e1]] cjump t Lend t = [[e2]] Lend: semantics: 1. evaluate e1 2. if e1 is true, then done 3. else evaluate e2
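A sketch of the same lowering as a helper, under the assumption that e1 and e2 have already been lowered into instruction lists whose results both land in t (names here are hypothetical):

```python
def lower_sc_or(t, e1_code, e2_code, end_label):
    """Short-circuit OR: evaluate e1; if true, skip e2 entirely."""
    return (
        e1_code +                              # t = [[e1]]
        ["cjump %s %s" % (t, end_label)] +     # if t is true, we are done
        e2_code +                              # otherwise t = [[e2]]
        ["%s:" % end_label]
    )

code = lower_sc_or("t", ["t = a"], ["t = b"], "Lend")
```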

Class Problem v Short-circuit AND: t = [[e1 SC-AND e2]] »e.g., && operator in C/C++ Semantics: 1. Evaluate e1 2. if e1 is true, then evaluate e2 3. else done

Translating Statements v Statement sequence: [[s1; s2;...; sN]] v IR instructions of a statement sequence = concatenation of IR instructions of statements [[ s1 ]] [[ s2 ]]... [[ sN ]]

Assignment Statements v Variable assignment: [[ v = e ]] v Array assignment: [[ v[e1] = e2 ]] v = [[ e ]] t1 = addr v t2 = [[e1]] t3 = t2 * S t4 = t1 + t3 t5 = [[ e2 ]] [t4] = t5 /* i.e., store */ recall S = sizeof(T) where v is array(T)

Translating If-Then [-Else] v [[ if (e) then s ]] v [[ if (e) then s1 else s2 ]] t1 = [[ e ]] t2 = not t1 cjump t2 Lelse Lthen: [[ s1 ]] jump Lend Lelse: [[ s2 ]] Lend: t1 = [[ e ]] t2 = not t1 cjump t2 Lend [[ s ]] Lend: How could I do this more efficiently??
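The if-then-else case can be sketched the same way; this is an illustrative helper (invented names), assuming the condition and both branches are pre-lowered instruction lists:

```python
def lower_if_else(cond_code, cond_t, then_code, else_code, fresh_label):
    """Lower if (e) then s1 else s2, negating the condition as on the slide."""
    lelse, lend = fresh_label(), fresh_label()
    t2 = "t_not"                                   # temp holding NOT cond
    return (
        cond_code +                                # t1 = [[ e ]]
        ["%s = not %s" % (t2, cond_t),
         "cjump %s %s" % (t2, lelse)] +            # false -> else branch
        then_code +
        ["jump %s" % lend, "%s:" % lelse] +
        else_code +
        ["%s:" % lend]
    )

labels = iter(["Lelse", "Lend"])
code = lower_if_else(["t1 = e"], "t1", ["s1"], ["s2"], lambda: next(labels))
```

The "more efficiently" hint on the slide: a cjump with both a true and a false target (or a negated comparison) avoids the extra `not` instruction.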

While Statements v [[ while (e) s ]] Lloop: t1 = [[ e ]] t2 = NOT t1 cjump t2 Lend [[ s ]] jump Lloop Lend: or while-do translation do-while translation t1 = [[ e ]] t2 = NOT t1 cjump t2 Lend Lloop: [[ s ]] t3 = [[ e ]] cjump t3 Lloop Lend: Which is better and why?
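The do-while-style translation above can be sketched as follows (illustrative helper, invented names; the condition and body are pre-lowered lists). It duplicates the test but executes only one branch per iteration, which is why it is usually preferred:

```python
def lower_while(cond_code, cond_t, body_code):
    """Lower while (e) s using the do-while translation from the slide."""
    not_t = "t_not"
    return (
        cond_code +                        # test once before entering loop
        ["%s = not %s" % (not_t, cond_t),
         "cjump %s Lend" % not_t,          # skip loop if initially false
         "Lloop:"] +
        body_code +
        cond_code +                        # re-test at the bottom
        ["cjump %s Lloop" % cond_t,        # one taken branch per iteration
         "Lend:"]
    )

code = lower_while(["t1 = n < 10"], "t1", ["n = n + 1"])
```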

Switch Statements v [[ switch (e) case v1:s1,..., case vN:sN ]] t = [[ e ]] L1: c = t != v1 cjump c L2 [[ s1 ]] jump Lend /* if there is a break */ L2: c = t != v2 cjump c L3 [[ s2 ]] jump Lend /* if there is a break */... Lend: Can also implement switch as table lookup. Table contains target labels, i.e., L1, L2, L3. ‘t’ is used to index the table. Benefit: k branches reduced to 1. Negative: target of branch is hard to figure out in hardware
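A sketch of the linear compare-and-branch lowering above (not the table-lookup variant); the helper name and case representation are invented for illustration, and every case is assumed to end with a break:

```python
def lower_switch(t, cases):
    """cases: list of (value, body_instrs); emits one compare per case."""
    code = []
    for i, (value, body) in enumerate(cases, start=1):
        code.append("L%d:" % i)
        code.append("c = %s != %s" % (t, value))   # does t match this case?
        code.append("cjump c L%d" % (i + 1))       # no: try the next case
        code.extend(body)
        code.append("jump Lend")                   # break
    code.append("L%d:" % (len(cases) + 1))         # reached if nothing matched
    code.append("Lend:")
    return code

code = lower_switch("t", [(1, ["s1"]), (2, ["s2"])])
```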

Call and Return Statements v [[ call f(e1, e2,..., eN) ]] v [[ return e ]] t1 = [[ e1 ]] t2 = [[ e2 ]]... tN = [[ eN ]] call f(t1, t2,..., tN) t = [[ e ]] return t

Nested Expressions v Translation recurses on the expression structure v Example: t = [[ (a – b) * (c + d) ]] t1 = a t2 = b t3 = t1 – t2 t4 = c t5 = d t5 = t4 + t5 t = t3 * t5 [[ (a – b) ]] [[ (c + d) ]] [[ (a-b) * (c+d) ]]

Nested Statements v Same for statements: recursive translation v Example: [[ if c then if d then a = b ]] t1 = c t2 = NOT t1 cjump t2 Lend1 t3 = d t4 = NOT t3 cjump t4 Lend2 t5 = b a = t5 Lend2: Lend1: [[ if c... ]] [[ a = b ]] [[ if d... ]]

Class Problem Translate the following to the generic assembly code discussed: for (i=0; i<100; i++) { A[i] = 0; } if ((a > 0) && (b > 0)) c = 2; else c = 3;

Issues v These translations are straightforward v But, inefficient: »Lots of temporaries »Lots of labels »Lots of instructions v Can we do this more intelligently? »Should we worry about it?

Intermediate Representation II Storage Allocation and Management

Overview v Program Organization v Memory pools »Static »Automatic »Dynamic v Activation Records

Classes of Storage in Processor v Registers »Fast access, but only a few of them »Address space not visible to programmer  Doesn’t support pointer access! v Memory »Slow access, but large »Supports pointers v Storage class for each variable is generally determined when mapping HIR to LIR

Distinct Regions of Memory v Code space – Instructions to be executed »Best if read-only v Static (or Global) – Variables that retain their value over the lifetime of the program v Stack – Variables that live only as long as the block within which they are defined (local) v Heap – Variables that are created by calls to the system storage allocator (malloc, new)

Virtual Address Space v Traditional Organization »Code Area at the bottom »Static Data above  Constants  Static strings, variables  Global variables »Heap  Grows upward »Stack  Grows downward »Lots of free virtual memory in between (addresses 0x0 through 0xffffffff)

Class Problem Specify whether each variable is stored in a register or in memory. For memory, which area of memory? int a; void foo(int b, double c) { int d; struct { int e; char f;} g; int h[10]; char i = 5; float j; }

Zooming In v A close look at the code area

Execution Stack v A memory area at the top of the VM »Grows downward »Grows on demand (with OS collaboration) v Purpose »Automatic storage for local variables

Overview v Program Organization v Memory pools »Static »Automatic »Dynamic v Activation Records v Parameter Passing Modes v Symbol Table

Memory Pools v Where does memory come from? v Three pools »Static »Automatic »Dynamic

Static Pool v Content »All the static “strings” that appear in the program »All the static constants used in the program »All the global/static variables declared in the program  static int  static arrays  static records  static... v Allocation? »Well... it is static, i.e.,  All the sizes are determined at compile time.  Cannot grow or shrink

Dynamic Pool v Content »Anything allocated by the program at runtime v Allocation »Depends on the language  C: malloc  C++/Java/C#: new  ML/Lisp/Scheme: implicit v Deallocation »Depends on the language  C: free  C++: delete  Java/C#/ML/Lisp/Scheme: garbage collection

Automatic Pool v Content »Local variables »Actuals (arguments to methods/functions/procedures) v Allocation »Automatic when calling a method/function/procedure v Deallocation »Automatic when returning from a method/function/procedure v Management policy »Stack-like

Overview v Program Organization v Memory pools »Static »Automatic »Dynamic v Activation Records

Activation Records  Also known as “Frames” »A record pushed on the execution stack

Creating the Frame v Three actors »The caller »The CPU »The callee int foo(int x,int y) {... } bar() {... x = foo(3,y);... }


Closeup on management data

Returning From a Call v Easy »The RET instruction simply  Access MGMT Area from FP  Restores SP  Restores FP  Transfer control to return address


Stack Frame Construction Example int f(int a) { int b, c; } void g(int a) { int b, c;... b = f(a+c);... } main() { int a, b;... g(a+b);... } (Stack diagram, growing downward: main's frame holds locals a, b; g's frame holds the parameter a + b, the return address into main, the saved FP/SP for main, and locals b, c; f's frame holds the parameter a + c, the return address into g, the saved FP/SP for g, and locals b, c.) Note: I have left out the temp part of the stack frame

Class Problem For the following program: int foo(int a) { int x; if (a <= 1) return 1; x = foo(a-1) + foo(a-2); return (x); } main() { int y, z = 10; y = foo(z); } 1. Show the first 3 stack frames created when this program is executed (starting with main). 2. What's the maximum number of frames the stack grows to during the execution of this program?

Data Layout v Naive layout strategies generally employed »Place the data in the order the programmer declared it! v 2 issues: size, alignment v Size – How many bytes is the data item? »Base types have some fixed size  E.g., char, int, float, double »Composite types (structs, unions, arrays)  Overall size is sum of the components (not quite!)  Calculate an offset for each field

Memory Alignment v Cannot arbitrarily pack variables into memory  Need to worry about alignment v Golden rule – The address of a variable is aligned based on the size of the variable »Char is byte aligned (any addr is fine) »Short is halfword aligned »Int is word aligned »This rule is for C/C++; other languages may have slightly different rules
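The golden rule in code: rounding an offset up to the next multiple of an alignment (a power of two) is the standard bit trick compilers use when assigning addresses. The helper name here is ours, not from the lecture:

```python
def align_up(offset, align):
    """Round offset up to the next multiple of align (align a power of two)."""
    return (offset + align - 1) & ~(align - 1)

# A char placed at offset 1 forces the next int to start at offset 4.
next_int_offset = align_up(1, 4)
```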

Structure Alignment (for C) v Each field is laid out in the order it is declared, using the Golden Rule for alignment v Identify the largest field »Starting address of the overall struct is aligned based on the largest field »Size of the overall struct is a multiple of the largest field »The reason for this is so that you can have an array of structs

Structure Example struct { char w; int x[3]; char y; short z; } Largest field is int (4 bytes) struct size is multiple of 4 struct starting addr is word aligned Struct must start at word-aligned address char w  1 byte, start anywhere x[3]  12 bytes, but must start at word-aligned addr, so 3 empty bytes between w and x char y  1 byte, start anywhere short z  2 bytes, but must start at halfword-aligned addr, so 1 empty byte between y and z Total size = 20 bytes!
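The layout procedure above can be checked with a short sketch that reproduces the 20-byte result: lay fields out in declaration order, align each to its own alignment, then round the total up to a multiple of the largest alignment. The field-tuple representation is invented for this illustration:

```python
def struct_layout(fields):
    """fields: list of (name, size, align). Returns (offsets, total_size)."""
    offsets, offset, max_align = {}, 0, 1
    for name, size, align in fields:
        offset = (offset + align - 1) & ~(align - 1)   # align this field
        offsets[name] = offset
        offset += size
        max_align = max(max_align, align)
    total = (offset + max_align - 1) & ~(max_align - 1)  # pad struct tail
    return offsets, total

# struct { char w; int x[3]; char y; short z; } with 4-byte ints
offsets, size = struct_layout(
    [("w", 1, 1), ("x", 12, 4), ("y", 1, 1), ("z", 2, 2)])
```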

Class Problem How many bytes of memory does the following sequence of C declarations require (int = 4 bytes)? short a[100]; char b; int c; double d; short e; struct { char f; int g[1]; char h[2]; } i;