6. Intermediate Representation Prof. O. Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and.

Slides:



Advertisements
Similar presentations
Chapter 1 The Study of Body Function Image PowerPoint
Advertisements

Programming Language Concepts
CSE 5317/4305 L9: Instruction Selection1 Instruction Selection Leonidas Fegaras.
David Luebke 1 6/7/2014 ITCS 6114 Skip Lists Hashing.
Chapter 1 Object Oriented Programming 1. OOP revolves around the concept of an objects. Objects are created using the class definition. Programming techniques.
Hash Tables.
Semantic Analysis and Symbol Tables
Intermediate Representations CS 671 February 12, 2008.
Procedures. 2 Procedure Definition A procedure is a mechanism for abstracting a group of related operations into a single operation that can be used repeatedly.
Chapter 10: The Traditional Approach to Design
Compiler Construction
1 Programming Languages (CS 550) Mini Language Interpreter Jeremy R. Johnson.
Chapter 6 Intermediate Code Generation
8. Static Single Assignment Form Marcus Denker. © Marcus Denker SSA Roadmap  Static Single Assignment Form (SSA)  Converting to SSA Form  Examples.
6. Intermediate Representation
1 Lecture 10 Intermediate Representations. 2 front end »produces an intermediate representation (IR) for the program. optimizer »transforms the code in.
Intermediate Code Generation
7. Optimization Prof. O. Nierstrasz Lecture notes by Marcus Denker.
Intermediate Code Generation. 2 Intermediate languages Declarations Expressions Statements.
8 Intermediate code generation
1 Compiler Construction Intermediate Code Generation.
PSUCS322 HM 1 Languages and Compiler Design II IR Code Generation I Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU.
8. Introduction to Denotational Semantics. © O. Nierstrasz PS — Denotational Semantics 8.2 Roadmap Overview:  Syntax and Semantics  Semantics of Expressions.
COMPILERS Basic Blocks and Traces hussein suleman uct csc3005h 2006.
12. Common Errors, a few Puzzles. © O. Nierstrasz P2 — Common Errors, a few Puzzles 12.2 Common Errors, a few Puzzles Sources  Cay Horstmann, Computing.
Program Representations. Representing programs Goals.
Intermediate Representation I High-Level to Low-Level IR Translation EECS 483 – Lecture 17 University of Michigan Monday, November 6, 2006.
Intermediate Representations Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
10/22/2002© 2002 Hal Perkins & UW CSEG-1 CSE 582 – Compilers Intermediate Representations Hal Perkins Autumn 2002.
ESE Einführung in Software Engineering N. XXX Prof. O. Nierstrasz Fall Semester 2009.
Improving code generation. Better code generation requires greater context Over expressions: optimal ordering of subtrees Over basic blocks: Common subexpression.
CS 536 Spring Intermediate Code. Local Optimizations. Lecture 22.
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
4/23/09Prof. Hilfinger CS 164 Lecture 261 IL for Arrays & Local Optimizations Lecture 26 (Adapted from notes by R. Bodik and G. Necula)
9. Optimization Marcus Denker. 2 © Marcus Denker Optimization Roadmap  Introduction  Optimizations in the Back-end  The Optimizer  SSA Optimizations.
Intermediate Code CS 471 October 29, CS 471 – Fall Intermediate Code Generation Source code Lexical Analysis Syntactic Analysis Semantic.
Metamodeling Seminar X. CHAPTER Prof. O. Nierstrasz Spring Semester 2008.
1 Intermediate representation Goals: encode knowledge about the program facilitate analysis facilitate retargeting facilitate optimization scanning parsing.
ESE Einführung in Software Engineering X. CHAPTER Prof. O. Nierstrasz Wintersemester 2005 / 2006.
N. XXX Prof. O. Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes.
Intermediate Code. Local Optimizations
Improving Code Generation Honors Compilers April 16 th 2002.
Compiler Construction A Compulsory Module for Students in Computer Science Department Faculty of IT / Al – Al Bayt University Second Semester 2008/2009.
OORPT Object-Oriented Reengineering Patterns and Techniques X. CHAPTER Prof. O. Nierstrasz.
CP — Concurrent Programming X. CHAPTER Prof. O. Nierstrasz Wintersemester 2005 / 2006.
12. eToys. © O. Nierstrasz PS — eToys 12.2 Denotational Semantics Overview:  … References:  …
CS412/413 Introduction to Compilers Radu Rugina Lecture 15: Translating High IR to Low IR 22 Feb 02.
10/1/2015© Hal Perkins & UW CSEG-1 CSE P 501 – Compilers Intermediate Representations Hal Perkins Autumn 2009.
1 Structure of a Compiler Front end of a compiler is efficient and can be automated Back end is generally hard to automate and finding the optimum solution.
Compiler Chapter# 5 Intermediate code generation.
Intermediate Code Representations
12/18/2015© Hal Perkins & UW CSEG-1 CSE P 501 – Compilers Intermediate Representations Hal Perkins Winter 2008.
COMPILERS Intermediate Code hussein suleman uct csc3003s 2007.
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
Intermediate Code Generation CS 671 February 14, 2008.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture 10 Ahmed Ezzat.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
CS 404 Introduction to Compiler Design
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
COMPILERS Intermediate Code
Intermediate Representations Hal Perkins Autumn 2011
Intermediate Representations
Chapter 6 Intermediate-Code Generation
Intermediate Representations
Intermediate Representations Hal Perkins Autumn 2005
Intermediate Code Generation
Review: What is an activation record?
Intermediate Code Generating machine-independent intermediate form.
Presentation transcript:

6. Intermediate Representation Prof. O. Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes.

© Oscar Nierstrasz Intermediate Representation 2 Roadmap  Intermediate representations  Example: IR trees for MiniJava See, Modern compiler implementation in Java (Second edition), chapters 7-8.

© Oscar Nierstrasz Intermediate Representation 3 Roadmap  Intermediate representations  Example: IR trees for MiniJava

Why use intermediate representations? 1. Software engineering principle —break compiler into manageable pieces 2. Simplifies retargeting to new host —isolates back end from front end 3. Simplifies support for multiple languages —different languages can share IR and back end 4. Enables machine-independent optimization —general techniques, multiple passes © Oscar Nierstrasz Intermediate Representation 4

IR scheme © Oscar Nierstrasz Intermediate Representation 5 front end produces IR optimizer transforms IR to more efficient program back end transform IR to target code

Kinds of IR  Abstract syntax trees (AST)  Linear operator form of tree (e.g., postfix notation)  Directed acyclic graphs (DAG)  Control flow graphs (CFG)  Program dependence graphs (PDG)  Static single assignment form (SSA)  3-address code  Hybrid combinations © Oscar Nierstrasz Intermediate Representation 6

Categories of IR  Structural —graphically oriented (trees, DAGs) —nodes and edges tend to be large —heavily used on source-to-source translators  Linear —pseudo-code for abstract machine —large variation in level of abstraction —simple, compact data structures —easier to rearrange  Hybrid —combination of graphs and linear code (e.g. CFGs) —attempt to achieve best of both worlds © Oscar Nierstrasz Intermediate Representation 7

Important IR properties  Ease of generation  Ease of manipulation  Cost of manipulation  Level of abstraction  Freedom of expression (!)  Size of typical procedure  Original or derivative © Oscar Nierstrasz Intermediate Representation 8 Subtle design decisions in the IR can have far-reaching effects on the speed and effectiveness of the compiler!  Degree of exposed detail can be crucial

Abstract syntax tree © Oscar Nierstrasz Intermediate Representation 9 An AST is a parse tree with nodes for most non-terminals removed. Since the program is already parsed, non-terminals needed to establish precedence and associativity can be collapsed! A linear operator form of this tree (postfix) would be: x 2 y * -

Directed acyclic graph © Oscar Nierstrasz Intermediate Representation 10 A DAG is an AST with unique, shared nodes for each value. x := 2 * y + sin(2*x) z := x / 2

Control flow graph  A CFG models transfer of control in a program —nodes are basic blocks (straight-line blocks of code) —edges represent control flow (loops, if/else, goto …) © Oscar Nierstrasz Intermediate Representation 11 if x = y then S1 else S2 end S3

Single static assignment (SSA)  Each assignment to a temporary is given a unique name —All uses reached by that assignment are renamed —Compact representation —Useful for many kinds of compiler optimization … © Oscar Nierstrasz Intermediate Representation 12 Ron Cytron, et al., “Efficiently computing static single assignment form and the control dependence graph,” ACM TOPLAS., doi: / x := 3; x := x + 1; x := 7; x := x*2; x 1 := 3; x 2 := x 1 + 1; x 3 := 7; x 4 := x 3 *2; 

3-address code © Oscar Nierstrasz Intermediate Representation 13  Statements take the form: x = y op z —single operator and at most three names x – 2 * y t1 = 2 * y t2 = x – t1  Advantages: —compact form —names for intermediate values

Typical 3-address codes assignments x = y op z x = op y x = y[i] x = y branches goto L conditional branches if x relop y goto L procedure calls param x param y call p address and pointer assignments x = &y *y = z © Oscar Nierstrasz Intermediate Representation 14

3-address code — two variants © Oscar Nierstrasz Intermediate Representation 15 QuadruplesTriples simple record structure easy to reorder explicit names table index is implicit name only 3 fields harder to reorder

IR choices  Other hybrids exist —combinations of graphs and linear codes —CFG with 3-address code for basic blocks  Many variants used in practice —no widespread agreement —compilers may need several different IRs!  Advice: —choose IR with right level of detail —keep manipulation costs in mind © Oscar Nierstrasz Intermediate Representation 16

© Oscar Nierstrasz Intermediate Representation 17 Roadmap  Intermediate representations  Example: IR trees for MiniJava

IR trees — expressions © Oscar Nierstrasz Intermediate Representation 18 CONST i NAME n TEMP t BINOP e1e2 MEM e CALL f[e1,…,en] ESEQ se integer constant symbolic constant register +, — etc. contents of word of memory procedure call expression sequence NB: evaluation left to right

IR trees — statements © Oscar Nierstrasz Intermediate Representation 19 MOVE t e evaluate e into temp t TEMP MOVE e1 e2 evaluate e1 to address a; e2 to word at a MEM EXP e evaluate e and discard JUMP e[l1,…,ln] transfer to address e with value l1 … CJUMP e1e2 evaluate and compare e1 and e2; jump to t or f tf LABEL n define name n as current address (can use NAME(n) as jump address) SEQ s1s2 statement sequence

Converting between kinds of expressions  Kinds of expressions: —Exp(exp) — expressions (compute a value) —Nx(stm) — statements (compute no value) —Cx.op(t,f) — conditionals (jump to true/false destinations)  Conversion operators: —cvtEx — convert to expression —cvtNx — convert to statement —cvtCx(t,f) — convert to conditional © Oscar Nierstrasz Intermediate Representation 20

Variables, arrays and fields © Oscar Nierstrasz Intermediate Representation 21 Local variables:t  Ex(TEMP(t)) Array elements: where w is the target machine’s word size Object fields: e[i]  Ex(MEM(+(e.cvtEx(), ×(i.cvtEx(), CONST(w))))) e.f  Ex(MEM(+(e.cvtEx(), CONST(o)))) where o is the byte offset of field f

MiniJava: string literals, object creation © Oscar Nierstrasz Intermediate Representation 22 String literals: allocate statically.word 11 label:.ascii “hello world” “hello world”  Ex(NAME(label)) Object creation: allocate object in heap new T()  Ex(CALL(NAME(“new”), CONST(fields), NAME(label for T’s vtable)))

Control structures  Basic blocks: —maximal sequence of straight-line code without branches —label starts a new block  Control structure translation: —control flow links up basic blocks —implementation requires bookkeeping —some care needed to produce good code! © Oscar Nierstrasz Intermediate Representation 23

while loops © Oscar Nierstrasz Intermediate Representation 24 if not (c) jump done body: s if c jump body done: while (c) s  Nx(SEQ(SEQ(c.cvtCx(b,x), SEQ(LABEL(b), s.cvtNx())), SEQ(c,cvtCx(b,x),LABEL(x)))) for example:

Method calls © Oscar Nierstrasz Intermediate Representation 25 eo.m(e1,…,en)  Ex(CALL(MEM(MEM(e0.cvtEx(), -w), m.index × w), e1.cvtEx(), …en.cvtEx()))

case statements  case E of V 1 : S 1 … V n : S n end —evaluate E to V —find value V in case list —execute statement for found case —jump to statement after case  Key issue: finding the right case —sequence of conditional jumps (small case set) –  O(# cases) —binary search of ordered jump table (sparse case set) –  O(log 2 # cases) —hash table (dense case set) –  O(1) © Oscar Nierstrasz Intermediate Representation 26

case statements — sample translation © Oscar Nierstrasz Intermediate Representation 27 t := expr jump test L1:code for S1 jump next L2:code for S2 jump next … Ln:code for Sn jump next test:if t = V1 jump L1 if t = V2 jump L2 … if t = Vn jump Ln code to raise exception next:…

Simplification  After translation, simplify trees —No SEQ or ESEQ —CALL can only be subtree of EXP() or MOVE(TEMP t, …)  Transformations: —Lift ESEQs up tree until they can become SEQs —turn SEQs into linear list © Oscar Nierstrasz Intermediate Representation 28

Linearizing trees ESEQ(s1, ESEQ(s2, e)=ESEQ(SEQ(s1, s2), e) BINOP(op, ESEQ(s, e1), e2)=ESEQ(s, BINOP(op, e1, e2)) MEM(ESEQ(s, e))=ESEQ(s, MEM(e)) JUMP(ESEQ(s,e))=SEQ(s, JUMP(e)) CJUMP(op, ESEQ(s, e1), e2, l1, l2) =SEQ(s, CJUMP(op, e1, e2, l1, l2)) BINOP(op, e1, ESEQ(s, e2))= ESEQ(MOVE(TEMP t, e1), ESEQ(s, BINOP(op, TEMP t, e2))) CJUMP(op, e1, ESEQ(s, e2), l1, l2) = SEQ(MOVE(TEMP t, e1), SEQ(s, CJUMP(op, TEMP t, e2, l1, l2))) MOVE(ESEQ(s, e1), e2)=SEQ(s, MOVE(e1, e2)) CALL(f, a)= ESEQ(MOVE(TEMP t, CALL(f, a)), TEMP(t)) © Oscar Nierstrasz Intermediate Representation 29

Semantic Analysis What you should know!  Why do most compilers need an intermediate representation for programs?  What are the key tradeoffs between structural and linear IRs?  What is a “basic block”?  What are common strategies for representing case statements? 30 © Oscar Nierstrasz

Semantic Analysis Can you answer these questions?  Why can’t a parser directly produced high quality executable code?  What criteria should drive your choice of an IR?  What kind of IR does JTB generate? 31 © Oscar Nierstrasz

Intermediate Representation 32 License > Attribution-ShareAlike 2.5 You are free: to copy, distribute, display, and perform the work to make derivative works to make commercial use of the work Under the following conditions: Attribution. You must attribute the work in the manner specified by the author or licensor. Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one. For any reuse or distribution, you must make clear to others the license terms of this work. Any of these conditions can be waived if you get permission from the copyright holder. Your fair use and other rights are in no way affected by the above.