Lecture 12 – Code Generation Eran Yahav 1 Reference: Dragon 8. MCD 4.2.4 www.cs.technion.ac.il/~yahave/tocs2011/compilers-lec12.pptx.

Slides:



Advertisements
Similar presentations
Register Allocation Consists of two parts: Goal : minimize spills
Advertisements

Target Code Generation
Register Usage Keep as many values in registers as possible Register assignment Register allocation Popular techniques – Local vs. global – Graph coloring.
P3 / 2004 Register Allocation. Kostis Sagonas 2 Spring 2004 Outline What is register allocation Webs Interference Graphs Graph coloring Spilling Live-Range.
Register allocation Morgensen, Torben. "Register Allocation." Basics of Compiler Design. pp from (
Register Allocation CS 320 David Walker (with thanks to Andrew Myers for most of the content of these slides)
Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.
Coalescing Register Allocation CS153: Compilers Greg Morrisett.
Register Allocation Mooly Sagiv Schrierber Wed 10:00-12:00 html://
COMPILERS Register Allocation hussein suleman uct csc305w 2004.
Stanford University CS243 Winter 2006 Wei Li 1 Register Allocation.
Register Allocation CS 671 March 27, CS 671 – Spring Register Allocation - Motivation Consider adding two numbers together: Advantages: Fewer.
Components of representation Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements occur in right order Data.
Program Representations. Representing programs Goals.
From AST to Code Generation Professor Yihjia Tsai Tamkang University.
Lecture 26 Epilogue: Or Everything else you Wanted to Know about Compilers (more accurately Everything else I wanted you to Know) Topics Getreg – Error.
Lecture 10: Code Generation 1. You are here 2 Executable code exe Source text txt Compiler Lexical Analysis Syntax Analysis Semantic Analysis Inter. Rep.
Lecture 11 – Code Generation Eran Yahav 1 Reference: Dragon 8. MCD
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
1 Handling nested procedures Method 1 : static (access) links –Reference to the frame of the lexically enclosing procedure –Static chains of such links.
Improving code generation. Better code generation requires greater context Over expressions: optimal ordering of subtrees Over basic blocks: Common subexpression.
Register Allocation (Slides from Andrew Myers). Main idea Want to replace temporary variables with some fixed set of registers First: need to know which.
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
1 Register Allocation Consists of two parts: –register allocation What will be stored in registers –Only unambiguous values –register assignment Which.
Prof. Bodik CS 164 Lecture 171 Register Allocation Lecture 19.
Register Allocation (via graph coloring)
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
Register Allocation (via graph coloring). Lecture Outline Memory Hierarchy Management Register Allocation –Register interference graph –Graph coloring.
1 Liveness analysis and Register Allocation Cheng-Chia Chen.
Improving Code Generation Honors Compilers April 16 th 2002.
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
Improving code generation. Better code generation requires greater context Over expressions: optimal ordering of subtrees Over basic blocks: Common subexpression.
Code Generation Ⅱ CS308 Compiler Theory.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
4/29/09Prof. Hilfinger CS164 Lecture 381 Register Allocation Lecture 28 (from notes by G. Necula and R. Bodik)
Precision Going back to constant prop, in what cases would we lose precision?
Spring 2014Jim Hogg - UW - CSE - P501P-1 CSE P501 – Compiler Construction Register allocation constraints Local allocation Fast, but poorer code Global.
1 Code Generation Part II Chapter 8 (1 st ed. Ch.9) COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University,
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Register Allocation John Cavazos University.
1 Code Generation Part II Chapter 9 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2005.
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
1 Lecture 10: Code Generation. You are here 2 Executable code exe Source text txt Compiler Lexical Analysis Syntax Analysis Semantic Analysis Inter. Rep.
1 Code Generation. 2 Position of a Code Generator in the Compiler Model Front-End Code Optimizer Source program Symbol Table Lexical error Syntax error.
Code Generation Ⅰ CS308 Compiler Theory1. 2 Background The final phase in our compiler model Requirements imposed on a code generator –Preserving the.
Chapter# 6 Code generation.  The final phase in our compiler model is the code generator.  It takes as input the intermediate representation(IR) produced.
Compilers Modern Compiler Design
Register Usage Keep as many values in registers as possible Keep as many values in registers as possible Register assignment Register assignment Register.
1 Chapter10: Code generator. 2 Code Generator Source Program Target Program Semantic Analyzer Intermediate Code Generator Code Optimizer Code Generator.
Code Generation Part I Chapter 8 (1st ed. Ch.9)
More Code Generation and Optimization Pat Morin COMP 3002.
Chapter 8 Code Generation
Compilation (Semester A, 2013/14)
Global Register Allocation Based on
Register Allocation Hal Perkins Autumn 2009
Register Allocation Noam Rinetzky Text book:
Code Generation Part I Chapter 9
Code Generation.
Unit IV Code Generation
Code Generation Part I Chapter 8 (1st ed. Ch.9)
Code Generation Part I Chapter 9
Lecture 16: Register Allocation
TARGET CODE -Next Usage
8 Code Generation Topics A simple code generator algorithm
Optimization 薛智文 (textbook ch# 9) 薛智文 96 Spring.
Compiler Construction
Lecture 17: Register Allocation via Graph Colouring
Code Generation Part II
Target Code Generation
Code Optimization.
Presentation transcript:

Lecture 12 – Code Generation Eran Yahav 1 Reference: Dragon 8. MCD

2 You are here Executable code exe Source text txt Compiler Lexical Analysis Syntax Analysis Parsing Semantic Analysis Inter. Rep. (IR) Code Gen.

simple code generation  registers  used as operands of instructions  can be used to store temporary results  can (should) be used as loop indexes due to frequent arithmetic operation  used to manage administrative info (e.g., runtime stack)  number of registers is limited  need to allocate them in a clever way 3

simple code generation  assume machine instructions of the form  LD reg, mem  ST mem, reg  OP reg,reg,reg  further assume that we have all registers available for our use  ignore registers allocated for stack management 4

simple code generation  translate each 3AC instruction separately  A register descriptor keeps track of the variable names whose current value is in that register.  we use only those registers that are available for local use within a basic block, we assume that initially, all register descriptors are empty.  As code generation progresses, each register will hold the value of zero or more names.  For each program variable, an address descriptor keeps track of the location or locations where the current value of that variable can be found.  The location may be a register, a memory address, a stack location, or some set of more than one of these  Information can be stored in the symbol-table entry for that variable 5

simple code generation For each three-address statement x := y op z, 1. Invoke getreg (x := y op z) to select registers R x, R y, and R z. 2. If Ry does not contain y, issue: “LD R y, y’ ”, for a location y’ of y. 3. If Rz does not contain z, issue: “LD R z, z’ ”, for a location z’ of z. 4. Issue the instruction “OP R x,R y,R z ” 5. Update the address descriptors of x, y, z, if necessary.  R x is the only location of x now, and R x contains only x (remove R x from other address descriptors). 6

updating descriptors  1. For the instruction LD R, x a) Change the register descriptor for register R so it holds only x. b) Change the address descriptor for x by adding register R as an additional location.  2. For the instruction ST x, R  change the address descriptor for x to include its own memory location.  3. For an operation such as ADD Rx, Ry, Rz, implementing a 3AC instruction x = y + z a) Change the register descriptor for Rx so that it holds only x. b) Change the address descriptor for x so that its only location is Rx. Note that the memory location for x is not now in the address descriptor for x. c) Remove Rx from the address descriptor of any variable other than x.  4. When we process a copy statement x = y, after generating the load for y into register Ry, if needed, and after managing descriptors as for all load statements (rule 1): a) Add x to the register descriptor for Ry. b) Change the address descriptor for x so that its only location is Ry. 7

example 8 t= A – B u = A- C v = t + u A = D D = v + u A B C D = live outside the block t,u,v = temporaries in local storate R1 R2R3 ABC A BC D D tuv t = A – B LD R1,A LD R2,B SUB R2,R1,R2 At R1 R2R3 A,R1 BC A BC DR2 D tuv u = A – C LD R3,C SUB R1,R1,R3 v = t + u ADD R3,R2,R1 utC R1 R2R3 AB C,R3 A BC DR2R1 D tuv utv R2R3 ABC A BC DR2R1 D tu R3 v

example 9 t= A – B u = A- C v = t + u A = D D = v + u A B C D = live outside the block t,u,v = temporaries in local storate A = D LD R2, D u A,D v R1 R2R3 R2 BC A BC D,R2 R1 D tu R3 v D = v + u ADD R1,R3,R1 exit ST A, R2 ST D, R1 DAv R1 R2R3 R2BC A BC R1 D tu R3 v utv R1 R2R3 ABC A BC DR2R1 D tu R3 v DAv R1 R2R3 A,R2 BC A BC D,R1 D tu R3 v

design of getReg  many design choices  simple rules:  If y is currently in a register, pick a register already containing y as Ry. No need to load this register.  If y is not in a register, but there is a register that is currently empty, pick one such register as Ry.  complicated case:  y is not in a register, but there is no free register. 10

design of getReg  instruction: x = y + z  y is not in a register, no free register  let R be a taken register holding value of a variable v  possibilities:  if the value v is available somewhere other than R, we can allocate R to be Ry  if v is x, the value computed by the instruction, we can use it as Ry (it is going to be overwritten anyway)  if v is not used later, we can use R as Ry  otherwise: spill the value to memory by ST v,R 11

global register allocation  so far we assumed that register values are written back to memory at the end of every basic block  want to save load/stores by keeping frequently accessed values in registers  e.g., loop counters  idea: compute “weight” for each variable  for each use of v in B prior to any definition of v add 1 point  for each occurrence of v in a following block using v add 2 points, as we save the store/load between blocks  cost(v) =  B use(v,B) + 2*live(v,B)  use(v,B) is is the number of times v is used in B prior to any definition of v  live(v, B) is 1 if v is live on exit from B and is assigned a value in B  after computing weights, allocate registers to the “heaviest” values 12

Example 13 a = b + c d = d - b e = a + f bcdf f = a - d acde cdef b = d + f e = a – c acdf bcdef b = d + c cdef bcdef b,c,d,e,f live B1 B2 B3 B4 acdef cost(a) =  B use(a,B) + 2*live(a,B) = 4 cost(b) = 6 cost(c) = 3 cost(d) = 6 cost(e) = 4 cost(f) = 4 b,d,e,f live

Example 14 LD R3,c ADD R0,R1,R3 SUB R2,R2,R1 LD R3,f ADD R3,R0,R3 ST e, R3 SUB R3,R0,R2 ST f,R3 LD R3,f ADD R1,R2,R3 LD R3,c SUB R3,R0,R3 ST e, R3 LD R3,c ADD R1,R2,R3 B1 B2B3 B4 LD R1,b LD R2,d ST b,R1 ST d,R2 ST b,R1 ST a,R2

Register Allocation by Graph Coloring  Address register allocation by  liveness analysis  reduction to graph coloring  optimizations by program transformation  Main idea  register allocation = coloring of an interference graph  every node is a variable  edge between variables that “interfere” = are both live at the same time  number of colors = number of registers 15

Example 16 v1v1 v2v2 v3v3 v4v4 v5v5 v6v6 v7v7 v8v8 time V1V1 V8V8 V2V2 V4V4 V7V7 V6V6 V5V5 V3V3

Example 17 a = read(); b = read(); c = read(); a = a + b + c; if (a<10) { d = c + 8; print(c); } else if (a<2o) { e = 10; d = e + a; print(e); } else { f = 12; d = f + a; print(f); } print(d); a = read(); b = read(); c = read(); a = a + b + c; if (a<10) goto B2 else goto B3 d = c + 8; print(c); if (a<20) goto B4 else goto B5 e = 10; d = e + a; print(e); f = 12; d = f + a; print(f); print(d); B1 B2 B3 B4 B5 B6 b a c d e f d d

Example: Interference Graph 18 fab dec a = read(); b = read(); c = read(); a = a + b + c; if (a<10) goto B2 else goto B3 d = c + 8; print(c); if (a<20) goto B4 else goto B5 e = 10; d = e + a; print(e); f = 12; d = f + a; print(f); print(d); B2 B3 B4 B5 B6 b a c d e f d d

Register Allocation by Graph Coloring  variables that interfere with each other cannot be allocated the same register  graph coloring  classic problem: how to color the nodes of a graph with the lowest possible number of colors  bad news: problem is NP-complete  good news: there are pretty good heuristic approaches 19

Heuristic Graph Coloring  idea: color nodes one by one, coloring the “easiest” node last  “easiest nodes” are ones that have lowest degree  fewer conflicts  algorithm at high-level  find the least connected node  remove least connected node from the graph  color the reduced graph recursively  re-attach the least connected node 20

21 Heuristic Graph Coloring fab dec fa dec fa de f de stack:  stack: b stack: cb stack: acb

22 f de stack: acb f d stack: eacb f stack: deacbstack: fdeacb f1 stack: deacb f1 d2 stack: eacb f1 d2e1 stack: acb f1a2 d2e1 stack: cb Heuristic Graph Coloring

23 f1a2b3 d2e1c1 f1a2 d2e1c1 f1a2 d2e1 stack:  stack: b stack: cb Heuristic Graph Coloring Result: 3 registers for 6 variables Can we do with 2 registers?

 two sources of non-determinism in the algorithm  choosing which of the (possibly many) nodes of lowest degree should be detached  choosing a free color from the available colors 24 Heuristic Graph Coloring

Supercompilation  exhaustive search in the space of (small) programs for finding optimal code sequences  often counter intuitive results, not what a human would write  can be very efficient  generate/test paradigm 25 ; n in register %ax cwd ; convert to double word: ; (%dx,%ax) = (extend_sign(%ax), %ax) negw %ax ; negate: (%ax,cf) = (-%ax,%ax != 0) adcw %dx,%dx ; add with carry: %dx = %dx + %dx + cf ; sign(n) in %dx

The End 26

27 Heuristic Graph Coloring fab dec fa dec fa de stack:  stack: b stack: cb stack: fcb a de

28 stack: fcb a e stack: dfcb a stack: edfcbstack: aedfcb a1 stack: edfcb a1 e2 stack: dfcb a1 d1e2 stack: fcb f2a1 d1e2 stack: cb Heuristic Graph Coloring a de

29 f2a1b3 d1e2c2 f2d1 e2c2 f2a1 d1e2 stack:  stack: b stack: cb Heuristic Graph Coloring

30 Heuristic Graph Coloring fab dec stack:  stack: f ab dec stack: ef ab dc stack: def ab c

31 stack: def b c stack: adef b stack: cadefstack: bcadef b1 stack: cadef b1 c2 stack: adef a3 stack: def Heuristic Graph Coloring ab c b1 c2

32 f2a3b1 d1e2c2 stack:  Heuristic Graph Coloring a3b1 d1c2 stack: ef a3b1 d1e2c2 stack: f