# Code generation


1 Code generation

Our book's target machine (appendix A). Instruction format: `opcode source1, source2, destination`

- `add r1, r2, r3`
- `addI r1, c, r2`
- `loadI c, r2`
- `load r1, r2`
- `loadAI r1, c, r2` // r2 := *(r1+c)
- `loadAO r1, r2, r3`
- `i2i r1, r2` // r2 := r1 (for integers)
- `cmp_LE r1, r2, r3` // if r1<=r2, r3:=true, else r3:=false
- `cbr r1, L1, L2` // if (r1) goto L1, else goto L2
- `jump r1`

Symbols: `@x` represents x's offset from the sp

2 Code generation

Let's start with some examples.

- Generate code from a tree representing `x = a+2 - (c+d-4)`. Issues:
  - Which children should go first?
  - What if we already had a-c in a register?
  - Does it make a difference if a and c are floating point as opposed to integer?
- Generate code for a switch statement.
- Generate code for `w = w*2*x*y*z`

3 Code generation

Code generation =

- instruction selection
- instruction scheduling
- register allocation

4 Instruction selection

- IR to assembly
- Why is it an issue? Example: copy a value from r1 to r2. Let me count the ways...
- Criteria
- How hard is it?
- Use a cost model to choose.
- How about register usage?

5 Instruction selection

- How hard is it?
  - Can make locally optimal choices
  - Global optimality is NP-complete
- Criteria
  - speed of generated code
  - size of generated code
  - power consumption
- Considering registers
  - Assume enough registers are available and let the register allocator figure it out.

6 Instruction scheduling

Reorder instructions to hide latencies. Example (fill in the start cycle of each instruction):

- ( ) `loadAI $sp, @w, r1`
- ( ) `add r1, r1, r1`
- ( ) `loadAI $sp, @x, r2`
- ( ) `mult r1, r2, r1`
- ( ) `loadAI $sp, @y, r2`
- ( ) `mult r1, r2, r1`
- ( ) `loadAI $sp, @z, r2`
- ( ) `mult r1, r2, r1`
- ( ) `storeAI r1, $sp, @w`

Latencies:

- memory ops: 3 cycles
- multiply: 2 cycles
- everything else: 1 cycle

7 Instruction scheduling

Reorder instructions to hide latencies. Example, with the start cycle of each instruction in parentheses.

Original order:

- (1) `loadAI $sp, @w, r1`
- (4) `add r1, r1, r1`
- (5) `loadAI $sp, @x, r2`
- (8) `mult r1, r2, r1`
- (9) `loadAI $sp, @y, r2`
- (12) `mult r1, r2, r1`
- (13) `loadAI $sp, @z, r2`
- (16) `mult r1, r2, r1`
- (18) `storeAI r1, $sp, @w`
- (20) store completes

Reordered:

- (1) `loadAI $sp, @w, r1`
- (2) `loadAI $sp, @x, r2`
- (3) `loadAI $sp, @y, r3`
- (4) `add r1, r1, r1`
- (5) `mult r1, r2, r1`
- (6) `loadAI $sp, @z, r2`
- (7) `mult r1, r3, r1`
- (9) `mult r1, r2, r1`
- (11) `storeAI r1, $sp, @w`
- (13) store completes
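The cycle numbers in the two schedules can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the machine issues one instruction per cycle, in order, and an instruction stalls until its source registers are ready. The name `issue_cycles` and the tuple encoding of instructions are made up for this example; the latencies are the ones given on the slides.

```python
# Latencies from the slides: memory ops 3 cycles, multiply 2, everything else 1.
LATENCY = {"loadAI": 3, "storeAI": 3, "mult": 2}


def issue_cycles(instrs):
    """Simulate in-order, single-issue execution (an assumed machine model).

    instrs is a list of (opcode, source_registers, destination_register).
    Returns (list of issue cycles, cycle in which the last operation completes).
    """
    ready = {}   # register -> first cycle in which its new value can be read
    prev = 0     # issue cycle of the previous instruction
    issued = []
    finish = 0
    for op, srcs, dst in instrs:
        # Wait for the previous issue slot and for every source operand.
        start = max([prev + 1] + [ready.get(r, 1) for r in srcs])
        latency = LATENCY.get(op, 1)
        if dst is not None:
            ready[dst] = start + latency
        issued.append(start)
        finish = max(finish, start + latency - 1)
        prev = start
    return issued, finish


original = [
    ("loadAI", ["$sp"], "r1"),          # w
    ("add", ["r1", "r1"], "r1"),
    ("loadAI", ["$sp"], "r2"),          # x
    ("mult", ["r1", "r2"], "r1"),
    ("loadAI", ["$sp"], "r2"),          # y
    ("mult", ["r1", "r2"], "r1"),
    ("loadAI", ["$sp"], "r2"),          # z
    ("mult", ["r1", "r2"], "r1"),
    ("storeAI", ["r1", "$sp"], None),   # w
]

reordered = [
    ("loadAI", ["$sp"], "r1"),          # w
    ("loadAI", ["$sp"], "r2"),          # x
    ("loadAI", ["$sp"], "r3"),          # y
    ("add", ["r1", "r1"], "r1"),
    ("mult", ["r1", "r2"], "r1"),
    ("loadAI", ["$sp"], "r2"),          # z
    ("mult", ["r1", "r3"], "r1"),
    ("mult", ["r1", "r2"], "r1"),
    ("storeAI", ["r1", "$sp"], None),   # w
]

print(issue_cycles(original))
print(issue_cycles(reordered))
```

Under this model the original order issues at cycles 1, 4, 5, 8, 9, 12, 13, 16, 18 and finishes in cycle 20, while the reordered version issues at 1, 2, 3, 4, 5, 6, 7, 9, 11 and finishes in cycle 13. The model only tracks when a register's value becomes available; it does not check whether a reordering is legal.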

8 Instruction scheduling

Reorder instructions to hide latencies. Example 2:

- (1) `loadAI $sp, @x, r1`
- (4) `mult r1, r1, r1`
- (6) `mult r1, r1, r1`
- (8) `mult r1, r1, r1`
- (10) `storeAI r1, $sp, @x`

9 Instruction scheduling

Reorder instructions to hide latencies.

- We need to collect dependence info
- Scheduling affects register lifetimes ==> different demand for registers
  - Should we do register allocation before or after?
- How hard is it?
  - more than one instruction may be ready
  - too many variables may be live at the same time
  - NP-complete!
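To make "collect dependence info" concrete, here is a minimal sketch that finds the ordering constraints in the `w = w*2*x*y*z` code from the earlier slides. It is an illustration, not the book's algorithm: the function name `dependences`, the tuple encoding, and the trick of treating stack slots such as `@w` as ordinary names are assumptions made for this example. The result is a conservative over-approximation, because it compares every pair of instructions rather than tracking which definition actually reaches each use.

```python
def dependences(instrs):
    """Return a set of (earlier_index, later_index, kind) ordering constraints.

    instrs is a list of (opcode, names_read, name_written).
    kind is "RAW" (true dependence), "WAR" (anti) or "WAW" (output).
    """
    edges = set()
    for j, (_, reads_j, write_j) in enumerate(instrs):
        for i in range(j):
            _, reads_i, write_i = instrs[i]
            if write_i is not None and write_i in reads_j:
                edges.add((i, j, "RAW"))   # j reads what i wrote
            if write_j is not None and write_j in reads_i:
                edges.add((i, j, "WAR"))   # j overwrites what i read
            if write_i is not None and write_i == write_j:
                edges.add((i, j, "WAW"))   # both write the same name
    return edges


code = [
    ("loadAI", ["$sp", "@w"], "r1"),     # 0
    ("add", ["r1", "r1"], "r1"),         # 1
    ("loadAI", ["$sp", "@x"], "r2"),     # 2
    ("mult", ["r1", "r2"], "r1"),        # 3
    ("loadAI", ["$sp", "@y"], "r2"),     # 4
    ("mult", ["r1", "r2"], "r1"),        # 5
    ("loadAI", ["$sp", "@z"], "r2"),     # 6
    ("mult", ["r1", "r2"], "r1"),        # 7
    ("storeAI", ["r1", "$sp"], "@w"),    # 8
]
edges = dependences(code)

# Same code, but y is loaded into r3 instead of reusing r2.
renamed = list(code)
renamed[4] = ("loadAI", ["$sp", "@y"], "r3")
renamed[5] = ("mult", ["r1", "r3"], "r1")
renamed_edges = dependences(renamed)

print((3, 4, "WAR") in edges, (3, 4, "WAR") in renamed_edges)
```

In the original code the load of y (instruction 4) cannot move above the first multiply (instruction 3) because both use `r2`. Loading y into `r3` removes that constraint, at the cost of one more live register, which is the interaction between scheduling and register demand that the slide points out.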

10 Register allocation

- Consists of two parts:
  - register allocation
  - register assignment
- Goal: minimize spills
- How hard is it?
  - a basic block (BB) with one size of data: polynomial
  - otherwise, NP-complete
  - based on graph coloring.
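As a rough illustration of the graph-coloring view, the sketch below colors an interference graph with k registers and spills any value that cannot be colored. This is a simple greedy heuristic chosen for brevity, not the allocator described in the book; the function name `color` and the example graph are hypothetical.

```python
def color(interference, k):
    """Greedily assign one of k registers (0..k-1) to each value.

    interference maps each value to the set of values that are live at
    the same time and therefore cannot share a register with it.
    Returns (assignment, spilled).
    """
    assignment = {}
    spilled = []
    # Heuristic: handle the most constrained values first (ties by name).
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for v in order:
        used = {assignment[n] for n in interference[v] if n in assignment}
        free = [c for c in range(k) if c not in used]
        if free:
            assignment[v] = free[0]
        else:
            spilled.append(v)   # no register left: keep this value in memory
    return assignment, spilled


# Hypothetical example: a, b and c are all live together; d overlaps only a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

print(color(graph, 3))
print(color(graph, 2))
```

With three registers every value gets a register; with two, one of the three mutually interfering values has to be spilled.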

11 Code generation

Generating code for simple expressions

- If the expression is represented by a tree, a post-order walk gives an evaluation sequence.
- Changing the evaluation sequence may result in code that uses fewer registers.
- Idea: find out how many registers are needed for each subtree and evaluate the most expensive one first.
- Consider the following algorithm that labels tree nodes with the number of registers needed (for the ILOC architecture):

$$
\text{label}(n) =
\begin{cases}
1 & \text{if } n \text{ is a left leaf, or a right leaf and a variable} \\
0 & \text{if } n \text{ is a right leaf and a constant} \\
\max(\text{label}_{lchild}, \text{label}_{rchild}) & \text{if } \text{label}_{lchild} \neq \text{label}_{rchild} \\
\text{label}_{lchild} + 1 & \text{if } \text{label}_{lchild} = \text{label}_{rchild}
\end{cases}
$$
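The labeling rule can be sketched in a few lines. The tree encoding here is an assumption made for the example: a variable is a string, a constant is an integer, and an interior node is a tuple `(op, left, right)`. The root is treated as a left child, which the slides do not specify.

```python
def label(node, is_left=True):
    """Number of registers needed to evaluate node, per the labeling rule.

    node is a variable name (str), a constant (int), or (op, left, right).
    """
    if isinstance(node, str):
        return 1                        # a variable must be loaded
    if isinstance(node, int):
        return 1 if is_left else 0      # a right constant folds into an opI
    _, left, right = node
    l = label(left, True)
    r = label(right, False)
    return max(l, r) if l != r else l + 1


# x = a+2 - (c+d-4), the example expression from slide 2
tree = ("sub", ("add", "a", 2), ("sub", ("add", "c", "d"), 4))

print(label(("add", "a", 2)))    # 1
print(label(("add", "c", "d")))  # 2
print(label(tree))               # 2
```

For this expression the right subtree `c+d-4` is the more expensive one (label 2), so it should be evaluated first.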

12 Code generation

Generating code for simple expressions

The idea behind our algorithm:

- Variables need to be loaded in registers, so leaves need one register.
- For rhs constants we can use opI operations, so no extra register is needed.
- If we need k registers for each subtree, we'll use k for the left one, keep 1 for the result, and then use another k for the right subtree. That's a total of k+1 registers.
- If we need k registers for one subtree and m ...
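A minimal sketch of a code generator that follows this idea: it evaluates the subtree with the larger label first and uses an immediate (opI) form for right-hand constants. The tree encoding, the helper names `gen` and `generate`, and the register pool are assumptions made for this example; the instruction spelling follows the three-operand format from slide 1, and immediate opcodes are formed by appending `I` to the opcode, as in `addI`.

```python
def label(node, is_left=True):
    """Registers needed for node: str = variable, int = constant, tuple = (op, l, r)."""
    if isinstance(node, str):
        return 1
    if isinstance(node, int):
        return 1 if is_left else 0
    _, left, right = node
    l, r = label(left, True), label(right, False)
    return max(l, r) if l != r else l + 1


def gen(node, free, code):
    """Append code for node to `code`; return the register holding its value."""
    if isinstance(node, str):
        reg = free.pop()
        code.append(f"loadAI $sp, @{node}, {reg}")
        return reg
    if isinstance(node, int):
        reg = free.pop()
        code.append(f"loadI {node}, {reg}")
        return reg
    op, left, right = node
    if isinstance(right, int):
        # Right-hand constant: use the immediate form, no extra register.
        rl = gen(left, free, code)
        code.append(f"{op}I {rl}, {right}, {rl}")
        return rl
    # Evaluate the more expensive subtree first.
    if label(left, True) >= label(right, False):
        rl = gen(left, free, code)
        rr = gen(right, free, code)
    else:
        rr = gen(right, free, code)
        rl = gen(left, free, code)
    code.append(f"{op} {rl}, {rr}, {rl}")
    free.append(rr)   # the right operand's register is free again
    return rl


def generate(tree, num_regs):
    free = [f"r{i}" for i in range(num_regs, 0, -1)]   # pop() hands out r1 first
    code = []
    gen(tree, free, code)
    return code


# x = a+2 - (c+d-4), using exactly label(tree) registers
tree = ("sub", ("add", "a", 2), ("sub", ("add", "c", "d"), 4))
for line in generate(tree, label(tree)):
    print(line)
```

With two registers this emits the code for `c+d-4` first and then `a+2`. A plain left-to-right walk would hold `a+2` in one register while needing two more for `c+d`, for a total of three.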

13 Code generation

Generating code for expressions

- What if an expression contains a function call?
- Should the compiler be allowed to change the evaluation order?

Big idea #1 with respect to optimizing or moving code around: Be conservative!

Generating code for the following (read the book):

- boolean operators
- relational operators
- array references