
Optimization (textbook ch# 9) 薛智文 cwhsueh@csie.ntu.edu.tw http://www.csie.ntu.edu.tw/~cwhsueh/ 96 Spring

Introduction
For some compilers, the intermediate code is pseudo code for a virtual machine, and an interpreter of the virtual machine is invoked to execute the intermediate code.
No machine-dependent code generation is needed, but interpretation usually comes with great overhead.
Examples:
Pascal: P-code for the virtual P machine.
Java: bytecode for the Java virtual machine.
Optimization:
Machine-dependent issues.
Machine-independent issues.
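As a concrete illustration of the interpreter approach, here is a minimal sketch (not taken from the slides) of a tiny stack-based virtual machine in Python; the instruction names PUSH, ADD, MUL, and PRINT are made up for the example and are not the real P-code or Java bytecode instruction sets.

# A minimal sketch of the interpreter idea: intermediate code for a tiny
# stack-based virtual machine is executed directly, so no machine-dependent
# code generation is needed, at the cost of dispatch overhead on every
# instruction.  The instruction set here is hypothetical.

def run(code):
    stack = []
    for op, arg in code:
        if op == 'PUSH':
            stack.append(arg)
        elif op == 'ADD':
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == 'MUL':
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == 'PRINT':
            print(stack.pop())

if __name__ == '__main__':
    # intermediate code for: print(2 * 3 + 4)
    run([('PUSH', 2), ('PUSH', 3), ('MUL', None),
         ('PUSH', 4), ('ADD', None), ('PRINT', None)])   # prints 10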

Machine-dependent Issues (1/2)
Input and output formats: the formats of the intermediate code and of the target program.
Memory management: alignment, indirect addressing, paging, segmentation, and so on; the topics you learned in your assembly language class.
Instruction cost: special machine instructions can speed up execution. Examples:
Increment by 1.
Multiplying or dividing by 2.
Bit-wise manipulation.
Operators applied to a contiguous block of memory.
Pick the fastest instruction combination for a given target machine.
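The following is a minimal sketch of cost-driven instruction selection for a few of the special cases listed above; the mnemonics (INC, SHL, ADD, ...) and the cost numbers are hypothetical, and the sketch assumes the destination register already holds the value of the left operand.

# A minimal sketch (not from the slides) of picking a cheaper instruction
# for special cases of a three-address statement dst := lhs op rhs.
# Mnemonics and costs are hypothetical; dst is assumed to hold lhs already.

def select(dst, op, rhs):
    """Return a (mnemonic, operands, cost) triple."""
    if op == '+' and rhs == 1:
        return ('INC', (dst,), 1)                        # increment by 1
    if op == '*' and isinstance(rhs, int) and rhs > 0 and (rhs & (rhs - 1)) == 0:
        return ('SHL', (dst, rhs.bit_length() - 1), 1)   # multiply by 2^k -> shift
    generic = {'+': 'ADD', '-': 'SUB', '*': 'MUL', '/': 'DIV'}[op]
    return (generic, (dst, rhs), 3)                      # generic ops assumed costlier

if __name__ == '__main__':
    print(select('t1', '+', 1))   # ('INC', ('t1',), 1)
    print(select('t2', '*', 4))   # ('SHL', ('t2', 2), 1)
    print(select('t3', '*', 3))   # ('MUL', ('t3', 3), 3)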

Machine-dependent Issues (2/2)
Register allocation: partly machine-dependent and partly machine-independent.
The C language allows the programmer to manage a pool of registers; some languages leave the task entirely to the compiler.
Idea: keep the most heavily used intermediate results in registers. However, finding an optimal allocation for a limited set of registers is NP-hard.
Heuristic solutions: similar to the ones used for the swapping problem.
Example: for t := a + b, a two-register sequence
load R0, a
load R1, b
add R0, R1
store R0, t
versus a one-register sequence that adds b directly from memory
load R0, a
add R0, b
store R0, t
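As one possible illustration of the "keep the most used results in registers" idea, here is a minimal sketch that simply assigns the k most frequently referenced names to registers. The slides' heuristic (similar to the swapping problem) is not spelled out, so this usage-count rule is only a stand-in, not the actual method.

# A minimal sketch of a usage-count heuristic: the k most frequently
# referenced variables are kept in registers, the rest stay in memory.
# Real allocators (graph coloring, linear scan) are far more elaborate.

from collections import Counter

def allocate(three_address_code, k):
    """three_address_code: list of (dst, op, src1, src2) tuples."""
    uses = Counter()
    for dst, op, s1, s2 in three_address_code:
        for name in (dst, s1, s2):
            if isinstance(name, str):
                uses[name] += 1
    chosen = [name for name, _ in uses.most_common(k)]
    return {name: 'R%d' % i for i, name in enumerate(chosen)}

if __name__ == '__main__':
    code = [('t1', '*', 'a', 'a'),
            ('t2', '*', 'a', 'b'),
            ('t3', '+', 't1', 't2')]
    print(allocate(code, 2))   # e.g. {'a': 'R0', 't1': 'R1'}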

Machine-independent Issues
Dependence graphs.
Basic blocks and flow graphs.
Structure-preserving transformations.
Algebraic transformations.
Peephole optimization.

Dependence Graphs
In an expression, assume its dependence graph is given. We can evaluate the expression using any topological ordering of the graph, and there are many legal topological orderings; pick one that improves efficiency.
Example: on a machine with only 2 free registers, some of the intermediate results in order#1 must be stored in temporary space, and STORE/LOAD takes time.
(Figure: a dependence graph over E0 through E6, comparing two evaluation orders, order#1 and order#2, by the number of registers each needs.)
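A minimal sketch of why the choice of topological ordering matters: it enumerates the legal orderings of a small dependence graph and reports the register pressure (maximum number of simultaneously live values) of each. The edges below are assumed for illustration; the actual E0 through E6 graph in the figure is not recoverable from the transcript.

# Enumerate topological orderings of a small dependence graph and compare
# their register pressure.  deps[n] lists the nodes whose results n consumes.

from itertools import permutations

def is_topological(order, deps):
    pos = {n: i for i, n in enumerate(order)}
    return all(pos[d] < pos[n] for n, ds in deps.items() for d in ds)

def pressure(order, deps):
    step = {n: i for i, n in enumerate(order)}
    last_use = {n: step[n] for n in order}             # a final result is live at its own step
    for n, ds in deps.items():
        for d in ds:
            last_use[d] = max(last_use[d], step[n])    # operand d stays live until n consumes it
    return max(sum(1 for v in order if step[v] <= i <= last_use[v])
               for i in range(len(order)))

if __name__ == '__main__':
    deps = {'E1': ['E2', 'E3'], 'E4': ['E5', 'E6'], 'E0': ['E1', 'E4']}   # assumed edges
    nodes = ['E0', 'E1', 'E2', 'E3', 'E4', 'E5', 'E6']
    orders = [o for o in permutations(nodes) if is_topological(o, deps)]
    best = min(orders, key=lambda o: pressure(o, deps))
    worst = max(orders, key=lambda o: pressure(o, deps))
    print('best :', best, pressure(best, deps))
    print('worst:', worst, pressure(worst, deps))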

Basic Blocks and Flow Graphs
Basic block: a sequence of code such that jump statements, if any, appear only at the end of the sequence; code in other basic blocks can jump only to the beginning of this sequence, never into the middle.
Example:
t1 := a * a
t2 := a * b
t3 := 2 * t2
goto outter
Flow graph: a flow-chart-like graph that represents a program, where nodes are basic blocks and edges are the flow of control.
(Figure: a flow graph with basic blocks B1, B2, and B3.)

How to find basic blocks?
How to find leaders, which are the first statements of basic blocks:
The first statement of a program is a leader.
For each conditional or unconditional goto, its target is a leader, and the statement following the goto is also a leader.
Use the leaders to partition the program into basic blocks.
Ideas for optimization: two basic blocks are equivalent if they compute the same expressions; use the transformation techniques below to perform machine-independent optimization.
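A minimal sketch of the leader rules above, assuming a toy three-address syntax in which labels are written "L:" at the start of a statement and jumps are written "goto L" or "if ... goto L"; the driver uses a shortened variant of the dot-product code on the next slide.

# Find leaders, then split the statement list at the leaders to get blocks.

import re

def find_leaders(stmts):
    leaders = {0}                                   # rule 1: first statement
    targets = {}
    for i, s in enumerate(stmts):
        m = re.match(r'(\w+):', s)
        if m:
            targets[m.group(1)] = i
    for i, s in enumerate(stmts):
        m = re.search(r'goto\s+(\w+)', s)
        if m:
            leaders.add(targets[m.group(1)])        # rule 2: jump target
            if i + 1 < len(stmts):
                leaders.add(i + 1)                  # rule 3: statement after a jump
    return sorted(leaders)

def basic_blocks(stmts):
    ls = find_leaders(stmts) + [len(stmts)]
    return [stmts[a:b] for a, b in zip(ls, ls[1:])]

if __name__ == '__main__':
    prog = ['prod := 0', 'i := 1',
            'loop: t1 := 4 * i', 't2 := a[t1]', 'prod := prod + t2',
            'i := i + 1', 'if i <= 20 goto loop', 'x := prod']
    for block in basic_blocks(prog):
        print(block)            # prints three blocks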

Finding Basic Blocks — Examples
Example: three-address code for computing the dot product of two vectors a and b.
prod := 0
i := 1
loop: t1 := 4 * i
t2 := a[t1]
t3 := 4 * i
t4 := b[t3]
t5 := t2 * t4
t6 := prod + t5
prod := t6
t7 := i + 1
i := t7
if i ≦ 20 goto loop
· · ·
There are three basic blocks in the above example: the two initialization statements, the loop body starting at loop, and the code following the conditional goto.

DAG Representation of a Basic Block
Inside a basic block, the expressions can be represented by a DAG that is similar in spirit to a dependence graph; the graph might not be connected.
Example:
(1) t1 := 4 * i
(2) t2 := a[t1]
(3) t3 := 4 * i
(4) t4 := b[t3]
(5) t5 := t2 * t4
(6) t6 := prod + t5
(7) prod := t6
(8) t7 := i + 1
(9) i := t7
(10) if i ≦ 20 goto (1)
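A minimal sketch of DAG construction for a basic block: each (op, left, right) combination gets a single node, so a repeated computation such as 4 * i maps to the node already built for it. Array accesses, copies, and the conditional jump from the example are omitted to keep the sketch short.

# Build a DAG for a block of quadruples, reusing nodes for repeated
# computations (this is also what exposes common subexpressions).

def build_dag(block):
    """block: list of (dst, op, arg1, arg2) quadruples."""
    nodes = {}          # (op, left, right) -> interior node
    current = {}        # variable name -> node holding its current value
    def node_of(x):
        if x not in current:                      # a leaf for an initial value
            current[x] = ('leaf', x)
        return current[x]
    for dst, op, a1, a2 in block:
        key = (op, node_of(a1), node_of(a2))
        nodes.setdefault(key, key)                # reuse an existing node if any
        current[dst] = nodes[key]
    return nodes, current

if __name__ == '__main__':
    block = [('t1', '*', '4', 'i'),
             ('t3', '*', '4', 'i'),               # same node as t1
             ('t5', '*', 't1', 't3'),
             ('t6', '+', 'prod', 't5')]
    nodes, current = build_dag(block)
    print(len(nodes), 'interior nodes')           # 3, not 4: the 4*i node is shared
    print(current['t1'] is current['t3'])         # True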

Structure-Preserving Transformations (1/2)
Techniques: use the information contained in the flow graph and in the DAG representation of basic blocks to do optimization.
Common subexpression elimination.
Dead-code elimination: remove unreachable code.
Renaming temporary variables: better usage of registers and avoidance of unneeded temporary variables.
Example (common subexpression elimination), before:
a := b+c
b := a−d
c := b+c
d := a−d
after (the value of a−d is still available in b, so the last statement becomes a copy):
a := b+c
b := a−d
c := b+c
d := b
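A minimal sketch of local common-subexpression elimination over quadruples that reproduces the before/after example above; an available expression is invalidated as soon as one of its operands, or the variable holding its value, is redefined.

def local_cse(block):
    available = {}                         # (op, a1, a2) -> variable holding that value
    out = []
    for dst, op, a1, a2 in block:
        key = (op, a1, a2)
        if key in available:
            out.append((dst, ':=', available[key], None))   # reuse the earlier value
        else:
            out.append((dst, op, a1, a2))
        # redefining dst kills every expression that reads dst or lives in dst
        available = {k: v for k, v in available.items()
                     if dst not in (k[1], k[2], v)}
        if key not in available and dst not in (a1, a2):
            available[key] = dst
    return out

if __name__ == '__main__':
    before = [('a', '+', 'b', 'c'), ('b', '-', 'a', 'd'),
              ('c', '+', 'b', 'c'), ('d', '-', 'a', 'd')]
    for quad in local_cse(before):
        print(quad)        # the last quadruple becomes ('d', ':=', 'b', None)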

Structure-Preserving Transformations (2/2)
Interchange of two independent adjacent statements, which might be useful in discovering the above three transformations, e.g., when two occurrences of the same expression E1 are too far apart to keep the value of E1 in a register.
Example:
t1 := E1
t2 := const // the statements defining t2 and tn can be swapped to bring the two uses of E1 closer
...
tn := E1
Note: the order of dependence cannot be altered by the exchange; if instead
t2 := t1 + tn // then t2 and tn cannot be swapped
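A minimal sketch of the independence test behind such an interchange: two adjacent statements may be swapped only if neither one writes a variable that the other reads or writes. The quadruple format is the same as in the earlier sketches.

def reads(stmt):
    _, _, a1, a2 = stmt
    return {a for a in (a1, a2) if isinstance(a, str)}

def writes(stmt):
    return {stmt[0]}

def can_swap(s1, s2):
    """True if adjacent statements s1; s2 may be reordered."""
    return (writes(s1).isdisjoint(reads(s2) | writes(s2)) and
            writes(s2).isdisjoint(reads(s1)))

if __name__ == '__main__':
    t2_def = ('t2', ':=', 'const', None)
    tn_def = ('tn', ':=', 'E1', None)
    print(can_swap(t2_def, tn_def))                    # True: independent
    print(can_swap(('t2', '+', 't1', 'tn'), tn_def))   # False: the first reads tn, the second redefines it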

Algebraic Transformations
Algebraic identities:
x + 0 == 0 + x == x
x − 0 == x
x * 1 == 1 * x == x
x / 1 == x
Reduction in strength:
x² == x * x
2.0 * x == x + x
x / 2 == x * 0.5
Constant folding:
2 * 3.14 == 6.28
Standard representation for subexpressions by commutativity and associativity:
n * m == m * n
b < a == a > b
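A minimal sketch that applies a few of the rewrites above to a single expression node (op, left, right), where leaves are numbers or variable names; it is illustrative only and makes no attempt at completeness.

def simplify(op, left, right):
    """Simplify one node (op, left, right); leaves are numbers or names."""
    num = (int, float)
    if op in ('+', '*') and isinstance(left, num) and isinstance(right, num):
        return left + right if op == '+' else left * right      # constant folding
    if op == '+' and right == 0: return left                    # x + 0 == x
    if op == '+' and left == 0:  return right                   # 0 + x == x
    if op == '*' and right == 1: return left                    # x * 1 == x
    if op == '*' and left == 1:  return right                   # 1 * x == x
    if op == '/' and right == 1: return left                    # x / 1 == x
    if op == '*' and left == 2.0: return ('+', right, right)    # 2.0 * x == x + x
    if op == '/' and right == 2: return ('*', left, 0.5)        # x / 2 == x * 0.5
    if op in ('+', '*') and isinstance(left, str) and isinstance(right, str):
        left, right = sorted([left, right])                     # canonical operand order
    return (op, left, right)

if __name__ == '__main__':
    print(simplify('+', 'x', 0))      # x
    print(simplify('*', 2.0, 'x'))    # ('+', 'x', 'x')
    print(simplify('*', 2, 3.14))     # 6.28
    print(simplify('*', 'n', 'm'))    # ('*', 'm', 'n')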

Peephole Optimization (1/2)
Idea: statement-by-statement translation may generate redundant code. Locally improve target-code performance by examining a short sequence of target instructions (called a peephole) and optimizing within that sequence. The complexity depends on the “window size”.
Techniques: remove redundant code.
Redundant loads and stores:
MOV R0, a
MOV a, R0
Unreachable code: an unlabeled instruction immediately following an unconditional jump may be removed.
If statements based on constants: if debug then · · · (the whole statement can be removed when debug is known to be a constant false).
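A minimal peephole sketch with a window of two instructions that drops the redundant store/load pair shown above; instructions are plain strings in a toy "MOV dst, src" syntax, and a real pass would also have to check that the dropped instruction is not a jump target.

def peephole(instrs):
    out = []
    for ins in instrs:
        if out:
            prev = out[-1].replace(',', '').split()
            cur = ins.replace(',', '').split()
            # after "MOV R0, a" the value is in both places,
            # so the mirrored "MOV a, R0" is redundant
            if (len(prev) == len(cur) == 3 and prev[0] == cur[0] == 'MOV'
                    and prev[1] == cur[2] and prev[2] == cur[1]):
                continue
        out.append(ins)
    return out

if __name__ == '__main__':
    code = ['MOV R0, a', 'MOV a, R0', 'ADD R0, b']
    print(peephole(code))             # ['MOV R0, a', 'ADD R0, b']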

Peephole Optimization (2/2)
More techniques:
Flow-of-control optimization, before:
goto L1
· · ·
L1: goto L2
after:
goto L2
· · ·
L1: goto L2
Algebraic simplification.
Use of special machine idioms.
Better usage of registers.
Loop unwrapping.
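A minimal sketch of the flow-of-control rewrite above: a goto whose target statement is itself an unconditional goto is redirected to the final target. The toy syntax matches the example.

import re

def collapse_jump_chains(instrs):
    # map each label whose statement is just "goto M" to M
    chain = {}
    for ins in instrs:
        m = re.fullmatch(r'(\w+):\s*goto\s+(\w+)', ins)
        if m:
            chain[m.group(1)] = m.group(2)

    def final(label, seen=()):
        # follow the chain, guarding against goto cycles
        if label in chain and label not in seen:
            return final(chain[label], seen + (label,))
        return label

    out = []
    for ins in instrs:
        m = re.fullmatch(r'goto\s+(\w+)', ins)
        out.append('goto ' + final(m.group(1)) if m else ins)
    return out

if __name__ == '__main__':
    code = ['goto L1', 'x := x + 1', 'L1: goto L2', 'L2: y := 0']
    print(collapse_jump_chains(code))
    # ['goto L2', 'x := x + 1', 'L1: goto L2', 'L2: y := 0']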

Correctness after Optimization
When side effects are expected, different evaluation orders may produce different results for an expression. Assume E5 is a procedure call with the side effect of changing some values in E6; then LL and LR parsing, which evaluate in different orders, produce different results.
Watch out for precision when doing algebraic simplification:
if (x = 321.00000123456789 − 321.00000123456788) > 0 then · · ·
We need to make sure that the code before and after optimization produces the same results. Complications arise when a debugger is involved.
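A small illustration (assumed, not from the slides) of the precision pitfall: mathematically equivalent rewrites need not be equal in floating point, and the constant difference in the example above is smaller than double precision can represent near 321, so it evaluates to zero.

# Precision pitfalls in algebraic simplification, using IEEE-754 doubles.

x = 321.00000123456789 - 321.00000123456788
print(x, x > 0)          # 0.0 False: the difference is lost to rounding

# Reassociating a sum also changes the result:
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))   # False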