CH10.1 CSE 4100 Chap 10: Optimization Prof. Steven A. Demurjian Computer Science & Engineering Department The University of Connecticut 371 Fairfield Way,

Presentation transcript:

CH10.1 CSE 4100 Chap 10: Optimization Prof. Steven A. Demurjian Computer Science & Engineering Department The University of Connecticut 371 Fairfield Way, Unit 2155 Storrs, CT (860) Material for course thanks to: Laurent Michel Aggelos Kiayias Robert LeBarre

CH10.2 CSE 4100 Overview  Motivation and Background  Code Level Optimization  Common Sub-expression elimination  Copy Propagation  Dead-code elimination  Peephole optimization  Load/Store elimination  Unreachable code  Flow of Control Optimization  Algebraic simplification  Strength Reduction  Concluding Remarks/Looking Ahead

CH10.3 CSE 4100 Motivation  What we achieved  We have working machine code  What is missing  Code generation does not see the “big” picture  We can generate poor instruction sequences  What we need  A simple way to locally improve the code quality  Goal:  Transition from “Lousy” Intermediate Code to More Effective and Efficient Code  Response Time, Performance (Algorithms), Memory Usage  Measured in terms of Number of Variables Saved, Operands Saved, Memory Accesses, etc.

CH10.4 CSE 4100 Where can Optimization Occur?  Software Engineer can:  Profile Program  Change Algorithm Data  Transform/Improve Loops  Compiler can:  Improve Loops/Proc Calls  Calculate Addresses  Use Registers  Select Instructions  Perform Peephole Opt.  [Figure: Source Program, Front End (LA, Parse, Int. Code), Int. Code, Code Generator, Target Program]  All are Optimizations  1st is User Controlled and Defined  At Intermediate Code Level by Compiler  At Assembly Level for Target Architecture (to take advantage of different machine features)

CH10.5 CSE 4100 Code Level Optimization  First Look at Optimization  Section 9.4 in 1st Edition  Introduce and Discuss Basic Blocks  Requirements for Optimization  Section 10.1 in 1st Edition  Basic Blocks, Flow Graphs  In-depth Examination of Optimization  Section 10.2 in 1st Edition  Function Preserving Transformations  Loop Optimizations

CH10.6 CSE 4100 First Look at Optimization  Optimization Applied to 3 Address Coding (3AC) Version of Source Program - Examples:  a + b[i] * c
    t1 = b[i]
    t2 = t1 * c
    t3 = a + t2

CH10.7 CSE 4100 First Look at Optimization  Once Code has been Generated in 3AC, an Algorithm can be Applied to:  Identify each Basic Block which Represents a set of Three Address Statements where  Execution Enters at Top and Leaves at Bottom  No Branches within Code  Represent the Control Flow  Dependencies Among and Between Basic Blocks  Defines what is Termed a “Flow Graph”  Let’s see an Example
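The basic-block identification described on this slide can be sketched in a few lines. This is a minimal illustration, not the course's algorithm: it assumes each 3AC statement is a string and that jump targets are written as 1-based statement numbers such as `goto (3)`. The names `find_leaders`, `basic_blocks`, and `program` are hypothetical, and the two initialization statements at the top of `program` are assumed so the loop has a preamble.

```python
import re

def find_leaders(code):
    """Return the sorted 0-based indices of leader statements.

    Assumed encoding: each statement is a string, and jump targets are
    written as 1-based statement numbers, e.g. 'goto (3)'.
    """
    leaders = {0}                                  # rule 1: the first statement
    for idx, stmt in enumerate(code):
        match = re.search(r"goto \((\d+)\)", stmt)
        if match:
            leaders.add(int(match.group(1)) - 1)   # rule 2: target of a jump
            if idx + 1 < len(code):
                leaders.add(idx + 1)               # rule 3: statement after a jump
    return sorted(leaders)

def basic_blocks(code):
    """Split the statement list at each leader."""
    bounds = find_leaders(code) + [len(code)]
    return [code[start:end] for start, end in zip(bounds, bounds[1:])]

# Hypothetical 12-statement program; statements 1 and 2 are assumed.
program = [
    "prod := 0",
    "i := 1",
    "t1 := 4 * i",
    "t2 := a[t1]",
    "t3 := 4 * i",
    "t4 := b[t3]",
    "t5 := t2 * t4",
    "t6 := prod + t5",
    "prod := t6",
    "t7 := i + 1",
    "i := t7",
    "if i <= 20 goto (3)",
]
blocks = basic_blocks(program)
```

Here `blocks` holds two basic blocks: the two initializations, and the loop body that starts at the jump target. An edge from the second block back to itself, plus the fall-through edge from the first block, would give the flow graph.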

CH10.8 CSE 4100 First Look at Optimization  Steps 1 to 12 from two Slides Back Represented as:  Optimization Works with Basic Blocks and Flow Graph to Perform Transformations that:  Generate Equivalent Flow Graph w/Improved Perf.

CH10.9 CSE 4100 First Look at Optimization  Optimization will Perform Transformations on Basic Blocks/Flow Graph  Resulting Graph(s) Passed Through to Final Code Generation to Obtain More Optimal Code  Two Fold Goal of Optimization  Reduce Time  Reduce Space  Optimization Used to Come at a Cost:  In “Old Days” Turning on Optimizer Could Double the Compilation Time  From 2 hours to 4 hours  Is this an Issue Today?

CH10.10 CSE 4100 First Look at Optimization  Two Types of Transformations  Structure Preserving  Inherent Structure and Implicit Functionality of Basic Blocks is Unchanged  Algebraic  Elimination of Useless Expressions x = x + 0 or y = y * 1  Replace Expensive Operators Change x = y ** 2 to x = y * y Why?  We’ll Focus on Both …

CH10.11 CSE 4100 Structure Preserving Transformations  Common Sub-Expression Elimination  How can the Following Code be Improved?
    a = b + c
    b = a - d
    c = b + c
    d = a - d
 What Must We Make Sure Doesn’t Happen?  Improved Last Statement: d = b
 Dead-Code Elimination  If x is not Used in Block, Can it be Removed? x = y + z  What are the Possible Ramifications if so?
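One way to see why only the last statement changes is to track which version of each variable an expression was computed from. The sketch below is an assumed, simplified form of local value numbering; the tuple encoding `(dest, left, op, right)` and the name `local_cse` are hypothetical. It ignores commutativity and only looks inside one basic block.

```python
def local_cse(block):
    """Replace a recomputed expression with a copy of the variable holding it.

    block: list of (dest, left, op, right) tuples.
    A copy is emitted as (dest, source, None, None).
    """
    version = {}     # variable -> number of assignments seen so far
    available = {}   # (left, left_ver, op, right, right_ver) -> (holder, holder_ver)
    out = []
    for dest, left, op, right in block:
        key = (left, version.get(left, 0), op, right, version.get(right, 0))
        holder = available.get(key)
        # Reuse is safe only if the holder has not been reassigned since.
        reuse = holder is not None and version.get(holder[0], 0) == holder[1]
        out.append((dest, holder[0], None, None) if reuse else (dest, left, op, right))
        version[dest] = version.get(dest, 0) + 1
        if not reuse:
            available[key] = (dest, version[dest])
    return out

block = [
    ("a", "b", "+", "c"),
    ("b", "a", "-", "d"),
    ("c", "b", "+", "c"),   # not common with statement 1: b was reassigned
    ("d", "a", "-", "d"),   # same a and d as statement 2, and b still holds it
]
optimized = local_cse(block)
```

The third statement is left alone because `b` changed between the two occurrences of `b + c`, which is exactly the situation the slide warns about. The fourth statement becomes the copy `d = b`.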

CH10.12 CSE 4100 Structure Preserving Transformations  Renaming Temporary Variables  Consider the code t = b + c  Can be Changed to u = b + c  May Reduce the Number of temporaries  Make Change from all t’s to all u’s  Interchange of Statements  Consider:
    t1 = b + c
    t2 = x + y
 and Change to:
    t2 = x + y
    t1 = b + c
 This can Occur as Long as:  x and y not t1  b and c not t2  What Do you have to Check?
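The interchange condition can be written as a small predicate. This is a sketch under the assumption that each statement is summarized as a destination plus the set of names it reads; `can_interchange` is a hypothetical helper, not part of the course material.

```python
def can_interchange(first, second):
    """Return True when two adjacent statements may be swapped.

    Each statement is (dest, uses), where uses is the set of names it reads.
    The swap is safe when neither statement writes a name the other reads,
    and the two statements do not write the same name.
    """
    dest1, uses1 = first
    dest2, uses2 = second
    return dest1 != dest2 and dest1 not in uses2 and dest2 not in uses1

independent = can_interchange(("t1", {"b", "c"}), ("t2", {"x", "y"}))
dependent = can_interchange(("t1", {"b", "c"}), ("t2", {"t1", "y"}))
```

`independent` is true for the slide's example, while `dependent` is false because the second statement reads `t1`.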

CH10.13 CSE 4100 Requirements for Optimization  Identify Frequently Executed Portions of Code and Make them Perform Better  Rule-of-Thumb - Most Programs spend 80% of their Time in 20% of Code – Is this True?  We Focus on Loops since Every Gain in Space or Time is Multiplied by Loop Iterations  Reduce Loop’s Code and Improve Performance  What Other Programming Technique Should be a Major Concern for Optimization?

CH10.14 CSE 4100 Requirements for Optimization  Criteria for Transformations  Preserve Meaning of Code Don’t Change Output, Introduce Errors, etc.  Speed up Programs by Measurable Amount (on Average for Entire Code)  Must be Worth the Effort Stick to Meaningful, Useful Transformations  Provide Different Versions of Compiler  Non-Optimizing  Optimizing  Extra Optimization on Demand

CH10.15 CSE 4100 Requirements for Optimization  Beware that Some Optimization Directives are Ignored!  In C, Define variable as “register int i;”  While a Feature of Language, cc States that these Directives are Ignored and Compiler Controls Use of Registers

CH10.16 CSE 4100 The Overall Optimization Process  Advantages  Intermediate Code has Explicit Operations and Their Identification Promotes Optimization  Intermediate Code is Relatively Machine Independent  Therefore, Optimization Doesn’t Impact Final Code Generation

CH10.17 CSE 4100 Example Source Code

CH10.18 CSE 4100 Generated Three Address Coding

CH10.19 CSE 4100 Flow Graph of Basic Blocks

CH10.20 CSE 4100 In-depth Examination of Optimization  Code-Transformation Techniques:  Local – within a “Basic Block”  Global – between “Basic Blocks”  Data Flow Dependencies Determined by Inspection what do i, a, and v refer to?  Dependent in Another Basic Block  Scoping is Very Critical

CH10.21 CSE 4100 In-depth Examination of Optimization  Function Preserving Transformations  Common Subexpressions  Copy Propagation  Dead Code Elimination  Loop Optimizations  Code Motion  Induction Variables  Strength Reduction

CH10.22 CSE 4100 Common Sub-Expressions  E is a Common Sub-Expression if  E was Previously Computed  Value of E Unchanged since Previous Computation  What Can be Saved in B5?  t6 and t7 same computation  t8 and t10 same computation  Save:  Remove 2 temp variables  Remove 2 multiplications  Remove 4 variable accesses  Remove 2 assignments
B5 before:
    t6 := 4 * i
    x := a[t6]
    t7 := 4 * i
    t8 := 4 * j
    t9 := a[t8]
    a[t7] := t9
    t10 := 4 * j
    a[t10] := x
    goto B2
B5 after:
    t6 := 4 * i
    x := a[t6]
    t8 := 4 * j
    t9 := a[t8]
    a[t6] := t9
    a[t8] := x
    goto B2

CH10.23 CSE 4100 Common Sub-Expressions  What about B6?  t11 and t12  t13 and t15  Similar Savings as in B5
B6 before:
    t11 := 4 * i
    x := a[t11]
    t12 := 4 * i
    t13 := 4 * n
    t14 := a[t13]
    a[t12] := t14
    t15 := 4 * n
    a[t15] := x
B6 after:
    t11 := 4 * i
    x := a[t11]
    t13 := 4 * n
    t14 := a[t13]
    a[t11] := t14
    a[t13] := x

CH10.24 CSE 4100 Common Sub-Expressions  What else Can be Accomplished?  Where is Variable j Determined?  In B3 – and when drop through B3 to B4 and into B5, no change occurs to j!  What Does B5 Become?  Are we done? No, t9 is the same as t5!  Again savings in access, variables, operations, etc.
B5 so far:
    t6 := 4 * i
    x := a[t6]
    t8 := 4 * j
    t9 := a[t8]
    a[t6] := t9
    a[t8] := x
    goto B2
B3 (falls through to B4):
    j := j - 1
    t4 := 4 * j
    t5 := a[t4]
    if t5 > v goto B3
B5 after reusing t4:
    t6 := 4 * i
    x := a[t6]
    t9 := a[t4]
    a[t6] := t9
    a[t4] := x
    goto B2
B5 after reusing t5:
    t6 := 4 * i
    x := a[t6]
    a[t6] := t5
    a[t4] := x
    goto B2

CH10.25 CSE 4100 Common Sub-Expressions  Are we done yet?  Where is “i” defined?  Any Values we can Leverage?  Yes!  t2 = 4*i Defined in B2 and is unchanged as it arrives at B5  t3 = a[t2] in B3 and B2 and also unchanged as it arrives  Result (Second Listing) Saves:  From 9 statements down to 4  4 Multiplications are Gone  4 addr/array offsets are only 2
B5 before:
    t6 := 4 * i
    x := a[t6]
    a[t6] := t5
    a[t4] := x
    goto B2
B5 after:
    x := t3
    a[t2] := t5
    a[t4] := x
    goto B2

CH10.26 CSE 4100 Common Sub-Expressions  B6 is Similarly Changed ….
B6 before:
    t11 := 4 * i
    x := a[t11]
    t13 := 4 * n
    t14 := a[t13]
    a[t11] := t14
    a[t13] := x
B6 after:
    x := t3
    t14 := a[t1]
    a[t2] := t14
    a[t1] := x

CH10.27 CSE 4100 Resulting Flow Diagram

CH10.28 CSE 4100 Copy Propagation  Introduce a Common Copy Statement to Replace an Arithmetic Calculation with Assignment  Regardless of the Path Chosen, the use of an Assignment Saves Time and Space
Before (two paths that join):
    path 1:  a := d + e
    path 2:  b := d + e
    join:    c := d + e
After:
    path 1:  t := d + e
             a := t
    path 2:  t := d + e
             b := t
    join:    c := t

CH10.29 CSE 4100 Copy Propagation  In our Example for B5 and B6 Below:  Since x is t3, we can replace the use of x on right hand side as below:  We’ll come back to this shortly!
B6 before:
    x := t3
    t14 := a[t1]
    a[t2] := t14
    a[t1] := x
B5 before:
    x := t3
    a[t2] := t5
    a[t4] := x
    goto B2
B6 after:
    x := t3
    t14 := a[t1]
    a[t2] := t14
    a[t1] := t3
B5 after:
    x := t3
    a[t2] := t5
    a[t4] := t3
    goto B2
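The substitution of t3 for x can be sketched as a forward pass that remembers active copies. This is an illustrative sketch only: the encoding `(dest, op, args)`, the use of `None` as the destination of an array store, and the name `propagate_copies` are all assumptions made for the example.

```python
def propagate_copies(block):
    """Replace uses of a copied variable with the variable it was copied from.

    block: list of (dest, op, args). A copy is (dest, "copy", (src,)).
    An array store has dest None and args (array, index, value).
    """
    copies = {}   # variable -> variable it is currently a copy of
    out = []
    for dest, op, args in block:
        args = tuple(copies.get(arg, arg) for arg in args)   # substitute known copies
        out.append((dest, op, args))
        if dest is not None:
            # dest is overwritten: forget every copy relation that mentions it
            copies = {k: v for k, v in copies.items() if k != dest and v != dest}
            if op == "copy":
                copies[dest] = args[0]
    return out

# B5 after common sub-expression elimination (goto omitted)
b5 = [
    ("x", "copy", ("t3",)),
    (None, "store", ("a", "t2", "t5")),
    (None, "store", ("a", "t4", "x")),
]
b5_propagated = propagate_copies(b5)
```

After the pass, the last store reads `t3` instead of `x`, matching the B5 listing on the slide. The copy `x := t3` itself is still present; removing it is a separate step.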

CH10.30 CSE 4100 Dead Code Elimination  Variable is “Dead” if its Value will never be Utilized Again Subsequently  Otherwise, Variable is “Live”  What’s True about B5 and B6?  Can Any Statements be Eliminated? Which Ones? Why?
B6:
    x := t3
    t14 := a[t1]
    a[t2] := t14
    a[t1] := t3
B5:
    x := t3
    a[t2] := t5
    a[t4] := t3
    goto B2
 B5 and B6 are Now Optimized:  B5 has 9 Statements Reduced to 3  B6 has 8 Statements Reduced to 3
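The liveness question on this slide can be answered with a backward scan over the block. The sketch below assumes the same hypothetical `(dest, op, args)` encoding as a generic 3AC block, assumes that statements without a destination (array stores) always have an effect, and assumes the caller supplies the set of variables live on exit. The name `eliminate_dead` is made up for the example.

```python
def eliminate_dead(block, live_out):
    """Drop assignments whose destination is never read afterwards.

    block: list of (dest, op, args); dest is None for array stores,
    which are always kept. live_out: names still needed after the block.
    """
    live = set(live_out)
    kept = []
    for dest, op, args in reversed(block):
        if dest is not None and dest not in live:
            continue                 # value is never read again: drop it
        kept.append((dest, op, args))
        live.discard(dest)           # dest is defined here, so not live above
        live.update(args)            # everything read here is live above
    kept.reverse()
    return kept

# B5 after copy propagation (goto omitted); x is assumed dead on exit
b5 = [
    ("x", "copy", ("t3",)),
    (None, "store", ("a", "t2", "t5")),
    (None, "store", ("a", "t4", "t3")),
]
b5_trimmed = eliminate_dead(b5, live_out=set())
```

With `x` assumed dead on exit, `x := t3` disappears and B5 is left with two stores plus the `goto B2`, which is the three-statement block the slide reports. If `x` were live on exit, the copy would have to stay.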

CH10.31 CSE 4100 Loop Optimizations  Three Types: Code Motion, Induction Variables, and Strength Reduction  Code Motion  Remove Invariant Operations from Loop
    while (limit * 2 > i) do
 Replaced by:
    t = limit * 2
    while (t > i) do
 Induction Variables  Identify Which Variables are Interdependent or in Step
    j = j - 1
    t4 = 4 * j
 Replaced by below with an initialization of t4
    t4 = t4 - 4

CH10.32 CSE 4100 Loop Optimizations  Strength Reduction  Replace an Expensive Operation (Such as Multiply) with a Cheaper Operation (Such as Add)  In B4, i and j can be replaced with t2 and t4  This Eliminates the need for Variables i and j
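A quick way to convince yourself that the induction-variable rewrite is safe is to run both forms side by side. The two functions below are hypothetical stand-ins for the loop body; they record the value of t4 on each iteration so the sequences can be compared.

```python
def offsets_with_multiply(j, steps):
    """Original form: recompute t4 = 4 * j on every iteration."""
    seen = []
    for _ in range(steps):
        j = j - 1
        t4 = 4 * j
        seen.append(t4)
    return seen

def offsets_with_subtract(j, steps):
    """Rewritten form: t4 is initialized once, then stepped by 4."""
    seen = []
    t4 = 4 * j                # one-time initialization before the loop
    for _ in range(steps):
        t4 = t4 - 4           # stays equal to 4 * j after j := j - 1
        seen.append(t4)
    return seen

same = offsets_with_multiply(10, 5) == offsets_with_subtract(10, 5)
```

Both functions produce the same sequence, so the multiplication inside the loop can be traded for a subtraction as long as t4 is initialized to 4 * j before the loop is entered.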

CH10.33 CSE 4100 Final Optimized Flow Graph – Done?

CH10.34 CSE 4100 Turn to Prof. Michel’s Slides …  Motivation  Rewrite the basic block to eliminate sub-expressions  Technique  Change the representation  Move to a tree!

CH10.35 CSE 4100 Example
    L1: t1 := 4 * i;
        t2 := a[t1];
        t3 := 4 * i;
        t4 := b[t3];
        t5 := t2 * t4;
        t6 := prod + t5;
        prod := t6;
        t7 := i + 1;
        i := t7;
        if i <= 20 then goto L1

CH10.45 CSE 4100 Example  What we have  Common sub-expressions are known  Used variables are known (leaves)  Live on exit are known
    L1: t1 := 4 * i;
        t2 := a[t1];
        t3 := 4 * i;
        t4 := b[t3];
        t5 := t2 * t4;
        t6 := prod + t5;
        prod := t6;
        t7 := i + 1;
        i := t7;
        if i <= 20 then goto L1
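The tree (more precisely, DAG) that these slides build up can be sketched as follows. This is an assumed construction, not a transcription of the slides: statements are encoded as `(dest, op, left, right)`, constants and array names are treated as leaves, array stores are not handled, and the names `build_dag`, `nodes`, and `current` are hypothetical.

```python
def build_dag(block):
    """Build a DAG for one basic block.

    Returns (nodes, current): nodes[i] is ("leaf", name, None) or
    (op, left_id, right_id); current maps each variable to the node
    holding its value at the end of the block.
    """
    nodes = []     # node id -> node description
    index = {}     # node description -> node id, so equal nodes are shared
    current = {}   # variable -> node id of its current value

    def node_for(desc):
        if desc not in index:
            index[desc] = len(nodes)
            nodes.append(desc)
        return index[desc]

    def value_of(name):
        if name not in current:          # first use: the incoming value is a leaf
            current[name] = node_for(("leaf", name, None))
        return current[name]

    for dest, op, left, right in block:
        if op == "copy":
            current[dest] = value_of(left)
        else:
            current[dest] = node_for((op, value_of(left), value_of(right)))
    return nodes, current

loop_body = [
    ("t1", "*", "4", "i"),
    ("t2", "[]", "a", "t1"),
    ("t3", "*", "4", "i"),
    ("t4", "[]", "b", "t3"),
    ("t5", "*", "t2", "t4"),
    ("t6", "+", "prod", "t5"),
    ("prod", "copy", "t6", None),
    ("t7", "+", "i", "1"),
    ("i", "copy", "t7", None),
]
nodes, current = build_dag(loop_body)
```

In the result, `t1` and `t3` map to the same node, which is how the common sub-expression 4 * i becomes visible. The leaves are the values the block uses on entry, and `current` for the variables that are live on exit tells a code generator which nodes must actually be stored.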

CH10.46 CSE 4100 Peephole Optimization  Simple Idea  Slide a window over the code  Optimize code in the window only.  Optimizations are  Local [still no big picture]  Semantics preserving  Cheap to implement  Usually  One can repeat the peephole several times!  Each pass can create new opportunities for more

CH10.47 CSE 4100 Peephole Optimizer
    block_3:
        mov [esp-4],ebp
        mov ebp,esp
        mov [ebp-8],esp
        sub esp,28
        mov eax,[ebp+8]
        cmp eax,0
        mov eax,0
        sete ah
        cmp eax,0
        jz block_5
    block_4:
        mov eax,1
        jmp block_6
    block_5:
        mov eax,[ebp+8]
        sub eax,1
        push eax
        mov eax,[ebp+4]
        push eax
        mov eax,[eax]
        call eax
        add esp,8
        mov ebx,[ebp+8]
        imul ebx,eax
        mov eax,ebx
    block_6:
        mov esp,[ebp-8]
        mov ebp,[ebp-4]
        ret

CH10.48 CSE 4100 Peephole Optimizations  A Few Simple Techniques [in a nutshell]  Load/Store elimination  Get rid of redundant operations  Unreachable code  Get rid of code guaranteed to never execute  Flow of Control Optimization  Simplify jump sequences.  Algebraic simplification  Use rules of algebra to rewrite some basic operation  Strength Reduction  Replace expensive instructions by equivalent ones (yet cheaper)  Machine Idioms  Replace expensive instructions by equivalent ones (for a given machine)

CH10.49 CSE 4100 Load / Store Sequences  Imagine the following sequence  “a” is a label for a memory location  e.g. a variable in memory or on the stack  If “a” is on the stack, it would look like ebp(k) [k == constant]
    mov a,eax
    mov eax,a
 What is guaranteed to be true after the first instruction?  Corollary....
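The corollary is that the second instruction can be dropped, because after the store eax and a already hold the same value. A two-instruction window that applies this could look like the sketch below. The instruction encoding `(label, op, dst, src)` and the name `drop_redundant_loads` are assumptions for illustration, and the sketch further assumes that `a` is ordinary memory (not volatile or memory-mapped).

```python
def drop_redundant_loads(code):
    """Remove 'mov X,Y' when it directly follows 'mov Y,X'.

    code: list of (label, op, dst, src). An instruction that carries a
    label is kept, since control may reach it from somewhere else.
    """
    out = []
    for ins in code:
        label, op, dst, src = ins
        if out and label is None and op == "mov":
            _, prev_op, prev_dst, prev_src = out[-1]
            if prev_op == "mov" and prev_dst == src and prev_src == dst:
                continue             # dst already holds the value of src
        out.append(ins)
    return out

sequence = [
    (None, "mov", "a", "eax"),
    (None, "mov", "eax", "a"),
]
trimmed = drop_redundant_loads(sequence)
```

The label check matters: if the second `mov` were a jump target, some other path could reach it with a different value in eax, and removing it would change the program.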

CH10.50 CSE 4100 Unreachable Code  What is it?  A situation that arises because...  Conditional compilation  Previous optimizations “created/exposed” dead code  Example
    #define debug 0
    if (debug) {
        printf("This is a trace message\n");
    }
    ....

CH10.51 CSE 4100 Example  The Generated code looks like....
    if (debug == 0) goto L2
    printf("This is a trace message\n");
    L2: ....
 If we know that...  debug == 0  Then....
    if (0 == 0) goto L2
    printf("This is a trace message\n");
    L2: ....
 The Condition 0 == 0 is Always True (1)

CH10.52 CSE 4100 Example  Final transformation  Given this code
    goto L2
    printf("This is a trace message\n");
    L2: ....
 There is no way to branch “into” the blue block (the printf call, highlighted in blue on the slide)  The last instruction (goto L2) jumps over the blue block  The blue block is never used. Get rid of it!

CH10.53 CSE 4100 Unreachable Code Example  Bottom Line  Now L2 is the instruction after goto...  So get rid of goto altogether!
Before:
    goto L2
    L2: ....
After:
    L2: ....
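The last two steps (drop code that follows an unconditional goto, then drop a goto to the next instruction) can be sketched as one small pass. The encoding `(label, text)` and the name `remove_unreachable` are assumptions; the sketch is deliberately conservative and treats every label as a possible jump target.

```python
def remove_unreachable(code):
    """Drop unlabeled code after an unconditional goto, then drop a goto
    whose target is the label of the very next instruction.

    code: list of (label, text); unconditional jumps are 'goto <label>'.
    """
    reachable = []
    skipping = False
    for label, text in code:
        if label is not None:
            skipping = False         # a label may be a jump target
        if skipping:
            continue                 # nothing can branch into this instruction
        reachable.append((label, text))
        if text.startswith("goto "):
            skipping = True
    cleaned = []
    for pos, (label, text) in enumerate(reachable):
        next_label = reachable[pos + 1][0] if pos + 1 < len(reachable) else None
        if label is None and text.startswith("goto ") and text[5:] == next_label:
            continue                 # jumping to the next instruction is a no-op
        cleaned.append((label, text))
    return cleaned

trace_code = [
    (None, "goto L2"),
    (None, 'printf("This is a trace message\\n");'),
    ("L2", "...."),
]
result = remove_unreachable(trace_code)
```

For the slide's example, only the instruction at L2 survives. A labeled goto is kept even when it targets the next instruction, since removing it would also remove a label that other code may jump to.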

CH10.54 CSE 4100 Flow of Control Optimization  Situation  We can have chains of jumps  Direct to conditional or vice-versa  Objective  Avoid extra jumps.  Why? [a.k.a. motivation....]  Example
    if (x relop y) goto L2
    ....
    L2: goto L4
    L3: ....
    L4: L4_BLOCK

CH10.55 CSE 4100 Flow of Control  What can be done  Collapse the chain
    if (x relop y) goto L4
    ....
    L2: goto L4
    L3: ....
    L4: L4_BLOCK
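Collapsing a chain amounts to following each jump target until it no longer lands on another unconditional jump. The sketch below assumes instructions are `(label, text)` pairs in which any jump ends in `goto <label>`; `collapse_jump_chains` is a hypothetical name.

```python
def collapse_jump_chains(code):
    """Retarget every jump that lands on a labeled 'goto' to the final target.

    code: list of (label, text); a jump's text ends in 'goto <label>'.
    """
    # label -> target, for every labeled instruction that is just a goto
    hops = {label: text[5:] for label, text in code
            if label is not None and text.startswith("goto ")}

    def final_target(target):
        seen = set()
        while target in hops and target not in seen:   # 'seen' guards against cycles
            seen.add(target)
            target = hops[target]
        return target

    out = []
    for label, text in code:
        head, sep, target = text.rpartition("goto ")
        if sep:
            text = head + sep + final_target(target)
        out.append((label, text))
    return out

chain = [
    (None, "if (x relop y) goto L2"),
    ("L2", "goto L4"),
    ("L3", "...."),
    ("L4", "L4_BLOCK"),
]
collapsed = collapse_jump_chains(chain)
```

The conditional branch now goes straight to L4. The instruction at L2 is left in place; whether it can be deleted depends on whether anything else still jumps or falls through to it, which is a job for the unreachable-code pass.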

CH10.56 CSE 4100 Algebraic Simplification  Simple Idea  Use algebraic rules to rewrite some code  Examples
    x := y + 0   becomes   x := y
    x := y * 1   becomes   x := y
    x := y * 0   becomes   x := 0
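These three rules translate directly into a pattern match on a single statement. The sketch below assumes a statement is `(dest, left, op, right)` with integer constants written as Python ints; `simplify` is a hypothetical helper. Note that the `y * 0` rule is valid for integers but not for IEEE floating point, where `y` could be NaN or infinity.

```python
def simplify(dest, left, op, right):
    """Apply x+0, x*1, and x*0 identities; return (dest, value) on a match,
    otherwise return the statement unchanged."""
    if op == "+" and right == 0:
        return (dest, left)
    if op == "+" and left == 0:
        return (dest, right)
    if op == "*" and right == 1:
        return (dest, left)
    if op == "*" and left == 1:
        return (dest, right)
    if op == "*" and (left == 0 or right == 0):
        return (dest, 0)             # integer arithmetic assumed
    return (dest, left, op, right)

plus_zero = simplify("x", "y", "+", 0)
times_one = simplify("x", "y", "*", 1)
times_zero = simplify("x", "y", "*", 0)
```

Each of the three calls reproduces one row of the slide's table; a statement that matches no rule comes back untouched.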

CH10.57 CSE 4100 Strength Reduction  Idea  Replace expensive operation  By semantically equivalent cheaper ones.  Examples  Multiplication by 2 is equivalent to a left shift by 1  Left shift is typically cheaper than a multiply
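The multiply-to-shift rewrite generalizes to any power of two. The sketch below assumes integer operands and the same hypothetical `(dest, left, op, right)` statement encoding; `reduce_multiply` is a made-up name.

```python
def reduce_multiply(dest, left, op, right):
    """Rewrite 'left * 2**k' as 'left << k'; leave other statements alone."""
    is_power_of_two = (
        isinstance(right, int) and right > 0 and (right & (right - 1)) == 0
    )
    if op == "*" and is_power_of_two:
        return (dest, left, "<<", right.bit_length() - 1)
    return (dest, left, op, right)

by_two = reduce_multiply("x", "y", "*", 2)
by_eight = reduce_multiply("x", "y", "*", 8)
by_three = reduce_multiply("x", "y", "*", 3)
```

Multiplication by 2 becomes a shift by 1 and multiplication by 8 a shift by 3, while multiplication by 3 is not a single shift and is left unchanged.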

CH10.58 CSE 4100 Hardware Idiom  Idea  Replace expensive instructions by...  Equivalent instructions that are optimized for the platform  Example
    add eax,1   becomes   inc eax

CH10.59 CSE 4100 Concluding Remarks/Looking Ahead  Optimization Techniques/Concepts are Not Only Relevant to Programming Languages  Database Systems do Optimization to Reduce Access to Secondary Storage  Concern when Asking for too Much Data  Joining Three or More Tables at Once  Doing a Cartesian Product Instead of a Join  Doing Selections before Joins  Termed Query Optimization  Looking Ahead  Review Machine Code Generation (if time)  Final Exam Review