Reduction in Strength CS 480

Our sample calculation

    for i := 1 to n
        for j := 1 to m
            c[i, j] := 0
            for k := 1 to p
                c[i, j] := c[i, j] + a[i, k] * b[k, j]

Graph rep of our program

    Node 1
        t1 <- n
        i <- 1
        (implicit goto node 2)
    Node 2
        if i > t1 goto node 10
    Node 3
        t2 <- m
        j <- 1
    Node 4
        if j > t2 goto node 9
    Node 5
        (c + ((i*4-4) * m + (j*4-4))) <- 0
        t3 <- p
        k <- 1
    Node 6
        if k > t3 goto node 8
    Node 7
        t4 <- i*4-4
        t5 <- j*4-4
        t7 <- t4 * m + t5
        t8 <- k*4-4
        (c + t7) <- (c + t7) + (a + (t4 * p + t8)) * (b + (t8 * m + t5))
        k <- k + 1
        goto node 6
    Node 8
        j <- j + 1
        goto node 4
    Node 9
        i <- i + 1
        goto node 2
    Node 10
        (procedure exit)
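The temporaries in Node 7 can be read as byte offsets into row-major arrays. The sketch below assumes 1-based subscripts and 4-byte elements (both read off the `*4-4` terms), and assumes `t4 * p + t8` and `t8 * m + t5` are the offsets of a[i, k] and b[k, j]. The names `offset` and `node7_offsets` are hypothetical helpers, not part of the slides.

```python
ELEM_SIZE = 4  # assumed element size in bytes, read off the "*4-4" terms


def offset(row, col, ncols):
    """Byte offset of element [row, col] (1-based) in a row-major array
    that has ncols columns."""
    return (row * ELEM_SIZE - ELEM_SIZE) * ncols + (col * ELEM_SIZE - ELEM_SIZE)


def node7_offsets(i, j, k, m, p):
    """Recompute the Node 7 temporaries for given loop indices."""
    t4 = i * 4 - 4
    t5 = j * 4 - 4
    t7 = t4 * m + t5        # offset of c[i, j]  (c has m columns)
    t8 = k * 4 - 4
    a_off = t4 * p + t8     # offset of a[i, k]  (a has p columns)
    b_off = t8 * m + t5     # offset of b[k, j]  (b has m columns)
    return t7, a_off, b_off
```

Under these assumptions, every temporary depends only on i, j, k and the array widths, which is what lets the later slides move or simplify them.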

Innermost loop: 7 *, 12 +

    Node 5
        (c + ((i*4-4) * m + (j*4-4))) <- 0
        t3 <- p
        k <- 1
    Node 6
        if k > t3 goto node 8
    Node 7
        t4 <- i*4-4
        t5 <- j*4-4
        t7 <- t4 * m + t5
        t8 <- k*4-4
        (c + t7) <- (c + t7) + (a + (t4 * p + t8)) * (b + (t8 * m + t5))
        k <- k + 1
        goto node 6

Loop Invariant Code Removal
Last time we looked for expressions that did not change within a loop, and moved them out.
This is called loop invariant code removal.
We were able to reduce the operations to 8 additions and 3 multiplications.

After loop invariant code removal

    Node 5
        t4 <- i * 4 - 4
        t5 <- j * 4 - 4
        t7 <- t4 * m + t5
        t6 <- t4 * p
        (c + t7) <- 0
        t3 <- p
        k <- 1
    Node 6
        if k > t3 goto node 8
    Node 7
        t8 <- k*4-4
        (c + t7) <- (c + t7) + (a + (t6 + t8)) * (b + (t8 * m + t5))
        k <- k + 1
        goto node 6

So, are we done?
Originally had +/-: 12 and */%: 7 in the innermost loop.
Now have +/-: 8 and */%: 3.
Are we done? Can we do better? Sure we can.
Next optimization:

Reduction in Strength
Reduce a costly (“strong”) operation to a less costly (“weak”) operation.
Typically replace multiplications by additions, which can be executed much faster.

Reduction in Strength algorithm
Step 1. Within a loop, look for variables that are changing by the addition or subtraction of a constant. These are called induction variables.
Step 2. Look for expressions of the form (iv + c) * c + c (these come up a lot in subscript calculations).
Step 3. Replace these with a temporary variable.

Justification
Think about the variable k in the innermost loop.
How is it changing? 1, 2, 3, ...
Then how does the expression k * 4 - 4 change?
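To make the answer concrete: as k steps through 1, 2, 3, ..., the expression k * 4 - 4 steps through 0, 4, 8, ..., so it can be kept up to date with one addition per iteration instead of being recomputed. A minimal Python sketch (the function names are illustrative only):

```python
def by_multiplication(p):
    """Recompute k * 4 - 4 from scratch for k = 1..p."""
    return [k * 4 - 4 for k in range(1, p + 1)]


def by_addition(p):
    """Produce the same values with a running temporary and one add per step."""
    values = []
    t = 0                  # value of k * 4 - 4 when k = 1
    for _ in range(p):
        values.append(t)
        t += 4             # k grows by 1, so k * 4 - 4 grows by 4
    return values
```

Both functions return the same list for any p; only the second avoids the multiplication inside the loop.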

Finding induction variables
Same type of analysis as before, only now we need to know not only whether a variable is being changed, but HOW it is being changed.
Look for iv <- iv + c.
Most commonly these come from for statements, but they can actually come from anyplace (or even from multiple assignments).
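One way to sketch this search, under simplifying assumptions: the loop body is straight-line code given as (dest, op, left, right) tuples, every statement runs exactly once per iteration, and constants are Python integers. The tuple format and the name `find_induction_variables` are assumptions made for illustration, not the course's data structures.

```python
def find_induction_variables(body):
    """Return {variable: net change per iteration} for variables whose only
    assignments in the loop body have the form v <- v + c or v <- v - c."""
    steps = {}
    disqualified = set()
    for dest, op, left, right in body:
        if op in ("+", "-") and left == dest and isinstance(right, int):
            # v <- v + c  or  v <- v - c
            steps[dest] = steps.get(dest, 0) + (right if op == "+" else -right)
        elif op == "+" and right == dest and isinstance(left, int):
            # c + v, the commuted form of v + c
            steps[dest] = steps.get(dest, 0) + left
        else:
            # any other kind of assignment rules the variable out
            disqualified.add(dest)
    return {v: d for v, d in steps.items() if v not in disqualified}
```

For example, a body containing `("k", "+", "k", 1)` and `("t8", "*", "k", 4)` reports only k, with a step of 1. Several constant updates to the same variable are summed, which covers the multiple-assignment case.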

Finding expressions
Look for expressions of the form (iv + c) * c + c (again, these are common in subscripts).
Create a temporary that tracks these expressions.
Every time our induction variable changes, change the temporary (by adding a constant).

See what the constant is
Suppose we have iv <- iv + c1.
Suppose we have the expression (iv + c2) * c3 + c4.
New expression is (iv + c1 + c2) * c3 + c4,
which is (iv + c2) * c3 + c4 + c1 * c3,
which is old expression + c1 * c3.
So the new expression is just a constant change from the old expression.
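Expanding the product gives (iv + c1 + c2) * c3 + c4 - ((iv + c2) * c3 + c4) = c1 * c3, so the per-step change depends only on the induction step c1 and the multiplier c3. A small numeric check in Python (function names are illustrative):

```python
def expr(iv, c2, c3, c4):
    """The subscript-style expression (iv + c2) * c3 + c4."""
    return (iv + c2) * c3 + c4


def step_delta(iv, c1, c2, c3, c4):
    """Change in the expression when iv is advanced by c1."""
    return expr(iv + c1, c2, c3, c4) - expr(iv, c2, c3, c4)
```

For the running example, k * 4 - 4 corresponds to c2 = 0, c3 = 4, c4 = -4 with c1 = 1, so the temporary is bumped by 1 * 4 = 4 on each iteration.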

Let's try it

    Node 5
        t4 <- i * 4 - 4
        t5 <- j * 4 - 4
        t7 <- t4 * m + t5
        t6 <- t4 * p
        (c + t7) <- 0
        t3 <- p
        k <- 1
        t8 <- 0                (this is k * 4 - 4 at this point)
    Node 6
        if k > t3 goto node 8
    Node 7
        t8 <- k*4-4            (NO LONGER NEEDED)
        (c + t7) <- (c + t7) + (a + (t6 + t8)) * (b + (t8 * m + t5))
        k <- k + 1
        t8 <- t8 + 4
        goto node 6

Are we done yet?
Nope. As always, after we have done one optimization, we have a different program.
We need to look once more, since there might now be further optimizations that were not possible previously.

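New induction variable
t8 is now an induction variable: it changes by a constant amount (4) each time through the loop.
So the expression t8 * m changes by a loop-invariant amount (namely, 4 * m) each time through the loop.
Can replace t8 * m with another temporary.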

Let's try it

    Node 5
        t4 <- i * 4 - 4
        t5 <- j * 4 - 4
        t7 <- t4 * m + t5
        t6 <- t4 * p
        (c + t7) <- 0
        t3 <- p
        k <- 1
        t8 <- 0
        t11 <- 0               (this is t8 * m at this point)
    Node 6
        if k > t3 goto node 8
    Node 7
        (c + t7) <- (c + t7) + (a + (t6 + t8)) * (b + (t11 + t5))
        k <- k + 1
        t8 <- t8 + 4
        t11 <- t11 + 4 * m
        goto node 6
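A quick way to gain confidence in the transformation is to check that the inner loop still visits the same offsets. The Python sketch below compares the offsets computed with k * 4 - 4 and t8 * m against the ones maintained by the running temporaries t8 and t11. It assumes the two offset expressions are the ones used to address a and b, and it hoists 4 * m into a hypothetical variable `step` computed once outside the loop.

```python
def offsets_before(i, j, m, p):
    """Inner-loop offsets with k * 4 - 4 and t8 * m recomputed every iteration."""
    t4 = i * 4 - 4
    t5 = j * 4 - 4
    t6 = t4 * p
    out = []
    for k in range(1, p + 1):
        t8 = k * 4 - 4
        out.append((t6 + t8, t8 * m + t5))
    return out


def offsets_after(i, j, m, p):
    """Same offsets, maintained by additions only inside the loop."""
    t4 = i * 4 - 4
    t5 = j * 4 - 4
    t6 = t4 * p
    t8 = 0            # k * 4 - 4 when k = 1
    t11 = 0           # t8 * m when k = 1
    step = 4 * m      # loop-invariant increment for t11 (assumed hoisted)
    out = []
    k = 1
    while k <= p:
        out.append((t6 + t8, t11 + t5))
        k += 1
        t8 += 4
        t11 += step
    return out
```

The two functions return identical lists for any positive i, j, m, p, and the second performs no multiplication inside its loop.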

Operation counts

    Original                             7 *   12 +
    After loop invariant code removal    3 *    8 +
    After reduction in strength          1 *    9 +

There are still other types of optimizations that could be done, but this is pretty good.