SSA-Based Constant Propagation, SCP, SCCP, & the Issue of Combining Optimizations
COMP 512, Rice University. Copyright 2011, Keith D. Cooper & Linda Torczon.


Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved.

Comp 512, Spring 2011

Constant Propagation

Safety
  Proves that a name always has a known value
  Specializes code around that value
    • Moves some computations to compile time (code motion)
    • Exposes some unreachable blocks (dead code)

Opportunity
  Value ≠ ⊥ signifies an opportunity

Profitability
  Compile-time evaluation is cheaper than run-time evaluation
  Branch removal may lead to block coalescing (CLEAN)
    • If not, it still avoids the test & makes the branch predictable

Using SSA: Sparse Constant Propagation

  ∀ expression, e
      Value(e) ← TOP  if its value is unknown
                 c_i  if its value is known
                 BOT  if its value is known to vary
  WorkList ← Ø
  ∀ SSA edge s = <u,v>
      if Value(u) ≠ TOP then
          add s to WorkList

  while (WorkList ≠ Ø)
      remove s = <u,v> from WorkList
      let o be the operation that uses v
      if Value(o) ≠ BOT then
          t ← result of evaluating o
          if t ≠ Value(o) then
              Value(o) ← t
              ∀ SSA edge <o,x>
                  add <o,x> to WorkList

Here o is an operation that uses v, i.e., o is "a ← b op v" or "a ← v op b".

Evaluating a Ø-node:
  Ø(x_1, x_2, x_3, … x_n) is Value(x_1) ∧ Value(x_2) ∧ Value(x_3) ∧ ... ∧ Value(x_n)
  where
    TOP ∧ x = x          ∀ x
    c_i ∧ c_j = c_i      if c_i = c_j
    c_i ∧ c_j = BOT      if c_i ≠ c_j
    BOT ∧ x = BOT        ∀ x

Same result, fewer ∧ operations
  Performs ∧ only at Ø-nodes
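The meet rules above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's code: the string sentinels `TOP` and `BOT` and the helper names `meet` and `eval_phi` are assumptions made for the example, and constants are represented as plain Python integers.

```python
from functools import reduce

# Hypothetical lattice encoding: two sentinels plus ordinary ints as constants.
TOP = "TOP"  # value not yet known (optimistic)
BOT = "BOT"  # value known to vary

def meet(a, b):
    """Meet of two lattice values, following the four rules on the slide."""
    if a == TOP:
        return b          # TOP ^ x = x
    if b == TOP:
        return a
    if a == BOT or b == BOT:
        return BOT        # BOT ^ x = BOT
    return a if a == b else BOT  # equal constants survive, unequal ones vary

def eval_phi(arg_values):
    """Value of a phi-function: the meet of the values of its arguments."""
    return reduce(meet, arg_values, TOP)

print(eval_phi([12, TOP]))      # 12
print(eval_phi([12, 12, 13]))   # BOT
```

Starting the reduction from `TOP` works because `TOP` is the identity element of the meet.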

Using SSA: Sparse Constant Propagation

How long does this algorithm take to halt?

Initialization is two passes
  • |ops| + 2 × |ops| edges

Value(x) can take on 3 values
  • TOP, c_i, BOT
  • Each use can be on WorkList twice
  • 2 × |args| = 4 × |ops| evaluations, WorkList pushes & pops

As discussed in Lecture 7, this is an optimistic algorithm
  Initialize all values to TOP, unless they are known constants
  Every value becomes BOT or c_i, unless its use is uninitialized

The lattice:

                    TOP
    ...  c_i  c_j  c_k  c_l  c_m  c_n  ...
                    BOT

Sparse Constant Propagation

Optimism
  This version of the algorithm is an optimistic formulation
    • Initializes values to TOP
    • Prior version used ⊥ (implicit)

Example:

  i_0 ← 12
  while ( … )
      i_1 ← Ø(i_0, i_3)
      x ← i_1 * 17
      j ← i_1
      i_2 ← …
      …
      i_3 ← j

Clear that i is always 12 at def of x

Pessimistic initializations (i_0 = 12, everything else ⊥) lead to:
  i_1 ← 12 ∧ ⊥ = ⊥
  x ← ⊥ * 17 = ⊥
  j ← ⊥
  i_3 ← ⊥

Optimistic initializations (i_0 = 12, everything else TOP) lead to:
  i_1 ← 12 ∧ TOP = 12
  x ← 12 * 17 = 204
  j ← 12
  i_3 ← 12
  i_1 ← 12 ∧ 12 = 12

In general
  Optimism helps inside loops
  Largely a matter of initial value

M.N. Wegman & F.K. Zadeck, Constant propagation with conditional branches, ACM TOPLAS, 13(2), April 1991, pages 181–210.
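The difference between the two initializations can be reproduced with a small Python sketch of the loop above. It is only an illustration under stated assumptions: the dictionary-based representation, the `propagate` and `times17` helpers, and the simple round-robin iteration (instead of a worklist) are all choices made for this example, and the definition of i_2 is left out because its right-hand side is elided in the slide and nothing in the example uses it.

```python
TOP, BOT = "TOP", "BOT"  # hypothetical sentinels for the lattice extremes

def meet(a, b):
    """Meet over the constant-propagation lattice."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    if a == BOT or b == BOT:
        return BOT
    return a if a == b else BOT

def times17(v):
    """Evaluate v * 17; TOP and BOT pass through unchanged."""
    return v if v in (TOP, BOT) else v * 17

def propagate(initial):
    """Re-evaluate the loop's SSA definitions until nothing changes."""
    val = {"i0": 12, "i1": initial, "x": initial, "j": initial, "i3": initial}
    changed = True
    while changed:
        changed = False
        new = {
            "i1": meet(val["i0"], val["i3"]),  # i1 <- phi(i0, i3)
            "x": times17(val["i1"]),           # x  <- i1 * 17
            "j": val["i1"],                    # j  <- i1
            "i3": val["j"],                    # i3 <- j
        }
        for name, v in new.items():
            if val[name] != v:
                val[name] = v
                changed = True
    return val

print(propagate(BOT))  # pessimistic start: i1, x, j, i3 all stay BOT
print(propagate(TOP))  # optimistic start: i1 = 12, x = 204, j = 12, i3 = 12
```

With the pessimistic start the cycle i_1 → j → i_3 → i_1 never leaves BOT, while the optimistic start lets the constant 12 flow around the loop and confirm itself at the Ø-function.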

Sparse Constant Propagation

What happens when it propagates a value into a branch?
  • TOP ⇒ we gain no knowledge
  • BOT ⇒ either path can execute
  • TRUE or FALSE ⇒ only one path can execute
But, the algorithm does not use this ...

Using this observation, we can work into the algorithm an element of refining feasible paths in the CFG that will take it beyond the standard limits of DFA
  → Until a block can execute, treat it as unreachable
  → Optimistic initializations allow analysis to proceed with unevaluated blocks

Result is an analysis that can use limited symbolic evaluation to combine constant propagation with unreachable code elimination

Sparse Conditional Constant Propagation

Can use constant-valued control predicates to refine the CFG

If compiler knows the value of x, it can eliminate either the then or the else case
  • (x > 0) ⇒ y is 17 in B_3
  • (x > 0) ⇒ B_2 is unreachable

[Figure: CFG with four blocks. B_0 ends with "if (x > 0)"; B_1 is the then case and contains "y ← 17"; B_2 is the else case; both flow into B_3, which contains "y ← y + z".]

Classic DFA assumes that all paths can be taken at runtime, including (B_0, B_2, B_3)

This approach combines constant propagation with CFG reachability analysis to produce better results in each

Example of Click's notion of "combining optimizations"
  • Predated & motivated Click

Sparse Constant Propagation

To work this idea into the algorithm, make several modifications

Use two worklists:
  • SSAWorkList
      Holds edges in the SSA graph
      SSA worklist propagates changing values
  • CFGWorkList
      Holds edges in the control-flow graph
      CFG worklist propagates information on reachability

Do not evaluate operations until block is reachable

When it marks a block as reachable, must evaluate all operations

Sparse Conditional Constant Propagation

Initialization Step

  SSAWorkList ← Ø
  CFGWorkList ← n_0
  ∀ block b
      clear b's mark
      ∀ expression e in b
          Value(e) ← TOP

Propagation Step

  while ((CFGWorkList ∪ SSAWorkList) ≠ Ø)
      while (CFGWorkList ≠ Ø)
          remove b from CFGWorkList
          mark b
          evaluate each Ø-function in b
          evaluate each op o in b, in order
              ∀ SSA edge <o,x>
                  if block(x) is marked
                      add <o,x> to SSAWorkList
      while (SSAWorkList ≠ Ø)
          remove s = <u,v> from SSAWorkList
          let o be the operation that contains v
          t ← result of evaluating o
          if t ≠ Value(o) then
              Value(o) ← t
              ∀ SSA edge <o,x>
                  if block(x) is marked, then
                      add <o,x> to SSAWorkList

To evaluate a branch
  if arg is BOT then
      put both targets on CFGWorkList
  else if arg is TRUE then
      put TRUE target on CFGWorkList
  else if arg is FALSE then
      put FALSE target on CFGWorkList

To evaluate a jump
  place its target on CFGWorkList

Note that the statement of this algorithm in EaC1e is badly mangled. Fixed in EaC2e.
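The branch rule is the piece that ties the two worklists together, so it may help to see it in isolation. The Python function below is a hedged sketch: the name `branch_targets`, the string sentinels, and the block labels are hypothetical, and returning no targets for a TOP condition is an assumption of this example (the slides only say that a branch condition should not be TOP when it is evaluated).

```python
TOP, BOT = "TOP", "BOT"  # hypothetical sentinels for the lattice extremes

def branch_targets(cond_value, true_target, false_target):
    """Return the targets a conditional branch adds to the CFG worklist."""
    if cond_value == BOT:
        return [true_target, false_target]  # either path can execute
    if cond_value is True:
        return [true_target]                # only the TRUE path can execute
    if cond_value is False:
        return [false_target]               # only the FALSE path can execute
    return []  # TOP: assumed here to make no target reachable yet

print(branch_targets(True, "B1", "B2"))  # ['B1']
print(branch_targets(BOT, "B1", "B2"))   # ['B1', 'B2']
```

A jump needs no such test; its single target always goes on the CFG worklist.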

Sparse Conditional Constant Propagation

There are some subtle points

Branch conditions should not be TOP when evaluated
  • Indicates an upwards-exposed use (no initial value)
  • Hard to envision compiler producing such code

Initialize all operations to TOP
  • Block processing will fill in the non-TOP initial values
  • Unreachable paths contribute TOP to Ø-functions

Code shows CFG edges first, then SSA edges
  • Can intermix them in arbitrary order (correctness)
  • Taking CFG edges first may help with speed (minor effect)

Sparse Conditional Constant Propagation

More subtle points

TOP * BOT → TOP
  • If TOP becomes 0, then 0 * BOT → 0
  • This prevents non-monotonic behavior for the result value
  • Uses of the result value might go irretrievably to BOT
  • Similar effects with any operation that has a "zero"

Some values reveal simplifications, rather than constants
  • BOT * c_i → BOT, but might turn into shifts & adds (c_i = 2, BOT ≥ 0)
  • Removes commutativity (reassociation)
  • BOT ** 2 → BOT * BOT (vs. series or call to library)

cbr TRUE → L_1, L_2 becomes br → L_1
  • Method discovers this; it must rewrite the code, too!
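The multiplication rule can be written down as a small transfer function. The sketch below assumes the same hypothetical string sentinels as before and a made-up helper name, `eval_mult`; it shows one way to order the checks so that the result only ever moves down the lattice.

```python
TOP, BOT = "TOP", "BOT"  # hypothetical sentinels for the lattice extremes

def eval_mult(a, b):
    """Evaluate a * b over the lattice, treating 0 as an absorbing constant."""
    if a == 0 or b == 0:
        return 0      # 0 * anything = 0, even 0 * BOT
    if a == TOP or b == TOP:
        return TOP    # TOP * BOT stays TOP: the TOP operand might become 0
    if a == BOT or b == BOT:
        return BOT    # a nonzero constant times BOT varies
    return a * b      # both operands are known constants

print(eval_mult(TOP, BOT))  # TOP
print(eval_mult(0, BOT))    # 0
print(eval_mult(3, BOT))    # BOT
print(eval_mult(3, 4))      # 12
```

If `eval_mult(TOP, BOT)` returned BOT instead, a later refinement of the TOP operand to 0 would require the result to rise from BOT to 0, which is the non-monotonic behavior the slide warns about.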

Sparse Conditional Constant Propagation

Unreachable Code

  i ← 17
  if (i > 0) then
      j_1 ← 10
  else
      j_2 ← 20
  j_3 ← Ø(j_1, j_2)
  k ← j_3 * 17

[Figure: lattice values annotated on the example in two columns, "All paths execute" and "With SCC marking blocks".]

Optimism
  • Initialization to TOP is still important
  • Unreachable code keeps TOP
  • ∧ with TOP has desired result

Cannot get this any other way
  • DEAD code cannot test (i > 0)
  • DEAD marks j_2 as useful

In general, combining two optimizations can lead to answers that cannot be produced by any combination of running them separately.

This algorithm is one example of that general principle.

Combining allocation & scheduling is another ...
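The effect of block marking on this example can be checked with a short Python sketch. It is deliberately simplified and should not be read as the two-worklist algorithm from the slides: it re-evaluates every reachable block in round-robin order until nothing changes. The block names, the tuple encoding of operations, the temporary `c` for the branch condition, and the helpers `lift`, `successors`, and `analyze` are all assumptions made for the illustration.

```python
TOP, BOT = "TOP", "BOT"  # hypothetical sentinels for the lattice extremes

def meet(a, b):
    """Meet over the constant-propagation lattice."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    if a == BOT or b == BOT:
        return BOT
    return a if a == b else BOT

def lift(fn, v):
    """Apply fn to a constant; TOP and BOT pass through unchanged."""
    return v if v in (TOP, BOT) else fn(v)

# Each block: (list of (dest, kind, operand), terminator or None).
BLOCKS = {
    "B0": ([("i", "const", 17), ("c", "gt0", "i")], ("cbr", "c", "B1", "B2")),
    "B1": ([("j1", "const", 10)], ("jump", "B3")),
    "B2": ([("j2", "const", 20)], ("jump", "B3")),
    "B3": ([("j3", "phi", ("j1", "j2")), ("k", "times17", "j3")], None),
}

def successors(term, value):
    """CFG targets made reachable by a block's terminator."""
    if term is None:
        return []
    if term[0] == "jump":
        return [term[1]]
    _, cond, true_target, false_target = term
    c = value[cond]
    if c == BOT:
        return [true_target, false_target]
    if c is True:
        return [true_target]
    if c is False:
        return [false_target]
    return []  # TOP: nothing known yet

def analyze(track_reachability):
    """Propagate constants; optionally evaluate only blocks marked reachable."""
    value = {dest: TOP for ops, _term in BLOCKS.values() for dest, _k, _a in ops}
    reachable = {"B0"} if track_reachability else set(BLOCKS)
    changed = True
    while changed:
        changed = False
        for name, (ops, term) in BLOCKS.items():
            if name not in reachable:
                continue  # unreachable blocks keep their TOP values
            for dest, kind, arg in ops:
                if kind == "const":
                    new = arg
                elif kind == "gt0":
                    new = lift(lambda v: v > 0, value[arg])
                elif kind == "times17":
                    new = lift(lambda v: v * 17, value[arg])
                else:  # phi
                    new = meet(value[arg[0]], value[arg[1]])
                if new != value[dest]:
                    value[dest] = new
                    changed = True
            for target in successors(term, value):
                if target not in reachable:
                    reachable.add(target)
                    changed = True
    return value, reachable

all_paths, _ = analyze(track_reachability=False)
marked, blocks = analyze(track_reachability=True)
print(all_paths["j3"], all_paths["k"])  # BOT BOT
print(marked["j2"], marked["j3"], marked["k"], sorted(blocks))
# TOP 10 170 ['B0', 'B1', 'B3']
```

When every block is treated as executable, j_3 is the meet of 10 and 20 and falls to BOT. With marking, B2 is never evaluated, j_2 keeps TOP, and the Ø-function yields 10, so k becomes the constant 170.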