Optimizations using SSA


Optimizations using SSA CS 671 March 25, 2008

Last Time – SSA Form
Generating SSA form:
  Inserting φ-functions using dominance frontiers
  Renaming variables
Before SSA:
  B1: if (…)
  B2: X ← 5
  B3: X ← 3
  B4: Y ← X
After SSA:
  B1: if (…)
  B2: X0 ← 5
  B3: X1 ← 3
  B4: X2 ← φ(X0, X1); Y0 ← X2

[Figure: dominance tree over blocks B0–B7, with the SSA-renamed code and the renaming counters and stacks for a, b, c, d, i]
  B1: a1 ← φ(a0,a4); b1 ← φ(b0,b4); c1 ← φ(c0,c6); d1 ← φ(d0,d6); i1 ← φ(i0,i2); a2 ← ...; c2 ← ...
  B2: b2 ← ...; c3 ← ...; d2 ← ...
  B3: a3 ← ...; d3 ← ...
  B4: d4 ← ...
  B5: c4 ← ...
  B6: d5 ← φ(d4,d3); c5 ← φ(c2,c4); b3 ← ...
  B7: a4 ← φ(a2,a3); b4 ← φ(b2,b3); c6 ← φ(c3,c5); d6 ← φ(d2,d5); y ← a4+b4; z ← c6+d6; i2 ← i1+1; i > 100
  Counters: a = 5, b = 5, c = 7, d = 7, i = 3
  Stacks: a: a0 a1 a2 a4; b: b0 b1 b4; c: c0 c1 c2 c6; d: d0 d1 d6; i: i0 i1 i2

Today – Using SSA in Optimizations
SSA simplifies many optimization algorithms
  Simplifies def-use chains
Examples: dead-code elimination, constant propagation

Dead-Code Elimination
Dead code is either:
  Unreachable code
  Assignments where the result is never used
Examples:
  “y in 1” is dead
  “x in 1” is partially dead: dead along path 1-2-4 but not along 1-3-4
  “z in 4, 5” is never used in relevant computations (here only x and y are relevant)
[Figure: CFG with nodes 1–6 containing the statements x=a+b, y=c+d, x=, z=z+1, y=, z=x+y, out(x,y)]

Dead-Code Elimination
SSA makes dead-code analysis particularly simple
Defn: A variable is live at its definition iff its list of uses is not empty
  There can be no other definition of the variable
  The definition of a variable dominates every use
  So there must be a path from definition to use
while (there is some variable v with no uses &&
       the statement that defines v has no side effects)
  delete the statement that defines v
When deleting v ← x ⊕ y or v ← φ(x, y), remove the deleted statement from the use lists of x and y

Dead-Code Elimination in SSA Form
W ← a list of all variables in SSA program
while W is not empty
  remove some variable v from W
  if v’s list of uses is empty
    let S be v’s statement of definition
    if S has no other side effects
      delete S from the program
      for each variable xi used by S
        delete S from the list of uses of xi
        W ← W ∪ {xi}
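The work-list algorithm above can be sketched in Python. This is a minimal sketch under stated assumptions, not the course's implementation: the program is modeled as a dictionary from each SSA variable to a pair (variables it reads, has side effects), and the names eliminate_dead_code and program are hypothetical.

```python
def eliminate_dead_code(stmts):
    """stmts maps each SSA variable to (operands, has_side_effects).

    Returns the set of variables whose defining statements survive.
    """
    # uses[x] is the set of variables whose defining statements read x.
    uses = {v: set() for v in stmts}
    for v, (operands, _) in stmts.items():
        for x in operands:
            uses[x].add(v)

    live = set(stmts)
    worklist = list(stmts)
    while worklist:
        v = worklist.pop()
        if v not in live or uses[v]:
            continue                  # already deleted, or still used
        operands, has_side_effects = stmts[v]
        if has_side_effects:
            continue                  # must keep S even though v is unused
        live.remove(v)                # delete S from the program
        for x in operands:
            uses[x].discard(v)        # delete S from the use list of x
            worklist.append(x)        # x may have become dead
    return live


# Hypothetical program: c1 is never used, which in turn makes b1 dead.
program = {
    "a1": ((), False),        # a1 <- 1
    "b1": (("a1",), False),   # b1 <- a1 + 2
    "c1": (("b1",), False),   # c1 <- b1 * 2   (no uses)
    "d1": (("a1",), True),    # d1 <- call f(a1), has a side effect
}
surviving = eliminate_dead_code(program)
```

Each statement is deleted at most once and each use-list entry is removed at most once, which is where the linear-time behavior comes from.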

Simple Constant Propagation
For any statement of the form v ← c, for some constant c:
  Any use of v can be replaced with a use of c
Any φ-function of the form v ← φ(c1, c2, …, cn) where all the ci are equal can be replaced by v ← c
Easy to detect using SSA
Easy to implement using a work-list algorithm

Simple Constant Prop in SSA Form
W ← a list of all statements in SSA program
while W is not empty
  remove some statement S from W
  if S is v ← φ(c, c, …, c) for some constant c
    replace S by v ← c
  if S is v ← c for some constant c
    delete S from the program
    for each statement T that uses v
      substitute c for v in T
      W ← W ∪ {T}
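A small Python sketch of the same work list follows. It assumes a hypothetical encoding in which each SSA variable maps to (kind, args), kind is "const", "phi", or an operator name, and args are variable names or integer constants. For brevity it scans all statements to find the uses of v; a real implementation would keep use lists.

```python
def propagate_constants(stmts):
    """Return (constants discovered, statements that remain)."""
    stmts = {v: (kind, list(args)) for v, (kind, args) in stmts.items()}
    constants = {}
    worklist = list(stmts)
    while worklist:
        v = worklist.pop()
        if v not in stmts:
            continue                      # already deleted
        kind, args = stmts[v]
        # v <- phi(c, c, ..., c) is replaced by v <- c
        if (kind == "phi" and all(isinstance(a, int) for a in args)
                and len(set(args)) == 1):
            kind, args = "const", [args[0]]
            stmts[v] = (kind, args)
        if kind == "const":
            c = args[0]
            constants[v] = c
            del stmts[v]                  # delete S from the program
            for t, (_, t_args) in stmts.items():
                if v in t_args:           # T uses v
                    t_args[:] = [c if a == v else a for a in t_args]
                    worklist.append(t)
    return constants, stmts


# Hypothetical program; z0 stands for a value defined elsewhere.
program = {
    "x1": ("const", [4]),
    "x2": ("const", [4]),
    "x3": ("phi", ["x1", "x2"]),
    "y1": ("+", ["x3", "z0"]),
}
constants, remaining = propagate_constants(program)
```

The sketch only performs the two rewrites named on the slide; it does not fold operators whose arguments have all become constants.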

Other Transformations …
Can be incorporated into the work-list algorithm
All can be done in linear time
Examples:
  Copy propagation
  Constant conditions
  Unreachable code

Copy Propagation
A single-argument φ-function x ← φ(y) or a copy assignment x ← y can be deleted, and y substituted for every use of x
Example program (SSA form):
  i1 ← 1; j1 ← 1; k1 ← 0
  j2 ← φ(j4, j1); k2 ← φ(k4, k1); if k2 < 100
  if j2 < 20
  return j2
  j3 ← i1; k3 ← k2 + 1
  j5 ← k2; k5 ← k2 + 2
  j4 ← φ(j3, j5); k4 ← φ(k3, k5)
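Copy propagation fits the same work-list shape. The sketch below assumes a hypothetical encoding in which each SSA variable maps to (kind, args), with "copy" marking a copy assignment; it scans all statements for uses rather than keeping use lists, and the fragment it runs on is taken from the example program above.

```python
def propagate_copies(stmts):
    """Delete x <- y and x <- phi(y), substituting y for every use of x."""
    stmts = {v: (kind, list(args)) for v, (kind, args) in stmts.items()}
    worklist = list(stmts)
    while worklist:
        x = worklist.pop()
        if x not in stmts:
            continue                      # already deleted
        kind, args = stmts[x]
        if kind in ("copy", "phi") and len(args) == 1 and args[0] != x:
            y = args[0]
            del stmts[x]                  # delete the copy
            for t, (_, t_args) in stmts.items():
                if x in t_args:           # T uses x
                    t_args[:] = [y if a == x else a for a in t_args]
                    worklist.append(t)    # T may now be a copy itself
    return stmts


# Fragment of the example: j3 and j5 are copies feeding the phi for j4.
fragment = {
    "j3": ("copy", ["i1"]),
    "j5": ("copy", ["k2"]),
    "j4": ("phi", ["j3", "j5"]),
    "k3": ("+", ["k2", 1]),
}
after_copies = propagate_copies(fragment)
```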

Constant Conditions
if (a < b) goto L1 else L2, where a and b are constant, becomes goto L1 (or goto L2)
  The extraneous control-flow edge must be deleted
  φ-functions in the block that lost the edge must be adjusted (it now has one fewer predecessor)
Example:
  j = 1
  if (j < 20) goto L1 else goto L2

Unreachable Code
Deleting a predecessor may cause L2 to become unreachable
  All statements in L2 can be deleted
  Use-lists of all variables used in L2 must be adjusted
  L2 can be deleted (and its successors updated)
Example (after the branch has been folded):
  j = 1
  goto L1
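The two steps, folding a constant condition and then deleting what became unreachable, can be sketched together. The CFG encoding here is an assumption made for illustration: each block has a "phis" map (destination to a map from predecessor name to argument) and a "term" tuple, where ("br", a, b, t, f) means "if a < b goto t else goto f". Use-list maintenance is left out.

```python
def successors(term):
    if term[0] == "jump":
        return [term[1]]
    if term[0] == "br":
        return list(term[3:])
    return []


def drop_phi_args(cfg, block, pred):
    """Adjust the phi-functions of block after losing predecessor pred."""
    for args in cfg[block]["phis"].values():
        args.pop(pred, None)


def fold_constant_branch(cfg, name):
    term = cfg[name]["term"]
    if term[0] != "br":
        return
    _, a, b, if_true, if_false = term
    if not (isinstance(a, int) and isinstance(b, int)):
        return                               # condition is not constant
    taken, dropped = (if_true, if_false) if a < b else (if_false, if_true)
    cfg[name]["term"] = ("jump", taken)
    if dropped != taken:
        drop_phi_args(cfg, dropped, name)    # delete the extraneous edge


def remove_unreachable(cfg, entry):
    reachable, stack = set(), [entry]
    while stack:
        b = stack.pop()
        if b not in reachable:
            reachable.add(b)
            stack.extend(successors(cfg[b]["term"]))
    for dead in set(cfg) - reachable:
        for s in successors(cfg[dead]["term"]):
            if s in cfg:
                drop_phi_args(cfg, s, dead)  # update the successors
        del cfg[dead]


# j = 1 has already been propagated into the condition j < 20.
cfg = {
    "entry": {"phis": {}, "term": ("br", 1, 20, "L1", "L2")},
    "L1": {"phis": {}, "term": ("jump", "L3")},
    "L2": {"phis": {}, "term": ("jump", "L3")},
    "L3": {"phis": {"x3": {"L1": "x1", "L2": "x2"}}, "term": ("ret",)},
}
fold_constant_branch(cfg, "entry")
remove_unreachable(cfg, "entry")
```

After both steps the φ-function for x3 has a single argument, so copy propagation can remove it.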

Conditional Constant Propagation
Is j always equal to 1?
Simple constant propagation missed this opportunity!
  Block 1: i ← 1; j ← 1; k ← 0
  Block 2: if k < 100
  Block 3: if j < 20
  Block 4: return j
  Block 5: j ← i; k ← k + 1
  Block 6: j ← k; k ← k + 2

SSA Conditional Constant Propagation
Keeps track of the result of conditional branches
  Only propagates definitions in parts of the flow graph that are marked executable
  When propagating constants, ignores edges into join nodes that are not executable
Does not assume that a variable is non-constant until there is evidence
Does not assume that a given block executes until there is evidence

SSA Conditional Constant Propagation
Uses a lattice that tracks the run-time value of each variable:
  [x] = ⊤  No evidence that any assignment to x is executed
  [x] = 4  Evidence of x ← 4 has been seen
  [x] = ⊥  Evidence that x may have two different values
New information can only move a variable down the lattice
Lattice, from top to bottom:
  ⊤ (never defined)
  … ci cj ck cl cm cn … (defined as constant c)
  ⊥ (overdefined)

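Constant Propagation (cont.)
Side effect of the meet operator ∧ (c1 and c2 are constants):
   ∧  | ⊤    c2                   ⊥
  ----+------------------------------
   ⊤  | ⊤    c2                   ⊥
   c1 | c1   (c1 = c2) ? c1 : ⊥   ⊥
   ⊥  | ⊥    ⊥                    ⊥
[Figure: z = f(x, y), with inputs x and y and result z]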

Executability
Also track the executability of each block:
  [B] = false  We have seen no evidence that block B can ever be executed
  [B] = true   We have seen evidence that block B can be executed
Start with all blocks: [B] = false
The start block B1 is executable: [B1] = true
For any executable block B with one successor C: [C] = true
For an executable branch if x<y goto L1 else L2:
  if [x] = ⊥ or [y] = ⊥, then [L1] = true and [L2] = true
  if [x] and [y] are both constants, only the target selected by the comparison becomes executable

An Example
Start with all variables: [x] = ⊤
Start with all blocks: [B] = false
Calculate [x] for each variable x and [B] for each block B
  Block 1: i1 ← 1; j1 ← 1; k1 ← 0
  Block 2: j2 ← φ(j4, j1); k2 ← φ(k4, k1); if k2 < 100
  Block 3: if j2 < 20
  Block 4: return j2
  Block 5: j3 ← i1; k3 ← k2 + 1
  Block 6: j5 ← k2; k5 ← k2 + 2
  Block 7: j4 ← φ(j3, j5); k4 ← φ(k3, k5)
Tables to fill in: [x] for x in i1, j1, j2, j3, j4, j5, k1, k2, k3, k4, k5; [B] for B in 1–7
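The example can be traced with a small Python sketch of conditional constant propagation. Several things here are assumptions made for illustration: the block names B1 to B7, the tuple encoding of statements and terminators, the strings used for ⊤ and ⊥, and the use of a simple iterate-until-stable loop in place of a work list. The sketch tracks executable blocks rather than individual edges, which is sound but can be slightly less precise.

```python
TOP, BOT = "T", "_|_"    # T: no evidence yet; _|_: overdefined


def meet(a, b):
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else BOT


def operand(val, x):
    return x if isinstance(x, int) else val[x]


def evaluate(val, op, a, b):
    a = operand(val, a)
    if op == "copy":
        return a
    b = operand(val, b)
    if BOT in (a, b):
        return BOT
    if TOP in (a, b):
        return TOP
    return a + b if op == "+" else a < b


def sccp(blocks, entry):
    val = {d: TOP for blk in blocks.values()
           for d in list(blk["phis"]) + [s[0] for s in blk["stmts"]]}
    executable, changed = {entry}, True

    def lower(dest, new):
        nonlocal changed
        if val[dest] != new:
            val[dest], changed = new, True

    def mark(block):
        nonlocal changed
        if block not in executable:
            executable.add(block)
            changed = True

    while changed:
        changed = False
        for name, blk in blocks.items():
            if name not in executable:
                continue
            for dest, args in blk["phis"].items():
                new = TOP
                for pred, arg in args.items():
                    if pred in executable:   # ignore non-executable preds
                        new = meet(new, operand(val, arg))
                lower(dest, new)
            for dest, op, a, b in blk["stmts"]:
                lower(dest, evaluate(val, op, a, b))
            term = blk["term"]
            if term[0] == "jump":
                mark(term[1])
            elif term[0] == "br":
                _, op, a, b, if_true, if_false = term
                cond = evaluate(val, op, a, b)
                if cond == BOT:
                    mark(if_true)
                    mark(if_false)
                elif cond != TOP:
                    mark(if_true if cond else if_false)
    return val, executable


def block(stmts=(), term=None, phis=None):
    return {"phis": phis or {}, "stmts": list(stmts), "term": term}


example = {
    "B1": block([("i1", "copy", 1, None), ("j1", "copy", 1, None),
                 ("k1", "copy", 0, None)], ("jump", "B2")),
    "B2": block(term=("br", "<", "k2", 100, "B3", "B4"),
                phis={"j2": {"B7": "j4", "B1": "j1"},
                      "k2": {"B7": "k4", "B1": "k1"}}),
    "B3": block(term=("br", "<", "j2", 20, "B5", "B6")),
    "B4": block(term=("ret", "j2")),
    "B5": block([("j3", "copy", "i1", None), ("k3", "+", "k2", 1)],
                ("jump", "B7")),
    "B6": block([("j5", "copy", "k2", None), ("k5", "+", "k2", 2)],
                ("jump", "B7")),
    "B7": block(term=("jump", "B2"),
                phis={"j4": {"B5": "j3", "B6": "j5"},
                      "k4": {"B5": "k3", "B6": "k5"}}),
}
values, executable_blocks = sccp(example, "B1")
```

On this input the sketch finds that j2 is the constant 1 and never marks B6 executable, while k2 becomes overdefined.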

Using SSA – Dead code elimination
Conceptually similar to mark-sweep garbage collection
  Mark useful operations
  Everything not marked is useless
Need an efficient way to find and to mark useful operations
  Start with critical operations
  Work back up SSA edges to find their antecedents
Define critical:
  I/O statements, linkage code (entry & exit blocks), return values, calls to other procedures
Algorithm will use post-dominators & reverse dominance frontiers

Using SSA – Dead code elimination
Mark
  for each op i
    clear i’s mark
    if i is critical then
      mark i
      add i to WorkList
  while (WorkList ≠ Ø)
    remove i from WorkList (i has form “x ← y op z”)
    if def(y) is not marked then
      mark def(y)
      add def(y) to WorkList
    if def(z) is not marked then
      mark def(z)
      add def(z) to WorkList
    for each b ∈ RDF(block(i))
      mark the block-ending branch in b
      add it to WorkList
Sweep
  for each op i
    if i is not marked then
      if i is a branch then
        rewrite with a jump to i’s nearest useful post-dominator
      if i is not a jump then
        delete i
Notes:
  Eliminates some branches
  Reconnects dead branches to the remaining live code
  Find useful post-dominator by walking post-dom tree
  Entry & exit nodes are useful
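A compact Python sketch of Mark and Sweep follows. The Op record, the function name dead, and the idea of passing in precomputed rdf and ipdom (immediate post-dominator) maps are assumptions made for illustration; a block counts as useful here if it contains a marked operation or is the exit block.

```python
from dataclasses import dataclass


@dataclass
class Op:
    text: str                # printable form of the operation
    block: str
    dest: str = None         # SSA name defined, if any
    uses: tuple = ()         # SSA names read
    kind: str = "op"         # "op", "branch", or "jump"
    critical: bool = False


def dead(ops, rdf, ipdom, exit_block):
    def_of = {op.dest: i for i, op in enumerate(ops) if op.dest}
    branch_of = {op.block: i for i, op in enumerate(ops)
                 if op.kind == "branch"}

    # Mark: start from critical ops, follow SSA edges and RDF.
    marked = {i for i, op in enumerate(ops) if op.critical}
    worklist = list(marked)
    while worklist:
        i = worklist.pop()
        needed = [def_of[u] for u in ops[i].uses if u in def_of]
        needed += [branch_of[b] for b in rdf.get(ops[i].block, ())
                   if b in branch_of]
        for j in needed:
            if j not in marked:
                marked.add(j)
                worklist.append(j)

    # Sweep: drop unmarked ops, turn unmarked branches into jumps.
    useful = {ops[i].block for i in marked} | {exit_block}
    kept = []
    for i, op in enumerate(ops):
        if i in marked or op.kind == "jump":
            kept.append(op)
        elif op.kind == "branch":
            target = ipdom[op.block]
            while target not in useful:      # walk the post-dom tree
                target = ipdom[target]
            kept.append(Op(f"jump {target}", op.block, kind="jump"))
    return kept


# An empty counting loop in B1; only the return in B2 is critical.
ops = [
    Op("x0 = 5", "B0", dest="x0"),
    Op("i0 = 0", "B0", dest="i0"),
    Op("jump B1", "B0", kind="jump"),
    Op("i1 = phi(i0, i2)", "B1", dest="i1", uses=("i0", "i2")),
    Op("i2 = i1 + 1", "B1", dest="i2", uses=("i1",)),
    Op("c0 = i2 < 100", "B1", dest="c0", uses=("i2",)),
    Op("branch c0 B1 B2", "B1", uses=("c0",), kind="branch"),
    Op("ret x0", "B2", uses=("x0",), critical=True),
]
rdf = {"B0": [], "B1": ["B1"], "B2": []}
ipdom = {"B0": "B1", "B1": "B2"}
result = [op.text for op in dead(ops, rdf, ipdom, "B2")]
```

The loop's branch is never marked, so it is rewritten as a jump to B2, which leaves an empty block behind for a later control-flow cleanup.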

Using SSA – Dead code elimination
When is a branch useful?
  When a useful operation depends on its existence
  j is control dependent on i ⇒ one path from i leads to j, one doesn’t
  The set of such i is the reverse dominance frontier of j (RDF(j))
In the CFG, j is control dependent on i if
  1. ∃ a non-null path p from i to j such that j post-dominates every node on p after i
  2. j does not strictly post-dominate i
Algorithm uses RDF(n) to mark branches as live
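The two conditions translate fairly directly into code. The sketch below is one straightforward way to compute RDF, assuming the CFG is given as a successor map with a single exit block that every other block can reach; it uses full post-dominator sets from a simple fixed-point iteration, not the faster tree-based methods. The function names are hypothetical.

```python
def post_dominators(succ, exit_block):
    """pdom[n] is the set of blocks that post-dominate n (including n)."""
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_block] = {exit_block}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_block}:
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n], changed = new, True
    return pdom


def reverse_dominance_frontier(succ, exit_block):
    pdom = post_dominators(succ, exit_block)
    rdf = {j: set() for j in succ}
    for i in succ:
        for s in succ[i]:
            for j in pdom[s]:            # j post-dominates a successor of i
                if j == i or j not in pdom[i]:
                    rdf[j].add(i)        # and does not strictly post-dominate i
    return rdf


# A loop whose body B1 branches back to itself or on to B2.
loop = {"B0": ["B1"], "B1": ["B1", "B2"], "B2": []}
loop_rdf = reverse_dominance_frontier(loop, "B2")
```

For this loop, B1 is control dependent only on its own branch and nothing is control dependent on B0.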

Using SSA – Dead Code Elimination
What’s left?
  Algorithm eliminates useless definitions & some useless branches
  Algorithm leaves behind empty blocks & extraneous control-flow
Two more issues:
  Simplifying control-flow
  Eliminating unreachable blocks
Both are CFG transformations (no need for SSA)

Eliminating Useless Control Flow
Transformations: eliminating redundant branches
  Both sides of the branch target Bi (a branch, not a jump)
  Neither block need be empty
  Replace it with a jump to Bi
  Simple rewrite of last op in B1
How does this happen?
  Rewriting other branches
How do we find it?
  Check each branch

Eliminating Useless Control Flow
Transformations: eliminating empty blocks
  Merging an empty block
  Empty B1 ends in a jump to B2
  Coalesce B1 with B2
  Move B1’s incoming edges
  Eliminates extraneous jump
  Faster, smaller code
How does this happen?
  Eliminate operations in B1
How do we find it?
  Test for empty block

Eliminating Useless Control Flow
Transformations: combining non-empty blocks
  Coalescing blocks
  Neither block need be empty
  B1 ends with a jump to B2
  B2 has 1 predecessor
  Combine the two blocks
  Eliminates a jump
How does this happen?
  Simplifying edges out of B1
How do we find it?
  Check |preds| of the jump’s target
B1 and B2 should be a single basic block: if one executes, both execute, in linear order.

Eliminating Useless Control Flow
Transformations: hoisting branches from empty blocks
  Jump to a branch
  B1 ends with a jump to B2, and B2 is empty
  Eliminates pointless jump
  Copy branch into end of B1
  Might make B2 unreachable
How does this happen?
  Eliminating operations in B2
How do we find this?
  Jump to empty block

Eliminating Useless Control Flow
The Algorithm
OnePass()
  for each block i, in postorder
    if i ends in a conditional branch then
      if both targets are identical then
        replace the branch with a jump
    if i ends in a jump to j then
      if i is empty then
        replace transfers to i with transfers to j
      if j has only one predecessor then
        coalesce i and j
      if j is empty & j ends in a conditional branch then
        rewrite i’s jump with j’s branch
Clean()
  until CFG stops changing
    compute postorder
    OnePass()
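The four transformations can be sketched over a small dictionary-based CFG. This is a simplified illustration, not the published algorithm verbatim: the block encoding (an "ops" list plus a "term" tuple) is hypothetical, at most one of the three jump cases is applied per visit, the entry block is never removed, and blocks unreachable from the entry are left untouched.

```python
def successors(term):
    if term[0] == "jump":
        return [term[1]]
    if term[0] == "br":                  # ("br", cond, if_true, if_false)
        return [term[2], term[3]]
    return []


def postorder(cfg, entry):
    seen, order = set(), []

    def visit(b):
        seen.add(b)
        for s in successors(cfg[b]["term"]):
            if s not in seen:
                visit(s)
        order.append(b)

    visit(entry)
    return order


def preds(cfg, b):
    return [p for p in cfg if b in successors(cfg[p]["term"])]


def retarget(term, old, new):
    if term[0] == "jump":
        return ("jump", new if term[1] == old else term[1])
    if term[0] == "br":
        return ("br", term[1], *(new if t == old else t for t in term[2:]))
    return term


def one_pass(cfg, entry):
    changed = False
    for i in postorder(cfg, entry):
        if i not in cfg:
            continue                     # removed earlier in this pass
        term = cfg[i]["term"]
        if term[0] == "br" and term[2] == term[3]:
            term = cfg[i]["term"] = ("jump", term[2])   # redundant branch
            changed = True
        if term[0] != "jump" or term[1] == i:
            continue
        j = term[1]
        if not cfg[i]["ops"] and i != entry:            # empty block
            for p in preds(cfg, i):
                cfg[p]["term"] = retarget(cfg[p]["term"], i, j)
            del cfg[i]
            changed = True
        elif preds(cfg, j) == [i] and j != entry:       # coalesce i and j
            cfg[i]["ops"] += cfg[j]["ops"]
            cfg[i]["term"] = cfg[j]["term"]
            del cfg[j]
            changed = True
        elif not cfg[j]["ops"] and cfg[j]["term"][0] == "br":
            cfg[i]["term"] = cfg[j]["term"]             # hoist j's branch
            changed = True
    return changed


def clean(cfg, entry):
    while one_pass(cfg, entry):          # postorder is recomputed each pass
        pass
    return cfg


# A diamond whose two arms are empty collapses into a single block.
diamond = {
    "B0": {"ops": ["a"], "term": ("br", "c", "B1", "B2")},
    "B1": {"ops": [], "term": ("jump", "B3")},
    "B2": {"ops": [], "term": ("jump", "B3")},
    "B3": {"ops": ["b"], "term": ("ret",)},
}
cleaned = clean(diamond, "B0")
```

Visiting blocks in postorder means successors are simplified before their predecessors, which is why the whole diamond collapses in one pass here.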

Eliminating Useless Control Flow
What about an empty loop?
By itself, CLEAN cannot eliminate the loop
  Loop body branches to itself
    Branch is not redundant (it targets two distinct blocks!)
    Doesn’t end with a jump
  Key is to eliminate the self-loop
    Add a new transformation?
  Then, B1 merges with B2
[Figure: B0 → B1, with B1 branching to itself and to B2, shown before and after the self-loop is removed]
A new transformation must recognize that B1 is empty. Presumably, it has code to test the exit condition and (probably) increment an induction variable. This requires looking at code inside B1 and doing some sophisticated pattern matching. This is awfully complicated.

Eliminating Useless Control Flow
What about an empty loop?
How to eliminate <B1,B1>?
  Pattern matching?
  Useless code elimination?
What does DEAD do to B1?
  Remember, it is empty
  So, B1 ∉ RDF(B2)
  B1’s branch is useless
  DEAD rewrites it as a jump
DEAD converts it to a form where CLEAN handles it
[Figure: the CFG B0, B1, B2 before and after DEAD]

Dead Code Elimination Summary
  Useless computations → DEAD (mark and sweep)
  Useless control flow → CLEAN
  Unreachable blocks → Execution counts
Other Techniques
  Constant propagation can eliminate branches
  Algebraic identities eliminate some operations
  Redundancy elimination
    Creates useless operations, or
    Eliminates them

Using SSA
In general, using SSA leads to
  Cleaner formulations
  Better results
  Faster algorithms
We’ve seen two SSA-based algorithms:
  Dead-code elimination
  Constant propagation
These optimizations leave behind other inefficiencies