Building SSA Form (A mildly abridged account)

Slides:



Advertisements
Similar presentations
SSA and CPS CS153: Compilers Greg Morrisett. Monadic Form vs CFGs Consider CFG available exp. analysis: statement gen's kill's x:=v 1 p v 2 x:=v 1 p v.
Advertisements

1 SSA review Each definition has a unique name Each use refers to a single definition The compiler inserts  -functions at points where different control.
8. Static Single Assignment Form Marcus Denker. © Marcus Denker SSA Roadmap  Static Single Assignment Form (SSA)  Converting to SSA Form  Examples.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
Operator Strength Reduction From Cooper, Simpson, & Vick, “Operator Strength Reduction”, ACM TOPLAS, 23(5), See also § of EaC2e. 1COMP 512,
Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
SSA.
CS412/413 Introduction to Compilers Radu Rugina Lecture 37: DU Chains and SSA Form 29 Apr 02.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Static Single Assignment John Cavazos.
SSA-Based Constant Propagation, SCP, SCCP, & the Issue of Combining Optimizations 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon,
Advanced Compiler Design – Assignment 1 SSA Construction & Destruction Michael Fäs (Original Slides by Luca Della Toffola)
Introduction to Code Optimization Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice.
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
CS745: SSA© Seth Copen Goldstein & Todd C. Mowry Static Single Assignment.
Lecture #17, June 5, 2007 Static Single Assignment phi nodes Dominators Dominance Frontiers Dominance Frontiers when inserting phi nodes.
Introduction to Optimization Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Instruction Scheduling II: Beyond Basic Blocks Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp.
Code Optimization, Part III Global Methods Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
Introduction to Optimization, II Value Numbering & Larger Scopes Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Advanced Compiler Techniques LIU Xianhua School of EECS, Peking University Static Single Assignment.
Using SSA Dead Code Elimination & Constant Propagation C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Building SSA Form, III 1COMP 512, Rice University This lecture presents the problems inherent in out- of-SSA translation and some ways to solve them. Copyright.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Static Single Assignment John Cavazos.
Global Redundancy Elimination: Computing Available Expressions Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.
Cleaning up the CFG Eliminating useless nodes & edges C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Generating SSA Form (mostly from Morgan). Why is SSA form useful? For many dataflow problems, SSA form enables sparse dataflow analysis that –yields the.
Building SSA Form (A mildly abridged account) For the full story, see the lecture notes for COMP 512 (lecture 8) and.
Dead Code Elimination This lecture presents the algorithm Dead from EaC2e, Chapter 10. That algorithm derives, in turn, from Rob Shillner’s unpublished.
Introduction to SSA Data-flow Analysis Revisited – Static Single Assignment (SSA) Form Liberally Borrowed from U. Delaware and Cooper and Torczon Text.
Cleaning up the CFG Eliminating useless nodes & edges This lecture describes the algorithm Clean, presented in Chapter 10 of EaC2e. The algorithm is due.
Computer Science 313 – Advanced Programming Topics.
1 CS 201 Compiler Construction Lecture 2 Control Flow Analysis.
Building SSA Form, I 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at.
Single Static Assignment Intermediate Representation (or SSA IR) Many examples and pictures taken from Wikipedia.
Optimizing The Optimizer: Improving Data Flow Analysis in Soot Michael Batchelder 4 / 6 / 2005 A COMP-621 Class Project.
Definition-Use Chains
Introduction to Optimization
Static Single Assignment
Efficiently Computing SSA
Finding Global Redundancies with Hopcroft’s DFA Minimization Algorithm
Topic 10: Dataflow Analysis
Introduction to Optimization
Factored Use-Def Chains and Static Single Assignment Forms
Lexical Analysis — Part II: Constructing a Scanner from Regular Expressions Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Wrapping Up Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit.
Instruction Scheduling: Beyond Basic Blocks
Building SSA Form COMP 512 Rice University Houston, Texas Fall 2003
The Procedure Abstraction Part I: Basics
Lexical Analysis — Part II: Constructing a Scanner from Regular Expressions Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
CSC D70: Compiler Optimization Static Single Assignment (SSA)
Code Shape III Booleans, Relationals, & Control flow
Optimization through Redundancy Elimination: Value Numbering at Different Scopes COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith.
Static Single Assignment Form (SSA)
Lexical Analysis — Part II: Constructing a Scanner from Regular Expressions Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Optimizations using SSA
Dominator Tree First BB is the root node, each node
Data Flow Analysis Compiler Design
EECS 583 – Class 7 Static Single Assignment Form
Introduction to Optimization
Instruction Scheduling: Beyond Basic Blocks
Static Single Assignment
Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II
Reference These slides, with minor modification and some deletion, come from U. of Delaware – and the web, of course. 4/4/2019 CPEG421-05S/Topic5.
Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved.
Algebraic Reassociation of Expressions COMP 512 Rice University Houston, Texas Fall 2003 P. Briggs & K.D. Cooper, “Effective Partial Redundancy Elimination,”
Reference These slides, with minor modification and some deletion, come from U. of Delaware – and the web, of course. 4/17/2019 CPEG421-05S/Topic5.
EECS 583 – Class 7 Static Single Assignment Form
The Partitioning Algorithm for Detecting Congruent Expressions COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper.
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Presentation transcript:

Building SSA Form (A mildly abridged account) For the full story, see the lecture notes for COMP 512 http://www.cs.rice.edu/~keith/512 (lecture 8) and read Chapter 9 of Engineering a Compiler. Building SSA Form (A mildly abridged account) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these materials for their personal use.

Review ? SSA-form Each name is defined exactly once Each use refers to exactly one name What’s hard Straight-line code is trivial Splits in the CFG are trivial Joins in the CFG are hard Building SSA Form Insert Ø-functions at birth points Rename all values for uniqueness x  17 - 4 x  a + b x  y - z x  13 z  x * q ? s  w - x *

Birth Points (another notion due to Tarjan) Consider the flow of values in this example x  17 - 4 The value x appears everywhere It takes on several values. Here, x can be 13, y-z, or 17-4 Here, it can also be a+b If each value has its own name … Need a way to merge these distinct values Values are “born” at merge points x  a + b x  y - z x  13 z  x * q s  w - x *

Birth Points (another notion due to Tarjan) Consider the flow of values in this example x  17 - 4 New value for x here 17 - 4 or y - z x  a + b x  y - z x  13 New value for x here 13 or (17 - 4 or y - z) z  x * q New value for x here a+b or ((13 or (17-4 or y-z)) s  w - x

Birth Points (another notion due to Tarjan) Consider the flow of values in this example x  17 - 4 x  a + b x  y - z These are all birth points for values All birth points are join points Not all join points are birth points Birth points are value-specific … x  13 z  x * q s  w - x

Review SSA-form Each name is defined exactly once Each use refers to exactly one name What’s hard Straight-line code is trivial Splits in the CFG are trivial Joins in the CFG are hard Building SSA Form Insert Ø-functions at birth points Rename all values for uniqueness A Ø-function is a special kind of copy that selects one of its parameters. The choice of parameter is governed by the CFG edge along which control reached the current block. Real machines do not implement a Ø-function directly in hardware. y1  ... y2  ... y3  Ø(y1,y2) *

SSA Construction Algorithm (High-level sketch) 1. Insert Ø-functions 2. Rename values … that’s all ... … of course, there is some bookkeeping to be done ... *

SSA Construction Algorithm (Lower-level sketch) 1. Insert Ø-functions at every join for every name 2. Solve reaching definitions 3. Rename each use to the def that reaches (will be unique)

Reaching Definitions The equations REACHES(n0 ) = Ø Domain is |DEFINITIONS|, same as number of operations Reaching Definitions The equations REACHES(n0 ) = Ø REACHES(n ) = ppreds(n) DEFOUT(p )  (REACHES(p )  SURVIVED(p)) REACHES(n ) is the set of definitions that reach block n DEFOUT(n) is the set of definitions in n that reach the end of n SURVIVED(n) is the set of defs not obscured by a new def in n Computing REACHES(n) Use any data-flow method (i.e., the iterative method) This particular problem has a very-fast solution (Zadeck) F.K. Zadeck, “Incremental data-flow analysis in a structured program editor,” Proceedings of the SIGPLAN 84 Conf. on Compiler Construction, June, 1984, pages 132-143.ge *

SSA Construction Algorithm (Lower-level sketch) 1. Insert Ø-functions at every join for every name 2. Solve reaching definitions 3. Rename each use to the def that reaches (will be unique) What’s wrong with this approach? Too many Ø-functions (precision) Too many Ø-functions (space) Too many Ø-functions (time) Need to relate edges to Ø-function parameters (bookkeeping) To do better, we need a more complex approach Builds “maximal” SSA

SSA Construction Algorithm (Lower-level sketch) 1. Insert Ø-functions at every join for every name 2. Solve reaching definitions 3. Rename each use to the def that reaches (will be unique) What’s wrong with this approach? A Ø-function may be useless Value is never used  Ø-function is extraneous A Ø-function may be pointless As in “ x17  Ø(x12, x12) ” Can we get there by eliminating these? Probably not Can have Ø ’s that depend on themselves … Builds “maximal” SSA Someone always suggests this.

SSA Construction Algorithm (Detailed sketch) 1. Insert Ø-functions a.) calculate dominance frontiers b.) find global names for each name, build a list of blocks that define it c.) insert Ø-functions  global name n  block b in which n is assigned  block d in b’s dominance frontier insert a Ø-function for n in d add d to n’s list of defining blocks Moderately complex Compute list of blocks where each name is assigned & use as a worklist { Creates the iterated dominance frontier This adds to the worklist ! Use a checklist to avoid putting blocks on the worklist twice; keep another checklist to avoid inserting the same Ø-function twice. *

SSA Construction Algorithm (Detailed sketch) 2. Rename variables in a pre-order walk over dominator tree (use an array of stacks, one stack per global name) Staring with the root block, b a.) generate unique names for each Ø-function and push them on the appropriate stacks b.) rewrite each operation in the block i. Rewrite uses of global names with the current version (from the stack) ii. Rewrite definition by inventing & pushing new name c.) fill in Ø-function parameters of successor blocks d.) recurse on b’s children in the dominator tree e.) <on exit from b > pop names generated in b from stacks 1 counter per name for subscripts Reset the state Need the end-of-block name for this path *

SSA Construction Algorithm (Details) Computing Dominance First step in Ø-function insertion computes dominance Recall that n dominates m iff n is on every path from n0 to m Every node dominates itself n’s immediate dominator is its closest dominator, IDOM(n)† DOM(n0 ) = { n0 } DOM(n) = { n }  (ppreds(n) DOM(p)) Computing DOM These equations form a rapid data-flow framework Iterative algorithm will solve them in d(G) + 3 passes Each pass does N unions & E intersections, E is worst case O(N 2),  O(N 2) work Initially, DOM(n) = N,  n≠n0 †IDOM(n ) ≠ n, unless n is n0, by convention.

Example Progress of iterative solution for DOM B1 B2 B3 B4 B5 B6 B7 B0 Results of iterative solution for DOM Flow Graph Figure 9.3, EaC *

Example Progress of iterative solution for DOM B1 B2 B3 B4 B5 B6 B7 B0 Results of iterative solution for DOM There are asymptotically faster algorithms. With the right data structures, the iterative algorithm can be made faster than them. Data structure in Chapter 9 Cooper, Harvey, Kennedy; to appear Dominance Tree

Example Dominance Frontiers & Ø-Function Insertion A definition at n forces a Ø-function at m iff n  DOM(m) but n  DOM(p) for some p  preds(m) DF(n ) is fringe just beyond region n dominates B1 B2 B3 B4 B5 B6 B7 B0 x Ø(...)  in 7 forces Ø-function in DF(7) = {1} x ... x Ø(...) DF(4) is {6}, so  in 4 forces Ø-function in 6 x Ø(...)  in 6 forces Ø-function in DF(6) = {7} Dominance Frontiers  in 1 forces Ø-function in DF(1) = Ø (halt ) For each assignment, we insert the Ø-functions *

Example Computing Dominance Frontiers Only join points are in DF(n) for some n Leads to a simple, intuitive algorithm for computing dominance frontiers For each join point x (i.e., |preds(x)| > 1) For each CFG predecessor of x Run up to IDOM(x ) in the dominator tree, adding x to DF(n) for each n between x and IDOM(x ) B1 B2 B3 B4 B5 B6 B7 B0 For some applications, we need post-dominance, the post-dominator tree, and reverse dominance frontiers, RDF(n ) Just dominance on the reverse CFG Reverse the edges & add unique exit node We will use these in dead code elimination Dominance Frontiers *

SSA Construction Algorithm (Reminder) 1. Insert Ø-functions at every join for every name a.) calculate dominance frontiers b.) find global names for each name, build a list of blocks that define it c.) insert Ø-functions  global name n  block b in which n is assigned  block d in b’s dominance frontier insert a Ø-function for n in d add d to n’s list of defining blocks Needs a little more detail *

SSA Construction Algorithm Finding global names Different between two forms of SSA Minimal uses all names Semi-pruned uses names that are live on entry to some block Shrinks name space & number of Ø-functions Pays for itself in compile-time speed For each “global name”, need a list of blocks where it is defined Drives Ø-function insertion b defines x implies a Ø-function for x in every c  DF(b) Pruned SSA adds a test to see if x is live at insertion point Otherwise, cannot need a Ø-function

Excluding local names avoids Ø’s for y & z a  Ø(a,a) b  Ø(b,b) c  Ø(c,c) d  Ø(d,d) y  a+b z  c+d i  i+1 B7 i > 100 i  ••• B0 b  ••• c  ••• d  ••• B2 i  Ø(i,i) a  ••• B1 B3 B4 B5 B6 Example Excluding local names avoids Ø’s for y & z With all the Ø-functions Lots of new ops Renaming is next Assume a, b, c, & d defined before B0

SSA Construction Algorithm (Details) 2. Rename variables in a pre-order walk over dominator tree (use an array of stacks, one stack per global name) Staring with the root block, b a.) generate unique names for each Ø-function and push them on the appropriate stacks b.) rewrite each operation in the block i. Rewrite uses of global names with the current version (from the stack) ii. Rewrite definition by inventing & pushing new name c.) fill in Ø-function parameters of successor blocks d.) recurse on b’s children in the dominator tree e.) <on exit from b > pop names generated in b from stacks 1 counter per name for subscripts Reset the state Need the end-of-block name for this path

SSA Construction Algorithm (Details) Adding all the details ... Rename(b) for each Ø-function in b, x  Ø(…) rename x as NewName(x) for each operation “x  y op z” in b rewrite y as top(stack[y]) rewrite z as top(stack[z]) rewrite x as NewName(x) for each successor of b in the CFG rewrite appropriate Ø parameters for each successor s of b in dom. tree Rename(s) pop(stack[x]) for each global name i counter[i]  0 stack[i]  Ø call Rename(n0) NewName(n) i  counter[n] counter[n]  counter[n] + 1 push ni onto stack[n] return ni

Example Before processing B0 Assume a, b, c, & d defined before B0 a  Ø(a,a) b  Ø(b,b) c  Ø(c,c) d  Ø(d,d) y  a+b z  c+d i  i+1 B7 i > 100 i  ••• B0 b  ••• c  ••• d  ••• B2 i  Ø(i,i) a  ••• B1 B3 B4 B5 B6 Example Before processing B0 Assume a, b, c, & d defined before B0 i has not been defined a b c d i Counters Stacks 1 1 1 1 a0 b0 c0 d0 *

Example End of B0 a b c d i Counters Stacks 1 1 1 1 1 * a0 b0 c0 d0 i0 a  Ø(a0,a) b  Ø(b0,b) c  Ø(c0,c) d  Ø(d0,d) i  Ø(i0,i) a  ••• c  ••• B1 End of B0 b  ••• c  ••• d  ••• B2 a  ••• d  ••• B3 d  ••• B4 c  ••• B5 d  Ø(d,d) c  Ø(c,c) b  ••• B6 a b c d i Counters Stacks 1 1 1 1 1 B7 a  Ø(a,a) b  Ø(b,b) c  Ø(c,c) d  Ø(d,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 i > 100 *

Example End of B1 a b c d i Counters Stacks 3 2 3 2 2 * a0 b0 c0 d0 i0 a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 End of B1 b  ••• c  ••• d  ••• B2 a  ••• d  ••• B3 d  ••• B4 c  ••• B5 d  Ø(d,d) c  Ø(c,c) b  ••• B6 a b c d i Counters Stacks 3 2 3 2 2 B7 a  Ø(a,a) b  Ø(b,b) c  Ø(c,c) d  Ø(d,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 i > 100 *

Example End of B2 a b c d i Counters Stacks 3 3 4 3 2 * a0 b0 c0 d0 i0 a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 End of B2 b2  ••• c3  ••• d2  ••• B2 a  ••• d  ••• B3 d  ••• B4 c  ••• B5 d  Ø(d,d) c  Ø(c,c) b  ••• B6 a b c d i Counters Stacks 3 3 4 3 2 B7 a  Ø(a2,a) b  Ø(b2,b) c  Ø(c3,c) d  Ø(d2,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 b2 c2 d2 c3 i > 100 *

Example Before starting B3 a b c d i Counters Stacks 3 3 4 3 2 * a0 b0 a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 Before starting B3 b2  ••• c3  ••• d2  ••• B2 a  ••• d  ••• B3 d  ••• B4 c  ••• B5 d  Ø(d,d) c  Ø(c,c) b  ••• B6 a b c d i Counters Stacks 3 3 4 3 2 B7 a  Ø(a2,a) b  Ø(b2,b) c  Ø(c3,c) d  Ø(d2,d) y  a+b z  c+d i  i+1 i ≤ 100 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 i > 100 *

Example End of B3 a b c d i Counters Stacks 4 3 4 4 2 * a0 b0 c0 d0 i0 a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 End of B3 b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d  ••• B4 c  ••• B5 d  Ø(d,d) c  Ø(c,c) b  ••• B6 a b c d i Counters Stacks 4 3 4 4 2 B7 a  Ø(a2,a) b  Ø(b2,b) c  Ø(c3,c) d  Ø(d2,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 d3 a3 i > 100 *

Example End of B4 a b c d i Counters Stacks 4 3 4 5 2 * a0 b0 c0 d0 i0 a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 End of B4 b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d4  ••• B4 c  ••• B5 d  Ø(d4,d) c  Ø(c2,c) b  ••• B6 a b c d i Counters Stacks 4 3 4 5 2 B7 a  Ø(a2,a) b  Ø(b2,b) c  Ø(c3,c) d  Ø(d2,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 d3 a3 d4 i > 100 *

Example End of B5 a b c d i Counters Stacks 4 3 5 5 2 * a0 b0 c0 d0 i0 a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 End of B5 b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d4  ••• B4 c4  ••• B5 d  Ø(d4,d3) c  Ø(c2,c4) b  ••• B6 a b c d i Counters Stacks 4 3 5 5 2 B7 a  Ø(a2,a) b  Ø(b2,b) c  Ø(c3,c) d  Ø(d2,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 d3 a3 c4 i > 100 *

Example End of B6 a b c d i Counters Stacks 4 4 6 6 2 * a0 b0 c0 d0 i0 a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 End of B6 b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d4  ••• B4 c4  ••• B5 d5  Ø(d4,d3) c5  Ø(c2,c4) b3  ••• B6 a b c d i Counters Stacks 4 4 6 6 2 B7 a  Ø(a2,a3) b  Ø(b2,b3) c  Ø(c3,c5) d  Ø(d2,d5) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 b3 c2 d3 a3 c5 d5 i > 100 *

Example Before B7 a b c d i Counters Stacks 4 4 6 6 2 * a0 b0 c0 d0 i0 a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 Before B7 b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d4  ••• B4 c4  ••• B5 d5  Ø(d4,d3) c5  Ø(c2,c4) b3  ••• B6 a b c d i Counters Stacks 4 4 6 6 2 B7 a  Ø(a2,a3) b  Ø(b2,b3) c  Ø(c3,c5) d  Ø(d2,d5) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 i > 100 *

Example End of B7 a b c d i Counters Stacks 5 5 7 7 3 * a0 b0 c0 d0 i0 a1  Ø(a0,a4) b1  Ø(b0,b4) c1  Ø(c0,c6) d1  Ø(d0,d6) i1  Ø(i0,i2) a2  ••• c2  ••• B1 End of B7 b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d4  ••• B4 c4  ••• B5 d5  Ø(d4,d3) c5  Ø(c2,c4) b3  ••• B6 a b c d i Counters Stacks 5 5 7 7 3 B7 a4  Ø(a2,a3) b4  Ø(b2,b3) c6  Ø(c3,c5) d6  Ø(d2,d5) y  a4+b4 z  c6+d6 i2  i1+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 b4 c2 d6 i2 a4 c6 i > 100 *

Example After renaming Semi-pruned SSA form We’re done … Counters B0 i0  ••• i > 100 Example a1  Ø(a0,a4) b1  Ø(b0,b4) c1  Ø(c0,c6) d1  Ø(d0,d6) i1  Ø(i0,i2) a2  ••• c2  ••• B1 After renaming Semi-pruned SSA form We’re done … b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d4  ••• B4 c4  ••• B5 d5  Ø(d4,d3) c5  Ø(c2,c4) b3  ••• B6 Counters Stacks B7 a4  Ø(a2,a3) b4  Ø(b2,b3) c6  Ø(c3,c5) d6  Ø(d2,d5) y  a4+b4 z  c6+d6 i2  i1+1 Semi-pruned  only names live in 2 or more blocks are “global names”. i > 100

SSA Construction Algorithm (Pruned SSA) What’s this “pruned SSA” stuff? Minimal SSA still contains extraneous Ø-functions Inserts some Ø-functions where they are dead Would like to avoid inserting them Two ideas Semi-pruned SSA: discard names used in only one block Significant reduction in total number of Ø-functions Needs only local Live information (cheap to compute) Pruned SSA: only insert Ø-functions where their value is live Inserts even fewer Ø-functions, but costs more to do Requires global Live variable analysis (more expensive) In practice, both are simple modifications to step 1.

SSA Construction Algorithm We can improve the stack management Push at most one name per stack per block (save push & pop) Thread names together by block To pop names for block b, use b’s thread This is another good use for a scoped hash table Significant reductions in pops and pushes Makes a minor difference in SSA construction time Scoped table is a clean, clear way to handle the problem

SSA Deconstruction At some point, we need executable code Few machines implement Ø operations Need to fix up the flow of values Basic idea Insert copies Ø-function pred’s Simple algorithm Works in most cases Adds lots of copies Most of them coalesce away X17 Ø(x10,x11) ...  x17 ... ...  x17 X17  x10 X17  x11 *

Extra Slides Start Here