Presentation is loading. Please wait.

Presentation is loading. Please wait.

Building SSA Form COMP 512 Rice University Houston, Texas Fall 2003

Similar presentations


Presentation on theme: "Building SSA Form COMP 512 Rice University Houston, Texas Fall 2003"— Presentation transcript:

1 Building SSA Form COMP 512 Rice University Houston, Texas Fall 2003
Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use.

2 Review SSA-form Each name is defined exactly once
Each use refers to exactly one name What’s hard Straight-line code is trivial Splits in the CFG are trivial Joins in the CFG are hard Building SSA Form Insert Ø-functions at birth points of values Rename all values for uniqueness A Ø-function is a special kind of copy that selects one of its parameters. The choice of parameter is governed by the CFG edge along which control reached the current block. Few machines implement a Ø-function directly in hardware. y1  ... y2  ... y3  Ø(y1,y2) COMP 512, Fall 2003 *

3 SSA Construction Algorithm (High-level sketch)
1. Insert Ø-functions 2. Rename values … that’s all ... … of course, there is some bookkeeping to be done ... COMP 512, Fall 2003 *

4 SSA Construction Algorithm (Less high-level sketch)
1. Insert Ø-functions at every join for every name 2. Solve reaching definitions 3. Rename each use to the def that reaches it (will be unique) What’s wrong with this approach Too many Ø-functions (precision) Too many Ø-functions (space) Too many Ø-functions (time) Need to relate edges to Ø-functions parameters (bookkeeping) To do better, we need a more complex approach Builds maximal SSA COMP 512, Fall 2003

5 SSA Construction Algorithm (Less high-level sketch)
1. Insert Ø-functions a.) calculate dominance frontiers b.) find global names for each name, build a list of blocks that define it c.) insert Ø-functions  global name n  block b in which n is assigned  block d in b’s dominance frontier insert a Ø-function for n in d add d to n’s list of defining blocks Moderately complex Compute list of blocks where each name is assigned & use as a worklist This adds to the worklist ! { Creates the iterated dominance frontier Use a checklist to avoid putting blocks on the worklist twice; keep another checklist to avoid inserting the same Ø-function twice. COMP 512, Fall 2003 *

6 SSA Construction Algorithm (Less high-level sketch)
2. Rename variables in a pre-order walk over dominator tree (use an array of stacks, one stack per global name) Staring with the root block, b a.) generate unique names for each Ø-function and push them on the appropriate stacks b.) rewrite each operation in the block i. Rewrite uses of global names with the current version (from the stack) ii. Rewrite definition by inventing & pushing new name c.) fill in Ø-function parameters of successor blocks d.) recurse on b’s children in the dominator tree e.) <on exit from block b > pop names generated in b from stacks 1 counter per name for subscripts Reset the state Need the end-of-block name for this path COMP 512, Fall 2003 *

7 SSA Construction Algorithm (Low-level detail)
Computing Dominance First step in Ø-function insertion computes dominance Recall that n dominates m iff n is on every path from n0 to m Every node dominates itself n’s immediate dominator is its closest dominator, IDOM(n)† DOM(n0 ) = { n0 } DOM(n) = { n }  (ppreds(n) DOM(p)) Computing DOM These equations form a rapid data-flow framework Iterative algorithm will solve them in d(G) + 3 passes Each pass does N unions & E intersections, E is O(N 2),  O(N 2) work Initially, DOM(n) = N,  n≠n0 †IDOM(n ) ≠ n, unless n is n0, by convention. COMP 512, Fall 2003

8 Example Progress of iterative solution for DOM B1 B2 B3 B4 B5 B6 B7 B0
Results of iterative solution for DOM Flow Graph COMP 512, Fall 2003 *

9 Example Progress of iterative solution for DOM B1 B2 B3 B4 B5 B6 B7 B0
Results of iterative solution for DOM Dominance Tree There are asymptotically faster algorithms. With the right data structures, the iterative algorithm can be made faster. See Cooper, Harvey, & Kennedy, on the web site. COMP 512, Fall 2003

10 Example Dominance Frontiers & Ø-Function Insertion
A definition at n forces a Ø-function at m iff n  DOM(m) but n  DOM(p) for some p  preds(m) DF(n ) is fringe just beyond region n dominates B1 B2 B3 B4 B5 B6 B7 B0 x Ø(...)  in 7 forces Ø-function in DF(7) = {1} x ... x Ø(...) DF(4) is {6}, so  in 4 forces Ø-function in 6 x Ø(...)  in 6 forces Ø-function in DF(6) = {7} Dominance Frontiers  in 1 forces Ø-function in DF(1) = Ø (halt ) For each assignment, we insert the Ø-functions COMP 512, Fall 2003 *

11 Example Computing Dominance Frontiers
Only join points are in DF(n) for some n Leads to a simple, intuitive algorithm for computing dominance frontiers For each join point x (i.e., |preds(x)| > 1) For each CFG predecessor of x Run up to IDOM(x ) in the dominator tree, adding x to DF(n) for each n between x and IDOM(x ) B1 B2 B3 B4 B5 B6 B7 B0 For some applications, we need post-dominance, the post-dominator tree, and reverse dominance frontiers, RDF(n ) Just dominance on the reverse CFG Reverse the edges & add unique exit node We will use these in dead code elimination Dominance Frontiers COMP 512, Fall 2003 *

12 SSA Construction Algorithm (Reminder)
1. Insert Ø-functions at every join for every name a.) calculate dominance frontiers b.) find global names for each name, build a list of blocks that define it c.) insert Ø-functions  global name n  block b in which n is assigned  block d in b’s dominance frontier insert a Ø-function for n in d add d to n’s list of defining blocks Needs a little more detail COMP 512, Fall 2003 *

13 SSA Construction Algorithm
Finding global names Different between two forms of SSA Minimal uses all names Semi-pruned uses names that are live on entry to some block Shrinks name space & number of Ø-functions Pays for itself in compile-time speed For each “global name”, need a list of blocks where it is defined Drives Ø-function insertion b defines x implies a Ø-function for x in every c  DF(b) Pruned SSA adds a test to see if x is live at insertion point Otherwise, cannot need a Ø-function Occasionally, building pruned is faster than building semi-pruned. Any algorithm that has non-linear behavior in the number of Ø-functions will have a size where pruned is the SSA flavor of choice. COMP 512, Fall 2003

14 Excluding local names avoids Ø’s for y & z
a  Ø(a,a) b  Ø(b,b) c  Ø(c,c) d  Ø(d,d) y  a+b z  c+d i  i+1 B7 i > 100 i  ••• B0 b  ••• c  ••• d  ••• B2 i  Ø(i,i) a  ••• B1 B3 B4 B5 B6 Example Excluding local names avoids Ø’s for y & z With all the Ø-functions Lots of new ops Renaming is next Assume a, b, c, & d defined before B0

15 SSA Construction Algorithm (Less high-level sketch)
2. Rename variables in a pre-order walk over dominator tree (use an array of stacks, one stack per global name) Staring with the root block, b a.) generate unique names for each Ø-function and push them on the appropriate stacks b.) rewrite each operation in the block i. Rewrite uses of global names with the current version (from the stack) ii. Rewrite definition by inventing & pushing new name c.) fill in Ø-function parameters of successor blocks d.) recurse on b’s children in the dominator tree e.) <on exit from block b > pop names generated in b from stacks 1 counter per name for subscripts Reset the state Need the end-of-block name for this path COMP 512, Fall 2003

16 SSA Construction Algorithm (Less high-level sketch)
Adding all the details ... Rename(b) for each Ø-function in b, x  Ø(…) rename x as NewName(x) for each operation “x  y op z” in b rewrite y as top(stack[y]) rewrite z as top(stack[z]) rewrite x as NewName(x) for each successor of b in the CFG rewrite appropriate Ø parameters for each successor s of b in dom. tree Rename(s) pop(stack[x]) for each global name i counter[i]  0 stack[i]  Ø call Rename(n0) NewName(n) i  counter[n] counter[n]  counter[n] + 1 push ni onto stack[n] return ni COMP 512, Fall 2003

17 Example Before processing B0 Assume a, b, c, & d defined before B0
a  Ø(a,a) b  Ø(b,b) c  Ø(c,c) d  Ø(d,d) y  a+b z  c+d i  i+1 B7 i > 100 i  ••• B0 b  ••• c  ••• d  ••• B2 i  Ø(i,i) a  ••• B1 B3 B4 B5 B6 Example Before processing B0 Assume a, b, c, & d defined before B0 i has not been defined a b c d i Counters Stacks 1 1 1 1 a0 b0 c0 d0 *

18 Example End of B0 a b c d i Counters Stacks 1 1 1 1 1 * a0 b0 c0 d0 i0
a  Ø(a0,a) b  Ø(b0,b) c  Ø(c0,c) d  Ø(d0,d) i  Ø(i0,i) a  ••• c  ••• B1 End of B0 b  ••• c  ••• d  ••• B2 a  ••• d  ••• B3 d  ••• B4 c  ••• B5 d  Ø(d,d) c  Ø(c,c) b  ••• B6 a b c d i Counters Stacks 1 1 1 1 1 B7 a  Ø(a,a) b  Ø(b,b) c  Ø(c,c) d  Ø(d,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 i > 100 *

19 Example End of B1 a b c d i Counters Stacks 3 2 3 2 2 * a0 b0 c0 d0 i0
a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 End of B1 b  ••• c  ••• d  ••• B2 a  ••• d  ••• B3 d  ••• B4 c  ••• B5 d  Ø(d,d) c  Ø(c,c) b  ••• B6 a b c d i Counters Stacks 3 2 3 2 2 B7 a  Ø(a,a) b  Ø(b,b) c  Ø(c,c) d  Ø(d,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 i > 100 *

20 Example End of B2 a b c d i Counters Stacks 3 3 4 3 2 * a0 b0 c0 d0 i0
a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 End of B2 b2  ••• c3  ••• d2  ••• B2 a  ••• d  ••• B3 d  ••• B4 c  ••• B5 d  Ø(d,d) c  Ø(c,c) b  ••• B6 a b c d i Counters Stacks 3 3 4 3 2 B7 a  Ø(a2,a) b  Ø(b2,b) c  Ø(c3,c) d  Ø(d2,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 b2 c2 d2 c3 i > 100 *

21 Example Before starting B3 a b c d i Counters Stacks 3 3 4 3 2 * a0 b0
a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 Before starting B3 b2  ••• c3  ••• d2  ••• B2 a  ••• d  ••• B3 d  ••• B4 c  ••• B5 d  Ø(d,d) c  Ø(c,c) b  ••• B6 a b c d i Counters Stacks 3 3 4 3 2 B7 a  Ø(a2,a) b  Ø(b2,b) c  Ø(c3,c) d  Ø(d2,d) y  a+b z  c+d i  i+1 i ≤ 100 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 i > 100 *

22 Example End of B3 a b c d i Counters Stacks 4 3 4 4 2 * a0 b0 c0 d0 i0
a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 End of B3 b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d  ••• B4 c  ••• B5 d  Ø(d,d) c  Ø(c,c) b  ••• B6 a b c d i Counters Stacks 4 3 4 4 2 B7 a  Ø(a2,a) b  Ø(b2,b) c  Ø(c3,c) d  Ø(d2,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 d3 a3 i > 100 *

23 Example End of B4 a b c d i Counters Stacks 4 3 4 5 2 * a0 b0 c0 d0 i0
a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 End of B4 b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d4  ••• B4 c  ••• B5 d  Ø(d4,d) c  Ø(c2,c) b  ••• B6 a b c d i Counters Stacks 4 3 4 5 2 B7 a  Ø(a2,a) b  Ø(b2,b) c  Ø(c3,c) d  Ø(d2,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 d3 a3 d4 i > 100 *

24 Example End of B5 a b c d i Counters Stacks 4 3 5 5 2 * a0 b0 c0 d0 i0
a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 End of B5 b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d4  ••• B4 c4  ••• B5 d  Ø(d4,d3) c  Ø(c2,c4) b  ••• B6 a b c d i Counters Stacks 4 3 5 5 2 B7 a  Ø(a2,a) b  Ø(b2,b) c  Ø(c3,c) d  Ø(d2,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 d3 a3 c4 i > 100 *

25 Example End of B6 a b c d i Counters Stacks 4 4 6 6 2 * a0 b0 c0 d0 i0
a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 End of B6 b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d4  ••• B4 c4  ••• B5 d5  Ø(d4,d3) c5  Ø(c2,c4) b3  ••• B6 a b c d i Counters Stacks 4 4 6 6 2 B7 a  Ø(a2,a3) b  Ø(b2,b3) c  Ø(c3,c5) d  Ø(d2,d5) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 b3 c2 d3 a3 c5 d5 i > 100 *

26 Example Before B7 a b c d i Counters Stacks 4 4 6 6 2 * a0 b0 c0 d0 i0
a1  Ø(a0,a) b1  Ø(b0,b) c1  Ø(c0,c) d1  Ø(d0,d) i1  Ø(i0,i) a2  ••• c2  ••• B1 Before B7 b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d4  ••• B4 c4  ••• B5 d5  Ø(d4,d3) c5  Ø(c2,c4) b3  ••• B6 a b c d i Counters Stacks 4 4 6 6 2 B7 a  Ø(a2,a3) b  Ø(b2,b3) c  Ø(c3,c5) d  Ø(d2,d5) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 i > 100 *

27 Example End of B7 a b c d i Counters Stacks 5 5 7 7 3 * a0 b0 c0 d0 i0
a1  Ø(a0,a4) b1  Ø(b0,b4) c1  Ø(c0,c6) d1  Ø(d0,d6) i1  Ø(i0,i2) a2  ••• c2  ••• B1 End of B7 b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d4  ••• B4 c4  ••• B5 d5  Ø(d4,d3) c5  Ø(c2,c4) b3  ••• B6 a b c d i Counters Stacks 5 5 7 7 3 B7 a4  Ø(a2,a3) b4  Ø(b2,b3) c6  Ø(c3,c5) d6  Ø(d2,d5) y  a4+b4 z  c6+d6 i2  i1+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 b4 c2 d6 i2 a4 c6 i > 100 *

28 Example After renaming Semi-pruned SSA form We’re done … Counters
B0 i0  ••• i > 100 Example a1  Ø(a0,a4) b1  Ø(b0,b4) c1  Ø(c0,c6) d1  Ø(d0,d6) i1  Ø(i0,i2) a2  ••• c2  ••• B1 After renaming Semi-pruned SSA form We’re done … b2  ••• c3  ••• d2  ••• B2 a3  ••• d3  ••• B3 d4  ••• B4 c4  ••• B5 d5  Ø(d4,d3) c5  Ø(c2,c4) b3  ••• B6 Counters Stacks B7 a4  Ø(a2,a3) b4  Ø(b2,b3) c6  Ø(c3,c5) d6  Ø(d2,d5) y  a4+b4 z  c6+d6 i2  i1+1 Semi-pruned  only names live in 2 or more blocks are “global names”. i > 100

29 SSA Construction Algorithm (Pruned SSA)
What’s this “pruned SSA” stuff? Minimal SSA still contains extraneous Ø-functions Inserts some Ø-functions where they are dead Would like to avoid inserting them Two ideas Semi-pruned SSA: discard names used in only one block Significant reduction in total number of Ø-functions Needs only local Live information (cheap to compute) Pruned SSA: only insert Ø-functions where their value is live Inserts even fewer Ø-functions, but costs more to do Requires global Live variable analysis (more expensive) In practice, both are simple modifications to step 1. COMP 512, Fall 2003

30 SSA Construction Algorithm
We can improve the stack management Push at most one name per stack per block (save push & pop) Thread names together by block To pop names for block b, use b’s thread This is another good use for a scoped hash table Significant reductions in pops and pushes Makes a minor difference in SSA construction time Scoped table is a clean, clear way to handle the problem COMP 512, Fall 2003

31 SSA Deconstruction At some point, we need executable code
Few machines implement Ø operations Need to fix up the flow of values Basic idea Insert copies Ø-function pred’s Simple algorithm Works in most cases Adds lots of copies Most of them coalesce away X17 Ø(x10,x11) ...  x17 ... ...  x17 X17  x10 X17  x11 COMP 512, Fall 2003 *


Download ppt "Building SSA Form COMP 512 Rice University Houston, Texas Fall 2003"

Similar presentations


Ads by Google