
1 Proliferation of Data-flow Problems Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved. Comp 512 Spring 2011

2 Data-flow Basics: Review from Last Lecture
Graph, sets, equations, & initial values define a problem
  - Sets form a semilattice
  - Equations are treated as transfer functions
  - Properties of the semilattice & transfer functions determine termination and correctness
Dominator calculation
  - One of the simplest forward data-flow problems
Liveness information
  - A typical (backward) global data-flow problem
Both DOM and LIVE are well behaved
  - Admissible frameworks with unique fixed points
Traversal orders: Postorder, rPostorder, Preorder, & rPreorder

3 Today's Lecture
Framework properties and speed of convergence
  - Kam-Ullman rapid condition
  - Worklist iterative algorithm
More data-flow problems
  - Available expressions
  - Very busy expressions
  - Constant propagation (Kildall style, not SCCP)
Problems that don't fit the Kam-Ullman model for speed

4 Iterative Data-flow Analysis: Correctness
If every f ∈ F is monotone, i.e., f(x ∧ y) ≤ f(x) ∧ f(y), and
if the lattice is bounded, i.e., every descending chain is finite
  - A chain is a sequence x_1, x_2, …, x_n where x_i ∈ L, 1 ≤ i ≤ n
  - If x_i > x_{i+1} for 1 ≤ i < n, the chain is descending
Then the round-robin algorithm computes a least fixed-point (LFP)
The uniqueness of the solution depends on other properties of F
  - Unique solution ⇒ it finds the one we want
  - Multiple solutions ⇒ we need to know which one it finds
(Reference material)

5 Iterative Data-flow Analysis: Correctness
Does the iterative algorithm compute the desired answer?
Admissible Function Spaces
  1. ∀ f ∈ F, ∀ x, y ∈ L, f(x ∧ y) = f(x) ∧ f(y)
  2. ∃ f_i ∈ F such that ∀ x ∈ L, f_i(x) = x
  3. f, g ∈ F ⇒ ∃ h ∈ F such that h(x) = f(g(x))
  4. ∀ x ∈ L, ∃ a finite subset H ⊆ F such that x = ∧_{f ∈ H} f(⊤)
If F meets these four conditions, then an instance of the problem will have a unique fixed-point solution (instance = graph + initial values)
  - LFP = MFP = MOP ⇒ order of evaluation does not matter
  - * Both DOM & LIVE meet all four criteria
If meet does not distribute over function application, then the fixed-point solution may not be unique; the iterative algorithm will find an LFP.
(Reference material)
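As a concrete check of condition 1 for a set-based framework like LIVE, the following Python sketch verifies that a transfer function of the form f(x) = UEVAR ∪ (x − VARKILL) distributes over the meet, which for LIVE is set union. The block sets used here are invented for illustration, not taken from the slides.

# Minimal sketch: check condition 1, f(x ∧ y) = f(x) ∧ f(y), for a LIVE-style
# transfer function over the powerset lattice, where the meet is set union.
# The sets below are made-up examples.

def make_transfer(uevar, varkill):
    """Build f(x) = UEVAR ∪ (x − VARKILL) for one block."""
    return lambda x: uevar | (x - varkill)

f = make_transfer(uevar={"a", "b"}, varkill={"c"})

x = {"c", "d"}
y = {"a", "c", "e"}

meet = x | y                      # meet for LIVE is set union
assert f(meet) == f(x) | f(y)     # condition 1 holds for this class of functions
print("f distributes over the meet:", f(meet))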

6 Iterative Data-flow Analysis: Speed
For a problem with an admissible function space & a bounded semilattice,
if the functions all meet the rapid condition, i.e.,
    ∀ f, g ∈ F, ∀ x ∈ L, f(g(⊥)) ≥ g(⊥) ∧ f(x) ∧ x
then a round-robin, reverse-postorder iterative algorithm will halt in d(G) + 3 passes over a graph G
d(G) is the loop-connectedness of the graph w.r.t. a DFST
  - Maximal number of back edges in an acyclic path
  - Several studies suggest that, in practice, d(G) is small (< 3)
  - For most CFGs, d(G) is independent of the specific DFST
Sets stabilize in two passes around a loop
Each pass does O(E) meets & O(N) other operations
* Both DOM & LIVE are rapid frameworks

7 Iterative Data-flow Analysis: What does this mean?
Reverse postorder
  - Easily computed order that increases propagation per pass
Round-robin iterative algorithm
  - Visit all the nodes in a consistent order (RPO)
  - Do it again until the sets stop changing
Rapid condition
  - Most classic global data-flow problems meet this condition
These conditions are easily met
  - Admissible framework, rapid function space
  - Round-robin, reverse-postorder, iterative algorithm
  - The analysis runs in (effectively) linear time

8 The Round-robin Iterative Algorithm

    LIVEOUT(n_f) ← Ø
    for all other nodes n
        LIVEOUT(n) ← Ø
        LIVEIN(n) ← UEVAR(n)
    change ← true
    while (change)
        change ← false
        for i ← 0 to N
            n ← rPreorder(i)
            TEMP ← ∪_{s ∈ succ(n)} LIVEIN(s)                        ( meet operation )
            if LIVEOUT(n) ≠ TEMP then
                change ← true
                LIVEOUT(n) ← TEMP
                LIVEIN(n) ← UEVAR(n) ∪ (LIVEOUT(n) − VARKILL(n))    ( transfer function )

We could, for brevity, fold the transfer function into the computation of LIVEOUT and eliminate the need to represent LIVEIN. We use this formulation for clarity.
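A runnable Python sketch of the round-robin algorithm above. The tiny CFG and the UEVAR / VARKILL sets are invented for illustration, and a fixed node list stands in for rPreorder; the order only affects how fast the sets stabilize, not the answer.

# Round-robin iterative LIVE analysis (sketch). The CFG and local sets are
# hypothetical; B3 plays the role of the exit node n_f.

succ = {"B0": ["B1"], "B1": ["B2", "B3"], "B2": ["B1"], "B3": []}
uevar = {"B0": {"i"}, "B1": {"i", "n"}, "B2": {"i", "s"}, "B3": {"s"}}
varkill = {"B0": {"s"}, "B1": set(), "B2": {"i", "s"}, "B3": set()}

order = ["B3", "B2", "B1", "B0"]          # stand-in for rPreorder
liveout = {n: set() for n in succ}
livein = {n: set(uevar[n]) for n in succ}

changed = True
passes = 0
while changed:
    changed = False
    passes += 1
    for n in order:
        temp = set().union(*(livein[s] for s in succ[n])) if succ[n] else set()
        if temp != liveout[n]:            # the meet over successors changed
            changed = True
            liveout[n] = temp
            livein[n] = uevar[n] | (liveout[n] - varkill[n])   # transfer function

print(passes, liveout)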

9 The Worklist Iterative Algorithm

    Worklist ← { all nodes n }
    LIVEOUT(n_f) ← Ø
    for all other nodes n
        LIVEOUT(n) ← Ø
        LIVEIN(n) ← UEVAR(n)
    while (Worklist ≠ Ø)
        remove a node n from Worklist
        TEMP ← ∪_{s ∈ succ(n)} LIVEIN(s)
        if LIVEOUT(n) ≠ TEMP then
            LIVEOUT(n) ← TEMP
            LIVEIN(n) ← UEVAR(n) ∪ (LIVEOUT(n) − VARKILL(n))
            Worklist ← Worklist ∪ preds(n)

This algorithm avoids re-evaluating LIVEOUT when none of its constituents has changed. Thus, it does fewer set operations on LIVEOUT & LIVEIN than the round-robin algorithm. It does manipulate the Worklist, which is an added cost.
Implement the Worklist as a set – no duplicate entries.
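The same LIVE problem solved with the worklist formulation, again as a hedged sketch over the same invented CFG; the worklist is kept as a Python set so a node never appears on it twice.

# Worklist LIVE analysis (sketch). CFG and local sets are the hypothetical
# example from the previous slide; preds is derived from succ.

succ = {"B0": ["B1"], "B1": ["B2", "B3"], "B2": ["B1"], "B3": []}
uevar = {"B0": {"i"}, "B1": {"i", "n"}, "B2": {"i", "s"}, "B3": {"s"}}
varkill = {"B0": {"s"}, "B1": set(), "B2": {"i", "s"}, "B3": set()}

preds = {n: [] for n in succ}
for n, ss in succ.items():
    for s in ss:
        preds[s].append(n)

liveout = {n: set() for n in succ}
livein = {n: set(uevar[n]) for n in succ}

worklist = set(succ)                      # start with every node
while worklist:
    n = worklist.pop()
    temp = set().union(*(livein[s] for s in succ[n])) if succ[n] else set()
    if temp != liveout[n]:
        liveout[n] = temp
        livein[n] = uevar[n] | (liveout[n] - varkill[n])
        worklist |= set(preds[n])         # only predecessors can be affected

print(liveout)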

10 Round-robin Versus Worklist Algorithm
Termination, correctness, & complexity arguments are for the round-robin version
The worklist algorithm does less work; it visits fewer nodes
Consider a version with two worklists: current & next
  - Draw nodes from current in RPO; add them to next
  - Swap the worklists when current becomes empty (one round-robin pass)
  - Makes the same changes, in the same order, as round robin
  - Termination & correctness arguments should carry over
Is this the best order? Maybe not …
  - Iterate around a loop until it stabilizes
  - Many data structures for the worklist, many orders
  - Your mileage will vary (measurable effect)
The worklist algorithm is a sparse version of the round-robin algorithm

11 Iterative Data-flow Analysis: How do we use these results?
  - Prove that the data-flow framework is admissible & rapid
      It's just algebra
      Most (but not all) global data-flow problems are rapid
      This is a property of F
  - Code up an iterative algorithm
      World's simplest data-flow algorithm
  - Rely on the results
      Theoretically sound, robust algorithm
This lets us ignore most of the other data-flow algorithms in 512

12 Live Variables
Transformation: eliminating unneeded stores
  - The value is in a register, its last definition has been seen, and it is never used again
  - The store is dead (except for debugging)
  - The compiler can eliminate the store
Data-flow problem: live variables
    LIVEOUT(b) = ∪_{s ∈ succ(b)} (UEVAR(s) ∪ (LIVEOUT(s) − VARKILL(s)))
    LIVEOUT(n_f) = Ø
  - LIVEOUT(b) is the set of variables live on exit from b
  - VARKILL(b) is the set of variables that are redefined in b
  - UEVAR(b) is the set of variables used before any redefinition in b
Live analysis is a backward flow problem
|LIVEOUT| = |variables|
LIVE plays an important role in both register allocation and the pruned-SSA construction.
Form of f is the same as in AVAIL *

13 Computing Available Expressions
An expression is available on entry to b if, along every path from the entry node to b, it has been computed & none of its constituent subexpressions has been subsequently modified.
    AVAIL(b) = ∩_{x ∈ pred(b)} (DEEXPR(x) ∪ (AVAIL(x) − EXPRKILL(x)))
where
  - EXPRKILL(b) is the set of expressions killed in b, and
  - DEEXPR(b) is the set of expressions defined in b and not subsequently killed in b
Initial conditions
  - AVAIL(n_0) = Ø, because nothing is computed before n_0
  - AVAIL(n) = { all expressions } for all other nodes, because ∧ is ∩
Available expressions is a forward data-flow problem
|AVAIL| = |expressions|
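A sketch of the AVAIL computation as a forward round-robin pass, with an invented three-block CFG and made-up DEEXPR / EXPRKILL sets; note the different initialization of the entry node versus all other nodes, because the meet here is intersection.

# Available expressions (sketch): forward problem, meet is set intersection.
# The CFG and the DEEXPR / EXPRKILL sets are hypothetical examples.

preds = {"B0": [], "B1": ["B0", "B2"], "B2": ["B1"]}
all_exprs = {"x+y", "a*b", "x+z"}
deexpr = {"B0": {"x+y", "a*b"}, "B1": {"x+z"}, "B2": {"x+y"}}
exprkill = {"B0": set(), "B1": {"a*b"}, "B2": set()}

avail = {n: set(all_exprs) for n in preds}   # optimistic initial value (top)
avail["B0"] = set()                          # nothing is available at entry

changed = True
while changed:
    changed = False
    for n in ["B0", "B1", "B2"]:             # roughly reverse postorder
        if preds[n]:
            contribs = [deexpr[p] | (avail[p] - exprkill[p]) for p in preds[n]]
            temp = set.intersection(*contribs)
        else:
            temp = set()
        if temp != avail[n]:
            avail[n] = temp
            changed = True

print(avail)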

14 Using Available Expressions
AVAIL is the basis for global common subexpression elimination
If x+y ∈ AVAIL(b), then the compiler can replace any upwards-exposed occurrence of x+y in b with a reference to the previously computed value
Standard caveats apply about preserving prior computations
  - Hash x+y and assign temporary names; copy all x+y to the name
  - Walk backward through the code to find a minimal set of copies
  - Other clever schemes abound
The resulting algorithm is similar to value numbering
  - Based on textual identity of expressions, not value identity
  - Recognizes redundancy carried along a loop-closing branch
The classic form of GCSE (Cocke, SIGPLAN Conference, 1970)

15 Very Busy Expressions
An expression e is very busy on exit from n iff
  - e is evaluated & used along every path from n to n_f, AND
  - evaluating e at the end of n would produce the same result as the next evaluation along those paths
The Analysis
  - Annotate each block n with a set VERYBUSY(n) that contains every expression that is very busy on exit from n
  - Perform the analysis to find opportunities
  - The problem is a backward problem
Initialization(s)
  - VERYBUSY(n_f) = Ø
  - VERYBUSY(n) = ?
Recurrence Equation
  - VERYBUSY(n) = ?
[Figure: a small CFG in which x+y is evaluated along every path leaving a block]
Hint: use UEEXPR(n) and EXPRKILL(n)

16 Very Busy Expressions
Transformation: hoisting
  - e is evaluated in every successor of b
  - Evaluating e in b produces the same result
  - Saves code space, but shortens no path
Data-flow problem: very busy expressions
    VERYBUSY(b) = ∩_{s ∈ succ(b)} (UEEXPR(s) ∪ (VERYBUSY(s) − EXPRKILL(s)))
    VERYBUSY(n_f) = Ø
  - VERYBUSY(b) contains the expressions that are very busy at the end of b
  - UEEXPR(b) is the set of expressions used before they are killed in b
  - EXPRKILL(b) is the set of expressions killed before they are used in b
Very busy expressions is a backward flow problem
|VERYBUSY| = |expressions|
Form of f is the same as in LIVE *
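For comparison with LIVE and AVAIL, here is a hedged sketch of the VERYBUSY equations: a backward pass whose meet is intersection over successors. The CFG and the UEEXPR / EXPRKILL sets are invented.

# Very busy expressions (sketch): backward problem, meet is set intersection.
# CFG and local sets are hypothetical; B2 plays the role of the exit node n_f.

succ = {"B0": ["B1", "B2"], "B1": ["B2"], "B2": []}
all_exprs = {"x+y", "a*b"}
ueexpr = {"B0": set(), "B1": {"x+y"}, "B2": {"x+y", "a*b"}}
exprkill = {"B0": set(), "B1": {"a*b"}, "B2": set()}

verybusy = {n: set(all_exprs) for n in succ}   # optimistic initial value
verybusy["B2"] = set()                         # nothing is busy after the exit

changed = True
while changed:
    changed = False
    for n in ["B2", "B1", "B0"]:
        if succ[n]:
            contribs = [ueexpr[s] | (verybusy[s] - exprkill[s]) for s in succ[n]]
            temp = set.intersection(*contribs)
            if temp != verybusy[n]:
                verybusy[n] = temp
                changed = True

print(verybusy)   # x+y is very busy on exit from B0: both paths evaluate it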

17 Constant Propagation (Classic formulation)
Transformation: global constant folding
  - Along every path to p, v has the same known value
  - Specialize the computation at p based on v's value
Data-flow problem: constant propagation
  - The domain is the set of pairs <v_i, c_i> where v_i is a variable and c_i ∈ C
    CONSTANTS(b) = ∧_{p ∈ preds(b)} f_p(CONSTANTS(p))
  - ∧ performs a pairwise meet on two sets of pairs
  - f_p(x) is a block-specific function that models the effects of block p on the pairs in x
Constant propagation is a forward flow problem
|CONSTANTS| = |variables|
The lattice C: ⊤ above all constants … c_{-2}, c_{-1}, c_0, c_1, c_2, …, with ⊥ below them all
Form of f is quite different than in AVAIL *

18 Example
The meet operation requires more explanation
  - c_1 ∧ c_2 = c_1 if c_1 = c_2, else ⊥  (⊥ & ⊤ behave as expected)
What about f_p?
If p has one statement, then
  - x ← y, with CONSTANTS(p) = { …, <x, c_x>, …, <y, c_y>, … }, then
        f_p(CONSTANTS(p)) = CONSTANTS(p) − <x, c_x> + <x, c_y>
  - x ← y op z, with CONSTANTS(p) = { …, <x, c_x>, …, <y, c_y>, …, <z, c_z>, … }, then
        f_p(CONSTANTS(p)) = CONSTANTS(p) − <x, c_x> + <x, c_y op c_z>
If p has n statements, then
    f_p(CONSTANTS(p)) = f_n(f_{n-1}(f_{n-2}(… f_2(f_1(CONSTANTS(p))) …)))
where f_i is the function generated by the i-th statement in p
f_p interprets p over CONSTANTS
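A small sketch of this Kildall-style machinery in Python: the pairwise meet on sets of <variable, constant> pairs (represented as a dict, with absence meaning the variable is not known to be constant) and the transfer function for a single "x ← y op z" statement. The example block and all names are hypothetical.

import operator

# Sketch of the CONSTANTS machinery. A block's fact is a dict {var: constant};
# a variable absent from the dict is not known to be constant.

OPS = {"+": operator.add, "*": operator.mul, "-": operator.sub}

def meet(s1, s2):
    """Pairwise meet: keep <v, c> only where both sets agree on c."""
    return {v: c for v, c in s1.items() if s2.get(v) == c}

def eval_stmt(stmt, facts):
    """Transfer function for one statement (x, op, y, z): x ← y op z."""
    x, op, y, z = stmt
    facts = dict(facts)
    if y in facts and z in facts:
        facts[x] = OPS[op](facts[y], facts[z])   # replace <x, c_x> with <x, c_y op c_z>
    else:
        facts.pop(x, None)                       # x is no longer a known constant
    return facts

def f_p(block, facts):
    """f_p interprets the whole block, statement by statement."""
    for stmt in block:
        facts = eval_stmt(stmt, facts)
    return facts

block = [("a", "+", "b", "c"), ("d", "*", "a", "b")]   # hypothetical block p
print(f_p(block, {"b": 3, "c": 4}))                    # {'b': 3, 'c': 4, 'a': 7, 'd': 21}
print(meet({"b": 3, "c": 4}, {"b": 1, "c": 6}))        # {} : no pair survives the meet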

19 Proliferation of Data-flow Problems
If we continue in this fashion,
  - We will need a huge set of data-flow problems
  - Each will have slightly different initial information
  - This seems like a messy way to go …
Desiderata
  - Solve one data-flow problem
  - Use it for all transformations
To address this issue, researchers invented information chains
  - Topic of the next lecture

20 Some problems are not admissible
Global constant propagation
First condition in admissibility: ∀ f ∈ F, ∀ x, y ∈ L, f(x ∧ y) = f(x) ∧ f(y)
Constant propagation is not admissible
  - The Kam & Ullman time bound does not hold
  - There are tight time bounds, however, based on lattice height
  - Requires a variable-by-variable formulation
Example: a block containing "a ← b + c", where the function f models the block's effects
  - S1: {b=3, c=4}    S2: {b=1, c=6}
  - f(S1) = {a=7, b=3, c=4}    f(S2) = {a=7, b=1, c=6}
  - f(S1) ∧ f(S2) = {a=7}, but f(S1 ∧ S2) = f(Ø) = Ø
F and ∧ do not distribute
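The slide's counterexample, worked in code as a sketch using the same dict-of-constants representation as above: applying the block's function after the meet loses the fact a = 7 that applying it before the meet preserves.

# Sketch of the non-distributivity example: the block contains only  a ← b + c.

def meet(s1, s2):
    return {v: c for v, c in s1.items() if s2.get(v) == c}

def f(facts):
    facts = dict(facts)
    if "b" in facts and "c" in facts:
        facts["a"] = facts["b"] + facts["c"]
    else:
        facts.pop("a", None)
    return facts

S1 = {"b": 3, "c": 4}
S2 = {"b": 1, "c": 6}

print(meet(f(S1), f(S2)))   # {'a': 7}  : meet of the results keeps a = 7
print(f(meet(S1, S2)))      # {}        : meeting first loses everything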

21 Interprocedural May Modify Information
At a call site, what variables may be modified by its execution?
Generalized answer for each procedure p: GMOD(p)
  - GMOD(p) is independent of the calling context of p
  - IMOD(p) is the set of variables modified locally in p
Project an answer back to each call site s: MOD(s)
  - Alias-free MOD(s) is DMOD(s), the direct modification side effect
    GMOD(p) = IMOD(p) ∪ (∪_{s ∈ succ(p)} DMOD(s))
    DMOD(s) = B_s(GMOD(p)), where p is the callee at s
To compute the final MOD sets, we must factor in ALIAS information
  - If <x, y> ∈ ALIAS(p) & x ∈ DMOD(s) for some call site s in p, then put both x & y in MOD(s)
Flow-insensitive formulation by Banning
B_s models the effects of call-by-reference binding. With call-by-value, this problem is much simpler.

22 Some admissible problems are not rapid
Flow-insensitive interprocedural May Modify sets
  - Iterations are proportional to the number of parameters
  - Not a function of the call graph
  - Can make the example arbitrarily bad
  - Proportional to the length of the chain of bindings …

    shift(a,b,c,d,e,f) {
        local t;
        …
        call shift(t,a,b,c,d,e);
        f = 1;
        …
    }

Assume call-by-reference (call-by-value is simpler)
Compute the set of variables (in shift) that can be modified by a call to shift
How long does it take?
[Figure: the chain of parameter bindings in shift: a, b, c, d, e, f]
Nothing to do with d(G)
Hint: build a sparse evaluation graph based on parameter binding [Cooper & Kennedy, PLDI 1988]
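A sketch of why this framework is not rapid: the fixed-point loop below models the shift example under call-by-reference, where the recursive call binds formal a to the local t, b to a, c to b, and so on, and f is assigned locally. Each pass adds only one more variable to GMOD, so the number of passes tracks the number of parameters rather than d(G). The encoding of the binding map is my own.

# Sketch: iterate GMOD for the recursive shift example. The recursive call
# shift(t,a,b,c,d,e) binds formal a←t, b←a, c←b, d←c, e←d, f←e (call-by-reference).

imod = {"f"}                                  # f = 1 is the only local modification
bind_back = {"a": "t", "b": "a", "c": "b",    # formal at the callee -> actual at the call
             "d": "c", "e": "d", "f": "e"}

gmod = set()
passes = 0
while True:
    passes += 1
    new = imod | {bind_back[v] for v in gmod if v in bind_back}
    if new == gmod:
        break
    gmod = new

print(passes, sorted(gmod))   # the pass count grows with the parameter count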

