# Data-Flow Analysis II (CS 671, March 13, 2008)

## Presentation transcript

Data-Flow Analysis II CS 671 March 13, 2008

CS 671 – Spring 2008

### Data-Flow Analysis

Gather conservative, approximate information about what a program does.

- Result: some property that holds every time the instruction executes.

### The Data-Flow Abstraction

- Execution of an instruction transforms program state.
- To analyze a program, we must consider all possible sequences of program points (paths).
- Summarize all possible program states with a finite set of facts.
- Limitation: the analysis may consider some infeasible paths.

### The General Approach

Set up and solve systems of equations that relate information at various points in the program, such as

$out[S] = gen[S] \cup (in[S] - kill[S])$

where

- S is a statement
- in[S] and out[S] are the information before and after S
- gen[S] and kill[S] are the information generated and killed by S

The definitions of in, out, gen, and kill depend on the desired information.

### Data-Flow Analysis (cont.)

Properties:

- Either a forward analysis (out as a function of in) or a backward analysis (in as a function of out).
- Either an “along some path” problem or an “along all paths” problem.
- Data-flow analysis must be conservative.

Definitions:

- A point lies between two statements (or before the first statement and after the last).
- A path is a sequence of consecutive points in the control-flow graph.

### Example – Live Variables

Steps:

1. Set up live sets for each program point
2. Instantiate equations
3. Solve equations

[Figure: flow graph over the statements `if (c)`, `x = y+1`, `y = 2*z`, `if (d)`, `x = y+z`, `z = 1`, `z = x`.]

### Example: Program points

[Figure: the flow graph over `if (c)`, `x = y+1`, `y = 2*z`, `if (d)`, `x = y+z`, `z = 1`, `z = x`, annotated with program points L1 through L12.]

### Example: Defs and uses

[Figure: the same flow graph with program points L1 through L12 and the statements numbered 1 through 7.]

| Stmt | Defs | Uses |
|------|------|------|
| 1 | | |
| 2 | | |
| 3 | | |
| 4 | | |
| 5 | | |
| 6 | | |
| 7 | | |
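The Defs/Uses columns can be derived mechanically from the statement text. The sketch below is one way to do that; it assumes that the slide numbers the statements 1 through 7 in the order they appear in the transcript, and the helper name `defs_and_uses` is invented for illustration.

```python
import re

# Assumption: statements are numbered in the order they appear on the slide.
STATEMENTS = {
    1: "if (c)",
    2: "x = y+1",
    3: "y = 2*z",
    4: "if (d)",
    5: "x = y+z",
    6: "z = 1",
    7: "z = x",
}


def defs_and_uses(stmt):
    """Return (defs, uses) for 'v = expr' or for a test 'if (expr)'."""
    def names(text):
        # Identifiers only; numeric literals are not variables.
        return set(re.findall(r"[A-Za-z_]\w*", text))

    if stmt.startswith("if"):
        return set(), names(stmt[2:])
    target, expr = stmt.split("=", 1)
    return {target.strip()}, names(expr)


table = {n: defs_and_uses(s) for n, s in STATEMENTS.items()}
for n, (d, u) in table.items():
    print(n, sorted(d), sorted(u))
```

A branch defines nothing and uses its condition; an assignment defines its left-hand side and uses every variable on its right-hand side.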

### Example: Equations

$in[I] = (out[I] - def[I]) \cup use[I]$

$out[B] = \bigcup_{B' \in succ(B)} in[B']$

[Figure: the flow graph with statements numbered 1 through 7, a blank equation to instantiate for each program point (L1 = ___ through L12 = ___), and a blank solution set for each (L1 = { ___ } through L12 = { ___ }).]
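The two live-variable equations can be solved by iterating to a fixed point. The following is a minimal sketch, not the slide's worked solution: the edge set in `SUCC` is an assumption (the figure's edges do not survive in the transcript), and nothing is assumed live at exit.

```python
# Hypothetical flow graph over the seven example statements (edges assumed).
SUCC = {1: [2, 3], 2: [4], 3: [4], 4: [5, 6], 5: [7], 6: [7], 7: []}
DEF = {1: set(), 2: {"x"}, 3: {"y"}, 4: set(), 5: {"x"}, 6: {"z"}, 7: {"z"}}
USE = {1: {"c"}, 2: {"y"}, 3: {"z"}, 4: {"d"}, 5: {"y", "z"}, 6: set(), 7: {"x"}}


def live_variables(succ, defs, uses):
    """Backward, some-path analysis: iterate until no live set changes."""
    live_in = {n: set() for n in succ}
    live_out = {n: set() for n in succ}
    changed = True
    while changed:
        changed = False
        # Visiting nodes in reverse order speeds up a backward analysis.
        for n in reversed(list(succ)):
            out_n = set().union(*(live_in[s] for s in succ[n]))
            in_n = (out_n - defs[n]) | uses[n]
            if out_n != live_out[n] or in_n != live_in[n]:
                live_out[n], live_in[n] = out_n, in_n
                changed = True
    return live_in, live_out


live_in, live_out = live_variables(SUCC, DEF, USE)
for n in SUCC:
    print(n, sorted(live_in[n]), sorted(live_out[n]))
```

With these assumed edges, `z = 1` has only `x` live on entry, because it overwrites `z` and the final statement reads `x`.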

### More Terminology

[Figure: flow graph with blocks B1, B2, B3, B4.]

- Successors: Succ(B1) = ___, Succ(B2) = ___, Succ(B3) = ___
- Predecessors: Pred(B2) = ___, Pred(B3) = ___, Pred(B4) = ___
- Branch node: a node with more than one successor
- Join node: a node with more than one predecessor

### Dominators

Dominance is a binary relation on the flow-graph nodes that allows us to easily find loops.

Node d dominates node i (d dom i) if every possible execution path from entry to i includes d.

Dominance is:

- Reflexive: every node dominates itself
- Transitive: if a dom b and b dom c, then a dom c
- Antisymmetric: if a dom b and b dom a, then a = b

[Figure: flow graph with nodes entry, B1, B2, B3, B4, B5, B6, exit.]

dom(entry) = ___, dom(b1) = ___, dom(b2) = ___, dom(b3) = ___, dom(b4) = ___, dom(b5) = ___, dom(b6) = ___, dom(exit) = ___
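Dominator sets can themselves be computed as a forward, all-paths data-flow problem: dom(n) is n plus the intersection of the dominator sets of n's predecessors. The sketch below uses the node names from the slide, but the edges in `SUCC` are an assumption, and every node is assumed reachable from entry.

```python
# Hypothetical edges; only the node names come from the slide.
SUCC = {
    "entry": ["B1"],
    "B1": ["B2", "B3"],
    "B2": ["B4"],
    "B3": ["B4"],
    "B4": ["B5", "B6"],
    "B5": ["exit"],
    "B6": ["exit"],
    "exit": [],
}


def dominators(succ, entry):
    """Iteratively compute dom(n) for every node (all nodes assumed reachable)."""
    nodes = list(succ)
    pred = {n: [] for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)

    # Start from the largest candidate sets and shrink them.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            new = set(nodes)
            for p in pred[n]:
                new &= dom[p]
            new.add(n)
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


dom = dominators(SUCC, "entry")
for n in SUCC:
    print(n, sorted(dom[n]))
```

In this assumed graph neither B2 nor B3 dominates B4, since B4 can be reached through either one.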

### Immediate dominators

idom(b) = a iff a ≠ b, a dom b, and there does not exist a node c, different from a and b, such that a dom c and c dom b.

- The idom of a node is unique.
- The idom relationship forms a tree whose root is the entry node.

[Figure: flow graph with nodes entry, B1, B2, B3, B4, B5, B6, exit.]

idom(b1) = ___, idom(b2) = ___, idom(b3) = ___, idom(b4) = ___, idom(b5) = ___, idom(b6) = ___, idom(exit) = ___
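Given dominator sets, the immediate dominator follows directly from the definition: it is the strict dominator that every other strict dominator also dominates. The sketch below hard-codes dominator sets for a hypothetical diamond-shaped graph (entry, B1, then B2 and B3 joining at B4); those sets are illustrative assumptions, not the slide's answers.

```python
# Dominator sets for an assumed diamond: entry -> B1 -> {B2, B3} -> B4.
DOM = {
    "entry": {"entry"},
    "B1": {"entry", "B1"},
    "B2": {"entry", "B1", "B2"},
    "B3": {"entry", "B1", "B3"},
    "B4": {"entry", "B1", "B4"},
}


def immediate_dominators(dom):
    """Map each non-entry node to its unique immediate dominator."""
    idom = {}
    for b, doms in dom.items():
        strict = doms - {b}
        for a in strict:
            # No node c lies between a and b exactly when the strict
            # dominators of b are the dominators of a (a itself included).
            if dom[a] == strict:
                idom[b] = a
    return idom


idom = immediate_dominators(DOM)
print(idom)
```

The entry node has no strict dominators, so it gets no entry in the result and becomes the root of the dominator tree.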

### Strict Dominators and Postdominators

- (d sdom i) if d dominates i and d ≠ i
- (p pdom i) if every possible execution path from i to exit includes p

[Figure: flow graph with nodes entry, B1, B2, B3, B4, B5, B6, exit.]

pdom(entry) = ___, pdom(b1) = ___, pdom(b2) = ___, pdom(b3) = ___, pdom(b4) = ___, pdom(b5) = ___, pdom(b6) = ___

### Loops

- Back edge: an edge whose head dominates its tail.
- A loop containing this type of back edge is a natural loop, i.e. it has a single external entry point.
- For a back edge b → c, the loop header is c.

[Figure: flow graph with nodes entry, B1, B2, B3, exit.]

Natural loops = ___, Loop header (B3 → B1) = ___, Loop header (B2 → B2) = ___
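The natural loop of a back edge b → c consists of the header c plus every node that can reach b without passing through c. A minimal sketch follows; the back edges B3 → B1 and B2 → B2 come from the slide, while the remaining edges in `PRED` are assumptions.

```python
# Predecessor lists for a hypothetical graph:
# entry -> B1 -> B2 -> B3 -> exit, plus back edges B2 -> B2 and B3 -> B1.
PRED = {
    "entry": [],
    "B1": ["entry", "B3"],
    "B2": ["B1", "B2"],
    "B3": ["B2"],
    "exit": ["B3"],
}


def natural_loop(pred, tail, header):
    """Nodes of the natural loop of the back edge tail -> header."""
    loop = {header}
    stack = [tail]
    while stack:
        n = stack.pop()
        if n not in loop:
            loop.add(n)
            # Walk backwards; the header is already in the set, so the
            # search never goes past it.
            stack.extend(pred[n])
    return loop


outer = natural_loop(PRED, "B3", "B1")
inner = natural_loop(PRED, "B2", "B2")
print(sorted(outer), sorted(inner))
```

Under these assumed edges the self-loop on B2 is nested inside the loop headed by B1.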

### Quicksort Example

How might we optimize this code? (The variables i, j, v, and x are needed outside.)

- b1: `i := m-1`; `j := n`; `t1 := 4*n`; `v := a[t1]`
- b2: `i := i+1`; `t2 := 4*i`; `t3 := a[t2]`; `if t3 < v goto b2`
- b3: `j := j-1`; `t4 := 4*j`; `t5 := a[t4]`; `if t5 > v goto b3`
- b4: `if i >= j goto b6`
- b5: `t6 := 4*i`; `x := a[t6]`; `t7 := 4*i`; `t8 := 4*j`; `t9 := a[t8]`; `a[t7] := t9`; `t10 := 4*j`; `a[t10] := x`
- b6: `t11 := 4*i`; `x := a[t11]`; `t12 := 4*i`; `t13 := 4*n`; `t14 := a[t13]`; `a[t12] := t14`; `t15 := 4*n`; `a[t15] := x`

[Quicksort]

### Reaching Definitions

Informally: determine whether a particular definition (e.g. “x” in “x = 5”) may reach a given point in the program.

Why reaching definitions may be useful: given `x := 5` followed by `y := x + 2`, if “x := 5” is the only definition reaching “y := x+2”, the latter can be simplified to “y := 7” (constant propagation).

### Reaching Definitions: Terms

A definition of a variable X is a statement that assigns (or may assign) a value to X.

- unambiguous: `X := 3`
- ambiguous: `foo(X)` or `*Y := 3`

A definition d reaches a point p if there is a path from the point immediately following d to p such that d is not killed along that path.

A definition d of variable X is killed along a path p if there is another definition of X along p.

### Reaching Definitions (cont.)

Has the following properties:

- forward analysis
- “along some path” problem

Is conservative in that:

- a definition d may be reported as reaching along a path p even though another definition of X appears along p, because that other definition is ambiguous and so is not assumed to kill d
- a definition d may be killed along infeasible paths

### Data-Flow Analysis: Structured Programs

Most programs are structured:

- sequence of statements
- if-then-else construct
- while-loops (including for-loops, loops with breaks, ...)

For these programs, we may use an inductive (syntax-driven) approach.

[Figure: three views of the same program, with regions 1, 2, 3; then 1, 2-3; then 1-2-3.]

### Reaching Definitions for Structured Programs

Sequence, S = S1 followed by S2:

- $gen[S] = gen[S2] \cup (gen[S1] - kill[S2])$
- $kill[S] = kill[S2] \cup (kill[S1] - gen[S2])$
- $in[S1] = in[S]$
- $in[S2] = out[S1]$
- $out[S] = out[S2]$

Single definition, S = `d: a = b+c`:

- $gen[S] = \{d\}$
- $kill[S] = \text{All-defs-of-}a - \{d\}$
- $out[S] = gen[S] \cup (in[S] - kill[S])$

### Reaching Definitions for Structured Programs (cont.)

Branch, S = either S1 or S2:

- $gen[S] = gen[S1] \cup gen[S2]$
- $kill[S] = kill[S1] \cap kill[S2]$
- $in[S1] = in[S2] = in[S]$
- $out[S] = out[S1] \cup out[S2]$

Loop, S = S1 repeated:

- $gen[S] = gen[S1]$
- $kill[S] = kill[S1]$
- $in[S1] = in[S] \cup gen[S1]$
- $out[S] = out[S1]$
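The inductive rules for single definitions, sequences, branches, and loops compose gen/kill sets bottom-up over the syntax tree. The sketch below encodes them as small functions; the `GenKill` class, the function names, and the three-definition example program are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenKill:
    gen: frozenset
    kill: frozenset


def assign(d, all_defs_of_target):
    """Single definition d: gen = {d}, kill = every other def of the target."""
    return GenKill(frozenset({d}), frozenset(all_defs_of_target) - {d})


def seq(s1, s2):
    """S1 followed by S2."""
    return GenKill(s2.gen | (s1.gen - s2.kill), s2.kill | (s1.kill - s2.gen))


def branch(s1, s2):
    """Either S1 or S2: a definition is killed only if both arms kill it."""
    return GenKill(s1.gen | s2.gen, s1.kill & s2.kill)


def loop(s1):
    """Loop whose body is S1."""
    return GenKill(s1.gen, s1.kill)


def out_of(s, in_set):
    return s.gen | (frozenset(in_set) - s.kill)


# Hypothetical program:  d1: a = 1;  if (...) { d2: a = 2 } else { d3: b = 3 }
s1 = assign("d1", {"d1", "d2"})
s2 = assign("d2", {"d1", "d2"})
s3 = assign("d3", {"d3"})
prog = seq(s1, branch(s2, s3))
print(sorted(prog.gen), sorted(prog.kill), sorted(out_of(prog, set())))
```

Here d1 still reaches the end of the program, because only one arm of the branch redefines `a`.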

### Iterative Solution: Data-Flow Equations

The inductive approach is only applicable to structured programs, because it uses the structure of the program to synthesize and distribute the data-flow information.

A general technique is needed: the iterative approach.

1. Compute the gen/kill sets of each statement / basic block
2. Initialize the in/out sets
3. Repeatedly compute the out/in sets until a steady state is reached

### Reaching Definitions: Equations

Reaching definitions: the set of definitions that may reach (along one or more paths) a given point.

- gen[S]: definition d is in gen[S] if d may reach the end of S, independently of whether it reaches the beginning of S.
- kill[S]: the set of definitions that never reach the end of S, even if they reach the beginning.

Equations:

$in[S] = \bigcup_{P \in pred(S)} out[P]$

$out[S] = gen[S] \cup (in[S] - kill[S])$

### Reaching Definitions: Algorithm

    for each basic block B:
        out[B] := gen[B];                                  (1)
    do
        change := false;
        for each basic block B do
            in[B] := ∪ (P a predecessor of B) out[P];      (2)
            old-out := out[B];                             (3)
            out[B] := gen[B] ∪ (in[B] - kill[B]);          (4)
            if (out[B] != old-out) then change := true;    (5)
        end
    while change
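The reaching-definitions algorithm, with its lines (1) through (5), translates almost directly into Python sets. This is a minimal sketch; the two-block graph used to exercise it (block A falling into a self-looping block B, each defining the same variable) is an invented example.

```python
def reaching_definitions(blocks, pred, gen, kill):
    """Iterate in/out sets to a fixed point; comments refer to lines (1)-(5)."""
    out = {b: set(gen[b]) for b in blocks}                        # (1)
    in_ = {b: set() for b in blocks}
    change = True
    while change:
        change = False
        for b in blocks:
            in_[b] = set().union(*(out[p] for p in pred[b]))      # (2)
            old_out = out[b]                                      # (3)
            out[b] = gen[b] | (in_[b] - kill[b])                  # (4)
            if out[b] != old_out:                                 # (5)
                change = True
    return in_, out


# Hypothetical graph: A -> B, B -> B; d1 (in A) and d2 (in B) define the same variable.
BLOCKS = ["A", "B"]
PRED = {"A": [], "B": ["A", "B"]}
GEN = {"A": {"d1"}, "B": {"d2"}}
KILL = {"A": {"d2"}, "B": {"d1"}}

in_sets, out_sets = reaching_definitions(BLOCKS, PRED, GEN, KILL)
print(in_sets, out_sets)
```

Both definitions reach the top of B (one from A, one around the loop), but only d2 survives to the bottom.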

### Example for Reaching Definitions

Compute gen/kill and iterate (visiting order: b1, b2, b3, b4).

[Figure: flow graph of four basic blocks.]

- b1: d1: `i := m-1`; d2: `j := n`; d3: `a := u1`
- b2: d4: `i := i+1`; d5: `j := j-1`
- b3: d6: `a := u2`
- b4: d7: `i := u3`

gen/kill sets:

- gen[b1] := {d1, d2, d3}, kill[b1] := {d4, d5, d6, d7}
- gen[b2] := { ___ }, kill[b2] := { ___ }
- gen[b3] := { ___ }, kill[b3] := { ___ }
- gen[b4] := { ___ }, kill[b4] := { ___ }

The in[B] and out[B] sets are tabulated as seven-bit vectors, one bit per definition (written `000 0000`), for b1 through b4 at four stages: initial, pass 1, pass 2, and pass 3.
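The example can be run end to end once the flow-graph edges are fixed. The sketch below derives gen and kill from the definitions d1 through d7 and iterates in the stated visiting order. The predecessor lists in `PRED` are an assumption (the figure's edges are not in the transcript), chosen so that b2 through b4 form a loop; different edges would give different in/out vectors.

```python
# Variable defined by each definition, as listed in the example.
DEFS = {"d1": "i", "d2": "j", "d3": "a", "d4": "i", "d5": "j", "d6": "a", "d7": "i"}
BLOCKS = {"b1": ["d1", "d2", "d3"], "b2": ["d4", "d5"], "b3": ["d6"], "b4": ["d7"]}
# Assumed edges: b1->b2, b2->b3, b2->b4, b3->b4, b4->b2.
PRED = {"b1": [], "b2": ["b1", "b4"], "b3": ["b2"], "b4": ["b2", "b3"]}


def gen_kill(block_defs):
    # gen is simply the block's definitions, which is valid here because
    # no block defines the same variable twice.
    gen = set(block_defs)
    variables = {DEFS[d] for d in gen}
    kill = {d for d, v in DEFS.items() if v in variables} - gen
    return gen, kill


def bits(defs):
    """Render a set of definitions as a bit vector such as '111 0000'."""
    v = "".join("1" if d in defs else "0" for d in DEFS)
    return v[:3] + " " + v[3:]


GEN, KILL = {}, {}
for b, ds in BLOCKS.items():
    GEN[b], KILL[b] = gen_kill(ds)

OUT = {b: set(GEN[b]) for b in BLOCKS}
IN = {b: set() for b in BLOCKS}
passes = 0
change = True
while change:
    change = False
    passes += 1
    for b in BLOCKS:  # visiting order b1, b2, b3, b4
        IN[b] = set().union(*(OUT[p] for p in PRED[b]))
        new_out = GEN[b] | (IN[b] - KILL[b])
        if new_out != OUT[b]:
            OUT[b] = new_out
            change = True

for b in BLOCKS:
    print(b, "in", bits(IN[b]), "out", bits(OUT[b]))
```

The derived gen[b1] and kill[b1] agree with the values given on the slide, which is a useful sanity check on the gen/kill construction.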

### Generalizations: Other Data-Flow Analyses

Reaching definitions is a forward, some-path analysis.

- For a backward analysis: interchange the in and out sets in the previous algorithm, lines (1-5), and take successors in place of predecessors in line (2).
- For an all-path analysis: substitute intersection for union in line (2).

### Common Subexpression Elimination

Rule used to eliminate a subexpression within a basic block:

- The subexpression was already defined.
- The value of the subexpression is not modified, i.e. none of the values needed to compute the subexpression are redefined.

What about eliminating subexpressions across basic blocks?

### Available Expressions

An expression x+y is available at a point p if every path from the initial node to p evaluates x+y and, after the last such evaluation prior to reaching p, there are no subsequent assignments to x or y.

Properties: forward, all-path analysis.

Definitions:

- e-gen[S]: expressions definitely generated by S, e.g. for “z := x+y” the expression “x+y” is generated.
- e-kill[S]: expressions that may be killed by S, e.g. for “z := x+y” all expressions containing “z” are killed.
- Order matters: compute e-gen and then e-kill, e.g. for “x := x+y” the expression “x+y” is generated and then killed.

### Available Expressions (cont.)

Algorithm:

    for each basic block B:
        out[B] := e-gen[B];                                (1)
    do
        change := false;
        for each basic block B do
            in[B] := ∩ (P a predecessor of B) out[P];      (2)
            old-out := out[B];                             (3)
            out[B] := e-gen[B] ∪ (in[B] - e-kill[B]);      (4)
            if (out[B] != old-out) then change := true;    (5)
        end
    while change

Difference from reaching definitions: line (2) uses intersection instead of union.
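A sketch of the available-expressions iteration in Python follows. It deliberately departs from line (1) of the algorithm in one respect, which is a swapped-in variant rather than the slide's method: non-entry blocks start from the universe of expressions minus e-kill, a common initialization that avoids losing expressions around loops, and the entry block's in set is fixed at empty. The diamond-shaped graph and the expressions are invented for illustration.

```python
def available_expressions(blocks, entry, pred, e_gen, e_kill, universe):
    """Forward, all-path analysis with an optimistic start for non-entry blocks."""
    out = {b: set(universe) - e_kill[b] for b in blocks}   # variant of line (1)
    out[entry] = set(e_gen[entry])
    in_ = {b: set() for b in blocks}                       # nothing available at entry
    change = True
    while change:
        change = False
        for b in blocks:
            if b == entry:
                continue
            in_[b] = set(universe)
            for p in pred[b]:
                in_[b] &= out[p]                           # line (2): intersection
            new_out = e_gen[b] | (in_[b] - e_kill[b])      # line (4)
            if new_out != out[b]:
                out[b] = new_out
                change = True
    return in_, out


# Hypothetical diamond: b1 -> b2, b1 -> b3, b2 -> b4, b3 -> b4.
# b1 evaluates x+y and a*b; b3 assigns x, which kills x+y.
BLOCKS = ["b1", "b2", "b3", "b4"]
PRED = {"b1": [], "b2": ["b1"], "b3": ["b1"], "b4": ["b2", "b3"]}
UNIVERSE = {"x+y", "a*b"}
E_GEN = {"b1": {"x+y", "a*b"}, "b2": set(), "b3": set(), "b4": set()}
E_KILL = {"b1": set(), "b2": set(), "b3": {"x+y"}, "b4": set()}

avail_in, avail_out = available_expressions(BLOCKS, "b1", PRED, E_GEN, E_KILL, UNIVERSE)
print(avail_in["b4"])
```

At the join b4 only `a*b` is available, because `x+y` is killed on one of the two incoming paths.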

### Pointer Analysis

Identify the memory locations that may be addressed by a pointer. This may be formalized as a system of data-flow equations.

Simple programming model:

- pointers to integer (or float, arrays of integer, arrays of float)
- no pointers to pointers allowed

Definitions:

- in[S]: the set of pairs (p, a), where p is a pointer, a is a variable, and p might point to a before statement S.
- out[S]: the set of pairs (p, a), where p might point to a after statement S.
- gen[S]: the new pairs (p, a) generated by the statement S.
- kill[S]: the pairs (p, a) killed by the statement S.

### Pointer Analysis (cont.)

- S: `a = b+c`
  - gen[S] = { }
  - kill[S] = { }
- S: `p = &a`
  - gen[S] = { (p, a) }
  - kill[S, input set] = { (p, b) | (p, b) is in input set }
- S: `p = q`
  - gen[S, input set] = { (p, b) | (q, b) is in input set }
  - kill[S, input set] = { (p, b) | (p, b) is in input set }
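These three cases can be written as a single transfer function over sets of (pointer, variable) pairs. The sketch below is illustrative only: the tuple encoding of statements (`"addr"`, `"copy"`, `"other"`) and the function name are assumptions made for the example.

```python
def transfer(stmt, in_set):
    """Apply out = gen ∪ (in - kill), where gen and kill depend on the input set."""
    kind = stmt[0]
    if kind == "addr":            # p = &a
        _, p, a = stmt
        gen = {(p, a)}
        kill = {(r, b) for (r, b) in in_set if r == p}
    elif kind == "copy":          # p = q
        _, p, q = stmt
        gen = {(p, b) for (r, b) in in_set if r == q}
        kill = {(r, b) for (r, b) in in_set if r == p}
    else:                         # non-pointer statement such as a = b+c
        gen, kill = set(), set()
    return gen | (in_set - kill)


# Hypothetical straight-line program: p = &a; q = &b; a = b+c; p = q
program = [("addr", "p", "a"), ("addr", "q", "b"), ("other",), ("copy", "p", "q")]
state = set()
history = []
for stmt in program:
    state = transfer(stmt, state)
    history.append(set(state))
print(history)
```

After the final copy, `p` no longer points to `a`: the old pair is killed and `p` inherits whatever `q` might point to.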

### Pointer Analysis: Algorithm

    for each basic block B:
        out[B] := gen[];                                           (1)
    do
        change := false;
        for each basic block B do
            in[B] := ∪ (P a predecessor of B) out[P];              (2)
            old-out := out[B];                                     (3)
            out[B] := gen[B, in[B]] ∪ (in[B] - kill[B, in[B]]);    (4)
            if (out[B] != old-out) then change := true;            (5)
        end
    while change

Difference from reaching definitions: in line (4), gen and kill are functions of B and in[B].

### Performance of Iterative Solutions

Global analysis may be memory-space / computing intensive. The cost may be reduced by:

- using bitvector representations for sets
- analyzing only relevant variables (e.g. temporary variables may be ignored)
- synthesizing data-flow within basic blocks
- mixing inductive and iterative solutions
- suitably ordering the basic blocks (e.g. depth-first order is good for forward analysis)
- limiting scope (this may reduce the precision of the analysis)
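To illustrate the bitvector representation, the sketch below packs a set of definitions into a Python integer, so that the transfer function becomes two bitwise operations. The definition names d1 through d7 and the sample gen/kill/in sets are illustrative assumptions.

```python
# Bit position of each definition (d1 is bit 0, d7 is bit 6).
INDEX = {f"d{k}": k - 1 for k in range(1, 8)}


def to_bits(defs):
    """Pack a set of definition names into an integer bit vector."""
    v = 0
    for d in defs:
        v |= 1 << INDEX[d]
    return v


def from_bits(v):
    """Unpack an integer bit vector back into a set of definition names."""
    return {d for d, i in INDEX.items() if v & (1 << i)}


def transfer(gen_bits, kill_bits, in_bits):
    # out = gen ∪ (in - kill): union is OR, set difference is AND NOT.
    return gen_bits | (in_bits & ~kill_bits)


gen_b = to_bits({"d4", "d5"})
kill_b = to_bits({"d1", "d2", "d7"})
in_b = to_bits({"d1", "d2", "d3", "d5", "d6", "d7"})
out_b = transfer(gen_b, kill_b, in_b)
print(sorted(from_bits(out_b)))
```

Each union, intersection, or difference then costs one machine operation per word of the vector instead of one hash lookup per element.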

### Summary

Iterative algorithm: solves a data-flow problem for an arbitrary control-flow graph.

To solve a new data-flow problem:

- define gen/kill accordingly
- determine its properties:
  - forward / backward
  - some-path / all-path
