# 1 Introduction to Data Flow Analysis. 2 Data Flow Analysis Construct representations for the structure of flow-of-data of programs based on the structure.

## Presentation on theme: "1 Introduction to Data Flow Analysis. 2 Data Flow Analysis Construct representations for the structure of flow-of-data of programs based on the structure."— Presentation transcript:

1 Introduction to Data Flow Analysis

2 Data Flow Analysis Construct representations for the structure of flow-of-data of programs based on the structure of flow-of-control of programs Collect information about the attributes of data at various program points according to the structure of flow-of-data of programs

3 Points Within each basic block, a point is assigned between two adjacent statements, before the first statement, and after the last statement

4 An Example d 1 : i = m - 1 d 2 : j = n d 3 : a = u1 d 4 : i = i + 1 d 5 : j = j - 1 d 6 : a = u2 B1B1 B2B2 B3B3 B4B4 B5B5 B6B6

5 Paths A path from p 1 to p n is a sequence of points p 1, p 2, …, p n such that for each i, 1  i  n-1, either p i is the point immediately preceding a statement and p i+1 is the point immediately following that statement in the same block, or p i is the end of some block and p i+1 is the beginning of a successor block

6 An Example d 1 : i = m - 1 d 2 : j = n d 3 : a = u1 d 4 : i = i + 1 d 5 : j = j - 1 d 6 : a = u2 B1B1 B2B2 B3B3 B4B4 B5B5 B6B6 d 7 : i = u3 if e3

7 Reaching Definitions A definition of a variable x is a statement that assigns or may assign a value to x A definition d of some variable x reaches a point p if there is a path from the point immediately following d to p such that no unambiguous definition of x appear on that path

8 An Example d 1 : i = m - 1; d 2 : j = n; d 3 : a = u1; do d 4 : i = i + 1; d 5 : j = j - 1; if e1 then d 6 : a = u2 else d 7 : i = u3 while e2

9 Ambiguity of Definitions Unambiguous definitions (must assign values) –assignments to a variable –statements that read a value to a variable Ambiguous definitions (may assign values) –procedure calls that have call-by-reference parameters –procedure calls that may access nonlocal variables –assignments via pointers

10 Safe or Conservative Information Consider all execution paths of the control flow graph Allow definitions to pass through ambiguous definitions of the same variables The computed set of reaching definitions is a superset of the exact set of reaching definitions

11 Information for Reaching Definitions gen[S]: definitions generated within S and reaching the end of S kill[S]: definitions killed within S in[S]: definitions reaching the beginning of S out[S]: definitions reaching the end of S

12 Data Flow Equations Data flow information can be collected by setting up and solving systems of equations that relate information at various points out[S] = gen[S]  (in[S] - kill[S]) The information at the end of a statement is either generated within the statement or enters at the beginning and is not killed as control flows through the statement

13 The Iterative Algorithm Repeatedly compute in and out sets for each node in the control flow graph simultaneously until there is no change in[B] =  p  pred(B) out[P] out[B] = gen[B]  (in[B] - kill[B])

14 Algorithm: Reaching Definitions /* Assume in[B] =  for all B */ for each block B do out[B] := gen[B] change := true; while change do begin change := false; for each block B do begin in[B] :=  p  pred(B) out[p] oldout := out[B] out[B] := gen[B]  (in[B] - kill[B]) if out[B]  oldout then change := true end

15 An Example d 1 : i = m - 1 d 2 : j = n d 3 : a = u1 d 4 : i = i + 1 d 5 : j = j - 1 d 6 : a = u2 B1B1 B2B2 B3B3 B4B4 d 7 : i = u3 111 0000 000 1111 000 1100 110 0001 000 0010 001 0000 000 0001 100 1000

16 An Example Block InitialPass 1Pass 2 In[B] Out[B] 000 0000 111 0000000 0000111 0000 000 0000111 0011001 1110111 1111001 1110000 1100 000 0000001 1110001 0111001 1110001 0111000 0001 000 0000001 1110000 1110001 1110000 1110000 0010 B1B1 B2B2 B3B3 B4B4

17 Conservative Computation The computed gen set of reaching definitions is a superset of the exact gen set of reaching definitions The computed kill set of reaching definitions is a subset of the exact kill set of reaching definitions The computed in and out sets of reaching definitions is a superset of the exact in and out sets of reaching definitions

18 Local Data Flow Information The gen and kill sets for a basic block is obtained from the gen and kill sets for the statements in the basic block Only the in and out sets for the basic blocks are computed in the global data flow analysis The in and out sets for the statements in a basic block can be computed locally from the in set for the basic block if necessary

19 UD-Chains and DU-Chains A variable is used at statement s if its r- value may be required The reaching definitions information is often stored as use-definition chains (or ud- chains) The ud-chain for a use u of a variable x is the list of all the definitions of x that reach u The definition-use chains (or du-chains) for a definition d of a variable x is the list of all the uses of x that use the value defined at d

20 A Taxonomy of Data Flow Problems in[B] =  p  pred(B) out[p] out[B] = gen[B]  (in[B] - kill[B]) out[B] =  s  succ(B) in[s] in[B] = gen[B]  (out[B] - kill[B]) in[B] =  p  pred(B) out[p] out[B] = gen[B]  (in[B] - kill[B]) out[B] =  s  succ(B) in[s] in[B] = gen[B]  (out[B] - kill[B]) Any path All path Forward-FlowBackward-Flow

21 Available Expressions An expression x+y is available at a point p if every path from the initial node to p evaluates x+y, and after the last such evaluation prior to reaching p, there are no subsequent assignments to x or y

22 An Example t 2 = 4 * i t 1 = 4 * ii = … t 0 = 4 * i t 2 = 4 * i t 1 = 4 * i ?

23 The gen and kill Sets A block kills expression x+y if it possibly assign x or y and does not subsequently reevaluate x+y A block generates expression x+y if it definitely evaluates x+y and does not subsequently redefine x or y

24 The gen Set for a Block No expressions are available at the beginning Assume set A of expressions is available before statement x = y+z. The set of expressions available after the statement is formed by –adding to A the expression y+z –deleting from A any expression involving x At the end, A is the set of generated expressions

25 The kill Set for a Block All expressions y+z such that either y or z is defined and y+z is not generated by the block

26 An Example StatementsAvailable Expressions ……………….…………none a = b + c ……………….…………only b + c b = a - d ……………….…………only a - d c = b + c …………….……………only a - d d = a - d …………….……………none

27 The in and out Sets in[B] = ,for B = initial in[B] =  p  pred(B) out[p],for B  initial out[B] = gen[B]  (in[B] - kill[B])

28 Initialization of the in Sets B1B1 B2B2 O j+1 = G  (I j - K) I j+1 = out[B 1 ]  O j+1 I 0 =  O 1 = G I 1 = out[B 1 ]  G O 2 = G I 0 = U O 1 = U - K I 1 = out[B 1 ] - K O 2 = G  (out[B 1 ] - K)

29 Algorithm: Available Expressions /* Assume in[B 1 ] =  and in[B] = U for all B  B 1 */ in[B 1 ] =  ; out[B 1 ] = gen[B 1 ]; for each block B  B 1 do out[B] := U - kill[B]; change := true; while change do begin change := false; for each block B  B 1 do begin in[B] :=  p  pred(B) out[p] oldout := out[B] out[B] := gen[B]  (in[B] - kill[B]) if out[B]  oldout then change := true end

30 Conservative Computation The computed gen set of available expressions is a subset of the exact gen set of available expressions The computed kill set of available expressions is a superset of the exact kill set of available expressions The computed in and out sets of available expressions is a subset of the exact in and out sets of available expressions

31 Live Variables A variable x is live at a point p if the value of x at p could be used along some path in the control flow graph starting at p; otherwise, x is dead at p

32 The def and use Sets def[B]: the set of variables definitely assigned values in B use[B]: the set of variables whose values are possibly used in B prior to any definition of the variable

33 The in and out Sets out[B] =  s  succ(B) in[s] in[B] = use[B]  (out[B] - def[B])

34 Algorithm: Live Variables /* Assume in[B] =  for all B */ for each block B do in[B] :=  change := true; while change do begin change := false; for each block B do begin out[B] :=  s  succ(B) in[s] oldin := in[B] in[B] := use[B]  (out[B] - def[B]) if in[B]  oldin then change := true end

35 Conservative Computation The computed use set of live variables is a superset of the exact use set of live variables The computed def set of live variables is a subset of the exact def set of live variables The computed in and out sets of live variables is a superset of the exact in and out sets of live variables

36 Busy Expressions An expression is busy at a point p if along all paths from p to the final node its value is used before the expression is killed

37 The use and kill Sets use[B]: the set of expressions that are used before they are killed in B kill[B]: the set of expressions that are killed before they are used in B

38 The in and out Sets out[B] = ,for B = final out[B] =  s  succ(B) in[s],for B  final in[B] = use[B]  (out[B] - kill[B])

39 Algorithm: Busy Expressions /* Assume out[B n ] =  and out[B] = U for all B  B n */ out[B n ] =  ; in[B n ] = use[B n ]; for each block B  B n do in[B] := U - kill[B]; change := true; while change do begin change := false; for each block B  B n do begin out[B] :=  s  succ(B) in[s] oldin := in[B] in[B] := use[B]  (out[B] - kill[B]) if in[B]  oldin then change := true end

40 Conservative Computation The computed use set of busy expressions is a subset of the exact use set of busy expressions The computed kill set of busy expressions is a superset of the exact kill set of busy expressions The computed in and out sets of busy expressions is a subset of the exact in and out sets of busy expressions

Download ppt "1 Introduction to Data Flow Analysis. 2 Data Flow Analysis Construct representations for the structure of flow-of-data of programs based on the structure."

Similar presentations