1 Ben Livshits. Based in part on Stanford class slides from http://infolab.stanford.edu/~ullman/dragon/w06/w06.html

2 Really basic stuff: Flow Graphs; Constant Folding; Global Common Subexpressions; Induction Variables/Reduction in Strength
Data-flow analysis: Proving Little Theorems; Data-Flow Equations; Major Examples
Pointer analysis

3 Compiler Organization

4 L2: Compiler Organization; Dataflow analysis basics
L3: Dataflow lattices; Iterative dataflow solution; Gen/kill frameworks

5 L10: Pointer analysis
L11: Pointer analysis and bddbddb

6 Really Basic Stuff: Flow Graphs; Constant Folding; Global Common Subexpressions; Induction Variables/Reduction in Strength

7 Dawn of Code Optimization
• A never-published Stanford technical report by Fran Allen in 1968.
• Flow graphs of intermediate code.
• Key things worth doing.

8 Intermediate Code
    for (i=0; i<n; i++) A[i] = 1;
• Intermediate code exposes optimizable constructs we cannot see at the source-code level.
• Make flow explicit by breaking the code into basic blocks: sequences of steps with entry only at the beginning and exit only at the end.

9 Basic Blocks
Source: for (i=0; i<n; i++) A[i] = 1;
Flow graph (one basic block per bracket):
    [ i = 0 ]
    [ if i>=n goto … ]
    [ t1 = 8*i; A[t1] = 1; i = i+1 ]

10 Induction Variables
• x is an induction variable in a loop if it takes on a linear sequence of values each time through the loop.
• Common case: a loop index like i and a computed array index like t1.
• Eliminate “superfluous” induction variables.
• Replace multiplication by addition (reduction in strength).

11 Example
Before:
    i = 0
    if i>=n goto …
    t1 = 8*i
    A[t1] = 1
    i = i+1
After:
    t1 = 0
    n1 = 8*n
    if t1>=n1 goto …
    A[t1] = 1
    t1 = t1+8
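A minimal Python sketch of the induction-variable example, under the assumption that A can be modelled as a dictionary keyed by the offset t1. The function names are hypothetical. It only illustrates that the loop driven by i and the loop driven by t1 alone write the same offsets.

```python
def original_loop(n):
    """Loop with two induction variables: i and t1 = 8*i."""
    A = {}                      # assumed model of the array, keyed by offset
    i = 0
    while not i >= n:           # "if i>=n goto ..." exits the loop
        t1 = 8 * i              # one multiplication per iteration
        A[t1] = 1
        i = i + 1
    return A


def strength_reduced_loop(n):
    """Same loop with i eliminated and 8*i replaced by repeated addition."""
    A = {}
    t1 = 0
    n1 = 8 * n                  # computed once, before the loop
    while not t1 >= n1:
        A[t1] = 1
        t1 = t1 + 8             # addition replaces the multiplication
    return A
```

Calling both functions with the same n should give equal dictionaries, including the n = 0 case where neither loop body runs.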

12 Loop-Invariant Code Motion
• Sometimes a computation yields the same value each time around a loop.
• Move it before the loop to save n-1 computations.
  ◦ Be careful: could n=0? That is, the loop body might execute 0 times, in which case the moved computation is extra work.

13 Example
Before:
    i = 0
    if i>=n goto …
    t1 = y+z
    x = x+t1
    i = i+1
After:
    i = 0
    t1 = y+z
    if i>=n goto …
    x = x+t1
    i = i+1
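The saving of n-1 computations, and the n=0 caveat, can be checked with a small Python sketch. The function names and the evaluation counter are assumptions added for illustration; they are not part of the slides.

```python
def before_motion(x, y, z, n):
    """y+z is recomputed on every trip through the loop."""
    evaluations = 0
    i = 0
    while i < n:
        t1 = y + z
        evaluations += 1        # count each evaluation of y+z
        x = x + t1
        i = i + 1
    return x, evaluations


def after_motion(x, y, z, n):
    """y+z is hoisted in front of the loop and evaluated exactly once."""
    i = 0
    t1 = y + z
    evaluations = 1
    while i < n:
        x = x + t1
        i = i + 1
    return x, evaluations
```

With n = 5 the hoisted version evaluates y+z once instead of five times. With n = 0 it evaluates y+z once where the original evaluated it zero times, which is the case the slide warns about.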

14 Constant Folding
• Sometimes a variable has a known constant value at a point.
• If so, replacing the variable by the constant simplifies and speeds up the code.
• Easy within a basic block; harder across blocks.

15 Example
Before:
    i = 0
    n = 100
    if i>=n goto …
    t1 = 8*i
    A[t1] = 1
    i = i+1
After:
    t1 = 0
    if t1>=800 goto …
    A[t1] = 1
    t1 = t1+8

16 Global Common Subexpressions
• Suppose block B has a computation of x+y.
• Suppose that whenever we reach this computation, we are sure to have:
  1. Computed x+y, and
  2. Not subsequently reassigned x or y.
• Then we can hold the value of x+y in a temporary and use it in B.

17 Example
Before (one basic block per bracket):
    [ a = x+y ]   [ b = x+y ]   [ c = x+y ]
After:
    [ t = x+y; a = t ]   [ t = x+y; b = t ]   [ c = t ]

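18 Example --- Even Better
[Figure: a second flow-graph example, before and after the transformation, built from the statements t = x+y, a = t, b = t, and c = t; the block layout is not recoverable from the transcript.]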

19 Data-Flow Analysis: Proving Little Theorems; Data-Flow Equations; Major Examples

20 An Obvious Theorem
    boolean x = true;
    while (x) {
        ... // no change to x
    }
• Doesn’t terminate.
• Proof: the only assignment to x is at the top, so x is always true.

21 As a Flow Graph
    x = true
    if x == true
    “body”

22 Formulation: Reaching Definitions
• Each place some variable x is assigned is a definition.
• Ask: for this use of x, where could x last have been defined?
• In our example: only at x=true.

23 Example: Reaching Definitions
    d1: x = true
    if x == true
    d2: a = 10
[Figure: flow-graph edges are annotated with the definitions, d1 and d2, that reach them.]

24 Clincher
• Since d1 is the only definition of x that reaches the test x == true, x must be true at that point.
• The conditional is not really a conditional and can be replaced by an unconditional branch.

25 Not Always That Easy
    int i = 2; int j = 3;
    while (i != j) {
        if (i < j) i += 2;
        else j += 2;
    }
• We’ll develop techniques for this problem, but later …

26 The Flow Graph
    d1: i = 2
    d2: j = 3
    if i != j
    if i < j
    d3: i = i+2
    d4: j = j+2
[Figure: edges are annotated with sets of reaching definitions, including {d1, d2, d3, d4}, {d2, d3, d4}, and {d1, d3, d4}.]

27 DFA Is Sometimes Insufficient
• In this example, i can be defined in two places, and j in two places.
• No obvious way to discover that i!=j is always true.
• But OK, because reaching definitions is sufficient to catch most opportunities for constant folding (replacement of a variable by its only possible value).

28 Be Conservative!
• (Code optimization only)
• It’s OK to discover a subset of the opportunities to make some code-improving transformation.
• It’s not OK to think you have an opportunity that you don’t really have.

29 Example: Be Conservative
    boolean x = true;
    while (x) {
        ... *p = false; ...
    }
• Is it possible that p points to x?

30 As a Flow Graph
    d1: x = true
    if x == true
    d2: *p = false    (another def of x)
[Figure: edges are annotated with the reaching definitions d1 and d2.]

31 Possible Resolution
• Just as data-flow analysis of “reaching definitions” can tell what definitions of x might reach a point, another DFA can eliminate cases where p definitely does not point to x.
• Example: the only definition of p is p = &y and there is no possibility that y is an alias of x.

32 Reaching Definitions Formalized
• A definition d of a variable x is said to reach a point p in a flow graph if:
  1. Some path from the entry of the flow graph to p has d on the path, and
  2. After the last occurrence of d on that path, x is not definitely redefined.

33 Data-Flow Equations --- (1)
• A basic block can generate a definition.
• A basic block can either:
  1. Kill a definition of x if it surely redefines x.
  2. Transmit a definition if it may not redefine the same variable(s) as that definition.

34 Data-Flow Equations --- (2)
• Variables:
  1. IN(B) = set of definitions reaching the beginning of block B.
  2. OUT(B) = set of definitions reaching the end of B.

35 Data-Flow Equations --- (3)
• Two kinds of equations:
  1. Confluence equations: IN(B) in terms of the OUTs of the predecessors of B.
  2. Transfer equations: OUT(B) in terms of IN(B) and what goes on in block B.

36 Confluence Equations
    IN(B) = ∪ { OUT(P) : P is a predecessor of B }
[Figure: block B has predecessors P1 and P2 whose OUT sets are {d1, d2} and {d2, d3}, so IN(B) = {d1, d2, d3}.]

37 Transfer Equations
• Generate a definition in the block if its variable is not definitely rewritten later in the basic block.
• Kill a definition if its variable is definitely rewritten in the block.
• An internal definition may be both killed and generated.

38 Example: Gen and Kill
    d1: y = 3
    d2: x = y+z
    d3: *p = 10
    d4: y = 5
IN = {d2(x), d3(y), d3(z), d5(y), d6(y), d7(z)}
Kill includes {d1(x), d2(x), d3(y), d5(y), d6(y), …}
Gen = {d2(x), d3(x), d3(z), …, d4(y)}
OUT = {d2(x), d3(x), d3(z), …, d4(y), d7(z)}

39 Transfer Function for a Block
• For any block B: OUT(B) = (IN(B) – Kill(B)) ∪ Gen(B)
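A rough Python sketch of how Gen and Kill might be computed for one block and fed into the transfer function. It assumes a simplified setting with direct assignments only (no stores through pointers such as *p), and the block, the definition names, and the helper names are all hypothetical.

```python
def gen_kill(block, all_defs):
    """block: ordered list of (definition id, variable) pairs in one basic block.
    all_defs: definition id -> variable, for every definition in the program."""
    assigned = {var for _, var in block}
    # Kill every definition of a variable that this block definitely rewrites.
    kill = {d for d, var in all_defs.items() if var in assigned}
    # Generate only the last definition of each variable in the block,
    # because earlier ones are rewritten later in the same block.
    last_def = {}
    for d, var in block:
        last_def[var] = d
    gen = set(last_def.values())
    return gen, kill


def transfer(in_set, gen, kill):
    """OUT(B) = (IN(B) - Kill(B)) | Gen(B)."""
    return (in_set - kill) | gen


# Hypothetical program with five definitions; the block holds d1, d2, d3.
all_defs = {"d1": "y", "d2": "x", "d3": "y", "d4": "x", "d5": "z"}
block = [("d1", "y"), ("d2", "x"), ("d3", "y")]
gen, kill = gen_kill(block, all_defs)
out_set = transfer({"d4", "d5"}, gen, kill)
```

Here d1 is killed by d3 inside the block, so only d2 and d3 are generated, while d5 (a definition of z, which the block never assigns) is transmitted unchanged.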

40 Iterative Solution to Equations
• For an n-block flow graph, there are 2n equations in 2n unknowns.
• Alas, the solution is not unique.
• Use iterative solution to get the least fixed point.
  ◦ Identifies any def that might reach a point.

41 Iterative Solution --- (2)
    IN(entry) = ∅;
    for each block B do OUT(B) = ∅;
    while (changes occur) do
        for each block B do {
            IN(B) = ∪ { OUT(P) : P is a predecessor of B };
            OUT(B) = (IN(B) – Kill(B)) ∪ Gen(B);
        }

42 Example: Reaching Definitions
    B1: d1: x = 5
    B2: if x == 10
    B3: d2: x = 15
IN(B1) = {}          OUT(B1) = {d1}
IN(B2) = {d1, d2}    OUT(B2) = {d1, d2}
IN(B3) = {d1, d2}    OUT(B3) = {d2}
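The iterative algorithm can be sketched in Python and run on the three-block example. The edge set used here (B1 to B2, B2 to B3, and B3 back to B2) is an assumption, since the transcript does not preserve the arrows of the figure; the function name and the dictionary encoding are likewise illustrative.

```python
def reaching_definitions(blocks, preds, gen, kill):
    """Round-robin iteration until no OUT set changes (least fixed point)."""
    in_sets = {b: set() for b in blocks}
    out_sets = {b: set() for b in blocks}       # OUT(B) starts empty
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # Confluence: union of the OUT sets of all predecessors.
            in_sets[b] = set().union(*(out_sets[p] for p in preds[b]))
            # Transfer: OUT(B) = (IN(B) - Kill(B)) | Gen(B).
            new_out = (in_sets[b] - kill[b]) | gen[b]
            if new_out != out_sets[b]:
                out_sets[b] = new_out
                changed = True
    return in_sets, out_sets


blocks = ["B1", "B2", "B3"]
preds = {"B1": [], "B2": ["B1", "B3"], "B3": ["B2"]}    # assumed edges
gen = {"B1": {"d1"}, "B2": set(), "B3": {"d2"}}
kill = {"B1": {"d2"}, "B2": set(), "B3": {"d1"}}        # d1 and d2 both define x
in_sets, out_sets = reaching_definitions(blocks, preds, gen, kill)
```

Under these assumed edges, d2 flows around the back edge into B2 on the second pass, and the third pass changes nothing, so the loop stops.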

43 Aside: Notice the Conservatism
• Not only the most conservative assumption about when a def is killed or gen’d.
• Also the conservative assumption that any path in the flow graph can actually be taken.

44 Everything Else About Data Flow Analysis: Flow- and Context-Sensitivity; Logical Representation; Pointer Analysis; Interprocedural Analysis

45 Three Levels of Sensitivity
• In DFA so far, we have cared about where in the program we are.
  ◦ Called flow-sensitivity.
• But we didn’t care how we got there.
  ◦ Caring about that is called context-sensitivity.
• We could even care about neither.
  ◦ Example: where could x ever be defined in this program?

46 Flow/Context Insensitivity
• Not so bad when program units are small (few assignments to any variable).
• Example: Java code often consists of many small methods.
  ◦ Remember: you can distinguish variables by their full name, e.g., class.method.block.identifier.

47 Context Sensitivity
• Can distinguish paths to a given point.
• Example: If we remembered paths, we would not have the problem in the constant-propagation framework where x+y = 5 but neither x nor y is constant over all paths.

48 The Example Again
Two alternative blocks, [ x = 3; y = 2 ] and [ x = 2; y = 3 ], both flow into [ z = x+y ].

49 An Interprocedural Example
    int id(int x) {return x;}
    void p() {a=2; b=id(a);…}
    void q() {c=3; d=id(c);…}
• If we distinguish p calling id from q calling id, then we can discover b=2 and d=3.
• Otherwise, we think b, d = {2, 3}.
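The difference can be illustrated with a toy Python sketch that tracks the set of possible values flowing through id. The encoding of call sites as (caller, result variable, argument value) triples and both function names are assumptions made for illustration, not an actual analysis framework.

```python
# Each call to id is recorded as (caller, result variable, argument value).
calls = [("p", "b", 2), ("q", "d", 3)]


def context_insensitive(calls):
    """One summary for id: its formal x may hold any argument from any caller."""
    x_values = {arg for _, _, arg in calls}
    # Every call site receives the merged set of possible return values.
    return {result: set(x_values) for _, result, _ in calls}


def context_sensitive(calls):
    """A separate copy of id's facts per calling context (here: per caller)."""
    x_values = {}
    for caller, _, arg in calls:
        x_values.setdefault(caller, set()).add(arg)
    return {result: set(x_values[caller]) for caller, result, _ in calls}
```

The insensitive version merges the arguments 2 and 3 before returning, so both b and d get {2, 3}; the sensitive version keeps them apart and recovers b = 2 and d = 3.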

50 Context-Sensitivity --- (2)
• Loops and recursive calls lead to an infinite number of contexts.
• Generally used only for interprocedural analysis, so forget about loops.
• Need to collapse strong components of the calling graph to a single group.
• “Context” becomes the sequence of groups on the calling stack.

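51 Example: Calling Graph
[Figure: calling graph over the functions main, p, q, r, s, and t, with strong components colored green, pink, and yellow.]
Contexts: Green; Green, pink; Green, yellow; Green, pink, yellow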

52 Comparative Complexity
• Insensitive: proportional to size of program (number of variables).
• Flow-Sensitive: size of program, squared (points times variables).
• Context-Sensitive: worst-case exponential in program size (acyclic paths through the code).

53 Logical Representation
• We have used a set-theoretic formulation of DFA.
  ◦ E.g., IN = set of definitions.
• There has been recent success with a logical formulation, involving predicates.
• Example: Reach(d,x,i) = “definition d of variable x can reach point i.”

54 Comparison: Sets Vs. Logic
• Both have an efficiency enhancement.
  ◦ Sets: bit vectors and boolean ops.
  ◦ Logic: BDDs, incremental evaluation.
• Logic allows integration of different aspects of a flow problem.
  ◦ Think of PRE (partial-redundancy elimination) as an example. We needed 6 stages to compute what we wanted.

55 Datalog --- (1)
    Atom = Reach(d,x,i)        (Reach is the predicate; d, x, i are its arguments: variables or constants)
    Literal = Atom or NOT Atom
    Rule = Atom :- Literal & … & Literal
• The body (right of :-): for each assignment of values to variables that makes all these literals true …
• … make the atom on the left (the head) true.

56 Example: Datalog Rules
    Reach(d,x,j) :- Reach(d,x,i) & StatementAt(i,s) & NOT Assign(s,x) & Follows(i,j)
    Reach(s,x,j) :- StatementAt(i,s) & Assign(s,x) & Follows(i,j)
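One way to read the two Reach rules operationally is the naive Python evaluation below, with each predicate stored as a set of tuples. The tiny program encoded as facts (points 0 to 3 and statements s1 to s3) is a made-up example, and solve_reach is a hypothetical name; a real Datalog engine would use indexed joins rather than nested loops.

```python
def solve_reach(statement_at, assign, follows):
    """Apply both Reach rules repeatedly until no new fact is derived."""
    reach = set()
    while True:
        derived = set()
        # Reach(s,x,j) :- StatementAt(i,s) & Assign(s,x) & Follows(i,j)
        for (i, s) in statement_at:
            for (s2, x) in assign:
                if s2 == s:
                    for (i2, j) in follows:
                        if i2 == i:
                            derived.add((s, x, j))
        # Reach(d,x,j) :- Reach(d,x,i) & StatementAt(i,s)
        #                 & NOT Assign(s,x) & Follows(i,j)
        for (d, x, i) in reach:
            for (i2, s) in statement_at:
                if i2 == i and (s, x) not in assign:
                    for (i3, j) in follows:
                        if i3 == i:
                            derived.add((d, x, j))
        if derived <= reach:
            return reach
        reach |= derived


# Hypothetical facts: s1 (x = 5) at point 0, s2 (a test) at point 1,
# s3 (x = 15) at point 2; point 3 is the exit.
statement_at = {(0, "s1"), (1, "s2"), (2, "s3")}
assign = {("s1", "x"), ("s3", "x")}
follows = {(0, 1), (1, 2), (2, 1), (1, 3)}
reach = solve_reach(statement_at, assign, follows)
```

Both definitions of x reach points 1, 2, and 3, because the statement at point 1 does not assign x and so passes them along.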

57 Datalog --- (2)
• Intuition: subgoals in the body are combined by “and” (strictly speaking: “join”).
• Intuition: multiple rules for a predicate (head) are combined by “or.”

58 Datalog --- (3)
• Predicates can be implemented by relations (as in a database).
• Each tuple, or assignment of values to the arguments, also represents a propositional (boolean) variable.

59 Iterative Algorithm for Datalog
• Start with the EDB (extensional database) predicates = “whatever the code dictates,” and with all IDB (intensional database) predicates empty.
• Repeatedly examine the bodies of the rules, and see what new IDB facts can be discovered from the EDB and existing IDB facts.

60 Example: Seminaive
Rules:
    Path(x,y) :- Arc(x,y)
    Path(x,y) :- Path(x,z) & Path(z,y)
Evaluation:
    NewPath(x,y) = Arc(x,y);
    Path(x,y) = Arc(x,y);    // Path must already hold the arcs, or the joins below find nothing
    while (NewPath != ∅) do {
        NewPath(x,y) = {(x,y) | NewPath(x,z) && Path(z,y) || Path(x,z) && NewPath(z,y)} – Path(x,y);
        Path(x,y) = Path(x,y) ∪ NewPath(x,y);
    }
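A short Python sketch of seminaive evaluation for the two Path rules, assuming relations are sets of pairs and that Path starts out holding the arcs so that the first round of joins has something to match against. The function name is hypothetical.

```python
def transitive_closure(arcs):
    """Seminaive evaluation: each round joins only against newly found facts."""
    path = set(arcs)            # Path(x,y) :- Arc(x,y)
    new_path = set(arcs)
    while new_path:
        derived = set()
        # NewPath(x,z) && Path(z,y)
        for (x, z) in new_path:
            for (z2, y) in path:
                if z == z2:
                    derived.add((x, y))
        # Path(x,z) && NewPath(z,y)
        for (x, z) in path:
            for (z2, y) in new_path:
                if z == z2:
                    derived.add((x, y))
        new_path = derived - path       # keep only facts not seen before
        path |= new_path
    return path
```

For the chain 1 → 2 → 3 → 4, the first round adds (1,3) and (2,4), the second adds (1,4), and the third finds nothing new, so the loop ends.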


62 New Topic: Pointer Analysis
• We shall consider Andersen’s formulation of Java object references.
• Flow/context insensitive analysis.
• Cast of characters:
  1. Local variables, which point to:
  2. Heap objects, which may have fields that are references to other heap objects.

63 Representing Heap Objects
• A heap object is named by the statement in which it is created.
• Note that many run-time objects may have the same name.
• Example: h: T v = new T; says variable v can point to (one of) the heap object(s) created by statement h.
[Figure: v → h]

64 Other Relevant Statements
• v.f = w makes the f field of the heap object h pointed to by v point to what variable w points to.
[Figure: points-to graph over variables v, w and heap objects g, h, i, with field edges labeled f.]

65 Other Statements --- (2)
• v = w.f makes v point to what the f field of the heap object h pointed to by w points to.
[Figure: points-to graph over variables v, w and heap objects g, h, i, with a field edge labeled f.]

66 Other Statements --- (3)
• v = w makes v point to whatever w points to.
  ◦ Interprocedural analysis: also models copying an actual parameter to the corresponding formal, or a return value to a variable.
[Figure: variables v and w both pointing to heap object h.]

67 Datalog Rules
  1. Pts(V,H) :- “H: V = new T”
  2. Pts(V,H) :- “V=W” & Pts(W,H)
  3. Pts(V,H) :- “V=W.F” & Pts(W,G) & Hpts(G,F,H)
  4. Hpts(H,F,G) :- “V.F=W” & Pts(V,H) & Pts(W,G)

68 Example
    T p(T x) {
        h: T a = new T;
        a.f = x;
        return a;
    }
    void main() {
        g: T b = new T;
        b = p(b);
        b = b.f;
    }

69 Apply Rules Recursively --- Round 1 (program as on slide 68)
Facts: Pts(a,h), Pts(b,g)

70 Apply Rules Recursively --- Round 2
Facts: Pts(a,h), Pts(b,g), Pts(b,h), Pts(x,g)

71 Apply Rules Recursively --- Round 3
Facts: Pts(a,h), Pts(b,g), Pts(x,g), Pts(b,h), Hpts(h,f,g), Pts(x,h)

72 Apply Rules Recursively --- Round 4
Facts: Pts(a,h), Pts(b,g), Pts(x,g), Pts(b,h), Pts(x,h), Hpts(h,f,g), Hpts(h,f,h)
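The four Pts/Hpts rules can be run to a fixed point with a small Python sketch. The encoding of the example is an assumption: the call b = p(b) is modelled as the two copies x = b (actual to formal) and b = a (return value to variable), and the function name and tuple layouts are hypothetical. The rounds in this sketch need not line up one-to-one with the rounds on the slides, but the final set of facts is the same.

```python
def andersen(new, copy, load, store):
    """new: (V, H) for "H: V = new T";  copy: (V, W) for "V = W";
    load: (V, W, F) for "V = W.F";      store: (V, F, W) for "V.F = W"."""
    pts = set(new)                      # rule 1
    hpts = set()
    changed = True
    while changed:
        changed = False
        derived_pts, derived_hpts = set(), set()
        for (v, w) in copy:             # rule 2
            derived_pts |= {(v, h) for (w2, h) in pts if w2 == w}
        for (v, w, f) in load:          # rule 3
            for (w2, g) in pts:
                if w2 == w:
                    derived_pts |= {(v, h) for (g2, f2, h) in hpts
                                    if g2 == g and f2 == f}
        for (v, f, w) in store:         # rule 4
            for (v2, h) in pts:
                if v2 == v:
                    derived_hpts |= {(h, f, g) for (w2, g) in pts if w2 == w}
        if not derived_pts <= pts or not derived_hpts <= hpts:
            pts |= derived_pts
            hpts |= derived_hpts
            changed = True
    return pts, hpts


pts, hpts = andersen(
    new={("a", "h"), ("b", "g")},
    copy={("x", "b"), ("b", "a")},      # assumed model of the call b = p(b)
    load={("b", "b", "f")},             # b = b.f
    store={("a", "f", "x")},            # a.f = x
)
```

Because the analysis is flow- and context-insensitive, b ends up pointing to both g and h, and the f field of h may point to either object.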

73 Adding Context Sensitivity
• Include a component C = context.
  ◦ C doesn’t change within a function.
  ◦ Call and return can extend the context if the called function is not mutually recursive with the caller.

74 Example of Rules: Context Sensitive
    Pts(V,H,B,I+1,C) :- “B,I: V=W” & Pts(W,H,B,I,C)
    Pts(X,H,B0,0,D) :- Pts(V,H,B,I,C) & “B,I: call P(…,V,…)” &
                       “X is the formal of P corresponding to the actual V” &
                       “B0 is the entry of P” &
                       “context D is C extended by P”

