Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Program Analysis Mooly Sagiv Tel Aviv University 640-6706 Textbook: Principles of Program Analysis.

Similar presentations


Presentation on theme: "1 Program Analysis Mooly Sagiv Tel Aviv University 640-6706 Textbook: Principles of Program Analysis."— Presentation transcript:

1 1 Program Analysis Mooly Sagiv http://www.cs.tau.ac.il/~msagiv/courses/pa05.html Tel Aviv University 640-6706 Textbook: Principles of Program Analysis Chapter 2.1 +6 (modified) Appendix A

2 Outline u A gentle introduction constant propagation u Mathematical background u Chaotic iterations u Abstract interpretation u More examples –Kill/Gen Problems –Garbage variables –Pointer analysis »Stack »Heap (later) –Array bound (next week) u Precision/Completeness

3 A Simple Example Program z = 3 x = 1 while (x > 0) ( if (x = 1) then y = 7 else y = z + 4 x = 3 print y ) [x  0, y  0, z  0 ] [x  1, y  0, z  3 ] [x  1, y  7, z  3 ] [x  0, y  0, z  3 ] [x  3, y  7, z  3 ] [x  1, y  7, z  3 ]

4 A Simple Example Program z = 3 x = 1 while (x > 0) ( if (x = 1) then y = 7 else y = z + 4 x = 3 print y ) [x  0, y  0, z  0 ] [x  1, y  0, z  3 ] [x , y  , z  3 ] [x  1, y  7, z  3 ] [x  0, y  0, z  3 ] [x  3, y  7, z  3 ]

5 A Simple Example Program z = 3 x = 1 while (x > 0) ( if (x = 1) then y = 7 else y = z + 4 x = 3 print y ) [x  0, y  0, z  0 ] [x  1, y  0, z  3 ] [x , y  , z  3 ] [x  1, y  7, z  3 ] [x  0, y  0, z  3 ] [x  3, y  7, z  3 ]

6 A Simple Example Program z = 3 x = 1 while (x > 0) ( if (x = 1) then y = 7 else y =z + 4 x = 3 print y ) [x  0, y  0, z  0 ] [x  1, y  0, z  3 ] [x , y  0, z  3 ] [x , y  , z  3 ] [x  0, y  0, z  3 ] [x  3, y  7, z  3 ] [x  1, y  7, z  3 ]

7 A Simple Example Program z = 3 x = 1 while (x > 0) ( if (x = 1) then y = 7 else y =z + 4 x = 3 print y ) [x  0, y  0, z  0 ] [x  1, y  0, z  3 ] [x , y  , z  3 ] [x  1, y  7, z  3 ] [x , y  , z  3 ] [x  0, y  0, z  3 ] [x  3, y  7, z  3 ] [x , y  7, z  3 ] [x  1, y  7, z  3 ]

8 A Simple Example Program z = 3 x = 1 while (x > 0) ( if (x = 1) then y = 7 else y =z + 4 x = 3 print y ) [x  0, y  0, z  0 ] [x  1, y  0, z  3 ] [x , y  , z  3 ] [x  1, y  7, z  3 ] [x  0, y  0, z  3 ] [x  3, y  7, z  3 ] [x , y  7, z  3 ]

9 Computing Constants u Construct a control flow graph (CFG) u Associate transfer functions with control flow graph edges u Iterate until a solution is found u The solution is unique –But order of evaluation may affect the number of iterations

10 Constructing CFG z = 3 x = 1 while (x > 0) ( if (x = 1) then y = 7 else y =z + 4 x = 3 print y ) z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y

11 Associating Transfer Functions z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e

12 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ]       

13 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ]       [x  1, y  0, z  0 ]

14 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ]       [x  0, y  0, z  3 ]

15 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ]      [x  0, y  0, z  3 ] [x  1, y  0, z  3 ]

16 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ]     [x  0, y  0, z  3 ] [x  1, y  0, z  3 ]

17 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ]    [x  0, y  0, z  3 ] [x  1, y  0, z  3 ]

18 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ]   [x  0, y  0, z  3 ] [x  1, y  0, z  3 ] [x  1, y  7, z  3 ]

19 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ]  [x  0, y  0, z  3 ] [x  1, y  0, z  3 ] [x  1, y  7, z  3 ] [x  3, y  7, z  3 ]

20 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ]  [x  0, y  0, z  3 ] [x , y  0, z  3 ] [x  1, y  0, z  3 ] [x  1, y  7, z  3 ] [x  3, y  7, z  3 ]

21 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ]  [x  0, y  0, z  3 ] [x , y  0, z  3 ] [x  1, y  0, z  3 ] [x  1, y  7, z  3 ] [x  3, y  7, z  3 ]

22 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ]  [x  0, y  0, z  3 ] [x , y  0, z  3 ] [x  1, y  7, z  3 ] [x  3, y  7, z  3 ]

23 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ] [x  0, y  0, z  3 ] [x , y  0, z  3 ] [x  1, y  7, z  3 ] [x  3, y  7, z  3 ] [x , y  0, z  3 ]

24 Iterative Computation z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e [x  0, y  0, z  0 ] [x  0, y  0, z  3 ] [x , y  0, z  3 ] [x  1, y  0, z  3 ] [x , y  7, z  3 ] [x  3, y  7, z  3 ] [x , y  0, z  3 ]

25 Mathematical Background u Declaratively define –The result of the analysis –The exact solution –Allow comparison

26 Posets u A partial ordering is a binary relation  : L  L  {false, true} –For all l  L : l  l (Reflexive) –For all l 1, l 2, l 3  L : l 1  l 2, l 2  l 3  l 1  l 3 (Transitive) –For all l 1, l 2  L : l 1  l 2, l 2  l 1  l 1 = l 2 (Anti-Symmetric) u Denoted by (L,  ) u In program analysis –l 1  l 2  l 1 is more precise than l 2  l 1 represents fewer concrete states than l 2 u Examples –Total orders (N,  ) –Powersets (P(S),  ) –Powersets (P(S),  ) –Constant propagation

27 Posets u More notations –l 1  l 2  l 2  l 1 –l 1  l 2  l 1  l 2  l 1  l 2 –l 1  l 2  l 2  l 1

28 Upper and Lower Bounds u Consider a poset (L,  ) u A subset L’  L has a lower bound l  L if for all l’  L’ : l  l’ u A subset L’  L has an upper bound u  L if for all l’  L’ : l’  u u A greatest lower bound of a subset L’  L is a lower bound l 0  L such that l  l 0 for any lower bound l of L’ u A lowest upper bound of a subset L’  L is an upper bound u 0  L such that u 0  u for any upper bound u of L’ u For every subset L’  L: –The greatest lower bound of L’ is unique if at all exists »  L’ (meet) a  b –The lowest upper bound of L’ is unique if at all exists »  L’ (join) a  b

29 Complete Lattices u A poset (L,  ) is a complete lattice if every subset has least and upper bounds u L = (L,  ) = (L, , , , ,  ) –  =   =  L –  =  L =   u Examples –Total orders (N,  ) –Powersets (P(S),  ) –Powersets (P(S),  ) –Constant propagation

30 Complete Lattices u Lemma For every poset (L,  ) the following conditions are equivalent –L is a complete lattice –Every subset of L has a least upper bound –Every subset of L has a greatest lower bound

31 Cartesian Products u A complete lattice (L 1,  1 ) = (L 1, ,  1,  1,  1,  1 ) u A complete lattice (L 2,  2 ) = (, ,  2,  2,  2,  2 ) u Define a Poset L = (L 1  L 2,  ) where –(x 1, x 2 )  (y 1, y 2 ) if »x 1  y 1 and »x 2  y 2 u L is a complete lattice

32 Finite Maps u A complete lattice (L 1,  1 ) = (L 1, ,  1,  1,  1,  1 ) u A finite set V u Define a Poset L = (V  L 1,  ) where –e 1  e 2 if for all v  V »e 1 v  e 2 v u L is a complete lattice

33 Chains u A subset Y  L in a poset (L,  ) is a chain if every two elements in Y are ordered –For all l 1, l 2  Y: l 1  l 2 or l 2  l 1 u An ascending chain is a sequence of values –l 1  l 2  l 3  … u A strictly ascending chain is a sequence of values –l 1  l 2  l 3  … u A descending chain is a sequence of values –l 1  l 2  l 3  … u A strictly descending chain is a sequence of values –l 1  l 2  l 3  … u L has a finite height if every chain in L is finite u Lemma A poset (L,  ) has finite height if and only if every strictly decreasing and strictly increasing chains are finite

34 Monotone Functions u A poset (L,  ) u A function f: L  L is monotone if for every l 1, l 2  L: –l 1  l 2  f(l 1 )  f(l 2 )

35 Galois Connections u Lattices C and A and functions  : C  A and  : A  C u The pair of functions ( ,  ) form Galois connection if –  and  are monotone –  a  A »  (  (a))  a –  c  C »c   (  (C)) u Alternatively if:  c  C  a  A  (c)  a iff c   (a) u  and  uniquely determine each other

36 Fixed Points u A monotone function f: L  L where (L, , , , ,  ) is a complete lattice u Fix(f) = { l: l  L, f(l) = l} u Red(f) = {l: l  L, f(l)  l} u Ext(f) = {l: l  L, l  f(l)} –l 1  l 2  f(l 1 )  f(l 2 ) u Tarski’s Theorem 1955: if f is monotone then: – lfp(f) =  Fix(f) =  Red(f)  Fix(f) – gfp(f) =  Fix(f) =  Ext(f)  Fix(f)   f(  ) f(  ) f2()f2() f2()f2() Fix(f) Ext(f) Red(f) gfp(f) lfp(f)

37 Chaotic Iterations u A lattice L = (L, , , , ,  ) with finite strictly increasing chains u L n = L  L  …  L u A monotone function f: L n  L n u Compute lfp(f) u The simultaneous least fixed of the system {x[i] = f i (x) : 1  i  n } x := ( , , …,  ) while (f(x)  x ) do x := f(x) for i :=1 to n do x[i] =  WL = {1, 2, …, n} while (WL   ) do select and remove an element i  WL new := f i (x) if (new  x[i]) then x[i] := new; Add all the indexes that directly depends on i to WL

38 Chaotic Iterations u L n = L  L  …  L u A monotone function f: L n  L n u Compute lfp(f) u The simultaneous least fixed of the system {x[i] = f i (x) : 1  i  n } u Minimum number of non-constant u Maximum number of  for i :=1 to n do x[i] =  WL = {1, 2, …, n} while (WL   ) do select and remove an element i  WL new := f i (x) if (new  x[i]) then x[i] := new; Add all the indexes that directly depends on i to WL

39 Specialized Chaotic Iterations Chaotic(G(V, E): Graph, s: Node, L: Lattice,  : L, f: E  (L  L) ){ for each v in V to n do df entry [v] :=  df[v] =  WL = {s} while (WL   ) do select and remove an element u  WL for each v, such that. (u, v)  E do temp = f(e)(df entry [u]) new := df entry (v)  temp if (new  df entry [v]) then df entry [v] := new; WL := WL  {v}

40 z =3 x =1 while (x>0) if (x=1) y =7y =z+4 x=3 print y  e.e[z  3]  e.e[x  1]  e. if x >0 then e else   e. if x  0 then e else   e. e  [x  1, y , z  ]  e. if x  0 then e else   e.e[y  7]  e.e[y  e(z)+4]  e.e[x  3]  e.e  1 2 3 4 5 6 7 8 [x  0, y  0, z  0] WLdf entry ]v] {1} {2} df[2]:=[x  0, y  0, z  3] {3} df[3]:=[x  1, y  0, z  3] {4} df[4]:=[x  1, y  0, z  3] {5} df[5]:=[x  1, y  0, z  3] {7} df[7]:=[x  1, y  7, z  3] {8} df[8]:=[x  3, y  7, z  3] {3} df[3]:=[x , y , z  3] {4} df[4]:=[x , y , z  3] {5,6} df[5]:=[x  1, y , z  3] {6,7} df[6]:=[x , y , z  3] {7} df[7]:=[x , y  7, z  3]

41 Specialized Chaotic Iterations System of Equations S = df entry [s] =  df entry [v] =  {f(u, v) (df entry [u]) | (u, v)  E } F S :L n  L n F S (X)[s] =  F S (X)[v] =  {f(u, v)(X[u]) | (u, v)  E } lfp(S) = lfp(F S )

42 Complexity of Chaotic Iterations u Parameters: –n the number of CFG nodes –k is the maximum outdegree of edges –A lattice of height h –c is the maximum cost of »applying f (e) »  »L comparisons u Complexity O(n * h * c * k)

43 Soundness u Every detected constant is indeed such u Every error will be detected u The least fixed points represents all occurring runtime states

44 Completeness u Every constant is indeed detected as such u Every detected error is real u Every state represented by the least fixed is reachable for some input

45 The Abstract Interpretation Technique u The foundation of program analysis u Goals –Establish soundness of (find faults in) a given program analysis algorithm –Design new program analysis algorithms u The main ideas: –Relate each step in the algorithm to a step in a structural operational semantics –Establish global correctness using a general theorem u Not limited to a particular form of analysis

46 Soundness in Constant Propagation u Every detected constant is indeed such u May include fewer constants u May miss  u At every CFG node l All constants in df entry (l) are indeed constants u At every CFG node l df entry (l) “represents” all the possible concrete states arising when the structural operational semantics reaches l

47 Proof of Soundness u Define an “appropriate” structural operational semantics u Define “collecting” structural operational semantics u Establish a Galois connection between collecting states and abstract states u (Local correctness) Show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics u (Global correctness) Conclude that the analysis is sound CC1976  (lfp(S)[v])  reachable-states[v]

48 Structural Semantics for While [ass sos ]  s[x  A  a  s] [skip sos ]  s [comp 1 sos ]   axioms rules [comp 2 sos ]  s’ 

49 Structural Semantics for While if construct [if tt sos ]  if B  b  s=tt [if ff os ]  if B  b  s=ff

50 Structural Semantics for While while construct [while sos ] 

51 Collecting Semantics u The input state is not known at compile-time u “Collect” all the states for all possible inputs to the program u No lost of precision

52 A Simple Example Program z = 3 x = 1 while (x > 0) ( if (x = 1) then y = 7 else y = z + 4 x = 3 print y ) {[x  0, y  0, z  0 ]} {[x  1, y  0, z  3 ]} {[x  1, y  0, z  3 ], [x  3, y  0, z  3], } {[x  0, y  0, z  3 ]} {[x  1, y  7, z  3 ], [x  3, y  7, z  3] } { [x  3, y  7, z  3] }

53 Another Example x= 0 while (true) do x = x +1

54 An “Iterative” Definition u Generate a system of monotone equations u The least solution is well-defined u The least solution is the collecting interpretation u But may not be computable

55 Equations Generated for Collecting Interpretation u Equations for elementary statements –[skip] CS exit (1) = CS entry (l) –[b] CS exit (1) = {  :   CS entry (l),  b   =tt} –[x := a] CS exit (1) = { (s[x  A  a  s]) | s  CS entry (l)} u Equations for control flow constructs CS entry (l) =  CS exit (l’) l’ immediately precedes l in the control flow graph u An equation for the entry CS entry (1) = {  |   Var *  Z }

56 Specialized Chaotic Iterations System of Equations (Collecting Semantics) S = CS entry [s] ={  0 } CS entry [v] =  {f(e)(CS entry [u]) | (u, v)  E } where f(e) = X. {  st(e)   |  X} for atomic statements f(e) = X.{  |  b(e)   =tt } F S :L n  L n F s (X)[v] =  {f(e)[u] | (u, v)  E } lfp(S) = lfp(F S )

57 The Least Solution u n sets of equations CS entry (1), …, CS entry (n) u Can be written in vectorial form u The least solution lfp(F cs ) is well-defined u Every component is minimal u Since F cs is monotone such a solution always exists u CS entry (v) = {s|  s 0 |  * (S’, s)), init(S’)=v} u Simplify the soundness criteria

58 Abstract (Conservative) interpretation abstract representation Set of states concretization Abstract semantics statement s abstract representation concretization Operational semantics statement s Set of states 

59 Abstract (Conservative) interpretation abstract representation Set of states abstraction Abstract semantics statement s abstract representation abstraction Operational semantics statement s Set of states abstract representation 

60 The Abstraction Function u Map collecting states into constants u The abstraction of an individual state  CP :[Var *  Z]  [Var *  Z  { ,  }]  CP (  ) =  u The abstraction of set of states  CP :P([Var *  Z])  [Var *  Z  { ,  }]  CP (CS) =  {  CP (  ) |   CS} =  {  |   CS} u Soundness  CP (CS entry (v))  df entry (v) u Completeness

61 The Concretization Function u Map constants into collecting states u The formal meaning of constants u The concretization  CP : [Var *  Z  { ,  }]  P([Var *  Z])  CP (df) = {  |  CP (  )  df} = {  |   df} u Soundness CS entry (v)   CP (df entry (v)) u Optimality

62 Galois Connection u  CP is monotone u  CP is monotone u  df  [Var *  Z  { ,  }] –  CP (  CP (df))  df u  c  P([Var *  Z]) –c CP   CP (  CP (C))

63 Local Concrete Semantics u For every atomic statement S –  S  : [Var *  Z]  [Var *  Z] –  x := a]  s = s[x  A  a  s] –  skip]  s = s u For Boolean conditions …

64 Local Abstract Semantics u For every atomic statement S –  S  # :Var *  L  Var *  L –  x := a  # (e) = e [x   a  # (e) –  skip  # (e) = e u For Booleans …

65 Local Soundness u For every atomic statement S show one of the following –  CP ({  S   |   CS }   S  # (  CP (CS)) –{  S   |    CP (df)}   CP (  S  # (df)) –  ({  S   |    CP (df)})   S  # (df) u The above condition implies global soundness [Cousot & Cousot 1976]  (CS entry (l))  df entry (l) CS entry (l)   (df entry (l))

66 Lemma 1 Consider a lattice L. f: L  L is monotone iff for all X  L:  {f(z) | z  X }  f(  {z | z  X })

67 Assignments in constant propagation u Monotone –df 1  df 2   x :=e  ) # df 1 )   x :=e  ) # df 2 ( u Local Soundness –  ({  x :=e   |   CS }   x :=e  # (  (CS))

68 Soundness Theorem(1) 1. Let ( ,  ) form Galois connection from C to A 2. f: C  C be a monotone function 3. f # : A  A be a monotone function 4.  a  A: f(  (a))   (f # (a)) lfp(f)   (lfp(f # ))  (lfp(f))  lfp(f # )

69 Fixed Points u A monotone function f: L  L where (L, , , , ,  ) is a complete lattice u Fix(f) = { l: l  L, f(l) = l} u Red(f) = {l: l  L, f(l)  l} u Ext(f) = {l: l  L, l  f(l)} –l 1  l 2  f(l 1 )  f(l 2 ) u Tarski’s Theorem 1955: if f is monotone then: – lfp(f) =  Fix(f) =  Red(f)  Fix(f) – gfp(f) =  Fix(f) =  Ext(f)  Fix(f)   f(  ) f(  ) f2()f2() f2()f2() Fix(f) Ext(f) Red(f) gfp(f) lfp(f)

70 Soundness Theorem(1) 1. Let ( ,  ) form Galois connection from C to A 2. f: C  C be a monotone function 3. f # : A  A be a monotone function 4.  a  A: f(  (a))   (f # (a)) lfp(f)   (lfp(f # ))  (lfp(f))  lfp(f # )

71 Soundness Theorem(2) 1. Let ( ,  ) form Galois connection from C to A 2. f: C  C be a monotone function 3. f # : A  A be a monotone function 4.  c  C:  (f(c))  f # (  (c))  (lfp(f))  lfp(f # ) lfp(f)   (lfp(f # ))

72 Soundness Theorem(3) 1. Let ( ,  ) form Galois connection from C to A 2. f: C  C be a monotone function 3. f # : A  A be a monotone function 4.  a  A:  (f(  (a)))  f # (a)  (lfp(f))  lfp(f # ) lfp(f)   (lfp(f # ))

73 Proof of Soundness (Summary) u Define an “appropriate” structural operational semantics u Define “collecting” structural operational semantics u Establish a Galois connection between collecting states and abstract states u (Local correctness) Show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics u (Global correctness) Conclude that the analysis is sound

74 Example Program Analysis Problem Reaching Definitions u An assignment (definition) of the form [x := a] l may reach an elementary block l’ if – there is execution of the program that leads to l' where x was last assigned at l

75 Usage of Reaching Definitions u Compiler optimizations –An occurrence of a variable x in in an elementary block l is constant n if all in the reaching definitions (x, l'), l' assigns n to x –Loop invariant code motion –Program dependence graphs u Software quality tools –A usage of a variable x in an elementary block may be uninitialized if... –Program slicing

76 Soundness in Reaching Definitions u Every reachable definition is detected u May include more definitions –Less constants may be identified –Not all the loop invariant code will be identified –May warn against uninitailzed variables that are in fact in initialized u But never miss a reaching definition –All constants are indeed such –Never move a non invariant code –Never miss an error

77 Reaching Definitions in Factorial [y := x] 1 ; [z := 1] 2 ; while [y>1] 3 do ( [z:= z * y] 4 ; [y := y - 1] 5 ; ) [y := 0] 6 ; {(x, ?), (y, ?), (z, ?)} {(x, ?), (y, 1), (z, ?)} {(x, ?), (y, 1), (z, 2)} {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)} {(x, ?), (y, 1), (y, 5), (z, 4)} {(x, ?), (y, 5), (z, 4)} {(x, ?), (y, 6), (z, 2), (z, 4)} {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}

78 Unsound Reaching Definitions 1 [y := x] 1 ; [z := 1] 2 ; while [y>1] 3 do ( [z:= z * y] 4 ; [y := y - 1] 5 ; ) [y := 0] 6 ; {(x, ?), (y, ?), (z, ?)} {(x, ?), (y, 1), (z, ?)} {(x, ?), (y, 1), (z, 2)} {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)} {(x, ?), (y, 1), (z, 2)} {(x, ?), (y, 1), (z, 4)} {(x, ?), (y, 5), (z, 4)} {(x, ?), (y, 6), (z, 2), (z, 4)} {(x, ?), (y, 1), (z, 2)}

79 Unsound Reaching Definitions 2 [y := x] 1 ; [z := 1] 2 ; while [y>1] 3 do ( [z:= z * y] 4 ; [y := y - 1] 5 ; ) [y := 0] 6 ; {(x, ?), (y, ?), (z, ?)} {(x, ?), (y, 1), (z, ?)} {(x, ?), (y, 1), (z, 2)} {(x, ?), (y, 5), (z, 4)} {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)} {(x, ?), (y, 1), (y, 5), (z, 4)} {(x, ?), (y, 5), (z, 4)} {(x, ?), (y, 6), (z, 4)} {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}

80 Suboptimal Reaching Definitions [y := x] 1 ; [z := 1] 2 ; while [y>1] 3 do ( [z:= z * y] 4 ; [y := y - 1] 5 ; ) [y := 0] 6 ; {(x, ?), (y, ?), (z, ?)} {(x, ?), (y, 1), (z, ?)} {(x, ?), (y, 1), (z, 2)} {(x, ?), (y, 1), (y, 5), (y, 6), (z, 2), (z, 4)} {(x, ?), (y, 1), (y, 6), (y, 5), (z, 4)} {(x, ?), (y, 1), (y, 5), (y, 6), (z, 4)} {(x, ?), (y, 5), (z, 4)} {(x, ?), (y, 6), (z, 2), (z, 4)} {(x, ?), (y, 1), (y, 5), (y, 6), (z, 2), (z, 4)}

81 Equations Generated for Reaching Definitions u Equations for elementary statements –[skip] l RD exit (1) = RD entry (l) –[b] l RD exit (1) = RD entry (l) –[x := a] l RD exit (1) = (RD entry (l) - {(x, l’) | l’  Lab })  {(x, l)} u Equations for control flow constructs RD entry (l) =  RD exit (l’) l’ immediately precedes l in the control flow graph u An equation for the entry RD entry (1) = {(x, ?) | x is a variable in the program}

82 RDentry(1)={(x, ?), (y, ?), (z, ?)} RDexit(1)= RDentry(1)-{(y, l’) | l’  Lab })  {(y, 1)} RDentry(2) = RDexit(1) RDexit(2)= RDentry(2)-{(z, l’) | l’  Lab })  {(z, 2)} RDentry(3) = RDexit(2)  RDexit(5) RDexit(3) = RDentry(3) RDentry(4) = RDexit(3) RDexit(4)= RDentry(4)-{(z, l’) | l’  Lab })  {(z, 4)} RDentry(5) = RDexit(4) RDexit(5)= RDentry(5)-{(y, l’) | l’  Lab })  {(y, 5)} RDentry(6) = RDexit(6) RDexit(6)= RDentry(6)-{(y, l’) | l’  Lab })  {(y, 6)} [y := x] 1 ; [z := 1] 2 ; [y>1] 3 [z:= z * y] 4 [y := y - 1] 5 ; [y := 0] 6 ;

83 Chaotic Computation of the Least Solution Initialize RD entry (1):={(x, ?), (y, ?), (z, ?)}; RDexit (1):=  RD entry (2):=  ; RDexit (2):=  RD entry (3):=  ; RDexit (3):=  RD entry (4):=  ; RDexit(4) :=  RD entry (5):=  ; RDexit(5):=  RD entry (6):=  ; RDexit(6):=  WL = {1, 2, 3, 4, 5, 6} while WL !=  do select and remove an l from WL new := FRD exit (l)(…) if (new != RD exit (l)) then RD exit (l) := new for all l’ such that RD exit (l) is used in FRD entry (l’)(…) do RD entry (l’) := RD entry (l’)  new WL := WL  {l’}

84 [y := x] 1 ; [z := 1] 2 ; [y>1] 3 [z:= z * y] 4 [y := y - 1] 5 ; [y := 0] 6 ; lRDexit(l)WL {1, 2, 3, 4, 5, 6} 1 {(x, ?), (y:1), (z:?)}{2, 3, 4, 5, 6} 2 {(x, ?), (y:1), (z:2)}{3, 4, 5, 6} 3 {(x, ?), (y:1), (z:2)}{4, 5, 6} 4 {(x, ?), (y:1), (z:4)}{5, 6} 5 {(x:?), (y: 5), (z: 4)}{6, 3} (x, ?), (y, ?), (z, ?)}{(x, ?), (y:1), (z:?)} {(x, ?), (y:1), (z:2)} {(x, ?), (y:1), (z:4)} {(x, ?), (y:1), (z:2)}

85 [y := x] 1 ; [z := 1] 2 ; [y>1] 3 [z:= z * y] 4 [y := y - 1] 5 ; [y := 0] 6 ; lRDexit(l)WL {1, 2, 3, 4, 5, 6} 1 {(x, ?), (y:1), (z:?)}{2, 3, 4, 5, 6} 2 {(x, ?), (y:1), (z:2)}{3, 4, 5, 6} 3 {(x, ?), (y:1), (z:2)}{4, 5, 6} 4 {(x, ?), (y:1), (z:4)}{5, 6} 5 {(x:?), (y: 5), (z: 4)}{6, 3} (x, ?), (y, ?), (z, ?)}{(x, ?), (y:1), (z:?)} {(x, ?), (y:1), (z:2)} {(x, ?), (y:1), (z:2), (y: 5), (z:4)} {(x, ?), (y:1), (z:2)} {(x, ?), (y:1), (z:4)} {(x:?), (y: 5), (z: 4)}

86 [y := x] 1 ; [z := 1] 2 ; [y>1] 3 [z:= z * y] 4 [y := y - 1] 5 ; [y := 0] 6 ; lRDexit(l)WL {1, 2, 3, 4, 5, 6} 1 {(x, ?), (y:1), (z:?)}{2, 3, 4, 5, 6} 2 {(x, ?), (y:1), (z:2)}{3, 4, 5, 6} 3 {(x, ?), (y:1), (z:2)}{4, 5, 6} 4 {(x, ?), (y:1), (z:4)}{5, 6} 5 {(x:?), (y: 5), (z: 4)}{6, 3} 3 {(x: ?), (y:1), (z:2), (y: 5), (z: 4)} {6, 4} (x, ?), (y, ?), (z, ?)}{(x, ?), (y:1), (z:?)} {(x, ?), (y:1), (z:2)} {(x, ?), (y:1), (z:2), (y: 5), (z:4)} {(x, ?), (y:1), (z:2)} {(x, ?), (y:1), (z:4)} {(x:?), (y: 5), (z: 4)}

87 Soundness in Reaching Definitions u Every reachable definition is detected u May include more definitions –Less constants may be identified –Not all the loop invariant code will be identified –May warn against uninitailzed variables that are in fact in initialized u At every elementary block l RD entry (l) includes all the possibly definitions reaching l u At every elementary block l RD entry (l) “represents” all the possible concrete states arising when the structural operational semantics reaches l

88 Structural Operational Semantics to justify Reaching Definitions u Normal states [Var *  Z] are not enough u Instrumented states [Var *  Z]  [Var *  Lab * ] u For an instrumented state (s, def) and variable x def(x) holds the last definition of x

89 Instrumented Structural Semantics for While [ass sos ]  (s[x  A  a  s], d(x  l)) [skip sos ]  (s, d) [comp 1 sos ]   axioms rules [comp 2 sos ]  (s’, d’) 

90 Instrumented Structural Semantics if construct [if tt sos ]  if B  b  s=tt [if ff sos ]  if B  b  s=ff

91 Instrumented Structural Semantics while construct [while sos ] 

92 The Factorial Program [y := x] 1 ; [z := 1] 2 ; while [y>1] 3 do ( [z:= z * y] 4 ; [y := y - 1] 5 ; ) [y := 0] 6 ;

93 Code Instrumentation u Alternative instrumentation u Generate an equivalent program which maintains more information u Use standard structural operational semantics

94 Other Consumers of Instrumentation u Specialized interpreters u Code Instrumentation –Performance analysis qpt »count the number of executions of basic blocks or the number of calls to a function –Profiling Tools »Find “hot” paths (paths that are executed often) by remembering which edge in the control flow graph was executed –Cleanness Tools Purify, Insure »identify uninitialized objects

95 Collecting (Instrumented) Semantics u The input state is not known at compile-time u “Collect” all the (instrumented) states for all possible inputs to the program u No lost of precision

96 Flow Information for While u Associate labels with program statements describing when statements begin and end u init:Stm  Lab * –init([x := a] l ) = l –init([skip] l ) = l –init(S 1 ; S 2 ) = init(S 1 ) –init(if [b] l then S 1 else S 2 ) = l –init(while [b] l do S) = l u final:Stm  P(Lab * ) –final([x := a] l ) = {l} –final([skip] l ) = {l} –final(S 1 ; S 2 ) = final(S 2 ) –final(if [b] l then S 1 else S 2 ) = final(S 1 )  final(S 2 ) –final(while [b] l do S) = {l}

97 Collecting (Instrumented) Semantics(Cont) u The input state is not known at compile-time u “Collect” all the (instrumented) states for all possible inputs to the program u Define d ? :Var *  Lab * by d ? (x)=? u CS entry (l) = {(s’, d’)|  s 0 : (P, (s 0, d ? )  * (S’, (s’, d’)), init(S’)=l} u Soundness w.r.t. operational semantics For all (s’, d’) in CS entry (l) For all variable x (x, d(l))  RD entry (l) u Optimality w.r.t. operational semantics

98 The Factorial Program [y := x] 1 ; [z := 1] 2 ; while [y>1] 3 do ( [z:= z * y] 4 ; [y := y - 1] 5 ; ) [y := 0] 6 ;

99 An “Iterative” Definition u Generate a system of monotonic equations u The least solution is well-defined u The least solution is the collecting interpretation

100 Equations Generated for Collecting Interpretation u Equations for elementary statements –[skip] l CS exit (1) = CS entry (l) –[b] l CS exit (1) = CS entry (l) –[x := a] l CS exit (1) = { (s[x  A  a  s], d(x  l)) | (s, d)  CS entry (l)} u Equations for control flow constructs CS entry (l) =  CS exit (l’) l’ immediately precedes l in the control flow graph u An equation for the entry CS entry (1) = {(s 0, d ? ) |s 0  Var *  Z }

101 The Least Solution u 12 sets of equations CS entry (1), …, CS exit (6) u Can be written in vectorial form u The least solution lfp(F cs ) is well-defined u Every component is minimal u Since F cs is monotonic such a solution always exists u CS entry (l) = {(s’, d’)|  s 0 : (P, (s 0, d ? )  * (S’, (s’, d’)), init(S’)=l} u Simplify the soundness criteria

102 Abstract (Conservative) interpretation abstract representation Set of states concretization Abstract semantics statement s abstract representation abstraction Operational semantics statement s Set of states

103 The Abstraction Function u Map collecting states into reaching definitions u The abstraction of an individual state  :[Var *  Z]  [Var *  Lab * ]  P(Var *  Lab * )  (s,d) = {(x, d(x) | x  Var * } u The abstraction of set of states  :P([Var *  Z]  [Var *  Lab * ])  P(Var *  Lab * )  (CS) =  (s, d)  CS  (s,d) = = {(x, d(x) | (s, d)  CS, x  Var * } u Soundness  (CS entry (l))  RD entry (l) u Optimality

104 The Concretization Function u Map reaching definitions into collecting states u The formal meaning of reaching definitions u The concretization  : P(Var *  Lab * )  P([Var *  Z]  [Var *  Lab * ])  (RD) = {(s, d) |  x  Var * : (x, d(x)  RD}= = { (s, d) |  (s, d)  RD } u Soundness CS entry (l)   (RD entry (l)) u Optimality

105 Galois Connections u The pair of functions ( ,  ) form a Galois connection if:  CS  P([Var *  Z]  [Var *  Lab * ])  RD  P(Var *  Lab * )  (CS)  RD iff CS   (RD) u Alternatively:  CS  P([Var *  Z]  [Var *  Lab * ])  RD  P(Var *  Lab * )  (  (RD))  RD and CS   (  (CS)) u  and  uniquely determine each other

106 Local Abstract Semantics u For every atomic statement S –  S  # : P(Var *  Lab * )  P(Var *  Lab * ) –  x := a] l  # (RD) = (RD - {(x, l’) | l’  Lab })  {(x, l)} –  skip] l  # (RD) = (RD) –  b] l  # (RD) = (RD)

107 Local Soundness u For every atomic statement S show one of the following –  ({  S  (s, d) | (s, d)  CS }   S  # (  (CS)) –{  S  (s, d) | (s, d)   (RD)}   (  S  # (RD)) –  ({  S  (s, d) | (s, d)   (RD)})   S  # (RD) u The above condition implies global soundness [Cousot & Cousot 1976]  (CS entry (l))  RD entry (l) CS entry (l)   (RD entry (l))

108 Best (Conservative) interpretation abstract representation Set of states concretization Abstract semantics statement s abstract representation abstraction Operational semantics statement s Set of states concretization Set of states 

109 Induced Analysis (Relatively Optimal) u It is sometimes possible to show that a given analysis is not only sound but optimal w.r.t. the chosen abstraction –but not necessarily optimal! u Define  S  # (df) =  ({  S   |    (df)}) u But this  S  # may not be computable u Derive (at compiler-generation time) an alternative form for  S  # u A useful measure to decide if the abstraction must lead to overly imprecise results u Recent results use theorem provers to produce best transformers

110 Example Dataflow Problem u Formal available expression analysis u Find out which expressions are available at a given program point u Example program x = y + t z = y + r while (…) { t = t + (y + r) } u Lattice u Galois connection u Basic statements u Soundness

111 Example: May-Be-Garbage u A variable x may-be-garbage at a program point v if there exists a execution path leading to v in which x’s value is unpredictable: –Was not assigned –Was assigned using an unpredictable expression u Lattice u Galois connection u Basic statements u Soundness

112 Points-To Analysis u Determine if a pointer variable p may point to q on some path leading to a program point u “Adapt” other optimizations –Constant propagation x = 5; *p = 7 ; … x … u Pointer aliases –Variables p and q are may-aliases at v if the points-to set at v contains entries (p, x) and (q, x) u Side-effect analysis *p = *q + * * t

113 The PWhile Programming Language Abstract Syntax a := x | *x | &x | n | a 1 op a a 2 b := true | false | not b | b 1 op b b 2 | a 1 op r a 2 S := x := a | *x := a | skip | S 1 ; S 2 | if b then S 1 else S 2 | while b do S

114 Concrete Semantics 1 for PWhile For every atomic statement S  S  : States1  States1  x := a  (  )=  [loc(x)  A  a   ]  x := &y  (  )  x := *y  (  )  x := y  (  )  *x := y  (  ) State1= [Loc  Loc  Z]

115 Points-To Analysis u Lattice L pt = u Galois connection

116 Abstract Semantics for PWhile For every atomic statement S  S  # : P(Var*  Var*)  P(Var*  Var*)  x := &y  #  x := *y  #  x := y  #  *x := y  #

117 t := &a; y := &b; z := &c; if x> 0; then p:= &y; else p:= &z; *p := t;

118 /*  */ t := &a; /* {(t, a)}*/ /* {(t, a)}*/ y := &b; /* {(t, a), (y, b) }*/ /* {(t, a), (y, b)}*/ z := &c; /* {(t, a), (y, b), (z, c) }*/ if x> 0; then p:= &y; /* {(t, a), (y, b), (z, c), (p, y)}*/ else p:= &z; /* {(t, a), (y, b), (z, c), (p, z)}*/ /* {(t, a), (y, b), (z, c), (p, y), (p, z)}*/ *p := t; /* {(t, a), (y, b), (y, c), (p, y), (p, z), (y, a), (z, a)}*/

119 Flow insensitive points-to-analysis Steengard 1996 u Ignore control flow u One set of points-to per program u Can be represented as a directed graph u Conservative approximation –Accumulate pointers u Can be computed in almost linear time

120 t := &a; y := &b; z := &c; if x> 0; then p:= &y; else p:= &z; *p := t;

121 Precision u We cannot usually have –  (CS) = DF on all programs u But can we say something about precision in all programs?

122 The Join-Over-All-Paths (JOP) u Let paths(v) denote the potentially infinite set paths from start to v (written as sequences of labels) u For a sequence of edges [e 1, e 2, …, e n ] define f [e 1, e 2, …, e n ]: L  L by composing the effects of basic blocks f [e 1, e 2, …, e n ](l) = f(e n ) (… (f(e 2 ) (f(e 1 ) (l)) …) u JOP[v] =  {f[e 1, e 2, …,e n ](  ) [e 1, e 2, …, e n ]  paths(v)}

123 JOP vs. Least Solution u The DF solution obtained by Chaotic iteration satisfies for every l: –JOP[v]  DF entry (v) u A function f is additive (distributive) if –f(  {x| x  X}) =  {f(x) |  X} u If every f l is additive (distributive) for all the nodes v –JOP[v] = DF entry (v)

124 Conclusions u Chaotic iterations is a powerful technique u Easy to implement u Rather precise u But expensive –More efficient methods exist for structured programs u Abstract interpretation relates runtime semantics and static information u The concrete semantics serves as a tool in designing abstractions –More intuition will be given in the sequel


Download ppt "1 Program Analysis Mooly Sagiv Tel Aviv University 640-6706 Textbook: Principles of Program Analysis."

Similar presentations


Ads by Google