Presentation is loading. Please wait.

Presentation is loading. Please wait.

Interprocedural Analysis Noam Rinetzky Mooly Sagiv.

Similar presentations


Presentation on theme: "Interprocedural Analysis Noam Rinetzky Mooly Sagiv."— Presentation transcript:

1 Interprocedural Analysis Noam Rinetzky Mooly Sagiv

2 Challenges in Interprocedural Analysis Respect call-return mechanism Handling recursion Local variables Parameter passing mechanisms The called procedure is not always known The source code of the called procedure is not always available Concurrency

3 Stack regime P() { … R(); … } R(){ … } Q() { … R(); … } call return call return P R

4 Guiding light Exploit stack regime  Precision  Efficiency

5 Simplifying Assumptions  Parameter passed by value  No OO  No nesting  No concurrency Recursion is supported

6 Topics Covered The trivial approach Procedure Inlining The naive approach The call string approach Functional Approach Valid paths Interprocedural Analysis via Graph Reachability Beyond reachability [Demand analysis] [Advanced Topics] –Karr’s analysis –Backward –MOPED

7 Constant Propagation Find variables with constant value at a given program location int p(int x){ return x * x; void main()} { int z; if (...) z = p(5) + 5; else z = p(4) + 4; printf (z); }

8 Example Program x = 5; y = 7; if (...) y = x + 2; z = x +y;

9 Undecidability Issues It is undecidable if a program point is reachable in some execution Some static analysis problems are undecidable even if the program conditions are ignored

10 The Constant Propagation Example while (...) { if (...) x_1 = x_1 + 1; if (...) x_2 = x_2 + 1;... if (...) x_n = x_n + 1; } y = truncate (1/ (1 + p 2 (x_1, x_2,..., x_n)) /* Is y=0 here? */

11 Conservative Solution Every detected constant is indeed constant –But may fail to identify some constants Every potential impact is identified –Superfluous impacts

12 The extra cost of procedures Call return complicates the analysis Unbounded stack memory Increases asymptotic complexity But can be tolerable in practice Sometimes even cheaper/more precise than intraprocedural analysis

13 A trivial treatment of procedure Analyze a single procedure After every call continue with conservative information –Global variables and local variables which “may be modified by the call” have unknown values Can be easily implemented Procedures can be written in different languages Procedure inline can help

14 Disadvantages of the trivial solution Modular (object oriented and functional) programming encourages small frequently called procedures Almost all information is lost

15 Procedure Inlining Inline called procedures Simple Does not handle recursion Exponential blow up p1 { call p2 … call p2 } p2 { call p3 … call p3 } p3{ }

16 A Naive Interprocedural solution Treat procedure calls as gotos Abstract call/return correlations Obtain a conservative solution Use chaotic iterations Usually fast

17 Simple Example int p(int a) { return a + 1; } void main() { int x ; x = p(7); x = p(9) ; }

18 Simple Example int p(int a) { [a  7] return a + 1; } void main() { int x ; x = p(7); x = p(9) ; }

19 Simple Example int p(int a) { [a  7] return a + 1; [a  7, $$  8] } void main() { int x ; x = p(7); x = p(9) ; }

20 Simple Example int p(int a) { [a  7] return a + 1; [a  7, $$  8] } void main() { int x ; x = p(7); [x  8] x = p(9) ; [x  8] }

21 Simple Example int p(int a) { [a  7] return a + 1; [a  7, $$  8] } void main() { int x ; x =l p(7); [x  8] x = p(9) ; [x  8] }

22 Simple Example int p(int a) { [a  ] return a + 1; [a  7, $$  8] } void main() { int x ; x = p(7); [x  8] x = p(9) ; [x  8] }

23 Simple Example int p(int a) { [a  ] return a + 1; [a , $$  ] } void main() { int x ; x = p(7); [x  8] x = p(9); [x  8] }

24 Simple Example int p(int a) { [a  ] return a + 1; [a , $$  ] } void main() { int x ; x = p(7) ; [x  ] x = p(9) ; [x  ] }

25 The Call String Approach The data flow value is associated with sequences of calls (call string) Use Chaotic iterations To guarantee termination limit the size of call string (typically 1 or 2) –Represents tails of calls Abstract inline

26 Simple Example int p(int a) { return a + 1; } void main() { int x ; c1: x = p(7); c2: x = p(9) ; }

27 Simple Example int p(int a) { c1: [a  7] return a + 1; } void main() { int x ; c1: x = p(7); c2: x = p(9) ; }

28 Simple Example int p(int a) { c1: [a  7] return a + 1; c1:[a  7, $$  8] } void main() { int x ; c1: x = p(7); c2: x = p(9) ; }

29 Simple Example int p(int a) { c1: [a  7] return a + 1; c1:[a  7, $$  8] } void main() { int x ; c1: x = p(7);  : x  8 c2: x = p(9) ; }

30 Simple Example int p(int a) { c1:[a  7] return a + 1; c1:[a  7, $$  8] } void main() { int x ; c1: x = p(7);  : [x  8] c2: x = p(9) ; }

31 Simple Example int p(int a) { c1:[a  7] c2:[a  9] return a + 1; c1:[a  7, $$  8] } void main() { int x ; c1: x = p(7);  : [x  8] c2: x = p(9) ; }

32 Simple Example int p(int a) { c1:[a  7] c2:[a  9] return a + 1; c1:[a  7, $$  8] c2:[a  9, $$  10] } void main() { int x ; c1: x = p(7);  : [x  8] c2: x = p(9) ; }

33 Simple Example int p(int a) { c1:[a  7] c2:[a  9] return a + 1; c1:[a  7, $$  8] c2:[a  9, $$  10] } void main() { int x ; c1: x = p(7);  : [x  8] c2: x = p(9) ;  : [x  10] }

34 Another Example int p(int a) { c1:[a  7] c2:[a  9] return c3: p1(a + 1); c1:[a  7, $$  ] c2:[a  9, $$  ] } void main() { int x ; c1: x = p(7);  : [x  8] c2: x = p(9) ;  : [x  10] } int p1(int b) { (c1|c2)c3:[b  ] return 2 * b; (c1|c2)c3:[b , $$  ] }

35 Recursion int p(int a) { c1: [a  7] if (…) { c1: [a  7] a = a -1 ; c1: [a  6] c2: p (a); a = a + 1;{ x = -2*a + 5; } void main() { c1: x = p(7); }

36 Recursion int p(int a) { c1: [a  7] c1.c2: [a  6] if (…) { c1: [a  7] a = a -1 ; c1: [a  6] c2: p (a); a = a + 1;} x = -2*a + 5; } void main() { c1: x = p(7); }

37 Recursion int p(int a) { c1: [a  7] c1.c2: [a  6] if (…) { c1: [a  7] c1.c2: [a  6] a = a -1 ; c1: [a  6] c2: p (a); a = a + 1;} x = -2*a + 5; } void main() { c1: x = p(7); }

38 Recursion int p(int a) { c1: [a  7] c1.c2: [a  6] if (…) { c1: [a  7] c1.c2: [a  6] a = a -1 ; c1: [a  6] c1.c2: [a  5] c2: p (a); a = a + 1;} x = -2*a + 5; } void main() { c1: x = p(7); }

39 Recursion int p(int a) { c1: [a  7] c1.c2+: [a   ] if (…) { c1: [a  7] c1.c2: [a  6] a = a -1 ; c1: [a  6] c1.c2: [a  5] c2: p (a); a = a + 1;} x = -2*a + 5; } void main() { c1: x = p(7); }

40 Recursion int p(int a) { c1: [a  7] c1.c2: [a   ] if (…) { c1: [a  7] c1.c2+: [a   ] a = a -1 ; c1: [a  6] c1.c2: [a  5] c2: p (a); a = a + 1;} x = -2*a + 5; } void main() { c1: x = p(7); }

41 Recursion int p(int a) { c1: [a  7] c1.c2+: [a   ] if (…) { c1: [a  7] c1.c2+: [a   ] a = a -1 ; c1: [a  6] c1.c2+: [a   ] c2: p (a); a = a + 1; x = -2*a + 5; } void main() { c1: x = p(7); }

42 int p(int a) { c1: [a  7] c1.c2+: [a   ] if (…) { c1: [a  7] c1.c2+: [a   ] a = a -1 ; c1: [a  6] c1.c2+: [a   ] c2: p (a); a = a + 1;} c1: [a  7] c1.c2+: [a   ] x = -2*a + 5; c1: [a  7, x  -9] c1.c2+: [a  , x  ] } void main() { c1: x = p(7); }

43 int p(int a) { c1: [a  7] c1.c2+: [a   ] if (…) { c1: [a  7] c1.c2+: [a   ] a = a -1 ; c1: [a  6] c1.c2+: [a   ] c2: p (a); c1.c2+: [a  , x  ] a = a + 1; } c1: [a  7] c1.c2+: [a   ] x = -2*a + 5; c1: [a  7, x  -9] c1.c2+: [a  , x  ] } void main() { c1: x = p(7); }

44 int p(int a) { c1: [a  7] c1.c2+: [a   ] if (…) { c1: [a  7] c1.c2+: [a   ] a = a -1 ; c1: [a  6] c1.c2+: [a   ] c2: p (a); c1.c2+: [a  , x  ] a = a + 1; c1.c2+: [a  , x  ] } c1: [a  7] c1.c2+: [a   ] x = -2*a + 5; c1: [a  7, x  -9] c1.c2+: [a  , x  ] } void main() { c1: x = p(7); }

45 int p(int a) { c1: [a  7] c1.c2+: [a   ] if (…) { c1: [a  7] c1.c2+: [a   ] a = a -1 ; c1: [a  6] c1.c2+: [a   ] c2: p (a); c1: [a  , x  ] a = a + 1; c1.c2+: [a  , x  ] c1: [a  , x  ] } c1: [a  7] c1.c2+: [a   ] x = -2*a + 5; c1: [a  , x  ] c1.c2+: [a  , x  ] } void main() { c1: x = p(7); }

46 int p(int a) { c1: [a  7] c1.c2+: [a   ] if (…) { c1: [a  7] c1.c2+: [a   ] a = a -1 ; c1: [a  6] c1.c2+: [a   ] c2: p (a); c1.c2*: [a   ] a = a + 1; c1.c2*: [a   ] } c1.c2*: [a   ] x = -2*a + 5; c1.c2*: [a  , x  ] } void main() { c1: p(7);  : [x  ] }

47 Special cases and Extensions Finite distributive lattice –There exists a bound on the call string which leads to join over all valid paths –Solution seemingly unsafe –Decomposable problem (Independent attributes) L = L 1 x L 2 x … x L k F = F 1 x F 2 x … x F k Cost is quadratic in max |L i | Kill-Gen is a special case Abstract call strings –Generalizes k-limiting of suffixes

48 Summary Call String Easy to implement Efficient for small call strings For finite domains can be precise even with recursion Limited precision Order of calls can be abstracted Related method: procedure cloning

49 The Functional Approach The meaning of a procedure is mapping from states into states The abstract meaning of a procedure is function from an abstract state to abstract states Relation between input and output

50 The Functional Approach Two phase algorithm –Compute the dataflow solution at the exit of a procedure as a function of the initial values at the procedure entry (functional values) –Compute the dataflow values at every point using the functional values

51 Phase 1 int p(int a) { [a  a 0, x  x 0 ] if (…) { a = a -1 ; p (a); a = a + 1; { x = -2*a + 5; [a  a 0, x  -2a 0 + 5] } void main() { p(7); }

52 Phase 1 int p(int a) { [a  a 0, x  x 0 ] if (…) { a = a -1 ; p (a); a = a + 1; { [a  a 0, x  x 0 ] x = -2*a + 5; } void main() { p(7); }

53 Phase 1 int p(int a) { [a  a 0, x  x 0 ] if (…) { a = a -1 ; p (a); a = a + 1; { [a  a 0, x  x 0 ] x = -2*a + 5; [a  a 0, x  -2a 0 + 5] } void main() { p(7); }

54 Phase 1 int p(int a) { [a  a 0, x  x 0 ] if (…) { a = a -1 ; [a  a 0 -1, x  x 0 ] p (a); a = a + 1; { [a  a 0, x  x 0 ] x = -2*a + 5; [a  a 0, x  -2a 0 + 5] } void main() { p(7); }

55 Phase 1 int p(int a) { [a  a 0, x  x 0 ] if (…) { a = a -1 ; [a  a 0 -1, x  x 0 ] p (a); [a  a 0 -1, x  ] a = a + 1; { [a  a 0, x  x 0 ] x = -2*a + 5; [a  a 0, x  -2a 0 + 5] } void main() { p(7); }

56 Phase 1 int p(int a) { [a  a 0, x  x 0 ] if (…) { a = a -1 ; [a  a 0 -1, x  x 0 ] p (a); [a  a 0 -1, x  ] a = a + 1; [a  a 0, x  ] { [a  a 0, x  x 0 ] x = -2*a + 5; [a  a 0, x  -2a 0 + 5] } void main() { p(7); }

57 Phase 1 int p(int a) { [a  a 0, x  x 0 ] if (…) { a = a -1 ; [a  a 0 -1, x  x 0 ] p (a); [a  a 0 -1, x  -2(a 0 -1) + 5] a = a + 1; [a  a 0, x  -2(a 0 -1) + 5] { [a  a 0, x  ] x = -2*a + 5; [a  a 0, x  -2a 0 + 5] } void main() { p(7); }

58 Phase 2 int p(int a) { if (…) { a = a -1 ; p (a); a = a + 1; { x = -2*a + 5; } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

59 Phase 2 int p(int a) { [a  7, x  0] if (…) { a = a -1 ; p (a); a = a + 1; { x = -2*a + 5; } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

60 Phase 2 int p(int a) { [a  7, x  0] if (…) { a = a -1 ; p (a); a = a + 1; { [a  7, x  0] x = -2*a + 5; } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

61 Phase 2 int p(int a) { [a  7, x  0] if (…) { a = a -1 ; p (a); a = a + 1; { [a  7, x  0] x = -2*a + 5; [a  7, x  -9] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

62 Phase 2 int p(int a) { [a  7, x  0] if (…) { a = a -1 ; [a  6, x  0] p (a); a = a + 1; { [a  7, x  0] x = -2*a + 5; [a  7, x  -9] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

63 Phase 2 int p(int a) { [a  7, x  0] if (…) { a = a -1 ; [a  6, x  0] p (a); [a  6, x  -9] a = a + 1; { [a  7, x  0] x = -2*a + 5; [a  7, x  -9] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

64 Phase 2 int p(int a) { [a  7, x  0] if (…) { a = a -1 ; [a  6, x  0] p (a); [a  6, x  -9] a = a + 1; [a  7, x  -9] { [a  7, x  0] x = -2*a + 5; [a  7, x  -9] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

65 Phase 2 int p(int a) { [a  7, x  0] if (…) { a = a -1 ; [a  6, x  0] p (a); [a  6, x  -9] a = a + 1; [a  7, x  -9] { [a  7, x  ] x = -2*a + 5; [a  7, x  -9] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

66 Phase 2 int p(int a) { [a  7, x  0] if (…) { a = a -1 ; [a  6, x  0] p (a); [a  6, x  -9] a = a + 1; [a  7, x  -9] { [a  7, x  ] x = -2*a + 5; [a  7, x  -9] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

67 Phase 2 int p(int a) { [a  7, x  0] [a  6, x  0] if (…) { a = a -1 ; [a  6, x  0] p (a); [a  6, x  -9] a = a + 1; [a  7, x  -9] { [a  7, x  ] x = -2*a + 5; [a  7, x  -9] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

68 Phase 2 int p(int a) { [a , x  0] if (…) { a = a -1 ; [a  6, x  0] p (a); [a  6, x  -9] a = a + 1; [a  7, x  -9] { [a  7, x  ] x = -2*a + 5; [a  7, x  -9] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

69 Phase 2 int p(int a) { [a , x  0] if (…) { a = a -1 ; [a , x  0] p (a); [a  6, x  -9] a = a + 1; [a  7, x  -9] { [a  7, x  ] x = -2*a + 5; [a  7, x  -9] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

70 Phase 2 int p(int a) { [a , x  0] if (…) { a = a -1 ; [a , x  0] p (a); [a , x  ] a = a + 1; [a  7, x  -9] { [a  7, x  ] x = -2*a + 5; [a  7, x  -9] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

71 Phase 2 int p(int a) { [a , x  0] if (…) { a = a -1 ; [a , x  0] p (a); [a , x  ] a = a + 1; [a , x  ] { [a  7, x  ] x = -2*a + 5; [a  7, x  -9] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

72 Phase 2 int p(int a) { [a , x  0] if (…) { a = a -1 ; [a , x  0] p (a); [a , x  ] a = a + 1; [a , x  ] { [a , x  ] x = -2*a + 5; [a  7, x  -9] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

73 Phase 2 int p(int a) { [a , x  0] if (…) { a = a -1 ; [a , x  0] p (a); [a , x  ] a = a + 1; [a , x  ] { [a , x  ] x = -2*a + 5; [a , x  ] } void main() { p(7); [x  -9] } p(a 0,x 0 ) = [a  a 0, x  -2a 0 + 5]

74 Summary Functional approach Computes procedure abstraction Sharing between different contexts Rather precise Recursive procedures may be more precise/efficient than loops But requires more from the implementation –Representing relations –Composing relations

75 Interprocedural analysis sRsR n1n1 n3n3 n2n2 f1f1 f3f3 f2f2 f1f1 eReR f3f3 sPsP n5n5 n4n4 f1f1 ePeP f5f5 P R Call Q f c2e f x2r Call node Return node Entry node Exit node sQsQ n7n7 n6n6 f1f1 eQeQ f6f6 Q Call Q Call node Return node f c2e f x2r f1f1 Supergraph

76 paths(n) the set of paths from s to n –( (s,n 1 ), (n 1,n 3 ), (n 3,n 1 ) ) Paths s n1n1 n3n3 n2n2 f1f1 f1f1 f3f3 f2f2 f1f1 e f3f3

77 Interprocedural Valid Paths () f1f1 f2f2 f k-1 fkfk f3f3 f4f4 f5f5 f k-2 f k-3 call q enter q exit q ret IVP: all paths with matching calls and returns –And prefixes

78 Interprocedural Valid Paths IVP set of paths –Start at program entry Only considers matching calls and returns –aka, valid Can be defined via context free grammar –matched ::= matched ( i matched ) i | ε –valid ::= valid ( i matched | matched paths can be defined by a regular expression

79 Join Over All Paths (JOP) start n i JOP[v] =  {[[e 1, e 2, …,e n ]](  ) | (e 1, …, e n )  paths(v)} JOP  LFP –Sometimes JOP = LFP precise up to “symbolic execution” Distributive problem  L  L

80 The Join-Over-Valid-Paths (JVP) vpaths(n) all valid paths from program start to n JVP[n] =  {[[e 1, e 2, …, e]](  ) (e 1, e 2, …, e)  vpaths(n)} JVP  JFP –In some cases the JVP can be computed –(Distributive problem)

81 Main idea functional approach Iterate on the abstract domain of functions from L to L Two phase algorithm –Compute the dataflow solution at the exit of a procedure as a function of the initial values at the procedure entry (functional values) –Compute the dataflow values at every point using the functional values Computes JVP for distributive problems

82 Example: Constant propagation L = Var  N  { ,  } Domain: F:L  L –(f 1  f 2 )(x) = f 1 (x)  f 2 (x) x=7 y=x+1 x=y  env.env[x  7]  env.env[y  env(x)+1] Id=  env  L.env  env.env[x  7] ○  env.env  env.env[y  env(x)+1] ○  env.env[x  7] ○  env.env

83 Example: Constant propagation L = Var  N  { ,  } Domain: F:L  L –(f 1  f 2 )(x) = f 1 (x)  f 2 (x) x=7y=x+1 x=y  env.env[x  7]  env.env[y  env(x)+1] Id=  env.env  env.env[y  env(x)+1]   env.env[x  7]

84 init a=7 9 call p 10 call p 11 print(x) 12 p1p1 If( … ) 2 a=a-1 3 call p 4 call p 5 a=a+1 6 x=-2a+5 7 end 8 begin 0 end 13 NFunction 0 e.[x  e(x), a  e(a)]=id 1 3-13 e. 

85 Phase 1 NFunction 1 e. [x  e(x), a  e(a)]=id 2id 7 8 e.[x  -2e(a)+5, a  e(a)] 3id 4 e.[x  e(x), a  e(a)-1] 5 f 8 ○ e.[x  e(x), a  e(a)-1] = e.[x  -2(e(a)-1)+5, a  e(a)-1] 6 7 e.[x  -2(e(a)-1)+5, a  e(a)]  e.[x  e(x), a  e(a)] 8 a, x.[x  -2e(a)+5, a  e(a)] a=7 9 call p 10 call p 11 print(x) 12 p1p1 If( … ) 2 a=a-1 3 call p 4 call p 5 a=a+1 6 x=-2a+5 7 end 8 begin 0 end 13 0 e.[x  e(x), a  e(a)]=id 10 e.[x  e(x), a  7] 11 a, x.[x  -2e(a)+5, a  e(a)] ○ f 10

86 Phase 2 NFunction 1[x  0, a  7] 2 7 8[x  -9, a  7] 3[x  0, a  7] 4[x  0, a  6] 1[x  -7, a  6] 6[x  -7, a  7] 7[x , a  7] 8[x  -9, a  7] 1[x , a  ] a=7 9 Call p 10 Call p 11 print(x) 12 p1p1 If( … ) 2 a=a-1 3 Call p 4 Call p 5 a=a+1 6 x=-2a+5 7 end 8 begin 0 end 13 0[x  0, a  0] 10[x  0, a  7] 11[x  -9, a  7]

87 Issues in Functional Approach How to guarantee that finite height for functional lattice? –It may happen that L has finite height and yet the lattice of monotonic function from L to L do not Efficiently represent functions –Functional join –Functional composition –Testing equality

88 Knoop and Steffen, CC’92 Extends the functional approach –Adds parameters, local variables –Information at return-site on Globals: comes from return-site (SP81) Locals: comes from call-site –Combiner: [[R n ]]: (C  C)  C Coincidence theorem –Analysis computes (approximates) JVP [[n]]:C  C are distributive (monotonic) [[R n ]]: (C  C)  C are distributive (monotonic) –Theorem more general

89 Knoop and Steffen, CC’92 Extends the functional approach –Adds parameters, local variables Approach: –Lifts abstract semantics to manipulate unbounded stacks of abstract values C lattice  STACK of C statement n on stk  [[n]](stk) = push( pop(skt), [[n]](top(stk)) ) Call p on stk  push( stk, [[call-to-entry p]](top(stk)) ) return from stk  push( pop(pop(stk)),R n (top(pop(stk)),top(stk)) ) (*)

90 Coincidence theorem Analyze program with domain STACK  STACK Shows that stacks are always at height  2 Analysis computes (approximates) JVP –[[n]]:C  C are distributive (monotonic) –[[R n ]]: (C  C)  C are distributive (monotonic) C  C is a lattice f : (C  C)  (C  C) is distributive if f(c,c’)  f(c’’,c’’’) = f((c, c’)  (c’’, c’’’) ) is distributive

91 CFL-Graph reachability Special cases of functional analysis Finite distributive lattices Provides more efficient analysis algorithms Reduce the interprocedural analysis problem to finding context free reachability

92 IDFS / IDE IDFS Interprocedural Distributive Finite Subset Precise interprocedural dataflow analysis via graph reachability. Reps, Horowitz, and Sagiv, POPL’95 IDE Interprocedural Distributive Environment Precise interprocedural dataflow analysis with applications to constant propagation. Reps, Horowitz, and Sagiv, FASE’95, TCS’96 –More general solutions exist

93 Possibly Uninitialized Variables Startx = 3 if... y = x y = w w = 8 printf(y) {w,x,y} {w,y} {w} {w,y} {} {w,y} {}

94 Efficiently Representing Functions Let f:2 D  2 D be a distributive function Then: f(X) = f(  )   {z  X: f({z}) }

95 Representing Dataflow Functions Identity Function Constant Function a bc a bc

96 “Gen/Kill” Function Non-“Gen/Kill” Function a bc a bc Representing Dataflow Functions

97 x = 3 p(x,y) return from p printf(y) start main exit main start p(a,b) if... b = a p(a,b) return from p printf(b) exit p xy a b

98 a bcbc a Composing Dataflow Functions bc a

99 x = 3 p(x,y) return from p start main exit main start p(a,b) if... b = a p(a,b) return from p exit p xy a b printf(y) Might b be uninitialized here? printf(b) NO! ( ] Might y be uninitialized here? YES! ( )

100 Interprocedural Dataflow Analysis via CFL-Reachability Graph: Exploded control-flow graph L: L(unbalLeft) –unbalLeft = valid Fact d holds at n iff there is an L(unbalLeft)- path from

101 Asymptotic Running Time CFL-reachability –Exploded control-flow graph: ND nodes –Running time: O(N 3 D 3 ) Exploded control-flow graph Special structure Running time: O(ED 3 ) Typically: E l N, hence O(ED 3 ) l O(ND 3 ) “Gen/kill” problems: O(ED)

102 a bcbc a Dataflow Flow Functions bc a

103 x = 3 p(x,y) return from p start main exit main start p(a,b) if... b = a p(a,b) return from p exit p xy a b printf(y) Might b be uninitialized here? printf(b) NO! ( ] Might y be uninitialized here? YES! ( )

104 Local variables Global information: –Exit-to-return edges Local information: –Call-to-return edges Can be generalized to handle binary combiner functions –Rinetzky, Sagiv, and Yahav, SAS’05

105 IDE Goes beyond IFDS problems –Can handle unbounded domains Requires special form of the domain Can be much more efficient than IFDS

106 Domain of environments Environments: Env(D,L) = D  L –D finite set of program symbols –L finite height lattice E.g., map from variables (D) to values (L) e  e’ =  d.(e(d)  e’(d) ) Top in Env(D,L) is  d. 

107 Environment transformers t: Env(D,L)  Env(D,L) –t should be distributive  d  D: (  {t(e)|e  E})(d) = (t(  {(e) | e  E}))(d) Must hold for inifinite sets E  Env(D,L)

108 IDE Problem IP=(G*,D,L,M) –G* supergraph –D, L as above –M assigns distributive transformers to E*

109

110 Point-wise representation of environments IFDS with “weights” Bi-partite graph –D  {  }  D  {  } –R t (d’,d)=  l. … t (  d. ) macro-function R t (d’,d) (  l.) micro-function a x  l.l - 2  l.l x = a - 2 [[R t ]](env)(d)= f( … R t (, ) …)(d) t = [[R t ]]

111 Point-wise representation of environments  = top  = bottom  = join

112 Point-wise representation of environments [[R t ]](env)(d)=R t ( ,d)(  )  (  d’  D R t (d’,d)(env(d’))) t = [[R t ]]  = bottom  = top

113 Example Linear Constant Propagation Consider the constant propagation lattice The value of every variable y at the program exit can be represented by: y =  {(a x x + b x )| x  Var * }  c a x,c  Z  { ,  } b x  Z Supports efficient composition and “functional” join –[z := a * y + b] –What about [z:=x+y]?

114 Linear constant propagation Point-wise representation of environment transformers

115 IDE Analysis Point-wise representation closed under composition CFL-Reachability on the exploded graph Compose functions

116

117

118 Costs O(ED 3 ) Class of value transformers F  L  L –id  F –Finite height Representation scheme with (efficient) Application Composition Join Equality Storage

119 New Techniques Representing affine relations –a 1 * x 1 + a 2 * x 2 + … + a n * x n Backward summary computations

120 Conclusion Handling functions is crucial for abstract interpretation Virtual functions and exceptions complicate things But scalability is an issue –Small call strings –Small functional domains –Demand analysis


Download ppt "Interprocedural Analysis Noam Rinetzky Mooly Sagiv."

Similar presentations


Ads by Google