# Precise Interprocedural Analysis using Random Interpretation

Sumit Gulwani, George Necula (UC Berkeley)

1 Random Interpretation = Random Testing + Abstract Interpretation

- Almost as simple as random testing, but with better soundness guarantees.
- Almost as sound as abstract interpretation, but more precise, efficient, and simple.

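2 Example

The program is a flow graph with two nondeterministic conditionals (each marked *), followed by two assertions:

- First conditional: the True branch executes a := 0; b := i; the False branch executes a := i - 2; b := 2.
- Second conditional: the True branch executes c := b - a; d := i - 2b; the False branch executes c := 2a + b; d := b - 2i.
- After the second join: assert(c + d = 0); assert(c = a + i).

Three ways to check the assertions:

- Random testing needs to execute all 4 paths to verify the assertions.
- Abstract interpretation analyzes each statement once but uses complicated operations.
- Random interpretation simply executes the program once (and captures the effect of all paths).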
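To make the "all 4 paths" point concrete, here is a minimal sketch of the random-testing view of this example. It assumes the branch labelling read off the slide (first listed branch is True); the helper names `run_path` and `check_all_paths` are hypothetical and not part of the presentation.

```python
import itertools
import random

def run_path(i, first_true, second_true):
    """Execute one concrete path of the example program; return (a, c, d)."""
    if first_true:
        a, b = 0, i
    else:
        a, b = i - 2, 2
    if second_true:
        c, d = b - a, i - 2 * b
    else:
        c, d = 2 * a + b, b - 2 * i
    return a, c, d

def check_all_paths(i):
    """Map each of the 4 paths to (c + d == 0, c == a + i)."""
    results = {}
    for first, second in itertools.product([True, False], repeat=2):
        a, c, d = run_path(i, first, second)
        results[(first, second)] = (c + d == 0, c == a + i)
    return results

# One random test input; a single path would not be enough to judge
# the assertions, so every path is executed.
i = random.randint(3, 1000)
for path, (first_ok, second_ok) in check_all_paths(i).items():
    print(path, first_ok, second_ok)
```

With any input other than i = 2, only the path that takes the False branch first and the True branch second violates the second assertion, which is why a tester that misses that path would wrongly accept it.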

3 Outline

- Framework for intraprocedural random interpretation
  - Advantages: all analyses can be investigated using one framework, and the design and proof of new analyses become simpler.
- A generic algorithm for interprocedural analysis

4 Outline

- Framework for intraprocedural random interpretation
  - Affine join function
  - Eval function
  - Example
- A generic algorithm for interprocedural analysis

5 Random Interpretation framework

Goal: detect equivalences of expressions.

Generic algorithm:

- Choose random values for input variables.
- Execute assignments, using the Eval function to evaluate expressions.
- Execute both branches of conditionals and combine the program states at join points, using the Affine Join function.
- Compare values of expressions to decide equality.

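6 Affine Join function

Used for combining program states at join points.

$\phi_w : \mathit{State} \times \mathit{State} \to \mathit{State}$

Let $\rho = \phi_w(\rho_1, \rho_2)$. Then $\rho(y) \stackrel{\text{def}}{=} w \times \rho_1(y) + (1-w) \times \rho_2(y)$.

Example:

- $\rho_1$: [a=2, b=3], the state after a := 2; b := 3;
- $\rho_2$: [a=4, b=1], the state after a := 4; b := 1;
- $\rho = \phi_7(\rho_1, \rho_2)$: [a = 7·2 + (1-7)·4, b = 7·3 + (1-7)·1], i.e. [a=-10, b=15]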
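The definition above translates almost directly into code. The following is a minimal sketch, assuming states are plain dictionaries from variable names to integers; the function name `affine_join` is a hypothetical choice.

```python
def affine_join(w, state1, state2):
    """Combine two states variable by variable: w * s1[y] + (1 - w) * s2[y]."""
    return {y: w * state1[y] + (1 - w) * state2[y] for y in state1}

rho1 = {"a": 2, "b": 3}   # state after a := 2; b := 3
rho2 = {"a": 4, "b": 1}   # state after a := 4; b := 1
rho = affine_join(7, rho1, rho2)
print(rho)  # {'a': -10, 'b': 15}
```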

7 Properties of Affine Join

- Affine join preserves common linear relationships, e.g. a+b=5.
- It does not introduce false relationships w.h.p. (with high probability).

Example:

- $\rho_1$: [a=2, b=3], the state after a := 2; b := 3;
- $\rho_2$: [a=4, b=1], the state after a := 4; b := 1;
- $\rho = \phi_7(\rho_1, \rho_2)$: [a = 7·2 + (1-7)·4, b = 7·3 + (1-7)·1], i.e. [a=-10, b=15]
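Both properties can be checked by brute force on the two states of this slide. The sketch below is illustrative only: it tries every weight in a small hypothetical set of size `Q_SIZE`, counts how often the common relationship a+b=5 survives, and records the weights for which the false relationship a=2 (true in only one of the two states) would appear to hold.

```python
def affine_join(w, state1, state2):
    """Combine two states variable by variable: w * s1[y] + (1 - w) * s2[y]."""
    return {y: w * state1[y] + (1 - w) * state2[y] for y in state1}

rho1 = {"a": 2, "b": 3}
rho2 = {"a": 4, "b": 1}

Q_SIZE = 101  # hypothetical size of the set the weight is drawn from

preserved = 0
false_positive_weights = []
for w in range(Q_SIZE):
    rho = affine_join(w, rho1, rho2)
    if rho["a"] + rho["b"] == 5:   # holds in both input states
        preserved += 1
    if rho["a"] == 2:              # holds only in rho1
        false_positive_weights.append(w)

print(preserved)               # number of weights that keep a + b = 5
print(false_positive_weights)  # weights that wrongly suggest a = 2
```

The common relationship survives for every weight, while the false one survives for a single unlucky weight, so under these assumptions a uniformly random weight introduces it with probability 1/`Q_SIZE`.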

8 Eval function

$\mathit{Eval} : \mathit{Expression} \times \mathit{State} \to \mathit{Value}$

- Used for executing expressions.
- Defined in terms of $\mathit{Poly} : \mathit{Expression} \to \mathit{Polynomial}$. Poly is abstraction specific.
- $\mathit{Eval}(e, \rho)$ = evaluation of $\mathit{Poly}(e)$ using $\rho$ and random choices for non-program variables.

Poly must satisfy:

- Correctness: $\mathit{Poly}(e_1) = \mathit{Poly}(e_2)$ iff $e_1 = e_2$.
- Linearity: $\mathit{Poly}(e)$ is linear in program variables.

9 Example of Poly function

Linear Arithmetic (POPL 2003)

- Expression: $e ::= y \mid e_1 \pm e_2 \mid c \cdot e$
- $\mathit{Poly}(e) = e$

Uninterpreted Functions (POPL 2004)

- Expression: $e ::= y \mid F(e)$
- $\mathit{Poly}(y) = y$
- $\mathit{Poly}(F(e)) = a \times \mathit{Poly}(e) + b$
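As an illustration of how Eval could be built from Poly for unary uninterpreted functions, here is a minimal sketch. Several details are assumptions not stated on the slide: the coefficients a and b are treated as random non-program variables chosen once per function symbol, arithmetic is done modulo a fixed prime `P`, expressions are encoded as a variable name or a `(function_name, argument)` tuple, and the class name `UFInterpreter` is hypothetical.

```python
import random

P = 2_147_483_647  # prime modulus 2**31 - 1 (assumed; the slides do not fix a field)

class UFInterpreter:
    """Evaluates e ::= y | F(e) via Poly(y) = y, Poly(F(e)) = a * Poly(e) + b."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.coeffs = {}  # function symbol -> (a, b), chosen once and reused

    def _coeffs_for(self, fname):
        if fname not in self.coeffs:
            # a is kept non-zero so that F never collapses to a constant
            self.coeffs[fname] = (self.rng.randrange(1, P), self.rng.randrange(P))
        return self.coeffs[fname]

    def eval(self, expr, state):
        if isinstance(expr, str):          # program variable y
            return state[expr] % P
        fname, arg = expr                  # application F(e)
        a, b = self._coeffs_for(fname)
        return (a * self.eval(arg, state) + b) % P

interp = UFInterpreter(seed=0)
state = {"x": 10, "y": 10, "z": 11}
same = interp.eval(("F", ("G", "x")), state) == interp.eval(("F", ("G", "y")), state)
diff = interp.eval(("F", "x"), state) != interp.eval(("F", "z"), state)
print(same, diff)  # True True
```

Equal arguments always produce equal values, and because a is non-zero modulo the prime, distinct argument values under the same function symbol always produce distinct values.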

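10 Example: Random Interpretation for Linear Arithmetic

The example program is run once with the random input i=3 and the random join weights w1 = 5 and w2 = 2:

- Input: i=3.
- First conditional, True branch (a := 0; b := i): i=3, a=0, b=3.
- First conditional, False branch (a := i-2; b := 2): i=3, a=1, b=2.
- Join with w1 = 5: i=3, a=-4, b=7.
- Second conditional, True branch (c := b - a; d := i - 2b): i=3, a=-4, b=7, c=11, d=-11.
- Second conditional, False branch (c := 2a + b; d := b - 2i): i=3, a=-4, b=7, c=-1, d=1.
- Join with w2 = 2: i=3, a=-4, b=7, c=23, d=-23.
- assert (c+d = 0) passes, since 23 + (-23) = 0. assert (c = a+i) fails, since 23 ≠ -4 + 3.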
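The run on this slide can be replayed in a few lines. This is a sketch that hard-codes the example program rather than interpreting an arbitrary flow graph; `affine_join` and `interpret` are hypothetical names, and the input and weights are passed in so that the slide's values (i=3, w1=5, w2=2) can be reproduced exactly.

```python
def affine_join(w, state1, state2):
    """Combine two states variable by variable: w * s1[y] + (1 - w) * s2[y]."""
    return {y: w * state1[y] + (1 - w) * state2[y] for y in state1}

def interpret(i, w1, w2):
    """Execute both branches of each conditional and join the resulting states."""
    true1 = {"i": i, "a": 0, "b": i}          # a := 0; b := i
    false1 = {"i": i, "a": i - 2, "b": 2}     # a := i - 2; b := 2
    s = affine_join(w1, true1, false1)

    true2 = dict(s, c=s["b"] - s["a"], d=s["i"] - 2 * s["b"])       # c := b - a; d := i - 2b
    false2 = dict(s, c=2 * s["a"] + s["b"], d=s["b"] - 2 * s["i"])  # c := 2a + b; d := b - 2i
    return affine_join(w2, true2, false2)

final = interpret(i=3, w1=5, w2=2)
print(final)                                # {'i': 3, 'a': -4, 'b': 7, 'c': 23, 'd': -23}
print(final["c"] + final["d"] == 0)         # True
print(final["c"] == final["a"] + final["i"])  # False
```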

11 Outline

- Framework for intraprocedural random interpretation
  - Affine join function
  - Eval function
  - Example
- A generic algorithm for interprocedural analysis
  - Random summary (Idea #1)
  - Issue of freshness (Idea #2)
  - Error probability and complexity
  - Experiments

12 Example

The same program, run with i=3, w1 = 5 and w2 = 2:

- First conditional: a := 0; b := i gives i=3, a=0, b=3 (True); a := i-2; b := 2 gives i=3, a=1, b=2 (False). Join with w1 = 5: i=3, a=-4, b=7.
- Second conditional: c := b - a; d := i - 2b gives c=11, d=-11 (True); c := 2a + b; d := b - 2i gives c=-1, d=1 (False). Join with w2 = 2: i=3, a=-4, b=7, c=23, d=-23.
- Assertions: assert (c+d = 0); assert (c = a+i).

The second assertion is true in the context i=2. We need two new ideas to make the analysis interprocedural.

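13 Idea #1: Keep input variables symbolic

- Do not choose random values for input variables (so that the result can later be instantiated by any context).
- The resulting program state at the end is a random summary.

The same program, run with i kept symbolic (w1 = 5, w2 = 2):

- First conditional: a := 0; b := i gives a=0, b=i (True); a := i-2; b := 2 gives a=i-2, b=2 (False). Join with w1 = 5: a=8-4i, b=5i-8.
- Second conditional: c := b - a; d := i - 2b gives c=9i-16, d=16-9i (True); c := 2a + b; d := b - 2i gives c=8-3i, d=3i-8 (False). Join with w2 = 2: a=8-4i, b=5i-8, c=21i-40, d=40-21i.
- Instantiating the summary in the context i=2 gives a=0, b=2, c=2, d=-2, so both assert (c+d = 0) and assert (c = a+i) hold in this context.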
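A symbolic run like this one only needs values that are linear in i. The sketch below represents each value as a small hypothetical `Lin` class (coefficient of i plus a constant), reruns the example with the slide's weights, and then instantiates the summary at i=2. The class and helper names are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lin:
    """Linear polynomial coef * i + const in the symbolic input i."""
    coef: int
    const: int

    def __add__(self, other):
        return Lin(self.coef + other.coef, self.const + other.const)

    def __sub__(self, other):
        return Lin(self.coef - other.coef, self.const - other.const)

    def scale(self, k):
        return Lin(k * self.coef, k * self.const)

    def at(self, i):
        return self.coef * i + self.const

I = Lin(1, 0)  # the symbolic input i

def const(k):
    return Lin(0, k)

def join(w, s1, s2):
    """Affine join on symbolic states."""
    return {y: s1[y].scale(w) + s2[y].scale(1 - w) for y in s1}

def summarize(w1, w2):
    true1 = {"a": const(0), "b": I}
    false1 = {"a": I - const(2), "b": const(2)}
    s = join(w1, true1, false1)
    true2 = dict(s, c=s["b"] - s["a"], d=I - s["b"].scale(2))
    false2 = dict(s, c=s["a"].scale(2) + s["b"], d=s["b"] - I.scale(2))
    return join(w2, true2, false2)

summary = summarize(5, 2)   # a = 8 - 4i, b = 5i - 8, c = 21i - 40, d = 40 - 21i
context = {y: p.at(2) for y, p in summary.items()}
print(context)  # {'a': 0, 'b': 2, 'c': 2, 'd': -2}
```

Because the summary is computed once and only evaluated per context, the same object can be reused for any other value of i.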

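14 Idea #2: Generate fresh summaries

Procedure P (Input: i): a nondeterministic conditional (*) executes either x := i+1 or x := 3; then return x.

- The branch states are x = i+1 and x = 3. Joining them with weight w = 5 gives the summary x = 5i-7.

Procedure Q: u := P(2); v := P(1); w := P(1); Assert (u = 3); Assert (v = w);

- Using the same summary for all three calls gives u = 5·2 - 7 = 3, v = 5·1 - 7 = -2, w = 5·1 - 7 = -2.
- Assert (u = 3) is correctly reported as true. Assert (v = w) is also reported as true, although it is false: P(1) may return either 2 or 3.

Plugging the same summary twice is unsound. Fresh summaries can be generated by random affine combination of few independent summaries!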
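The unsoundness is easy to reproduce. In this sketch a summary of P is a pair `(coef, const)` standing for coef·i + const. For simplicity the "fresh" summaries here are drawn with independent join weights (5, -2 and 4, chosen arbitrarily); that is a stand-in for illustration, not the affine-combination scheme the slides go on to describe. All function names are hypothetical.

```python
def summary_of_P(w):
    """Summary of P for join weight w: w * (i + 1) + (1 - w) * 3, as (coef, const)."""
    return (w, w + 3 * (1 - w))

def call(summary, arg):
    """Instantiate a summary of P at the actual argument."""
    coef, const = summary
    return coef * arg + const

def run_Q(s_u, s_v, s_w):
    """Evaluate Q's two assertions using one summary of P per call site."""
    u, v, w = call(s_u, 2), call(s_v, 1), call(s_w, 1)
    return {"u == 3": u == 3, "v == w": v == w}

shared = summary_of_P(5)                 # x = 5i - 7, as on the slide
reused = run_Q(shared, shared, shared)   # same summary at every call site
fresh = run_Q(summary_of_P(5), summary_of_P(-2), summary_of_P(4))

print(reused)  # {'u == 3': True, 'v == w': True}   <- second verdict is wrong
print(fresh)   # {'u == 3': True, 'v == w': False}
```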

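15 Generating 2 random summaries for P

Procedure P (Input: i): a nondeterministic conditional (*) executes either x := i+1 or x := 3; then return x.

- Branch states, one component per summary: x = [i+1, i+1] and x = [3, 3].
- Joining with weights w = [5, -2] gives the two summaries x = [5i-7, 7-2i].

Fresh summaries are affine combinations of these two, where $\phi_w$ denotes the affine join with weight w:

- x = $\phi_7$(5i-7, 7-2i) = 47i-91
- x = $\phi_6$(5i-7, 7-2i) = 40i-77
- x = $\phi_3$(5i-7, 7-2i) = 19i-35
- x = $\phi_0$(5i-7, 7-2i) = 7-2i
- x = $\phi_5$(5i-7, 7-2i) = 33i-63
- x = $\phi_1$(5i-7, 7-2i) = 5i-7

Procedure Q calls P 3 times. Hence, generating 2 random summaries for Q requires 2 × 3 fresh summaries of P.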

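16 Generating 2 random summaries for Q

Procedure Q: u := P(2); v := P(1); w := P(1); Assert (u = 3); Assert (v = w);

Each call site uses its own pair of fresh summaries of P:

- u = [47·2-91, 40·2-77] = [3, 3]
- v = [19·1-35, 7-2·1] = [-16, 5]
- w = [33·1-63, 5·1-7] = [-30, -2]

In both summaries u = 3 holds and v = w does not.

The fresh summaries of P used above, where $\phi_w$ denotes the affine join with weight w:

- x = $\phi_7$(5i-7, 7-2i) = 47i-91
- x = $\phi_6$(5i-7, 7-2i) = 40i-77
- x = $\phi_3$(5i-7, 7-2i) = 19i-35
- x = $\phi_0$(5i-7, 7-2i) = 7-2i
- x = $\phi_5$(5i-7, 7-2i) = 33i-63
- x = $\phi_1$(5i-7, 7-2i) = 5i-7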
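The arithmetic on the last two slides can be replayed as follows. This sketch takes the two base summaries 5i-7 and 7-2i as given, forms fresh summaries as affine combinations of them, and evaluates Q once per summary. The weight 3 is used for the third combination because that is the value that reproduces 19i-35. Summaries are `(coef, const)` pairs and all names are hypothetical.

```python
BASE = [(5, -7), (-2, 7)]  # the two base summaries of P: 5i - 7 and 7 - 2i

def fresh(w, base=BASE):
    """Affine combination w * s1 + (1 - w) * s2 of the two base summaries."""
    (c1, k1), (c2, k2) = base
    return (w * c1 + (1 - w) * c2, w * k1 + (1 - w) * k2)

def call(summary, arg):
    """Instantiate a summary of P at the actual argument."""
    coef, const = summary
    return coef * arg + const

# One pair of weights per call site of P inside Q: (summary 1, summary 2).
WEIGHTS = {"u": (7, 6), "v": (3, 0), "w": (5, 1)}
ARGS = {"u": 2, "v": 1, "w": 1}

values = {
    var: [call(fresh(wt), ARGS[var]) for wt in WEIGHTS[var]]
    for var in ("u", "v", "w")
}
print(values)  # {'u': [3, 3], 'v': [-16, 5], 'w': [-30, -2]}

u_is_3 = all(x == 3 for x in values["u"])
v_equals_w = all(x == y for x, y in zip(values["v"], values["w"]))
print(u_is_3, v_equals_w)  # True False
```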

17 Loops and Fixed point computation

- In the presence of loops (in procedures and call graphs), fixed point computation is required.
- The number of iterations required to reach a fixed point is $k_v(2k_I + 1) + 1$, where $k_v$ is the number of visible variables and $k_I$ is the number of input variables.

18 Error Probability and Complexity

- Time complexity = $n \, k_V \, k_I^{2} \, t$
- Error probability = $1/q^{\,t-m}$

where

- $n$: size of program
- $k_V$, $k_I$: # of visible and input variables
- $t$: # of random summaries
- $q$: size of the set from which random values are chosen
- $m$: $k_I k_V$ (generic bound), $k_I + k_V$ (for linear arithmetic), 4 (for unary uninterpreted functions)
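To get a feel for these bounds, the sketch below plugs numbers into the two formulas as stated on the slide. The function names, the example parameter values, and the choice to cap the bound at 1 when t ≤ m are assumptions made for illustration.

```python
from fractions import Fraction

def m_generic(k_i, k_v):
    return k_i * k_v

def m_linear_arithmetic(k_i, k_v):
    return k_i + k_v

M_UNARY_UNINTERPRETED = 4

def error_probability(q, t, m):
    """Bound 1 / q**(t - m); vacuous (reported as 1) when t <= m."""
    if t <= m:
        return Fraction(1)
    return Fraction(1, q ** (t - m))

def cost(n, k_v, k_i, t):
    """The time-complexity term n * k_V * k_I**2 * t."""
    return n * k_v * k_i ** 2 * t

# Hypothetical setting: 32-bit random values, 2 input and 3 visible variables.
q, k_i, k_v = 2 ** 32, 2, 3
m = m_linear_arithmetic(k_i, k_v)        # 5
bound = error_probability(q, t=7, m=m)   # 1 / 2**64
print(m, bound == Fraction(1, 2 ** 64), cost(n=1000, k_v=k_v, k_i=k_i, t=7))
```

Under these formulas each additional summary beyond m divides the error bound by q while adding only linearly to the cost term.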

19 Related Work

- Intraprocedural random interpretation
  - Linear arithmetic (POPL 03)
  - Uninterpreted functions (POPL 04)
- Interprocedural dataflow analysis (POPL 95, TCS 96)
  - Sagiv, Reps, Horwitz
  - Cons: simpler properties, e.g. liveness, linear constants
  - Pro: better computational complexity
- Interprocedural linear arithmetic (POPL 04)
  - Müller-Olm, Seidl
  - Cons: $O(k^2)$ times slower
  - Pro: works for non-linear relationships too

21 Experiments

| Prog | Line | Random Inter (this paper): Inp, Var, Time | Random Intra (POPL 2003): (Var), Speedup | Det Inter (TCS 96): (Inp), Speedup |
|---|---|---|---|---|
| go | 29K | 63170047 | 170107 | 171.9 |
| ijpeg | 28K | 318254 | 3424 | 32.3 |
| li | 23K | 5339234 | 160756 | 201.3 |
| gzip | 8K | 495252 | 20039 | 62.0 |

- Inp: # of input variables that were constants
- Var: # of local variables that were constants
- (Var): # of fewer local variable constants discovered

Results:

- Random Inter discovers 10-70% more facts.
- Random Intra is faster by 10-500 times.
- Det Inter is faster by 2 times.

22 Conclusion

- Randomization buys efficiency and simplicity at the cost of probabilistic soundness.
- Combining randomized techniques with symbolic techniques is powerful.
