
Random Interpretation. Sumit Gulwani, UC-Berkeley.


1 Random Interpretation Sumit Gulwani UC-Berkeley

2 Program Analysis
Applications in all aspects of software development, e.g.
– Program correctness
– Compiler optimizations
– Translation validation
Parameters
– Completeness (precision, no false positives)
– Computational complexity
– Ease of implementation
What if we allow probabilistic soundness?
– We obtain a new class of analyses: random interpretation

3 Random Interpretation = Random Testing + Abstract Interpretation
Almost as simple as random testing, but with better soundness guarantees.
Almost as sound as abstract interpretation, but more precise, efficient, and simple.

4 Example 1
Conditional 1 (*, non-deterministic)
  True:  a := 0; b := i;
  False: a := i-2; b := 2;
Conditional 2 (*, non-deterministic)
  True:  c := b - a; d := i - 2b;
  False: c := 2a + b; d := b - 2i;
assert(c+d = 0); assert(c = a+i)
Random testing needs to execute all 4 paths to verify the assertions.
Abstract interpretation analyzes each statement once but uses complicated operations.
Random interpretation executes the program once, but in a way that captures the effect of all paths.

5 Outline
Random Interpretation
– Linear arithmetic (POPL 2003)
– Uninterpreted functions (POPL 2004)
– Inter-procedural analysis (POPL 2005)
– Other applications

6 Problem: Linear relationships in linear programs
Does not mean inapplicability to "real" programs
– "abstract" other program statements as non-deterministic assignments (standard practice in program analysis)
Linear relationships are useful for
– Program correctness: buffer overflows
– Compiler optimizations: constant propagation, copy propagation, common subexpression elimination, induction variable elimination

7 Basic idea in random interpretation
Generic algorithm:
– Choose random values for input variables.
– Execute both branches of a conditional.
– Combine the values of variables at join points.
– Test the assertion.

8 Idea #1: The Affine Join operation
Affine join of $v_1$ and $v_2$ w.r.t. weight $w$: $\phi_w(v_1, v_2) \equiv w\,v_1 + (1-w)\,v_2$
Affine join preserves common linear relationships (e.g. $a + b = 5$).
It does not introduce false relationships w.h.p.
Unfortunately, non-linear relationships are not preserved (e.g. $a \times (1+b) = 8$).
Example with w = 7:
  State on the first branch:   a = 2, b = 3
  State on the second branch:  a = 4, b = 1
  State after the join:        $a = \phi_7(2,4) = -10$, $b = \phi_7(3,1) = 15$
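The numbers on this slide can be reproduced with a few lines of Python. This is only a minimal sketch: the names `affine_join` and `join_states`, and the use of dicts for program states, are illustrative choices and not taken from the talk.

```python
def affine_join(w, v1, v2):
    """Affine join of two values with weight w: w*v1 + (1 - w)*v2."""
    return w * v1 + (1 - w) * v2


def join_states(w, s1, s2):
    """Join two program states (dicts over the same variables) with one weight."""
    return {var: affine_join(w, s1[var], s2[var]) for var in s1}


first = {"a": 2, "b": 3}
second = {"a": 4, "b": 1}
joined = join_states(7, first, second)

print(joined)                           # {'a': -10, 'b': 15}
print(joined["a"] + joined["b"])        # 5: the linear relationship a + b = 5 survives
print(joined["a"] * (1 + joined["b"]))  # -160: a * (1 + b) = 8 held before the join, not after
```

The same weight is applied to every variable of the state; that is what makes relationships between variables, and not only individual values, carry over.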

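9 Geometric Interpretation of Affine Join
[Figure: the (a, b) plane with the lines a + b = 5 and b = 2. The states before the join, (a = 2, b = 3) and (a = 4, b = 1), both lie on the line a + b = 5, as does the state after the join.]
The state after the join satisfies all the affine relationships that are satisfied by both states before the join (e.g. a + b = 5).
Given any relationship that is not satisfied by any of the states before the join (e.g. b = 2), the state after the join also does not satisfy it with high probability.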

10 Example 1
Input: i=3
Conditional 1 (*)
  True:  a := 0; b := i;       state: i=3, a=0, b=3
  False: a := i-2; b := 2;     state: i=3, a=1, b=2
  Join with w1 = 5:            state: i=3, a=-4, b=7
Conditional 2 (*)
  True:  c := b - a; d := i - 2b;     state: i=3, a=-4, b=7, c=11, d=-11
  False: c := 2a + b; d := b - 2i;    state: i=3, a=-4, b=7, c=-1, d=1
  Join with w2 = 2:                   state: i=3, a=-4, b=7, c=23, d=-23
assert (c+d = 0); assert (c = a+i)
Choose a random weight for each join independently.
All choices of random weights verify the first assertion.
Almost all choices contradict the second assertion.
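The run on this slide can be replayed as follows. This is a hedged sketch rather than the talk's implementation: `run_example1` hard-codes the two conditionals of Example 1, and the range from which the random values are drawn (`10**6`) is an arbitrary assumption.

```python
import random


def affine_join(w, v1, v2):
    return w * v1 + (1 - w) * v2


def join_states(w, s1, s2):
    return {var: affine_join(w, s1[var], s2[var]) for var in s1}


def run_example1(i, w1, w2):
    """One random-interpretation run of Example 1; returns the final state."""
    # First conditional: execute both branches, then join with weight w1.
    s_true = {"i": i, "a": 0, "b": i}
    s_false = {"i": i, "a": i - 2, "b": 2}
    s = join_states(w1, s_true, s_false)
    # Second conditional: both branches start from the joined state.
    t_true = dict(s, c=s["b"] - s["a"], d=s["i"] - 2 * s["b"])
    t_false = dict(s, c=2 * s["a"] + s["b"], d=s["b"] - 2 * s["i"])
    return join_states(w2, t_true, t_false)


final = run_example1(i=3, w1=5, w2=2)
print(final)  # {'i': 3, 'a': -4, 'b': 7, 'c': 23, 'd': -23}

# Repeat with random inputs and weights and count how often each assertion holds.
rng = random.Random(0)
trials = 1000
first_holds = 0
second_holds = 0
for _ in range(trials):
    st = run_example1(rng.randrange(10**6), rng.randrange(10**6), rng.randrange(10**6))
    first_holds += st["c"] + st["d"] == 0
    second_holds += st["c"] == st["a"] + st["i"]
print(first_holds, second_holds)
```

Working the joins out by hand gives c - (a + i) = -3·w2·(1 - w1)·(i - 2), so the second assertion passes only for the special choices w2 = 0, w1 = 1, or i = 2.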

11 Example 2
a := x + y
Conditional: x = y ?
  True:  b := a
  False: b := 2x
assert (b = 2x)
We need to make use of the conditional x=y on the true branch to prove the assertion.

12 Idea #2: The Adjust Operation
Execute multiple runs of the program in parallel.
Sample = collection of states at a program point.
Combine states in the sample before a conditional s.t.
– The equality conditional is satisfied.
– Original relationships are preserved.
Use the adjusted sample on the true branch.

13 Geometric Interpretation of Adjust
Program states = points
Adjust = projection onto the hyperplane
S' satisfies e=0 and all relationships satisfied by S
Algorithm to obtain S' = Adjust(S, e=0)
[Figure: sample points S1, S2, S3, S4 and the adjusted points S'1, S'2, S'3 lying on the hyperplane e = 0.]
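The algorithm itself is not part of the transcript, so the following is only one plausible reading of the geometric picture, offered as an assumption: pick a pivot state that violates e = 0, and replace every other state by the point where the line through it and the pivot meets the hyperplane. The function name `adjust`, the dict representation of states, and the sample values are all hypothetical. Exact rational arithmetic (`fractions.Fraction`) stands in for arithmetic over a field.

```python
from fractions import Fraction


def adjust(sample, e):
    """Return states that all satisfy e(state) == 0 (one state fewer than the input).

    e must be an affine function of the state. Each result is an affine
    combination of two input states, so affine relationships that held for the
    whole sample still hold afterwards.
    """
    values = [e(s) for s in sample]
    # Pick a pivot state that violates e = 0; if there is none, nothing to do.
    pivot = next((k for k, v in enumerate(values) if v != 0), None)
    if pivot is None:
        return sample
    sp, vp = sample[pivot], values[pivot]
    adjusted = []
    for k, (s, v) in enumerate(zip(sample, values)):
        if k == pivot:
            continue
        if v == vp:
            raise ValueError("degenerate sample: line through the two states never meets e = 0")
        # Solve t*v + (1 - t)*vp = 0 for t, then move along the line to that point.
        t = Fraction(vp) / Fraction(vp - v)
        adjusted.append({x: t * s[x] + (1 - t) * sp[x] for x in s})
    return adjusted


def make_state(x, y):
    # State of Example 2 just before the conditional: a := x + y has been executed.
    return {"x": x, "y": y, "a": x + y}


sample = [make_state(3, 1), make_state(5, 2), make_state(-4, 7), make_state(2, 9)]
true_branch = adjust(sample, lambda s: s["x"] - s["y"])
for s in true_branch:
    s["b"] = s["a"]  # b := a on the true branch
    print(s["x"] == s["y"], s["a"] == s["x"] + s["y"], s["b"] == 2 * s["x"])
```

Each adjusted state satisfies x = y while still satisfying a = x + y, which is what lets the assertion b = 2x of Example 2 go through on the true branch.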

14 Correctness of Random Interpreter R
Completeness: If $e_1 = e_2$, then $R \Rightarrow e_1 = e_2$
– assuming non-det conditionals
Soundness: If $e_1 \neq e_2$, then $R \not\Rightarrow e_1 = e_2$
– error prob. $\leq$ [bound missing from the transcript], where
  b: number of branches
  j: number of joins
  d: size of the field
  k: number of points in the sample
– If $j = b = 10$, $k = 15$, $d \approx 2^{32}$, then error $\leq$ [value missing from the transcript]

15 Outline
Random Interpretation
– Linear arithmetic (POPL 2003)
– Uninterpreted functions (POPL 2004)
– Inter-procedural analysis (POPL 2005)
– Other applications

16 Problem: Global value numbering
Goal: Detect expression equivalence in programs that have been abstracted using "uninterpreted functions"
Axiom of the theory of uninterpreted functions: If x=y, then F(x)=F(y)
Applications
– Compiler optimizations
– Translation validation

17 Example
Conditional (*)
  True:  x := a; y := a; z := F(a);
  False: x := b; y := b; z := F(b);
After the join: $x = \phi(a,b)$, $y = \phi(a,b)$, $z = \phi(F(a),F(b))$, whereas $F(y) = F(\phi(a,b))$
assert(x = y); assert(z = F(y));
Typical algorithms treat $\phi$ as uninterpreted
– Hence cannot verify the second assertion
The randomized algorithm interprets $\phi$
– as the affine join operation $\phi_w$

18 How to "execute" uninterpreted functions
Expressions: e := y | F(e1, e2)
Choose a random interpretation for F
Non-linear interpretation
– E.g. $F(e_1, e_2) = r_1 e_1^2 + r_2 e_2^2$
– Preserves all equivalences in straight-line code
– But not across join points
Let's try a linear interpretation
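Why the quadratic interpretation breaks at join points can be checked numerically: an equivalence survives a join only if applying F to joined arguments gives the same value as joining the results of F. The sketch below tests that for one arbitrary choice of values; the helper `commutes_with_join` and all concrete numbers are illustrative assumptions.

```python
def affine_join(w, v1, v2):
    return w * v1 + (1 - w) * v2


def f_quadratic(r1, r2, e1, e2):
    return r1 * e1**2 + r2 * e2**2


def f_linear(r1, r2, e1, e2):
    return r1 * e1 + r2 * e2


def commutes_with_join(f, w, r1, r2, left, right):
    """Compare F applied to joined arguments with the join of F's results."""
    (a1, a2), (b1, b2) = left, right
    f_of_join = f(r1, r2, affine_join(w, a1, b1), affine_join(w, a2, b2))
    join_of_f = affine_join(w, f(r1, r2, a1, a2), f(r1, r2, b1, b2))
    return f_of_join == join_of_f


w, r1, r2 = 7, 3, 11            # join weight and the random coefficients of F
left, right = (2, 5), (4, 1)    # argument values of F on the two branches

print(commutes_with_join(f_linear, w, r1, r2, left, right))     # True
print(commutes_with_join(f_quadratic, w, r1, r2, left, right))  # False
```

The linear case is not a coincidence of these numbers: a linear F distributes over the affine combination for every weight, whereas squaring does not.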

19 Random Linear Interpretation
Encode $F(e_1, e_2) = r_1 e_1 + r_2 e_2$
Preserves all equivalences across a join point
Introduces false equivalences in straight-line code
– E.g. e and e' have the same encodings even though $e \neq e'$
  e = F(F(a,b), F(c,d))
  e' = F(F(a,c), F(b,d))
Encodings
  $e = r_1(r_1 a + r_2 b) + r_2(r_1 c + r_2 d) = r_1^2 a + r_1 r_2 b + r_2 r_1 c + r_2^2 d$
  $e' = r_1^2 a + r_1 r_2 c + r_2 r_1 b + r_2^2 d$
Problem: Scalar multiplication is commutative.
Solution: Evaluate expressions to vectors and choose $r_1$ and $r_2$ to be random matrices
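The collision and its fix can be seen on the two expressions from this slide. The sketch below is an assumption-laden illustration: it works modulo the prime 2^31 - 1, uses length-2 vectors, and fixes two small non-commuting matrices in place of random draws so that the outcome is reproducible. None of these choices come from the talk.

```python
P = 2_147_483_647  # prime modulus 2**31 - 1 (an assumed choice of field)


def mat_vec(m, v):
    """Matrix-vector product modulo P."""
    return tuple(sum(m[i][j] * v[j] for j in range(len(v))) % P for i in range(len(m)))


def vec_add(u, v):
    return tuple((x + y) % P for x, y in zip(u, v))


def f_scalar(r1, r2, e1, e2):
    return (r1 * e1 + r2 * e2) % P


def f_matrix(R1, R2, e1, e2):
    return vec_add(mat_vec(R1, e1), mat_vec(R2, e2))


def encode_pair(f, r1, r2, a, b, c, d):
    """Encode e = F(F(a,b),F(c,d)) and e' = F(F(a,c),F(b,d))."""
    e = f(r1, r2, f(r1, r2, a, b), f(r1, r2, c, d))
    e_prime = f(r1, r2, f(r1, r2, a, c), f(r1, r2, b, d))
    return e, e_prime


# Scalar coefficients: r1*r2 == r2*r1, so the two encodings coincide.
e, e_prime = encode_pair(f_scalar, 3, 11, 1, 2, 5, 7)
scalar_collides = e == e_prime

# Matrix coefficients (fixed, non-commuting stand-ins for random matrices).
R1 = ((1, 1), (0, 1))
R2 = ((1, 0), (1, 1))
e_vec, e_prime_vec = encode_pair(f_matrix, R1, R2, (1, 2), (3, 4), (5, 6), (7, 8))
matrix_collides = e_vec == e_prime_vec

print(scalar_collides, matrix_collides)  # True False
```

The two encodings differ by (R1·R2 - R2·R1)(b - c), which vanishes for scalars but is generally non-zero for matrices.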

20 Outline
Random Interpretation
– Linear arithmetic (POPL 2003)
– Uninterpreted functions (POPL 2004)
– Inter-procedural analysis (POPL 2005)
– Other applications

21 Example
Conditional 1 (*)
  True:  a := 0; b := i;
  False: a := i-2; b := 2;
Conditional 2 (*)
  True:  c := b - a; d := i - 2b;
  False: c := 2a + b; d := b - 2i;
assert (c + d = 0); assert (c = a + i)
The second assertion is true in the context i=2.
Interprocedural analysis requires computing procedure summaries.

22 Idea #1: Keep input variables symbolic
Do not choose random values for input variables (to later instantiate by any context).
Resulting program state at the end is a random procedure summary.
Conditional 1 (*)
  True:  a := 0; b := i;       state: a=0, b=i
  False: a := i-2; b := 2;     state: a=i-2, b=2
  Join with w1 = 5:            state: a=8-4i, b=5i-8
Conditional 2 (*)
  True:  c := b - a; d := i - 2b;     state: a=8-4i, b=5i-8, c=9i-16, d=16-9i
  False: c := 2a + b; d := b - 2i;    state: a=8-4i, b=5i-8, c=8-3i, d=3i-8
  Join with w2 = 2:                   state: a=8-4i, b=5i-8, c=21i-40, d=40-21i
assert (c+d = 0); assert (c = a+i)
Instantiating the summary in the context i=2: a=0, b=2, c=2, d=-2
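The symbolic run can be reproduced by computing with linear expressions in i instead of numbers. This is a minimal sketch: the `Lin` class and the `summary` function are hypothetical helpers written for this one example, not a general interpreter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lin:
    """Linear expression const + coeff * i in the symbolic input i."""
    const: int
    coeff: int = 0

    def __add__(self, other):
        return Lin(self.const + other.const, self.coeff + other.coeff)

    def __sub__(self, other):
        return Lin(self.const - other.const, self.coeff - other.coeff)

    def scale(self, k):
        return Lin(k * self.const, k * self.coeff)

    def at(self, i):
        """Instantiate the expression in the context i."""
        return self.const + self.coeff * i


def join(w, v1, v2):
    """Affine join on symbolic values: w*v1 + (1 - w)*v2."""
    return v1.scale(w) + v2.scale(1 - w)


def summary(w1, w2):
    """Random procedure summary of the example, with i kept symbolic."""
    i = Lin(0, 1)
    a = join(w1, Lin(0), i - Lin(2))
    b = join(w1, i, Lin(2))
    c = join(w2, b - a, a.scale(2) + b)
    d = join(w2, i - b.scale(2), b - i.scale(2))
    return {"a": a, "b": b, "c": c, "d": d}


s = summary(w1=5, w2=2)
print(s["c"], s["d"])  # Lin(const=-40, coeff=21) Lin(const=40, coeff=-21)

# Instantiate the summary in two contexts and test the second assertion c = a + i.
for ctx in (2, 3):
    print(ctx, s["c"].at(ctx) == s["a"].at(ctx) + ctx)
```

The summary is computed once and then instantiated per calling context: the second assertion holds for i = 2 and fails for i = 3, matching the earlier numeric run.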

23 Idea #2: Generate fresh summaries
Procedure P, input: i
  Conditional (*)
    True:  x := i+1;
    False: x := 3;
  return x;
  Summary with join weight 5: x = 5i-7
Procedure Q
  u := P(2); v := P(1); w := P(1);
  Assert (u = 3); Assert (v = w);
  Using the same summary for every call: $u = 5 \cdot 2 - 7 = 3$, $v = 5 \cdot 1 - 7 = -2$, $w = 5 \cdot 1 - 7 = -2$
Plugging the same summary twice is unsound.
Fresh summaries can be generated by random affine combination of few independent summaries!
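A small numeric sketch of the problem and the fix follows. It assumes two independent summaries of P obtained with join weights 5 and 2, and combines them with fixed coefficients 3 and 4 standing in for random ones. These values, and the helper names `summary_P`, `combine`, and `apply_summary`, are illustrative and not from the talk.

```python
def summary_P(w):
    """Summary of P for join weight w, as (const, coeff) of i: w*(i+1) + (1-w)*3."""
    return (w + 3 * (1 - w), w)


def combine(alpha, s1, s2):
    """Affine combination alpha*s1 + (1 - alpha)*s2 of two summaries."""
    return tuple(alpha * x + (1 - alpha) * y for x, y in zip(s1, s2))


def apply_summary(summary, i):
    const, coeff = summary
    return const + coeff * i


s1 = summary_P(5)  # (-7, 5), i.e. x = 5i - 7 as on the slide
s2 = summary_P(2)  # (-1, 2), i.e. x = 2i - 1

# Reusing one summary for both calls P(1): v and w come out equal, which
# wrongly verifies Assert(v = w).
v_same = apply_summary(s1, 1)
w_same = apply_summary(s1, 1)
print(v_same == w_same)  # True

# A fresh summary per call site, each an affine combination of s1 and s2.
u = apply_summary(combine(6, s1, s2), 2)
v = apply_summary(combine(3, s1, s2), 1)
w_out = apply_summary(combine(4, s1, s2), 1)
print(u, v, w_out)  # 3 -8 -11
```

Every affine combination still returns 3 at i = 2, where both branches of P agree, so Assert(u = 3) remains verified, while the two calls P(1) now receive different values.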

24 Experiments

25 Experiments
[Chart comparing the randomized and deterministic algorithms.]
Randomized algorithm discovers 10-70% more facts.
Randomized algorithm is slower by a factor of 2.

26 Experimental measure of error
The % of incorrect relationships decreases with increase in
– S = size of set from which random values are chosen.
– N = # of random summaries used.
[Table: % of incorrect relationships, with one column per S = 10^3, 10^5, 10^8 and one row per value of N; the cell values are garbled in the transcript.]
The experimental results are better than what is predicted by theory.

27 Outline
Random Interpretation
– Linear arithmetic (POPL 2003)
– Uninterpreted functions (POPL 2004)
– Inter-procedural analysis (POPL 2005)
– Other applications

28 Other applications of random interpretation
Model Checking
– Randomized equivalence testing algorithm for FCEDs, which represent conditional linear expressions and are a generalization of BDDs. (SAS 04)
Theorem Proving
– Randomized decision procedure for linear arithmetic and uninterpreted functions. This runs an order of magnitude faster than the deterministic algorithm. (CADE 03)
Ideas for deterministic algorithms
– PTIME algorithm for global value numbering, thereby solving a 30-year-old open problem. (SAS 04)

29 Future Work and Limitations
Future Work
Random interpreters for other theories
– E.g. data structures
Combining random interpreters
– E.g. a random interpreter for the combined theory of linear arithmetic and uninterpreted functions.
Limitations
Does not discover "never equal" information
– Only detects "always equal" information

30 Summary
                      Key Idea                 Complexity
                                               Random interpretation   Abstract interpretation
Linear Arithmetic     Affine Join              O(n^2)                  O(n^4)
Uninterpreted Fns.    Vectors                  O(n^3)                  O(n^4)
Interproc. Analysis   Symbolic i/p variables   Poly blowup             ?
Lessons Learned
Randomization buys efficiency and simplicity at the cost of probabilistic soundness.
Randomization suggests ideas for deterministic algorithms.
Combining randomized techniques with symbolic ones is powerful.

