1
Challenges in Program Analysis
Mooly Sagiv
2
Content
Future directions of program analysis
Open problems in program analysis
3
Future Directions
New applications
New abstractions
Combine with other methods: dynamic analysis, decision procedures, machine learning
4
SQL injection

    String queryString = "SELECT info FROM userTable WHERE ";
    if ((! login.equals("")) && (! password.equals(""))) {
        queryString += "login='" + login + "' AND pass='" + password + "'";
    } else {
        queryString += "login='guest'";
    }
    ResultSet tempSet = stmt.executeQuery(queryString);

User submits: login "doe" and password "xyz"
    SELECT info FROM userTable WHERE login='doe' AND pass='xyz'
Attacker submits: login "admin' -- " and password ""
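The attack can be reproduced with a minimal sketch in Python, using the standard sqlite3 module and an in-memory table. The table contents and the helper names build_query and run are hypothetical. Note one assumption: the attacker's password is a non-empty dummy string, because the query-building code as shown falls into the guest branch when the password is empty.

```python
import sqlite3

def build_query(login, password):
    """Mirror of the slide's logic: user input is concatenated into SQL."""
    query = "SELECT info FROM userTable WHERE "
    if login != "" and password != "":
        query += "login='" + login + "' AND pass='" + password + "'"
    else:
        query += "login='guest'"
    return query

def run(query):
    """Run a query against a tiny in-memory table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE userTable (login TEXT, pass TEXT, info TEXT)")
    conn.executemany(
        "INSERT INTO userTable VALUES (?, ?, ?)",
        [("admin", "s3cret", "admin-info"),
         ("doe", "xyz", "doe-info"),
         ("guest", "", "guest-info")],
    )
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows

honest = build_query("doe", "xyz")
# The quote closes the login literal; "--" comments out the password test.
attack = build_query("admin' -- ", "anything")

print(honest)
print(run(honest))
print(attack)
print(run(attack))
```

The attacker never supplies the admin password, yet the second query returns the admin row, because everything after `--` is treated as an SQL comment.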
5
SQL injection solutions
Compile-time detection
Static string context analysis followed by a cheap runtime check: regular languages, context-free languages
Dynamic taint monitoring
6
Other code injection attacks
Heap spraying
Shell/script injection
HTML injection
7
Static Analysis of long lived programs
Static analysis can be applied at runtime
Just-in-time compilation
Cloud computing
Distributed applications
Hadoop
8
The Prime Code Search Tool Alon Mishne & Eran Yahav
Lots of public domain code
Hard to look for the right code sequence
Use abstract interpretation
9
Static Analysis for Program Equivalence
Program equivalence applications: compiler correctness, software patches
Can we use static analysis to detect equivalence?
Instrument & abstract
10
Abstraction-Guided Synthesis
Eran Yahav, Technion
Joint work with Martin Vechev, Greta Yorsh, Michael Kuperstein, Veselin Raychev
11
Verification with Abstraction
$P \models_{\alpha} S$ ?
12
Now what?
$P \not\models_{\alpha} S$
Refine the abstraction: $P \models_{\alpha'} S$
13
Alternatively…
$P \not\models_{\alpha} S$
Relax the specification (but to what?): $P \models_{\alpha} S'$
14
Alternatively…
$P \not\models_{\alpha} S$
Change the program: $P' \models_{\alpha} S$
15
A Standard Approach: Abstraction Refinement
[Diagram: program, specification, and abstraction feed Verify. The result is either Valid or an abstract counterexample, which drives Abstraction Refinement.]
Change the abstraction to match the program
16
Abstraction-Guided Synthesis [VYY-POPL’10]
[Diagram: program P, specification, and abstraction feed Verify. An abstract counterexample leads either to Abstraction Refinement or to Program Restriction; Implement then produces the new program P'.]
Change the program to match the abstraction
Notes: there could be many solutions; Implement picks one based on a quantitative criterion. The constraint captures changes to the program execution, that is, what is permitted during program execution.
17
Example
Initially x = z = 0
Every single statement is atomic; f(x) is atomic

    T1
    1: x += z
    2: x += z

    T2
    1: z++
    2: z++

    T3
    1: y1 = f(x)
    2: y2 = x
    3: assert(y1 != y2)

    f(x) {
      if (x == 1) return 3
      else if (x == 2) return 6
      else return 5
    }
18
Example: Concrete Values
[Plot: concrete values of (y1, y2) at the assertion; axes y1 (ticks 1 to 6) and y2 (ticks 1 to 4).]
Sample interleavings:
    x += z; x += z; z++; z++; y1 = f(x); y2 = x; assert    gives y1 = 5, y2 = 0
    z++; x += z; y1 = f(x); z++; x += z; y2 = x; assert    gives y1 = 3, y2 = 3

    T1
    1: x += z
    2: x += z

    T2
    1: z++
    2: z++

    T3
    1: y1 = f(x)
    2: y2 = x
    3: assert(y1 != y2)

    f(x) {
      if (x == 1) return 3
      else if (x == 2) return 6
      else return 5
    }
19
Example: Parity Abstraction
[Plots: concrete values of (y1, y2) next to their parity abstraction (even/odd); axes y1 and y2. The program (T1, T2, T3, f) is the same as on the previous slide.]
Sample interleaving:
    x += z; x += z; z++; z++; y1 = f(x); y2 = x; assert    gives y1 = Odd, y2 = Even
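The concrete and parity pictures can be recomputed by brute force. The sketch below enumerates every interleaving of T1, T2 and T3, collects the (y1, y2) pairs seen at the assertion, and then maps those pairs to parities. Two simplifications are assumed: the assert is evaluated after the schedule ends (it only reads y1 and y2, which no other thread writes), and the parity set is obtained by abstracting the concrete results rather than by running an abstract interpreter. All names are illustrative.

```python
def f(x):
    if x == 1:
        return 3
    if x == 2:
        return 6
    return 5

T1 = ["x+=z", "x+=z"]
T2 = ["z++", "z++"]
T3 = ["y1=f(x)", "y2=x"]  # the assert is checked on the final (y1, y2)

def interleavings(threads):
    """Yield every schedule that preserves each thread's program order."""
    if all(not t for t in threads):
        yield []
        return
    for i, t in enumerate(threads):
        if t:
            rest = threads[:i] + [t[1:]] + threads[i + 1:]
            for tail in interleavings(rest):
                yield [t[0]] + tail

def execute(schedule):
    """Run one schedule from x = z = 0 and return (y1, y2)."""
    s = {"x": 0, "z": 0, "y1": None, "y2": None}
    for stmt in schedule:
        if stmt == "x+=z":
            s["x"] += s["z"]
        elif stmt == "z++":
            s["z"] += 1
        elif stmt == "y1=f(x)":
            s["y1"] = f(s["x"])
        elif stmt == "y2=x":
            s["y2"] = s["x"]
    return s["y1"], s["y2"]

def parity(n):
    return "Even" if n % 2 == 0 else "Odd"

concrete = {execute(s) for s in interleavings([T1, T2, T3])}
violations = {p for p in concrete if p[0] == p[1]}
abstract = {(parity(a), parity(b)) for a, b in concrete}

print(sorted(concrete))
print(violations)
print(sorted(abstract))
```

In the concrete semantics only (3, 3) violates the assertion. Under parity, a pair such as (6, 2) becomes (Even, Even), which the abstraction cannot distinguish from a violation; those are the interleavings that a synthesized program restriction would have to exclude for the proof to go through with this abstraction.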
20
Dynamically enforce consistency
Use static analysis to reduce the cost
Example: array-bound check

    int a[100]
    …
    for (i=0; i <n; i++) {
      …
      … a[i] …

Can be very effective
Hardware support may be available
Does not assure the absence of bugs
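As a rough illustration of how static analysis can cut the cost of dynamic checks, the sketch below contrasts a loop that checks the bound on every access with one where a single check is hoisted in front of the loop. It assumes a range analysis has established 0 <= i < n inside the loop, so testing n against the array length once is enough. The function names and the explicit check counter are illustrative only.

```python
def checked_sum(a, n):
    """Dynamic enforcement: one bound check per array access."""
    checks = 0
    total = 0
    for i in range(n):
        checks += 1
        if not 0 <= i < len(a):
            raise IndexError(i)
        total += a[i]
    return total, checks

def hoisted_sum(a, n):
    """With the loop invariant 0 <= i < n known statically, one check suffices."""
    checks = 1
    if n > len(a):
        raise IndexError(n)
    total = 0
    for i in range(n):
        total += a[i]  # no per-access check needed
    return total, checks

a = list(range(100))
print(checked_sum(a, 100))
print(hoisted_sum(a, 100))
```

The two versions differ in when a bad n is reported: the hoisted check fails before the first iteration rather than at the first out-of-range access.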
21
Reducing the cost of static analysis via dynamic analysis
Quickly find properties which do not hold
Locate good abstractions
22
Abstractions from Tests [POPL’12]
[Diagram: program P and query q feed Dynamic Analysis; its info goes to Parameter Inference; the inferred parameter configures the Parametric Static Analysis. Possible outcomes: proof, don't know, disproved.]
23
Hypothesis
If a query is simple, we can find why the query holds simply by looking at a few execution traces
24
Parameter Inference based on separability
[[Q]] (a)
25
Thread-local information
Does a local variable point to an object that cannot be reached from other threads?
Reachable from global

    for (i = 0; i < n; i++) {
      x0 = new h0;
      x1 = new h1;
      x1.f1 = x0;
      x2 = new h2;
      x2.f2 = x1;
      x3 = new h3;
      x3.f3 = x2;
      x0.start();
      pc: x2.id = i; // local(x2)?
      x3.start();
    }
26
Parametric thread-escape analysis
Represent the heap with two summary nodes, E and L
(L) represents objects which are guaranteed to be thread-local
(E) represents objects which may escape
Param = AllocSite $\to$ {L, E}
L can move to E but not vice versa
27
Example Partition

    for (i = 0; i < n; i++) {
      x0 = new h0; // E
      x1 = new h1;
      x1.f1 = x0;
      x2 = new h2; // L
      x2.f2 = x1;
      x3 = new h3; // L
      x3.f3 = x2;
      x0.start();
      pc: x2.id = i; // local(x2)?
      x3.start();
    }
28
Difficulties in choosing a good parameter
Using more L makes the analysis more expensive
More L does not necessarily mean more precision

    for (i = 0; i < n; i++) {
      x0 = new h0; // L
      x1 = new h1; // L
      x1.f1 = x0;
      x2 = new h2; // L
      x2.f2 = x1;
      x3 = new h3; // L
      x3.f3 = x2;
      x0.start();
      pc: x2.id = i; // local(x2)?
      x3.start();
    }
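The effect of the parameter can be explored with a much simplified sketch of the two-summary-node analysis. It is not the analysis from the paper: it only tracks, per local variable, whether it points to the L or the E summary node, and it assumes that whenever any L object may escape (it is started as a thread, or stored into an E object) the whole L node collapses into E. The statement encoding, the function names, and the choice h1 = E in the first partition are assumptions made for illustration.

```python
L, E = "L", "E"
ORDER = {None: 0, L: 1, E: 2}  # unassigned < L < E

LOOP_BODY = [
    ("new", "x0", "h0"),
    ("new", "x1", "h1"),
    ("store", "x1", "x0"),  # x1.f1 = x0
    ("new", "x2", "h2"),
    ("store", "x2", "x1"),  # x2.f2 = x1
    ("new", "x3", "h3"),
    ("store", "x3", "x2"),  # x3.f3 = x2
    ("start", "x0"),
    ("query", "x2"),        # pc: local(x2)?
    ("start", "x3"),
]

def escape_all(env):
    """L can move to E but not vice versa: collapse the L node into E."""
    return {v: (E if a == L else a) for v, a in env.items()}

def join(a, b):
    """Pointwise join of two abstract environments."""
    return {v: max(a.get(v), b.get(v), key=ORDER.get) for v in set(a) | set(b)}

def run_body(body, param, env):
    env = dict(env)
    results = []
    for stmt in body:
        op = stmt[0]
        if op == "new":
            env[stmt[1]] = param[stmt[2]]
        elif op == "store":
            # Storing an L object into an E object makes it reachable by others.
            if env.get(stmt[1]) == E and env.get(stmt[2]) == L:
                env = escape_all(env)
        elif op == "start":
            if env.get(stmt[1]) == L:
                env = escape_all(env)
        elif op == "query":
            results.append(env.get(stmt[1]) == L)
    return env, results

def analyze_loop(body, param):
    """Iterate the loop body to a fixpoint; report whether every query is proved."""
    env = {}
    while True:
        out, results = run_body(body, param, env)
        new_env = join(env, out)
        if new_env == env:
            return all(results)
        env = new_env

print(analyze_loop(LOOP_BODY, {"h0": E, "h1": E, "h2": L, "h3": L}))
print(analyze_loop(LOOP_BODY, {"h0": L, "h1": L, "h2": L, "h3": L}))
```

Under these assumptions the mixed partition proves local(x2), while mapping every site to L does not: x0.start() then forces the single L summary node, and x2 with it, into E.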
29
Setting for the experiments
6 concurrent Java programs from DaCapo: 161K - 491K bytecode (including analyzed JDK)
Up to 5K allocation sites per program
47K queries, but only 17K (37%) reached during testing
30
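Experiments
[Diagram: the pipeline from "Abstractions from Tests" (program P and query q, Dynamic Analysis, Parameter Inference, Parametric Static Analysis), annotated with the timings 6-8 s and 38s-86ms.]
Query outcomes: 52% proved, 28% disproved, 20% don't know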
31
Summary Learning for Dynamic Analysis
Can be effective
Not sure that actual runs are needed
32
Abstractions can be used to improve dynamic analysis
Debugging
Garbage collection
33
Abstractions and Decision Procedures
Compute best transformers
Assume-guarantee reasoning
CEGAR (interpolants)
Abduction for composing analysis
Abstraction can help scaling decision procedures
SMT
34
Symbolic Operations: Three Value-Spaces
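[Diagram: three value-spaces (concrete values, formulas, abstract values), with the transformer T on the concrete side and T# on the abstract side.]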
35
Symbolic Operations: Three Value-Spaces
Concrete values: 2, 4, 16, …
Formula: even(x)
Abstract value: $[x \mapsto E]$
36
Symbolic Operations: Three Value-Spaces
[Diagram: a linked list pointed to by x as a concrete value, the corresponding formula, and a shape graph with nodes u1 and u as the abstract value.]
37
Required Primitive Operations
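Abstraction: $\alpha(S) = \bigsqcup_{store \in S} \beta(store)$ [shape-graph example not recoverable]
Symbolic concretization (of the shape graph with nodes $u_1$ and $u$, figure not recoverable): $\exists v_1, v_2 : node_{u_1}(v_1) \land node_u(v_2) \land v_1 \neq v_2 \land \forall v : node_{u_1}(v) \lor node_u(v)$
Theorem prover returning a satisfying structure (store): $S \models \varphi$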
38
Constant-Propagation Domain
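$(Var \to \mathbb{Z}^{\top})_{\bot}$, where $\mathbb{Z}^{\top} = \mathbb{Z} \cup \{\top\}$
Examples: $\bot$, $[x \mapsto 0, y \mapsto 43, z \mapsto 0]$, $[x \mapsto \top, y \mapsto \top, z \mapsto 0]$, $[x \mapsto \top, y \mapsto \top, z \mapsto \top]$
Infinite cardinality, but finite height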
39
Three Value-Spaces
Concrete values: $[x \mapsto 0, y \mapsto 0, z \mapsto 0]$, $[x \mapsto 0, y \mapsto 1, z \mapsto 0]$, $[x \mapsto 0, y \mapsto 2, z \mapsto 0]$
Formula: $(x = 0) \land (z = 0)$
Abstract value: $[x \mapsto 0, y \mapsto \top, z \mapsto 0]$
40
Three Value-Spaces
Concrete values: $[x \mapsto 0, y \mapsto 0, z \mapsto 0]$, $[x \mapsto 0, y \mapsto 1, z \mapsto 0]$, $[x \mapsto 0, y \mapsto 2, z \mapsto 0]$
Formula: $(x = 0) \land (z = 0)$
Selected store: $[x \mapsto 0, y \mapsto 2, z \mapsto 0]$
41
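Required Primitive Operations
Abstraction: $\alpha(S) = \bigsqcup_{store \in S} \beta(store)$, with $\beta([x \mapsto 0, y \mapsto 2, z \mapsto 0]) = [x \mapsto 0, y \mapsto 2, z \mapsto 0]$
Symbolic concretization: $\hat{\gamma}([x \mapsto 0, y \mapsto \top, z \mapsto 0]) = (x = 0) \land (z = 0)$
Theorem prover returning a satisfying structure (store): $S \models \varphi$, for example $[x \mapsto 0, y \mapsto 2, z \mapsto 0] \models (x = 0) \land (z = 0)$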
42
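Required Primitive Operations
Abstraction: $\alpha(S) = \bigsqcup_{store \in S} \beta(store)$, with $\beta([x \mapsto 0, y \mapsto 2, z \mapsto 0]) = [x \mapsto 0, y \mapsto 2, z \mapsto 0]$
Symbolic concretization: $\hat{\gamma}([x \mapsto 0, y \mapsto \top, z \mapsto 0]) = (x = 0) \land (z = 0)$
Theorem prover returning a satisfying structure (store): $S \models \varphi$, for example $[x \mapsto 0, y \mapsto 2, z \mapsto 0] \models (z = 0) \land (x = y * z)$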
43
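Constant Propagation
Statement: x = y * z
Concrete transformer: $T[x = y * z] = \lambda e.\, e[x \mapsto e(y) * e(z)]$
Example: $[x \mapsto 3, y \mapsto 4, z \mapsto 1]$ is mapped to $[x' \mapsto 4, y' \mapsto 4, z' \mapsto 1]$
As a formula: $T[x := y * z] =_{df} (x' = y * z) \land (y' = y) \land (z' = z)$
$[x \mapsto 3, y \mapsto 4, z \mapsto 1, x' \mapsto 4, y' \mapsto 4, z' \mapsto 1] \models (x' = y * z) \land (y' = y) \land (z' = z)$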
44
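Constant Propagation
Statement: x = y * z
Abstract transformer: $T^{\#}[x = y * z] = \lambda e.\, e[x \mapsto e(y) *^{\#} e(z)]$
Example: $[x \mapsto 3, y \mapsto \top, z \mapsto 1]$ is mapped to $[x' \mapsto \top, y' \mapsto \top, z' \mapsto 1]$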
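A minimal sketch of this abstract transformer follows. It assumes the usual strict definition of abstract multiplication, in which any unknown operand yields an unknown result; the names TOP, abs_mul and transfer_assign_mul are illustrative.

```python
TOP = "T"  # stands for the "unknown constant" element of the domain

def abs_mul(a, b):
    """Strict abstract multiplication: an unknown operand gives TOP."""
    if a == TOP or b == TOP:
        return TOP
    return a * b

def transfer_assign_mul(env, x, y, z):
    """Abstract effect of the statement x = y * z on a constant-propagation store."""
    out = dict(env)
    out[x] = abs_mul(env[y], env[z])
    return out

print(transfer_assign_mul({"x": 3, "y": 4, "z": 1}, "x", "y", "z"))
print(transfer_assign_mul({"x": 3, "y": TOP, "z": 1}, "x", "y", "z"))
print(transfer_assign_mul({"x": TOP, "y": TOP, "z": 0}, "x", "y", "z"))
```

With this strict operator the last call leaves x unknown even though z is 0, so y * z is 0 for every concrete store. That loss of precision is what a best abstract transformer avoids.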
45
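Three Value-Spaces
Abstract value before: $[x \mapsto \top, y \mapsto \top, z \mapsto 0]$, with formula $(z = 0)$
Transformer: $T[x := y * z]$
Formula after: $(x' = 0) \land (z' = 0)$
Abstract value after: $[x' \mapsto 0, y' \mapsto \top, z' \mapsto 0]$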
46
Remainder
$\hat{\alpha}(\varphi)$: best abstract value that represents $\varphi$
Best(T, a): best abstract transformer
47
Idea Behind Procedure $\hat{\alpha}_{CP}(\varphi)$
[Animated diagram over slides 47 to 53; only its labels survive. The three value-spaces are concrete values, formulas, and abstract values. Each step selects a store $S \models \varphi_i$, updates ans := ans $\sqcup$ $\beta(S)$, and forms the next formula from $\hat{\gamma}(ans)$ (for example $\varphi_2 = \varphi_1 \land \lnot\hat{\gamma}(ans)$), until the formula becomes false ($\varphi_5 = false$ in the diagram).]
54
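Procedure $\hat{\alpha}$(formula $\varphi$) {
    ans := $\bot$
    $\varphi'$ := $\varphi$
    while ($\varphi'$ is satisfiable) {
        Select a store $S$ such that $S \models \varphi'$
        ans := ans $\sqcup$ $\beta(S)$
        $\varphi'$ := $\varphi' \land \lnot\hat{\gamma}$(ans)
    }
    return ans
}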
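The loop can be tried out for the constant-propagation domain with a small sketch. A bounded brute-force search over integer stores stands in for the theorem prover, so results are only meaningful for formulas whose models fall inside the searched range. Formulas are represented as Python predicates over a store, the bottom element as None, and all names (find_model, gamma_hat, alpha_hat, RANGE) are illustrative.

```python
from itertools import product

TOP = "T"
VARS = ("x", "y", "z")
RANGE = range(-3, 4)  # assumption: bounded search replaces a real prover

def find_model(formula):
    """Return some store satisfying the formula, or None if there is none."""
    for values in product(RANGE, repeat=len(VARS)):
        store = dict(zip(VARS, values))
        if formula(store):
            return store
    return None

def beta(store):
    """Abstraction of a single store: every variable is a known constant."""
    return dict(store)

def join(a, b):
    """Least upper bound in the constant-propagation domain (None is bottom)."""
    if a is None:
        return dict(b)
    return {v: a[v] if a[v] == b[v] else TOP for v in a}

def gamma_hat(a):
    """Symbolic concretization: the formula 'v = c' for every constant entry."""
    if a is None:
        return lambda s: False
    return lambda s: all(c == TOP or s[v] == c for v, c in a.items())

def alpha_hat(phi):
    ans = None
    remaining = phi
    while True:
        s = find_model(remaining)
        if s is None:
            return ans
        ans = join(ans, beta(s))
        covered = gamma_hat(ans)
        # gamma_hat(ans) only grows, so phi AND NOT gamma_hat(ans) is
        # equivalent to the accumulated conjunction used in the procedure.
        remaining = lambda st, c=covered: phi(st) and not c(st)

result = alpha_hat(lambda s: s["z"] == 0 and s["x"] == s["y"] * s["z"])
print(result)
```

Each iteration either terminates or strictly raises ans in the lattice, which is why the finite height of the domain bounds the number of prover calls.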
55
Procedure $\hat{\alpha}_{CP}(\varphi)$
$\varphi = (z = 0) \land (x = y * z)$
$S = [x \mapsto 0, y \mapsto 43, z \mapsto 0]$, $S \models \varphi$
ans $= [x \mapsto 0, y \mapsto 43, z \mapsto 0]$
56
Procedure $\hat{\alpha}_{CP}(\varphi)$
$\varphi = (z = 0) \land (x = y * z)$
$S = [x \mapsto 0, y \mapsto 43, z \mapsto 0]$
ans $= [x \mapsto 0, y \mapsto 43, z \mapsto 0]$
$\hat{\gamma}$(ans) $= (x = 0) \land (y = 43) \land (z = 0)$
57
Procedure $\hat{\alpha}_{CP}(\varphi)$
$\varphi' = (z = 0) \land (x = y * z) \land (y \neq 43)$
$S = [x \mapsto 0, y \mapsto 24, z \mapsto 0]$, $S \models \varphi'$
Previous ans $= [x \mapsto 0, y \mapsto 43, z \mapsto 0]$
58
Procedure $\hat{\alpha}_{CP}(\varphi)$
$\varphi' = (z = 0) \land (x = y * z) \land (y \neq 43)$
ans $= [x \mapsto 0, y \mapsto \top, z \mapsto 0]$
$\hat{\gamma}$(ans) $= (x = 0) \land (z = 0)$
59
The Idea Behind Best = $\alpha \circ T \circ \gamma$
[Animated diagram over slides 59 to 62; only its labels survive. Starting from the abstract value $a$, form the formula $\hat{\gamma}(a) \land T$ and accumulate ans over its models, moving between the Formulas and Abstract Values spaces.]
63
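Procedure Best
Best(two-store-formula $T$, abs-store $a$) {
    ans' := $\bot'$
    $\varphi$ := $\hat{\gamma}(a) \land T$
    while ($\varphi$ is satisfiable) {
        Select a store pair $(S, S')$ such that $(S, S') \models \varphi$
        ans' := ans' $\sqcup$ $\beta'(S')$
        $\varphi$ := $\varphi \land \lnot\hat{\gamma}'$(ans')
    }
    return ans'
}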
64
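Best($(x' = y * z) \land (y' = y) \land (z' = z)$, $[x \mapsto \top, y \mapsto \top, z \mapsto 0]$)
Initialization: ans' := $\bot'$; $\varphi$ := $(z = 0) \land (x' = y * z) \land (y' = y) \land (z' = z)$
Iteration 1: $(S, S')$ := $[x \mapsto 5, y \mapsto 17, z \mapsto 0, x' \mapsto 0, y' \mapsto 17, z' \mapsto 0]$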
65
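The Idea Behind Best = $\alpha \circ T \circ \gamma$
[Diagram: a model of $\hat{\gamma}(a) \land T$ with pre-store $[x \mapsto 5, y \mapsto 17, z \mapsto 0]$ and post-store $[x' \mapsto 0, y' \mapsto 17, z' \mapsto 0]$.]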
66
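Best($(x' = y * z) \land (y' = y) \land (z' = z)$, $[x \mapsto \top, y \mapsto \top, z \mapsto 0]$)
Initialization: ans' := $\bot'$; $\varphi$ := $(z = 0) \land (x' = y * z) \land (y' = y) \land (z' = z)$
Iteration 1:
    $(S, S')$ := $[x \mapsto 5, y \mapsto 17, z \mapsto 0, x' \mapsto 0, y' \mapsto 17, z' \mapsto 0]$
    ans' := $[x' \mapsto 0, y' \mapsto 17, z' \mapsto 0]$
    $\hat{\gamma}'$(ans') $= (x' = 0) \land (y' = 17) \land (z' = 0)$
    $\varphi$ := $(z = 0) \land (x' = y * z) \land (y' = y) \land (z' = z) \land (y' \neq 17)$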
67
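Best($(x' = y * z) \land (y' = y) \land (z' = z)$, $[x \mapsto \top, y \mapsto \top, z \mapsto 0]$)
Iteration 2:
    $(S, S')$ := $[x \mapsto 12, y \mapsto 99, z \mapsto 0, x' \mapsto 0, y' \mapsto 99, z' \mapsto 0]$
    ans' := $[x' \mapsto 0, y' \mapsto 17, z' \mapsto 0] \sqcup [x' \mapsto 0, y' \mapsto 99, z' \mapsto 0] = [x' \mapsto 0, y' \mapsto \top, z' \mapsto 0]$
    $\hat{\gamma}'$(ans') $= (x' = 0) \land (z' = 0)$
    $\varphi$ := $(z = 0) \land (x' = y * z) \land (y' = y) \land (z' = z) \land (y' \neq 17) \land \lnot((x' = 0) \land (z' = 0))$ = false
Iteration 3: $\varphi$ is unsatisfiable
Return value: $[x' \mapsto 0, y' \mapsto \top, z' \mapsto 0]$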
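The worked example can be replayed with a short sketch of Best for constant propagation. As an assumption, a bounded brute-force search over store pairs replaces the theorem prover, so the stores it picks differ from the ones on the slides, and the answer is only trustworthy for transformers whose relevant models lie inside the searched range. Names such as find_model, gamma_hat and best are illustrative.

```python
from itertools import product

TOP = "T"
PRE = ("x", "y", "z")
POST = ("x'", "y'", "z'")
RANGE = range(-2, 3)  # assumption: bounded search replaces a real prover

def find_model(formula):
    """Return a store pair (pre and post variables) satisfying the formula."""
    for values in product(RANGE, repeat=len(PRE) + len(POST)):
        pair = dict(zip(PRE + POST, values))
        if formula(pair):
            return pair
    return None

def join(a, b):
    """Least upper bound in the constant-propagation domain (None is bottom)."""
    if a is None:
        return dict(b)
    return {v: a[v] if a[v] == b[v] else TOP for v in a}

def gamma_hat(a):
    """Symbolic concretization over whichever variables the abstract store names."""
    if a is None:
        return lambda s: False
    return lambda s: all(c == TOP or s[v] == c for v, c in a.items())

def best(tau, a):
    ans = None
    pre = gamma_hat(a)
    phi = lambda s: pre(s) and tau(s)
    while True:
        pair = find_model(phi)
        if pair is None:
            return ans
        # beta' keeps only the post-state half of the store pair.
        ans = join(ans, {v: pair[v] for v in POST})
        covered = gamma_hat(ans)
        phi = lambda s, c=covered: pre(s) and tau(s) and not c(s)

# Two-store formula for x := y * z.
tau = lambda s: (s["x'"] == s["y"] * s["z"]
                 and s["y'"] == s["y"] and s["z'"] == s["z"])

result = best(tau, {"x": TOP, "y": TOP, "z": 0})
print(result)
```

Unlike a strict abstract multiplication, this recovers x' = 0 from z = 0 even though y is unknown.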
68
Best(y = x->next, a)
[Shape-graph figures not recoverable: input and output graphs over nodes u1, u2, u3, u4 and u with the predicates x, y, r[x], r[y] and their primed versions.]
. . . $(y'(v) \leftrightarrow \exists v_1 : x(v_1) \land n(v_1, v))$ . . .
69
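Predicate Abstraction
Program: y := 3; x := 4*y + 1
Concrete store: $[x \mapsto 13, y \mapsto 3]$
Predicates: { $B_1 \equiv (y = 1)$, $B_2 \equiv (y = 3)$, $B_3 \equiv (y = 4)$, $B_4 \equiv (x = 1)$, $B_5 \equiv (x = 3)$, $B_6 \equiv (x = 4)$ }
Abstract value: $\lnot B_1 \land B_2 \land \lnot B_3 \land \lnot B_4 \land \lnot B_5 \land \lnot B_6$, that is, $y = 3 \land x \notin \{1, 3, 4\}$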
70
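Three Value-Spaces
Concrete values: $[x \mapsto 5, y \mapsto 3]$, $[x \mapsto 0, y \mapsto 3]$, $[x \mapsto 17, y \mapsto 3]$
Abstract value: $(\lnot B_1, B_2, \lnot B_3, \lnot B_4, \lnot B_5, \lnot B_6)$
Formula: $(y \neq 1) \land (y = 3) \land (y \neq 4) \land (x \neq 1) \land (x \neq 3) \land (x \neq 4)$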
71
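Three Value-Spaces
Abstract value before: $(\lnot B_1, B_2, \lnot B_3, \lnot B_4, \lnot B_5, \lnot B_6)$, with formula $(y \neq 1) \land (y = 3) \land (y \neq 4) \land (x \neq 1) \land (x \neq 3) \land (x \neq 4)$
Transformer: $T[x := x + 1]$
Formula after: $(y \neq 1) \land (y = 3) \land (y \neq 4) \land (x \neq 4)$
Abstract value after: $(\lnot B_1, B_2, \lnot B_3, \lnot B_6)$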
72
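Predicate Abstraction
Abstract value: $(\lnot B_1, B_2, \lnot B_3, \lnot B_4, \lnot B_5, \lnot B_6)$
Apply $\hat{\gamma}$, which performs $\gamma$ symbolically: $(y \neq 1) \land (y = 3) \land (y \neq 4) \land (x \neq 1) \land (x \neq 3) \land (x \neq 4)$
Apply $\hat{\alpha} T$, which implements $\alpha \circ T$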
73
$\hat{\alpha}_{PA}$: Most-Precise Abstract Value [Predicate Abstraction]
$\hat{\alpha}_{PA}((y = 3) \land (x = 4*y + 1)) = (\lnot B_1, B_2, \lnot B_3, \lnot B_4, \lnot B_5, \lnot B_6)$
74
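$\hat{\alpha}_{PA}$: Most-Precise Abstract Value [Predicate Abstraction]
$$\hat{\alpha}_{PA}(\varphi) = \begin{cases} false & \text{if } \varphi \text{ is unsatisfiable} \\ \bigwedge_{j=1}^{k} \begin{cases} B_j & \text{if } \varphi \Rightarrow B_j \text{ is valid} \\ \lnot B_j & \text{if } \varphi \Rightarrow \lnot B_j \text{ is valid} \\ true & \text{otherwise} \end{cases} & \text{otherwise} \end{cases}$$
$\hat{\alpha}_{PA}((y = 3) \land (x = 4*y + 1)) = \lnot B_1 \land B_2 \land \lnot B_3 \land \lnot B_4 \land \lnot B_5 \land \lnot B_6$
Validity checks for $B_1$ to $B_3$:
    $(y = 3) \land (x = 4*y + 1) \Rightarrow \lnot(y = 1)$
    $(y = 3) \land (x = 4*y + 1) \Rightarrow (y = 3)$
    $(y = 3) \land (x = 4*y + 1) \Rightarrow \lnot(y = 4)$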
75
$\hat{\alpha}_{PA}$: Most-Precise Abstract Value [Predicate Abstraction]
Continuing the example $\hat{\alpha}_{PA}((y = 3) \land (x = 4*y + 1))$, the validity checks for $B_4$ to $B_6$:
    $(y = 3) \land (x = 4*y + 1) \Rightarrow \lnot(x = 1)$
    $(y = 3) \land (x = 4*y + 1) \Rightarrow \lnot(x = 3)$
    $(y = 3) \land (x = 4*y + 1) \Rightarrow \lnot(x = 4)$
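The case analysis can be sketched for the six predicates of the running example. Validity of an implication is decided here by enumerating a bounded range of stores, which is an assumption standing in for a theorem prover. In the result, True stands for B_j, False for its negation, and None for "true" (nothing known); an unsatisfiable formula yields None for the whole result. All names are illustrative.

```python
from itertools import product

RANGE = range(0, 20)  # assumption: bounded search replaces a real prover

PREDICATES = [
    lambda s: s["y"] == 1,  # B1
    lambda s: s["y"] == 3,  # B2
    lambda s: s["y"] == 4,  # B3
    lambda s: s["x"] == 1,  # B4
    lambda s: s["x"] == 3,  # B5
    lambda s: s["x"] == 4,  # B6
]

def stores():
    for x, y in product(RANGE, repeat=2):
        yield {"x": x, "y": y}

def alpha_pa(phi):
    """One validity check per predicate and polarity, as in the case analysis."""
    models = [s for s in stores() if phi(s)]
    if not models:
        return None  # phi is unsatisfiable: the abstract value is "false"
    result = []
    for b in PREDICATES:
        if all(b(s) for s in models):        # phi implies B_j
            result.append(True)
        elif all(not b(s) for s in models):  # phi implies not B_j
            result.append(False)
        else:                                # neither implication is valid
            result.append(None)
    return result

value = alpha_pa(lambda s: s["y"] == 3 and s["x"] == 4 * s["y"] + 1)
print(value)
```

In contrast to the general procedure, the number of prover calls here is fixed by the number of predicates rather than by the height of the lattice.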
76
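Procedure $\hat{\alpha}_{PA}$ vs. General $\hat{\alpha}$
[Side-by-side diagrams over the three value-spaces (concrete values, formulas, abstract values); figures not recoverable. Labels of the general procedure: $S \models \varphi_i$, $ans_i = ans_{i-1} \sqcup \beta(S)$, $\hat{\gamma}(ans_{i-1})$.]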
77
Conclusions
Requirements
    Finite-height abstract domain
    Theorem prover that returns a satisfying structure (store)
    $\alpha(S) = \bigsqcup_{s \in S} \beta(s)$
    Symbolic-concretization operation $\hat{\gamma}(a)$
$\hat{\alpha}(\varphi)$: best abstract value that represents $\varphi$
Best(T, a): best abstract transformer
78
Abstractions and Machine Learning Sriram Rajamani & Percy Liang
Machine learning techniques can learn abstractions
Abstractions can be used for machine learning
79
Open Problems in Program Analysis
Predictability
Specializing static analysis to a set of programs
Numerical analysis
Polyhedra
Cost of operations
Scaling
Disjunctions
Narrowing
Interesting subclasses
Shape analysis
Ownership
Non-disjunctive domains
Binary widening
Can PL be designed for better static analysis?
Concurrency
Abstractions relations on programs
Modularity
Is greatest fixed point the answer?
Theory of AI
Widening
Monotonicity
Necessity