Presentation is loading. Please wait.

Presentation is loading. Please wait.

Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.

Similar presentations


Presentation on theme: "Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic."— Presentation transcript:

1 Leonardo de Moura Microsoft Research

2 Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic Reasoning

3 Logic is “The Calculus of Computer Science” (Z. Manna). High computational complexity Satisfiability Modulo Theories: A Calculus of Computation

4 Test case generationVerifying CompilersPredicate Abstraction Invariant Generation Type Checking Model Based Testing

5 VCC Hyper-V Terminator T-2 NModel HAVOC F7 SAGE Vigilante SpecExplorer Satisfiability Modulo Theories: A Calculus of Computation

6 unsigned GCD(x, y) { requires(y > 0); while (true) { unsigned m = x % y; if (m == 0) return y; x = y; y = m; } We want a trace where the loop is executed twice. (y 0 > 0) and (m 0 = x 0 % y 0 ) and not (m 0 = 0) and (x 1 = y 0 ) and (y 1 = m 0 ) and (m 1 = x 1 % y 1 ) and (m 1 = 0) SolverSolver x 0 = 2 y 0 = 4 m 0 = 2 x 1 = 4 y 1 = 2 m 1 = 0 SSASSA Satisfiability Modulo Theories: A Calculus of Computation

7 Signature: div : int, { x : int | x  0 }  int Satisfiability Modulo Theories: A Calculus of Computation Subtype Call site: if a  1 and a  b then return div(a, b) Verification condition a  1 and a  b implies b  0

8 Satisfiability Modulo Theories: A Calculus of Computation Is formula F satisfiable modulo theory T ? SMT solvers have specialized algorithms for T

9 b + 2 = c and f(read(write(a,b,3), c-2) ≠ f(c-b+1) Satisfiability Modulo Theories: A Calculus of Computation

10 ArithmeticArithmetic b + 2 = c and f(read(write(a,b,3), c-2) ≠ f(c-b+1) Satisfiability Modulo Theories: A Calculus of Computation

11 ArithmeticArithmetic Array Theory b + 2 = c and f(read(write(a,b,3), c-2) ≠ f(c-b+1) Satisfiability Modulo Theories: A Calculus of Computation

12 ArithmeticArithmetic Array Theory Uninterpreted Functions b + 2 = c and f(read(write(a,b,3), c-2) ≠ f(c-b+1) Satisfiability Modulo Theories: A Calculus of Computation

13 A Theory is a set of sentences Alternative definition: A Theory is a class of structures Satisfiability Modulo Theories: A Calculus of Computation

14 Z3 is a new solver developed at Microsoft Research. Development/Research driven by internal customers. Free for academic research. Interfaces: http://research.microsoft.com/projects/z3 Z3 TextC/C++.NETOCaml Satisfiability Modulo Theories: A Calculus of Computation

15 For most SMT solvers: F is a set of ground formulas Many Applications Bounded Model Checking Test-Case Generation Satisfiability Modulo Theories: A Calculus of Computation

16 An SMT Solver is a collection of Little Engines of Proof Satisfiability Modulo Theories: A Calculus of Computation

17 a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: A Calculus of Computation a a b b c c d d e e s s t t

18 a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: A Calculus of Computation a a b b c c d d e e s s t t a,b

19 a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: A Calculus of Computation c c d d e e s s t t a,b a,b,c

20 d,e a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: A Calculus of Computation d d e e s s t t a,b,c

21 a,b,c,s a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: A Calculus of Computation s s t t a,b,c d,e

22 a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: A Calculus of Computation t t d,e a,b,c,s d,e,t

23 a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: A Calculus of Computation a,b,c,s d,e,t

24 a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: A Calculus of Computation a,b,c,s d,e,t Unsatisfiable

25 a = b, b = c, d = e, b = s, d = t, a  e Satisfiability Modulo Theories: A Calculus of Computation a,b,c,s d,e,t Model |M| = { 0, 1 } M(a) = M(b) = M(c) = M(s) = 0 M(d) = M(e) = M(t) = 1

26 a = b, b = c, d = e, b = s, d = t, f(a, g(d))  f(b, g(e)) Satisfiability Modulo Theories: A Calculus of Computation a,b,c,s d,e,t g(d) f(a,g(d)) g(e) f(b,g(e)) Congruence Rule: x 1 = y 1, …, x n = y n implies f(x 1, …, x n ) = f(y 1, …, y n )

27 a = b, b = c, d = e, b = s, d = t, f(a, g(d))  f(b, g(e)) Satisfiability Modulo Theories: A Calculus of Computation a,b,c,s d,e,t g(d) f(a,g(d)) g(e) f(b,g(e)) Congruence Rule: x 1 = y 1, …, x n = y n implies f(x 1, …, x n ) = f(y 1, …, y n ) g(d),g(e)

28 a = b, b = c, d = e, b = s, d = t, f(a, g(d))  f(b, g(e)) Satisfiability Modulo Theories: A Calculus of Computation a,b,c,s d,e,t f(a,g(d)) f(b,g(e)) Congruence Rule: x 1 = y 1, …, x n = y n implies f(x 1, …, x n ) = f(y 1, …, y n ) g(d),g(e) f(a,g(d)),f(b,g(e))

29 a = b, b = c, d = e, b = s, d = t, f(a, g(d))  f(b, g(e)) Satisfiability Modulo Theories: A Calculus of Computation a,b,c,s d,e,t g(d),g(e) f(a,g(d)),f(b,g(e)) Unsatisfiable

30 (fully shared) DAGs for representing terms Union-find data-structure + Congruence Closure O(n log n) Satisfiability Modulo Theories: A Calculus of Computation

31 Many theories can be reduced to Equality + Uninterpreted functions Arrays, Sets Lists, Tuples Inductive Datatypes Satisfiability Modulo Theories: A Calculus of Computation

32 a – b ≤ 2 Satisfiability Modulo Theories: A Calculus of Computation x := x + 1 x 1 = x 0 + 1 x 1 ≤ x 0 + 1, x 1 ≥ x 0 + 1 x 1 – x 0 ≤ 1, x 0 – x 1 ≤ -1

33 a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0 Satisfiability Modulo Theories: A Calculus of Computation a a b b c c d d

34 a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0 Satisfiability Modulo Theories: A Calculus of Computation a a b b c c d d 2

35 a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0 Satisfiability Modulo Theories: A Calculus of Computation a a b b c c d d 2 -3

36 a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0 Satisfiability Modulo Theories: A Calculus of Computation a a b b c c d d 2 -3

37 a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0 Satisfiability Modulo Theories: A Calculus of Computation a a b b c c d d 2 -3 4

38 a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0 Satisfiability Modulo Theories: A Calculus of Computation a a b b c c d d 2 -3 4 0

39 a – b ≤ 2, b – c ≤ -3, b – d ≤ -1, d – a ≤ 4, c – a ≤ 0 Satisfiability Modulo Theories: A Calculus of Computation a a b b c c d d 2 -3 4 0 Unsatisfiable iff Negative Cycle Unsatisfiable iff Negative Cycle a – b ≤ 2, b – c ≤ -3, c – a ≤ 0 0 ≤ -1

40 Shortest Path Algorithms Bellman-Ford: O(nm) Floyd-Warshall: O(n 3 ) Satisfiability Modulo Theories: A Calculus of Computation

41 Linear ArithmeticSimplex, Fourier Motzkin Non-linear (complex) arithmetic Grobner Basis Non-linear (real) arithmeticCylindrical Algebraic Decomposition Equational theoriesKnuth-Bendix Completion BitvectorsBit-blasting, rewriting

42 Satisfiability Modulo Theories: A Calculus of Computation x 2 y – 1 = 0, xy 2 – y = 0, xz – z + 1 = 0 Tool: Gröbner Basis

43 Satisfiability Modulo Theories: A Calculus of Computation Polynomial Ideals: Algebraic generalization of zeroness 0  I p  I, q  I implies p + q  I p  I implies pq  I

44 Satisfiability Modulo Theories: A Calculus of Computation The ideal generated by a finite collection of polynomials P = { p 1, …, p n } is defined as: I(P) = {p 1 q 1 + … + p n q n | q 1, …, q n are polynomials} P is called a basis for I(P). Intuition: For all s  I(P), p 1 = 0, …, p n = 0 implies s = 0

45 Satisfiability Modulo Theories: A Calculus of Computation Hilbert’s Weak Nullstellensatz p 1 = 0, …, p n = 0 is unsatisfiable over C iff I({p 1, …, p n }) contains all polynomials 1  I({p 1, …, p n })

46 Satisfiability Modulo Theories: A Calculus of Computation 1 st Key Idea: polynomials as rewrite rules. xy 2 – y = 0 Becomes xy 2  y The rewriting system is terminating but it is not confluent. xy 2  y, x 2 y  1 x2y2x2y2 xy y

47 Satisfiability Modulo Theories: A Calculus of Computation 2 nd Key Idea: Completion. xy 2  y, x 2 y  1 x2y2x2y2 xy y Add polynomial: xy – y = 0 xy  y

48 Satisfiability Modulo Theories: A Calculus of Computation x 2 y – 1 = 0, xy 2 – y = 0, xz – z + 1 = 0 x 2 y  1, xy 2  y, xz  z – 1 x 2 y  1, xy 2  y, xz  z – 1, xy  y xy  1, xy 2  y, xz  z – 1, xy  y y  1, xy 2  y, xz  z – 1, xy  y y  1, x  1, xz  z – 1, xy  y y  1, x  1, 1 = 0, xy  y

49 In practice, we need a combination of theory solvers. Nelson-Oppen combination method. Reduction techniques. Model-based theory combination. Satisfiability Modulo Theories: A Calculus of Computation

50 p  q, p   q,  p  q,  p   q

51 Satisfiability Modulo Theories: A Calculus of Computation p  q, p   q,  p  q,  p   q Assignment: p = false, q = false

52 Satisfiability Modulo Theories: A Calculus of Computation p  q, p   q,  p  q,  p   q Assignment: p = false, q = true

53 Satisfiability Modulo Theories: A Calculus of Computation p  q, p   q,  p  q,  p   q Assignment: p = true, q = false

54 Satisfiability Modulo Theories: A Calculus of Computation p  q, p   q,  p  q,  p   q Assignment: p = true, q = true

55 M | F Partial model Set of clauses Satisfiability Modulo Theories: A Calculus of Computation

56 Guessing (case-splitting) p,  q | p  q,  q  r p | p  q,  q  r Satisfiability Modulo Theories: A Calculus of Computation

57 Deducing p, s| p  q,  p  s p | p  q,  p  s Satisfiability Modulo Theories: A Calculus of Computation

58 Backtracking p, s| p  q, s  q,  p   q p,  s, q | p  q, s  q,  p   q Satisfiability Modulo Theories: A Calculus of Computation

59 Efficient indexing (two-watch literal) Non-chronological backtracking (backjumping) Lemma learning … Satisfiability Modulo Theories: A Calculus of Computation

60 Efficient decision procedures for conjunctions of ground literals. a=b, a 10 Satisfiability Modulo Theories: A Calculus of Computation

61 a=b, a > 0, c > 0, a + c < 0 | F backtrack

62 Satisfiability Modulo Theories: A Calculus of Computation SMT Solver = DPLL + Decision Procedure Standard question: Why don’t you use CPLEX for handling linear arithmetic? Standard question: Why don’t you use CPLEX for handling linear arithmetic?

63 Satisfiability Modulo Theories: A Calculus of Computation Decision Procedures must be: Incremental & Backtracking Theory Propagation a=b, a<5 | … a<6  f(a) = a a=b, a<5, a<6 | … a<6  f(a) = a

64 Satisfiability Modulo Theories: A Calculus of Computation Decision Procedures must be: Incremental & Backtracking Theory Propagation Precise (theory) lemma learning a=b, a > 0, c > 0, a + c < 0 | F Learn clause:  (a=b)   (a > 0)   (c > 0)   (a + c < 0) Imprecise! Precise clause:  a > 0   c > 0   a + c < 0

65 For some theories, SMT can be reduced to SAT bvmul 32 (a,b) = bvmul 32 (b,a) Higher level of abstraction Satisfiability Modulo Theories: A Calculus of Computation

66 F  TF  T First-order Theorem Prover T may not have a finite axiomatization Satisfiability Modulo Theories: A Calculus of Computation

67

68 Test (correctness + usability) is 95% of the deal: Dev/Test is 1-1 in products. Developers are responsible for unit tests. Tools: Annotations and static analysis (SAL + ESP) File Fuzzing Unit test case generation Satisfiability Modulo Theories: A Calculus of Computation

69 Security is critical Security bugs can be very expensive: Cost of each MS Security Bulletin: $600k to $Millions. Cost due to worms: $Billions. The real victim is the customer. Most security exploits are initiated via files or packets. Ex: Internet Explorer parses dozens of file formats. Security testing: hunting for million dollar bugs Write A/V Read A/V Null pointer dereference Division by zero Satisfiability Modulo Theories: A Calculus of Computation

70 Two main techniques used by “black hats”: Code inspection (of binaries). Black box fuzz testing. Black box fuzz testing: A form of black box random testing. Randomly fuzz (=modify) a well formed input. Grammar-based fuzzing: rules to encode how to fuzz. Heavily used in security testing At MS: several internal tools. Conceptually simple yet effective in practice Satisfiability Modulo Theories: A Calculus of Computation

71 Execution Path Run Test and Monitor Path Condition Solve seed New input Test Inputs Constraint System Known Paths Satisfiability Modulo Theories: A Calculus of Computation

72 PEX Implements DART for.NET. SAGE Implements DART for x86 binaries. YOGI Implements DART to check the feasibility of program paths generated statically. Vigilante Partially implements DART to dynamically generate worm filters. Satisfiability Modulo Theories: A Calculus of Computation

73 Test input generator Pex starts from parameterized unit tests Generated tests are emitted as traditional unit tests Satisfiability Modulo Theories: A Calculus of Computation

74

75 class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

76 class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } } Inputs

77 (0,null) class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

78 InputsObserved Constraints (0,null)!(c<0) class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } } c < 0  false

79 InputsObserved Constraints (0,null)!(c<0) && 0==c class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } } 0 == c  true

80 InputsObserved Constraints (0,null)!(c<0) && 0==c class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } } item == item  true This is a tautology, i.e. a constraint that is always true, regardless of the chosen values. We can ignore such constraints.

81 Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

82 Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c(1,null) class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

83 Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c(1,null)!(c<0) && 0!=c class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } } 0 == c  false

84 Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c(1,null)!(c<0) && 0!=c c<0 class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

85 Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c(1,null)!(c<0) && 0!=c c<0(-1,null) class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

86 Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c(1,null)!(c<0) && 0!=c c<0(-1,null)c<0 class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } } c < 0  true

87 Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c(1,null)!(c<0) && 0!=c c<0(-1,null)c<0 class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

88 Rich Combination Linear arithmetic BitvectorArrays Free Functions Models Model used as test inputs  -Quantifier Used to model custom theories (e.g.,.NET type system) API Huge number of small problems. Textual interface is too inefficient. Satisfiability Modulo Theories: A Calculus of Computation

89 Pex “sends” several similar formulas to Z3. Plus: backtracking primitives in the Z3 API. push pop Reuse (some) lemmas. Satisfiability Modulo Theories: A Calculus of Computation

90 Given a set of constraints C, find a model M that minimizes the interpretation for x 0, …, x n. In the ArrayList example: Why is the model where c = 2147483648 less desirable than the model with c = 1? !(c<0) && 0!=c Simple solution: Assert C while satisfiable Peek x i such that M[x i ] is big Assert x i < n, where n is a small constant Return last found model Satisfiability Modulo Theories: A Calculus of Computation

91 Given a set of constraints C, find a model M that minimizes the interpretation for x 0, …, x n. In the ArrayList example: Why is the model where c = 2147483648 less desirable than the model with c = 1? !(c<0) && 0!=c Refinement: Eager solution stops as soon as the system becomes unsatisfiable. A “bad” choice (peek x i ) may prevent us from finding a good solution. Use push and pop to retract “bad” choices. Satisfiability Modulo Theories: A Calculus of Computation

92 Apply DART to large applications (not units). Start with well-formed input (not random). Combine with generational search (not DFS). Negate 1-by-1 each constraint in a path constraint. Generate many children for each parent run. parent generation 1 Satisfiability Modulo Theories: A Calculus of Computation

93 Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser 00000000h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000030h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000060h: 00 00 00 00 ;.... Generation 0 – seed file Satisfiability Modulo Theories: A Calculus of Computation

94 Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser 00000000h: 52 49 46 46 00 00 00 00 00 00 00 00 00 00 00 00 ; RIFF............ 00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000030h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000060h: 00 00 00 00 ;.... Generation 1 Satisfiability Modulo Theories: A Calculus of Computation

95 ` Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser Satisfiability Modulo Theories: A Calculus of Computation 00000000h: 52 49 46 46 00 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF....***.... 00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000030h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000060h: 00 00 00 00 ;.... Generation 2

96 Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser 00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...***.... 00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000030h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000060h: 00 00 00 00 ;.... Generation 3 Satisfiability Modulo Theories: A Calculus of Computation

97 Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser 00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...***.... 00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 00 00 00 00 ;....strh........ 00000040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000060h: 00 00 00 00 ;.... Generation 4 Satisfiability Modulo Theories: A Calculus of Computation

98 Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser 00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...***.... 00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 76 69 64 73 ;....strh....vids 00000040h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000060h: 00 00 00 00 ;.... Generation 5 Satisfiability Modulo Theories: A Calculus of Computation

99 Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser 00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...***.... 00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 76 69 64 73 ;....strh....vids 00000040h: 00 00 00 00 73 74 72 66 00 00 00 00 00 00 00 00 ;....strf........ 00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000060h: 00 00 00 00 ;.... Generation 6 Satisfiability Modulo Theories: A Calculus of Computation

100 Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser 00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...***.... 00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 76 69 64 73 ;....strh....vids 00000040h: 00 00 00 00 73 74 72 66 00 00 00 00 28 00 00 00 ;....strf....(... 00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000060h: 00 00 00 00 ;.... Generation 7 Satisfiability Modulo Theories: A Calculus of Computation

101 Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser 00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...***.... 00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 76 69 64 73 ;....strh....vids 00000040h: 00 00 00 00 73 74 72 66 00 00 00 00 28 00 00 00 ;....strf....(... 00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 C9 9D E4 4E ;............ɝäN 00000060h: 00 00 00 00 ;.... Generation 8 Satisfiability Modulo Theories: A Calculus of Computation

102 Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser 00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...***.... 00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 76 69 64 73 ;....strh....vids 00000040h: 00 00 00 00 73 74 72 66 00 00 00 00 28 00 00 00 ;....strf....(... 00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 ;................ 00000060h: 00 00 00 00 ;.... Generation 9 Satisfiability Modulo Theories: A Calculus of Computation

103 Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser 00000000h: 52 49 46 46 3D 00 00 00 ** ** ** 20 00 00 00 00 ; RIFF=...***.... 00000010h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000020h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................ 00000030h: 00 00 00 00 73 74 72 68 00 00 00 00 76 69 64 73 ;....strh....vids 00000040h: 00 00 00 00 73 74 72 66 B2 75 76 3A 28 00 00 00 ;....strf²uv:(... 00000050h: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 ;................ 00000060h: 00 00 00 00 ;.... Generation 10 – CRASH Satisfiability Modulo Theories: A Calculus of Computation

104 SAGE is very effective at finding bugs. Works on large applications. Fully automated Easy to deploy (x86 analysis – any language) Used in various groups inside Microsoft Powered by Z3. Satisfiability Modulo Theories: A Calculus of Computation

105 Formulas are usually big conjunctions. SAGE uses only the bitvector and array theories. Pre-processing step has a huge performance impact. Eliminate variables. Simplify formulas. Early unsat detection. Satisfiability Modulo Theories: A Calculus of Computation

106 Annotated Program Verification Condition F pre/post conditions invariants and other annotations pre/post conditions invariants and other annotations

107 class C { private int a, z; invariant z > 0 public void M() requires a != 0 { z = 100/a; }

108 Source Language C# + goodies = Spec# Specifications method contracts, invariants, field and type annotations. Program Logic: Dijkstra’s weakest preconditions. Automatic Verification type checking, verification condition generation (VCG), SMT Spec# (annotated C#) Boogie PL Spec# Compiler VC Generator Formulas SMT Solver

109 … terminates diverges goes wrong

110 State Cartesian product of variables Execution trace Nonempty finite sequence of states Infinite sequence of states Nonempty finite sequence of states followed by special error state … (x: int, y: int, z: bool)

111 x := E x := x + 1 x := 10 havoc x S ; T assert P assume P S  T

112 Hoare triple{ P } S { Q }says that every terminating execution trace of S that starts in a state satisfying P does not go wrong, and terminates in a state satisfying Q

113 Hoare triple{ P } S { Q }says that every terminating execution trace of S that starts in a state satisfying P does not go wrong, and terminates in a state satisfying Q Given S and Q, what is the weakest P’ satisfying {P’} S {Q} ? P' is called the weakest precondition of S with respect to Q, written wp(S, Q) to check {P} S {Q}, check P  P’

114 wp( x := E, Q ) = wp( havoc x, Q ) = wp( assert P, Q ) = wp( assume P, Q ) = wp( S ; T, Q ) = wp( S  T, Q ) = Q[ E / x ] (  x  Q ) P  Q P  Q wp( S, wp( T, Q )) wp( S, Q )  wp( T, Q )

115 if E then S else T end = assume E; S  assume ¬E; T

116 while E invariant J do S end =assert J; havoc x; assume J; (assume E; S; assert J; assume false  assume ¬ E ) where x denotes the assignment targets of S “fast forward” to an arbitrary iteration of the loop check that the loop invariant holds initially check that the loop invariant is maintained by the loop body

117 BIG and-or tree (ground) BIG and-or tree (ground)  Axioms ( (non-ground)  Axioms ( (non-ground) Control & Data Flow

118 Meta OS: small layer of software between hardware and OS Mini: 60K lines of non-trivial concurrent systems C code Critical: must provide functional resource abstraction Trusted: a verification grand challenge Hardware Hypervisor

119 VCs have several Mb Thousands of non ground clauses Developers are willing to wait at most 5 min per VC Satisfiability Modulo Theories: A Calculus of Computation

120 Partial solutions Automatic generation of: Loop Invariants Houdini-style automatic annotation generation Satisfiability Modulo Theories: A Calculus of Computation

121 Quantifiers, quantifiers, quantifiers, … Modeling the runtime  h,o,f: IsHeap(h)  o ≠ null  read(h, o, alloc) = t  read(h,o, f) = null  read(h, read(h,o,f),alloc) = t Satisfiability Modulo Theories: A Calculus of Computation

122 Quantifiers, quantifiers, quantifiers, … Modeling the runtime Frame axioms  o, f: o ≠ null  read(h 0, o, alloc) = t  read(h 1,o,f) = read(h 0,o,f)  (o,f)  M Satisfiability Modulo Theories: A Calculus of Computation

123 Quantifiers, quantifiers, quantifiers, … Modeling the runtime Frame axioms User provided assertions  i,j: i  j  read(a,i)  read(b,j) Satisfiability Modulo Theories: A Calculus of Computation

124 Quantifiers, quantifiers, quantifiers, … Modeling the runtime Frame axioms User provided assertions Theories  x: p(x,x)  x,y,z: p(x,y), p(y,z)  p(x,z)  x,y: p(x,y), p(y,x)  x = y Satisfiability Modulo Theories: A Calculus of Computation

125 Quantifiers, quantifiers, quantifiers, … Modeling the runtime Frame axioms User provided assertions Theories Solver must be fast in satisfiable instances. We want to find bugs! Satisfiability Modulo Theories: A Calculus of Computation

126 There is no sound and refutationally complete procedure for linear integer arithmetic + free function symbols Satisfiability Modulo Theories: A Calculus of Computation

127 Heuristic quantifier instantiationCombining SMT with Saturation proversComplete quantifier instantiationDecidable fragmentsModel based quantifier instantiation Satisfiability Modulo Theories: A Calculus of Computation

128 Is the axiomatization of the runtime consistent? False implies everything Partial solution: SMT + Saturation Provers Found many bugs using this approach Satisfiability Modulo Theories: A Calculus of Computation

129 Standard complain “I made a small modification in my Spec, and Z3 is timingout” This also happens with SAT solvers (NP-complete) In our case, the problems are undecidable Partial solution: parallelization Satisfiability Modulo Theories: A Calculus of Computation

130 Joint work with Y. Hamadi (MSRC) and C. Wintersteiger Multi-core & Multi-node (HPC) Different strategies in parallel Collaborate exchanging lemmas Strategy 1 Strategy 2 Strategy 3 Strategy 4 Strategy 5 Satisfiability Modulo Theories: A Calculus of Computation

131 Logic as a platform Most verification/analysis tools need symbolic reasoning SMT is a hot area Many applications & challenges http://research.microsoft.com/projects/z3 Thank You! Satisfiability Modulo Theories: A Calculus of Computation

132 SMT solvers use heuristic quantifier instantiation. E-matching (matching modulo equalities). Example:  x: f(g(x)) = x { f(g(x)) } a = g(b), b = c, f(a)  c Trigger Satisfiability Modulo Theories: A Calculus of Computation

133 SMT solvers use heuristic quantifier instantiation. E-matching (matching modulo equalities). Example:  x: f(g(x)) = x { f(g(x)) } a = g(b), b = c, f(a)  c x=b f(g(b)) = b Equalities and ground terms come from the partial model M Satisfiability Modulo Theories: A Calculus of Computation

134 Integrates smoothly with DPLL. Efficient for most VCs Decides useful theories: Arrays Partial orders … Satisfiability Modulo Theories: A Calculus of Computation

135 E-matching is NP-Hard. In practice Satisfiability Modulo Theories: A Calculus of Computation

136 Trigger: f(x1, g(x1, a), h(x2), b) Trigger: f(x1, g(x1, a), h(x2), b) Instructions: 1.init(f, 2) 2.check(r4, b, 3) 3.bind(r2, g, r5, 4) 4.compare(r1, r5, 5) 5.check(r6, a, 6) 6.bind(r3, h, r7, 7) 7.yield(r1, r7) Instructions: 1.init(f, 2) 2.check(r4, b, 3) 3.bind(r2, g, r5, 4) 4.compare(r1, r5, 5) 5.check(r6, a, 6) 6.bind(r3, h, r7, 7) 7.yield(r1, r7) CompilerCompiler Similar triggers share several instructions. Combine code sequences in a code tree Satisfiability Modulo Theories: A Calculus of Computation

137 Tight integration: DPLL + Saturation solver. BIG and-or tree (ground) BIG and-or tree (ground) Axioms ( (non-ground) Axioms ( (non-ground) Saturation Solver SMT Satisfiability Modulo Theories: A Calculus of Computation

138 Inference rule: DPLL(  ) is parametric. Examples: Resolution Superposition calculus … Satisfiability Modulo Theories: A Calculus of Computation

139 DPLL + Theories DPLL + Theories Saturation Solver Saturation Solver Ground literals Ground clauses Satisfiability Modulo Theories: A Calculus of Computation


Download ppt "Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic."

Similar presentations


Ads by Google