1
Symbolic execution http://pan.cin.ufpe.br © Marcelo d’Amorim 2010
2
Goal and Input-Output
Automate test input data generation
– Input: parameterized function call
– Output: inputs such that all* paths are explored
Program under test:
    foo(int x, int y){ if(x > y){... } else{... } }
Diagram: foo($a, $b); → Symbolic Execution → foo(1,0); foo(0,0)
3
Attention!
Function foo can be arbitrarily complex
– It may use other types, call other functions, contain loops and branches, etc.
One can obtain tests with user-defined assertions
6
Opening the box…
Diagram: foo($a, $b); → Constraint generation → path conditions → Constraint solving → foo(1,0); foo(0,0)
A path condition describes a program path as a function of the symbolic inputs.
Symbolic execution explores all program paths.
7
Opening the box…
    foo(int x, int y){ if(x > y){... } else{... } }
Diagram: foo($a, $b); → Constraint generation → path conditions ($a > $b; $a <= $b) → Constraint solving → foo(1,0); foo(0,0)
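The pipeline on this slide can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class FooPathConditions and its methods are hypothetical names, the two path conditions are written by hand rather than generated, and a bounded brute-force search stands in for a real constraint solver.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiPredicate;

// Illustrative sketch: hypothetical names, hand-written path conditions,
// and a brute-force search in place of a real constraint solver.
final class FooPathConditions {

    // One path condition per path of foo, as a predicate over ($a, $b).
    static Map<String, BiPredicate<Integer, Integer>> pathConditions() {
        Map<String, BiPredicate<Integer, Integer>> pcs = new LinkedHashMap<>();
        pcs.put("$a > $b", (a, b) -> a > b);    // then-branch
        pcs.put("$a <= $b", (a, b) -> a <= b);  // else-branch
        return pcs;
    }

    // Stand-in solver: first assignment in [0, bound] x [0, bound] that
    // satisfies the path condition, or null if none is found in that range.
    static int[] solve(BiPredicate<Integer, Integer> pc, int bound) {
        for (int a = 0; a <= bound; a++) {
            for (int b = 0; b <= bound; b++) {
                if (pc.test(a, b)) {
                    return new int[] {a, b};
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        pathConditions().forEach((text, pc) -> {
            int[] in = solve(pc, 10);
            System.out.println(text + "  ->  foo(" + in[0] + ", " + in[1] + ")");
        });
    }
}
```

With this search order the sketch yields foo(1, 0) for $a > $b and foo(0, 0) for $a <= $b, the same two tests shown on the slide.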
8
Exercise
Generate the path conditions for this program.
    void bar1(int x){
      if (x > 0) { … }
      else if (x < 0) { … }
      else { ERROR; }
    }
10
Exercise
Generate the path conditions for this program.
    void bar2(int x){
      if (x > 0) {
        if (x > 10) {…}
      } else if (x < 0) {
        if (x < 2) {…}
      } else { ERROR; }
    }
Infeasible path! Once x < 0 holds, x < 2 is always true, so the path that skips the inner block (x < 0 and x >= 2) can never execute.
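One way to check an answer to this exercise is to write each candidate path condition as a predicate and look for a witness input. The sketch below assumes hypothetical names (Bar2PathConditions, witness) and uses a bounded search instead of a decision procedure. A missing witness in the searched range is therefore only evidence of infeasibility, not a proof.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntPredicate;

// Illustrative sketch: hand-written path conditions for bar2 and a bounded
// search standing in for a constraint solver. All names are hypothetical.
final class Bar2PathConditions {

    static Map<String, IntPredicate> pathConditions() {
        Map<String, IntPredicate> pcs = new LinkedHashMap<>();
        pcs.put("x > 0 and x > 10", x -> x > 0 && x > 10);
        pcs.put("x > 0 and x <= 10", x -> x > 0 && x <= 10);
        pcs.put("x <= 0 and x < 0 and x < 2", x -> x <= 0 && x < 0 && x < 2);
        pcs.put("x <= 0 and x < 0 and x >= 2", x -> x <= 0 && x < 0 && x >= 2);
        pcs.put("x <= 0 and x >= 0 (ERROR)", x -> x <= 0 && x >= 0);
        return pcs;
    }

    // Returns some x in [-bound, bound] satisfying pc, or null if none exists there.
    static Integer witness(IntPredicate pc, int bound) {
        for (int x = -bound; x <= bound; x++) {
            if (pc.test(x)) {
                return x;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        pathConditions().forEach((text, pc) -> {
            Integer x = witness(pc, 100);
            System.out.println(text + "  ->  " + (x == null ? "no witness found" : "x = " + x));
        });
    }
}
```

Only the fourth condition has no witness, which matches the path marked infeasible on the slide.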
12
Exercise
Generate the path conditions for this program. Hint: ignore paths with length > 2.
    int fact(int n){
      return (n > 0) ? n * fact(n - 1) : 1;
    }
Repeated states.
13
Part 1: constraint generator
Modifies program semantics to handle symbolic state
– Stack, heap, and static area hold symbolic values
Two popular alternatives
– Instrumentation
– Modified interpreter (e.g., Java Virtual Machine)
14
Instrumentation
Original:
    foo(int x) {
      x = x + 1;
      if (x > 10) {
        // …
      } else {
        // …
      }
    }
Instrumented:
    foo(SymInt x) {
      x = x.add(ONE);
      if (x.gt(TEN).choose()) {
        // …
      } else {
        // …
      }
    }
Callouts: types and operations (SymInt, add, gt); choice (choose).
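The slide does not show what SymInt, gt, and choose do internally, so the following is only one possible sketch. It assumes symbolic values carry their expression as text, that choose() records the decision in a path condition, and that both sides of each choice are explored by re-running the program with a replayed decision script. SymExplorer, SymBool, and explore are hypothetical names, and no solver is consulted, so infeasible branches would be explored too.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical names, not the API of any real tool.
final class SymExplorer {
    // Branch decisions to replay on the current run (true = then-branch).
    static List<Boolean> script = new ArrayList<>();
    // Path condition collected on the current run, one conjunct per decision.
    static List<String> pathCondition = new ArrayList<>();
    static int depth;

    static final class SymBool {
        final String text;
        SymBool(String text) { this.text = text; }

        // Choice point: replay a scripted decision, or try the then-branch first.
        boolean choose() {
            if (depth == script.size()) {
                script.add(true);
            }
            boolean taken = script.get(depth++);
            pathCondition.add(taken ? text : "!(" + text + ")");
            return taken;
        }
    }

    static final class SymInt {
        final String text;
        SymInt(String text) { this.text = text; }
        SymInt add(SymInt other) { return new SymInt("(" + text + " + " + other.text + ")"); }
        SymBool gt(SymInt other) { return new SymBool(text + " > " + other.text); }
    }

    static final SymInt ONE = new SymInt("1");
    static final SymInt TEN = new SymInt("10");

    // Instrumented version of foo from the slide.
    static void foo(SymInt x) {
        x = x.add(ONE);
        if (x.gt(TEN).choose()) {
            // ...
        } else {
            // ...
        }
    }

    // Depth-first exploration by re-execution: after each run, drop trailing
    // false decisions and flip the last decision that is still true.
    static List<String> explore() {
        List<String> conditions = new ArrayList<>();
        script = new ArrayList<>();
        while (true) {
            depth = 0;
            pathCondition = new ArrayList<>();
            foo(new SymInt("$x"));
            conditions.add(String.join(" and ", pathCondition));
            while (!script.isEmpty() && !script.get(script.size() - 1)) {
                script.remove(script.size() - 1);
            }
            if (script.isEmpty()) {
                return conditions;
            }
            script.set(script.size() - 1, false);
        }
    }

    public static void main(String[] args) {
        explore().forEach(System.out::println);
    }
}
```

Running main prints the two path conditions of foo, ($x + 1) > 10 and !(($x + 1) > 10). Re-execution keeps the sketch short. A tool built on a modified interpreter could instead save and restore program state at each choice point.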
15
Discussion
What would you need to modify in a JVM to run programs in symbolic execution mode?
What are the pros and cons of an instrumentation-based solution vs. a modified JVM?
16
Part 2: constraint solver
Decision procedures can be used to solve simple constraints. For example:
– Integer linear arithmetic: x > y + z and z < y
Unfortunately, symbolic execution can generate complex constraints
– Undecidable, intractable, or just not handled by decision procedures
17
Pointers for the interested
JVM symbolic execution: AQUA and SPF
Complex constraints: CORAL or FloPSy
Links:
– AQUA and CORAL: http://pan.cin.ufpe.br
– SPF: google JPF and symb project
– FloPSy: http://research.microsoft.com/en-us/people/nikolait/
19
Objects: Lazy initialization
A symbolic object is an “unknown blob”.
– Execution refines the blob on demand
Assignment example: o.f = exp
– Variable o holds the symbolic object ? (the blob)
– 3 possible outcomes depending on ?:
  ? is null
  ? is a not-yet-seen object
  ? is an already-seen object
Concretize the heap while making choices
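The three outcomes can be sketched as a function that enumerates the choices for one uninitialized symbolic reference. The names below (LazyInit, SymObject, Choice, resolve) are hypothetical. A real engine would also fork execution on each choice and filter aliases by type, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: hypothetical names; only enumerates the choices.
final class LazyInit {
    static final class SymObject {
        final String name;
        SymObject(String name) { this.name = name; }
    }

    // One way to resolve an uninitialized symbolic reference.
    static final class Choice {
        final String kind;       // "null", "fresh" or "alias"
        final SymObject target;  // null for the "null" outcome
        Choice(String kind, SymObject target) {
            this.kind = kind;
            this.target = target;
        }
        @Override
        public String toString() {
            return target == null ? kind : kind + " " + target.name;
        }
    }

    // Choices for reference refName, given the symbolic objects seen so far.
    static List<Choice> resolve(String refName, List<SymObject> seen) {
        List<Choice> choices = new ArrayList<>();
        choices.add(new Choice("null", null));                     // reference is null
        choices.add(new Choice("fresh", new SymObject(refName)));  // not-yet-seen object
        for (SymObject s : seen) {
            choices.add(new Choice("alias", s));                   // already-seen object
        }
        return choices;
    }

    public static void main(String[] args) {
        List<SymObject> seen = new ArrayList<>();
        seen.add(new SymObject("$a"));
        System.out.println(resolve("$b", seen));
    }
}
```

With one previously seen object the call returns three choices. The number grows with the set of seen objects, which is why the heap is concretized only on demand.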
20
Example
    Node root;
    add(Node n) {
      if (root == null) {
        root = n;
      } else {
        int v = root.val;
        if (v < n.val) {…}
        …
      }
    }
Driver:
    BST bst = new BST();
    bst.add($a);
    bst.add($b);
Notation: Primitive fields inside the box. Reference fields outside (omission indicates null). Dashed borders indicate symbolic objects.
Diagram: bst
21
Example (same code and driver as on the previous slide)
Diagram: bst, with root pointing to the symbolic node $a
22
Example (same code and driver as on the previous slides)
Path conditions shown on the slide:
– $a == null
– $a != null and $a.val = $x and $b.val = $y and $y < $x
– $a != null and $a.val = $x and $b.val = $y and $x = $y
– $a != null and $a.val = $x and $b.val = $y and $y > $x
Diagram labels: bst, root, left, right, $a, $b, $x, $y. One outcome is marked “NPE!”.
24
Strings
Two approaches
– A string is an array of symbolic characters
– Symbolic string + special interpretation of library methods
First approach can be too expensive. Why?
    foo(String s) { … if (s.equals("hello")) {…} … }
25
Automata for string constraints
The second approach builds finite automata for the string constraints that library calls generate
Constraint solving = automata walk!
26
Exercise
Generate automata to characterize these constraints:
    $s.startsWith("hello") and $s.indexOf("class") != -1 and $s.endsWith(".")
27
Concolic execution (a.k.a. dynamic symbolic execution, whitebox fuzzing)
Several problems with standard symbolic execution. In particular:
– Exploration of infeasible paths
– Symbolic arrays
– Handling of loops and recursion
– Native method calls
28
Concolic Execution: How it works
1. Execute the program with concrete and symbolic inputs
2. Save decisions as before, but execute a single path!
3. Solve pending decisions and go back to step 1
Can go from symbolic to concrete domain anytime during execution!
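The three steps can be sketched as a worklist loop over the bar2 exercise from earlier in the deck. Everything here is an assumption made for illustration: the names (ConcolicSketch, branch, solve, generate) are hypothetical, branch conditions are recorded as Java predicates rather than symbolic expressions, and a bounded search over [-50, 50] stands in for the solver.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.IntPredicate;

// Illustrative sketch: hypothetical names and a brute-force stand-in solver.
final class ConcolicSketch {
    // A recorded branch decision: the condition and the side taken.
    static final class Decision {
        final IntPredicate cond;
        final boolean taken;
        Decision(IntPredicate cond, boolean taken) {
            this.cond = cond;
            this.taken = taken;
        }
    }

    static List<Decision> trace = new ArrayList<>();

    // Evaluates the branch on the concrete input and records the decision.
    static boolean branch(IntPredicate cond, int x) {
        boolean taken = cond.test(x);
        trace.add(new Decision(cond, taken));
        return taken;
    }

    // Program under test: bar2 with its branches routed through branch().
    static void bar2(int x) {
        if (branch(v -> v > 0, x)) {
            if (branch(v -> v > 10, x)) { /* ... */ }
        } else if (branch(v -> v < 0, x)) {
            if (branch(v -> v < 2, x)) { /* ... */ }
        } else {
            /* ERROR */
        }
    }

    // True if x follows the first `flip` decisions and takes the other side at `flip`.
    static boolean satisfies(List<Decision> path, int flip, int x) {
        for (int i = 0; i < flip; i++) {
            Decision d = path.get(i);
            if (d.cond.test(x) != d.taken) {
                return false;
            }
        }
        Decision last = path.get(flip);
        return last.cond.test(x) != last.taken;
    }

    // Stand-in solver: tries 0, 1, -1, 2, -2, ... up to the bound; null if no solution.
    static Integer solve(List<Decision> path, int flip, int bound) {
        for (int k = 0; k <= bound; k++) {
            for (int x : new int[] {k, -k}) {
                if (satisfies(path, flip, x)) {
                    return x;
                }
            }
        }
        return null;
    }

    static Set<Integer> generate(int seed) {
        Set<Integer> inputs = new LinkedHashSet<>();
        Deque<Integer> pending = new ArrayDeque<>();
        inputs.add(seed);
        pending.add(seed);
        while (!pending.isEmpty()) {
            int x = pending.remove();
            trace = new ArrayList<>();
            bar2(x);                               // steps 1 and 2: one concrete path
            List<Decision> path = trace;
            for (int i = 0; i < path.size(); i++) { // step 3: negate each decision
                Integer next = solve(path, i, 50);
                if (next != null && inputs.add(next)) {
                    pending.add(next);
                }
            }
        }
        return inputs;
    }

    public static void main(String[] args) {
        System.out.println(generate(0));
    }
}
```

Starting from the seed 0, the loop produces the inputs [0, 1, -1, 11], one per feasible path of bar2. The infeasible path is never executed because the stand-in solver finds no input for it.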
29
Summary
Important technique to automate testing
Found real errors in file systems, OS, network protocols, and several data structures
See www.coverity.com for industrial applications
30
What I believe is still missing
Automation of driver and oracle generation
Exploitation of natural parallelism
Diagram: SYMB.EXE sends queries to solvers (YICES, …) and receives solutions