
1
Parallel Symbolic Execution for Structural Test Generation
Matt Staats, Corina Pasareanu
ISSTA 2010

2
Authors
Matt Staats, Dept. of Computer Science and Engineering, University of Minnesota, staats@cs.umn.edu
Corina Pasareanu, Carnegie Mellon University / NASA Ames Research Center, Moffett Field, CA 94035, Corina.S.Pasareanu@nasa.gov

3
Problem of Symbolic Execution
Symbolic execution suffers from scalability issues: the number of symbolic paths that must be explored is very large (or even infinite) for most realistic programs. Exploring all possible program executions is therefore generally infeasible, which limits the application of symbolic execution in practice.

4
Problem of Symbolic Execution
Solution: improve the scalability of symbolic execution via parallelization, motivated by the increased availability of multi-core computers and the inherently parallelizable nature of symbolic execution. Just as exploring a binary tree can be parallelized with little to no inter-process communication, so can exploring a symbolic execution tree.

5
Solution Outline
1. Symbolic Pathfinder
2. Simple Static Partitioning
3. Distributing and Using Constraints

6
1. Symbolic Pathfinder (SPF) SPF implements symbolic execution. To prevent SPF from attempting to explore a potentially infinite search space, an upper limit is placed on the search depth or on the number of constraints in the path condition.
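The effect of a depth bound can be illustrated with a minimal sketch (not SPF's actual implementation): exploration of a binary decision tree is cut off once the path condition reaches the depth limit, so the number of collected paths stays finite.

```python
# Hypothetical sketch of depth-bounded path exploration: walk a binary
# decision tree of branch outcomes and stop each path once its path
# condition reaches the depth limit. Branch names are illustrative.
def explore(path_condition, branches, depth_limit):
    """Collect all path conditions of length `depth_limit`."""
    if len(path_condition) == depth_limit:
        return [path_condition]
    # Each branch forks the search into a true side and a false side.
    cond = branches[len(path_condition)]
    return (explore(path_condition + [cond], branches, depth_limit) +
            explore(path_condition + ["!" + cond], branches, depth_limit))

paths = explore([], ["b", "x>y", "b2"], depth_limit=3)
print(len(paths))  # 8 bounded paths (2^3)
```

Without the bound, the recursion would track the (possibly infinite) symbolic execution tree; the bound trades completeness for termination.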

7
Symbolic Execution Example (Depth of 3)

8
2. Simple Static Partitioning (SSP)
Used to partition the traversal of a symbolic execution tree using a set of constraints over the input variables of the program under analysis. These constraints are distributed to parallel workers, which use them as preconditions for the symbolic execution performed with SPF.

9
2. Simple Static Partitioning (SSP)
To effectively partition and distribute execution, we require the set of constraints to be disjoint and complete.
Disjoint: for any two constraints A and B in the set, A ^ B is false; each worker therefore explores a different part of the symbolic execution tree.
Complete: the disjunction of all the constraints in the set simplifies to true; every possible path is therefore explored by at least one instance of symbolic execution.
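For intuition, both properties can be checked by brute force over a small finite input space. This is a sketch with illustrative lambda constraints over two booleans, not the solver-based check an actual tool would use:

```python
from itertools import combinations, product

# Hypothetical partition over inputs (b, b2); in SSP these would be
# symbolic constraints checked with a decision procedure, not enumeration.
constraints = [
    lambda b, b2: b,
    lambda b, b2: (not b) and b2,
    lambda b, b2: (not b) and (not b2),
]
inputs = list(product([False, True], repeat=2))

# Disjoint: no input satisfies two different constraints at once.
disjoint = all(not (p(*i) and q(*i))
               for p, q in combinations(constraints, 2) for i in inputs)
# Complete: every input satisfies at least one constraint.
complete = all(any(c(*i) for c in constraints) for i in inputs)
print(disjoint, complete)  # True True
```

Disjointness prevents redundant work across workers; completeness guarantees no path is lost.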

10
Example with a depth of 3 and a suggested queue size of 4
1. Perform Symbolic PathFinder execution, collecting all Path Conditions (PCs) at depth 3. Collected PCs = {b ^ x > y, b ^ x ≤ y, ¬b ^ b2, ¬b ^ ¬b2}

11
Example with a depth of 3 and a suggested queue size of 4 2. Split each PC into individual constraints and count the frequency of each constraint. Place every constraint in a set IndCons along with its frequency. IndCons = {(b, 2), (¬b, 2), (x > y, 1), (x ≤ y, 1), (b2, 1), (¬b2, 1) }

12
Example with a depth of 3 and a suggested queue size of 4 3. Count the frequency of each symbolic variable in IndCons. Place every variable in a set AvailableVars along with its frequency. AvailableVars= {(b, 4), (x, 2), (y, 2), (b2, 2) }
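Steps 2 and 3 amount to frequency counting. A minimal sketch, with constraints represented as strings and variable extraction hard-coded for this example's operators:

```python
from collections import Counter

# Path conditions from step 1, each a list of individual constraints.
pcs = [["b", "x>y"], ["b", "x<=y"], ["!b", "b2"], ["!b", "!b2"]]

# Step 2: count how often each individual constraint appears (IndCons).
ind_cons = Counter(c for pc in pcs for c in pc)

# Step 3: count how often each symbolic variable appears in IndCons
# (AvailableVars). Variable extraction is simplified: strip negation,
# split on the relational operators used in this example.
def variables(constraint):
    for op in (">", "<="):
        if op in constraint:
            return constraint.split(op)
    return [constraint.lstrip("!")]

available_vars = Counter(v for c in ind_cons.elements() for v in variables(c))
print(ind_cons["b"], available_vars["b"])  # 2 4
```

This reproduces the slide's counts: constraint b appears twice, and variable b appears four times (twice in b, twice in ¬b).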

13
Example with a depth of 3 and a suggested queue size of 4
4. Using IndCons, label symbolic variables as "expensive" or "cheap", where "cheap" variables are used only in constraints that are "cheap". Constraints are "expensive" if they use multiplication and/or division, and "cheap" otherwise. IndCons = {(b, 2), (¬b, 2), (x > y, 1), (x ≤ y, 1), (b2, 1), (¬b2, 1)} AvailableVars = {(b, 4), (x, 2), (y, 2), (b2, 2)} (here every constraint is cheap, so all variables are cheap)
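Step 4 can be sketched directly from those definitions (string-based constraint representation and the `vars_of` helper are illustrative, not from the paper):

```python
# A constraint is "expensive" if it uses multiplication or division;
# a variable is "cheap" only if every constraint referencing it is cheap.
def is_expensive(constraint):
    return "*" in constraint or "/" in constraint

def cheap_variables(ind_cons, variables_of):
    cheap = set()
    for var in {v for c in ind_cons for v in variables_of(c)}:
        uses = [c for c in ind_cons if var in variables_of(c)]
        if all(not is_expensive(c) for c in uses):
            cheap.add(var)
    return cheap

ind_cons = ["b", "!b", "x>y", "x<=y", "b2", "!b2"]
vars_of = lambda c: c.lstrip("!").replace("<=", ">").split(">")
print(sorted(cheap_variables(ind_cons, vars_of)))  # ['b', 'b2', 'x', 'y']
```

No constraint in this example uses * or /, so every variable comes out cheap, matching the slide's "all cheap" annotation.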

14
Example with a depth of 3 and a suggested queue size of 4
5. Create an empty set GeneratedConstraints, which will contain sets of constraints generated by SSP. The product of the sizes of these sets is the number of conjunctions we can generate (termed numGeneratable). GeneratedConstraints = { }

15
Example with a depth of 3 and a suggested queue size of 4
6. While numGeneratable < SQS:
(a) Choose the cheap variable V with the highest frequency from AvailableVars. If no cheap variable is in AvailableVars, choose the expensive variable V with the highest frequency.
(b) If V (1) is only referenced in equality constraints, (2) is a symbolic integer, and (3) has a range no more than twice the number of equalities using V in IndCons, then add {all constraints referencing V} to GeneratedConstraints. Otherwise, let C be the most common constraint referencing V in IndCons and add {C, ¬C} to GeneratedConstraints.
(c) Remove every variable referenced in GeneratedConstraints from AvailableVars.
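The selection loop in step 6 can be sketched as follows. This is a simplification: the equality/range special case in 6b is omitted, so every chosen variable contributes a {C, ¬C} pair, and negation is purely syntactic; all names are illustrative.

```python
def negate(c):
    return c[1:] if c.startswith("!") else "!" + c

def generate(ind_cons, available_vars, cheap, vars_of, sqs):
    """ind_cons: constraint -> frequency; available_vars: var -> frequency."""
    generated, num_generatable = [], 1
    available = dict(available_vars)
    while num_generatable < sqs and available:
        # 6a: prefer the most frequent cheap variable.
        pool = [v for v in available if v in cheap] or list(available)
        v = max(pool, key=lambda x: available[x])
        # 6b (simplified): take the most common constraint referencing V.
        c = max((c for c in ind_cons if v in vars_of(c)),
                key=lambda c: ind_cons[c])
        generated.append({c, negate(c)})
        num_generatable *= 2
        # 6c: drop every variable referenced by the new constraint set.
        for var in vars_of(c):
            available.pop(var, None)
    return generated

ind_cons = {"b": 2, "!b": 2, "x>y": 1, "x<=y": 1, "b2": 1, "!b2": 1}
available = {"b": 4, "x": 2, "y": 2, "b2": 2}
vars_of = lambda c: c.lstrip("!").replace("<=", ">").split(">")
sets = generate(ind_cons, available, {"b", "x", "y", "b2"}, vars_of, sqs=4)
print(len(sets), sorted(sets[0]))  # 2 ['!b', 'b']
```

On the running example this picks b first (frequency 4), then x, yielding two constraint sets and numGeneratable = 4, at which point the loop stops, matching the two iterations shown on the next slides.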

16
Example with a depth of 3 and a suggested queue size of 4 IndCons = {(b, 2), (¬b, 2), (x > y, 1), (x ≤ y, 1), (b2, 1), (¬b2, 1) } AvailableVars= {(b, 4), (x, 2), (y, 2), (b2, 2) } GeneratedConstraints (1st iteration) = {{b, ¬b}} AvailableVars (1st iteration) = { (x, 2), (y, 2), (b2, 2)}

17
Example with a depth of 3 and a suggested queue size of 4 IndCons = {(b, 2), (¬b, 2), (x > y, 1), (x ≤ y, 1), (b2, 1), (¬b2, 1)} AvailableVars = {(b, 4), (x, 2), (y, 2), (b2, 2)} GeneratedConstraints (1st iteration) = {{b, ¬b}} AvailableVars (1st iteration) = {(x, 2), (y, 2), (b2, 2)} GeneratedConstraints (2nd iteration) = {{b, ¬b}, {x > y, x ≤ y}} Step 6c: AvailableVars (2nd iteration) = {(b2, 2)}

18
Example with a depth of 3 and a suggested queue size of 4
7. Generate every combination of constraints using the sets in GeneratedConstraints (i.e., the Cartesian product of GeneratedConstraints). Each combination represents a conjunction of constraints. Order the conjunctions (process described under Constraint Queue) to create ConstraintQueue. ConstraintQueue = {b ^ x > y, ¬b ^ x > y, b ^ x ≤ y, ¬b ^ x ≤ y}
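Step 7 is a plain Cartesian product over the generated constraint sets; a sketch using the example's sets (the slide's final ordering is applied afterwards by the separate Constraint Queue process, so the raw product order below differs from the slide):

```python
from itertools import product

generated = [["b", "!b"], ["x>y", "x<=y"]]

# Each tuple in the product is one conjunction; join its constraints with ^.
queue = [" ^ ".join(combo) for combo in product(*generated)]
print(queue)
# ['b ^ x>y', 'b ^ x<=y', '!b ^ x>y', '!b ^ x<=y']
```

With two sets of size 2 this yields 2 × 2 = 4 conjunctions, exactly the suggested queue size.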

19
Constraining the Path Condition (PC)
Simple Static Partitioning implements an optimization, called selective concretization, when using constraints of the form var == value. The generated constraints are distributed to parallel workers, where each is used as an initial precondition for the worker's symbolic execution.

20
Evaluation Our results demonstrate analysis time speedups up to 30x for 3 of 6 systems using 64 workers, with a maximum speedup of 90x observed using 128 workers. For small numbers of workers (2-8) we demonstrate speedups consistently larger than 90% of the maximum (linear) speedup for 3 of 6 systems, with the other three systems demonstrating speedups of 30% to 90% of the maximum speedup depending on the number of workers. Finally, we demonstrate consistent and significant speedup in automatic test generation over parallel random depth first search for systems requiring on average at least 10 minutes when using random depth first search, with speedups ranging between 5.3x and 70x using 64 workers.

21
Evaluation

26
Future
In the future, we would like to evaluate the partitioning techniques in other contexts, such as fault detection. Given the similarity between searching for states satisfying coverage obligations and searching for states violating assertions/properties, we believe the partitioning techniques will perform well in this context. Furthermore, we plan to investigate other static partitioning techniques (e.g., based on the control flow graph) and study their effectiveness in conjunction with dynamic partitioning, e.g., [7, 15]. We believe such a combination will be most effective for parallelizing symbolic execution.

27
The End
