Intelligent automatic test pattern generation for C-based HW/SW co-design descriptions through combined use of concrete and symbolic simulations Masahiro.

1 Intelligent automatic test pattern generation for C-based HW/SW co-design descriptions through combined use of concrete and symbolic simulations
Masahiro Fujita, Yoshihisa Kojima
University of Tokyo
May 2, 2008

2 Background
- In high-level SoC design, system behavior can be described in C-like programming languages
  - Targets both hardware and software
  - Tool support is not sufficient
- Difficulties compared with RTL or lower-level design descriptions
  - Many wide word-level signals (large exploration space)
  - Complicated control flow (many paths)
  - Difficulty in modeling various descriptions
    - SW: pointers, pointer arithmetic, casting, dynamic allocation, recursive calls…
    - HW: concurrency, synchronization, throughput, latency…
- Our goal is to assist test case generation for system-level descriptions in C-like languages
  - Automatic input pattern generation
  - Assertion-based verification to find bugs
  - Higher code coverage, which results in higher confidence

3 Most important issues in debugging
- Generally speaking, counterexamples generated by simulation/emulation are very "long"
  - Could be billions of cycles
  - Not easy at all to understand why the error occurs
- Need much shorter counterexamples just to understand why the bug happens
  - Are those long sequences really necessary?
- Bounded model checking is based on assertions with "constraints"
  - Bounds cannot be large
  - Can we derive good constraints from the counterexamples found in simulation/emulation?
[Figure: two state-space diagrams, each with an initial state and a bug state; annotations: "Loops can be skipped", "There can be a more direct path"]

4 Target language
- SpecC = ANSI-C + mechanisms for HW
  - Structural hierarchy
  - Parallelism
  - Synchronization
  - Channel
- Languages discussed here
  - C language
  - Some additional features
[Figure: SpecC behavior B with ports p1, p2, child behaviors b1, b2, channel c1, and variable (wire) v1; labels: Behavior, Ports, Interfaces, Channel, Variable (wire), Child behaviors]

5 Outline
- Background
- Problem definitions for input pattern generation
- Preliminaries
  - Branch / path / coverage definitions
- Concrete/symbolic hybrid simulation
  - Concrete simulation, symbolic simulation
  - Hybrid simulation
- Proposed method for branch coverage
- Implementation
- Experimental results
- Conclusion and future work

6 Requirements for input pattern generation (1)
- For assertion failure detection
- Given a design description annotated with
  - Input variable definitions
  - Assumptions on input variables, as predicates
  - Assertion predicates
- Possible results
  - Assertion violation (with input value assignments)
  - Assertion holds for all possible input values
  - Unknown
    int func(int x, int y) {
        int r = 0;
        if (x - y > 0) r = x - y;
        else r = y - x;
        return r;
    }
    int x, y;
    FL_INPUT(x);
    FL_INPUT(y);
    FL_ASSUME(x >= 0);
    FL_ASSUME(y >= 0);
    FL_ASSERT(func(x, y) > 0);
Assertion failure. Counterexamples exist: (x = 0, y = 0), (x = 3, y = 3), ...
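
The three possible results can be illustrated with a bounded brute-force check. This is only a sketch under stated assumptions: `check_bounded` and `verdict_t` are hypothetical names, not part of FLEC, and an exhaustive loop stands in for the solver. It reports a violation with its input assignment, and otherwise distinguishes "holds" from "unknown" depending on whether the bound covers the whole input domain.

```c
#include <stdbool.h>

/* Function under test, copied from the slide. */
int func(int x, int y) {
    int r = 0;
    if (x - y > 0) r = x - y;
    else r = y - x;
    return r;
}

/* Hypothetical verdicts mirroring the three results listed on the slide. */
typedef enum { VIOLATED, HOLDS, UNKNOWN } verdict_t;

/* Check func(x, y) > 0 for all 0 <= x, y < bound (the assumptions x >= 0
   and y >= 0 are built into the loop ranges). domain_is_complete states
   whether the bound covers every legal input; if it does not, a clean
   run proves nothing and the verdict is UNKNOWN. */
verdict_t check_bounded(int bound, bool domain_is_complete, int *cx, int *cy) {
    for (int x = 0; x < bound; x++) {
        for (int y = 0; y < bound; y++) {
            if (!(func(x, y) > 0)) {
                *cx = x;   /* record the counterexample */
                *cy = y;
                return VIOLATED;
            }
        }
    }
    return domain_is_complete ? HOLDS : UNKNOWN;
}
```

With a bound of 4 the first counterexample found is (x = 0, y = 0), matching the slide.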

7 Requirements for input pattern generation (2)
- For branch coverage:
  - Given a design description with annotations and a target branch coverage
  - Generate a set of test cases (input value assignments) to cover branches
  - Show how to activate as many code fragments as possible (over multiple runs)
    int x, y;
    FL_INPUT(x);
    FL_INPUT(y);
    if (x > 2) { }
    if (y > 2) { }
Test cases (1) (x = 0, y = 0) and (2) (x = 3, y = 3) achieve 100% branch coverage

9 Branch / path definitions
- A (pair of) conditional branch(es):
  - Associated with if, do-while, for, switch-case, and while statements
- A branch is covered when the associated condition has been evaluated as true (or false) at least once (over multiple runs)
[Figure: if (cond) with a then edge labeled BC = cond and an else edge labeled BC = !cond]

10 Branch / path definitions
- A path is a sequence of branches taken
- A path condition is defined as the conjunction of all the branch conditions taken
- A false (infeasible) path is a path for which no value assignment satisfies the path condition
    void func(int x, int y) {
        if (x > 2) {
        } else {
        }
        if (y > 2) {
        } else {
        }
    }
There are 4 paths; the path condition of one of them is (x > 2) AND NOT(y > 2)
    void func(int x, int y) {
        if (x > 2) {
        }
        if (x < 2) {
        }
    }
There appear to be 4 paths, but the path that takes both branches has the path condition (x > 2) AND (x < 2): INFEASIBLE!
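
The infeasible path in the second example can be observed empirically. The sketch below uses hypothetical helper names and a bounded enumeration, which gives evidence rather than proof. It encodes each path as two bits and counts how many distinct paths are ever taken.

```c
/* Encode the path taken through the second example as two bits:
   bit 0 is set when (x > 2) is true, bit 1 when (x < 2) is true. */
int path_id(int x) {
    int id = 0;
    if (x > 2) id |= 1;
    if (x < 2) id |= 2;
    return id;
}

/* Count the distinct paths observed for lo <= x <= hi. */
int count_observed_paths(int lo, int hi) {
    int seen[4] = {0, 0, 0, 0};
    int n = 0;
    for (int x = lo; x <= hi; x++) {
        int id = path_id(x);
        if (!seen[id]) {
            seen[id] = 1;
            n++;
        }
    }
    return n;
}
```

Only three of the four syntactic paths appear: path 3, which needs (x > 2) AND (x < 2), is never taken.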

11 Branch / path coverage definitions
- Branch coverage
  - # of branches covered out of # of all branches
- Path coverage
  - # of paths covered out of # of all (or feasible) paths
  - Difficult to use in practice because:
    - The number of feasible paths is not easy to determine
    - The number of possible paths can be huge
      - Exponential w.r.t. # of if-statements * loop iterations
Example (two if-statements, exercised by 2 runs): branch coverage 4 / (2 + 2) (100%), path coverage 2 / (2 * 2) (50%)
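
The gap between the two metrics can be reproduced with a few counters. This is a minimal sketch, assuming two sequential if-statements on (x > 2) and (y > 2) as in the earlier branch-coverage example; `coverage_t`, `run_once`, and `count_hits` are hypothetical names.

```c
/* Coverage bookkeeping for two sequential if-statements. */
typedef struct {
    int branch_hit[4]; /* [0] if1 true, [1] if1 false, [2] if2 true, [3] if2 false */
    int path_hit[4];   /* index = (if1 taken) + 2 * (if2 taken) */
} coverage_t;

/* Simulate one run and record which branches and which path it exercised. */
void run_once(coverage_t *c, int x, int y) {
    int t1 = (x > 2);
    int t2 = (y > 2);
    c->branch_hit[t1 ? 0 : 1] = 1;
    c->branch_hit[t2 ? 2 : 3] = 1;
    c->path_hit[t1 + 2 * t2] = 1;
}

/* Number of non-zero entries in a hit table. */
int count_hits(const int *hits, int n) {
    int k = 0;
    for (int i = 0; i < n; i++) k += hits[i] ? 1 : 0;
    return k;
}
```

After the runs (0, 0) and (3, 3), all 4 branches are hit but only 2 of the 4 paths, which is the 100% versus 50% figure above.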

13 Traditional (concrete) simulation approach
- Create test cases (input values) by hand
  - Not so easy
- Or generate them randomly
  - Automated, but it may be difficult to activate the corner cases
  - In system-level descriptions, the search space can be huge (e.g. 32-bit word-level signals)
- Run simulation
  - Very simple, but how long does it take to hit the failure?
  - Incomplete: cannot prove that the assertion ALWAYS holds unless all possible values have been exercised (not practically possible)
- Confidence (quality of tests): given by coverage metrics
  - E.g. branch coverage
Example:
    Try (x=3, y=100) => r=97 > 0 OK
    Try (x=1, y=20) => r=19 > 0 OK
    ...
    Try (x=10, y=10) => r=0 > 0 NG! (may eventually happen, but only rarely)

14 Formal approach
- Build the formal expressions and mathematically solve the constraints
  - Precise and complete
  - Computationally expensive
- Word-level approach: symbolic simulation
  - Evaluates values as symbolic expressions instead of concrete values

15 Symbolic simulation
- Needs to enumerate all the paths
  - Sometimes a path can be infeasible (false-path problem)
    int func(int x, int y) {
        int r = 0;
        if (x - y > 0) r = x - y;   /* path 1 */
        else r = y - x;             /* path 2 */
        return r;
    }
- Enumerates possible paths (including infeasible ones)
  - Path 1 (path condition: x - y > 0): (r_1 = 0) AND (x - y > 0) AND (r_2 = x - y) AND (x >= 0) AND (y >= 0) -> (r_2 > 0)
    - VALID for all x, y
  - Path 2 (path condition: NOT(x - y > 0)): (r_1 = 0) AND NOT(x - y > 0) AND (r_2 = y - x) AND (x >= 0) AND (y >= 0) -> (r_2 > 0)
    - INVALID; counterexample: (y - x = 0) (some of them may be reported)
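
The per-path verdicts can be cross-checked on a small input range. The sketch below is an assumption-laden stand-in: it enumerates concrete inputs instead of calling a solver, and `check_path` and `path_report_t` are hypothetical names. For each path it counts the inputs that follow the path and how many of those violate r_2 > 0.

```c
/* Result of checking one path: how many inputs follow it, how many violate. */
typedef struct {
    int inputs;
    int violations;
} path_report_t;

/* path = 1 selects the THEN path (x - y > 0); path = 2 selects the ELSE path.
   Inputs are restricted to 0 <= x, y < bound, which also enforces the
   assumptions x >= 0 and y >= 0. */
path_report_t check_path(int path, int bound) {
    path_report_t rep = {0, 0};
    for (int x = 0; x < bound; x++) {
        for (int y = 0; y < bound; y++) {
            int then_taken = (x - y > 0);
            if ((path == 1) != then_taken) continue; /* input is not on this path */
            int r2 = then_taken ? x - y : y - x;
            rep.inputs++;
            if (!(r2 > 0)) rep.violations++;
        }
    }
    return rep;
}
```

For a bound of 10, path 1 has no violations, while path 2 is violated exactly by the 10 inputs with x = y, in line with the counterexample (y - x = 0).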

16 Symbolic simulation (cont'd)
- Employs an SMT (satisfiability modulo theories) solver
  - To solve path conditions
  - To evaluate assertions
- For each path:
  - One symbolic simulation on a path corresponds to concrete simulations of all possible values on that path
- Limitations:
  - # of paths (including false paths)
  - Size of symbolic expressions
  - Solver capability (e.g. non-linear arithmetic)
  - How to model complicated descriptions
  - May not be straightforward to apply to complex / large descriptions

17 Concrete-symbolic hybrid approach
- Combines concrete simulation and symbolic simulation (originally proposed by Larson[5])
  - CUTE[11] was proposed for unit testing
  - Exhaustive traversal of all paths
- Concrete run guides the path for symbolic simulation (initially random simulation)
- Symbolic run on that path derives the path condition
  - Use concrete values for approximation if the constraints cannot be processed (e.g. non-linear)
- Solve the constraints to guide the path to another
  - Negate some path-condition term to take another branch

18 Concolic simulation (1st)
Goal: find the inputs to reach reach_me()
    void test(int x, int y, int z) {
        if (x > 3)                  // B1
            if (y > 11)             // B2
                if (z == y*y)       // B3
                    if (x < 5)      // B4
                        reach_me();
    }
- Concrete states (initially random): x=0, y=0, z=0
  - (0 > 3)? -> no!
- Symbolic states: x=i1, y=i2, z=i3
  - (i1 > 3)?
- Path condition: (i1 <= 3)
  - Negate this condition and solve to take the THEN branch at B1

19 Concolic simulation (2nd)
Goal: find the inputs to reach reach_me()
    void test(int x, int y, int z) {
        if (x > 3)                  // B1
            if (y > 11)             // B2
                if (z == y*y)       // B3
                    if (x < 5)      // B4
                        reach_me();
    }
- Concrete states: x=10, y=0, z=0
  - (10 > 3), (0 > 11)? -> no!
- Symbolic states: x=i1, y=i2, z=i3
  - (x > 3), (y <= 11)
- Path condition: (i1 > 3) (i2 <= 11)
  - Negate the last condition and solve to take the THEN branch at B2

20 Concolic simulation (3rd)
Goal: find the inputs to reach reach_me()
    void test(int x, int y, int z) {
        if (x > 3)                  // B1
            if (y > 11)             // B2
                if (z == y*y)       // B3
                    if (x < 5)      // B4
                        reach_me();
    }
- Concrete states: x=10, y=20, z=0
  - (10 > 3), (20 > 11), (0 == 400)? -> no!
- Symbolic states: x=i1, y=i2, z=i3
  - (x > 3), (y > 11), (z == y*y)
- Path condition: (i1 > 3) (i2 > 11) (i3 != 400)
  - The non-linear term i2*i2 is replaced by its concrete value 400
  - Negate the last condition and solve to take the THEN branch at B3

21 Concolic simulation (4th)
Goal: find the inputs to reach reach_me()
    void test(int x, int y, int z) {
        if (x > 3)                  // B1
            if (y > 11)             // B2
                if (z == y*y)       // B3
                    if (x < 5)      // B4
                        reach_me();
    }
- Concrete states: x=10, y=20, z=400
  - (10 > 3), (20 > 11), (400 == 400), (10 < 5)? -> no!
- Symbolic states: x=i1, y=i2, z=i3
  - (x > 3), (y > 11), (z == 400), (x >= 5)
- Path condition: (i1 > 3) (i2 > 11) (i3 == 400) (i1 >= 5)
  - Negate the last condition and solve to take the THEN branch at B4

22 Concolic simulation (5th)
Goal: find the inputs to reach reach_me()
    void test(int x, int y, int z) {
        if (x > 3)                  // B1
            if (y > 11)             // B2
                if (z == y*y)       // B3
                    if (x < 5)      // B4
                        reach_me();
    }
- Concrete states: x=4, y=20, z=400
  - (4 > 3), (20 > 11), (400 == 400), (4 < 5)
- Symbolic states: x=i1, y=i2, z=i3
  - (x > 3), (y > 11), (z == 400), (x < 5)
- Path condition: (i1 > 3) (i2 > 11) (i3 == 400) (i1 < 5)
- Reached successfully!
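
The five runs can be replayed concretely to confirm that each solved input gets one branch deeper. The sketch below is not a concolic engine: there is no solver, and `depth_reached` is a hypothetical helper that only reports how far a given input gets through the nest B1..B4.

```c
/* Returns how many of the branches B1..B4 were taken in the THEN direction
   before execution fell out of the nest; 4 means reach_me() is reached. */
int depth_reached(int x, int y, int z) {
    if (!(x > 3)) return 0;        /* stopped at B1 */
    if (!(y > 11)) return 1;       /* stopped at B2 */
    if (!(z == y * y)) return 2;   /* stopped at B3 */
    if (!(x < 5)) return 3;        /* stopped at B4 */
    return 4;                      /* reach_me() */
}
```

Feeding it the inputs of runs 1 to 5, namely (0, 0, 0), (10, 0, 0), (10, 20, 0), (10, 20, 400), and (4, 20, 400), yields depths 0, 1, 2, 3, and 4.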

23 Concolic approach
- Can work around non-linear constraints
- Can be used to enumerate the paths
  - Good for path coverage
- Can be used to guide the path
  - But CUTE does not consider which path should be tried next, as its strategy is exhaustive
  - May not terminate if # of paths is huge

25 Proposed method
- Flip a branch condition on a path only when the resulting branch is not covered yet
  - Prioritizes path enumeration
  - Skips the uncovered paths that do not contribute to branch coverage
  - Terminates when the target coverage is achieved
  - Tries to avoid enumerating all the paths
- Not guaranteed to cover all possible branches
  - Derived alternative paths may not be feasible
  - Worst case: all paths need to be enumerated
  - Also limited by the solver's capability (i.e. a path condition may not be solvable)
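
The selection rule can be sketched as a lookup in a coverage table. Everything here is an assumption made for illustration: the data layout, the names `step_t`, `cov_table_t`, `record_path`, and `pick_flip`, and in particular the choice to scan the path from the front, which the slides do not specify.

```c
#include <stdbool.h>

#define MAX_BRANCHES 64

/* One step of an executed path: which branch, and which direction was taken. */
typedef struct {
    int branch;
    bool taken;
} step_t;

/* covered[b][d]: branch b has been observed in direction d (0 = false, 1 = true). */
typedef struct {
    bool covered[MAX_BRANCHES][2];
} cov_table_t;

/* Mark every branch direction on an executed path as covered. */
void record_path(cov_table_t *cov, const step_t *path, int len) {
    for (int i = 0; i < len; i++)
        cov->covered[path[i].branch][path[i].taken ? 1 : 0] = true;
}

/* Return the index of the first step whose opposite direction is still
   uncovered, or -1 if flipping any step would add no branch coverage. */
int pick_flip(const cov_table_t *cov, const step_t *path, int len) {
    for (int i = 0; i < len; i++)
        if (!cov->covered[path[i].branch][path[i].taken ? 0 : 1])
            return i;
    return -1;
}
```

For two sequential if-statements this stops after three runs, (false, false), (true, false), (true, true), without ever generating the fourth path (false, true), because that path would add no new branch.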

26 Our implementation
- Implemented on FLEC (our C equivalence checker)
  - Used as SpecC[3] frontend
  - Control/data/communication/… dependencies have been extracted
- AST interpreter
  - Evaluates AST nodes (expressions / statements) one by one
  - C.f. CUTE: instrument & compile
  - We can start from any point in the program!
  - Concrete simulator evaluates with concrete values
  - Symbolic simulator evaluates with symbolic expressions
- Branch/path coverage profiler
- Input pattern generator
  - For alternative paths
  - For assertion failures
- SMT solver: CVC3[12]
  - To generate input patterns
  - To evaluate assertions
  - C.f. CUTE: lpsolve

28 Experimental results (1/3)
- Simple example
  - Achieved 2 / 2 (100%) branch coverage with 2 runs
  - Detected assertion failure with (x=0, y=0)
    int func(int x, int y) {
        int r = 0;
        if (x - y > 0)
            r = x - y;
        else
            r = y - x;
        return r;
    }
    void main() {
        int x, y;
        FL_INPUT(x);
        FL_INPUT(y);
        FL_ASSUME(x >= 0);
        FL_ASSUME(y >= 0);
        FL_ASSERT(func(x, y) > 0);
    }

29 Experimental results (2/3)
- Calculate factorial with two implementations
  - With recursive function calls
  - With for-loop
- Validated for one path (i = 8)
- Achieved 4/4 (100%) branch coverage with 1 run
    /* Recursive implementation. */
    unsigned int fact_rec(unsigned int s) {
        if (s <= 1) {
            return 1;
        } else {
            unsigned int t;
            t = s * fact_rec(s - 1);
            return t;
        }
    }
    /* Iterative implementation. */
    unsigned int fact_for(unsigned int s) {
        unsigned int i;
        unsigned int p;
        p = 1;
        for (i = 1; i <= s; i++) {
            p *= i;
        }
        return p;
    }
    void main() {
        int i, o1, o2;
        FL_INPUT(i);
        FL_ASSUME(i <= 10);
        o1 = fact_for(i);
        o2 = fact_rec(i);
        FL_ASSERT(o1 == o2);   /* both implementations must agree */
    }

30 Experimental results (3/3)
- # of branches: 10
- # of paths: 4 * 2^100
- Achieved 10 / 10 (100%) branch coverage with 5 runs
- Detected assertion failure with (x=1, y=2, z=3)
- CUTE got stuck due to too many paths
    int f(int x, int y, int z) {
        int p;
        if (x + y + z == 6)
            if (2*x + 7*y + 3*z == 25)
                if (-4*x - 2*y + 2*z == -2)
                    FL_ASSERT(0);
        for (p = 0; p < 100; p++) {
            if (p == z) {
            }
        }
    }
    void main() {
        int x, y, z;
        FL_INPUT(x);
        FL_INPUT(y);
        FL_INPUT(z);
        f(x, y, z);
    }
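
The reported counterexample can be checked independently. As a sketch only, the code below replaces the SMT solver with a brute-force search over a small cube; `hits_assert` and `count_solutions` are hypothetical names.

```c
#include <stdbool.h>

/* True when (x, y, z) passes all three nested conditions and so reaches
   FL_ASSERT(0) in the slide's function f. */
bool hits_assert(int x, int y, int z) {
    return x + y + z == 6
        && 2 * x + 7 * y + 3 * z == 25
        && -4 * x - 2 * y + 2 * z == -2;
}

/* Count the solutions in [-range, range]^3 and remember the last one found. */
int count_solutions(int range, int *sx, int *sy, int *sz) {
    int n = 0;
    for (int x = -range; x <= range; x++)
        for (int y = -range; y <= range; y++)
            for (int z = -range; z <= range; z++)
                if (hits_assert(x, y, z)) {
                    *sx = x;
                    *sy = y;
                    *sz = z;
                    n++;
                }
    return n;
}
```

The three conditions form a linear system whose coefficient matrix has determinant 28, so the solution is unique, and the search finds exactly (x=1, y=2, z=3).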

31 Elevator controller profile
- Elevator controller (abstracted model)
  - Cycle-based behavior
  - Simple, but designed by a real engineer
  - Contains an unintended bug
- Inputs:
  - 3 floors
    - Up request buttons on 1F and 2F
    - Down request buttons on 2F and 3F
  - 1 cabin
    - 3 buttons for floor stop request
    - 2 buttons for door open / close
- Outputs:
  - Up, Down request status
  - Floor stop request status
  - Door open/close
  - Cabin vertical speed (0: stopped, +1: up, -1: down)
  - Cabin position (on 1F, b/w 1F and 2F, on 2F, b/w 2F and 3F, on 3F)
  - Service direction (0: none, +1: up, -1: down)
[Figure: three floors (1F, 2F, 3F) and cabin buttons 1F, 2F, 3F, open, close]

32 Elevator controller profile (cont'd)
- State variables:
  - Up/Down request status (2+2)
  - Floor stop request status (3)
  - Door status (1)
  - Cabin position (on 1F, b/w 1F and 2F, on 2F, b/w 2F and 3F, on 3F)
  - Cabin speed (0: stopped, +1: up, -1: down)
  - Service direction (0: none, +1: up, -1: down)
  - 2^8 * 5 * 3 * 3 = 11.5k states (including infeasible ones)
  - Initially stopped on 1F, door closed, no request active
- Original code: 396 lines in SpecC
  - 145 million paths (including infeasible)
- Replaced if-then-else & switch-case statements with conditional (cond ? true : false) expressions
  - To handle multiple paths at once
  - Simple control flow (straight line), but very complex data flow
  - Reduced to 155 lines
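
The rewrite from statements to conditional expressions can be illustrated on a toy function. The function below is hypothetical and is not taken from the elevator code; it only shows the shape of the transformation, in which several control-flow paths collapse into one straight-line expression whose result depends on the data.

```c
/* Branching form: three control-flow paths through the function. */
int next_speed_branching(int request_above, int request_below) {
    int speed;
    if (request_above)
        speed = 1;
    else if (request_below)
        speed = -1;
    else
        speed = 0;
    return speed;
}

/* Conditional-expression form: one statement, so a simulator that walks
   statements sees a single path and a nested if-then-else (ITE) expression. */
int next_speed_ite(int request_above, int request_below) {
    return request_above ? 1 : (request_below ? -1 : 0);
}
```

Both forms compute the same value for every input; the second trades path count for expression size, which is consistent with the "simple control flow, very complex data flow" remark above.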

33 Elevator controller profile (cont'd)
- Property examples
  - Elevator must be on or between 1F and 3F
    - ASSERT((out_position >= 0) && (out_position <= 4));
  - Door opens only when the elevator is stopped on one of 1F, 2F, and 3F
    - ASSERT(!out_door || ((out_speed == 0) && ((out_position == 0) || (out_position == 2) || (out_position == 4))));
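
The two properties can be written as plain C predicates for experimentation outside the verifier. This is a sketch: the function names are hypothetical, and the position encoding (0 = on 1F, 2 = on 2F, 4 = on 3F, odd values between floors) is inferred from the assertions themselves.

```c
#include <stdbool.h>

/* Property 1: the cabin is on or between 1F (0) and 3F (4). */
bool prop_position_in_range(int out_position) {
    return out_position >= 0 && out_position <= 4;
}

/* Property 2: the door may be open only while the cabin is stopped
   exactly on a floor (position 0, 2, or 4). */
bool prop_door_safe(bool out_door, int out_speed, int out_position) {
    return !out_door
        || (out_speed == 0
            && (out_position == 0 || out_position == 2 || out_position == 4));
}
```

For example, an open door while stopped on 2F satisfies property 2, whereas an open door between floors or while moving violates it.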

34 Symbolic simulation result
[Figure: symbolic simulation result; labels: "Reset sequence", "Beginning of symbolic simulation", "300k nodes and more!"]
- Symbolic expression explodes in 3-4 cycles of symbolic simulation
  - With constant propagation/substitution
  - With simplifications for ITE, AND, OR, and other operators
  - Without concrete-value substitution (approximation)
  - Without common sub-expression sharing
- # of cycles of symbolic simulation must be tightly bounded!

35 User guided simulation
- Starts symbolic simulation from a state specified by the user
  - Explore with respect to the states of the user's interest
  - Some of the states are (proved to be) reachable by concrete (random) simulation
  - Jump into the states (which may or may not be feasible)
    - Will need to check their feasibility later
[Figure: state space with initial states; labels: "Concrete simulation", "Symbolic simulation", "Cycle is bounded", "Paths unknown", "Might be infeasible"]

36 User guided result (1)
- Try to generate an input pattern that produces a situation where
  - Located on 2F
  - Speed = -1 (down)
  - (not a bug)
- I.e. to violate ASSERT(!((out_speed == -1) && (out_position == 2)))
- This state is beyond the cycle bound from the initial state (stopped on 1F)
  - Need more than 3 cycles for the elevator to accept a request on 1F, start moving, go up at least to 2F, and go down…

37 User guided result (1) (cont'd)
- So let's jump into one of the feasible states
  - state_position = 4, state_door = false, state_speed = 0 …
  - Known a priori to be reachable, from random simulation
- Found an input pattern that violates the assertion at cycle 5 (3rd cycle of symbolic simulation)
  - Up request on cycle 1 = true
  - Up request on cycle 1 = false
  - Down request on cycle 1 = false
  - Stop on 1F cycle 1 = false
  - Stop on 2F cycle 1 = false

38 User guided result (2)
- Try to violate the assertion
  - Elevator must be on or between 1F and 3F
  - ASSERT((out_position >= 0) && (out_position <= 4));
- Let's jump into one of the states
  - state_position = 4 (on 3F)
  - state_speed = +1 (up)
- The next state goes to
  - out_position = 5 (higher than 3F!)
  - And violates the assertion!
- However, the state (state_position = 4, state_speed = +1) is actually infeasible
  - A wrong assumption may lead to a wrong conclusion
  - The feasibility of the originating state should be verified in some way
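
The spurious violation is easy to reproduce with a toy model. All of it is assumed for illustration: the one-cycle update (position advances by speed) is a guess at the relevant behavior rather than the controller's actual code, and the feasibility filter encodes only the single fact stated above, that (position = 4, speed = +1) is infeasible.

```c
#include <stdbool.h>

typedef struct {
    int position; /* 0 = on 1F, 2 = on 2F, 4 = on 3F, odd = between floors */
    int speed;    /* 0 = stopped, +1 = up, -1 = down */
} cabin_state_t;

/* Hypothetical one-cycle update: the cabin moves by its current speed. */
cabin_state_t step(cabin_state_t s) {
    s.position += s.speed;
    return s;
}

/* The asserted property: the cabin stays on or between 1F and 3F. */
bool position_ok(cabin_state_t s) {
    return s.position >= 0 && s.position <= 4;
}

/* Stand-in for the feasibility check of the starting state. */
bool start_state_is_feasible(cabin_state_t s) {
    return position_ok(s) && !(s.position == 4 && s.speed == 1);
}
```

Starting from (4, +1), one step breaks the property, but the filter rejects that starting state, so the violation should not be reported as a bug.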

39 Conclusion & Future work
- Conclusion
  - Implemented a concrete/symbolic hybrid simulator based on an AST interpreter
  - Proposed a method for input pattern generation for branch coverage
  - Experimental results demonstrate the input pattern generation
    - For assertion failure detection
    - For better branch coverage
- Future work
  - Capability to cover a specified target branch
  - Handling of concurrent executions
  - Hybrid simulation heuristic tuning
  - Efficient management of symbolic expressions

40 References
- [3] D. D. Gajski, J. Zhu, R. Domer, A. Gerstlauer, and S. Zhao. SpecC: Specification Language and Methodology. Kluwer Academic Publishers.
- [5] E. Larson and T. Austin. High coverage detection of input-related security faults. In SSYM'03: Proc. of the 12th Conf. on USENIX Security Symposium.
- [11] K. Sen, D. Marinov, and G. Agha. CUTE: a concolic unit testing engine for C. In Proc. of ESEC/SIGSOFT FSE-13.
- [12] A. Stump, C. Barrett, and D. Dill. CVC: a cooperating validity checker. In 14th Int'l Conf. on Computer-Aided Verification, 2002.

41 Difficulty compared with RTL or lower
- In traditional methodology for RTL or gate level
  - Word signals are converted into bit-vectors
  - Then solved with Boolean algebra
  - Efficient algorithms available: SAT, BDDs…
- In system-level descriptions
  - Too many word signals, too wide words (32 bit / 64 bit)
    - Too wide a space to explore
  - Complicated control flow
    - Data flow dynamically changes depending on the path
    - Control conditions are complex
    - Too many paths

42 Difficulty compared with RTL or lower (cont'd)
- In system-level descriptions
  - To model software
    - Recursive calls, pointers, pointer arithmetic, type casting, dynamic allocation…
  - To model hardware
    - Concurrency, synchronization, throughput, latency…
- As word-level solvers, SMT solvers can be employed, but with limited capability
  - Usually up to linear arithmetic
  - Need approximation / workaround, otherwise it would not work!

