November 5, 2007 ACM WEASEL Tech Efficient Time-Aware Prioritization with Knapsack Solvers Sara Alspaugh Kristen R. Walcott Mary Lou Soffa University of Virginia Michael Belanich Gregory M. Kapfhammer Allegheny College
Test Suite Prioritization. Testing occurs throughout the software development life cycle. Challenge: testing is time consuming and costly. Prioritization: reordering the test suite. Goal: find errors sooner in testing. Limitation: doesn’t consider the overall time budget. Alternative: time-aware prioritization. Goal 1: find errors sooner in testing. Goal 2: execute within the time constraint.
Motivating Example. Original test suite with fault information: T2 (1 fault, 2 min.), T4 (6 faults, 2 min.), T3 (2 faults, 2 min.), T1 (4 faults, 2 min.). Assume: same execution time; unique faults found. Testing time budget: 4 minutes. Prioritized test suite: T4 (6 faults, 2 min.), T1 (4 faults, 2 min.), T2 (1 fault, 2 min.), T3 (2 faults, 2 min.).
The Knapsack Problem for Time-Aware Prioritization. Maximize $\sum_{i=1}^{n} c_i \cdot x_i$, where $c_i$ is the code coverage of test $i$ and $x_i$ is either 0 or 1. Subject to the constraint $\sum_{i=1}^{n} t_i \cdot x_i \le t_{max}$, where $t_i$ is the execution time of test $i$ and $t_{max}$ is the time budget.
The Knapsack Problem for Time-Aware Prioritization. Test cases: T1 (4 lines, 2 min.), T2 (1 line, 2 min.), T3 (2 lines, 2 min.), T4 (5 lines, 2 min.). Time budget: 4 min. Total value and space remaining as the knapsack fills: 0 and 4 min., then 5 and 2 min., then 9 and 0 min. Assume test cases cover unique requirements.
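The example on this slide can be reproduced with a small 0/1 knapsack solver. The following is only a sketch of the textbook dynamic programming approach, assuming integer execution times in minutes; the function name `knapsack_01` and the tuple layout are illustrative choices, not taken from the presentation.

```python
from typing import List, Tuple

def knapsack_01(tests: List[Tuple[str, int, int]], budget: int) -> Tuple[int, List[str]]:
    """Solve the traditional 0/1 knapsack for (name, coverage, time) tuples.

    Assumes integer times and that test cases cover unique requirements,
    so the value of a selection is simply the sum of its coverages.
    """
    # best[b] holds (total coverage, chosen test names) for time budget b.
    best = [(0, [])] * (budget + 1)
    for name, cov, time in tests:
        # Walk budgets downward so each test is selected at most once.
        for b in range(budget, time - 1, -1):
            candidate = best[b - time][0] + cov
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - time][1] + [name])
    return best[budget]

# Test cases from the slide: (name, lines covered, minutes).
SLIDE_TESTS = [("T1", 4, 2), ("T2", 1, 2), ("T3", 2, 2), ("T4", 5, 2)]
value, chosen = knapsack_01(SLIDE_TESTS, budget=4)
print(value, sorted(chosen))  # 9 ['T1', 'T4']
```

With a 4-minute budget the solver picks T4 and T1 for a total value of 9, matching the running totals shown on the slide.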
The Extended Knapsack Problem. Value of each test case depends on the test cases already in the prioritization. Test cases may cover the same requirements. Test cases: T1 (4 lines, 2 min.), T2 (1 line, 2 min.), T3 (2 lines, 2 min.), T4 (5 lines, 2 min.). Time budget: 4 min. Total value and space remaining as the knapsack fills: 0 and 4 min., then 5 and 2 min., then 7 and 0 min. Update: T1 drops to 0 lines, 2 min.
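The update step can be illustrated by recomputing each test's marginal coverage after every selection. This sketch uses a simple greedy loop, which is not the genetic algorithm the presentation attributes to the overlap-aware solver. The line sets below are hypothetical, chosen only so that the counts match the slide (T1's four lines are all assumed to be covered by T4).

```python
from typing import Dict, List, Set, Tuple

def greedy_overlap_aware(coverage: Dict[str, Set[int]],
                         times: Dict[str, int],
                         budget: int) -> Tuple[List[str], Set[int]]:
    """Pick tests one at a time by how many NEW lines each would add."""
    order: List[str] = []
    covered: Set[int] = set()
    remaining = budget
    candidates = set(coverage)
    while True:
        fitting = sorted(t for t in candidates if times[t] <= remaining)
        if not fitting:
            break
        # Marginal value: lines this test covers that are not covered yet.
        best = max(fitting, key=lambda t: len(coverage[t] - covered))
        if not coverage[best] - covered:
            break  # no remaining test adds any new coverage
        order.append(best)
        covered |= coverage[best]
        remaining -= times[best]
        candidates.remove(best)
    return order, covered

# Hypothetical line sets consistent with the slide's counts.
COVERAGE = {"T1": {1, 2, 3, 4}, "T2": {8}, "T3": {6, 7}, "T4": {1, 2, 3, 4, 5}}
TIMES = {"T1": 2, "T2": 2, "T3": 2, "T4": 2}
order, covered = greedy_overlap_aware(COVERAGE, TIMES, budget=4)
print(order, len(covered))  # ['T4', 'T3'] 7
```

Under these assumed line sets, T1 contributes 0 new lines once T4 is selected, so the total is 7 rather than the 9 that the traditional formulation would report.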
Goals and Challenges. Evaluate traditional and extended knapsack solvers for use in time-aware prioritization. Effectiveness: coverage-based metrics. Efficiency: time overhead and memory overhead. How does overlapping code coverage affect the results of traditional techniques? Is the cost of extended knapsack algorithms worthwhile?
The Knapsack Solvers. Random: select test cases at random. Greedy by Ratio: order by coverage/time. Greedy by Value: order by coverage. Greedy by Weight: order by time. Dynamic Programming: break the problem into sub-problems; use sub-problem results for the main solution. Generalized Tabular: use large tables to store sub-problem solutions.
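The three greedy solvers differ only in their sort key, which a short sketch can make concrete. The test data is hypothetical, and two details are assumptions not stated on the slide: that Greedy by Weight orders by ascending time, and that a test which does not fit is skipped rather than ending the selection.

```python
from typing import Callable, List, Tuple

Test = Tuple[str, int, int]  # (name, coverage, time)

def greedy_prioritize(tests: List[Test], budget: int,
                      key: Callable[[Test], float]) -> List[str]:
    """Sort tests by `key`, then add each one that still fits the budget."""
    chosen: List[str] = []
    remaining = budget
    for name, _cov, time in sorted(tests, key=key):
        if time <= remaining:  # assumption: skip tests that do not fit
            chosen.append(name)
            remaining -= time
    return chosen

def by_ratio(t: Test) -> float:
    return -t[1] / t[2]  # highest coverage/time first

def by_value(t: Test) -> float:
    return -t[1]         # highest coverage first

def by_weight(t: Test) -> float:
    return t[2]          # shortest execution time first (assumed direction)

# Hypothetical tests with differing times so the orderings diverge.
TESTS = [("A", 6, 4), ("B", 3, 1), ("C", 4, 2), ("D", 1, 1)]
print(greedy_prioritize(TESTS, 4, by_ratio))   # ['B', 'C', 'D']
print(greedy_prioritize(TESTS, 4, by_value))   # ['A']
print(greedy_prioritize(TESTS, 4, by_weight))  # ['B', 'D', 'C']
```

On this data the ratio and weight orderings both reach a total coverage of 8, while ordering by value alone spends the whole budget on A for a coverage of 6.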
The Knapsack Solvers (continued). Core: compute the optimal fractional solution, then exchange items until an optimal integral solution is found. Overlap-Aware: uses a genetic algorithm to solve the extended knapsack problem for time-aware prioritization.
The Scaling Heuristic. Order the test cases $T_i$ by their coverage-to-execution-time ratio such that $\frac{c_1}{t_1} \ge \frac{c_2}{t_2} \ge \dots \ge \frac{c_n}{t_n}$. If $c_1 \times \lfloor \frac{t_{max}}{t_1} \rfloor \ge c_2 \times \left( \frac{t_{max}}{t_2} \right)$, then it is possible to find an optimal solution that includes $T_1$. Check the inequality for each test case $T_x$, $x \in [1, n]$, until it no longer holds. $\langle T_1, \dots, T_{x-1} \rangle$ belong in the final prioritization.
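One literal reading of the heuristic is sketched below: sort by ratio, then compare each test with its successor until the inequality fails. The slide does not say whether the budget should shrink after a test is fixed, so this sketch keeps $t_{max}$ constant, which is an assumption. The function name and test data are hypothetical.

```python
import math
from typing import List, Tuple

Test = Tuple[str, int, int]  # (name, coverage c, execution time t)

def scaling_prefix(tests: List[Test], t_max: float) -> List[str]:
    """Return the leading tests that the inequality places in the prioritization."""
    # Order by coverage-to-execution-time ratio, highest first.
    ordered = sorted(tests, key=lambda t: t[1] / t[2], reverse=True)
    fixed: List[str] = []
    # Compare each test T_x with its successor T_{x+1}.
    for (name, c_x, t_x), (_next_name, c_next, t_next) in zip(ordered, ordered[1:]):
        # c_x * floor(t_max / t_x) >= c_{x+1} * (t_max / t_{x+1})
        if c_x * math.floor(t_max / t_x) >= c_next * (t_max / t_next):
            fixed.append(name)
        else:
            break  # stop at the first test for which the inequality fails
    return fixed

# Hypothetical tests: ratios are 5, 3, 2.5, and 1.
TESTS = [("A", 10, 2), ("B", 6, 2), ("C", 5, 2), ("D", 1, 1)]
print(scaling_prefix(TESTS, t_max=5))  # ['A']
```

Here A passes the check (20 ≥ 15) but B does not (12 < 12.5), so only A is fixed and the remaining tests would be handed to the chosen knapsack solver.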
Implementation Details. Components: Coverage Calculator, Knapsack Solver, Test Transformer. Inputs: Program Under Test (P) and Test Suite (T). Output: New Test Suite (T'). Knapsack Solver Parameters: 1. Selected Solver; 2. Reduction Preference; 3. Knapsack Size.
Evaluation Metrics. Code coverage: percentage of requirements executed when the prioritization is run; basic block coverage is used. Coverage preservation: proportion of code covered by the prioritization versus code covered by the entire original test suite. Order-aware coverage: considers both the order in which test cases execute and the overall code coverage.
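The first two metrics follow directly from their definitions and can be sketched as set computations. The function names and the sample data are hypothetical. Order-aware coverage is left out because the slide does not give its formula.

```python
from typing import Dict, List, Set

def covered_by(tests: List[str], coverage: Dict[str, Set[int]]) -> Set[int]:
    """Union of the requirements (e.g. basic blocks) executed by `tests`."""
    result: Set[int] = set()
    for name in tests:
        result |= coverage[name]
    return result

def code_coverage(prioritization: List[str], coverage: Dict[str, Set[int]],
                  all_requirements: Set[int]) -> float:
    """Fraction of all requirements executed when the prioritization runs."""
    return len(covered_by(prioritization, coverage)) / len(all_requirements)

def coverage_preservation(prioritization: List[str],
                          coverage: Dict[str, Set[int]]) -> float:
    """Coverage of the prioritization relative to the entire original suite."""
    full_suite = covered_by(list(coverage), coverage)
    return len(covered_by(prioritization, coverage)) / len(full_suite)

# Hypothetical data: 8 basic blocks, of which the full suite reaches 4.
COVERAGE = {"T1": {1, 2}, "T2": {2, 3}, "T3": {4}}
ALL_BLOCKS = set(range(1, 9))
print(code_coverage(["T1", "T2"], COVERAGE, ALL_BLOCKS))  # 0.375
print(coverage_preservation(["T1", "T2"], COVERAGE))      # 0.75
```

The two values differ because coverage preservation is normalized by what the original suite reaches, not by the size of the program.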
Experiment Design. Goals of experiment: measure efficiency of algorithms and scaling in terms of time and space overhead; measure effectiveness of algorithms and scaling in terms of three coverage-based metrics. Case studies: JDepend and Gradebook. Knapsack size: 25, 50, and 75% of the execution time of the original test suite.
Summary of Experimental Results. Prioritizer Effectiveness: Overlap-aware solver had the highest overall coverage for each time limit; Greedy by Value solver good for Gradebook; all Greedy solvers good for JDepend. Prioritizer Efficiency: all algorithms took a small amount of time and memory except for Dynamic Programming, Generalized Tabular, and Core; Overlap-aware solver required hours to run; Generalized Tabular had prohibitively large memory requirements; scaling heuristic reduced overhead in some cases.
Conclusions. The most sophisticated algorithm is not necessarily the most effective or the most efficient. Trade-off: effectiveness versus efficiency. Is efficiency or effectiveness most important? If effectiveness: overlap-aware prioritizer. If efficiency: low-overhead prioritizer. Prioritizer choice depends on the nature of the test suite: time versus coverage of each test case, and coverage overlap between test cases.
Future Research. Use larger case studies with bigger test suites. Use case studies written in other languages. Evaluate other knapsack solvers such as branch-and-bound and parallel solvers. Incorporate other metrics such as APFD. Use synthetically generated test suites.
Questions? Thank you!
Case Study Applications
                        Gradebook   JDepend
Classes                 5           22
Functions               73          305
NCSS
Test Cases              28          53
Test Suite Exec. Time   7.008 s     5.468 s