
1 Test Case Filtering and Prioritization Based on Coverage of Combinations of Program Elements
Wes Masri and Marwa El-Ghali
American Univ. of Beirut, ECE Department, Beirut, Lebanon
wm13@aub.edu.lb

2 Test Case Filtering
Test case filtering is concerned with selecting from a test suite T a subset T' that is capable of revealing most of the defects revealed by T.
Approach: select T' so that it covers all program elements covered by T.
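The element-coverage approach above is essentially a greedy set cover. A minimal sketch, assuming profiles are represented as sets of element identifiers (the names and data shapes are illustrative, not from the paper):

```python
def filter_tests(profiles):
    """Return a subset T' of test ids covering every element covered by T.

    profiles: dict mapping a test id to the set of program elements it covers.
    """
    uncovered = set().union(*profiles.values())
    selected = []
    while uncovered:
        # Greedily pick the test that covers the most still-uncovered elements.
        best = max(profiles, key=lambda t: len(profiles[t] & uncovered))
        selected.append(best)
        uncovered -= profiles[best]
    return selected
```

The greedy heuristic does not guarantee the minimum subset (set cover is NP-hard), but it is the standard approximation used for coverage-based reduction.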

3 Test Case Filtering: What to Cover?
Existing techniques cover singular program elements of varying granularity: methods, statements, branches, def-use pairs, slice pairs, and information-flow pairs.
Previous studies have shown that increasing the granularity reveals more defects, at the expense of selecting larger subsets.

4 Test Case Filtering
This work explores covering suspicious combinations of simple program elements.
The number of possible combinations is exponential in the number of singular elements, so an approximation algorithm is needed; we use a genetic algorithm.

5 Test Case Filtering: Conjectures
I. Combinations of program elements are more likely to characterize complex failures.
II. The percentage of failing tests is typically much smaller than that of the passing tests:
– Each defect causes a small number of tests to fail.
– Given groups of (structurally) similar tests, smaller groups are more likely to be failure-inducing than larger ones.

6 Test Case Filtering: Steps
1) Given a test suite T, generate execution profiles of simple program elements (statements, branches, and def-use pairs).
2) Choose a threshold M_fail for the maximum number of tests that could fail due to a single defect.
3) Use the genetic algorithm to generate C', a set of combinations of simple program elements that were covered by fewer than M_fail tests (the suspicious combinations).
4) Use a greedy algorithm to extract T', the smallest subset of T that covers all the combinations in C'.

7 Genetic Algorithm
A genetic algorithm solves a problem by:
– operating on an initial population of candidate solutions, or chromosomes;
– evaluating their quality using a fitness function;
– applying transformations to create new generations of improved quality;
– ultimately evolving to a single solution.
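The loop described above can be sketched as a steady-state GA: pick two parents, produce a child that favors the fitter one, and replace the worse parent, collecting every chromosome seen along the way. The generation count and the shape of the `combine` operator are assumptions for illustration:

```python
import random

def evolve(population, fitness, combine, generations=200):
    """Steady-state GA sketch; mutates `population` in place and returns
    every child chromosome encountered during the run."""
    encountered = []
    for _ in range(generations):
        i, j = random.sample(range(len(population)), 2)
        if fitness(population[i]) < fitness(population[j]):
            i, j = j, i                      # i now indexes the fitter parent
        child = combine(population[i], population[j])
        population[j] = child                # replace the worse parent
        encountered.append(child)
    return encountered
```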

8 Fitness Function
We use the following equation: fitness(combination) = 1 - %tests, where %tests is the percentage of test cases that exercised the combination.
The smaller the percentage, the higher the fitness.
The aim is to end up with a manageable set of combinations in which each combination occurred in at most M_fail tests.
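The fitness equation translates directly to code, again assuming profiles as sets of elements:

```python
def fitness(combination, profiles):
    """fitness = 1 - (fraction of tests that exercised the combination)."""
    n_hit = sum(1 for covered in profiles.values() if combination <= covered)
    return 1.0 - n_hit / len(profiles)
```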

9 Initial Population Generation
Chromosomes are generated from the union of all execution profiles.
Population size: 50 in our implementation.
A 0 bit always remains 0; a 1 bit remains 1 with a small probability P.
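A minimal sketch of this generation rule, with chromosomes as bit tuples over the union profile (the default value of P is an assumed tuning parameter):

```python
import random

def initial_population(union_profile, size=50, p=0.05):
    """Each chromosome mirrors the union profile: a 0 bit stays 0,
    and a 1 bit is kept as 1 only with small probability p."""
    return [tuple(0 if bit == 0 else int(random.random() < p)
                  for bit in union_profile)
            for _ in range(size)]
```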

10 Transformation Operator
Combines two parent chromosomes to produce a child.
The child inherits properties from each parent, favoring the parent with the higher fitness; the goal is for the child to have a better fitness than its parents.
The parent with the worse fitness is replaced by the child.
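One way to realize "favoring the fitter parent" is a biased uniform crossover; the bias value below is an assumption, not from the paper:

```python
import random

def combine(fitter, weaker, bias=0.75):
    """Each child bit is inherited from the fitter parent with probability
    `bias`, otherwise from the weaker parent."""
    return tuple(f if random.random() < bias else w
                 for f, w in zip(fitter, weaker))
```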

11 Solution Set
The obtained solution set contains all the encountered combinations with high-enough fitness values, i.e., the suspicious combinations.

12 Experimental Work
Our subject programs:
– The JTidy HTML syntax checker and pretty printer: 1000 tests; 8 defects; 47 failures.
– The NanoXML XML parser: 140 tests; 4 defects; 20 failures.

13 Experimental Work
We profiled the following program elements:
– basic blocks, i.e., statements (BB)
– basic-block edges, i.e., branches (BBE)
– def-use pairs (DUP)
Next we applied the genetic algorithm to generate:
– a pool of BB_comb
– a pool of BBE_comb
– a pool of DUP_comb
– a pool of ALL_comb (combinations of BBs, BBEs, and DUPs)
The values of M_fail we chose for JTidy and NanoXML were 100 and 20, respectively.

14 JTidy results:

Profile Type   % Tests Selected   % Defects Revealed
BB                   5.3                55.0
BB_comb              9.6                65.6
BBE                  6.5                78.7
BBE_comb            10.2                87.5
DUP                 11.7                81.2
DUP_comb            14.1                87.5
ALL                 12.4                94.8
ALL_comb            14.1               100.0
SliceP              26.7               100.0

In the case of ALL_comb, 14.1% of the original test suite was needed to exercise all of the combinations exercised by the original test suite, and these tests revealed all the defects revealed by the original test suite.
In previous work we showed that coverage of slice pairs (SliceP) performed better than coverage of BB, BBE, and DUP; this is why we include the SliceP results here for comparison.

15 The figure above compares the various techniques to random sampling:
1) All variations performed better than random sampling.
2) BB_comb revealed 10.6% more defects than BB but selected 4.2% more tests.
3) BBE_comb revealed 8.8% more defects than BBE but selected 3.7% more tests.
4) DUP_comb revealed 6.3% more defects than DUP but selected 2.4% more tests.
5) ALL_comb performed better than SliceP: it revealed all defects, as SliceP did, but selected 12.6% fewer tests.

16 Experimental Work
Concerning BB_comb, BBE_comb, and DUP_comb, the additional cost of selecting more tests might not be well justified, since their rate of improvement is no better than that of random sampling.
ALL_comb not only performed better than SliceP but is also considerably less costly:
– Generating its profiles (BBs, BBEs, and DUPs) took 90 seconds on average per test, whereas generating the SliceP profiles took 1200 seconds per test (1 day vs. 2 weeks for the full suite).

17 NanoXML observations:
– BB, BBE, DUP, and ALL did not perform any better than random sampling, whereas BB_comb, BBE_comb, DUP_comb, and ALL_comb performed noticeably better.
– BB_comb, BBE_comb, DUP_comb, and ALL_comb revealed all the defects, but at a relatively high cost, since over 50% of the tests needed to be executed.
– The cost of running the genetic algorithm and the greedy selection algorithm has to be factored in when comparing our techniques to others.

18 Test Case Prioritization
Test case prioritization aims at scheduling the tests in T so that defects are revealed as early as possible.
Summary of our technique: prioritize combinations in terms of their suspiciousness, then assign the priority of a given combination to the tests that cover it.

19 Test Case Prioritization: Steps
1) Identify combinations that were exercised by exactly 1 test; assign that test priority 1 and add it to T'.
2) Identify combinations that were exercised by 2 tests; assign those tests priority 2 and add them to T'.
3) Continue until all tests are prioritized, M_fail is exceeded, or all combinations have been explored.
4) Use the greedy algorithm to reduce T'.
5) Any remaining tests that were not prioritized are scheduled to run randomly after the prioritized tests.
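Steps 1–3 above can be sketched as a single scan over group sizes: a test's priority is the size of the smallest combination group it appears in, up to M_fail. Data shapes are illustrative:

```python
def prioritize(combination_tests, m_fail):
    """Assign priorities to tests per steps 1-3.

    combination_tests: dict mapping a combination id to the set of tests
    that exercised it. Returns {test: priority}; tests never assigned a
    priority are left out (they run randomly afterwards).
    """
    priority = {}
    for k in range(1, m_fail + 1):
        for tests in combination_tests.values():
            if len(tests) == k:
                for t in tests:
                    priority.setdefault(t, k)   # keep the smallest k seen
    return priority
```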

20 JTidy prioritization results when step 3 terminates, i.e., when all tests are prioritized, M_fail is exceeded, or all combinations have been explored:

Element     % Tests   % Defects
BB_comb       6.75      56.25
BBE_comb      7.55      81.25
DUP_comb     12.6       87.5
ALL_comb     13.05     100.0

Observation: using BB_comb, BBE_comb, or DUP_comb, not all defects were revealed; combinations of BBs, BBEs, and DUPs (ALL_comb) are needed to reveal all defects.

21 NanoXML prioritization results:

Element     % Tests   % Defects
BB_comb      50.2      100.0
BBE_comb     50.8      100.0
DUP_comb     52.8      100.0
ALL_comb     53.5      100.0

Observation: all defects were revealed using BB_comb, BBE_comb, DUP_comb, or ALL_comb, but at a high cost in selected tests.

22 Conclusion
Our techniques performed better than similar coverage-based techniques that consider program elements of a single type and do not take their combinations into account.
Future work: we will conduct a more thorough empirical study, and will use the APFD (Average Percentage of Faults Detected) metric to evaluate prioritization.

