An Evaluation of MC/DC Coverage for Pair-wise Test Cases By David Anderson Software Testing Research Group (STRG)
Background Software is becoming larger and more complicated, which naturally means the cost and time associated with testing are increasing. According to a National Institute of Standards and Technology report, software bugs cost the U.S. economy an estimated $59.5 billion annually. The same report indicated that about one third of that amount, or $22.2 billion, could be saved by improving testing infrastructure. New research is needed to find more cost-effective ways to test software.
The Proposal This project proposes the integration of pair-wise testing and MC/DC to create a new framework that helps software developers test their products in a more cost-effective way. This part of the project is concerned primarily with measuring MC/DC coverage using test cases generated by pair-wise testing. Future parts include research into how to improve MC/DC coverage of pair-wise test suites and developing tools that integrate these two testing techniques into one framework.
Definitions Pair-wise testing: a testing technique that analyzes interactions between variables using a small number of tests that cover all possible pairs of values between parameters. Modified Condition/Decision Coverage (MC/DC): a code coverage criterion that requires every point of entry and exit in a program to be executed at least once, every condition in a decision to take on all possible outcomes at least once, and each condition to be shown to affect that decision’s outcome independently.
Pair-wise example Consider the Boolean equation: d = (A ∧ B) ∨ C The following are acceptable test cases for full pair-wise coverage.
      A  B  C
t1    1  1  0
t2    1  0  1
t3    0  1  1
t4    0  0  0
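The claim that these four tests give full pair-wise coverage can be checked mechanically. The following is a minimal sketch, not part of the original slides; the function name `uncovered_pairs` is an illustrative assumption. It enumerates every pair of variables and every combination of two Boolean values, and reports any combination that no test exhibits.

```python
from itertools import combinations, product

# Test cases from the slide, as (A, B, C) tuples.
tests = [
    (1, 1, 0),  # t1
    (1, 0, 1),  # t2
    (0, 1, 1),  # t3
    (0, 0, 0),  # t4
]

def uncovered_pairs(tests, n_vars):
    """Return every (i, j, vi, vj) value pair that no test exhibits.

    Hypothetical helper: i and j are variable positions, vi and vj the
    Boolean values that never occur together at those positions.
    """
    missing = []
    for i, j in combinations(range(n_vars), 2):
        seen = {(t[i], t[j]) for t in tests}
        for vi, vj in product((0, 1), repeat=2):
            if (vi, vj) not in seen:
                missing.append((i, j, vi, vj))
    return missing

missing = uncovered_pairs(tests, 3)
print("pair-wise complete" if not missing else missing)
```

Dropping any one of the four tests leaves at least one value pair uncovered, which is why four is the smallest pair-wise set for three Boolean variables.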
Pair-wise facts Pair-wise is a powerful black-box testing technique. Extensive research has been conducted on this technique with outstanding results. It needs significantly fewer test cases than exhaustive testing, and the reduction grows with the size of the system being tested.
MC/DC example Consider the Boolean equation: d = (A ∧ B) ∨ C The following are acceptable test cases for full MC/DC coverage.
      A  B  C  d
t1    1  1  0  1
t2    1  0  1  1
t3    1  0  0  0
t4    0  1  0  0
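To see why this set satisfies MC/DC, one can look for, per condition, a pair of tests that differ only in that condition and produce different values of d. The sketch below assumes the strict "unique-cause" reading of MC/DC; the helper name `independence_pairs` is illustrative and does not come from the slides or from any tool mentioned in them.

```python
from itertools import combinations

def decision(a, b, c):
    # d = (A ∧ B) ∨ C, the expression from the slide.
    return (a and b) or c

# Test cases from the slide, as (A, B, C) tuples.
tests = {
    "t1": (1, 1, 0),
    "t2": (1, 0, 1),
    "t3": (1, 0, 0),
    "t4": (0, 1, 0),
}

def independence_pairs(tests, decision, n_conditions):
    """Map each condition index to the test pairs that differ only in that
    condition and yield different decision outcomes (unique-cause MC/DC)."""
    pairs = {i: [] for i in range(n_conditions)}
    for (n1, v1), (n2, v2) in combinations(tests.items(), 2):
        diff = [i for i in range(n_conditions) if v1[i] != v2[i]]
        if len(diff) == 1 and bool(decision(*v1)) != bool(decision(*v2)):
            pairs[diff[0]].append((n1, n2))
    return pairs

pairs = independence_pairs(tests, decision, 3)
for name, idx in zip("ABC", range(3)):
    print(name, pairs[idx])
```

For this set, A is shown independent by t1 and t4, B by t1 and t3, and C by t2 and t3, so every condition has an independence pair.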
MC/DC facts MC/DC is a white-box testing technique that ensures adequate coverage of decisions in software. MC/DC is used in standards DO-178B and DO-178C to ensure adequate testing of safety-critical software. In particular, the FAA has adopted this technique for the testing of airborne software. Given an expression of N values, on average N+1 test cases are needed to satisfy MC/DC coverage. For comparison, exhaustive testing requires 2^N test cases.
Why combine MC/DC and pair-wise? Reason 1: Effectiveness of testing Boolean expressions. Pair-wise: weak. MC/DC: strong. Pair-wise testing is not very effective at testing Boolean expressions. This was demonstrated in the paper “Effectiveness of Pair-wise Testing for Software with Boolean Inputs” by W. Ballance, S. Vilkomir, and W. Jenkins, in which pair-wise testing was only slightly more effective than random testing. MC/DC is designed for testing complex Boolean expressions, and many studies of its effectiveness have reported very positive results. In avionics software it is not uncommon to have Boolean expressions with 6+ variables; MC/DC was created specifically to test this kind of complex logic adequately.
Why combine MC/DC and pair-wise? Reason 2: Cost of implementation. Pair-wise: relatively inexpensive. MC/DC: expensive. Pair-wise testing, and combinatorial testing in general, is relatively cheap to implement. This comes from the black-box nature of the technique: a relatively small set of input data is needed for full pair-wise coverage. Since MC/DC is a white-box technique, testing of the underlying code is necessary. In particular, each Boolean expression must have an individual set of test data to achieve full MC/DC coverage. This makes implementing MC/DC very time-consuming and expensive.
Tools used Automated Combinatorial Testing for Software (ACTS): A tool developed by NIST that is used to generate combinatorial (in this case pair-wise) test cases for specified input variables. CodeCover: An Eclipse plugin developed at the University of Stuttgart that is used to measure various code coverage metrics including MC/DC. This was the main tool used for measuring coverage. CTC++: A commercial tool by Verifysoft for measuring coverage of C/C++ programs. This tool was used to verify the correctness of the data from CodeCover.
Demonstration For this demonstration, consider the Boolean expression: (A ∧ B) ∨ (C ∨ D)
Part 1: Generating Pair-wise test cases with ACTS
Part 2: Measuring MC/DC Coverage with CodeCover
Note While the previous example obtained 87.5% MC/DC Coverage, the results are not always this good…
Two categories of expressions Boolean expressions were categorized as either “Simple” or “Complex”. Simple expressions contain no repeated variables, while complex expressions contain at least one variable that appears more than once. For example:
Simple: A ∧ (B ∨ C)
Complex: (A ∧ B) ∨ (¬A ∧ C) ∨ (¬B ∧ ¬C)
The reasoning is that repetition adds more points of measurement to an expression: in a complex expression each occurrence of a variable has to be covered, while in a simple expression each variable has only one point to be covered.
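The simple/complex distinction can be expressed as a count of variable occurrences. The sketch below is an illustration only: it assumes variables are single capital letters, as on the slides, and the function names `classify` and `coverage_points` are hypothetical.

```python
import re
from collections import Counter

def classify(expression):
    """Return 'complex' if any variable occurs more than once, else 'simple'."""
    # Assumption: every capital letter is one occurrence of a variable.
    counts = Counter(re.findall(r"[A-Z]", expression))
    return "complex" if any(n > 1 for n in counts.values()) else "simple"

def coverage_points(expression):
    """Number of variable occurrences, i.e. points that have to be covered."""
    return len(re.findall(r"[A-Z]", expression))

for expr in ("A ∧ (B ∨ C)", "(A ∧ B) ∨ (¬A ∧ C) ∨ (¬B ∧ ¬C)"):
    print(expr, "->", classify(expr), coverage_points(expr))
```

Both example expressions use three variables, but the complex one has six occurrences to cover instead of three.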
Comparison with random test cases For each size of expression, one set of pair-wise test cases and three sets of random test cases were generated. Each random test case was produced by drawing a number from a random number generator and converting it into binary, one bit per variable. Each set of random test cases had the same number of cases as the pair-wise set for that expression size. The goal was to see whether pair-wise test cases obtain better levels of MC/DC coverage than randomly generated test cases.
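The random generation step described here can be sketched as follows. This is an assumed reconstruction, not the project's actual script: the function name `random_test_set` and the optional `seed` parameter are illustrative, and duplicates are allowed because the slides do not say whether they were filtered out.

```python
import random

def random_test_set(n_vars, n_tests, seed=None):
    """Draw n_tests random integers and convert each to an n_vars-bit vector."""
    rng = random.Random(seed)  # seed is only there to make runs repeatable
    tests = []
    for _ in range(n_tests):
        number = rng.randrange(2 ** n_vars)      # 0 .. 2^n_vars - 1
        bits = format(number, f"0{n_vars}b")     # zero-padded binary string
        tests.append(tuple(int(b) for b in bits))
    return tests

# Three random sets, each the same size as a 4-test pair-wise set.
for s in range(3):
    print(random_test_set(n_vars=3, n_tests=4, seed=s))
```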
Experiment design: Simple Expressions [Table columns: number of variables, number of expressions, pair-wise sets, pair-wise test cases, random sets, random test cases, total; values not preserved]
Experiment design: Complex Expressions [Table columns: number of variables, number of expressions, pair-wise sets, pair-wise test cases, random sets, random test cases, total; values not preserved]
Comparison based on size [Charts: Simple Expressions; Complex Expressions]
Comparison based on complexity
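Summary of Results [Table: coverage for pair-wise and random test cases in three column groups (Simple, Complex, Both); one row per expression size, starting at 3 variables, plus an Average row; values not preserved]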
The data from this experiment suggest that pair-wise test cases obtain only slightly better coverage than randomly generated test cases. Results for simple and complex expressions did not seem to differ significantly. With larger expressions, coverage appeared to decrease slowly. Coverage appeared to be highly dependent on the structure of individual expressions, with high variance within sets of data.
Stability of Results It should be noted that the range of the results was very high. The chart on this slide shows a sample from one set of 4-variable expressions: there is a wide range of coverage levels for both pair-wise and random tests.
What does this mean? Because of this high range and variance in MC/DC coverage levels, the data give a meaningful average only when many expressions of different sizes, complexities, and structures are measured together. This average would not be suitable as a predictor of coverage for individual expressions, or for industry software.
Analyzing coverage for one large set of test data In the previous experiment, each size of expression had different test data. In this experiment, one set of test data for 10 Boolean variables is used. Expressions of different sizes and containing different subsets of these 10 variables are tested and coverage is measured. This approach better matches the structure of industry software by using one set of test data for many expressions of different sizes.
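The idea of reusing one 10-variable test set for many smaller expressions amounts to projecting each test vector onto the variables a given expression uses. The sketch below illustrates only that projection step. The variable names A to J, the helper `project`, and the four placeholder test vectors are all assumptions; the vectors are not ACTS output and are not claimed to be a pair-wise set.

```python
NAMES = list("ABCDEFGHIJ")  # assumed names for the 10 Boolean variables

# Placeholder test data: NOT a real pair-wise set, just four sample vectors.
test_set = [
    (0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
    (1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
    (0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
    (1, 0, 1, 0, 1, 0, 1, 0, 1, 0),
]

def project(test_set, names, used):
    """Restrict each full test vector to the variables an expression uses."""
    idx = [names.index(v) for v in used]
    return [tuple(t[i] for i in idx) for t in test_set]

# A hypothetical 3-variable expression over a subset of the 10 variables.
def expr(b, e, j):
    return (b and e) or j

projected = project(test_set, NAMES, ["B", "E", "J"])
outcomes = [bool(expr(*t)) for t in projected]
print(projected)
print(outcomes)
```

Coverage of each expression would then be measured on its projected vectors, so different expressions see different, possibly repeated, slices of the same test data.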
Industry Software Since the long-term goal of this project is to create a framework for developers to test their code in a more effective way, applying this approach to industry software is important. Repositories such as the Software-Artifact Infrastructure Repository contain many examples of software intended for experiments such as this one. The results could be very different from the results of measuring coverage of individual expressions.
Methods for Improving Coverage Now that we have some data for coverage, a next step is to look for methods to improve this coverage. Methods could be increasing interaction strength (3-wise, 4-wise, etc.) or adding additional test cases to the pair-wise sets based on some criteria.
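One possible reading of "adding additional test cases based on some criteria" is a greedy top-up: for every condition that has no independence pair yet, add the fewest new tests that create one. The sketch below is a hypothetical illustration of that idea under the unique-cause reading of MC/DC, not a method proposed in the slides; `has_pair` and `augment` are invented names, and the result is not guaranteed to be minimal.

```python
from itertools import product

def flip(t, i):
    """Return test vector t with condition i inverted."""
    return t[:i] + (1 - t[i],) + t[i + 1:]

def has_pair(tests, decision, i):
    """True if two tests differ only in condition i and flip the decision."""
    pool = set(tests)
    return any(
        flip(t, i) in pool and bool(decision(*t)) != bool(decision(*flip(t, i)))
        for t in pool
    )

def augment(tests, decision, n):
    """Greedily add tests until each condition has an independence pair."""
    tests = list(tests)
    for i in range(n):
        if has_pair(tests, decision, i):
            continue
        # Every (v, flipped v) pair with differing outcomes is a candidate;
        # keep only the members that are not in the set yet.
        candidates = []
        for v in product((0, 1), repeat=n):
            w = flip(v, i)
            if bool(decision(*v)) != bool(decision(*w)):
                candidates.append([x for x in (v, w) if x not in tests])
        if candidates:  # empty only if condition i can never affect the outcome
            tests.extend(min(candidates, key=len))
    return tests

def d(a, b, c):
    return (a and b) or c  # d = (A ∧ B) ∨ C

pairwise = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 0, 0)]
print([has_pair(pairwise, d, i) for i in range(3)])
print(augment(pairwise, d, 3))
```

For the four-test pair-wise set of d = (A ∧ B) ∨ C, any two tests differ in two positions, so no condition has an independence pair to begin with; this sketch adds two tests to reach one pair per condition.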