
1 State coverage: an empirical analysis based on a user study Dries Vanoverberghe, Emma Eyckmans, and Frank Piessens

2 Software Validation Metrics
Software defects after product release are expensive
– NIST 2002: $60 billion annually
– MS security bulletins: around 40 per year, at $100k to $1M each
Validating software (testing)
– Reduces the number of defects before release
– But not without a cost
Make the tradeoff:
– Estimate the remaining number of defects
=> Software validation metrics

3 Example: Code coverage
Fraction of statements/basic blocks that are executed by the test suite
Principle (illustrated in the sketch below):
– not executed => no defects discovered
Hypothesis:
– not executed => more likely to contain a defect
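A minimal sketch of why executed code is not the same as checked code, using a hypothetical max_of function (not from the paper): both tests reach 100% statement coverage, but only the second can discover the seeded defect.

```python
def max_of(a, b):
    # Hypothetical function, seeded with a defect for illustration.
    if a < b:
        return a  # bug: should return b
    return b      # bug: should return a

def test_full_coverage_no_oracle():
    # Executes both branches -> 100% statement coverage...
    max_of(1, 2)
    max_of(2, 1)
    # ...but no assertion reads the results, so the defect stays hidden.

def test_full_coverage_with_oracle():
    # Same statements executed, but the oracle checks the requirement.
    assert max_of(1, 2) == 2  # fails, exposing the defect
```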

4 Example: Code coverage
High statement coverage
– No defects? Not necessarily
– Different paths through the same statements may remain unexplored
Structural coverage metrics:
– e.g. path coverage, data-flow coverage, …
– Measure the degree of exploration
Automatic tool assistance
– Metrics then evaluate tools rather than human effort

5 Problem statement
Exploration is not sufficient
– Tests also need to check the requirements
– Evaluate the completeness of the test oracle
Impossible to automate:
– The requirements would have to be guessed
– Yet evaluation is critical!
No good metrics available

6 State coverage
Evaluates the strength of assertions
Idea (illustrated in the sketch below):
– State updates must be checked by assertions
Hypothesis:
– unchecked state update => more likely to hide a defect
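A hedged sketch of the idea, with a hypothetical Event class rather than the calendar system from the study: the constructor performs two state updates, but the test's assertions read only one of them.

```python
class Event:
    # Hypothetical class, not the application used in the experiment.
    def __init__(self, title, duration):
        self.title = title        # state update 1
        self.duration = duration  # state update 2

def test_event():
    e = Event("standup", 15)
    assert e.title == "standup"  # reads state update 1: covered
    # State update 2 (e.duration) is never read by any assertion; under the
    # hypothesis above, a defect in that update is more likely to slip through.
```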

7 State coverage
Complements code coverage
– Not a replacement for it
Metrics also assist developers
– Code coverage => are the statements reachable?
– State coverage => is the invariant established by the reachable statements?

8 State coverage
Metric: state update
– Assignment to fields of objects
– Return values, local variables, … are also possible
Computation (worked example below):
– Runtime monitor
state coverage = (number of state updates read in assertions) / (total number of state updates)
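The paper computes the metric with a runtime monitor; the sketch below merely applies the formula by hand to the test_event example above (the counts are illustrative, not tool output).

```python
# State updates observed at runtime for test_event:
#   e.title    <- "standup"  (later read by an assertion)
#   e.duration <- 15         (never read by an assertion)
updates_read_in_assertions = 1
total_updates = 2
state_coverage = updates_read_in_assertions / total_updates
print(state_coverage)  # 0.5, i.e. 50% state coverage
```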

9 Design of experiment
Existing evaluation:
– Correlation with mutation adequacy (Koster et al.)
– Case study by an expert user
Goal:
– Directly analyze the correlation with ‘real’ defects
– Average users instead of experts

10 Hypotheses
Hypothesis 1:
– When state coverage increases (without increasing exploration), the number of discovered defects increases
– Similar to the existing case study
Hypothesis 2:
– State coverage and the number of discovered defects are correlated
– A much stronger claim

11 Structure of experiment
Base program:
– Small calendar management system
– Result of a software design course
– Existing test suite
– Presence of software defects unknown

12 Structure of experiment
Phase 1: case study
– Extend the test suite to find defects
  First increase code coverage
  Then increase state coverage
– Dry run of the experiment
  Simplified application
  Injected additional defects

13 Structure of experiment
Phase 2: controlled user study
– Create a new test suite
  First increase code coverage
  Then increase state coverage
– Commit after each detected defect

14 Threats to validity
Internal validity
– Two sessions: no differences observed
– Learning effect: subjects were familiar with the environment before the experiment
External validity
– Choice of application
– Choice of faults
– Subjects are students

15 Results
Phase 1: case study
– No additional defects discovered
– No confirmation of hypothesis 1
– Potential reasons:
  Mostly structural faults
  The non-structural faults were obvious
Phase 2: controlled user study
– No confirmation of hypothesis 1

18 Potential causes
Frequency of logical faults
– 3/20 faults are incorrect state updates
– Only 1/14 discovered!
– 5/14 are detectable by assertions
– Focusing on these 5 faults:
  classes that detect at least one of them have higher state coverage (42% vs. 34%)
– How common are logical faults in practice?

19 Potential causes
Logical faults were too obvious
– Subjects already discovered them while increasing code coverage
State coverage is not monotonic
– Adding new tests may decrease state coverage (worked example below)
– It is always relative to the amount of exploration
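A worked example of the non-monotonicity, with illustrative numbers (not from the study): a new test can grow the denominator of the metric without growing the numerator.

```python
# Before: 3 of the 4 observed state updates are read in assertions.
print(3 / 4)  # 0.75

# A new test exercises 2 additional state updates but asserts nothing:
# the numerator stays at 3 while the denominator grows to 6.
print(3 / 6)  # 0.5 -> state coverage dropped although exploration increased
```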

21 Conclusions
The experiment fails to confirm the hypotheses
– How frequent are logical faults?
– Combine state coverage with code coverage?
  Or compare test suites with similar code coverage
But state coverage remains:
– Simple
– Efficient

22 Questions?

