
Slide 1: Comparison of Delivered Reliability of Branch, Data Flow, and Operational Testing: A Case Study
Phyllis G. Frankl and Yuetang Deng, Polytechnic University, Brooklyn, NY
ISSTA 2000, 8/23/00

Slide 2: Outline
– Measures of test effectiveness
– Delivered reliability
– Experiment design
– Subject program
– Results
– Threats to validity
– Conclusions

Slide 3: Measures of Test Effectiveness
– Probability of detecting at least one fault [DN84, HT90, FWey93, FWei93, …]
– Expected number of failures during test [FWey93, CY96]
– Number of faults detected [HFGO94]
– Delivered reliability [FHLS98]

Slide 4: [Flow diagram: the conventional coverage-based testing process]
Select test cases → Execute test cases → Check results (failures: Debug program and repeat) → Check test data adequacy → OK? no: select more test cases; yes: Release program

Slide 5: [Flow diagram: the same process, with reliability estimation as the exit condition]
Select test cases → Execute test cases → Check results → Debug program → Estimate reliability → OK? no: select more test cases; yes: Release program

Slide 6: Delivered Reliability
– Captures the intuition that discovering and removing "important" faults matters most
– Evaluates a testing technique by the extent to which testing increases reliability
– Introduced and studied analytically in FHLS (FSE-97, TSE-98)

Slide 7: Failures, Faults, and Failure Regions
A schematic program (s1..s6 are statements, c1 a condition):

    int foo(int x, int y) {
        s1; s2;
        if (c1) { s3; s4; }
        s5; s6;
    }

q_i = the probability that an input selected according to the operational distribution hits failure region i.
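To make q_i concrete, here is a minimal C sketch (not from the talk; the domain, the regions, and the sampling distribution are all invented for illustration) that models each failure region as a predicate on the input domain and estimates q_i by sampling a stand-in operational distribution:

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical setup: inputs are integers in [0, DOMAIN), drawn
     * uniformly as a stand-in for the operational distribution. */
    #define DOMAIN   10000
    #define SAMPLES  1000000
    #define NREGIONS 2

    /* Invented failure regions: region 0 is inputs 100..104,
     * region 1 is inputs 5000..5009. */
    static int in_region(int i, int x) {
        switch (i) {
        case 0: return x >= 100 && x <= 104;
        case 1: return x >= 5000 && x <= 5009;
        default: return 0;
        }
    }

    int main(void) {
        long hits[NREGIONS] = {0};
        srand(42);
        for (long n = 0; n < SAMPLES; n++) {
            int x = rand() % DOMAIN;
            for (int i = 0; i < NREGIONS; i++)
                if (in_region(i, x)) hits[i]++;
        }
        /* q_i is estimated as the fraction of operational inputs
         * landing in failure region i. */
        for (int i = 0; i < NREGIONS; i++)
            printf("q_%d ~= %f\n", i, (double)hits[i] / SAMPLES);
        return 0;
    }

With a non-uniform operational profile, only the sampling line changes; the q_i remain per-region hit probabilities.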

Slide 8: Failure Rate After Testing/Debugging
– Reliability after testing and debugging is determined by which failure regions are hit by the test cases
– A random variable (written Θ here) represents the failure rate after testing and debugging
– Compare testing techniques by comparing the statistics of their Θ's
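This is a one-line computation once you know which regions were hit. A minimal sketch, assuming the failure regions are disjoint, that a detected fault is removed perfectly, and hypothetical values for the q_i:

    #include <stdio.h>

    #define NREGIONS 3

    /* Hypothetical per-region failure probabilities q_i. */
    static const double q[NREGIONS] = {0.0005, 0.00001, 0.002};

    /* hit[i] = 1 if some test case landed in failure region i, so the
     * corresponding fault was detected and (we assume) removed. */
    static double failure_rate_after_debugging(const int hit[NREGIONS]) {
        double theta = 0.0;
        for (int i = 0; i < NREGIONS; i++)
            if (!hit[i]) theta += q[i];   /* undetected faults still fail */
        return theta;
    }

    int main(void) {
        int hit[NREGIONS] = {1, 0, 1};    /* example: regions 0 and 2 hit */
        printf("theta = %g\n", failure_rate_after_debugging(hit));
        return 0;
    }

Because different test sets hit different regions, theta varies from test set to test set; that variation is exactly the random variable Θ the slide refers to.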

Slide 9: Example [worked example not captured in transcript]

Slide 10: Testing Criteria Considered
– Various coverage levels of:
  – decision coverage (branch testing)
  – def-use coverage (all-uses data flow testing)
  – grouped into quartiles and deciles
– Random testing with no coverage criterion

Slide 11: Questions Investigated
How do test sets that achieve high coverage levels (of branch testing or data flow testing) compare to those achieving lower coverage, according to:
– expected improvement in reliability, measured by the expected delivered failure rate E[Θ]
– probability of reaching a given reliability target, P(Θ ≤ θ₀)
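Given simulated samples of Θ for one coverage group, both measures reduce to simple sample statistics. A minimal sketch with invented Θ values and an invented target θ₀:

    #include <stdio.h>

    int main(void) {
        /* Hypothetical simulated theta values for one coverage group. */
        double theta[] = {0.0021, 0.0004, 0.0015, 0.0000, 0.0009};
        int n = sizeof theta / sizeof theta[0];
        double target = 0.001;            /* hypothetical theta_0 */

        double sum = 0.0;
        int below = 0;
        for (int i = 0; i < n; i++) {
            sum += theta[i];              /* for the sample mean E[theta] */
            if (theta[i] <= target) below++;  /* for P(theta <= theta_0) */
        }
        printf("E[theta]           ~= %g\n", sum / n);
        printf("P(theta <= %.4f)   ~= %g\n", target, (double)below / n);
        return 0;
    }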

Slide 12: Subject Program
– "Space": a 10,000+ LOC C antenna design program, written by professional programmers, containing naturally occurring faults
– Test generator produces tests according to the operational distribution [Pasquini et al.]
– Considered 10 relatively hard-to-detect faults
– Failure rate: 0.05564

Slide 13: Experiment Design
– Adapted from a design used to compare the probability of detecting at least one fault [Frankl, Weiss, et al.]
– Simulate the execution of a very large number of fixed-size test sets
– For each test set, note the coverage achieved (branch, data flow) and the faults detected
– Compute the density function of Θ for the various coverage-level groups (a sketch of the loop follows)
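A minimal sketch of that simulation loop, assuming precomputed covers and detects matrices like those in the diagram on the next slide; all sizes and contents here are hypothetical placeholders, not the study's data:

    #include <stdio.h>
    #include <stdlib.h>

    #define POOL    1000   /* hypothetical test-pool size            */
    #define NBRANCH 100    /* hypothetical number of branches        */
    #define NFAULTS 10     /* faults, as in the study                */
    #define SETSIZE 50     /* test-set size used in the experiment   */
    #define TRIALS  10000  /* number of simulated test sets          */

    int covers[POOL][NBRANCH];   /* filled elsewhere (coverage matrix) */
    int detects[POOL][NFAULTS];  /* filled elsewhere (results matrix)  */
    double q[NFAULTS];           /* per-fault failure probabilities    */

    int main(void) {
        srand(7);
        for (int t = 0; t < TRIALS; t++) {
            int branch_hit[NBRANCH] = {0}, fault_hit[NFAULTS] = {0};
            for (int k = 0; k < SETSIZE; k++) {     /* draw one test set */
                int tc = rand() % POOL;
                for (int b = 0; b < NBRANCH; b++) branch_hit[b] |= covers[tc][b];
                for (int f = 0; f < NFAULTS; f++)  fault_hit[f]  |= detects[tc][f];
            }
            int covered = 0;
            for (int b = 0; b < NBRANCH; b++) covered += branch_hit[b];
            double coverage = (double)covered / NBRANCH;
            double theta = 0.0;                      /* post-debug rate  */
            for (int f = 0; f < NFAULTS; f++)
                if (!fault_hit[f]) theta += q[f];
            /* Record (coverage, theta); afterwards, group the test sets
             * by coverage quartile/decile and estimate the density of
             * theta within each group. */
            printf("%f %g\n", coverage, theta);
        }
        return 0;
    }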

Slide 14: [Diagram of the experiment's data: a coverage matrix (test cases × features) reduced to coverage levels; a results matrix (test cases × faults) combined with the fault-sets into a fault-detection matrix; a failure-rate vector over the fault-sets]

Slide 15: Coverage Levels
Considered the following groups of test sets, for test sets of size 50:
– highest decile of decision coverage
– highest decile of def-use coverage
– four quartiles of decision coverage
– four quartiles of def-use coverage

Slide 16: Expected Values [chart not captured in transcript]

Slide 17: Tail Probabilities [chart not captured in transcript]

Slides 18–20: [results charts not captured in transcript]

Slide 21: Idealized Test Generation Strategy (sketched below)
– Select one test case from each subdomain (independently, at random)
– Widely studied analytically
– Results in very large test sets for this subject:
  – decision coverage: 995 test cases
  – def-use coverage: 4296 test cases
– Compared to large random test sets
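A minimal sketch of the idealized strategy, assuming an integer input domain partitioned into equal-width subdomains; the partition here is invented, whereas in the study the subdomains are induced by the coverage criteria:

    #include <stdio.h>
    #include <stdlib.h>

    #define DOMAIN      10000  /* hypothetical integer input domain [0, DOMAIN) */
    #define NSUBDOMAINS 100    /* invented partition: 100 equal-width slices    */

    /* Draw one input uniformly from subdomain s (slice s of the domain). */
    static int pick_from(int s) {
        int width = DOMAIN / NSUBDOMAINS;
        return s * width + rand() % width;
    }

    int main(void) {
        srand(1);
        int tests[NSUBDOMAINS];
        /* Idealized strategy: exactly one independently chosen test case
         * per subdomain, so the test-set size equals the subdomain count
         * (995 for decision coverage, 4296 for def-use in the study). */
        for (int s = 0; s < NSUBDOMAINS; s++)
            tests[s] = pick_from(s);
        for (int s = 0; s < NSUBDOMAINS; s++)
            printf("%d\n", tests[s]);
        return 0;
    }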

Slide 22: Expected Values [chart not captured in transcript]

Slide 23: Tail Probabilities [chart not captured in transcript]

Slide 24: Threats to Validity
– Single program
– Dependence on the programmers' characterization of the faults
– Dependence on the test-case universe, which was based on the operational distribution
– Single test-set size (50)
– Accurate estimates of the expected value, but less accuracy in the estimates of the density function

Slide 25: Conclusions
Positive:
– higher decision coverage yields a lower expected failure rate
– higher def-use coverage yields a lower expected failure rate
– higher coverage increases the likelihood of reaching a high reliability target (a low failure-rate target)

Slide 26: Conclusions (continued)
Negative:
– reliability gains with increased coverage are modest
  – cost-effectiveness is questionable
  – the economic significance of the increases depends on context
– no silver bullet for ultra-reliability

