Comparison of Delivered Reliability of Branch, Data Flow, and Operational Testing: A Case Study. Phyllis G. Frankl and Yuetang Deng, Polytechnic University. ISSTA 2000 (8/23/00).

Comparison of Delivered Reliability of Branch, Data Flow, and Operational Testing: A Case Study
Phyllis G. Frankl, Yuetang Deng
Polytechnic University, Brooklyn, NY

Outline
- Measures of test effectiveness
- Delivered reliability
- Experiment design
- Subject program
- Results
- Threats to validity
- Conclusions

Measures of Test Effectiveness
- Probability of detecting at least one fault [DN84, HT90, FWey93, FWei93, ...]
- Expected number of failures during test [FWey93, CY96]
- Number of faults detected [HFGO94]
- Delivered reliability [FHLS98]

[Flowchart: testing process with an adequacy-based stopping rule. Steps: select test cases, execute test cases, check results, debug program, check test data adequacy; if not OK, continue testing, otherwise release the program.]

[Flowchart: the same testing process with a reliability-based stopping rule. Steps: select test cases, execute test cases, check results, debug program, estimate reliability; if not OK, continue testing, otherwise release the program.]

Delivered Reliability
- Captures intuition that discovery and removal of “important” faults is more crucial
- Evaluates testing technique according to the extent to which testing will increase reliability
- Introduced and studied analytically: FHLS (FSE-97, TSE-98)

Failures, Faults, and Failure Regions

    int foo(int x, int y) {
        s1; s2;
        if (c1) { s3; s4; }
        s5; s6;
    }

- q_i = probability that an input selected according to the operational distribution will hit failure region i

Failure Rate After Testing/Debugging
- Reliability after testing and debugging is determined by which failure regions are hit by test cases
- A random variable represents the failure rate after testing and debugging
- Compare testing techniques by comparing statistics of these failure-rate random variables
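
The formula on this slide did not survive in the transcript; the following is a minimal sketch of the delivered-failure-rate model suggested by this and the preceding slide, with notation (\(\Theta\), \(q_i\), \(X_i\)) chosen here for illustration and under the simplifying assumptions that failure regions are disjoint and that detected faults are fixed perfectly:

    \[
      \Theta(T) \;=\; \sum_i q_i \, X_i(T), \qquad
      X_i(T) =
      \begin{cases}
        1 & \text{if no test case in } T \text{ hits failure region } i,\\
        0 & \text{otherwise,}
      \end{cases}
    \]

so the delivered failure rate after running test set T and removing every fault it exposes is the total operational probability of the failure regions that T missed.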

Example [figure not captured in transcript]
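
The example figure from this slide is not in the transcript; a small hypothetical illustration of the model above, with invented numbers:

    \[
      q_1 = 10^{-2},\; q_2 = 10^{-3}: \quad
      \Theta =
      \begin{cases}
        1.1 \times 10^{-2} & \text{if the test set hits neither failure region},\\
        10^{-3}            & \text{if it hits only region 1},\\
        10^{-2}            & \text{if it hits only region 2},\\
        0                  & \text{if it hits both.}
      \end{cases}
    \]

A technique that tends to hit the larger region 1 therefore delivers a much lower failure rate than one that hits only region 2, even though each removes a single fault.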

Testing Criteria Considered
- Various levels of coverage of:
  - decision coverage (branch testing)
  - def-use coverage (all-uses data flow testing)
  - grouped into quartiles and deciles
- Random testing with no coverage criterion
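
For readers unfamiliar with the two criteria, a small illustrative C fragment (not taken from the subject program) showing what each requires:

    /* Illustration only; not from the "Space" subject program. */
    int clamp(int x)
    {
        int y = x;      /* definition d1 of y                                    */
        if (y > 10)     /* decision: branch coverage requires executing both the */
                        /* true and the false outcome of this predicate          */
            y = 10;     /* definition d2 of y                                    */
        return y;       /* use of y: all-uses coverage additionally requires     */
                        /* exercising each def-use pair, e.g. (d1, return) via   */
                        /* the false branch and (d2, return) via the true branch */
    }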

Questions Investigated
- How do test sets that achieve high coverage levels (of branch testing or data flow testing) compare to those achieving lower coverage, according to:
  - expected improvement in reliability (expected delivered failure rate)
  - probability of reaching a given reliability target (a given failure-rate target)
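
The formulas for the two measures are not in the transcript; in the notation sketched above they presumably amount to comparing, across coverage groups \(C\),

    \[
      E[\,\Theta \mid \text{coverage} \in C\,]
      \qquad \text{and} \qquad
      P(\,\Theta \le \theta^{*} \mid \text{coverage} \in C\,)
    \]

for a chosen failure-rate target \(\theta^{*}\), corresponding to the "Expected Values" and "Tail Probabilities" results later in the talk.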

Subject Program “Space”
- 10,000+ LOC C antenna design program, written by professional programmers, containing naturally occurring faults
- Test generator generates tests according to operational distribution [Pasquini et al.]
- Considered 10 relatively hard-to-detect faults
- Failure rate: [values not captured in transcript]

Experiment Design
- Adapted from a design used to compare the probability of detecting at least one fault [Frankl, Weiss, et al.]
- Simulate execution of a very large number of fixed-size test sets
- For each test set, note coverage achieved (branch, data flow) and faults detected
- Compute the density function of the delivered failure rate for various coverage-level groups

[Diagram: experiment data. A coverage matrix (test cases × features), a results matrix (test cases × faults), a fault-detection matrix and a failure-rate vector (indexed by fault-sets), and the coverage levels derived from the coverage matrix.]
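
A hedged sketch, in C, of how the simulation described on the preceding two slides could be organized around these matrices; every name, size, and constant except the test-set size of 50 is an assumption made for illustration, not the authors' code:

    /* Sketch of the Monte-Carlo experiment; illustrative only.              */
    /* The matrices are assumed to be precomputed by running every test case */
    /* in the universe against the instrumented program and its faulty       */
    /* versions.  Sizes other than SET_SIZE are invented placeholders.       */
    #include <stdlib.h>

    #define UNIVERSE_SIZE 10000   /* test cases available (placeholder)        */
    #define N_FAULTS      10      /* hard-to-detect faults in the study        */
    #define N_FEATURES    1000    /* branches or def-use pairs (placeholder)   */
    #define SET_SIZE      50      /* fixed test-set size used in the study     */

    static unsigned char coverage[UNIVERSE_SIZE][N_FEATURES]; /* 1 = test covers feature */
    static unsigned char results[UNIVERSE_SIZE][N_FAULTS];    /* 1 = test detects fault  */
    static double q[N_FAULTS];    /* operational failure rate of each fault's region     */

    /* Draw one test set; return its delivered failure rate and coverage level. */
    static double simulate_one_set(double *coverage_level)
    {
        unsigned char detected[N_FAULTS] = {0};
        unsigned char covered[N_FEATURES] = {0};

        for (int k = 0; k < SET_SIZE; k++) {
            /* The universe was generated from the operational distribution,   */
            /* so uniform sampling over it approximates operational selection. */
            int t = rand() % UNIVERSE_SIZE;
            for (int f = 0; f < N_FAULTS; f++)   detected[f] |= results[t][f];
            for (int c = 0; c < N_FEATURES; c++) covered[c]  |= coverage[t][c];
        }

        int n_covered = 0;
        for (int c = 0; c < N_FEATURES; c++) n_covered += covered[c];
        *coverage_level = (double)n_covered / N_FEATURES;

        double theta = 0.0;                 /* delivered failure rate: sum of q_i */
        for (int f = 0; f < N_FAULTS; f++)  /* over the faults this set missed    */
            if (!detected[f]) theta += q[f];
        return theta;
    }

Repeating this many times and binning the samples by coverage level gives empirical estimates of the expected values and tail probabilities reported on the following slides.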

Coverage Levels
- Considered the following groups of test sets, for test sets of size 50:
  - highest decile of decision coverage
  - highest decile of def-use coverage
  - four quartiles of decision coverage
  - four quartiles of def-use coverage

Expected Values [chart not captured in transcript]

Tail Probabilities [chart not captured in transcript]

Idealized Test Generation Strategy
- Select one test case from each subdomain (independently, randomly)
- Widely studied analytically
- Results in very large test sets for this subject:
  - decision coverage: 995
  - def-use coverage: 4296
- Compared to large random test sets
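
A minimal sketch of this idealized strategy in the same illustrative setting as the simulation code above (one randomly chosen covering test case per feature/subdomain; the names and the handling of duplicates are assumptions):

    /* One test case per feature (subdomain), chosen independently at random. */
    /* Reuses the illustrative coverage matrix declared in the earlier sketch. */
    static int build_idealized_set(int *out /* capacity N_FEATURES */)
    {
        int n = 0;
        for (int c = 0; c < N_FEATURES; c++) {
            int candidates[UNIVERSE_SIZE], m = 0;
            for (int t = 0; t < UNIVERSE_SIZE; t++)
                if (coverage[t][c]) candidates[m++] = t;
            if (m > 0)                       /* skip infeasible features */
                out[n++] = candidates[rand() % m];
        }
        return n;   /* roughly one case per feasible feature, hence the  */
    }               /* large test sets (995 / 4296) reported above       */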

Expected Values [chart not captured in transcript]

Tail Probabilities [chart not captured in transcript]

Threats to Validity
- Single program
- Dependence on programmers’ characterization of the faults
- Dependence on universe
- Universe based on operational distribution
- Single test set size (50)
- Accurate estimates of expected value, but less accuracy in estimates of density function

Conclusions
- Positive:
  - higher decision coverage yields lower expected failure rate
  - higher def-use coverage yields lower expected failure rate
  - higher coverage increases likelihood of reaching a high reliability target (low failure-rate target)

Conclusions (continued)
- Negative:
  - reliability gains with increased coverage are modest
    - cost-effectiveness questionable
    - economic significance of increases depends on context
  - no silver bullet for ultra-reliability