An Empirical Study on Testing and Fault Tolerance for Software Reliability Engineering Michael R. Lyu, Zubin Huang, Sam Sze, Xia Cai The Chinese University of Hong Kong

Outline
Introduction
Motivation
Project Descriptions and Experimental Procedure
Static Analysis of Mutants: Fault Classification and Distribution
Dynamic Analysis of Mutants: Effects on Software Testing and Fault Tolerance
Software Testing using Domain Analysis
Conclusion

Introduction
Fault removal and fault tolerance are two major approaches in software reliability engineering.
Software testing is the main fault removal technique:
– Data flow coverage testing
– Mutation testing
The main fault tolerance technique is software design diversity:
– Recovery blocks
– N-version programming
– N self-checking programming
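
To make the fault tolerance side concrete, the sketch below shows the core of N-version programming: several independently developed versions compute the same result and a majority voter adjudicates. This is a minimal illustration, not the study's setup; the functions version_a, version_b, version_c and nvp_vote are hypothetical placeholders.

```python
from collections import Counter

def version_a(x):
    # Placeholder for one independently developed program version.
    return x * x

def version_b(x):
    # Placeholder for a second version of the same specification.
    return x ** 2

def version_c(x):
    # Placeholder for a version carrying a fault for negative inputs.
    return x * x if x >= 0 else -(x * x)

def nvp_vote(versions, x):
    """Run every version on the same input and return the majority answer.

    If no majority exists, the fault cannot be masked and an error is raised.
    """
    results = [run(x) for run in versions]
    answer, votes = Counter(results).most_common(1)[0]
    if votes > len(versions) // 2:
        return answer
    raise RuntimeError("no majority agreement among versions")

if __name__ == "__main__":
    # The faulty version_c is outvoted by the two correct versions.
    print(nvp_vote([version_a, version_b, version_c], -3))  # prints 9
```

The scheme masks a fault only when failing versions do not fail together, which is why the failure correlation discussed on the next slide matters.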

Introduction
Conclusive evidence about the relationship between test coverage and software reliability is still lacking.
Mutants with hypothetical faults are either too easily killed or too hard to activate.
The effectiveness of design diversity depends heavily on the failure correlation among the multiple program versions, which remains a debatable research issue.

Motivation
The lack of real-world project data for investigation of software testing and fault tolerance techniques
The lack of comprehensive analysis and evaluation of software testing and fault tolerance together

Our Contribution
Conduct a real-world project engaging multiple teams to independently develop program versions
Perform detailed experimentation to study the nature, source, type, detectability and effect of faults uncovered in the versions
Apply mutation testing with real faults and investigate data flow coverage, mutation coverage, and design diversity for fault coverage
Examine different hypotheses on software testing and fault tolerance schemes
Employ a new software test case generation technique based on a domain analysis approach and evaluate its effectiveness

Project descriptions
In the spring of 2002, 34 teams were formed to develop a critical industry application in a 12-week project in a software engineering course.
Each team was composed of 4 senior-level undergraduate computer science students from the Chinese University of Hong Kong.

Project descriptions
The RSDIMU project
– Redundant Strapped-Down Inertial Measurement Unit
(Figure: RSDIMU System Data Flow Diagram)

Software development procedure
1. Initial design document (3 weeks)
2. Final design document (3 weeks)
3. Initial code (1.5 weeks)
4. Code passing unit test (2 weeks)
5. Code passing integration test (1 week)
6. Code passing acceptance test (1.5 weeks)

Program metrics
(Table: per-version program metrics with columns Id, Lines, Modules, Functions, Blocks, Decisions, C-Use, P-Use and Mutants, plus an Average row; the per-version figures are not recoverable from this transcript. Total mutants: 426)

Mutant creation
Revision control was applied in the project and code changes were analyzed.
Faults found during each stage were also identified and injected into the final program of each version to create mutants.
Each mutant contains one design or programming fault.
426 mutants were created for 21 program versions.
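
A minimal sketch of how one-fault mutants could be reconstructed from recorded code changes. The RecordedFault structure, the snippet-based patching, and the file layout are illustrative assumptions, not the project's actual tooling.

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class RecordedFault:
    fault_id: str     # label taken from the revision-control analysis (hypothetical)
    fixed_code: str   # snippet as it appears in the final, corrected program
    faulty_code: str  # snippet as it appeared before the fault was fixed

def create_mutant(final_source: str, fault: RecordedFault) -> str:
    """Re-inject exactly one recorded fault into the final program text."""
    if fault.fixed_code not in final_source:
        raise ValueError(f"snippet for {fault.fault_id} not found in final source")
    # Replace only the first occurrence so the mutant differs by a single fault.
    return final_source.replace(fault.fixed_code, fault.faulty_code, 1)

def write_mutants(final_file: Path, faults: list, out_dir: Path) -> None:
    """Write one mutant source file per recorded fault."""
    source = final_file.read_text()
    out_dir.mkdir(parents=True, exist_ok=True)
    for fault in faults:
        mutant = create_mutant(source, fault)
        (out_dir / f"mutant_{fault.fault_id}.c").write_text(mutant)
```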

Setup of evaluation test
The ATAC tool was employed to analyze and compare testing coverage.
1200 test cases were exercised on 426 mutants.
All the resulting failures from each mutant were analyzed, their coverage measured, and cross-mutant failure results compared.
60 Sun machines running Solaris were involved in the test; one test cycle took 30 hours and generated a total of 1.6 million files, around 20 GB.
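
The cross-mutant comparison can be pictured as building one success/failure bit string per mutant over the whole test suite. The harness below is a simplified sketch under the assumption that each mutant is an executable taking a test-case file and that a reference version supplies the expected output; the paths and helper names are hypothetical.

```python
import subprocess
from pathlib import Path

def run_program(executable: Path, test_case: Path) -> str:
    """Run one program (a mutant or the reference version) on one test case."""
    done = subprocess.run([str(executable), str(test_case)],
                          capture_output=True, text=True, timeout=60)
    return done.stdout

def failure_string(mutant: Path, reference: Path, test_cases: list) -> str:
    """Build a success/failure bit string for a mutant over the whole test suite.

    '1' marks a test case on which the mutant's output differs from the reference.
    """
    bits = []
    for case in test_cases:
        bits.append("1" if run_program(mutant, case) != run_program(reference, case) else "0")
    return "".join(bits)

# Usage sketch (paths are hypothetical): one 1200-character string per mutant,
# later compared across mutants and against the measured coverage.
# cases = sorted(Path("tests").glob("*.in"))
# strings = {m.name: failure_string(m, Path("reference"), cases)
#            for m in Path("mutants").glob("mutant_*")}
```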

Static analysis: fault classification and distribution
Mutant defect type distribution
Mutant qualifier distribution
Mutant severity distribution
Fault distribution over development stage
Mutant effect code lines

Static Analysis result (1)
Defect Type Distribution:
  Assign/Init: 136 (31%)
  Function/Class/Object: 144 (33%)
  Algorithm/Method: 81 (19%)
  Checking: 60 (14%)
  Interface/OO Messages: 5 (1%)
Qualifier Distribution:
  Incorrect: 267 (63%)
  Missing: 141 (33%)
  Extraneous: 18 (4%)

Static Analysis result (2)
Severity Distribution (Highest Severity / First Failure Severity):
  A Level (Critical): 12 (2.8%) / 3 (0.7%)
  B Level (High): [not recovered] / [not recovered]
  C Level (Low): [not recovered] / 99 (23.2%)
  D Level (Zero): [not recovered] / 7 (1.6%)

Static Analysis result (3)
Development Stage Distribution:
  Init Code: [not recovered]
  Unit Test: [not recovered]
  Integration Test: 31 (7.3%)
  Acceptance Test: 38 (8.9%)
Fault Effect Code Lines:
  1 line: [not recovered]
  2-5 lines: [not recovered]
  6-10 lines: [not recovered]
  [range not recovered]: [not recovered]
  [range not recovered]: [not recovered]
  >51 lines: 23 (5.40%)
  Average: 11.39 lines

Dynamic analysis of mutants
Software testing related:
– Effectiveness of code coverage
– Test case contribution: test coverage vs. mutant coverage
– Finding a non-redundant set of test cases
Software fault tolerance related:
– Relationship between mutants
– Relationship between the programs with mutants

Test case description
Case ID: Description of the test cases
  1: A fundamental test case to test basic functions.
  2-7: Test cases checking vote control in different order.
  8: General test case based on test case 1 with different display mode.
  9-19: Test varying valid and boundary display mode.
  [?]: Test cases for lower order bits.
  [?]: Test cases for display and sensor failure.
  [?]: Test random display mode and noise in calibration.
  [?]: Test correct use of variable and sensitivity of the calibration procedure.
  86, [?]: Test on input, noise and edge vector failures.
  [?]: Test various and large angle values.
  [?]: Test cases checking for the minimal sensor noise levels for failure declaration.
  [?]: Test cases with various combinations of sensors failed on input and up to one additional sensor failed in the edge vector test.
  [?]: Random test cases. Initial random seed for 1st 100 cases is 777, for 2nd 100 cases is [?].
  [?]: Random test cases. Initial random seed is [?] for 200 cases.
(Case IDs marked [?] were lost in the transcript.)

Fault Detection Related to Changes of Test Coverage
(Table: per-version fault detection under changes of Blocks, Decisions, C-Use, P-Use, and Any coverage for the 21 program versions; the per-version figures are not reliably recoverable from this transcript.)
Overall: Blocks 131/252 (60.0%), Decisions 145/252 (57.5%), C-Use 137/252 (53.4%), P-Use 152/252 (60.3%), Any 155/252 (61.5%)

Relation between the Number of Mutants and the Effective Percentage of Coverage

Test Case Contribution on Program Coverage

Percentage of Test Case Coverage
            Blocks    Decision   C-Use    P-Use
  Average   45.86%    29.63%     35.86%   25.61%
  Maximum   52.25%    35.15%     41.65%   30.45%
  Minimum   32.42%    18.90%     23.43%   16.77%

Test Case Contributions on Mutants
Average: 248 (58.22%)
Maximum: 334 (78.40%)
Minimum: 163 (38.26%)

Non-redundant Set of Test Cases
Gray: redundant test cases (502/1200)
Black: non-redundant test cases (698/1200)
Reduction: 58.2%
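
One way to identify such a non-redundant subset is a greedy reduction over the mutants each test case kills: repeatedly keep the test case that kills the most not-yet-covered mutants and treat the rest as redundant. The sketch below shows that idea; it is an illustrative heuristic, not necessarily the selection procedure used in the study.

```python
def reduce_test_suite(kills):
    """Greedy reduction of a test suite.

    kills maps a test-case id to the set of mutant ids that the test case kills.
    Returns a smaller list of test-case ids that still kills every killable mutant;
    everything left out is redundant with respect to this mutant set.
    """
    uncovered = set()
    for killed in kills.values():
        uncovered |= killed
    selected = []
    while uncovered:
        # Keep the test case that kills the most still-uncovered mutants.
        best = max(kills, key=lambda case: len(kills[case] & uncovered))
        gained = kills[best] & uncovered
        if not gained:
            break  # no remaining test case kills anything new
        selected.append(best)
        uncovered -= gained
    return selected

# Toy usage: test case 2 is redundant because case 1 already kills mutant "m2".
# reduce_test_suite({1: {"m1", "m2"}, 2: {"m2"}, 3: {"m3"}})  -> [1, 3]
```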

Mutants Relationship
Relationship / Number of pairs / Percentage:
  Related mutants: [not recovered]
  Similar mutants: [not recovered]
  Exact mutants: [not recovered]
Related mutants: two mutants have the same success/failure result on the 1200-bit binary string.
Similar mutants: two mutants have the same binary string and the same erroneous output variables.
Exact mutants: two mutants have the same binary string and the same erroneous output variables, and the erroneous output values are exactly the same.
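
The three definitions translate directly into a pairwise classification over the per-mutant results collected earlier. The sketch below assumes each mutant's outcome is summarized by its failure bit string, its set of erroneous output variables, and their erroneous values; the field names are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MutantOutcome:
    failure_bits: str       # 1200-character success/failure string over the test suite
    wrong_vars: frozenset   # names of the erroneous output variables
    wrong_values: tuple     # the erroneous output values, in a fixed order

def classify_pair(a: MutantOutcome, b: MutantOutcome) -> str:
    """Classify a pair of mutants as exact, similar, related, or unrelated."""
    if a.failure_bits != b.failure_bits:
        return "unrelated"
    if a.wrong_vars != b.wrong_vars:
        return "related"    # same success/failure string only
    if a.wrong_values != b.wrong_values:
        return "similar"    # same string and same erroneous output variables
    return "exact"          # identical erroneous output values as well
```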

Program Versions with Similar Mutants
(Table by version ID; figures not recovered in this transcript.)

Program Versions with Exact Mutants
(Table by version ID; figures not recovered in this transcript.)

Relationship between the Programs with Exact Mutants
Exact Fault Pair 1: Versions 4 and 8
  Module: Display Processor; Stage: Init Code; Defect Type: Assign/Init; Severity: C, C; Qualifier: Missing
Exact Fault Pair 2: Versions 12 and 31
  Module: Calibrate; Stage: Init Code; Defect Type: Algorithm/Method; Severity: B, B; Qualifier: Incorrect
Exact Fault Pair 3: Versions 15 and 33
  Module: Calibrate; Stage: Init Code; Defect Type: Algorithm/Method; Severity: B, B; Qualifier: Missing

Relationship between the Programs with Exact Mutants
Exact Fault Pairs: Versions 4, 15 and 17
  Module: Estimate Vehicle State; Stage: Init Code; Defect Type: Assign/Init, Algorithm/Method; Severity: B, B, B; Qualifier: Incorrect
Exact Fault Pair 7: Versions 31 and 32
  Module: Calibrate; Stage: Unit Test / Acceptance Test; Defect Type: Checking; Severity: B, B; Qualifier: Incorrect

Software Testing using Domain Analysis
A new approach has been proposed to generate test cases based on domain analysis of specifications and programs.
The differences between the functional domain and the operational domain are examined by analyzing the set of boundary conditions.
Test cases are designed by verifying the overlaps of the operational domain and the functional domain, to locate the faults resulting from the discrepancies between these two domains.
90 new test cases were developed, and all 426 mutants can be killed by these test cases.
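
As a minimal sketch of the boundary-condition side of this approach, the code below generates classic boundary values just inside and just outside an input field's domain. The helper and constants are illustrative, not the study's generation tool; the field name linStd and its assumed short int range are taken from the test-case table on the next slide.

```python
def boundary_values(low, high):
    """Classic boundary picks just inside and just outside a closed domain [low, high]."""
    return [low - 1, low, low + 1, high - 1, high, high + 1]

# Applying the idea to a 16-bit signed field such as linStd
# (the short int range is an assumption for illustration).
SHORT_MIN, SHORT_MAX = -32768, 32767
lin_std_cases = boundary_values(SHORT_MIN, SHORT_MAX)

# Inputs that the specification accepts but the program rejects (or vice versa)
# expose discrepancies between the functional and operational domains.
print(lin_std_cases)  # [-32769, -32768, -32767, 32766, 32767, 32768]
```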

Test cases generated by domain analysis
Case ID: Description
  1-6: Modify linStd to short int boundary
  7-16: Set LinFailIn array to short int boundary
  17-25, 27-41, [?]: Set RawLin to boundary
  26, 66, 67-73, 86: Modify offRaw array to boundary
  [?]: Set DisplayMode in [ – ] boundaries
  80-85: Set nsigTolerance to various values
  87-90: Set base = 0, [?], [?], [?], respectively
(Entries marked [?] were lost in the transcript.)

Contribution of Test Cases Generated by Domain Analysis
Average: 183 (42.96%)
Maximum: 223 (52.35%)
Minimum: 139 (32.63%)

Non-redundant Test Set for Test Cases Generated by the Domain Analysis

Observation
Coverage measures and mutation scores cannot be evaluated in isolation, and an effective mechanism to distinguish related faults is critical.
A good test case should be characterized not only by its ability to detect more faults, but also by its ability to detect faults which are not detected by other test cases in the same test set.
Domain analysis is an effective approach to generating test cases.

Observation
The individual fault detection capability of each test case does not represent the overall capability of the test set to cover more faults; the diversity of the test cases is more important.
Design diversity involving multiple program versions can be an effective solution for software reliability engineering, since the portion of program versions with exact faults is very small.
Software fault removal and fault tolerance are complementary rather than competitive, yet the quantitative tradeoff between the two remains a research issue.

Conclusion
We performed an empirical investigation evaluating fault removal and fault tolerance as software reliability engineering techniques.
Mutation testing was applied with real faults.
Static as well as dynamic analysis was performed to evaluate the relationship between fault removal and fault tolerance techniques.
Domain analysis was adopted to generate more powerful test cases.