1 An Empirical Study on Reliability Modeling for Diverse Software Systems Xia Cai and Michael R. Lyu Dept. of Computer Science & Engineering The Chinese University of Hong Kong

2 Outline
- Introduction
- Objectives and previous work
- Analyses and investigations on reliability models for diverse software systems
  - Reliability bounds model by Popov, Strigini, et al.
  - System reliability model by Dugan and Lyu
- Discussion
- Conclusion

3 Introduction
- Design diversity is one of the two main techniques for software fault tolerance
- The rationale of this approach is the expectation that software programs built differently will fail differently
- Reliability models attempt to estimate the probability of coincident failures in multiple versions
- Empirical data are in high demand for evaluating and cross-validating the usefulness and effectiveness of these models

4 Reliability models for design diversity
- Conceptual models
  - Eckhardt and Lee (1985)
    - Variation of difficulty on demand space
    - Positive correlations between version failures
  - Littlewood and Miller (1989)
    - Forced design diversity
    - Possibility of negative correlations
- Structural models
  - Dugan and Lyu (1995)
    - Markov reward model
  - Tomek and Trivedi (1995)
    - Stochastic reward net
- In between
  - Popov, Strigini et al. (2003)
    - Subdomains on demand space
    - Upper/lower bounds for failure probability

5 Our objectives
- To study reliability and fault correlation issues in design diversity by means of mutation testing
- To investigate and compare the prediction performance of different existing reliability models for design diversity

6 Our previous work
- Motivated by the lack of empirical data, we conducted the RSDIMU project in the year
- It took more than 100 students 12 weeks to develop 34 program versions
- 1200 test cases were executed on these program versions
- 426 mutants were generated, each by injecting a single fault identified in the testing phase
- A number of analyses and evaluations were conducted in our previous work

7 Outline
- Introduction
- Objectives and previous work
- Analyses and investigations on reliability models for diverse software systems
  - Reliability bounds model by Popov, Strigini, et al. (PS model)
  - System reliability model by Dugan and Lyu (DL model)
- Discussion
- Conclusion

8 PS Model
- Proposed by P. T. Popov, L. Strigini, J. May and S. Kuball (2003)
- Target: give the upper and "likely" lower bounds for the probability of coincident failures
- Assumptions: knowledge is given on disjoint subdomains S_i of the demand space, i.e.,
  1) the probability P(S_i) of a random demand being drawn from S_i;
  2) the probabilities of failure on demand (pfds) of A and B for demands from S_i, P_{A|S_i} and P_{B|S_i}.

9 PS Model (cont'd)
- Alternative estimates for the probability of failure on demand (pfd) of a 1-out-of-2 system

10 PS Model (cont'd)
- Upper bound of system pfd
- "Likely" lower bound of system pfd, under the assumption of conditional independence
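
The bound formulas on this slide did not survive in the transcript. As a minimal sketch of how the two quantities are commonly written for this kind of subdomain model, the code below assumes that the upper bound is the profile-weighted sum of min(P_{A|S_i}, P_{B|S_i}) and that the "likely" lower bound is the profile-weighted sum of P_{A|S_i} * P_{B|S_i} (conditional independence within each subdomain). The function name `ps_bounds` and the numbers in the example are illustrative and are not taken from the slides.

```python
from typing import Sequence, Tuple


def ps_bounds(
    p_sub: Sequence[float],
    pfd_a: Sequence[float],
    pfd_b: Sequence[float],
) -> Tuple[float, float]:
    """Return (likely_lower, upper) bounds on the pfd of a 1-out-of-2 system.

    p_sub[i] : assumed probability P(S_i) that a demand falls in subdomain S_i
    pfd_a[i] : assumed pfd of version A on demands from S_i
    pfd_b[i] : assumed pfd of version B on demands from S_i
    """
    if not (len(p_sub) == len(pfd_a) == len(pfd_b)):
        raise ValueError("all inputs must have one entry per subdomain")
    # Upper bound: in each subdomain, both versions cannot fail together
    # more often than the better of the two fails.
    upper = sum(p * min(a, b) for p, a, b in zip(p_sub, pfd_a, pfd_b))
    # "Likely" lower bound: failures are treated as independent
    # once the subdomain is known.
    lower = sum(p * a * b for p, a, b in zip(p_sub, pfd_a, pfd_b))
    return lower, upper


# Hypothetical two-subdomain profile, for illustration only.
lower, upper = ps_bounds([0.25, 0.75], [0.2, 0.04], [0.1, 0.08])
print(f"likely lower bound = {lower:.4f}, upper bound = {upper:.4f}")
```

Because a product of two probabilities never exceeds the smaller of them, the lower value returned here can never be above the upper value.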

11 Experimental setup
- Mutants are treated as program versions in our experiment
- 1200 test cases are divided into seven categories by the system status
- The first 800 test cases (manually designed for functionality testing) are used as the qualification test and the other 400 test cases (randomly generated) as the operational test

12 [Diagram of the analysis procedure. Labels: programs passed qualification test; information on subdomains; failure data and demand profile; hypothetical; real; faults in operational test; upper bounds; lower bounds; analysis]

13 Estimation Method
- Since no failure was observed in some subdomains, we use confidence bounds rather than point estimates in our experiment
- One-sided confidence bounds (Bayesian bounds) are computed for the probabilities of failure
- 90% confidence upper bounds as well as lower bounds on the pfds of mutants in subdomains were estimated under all demand profiles
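
The slides do not say which prior was used for the Bayesian bounds. As a hedged sketch, the code below assumes a uniform prior on the pfd, so that after observing `failures` failures in `demands` demands the posterior is Beta(failures + 1, demands - failures + 1). The helper names are hypothetical. The posterior CDF is evaluated through the binomial-sum identity for integer Beta parameters, and the bound is found by bisection, so only the standard library is needed.

```python
from math import comb


def posterior_cdf(x: float, failures: int, demands: int) -> float:
    """CDF at x of Beta(failures + 1, demands - failures + 1).

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a)
    for integer a and b. Suitable for moderate numbers of demands.
    """
    n = demands + 1
    lower_tail = sum(
        comb(n, k) * x**k * (1.0 - x) ** (n - k) for k in range(failures + 1)
    )
    return 1.0 - lower_tail


def pfd_bound(confidence: float, failures: int, demands: int,
              upper: bool = True) -> float:
    """One-sided Bayesian bound on the pfd under the assumed uniform prior."""
    # Upper bound U satisfies P(pfd <= U) = confidence;
    # lower bound L satisfies P(pfd >= L) = confidence.
    target = confidence if upper else 1.0 - confidence
    lo, hi = 0.0, 1.0
    for _ in range(200):  # bisection on the monotone posterior CDF
        mid = (lo + hi) / 2.0
        if posterior_cdf(mid, failures, demands) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical subdomain: 99 demands, no failure observed.
print(pfd_bound(0.90, failures=0, demands=99, upper=True))
print(pfd_bound(0.90, failures=0, demands=99, upper=False))
```

With zero observed failures the upper bound is still strictly positive, which is why a bound is usable in subdomains where a point estimate would be zero.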

14 Bayesian Bounds under DP4
- 90% confidence upper bounds on pfds in subdomains
- 90% confidence lower bounds on pfds in subdomains

15 Upper bounds
- Upper bounds on the joint pfds under all demand profiles

16 Lower bounds
- "Likely" lower bounds on the joint pfds under demand profiles

17 Analysis on upper/lower bounds
Mutant pairs | Failure features | Performance comparison | Covariance in failures | Upper bounds | Lower bounds
(117, 305) | No correlation observed | Fail differently | Positive (DP1), negative (others) | Smaller than min(P_A, P_B) | Larger than P_A*P_B in DP1
(215, 382) | Correlation observed | Mutant 382 performs worse in all subdomains | Always positive | Equal to P_215 | Larger in all DPs
(382, 403) | Correlation observed | Perform differently | Positive (DP1&2), negative (DP3&4) | Smaller than min(P_A, P_B) | Larger in DP1&2
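
The slide does not define how the "covariance in failures" was computed. One plausible reading, sketched below as an assumption, is the covariance of the two per-subdomain pfds taken over the demand profile: Cov = sum_i P(S_i) * P_{A|S_i} * P_{B|S_i} - P_A * P_B, where P_A and P_B are the profile-weighted marginal pfds. Under that reading, a positive covariance means the conditional-independence estimate lies above P_A*P_B, which would agree with the pattern in the table. The function name and inputs are illustrative.

```python
from typing import Sequence


def failure_covariance(
    p_sub: Sequence[float],
    pfd_a: Sequence[float],
    pfd_b: Sequence[float],
) -> float:
    """Covariance of the per-subdomain pfds of A and B over the demand profile.

    Assumed definition: E[pfd_A * pfd_B] - E[pfd_A] * E[pfd_B],
    with expectations weighted by the subdomain probabilities p_sub.
    """
    p_a = sum(p * a for p, a in zip(p_sub, pfd_a))   # marginal pfd of A
    p_b = sum(p * b for p, b in zip(p_sub, pfd_b))   # marginal pfd of B
    joint = sum(p * a * b for p, a, b in zip(p_sub, pfd_a, pfd_b))
    return joint - p_a * p_b


# Hypothetical profiles: versions that fail in different subdomains
# give a negative value, versions that fail in the same one a positive value.
print(failure_covariance([0.5, 0.5], [0.1, 0.0], [0.0, 0.2]))
print(failure_covariance([0.5, 0.5], [0.2, 0.0], [0.2, 0.0]))
```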

18 Discussion
- With our data, the confidence bounds in the PS model are tighter than P_A*P_B and min(P_A, P_B) under most circumstances, except when:
  - One program performs worse than the other in all subdomains
  - Negative covariance holds between the failure probabilities of the two programs
- Difficulties and limitations of the PS model
  - How to divide the demand space into disjoint subdomains
  - The need for thorough knowledge of the probability and performance of all the versions in each subdomain

19 DL Model
- Proposed by Dugan and Lyu (1995)
- 3-level reliability model
  - A Markov model detailing the system structure
  - Two fault trees presenting the causes of failures in the initial configuration and the reconfigured state
- Assumptions
  - Unrelated faults: different erroneous results
  - Related faults: similar erroneous results

20 DL Model
- Example: Reliability model of DRB

21 DL Model (cont'd)
- Fault tree models for 2-, 3-, and 4-version systems

22 Results of DL model with our project data
- The new experimental data are applied to verify the effectiveness and consistency of the DL model
- Six mutants with various failure characteristics are employed in the operational test

23 Results of DL model with our project data
- Failure characteristics for 2-, 3-, and 4-version configurations

24 Results of DL model with our project data
- Summary of parameter values
  - Prob. of related faults between two versions
  - Prob. of unrelated faults
  - Prob. of related faults in all versions
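
To illustrate how parameters of this kind feed a fault tree, here is a deliberately simplified sketch for a 2-version system. It is not the fault tree from the DL model. It assumes that all basic events are independent and that the system fails when the decider fails, when a related fault affects both versions, or when both versions fail through unrelated faults. The names `p_v`, `p_rv`, `p_rall` and `p_d` are hypothetical labels for the probabilities of an unrelated fault, a related fault between two versions, a related fault in all versions, and a decider failure.

```python
from math import prod
from typing import Iterable


def and_gate(probs: Iterable[float]) -> float:
    """Probability that all independent input events occur."""
    return prod(probs)


def or_gate(probs: Iterable[float]) -> float:
    """Probability that at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


def two_version_failure_prob(p_v: float, p_rv: float,
                             p_rall: float, p_d: float) -> float:
    """Top-event probability of the simplified 2-version fault tree."""
    both_unrelated = and_gate([p_v, p_v])  # each version fails on its own
    return or_gate([both_unrelated, p_rv, p_rall, p_d])


# Hypothetical parameter values, for illustration only.
print(two_version_failure_prob(p_v=0.1, p_rv=0.01, p_rall=0.001, p_d=0.0))
```

Even in this toy version the related-fault terms enter the OR gate directly, so they can dominate the result when unrelated faults are rare.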

25 Results of DL model with our project data
- Predicted reliability by different configurations

26 Results of DL model with our project data
- Predicted safety by different configurations

27 Discussion
- Comparing our project with the former project, the reliability and safety performance of DRB, NVP and NSCP shows the consistency of the DL model with respect to our experimental data
- The discrepancy in the first thousands of hours may indicate dependence on operational domains
- The simplified classification of related and unrelated faults needs to be improved by including real-life scenarios
- To achieve more accurate results, information about the correlation between successive executions should be included

28 Comparison of PS & DL Model
Aspect | PS Model | DL Model
Assumptions | The whole demand space can be partitioned into disjoint subdomains; knowledge on subdomains should be given | The faults among program versions can be classified into unrelated faults and related faults
Prerequisite | 1. Probability of subdomains; 2. Failure probabilities of programs on subdomains | 1. Number of unrelated and related faults among versions; 2. Probability of hardware and decider failure
Target system | Specific 1-out-of-2 system configurations | All multi-version system combinations
Measurement objective | Upper and lower bounds for failure probability | Average failure probability
Experimental results | Gives tighter bounds under most circumstances, yet whether they are tight enough needs further investigation | The prediction results agree well with observation, yet may deviate for a specific system

29 Conclusion
- Mutants are employed to investigate the prediction performance of two reliability models
- Advantages, limitations and performance of the PS and DL models are compared
- With our data, the confidence bounds in the PS model are tighter than P_A*P_B and min(P_A, P_B) under most circumstances

30 Conclusion
- With our data, the PS approach is helpful for analyzing the behavior of the versions in each subdomain and for revealing the features of fault correlation among diverse programs
- Our analyses with the DL model of the reliability and safety features of DRB, NVP and NSCP are consistent with the original experiment, although there are crossovers in the first thousands of hours in the reliability curves

31 Future work
- More test cases should be employed for cross-validation of the prediction accuracy of the PS model and the DL model
- Other existing reliability models can be applied for further comparisons with our experimental data

32 Q & A
Thank you!

