Overview Software Quality Assurance Reliability and Availability

1 ECE 453 – CS 447 – SE 465 Software Testing & Quality Assurance Instructor Paulo Alencar

2 Overview
Software Quality Assurance
Reliability and Availability
Software Reliability Models
Calendar Time, Execution Time
Operational Phase
Concurrent Components

3 Software Quality Assurance
Basic definitions:
Software metrics: measures that determine the degree to which each quality characteristic is attained by the product.
Advantages:
- They can support software quality in multiple ways (e.g., software complexity, testing coverage, reliability)
- They can identify problematic areas and bottlenecks in the software process (e.g., performance problems)
- They can help software engineers assess the quality of their own work

4 Software Reliability Basic definitions:
S/W reliability: the probability that the software will not cause a failure for some specified time.
Failure: a divergence from the expected external behavior.
Fault: the cause or representation of an error in the code, i.e., a bug.
Error: a programmer mistake (e.g., a misinterpretation of the specification).

5 Software Reliability
Basic question: how do we estimate the growth in software reliability as its errors are removed?
Major issues:
testing (how much? when to stop?)
field use (how many trained personnel? support staff?)
S/W reliability growth models: observe the past failure history and give an estimate of the future failure behavior; about 40 such models have been proposed.

6 Reliability and Availability
A simple measure of reliability can be given as:
MTBF = MTTF + MTTR
where
MTBF is the mean time between failures
MTTF is the mean time to failure
MTTR is the mean time to repair

7 Reliability and Availability
Availability can be defined as the probability that the system is still operating within requirements at a given point in time, and can be given as:
Availability = MTTF / (MTTF + MTTR) x 100%
Availability is more sensitive to MTTR, which is an indirect measure of the maintainability of the software.
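The arithmetic can be sketched in a few lines; the MTTF and MTTR figures below are assumed for illustration and do not come from the slides:

```python
# Availability = MTTF / (MTTF + MTTR), expressed as a percentage.
# The figures used here (MTTF = 500 hr, MTTR = 2 hr) are hypothetical.
def availability(mttf_hours, mttr_hours):
    return mttf_hours / (mttf_hours + mttr_hours) * 100.0

mtbf = 500.0 + 2.0                 # MTBF = MTTF + MTTR = 502 hr
avail = availability(500.0, 2.0)   # roughly 99.6%
```

Note how a small reduction in MTTR raises availability even when MTTF is unchanged, which is why availability is said to be sensitive to maintainability.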

8 Reliability and Availability
Since different errors in a program do not have the same failure rate, the total error count does not provide a good indication of the reliability of the system. MTBF is perhaps more useful (meaningful) than defects/KLOC, since the user is concerned with failures, not with the total error count.

9 Software Reliability Models
Software reliability models can be classified into many different groups; some of the more prominent (better known) groups include:
Error seeding: estimates the number of errors in a program. Errors are divided into indigenous errors and induced (seeded) errors. The unknown number of indigenous errors is estimated from the number of seeded errors recovered and the ratio of the two error types observed in the testing data.

10 Software Reliability Models
Reliability growth: measures and predicts the improvement of reliability through the testing process, using a growth function to represent the process. The independent variables of the growth function can be time or the number of test cases (testing stages); the dependent variables can be reliability, failure rate, or the cumulative number of errors detected.

11 Software Reliability Models
Nonhomogeneous Poisson process (NHPP): provides an analytical framework for describing the software failure phenomenon during testing. The main issue is to estimate the mean value function of the cumulative number of failures experienced up to a certain point in time. A key example of this approach is the series of Musa models.

12 Software Reliability Models
A typical measure (failures per unit time) is the failure intensity (rate), given as:
λ(τ) = dμ/dτ
where τ is program execution time (CPU time in a time-shared computer, or wall-clock time in an embedded system) and μ is the expected cumulative number of failures.

13 Software Reliability Models
SR growth models are generally "black box": there is no easy way to account for a change in the operational profile.
Operational profile: a description of the input events expected to occur in actual software operation, i.e., how the software will be used in practice.
A consequence is that we cannot easily extrapolate from test behavior to field behavior.

14 Software Reliability Models
Many models have been proposed; perhaps the most prominent are:
Musa Basic model
Musa/Okumoto Logarithmic model
Some models work better than others depending on the application area and operating characteristics: interactive? data-intensive? control-intensive? real-time?

15 Failure Intensity Reduction Concept
λ: failure intensity
λ0: initial failure intensity at the start of execution
μ: mean failures experienced at a given point in time
ν0: total number of failures over infinite time
[Figure: failure intensity λ plotted against mean failures experienced μ for the Basic and Logarithmic models; both curves start at λ0, and the Basic curve reaches zero at μ = ν0.]

16 Basic Assumptions of Musa
Errors in the program are independent and distributed with a constant average occurrence rate.
Execution time between failures is large with respect to instruction execution time.
The potential 'test space' covers its 'use space'.
The set of inputs per run (test or operational) is randomly selected.
All failures are observed.
The error causing a failure is immediately fixed, or else its re-occurrence is not counted again.

17 Musa Basic Model
Failure intensity (FI) is the number of failures per unit time. Assume that the decrement in the failure intensity function (its derivative with respect to the number of expected failures) is constant. This implies that the FI is a function of the average number of failures experienced at any given point in time.
Reference: Musa, Iannino, Okumoto, "Software Reliability: Measurement, Prediction, Application", McGraw-Hill, 1987.

18 Musa Basic Model
λ(μ) = λ0 (1 − μ/ν0)
where:
λ0 is the initial failure intensity at the start of execution
μ is the average (expected) number of failures at any point in time
ν0 is the total number of failures over infinite time
The average number of failures at any point in execution time τ is given as:
μ(τ) = ν0 [1 − exp(−λ0 τ/ν0)]

19 Example: Assume a program will experience 100 failures in infinite time (ν0 = 100). It has now experienced 50 failures, and the initial failure intensity was λ0 = 10 failures/CPU-hr. The current failure intensity is:
λ = λ0 (1 − μ/ν0) = 10 (1 − 50/100) = 5 failures/CPU-hr
The number of failures experienced after 10 CPU-hr is:
μ(10) = 100 [1 − exp(−10 × 10/100)] ≈ 63.2 failures
For 100 CPU-hr:
μ(100) = 100 [1 − exp(−10)] ≈ 100 failures
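The example's arithmetic can be checked against the Basic-model equations with a short sketch (function and variable names are my own):

```python
import math

# Musa Basic model:
#   lambda(mu) = lam0 * (1 - mu/nu0)
#   mu(tau)    = nu0 * (1 - exp(-lam0*tau/nu0))
lam0, nu0 = 10.0, 100.0   # initial intensity (failures/CPU-hr), total failures

def intensity(mu):
    """Failure intensity after mu failures have been experienced."""
    return lam0 * (1.0 - mu / nu0)

def failures(tau):
    """Expected cumulative failures after tau CPU-hr of execution."""
    return nu0 * (1.0 - math.exp(-lam0 * tau / nu0))

current = intensity(50.0)   # 5.0 failures/CPU-hr
mu_10 = failures(10.0)      # ~63.2 failures
mu_100 = failures(100.0)    # ~100 failures (approaches nu0)
```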

20 Logarithmic Model
The decrement per failure (of FI) becomes smaller with each failure experienced (exponential decrease); this makes intuitive sense and is usually observed in practice.
λ(μ) = λ0 exp(−θμ)
where θ is the failure intensity decay parameter.

21 Example: Assume λ0 = 10 failures/CPU-hr, θ = 0.02/failure, and that 50 failures have been experienced. The current failure intensity is:
λ = 10 exp(−0.02 × 50) ≈ 3.68 failures/CPU-hr
At 10 CPU-hr:
μ(10) = (1/0.02) ln(10 × 0.02 × 10 + 1) = 50 ln(3) ≈ 54.9 failures
(smaller than the Basic model's ≈ 63.2)
At 100 CPU-hr:
μ(100) = 50 ln(21) ≈ 152 failures
(more failures than the Basic model, which is bounded by ν0 = 100)
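The same check can be made for the Logarithmic model (a minimal sketch; names are my own):

```python
import math

# Musa/Okumoto Logarithmic model:
#   lambda(mu) = lam0 * exp(-theta*mu)
#   mu(tau)    = (1/theta) * ln(lam0*theta*tau + 1)
lam0, theta = 10.0, 0.02   # failures/CPU-hr, decay parameter per failure

def intensity(mu):
    """Failure intensity after mu failures have been experienced."""
    return lam0 * math.exp(-theta * mu)

def failures(tau):
    """Expected cumulative failures after tau CPU-hr of execution."""
    return (1.0 / theta) * math.log(lam0 * theta * tau + 1.0)

current = intensity(50.0)   # ~3.68 failures/CPU-hr
mu_10 = failures(10.0)      # ~54.9 (Basic model gives ~63.2 here)
mu_100 = failures(100.0)    # ~152, exceeding the Basic model's bound of 100
```

Unlike the Basic model, the cumulative failure count here grows without bound, which is why it eventually overtakes ν0.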

22 Reliability Models
Basic model:
μ(τ) = ν0 [1 − exp(−λ0 τ/ν0)]
λ(τ) = λ0 exp(−λ0 τ/ν0)
Logarithmic model:
μ(τ) = (1/θ) ln(λ0 θ τ + 1)
λ(τ) = λ0 / (λ0 θ τ + 1)
[Figure: λ vs. τ and μ vs. τ for both models; the Basic model's μ asymptotically approaches ν0, while the Logarithmic model's μ eventually exceeds it.]

23 Reliability Models Examples
Example: Assume that a program will experience 100 failures in infinite time. The initial failure intensity was 10 failures/CPU-hr, the present failure intensity is 3.68 failures/CPU-hr, and our objective intensity is 0.000454 failures/CPU-hr. Predict the additional testing time needed to achieve the stated objective.
Answer: We know that λ(τ) = λ0 exp(−λ0 τ/ν0)
At time τ1: λ(τ1) = λ0 exp(−λ0 τ1/ν0) = λp
At time τ2: λ(τ2) = λ0 exp(−λ0 τ2/ν0) = λf
τ2 − τ1 = (ν0/λ0) ln(λp/λf)
ν0 = 100 failures, λ0 = 10 failures/CPU-hr
λp = 3.68 failures/CPU-hr, λf = 0.000454 failures/CPU-hr
Testing time = (τ2 − τ1) = 10 ln(3.68/0.000454) ≈ 90 CPU-hr
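The rearranged Basic-model equation above gives the answer directly (a sketch; names are my own):

```python
import math

# Additional testing time between a present intensity lam_p and an
# objective intensity lam_f under the Basic model:
#   delta_tau = (nu0/lam0) * ln(lam_p/lam_f)
nu0, lam0 = 100.0, 10.0
lam_p, lam_f = 3.68, 0.000454

delta_tau = (nu0 / lam0) * math.log(lam_p / lam_f)   # ~90 CPU-hr
```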

24 Choice of Model Basic Model:
For studies or predictions made before execution and failure data are available
For using studies of faults to determine the effects of a new software engineering technology
When the program size is changing continually or substantially (e.g., during integration)

25 Logarithmic Model
When the system is subjected to highly non-uniform operational profiles.
When high predictive validity is needed early in the execution period: the rapidly changing slope of the failure intensity during the early stages can be better fitted with the logarithmic Poisson model than with the basic model.
Basic idea: use the basic model for pretest studies, estimates, and periods of evolution, switching to the logarithmic model when integration is complete.

26 Calendar Time Component
The calendar time component attempts to relate execution time and calendar time by determining, at any given time, the ratio dt/dτ (calendar time per unit of execution time). The calendar time is obtained by integrating this ratio with respect to execution time. The calendar time is of most concern during the test and repair phases, as well as for predicting the dates at which given failure intensities will be achieved.

27 The calendar time component is based on a debugging process model and takes into account the following:
The resources used in running the program for a given execution time and processing a specified quantity of failures;
The resource quantities available;
The degree to which a resource can be utilized (due to possible bottlenecks) during the period in which it is limiting.

28 At the start of testing there are a large number of failures in short time intervals, and testing must stop to allow faults to be fixed: the failure-correction personnel are the bottleneck. As testing progresses, the intervals between failures become longer and the failure-correction people are no longer fully occupied; the test team becomes the bottleneck, and eventually the computing resources are the limiting factor.

29 Resource Usage
Musa has shown that resource usage is linearly proportional to execution time and mean failures experienced. Let χr represent the usage of resource r; then
χr = θr τ + μr μ
where θr is the resource usage per CPU-hr and μr is the resource usage per failure.

30 The following table from Musa summarizes the typical parameters: for each resource (failure identification personnel, failure correction personnel, and computer time) it gives the usage per CPU-hr (θr), the usage per failure (μr), the quantity available or planned, and the utilization.

31 Example: A test team runs test cases for 10 CPU-hr and identifies 34 failures; the effort per hour of execution time is 5 person-hr, and each failure requires on average 2 person-hr to identify and verify. The total failure identification effort required (using the previous formula) is:
χ = 5(10) + 2(34) = 118 person-hr
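The resource-usage formula and the slide's numbers can be sketched as follows (names are my own):

```python
# Resource usage: chi_r = theta_r * tau + mu_r * mu
#   theta_r: usage per CPU-hr of execution
#   mu_r:    usage per failure
def resource_usage(theta_r, mu_r, tau, n_failures):
    return theta_r * tau + mu_r * n_failures

# Values from the slide: 5 person-hr/CPU-hr, 2 person-hr/failure,
# 10 CPU-hr of testing, 34 failures identified.
effort = resource_usage(5.0, 2.0, 10.0, 34)   # 118 person-hr
```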

32 The change in resource usage per unit of execution time is calculated as:
dχr/dτ = θr + μr λ
Since the failure intensity λ decreases with execution time, the effort used per hour of execution time also tends to decrease with testing, as expected. In a similar manner, the other calendar time components can also be obtained from the base equation.

33 Operational Phase
Once the software has been released and is operational, with no features added or repairs made between releases, the failure intensity becomes a constant. Both models then reduce to a homogeneous Poisson process with the failure intensity as the parameter. The failures in a given time period follow a Poisson distribution, while the failure intervals follow an exponential distribution.

34 Operational Phase
The reliability R and failure intensity λ are related by:
R(τ) = exp(−λτ)
As expected, the probability of no failures for a given execution time (the reliability) is lower for longer time intervals.
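This relation can be sketched in a few lines; the failure intensity value below is assumed for illustration only:

```python
import math

# Operational-phase reliability for a constant failure intensity lam:
#   R(tau) = exp(-lam * tau)
def reliability(lam, tau):
    return math.exp(-lam * tau)

# Hypothetical constant intensity of 0.05 failures/CPU-hr:
r_short = reliability(0.05, 1.0)    # higher for short intervals
r_long = reliability(0.05, 100.0)   # lower for long intervals
```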

35 Operational Phase
In many cases the operational phase consists of a series of releases, which causes the reliability and failure intensity to become a series of step functions. In these cases, if the releases are frequent and the failure intensity decreases, the step functions can be approximated by one of the previous reliability models.

36 Operational Phase
We can also apply the reliability models directly to reported failures (not counting repetitions), but the model then reflects the case in which failures have been corrected. If failures are corrected in the field, the model is similar to that of the system test phase.

37 System Reliability (Concurrent Components)
Assume that we have q components with constant failure intensities, that their reliabilities are measured over a common calendar time interval, and that all must function correctly for system success. Then the system failure intensity is given by
λ = Σ λk (k = 1, ..., q)
where the λk are the individual component failure intensities.

38 Software reliabilities are usually given in terms of execution time
Before software and hardware reliabilities are combined, the software reliability must be converted. First convert the reliability R of each software component to a failure intensity. Using λ to represent the failure intensity with respect to execution time, we obtain, from R(τ) = exp(−λτ):
λ = −(ln R)/τ
where τ is the execution time period for which the reliability was given.
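The conversion can be sketched as follows; the R and τ values are hypothetical, chosen only to show the round trip:

```python
import math

# Convert a software reliability R, stated over an execution time
# period tau, to a failure intensity: lam = -ln(R) / tau.
def to_intensity(r, tau):
    return -math.log(r) / tau

# Hypothetical component: R = 0.95 over tau = 8 CPU-hr.
lam = to_intensity(0.95, 8.0)   # failures per CPU-hr
# Round trip: exp(-lam * tau) recovers the original R.
```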

39 Now let C be the average utilization by the program of the machine
assuming that it does not vary greatly over the failure intervals, then the failure intensity with respect to clock time t is given by: Note that the average utilization is less than or equal to 1.

40 Once the failure intensities with respect to the reference period of clock time are obtained for all components, the resulting system reliability is given by:
R(t) = exp(−(Σ λk) t)

41 It is also possible to consider the situation where a software component currently running on machine 1, with instruction execution rate r1, is moved to machine 2, with instruction execution rate r2. The failure intensity on machine 2 is given by:
λ2 = (r2/r1) λ1
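A sketch of the rate scaling; the execution rates and intensity below are hypothetical:

```python
# When a component moves from machine 1 (rate r1) to machine 2 (rate r2),
# its failure intensity scales with the instruction execution rate:
#   lam2 = (r2/r1) * lam1
def scaled_intensity(lam1, r1, r2):
    return (r2 / r1) * lam1

# Hypothetical: 0.5 failures/hr on a 200 MIPS machine, moved to 400 MIPS.
lam2 = scaled_intensity(0.5, 200e6, 400e6)   # doubles to 1.0
```

A machine running twice as fast executes twice as many instructions per unit time, so it encounters failures twice as often, even though the per-execution reliability of the code is unchanged.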

