1 ECE 453 – CS 447 – SE 465 Software Testing & Quality Assurance, Lecture 23. Instructor: Paulo Alencar

2 Overview
- Software Quality Metrics
- Black Box Metrics
- White Box Metrics
- Development Estimates
- Maintenance Estimates

3 White Box Metrics
- Linguistic: LOC, Halstead's Software Science
- Structural: McCabe's Cyclomatic Complexity, Information Flow Metric
- Hybrid: Syntactic Interconnection

4 Types of White Box Metrics
- Linguistic metrics: measure properties of the program/specification text without interpretation or ordering of the components.
- Structural metrics: based on structural relations between objects in the program; usually based on properties of control/data flowgraphs (e.g. number of nodes, links, nesting depth), fan-ins and fan-outs of procedures, etc.
- Hybrid metrics: based on a combination (or a function) of linguistic and structural properties of a program.

5 Linguistic Metrics: Lines of code/statements (LOC)
Perhaps the simplest metric: count the number of lines of code (LOC) and use it as a measure of program complexity. It is simple to use: if errors occur at 2% per line, a 5000-line program should have about 100 errors. If it requires, say, 30 tests to find an error, one can then infer the expected number of tests needed per line of code.
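As a rough illustration of this kind of back-of-the-envelope use of LOC (a minimal sketch in Python; the 2%-per-line error rate and 30-tests-per-error figures are only the example values quoted above):

    # Rough LOC-based estimates, using the example rates quoted above.
    def estimate_errors(loc, error_rate_per_line=0.02):
        """Expected number of errors if errors occur at a fixed rate per line."""
        return loc * error_rate_per_line

    def estimate_tests(loc, error_rate_per_line=0.02, tests_per_error=30):
        """Expected number of tests needed to find the expected errors."""
        return estimate_errors(loc, error_rate_per_line) * tests_per_error

    if __name__ == "__main__":
        loc = 5000
        print(estimate_errors(loc))        # about 100 errors
        print(estimate_tests(loc))         # about 3000 tests
        print(estimate_tests(loc) / loc)   # about 0.6 tests per line of code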

6 LOC
Examples of use include:
- productivity: KLOC/person-month
- quality: faults/KLOC
- cost: $$/KLOC
- documentation: doc_pages/KLOC

7 LOC
Various studies indicate:
- error rates ranging from 0.04% to 7% when measured against statement counts;
- LOC is as good as other metrics for small programs;
- LOC is optimistic for bigger programs;
- LOC appears to be rather linear for small programs (<100 lines), but increases non-linearly with program size;
- it correlates well with maintenance costs;
- it is usually better than simple guesses or nothing at all.

8 LOC
One study gives a rough estimate of the average LOC needed to build one function point (FP):

    Language          LOC/FP (average)
    Assembly          300
    COBOL             100
    FORTRAN, Pascal   90
    Ada               70
    OO language       30
    4GL               20
    Code generator    15
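A small sketch of how such a table could be used to turn a function-point estimate into a LOC estimate (the values are the averages listed above; grouping FORTRAN and Pascal under one figure follows the table and is an assumption of this sketch):

    # Average LOC needed to implement one function point, per language
    # (values taken from the table above).
    LOC_PER_FP = {
        "Assembly": 300,
        "COBOL": 100,
        "FORTRAN/Pascal": 90,
        "Ada": 70,
        "OO language": 30,
        "4GL": 20,
        "Code generator": 15,
    }

    def estimate_loc(function_points, language):
        """Rough size estimate: function points times the average LOC/FP."""
        return function_points * LOC_PER_FP[language]

    # Example: a 50-FP system in Ada is roughly 50 * 70 = 3500 LOC.
    print(estimate_loc(50, "Ada"))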

9 Halstead's Software Science Metrics
Halstead [1977] based his "software science" on: common sense, information theory and psychology. His 'theory' is still a matter of much controversy. A basic overview:

10 Halstead's Software Science Metrics
Based on two easily obtained program parameters:
- n1 = the number of distinct operators in the program
- n2 = the number of distinct operands in the program
Paired operators such as "begin...end" and "repeat...until" are usually treated as a single operator; other examples are =, do, if, goto, etc. Operands are things like primitive variables and array variables.
Also define:
- N1 = total count of all operator occurrences in the program
- N2 = total count of all operand occurrences in the program
The program length is then defined by N = N1 + N2.

11 Halstead's Software Science Metrics
The actual Halstead length is then given as N̂ = n1 log2(n1) + n2 log2(n2), which is usually accepted as more appropriate than the simple length N, N being basically a static count of the number of tokens in the program.
The vocabulary of a program is defined as the sum of the numbers of distinct operands and operators: n = n1 + n2.
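A minimal sketch of these length and vocabulary measures, assuming the four basic counts (n1, n2, N1, N2) have already been extracted from the program text:

    import math

    def halstead_length_and_vocabulary(n1, n2, N1, N2):
        """n1/n2: distinct operators/operands; N1/N2: total occurrences."""
        length = N1 + N2                                       # N = N1 + N2
        est_length = n1 * math.log2(n1) + n2 * math.log2(n2)   # N-hat (estimated length)
        vocabulary = n1 + n2                                   # n = n1 + n2
        return length, est_length, vocabulary

    # Example counts for a hypothetical small program:
    print(halstead_length_and_vocabulary(n1=10, n2=7, N1=28, N2=22))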

12 Halstead's Software Science Metrics
Various other measures can also be obtained, for example:
- program volume: V = N log2(n), which varies with the language and represents the information (in bits) needed to specify the program.
- potential volume: V* is the volume of the most succinct program in which the algorithm can be coded. Consider V* = (2 + n2*) log2(2 + n2*), where n2* is the number of input/output parameters (the 2 accounts for the minimal set of operators: the procedure name and a grouping/assignment operator).

13 Halstead's Software Science Metrics
- program level: L = V*/V, which is a measure of the level of abstraction of the formulation of the algorithm. It is also given, as an estimate, by L̂ = (2/n1)(n2/N2).
- program effort: E = V/L, the number of mental discriminations needed to implement the program (the implementation effort); it correlates with the effort needed for maintenance for small programs.
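A sketch of the derived measures under the same assumption that the basic counts are already available; n2_star (the number of input/output parameters) is only needed for the potential volume:

    import math

    def halstead_measures(n1, n2, N1, N2, n2_star=None):
        """Volume, (estimated) level and effort from the basic Halstead counts."""
        N = N1 + N2                          # program length
        n = n1 + n2                          # vocabulary
        volume = N * math.log2(n)            # V = N log2(n)
        level_hat = (2.0 / n1) * (n2 / N2)   # L-hat = (2/n1)(n2/N2)
        effort = volume / level_hat          # E = V / L, using the estimated level
        result = {"volume": volume, "level": level_hat, "effort": effort}
        if n2_star is not None:              # potential volume of the most succinct program
            result["potential_volume"] = (2 + n2_star) * math.log2(2 + n2_star)
        return result

    print(halstead_measures(n1=10, n2=7, N1=28, N2=22, n2_star=3))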

14 Use of Halstead's Metrics
For a motivated programmer who is fluent in a language, the time to generate the preconceived program is T = E/S, where S is a constant (the Stroud number, commonly taken as about 18 elementary mental discriminations per second). Letting n1* = 2 and substituting for L and V we get: T = (n1 N2 N log2(n)) / (2 n2 S).
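Continuing the same sketch, the time estimate simply divides the effort by the Stroud number (18 discriminations per second is the value commonly used with Halstead's metrics; the effort figure would come from the computation above):

    def halstead_time_seconds(effort, stroud=18):
        """T = E / S: mental discriminations divided by discriminations per second."""
        return effort / stroud

    # e.g. an effort of ~3200 discriminations -> roughly 178 seconds
    print(halstead_time_seconds(3200))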

15 Use of Halstead's Metrics
Known weaknesses:
- Call depth is not taken into account: a program with a sequence of 10 successive calls can be rated as more complex than one with 10 nested calls.
- An if-then-else sequence is given the same weight as a loop structure.
- The added complexity of nesting if-then-else constructs or loops is not taken into account, etc.

16 Structural Metrics
- McCabe's Cyclomatic Complexity
- Control Flow Graphs
- Information Flow Metric

17 McCabe's Cyclomatic Complexity
Determines the logical complexity of a graph; typically the graph is a flow graph of a function or procedure, but it can also be a graphical representation of an FSM.
Define:
- independent path: any path that introduces at least one new set of statements or a new condition; it must move along at least one edge that has not been traversed before.
- basis set: a set of independent paths that covers all conditions and statements.

18 McCabe's Cyclomatic Complexity
Example: a set of independent paths which comprise a basis set for the flow graph shown on the slide:
- Path 1: a, c, f
- Path 2: a, d, c, f
- Path 3: a, b, e, f
- Path 4: a, b, e, b, e, f
- Path 5: a, b, e, a, c, f

19 McCabe's Cyclomatic Complexity
Alternate methods to calculate V(G):
- V(G) = #of_predicate_nodes + 1
- V(G) = #of_edges - #of_nodes + 2
Note: in the previous graph, node 'a' has an out-degree of 3, thus it counts as 2 predicate nodes. The previous graph could also have been drawn with the three-way branch at 'a' split into two binary predicate nodes.

20 McCabe’s Cyclomatic Complexity
A predicate node is a node in the graph with 2 or more out going arcs. In the general case, for a collection of C control graphs with k connected components, the complexity is equal to the summation of their complexities. That is
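A small sketch that computes V(G) both ways from a list of directed edges; the example edge list is inferred from the basis paths listed earlier (an assumption, since the graph itself is not reproduced here), and the first function also handles the k-component generalization:

    def cyclomatic_complexity(edges):
        """V(G) = e - n + 2k for a flow graph given as (from, to) edge pairs."""
        nodes = {v for e in edges for v in e}
        # count connected components, treating the graph as undirected (union-find)
        parent = {v: v for v in nodes}
        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]
                v = parent[v]
            return v
        for a, b in edges:
            parent[find(a)] = find(b)
        k = len({find(v) for v in nodes})
        return len(edges) - len(nodes) + 2 * k

    def predicate_count_complexity(edges):
        """V(G) = sum over nodes of (out-degree - 1) plus 1, for a single component."""
        out_degree = {}
        for a, _ in edges:
            out_degree[a] = out_degree.get(a, 0) + 1
        return sum(d - 1 for d in out_degree.values() if d > 1) + 1

    # Edges inferred from the basis paths Path 1..Path 5 above.
    edges = [("a", "c"), ("c", "f"), ("a", "d"), ("d", "c"),
             ("a", "b"), ("b", "e"), ("e", "f"), ("e", "b"), ("e", "a")]
    print(cyclomatic_complexity(edges))       # 9 - 6 + 2 = 5
    print(predicate_count_complexity(edges))  # (2 + 2) + 1 = 5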

21 Information Flow Metric
By Henry and Kafura: it attempts to measure the complexity of the code by measuring the flow of information from one procedure to another, in terms of fan-ins and fan-outs.
- fan-in: the number of local flows into a procedure plus the number of global structures read by the procedure.
- fan-out: the number of local flows from a procedure plus the number of global structures updated by the procedure.

22 Information Flow Metric
Flows represent the information flowing into a procedure via its argument list and flowing out of the procedure via the return values of function calls. Thus, the complexity of a procedure p is given by Cp = (fan-in × fan-out)².
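A minimal sketch of this measure, assuming the fan-in and fan-out counts have already been extracted from the code:

    def information_flow_complexity(fan_in, fan_out):
        """Henry-Kafura complexity of a procedure: (fan-in * fan-out)^2."""
        return (fan_in * fan_out) ** 2

    # e.g. a procedure with fan-in 3 and fan-out 4 scores (3*4)^2 = 144
    print(information_flow_complexity(3, 4))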

23 Hybrid Metrics
An example is Woodfield's Syntactic Interconnection Model, which attempts to relate programming effort to time.
A connection relationship between two modules A and B is a partial ordering between the modules, i.e., to understand the function of module A one must first understand the function of module B; it is denoted A → B.

24 Hybrid Metrics
Three module connections are defined:
- control: an invocation of one module by the other.
- data: one module makes use of a variable modified by another module.
- implicit: a set of assumptions used in one module is also used in another module; if an assumption changes, then all modules using that assumption must be changed.
The number of times a module must be reviewed is defined as its fan-in.

25 Hybrid Metrics
The general form of the measure is given by:
Cb = C1b × (RC^0 + RC^1 + ... + RC^(fan_in - 1))
where:
- Cb is the complexity of module B,
- C1b is the internal complexity of module B's code,
- fan_in is the sum of the control and data connections to module B, and
- RC is a review constant.
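A sketch of this measure, assuming the form given above (a geometric series in the review constant, summed over the fan-in reviews):

    def woodfield_complexity(internal_complexity, fan_in, review_constant=2/3):
        """Cb = C1b * sum over i = 1..fan_in of RC^(i-1)."""
        return internal_complexity * sum(review_constant ** (i - 1)
                                         for i in range(1, fan_in + 1))

    # e.g. internal complexity 100 (any code metric) and a fan-in of 3 reviews:
    # 100 * (1 + 2/3 + 4/9) is roughly 211.1
    print(woodfield_complexity(100, 3))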

26 Hybrid Metrics
The internal complexity can be any code metric, e.g. LOC, Halstead's, McCabe's, etc. Halstead's Program Effort metric was used originally; a review constant of 2/3 was suggested by Halstead.
The previous Information Flow Metric can also be used as a hybrid metric by taking into account the internal complexity C1p of the module in question: Cp = C1p × (fan-in × fan-out)².
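A sketch of this hybrid variant; the internal complexity is supplied by whatever code metric is chosen (LOC, Halstead's E, McCabe's V(G), etc.):

    def hybrid_information_flow(internal_complexity, fan_in, fan_out):
        """Cp = C1p * (fan-in * fan-out)^2, where C1p is any internal code metric."""
        return internal_complexity * (fan_in * fan_out) ** 2

    # e.g. a 120-LOC procedure with fan-in 3 and fan-out 4: 120 * 144 = 17280
    print(hybrid_information_flow(120, 3, 4))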

27 Maintenance Predictions: Case Study 1
The inter-metric results can be seen in Henry and Wake's paper (Table 1, page 137), where the statistical correlations are given between the Length, N, V, E, McCabe, Woodfield, Information-L, Information-E, and Information metrics.
There is a high degree of correlation among the code metrics (they all measure the same aspect of the code), but low correlations among the code metrics, structure metrics and hybrid metrics, since these measure different aspects of the code. These results agree with other studies.

28 Maintenance Predictions
The goal of their study: develop a model using metric values as parameters to predict the number of lines of code (NLC) changed and the total number of changes (NCC) during the maintenance phase of the software life-cycle. The NLC and NCC are the dependent variables for the statistical model while the metric values are the independent variables. They used various statistical analysis techniques (e.g. mean squared error, different curve fitting techniques, etc.) to analyze the experimental data.
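A sketch of how such a model could be fitted with ordinary least squares; the metric matrix and NLC values below are made-up placeholders, not data from the study:

    import numpy as np

    # Each row: metric values for one component, e.g. [E, V(G), INF-E] (placeholders).
    metrics = np.array([[3200.0, 5.0, 144.0],
                        [1500.0, 3.0,  36.0],
                        [4800.0, 8.0, 400.0],
                        [ 900.0, 2.0,   9.0]])
    nlc = np.array([4.0, 1.0, 7.0, 0.0])   # lines changed during maintenance (made up)

    # Add an intercept column and fit NLC = b0 + b1*E + b2*V(G) + b3*INF-E.
    X = np.column_stack([np.ones(len(metrics)), metrics])
    coeffs, *_ = np.linalg.lstsq(X, nlc, rcond=None)
    predicted_nlc = X @ coeffs              # can be used to rank components by maintenance risk
    print(coeffs, predicted_nlc)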

29 Maintenance Predictions
The tables below summarize the overall top candidate models obtained in their research (see their paper for the details).
Best overall NLC models:
- NLC = *E *INF-E
- NLC = *E *INF-L
- NLC = *E *WOOD *INF-E
- NLC = *E
- NLC = *E *V(G) *INF-E
- NLC = *E *V(G)

30 Maintenance Predictions
Best overall NCC models:
- NCC = *E
- NCC = *E *INF-E
- NCC = *E *WOOD
- NCC = *N *V *E
- NCC = *L *V *E

31 Maintenance Predictions
Note: these models are best fits to the software components used in the study and may not necessarily apply universally to all software systems. Intuitively, one would suspect that the models would be sensitive to:
- the application domain (e.g. real-time software vs. data processing, etc.),
- overall system complexity,
- maturity of the product area/team,
- programming language used,
- CASE tools used, etc.

32 Maintenance Predictions: Case Study 1
Some good correlations were found: one procedure had a predicted NLC of 3.86 while the actual number of lines changed was 3; another had a predicted NLC of 0.07 and the number of lines changed was zero. The experimental results look encouraging but need further research.

33 Maintenance Predictions: Case Study 1
The basis for generating the models for NLC and NCC was not to obtain exact predicted values; rather, these values can be used as an ordering criterion to rank the components in order of their likelihood of maintenance. If this is performed before system release, future maintenance could be prevented (or reduced) by changing or redesigning the higher-ranking components.

34 Maintenance Predictions
To properly use these models (or others), the organization would first be required to collect a significant amount of error or maintenance data before the models' results can be properly interpreted.
Models such as these could be used during the coding stage to identify highly maintenance-prone components and to redesign/recode them to improve the expected results. They are also useful during the test phase, to identify those components which appear to require more intensive testing and to help estimate the test effort.

35 Case Study 2
Another interesting case study was performed by Basili et al. Their major results were:
- the development of a predictive model for the software maintenance release process,
- measurement-based lessons learned about the process,
- lessons learned from the establishment of a maintenance improvement program.

36 Case Study 2
Maintenance types considered: error correction, enhancement and adaptation, within the Flight Dynamics Division (FDD) of the NASA Goddard Space Flight Center. The following table illustrates the average distribution of effort across maintenance types:

    Maintenance Activity   Effort
    Enhancement            61%
    Correction             14%
    Adaptation             5%
    Other                  20%

37 Case Study 2
The enhancement activities typically involved more SLOC (Source Lines of Code) than error corrections, which verifies the intuitive notion that error corrections usually result in minor, local changes.
What about the distribution of effort within each maintenance activity? The following effort categories can be examined:
- analysis (examining different implementations and their associated costs),
- isolation (time spent understanding the failure or requested enhancement),
- design (time spent redesigning the system),
- code/unit test, and
- inspection, certification and consulting.

38 Case Study 2
The following table shows the measured distribution of effort for error-correction and enhancement maintenance:

    Effort                                   Error Correction   Enhancement
    Analysis                                 6%                 1%
    Isolation                                26%                20%
    Design                                   -                  27%
    Code/Unit Test                           38%                39%
    Inspection, Certification, Consulting    4%                 13%

39 Case Study 2
An attempt was made to distinguish between the two types of SCRs (Software Change Requests): user-generated and tester-generated SCRs. For example, during the implementation of any release, errors may be introduced by the maintenance work itself; these may be caught by the tests, which generate SCRs that become part of the same release delivery. The SCR count and the SLOC differences between user and tester change requests for 25 releases were given as:

    Origin   SCRs   SLOC
    Tester   35%    3%
    User     65%    97%

40 Case Study 2
By estimating the size of a release, an effort estimate (for enhancement releases) can be obtained from an equation of the form: Effort in hours = constant + (0.36 × SLOC). The constant term indicates a certain amount of overhead from the regression testing and comprehension activities, which tends to be independent of the size of the change.
It appears that if productivity improvement is the main goal, it is better to avoid scheduling small error-correction releases and, rather, to package them with a release of larger enhancements; better still if the enhancements require changes to the same units as the corrections.
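A tiny sketch of this kind of release-level estimate; the fixed overhead is left as an input because its value is not given here:

    def release_effort_hours(sloc, overhead_hours):
        """Enhancement-release effort: fixed overhead plus 0.36 hours per SLOC changed."""
        return overhead_hours + 0.36 * sloc

    # e.g. a 500-SLOC enhancement with a (hypothetical) 40-hour fixed overhead:
    print(release_effort_hours(500, overhead_hours=40))   # 220.0 hours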

