
ECE 453 – CS 447 – SE 465 Software Testing & Quality Assurance Lecture 23 Instructor Paulo Alencar

Overview
- Software Quality Metrics
- Black Box Metrics
- White Box Metrics
- Development Estimates
- Maintenance Estimates

White Box Metrics
- Linguistic: LOC, Halstead's Software Science
- Structural: McCabe's Cyclomatic Complexity, Information Flow Metric
- Hybrid: Syntactic Interconnection

Types of White Box Metrics
- Linguistic Metrics: measure properties of the program/specification text without interpreting or ordering its components.
- Structural Metrics: based on structural relations between objects in the program; usually based on properties of control/data flowgraphs (e.g. number of nodes, links, nesting depth), fan-ins and fan-outs of procedures, etc.
- Hybrid Metrics: based on a combination (or a function) of linguistic and structural properties of a program.

Linguistic Metrics
Lines of code/statements (LOC) is perhaps the simplest metric: count the number of lines of code and use the count as a measure of program complexity.
It is simple to use: if errors occur at a rate of 2% per line, a 5000-line program should have about 100 errors. If it takes 30 tests to find an error, one can then infer the expected number of tests needed per line of code.
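As a rough illustration, the arithmetic above can be packaged as a small Python sketch; the 2% error rate and 30 tests per error are simply this slide's illustrative figures, not general constants.

```python
# Sketch of the LOC-based estimates described above; the error rate and
# tests-per-error figures are the slide's illustrative numbers, not constants.

def expected_errors(loc: int, errors_per_line: float = 0.02) -> float:
    """Estimate the number of latent errors from a raw line count."""
    return loc * errors_per_line

def expected_tests(loc: int, errors_per_line: float = 0.02,
                   tests_per_error: int = 30) -> float:
    """Infer how many tests the program is likely to need overall."""
    return expected_errors(loc, errors_per_line) * tests_per_error

if __name__ == "__main__":
    print(expected_errors(5000))        # ~100 errors for a 5000-line program
    print(expected_tests(5000) / 5000)  # expected tests per line of code
```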

LOC
Examples of use include:
- productivity: KLOC/person-month
- quality: faults/KLOC
- cost: $$/KLOC
- documentation: doc_pages/KLOC
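A minimal sketch of how these KLOC-normalized ratios are formed; the project figures used below are hypothetical.

```python
# Hypothetical project figures, used only to show how the KLOC-normalized
# ratios above are formed.
loc = 12_000            # delivered lines of code
person_months = 10
faults = 84
cost_dollars = 150_000
doc_pages = 360

kloc = loc / 1000
print(f"productivity:  {kloc / person_months:.2f} KLOC/person-month")
print(f"quality:       {faults / kloc:.2f} faults/KLOC")
print(f"cost:          ${cost_dollars / kloc:,.0f} per KLOC")
print(f"documentation: {doc_pages / kloc:.1f} doc pages/KLOC")
```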

LOC
Various studies indicate:
- error rates ranging from 0.04% to 7% when measured against statement counts;
- LOC is as good as other metrics for small programs;
- LOC is optimistic for bigger programs: it appears to be rather linear for small programs (<100 lines), but grows non-linearly with program size;
- it correlates well with maintenance costs;
- it is usually better than simple guesses or nothing at all.

LOC
One study gives a rough estimate of the average LOC needed to build one function point (FP):

Language          LOC/FP (average)
Assembly          300
COBOL             100
FORTRAN, Pascal   90
Ada               70
OO Language       30
4GL               20
Code Generator    15
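A small sketch of "backfiring" a LOC count into a function-point estimate using the averages from the table above; the 10 KLOC example size is hypothetical.

```python
# LOC per function point, taken from the table above (rough averages only).
LOC_PER_FP = {
    "Assembly": 300, "COBOL": 100, "FORTRAN/Pascal": 90, "Ada": 70,
    "OO Language": 30, "4GL": 20, "Code Generator": 15,
}

def estimate_fp(loc: int, language: str) -> float:
    """Rough function-point estimate obtained by 'backfiring' a LOC count."""
    return loc / LOC_PER_FP[language]

# Example: the same 10 KLOC represents far more functionality in a 4GL
# than in COBOL.
print(estimate_fp(10_000, "COBOL"))  # ~100 FP
print(estimate_fp(10_000, "4GL"))    # ~500 FP
```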

Halstead's Software Science Metrics
Halstead [1977] based his "software science" on common sense, information theory and psychology. His 'theory' is still a matter of much controversy. A basic overview follows.

Halstead's Software Science Metrics
Based on two easily obtained program parameters:
- n1 = the number of distinct operators in the program
- n2 = the number of distinct operands in the program
Paired operators such as "begin...end" and "repeat...until" are usually treated as a single operator. Other operator examples: =, do, if, goto, etc. Operands: primitive variables, array variables.
The estimated program length is defined by: N̂ = n1 log2(n1) + n2 log2(n2)
Also define:
- N1 = total count of all operators in the program
- N2 = total count of all operands in the program

Halstead's Software Science Metrics
The actual Halstead length is then given as N = N1 + N2 [usually accepted as more appropriate], which is basically a static count of the number of tokens in the program.
The vocabulary of a program is defined as the sum of the number of distinct operators and operands: n = n1 + n2.
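A minimal sketch of the basic counts, assuming the program has already been tokenized and a (language-specific) operator set has been chosen; real counting rules, such as merging paired operators, are more involved than this.

```python
from collections import Counter

# Minimal sketch: compute N1, N2, n1, n2, length N and vocabulary n from a
# pre-classified token stream.  How tokens are classified (and how paired
# operators like begin...end are merged) is language-specific and assumed here.
OPERATORS = {"=", "+", "*", "if", "<", "while", "(", ")", ";"}

def halstead_counts(tokens):
    ops = Counter(t for t in tokens if t in OPERATORS)
    opnds = Counter(t for t in tokens if t not in OPERATORS)
    N1, N2 = sum(ops.values()), sum(opnds.values())
    n1, n2 = len(ops), len(opnds)
    return {"N1": N1, "N2": N2, "n1": n1, "n2": n2,
            "length N": N1 + N2, "vocabulary n": n1 + n2}

tokens = ["while", "(", "i", "<", "n", ")", "sum", "=", "sum", "+", "i", ";"]
print(halstead_counts(tokens))
```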

Halstead's Software Science Metrics
Various other measures can also be obtained, for example:
- program volume: V = N log2(n); it varies with the language and represents the information (in bits) needed to specify the program.
- potential volume: V* is the volume of the most succinct program in which the algorithm can be coded. Consider V* = (2 + n2*) log2(2 + n2*), where n2* is the number of input/output operands (parameters) and the 2 corresponds to the minimal operator count n1* = 2.

Halstead's Software Science Metrics
- program level: L = V*/V, a measure of the level of abstraction of the formulation of the algorithm. It is also given (as an estimate) by L̂ = (2/n1)(n2/N2).
- program effort: E = V/L, the number of mental discriminations needed to implement the program (the implementation effort); it correlates with the effort needed for maintenance for small programs.

Use of Halstead's Metrics
For a motivated programmer who is fluent in a language, the time to generate the preconceived program is T = E/S, where S is a constant (the Stroud number, about 18 elementary mental discriminations per second). Letting n1* = 2 and substituting for L and V we get: T̂ = n1 N2 N log2(n) / (2 n2 S).
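Continuing the sketch above, the derived measures can be computed from the four counts; S = 18 is the commonly quoted Stroud number, and the level used here is the estimated L̂ with n1* = 2, as on this slide.

```python
import math

# Continues the counting sketch above: derived Halstead measures.
# S = 18 is the commonly quoted Stroud number (mental discriminations/second).
def halstead_derived(n1, n2, N1, N2, S=18):
    N = N1 + N2                      # actual length
    n = n1 + n2                      # vocabulary
    V = N * math.log2(n)             # volume (bits)
    L_hat = (2 / n1) * (n2 / N2)     # estimated program level (n1* = 2)
    E = V / L_hat                    # effort (mental discriminations)
    T = E / S                        # estimated implementation time (seconds)
    return {"volume": V, "level": L_hat, "effort": E, "time_s": T}

# Counts from the small token-stream example above.
print(halstead_derived(n1=7, n2=3, N1=7, N2=5))
```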

Use of Halstead's Metrics
Known weaknesses:
- Call depth is not taken into account: a program with a sequence of 10 successive calls is scored no differently from one with 10 nested calls.
- An if-then-else sequence is given the same weight as a loop structure.
- The added complexity of nesting if-then-else statements or loops is not taken into account, etc.

Structural Metrics
- McCabe's Cyclomatic Complexity
- Control Flow Graphs
- Information Flow Metric

McCabe's Cyclomatic Complexity
Determines the logical complexity of a graph; typically the graph is a flow graph of a function or procedure, but it can also be a graphical representation of an FSM.
Define:
- independent path: any path that introduces at least one new set of statements or a new condition; it must move along at least one edge that has not been traversed before.
- basis set: a set of independent paths that covers all conditions and statements.

McCabe's Cyclomatic Complexity
Example: a set of independent paths that comprise a basis set for the following flow graph:
Path 1: a, c, f
Path 2: a, d, c, f
Path 3: a, b, e, f
Path 4: a, b, e, b, e, f
Path 5: a, b, e, a, c, f

McCabe's Cyclomatic Complexity
Alternate methods to calculate V(G):
- V(G) = #of_predicate_nodes + 1
- V(G) = #of_edges - #of_nodes + 2
Note: in the previous graph, node 'a' has an out-degree of 3, so it counts as 2 predicate nodes. The previous graph could also have been drawn with each multi-way decision split into binary decisions.

McCabe's Cyclomatic Complexity
A predicate node is a node in the graph with 2 or more outgoing arcs.
In the general case, for a collection C of control graphs with k connected components, the complexity is equal to the summation of their complexities. That is, V(C) = sum of V(Gi) for i = 1..k, or equivalently V(C) = #of_edges - #of_nodes + 2k.
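A small sketch computing V(G) both ways; the adjacency list below is an illustrative reconstruction of the example graph, inferred from the five basis paths listed earlier.

```python
# Sketch: V(G) computed two ways for a flow graph given as an adjacency list.
# The edge set is inferred from the five basis paths (a..f) listed earlier,
# so treat it as an illustrative reconstruction of that graph.
graph = {
    "a": ["b", "c", "d"],
    "b": ["e"],
    "c": ["f"],
    "d": ["c"],
    "e": ["a", "b", "f"],
    "f": [],
}

def cyclomatic_complexity(g, components=1):
    """V(G) = #edges - #nodes + 2 * (number of connected components)."""
    edges = sum(len(succ) for succ in g.values())
    nodes = len(g)
    return edges - nodes + 2 * components

def predicate_count(g):
    # A node with out-degree d >= 2 contributes d - 1 predicate nodes.
    return sum(len(s) - 1 for s in g.values() if len(s) >= 2)

print(cyclomatic_complexity(graph))   # 9 - 6 + 2 = 5
print(predicate_count(graph) + 1)     # (2 + 2) + 1 = 5, matching the 5 paths
```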

Information Flow Metric
By Henry and Kafura: attempts to measure the complexity of the code by measuring the flow of information from one procedure to another in terms of fan-ins and fan-outs.
- fan-in: the number of local flows into a procedure plus the number of global structures read by the procedure.
- fan-out: the number of local flows from a procedure plus the number of global structures updated by the procedure.

Information Flow Metric
Flows represent the information flow into a procedure via its argument list and the flows from the procedure due to return values of function calls. Thus, the complexity of a procedure p is given by Cp = (fan-in × fan-out)^2.
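A minimal sketch of the per-procedure computation; the procedure names and fan-in/fan-out values below are hypothetical.

```python
# Minimal sketch of Henry and Kafura's information flow complexity.
# The procedure names and fan-in/fan-out numbers below are hypothetical.
def information_flow_complexity(fan_in: int, fan_out: int) -> int:
    """Cp = (fan-in * fan-out)^2 for a single procedure p."""
    return (fan_in * fan_out) ** 2

procedures = {"parse_input": (3, 2), "update_db": (5, 4), "log_event": (1, 1)}
for name, (fi, fo) in procedures.items():
    print(name, information_flow_complexity(fi, fo))
```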

Hybrid Metrics
An example is Woodfield's Syntactic Interconnection Model, which attempts to relate programming effort to time.
A connection relationship between two modules A and B is a partial ordering between the modules, i.e., to understand the function of module A one must first understand the function of module B; this is denoted as A → B.

Hybrid Metrics
Three module connections are defined:
- control: an invocation of one module by the other.
- data: one module makes use of a variable modified by another module.
- implicit: a set of assumptions used in one module is also used in another module; if an assumption changes, then all modules using that assumption must be changed.
The number of times a module must be reviewed is defined as its fan-in.

Hybrid Metrics
The general form for the measure is Cb = C1b * sum over i = 1..fan_in of RC^(i-1), where:
- Cb is the complexity of module B,
- C1b is the internal complexity of module B's code,
- fan_in is the sum of the control and data connections to module B,
- RC is a review constant.

Hybrid Metrics
The internal complexity can be any code metric, e.g. LOC, Halstead's, McCabe's, etc. Halstead's Program Effort Metric was used originally, and a review constant of 2/3 was suggested by Halstead.
The previous Information Flow Metric can also be used as a hybrid metric by taking into account the internal complexity of the module in question: Cp = C1p × (fan-in × fan-out)^2, where C1p is the internal complexity of procedure p.
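A short sketch of both hybrid forms just described, with LOC standing in for the internal complexity; the Woodfield-style sum of reviews follows the general form given above, and all module figures are hypothetical.

```python
# Sketch of the two hybrid forms above, with LOC standing in for the internal
# complexity C1 of each module.  All module figures are hypothetical.
def woodfield(c1: float, fan_in: int, rc: float = 2/3) -> float:
    """C_B = C1_B * sum over the fan_in reviews of RC^(i-1)."""
    return c1 * sum(rc ** (i - 1) for i in range(1, fan_in + 1))

def hybrid_information_flow(c1: float, fan_in: int, fan_out: int) -> float:
    """Information flow metric weighted by internal complexity:
    Cp = C1p * (fan-in * fan-out)^2."""
    return c1 * (fan_in * fan_out) ** 2

print(woodfield(c1=120, fan_in=4))                       # 120 LOC, 4 reviews
print(hybrid_information_flow(c1=120, fan_in=3, fan_out=2))
```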

Maintenance Predictions: Case Study 1
The inter-metric results can be observed in Henry and Wake's paper (Table 1, page 137), where statistical correlations are given between the Length, N, V, E, McCabe, Woodfield, Information-L, Information-E, and Information metrics.
There is a high degree of correlation among the code metrics (they all measure the same aspect of the code), but low correlations among the code metrics, structure metrics and hybrid metrics, since these measure different aspects of the code. These findings agree with other studies.

Maintenance Predictions
The goal of their study was to develop a model using metric values as parameters to predict the number of lines of code changed (NLC) and the total number of changes (NCC) during the maintenance phase of the software life-cycle. NLC and NCC are the dependent variables for the statistical model, while the metric values are the independent variables. They used various statistical analysis techniques (e.g. mean squared error, different curve-fitting techniques, etc.) to analyze the experimental data.

Maintenance Predictions
The tables below summarize the overall top candidate models obtained in their research (see their paper for the details):
Best overall NLC models
NLC = 0.42997221 + 0.000050156*E - 0.000000199210*INF-E
NLC = 0.45087158 + 0.000049895*E - 0.000173851*INF-L
NLC = 0.60631548 + 0.000050843*E - 0.000029819*WOOD - 0.000000177341*INF-E
NLC = 0.33675906 + 0.000049889*E
NLC = 1.51830192 + 0.000054724*E - 0.10084685*V(G) - 0.000000161798*INF-E
NLC = 1.45518829 + 0.00005456*E - 0.10199539*V(G)

Maintenance Predictions
Best overall NCC models
NCC = 0.34294979 + 0.000011418*E
NCC = 0.36846091 + 0.000011488*E - 0.00000005238*INF-E
NCC = 0.38710077 + 0.000011583*E - 0.0000068966*WOOD
NCC = 0.25250119 + 0.003972857*N - 0.000598677*V + 0.000014538*E
NCC = 0.32020501 + 0.01369264*L - 0.000481846*V + 0.000012304*E
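As an illustration, the two simplest fitted models above can be applied directly. The coefficients are the ones reported in the tables; the Halstead effort value E below is hypothetical, and, as the following slides note, the predictions are best read as a ranking rather than as exact change counts.

```python
# Applying two of the simpler fitted models above.  The coefficients are those
# reported in the study's tables; the Halstead effort value E is hypothetical,
# and the outputs are only meaningful for ranking components.
def predicted_nlc(E: float) -> float:
    return 0.33675906 + 0.000049889 * E

def predicted_ncc(E: float) -> float:
    return 0.34294979 + 0.000011418 * E

E = 70_000  # hypothetical Halstead effort for one procedure
print(f"predicted lines changed (NLC):     {predicted_nlc(E):.2f}")
print(f"predicted number of changes (NCC): {predicted_ncc(E):.2f}")
```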

Maintenance Predictions
Note: these models are best fits to the software components used in the study and may not necessarily apply universally to all software systems. Intuitively, one would suspect that the models would be sensitive to:
- the application domain (e.g. real-time software vs. data processing, etc.),
- overall system complexity,
- maturity of the product area/team,
- the programming language used,
- the CASE tools used, etc.

Maintenance Predictions: Case Study 1
Some good correlations were found: one procedure had a predicted NLC of 3.86, and the actual number of lines changed was 3; another had a predicted NLC of 0.07, and the number of lines changed was zero. The experimental results look encouraging but need further research.

Maintenance Predictions: Case Study 1
The basis for generating the models for NLC and NCC was not to get exact predicted values; rather, these values can be used as an ordering criterion to rank the components in order of likelihood of maintenance. If this is performed before system release, future maintenance could be prevented (or reduced) by changing or redesigning the higher-ranking components.

Maintenance Predictions
To properly use these models (or others), the organization would first be required to collect a significant amount of error or maintenance data before the models' results can be properly interpreted.
Models such as these could be used during the coding stage to identify maintenance-prone components and to redesign/recode them to improve the expected results. They are also useful during the test phase to identify those components which appear to require more intensive testing and to help estimate the test effort.

Case Study 2
Another interesting case study was performed by Basili et al. Their major results were:
- the development of a predictive model for the software maintenance release process,
- measurement-based lessons learned about the process,
- lessons learned from the establishment of a maintenance improvement program.

Case Study 2
Maintenance types considered: error correction, enhancement and adaptation within the Flight Dynamics Division (FDD) of the NASA Goddard Space Flight Center.
The following table shows the average distribution of effort across maintenance types:

Maintenance Activity   Effort
Enhancement            61%
Correction             14%
Adaptation             5%
Other                  20%

Case Study 2
The enhancement activities typically involved more SLOC (Source Lines of Code) than error corrections, which verifies the intuitive notion that error corrections usually result in minor local changes.
What about the distribution of effort within each maintenance activity? The following efforts can be examined:
- analysis (examining different implementations and their associated costs),
- isolation (time spent understanding the failure or requested enhancement),
- design (time spent redesigning the system),
- code/unit test, and
- inspection, certification and consulting.

Case Study 2
The following table shows the measured distribution of effort for error correction and enhancement maintenance:

Error Correction   Effort                                  Enhancement
6%                 Analysis                                1%
26%                Isolation                               20%
–                  Design                                  27%
38%                Code/Unit Test                          39%
4%                 Inspection, Certification, Consulting   13%

Case Study 2
An attempt was made to distinguish between the two types of SCRs (Software Change Requests): user-generated and tester-generated SCRs. For example, during the implementation of any release, errors may be introduced by the maintenance work itself; these may be caught by the tests, which generate SCRs that become part of the same release delivery.
The SCR count and the SLOC differences between user and tester change requests for 25 releases were given as:

SCRs   Origin   SLOC
35%    Tester   3%
65%    User     97%

Case Study 2
By estimating the size of a release, an effort estimate (for enhancement releases) can be obtained from the following equation:
Effort in hours = (0.36 × SLOC) + 1040
This indicates a certain amount of overhead from the regression testing and comprehension activities, which tends to be independent of the size of the change. It appears that if productivity improvement is the main goal, it is better to avoid scheduling small error-correction releases and rather to package them with a release of larger enhancements; better still if the enhancements require changes to the same units as the corrections.
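A small sketch of this effort equation, showing how the fixed 1040-hour term dominates small releases; the release sizes below are hypothetical.

```python
# Release-level effort model from the slide: Effort(hours) = 0.36*SLOC + 1040.
# The fixed 1040-hour term is the size-independent overhead (regression
# testing, comprehension); the release sizes below are hypothetical.
def release_effort_hours(sloc: int) -> float:
    return 0.36 * sloc + 1040

for sloc in (500, 5_000, 50_000):
    hours = release_effort_hours(sloc)
    print(f"{sloc:>6} SLOC -> {hours:>8.0f} h "
          f"(overhead share: {1040 / hours:.0%})")
```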