Software Reliability: The “Physics” of “Failure” SJSU ISE 297 Donald Kerns 7/31/00
Thesis: The field of software engineering is a very complex mix of technology, management and human psychology. Glib usage of the phrase “software reliability” implies gross simplifications that make the measurement useless in all but the most closely defined situations. A significant increase in the sophistication of the general field of software engineering will be necessary before true measures of “software reliability” are meaningful.
Software is not monolithic, yet the literature treats it as such. Different types of software have different failure modes and consequences…
Embedded software
  Does your Furby have bugs?
    –Microwave? Car?
  Well-defined applications
  Little data
  Tightly constrained resources in the delivered product drive...
  High technical complexity
Batch/Database-Driven Software
  Industrial-scale applications
  Highly data intensive
  Usually a low % of functionality is user interaction
  Once running, most apparent “software defects” are actually defective data or business rules
User-Interactive Software
  Usually event driven
  Defects include:
    –broken functionality
    –behavior different from user expectation
    –lack of interoperability with other software
Usage of the word “reliability” implies defects and failure, but the community (much less the literature) has yet to settle on what exactly constitutes a software failure.
  Catastrophic failures
  Functional failures
  Poor performance
  Wrong answers
  Does not conform to user expectations
Most only use the first type (catastrophic failures). All are “failures,” yet the standard reliability measures are at a loss to evaluate the different consequences.
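One way to see why a single failure count says little is to tally the same log by failure type. The sketch below is only an illustration: the enum names and the tally function are hypothetical, chosen to mirror the five failure types listed on this slide, and are not part of any standard reliability measure.

```c
#include <stddef.h>

/* Hypothetical labels mirroring the five failure types on the slide. */
typedef enum {
    FAIL_CATASTROPHIC,
    FAIL_FUNCTIONAL,
    FAIL_PERFORMANCE,
    FAIL_WRONG_ANSWER,
    FAIL_EXPECTATION,
    FAIL_KIND_COUNT      /* number of failure types, not a type itself */
} failure_kind;

/* Count how many logged failures fall into each type.
 * counts must have room for FAIL_KIND_COUNT entries. */
void tally_failures(const failure_kind *log, size_t n,
                    size_t counts[FAIL_KIND_COUNT])
{
    for (size_t k = 0; k < FAIL_KIND_COUNT; ++k)
        counts[k] = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Ignore entries outside the known range rather than index past the array. */
        if ((unsigned)log[i] < (unsigned)FAIL_KIND_COUNT)
            counts[log[i]]++;
    }
}
```

Two logs of three failures each produce the same total, yet one may hold two catastrophic failures and the other none. A measure that reports only the total cannot tell them apart.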
Frequently, software is just the most visible element of a complex system. Almost all system defects start out appearing as software errors.
Example: “Fire Bay #1”
  Normal configuration: 2-satellite bus
  “Cost saving” configuration: Bigger satellite bus
  Separation failure was a software defect only if the software had been modified to fire bay 1 through the bay 2 wiring and didn’t.
More examples
  The system isn’t getting the signal, we must have a software defect!
    –Is the system configured to scan that part of the spectrum?
    –Is the system configured to report the signal during that portion of the target identification?
    –Is the system configured to report signals of that priority?
  The Built In Test software is reporting a failed component, we must have a software defect!
    –No, by reporting a failed component the software is functioning CORRECTLY.
    –It is the component that has a defect.
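The three configuration questions for the missing signal can be read as a triage order: rule out configuration before suspecting the code. A minimal sketch follows, assuming a made-up configuration record. Every type, field, and function name here is hypothetical and does not describe any real system.

```c
#include <stdbool.h>

/* All types and field names are hypothetical, chosen for illustration. */
typedef struct {
    double scan_low_mhz;        /* lower edge of the configured scan band */
    double scan_high_mhz;       /* upper edge of the configured scan band */
    bool   reports_in_phase;    /* reporting enabled in this identification phase? */
    int    min_report_priority; /* signals below this priority are not reported */
} system_config;

typedef struct {
    double freq_mhz;
    int    priority;
} signal_info;

typedef enum {
    CAUSE_NOT_SCANNED,          /* signal lies outside the scanned spectrum */
    CAUSE_PHASE_NOT_REPORTED,   /* reporting is off for this phase */
    CAUSE_PRIORITY_FILTERED,    /* signal priority is below the reporting threshold */
    CAUSE_SUSPECT_SOFTWARE      /* configuration is ruled out; now look at the code */
} triage_result;

/* Ask the three configuration questions in order; blame software last. */
triage_result triage_missing_signal(const system_config *cfg,
                                    const signal_info *sig)
{
    if (sig->freq_mhz < cfg->scan_low_mhz || sig->freq_mhz > cfg->scan_high_mhz)
        return CAUSE_NOT_SCANNED;
    if (!cfg->reports_in_phase)
        return CAUSE_PHASE_NOT_REPORTED;
    if (sig->priority < cfg->min_report_priority)
        return CAUSE_PRIORITY_FILTERED;
    return CAUSE_SUSPECT_SOFTWARE;
}
```

Only the last result points at the software. The first three mean the system is doing exactly what it was configured to do.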
Does software age?
  No, but the behavior of software depends on the environment that it is executing in, and the environment may degrade.
  Changes in environment may reveal failure modes that have lain dormant for the life of the software.
    if (strcmp(compiler, "Visual C++") == 0) do_compile_things();
    else if (strcmp(compiler, "Borland C++") == 0) crash_in_flames();
  Common environment changes:
    –Change in system configuration (OS, hardware, applications).
    –Increased processor loading due to above.
    –Decreased available memory due to above.
    –Increased network traffic due to growth.
    –Intentional or unintentional self-modifying code.
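The compiler check on this slide can be turned into a small compilable sketch that makes the dormant path explicit instead of crashing. The status names and the function are assumptions made for illustration. Note that strcmp() returns 0 when the two strings are equal, so each test compares against 0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical outcomes; the names are illustrative, not from the talk. */
typedef enum {
    ENV_TESTED,     /* the environment the code was written and tested for */
    ENV_DORMANT,    /* a known environment whose code path was never exercised */
    ENV_UNKNOWN     /* an environment nobody anticipated */
} env_status;

/* Classify the environment instead of assuming it never changes. */
env_status check_environment(const char *compiler)
{
    if (compiler == NULL)
        return ENV_UNKNOWN;
    if (strcmp(compiler, "Visual C++") == 0)
        return ENV_TESTED;
    if (strcmp(compiler, "Borland C++") == 0)
        return ENV_DORMANT;   /* the crash_in_flames() branch from the slide */
    return ENV_UNKNOWN;
}
```

The code is identical on the day it ships and on the day it fails. What changes is the argument, which stands in for the environment.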
“Does not meet customer expectations” is considered a software defect; however, there is almost always a mismatch between customer expectations and the economics of the situation.
  Windows normally ships with 10,000s of defects. Would you pay 10x as much for 10x fewer defects?
  Heretical thought:
    –The methods for producing 80-90% defect-free software have been known since the late 1960s (inspections, formal requirements, design and test).
    –Why aren’t they being used?
    –The field of software engineering is a very complex mix of technology, management and human psychology.
Finally, even if customer expectations are clearly documented at the beginning of a software development, and properly executed during implementation, the installation of that software system is a significant change to the environment that developed those expectations. This yields new expectations.
  “Well, since that data is now on the computer we should be able to…”
    –Share it with our other systems.
    –Work on it with spreadsheets.
    –Put it on the web.
    –Share it with our Aunt Sally.
  “What do you mean that costs more? The software is defective. Fix it!”
The software AND customer communities will need to address all of these issues in a formal, comprehensive, and consistent manner before the phrase “software reliability” has meaning.
SEI S/W Capability Maturity Model
  1) Initial. The software process is characterized as ad hoc, and occasionally even chaotic. Few processes are defined, and success depends on individual effort and heroics.
  2) Repeatable. Basic project management processes are established to track cost, schedule, and functionality. The necessary process discipline is in place to repeat earlier successes on projects with similar applications.
  3) Defined. The software process for both management and engineering activities is documented, standardized, and integrated into a standard software process for the organization. All projects use an approved, tailored version of the organization's standard software process for developing and maintaining software.
  4) Managed. Detailed measures of the software process and product quality are collected. Both the software process and products are quantitatively understood and controlled.
  5) Optimizing. Continuous process improvement is enabled by quantitative feedback from the process and from piloting innovative ideas and technologies.