1 CS 501 Spring 2005 CS 501: Software Engineering Lecture 19 Reliability 1.

1 1 CS 501 Spring 2005 CS 501: Software Engineering Lecture 19 Reliability 1

2 2 CS 501 Spring 2005 Administration

3 3 CS 501 Spring 2005 Lectures on Reliability and Dependability Lecture 19, Reliability 1: The development process; Reviews. Lecture 20, Reliability 2: Different aspects of reliability; Programming techniques. Lecture 21, Reliability 3: Testing and bug fixing; Tools.

4 4 CS 501 Spring 2005 Dependable and Reliable Systems: The Royal Majesty From the report of the National Transportation Safety Board: "On June 10, 1995, the Panamanian passenger ship Royal Majesty grounded on Rose and Crown Shoal about 10 miles east of Nantucket Island, Massachusetts, and about 17 miles from where the watch officers thought the vessel was. The vessel, with 1,509 persons on board, was en route from St. George’s, Bermuda, to Boston, Massachusetts." "The Raytheon GPS unit installed on the Royal Majesty had been designed as a standalone navigation device in the mid- to late 1980s,...The Royal Majesty’s GPS was configured by Majesty Cruise Line to automatically default to the Dead Reckoning mode when satellite data were not available."

5 5 CS 501 Spring 2005 The Royal Majesty: Analysis The ship was steered by an autopilot that relied on position information from the Global Positioning System (GPS). If the GPS could not obtain a position from satellites, it provided an estimated position based on Dead Reckoning (distance and direction traveled from a known point). The GPS failed one hour after leaving Bermuda. The crew failed to see the warning message on the display (or to check the instruments). 34 hours and 600 miles later, the Dead Reckoning error was 17 miles.
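The dead reckoning estimate described above can be sketched as follows. This is a minimal illustration on a flat plane, not the algorithm used by the ship's GPS unit; the function names and the idea of modeling the drift as a constant heading error are assumptions made for the example.

```python
import math

def dead_reckon(east, north, heading_deg, speed, hours):
    """Advance a known position by distance travelled along a heading.

    Positions are in miles east/north of the last known fix
    (flat-plane approximation, an assumption of this sketch).
    heading_deg: 0 = north, 90 = east.
    """
    distance = speed * hours
    heading = math.radians(heading_deg)
    return (east + distance * math.sin(heading),
            north + distance * math.cos(heading))

def cross_track_error(distance, heading_error_deg):
    """Sideways error accumulated if the true track differs from the
    assumed heading by a constant angle (hypothetical error model)."""
    return distance * math.sin(math.radians(heading_error_deg))

def implied_heading_error_deg(distance, error):
    """Constant heading error that would explain a given sideways error."""
    return math.degrees(math.asin(error / distance))
```

Under this simplified model, the 17-mile error after 600 miles corresponds to a steady deviation of about 1.6 degrees, which shows how a small uncorrected effect can grow into a large position error when no satellite fix is available.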

6 6 CS 501 Spring 2005 The Royal Majesty: Software Lessons All the software worked as specified (no bugs), but... After the GPS software was specified, the requirements changed (from a standalone system to part of an integrated system). The manufacturers of the autopilot and GPS adopted different design philosophies about the communication of mode changes. The autopilot was not programmed to recognize valid/invalid status bits in messages from the GPS (NMEA 0183). The warnings provided by the user interface were not sufficiently conspicuous to alert the crew. The officers had not been properly trained on this equipment.
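To illustrate what recognizing a validity indicator in an NMEA 0183 message can look like, here is a minimal sketch of a receiver that rejects position data unless the sentence passes its checksum and is flagged valid. The slide does not say which sentence type or field the Royal Majesty's equipment used; the RMC sentence and its status field ("A" for valid data, "V" for a receiver warning) are chosen here only as an example, and the function names are hypothetical.

```python
from functools import reduce

def nmea_checksum(body):
    """XOR of every character between '$' and '*', as two hex digits."""
    return format(reduce(lambda acc, ch: acc ^ ord(ch), body, 0), "02X")

def make_sentence(body):
    """Wrap a sentence body with the '$' prefix and '*hh' checksum."""
    return "$" + body + "*" + nmea_checksum(body)

def rmc_position_is_valid(sentence):
    """Return True only for a well-formed RMC sentence flagged as valid."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, given = sentence[1:].partition("*")
    if nmea_checksum(body) != given.upper():
        return False                      # corrupted in transit
    fields = body.split(",")
    if len(fields) < 3 or not fields[0].endswith("RMC"):
        return False                      # not the sentence type we expect
    return fields[2] == "A"               # 'A' = valid, 'V' = warning

# Illustrative sentence bodies (made-up position data).
GOOD = make_sentence(
    "GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W")
WARN = make_sentence(
    "GPRMC,123519,V,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W")
```

A consumer such as an autopilot would call `rmc_position_is_valid` before using the position and raise a conspicuous alarm when it returns False, instead of silently steering by unverified data.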

7 7 CS 501 Spring 2005 Reliability Reliability: Measured by the probability of a failure occurring in operational use. Perceived reliability depends upon: user behavior; set of inputs; pain of failure.

8 8 CS 501 Spring 2005 User Perception of Reliability 1. A personal computer that crashes frequently v. a machine that is out of service for two days. 2. A database system that crashes frequently but comes back quickly with no loss of data v. a system that fails once in three years but data has to be restored from backup. 3. A system that does not fail but has unpredictable periods when it runs very slowly.

9 9 CS 501 Spring 2005 Reliability Metrics Traditional Measures: Mean time between failures; Availability (up time); Mean time to repair. Market Measures: Complaints; Customer retention. User Perception is Influenced by: Distribution of failures. Hypothetical example: Cars are less safe than airplanes in accidents per hour, but safer in accidents per mile.
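The traditional measures above are related by a standard formula: steady-state availability is the mean up time divided by the sum of mean up time and mean repair time. The sketch below assumes that "mean time between failures" means the average operating time between the end of one repair and the next failure; the function names are illustrative.

```python
def availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is up, given mean time between
    failures (up time only) and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def summarize(uptimes_hours, repairs_hours):
    """Compute (MTBF, MTTR, availability) from observed periods."""
    mtbf = sum(uptimes_hours) / len(uptimes_hours)
    mttr = sum(repairs_hours) / len(repairs_hours)
    return mtbf, mttr, availability(mtbf, mttr)
```

For example, a system that runs 990 hours on average between failures and takes 10 hours on average to repair has an availability of 0.99. Note that the same availability can come from frequent short outages or rare long ones, which is why the distribution of failures affects user perception.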

10 10 CS 501 Spring 2005 Reliability Metrics for Distributed Systems Traditional metrics are hard to apply in multi-component systems: In a big network, at any given moment something will be giving trouble, but very few users will see it. A system that has excellent average reliability may give terrible service to certain users. There are so many components that system administrators rely on automatic reporting systems to identify problem areas.

11 11 CS 501 Spring 2005 Requirements Specification of System Reliability Example: ATM card reader
Failure class | Example | Metric
Permanent, non-corrupting | System fails to operate with any card -- reboot | 1 per 1,000 days
Transient, non-corrupting | System cannot read an undamaged card | 1 in 1,000 transactions
Corrupting | A pattern of transactions corrupts database | Never
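A reliability requirement stated this way can be checked mechanically against field data. The following sketch assumes that each metric in the ATM example is a maximum acceptable rate; the record layout and names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Observed:
    days_in_service: int
    transactions: int
    permanent_failures: int      # system fails to operate with any card
    transient_failures: int      # undamaged card could not be read
    corrupting_failures: int     # database was corrupted

def violations(obs):
    """Return the failure classes whose observed rate exceeds the spec."""
    problems = []
    # Permanent, non-corrupting: at most 1 per 1,000 days.
    if obs.permanent_failures * 1000 > obs.days_in_service:
        problems.append("permanent")
    # Transient, non-corrupting: at most 1 in 1,000 transactions.
    if obs.transient_failures * 1000 > obs.transactions:
        problems.append("transient")
    # Corrupting: never.
    if obs.corrupting_failures > 0:
        problems.append("corrupting")
    return problems
```

Integer comparisons are used instead of dividing so that a rate exactly at the limit is accepted without floating-point rounding.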

12 12 CS 501 Spring 2005 Cost of Improved Reliability [Figure: cost ($) plotted against up time, with the up-time axis marked at 99% and 100%] Will you spend your money on new functionality or improved reliability?
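To make the range between 99% and 100% up time concrete, the arithmetic below converts an up-time percentage into hours of downtime per year. It is a small worked example, not data from the lecture, and it assumes a 365-day year of continuous service.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years

def downtime_hours_per_year(up_time_fraction):
    """Hours per year the system is unavailable at a given up time."""
    return (1.0 - up_time_fraction) * HOURS_PER_YEAR

def downtime_table(levels=(0.99, 0.999, 0.9999)):
    """Map each up-time level to its yearly downtime in hours."""
    return {level: downtime_hours_per_year(level) for level in levels}
```

At 99% up time the system is down for about 87.6 hours a year; at 99.9% it is about 8.8 hours, and at 99.99% under an hour. Each additional step removes less downtime than the previous one, which frames the trade-off between reliability and new functionality.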

13 13 CS 501 Spring 2005 Example: Central Computing System A central computer serves the entire organization. Any failure is serious. Step 1: Gather data on every failure. 10 years of data in a simple database. Every failure analyzed and classified as: hardware; software (default); environment (e.g., power, air conditioning); human (e.g., operator error).

14 14 CS 501 Spring 2005 Example: Central Computing System Step 2: Analyze the data. Weekly, monthly, and annual statistics: Number of failures and interruptions; Mean time to repair. Graphs of trends by component, e.g.: Failure rates of disk drives; Hardware failures after power failures; Crashes caused by software bugs in each module.

15 15 CS 501 Spring 2005 Example: Central Computing System Step 3: Invest resources where benefit will be maximum, e.g.: Orderly shut down after power failure; Priority order for software improvements; Changed procedures for operators; Replacement hardware.
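The three steps above (record every failure, compute periodic statistics, rank where to invest) can be sketched with a small failure log. The record fields, sample entries, and function names are hypothetical; only the four cause categories and the software default come from the slides.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import date

CAUSES = ("hardware", "software", "environment", "human")

@dataclass
class Failure:
    day: date
    component: str
    repair_hours: float
    cause: str = "software"   # default category, as on the slide

def monthly_stats(failures):
    """Step 2: number of failures and mean time to repair per month."""
    totals = defaultdict(lambda: [0, 0.0])   # (year, month) -> [count, hours]
    for f in failures:
        entry = totals[(f.day.year, f.day.month)]
        entry[0] += 1
        entry[1] += f.repair_hours
    return {month: {"failures": count, "mttr": hours / count}
            for month, (count, hours) in sorted(totals.items())}

def failures_by(failures, attribute):
    """Step 3: rank components or causes by failure count, worst first."""
    return Counter(getattr(f, attribute) for f in failures).most_common()

# Step 1: a made-up log for illustration.
LOG = [
    Failure(date(2004, 1, 5), "disk", 4.0, "hardware"),
    Failure(date(2004, 1, 20), "scheduler", 2.0),
    Failure(date(2004, 2, 3), "disk", 6.0, "hardware"),
    Failure(date(2004, 2, 9), "power", 1.0, "environment"),
    Failure(date(2004, 2, 11), "disk", 2.0, "hardware"),
]
```

With this log, `failures_by(LOG, "component")` puts the disk first, suggesting that replacement hardware would be a better investment than, say, changed operator procedures.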

16 16 CS 501 Spring 2005 Building Dependable Systems: Three Principles For a software system to be dependable: Each stage of development must be done well. Changes should be incorporated into the structure as carefully as the original system development. Testing and correction do not ensure quality, but dependable systems are not possible without systematic testing.

17 17 CS 501 Spring 2005 Reliability: Modified Waterfall Model [Figure: modified waterfall model with the stages Feasibility study, Requirements, System design, Program design, Coding, Testing, Acceptance, Operation & maintenance, and a box labeled Changes]

18 18 CS 501 Spring 2005 Key Factors for Reliable Software Organization culture that expects quality. Approach to software design and implementation that hides complexity (e.g., structured design, object-oriented programming). Precise, unambiguous specification. Use of software tools that restrict or detect errors (e.g., strongly typed languages, source control systems, debuggers). Programming style that emphasizes simplicity, readability, and avoidance of dangerous constructs. Incremental validation.

19 19 CS 501 Spring 2005 Building Dependable Systems: Organizational Culture Good organizations create good systems: Acceptance of the group's style of work (e.g., meetings, preparation, support for juniors); Visibility; Completion of a task before moving to the next (e.g., documentation, comments in code).

20 20 CS 501 Spring 2005 Building Dependable Systems: Complexity The human mind can encompass only limited complexity: Comprehensibility; Simplicity; Partitioning of complexity. A simple system or subsystem is easier to get right than a complex one.

21 21 CS 501 Spring 2005 Building Dependable Systems: Specifications for the Client Specifications are of no value if they do not meet the client's needs. The client must understand and review the requirements specification in detail. Appropriate members of the client's staff must review relevant areas of the design (e.g., operations, training materials, system administration). The acceptance tests must belong to the client.

22 22 CS 501 Spring 2005 Building Dependable Systems: Quality Management Processes Assumption: Good processes lead to good software. The importance of routine: Standard terminology (requirements, specification, design, etc.); Software standards (naming conventions, etc.); Internal and external documentation; Reporting procedures.

23 23 CS 501 Spring 2005 Building Dependable Systems: Change Change management: Source code management and version control; Tracking of change requests and bug reports; Procedures for changing requirements specifications, designs and other documentation; Regression testing; Release control.

24 24 CS 501 Spring 2005 Reviews: Process (Plan) Objectives: To review progress against plan (formal or informal). To adjust plan (schedule, team assignments, functionality, etc.). Impact on quality: Good quality systems usually result from plans that are demanding but realistic. Good people like to be stretched and to work hard, but must not be pressed beyond their capabilities.

25 25 CS 501 Spring 2005 Reviews: Design and Code DESIGN AND CODE REVIEWS ARE A FUNDAMENTAL PART OF GOOD SOFTWARE DEVELOPMENT. Concept: Colleagues review each other's work. Reviews can be applied to any stage of software development and can be formal or informal.

26 26 CS 501 Spring 2005 Benefits of Design and Code Reviews Benefits: Extra eyes spot mistakes, suggest improvements. Colleagues share expertise; helps with training. An occasion to tidy loose ends. Incompatibilities between components can be identified. Helps scheduling and management control. Fundamental requirements: Senior team members must show leadership. Good reviews require good preparation. Everybody must be helpful, not threatening.

27 27 CS 501 Spring 2005 Review Team (Full Version) A review is a structured meeting, with the following people: Moderator -- ensures that the meeting moves ahead steadily. Scribe -- records discussion in a constructive manner. Developer -- person(s) whose work is being reviewed. Interested parties -- people above and below in the software process. Outside experts -- knowledgeable people who are not working on this project. Client -- representatives of the client who are knowledgeable about this part of the process.

28 28 CS 501 Spring 2005 Example: Program Design Moderator. Scribe. Developer -- the design team. Interested parties -- people who created the system design and/or requirements specification, and the programmers who will implement the system. Outside experts -- knowledgeable people who are not working on this project. Client -- only if the client has a strong technical representative.

29 29 CS 501 Spring 2005 Review Process Preparation: The developer provides colleagues with documentation (e.g., specification or design), or code listing. Participants study the documentation in advance. Meeting: The developer leads the reviewers through the documentation, describing what each section does and encouraging questions. Must allow plenty of time and be prepared to continue on another day.

30 30 CS 501 Spring 2005 Static and Dynamic Verification Static verification: Techniques of verification that do not include execution of the software. May be manual or use computer tools. Dynamic verification: Testing the software with trial data. Debugging to remove errors.

31 31 CS 501 Spring 2005 Static Validation & Verification Carried out throughout the software development process. [Figure: REVIEWS provide validation & verification of the requirements specification, design, and program]

32 32 CS 501 Spring 2005 Static Verification: Program Inspections Formal program reviews whose objective is to detect faults. Code may be read or reviewed line by line. 150 to 250 lines of code in a 2-hour meeting. Use a checklist of common errors. Requires team commitment, e.g., trained leaders. So effective that it is claimed that it can replace unit testing.
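The inspection rate quoted above makes it possible to estimate the effort of inspecting a whole program. The sketch below is back-of-the-envelope arithmetic; the team size is an assumption, not a figure from the lecture.

```python
import math

def inspection_meetings(lines_of_code, slow_rate=150, fast_rate=250):
    """Range of 2-hour meetings needed: (fewest, most)."""
    return (math.ceil(lines_of_code / fast_rate),
            math.ceil(lines_of_code / slow_rate))

def inspection_person_hours(lines_of_code, team_size=4, meeting_hours=2):
    """Range of person-hours spent in meetings (team_size is assumed)."""
    fewest, most = inspection_meetings(lines_of_code)
    return (fewest * meeting_hours * team_size,
            most * meeting_hours * team_size)
```

For a 10,000-line program this gives between 40 and 67 meetings, or 320 to 536 person-hours for a four-person team, before counting preparation time. This is the scale of commitment behind the requirement for team commitment and trained leaders.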

33 33 CS 501 Spring 2005 Inspection Checklist: Common Errors Data faults: Initialization, constants, array bounds, character strings. Control faults: Conditions, loop termination, compound statements, case statements. Input/output faults: All inputs used; all outputs assigned a value. Interface faults: Parameter numbers, types, and order; structures and shared memory. Storage management faults: Modification of links, allocation and de-allocation of memory. Exceptions: Possible errors, error handlers.

34 34 CS 501 Spring 2005 Static Analysis Tools Program analyzers scan the source of a program for possible faults and anomalies (e.g., Lint for C programs). Control flow: Loops with multiple exit or entry points. Data use: Undeclared or uninitialized variables, unused variables, multiple assignments, array bounds. Interface faults: Parameter mismatches, non-use of function results, uncalled procedures. Storage management: Unassigned pointers, pointer arithmetic.
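As a small illustration of the data-use checks listed above, the sketch below reports variables that are assigned but never read, using Python's standard `ast` module to inspect source text without executing it. It is a toy example, far simpler than Lint: it ignores scopes, attributes, and augmented assignments, and the function name is hypothetical.

```python
import ast

def unused_variables(source):
    """Return (name, first line assigned) for names never read."""
    first_assignment = {}
    names_read = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Name):
            continue
        if isinstance(node.ctx, ast.Store):      # name is being assigned
            line = first_assignment.get(node.id, node.lineno)
            first_assignment[node.id] = min(line, node.lineno)
        elif isinstance(node.ctx, ast.Load):     # name is being read
            names_read.add(node.id)
    return sorted((name, line)
                  for name, line in first_assignment.items()
                  if name not in names_read)

# Sample program: 'count' is assigned on line 3 but never used.
SAMPLE = """
total = 0
count = 0
for value in [1, 2, 3]:
    total = total + value
print(total)
"""
```

Calling `unused_variables(SAMPLE)` flags `count`, the kind of anomaly that often points to an unfinished or mistaken piece of logic.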

35 35 CS 501 Spring 2005 Static Analysis Tools (continued) Static analysis tools Cross-reference table: Shows every use of a variable, procedure, object, etc. Information flow analysis: Identifies input variables on which an output depends. Path analysis: Identifies all possible paths through the program.
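A cross-reference table of the kind described above can be sketched in a few lines with Python's standard `ast` module. This is a minimal illustration under simplifying assumptions (one file, no distinction between scopes, attributes ignored); the function name is hypothetical.

```python
import ast
from collections import defaultdict

def cross_reference(source):
    """Map each variable, parameter, function, or class name to the
    sorted list of line numbers where it is defined or used."""
    table = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):                 # variable use or assignment
            table[node.id].add(node.lineno)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            table[node.name].add(node.lineno)          # definition site
        elif isinstance(node, ast.arg):                # function parameter
            table[node.arg].add(node.lineno)
    return {name: sorted(lines) for name, lines in sorted(table.items())}

SAMPLE = (
    "def area(width, height):\n"
    "    return width * height\n"
    "\n"
    "size = area(2, 3)\n"
    "print(size)\n"
)
```

For the sample, the table shows that `area` appears on lines 1 and 4 and `size` on lines 4 and 5. Information flow and path analysis build on the same parsed representation but require tracking dependencies and branches, which this sketch does not attempt.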

