CS 501: Software Engineering Fall 2000 Lecture 21 Dependable Systems I Reliability.



2 Administration
Assignment 3
- Report due tomorrow at 5 p.m.
- Group design with individual parts
Presentations
- Wednesday through Friday
- Every group member must present during the semester

3 Software Reliability
Failure: Software does not deliver the service expected by the user (e.g., because of a mistake in requirements)
Fault: Programming or design error whereby the delivered system does not conform to specification
Reliability: Probability of failure-free operation in operational use, usually quantified through the probability of a failure occurring
Perceived reliability depends upon:
- user behavior
- set of inputs
- pain of failure

4 Reliability Metrics
- Probability of failure on demand
- Rate of failure occurrence (failure intensity)
- Mean time between failures
- Availability (up time)
- Mean time to repair
- Distribution of failures
Hypothetical example: Cars are safer than airplanes in accidents (failures) per hour, but less safe in failures per mile.
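The metrics above can be computed from a simple outage log. The following is a minimal sketch, not a standard implementation: the `Outage` record, the function name, and the sample figures are all invented for illustration, and "mean time between failures" is taken here to mean average up time per failure (some texts include repair time in it).

```python
from dataclasses import dataclass

@dataclass
class Outage:
    start_hour: float    # hours since observation began (hypothetical field)
    repair_hours: float  # time taken to restore service

def reliability_metrics(outages, observed_hours, demands, failed_demands):
    """Compute basic reliability metrics; assumes at least one outage."""
    n = len(outages)
    downtime = sum(o.repair_hours for o in outages)
    uptime = observed_hours - downtime
    return {
        "pofod": failed_demands / demands,   # probability of failure on demand
        "failure_rate_per_hour": n / uptime, # rate of failure occurrence
        "mtbf_hours": uptime / n,            # mean up time between failures
        "mttr_hours": downtime / n,          # mean time to repair
        "availability": uptime / observed_hours,
    }

# Invented sample: 1,000 hours observed, two outages, 5 failed requests in 5,000.
sample = reliability_metrics(
    [Outage(120.0, 4.0), Outage(700.0, 6.0)],
    observed_hours=1000.0, demands=5000, failed_demands=5,
)
```

Which metric matters depends on the system: a request-driven service is usually judged on failure on demand, a continuously running one on availability.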

5 Reliability Metrics for Distributed Systems
Traditional metrics are hard to apply in multi-component systems:
- In a big network, at a given moment something will be giving trouble, but very few users will see it.
- A system that has excellent average reliability may give terrible service to certain users.
- There are so many components that system administrators rely on automatic reporting systems to identify problem areas.

6 User Perception of Reliability
1. A personal computer that crashes frequently v. a machine that is out of service for two days.
2. A database system that crashes frequently but comes back quickly with no loss of data v. a system that fails once in three years but data has to be restored from backup.
3. A system that does not fail but has unpredictable periods when it runs very slowly.

7 Cost of Improved Reliability
[Figure: cost ($) plotted against up time, with the up-time axis marked at 99% and 100%]
Will you spend your money on new functionality or improved reliability?

8 Specification of System Reliability
Example: ATM card reader

Failure class              Example                                           Metric
Permanent non-corrupting   System fails to operate with any card -- reboot   1 per 1,000 days
Transient non-corrupting   System cannot read an undamaged card              1 in 1,000 transactions
Corrupting                 A pattern of transactions corrupts database       Never
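A specification like the one above can be written down as data and checked against measured rates. This is only a sketch: the dictionary keys, the `violations` helper, and the observed figures are hypothetical, and only the three limits come from the slide.

```python
# Limits from the ATM card reader example; the key names are invented here.
SPEC = {
    "permanent_non_corrupting": 1 / 1000,  # failures per day
    "transient_non_corrupting": 1 / 1000,  # failures per transaction
    "corrupting": 0.0,                     # must never happen
}

def violations(observed):
    """Return the failure classes whose observed rate exceeds the specified limit."""
    return sorted(name for name, limit in SPEC.items()
                  if observed.get(name, 0.0) > limit)

# Invented measurements: 2 reboots in 3,000 days, 3 misreads in 1,000 transactions.
observed = {
    "permanent_non_corrupting": 2 / 3000,
    "transient_non_corrupting": 3 / 1000,
    "corrupting": 0.0,
}
failing = violations(observed)
```

With these invented figures only the transient class exceeds its limit.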

9 Principles for Dependable Systems
The human mind can encompass only limited complexity:
=> Comprehensibility
=> Simplicity
=> Partitioning of complexity

10 Principles for Dependable Systems
High quality has to be built in:
=> Each stage of development must be done well
=> Testing and correction do not lead to quality
=> Changes should be incorporated into the structure

11 Quality Management Processes
Assumption: Good processes lead to good software
The importance of routine:
- Standard terminology (requirements, specification, design, etc.)
- Software standards (naming conventions, etc.)
- Internal and external documentation
- Reporting procedures

12 Quality Management Processes
Change management:
- Source code management and version control
- Tracking of change requests and bug reports
- Procedures for changing requirements specifications, designs and other documentation
- Release control

13 Design and Code Reviews
Colleagues review each other's work:
- can be applied to any stage of software development
- can be formal or informal
The developer:
- provides colleagues with documentation (e.g., specification or design) or a code listing
- talks through the work while answering questions
Most effective when developer and reviewers prepare well

14 Benefits of Design and Code Reviews
Benefits:
- Extra eyes spot mistakes, suggest improvements
- Colleagues share expertise; helps with training
- An occasion to tidy loose ends
- Incompatibilities between modules can be identified
- Helps scheduling and management control
Fundamental requirements:
- Senior team members must show leadership
- Must be helpful, not threatening

15 Process (Plan) Reviews
Objectives:
- To review progress against plan (formal or informal)
- To adjust plan (schedule, team assignments, functionality, etc.)
Impact on quality:
- Good quality systems usually result from plans that are demanding but realistic
- Good people like to be stretched and to work hard, but must not be pressed beyond their capabilities.

16 Statistical Testing
- Determine the operational profile of the software
- Select or generate a profile of test data
- Apply test data to system, record failure patterns
- Compute statistical values of metrics under test conditions
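The four steps above can be sketched as a small test harness. Everything here is an assumption made for illustration: the toy `withdraw` system with its deliberate fault, the two input classes, and the 90/10 operational profile.

```python
import random

def withdraw(amount):
    """Toy system under test (hypothetical): returns True when the request is
    handled correctly. It has a deliberate fault for amounts above 900."""
    return amount <= 900

# Assumed operational profile: share of each input class in real use.
PROFILE = {"small": 0.9, "large": 0.1}
GENERATORS = {
    "small": lambda rng: rng.randint(1, 100),
    "large": lambda rng: rng.randint(101, 1000),
}

def statistical_test(system, profile, generators, runs, seed=0):
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    classes = list(profile)
    weights = [profile[c] for c in classes]
    counts = {c: 0 for c in classes}
    failures = {c: 0 for c in classes}
    for _ in range(runs):
        cls = rng.choices(classes, weights=weights)[0]  # sample per the profile
        counts[cls] += 1
        try:
            ok = system(generators[cls](rng))
        except Exception:
            ok = False  # an unexpected exception counts as a failure
        if not ok:
            failures[cls] += 1
    pofod = sum(failures.values()) / runs  # estimated failure on demand
    return counts, failures, pofod

counts, failures, pofod = statistical_test(withdraw, PROFILE, GENERATORS, runs=10_000)
```

The estimate is only as good as the profile: if large withdrawals are rarer in the test data than in real use, the measured failure probability will be too optimistic.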

17 Statistical Testing
Advantages:
- Can test with very large numbers of transactions
- Can test with extreme cases (high loads, restarts, disruptions)
- Can repeat after system modifications
Disadvantages:
- Uncertainty in operational profile (unlikely inputs)
- Expensive
- Can never prove high reliability
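A short calculation suggests why testing alone can never prove high reliability. The sketch below assumes independent test runs drawn from the operational profile with no failures observed; the function names are invented for illustration.

```python
import math

def runs_needed(max_pofod, confidence=0.95):
    """Failure-free runs needed before claiming, at the given confidence,
    that the probability of failure on demand is at most max_pofod."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_pofod))

def pofod_upper_bound(failure_free_runs, confidence=0.95):
    """Upper confidence bound on failure probability after N failure-free runs."""
    return 1 - (1 - confidence) ** (1 / failure_free_runs)

n_for_one_in_thousand = runs_needed(0.001)
n_for_one_in_million = runs_needed(1e-6)
```

The required number of runs grows roughly as 3 divided by the target failure probability, so each extra "nine" of reliability costs about ten times as many failure-free tests.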

18 Example: Dartmouth Time Sharing (1980)
A central computer serves the entire campus. Any failure is serious.
Step 1. Gather data on every failure
- 10 years of data in a simple data base
- Every failure analyzed:
  - hardware
  - software (default)
  - environment (e.g., power, air conditioning)
  - human (e.g., operator error)

19 Example: Dartmouth Time Sharing (1980)
Step 2. Analyze the data.
- Weekly, monthly, and annual statistics
  - Number of failures and interruptions
  - Mean time to repair
- Graphs of trends by component, e.g.,
  - Failure rates of disk drives
  - Hardware failures after power failures
  - Crashes caused by software bugs in each module

20 Example: Dartmouth Time Sharing (1980)
Step 3. Invest resources where benefit will be maximum, e.g.,
- Orderly shut down after power failure
- Priority order for software improvements
- Changed procedures for operators
- Replacement hardware
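The gather, analyze, and prioritize steps of this example can be sketched with a small in-memory log. The records below are invented for illustration and are not the Dartmouth data; the tuple layout and the `analyze` function are assumptions as well.

```python
from collections import Counter, defaultdict

# Hypothetical failure records: (year, month, category, component, repair_hours)
FAILURES = [
    (1980, 1, "hardware", "disk drive", 3.0),
    (1980, 1, "software", "scheduler", 0.5),
    (1980, 2, "environment", "power", 2.0),
    (1980, 2, "hardware", "disk drive", 5.0),
    (1980, 3, "human", "operator", 1.0),
    (1980, 3, "software", "scheduler", 0.5),
    (1980, 3, "hardware", "disk drive", 4.0),
]

def analyze(failures):
    # Step 2: counts per month and per category, mean time to repair per component.
    per_month = Counter((year, month) for year, month, _, _, _ in failures)
    per_category = Counter(category for _, _, category, _, _ in failures)
    repairs = defaultdict(list)
    for _, _, _, component, hours in failures:
        repairs[component].append(hours)
    mttr = {c: sum(h) / len(h) for c, h in repairs.items()}
    # Step 3: rank components by total downtime to decide where to invest first.
    downtime = {c: sum(h) for c, h in repairs.items()}
    priorities = sorted(downtime, key=downtime.get, reverse=True)
    return per_month, per_category, mttr, priorities

per_month, per_category, mttr, priorities = analyze(FAILURES)
```

Ranking by total downtime is one possible choice; ranking by failure count or by number of users affected would point the investment elsewhere.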

21 Factors for Fault Free Software
- Precise, unambiguous specification
- Organization culture that expects quality
- Approach to software design and implementation that hides complexity (e.g., structured design, object-oriented programming)
- Use of software tools that restrict or detect errors (e.g., strongly typed languages, source control systems, debuggers)
- Programming style that emphasizes simplicity, readability, and avoidance of dangerous constructs
- Incremental validation

22 Error Avoidance
Risky programming constructs:
- Pointers
- Dynamic memory allocation
- Floating-point numbers
- Parallelism
- Recursion
- Interrupts
All are valuable in certain circumstances, but should be used with discretion
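As one illustration of why floating-point numbers appear on this list, the sketch below shows an exact equality test failing on a sum that is mathematically 1.0. The variable names are invented, and the tolerance is an arbitrary choice.

```python
import math

total = sum([0.1] * 10)  # mathematically 1.0, but 0.1 has no exact binary form

exact_match = (total == 1.0)                           # False: rounding error accumulated
close_enough = math.isclose(total, 1.0, rel_tol=1e-9)  # True: compare with a tolerance
```

Using floating-point "with discretion" here means comparing within a tolerance, or avoiding it altogether where exactness matters, for example by holding money as integer cents.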

23 Defensive Programming
Murphy's Law: If anything can go wrong, it will.
Defensive Programming:
- Redundant code is incorporated to check system state after modifications
- Implicit assumptions are tested explicitly

24 Defensive Programming
Examples:
- Use boolean variable not integer
- Test i <= n not i == n
- Assertion checking
- Build debugging code into program with a switch to display values at interfaces
- Error checking codes in data, e.g., checksum or hash
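Several of these techniques fit in a few lines. This is a sketch only: the function names, the account example, and the choice of CRC-32 as the checksum are assumptions made for illustration.

```python
import logging
import zlib

# Debugging code with a switch: set the logger level to DEBUG to display values.
log = logging.getLogger("defensive_demo")

def pack(payload: bytes) -> bytes:
    """Append a CRC-32 checksum so later corruption can be detected."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def unpack(record: bytes) -> bytes:
    """Verify the checksum before trusting the data."""
    if len(record) < 4:
        raise ValueError("record too short")
    payload, stored = record[:-4], int.from_bytes(record[-4:], "big")
    if zlib.crc32(payload) != stored:
        raise ValueError("checksum mismatch")
    return payload

def transfer(balances, src, dst, cents):
    # Implicit assumptions are tested explicitly.
    assert cents > 0, "amount must be positive"
    assert balances[src] >= cents, "insufficient funds"
    total_before = sum(balances.values())
    log.debug("transfer %d from %s to %s", cents, src, dst)
    balances[src] -= cents
    balances[dst] += cents
    # Redundant check of system state after the modification.
    assert sum(balances.values()) == total_before, "money created or destroyed"

def visited(n, step):
    """Loop guard tests i <= n rather than i == n, so a step that
    jumps past n still terminates."""
    out, i = [], 0
    while i <= n:
        out.append(i)
        i += step
    return out
```

Note that Python drops `assert` statements when run with the `-O` option, so checks that must hold in production should raise exceptions explicitly, as `unpack` does.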

25 Some Notable Bugs
- Built-in function in Fortran compiler (e^0 = 0)
- Japanese microcode for Honeywell DPS virtual memory
- The microfilm plotter with the missing byte (1:1023)
- The Sun 3 page fault that IBM paid to fix
- Left-handed rotation in the graphics package
Good people work around problems. The best people track them down and fix them!