The Ariane 5 Launcher Failure


The Ariane 5 Launcher Failure: June 4th 1996. Total failure of the Ariane 5 launcher on its maiden flight.

Ariane 5 A European rocket designed to launch commercial payloads (e.g. communications satellites) into Earth orbit. Successor to the successful Ariane 4 launchers. Ariane 5 can carry a heavier payload than Ariane 4.

Launcher failure Approximately 37 seconds after a successful lift-off, the Ariane 5 launcher lost control. Incorrect control signals were sent to the engines, which swivelled so that unsustainable stresses were imposed on the rocket. It started to break up, and its self-destruct system was triggered. The system failure was a direct result of a software failure. However, it was symptomatic of a more general systems failure: software fault management on the Ariane 5 launcher was clearly faulty.

The problem The attitude and trajectory of the rocket are measured by a computer-based inertial reference system, which transmits commands to the engines to maintain attitude and direction. The software failed, and both this system and the backup system shut down. After that, diagnostic commands were transmitted to the engines, which interpreted them as real data and swivelled to an extreme position, resulting in unforeseen stresses on the rocket.

Software failure Software failure occurred when an attempt to convert a 64-bit floating point number to a signed 16-bit integer caused the number to overflow. There was no local exception handler associated with the conversion so the global system exception management facilities were invoked. These shut down the software. The backup software was a copy and behaved in exactly the same way.
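
The conversion failure described above can be sketched in a few lines of Python. This is an illustration only, not the flight code: the names `to_int16_checked` and `run_unit` and the sample readings are assumptions made for the example. It shows a checked narrowing conversion that raises when a floating point value does not fit in a signed 16-bit integer, and how a unit with no local handler, duplicated as an identical backup, stops on exactly the same input.

```python
INT16_MIN = -2**15      # -32768
INT16_MAX = 2**15 - 1   #  32767


def to_int16_checked(value: float) -> int:
    """Convert a float to a signed 16-bit integer, raising on overflow."""
    result = int(value)  # truncates toward zero
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return result


def run_unit(name: str, reading: float) -> str:
    """Hypothetical unit with no local handler around the conversion."""
    try:
        to_int16_checked(reading)
    except OverflowError:
        # Stands in for the global exception facility shutting the unit down.
        return f"{name}: shut down"
    return f"{name}: running"


if __name__ == "__main__":
    for reading in (1200.5, 40000.0):  # illustrative values, not flight data
        # The backup is a copy, so it fails on exactly the same input.
        print(reading, run_unit("primary", reading), run_unit("backup", reading))
```

Because both units run the same code on the same input, duplicating the unit in this sketch gives no protection against the overflow.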

Avoidable failure? The part of the software that failed was reused from the Ariane 4 launch vehicle. The computation that resulted in overflow was not supposed to be used by Ariane 5. Two decisions were made: not to remove the facility, as this could introduce new faults; and not to add an overflow exception handler, because the processor was heavily loaded. For dependability reasons, it was thought desirable to have some spare processor capacity.

Avoidable failure? Why was the operand range in the module deliberately not protected? Because engineering analysis for its use in Ariane 4 had shown that the operand would never go out of bounds. The new range requirement analysis was not transferred to the requirements for Ariane 5, so testing was done against the wrong requirements.

Why not Ariane 4? The physical characteristics of Ariane 4 (a smaller vehicle) are such that it has a lower initial acceleration and a slower build-up of horizontal velocity than Ariane 5. The value of the variable on Ariane 4 could never reach a level that caused overflow during the launch period.

Validation failure As the facility that failed was not supposed to be required for Ariane 5, there was no requirement associated with it. As there was no associated requirement, there were no tests of that part of the software and hence no possibility of discovering the problem. During system testing, simulators of the inertial reference system computers were used. These did not generate the error as there was no requirement!

Review failure The design and code of all software should be reviewed for problems during the development process. Either the inertial reference system software was not reviewed because it had been used in a previous version, or the review failed to expose the problem and the fact that the test coverage would not reveal it. The review also failed to appreciate the consequences of system shutdown during a launch.

Lessons learned As well as testing for what the system should do, you may also have to test for what the system should not do (safety verification). Do not have a default exception handling response which is system shut-down in systems that have no fail-safe state.

Lessons learned In critical computations, always return best effort values even if the absolutely correct values cannot be computed. Wherever possible, use real equipment and not simulations. Improve the review process to include external participants and review all assumptions made in the code.
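
The "best effort value" lesson can be sketched as a saturating conversion. This is a minimal Python illustration under stated assumptions: the function name `to_int16_saturating`, the choice to map NaN to 0, and the returned validity flag are all hypothetical design choices for the example, not something taken from the Ariane software or the inquiry report.

```python
import math

INT16_MIN = -2**15      # -32768
INT16_MAX = 2**15 - 1   #  32767


def to_int16_saturating(value: float) -> tuple[int, bool]:
    """Return (best-effort 16-bit value, True if the input was in range).

    Out-of-range inputs are clamped to the nearest representable value
    instead of raising, so the caller always receives something usable.
    """
    if math.isnan(value):
        return 0, False  # assumption: 0 is the neutral fallback for NaN
    clamped = max(INT16_MIN, min(INT16_MAX, value))
    return int(clamped), clamped == value


if __name__ == "__main__":
    for reading in (1200.5, 40000.0, -50000.0):  # illustrative values only
        value, in_range = to_int16_saturating(reading)
        print(reading, value, "ok" if in_range else "degraded")
```

The flag lets downstream code record that the value is degraded while still continuing to operate, which is the alternative to a default shut-down in a system with no fail-safe state.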

Avoidable failure The designers of Ariane 5 made a critical and elementary error. They designed a system where a single component failure could cause the entire system to fail. As a general rule, critical systems should always be designed to avoid a single point of failure.

Avoidable failure (summary from many authors) This is more properly classified as a requirements error than as a programming error. It is not associated with the choice of a particular programming language. Rigorous formal verification up to the top-level requirements, together with rigorous requirements analysis, would have caught the error (model verification). The causal relations in the accident can be analyzed objectively by applying logic to the data given in the report, although this approach requires a strong stomach for formal logic. Conclusion: use formal methods in the system design.

Examples of bugs in other systems NASA Space Rover, Intel floating point, etc. (many reports in the “Software Hall of Shame”). Why so many bugs? Because behavior is hard to predict: a US F-16 flown by Israeli pilots over the Dead Sea (altitude below sea level); an air traffic control system moved from the US to the United Kingdom (it was not designed to deal with 0 degrees longitude along the Greenwich Meridian). Since behavior is hard to predict, verify the full state space by formal methods instead of testing only typical cases.
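
The difference between testing typical cases and covering the whole input space can be illustrated with a small Python sketch. Exhaustive enumeration over a tiny domain is only a stand-in for formal methods, not a proof technique in itself. The computation `scaled_velocity`, its scale factor, and both input ranges are hypothetical values chosen for the example.

```python
from typing import Callable, Iterable, Optional

INT16_MIN = -2**15      # -32768
INT16_MAX = 2**15 - 1   #  32767


def scaled_velocity(raw: int) -> int:
    """Hypothetical computation whose result must fit in 16 bits."""
    return raw * 4


def find_counterexample(
    fn: Callable[[int], int], domain: Iterable[int]
) -> Optional[int]:
    """Return the first input whose result leaves the 16-bit range, if any."""
    for x in domain:
        if not INT16_MIN <= fn(x) <= INT16_MAX:
            return x
    return None


if __name__ == "__main__":
    typical_cases = [0, 100, 5000]    # values seen on the old system (assumed)
    full_domain = range(0, 10001)     # everything the new system can produce (assumed)
    print(find_counterexample(scaled_velocity, typical_cases))  # None
    print(find_counterexample(scaled_velocity, full_domain))    # 8192
```

The typical cases all pass, while the sweep over the full assumed domain finds the first violating input. Real systems have state spaces far too large to enumerate, which is where model checking and proof tools come in.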

Additional reading (recommended but not required, see FMUOS Web pages). The Verified Software Initiative: A Manifesto. Tony Hoare (Microsoft Research), Jayadev Misra (Dept. of Computer Science, The Univ. of Texas at Austin), Gary T. Leavens (Dept. of Computer Science, Iowa State University), and Natarajan Shankar (SRI International Computer Science Laboratory). April 16, 2007.