Fault Tolerant Computing

Presentation transcript:

ARIANE 5 FLIGHT 501 FAILURE
Garimella Sudheer
EE 585: Fault Tolerant Computing

INTRODUCTION
Ariane 5 is a launch vehicle developed by the European Space Agency. Its maiden flight, the first of the Ariane 5 class, was launched from Kourou, French Guiana, on 4 June 1996. About 37 seconds into the flight, at an altitude of about 3700 m, the launcher failed and exploded. An enquiry board subsequently submitted its report on the causes of the failure.

General Description
Weather conditions at launch time were acceptable. The launch was initiated at 09h 33mn 59s local time, and the launcher behaved nominally for about 36 seconds. The back-up inertial reference system then failed, followed by the active inertial reference system, and this led to the destruction of the launcher. Self-destruction was correctly triggered by the rupture of the links between the solid boosters and the core stage. Destruction occurred near the launch pad at an altitude of approximately 4000 m, and debris was scattered over an area of about 12 square kilometers east of the launch pad.

ANOMALIES OBSERVED
One major anomaly was the gradual development, starting at H0+22 seconds, of variations in the hydraulic pressure of the actuators of the main engine nozzle. These variations had a frequency of approximately 10 Hz.

Image of the launch.

Debris of the launcher after the explosion.

ANALYSIS OF FAILURE
The flight control system of Ariane 5 is of a standard design. The attitude of the launcher and its movements in space are measured by an Inertial Reference System (SRI). The SRI has its own internal computer, in which angles and velocities are calculated on the basis of information from a "strap-down" inertial platform with laser gyros and accelerometers.

ANALYSIS OF FAILURE (CONT)
The data from the SRI are transmitted through the data bus to the on-board computer (OBC), which executes the flight program and controls the nozzles of the solid boosters and the cryogenic Vulcain engine via servovalves and hydraulic actuators. The design of the SRI used in Ariane 5 is almost identical to that of Ariane 4.
At 36.7 seconds after H0 (approx. 30 seconds after lift-off), the computer within the back-up inertial reference system, which was working on stand-by for guidance and attitude control, became inoperative. This was caused by an internal variable related to the horizontal velocity of the launcher exceeding a limit which existed in the software of this computer.
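The failure of one inertial unit after the other can be illustrated with a small sketch of hot-standby redundancy. It assumes, as the enquiry report describes, that both units run the same software with the same internal limit; the class, the function names, and the limit of 100.0 are hypothetical and chosen only for illustration. The point is that redundancy of identical units protects against independent hardware faults, but not against a shared software fault.

```python
class InertialUnit:
    """Hypothetical stand-in for one SRI; `limit` models the software's internal limit."""

    def __init__(self, name: str, limit: float) -> None:
        self.name = name
        self.limit = limit
        self.operative = True

    def read(self, horizontal_velocity: float) -> float:
        if not self.operative:
            raise RuntimeError(f"{self.name} is inoperative")
        if abs(horizontal_velocity) > self.limit:
            # The unit shuts itself down when the internal limit is exceeded.
            self.operative = False
            raise RuntimeError(f"{self.name}: internal variable exceeded limit")
        return horizontal_velocity


def read_with_failover(units, horizontal_velocity):
    """Try each unit in order; return (unit name, value), or None if all fail."""
    for unit in units:
        try:
            return unit.name, unit.read(horizontal_velocity)
        except RuntimeError:
            continue
    return None


# Both units share the same (hypothetical) limit, so the same input defeats both.
units = [InertialUnit("active", 100.0), InertialUnit("back-up", 100.0)]
print(read_with_failover(units, 50.0))   # within the limit: the active unit answers
print(read_with_failover(units, 150.0))  # beyond the limit: both units fail
```

Because the two units are identical, switching to the other unit cannot help: the input that stops one stops the other as well.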

ANALYSIS OF FAILURE (CONT)
Software and its specifications from Ariane 4 were reused in Ariane 5, but the Ariane 5 flight path was different and produced values beyond the range for which the code had been designed. The disintegration resulted from a software exception in the SRI, not in the on-board computer itself. The exception was raised during the conversion of a 64-bit floating-point number to a 16-bit signed integer: the floating-point value was greater than what a 16-bit signed integer can represent, and the result was an operand error. The data conversion instructions (in Ada code) were not protected from causing operand errors, although other conversions of comparable variables in the same place in the code were protected.
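The difference between an unprotected and a protected conversion can be sketched as follows. This is an illustration in Python, not the flight code (which was written in Ada): the function names are hypothetical, the raised OverflowError merely stands in for the operand error, and clamping is only one possible form of protection.

```python
INT16_MIN = -(2 ** 15)   # -32768
INT16_MAX = 2 ** 15 - 1  # 32767


def to_int16_unprotected(value: float) -> int:
    """Convert to a 16-bit signed integer; raise if the value does not fit.

    The OverflowError plays the role of the operand error.
    """
    result = int(value)  # truncates toward zero
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in a 16-bit signed integer")
    return result


def to_int16_protected(value: float) -> int:
    """Convert to a 16-bit signed integer, clamping out-of-range values."""
    clamped = max(float(INT16_MIN), min(float(INT16_MAX), value))
    return int(clamped)


print(to_int16_protected(40000.0))  # clamped to INT16_MAX instead of raising
```

Whether clamping, substituting a best estimate, or flagging the value as invalid is the right protection depends on how the result is used downstream; the sketch only shows that the out-of-range case has to be handled somewhere.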

COMMENTS
The primary cause of the failure was an operand error when converting the horizontal bias variable BH, together with the lack of protection of this conversion, which caused the SRI computer to stop. Specifically, a 64-bit floating-point number relating to the horizontal velocity of the rocket with respect to the platform was converted to a 16-bit signed integer. The number was larger than 32,767, the largest value a 16-bit signed integer can hold, so the conversion failed.
BH is the result of an internal alignment function. It is related to the horizontal velocity sensed by the platform and is calculated as an indicator of alignment precision over time. Its value was much higher than expected because the early part of the Ariane 5 trajectory differs from that of Ariane 4 and results in considerably higher horizontal velocity values.
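The 32,767 boundary can be checked directly with Python's standard `struct` module, which refuses to pack an integer that does not fit the requested width. The helper name `fits_in_int16` is hypothetical, and the sample values are chosen only to sit on either side of the boundary; they are not flight data.

```python
import struct


def fits_in_int16(value: float) -> bool:
    """Return True if the truncated value can be stored as a 16-bit signed integer."""
    try:
        struct.pack(">h", int(value))  # ">h" is a big-endian 16-bit signed integer
    except struct.error:
        return False
    return True


for sample in (32767.0, 32768.0):
    print(sample, fits_in_int16(sample))
```

The first value is the largest that fits; the next whole number already does not.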

OTHER WEAKNESSES
The review by the enquiry board also covered the following areas:
- The design of the electrical system
- Embedded on-board software in subsystems other than the inertial reference system
- The on-board computer and the flight program software

Lessons For Software Engineering
- Test!
- Try to write code so that it cannot fail.
- Don't allow errors or exceptions to propagate in an uncontrolled manner.
- Reused code still needs to be tested.
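Two of these lessons can be sketched in a few lines: containing an exception instead of letting it stop the whole unit, and testing a reused routine against the new operating range. All names here (`legacy_convert`, `guarded_reading`, `failing_inputs`) and all sample values are hypothetical; falling back to the last good value is one possible policy, not the one prescribed by the enquiry board.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def legacy_convert(value: float) -> int:
    """Hypothetical reused routine: valid only for 16-bit signed results."""
    result = int(value)
    if not -32768 <= result <= 32767:
        raise OverflowError("value does not fit in 16 bits")
    return result


def guarded_reading(
    read_sensor: Callable[[], float],
    last_good: Optional[float],
) -> Tuple[Optional[float], bool]:
    """Return (value, is_fresh).

    On an arithmetic error, fall back to the last good value and mark it stale.
    """
    try:
        return read_sensor(), True
    except ArithmeticError:  # OverflowError is a subclass of ArithmeticError
        return last_good, False


def failing_inputs(
    convert: Callable[[float], int],
    samples: Iterable[float],
) -> List[float]:
    """Return the samples for which the reused routine raises an arithmetic error."""
    failures = []
    for sample in samples:
        try:
            convert(sample)
        except ArithmeticError:
            failures.append(sample)
    return failures


# Re-test the reused routine over the new (hypothetical) operating range.
print(failing_inputs(legacy_convert, [0.0, 20000.0, 50000.0]))
```

Running the reused routine over inputs drawn from the new trajectory, rather than the old one, is what exposes the out-of-range case before flight.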

REFERENCES
- ARIANE 5 Flight 501 Failure, report by the enquiry board. Chairman of the board: Prof. J. L. Lions.
- http://www.arianespace.com
- http://www.vuw.ac.nz/staff/stephen_marshall/SE/Failures/SE_Ariane.html
- http://www-aix.gsi.de/~giese/swr/ariane5.html