Therac 25
Nancy Leveson: Medical Devices: The Therac-25 (updated version of IEEE Computer article), http://sunnyday.mit.edu/papers/therac.pdf


Therac 25 – Engineering issues
1. The failure occurred only when a particular nonstandard sequence of keystrokes was entered on the VT-100 terminal that controlled the PDP-11 computer: an "X" to (erroneously) select 25 MeV X-ray (photon) mode, followed by "cursor up", "E" to (correctly) select electron mode, then "Enter". This sequence was improbable, so the problem occurred rarely and went unnoticed for a long time.
2. The design had no hardware interlocks to prevent the electron beam from operating in its high-energy mode without the target in place.
3. The engineer had reused software from older models. Those models had hardware interlocks that masked their software defects, and the interlocks had no way of reporting that they had been triggered, so nothing indicated that the software was issuing faulty commands.

Therac 25 – Engineering issues
4. The hardware provided no way for the software to verify that sensors were working correctly (i.e. an open-loop controller). The table-position system was the first to be implicated in the Therac-25's failures; the manufacturer revised it with redundant switches to cross-check their operation.
5. The equipment control task did not properly synchronize with the operator interface task, so race conditions occurred if the operator changed the setup too quickly. This was evidently missed during testing, since it took some practice before operators could work quickly enough to trigger the problem. Here operator experience led to trouble rather than preventing it.
6. The software set a flag variable by incrementing it rather than by assigning it a fixed nonzero value. Occasionally an arithmetic overflow wrapped the flag back to zero, causing the software to bypass safety checks.
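The overflow in item 6 can be illustrated with a minimal sketch. It assumes a one-byte flag in which zero means "no check needed"; the function names are hypothetical, and the real machine ran PDP-11 assembly, not Python.

```python
def increment_flag(flag: int) -> int:
    """Mimic incrementing a one-byte variable: 255 + 1 wraps around to 0."""
    return (flag + 1) & 0xFF


def assign_flag(_flag: int) -> int:
    """Safer alternative: always store the same fixed nonzero value."""
    return 1


def passes_that_skip_check(update, passes: int) -> list[int]:
    """Return the pass numbers on which the flag reads zero.

    The flag is meant to be nonzero on every pass ("check still required").
    A zero reading is interpreted as "nothing to check", so the safety
    check would be skipped on exactly those passes.
    """
    flag = 0
    skipped = []
    for n in range(1, passes + 1):
        flag = update(flag)
        if flag == 0:
            skipped.append(n)
    return skipped


if __name__ == "__main__":
    print(passes_that_skip_check(increment_flag, 600))  # [256, 512]
    print(passes_that_skip_check(assign_flag, 600))     # []
```

With the incrementing version, every 256th pass sees a zero flag and would skip the check; the assigning version never does. The failure is rare and timing dependent, which is why testing can easily miss it.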

Therac 25 – Institutional issues
AECL did not have the software code independently reviewed.
AECL did not consider the design of the software during its assessment of how the machine might produce the desired results and what failure modes existed. These assessments form part of the general techniques known as reliability modeling and risk management.
The system noticed that something was wrong and halted the X-ray beam, but merely displayed the word "MALFUNCTION" followed by a number from 1 to 64. The user manual did not explain or even address the error codes, so the operator pressed the P key to override the warning and proceed anyway.
AECL personnel initially did not believe complaints.
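One way to picture the error-reporting problem is a sketch of what a more informative fault report might look like. Everything here is an assumption for illustration: the fault table, its descriptions, and the function names are hypothetical placeholders, not the Therac-25's actual codes or AECL's later fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fault:
    code: int
    description: str
    operator_may_proceed: bool


# Hypothetical fault table; the descriptions are placeholders, not real codes.
FAULTS = {
    1: Fault(1, "placeholder: minor setup mismatch", True),
    2: Fault(2, "placeholder: delivered dose outside expected range", False),
}

LOCKED = "treatment locked, call service"


def report(code: int) -> str:
    """Build a message that says what went wrong and what to do next."""
    fault = FAULTS.get(code)
    if fault is None:
        # Unknown codes are treated as unsafe rather than ignorable.
        return f"MALFUNCTION {code}: unknown fault; {LOCKED}"
    action = "check setup, then resume" if fault.operator_may_proceed else LOCKED
    return f"MALFUNCTION {code}: {fault.description}; {action}"


def may_proceed(code: int) -> bool:
    """A single keypress can override only faults explicitly marked as safe."""
    fault = FAULTS.get(code)
    return fault is not None and fault.operator_may_proceed


if __name__ == "__main__":
    for code in (1, 2, 64):
        print(report(code), "| override allowed:", may_proceed(code))
```

The design choice shown is that an undocumented or unknown code defaults to locking the treatment, so an operator cannot simply press a key to continue past a fault nobody has explained.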