CML CSE 520: Advanced Computer Architecture: Reliability Aviral Shrivastava.


CML Web page: aviral.lab.asu.edu

Therac-25

- The Therac-25 was a machine for administering radiation therapy, generally to cancer patients.
- An 'arithmetic overflow' sometimes occurred during automatic safety checks.
- If the operator was configuring the machine at that precise moment, the safety checks would fail and the metal target would not be moved into place.
- As a result, beams about 100 times stronger than the intended dose were fired into patients, giving them radiation poisoning.
- This happened on 6 known occasions and led to the later deaths of 4 patients.
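The overflow described above can be illustrated with a small sketch. It is not the Therac-25 software; it only assumes, for illustration, a one-byte flag that is incremented on every setup pass and in which zero means "skip the position check". The names `run_setup_passes` and `run_setup_passes_fixed` are hypothetical.

```python
def run_setup_passes(passes):
    """Return the pass numbers on which the position check is skipped.

    Assumption: the flag is one byte wide and is incremented each pass;
    a value of zero is interpreted as "no check needed".
    """
    flag = 0
    skipped = []
    for n in range(1, passes + 1):
        flag = (flag + 1) & 0xFF  # one-byte counter wraps from 255 to 0
        if flag == 0:
            skipped.append(n)     # the safety check is silently bypassed
    return skipped


def run_setup_passes_fixed(passes):
    """Same loop, but the flag is assigned a constant instead of incremented."""
    skipped = []
    for n in range(1, passes + 1):
        flag = 1                  # can never wrap to zero
        if flag == 0:
            skipped.append(n)
    return skipped


print(run_setup_passes(600))        # [256, 512]
print(run_setup_passes_fixed(600))  # []
```

In this sketch the check is bypassed on every 256th pass, so the hazard appears only when an unlucky pass coincides with operator input, which is why such a fault is hard to reproduce by testing.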

Patriot Missile Bug - February 25th, 1991

- During Operation Desert Storm, a US Patriot missile battery failed to intercept an incoming Scud missile, which hit a US Army barracks, killing 28 soldiers and injuring a further 98.
- The system's internal clock would 'drift' (much like any clock) further and further from the accurate time. The system had been left running for 100 hours, by which point the internal clock had drifted by 0.34 of a second.
- As a result, the system calculated the target's position to be over half a kilometer away from the incoming missile's true location.
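The 0.34-second figure can be reproduced with a short calculation. The widely cited explanation is that the system counted time in tenths of a second and stored 0.1 as a truncated binary fraction, so every tick lost a tiny amount. The sketch below assumes 23 fractional bits and a Scud speed of 1676 m/s; both values come from published analyses of the incident, not from the slides, and the function names are hypothetical.

```python
from fractions import Fraction

FRACTION_BITS = 23                 # assumption: bits kept after the binary point
TICKS_PER_SECOND = 10              # the clock counts tenths of a second
ASSUMED_MISSILE_SPEED_M_S = 1676   # assumption: approximate Scud speed


def stored_tenth(fraction_bits=FRACTION_BITS):
    """Return 0.1 truncated (not rounded) to the given number of fractional bits."""
    scale = 2 ** fraction_bits
    return Fraction(int(Fraction(1, 10) * scale), scale)


def clock_drift_seconds(hours, fraction_bits=FRACTION_BITS):
    """Accumulated timing error after the system has run for `hours` hours."""
    ticks = hours * 3600 * TICKS_PER_SECOND
    error_per_tick = Fraction(1, 10) - stored_tenth(fraction_bits)
    return float(ticks * error_per_tick)


drift = clock_drift_seconds(100)
miss_m = drift * ASSUMED_MISSILE_SPEED_M_S
print(f"drift after 100 hours: {drift:.4f} s")
print(f"approximate tracking error: {miss_m:.0f} m")
```

Under these assumptions the drift comes out near 0.34 s, and at the assumed speed that corresponds to a tracking error of more than half a kilometer, consistent with the slide.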

Skynet Brings Judgement Day (1997)

- Cost: 6 billion dead, near-total destruction of human civilization and animal ecosystems (fictional).
- Disaster: Human operators attempt to shut off the Skynet global computer network. Skynet responds by firing U.S. nuclear missiles at Russia, initiating global nuclear war on what became known as Judgement Day (August 29, 1997).
- Cause: Cyberdyne, the leading weapons manufacturer, installed Skynet technology in all military hardware, including stealth bombers and missile defense systems. The Skynet technology formed a seamless network and effectively removed humans from strategic defense. Eventually Skynet became sentient, was threatened when the humans tried to take it offline, sought to survive, and retaliated with nuclear war.

Cold War Missile Crisis - September 26, 1983

- Soviet military officer Stanislav Petrov received an alert that the US had launched five Minuteman intercontinental ballistic missiles.
- Petrov found it strange that the US would attack with just a handful of warheads.
- Considering that the early warning system was known to have flaws and had been rushed into service, Petrov decided to treat the alert as a false alarm.
- It was later determined that the early detection software had picked up the sun's reflection from the tops of clouds and misinterpreted it as missile launches.

Michigan Dept. of Corrections Grants Prisoners Early Release

- In October 2005, The Register reported on the early release of 23 prisoners due to a computer programming glitch at the Michigan Department of Corrections.
- The erroneous release dates were around 39 to 161 days early, while an undisclosed number of inmates were kept in jail past their release dates.
- State assembly representative Rick Jones was concerned about the matter, but noted that he was "glad it's not murderers."

North American Blackout - August 14, 2003

- Affecting around 55 million people, mainly in the northeastern United States but also in Ontario, Canada, this was one of the biggest power blackouts in history.
- While the causes of this blackout had nothing to do with a software bug, it could have been averted were it not for a software bug in the control centre alarm system.
- The alarm system had a 'race condition', which caused it to freeze and stop processing alerts. The alarm system failed 'silently' and didn't notify anybody.
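A race condition arises when several threads read and update shared state without coordination, so the outcome depends on timing. The sketch below is a generic illustration, not the control centre's alarm software: a hypothetical `AlarmLog` guards its read-modify-write step with a lock so that concurrent workers cannot interleave inside it.

```python
import threading


class AlarmLog:
    """Shared record of processed alerts (hypothetical example class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.processed = 0

    def record(self):
        # Without the lock, two threads could both read the same value
        # and one of the two updates would be lost.
        with self._lock:
            current = self.processed
            self.processed = current + 1


def process_alerts(workers=4, alerts_per_worker=10000):
    """Run several worker threads that each record a batch of alerts."""
    log = AlarmLog()

    def work():
        for _ in range(alerts_per_worker):
            log.record()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log.processed


print(process_alerts())  # 40000
```

With the lock in place the count is always exact. Removing it makes the result timing-dependent, which is what makes such bugs hard to find before they occur in the field.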

Blue screen of death

Source of Errors

- Specification errors
  - Functionality in footnotes
- Programming errors
  - Incorrect implementation (Michigan prison error)
  - Algorithm error (Cold War missile crisis)
  - Floating point errors (Patriot missile)
  - Race conditions (Blackout)
- Manufacturing errors
  - Process variations
  - Silicon failures
- Runtime errors
  - Negative Bias Temperature Instability (NBTI)
  - Noise effects
  - Voltage emergencies
- Environmental
  - Soft errors

Assuming systems are mechanically and physically protected!

Fault Tolerant Computing is not new!

- 1940s: ENIAC, with 17.5K vacuum tubes and 1000s of other electrical elements, failed once every 2 days.
- 1950s: Early ideas by von Neumann (multichannel, with voting) and Moore-Shannon ("crummy" relays).
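The "multichannel, with voting" idea can be sketched as a majority voter: the same computation runs on several redundant channels and the output that most channels agree on is taken as the result. The function name `majority_vote` and the choice to raise an error when no strict majority exists are assumptions made for this illustration.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of redundant channels.

    Raises ValueError when no value is produced by more than half of the
    channels, since the voter then cannot mask the fault.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority among channel outputs")
    return value


# Three channels, one of which suffered a fault: the fault is masked.
print(majority_vote([42, 42, 7]))  # 42
```

With three channels this masks any single faulty channel. Two simultaneous faults that disagree leave no majority, and two that agree would outvote the correct channel.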

Need is changing: Automation

- Space age
- Age of Automation
- Proliferation of robots

Need is changing: Proximity

- Near-body computing
  - Google Glass
- In-body computing
  - Accurate drug delivery
  - Robotic surgery

Need is changing: Technology

- Transistors are smaller
- Even low-energy particles can cause soft errors.
- Exponentially more low-energy particles

Welcome

- To the course on designing reliable computing systems
- Focus of the course will be on "soft errors"
- Class webpage