Reliability Tony Massihi Etan Halberg Jacob Hakak
Software Engineering A discipline that focuses on producing software using certain tools and methodologies. It follows a four-step process: Specification: Defining the functions needed. Development: Producing the software. Validation: Testing the software. Evolution: Modifying the software to meet the changing needs of the customer.
Software Engineering Most organizations use CASE (Computer-assisted software engineering) tools to support the process of developing and documenting a more detailed design. Another good approach to software engineering is using object-oriented design.
Software Engineering These standards have led to better software quality over the years, but in order to stay competitive companies must release products quickly. Many companies feel a tension between meeting tight deadlines and strictly following software engineering methodologies.
Software Warranties Shrinkwrap warranties: Software such as Microsoft Word comes with a limited warranty saying that the software will do what the manual says it will do, backed by a 90-day replacement or money-back guarantee. Warranties for games promise that the original media is free from defects and that you will be able to install the game; they also run for 90 days.
Problems Most stores will not give a full refund for opened software, even though the license agreement is inside the box and cannot be read until the box is opened. Vendors are willing to give you a full refund if the software will not install, but will not accept liability if your business is harmed because their software crashed at the wrong time.
Court Cases Step-Saver Data Systems v. Wyse Technology & The Software Link Step-Saver sold timesharing computer systems with Wyse terminals and an OS by The Software Link (TSL). Step-Saver purchased and resold 142 copies of the Multilink Advanced OS provided by TSL.
Step-Saver v. Wyse & TSL When Step-Saver called TSL to purchase the OS, the TSL sales rep. said that the OS was compatible with most DOS applications. The software did not work properly and all three companies together were not able to solve the problems. Therefore, Step-Saver sued Wyse and TSL.
Step-Saver v. Wyse & TSL The U.S. Court of Appeals ruled in favor of Step-Saver because the president of Step-Saver never signed a document formalizing the licensing agreement. The court reasoned that the invoice and the sales rep’s oral statements constituted the contract.
Kantian Analysis Every software company produces a license agreement to state the terms that the customer agrees to when buying the software. TSL did not care whether the document was signed; it just wanted the business, which defeats the point of having a licensing agreement.
Utilitarian Analysis Not ethical. Negatives outweigh the positives. Negatives: TSL sold software with promises that weren’t fulfilled. Step-Saver was sued by 12 of its customers. TSL didn’t care if license agreement was signed before selling many copies. Positives: Wyse and TSL tried to fix the problems.
Social Contract Analysis Companies have the right to state terms to which the customer must agree when using software. If the terms are not agreed to, the courts fall back on Article 2 of the UCC, under which a contract was formed by the purchase order, the invoice, and the oral statements from the sales rep.
ProCD V. Zeidenberg ProCD created a computer database containing info from more than 3000 telephone directories. They created an application called SelectPhone that lets you search the database for records. They included a license agreement prohibiting commercial use of the database and the program; the agreement was displayed every time the program was run.
ProCD V. Zeidenberg Matthew Zeidenberg formed a company called Silken Mountain Web Services and he resold the info in the SelectPhone database. Zeidenberg argued that the license wasn’t printed on the outside of the box so he shouldn’t be liable. The court ruled in favor of ProCD.
Ethical Analysis Kantian: Not ethical. ProCD fulfilled its duty of informing the customer of the terms to which both must agree when using the product. Utilitarian: Not ethical. Zeidenberg reproduced someone else’s work. Social contract: Zeidenberg violated ProCD’s right to intellectual property by taking its work.
Mortenson v. Timberline Mortenson is a national construction contractor and they purchased copies of a bidding package called Precision Bid Analysis from Timberline. Mortenson used this to prepare a bid and on the day the bid was due, the software malfunctioned. It printed the message “Abort: Cannot find alternate” 19 times. Mortenson continued to use the software and submitted the bid it produced. Mortenson discovered that its bid was $1.95 million too low.
Mortenson v. Timberline Timberline had been aware of the bug since May 1993. It fixed the bug and sent newer versions to some of the customers who had encountered it, but not to Mortenson.
Mortenson v. Timberline Timberline argued that the license agreement limited the consequential damages that Mortenson can recover from them. The King County Superior Court ruled in favor of Timberline.
UCITA The Uniform Computer Information Transactions Act (UCITA) is a proposed amendment to Article 2 of the UCC (Uniform Commercial Code), which governs the sale of products in the U.S. It was proposed after the ruling against The Software Link, with the idea that software cannot always be bug free.
UCITA States Manufacturers may license software to customers for a period of time. Manufacturers may prevent the transfer of software from one person to another. Manufacturers may disclaim all liability for defects; customers must accept the software “as is.”
UCITA Continued Manufacturers may remotely disable licensed software in case of a license dispute. Manufacturers may collect info about how licensees use their computers. UCITA applies to software in computers but not to embedded systems such as PDAs and cell phones.
Arguments Supporting UCITA If we want a vital software industry, we need to accept that software is not going to have the same reliability as physical products. UCITA helps prevent fraud: if a customer purchases a license to use the software for a certain period of time, the manufacturer can include code that makes the software unusable after the license has expired.
Arguments Supporting UCITA If the license allows the software to be run on a certain number of computers, then the software can include features that make it impossible to run it on more machines than specified.
Arguments Against UCITA If you license a piece of software and don’t need it anymore, you can’t legally give it away to someone else. Allowing companies to sell software “as is” conflicts with the Magnuson-Moss Act, which Congress passed in 1975 to protect consumers. It prevented manufacturers from putting unfair warranties on products costing more than $25.
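The two enforcement mechanisms named in these arguments (time-limited licenses and a cap on the number of machines) can be sketched in a few lines. This is a minimal illustration, not any vendor's actual scheme: the `License` class, its fields, and the machine identifiers are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class License:
    """Hypothetical license record: an expiry date and a machine limit."""
    expires: date
    max_machines: int
    activated: set = field(default_factory=set)

    def can_run(self, machine_id: str, today: date) -> bool:
        # Time-limited license: refuse to run after the expiry date.
        if today > self.expires:
            return False
        # A machine that was already activated may keep running.
        if machine_id in self.activated:
            return True
        # Machine limit: refuse any activation beyond the licensed count.
        if len(self.activated) >= self.max_machines:
            return False
        self.activated.add(machine_id)
        return True


lic = License(expires=date(2025, 12, 31), max_machines=2)
print(lic.can_run("pc-a", date(2025, 6, 1)))   # first machine
print(lic.can_run("pc-b", date(2025, 6, 1)))   # second machine
print(lic.can_run("pc-c", date(2025, 6, 1)))   # third machine, over the limit
print(lic.can_run("pc-a", date(2026, 1, 1)))   # after expiry
```

The same checks are what critics object to: the customer's ability to run the software depends entirely on state the manufacturer controls.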
Arguments Against UCITA The Magnuson-Moss Act also made it economically feasible for consumers to bring warranty suits by allowing courts to award attorneys’ fees. Consumers see the warranty before the software is installed when they click the I accept button. Once the warranty is accepted and the program is run, it cannot be returned, even though one still does not know if the software works properly.
Arguments Against UCITA There won’t be a uniform law across every state; Maryland and Virginia have passed different versions of the law.
Moral Responsibility of Software Manufacturers Manufacturers rely on consumers to help them identify bugs. They could find these bugs themselves if they hired more testers, but this would result in higher prices and longer development times. Relying on consumers can be defended on utilitarian grounds if the positives outweigh the negatives: with more in-house testing there would be fewer products at higher prices, although the software would be more reliable.
Computer Reliability “The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong it usually turns out to be impossible to get at or repair.” -Douglas Adams
Forethoughts on Reliability Are humans, in general, reliable? Are computer systems, in general, reliable? Is the reliability of a computer system a function of the reliability of its maker? If the maker is flawed, how can his or her creation be flawless?
Data-Entry/Data-Retrieval A computer database is a structured collection of records or data that is stored in a computer system so that a computer program or person using a query language can consult it to answer queries. The records retrieved in answer to queries are information that can be used to make decisions. Examples of databases and query languages: ● Dbase, MySQL, Oracle, PostgreSQL ● SQL, CQL, OQL, Datalog
How can data cause a system to fail? Software related ● Programming errors ● Poor programming practices Non-software related ● Missing, incorrect, inconsistent, or otherwise bad data
Data-Entry/Data-Retrieval Errors: Cause and Effect Mild annoyances ● Human error: John Q Smartguy at the bank entered your address wrong. As a result, your credit card bills are sent to the wrong address. ● Computing error: A table column in a database stores your account number. The ATMs’ new software selects the wrong table column to check. As a result, ATM cards no longer work or, worse, access incorrect records.
Data-Entry/Data-Retrieval Errors: Cause and Effect Moderate problems ● National Crime Information Center (NCIC) and faulty database records ● Disqualification due to database records (November 2000 Florida general election; background checks on employees) ● False arrests due to misinterpreted, incorrectly entered, or otherwise false information (Terry Dean Rogan, Roberto Hernandez)
Data-Entry/Data-Retrieval Errors: Cause and Effect Severe misinterpretation of data ● An Iraqi Scud missile hit a base in Dhahran and killed 28 US soldiers in Feb 1991. It was recognized by radar but dismissed due to incorrect data.
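The "computing error" bullet can be made concrete with a small sketch using Python's built-in `sqlite3` module. The table layout, column names, and sample rows are invented for illustration; the `column` parameter exists only to let the example switch between the correct lookup and the buggy one.

```python
import sqlite3


def build_db():
    """Create an in-memory table with two made-up accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (account_number TEXT, card_number TEXT, owner TEXT)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?)",
        [("1001", "2002", "Alice"), ("2002", "3003", "Bob")],
    )
    return conn


def owner_for_card(conn, card_number, column="card_number"):
    """Look up the owner of a card; `column` selects which column is checked."""
    if column not in ("card_number", "account_number"):
        raise ValueError("unknown column")
    row = conn.execute(
        f"SELECT owner FROM accounts WHERE {column} = ?", (card_number,)
    ).fetchone()
    return row[0] if row else None


conn = build_db()
print(owner_for_card(conn, "2002"))                    # correct column
print(owner_for_card(conn, "2002", "account_number"))  # wrong column, wrong person
print(owner_for_card(conn, "3003", "account_number"))  # wrong column, card rejected
```

With the wrong column, the same card number either matches nothing (the card stops working) or happens to match another customer's account number (the card opens the wrong record), which are the two outcomes the slide describes.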
Analysis of NCIC Records Should the US government take responsibility for the accuracy of the information stored in the NCIC database? ● Privacy Act of 1974 ● FBI not required to ensure accuracy ● Many agencies enter information ● Accuracy checks would hinder functionality of database with regard to criminal investigations.
The Question of Ethics Is it ethical for individuals from these agencies, or the agencies themselves, to enter information into a national database without checking whether or not it is accurate and correct?
Software and Billing Errors Even if the data entered into a computer is correct, and the manner in which it is retrieved is correct, errors can still occur when that data is manipulated. We've already briefly touched on software errors and have seen a short example of a billing issue involving data entry. Let's take a look at how and why faulty software, and not only faulty data, can affect billing and other processes.
System Malfunctions Qwest billing software malfunction ● $57,346 phone bill USDA beef prices ● $15-$20 million loss for beef producers US Postal Service ● 50,000 pieces of mail returned to sender The car with a mind of its own ● BMW on-board computer crash
System Failures LA County/USC Medical Center ● Backlogging of new lab computer system Air Traffic Control System – Japan ● 4 hour system down; delays/cancellations Chicago/London Trade/Exchange ● Hour long trade suspension, multiple times Comair (sub. Delta) ● Crew assignment system failure
Postal Service Article: http://query.nytimes.com/gst/fullpage.html?res=9805EFDE133EF93AA3575BC0A960958260&n=Top/News/Business/Small%20Business/Innovation Beef Price Article: http://www.beefusa.org/NEWSUSDAReportingErrorResultsin$42-54MillionLosstoCattleIndustry4134.aspx Car With a Mind of Its Own Article and followup: http://aardvark.co.nz/daily/2003/n051301.shtml http://www.microsoft.com/presspass/press/2002/mar02/03-04bmwpr.mspx Comair Cancellations Article: http://www.usatoday.com/travel/flights/delays/2004-12-25-comair-cancels-flights_x.htm
The Question of Ethics, revisited A specific example: Amazon.com, UK ● iPaq handheld computers listed at £7 ● Actual price, £275 Amazon refused delivery unless buyers paid the difference, citing their Pricing and Availability Policy. Focus ● Amazon's refusal to fill orders ● Customers' bids
Kantianism v. Utilitarianism Kantianism ● In the end it would result in higher prices and tend away from the greater good. ● Unethical for consumers to assume it was just a 'really good sale' therefore they were not acting in good faith. Utilitarianism ● Unethical because if this behavior was acceptable prices would increase; costs outweigh the benefits.
Increasingly Complex Systems ● fully or partially controlled by computers ● embedded systems – a computer used as a component of a larger system ● real-time systems – computers that process data from sensors as events occur
Notable System Failures Patriot Missile ● floating point variable stored values with insufficient precision Ariane 5 ● satellite launch vehicle, 64-bit floating point value converted to 16-bit signed int Mars Orbiter and Polar Lander ● Orbiter: English vs metric units ● Lander: landing gear sensor passed incorrect signal value Denver International Airport ● software project nightmare
What can be done? Unfortunately, most of these problems must be solved on a case-by-case basis. There is no tried-and-true method for ensuring the reliability of all the software and hardware that a system is composed of. Good programming practices and well-educated users are the way to go.
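The first two failures on this slide are numeric, and a short sketch shows how small the underlying mistakes are. The specific figures (a 0.1-second clock tick, 23 fractional bits, 100 hours of uptime) come from widely reported accounts of the Patriot failure rather than from this presentation, so treat them as assumptions. The function names are hypothetical.

```python
from fractions import Fraction

INT16_MIN, INT16_MAX = -32768, 32767


def patriot_clock_drift(hours, fraction_bits=23):
    """Seconds of drift when 0.1 s ticks are counted with a chopped binary 0.1.

    Assumes the tick length is stored with `fraction_bits` bits after the
    binary point and truncated (not rounded), as commonly reported.
    """
    exact_tick = Fraction(1, 10)
    scale = 2 ** fraction_bits
    stored_tick = Fraction(int(exact_tick * scale), scale)  # chop extra bits
    ticks = hours * 3600 * 10                               # ten ticks per second
    return float((exact_tick - stored_tick) * ticks)


def checked_to_int16(value):
    """Convert a float to a 16-bit signed integer, refusing values that do not fit."""
    n = int(value)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return n


print(round(patriot_clock_drift(100), 4))  # drift in seconds after 100 hours
print(checked_to_int16(1234.0))            # fits
try:
    checked_to_int16(40000.0)              # too large for 16 bits
except OverflowError as err:
    print("conversion failed:", err)
```

A drift of about a third of a second is enough to misplace a fast-moving target. In the Ariane 5 case the point is that an out-of-range conversion has to be anticipated: either the value is range-checked before conversion or the resulting error is handled by the caller.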
Computer Simulations Uses of simulation Validating simulations
Uses of Simulations Simulations can never completely replace physical experiments. Practical use of simulations: ● To lower monetary or time cost of laboratory experiments (pharmaceutical design, car crashes) ● Ethics of a non-simulated experiment are in question (medical devices, crashing cars with real people) ● Often experiments are impractical (How long will it take before the world runs out of oil?) ● Simulations can be used to model past events ● Understand world around us ● Predict the future
Crash Test Simulation 3 different water molecule simulations, progression of technology
Safety Simulation ● Crash Recreation Simulation from YouTube ● Space Shuttle Landing
Water Molecule Simulations ● Models before computers ● Simple Computer Models – water molecules in motion ● Complex Computer Model – water movement through permeable membrane
Validating Simulations Validation: Does the model accurately represent the real system? Verification: Does the software correctly implement the model? Validation methods ● Make prediction, wait to see if it comes true ● Predict the present from old data ● Test credibility with experts and decision makers
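The second validation method, predicting the present from old data, is often called backtesting. The sketch below fits a straight line to the older part of a series and checks how well it predicts the most recent points. The straight-line model, the tolerance, and the sample numbers are all made up for illustration; a real simulation would substitute its own model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def backtest(xs, ys, holdout, tolerance):
    """Fit on all but the last `holdout` points, then predict those points.

    Returns (passed, worst_error), where passed means every held-out
    prediction is within `tolerance` of the observed value.
    """
    slope, intercept = fit_line(xs[:-holdout], ys[:-holdout])
    errors = [
        abs(slope * x + intercept - y)
        for x, y in zip(xs[-holdout:], ys[-holdout:])
    ]
    worst = max(errors)
    return worst <= tolerance, worst


years = [0, 1, 2, 3, 4, 5]
steady = [1, 3, 5, 7, 9, 11]     # trend continues into the held-out years
shifted = [1, 3, 5, 7, 20, 30]   # trend breaks in the held-out years
print(backtest(years, steady, holdout=2, tolerance=1.0))
print(backtest(years, shifted, holdout=2, tolerance=1.0))
```

A model that fails the backtest has not earned trust for future predictions. Passing it is necessary but not sufficient, which is why the slide lists it alongside expert review.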
Therac-25 ● Genesis of the Therac-25 ● Chronology of Accidents and AECL Responses ● Software Errors ● Post Mortem ● Moral Responsibility of the Therac-25 Team “The Rack” – Torture device used in the Middle Ages, Tower of London 1
Genesis of the Therac-25 ● Predecessors to the Therac-25, the Therac-6 and Therac-20, were built by AECL and CGR ● Therac-6 and Therac-20 incorporated some software ● Previous designs by CGR incorporated no software 1 ● Therac-25 built by AECL ● Minicomputer used as integral part of system ● Processes controlled with assembly software ● Hardware safety features replaced with software ● Based on designs of the Therac-6 ● Also reused code from Therac-20 ● First models: 40 errors per day was not unusual
AECL and the FDA ● Atomic Energy of Canada, Limited: a “Crown Corporation”, state controlled, similar to a government agency ● Therac-25 was FDA approved 3 ● Opportunity for approximately 6 years of testing ● Only 2700 hours testing integrated system ● Limited software documentation ● Minimal software testing ● A single programmer wrote all source code for the Therac-25 ● He left AECL in 1986, after some accidents had occurred ● Limited information about his background
Chronology of Accidents and AECL Responses ● Marietta, Georgia (June 1985): Crippling injuries from radiation overdose ● Hamilton, Ontario (July 1985): Patient died 2 months after radiation overdose ● First AECL investigation (July-Sept. 1985): Could not reproduce overdose ● Yakima, Washington (December 1985): Permanent scars and disabilities ● Tyler, Texas (March 1986): Died from complications after five months; real-time video and audio monitor was not functioning ● Second AECL investigation (March 1986) ● Tyler, Texas (April 1986): Patient died after 3 weeks due to massive overdose to brain ● Yakima, Washington (January 1987): Patient died 3 months later ● FDA declares Therac-25 defective (February 1987): Solutions relied on additional hardware locks ● Two initial investigations failed to detect any problems because of poor design and lack of documentation
Software Errors RACE CONDITION: A race condition or race hazard is a flaw in a system or process whereby the output of the process is unexpectedly and critically dependent on the sequence or timing of other events. The term originates with the idea of two signals racing each other to influence the output first. 4
Race Conditions Two race conditions in Therac-25 software: ● Command screen editing ● Movement of electron beam gun. These two activities were “racing” with each other. If command screen editing occurred while the electron beam gun was moving, the software would not recognize the changes in input. If the treatment mode entry was changed from an “X” to an “E” during the 8-second movement, the change would not take effect.
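The missed-edit behavior described on this slide can be imitated with a deterministic toy model. This is not the Therac-25 code; the function, the tick-based timing, and the `recheck` option are hypothetical, and the direction of the mode edit does not matter for the sketch. The hardware setup takes a snapshot of the mode when it starts and runs for a fixed number of ticks, while the operator may edit the screen in the meantime.

```python
def run_setup(initial_mode, edits, setup_ticks=8, recheck=False):
    """Simulate hardware setup racing against operator edits.

    `edits` maps a tick number to the mode the operator types at that tick.
    Returns (screen_mode, hardware_mode) once setup has finished.
    """
    screen_mode = initial_mode
    hardware_mode = screen_mode          # snapshot taken when setup starts
    tick = 0
    remaining = setup_ticks
    while remaining > 0:
        screen_mode = edits.get(tick, screen_mode)  # operator edit, if any
        tick += 1
        remaining -= 1
        # Safer variant: compare against the screen before declaring the
        # setup complete, and start over if the operator changed the mode.
        if recheck and remaining == 0 and hardware_mode != screen_mode:
            hardware_mode = screen_mode
            remaining = setup_ticks
    return screen_mode, hardware_mode


# Operator corrects the mode three ticks into the eight-tick setup.
print(run_setup("X", {3: "E"}))                # screen and hardware disagree
print(run_setup("X", {3: "E"}, recheck=True))  # edit is noticed and applied
```

In the first call the screen shows one mode while the hardware is configured for another, and nothing reports the mismatch. The second call shows one possible remedy, re-validating the input after the slow operation completes.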
Race Conditions and Parallel Programming “…There is no efficient algorithm that can help detect race conditions in a program. As such, there are no easy-to-use pedagogical tools.” 5 “The most often encountered race conditions are data race conditions. A data race condition is caused by unordered concurrent accesses of the same memory location from multiple threads. Less frequent but harder to find are general race conditions. A subtle general race condition often occurs at a transitory state due to the undetermined program execution order of multiple concurrent events that have data conflicts.” 6
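A data race of the kind described in the second quotation can be shown without relying on thread timing. In the sketch below, the unsynchronized read-modify-write is split into two steps with a generator so that the harmful interleaving can be replayed on demand; the locked version uses real threads and `threading.Lock`. All names are hypothetical and chosen for the example.

```python
import threading


def unsafe_increment(shared):
    """Read-modify-write split in two so the interleaving can be forced."""
    value = shared["count"]        # step 1: read
    yield                          # another thread may run here
    shared["count"] = value + 1    # step 2: write back a stale value


def lost_update():
    """Replay the bad interleaving: both 'threads' read before either writes."""
    shared = {"count": 0}
    a, b = unsafe_increment(shared), unsafe_increment(shared)
    next(a)                        # A reads 0
    next(b)                        # B reads 0
    for gen in (a, b):
        try:
            next(gen)              # each writes 1
        except StopIteration:
            pass
    return shared["count"]


def locked_count(n_threads=4, n_increments=1000):
    """Same increment with real threads, made atomic with a lock."""
    shared = {"count": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(n_increments):
            with lock:             # only one thread may read-modify-write
                shared["count"] += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["count"]


print(lost_update())   # two increments, but one is lost
print(locked_count())  # every increment is counted
```

The forced interleaving makes the bug reproducible. In a real program the same interleaving occurs only occasionally, which is why the quoted authors describe race conditions as hard to detect.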
Post Mortem ● AECL focused on fixing individual bugs ● “Most accidents are system accidents; that is, they stem from complex interactions between various components and activities.” 7 ● Therac-25 was not fail-safe ● Fundamental design flaws: no devices to report overdoses; operator could not directly monitor or observe patient ● Similar to Milgram Experiments at Yale in the 1960s ● Software lessons: race condition debugging; simplicity in design; documentation is crucial; reusing code can lead to potential errors ● AECL did not communicate fully with customers
Moral Responsibility of the Therac-25 Team Conditions for moral responsibility ● Causal condition: The actions (or inactions) of the agent must have caused the harm ● Mental condition: The actions (or inactions) must have been intended or willed by the agent ● The designers of the Therac-25 did not intend to cause lethal overdoses ● The moral agent is also responsible for carelessness, recklessness, or negligence
Citations 1. http://content.answers.com/main/content/wp/en-commons/thumb/5/54/200px-A_Torture_Rack.jpg 2. http://computingcases.org/case_materials/therac/supporting_docs/therac_case_narr/cmc.html 3. http://www.cs.jhu.edu/~cis/cista/445/Lectures/Therac.pdf 4. http://en.wikipedia.org/wiki/Race_condition 5. Carr, Steven; Mayo, Jean; Shene, Ching-Kuang. “Race Conditions: A Case Study”. Journal of Computing Sciences in Colleges, 2001. Volume 17, Issue 1. pp 90-105. 6. Chen, Liang T. “The Challenges of Race Conditions in Parallel Programming”. July 21, 2006. Sun Developer Network. 7. Nancy Leveson and Clark Turner. “An Investigation of the Therac-25 Accidents.” Computer, 26(7): 18-41, 1993.