1 of 48 A Note About Continuous Assessment The continuous assessment mark for this course will be based on three components: –Your journal entries –Submission of a document in which you will describe a group problem solving experiment –A presentation which you will give in class You don’t need to start working on the latter two just yet For now start thinking about groups and possible problems to solve
2 of 48 The M.Sc. Reading Group On an M.Sc. course experience in reviewing academic papers/articles is extremely important A reading group will be started to help with this This will help when you get to the case study module on the course Papers will be selected and discussions will take place both on-line and in lectures More will follow shortly…
3 of 48 Brainstorming Ideas Review Excellent: Discounts; Price material encouragingly; Subscription service for downloads; Cheaper in bulk; Downloads give concert discounts; Sell iPods with pre-loaded music; Subscriptions to charts while they are there; Sell broadcast concerts online Interesting: Connect artists with different markets; Develop streaming-only technology; Distribute music for multiple cultures; Secondary sales; Competitions with downloads; Diversify into iPods; Stop players sharing music; iPod Jukebox Useless: Bring back vinyl; TV ads in Iran; Kidnap bands; Kidnap downloaders as an example; Brain implants; Subliminal messages on vinyl records; MUSIC SHOULD BE FREE; Sabotage downloads with white noise
4 of 48 Brainstorming Ideas Review Some interesting things to note: –There was only one unanimous choice in both the interesting and excellent categories –The ideas that inspired most disagreement: Downloads don't last forever; Free pint with ten songs; Free download with X (beer, bananas, sandwiches); Convert politicians’ speeches into rap songs –We rejected the brain implants! In 2002 there was something of a scandal over the falsified invention of a tooth implant!
5 of 48 Force Field/S.W.O.T. Analysis Feedback Any comments or questions about last week?
Course Website: http://www.comp.dit.ie/bmacnamee Problem Solving, Communication & Innovation: Root Cause Analysis
7 of 48 Contents In today’s lecture we are going to look at Root Cause Analysis –What is root cause analysis? –Origins of root cause analysis –Is this important for software/knowledge management? –Major steps in root cause analysis –Root cause analysis example –When to use root cause analysis –Impediments to root cause analysis
8 of 48 Root Cause Analysis Root Cause Analysis (RCA) is a process designed to help determine the causes of events – typically bad events! Root cause analysis gets us past what and how to why Only by knowing why an event happened can we hope to prevent it happening in the future Treat the cause not the symptoms
9 of 48 Root Cause Analysis Example Scenario # 1: The Plant Manager walked into the plant and found oil on the floor. He called the Foreman over and told him to have maintenance clean up the oil. The next day while the Plant Manager was in the same area of the plant he found oil on the floor again and he subsequently raked the Foreman over the coals for not following his directions from the day before. His parting words were to either get the oil cleaned up or he'd find someone that would.
10 of 48 Root Cause Analysis Example (cont…) Scenario # 2: The Plant Manager walked into the plant and found oil on the floor. He called the Foreman over and asked him why there was oil on the floor. The Foreman indicated that it was due to a leaky gasket in the pipe joint above. The Plant Manager then asked when the gasket had been replaced and the Foreman responded that Maintenance had installed 4 gaskets over the past few weeks and each one seemed to leak. The Foreman also indicated that Maintenance had been talking to Purchasing about the gaskets because it seemed they were all bad.
11 of 48 Root Cause Analysis Example (cont…) The Plant Manager then went to talk with Purchasing about the situation with the gaskets. The Purchasing Manager indicated that they had in fact received a bad batch of gaskets from the supplier. The Purchasing Manager also indicated that they had been trying for the past 2 months to get the supplier to make good on the last order of 5,000 gaskets that all seemed to be bad.
12 of 48 Root Cause Analysis Example (cont…) The Plant Manager then asked the Purchasing Manager why they had purchased from this supplier if they were so disreputable and the Purchasing Manager said because they were the lowest bidder when quotes were received from various suppliers. The Plant Manager then asked the Purchasing Manager why they went with the lowest bidder and he indicated that was the direction he had received from the VP of Finance.
13 of 48 Root Cause Analysis Example (cont…) The Plant Manager then went to talk to the VP of Finance about the situation. When the Plant Manager asked the VP of Finance why Purchasing had been directed to always take the lowest bidder the VP of Finance said, "Because you indicated that we had to be as cost conscious as possible, and purchasing from the lowest bidder saves us lots of money!" The Plant Manager was horrified when he realized that he was the reason there was oil on the plant floor.
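The chain of "why?" questions in Scenario # 2 can be sketched as a small Python structure. This is only an illustration: the `WHY_CHAIN` mapping and the `trace_to_root` helper are hypothetical names, and the entries paraphrase the story above rather than quote it.

```python
# Each observed problem maps to the answer given when someone asked "why?".
# The entries paraphrase the plant example; every name here is illustrative.
WHY_CHAIN = {
    "oil on the floor": "leaky gasket in the pipe joint",
    "leaky gasket in the pipe joint": "bad batch of gaskets from the supplier",
    "bad batch of gaskets from the supplier": "supplier chosen as lowest bidder",
    "supplier chosen as lowest bidder": "VP of Finance directed lowest-bid purchasing",
    "VP of Finance directed lowest-bid purchasing": "Plant Manager demanded cost consciousness",
}


def trace_to_root(symptom, why_chain):
    """Follow the recorded 'why?' answers until no further cause is known."""
    path = [symptom]
    seen = {symptom}
    while path[-1] in why_chain:
        cause = why_chain[path[-1]]
        if cause in seen:  # guard against circular explanations
            break
        path.append(cause)
        seen.add(cause)
    return path


if __name__ == "__main__":
    # Print the chain with one extra level of indentation per "why?".
    for depth, step in enumerate(trace_to_root("oil on the floor", WHY_CHAIN)):
        print("  " * depth + step)
```

Scenario # 1 corresponds to stopping at the first entry: the symptom is cleaned up, nobody asks why, and the oil comes back.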
14 of 48 What About Ireland’s Woes? Staunton Out! What is the root cause of all of our problems? Will sacking the manager do any good? Will dropping the players do any good? Is it all the fans’ fault?
15 of 48 Origins Of Root Cause Analysis Root cause analysis is not one well defined technique, but rather a general philosophy The origins of root cause analysis stem from the following areas: –Safety-based: accident investigation, health & safety –Production-based: quality control –Process-based: business processes outside of manufacturing –Systems-based: organisational culture, strategic management
16 of 48 General Principles Of Root Cause Analysis The general principles of root cause analysis are: –Aiming corrective measures at root causes is more effective than merely treating the symptoms of a problem –To be effective, RCA must be performed systematically, and conclusions must be backed up by evidence –There is usually more than one root cause for any given problem Based on “Root Cause Analysis Handbook”, ABS Consulting
17 of 48 What Is A Root Cause? There is substantial debate on the definition of root cause The following is useful: –Root causes are specific underlying causes –Root causes are those that can reasonably be identified –Root causes are those we have control to fix –Root causes are those for which effective recommendations for preventing recurrences can be generated Based on “Root Cause Analysis Handbook”, ABS Consulting
18 of 48 Causes Of Problems The different kinds of causes can be broken down as follows: –Physical Causes are the tangible causes of failures –Human Causes almost always trigger a physical cause of failure – these could be errors of commission (we did something we shouldn’t have done) or omission (we didn’t do something we should have done) –Latent Causes (or Organisational Causes) are the organisational systems that people used to make their decisions
19 of 48 But This All Just Sounds Like Common Sense! Common Sense is not particularly common We all have a different notion of common sense because of: –Our unique senses –Our unique knowledge –Our unique strategies –Our unique conclusions Number series experiment Based on “Apollo Root Cause Analysis: A New Way Of Thinking”, D Gano
20 of 48 Root Cause Analysis & Software/Computing? This all sounds like something people in factories should worry about – but we make software! In 2005 Wired.com published an interesting article on history’s worst software bugs –July 28, 1962 Mariner 1 Space Probe: A bug in the flight software for the Mariner 1 causes the rocket to divert from its intended path on launch –1982 Soviet Gas Pipeline: Operatives working for the Central Intelligence Agency allegedly plant a bug in a Canadian computer system purchased to control the trans-Siberian gas pipeline –1985-1987 Therac-25 Medical Accelerator: A radiation therapy device malfunctions and delivers lethal radiation doses at several medical facilities “History's Worst Software Bugs”, Simson Garfinkel
21 of 48 Root Cause Analysis & Software/Computing? (cont…) –1988 Buffer Overflow in Berkeley Unix Finger Daemon: The first internet worm (the so-called Morris Worm) infects between 2,000 and 6,000 computers in less than a day by taking advantage of a buffer overflow –1988-1996 Kerberos Random Number Generator: The authors of the Kerberos security system neglect to properly "seed" the program's random number generator with a truly random seed –January 15, 1990 AT&T Network Outage: A bug in a new release of the software that controls AT&T's long distance switches causes these mammoth computers to crash when they receive a specific message from one of their neighbouring machines “History's Worst Software Bugs”, Simson Garfinkel
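To see why the seeding problem in the Kerberos item matters, here is a minimal Python sketch. It is not the Kerberos code: `weak_key` and `strong_key` are hypothetical helpers, and the seed value is made up. It only shows that a generator seeded with a guessable value produces output an attacker can reproduce.

```python
import random
import secrets


def weak_key(seed: int, nbytes: int = 8) -> bytes:
    """Derive 'key' bytes from a pseudo-random generator with a guessable seed."""
    rng = random.Random(seed)  # deterministic: same seed, same output
    return bytes(rng.randrange(256) for _ in range(nbytes))


def strong_key(nbytes: int = 8) -> bytes:
    """Draw key bytes from the operating system's secure random source."""
    return secrets.token_bytes(nbytes)


if __name__ == "__main__":
    guessable_seed = 12345  # stand-in for something like a timestamp or process id
    victim = weak_key(guessable_seed)
    attacker = weak_key(guessable_seed)  # attacker simply guesses the same seed
    print("attacker reproduces the key:", victim == attacker)
    print("securely generated key:", strong_key().hex())
```

In root cause terms the physical cause is the predictable key, but the more useful question is why nobody checked where the seed came from.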
22 of 48 Root Cause Analysis & Software/Computing? (cont…) –1993 Intel Pentium Floating Point Divide: A silicon error causes Intel's highly promoted Pentium chip to make mistakes when dividing floating-point numbers that occur within a specific range –1995/1996 The Ping of Death: A lack of sanity checks and error handling in the IP fragmentation reassembly code makes it possible to crash a wide variety of operating systems by sending a malformed "ping" packet from anywhere on the internet –June 4, 1996 Ariane 5 Flight 501: Working code for the Ariane 4 rocket is reused in the Ariane 5, but the Ariane 5's faster engines trigger a bug in an arithmetic routine inside the rocket's flight computer “History's Worst Software Bugs”, Simson Garfinkel
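The Ariane 5 failure is widely reported to have involved converting a 64-bit floating-point value into a 16-bit signed integer that was too small to hold it. The Python sketch below is not the flight code; the function names and the sample readings are assumptions made for illustration. It shows two explicit policies for a value that does not fit.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value: float) -> int:
    """Convert to a 16-bit signed integer, refusing out-of-range input."""
    result = int(value)  # int() truncates toward zero
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in a signed 16-bit integer")
    return result


def to_int16_saturating(value: float) -> int:
    """Alternative policy: clamp to the representable range instead of failing."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


if __name__ == "__main__":
    for reading in (20000.0, 40000.0):  # made-up sensor readings
        try:
            print(reading, "->", to_int16_checked(reading))
        except OverflowError as err:
            print("rejected:", err)
        print(reading, "saturates to", to_int16_saturating(reading))
```

Neither policy is automatically right. The root cause question is why code that was valid for one rocket's range of values was reused without re-examining that assumption.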
23 of 48 Root Cause Analysis & Software/Computing? (cont…) –November 2000 National Cancer Institute, Panama City: In a series of accidents, therapy planning software created by Multidata Systems International, a U.S. firm, miscalculates the proper dosage of radiation for patients undergoing radiation therapy We should worry about this kind of thing in software! “History's Worst Software Bugs”, Simson Garfinkel
24 of 48 Major Steps In Root Cause Analysis There are four major steps in root cause analysis: –Data collection –Causal factor charting –Root cause identification –Recommendation generation and implementation Based on “Root Cause Analysis Handbook”, ABS Consulting
25 of 48 Root Cause Analysis Steps: Data Collection The first step in root cause analysis is to gather as much data as possible Without complete information how can we hope to find the root causes? Data gathering consumes most of the time in root cause analysis Based on “Root Cause Analysis Handbook”, ABS Consulting
26 of 48 Root Cause Analysis Steps: Causal Factor Charting Causal factor charting provides a structure for us to organise and analyse the data gathered during the investigation Preparing the chart begins as soon as we start to gather data The chart should show all of the information that we know in a sequence diagram leading up to the event we are investigating [Diagram: a sequence of boxes, each recording event information, timing info and actors involved] Based on “Root Cause Analysis Handbook”, ABS Consulting
27 of 48 Root Cause Analysis Steps: Causal Factor Charting (cont…) The chart should begin as a skeleton working backwards from the event we are investigating As more information arises it should be added to the chart Causal factors are those contributors that, if eliminated, would have prevented the occurrence A causal factor chart can help us identify gaps in our knowledge Based on “Root Cause Analysis Handbook”, ABS Consulting
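A causal factor chart can be sketched as a simple list of records. The Python below is one possible representation, not something prescribed by the handbook: the `ChartEvent` class, the helper functions and the sample entries (loosely based on the oil-on-the-floor example from earlier slides) are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ChartEvent:
    """One box on the chart: what happened, when, and who was involved."""
    description: str
    timing: str = "unknown"
    actors: list = field(default_factory=list)
    is_causal_factor: bool = False  # would removing this have prevented the event?


def knowledge_gaps(chart):
    """Return descriptions of events whose timing or actors are still missing."""
    return [e.description for e in chart if e.timing == "unknown" or not e.actors]


def causal_factors(chart):
    """Return descriptions of events flagged as causal factors."""
    return [e.description for e in chart if e.is_causal_factor]


# Skeleton chart in time order; the values are illustrative, not from the slides.
chart = [
    ChartEvent("Purchasing orders gaskets from the lowest bidder",
               actors=["Purchasing"], is_causal_factor=True),
    ChartEvent("Maintenance installs replacement gasket",
               timing="past few weeks", actors=["Maintenance"]),
    ChartEvent("Gasket leaks oil"),
    ChartEvent("Plant Manager finds oil on the floor",
               timing="day 1", actors=["Plant Manager"]),
]

if __name__ == "__main__":
    print("Gaps to investigate:", knowledge_gaps(chart))
    print("Causal factors:", causal_factors(chart))
```

Listing the boxes with missing timing or actors is one way to make the gaps in our knowledge visible while data collection is still under way.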
28 of 48 Root Cause Analysis Steps: Root Cause Identification After all factors have been identified, root cause identification begins Root causes are often identified by following a chain of events to its beginning Remember there will often be more than one root cause Tools such as the Root Cause Map (from the “Root Cause Analysis Handbook”) can be used to help identify root causes: for each causal factor, follow the map to see if this is a root cause Based on “Root Cause Analysis Handbook”, ABS Consulting
29 of 48 [Diagram: Root Cause Map, equipment branch] Start Here With Each Causal Factor Nodes shown: Equipment Difficulty; Equipment Design Problem; Equipment Reliability Program Problem; Installation/Fabrication; Equipment Misuse; Design Input/Output; Equipment Records; Equipment Reliability Program Design; Equipment Reliability Program Implementation; Administrative & Management Systems; Procedures
30 of 48 [Diagram: Root Cause Map, personnel and other branches] Start Here With Each Causal Factor Nodes shown: Personnel Difficulty; Company Employee; Contract Employee; Other Difficulty; Natural Phenomenon; Sabotage/Horseplay; External Events; Other; Human Factors Engineering; Training; Personal Performance; Immediate Supervision; Communications
31 of 48 Root Cause Analysis Steps: Recommendation Generation & Implementation Following the identification of a root cause recommendations for preventing its recurrence should be made The recommendations should be achievable and must be implemented If recommendations are not implemented then the analysis is a waste of time and the event should be expected to recur
32 of 48 Example “It was 5 p.m. I was frying chicken. My friend Jane stopped by on her way home from the doctor, and she was very upset. I invited her into the living room so we could talk. After about 10 minutes, the smoke detector near the kitchen came on. I ran into the kitchen and found a fire on the stove. I reached for the fire extinguisher and pulled the plug. Nothing happened. The fire extinguisher was not charged. In desperation, I threw water on the fire. The fire spread throughout the kitchen. I called the fire department, but the kitchen was destroyed. The fire department arrived in time to save the rest of the house.”
33 of 48 [Diagram: skeleton causal factor chart for the kitchen fire] Boxes shown (event [timing; actors]): Kitchen destroyed by fire; Fire brigade puts out fire [Time?; FB]; Fire brigade arrives [Time?]; Mary calls the fire brigade [Time?; Mary, FB]; Fire spreads throughout the kitchen [Kitchen, Mary]; Mary throws water on the fire [Mary]; Fire extinguisher does not operate when Mary tries it [Mary]; Mary tries to use the fire extinguisher [Mary]; Mary runs into the kitchen [Mary]; Smoke alarm sounds [About 17:10]; Mary chats with Jane [10 minutes; Jane, Mary]; Mary leaves frying chicken alone [Mary]; Mary begins frying chicken [17:00; Mary]; Fire starts on the stove
34 of 48 [Diagram: causal factor chart for the kitchen fire, later events, with open questions] Boxes shown (event [actors]): Kitchen destroyed by fire; Fire brigade puts out fire [FB]; Fire brigade arrives; Mary calls the fire brigade [Mary, FB]; Fire spreads throughout the kitchen [Kitchen, Mary]; Mary throws water on the fire [Mary]; Fire was a grease fire [Mary, pan] Open questions: Did the FB use the correct techniques? How long did it take the FB to arrive? What was Jane doing during this time? Did Mary do anything else? Was Mary trying to do this? Did Mary know this was wrong?
35 of 48 [Diagram: causal factor chart for the kitchen fire, earlier events, with open questions] Boxes shown (event [timing; actors or basis]): Fire extinguisher does not operate when Mary tries it [Mary]; Mary tries to use the fire extinguisher [Mary]; Mary runs into the kitchen [Mary]; Smoke alarm sounds [About 17:10]; Mary chats with Jane [10 minutes; Jane, Mary]; Mary leaves frying chicken alone [Mary]; Mary begins frying chicken [17:00; Mary]; Fire starts on the stove; Jane rings the doorbell [Jane]; Jane comes to the door [Jane]; Grease ignites when it hits the burner [Conclusion]; Aluminium melts forming hole in pan [Pan]; Arcing heats bottom of aluminium pan [Pan]; Electric burner shorts out [Burner]; Fire generates smoke [Assumption]; Mary sees fire on stove [Mary]; Fire extinguisher not charged [Mary]; Mary pulls the plug on fire extinguisher [Mary]; Mary uses an aluminium pan [Mary] Open questions: Is plug the same as pin? Does Mary know how to use an extinguisher? How much oil? How much chicken? What exactly did she see? Had it been previously used? Had it leaked? Was it originally charged?
36 of 48 [Diagram: complete causal factor chart for the kitchen fire; CF marks a causal factor] Boxes shown (event [timing; actors or basis]): Fire extinguisher does not operate when Mary tries it [Mary]; Mary tries to use the fire extinguisher [Mary]; Mary runs into the kitchen [Mary]; Smoke alarm sounds [About 17:10]; Mary chats with Jane [10 minutes; Jane, Mary]; Mary leaves frying chicken alone [Mary]; Mary begins frying chicken [17:00; Mary]; Fire starts on the stove; Jane rings the doorbell [Jane]; Jane comes to the door [Jane]; Grease ignites when it hits the burner [Conclusion]; Aluminium melts forming hole in pan [Pan]; Arcing heats bottom of aluminium pan [Pan]; Electric burner shorts out [Burner]; Fire generates smoke [Assumption]; Mary sees fire on stove [Mary]; Fire extinguisher not charged [Mary]; Mary pulls the plug on fire extinguisher [Mary]; Mary uses an aluminium pan [Mary]; Kitchen destroyed by fire; Fire brigade puts out fire [FB]; Fire brigade arrives; Mary calls the fire brigade [Mary, FB]; Fire spreads throughout the kitchen [Kitchen, Mary]; Mary throws water on the fire [Mary] CF; Fire was a grease fire [Mary, pan] CF
37 of 48 Root Cause Summary Table Causal Factor 1 Description: Mary leaves the frying chicken unattended Recommendations: –Implement a policy that hot oil is never left unattended on the cooker –Determine whether policies are required for other types of hazards Causal Factor 2 Description: Electric burner element fails (burns out) Recommendations: –Replace all burners on cookers –Develop a preventative maintenance strategy to periodically replace burner elements –Consider alternative, less hazardous methods for preparing chicken
38 of 48 Root Cause Summary Table (cont…) Causal Factor 3 Description: Fire extinguisher does not operate when Mary tries to use it Recommendations: –Refill fire extinguisher –Inspect all fire extinguishers in the building to make sure they are full –Ensure a safety equipment audit is properly in place Causal Factor 4 Description: Mary throws water on fire Recommendations: –Provide practical training on the use of fire extinguishers –Review overall training plan
39 of 48 Root Cause Analysis Exercise “This week DIT suffered a massive e-mail failure. The hard-drive on our mail server crashed and the contents of all mail boxes were lost. Furthermore all e-mails arriving over the following days were lost with no indication given to senders that they were not received. When back-ups were sought the last back-up had been made 5 months before” Let’s do a root cause analysis of this event For this exercise I will have to act as an oracle for our investigation – sorry!
40 of 48 Impediments To Root Cause Analysis There are a number of reasons why people don’t like root cause analysis: –This is great, but I don’t have time for this…. –Inability or unwillingness to tackle the bigger issues –Fear of being “blamed” for making an error
41 of 48 This Is Great, But I Don’t Have Time For This…. There is not one of us who does not have more things to do than we have time to do them If you have ever tried test-driven development, it offers the same value proposition “If you haven’t got time to stop these failures from recurring, how are you going to find the time to keep fixing them?”
42 of 48 Inability Or Unwillingness To Tackle The Bigger Issues The most effective solutions are those that address the latent (or organisational) causes of problems These solutions typically require changes to underlying organisational systems, processes and beliefs and so require more time, effort, and management clout to implement Often small root cause analysis teams are reluctant or unable to identify these issues as potential causes
43 of 48 Fear Of Being Blamed For Errors Root cause analysis often involves identifying errors committed by individuals - this can be terrifying Bad use of root cause analysis can quickly devolve into the blame game It is worth thinking briefly about whether blaming people is of any use to us
44 of 48 Fear Of Being Blamed For Errors (cont…) Regarding human error many of us believe: –Human error is infrequent –Human error is intrinsically bad –A few people are responsible for most of the human errors –The most effective way of preventing human error is through disciplinary actions So, we allocate blame and then seek to prevent recurrence through disciplinary actions “Getting Root Cause Analysis to Work for You”, Alexander (Sandy) Dunn
45 of 48 Fear Of Being Blamed For Errors (cont…) However many behavioural psychologists are now showing: –Human error is inevitable –Human error is not intrinsically bad –Everybody commits errors –Blame and punishment are almost always inappropriate So we shouldn’t blame individuals but rather seek to find the latent causes – using root cause analysis? “Getting Root Cause Analysis to Work for You”, Alexander (Sandy) Dunn
46 of 48 Teams And Root Cause Analysis Advantages of team-based problem-solving: –Those closest to the work know best how to perform and improve their jobs –Application of a broader range of knowledge from multiple disciplines –Broader, more creative solutions –Greater chance of risk-taking –Teams tend to be more successful in implementing complex plans –Higher level of ownership of results This is all true for root cause analysis “Getting Root Cause Analysis to Work for You”, Alexander (Sandy) Dunn
47 of 48 Root Cause Analysis Summary Root cause analysis is a problem solving technique which can be used to find the reasons why an event occurred There are four major steps: –Data collection –Causal factor charting –Root cause identification –Recommendation generation and implementation
48 of 48 Exercise In your problem solving journals, write between half a page and a full page on your thoughts about force field analysis, SWOT analysis and root cause analysis In particular focus on how useful you feel the techniques are – does the formalism help, or is it just “common sense” dressed up in a fancy coat?