Presentation on theme: "Failure Modes & Effects Analysis (FMEA) A Great Tool to Improve Product and Process Reliability and Reduce Risks Anthony Tarantino PhD, Six Sigma Master."— Presentation transcript:
1 Failure Modes & Effects Analysis (FMEA): A Great Tool to Improve Product and Process Reliability and Reduce Risks. Anthony Tarantino, PhD, Six Sigma Master Black Belt, CPIM, CPM; Sr. Advisor to Cisco’s Six Sigma Center of Excellence; Adjunct Professor of Finance, Santa Clara University. May 23, 2011
2 A Leading Six Sigma Authority: “To me Failure Modes and Effects Analysis (FMEA) is a versatile, powerful, process centered tool that belongs in every Process Owners’ and Six Sigma practitioners’ toolbox.”
3 A Leading Operational Risk Authority: “Catastrophic failures in operational risk management are rarely caused by a single and major point of failure. Rather they are the cumulative effect of smaller and inter-related failures. …FMEA is the tool of choice to address these complex operational risk failures at any level of an organization, whether tactical, strategic, or enterprise-wide. It works in every type of organization.”
4 The objectives for this session include: Understand what a FMEA is, why it is used, and when it can be deployed. Understand the different components, definitions, and calculations used in a FMEA. Learn the steps to developing a FMEA. Use examples and case studies to showcase FMEA in action: Purchasing Process in Finance; Sample High Tech Project to Reduce RMA Rates; San Bruno Gas Pipeline Explosion.
5 Reliability Defined: Product reliability is one of the qualities of a product. Quite simply, it is the quality which measures the probability that the product or device “will work.” As a definition: Product reliability is the ability of a unit to perform a required function under stated conditions for a stated period of time. Correspondingly, quantitative reliability is the probability that a unit will perform a required function under stated conditions for a stated time. Source: Feigenbaum, A. V. (1991). Total Quality Control. New York: McGraw-Hill, Inc.
6 When Reliability is Lacking - Categories of Failure Mode
Safety: Any failure mode that directly affects the ability of a product to meet Federal Safety Standards, creates a potential product liability issue, or can result in death or extensive property damage.
Major (Hard): Any failure mode that stops the operation of a product or system and requires immediate repair. Evidenced by a catastrophic event, e.g., the TEPCO nuclear plant meltdown. The failure mechanism might be a “shock” to the system or an accumulation of shocks to the system.
Minor (Soft): Any failure mode that keeps a product from meeting one of its intended functions but does not preclude it from satisfying its most important functions. Any failure mode which results in a gradual but not complete loss of the product's ability to meet its intended function. Degradation of performance over time and wear are examples of soft failures.
7 FMEA Defined: What is a Failure Modes & Effects Analysis? A FMEA is a systematic method to: recognize, evaluate, and prioritize (score) potential failures and their effects; identify actions which could eliminate or reduce the chance of a potential failure occurring; document and share the process. FMEA generates a living document that can be used to anticipate and prevent failures. In DMAIC and Design for Six Sigma projects, FMEAs can be used in various stages and revised as the project moves forward.
8 Why Use a FMEA: Use of quality tools such as Statistical Process Control (SPC) encourages the use of FMEA(s) to help solve quality problems. ISO/QS and the product liability directives of the EC (1985) strongly encourage its use. Helps select alternatives (in system, design, process, and service) with high reliability and high safety potential during the early phases (Blanchard 1986). Ensures that all conceivable effects on operational success have been considered. Many risk management regimens and standards, such as ISO 31000/31010 used in finance and operations, are based on FMEA logic: probability vs. severity scoring and matrix.
9 Why Use a FMEA - Continued: Improves the quality, reliability and safety of products and processes in a proactive manner. Helps to increase customer satisfaction by proactively addressing failures that keep us from meeting critical customer requirements in processes or products. Reduces product development time and cost. Reduces operational risk. Documents and tracks actions taken to reduce risk. Prioritizes areas of focus.
10 FMEA is a Team Process
Team formation: Product Development, Design, Manufacturing, Quality, Sales/Marketing, Suppliers, Reliability and Testing.
Team roles: Facilitator, Champion, Recorder/Librarian.
6-10 members is optimal.
What are your experiences in FMEA teams?
11 Why Use a Team for FMEA
Team decision-making takes time. For a team to reach consensus: 100 percent active participation (expressing agreement/disagreement) is needed. Participants must be open to new ideas/to influence others. 100 percent agreement is not the goal. Majority does not rule; sometimes a single individual may be on the right track. A formal system for voting is needed. An effective facilitator (leader) is needed. Team process check (how did we do?).
Difficult individuals: the facilitator must resolve such instances.
Effective meeting skills: planning the meeting.
Effective problem-solving skills.
Soft skills are critical.
12 The Primary Driver for FMEA - What does 99.9% Quality Mean? One hour of unsafe drinking water; 291 incorrect pacemaker operations per year; 12 babies given to the wrong parent each day; two unsafe landings at O’Hare Airport per day; your heart fails to beat 32,000 times per year; 6,000 lost pieces of mail per hour; 20,000 incorrect drug prescriptions per year; 107 incorrect medical procedures performed daily; 14,208 defective personal computers shipped each year; 268,500 defective tires shipped per year; 500 incorrect surgical operations performed each week; two million documents lost by the IRS per year; 880,000 credit card magnetic strips with the wrong information; 19,000 newborn babies dropped at birth by doctors each year; 22,000 checks deducted from the wrong account each hour.
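The arithmetic behind figures of this kind is volume multiplied by the 0.1% that fails. A minimal Python sketch, assuming hypothetical volumes that are not taken from the slide; the function name `expected_defects` is introduced here only for illustration:

```python
def expected_defects(volume: float, quality: float = 0.999) -> float:
    """Expected number of failures for a given volume and quality level."""
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must be between 0 and 1")
    return volume * (1.0 - quality)


# Hypothetical volumes, chosen only to show how 0.1% scales with volume.
examples = [
    ("one million shipped units", 1_000_000),
    ("hours in a year", 365 * 24),
]
for label, volume in examples:
    print(f"{label}: about {expected_defects(volume):.2f} failures at 99.9% quality")
```

Even a small failure fraction becomes a large absolute count once the volume is high, which is the point the list above makes.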
13 Elements of a Successful FMEA
1. All problems are not the same. This is perhaps the most fundamental concept in the entire FMEA methodology. Unless a priority of problems (as a concept) is recognized, workers are likely to be contenders for chasing fires. They will respond to the loudest request and/or the problem of the moment. (In other words, they will manage by emergency.) Does this sound like your organization?
2. The customer must be known. Acceptance criteria are defined by the customer, not the engineer.
3. The function must be known.
4. One must be prevention (proactively) oriented. Unless continual improvement is the force that drives the FMEA, the efforts of conducting FMEA will be static. The FMEA will be conducted only to satisfy customers and/or market requirements to the letter rather than the spirit of the requirements. Unfortunately, this is a common problem in implementation of an FMEA program.
14 Sample FMEA Form: Process Step. Describe how the process step could go wrong. Describe the impact. What could cause the failure? Is there anything in place to detect or stop this from happening? Rankings (1-10). What actions will you take?
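One way to picture the form is as a record with one field per column. The following Python sketch is an illustration only: the class name `FmeaRow`, its field names, and the example entry (including its rankings) are assumptions and do not come from the slide.

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    """One row of an FMEA worksheet (field names are illustrative)."""
    process_step: str       # the process step under review
    failure_mode: str       # how the step could go wrong
    effect: str             # the impact of the failure
    severity: int           # ranking 1-10
    cause: str              # what could cause the failure
    occurrence: int         # ranking 1-10
    current_controls: str   # what detects or stops it today
    detection: int          # ranking 1-10
    action: str = ""        # what will be done about it

    def __post_init__(self) -> None:
        # Reject rankings outside the 1-10 range shown on the form.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")


# Hypothetical entry; the text and rankings are invented for illustration.
example = FmeaRow(
    process_step="Add milk to cake mix",
    failure_mode="Wrong amount of milk added",
    effect="Cake is too dry or too runny",
    severity=6,
    cause="Measuring cup misread",
    occurrence=4,
    current_controls="Visual check of batter consistency",
    detection=3,
    action="Use a marked measuring jug",
)
print(example)
```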
15 Sample FMEA Process - Adding Milk to a Cake Mix
16 History of the FMEA
1940s - First developed by the US military in 1949 to determine the effect of system and equipment failures.
1960s - Adopted and refined by NASA (used in the Apollo Space program).
1970s - Ford Motor Co. introduces FMEA after the Pinto affair. Soon adopted across the automotive industry.
Today - FMEA used in both manufacturing and service industries.
The FMEA discipline was developed in the United States military. Military Procedure MIL-P-1629, titled Procedures for Performing a Failure Mode, Effects and Criticality Analysis, is dated November 9, 1949. It was used as a reliability evaluation technique to determine the effect of system and equipment failures. Failures were classified according to their impact on mission success and personnel/equipment safety.
The term "personnel/equipment", taken directly from an abstract of Military Standard MIL-STD-1629, is notable. The concept that personnel and equipment are interchangeable does not apply in the modern manufacturing context of producing consumer goods. The manufacturers of consumer products established a new set of priorities, including customer satisfaction and safety. As a result, the risk assessment tools of the FMEA became partially outdated. They have not been adequately updated since.
Later it was used for aerospace/rocket development to avoid errors in small sample sizes of costly rocket technology. An example of this is the Apollo Space program. The primary push came during the 1960s, while developing the means to put a man on the moon and return him safely to earth.
In the late 1970s the Ford Motor Company introduced FMEA to the automotive industry for safety and regulatory consideration after the Pinto affair. They also used it to improve production and design.
17 Types of FMEAs
Design FMEA - examines the functions of a component, subsystem or main system. Potential failures: incorrect material choice, inappropriate specifications. Example: Air Bag (excessive air bag inflator force).
Process FMEA - examines the processes used to make a component, subsystem, or main system. Potential failures: operator assembling part incorrectly, excess variation in process resulting in out-of-spec products. Example: Air Bag Assembly Process (operator may not install air bag properly on assembly line such that it may not engage during impact).
18 Definitions
Failure Mode: The way in which the product or process could fail to perform its intended function. Failure modes may be the result of upstream operations or inputs, or may cause downstream operations or outputs to fail.
Failure Effects: The outcome of the occurrence of the failure mode on the system, product, or process. Failure effects define the impact on the customer. Ranking is translated into a “Severity” score.
19 Definitions
Failure Causes: Potential causes or reasons the failure mode could occur. The likelihood of the cause creating the failure mode is translated into an “Occurrence” score.
Current Controls: Mechanisms currently in place that will detect or prevent the failure mode from occurring. The ability to detect the failure before it reaches the customer is translated into a “Detectability” score.
20 Linking Causes to Effects: One to One, One to Many, Many to One, or Many to Many. [Diagram: 1:1, one cause linked to one effect; 1:M, one cause linked to Effect 1 and Effect 2; M:1, Cause 1 and Cause 2 linked to one effect.]
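Because a cause can feed several effects and an effect can have several causes, the links are naturally stored as pairs and indexed in both directions. A minimal Python sketch, assuming placeholder cause and effect names; the helper `index_links` is hypothetical and not part of any FMEA standard:

```python
from collections import defaultdict

# Placeholder (cause, effect) pairs forming a many-to-many relationship.
links = [
    ("Cause 1", "Effect 1"),
    ("Cause 1", "Effect 2"),
    ("Cause 2", "Effect 1"),
]


def index_links(pairs):
    """Build lookup tables in both directions from (cause, effect) pairs."""
    effects_by_cause = defaultdict(set)
    causes_by_effect = defaultdict(set)
    for cause, effect in pairs:
        effects_by_cause[cause].add(effect)
        causes_by_effect[effect].add(cause)
    return dict(effects_by_cause), dict(causes_by_effect)


effects_by_cause, causes_by_effect = index_links(links)
print(sorted(effects_by_cause["Cause 1"]))   # one cause, many effects
print(sorted(causes_by_effect["Effect 1"]))  # many causes, one effect
```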
21 Calculations - Risk Priority Number: The Risk Priority Number (RPN) identifies the greatest areas of concern. RPN is the product of: (1) Severity rating, (2) Occurrence rating, and (3) Detection rating.
22 Calculations - FMEA Variables
Severity: A rating corresponding to the seriousness of an effect of a potential failure mode. (scale 1-10; 1: no effect on the customer, 10: hazardous effect)
Occurrence: A rating corresponding to the rate at which a first level cause and its resultant failure mode will occur over the design life of the system, over the design life of the product, or before any additional process controls are applied. (scale 1-10; 1: failure unlikely, 10: failures certain)
Detection: A rating corresponding to the likelihood that the detection methods or current controls will detect the potential failure mode before the product is released for production (for design), or before it leaves the production facility (for process). (scale 1-10; 1: will detect failure, 10: almost certain not to detect failures)
23 Calculations - Risk Priority Number (RPN): Severity x Occurrence x Detectability = Risk Priority Number (RPN). For a given potential failure mode: how bad the outcome is, multiplied by how likely it is to actually happen, multiplied by how unlikely today's controls are to prevent or notice it before it happens.
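The multiplication can be sketched in a few lines of Python. The function name `rpn` and the input check are assumptions for illustration; the 1-10 range follows the rankings used on the FMEA form, and the ratings in the usage line are illustrative values.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: severity x occurrence x detection."""
    ratings = {
        "severity": severity,
        "occurrence": occurrence,
        "detection": detection,
    }
    for name, value in ratings.items():
        # Each rating is expected to be a whole number from 1 to 10.
        if not isinstance(value, int) or not 1 <= value <= 10:
            raise ValueError(f"{name} must be an integer from 1 to 10, got {value!r}")
    return severity * occurrence * detection


print(rpn(5, 4, 3))  # 60
```

With each rating between 1 and 10, the RPN ranges from 1 to 1000.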
24 FMEA Process
1. Start with the process map.
2. For each step, brainstorm potential failure modes and effects.
3. Determine the potential causes of each failure mode.
4. Evaluate current controls.
Ratings to determine along the way: severity, likelihood of occurrence, detectability.
5. Determine RPN.
6. Identify actions.
25 When is a FMEA Started? As early as possible; that is, as soon as some information is known (usually through a QFD). Practitioners should not wait for all the information. If they do, they will never perform a FMEA because they will never have all the data or information. When new systems, designs, products, processes, or services are designed. When existing systems, designs, products, processes, or services are about to change, regardless of reason. When new applications are found for the existing conditions of the systems, designs, products, processes, or services.
26 When is a FMEA Completed? Only when the system, design, product, process, or service is considered complete and/or discontinued. A System FMEA may be considered finished when all the hardware has been defined and the design is declared frozen. A Design FMEA may be considered finished when a release date for production has been set. A Process FMEA may be considered finished when all operations have been identified and evaluated and all critical and significant characteristics have been addressed in the control plan. A Service FMEA may be considered finished when the design of the system and individual tasks have been defined and evaluated, and all critical and significant characteristics have been addressed in the control plans. As a general rule, the FMEA should be available for the entire product life. The FMEA is a working document.
27 FMEA Tips: There are no absolute rules for what is a high RPN. Rather, FMEAs are often viewed on a relative scale (i.e., highest RPN addressed first). It is a team effort. Motivate the team members. Ensure cross-functional representation on the team. Treat it as a living document that reflects the latest changes. Develop prioritization with the process owners! Assign an owner to the FMEA; ensure it is periodically reviewed and updated.
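Relative prioritization amounts to sorting failure modes by RPN rather than comparing them to a fixed threshold. A short Python sketch, assuming three invented failure modes with made-up ratings; the helper name `rank_by_rpn` is hypothetical:

```python
# Hypothetical failure modes: (name, severity, occurrence, detection).
failure_modes = [
    ("Failure mode A", 5, 4, 3),
    ("Failure mode B", 8, 2, 2),
    ("Failure mode C", 3, 6, 7),
]


def rank_by_rpn(rows):
    """Return (name, RPN) pairs sorted so the highest RPN comes first."""
    scored = [(name, s * o * d) for name, s, o, d in rows]
    return sorted(scored, key=lambda item: item[1], reverse=True)


for name, score in rank_by_rpn(failure_modes):
    print(f"{name}: RPN {score}")
```

In this made-up data the mode with the highest severity (B) ranks last, because its occurrence and detection ratings are low. That is one reason to review the ranking with the process owners rather than act on the number alone.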
28 FMEA & The DMAIC Lifecycle
Q: At what phase can/should the FMEA be used in a DMAIC project?
A: A FMEA can be used in most phases of the DMAIC lifecycle for various purposes.
Define - how it can be used: project selection, project scope.
Measure - how it can be used: understand the process (with process mapping).
Analyze - how it can be used: identify process variables / root cause analysis.
Improve - how it can be used: assist with new process development / understand failures in design.
Control - how it can be used: manage and control the process on an ongoing basis.
FMEA can also be used in each stage of Design for Six Sigma - DMADV.
29 FMEA Example Purchasing Requisition to Purchase Order
38 Example - Calculate the RPN: Severity x Occurrence x Detectability = RPN: 5 x 4 x 3 = 60
39 Example - Purchasing Dept.: Brainstorm potential actions that will lower the RPN. Assign specific owners. Recalculate the RPN after actions are complete. The FMEA owner and team update the document as actions are completed. Result: Occurrence reduced from 4 to 3; RPN cut in half.
40 Case Study: FMEA Logic in Scoring the Risk of Problems
41 Case Study: Using a FMEA Hybrid - Adding Project Prioritization Index (PPI)
PPI can be used in combination with FMEA to score problem solving projects by balancing potential savings against project costs, and project effort/duration against project risks (chance of success).
PPI consists of four metrics: Project Costs ($), Project Benefits ($), Project Probability of Success (percent), Project Duration (years).
The PPI formula balances Project Benefits versus Project Costs, and Project Probability of Success versus Project Duration.
The formula looks like this: PPI = (Benefits/Costs) x (Probability of Success/Project Duration)
Source: Praveen Gupta, Total Quality Management, in Anthony Tarantino, Risk Management in Finance: Six Sigma and Other Next Generation Techniques (Wiley and Sons, 2010)
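The PPI formula can be tried out with a short Python sketch. Two things here are assumptions rather than part of the source: the probability of success is entered as a fraction between 0 and 1 (the slide lists it as a percent without saying how it enters the formula), and the two projects with their figures are invented for illustration.

```python
def ppi(benefits: float, costs: float,
        probability_of_success: float, duration_years: float) -> float:
    """PPI = (Benefits / Costs) x (Probability of Success / Project Duration)."""
    if costs <= 0 or duration_years <= 0:
        raise ValueError("costs and duration_years must be positive")
    # Assumption: probability is given as a fraction, e.g. 0.8 for 80 percent.
    if not 0.0 <= probability_of_success <= 1.0:
        raise ValueError("probability_of_success must be between 0 and 1")
    return (benefits / costs) * (probability_of_success / duration_years)


# Hypothetical projects, used only to compare two candidate fixes.
quick_fix = ppi(benefits=500_000, costs=100_000,
                probability_of_success=0.8, duration_years=0.5)
long_project = ppi(benefits=300_000, costs=100_000,
                   probability_of_success=0.9, duration_years=1.0)
print(f"quick fix PPI: {quick_fix:.2f}")
print(f"long project PPI: {long_project:.2f}")
```

A higher PPI favors projects with a better benefit-to-cost ratio, a higher chance of success, and a shorter duration.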
42 Case Study: Using a FMEA Hybrid - Adding Project Prioritization Index (PPI)
43 Case Study: Using FMEA+PPI To Score Potential Problem Solutions
46 San Bruno, CA - September 10, 2010: The ruptured natural gas pipeline created a crater approximately 72 feet long by 26 feet wide. A pipe segment approximately 28 feet long was found about 100 feet away from the crater. The released natural gas was ignited sometime after the rupture; the resulting fire destroyed 37 homes and damaged 18. Eight people were killed, numerous individuals were injured, and many more were evacuated from the area. Source:
47 Loss of Power at Control Terminal: Just before the accident, PG&E was working on their uninterruptible power supply (UPS) system at Milpitas Terminal, which is located about miles SE of the accident site. During the course of this work, the power supply from the UPS system to the supervisory control and data acquisition (SCADA) system malfunctioned so that instead of supplying a predetermined output of 24 volts of direct current (VDC), the UPS system supplied approximately 7 VDC or less to the SCADA system. Because of this anomaly, the electronic signal to the regulating valve for Line 132 was lost. The loss of the electrical signal resulted in the regulating valve moving from partially open to the full open position as designed. The pressure then increased to 386 psig. The over-protection valve, which was pneumatically activated and did not require electronic input, maintained the pressure at 386 psig. Source:
48 Case Study: San Bruno Gas Pipeline Explosion: There were longitudinal fractures in the first and second pup of the ruptured segment and a partial circumferential fracture at the girth weld between the first and second pup. There was a complete circumferential fracture at the girth weld between the fourth pup in the ruptured segment and the fifth pup in the north segment. Source:
49 Case Study: San Bruno Gas Pipeline Explosion: The longitudinal fracture in the first pup continued south into the pipe, ending in a circumferential fracture in the middle of the pipe. Source:
50 Poor Document and Records Retention: SAN FRANCISCO (AP) March 5, 2011 - Facing a state Public Utilities Commission order to produce records on its pipelines by March 15, the utility has been shipping pallets loaded with boxes of documents to the Cow Palace in Daly City, where PG&E employees are poring through the paper records. “This effort is an example of the level of commitment the company is putting forward to make sure this process is thorough and complete,” PG&E spokesman Paul Moreno said. …it was part of a 24-hour search by more than 300 employees. The document search comes after investigators found a seam with inferior welds that was believed to be the origin of the blast. PG&E’s computer records had shown the pipeline did not have a seam, but PG&E officials have acknowledged problems when the old paper records were incorporated into the utility’s computer system. PG&E President Chris Johns said last month the utility had been unable to find documents for 30 percent of its 1,000-plus miles of pipeline running under urban areas.
51 DOT to Issue New Pipeline Regulations in August: SAN FRANCISCO (Dow Jones)--The U.S. Department of Transportation will issue new safety rules for the nation's oil and gas pipeline operators in August, the agency's top official said Thursday. "We and the Obama administration will redouble our efforts on pipeline safety," Transportation Secretary Ray LaHood said, speaking at a press conference in San Francisco. LaHood earlier visited the site in San Bruno, Calif., where a PG&E Corp. (PCG) gas pipeline exploded last September, killing eight people and destroying ...
52 Mode of Failure - Pipeline Rupture followed by Explosion
Potential Causes of Failure:
1. Faulty Weld (1/2 thickness spec)
2. Pipe Corrosion (Over 50 Years Old)
3. Corrosion of Girth/Lateral Weld
4. Corrosion of Circumference Weld
5. Failure of Monitoring Station UPS
6. Lack of Automatic Shut Off Valves
7. Faulty Maintenance Documentation
8. Faulty Maintenance Procedures
9. Lack of Tone-at-the-Top Management
10. Weak Oversight by Calif. PUC
11. Weak Federal Regulations by DOT
Causes 1-5: Tactical in nature. Six Sigma tool: Design of Experiments.
Causes 6-11: Systemic in nature. Enterprise-wide Operational Risk Mgt.
53 FMEA Advantages Over RCA and 5 Whys: A robust FMEA will consider each of the 5 tactical modes of failure and combinations of modes of failure. Design of Experiments (DOE) can be used to test the most likely combination of modes and causes. A typical Root Cause Analysis (RCA) may focus on one or more of the failure modes and causes, but would not score their risk profiles. A typical 5 Whys will focus on only one of the failure modes, and may not point to a solution.
54 FMEA Suggested Tests: Design of Experiments (DOE) - Potential Tests & Combination of Tests: Faulty Weld; Corrosion of Pipe; Corrosion of Girth/Lateral Weld; Corrosion of Circumference Weld; Rise in Pressure; Faulty Weld (Remove Half Weld) + Accelerated Corrosion Test of Pipe and Welds + Rise in Pressure.
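To see how many combined tests the five factors could generate, the Python sketch below enumerates a two-level full factorial design. This is an assumption made for illustration: the slide does not say which experimental design was intended, and treating each factor as simply present or absent is a simplification.

```python
from itertools import product

# The five test factors named on the slide.
factors = [
    "Faulty Weld",
    "Corrosion of Pipe",
    "Corrosion of Girth/Lateral Weld",
    "Corrosion of Circumference Weld",
    "Rise in Pressure",
]

# Assumed design: every factor is either absent (False) or present (True).
runs = [
    dict(zip(factors, levels))
    for levels in product((False, True), repeat=len(factors))
]

print(len(runs))  # 32 runs for five two-level factors

# Runs in which at least two factors act together (combination tests).
combined = [run for run in runs if sum(run.values()) >= 2]
print(len(combined))  # 26
```

Of the 32 runs, one has no factor present and five have a single factor, which leaves 26 combination tests. In practice a team would pick a subset guided by the FMEA scores rather than run all of them.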
56 FMEA & Other Risk Analysis Tools
FMEA: Bottom-up approach to failure analysis. Systematic method for identifying all the potential failure modes of a process or product. Creates a prioritized ranking of failure modes within a system.
Cause & Effect Diagram: Examines a certain failure mode or event and identifies all the possible causes. Causes are grouped into several logical categories.
Fault Tree Analysis: Top-down approach to failure analysis. Starting point is a failure or “undesired state”. Drill down into lower level events leading up to the undesired state. Similar to the 5 Whys method.
58 For Further Information: Anthony Tarantino, PhD, MBB; Sr. Consulting Support. Carl Ashcroft, MBB; Cisco’s Six Sigma Training and Education Programs.
59 Published FMEA Guidelines: J From the SAE for the automotive industry. AIAG FMEA-3 - From the Automotive Industry Action Group for the automotive industry. ARP From the SAE for non-automotive applications. EIA/JEP131 - Provides guidelines for the electronics industry, from the JEDEC/EIA. P provides guidelines for NASA GSFC spacecraft and instruments. SEMATECH A-ENG - for the semiconductor equipment industry.