Systematic Methods To Address Root And Contributing Causes


1 Systematic Methods To Address Root And Contributing Causes
Expectations in NRC Inspection Procedures 95001 and 95002
Frederick J. Forck, 4Konsulting, LLC

One of the overall inspection requirements in NRC Inspection Procedures (IP) 95001 and 95002 is to determine that the problem was evaluated using a systematic methodology to identify the root and contributing causes.

The objectives of NRC IP 95001 are:
- To provide assurance that the root causes and contributing causes of risk-significant performance issues are understood.
- To provide assurance that the extent of condition and extent of cause of risk-significant performance issues are identified.
- To provide assurance that the licensee’s corrective actions for risk-significant performance issues are sufficient to address the root and contributing causes and prevent recurrence.

NRC IP 95002 adds the following objective:
- To independently determine if safety culture components caused or significantly contributed to the individual and collective (multiple white inputs) risk-significant performance issues.

Sources:
- Inspection Procedure 95001, Inspection For One Or Two White Inputs In A Strategic Performance Area, Issue Date: 11/09/09
- Inspection Procedure 95002, Inspection For One Degraded Cornerstone Or Any Three White Inputs In A Strategic Performance Area, Issue Date: 11/09/09

2 Using Tools
Use a tool. Use a tool to build.

Investigators need to learn how to use the investigative techniques listed in NRC Inspection Procedures 95001 and 95002, such as Fault Tree Analysis, Events and Causal Factors Analysis, Barrier Analysis, Change Analysis, and the Why Staircase. More importantly, investigators need to be able to fully integrate those techniques (and more) into a systematic methodology for analyzing and solving problems. As an analogy, a person may be able to use a hammer and a saw properly but still not be able to build a house.

Source: NRC Inspection Procedures 95001 and 95002, Revision 11/09/09

3 Using Cause Analysis Tools
Use tools to reconstruct.

NRC Inspection Procedures: Root Cause, Extent of Condition, and Extent of Cause Evaluation

The licensee’s evaluation should generally make use of systematic methods to identify root and contributing causes. The root cause evaluation methods that are commonly used in nuclear facilities include:
- Events and causal factors analysis: to identify the events and conditions that led up to an event
- Fault tree analysis: to identify relationships among events and the probability of event occurrence
- Barrier analysis: to identify the barriers that, if present or strengthened, would have prevented the event from occurring
- Change analysis: to identify changes in the work environment since the activity was last performed successfully that may have caused or contributed to the event
- Management Oversight and Risk Tree (MORT) analysis: to systematically check that all possible causes of problems have been considered
- Critical incident techniques: to identify critical actions that, if performed correctly, would have prevented the event from occurring or would have significantly reduced its consequences
- Why Staircase: to produce a linear set of causal relationships and use the experience of the problem owner to determine the root cause and corresponding solutions
- Pareto Analysis: a statistical approach to problem solving to determine where to start an analysis

Source: NRC IP 95001
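
To illustrate the Pareto Analysis entry in the list above, here is a minimal Python sketch that ranks cause categories by how often they occur and tracks the cumulative share, which is one way to decide where to start an analysis. The function name `pareto_table`, the category names, and the counts are all hypothetical and chosen only for illustration; they do not come from the inspection procedure.

```python
from collections import Counter

def pareto_table(cause_codes):
    """Return (cause, count, cumulative_fraction) rows, most frequent first."""
    counts = Counter(cause_codes)
    total = sum(counts.values())
    rows = []
    running = 0
    for cause, count in counts.most_common():
        running += count
        rows.append((cause, count, running / total))
    return rows

# Hypothetical cause codes from ten condition reports (illustrative data only).
sample = ["procedure"] * 5 + ["training"] * 3 + ["design"] * 2

for cause, count, cumulative in pareto_table(sample):
    print(f"{cause:<10} {count:>2} {cumulative:6.0%}")
```

Categories at the top of the table account for most of the occurrences, so they are reasonable candidates for the first detailed analysis.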

4 Systematic Evaluation Normally Includes:
Slide outline:
- Clearly identify the problem
- State assumptions
- Data: timely collection, verification, preservation of evidence
- Document the analysis so that:
  - the progression of the problem is clearly understood
  - any missing information or inconsistencies are identified
  - the problem can be easily explained and/or understood by others
- Determine cause and effect relationships, resulting in identification of root and contributing causes that consider hardware, process, and human performance issues

The licensee may use other methods to perform root cause evaluations. A systematic evaluation of a problem using one of the above methods should normally include:

1. A clear identification of the problem and the assumptions made as a part of the root cause evaluation. For example, the evaluation should describe the initial operating conditions of the system or component identified, staffing levels, and training requirements as applicable.
2. A timely collection of data, verification of data, and preservation of evidence to ensure that the information and circumstances surrounding the problem are fully understood. The analysis should be documented such that the progression of the problem is clearly understood, any missing information or inconsistencies are identified, and the problem can be easily explained and/or understood by others.
3. A determination of cause and effect relationships resulting in an identification of root and contributing causes that consider potential hardware, process, and human performance issues. For example:
   (a) Hardware issues could include design, materials, systems aging, and environmental conditions;
   (b) Process issues could include procedures, work practices, operational policies, supervision and oversight, preventive and corrective maintenance programs, and quality control methods; and
   (c) Human performance issues could include training, communications, human-system interface, and fitness for duty (which includes managing fatigue). See IP 93002, “Managing Fatigue,” for guidance on the requirements of 10 CFR Part 26, Subpart I, Managing Fatigue.

Source: NRC IP 95001

5 Basic Investigation Steps
Assessment Process

The most basic investigation will follow these steps:
1. Gather information.
2. Reconstruct the incident.
3. Discover causes.
4. Recommend corrective actions (strategies for improving performance).

6 Continuous Performance Improvement
- Problem Prevention
- Symptom/Effect Analysis
- Cause Analysis
- Solution Analysis
- Follow Up Analysis

Source: Avatar International Inc., 1985, Atlanta, Georgia; from Georgia Power, p. 3-2

7 General Job/Task Analysis
Adapted from:
- Incident Investigation Training T , AmerenUE, Callaway Plant
- Relationship of Human Performance Improvement Process to Consultant Roles, Human Performance Improvement in the Workplace, Ethan S. Sanders, © 2000, American Society for Training & Development, 3-11

Derived from: INPO; NUREG/CR-5455, NRC HPIP; Entergy Root Cause Analysis Process

8 Include Acceptance Criteria
Instructions, Procedures, & Drawings: Criterion V of Appendix B to 10CFR50
- Written
- Followed
- Include Acceptance Criteria

V. Instructions, Procedures, and Drawings
Activities affecting quality shall be prescribed by documented instructions, procedures, or drawings, of a type appropriate to the circumstances and shall be accomplished in accordance with these instructions, procedures, or drawings. Instructions, procedures, or drawings shall include appropriate quantitative or qualitative acceptance criteria for determining that important activities have been satisfactorily accomplished.
(Appendix B to Part 50 of Title 10 of the Code of Federal Regulations, Quality Assurance Criteria for Nuclear Power Plants and Fuel Reprocessing Plants)

Some Myths About Procedures…
- We accurately measure and trend procedure compliance issues
- Strong technical competence = good technical writer
- Good technical writer = good Admin writer
- “Our writers guide doesn’t allow that….”
- Writers don’t need specific training
- Procedure error traps are common sense
(Procedure Use & Adherence: A Global Issue?, Rob Fisher, The Adult Education & Management Research Institute, Inc.)

Sources: 10CFR50, App. B; Callaway Plant Lead Auditor Training

9 Steps with Acceptance Criteria
“Table 1” shows acceptance criteria for each step of a systematic evaluation.

Cause Analysis Process Step         Quantitative or Qualitative Acceptance Criteria
1. Scope the problem                A precise, complete, and bounded problem statement
2. Investigate the factors          Accurate, factual information
3. Reconstruct the story            Progression of the problem
4. Establish contributing factors   Issues that drove, influenced, or allowed the incident
5. Validate underlying factors      Correctable root and contributing causes
6. Plan corrective actions          Intervention(s) that improve design or change behavior
7. Report learning                  Auditable, defensible record

Source: NRC Inspection Procedures 95001 and 95002, Revision 11/09/09

10 Overall Method Steps w. Techniques
Adapted from:
- Incident Investigation Training T , AmerenUE, Callaway Plant
- Relationship of Human Performance Improvement Process to Consultant Roles, Human Performance Improvement in the Workplace, Ethan S. Sanders, © 2000, American Society for Training & Development, 3-11

Derived from:
- INPO , OE-907, Good Practice, Root Cause Analysis, January 1990
- NUREG/CR-5455, S , Vol. 2, Development of the NRC's Human Performance Investigation Process (HPIP), Investigator's Manual
- Entergy Root Cause Analysis Process (Rev 4) EN-LI-118, dated 6Jul06

11 SCOPE THE PROBLEM (Step 1)
You will be able to provide the organization with:
- A simple Deviation Statement
- A specific, concise, objective, observable, and measurable Problem Description that meets the following criteria:
  - Explains the undesirable or unacceptable consequences, conditions, methods, or results. A statement of the safety significance must be in the report.
  - Focuses on the problem, not the symptoms or causes of the problem.
  - Describes the gap between the way things are and the way they ought to be.
  - States WHAT, WHO, WHEN, and WHERE.
- The Extent of the Adverse Condition (Actual and Potential)
- The scope of the evaluation with spatial, chronological, and organizational boundaries

Techniques:
- Deviation Statement
- Difference Mapping
- Problem Description
- Extent of Condition Review
- Methodology Selection

Derived from:
- INPO , OE-907, Good Practice, Root Cause Analysis, January 1990
- NUREG/CR-5455, S , Vol. 2, Development of the NRC's Human Performance Investigation Process (HPIP), Investigator's Manual
- Entergy Root Cause Analysis Process (Rev 4) EN-LI-118, dated 6Jul06

12 Effective Problem Description
Identify the GAP: What is the Problem?

Method 1: Deviation Statement (noun/verb)
- OBJECT: What is the item that is affected?
- DEFECT: Identify the “DEVIATION” from the “EXPECTED” or “REQUIRED STANDARD of PERFORMANCE.”
- Example: Five gallons of oil spilled (defect) on the “B” Emergency Diesel Generator room floor (object).

OR use:

Method 2: Expected vs. Actual Statement
- Compare “WHAT SHOULD BE”*: the requirement, standard, norm, or expectation
- with “WHAT IS”: the existing, as-found condition
*Sometimes the “What Should Be” is implied.

Method 1 (Object/Defect) [9] [10] [11]
STATE the object affected; then STATE the defect or deviation (describes the equipment failure, the human performance difficulty, or the programmatic or organizational deficiency).
Examples:
- The reactor (object) tripped (deviation).
- A substantive cross-cutting issue (deviation) exists in the area of Problem Identification and Resolution (object).
- Soil and groundwater samples from discharge line manholes (object) were contaminated (deviation).
- The documentation to support the design basis function for a safety-related system (object) could not be located (deviation).

Method 2 (Requirement & Contrary to) [12]
STATE the original performance expectation (i.e., procedure step) and performance gap (i.e., violation, error, etc.), or EXPRESS the gap between the way things are and the way they ought to be (an ideal or an expectation).
Examples:
- Procedure XYZ requires that the container lids for environmental discharge samples will be taped closed and that the samples will be transported on Chain-of-Custody (COC) to onsite and offsite laboratories. Contrary to the above, not all batch samples that were observed being collected for discharge on October 17, 2008, had lids that were taped and not all were transported on a COC.
- The department’s goal is to complete root cause investigations within 30 days. In the past year, only one of ten investigations was completed within 30 days.

Sources:
- Kepner, Charles H. and Tregoe, Benjamin B., The New Rational Manager, Princeton Research Press, Princeton, NJ, 1981, pp
- BPI Problem Solving-Decision Making-Planning, Business Processes Inc., 1983, pg. 1
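
The two problem-description methods above can be captured as simple records, which may help keep statements consistent from one report to the next. The sketch below is only an illustration: the class names `DeviationStatement` and `ExpectedVsActual` and the helper `on_time_fraction` are hypothetical and are not part of Kepner-Tregoe or any cited procedure. The sample values are taken from the slide's own examples.

```python
from dataclasses import dataclass

@dataclass
class DeviationStatement:
    """Method 1: the affected object plus the defect (deviation)."""
    obj: str
    defect: str

    def render(self) -> str:
        return f"{self.obj} {self.defect}."

@dataclass
class ExpectedVsActual:
    """Method 2: what should be versus what is."""
    expected: str
    actual: str

    def render(self) -> str:
        return f"Expected: {self.expected} Contrary to the above: {self.actual}"

def on_time_fraction(completed_on_time: int, total: int) -> float:
    """Quantify a gap such as 'one of ten investigations met the goal'."""
    return completed_on_time / total

statement = DeviationStatement("The reactor", "tripped")
gap = ExpectedVsActual(
    "Root cause investigations are completed within 30 days.",
    "Only one of ten investigations was completed within 30 days.",
)
print(statement.render())
print(gap.render())
print(f"On-time fraction: {on_time_fraction(1, 10):.0%}")
```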

13 HOW: Extent of [Adverse] Condition
Evaluate ONLY from the Problem Description perspective, then evaluate various combinations:
- Same-Same-Same
- Same-Same-Similar
- Similar-Same-Same
- Similar-Similar-Same
- etc.
Document the basis for bounding with the associated risk and consequence.

Deviation Statement: Object, Application, Defect

Combination          Object               Application              Defect
Same-Same-Same       Identical object     Equivalent application   Matching defect
Same-Same-Similar    Identical object     Equivalent application   Related defect
Similar-Same-Same    Comparable object    Equivalent application   Matching defect

For equipment and system issues: Do we have the vulnerabilities in other components, other trains, or the other unit?
- Same components in different systems
- Similar components from the same manufacturer
- Similar components from a different manufacturer

For process and performance issues:
- Same process in different departments
- Different processes performed by the same group
- Same performance in a different location

For material-related issues:
- Impact to other installed material, including spares/spared-in-place items
- Impact to Procurement documents and information relating to current and future use
- Impact to stored material
- Impact to material procurements

Source: Determining the Extent of Condition/Extent of Cause, Lewis Allen, STP, 15th Annual HPRCT Conference, June 22-25, 2009, Delray Beach Marriott, hosted by Florida Power & Light, Slides 9-10
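
The "etc." in the list of combinations can be made explicit by enumerating every Same/Similar pairing across object, application, and defect. The following Python sketch does that with `itertools.product`; the function name and the ordering of the output are assumptions made for illustration, not part of the cited presentation.

```python
from itertools import product

DIMENSIONS = ("object", "application", "defect")
LEVELS = ("Same", "Similar")

def extent_of_condition_combinations():
    """List every Same/Similar combination, starting with Same-Same-Same."""
    return ["-".join(combo) for combo in product(LEVELS, repeat=len(DIMENSIONS))]

# Print each combination as a checklist entry for the review.
for label in extent_of_condition_combinations():
    print(label)
```

With two levels and three dimensions there are eight combinations. An evaluator would still need to decide which ones to pursue and document the basis for bounding the review.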

14 How to do an Extent of Condition Review
Human Performance Tool: Peer Check

Through investigation, the evaluator is trying to clearly define what the scope of the problems may be and what actions may be appropriate to resolve the issue. It is expected that the level of effort in determining and documenting the extent of condition (EOCo) is commensurate with the level of investigation and significance of the event.

Provided below are questions that the evaluator should consider when determining the EOCo. These questions are intended to aid the evaluator in performing an effective EOCo, but they need to be considered in the proper context and with the appropriate understanding of the condition to ensure sufficient evaluation of the discovered condition. Proper context would involve applying these questions in terms of:

1. Determine the transportability of the condition.
   a. Can the problem potentially affect other equipment, organizations, or processes?
   b. Can the problem affect another unit?
   c. Can the problem affect another site?
   d. Can the problem result in a common mode failure?
   e. Has consideration been given to initiating the same immediate actions on other equipment?
2. Equipment:
   a. One component or a group of components?
   b. Is it only this component type?
   c. Is it more than this component type?
3. Human Performance:
   a. Is it one task?
   b. Is it all he/she did today? Or this week?
   c. What other tasks did he/she do that we should be concerned about?
   d. Should this be considered as an inappropriate action affecting others?
   e. Will this task be performed by others and when?
4. For all additional issues identified as part of the EOCo, ask the following:
   a. Close to actions taken?
   b. Additional actions needed?
   c. Additional investigation needed?

15 INVESTIGATE THE FACTORS (Step 2)
Derived from: INPO; NUREG/CR-5455, NRC HPIP; Entergy Root Cause Analysis Process

Types of information to be collected

From the collected information, we gather the facts and search for the evidence to sustain them. There are three main types of information to be considered:

(a) Physical evidence
Physical evidence includes equipment, components, tools, liquid samples, computer disks, personal protective equipment worn during the incident, debris, etc. Physical evidence could even include, for example, laboratory testing to determine fitness for duty issues. Sometimes the investigation team may require analysis from specialized laboratories. The inspection of physical evidence must not result in altering the evidence. When it is necessary to remove physical evidence, it should be done in a controlled, careful, and methodical manner.

(b) Documentary evidence
Documentary evidence includes all documentation related to the incident, such as operating procedures, logbooks, internal and external operating experience, etc. It is really important that the documents used during the work (preferably originals, if not certified copies) are collected as quickly as possible, since these documents may be altered or lost.

(c) Personnel evidence
Information collected from personnel is usually very important in order to understand what happened, but it needs to be confirmed before it is used as evidence. Witness recollection declines rapidly after an incident; therefore, it is important to start the investigation as soon as possible. Personnel information includes information obtained from interviews and related directly to the event (testimony) and information on personal history such as training, working environment, individual experience, etc.

Source: IAEA-TECDOC-1600, Best Practices in the Organization, Management and Conduct of an Effective Investigation of Events at Nuclear Power Plants, International Atomic Energy Agency, September 2008, p. 8

Techniques:
- Evidence Preservation
- Interviewing (What & How)
- Performance Analysis Worksheet
- Culpability Decision Tree
- Substitution Test/Survey
- SORTM questions

16 Information Gathering Strategy
Determine how best to fill your information needs (information you have vs. information you still need):
- review of logsheets, charts, drawings, etc.
- area walkdowns
- interviews: decide who to interview and what you hope to learn from them

Determine which information to pursue first. Considerations:
- Focus on issues that appear to be key.
- The Management Sponsor may need certain information first (e.g., restart issues).
- Interviewee availability may pose an impact.

Determine who will obtain the information:
- Divide responsibilities among team members.
- If there is no team, you can still seek assistance from cognizant parties (e.g., the system engineer can research material history).

The investigators should gather additional information and data relating to the event/problem. This includes physical evidence, interviews, records, and documents needed to support the investigation. Some typical sources of information which may be of assistance include the following:
- Operating logs (Obtain from Operations Department)
- Correspondence (Obtain from Document Control, s, etc.)
- Inspection/surveillance records
- Maintenance work packages and records (Obtain from Maintenance or Document Control)
- Meeting minutes
- Computer process data
- Procedures and instructions
- Vendor manuals and specifications
- Drawings and specifications
- Functional retest specification and results
- Equipment history records
- Design basis information
- Safety Analysis Report (SAR)/Technical Specifications
- Related quality control evaluation reports
- Operational Safety Requirements
- Safety Performance Measurement System/Occurrence Reporting and Processing System (SPMS/ORPS) Reports
- Radiological surveys
- Trend charts and graphs
- Facility parameter readings
- Sample analysis and results (chemistry, radiological, air, etc.)
- Inspection reports
- Strip Chart Recordings
- Sequence of Event Recorders (Obtain from Ops, Chemistry, I&C, etc.)
- Radiological Surveys (Obtain from Radiation Protection)
- Plant Parameter Readings (Obtain from Ops, Chemistry, I&C, etc.)
- Shipping Manifests (Obtain from Materials)
- Sample Analysis and Results (Obtain from Materials or off-site vendor)
- Design Basis Information (Obtain from T/S, FSAR, Westinghouse documents, etc.)
- Photographs/Sketches of Failure Site
- Industry Bulletins
- Previous corrective action documents
- EPIX Records
- Training Records
- Witness Recollection Statements

Sources:
- DOE Guideline Root Cause Analysis Guidance Document, February 1992, DOE-NE-STD , U.S. Department of Energy, Office of Nuclear Energy, Office of Nuclear Safety Policy and Standards, Washington, D.C , pp. 5,6
- Adapted from Incident Investigation Training, Callaway Plant

17 How is Interviewing done?
Interview phases: Prepare, Open, Question, Close

Preparation
- Schedule the appointments.
- Choose an appropriate location.
- Make sure you are interviewing the right people.
- Have question areas or themes prepared in advance.
- Have required reference documents at hand.
- Be mentally prepared and focused.

Introduction
- Introduce yourself.
- Explain the purpose of the interview.
- Do not be confrontational.
- Control your body language.

Asking questions
- Seek to understand why, not just what.
- Control the interview.
- Keep questions simple and focused.
- Use a funnel approach: broad leading to specific questions.
- Anticipate unsatisfactory replies: have a means to deal with them.
- Avoid jargon.
- Avoid devious or trick questions.
- Focus on facts.
- Anticipate interviewee questions.
- Be aware that interviewing is not interrogating.

Source: IAEA-TECDOC-1600, Best Practices in the Organization, Management and Conduct of an Effective Investigation of Events at Nuclear Power Plants, International Atomic Energy Agency, September 2008, p. 23

18 Two-Pronged Approach to Incident Prevention
[Figure labels: Md; Human Factors Prong; System Factors Prong]

The risk reduction action plan should include a description of:
- who is accountable for the risk
- what action is to be taken
- who is responsible for the action
- when the action is to be completed by
- a measurable performance target

Adapted from INPO , Human Performance Reference Manual, October 2006

19 Factor Tree
Source: The Phoenix Handbook: The Ultimate Event Evaluation Manual for Finding Safety and Profit Improvement in Adverse Events, by William R. Corcoran, Ph.D., P.E., President, Nuclear Safety Review Concepts, Windsor, CT, May 4, 2007 Version, pp

Remember, most of us are not investigating "paper cuts". We are generally investigating Significant Events in High Reliability Organizations (HRO) such as nuclear power and hospitals. HROs do not rely solely on fallible humans, but set up defenses-in-depth to prevent events or accidents while still producing electricity or saving lives. In this regard, I agree with Richard Rouse when he says, "I think the purpose of root cause is to find areas where errors can occur and then create or strengthen barriers to reduce the likelihood of those errors or to prevent the error from causing a significant condition or event."

Sources: Phoenix Handbook, Corcoran; Dana Cooley

20 RECONSTRUCT THE STORY (Step 3)
Objectives: You will be able to provide the organization with:
- A reconstruction of HOW the incident happened, presented in a logical manner
- Documented initiating actions, inappropriate actions, & error-inducing factors
- Documented flawed barriers (human, programmatic, organizational vulnerability factors)
- Documented latent organizational weaknesses

Techniques:
- Fault Tree
- Task Analysis
- Critical Activity Charting
- Actions & Factors Chart

Derived from:
- INPO , OE-907, Good Practice, Root Cause Analysis, January 1990
- NUREG/CR-5455, S , Vol. 2, Development of the NRC's Human Performance Investigation Process (HPIP), Investigator's Manual
- Entergy Root Cause Analysis Process (Rev 4) EN-LI-118, dated 6Jul06

21 Human-Machine Interface
One of the best ways to reconstruct the story behind an equipment failure is to use a Fault Tree. Fault Tree Analysis (FTA) depends heavily on a questioning attitude and on not answering your own questions. The key question is: What are all the ways this component can fail?

This is a good opportunity to use reference books and several fault tree resources to ensure completeness. Refer to:
- Heinz P. Bloch’s book, Machinery Failure Analysis and Troubleshooting
- EPRI / ALTRAN, Aging Assessment Field Guide
- PII, Diagnosing Equipment Failures
- VATIC, Failure Mode Analysis Handbook
- NMAC Guides
- NRC Fault Tree Analysis Handbook, NUREG-0492
- Existing Fault Trees

Adapted from Callaway Plant “Fault Tree Analysis” Training

22 8 Steps of Fault Tree Analysis
Step 1: Identify the Undesirable Incident
Step 2: Identify 1st Level Inputs
Step 3: Link Using Logic Gates
Step 4: Identify 2nd Level Inputs
Step 5: Evaluate Inputs
Step 6: Develop Remaining Inputs
Step 7: Investigate Remaining Inputs
Step 8: Determine Contributing Factors (“Physical Roots”)

Chart steps (use the chart as a place-keeping tool to determine where the team is in the process):
1. Define the Incident
2. Identify 1st Tier Inputs
3. Define the Logic Relationship
4. Identify 2nd Tier Inputs
5. Evaluate Inputs
6. Develop Remaining Inputs
7. Investigate Remaining Inputs
8. Determine Contributing Factors

Principles of construction:
- The tree must be constructed using the incident symbols listed above.
- It should be kept simple. Maintain a logical, uniform, and consistent format from tier to tier.
- Use clear, concise titles when writing in the incident symbols.
- The logic gates used should be restricted to the AND gate and OR gate.
- The purpose of the tree is to keep the procedure as simple as possible.

Steps in Carrying Out a Fault Tree Analysis
A successful FTA requires that the following steps be carried out:
1. Identify the objective for the FTA.
2. Define the top event of the FT.
3. Define the scope of the FTA.
4. Define the resolution of the FTA.
5. Define ground rules for the FTA.
6. Construct the FT.
7. Evaluate the FT.
8. Interpret and present the results.
(NASA Fault Tree Handbook with Aerospace Applications, Version 1.1, p. 22, August 2002)

Sources:
- Fault Tree Analysis, P.L. Clemens, JACOBS Sverdrup, February 2002, 4th Edition, Slide 12
- Callaway Plant “Fault Tree Analysis” Training
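
As a rough illustration of linking inputs with logic gates and then evaluating them, the Python sketch below models a small fault tree restricted to AND and OR gates, as the construction principles above recommend. The `Node` class, the `evaluate` function, and the example incident and inputs are all hypothetical; they are not taken from the Clemens, NASA, or Callaway material.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    name: str
    gate: str = "BASIC"           # "AND", "OR", or "BASIC" for a leaf input
    inputs: List["Node"] = field(default_factory=list)
    occurred: bool = False        # only meaningful for BASIC inputs

def evaluate(node: Node) -> bool:
    """Return True if the event at this node occurs, given its inputs."""
    if node.gate == "BASIC":
        return node.occurred
    results = [evaluate(child) for child in node.inputs]
    if node.gate == "AND":
        return all(results)
    if node.gate == "OR":
        return any(results)
    raise ValueError(f"unsupported gate: {node.gate}")

# Hypothetical tree: the top incident occurs if the valve leaked by,
# OR if the tank was overfilled AND the high-level alarm failed.
top = Node("Oil spilled on floor", "OR", [
    Node("Drain valve leaked by", occurred=True),
    Node("Tank overflowed", "AND", [
        Node("Tank overfilled", occurred=True),
        Node("High-level alarm failed", occurred=False),
    ]),
])

print(evaluate(top))            # top incident
print(evaluate(top.inputs[1]))  # the AND branch on its own
```

Marking each basic input as confirmed or refuted during the investigation and re-evaluating the tree shows which branches remain credible and still need to be developed.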

23 Human-Machine Interface
[Figure: Factor Flow. Labels: Equipment; Physical Roots; Human-Machine Interface; Stimulus; Think; Response (Operation); Human Roots; Defense-In-Depth; Latent Roots; Latent Organizational Weaknesses]

One common error is to look no further than the equipment that failed or the individual involved when determining cause. The 2000 STPNOC Human Performance Self-Assessment identified that investigators weren’t consistently finding the deeper organizational weaknesses that often are the root of the problem.

As the Investigator, It’s Your Job…
… to look past equipment failure and human errors and identify if they’re symptomatic of weaknesses in the organization. Use the “Why Road” to look beyond the symptoms.

Hopping Down the “WHY” Road
If a spill occurred when a valve leaked by, ask:
- “Why did the valve leak by?” Because it did not seat properly.
- “Why did the valve not seat properly?” Because the seating surface was worn.
- “Why was the seating surface worn?” Because of…

When human performance is an issue, Task Analysis is a tool to identify critical actions that, if performed correctly, would have prevented the event from occurring or would have significantly reduced its consequences. Task analysis is the process of first determining how a task should be performed, and then comparing that information against how the task was actually performed. Differences can then be analyzed as potential causal factors for the incident being investigated.

Task Analysis involves researching the task of interest, breaking it down to its critical elements, and then reconstructing task performance through reenactment or interviews. “Figure 6” shows the Task Analysis process flow. [20]

Critical human activities (steps) include actions aimed at changing the state of facility structures, systems, or components; steps that are irrecoverable or actions that cannot be reversed; and steps where the outcome of an error is intolerable for personnel or facility safety.
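
The "Why Road" above is a linear chain in which each answer becomes the subject of the next question. A minimal Python sketch of that chaining follows. The function name `why_staircase` and the phrasing of the generated questions are assumptions for illustration. Only the two answers that the slide actually gives are used, because the third answer is left open in the source.

```python
def why_staircase(problem, answers):
    """Build (question, answer) pairs where each answer feeds the next question."""
    steps = []
    current = problem
    for answer in answers:
        steps.append((f'Why "{current}"?', f"Because {answer}."))
        current = answer
    return steps

# Statements from the valve example; the chain stops where the slide stops.
chain = why_staircase(
    "the valve leaked by",
    ["it did not seat properly", "the seating surface was worn"],
)
for question, answer in chain:
    print(question, answer)
```

The last answer in the chain is the next thing to investigate, which keeps the analysis moving past the failed equipment toward organizational weaknesses.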

24 How is Task Analysis done?
Step 1: Obtain Preliminary Information
Step 2: Select Task(s) of Interest
Step 3: Obtain Background Information
Step 4: Prepare a Task Performance Guide
Step 5: Get Familiar With the Guide
Step 6: Select Personnel
Step 7: Reenact Task Performance
Step 7A: Interview Personnel (Alternate Method)
Step 8: Evaluate & Integrate Findings
Phases shown on the slide: Paper & Pencil Phase; Walk-Through Phase

Steps in Task Analysis are as follows:
1. Obtain preliminary information so you know what the person was doing when the problem or inappropriate action occurred.
2. Decide on a task of interest.
3. Obtain necessary background information:
   - Obtain relevant procedures.
   - Obtain system drawings, block diagrams, piping and instrumentation diagrams, etc.
   - Interview personnel who have performed the task (but not those who will be observed) to obtain an understanding of how the task should be performed.
4. Produce a guide outlining how the task will be carried out. A procedure with key items underlined is the easiest way of doing this. The guide should indicate steps in performing the task and key controls and displays so that:
   - You will know what to look for.
   - You will be able to record actions more easily.
5. Thoroughly familiarize yourself with the guide and decide exactly what information you are going to record and how you will record it. You may want to check off each step and the controls or displays used as they occur. Discrepancies and problems may be noted in the margin or in a space provided for comments, adjacent to the step.
6. Select personnel who normally perform the task. If the task is performed by a crew, crew members should play the same role they fulfill when carrying out the task.
7. Observe personnel walking through the task and record their actions and use of displays and controls. Note discrepancies and problem areas. You should observe the task as it is normally carried out; however, if necessary, you may stop the task to gain full understanding of all steps. Conducting the task as closely as possible to the conditions that existed when the incident occurred will provide the best understanding of the incident causal factors.
8. Summarize and consolidate any problem areas noted. Identify probable contributors to the incident.

Sources:
- IAEA-TECDOC-1600, Best Practices in the Organization, Management and Conduct of an Effective Investigation of Events at Nuclear Power Plants, International Atomic Energy Agency, September 2008, p. 13-14
- DOE-NE-STD
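
The core of the comparison in steps 7 and 8, how the task should be performed versus how it was actually performed, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function name `compare_task` and the step names are hypothetical, and each step is assumed to appear at most once in each list.

```python
def compare_task(guide_steps, observed_steps):
    """Compare the task guide against observed performance.

    Assumes each step name appears at most once in each list.
    """
    guide = set(guide_steps)
    observed = set(observed_steps)
    omitted = [s for s in guide_steps if s not in observed]
    added = [s for s in observed_steps if s not in guide]
    # Steps present in both lists, in guide order and in observed order.
    expected_order = [s for s in guide_steps if s in observed]
    actual_order = [s for s in observed_steps if s in guide]
    return {
        "omitted": omitted,
        "added": added,
        "out_of_order": expected_order != actual_order,
    }

# Hypothetical guide and walk-through record (illustrative only).
guide = ["verify tagout", "open vent", "close drain", "fill tank"]
observed = ["open vent", "verify tagout", "fill tank", "check level"]

findings = compare_task(guide, observed)
print(findings)
```

Each difference is only a candidate causal factor; the investigator still has to evaluate whether it contributed to the incident.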

25 Critical Human Action Concept
Note: Not all steps of a work activity are equally important. Critical Human Actions (steps) include: Actions aimed at changing the state of facility structures, systems, or components Steps that are irrecoverable or actions that cannot be reversed Steps where the outcome of an error is intolerable for personnel or facility safety Integrating Human Performance Improvement Concepts and Tools into Work Planning, CH2M HILL Hanford Group, Inc., September 12-13, 2006 Certain tasks are more critical than others Some actions/tasks are irrecoverable; once the action is taken, the reverse action cannot recover Some steps have more chances for error Need to consider critical tasks as part of hazards analysis Is changing the state of the facility, system, component, or the well-being of the individual dependent on the individual worker? Is the outcome of the error intolerable from a personnel safety or facility perspective? Helps focus attention on potential consequences so appropriate defenses can be put in place NRC NUREG/CR-5455, NRC HPIP

26 A "Critical" Human Action IS:
A step in the activity that caused the incident or could have made the incident less severe. It is a CHA if the step: Might cause an incident if the step is not done Might cause an incident if an error is made Might cause an incident if done some other way Makes the incident less severe if done the right way. Could be a “Critical Step” related to the incident NRC NUREG/CR-5455, NRC HPIP
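The four CHA conditions listed above can be read as an "any of" screen. The following is a minimal sketch of that reading, not part of NUREG/CR-5455; the class name `Step`, its field names, and the function `is_critical_human_action` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One step of a work activity, screened against the four CHA conditions.

    All field names are illustrative assumptions, not terms from the source.
    """
    description: str
    incident_if_not_done: bool = False          # might cause an incident if omitted
    incident_if_error: bool = False             # might cause an incident if an error is made
    incident_if_done_differently: bool = False  # might cause an incident if done some other way
    mitigates_if_done_right: bool = False       # makes the incident less severe if done correctly


def is_critical_human_action(step: Step) -> bool:
    """A step is treated as a CHA if at least one of the four conditions holds."""
    return any((
        step.incident_if_not_done,
        step.incident_if_error,
        step.incident_if_done_differently,
        step.mitigates_if_done_right,
    ))


if __name__ == "__main__":
    steps = [
        Step("Open supply breaker", incident_if_error=True),
        Step("Initial the log sheet"),
    ]
    for s in steps:
        print(s.description, "->", "CHA" if is_critical_human_action(s) else "not a CHA")
```

Steps that pass this screen are the ones worth carrying into the Critical Human Action table for detailed data collection.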

27 How is a Critical Human Activity Table done?
Identify the human actions to be analyzed. (This may be all the human actions in the incident, or it may be those that are believed to have been responsible for the event's occurrence.) Decide which human actions caused the incident or, if they had been performed correctly, could have prevented the incident or made the incident less severe (Critical Human Actions or CHAs). Collect and record information about the CHAs. Derived from: NUREG/CR-5455, S , Vol. 2, Development of the NRC's Human Performance Investigation Process (HPIP), Investigator's Manual UE Quality Improvement Process Manual, July 1992 Derived from: NRC NUREG/CR-5455, NRC HPIP UE QIP

28 General Systems Analysis Events & Causal Factors Charting
Action Action Action Action Incident How did the factors originate? Factor Factor Why did this Incident happen? Work Activity Causes Contributing Factor “Accident Investigation Technician (AIT) Independent Study - General Industry” National Association of Safety Professionals (NASP), Burgaw, NC, 2008, pg. 31 Department Of Energy Accident Investigation Program, Analytical Methods for Accident Investigations, Chip Lagdon, FACREP CONFERENCE, MAY 12-15, 2003 Work Activity Guiding Principles Define the scope of work Identify and analyze the hazards Develop and implement hazard controls Perform work within controls Provide feedback and continuous improvement Process and Institutional Guiding Principles 1. Line management responsibility for safety 2. Clear roles and responsibilities 3. Competence commensurate with responsibilities 4. Balanced priorities 5. Safety standards and requirements identified 6. Hazard controls tailored to the work 7. Operations authorization Process Causes Contributing Factor What systems allowed The Conditions to exist? Contributing Factor Institutional Causes Adapted from DOE Accident Investigation Program

29 General Format Actions Keep asking, “What happened next?”
Include only one action per rectangle (watch out for the word “and”). DO NOT use names; USE job titles. Add date/times above boxes or in boxes (but maintain a consistent format). State facts. Get rid of judgmental words. Quantify when possible. Connect with solid arrows; use dotted boxes for assumptions. Arrange chronologically from left to right. Duke Power Root Cause Analysis Training Day 2 (TT0889), Slide 29 Callaway Plant Incident Investigation Training (T ) Set down the known sequence of actions Identify and add contributing factors Identify and add broken barriers Make sure facts support conclusions

30 ESTABLISH CONTRIBUTING FACTORS (Step 4)
Derived from INPO NUREG/CR-5455, NRC HPIP Entergy Root Cause Analysis Process The analysis should be documented such that the progression of the problem is clearly understood, any missing information or inconsistencies are identified, and the problem can be easily explained and/or understood by others. The incident needs to be reconstructed in a logical manner. When an equipment failure is involved, the physical or hardware root causes of the problem would be one of the first items to be identified in an investigation generally before the human roots are identified and certainly before the latent roots are discovered. You will be able to provide the organization with: Factors that set off or released the incident are identified (triggering factors) Factors that made the situation worse are identified (aggravating factors) Vulnerabilities in defenses are identified (exposure factors) Factors that prevented the incident from being worse than it was are identified (moderating or mitigating factors) Derived from INPO , OE-907, Good Practice, Root Cause Analysis, January 1990 NUREG/CR-5455, S , Vol. 2, Development of the NRC's Human Performance Investigation Process (HPIP), Investigator's Manual Entergy Root Cause Analysis Process (Rev 4) EN-LI-118, dated 6Jul06 Techniques Change Analysis Barrier Analysis Production/Protection Strategy (Defense-In-Depth) Analysis Factor Tree

31 How is Change Analysis done?
1 3 4 5 6 Evaluate by asking these questions: What was different about this time from all the other times the same hardware operated without a problem or the same task or activity was carried out without error? Why now and not before? Why here and not there? 2 Ammerman, Max. The Root Cause Analysis Handbook: A Simplified Approach to Identifying, Correcting, and Reporting Workplace Errors, Productivity, Inc., 1998, p. 27 Several key elements include the following: Consider the incident containing the undesirable consequences. Consider a comparable activity that did not have the undesirable consequences. Compare the condition containing the undesirable consequences with the reference activity. Set down all known differences whether they appear to be relevant or not. Analyze the differences for their effects in producing the undesirable consequences. This must be done with careful attention to detail. Be sure to include the obscure and indirect effects. For example, different paint on a piping system may change the heat transfer characteristics and therefore change the system parameters. Integrate information into the investigative process relevant to the causes of, or the contributors to, the undesirable consequences. DOE-NE-STD , DOE GUIDELINE ROOT CAUSE ANALYSIS GUIDANCE DOCUMENT February 1992, U.S. Department of Energy, Office of Nuclear Energy, Office of Nuclear Safety Policy and Standards, Washington, D.C , page E-1. Root Cause Analysis Training Course CAP-02, Palo Verde Nuclear Generating Station Ammerman, The Root Cause Analysis Handbook
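The comparison step of Change Analysis (set down all known differences between the incident and a comparable activity that went well, whether or not they look relevant) can be sketched as a simple dictionary comparison. This is an illustrative sketch only; the function name `change_analysis` and the example factor names are assumptions, and judging the effect of each difference remains the investigator's job.

```python
def change_analysis(incident: dict, reference: dict) -> list:
    """Return every difference between the incident and the reference activity.

    Each result is a tuple (factor, reference_value, incident_value). Factors
    present on only one side are reported with None on the other side, so
    nothing is filtered out for apparent irrelevance.
    """
    differences = []
    for factor in sorted(set(incident) | set(reference)):
        incident_value = incident.get(factor)
        reference_value = reference.get(factor)
        if incident_value != reference_value:
            differences.append((factor, reference_value, incident_value))
    return differences


if __name__ == "__main__":
    # Hypothetical conditions for illustration only.
    incident = {"shift": "night", "procedure_rev": 7, "pipe_coating": "new paint"}
    reference = {"shift": "day", "procedure_rev": 7, "pipe_coating": "original"}
    for factor, before, now in change_analysis(incident, reference):
        print(f"{factor}: was {before!r}, this time {now!r}")
```

Each reported difference then becomes a question for analysis: could this change, directly or indirectly, have produced the undesirable consequences?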

32 Identify Risk Defenses (Barriers & Controls)
Local Factor Control Engineered Barriers Admin Controls Oversight Cultural Eliminate task. Prevent error. Catch error. Detect defect. Mitigate harm. Accept risk. Muschara, Tony, CPT, Principal Consultant, Muschara Error Management Consulting, LLC, Managing Critical Steps, HPRCT Pre-Conference Course, June 22, 2009, Slides 33-34 Managing Defenses, HPRCT Pre-Conference Course, June 16, 2008, p. 7 Flight Standards and Industry Roles in the AVSSMS, Don Arendt, August 23, 2007, Slide 3 (http://www.faa.gov/safety/programs_initiatives/oversight/saso/library/media/sms_presentation.pdf) “Carelessness and overconfidence are more dangerous than deliberately accepted risk.” Wilbur Wright, 1901 (www.faa.gov) Muschara, Managing Critical Steps, HPRCT 2009 Muschara, Managing Defenses, HPRCT 2008

33 Systematic Barrier Analysis
Barrier Analysis is accomplished using the following process.
1. Identify each Target of hazards/threats.
2. Identify each Hazard (adverse effect/consequence).
3. Identify Barriers (Defenses) that should have controlled the Hazard: prevented contact between Hazard and Target OR mitigated consequences of Hazard/Target contact.
4. Assign a Safety Precedence Sequence Number (#) to each Barrier.
5. Assess HOW the Barrier failed: not provided/missing (not in place); not used/circumvented (but was in place); ineffective.
6. Determine WHY the Barrier failed (Step 5).
7. Validate analysis results.
8. Integrate this information in the Events & Causal Factors (E & CF) Chart.
Ammerman, Max. The Root Cause Analysis Handbook: A Simplified Approach to Identifying, Correcting, and Reporting Workplace Errors, Productivity, Inc., 1998, pp Wilson, Paul F. Dell, Larry D. & Anderson, Gaylord F., Root Cause Analysis: A Tool For Total Quality Management, ASQ Quality Press, Milwaukee, WI, 1993, pp Ammerman, The Root Cause Analysis Handbook ASQ
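One way to keep the barrier analysis record consistent is to capture each barrier with its target, hazard, precedence number, and HOW it failed. The sketch below assumes a simple in-memory record; `FailureMode`, `Barrier`, `failed_barriers`, and `unexplained` are hypothetical names, and the three failure modes are the ones named in the process above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class FailureMode(Enum):
    """HOW a barrier failed, using the three categories from the process."""
    NOT_PROVIDED = "not provided/missing (not in place)"
    NOT_USED = "not used/circumvented (but was in place)"
    INEFFECTIVE = "ineffective"


@dataclass
class Barrier:
    name: str
    target: str                                 # what the barrier protects
    hazard: str                                 # adverse effect/consequence
    precedence: int                             # Safety Precedence Sequence number
    failure_mode: Optional[FailureMode] = None  # None means the barrier held
    why_failed: str = ""                        # filled in when WHY is determined


def failed_barriers(barriers: List[Barrier]) -> List[Barrier]:
    """Return barriers that failed, ordered by precedence number (lowest first)."""
    failed = [b for b in barriers if b.failure_mode is not None]
    return sorted(failed, key=lambda b: b.precedence)


def unexplained(barriers: List[Barrier]) -> List[str]:
    """Names of failed barriers that still lack a documented WHY."""
    return [b.name for b in failed_barriers(barriers) if not b.why_failed.strip()]


if __name__ == "__main__":
    # Hypothetical example data.
    record = [
        Barrier("Step sign-off", "valve", "valve mispositioned", 4,
                FailureMode.NOT_USED, "time pressure at shift turnover"),
        Barrier("Position interlock", "valve", "valve mispositioned", 2,
                FailureMode.NOT_PROVIDED),
        Barrier("Control room alarm", "valve", "valve mispositioned", 3),
    ]
    print([b.name for b in failed_barriers(record)])
    print("Still need WHY for:", unexplained(record))
```

The `unexplained` list is a quick completeness check before the results are validated and placed on the chart.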

34 System Safety Design Order Of Precedence
MOST EFFECTIVE LOW HUMAN INTERFACE Eliminate hazards through design selection Incorporate Safety Devices Provide Warning Devices Use Procedures & Administrative Controls Select, train, supervise, and motivate to work safely Accept risks at appropriate management level $ MIL-STD-882D, Department Of Defense Standard Practice For System Safety, 10 February 2000, pp. 3-4 An engineered feature is usually more reliable, and nearly always more expensive, than an administrative control. A formal process, when followed, is more dependable than human recall. Identification of mishap risk mitigation measures. Identify potential mishap risk mitigation alternatives and the expected effectiveness of each alternative or method. Mishap risk mitigation is an iterative process that culminates when the residual mishap risk has been reduced to a level acceptable to the appropriate authority. The system safety design order of precedence for mitigating identified hazards is: Extreme risk- Design for minimum hazard. Include fail-safe features and redundancy. 4.4.a. Eliminate hazards through design selection If unable to eliminate an identified hazard, reduce the associated mishap risk to an acceptable level through design selection. Appropriate design/hardware changes are the most foolproof ways to prevent recurrence of undesirable events. The human element is virtually removed, and reliance on safety devices, procedures, training, and judgment is minimal. (The cost vs. the benefit must be considered) High risk- Control hazards to an acceptable risk level with safety devices. 4.4.b. Incorporate Safety Devices If unable to eliminate the hazard through design selection, reduce the mishap risk to an acceptable level using protective safety features or devices. This is the next most effective type of corrective action. Again, human involvement is minimal, since safety devices are automatic, reducing dependence on training, judgment, etc. 
(of course, these devices must be properly designed, installed, and maintained). Important- Provide devices that warn targets of hazards. 4.4.c. Provide Warning Devices If safety devices do not adequately lower the mishap risk of the hazard, include a detection and warning system to alert personnel to the particular hazard. The third most effective type of corrective action involves the use of warning devices, such as alarms, sirens, lights, etc. These are considered automatic, in that they require no human action for their activation, but their potential effectiveness is less than the previous two types of corrective action due to the need for a proper human response to the warning device in order for the corrective action to be completed. Moderate- Develop procedures to reduce and control hazards. 4.4.d. Use Procedures and Administrative Controls Where it is impractical to eliminate hazards through design selection or to reduce the associated risk to an acceptable level with safety and warning devices, incorporate special procedures and training. Procedures may include the use of personal protective equipment. For hazards assigned Catastrophic or Critical mishap severity categories, avoid using warning, caution, or other written advisory as the only risk reduction method. Reliance on procedures and other administrative controls is considered to be the weakest form of corrective action due to the total dependence on the proper human response. (People are the weakest link) Uneconomic- Select, train, supervise, and motivate personnel to work safely in presence of hazard. Negligible- Identify residual hazards, and accept the risks at the proper management level LEAST EFFECTIVE HIGH HUMAN INTERFACE MIL-STD-882D
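The order of precedence above can be used to sort candidate corrective actions so that the least human-dependent options are considered first. The sketch below is an assumption-laden illustration: the enum `Precedence`, its member names, and `rank_actions` are hypothetical, and cost versus benefit still has to be weighed separately.

```python
from enum import IntEnum
from typing import List, Tuple


class Precedence(IntEnum):
    """Six levels from the slide, 1 = most effective / lowest human interface."""
    ELIMINATE_BY_DESIGN = 1
    SAFETY_DEVICE = 2
    WARNING_DEVICE = 3
    PROCEDURE_ADMIN_CONTROL = 4
    SELECT_TRAIN_SUPERVISE = 5
    ACCEPT_RISK = 6


def rank_actions(actions: List[Tuple[str, Precedence]]) -> List[Tuple[str, Precedence]]:
    """Sort proposed actions from most to least effective type.

    sorted() is stable, so actions at the same level keep their input order.
    """
    return sorted(actions, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical proposals for illustration only.
    proposals = [
        ("Retrain the crew", Precedence.SELECT_TRAIN_SUPERVISE),
        ("Add a position interlock", Precedence.SAFETY_DEVICE),
        ("Revise the procedure", Precedence.PROCEDURE_ADMIN_CONTROL),
    ]
    for name, level in rank_actions(proposals):
        print(level.value, level.name, "-", name)
```

A plan whose top-ranked action sits at level 4 or below relies heavily on human response, which is a prompt to ask whether a design change or safety device was considered and rejected for a documented reason.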

35 BARRIER/CONTROL THAT SHOULD HAVE PRECLUDED THE INCIDENT
Defense Analysis Form Instructions for Use of Defense Analysis Form Identify each Target of the hazards/threats. (i.e., reactor, ESF, personnel, valve, etc.) Identify each Hazard/Threat (adverse effect/consequence)--typically start with the activity in progress at the time that the inappropriate action occurred. (i.e., reactor scram, ESF actuation, personnel injury, valve mispositioned, etc.) Identify Defenses that should have controlled Hazard-- failed or allowed the incident to progress. Prevented contact between Hazard and Target OR Mitigated consequences of Hazard/Target contact Assign a Safety Precedence Sequence # to each Defense. Assess HOW Defense failed not provided/missing (not in place) not used/circumvented (but were in place) ineffective Determine WHY Defense failed (Step 5 of Incident Analysis) Validate the results of the analysis with the information learned. The integrated method for using defense analysis involves superimposing defenses into the Action & Factors Chart analysis which was discussed earlier. Integrate this information in Actions & Factors Chart Finally determine what Corrective Action is needed to Restore the Defense to Effectiveness. Note: While defense analysis identifies missing or defective defenses, it has one weakness. If the investigator does not recognize ALL the failed defenses, the evaluation may be incomplete. Because using defense analysis alone is very time-consuming, it is recommended that defense analysis be used in conjunction with other techniques. Ammerman, Max. The Root Cause Analysis Handbook: A Simplified Approach to Identifying, Correcting, and Reporting Workplace Errors, Productivity, Inc., 1998, pp Wilson, Paul F. Dell, Larry D. 
& Anderson, Gaylord F., Root Cause Analysis: A Tool For Total Quality Management, ASQ Quality Press, Milwaukee, WI, 1993, pp EFFECT/ CONSEQUENCES (What Happened) List one at time- sequential order not required BARRIER/CONTROL THAT SHOULD HAVE PRECLUDED THE INCIDENT list all applicable physical and administrative defenses for each consequence Ammerman, The Root Cause Analysis Handbook ASQ

36 Example www.sandia.gov

37

38 Contributing [Causal] Factor Test Identify Contributing Influences
Evaluate factors (ovals) and flawed defense (broken barriers) on the Actions & Factors Chart by asking: If this factor had not existed, could this incident have occurred? If the answer is no, then you’re on your way toward finding a “Contributing Factor”! Causal factors (CF) are those actions, conditions, or events that directly or indirectly influence the outcome of a situation or problem. Contributing causes are defined as causes that by themselves would not create the problem but are important enough to be recognized as needing corrective action. Contributing causes are sometimes referred to as causal factors. Evaluate factors by asking: “If this factor had not existed, could this incident have occurred?” If the answer is no, then you’ve most likely found a “Contributing Factor”. NRC Inspection Procedure 95001, Inspection For One Or Two White Inputs In A Strategic Performance Area, Revision 11/09/09 NRC Inspection Procedure 95001

39 VALIDATE UNDERLYING FACTORS (Step 5)
Derived from INPO NUREG/CR-5455, NRC HPIP Entergy Root Cause Analysis Process Techniques WHY Factor Staircase A-B-C Analysis HOW-To-WHY Matrix Cause & Effect Tree Root Cause Test Root Cause Evaluation Extent of Cause Review Common Factor Analysis Stream Analysis You will be able to provide the organization with: Correctable cause(s) (i.e. the underlying factors) with written justification for addressing Correctable extent of condition(s) with written justification for addressing Correctable extent of cause(s) with written justification for addressing (if detected) Written justification for rejecting or not addressing possible underlying factors (“root causes”) Derived from INPO , OE-907, Good Practice, Root Cause Analysis, January 1990 NUREG/CR-5455, S , Vol. 2, Development of the NRC's Human Performance Investigation Process (HPIP), Investigator's Manual Entergy Root Cause Analysis Process (Rev 4) EN-LI-118, dated 6Jul06

40 The WHY Factor Staircase
Incident Execution Preparation Feedback Capabilities/Limitations Task Demands/Environment Outcomes Methods Resources Plan/Do/Check/Act Vision Beliefs Values Root Cause, Jack L. Martin, TXU Power- CPSES, HPRCT 2006, Slides 13 W. R. Corcoran Ph.D., P.E., President, Nuclear Safety Review Concepts, The Phoenix Handbook: The Ultimate Event Evaluation Manual for Finding Safety and Profit Improvement in Adverse Events, Windsor, CT, May 4, 2007 Version An effective investigation focuses on discovering the weaknesses embedded in the organization, its culture, and the physical plant, rather than simply singling out one or two individuals for counseling or training. If causal analysis focuses on individual capability, finding effective corrective action will be elusive because the real cause (s) of the incident will not be identified. The thought process associated with the WHY Factor Staircase is a helpful guide in causing an investigator to dig more deeply into the vision, values, and beliefs of an organization. Phoenix Handbook, Corcoran Root Cause, Martin, HPRCT 2006

41 Culture As investigators, do we want to focus on the very general issues of following procedures and laws? The resulting corrective actions will surely be a very general self-righteous sermon about how accountable people follow procedures and laws. (And bad people don't.) Alternatively, we could focus on discovering the specific underlying thought processes that influenced the decisions to do something other than follow the law or a procedure. The discovery of the factors influencing thoughts (mental models, beliefs, values) would lead to specific corrective actions aimed at changing mental models or beliefs or values. Changing any of these thought processes will change behaviors. Changed behaviors (norms) change cultures. The choice is between general corrective actions that will have only minimal effect or specific corrective actions that will produce sustainable positive change.

42 Re Active Error Analysis Results Behavior Job Performer Business
Embedded system flaws Touching the plant Touching the people Job Performer Behavior TWIN Analysis Task Preview Pre-Job Brief Post-Job Review Goals & Values Business Results INPO Human Performance Fundamentals Course

43 The “A-B-C’s”: 1st Occurrence
Desired behavior: Wear safety glasses A Safety policy Safety signs Safety procedure Safety briefing Just-in-time training B Wear safety glasses C Ears hurt Can’t see clearly Uncomfortable Feel odd Daniels, Aubrey C. and Daniels, James E. Performance Management, 4th Edition Revised, Performance Management Publications, Atlanta, GA, 2004, pp “Foundations of Behavioral Accident Prevention,” Eagles Management Support Course, BST, Inc. 1993, pages FND-60 to FND-64 Behaviors and consequences operate in a cause-and-effect balance, too. For every behavior, there is a consequence, and the consequences control future behaviors. Some consequences encourage the behaviors to be repeated or even expanded; others lead to reduction or ending of the behaviors. Even the absence of a consequence is actually a consequence because it is human nature to expect a response for every action. Consequences for current or past behaviors have the strongest influence on our future behavior. Foundations of Behavioral Accident Prevention: Eagles Management Support Course, BST, Inc. Performance Management, Daniels

44 The “A-B-C’s”: Subsequent Occurrence
Desired behavior: Wear safety glasses A Peers don’t wear Supervisors occasionally don’t wear Leave at home Embarrassed to ask for spare pair B Work w/o safety glasses C Ears don’t hurt Can see clearly Less bother Daniels, Aubrey C. and Daniels, James E. Performance Management, 4th Edition Revised, Performance Management Publications, Atlanta, GA, 2004, pp “Foundations of Behavioral Accident Prevention,” Eagles Management Support Course, BST, Inc. 1993, pages FND-60 to FND-64 Consequences for current or past behaviors have the strongest influence on our future behavior. Foundations of Behavioral Accident Prevention: Eagles Management Support Course, BST, Inc. Performance Management, Daniels

45 Defense Management Analysis
Md Defense Management Analysis Uneasy Attitude Walk-downs Questioning Attitude Morale Written Instruction Quality Task Preview Procedure Use Job Performer Skill, Knowledge, Proficiency Pre-Job Brief Procedure Adherence Equipment Labeling & Condition Housekeeping Turnover Self-Check Place-keeping Observations Work-Arounds & Burdens Processes/ Practices Tasks/ Behaviors Conservative Decision-Making Tool Quality & Availability Equipment Ergonomics 3 Part Communication Lockout-Tagout Stop…When Unsure Fitness-For-Duty Peer Check Walk-downs Leadership Interlocks Task qualifications Independent Verification Defense In Depth Performance Feedback Personal Protective Equipment Task assignment Alarms Goals/ Values Results/ Consequences Staffing Continuous Learning Clear Expectations Berms Change Management Redundant trains Benchmarking Problem-Solving Equipment Reliability Reviews & Approvals Containment Communication Practices Post-Job Critiques Simple, Effective Processes Management Practices Root Cause Analysis Equipment Protection Systems Independent Oversight Rewards & Reinforcement Accountability Performance Indicators Safeguards Equipment Task assignment Handoffs INPO Human Performance Fundamentals Course

46 Deeper Understanding We've been taught to ask "Why?" a lot of times. Dr. Aubrey Daniels* suggests that, in order to understand why people do what they do, beyond asking, "Why did they do that?"; ask, "What happens to them when they do that?" When you understand the real or perceived consequences of a behavior, you are able to understand the behavior better. By following a line of inquiry similar to "What happens to them when they do that?", the rootician will be able to find out whether a desired behavior is perceived by the Job Performer as rewarding or punishing. Also the rootician will be able to discover whether undesired behaviors are rewarded or challenged in that Job Performer's perception. (Note: Job Performers could be mechanics, nurses, vice presidents, senators, etc.) If the Job Performer perceives that a certain behavior will bring a Soon, Certain, and Positive consequence, we should expect that behavior-whether desired or undesired-to be repeated. If the Job Performer perceives that a certain behavior will bring a Soon, Certain, and Negative consequence, we should expect that behavior-whether desired or undesired-to be avoided. I have attached a procedure for doing this type of analysis.  In nuclear power, the Nuclear Regulatory Commission is expecting root cause analyses to get to underlying safety culture factors. Since one part of a culture is values, we are expected to root out what is really valued or devalued by the organization--in other words, what is being rewarded and what is being punished. We have to remember that what is being valued or devalued is in the perception of the Job Performer--it is not the "politically correct" answer we may get in a follow-up interview with the Job Performer's chain-of-command.  *Daniels, Aubrey C., Ph.D.; Performance Management, Performance Management Publications, Tucker, GA, 1989, pp

47 NRC: Safety Culture General Tree
NRC INSPECTION MANUAL CHAPTER 0305, OPERATING REACTOR ASSESSMENT PROGRAM, Issue Date 06/22/06 NRC IM Chapter 0305 Areas

48 Safety Culture Analysis
Do Last!!! Tasks/ Behaviors Processes/ Practices A. What is it? This analysis compares each of the 37 cross-cutting aspects of Safety Culture to the circumstances surrounding the event to determine if the Safety Culture contributed to the performance deficiency. The 37 cross-cutting aspects are described in RIS and addressed by the NRC’s baseline inspection program. The NRC is the only organization that can declare that an issue is cross-cutting. A cross-cutting issue is an NRC inspection finding associated with a cross-cutting aspect that is a significant contributor to the performance deficiency. B. Why is it useful? The purpose of this evaluation is to identify issues with cross-cutting tendencies that warrant enhanced corrective action to address adverse impacts to safety culture. NRC Inspection Manual CHAPTER 0305, Operating Reactor Assessment Program, Issue Date 06/22/06 Goals/ Values NRC IMC 0305

49 Root Cause Test The investigation needs to yield root and contributing causes that the organization can correct along with written justification for the corrective actions that are recommended. Investigators need to provide written justification for rejecting or not addressing possible underlying factors (“root causes”). The flow chart is an adaptation of a root cause test shared with HPRCT participants in the past. When can a cause be designated as an endpoint? Try the following criteria: The cause must be basic (i.e. not caused by something more important), AND The cause must be correctable by management (or does not require correction), AND If the cause is removed or corrected, the incident does not occur (or, in the future, recur). Once the underlying factors (“root causes”) of an incident have been identified, additional action should be taken to ensure that the correction of these underlying factors will prevent recurrence. To be validated, potential underlying factors (aka “root causes”) should meet the following three criteria in relationship to the problem (i.e. the incident): The incident would not have occurred had the underlying factors not been present. The incident will not recur due to the same contributing (causal) factors if the underlying factors (“root causes”) are corrected or eliminated. Correction or elimination of the underlying factors (“root causes”) will prevent recurrence of similar conditions. INPO , OE-907, Good Practice-Root Cause Analysis, May 1989 © 1999, William R. Corcoran, NSRC Corp., , Root Causes Root Cause Necessary Attributes Not Caused by More Important Deeper Underlying Factor Is “Causal” Without it the Incident Would not Have Happened Without it the Consequences Would Have Been Milder Adapted from work of Dr. William R. Corcoran, NSRC Corp.
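The three endpoint criteria above are joined by AND, so a candidate cause qualifies only when all three hold. The following minimal sketch encodes that logic; `CandidateCause`, its field names, and `is_endpoint` are hypothetical, and the answers to each question still come from the investigator's evidence.

```python
from dataclasses import dataclass


@dataclass
class CandidateCause:
    """Investigator's answers for one candidate cause (illustrative field names)."""
    description: str
    caused_by_something_more_important: bool
    correctable_by_management: bool
    requires_correction: bool
    removal_prevents_incident: bool


def is_endpoint(cause: CandidateCause) -> bool:
    """Apply the three criteria: basic AND correctable AND removal prevents the incident."""
    basic = not cause.caused_by_something_more_important
    # "Correctable by management (or does not require correction)"
    correctable = cause.correctable_by_management or not cause.requires_correction
    return basic and correctable and cause.removal_prevents_incident


if __name__ == "__main__":
    # Hypothetical candidates for illustration only.
    candidates = [
        CandidateCause("Operator skipped the verification step", True, True, True, True),
        CandidateCause("Verification is not reinforced by supervision", False, True, True, True),
    ]
    for c in candidates:
        print(c.description, "->", "endpoint" if is_endpoint(c) else "keep asking why")
```

A candidate that fails the first criterion is a signal to keep descending the WHY Factor Staircase rather than to stop.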

50 How to do an Extent of Cause Review
Human Performance Tool Peer Check
Extent of cause actions address where else the cause could create additional problems beyond the event or condition under investigation. Through investigation, the evaluator is trying to clearly define what the scope of the problems may be and what actions may be appropriate to resolve the issue. It is expected that the level of effort in determining and documenting the extent of condition is commensurate with the level of investigation and significance of the event. Provided below are questions that the evaluator should consider when determining the EOCo. These questions are intended to aid the evaluator in performing an effective EOCo, but the questions need to be considered in the proper context and with the appropriate understanding of the condition to ensure sufficient evaluation of the discovered condition. Proper context would involve applying these questions in terms of:
1. Determine the transportability of the condition.
a. Can the problem potentially affect other equipment, organizations, or processes?
b. Can the problem affect another unit?
c. Can the problem affect another site?
d. Can the problem result in a common mode failure?
e. Has consideration been given to initiating the same immediate actions on other equipment?
2. Equipment:
a. One component or a group of components?
b. Is it only this component type?
c. Is it more than this component type?
3. Human Performance:
a. Is it one task?
b. Is it all he/she did today? Or this week?
c. What other tasks did he/she do that we should be concerned about?
d. Should this be considered as an inappropriate action affecting others?
e. Will this task be performed by others and when?
4. For all additional issues identified as part of the EOCo, ask the following:
a. Close to actions taken?
b. Additional actions needed?
c. Additional investigation needed?

51 Common Factor Analysis Steps
Step 1: Determine the Scope of the CFA
Step 2: Gather Data
Step 3: Determine Which Information to Evaluate
Step 4: Categorize the Data
Step 5: Identify Areas for Further Analyses
Step 6: Analyze Areas of Interest
Step 7: Develop and Validate Causal Theories
Step 8: Plan Corrective Actions
Step 9: Report Learnings
Adapted from Incident Investigation Training, Callaway Plant
There is more than one way to perform the Common Factor Analysis. The method below is one that will provide successful results.
1. Identify a group of incidents to be evaluated. These incidents should have similar attributes such as processes, programs, department, equipment, Condition Report Significance level, etc.
2. Gather supporting documentation and determine the causes and contributing factors.
3. Review the causes and contributing factors to develop groupings of similar or related causes and contributing factors.
4. For groupings of causes and contributing factors that appear to be more numerous than others, perform further analysis to attempt to identify the underlying weaknesses in management, supervision, programs, processes, procedures.
5. Develop corrective actions to address the identified common factors.
Note that the common factor analysis is used in place of the normal templates for other types of causal analysis. The extent of condition and extent of cause are inherent in the method of analysis and do not require additional consideration.
TXU Power Cause Analysis Handbook, Rev. 7, June 28, 2005, p. 40
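The grouping and counting in steps 3 and 4 of the method above can be sketched with a frequency count across incidents. This is a simplified illustration under stated assumptions: causes have already been coded with consistent labels, the function name `common_factors` and the `min_count` threshold are hypothetical, and the incident identifiers are made up.

```python
from collections import Counter
from typing import Dict, List, Tuple


def common_factors(incidents: Dict[str, List[str]], min_count: int = 2) -> List[Tuple[str, int]]:
    """Count in how many incidents each coded cause appears.

    A cause is counted once per incident even if listed twice. Results are
    sorted by count (highest first), then alphabetically for a stable order.
    """
    counts = Counter()
    for causes in incidents.values():
        counts.update(set(causes))
    frequent = [(cause, n) for cause, n in counts.items() if n >= min_count]
    return sorted(frequent, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical condition reports and cause codes.
    incidents = {
        "CR-1": ["procedure unclear", "no peer check"],
        "CR-2": ["procedure unclear", "time pressure"],
        "CR-3": ["procedure unclear", "no peer check"],
    }
    for cause, n in common_factors(incidents):
        print(f"{cause}: {n} of {len(incidents)} incidents")
```

The most numerous groupings are only candidates for further analysis; the count itself does not identify the underlying weakness.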

52 PLAN CORRECTIVE ACTIONS (Step 6)
Derived from INPO NUREG/CR-5455, NRC HPIP Entergy Root Cause Analysis Process Techniques Action Plan Solution Selection Tree Solution Selection Matrix Change Management Active Coaching Plan S.M.A.R.T.E.R. Effectiveness Review Contingency Plan Communication Plan
(A Corrective Action Plan has the following three major components, completed in order: 1. Outcomes, 2. Methods, and 3. Resources.*) The actions need context. If a significant event has occurred, we often "root out" individual, leader, or organizational behaviors that need to be changed. A quality corrective action plan to implement and sustain a behavior change has to include the following elements in sequence to assure Alignment and Accountability:
1. Mental Model: Do we have the "Right Picture" for this particular behavior? If not, find out what “good” looks like by benchmarking, review of Best Practices, etc. (Sub-step: Agree on the Mental Model)
2. Written Description: Paint the Right Picture in procedures, job aids, written instructions. (Sub-step: Get Agreement in writing)
3. Communicate: Assure Job Performers are aware of the behavior standard / expectations [by training, newsletters, stand-downs, message maps, etc., but primarily by example] (Sub-step: Get Agreement on the Communication Plan)
4. Monitor: Are we getting the expected change in behavior with the right results (by Observing in the “field”, by Performance Indicators, etc.)? (Sub-step: Adjust/adapt the original plan based on opportunities to improve implementation)
5. Feedback: +/- Positively reinforce desired behaviors/correct inappropriate behaviors.
We need to include critical points where it is necessary to reach Agreement on plans going forward. The ability to Adjust and Adapt the plan based on new inputs must also be built in. Some of the actions Tedd listed in his example might fit the five elements, but the actions would need to be done in context and in sequence. Again, this plan template is not aimed at preventing paper cuts. It is a plan with the purpose of preventing future Significant Conditions Adverse to Quality and Significant Events. For "paper cuts" the plan would not be as comprehensive. As analysts, we still have to come up with corrective action plans that uncompromisingly achieve the balance between Production and Protection. I do not know the correct quantity (#) of pages for a corrective action plan. If the number of pages becomes my focus, I will start asking foolish questions like, "What's the right number of pages?" and "What size font is the smallest you will accept?". My main concern is the plan's quality. If the plan has the 5 general elements listed above and the individual actions meet some version of the S.M.A.R.T.E.R.** criteria, I would say implementation of the plan will produce and sustain the quality results we have envisioned. That plan will have the Outcomes (What and Why), Methods (How), and Resources (Who, When, Where, and How Much) questions answered. The quantity of pages is not in my acceptance criteria for the quality of a corrective action plan.
*O-M-R: According to the U.S. Army's Organizational Effectiveness training.
**S.M.A.R.T.E.R.: Specific, Measurable, Actionable/Achievable/Accountable, Relevant/Reasonable, Timely, Effective, Reviewed for unintended consequences

53 Developing A Corrective Action Plan To Prevent Recurrence
4/5/2017
Develop alternative actions which address the underlying factors [i.e. the root cause(s)].
Evaluate alternative courses of action.
Ensure corrective actions address the underlying factors [i.e. the root cause(s)].
Decide which alternatives will be recommended to management.
Map out implementation of interventions/actions that will prevent or mitigate recurrence.
Plan for contingencies.
Develop and select solutions. The following general steps are used during the development of corrective actions:
Develop a solution that will reduce or eliminate the root cause.
Brainstorm solutions. Don’t stop with the first one: “The first solution is seldom the best.”
Clarify meanings to ensure understanding of each solution.
Consolidate similar solutions.
Prioritize and select the top solution for testing.
Model and test solutions. Additionally, solutions should be validated. A probable cause is “innocent” until proven guilty.
Develop a “mock up” model to test.
Test only one solution at a time! (Sample of 25-33%)
Avatar International Inc., 1985 Atlanta, Georgia from Georgia Power

54 The Success Cycle

55 Institutionalization Plan
Behavior Change Institutionalization Plan
Columns: Factor/Cause Being Addressed; Corrective Action Step (1. Right Picture, 2. Communicate, 3. Monitor, 4. Feedback); Who (Owner); When (Due Date)
Accountability has to do with the ability of the members of an organization to clearly identify who is answerable and responsible for a particular outcome. M. Paradies and L. Unger, TapRooT® Root Cause Tree Dictionary, Systems Improvement, Inc. Knoxville, TN, 1999
IMNSHO, it is within a manager's pay grade to fix a dysfunctional reward system. Unfortunately, as rooticians, we may have discovered in an investigation that "management failed to provide proper examples and rewards for quality workmanship, good safety performance, and environmental stewardship in the organization", but then only came up with a narrowly-scoped corrective action plan aimed at getting managers and supervisors to provide better examples and rewards. This is akin to coming up with a corrective action plan only aimed at fixing a procedure, or only aimed at training, or only aimed at coaching/counselling the worker. Instead, we need to plan a more comprehensive intervention that addresses how we are going to (a) get the proper mental model for a certain behavior, (b) communicate the desired behavior, (c) monitor the desired behavior, and (d) reward the desired behavior. One of my former plant managers thought that, with every corrective action plan to prevent recurrence (CAPR), there should also be a line item in the plan to reinforce the desired behavior.
The basic steps for institutionalizing a behavior change are as follows:
1. Define standard / expectations (find out what “good” looks like by benchmarking, etc.). Get the “Right” Mental Model.
2. Communicate standard / expectations (training, newsletters, etc.).
3. Monitor expected behaviors and results (by Observing, by Performance Indicators, etc.).
4. Feedback +/- (Positively reinforce desired behaviors/correct inappropriate behaviors).
© Konsulting, LLC

56 S.M.A.R.T.E.R. Criteria
Specific: What exactly needs to be done? Focus on results. WHO does WHAT by WHEN.
Measurable: Describes desired behaviors so an observer can compare observed behavior to a desired behavior.
Attainable: Doable? Feasible? Realistic? Cost/Benefit? Agreed to by Stakeholder? Good business?
Related: Logical tie between the problem and cause(s). Logical tie between cause(s) and corrective actions.
Time-sensitive: Should be completed before next “shot on goal”. If not, interim corrective actions are needed.
Effective: Degree of Dependability/Reliability. Leveraged solution w. Behavior Engineering Model.
Reviewed: By Stakeholders? By Subject Matter Experts? For Unintended Consequences?
CH2M HILL Hanford Group, Inc., Event Investigation Process, TFC-OPS-OPER-C-14, REV C, Issue Date July 27, 2006, p. 9-10
Specific? Clearly state the desired end result or action; do not just restate the condition. Can you tell who is going to do what, and when? Identify a specific person/group responsible for the action. Are all compensatory measures specified in numbers? (Examples: bad – “Clean up the air”; good – “Operations will use high-efficiency air filters to reduce particulate contamination to <0.01 ppm.”)
Measurable? Clearly define the necessary actions so a reviewer can easily determine the completion of the actions. Can the compensatory measure be measured (quantitatively) to see when it is done and to see if it worked (will it prevent future incidents)? For example, a measurable compensatory measure would contain the following: “Revise step 6.2 of the procedure to reflect the correct equipment location.” This measurable attribute would require a review of the procedure to see that the new equipment locations were correct.
Attainable? Will this compensatory measure work? Is it practical? Realistic? The action shall be within the control of the person/organization assigned to perform the action. Can it be implemented? Is there a simpler or less expensive way to do the same thing? The group/individual that will be assigned to implement this corrective action must understand what action they need to take.
Related? Is the proposed action related to the original problem? Is the corrective action related to the cause? Is the benefit worth the cost?
Time-sensitive? How long will it take to complete the actions? Should we take interim measures until final corrective actions are in place? The action can be completed within an appropriate time frame, before more significant consequences occur from repeat events.
Effective? Review the usefulness of the corrective actions. How? Is waiting for recurrence a good way? Corrective actions that depend more on human response are generally less effective than those involving physical devices.
Reviewed? Will the compensatory measure have undesirable effects? Go through the corrective actions for unintended consequences. Do they cause any potential negatives? 50.59? Change Management? Have negative side effects been avoided?

57 Institutionalization Plan
S.M.A.R.T.E.R. WHO WHEN Cause/Factor Being Addressed Corrective Action Plan To Prevent Recurrence Specific Measurable Attainable Related Timely Effective Reviewed Owner Due Date 1. Right Picture 2. Communicate 3. Monitor 4. Feedback WHO “Everybody’s business is nobody’s business” If you don’t make an actual assignment to an actual person, there is a good chance that nothing will get done “We” in assignments actually means “not me” WHAT Spell out your exact deliverable – What exactly do you want to happen? The fuzzier the expectation, the higher likelihood of disappointment Tell the performer what you want; Use physical examples, paint a clear picture WHEN Assign an end date Preferably before the next opportunity for the problem to occur Root Cause Analysis Training Course CAP-02, Palo Verde Nuclear Generating Station
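The WHO / WHAT / WHEN points above can be turned into a quick screening check on each assignment. The following is only a minimal sketch: the `Assignment` class, the `assignment_gaps` function, and the list of vague owner words are hypothetical names chosen for illustration, not part of any plant procedure.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical examples of "owners" that do not name an actual person.
VAGUE_OWNERS = {"", "we", "everybody", "everyone", "all", "tbd"}


@dataclass
class Assignment:
    who: str               # an actual person or position
    what: str              # the exact deliverable
    when: Optional[date]   # the assigned end date, if any


def assignment_gaps(a: Assignment) -> List[str]:
    """Return which of WHO, WHAT, WHEN are missing or too vague."""
    gaps = []
    if a.who.strip().lower() in VAGUE_OWNERS:
        gaps.append("WHO")
    if not a.what.strip():
        gaps.append("WHAT")
    if a.when is None:
        gaps.append("WHEN")
    return gaps
```

A check like this only catches missing fields. Whether the deliverable is clear enough, and whether the end date falls before the next opportunity for the problem to occur, still needs a reviewer's judgment.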

58 Corrective Action Effectiveness Scale
MIL-STD-882D
When developing corrective actions, consider a System Safety Order of Precedence. The different levels are (from the most to the least preferred risk mitigation methods):
1. Eliminate hazards through design selection. Appropriate design/hardware changes are the most foolproof ways to prevent recurrence of undesirable events. The human element is virtually removed, and reliance on safety devices, procedures, training, and judgment is minimal. (The cost vs. the benefit must be considered.)
2. Incorporate Safety Devices. This is the next most effective type of corrective action. Again, human involvement is minimal, since safety devices are automatic, reducing dependence on training, judgment, etc. (Of course, these devices must be properly designed, installed, and maintained.)
3. Provide Warning Devices. The third most effective type of corrective action involves the use of warning devices, such as alarms, sirens, lights, etc. These are considered automatic, in that they require no human action for their activation, but their potential effectiveness is less than the previous two types of corrective action because a proper human response to the warning device is needed for the corrective action to be completed.
4. Use Procedures and Administrative Controls. Reliance on procedures and other administrative controls is considered to be the weakest form of corrective action due to the total dependence on the proper human response. (People are the weakest link.)
DoE/SSDC /4-Rev. 3, p. 46
Corrective actions such as counseling, rewriting procedures, initiating night orders, etc., are usually destined to fail due to their complete reliance on people. Other problems with these types of actions include the actual administration of the actions: How long are night orders kept in the night order book? How long will a person remember verbal counseling? How will others benefit from such counseling? How will a new individual in the department benefit from all the administrative fixes applied before his or her time? How effective is a four-hour training session when you are trying to change actions that have been the norm for years and years? And the list goes on. Another problem with administrative controls is that they are easy to administer and complete, and the regulators seem to buy off on them. And yet events that have been “corrected” with primarily administrative fixes are almost certain to recur.
Safety Precedence Sequence first appeared in “Applications of MORT to Review of Safety Analyses,” DOE/SSDC-17, by Briscoe, Lofthouse, and Nertney. See also W. G. Johnson, MORT Safety Assurance Systems, New York, Marcel Dekker, 1980.
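One way to keep the order of precedence visible when comparing candidate corrective actions is to tag each action with its level and sort on it. This is a rough sketch under stated assumptions: the `Precedence` enum and both helper functions are hypothetical names, and the level assigned to each action is the analyst's own call.

```python
from enum import IntEnum


class Precedence(IntEnum):
    """Lower value = more preferred, following the four levels in the text."""
    DESIGN = 1            # eliminate the hazard through design selection
    SAFETY_DEVICE = 2     # incorporate safety devices
    WARNING_DEVICE = 3    # provide warning devices
    PROCEDURE_ADMIN = 4   # procedures and administrative controls


def most_preferred_first(actions):
    """Sort (description, Precedence) pairs from most to least preferred."""
    return sorted(actions, key=lambda pair: pair[1])


def admin_only(actions):
    """True when every proposed action is a procedure/administrative control."""
    return bool(actions) and all(
        level == Precedence.PROCEDURE_ADMIN for _, level in actions
    )
```

A plan for which `admin_only` returns `True` is the kind the notes above warn about: it relies entirely on a proper human response.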

59 Effectiveness Review General Flow

60 M.A.S.T. Effectiveness Plan
METHOD: Describe the means that will be used to verify that the actions taken had the desired outcome.
ATTRIBUTES: Describe the process characteristics to be monitored or evaluated.
SUCCESS: Establish the acceptance criteria for the attributes to be monitored or evaluated.
TIMELINESS: Define the optimum time to perform the effectiveness review.
Why is it useful? Effectiveness reviews are required for Corrective Actions to Prevent Recurrence (CAPRs).
How is it done? Develop the Effectiveness Review Plan by completing each of the four attributes of M.A.S.T.
When is it done? During Step 6 (Plan Corrective Actions).
Grand Gulf Benchmarking/Trip Report, page 2 SA06-PI-B01, June 18-22, 2006 Grand Gulf Nuclear Station
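The four M.A.S.T. attributes lend themselves to a simple record with a completeness check. The sketch below is illustrative only: `MastPlan` and `missing` are hypothetical names, and the check confirms only that each attribute has been filled in, not that the entry is adequate.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class MastPlan:
    method: str = ""      # how the outcome of the actions will be verified
    attributes: str = ""  # process characteristics to monitor or evaluate
    success: str = ""     # acceptance criteria for those attributes
    timeliness: str = ""  # when the effectiveness review should be done

    def missing(self) -> List[str]:
        """Names of M.A.S.T. attributes that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]
```

An effectiveness review plan would be considered ready for review only when `missing()` returns an empty list.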

61 Performance Indicator Development How is it done?
Develop performance measures following this general sequence:
Step 1: Identify; then record the Organizational Outcome/Output. Organizational Outcome/Outputs may be located in the following sources: Cycle Strategic Plan Strategy; INPO Performance Objectives and Criteria (POCs); NRC, OSHA, and DNR Performance Requirements; Operating License, FSAR and Other License Basis Documents; Corporate And Division Performance Expectations; EPRI, NEI, NUSMG, etc. (other formal sources of "Best Practices"); Benchmarking of the nuclear and other related industries.
Step 2: Identify; then record the Process Outcome/Output (e.g. Cycle Plan Strategic Objective or INPO organizational excellence outcome).
Step 3: Identify; then record the Process Purpose (e.g. INPO Performance Objective).
Step 4: Identify; then record the most significant outputs of the organization, process, or job (e.g. INPO Performance Objective Criteria).
Step 5: Classify; then record the "critical dimensions" of performance for each of these outputs. Critical dimensions should be derived from both the needs of the internal and external customers who receive the outputs, and the financial needs of the business.
Step 6: Develop; then record the measures for each critical dimension. For example, if "ease of use" has been identified as a critical dimension of quality for a given output, one or more of the measures should answer this question: "What indicators will tell us if our customers find our product or service (output) easy to use?"
Step 7: Develop; then record standards for each measure. Note: A standard is a specific level of performance expectation. For example, if a measure for ease of use is "number of customer questions/complaints per month," a standard may be "no more than two questions/complaints per month."
Step 8: Define; then record the specified levels of success using annunciator windows which indicate whether the desired results have been achieved.
Improving Performance: How to Manage the White Space on the Organization Chart, 2nd Ed. Geary A Rummler & Alan P. Brache, Jossey-Bass Publishers, San Francisco, pp
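Steps 6 through 8 above (measure, standard, indication of success) can be illustrated with the "ease of use" example from the text. This is a minimal sketch that assumes the standard is an upper limit ("no more than N"). The `Measure` class and the two window labels are hypothetical; a real annunciator scheme would likely have more levels.

```python
from dataclasses import dataclass


@dataclass
class Measure:
    critical_dimension: str   # e.g. "ease of use" (Step 5)
    name: str                 # what is counted (Step 6)
    standard_max: float       # standard as an upper limit (Step 7)

    def window(self, observed: float) -> str:
        """Report whether the desired result was achieved (Step 8)."""
        return "achieved" if observed <= self.standard_max else "not achieved"


ease_of_use = Measure(
    critical_dimension="ease of use",
    name="customer questions/complaints per month",
    standard_max=2,
)
```

With this standard, a month with two complaints still shows "achieved" and a month with three does not.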

62 REPORT LEARNINGS (Step 7)
Derived from INPO NUREG/CR-5455, NRC HPIP Entergy Root Cause Analysis Process
In high reliability organizations (such as nuclear power stations), the report is the required written record providing management, external regulators, and other customers the assurance that:
the root causes (underlying factors) and contributing factors of risk significant performance issues are understood;
the extent of condition and extent of cause of risk significant performance issues are identified;
corrective actions to risk significant performance issues are sufficient to address the root causes and contributing causes, and to prevent recurrence.
The Investigation Report should answer these questions:
WHAT HAPPENED? (Including the role of all individuals directly and indirectly involved, the setting for the event, and any impact or potential impact of the event that is relevant to the conduct of the practice or business.)
WHY DID IT HAPPEN? (Including description and discussion of the main and underlying reasons for the event occurring, where this is possible.)
WHAT HAVE YOU LEARNED? (Reflect on the significant event and highlight personal and, if appropriate, team-based learning.)
WHAT CHANGES WILL YOU MAKE TO PREVENT IT FROM HAPPENING AGAIN? (What action will be taken, where this is relevant or feasible, ensuring that all relevant individuals are involved; how will you monitor the changes?)
Forms: Report Template; Grade Cards/Scoresheets

63 Report Answers General Questions
The investigation will have determined the following: What was expected (anticipated consequences); What has happened (real consequences); What could have happened (potential consequences); Cause-effect relations; Faulty/failed technical elements (structures, systems, or components); Inappropriate actions (human, management, organizational); Failed or missing defenses (barriers, controls). IAEA-TECDOC-1600, Best Practices in the Organization, Management and Conduct of an Effective Investigation of Events at Nuclear Power Plants, International Atomic Energy Agency, September 2008, p. 11 Effective communication of investigation findings is nearly as important as the investigation itself! management needs to understand the basis for the action plan report is a historical record of your findings report must meet the content and format requirements of the Corrective Action Procedure Incident Investigation Training, Callaway Plant So whether the analyst is "telling the story" using Events and Causal Factors Charting or a Cause and Effect Chart, here are some recommendations to address concerns: 1. If the report "distorts" the facts of the story by over-emphasizing some or under-emphasizing others, the report writer needs a course correction. 2. If the report does not allow the customer to read or not read the details, the report writer needs to empathize with all three types of audiences. 3. If the report is "too long", the report writer needs to ask for the required standard length of a report (1 page? 10 pages? 40 pages?). I hope you recognize that, unless the "tail is allowed to wag the dog", there is no specific answer to this question. Note: I have always tried to keep the Executive Summary to one page (and succeeded in most cases). 
When asked about the length of the rest of the report, my answer is that the right length is the length that (1) covers the pertinent facts needed by all audiences to understand the basis for conclusions and recommendations and (2) also assures the audience/customer that the team has completed reasonable efforts to "leave no stone unturned". IAEA-TECDOC-1600

64 Report Answers Specific Questions
What was the Job Performer focused on? Could they do the Job if their lives depended on it? Equally qualified person likely to make same error? What were the factors that directly resulted in the nature, the magnitude, the location, and the timing of the key consequences? What happens to them when they do what they do? A. Daniels, Performance Management: Improving Quality and Productivity Through Positive Reinforcement, Performance Management Publications, Inc. 1984 W. R. Corcoran Ph.D., P.E., President, Nuclear Safety Review Concepts, The Phoenix Handbook: The Ultimate Event Evaluation Manual for Finding Safety and Profit Improvement in Adverse Events, Windsor, CT, May 4, 2007 Version Mager, Robert F. and Pipe, Peter. Analyzing Performance Problems, 3rd Edition, CEP Press, Atlanta, Georgia. 1997 Your report needs to answer questions, not raise them. Always ensure that: The Incident Description clearly describes what happened and how it occurred. The Contributing Factors are logical and supported by factual information in the event description. The Extent of Condition and Extent of Cause make sense based upon the contributing factors and related operating experience. The report doesn’t bring up issues without indicating how they will be resolved. Remember your audience. Clearly explain terms and issues whose meaning may not be obvious. Avoid unnecessary details that add little or no value. Include times and dates as necessary to allow the reader to understand the event’s progression. Define acronyms the first time they are used. Clearly delineate if information and/or conclusions are based upon assumptions. Specify their basis. Consider the use of pictures, diagrams, figures, or plots to aid the reader in understanding the issues. Use vertical bars to denote new material in revised reports. List individuals by position (e.g., I&C Technician #1) rather than by name in the body of the report. Ask for a peer check of your report prior to submitting it for approval. 
Incident Investigation Training, Callaway Plant Palo Verde Root Cause Analysis Training, CAP-02 Mager & Pipe, Analyzing Performance Problems Corcoran , Phoenix Handbook Daniels, Performance Management

65 Report Answers Regulator Questions
Who identified issue (licensee? regulator? self-revealing?) under what conditions? How long did issue exist? prior opportunities to identify? Plant-specific risk consequences? individual & collective compliance concerns? Systematic method used to identify underlying factors? Evaluation detail commensurate with significance of the problem? Evaluation considered prior occurrences? operating experience? Extent of condition addressed? extent of cause? Corrective actions for each underlying factor? or adequate evaluation why no corrective actions are necessary? Corrective action priority considers risk significance & regulatory compliance? Schedule established for implementing and completing corrective actions? Quantitative/qualitative effectiveness measures of actions to prevent recurrence? Corrective actions adequately address Notice of Violation, if applicable? 02.01 Problem Identification Determine that the evaluation documented who identified the issue (i.e. licensee-identified, self-revealing, or NRC-identified) and under what conditions the issue was identified. Determine that the evaluation documented how long the issue existed and prior opportunities for identification. Determine that the evaluation documented the plant-specific risk consequences, as applicable, and compliance concerns associated with the issue(s) both individually and collectively. 02.02 Root Cause, Extent of Condition, and Extent of Cause Evaluation Determine that the problem was evaluated using a systematic methodology to identify the root and contributing causes. Determine that the root cause evaluation was conducted to a level of detail commensurate with the significance of the problem. Determine that the root cause evaluation included a consideration of prior occurrences of the problem and knowledge of prior operating experience. Determine that the root cause evaluation addresses the extent of condition and the extent of cause of the problem. 
02.03 Corrective Actions Determine that appropriate corrective actions are specified for each root and contributing cause or that the licensee has an adequate evaluation for why no corrective actions are necessary.  Determine that corrective actions have been prioritized with consideration of risk significance and regulatory compliance.  Determine that a schedule has been established for implementing and completing the corrective actions.  Determine that quantitative or qualitative measures of success have been developed for determining the effectiveness of the corrective actions to prevent recurrence.  Determine that the corrective actions planned or taken adequately address a Notice of Violation (NOV) that was the basis for the supplemental inspection, if applicable. NRC Inspection Procedure 95001, Inspection For One Or Two White Inputs In A Strategic Performance Area, Revision 04/09/09, pp. 4-5 NRC Inspection Procedure 95002, Inspection For One Degraded Cornerstone Or Any Three White Inputs In A Strategic Performance Area, Revision 04/09/09, pp. 4-5 NRC IP 95001 NRC IP 95002

66 Questions? Later Frederick J. Forck, CPT* 4Konsulting, LLC
2320 Knight Valley Drive Jefferson City, Mo Phone: Fax: *International Society for Performance Improvement (ISPI) Certified Performance Technologist (CPT)

67 Similar-Same-Similar
Extent of Condition Review Criteria: Object (Person, Place, Thing); Application (Activity, Form, Fit, Function); Defect (Flaw, Failing, Deficiency)
Deviation Statement for each criterion:
Same-Same-Same: An Identical Object in an Equivalent Application with a Matching Defect.
Same-Same-Similar: An Identical Object in an Equivalent Application with a Related Defect.
Similar-Same-Same: A Comparable Object in an Equivalent Application with a Matching Defect.
Similar-Same-Similar: A Comparable Object in an Equivalent Application with a Related Defect.
Same-Similar-Same: An Identical Object in a Corresponding Application with a Matching Defect.
Similar-Similar-Same: A Comparable Object in a Corresponding Application with a Matching Defect.
Same-Similar-Similar: An Identical Object in a Corresponding Application with a Related Defect.
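The criteria names follow a fixed pattern: the three words are, in order, the Object, Application, and Defect comparisons, and each "Same" or "Similar" maps to a fixed phrase. A small sketch of that mapping follows. The function name `deviation_statement` is hypothetical, and the code will also build the eighth combination (Similar-Similar-Similar), which the slide does not list.

```python
# Phrase used for each comparison, taken from the wording on the slide.
PHRASES = {
    "object": {"same": "An Identical Object", "similar": "A Comparable Object"},
    "application": {
        "same": "an Equivalent Application",
        "similar": "a Corresponding Application",
    },
    "defect": {"same": "a Matching Defect", "similar": "a Related Defect"},
}


def deviation_statement(obj: str, application: str, defect: str):
    """Build the criterion label and statement from three 'same'/'similar' calls."""
    label = "-".join(word.capitalize() for word in (obj, application, defect))
    text = (
        f"{PHRASES['object'][obj]} in {PHRASES['application'][application]} "
        f"with {PHRASES['defect'][defect]}."
    )
    return label, text
```

For example, `deviation_statement("similar", "same", "similar")` reproduces the Similar-Same-Similar entry that gives this slide its title.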

68 Driver’s Side Front Tire on Rental Car Parked in My Driveway Flat
Extent of Condition Review Criteria: Object (Person, Place, Thing); Application (Activity, Form, Fit, Function); Defect (Flaw, Failing, Deficiency)
Deviation Statement: Driver’s Side Front Tire on Rental Car Parked in My Driveway Flat
Same-Same-Same: An Identical Object in an Equivalent Application with a Matching Defect. Objects: Other Tires on Rental Car; Tires on Pickup Truck.
Same-Same-Similar: An Identical Object in an Equivalent Application with a Related Defect. Defect: Low on Air.
Similar-Same-Same: A Comparable Object in an Equivalent Application with a Matching Defect. Objects: Tires on Boat Trailer; Tires on Bicycle.
Similar-Same-Similar: A Comparable Object in an Equivalent Application with a Related Defect.
Same-Similar-Same: An Identical Object in a Corresponding Application with a Matching Defect. Objects: Car Spare Tire; Tires on Son’s Vehicle; Tires on Spouse’s Vehicle. Applications: In Trunk as a Spare; Parked on the Street; Parked in the Garage.
Similar-Similar-Same: A Comparable Object in a Corresponding Application with a Matching Defect. Object: Garden Tractor. Application: Parked Behind My House.
Same-Similar-Similar: An Identical Object in a Corresponding Application with a Related Defect. Application: Parked on Street.

69 Fault Tree Form
[Figure: fault tree form with OR gates] Refer to examples on the walls.
Adapted from Callaway Plant “Fault Tree Analysis” Training

70 Task Analysis Technique
(1) Paper & Pencil Input Steps in Procedure or Practice (2) Walk Through by Analyst or trained individual. (3)  Questions/ Conclusions about how task was/should be performed. What is Task Analysis? Task analysis is the process of first determining how a task should be performed, and then comparing that information against how the task was actually performed. Differences can then be analyzed as potential causal factors for the incident you're investigating. Task Analysis involves researching the task of interest, breaking it down to its critical elements, and then reconstructing task performance through reenactment or interviews. Why Do A Task Analysis? It’s a simple truth that the vast majority of incidents you’ll be assigned to investigate will involve an activity that produced undesirable results. In such cases, it’s imperative that we as investigators understand the sequence of actions, tools and equipment involved when performing the task in question. Only then are we truly capable of identifying discrepancies between expected and actual task performance that could key us in to how the event occurred. Task Analysis additionally provides the investigator an opportunity to identify previously undetected flaws in the task methodology that, in themselves, represent potential causal factors for the incident. Reenacting the task helps us identify environmental conditions (e.g. noise, lighting) and other factors (e.g. labeling) that may also have affected the outcome. Wolf Creek Nuclear Operating Corporation Root Cause Investigator's Manual, Revision 0 WCNOC 70

71 Example: Task Analysis Technique
(1) Paper & Pencil Input: Steps in Procedure or Practice
1. Locate proper “pig trap”.
2. De-pressurize line pressure.
3. Verify that the line has been de-pressurized.
4. Open line.
5. Insert pig.
6. Close line.
7. Re-pressurize line.
(2) Walk Through by Analyst or trained individual
Pig trap is not labeled.
Nearest pressure gauge is up 2 flights of stairs about 50’ away.
Other pig traps all have pressure gauges near opening.
(3) Questions/Conclusions about how task was/should be performed
Is there a requirement to label?
Why is the location without a pressure gauge? Has it been modified?
Steps are all very general. How does the operator know how to do them?
Wolf Creek Nuclear Operating Corporation Root Cause Investigator's Manual, Revision 0
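The core comparison in Task Analysis, how the task should be performed versus how it was actually performed, can be sketched as a simple difference between two step lists. The expected steps below are the pig-launch steps from the example. The "performed" list is invented for illustration only, and `compare_task` is a hypothetical helper that ignores step order.

```python
EXPECTED = [
    "Locate proper pig trap.",
    "De-pressurize line pressure.",
    "Verify that the line has been de-pressurized.",
    "Open line.",
    "Insert pig.",
    "Close line.",
    "Re-pressurize line.",
]


def compare_task(expected, performed):
    """Return (omitted steps, added steps); differences are potential causal factors."""
    omitted = [step for step in expected if step not in performed]
    added = [step for step in performed if step not in expected]
    return omitted, added


# Hypothetical reconstruction from interviews: the verification step was skipped.
performed = [step for step in EXPECTED if not step.startswith("Verify")]
omitted, added = compare_task(EXPECTED, performed)
```

Each difference found this way is only a candidate. The walk-through and the questions in column (3) are still needed to decide whether it contributed to the event.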

72 Example: Chlorine Tanker Fill Critical Human Activity
Error Type: Wrong Information Obtained Error Description: Wrong Weight Entered Consequence: Alarm does not sound before tanker overfills Derived from: NUREG/CR-5455, S , Vol. 2, Development of the NRC's Human Performance Investigation Process (HPIP), Investigator's Manual UE Quality Improvement Process Manual, July 1992 Chlorine Tanker Fill Task Analysis Incident: September 6, 2002 India Chlorine Plant Explosion Kills 3, Injures 1 Two persons were killed and 18 injured, three of them seriously, in an explosion in the chlorine filling plant of Gujarat Alkalis and Chemicals Limited (GACL). One of the injured died later. The accident occurred while a chlorine tanker was being filled. The operator filling the tanker realized that something was wrong and was moving the tanker to the evacuation bay when it exploded. Guidelines for Preventing Human Error in Process Safety, Center for Chemical Process Safety of the American Institute of Chemical Engineers, © 1994, page 213. Prepare tanker for filling Plan: Do 2.1 or 2.2 in any order then do 2.3 in order. Verify tanker is empty Plan: Do in order. Open test valve Test for Cl2 Close test valve Check weight of tanker Enter tanker target weight Prepare fill line Vent and purge line Ensure main Cl2 valve closed Connect main Cl2 fill line Initiate and monitor tanker filling operation Initiate filling operation Open supply line valves Ensure tanker is filling with chlorine Monitor tanker filling operation Plan: Do 3.2.1, do every 20 minutes. 
On initial weight alarm, do and On final weight alarm, do and Remain within earshot while tanker is filling Check road tanker Attend tanker during last 2-3 ton filling Cancel initial weight alarm and remain at controls Cancel final weight alarm Close supply valve A when target weight reached Error Type: Check Omitted Error Description: Tanker not monitored while filling Consequence: Leaks not detected early Guidelines for Preventing Human Error in Process Safety, Center for Chemical Process Safety of the American Institute of Chemical Engineers

73 that Influence Performance Successful Performance
Example
Columns: A. Factors that Influence Performance; B. Failed Performance; C. Past Successful Performance; D. Difference or Change; E. Contributing Factor? (Yes/No)
A. When
B. Job Performer came in early to avoid the heat.
C. Job Performer started day the same time as co-workers.
D. No co-workers were available to help with the job.
E. Yes. Worker came to work early, so was working alone, carrying tools.
A. Supervision
B. Employee did not meet with supervisor the morning of the accident.
C. Employee met with supervisor to discuss the day’s work activities.
D. Work activities were not discussed.
E. Yes. Because worker came to work early, job hazards were not discussed.
Instructions for Use of Change Analysis Form
Consider the current problem situation and list factors that influenced performance, equipment, or the process. Record all facts concerning the incident with the undesirable consequences. (Write questions for interviews to help you identify changes.) Describe the way the task was performed, or the equipment or process functioned, during the incident. Compare the incident with undesirable consequences to the reference event. Describe the "old" way the task was performed, or the equipment or process functioned, when performance was successful. Consider a comparable reference event that did not have undesirable consequences. Document any Changes or Differences. Establish all known differences, whether they appear relevant or not. Answer the Question: Is This a Causal Factor? (Yes or No). Analyze all the differences for their effects in producing the undesirable result. Be sure to include the obscure and indirect effects.
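The Change Analysis form can be held as one record per factor, which makes it easy to list the differences that still need a Yes/No judgment. This is a sketch only: `ChangeRow`, `differences`, and `awaiting_judgment` are hypothetical names, and the comparison is a plain text inequality, so the analyst still decides what counts as a real change.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChangeRow:
    factor: str                          # column A
    failed: str                          # column B
    successful: str                      # column C
    contributing: Optional[bool] = None  # column E; None until the analyst decides


def differences(rows: List[ChangeRow]) -> List[ChangeRow]:
    """Rows where failed and past successful performance differ (column D)."""
    return [r for r in rows if r.failed.strip() != r.successful.strip()]


def awaiting_judgment(rows: List[ChangeRow]) -> List[str]:
    """Factors with a difference but no Yes/No answer yet."""
    return [r.factor for r in differences(rows) if r.contributing is None]
```

Every difference is kept, relevant-looking or not, which matches the instruction to establish all known differences before judging them.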

74 Events & Causal Factors Chart after Change & Barrier Analysis

75 Problem Correction Flowchart

76 Effectiveness Review Detailed Flow
Strategy

The actions to prevent recurrence will be evaluated individually and collectively.

There are two approaches to determine whether an individual corrective action has been effective: demonstrate that the corrective action has been adequately challenged and has proven its effectiveness, or research the Problem Report database to show that there have been no additional failures or events over a long enough period to demonstrate effectiveness. After each corrective action has been evaluated individually, evaluate the broader scope of the actions to prevent recurrence to determine whether the actions were collectively effective in correcting the root cause.

The determination of the effectiveness of an individual action to prevent recurrence may not be possible if insufficient time has elapsed since the completion of the action or if the action has not been challenged. In this case, the Effectiveness Review (EFR) assignment is indeterminate and should be rescheduled for a later date. Additional actions to prevent recurrence are not normally needed for indeterminate EFRs.

The collective effectiveness evaluation is not dependent on the effectiveness of each of the actions to prevent recurrence. For example, individual actions to prevent recurrence may be ineffective or indeterminate, but collectively the actions to prevent recurrence may have effectively resolved the original problem. Conversely, even if all the actions to prevent recurrence have been individually effective, the original problem may not have been adequately resolved, and the collective evaluation may find the actions ineffective. If the collective assessment of actions to prevent recurrence determines that the root cause has been corrected, then the EFR assignment can be completed and closed. However, if only one action to prevent recurrence is identified and its effectiveness cannot be determined, then the collective EFR cannot be considered effective.

The AT assignment due dates for EFR assignments are set based on the anticipated completion dates of the actions to prevent recurrence. If the due dates of the actions to prevent recurrence are extended with proper management approval, then the due date of the EFR assignment will need to be similarly extended. When requested, the CAP department will extend the EFR assignments without charging the extension to the department.

It is expected that all actions to prevent recurrence will be implemented. If, in the process of performing the effectiveness review, the investigator determines that the AT assignment for the action to prevent recurrence has been closed without implementation (as opposed to ineffective implementation of the corrective action), then a Problem Report should be initiated to document this condition and identify why the action to prevent recurrence was not implemented.
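The individual and collective evaluation described above can be sketched as a small decision procedure. This is an illustrative Python sketch under stated assumptions, not the presenter's procedure: the names Status, ActionReview, review_action, and efr_disposition are hypothetical, treating a repeat event as "ineffective" is an assumption, and root_cause_corrected stands for the investigator's collective judgment, which the notes say does not depend on the individual results.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Status(Enum):
    EFFECTIVE = "effective"
    INEFFECTIVE = "ineffective"
    INDETERMINATE = "indeterminate"


@dataclass
class ActionReview:
    """Evidence gathered for one action to prevent recurrence."""
    challenged_and_held: bool  # adequately challenged and proved effective
    quiet_long_enough: bool    # no repeat failures or events over a long enough period
    repeat_event_found: bool   # assumption: a repeat event marks the action ineffective


def review_action(a: ActionReview) -> Status:
    """Individual evaluation using the two approaches in the notes."""
    if a.repeat_event_found:
        return Status.INEFFECTIVE
    if a.challenged_and_held or a.quiet_long_enough:
        return Status.EFFECTIVE
    # Not yet challenged and not enough time elapsed: cannot be determined.
    return Status.INDETERMINATE


def efr_disposition(statuses: List[Status], root_cause_corrected: bool) -> str:
    """Collective evaluation: returns 'close', 'reschedule', or 'ineffective'."""
    # A single action whose effectiveness cannot be determined cannot
    # support an effective collective result, so the EFR is rescheduled.
    if len(statuses) == 1 and statuses[0] is Status.INDETERMINATE:
        return "reschedule"
    # Otherwise the collective judgment governs, whatever the individual results.
    if root_cause_corrected:
        return "close"
    return "ineffective"


mixed = [Status.EFFECTIVE, Status.INDETERMINATE, Status.INEFFECTIVE]
print(efr_disposition(mixed, root_cause_corrected=True))  # close
```

The sketch keeps the collective result separate from the individual statuses on purpose: a mixed set of individual results can still close the EFR, and a fully effective set can still be judged ineffective collectively.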

