Human Reliability Assessment


1 Human Reliability Assessment

2 CHERNOBYL * Red indicates the radiation cloud on April 27th, 1986 – one day after the Chernobyl accident

3 Human Reliability Definition:
“The probability that a person will correctly perform some system-required activity during a given time period (assuming time is a limiting factor) without performing any extraneous activity that can degrade the system” (Hollnagel, 2002)

4 Human Reliability Assessment (HRA)
Assessment of the impact of human errors on system safety and, where warranted, the specification of ways to reduce the impact and/or frequency of human error.
HRA is far from a precise science, but it is a useful means of identifying and prioritizing safety vulnerabilities due to human error, thereby reducing the frequency of accidents.
Hybrid area:
- Engineering & reliability
- Psychology
- Ergonomics

5 Probabilistic Safety Analysis (PSA)
Engineering approach
Quantitative: estimates the expected frequencies of accidents, which are then compared against predefined risk criteria.
HRA must be incorporated into PSA if risk is to be properly estimated…hence the need for the theoretical framework (psychology and ergonomics)

6 HRA History Started early 1960s…expanded greatly since 1979. Why?
Initially followed the exact same procedure as conventional reliability analysis  human tasks substituted for equipment failures
But human performance shows greater variability and interdependence (the ‘human factor’)
How can we get this ‘variability’ and ‘interdependence’ information?
1979  Three Mile Island
1986  Chernobyl

7 Understanding HRA Accident sequence analyzed represented as an event tree (slide on next page). Nodes represent specific tasks/functions with 2 outcomes (success/failure) Engineering approaches (PRA/PSA) can calculate failure probabilities in terms of material & process information, but HRA must also account for the “human factor” to determine if the human AS A COMPONENT will fail. -A node can either represent the function of a technical system or component, or the interaction between an operator and the process. -For example, if the analysis considers the sequence of events that are part of landing an aircraft, the event "timely extraction of flaps", which is an action that must be taken by the pilot, is represented by a node. -From the perspective of the PRA/PSA there is a need to know whether it is likely that an event will succeed or fail, and further to determine the probability of failure in order to calculate the combined probability that a specific outcome or end state will occur. -If the node represents the interaction between an operator and the process, engineering knowledge must be supplemented by a way of calculating the probability that the human, as a "component", will fail.

8 Event tree structure (Hollnagel, 2002) Event/Fault tree system
Several pathways with various outcomes depending on how system tasks proceed: SUCCEED/FAIL
NOTE the path that has a recovery step after failing the first time
NOTE the path with two fails in a row, indicating that no recovery step was available
Recall an example from Dekker re: landing gear and the signal going on once the landing gear has dropped…the pilot has established his own routine based on the timing of events (PERSONAL EXPERIENCE) so that he/she can perform tasks out of order and still be safe
Recall from Dekker  many pilots retire once a new operations manual comes out, since they are not willing to put in the time and effort to modify already cognitively established routines for the sake of ‘supposedly’ safer procedures
Recall from last week the WWI & modern-age (i.e. Boeing) pilot  does automation of systems increase or decrease the potential for failure in such an analysis? Why? (Hollnagel, 2002)
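The succeed/fail branching above can be sketched in a few lines. This is a minimal illustration (not from Hollnagel), assuming, as basic event-tree quantification does, that node outcomes are independent so path probabilities multiply; the numbers are invented.

```python
# Hypothetical sketch: probability of one path through a binary event tree.
# Assumes independent node outcomes, as in basic event-tree quantification.

def path_probability(failure_probs, path):
    """failure_probs: per-node probability of failure (0..1).
    path: per-node outcome along the branch, 'success' or 'failure'."""
    p = 1.0
    for p_fail, outcome in zip(failure_probs, path):
        # multiply by the failure probability, or its complement on success
        p *= p_fail if outcome == "failure" else (1.0 - p_fail)
    return p

# Two-node sequence: fail the first task, then succeed at the recovery step.
p_recovered = path_probability([0.01, 0.1], ["failure", "success"])
```

Note how the "fail, then recover" path above mirrors the recovery branch the slide points out: its probability is the product of the initial failure probability and the recovery success probability.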

9 Understanding HRA…cont (2)
Traditional approach:
Determine HEP (human error probability) using tables (from collected data), HR models, or expert judgement
Account for the influence of Performance Shaping Factors (PSFs)
Calculate the probability of an erroneous action, where:
Pea = probability of the erroneous action
PSFs = task characteristics, aspects of the physical environment, work time characteristics, etc.
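As a hedged illustration of the traditional approach (the slide's own formula is not reproduced here), the sketch below adds PSF contributions onto a base HEP, following the additive-context assumption that the next slide goes on to criticize. All numbers are invented, not taken from any HEP table.

```python
# Illustrative only: the additive PSF adjustment the next slide critiques.
# Base HEP and PSF increments below are made-up, not from collected data.

def erroneous_action_prob(base_hep, psf_effects):
    """base_hep: context-free failure probability for this action type.
    psf_effects: additive contributions of each Performance Shaping
    Factor (e.g. noise, time pressure, interface quality)."""
    p = base_hep + sum(psf_effects)
    return min(max(p, 0.0), 1.0)  # clamp so the result stays a probability

pea = erroneous_action_prob(0.001, [0.002, 0.005])
```

The clamp matters: under a purely additive model, enough adverse PSFs would push the "probability" past 1, which is one symptom of treating context effects as independent add-ons.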

10 Understanding HRA…cont (3)
Using this formula we must make the following assumptions:
1.) The probability of failure can be determined for specific types of actions independently of context.
2.) The effects of context are additive (various performance conditions don’t influence each other).
Problems: Does not account for psychological theory!! Previous experience & beliefs (internal events) about the situation (external events) will influence the actions performed!!!
This means that human performance takes place in a context consisting of both the actual working conditions and the operator’s perception or understanding of them. Beliefs may be shaped – and shared – by the group.
**Recall from Dekker  the tunnel example – look at it from the operator’s perspective at the time of the incident, not just after the fact (retrospectively).
AS A RESULT  context is important because human action is always embedded in a context. The preferred mode of representation used by the analyses – the event tree or operator action tree – is prone to be misleading, since it represents actions without a context.

11 Understanding HRA…cont (4)
As a result, several models have been used to improve HRA:
1.) Behavioral (Human Factors) models
Focus on simple manifestations (error modes)
Described in terms of omissions, commissions, extraneous actions
Derive the probability that specific errors will occur
Problems:
Causal models very simple
Weak in accounting for context
Errors mainly viewed as probabilistic events (as things often are in life)
- More of a quantitative approach

12 Understanding HRA…cont (5)
2.) Information processing models
Focus on internal mechanisms, i.e. decision making, reasoning, etc.
Explain the flow of causes and effects through models
Problems:
Models often complex
Limited predictive power (hypothetical basis)
Little concern for quantification
Context not considered explicitly
Better suited for retrospective analysis than prediction
- More of a qualitative approach

13 Understanding HRA…cont (6)
3.) Cognitive models
Focus on the relation between error modes and causes
Models are (relatively) simple and context specific
Premise: Cognition is the reason why performance is efficient (or limited)
Operator seen as acting in anticipation of future events
Well suited for both prediction and retrospective analysis
The problem of cognition can be illustrated by the popular notion of "cognitive error". The problem facing HRA analysts was that human cognition undoubtedly affected human action, but that it did not fit easily into any of the established classification schemes or the information processing models.
Fast solution  simply declare that any failure of a human activity represented in an event tree, such as diagnosis, was a "cognitive error". However, that did not solve the problem of where to put the category in the existing schemes.
Genuine solution  realize that human cognition is a cause rather than an effect, and that "cognitive error" is therefore not a new category or error mode. Instead, all actions – whether erroneous or not – are determined by cognition, and the trusted categories of "error of omission" and "error of commission" are therefore cognition-based. The cognitive viewpoint also implies that unwanted consequences are due to a mismatch between cognition and context, rather than to specific "cognitive error mechanisms".

14 HRA Framework 10 steps with 3 MAIN GOALS:
1.) Human error identification (What can go wrong?)
2.) Human error quantification (How often will a human error occur?)
3.) Human error reduction (How can human error be prevented from recurring, or its impact on the system be reduced?)
Most research is on #2, but #1 is probably most important, b/c if errors are not identified, the analysis will greatly underestimate the effects of human error.
NOTE >> Task analysis usually precedes Human Error Identification (HEI)

15 HRA Generic Methodology
Systematic way of approaching HRA logically Will help ensure the problem is dealt with reliably while minimizing error (biases) Encompasses 10 steps from identifying the problem to final documentation

16 Steps in the HRA process:

17 HRA Steps
1.) Defining the problem
Define precisely the problem and its setting in terms of the system goals and the overall forms of human-caused deviations from these goals.
2.) Task analysis
Define explicitly the data, equipment, behaviour, plans and interfaces used by the operators to achieve system objectives, and identify the factors affecting human performance within tasks.
3.) Human error analysis
Identify all significant human errors affecting performance of the system, and find ways in which human errors can be recovered.

18 HRA Steps cont… 4.) Representation Model human errors and recovery paths in a logical manner for quantitative measurement (integrate human errors with hardware failures) 5.) Screening Define the level of detail and effort with which the quantification will be conducted by defining all significant human errors and interactions and ruling out insignificant errors 6.) Quantification Quantify human error probabilities and human error recovery probabilities (to determine likelihood of success in achieving system goals)

19 HRA Steps cont… 7.) Impact assessment
Determine the significance of human reliability in achieving system goals, decide whether improvements in human reliability are required, and (if so) identify the primary errors/factors negatively affecting the system.
8.) Error reduction
Identify error reduction mechanisms and the likelihood of error recovery, improving human performance in achieving system goals.
9.) Quality assurance
Ensure the enhanced system satisfactorily meets system performance criteria NOW and in the future.

20 HRA Steps cont… 10.) Documentation
Detail all information necessary to allow the assessment to be understandable, auditable, and reproducible.

21 1. Problem Definition 2 parts of defining a problem:
Identify the HR problem
Identify the HR context
Once the HR problem is identified and defined within the system context, discussions with designers, engineers & operational managers should occur
Define system goals at the various levels where operator action is required – this identifies the higher goals the operator was aiming for, and can get at the operator’s intentions at the time of the event and the root of the problem
What is the human reliability problem, and how does this specific problem relate to the context of the larger system?

22 Problem Definition cont. (2)
Must investigate the “safety culture” of a plant – this can dramatically influence HR and is important to defining the problem
If HRA is being carried out as part of an overall risk assessment, the HR analyst will probably be given a set of scenarios to which to assign risk and HE.
By the end of the process the problem should be explicitly defined in its respective system context:
Scenarios to be addressed
Overall tasks required to achieve safety goals within each scenario
Criteria should be in place for knowing when to drop production goals in order to keep safety goals/standards adequate – the tension is often inherent in the system and may pose a problem – management goals/production output/$

23 2. Task Analysis Purpose:
Provide a complete, comprehensive description of the tasks that have to be performed by the operator to achieve system goals
- Many forms of TA; some notable ones include:
Sequential – chronological order of events
Hierarchical – considers tasks in a hierarchy (importance)
Tabular – dynamic situations (operator’s actions during a power plant emergency)
Task analysis is essential to a complete and detailed HRA. Detailed operator actions provide good information for identifying errors in the next phase.

24 Task Analysis cont. (2) Methods of deriving info from task analysis:
Interaction with all levels (operators, maintenance, supervisors, managers, system designers, etc.)
Observation
Structured & unstructured interviews
Procedure analysis
Incident analysis
Walkthrough/explanation of procedures by operator(s)
Examination of system documentation

25 Task Analysis cont. (3) Important not to rely completely on procedures/operating instructions – practical, real-life procedures often differ. Operator/employee knowledge (tacit knowledge) gained through experience is vital in the TA process. Why?

26 3. Human Error Analysis (HEA)
Stage to identify all errors associated with the task!! The most critical part of HRA. WHY?... If significant errors are omitted, they will not appear in the analysis, which may seriously UNDERESTIMATE the EFFECTS of human error on the system

27 3. HEA cont…(2)
Method example #1
Simplest approach to HEA…consider ‘external error modes’
1.) Error of omission:
Act omitted (not carried out)
2.) Error of commission:
Act carried out inadequately
Act carried out in wrong sequence
Act carried out too early/late
Error of quality (too little/too much)
3.) Extraneous error:
Wrong (unrequired) act performed

28 3. HEA cont… Once the factors which influence HRA are identified, the next step is to represent them in a way that indicates their effects on the system goals…

29 4. Representation Visual representation of events/actions in a scenario
Can be used to represent simple or complex failure paths
Skill & proper knowledge of “tree” construction is needed – trees can become extremely complex and lose focus if not carefully put together
A smaller scenario with a low number of errors may not need or benefit from this type of representation

30 4. Representation cont…(2)
Fault Tree
Typical representation of HE & its effect on a system is a fault tree
Logical structure that defines which events must occur in order for an undesirable outcome to arise
Undesirable event located at the top of the “tree” (most important)
2 different types of gate allow events to proceed to the next level:
“OR” gate – the output event occurs if any of the events joined below it occur
“AND” gate – the output event occurs only if all events joined below it occur
Trees may be either qualitative or quantitative in nature, depending on the situation/needs of the analyst
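The two gate types above have direct quantitative forms. A minimal sketch, assuming independent basic events (the usual simplification in quantitative fault trees); the event names and probabilities are invented for illustration:

```python
from math import prod

# OR gate: the output event occurs if ANY input event occurs.
def or_gate(probs):
    return 1.0 - prod(1.0 - p for p in probs)

# AND gate: the output event occurs only if ALL input events occur.
def and_gate(probs):
    return prod(probs)

# Hypothetical top event:
# (operator misses alarm OR display fails) AND backup check is skipped
p_top = and_gate([or_gate([0.01, 0.001]), 0.05])
```

Evaluating bottom-up like this is what fault-tree packages automate; the top-event probability falls out of composing the gate formulas.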

31 Fault Tree A supposedly ‘simple’ fault tree
Looks complicated, but considering that operators have well-established knowledge of tasks/procedures, a fault tree like this becomes much less difficult to interpret/understand. Probabilities are not necessary, BUT a quantitative representation is usually included.

32 5. Screening Identifies where the major effort in the quantification analysis should be applied. Filters out tasks/scenarios which may contribute little to system failure
Screening methods risk eliminating important errors and interactions from the analysis
As a general rule, when applying any screening technique – if in doubt, leave the human error in the fault/event tree

33 6. Quantification Human reliability needs to be quantified into something that can be compared across the HRA spectrum
The metric for HRA quantification is Human Error Probability (HEP):
HEP = (# of errors occurred)/(# of opportunities for error to occur)
Expressed as a number between 0 and 1.
Little recorded industrial HEP data is available because:
Difficulty in estimating opportunities for error in realistic complex tasks (the denominator problem)
Confidentiality and unwillingness to publish poor performance data
Lack of awareness regarding the usefulness of human error data (hence no fiscal incentives)
Very few datasets with recorded HEPs
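The HEP ratio above transcribes directly into code. A trivial sketch; the guard against a zero denominator reflects the "denominator problem" the slide mentions, and the example counts are invented:

```python
# Direct transcription of the slide's definition:
# HEP = (number of errors observed) / (number of opportunities for error)

def human_error_probability(errors, opportunities):
    if opportunities <= 0:
        # the "denominator problem": without a credible count of
        # opportunities, no HEP can be computed at all
        raise ValueError("need at least one opportunity for error")
    return errors / opportunities

# e.g. 3 misread gauges in 1000 recorded readings (invented numbers)
hep = human_error_probability(3, 1000)
```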

34 There are lots of ways to determine HEP
For this course we keep it simple!

35 6. Quantification…cont.(2)
Problems with simulator data in determining HEPs:
Personnel using simulators are usually highly motivated and know what’s on the training curriculum
Reliability of emergency training/responses on a simulator compared to the real situation (the ‘cognitive fidelity issue’)
Experiments are also usually controlled, investigating only one or two variables  generalization risky!!!
Lack of ‘generalizability’ has led to non-data-dependent approaches, i.e. expert opinion/judgment
People don’t go to work to do a bad job!! (thanks Dekker)
Expert opinion is not a bad thing and has been used successfully in many other areas, and occasionally in HRA/PSA.

36 7. Impact Assessment System risk or reliability is calculated
Compared to established acceptable levels/standards
Each event is analyzed and classified in a fault tree
Both HE & hardware/software analyses are taken into account to find the best combination to improve the system
If HE dominates, error reduction methods must be investigated
If HE cannot be reduced to acceptable levels – redesign of the system is necessary
Fault Tree – classifies events in order of importance (top-down); most fault-tree computer packages will automatically determine the most important events
“engineering the problem out”

37 8. Error Reduction
Not required if:
Human reliability is adequate (what’s adequate?)
It’s not the most effective means of achieving system performance (other modifications more suitable)
Not within the scope of the assessment
If required, then:
Focus on reducing the impact/frequency of human errors
Implement a more general error reduction strategy (how?)

38 Steps in the HRA process:
Reiterate…

39 8. Error Reduction…cont (2)
Ways of reducing the impact of critical errors on the system:
Prevent errors through hardware/software changes
Increase system tolerance
Enhance error recovery
Reduce error at source
-- Prevent errors through hardware/software changes  use of interlock devices to prevent error, automate tasks, etc.
-- Increase system tolerance  make system hardware and software more flexible or self-correcting, allowing greater variability in operator inputs to achieve the intended goal
-- Enhance error recovery  enhance detection and correction of errors by means of increased feedback, checking procedures, supervision, automated performance monitoring, etc.
-- Reduce error at source  reduce errors via improved procedures, training, interface/equipment design, etc.
THEREFORE:
**Requires collaboration between ergonomist & human reliability analyst (expensive???)
**Error recovery steps are often the simplest to implement, but reducing error at source will be most effective at improving human reliability
**Quantification methods (i.e. HEART) prescribe error-reducing mechanisms for identified errors
**PSF approaches (i.e. SLIM, HEART, THERP) allow determining the contribution of each PSF (i.e. quality of procedures) to indicate the most effective area for focus.
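To make the PSF-contribution point concrete, here is a hedged sketch of a HEART-style adjustment: a nominal HEP for a generic task type is scaled by each Error Producing Condition (EPC), weighted by how much of the EPC's maximum effect the analyst judges to apply. The numbers below are invented for illustration, not taken from the published HEART tables.

```python
from math import prod

# HEART-style sketch (assumed form): nominal HEP scaled per EPC by
# (max_effect - 1) * assessed_proportion + 1. Inputs are illustrative.

def heart_hep(nominal_hep, epcs):
    """nominal_hep: generic-task base probability.
    epcs: list of (max_effect, assessed_proportion) pairs, where
    assessed_proportion in [0, 1] is the analyst's judgement of how
    fully that Error Producing Condition applies here."""
    factor = prod((max_effect - 1.0) * apoa + 1.0
                  for max_effect, apoa in epcs)
    return min(nominal_hep * factor, 1.0)  # cap at certainty

# e.g. a time-shortage EPC with max effect 11, judged 40% applicable
adjusted = heart_hep(0.003, [(11.0, 0.4)])
```

Comparing the factor each EPC contributes is what lets the analyst see which PSF (procedures, time pressure, interface) is the most effective place to intervene.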

40 8. Error Reduction…cont (3)
Additional considerations:
Positive error-reducing strategies should be factored back into the quantitative analysis
Check that the HEP(s) and the overall calculated system risk become acceptable
As part of the quality assurance phase:
Provide an ‘operational definition’ for each error reduction strategy
Ensure the strategy is properly implemented and maintained over time
What’s meant by ‘operational definition’?

41 9. Quality Assurance Effectiveness of error reduction mechanism implementation should be ensured by:
Monitoring
Performance verification (at a later stage)
Reliability/Validity analysis (can be hard…why?)
Continuous performance monitoring systems provide powerful quality assurance. Why? They catch:
Gradual performance standard degradation
Increased maintenance loading
Loss of personnel
Impromptu changes (since startup)
Increasing ‘retrofit’ changes
Hard because of CONTEXT!!!

42 9. Quality Assurance cont…(2)
Long-term performance monitoring allows for:
Identifying WHEN in time the results of an HRA may be outdated
Signifying the need for further (or new) evaluation
Justifying the acceptability of risk associated with the system
Avoiding gradual deterioration of safety barriers, i.e. the BHOPAL SYNDROME (India)

43 10. Documentation Formally document all results of study
Ensure auditability and justifiability of results
Can provide a database for future investigation and monitoring
Ensure assumptions and judgments are included!!
Aids new/unconnected personnel in understanding the assessment
Enables independent examination, updating, and reproduction
Allows for learning from mistakes

44 Future Directions
1.) Low technology risk
2.) Cognitive errors and misdiagnosis
3.) Management, organizational and sociotechnical contributions to risk
Future directions as noted by Kirwan in the mid-late 90s – how do they compare to today’s world?

45 Low Technology Risk HRA traditionally used for high risk, high technology industries HRA likes to focus on massive accidents that happen less frequently – large consequences Not applied to high risk, low technology sectors as much (ex. mining) – which has a larger number of “small” accidents Can have very valuable applications to low technology industries - Have we seen a change over the past decade towards HRA being used in low tech industries?

46 Cognitive Errors & Misdiagnosis
Operators may misdiagnose a situation, not realize the mistake, and continue to interpret the feedback incorrectly
Can make matters worse if the operator overrides a safety system than if nothing was done at all

47 Management, Organizational & Sociotechnical Contributions to Risk
HRA should be applied not only to operators/workers on the job site but also to management and the organizational design of the plant
Bhopal, the Challenger shuttle & Chernobyl all involved significant human error, BUT current HRA techniques would not have detected the risk prior to the accidents because the error was neither procedural nor diagnostic

48 Management, Organizational & Sociotechnical Contributions to Risk
Economics, time constraints, social pressures, communication breakdowns, personality conflicts, etc. all add pressures on a system and its safety

