Value of Trending: Application Exercise. Bobbi Welch, Regulatory Compliance Advisor. WECC Human Performance Conference, October 2018.
What are we seeking to measure? NERC compliance violations; OSHA recordables; relay setting errors / misoperations; switching errors; other?
What are the characteristics of what we're measuring? Frequency of events: high or low? Static or dynamic conditions (corporate culture; availability/expertise of resources; weather)? Consequence/impact of the event? Is more ever a good thing?
High Impact Low Frequency (HILF) Events: what we're trying to avoid. By their name, they're infrequent. To be statistically valid, what is required? ≥ 60 data points. What if I don't have 60 data points? Is this a problem? No, although some statistical rules will not apply. How can I increase the pool of data available for trending?
A way to increase the data available to identify trends. Iceberg Theory: Compliance Violations; Near Misses / Good Catches. How effectively do our RSAWs tell the story? How well do we implement new standards? How well do our controls mitigate risk? How quickly do we identify "near misses" / violations, and who identifies them? One NATF company's experience: near miss / good catch reporting up by 435%; events down by 30%.
Low Impact High Frequency Events. Goal: 20-30 events over a 2-3 year period. Processes undergoing change are not "normal". What does this mean? Some more statistical rules will not apply, e.g. the bell curve. Use tools that are applicable.
Event Cause Analysis – Keep it Simple
Activity 1: NERC Standard Pareto Chart
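For readers working through the activity away from the slides, a Pareto chart is a tally sorted from the largest category to the smallest, with a running cumulative percentage. The following is a minimal sketch, not the presenter's worksheet: the function name `pareto_table` is hypothetical and the event log is invented for illustration.

```python
from collections import Counter

def pareto_table(categories):
    """Return (category, count, cumulative_percent) rows, largest count first."""
    counts = Counter(categories)
    total = sum(counts.values())
    rows = []
    running = 0
    # Sort by descending count, then by name so ties come out in a stable order.
    for category, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        running += count
        rows.append((category, count, round(100.0 * running / total, 1)))
    return rows

# Hypothetical event log: one standard family per event (illustrative only,
# not the events used in the activity).
events = ["PRC", "CIP", "PRC", "FAC", "PRC", "CIP", "TOP", "PRC", "CIP", "FAC"]

for category, count, cum_pct in pareto_table(events):
    # Text bar chart: one '#' per event, followed by the cumulative percentage.
    print(f"{category:<4} {count:>2} {'#' * count:<5} {cum_pct:>5}%")
```

The same function works unchanged for Activity 2 if each event is labeled with a cause code (A1 to A7) instead of a standard.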
Answer Key #1: Events by Standard [chart; event numbers as shown: 2, 4, 7, 8, 11, 3, 5, 6, 9, 10]
NERC Cause Code Assignment Process: A1 – Design / Engineering; A2 – Equipment / Material; A3 – Human Performance; A4 – Management / Organization; A5 – Communication; A6 – Training; A7 – Other. https://www.nerc.com/pa/rrm/ea/Pages/EA-Program.aspx
NERC Cause Code (Excerpt)
Activity 2: Cause Code Pareto Chart
Answer Key #2: Events by Cause Code [chart; event numbers as shown: 3, 6, 9, 1, 5, 11, 7, 10, 4, 8]
Let's Compare the 2 Activities. Which exercise was easier? Why? Were there any events where opinions differed? If so, what were they? How could this be resolved? What other ways could we trend the data?
Examples of Higher Risk Areas (Where to Anticipate Problems): change management; cross-functional processes; contractor oversight; construction (specific to NERC standards).
Project Cycle Flow Chart: Planning; System Protection; Construction; Maintenance; Operations; Commissioning. Blue indicates functions governed by NERC standards.
Develop programs to address trends. Goal: focus resources here. [Timeline figure: Prevent, Detect, Correct.] Similar concepts applied to compliance concerns: Event Occurs; Mitigation Plan; Return to Normal Operations.
After some run time, perform an effectiveness review. Timing is based on process frequency, typically 6 months to a year. Have the improvements performed as anticipated? If not, why (e.g. wrong cause code)? Consider additional corrective actions.
Performance Run Chart
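A run chart plots a measure over time against a median centerline, which suits the small, non-normal data sets discussed earlier because it does not rely on the bell curve. The sketch below is an illustration under stated assumptions, not the chart from the presentation: the function name `longest_runs` is hypothetical, the monthly counts are invented, and the shift threshold of 6 consecutive points on one side of the median is one commonly used convention (conventions vary).

```python
from statistics import median

def longest_runs(values):
    """Return (centerline, longest run above it, longest run below it)."""
    center = median(values)
    longest = {"above": 0, "below": 0}
    side, length = None, 0
    for v in values:
        if v == center:
            continue  # points on the median neither extend nor break a run
        current = "above" if v > center else "below"
        length = length + 1 if current == side else 1
        side = current
        longest[side] = max(longest[side], length)
    return center, longest["above"], longest["below"]

# Hypothetical monthly event counts (illustrative only, not presentation data).
monthly_events = [7, 8, 6, 9, 7, 8, 5, 4, 3, 4, 2, 3]
center, above, below = longest_runs(monthly_events)

SHIFT_LENGTH = 6  # assumed threshold for calling a run a "shift"
print(f"median={center}, longest run above={above}, longest run below={below}")
print("sustained improvement signal:", below >= SHIFT_LENGTH)
```

For an event count where lower is better, a long run below the median after a corrective action is the kind of evidence an effectiveness review would look for.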
Does this stuff really work? Real program results: unplanned outages reduced by 45% over a 5-year period; unplanned insulator failures reduced by 80% over a 5-year period; human performance errors leading to unplanned outages reduced by 60% over a 5-year period. See any pattern?
Supporting Tools: APS uses DevonWay corporate-wide (safety, compliance, environmental, etc.) to capture cause analysis and cause codes, monitor implementation of corrective actions, and trigger and capture effectiveness reviews.