BEAM INSTRUMENTATION GROUP DEPENDABILITY APPROACH CERN, Chamonix 26th January 2016 William Viganò

Slides:



Advertisements
Similar presentations
Integra Consult A/S Safety Assessment. Integra Consult A/S SAFETY ASSESSMENT Objective Objective –Demonstrate that an acceptable level of safety will.
Advertisements

Operation & Maintenance Engineering Detailed activity description
PowerCare Customer Support Agreement LP Products and Systems 2013 PM.
Reliability Risk Assessment
Overview Lesson 10,11 - Software Quality Assurance
Maintaining and Updating Windows Server 2008
Vegard Joa Moseng BI - BL Student meeting Reliability analysis summary for the BLEDP.
Technical review on UPS power distribution of the LHC Beam Dumping System (LBDS) Anastasia PATSOULI TE-ABT-EC Proposals for LBDS Powering Improvement 1.
Project Life Cycle Introduction and Overview © Ed Green Penn State University All Rights Reserved.
Unit 3a Industrial Control Systems
Release & Deployment ITIL Version 3
Module 8: Designing Active Directory Disaster Recovery in Windows Server 2008.
S/W Project Management
Maintaining a Microsoft SQL Server 2008 Database SQLServer-Training.com.
Designing a HEP Experiment Control System, Lessons to be Learned From 10 Years Evolution and Operation of the DELPHI Experiment. André Augustinus 8 February.
Project Tracking. Questions... Why should we track a project that is underway? What aspects of a project need tracking?
Protecting the Public, Astronauts and Pilots, the NASA Workforce, and High-Value Equipment and Property Mission Success Starts With Safety Believe it or.
Event Management & ITIL V3
B. Todd et al. 25 th August 2009 Observations Since v1.
BCT for protection LHC Machine Protection Review April 2005 David Belohrad, AB-BDI-PI
Aircraft Maintenance Human Factors 25th and 26th September 2000 The Hatton, London.
Lecture 14 Maintaining the System and Managing Software Change SFDV Principles of Information Systems.
Status Report – Injection Working Group Working group to find strategy for more efficient start-up of injectors and associated facilities after long stops.
Quality Assurance.
PLC Workshop at ITER, 4-5 th of December 2014 A. Nordt, ESS, Lund/Sweden.
Wojciech Sliwinski BE/CO for the RBAC team 25/04/2013.
Vegard Joa Moseng BI - BL Student meeting Reliability analysis of the Input Monitor in the BLEDP.
K.Hanke – PS/SPS Days – 19/01/06 K.Hanke - PS/SPS Days 19/01/06 Recommissioning Linac2/PSB/ISOLDE from CCC  remote operation from CCC  upgrades & changes.
Architectural issues M.Jonker. Things to do MD was a success. Basic architecture is satisfactory. This is not the end: Understanding of actual technical.
Reliability Applied to KM3NET
‘Review’ of the machine protection system in the SPS 1 J. Wenninger BE-OP SPS MPS - ATOP 09.
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 21 Slide 1 Software evolution.
Beam Interlock System MPP Internal ReviewB. Puccio17-18 th June 2010.
Proposal for a Global Network for Beam Instrumentation [BIGNET] BI Group Meeting – 08/06/2012 J-J Gras CERN-BE-BI.
CERN Availability Working Group & Accelerator Fault Tracker Availability Working Group & Accelerator Fault Tracker - Where do we.
Post Mortem Workshop Session 4 Data Providers, Volume and Type of Analysis Beam Instrumentation Stéphane Bart Pedersen January 2007.
ATC / ABOC 23 January 2008SESSION 6 / MTTR and Spare Parts AB / RF GROUP MTTR, SPARE PARTS AND STAND-BY POLICY FOR RF EQUIPMENTS C. Rossi on behalf of.
Agenda 1 BOSS Meeting - 27/5/ : :10 Introduction and aims of the workshop 10‘ 09: :25 Priorities from OP for PSB and transfer lines 15'
TE/TM 30 th March - 0v1 CERN MPP SMP 3v0 - Introduction 3 *fast *safe *reliable *available generates flags & values.
First discussion on MSS for Katrin March 26, 2013 M.Capeans CERN PH-DT.
Failure Modes, Effects and Criticality Analysis
MPE Workshop 14/12/2010 Post Mortem Project Status and Plans Arkadiusz Gorzawski (on behalf of the PMA team)
ASSEMBLY AND DISASSEMBLY: AN OVERVIEW AND FRAMEWORK FOR COOPERATION REQUIREMENT PLANNING WITH CONFLICT RESOLUTION in Journal of Intelligent and Robotic.
LIU-PSB Working Group meeting: 25/06/2015, Markus Zerlauth Consolidation of magnet interlocks in the PS complex – Warm magnet Interlock System (WIC) R.Mompo,
Design process of the Interlock Systems Patrice Nouvel - CERN / Institut National Polytechnique de Toulouse CLIC Workshop Accelerator / Parameters.
Computer Security: Principles and Practice First Edition by William Stallings and Lawrie Brown Lecture slides by Lawrie Brown Chapter 17 – IT Security.
 TE-MPE-PE Clean code workshop – R.Heil, M.Koza, K.Krol Introduction to the MPE software process Raphaela Heil TE-MPE-PE Clean code workshop - 9 th July.
R2E/Availability Workshop Report - RadWG October 22 nd 2014 R2E/Availability Workshop 2014 October th 2014 R2E/Availability Workshop RadWG - Brief.
Reliability and Performance of the SNS Machine Protection System Doug Curry 2013.
Roadmap to Greener Computing Henri Mikkola. Eco-Friendly Product Lifecycle Today through globalization products travel long distances before reaching.
TOPIC: THE VALUE OF BIOMEDICAL ENGINEERS AND TECHNICIANS IN HOSPITAL PRESANTED BY: KWIZERA ABRAHAM STUDENT IN: BMET /FINALIST STU INSTITUTION: IPRC KIGALI.
A monitoring system for the beam-based feedbacks in the LHC
OH&S Plant Obligations make
2007 IEEE Nuclear Science Symposium (NSS)
Updated with comments 22 may 2017 Updated again 29 June 2017
Dependability Requirements of the LBDS and their Design Implications
Isograph Packages Apollonio, 7/7/2016
LV Safe Powering from UPS to Clients
1v0.
the CERN Electrical network protection system
STPA FOR LINAC4 AVAILABILITY REQUIREMENTS
Hot Checkout System for LERF Resuscitation
Disabling Rules.
Interlocking of CNGS (and other high intensity beams) at the SPS
450 GeV Initial Commissioning with Pilot Beam - Beam Instrumentation
LHC dry-runs (BE-BI view)
Chapter 8 Software Evolution.
1v1.
PSS verification and validation
Operation of Target Safety System (TSS)
Presentation transcript:

BEAM INSTRUMENTATION GROUP DEPENDABILITY APPROACH CERN, Chamonix 26th January 2016 William Viganò

Beam Instrumentation equipment William Viganò o There is a huge difference between a Safety Critical System and an “Observation System”, The difference starts with the architectural design, followed by an entirely different implementation procedure. Safety Critical System Observation System o The Beam Instrumentation Group is in charge of equipment installed in all CERN accelerators, some are solely for observation but others function as protection system.

Critical systems involved in Dependability William Viganò Generate interventions or preventive maintenance that reduce beam availability. o Dependability Measure of a system's availability, reliability & maintainability Can be responsible for blind failures.  Where the machine should be protected but the system is not operational and / or not monitored). Generate false dumps. o The criteria used in this presentation for selecting beam instrumentation considered “critical” are LHC systems that:

List of BI Critical systems William Viganò o Beam Loss Monitor System [BLM] o SW Interlocked Beam Positioning Monitors for orbit observation [Standard BPM] o HW Interlocked Beam Positioning Monitors for the dump line in LSS6 [Interlocked BPM] o Synchrotron Light Abort Gap Monitor [BSRA] o DC Beam Current Transformer [BCTDC] o Fast Beam Current Change monitor linked to the Fast Beam Current Transformer [BCTF] o Tune measurement system [BBQ] o Orbit and Tune feedback system [OFC/OFSU]

Failures in 2015 William Viganò o The failure analysis presented on the next few slides is based on data from the Accelerator Fault Tracking [AFT] Tool which gives the overview from the Operation point of view. o From the Expert point of view, the number of faults is larger than the AFT result, but invisible to Operations due to factors such as: redundancy, recovery strategies, etc… These are difficult to analyse as they are catalogued by each system using their preferred fault tracking method. o More than 70 faults of Beam Instrumentation have been analyzed.

Failures Summary William Viganò 16 Channels 4000 Channels 2160 Channels 4 Channels 2 Channels Hardware connected to the Beam Interlock System Software connected to the Beam Interlock System

Fault categorization William Viganò o “External fault”: represents external interference leading to a failure, such as the surge current discharge that damaged the BLM system. o “System Fault”: Fault of a Hardware/Software nature. o “Design and installation issues”: mainly due to missing test-benches or additional functionality added to systems after design completion. o “Human factor”: due to wrong operation, lack of communication / understanding of the issue or modifications by untrained people. Faults have been categorized into 4 sub-categories:

BI Fault types Summary William Viganò

BI Intervention time Summary William Viganò

Lesson learned in 2015 William Viganò  Improving the beam availability means for BI working on several aspects.  We need to better understand the nature of the faults for optimizing their mitigation. Failure TypeNumber of failures [#] Total stop time [h] Contribution Percentage External Fault % System Fault % Design Issue % Human factor % o Total stop time in 2015 => hours (<2%) o No R2E failures thanks to heavy testing of BLM & BPM systems at design stage.

WHAT HAS BEEN DONE TO FORESEE AND REDUCE FAILURES William Viganò

Diagnostics: Online monitoring William Viganò Colors allow a fast understanding of the system status. Many parameters can be visualized. o BPM calibration screenshot for the SPS-LHC transfer lines. o These tools do not prevent faults but they help save time understanding issues.

Diagnostics: Offline monitoring William Viganò o Several offline monitoring functions added during o Daily checks introduced with the aim of detecting parameter degradation. Daily report on number of errors per Optical link Daily Temperature monitoring for all crates in all Points

Diagnostics: New event warning William Viganò o BLM Threshold changes – automatic alert to experts in charge of Piquet service. o Most of the diagnostic tools have been created or grown with the project development. Now systems are installed but many tools are needed for understanding issues.  With the growing of the system complexity we need a simplification of software used for diagnostic purpose.

Test-benches William Viganò o Testing BI Software (e.g. Orbit Feedback software test-bench)

Intervention time reduction William Viganò o Preparation of a Piquet Service Manual Learning from past failures. Reduction in time taken to resolve issues. o Creation of a Piquet service for BLM.

WHAT BEAM INSTRUMENTATION NEEDS FOR FURTHER IMPROVING ITS DEPENDABILITY William Viganò

Reliability Analysis William Viganò o Simple example: A brushed fan has a lifetime of 7 years. We have many fans installed everywhere. What happens in case of multiple failures? Do we have enough spares? What is the nature of the Failure D ? Failure A Failure B Failure C Solution A Solution B Solution C We are here o Apart from a few exceptions, such as the LHC and injector BLM systems, the normal way of proceeding is: “learn from the experience”. o This approach can work in general, but means that we are often working to solve issues rather than foresee them:

Common Analysis Method Required William Viganò o Need a common method for running Reliability Analysis for all future critical systems (not just BE-BI?) composed of 3 main points: Failure Prediction. Failure mode and criticality analysis using the Risk Priority Number method. Fault Tree analysis. (e.g. Using Military handbook as reference standards.)

Maintenance plan William Viganò Type of intervention For projects with Prediction and Failure mode analysis done, a maintenance plan is automatically obtained as one of the results.

Connecting systems to the Interlock William Viganò o We have cases of equipment originally designed for observation or measurements only which are now being connected to the Beam Interlock System.  BI propose that the decision to connect new systems to the Beam Interlock System should be limited to those that have undergone Reliability Analysis, and are compliant with BIS specifications. o Many systems have been designed, manufactured and installed following the measurement performance needs without performing a proper Reliability Analysis.

A structured approach to improve dependability William Viganò o Training of design engineers in best practice: CERN guidelines for component “derating rules”? What do we mean by failure modes? How to guarantee high reliability manufacturing, assembling, etc.  How do we verify this? What is the appropriate tool for fault tracking?  AFT, Jira, EAM, etc. ? o Identifying critical systems early in the design phase: Needs to be clearly included in specifications. Systems should be audited during conception phase. o Maintaining Reliability Analysis Expertise: Often done solely by Temporary Personnel.  Overhead for starting each analysis.  Subsequent loss of knowledge when they leave. BE-BI Group would profit from a longer term (centralised?) Reliability Analysis team.

Conclusions William Viganò o The reliability analysis of BLM system (over 4000 channels connected to BIS), has certainly contributed to its full reliability & low number of system faults. Currently addressing human factor & remaining design issues to further improve availability. o Reliability, Availability and Maintainability (R.A.M.) analysis should be performed on all BI systems affecting availability.  To do this BI will need more support and would greatly profit from a dedicated team for guaranteeing the feasibility of this important and complex task. o Need to extend reliability analysis to other systems with all engineers aware of the importance of such an approach at the design stage (Consolidation & HL-LHC).

William Viganò And many thanks to the AFT team and BI colleagues for their contribution to this presentation! Thanks for your attention!