TE-MPE-CP, RD, 06-Oct-2011. Radiation Induced Faults in QPS Systems during LHC run 2011. R. Denz, TE-MPE Technical Meeting, October 6th.

Presentation transcript:

Radiation Induced Faults in QPS Systems during LHC run 2011
R. Denz, TE-MPE Technical Meeting, October 6th

Outline
• Introduction
• Radiation induced fault statistics 2011
• Fault analysis, mitigation and consolidation measures
  – Measures taken during LHC run 2011
  – Proposals for Xmas break 2011/2012
  – Proposals for LS1
• Summary
This presentation contains 23 slides and some jokes.

Introduction
• Due to functional requirements, a significant amount of QPS and EE equipment is exposed to radiation during LHC operation
  – Radiation load depends on location and LHC exploitation
• QPS and EE equipment locations
  – LHC tunnel
      Main magnet protection, nQPS, some 13 kA EE systems (e.g. point 3)
      Effects seen during LHC runs 2010 and 2011
  – Partly shielded areas (RR13, 17, 53, 57, 73, 77, UJ14, 16, 56)
      IPQ, IPD, IT, 600 A protection, EE 600 A, EE 13 kA
      Effects seen during LHC run 2011
      Additional shielding for UJ14, UJ16 during Xmas break ;-))
  – Protected areas (UA23, 27, 43, 47, 63, 67, 83, 87, UJ33)
      IPQ, IPD, IT, 600 A protection, EE 600 A, EE 13 kA
      No confirmed radiation induced fault observed so far
Relocation during LS1

Fault statistics
• Fault analysis has to be done very carefully as not all problems are related to radiation
  – Equipment faults, EMC, “friendly fire”, bad connections, virtual equipment, circuit breakers, real triggers (very rare but not excluded)
  – In addition, there remain some doubtful cases where the exact cause of the trip cannot be determined
  – Enhanced diagnostic capabilities would be helpful
      E.g. diagnostics for power abort loops linking PC, PIC and QPS
• Confirmed radiation induced faults are transmitted regularly to the R2E project to be included in their statistics
  – Radiation to electronics related problems are discussed as well in the RADWG
  – Technical notes are compiled for selected events

Radiation induced fault statistics 2011

Radiation induced fault statistics 2011 – spurious triggers

System | Locations
DQQDI (IPQ, IPD, IT) | UJ14, UJ16 (2x), RR53
DQQDG (600 A) | UJ14 (2x), UJ16 (2x), RR17, RR73, RR77 (3x)
nQPS (splice protection) | B8L1, B11L5, B9L8

Flat top?

Radiation induced fault statistics 2011 – spurious triggers

Detection system type | Exposed systems | Radiation induced spurious triggers
DQQDL (MB & MQ protection, analog, radiation tolerant) | |
DQQDS (MB & MQ protection, digital, radiation tolerant) | |
DQQDG (600 A, digital, partly hardened) | 250 out of 836 | 9 (3.6 %)
DQQDI,T (IPQ, IPD, IT, digital, partly hardened) | 138 out of 408 | 4 (2.9 %)
DQQBS (nQPS splice protection, partly hardened) | | (0.15 %)
DQQDC (HTS lead protection, partly hardened) | 508 out of |

  – DQQDG and DQQDI,T are hardware equivalent and differ only in firmware
  – DQQBS and DQQDC are hardware equivalent and differ only in firmware
  – DQQBS and DQQDC have on board redundancy A/B (two interlock channels)

Radiation induced fault statistics 2011 – fault types

Equipment | Faults total | Faults remotely cleared or transparent | Faults requiring access | Faults causing a beam dump | Electronic component and failure mode(s)
DQAMC (DAQ system, local bus) | | | | | ISO150™ digital isolator, upset in capacitive transmission path
DQAMC (DAQ system, fieldbus) | 4 | 0 | 4 | 0 | uFIP™ fieldbus coupler
DQQDG (quench detection system 600 A) | 9 | 0 | 0 | 9 | SDRAM or DSP, program execution stalled and triggering watchdog, digital filter corruption (bit flips)
DQQDI,T (quench detection system IPQ, IPD, IT) | 4 | 0 | 0 | 4 | as DQQDG
DQQBS (nQPS splice protection) | 6 | 2 | 0 | 3 | ADuC834™, internal RAM, external SRAM, ADC register stuck, digital filter corruption

Radiation induced faults – technology versus hardness

Equipment | Processors | Remark
DQQDL_STP | ECC83, EL84 | Under study, rack powering and cooling to be revised
DQQDL, nDQQDL | ADuC812, ADuC831 for DAQ only | Microcontrollers used only for DAQ, detection part analog and classic digital logic
DQQDS, nDQQDI, nDQQDG, nDQAMG | ProASIC 3E A3PE1500 | FPGA configuration stored in FLASH, triplicate logic
DQAMC, DQAMG, DQAMS (fieldbus couplers) | ADuC831, ADuC841, uFIP | Program executed from FLASH, standard logic
DQQBS, DQQDC | ADuC834 | Program executed from FLASH, standard logic
DQQDG, I, T | TMS320C6211 | Program stored in FLASH but executed from SDRAM, standard logic

Radiation induced fault statistics 2011 – conclusions
• While most of the radiation induced faults are transparent to LHC operation, the number of beam dumps caused by spurious triggers is close to reaching the maximum admissible limit
  – Consolidation measures to be applied already during Xmas break 2011/2012
• Enhanced shielding for UJ14 and UJ16 will be beneficial but will not cure all problems
• The main problem is the DSP based quench detectors originally developed for radiation free areas
  – Consolidation work was launched already in 2008
  – The symmetric quench detection board is the first result of these efforts
      Fully satisfying performance during LHC operation
      One man-year of work required for R&D
  – While the technological challenges have been mastered, the lack of resources remains a major problem
Rather hypothetical scenario, to be avoided nevertheless …

Mitigation and consolidation measures – DAQ systems
• Firmware upgrade for DQAMCMB and DQAMCMQ as first mitigation measure
  – Deployment to be completed during next TS
  – 88.5 % (1437 units) done so far, including all MB
  – Upgrade includes 3 out of 4 condition for MB quench heater power supply availability (no injection inhibit in case of loss of 1 power supply)
• Full consolidation requires hardware upgrade (new board)
  – Incriminated chip is located on quench detection board type DQQDL
      Replacement already successfully exploited with DQQDS board
  – Design completed (Joaquim), prototype expected for 10/2011
  – Production covering DS areas 02/2012, procurement of components started
• Replacement of the fieldbus coupler chip (MicroFip™) by NanoFip (CERN)
  – New chip is neither hardware nor software compatible
  – Significant development and integration work to be done
  – First fieldbus segments to be upgraded during LS1
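The 3 out of 4 condition for the quench heater power supplies can be illustrated with a short sketch. This is not the DQAMC firmware; the function names and the representation of supply status as a list of booleans are assumptions made only for illustration.

```python
def heater_supplies_available(supply_ok, required=3):
    """Return True if at least `required` heater power supplies are healthy.

    `supply_ok` is a hypothetical list of booleans, one per supply.
    """
    return sum(1 for ok in supply_ok if ok) >= required


def injection_permitted(supply_ok):
    """Assumed mapping of the slide's rule: with four supplies per magnet,
    the loss of a single supply does not inhibit injection."""
    return heater_supplies_available(supply_ok, required=3)
```

With this rule, `injection_permitted([True, True, True, False])` is true, while the loss of a second supply inhibits injection.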

Mitigation and consolidation measures – nQPS splice protection
• Firmware upgrade
  – Triplication of digital filters and other modifications
  – Expected to cure a significant number of faults, but not all
  – Development to be completed 10/2011, partial deployment during Xmas break (half cells 8 to 11 around IP1, 2, 5, and 8)
  – One test slot in CNRAD still available for type testing
• Additional shielding
  – Proposed by R2E, to be considered for half cells 8 to 11 around IP1 & 5
      Option currently being evaluated, installation not yet confirmed
  – 16 locations, ~ 14 tons of steel
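The idea behind triplicating digital filters can be sketched as follows: the filter state is stored three times and a bitwise majority vote masks a bit flip in any single copy before the state is used. This is a minimal illustration in Python, not the nQPS firmware; the first-order integer low-pass filter and all names are assumptions.

```python
def vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three integer words."""
    return (a & b) | (a & c) | (b & c)


class TriplicatedLowPass:
    """First-order integer low-pass filter with a triplicated state word."""

    def __init__(self, shift: int = 2) -> None:
        self.shift = shift        # filter constant: y += (x - y) >> shift
        self.copies = [0, 0, 0]   # three redundant copies of the state

    def update(self, sample: int) -> int:
        state = vote(*self.copies)               # mask a corrupted copy, if any
        state += (sample - state) >> self.shift  # ordinary filter step
        self.copies = [state, state, state]      # rewrite (scrub) all copies
        return state
```

A bit flip injected into one copy between two updates (for example `f.copies[1] ^= 1 << 7`) leaves the filter output unchanged, because the other two copies outvote it. Simultaneous upsets in two copies are not covered by this scheme.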

Mitigation and consolidation measures – nQPS splice protection
• Hardware upgrade
  – Technology evaluated, two possible options
      FPGA based version using high resolution ADC
        – Additional radiation test campaign for ADC desirable
      Standard technology with optimised firmware and modified evaluation logic
        – Using three instead of two redundant processors and majority voting (introduction of “famous” board C)
        – This option could be implemented on a relatively short timescale but requires a more detailed study
  – Design in 2012 → installation in hot zones during LS1 or even in 2012

[Figure: QPS crate location, showing the IP direction (cells 7, 6, …), the arc direction (cells 9, 10, 11, …) and the shielding upstream of the QPS crate. Detailed location of affected racks to be verified. Slide courtesy M. Brugger]

Mitigation and consolidation measures – IPQ, IPD and IT protection
• New digital quench detection systems type nDQQDI
  – Similar to symmetric quench detection board developed for nQPS
      Core is flash based FPGA ProASIC™ A3PE1500
  – Board design and firmware development by Jens (= QPS FPGA guru)
  – New board is (of course) not fully compatible with previous version
      Some specialist work required to integrate it into QPS supervision
  – 200 boards including spares required for consolidation
      UJ14, 16, 56, RR13, 17, 53, 57
By the way: the detection threshold for IPQs, especially Q9 and Q10, should be revised as well. Is 200 mV, 10 ms acceptable?
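To make the "200 mV, 10 ms" setting concrete, the sketch below treats it as a voltage threshold combined with an evaluation time: a trigger is raised only if the signal stays above the threshold for the whole interval. That reading, the 1 ms sample period and the function name are assumptions; the slide only quotes the two numbers.

```python
import math


def first_trigger_index(samples_mv, threshold_mv=200.0,
                        evaluation_ms=10.0, sample_period_ms=1.0):
    """Return the index of the sample at which a trigger would be raised,
    or None if the signal never stays above threshold long enough."""
    needed = math.ceil(evaluation_ms / sample_period_ms)
    run = 0  # consecutive samples above threshold
    for index, voltage in enumerate(samples_mv):
        run = run + 1 if abs(voltage) > threshold_mv else 0
        if run >= needed:
            return index
    return None
```

A 5 ms spike of 300 mV is rejected, whereas a signal that stays at 250 mV triggers once ten consecutive samples have exceeded the threshold. Shortening the evaluation time speeds up detection but makes short disturbances more likely to cause a trigger, which is the trade-off behind the question on the slide.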

Mitigation and consolidation measures – IPQ, IPD and IT protection

Task | Status
Board design | Done
Technical review | Done
Prototypes I (5 units) | Done
Firmware development | Done, only minor modifications expected
Radiation test in CNRAD | Test successfully passed
Prototypes II (10 units) | In preparation (Vincent)
System integration – adaptation of QPS low level supervision (DQAMG firmware update) | Started and advancing well, tedious but no showstoppers so far
Adaptation of gateway application for new commands | Done
Type tests | Started, ok so far
Procurement of components | Started
Production of 200 boards and follow up | Pending
Installation and QPS-IST | Pending, during Xmas break 2011/2012

Mitigation and consolidation measures – 600 A protection
• New digital quench detection systems type nDQQDG
  – Similar to nDQQDI board developed for nQPS
      Core is flash based FPGA ProASIC™ A3PE1500 or A3PE3000
  – High dynamic range of the current reading requires a high resolution ADC or a complex digital to analog feedback circuit
      Fast high resolution 24 bit ΣΔ ADC TI ADS1271
      Modulator part successfully radiation tested by TE-EPC
  – Firmware is by far more complex than for nDQQDI
      Complex digital filter system including non-linear filters
      Numerical derivative of current, look-up tables for circuit inductance
      Algorithms well known but transfer to FPGA not trivial
  – Board design and especially the firmware development to be done by Jens
  – 300 boards including spares required for consolidation
      UJ14, 16, 56, RR13, 17, 53, 57, 73, 77
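The slide names two ingredients of the 600 A detection algorithm: a numerical derivative of the current and look-up tables for the circuit inductance. One plausible way they combine, sketched below, is to subtract the inductive voltage L(I)·dI/dt from the measured voltage so that only the resistive part is compared with the threshold. The formula, the table values and all names are assumptions for illustration; the slide does not spell out the algorithm.

```python
from bisect import bisect_right


def lookup_inductance(current_a, table):
    """Linearly interpolate inductance (H) from a table of
    (current in A, inductance in H) pairs sorted by current.
    Values outside the table are clamped to the end points."""
    currents = [c for c, _ in table]
    if current_a <= currents[0]:
        return table[0][1]
    if current_a >= currents[-1]:
        return table[-1][1]
    j = bisect_right(currents, current_a)
    (c0, l0), (c1, l1) = table[j - 1], table[j]
    return l0 + (l1 - l0) * (current_a - c0) / (c1 - c0)


def resistive_voltage(u_meas_v, i_prev_a, i_now_a, dt_s, table):
    """Measured voltage minus the inductive part L(I) * dI/dt,
    using a backward difference for the current derivative."""
    di_dt = (i_now_a - i_prev_a) / dt_s
    return u_meas_v - lookup_inductance(i_now_a, table) * di_dt


# Hypothetical two-point table: inductance drops from 0.10 H to 0.08 H at 600 A.
EXAMPLE_TABLE = [(0.0, 0.10), (600.0, 0.08)]
```

During a ramp of 1 A/s at 300 A the example table gives 0.09 H, so a measured 0.09 V is fully explained by the inductance and the resistive remainder is close to zero. In fixed-point FPGA logic the interpolation and the derivative need care with scaling and noise, which matches the remark that the transfer to FPGA is not trivial.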

Mitigation and consolidation measures – 600 A protection

Task | Status
Board design | Started
Technical review | To be decided
Prototypes I (5 units) | Pending
Firmware development | Started
Radiation test in CNRAD | Not required
Prototypes II (10 units) | Pending
System integration – adaptation of QPS low level supervision | Pending
Adaptation of gateway application for new commands | Done
Type tests | Pending
Procurement of components | Pending
Production of 300 boards and follow up | Pending
Installation and QPS-IST | Pending, earliest at the end of the Xmas break 2011/2012

Mitigation and consolidation – current and new baseline

Device | Current baseline: R&D | Current baseline: Deployment | New proposal: R&D | New proposal: Deployment
nDQQDI | 2011 | Partial | | Full Xmas break 2011/2012
nDQQDG | 2012 | LS1 | 2011/12 | Early 2012
nDQQBS | 2012 | LS1 | 2011/12 | Mid 2012
nDQQDL | 2011 | Partial Xmas break 2011/ | | Partial Xmas break 2011/2012
nDQAMC (NanoFip CERN) | 2012 | LS1 | 2012 | LS1

Mitigation and consolidation – resources
• Financial resources
  – nDQQDI boards: ~ 300 CHF per board → 60 kCHF
  – nDQQDG boards: ~ 350 CHF per board → 105 kCHF
  – nDQQDL board: ~ 200 CHF per board → 40 kCHF
      Production of 200 boards, which serve as well as spares
  – nDQQBS board: ~ 300 – 400 CHF per board (redundant circuit board)
  – NanoFip board: ~ 300 – 400 CHF per board
• Production
  – Lead time for many components critical (e.g. A3PE3000 > 20 weeks)
      (Pre-emptive) ordering already started
  – To be planned in detail and firms to be selected very carefully
      Recommended to use known good suppliers only
      Production follow-up could eventually be outsourced
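The three quoted totals follow from the board counts given on the earlier slides (200 nDQQDI, 300 nDQQDG, 200 nDQQDL) multiplied by the approximate unit prices. The short check below reproduces that arithmetic; the combined figure it computes is derived here and is not quoted in the presentation, and the nDQQBS and NanoFip boards are left out because no quantities are given for them.

```python
# (quantity, approximate unit price in CHF), taken from the slides
BOARDS = {
    "nDQQDI": (200, 300),
    "nDQQDG": (300, 350),
    "nDQQDL": (200, 200),
}


def cost_kchf(quantity, unit_price_chf):
    """Total cost in kCHF for one board type."""
    return quantity * unit_price_chf / 1000


COSTS = {name: cost_kchf(*spec) for name, spec in BOARDS.items()}
TOTAL_KCHF = sum(COSTS.values())
```

`COSTS` reproduces the 60, 105 and 40 kCHF on the slide, and the three board types together come to 205 kCHF.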

Mitigation and consolidation – resources (the controversial slide)
• Manpower
  – Most of the work is reserved to QPS specialists; there is no gain in outsourcing, as the necessary transfer of information would itself require a substantial specialist contribution
      Production can be outsourced if knowledgeable workforce is available
  – With the present baseline, i.e. no installation of nDQQDG and nDQQDB boards in 2012, the current assignment of activities does not need to be changed
      Schedule for the nDQQDI board is very tight but still feasible
  – In case an upgrade of the 600 A protection systems in 2012 is regarded as mandatory, resources need to be re-assigned
      The FPGA specialist must be relieved from other tasks as much as possible (number of FPGA specialists to be increased as well...)
  – The good news: installation and QPS-IST estimated at 1-2 days (2 specialists working) per concerned area (installation during TS feasible)

Mitigation and consolidation – outlook LS1
• Relocation of all QPS equipment installed in UJ14, 16 and UJ56
• Installation of nQPS for IPQ, IPD and IT
• Hardware upgrades for 600 A protection
  – Upgrades started in 2012 to be completed
• Hardware upgrade nQPS splice protection
  – Scope to be defined
• Consolidation of DAQ systems
  – NanoFip (CERN)
  – ISO150™ replacement
• Change of detector evaluation logic
  – E.g. 2 out of 3 instead of 1 out of 1
  – Significant change of QPS systems, to be studied in more detail
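Why a 2 out of 3 evaluation logic reduces spurious triggers can be seen from a simple probability estimate. The sketch assumes that each detector channel produces a spurious trigger independently with probability p in a given interval; both the independence and the value of p are assumptions, and radiation that upsets several channels at once would weaken the benefit.

```python
def spurious_1oo1(p):
    """Spurious trigger probability with a single channel (1 out of 1)."""
    return p


def spurious_2oo3(p):
    """Probability that at least two of three independent channels
    trigger spuriously (2 out of 3 voting)."""
    return 3 * p**2 * (1 - p) + p**3
```

For a hypothetical p = 0.01 the single channel triggers spuriously with probability 1e-2, whereas the voted system does so with probability about 3e-4. The price is additional hardware and a changed interlock chain, consistent with the remark that this is a significant change of the QPS systems.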

Summary
• During LHC run 2011, 116 confirmed radiation induced faults have been observed so far
  – 16 beam dumps due to radiation induced spurious triggers
• Fault analysis to be done very carefully before coming to conclusions
• So far only soft errors, i.e. no destructive faults, have been observed
• None of the observed events caused a total loss of magnet and/or circuit protection
  – Redundancy of the protection systems is essential
• Solutions for mitigation and consolidation have been elaborated and deployment has started in some cases
  – Priority is given to events requiring access to LHC or causing beam dumps
• In order to keep the radiation induced spurious QPS triggers in 2012 at a reasonable level, some consolidation measures have to be implemented already during the coming Xmas break
  – Adequate resources have to be assigned