Published by Bennett Elliott. Modified over 8 years ago.
1
Calorimeter Data Monitoring News
Benoit Viaud (LAL-in2p3)
B. Viaud, Calo Mtg, Aug. 31st 2011
2
Overview
Reminder: too many alarms make the monitoring inefficient;
A survey of the monitors' behavior over 2011;
A few proposed improvements.
3
Reminder
Marie-Noelle (early May 2011): the monitoring issues too many alarms, and not all of them have real consequences. This lowers the Data Quality shifters' vigilance: they eventually overlook important issues.
I surveyed the 2011 monitoring data to determine which alarms are indeed too noisy and to see what can be done.
4
Survey of the monitors over 2011
Most of the monitoring is based on these monitors (list shown on the slide), plus a few others based on collision data.
5
Survey of the monitors over 2011
Most of the monitoring is based on these monitors: quantities like a PMT's response to an LED pulse, the pedestal position, etc. are measured in each cell. A cell is faulty if the average over n events is outside a certain range.
The number of faulty cells determines the severity of the conclusion: warning/alarm/fatal.
This monitoring is repeated every 10-15 minutes.
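The per-cell check and the severity decision described on this slide can be sketched as follows. This is only a minimal illustration: the function names, the range `[low, high]` and the numbers of faulty cells attached to each severity level are hypothetical and are not the values used by the actual calorimeter monitoring.

```python
from statistics import mean

# Hypothetical thresholds: minimum number of faulty cells for each level,
# ordered from the most severe to the least severe.
SEVERITY_THRESHOLDS = [("fatal", 50), ("alarm", 10), ("warning", 1)]


def is_faulty(values, low, high):
    """A cell is faulty if its average over the n events is outside [low, high]."""
    avg = mean(values)
    return avg < low or avg > high


def severity(cells, low, high, thresholds=SEVERITY_THRESHOLDS):
    """Map the number of faulty cells to 'ok', 'warning', 'alarm' or 'fatal'.

    cells: dict mapping a cell identifier to the list of values measured
    in that cell over the n events of the saveset.
    """
    n_faulty = sum(is_faulty(v, low, high) for v in cells.values())
    for level, min_cells in thresholds:
        if n_faulty >= min_cells:
            return level
    return "ok"
```

Re-tuning a monitor then amounts to changing `low`, `high` or the entries of `thresholds` and re-evaluating the same savesets.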
6
Survey of the monitors over 2011
I analyzed all the 15-minute savesets taken in 2011 (up to Aug. 13th; physics fills only; discarding those created automatically at the end of a run).
The goal is to count the number of warnings and alarms issued by each monitor per unit of time (fill), to spot those which "overwhelm" the DM. Action to be taken: to be discussed with the corresponding experts (re-tune the ranges and thresholds to reduce the number of alarms while keeping the calo safe).
Correlations are expected among the monitors: confirm them in practice. Correlated monitors can be grouped into a single item to simplify the DM's work.
The scripts developed for this study can easily determine the effect of varying the thresholds.
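The counting described above could look roughly like the sketch below. It is not the script used for the study: the input layout (one `(fill, monitor, level)` tuple per saveset and monitor) and the function names are assumptions made for illustration.

```python
from collections import defaultdict

LEVELS = {"ok": 0, "warning": 1, "alarm": 2, "fatal": 3}


def count_per_fill(savesets):
    """Count savesets per (monitor, fill).

    savesets: iterable of (fill, monitor, level) tuples (hypothetical layout).
    Returns {(monitor, fill): counts}, where 'warning' and 'alarm' mean
    'at least in Warning' and 'at least in Alarm'.
    """
    counts = defaultdict(lambda: {"total": 0, "warning": 0, "alarm": 0, "fatal": 0})
    for fill, monitor, level in savesets:
        rank = LEVELS[level]
        c = counts[(monitor, fill)]
        c["total"] += 1
        c["warning"] += rank >= 1
        c["alarm"] += rank >= 2
        c["fatal"] += rank >= 3
    return dict(counts)


def scan_threshold(n_faulty_per_saveset, thresholds):
    """For each candidate threshold, count the savesets that would fire,
    i.e. whose number of faulty cells is at least that threshold."""
    return {t: sum(n >= t for n in n_faulty_per_saveset) for t in thresholds}
```

`scan_threshold` illustrates the last point of the slide: once the number of faulty cells per saveset is stored, the effect of a different threshold can be evaluated without re-running the monitoring.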
7
Example: Ecal_Unexpected Signal
NB: all the other monitors are shown in back-up.
[Plot: # savesets vs. fill number. Legend: # of savesets in the fill; # of savesets at least in Warning; # of savesets at least in Alarm; # of savesets in Fatal.]
8
Example: Ecal_Unexpected Signal
Normalized to the number of savesets in the fill.
[Plot: # savesets vs. fill number. Labelled fills: 1613 (13-03-2011), 1806 (25-05-2011), 1944 (14-07-2011), 2025 (13-08-2011).]
9
Correlated Monitors
PedestalChi2 & AveragePedestalNoise alarms are always accompanied by a PedestalNoise alarm;
Most of the PedestalNoise & PedestalShiftOverNoise alarms are accompanied by a PedestalShift alarm.
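One simple way to quantify statements such as "always accompanied by" or "most of" is the fraction of a monitor's alarms that coincide with an alarm of another monitor in the same saveset. The sketch below assumes the alarms are available as sets of saveset identifiers per monitor; that layout and the function name are hypothetical.

```python
def accompanied_fraction(alarms, monitor, companion):
    """Fraction of `monitor` alarms that occur in a saveset where
    `companion` is also in alarm.

    alarms: dict mapping a monitor name to the set of saveset identifiers
    in which that monitor raised an alarm (hypothetical layout).
    Returns None if `monitor` never raised an alarm.
    """
    own = alarms[monitor]
    if not own:
        return None
    return len(own & alarms[companion]) / len(own)
```

A value of 1.0 corresponds to "always accompanied"; a value close to 1.0 corresponds to "most of the alarms".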
10
Ecal_AveragePedestalNoise
11
Ecal_PedestalChi2
12
Ecal_PedestalNoise
13
Ecal_PedestalShiftOverNoise
14
Ecal_PedestalShift
15
Hcal_AveragePedestalNoise
16
Hcal_PedestalChi2
17
Hcal_PedestalNoise
18
Hcal_PedestalShift
19
Correlated Monitors
PedestalChi2 & AveragePedestalNoise alarms are always accompanied by a PedestalNoise alarm;
Most of the PedestalNoise & PedestalShiftOverNoise alarms are accompanied by a PedestalShift alarm.
Group them into a single Pedestal alarm in the DM page. Keep the full picture in the Piquet page for finer diagnostics.
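A grouped alarm can be illustrated as the worst level among the monitors of the group. This is a sketch of the idea only, not the planned implementation (which would involve new monitoring histograms); the monitor names follow the plot titles of the previous slides, and the "worst level wins" rule is an assumption.

```python
LEVELS = ["ok", "warning", "alarm", "fatal"]  # ordered by increasing severity

# Hypothetical group definition, using the monitor names of the plots.
PEDESTAL_GROUP = [
    "PedestalChi2",
    "AveragePedestalNoise",
    "PedestalNoise",
    "PedestalShiftOverNoise",
    "PedestalShift",
]


def grouped_level(statuses, group):
    """Return the worst level among the monitors of `group`.

    statuses: dict mapping a monitor name to its level for one saveset;
    monitors missing from the dict are treated as 'ok'.
    """
    return max((statuses.get(m, "ok") for m in group), key=LEVELS.index, default="ok")
```

The DM page would then show only `grouped_level(statuses, PEDESTAL_GROUP)`, while the Piquet page keeps the individual entries of `statuses`.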
20
Correlated Monitors
LEDNoise & LargeLEDNoise;
LowLEDSignal & OutRangeLED & NoGainMonitor.
21
Ecal_LEDNoise
22
Ecal_LargeLEDNoise
23
Hcal_LEDNoise
24
Hcal_LargeLEDNoise
25
Ecal_LowLEDSignal
26
Ecal_OutRangeLED
27
Ecal_NoGainMonitor
28
Hcal_LowLEDSignal
29
Hcal_OutRangeLED
30
Hcal_NoGainMonitor
31
Correlated Monitors
LEDNoise & LargeLEDNoise;
LowLEDSignal & OutRangeLED & NoGainMonitor.
Group them into a single LEDNoise and a single NoGainMonitor.
32
Even vs. Odd in Prs/Spd
Group Odd and Even in DM plots.
33
Proposal
Replace this:
34
By this: much simpler for the DM.
35
Noisy Monitors
Those which issue a Warning/Alarm/Fatal at least every few days. There are a few of them (see next slides).
Summing up all the alarms: something pretty much every day.
Now: study the pattern behind those alarms and discuss with the experts how to make them quieter while remaining safe (e.g. optimized ranges and thresholds). The next slides contain my first remarks.
36
Noisy Monitors
40
Noisy Monitors
Some alarms appear simultaneously in many monitors.
This happens when something a bit dramatic occurred (at least, something must have happened).
I guess we want those alarms; we should see to it that they’re still there after the monitoring ranges/thresholds have been optimized.
Ex: Fills 1738, 1743, 1944 [plots: HCAL_LEDNoise, ECAL_LEDNoise].
41
Noisy Monitors
Some alarms appear simultaneously in many monitors.
This happens when something a bit dramatic occurred (at least, something must have happened).
I guess we want those alarms; we should see to it that they’re still there after the monitoring ranges/thresholds have been optimized.
Fill 1944: right after LHCb restarted on July 14th, shortly after a power cut.
Fill 1743: misconfiguration of ODIN, LED pulsing in a physics BXID.
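The check suggested on this slide (simultaneous alarms must survive a re-tuning) could be sketched as below. The input layout, the function names and the default of three monitors as the definition of "many" are all assumptions made for illustration.

```python
from collections import defaultdict


def simultaneous_alarms(alarms, min_monitors=3):
    """Find savesets in which at least `min_monitors` distinct monitors
    are in alarm.

    alarms: iterable of (saveset_id, monitor) pairs, one per alarm
    (hypothetical layout).
    Returns {saveset_id: sorted list of monitors in alarm}.
    """
    by_saveset = defaultdict(set)
    for saveset_id, monitor in alarms:
        by_saveset[saveset_id].add(monitor)
    return {s: sorted(m) for s, m in by_saveset.items() if len(m) >= min_monitors}


def lost_after_retuning(before, after):
    """Savesets flagged with the current settings but no longer flagged
    with the re-tuned ones; ideally this list is empty."""
    return sorted(set(before) - set(after))
```

Running `simultaneous_alarms` on the alarms obtained with the current and with the re-tuned thresholds, then comparing the two results with `lost_after_retuning`, shows whether an event like those of fills 1743 or 1944 would still be flagged.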
42
Noisy Monitors: Spd Fake Signal
I observe one faulty saveset every few hours; everything is OK 15 minutes before and after. Instability in the pedestal?
Most of the time, not very much above the Warning threshold.
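The pattern described here, a faulty saveset surrounded by good ones, can be searched for with a few lines of code. The sketch assumes that the levels of one monitor are available as a time-ordered list with one entry per saveset; the function name is hypothetical.

```python
def isolated_faults(levels):
    """Indices of savesets that are not 'ok' while the previous and the
    next saveset (when they exist) are both 'ok'.

    levels: time-ordered list of levels ('ok', 'warning', 'alarm',
    'fatal') for one monitor, one entry per saveset.
    """
    isolated = []
    last = len(levels) - 1
    for i, level in enumerate(levels):
        if level == "ok":
            continue
        before_ok = i == 0 or levels[i - 1] == "ok"
        after_ok = i == last or levels[i + 1] == "ok"
        if before_ok and after_ok:
            isolated.append(i)
    return isolated
```

A monitor whose faults are almost all isolated, and close to the Warning threshold, is a natural candidate for a slightly wider range.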
43
Noisy Monitors: Prs PedestalMeans
Shows up after LHCb restarted on July 14th, following the power cut. FEB11 on crate 2 changed by Stephane?
44
Noisy Monitors: Ecal/Prs Low Occupancy
Other alarms are simultaneous for Ecal and Prs. They are always (save one time) due to the very first saveset analyzed in the fill, typically 1 to 5 minutes after the start.
PS2FEB11 is visible on the left of the PRS plot.
Did they really appear in the alarm section of the presenter? If yes, discarding the first saveset will greatly reduce their rate.
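The effect of discarding the first saveset of each fill can be estimated offline, for instance as in the sketch below. The `(fill, start_time, level)` layout and the function names are assumptions; start times are assumed to be unique within a fill.

```python
def drop_first_saveset(savesets):
    """Return the savesets without the earliest one of each fill.

    savesets: list of (fill, start_time, level) tuples (hypothetical
    layout); start_time can be any comparable value.
    """
    first = {}
    for fill, start_time, _ in savesets:
        if fill not in first or start_time < first[fill]:
            first[fill] = start_time
    return [s for s in savesets if s[1] != first[s[0]]]


def count_alarms(savesets):
    """Number of savesets at least in alarm."""
    return sum(level in ("alarm", "fatal") for _, _, level in savesets)
```

Comparing `count_alarms(savesets)` with `count_alarms(drop_first_saveset(savesets))` gives the reduction in the alarm rate that this change would bring.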
45
Noisy Monitors
Known for a long time. Need to find a way to fix it…
46
Noisy Monitors
Known for a long time. Need to find a way to fix it…
[Plot labels: 1799, 21/5/11; 2040, 22/8/11.]
47
Summary and Prospects
Surveyed the 2011 monitoring data to find ways to reduce the number of alarms to be handled by the Shift Data Manager.
Many alarms are correlated/simultaneous: they could be grouped into a single one. This will require a bit of coding (creating new monitoring histos): one of my next steps.
A few monitors trigger an alarm every few days; combining everything, this means something almost every day. I'm presently looking into this to determine whether they can be re-optimized (fewer alarms while remaining safe).
The scripts written for this study can be made available to the Piquets (after a bit of cleaning). They could be used every day to monitor, as a whole, the fills taken in the past 24 hours.
48
Back-up