
OPERATIONS REPORT JUNE – SEPTEMBER 2015 Stefan Roiser CERN.


1 OPERATIONS REPORT JUNE – SEPTEMBER 2015 Stefan Roiser CERN

2 CPU Time Provided by Sites
- Not much news: as usual, the T0 and T1 sites plus Yandex are the top providers
- Still some news …
  - … a cloud site (CERN) is among the top 20 contributors for the first time; it was 28th in the last report
  - … and, naturally, no more OFFLINE processing at the HLT farm since the start of Run 2
14 Sep '15 - NCB - Operations Report - StR

3 Job Types Distribution & CPU Efficiency
- The main job types continue to be executed with very high CPU efficiency
- The period is still dominated by Simulation jobs
- DataReconstruction is picking up

4 Running Jobs
- The average of 20k running jobs is below the usual level
- A problem with Dirac job submission is now fixed
- Note: the same fluctuations were seen in summer 2014, yet the all-year average was then significantly higher
[Plot label: Avg Summer 2014]

5 Job Success Rates
- The job success rate (Done + Completed) decreased over the reference period to ~90 %
  - It had been at ~95 % for a very long time
- Increase in "stalled" jobs because of Simulation productions for Trigger Upgrade studies with high μ and occupancy
  - Failure causes are under investigation
  - Most of the requests are finished by now
  - -> a sneak preview of Run 3 data processing, plus work on the problem of calculating the queue time left
[Plot labels: increase; Stalled MC Jobs]
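As a minimal illustration of the success-rate figure quoted on this slide, the sketch below computes the share of jobs that ended in the Done or Completed state. The function name, the other status names and all counts are hypothetical placeholders chosen for the example; they are not taken from the report or from any monitoring API.

```python
def success_rate(counts):
    """Return the fraction of jobs whose final status is Done or Completed.

    `counts` maps a final job status to the number of jobs in that status.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no jobs in the sample")
    successful = counts.get("Done", 0) + counts.get("Completed", 0)
    return successful / total


# Hypothetical counts, invented for illustration only.
example_counts = {"Done": 850, "Completed": 50, "Failed": 60, "Stalled": 40}
print(f"{success_rate(example_counts):.0%}")  # prints "90%"
```

With these invented numbers, 900 of 1000 jobs count as successful, so a growing "Stalled" bucket lowers the rate directly.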

6 CERN Slow Worker Nodes
- Some CERN worker nodes process data several times slower than others
- In general we are interested in data throughput, which is OK
- … but it is an additional hassle for the production team to "wait for the last job" before, e.g., closing a production
- CERN/LSF is looking into it with highest priority

7 Run 2 Data Validation & Production
- 41 Bookkeeping processing passes produced for Run 2 data validation and production
[Figure: processing passes, with periods labelled "Early Measurements" and "25ns ramp". Pass labels as extracted: Reco15/Stripping22 Reco15a/Stripping22 Reco15/Turbo01a Reco15/Turbo01b Reco15RDST Turbo01c Reco15c Reco15em/Stripping22 & Turbo01em Reco15em/Stripping22a Reco14/…/MDST01 Reco15 Reco15a Turbo01 Reco15b/Stripping23 Reco15b/Stripping23a Reco15pne Reco15c/Stripping23b (run range extension)]

8 Verification of Processed Run 2 Validation Data Quality by Analysts
- 4 % of the file accesses during this period were made on the grid by analysts for quality checking of the validation data
[Plot label: Access of validation data]

9 ONLINE Calibration & Turbo Stream
- Major change in the data processing workflow
- Detector calibration and alignment are done in the pit
  - … which allows "physics quality" data reconstruction in the pit
- The Turbo stream thus produced for part of the HLT-selected data is shipped to storage sites and is ready for analysis right away
- Until the end of the year, RAW information is also exported and reconstructed to check ONLINE/OFFLINE equivalence
- Validation of the Turbo (and Turbo Validation) workflows was carried out successfully

10 Run 2 Data Processing Lessons Learnt
- The validation chain is essential and works very well for both infrastructure and data quality checking
  - Fast turnaround times (O(day(s)))
  - Very good and fluent communication between ONLINE, OFFLINE and analysts
  - Allowed issues to be detected early on
- Until the end of 2015 it is unlikely that we will need T2 sites for data processing
  - "Mesh processing" was tested and works as expected
  - It was initially foreseen to have additional data processed at T2 sites, but due to the lower LHC luminosity this is not needed for now

11 Summary
- LHCb continues to use the pledged resources at very high CPU efficiency
  - ¾ of the resources are used for Simulation; Data Processing is increasing
  - Usage is currently below average but picking up after the summer
  - One issue with killed batch jobs is understood and being worked on
- Lots of work on the validation of Run 2 data
  - Successful validation of the distributed computing infrastructure, applications and quality of processed data
  - Turbo data workflow successfully validated
- Now looking forward to moving on to stable operations …

