AMOD Report. Doug Benjamin, Duke University.


1 AMOD Report. Doug Benjamin, Duke University

2 Hourly jobs running during the last week
[Plot: axis range 0 to 140 K. Legend: blue - MC simulation; yellow - data processing; red - user analysis; magenta - group production; grey - group analysis]

3 DDM data flows during the last week
[Plots: axis labels 0 TB, 10 TB, 800 TB]

4 Notable activities
Monday - Recovered from slow T0 export over the weekend to RAL and TRIUMF
  o Both switched over to the backup OPN over the weekend; cause never understood
  o TRIUMF: slower link; RAL: asymmetric link
Tuesday - SARA: issues with T0 export and T1 staging from tape
Wednesday - RAL unplanned power cut; CERN LSF job submission slowness
Thursday - RAL power restored, recovering from the outage; CERN LSF job submission slowness continued
Friday - CERN LSF job submission slowness
Saturday - Rain, lots of it (flooding: R1, my office building, SPS, which took the beam offline)

5 Other notable events
ND cloud local storage problems
  o Currently trying to recover 70k files to avoid declaring them lost. Most tasks are being resubmitted, and Rob subscribed to the missing RAW input files.
RAL - worked to recover several ATLAS pools affected by the power cut (159 files declared lost)

6 Bulk reprocessing
  o Originally planned to start with period D, then B, then A and then C
  o Instead, period D started, then periods B, A and C were submitted early to keep jobs running in all clouds, but this processing pattern has caused disk space problems at Tier 1 sites
  o Stopped early submission of periods A and C; D and B continue
As of Sunday: period D 98.5% done (before merge), period B 68% done
Over the weekend, disk space at the Tier 1 sites became an issue.

7 T1 data disk space
Due to low free disk space, PIC, SARA and FZK were all removed from SANTA CLAUS; 4 T1 sites are now excluded (DE, ES, NL, IT clouds)
Saturday - Stephane Jezequel triggered cleaning (Victor has been running very slowly recently)
  o Situation at FZK and SARA improved
  o Monday (12-Nov): SARA will migrate 60 TB from scratch to data disk
  o PIC still an issue as of Sunday night
  o Stephane is moving MC datasets away

8 LSF
LSF job dispatch speed caused problems all week
[Plot: axis labels 6 K and 60 K]

9 GGUS tickets

10 Conclusions
Thanks to the experts, sites and shifts (Comp@p1, ADCOS, ADCOS expert)
Bulk reprocessing is proceeding relatively smoothly
LSF job submission speed is causing the Tier 0 team headaches
DATA disk space at the Tier 1 sites is an issue and needs to be monitored so as not to affect bulk reprocessing

