9th Weekly Operation Report on DIRAC Distributed Computing. YAN Tian. From 2015-02-25 to 2015-03-04.


1 9th Weekly Operation Report on DIRAC Distributed Computing. YAN Tian. From 2015-02-25 to 2015-03-04

2 Weekly Running Jobs by User
Notes:
1. zhanglei and huangy submitted BES jobs.
2. CEPC's production user weiyq kept running MC jobs.

item                    value
active users            3
max running jobs        785
average running jobs    394
total executed jobs     31.4 k

3 Final Status of Running Jobs

Failed reason             Percent
Application Error         3.99 %
upload/download failed    3.77 %
stalled                   1.10 %
other                     0.025 %

Plot annotations: CEPC upload error; huangy 74# pkg download error
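As a quick cross-check, the failure reasons listed above can be summed to get an overall failure rate. This is a minimal sketch: the variable names are illustrative, and it assumes that every job not counted under a failure reason finished successfully, which the slide does not state explicitly.

```python
# Failure percentages as listed on the slide (percent of jobs).
failed_percent = {
    "Application Error": 3.99,
    "upload/download failed": 3.77,
    "stalled": 1.10,
    "other": 0.025,
}

# Overall failure rate is the sum of the individual reasons.
total_failed = sum(failed_percent.values())

# Assumption: all remaining jobs completed successfully.
success = 100.0 - total_failed

print(f"total failed: {total_failed:.3f} %")
print(f"succeeded:    {success:.3f} %")
```

Under that assumption, roughly nine jobs in ten completed, which is in line with the per-site figures reported later in the deck.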

4 Output Data Generated and Transferred
Total: 7.48 TB (~1.07 TB/day), good quality
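The daily rate follows from the reporting period in the title (2015-02-25 to 2015-03-04). A small sketch of that arithmetic, assuming the period is counted as the difference between the two dates:

```python
from datetime import date

# Reporting period taken from the report title.
start = date(2015, 2, 25)
end = date(2015, 3, 4)

# Total output data reported on the slide, in TB.
total_tb = 7.48

# Assumption: the period length is the plain date difference.
days = (end - start).days
rate = total_tb / days

print(f"{days} days, {rate:.2f} TB/day")
```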

5 Running Jobs by Site
Notes:
– UMN's maximum number of running jobs is set to 500.
– WHU finished 47.7% of the jobs.

6 Job Final Status at Each Site

Site          Success    Annotation
WHU           92.0%      upload failed
GRID.JINR     100%
UMN           91.3%      74# BES
OpenNebula    96.5%
OpenStack     85.8%      11# CEPC
UCAS          96.7%      66# BES

7 Failed Types at Site: Description
– GRID.JINR is good.
– UCAS is good too, but still has 66# randomtrg download errors.
– OpenNebula is good.
– OpenStack occasionally has a problem for a short time when a VM starts; 900 jobs failed in 1 hour.
– WHU still has upload errors. This should improve when CEPC runs long jobs.
– UMN is good, but jobs submitted to UMN failed because input data download errors arise when a large number of jobs fetch data from the DIRAC server simultaneously.

Server log:
2015-03-03 21:07:39 UTC DataManagement/StorageElement NOTICE: Returning response (128.101.221.233:43120)[bes_user:huangy] (30.23 secs) ERROR: Failed to get file dips://besdirac02.ihep.ac.cn:9148/DataManagement/StorageElement/bes/user/h/huangy/Upload/524 0cc601bde6085a9baf9185a2cc22979c2624bc76705cb10e5301e0fd3fcfb
2015-03-03 21:07:39 UTC DataManagement/StorageElement ERROR: Error processing proposal Error while sending: [('SSL routines', 'SSL3_WRITE_PENDING', 'bad write retry')]
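To count such errors per component or per level, the log lines quoted above can be split into fields. The sketch below assumes the layout "timestamp UTC component LEVEL: message", inferred only from the two lines shown here and not from any DIRAC documentation. `LOG_RE` and `parse_log_line` are hypothetical names.

```python
import re

# Assumed layout, inferred from the two sample lines on the slide:
#   <YYYY-MM-DD HH:MM:SS> UTC <component> <LEVEL>: <message>
LOG_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC "
    r"(?P<component>\S+) "
    r"(?P<level>[A-Z]+): "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


# Second log line from the slide, used as sample input.
sample = (
    "2015-03-03 21:07:39 UTC DataManagement/StorageElement ERROR: "
    "Error processing proposal Error while sending: "
    "[('SSL routines', 'SSL3_WRITE_PENDING', 'bad write retry')]"
)
record = parse_log_line(sample)
print(record["level"], record["component"])
```

Lines that do not follow the assumed layout return `None`, so they can be skipped or collected separately rather than misparsed.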

8 Cumulative User Jobs Total user jobs: 31.4 k

9 Running jobs and Walltime Usage of VOs Walltime usage: BES 17.6% CEPC 82.4%

10 New Features in Frontend
Added this week to fulfill users' needs:
gangaBOSS
– added support for BOSS 6.6.5 (done); only a large-scale test is still needed.
dsub
– supports setting the start event number and the number of batches to run (done, and now in production use to deal with input data of 200k events).
– can handle stdhep input whose events lie in [0, evtmax] in each job.
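The idea of splitting a large input into batches by start event number can be sketched as follows. This is not dsub's actual interface: `plan_batches`, its parameters, and the batch size of 1000 events are hypothetical and chosen only for illustration. Only the 200k total comes from the slide.

```python
def plan_batches(total_events, events_per_job, start_event=0):
    """Return (start_event, n_events) pairs that cover total_events.

    Hypothetical helper for illustration; not part of dsub.
    """
    if events_per_job <= 0:
        raise ValueError("events_per_job must be positive")
    batches = []
    first = start_event
    end = start_event + total_events
    while first < end:
        # The last batch may be shorter than events_per_job.
        n_events = min(events_per_job, end - first)
        batches.append((first, n_events))
        first += n_events
    return batches


# 200k events as on the slide; 1000 events per job is an assumed value.
batches = plan_batches(200_000, 1_000)
print(len(batches), batches[0], batches[-1])
```

Each pair gives one job its starting event and event count, so the jobs cover the input without overlap.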

