
1 Status of BESIII Distributed Computing
name list
BESIII Collaboration Meeting, IHEP, June 2014

2 Outline
- DMS status
- WMS status
- Site status
- Site monitoring
- Release of BESDIRAC v0r8
- Summary

3 DMS Status
- Motivation
- Data transfers over SEs
- Deployment of StoRM SE
- Combination of dCache SE and Lustre

4 Motivation
Distributed storage solution:
- improve the performance of reconstruction jobs in the grid environment by changing the computing model from remote central storage to distributed local storage
- connect easily to local analysis jobs
- distribute raw/DST data from IHEP to collaboration members
[Diagram: in the central storage solution, every site CE reads randomtrg data from and writes output DST to the IHEP SE over the WAN, causing network jams and high SE load; in the distributed storage solution, each site CE downloads randomtrg data from and uploads output DST to its local site SE, and data are replicated/transferred between SEs.]

5 SE Data Transfer: Statistics
- 24.5 TB XYZ DST data (IHEP → USTC @ 3.20 TB/day)
- 4.4 TB randomtrg data (IHEP → USTC, JINR, WHU @ 1.95 TB/day)
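For scale, a quick back-of-envelope check of what the quoted rate implies (a rough sketch; decimal terabytes and a flat daily rate are assumed):

```python
# Rough throughput check for the XYZ DST transfer (IHEP -> USTC).
# Assumes decimal TB (1e12 bytes) and a constant 3.20 TB/day rate.
TB = 1e12                       # bytes
total = 24.5 * TB               # total XYZ DST volume
rate = 3.20 * TB / 86400        # bytes per second at 3.20 TB/day
print(f"duration: {total / (3.20 * TB):.1f} days")         # ~7.7 days
print(f"average:  {rate * 8 / 1e6:.0f} Mbit/s sustained")  # ~296 Mbit/s
```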

6 SE Data Transfer: Speed & Quality
[Plots: transfer speed for USTC, for JINR & WHU, and for UMN; transfer quality.]

7 SE Data Transfer: Web UI
Twiki:
[Screenshots of the web UI: submitting a request, viewing request status, operations buttons, and monitoring jobs.]

8 Deployment of StoRM SE
Successful case: WHU site, 39 TB StoRM SE
- RPM packaged by Sergey Belov at JINR and tested at IHEP
- easy to install and configure (one day of work)
- easy to maintain: good stability since its deployment on April 2, 2014
- WebDAV support for HTTPS access
Hardware of the WHU SE:
- 12-core Xeon 2.40 GHz, 8 GB memory
- 15 x 3 TB SATA disks, RAID-6
- two network interfaces, configured with a WAN IP and a LAN IP
Hardware information of current SEs
Detailed StoRM SE installation guide:
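As a quick sanity check after such a deployment, the SRM endpoint can be probed from a grid UI with the GFAL2 Python bindings. A minimal sketch; the endpoint URL and file path below are placeholders, not the real WHU values:

```python
import gfal2  # GFAL2 Python bindings, available on EMI/UMD grid UI hosts

ctx = gfal2.creat_context()
# Placeholder SRM URL -- substitute the real StoRM endpoint and storage area.
url = ("srm://storm.whu.edu.cn:8444/srm/managerv2"
       "?SFN=/bes/test/storm_probe.txt")
try:
    info = ctx.stat(url)
    print(f"OK, size = {info.st_size} bytes")
except gfal2.GError as err:
    print(f"endpoint problem: {err}")
```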

9 Combination of dCache and Lustre
Features:
- directories in Lustre can be mounted on the dCache SE
- read-only and read-write modes supported
Benefits:
- enlarges the capacity of the SE
- no need to transfer data between the SE and Lustre
- unified interface for users
To do:
- the metadata sub-system in the pool nodes is done
- the interface program between the pool nodes and Lustre is under development
- permission control needs careful consideration
- will go into the test phase in three months and be working at the end of this year

10 dCache and Lustre: Current System
- 1 server: DELL R720, CPU: E x2, memory: 32 GB
- 2 disk arrays: 24 x 3 TB each, RAID-6, capacity: 126 TB
- network: 10 Gbps Ethernet

11 dCache and Lustre: Future System
- head node (1 server): DELL/IBM/HP 1U, 2 CPUs, 32 GB memory, 1 Gbps Ethernet
- disk nodes (2 servers): DELL/IBM/HP 1U, 2 CPUs, 64 GB memory, 10 Gbps Ethernet, 8 Gb HBA
- disk arrays (3 servers): 24 x 3 TB (2 servers), 24 x 4 TB (1 server)
- capacity: 126 TB + 80 TB

12 dCache and Lustre: Development Status
Realized: direct read/write, metadata info
Under development: storage info

13 WMS Status
- Upgrade of GangaBOSS
- Tests of simulation+reconstruction
- Upgrade of OS and CVMFS at sites

14 gangaBOSS Release 1.0.6
New features:
- generates a dataset of output data and informs the user of the dataset name and LFN path
- registers metadata for output data
- supports simulation+reconstruction
- adds more minor statuses to the Logger info
- adds more error codes for debugging
- automatically uploads job logs to the SE and registers them in the DFC
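For reference, a typical submission in a ganga session looks roughly like the sketch below. Job(), Dirac(), and submit() are standard ganga; the Boss() application object and its fields are assumptions about the gangaBOSS interface, not verified names:

```python
# Inside a ganga session (ganga itself is Python).
j = Job()
j.application = Boss()                         # assumed gangaBOSS application class
j.application.optsfile = "jobOptions_sim.txt"  # assumed field: BOSS job-options file
j.backend = Dirac()                            # submit through the DIRAC WMS
j.submit()
# After the job completes, gangaBOSS 1.0.6 reports the output dataset name
# and LFN path, and the job logs are uploaded to the SE and registered in the DFC.
```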

15 Tests of Simulation+Reconstruction
- sites without an SE can download randomtrg data from other SEs
- the success rate is greater than 95% when the SE works fine
- GUCAS's network is too poor to download from IHEP
- UMN has 700+ cores, so the load on its SE is higher than at other sites
Other optimization schemes:
- Fabio's cloud storage?
- mount the SE on the WNs at a site? Benefits: reading transfers less data than downloading, and it supports splitting by event

16 Upgrade of OS and CVMFS at Sites
CVMFS has to be upgraded from 2.0 to 2.1:
- version 2.0 is no longer supported by CERN
- GUCAS and USTC WNs with the old version have been upgraded recently; PKU is upgrading (a quick client check is sketched below)
OS has to be upgraded from SL 5 to SL 6:
- BOSS will be upgraded to run on SL 6 at the end of this year
- sites are asked to update their OS
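After upgrading the client, the repository can be verified with the standard CVMFS self-test. A minimal sketch; the repository name boss.cern.ch is an assumption about where the BOSS software is published:

```python
import subprocess

# 'cvmfs_config probe' is the standard CVMFS client self-test; passing a
# repository name restricts the probe to that repository.
proc = subprocess.run(["cvmfs_config", "probe", "boss.cern.ch"],
                      capture_output=True, text=True)
print(proc.stdout)  # e.g. "Probing /cvmfs/boss.cern.ch... OK"
print("client OK" if proc.returncode == 0 else "probe failed")
```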

17 Site Status

 #  Site Name           Type     CPU Cores    SE       Capacity  OS       Site Status
 1  BES.IHEP-PBS.cn     Cluster  96           dCache   126 TB    SL 5.5   Running
 2  BES.UCAS.cn                  152                                      Running
 3  BES.USTC.cn                  228 ~ 896    dCache   24 TB     SL 5.7   Running
 4  BES.PKU.cn                   100                             SL 5.10  Running
 5  BES.JINR.ru         gLite    40 ~ 200     dCache   7.5 TB    SL 6.5   Running
 6  BES.UMN.us                   768          BeStMan  50 TB     SL 5.9   Running
 7  BES.WHU.cn                   200 ~ 400    StoRM    39 TB     SL 6.4   Running
 8  BES.INFN-Torino.it           200                                      Running
 9  BES.SDU.cn                   ~100                                     Preparing
10  BES.BUAA.cn                  ~256                            SL 5.8   Preparing
    Total                        1784 ~ 3368           246.5 TB

18 Site Monitoring
- CE: availability (WMS_send_test, BOSS_work_test, CPU_limit_test) and reliability statistics
- Host (worker nodes): job success rate statistics
- Network: ping the CE login node (a bare-bones version is sketched below)
- SE latency: data upload and replication
More tests will be added: SE transfer speed, SE usage information, dataset status, pilot monitoring
Author: Igor (JINR); details in Alexey's report
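A minimal stand-in for the network test (the real JINR monitor is more elaborate, and the host name below is a placeholder):

```python
import subprocess

def ping_ok(host: str, count: int = 4) -> bool:
    """Return True if the host answers ICMP ping -- a bare-bones version of
    the 'ping CE login node' check in the site monitoring."""
    proc = subprocess.run(["ping", "-c", str(count), host],
                          capture_output=True, text=True)
    return proc.returncode == 0

# Placeholder host name for a CE login node.
print(ping_ok("ce.example-site.cn"))
```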

19 Release of BESDIRAC v0r8
DIRAC version: v6r10pre17
besdirac-* toolkits included:
- dataset toolkit tested and added
- transfer toolkit refined
- download tool works with rsync
gangaBOSS upgraded:
- generates a dataset of output data
- registers metadata for output data
- supports simulation+reconstruction
The upgrade is transparent to users.

20 Usage for Private Users
General procedure:
- apply for a certificate (one week)
- submit jobs with ganga (one command)
- monitor jobs in the web UI
- download data from the SE to Lustre (one command; see the sketch below)
Features:
- supports simulation+reconstruction
- resources: more than 1500 CPU cores
- easy to monitor job status
Hypernews
Twiki
Contact persons: Xiaomei ZHANG, Xianghu, Tian YAN
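The "one command" download can also be scripted with the standard DIRAC client API. A minimal sketch; the LFN and destination directory are placeholders:

```python
# Standard DIRAC client API; run where the DIRAC client is installed and a
# valid proxy exists. The LFN and destination directory are placeholders.
from DIRAC.Core.Base import Script
Script.parseCommandLine(ignoreErrors=True)
from DIRAC.Interfaces.API.Dirac import Dirac

dirac = Dirac()
result = dirac.getFile("/bes/user/s/someone/sample/output_0001.dst",
                       destDir="/besfs/users/someone")
print("OK" if result["OK"] else result["Message"])
```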

21 Thanks
Thanks for:
- the resource contributions of the sites
- the effort of the site administrators and contact persons
- the effort of the developers at JINR
- the advice of the DIRAC experts
Thank you for your attention! Questions and answers.

