
1 Status of BESIII Distributed Computing BESIII Collaboration Meeting, Nov 2014 Xiaomei Zhang On Behalf of the BESIII Distributed Computing Group

2 Outline
– Major upgrade of the central server
– Private production support
– Central storage solutions
– Cloud computing
– Summary

3 Resources and Sites

  #   Site Name            Type     CPU Cores     SE Type   SE Capacity   Status
  1   CLOUD.IHEP.cn        Cloud    144           dCache    214 TB        Active
  2   CLUSTER.UCAS.cn      Cluster  152           -         -             Active
  3   CLUSTER.USTC.cn      Cluster  200 ~ 1280    dCache    24 TB         Active
  4   CLUSTER.PKU.cn       Cluster  100           -         -             Active
  5   CLUSTER.WHU.cn       Cluster  100 ~ 300     StoRM     39 TB         Active
  6   CLUSTER.UMN.us       Cluster  768           BeStMan   50 TB         Active
  7   CLUSTER.SJTU.cn      Cluster  100 ~ 360     -         -             Active
  8   GRID.JINR.ru         Grid     100 ~ 200     dCache    30 TB         Active
  9   GRID.INFN-Torino.it  Grid     200           -         -             Active
 10   Cluster.SDU.cn       Cluster  -             -         -             On the way
 11   Cluster.BUAA.cn      Cluster  -             -         -             On the way
      Total                         1864 ~ 3504             357 TB

– CPU resources total about 2,000 cores; storage is about 350 TB
– Site names have been adjusted and classified according to their type
– SJTU is a newly added site; JINR has increased its storage to 30 TB
– The local IHEP PBS site has been moved to the IHEP cloud site, in preparation for putting cloud into production

4 MAJOR UPGRADE FOR CENTRAL SERVER

5 Major upgrade
– BESDIRAC Cloud: v1r0 -> v1r2
– DIRAC: v6r10pre17 -> v6r10p25
– gLite -> EMI3 (grid middleware)
– SL5 -> SL6 (OS)
– The database has been separated from the main server: it reduces the load on the main server, is easier to maintain, and makes the next upgrade easier
– Large-scale tests (jobs, data query, data transfer, etc.) have been carried out and show that the new server is in good shape

6 Jobs
– ~1,500 jobs running
– 98.7% success rate

7 Data File Catalog and Transfers
The Data File Catalog is working fine:
– More than 600,000 files are registered in the file catalog
– Dataset and replica queries have been tested (see the query sketch below)
– A single metadata query takes about 0.5 s
The data transfer system is OK:
– Two batches of DST data were transferred from the IHEP SE to the WHU SE: 8 TB of XYZ data
– The transfer speed can reach 90+ MB/s
– The one-time success rate is above 99%
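
For reference, a catalog query of this kind can be reproduced with the standard DIRAC client API. The fragment below is a hedged sketch only: the LFN and metadata field names are hypothetical placeholders, not the real BESIII catalog layout.

    # Rough sketch of a DIRAC File Catalog query (illustrative only).
    # The LFN and metadata fields below are hypothetical placeholders.
    from DIRAC.Core.Base import Script
    Script.parseCommandLine()          # initialize the DIRAC client environment

    from DIRAC.Resources.Catalog.FileCatalog import FileCatalog

    fc = FileCatalog()

    # Replica lookup for one registered file (hypothetical LFN)
    res = fc.getReplicas('/bes/user/example/sample_0001.dst')
    if res['OK']:
        print(res['Value']['Successful'])

    # Dataset-style lookup by metadata (field names are placeholders)
    res = fc.findFilesByMetadata({'expNum': 'exp1', 'runFrom': 20000})
    if res['OK']:
        print('%d files matched' % len(res['Value']))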

8 PRIVATE PRODUCTION SUPPORT

9 User Job Status
More than 12,000 user jobs have been completed successfully:
– 3,500 are Simulation+Reconstruction+Analysis jobs, with a 98% one-time success rate
– 4,000 are Simulation+Reconstruction jobs with a customized (DIY) generator
– About 7,500 jobs / 63M events in the period 2014.10.27 – 2014.11.15
[Plot: job status of all private user jobs]

10 Simulation with customized generator
The normal simulation process is done with the officially published BOSS. For simulation with a customized generator, users run the simulation with their own compiled generator:
– The customized generator needs to be shipped with the jobs
– The customized generator needs to be used instead of the official one
How to use:
– Just specify your own generator library in the GangaBOSS configuration (a fuller job-script sketch follows below):
  j.inputsandbox.append('…/InstallArea/x86_64-slc5-gcc43-opt/lib/libBesEvtGenLib.so')
Feedback and answers:
– Can my own library be added to the configuration automatically? Yes, if needed.
– It is too slow to submit a large number of jobs with a big user library (~10 MB): currently each job needs to upload this big library to the central server. This will soon be improved by keeping a single replica in an SE that is shared among the related user jobs.
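
As a concrete illustration, the fragment below shows where the extra library line goes in a GangaBOSS job script run inside a Ganga session. Only the inputsandbox call comes from the slide; the surrounding job setup is abbreviated, and the path prefix is a placeholder for your own work area.

    # Sketch of shipping a customized generator library with a GangaBOSS job.
    # Only j.inputsandbox.append(...) is from the slide; the rest is a hedged
    # sketch and <workarea> is a placeholder for your own BOSS work area.
    j = Job()                              # the usual Ganga job object

    # ... standard BOSS application / splitter configuration goes here ...

    # Ship the privately compiled generator library with the job so the
    # worker node picks it up instead of the official one.
    j.inputsandbox.append(
        '<workarea>/InstallArea/x86_64-slc5-gcc43-opt/lib/libBesEvtGenLib.so')

    j.submit()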

11 Simulation+Reconstruction+Analysis (1)
[Diagram: in the Simu+Reco mode, Simulation and Reconstruction run on the distributed system, the DST files are downloaded, and the Analysis producing ROOT files runs on the local farm; in the Simu+Reco+Ana mode, Simulation, Reconstruction and Analysis all run on the distributed system and only the ROOT files are downloaded to the local farm.]
These three processes can be done sequentially in one job:
– No input needed
– Returns ntuple ROOT files
This job type is highly recommended:
– For users, it takes just one step to complete all three processes and get the final results

12 Simulation+Reconstruction+Analysis (2)
– For the distributed system, it greatly reduces data movement:
– No intermediate data movement is needed
– The ROOT ntuple file is normally much smaller than the DST or raw files
It is a good use case for the distributed computing system.
[Plot: data volumes for sim+rec vs. sim+rec+ana]

13 Simulation+Reconstruction+Analysis (3)
How to use:
– Specify the sim, rec and ana job option files in the GangaBOSS configuration to start (see the sketch below)
Feedback:
– Can the intermediate output also be returned in some cases? It can be done; we will provide a way for you to decide which files are returned.
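
A minimal sketch of what such a configuration might look like is given below. The attribute names (application class, option-file list) and the file names are hypothetical placeholders; take the real spelling from the BESDIRAC BOSS job guide linked on the Support slide.

    # Sketch only: the attribute names below are hypothetical placeholders;
    # see the BESDIRAC BOSS job guide for the real option names.
    j = Job()
    j.application = Boss()                          # GangaBOSS application (name assumed)
    j.application.optsfile = ['jobOptions_sim.txt',   # simulation job options
                              'jobOptions_rec.txt',   # reconstruction job options
                              'jobOptions_ana.txt']   # analysis job options
    j.submit()                                      # only the ntuple ROOT files come back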

14 Physics Validation
Physics validation done by a physics user:
– psi(4160), 9 decay modes, 200,000 events per mode
– Same splitting and random seeds
The results show that the reconstructed DST data from distributed computing and from the local farm are exactly identical.
[Plot: one of the 9 modes, distributed computing vs. local farm; graph from SUN Xinghua]

15 Support
Four job types are supported now:
– Simulation (returns rtraw files)
– Simulation + Reconstruction (returns dst files)
– Simulation + Reconstruction + Analysis (returns ntuple ROOT files)
– Simulation with a customized generator (e.g. a DIY generator)
Currently supported BOSS versions: 6.6.2, 6.6.3, 6.6.3.p01, 6.6.4, 6.6.4.p01, 6.6.4.p02
A detailed user guide is provided on the twiki:
– How to submit a BOSS job to distributed computing: http://boss.ihep.ac.cn/~offlinesoftware/index.php/BESDIRAC_User_Tutorial
– How to submit the different types of BOSS jobs: http://docbes3.ihep.ac.cn/~offlinesoftware/index.php/BESDIRAC_BOSS_Job_Guide
You are welcome to use it, and your feedback is valuable to us!

16 CENTRAL STORAGE SOLUTION

17 Central Storage
Central storage plays a major role in the BESIII computing model:
– It shares DST and random trigger data with the sites
– It accepts and saves MC output from remote sites
Central storage components:
– Lustre (local file system) holds the DST data and random trigger data
– The SE (grid Storage Element, dCache) exposes a grid interface to access the data from outside IHEP
Current data flow:
– Lustre and the SE are completely separated
– Data must be copied manually between Lustre and the SE
Improvement:
– Automate and speed up, or eliminate, any data movement between Lustre and the SE by uniting them closely

18 dCache + Lustre Solution (Xiaofei Yan)
[Diagrams: separated model vs. combined model]
1. The administrator copies data from Lustre to dCache.
2. Users can access the data from dCache.
Testbed based on the current infrastructure:
– dCache version 2.6.33
– One cache pool with an 88 TB disk array added

19 dCache + Lustre Read Test
Transfer of 1 TB of data from the dCache+Lustre SE to the WHU SE:
1. Register the Lustre metadata in the dCache DB (~7 minutes without checksum)
– Average: 83.5 MB/s, peak: 93.1 MB/s
– One-time success rate: 97.0%
– One-time success rate: 100%

20 StoRM + Lustre Solution
[Figure: architecture of StoRM]
StoRM is a Storage Resource Manager that provides a grid interface to POSIX file systems, e.g. Lustre and GPFS:
– Lightweight architecture
– Widely used in LCG
StoRM has already been used successfully as a grid storage solution for remote sites in the BESIII distributed system: the WHU SE, based on StoRM, is working well.
The testbed has been set up:
– A single StoRM server with the Lustre file system "/cefs" attached
– StoRM version 1.11.4

21 StoRM + Lustre Read Test
Transfer of 2 TB of data from the StoRM+Lustre SE to the WHU SE:
– One-time success rate: 100%
– Average: 80.9 MB/s, peak: 91.9 MB/s

22 StoRM + Lustre Write Test
About 7,000 BOSS jobs were used, with 1,300 jobs running at peak time:
– 900 GB of job output was written back to the StoRM+Lustre SE (IHEP-STORM)
– At the same time, this output can be seen and read directly from Lustre on the IHEP local farm
With this solution, users do not need to download data from the grid to the local farm for further analysis: grid data = local data (illustrated in the sketch below).
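
To make "grid data = local data" concrete, the sketch below writes a job output to the IHEP-STORM SE with the data-management client of the DIRAC v6r10 generation (ReplicaManager) and notes that the same file is then visible through Lustre. The LFN, the local file name and the Lustre comment are illustrative assumptions, not the production workflow.

    # Illustration of "grid data = local data", not the production workflow.
    # The LFN and local file name are made-up examples; 'IHEP-STORM' is the
    # SE name used in the write test above.
    from DIRAC.Core.Base import Script
    Script.parseCommandLine()

    from DIRAC.DataManagementSystem.Client.ReplicaManager import ReplicaManager

    rm = ReplicaManager()
    lfn = '/bes/user/example/job123/ntuple_001.root'     # hypothetical LFN

    # Grid side: write the job output back to the StoRM+Lustre SE
    res = rm.putAndRegister(lfn, 'ntuple_001.root', 'IHEP-STORM')
    print(res['OK'])

    # Local side: because the SE sits directly on Lustre, the same file is
    # then readable on the IHEP farm under the corresponding Lustre path,
    # with no extra download step.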

23 Comparison (dCache + Lustre | StoRM + Lustre)
– Hardware: needs an extra disk array as cache pool | just mount Lustre
– Software: needs extra scripts to be developed for metadata synchronization | no extra development required
– SE transfer speed: 83.5 MB/s | 80.9 MB/s
– Read/Write: can read, but the metadata must be registered first and write support is under development | supports both read and write
– Data movement: between cache pool and Lustre | none
– Security: grid authentication | grid authentication
The StoRM solution is easier to install and maintain; no extra development is required.
The StoRM solution could be more efficient, since it needs neither advance registration of the Lustre metadata nor data movement.
StoRM is a promising solution, and we will do more tests before making a final decision.

24 CLOUD COMPUTING

25 Cloud Integration
Distributed computing has integrated cloud resources based on the pilot schema, which allows dynamic scheduling. The cloud resources in use can be shrunk and extended dynamically according to the job requirements (a toy sketch of this expand/shrink logic follows the diagram below).
[Diagram: when users submit jobs, the distributed computing system creates VMs in the cloud; the VMs pull and run jobs; once no jobs are left and the running jobs have finished, the VMs are deleted.]
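
The following self-contained toy summarises the expand/shrink decision pictured above. It is not the real cloud scheduler code; the function name and the numbers in the example calls are placeholders.

    # Toy of the expand/shrink decision in the diagram above (not the real
    # scheduler); the function and example numbers are placeholders.

    def vm_actions(waiting_jobs, running_vms, idle_vms, max_vms=10):
        """Return (vms_to_boot, vms_to_delete) for one scheduling pass."""
        # Expand: boot VMs while jobs are waiting and the quota allows it.
        to_boot = min(waiting_jobs, max(max_vms - running_vms, 0))
        # Shrink: delete VMs whose pilots report that no jobs are left.
        to_delete = idle_vms if waiting_jobs == 0 else 0
        return to_boot, to_delete

    # The three panels of the diagram, roughly:
    print(vm_actions(waiting_jobs=5, running_vms=0, idle_vms=0))  # (5, 0): create VMs
    print(vm_actions(waiting_jobs=0, running_vms=5, idle_vms=0))  # (0, 0): VMs busy running jobs
    print(vm_actions(waiting_jobs=0, running_vms=5, idle_vms=5))  # (0, 5): delete idle VMs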

26 Cloud Sites
5 cloud sites from Torino, JINR, CERN and IHEP have been set up and connected to the distributed computing system:
– About 320 CPU cores, 400 GB memory, 10 TB disk

27 Cloud Tests
More than 4,500 jobs have been completed with a 96% success rate.
– The failure reason is a lack of disk space; the disk space will be extended in the IHEP cloud
– We expect to do larger-scale tests with the support of the Torino cloud site

28 Performance and Physics Validation
Performance tests have shown that the running times at the cloud sites are comparable with those at the other production sites:
– Simulation, reconstruction, download of random trigger data
Physics validation has shown that the physics results from clusters and cloud sites are highly consistent.

29 User Support
Cloud usage is transparent to BESIII physics users through the distributed computing system:
– Users can specify cloud sites through GangaBOSS in the same way as other sites, if needed (see the sketch below)
Cloud sites will be opened to users after the collaboration meeting:
– Default environment: Scientific Linux 6.5, BOSS software
– http://docbes3.ihep.ac.cn/~offlinesoftware/index.php/BESDIRAC_Cloud_Guide
With the flexibility of the cloud, it is interesting for users with special requirements to try:
– A different OS, software environment, etc. from the clusters
– Let us know your requirements
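
One possible way to steer a job to a cloud site is sketched below. The 'settings' dictionary shown is how the generic Ganga Dirac backend pins a destination site; it is an assumption here, so check the BESDIRAC cloud guide above for the exact GangaBOSS form.

    # Hedged sketch: steering a job to a cloud site. The 'settings' attribute
    # is assumed from the generic Ganga Dirac backend, not confirmed for
    # GangaBOSS; the site name is taken from the resources table.
    j = Job()
    # ... usual BOSS application configuration ...
    j.backend.settings['Destination'] = 'CLOUD.IHEP.cn'
    j.submit()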

30 Future Plan
Further strengthen user support:
– User tutorials will be provided regularly if needed
– More improvements will be made according to user feedback
Make cloud resources easier to manage centrally:
– Improve cloud monitoring and configuration to ease the life of the central administrators
More effort will go into making the system more robust:
– Take care of big input sandboxes (user packages)
– Push the usage of the mirrored offline database, implementing real-time synchronization
– Consider a redundant central server to avoid a single point of failure

31 Summary
– The distributed computing system remains in good shape after the major upgrade
– Private user production is well supported, with two more job types added
– In the central storage tests, StoRM + Lustre was found to be a promising solution
– Cloud will be moved into production before the end of the year

32 Thanks for your attention! Thank you for your feedback! Thanks to the resource contributors! Thanks to all the site administrators for their help and participation!

