Status of StoRM+Lustre and Multi-VO Support YAN Tian Distributed Computing Group Meeting Oct. 14, 2014.

1 Status of StoRM+Lustre and Multi-VO Support
YAN Tian
Distributed Computing Group Meeting, Oct. 14, 2014

2 SE: StoRM/dCache + Lustre Test
Test method:
– For dCache: wget from WebDAV, 200~250 MB file, 12 tests, dropping the lowest and highest records
– Download commands are like:
  wget http://cmspn06.ihep.ac.cn:2880/bes/lustre/besfs/groups/grid/zhanggang/2013SepSimulation/psipp/664p01/mc/DpDm/round03/stream001/664p01_psipp_DpDm_stream001_run12526_file0001.rtraw
  wget http://cream.ihep.ac.cn:2880/bes/besfs/groups/grid/zhanggang/2013SepSimulation/psipp/664p01/mc/DpDm/round03/stream001/664p01_psipp_DpDm_stream001_run12526_file0001.rtraw
Destination                  Min. Speed   Max Speed   Average Speed
cmspn06 (202.122.33.25)      120 MB/s     165 MB/s    142.6 MB/s
badger02 (202.122.33.116)    90 MB/s      109 MB/s    98.6 MB/s
badger01 (202.122.37.82)     104 MB/s     110 MB/s    107.4 MB/s
StoRM + Lustre performance: (cp file from Lustre @ 24.6 MB/s)
dCache + Lustre performance: (cp file from Lustre @ 16.5~29.1 MB/s)
Destination                  Min. Speed   Max Speed   Average Speed
cream (202.122.33.44)        21.3         21.4        -
badger02 (202.122.33.116)    22.1         31.8        -
badger01 (202.122.37.82)     19.2         37.3        23.2 MB/s (11 tests)
It's working! But more tests on a physical machine are needed (in progress … OS installed).
Compare to:
– download from SE to Lustre @ ~20 MB/s
– gridtb cp file from Lustre @ 6.9 MB/s
– gilda128 cp file from Lustre @ 9.2 MB/s
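
The averaging rule in the test method (repeat the download, drop the lowest and highest records, average the rest) can be sketched as below. This is only an illustration: the function name `trimmed_average` and the sample speeds are made up for the example and are not the measured values from the slide.

```python
def trimmed_average(speeds):
    """Average the records after dropping one lowest and one highest value."""
    if len(speeds) < 3:
        raise ValueError("need at least 3 records to drop the extremes")
    kept = sorted(speeds)[1:-1]  # discard the minimum and the maximum
    return sum(kept) / len(kept)


# Hypothetical sample of 12 download speeds in MB/s (NOT the measured data).
sample = [98.0, 101.5, 90.0, 109.0, 99.0, 97.5,
          100.0, 96.0, 102.0, 98.5, 95.0, 103.0]
print("records kept:", len(sample) - 2)
print("trimmed average: %.1f MB/s" % trimmed_average(sample))
```

Dropping the two extremes keeps a single unusually slow or fast transfer from dominating the average of a small sample.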

3 CEPC Mirror Database
– Done in my VM and on cream.ihep.ac.cn (StoRM SE)
– Successfully configured and running
– Will deploy on WHU SE later

4 CEPC Resources Status
Contributor   Contact          CPU Cores    OS            Status
IHEP          ZHANG Xiaomei    144          SL 5.5/6.5    Joined at 2014.9.7
WHU           CAI Hao          100 ~ 200    SL 6.4        Joined at 2014.9.8
BUAA          SHEN Chengping   20           SL 5.8        Joined at 2014.9.10
SJTU          YANG Haijun      100 ~ 360    SL 6.5        Joined at 2014.9.19
GXU           LIU Hongbang     50 ~ 96      CentOS 5.10   Joined at 2014.9.30
PKU           WANG Dayong      100          -             waiting for new computing room
SDU-1         MA Lianliang     -            -             2~3 weeks upgrade
SDU-2         HUANG Xingtao    -            -             confirmed at 2014.9.21
NCEPU         HAN Ran          -            -             will prepare workstation
Total                          514 ~ 920
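
As a quick check, the Total row can be reproduced from the CPU Cores column. The sketch below assumes that a single number counts as both the lower and upper bound for that site, and that sites with no core count listed (SDU-1, SDU-2, NCEPU) contribute nothing yet.

```python
# (site, minimum cores, maximum cores) as read from the CPU Cores column.
cores = [
    ("IHEP", 144, 144),
    ("WHU", 100, 200),
    ("BUAA", 20, 20),
    ("SJTU", 100, 360),
    ("GXU", 50, 96),
    ("PKU", 100, 100),
]

total_min = sum(low for _, low, _ in cores)
total_max = sum(high for _, _, high in cores)
print("%d ~ %d" % (total_min, total_max))  # prints "514 ~ 920"
```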

5 CEPC Job Submission Tool: Basic
① Prepare input.stdhep files
  Groups of 10, 30, 100, etc. files are copied to:
  – /besfs/groups/grid/besdirac/cepc/stdhep/input_0010
  – /besfs/groups/grid/besdirac/cepc/stdhep/input_0030
  – /besfs/groups/grid/besdirac/cepc/stdhep/input_0100
  – /besfs/groups/grid/besdirac/cepc/stdhep/input_0500
  – /besfs/groups/grid/besdirac/cepc/stdhep/input_1000
  File size: each ~33 MB, total ~54 GB
  They are uploaded to SE:
  – LFN: /test/cepc/stdhep/input_****/
  – upload/download/delete tool
② Define user-specified parameters
  Paths & directories:
  – directory to write job scripts
  – template directory in Lustre
  – input data directory in Lustre
  – input data directory in SE
  – output data directory in SE
  Job parameters:
  – no. of events/job
  – DB hosts
  – first random seed
  – site name, job group
③ Generate scripts from templates (✓ = needn't change)
  Job control: job.jdl, ✓ job.py
  Simulation: init.macro, event.macro
  Reconstruction: init.xml, ✓ ILD_o2_v06.xml, ✓ PandoraLikelihoodData9EBin.xml, ✓ PandoraSettingsDefault.xml
④ Replace strings in templates
  event.macro
  – INPUT_FILE_NAME (line 1)
  – EVENT_NO_PER_JOB (line 2)
  init.macro
  – RANDOM_SEED (line 6)
  – DB_HOST (line 8)
  – SIM_OUTPUT_FILE_NAME (line 12)
  init.xml
  – SIM_OUTPUT_FILE_NAME (line 82)
  – REC_OUTPUT_FILE_NAME (line 95)
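
Steps ③ and ④ amount to filling placeholder strings in template files once per job. A minimal sketch of that idea follows. It is not the actual tool: the helper names `fill_template` and `job_values`, the template text, the host name, the output file naming, and the rule "seed = first seed + job index" are all assumptions made for illustration. Only the placeholder names come from the slide.

```python
import re


def fill_template(text, values):
    """Replace each placeholder (matched as a whole word) with its value."""
    if not values:
        return text
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, values)) + r")\b")
    return pattern.sub(lambda m: str(values[m.group(1)]), text)


def job_values(job_index, input_file, events_per_job, first_seed, db_host):
    """Build the per-job substitutions (naming scheme is hypothetical)."""
    stem = input_file.rsplit(".", 1)[0]
    return {
        "INPUT_FILE_NAME": input_file,
        "EVENT_NO_PER_JOB": events_per_job,
        "RANDOM_SEED": first_seed + job_index,  # assumed: one seed per job
        "DB_HOST": db_host,
        "SIM_OUTPUT_FILE_NAME": stem + "_sim.slcio",
        "REC_OUTPUT_FILE_NAME": stem + "_rec.slcio",
    }


# Hypothetical template content; the real macro syntax is not shown here.
template = "input = INPUT_FILE_NAME\nevents = EVENT_NO_PER_JOB\nseed = RANDOM_SEED\n"

for i, name in enumerate(["file_0001.stdhep", "file_0002.stdhep"]):
    values = job_values(i, name, 100, 1000, "db.example.org")
    print(fill_template(template, values))
```

Matching whole words keeps a shorter placeholder from being replaced inside a longer one, for example `OUTPUT_FILE_NAME` inside `SIM_OUTPUT_FILE_NAME`.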

6 Test Result 1
– Program written by SUO Bin
– 10 jobs generated, submitted, and done

7 Test Result 2
– Data output written to WHU SE

8 Job Submission Tools: Class interface design
UserParameters
– user-specified parameters as attributes (evtPerJob, inputDir, outputDir, jobGroup, etc.)
– userParameters.get()
Splitter (base class)
SplitterByStdhep (inherits from Splitter)
– parameter: userParameters instance
Job
– job.submit()
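
One possible shape for this interface is sketched below. Only the class names, the listed attributes, `get()`, and `submit()` come from the slide. Everything else is assumed for illustration: the `split()` method, passing the file list explicitly instead of reading `inputDir`, the signature of `get()`, and `submit()` only marking the job rather than contacting the grid.

```python
class UserParameters:
    """User-specified parameters, stored as attributes."""

    def __init__(self, **params):
        self._params = dict(params)
        for key, value in params.items():
            setattr(self, key, value)

    def get(self, name=None, default=None):
        # Assumed behaviour: no name returns all parameters as a dict.
        if name is None:
            return dict(self._params)
        return self._params.get(name, default)


class Job:
    def __init__(self, params, input_file, index):
        self.params = params
        self.input_file = input_file
        self.index = index
        self.status = "created"

    def submit(self):
        # Placeholder: a real tool would hand the job to the grid here.
        self.status = "submitted"
        return self.status


class Splitter:
    """Base class: turns user parameters into a list of jobs."""

    def __init__(self, user_parameters):
        self.params = user_parameters

    def split(self):
        raise NotImplementedError


class SplitterByStdhep(Splitter):
    """One job per input .stdhep file (file list passed in for this sketch)."""

    def __init__(self, user_parameters, stdhep_files):
        super().__init__(user_parameters)
        self.files = list(stdhep_files)

    def split(self):
        return [Job(self.params, f, i) for i, f in enumerate(self.files)]


params = UserParameters(evtPerJob=100, inputDir="/in", outputDir="/out", jobGroup="test")
jobs = SplitterByStdhep(params, ["a.stdhep", "b.stdhep"]).split()
print([job.submit() for job in jobs])
```

Keeping the splitting rule in a subclass means another rule (for example, splitting by event count) could be added without touching `Job` or `UserParameters`.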

9 Recent Plan: rewrite SE upload/download tool
Challenges:
– 45 min CPU time limit
– 2 hours auto logout
– can use screen?
Plan:
– use subprocess to handle data transfer
– use an sqlite3 DB to record transfer status
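
The plan above (run each transfer in a child process and record its status in SQLite) could look roughly like the following sketch. The table layout, the function names, and the stand-in commands are assumptions for illustration. A real tool would call its actual transfer command where this example runs a trivial Python one-liner.

```python
import sqlite3
import subprocess
import sys


def init_db(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transfers ("
        "lfn TEXT PRIMARY KEY, status TEXT NOT NULL, returncode INTEGER)"
    )
    conn.commit()


def run_transfer(conn, lfn, command):
    """Run one transfer command in a child process and record the outcome."""
    conn.execute(
        "INSERT OR REPLACE INTO transfers VALUES (?, 'running', NULL)", (lfn,)
    )
    conn.commit()
    result = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    status = "done" if result.returncode == 0 else "failed"
    conn.execute(
        "UPDATE transfers SET status = ?, returncode = ? WHERE lfn = ?",
        (status, result.returncode, lfn),
    )
    conn.commit()
    return status


def unfinished(conn):
    """Files that still need a (re)transfer, e.g. after an interrupted session."""
    rows = conn.execute("SELECT lfn FROM transfers WHERE status != 'done' ORDER BY lfn")
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")  # a file path would persist across logins
init_db(conn)
ok_cmd = [sys.executable, "-c", "pass"]                       # stand-in: succeeds
bad_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]   # stand-in: fails
run_transfer(conn, "/test/cepc/stdhep/input_0010/a.stdhep", ok_cmd)
run_transfer(conn, "/test/cepc/stdhep/input_0010/b.stdhep", bad_cmd)
print(unfinished(conn))
```

Because every state change is committed immediately, a session that is logged out partway through can query `unfinished()` on the next run and resume from there.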

10 Job Submission & Data Management: a combined workflow for user
input data (Lustre → SE) (each job 30~50 MB)
Job Steps:
  (1) input data (SE → WN)
  (2) Sim.
  (3) Rec.
  (4) output data (WN → SE)
output data (SE → Lustre) (each job 1~3 GB)
Job Steps:
  (1) input data (Lustre → SE)
  (2) input data (SE → WN)
  (3) Sim.
  (4) Rec.
  (5) output data (WN → SE)
  (6) output data (SE → Lustre)

11 Job Submission & Data Management: a combined workflow
Benefits:
– users don't need to operate on SE
– feels like PBS
– saves time (since data is transferred in each job)
To do:
– develop upload/download tools
– dedicated machine & service for SE ↔ Lustre data transfer (permission control)
– new CLI/WEB portal for username/passwd access, no proxy needed
A machine:
– mount /besfs, /cepcfs
– mount /afs
– no CPU time limit
A service:
– run by root
– root → user (data upload/download)
– use yant's proxy
For each job:
– data transfer request from frontend system
– job step status info. to job monitoring sys.
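
The six-step combined workflow, with per-step status passed on to a monitoring system, can be pictured with the small sketch below. It is purely illustrative: the `run_workflow` function, the `report` callback, and the no-op actions are hypothetical stand-ins, and no real transfer or simulation is performed.

```python
def run_workflow(steps, report):
    """Run steps in order, reporting status; stop at the first failure."""
    for number, (name, action) in enumerate(steps, start=1):
        report(number, name, "running")
        try:
            action()
        except Exception as exc:
            report(number, name, "failed: {}".format(exc))
            return False
        report(number, name, "done")
    return True


def no_op():
    """Stand-in for the real work of a step."""


STEP_NAMES = [
    "input data (Lustre -> SE)",
    "input data (SE -> WN)",
    "Sim.",
    "Rec.",
    "output data (WN -> SE)",
    "output data (SE -> Lustre)",
]

steps = [(name, no_op) for name in STEP_NAMES]
log = []  # stands in for the job monitoring system
ok = run_workflow(steps, lambda n, name, status: log.append((n, name, status)))
for entry in log:
    print(entry)
```

Stopping at the first failed step avoids running the simulation when the input transfer did not succeed, and the recorded status shows which step to retry.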

12 CEPC Physics Validation: Status
The following results are consistent:
– different jobs at the same site (except BUAA)
– two cloud sites: OpenNebula and OpenStack
– IHEP-PBS (SL 5.5) and YANT-TEST (SL 5.10)
The following results are not consistent:
– IHEP-PBS (SL 5.5) and WHU (SL 6.4)
– IHEP-PBS (SL 5.5) and BUAA (SL 5.8)

13 CEPC Physics Validation: Test Plan
Later plan detail:
– software compiled in SL 5.10 and SL 6.5
– virtual machines with OS 5.5/5.8/5.10 mount repository 5.10
– virtual machines with OS 6.4/6.5 mount repository 6.5
– use a static DB
Diagram:
– repo-6.5 (ILCsoft compiled in SL 6.5): VM SL 6.5, VM SL 6.4
– repo-5.10 (ILCsoft compiled in SL 5.10): VM SL 5.5, VM SL 5.8, VM SL 5.10
– testing DB (static?)

