
1 StoRM + Lustre Proposal
YAN Tian, on behalf of the Distributed Computing Group
2014.12.10

2 INTRODUCTION TO STORM
1. Architecture
2. Security (X509)
3. User Access Management
4. Server Scalability

3 StoRM Architecture Overview
Simple architecture:
– FE handles authorization and SRM requests
– DB stores asynchronous SRM request information
– BE executes synchronous/asynchronous requests and binds to the underlying file system
StoRM acts as a frontend to the storage at a site

4 StoRM Security
StoRM relies on user credentials for user authentication and authorization.
StoRM supports VOMS extensions and can use them to define access policies (complete VOMS-awareness).

5 User Access Management
StoRM manages access to a file in several steps:
1. The user makes a request with his proxy
2. StoRM checks whether the user may perform the requested operation on the required resource
3. StoRM asks the LCMAPS service for the user mapping
4. StoRM enforces a real ACL on the requested files and directories
5. Jobs running on behalf of the user can then access the data directly
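Steps 1, 3, and 4 above correspond to standard grid and POSIX tools; a minimal sketch (the VO name, local account, and path are illustrative, not taken from the slides):

```shell
# Step 1: the user creates a VOMS proxy (VO name is illustrative)
voms-proxy-init --voms bes

# Steps 3-4, on the StoRM backend: LCMAPS maps the grid identity to a
# local account, and StoRM enforces a real POSIX ACL for that account
# on the requested directory (account and path are hypothetical)
setfacl -m u:besdirac:rwx /cefs/dirac/user/data
getfacl /cefs/dirac/user/data    # verify the new ACL entry
```

These commands require a grid environment and an ACL-enabled file system, so they are a sketch of the flow rather than a runnable recipe.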

6 Scalability: single host vs. clustered deployment

7 STORM + LUSTRE PERFORMANCE TEST
1. Test Bed
2. SE Transfer Out Test (besdirac's dir. read)
3. Job Write to Lustre Test (besdirac's dir. write)
4. SE Transfer In Test (besdirac's dir. write)
5. DST Data Transfer between SEs (other user's dir. read; to be taken)
6. Multi-VO Support Test

8 Test Bed
Single server without data disk
10 Gbps network
/cefs mounted with ACL enabled
A subdirectory of /cefs (owned by account besdirac) is bind-mounted to the StoRM pub directory
Model: Dell PowerEdge R620
CPU: Xeon E5-2609 v2 (8 cores)
Memory: 64 GB
HDD: 300 GB SAS RAID-1
Network: 10 Gbps
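The two mount operations in the test bed might look as follows (the MGS node name and directory paths are assumptions, not from the slides):

```shell
# Mount the Lustre file system with ACL support enabled
# (MGS node and fsname are hypothetical)
mount -t lustre -o acl mgs@tcp:/cefs /cefs

# Expose only the besdirac-owned subtree to StoRM by bind-mounting it
# into the StoRM publication directory (paths are hypothetical)
mount --bind /cefs/dirac /storm/sa/besdirac
```

Both commands require root on the SE host; they sketch the configuration rather than reproduce the exact production setup.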

9 SE Transfer Out Test
Test procedure:
1. prepare 2,000 files of 1 GB each, located in /cefs
2. register their metadata in the DIRAC DFC
3. transfer the dataset to the remote SE at WHU
Test results:
1. DFC registration takes 70 seconds, i.e., 35 seconds per 1k files
2. average transfer speed is 80.9 MB/s, peak speed is 91.9 MB/s
3. one-time success rate is 100%
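The quoted figures are internally consistent; a quick sanity check with integer arithmetic (binary units):

```shell
# 2,000 files x 1 GB moved in ~7 hours
total_mb=$((2000 * 1024))          # dataset size in MB
seconds=$((7 * 3600))              # 7 hours in seconds
avg=$((total_mb / seconds))        # ~81 MB/s, matching the quoted 80.9
echo "average ~${avg} MB/s"

# 2,000 files registered in the DFC in 70 s
reg_per_1k=$((70 * 1000 / 2000))   # 35 s per 1k files
echo "DFC registration: ${reg_per_1k} s per 1k files"
```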

10 [Transfer monitoring plot] IHEP-STORM to WHU-USER: average 80.9 MB/s, peak 91.9 MB/s; 2 TB of data transferred in 7 hours

11 [Transfer quality plot] 2,000 files of 1 GB each, 100% success

12 Job Write to Lustre Test
Facts about the test jobs:
– 200M total events, bhabha sim. + rec.
– split by run, 20k max events/job
– 10,929 total jobs submitted
– 10,282 jobs done (94.1%)
– reasons for failed jobs:
353 stalled (USTC unstable power supply)
275 overload (UMN node error)
6 application failures
13 network failures
– 1.4 TB of data generated and uploaded to StoRM+Lustre (IHEP-STORM)
Test results:
1. no job failed because of an error uploading output data
2. 1.4 TB of output data written to the test SE with a high success rate
3. output is immediately visible on Lustre
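The job counts above add up; a quick consistency check:

```shell
submitted=10929
done_jobs=10282
# the four failure categories listed on the slide
failed=$((353 + 275 + 6 + 13))     # 647 failed jobs = submitted - done
echo "failed: ${failed}"

# success rate in tenths of a percent, rounded: 941 -> 94.1%
rate=$(( (done_jobs * 1000 + submitted / 2) / submitted ))
echo "success rate: ${rate} (tenths of a percent)"
```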

13 [Job monitoring plot] 3 days, >10K jobs

14 [Job status plot] 94.2% success rate; no job failed uploading output data

15 [Storage accounting plot] ~1.4 TB of output data written to StoRM+Lustre

16 [Transfer quality plot] data uploaded with good quality

17 Output data can be seen on Lustre: data is visible immediately after being written

18 SE Transfer In Test
Facts:
– transfer from the UMN SE to /cefs/tmp_storage/yant/transfer/DsHi
– 2.3 TB MC sample (dst, rtraw, logs, scripts)
– 16,011 files
– registered in the DFC in 12m50s (48 s/1k files)
– speed: 20-30 MB/s
– quality: > 99%
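The per-1k registration rate follows directly from the quoted numbers:

```shell
files=16011
seconds=$((12 * 60 + 50))    # 12m50s = 770 s
# registration time per 1k files, rounded to whole seconds
per_1k=$(( (seconds * 1000 + files / 2) / files ))
echo "DFC registration: ${per_1k} s per 1k files"
```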


21 Multi-VO Test
Currently supported VOs: bes, cepc, juno
Each VO's users can read/write the VO's own root directory
Users from one VO cannot access other VOs' files
A test was performed:
1. initialize a proxy as a cepc VO user
2. check whether the bes VO's directory is accessible
3. check whether the cepc VO's directory is accessible
4. srmcp test (read/write)
Test results:
1. the cepc user cannot visit the bes VO's directory
2. the cepc user can read/write its own VO's directory
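The four test steps can be sketched with the standard VOMS and SRM client tools (the SRM endpoint and paths are assumptions, not from the slides):

```shell
# 1. obtain a cepc proxy
voms-proxy-init --voms cepc

# 2. listing the bes VO's area should be denied for a cepc user
srmls srm://storm.example.org:8444/bes/user/

# 3./4. listing and read/write in the cepc VO's own area should succeed
srmls srm://storm.example.org:8444/cepc/user/
srmcp file:///tmp/test.dat \
      srm://storm.example.org:8444/cepc/user/test.dat
```

These commands need a grid environment and a live SRM endpoint, so they illustrate the test procedure rather than reproduce it.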

22 [Screenshot] Registered as a cepc user: access to the BES VO's storage area failed; read/write in the CEPC VO's storage area succeeded

23 SUMMARY AND DISCUSSION

24 Test Summary
With ACL-enabled /cefs, read/write in besdirac's directory is OK
Reading other users' data needs more debugging and testing
Read speed (80 MB/s) is acceptable
Write speed (20-30 MB/s) needs more testing
Multi-VO support is working

25 Comparison of StoRM and dCache
The StoRM solution is easier to install and maintain; no extra development is required
The StoRM solution could be more efficient: no need to register Lustre metadata in advance, and no data movement
StoRM is a promising solution, and we will do more tests before making a final decision

26 Lustre Data Security
The StoRM SE server acts like an lxslc5 login node; the Lustre file systems are mounted under /
We use mount --bind to remount a subdirectory of Lustre into the StoRM pub directory; only this subdirectory is visible to grid users (via low-level SRM commands)
Currently, in StoRM, all grid users are mapped to the AFS account 'besdirac', so reads/writes on Lustre are executed as user 'besdirac'. Therefore only besdirac's directory can be modified; other users' data on Lustre is safe
In the production scenario:
– input/output data of DIRAC jobs will be located in one Lustre user's directory (i.e., besdirac)
– inside besdirac's directory, we create subdirectories for each grid user
– when we need to transfer DSTs from IHEP to a remote site, the DST directory is mounted temporarily and read-only
– when transferring DSTs from a remote site back to IHEP, the data is written into besdirac's directory
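The bind-mount scheme described above might look like this (all paths are illustrative):

```shell
# expose only the besdirac-owned subtree to grid users
mount --bind /cefs/dirac /storm/sa/besdirac

# for an outbound DST transfer, bind the DST directory read-only,
# just for the duration of the transfer
mount --bind /besfs/physics/dst /storm/sa/dst
mount -o remount,bind,ro /storm/sa/dst
# ... transfer runs ...
umount /storm/sa/dst
```

The read-only remount means a compromised or misbehaving grid client cannot modify the exported DST data even in principle.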

27 Production Solution 1
Enable ACL and user_xattr on the production Lustre file systems (/besfs, /besfs2, /bes3fs, /junofs, etc.)
Create a directory for user besdirac in each Lustre file system, with several or dozens of TB of quota (depending on physics users' requirements)
Disadvantage: the production Lustre systems are busy and cannot be shut down to enable ACL
A solution: this can be performed during maintenance time
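During a maintenance window, the two steps above might be carried out as follows (the mount options, directory, and quota size are illustrative assumptions):

```shell
# remount a production Lustre with ACL and extended attributes enabled
mount -o remount,acl,user_xattr /besfs

# create besdirac's directory and give it a multi-TB block quota
mkdir -p /besfs/dirac && chown besdirac /besfs/dirac
lfs setquota -u besdirac -B 20t /besfs
lfs quota -u besdirac /besfs    # verify the new limit
```

Both steps require root on a Lustre client and would be validated against the site's Lustre version before use.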

28 Production Solution 2
Prepare a separate Lustre file system, e.g. /diracfs, or convert the current IHEPD-USER's 88 TB disk pool (even the 126 TB data disk) to Lustre
Advantage: the production Lustre systems are unaffected
Disadvantages:
– abandons the StoRM + production-Lustre solution
– hard to enlarge /diracfs to the PB level

