Status of the PPS for SRM 2.2: first tests performed
SRM v2.2 roll-out plan
Flavia Donno, CERN
GSSD Storage Workshop, 3 July 2007
Goals of PPS SRM v2.2
- Check configuration/management issues for the new versions of the Storage Services.
- Comply with production environment requirements (SAM tests, information system, response to GGUS tickets, monitoring, etc.).
- Exercise SRM configuration:
  - Storage Classes
  - Space Reservation and Space Token Descriptions
  - Configuration of groups/roles for permissions
- Provide an SRM 2.2 testing environment to the experiments, who express requirements in terms of:
  - needed space per Storage Class
  - name space definition
  - usage patterns
  - tools needed for managing and monitoring the SRM 2.2 service
- Test stability and manageability of the Storage and SRM services.
PPS SRM v2.2 for the LHC experiments
- At the moment two experiments have declared interest in testing SRM v2.2 in PPS: ATLAS and LHCb. CMS is coming along.
- Both experiments have declared the need to access production data from the PPS infrastructure while using production Grid resources:
  - ATLAS will exercise mostly transfers from Tier-0 (with SRM v1) to Tier-1s (with SRM v2).
  - LHCb is mostly interested in testing the full chain of transfer + data access at the Tier-1s.
- The initial requirements for experiment testing in PPS have been collected and published on the GSSD twiki pages.
- It is important not to lose momentum and to encourage the experiments to perform the needed tests.
- Sites must make an effort to meet the requirements whenever possible, to shorten the lifetime of the PPS instances and migrate to production.
- Developers must ensure adequate support.
ATLAS requirements
- Provide both or only one of the following storage classes:
  - REPLICA-ONLINE (Tape0Disk1)
  - CUSTODIAL-NEARLINE (Tape1Disk0)
- Define the following space token descriptions:
  - TAPE for the CUSTODIAL-NEARLINE storage class
  - DISK for the REPLICA-ONLINE storage class
- Provide resources such that a sustained transfer rate of at least 40 MB/s can be supported by a single Tier-1.
- Make available an ATLAS-specific path (e.g. /atlas). ATLAS will then create further paths specifying the desired space token, provided that space tokens are decided at some top level of the namespace and do not change for data stored in the leaves of that top directory.
- Enable the following groups/roles to reserve space:
  - /atlas/role=production
  - /atlas/soft-valid/role=production
- Publish in the GLUE schema the ATLAS-specific root path (GlueSAPath).
- Publish the space token descriptions (GlueSATag) and the storage classes supported by your site.
- Publish the size made available for ATLAS and for a particular storage class/space token description at your site (a query sketch follows below).
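To illustrate the publishing items, here is a minimal sketch of how a site or the experiment could inspect what is actually advertised for ATLAS. It assumes the python-ldap module and anonymous read access to a site BDII; the host name is a placeholder, and the attribute names beyond GlueSAPath and GlueSATag (the two named on the slide) are assumptions about the GLUE 1.3 schema, not a prescription.

```python
# Sketch only: list what a site publishes in the GLUE schema for ATLAS.
# The BDII host is a placeholder; attribute names other than GlueSAPath and
# GlueSATag (named on the slide) are assumptions about the GLUE 1.3 schema.
import ldap

SITE_BDII = "ldap://site-bdii.example.org:2170"   # placeholder host

con = ldap.initialize(SITE_BDII)
con.simple_bind_s()                               # anonymous bind

results = con.search_s(
    "o=grid", ldap.SCOPE_SUBTREE,
    # GlueSA entries describe the storage areas; select the ones open to ATLAS.
    "(&(objectClass=GlueSA)"
    "(|(GlueSAAccessControlBaseRule=atlas)(GlueSAAccessControlBaseRule=VO:atlas)))",
    ["GlueSAPath", "GlueSATag", "GlueSARetentionPolicy",
     "GlueSAAccessLatency", "GlueSAStateAvailableSpace"],
)

for dn, attrs in results:
    print(dn)
    for name, values in attrs.items():
        print("  %s = %s" % (name, values))
```

The same query against the PPS BDII would show whether the space token descriptions (TAPE, DISK) and the storage classes are visible to the experiment.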
ATLAS sites
- Mostly interested in dCache sites, although any site/implementation can easily be included in their tests.
- Initial dCache targets are:
  - BNL (the site provides a real tape backend)
  - FZK (real tape backend)
  - IN2P3 (simulated tape backend)
  - NDGF (disk only)
- Other targets:
LHCb requirements
- LHCb will perform data transfer and access exercises against the SRM v2.2 endpoints in PPS.
- SRM v2 endpoints should be available at all LHCb Tier-1 centres, preferably on MSS (with FTS endpoints from CERN).
- There is no need for the SRM v2.2 endpoints to be published in the production BDII.
- The Storage Classes needed are:
  - CUSTODIAL-NEARLINE (Tape1Disk0)
- Define the following space token descriptions (a transfer sketch using LHCb_RAW follows below):
  - LHCb_RAW and LHCb_RDST for the CUSTODIAL-NEARLINE storage class
- Make available an LHCb-specific path (e.g. /lhcb). LHCb will then create further paths specifying the desired space token, provided that space tokens are decided at some top level of the namespace and do not change for data stored in the leaves of that top directory.
- Enable the following groups to reserve space:
  - lhcbprd
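As a concrete illustration of the space-token usage asked for above, a minimal sketch of a single-file upload into the LHCb_RAW space at a Tier-1, assuming the lcg_util client tools of the period; the destination SURL and host name are placeholders, and the exact option spellings should be checked against `lcg-cp --help` on the UI in use.

```python
# Sketch: copy one local test file into the LHCb_RAW space at a Tier-1.
# The destination SURL is a placeholder; option names follow the lcg_util
# clients of the period and should be verified locally.
import subprocess

SOURCE = "file:/tmp/lhcb-pps-testfile"
DEST = ("srm://srm-v2.example-t1.org:8443/srm/managerv2"
        "?SFN=/lhcb/pps-tests/testfile")

cmd = [
    "lcg-cp", "-v",
    "--vo", "lhcb",
    "-b",                # do not contact the BDII, the full endpoint is given
    "-D", "srmv2",       # talk SRM v2.2 to the endpoint
    "-S", "LHCb_RAW",    # destination space token description
    SOURCE, DEST,
]
print("exit code: %d" % subprocess.call(cmd))
```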
LHCb sites
- The LHCb Tier-1 centres are:
  - CERN, CNAF, RAL: CASTOR
  - FZK, IN2P3, NIKHEF, PIC: dCache
- Provide the following temporary storage, expressed in TB:
  CERN: 20.4, IN2P3: 5.7, FZK: 3.3, CNAF: 3.1, SARA: 5.1, PIC: 3.3, RAL: 1.62
- LHCb would like access from the (production) WNs at the sites to these endpoints. They estimate something like the following kSI2k would be needed over a month:
  CERN: 19.6, IN2P3: 24.9, FZK: 14.4, CNAF: 13.5, SARA: 22, PIC: 14.4, RAL: 6.8
PPS SRM v2.2 endpoints
- At the moment there are 13 endpoints in PPS, i.e. published in the PPS BDII:
  - CASTOR: CERN, CNAF
  - dCache: BNL, FZK (?), IN2P3 (?), DESY, NDGF, Edinburgh
  - DPM: CERN, Edinburgh, Glasgow (LAL not published?)
  - StoRM: CNAF, Bristol
- These endpoints are tested daily using the S2 families of tests.
- They are also tested by the SAM tests for both SRM v1 and SRM v2:
  - Please make sure you publish "SRM" as GlueServiceType for SRM v2.2 (a small check follows below).
- GGUS tickets are issued when SAM tests are not passed and for other support problems:
  - Tickets are assigned to Flavia.
  - They can be assigned directly to sites for support using the usual channels (do sites have SRM v2 support units?).
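A small check, assuming anonymous ldapsearch access to the PPS top-level BDII (the host name below is a placeholder): listing the services that claim version 2.2 together with their GlueServiceType makes it easy to spot endpoints not published as "SRM".

```python
# Sketch: list SRM v2.2 services and their published GlueServiceType.
# The PPS BDII host is a placeholder.
import subprocess

cmd = [
    "ldapsearch", "-x", "-LLL",
    "-H", "ldap://pps-bdii.example.org:2170",
    "-b", "o=grid",
    "(&(objectClass=GlueService)(GlueServiceVersion=2.2*))",
    "GlueServiceEndpoint", "GlueServiceType",
]
print(subprocess.check_output(cmd, universal_newlines=True))
```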
PPS SRM v2.2 endpoints
However:
- CASTOR:
  - CERN: instance not serving "production" data. It passes S2 functionality tests and SAM tests. No Storage Classes, no space tokens. Can experiments use it?
  - CNAF: instance not usable by the experiments (used only to check configuration issues); not enough hardware resources available. It passes S2 functionality tests.
  - No RAL endpoint available for LHCb. Can this change soon?
- dCache:
  - BNL: still not meeting all ATLAS requirements (space reservation allowed for everybody, no space token descriptions available). It passes S2 functionality tests. Still an old version of dCache. Handling of the top directory: access for all members of a VO.
  - FZK: same situation as BNL. New version of dCache; however, a disk pool became unusable. It does not pass S2 functionality tests as of today. Can LHCb use this instance?
  - IN2P3: not ready. It does not pass functionality tests. Still an old version of dCache. Can LHCb use this instance?
  - DESY: old version of dCache, not passing S2 functionality tests.
  - NDGF: still an old version of dCache. No configuration executed for ATLAS. It does not pass S2 functionality tests.
  - Edinburgh: new version of dCache. It does not pass S2 functionality tests.
- DPM:
  - CERN: no special configuration done for ATLAS and/or LHCb. Running DPM 1.6.5.
  - Glasgow: still an old version of DPM.
- StoRM:
  - CNAF: only a small endpoint in PPS, not adequate for experiment testing. No special configuration performed for the VOs.
  - Bristol: still running an old version.
- No endpoints are correctly configured for experiment testing!!!
Some tests performed
- Some tests have nevertheless been performed with success.
- Sites involved: BNL (dCache), CERN (DPM), CNAF (StoRM).
- Using default spaces and no token descriptions.
- Using no VOMS roles for space reservation and/or file access.
- Multi-VO usage exercised.
- SRM v1 - SRM v2 interaction exercised only in one direction (read-only from SRM v1).
Some tests performed
- Transfer tests using FTS 2.0 with the default token from CASTOR SRM v1 to dCache-BNL using an ATLAS certificate:
  - Single-file transfers.
- Transfer tests using FTS 2.0 with the default token from CASTOR SRM v1 to StoRM-CNAF using ATLAS and LHCb certificates:
  - FTS 2.0 bulk-mode transfers with small files ( MB).
  - Transferred files at the time; failures while using StoRM due to Put cycles not completed correctly (need to investigate).
- Data access using GFAL+ROOT with LHCb data using SRM v2 SURLs:
  - Data copied from CASTOR SRM v1 to DPM SRM v2.
  - Used ROOT v f, GFAL 1.9.0, DPM to access DPM data specifying a SURL.
- Tests need to be re-executed using SURLs from all SRM v2 implementations.
- A lot to test. Any volunteers besides the experiments? (Stephen Burke? Mirco Ciriello?)
- It would be nice to create small suites to be reused by existing testing frameworks (a minimal FTS sketch follows below).
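For the test-suite idea, a minimal sketch of the FTS 2.0 part, assuming the glite-transfer command-line clients and a valid proxy; the FTS endpoint and the SURLs are placeholders, and the final-state names checked at the end are from memory and should be verified against the FTS release in use.

```python
# Sketch of one FTS 2.0 transfer test: submit a single SRM v1 -> SRM v2 copy
# and poll until the job reaches a final state. All endpoints are placeholders.
import subprocess
import time

FTS = "https://fts.example.org:8443/glite-data-transfer-fts/services/FileTransfer"
SRC = "srm://castor-srm-v1.example.org/castor/example.org/grid/atlas/pps/testfile"
DST = "srm://dcache-srm-v2.example-t1.org:8443/srm/managerv2?SFN=/atlas/pps/testfile"

# glite-transfer-submit prints the job identifier on stdout.
job_id = subprocess.check_output(
    ["glite-transfer-submit", "-s", FTS, SRC, DST],
    universal_newlines=True).strip()
print("submitted job %s" % job_id)

# Poll the job state; the final-state names may differ between FTS versions.
while True:
    state = subprocess.check_output(
        ["glite-transfer-status", "-s", FTS, job_id],
        universal_newlines=True).strip()
    print(state)
    if state in ("Finished", "FinishedDirty", "Failed", "Canceled"):
        break
    time.sleep(30)
```

A suite along these lines, parameterised over the implementation pairs listed above, could be handed to existing testing frameworks.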
How should we proceed? Plan for Tier-1s:
- Have dCache test instances properly configured for ATLAS and LHCb by the end of this week/beginning of next week (Wednesday, 11 July 2007).
  - Do we have resources for LHCb? How much can we dedicate to LHCb in PPS?
- Have DESY and Edinburgh properly configured with dCache by the end of this week.
- Start sustained stress tests on these two instances, using several certificates at the same time and performing a mixture of SRM v2.2 requests (see the sketch below). Prove stability over a period of at least a week under heavy load.
- Test SRM v1 - SRM v2 interoperability for all possible implementations with high-level tools/APIs, using entries in production catalogues, with multiple certificates.
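A minimal sketch of what such a stress driver could look like, assuming the lcg_util clients, one Grid proxy file per simulated user, and placeholder endpoint, paths and proxy locations; it illustrates the "several certificates, mixed requests" idea only and is not the actual test harness.

```python
# Sketch of a stress driver: several workers, each with its own proxy
# (certificate), issuing a mixture of SRM v2.2 requests through lcg_util.
# Endpoint, paths and proxy locations are placeholders.
import os
import random
import subprocess
import time
from multiprocessing import Process

ENDPOINT = "srm://pps-srm.example-t1.org:8443/srm/managerv2?SFN=/atlas/ppstest"
PROXIES = ["/tmp/proxy_user1", "/tmp/proxy_user2", "/tmp/proxy_user3"]

def one_cycle(env, i):
    """Write a file, then randomly also list it or read it back."""
    surl = "%s/stress_%d_%d" % (ENDPOINT, os.getpid(), i)
    put = ["lcg-cp", "-b", "-D", "srmv2", "file:/etc/group", surl]
    ls = ["lcg-ls", "-b", "-D", "srmv2", surl]
    get = ["lcg-cp", "-b", "-D", "srmv2", surl, "file:/tmp/back_%d" % os.getpid()]
    for cmd in random.choice([[put], [put, ls], [put, get]]):
        subprocess.call(cmd, env=env)

def worker(proxy, duration=3600, pause=2):
    env = dict(os.environ, X509_USER_PROXY=proxy)   # one certificate per worker
    start, i = time.time(), 0
    while time.time() - start < duration:
        one_cycle(env, i)
        i += 1
        time.sleep(pause)

if __name__ == "__main__":
    workers = [Process(target=worker, args=(p,)) for p in PROXIES]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # Cleanup of the created test files (e.g. via srmRm) is left out of the sketch.
```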
How should we proceed? Plan for Tier-1s:
- Push active experiment testing by next week (9 July?).
- Continue stress tests on development instances and experiment testing until mid-October 2007. Roll out patches as they come out.
- Start deployment of SRM v2.2 at BNL, FZK, IN2P3 and DESY on 15 October 2007 (agreement with the experiments). Add CASTOR instances?
- Other sites should start moving to production at the beginning of January 2008.
How should we proceed? Plan for Tier-2s:
- Tier-2s using DPM can migrate to SRM v2 as of now. The configuration can be coordinated centrally by GSSD with input from the experiments. On 15 October, Tier-2s with DPM can switch SRM v1 off!
- Tier-2s using dCache can upgrade starting January 2008.
- Tier-2s using StoRM can migrate to SRM v2 as of now (?).
- Roll out patches as they come out.
- Deployment of SRM 2.2 should be completed by the end of January 2008.