CMS: T1 Disk/Tape Separation
Nicolò Magini, CERN IT/SDC
Oliver Gutsche, FNAL
November 11th 2013
Outline
- Motivation: gains in operations
- Impact on data federation
- Progress and technical issues
- Changes in operations and procedures

2013-11-11 WLCG Workshop: Disk/Tape separation
Introduction
CMS asked the Tier-1 sites to change their storage setup to gain more flexibility and control over the available disk and tape resources.

Old setup:
- One MSS system controlling both disk and tape
- Automatic migration of new files to tape
- The disk pool automatically purges unpopular files to make room for more popular ones
- Automatic recall from tape when a file without a disk copy is accessed

Several disadvantages:
- Pre-staging was needed for organized processing, and was not 100% efficient because the system was still allowed to purge files automatically if needed
- User analysis was not allowed at Tier-1 sites, to protect the tape drives from chaotic user access patterns
Disk/Tape separation
CMS asked the Tier-1 sites to separate disk and tape and to base the management of both on PhEDEx. Sites were asked to deploy two independent [*] PhEDEx endpoints:
- A "large" [**] persistent disk endpoint
- A tape archive with a "small" [**] disk buffer

All file access will be restricted to the disk endpoint, and all processing will write only to the disk endpoint.

[*] A file can be written/deleted on disk only, on tape only, or on both simultaneously.
[**] "Small" ~ 10% of "large", but can be sized according to the expected rates to tape.
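The "[**]" sizing rule can be sketched as a quick calculation. This is a minimal illustration, assuming the ~10% rule of thumb from the slide and an optional rate-based alternative; the rates and drain window below are made-up numbers, not CMS figures.

```python
def tape_buffer_size_tb(disk_size_tb, write_rate_mbs=None, drain_hours=None):
    """Size the "small" disk buffer in front of tape.

    Default: the ~10% rule of thumb from the slides. If an expected write
    rate to tape and a desired drain window are given, size the buffer to
    absorb that much data instead (an illustrative assumption).
    """
    rule_of_thumb = 0.10 * disk_size_tb
    if write_rate_mbs is None or drain_hours is None:
        return rule_of_thumb
    # Capacity needed to hold `drain_hours` of writes at `write_rate_mbs` MB/s.
    rate_based = write_rate_mbs * 3600 * drain_hours / 1e6  # MB -> TB
    return max(rule_of_thumb, rate_based)

print(tape_buffer_size_tb(4000))           # 10% rule only
print(tape_buffer_size_tb(1000, 800, 48))  # rate-based sizing dominates
```

Either way the buffer stays an order of magnitude smaller than the persistent disk endpoint, which is why the space-duplication concern discussed later is limited.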
Motivation
- Increase flexibility for Tier-1 processing
- Enable user analysis at Tier-1s
- Enable remote access to Tier-1 data
Processing at Tier-1s: Location independence
Use case: organized processing needs to access input samples stored custodially on tape at one of the Tier-1 sites.
Old model: jobs needed to run close to the tape endpoint hosting the input and output data (the custodial location).
New model: jobs can run against any disk endpoint, not necessarily close to the tape endpoint hosting the input or output data.
Benefit: custodial distribution can optimize tape space utilization while taking into account the processing capacities of the Tier-1 sites. Not all data is accessed at the same time, which causes uneven processing resource utilization; location independence makes it possible to use both tape and processing resources efficiently at the same time.
Processing at Tier-1s: Pre-staging and pinning
Use case: staging and pinning input files to local disk is required to optimize the CPU efficiency of organized processing; input files need to be released from disk when processing is done.
Old model: pre-staging via SRM or Savannah tickets was used to convince the MSS to keep input files available on disk; release of input relied on automatic purging within the MSS.
New model: CMS will centrally subscribe, and therefore pre-stage, input files so that they are available on disk before jobs start, and will keep input files on disk for as long as regular activities need them.
Benefit: CMS is in control of what is on disk at the Tier-1 sites and can optimize disk utilization (CMS will have to actively manage the disk space through PhEDEx).
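As a sketch of what "centrally subscribe" means in practice, the snippet below builds (without sending) a parameter set in the style of the PhEDEx data service `subscribe` call. The parameter names follow the datasvc conventions but should be checked against the PhEDEx documentation; the dataset and node names are hypothetical.

```python
def build_prestage_subscription(dataset, disk_node):
    """Build the payload that would subscribe `dataset` to a disk endpoint.

    The subscription is non-custodial (the custodial copy stays on the tape
    endpoint) and keeps the data pinned on disk until CMS deletes it through
    PhEDEx. Field names follow the PhEDEx datasvc style; verify before use.
    """
    return {
        "node": disk_node,      # disk PhEDEx endpoint, e.g. T1_XX_Site_Disk
        "data": dataset,        # dataset (or block) to pre-stage
        "level": "dataset",
        "priority": "normal",
        "custodial": "n",       # the disk copy is not the custodial one
        "request_only": "n",    # auto-approved central subscription
        "comments": "central pre-staging for organized processing",
    }

payload = build_prestage_subscription(
    "/ExampleDataset/Run2013-v1/AOD",  # hypothetical dataset name
    "T1_XX_Example_Disk")              # hypothetical disk endpoint
print(payload["node"], payload["custodial"])
```

Releasing the input afterwards would be the matching central deletion request against the same disk node, which is exactly the active space management the slide mentions.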
Processing at Tier-1s: Output from central processing
Use case: central processing produces output which needs to be archived on tape.
Old model: the output of an individual workflow could only be produced at one site, the custodial location.
New model: output can be produced at one or more disk endpoints, then migrated to tape at a single final custodial location.
Benefits: CMS can optimize processing resource utilization (Tier-1s with no free tape are no longer idle), and CMS can validate data before the final tape migration, reducing unnecessary tape usage.
Impact on data federation
CMS would like to benefit from a fully deployed CMS data federation: Tier-1s need to publish the files on their disk endpoints in the Xrootd federation. Eventually, all popular data will be accessible through the federation.
Benefits:
- Further optimize processing resource utilization by processing input files without having to relocate samples through PhEDEx
- Enables processing not only at remote Tier-1 sites through the LHCOPN but also at Tier-2 sites
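Federated access means a job opens a file through a redirector instead of a site-local path. The helper below sketches that mapping from a CMS logical file name (LFN) to an Xrootd URL; the redirector hostname and the LFN are illustrative placeholders, not the actual CMS federation endpoints.

```python
def federation_url(lfn, redirector="cms-xrd-redirector.example.cern.ch"):
    """Turn a CMS logical file name into an Xrootd URL for federated access.

    The job contacts the redirector, which locates a site (disk endpoint)
    that actually holds the file. The hostname here is a placeholder.
    """
    if not lfn.startswith("/store/"):
        raise ValueError("CMS LFNs are rooted at /store/")
    return f"root://{redirector}//{lfn.lstrip('/')}"

url = federation_url("/store/data/Run2013/Example/AOD/file.root")
print(url)
```

The point of the slide is that this URL works the same whether the file is at the local Tier-1, another Tier-1 over the LHCOPN, or a Tier-2, so input placement no longer constrains where jobs run.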
Technical implementation
Sites and storage providers are free to choose the implementation. Two possibilities have been identified in practice:
- Two independent storage endpoints: CERN, FNAL
- A single storage endpoint with two different trees in the namespace: RAL, KIT, CNAF, CCIN2P3, PIC
Internal transfers
Currently using standard tools (e.g. FTS, xrdcp) for disk ↔ tape-buffer transfers at all sites; no bottleneck seen so far.
If needed, internal optimizations are possible with a single endpoint: e.g. on a single dCache endpoint, the internal data flow can be delegated to the pools.
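As a minimal sketch of the "standard tools" approach, the helper below builds (but does not run) an `xrdcp` command line for one internal disk ↔ tape-buffer move; both endpoint URLs are hypothetical placeholders.

```python
def xrdcp_command(src_url, dst_url):
    """Build the argument list for an xrdcp copy between two Xrootd URLs.

    Sites would run one such copy per file queued for archival (disk -> tape
    buffer) or recall (tape buffer -> disk). URLs below are placeholders.
    """
    return ["xrdcp", src_url, dst_url]

cmd = xrdcp_command(
    "root://disk.example.site//store/data/file.root",
    "root://tape-buffer.example.site//store/data/file.root")
print(" ".join(cmd))
```

With FTS the same source/destination pair would instead be submitted as a managed transfer job, which adds retries and scheduling on top of the raw copy.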
Site concerns
The main site concern has been the duplication of space used between the disk endpoint and the tape buffer. This should not be a big effect, given the "small" size of the buffer in front of tape.
For dCache, a solution is planned: a "flush-on-demand" command creating a hard link in the tape namespace instead of a copy. The development schedule will depend on need; for now, sites gather experience with the current version.
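The snippet below only illustrates the POSIX hard-link semantics behind that idea, not the dCache feature itself: two namespace entries share one set of data blocks, so no extra space is used.

```python
import os
import tempfile

# Why a hard link avoids the space duplication: the "disk" and "tape" namespace
# entries below are two names for the same on-disk data (same inode).
with tempfile.TemporaryDirectory() as top:
    disk_entry = os.path.join(top, "disk", "file.root")
    tape_entry = os.path.join(top, "tape", "file.root")
    os.makedirs(os.path.dirname(disk_entry))
    os.makedirs(os.path.dirname(tape_entry))
    with open(disk_entry, "wb") as f:
        f.write(b"x" * 1024)
    os.link(disk_entry, tape_entry)  # hard link, not a second copy
    assert os.stat(disk_entry).st_ino == os.stat(tape_entry).st_ino
    print(os.stat(disk_entry).st_nlink)  # 2: one file, two names
```

A plain copy would double the space for each file sitting in the buffer; the hard link keeps the buffer's footprint at zero extra blocks until the file is flushed and evicted.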
Current status
- DONE: RAL, CNAF, KIT (in commissioning last week)
- ~DONE: CERN (except for Tier-0 streamers and user)
- IN PROGRESS: PIC, CCIN2P3, FNAL
Issues
At sites:
- No blocking technical issues
- Not stress-tested yet: a challenge in 2014?

In CMS software:
- A minor update is needed in PhEDEx to handle disk → tape moves
- The data location used for job matching (PhEDEx node vs. SE) needs to be settled; CMS-internal, in progress
Changes in operations and procedures
The Tier-1 disk endpoint is a central space: CMS will manage subscriptions and deletions on disk. Tape endpoint subscriptions remain subject to approval by the Tier-1 data managers (a role held by site-local colleagues). CMS would like to auto-approve disk subscription and deletion requests in order to reduce latencies.
Changes in operations and procedures
Tape families:
- Together with the Tier-1 sites, CMS optimized the placement of files on tape for reading by requesting tape families.
- In the old model, tape family requests needed to be made before processing started, which could lead to complications if forgotten.
- The new model allows processing on disk endpoints without the need for tape families: a PhEDEx subscription archives the output to tape and needs to be approved by the site-local data manager.
- Tape family requests by CMS are no longer needed; sites can create tape families before approving archival PhEDEx subscriptions.
- CMS is available to help sites optimize their rules for tape family creation, and would like to evolve the procedure from requesting individual families to a dialogue with the sites defining tape family setups and rules.
Changes in site readiness
Site readiness metrics for Tier-1s will evolve to take the separated disk and tape PhEDEx endpoints into account:
- SAM tests only on CEs close to disk
- SAM tests for SRM on both the disk and tape endpoints
- More links to monitor: disk ↔ WAN, tape ↔ WAN, disk ↔ tape
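The enlarged monitoring matrix can be sketched by enumerating the link endpoints per Tier-1; the site name below is a hypothetical placeholder, and the `_Disk`/`_Tape` suffixes follow the naming pattern used for the separated PhEDEx endpoints.

```python
def links_to_monitor(tier1_sites):
    """Enumerate the links the evolved readiness metrics would track:
    disk <-> WAN, tape <-> WAN, and the internal disk <-> tape link.
    Site names are illustrative."""
    links = []
    for site in tier1_sites:
        links.append((f"{site}_Disk", "WAN"))
        links.append((f"{site}_Tape", "WAN"))
        links.append((f"{site}_Disk", f"{site}_Tape"))
    return links

print(links_to_monitor(["T1_XX_Example"]))
```

Each Tier-1 thus contributes three monitored link classes instead of one, which is the "more links to monitor" cost of the split.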
Conclusions
- Hosting Tier-1 data on disk will increase flexibility in all computing workflows.
- Technical solutions have been identified for all sites.
- Deployment is in progress with no blocking issues; completion is expected at all sites by the beginning of 2014.
For more details:
- https://twiki.cern.ch/twiki/bin/view/CMSPublic/CompProjDiskTape
- https://indico.cern.ch/conferenceDisplay.py?confId=249032