CMS: T1 Disk/Tape separation. Nicolò Magini, CERN IT/SDC; Oliver Gutsche, FNAL. November 11th 2013.


1 CMS: T1 Disk/Tape separation
Nicolò Magini, CERN IT/SDC
Oliver Gutsche, FNAL
November 11th 2013

2 Outline
- Motivation: gains in operations
- Impact on data federation
- Progress and technical issues
- Changes in operations and procedures
2013-11-11 WLCG Workshop: Disk/Tape separation

3 Introduction
- CMS asked the Tier-1 sites to change their storage setup to gain more flexibility and control of the available disk and tape resources
- Old setup:
  - One MSS system controlling both disk and tape
  - Automatic migration of new files to tape
  - Disk pool automatically purges unpopular files to make room for more popular files
  - Automatic recall of files from tape when accessing files without disk copy
- Several disadvantages:
  - Pre-staging needed for organized processing, not 100% efficient because system was still allowed to automatically purge files if needed
  - User analysis was not allowed at Tier-1 sites to protect the tape drives from chaotic user access patterns

4 Disk/Tape separation
- CMS asked the Tier-1 sites to separate disk and tape and base the management of both on PhEDEx
- Sites were asked to deploy two independent [*] PhEDEx endpoints
  - “Large” [**] persistent disk
  - Tape archive with “small” [**] disk buffer
- All file access will be restricted to the disk endpoint
- All processing will write only on the disk endpoint
[*] Can write/delete a file on disk-only, or on tape-only, or on both simultaneously
[**] “small” ~ 10% of “large”, but can be sized according to expected rates to tape
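The sizing note [**] can be illustrated with a small calculation. This is only a sketch: the 10% rule of thumb comes from the slide, while the rate-based formula (the buffer must hold whatever arrives during the time files wait before migration to tape), the function names, and all numbers are assumptions made for illustration.

```python
def buffer_from_fraction(disk_tb: float, fraction: float = 0.10) -> float:
    """Rule of thumb from the slide: tape buffer is roughly 10% of the disk."""
    return disk_tb * fraction


def buffer_from_rate(rate_mb_per_s: float, residency_hours: float) -> float:
    """Assumed alternative: size the buffer so that data arriving at a steady
    rate fits for `residency_hours` before it is migrated to tape.
    Returns terabytes, using 1 TB = 1e6 MB."""
    seconds = residency_hours * 3600
    return rate_mb_per_s * seconds / 1e6


if __name__ == "__main__":
    # Hypothetical site: 2000 TB of persistent disk.
    print(buffer_from_fraction(2000.0))
    # Hypothetical load: 500 MB/s towards tape, files stay 48 h in the buffer.
    print(buffer_from_rate(500.0, 48.0))
```

A site would presumably take the larger of the two estimates when the expected rate to tape is high relative to its disk size.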

5 Motivation
- Increase flexibility for Tier-1 processing
- Enable user analysis at Tier-1s
- Enable remote access of Tier-1 data

6 Processing at Tier-1s: Location independence
- Use case:
  - Organized processing needs to access input samples stored custodially on tape at one of the Tier-1 sites
- Old model:
  - Jobs needed to run close to the tape endpoint hosting input and output data (custodial location)
- New model:
  - Jobs can run against any disk endpoint, not necessarily close to the tape endpoint hosting input or output data
- Benefit of new model:
  - Custodial distribution optimizes tape space utilization, taking into account the processing capacities of the Tier-1 sites
  - Not all data is accessed at the same time, which causes uneven processing resource utilization
  - Location independence makes it possible to use both tape and processing resources efficiently at the same time

7 Processing at Tier-1s: Pre-staging and Pinning
- Use case:
  - Staging and pinning input files to local disk for organized processing is required to optimize CPU efficiency
  - Input files need to be released from disk when processing is done
- Old model:
  - Pre-staging via SRM or Savannah tickets was used to convince the MSS to have input files available on disk
  - Release of input relied on automatic purge within MSS
- New model:
  - CMS will centrally subscribe and therefore pre-stage input files to have them available on disk before jobs start
  - CMS will permanently keep input files on disk for regular activities
- Benefit of new model:
  - CMS is in control of what is on disk at the Tier-1 sites and can optimize disk utilization (CMS will have to actively manage the disk space through PhEDEx)

8 Processing at Tier-1s: Output from central processing
- Use case:
  - Central processing produces output which needs to be archived on tape
- Old model:
  - Output of individual workflows could only be produced at one site, the site of the custodial location
- New model:
  - Output can be produced at one or more disk endpoints, then migrated to tape only at single final custodial location
- Benefit of new model:
  - CMS can optimize processing resource utilization
  - Tier-1s with no free tape are no longer idle
  - CMS can validate data before final tape migration, reducing unnecessary tape usage
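The new output model can be pictured as a small state machine: output may land on several disk endpoints, and archival to one custodial tape endpoint happens only after validation. The sketch below is a toy model, not PhEDEx code; the class, its methods, and the endpoint names used in the example are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class OutputDataset:
    """Toy model of a workflow output under disk/tape separation."""
    name: str
    disk_replicas: Set[str] = field(default_factory=set)
    validated: bool = False
    custodial_tape: Optional[str] = None

    def produce_at(self, disk_endpoint: str) -> None:
        # Output can be written at one or more disk endpoints.
        self.disk_replicas.add(disk_endpoint)

    def archive_to(self, tape_endpoint: str) -> None:
        # Migration to tape happens once, at a single custodial location,
        # and (in this sketch) only after the data has been validated.
        if not self.disk_replicas:
            raise ValueError("nothing on disk to archive")
        if not self.validated:
            raise ValueError("validate the output before tape migration")
        if self.custodial_tape is not None:
            raise ValueError("custodial location already set")
        self.custodial_tape = tape_endpoint


if __name__ == "__main__":
    ds = OutputDataset("/Hypothetical/Dataset/RECO")
    ds.produce_at("SiteA_Disk")   # hypothetical endpoint names
    ds.produce_at("SiteB_Disk")
    ds.validated = True
    ds.archive_to("SiteC_Tape")
    print(ds)
```

Rejecting archival of unvalidated output mirrors the stated benefit that validation before migration reduces unnecessary tape usage.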

9 Impact on data federation
- CMS would like to benefit from a fully deployed CMS data federation
  - Tier-1s need to publish files on the disk endpoints in the Xrootd federation
  - Eventually, all popular data will be accessible through the federation
- Benefits:
  - Further optimize processing resource utilization by processing input files without the need to relocate samples through PhEDEx
  - Enables processing not only on remote Tier-1 sites through the LHCOPN but also at Tier-2 sites

10 Technical implementation
- Sites and storage providers free to choose implementation
- Two possibilities identified in practice:
  - Two independent storage endpoints
    - CERN, FNAL
  - Single storage endpoint with two different trees in the namespace
    - RAL, KIT, CNAF, CCIN2P3, PIC
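The difference between the two layouts can be sketched as a mapping from a logical file name to a physical location. This is an illustration only: the host names, path prefixes, and the `to_pfn` helper are hypothetical placeholders, not any site's actual configuration.

```python
# All hosts and path prefixes below are hypothetical placeholders.
TWO_ENDPOINTS = {
    "disk": "srm://disk-se.example.org/cms",
    "tape": "srm://tape-se.example.org/cms",
}

SINGLE_ENDPOINT_TWO_TREES = {
    "disk": "srm://se.example.org/cms/disk",
    "tape": "srm://se.example.org/cms/tape",
}


def to_pfn(lfn: str, target: str, layout: dict) -> str:
    """Map a logical file name to a physical one for 'disk' or 'tape'."""
    if not lfn.startswith("/store/"):
        # CMS logical file names conventionally start with /store/.
        raise ValueError(f"unexpected LFN: {lfn}")
    if target not in layout:
        raise ValueError(f"unknown target: {target}")
    return layout[target] + lfn


if __name__ == "__main__":
    lfn = "/store/data/example/file.root"  # hypothetical file
    for layout in (TWO_ENDPOINTS, SINGLE_ENDPOINT_TWO_TREES):
        print(to_pfn(lfn, "disk", layout))
        print(to_pfn(lfn, "tape", layout))
```

In both layouts the same logical file resolves to two distinct physical locations, which is what allows a file to exist on disk only, on tape only, or on both.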

11 Internal transfers
- Currently using standard tools for transfers between disk and tape buffer at all sites
  - e.g. FTS, xrdcp
  - No bottleneck seen so far
- If needed, internal optimizations are possible with a single endpoint
  - e.g. on a single dCache endpoint, internal data flow can be delegated to the pools

12 Site concerns
- Main site concern has been duplication of space used between disk and tape buffer
  - Should not be a big effect given the “small” size of the buffer in front of tape
- For dCache, a solution is planned:
  - “flush-on-demand” command creating a hard link in tape namespace instead of copy
  - development schedule will depend on need, for now gather experience with current version

13 Current status
- DONE
  - RAL, CNAF
  - KIT (in commissioning last week)
- ~ DONE
  - CERN (except for Tier-0 streamers and user)
- IN PROGRESS
  - PIC, CCIN2P3, FNAL

14 Issues
- At sites
  - No blocking technical issues
  - Not stress-tested yet: challenge in 2014?
- In CMS software
  - Minor update needed in PhEDEx to handle moves between disk and tape
  - Need to settle data location for job matching
    - PhEDEx node vs. SE…
    - CMS internal, in progress

15 Changes in operations and procedures
- The Tier-1 disk endpoint is a central space
  - CMS will manage subscriptions and deletions on disk
- Tape endpoint subscriptions are subject to approval by Tier-1 data managers (functions that are held by site-local colleagues)
- CMS would like to auto-approve disk subscription and deletion requests to be able to reduce latencies
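The proposed approval policy can be summarized as a simple routing rule. The sketch below is illustrative and is not PhEDEx code: the enum and function names are hypothetical, and treating tape deletions like tape subscriptions is an assumption, since the slide only mentions approval for tape subscriptions.

```python
from enum import Enum


class NodeKind(Enum):
    DISK = "disk"
    TAPE = "tape"


class Decision(Enum):
    AUTO_APPROVED = "auto-approved"
    PENDING_SITE_APPROVAL = "pending approval by Tier-1 data manager"


def route_request(kind: NodeKind, action: str) -> Decision:
    """Decide how a subscription or deletion request would be handled."""
    if action not in {"subscribe", "delete"}:
        raise ValueError(f"unknown action: {action}")
    if kind is NodeKind.DISK:
        # Proposal on the slide: disk requests are auto-approved
        # to reduce latencies.
        return Decision.AUTO_APPROVED
    # Tape subscriptions need the site-local data manager; tape deletions
    # are routed the same way here as an assumption.
    return Decision.PENDING_SITE_APPROVAL


if __name__ == "__main__":
    for kind in NodeKind:
        for action in ("subscribe", "delete"):
            print(kind.value, action, "->", route_request(kind, action).value)
```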

16 Changes in operations and procedures
- Tape families:
  - Together with the Tier-1 sites, CMS optimized placement of files on tape for reading by requesting tape families
  - In the old model, tape family requests needed to be made before processing started, which could lead to complications if forgotten
  - New model allows processing on disk endpoints without the need for tape families
  - A PhEDEx subscription archives the output to tape: needs to be approved by the site-local data manager
  - Tape family requests by CMS are not needed anymore; sites can create tape families before approving archival PhEDEx subscriptions
  - CMS is happy to work with the sites to optimize rules for tape family creation
- CMS would like to evolve the tape family procedure from requesting individual families to a dialogue with the sites defining tape family setups and rules

17 Changes in site readiness
- Site readiness metrics for Tier-1s will evolve taking into account separated disk and tape PhEDEx endpoints
  - SAM tests only on CEs close to disk
  - SAM tests for SRM both on disk and on tape endpoints
  - More links to monitor:
    - disk-WAN
    - tape-WAN
    - disk-tape

18 Conclusions
- Hosting Tier-1 data on disk will increase flexibility in all computing workflows
- Technical solutions identified for all sites
- Deployment in progress with no blocking issues, expecting completion at all sites by beginning of 2014
- For more details:
  - https://twiki.cern.ch/twiki/bin/view/CMSPublic/CompProjDiskTape
  - https://indico.cern.ch/conferenceDisplay.py?confId=249032

