
1 Evolution of the distributed computing model The case of CMS
Claudio Grandi (INFN Bologna) SuperB Computing R&D Workshop 6 July 2011

2 Introduction
This presentation describes how the CMS experiment is changing its approach to the exploitation of distributed resources with respect to what was described in the Computing TDR. I take responsibility for the few thoughts on possible evolutions for the future of HEP experiments that are added in the different sections. The presentation concentrates on Workload and Data Management and does not cover other important items such as Security, Monitoring and Bookkeeping.

3 Workload Management

4 Baseline WM model
All processing jobs are sent to the site hosting the data. A push model is used, supported by a WMS and a Grid Information System; scalability is intrinsic, by WMS replication. (CMS Computing TDR, 2005)
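A minimal sketch of the data-driven push model described above; the site names, dataset paths and the flat dictionary standing in for the Grid Information System are all invented for illustration:

```python
# A sketch of the data-driven push model: the WMS matches each job to a
# site hosting its input data, as advertised by the Information System,
# and pushes it to that site's CE. All names below are invented.
information_system = {  # stand-in for the Grid Information System
    "/MinBias/Run2010A/RECO": ["T1_IT_CNAF", "T2_IT_Legnaro"],
    "/Mu/Run2010B/AOD": ["T1_US_FNAL"],
}

def push_submit(job):
    """Send the job to a site that hosts its input data."""
    sites = information_system.get(job["dataset"], [])
    if not sites:
        raise RuntimeError("no site hosts " + job["dataset"])
    target = sites[0]  # a real WMS would also rank sites by load, queue depth, ...
    print("submitting %s to the CE at %s" % (job["name"], target))

push_submit({"name": "reco_42", "dataset": "/Mu/Run2010B/AOD"})
```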

5 Beyond the baseline
In the Computing TDR it was foreseen to evolve the CMS WM to a 2-layer pull model → hierarchical Task Queues: a Global Task Queue, which must reach the required scale, plus CMS services at sites. (CMS Computing TDR, 2005) The architecture being implemented is an evolution of this one, but batch-slot harvesting and the local task queues are moved outside the site boundaries.

6 Glidein-WMS based system
The Factory harvests batch slots; the Frontend contains the job queue. Frontend and Factory are in n:m correspondence.
[Diagram: on the resource-allocation side, the Glidein Factory (schedd, collector) submits Grid jobs through the site CE and LRMS; on the WN, inside the site boundary, the glidein startup launches a startd that runs the CMS job. On the job-management side, the Global Task Queue (UI) feeds the Local Task Queue in the WMAgent (schedd); the Glidein Frontend and the Central Manager (collector, negotiator) match queued jobs to glidein slots.]
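The pull side of this architecture can be sketched in a few lines; this is an illustration of the late-binding idea only, not glideinWMS code, and the queue contents and job budget are invented:

```python
# Late binding in a pilot: the glidein lands on a worker node and only then
# pulls real jobs from the VO's queue, one by one, until the queue drains
# or the slot's budget runs out. Illustration only, not glideinWMS code.
import queue

task_queue = queue.Queue()           # stand-in for the Frontend's job queue
for i in range(3):
    task_queue.put("cms_job_%d" % i)

def glidein_pilot(job_budget=10):
    """Run jobs fetched from the central queue inside one batch slot."""
    done = 0
    while done < job_budget:         # real pilots check remaining wall time
        try:
            job = task_queue.get_nowait()
        except queue.Empty:
            break                    # queue drained: the pilot exits cleanly
        print("running", job, "in this slot")
        done += 1

glidein_pilot()
```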

7 Thoughts on the future
Split resource allocation and job management:
→ job management requires detailed bookkeeping, resource allocation doesn't;
→ resource allocation requires clear interfaces with the sites, while job management is internal to the VO.
Get rid of Grid Information Systems:
→ requires careful tuning of resource harvesting w.r.t. the job load;
→ requires developing and maintaining a VO job management system;
→ security issues.
Cloud computing is providing flexible ways of allocating resources:
→ abandon the concept of job in the resource allocation phase;
→ it becomes possible to build virtual VO farms (e.g. with VPNs) and use a commercial batch system as the VO job management system.

8 Quantitative aspects
Towards ½ million jobs/day, more than 50% of which are analysis jobs! About 40K parallel running jobs (and increasing). If you are concerned by the scale, consider that simply adopting the whole-node approach gains you an order of magnitude.
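A back-of-the-envelope check of these numbers; the 8 cores per node is an assumed typical figure, not taken from the slide:

```python
# Sanity-checking the slide's numbers (cores_per_node is an assumption).
jobs_per_day = 500000                 # "towards 1/2 million jobs/day"
running_slots = 40000                 # "about 40K parallel running jobs"
cores_per_node = 8                    # assumed typical 2011 worker node

print(jobs_per_day / 86400.0)         # ~5.8 job starts per second, sustained
# With whole-node scheduling the WM system handles nodes instead of cores,
# so the number of scheduled entities drops by ~cores_per_node:
print(running_slots / cores_per_node) # ~5000 pilots instead of 40000 slots
```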

9 Data Management

10 Data distribution
CMS Computing TDR (2005): data is moved to Tier-1s for organized processing and to Tier-2s for analysis; MC data produced at Tier-1s and Tier-2s then follows the path of real data.

11 T2-T2 full mesh commissioning
The first important modification to the Computing Model: every site is able to get data via PhEDEx from any other site. Today the Tier-2s host the biggest fraction of CMS data.
[Plot: number of T2-T2 links used for data transfer each month in 2010 (not always the same links): up to 30 links/day, 7 links/day on average in the first 6 months of data taking; more than 95% of the full mesh is commissioned.]

12 CMS production traffic in 2010
Data transferred through the different links: T0➝T1 traffic is related to LHC activities, with PhEDEx rerouting moving part of it to T1➝T1 links; T1➝T2 traffic increased to serve data to the analysis layer; T2➝T2 traffic became important after the dedicated efforts in 2010. [Extension of information presented at CHEP'10, D. Bonacorsi]

13 Data Management components
CMS Computing TDR (2005): DBS (dataset bookkeeping); TMDB (the PhEDEx transfer DB); the TFC (Trivial File Catalogue), just an algorithm to do LFN → PFN translation (sketched below); DAS (data aggregation).
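Since the slide reduces the TFC to "just an algorithm", here is a minimal sketch of that idea: ordered regular-expression rules, kept per site and per protocol, rewrite a logical file name into a physical one. The rules and hostnames below are invented; the real catalogue expresses the same rules declaratively in a site-local file:

```python
# A sketch of the TFC idea: ordered regular-expression rules, kept per site
# and per protocol, rewrite an LFN into a PFN. Rules and hosts are invented.
import re

tfc_rules = [
    # (protocol, LFN pattern, PFN template)
    ("xrootd", r"^/store/(.*)", r"root://xrd.example-t2.org//data/store/\1"),
    ("srmv2",  r"^/store/(.*)", r"srm://se.example-t2.org/pnfs/cms/store/\1"),
]

def lfn_to_pfn(lfn, protocol):
    """Apply the first matching rule for the requested protocol."""
    for proto, pattern, template in tfc_rules:
        if proto == protocol and re.match(pattern, lfn):
            return re.sub(pattern, template, lfn)
    raise LookupError("no TFC rule for %s via %s" % (lfn, protocol))

print(lfn_to_pfn("/store/data/Run2010B/Mu/AOD/f.root", "xrootd"))
```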

14 Storage Model Evolution
Motivations:
- non-optimal use (waste) of disk resources for analysis: too many replicas of rarely accessed data;
- more efficient use of network resources: the network is cheaper and more available than it used to be;
- more controlled access to MSS (now T1D0 only).
Strategies:
- remote data access: xrootd, NFS 4.1, WebDAV, ... (see the sketch below);
- dynamic data placement: pre-placement, caching;
- MSS-disk split.
The SRM interface may become redundant.
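What remote data access looks like from a job's point of view, as a minimal PyROOT sketch; the redirector hostname and the file path are placeholders, not real CMS endpoints:

```python
# Remote data access from a job's point of view: ROOT opens the file by its
# xrootd URL and reads it over the WAN; no local replica is needed. The
# redirector host and path are placeholders.
import ROOT  # PyROOT, distributed with ROOT

url = "root://xrootd-redirector.example.org//store/data/Run2010B/f.root"
f = ROOT.TFile.Open(url)     # the redirector locates a site holding the file
if f and not f.IsZombie():
    f.ls()                   # list the remote file's contents
    f.Close()
```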

15 Remote Data Access

16 Remote data access concerns
What we should care about:
- Tier-1 MSS systems must be protected from tape recalls: publicize via xrootd only the files on disk; later, provide a tool to pre-stage files in a controlled way that does not impact operations, and possibly add automation to gather demand-overview information and dispatch pre-stage requests.
- Central processing should not be impacted: define a threshold for the load on the data servers that can come from remote access, and throttle access at the xrootd server level (see the sketch below).
- Data Operations transfers should not be impacted: limit the bandwidth of the xrootd servers.
In general, network usage should be monitored with care and all excessive-use or abuse cases identified.
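The throttling idea in the second point, as an illustrative sketch; this is generic code, not an xrootd feature, and the threshold is invented:

```python
# The throttling idea as generic code: cap the number of concurrent remote
# reads a data server accepts so that local, organized processing keeps
# priority. The threshold is invented; xrootd has its own mechanisms.
import threading

MAX_REMOTE_STREAMS = 20      # assumed per-server threshold
remote_slots = threading.BoundedSemaphore(MAX_REMOTE_STREAMS)

def serve_remote_read(path):
    """Refuse the request when the remote-access quota is exhausted."""
    if not remote_slots.acquire(blocking=False):
        raise RuntimeError("remote-access quota full; redirect elsewhere")
    try:
        print("streaming", path, "to a remote client")  # rate-limited I/O
    finally:
        remote_slots.release()

serve_remote_read("/store/data/Run2010B/f.root")
```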

17 Dynamic Data Placement
Measure the "popularity" of data and develop algorithms to decide which replicas should be deleted, and which datasets should be replicated and where. First application: a Data Reduction Agent (ATLAS). Dynamic Data Placement is still compatible with the data-driven model.
[Plot: number of accesses per dataset]
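A minimal sketch of the deletion side of such an algorithm; the dataset names, access counts and thresholds are all invented:

```python
# The deletion side of a popularity-driven placement algorithm: rank dataset
# replicas by recent accesses and reclaim excess copies of unpopular data.
# All names, counts and thresholds are invented.
popularity = {  # dataset -> (accesses last month, replica count)
    "/Mu/Run2010B/AOD":       (950, 4),
    "/MinBias/Run2010A/RECO": (12, 3),
    "/OldSkim/Run2009/RECO":  (0, 2),
}

MIN_REPLICAS = 1        # always keep at least one (custodial) copy
ACCESS_THRESHOLD = 50   # below this, a dataset counts as "rarely accessed"

def replicas_to_delete():
    """Yield (dataset, excess copies) starting from the least popular."""
    for ds, (accesses, replicas) in sorted(popularity.items(),
                                           key=lambda kv: kv[1][0]):
        if accesses < ACCESS_THRESHOLD and replicas > MIN_REPLICAS:
            yield ds, replicas - MIN_REPLICAS

for ds, n in replicas_to_delete():
    print("delete %d replica(s) of %s" % (n, ds))
```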

18 Tier-1 MSS-disk split
Transparent access to data on tape (T1D0) is not as easy as it appeared at the beginning: one goes through an additional access layer even when the data is on disk. CMS is able to use T1D0 systems efficiently, but:
- the Data Operations team is requested to trigger organized recalls from tape before starting heavy reprocessing activities at the Tier-1s;
- the Data Operations team does not have enough flexibility in deciding when data has to go to tape;
- user analysis is not allowed at the Tier-1s, to protect the MSS systems from uncontrolled recalls (and we want to avoid adding authorization when accessing the storage from the farm!).
The plan is to split MSS and disk at the Tier-1s: jobs can only access the disk, so the farm can be opened to users, and remote access is allowed only to data on disk.

19 Unsorted thoughts...
- Add private data management to the system from the beginning: we are not doing well on the publication of private data.
- Start from proper data placement and add remote access mainly for handling exceptions: the bottleneck for remote access could be the security layer on the storage system rather than the network.
- Open as many sites as possible to analysis, and set up the storage accordingly (e.g. protect the MSS systems, if any).
- Think about whether to keep using tapes for nearline data: an alternative may be, e.g., two custodial copies on disk at different sites, which could also be used for processing!

20 Conclusions

21 Responsibility-based model
The MONARC regional model is not needed any more. It was justified by network costs and by the complexity of managing user-site relations; Grid technology removed the complexity by providing standard interfaces and single sign-on, and network costs no longer prevent implementing the full mesh. Still, there are different kinds of sites: based on size, on the kind of services that are offered, and on the quality of service (including network connectivity). → A (loosely) hierarchical structure, but based on responsibilities rather than geography, and more dynamic: responsibilities move as a function of experiment needs.

