Evolution of the distributed computing model: the case of CMS


Evolution of the distributed computing model: the case of CMS
Claudio Grandi (INFN Bologna)
SuperB Computing R&D Workshop, 6 July 2011

Introduction
The presentation describes how the CMS experiment is changing its approach to the exploitation of distributed resources with respect to what was described in the Computing TDR.
I take responsibility for the few thoughts on possible evolutions for the future of HEP experiments that are added in the different sections.
The presentation concentrates on Workload and Data Management and does not cover other important items such as Security, Monitoring and Bookkeeping.

Workload Management

Baseline WM model (CMS Computing TDR, 2005)
All processing jobs are sent to the site hosting the data.
A push model is used, supported by a WMS and a Grid Information System.
Scalability is intrinsic, obtained by WMS replication.
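To make the push model concrete, here is a minimal, purely illustrative broker in Python: it picks one of the sites hosting the input dataset, much as a WMS would do using the data catalogue and the Grid Information System. The `broker` function, the site names and the `free_slots` table are invented for illustration and are not gLite or CMS code.

```python
# Sketch of the data-driven push model (illustrative, not gLite WMS code):
# the broker matches a job to one of the sites hosting its input dataset,
# preferring the site with the most free batch slots.
def broker(job, dataset_locations, free_slots):
    """Pick the data-hosting site with the most free batch slots."""
    candidates = dataset_locations.get(job["dataset"], [])
    if not candidates:
        raise RuntimeError("no site hosts the input data")
    return max(candidates, key=lambda site: free_slots.get(site, 0))

locations = {"/Mu/Run2011A-RECO": ["T2_IT_Bologna", "T2_US_MIT"]}
slots = {"T2_IT_Bologna": 120, "T2_US_MIT": 300}
print(broker({"dataset": "/Mu/Run2011A-RECO"}, locations, slots))  # T2_US_MIT
```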

Beyond the baseline (CMS Computing TDR, 2005)
The Computing TDR already foresaw evolving the CMS workload management to a two-layer pull model with hierarchical task queues: a Global Task Queue, which must reach the required scale, plus CMS services at the sites.
The architecture now being implemented is an evolution of this one, but batch slot harvesting and the local task queues are moved outside the site boundaries.

glideinWMS-based system
The Factory harvests batch slots; the Frontend contains the job queue; Frontends and Factories are in n:m correspondence.
[Architecture diagram: the Glidein Factory (schedd, collector) submits glideins as Grid jobs to the site CE/LRMS (resource allocation); the Glidein Startup runs a startd on the worker node, inside the site boundary, where the CMS job executes; the Glidein Frontend, the Central Manager (collector, negotiator) and the WMAgent schedd implement the global and local task queues (job management).]
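The following is a minimal sketch of the pull model behind this architecture, not the actual glideinWMS implementation: a pilot submitted by the Factory lands in a batch slot and pulls CMS jobs from a VO-side task queue until the slot lease expires. `TaskQueue`, `fetch_job` and `report` are illustrative placeholders.

```python
# Minimal sketch of the pilot ("glidein") pull model. In reality the Factory
# submits the pilot as an ordinary Grid/batch job, the pilot starts a condor
# startd and the negotiator matches jobs to it; here we only show the idea of
# pulling work from a VO queue for the lifetime of the batch slot.
import time

class TaskQueue:
    """Stand-in for the Frontend/WMAgent job queue (illustrative only)."""
    def __init__(self, jobs):
        self.jobs = list(jobs)

    def fetch_job(self):
        return self.jobs.pop(0) if self.jobs else None

    def report(self, job, status):
        print(f"job {job['id']}: {status}")

def run_pilot(queue, slot_lifetime=3600):
    """Pull and execute jobs until the queue is empty or the slot expires."""
    deadline = time.time() + slot_lifetime
    while time.time() < deadline:
        job = queue.fetch_job()
        if job is None:           # no work left: release the batch slot
            break
        queue.report(job, "running")
        time.sleep(job.get("walltime", 0))   # "run" the payload
        queue.report(job, "done")

if __name__ == "__main__":
    q = TaskQueue([{"id": 1, "walltime": 0}, {"id": 2, "walltime": 0}])
    run_pilot(q, slot_lifetime=60)
```

The point of the design is the separation shown in the diagram: the Factory only deals with resource allocation at the sites, while the queue and the matchmaking stay entirely inside the VO.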

Thoughts on the future
Split resource allocation and job management: job management requires detailed bookkeeping while resource allocation does not; resource allocation requires clear interfaces with the sites, whereas job management is internal to the VO; Grid Information Systems can be dropped. On the other hand, this requires careful tuning of resource harvesting with respect to the job load, it requires developing and maintaining a VO job management system, and it raises security issues.
Cloud computing is providing flexible ways of allocating resources: the concept of a job can be abandoned in the resource allocation phase, and it becomes possible to build virtual VO farms (e.g. with VPNs) and to use a commercial batch system as the VO job management system.

Quantitative aspects
Towards half a million jobs per day, of which more than 50% are analysis jobs.
About 40K parallel running jobs (and increasing).
If you are concerned by the scale, consider that simply adopting the whole-node approach already gains an order of magnitude.
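A back-of-envelope illustration of the whole-node remark follows; the 8 cores per node figure is an assumption typical of the hardware of the time, not a CMS number.

```python
# Back-of-envelope illustration of the "whole-node" gain (8 cores/node is an
# assumed, illustrative figure). Scheduling whole nodes instead of single
# cores divides the number of entities the workload management system has to
# track by the core count.
jobs_per_day = 500_000          # "towards half a million jobs/day"
parallel_jobs = 40_000          # ~40K parallel running jobs
cores_per_node = 8              # assumption

print(jobs_per_day // cores_per_node, "node-level pilots per day")  # 62500
print(parallel_jobs // cores_per_node, "parallel node slots")       # 5000
```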

Data Management

Data distribution (CMS Computing TDR, 2005)
Data are moved to Tier-1s for organized processing and to Tier-2s for analysis.
MC data are produced at Tier-1s and Tier-2s and then follow the path of real data.

T2-T2 full mesh commissioning
The first important modification to the Computing Model: every site is able to get data via PhEDEx from any other site. Today the Tier-2s host the biggest fraction of CMS data.
[Plot: number of T2-T2 links used for data transfer each month in 2010 (not always the same links); up to 30 links/day, 7 links/day on average in the first six months of data taking; more than 95% of the full mesh commissioned.]
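For a sense of scale, a rough count of the links to commission, assuming about 50 CMS Tier-2 sites (an illustrative assumption; the actual number of Tier-2s varied over time):

```python
# Rough size of the T2-T2 full mesh (the ~50 Tier-2 count is assumed).
n_t2 = 50
directed_links = n_t2 * (n_t2 - 1)   # each site can both send and receive
print(directed_links, "directed T2-T2 links in the full mesh")        # 2450
print(int(0.95 * directed_links), "links at >95% commissioning")      # 2327
```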

CMS production traffic in 2010
Data transferred through the different links:
T0➝T1 traffic is related to LHC activities; PhEDEx rerouting moves part of the traffic to T1➝T1 links.
T1➝T2 traffic increased to serve data to the analysis layer.
T2➝T2 traffic became important after the dedicated commissioning efforts in 2010.
[Extension of information presented at CHEP'10, D. Bonacorsi]

Data Management components (CMS Computing TDR, 2005)
DBS; TMDB (the PhEDEx transfer DB); TFC (Trivial File Catalogue), which is just an algorithm to do LFN ➝ PFN translation; DAS.
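A minimal sketch of a TFC-style translation follows, assuming ordered (regular expression, result) rules per access protocol; the rules and endpoints below are invented examples, not a real CMS storage configuration.

```python
# Trivial File Catalogue style LFN -> PFN translation: an ordered list of
# (regexp, result) rules per access protocol, applied to the logical file
# name. Rules and hostnames are invented for illustration.
import re

TFC_RULES = {
    "srmv2":  [(r"^/store/(.*)", r"srm://se.example.org:8446/cms/store/\1")],
    "xrootd": [(r"^/store/(.*)", r"root://xrd.example.org//store/\1")],
}

def lfn_to_pfn(lfn, protocol):
    """Return the PFN from the first matching rule, or None if none matches."""
    for pattern, result in TFC_RULES.get(protocol, []):
        if re.match(pattern, lfn):
            return re.sub(pattern, result, lfn)
    return None

print(lfn_to_pfn("/store/data/Run2011A/MinimumBias/RECO/file.root", "xrootd"))
```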

Storage Model Evolution
Motivations: non-optimal use (waste) of disk resources for analysis, with too many replicas of rarely accessed data; more efficient use of network resources, since the network is cheaper and more available than it used to be; more controlled access to MSS (now T1D0 only).
Strategies: remote data access (xrootd, NFS 4.1, WebDAV, ...); dynamic data placement (pre-placement, caching); MSS-disk split. The SRM interface may become redundant.

Remote Data Access

Remote data access concerns
What we should care about:
Tier-1 MSS systems must be protected from tape recalls: publish via xrootd only the files that are on disk; later, provide a tool to pre-stage files in a controlled way that does not impact operations, and possibly add automation to gather demand-overview information and dispatch pre-stage requests.
Central processing should not be impacted: define a threshold for the load that remote access may put on the data servers and throttle access at the xrootd server level.
Data Operations transfers should not be impacted: limit the bandwidth of the xrootd servers.
In general, network usage should be monitored with care and all cases of excessive use or abuse identified.
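The throttling policy mentioned above could look like the following toy admission check; this is only a sketch of the idea, not the actual xrootd throttle mechanism, and all numbers are invented.

```python
# Illustrative throttling policy: remote (WAN) reads are admitted only while
# they stay below a configurable share of the server bandwidth, so that local
# production traffic keeps priority. Numbers are invented.
class RemoteAccessThrottle:
    def __init__(self, server_bandwidth_mbps=10_000, remote_share=0.2):
        self.limit = server_bandwidth_mbps * remote_share   # WAN budget (Mbps)
        self.remote_load = 0.0                               # current WAN load

    def admit(self, requested_mbps):
        """Accept a remote read only if it fits within the WAN budget."""
        if self.remote_load + requested_mbps > self.limit:
            return False          # client should retry later or fail over
        self.remote_load += requested_mbps
        return True

    def release(self, requested_mbps):
        self.remote_load = max(0.0, self.remote_load - requested_mbps)

throttle = RemoteAccessThrottle()
print(throttle.admit(500))    # True: within the 2000 Mbps WAN budget
print(throttle.admit(1800))   # False: would exceed the budget
```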

Dynamic Data Placement
Measure the "popularity" of data and develop algorithms to decide which replicas should be deleted and which datasets should be replicated, and where.
First application: the Data Reduction Agent (ATLAS).
Dynamic Data Placement is still compatible with the data-driven model.
[Plot: number of accesses to datasets.]
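The following is a toy popularity-based decision rule, illustrating the idea rather than any production CMS or ATLAS algorithm; thresholds and dataset names are invented.

```python
# Toy popularity-based placement: frequently accessed datasets gain replicas,
# rarely accessed ones lose them (always keeping at least one copy).
def placement_decisions(access_counts, current_replicas,
                        hot_threshold=100, cold_threshold=5, max_replicas=4):
    """access_counts and current_replicas are dicts keyed by dataset name."""
    to_replicate, to_delete = [], []
    for dataset, accesses in access_counts.items():
        replicas = current_replicas.get(dataset, 1)
        if accesses >= hot_threshold and replicas < max_replicas:
            to_replicate.append(dataset)
        elif accesses <= cold_threshold and replicas > 1:
            to_delete.append(dataset)     # keep at least one (custodial) copy
    return to_replicate, to_delete

counts   = {"/Mu/Run2011A-RECO": 850, "/MinBias/Summer10-GEN": 2}
replicas = {"/Mu/Run2011A-RECO": 2,   "/MinBias/Summer10-GEN": 3}
print(placement_decisions(counts, replicas))
```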

Tier-1 MSS-disk split
Transparent access to data on tape (T1D0) is not as easy as it appeared at the beginning: one has to go through an additional access layer even when the data is on disk.
CMS is able to use T1D0 systems efficiently, but the Data Operations team is asked to trigger organized recalls from tape before starting heavy reprocessing activities at the Tier-1s, and it does not have enough flexibility in deciding when data has to go to tape.
User analysis is not allowed at Tier-1s, to protect the MSS systems from uncontrolled recalls and to avoid adding authorization when accessing the storage from the farm.
The plan is to split MSS and disk at the Tier-1s: jobs can only access the disk (opening the farm to users), and remote access is allowed only to data on disk.

Unsorted thoughts...
Add private data management to the system from the beginning: we are not doing well on the publication of private data.
Start from proper data placement and add remote access mainly for handling exceptions; the bottleneck for remote access could be the security layer on the storage system rather than the network.
Open as many sites as possible to analysis, and set up the storage accordingly (e.g. protect the MSS systems, if any).
Think about whether to keep using tapes for nearline data: an alternative may be, for example, to keep two custodial copies on disk at different sites, which could also be used for processing.

Conclusions

Responsibility-based model
The MONARC regional model is not needed any more. It was justified by network costs and by the complexity of managing user-site relations: Grid technology removed the complexity by providing standard interfaces and single sign-on, and network costs no longer prevent implementing the full mesh.
Still, there are different kinds of sites, distinguished by size, by the kind of services they offer, and by quality of service (including network connectivity).
The structure remains (loosely) hierarchical, but based on responsibilities rather than geography, and more dynamic: responsibilities can move as a function of experiment needs.