Integrating Network Awareness in ATLAS Distributed Computing Using the ANSE Project
J. Batista, K. De, A. Klimentov, S. McKee, A. Petrosyan, for the ATLAS Collaboration and the Big PanDA team
University of Michigan, University of Texas at Arlington, Brookhaven National Laboratory, Joint Institute for Nuclear Research (Dubna)
CHEP 2015, Okinawa, Japan, April 13-17, 2015

The ATLAS Experiment at the LHC

Distributed Computing in ATLAS
- ATLAS Computing Model: 11 clouds = 10 Tier-1s (T1) + 1 Tier-0 (CERN); a cloud = T1 + T2s + T2Ds, where a T2D is a multi-cloud T2 site; 2-16 T2s in each cloud
- Basic unit of work is a job: executed on a CPU resource/slot, may have inputs, produces outputs
- JEDI: a layer above PanDA that creates jobs from ATLAS physics and analysis 'tasks'
- Current scale: one million jobs per day
- Workload Management System brokerage: Task → Cloud (task brokerage), Jobs → Sites (job brokerage)
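A toy Python sketch of the two brokerage levels mentioned above (task → cloud, then job → site). The cloud names, slot counts, and scoring rules are invented for illustration and are far simpler than the real PanDA/JEDI brokerage.

```python
# Toy illustration of two-level brokerage: task -> cloud, then job -> site.
# All names and the scoring are illustrative assumptions, not PanDA code.
from dataclasses import dataclass, field

@dataclass
class Cloud:
    tier1: str
    tier2s: list[str] = field(default_factory=list)
    free_slots: int = 0

CLOUDS = {
    "US": Cloud(tier1="BNL", tier2s=["AGLT2", "MWT2", "SWT2"], free_slots=12000),
    "DE": Cloud(tier1="FZK", tier2s=["DESY-HH", "LRZ"], free_slots=4000),
}

def broker_task_to_cloud(clouds: dict[str, Cloud]) -> str:
    """Task brokerage: pick the cloud with the most free CPU slots."""
    return max(clouds, key=lambda name: clouds[name].free_slots)

def broker_job_to_site(cloud: Cloud) -> str:
    """Job brokerage: pick a site inside the chosen cloud (naive choice here)."""
    return (cloud.tier2s or [cloud.tier1])[0]

cloud_name = broker_task_to_cloud(CLOUDS)
print(cloud_name, broker_job_to_site(CLOUDS[cloud_name]))
```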

PanDA Overview
- The PanDA workload management system was developed for the ATLAS experiment at the Large Hadron Collider
  - Hundreds of petabytes of data per year, thousands of users worldwide, many dozens of complex applications...
  - Leading to >400 scientific publications, and growing daily
  - Discovery of the Higgs boson, search for dark matter...
- A new approach to distributed computing
  - A huge hierarchy of computing centers working together
  - Main challenge: how to provide efficient automated performance
  - Auxiliary challenge: make resources easily accessible to all users
- Large-scale networking enables systems like PanDA
- Related CHEP talks and posters:
  - T. Maeno: The Future of PanDA in ATLAS Distributed Computing; CHEP ID 144
  - S. Panitkin: Integration of PanDA Workload Management System with Titan supercomputer at OLCF; CHEP ID 152
  - M. Golosova: Studies of Big Data meta-data segmentation between relational and non-relational databases; CHEP ID 115
  - Poster: Scaling up ATLAS production system for the LHC Run 2 and beyond: project ProdSys2
  - Poster: Integration of Russian Tier-1 Grid Center with High Performance Computers at NRC-KI for LHC experiments and beyond HENP; CHEP ID 100
- Reference:

Core Ideas in PanDA
- Make hundreds of distributed sites appear as local
  - Provide a central queue for users, similar to local batch systems
  - Reduce site-related errors and reduce latency
- Build a pilot job system: late transfer of user payloads
  - Crucial for distributed infrastructure maintained by local experts
- Hide middleware while supporting diversity and evolution
  - PanDA interacts with middleware; users see the high-level workflow
- Hide variations in infrastructure
  - PanDA presents uniform 'job' slots to users (with minimal sub-types)
  - Easy to integrate grid sites, clouds, HPC sites...
- Production and analysis users see the same PanDA system
  - Same set of distributed resources available to all users
  - Highly flexible: instantaneous control of global priorities by the experiment
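A minimal sketch of the pilot late-binding idea described above: the user payload is fetched from the central queue only once a CPU slot is actually free. The server URL, request fields, and response format are assumptions for illustration, not the real PanDA server API.

```python
# Minimal pilot sketch: ask the central queue for work only when this slot is free.
# Endpoint, field names, and payload format are illustrative assumptions.
import subprocess
import requests

PANDA_SERVER = "https://panda.example.org/server/getJob"  # hypothetical endpoint

def run_pilot(site_name: str) -> None:
    """Pull one job from the central queue and execute its payload locally."""
    resp = requests.post(PANDA_SERVER, data={"site": site_name}, timeout=60)
    job = resp.json()
    if not job.get("jobSpec"):
        return  # no work available; no payload was shipped or wasted
    # Late transfer of the user payload: fetched and executed only now.
    subprocess.run(job["jobSpec"]["transformation"], shell=True, check=False)

if __name__ == "__main__":
    run_pilot("ANALY_EXAMPLE_SITE")
```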

Resources Accessible via PanDA
About 150,000 job slots used continuously, 24x7x365

PanDA Scale
- Current scale: 25M jobs completed every month at more than a hundred sites
- First exascale system in HEP: 1.2 exabytes processed in 2013

The Growing PanDA Ecosystem
- ATLAS PanDA core: US ATLAS, CERN, UK, DE, ND, CA, Russia, OSG...
- ASCR/HEP Big PanDA: DOE-funded project at BNL and UTA; PanDA beyond HEP, at LCF
- CC-NIE ANSE PanDA: NSF-funded network project (Caltech, Michigan, Vanderbilt, UTA)
- HPC and cloud PanDA: very active
- Taiwan PanDA: AMS and other communities
- Russian NRC KI PanDA, JINR PanDA: new communities
- AliEn PanDA, LSST PanDA, other experiments
- MegaPanDA (COMPASS, ALICE, NICA): project funded by the RF Ministry of Science and Education

PanDA Evolution: "Big PanDA"
- Generalization of PanDA for HEP and other data-intensive sciences
- Four main tracks:
  - Factorizing the core: factorizing the core components of PanDA to enable adoption by a wide range of exascale scientific communities
  - Extending the scope: evolving PanDA to support extreme-scale computing clouds and Leadership Computing Facilities
  - Leveraging intelligent networks: integrating network services and real-time data access into the PanDA workflow
  - Usability and monitoring: real-time monitoring and visualization package for PanDA

ATLAS Distributed Computing: Data Transfer Volume during Run 1
[Plots: data transfer volume in TB (weekly average, through Aug 2013) and CERN-to-BNL data transfer time; on average 3.7 h to export data from CERN]

PanDA and Networking
- PanDA as workload manager
  - PanDA automatically chooses the job execution site
  - Multi-level decision tree: task brokerage, job brokerage, dispatcher
  - Also predictive workflows, like PD2P (PanDA Dynamic Data Placement)
  - Site selection is based on processing and storage requirements
  - Why not use network information in this decision?
  - Can we go even further: network provisioning?
  - Network knowledge is useful for all phases of the job cycle
- Network as a resource
  - Optimal site selection should take network capability into account
  - We do this already, but indirectly, using job completion metrics
  - Network as a resource should be managed (i.e. provisioning)
  - We also do this crudely, mostly through timeouts and self-throttling
- Goal for PanDA
  - Direct integration of networking with the PanDA workflow; never attempted before for large-scale automated WMS systems
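As a rough illustration of where network information could enter the existing decision tree, the sketch below damps a conventional CPU/storage-based site weight by a network cost looked up for the site pair. The function names, cost table, and weighting formula are assumptions for illustration, not PanDA's actual brokerage code.

```python
# Conceptual sketch: combine the existing brokerage weight with a network cost.
# Values are invented; a real cost table would come from measured transfer data.
NETWORK_COST = {            # e.g. seconds/GB between data source and site
    ("BNL", "AGLT2"): 1.2,
    ("BNL", "SWT2"): 2.5,
}

def cpu_storage_weight(site: str) -> float:
    """Stand-in for the existing weight from free slots, data location, release."""
    return {"AGLT2": 0.8, "SWT2": 0.9}.get(site, 0.1)

def weight_with_network(site: str, data_source: str) -> float:
    """Existing weight, damped by how expensive it is to move the input data."""
    cost = NETWORK_COST.get((data_source, site), 10.0)  # pessimistic default
    return cpu_storage_weight(site) / (1.0 + cost)

best = max(["AGLT2", "SWT2"], key=lambda s: weight_with_network(s, "BNL"))
print(best)
```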

Sources of Network Information
- Distributed Data Management (DDM) sonar measurements
  - Actual transfer rates for files between all sites
  - This information is normally used for site white/blacklisting
  - Measurements available for small, medium, and large files
- perfSONAR (PS) measurements
  - perfSONAR provides dedicated network monitoring data
  - All WLCG sites are being instrumented with PS boxes
- Federated XRootD (FAX) measurements
  - Read time for remote files is measured for pairs of sites
  - Using standard PanDA test jobs (HammerCloud jobs)
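A small sketch of how the three measurement sources listed above might be folded into a single cost per site pair. The weighting scheme, parameter names, and the treatment of missing data are assumptions for illustration only.

```python
# Merge DDM sonar rates, perfSONAR throughput, and FAX remote-read times into
# one per-site-pair cost (lower is better). Weighting is an illustrative choice.
from typing import Optional

def combined_network_cost(sonar_mbps: Optional[float],
                          perfsonar_mbps: Optional[float],
                          fax_read_s: Optional[float]) -> float:
    """Average the available cost terms; missing measurements contribute nothing."""
    cost, n = 0.0, 0
    if sonar_mbps:        # actual file-transfer rate between the two sites
        cost += 1000.0 / sonar_mbps; n += 1
    if perfsonar_mbps:    # dedicated link monitoring
        cost += 1000.0 / perfsonar_mbps; n += 1
    if fax_read_s:        # HammerCloud remote-read time over federated XRootD
        cost += fax_read_s; n += 1
    return cost / n if n else float("inf")

print(combined_network_cost(800.0, 950.0, 2.4))
```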

Network Data Monitoring: Sonar

Network Measurements: perfSONAR SSB

Example: Faster User Analysis
- First use case for network integration with PanDA
- Goal: reduce waiting time for user jobs
  - User analysis jobs normally go to sites with local input data
  - This can occasionally lead to long wait times (jobs are re-brokered if possible, or PD2P data caching eventually makes more copies to reduce congestion)
  - Meanwhile, nearby sites with good network access may be idle
- Brokerage uses the concept of 'nearby' sites
  - Uses a cost metric generated with HammerCloud tests
  - Calculates a weight based on the usual brokerage criteria (availability of CPU resources, data location, release...) plus the new network transfer cost
  - Jobs are sent to the site with the best overall weight
  - Throttling is used to manage the load on the network
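The brokerage change described above can be illustrated with a hedged sketch: sites without local input data are still considered, weighted down by a HammerCloud-derived network cost, and the number of remote-read jobs per site is throttled. The threshold, field names, and the exact formula are illustrative assumptions, not the production algorithm.

```python
# Sketch of a brokerage weight that includes a network transfer cost and
# throttles remote reads. Numbers and names are illustrative assumptions.
MAX_REMOTE_JOBS_PER_SITE = 200  # assumed throttle to protect the network

def brokerage_weight(site: dict, network_cost: float, remote_jobs_running: int) -> float:
    if not site["has_input_locally"]:
        if remote_jobs_running >= MAX_REMOTE_JOBS_PER_SITE:
            return 0.0                      # throttled: no more remote reads here
        return site["base_weight"] / (1.0 + network_cost)
    return site["base_weight"]              # local data: behaviour unchanged

candidates = [
    {"name": "LOCAL_SITE",  "base_weight": 0.4, "has_input_locally": True},
    {"name": "NEARBY_SITE", "base_weight": 0.9, "has_input_locally": False},
]
best = max(candidates,
           key=lambda s: brokerage_weight(s, network_cost=0.3, remote_jobs_running=50))
print(best["name"])  # a nearby idle site can now win despite remote input data
```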

Jobs Using FAX for Remote Access
- About 4% of jobs access data remotely, through federated XRootD; the remote site is selected based on network performance
- Data for October 2014: of 14,559,951 completed jobs, 3.6% (522,185) used FAX
- Breakdown of the FAX jobs: USA 92.66%, France 4.83%, Italy 1.07%

Daily Remote Access Rates
[Plot: daily FAX remote-access rates for October 2014, broken down by country (USA, France, Italy, Canada, Germany)]

Jobs from Task

Job Wait Times for Example Task

Cloud Selection
- Second use case for network integration with PanDA
- Optimize the choice of T1-T2 pairings (cloud selection)
  - In ATLAS, production tasks are assigned to Tier-1s
  - Tier-2s are attached to a Tier-1 cloud for data processing
  - Any T2 may be attached to multiple T1s
  - Currently, the operations team makes this assignment manually
  - Goal: automate this using network information
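A sketch of what the automated pairing could look like: rank Tier-1s by measured network cost from a given Tier-2 and keep the best few (compare the "5 best T1s for the SouthWest T2" example on the next slide). The cost values below are made up for illustration.

```python
# Rank Tier-1s by network cost from a given Tier-2 and keep the best n.
# Cost values are invented; real ones would come from sonar/perfSONAR/FAX data.
def best_tier1s(t1_costs: dict[str, float], n: int = 5) -> list[str]:
    """Return the n Tier-1s with the lowest network cost to this Tier-2."""
    return sorted(t1_costs, key=t1_costs.get)[:n]

costs_from_swt2 = {"BNL": 1.1, "TRIUMF": 2.0, "FZK": 3.4, "RAL": 3.0,
                   "CCIN2P3": 3.2, "CNAF": 3.8, "PIC": 4.1}
print(best_tier1s(costs_from_swt2))  # the 5 cheapest T1s for this T2
```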

Example of Cloud Selection: the 5 best T1s for the SouthWest T2 in TX, USA

Summary
- Knowledge of network conditions, both historical and current, is used to optimize PanDA and other ATLAS systems; giving the WMS control of end-to-end network paths can further augment ATLAS's ability to use its distributed resources effectively
- Optimal workload management system design should take network capability into account
- The network, as a resource, should be managed (i.e. provisioned)
- Many parts of distributed workload management systems like PanDA can benefit from better integration with networking
- Initial results look promising
- Next comes SDN/provisioning
- We are on the path to using the network as a resource, on a similar footing as CPU and storage

Backup slides
