Report from WLCG Workshop 2017: WLCG Network Requirements
GDB, CERN, 12th of July 2017
edoardo.martelli@cern.ch

WLCG Network Requirements
The four major LHC experiments were asked to provide their network requirements for the coming years, in terms of:
- bandwidth
- special capabilities
- monitoring

ALICE
Recommendations for Tier-2s:
- WAN: 100 Mbps of WAN bandwidth per 1000 cores
- LAN to local storage: 20 Gbps per 1000 cores
- Better to read data locally: CPU efficiency takes a ~15% penalty per 20 ms of RTT (see the sizing sketch below)
ALICE depends on and strongly encourages:
- full implementation of LHCONE at all sites
- plus the associated tools (such as perfSONAR), properly instrumented for use in the individual Grid frameworks
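As a rough illustration of the sizing arithmetic above, the following Python sketch works out the recommended WAN and LAN provisioning and the remote-read penalty; the 4000-core site and the linear penalty model are assumptions made for illustration, not ALICE figures.

# Sketch of the ALICE Tier-2 sizing arithmetic; the 4000-core site is a
# hypothetical example and the penalty model assumes simple linear scaling.

CORES = 4000                          # hypothetical Tier-2 size

wan_mbps = 100 * CORES / 1000         # 100 Mbps of WAN per 1000 cores
lan_gbps = 20 * CORES / 1000          # 20 Gbps to local storage per 1000 cores

def remote_read_efficiency(local_eff, rtt_ms):
    """Approximate CPU efficiency when reading data remotely:
    ~15% penalty per 20 ms of round-trip time (assumed linear)."""
    return local_eff * (1 - 0.15 * (rtt_ms / 20.0))

print(f"WAN to provision: {wan_mbps:.0f} Mbps")
print(f"LAN to storage:   {lan_gbps:.0f} Gbps")
print(f"CPU efficiency at 40 ms RTT: {remote_read_efficiency(0.85, 40):.2f}")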

ATLAS
ATLAS is moving to a Nucleus and Satellite model. A Nucleus will store primary data and act as a source for data distribution:
- storage capacity > 1 PB
- good network throughput
- site availability > 95% (a simple check of these criteria is sketched below)
ATLAS would like:
- better visibility into networks, to improve the ability to resolve network issues
- to see network traffic data from R&E networks
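For illustration only, a minimal sketch of an eligibility check against the Nucleus criteria above; the Site record and the 10 Gb/s throughput floor are assumptions for this sketch, not ATLAS-defined values.

# Illustrative check of the Nucleus criteria listed above; the storage and
# availability thresholds come from the slide, while the Site record and the
# throughput floor are assumptions made for this sketch.

from dataclasses import dataclass

@dataclass
class Site:
    storage_pb: float       # usable storage in PB
    availability: float     # availability as a fraction over the reference period
    throughput_gbps: float  # sustained WAN throughput (assumed metric)

def qualifies_as_nucleus(site: Site, min_throughput_gbps: float = 10.0) -> bool:
    """True if the site meets the Nucleus criteria from the slide; the
    10 Gb/s throughput floor is a placeholder, not an ATLAS number."""
    return (site.storage_pb > 1.0
            and site.availability > 0.95
            and site.throughput_gbps >= min_throughput_gbps)

print(qualifies_as_nucleus(Site(storage_pb=2.5, availability=0.97, throughput_gbps=20)))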

CMS
Site types and requirements:
- Full Tier-2, with many CPUs and large disk capacity: some 10 Gbit/s or 100 Gbit/s for both LAN and WAN is advisable for sites with several thousand cores
- Disk-rich Tier-2, with more storage than average, perhaps hosting disk for a co-located CPU-only site: a good WAN connection is even more important
CMS typically schedules 10-15% of jobs with remote data access (see the arithmetic sketch below):
- penalty in CPU efficiency: a drop of ~10% for remote access
- but a large gain in flexibility
The computing model is likely to evolve towards scenarios that require fast interconnects. CMS needs to improve its transfer system and scheduling system to better exploit network metrics.
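A back-of-the-envelope view of the numbers above: if 10-15% of jobs read data remotely and those jobs lose about 10% in CPU efficiency, the fleet-wide cost stays small. Pure arithmetic, using no values beyond those on the slide.

# Fleet-wide CPU-efficiency cost of remote reads, using only the fractions
# quoted above (10-15% of jobs remote, ~10% efficiency drop for those jobs).

for remote_fraction in (0.10, 0.15):
    overall_loss = remote_fraction * 0.10   # only remote jobs pay the ~10% penalty
    print(f"{remote_fraction:.0%} remote jobs -> "
          f"~{overall_loss:.1%} overall CPU-efficiency loss")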

LHCb
- The majority of workflows are simulation jobs without input data
- A certain fraction are user jobs with local access to input data
- Working-group productions are currently a small share of the usage, but this will increase
- When there is input data, it is always read by default from the local site
- Output data always goes to different storage areas at T1 sites
Monitoring, including network monitoring, is available from within Dirac:
- for "helper data processing" sites, to check the WAN connectivity to a T1 storage
- for data management, to check WAN quality

Network Throughput WG
The WG has established an infrastructure to monitor and measure networks:
- a proven record of fixing existing network problems and improving transfer efficiency
- a stable production infrastructure
Mid-term evolution topics:
- network capacity planning
- network utilization monitoring, both site-level and WAN
- evolution and integration of monitoring data: new sources, dashboards and a network stream
- network analytics: alerting/notifications and anomaly detection (a toy alerting check is sketched below)
- SDN networking demonstrators and testbeds
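As a toy illustration of the alerting and anomaly-detection topic above, the sketch below flags a site pair when its recent throughput falls well below a historical baseline. It is not the WG's actual tooling and does not use the real perfSONAR interfaces; the samples and threshold are invented for the example.

# Toy throughput alert: raise a flag when recent measurements drop well below
# the historical average for a site pair.  Illustration only; not the WG's
# production alerting and not based on real perfSONAR APIs.

from statistics import mean

def throughput_alert(history_gbps, recent_gbps, drop_threshold=0.5):
    """Return True if recent throughput is below drop_threshold times the
    historical average (simplistic baseline, assumed for this sketch)."""
    return mean(recent_gbps) < drop_threshold * mean(history_gbps)

history = [8.9, 9.2, 9.0, 9.4, 8.8]   # made-up Gbps samples for one site pair
recent  = [3.1, 2.8, 3.3]
if throughput_alert(history, recent):
    print("ALERT: sustained throughput drop on this site pair")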

GEANT
GEANT is investigating ways to increase its ability to differentiate services, break its dependence on vendors, and provide capabilities to automate and orchestrate multi-domain services. It has asked the experiments whether they are interested in participating.

From the discussions
Network monitoring needs to be maintained, monitored and developed; effort is required to:
- respond to GGUS tickets
- track service and metric status, and follow up on down services, misconfigurations, etc.
- evolve the system, implementing new user interfaces, analytics, alerting and new measurements
- coordinate network activities with others
Some interest was expressed in network programmability, especially in light of foreseen commercially driven network developments.

References
Presentations available here: https://indico.cern.ch/event/609911/sessions/238341/#20170620

Questions? edoardo.martelli@cern.ch