WLCG IPv6 deployment strategy


WLCG IPv6 deployment strategy
Alastair Dewhurst, Andrea Sciabà
on behalf of the HEPiX IPv6 WG
07/07/2016

Introduction
- The IPv6 requirements of the experiments have been discussed at the last GDB
- A document has been prepared and is almost finalised
  - To be submitted to the July MB
- A summary is presented here

Motivations
- Exhaustion of the IPv4 address space
  - A real problem at some smaller sites
- Some cloud providers may offer IPv6-only resources, or charge less for them
- Opportunistic resources may be IPv6-only

Goals
- Eventually, replace IPv4 with IPv6 (for the whole Internet!)
- Allow WLCG sites to deploy IPv6-only CPUs
- Provide an upgrade path that sites can plan around
- Ideally, minimise the need for dual-stack with respect to IPv6-only, given the constraints
- Consequently, make all VO and Grid services contacted by WNs dual-stack:
  - Storage
  - Central services

Is the middleware ready?
- In short, yes
- HTCondor fully supports IPv6
- The CEs do: CREAM, ARC-CE, HTCondor-CE
  - But there are still some problems with the CREAM client, which is critical for LHCb and ALICE
- xrootd 4, GridFTP and HTTP(S) do
  - Xrootd storage and redirection, GridFTP-only storage
- Most storage systems (dCache, DPM, StoRM, …) do
  - CASTOR does not (and never will), but a "gateway" service via xrootd/GridFTP could be provided if needed
  - EOS does, in the versions based on xrootd 4
- All central Grid services do
  - MyProxy, VOMS, FTS, CVMFS, BDII, Frontier, …
  - Although they are often not yet deployed in dual-stack
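
In practice, "supporting IPv6" mostly means that a client or service resolves names with getaddrinfo() and tries every returned address, instead of hard-coding IPv4 sockets. The sketch below illustrates that pattern in Python; it is not taken from any WLCG middleware, and the endpoint name is a placeholder.

    #!/usr/bin/env python
    # Illustration of the protocol-agnostic connect pattern that IPv6 readiness
    # usually boils down to: resolve with getaddrinfo() and try each address,
    # instead of the legacy IPv4-only gethostbyname()/AF_INET code path.
    import socket

    def connect_any(host, port, timeout=10):
        """Open a TCP connection over whichever protocol (IPv6 or IPv4) works first."""
        last_error = None
        # AF_UNSPEC returns both AAAA- and A-derived addresses, normally IPv6 first
        for family, socktype, proto, _, addr in socket.getaddrinfo(
                host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
            s = socket.socket(family, socktype, proto)
            s.settimeout(timeout)
            try:
                s.connect(addr)
                return s                  # the caller is responsible for closing it
            except socket.error as err:
                last_error = err
                s.close()
        raise last_error or socket.error("no usable address for %s" % host)

    if __name__ == "__main__":
        # Placeholder endpoint; replace with a real dual-stack service to test
        conn = connect_any("dual-stack-service.example.org", 443)
        print("connected via %s" % ("IPv6" if conn.family == socket.AF_INET6 else "IPv4"))
        conn.close()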

What does this mean for a site?
- IPv6-only WNs, if needed
  - Dual-stack WNs would not address the address exhaustion issue!
- IPv6-enabled CEs
  - IPv4 is irrelevant if the VO workload management supports IPv6
- Dual-stack storage
  - To support all CPU resources doing remote access and transfers to/from all sites
- IPv6-enabled Squid caches
  - IPv4 is desirable as long as not all Stratum-1s support IPv6
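
To see concretely what an IPv6-only WN implies, a site can check that every service its jobs contact is reachable without IPv4, i.e. that each name has an AAAA record and accepts connections over IPv6. A minimal sketch follows, assuming Python is available on the WN; the hostnames and ports are placeholders for a site's own storage, Squid and Frontier endpoints.

    #!/usr/bin/env python
    # Rough check of what an IPv6-only WN would see: each service a job needs
    # must have an AAAA record and accept a TCP connection over IPv6 only.
    import socket

    WN_SERVICES = [
        ("se.example-site.org", 1094),     # xrootd door on the local SE (placeholder)
        ("squid.example-site.org", 3128),  # local Squid cache (placeholder)
        ("frontier.example.org", 8000),    # a Frontier launchpad (placeholder)
    ]

    def reachable_over_ipv6(host, port, timeout=5):
        try:
            infos = socket.getaddrinfo(host, port, socket.AF_INET6, socket.SOCK_STREAM)
        except socket.gaierror:
            return False                   # no AAAA record: invisible to an IPv6-only WN
        for family, socktype, proto, _, addr in infos:
            s = socket.socket(family, socktype, proto)
            s.settimeout(timeout)
            try:
                s.connect(addr)
                return True
            except socket.error:
                continue
            finally:
                s.close()
        return False

    for host, port in WN_SERVICES:
        ok = reachable_over_ipv6(host, port)
        print("%-35s %s" % ("%s:%d" % (host, port),
                            "OK over IPv6" if ok else "NOT reachable over IPv6"))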

Typical migration path for a site
- Deploy IPv6 on the network infrastructure
- Deploy a dual-stack perfSONAR instance
- Make the storage dual-stack
  - Applies to all sites, so that others can remotely access their data via IPv6
- Make all local services dual-stack
  - In some cases (e.g. the batch system) this may not be possible, because only IPv4 is supported
  - Eventually it should no longer be necessary, once IPv4 can be decommissioned
- Make all WNs IPv6-only
  - Allow for a "grace period" during which IPv4 is kept as a backup
  - Use IPv4 private addresses if needed (e.g. by the batch system)

Role of Tier-0/1 sites
- Extremely important, as they provide access to, and distribute, a lot of data
- They run several central Grid and VO services
  - FTS, Frontier, Stratum-0/1, VOMS, MyProxy, WM/DM services, ETF, Dashboard monitoring, …
- It is therefore critical to make these services dual-stack
  - Even if the availability of IPv4 addresses is not an issue at the site
- Storage performance and reliability should not depend on the IP protocol version used
- Proposed targets:
  - At least 1 Gb/s and 90% availability by April 2017
  - At least 10 Gb/s and 95% availability by April 2018
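
As a rough reading of those targets (illustrative arithmetic only, assuming the availability is measured over a full year and the rate is sustained): 90% availability allows roughly 36 days of IPv6 downtime per year, and 1 Gb/s sustained corresponds to about 10.8 TB moved per day.

    # Back-of-envelope numbers behind the proposed Tier-1 targets (illustrative only)
    SECONDS_PER_DAY = 86400
    DAYS_PER_YEAR = 365

    for availability, rate_gbps, label in [(0.90, 1, "April 2017 target"),
                                           (0.95, 10, "April 2018 target")]:
        downtime_days = (1 - availability) * DAYS_PER_YEAR
        tb_per_day = rate_gbps * 1e9 / 8 * SECONDS_PER_DAY / 1e12   # 1 Gb/s ~ 10.8 TB/day
        print("%s: <= %.1f days/year of downtime, >= %.1f TB/day over IPv6"
              % (label, downtime_days, tb_per_day))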

Tier-1 status and plans

ALICE status
- The ALICE central services have been running in dual-stack for more than a year
- Storage is fully federated
  - Any site can access data from any site
- To support IPv6-only resources, all data must be available on some IPv6-enabled storage
  - All sites must run xrootd 4
- Current situation: 1/3 of the sites still use xrootd 3 and only 5% of the SEs run in dual-stack

ATLAS status
- ATLAS WM and DM tools (PanDA, Rucio, …) are based on HTTP and are tested on IPv6
- WNs interact via HTTP with PanDA, Rucio, CVMFS and Frontier
- Current situation and plans:
  - Pilot factories are dual-stack
  - Complete migration of services to dual-stack by next April:
    - Production PanDA server nodes
    - Rucio Authentication and Production nodes
    - Frontier servers at CERN, IN2P3, RAL and TRIUMF
    - ARC Control Tower

CMS status
- WM is based on glideinWMS and HTCondor, both validated on IPv6
- Storage is federated via xrootd
- DM is based on PhEDEx, which uses the Oracle client to communicate with the central service
  - WNs do not interact with it
- Other central services are hosted on the cmsweb cluster, which works in dual-stack
- Current situation and plans:
  - glideinWMS frontends and factories are dual-stack
  - Only ~11 sites support IPv6
  - No problems seen in monitoring and computing operations
  - CMS has to enable dual-stack on all collectors and schedds by April 2017
  - Sites have to enable dual-stack on their storage by the end of Run 2
  - Sites are advised to keep IPv4 as a backup until the end of Run 2
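
One way the collector/schedd part of this could be tracked is via the HTCondor Python bindings, by checking whether each schedd advertises an IPv6 address in its ClassAd. The sketch below is an assumption-laden illustration rather than a CMS tool: the collector hostname is a placeholder, and the attributes available may vary between HTCondor versions.

    #!/usr/bin/env python
    # Sketch: list the schedds known to a pool collector and report whether each
    # one advertises an IPv6 address (IPv6 addresses appear in square brackets in
    # the "sinful string", e.g. "<[2001:db8::1]:9618?...>").
    import htcondor

    coll = htcondor.Collector("collector.example.org")   # placeholder pool collector
    for ad in coll.query(htcondor.AdTypes.Schedd, "true", ["Name", "MyAddress"]):
        addr = ad.get("MyAddress", "")
        has_v6 = "[" in addr
        print("%-40s %s" % (ad.get("Name", "?"),
                            "advertises IPv6" if has_v6 else "IPv4 only"))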

LHCb status
- WM/DM is based on DIRAC, which supports IPv6
- WNs need to access the DIRAC central services, the storage and, optionally, one of six VO boxes at Tier-1s
- Current situation and plans:
  - Only one Tier-1 and one Tier-2D storage system support LHCb in a dual-stack configuration
  - Central services are being moved to dual-stack machines

Proposed timeline
- By April 1st 2017:
  - Sites can provide IPv6-only CPUs if necessary
  - Tier-1s must provide dual-stack storage access with sufficient performance and reliability
    - At least in a testbed setup
  - The Stratum-1 service at CERN must be dual-stack
  - A dedicated ETF infrastructure to test IPv6 services must be available
  - ATLAS and CMS must deploy all services interacting with WNs in dual-stack
  - All of the above, without disrupting normal WLCG operations
- By April 1st 2018:
  - Tier-1s must provide dual-stack storage access in production with increased performance and reliability
  - Tier-1s must upgrade their Stratum-1 and FTS to dual-stack
  - The official ETF infrastructure must be migrated to dual-stack
  - GOCDB, OIM, GGUS and BDII should be dual-stack
- By the end of Run 2:
  - A large number of sites will have migrated their storage to IPv6
  - The recommendation to keep IPv4 as a backup will be dropped

Executive Summary
- Sites can provide IPv6-only CPU resources from April 2017 onwards, if necessary
- Sites can provide IPv6-only interfaces to their CPU resources, if necessary
- The VO infrastructure (e.g. the central services provided by the VOs) must provide an equal quality of service to both IPv4 and IPv6 resources
- Any site wishing to deploy IPv6-only CPU should contact the HEPiX IPv6 working group to discuss detailed plans
- By April 2018 it should be possible to deploy IPv6-only CPU with relative ease
- By the end of Run 2, enough sites should have upgraded their storage to dual-stack to allow almost complete data availability over IPv6