WLCG Operations Coordination
Andrea Sciabà, IT/SDC
10th July 2013

Outline
- Status of task forces
- News from the WLCG Operations Planning meeting
- Experiment plans
- New activities
- Conclusions

Middleware
- EMI-3 WN and UI already installed for testing at a few sites but not yet recommended
  - Issue with VOMS client (fix available in EMI-3)
  - Validation ongoing
  - Under test at Liverpool and DESY
- Baseline version in EMI-3 for some services
  - BDII_top, L&B, StoRM, WMS

SL6 migration
- Deployment status
  - Tier1s done: 6/15 (ALICE 4/9, ATLAS 4/12, CMS 2/9, LHCb 3/8)
    - 3 of the "not done" are in progress (PIC will finish this week)
  - Tier2s done: 30/124 (ALICE 7/39, ATLAS 13/86, CMS 18/61, LHCb 8/45)
- The CVMFS issue with SL6 reported at the previous meeting was fixed by a kernel update
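
The deployment counts above are easier to compare as percentages. The following is a minimal sketch that only restates the slide's numbers; the names `tier1`, `tier2` and `completion` are illustrative and not part of any WLCG tool. The per-experiment counts add up to more than the totals, presumably because many sites support more than one experiment.

```python
# (done, total) site counts for the SL6 migration, copied from the slide.
tier1 = {"All": (6, 15), "ALICE": (4, 9), "ATLAS": (4, 12), "CMS": (2, 9), "LHCb": (3, 8)}
tier2 = {"All": (30, 124), "ALICE": (7, 39), "ATLAS": (13, 86), "CMS": (18, 61), "LHCb": (8, 45)}


def completion(counts):
    """Return the migrated fraction, in percent, for each (done, total) pair."""
    return {name: round(100 * done / total, 1) for name, (done, total) in counts.items()}


# Print one line per tier and experiment, e.g. "Tier-1 All: 40.0%".
for label, counts in (("Tier-1", tier1), ("Tier-2", tier2)):
    for name, pct in completion(counts).items():
        print(f"{label} {name}: {pct}%")
```

Overall, 40.0% of Tier-1 sites and about 24.2% of Tier-2 sites had completed the migration at the time of the report.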

gLExec
- The tentative deadline for enabling gLExec at sites was October 1st
  - The actual deadline will likely be coupled to the timeline for the WN migration to SL6
- ~100 tickets already opened to sites (~10 already solved and verified)
  - USCMS ~100% OK, USATLAS no plans yet
- Deployment status tracked

SHA-2
- New CERN CA certificate available in IGTF
- DIRAC services ready for SHA-2
- Several ATLAS services tested and ready (AGIS, PanDA, DDM, …)
- EGI just started to run SAM tests for SHA-2 compliance of site services
- Migration to VOMS-Admin to be carefully planned
  - VO managers will need time to learn

CVMFS
- Deployment for ALICE has begun
  - GGUS tickets sent to ALICE sites, some already closed
- Issue with the latest version (2.1.11); a fix will be released soon: sites should install it when available and skip …
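
A site could check its installed client against the release named above with a few lines of code. This is a minimal sketch under stated assumptions: it treats 2.1.11 (the version the slide reports as having an issue) as the only release to avoid, and the helper names `parse_version` and `is_affected` are hypothetical. The version number of the fixed release is not given in the slides, so it is not encoded here.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.1.11' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Assumption: 2.1.11 is the only affected release (the one named on the slide).
AFFECTED = {parse_version("2.1.11")}


def is_affected(installed):
    """Return True if the installed client version is a known-affected release."""
    return parse_version(installed) in AFFECTED
```

Parsing into integer tuples rather than comparing strings avoids ordering mistakes such as "2.1.9" sorting after "2.1.11".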

FTS-3
- RAL server production-ready, CERN very soon
  - Pilot services also at ASGC, BNL
  - BNL, KIT, CNAF will deploy production servers
  - Under discussion at PIC, IN2P3-CC
- Next milestones
  - July: migrate some production transfers to FTS-3 at CERN and RAL in "FTS-2-like mode"
  - August: gain experience and include other servers

xrootd
- Both AAA and FAX have ~40 sites each
  - Not all of them produce monitoring information
- Almost all needed plugins are in the WLCG repository
  - The dCache one is missing and needs some finishing touches
- Request to register all xrootd endpoints and redirectors in GOCDB/OIM
  - This allows declaring downtimes, running ad-hoc SAM tests, etc.
- Need to solve an issue with DPM
  - Only local traffic is monitored

Tracking tools evolution
- Savannah-to-JIRA migration status
  - Instructions updated
- GGUS tracker transition status updated
  - Further development will wait for the upgrade to JIRA 6 this month
- Savannah-to-GGUS bridge for CMS being moved to GGUS-only
  - Progress tracked
- Today the new GGUS SU for "Grid monitoring" will be created
  - Will eventually supersede the Dashboard and SAM SUs

perfSONAR
- Version 3.3 released, will be deployed in the next three months on WLCG
  - Sites are strongly encouraged to upgrade to or install this version
  - Sites that have not already done so should publish their instances in GOCDB/OIM
- Testing the new modular Dashboard, including the API

ALICE plans
- ALICE increasingly committed to CVMFS
  - AliEn being developed and tested for it
- Working on rationalising SAM tests
  - Import results from MonALISA
  - Xrootd, VOBOX

ATLAS plans
- Residual need for shared area soon to be eliminated
- Simulation validated for multicore
  - Sites encouraged to deploy more queues
- All sites should deploy perfSONAR
- All sites should provide WebDAV access for storage management operations (or discuss an alternative with ATLAS) by September
- Widely use xrootd for WAN and LAN data access after summer
  - Main use cases for FAX are fail-over for local access failures and breaking jobs-to-data locality
- Russian Proto-T1 is contributing to production (but no tape yet)
- Migrate ATLAS central services to OpenStack VMs with SL6 during 2014
- Start stress testing RUCIO in July and release first official version of JEDI by end of summer

CMS plans
- Multicore
  - Pilots can now run several single-threaded CMSSW processes
  - Commission multi-threaded CMSSW by end 2013
- CRAB3-PanDA integration open to beta testers
- Xrootd federation
  - Integrate > 90% of sites by autumn
- Disk-tape separation
  - Start testing in autumn
- Opportunistic resources
  - Non-CMS sites, clouds, HPCs, via Parrot and CVMFS
- Interest in using grid.cern.ch for Grid clients

LHCb plans
- Introducing Tier2Ds (with disk storage for analysis jobs)
- Looking into using perfSONAR data as a quality metric and to choose Tier-2s for reprocessing campaigns
- Start working on algorithms to take decisions based on data popularity metrics
- Enhance SAM tests by publishing information from DIRAC
  - Align strategy with the WLCG monitoring consolidation project
- By end 2013, new software releases only for SL6, to use C++11 features
  - T1 sites should provide SL6 resources

New activities
- Just started collaborating with the HEPiX IPv6 working group on WLCG application testing
- Contribute the site perspective to the new WLCG Monitoring Consolidation project
  - All monitoring experts from sites welcome to contribute via mailing list
  - Pepe Flix will represent the OCCT in the project
- New task force on Job/Machine Features just launched
  - Coordinated by Stefan Roiser

Conclusions
- Steady progress for all task forces
  - Last quarter of 2013 as target date for many of them
- Experiment plans focus on common topics
  - CVMFS adoption
  - Monitoring (SAM, perfSONAR)
  - New data management
    - Tools, storage federations, protocols, etc.
  - Multicore in production
  - Virtualisation, clouds, opportunistic resources