Plans for Service Challenge 3
Ian Bird, LHCC Referees Meeting, 27th June 2005

LCG Deployment Schedule

Service Challenge 3 - Phases

High-level view:
- Setup phase
  - Finishes with a 2-week sustained throughput test in July 2005
  - Primary goals: 150 MB/s disk-disk to Tier-1s; 60 MB/s disk (T0) to tape (T1s) (rough daily volumes are sketched below)
  - Secondary goals: include a few named T2 sites (T2 -> T1 transfers); encourage remaining T1s to start disk-disk transfers
- Service phase (must be run as the real production service)
  - September to end of 2005
  - Start with ALICE & CMS; add ATLAS and LHCb in October/November
  - All offline use cases except for analysis
  - More components: WMS, VOMS, catalogues, experiment-specific solutions
  - Implies a production setup (CE, SE, ...)
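For context, here is the arithmetic converting those sustained-rate targets into daily data volumes; this conversion is an illustrative addition, not part of the original slide.

```python
# Rough daily volumes implied by the SC3 throughput targets (illustrative arithmetic only).
SECONDS_PER_DAY = 86_400

disk_disk_mb_per_s = 150   # T0 -> T1 disk-disk target
disk_tape_mb_per_s = 60    # T0 disk -> T1 tape target

print(f"disk-disk: {disk_disk_mb_per_s * SECONDS_PER_DAY / 1e6:.1f} TB/day")  # ~13.0 TB/day
print(f"disk-tape: {disk_tape_mb_per_s * SECONDS_PER_DAY / 1e6:.1f} TB/day")  # ~5.2 TB/day
```

Over the two-week sustained test, that is on the order of 180 TB (disk-disk) and 70 TB (disk-tape) for a single stream held at those rates.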

SC3 - Deadlines and Deliverables
- May 31st 2005: basic components delivered and in place
- June 2005: integration testing
- June 13-15: SC3 planning workshop at CERN, covering experiment issues
- June 30th 2005: integration testing successfully completed
- July 1-10: start disk-disk throughput tests (assume a number of false starts / difficulties)
- July 11-20: disk tests
- July 21-27: tape tests
- July 28-31: T2 tests

Service Challenge Workshop
- Three-day meeting (13-15 June)
- First two days: presentations from the experiments, with half a day per experiment to cover:
  - Summary of Grid Data Challenges to date
  - Goals for SC3
  - Plans for usage of the SC3 infrastructure
- Third day focused on issues for the Tier-1 sites
  - Discussion focused on issues raised during the previous two days
  - SRM requirements presentations from experiments and developers
- Approximately 40 people for the first two days and 60 for the last day
  - Many CERN IT people appearing for the last day
  - Not all sites were present during the first two days; those that were present were very quiet!

Experiment Goals and Plans
- All four experiments plan to be involved in SC3
- Brief one-line summaries:
  - LHCb will evaluate the new tools via the pilot and run a data management challenge in September; assuming that goes well, they will want to use a service from October
  - ALICE will also evaluate the new tools, but want to run a full data challenge based on this infrastructure as soon as possible
  - CMS will use the resources to run two challenges, in September and November, with modest throughput; these include T0-T1-T2 data movement and T2-T1 movement for Monte Carlo data
  - ATLAS plan to run a Tier-0 exercise in October, along with MC production at T2s and reprocessing at Tier-1s; they will use their new DDM software stack

Experiment Goals and Plans (continued)
- Concern that the experiment timelines all overlap
  - A unified timeline is being created from the detailed presentations
  - We need to respond with what is possible
- Pilot services for FTS and LFC are of great interest to the experiments
  - They would like Fireman as well, for testing
- Long discussions about "VO Boxes" at all sites: none of the sites, experiments or middleware providers have worked through the full implications of these
  - First we need to list exactly what the experiment requirements are
  - The plan is to provide an interim solution for evaluation during SC3

Tier-1 Plans and Goals
- Clear message from the workshop that some sites did not understand what SC3 means in terms of compute resources: it is "more than a transfer test"
- We need to resolve how to integrate SC3 resources into the production grid environment
  - "There can only be one production environment" (discussed in the GDB last week)
- Service levels provided will be "best-effort"
  - We should be able to live with a site being down for a while
  - But we must measure site uptime/availability/response during the challenge (see the availability sketch below)
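To make the uptime/availability bookkeeping concrete, here is a minimal sketch of computing a site's availability over a challenge window from recorded downtime intervals. The window and outages are hypothetical examples; this is not the actual SC3 monitoring or reporting machinery.

```python
from datetime import datetime, timedelta

# Illustrative only: availability over a challenge window given recorded downtimes.
# All dates and outages below are hypothetical examples.
window_start = datetime(2005, 7, 1)
window_end = datetime(2005, 7, 31)

downtimes = [  # (outage start, outage end) for one hypothetical site
    (datetime(2005, 7, 5, 8, 0), datetime(2005, 7, 5, 20, 0)),   # 12 h outage
    (datetime(2005, 7, 18, 0, 0), datetime(2005, 7, 19, 0, 0)),  # 24 h outage
]

total = window_end - window_start
down = sum((end - start for start, end in downtimes), timedelta())
availability = 1 - down / total   # timedelta / timedelta gives a float
print(f"availability over the window: {availability:.1%}")   # -> 95.0%
```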

Software at Tier-1s
- Many SRM services are late; the deadline was the end of May
  - Many sites still have not got their services ready for SC3
  - Some need to upgrade versions (BNL)
  - Some need to debug LAN network connections (RAL)
  - Some are finalizing installs (FZK, ASCC, ...)
- And we are still mostly at the level of debugging SRM transfers
  - Many errors and retries detected at the FTS level
- Still need to rerun the iperf tests to measure the expected network throughput for all sites (see the sketch below)
  - Activity required from Tier-1s to run the network measurement tests and more SRM-level tests
  - Sites need to be more proactive in testing and publishing the information
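As an illustration of the network measurement step, below is a minimal sketch of driving iperf (version 2) against a list of site endpoints from a Python script. The hostnames, port, stream count and duration are assumed example values, not the actual SC3 test configuration, and an iperf server must already be listening at each site.

```python
import subprocess

# Illustrative sketch of memory-to-memory network tests with iperf (v2).
# Hostnames and parameters are examples only.
sites = {"t1-se.example.org": 5001, "t1-gridftp.example.net": 5001}

for host, port in sites.items():
    # 4 parallel TCP streams for 60 seconds, reporting in Mbits/s
    cmd = ["iperf", "-c", host, "-p", str(port), "-P", "4", "-t", "60", "-f", "m"]
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(f"--- {host} ---")
    print(result.stdout or result.stderr)  # inspect the summary lines for achieved bandwidth
```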

CERN Services

SC3 Services Status
- FTS
  - The SC3 service is installed and configured; limited testing has been done with the Tier-1s (a command-line usage sketch follows below)
  - Many Tier-1s are still upgrading to dCache and it is not all stable yet
  - BNL have a version of the FTS server for their T1-T2 traffic
    - They are seeing many problems getting it installed and configured
    - Working with the gLite team to try to solve these
  - Pilot services are not ready yet
    - Installed but not yet configured
    - Experienced long delays getting new software through the gLite build+test process
    - But we now have a tag that will be OK for setup/throughput; this is part of LCG-2_5_0
  - A new version of FTS will be needed for the service phase
    - The current version does not do inter-VO scheduling
    - This presents a risk, since it will be a major rewrite
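To make the FTS piece concrete, here is a hedged sketch of submitting and polling a single transfer with the gLite FTS command-line client. The service endpoint URL and SURLs are placeholders, and the exact client options (such as "-s" for the service endpoint) are assumptions based on the gLite FTS tools of that era rather than anything stated in this talk; consult the installed client's documentation.

```python
import subprocess

# Hedged sketch: submit one transfer via the gLite FTS CLI and poll its state.
# Endpoint, SURLs and option usage are placeholders / assumptions.
FTS_ENDPOINT = "https://fts.example.cern.ch:8443/sc3/glite-data-transfer-fts/services/FileTransfer"
SOURCE = "srm://srm.example-source.org/castor/sc3/testfile"
DEST = "srm://srm.example-dest.org/pnfs/sc3/testfile"

job_id = subprocess.run(
    ["glite-transfer-submit", "-s", FTS_ENDPOINT, SOURCE, DEST],
    capture_output=True, text=True, check=True,
).stdout.strip()

status = subprocess.run(
    ["glite-transfer-status", "-s", FTS_ENDPOINT, job_id],
    capture_output=True, text=True, check=True,
).stdout.strip()

print(f"job {job_id}: {status}")
```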

SC3 Services Status
- LFC
  - Pilot and SC3 services are installed, configured and announced to the experiments
  - POOL interface now available (POOL 2.1.0)
  - Not much usage yet by the experiments (a basic usage sketch follows below)
- CASTORGRIDSC SRM
  - 20 TB setup running, using the old stager and old SRM code
  - The plan is to migrate to the new CASTOR stager; the fallback is to use the old stager for the setup phase
  - Migration of the SC setup to the new CASTOR stager is in progress
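For context on what light LFC usage looks like, here is a minimal sketch of basic namespace operations with the LFC command-line clients. The host name, VO path, and the specific commands/options are assumptions based on the LCG client tools of that period, not taken from this talk.

```python
import os
import subprocess

# Hedged sketch: basic LFC namespace operations via the command-line clients.
# Host and paths are placeholders; LFC_HOST selects the catalogue instance.
env = dict(os.environ, LFC_HOST="lfc.example.cern.ch")

# create a directory for test entries, then list its parent
subprocess.run(["lfc-mkdir", "-p", "/grid/example-vo/sc3/tests"], env=env, check=True)
subprocess.run(["lfc-ls", "-l", "/grid/example-vo/sc3"], env=env, check=True)
```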

SC3 Services Status
- Starting to put in place the service teams for SC3
  - First-level support at CERN from the operators
  - Second-line support at CERN from the GD SC and EIS teams
  - Third-line support from the software experts (LFC, FTS, CASTOR-SRM, ...)
  - Site support through site-specific service challenge mailing lists
- What is the level of support we will get?
  - Operator procedures and problem escalation steps are still not clear
  - Reporting of problems through ... is tied into the problem tracking system

Communication
- Service Challenge wiki
  - Takes over from the service-radiant wiki/web-site used in SC1 & SC2
  - Contains Tier-0 and Tier-1 contact/configuration information and work logs for the SC teams
- Weekly phone conferences are ongoing
- Daily service meetings for the CERN teams from 27th June
- Technical communication through the service-challenge-tech list
- What else is required by the Tier-1s?
  - Daily (or frequent) meetings during the SC?

Summary
- Good understanding of and agreement on the goals of SC3
  - Which services need to run where
  - Proposed metrics to define success
  - Detailed schedule
- Detailed discussion of experiment goals/plans in the workshop last week
- Concerns about the readiness of many sites to run production-level services
  - Preparations are late, but there is lots of pressure and effort
  - Are enough resources available to run the services? (backups, single points of failure, vacations, ...)
- The experiments expect SC3 to lead to a real production service by the end of the year
  - It must continue to run during the preparations for SC4
  - This is the build-up to the LHC service; we must ensure that appropriate resources are behind it

Core Site Services
- CERN: Storage: Castor/SRM; File catalogue: POOL LFC (Oracle)
- FNAL: Storage: dCache/SRM; File catalogue: POOL Globus RLS
- CNAF: Storage: Castor/SRM; File catalogue: POOL LFC (Oracle)
- RAL: Storage: dCache/SRM; File catalogue: POOL LFC (Oracle?)
- IN2P3: Storage: dCache/SRM; File catalogue: POOL LFC (Oracle)
- SARA/NIKHEF: Storage: dCache/SRM; File catalogue: POOL LFC (MySQL?)
- PIC: Storage: Castor/SRM; File catalogue: POOL LFC (MySQL)
- FZK: Storage: dCache/SRM; File catalogue: POOL LFC (Oracle)
- ASCC: Storage: Castor/SRM; File catalogue: POOL LFC (Oracle)
- BNL: Storage: dCache/SRM; File catalogue: POOL LFC (Oracle)
- TRIUMF: Storage: dCache/SRM; File catalogue: POOL LRC (MySQL?)
- NDGF: running an FTS service for T2s (storage and file catalogue not listed)