CMS Service Challenge 4 and beyond
Stefano Belforte, INFN Trieste
July 5, 2006

Slide 2: CMS SC4 targets/milestones
CMS has set targets for SC4 since the beginning of 2006
- Communicated them to all our sites
- Posted on Twiki, web, agendas, mails...
Aimed to smoothly roll in a WLCG service that CMS can use in October 2006 to test the computing flow at 25% of the 2008 scale (CSA06)
Establish early in the year a baseline of non-demanding continuous file transfers (20 MB/sec per site, disk-only). Avoid the SC3 syndrome.
Targets for June:
- Have all CMS Tier1s and most Tier2s on board
- Demonstrate data transfers at realistic rates (to tape at T1)
  - 150 MB/sec out of CERN in total (600 MB/sec in 2008)
- Ramp up job submission rate over WLCG to 25K jobs/day
  - 90% submission efficiency
  - double to 50K jobs/day by October (200K target in 2008)
- 1 MB/sec/execution-slot from disk to CPU
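As a quick check of these targets, a minimal Python sketch; all inputs are the numbers quoted on the slide, while the per-day volume and the percentage are our own arithmetic:

```python
# Back-of-the-envelope check of the SC4 targets above.
SECONDS_PER_DAY = 86_400

baseline_mb_s = 20                 # continuous per-site baseline from the slide
print(f"20 MB/s -> {baseline_mb_s * SECONDS_PER_DAY / 1e6:.2f} TB/day per site")

cern_now, cern_2008 = 150, 600     # MB/s out of CERN, June target vs. 2008 goal
print(f"June target is {cern_now / cern_2008:.0%} of the 2008 rate")

jobs, eff = 25_000, 0.90           # submission target and efficiency goal
print(f"{jobs} jobs/day at {eff:.0%} -> ~{int(jobs * eff)} successful jobs/day")
```

The 20 MB/sec baseline thus amounts to roughly 1.7 TB per site per day of sustained traffic, and the June CERN export target is exactly a quarter of the 2008 rate.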

Slide 3: Spring summary
Could not establish a working baseline in the spring
- in spite of the start of activity in early April
- srmCopy, srm push/pull, Castor2, WLCG tests, FTS, gLite 3.0 ...
Used the time to get ready for SC4:
- FTS integrated in PhEDEx
- Developed job submission/monitoring tools
- Integrated in the new CMS software framework
Started SC4 at the beginning of June

Slide 4: June summary
Steady progress, but not there yet
Finally have all Tier1s and 20 Tier2s (including prospective ones) on board: run one CMS simulation job, transfer 100 GB of CMS data, run one CMS analysis job on local data
Transfers: a few channels work, mostly below target rate
- much better on the OSG side
- Ad hoc effort started in coordination with WLCG: daily meetings, several CMS people involved
Job submission rate: reached what seems to be the LCG RB scale limit (5K jobs/day/RB)
90% job success:
- OK for WMS (OSG and EGEE)
- OK for single sites at certain times
- Not achieved yet on all sites for the same day (not even all T1's)
Not started measuring disk/CPU throughput
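For scale, the 100 GB on-boarding transfer is a short exercise at the slide 2 baseline rate; a trivial sketch (our arithmetic, assuming one sustained stream at exactly 20 MB/sec):

```python
# Time to move the 100 GB on-boarding sample at the 20 MB/s baseline.
volume_gb, rate_mb_s = 100, 20
hours = volume_gb * 1000 / rate_mb_s / 3600
print(f"{volume_gb} GB at {rate_mb_s} MB/s -> {hours:.1f} h")   # ~1.4 h
```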

Slide 5: T0 → T1 transfers

Slide 6: Job submission rates
The Imperial College WMS monitoring lists jobs by RB each day for all VOs
- Can hardly find an RB with more than 5K jobs/day
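Taken at face value, that per-RB ceiling fixes how many RBs the SC4 submission targets would require; a small sketch of the arithmetic (the 5K jobs/day/RB ceiling is the observation above, the RB counts are derived):

```python
import math

# RBs needed for the June (25K/day) and October (50K/day) targets,
# given a ~5K jobs/day ceiling per LCG RB.
rb_ceiling = 5_000
for target in (25_000, 50_000):
    print(f"{target} jobs/day -> at least {math.ceil(target / rb_ceiling)} RBs")
```

Five RBs for June and ten for October: hence the push on slide 9 to commission the gLite WMS rather than run a farm of RBs.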

Slide 7: Job success rate by site: two typical days
75% success overall
- CNAF lost cooling
- Jobs not sent to the CE yet, no LB news (overloaded RB)
- Two sites with middleware problems
- Random failures at a manageable level
- Sites go in/out of the BDII and the RB cannot match

Slide 8: Site issues
We find sites in general responsive to CMS needs and problems
- Thank you !!!
gLite 3.0 is not really there yet as a production service
- FTS servers/channels not configured/exercised using CMS end points. Data transfer is an end-to-end activity.
- Hints of configuration problems in the gLite 3.0 LCG-flavor CE
- All CMS jobs fail with the Maradona error on selected CEs
  - other VOs OK, SFT all green
  - a couple of sites kept failing all jobs for more than 2 weeks
  - a third site apparently went back to the LCG 2_7_0 CE after a few days
Not using the same infrastructure as the WLCG throughput tests yet
- E.g. Castor1 instead of Castor2 at Tier1s
- Could not get a clean schedule for Castor2 availability for CMS
- Want to be on the final production infrastructure ASAP
  - but not connected to something that, in your opinion, will not work

Slide 9: Summer plans
Keep up the basic SC4 activity:
- File transfers up to target, then stay there
- Job submission at ~12K jobs/day now; 10K jobs/day demonstrated on OSG using Condor-G earlier on
Add more complexity:
- Simulate 1M events/day in July/August to prepare for CSA06
- Commission the gLite WMS (RB, maybe also CE) to reach 50K jobs/day without 10 RBs
- Add calibration/conditions data remote access
- Test data serving throughput at sites (disk → CPU)
- Allocate CPU resources separately for production and analysis jobs. The Grid should not be a global FIFO queue.
It will be a busy summer
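The disk → CPU test ties back to the 1 MB/sec/execution-slot target from slide 2: the per-slot rate multiplies into an aggregate demand on the site's storage. A minimal sketch; the slot counts here are illustrative, not CMS pledges:

```python
# Aggregate storage throughput implied by 1 MB/s per busy execution slot.
per_slot_mb_s = 1.0
for slots in (200, 500, 1000):          # hypothetical site sizes
    print(f"{slots:4d} busy slots -> storage must serve "
          f"~{slots * per_slot_mb_s:.0f} MB/s to the CPUs")
```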

Slide 10: After the summer
CSA06 in October 2006
- From CERN disk to Tier0 to Tier1 and Tier2
- Demonstrating reconstruction, analysis, calibration, reprocessing
- At 25% of the 2008 scale
  - ... Hz
  - over a month
  - would like to try a higher throughput
Well captured in Harry Renshall's twiki
- Ramp up to 2008 pledges
- We caution sites against planning for a large increase in capacity in a short time. Every x2 is a challenge.
- True for Tier1s and Tier2s
- We look forward to stress-testing sites as soon as resources are available
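For context on the "every x2" remark, the gap between the CSA06 scale and the full 2008 pledges is exactly two such doublings (a trivial check, our arithmetic):

```python
import math

# Doublings between the CSA06 scale (25% of 2008) and the 2008 pledges.
csa06_fraction = 0.25
print(f"{math.log2(1.0 / csa06_fraction):.0f} successive x2 ramp-ups")  # 2
```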