1
LCG Service Challenge: Planning and Milestones
INFN CNAF Review Mar 1, 2006
2
Outline
Service Challenge 4 schedule
SC3 and pre-SC4 results
SC4: CNAF hw resources and milestones
Service planning
Personnel
3
SC Milestones 2006
January: SC3 disk-disk repeat – nominal rate (200 MB/s) capped at 150 MB/s
February: CHEP workshop; T1-T1 use cases; SC3 disk-tape repeat (50 MB/s, 5 drives)
March: detailed plan for SC4 service agreed (middleware and data-management service enhancements); gLite 3.0 release beta testing
April: SC4 disk-disk (200 MB/s) and disk-tape (75 MB/s) throughput tests; gLite 3.0 release available for distribution
May: installation, configuration and testing of the gLite 3.0 release at sites
June: start of SC4 production tests by experiments of ‘T1 use cases’; T2 workshop: identification of key use cases and milestones for T2s
July: tape throughput tests at full nominal rates
August: T2 milestones – debugging of tape results if needed
September: LHCC review – rerun of tape tests if required
October: WLCG service officially opened; capacity continues to build up
November: 1st WLCG ‘conference’; all sites have network / tape h/w in production(?)
December: final service / middleware review leading to early 2007 upgrades for LHC data taking(?)
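To make the throughput targets in this schedule concrete, here is a minimal Python sketch (not part of the original plan) that converts the nominal rates quoted above into sustained daily volumes; the rates are taken from the milestone list, and the decimal byte units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) are an assumption.

```python
# Illustrative only: convert the nominal SC throughput targets quoted in the
# milestone list into sustained daily volumes.
# Assumes decimal units: 1 MB = 1e6 bytes, 1 TB = 1e12 bytes.
SECONDS_PER_DAY = 24 * 3600

targets_mb_per_s = {
    "SC3 disk-disk repeat (capped)": 150,
    "SC4 disk-disk": 200,
    "SC4 disk-tape": 75,
    "SC3 disk-tape repeat": 50,
}

for name, rate in targets_mb_per_s.items():
    tb_per_day = rate * SECONDS_PER_DAY / 1e6  # MB/s -> TB/day
    print(f"{name}: {rate} MB/s ~ {tb_per_day:.1f} TB/day sustained")
```

Sustaining the 200 MB/s SC4 disk-disk target for a full day corresponds to roughly 17 TB, which puts the disk and tape capacities quoted on the later slides into perspective.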
4
Disk – disk throughput test rerun (Jan 06)
5
Alice: Running jobs (Jan-Feb 06)
Alice Running Jobs [CS]
Farm   | Last value | Min | Avg   | Max
SUM    | 0          | 0   | 786.2 | 3651
Bari   | 0          | 0   | 11.28 | 84
CNAF   | 6.929      | 0   | 209.2 | 1072
Torino | 1.108      | 0   | 21.99 | 47
6
LHCb: FTS channel mesh testing Jan-Feb 2006
7
SC3: CMS, Phase 1 report (Sep-Nov 05)
Objective: 10 or 50 TB per Tier-1 and ~5 TB per Tier-2 (source: L. Tuura)
8
SC3: LHCb report Phase 1 (Oct-Nov 05, Data moving)
Less than 1 TB of stripped DSTs replicated. At INFN most of this data already existed, with only a few files missing from the dataset, so only a small fraction of the files had to be replicated from CERN (source: A. C. Smith).
Configuration of an entire CNAF-to-Tier-1 channel matrix (for replication of stripped data).
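For illustration only: the “channel matrix” mentioned above is the full set of point-to-point FTS channels between CNAF and the other Tier-1 sites, in both directions. The sketch below enumerates such a matrix; the peer site names are a hypothetical example list, since the slide does not name the individual channels.

```python
# Minimal sketch of a CNAF-to-Tier-1 FTS channel matrix (both directions).
# The peer Tier-1 names below are a hypothetical example list, not taken from the slide.
cnaf = "INFN-CNAF"
peer_tier1s = ["CERN", "FZK", "IN2P3", "NIKHEF", "PIC", "RAL"]

channels = [(cnaf, peer) for peer in peer_tier1s] + [(peer, cnaf) for peer in peer_tier1s]

for src, dst in channels:
    print(f"channel {src}-{dst}")

print(f"{len(channels)} channels in total")
```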
9
SC3: ATLAS
Production phase started on Nov 2
5932 files copied and registered at CNAF: 89 “Failed replication” events, 14 “No replicas found” events
10
SC4 Hardware Resources at CNAF
Total capacities:
2500 CPU slots (1500 physical CPUs)
Disk: 400 TB, including the Castor buffer space
Tape: 200 TB (4 9940B + 6 LTO2 drives)
2 Gb/s available bandwidth for LHC (CERN – CNAF)
LHC fraction:
Up to 2500 CPU slots (all worker nodes are shared)
Disk: 112 TB, including the Castor front-end
Tape: 160 TB (4 9940B + 6 LTO2 drives)
All CPUs installed with SLC and LCG 2.6.0
Additional test farm available
SC4 hardware resources at CNAF:
2500 CPU slots (out of 1000 physical CPUs); LHC: up to 2500 CPU slots (worker nodes are shared); SLC and LCG 2.6.0
Disk: 400 TB (including the Castor buffer space); LHC: 100 TB (including the Castor buffer space)
Tape: 260 TB (7 9940B + 6 LTO2 drives); LHC: 160 TB
Network available bandwidth (CERN – CNAF): 10 Gb/s
CNAF – T2 connectivity already tested (Bari, Catania, Legnaro, Milan, Pisa, Torino)
CNAF – Karlsruhe connectivity (1 Gb/s, MPLS): under implementation
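As a rough cross-check (an added illustration, not on the slide), the sketch below compares the CERN – CNAF bandwidth figures above with the nominal SC4 throughput targets from the schedule; it assumes 1 byte = 8 bits and decimal MB/Gb.

```python
# Illustrative cross-check: how much of the CERN-CNAF bandwidth do the nominal SC4 rates use?
# Link and rate figures are taken from the slides; the calculation itself is an added illustration.
links_gbps = {"2 Gb/s LHC allocation": 2.0, "10 Gb/s CERN-CNAF link": 10.0}
targets_mb_per_s = {"disk-disk": 200, "disk-tape": 75}

for link_name, link_gbps in links_gbps.items():
    for target, rate_mb_s in targets_mb_per_s.items():
        needed_gbps = rate_mb_s * 8 / 1000  # MB/s -> Gb/s
        share = 100 * needed_gbps / link_gbps
        print(f"{target} at {rate_mb_s} MB/s needs {needed_gbps:.1f} Gb/s "
              f"= {share:.0f}% of the {link_name}")
```

At 200 MB/s the disk-disk test alone would occupy about 80% of a 2 Gb/s allocation, while the 10 Gb/s link leaves ample headroom for both tests.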
11
LCG CNAF Milestones (1st Q 2006)
Jan: 150 MB/s (SC3 disk-disk throughput test rerun) ☺
Feb: upgrade to CASTOR v2 (installation started in Dec) – ongoing
Feb: purchase of an additional 120 TB of tape – ongoing
Feb: all required software baseline services deployed (SRM, LFC, FTS, CE, RB, BDII, RGMA) ☺
Mar 2006: setup of the CNAF – Karlsruhe backup connection
Mar 2006: evaluation of dCache and StoRM (for disk-only SRM)
Tier-2s: definition of the INFN Tier-2 service plan is ongoing
12
Candidate Tier-2 sites in SC3 (Oct 05)
Torino (ALICE): FTS, LFC, dCache (LCG 2.6.0); storage space: 2 TB
Milano (ATLAS): FTS, LFC, DPM 1.3.7; storage space: 5.29 TB
Pisa (ATLAS/CMS): FTS, PhEDEx, POOL file catalogue, PubDB, LFC, DPM 1.3.5; storage space: 5 TB available, 5 TB expected
Legnaro (CMS): FTS, PhEDEx, POOL file catalogue, PubDB, DPM (1 pool, 80 GB); storage space: 4 TB
Bari (ALICE/CMS): FTS, PhEDEx, POOL file catalogue, PubDB, LFC, dCache, DPM; storage space: 1.4 TB available, 4 TB expected
Catania (ALICE): DPM and classic SE; storage space: 1.8 TB
LHCb: CNAF
Catania hw configuration
13
SC4 Service Planning – CNAF (1/2)
Component  | Needed by          | Pilot use    | Production  | Comment                             | Status
VOMS       | ALL                | Mar-Apr 2006 | 1 June 2006 | Installation for production: May 06 | Deployed
MyProxy    |                    | Mar-Apr 2006 |             |                                     |
BDII/GLUE  |                    |              |             |                                     |
FTS        | ALICE, ATLAS, LHCb |              |             |                                     |
LFC        | CMS, ATLAS, ALICE  |              |             |                                     |
lcg-utils  |                    |              |             |                                     |
GFAL       |                    |              |             |                                     |
RB         |                    |              |             |                                     |
gLite 3.0  |                    |              |             |                                     |
14
SC4 Service Planning – CNAF (2/2)
Component               | Needed by          | Pilot use    | Production  | Comment                             | Status
CE (classic, gLite 3.0) | ALL                | Mar-Apr 2006 | 1 June 2006 | Installation for production: May 06 | Deployed (classic CE and CREAM)
gPBox                   | CMS, ATLAS         | Mar-Apr 2006 | TBD         |                                     | On INFN certification testbed
RGMA/GridICE            | ALL (GridICE)      |              |             |                                     | Deployed
FTS                     | ALICE, ATLAS, LHCb |              |             |                                     | Deployed (GridICE server)
LFC                     | ALICE              |              |             |                                     |
APEL/DGAS               | ALL (DGAS)         |              |             |                                     |
VOBOX                   |                    |              |             | Until May                           |
CASTOR / DPM / dCache   |                    |              |             |                                     | Castor2 and dCache under testing, installation in May; StoRM also under testing
15
Personnel
Storage group: Castor, Castor2, dCache, FTS
Network group: LHC OPN, CNAF – CERN link configuration
Farming group: from October 2005
Grid operations group: installation, testing, monitoring of SC services
SC coordination at INFN: Tiziana Ferrari (CNAF), Michele Michelotto (INFN Padova, deputy)
INFN SC mailing list
Tier-2s rely on local manpower, working in close collaboration with CNAF