CERN Castor external operation meeting – November 2006 Olof Bärring CERN / IT.

1 CERN Castor external operation meeting – November 2006 Olof Bärring CERN / IT

2 Olof Bärring (IT/FIO/FS): Outline
- Recent ‘achievement’
- CASTOR2 operation team
- CASTOR2 deployment and operation
  - CASTOR2 instances
  - LHC experiments’ migration to CASTOR2
  - Plans for non-LHC experiments
  - Main problems and workarounds during 2006
- Tape service
- SRM service
- Conclusions

3 The 100 millionth CASTOR file
nsls -i /castor/cern.ch/cms/MTCC/data/ /A/mtcc A.testStorageManager_0.22.root
/castor/cern.ch/cms/MTCC/data/ /A/mtcc A.testStorageManager_0.22.root
Only ~50M files

4 CASTOR2 operation team
Jan, Veronique, Miguel, Ignacio, Olof

5 CASTOR2 instances
- Instances: c2alice, c2atlas, c2cms, c2lhcb, c2public, c2test, C2itdc
- Other diagram labels: dev, srm-v1.cern.ch, Tape service, dev, srm-v2.cern.ch

6 The c2 layout
- Head nodes
  - c2 srv01 / c2 scheduler: LSF master
  - c2 srv02 / c2 rtcpcld: rtcpclientd, MigHunter
  - c2 srv03 / c2 stager: stager, cleaning
  - c2 srv04 / c2 rh: rhserver
  - c2 srv05 / c2 dlf: dlfserver
  - c2 srv06 / c2 rmmaster, C2 expert: rmmaster, expertd
- Oracle DB servers: C2 stgdb, C2 dlfdb
  - Normal disk servers today
  - Next year: oracle certified h/w (NAS/Gbit solution)
- Disk pools
  - All servers run rfiod, rootd and gridftp
  - Currently all SLC3 + XFS, hw RAID-5
  - Soon SLC4/64bit + XFS

7 LHC experiments’ CASTOR2 migration
[Timeline figure, Jan’06 to Dec’06. Rows: ALICE, ATLAS, CMS, LHCb. Marked activities: SC3 rerun, SC4, ATLAS T0-2006/1, ATLAS T0-2006/2, CMS CSA06, ALICE MDC7.]

8 ALICE
- Smooth migration
  - STAGE_HOST switched in group environment
  - Simple castor usage: rfcp from WN or xrootd servers
  - 4 disk servers (~20TB) xrootd cache
- Challenges
  - ALICE MDC7 running now: 1.2GB from ALICE pit → castor2 → tape
- Special requirements
  - xrootd support as internal protocol
- Pools
  - default: 16TB
  - wan: 73TB
  - alimdc: 112TB

9 ALICE MDC7
- 22 disk servers
- 25 tape drives (12 IBM 3592, 13 T10K)

10 ATLAS
- Smooth migration
  - STAGE_HOST switched in group environment
- Usage
  - Production: mostly rfcp
  - Users: long-lived RFIO and ROOT streams
- Challenges
  - ATLAS T0-2006/1 (July) and T0-2006/2 (October): CDR + reconstruction + data export
- Special requirements
  - ‘Durable’ disk pools
    - Special SRM v1.1 endpoint: srm-durable-atlas.cern.ch
- Pools
  - default: 38TB
  - wan: 23TB (being removed)
  - atldata: 15TB (no GC)
  - analysis: 5TB
  - t0merge: 15TB
  - t0export: 130TB
  - atlprod: 30TB (being created)
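
As a quick cross-check of the ATLAS pool figures above, the sizes can be summed in a few lines of Python. The pool names and sizes are copied from the slide; the dictionary layout and the helper name `total_tb` are illustrative assumptions, not part of any CASTOR tool.

```python
# Pool sizes in TB, copied from the ATLAS slide.
ATLAS_POOLS_TB = {
    "default": 38,
    "wan": 23,       # being removed
    "atldata": 15,   # no GC
    "analysis": 5,
    "t0merge": 15,
    "t0export": 130,
    "atlprod": 30,   # being created
}


def total_tb(pools, exclude=()):
    """Sum pool sizes in TB, skipping any pool named in `exclude`."""
    return sum(size for name, size in pools.items() if name not in exclude)


if __name__ == "__main__":
    print(total_tb(ATLAS_POOLS_TB))                    # all listed pools
    print(total_tb(ATLAS_POOLS_TB, exclude=("wan",)))  # once 'wan' is removed
```

With the figures as listed, this gives 256TB across all seven pools and 233TB once the wan pool has been removed.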

11 CMS
- Migration
  - Users: quite smooth, STAGE_HOST switched in group environment
  - CDR: still some testbeam activity on stagepublic (castor1)
- Usage
  - Production: rfcp and direct RFIO access
  - Users: long-lived RFIO streams
- Challenges
  - CMS CSA’06 (October)
- Special requirements
  - Low request latency (new LSF plugin)
- Pools
  - default: 89TB
  - wan: 22TB
  - cmsprod: 22TB
  - t0input: 64TB (no GC)
  - t0export: 155TB

12 CSA06

13 LHCb
- Migration
  - Production: smooth, mostly done already in March
  - Users: some difficulties
    - Dependency on old ROOT3 data delayed migration
    - Flip of STAGE_HOST not sufficient: grid jobs have no CERN specific env → most WN access to stagepublic (castor1)
- Usage
  - RFIO and ROOT access
- Challenges
  - Lots of tape writing at CERN in early summer
  - Data export in July - August
- Special requirements
  - ‘Durable’ disk pools
    - Special SRM endpoint: srm-durable-lhcb.cern.ch
- Pools
  - default: 28TB
  - wan: 51TB
  - lhcbdata: 5TB (no GC)
  - lhcblog: 5TB (no GC)
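
The grid-job problem on this slide can be modelled in a few lines: a client that finds no STAGE_HOST in its environment falls back to a default stager. This is only a sketch of that behaviour, not the actual CASTOR client logic. The fallback name `stagepublic` is taken from the slide; the function name and the example value `c2lhcb` for STAGE_HOST are assumptions made for illustration.

```python
import os

# Fallback stager when STAGE_HOST is not set. The name comes from the slide
# (stagepublic is the castor1 stager); the lookup logic itself is a
# simplified model, not the real client code.
DEFAULT_STAGER = "stagepublic"


def resolve_stager(env=None):
    """Return the stager a client would contact for the given environment."""
    env = os.environ if env is None else env
    # An unset or empty STAGE_HOST falls through to the default.
    return env.get("STAGE_HOST") or DEFAULT_STAGER


if __name__ == "__main__":
    # Interactive user whose group environment sets STAGE_HOST
    # (the value "c2lhcb" is a hypothetical example).
    print(resolve_stager({"STAGE_HOST": "c2lhcb"}))
    # Grid job on a worker node with no CERN specific environment.
    print(resolve_stager({}))
```

Under this model, switching STAGE_HOST in the group environment only helps sessions that actually source that environment, which is why worker-node access kept landing on castor1.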

14 Plans for non-LHC migration
- Plan is to migrate all non-LHC groups to a shared CASTOR2 instance: castorpublic
  - Dedicated pools for large groups
  - Small groups will share ‘default’
  - Also used for the dteam background transfers and by the ‘repack’ service
- NA48 first out: plan is to switch off stagena48 end of January 2007
- COMPASS
- Complications
  - Engineering community may require Windows client
  - How to migrate small groups without computing coordinators?

15 Main problems and workarounds
- prepareForMigration:
  - Deadlocks resulted in CASTOR NS not updated → file remains 0 size while tape segment>0
  - Tedious cleanup
- GC
  - Long period of instabilities during the summer. Now OK since
- Stager_qry:
  - Now you see your file, now you don’t… users confused
  - Used by operational procedure for draining disk server: manual and tedious workaround for INVALID status bug
- LSF plugin related problems
  - Meltdown
    - Limit PENDing jobs to 1000 workaround
    - But may result in a rmmaster meltdown instead
    - Problematic with ‘durable’ pools which are not properly managed
  - Recent problem with lsb_postjobmsg ‘Bad file descriptor’. Plugin cannot recover, workaround is to restart LSF
- Missing cleanups
  - Accumulation of stageRm subrequests, diskcopies in FAILED, …
- Looping migrators
  - NBTAPECOPIESINFS inconsistency. Workaround in hotfix of early September reduced the impact on the tape mounting but manual cleanup is still required
- Looping recallers
  - Due to zero-size files (see above)
  - Due to a stageRm bug (insufficient cleanup)
- Client/server (in)compatibility matrix
  - Request mixing…!
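
The zero-size inconsistency described above (name server size 0 while the tape segment is larger than 0) is the kind of condition a cleanup script would have to detect. The following is a minimal sketch of such a check, assuming the name server and tape segment sizes have already been exported into simple records. The `FileRecord` layout, the function name and the example paths are all hypothetical; no real CASTOR API or schema is used.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Hypothetical export of one file: sizes in bytes."""
    path: str
    ns_size: int            # size recorded in the name server
    tape_segment_size: int  # size of the tape segment (0 if none)


def find_zero_size_inconsistencies(records):
    """Return records whose NS size is 0 although data exists on tape."""
    return [r for r in records if r.ns_size == 0 and r.tape_segment_size > 0]


if __name__ == "__main__":
    sample = [
        FileRecord("/castor/example/ok", 1024, 1024),      # consistent
        FileRecord("/castor/example/empty", 0, 0),         # genuinely empty
        FileRecord("/castor/example/broken", 0, 2048),     # NS not updated
    ]
    for rec in find_zero_size_inconsistencies(sample):
        print(rec.path)
```

Genuinely empty files (both sizes 0) are deliberately not flagged, since only the mismatch points at a name server entry that was never updated.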

16 Tape service (TSI section)
- Both T10K and 3592 used in production during 2006
  - No preference → buy both
- Current drive park:
  - 40 SUN T10K
  - 40 IBM 3592
  - 6 LTO3
  - B
- Current robot park
  - 1 SUN SL8500
  - 6 SUN powderhorns (recently dismounted 6)
  - 1 IBM 3584
- Buying for next year
  - 10 more drives of each T10K and 3592 → 50 of each in total
  - 1 more SUN SL8500
  - Enough media to fill the new robotics
    - ~18k pieces of media: 12k T10K, 6k 3592 (700GB)
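
The purchase figures on this slide can be checked with simple arithmetic. The sketch below uses only numbers from the slide and assumes that "700GB" is the capacity of one 3592 cartridge and that "12k" and "6k" are round counts; no T10K capacity is computed because the slide does not state one. Variable names are illustrative.

```python
# Figures copied from the slide.
CURRENT_DRIVES = {"T10K": 40, "3592": 40}
NEW_DRIVES_EACH = 10
NEW_MEDIA = {"T10K": 12_000, "3592": 6_000}
GB_PER_3592_CARTRIDGE = 700  # assumption: 700GB is the per-cartridge capacity

# Drive count per type after next year's purchase.
drives_next_year = {k: v + NEW_DRIVES_EACH for k, v in CURRENT_DRIVES.items()}

# Total pieces of media being bought.
total_media = sum(NEW_MEDIA.values())

# Capacity of the new 3592 media in PB (decimal units: 1 PB = 1e6 GB).
capacity_3592_pb = NEW_MEDIA["3592"] * GB_PER_3592_CARTRIDGE / 1_000_000

if __name__ == "__main__":
    print(drives_next_year)
    print(total_media)
    print(capacity_3592_pb)
```

This reproduces the 50 drives of each type and the ~18k pieces of media quoted on the slide, and puts the new 3592 media alone at about 4.2 PB under the stated assumption.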

17 Tape / Robots
IBM 3584 Tape Library (monolithic solution)
- 40 x 3592E05 IBM Tape Drives
- ~6000 Tape Slots
- 2 Accessors
- ~38 m² of Floor Space
SUN/STK SL8500 Tape Library (modular solution)
- 40 x SUN T10K Tape Drives
- 21 x LTO-3 Tape Drives
- 10 x 9940B Tape Drives
- ~8000 Tape Slots
- 2 x 4 Handbots
- Pass-Through Mechanism
- ~19 m² of Floor Space

18 Repack of 22k 9940B tapes
- Leave 4 powderhorn silos for 9940B tapes to be repacked to new media
- Some tapes have a huge number of small files
  - Record: 165k files on a single 9940B tape (200GB). Will take ~1 month to repack that tape alone…
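
To see what "~1 month for one tape" means per file, the estimate can be turned around. This is a rough sketch that assumes a 30-day month and treats the 200GB tape capacity as an upper bound on the data actually stored; both are assumptions, and the variable names are illustrative.

```python
# Figures from the slide.
FILES_ON_TAPE = 165_000
TAPE_CAPACITY_GB = 200   # 9940B capacity; upper bound on the data on the tape

# Assumption: "~1 month" is taken as 30 days.
REPACK_SECONDS = 30 * 24 * 3600

# Average wall-clock time spent per file during the repack.
seconds_per_file = REPACK_SECONDS / FILES_ON_TAPE

# Average file size if the tape were completely full (decimal units).
max_avg_file_mb = TAPE_CAPACITY_GB * 1000 / FILES_ON_TAPE

# Effective throughput in MB/s under the same full-tape assumption.
max_effective_mb_per_s = TAPE_CAPACITY_GB * 1000 / REPACK_SECONDS

if __name__ == "__main__":
    print(round(seconds_per_file, 1))
    print(round(max_avg_file_mb, 2))
    print(round(max_effective_mb_per_s, 3))
```

Under these assumptions the repack spends roughly 16 seconds on each file of at most about 1.2MB, so the effective rate is below 0.1 MB/s: the time is dominated by per-file overhead rather than by data transfer.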

19 SRM service
- SRM v1.1
  - Shared facility accessed through a single endpoint: srm.cern.ch
    - 9 CPU servers, DNS load-balanced
    - 1 CPU server used for the request repository
  - Some dirty workaround for ‘durable’ space required setting up some extra endpoints (srm-durable-xyz.cern.ch)
  - All transfers initiated through srm.cern.ch (== castorsrm.cern.ch) are redirected to the disk servers
    - The old castorgrid gateway is only used by non-LHC for non-SRM access (e.g. NA48 and COMPASS)
    - All CASTOR2 disk servers are on the LCG network (also visible to the Tier-2 sites through the HTAR)
- SRM v2.2
  - Test facility up and running (srm-v2.cern.ch)
  - No need for additional endpoints for ‘durable’ storage: durable space is addressed through SRM space tokens
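
The difference between the two ways of addressing durable space can be sketched as follows. The host names `srm.cern.ch` and `srm-v2.cern.ch` and the `srm-durable-xyz.cern.ch` pattern are taken from the slides; the function names and the example space token `EXAMPLE_DURABLE` are hypothetical, and nothing here calls a real SRM client.

```python
def srm_v11_endpoint(experiment, durable=False):
    """Pick the SRM v1.1 host: durable space needs a dedicated endpoint."""
    if durable:
        # Pattern from the slide: srm-durable-xyz.cern.ch
        return f"srm-durable-{experiment}.cern.ch"
    return "srm.cern.ch"


def srm_v22_target(space_token=None):
    """SRM v2.2: one endpoint; durable space is selected by a space token."""
    return ("srm-v2.cern.ch", space_token)


if __name__ == "__main__":
    print(srm_v11_endpoint("atlas"))                # shared endpoint
    print(srm_v11_endpoint("atlas", durable=True))  # extra durable endpoint
    print(srm_v22_target("EXAMPLE_DURABLE"))        # hypothetical token name
```

In the v1.1 model the storage class leaks into the host name, so every experiment with durable pools needs its own endpoint; in the v2.2 model the host stays fixed and only a request parameter changes.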

20 Conclusions
- 4 LHC experiments successfully migrated to CASTOR2
  - All major SC4 milestones completed successfully
- Non-LHC migration has ‘started’
- New tape equipment running in production without any major problem
- Our next challenges
  - Dare to remove dirty workarounds when bugs get fixed
  - SRM v2.2 operation and support
  - Repack 22k 9940B tapes to new media

