Evolution of the Italian Tier1 (INFN-T1)
Umea, May 2009
Felice.Rosso@cnaf.infn.it
In 2001 the project of the Italian Tier1 at CNAF in Bologna was born. The first computers were based on Intel Pentium III @ 800 MHz, with 512 MB of RAM, 20 GB of local storage and a 100 Mbit/s network interface. There was no redundancy: one chiller, one electric power line, one static UPS (based on chemical batteries), one diesel motor generator, and no remote control. The maximum usable electric power was 1200 kVA, which meant at most 400 kVA for computers. This situation was not compatible with the requests of the HEP experiments, so we decided to rebuild everything! In 2008 work started on the new machine rooms and the new power station. Today all the work is finished and everything is up and running; we are on time and the original schedule was fully respected. The result is total infrastructure redundancy and full remote control.
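The power budget above can be sketched with a quick calculation (the only figures taken from the slide are 1200 kVA total and 400 kVA for computers; the interpretation that the rest went to cooling, UPS losses and other infrastructure is an assumption):

```python
# Power budget of the old infrastructure (sketch, not official CNAF figures
# beyond the two quoted on the slide).
total_kva = 1200                  # maximum usable electric power
it_kva = 400                      # maximum available for computers
overhead_kva = total_kva - it_kva            # assumed cooling/UPS/other load
overhead_fraction = overhead_kva / total_kva

print(f"IT load: {it_kva} kVA, overhead: {overhead_kva} kVA "
      f"({overhead_fraction:.0%} of total)")
```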
The new infrastructure is based on four areas:
1) Chiller cluster (5 + 1)
2) 2 machine rooms + power panel room
3) 1 power station
4) 2 rotary UPS units and 1 diesel motor generator

Chiller cluster: 350 kW of heat removed per chiller; at best efficiency (12 °C in, 7 °C out) the ratio is 2:1, i.e. 50 kW of electrical power to cool 100 kW.
Machine rooms: more than 100 (48 + 70) APC NetShelter SX racks (42U, 600 mm wide x 1070 mm deep enclosures).
15 kV power station: 3 transformers (2 + 1 redundancy, 2500 kVA each), rotary UPS (2 x 1700 kVA), 2 electrical lines (red and green, 230 V per phase, 3 phases + ground).
Diesel motor generator: 1250 kVA (60,000 cc).
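The chiller arithmetic above can be checked in a few lines (all inputs are the slide's figures; the assumption that 5 chillers run with 1 spare follows the "5 + 1" notation):

```python
# Chiller cluster sizing sketch, using the figures quoted on the slide.
heat_per_chiller_kw = 350         # heat removed per chiller
ratio = 2.0                       # best-efficiency ratio: kW cooled per kW consumed
electric_per_chiller_kw = heat_per_chiller_kw / ratio   # electrical draw per chiller
n_active = 5                      # "5 + 1": five active, one redundant (assumption)
total_cooling_kw = n_active * heat_per_chiller_kw

print(f"Electrical draw per chiller: {electric_per_chiller_kw:.0f} kW")
print(f"Cooling capacity with {n_active} active chillers: {total_cooling_kw} kW")
```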
[Floor-plan slides: chiller room (floor -1), computing room (floor -2), electrical delivery areas, emergency power areas]
[Computing room (floor -2): busbars ("blindo bar"), power panels, racks]
[Electrical delivery areas: two ENEL 15 kV feeds from Via Ranzani, delivery substation, 15 kV cable corridor, transformer room (TR1, TR2, TR3), electrical panels; electrical commissioning]
[Emergency power areas: 2 rotary UPS units, 2 diesel oil tanks]
Farming upgrade
- Q4/2009: +20,000 HEP-SPEC installed
- No more FSB technology (Nehalem and Shanghai only)
- Option to buy directly in 2010 (less bureaucracy)
- Remote console
- ~50% for the LHC experiments; we support more than 20 experiments/collaborations
Disk Storage Systems
All systems are interconnected in a SAN:
- 12 FC switches (2 core switches) with 2/4 Gb/s connections
- ~200 disk servers
- ~2.6 PB raw (~2.1 PB net) of disk space
- 13 EMC/DELL CLARiiON CX3-80 systems (SATA disks) connected to the SAN
  - 1 dedicated to databases (FC disks)
  - ~0.8 GB/s bandwidth, 150-200 TB each
  - 12 disk servers each (2 x 1 Gbit/s uplinks + 2 FC4 connections), partly configured as GridFTP servers (if needed), 64-bit OS (see next slide)
- 1 CX4-960 (SATA disks): ~2 GB/s bandwidth, 600 TB
- Other older hardware (FlexLine, FAStT, etc.) being progressively phased out: no support (partly used as cold spares), not suitable for GPFS
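The capacity figures above are internally consistent, which a short sketch can verify (the raw and net totals and the 150-200 TB per-system range come from the slide; attributing the raw-to-net shrinkage to RAID parity and spares is an assumption):

```python
# Sanity check of the disk capacity figures quoted on the slide.
raw_pb = 2.6                      # raw disk space
net_pb = 2.1                      # net usable disk space
usable_fraction = net_pb / raw_pb # ~81%, plausible for RAID parity + spares (assumption)

cx380_count = 13
cx380_tb_range = (150, 200)       # per-system capacity range from the slide
cx380_total_tb = (cx380_count * cx380_tb_range[0],
                  cx380_count * cx380_tb_range[1])

print(f"Usable fraction: {usable_fraction:.0%}")
print(f"CX3-80 aggregate: {cx380_total_tb[0]}-{cx380_total_tb[1]} TB")
```

The aggregate CX3-80 range (roughly 2.0-2.6 PB) brackets the quoted ~2.6 PB raw total, so the numbers hang together.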
Tape libraries
New SUN SL8500 in production since July 2008:
- 10,000 slots
- 20 T10KB drives (1 TB) in place
- 8 T10KA drives (0.5 TB) still in place
- 4000 tapes online (4 PB)
"Old" SUN SL5500:
- 1 PB online, 9 9940B and 5 LTO drives
- Nearly full, no longer used for writing
- Repack ongoing: ~5000 200 GB tapes to be repacked onto ~500 GB tapes
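The repack arithmetic follows directly from the slide's figures (tape counts and capacities are from the slide; ignoring compression is an assumption):

```python
# Repack sizing sketch: moving ~5000 old 200 GB cartridges onto ~500 GB
# cartridges, ignoring compression.
old_tapes = 5000
old_gb = 200                      # old cartridge capacity
new_gb = 500                      # new cartridge capacity
data_tb = old_tapes * old_gb / 1000          # total data to move, in TB
new_tapes_needed = old_tapes * old_gb // new_gb

print(f"Data to repack: {data_tb:.0f} TB on {new_tapes_needed} new tapes")
```

So the ~1 PB on the SL5500 would fit on roughly 2000 of the newer cartridges.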
How experiments use the storage
Experiments present at CNAF make different use of (storage) resources:
- Some use almost only the disk storage (e.g. CDF, BaBar)
- Some also use the tape system as an archive for older data (e.g. VIRGO)
- The LHC experiments exploit the functionalities of the HSM system, but in different ways: some (CMS and ALICE) use primarily the disk with a tape back-end, while others (ATLAS and LHCb) concentrate their activity on the disk-only storage (see next slide for details)
Standardization over a few storage systems and protocols:
- SRM vs. direct access
- file and rfio as LAN protocols
- GridFTP as WAN protocol
- Some other protocols used but not supported (xrootd, bbftp)
STORAGE CLASSES
Three classes of service/quality (aka Storage Classes) are defined in WLCG. Present implementation of the 3 SCs at CNAF:
- Disk1 Tape0 (D1T0 or online-replica): GPFS/StoRM
  - Space managed by the VO
  - Mainly LHCb and ATLAS, some usage from CMS and ALICE
- Disk1 Tape1 (D1T1 or online-custodial): GPFS/TSM/StoRM
  - Space managed by the VO (i.e. if the disk is full, the copy fails)
  - Large disk buffer with tape back-end and no garbage collector
  - LHCb only
- Disk0 Tape1 (D0T1 or nearline-replica): CASTOR
  - Space managed by the system
  - Data migrated to tape and deleted from disk when the staging area is full
  - CMS, LHCb, ATLAS; ALICE testing GPFS/TSM/StoRM
This setup satisfies nearly all WLCG requirements so far, except:
- Multiple copies in different Storage Areas for a SURL
- Namespace orthogonality
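The storage-class mapping above can be summarized as a small lookup table (the class names, WLCG aliases and back-ends are from the slide; the table layout and the helper function are illustrative, not part of any CNAF software):

```python
# Lookup table of the three WLCG storage classes as implemented at CNAF
# (contents from the slide; the structure itself is a sketch).
STORAGE_CLASSES = {
    "D1T0": {"wlcg_name": "online-replica",   "backend": "GPFS/StoRM",
             "space_managed_by": "VO"},
    "D1T1": {"wlcg_name": "online-custodial", "backend": "GPFS/TSM/StoRM",
             "space_managed_by": "VO"},
    "D0T1": {"wlcg_name": "nearline-replica", "backend": "CASTOR",
             "space_managed_by": "system"},
}

def backend_for(storage_class: str) -> str:
    """Return the system implementing a given storage class at CNAF."""
    return STORAGE_CLASSES[storage_class]["backend"]

print(backend_for("D0T1"))        # prints: CASTOR
```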