TRIUMF Site Report for HEPiX, Edinburgh, 24-28 May 2004 – Corrie Kost

1 TRIUMF SITE REPORT – Corrie Kost

2 LINUX at TRIUMF
Use  ISO CDs  Kickstart  Available  Auto Updates
RH9: Servers, Desktop; Yes (only errata / no new hardware)
Fedora Core 1: Leading Desktop, Special Needs Servers; Yes; Errata – 18 months
Scientific Linux: Desktop. Future Servers & Desktops – Support!; Yes; 36 months for hardware, 60 months for errata by RH
TRIUMF urges proper support for Scientific Linux

3 WAN Replacement MRV units (10Gb/sec capable) Third Passport Router

4 WestGrid – UBC/TRIUMF Site
504 dual 3.06 GHz Xeon IBM blades
Red Hat Linux 9 to allow GPFS (NFS nixed)
OPENPBS Scheduling with (MOAB) Maui
10 TB disk storage
70 TB tape storage
Direct Gigabit connection between sites
Possible 10Gb in future
February 2004 – opened for general use.

5 WestGrid – UBC/TRIUMF Site (www.westgrid.ca)
From a cold start:
GPFS servers load in 5-10 min
All nodes up in 60-90 min
Bring up single nodes – 10 min
Rebuild (disk) for node – 2 hrs
Single node failure rate ~ 1/day
Node disk failures dominate
Utilization about 87%
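The cluster-wide failure rate can be turned into a rough per-node figure. The sketch below is only back-of-envelope arithmetic: it assumes the "~1/day" rate applies to the 504 blades quoted for this site and that failures are spread evenly across them. The function names are illustrative, not from the talk.

```python
# Rough per-node reliability implied by "single node failure rate ~ 1/day".
# Assumption: the rate applies to the 504 blades and failures are evenly spread.
NODES = 504
FAILURES_PER_DAY = 1.0


def per_node_mtbf_days(nodes, failures_per_day):
    """Mean days between failures for any one node."""
    return nodes / failures_per_day


def annual_failure_fraction(nodes, failures_per_day, days_per_year=365.0):
    """Expected failures per node per year."""
    return failures_per_day * days_per_year / nodes


if __name__ == "__main__":
    mtbf = per_node_mtbf_days(NODES, FAILURES_PER_DAY)
    yearly = annual_failure_fraction(NODES, FAILURES_PER_DAY)
    print(f"per-node MTBF: about {mtbf:.0f} days")
    print(f"expected failures per node per year: about {yearly:.2f}")
```

Under these assumptions an individual blade fails roughly once every year and a half, even though the site as a whole sees a failure every day.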

6 Network / Servers

7 Servers Upgrade Program

8 LCG Grid Participant

9 Hardware nice but…
40pin IDE cable is a problem with 2.6 kernel
Mounting bracket screws can short audio & halt boot
High I/O Testbed

10 STORM1 & STORM2
Dual 3.2 GHz Xeons
4GB memory
4 3WARE LP
16 SATA GB DRIVES
20GB ST92011A DRIVE
INTEL 10GBE PXLA8590LR

11 High Speed I/O – Part 1
Used ext2 for highest speeds (no journaling, but 2TB file size limit)
RH 9
One to four disks (writes), software RAID 0, 3-Ware controller: 50.6, 98, 124, 141 MB/sec respectively.
Four disks split over two 3-Ware controllers: 162 MB/sec writes
Four disks on 1 hardware RAID 0 and software RAID 0: 138 MB/sec writes
Adding 4 more disks on second 3-Ware – 250 MB/sec (slots 2,5) MB/sec (slots 2,3)
Adding 4 more disks on third 3-Ware MB/sec (slots 2,3,5) MB/sec (slots 2,3,4)
Adding 4 more disks on fourth 3-Ware MB/sec (slots 2,3,4,5)
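The one-to-four-disk write rates show how far software RAID 0 on a single controller falls short of linear scaling. The sketch below computes that shortfall from the four figures quoted on the slide. It takes "ideal" to mean n times the single-disk rate, which is an assumption made here for illustration, and the function name is hypothetical.

```python
# Write rates (MB/sec) for 1, 2, 3 and 4 disks on one 3-Ware controller,
# as quoted on the slide.
WRITE_MB_PER_SEC = [50.6, 98.0, 124.0, 141.0]


def scaling_efficiency(rates):
    """Return measured rate / (n * single-disk rate) for n = 1..len(rates)."""
    single = rates[0]
    return [rate / (n * single) for n, rate in enumerate(rates, start=1)]


if __name__ == "__main__":
    for n, eff in enumerate(scaling_efficiency(WRITE_MB_PER_SEC), start=1):
        print(f"{n} disk(s): {eff:.0%} of ideal")
```

The efficiency drops steadily as disks are added to the same controller, which matches the report that splitting four disks over two controllers gave a higher rate (162 MB/sec) than four disks on one.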

12 High Speed I/O – Part 2
Using 4 3-ware in hardware raid 0 mode, software raided by Linux
dd if=/dev/zero of=/raid/8GB bs=81920 count=
Fedora1 – non-smp – np1 HT ext2 -T news write 370 MB/sec
Fedora1 – non-smp – np1 HT reiserfs write 227 MB/sec
Loaded e2fs module to fix -largefile and -largefile4 creation with mkfs -T largefile /dev/md0
Fedora1 – non-smp – npt1 HT largefile ext2 write 349 MB/sec
Fedora1 – non-smp – npt1 noHT largefile ext2 write 300 MB/sec
Fedora1 – non-smp – 2.6.6#1 HT largefile ext2 write 375 MB/sec
Replaced 40 with 80 pin ide cable to main disk allowed SMP to boot
Fedora1 – SMP – 2.6.6#1 noHT largefile ext2 write 309 MB/sec
echo > /proc/sys/net/core/rmem_default
echo > /proc/sys/net/core/rmem_max
echo > /proc/sys/net/core/wmem_default
echo > /proc/sys/net/core/wmem_max
echo > /proc/sys/net/core/netdev_max_backlog
echo > /proc/sys/net/core/optmem_max
sysctl -w net.ipv4.tcp_rmem=" "
sysctl -w net.ipv4.tcp_wmem=" "
sysctl -w net.ipv4.tcp_mem=" "
Iperf maxed out at 2.3Gbits/sec with recompiled kernel for WEB100
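The dd invocation above times a sequential write of zero-filled blocks. A minimal sketch of the same kind of measurement follows. It is not the procedure used for the quoted figures: the block count is elided in the transcript, so the count here is a small hypothetical value. The target path is a temporary file rather than /raid, and 1 MB is taken as 10^6 bytes.

```python
import os
import tempfile
import time


def measure_write_mb_per_sec(path, block_size=81920, count=100):
    """Write `count` zero-filled blocks to `path` and return MB/sec.

    block_size matches the bs=81920 on the slide; count is hypothetical.
    """
    block = bytes(block_size)  # zero-filled, like reading /dev/zero
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # include time to push data to the device
    elapsed = time.perf_counter() - start
    return block_size * count / elapsed / 1e6


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "testfile")
        rate = measure_write_mb_per_sec(target)
        print(f"sequential write: {rate:.1f} MB/sec")
```

With a file this small the result mostly reflects the page cache. A meaningful run needs a file several times larger than memory, which is presumably why the original test wrote to a file named 8GB on a 4GB machine.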

13 High Speed I/O – Part 3
root]# time ttcp -t -b l storm1-10g storm1-10g
ttcp-t: socket
ttcp-t: sndbuf
ttcp-t: connect
ttcp-t: bytes in real seconds = KB/sec +++
ttcp-t: I/O calls, msec/call = 0.52, calls/sec =
ttcp-t: 0.0user 22.2sys 0:42real 52% 0i+0d 0maxrss 0+25pf csw
Ttcp disk to disk 191 Mbytes/sec
Three Walls:
CPU % seen
3Ware I/O Controller (140MB/sec instead of 4*50, 375MB/sec instead of 4*140)
10Gbit Intel Card using ixgb driver (2.3 Gb/sec)
Ongoing: Tuning
Process Affinity (using /usr/bin/run)
Interrupt Affinity (IRQ of 3-ware and 10GbE set to CPUs eg /proc/irq/24/smp_affinity)
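The smp_affinity file under /proc/irq takes a hexadecimal bitmask in which bit n selects CPU n. The sketch below builds that mask from a list of CPU numbers. The slide does not say which CPUs were assigned to which IRQ, so the CPU choice in the example is hypothetical, and the helper only prints the command rather than writing to /proc.

```python
def smp_affinity_mask(cpus):
    """Return the hex bitmask string selecting the given CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError(f"CPU number must be non-negative, got {cpu}")
        mask |= 1 << cpu  # bit n selects CPU n
    return format(mask, "x")


if __name__ == "__main__":
    irq = 24       # IRQ number taken from the slide's example path
    cpus = [1]     # hypothetical choice: pin this IRQ to CPU 1
    print(f"echo {smp_affinity_mask(cpus)} > /proc/irq/{irq}/smp_affinity")
```

Pinning the disk controller and network card interrupts to different CPUs is one way to keep the two from competing for the same processor, which fits the CPU limit listed among the three walls.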

14 Misc. Developments
Build a cheap hot-swap Serial ATA RAID 5 system
1 Promise FastTrak S150 SX4 controller $233 CAN
3 Promise SuperSwap 1100 drive enclosures for SATA/150 $112 CAN
3 Maxtor 120GB S-ATA drives (6Y120M0) $145 CAN
Test on cheap 1.7GHz Celeron, Intel D845GVSLR, 256MB memory
Redhat 9.0 base (won't work on updated kernels)
Read large file – 46.8 Mbytes/sec
Write large file – 46.5 Mbytes/sec
Able to pull disk while active – auto rebuilds in 75 min when replaced.
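Two figures follow directly from the parts list: the usable capacity of a three-drive RAID 5 set and the average rate implied by a 75-minute rebuild. The sketch below works both out. It assumes the rebuild rewrites the whole 120GB replacement drive and takes 1 GB as 10^9 bytes. Neither assumption is stated on the slide, and the function names are illustrative.

```python
def raid5_usable_gb(drives, drive_gb):
    """RAID 5 keeps one drive's worth of parity, so usable = (n - 1) * size."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drives - 1) * drive_gb


def rebuild_mb_per_sec(drive_gb, minutes):
    """Average rate if the whole replacement drive is rewritten (assumption)."""
    return drive_gb * 1e9 / (minutes * 60) / 1e6


if __name__ == "__main__":
    print(f"usable capacity: {raid5_usable_gb(3, 120)} GB")
    print(f"implied rebuild rate: {rebuild_mb_per_sec(120, 75):.1f} MB/sec")
```

Under these assumptions the rebuild runs at a little over half the measured large-file read and write rates.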

15 Misc. Developments
Remote power on/off using networked power bars

16 Mail at TRIUMF

17 IMP Webmail

18 Squirrel Webmail

