1 Stanford Linear Accelerator Center Site Report
HEPiX@RAL, April 1999
Randy Melen, SLAC Computing Services/Systems, HPC Team Leader

2 Past 12 months...
- Busy!
- Target of May 9 for the BaBar detector to begin taking data
- Challenge to get systems assembled and tested in time, to get the C++ code working and sufficiently optimized, and to handle 100 events/second for reconstruction and event recording (a rough sizing sketch follows below)
- Once BaBar data taking begins, it is more difficult to make system changes or take service outages
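
A back-of-the-envelope way to turn the 100 events/second goal into a farm size; a minimal sketch only, where the per-event CPU time and scheduling efficiency are hypothetical assumptions, not figures from these slides:

    // Rough farm sizing for the 100 events/second reconstruction target.
    // ASSUMPTION: sec_per_event and cpu_efficiency are hypothetical;
    // the slides state only the 100 Hz goal.
    #include <cstdio>

    int main() {
        const double event_rate_hz  = 100.0;  // target rate from the slides
        const double sec_per_event  = 2.0;    // hypothetical CPU seconds per event
        const double cpu_efficiency = 0.8;    // hypothetical scheduling efficiency

        const double cpus_needed = event_rate_hz * sec_per_event / cpu_efficiency;
        std::printf("CPUs needed for %.0f events/s: %.0f\n",
                    event_rate_hz, cpus_needed);
        return 0;
    }

Under these assumptions the answer is 250 CPUs, which is at least the right order of magnitude for the U5 farm described on the following slides.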

3 New Hardware Developments
- Grew the Solaris batch compute farm from 5 Sun Ultra 2300 systems to 18
- Upgraded the Sun UE6000 to 4GB memory
- Acquired 4 Sun UE4500 systems, since increased to 6, for HPSS data movers, with a total of 4TB of disk
- Acquired a Sun UE10000 (24 CPUs, 12GB memory, 1.5TB disk, 2 domains)
- 4 Sun E250 systems as tape movers
- 3 IBM F50 systems as data movers

4 New Hardware Developments (cont.)
- Added 220 Sun U5 systems (256MB memory, 9GB IDE disk, 333MHz UltraSPARC IIi with 2MB cache, $188/SI95, i.e. dollars per unit of SPECint95 performance; see the worked example below)
- Expect to add ~200 more U5 systems in 2Q1999, probably more disk, and perhaps a UE10000 upgrade to 400MHz CPUs
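
How a price/performance number like $188/SI95 is computed; a minimal sketch with deliberately round, hypothetical figures (the slide reports only the final $188/SI95 for the actual U5s):

    // Price/performance: dollars per SPECint95 (SI95) unit.
    // ASSUMPTION: both numbers below are hypothetical placeholders.
    #include <cstdio>

    int main() {
        const double node_price_usd = 2000.0;  // hypothetical per-node price
        const double si95_per_node  = 10.0;    // hypothetical SPECint95 rating

        std::printf("$/SI95 = %.0f\n", node_price_usd / si95_per_node);
        return 0;
    }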

5 Farm Management
- Upgraded the LSF farm master to an IBM F50
- Working with Sun AutoClient software and CacheFS to centrally manage 200-400 Sun U5 systems
- Actively doing Solaris performance tuning on the UE6000 and UE10000
- Adding 2 Sun E250 systems as BaBar build servers; they need to be able to build 1M lines of C++ code each night (twice?)

6 Mass Storage Hardware
- Upgraded the 5 STK silos to PowderHorn robots
- Added a 6th STK silo and 12 STK Eagle drives; more Eagle drives will be needed
- Need to add a BaBar data import/export tape device; considering an STK 9740 with DLT 7000 and RedWood drives

7 Farm Network Technology
- Currently using 3 Cisco Catalyst 5500 switches (~1.2 Gbps backplanes), everything on Fast Ethernet, single collision domains
- Migrating to 3 Cisco Catalyst 6509 switches (~16 Gbps backplanes); a rough oversubscription estimate follows below
- Deploying Gigabit Ethernet on ~16 Solaris servers
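
A worst-case arithmetic sketch of why the backplane upgrade matters, using only figures already on these slides (220 Fast Ethernet farm nodes, 3 switches, ~1.2 vs ~16 Gbps backplanes) plus the pessimistic assumption that every node transmits at line rate at once:

    // Switch backplane oversubscription, worst case.
    // Figures come from the slides; the all-nodes-at-line-rate
    // traffic pattern is an assumption for illustration.
    #include <cstdio>

    int main() {
        const double nodes          = 220.0;  // U5 farm nodes (slide 4)
        const double fast_enet_gbps = 0.1;    // 100 Mbit/s Fast Ethernet per node
        const int    switches       = 3;
        const double cat5500_gbps   = 1.2;    // Catalyst 5500 backplane
        const double cat6509_gbps   = 16.0;   // Catalyst 6509 backplane

        const double offered = nodes * fast_enet_gbps / switches;  // per switch
        std::printf("offered load per switch: %.1f Gbps\n", offered);
        std::printf("oversubscription: %.1fx on the 5500, %.2fx on the 6509\n",
                    offered / cat5500_gbps, offered / cat6509_gbps);
        return 0;
    }

Roughly 7.3 Gbps of offered load per switch is about 6x a 5500 backplane, but well under half of a 6509's.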

8 HPSS Phase 3 (Porting) Ongoing
- With assistance from Sun, began moving and testing the Solaris 2.5.1 port on Solaris 2.6
- Many issues related to getting the infrastructure pieces at the correct version levels
- Began the HPSS 4.1 data mover port to Solaris 2.6
- Sun and IBM signed an agreement for IBM to port HPSS 4.1A; we expect to deploy it ~4Q1999

9 HPSS Stage 4 (PRV0) Plans
- While the Solaris port continues, use the IBM F50 systems as data movers
- Move development (porting and testing) to the Solaris E250 build servers

10 Currently Supported Systems
- General servers
  - generally Solaris 2.5.1 --> Solaris 2.6
  - AFS servers will become Sun U2300 systems for AFS 3.5 multithreading
  - AIX 4.1.5 --> 4.2.1
  - phasing out the "core" NFS file server (AIX 3.2.5!) by moving binaries and home directories to AFS
- Farm servers
  - AIX 4.2.1 now frozen; not a porting platform for BaBar as of 7/1998
  - Solaris 2.5.1 --> 2.6 completed
- Desktop
  - still NT, though much more Linux than before

11 Intel Farm Prototype
- A prototype 17-node Intel compute farm acquired 4Q1998:
  - 2-way Dell systems: 450MHz Pentium II, 256MB memory, 9GB disk
  - partnership with the Accelerator Research group and NERSC
  - strong interest in MPI and in developing for Cray T3E production (a minimal MPI sketch follows below)
  - decided on Linux from Red Hat
  - modest success so far with scalability
  - expect to expand to 32 nodes in 3Q1999
  - Remaining issues:
    - commercial software support (e.g., Objectivity, AFS, LSF with AFS support)
    - manageability of large numbers of systems
    - MPI cluster vs. "task farm"
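
For context, the kind of minimal MPI test one might use to shake out such a cluster; a sketch under the assumption of a standard MPI-1 installation (e.g., MPICH), not the site's actual code:

    // Pass a token once around a ring of MPI ranks to exercise
    // point-to-point communication between cluster nodes.
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);

        int rank = 0, size = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (size < 2) {  // a ring needs at least two ranks
            std::printf("run with at least 2 ranks\n");
            MPI_Finalize();
            return 0;
        }

        int token = 0;
        MPI_Status status;
        if (rank == 0) {
            token = 42;
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD, &status);
            std::printf("token made it around %d ranks\n", size);
        } else {
            MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD, &status);
            MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

With an MPICH-style toolchain this would be built with the mpiCC wrapper and launched as, e.g., mpirun -np 17 ./ring.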

