CASPUR Site Report Andrei Maslennikov Group Leader - Systems RAL, April 1999.



A.Maslennikov - HEPiX - RAL 99

Will be briefly covered:
- Central computers
- Other nodes
- Network
- Distributed storage
- Tape-related systems
- CASPUR and HEP
- Gentes/Ateneo project
- Short-term plans

Central computers

Alpha SMP Cluster 4100 - 28 processors - DU 4.0d
- interactive (front-end): 1 x 400 MHz / 2 GB
- parallel batch (LSF): 4 x 400 MHz / 1 GB + 2 x 600 MHz / 2 GB
- 1999: 20 more EV6 processors (or upgrade to), 32-proc "wildfire"?

Sun SMP - 22 processors - Solaris 2.6
- interactive + parallel batch (LSF): 1 x 3500 / 336 MHz / 2 GB (8 processors)
- parallel batch (LSF): 1 x 4500 / 336 MHz / 3.6 GB (14 processors)
- 1999: waiting for new SMP models

IBM SP2 - 32 processors - AIX 4.3.2++ / PSSP 2.4++
- interactive: 4 thin nodes (390)
- serial batch (LSF): 12 thin nodes
- parallel batch + interactive (EASY): 16 thin nodes
- 1999: waiting for SP3 offer (need SMP nodes with 4-16 processors)
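The processor totals quoted for the three central systems can be cross-checked against the per-role breakdown. The sketch below is only an illustration: the Sun and IBM per-role counts come straight from the slide, while the figure of 4 CPUs per Alpha node is an assumption (it is not stated on the slide, but 7 nodes x 4 CPUs matches the quoted 28). The names `systems` and `tally` are hypothetical.

```python
# Cross-check of the quoted processor totals against the per-role breakdown.
# ASSUMPTION: each Alpha 4100 node carries 4 CPUs (not stated on the slide).
ALPHA_CPUS_PER_NODE = 4

systems = {
    "Alpha SMP Cluster 4100": {
        "quoted_total": 28,
        "roles": {
            "interactive (front-end)": 1 * ALPHA_CPUS_PER_NODE,
            "parallel batch (LSF)": (4 + 2) * ALPHA_CPUS_PER_NODE,
        },
    },
    "Sun SMP": {
        "quoted_total": 22,
        "roles": {
            "interactive + parallel batch (LSF)": 8,
            "parallel batch (LSF)": 14,
        },
    },
    "IBM SP2": {
        "quoted_total": 32,
        "roles": {
            "interactive": 4,
            "serial batch (LSF)": 12,
            "parallel batch + interactive (EASY)": 16,
        },
    },
}


def tally(system):
    """Sum the per-role processor counts of one system."""
    return sum(system["roles"].values())


for name, system in systems.items():
    status = "matches" if tally(system) == system["quoted_total"] else "differs from"
    print(f"{name}: {tally(system)} processors, {status} the quoted total")

grand_total = sum(tally(s) for s in systems.values())
print(f"All central systems: {grand_total} processors")
```

Under that assumption all three breakdowns agree with the quoted totals, giving 82 processors across the central systems.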

Other nodes

Some 200 UNIX nodes under our direct supervision (all UNIX flavours, single nodes and clusters). Around 100 PCs running Windows and Linux.

Worth mentioning:
- Linux Beowulf Cluster (10 PPro 200 + 4 PII 400), MPI with GAMMA protocol on Digital FE cards
- Graphics nodes: 2 Alpha 533au(2) with 4D51T and 4D60T cards with 64 MB of texture memory
- 2 Power-3 dual-processor AIX nodes

Network

- In 1998 our LAN became fully switched; we currently have around 100 100BaseT switch ports.
- Switch hardware: several Cabletron and Compaq switches interconnected via Gigabit Ethernet; we also use virtual LANs.
- Principal nodes are on FDDI (22 DEC GigaSwitch ports).
- Planning to try Gigabit Ethernet at host level; a few GE cards are already under test on Sun and Linux.

Distributed Storage

A TCP/IP-less datastore with true data sharing across platforms is not yet available, so we are still investing in both NFS and AFS solutions.
- NFS is mainly used as a store for large data files, and as an element of the Staging System.
- AFS is used for home directories and as a store for collections of various ready-to-run software. We currently run 6 cells with some 300 GB online, also over WAN.

NFS: one more Filer

Current NFS server: Network Appliance F540 Filer with 150 GB of formatted RAID space, on FE and FDDI.

Just ordered: another Filer (F760 / 600 MHz / 1 GB) with 300 GB of RAID disk and GE/FDDI network interfaces
- 3 times more NFS ops/sec than the F540
- allows for clustering (better scalability)

AFS: news since last report

- Purchased the AFS source code. This allowed us to compile AFS on Solaris/Intel (thanks to Rainer Toebbicke/CERN, who proved that this is possible).
- University of Rome-3 went Solaris/Intel also for DB (3 servers).
- Abdus Salam Centre for Theoretical Physics joined our AFS license.
- Upgraded central servers (now 3 Alpha 500au on FE and FDDI). They have proved to be very stable and performant.

We go Fibre Channel!
- Just ordered 280 GB of RAID-5/FC from Artecon
- Dual active-active controllers
- Gadzoox hubs and HBAs from Genroco
- This system will replace most of the on-site AFS disks.

Tape access

During 1998, all services which use the tape robotics operated seamlessly: AFS and ADSM backups, staging. Some 80 GB were placed in deep archive via the Staging System.

With the F540 Filer we stage at 4+ MB/s, almost at the limit of Timberline tape.

In 1999 we plan to replace the STK Silo with a 9840 library:
- doubles the tape speed
- BABAR-compliant
- smaller maintenance fees
- frees physical space in the computer centre
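To give a feel for what the quoted staging rate means in practice, the following back-of-the-envelope sketch estimates how long it would take to move a volume the size of the 1998 deep archive. It is illustrative only and rests on several assumptions not stated on the slide: 1 GB is taken as 1024 MB, the rate is treated as sustained with no mount or positioning overhead, and the "doubled" tape speed is assumed to translate directly into an 8 MB/s staging rate. The function name `transfer_hours` is hypothetical.

```python
# Rough transfer-time estimate for tape staging.
# ASSUMPTIONS: 1 GB = 1024 MB; sustained rate; no mount/seek overhead.

def transfer_hours(volume_gb, rate_mb_per_s):
    """Hours needed to move volume_gb at a sustained rate_mb_per_s."""
    seconds = volume_gb * 1024 / rate_mb_per_s
    return seconds / 3600


ARCHIVE_GB = 80      # volume placed in deep archive during 1998 (from the slide)
CURRENT_RATE = 4.0   # MB/s, the "4+" staging rate quoted for the F540
DOUBLED_RATE = 8.0   # MB/s, ASSUMED effect of doubling the tape speed

current = transfer_hours(ARCHIVE_GB, CURRENT_RATE)
doubled = transfer_hours(ARCHIVE_GB, DOUBLED_RATE)

print(f"At {CURRENT_RATE} MB/s: {current:.1f} hours")
print(f"At {DOUBLED_RATE} MB/s: {doubled:.1f} hours")
```

Under these assumptions, 80 GB at 4 MB/s corresponds to roughly 5.7 hours of pure transfer time, and half that if the staging rate doubles along with the tape speed.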

CASPUR and HEP

- Geographical AFS system support for INFN
- Regular ASIS mirroring over WAN to 17 INFN Sections across Italy
- Linux system support for INFN:
  - Linux tree maintenance
  - AFS-enabled bootable Linux CDs at the latest patch level
- Software collaboration with CERN (ASIS, Linux, AFS)
- Regional Centre for BABAR: full-scale system support

Gentes/Ateneo project

Scope: provide a turnkey computing environment for a generic research organization / university department.
- Fully Intel-based
- Desktop on Linux and/or WNT
- Just 4 Intel machines make up the core:
  - Entry Point Linux host with a firewall
  - AFS fileserver on Solaris
  - Management Linux host with YARD DBMS and https tools
  - General Services (mail, web, print, efax, ppp, majordomo etc.) on a single Linux (SMP) machine
- WNT/Linux AFS-based integration: single password, common filestore, YARD ODBC
- Client installation: cloning with Norton Ghost
- Progressing well. First presentation: June 1999.

Some short-term plans

- Compile AFS 3.5 Server on Solaris/Intel
  - will improve performance for en masse serving of small files
- Test FC on Linux (QLogic card)
  - first to provide RAID space for the mail spool
  - next to take a look at Global File System (with Seagate disks)
- Test FC on AIX
  - CASPUR will probably be asked to propose a set of high-availability services for PCM; IBM DFS with FC RAID might make a good combination.
- Try LoadLeveler on Solaris
  - LSF is becoming too expensive (they charge per CPU)

