
CASPUR Site Report Andrei Maslennikov Lead - Systems Edinburgh, May 2004.




1 CASPUR Site Report Andrei Maslennikov Lead - Systems Edinburgh, May 2004

2 A.Maslennikov - Edinburgh 2004
Contents
- Update on central computers
- Storage news
- Network highlights
- Projects 2004

3 Central computers
IBM SMP:
- 3 frames with 80 POWER-4 CPUs at 1.1 GHz and 144 GB of RAM
- 1 legacy frame with 64 POWER-3 CPUs at 375 MHz and 64 GB of RAM
- AIX 5.2 ML2+, AFS and SGEEE on all nodes
- Very stable, all CPUs are heavily used
- Under lease until 2006; will probably upgrade to POWER-5 in 2005
HP SMP:
- 8 4-CPU ES45 nodes at 1.25 GHz, 64 GB of RAM and 1.2 TB of local FC disk
- 6 legacy ES40 nodes at 500-600 MHz used for the BIOGRID project
- Tru64 5.1a++ on all nodes, AFS + SGEEE on 5 standalone nodes
- TruCluster on 9 nodes (AFS via Translator; powerful Solaris 9 gateway, memcache, modified SSH)
- Requires a lot of attention, but very fast and fully used (mainly computational chemistry apps)
- Arriving: 32-CPU EV7 node
Itanium-2 SMP:
- 1 single-CPU, 5 biprocessor and 1 quad-CPU nodes (900 MHz - 1 GHz - 1.5 GHz)
- RH AS 3 on one node; all others run the CERN CEL3/AS3 build for ia64, AFS, SGEEE
NEC SX-6i:
- Single CPU, 4 GB RAM, 8 GFLOPS
- Speedup of up to 10x against POWER4 for some apps; currently considering an SMP purchase
Reference: several biprocessor Intel/32-bit and AMD/64-bit systems

4 Storage update
AFS: 4 cells on site and 6 outside
- OpenAFS 1.2.11 on Linux
- Main servers: SuperMicro 2 x 2.8 GHz; June 2004: 6 TB (Infortrend SATA/FC)
- Vice partitions on SGI XFS: only one XFS-related problem in 1.5 years
- Standalone backup server on GigE, 84 GB/hour with 2 LTO2 drives
- 3 cells have been running Heimdal KDC for 6 months
- AFS-aware SSH 3.8p1 binary builds (GSSAPI, K5 or AFSpw login + token)
- Linux / WXP Heimdal single sign-on and AFS homedir in one of the cells. Administration: ssh, but have just successfully tested AD (with the help of INFN-Lecce)
- Will soon be migrating INFN's national cell to K5-MIT (cross-realm and Win issues)
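To put the quoted backup rate in perspective, here is a small worked calculation. It is only a sketch: it assumes decimal units (1 TB = 1000 GB), a sustained 84 GB/hour, and a full pass over the whole 6 TB of AFS capacity, none of which the slide states. The function names are illustrative.

```python
def backup_hours(data_tb, rate_gb_per_hour=84.0, gb_per_tb=1000.0):
    """Hours needed to stream data_tb terabytes at the given aggregate rate."""
    return data_tb * gb_per_tb / rate_gb_per_hour


def per_drive_mb_per_s(rate_gb_per_hour=84.0, drives=2, mb_per_gb=1000.0):
    """Average per-drive throughput in MB/s for the aggregate rate."""
    return rate_gb_per_hour * mb_per_gb / drives / 3600.0


if __name__ == "__main__":
    # Assumption: a full dump of all 6 TB at a constant 84 GB/hour.
    print(f"full 6 TB pass: {backup_hours(6):.1f} hours")
    print(f"per LTO2 drive: {per_drive_mb_per_s():.1f} MB/s")
```

Under these assumptions a full pass takes roughly three days, with each of the two drives averaging a little under 12 MB/s.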

5 Storage update - 2
NFS (Mountain View Data):
- In production for 1.5 years, very stable (runs off XFS, no crashes so far)
- 2 SuperMicro 2 x 2.8 GHz; June 2004: 8 TB (Infortrend SATA/FC)
- 0.5 TB under staging (5 TB archived)
Digital Library services on GFS:
- Science Server, Web of Science web services: heavy load
- 3300 scientific journals, 2.5 million articles in full-text PDF, searchable DB
- Needed for load balancing: shared filestore with locking
- On Sistina GFS for 6 months, 3 SM 2-way servers, 16 TB (Infortrend SATA/FC)
- EXT3 copy of everything (tape backup is too slow for this number of files)
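The "shared filestore with locking" requirement can be illustrated with a minimal sketch. It is not CASPUR's code: it assumes a Unix host and a shared file system that honours POSIX advisory locks across nodes, and the helper name `append_record` is hypothetical. Each load-balanced server would take an exclusive lock before appending to a common file, so concurrent writers do not interleave their records.

```python
import fcntl
import os


def append_record(path, line):
    """Append one line to a shared file while holding an exclusive advisory lock."""
    with open(path, "a") as f:
        # Blocks until no other cooperating process holds the lock on this file.
        fcntl.lockf(f, fcntl.LOCK_EX)
        try:
            f.write(line.rstrip("\n") + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the data to storage before releasing the lock
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)
```

Advisory locks only protect against other processes that also take the lock, so every writer has to go through the same helper.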

6 Network highlights
A multitude of networks under control of the Clavister FW
- Internal workplaces, training class, visitors room: only outgoing connectivity
- Internal and external DMZs, lab networks, internal DNS: quite complex
- Private NAS GigE network outside the FW
- FW is far from saturation
Internet Exchange Point - NAMEX
- About 20 big customers (Telecom, Tiscali, Albacom, mobile operators, industry)
- Traffic: around 1 Gbit/s
F-Root Name Server
- Second in Europe after Madrid, first (and still the only one) in Italy
IPv6
- Active member of the 6NET project
- CASPUR's web site can be reached over IPv6

7 CASPUR: principal resources in 2004
Diagram labels:
- IBM: 150 CPUs (375-1100 MHz)
- Itanium2: 15 CPUs (0.9-1.5 GHz)
- HP: 60 CPUs (667-1200 MHz)
- FC SAN
- FC tape systems: 60 / 120 TB
- FC RAID systems: 32 TB
- Private NAS GigE
- AFS: 6 TB
- NFS: 8 TB
- AFS Backup and Data Movers
- Digital Library: 16 TB
- Internet
- Internal infrastructure
- Internal GigEs
- TSM Backup
- NEC SX-6i

8 Some activities in 2004
Technology tracking (in collaboration with CERN and other centers): 1 FTE
- New storage devices
- New software solutions in the field of storage
- Excellent relationship with vendors; tested so far: more than 600 KUSD worth of hardware
Staging IIa: 1 FTE (funded by CSP/Turin)
- New version of Tape Dispatcher coming out (general clean-up, virtual tape library support)
- Remote FC tape / libraries will be supported
Data replication over WAN (in collaboration with ENEA and GARR): 0.5 FTE
- Several centers with identical data inside and outside RDBMS
- Each center has to be fully autonomous but should be able to forward any new data to all other centers
- Bidirectional DB and plain data exchanges with eventual mediation at the head organization
- Data mirroring with non-disruptive release scheme
University La Sapienza: student accounts
- Provide an account (space, personal web page, mail, etc.) for each of the 150 000 students
- In progress: active discussions with the Interdepartmental Computing Authority (CITICORD)
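The replication requirement (each center autonomous, yet forwarding new data to all the others) can be sketched as a toy in-memory model. This is an illustration only, not the project's design: the `Center` class, its methods and the `online` flag are all hypothetical, and conflict handling (the "mediation at the head organization") is left out.

```python
class Center:
    """Toy model of one autonomous site that forwards new records to its peers."""

    def __init__(self, name):
        self.name = name
        self.records = {}    # record_id -> payload held locally
        self.peers = []      # other centers this one forwards to
        self.online = True   # hypothetical stand-in for WAN reachability
        self.pending = []    # (peer, record_id, payload) not yet delivered

    def connect(self, *others):
        """Link this center with others in both directions."""
        for other in others:
            if other is not self and other not in self.peers:
                self.peers.append(other)
                other.connect(self)

    def add_local(self, record_id, payload):
        """Store locally first (autonomy), then try to forward to every peer."""
        self.records[record_id] = payload
        for peer in self.peers:
            self.pending.append((peer, record_id, payload))
        self.flush()

    def receive(self, record_id, payload):
        """Accept a forwarded record; an already known id is left untouched."""
        self.records.setdefault(record_id, payload)

    def flush(self):
        """Deliver queued records to reachable peers; keep the rest queued."""
        still_pending = []
        for peer, record_id, payload in self.pending:
            if self.online and peer.online:
                peer.receive(record_id, payload)
            else:
                still_pending.append((peer, record_id, payload))
        self.pending = still_pending


if __name__ == "__main__":
    a, b, c = Center("A"), Center("B"), Center("C")
    a.connect(b, c)
    b.connect(c)
    c.online = False              # C is temporarily unreachable
    a.add_local("r1", "new data") # A keeps working and queues the copy for C
    c.online = True
    a.flush()                     # the queued record reaches C once it is back
    print(sorted(c.records))
```

Writing locally before forwarding is what keeps a center usable when the WAN link is down; the pending queue carries the data over once the peer is reachable again.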

