
1 cn - fhe - jun 94-1 CERN
Physics Analysis on RISC Machines: Experiences at CERN
Saclay, 20 June 1994
Frédéric Hemmer, Computing & Networks Division, CERN, Geneva, Switzerland

2 cn - fhe - jun 94-2 CERN  CERN - The European Laboratory for Particle Physics
– Fundamental research in particle physics
– Designs, builds & operates large accelerators
– Financed by 19 European countries
– SFR 950M budget - operation + new accelerators
– 3,000 staff
– Experiments conducted by a small number of large collaborations: 400 physicists, 50 institutes, 18 countries, using experimental apparatus costing 100s of MSFR

3 cn - fhe - jun 94-3 CERN  Computing at CERN
– computers are everywhere
– embedded microprocessors
– 2,000 personal computers
– 1,400 scientific workstations
– RISC clusters, even mainframes
– estimate 40 MSFR per year (+ staff)

4 cn - fhe - jun 94-4 CERN  Central Computing Services
– 6,000 users
– Physics data processing
– traditionally: mainframes + batch
– emphasis on: reliability, utilisation level
– Tapes: 300,000 active volumes, 22,000 tape mounts per week

5 cn - fhe - jun 94-5 CERN  Application Characteristics
– inherent coarse grain parallelism (at event or job level)
– Fortran
– modest floating point content
– high data volumes: disks; tapes, tape robots
– moderate, but respectable, data rates - a few MB/sec per fast RISC cpu
– Obvious candidate for RISC clusters
– A major challenge

6 cn - fhe - jun 94-6 CERN  CORE - Centrally Operated RISC Environment
– Single management domain
– Services configured for specific applications, groups, but common system management
– Focus on data - external access to tape and disk services from CERN network, or even outside CERN

7  CORE Physics Services (configuration diagram, les robertson /cn; equipment installed or on order, January 1994)
– CERN Network; home directories & registry; consoles & monitors
– CSF Simulation Facility: 25 H-P; 5 H-P with GB RAID disk
– PIAF Interactive Analysis Facility: SPARCstations
– SHIFT data intensive services: processors 24 SGI, 11 DEC Alpha, 9 H-P, 2 SUN, 1 IBM; embedded disk 1.1 TeraBytes
– Central Data Services: Shared Disk Servers (260 GBytes; 6 SGI, DEC, IBM servers); Shared Tape Servers (3 tape robots, 21 tape drives, 6 EXABYTEs); 7 IBM, SUN servers; SPARCservers, Baydel RAID disks, tape juke box
– Scalable Parallel Processors: 8 node SPARCcenter, 32 node Meiko CS-2 (early 1994)

12 cn - fhe - jun 94-12 CERN  CSF - Central Simulation Facility
– second generation, joint project with H-P
– 25 H-P 735s - 48 MB memory, 400 MB disk; shared, load balanced
– configuration: interactive host, job queues, H-P 750, tape servers, ethernet, FDDI
– one job per processor; generates data on local disk, staged out to tape at end of job
– long jobs (4 to 48 hours)
– very high cpu utilisation: >97%
– very reliable: >1 month MTBI

13 cn - fhe - jun 94-13 CERN  SHIFT - Scalable, Heterogeneous, Integrated Facility
Designed in 1990 for:
– fast access to large amounts of disk data
– good tape support
– cheap & easy to expand
– vendor independent
– mainframe quality
First implementation in production within 6 months

14 cn - fhe - jun 94-14 CERN  Design choices
– Unix + TCP/IP
– system-wide batch job queues
– “single system image” target
– Cray style & service quality
– pseudo distributed file system; assumes no read/write file sharing
– distributed tape staging model (disk cache of tape files); the tape access primitives are: copy disk file to tape, copy tape file to disk

15 cn - fhe - jun 94-15 CERN  The Software Model
Diagram: disk servers, cpu servers, stage servers, tape servers and queue servers connected by an IP network.
Define functional interfaces: scalable, heterogeneous, distributed

16 cn - fhe - jun 94-16 CERN  Basic Software
– Unix Tape Subsystem (multi-user, labels, multi-file, operation)
– Fast Remote File Access System
– Remote Tape Copy System
– Disk Pool Manager
– Tape Stager
– Clustered NQS batch system
– Integration with standard I/O packages: FATMEN, RZ, FZ, EPIO, ...
– Network Operation Monitoring

17 cn - fhe - jun 94-17 CERN  Unix Tape Control
tape daemon:
– operator interface / robot interface
– tape unit allocation / deallocation
– label checking, writing

18 cn - fhe - jun 94-18 CERN  Remote Tape Copy System
– selects a suitable tape server
– initiates the tape-disk copy
Examples:
tpread -v CUT322 -g SMCF -q 4,6 pathname
tpwrite -v IX2857 -q 3-5 file3 file4 file5
tpread -v UX3465 `sfget -p opaldst file34`

19 cn - fhe - jun 94-19 CERN  Remote File Access System - RFIO
– high performance, reliability (improve on NFS)
– C I/O compatibility library
– Fortran subroutine interface
– rfio daemon started by open on remote machine
– optimised for specific networks
– asynchronous operation (read ahead)
– optional vector pre-seek: ordered list of the records which will probably be read next
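
A minimal sketch of what a client of the C compatibility library could look like. The slides do not show the API, so the rfio_open / rfio_read / rfio_close declarations below are assumptions based on the library mirroring the POSIX calls it replaces; the file path is taken from the disk pool example later in the talk, and the 32 KB record size follows the FDDI measurements.

#include <stdio.h>
#include <fcntl.h>

/* Assumed RFIO entry points; a real installation would get these from the
   RFIO header and library rather than from extern declarations. */
extern int rfio_open(const char *path, int flags, int mode);
extern int rfio_read(int fd, char *buf, int nbytes);
extern int rfio_close(int fd);

int main(void)
{
    char buf[32768];                 /* 32 KB records, as in the FDDI tests */
    const char *path = "/shift/shd01/data6/ws/panzer/file26";  /* illustrative path */
    int fd, n;

    fd = rfio_open(path, O_RDONLY, 0);
    if (fd < 0) {
        perror("rfio_open");
        return 1;
    }
    while ((n = rfio_read(fd, buf, sizeof buf)) > 0) {
        /* process one record here; the rfio daemon on the server side
           streams the data and may read ahead asynchronously */
    }
    rfio_close(fd);
    return (n < 0) ? 1 : 0;
}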

20 cn - fhe - jun 94-20 CERN  Disk pool
Diagram: nodes sgi1, dec24 and sun5 each contribute file systems to a single disk pool.
A disk pool is a collection of Unix file systems, possibly on several nodes, viewed as a single chunk of allocatable space.

21 cn - fhe - jun 94-21 CERN  Disk Pool Management
– allocation of files to pools and filesystems (pools can be public or private)
– capacity management
– name server
– garbage collection (pools can be temporary or permanent)
example: sfget -p opaldst file26 may create a file like /shift/shd01/data6/ws/panzer/file26
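
To show how the pieces fit together, here is a small sketch in which a program captures the path printed by sfget (the backtick usage on the Remote Tape Copy slide suggests the allocated path is written to standard output) and then opens it. Only the sfget command line and pool name come from the slide; the surrounding C glue is hypothetical.

#include <stdio.h>
#include <string.h>

int main(void)
{
    char path[1024];
    FILE *p, *f;

    /* sfget allocates (or locates) file26 in the opaldst pool and prints the
       resulting pathname, e.g. /shift/shd01/data6/ws/panzer/file26 */
    p = popen("sfget -p opaldst file26", "r");
    if (p == NULL) {
        perror("popen");
        return 1;
    }
    if (fgets(path, sizeof path, p) == NULL) {
        fprintf(stderr, "sfget produced no output\n");
        pclose(p);
        return 1;
    }
    pclose(p);
    path[strcspn(path, "\n")] = '\0';    /* strip the trailing newline */

    f = fopen(path, "r");                /* a file on a remote pool node would go through RFIO instead */
    if (f == NULL) {
        perror(path);
        return 1;
    }
    /* ... read or write the event data ... */
    fclose(f);
    return 0;
}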

22 cn - fhe - jun 94-22 CERN  Tape Stager
– implements a disk cache of magnetic tape files
– integrates: Remote Tape Copy System & Disk Pool Management
– queues concurrent requests for the same tape file
– provides full error recovery - restage &/or operator control on hardware/system error
– initiates garbage collection if disk full
– supports disk pools & single (private) file systems
– available from any workstation

23 cn - fhe - jun 94-23 CERN  Tape Stager (data flow diagram)
– the user job on a cpu server issues: stagein tape, file
– stage control calls sfget file on a disk server and tpread tape, file, which drives rtcopy tape, file on a tape server
– the job then accesses the staged disk file through RFIO
– independent stage control for each disk pool

24 cn - fhe - jun 94-24 CERN  SHIFT Status
Equipment installed or on order, January 1994. Configuration by CERN group (the per-group cpu (CU*) and disk (GB) capacity figures and the totals did not survive transcription):
– OPAL: SGI Challenge 4-cpu + 8-cpu (R MHz); two SGI 340S 4-cpu (R MHz)
– ALEPH: SGI Challenge 4-cpu (R MHz); eight DEC
– DELPHI: two H-P 9000/
– L3: SGI Challenge 4-cpu (R MHz)
– ATLAS: H-P 9000/
– CMS: H-P 9000/
– SMC: SUN SPARCserver10, 4/
– CPLEAR: DEC AXP, 500AXP
– CHORUS: IBM RS/
– NOMAD: DEC AXP
* CERN-Units: one CU equals approx. 4 SPECints ( CERN IBM mainframe )

25 cn - fhe - jun 94-25 CERN  Current SHIFT Usage
– 60% cpu utilisation
– 9,000 tape mounts per week, 15% write
– still some way from holding the active data on disk
– MTBI for cpu and disk servers: 400 hours for an individual server
– MTBF for disks: 160K hours
– maturing service, but does not yet surpass the quality of the mainframe

26 cn - fhe - jun 94-26 CERN  CORE Networking
– UltraNet: 1 Gbps backbone, 6 MBytes/sec sustained; SHIFT cpu servers, SHIFT disk servers, IBM mainframe
– FDDI + GigaSwitch: MBytes sustained; SHIFT tape servers
– Ethernet + Fibronics hubs: aggregate 2 MBytes/sec sustained; Simulation service, Home directories, connection to CERN & external networks

27 cn - fhe - jun 94-27 CERN  FDDI Performance (September 1993)
100 MByte disk file read/written sequentially using 32 KB records.
Client: H-P 735; server: SGI Crimson, SEAGATE Wren 9 disk system.
        read         write
NFS     1.6 MB/sec   300 KB/sec
RFIO    2.7 MB/sec   1.7 MB/sec

28 cn - fhe - jun 94-28 CERN  PIAF - Parallel Interactive Data Analysis Facility (R. Brun, A. Nathaniel, F. Rademakers, CERN)
– the data is “spread” across the interactive server cluster
– the user formulates a transaction on his personal workstation
– the transaction is executed simultaneously on all servers
– the partial results are combined and returned to the user’s workstation

29 cn - fhe - jun 94-29 CERN  PIAF Architecture
Diagram: the user's personal workstation runs the PIAF client and display manager; it connects to the PIAF Service, where a PIAF server coordinates several PIAF workers.

30 cn - fhe - jun 94-30 CERN  Scalable Parallel Processors
– embarrassingly parallel application - therefore in competition with workstation clusters
– SMPs and SPPs should do a better job for SHIFT than loosely coupled clusters
– computing requirements will increase by three orders of magnitude over next ten years
– R&D project started, funded by ESPRIT - GPMIMD2: 32 processor Meiko CS-2, 25 man-years development

31 cn - fhe - jun 94-31 CERN  Conclusion
– Workstation clusters have replaced mainframes at CERN for physics data processing
– For the first time, we see computing budgets come within reach of the requirements
– Very large, distributed & scalable disk and tape configurations can be supported
– Mixed manufacturer environments work, and allow smooth expansion of the configuration
– Network performance is the biggest weakness in scalability
– Requires a different operational style & organisation from mainframe services

32 cn - fhe - jun 94-32 CERN  Operating RISC machines
– SMPs are easier to manage
– SMPs require less manpower
– Distributed management not yet robust
– Network is THE problem
– Much easier than mainframes, and... cost effective

