Swiss-T1: A Commodity MPI computing solution. March 1999. Ralf Gruber, EPFL-SIC/CAPA/Swiss-Tx, Lausanne.


1 Swiss-T1: A Commodity MPI computing solution. March 1999. Ralf Gruber, EPFL-SIC/CAPA/Swiss-Tx, Lausanne

2 Swiss-T1: A Commodity MPI computing solution (March 2000)
Content:
1. Distributed Commodity HPC
2. Characterisation of machines and applications
3. Swiss-Tx project

3 Past: SUPERCOMPUTERS (July 1998)

Manufacturer | What happened:
- Cray Research: taken over by SGI
- Convex: taken over by HP
- Connection Machines: disappeared
- KSR: disappeared
- Intel Paragon: stopped supercomputing
- Japanese companies: still existing (not main business)
- Teracomputers: developing for six years

Why it happened:
- Produced their own processors
- Developed their own memory switches
- Needed special memories
- Developed their own operating systems
- Developed their own compilers
- Special I/O: HW and SW
- Own communication system

4 Processor performance evolution July 1998

5 Present: SMP/NUMA

Manufacturer | Parallel server:
- DIGITAL: Wildfire
- SUN: Starfire
- IBM: SP-2
- HP: Exemplar
- SGI: Origin 2000
- ...

Present situation:
- Off-the-shelf processors
- Off-the-shelf memory switches
- Off-the-shelf memories
- Special parts of the operating system
- Special compiler extensions
- Special I/O and SW
- Own communication system

What is the trend?

6 Commodity Computing (MPI/PCI) (March 2000)

- PC clusters/Linux over Fast Ethernet: Beowulf
- SOS cooperation (Alpha): Myrinet/DS10: C-Plant (SNL); T-Net/DS20: Swiss-T1 (EPFL)
- Customised commodity: Quadrics/ES40: Compaq/Sierra

- Off-the-shelf processors
- Off-the-shelf memory switches
- Off-the-shelf memories
- Off-the-shelf local I/O HW and SW
- Off-the-shelf operating systems
- Off-the-shelf compilers
- New communication system
- New distributed file/I/O system

7 4th SOS workshop on Distributed Commodity HPC (March 2000)
Participants: SNL, ORNL, Swiss-Tx, LLNL, LANL, ANL, NASA, LBL, PSC, DOE, UNM, Syracuse, Compaq, IBM, Cray, Sun, SMEs
Content: Vision, Clusters, Interconnects, Integration, OS, I/O, Applications, Usability, Crystal ball

8 Distributed Commodity HPC User's Group (March 2000)
Goals:
- Characterise the machines
- Characterise the applications
- Match machines to applications

9 Characterise processors, machines, and applications

Performance measures:
- Processors: V_mac = peak processor performance / peak memory bandwidth
- Parallel machines: Γ_mac = effective processor performance / effective network performance
- Applications: γ_app = operation count / words to be sent
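For reference, the three dimensionless ratios written out (a restatement of the definitions above, using the R∞, M∞ notation of the next slide):

```latex
\[
V_{\mathrm{mac}} = \frac{R_\infty\ [\mathrm{Mflop/s}]}{M_\infty\ [\mathrm{Mword/s}]},
\qquad
\Gamma_{\mathrm{mac}} = \frac{\text{effective processor performance}}{\text{effective network bandwidth}},
\qquad
\gamma_{\mathrm{app}} = \frac{\text{operation count}}{\text{words to be sent}} .
\]
```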

10 In a box: V_mac values (15 June 1998)

V_mac = R∞ [Mflop/s] / M∞ [Mword/s]

Table: V_mac values for Alpha 21164 and 21264 boxes and NEC SX-4

Machine          | N | R∞ [Mflop/s] | M∞ [Mword/s] | V_mac
AlphaServer 1200 | 2 | 2133         | 138          | 15
DS20             | 2 | 2000         | 667          | 3
DS20+            | 2 | 2667         | 667          | 4
NEC SX-4         | 1 | 2000         | 2000         | 1
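As a worked check of the table, the DS20 row:

```latex
\[
V_{\mathrm{mac}}^{\mathrm{DS20}}
  = \frac{R_\infty}{M_\infty}
  = \frac{2000\ \mathrm{Mflop/s}}{667\ \mathrm{Mword/s}}
  \approx 3 ,
\]
```

i.e. the processors can issue roughly three floating-point operations in the time the memory system delivers one word, so the higher this ratio, the more cache reuse a code needs to approach peak.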

11 Between boxes: Γ_mac value

Γ_mac = N · R [Mflop/s] / C [Mword/s]

Table: Γ_mac of different machines

Machine  | Type     | Nproc | Peak [Gflop/s] | Eff. perf. [Gflop/s] | Eff. bw [Gword/s] | Γ_mac
Gravitor | Beowulf  | 128   | 50             | 6.4*                 | 0.064             | 100
Swiss-T1 | T-Net    | 64    | 64             | 13                   | 0.32              | 40
Swiss-T1 | FE       | 64    | 64             | 13                   | 0.032             | 400
Baby T1  | C+PCI    | 12    | 12             | 2.4                  | 0.072             | 30
Origin2K | NUMA/MPI | 80    | 32             | 9                    | 1                 | 9
NEC SX4  | vector   | 8     | 16             | 8                    | 6.4               | 1.3

Effective performance measured with MATMULT, * estimated. Effective bandwidth measured with point-to-point.
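The effective bandwidths above come from point-to-point measurements. Below is a minimal MPI ping-pong sketch of that kind of measurement; message size, repetition count, and output format are illustrative choices, not the benchmark actually used on the Swiss-Tx machines.

```c
/* Minimal MPI ping-pong bandwidth sketch (run with exactly 2 ranks in
 * the measurement; extra ranks sit idle). Illustrative parameters. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NBYTES (1 << 20)   /* 1 MB message, assumed large enough to saturate the link */
#define REPS   100

int main(int argc, char **argv)
{
    int rank;
    char *buf = malloc(NBYTES);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < REPS; i++) {
        if (rank == 0) {        /* send, then wait for the echo */
            MPI_Send(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) { /* echo back whatever arrives */
            MPI_Recv(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0) {
        /* 2 messages per repetition; report MB/s and Mword/s (8-byte words) */
        double mb = 2.0 * REPS * NBYTES / 1e6;
        printf("bandwidth: %.1f MB/s (%.1f Mword/s)\n",
               mb / (t1 - t0), mb / 8.0 / (t1 - t0));
    }
    free(buf);
    MPI_Finalize();
    return 0;
}
```

Latency is measured the same way with a very small message, reporting half the round-trip time; the Components slide below quotes 5 µs for FCI and 18 µs for MPI on T-Net.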

12 The γ_app value

γ_app = operations / communicated words

- Material sciences (3D Fourier analysis): γ_app ~ 50. Beowulf (Γ_mac = 100) insufficient; Swiss-T1 (Γ_mac = 40) just about right.
- Crash analysis (3D non-linear FE): γ_app > 1000. Beowulf sufficient; latency?
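The matching rule implicit in these two examples, written out (an inference from the numbers on this and the previous slides, not a formula the slides state explicitly):

```latex
\[
\Gamma_{\mathrm{mac}} \;\lesssim\; \gamma_{\mathrm{app}}
\quad\Longrightarrow\quad
\text{the network can feed the processors.}
\]
```

A machine suits an application when its compute-to-communication ratio does not exceed the application's: the Beowulf (Γ_mac = 100) starves a γ_app ~ 50 code, Swiss-T1 (Γ_mac = 40) just feeds it, and a γ_app > 1000 crash code is satisfied by either, leaving latency as the open question.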

13 The γ_app value for finite elements

γ_app = operations / communicated words

FE operation count (Ops):
- ∝ number of volume nodes
- ∝ square of the number of variables per node
- ∝ number of non-zero matrix elements
- ∝ number of operations per matrix element

FE communication (Comm):
- ∝ number of surface nodes
- ∝ number of variables per node

Hence γ_app:
- ∝ number of nodes in one direction
- ∝ number of variables per node
- ∝ number of non-zero matrix elements
- ∝ number of operations per matrix element
- ∝ number of surfaces
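Compressing the node and variable proportionalities above into one line, assuming a cubic subdomain with n nodes per direction and v variables per node (n and v are my labels, not the slide's; the per-matrix-element factors are held fixed):

```latex
\[
\gamma_{\mathrm{app}}
  \;\propto\; \frac{\mathrm{Ops}}{\mathrm{Comm}}
  \;\propto\; \frac{n^{3}\,v^{2}}{n^{2}\,v}
  \;=\; n\,v ,
\]
```

so γ_app grows linearly with the subdomain size in one direction and with the number of variables per node, which is why γ_app falls on the next slide as the mesh is cut into more subdomains.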

14 The γ_app value: statistics for a 3D brick problem (finite elements)

Nb of subd. | Nb of nodes | Nb of interface nodes | Mflop/cycle | Mflop/data transfer | kB/cycle | kB/cycle/proc | γ_app
1           | 5049        | 0                     | 13.5        | 13.5                | 0.0      | 0.0           | ∞
2           | 5202        | 153                   | 13.5        | 6.8                 | 7.2      | 3.6           | 15074
4           | 5508        | 459                   | 13.5        | 3.4                 | 21.5     | 5.4           | 5028
16          | 6366        | 1317                  | 13.5        | 0.8                 | 61.7     | 3.9           | 1755
32          | 6960        | 1911                  | 13.6        | 0.4                 | 89.6     | 2.8           | 1211
64          | 7572        | 2523                  | 13.6        | 0.2                 | 118.3    | 1.8           | 918
128         | 8796        | 3747                  | 13.6        | 0.1                 | 175.6    | 1.4           | 620

Table: current-day case, 4096 elements
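How the last column appears to be derived from the others (assuming 8-byte words), using the two-subdomain row:

```latex
\[
\gamma_{\mathrm{app}}
  = \frac{13.5\times10^{6}\ \mathrm{flop/cycle}}
         {7.2\times10^{3}\ \mathrm{B/cycle}\,/\,8\ \mathrm{B/word}}
  = \frac{13.5\times10^{6}}{900\ \mathrm{word/cycle}}
  \approx 1.5\times10^{4} .
\]
```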

15 Fat-tree of 16x16 crossbars (March 2000): N=8, P=8, N·P=64 PUs, X=12, BiW=32, L=64

16 Circulant graphs of 12x12 crossbars (March 2000)
- K=2 (chords 1/3): N=8, P=8, X=8, BiW=8, L=16
- K=3 (chords 1/3/5): N=11, P=6, X=11, BiW=18, L=33
- K=4 (chords 1/3/5/7): N=16, P=4, X=16, BiW=32, L=64
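A circulant graph connects node i to i ± s (mod N) for each chord length s, so the slide's "K=3 (1/3/5)" reads as chord set {1,3,5} on N=11 crossbars (the chord-set reading is standard graph-theory usage; variable names in the sketch are mine):

```c
/* Sketch: list the neighbours of each node in the circulant graph
 * C_N(1,3,5), i.e. the K=3 case from the slide. Illustrative only. */
#include <stdio.h>

int main(void)
{
    const int N = 11;               /* crossbars in the ring (slide: N=11 for K=3) */
    const int chords[] = {1, 3, 5}; /* chord lengths, slide notation "1/3/5" */
    const int K = 3;

    for (int i = 0; i < N; i++) {
        printf("node %2d:", i);
        for (int k = 0; k < K; k++) {
            /* each chord s links node i to i+s and i-s, modulo N */
            printf(" %d %d", (i + chords[k]) % N, (i - chords[k] + N) % N);
        }
        printf("\n");
    }
    return 0;
}
```

Each chord contributes N links, so K=3 with N=11 gives 3·11 = 33 links, matching the slide's L=33 (likewise 2·8 = 16 and 4·16 = 64 for the other two cases).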

17 March 2000 Fat-tree/Circulant graphs

18 The Swiss-Tx machines (September 1998)

Machine          | Installed            | #P  | Peak [Gflop/s] | Memory [GB] | Disk [GB] | Archive [TB] | Operating system | Connection
Swiss-T0         | 12.97 EPFL           | 8   | 8              | 2           | 64        | 1**          | Digital Unix     | EasyNet bus + FE bus system
Swiss-T0(Dual)*  | 10.98 EPFL           | 16  | 16             | 8           | 170       | -            | Windows NT       | EasyNet bus + FE switch
Baby T1*         | 8.99 EPFL / 4.00 DGM | 12  | 12             | 8           | -         | -            | Digital Unix     | Crossbar 12x12 + FE switch
Swiss-T1         | 1.00 EPFL            | 70  | 70             | 35          | 950       | 1**          | Tru64 Unix       | Crossbar 12x12 + FE switch
Swiss-T2         | ?                    | 504 | 1008           | 252         | 9000      | ?            | Not decided      | Crossbar 12x12 + FE switch

* Baby T1 is an upgrade of T0(Dual). ** Archive ported from T0 to T1.

19 March 2000 Swiss-T1

20 Components

- 32 computational DS20E, 2 frontend DS20E, 1 development DS20E
- 300 GB RAID disks, 600 GB distributed disks, 1 TB DLT archive
- Fast/Gigabit Ethernet
- Tru64/TruCluster Unix
- LSF, GRD/Codine
- Totalview, Paradyn
- MPICH/PVM

T-Net network technology:
- (8+1) 12x12 crossbars, 100 MB/s
- 32-bit PCI adapter: 75 MB/s (64-bit PCI adapter: 180 MB/s)
- Flexible, non-blocking
- Reliable
- Optimal routing
- Latency: FCI 5 µs, MPI 18 µs
- Monitoring system
- Remote control
- Up to 3 Tflop/s (Γ_mac < 100)

21 March 2000 Swiss-T1 Architecture

22 March 2000 Swiss-T1 Routing table

23 Swiss-T1: Software in a Box (March 2000)

*Digital Unix   | Compaq | Operating system in each box
*F77/F90        | Compaq | Fortran compilers
*HPF            | Compaq | High Performance Fortran
*C/C++          | Compaq | C and C++ compilers
*DXML           | Compaq | Digital math library in each box
*MPI            | Compaq | SMP message passing interface
*Posix threads  | Compaq | Threading in a box
*OpenMP         | Compaq | Multiprocessor usage in a box through directives
*KAP-F          | KAI    | To parallelise a Fortran code in a multiprocessor box
*KAP-C          | KAI    | To parallelise a C program in a multiprocessor box

24 Swiss-T1: Software between Boxes (March 2000)

*LSF        | Platform Inc. | Load Sharing Facility for resource management
*Totalview  | Dolphin       | Parallel debugger
*Paradyn    | Madison/CSCS  | Profiler to help parallelise programs
*MPI-1/FCI  | SCS AG        | Message passing interface between boxes running over T-Net
*MPICH      | Argonne       | Message passing interface running over Fast Ethernet
**PVM       | UTK           | Parallel virtual machine running over Fast Ethernet
*BLACS      | UTK           | Basic linear algebra communication subprograms
*ScaLAPACK  | UTK           | Linear algebra matrix solvers
MPI-I/O     | SCS/LSP       | Message passing interface for I/O
MONITOR     | EPFL          | Monitoring of system parameters
NAG         | NAG           | Math library package
Ensight     | Ensight       | 4D visualisation
MEMCOM      | SMR SA        | Data management system for distributed architectures
Shmem       | EPFL          | Cray-to-Swiss-Tx interface

25 March 2000 Baby T1 Architecture

26 Swiss-T1 : Alternative network March 2000

27 Swiss-T2 : K-Ring architecture

28 Create SwissTx Company
- Commercialise T-Net
- Commercialise dedicated machines
- Transfer know-how in parallel application technology

29 Between boxes: Γ_mac value

Γ_mac = N · R [Mflop/s] / C [Mword/s]

Table: Γ_mac values for Swiss-T0, Swiss-T0(Dual) and Swiss-T1, measured with MATMUL

Machine            | N    | R∞ [Mflop/s] | %   | N·R [Mflop/s] | C [Mword/s] | (?)  | Γ_mac
T0 (bus)           | 8    | 8000         | 5*  | 400*          | 4*          | 1    | 100
T0(Dual) (bus)     | 8x2  | 16533        | 6*  | 1000*         | 4*          | 1    | 250
Baby T1 (switch)   | 6x2  | 12000        | 20* | 2400*         | 90*         | 1    | 27
T1 local (switch)  | 4x2  | 8000         | 20* | 1600*         | 60**        | 1    | 27
T1 global (switch) | 32x2 | 64000        | 20* | 12800*        | 400**       | 1.25 | 40
T1 (Fast Ethernet) | 32x2 | 64000        | 20* | 12800*        | 80**        | 1    | 160

* measured (SAXPY and Parkbench); ** expected
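Reading one row as a worked example, T1 over Fast Ethernet (a row where the unlabeled factor is 1):

```latex
\[
\Gamma_{\mathrm{mac}}
  = \frac{N \cdot R}{C}
  = \frac{12800\ \mathrm{Mflop/s}}{80\ \mathrm{Mword/s}}
  = 160 ,
\]
```

far above the γ_app ~ 50 of the Fourier-analysis codes of slide 12, so the Fast Ethernet path alone cannot feed them; the T-Net path at Γ_mac = 40 can.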

30 Time schedule (March 2000); timeline axis 1.1.98, 1.1.99, 1.1.00

1st phase (EasyNet bus based prototypes):
- Swiss-T0(Dual) (1.6.98): 16 processors, Windows NT
- Swiss-T0(Dual): 16 processors, Digital Unix

2nd phase (T-Net switch based prototype/production machines):
- Baby T1: 12 processors, Digital Unix
- Swiss-T1 (1.11.99): 68 processors, Digital Unix
- Swiss-T2 (31.10.00): 504 processors, OS not defined

31 Phase I: Machines installed (March 2000)
- Swiss-T0: 23 December 97 (accepted 25 May 98)
- Swiss-T0(Dual): 29 September 98 (accepted 11 Dec. 98, NT)
- Swiss-T0(Dual): 29 September 98 (accepted 22 Jan. 99, Unix)
- Swiss-T1 Baby: 19 August 99 (accepted 18 Oct. 99, Unix)
- Swiss-T1: 21 Jan. 2000

32 Swiss-T1 Node Architecture (March 1999)

33 2nd Phase Swiss-Tx: The 8 WPs (March 2000)

Managing Board: Michel Deville
Technical Team: Ralf Gruber
Management: Jean-Michel Lafourcade

WP1: Hardware development             | Roland Paul, SCS
WP2: Communication software development | Martin Frey, SCS
WP3: System and user environment      | Michel Jaunin, SIC-EPFL
WP4: Data management issues           | Roger Hersch, DI-EPFL
WP5: Applications                     | Ralf Gruber, CAPA/SIC-EPFL
WP6: Swiss-Tx concept                 | Pierre Kuonen, DI-EPFL
WP7: Management                       | Jean-Michel Lafourcade, CAPA/DGM-EPFL
WP8: SwissTx Spin-off Company         | Jean-Michel Lafourcade, CAPA/DGM-EPFL

34 2nd Phase Swiss-Tx: The MUSTs (March 2000)
- WP1: PCI adapter page table / 64-bit PCI adapter
- WP2: Dual-processor FCI / Network monitoring / Shmem
- WP3: Management / Automatic SI / Monitoring / PE / Libraries
- WP4: MPI-I/O / Distributed file management
- WP5: Applications
- WP6: Swiss-Tx architecture / Autoparallelisation
- WP7: Management
- WP8: SwissTx Spin-off Company

